This is a simple tool to convert the ClustalW format:
P31937 ----------------MAASLRLLGAASGLRYWSR-RLRPAAGSFAAVCSRSVASKTPVG
P00508 MALLQSRLL-------LSAPRRAAATARASSWWSHVEMGPPDPILGVTEAFKRDTNSKKM
P12344 MALLHSGRFLSGVAAAFHPGLAAAASARASSWWAHVEMGPPDPILGVTEAFKRDTNSKKM
P00505 MALLHSGRVLPGIAAAFHPGLAAAASARASSWWTHVEMGPPDPILGVTEAFKRDTNSKKM
P05202 MALLHSSRILSGMAAAFHPGLAAAASARASSWWTHVEMGPPDPILGVTEAFKRDTNSKKM
: . .:* . :*:. : *. :... : . :::
P31937 FIGLGNM----GNPMAKNLMKHGYPLIIYDVFPDAC------KEFQDAGEQVVSSPADVA
P00508 NLGVGAYRDDNGKSYVLNCVRKAEAMIAAKKMDKEYLPIAGLADFTRASAELALGENSEA
P12344 NLGVGAYRDDNGKPYVLPSVRKAEAQIAAKNLDKEYLPIAGLAEFCKASAELALGENNEV
P00505 NLGVGAYRDDNGKPYVLPSVRKAEAQIAAKNLDKEYLPIGGLAEFCKASAELALGENSEV
P05202 NLGVGAYRDDNGKPYVLPSVRKAEAQIAAKNLDKEYLPIGGLAEFCKASAELALGENNEV
:*:* *:. . :.:. . * . : . :* *. ::. . .
into FASTA format (one header line per sequence, followed by its residues):
>P31937
----------------MAASLRLLGAASGLRYWSR-RLRPAAGSFAAVCSRSVASKTPVG
FIGLGNM----GNPMAKNLMKHGYPLIIYDVFPDAC------KEFQDAGEQVVSSPADVA
>P00508
MALLQSRLL-------LSAPRRAAATARASSWWSHVEMGPPDPILGVTEAFKRDTNSKKM
NLGVGAYRDDNGKSYVLNCVRKAEAMIAAKKMDKEYLPIAGLADFTRASAELALGENSEA
>P00505
MALLHSGRVLPGIAAAFHPGLAAAASARASSWWTHVEMGPPDPILGVTEAFKRDTNSKKM
NLGVGAYRDDNGKPYVLPSVRKAEAQIAAKNLDKEYLPIGGLAEFCKASAELALGENSEV
>P12344
MALLHSGRFLSGVAAAFHPGLAAAASARASSWWAHVEMGPPDPILGVTEAFKRDTNSKKM
NLGVGAYRDDNGKPYVLPSVRKAEAQIAAKNLDKEYLPIAGLAEFCKASAELALGENNEV
>P05202
MALLHSSRILSGMAAAFHPGLAAAASARASSWWTHVEMGPPDPILGVTEAFKRDTNSKKM
NLGVGAYRDDNGKPYVLPSVRKAEAQIAAKNLDKEYLPIGGLAEFCKASAELALGENNEV
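The records above show each sequence wrapped at 60 characters per line, while the script below prints each sequence on a single line; both layouts are valid FASTA. If wrapped output is preferred, a minimal sketch could look like the following. The function name format_fasta_record and its width parameter are assumptions made for illustration and are not part of the original script.

```python
def format_fasta_record(name, sequence, width=60):
    """Return one FASTA record as a string.

    The record is a '>' header line followed by the sequence
    split into lines of at most `width` characters.
    (Hypothetical helper, not part of the original script.)
    """
    lines = ['>' + name]
    # Slice the sequence into consecutive chunks of `width` characters.
    for start in range(0, len(sequence), width):
        lines.append(sequence[start:start + width])
    return '\n'.join(lines)
```

Gap characters ('-') are kept as they are, so the wrapped output remains an alignment.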
The script takes a single argument: a file containing the alignment in ClustalW format.
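#!/usr/bin/python
import sys

# Read the whole alignment; the first line is the "CLUSTAL ..." header.
with open(sys.argv[1], 'r') as handle:
    CluW = handle.readlines()

FASTA = {}
for x in CluW[1:]:
    line = x.split()
    # Alignment rows have exactly two fields: identifier and sequence chunk.
    if len(line) == 2:
        # Append this chunk to what was already collected for the identifier.
        FASTA[line[0]] = FASTA.get(line[0], '') + line[1]

# Print each sequence as a header line followed by the joined sequence.
for k, v in FASTA.items():
    print('>' + k)
    print(v)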
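The script keeps any line that splits into exactly two fields, so a conservation line that happens to split into two tokens would be read as a sequence, and rows that end with a residue count would be skipped. The sketch below is one possible way to handle both cases. It assumes that conservation lines start with whitespace and that an optional third field is a cumulative residue count, as in typical ClustalW output. The function name clustalw_to_fasta is hypothetical.

```python
from collections import OrderedDict


def clustalw_to_fasta(lines):
    """Collect aligned sequences from ClustalW-formatted lines.

    Returns an OrderedDict mapping each identifier to its full
    aligned sequence, in the order identifiers first appear.
    (Hypothetical helper; assumptions are noted inline.)
    """
    sequences = OrderedDict()
    for line in lines[1:]:  # skip the first line, as the script does
        # Assumption: blank separators and conservation lines are empty
        # or begin with whitespace, so they never carry an identifier.
        if not line.strip() or line[0].isspace():
            continue
        fields = line.split()
        # Accept "id chunk" and "id chunk count" rows only.
        if len(fields) == 2 or (len(fields) == 3 and fields[2].isdigit()):
            name, chunk = fields[0], fields[1]
            sequences[name] = sequences.get(name, '') + chunk
    return sequences


if __name__ == '__main__':
    import sys
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            for name, seq in clustalw_to_fasta(handle.readlines()).items():
                print('>' + name)
                print(seq)
```

Using an ordered mapping keeps the sequences in the same order as the input alignment, which a plain dictionary does not guarantee on older Python versions.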
Follow the instructions to execute the script.