#!/usr/bin/python
import sys
X=open(sys.argv[1],'r').readlines()
Y=open(sys.argv[2],'r').readlines()
seq1=''
seq2=''
for f in X[1:]:
seq1=seq1+f.strip()
for f in Y[1:]:
seq2=seq2+f.strip()
i=0
g=open('DP.dat','w')
for A in seq1:
dot=''
i=i+1
e=0
for B in seq2:
e=e+1
if A==B:
dot=dot+'/'
g.write(str(i)+' '+str(e)+'\n')
else:
dot=dot+' '
import sys
X=open(sys.argv[1],'r').readlines()
Y=open(sys.argv[2],'r').readlines()
seq1=''
seq2=''
for f in X[1:]:
seq1=seq1+f.strip()
for f in Y[1:]:
seq2=seq2+f.strip()
i=0
g=open('DP.dat','w')
for A in seq1:
dot=''
i=i+1
e=0
for B in seq2:
e=e+1
if A==B:
dot=dot+'/'
g.write(str(i)+' '+str(e)+'\n')
else:
dot=dot+' '
The script takes as arguments two fasta sequences and generates a file named DP.dat that can be visualized with different tools.(Instruction to execution)
For example you can use gnuplot typing:
gnuplot
and than:
gnuplot> plot "DP.dat" with dots
You will obtain a plot like the following, where the numbers in the axis correspond to amino acid positions in the protein sequences.
Dop Plot obtained with GNUPLOT |
Nessun commento:
Posta un commento