Scripts for dummies: Python implementation of DOT PLOT

Hello! This is a very simple script to compute the dot plot in order to compare two biological sequences.

#!/usr/bin/python

import sys

X=open(sys.argv[1],'r').readlines()

Y=open(sys.argv[2],'r').readlines()

seq1=''

seq2=''

for f in X[1:]:

        seq1=seq1+f.strip()

for f in Y[1:]:

        seq2=seq2+f.strip()

i=0

g=open('DP.dat','w')

for A in seq1:

        dot=''

        i=i+1

        e=0

        for B in seq2:

                e=e+1

                if A==B:

                        dot=dot+'/'

                        g.write(str(i)+' '+str(e)+'\n')

                else:

                        dot=dot+' '

The script takes as arguments two fasta sequences and generates a file named DP.dat that can be visualized with different tools.(Instruction to execution)
For example you can use gnuplot typing:

gnuplot

and than:

gnuplot> plot "DP.dat" with dots

You will obtain a plot like the following, where the numbers in the axis correspond to amino acid positions in the protein sequences.

Dop Plot obtained with GNUPLOT

Scripts for dummies

Pagine

giovedì 17 novembre 2011

Python implementation of DOT PLOT

Nessun commento:

Posta un commento