Programming in Bioinformatics
Uploaded by sasikala-rajendran

Control structures in Python

if
• Syntax:
    if <condition>:
        <instruction>    # indentation
    elif <condition>:
        <instruction>
    else:
        <instruction>
• Ex:
    x=0
    if x<0:
        print "Negative"
    elif x==0:
        print "Zero"
    else:
        print "Positive"

Example
    >>> seq="ATGC"
    >>> if "T" in seq:
    ...     print "DNA"
    ... elif "U" in seq:
    ...     print "RNA"
    ... else:
    ...     print "It's not a nucleotide seq"
    ...
    DNA
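
The same if/elif/else chain can be wrapped in a function that returns a label instead of printing it. The sketch below is written for Python 3, which is an assumption because the slides use Python 2 print statements. The name `classify_sequence` is hypothetical and does not appear in the slides.

```python
def classify_sequence(seq):
    """Return a rough label for a sequence string (illustrative only)."""
    seq = seq.upper()          # accept lower-case input as well
    if "T" in seq:
        return "DNA"
    elif "U" in seq:
        return "RNA"
    else:
        # Like the slide example, anything without T or U ends up here.
        return "Not a nucleotide seq"


print(classify_sequence("ATGC"))   # DNA
print(classify_sequence("AUGC"))   # RNA
```

As in the slide, this is a simple membership test, so a string such as "AGC" falls through to the else branch even though it could be part of a nucleotide sequence.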

while
• Syntax:
    while <condition>:
        <instruction 1>
        <instruction 2>
        .
        .
        .
        <instruction n>
• Ex:
    a,b=0,1
    while a<10:
        print a
        a=a+b
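
A while loop is useful when the number of repetitions is not known in advance. As a possible illustration (Python 3 syntax, assumed here rather than taken from the slides), the hypothetical helper `find_all` keeps searching for a motif until `str.find` reports no further match by returning -1.

```python
def find_all(seq, motif):
    """Collect the start index of every (possibly overlapping) motif match."""
    positions = []
    start = seq.find(motif)
    while start != -1:                       # -1 means "not found"
        positions.append(start)
        start = seq.find(motif, start + 1)   # resume just after the last hit
    return positions


print(find_all("ATGATGCATG", "ATG"))   # [0, 3, 7]
```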

for
• Syntax:
    for variable_name in <list definition>:
        <instruction 1>
        <instruction 2>
        .
        .
        .
        <instruction n>
• Ex:
    a=["A","T","C","G"]
    for base in a:
        print base
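
A for loop can also walk over the characters of a string directly, not only over a list. The following sketch (Python 3; the function name `count_purines` is an illustrative assumption) counts the purine bases A and G in a sequence.

```python
def count_purines(seq):
    """Count how many bases in seq are purines (A or G)."""
    total = 0
    for base in seq:               # a string is iterated one character at a time
        if base in ("A", "G"):
            total += 1
    return total


print(count_purines("ATCG"))   # 2
```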

Functions
• A function is a named block of code that performs an operation on one or more values and returns a result.
• Types:
– User-defined functions: functions we create ourselves in Python
– Built-in functions: functions that come pre-defined with Python

User-defined functions
• Syntax:
    def fun_name(arguments/parameters):
        ….
        ….
• Ex:
    def base():
        print "A"
        print "T"
        print "C"
        print "G"

Arguments and parameters
• Parameters are the names listed in the function definition; arguments are the values supplied when the function is called.
    def subtract(x,y):    # x and y are parameters
        print x-y
    def seq(dna):    # dna receives the argument passed in the call
        print dna
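
To make the distinction concrete, here is a small Python 3 sketch (an assumption; the slides use Python 2) in which `subtract` returns its result so that the effect of each call is visible. The values 10 and 3 are arbitrary illustrative arguments.

```python
def subtract(x, y):        # x and y are the parameters
    return x - y


# Positional arguments are matched to parameters by order.
print(subtract(10, 3))        # 7

# Keyword arguments are matched by name, so the order no longer matters.
print(subtract(y=10, x=3))    # -7
```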

Return statement
    def transcribe(dna):
        return dna.replace("T","U")
    def reverse(s):
        letters = list(s)
        letters.reverse()
        return ''.join(letters)
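
Because these functions return a value rather than printing it, the result of one call can be stored or passed on to another. A minimal Python 3 sketch of that idea follows; the slice `s[::-1]` is a swapped-in shortcut for the list-based reversal, not the slides' own code.

```python
def transcribe(dna):
    """Replace every T with U, as in the slide."""
    return dna.replace("T", "U")


def reverse(s):
    """Return s reversed; slicing with step -1 replaces the list-based version."""
    return s[::-1]


rna = transcribe("ATGC")   # the returned value is stored in a variable
print(rna)                 # AUGC
print(reverse(rna))        # CGUA
```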

Use of if statement
    >>> dna="AGTGCGTCGATATAAAAAAA"
    >>> def polyA_tail(dna):
    ...     if dna.endswith("AAAAAA"):
    ...         print "Has PolyA tail"
    ...     else:
    ...         print "Not having PolyA tail"
    ...
    >>> polyA_tail(dna)
    Has PolyA tail
    >>> s="AGTGACGTACGT"
    >>> polyA_tail(s)
    Not having PolyA tail

Use of for and if statement
    >>> res_site=["GAATTC","GGATCC","AAGCTT"]
    >>> seq="AAAGAATCCTTGGATCCAATTGCGC"
    >>> def ecoli_ressite(dna):
    ...     for site in res_site:
    ...         if site in dna:    # test the sequence passed in, not the global seq
    ...             print site, "is a cleavage site"
    ...         else:
    ...             print site, "is not present"
    ...
    >>> ecoli_ressite(seq)
    GAATTC is not present
    GGATCC is a cleavage site
    AAGCTT is not present
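
A variation on this example passes the list of sites in as a second parameter and returns the match positions instead of printing messages. This is a sketch in Python 3 under stated assumptions: the name `find_sites` is hypothetical, and -1 is simply the value `str.find` returns when a site is absent.

```python
def find_sites(dna, sites):
    """Map each recognition site to its first index in dna (-1 if absent)."""
    found = {}
    for site in sites:
        found[site] = dna.find(site)
    return found


res_site = ["GAATTC", "GGATCC", "AAGCTT"]
seq = "AAAGAATCCTTGGATCCAATTGCGC"

for site, position in find_sites(seq, res_site).items():
    if position != -1:
        print(site, "is a cleavage site at index", position)
    else:
        print(site, "is not present")
```

Keeping the site list as a parameter means the function no longer depends on a global variable, so it can be reused with a different set of sites.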

Dictionaries
    >>> seq='AATGTCTCT'
    >>> def compliment(seq):
    ...     basecomplement = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
    ...     letters=list(seq)
    ...     letters=[basecomplement[base] for base in letters]
    ...     return ''.join(letters)
    ...
    >>> compliment('ATGTCTG')
    'TACAGAC'
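
The same dictionary lookup extends naturally to the reverse complement, which reads the complemented strand in the opposite direction. A minimal Python 3 sketch follows; `reverse_complement` and the constant name `BASE_COMPLEMENT` are illustrative assumptions, not part of the slides.

```python
# Same mapping as in the slide, kept outside the function so it is built once.
BASE_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Complement each base while walking the sequence from its last base."""
    return "".join(BASE_COMPLEMENT[base] for base in reversed(seq))


print(reverse_complement("ATGTCTG"))   # CAGACAT
```

A base that is not a key of the dictionary (for example "N") raises a KeyError here, just as it would in the slide's version.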

Calling a function
    >>> def gc_content(seq):
    ...     gc=seq.count('G')+seq.count('C')
    ...     return gc*100.0/len(seq)
    ...
    >>> gc_content('CATTCG')
    50.0
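
Called with an empty string, the function above would divide by zero. One possible way to guard against that is sketched below in Python 3. Returning 0.0 for an empty sequence and upper-casing the input are design assumptions made for this illustration.

```python
def gc_content(seq):
    """Return the percentage of G and C bases in seq."""
    if not seq:                 # avoid ZeroDivisionError on an empty sequence
        return 0.0
    seq = seq.upper()           # count lower-case bases as well
    gc = seq.count("G") + seq.count("C")
    return gc * 100.0 / len(seq)


print(gc_content("CATTCG"))   # 50.0
print(gc_content(""))         # 0.0
```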

Built-in functions
Frequently used built-in functions and methods are:
• len()
• raw_input()
• open()
• list()
• readline() (file method)
• rstrip() (string method)
• split() (string method)
• join() (string method)
• range()
etc.

len() and range()
Example-1
    >>> a=["A","T","G","C"]
    >>> len(a)
    4
Example-2
    >>> range(5)
    [0, 1, 2, 3, 4]
    >>> range(2,8)
    [2, 3, 4, 5, 6, 7]
    >>> range(2,8,2)
    [2, 4, 6]
    >>> range(8,0,-1)
    [8, 7, 6, 5, 4, 3, 2, 1]
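
Together, len() and range() are a common way to generate index positions along a sequence. The Python 3 sketch below uses them to list overlapping k-mers; `kmers` is a hypothetical helper name. In Python 3, range() returns a lazy range object rather than a list, so it is wrapped in list() when the values need to be displayed.

```python
def kmers(seq, k):
    """Return every overlapping substring of length k, left to right."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


print(kmers("ATGCA", 3))       # ['ATG', 'TGC', 'GCA']
print(list(range(2, 8, 2)))    # [2, 4, 6]
```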

raw_input() and open()
    >>> file=open("E:/test/seq.txt")
    >>> print file
    <open file 'E:/test/seq.txt', mode 'r' at 0x00C38C38>
    >>> file.readline()
    'CCTGTATTAGCAGCAGATTCGATTAGCTTTACAACAATTCAATAAAATAGCTTCGCGCTAA\n'
    >>> file.readline()
    'CACACCCAATAAGTTAGAGAGAGTACTTTGACTTGGAGCTGGAGGAATTTGACATAGTCGAT\n'
    >>> file.readline()
    'CCACACCAAAAAAACTTTCCACGTGAACCGAAAACGAAAGTCTTTGGTTTTAATCAATAA\n'
    >>> file.readline()
    'CCACACCAAAAAAACTTTCCACGTGTGAACTATACTCCAAAAACGAAGTATTGGTTTATCATAA'
    >>> file.readline()
    ''

rstrip()
    >>> file="E:/test/seq.txt"
    >>> for line in open(file):
    ...     line=line.rstrip()
    ...     print line
    ...
    CCTGTATTAGCAGCAGATTCGATTAGCTTTACAACAATTCAATAAAATAGCTTCGCGCTAA
    CACACCCAATAAGTTAGAGAGAGTACTTTGACTTGGAGCTGGAGGAATTTGACATAGTCGAT
    CCACACCAAAAAAACTTTCCACGTGAACCGAAAACGAAAGTCTTTGGTTTTAATCAATAA
    CCACACCAAAAAAACTTTCCACGTGTGAACTATACTCCAAAAACGAAGTATTGGTTTATCATAA
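
The open(), line-by-line reading, and rstrip() steps can be combined into one function that returns the file's contents as a single sequence string. This is a sketch in Python 3 under stated assumptions: `read_sequence` is a hypothetical name, and a temporary file with made-up contents stands in for E:/test/seq.txt so the example runs anywhere.

```python
import os
import tempfile


def read_sequence(path):
    """Join all lines of a plain-text sequence file into one string."""
    parts = []
    with open(path) as handle:          # 'with' closes the file automatically
        for line in handle:
            parts.append(line.rstrip())  # drop the trailing newline
    return "".join(parts)


# Write a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("CCTGTATT\nCACACCCA\n")
    demo_path = tmp.name

print(read_sequence(demo_path))   # CCTGTATTCACACCCA
os.remove(demo_path)              # clean up the temporary file
```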

split()
    >>> s=" AAAGGT GGCTAAT"
    >>> f=s.split()
    >>> print f
    ['AAAGGT', 'GGCTAAT']
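
With no argument, split() breaks the string on any run of whitespace and ignores leading or trailing spaces. It also accepts an explicit separator, as the Python 3 sketch below shows; the comma-separated string is made-up illustrative data.

```python
s = " AAAGGT GGCTAAT"
print(s.split())             # ['AAAGGT', 'GGCTAAT']

# An explicit separator splits on exactly that string.
residues = "Asp,Gly,Ala,Pro"
print(residues.split(","))   # ['Asp', 'Gly', 'Ala', 'Pro']
```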

join()
    >>> aa=["Asp","Gly","Ala","Pro"]
    >>> print "-".join(aa)
    Asp-Gly-Ala-Pro
    >>> print "\n".join(aa)
    Asp
    Gly
    Ala
    Pro
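
Joining with a newline is also a convenient way to wrap a long sequence into fixed-width lines, as sequence files usually are. A minimal Python 3 sketch follows; `wrap_sequence` and the default width of 10 are illustrative assumptions.

```python
def wrap_sequence(seq, width=10):
    """Break seq into chunks of at most `width` characters, one per line."""
    chunks = [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join(chunks)


print(wrap_sequence("ATGCATGCATGC", 5))
# ATGCA
# TGCAT
# GC
```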