Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Caribbean Bioinformatics Workshop
![Page 1: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/1.jpg)
Sequence Analysis with Artemis and
Artemis Comparison Tool (ACT)
Caribbean Bioinformatics Workshop
18th-29th January, 2010
![Page 2: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/2.jpg)
atcttttacttttttcatcatctatacaaaaaatcatagaatattcatcatgttgtttaaaataatgtattccattatgaactttattacaaccctcgtttttaattaattcacattttatatctttaagtataatatcatttaacattatgttatcttcctcagtgtttttcattattatttgcatgtacagtttatcatttttatgtaccaaactatatcttatattaaatggatctctacttataaagttaaaatctttttttaattttttcttttcacttccaattttatattccgcagtacatcgaattctaaaaaaaaaaataaataatatataatatataataaataatatataataaataatatataatatataataaataatatataatatataatatataataaataatatataatatataatatataataaataatatataataaataatatataatatataatatataatactttggaaagattatttatatgaatatatacacctttaataggatacacacatcatatttatatatatacatataaatattccataaatatttatacaacctcaaataaaataaacatacatatatatatataaatatatacatatatgtatcattacgtaaaaacatcaaagaaatatactggaaaacatgtcacaaaactaaaaaaggtattaggagatatatttactgattcctcatttttataaatgttaaaattattatccctagtccaaatatccacatttattaaattcacttgaatattgttttttaaattgctagatatattaatttgagatttaaaattctgacctatataaacctttcgagaatttataggtagacttaaacttatttcatttgataaactaatattatcatttatgtccttatcaaaatttattttctccatttcagttattttaaacatattccaaatattgttattaaacaagggcggacttaaacgaagtaattcaatcttaactccctccttcacttcactcattttatatattccttaatttttactatgtttattaaattaacatatatataaacaaatatgtcactaataatatatatatatatatatatatatatatatattataaatgttttactctattttcacatcttgtccttttttttttaaaaatcccaattcttattcattaaataataatgtattttttttttttttttttttttttattaattattatgttactgttttattatatacactcttaatcatatatatatatttatatatatatatatatatatatatatattattcccttttcatgttttaaacaagaaaaaaaactaaaaaaaaaaaaaataataaaatatatttttataacatatgtattattaaaatgtatatataaaaatatatattccatttattattatttttttatatacattgttataagagtatcttctcccttctggtttatattactaccatttcactttgaacttttcataaaaattaatagaatatcaaatatgtataatatataacaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaatatatatatatatatatatacatataatatatatttcatctaatcatttaaaattattattatatattttttaaaaaatatatttatgataacataaaaagaatttaattttaattaaatatatataattacatacatctaatattattatatatatataataagttttccaaatagaatacttatatattatatatatatatatatatatatatattcttccataaaaagaataaaataaaataaaaacaccttaaaagtatttgtaaaaaattccccacattgaatatatagttgtatttataaaattaaagaaaaagcataaagttaccatttaatagtggagattagtaacattttcttcattatcaaaaatatttatttcctaattttttttttttg
taaaatatatttaaaaatgtaatagattatgtattaaataatataaatatagcaaaatgttcaattttagaaatttgcctctttttgacaaggataattcaaaagatacaggtaaaaaaaaaaaaataaagtaaaacaaaacaaaacaaaaaacaaaaaaaaaaaaaaaaaaaaaaatgacatgttataatataatataataaataaaaattatgtaatatatcataatcgaagaaacatatatgaaaccaaaaagaaacagatcttgatttattaatacatatataactaacattcatatctttatttttgtagatgatataaaaaattttataaactcttatgaagggatatatttttcatcatccaataaatttataaatgtatttctagacaaaattctgatcattgatccgtcttccttaaatgttattacaataaatacagatctgtatgtagttgatttcctttttaatgagaaaaataagaatcttattgttttagggtaatgaaatatatatagatttatatttttatttatttattatatattattttttaatttttcttttatatatttattttatttagtgtataaaatgatatcctttatatttatatttacatgggatattcaaataataacaaaaatgagtatacacatatatatatatatatatatatatatgtatattttttttttttttttatgttcctataggaaagggaagaattcactgatttgtagtgtttacaatattagggaatgcaactttacacttttgaaaaaaattcagttaagcaaaaatattaataacattaaaaagacactgatagcaaaatgtaatgaatatataataacattagaaaataagaaaattactttttatttcttaaataaagattatagtataaatcaaagtgaattaatagaagacggaaaagaacttattgaaaatatctatttgtcaaaaaatcatatcttgttagtaataaaaaattcatatgtatatatataccaattagatattaaaaattcccatattagttatacacttattgatagtttcaatttaaatttatcctacctcagagaatctataaataataaaaaaaagcatataaataaaataaatgatgtatcaaataatgacccaaaaaaggataataatgaaaaaaatacttcatctaataatataa
![Page 3: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/3.jpg)
atcttttacttttttcatcatctatacaaaaaatcatagaatattcatcatgttgtttaaaataatgtattccattatgaactttattacaaccctcgtttttaattaattcacattttatatctttaagtataatatcatttaacattatgttatcttcctcagtgtttttcattattatttgcatgtacagtttatcatttttatgtaccaaactatatcttatattaaatggatctctacttataaagttaaaatctttttttaattttttcttttcacttccaattttatattccgcagtacatcgaattctaaaaaaaaaaataaataatatataatatataataaataatatataataaataatatataatatataataaataatatataatatataatatataataaataatatataatatataatatataataaataatatataataaataatatataatatataatatataatactttggaaagattatttatatgaatatatacacctttaataggatacacacatcatatttatatatatacatataaatattccataaatatttatacaacctcaaataaaataaacatacatatatatatataaatatatacatatatgtatcattacgtaaaaacatcaaagaaatatactggaaaacatgtcacaaaactaaaaaaggtattaggagatatatttactgattcctcatttttataaatgttaaaattattatccctagtccaaatatccacatttattaaattcacttgaatattgttttttaaattgctagatatattaatttgagatttaaaattctgacctatataaacctttcgagaatttataggtagacttaaacttatttcatttgataaactaatattatcatttatgtccttatcaaaatttattttctccatttcagttattttaaacatattccaaatattgttattaaacaagggcggacttaaacgaagtaattcaatcttaactccctccttcacttcactcattttatatattccttaatttttactatgtttattaaattaacatatatataaacaaatatgtcactaataatatatatatatatatatatatatatatatattataaatgttttactctattttcacatcttgtccttttttttttaaaaatcccaattcttattcattaaataataatgtattttttttttttttttttttttttattaattattatgttactgttttattatatacactcttaatcatatatatatatttatatatatatatatatatatatatatattattcccttttcatgttttaaacaagaaaaaaaactaaaaaaaaaaaaaataataaaatatatttttataacagatgtattattaaaatgtatatataaaaatatatattccatttattattatttttttatatacattgttataagagtatcttctcccttctggtttatattactaccatttcactttgaacttttcataaaaattaatagaatatcaaatatgtataatatataacaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaatatatatatatatatatatacatataatatatatttcatctaatcatttaaaattattattatatattttttaaaaaatatatttatgataacataaaaagaatttaattttaattaaatatatataattacatacatctaatattattatatatatataataagttttccaaatagaatacttatatattatatatatatatatatatatatatattcttccataaaaagaataaaataaaataaaaacaccttaaaagtatttgtaaaaaattccccacattgaatatatagttgtatttataaaattaaagaaaaagcataaagttaccatttaatagtggagattagtaagtttttcttcattatcaaaaatatttatttcctaattttttttttttg
taaaatatatttaaaaatgtaatagattatgtattaaataatataaatatagcaaaatgttcaattttagaaatttgcctctttttgacaaggataattcaaaagatacaggtaaaaaaaaaaaaataaagtaaaacaaaacaaaacaaaaaacaaaaaaaaaaaaaaaaaaaaaaatgacatgttataatataatataataaataaaaattatgtaatatatcataatcgaagaaacatatatgaaaccaaaaagaaacagatcttgatttattaatacatatataactaacattcatatctttatttttgtagatgatataaaaaattttataaactcttatgaagggatatatttttcatcatccaataaatttataaatgtatttctagacaaaattctgatcattgatccgtcttccttaggtgttattacaataaatacagatctgtatgtagttgatttcctttttaatgagaaaaataagaatcttattgttttagggtaatgaaatatatatagatttatatttttatttatttattatatattattttttaatttttcttttatatatttattttatttagtgtataaaatgatatcctttatatttatatttacatgggatattcaaataataacaaaaatgagtatacacatatatatatatatatatatatatatgtatattttttttttttttttatgttcctataggaaagggaagaattcactgatttgtagtgtttacaatattagggaatgcaactttacacttttgaaaaaaattcagttaagcaaaaatattaataacattaaaaagacactgatagcaaaatgtaatgaatatataataacattagaaaataagaaaattactttttatttcttaaataaagattatagtataaatcaaagtgaattaatagaagacggaaaagaacttattgaaaatatctatttgtcaaaaaatcatatcttgttagtaataaaaaattcatatgtatatatataccaattagatattaaaaattcccatattagttatacacttattgatagtttcaatttaaatttatcctacctcagagaatctataaataataaaaaaaagcatataaataaaataaatgatgtatcaaataatgacccaaaaaaggataataatgaaaaaaatacttcatctaataatataa
Sequencing is just the beginning of the process
Extracting information & interpreting
What's there? Where are the genes? Which genes? How to find them?
SEQUENCE ANNOTATION
![Page 4: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/4.jpg)
Strategies for sequence annotation
Predictive methods: interpretation of the DNA sequence into genes according to rules
Comparative methods
Experimental methods
![Page 5: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/5.jpg)
![Page 6: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/6.jpg)
![Page 7: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/7.jpg)
Strategies for sequence annotation
Predictive methods: interpretation of the DNA sequence into genes according to rules
Comparative methods: interpretation of the DNA sequence into genes according to similarities with other sequences
Experimental methods
![Page 8: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/8.jpg)
![Page 9: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/9.jpg)
Strategies for sequence annotation
Predictive methods: interpretation of the DNA sequence into genes according to rules
Comparative methods: interpretation of the DNA sequence into genes according to similarities with other sequences
Experimental methods: interpretation of the DNA sequence into genes according to experimental results (e.g. cDNA)
![Page 10: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/10.jpg)
EST Blast Hit
![Page 11: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/11.jpg)
Gene prediction programs: ORFs and CDSs
ORFs are not equivalent to CDSs
Not all open reading frames are coding sequences
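The difference between an ORF and a CDS can be seen with a minimal sketch. The function below is illustrative only: the name `find_orfs`, the `min_codons` parameter and its default are assumptions, not part of Artemis or of any gene finder. It lists every stop-free run of codons in all six reading frames. Each run is an ORF by definition, but nothing in the code says whether it is actually transcribed and translated.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=100):
    """List stop-free codon runs (ORFs) in all six frames.

    Returns tuples (strand, frame, start, end). start and end are 0-based,
    end-exclusive offsets on the strand being read. The min_codons default
    is an arbitrary choice for this sketch.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            run_start = frame
            for i in range(frame, len(s) - 2, 3):
                if s[i:i + 3] in STOPS:
                    # A stop codon closes the current open run.
                    if (i - run_start) // 3 >= min_codons:
                        orfs.append((strand, frame, run_start, i))
                    run_start = i + 3
            # Run still open at the end of the sequence.
            end = frame + ((len(s) - frame) // 3) * 3
            if (end - run_start) // 3 >= min_codons:
                orfs.append((strand, frame, run_start, end))
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("ATGAAACCCGGGTAAATGTTTTAG", min_codons=3):
        print(orf)
```

Every tuple returned is only a candidate: deciding which ORFs are real coding sequences needs the additional evidence discussed in the rest of these slides.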
![Page 12: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/12.jpg)
Gene prediction
Gene finder
Glimmer
Orpheus
PHAT
GeneMark
![Page 13: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/13.jpg)
Gene finding programs
• Genefinding software packages use Hidden Markov Models.
• Predict coding, intergenic and intron sequences
• Need to be trained on a specific organism.
• Never perfect!
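Why training matters can be illustrated with a deliberately simplified stand-in. The sketch below uses a plain first-order Markov chain, not a hidden Markov model, and it is not how Glimmer, GeneMark or any other named program works internally. The function names `train_markov` and `log_odds`, the pseudocount and the toy training sequences are all assumptions made for illustration.

```python
import math


def train_markov(seqs, pseudocount=1.0):
    """Estimate first-order transition probabilities P(next | prev).

    The pseudocount keeps unseen transitions from getting probability zero.
    """
    counts = {a: {b: pseudocount for b in "ACGT"} for a in "ACGT"}
    for s in seqs:
        s = s.upper()
        for prev, nxt in zip(s, s[1:]):
            if prev in counts and nxt in counts[prev]:
                counts[prev][nxt] += 1
    model = {}
    for a, row in counts.items():
        total = sum(row.values())
        model[a] = {b: c / total for b, c in row.items()}
    return model


def log_odds(seq, coding, background):
    """Sum of log(P_coding / P_background) over adjacent base pairs.

    Positive scores mean the sequence looks more like the coding training
    set than like the background training set.
    """
    seq = seq.upper()
    score = 0.0
    for prev, nxt in zip(seq, seq[1:]):
        if prev in coding and nxt in coding[prev]:
            score += math.log(coding[prev][nxt] / background[prev][nxt])
    return score


if __name__ == "__main__":
    # Toy training data, invented for this sketch.
    coding = train_markov(["GCGCGCGCGC"])
    background = train_markov(["ATATATATAT"])
    print(log_odds("GCGC", coding, background))
    print(log_odds("ATAT", coding, background))
```

The score depends entirely on the two training sets, so a model trained on one organism's genes can mis-score another organism's sequence. That is the reason for the "trained on a specific organism" requirement.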
![Page 14: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/14.jpg)
Gene prediction programs: Problems
• ORFs are not equivalent to CDSs
• Gene prediction programs find new genes that share properties with a given set of genes.
• They can be confounded by:
– Sequence constraints (ribosomal proteins etc.)
– Sequence biases
– Different sets of genes
– Horizontal gene transfer
– Non-coding DNA
![Page 15: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/15.jpg)
Gene prediction programs: Problems
Different gene training sets: Plasmodium falciparum
Original annotation
Updated annotation
![Page 16: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/16.jpg)
Gene prediction programs: Problems
Non-protein coding regions: S. typhi ribosomal RNA genes
Tracks shown: glimmer, genefinder, orpheus, final
![Page 17: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/17.jpg)
Gene prediction programs: Problems
Non-protein coding regions: N. meningitidis DNA repeats
Tracks shown: glimmer, orpheus, final
![Page 18: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/18.jpg)
Gene prediction programs: Problems
Pseudogenes: M. leprae
![Page 19: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/19.jpg)
Gene prediction programs: Problems
Pseudogenes: M. leprae
Glimmer
![Page 20: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/20.jpg)
Gene prediction programs: Problems
Pseudogenes: M. leprae
ORPHEUS
![Page 21: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/21.jpg)
Gene prediction programs: Problems
Pseudogenes: M. leprae
WUBLASTX vs. M. tuberculosis
![Page 22: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/22.jpg)
Gene prediction programs: Problems
Pseudogenes: M. leprae
Final annotation
![Page 23: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/23.jpg)
The Gene Prediction Process
DNA SEQUENCE
ANALYSIS SOFTWARE: AT content, gene finders, codon usage, BlastX, FASTA, ESTs
Annotator
Useful CDS prediction
![Page 24: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/24.jpg)
Eukaryotic gene
Diagram labels, gene: 5’UTR, ATG, Exon I, intron (GT ... AG), Exon II, intron (GT ... AG), Exon III, stop, 3’UTR
Diagram labels, transcripts: mRNA, cDNA, EST; CAP, AAAAAAAAAA, TTTTTTTTT
![Page 25: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/25.jpg)
AT content
• In AT-rich genomes, coding regions have a higher GC content than the surrounding non-coding sequence.
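One way to look for this signal is to compute GC content in sliding windows along the sequence. The sketch below is a minimal illustration: the function name `gc_content_windows` and the window and step sizes are assumptions, not settings taken from Artemis.

```python
def gc_content_windows(seq, window=120, step=60):
    """Return (window_start, gc_fraction) pairs along a DNA sequence.

    window and step are in bases; both defaults are arbitrary choices
    for this sketch.
    """
    seq = seq.upper()
    results = []
    for start in range(0, max(len(seq) - window + 1, 1), step):
        chunk = seq[start:start + window]
        if not chunk:
            break
        gc = chunk.count("G") + chunk.count("C")
        results.append((start, gc / len(chunk)))
    return results


if __name__ == "__main__":
    for start, gc in gc_content_windows("GGCCAATT", window=4, step=4):
        print(start, gc)
```

In an AT-rich genome, windows whose GC fraction rises above the local baseline are candidate coding regions to inspect further.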
![Page 26: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/26.jpg)
AT content
![Page 27: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/27.jpg)
CODON USAGE
• Codon bias is different for each organism.
• DNA content in coding regions is restricted
– but it is not restricted in non coding regions.
• The codon usage for any particular gene can influence expression.
![Page 28: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/28.jpg)
Codon usage
• All organisms have a preferred set of codons.
| Codon | Malaria | Trypanosoma |
|---|---|---|
| GUU | 0.41 | 0.28 |
| GUC | 0.06 | 0.19 |
| GUA | 0.42 | 0.14 |
| GUG | 0.11 | 0.39 |
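The four codons listed above (GUU, GUC, GUA, GUG) all encode valine, and each organism's values sum to 1. Fractions of this kind can be computed from a set of in-frame coding sequences, as in the sketch below. The function name `synonymous_fractions` and the example sequence are invented for illustration. The code works on DNA, so U is written as T.

```python
from collections import Counter

# Valine codons in DNA notation (GUN in RNA).
VALINE_CODONS = ("GTT", "GTC", "GTA", "GTG")


def synonymous_fractions(cds_list, codons=VALINE_CODONS):
    """Fraction of each synonymous codon among in-frame codons of the CDSs.

    Each sequence in cds_list is assumed to start in frame (position 0 is
    the first base of a codon).
    """
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - 2, 3):
            codon = cds[i:i + 3]
            if codon in codons:
                counts[codon] += 1
    total = sum(counts.values())
    if total == 0:
        return {c: 0.0 for c in codons}
    return {c: counts[c] / total for c in codons}


if __name__ == "__main__":
    print(synonymous_fractions(["ATGGTTGTTGTAGTGTAA"]))
```

Run over a large set of trusted genes from one organism, a calculation like this gives the kind of organism-specific preference table shown on the slide.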
![Page 29: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/29.jpg)
Codon Usage
• http://www.kazusa.or.jp/codon/
![Page 30: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/30.jpg)
Codon Usage in Artemis
Forward frames
Reverse frames
![Page 31: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/31.jpg)
Codon usage & gene finding in Leishmania
![Page 32: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/32.jpg)
GC frame plot
• Plots the third position GC content of each frame of a DNA sequence.
• In coding DNA the GC content of the 3rd base is often higher.
• Good prediction of coding in malaria and trypanosomes.
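The idea behind the plot can be sketched as follows: for each sliding window, compute the GC fraction of the bases that fall at the third codon position in each of the three forward frames. This is a simplified illustration and not Artemis's own implementation. The function name `gc3_by_frame` and the window and step sizes are assumptions.

```python
def gc3_by_frame(seq, window=120, step=30):
    """Return (window_start, [gc3_frame0, gc3_frame1, gc3_frame2]) rows.

    Frame f is the frame whose codons start at absolute positions
    f, f + 3, f + 6, ... so its third codon positions are the absolute
    positions p with p % 3 == (f + 2) % 3. Using absolute positions keeps
    each frame consistent from one window to the next.
    """
    seq = seq.upper()
    rows = []
    for start in range(0, len(seq) - window + 1, step):
        values = []
        for frame in range(3):
            thirds = [seq[p] for p in range(start, start + window)
                      if p % 3 == (frame + 2) % 3]
            gc = sum(1 for base in thirds if base in "GC")
            values.append(gc / len(thirds) if thirds else 0.0)
        rows.append((start, values))
    return rows


if __name__ == "__main__":
    for start, values in gc3_by_frame("ATG" * 10, window=9, step=3):
        print(start, values)
```

A region where one frame's value stays clearly above the other two is a candidate coding region in that frame. The reverse strand can be handled the same way on the reverse complement.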
![Page 33: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/33.jpg)
GC frame plot of tubulin gene cluster on T. brucei Chr 1
![Page 34: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/34.jpg)
Homology Data
• Coding regions are more conserved than non coding regions due to selective pressure.
• Comparing all possible translations against all known proteins will give clues to known genes.
• Blastx
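Blastx compares the six conceptual translations of a nucleotide query against a protein database. The translation step alone can be sketched as below. The database search is not shown, the standard genetic code is assumed, and the function names `translate` and `six_frame_translations` are chosen for illustration.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq):
    """Translate a DNA string codon by codon; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translations(seq):
    """Return translations keyed '+1', '+2', '+3', '-1', '-2', '-3'."""
    seq = seq.upper()
    rev = seq.translate(COMPLEMENT)[::-1]
    frames = {}
    for f in range(3):
        frames["+%d" % (f + 1)] = translate(seq[f:])
        frames["-%d" % (f + 1)] = translate(rev[f:])
    return frames


if __name__ == "__main__":
    for name, protein in sorted(six_frame_translations("ATGGCCTAA").items()):
        print(name, protein)
```

Each of the six protein strings would then be searched against known proteins. A strong hit in one frame suggests which strand and frame a gene lies in.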
![Page 35: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/35.jpg)
Gene finding: using ACT
TBLASTX comparisons
P. knowlesi
P. falciparum
P. yoelii
![Page 36: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/36.jpg)
Gene finding by RNA-Seq
(Transcriptional landscape of Neospora caninum tachyzoites)
Day 3 Tachyzoites (RNAseq)
Day 4 Tachyzoites (RNAseq)
![Page 37: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/37.jpg)
Day 3 Tachyzoites (RNAseq)
Day 4 Tachyzoites (RNAseq)
N. caninum Chr08
T. gondii Chr08
5’ UTR, 3’ UTR
TBLASTX matches visualised in ACT
Transcriptome sequencing in Neospora
(RNAseq is useful for predicting/confirming UTR boundaries)
![Page 38: Sequence Analysis with Artemis and Artemis Comparison Tool (ACT) Carribean Bioinformatics Workshop](https://reader035.fdocuments.us/reader035/viewer/2022062500/5681596b550346895dc6ab25/html5/thumbnails/38.jpg)
RNA-Seq: correcting gene models
Panels: Before and After, each with a %GC plot
Legend: 16hr, 32hr, 48hr