04/21/23 Burkhard Morgenstern, Tunis 2007
Multiple Alignment and Motif Searching
Burkhard Morgenstern
Universität Göttingen
Institute of Microbiology and Genetics
Department of Bioinformatics
Tunis, March 2007
http://www.gobics.de/burkhard/teaching/tunis_07.php
Information flow in the cell
Idea:
Sequence -> Structure -> Function
Gap between sequence data and structure/function data:
Lots of data available at the sequence level
Fewer data at the structure and function level
Exponential growth of databases
Major goal of bioinformatics: close the gap between sequence information and structure/function information
Most important tool for sequence analysis: sequence comparison
Simple approach: dot plot, more advanced approach: sequence alignment
The dot plot
Gibbs and McIntyre (1970)
Two sequences to be compared:
Y Q E W T Y I V A R E A Q Y E
C I V M R E Q Y
Comparison matrix: the first sequence is written along the top, the second down the left side
Search for pairs of identical residues
[Figure: dot plot of the two sequences]
Dot plot: a dot (X) for every pair of identical residues
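The construction described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `dot_plot` and `render` are made up for this example, and the sequences are the two example sequences from the slides.

```python
def dot_plot(top, side):
    """Return one row per residue of `side`; True marks a pair of identical residues."""
    return [[a == b for a in top] for b in side]


def render(top, side):
    """Draw the comparison matrix as text, with X for a dot and . for an empty cell."""
    lines = ["  " + " ".join(top)]
    for residue, row in zip(side, dot_plot(top, side)):
        lines.append(residue + " " + " ".join("X" if hit else "." for hit in row))
    return "\n".join(lines)


top = "YQEWTYIVAREAQYE"   # sequence written along the top
side = "CIVMREQY"         # sequence written down the left side
print(render(top, side))
```

The matrix has one cell per residue pair, so the cost grows with the product of the two sequence lengths.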
Homologies appear as diagonal lines running from top left to bottom right
Inversions appear as diagonals running from bottom left to top right
[Figure: dot plot of Y Q E W T Y Q E V R E Y Q E I (top) against C I V M R Y Q E (side)]
Repeats appear as parallel diagonals
Advantages:
1. Various types of similarity detectable (repeats, inversions)
2. Useful for large-scale analysis
Use filtering for long sequences: dots represent matching segments instead of matching single residues
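One simple way to realise such a filter is to place a dot only where a whole window of residues matches exactly. The sketch below assumes this exact-match window scheme; the name `filtered_dot_plot` and the `window` parameter are hypothetical, and practical tools often use a score threshold rather than exact identity.

```python
def filtered_dot_plot(top, side, window=3):
    """Return the set of (i, j) where top[i:i+window] equals side[j:j+window]."""
    hits = set()
    for i in range(len(top) - window + 1):
        for j in range(len(side) - window + 1):
            if top[i:i + window] == side[j:j + window]:
                hits.add((i, j))
    return hits


top = "YQEWTYIVAREAQYE"
side = "CIVMREQY"
# With window=2 only the shared segments IV, RE and QY remain.
print(sorted(filtered_dot_plot(top, side, window=2)))
```

A larger window removes more random single-residue dots but can also hide short genuine similarities.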
Pair-wise sequence alignment
Evolutionarily or structurally related sequences: alignment possible
Sequence homologies represented by inserting gaps
Two input sequences:
T Y I V A R E Q Y E
C I V M R E A Q Y
Comparison matrix for the two sequences
[Figure: dot plot for the two sequences]
Similarities occur in the same relative order over the entire sequences
Global alignment of the sequences is possible
An alignment corresponds to a path through the comparison matrix
[Figure: alignment path through the comparison matrix, with matches in red, mismatches in green and gaps in blue]
(Global) alignment: write the sequences on top of each other; gaps are represented by dash symbols
T Y I V A R E Q Y E
C I V M R E A Q Y
Input sequences
T Y I V A R E - Q Y E
C - I V M R E A Q Y -
Alignment of the input sequences
An alignment consists of matches (red), mismatches (green) and gaps (blue)
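To make the three column types concrete, the following sketch counts them for the example alignment of the two input sequences. The function name `classify_columns` is an assumption made for this illustration.

```python
def classify_columns(row1, row2, gap="-"):
    """Count matches, mismatches and gap columns of a pairwise alignment."""
    if len(row1) != len(row2):
        raise ValueError("both rows of an alignment must have the same length")
    counts = {"match": 0, "mismatch": 0, "gap": 0}
    for a, b in zip(row1, row2):
        if a == gap or b == gap:
            counts["gap"] += 1
        elif a == b:
            counts["match"] += 1
        else:
            counts["mismatch"] += 1
    return counts


counts = classify_columns("TYIVARE-QYE", "C-IVMREAQY-")
print(counts)  # 6 matches, 2 mismatches, 3 gap columns
```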
Basic task:
Find ‘best’ alignment of two sequences
= alignment that reflects structural and evolutionary relations
Questions:
1. What is a good alignment?
2. How to find the best alignment?
Idea: consider alignment as hypothesis about evolution of sequences.
Gaps correspond to insertions/deletions
Mismatches correspond to substitutions
T Y I V A R E - Q Y E
C - I V M R E A - Q Y
Problem: astronomical number of possible alignments
T Y I V A R E Q Y E
C I - V M R E A Q Y
T Y I V A R E - Q Y E
- C I V M R E A Q Y -
Stupid computer has to find out: which alignment is best?
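How large "astronomical" is can be checked with a short count. The sketch below assumes that every path through the comparison matrix made of diagonal, horizontal and vertical steps counts as a distinct alignment, so alignments that differ only in the order of adjacent gaps are counted separately. The function name `count_alignments` is hypothetical.

```python
def count_alignments(m, n):
    """Count paths from the top-left to the bottom-right corner of an (m+1) x (n+1) grid
    using diagonal (residue pair), down (gap in one row) and right (gap in the other) steps."""
    counts = [[1] * (n + 1) for _ in range(m + 1)]  # a single path along each border
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            counts[i][j] = counts[i - 1][j - 1] + counts[i - 1][j] + counts[i][j - 1]
    return counts[m][n]


print(count_alignments(2, 2))    # 13 alignments for two sequences of length 2
print(count_alignments(10, 9))   # the two short example sequences
print(count_alignments(100, 100))
```

Even for the short example sequences the count is far too large to score every alignment by hand, and it grows exponentially with sequence length.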
First (simplified) rules:
1. minimize number of mismatches
2. maximize number of matches
General assumption: sequences not too distantly related.
In this case: mismatches (substitutions) and gaps (insertions/deletions) unlikely
Consequence: good alignment should reduce gaps and mismatches
T Y I V A R E - Q Y E
C I - V M R E A Q Y -
Second (simplified) rule:
minimize number of gaps
T Y I V - A R E - Q Y E
C I - V M - R E A Q Y -
Parsimony principle: minimize number of evolutionary events
For protein sequences: different degrees of similarity among amino acids.
Counting matches/mismatches is oversimplistic.
T Y I V
T L V
Protein sequences to be aligned
T Y I V
T L - V
Possible alignment
T Y I V
T - L V
Alternative alignment
Some amino acid residues are more similar to each other than others
Therefore: similarity among amino acid residues has to be taken into account.
To assess quality of protein alignments:
use similarity scores for amino acids
s(a,b): similarity score for amino acids a and b
Similarity measured by substitution matrices based on substitution probabilities
Important substitution matrices:
PAM (M. Dayhoff)
BLOSUM (S. Henikoff / J. Henikoff)
The PAM matrix:
Consider probability $p_{a,b}$ of substitution a → b (or b → a) for amino acids a and b
Define for amino acids a and b similarity score S(a,b) based on probability $p_{a,b}$
First task: find out $p_{a,b}$ for every pair of amino acids a, b
Use closely related protein families – no alignment problem, no double substitutions
Construct phylogenetic tree with parsimony method
Count substitution frequencies/probabilities
Normalize substitution probabilities
Extrapolate probabilities for larger evolutionary distances
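The extrapolation step is commonly described as repeated matrix multiplication: the substitution-probability matrix for one unit of evolutionary distance is raised to the n-th power to model n units. The sketch below shows this on a made-up two-letter alphabet. The numbers in `toy_pam1` are purely illustrative and are not taken from any real PAM matrix, and the helper names are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, n):
    """Raise a square matrix to the n-th power by repeated multiplication."""
    size = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, m)
    return result


# Hypothetical substitution probabilities for one unit of distance;
# row a, column b holds the probability that a is replaced by b.
toy_pam1 = [[0.99, 0.01],
            [0.02, 0.98]]
toy_pam2 = mat_pow(toy_pam1, 2)
toy_pam250 = mat_pow(toy_pam1, 250)
print(toy_pam2)
print(toy_pam250)
```

Each row still sums to 1 after extrapolation, and for large distances the rows approach the background frequencies of the residues.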
Finally: define similarity score
$S(a,b) = \log \frac{p_{a,b}}{q_a \, q_b}$
$q_a$ = (relative) frequency of amino acid a
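A small numerical sketch of this log-odds score follows. The probabilities and frequencies are invented for illustration only, and the function name `log_odds` is hypothetical. The point is the sign: the score is positive when a pair occurs more often than expected by chance, and negative when it occurs less often.

```python
import math


def log_odds(p_ab, q_a, q_b):
    """Similarity score S(a,b) = log(p_ab / (q_a * q_b)), natural logarithm."""
    return math.log(p_ab / (q_a * q_b))


# Hypothetical values: both residues have background frequency 0.1,
# so a pairing is expected by chance with probability 0.01.
frequent_pair = log_odds(0.02, 0.1, 0.1)    # observed twice as often as chance
rare_pair = log_odds(0.005, 0.1, 0.1)       # observed half as often as chance
print(frequent_pair, rare_pair)
```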
Given a similarity score s(a,b) for pairs of amino acids, define quality score of alignment as:
sum of similarity values s(a,b) of aligned residues
minus gap penalty g for each residue aligned with a gap
Example:
Score = s(T,T) + s(I,L) + s(V,V) - g
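The score definition can be written out directly. In the sketch below, `toy_s` is a made-up similarity function (identical residues 2, the similar pair I/L 1, everything else -1) and g = 2 is an arbitrary gap penalty; neither is taken from PAM or BLOSUM.

```python
def alignment_score(row1, row2, s, g, gap="-"):
    """Sum s(a, b) over aligned residue pairs and subtract g per residue aligned with a gap."""
    total = 0
    for a, b in zip(row1, row2):
        if a == gap or b == gap:
            total -= g
        else:
            total += s(a, b)
    return total


def toy_s(a, b):
    """Hypothetical similarity score, for illustration only."""
    if a == b:
        return 2
    if {a, b} == {"I", "L"}:
        return 1
    return -1


print(alignment_score("TYIV", "T-LV", toy_s, g=2))  # s(T,T) + s(I,L) + s(V,V) - g = 3
print(alignment_score("TYIV", "TL-V", toy_s, g=2))  # s(T,T) + s(Y,L) + s(V,V) - g = 1
```

Under these toy values the alignment that pairs I with L scores higher than the one that pairs Y with L, which is the effect a similarity score is meant to capture.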
Next question: find alignment with best score
Dynamic-programming algorithm finds alignment with best score.
(Needleman and Wunsch, 1970)
T Y I V A R E A Q Y E
- C I V M R E - Q Y -
Alignment corresponds to path through comparison matrix
T W L V - R E A Q I
- C I V M R E - H Y
Score of alignment: sum of similarity values of aligned residues minus gap penalties
Example: S = - g + s(W,C) + s(L,I) + s(V,V) - g + s(R,R) + …
![Page 71: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/71.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
Pair-wise sequence alignment
T W L V R E A Q Y I X X C X I X V X M X R X E X X H X Y X X
T W L V - R E A Q I - C I V M R E - H Y
![Page 72: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/72.jpg)
Pair-wise sequence alignment
[Figure: comparison matrix of the two sequences with the alignment path marked by X]
Alignment corresponds to a path through the comparison matrix
T W L V - R E A Q I
- C I V M R E - H Y
![Page 73: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/73.jpg)
Pair-wise sequence alignment
[Figure: comparison matrix with column index i and row index j; the path up to cell (i,j) is marked by X]
Dynamic programming: calculate scores S(i,j) of the optimal alignment of the prefixes up to positions i and j.
T W L V - R
- C I V M R
![Page 74: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/74.jpg)
Pair-wise sequence alignment
[Figure: comparison matrix with column index i and row index j; the path up to cell (i,j) is marked by X]
S(i,j) can be calculated from the possible predecessors S(i-1,j-1), S(i,j-1), S(i-1,j).
T W L V - R
- C I V M R
![Page 75: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/75.jpg)
Pair-wise sequence alignment
[Figure: comparison matrix; the path reaches cell (i,j) from the top left]
Score of optimal path that comes from top left = S(i-1,j-1) + s(R,R)
T W L V - R
- C I V M R
![Page 76: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/76.jpg)
Pair-wise sequence alignment
[Figure: comparison matrix; the path reaches cell (i,j) from above, i.e. from row j-1]
Score of optimal path that comes from above = S(i,j-1) - g
T W L V R -
- C I V M R
![Page 77: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/77.jpg)
Pair-wise sequence alignment
[Figure: comparison matrix; the path reaches cell (i,j) from the left, i.e. from column i-1]
Score of optimal path that comes from left = S(i-1,j) - g
T W L - - V R
- C I V M R -
![Page 78: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/78.jpg)
Pair-wise sequence alignment
[Figure: comparison matrix with the three possible predecessors of cell (i,j)]
Score of optimal path = maximum of these three values
T W L - - V R
- C I V M R -
![Page 79: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/79.jpg)
Pair-wise sequence alignment
Recursion formula for global alignment:
For sequences x and y
$$
S(i,j) = \max \begin{cases}
S(i-1,j-1) + s(x_i, y_j) \\
S(i-1,j) - g \\
S(i,j-1) - g
\end{cases}
$$
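As a minimal sketch of how this recursion fills the score matrix, the following Python function computes S for two sequences under a linear gap penalty. The function name and the toy similarity values are assumptions made for illustration; the first row and column are set to multiples of -g, which is the usual initialization for global alignment.

```python
def global_alignment_matrix(x, y, s, g):
    """Return S with S[i][j] = optimal score of x[:i] aligned to y[:j]."""
    n, m = len(x), len(y)
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][0] = -i * g          # prefix of x against the empty prefix
    for j in range(1, m + 1):
        S[0][j] = -j * g          # empty prefix against prefix of y
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            S[i][j] = max(
                S[i - 1][j - 1] + s(x[i - 1], y[j - 1]),  # residues aligned
                S[i - 1][j] - g,                          # x_i opposite a gap
                S[i][j - 1] - g,                          # y_j opposite a gap
            )
    return S


def toy_similarity(a, b):
    # Hypothetical values, not a real substitution matrix.
    return 2 if a == b else -1


S = global_alignment_matrix("TWLVR", "CIVMREHY", toy_similarity, g=2)
print(S[5][8])  # score of the optimal global alignment
```

The entry in the last cell is the score of the optimal global alignment. The alignment itself is not produced here.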
![Page 80: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/80.jpg)
Pair-wise sequence alignment
[Figure: empty comparison matrix with columns T W L V R and rows C I V M R E H Y]
![Page 81: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/81.jpg)
Pair-wise sequence alignment
[Figure: comparison matrix with columns T W L V R and rows C I V M R E H Y, filled in step by step]
Fill the matrix from top left to bottom right.
![Page 94: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/94.jpg)
Pair-wise sequence alignment
[Figure: completely filled comparison matrix with columns T W L V R and rows C I V M R E H Y]
Find optimal alignment by trace-back procedure
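The trace-back can be sketched as follows, assuming the matrix S has already been filled with the global recursion and a linear gap penalty g. `trace_back` is a hypothetical helper name, and the small matrix in the usage example was filled by hand for the toy pair GAT / GT with similarity 2 for identical residues, -1 otherwise, and g = 2. When several predecessors give the same score, this sketch prefers the diagonal step, so it returns one optimal alignment, not all of them.

```python
def trace_back(S, x, y, s, g):
    """Recover one optimal global alignment from a filled score matrix.

    S[i][j] is assumed to be the optimal score of x[:i] against y[:j].
    """
    i, j = len(x), len(y)
    row_x, row_y = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and S[i][j] == S[i - 1][j - 1] + s(x[i - 1], y[j - 1]):
            row_x.append(x[i - 1])    # came from the top-left neighbour
            row_y.append(y[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and S[i][j] == S[i - 1][j] - g:
            row_x.append(x[i - 1])    # x_i stands opposite a gap
            row_y.append("-")
            i -= 1
        else:
            row_x.append("-")         # y_j stands opposite a gap
            row_y.append(y[j - 1])
            j -= 1
    return "".join(reversed(row_x)), "".join(reversed(row_y))


def toy_similarity(a, b):
    # Hypothetical values for illustration only.
    return 2 if a == b else -1


# Score matrix for x = "GAT", y = "GT" under the toy scoring, g = 2.
S = [[0, -2, -4],
     [-2, 2, 0],
     [-4, 0, 1],
     [-6, -2, 2]]
aligned = trace_back(S, "GAT", "GT", toy_similarity, g=2)
print(aligned)
```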
![Page 95: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/95.jpg)
Pair-wise sequence alignment
[Figure: comparison matrix with columns T W L V R and rows C I V M R E H Y; only the first row and the first column are marked]
Initial matrix entries?
![Page 96: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/96.jpg)
Pair-wise sequence alignment
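[Figure: comparison matrix with columns T W L V R (index i) and rows C I V M R E H Y (index j); the path up to cell (i,j) is marked by X]
Entries S(i,j): scores of the optimal alignment of the prefixes up to positions i and j.
T W L V
- C I V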
![Page 97: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/97.jpg)
Pair-wise sequence alignment
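[Figure: comparison matrix with columns T W L V R (index i) and rows C I V M R E H Y (index j); the first row is marked by X]
Entries S(i,0): scores of the optimal alignment of the prefix up to position i with the empty prefix.
Score = - i * g
T W L V
- - - -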
![Page 98: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/98.jpg)
Pair-wise sequence alignment
[Figure: empty comparison matrix with columns T W L V R and rows C I V M R E H Y]
Initial matrix entries: example, g = 2
![Page 99: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/99.jpg)
Pair-wise sequence alignment
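|   |     | T  | W  | L  | V  | R   |
|---|-----|----|----|----|----|-----|
|   | 0   | -2 | -4 | -6 | -8 | -10 |
| C | -2  |    |    |    |    |     |
| I | -4  |    |    |    |    |     |
| V | -6  |    |    |    |    |     |
| M | -8  |    |    |    |    |     |
| R | -10 |    |    |    |    |     |
| E | -12 |    |    |    |    |     |
| H | -14 |    |    |    |    |     |
| Y | -16 |    |    |    |    |     |

Initial matrix entries: example, g = 2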
![Page 100: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/100.jpg)
Pair-wise global alignment
[Figure: comparison matrix of the two sequences with the alignment path marked by X]
T W L V - R E A Q I
- C I V M R E - F Y
![Page 101: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/101.jpg)
Pair-wise global alignment
Computational complexity: how do program run time and memory depend on the size of the input data?
For sequences of lengths l1 and l2, computing time and memory are proportional to l1 * l2
Time and memory complexity = O(l1 * l2)
![Page 102: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/102.jpg)
Pair-wise sequence alignment
More realistic gap penalty: affine-linear instead of linear
Penalty for gap of length l:
c0 + (l-1)* c1
c0 = ‘gap-opening penalty’
c1 = ‘gap-extension penalty’
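A small Python sketch of the two penalty schemes makes the difference concrete. The function names and the numeric values (c0 = 10, c1 = 1, g = 10) are assumptions chosen for illustration only.

```python
def linear_gap_penalty(length, g):
    """Linear scheme: every gap position costs g."""
    return length * g


def affine_gap_penalty(length, c0, c1):
    """Affine scheme: c0 for the first gap position, c1 for each further one."""
    if length < 1:
        raise ValueError("gap length must be at least 1")
    return c0 + (length - 1) * c1


# One gap of length 4 versus four separate gaps of length 1.
one_long_gap = affine_gap_penalty(4, c0=10, c1=1)
four_short_gaps = 4 * affine_gap_penalty(1, c0=10, c1=1)
print(one_long_gap, four_short_gaps, linear_gap_penalty(4, g=10))
```

With c1 smaller than c0, a single long gap is penalized less than several short gaps of the same total length, whereas the linear scheme cannot tell the two cases apart.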
![Page 103: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/103.jpg)
Pair-wise local alignment
So far: global alignment considered: sequences aligned over their entire length.
But: sequences often share only local sequence similarity (conserved genes or domains)
Most important application: database searching
![Page 104: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/104.jpg)
Pair-wise local alignment
[Figure: comparison matrix of the two sequences with the alignment path marked by X]
T W L V - R E A Q I
- C I V M R E - F Y
![Page 105: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/105.jpg)
Pair-wise local alignment
[Figure: comparison matrix of the two sequences with the alignment path marked by X]
T W L V - R E A Q I
- C I V M R E - F Y
![Page 106: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/106.jpg)
Pair-wise local alignment
Problem:
Find pair of segments with maximal alignment score (not necessarily part of optimal global alignment!)
Equivalent: find path starting and ending anywhere in the matrix.
![Page 107: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/107.jpg)
Pair-wise local alignment
[Figure: comparison matrix of the two sequences with the alignment path marked by X]
T W L V - R E A Q I
- C I V M R E - F Y
![Page 108: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/108.jpg)
Pair-wise local alignment
Recursion formula for global alignment:
S(i,j) = max { S(i-1,j-1) + s(ai,bj) , S(i-1,j) – g , S(i,j-1) – g }
![Page 109: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/109.jpg)
Pair-wise local alignment
Recursion formula for local alignment:
S(i,j) = max { 0 , S(i-1,j-1) + s(ai,bj) , S(i-1,j) – g , S(i,j-1) – g }
![Page 110: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/110.jpg)
Pair-wise local alignment
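|   |   | T | W | L | V | R |
|---|---|---|---|---|---|---|
|   | 0 | 0 | 0 | 0 | 0 | 0 |
| C | 0 |   |   |   |   |   |
| I | 0 |   |   |   |   |   |
| V | 0 |   |   |   |   |   |
| M | 0 |   |   |   |   |   |
| R | 0 |   |   |   |   |   |
| E | 0 |   |   |   |   |   |
| H | 0 |   |   |   |   |   |
| Y | 0 |   |   |   |   |   |

Initial matrix entries = 0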
![Page 111: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/111.jpg)
Pair-wise local alignment
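|   |   | T | W | L | V | R |
|---|---|---|---|---|---|---|
|   | 0 | 0 | 0 | 0 | 0 | 0 |
| C | 0 | 0 |   |   |   |   |
| I | 0 |   |   |   |   |   |
| V | 0 |   |   |   |   |   |
| M | 0 |   |   |   |   |   |
| R | 0 |   |   |   |   |   |
| E | 0 |   |   |   |   |   |
| H | 0 |   |   |   |   |   |
| Y | 0 |   |   |   |   |   |

s(C,T) = -2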
![Page 112: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/112.jpg)
Pair-wise sequence alignment
Recursion formula for global alignment:
$$
S(i,j) = \max \begin{cases}
S(i-1,j-1) + s(x_i, y_j) \\
S(i-1,j) - g \\
S(i,j-1) - g
\end{cases}
$$
![Page 113: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/113.jpg)
Pair-wise sequence alignment
Recursion formula for local alignment:
$$
S(i,j) = \max \begin{cases}
0 \\
S(i-1,j-1) + s(x_i, y_j) \\
S(i-1,j) - g \\
S(i,j-1) - g
\end{cases}
$$
![Page 114: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/114.jpg)
Pair-wise sequence alignment
For trace-back:
Store positions imax and jmax with S(imax, jmax) maximal
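Putting the local recursion and the stored maximum together, a minimal Python sketch of local alignment with a linear gap penalty could look like this. `local_align` and the toy similarity values are assumptions for illustration. The trace-back starts at (imax, jmax) and stops at the first cell with score 0. Ties are resolved in favour of the first maximum found and the diagonal step, so only one of possibly several optimal local alignments is returned.

```python
def local_align(x, y, s, g):
    """Return (score, segment_x, segment_y) of one optimal local alignment."""
    n, m = len(x), len(y)
    S = [[0] * (m + 1) for _ in range(n + 1)]   # first row and column stay 0
    best, i_max, j_max = 0, 0, 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            S[i][j] = max(
                0,
                S[i - 1][j - 1] + s(x[i - 1], y[j - 1]),
                S[i - 1][j] - g,
                S[i][j - 1] - g,
            )
            if S[i][j] > best:                  # remember the maximal entry
                best, i_max, j_max = S[i][j], i, j

    i, j = i_max, j_max
    row_x, row_y = [], []
    while i > 0 and j > 0 and S[i][j] > 0:
        if S[i][j] == S[i - 1][j - 1] + s(x[i - 1], y[j - 1]):
            row_x.append(x[i - 1])
            row_y.append(y[j - 1])
            i, j = i - 1, j - 1
        elif S[i][j] == S[i - 1][j] - g:
            row_x.append(x[i - 1])
            row_y.append("-")
            i -= 1
        else:
            row_x.append("-")
            row_y.append(y[j - 1])
            j -= 1
    return best, "".join(reversed(row_x)), "".join(reversed(row_y))


def toy_similarity(a, b):
    # Hypothetical values, not a real substitution matrix.
    return 2 if a == b else -1


result = local_align("TWLVREAQ", "CIVMREHY", toy_similarity, g=2)
print(result)
```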
![Page 115: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/115.jpg)
Pair-wise local alignment
[Figure: comparison matrix of the two sequences with the alignment path marked by X]
T W L V - R E A Q I
- C I V M R E - F Y
![Page 116: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/116.jpg)
Pair-wise local alignment
Algorithm by Smith and Waterman (1981)
Implementation: e.g. BestFit in GCG package
![Page 117: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/117.jpg)
Pair-wise local alignment
Complexity: for sequences of lengths l1 and l2, computing time and memory are proportional to l1 * l2
Time and space complexity = O(l1 * l2)
Too slow for database searching! Therefore tools like BLAST are necessary for database searching
![Page 118: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/118.jpg)
The Basic Local Alignment Search Tool (BLAST)
New BLAST version (1997):
Two-hit strategy
Gapped BLAST
Position-Specific Iterative BLAST (PSI BLAST)
![Page 119: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/119.jpg)
The Basic Local Alignment Search Tool (BLAST)
PSI BLAST:
1. search database with standard BLAST
2. take best hits and create multiple alignment
3. calculate profile from multiple alignment
4. search database again with profile as query
![Page 120: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/120.jpg)
The Basic Local Alignment Search Tool (BLAST)
![Page 121: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/121.jpg)
The Basic Local Alignment Search Tool (BLAST)
Profile for a sequence family or motif:
table of amino acid/nucleotide frequencies at each position in the alignment.
![Page 122: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/122.jpg)
The Basic Local Alignment Search Tool (BLAST)
Profile: frequencies of nucleotides at every position (in percent, gap positions not counted).
seq1 A T T G - A T
seq2 C T T G T A G
seq3 A - - G T A T
seq4 A T G G T G T
seq5 A C T G T A C
A 80 0 0 0 0 80 0
T 0 75 75 0 100 0 60
C 20 25 0 0 0 0 20
G 0 0 25 100 0 20 20
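The table can be reproduced with a short Python sketch. `column_profile` is a hypothetical helper, and it assumes, as the numbers above imply, that gap characters are left out of the denominator when the percentages are computed.

```python
from collections import Counter


def column_profile(rows, alphabet="ATCG", gap="-"):
    """Percent frequency of each symbol per alignment column, ignoring gaps."""
    profile = {symbol: [] for symbol in alphabet}
    for column in zip(*rows):
        counts = Counter(c for c in column if c != gap)
        total = sum(counts.values())
        for symbol in alphabet:
            # A column consisting only of gaps gets frequency 0 everywhere.
            profile[symbol].append(100 * counts[symbol] / total if total else 0.0)
    return profile


alignment = [
    "ATTG-AT",
    "CTTGTAG",
    "A--GTAT",
    "ATGGTGT",
    "ACTGTAC",
]
profile = column_profile(alignment)
for symbol, row in profile.items():
    print(symbol, row)
```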
![Page 123: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/123.jpg)
Tools for multiple sequence alignment
s1 T Y I M R E A Q Y E S A Q
s2 T C I V M R E A Y E
s3 Y I M Q E V Q Q E R
s4 W R Y I A M R E Q Y E
![Page 124: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/124.jpg)
Tools for multiple sequence alignment
s1 - T Y I - M R E A Q Y E S A Q
s2 - T C I V M R E A - Y E - - -
s3 - - Y I - M Q E V Q Q E R - -
s4 W R Y I A M R E - Q Y E - - -
![Page 127: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/127.jpg)
Tools for multiple sequence alignment
s1 - T Y I - M R E A Q Y E S A Q
s2 - T C I V M R E A - Y E - - -
s3 - - Y I - M Q E V Q Q E R - -
s4 W R Y I A M R E - Q Y E - - -
General information in multiple alignment:
Functionally important regions are more conserved than non-functional regions
Local sequence conservation indicates functionality!
![Page 128: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/128.jpg)
Tools for multiple sequence alignment
s1 - T Y I - M R E A Q Y E S A Q
s2 - T C I V M R E A - Y E - - -
s3 - - Y I - M Q E V Q Q E R - -
s4 W R Y I A M R E - Q Y E - - -
For phylogeny reconstruction:
Estimate pairwise distances between sequences (distance-based methods for tree reconstruction)
Estimate evolutionary events (parsimony and maximum likelihood methods)
![Page 129: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/129.jpg)
Tools for multiple sequence alignment
s1 - T Y I - M R E A Q Y E S A Q
s2 - T C I V M R E A - Y E - - -
s3 - - Y I - M Q E V Q Q E R - -
s4 W R Y I A M R E - Q Y E - - -
Astronomical number of possible alignments!
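To get a feeling for how quickly the number of alignments grows, one can count the alignments of just two sequences. The sketch below is an illustration under an assumed counting convention: every column is either a pair of residues or a residue opposite a gap, and alignments that differ only in the order of adjacent gaps are counted separately. The function name is hypothetical. For more than two sequences the numbers grow far faster.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_pairwise_alignments(m, n):
    """Number of gapped alignments of two sequences of lengths m and n."""
    if m == 0 or n == 0:
        return 1  # only one way: the remaining residues all face gaps
    return (
        count_pairwise_alignments(m - 1, n - 1)  # last column: residue pair
        + count_pairwise_alignments(m - 1, n)    # last column: gap in second sequence
        + count_pairwise_alignments(m, n - 1)    # last column: gap in first sequence
    )


for length in (1, 2, 3, 5, 10, 15):
    print(length, count_pairwise_alignments(length, length))
```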
![Page 130: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/130.jpg)
Tools for multiple sequence alignment
s1 - T Y I - M R E A Q Y E S A Q
s2 - T C I V M R E A - - - Y E -
s3 Y I - - - M Q E V Q Q E R - -
s4 W R Y I A M R E - Q Y E - - -
Astronomical number of possible alignments!
![Page 131: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/131.jpg)
Tools for multiple sequence alignment
s1 - T Y I - M R E A Q Y E S A Q
s2 - T C I V M R E A - - - Y E -
s3 Y I - - - M Q E V Q Q E R - -
s4 W R Y I A M R E - Q Y E - - -
Computer has to decide: which one is best??
![Page 132: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/132.jpg)
Tools for multiple sequence alignment
Questions in development of multiple-alignment programs (as in pairwise alignment):
(1) What is a good alignment? → objective function (‘score’)
(2) How to find a good alignment? → optimization algorithm
First question far more important!
![Page 133: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/133.jpg)
Tools for multiple sequence alignment
Traditional objective functions:
Define score of alignments as
sum of individual similarity scores s(a,b)
minus gap penalties
Needleman-Wunsch scoring system (1970)
![Page 134: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/134.jpg)
Tools for multiple sequence alignment
Traditional Objective functions
Can be generalized to multiple alignment
(e.g. sum-of-pair score, tree alignment)
Needleman-Wunsch algorithm can also be generalized to multiple alignment, but:
Very time and memory consuming!
-> Heuristics needed
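A rough size estimate shows why the exact generalization becomes impractical. The sketch assumes the straightforward extension of the pairwise recursion to N sequences: the score matrix becomes an N-dimensional lattice with one cell per combination of prefixes, and each cell has 2^N - 1 possible predecessors. `exact_dp_size` is a hypothetical helper name.

```python
from math import prod


def exact_dp_size(lengths):
    """Return (number of lattice cells, predecessors per cell) for exact DP."""
    cells = prod(n + 1 for n in lengths)     # one cell per tuple of prefixes
    predecessors = 2 ** len(lengths) - 1     # every non-empty subset of sequences advances
    return cells, predecessors


for n_seqs in (2, 3, 5, 10):
    print(n_seqs, exact_dp_size([100] * n_seqs))
```

For two sequences this gives the familiar l1 * l2 behaviour with three predecessors per cell. The cell count grows exponentially with the number of sequences.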
![Page 135: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/135.jpg)
Multiple sequence alignment
First question: how to score multiple alignments?
Possible scoring scheme:
Sum-of-pairs score
![Page 136: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/136.jpg)
Multiple sequence alignment
Multiple alignment implies pairwise alignments:
1aboA 36 WCEAQt..kngqGWVPSNYITPVN......
1ycsB 39 WWWARl..ndkeGYVPRNLLGLYP......
1pht 51 WLNGYnettgerGDFPGTYVEYIGrkkisp
1ihvA 27 AVVIQd..nsdiKVVPRRKAKIIRd.....
1vie 28 YAVESeahpgsvQIYPVAALERIN......
![Page 140: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/140.jpg)
Multiple sequence alignment
Multiple alignment implies pairwise alignments:
Use sum of scores of these p.a.
1aboA 36 WCEAQt..kngqGWVPSNYITPVN......
1ycsB 39 WWWARl..ndkeGYVPRNLLGLYP......
1pht 51 WLNGYnettgerGDFPGTYVEYIGrkkisp
1ihvA 27 AVVIQd..nsdiKVVPRRKAKIIRd.....
1vie 28 YAVESeahpgsvQIYPVAALERIN......
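A minimal sketch of the sum-of-pairs idea, assuming a toy scoring scheme (match +1, mismatch -1, gap -2). These values and the function names are illustrative and not taken from the slides; a real scheme would use a substitution matrix. Each pair of rows is projected out of the multiple alignment, columns in which both rows carry a gap are skipped, and the pairwise scores are added up. Gaps are written as `-` here.

```python
from itertools import combinations

def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score the pairwise alignment induced by two rows of a multiple alignment."""
    total = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue              # column disappears in the pairwise projection
        if x == "-" or y == "-":
            total += gap          # residue aligned to a gap
        elif x == y:
            total += match
        else:
            total += mismatch
    return total

def sum_of_pairs(rows, **scoring):
    """Sum of the scores of all induced pairwise alignments."""
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all rows of an alignment must have the same length")
    return sum(pair_score(a, b, **scoring) for a, b in combinations(rows, 2))

alignment = ["E-YENS", "ERYENS", "ERYA-S"]
print(sum_of_pairs(alignment))  # 3 + (-2) + 1 = 2
```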
![Page 141: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/141.jpg)
Multiple sequence alignment
Needleman-Wunsch scoring scheme can be generalized from pair-wise to multiple alignment
![Page 142: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/142.jpg)
Multiple sequence alignment
![Page 143: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/143.jpg)
Multiple sequence alignment
Complexity:
For three sequences of lengths l1, l2, l3:
O( l1 * l2 * l3 )
For n sequences ( average length l ):
O( l^n )
Exponential complexity!
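To get a feeling for these numbers, the following sketch counts the cells of the dynamic-programming table that a direct generalization of Needleman-Wunsch would have to fill (one axis of size length + 1 per sequence). The function name is hypothetical and the count ignores the additional work per cell.

```python
from math import prod

def dp_cells(lengths):
    """Number of cells in the n-dimensional dynamic-programming table."""
    return prod(length + 1 for length in lengths)

for n in (2, 3, 5, 10):
    # n sequences of length 100 each
    print(n, dp_cells([100] * n))
```

Two sequences of length 100 need about 10^4 cells, three need about 10^6, and ten already need about 10^20.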
![Page 144: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/144.jpg)
Multiple sequence alignment
Needleman-Wunsch scoring scheme can be generalized from pair-wise to multiple alignment
Optimal solution not feasible:
-> Heuristics necessary
![Page 145: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/145.jpg)
`Progressive´ Alignment
WCEAQTKNGQGWVPSNYITPVN
WWRLNDKEGYVPRNLLGLYP
AVVIQDNSDIKVVPKAKIIRD
YAVESEAHPGSFQPVAALERIN
WLNYNETTGERGDFPGTYVEYIGRKKISP
![Page 146: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/146.jpg)
`Progressive´ Alignment
WCEAQTKNGQGWVPSNYITPVN
WWRLNDKEGYVPRNLLGLYP
AVVIQDNSDIKVVPKAKIIRD
YAVESEAHPGSFQPVAALERIN
WLNYNETTGERGDFPGTYVEYIGRKKISP
Guide tree
![Page 147: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/147.jpg)
`Progressive´ Alignment
WCEAQTKNGQGWVPSNYITPVN
WWRLNDKEGYVPRNLLGLYP
AVVIQDNSDIKVVPKAKIIRD
YAVESEAHPGSFQPVAALERIN
WLNYNETTGERGDFPGTYVEYIGRKKISP
Idea: align closely related sequences first!
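A rough sketch of how a guide tree can fix the order "closest sequences first". It is an illustration under loud assumptions: distances come from `difflib.SequenceMatcher` similarity and clusters are joined by average linkage. Programs such as CLUSTAL W derive their guide tree differently (from pairwise alignment scores); the function names below are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

def distance(a, b):
    """Crude dissimilarity in [0, 1]; a stand-in for a real alignment distance."""
    return 1.0 - SequenceMatcher(None, a, b, autojunk=False).ratio()

def guide_tree(seqs):
    """Average-linkage clustering; returns a nested tuple of sequence indices."""
    d = {(i, j): distance(seqs[i], seqs[j])
         for i, j in combinations(range(len(seqs)), 2)}
    # each cluster is (subtree, list of member indices)
    clusters = [(i, [i]) for i in range(len(seqs))]

    def linkage(c1, c2):
        pairs = [(min(i, j), max(i, j)) for i in c1[1] for j in c2[1]]
        return sum(d[p] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # join the two closest clusters first
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda ab: linkage(clusters[ab[0]], clusters[ab[1]]))
        merged = ((clusters[a][0], clusters[b][0]), clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters[0][0]

seqs = ["AAAAAAAA", "AAAAAAAC", "GGGGGGGG", "GGGGGGGT"]
print(guide_tree(seqs))  # ((0, 1), (2, 3))
```

The inner pairs of the tree are aligned first; each inner node then corresponds to aligning the two sub-alignments (profiles) below it.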
![Page 148: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/148.jpg)
`Progressive´ Alignment
WCEAQTKNGQGWVPSNYITPVN
WW--RLNDKEGYVPRNLLGLYP-
AVVIQDNSDIKVVP--KAKIIRD
YAVESEASFQPVAALERIN
WLNYNEERGDFPGTYVEYIGRKKISP
Profile alignment, “once a gap - always a gap”
![Page 149: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/149.jpg)
`Progressive´ Alignment
WCEAQTKNGQGWVPSNYITPVN
WW--RLNDKEGYVPRNLLGLYP-
AVVIQDNSDIKVVP--KAKIIRD
YAVESEASVQ--PVAALERIN------
WLN-YNEERGDFPGTYVEYIGRKKISP
Profile alignment, “once a gap - always a gap”
![Page 150: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/150.jpg)
`Progressive´ Alignment
WCEAQTKNGQGWVPSNYITPVN-
WW--RLNDKEGYVPRNLLGLYP-
AVVIQDNSDIKVVP--KAKIIRD
YAVESEASVQ--PVAALERIN------
WLN-YNEERGDFPGTYVEYIGRKKISP
Profile alignment, “once a gap - always a gap”
![Page 151: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/151.jpg)
`Progressive´ Alignment
WCEAQTKNGQGWVPSNYITPVN--------
WW--RLNDKEGYVPRNLLGLYP--------
AVVIQDNSDIKVVP--KAKIIRD-------
YAVESEA---SVQ--PVAALERIN------
WLN-YNE---ERGDFPGTYVEYIGRKKISP
Profile alignment, “once a gap - always a gap”
![Page 152: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/152.jpg)
`Progressive´ Alignment
“Greedy” algorithm:
Consider partial solutions of a bigger problem:
1. Search for the best partial solution; fix it.
2. Search for the second-best partial solution that is consistent with the first one; fix it.
3. Search for the third-best partial solution … etc.
E.g.: knapsack problem
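The greedy scheme can be illustrated on the knapsack (rucksack) problem. The sketch below ranks items by value per unit weight and takes each item that still fits; the ranking criterion, the item values and the function name are assumptions chosen for illustration. The example also shows that a greedy choice, once fixed, can block a better overall solution.

```python
def greedy_knapsack(items, capacity):
    """items: list of (value, weight). Returns (total value, chosen items)."""
    chosen, total, room = [], 0, capacity
    # best "partial solution" first: highest value per unit of weight
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if weight <= room:        # consistent with what is already fixed?
            chosen.append((value, weight))
            total += value
            room -= weight
    return total, chosen

items = [(60, 10), (100, 20), (120, 30)]
print(greedy_knapsack(items, capacity=50))
# (160, [(60, 10), (100, 20)]); the optimum is 220, using the last two items
```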
![Page 153: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/153.jpg)
`Progressive´ Alignment
WCEAQTKNGQGWVPSNYITPVN--------
WW--RLNDKEGYVPRNLLGLYP--------
AVVIQDNSDIKVVP--KAKIIRD-------
YAVESEA---SVQ--PVAALERIN------
WLN-YNE---ERGDFPGTYVEYIGRKKISP
Profile alignment, “once a gap - always a gap”
![Page 154: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/154.jpg)
`Progressive´ Alignment
Most important software program:
CLUSTAL W: J. Thompson, T. Gibson, D. Higgins (1994), CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment … Nucl. Acids Res. 22, 4673-4680
(~ 18,000 citations in the literature)
![Page 155: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/155.jpg)
Tools for multiple sequence alignment
Problems with traditional approach:
Results depend on gap penalty
Heuristic guide tree determines alignment,
yet the alignment is then used for phylogeny reconstruction
Algorithm produces global alignments.
![Page 156: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/156.jpg)
Tools for multiple sequence alignment
Problems with traditional approach:
But:
Many sequence families share only local similarity
E.g. sequences share one conserved motif
![Page 157: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/157.jpg)
Local sequence alignment
Find common motif in sequences; ignore the rest
EYENS
ERYENS
ERYAS
![Page 158: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/158.jpg)
Local sequence alignment
Find common motif in sequences; ignore the rest
E-YENS
ERYENS
ERYA-S
![Page 159: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/159.jpg)
Local sequence alignment
Find common motif in sequences; ignore the rest – Local alignment
E-YENS
ERYENS
ERYA-S
![Page 160: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/160.jpg)
Local sequence alignment
Important methods for local multiple alignment:
• PIMA
• MEME/MAST
Idea: expectation maximization.
![Page 161: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/161.jpg)
Local sequence alignment
Traditional alignment approaches:
Either global or local methods!
![Page 162: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/162.jpg)
New question: sequence families with multiple local similarities
Neither local nor global methods applicable
![Page 163: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/163.jpg)
New question: sequence families with multiple local similarities
Alignment possible if order conserved
![Page 164: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/164.jpg)
The DIALIGN approach
Morgenstern, Dress, Werner (1996), PNAS 93, 12098-12103
Combination of global and local methods
Assemble multiple alignment from gap-free local pair-wise alignments ("fragments")
![Page 165: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/165.jpg)
The DIALIGN approach
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
![Page 171: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/171.jpg)
The DIALIGN approach
atc------taatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
![Page 172: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/172.jpg)
The DIALIGN approach
atc------taatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaa--gagtatcacccctgaattgaataa
![Page 173: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/173.jpg)
The DIALIGN approach
atc------taatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaa--gagtatcacc----------cctgaattgaataa
![Page 174: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/174.jpg)
The DIALIGN approach
atc------taatagttaaactcccccgtgc-ttag
cagtgcgtgtattactaac----------gg-ttcaatcgcg
caaa--gagtatcacc----------cctgaattgaataa
![Page 175: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/175.jpg)
The DIALIGN approach
atc------taatagttaaactcccccgtgc-ttag
cagtgcgtgtattactaac----------gg-ttcaatcgcg
caaa--gagtatcacc----------cctgaattgaataa
Consistency!
![Page 176: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/176.jpg)
The DIALIGN approach
atc------TAATAGTTAaactccccCGTGC-TTag
cagtgcGTGTATTACTAAc----------GG-TTCAATcgcg
caaa--GAGTATCAcc----------CCTGaaTTGAATaa
![Page 177: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/177.jpg)
The DIALIGN approach
Score of an alignment:
Define score of fragment f:
l(f) = length of f
s(f) = sum of matches (similarity values)
P(f) = probability to find a fragment with length l(f) and at least s(f) matches in random sequences that have the same length as the input sequences.
Score w(f) = -ln P(f)
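The slides do not say how P(f) is computed, so the following is only a rough sketch under stated assumptions: DNA sequences with a match probability of 0.25 per position, s(f) counted as the number of matches, a binomial tail for one fixed fragment position, and a crude union bound over all len1 * len2 possible start positions (capped at 1). All function and parameter names are hypothetical.

```python
from math import comb, log

def binom_tail(length, matches, p=0.25):
    """P(at least `matches` matches in a random gap-free fragment of `length`)."""
    return sum(comb(length, k) * p**k * (1 - p)**(length - k)
               for k in range(matches, length + 1))

def fragment_weight(length, matches, len1, len2, p=0.25):
    """w(f) = -ln P(f), with P(f) approximated by a capped union bound."""
    prob = min(1.0, len1 * len2 * binom_tail(length, matches, p))
    return -log(prob) if prob < 1.0 else 0.0

# a perfect 10-mer is far less likely by chance than a 10-mer with 7 matches
print(fragment_weight(10, 10, 30, 30))
print(fragment_weight(10, 7, 30, 30))
```

Under this approximation, fragments that are likely to occur by chance get a weight of zero, so only statistically surprising similarities contribute to the alignment score.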
![Page 178: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/178.jpg)
The DIALIGN approach
Score of an alignment:
Define score of fragment f:
Define score of alignment as
sum of scores of involved fragments
No gap penalty!
![Page 179: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/179.jpg)
The DIALIGN approach
Score of an alignment:
Goal in fragment-based alignment approach: find
Consistent collection of fragments with maximum sum of weight scores
![Page 180: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/180.jpg)
The DIALIGN approach
atctaatagttaaaccccctcgtgcttagagatccaaac
cagtgcgtgtattactaacggttcaatcgcgcacatccgc
Pair-wise alignment:
![Page 181: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/181.jpg)
The DIALIGN approach
atctaatagttaaaccccctcgtgcttagagatccaaac
cagtgcgtgtattactaacggttcaatcgcgcacatccgc
Pair-wise alignment:
recursive algorithm finds optimal chain of
fragments.
![Page 182: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/182.jpg)
The DIALIGN approach
------atctaatagttaaaccccctcgtgcttag-------agatccaaac
cagtgcgtgtattactaac----------ggttcaatcgcgcacatccgc--
Pair-wise alignment:
recursive algorithm finds optimal chain of
fragments.
![Page 183: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/183.jpg)
The DIALIGN approach
------atctaatagttaaaccccctcgtgcttag-------agatccaaac
cagtgcgtgtattactaac----------ggttcaatcgcgcacatccgc--
Optimal pairwise alignment: chain of fragments with maximum sum of weights found by dynamic programming:
Standard fragment-chaining algorithm
Space-efficient algorithm
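The slides do not spell out the recursion, so the sketch below assumes a plain quadratic-time chaining: fragments are sorted by their end positions, and for each fragment the best-scoring chain ending in it is built from the best chain of any fragment that ends before it starts in both sequences. A fragment is a hypothetical tuple (start in sequence 1, start in sequence 2, length, weight) with length at least 1.

```python
def best_chain(fragments):
    """Return (total weight, chain) of the heaviest chain of compatible fragments."""
    if not fragments:
        return 0.0, []
    frags = sorted(fragments, key=lambda f: (f[0] + f[2], f[1] + f[2]))
    best = [f[3] for f in frags]      # best chain weight ending in fragment j
    prev = [None] * len(frags)        # predecessor in that chain
    for j, (s1, s2, _length, w) in enumerate(frags):
        for i in range(j):
            p1, p2, plen, _pw = frags[i]
            # fragment i must end before fragment j starts, in both sequences
            if p1 + plen <= s1 and p2 + plen <= s2 and best[i] + w > best[j]:
                best[j] = best[i] + w
                prev[j] = i
    # trace back from the best end point
    j = max(range(len(frags)), key=best.__getitem__)
    total, chain = best[j], []
    while j is not None:
        chain.append(frags[j])
        j = prev[j]
    return total, chain[::-1]

frags = [(0, 0, 3, 2.0), (2, 5, 3, 5.0), (4, 4, 3, 2.5), (8, 8, 2, 1.0)]
print(best_chain(frags))  # (6.0, [(2, 5, 3, 5.0), (8, 8, 2, 1.0)])
```

In the example, the chain of three fragments (total weight 5.5) loses against a chain of two heavier ones (6.0): the weight sum, not the number of fragments, is maximized.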
![Page 184: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/184.jpg)
The DIALIGN approach
Multiple alignment:
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
![Page 185: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/185.jpg)
The DIALIGN approach
Multiple alignment:
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
(1) Calculate all optimal pair-wise alignments
![Page 186: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/186.jpg)
The DIALIGN approach
Multiple alignment:
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
(1) Calculate all optimal pair-wise alignments
![Page 188: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/188.jpg)
The DIALIGN approach
Fragments from optimal pair-wise alignments might be inconsistent
![Page 189: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/189.jpg)
The DIALIGN approach
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
![Page 192: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/192.jpg)
The DIALIGN approach
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaa--gagtatcacccctgaattgaataa
![Page 193: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/193.jpg)
The DIALIGN approach
atc------taatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaa--gagtatcacccctgaattgaataa
![Page 194: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/194.jpg)
The DIALIGN approach
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
![Page 195: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/195.jpg)
The DIALIGN approach
Fragments from optimal pair-wise alignments might be inconsistent
1. Sort fragments according to scores
2. Include them one-by-one into growing multiple alignment – as long as they are consistent
(greedy algorithm, comparable to knapsack problem)
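The two steps can be sketched as follows. This is a brute-force illustration, not the program's actual implementation: consistency is re-checked from scratch for every candidate by merging aligned sites into classes (union-find) and testing that the order of positions within each sequence induces no cycle among these classes. A fragment is a hypothetical tuple (seq i, start in i, seq j, start in j, length, weight).

```python
from collections import deque

def consistent(fragments, seq_lengths):
    """True if all fragments fit together into one multiple alignment."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    # sites aligned by a fragment end up in the same class (alignment column)
    for i, p, j, q, length, _w in fragments:
        for k in range(length):
            parent[find((i, p + k))] = find((j, q + k))

    # each position must come strictly before its successor's class
    succ, indeg = {}, {}
    for i, n in enumerate(seq_lengths):
        for p in range(n):
            indeg.setdefault(find((i, p)), 0)
        for p in range(n - 1):
            a, b = find((i, p)), find((i, p + 1))
            if b not in succ.setdefault(a, set()):
                succ[a].add(b)
                indeg[b] += 1

    # Kahn's algorithm: all classes can be ordered iff there is no cycle
    queue = deque(c for c, deg in indeg.items() if deg == 0)
    ordered = 0
    while queue:
        c = queue.popleft()
        ordered += 1
        for b in succ.get(c, ()):
            indeg[b] -= 1
            if indeg[b] == 0:
                queue.append(b)
    return ordered == len(indeg)

def greedy_assemble(fragments, seq_lengths):
    """Accept fragments by decreasing weight as long as they stay consistent."""
    accepted = []
    for f in sorted(fragments, key=lambda f: f[5], reverse=True):
        if consistent(accepted + [f], seq_lengths):
            accepted.append(f)
    return accepted

f1 = (0, 0, 1, 3, 3, 5.0)   # seq 0 [0..2] with seq 1 [3..5]
f2 = (0, 3, 1, 0, 3, 3.0)   # seq 0 [3..5] with seq 1 [0..2]: crosses f1
f3 = (1, 3, 2, 0, 3, 2.0)   # seq 1 [3..5] with seq 2 [0..2]
print(greedy_assemble([f2, f3, f1], [6, 6, 6]))  # f1 and f3 are accepted
```

Rebuilding the check for every candidate is far too slow for real data; it only serves to make the notion of consistency concrete.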
![Page 196: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/196.jpg)
The DIALIGN approach
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
![Page 200: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/200.jpg)
The DIALIGN approach
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
Consistency problem
![Page 202: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/202.jpg)
The DIALIGN approach
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
Upper and lower bounds for alignable positions
![Page 203: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/203.jpg)
The DIALIGN approach
atc------taatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaa--gagtatcacccctgaattgaataa
Upper and lower bounds for alignable positions
![Page 204: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/204.jpg)
The DIALIGN approach
atc------taatagt taaactcccccgtgcttag
cagtgcgtgtattact aacggttcaatcgcg
caaa--gagtatcacccctgaattgaataa
Upper and lower bounds for alignable positions
![Page 205: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/205.jpg)
The DIALIGN approach
atc------taata-----gttaaactcccccgtgcttag
cagtgcgtgtatta-----ctaacggttcaatcgcg
caaa--gagtatcacccctgaattgaataa
Upper and lower bounds for alignable positions
![Page 206: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/206.jpg)
The DIALIGN approach
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
Upper and lower bounds for alignable positions
site x = [i,p] (sequence i, position p)
![Page 207: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/207.jpg)
The DIALIGN approach
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
Upper and lower bounds for alignable positions
Calculate lower bound bl(x,i) and upper bound bu(x,i) for each site x and each sequence i
![Page 208: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/208.jpg)
The DIALIGN approach
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
Upper and lower bounds for alignable positions
bl(x,i) and bu(x,i) updated for each new fragment in alignment
![Page 209: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/209.jpg)
The DIALIGN approach
Consistency bounds have to be updated for each new fragment that is included in the growing alignment
Efficient algorithm
(Abdeddaim and Morgenstern, 2002)
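The role of the bounds can be illustrated with a deliberately naive sketch. It is not the efficient algorithm cited above: it recomputes bl(x,j) and bu(x,j) from scratch by a graph search every time. The function names, the 1-based positions, the `links` dictionary, and the exact test "strictly between the bounds, or already aligned" are assumptions made for this illustration.

```python
from collections import deque

def closure(start, lengths, links, step):
    """Sites reachable from `start` by moving `step` (+1 or -1) along a
    sequence or by crossing an alignment link between two sites."""
    seen = {start}
    queue = deque([start])
    while queue:
        seq, pos = queue.popleft()
        neighbours = list(links.get((seq, pos), ()))
        if 1 <= pos + step <= lengths[seq]:
            neighbours.append((seq, pos + step))
        for site in neighbours:
            if site not in seen:
                seen.add(site)
                queue.append(site)
    return seen

def bounds(x, j, lengths, links):
    """Return (bl, bu) for site x = (i, p) with respect to sequence j."""
    before = [p for s, p in closure(x, lengths, links, -1) if s == j]
    after = [p for s, p in closure(x, lengths, links, +1) if s == j]
    # No constraint yet: bounds lie just outside the sequence.
    return max(before, default=0), min(after, default=lengths[j] + 1)

def alignable(x, y, lengths, links):
    """Assumed test: y lies strictly between the bounds of x,
    or x and y are already aligned (both bounds equal y's position)."""
    bl, bu = bounds(x, y[0], lengths, links)
    q = y[1]
    return bl < q < bu or bl == q == bu

def add_link(x, y, links):
    """Record that sites x and y are aligned (symmetric)."""
    links.setdefault(x, set()).add(y)
    links.setdefault(y, set()).add(x)

# Toy example: two sequences of length 5, one aligned pair [0,3] ~ [1,2].
lengths = {0: 5, 1: 5}
links = {}
add_link((0, 3), (1, 2), links)
print(bounds((0, 2), 1, lengths, links))
print(alignable((0, 2), (1, 1), lengths, links))  # left of the aligned pair in both sequences
print(alignable((0, 2), (1, 4), lengths, links))  # would cross the aligned pair
```

A greedy procedure would call `alignable` for every residue pair of a candidate fragment and call `add_link` only if all pairs pass; an efficient implementation updates the stored bounds instead of searching again.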
![Page 210: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/210.jpg)
The DIALIGN approach
Advantages of segment-based approach:
Program can produce global and local alignments!
Can align sequence families that cannot be aligned with standard methods
![Page 211: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/211.jpg)
The DIALIGN approach
DIALIGN is available
Online at BiBiServ (Bielefeld Bioinformatics Server)
Downloadable UNIX/LINUX executables at BiBiServ
Source code (email to BM)
![Page 212: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/212.jpg)
Program input
Program usage:
> dialign2-2 [options] <input_file>
<input_file> = multi-sequence file in FASTA-format
![Page 213: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/213.jpg)
Program output
DIALIGN 2.2.1
*************
Program code written by Burkhard Morgenstern and Said Abdeddaim
e-mail contact: [email protected]
Published research assisted by DIALIGN 2 should cite:
Burkhard Morgenstern (1999). DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment. Bioinformatics 15, 211 - 218.
For more information, please visit the DIALIGN home page at
http://bibiserv.techfak.uni-bielefeld.de/dialign/
program call: ./dialign2-2 -nt -anc s
Aligned sequences:    length:
==================    =======
1) dog_il4            300
2) bla                200
3) blu                200
Average seq. length: 233.3
Please note that only upper-case letters are considered to be aligned.
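As a small how-to, the following sketch prepares a multi-sequence FASTA input and builds the command line for the program. The record names `seq1` to `seq3`, the file name `example.fa`, and both helper functions are hypothetical. The `-nt` option is copied from the program call printed above; its meaning is not stated on the slide (presumably nucleotide input). The command is only assembled here, not executed.

```python
def to_fasta(records):
    """Format (name, sequence) pairs as multi-sequence FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)

def dialign_command(input_file, options=("-nt",), program="dialign2-2"):
    """Build the argument list: program, options, then the input file."""
    return [program, *options, str(input_file)]

# The three example sequences used throughout the DIALIGN slides.
records = [
    ("seq1", "atctaatagttaaactcccccgtgcttag"),
    ("seq2", "cagtgcgtgtattactaacggttcaatcgcg"),
    ("seq3", "caaagagtatcacccctgaattgaataa"),
]

fasta_text = to_fasta(records)
command = dialign_command("example.fa")
print(fasta_text, end="")
print(" ".join(command))
```

To actually run the program, one would write `fasta_text` to `example.fa` and pass `command` to `subprocess.run`, assuming a `dialign2-2` executable is installed and on the search path.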
![Page 214: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/214.jpg)
Program output
Alignment (DIALIGN format):
===========================

dog_il4      1 cagg------ ----GTTTGA atctgataca ttgc------ ----------
bla          1 ctga------ ---------- ---------- --------GC CAAGTGGGAA
blu          1 ttttgatatg agaaGTGTGA aacaagctat cctatattGC TAAGTGGCAG
               0000000000 0000000000 0000000000 0000000011 1111111111

dog_il4     25 ---------- --ATGGCACT GGGGTGAATG AGGCAGGCAG CAGAATGATC
bla         17 ggtgtgaata catgggtttc cagtaccttc tgaggtccag agtacc----
blu         51 ccctggcttt ctATGTGCAC AGAATGGGAG GAAAGTGCCT GCTAGTGAGC
               0000000000 0000000000 0000000000 0000000000 0000000000

dog_il4     63 GTACTGCAGC CCTGAGCTTC CACTGGCCCA TGTTGGTATC CTTGTATTTT
bla         63 ---------- ---------- ---TTTCCCA TGTGCTCCAT GGTGGAATGG
blu        101 CAGGGACTCA GAGAGAATGG AGTATAGGGG TCAGGGCat- ----------
               0000000000 0000000000 0009999999 9999999888 8888888888

dog_il4    113 TCCGCCCCTT CCCAGCACca gcattatcct ---GGGATTG GAGAAGGGGG
bla         90 ACCACTCCTT CTCAGCACaa caaagcccaa gaaGGTGTTG CGTTCTAGAC
blu        140 ---------- ---------- ---------- ---GGGGTGG CCTTAGGCTC
               8888888888 8888888800 0000000000 0007777777 7777777777
![Page 215: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/215.jpg)
The DIALIGN approach
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
![Page 216: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/216.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
The DIALIGN approach
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
![Page 217: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/217.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
The DIALIGN approach
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
![Page 218: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/218.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
The DIALIGN approach
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
![Page 219: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/219.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
The DIALIGN approach
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
![Page 220: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/220.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
The DIALIGN approach
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
![Page 221: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/221.jpg)
The DIALIGN approach
atc------taatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
![Page 222: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/222.jpg)
The DIALIGN approach
atc------taatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaa--gagtatcacccctgaattgaataa
![Page 223: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/223.jpg)
The DIALIGN approach
atc------taatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaa--gagtatcacc----------cctgaattgaataa
![Page 224: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/224.jpg)
The DIALIGN approach
atc------taatagttaaactcccccgtgcttag
cagtgcgtgtattactaac----------ggttcaatcgcg
caaa--gagtatcacc----------cctgaattgaataa
![Page 225: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/225.jpg)
The DIALIGN approach
atc------taatagttaaactcccccgtgc-ttag
cagtgcgtgtattactaac----------gg-ttcaatcgcg
caaa--gagtatcacc----------cctgaattgaataa
![Page 226: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/226.jpg)
The DIALIGN approach
atc------TAATAGTTAaactccccCGTGC-TTag------
cagtgcGTGTATTACTAAc----------GG-TTCAATcgcg
caaa--GAGTATCAcc----------CCTGaaTTGAATaa--
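In this notation upper-case letters mark aligned residues and lower-case letters mark residues that are not aligned. A short sketch of how such rows could be processed follows; the function names are made up for this example, and the "at least two upper-case residues" criterion for an aligned column is an assumption, not part of the DIALIGN output specification.

```python
# The three rows of the final alignment shown above.
rows = [
    "atc------TAATAGTTAaactccccCGTGC-TTag------",
    "cagtgcGTGTATTACTAAc----------GG-TTCAATcgcg",
    "caaa--GAGTATCAcc----------CCTGaaTTGAATaa--",
]

def degap(row):
    """Recover the original (lower-case) sequence from an alignment row."""
    return row.replace("-", "").lower()

def aligned_columns(rows):
    """0-based columns in which at least two rows carry an upper-case residue."""
    return [
        c
        for c, column in enumerate(zip(*rows))
        if sum(ch.isupper() for ch in column) >= 2
    ]

for row in rows:
    print(degap(row))
print(aligned_columns(rows))
```

Removing the gaps gives back the three input sequences, which is a quick sanity check that the alignment only inserted gaps and changed case.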
![Page 227: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/227.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
The DIALIGN approach
atc------taatagttaaactcccccgtgc-ttag
cagtgcgtgtattactaac----------gg-ttcaatcgcg
caaa--gagtatcacc----------cctgaattgaataa
![Page 228: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/228.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
Program output
Alignment (DIALIGN format): =========================== dog_il4 1 cagg------ ----GTTTGA atctgataca ttgc------ ---------- bla 1 ctga------ ---------- ---------- --------GC CAAGTGGGAA blu 1 ttttgatatg agaaGTGTGA aacaagctat cctatattGC TAAGTGGCAG 0000000000 0000000000 0000000000 0000000011 1111111111 dog_il4 25 ---------- --ATGGCACT GGGGTGAATG AGGCAGGCAG CAGAATGATC bla 17 ggtgtgaata catgggtttc cagtaccttc tgaggtccag agtacc---- blu 51 ccctggcttt ctATGTGCAC AGAATGGGAG GAAAGTGCCT GCTAGTGAGC 0000000000 0000000000 0000000000 0000000000 0000000000 dog_il4 63 GTACTGCAGC CCTGAGCTTC CACTGGCCCA TGTTGGTATC CTTGTATTTT bla 63 ---------- ---------- ---TTTCCCA TGTGCTCCAT GGTGGAATGG blu 101 CAGGGACTCA GAGAGAATGG AGTATAGGGG TCAGGGCat- ---------- 0000000000 0000000000 0009999999 9999999888 8888888888 dog_il4 113 TCCGCCCCTT CCCAGCACca gcattatcct ---GGGATTG GAGAAGGGGG bla 90 ACCACTCCTT CTCAGCACaa caaagcccaa gaaGGTGTTG CGTTCTAGAC blu 140 ---------- ---------- ---------- ---GGGGTGG CCTTAGGCTC 8888888888 8888888800 0000000000 0007777777 7777777777
![Page 229: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/229.jpg)
T-COFFEE
C. Notredame, D. Higgins, J. Heringa (2000), T-Coffee: A novel algorithm for multiple sequence alignment, J. Mol. Biol.
![Page 230: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/230.jpg)
T-COFFEE
Problem with “progressive” approaches:
Strictly global alignments
Use only pair-wise comparison
![Page 231: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/231.jpg)
T-COFFEE
Idea: Start with local and global pair-wise alignments (“primary library” of alignments)
Construct “secondary library” of residue pairs that are indirectly aligned through the primary library.
Re-score residue pairs
Construct final alignment with “progressive” method
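The re-scoring step can be sketched as follows. The weighting rule used here (direct weight plus, for every intermediate residue z, the smaller of the two weights on the path x-z-y) is the triplet extension as it is usually described for T-Coffee; treat the data layout, the function names, and the toy weights as assumptions for illustration.

```python
from collections import defaultdict

def pair(x, y):
    """Unordered residue pair; a residue is a (sequence, position) tuple."""
    return frozenset((x, y))

def extend_library(primary):
    """Add indirect support to each residue pair.

    primary maps pair(x, y) -> weight from the pair-wise alignments.
    For every residue z aligned to both x and y, the pair (x, y)
    gains min(weight(x, z), weight(z, y)).
    """
    neighbours = defaultdict(dict)
    for key, weight in primary.items():
        x, y = tuple(key)
        neighbours[x][y] = weight
        neighbours[y][x] = weight

    extended = dict(primary)
    for z, partners in neighbours.items():
        residues = list(partners)
        for a in range(len(residues)):
            for b in range(a + 1, len(residues)):
                x, y = residues[a], residues[b]
                if x[0] == y[0]:
                    continue  # two residues of the same sequence are never paired
                key = pair(x, y)
                extended[key] = extended.get(key, 0) + min(partners[x], partners[y])
    return extended

# Toy primary library over sequences A, B and C.
primary = {
    pair(("A", 1), ("B", 1)): 88,
    pair(("A", 1), ("C", 1)): 77,
    pair(("C", 1), ("B", 1)): 100,
    pair(("A", 2), ("C", 2)): 50,
    pair(("C", 2), ("B", 3)): 30,
}
extended = extend_library(primary)
print(extended[pair(("A", 1), ("B", 1))])  # direct 88 plus min(77, 100) via C
print(extended[pair(("A", 2), ("B", 3))])  # only indirect support via C
```

Pairs supported by many intermediate sequences end up with high weights, which is what makes the subsequent progressive alignment less dependent on any single pair-wise alignment.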
![Page 232: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/232.jpg)
T-COFFEE
Advantage:
Combination of local and global approaches
Less sensitive to mis-alignments in the progressive procedure
![Page 233: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/233.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
T-COFFEE
![Page 234: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/234.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
![Page 235: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/235.jpg)
T-COFFEE
T-COFFEE and DIALIGN:
Less sensitive to spurious pairwise similarities
Can handle local homologies better than CLUSTAL
![Page 236: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/236.jpg)
Most multi-alignment approaches automated, i.e. based on algorithmic rules. Two components:
Objective function: assess alignment quality
Optimization algorithm: find optimal or near-optimal alignment
![Page 237: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/237.jpg)
Fully automated alignment programs necessary
if no expert knowledge available
if large amounts of data to be analyzed
But:
Often no biologically reasonable results
Often additional information about homologies etc. available
![Page 238: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/238.jpg)
Idea for improved alignment
Use expert knowledge to influence alignment procedure
DIALIGN with user-defined anchor points
![Page 239: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/239.jpg)
Alignment of large genomic sequences
Alignment of large genomic sequences to identify functional elements (phylogenetic footprinting)
Göttgens et al., 2000, 2001, 2002, …
Pollard et al., 2004
DIALIGN, MGA, PipMaker, LAGAN, AVID, Mummer, WABA, …
![Page 240: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/240.jpg)
Alignment of large genomic sequences
Gene-regulatory sites identified by multiple sequence alignment (phylogenetic footprinting)
![Page 241: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/241.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
Alignment of large genomic sequences
![Page 242: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/242.jpg)
DIALIGN alignment of human and murine genomic sequences
![Page 243: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/243.jpg)
DIALIGN alignment of tomato and Arabidopsis thaliana genomic sequences
![Page 244: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/244.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
Alignment of large genomic sequences
DIALIGN used by tracker for phylogenetic footprinting (Prohaska et al., 2004)
![Page 245: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/245.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
Alignment of large genomic sequences
DIALIGN used by tracker for phylogenetic footprinting (Prohaska et al., 2004)
Alignment of Hox gene cluster:
![Page 246: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/246.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
Alignment of large genomic sequences
DIALIGN used by tracker for phylogenetic footprinting (Prohaska et al., 2004)
Alignment of Hox gene cluster:
DIALIGN able to identify small regulatory elements, but
![Page 247: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/247.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
Alignment of large genomic sequences
DIALIGN used by tracker for phylogenetic footprinting (Prohaska et al., 2004)
Alignment of Hox gene cluster:
DIALIGN able to identify small regulatory elements, but
Entire genes totally mis-aligned
![Page 248: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/248.jpg)
Alignment of large genomic sequences
DIALIGN used by tracker for phylogenetic footprinting (Prohaska et al., 2004)
Alignment of Hox gene cluster:
DIALIGN able to identify small regulatory elements, but
Entire genes totally mis-aligned
Reason for mis-alignment: duplications!
![Page 249: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/249.jpg)
Alignment of large genomic sequences
The Hox gene cluster:
4 Hox gene clusters in pufferfish. 14 genes, different genes in different clusters!
![Page 250: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/250.jpg)
Alignment of large genomic sequences
The Hox gene cluster:
Complete mis-alignment of entire genes!
![Page 251: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/251.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
Alignment of sequence duplications
S1
S2
![Page 252: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/252.jpg)
Alignment of sequence duplications
S1
S2
Conserved motifs; no similarity outside motifs
![Page 253: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/253.jpg)
Alignment of sequence duplications
S1
S2
Duplication in two sequences
![Page 254: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/254.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
Alignment of sequence duplications
S1
S2
Duplication in two sequences
![Page 255: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/255.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
Alignment of sequence duplications
S1
S2
Duplication in two sequences
![Page 256: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/256.jpg)
Alignment of sequence duplications
S1
S2
Mis-alignment would have lower score!
![Page 257: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/257.jpg)
Alignment of sequence duplications
S1
S2
Duplication in one sequence
![Page 258: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/258.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
Alignment of sequence duplications
S1
S2
Duplication in one sequence
![Page 259: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/259.jpg)
Alignment of sequence duplications
S1
S2
Duplication in one sequence
Possible mis-alignment
![Page 260: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/260.jpg)
Alignment of sequence duplications
S1
S2
Duplication in one sequence
S3
![Page 261: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/261.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
Alignment of sequence duplications
S1
S2
Duplication in one sequence
S3
![Page 262: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/262.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
Alignment of sequence duplications
S1
S2
Duplication in one sequence
S3
![Page 263: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/263.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
Alignment of sequence duplications
S1
S2
Duplication in one sequence
S3
![Page 264: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/264.jpg)
Alignment of sequence duplications
S1
S2
Consistency problem
S3
![Page 265: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/265.jpg)
Alignment of sequence duplications
S1
S2
More plausible alignment – and higher score:
S3
![Page 266: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/266.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
Alignment of sequence duplications
S1
S2
Consistency problem
S3
![Page 267: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/267.jpg)
Alignment of sequence duplications
S1
S2
Alternative alignment; probably biologically wrong; lower numerical score!
S3
![Page 268: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/268.jpg)
Anchored sequence alignment
Biologically meaningful alignment not possible by automated approaches.
Idea: use expert knowledge to guide alignment procedure
User defines a set of anchor points that are to be “respected” by the alignment procedure
![Page 269: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/269.jpg)
Anchored sequence alignment
NLFVALYDFVASGDNTLSITKGEKLRVLGYNHN
IIHREDKGVIYALWDYEPQNDDELPMKEGDCMT
![Page 270: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/270.jpg)
04/21/23 Burkhard Morgenstern, Tunis 2007
Anchored sequence alignment
NLFVALYDFVASGDNTLSITKGEKLRVLGYNHN
IIHREDKGVIYALWDYEPQNDDELPMKEGDCMT
![Page 271: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/271.jpg)
Anchored sequence alignment
NLFVALYDFVASGDNTLSITKGEKLRVLGYNHN
IIHREDKGVIYALWDYEPQNDDELPMKEGDCMT
Use known homology as anchor point
![Page 272: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/272.jpg)
Anchored sequence alignment
NLFV ALYDFVASGDNTLSITKGEKLRVLGYNHN
IIHREDKGVIYALWDYEPQNDDELPMKEGDCMT
Use known homology as anchor point
Anchor point = anchored fragment (gap-free pair of segments)
Remainder of sequences aligned automatically
![Page 273: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/273.jpg)
Anchored sequence alignment
NLFV ALYDFVASGDNTLSITKGEKLRVLGYNHN
IIHREDKGVIYALWDYEPQNDDELPMKEGDCMT
Alignment of anchored positions a and b is not enforced (a and b may remain unaligned), but:
a is the only residue that can be aligned to b
Residues left of a can only be aligned with residues left of b (and likewise to the right)
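The anchor rule for a single anchored pair (a, b) in a pairwise alignment can be sketched as a small predicate. This is only an illustration of the constraint; the function name `allowed` and the use of integer positions are assumptions, not part of the original software.

```python
def allowed(x, y, a, b):
    """May position x (sequence 1) be aligned with position y (sequence 2),
    given the anchored pair (a, b)? Hypothetical helper for illustration."""
    if x == a or y == b:
        # a can only be aligned to b, and b only to a.
        return x == a and y == b
    # Otherwise both positions must lie on the same side of the anchor.
    return (x < a) == (y < b)


print(allowed(3, 5, 4, 8))  # both left of the anchor
print(allowed(3, 9, 4, 8))  # would cross the anchor
```

For an anchored fragment (a gap-free pair of segments), the same test would be applied to every position pair of the fragment.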
![Page 274: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/274.jpg)
Anchored sequence alignment
-------NLF VALYDFVASG DNTLSITKGE klrvlgynhn
iihredkGVI YALWDYEPQN DDELPMKEGD cmt-------
Anchored alignment
![Page 275: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/275.jpg)
Anchored sequence alignment
NLFVALYDFVASGDNTLSITKGEKLRVLGYNHN
IIHREDKGVIYALWDYEPQNDDELPMKEGDCMT
GYQYRALYDYKKEREEDIDLHLGDILTVNKGSLVALGFS
Anchor points in multiple alignment
![Page 276: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/276.jpg)
Anchored sequence alignment
NLFV ALYDFVASGDNTLSITKGEKLRVLGYNHN
IIHREDKGVIYALWDYEPQND DELPMKEGDCMT
GYQYRALYDYKKEREEDIDLHLGDILTVNKGSLVALGFS
Anchor points in multiple alignment
![Page 277: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/277.jpg)
Anchored sequence alignment
-------NLF V-ALYDFVAS GD-------- NTLSITKGEk lrvLGYNhn
iihredkGVI Y-ALWDYEPQ ND-------- DELPMKEGDC MT-------
-------GYQ YrALYDYKKE REedidlhlg DILTVNKGSL VA-LGFS--
Anchored multiple alignment
![Page 278: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/278.jpg)
Algorithmic questions
Goal:
Find optimal alignment (= consistent set of fragments) under constraints given by user-specified anchor points!
![Page 279: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/279.jpg)
Additional input file with anchor points:
1 3 215 231 5 4.5
2 3 34 78 23 1.23
1 4 317 402 8 8.5
Algorithmic questions
![Page 280: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/280.jpg)
Algorithmic questions
NLFVALYDFVASGDNTLSITKGEKLRVLGYNHN
IIHREDKGVIYALWDYEPQNDDELPMKEGDCMT
GYQYRALYDYKKEREEDIDLHLGDILTVNKGSLVALGFS
![Page 281: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/281.jpg)
Additional input file with anchor points:
1 3 215 231 5 4.5
2 3 34 78 23 1.23
1 4 317 402 8 8.5
Algorithmic questions
![Page 282: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/282.jpg)
Additional input file with anchor points:
1 3 215 231 5 4.5
2 3 34 78 23 1.23
1 4 317 402 8 8.5
Sequences
Algorithmic questions
![Page 283: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/283.jpg)
Additional input file with anchor points:
1 3 215 231 5 4.5
2 3 34 78 23 1.23
1 4 317 402 8 8.5
Sequences start positions
Algorithmic questions
![Page 284: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/284.jpg)
Additional input file with anchor points:
1 3 215 231 5 4.5
2 3 34 78 23 1.23
1 4 317 402 8 8.5
Sequences start positions length
Algorithmic questions
![Page 285: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/285.jpg)
Algorithmic questions
Additional input file with anchor points:
1 3 215 231 5 4.5
2 3 34 78 23 1.23
1 4 317 402 8 8.5
Columns: sequences (1-2), start positions (3-4), length (5), score (6)
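A minimal sketch of reading such an anchor file, assuming the six whitespace-separated columns shown on the slide (two sequence numbers, two start positions, length, score). The names `AnchorPoint` and `parse_anchors` are hypothetical and not taken from any existing tool.

```python
from typing import NamedTuple


class AnchorPoint(NamedTuple):
    seq1: int      # number of the first sequence
    seq2: int      # number of the second sequence
    start1: int    # start position in the first sequence
    start2: int    # start position in the second sequence
    length: int    # length of the gap-free anchored fragment
    score: float   # user-defined priority


def parse_anchors(text):
    """Parse one anchor point per non-empty line."""
    anchors = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if len(fields) != 6:
            raise ValueError(f"expected 6 columns, got {len(fields)}: {line!r}")
        s1, s2, b1, b2, length = map(int, fields[:5])
        anchors.append(AnchorPoint(s1, s2, b1, b2, length, float(fields[5])))
    return anchors


example = """1 3 215 231 5 4.5
2 3 34 78 23 1.23
1 4 317 402 8 8.5"""
anchors = parse_anchors(example)
print(anchors[0])
```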
![Page 286: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/286.jpg)
Algorithmic questions
Requirements:
Anchor points need to be consistent! If necessary, select a consistent subset from the user-specified anchor points
![Page 287: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/287.jpg)
Algorithmic questions
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
![Page 288: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/288.jpg)
Algorithmic questions
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
![Page 289: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/289.jpg)
Algorithmic questions
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
Inconsistent anchor points!
![Page 290: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/290.jpg)
Algorithmic questions
atctaat---agttaaactcccccgtgcttag
cagtgcgtgtattac-taacggttcaatcgcg
caaagagtatcacccctgaattgaataa
Inconsistent anchor points!
![Page 291: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/291.jpg)
Algorithmic questions
Requirements:
Anchor points need to be consistent! If necessary, select a consistent subset from the user-specified anchor points
Find alignment under constraints given by anchor points!
![Page 292: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/292.jpg)
Algorithmic questions
Use data structures from multiple alignment
![Page 293: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/293.jpg)
Algorithmic questions
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
![Page 294: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/294.jpg)
Algorithmic questions
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
Greedy procedure for multiple alignment
![Page 295: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/295.jpg)
Algorithmic questions
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
Greedy procedure for multiple alignment
![Page 296: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/296.jpg)
Algorithmic questions
atctaatagttaaactcccccgtgcttag
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa
Question: which positions are still alignable?
![Page 297: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/297.jpg)
Algorithmic questions
atctaatagttaaactcccccgtgcttag Si
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa x
For each position x and each sequence Si there exist an upper bound ub(x,i) and a lower bound lb(x,i) on the residues y in Si that are alignable with x
![Page 298: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/298.jpg)
Algorithmic questions
atctaatagttaaactcccccgtgcttag Si
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa x
For each position x and each sequence Si there exist an upper bound ub(x,i) and a lower bound lb(x,i) on the residues y in Si that are alignable with x
![Page 299: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/299.jpg)
Algorithmic questions
atctaatagttaaactcccccgtgcttag Si
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa x
ub(x,i) and lb(x,i) updated during greedy procedure
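One way to maintain such bounds is sketched below for the simplest case, where single residues (rather than whole fragments) are aligned one at a time. This is an illustrative reconstruction under stated assumptions, not the data structure used in the actual alignment software: the class name `Bounds`, the 1-based positions, and the sentinel values 0 and length+1 for "no bound yet" are all choices made here.

```python
class Bounds:
    """lb/ub bookkeeping for single-residue alignments; positions are 1-based."""

    def __init__(self, lengths):
        n = len(lengths)
        self.lb, self.ub = {}, {}
        for s, length in enumerate(lengths):
            for p in range(1, length + 1):
                # No constraints yet: 0 and length+1 act as "no bound" sentinels.
                self.lb[(s, p)] = [p if i == s else 0 for i in range(n)]
                self.ub[(s, p)] = [p if i == s else lengths[i] + 1 for i in range(n)]

    def alignable(self, x, y):
        i, q = y
        lo, hi = self.lb[x][i], self.ub[x][i]
        # Either x is already aligned with y, or y lies strictly between the bounds.
        return lo == hi == q or lo < q < hi

    def align(self, x, y):
        """Align sites x and y if consistent, then update all bounds."""
        if not self.alignable(x, y):
            return False
        (sx, px), (sy, py) = x, y
        new_lb = [max(a, b) for a, b in zip(self.lb[x], self.lb[y])]
        new_ub = [min(a, b) for a, b in zip(self.ub[x], self.ub[y])]
        for z in self.lb:
            before = self.ub[z][sx] <= px or self.ub[z][sy] <= py  # z precedes x or y
            after = self.lb[z][sx] >= px or self.lb[z][sy] >= py   # z follows x or y
            if before:
                self.ub[z] = [min(a, b) for a, b in zip(self.ub[z], new_ub)]
            if after:
                self.lb[z] = [max(a, b) for a, b in zip(self.lb[z], new_lb)]
        return True


b = Bounds([5, 5, 5])        # three sequences of length 5
b.align((0, 3), (1, 2))      # site = (sequence index, position)
print(b.alignable((0, 2), (1, 4)))  # would cross the accepted pair
```

Each update touches every site, which is affordable for a sketch but would be too slow for long sequences.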
![Page 300: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/300.jpg)
Algorithmic questions
atctaatagttaaactcccccgtgcttag Si
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa x
Initial values of lb(x,i), ub(x,i)
![Page 301: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/301.jpg)
Algorithmic questions
atctaatagttaaactcccccgtgcttag Si
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa x
ub(x,i) and lb(x,i) updated during greedy procedure
![Page 302: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/302.jpg)
Algorithmic questions
atctaatagttaaactcccccgtgcttag Si
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa x
ub(x,i) and lb(x,i) updated during greedy procedure
![Page 303: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/303.jpg)
Algorithmic questions
Anchor points treated like fragments in greedy algorithm:
Sorted according to user-defined scores
Accepted if consistent with previously accepted anchors
ub(x,i) and lb(x,i) updated during greedy procedure
Resulting values of ub(x,i) and lb(x,i) used as initial values for alignment procedure
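The greedy selection of anchor points can be sketched as follows. To keep the example short it is restricted to anchors between one pair of sequences, where two gap-free fragments are consistent if they lie on the same diagonal or if one comes entirely before the other in both sequences. The multiple-sequence case needs the lb/ub bounds instead. The names `Anchor`, `compatible` and `select_consistent` and the sample values are hypothetical.

```python
from typing import NamedTuple


class Anchor(NamedTuple):
    start1: int   # start in the first sequence
    start2: int   # start in the second sequence
    length: int
    score: float


def compatible(f, g):
    """Pairwise consistency of two gap-free fragments (simplified, two sequences)."""
    if f.start1 - f.start2 == g.start1 - g.start2:
        return True  # same diagonal: both imply the same residue pairs
    f_first = f.start1 + f.length <= g.start1 and f.start2 + f.length <= g.start2
    g_first = g.start1 + g.length <= f.start1 and g.start2 + g.length <= f.start2
    return f_first or g_first


def select_consistent(anchors):
    """Greedy: highest score first, accept if consistent with all accepted anchors."""
    accepted = []
    for a in sorted(anchors, key=lambda a: a.score, reverse=True):
        if all(compatible(a, b) for b in accepted):
            accepted.append(a)
    return accepted


candidates = [Anchor(10, 12, 5, 8.5), Anchor(20, 5, 4, 1.2), Anchor(30, 40, 6, 4.5)]
chosen = select_consistent(candidates)
print(chosen)  # the low-scoring anchor crosses the first one and is dropped
```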
![Page 304: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/304.jpg)
Algorithmic questions
atctaatagttaaactcccccgtgcttag Si
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa x
Initial values of lb(x,i), ub(x,i)
![Page 305: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/305.jpg)
Algorithmic questions
atctaatagttaaactcccccgtgcttag Si
cagtgcgtgtattactaacggttcaatcgcg
caaagagtatcacccctgaattgaataa x
Initial values of lb(x,i), ub(x,i) calculated using anchor points
![Page 306: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/306.jpg)
Algorithmic questions
Anchor points can be ranked to prioritize them, e.g.
anchor points from verified homologies: higher priority
automatically created anchor points (using CHAOS, BLAST, ...): lower priority
![Page 307: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/307.jpg)
Application: Hox gene cluster
![Page 308: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/308.jpg)
Application: Hox gene cluster
Use gene boundaries as anchor points
![Page 309: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/309.jpg)
Application: Hox gene cluster
Use gene boundaries as anchor points
+ CHAOS / BLAST hits
![Page 310: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/310.jpg)
Application: Hox gene cluster
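| | no anchoring | anchoring |
|---|---|---|
| Aligned columns, 2 seq | 2958 | 3674 |
| Aligned columns, 3 seq | 668 | 1091 |
| Aligned columns, 4 seq | 244 | 195 |
| Score | 1166 | 1007 |
| CPU time | 4:22 | 0:19 |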
![Page 311: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/311.jpg)
Application: Hox gene cluster
Example:
Teleost Hox gene cluster:
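Score of anchored alignment 15% higher than score of non-anchored alignment!
Conclusion: anchoring only restricts the search space, so the best non-anchored alignment cannot score lower than the anchored one. The greedy optimization algorithm does a bad job here!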
![Page 312: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/312.jpg)
Application: Improvement of Alignment programs
Two possible reasons for mis-alignments:
Wrong objective function: biologically correct alignment gets bad numerical score
Bad optimization algorithm: biologically correct alignment gets best numerical score, but algorithm fails to find this alignment
![Page 313: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/313.jpg)
Application: Improvement of Alignment programs
Two possible reasons for mis-alignments:
Anchored alignments can help to decide which of the two reasons applies
![Page 314: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/314.jpg)
Application: RNA alignment
![Page 315: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/315.jpg)
Application: RNA alignment
aa----CCCC AGC---GUAa gucgcuaucc a
cacucuCCCA AGC---GGAG Aac------- -
ccg----CCA AaagauGGCG Acuuga---- -
non-anchored alignment
![Page 316: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/316.jpg)
Application: RNA alignment
aa----CCCC AGC---GUAa gucgcuaucc a
cacucuCCCA AGC---GGAG Aac------- -
ccg----CCA AaagauGGCG Acuuga---- -
structural motif mis-aligned
![Page 317: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/317.jpg)
Application: RNA alignment
aaCCCCAGCG UAAGUCGCUA UCca--
--CACUCUCC CAAGCGGAGA AC----
----CCGCCA AAAGAUGGCG ACuuga
3 conserved nucleotides as anchor points
![Page 318: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/318.jpg)
WWW interface at GOBICS (Göttingen Bioinformatics Compute Server)
![Page 319: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/319.jpg)
WWW interface at GOBICS (Göttingen Bioinformatics Compute Server)
![Page 320: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/320.jpg)
Gene predictions for eukaryotes
![Page 321: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/321.jpg)
Gene predictions for eukaryotes
Goal: find location and structure of protein-coding genes in eukaryotic genome sequences.
![Page 322: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/322.jpg)
Gene predictions for eukaryotes
attgccagtacgtagctagctacacgtatgctattacggatctgtagcttagcgtatctgtatgctgttagctgtacgtacgtatttttctagagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctagagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctaggggagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctaggggagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctaggggagcttcgtagtctatggctagtcgtagtcgtagtcgttagcttagtcgtgtagtcttgatctacgtacgtatttttctagagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctagagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctaggggagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctaggggagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgtacgtacgtatttttctagagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctagagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctaggggagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctaggggagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatggtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctaggggagcttcgtagtctatggctag
![Page 323: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/323.jpg)
Gene predictions for eukaryotes
attgccagtacgtagctagctacacgtatgctattacggatctgtagcttagcgtatctgtatgctgttagctgtacgtacgtatttttctagagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctagagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctaggggagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctaggggagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctaggggagcttcgtagtctatggctagtcgtagtcgtagtcgttagcttagtcgtgtagtcttgatctacgtacgtatttttctagagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctagagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctaggggagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctaggggagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgtacgtacgtatttttctagagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctagagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctaggggagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctaggggagcttcgtagtctatggctagtcgtagtcgtagtcgttagcatctgtatggtcgtagtcgttagcatctgtatgctgttagctgtacgtacgtatttttctaggggagcttcgtagtctatggctag
![Page 324: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/324.jpg)
Gene predictions for eukaryotes
![Page 325: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/325.jpg)
Gene predictions for eukaryotes
Three different approaches to computational gene-finding:
Intrinsic: use statistical information about known genes (Hidden Markov Models)
Extrinsic: compare genomic sequence with known proteins / genes
Cross-species sequence comparison: search for similarities among genomes
![Page 326: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/326.jpg)
Hidden-Markov-Models (HMM) for gene prediction
Generative probabilistic model for a sequence of observations ("symbols").
Finite set of states
States can emit symbols
Transitions between states possible
Sequence generated by path between states
![Page 327: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/327.jpg)
Hidden-Markov-Models (HMM) for gene prediction
Example: The occasionally dishonest casino.
3 5 6 6 6 4 6 5 1 6 5 1 2
F F U U U U U F F F F F F
Possible states:
fair (F); unfair (U); begin (B); end (E)
![Page 328: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/328.jpg)
Hidden-Markov-Models (HMM) for gene prediction
Assumptions:
Emission probabilities known; depend only on current state
Transition probabilities known; depend only on current state
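Under these assumptions the casino model can be run as a generator. The sketch below is illustrative only: the slides do not give numerical parameters, so the emission and transition values are assumed, the begin and end states are replaced by a uniformly chosen start state and a fixed length, and the names `EMIT`, `TRANS` and `sample` are made up here.

```python
import random

# Assumed parameters (not from the slides): fair die uniform, unfair die favours 6.
EMIT = {
    "F": {face: 1 / 6 for face in range(1, 7)},
    "U": {**{face: 0.1 for face in range(1, 6)}, 6: 0.5},
}
TRANS = {"F": {"F": 0.95, "U": 0.05}, "U": {"F": 0.1, "U": 0.9}}


def sample(n, seed=0):
    """Generate n rolls together with the hidden state path that produced them."""
    rng = random.Random(seed)
    state = rng.choice("FU")
    rolls, states = [], []
    for _ in range(n):
        faces, weights = zip(*EMIT[state].items())
        rolls.append(rng.choices(faces, weights=weights)[0])  # emit a symbol
        states.append(state)
        targets, probs = zip(*TRANS[state].items())
        state = rng.choices(targets, weights=probs)[0]        # move to next state
    return rolls, "".join(states)


rolls, states = sample(13)
print(rolls)
print(states)
```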
![Page 329: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/329.jpg)
Hidden-Markov-Models (HMM) for gene prediction
F
U
E B
![Page 330: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/330.jpg)
Hidden-Markov-Models (HMM) for gene prediction
3 5 6 6 6 4 6 5 1 6 5 1 2 s
B F F U U U U U F F F F F F E φ
For sequence s and parse φ:
P(φ): probability of φ
P(φ,s): joint probability of φ and s = P(φ) * P(s|φ)
P(φ|s): a-posteriori probability of φ
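As a worked illustration, the joint probability P(φ,s) of the rolls and parse from the casino example can be computed as a product of transition and emission probabilities. The parameter values are assumptions (the slides do not state them), the begin state is modelled by a uniform start distribution, and the end state is ignored. Logarithms are used to avoid numerical underflow.

```python
import math

# Assumed parameters (not from the slides).
START = {"F": 0.5, "U": 0.5}
EMIT = {
    "F": {face: 1 / 6 for face in range(1, 7)},
    "U": {**{face: 0.1 for face in range(1, 6)}, 6: 0.5},
}
TRANS = {"F": {"F": 0.95, "U": 0.05}, "U": {"F": 0.1, "U": 0.9}}


def log_joint(rolls, path):
    """log P(path, rolls) = log P(path) + log P(rolls | path)."""
    logp = math.log(START[path[0]]) + math.log(EMIT[path[0]][rolls[0]])
    for prev, cur, roll in zip(path, path[1:], rolls[1:]):
        logp += math.log(TRANS[prev][cur]) + math.log(EMIT[cur][roll])
    return logp


s = [3, 5, 6, 6, 6, 4, 6, 5, 1, 6, 5, 1, 2]
phi = "FFUUUUUFFFFFF"
print(log_joint(s, phi))
```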
![Page 331: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/331.jpg)
Hidden-Markov-Models (HMM) for gene prediction
3 5 6 6 6 4 6 5 1 6 5 1 2
B F F U U U U U F F F F F F E
Goal: find path φ with maximum a-posteriori probability P(φ|s)
Idea: since P(φ|s) = P(φ,s) / P(s) and P(s) does not depend on φ, find the path that maximizes the joint probability P(φ,s) by dynamic programming (Viterbi algorithm)
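A compact sketch of the Viterbi recursion for the two-state casino model is given below. It is not the implementation used in any gene finder: the parameter values are assumed, the begin state is replaced by a uniform start distribution, and the end state is omitted.

```python
import math

STATES = ("F", "U")
# Assumed parameters (not from the slides).
START = {"F": 0.5, "U": 0.5}
EMIT = {
    "F": {face: 1 / 6 for face in range(1, 7)},
    "U": {**{face: 0.1 for face in range(1, 6)}, 6: 0.5},
}
TRANS = {"F": {"F": 0.95, "U": 0.05}, "U": {"F": 0.1, "U": 0.9}}


def viterbi(rolls):
    """Return the most probable state path and its log joint probability."""
    # v[k]: best log joint probability of any path that ends in state k.
    v = {k: math.log(START[k]) + math.log(EMIT[k][rolls[0]]) for k in STATES}
    back = []  # back[t][k]: best predecessor of state k at step t+1
    for roll in rolls[1:]:
        ptr, nxt = {}, {}
        for k in STATES:
            best = max(STATES, key=lambda j: v[j] + math.log(TRANS[j][k]))
            ptr[k] = best
            nxt[k] = v[best] + math.log(TRANS[best][k]) + math.log(EMIT[k][roll])
        back.append(ptr)
        v = nxt
    # Trace back from the best final state.
    last = max(STATES, key=lambda k: v[k])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return "".join(reversed(path)), v[last]


path, logp = viterbi([3, 5, 6, 6, 6, 4, 6, 5, 1, 6, 5, 1, 2])
print(path, logp)
```

The running time is proportional to the sequence length times the square of the number of states.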
![Page 332: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/332.jpg)
Hidden-Markov-Models (HMM) for gene prediction
Application to gene prediction:
A T A A T G C C T A G T C   s (DNA)
Z Z Z E E E E E E I I I I   φ (parse)
Introns, exons etc. modeled as states in GHMM ("generalized HMM")
Given sequence s, find parse that maximizes P(φ|s)
(C. Burge and S. Karlin, 1997)
![Page 333: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/333.jpg)
Hidden-Markov-Models (HMM) for gene prediction
Application to gene prediction:
A T A A T G C C T A G T C   s (DNA)
Z Z Z E E E E E E I I I I   φ (parse)
Introns, exons etc. modeled as states in GHMM ("generalized HMM")
Given sequence s, find parse that maximizes P(φ|s)
(C. Burge and S. Karlin, 1997)
![Page 334: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/334.jpg)
AUGUSTUS
Basic model for GHMM-based intrinsic gene finding comparable to GenScan (M. Stanke)
![Page 335: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/335.jpg)
AUGUSTUS
![Page 336: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/336.jpg)
AUGUSTUS
![Page 337: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/337.jpg)
AUGUSTUS
Features of AUGUSTUS:
• Intron length model
• Initial pattern for exons
• Similarity-based weighting for splice sites
• Interpolated HMM
• Internal 3’ content model
![Page 338: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/338.jpg)
Hidden-Markov-Models (HMM) for gene prediction
A T A A T G C C T A G T C   s (DNA)
Z Z Z E E E E I I I I   φ (parse)
An explicit intron length model is computationally expensive.
![Page 339: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/339.jpg)
AUGUSTUS
Intron length model:
• Explicit length distribution for short introns
• Geometric tail for long introns
[Diagram labels: Intron (fixed), Exon, Intron (expl.), Exon, Intron (geo.)]
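One way to combine an explicit distribution with a geometric tail is sketched below. The cut-off, the tail mass and the geometric parameter are hypothetical values introduced for this illustration; the slides do not state how AUGUSTUS parameterizes the mixture.

```python
def intron_length_prob(length, explicit, tail_mass, p):
    """P(intron length = length) under an explicit-plus-geometric-tail model.

    explicit  -- probabilities for lengths 1..d (must sum to 1), used up to the cut-off d
    tail_mass -- total probability assigned to lengths > d (assumed value)
    p         -- parameter of the geometric tail (assumed value)
    """
    d = len(explicit)
    if length < 1:
        return 0.0
    if length <= d:
        # Short introns: look the probability up in the explicit table.
        return (1.0 - tail_mass) * explicit[length - 1]
    # Long introns: geometric decay beyond the cut-off.
    return tail_mass * p * (1.0 - p) ** (length - d - 1)

explicit = [0.1, 0.2, 0.3, 0.4]  # toy table with cut-off d = 4
total = sum(intron_length_prob(n, explicit, 0.25, 0.01) for n in range(1, 5001))
print(total)  # close to 1
```

A geometric length distribution is what a single HMM state with a self-transition produces, which is why the tail is cheap to compute, whereas an explicit table requires examining every possible segment start.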
![Page 340: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/340.jpg)
AUGUSTUS
![Page 341: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/341.jpg)
AUGUSTUS
Extension of AUGUSTUS to include extrinsic information:
• Protein sequences
• EST sequences
• Syntenic genomic sequences
• User-defined constraints
![Page 342: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/342.jpg)
Gene prediction by phylogenetic footprinting
Comparison of genomic sequences
(human and mouse)
![Page 343: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/343.jpg)
Gene prediction by phylogenetic footprinting
![Page 344: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/344.jpg)
catcatatcttatcttacgttaactcccccgt
cagtgcgtgatagcccatatccgg
Gene prediction by phylogenetic footprinting
![Page 346: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/346.jpg)
catcatatcttatcttacgttaactcccccgt
cagtgcgtgatagcccatatccgg
Standard score: consider the length and the number of matches, and compute the probability of random occurrence
Gene prediction by phylogenetic footprinting
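A simplified version of such a random-occurrence probability is sketched here: the chance that a gap-free segment pair of a given length shows at least the observed number of matches, assuming independent positions and a match probability of 1/4 per position. Both assumptions are simplifications made for this example; the exact DIALIGN probability model is not spelled out on the slide.

```python
from math import comb

def match_tail_prob(length, matches, p=0.25):
    """P(at least `matches` matching positions among `length` independent positions)."""
    return sum(
        comb(length, k) * p**k * (1.0 - p) ** (length - k)
        for k in range(matches, length + 1)
    )

# A perfectly matching pair of length 4 under the uniform model:
print(match_tail_prob(4, 4))  # 0.00390625
```

The smaller this probability, the less likely the similarity arose by chance, so a fragment score can be derived from it.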
![Page 347: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/347.jpg)
Translation option:
catcatatcttatcttacgttaactcccccgt
cagtgcgtgatagcccatatccgg
Gene prediction by phylogenetic footprinting
![Page 348: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/348.jpg)
Translation option:
L S Y V
catcatatc tta tct tac gtt aactcccccgt
cagtgcgtg ata gcc cat atc cgg
I A H I
DNA segments are translated to peptide segments; the fragment score is based on peptide similarity:
Calculate the probability of finding a fragment of the same length with (at least) the same sum of BLOSUM values
Gene prediction by phylogenetic footprinting
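The translation step for the example fragment can be sketched as follows. Only the eight codons and the four substitution values needed for this fragment are included; the values are an excerpt of BLOSUM62, and the choice of BLOSUM62 is an assumption, since the slide only says "BLOSUM values". The sketch computes the sum of values, not the probability derived from it.

```python
# Partial codon table: just the codons occurring in the slide's example.
CODONS = {
    "tta": "L", "tct": "S", "tac": "Y", "gtt": "V",
    "ata": "I", "gcc": "A", "cat": "H", "atc": "I",
}

# Excerpt of BLOSUM62 (assumed matrix), keyed by unordered residue pair.
BLOSUM62_EXCERPT = {
    frozenset("LI"): 2,
    frozenset("SA"): 1,
    frozenset("YH"): 2,
    frozenset("VI"): 3,
}

def translate(dna):
    """Translate a DNA segment whose length is a multiple of 3, codon by codon."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))

def blosum_sum(pep_a, pep_b):
    """Sum of substitution values over aligned residue pairs of equal-length peptides."""
    return sum(BLOSUM62_EXCERPT[frozenset((a, b))] for a, b in zip(pep_a, pep_b))

pep1 = translate("ttatcttacgtt")   # LSYV
pep2 = translate("atagcccatatc")   # IAHI
print(pep1, pep2, blosum_sum(pep1, pep2))  # LSYV IAHI 8
```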
![Page 349: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/349.jpg)
P-fragment (in both orientations)
L S Y V
catcatatc tta tct tac gtt aactcccccgt
cagtgcgtg ata gcc cat atc cgg
I A H I
N-fragment
catcatatc ttatcttacgtt aactcccccgtgct
           || |  |  |
cagtgcgtg atagcccatatc cg
For each fragment f, three probability values are calculated; the score of f is based on the smallest P value.
Gene prediction by phylogenetic footprinting
![Page 350: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/350.jpg)
P-fragment (in both orientations)
L S Y V
catcatatc tta tct tac gtt aactcccccgt
cagtgcgtg ata gcc cat atc cgg
I A H I
N-fragment
catcatatc ttatcttacgtt aactcccccgtgct
           || |  |  |
cagtgcgtg atagcccatatc cg
P-fragments associated with strand and reading frame!
Gene prediction by phylogenetic footprinting
![Page 351: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/351.jpg)
Gene prediction by phylogenetic footprinting
![Page 352: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/352.jpg)
Gene prediction by phylogenetic footprinting
AGenDA: Alignment-based Gene Detection Algorithm
![Page 353: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/353.jpg)
Gene prediction by phylogenetic footprinting
Fragments in DIALIGN alignment
![Page 354: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/354.jpg)
Gene prediction by phylogenetic footprinting
Build cluster of fragments
![Page 355: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/355.jpg)
Gene prediction by phylogenetic footprinting
Identify conserved splice sites
![Page 356: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/356.jpg)
Gene prediction by phylogenetic footprinting
• Candidate exons bounded by conserved splice sites
• Find optimal chain of candidate exons
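Finding an optimal chain can be illustrated with a small dynamic program over candidate exons sorted by end position. This is a generic chaining sketch under stated assumptions (non-overlapping exons, positive additive scores); the scoring used by AGenDA and any further constraints, such as reading-frame consistency between consecutive exons, are not modeled here.

```python
def best_exon_chain(candidates):
    """Return (score, chain) for the highest-scoring chain of non-overlapping exons.

    candidates -- list of (start, end, score) with inclusive coordinates and score > 0
    """
    exons = sorted(candidates, key=lambda e: e[1])  # process by end position
    best = []  # best[i] = (score, chain) of the best chain ending with exons[i]
    for i, (start, end, score) in enumerate(exons):
        top_score, top_chain = 0.0, []
        for j in range(i):
            # exons[j] may precede exons[i] only if it ends before exons[i] starts
            if exons[j][1] < start and best[j][0] > top_score:
                top_score, top_chain = best[j]
        best.append((top_score + score, top_chain + [exons[i]]))
    return max(best, key=lambda b: b[0]) if best else (0.0, [])

candidates = [(10, 50, 3.0), (40, 90, 4.0), (60, 120, 2.5), (130, 200, 1.0)]
print(best_exon_chain(candidates))
# (6.5, [(10, 50, 3.0), (60, 120, 2.5), (130, 200, 1.0)])
```

The quadratic loop keeps the example short; with many candidates, a binary search over the sorted end positions would reduce the running time.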
![Page 357: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/357.jpg)
Gene prediction by phylogenetic footprinting
![Page 358: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/358.jpg)
Gene prediction by phylogenetic footprinting
![Page 359: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/359.jpg)
Gene prediction by phylogenetic footprinting
![Page 360: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/360.jpg)
Gene prediction by phylogenetic footprinting
[Figure: bar chart of sensitivity and specificity (axis 0% to 100%) for AGenDA and GenScan]
![Page 361: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/361.jpg)
Gene prediction by phylogenetic footprinting
[Figure: comparison of AGenDA and GenScan; values shown: 64 %, 12 %, 17 %]
![Page 362: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/362.jpg)
AUGUSTUS+
Extended GHMM using extrinsic information
Additional input data: a collection h of "hints" about the possible gene structure φ for sequence s
Consider s, φ and h as the result of a random process. Define the probability P(s,h,φ)
Find parse φ that maximizes P(φ|s,h) for given s and h.
![Page 363: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/363.jpg)
AUGUSTUS+
Hints created using
• Alignments to EST sequences
• Alignments to protein sequences
• Combined EST and protein alignment (EST alignments supported by protein alignments)
• Alignments of genomic sequences
• User-defined hints
![Page 364: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/364.jpg)
AUGUSTUS+
Alignment to EST: hint to (partial) exon
EST
G1
![Page 365: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/365.jpg)
AUGUSTUS+
EST alignment supported by protein: hint to exon (part), start codon
EST
G1
Protein
![Page 366: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/366.jpg)
AUGUSTUS+
Alignment to ESTs, Proteins: hints to introns, exons
ESTs, Protein
G1
![Page 367: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/367.jpg)
AUGUSTUS+
Alignment of genomic sequences: hint to (partial) exon
G2
G1
![Page 368: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/368.jpg)
AUGUSTUS+
Consider different types of hints:
Types of hints: start, stop, dss (donor splice site), ass (acceptor splice site), exonpart, exon, introns
• Each hint is associated with a position i in s (exons etc. are associated with their right end position)
• At most one hint of each type is allowed per position in s
• Each hint is associated with a grade g that indicates its source or reliability.
![Page 369: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/369.jpg)
AUGUSTUS+
h_{i,t} = information about a hint of type t at position i
h_{i,t} = $ if no hint of type t is available at i
h_{i,t} = [grade, strand, (length, reading frame)] if a hint is available
(hints created by protein alignments or DIALIGN contain information about the reading frame)
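The hint collection described above can be represented as a table keyed by (position, type). The class and field names below are hypothetical and chosen for this sketch; `None` stands in for the "no hint" symbol, and the dictionary key enforces at most one hint per type and position.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

@dataclass(frozen=True)
class Hint:
    grade: str                    # source or reliability of the hint
    strand: str                   # "+" or "-"
    length: Optional[int] = None  # only for hints covering a segment
    frame: Optional[int] = None   # only if the source provides a reading frame

# Keyed by (position i, type t); a missing key means "no hint".
HintTable = Dict[Tuple[int, str], Hint]

def get_hint(hints: HintTable, i: int, t: str) -> Optional[Hint]:
    """Return the hint of type t at position i, or None if there is none."""
    return hints.get((i, t))

hints: HintTable = {(8, "exonpart"): Hint(grade="example_grade", strand="+", length=30, frame=0)}
print(get_hint(hints, 8, "exonpart"))
print(get_hint(hints, 5, "start"))  # None
```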
![Page 370: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/370.jpg)
AUGUSTUS+
Standard program version, without hints
A T A A T G C C T A G T C   s (sequence)
Z Z Z E E E E E E I I I I   φ (parse)
Find parse that maximizes P(φ|s)
![Page 371: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/371.jpg)
AUGUSTUS+
AUGUSTUS+ using hints
A T A A T G C C T A G T C   s (sequence)
$ $ $ $ $ $ $ X $ $ $ $ $   h (type 1)
$ $ $ $ $ $ $ $ $ $ $ $ $   h (type 2)
$ $ $ $ X $ $ $ $ $ $ $ $   h (type 3)
. . . .
Z Z Z E E E E E E I I I I   φ (parse)
Find parse that maximizes P(φ|s,h)
![Page 372: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/372.jpg)
AUGUSTUS+
As in standard HMM theory: maximize joint probability P(φ,s,h)
How to define P(φ,s,h) ?
![Page 373: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/373.jpg)
AUGUSTUS+
General assumption: hints of different types t and at different positions i are independent of each other (for redundant hints: ignore "weaker" types).
![Page 374: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/374.jpg)
AUGUSTUS+
General assumption: hints of different types t and at different positions i are independent of each other (for redundant hints: ignore "weaker" types).
$$P(\varphi,s,h) = P(\varphi,s)\cdot P(h\mid\varphi,s)$$
![Page 375: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/375.jpg)
AUGUSTUS+
General assumption: hints of different types t and at different positions i are independent of each other (for redundant hints: ignore "weaker" types).
$$P(\varphi,s,h) = P(\varphi,s)\cdot P(h\mid\varphi,s) = P(\varphi,s)\cdot\prod_{i,t} P(h_{i,t}\mid\varphi,s)$$
![Page 376: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/376.jpg)
AUGUSTUS+
Assumption: P(h_{i,t}|φ,s) depends on the type t, the grade g, and whether h_{i,t} is compatible with φ or s.
Example: h_{i,t} is a hint to an exon E
h_{i,t} is compatible with parse φ if E is part of φ.
h_{i,t} is compatible with sequence s if start and stop codons exist according to E and if E contains no internal stop codon
![Page 377: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/377.jpg)
AUGUSTUS+
For given g and t: 3 possible values for P(h_{i,t}|φ,s)
$$P(h_{i,t}\mid\varphi,s)=\begin{cases} q^{+}(t,g) & \text{if } h_{i,t} \text{ is compatible with } \varphi\\ q^{-}(t,g) & \text{if } h_{i,t} \text{ is compatible with } s \text{ but not with } \varphi\\ 0 & \text{if } h_{i,t} \text{ is not compatible with } s \end{cases}$$
Values learned from training data
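The three-valued hint probability and its use in the joint probability can be sketched as follows. The function names are hypothetical and the numerical values for q+ and q- in the usage lines are invented for illustration; in AUGUSTUS+ they are learned from training data.

```python
import math

def hint_prob(compatible_with_parse, compatible_with_seq, q_plus, q_minus):
    """P(h_it | parse, s) for one hint of fixed type and grade."""
    if not compatible_with_seq:
        return 0.0          # hint contradicts the sequence itself
    if compatible_with_parse:
        return q_plus       # hint agrees with the parse
    return q_minus          # hint is possible on s but not part of this parse

def joint_with_hints(p_parse_and_seq, hint_factors):
    """P(parse, s, h) = P(parse, s) * product of per-position hint factors."""
    return p_parse_and_seq * math.prod(hint_factors)

q_plus, q_minus = 0.02, 0.001   # made-up example values
supported = hint_prob(True, True, q_plus, q_minus)
unsupported = hint_prob(False, True, q_plus, q_minus)
print(supported, unsupported)
```

`hint_factors` is meant to contain one factor per (position, type) pair, including the factors for positions where no hint is present.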
![Page 378: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/378.jpg)
AUGUSTUS+
Results:
Gene (sub-)structures supported by hints receive a bonus compared to non-supported structures
Gene (sub-)structures not supported by hints receive a penalty (malus)
(M. Stanke et al. 2006, BMC Bioinformatics)
![Page 379: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/379.jpg)
AUGUSTUS+
![Page 380: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/380.jpg)
AUGUSTUS+
h, h' are collections of hints;
h'_{i,t} = h_{i,t} for (i,t) ≠ (I,T)
h'_{I,T} ≠ h_{I,T} = $; g is the grade of h'_{I,T}
φ+, φ- are gene structures on s
h'_{I,T} is compatible with φ+, but not with φ-
![Page 381: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/381.jpg)
AUGUSTUS+
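$$\frac{P(\varphi^{+}\mid s,h')}{P(\varphi^{-}\mid s,h')} = \frac{P(\varphi^{+},s,h')}{P(\varphi^{-},s,h')} = \frac{P(\varphi^{+},s)\,P(h'\mid\varphi^{+},s)}{P(\varphi^{-},s)\,P(h'\mid\varphi^{-},s)}$$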
![Page 382: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/382.jpg)
AUGUSTUS+
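$$\begin{aligned}
\frac{P(\varphi^{+}\mid s,h')}{P(\varphi^{-}\mid s,h')} &= \frac{P(\varphi^{+},s,h')}{P(\varphi^{-},s,h')} = \frac{P(\varphi^{+},s)\,P(h'\mid\varphi^{+},s)}{P(\varphi^{-},s)\,P(h'\mid\varphi^{-},s)} \\
&= \frac{P(\varphi^{+},s)\,\prod_{i,t} P(h'_{i,t}\mid\varphi^{+},s)}{P(\varphi^{-},s)\,\prod_{i,t} P(h'_{i,t}\mid\varphi^{-},s)}
\end{aligned}$$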
![Page 383: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/383.jpg)
AUGUSTUS+
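$$\begin{aligned}
\frac{P(\varphi^{+}\mid s,h')}{P(\varphi^{-}\mid s,h')} &= \frac{P(\varphi^{+},s,h')}{P(\varphi^{-},s,h')} = \frac{P(\varphi^{+},s)\,P(h'\mid\varphi^{+},s)}{P(\varphi^{-},s)\,P(h'\mid\varphi^{-},s)} \\
&= \frac{P(\varphi^{+},s)\,\prod_{i,t} P(h'_{i,t}\mid\varphi^{+},s)}{P(\varphi^{-},s)\,\prod_{i,t} P(h'_{i,t}\mid\varphi^{-},s)} \\
&= \frac{P(\varphi^{+},s)\,\prod_{i,t} P(h_{i,t}\mid\varphi^{+},s)\cdot\dfrac{P(h'_{I,T}\mid\varphi^{+},s)}{P(h_{I,T}\mid\varphi^{+},s)}}{P(\varphi^{-},s)\,\prod_{i,t} P(h_{i,t}\mid\varphi^{-},s)\cdot\dfrac{P(h'_{I,T}\mid\varphi^{-},s)}{P(h_{I,T}\mid\varphi^{-},s)}}
\end{aligned}$$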
![Page 384: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/384.jpg)
AUGUSTUS+
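$$= \frac{P(\varphi^{+},s)\,\prod_{i,t} P(h_{i,t}\mid\varphi^{+},s)\cdot\dfrac{P(h'_{I,T}\mid\varphi^{+},s)}{P(h_{I,T}\mid\varphi^{+},s)}}{P(\varphi^{-},s)\,\prod_{i,t} P(h_{i,t}\mid\varphi^{-},s)\cdot\dfrac{P(h'_{I,T}\mid\varphi^{-},s)}{P(h_{I,T}\mid\varphi^{-},s)}}$$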
![Page 385: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/385.jpg)
AUGUSTUS+
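$$\begin{aligned}
&= \frac{P(\varphi^{+},s)\,\prod_{i,t} P(h_{i,t}\mid\varphi^{+},s)\cdot\dfrac{P(h'_{I,T}\mid\varphi^{+},s)}{P(h_{I,T}\mid\varphi^{+},s)}}{P(\varphi^{-},s)\,\prod_{i,t} P(h_{i,t}\mid\varphi^{-},s)\cdot\dfrac{P(h'_{I,T}\mid\varphi^{-},s)}{P(h_{I,T}\mid\varphi^{-},s)}} \\
&= \frac{P(\varphi^{+},s)\,P(h\mid\varphi^{+},s)\cdot\dfrac{q^{+}(T,g)}{P(h_{I,T}=\$\mid\varphi^{+},s)}}{P(\varphi^{-},s)\,P(h\mid\varphi^{-},s)\cdot\dfrac{q^{-}(T,g)}{P(h_{I,T}=\$\mid\varphi^{-},s)}}
\end{aligned}$$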
![Page 386: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/386.jpg)
AUGUSTUS+
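$$= \frac{P(\varphi^{+},s)\,P(h\mid\varphi^{+},s)\cdot\dfrac{q^{+}(T,g)}{P(h_{I,T}=\$\mid\varphi^{+},s)}}{P(\varphi^{-},s)\,P(h\mid\varphi^{-},s)\cdot\dfrac{q^{-}(T,g)}{P(h_{I,T}=\$\mid\varphi^{-},s)}}$$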
![Page 387: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/387.jpg)
AUGUSTUS+
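$$\begin{aligned}
&= \frac{P(\varphi^{+},s)\,P(h\mid\varphi^{+},s)\cdot\dfrac{q^{+}(T,g)}{P(h_{I,T}=\$\mid\varphi^{+},s)}}{P(\varphi^{-},s)\,P(h\mid\varphi^{-},s)\cdot\dfrac{q^{-}(T,g)}{P(h_{I,T}=\$\mid\varphi^{-},s)}} \\
&= \frac{P(\varphi^{+}\mid s,h)}{P(\varphi^{-}\mid s,h)}\cdot\frac{q^{+}(T,g)\,P(h_{I,T}=\$\mid\varphi^{-},s)}{q^{-}(T,g)\,P(h_{I,T}=\$\mid\varphi^{+},s)}
\end{aligned}$$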
![Page 388: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/388.jpg)
AUGUSTUS+
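Result:
$$\frac{P(\varphi^{+}\mid s,h')}{P(\varphi^{-}\mid s,h')} = \frac{P(\varphi^{+}\mid s,h)}{P(\varphi^{-}\mid s,h)}\cdot\frac{q^{+}(T,g)\,P(h_{I,T}=\$\mid\varphi^{-},s)}{q^{-}(T,g)\,P(h_{I,T}=\$\mid\varphi^{+},s)}$$
i.e. structure φ+, which is compatible with the additional hint h'_{I,T}, receives a relative bonus of
$$\frac{q^{+}(T,g)\,P(h_{I,T}=\$\mid\varphi^{-},s)}{q^{-}(T,g)\,P(h_{I,T}=\$\mid\varphi^{+},s)}$$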
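A small numerical illustration of the relative bonus derived on these slides: the factor by which the posterior ratio of φ+ to φ- changes when one hint that is compatible with φ+ but not with φ- is added. All four input values are invented for the example, and the function name is hypothetical.

```python
def relative_bonus(q_plus, q_minus, p_none_plus, p_none_minus):
    """Factor multiplying P(phi+|s,h) / P(phi-|s,h) when the additional hint is added.

    q_plus, q_minus -- hint probabilities q+(T,g) and q-(T,g)
    p_none_plus     -- probability of "no hint" at (I,T) given phi+ and s
    p_none_minus    -- probability of "no hint" at (I,T) given phi- and s
    """
    return (q_plus * p_none_minus) / (q_minus * p_none_plus)

bonus = relative_bonus(q_plus=0.02, q_minus=0.001, p_none_plus=0.9, p_none_minus=0.99)
old_ratio = 0.5               # assumed posterior ratio before adding the hint
new_ratio = old_ratio * bonus
print(bonus, new_ratio)       # roughly 22 and 11
```

With these made-up numbers, a structure that was half as probable as its competitor without the hint becomes about eleven times as probable with it.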
![Page 389: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/389.jpg)
AUGUSTUS+
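Results (gene level) on data set sag178

| Method | Sn (%) | Sp (%) |
|---|---|---|
| Augustus | 42 | 38 |
| GenScan | 18 | 14 |
| GeneID | 17 | 17 |
| HMMGene | 20 | 7 |
| Aug. + EST | 49 | 46 |
| Aug. + prot | 71 | 68 |
| Aug. combined | 68 | 65 |
| Aug. all | 82 | 79 |
| GenomeScan | 37 | 38 |
| TwinScan | 20 | 25 |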
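The slides do not define the two measures. The sketch below uses the convention common in gene-finding evaluations, which is an assumption here: sensitivity is the fraction of annotated genes that are predicted exactly, and specificity is the fraction of predicted genes that are correct. The gene identifiers are placeholders.

```python
def sensitivity_specificity(predicted, annotated):
    """Gene-level Sn = TP / (TP + FN) and Sp = TP / (TP + FP) for two sets of genes."""
    predicted, annotated = set(predicted), set(annotated)
    tp = len(predicted & annotated)                 # exactly matching genes
    sn = tp / len(annotated) if annotated else 0.0
    sp = tp / len(predicted) if predicted else 0.0
    return sn, sp

annotated = {"g1", "g2", "g3", "g4", "g5"}
predicted = {"g1", "g2", "g3", "x1"}
print(sensitivity_specificity(predicted, annotated))  # (0.6, 0.75)
```

Note that "specificity" in this sense is what other fields call precision.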
![Page 390: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/390.jpg)
AUGUSTUS+
Using hints from DIALIGN alignments:
1. Obtain large human/mouse sequence pairs (up to 50kb) from UCSC
2. Run CHAOS to find anchor points
3. Run DIALIGN using CHAOS anchor points
4. Create hints h from DIALIGN fragments
5. Run AUGUSTUS with hints
![Page 391: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/391.jpg)
AUGUSTUS+
Hints from DIALIGN fragments:
The segment covered by a peptide fragment, minus 33 bp at both ends, defines an exonpart hint on all 6 reading frames.
![Page 392: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/392.jpg)
AUGUSTUS+
Hints from DIALIGN fragments:
Consider fragments with score ≥ 20
• Distinguish high scores (≥ 45) from low scores
• Consider reading frame given by DIALIGN
• Consider strand given by DIALIGN
=> 2*2*2 = 8 grades
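The rules for turning a DIALIGN peptide fragment into an exonpart hint (minimum score 20, high scores from 45, 33 bp trimmed at both ends) can be sketched as below. The function name, the inclusive coordinates and the decision to drop fragments that are too short to survive trimming are assumptions made for this example. The reading-frame and strand components of the grade are left out, because the slides do not specify how they are encoded.

```python
MIN_SCORE = 20      # fragments below this score are ignored
HIGH_SCORE = 45     # threshold separating high from low scores
TRIM = 33           # bp removed at both ends of the fragment

def fragment_to_hint(start, end, score):
    """Return (hint_start, hint_end, score_class) or None if no hint is created."""
    if score < MIN_SCORE:
        return None
    hint_start, hint_end = start + TRIM, end - TRIM
    if hint_start > hint_end:
        return None  # assumption: fragments too short to trim yield no hint
    score_class = "high" if score >= HIGH_SCORE else "low"
    return hint_start, hint_end, score_class

print(fragment_to_hint(100, 250, 50))  # (133, 217, 'high')
print(fragment_to_hint(100, 250, 10))  # None
```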
![Page 393: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/393.jpg)
AUGUSTUS+
AUGUSTUS was the best ab initio method at EGASP
![Page 394: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/394.jpg)
EGASP test results
[Figure: bar chart, nucleotide level: sensitivity and specificity (axis 0 to 100) for AUGUSTUS, GENSCAN, geneid, GeneMark.hmm and Genezilla]
![Page 395: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/395.jpg)
EGASP test results
[Figure: bar chart, exon level: sensitivity and specificity (axis 0 to 100) for AUGUSTUS, GENSCAN, geneid, GeneMark.hmm and Genezilla]
![Page 396: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/396.jpg)
EGASP test results
[Figure: bar chart, transcript level: sensitivity and specificity (axis 0 to 30) for AUGUSTUS, GENSCAN, geneid, GeneMark.hmm and Genezilla]
![Page 397: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/397.jpg)
EGASP test results
[Figure: bar chart, gene level: sensitivity and specificity (axis 0 to 30) for AUGUSTUS, GENSCAN, geneid, GeneMark.hmm and Genezilla]
![Page 398: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/398.jpg)
EGASP test results
[Figure: accuracy (axis 0% to 100%), shown as Sn and Sp at base, exon, transcript and gene level, for AUGUSTUS, AUGUSTUS+DIALIGN, DOGFISH-C, SGP2, TWINSCAN, TWINSCAN-MARS and N-SCAN]
![Page 399: 6/3/2015Burkhard Morgenstern, Tunis 2007 Multiple Alignment and Motif Searching Burkhard Morgenstern Universität Göttingen Institute of Microbiology and.](https://reader036.fdocuments.us/reader036/viewer/2022062423/56649d2b5503460f94a00203/html5/thumbnails/399.jpg)
Ongoing projects
Brugia malayi (TIGR)
Aedes aegypti (TIGR)
Schistosoma mansoni (TIGR)
Tetrahymena thermophila (TIGR)
Galdieria sulphuraria (Michigan State Univ.)
Coprinus cinereus (Univ. Göttingen)
Tribolium castaneum (Univ. Göttingen)