Pairwise alignment
![Page 1: Pairwise alignment](https://reader035.fdocuments.us/reader035/viewer/2022081420/56812da7550346895d92cce4/html5/thumbnails/1.jpg)
Pairwise alignment
Now we know how to do pairwise alignment: how do we get a multiple alignment (three or more sequences)?
Multiple alignment: much greater combinatorial explosion than with pairwise alignment.
![Page 2: Pairwise alignment](https://reader035.fdocuments.us/reader035/viewer/2022081420/56812da7550346895d92cce4/html5/thumbnails/2.jpg)
Multi-dimensional dynamic programming (Murata et al. 1985)
![Page 3: Pairwise alignment](https://reader035.fdocuments.us/reader035/viewer/2022081420/56812da7550346895d92cce4/html5/thumbnails/3.jpg)
Simultaneous multiple alignment
Multi-dimensional dynamic programming
MSA (Lipman et al., 1989, PNAS 86, 4412): extremely slow and memory intensive; handles up to 8-9 sequences of ~250 residues
DCA (Stoye et al., 1997, CABIOS 13, 625): still very slow
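A rough way to see why simultaneous alignment is so expensive is to count the cells of the N-dimensional dynamic programming lattice. The sketch below assumes N sequences of equal length L, one cell per combination of prefix lengths, and a full lattice with no pruning; the function names are illustrative only.

```python
def dp_lattice_cells(num_sequences: int, length: int) -> int:
    """Cells in a full N-dimensional DP lattice: (L + 1) ** N."""
    return (length + 1) ** num_sequences


def predecessors_per_cell(num_sequences: int) -> int:
    """Each cell is reached from up to 2**N - 1 predecessors
    (every non-empty subset of sequences advances by one residue)."""
    return 2 ** num_sequences - 1


# Pairwise, three-way, and eight-way alignment of 250-residue sequences
for n in (2, 3, 8):
    print(n, dp_lattice_cells(n, 250), predecessors_per_cell(n))
```

For two sequences the lattice is an ordinary matrix; every additional sequence multiplies the cell count by L + 1 and roughly doubles the work per cell.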
![Page 4: Pairwise alignment](https://reader035.fdocuments.us/reader035/viewer/2022081420/56812da7550346895d92cce4/html5/thumbnails/4.jpg)
Alternative multiple alignment methods
Biopat (first method ever)
MULTAL (Taylor 1987)
DIALIGN (Morgenstern 1996)
PRRP (Gotoh 1996)
Clustal (Thompson, Higgins, Gibson 1994)
Praline (Heringa 1999)
T-Coffee (Notredame 2000)
HMMER (Eddy 1998) [Hidden Markov Models]
SAGA (Notredame 1996) [Genetic algorithms]
![Page 5: Pairwise alignment](https://reader035.fdocuments.us/reader035/viewer/2022081420/56812da7550346895d92cce4/html5/thumbnails/5.jpg)
Progressive multiple alignment general principles
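[Figure: sequences 1-5 are compared pairwise (Score 1-2, Score 1-3, ..., Score 4-5). The scores fill a 5×5 similarity matrix, the scores are converted to distances, the distances give a guide tree, and the guide tree drives the multiple alignment. Iteration possibilities are indicated.]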
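The chain from pairwise scores to a guide tree can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any particular program: the similarities are made-up values normalised to 0..1, the conversion d = 1 - s is one simple choice among several, and the tree is built with UPGMA (average linkage) as a compact stand-in for other tree-building methods.

```python
from itertools import combinations


def scores_to_distances(sim):
    """Convert normalised similarities (0..1) to distances via d = 1 - s (assumed)."""
    return {pair: 1.0 - s for pair, s in sim.items()}


def upgma(names, dist):
    """Build a guide tree (nested tuples) by repeatedly merging the closest clusters."""
    clusters = {name: 1 for name in names}            # cluster -> number of leaves
    d = {frozenset(p): v for p, v in dist.items()}    # unordered pair -> distance
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        na, nb = clusters.pop(a), clusters.pop(b)
        for c in clusters:
            # Size-weighted average distance from the merged cluster to cluster c
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        clusters[merged] = na + nb
    return next(iter(clusters))


# Hypothetical pairwise similarity scores for five sequences
similarity = {
    ("1", "2"): 0.3, ("1", "3"): 0.9, ("1", "4"): 0.2, ("1", "5"): 0.3,
    ("2", "3"): 0.3, ("2", "4"): 0.5, ("2", "5"): 0.8,
    ("3", "4"): 0.2, ("3", "5"): 0.3, ("4", "5"): 0.5,
}
tree = upgma(["1", "2", "3", "4", "5"], scores_to_distances(similarity))
print(tree)
```

The nested tuples give the order in which a progressive method would align: the innermost pairs first, the root join last.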
![Page 6: Pairwise alignment](https://reader035.fdocuments.us/reader035/viewer/2022081420/56812da7550346895d92cce4/html5/thumbnails/6.jpg)
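General progressive multiple alignment technique (follow generated tree)
[Figure: guide tree with distance axis d and a root. Alignment follows the tree from the leaves upwards: sequences 1 and 3 are aligned, sequences 2 and 5 are aligned, sequence 4 is added to the 2-5 alignment, and the two groups are joined at the root.]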
![Page 7: Pairwise alignment](https://reader035.fdocuments.us/reader035/viewer/2022081420/56812da7550346895d92cce4/html5/thumbnails/7.jpg)
Progressive multiple alignment
Problem: accuracy is very important, because errors are propagated into the subsequent progressive steps
“Once a gap, always a gap” (Feng & Doolittle, 1987)
![Page 8: Pairwise alignment](https://reader035.fdocuments.us/reader035/viewer/2022081420/56812da7550346895d92cce4/html5/thumbnails/8.jpg)
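Multiple alignment profiles (Gribskov et al. 1987)
[Figure: a profile with one row per amino acid (A, C, D, ..., W, Y) and one column per alignment position i. The example column holds the values 0.3, 0.1, 0, 0.3, 0.3. An extra row holds the gap penalties (example values 0.5 and 1.0).]
Position-dependent gap penalties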
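How a profile summarises an alignment can be sketched as below. This is a simplified frequency profile, assuming unweighted sequences and `-` as the gap character; it does not reproduce the exact scoring of Gribskov et al. and it leaves out the position-dependent gap penalty row. The function name and the toy alignment are illustrative.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(alignment, alphabet=AMINO_ACIDS):
    """Return one dict per alignment column: residue -> relative frequency."""
    n_seqs = len(alignment)
    profile = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != "-")
        # Gaps are not counted, so a column's frequencies may sum to less than 1
        profile.append({aa: counts[aa] / n_seqs for aa in alphabet})
    return profile


# Toy alignment of four sequences (hypothetical data)
alignment = ["ACDWY", "ACDWY", "AC-WY", "ACEWF"]
profile = build_profile(alignment)
print(profile[2]["D"], profile[2]["E"], profile[4]["Y"])
```

Each column of the result corresponds to one alignment position i, which is what makes position-specific scoring possible.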
![Page 9: Pairwise alignment](https://reader035.fdocuments.us/reader035/viewer/2022081420/56812da7550346895d92cce4/html5/thumbnails/9.jpg)
Profile-sequence alignment
[Figure: a sequence aligned against a profile with rows A, C, D, ..., V, W, Y.]
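Aligning a sequence to a profile uses the same dynamic programming recurrence as pairwise alignment, except that a residue is scored against a whole profile column. The sketch below assumes a frequency profile (one dict per column), a toy match/mismatch scheme instead of a real substitution matrix, and a single linear gap penalty; all names and values are illustrative.

```python
def column_score(column, residue, match=1.0, mismatch=-1.0):
    """Frequency-weighted average score of `residue` against one profile column."""
    return sum(
        freq * (match if aa == residue else mismatch)
        for aa, freq in column.items()
    )


def align_profile_sequence(profile, seq, gap=-1.0):
    """Global (Needleman-Wunsch) alignment score of a profile against a sequence."""
    n, m = len(profile), len(seq)
    F = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + column_score(profile[i - 1], seq[j - 1]),
                F[i - 1][j] + gap,   # profile column aligned to a gap
                F[i][j - 1] + gap,   # sequence residue aligned to a gap
            )
    return F[n][m]


# Toy profile with five columns (hypothetical frequencies)
toy_profile = [{"A": 1.0}, {"C": 1.0}, {"D": 0.5, "E": 0.5}, {"W": 1.0}, {"Y": 1.0}]
print(align_profile_sequence(toy_profile, "ACDWY"))
print(align_profile_sequence(toy_profile, "ACWY"))
```

Profile-profile alignment follows the same pattern, with a column-against-column score in place of `column_score`.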
![Page 10: Pairwise alignment](https://reader035.fdocuments.us/reader035/viewer/2022081420/56812da7550346895d92cce4/html5/thumbnails/10.jpg)
Profile-profile alignment
[Figure: two profiles, one with rows A, C, D, ..., Y and one with rows A, C, D, ..., V, W, Y, aligned against each other.]
![Page 11: Pairwise alignment](https://reader035.fdocuments.us/reader035/viewer/2022081420/56812da7550346895d92cce4/html5/thumbnails/11.jpg)
Clustal, ClustalW, ClustalX
CLUSTAL W/X (Thompson et al., 1994) uses the Neighbour Joining (NJ) algorithm (Saitou and Nei, 1987), widely used in phylogenetic analysis, to construct the guide tree.
Sequence blocks are represented by profiles, in which the individual sequences are additionally weighted according to the branch lengths in the NJ tree.
Further carefully crafted heuristics include:
(i) local gap penalties
(ii) automatic selection of the amino acid substitution matrix
(iii) automatic gap penalty adjustment
(iv) a mechanism to delay alignment of sequences that appear to be distant at the time they are considered
CLUSTAL (W/X) does not allow iteration (Hogeweg and Hesper, 1984; Corpet, 1988; Gotoh, 1996; Heringa, 1999, 2002)
![Page 12: Pairwise alignment](https://reader035.fdocuments.us/reader035/viewer/2022081420/56812da7550346895d92cce4/html5/thumbnails/12.jpg)
Strategies for multiple sequence alignment
Profile pre-processing
Secondary structure-induced alignment
Globalised local alignment
Matrix extension
Objective: try to avoid (early) errors
![Page 13: Pairwise alignment](https://reader035.fdocuments.us/reader035/viewer/2022081420/56812da7550346895d92cce4/html5/thumbnails/13.jpg)
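Pre-profile generation
[Figure: pairwise scores (Score 1-2, Score 1-3, ..., Score 4-5) are computed for sequences 1-5. For each sequence a pre-alignment is built with that sequence on top and the other sequences aligned to it (for example 1 with 2, 3, 4, 5; 2 with 1, 3, 4, 5; 5 with 1, 2, 3, 4), subject to a score cut-off. Each pre-alignment is converted into a pre-profile with rows A, C, D, ..., Y.]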
![Page 15: Pairwise alignment](https://reader035.fdocuments.us/reader035/viewer/2022081420/56812da7550346895d92cce4/html5/thumbnails/15.jpg)
Protein structure hierarchical levels
PRIMARY STRUCTURE (amino acid sequence)
VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH
SECONDARY STRUCTURE (helices, strands)
TERTIARY STRUCTURE (fold)
QUATERNARY STRUCTURE (oligomers)
![Page 17: Pairwise alignment](https://reader035.fdocuments.us/reader035/viewer/2022081420/56812da7550346895d92cce4/html5/thumbnails/17.jpg)
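Globalised local alignment
Double dynamic programming:
1. Local (SW, Smith-Waterman) alignment, using the exchange matrix M and the gap-opening and gap-extension penalties Po,e
2. Global (NW, Needleman-Wunsch) alignment, without M or Po,e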
![Page 19: Pairwise alignment](https://reader035.fdocuments.us/reader035/viewer/2022081420/56812da7550346895d92cce4/html5/thumbnails/19.jpg)
Matrix extension – T-Coffee
[Figure: all pairwise alignments of four sequences: 1-2, 1-3, 1-4, 2-3, 2-4, 3-4.]
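The idea of matrix (library) extension can be sketched as follows: the weight of a residue pair taken from one pairwise alignment is reinforced whenever a third sequence supports it. The sketch assumes the triplet rule described by Notredame et al. (2000), in which each intermediate residue z contributes min(W(x,z), W(z,y)); the data layout, the function name, and the weights are illustrative and are not T-Coffee's actual data structures.

```python
from collections import defaultdict


def extend_library(primary):
    """Add min(W(x,z), W(z,y)) to pair (x, y) for every residue z linked to both.

    A residue is a (sequence_name, position) tuple; `primary` maps a pair of
    residues from different sequences to its pairwise-alignment weight.
    """
    links = defaultdict(dict)            # residue -> {partner residue: weight}
    for (x, y), w in primary.items():
        links[x][y] = w
        links[y][x] = w

    extended = defaultdict(float)
    for (x, y), w in primary.items():
        extended[frozenset((x, y))] += w  # start from the direct weight

    for z, partners in links.items():
        items = list(partners.items())
        for i, (x, wx) in enumerate(items):
            for y, wy in items[i + 1:]:
                if x[0] != y[0]:          # x and y must be in different sequences
                    extended[frozenset((x, y))] += min(wx, wy)
    return dict(extended)


# Hypothetical primary weights for one residue in each of sequences A, B, C
primary = {
    (("A", 0), ("B", 0)): 88,
    (("A", 0), ("C", 0)): 77,
    (("C", 0), ("B", 0)): 100,
}
extended = extend_library(primary)
print(extended[frozenset({("A", 0), ("B", 0)})])
```

A pair that is consistent with the other pairwise alignments gains weight, so the later progressive alignment is guided by information from all sequences rather than from one pair at a time.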
![Page 20: Pairwise alignment](https://reader035.fdocuments.us/reader035/viewer/2022081420/56812da7550346895d92cce4/html5/thumbnails/20.jpg)
Summary
Weighting schemes simulating simultaneous multiple alignment: profile pre-processing (global/local); matrix extension (well balanced scheme)
Smoothing alignment signals: globalised local alignment
Using additional information: secondary structure driven alignment
Schemes strike a balance between speed and sensitivity
![Page 21: Pairwise alignment](https://reader035.fdocuments.us/reader035/viewer/2022081420/56812da7550346895d92cce4/html5/thumbnails/21.jpg)
References
Heringa, J. (1999) Two strategies for sequence comparison: profile-preprocessed and secondary structure-induced multiple alignment. Comput. Chem., 23, 341-364.
Notredame, C., Higgins, D.G., Heringa, J. (2000) T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol., 302, 205-217.
Heringa, J. (2002) Local weighting schemes for protein multiple sequence alignment. Comput. Chem., 26(5), 459-477.
![Page 22: Pairwise alignment](https://reader035.fdocuments.us/reader035/viewer/2022081420/56812da7550346895d92cce4/html5/thumbnails/22.jpg)
Where to find this: http://www.cs.vu.nl/~ibivu/teaching