8/3/2019 Sequence Alignment Algorithms
Developing Pairwise Sequence Alignment Algorithms
Dr. Nancy Warter-Perez
Outline
- Overview of global and local alignment
- References for sequence alignment algorithms
- Discussion of Needleman-Wunsch iterative approach to global alignment
- Discussion of Smith-Waterman recursive approach to local alignment
- Discussion of how LCS Algorithm can be extended for
  - Global alignment (Needleman-Wunsch)
  - Local alignment (Smith-Waterman)
  - Affine gap penalties
- Group assignments for project
Overview of Pairwise Sequence Alignment
Dynamic Programming
- Applied to optimization problems
- Useful when
  - Problem can be recursively divided into sub-problems
  - Sub-problems are not independent
Needleman-Wunsch is a global alignment technique that uses an iterative algorithm and no gap penalty (could extend to fixed gap penalty).
Smith-Waterman is a local alignment technique that uses a recursive algorithm and can use alternative gap penalties (such as affine). Smith-Waterman's algorithm is an extension of the Longest Common Subsequence (LCS) problem and can be generalized to solve both local and global alignment.
Note: Needleman-Wunsch is usually used to refer to global alignment regardless of the algorithm used.
Project References
- http://www.sbc.su.se/~arne/kurser/swell/pairwise_alignments.html
- Computational Molecular Biology: An Algorithmic Approach, Pavel Pevzner
- Introduction to Computational Biology: Maps, Sequences, and Genomes, Michael Waterman
- Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, Dan Gusfield
Classic Papers
- Needleman, S.B. and Wunsch, C.D. A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins. J. Mol. Biol., 48, pp. 443-453, 1970. (http://www.cs.umd.edu/class/spring2003/cmsc838t/papers/needlemanandwunsch1970.pdf)
- Smith, T.F. and Waterman, M.S. Identification of Common Molecular Subsequences. J. Mol. Biol., 147, pp. 195-197, 1981. (http://www.cmb.usc.edu/papers/msw_papers/msw-042.pdf)
Needleman-Wunsch (1 of 3)
Match = 1
Mismatch = 0
Gap = 0
Needleman-Wunsch (2 of 3)
Needleman-Wunsch (3 of 3)
From page 446:
It is apparent that the above array operation can begin at any of a number of points along the borders of the array, which is equivalent to a comparison of N-terminal residues or C-terminal residues only. As long as the appropriate rules for pathways are followed, the maximum match will be the same. The cells of the array which contributed to the maximum match, may be determined by recording the origin of the number that was added to each cell when the array was operated upon.
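The idea of "recording the origin of the number that was added to each cell" can be sketched as a score matrix with a parallel matrix of back-pointers. The Python below is a minimal illustration, not the paper's own procedure: it fills from the top-left border (one of the starting points the quotation allows), uses the slide's scoring (match = 1, mismatch = 0, gap = 0) as defaults, and the names `needleman_wunsch`, `score`, and `origin` are hypothetical. The `gap` value is added to the score, so a penalty would be passed as a negative number.

```python
def needleman_wunsch(a, b, match=1, mismatch=0, gap=0):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    origin = [[None] * (m + 1) for _ in range(n + 1)]  # where each cell's value came from

    # Border cells can only be reached by a run of gaps.
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap
        origin[i][0] = "up"
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap
        origin[0][j] = "left"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            best = max(diag, up, left)
            score[i][j] = best
            # Record the origin; ties prefer the diagonal, then up, then left.
            origin[i][j] = "diag" if best == diag else ("up" if best == up else "left")

    # Follow the recorded origins back from the bottom-right corner.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        move = origin[i][j]
        if move == "diag":
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif move == "up":
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


result = needleman_wunsch("GATTACA", "GCATGCU")
```

With these default scores the result counts matched columns only, so several alignments can tie; the tie-breaking order in `origin` decides which one the traceback reports.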
Smith-Waterman (1 of 3)
Algorithm
The two molecular sequences will be $A = a_1 a_2 \ldots a_n$ and $B = b_1 b_2 \ldots b_m$. A similarity $s(a, b)$ is given between sequence elements $a$ and $b$. Deletions of length $k$ are given weight $W_k$. To find pairs of segments with high degrees of similarity, we set up a matrix $H$. First set
$$H_{k0} = H_{0l} = 0 \quad \text{for } 0 \le k \le n \text{ and } 0 \le l \le m.$$
Smith-Waterman (2 of 3)
The formula for $H_{ij}$ follows by considering the possibilities for ending the segments at any $a_i$ and $b_j$.
(1) If $a_i$ and $b_j$ are associated, the similarity is $H_{i-1,j-1} + s(a_i, b_j)$.
(2) If $a_i$ is at the end of a deletion of length $k$, the similarity is $H_{i-k,j} - W_k$.
(3) If $b_j$ is at the end of a deletion of length $l$, the similarity is $H_{i,j-l} - W_l$. (typo in paper)
(4) Finally, a zero is included to prevent calculated negative similarity, indicating no similarity up to $a_i$ and $b_j$.
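Taken together, the four cases give $H_{ij} = \max\{H_{i-1,j-1} + s(a_i, b_j),\ \max_{k \ge 1}(H_{i-k,j} - W_k),\ \max_{l \ge 1}(H_{i,j-l} - W_l),\ 0\}$. A minimal Python sketch of that fill step follows. The function name `smith_waterman_matrix`, the callables `similarity` and `weight`, and the example values (+2 for a match, -1 for a mismatch, $W_k = 2 + k$) are illustrative assumptions, not values from the slides.

```python
def smith_waterman_matrix(a, b, s, w):
    """Fill H from the four cases; s(x, y) is the similarity, w(k) the weight of a length-k deletion."""
    n, m = len(a), len(b)
    # Row 0 and column 0 stay at zero (the boundary condition).
    H = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            assoc = H[i - 1][j - 1] + s(a[i - 1], b[j - 1])            # case (1)
            del_a = max(H[i - k][j] - w(k) for k in range(1, i + 1))   # case (2)
            del_b = max(H[i][j - l] - w(l) for l in range(1, j + 1))   # case (3)
            H[i][j] = max(assoc, del_a, del_b, 0)                      # case (4)
    return H


def similarity(x, y):
    # Hypothetical scoring: reward matches, penalize mismatches.
    return 2 if x == y else -1


def weight(k):
    # Hypothetical deletion weight that grows with the deletion length.
    return 2 + k


H = smith_waterman_matrix("ACGT", "ACGT", similarity, weight)
best = max(max(row) for row in H)  # 8: four matches on the main diagonal
```

Because every deletion length is tried for each cell, this direct form costs roughly cubic time in the sequence lengths; it is written for clarity rather than speed.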
Smith-Waterman (3 of 3)
The pair of segments with maximum similarity is found by first locating the maximum element of $H$. The other matrix elements leading to this maximum value are then sequentially determined with a traceback procedure ending with an element of $H$ equal to zero. This procedure identifies the segments as well as produces the corresponding alignment. The pair of segments with the next best similarity is found by applying the traceback procedure to the second largest element of $H$ not associated with the first traceback.
LCS Problem (cont.)
Similarity score
$$
s_{i,j} = \max \begin{cases}
s_{i-1,j} \\
s_{i,j-1} \\
s_{i-1,j-1} + 1, & \text{if } v_i = w_j
\end{cases}
$$
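As a rough illustration, the LCS recurrence translates almost line for line into a table fill. The function name `lcs_length` is a hypothetical choice, and the strings `v` and `w` are indexed from zero in Python, so `v[i - 1]` plays the role of $v_i$.

```python
def lcs_length(v, w):
    """Length of the longest common subsequence of v and w."""
    n, m = len(v), len(w)
    # s[i][j] holds the LCS length of the prefixes v[:i] and w[:j].
    s = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = max(s[i - 1][j], s[i][j - 1])
            if v[i - 1] == w[j - 1]:
                best = max(best, s[i - 1][j - 1] + 1)
            s[i][j] = best
    return s[n][m]


length = lcs_length("ABCBDAB", "BDCABA")  # 4, for example "BCBA"
```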
Extend LCS to Global Alignment
$$
s_{i,j} = \max \begin{cases}
s_{i-1,j} + H(v_i, -) \\
s_{i,j-1} + H(-, w_j) \\
s_{i-1,j-1} + H(v_i, w_j)
\end{cases}
$$
- $H(v_i, -) = H(-, w_j) = -V$ = fixed gap penalty
- $H(v_i, w_j)$ = score for match or mismatch; can be fixed, from PAM or BLOSUM
Modify LCS and PRINT-LCS algorithms to support global alignment (On board discussion)
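One way to read the change from LCS: the border cells now accumulate the gap penalty, and the "+1 on a match" term becomes a lookup of $H(v_i, w_j)$. The sketch below assumes a hypothetical callable `H(x, y)` standing in for a fixed score or a PAM/BLOSUM lookup, and a positive number `V` for the fixed gap penalty; it computes only the score, not the printed alignment.

```python
def global_alignment_score(v, w, H, V):
    """Optimal global alignment score with a fixed gap penalty V (V > 0)."""
    n, m = len(v), len(w)
    s = [[0] * (m + 1) for _ in range(n + 1)]
    # Unlike LCS, a prefix aligned against nothing costs one gap per symbol.
    for i in range(1, n + 1):
        s[i][0] = s[i - 1][0] - V
    for j in range(1, m + 1):
        s[0][j] = s[0][j - 1] - V
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s[i][j] = max(
                s[i - 1][j] - V,                          # v_i aligned to a gap
                s[i][j - 1] - V,                          # w_j aligned to a gap
                s[i - 1][j - 1] + H(v[i - 1], w[j - 1]),  # v_i aligned to w_j
            )
    return s[n][m]


def simple_score(x, y):
    # Hypothetical fixed scoring: +1 for a match, -1 for a mismatch.
    return 1 if x == y else -1


total = global_alignment_score("AC", "A", simple_score, 2)  # -1: one match, one gap
```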
Extend to Local Alignment
$$
s_{i,j} = \max \begin{cases}
0 & \text{(no negative scores)} \\
s_{i-1,j} + H(v_i, -) \\
s_{i,j-1} + H(-, w_j) \\
s_{i-1,j-1} + H(v_i, w_j)
\end{cases}
$$
- $H(v_i, -) = H(-, w_j) = -V$ = fixed gap penalty
- $H(v_i, w_j)$ = score for match or mismatch; can be fixed, from PAM or BLOSUM
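A minimal sketch of the local variant, under the same assumptions as the recurrence: the only changes from the global case are the extra zero in the maximum, zero borders, and a traceback that starts at the largest cell and stops at the first zero. The names `local_alignment` and `match_score`, and the example values (+2 match, -1 mismatch, $V = 2$), are hypothetical.

```python
def local_alignment(v, w, H, V):
    """Return (score, aligned_v, aligned_w) for one best local alignment."""
    n, m = len(v), len(w)
    s = [[0] * (m + 1) for _ in range(n + 1)]  # borders stay at zero
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s[i][j] = max(
                0,                                        # no negative scores
                s[i - 1][j] - V,
                s[i][j - 1] - V,
                s[i - 1][j - 1] + H(v[i - 1], w[j - 1]),
            )
            if s[i][j] > best:
                best, best_pos = s[i][j], (i, j)

    # Trace back from the maximum cell until a zero cell is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and s[i][j] > 0:
        if s[i][j] == s[i - 1][j - 1] + H(v[i - 1], w[j - 1]):
            top.append(v[i - 1]); bottom.append(w[j - 1]); i -= 1; j -= 1
        elif s[i][j] == s[i - 1][j] - V:
            top.append(v[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(w[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


def match_score(x, y):
    # Hypothetical fixed scoring: +2 for a match, -1 for a mismatch.
    return 2 if x == y else -1


hit = local_alignment("TTACGTT", "GGACGGG", match_score, 2)  # (6, "ACG", "ACG")
```

Here the traceback recomputes which case produced each cell instead of storing back-pointers; either approach works, and storing pointers trades memory for a simpler loop.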
Discussion on adding affine gap penalties
Affine gap penalty
- Score for a gap of length $x$: $-(V + Wx)$
- Where
  - $V > 0$ is the insert gap penalty
  - $W > 0$ is the extend gap penalty
On board example from http://www.sbc.su.se/~arne/kurser/swell/pairwise_alignments.html
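Alignment with Gap Penalties
Can apply to global or local (w/ zero) algorithms
$$
s^{\downarrow}_{i,j} = \max \begin{cases}
s^{\downarrow}_{i-1,j} - W \\
s_{i-1,j} - (V + W)
\end{cases}
$$
$$
s^{\rightarrow}_{i,j} = \max \begin{cases}
s^{\rightarrow}_{i,j-1} - W \\
s_{i,j-1} - (V + W)
\end{cases}
$$
$$
s_{i,j} = \max \begin{cases}
s_{i-1,j-1} + H(v_i, w_j) \\
s^{\downarrow}_{i,j} \\
s^{\rightarrow}_{i,j}
\end{cases}
$$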
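The three-level recurrence can be sketched with three tables, one per level. This is an illustrative global-alignment version only (the local variant would add a zero to the last maximum); the names `affine_global_score`, `down`, `right`, and `unit_score` are hypothetical, and a gap of length $x$ is charged $V + Wx$ as defined for the affine penalty.

```python
NEG_INF = float("-inf")


def affine_global_score(v, w, H, V, W):
    """Optimal global score when a gap of length x costs V + W * x."""
    n, m = len(v), len(w)
    s = [[NEG_INF] * (m + 1) for _ in range(n + 1)]      # best score overall
    down = [[NEG_INF] * (m + 1) for _ in range(n + 1)]   # ends with v_i against a gap
    right = [[NEG_INF] * (m + 1) for _ in range(n + 1)]  # ends with w_j against a gap
    s[0][0] = 0
    # A border cell is a single gap run of length i (or j).
    for i in range(1, n + 1):
        down[i][0] = -(V + W * i)
        s[i][0] = down[i][0]
    for j in range(1, m + 1):
        right[0][j] = -(V + W * j)
        s[0][j] = right[0][j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Either extend an existing gap (cost W) or open a new one (cost V + W).
            down[i][j] = max(down[i - 1][j] - W, s[i - 1][j] - (V + W))
            right[i][j] = max(right[i][j - 1] - W, s[i][j - 1] - (V + W))
            s[i][j] = max(
                s[i - 1][j - 1] + H(v[i - 1], w[j - 1]),
                down[i][j],
                right[i][j],
            )
    return s[n][m]


def unit_score(x, y):
    # Hypothetical fixed scoring: +1 for a match, -1 for a mismatch.
    return 1 if x == y else -1


# Two matches plus one gap of length 2: 2 - (2 + 1 * 2) = -2.
total = affine_global_score("AAAA", "AA", unit_score, 2, 1)
```

With these example values a single gap of length 2 costs 4, while two separate gaps of length 1 would cost 6, which is the behavior the affine penalty is meant to encourage.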
Project Teams and Presentation Assignments
- Base Project (Global Alignment): Shwe and Leighton
- Extension 1 (Ends-Free Global Alignment): Ehsanul and Water Tree
- Extension 2 (Local Alignment): Scott and Brian
- Extension 3 (Affine Gap Penalty): Charlyn and David
- Extension 4 (Database): Daniel and Ashley
- Extension 5 (Space Efficient Algorithm): Kendra and Qing
Workshop
Meet with your group and develop the overall structure of your program
- High-level algorithm
- Identify the modules, functions (including parameters), and global variables
- Determine who is responsible for each module
- Devise a development timeline and a testing strategy