Intro to Alignment Algorithms: Global and Local
Algorithmic Functions of Computational Biology
Professor Istrail
Sequence Comparison
Biomolecular sequences
• DNA sequences (strings over the 4-letter alphabet {A, C, G, T})
• RNA sequences (strings over the 4-letter alphabet {A, C, G, U})
• Protein sequences (strings over the 20-letter alphabet of amino acids)
Sequence similarity helps in the discovery of genes, and the prediction of structure and function of proteins.
The Basic Similarity Analysis Algorithm
Global Similarity
• Scoring Schemes
• Edit Graphs
• Alignment = Path in the Edit Graph
• The Principle of Optimality
• The Dynamic Programming Algorithm
• The Traceback
Jupiter’s code: Alignment
Massa
Master

Mas-sa
Master

Mass-a
Master

Massa-
Master
Sequence Alignment
Input: two sequences over the same alphabet
Output: an alignment of the two sequences
Example: GCGCATTTGAGCGA and TGCGTTAGGGTGACCA
A possible alignment:
-GCGCATTTGAGCGA--
TGCG--TTAGGGTGACC
Each column is a match, a mismatch, or an indel.
Consider two sequences
$$X = x_1 x_2 \ldots x_n$$
$$Y = y_1 y_2 \ldots y_m$$
over the alphabet $\{A, C, G, T\}$; each $x_i$ and $y_j$ belongs to this alphabet.
Scoring Schemes
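Unit-score

|   | A | C | G | T | - |
|---|---|---|---|---|---|
| A | 1 | 0 | 0 | 0 | 0 |
| C | 0 | 1 | 0 | 0 | 0 |
| G | 0 | 0 | 1 | 0 | 0 |
| T | 0 | 0 | 0 | 1 | 0 |
| - | 0 | 0 | 0 | 0 | 0 |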
Alignment
ACG
|||
AGG
Score = $\sigma(A,A) + \sigma(C,G) + \sigma(G,G)$ = 1 + 0 + 1 = 2 (unit-score scheme)
A is aligned with A
C is aligned with G
G is aligned with G
Gaps
ACATGGAAT
ACAGGAAAT
SCORE 7

ACATGG-AAT
ACA-GGAAAT
SCORE 8 (optimal alignment)

AAAGGG
GGGAAA
SCORE 0

---AAAGGG
GGGAAA---
SCORE 3 (optimal alignment)
“-” is the gap symbol
$\sigma(x,y)$ = the score for aligning x with y
$\sigma(-,y)$ = the score for aligning - with y
$\sigma(x,-)$ = the score for aligning x with -
Alignment
A-CG-G
ATCGTG
Score
$\sigma(A,A) + \sigma(G,G) + \sigma(C,C) + \sigma(-,T) + \sigma(-,T) + \sigma(G,G)$
THE SUM OF THE SCORES OF THE PAIRWISE ALIGNED SYMBOLS
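The sum over aligned columns can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the names `unit_score` and `alignment_score` are made up here, the scoring function is passed in as the parameter `sigma`, and the unit-score scheme is assumed, with every column that contains a gap scoring 0.

```python
GAP = "-"


def unit_score(a, b):
    """Unit-score scheme (assumed): 1 for two identical letters, 0 otherwise, gaps included."""
    return 1 if a == b and a != GAP else 0


def alignment_score(row_x, row_y, sigma=unit_score):
    """Sum sigma over the columns of an alignment given as two equal-length rows."""
    if len(row_x) != len(row_y):
        raise ValueError("alignment rows must have the same length")
    return sum(sigma(a, b) for a, b in zip(row_x, row_y))


print(alignment_score("ACG", "AGG"))                # 2
print(alignment_score("ACATGG-AAT", "ACA-GGAAAT"))  # 8
print(alignment_score("---AAAGGG", "GGGAAA---"))    # 3
```

Under these assumptions the ungapped alignment of ACG with AGG scores 2, and the two gapped alignments from the Gaps slide score 8 and 3.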
Scoring Scheme
Dayhoff score
...
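PTIPLSRLFDNAMLRAHRLHQ
SAIENQRLFNIAVSRVQHLHL
Partial alignment for Monkey and Trout somatotropin proteins

|   | - | A | R | N | D | C | Q | E | G | H | I | L | K | M | F | P | S | T | W | Y | V |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| - | -8 | -8 | -8 | -8 | -8 | -8 | -8 | -8 | -8 | -8 | -8 | -8 | -8 | -8 | -8 | -8 | -8 | -8 | -8 | -8 | -8 |
| A | -8 | 3 | -3 | 0 | 0 | -3 | -1 | 0 | 1 | -3 | -1 | -3 | -2 | -2 | -4 | 1 | 1 | 1 | -7 | -4 | 0 |

The rows for R, N, D, and the remaining amino acids follow.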
Scoring Functions
Scoring function = a sum of terms, one for each pair of aligned residues and one for each gap
The meaning = log of the relative likelihood that the sequences are related, compared to being unrelated
Identities and conservative substitutions are Positive terms
Non-conservative substitutions are Negative terms
The Edit Graph
Suppose that we want to align AGT with AT
We are going to construct a graph where alignments between the two sequences correspond to paths between the Begin and End nodes of the graph.
This is the Edit Graph
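[Figure: a grid of nodes; the columns are numbered 0 to 3 along the sequence AGT, and the rows are numbered 0 to 2 along the sequence AT.]
AGT has length 3; AT has length 2.
The Edit Graph has (3+1)*(2+1) = 12 nodes.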
[Figure: the grid of nodes, with columns 0 to 3 labeled A, G, T and rows 0 to 2 labeled A, T.]
AGT indexes the columns, and AT indexes the rows of this “table”
[Figure: the grid of nodes for AGT (columns) and AT (rows), drawn with directed edges.]
The Graph is directed. The nodes (i,j) will hold values.
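[Figure: the edit graph for AGT (columns) and AT (rows). Each horizontal edge is labeled with a letter of AGT over a gap (A-, G-, T-), each vertical edge with a gap over a letter of AT (-A, -T), and each diagonal edge with a pair of letters, one from each sequence (AA, GA, TA, AT, GT, TT).]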
Directed edges get as labels pairs of aligned letters.
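As a rough sketch of this construction, the Python below builds the nodes and labeled edges of the edit graph. The function name `edit_graph` and the tuple layout are choices made here for illustration, not the lecture's notation. A node (i, j) is taken to mean "i letters of the column sequence and j letters of the row sequence have been consumed".

```python
def edit_graph(x, y):
    """Return (nodes, edges) of the edit graph of x (columns) and y (rows).

    Each edge is (source, target, label); a label is a pair of aligned symbols.
    """
    n, m = len(x), len(y)
    nodes = [(i, j) for i in range(n + 1) for j in range(m + 1)]
    edges = []
    for i, j in nodes:
        if i < n:
            # Horizontal edge: a letter of x aligned with a gap.
            edges.append(((i, j), (i + 1, j), (x[i], "-")))
        if j < m:
            # Vertical edge: a gap aligned with a letter of y.
            edges.append(((i, j), (i, j + 1), ("-", y[j])))
        if i < n and j < m:
            # Diagonal edge: a letter of x aligned with a letter of y.
            edges.append(((i, j), (i + 1, j + 1), (x[i], y[j])))
    return nodes, edges


nodes, edges = edit_graph("AGT", "AT")
print(len(nodes), len(edges))  # 12 23
```

For AGT and AT this gives the 12 nodes counted earlier and 23 edges: 9 horizontal, 8 vertical, and 6 diagonal.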
Alignment = Path in the Edit Graph
[Figure: the edit graph for AGT (columns) and AT (rows) with one Begin-to-End path highlighted; the path spells the alignment below.]
AGT
A-T
Every path from Begin to End corresponds to an alignment
Every alignment corresponds to a path between Begin and End
The Principle of Optimality
The optimal answer to a problem is expressed in terms of the optimal answers to its sub-problems
Dynamic Programming
Part 1: Compute first the optimal alignment score
Part 2: Construct optimal alignment
We are looking for the optimal alignment = maximal score path in the Edit Graph from the Begin vertex to the End vertex
Given: Two sequences X and Y
Find: An optimal alignment of X with Y
The DP Matrix S(i,j)
[Figure: the grid for AGT (columns) and AT (rows), with the nodes holding S(2,1) and S(1,0) marked.]
The DP Matrix
Matrix S = [S(i,j)]
S(i,j) = the score of the maximum-score path from the Begin vertex to the vertex (i,j)
The optimal path to (i,j) must pass through one of the vertices (i-1,j-1), (i-1,j), (i,j-1).
[Figure: the vertex (i,j) with its three predecessors (i-1,j-1), (i-1,j), and (i,j-1).]
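Optimal path
[Figure: the vertex (i,j) with its predecessors (i-1,j-1), (i-1,j), and (i,j-1); the edge from (i-1,j) is highlighted.]
Optimal path to (i-1,j) + $\sigma(x_i, -)$
$S(i-1,j) + \sigma(x_i, -)$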
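Optimal path
[Figure: the vertex (i,j) with its predecessors (i-1,j-1), (i-1,j), and (i,j-1); the edge from (i-1,j-1) is highlighted.]
Optimal path to (i-1,j-1) + $\sigma(x_i, y_j)$
$S(i-1,j-1) + \sigma(x_i, y_j)$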
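Optimal path
[Figure: the vertex (i,j) with its predecessors (i-1,j-1), (i-1,j), and (i,j-1); the edge from (i,j-1) is highlighted.]
Optimal path to (i,j-1) + $\sigma(-, y_j)$
$S(i,j-1) + \sigma(-, y_j)$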
The Basic ALGORITHM
$$
S(i,j) = \max
\begin{cases}
S(i-1, j-1) + \sigma(x_i, y_j) \\
S(i-1, j) + \sigma(x_i, -) \\
S(i, j-1) + \sigma(-, y_j)
\end{cases}
$$
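A minimal Python sketch of this recurrence follows. Two things are assumptions rather than part of the slide: the scoring function is the unit-score scheme with gap columns scoring 0, and the border cells S(i,0) and S(0,j) are filled by accumulating gap scores, since a border node of the edit graph has only one predecessor. The names `unit_score` and `global_matrix` are illustrative.

```python
def unit_score(a, b):
    """Unit-score scheme (assumed): 1 for identical letters, 0 otherwise, gaps included."""
    return 1 if a == b and a != "-" else 0


def global_matrix(x, y, sigma=unit_score):
    """Fill S, where S[i][j] is the best score for aligning x[:i] with y[:j]."""
    n, m = len(x), len(y)
    S = [[0] * (m + 1) for _ in range(n + 1)]
    # Border cells: only one predecessor, so only gap columns are possible.
    for i in range(1, n + 1):
        S[i][0] = S[i - 1][0] + sigma(x[i - 1], "-")
    for j in range(1, m + 1):
        S[0][j] = S[0][j - 1] + sigma("-", y[j - 1])
    # Interior cells: the three-way maximum of the recurrence.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            S[i][j] = max(
                S[i - 1][j - 1] + sigma(x[i - 1], y[j - 1]),
                S[i - 1][j] + sigma(x[i - 1], "-"),
                S[i][j - 1] + sigma("-", y[j - 1]),
            )
    return S


S = global_matrix("AGT", "AT")
print(S[3][2])  # 2
```

Python strings are 0-indexed, so the letter written x_i in the recurrence is `x[i - 1]` in the code. The running time and memory are both proportional to (n+1)*(m+1), the number of nodes of the edit graph.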
Optimal Alignment and Traceback
[Figure: the edit graph for AGT (columns) and AT (rows) with the value S(i,j) written at each node and the traceback path highlighted.]

|   |   | A | G | T |
|---|---|---|---|---|
|   | 0 | 0 | 0 | 0 |
| A | 0 | 1 | 1 | 1 |
| T | 0 | 1 | 1 | 2 |

Optimal Alignment
AGT
A-T
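The traceback can be sketched as follows: fill the matrix, then walk back from the End node, at each step choosing a predecessor whose value plus the edge score reproduces the current cell. This is an illustrative sketch under assumptions, not the lecture's code: the scoring is the unit-score scheme with gap columns scoring 0, ties are broken in the order diagonal, then gap in the second sequence, then gap in the first, and scores are assumed to be integers so that the equality tests are exact.

```python
def unit_score(a, b):
    """Unit-score scheme (assumed): 1 for identical letters, 0 otherwise, gaps included."""
    return 1 if a == b and a != "-" else 0


def global_align(x, y, sigma=unit_score):
    """Return (score, row_x, row_y) for one optimal global alignment of x and y."""
    n, m = len(x), len(y)
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][0] = S[i - 1][0] + sigma(x[i - 1], "-")
    for j in range(1, m + 1):
        S[0][j] = S[0][j - 1] + sigma("-", y[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            S[i][j] = max(
                S[i - 1][j - 1] + sigma(x[i - 1], y[j - 1]),
                S[i - 1][j] + sigma(x[i - 1], "-"),
                S[i][j - 1] + sigma("-", y[j - 1]),
            )

    # Traceback from End (n, m) to Begin (0, 0), collecting columns in reverse.
    i, j = n, m
    row_x, row_y = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and S[i][j] == S[i - 1][j - 1] + sigma(x[i - 1], y[j - 1]):
            row_x.append(x[i - 1])
            row_y.append(y[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and S[i][j] == S[i - 1][j] + sigma(x[i - 1], "-"):
            row_x.append(x[i - 1])
            row_y.append("-")
            i -= 1
        else:
            row_x.append("-")
            row_y.append(y[j - 1])
            j -= 1
    return S[n][m], "".join(reversed(row_x)), "".join(reversed(row_y))


print(global_align("AGT", "AT"))  # (2, 'AGT', 'A-T')
```

Several optimal alignments can share the same score; a different tie-breaking order would return a different one.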
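The Basic ALGORITHM: Local Similarity
$$
S(i,j) = \max
\begin{cases}
0 \\
S(i-1, j-1) + \sigma(x_i, y_j) \\
S(i-1, j) + \sigma(x_i, -) \\
S(i, j-1) + \sigma(-, y_j)
\end{cases}
$$
The 0 term is what we add to the global recurrence.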
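A hedged sketch of the local variant in Python: the only changes from the global fill are the extra 0 in the maximum, border cells that stay at 0, and reporting the largest cell anywhere in the matrix rather than the End cell. The scoring scheme used here (match +1, mismatch -1, gap -1) is a hypothetical choice for the example, since local similarity is only interesting when some scores are negative; it does not come from the slides.

```python
def simple_score(a, b):
    """Hypothetical scheme for this example: +1 match, -1 mismatch, -1 for a gap column."""
    if a == "-" or b == "-":
        return -1
    return 1 if a == b else -1


def local_score(x, y, sigma=simple_score):
    """Best score of a local alignment between a substring of x and a substring of y."""
    n, m = len(x), len(y)
    S = [[0] * (m + 1) for _ in range(n + 1)]  # border cells stay 0
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            S[i][j] = max(
                0,  # the added term: start a fresh alignment here
                S[i - 1][j - 1] + sigma(x[i - 1], y[j - 1]),
                S[i - 1][j] + sigma(x[i - 1], "-"),
                S[i][j - 1] + sigma("-", y[j - 1]),
            )
            best = max(best, S[i][j])
    return best


print(local_score("AAAGGG", "GGGAAA"))  # 3
```

With this scheme the best local alignment of AAAGGG and GGGAAA pairs a shared run of three letters (AAA with AAA, or GGG with GGG).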
General Scoring Schemes
Assumptions
1. Independence of mutations at different sites
Additive scoring scheme
2. Gaps of any length are considered one mutation
All of the efficient alignment algorithms that employ the dynamic programming method are based fundamentally on the fact that the scoring function is additive.
Substitution Matrices
$$X = x_1 x_2 \ldots x_n$$
$$Y = y_1 y_2 \ldots y_m$$
$x_i, y_j$ belong to the alphabet
Consider ungapped alignment of equal length sequences
Compute the probability that the two sequences are related
Compute the probability that the two sequences are not related
Compute the ratio of the two probabilities
Random Model R
Every letter $z$ occurs independently with probability $q_z$
$$P(x, y \mid R) = \prod_i q_{x_i} \prod_j q_{y_j}$$
Algorithmic Functions of Computational Biology - Course 3
Professor Istrail
Match Model M
Aligned pairs of residues $a, b$ occur with joint probability $p_{ab}$
$$P(x, y \mid M) = \prod_i p_{x_i y_i}$$
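Log-odds ratio
$$\frac{P(x, y \mid M)}{P(x, y \mid R)} = \frac{\prod_i p_{x_i y_i}}{\prod_i q_{x_i} \prod_j q_{y_j}} = \prod_i \frac{p_{x_i y_i}}{q_{x_i} q_{y_i}}$$
$$\log \frac{P(x, y \mid M)}{P(x, y \mid R)} = \sum_i s(x_i, y_i)$$
where
$$s(a, b) = \log\left(\frac{p_{ab}}{q_a q_b}\right)$$
$s(a,b)$ = the substitution matrix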
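To make the log-odds construction concrete, here is a small Python sketch. All numbers are made up for illustration: a hypothetical two-letter alphabet {A, G} with background frequencies `q` and joint pair probabilities `p`. The function names are not from the lecture, and the natural logarithm is assumed because the slides do not fix a base.

```python
import math


def log_odds_matrix(p, q):
    """s(a, b) = log(p_ab / (q_a * q_b)) for every pair (a, b) listed in p."""
    return {(a, b): math.log(p_ab / (q[a] * q[b])) for (a, b), p_ab in p.items()}


def log_odds_score(x, y, s):
    """Sum s over the columns of an ungapped alignment of equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("ungapped alignment needs sequences of equal length")
    return sum(s[(a, b)] for a, b in zip(x, y))


# Hypothetical probabilities: q sums to 1, and so do the four entries of p.
q = {"A": 0.5, "G": 0.5}
p = {("A", "A"): 0.4, ("G", "G"): 0.4, ("A", "G"): 0.1, ("G", "A"): 0.1}

s = log_odds_matrix(p, q)
print(s[("A", "A")] > 0)  # True: identical letters are more likely under M than under R
print(s[("A", "G")] < 0)  # True: this substitution is less likely under M than under R
print(log_odds_score("AG", "AG", s))
```

The sign pattern agrees with the earlier Scoring Functions slide: pairs that are more likely under the match model than under the random model get positive terms, and the others get negative terms.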