< An-Hsiang Ch’an > Decoding An-Hsiang Lecturer: Master Geng-Yun Photographer: Richard Chen
Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding...
-
Upload
jodie-nash -
Category
Documents
-
view
217 -
download
1
Transcript of Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding...
![Page 1: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/1.jpg)
NTU - DISP Lab 1
Biological Sequence Comparison and Alignment
Speaker: Yu-Hsiang Wang
Advisor: Prof. Jian-Jung Ding
Digital Image and Signal Processing LabGraduate Institute of Communication Engineering
National Taiwan University
![Page 2: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/2.jpg)
NTU - DISP Lab 2
OutlineIntroduction of biological sequence
alignmentSequence alignment algorithm
◦ Dynamic programming◦ FASTA & BLAST◦ UDCR◦ Less-redundant fast algorithm◦ Algorithm for approximate string matching and
improvement
ConclusionReference
![Page 3: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/3.jpg)
NTU - DISP Lab 3
Introduction of biological sequence alignmentTwo strings: S1: “caccba” and S2:
“cabbab”Alignment:
where
![Page 4: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/4.jpg)
NTU - DISP Lab 4
Introduction of biological sequence alignmentThe edit distance between two stringsScoring matrix of alphabet
or
Similarity
![Page 5: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/5.jpg)
NTU - DISP Lab 5
Dynamic programmingThe fundamental sequence alignment
algorithmThe recurrence relationTabular computation The traceback
![Page 6: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/6.jpg)
NTU - DISP Lab 6
Dynamic programmingThe recurrence relation:
◦Define D(i, j) to be the edit distance of S1[1..i] and S2[1..j]
◦Base condition: D(i, 0)=i (the first column) D(0, j)=j (the first row)
◦Recurrence relation: D(i, j) = min[D(i-1, j) + d, D(i, j-1) + d, D(i-1, j-1) +
t(i, j)] where t(i, j)=e if S1(i)=S2(j); otherwise t(i,
j)=r (Assume d=1, r=1, and e=0)
![Page 7: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/7.jpg)
NTU - DISP Lab 7
Dynamic programmingTabular computation
◦Initial table
![Page 8: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/8.jpg)
NTU - DISP Lab 8
Dynamic programmingTabular computation
◦Finished table
![Page 9: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/9.jpg)
NTU - DISP Lab 9
Dynamic programmingThe traceback
◦If D(i, j)=D(i, j-1)+1, set a pointer from (i, j) to (i, j-1), denote as “←”
◦If D(i, j)=D(i-1, j)+1, set a pointer from (i, j) to (i-1, j), denote as “↑”
◦If D(i, j)=D(i-1, j-1)+t(i, j), set a pointer from (i, j) to (i-1, j-1) as “ ”↖ where t(i, j)=0 if S1(i)=S2(j); otherwise t(i, j)=1
![Page 10: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/10.jpg)
NTU - DISP Lab 10
Dynamic programmingThe traceback
![Page 11: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/11.jpg)
NTU - DISP Lab 11
Dynamic programmingAlignment(1)
S1:wri-t-ers
S2:-vintner-
![Page 12: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/12.jpg)
NTU - DISP Lab 12
Dynamic programmingAlignment(2)
S1:wri-t-ers
S2:v-intner-
![Page 13: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/13.jpg)
NTU - DISP Lab 13
Dynamic programmingAlignment(3)
S1:wri-t-ers
S2:vintner-
![Page 14: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/14.jpg)
NTU - DISP Lab 14
FASTAThe main idea: similar sequences
probably share some short matches (word)
Only search for the consecutive identities of length k (k-tuple word)
![Page 15: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/15.jpg)
NTU - DISP Lab 15
FASTAStep1:Select k to establish the lookup table
“ATAGTCAATCCG” and “TGAGCAATCAAG”
![Page 16: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/16.jpg)
NTU - DISP Lab 16
FASTAStep2: labels each k-tuple word as an “x”
(the word hits sharing same offset are on a same diagonal)
![Page 17: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/17.jpg)
NTU - DISP Lab 17
FASTAStep3: Choose 10 best diagonal regions
![Page 18: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/18.jpg)
NTU - DISP Lab 18
FASTAStep4: If a region’s score is higher than
the threshold, it can remain
![Page 19: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/19.jpg)
NTU - DISP Lab 19
FASTAStep5: Combine these remained regions
into a longer high-scoring alignment (allow some spaces)
![Page 20: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/20.jpg)
NTU - DISP Lab 20
BLASTSimilar to FASTABLAST only care about the high-scoring
wordsEstablish a list to store those words
![Page 21: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/21.jpg)
NTU - DISP Lab 21
BLASTFor example (3-mers):
PEG
PQA
15
12
![Page 22: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/22.jpg)
NTU - DISP Lab 22
BLASTFor example, set D-score as 0
![Page 23: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/23.jpg)
NTU - DISP Lab 23
UDCRUse unitary mapping to represent the four
types of nucleotide◦bx[τ] = 1 if x[τ] = ‘A’,
◦bx[τ] = -1 if x[τ] = ‘T’,
◦bx[τ] = j if x[τ] = ‘G’,
◦bx[τ] = -j if x[τ] = ‘C’
![Page 24: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/24.jpg)
NTU - DISP Lab 24
UDCRCalculate discrete correlations:
◦For example x = ‘GTAGCTGAACTGAAC’; y = ‘AACTGAA’, bx = [j, 1, 1, j, j, 1, j, 1, 1, j, 1, j, 1, 1, j], by = [1, 1, j, 1, j, 1, 1].
z1= [j,-1+j, 1,1+j, -j,-1-j,-3+j2, j3,6+j,1-j4,-4-j3,
-4+j3,2+j5, 7,2-j5,-3-j3,-3+j2, 1+j3, 3, 1-j, -j], z2= [1, 0, 3, 2, 1, 0, 1, 1, 5, 5, 1, 1, 3, 7, 3, 0, 1,
2, 3, 0, 1]
![Page 25: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/25.jpg)
NTU - DISP Lab 25
UDCRSimilarity
◦since s[2]=6, we can know that the sequence {x[2],x[3],…,x[7]} is similar to y: x =‘GTAGCTGAACTGAAC’,
y = ‘AACTGAA’.
![Page 26: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/26.jpg)
NTU - DISP Lab 26
Less-redundant fast algorithmThe movement is generalized and the ↘
movement → is removedNew method to determine edit distance:
◦D(i, j)=min [D(i-1, j0) + j - j0 – 1 + s(i, j, j0)] where s(i, j, j) = 2,
s(i, j, j0) = 1, if j0 j-1 and x(τ) ≠ y(j) for ≦ all τ in the range of
j0+1 τ j≦ ≦
s(i, j, j0) = 0, if j0 j-1 and x(τ) = y(j) for ≦ a τ in the range of
j0+1 τ j≦ ≦
![Page 27: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/27.jpg)
NTU - DISP Lab 27
Less-redundant fast algorithmSlope rule
◦If D(i, ja) and D(i, jb) on the same row satisfy
D(i, jb) - D(i, ja) ≧ | jb - ja |
then D(i, jb) can be ignored.
![Page 28: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/28.jpg)
NTU - DISP Lab 28
Less-redundant fast algorithmDifferent entry rule(1)
◦If (a) x(i) ≠ y(j) (b) D(i-1, j) is inactive
(c) D(i-1. j-1) is active,
then D(i, j) can be calculated by
“D(i, j) = D(i-1, j-1) + 1”inactive
active
inactive
![Page 29: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/29.jpg)
NTU - DISP Lab 29
Less-redundant fast algorithmDifferent entry rule(2)
◦If (a) x(i) ≠ y(j) (b) D(i-1, j) is active
(c) x(i) ≠ y(j+1),
then D(i, j) can be calculated by
“D(i, j) = D(i-1, j) + 1”
![Page 30: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/30.jpg)
NTU - DISP Lab 30
Less-redundant fast algorithmDifferent entry rule(3)
◦If (a) x(i) ≠ y(j) (b) D(i-1, j) is active
(c) x(i) = y(j+1),
then “D(i, j) can be ignored”
![Page 31: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/31.jpg)
NTU - DISP Lab 31
Less-redundant fast algorithmSame entry rule(1)
◦If (a) x(i) = y(j) (b) D(i-1, j) is inactive
(c) one of D(i-1, j1), D(i-1, j1+1), …,D(i-1, j-1) is active,
then D(i, j) can be calculated by
“D(i, j) = D(i-1, j2) + j - j2 – 1”
![Page 32: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/32.jpg)
NTU - DISP Lab 32
Less-redundant fast algorithmSame entry rule(2)
◦If (a) x(i) = y(j) (b) D(i-1, j) is active
(c) none of D(i-1, j1),D(i-1, j1+1),…,D(i-1, j-1) is active,
then D(i, j) can be calculated by
“D(i, j) = D(i-1, j2) + 1” when x(i+1) ≠ y(j),
“D(i, j) can be ignored” when x(i+1) = y(j).
![Page 33: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/33.jpg)
NTU - DISP Lab 33
Less-redundant fast algorithmSame entry rule(3)
◦If (a) x(i) = y(j) (b) D(i-1, j) is active
(c) one of D(i-1, j1),D(i-1, j1+1),…,D(i-1, j-1) is active,
then D(i, j) can be calculated by
“D(i, j) = D(i-1, j2) + 1” when x(i+1) ≠ y(j),
“D(i, j) can be ignored” when x(i+1) = y(j).
![Page 34: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/34.jpg)
NTU - DISP Lab 34
Less-redundant fast algorithmLimitation rule
◦First, Roughly estimate the upper bound of the edit distance H1 by
◦ where k(x(i), y(j))=0 if x(i) = y(j) and k(x(i), y(j))=1 if x(i) ≠ y(j).
![Page 35: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/35.jpg)
NTU - DISP Lab 35
Less-redundant fast algorithmLimitation rule
◦The diagram of H1. (a) τ is 1. (b) No movement. (c) τ is -1.
![Page 36: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/36.jpg)
NTU - DISP Lab 36
Less-redundant fast algorithmLimitation rule
◦Set the upper bound of the edit distance as
where Max(N-i, M-j) is the maximal possible edit distances between x[i…N] and y[j…M].
![Page 37: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/37.jpg)
NTU - DISP Lab 37
Less-redundant fast algorithmLimitation rule
◦If
then D(i, j) can be ignored.
![Page 38: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/38.jpg)
NTU - DISP Lab 38
Algorithm for approximate string matchingFocus on diagonal.Define fkp = the largest index i
◦k denote the diagonal’s number◦p denote the edit distance’s value
![Page 39: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/39.jpg)
NTU - DISP Lab 39
Algorithm for approximate string matchingf1,-1=-∞, f1,0=-1, f1,1=2, f1,2=3, f1,3=4.
Computation range: -p ~ N-M+p
![Page 40: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/40.jpg)
NTU - DISP Lab 40
Improved algorithm for approximate string matchingUse main diagonal N-M to separate table.Linking ListNew scoring scheme
![Page 41: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/41.jpg)
NTU - DISP Lab 41
Improved algorithm for approximate string matchingExample of new scoring scheme
![Page 42: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/42.jpg)
NTU - DISP Lab 42
Improved algorithm for approximate string matchingScore 0 iteration
![Page 43: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/43.jpg)
NTU - DISP Lab 43
Improved algorithm for approximate string matchingScore 1 iteration
![Page 44: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/44.jpg)
NTU - DISP Lab 44
Improved algorithm for approximate string matchingScore 2 iteration
![Page 45: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/45.jpg)
NTU - DISP Lab 45
Improved algorithm for approximate string matchingScore 3 iteration
![Page 46: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/46.jpg)
NTU - DISP Lab 46
Improved algorithm for approximate string matchingScore 4 iteration
Edit distance: iteration+(N-M) = 4+(10-7) = 7
![Page 47: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/47.jpg)
NTU - DISP Lab 47
Improved algorithm for approximate string matchingExperiments (complexity depend on s-|M-N|)
Strings on 4-letters alphabet Strings on 20-letters alphabet
![Page 48: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/48.jpg)
NTU - DISP Lab 48
ConclusionTraditional Dynamic programming costs
too much computational time and memory.
Less-redundant fast algorithm can save lots of time and space.
Improved algorithm for approximate string matching can also save lots of time and space. (Although it cannot traceback to get the alignment)
![Page 49: Biological Sequence Comparison and Alignment Speaker: Yu-Hsiang Wang Advisor: Prof. Jian-Jung Ding Digital Image and Signal Processing Lab Graduate Institute.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f275503460f94c3f7e0/html5/thumbnails/49.jpg)
NTU - DISP Lab 49
Reference [1] D. Gusfield, Algorithms on Strings, Trees, and
Sequence, Cambridge University Press, 1997. [2] S. C. Pei, J. J. Ding “Sequence Comparison and
Alignment by Discrete Correlations, Unitary Mapping, and Number Theoretic Transforms”
[3] K.H. Hsu, “Faster Algorithms for Computing Edit Distance and Alignment between DNA Strings”, 2009.
[4] J. J. Ding, K. H. Hsu, “A Less-Redundant Fast Algorithm for Computing Edit Distances Exactly”
[5] E. UKKONEN, “Algorithms for Approximate String Matching,” Information and Control 64, 100-118, 1985
[6] D. Papamichail and G. Papamichail, “Improved Algorithms for Approximate String Matching,” BMC Bioinformatics, Vol.10(Suppl.1):S10, January, 2009