Knuth-Morris-Pratt Algorithmpeople.cs.pitt.edu/~aus/cs1501/KMP_algorithm.pdf · 2016-10-18 ·...

16
Knuth-Morris-Pratt Algorithm for Pattern Matching

Transcript of Knuth-Morris-Pratt Algorithmpeople.cs.pitt.edu/~aus/cs1501/KMP_algorithm.pdf · 2016-10-18 ·...

Page 1: Knuth-Morris-Pratt Algorithmpeople.cs.pitt.edu/~aus/cs1501/KMP_algorithm.pdf · 2016-10-18 · Knuth-Morris-Pratt Algorithm Takes advantage of the information about already matched

Knuth-Morris-Pratt Algorithmfor Pattern Matching

Page 2: Knuth-Morris-Pratt Algorithmpeople.cs.pitt.edu/~aus/cs1501/KMP_algorithm.pdf · 2016-10-18 · Knuth-Morris-Pratt Algorithm Takes advantage of the information about already matched

Pattern Matching

The act of checking a given sequence of tokens (text) for the presence of the constituents of some pattern.

Page 3: Knuth-Morris-Pratt Algorithmpeople.cs.pitt.edu/~aus/cs1501/KMP_algorithm.pdf · 2016-10-18 · Knuth-Morris-Pratt Algorithm Takes advantage of the information about already matched

Brute Force Approach● Pattern of length M● Text of length N● Complexity?

O(N*M)

● Worst case?

Text = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXYPattern = XXXXY

Page 4: Knuth-Morris-Pratt Algorithmpeople.cs.pitt.edu/~aus/cs1501/KMP_algorithm.pdf · 2016-10-18 · Knuth-Morris-Pratt Algorithm Takes advantage of the information about already matched

Knuth-Morris-Pratt Algorithm● Takes advantage of the information about already matched characters to

reduce the number of comparisons.● Avoids backing up in the text (only moves forward).

Page 5: Knuth-Morris-Pratt Algorithmpeople.cs.pitt.edu/~aus/cs1501/KMP_algorithm.pdf · 2016-10-18 · Knuth-Morris-Pratt Algorithm Takes advantage of the information about already matched

KMP

To keep track of available shifts during each mismatched character we build a DFA (deterministic finite-state automata).

DFA is constructed just from the pattern and before the execution.

Page 6: Knuth-Morris-Pratt Algorithmpeople.cs.pitt.edu/~aus/cs1501/KMP_algorithm.pdf · 2016-10-18 · Knuth-Morris-Pratt Algorithm Takes advantage of the information about already matched

KMP DFA

Page 7: Knuth-Morris-Pratt Algorithmpeople.cs.pitt.edu/~aus/cs1501/KMP_algorithm.pdf · 2016-10-18 · Knuth-Morris-Pratt Algorithm Takes advantage of the information about already matched

DFA Construction

Page 8: Knuth-Morris-Pratt Algorithmpeople.cs.pitt.edu/~aus/cs1501/KMP_algorithm.pdf · 2016-10-18 · Knuth-Morris-Pratt Algorithm Takes advantage of the information about already matched

DFA Construction

Page 9: Knuth-Morris-Pratt Algorithmpeople.cs.pitt.edu/~aus/cs1501/KMP_algorithm.pdf · 2016-10-18 · Knuth-Morris-Pratt Algorithm Takes advantage of the information about already matched

DFA Construction

Page 10: Knuth-Morris-Pratt Algorithmpeople.cs.pitt.edu/~aus/cs1501/KMP_algorithm.pdf · 2016-10-18 · Knuth-Morris-Pratt Algorithm Takes advantage of the information about already matched

DFA Construction

Page 11: Knuth-Morris-Pratt Algorithmpeople.cs.pitt.edu/~aus/cs1501/KMP_algorithm.pdf · 2016-10-18 · Knuth-Morris-Pratt Algorithm Takes advantage of the information about already matched

DFA Construction

Page 12: Knuth-Morris-Pratt Algorithmpeople.cs.pitt.edu/~aus/cs1501/KMP_algorithm.pdf · 2016-10-18 · Knuth-Morris-Pratt Algorithm Takes advantage of the information about already matched

DFA Construction

Page 13: Knuth-Morris-Pratt Algorithmpeople.cs.pitt.edu/~aus/cs1501/KMP_algorithm.pdf · 2016-10-18 · Knuth-Morris-Pratt Algorithm Takes advantage of the information about already matched

KMP Code

Page 14: Knuth-Morris-Pratt Algorithmpeople.cs.pitt.edu/~aus/cs1501/KMP_algorithm.pdf · 2016-10-18 · Knuth-Morris-Pratt Algorithm Takes advantage of the information about already matched

KMP

In Class Practice

Build a KMP DFA for the following pattern:

A C A C A G A

Produce both table and graphical representations of the DFA

Page 15: Knuth-Morris-Pratt Algorithmpeople.cs.pitt.edu/~aus/cs1501/KMP_algorithm.pdf · 2016-10-18 · Knuth-Morris-Pratt Algorithm Takes advantage of the information about already matched

KMP

Answer

A C A C A G A

0 1 2 3 4 5 6

A 1 1 3 1 5 1 7

C 0 2 0 4 0 4 0

G 0 0 0 0 0 6 0

Page 16: Knuth-Morris-Pratt Algorithmpeople.cs.pitt.edu/~aus/cs1501/KMP_algorithm.pdf · 2016-10-18 · Knuth-Morris-Pratt Algorithm Takes advantage of the information about already matched

Knuth-Morris-Pratt AlgorithmQuestions?