CSC 212 – Data Structures Lecture 36: Pattern Matching.

Suffixes and Prefixes

“I am the Lizard King!”

Prefixes: "I", "I ", "I a", "I am", …, "I am the Lizard Kin", "I am the Lizard King", "I am the Lizard King!"

Suffixes: "!", "g!", "ng!", "ing!", …, "am the Lizard King!", " am the Lizard King!", "I am the Lizard King!"
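The prefix and suffix lists above can be generated mechanically. The following is a minimal sketch; the class name `PrefixSuffixSketch` and its two helper methods are hypothetical names chosen for illustration and do not come from the lecture.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
final class PrefixSuffixSketch {
    /** Returns every non-empty prefix of s, shortest first. */
    static List<String> prefixes(String s) {
        List<String> out = new ArrayList<>();
        for (int len = 1; len <= s.length(); len++) {
            out.add(s.substring(0, len));            // first len characters
        }
        return out;
    }

    /** Returns every non-empty suffix of s, shortest first. */
    static List<String> suffixes(String s) {
        List<String> out = new ArrayList<>();
        for (int len = 1; len <= s.length(); len++) {
            out.add(s.substring(s.length() - len));  // last len characters
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "I am the Lizard King!";
        System.out.println(prefixes(s).subList(0, 3)); // prints [I, I , I a]
        System.out.println(suffixes(s).subList(0, 4)); // prints [!, g!, ng!, ing!]
    }
}
```

A proper prefix (or suffix) is any entry in these lists other than the whole string itself, which is the last entry of each list.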

KMP Algorithm

Asymptotically optimal algorithm
    Means cannot do better in big-Oh terms

Compares from left-to-right
    So like BruteForce, not Boyer-Moore
    But shifts pattern intelligently

Relies on a Key Insight™
    Preprocess pattern to avoid redundant comparisons
    Always go forward; never, ever look back

The KMP Algorithm

T:  . . a b a a b . . . . .
P:      a b a a b a

[Figure callouts: "Do not repeat these comparisons"; "Need to resume comparing here"; "Shifting P here ensures these two entries match"]

KMP Failure Function

Assume P[j] ≠ T[k]. Need the rank in P to compare next against T[k]
    E.g., how should we shift P after a miss?

Uses the failure function, F(j-1)
    One value defined for each rank in P
    Specifies the rank in P at which comparisons must restart

Computing Failure Function

For rank j, find the longest proper prefix of P[0...j] that is also a suffix of P[0...j]
    For speed, store the failure function in an array
    Unlike Boyer-Moore, works w/infinite alphabets

Takes at most O(2m) = O(m) time

A similar algorithm computes both the failure function & the KMP match
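To make the definition concrete, the sketch below computes the failure function directly from its definition by trying every candidate length. This is not the O(m) method from the lecture; it is a deliberately naive cross-check, and the class and method names are hypothetical.

```java
// Hypothetical helper class: computes F straight from the definition.
// Much slower than O(m); intended only for checking small examples by hand.
final class NaiveFailureSketch {
    /** f[j] = length of the longest proper prefix of P[0..j] that is also a suffix of P[0..j]. */
    static int[] failureByDefinition(String p) {
        int[] f = new int[p.length()];               // entries default to 0
        for (int j = 0; j < p.length(); j++) {
            String head = p.substring(0, j + 1);     // P[0..j]
            // Try candidate lengths from the longest proper one (j) down to 1.
            for (int len = j; len >= 1; len--) {
                String suffix = head.substring(head.length() - len);
                if (head.startsWith(suffix)) {       // prefix of length len equals suffix
                    f[j] = len;
                    break;
                }
            }
        }
        return f;
    }
}
```

For P = "abaaba" this yields 0, 0, 1, 1, 2, 3: for example, at rank 4 the string "abaab" has "ab" as both a proper prefix and a suffix, so the value is 2.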

Computing Failure Function

Algorithm KMPFailureFunction(String P)
    F[0] ← 0
    i ← 1
    j ← 0
    while i < P.length()
        if P[i] = P[j]        // So, P[0…j] = P[i - j…i]
            F[i] ← j + 1      // Record the length of this prefix/suffix
            i ← i + 1         // Advance a character and see if still matches
            j ← j + 1
        else if j > 0         // No match, need to restart our computation
            j ← F[j - 1]      // Skip over longest prefix that is also a suffix
        else
            F[i] ← 0          // No prefix of P[0…i] is a suffix of P[0…i]
            i ← i + 1         // Move to the next character
    return F
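The pseudocode above translates almost line for line into Java. The following is a sketch of one possible translation, assuming the pattern is a `String` indexed with `charAt`; the class name `KmpFailureSketch` is hypothetical.

```java
// Hypothetical class name; the method mirrors the KMPFailureFunction pseudocode.
final class KmpFailureSketch {
    static int[] failureFunction(String p) {
        int[] f = new int[p.length()];     // f[0] is already 0 by default
        int i = 1;                         // rank whose failure value is being computed
        int j = 0;                         // length of the prefix matched so far
        while (i < p.length()) {
            if (p.charAt(i) == p.charAt(j)) {
                f[i] = j + 1;              // prefix of length j + 1 is also a suffix of P[0..i]
                i++;
                j++;
            } else if (j > 0) {
                j = f[j - 1];              // fall back to the next shorter prefix/suffix
            } else {
                f[i] = 0;                  // no prefix of P[0..i] is also a suffix
                i++;
            }
        }
        return f;
    }
}
```

Calling `KmpFailureSketch.failureFunction("abaaba")` returns the array 0, 0, 1, 1, 2, 3. An empty pattern yields an empty array, since the loop body never runs.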

KMP Failure Function

j      0  1  2  3  4  5
P[j]   a  b  a  a  b  a
F(j)   0  0  1  1  2  3

The KMP Algorithm

Algorithm KMPMatch(String T, String P)
    F ← KMPFailureFunction(P)
    i ← 0
    j ← 0
    while i < T.length()
        if P[j] = T[i]        // So, P[0…j] = T[i - j…i]
            if j = P.length() - 1
                return i - j  // Matched all of P; return rank in T where match starts
            i ← i + 1         // Advance and see if still a match
            j ← j + 1
        else if j > 0         // No match, but a prefix of P[0…j-1] matches
            j ← F[j - 1]      // So skip past longest prefix that is a suffix
        else
            i ← i + 1         // Nothing to reuse, move to the next character
    return -1                 // P does not occur in T
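A Java rendering of KMPMatch might look like the sketch below. Two choices here are assumptions of this sketch rather than statements from the slides: it returns -1 when P does not occur in T, and it treats an empty pattern as matching at rank 0. The class name `KmpMatchSketch` is hypothetical.

```java
// Hypothetical class name; methods mirror the lecture's two algorithms.
final class KmpMatchSketch {
    static int[] failureFunction(String p) {
        int[] f = new int[p.length()];
        int i = 1, j = 0;
        while (i < p.length()) {
            if (p.charAt(i) == p.charAt(j)) {
                f[i] = j + 1;
                i++;
                j++;
            } else if (j > 0) {
                j = f[j - 1];
            } else {
                i++;                       // f[i] stays 0
            }
        }
        return f;
    }

    /** Returns the rank in t where the first occurrence of p starts, or -1 (assumed convention). */
    static int kmpMatch(String t, String p) {
        if (p.isEmpty()) {
            return 0;                      // assumption: empty pattern matches at rank 0
        }
        int[] f = failureFunction(p);
        int i = 0;                         // rank in the text; never decreases
        int j = 0;                         // rank in the pattern
        while (i < t.length()) {
            if (p.charAt(j) == t.charAt(i)) {
                if (j == p.length() - 1) {
                    return i - j;          // matched all of p
                }
                i++;
                j++;
            } else if (j > 0) {
                j = f[j - 1];              // reuse the prefix that already matches
            } else {
                i++;                       // nothing to reuse
            }
        }
        return -1;                         // no occurrence found
    }
}
```

For instance, `KmpMatchSketch.kmpMatch("abracadabra", "cad")` returns 4, the same rank that `"abracadabra".indexOf("cad")` reports. Note that `i` only ever moves forward, which is the "never look back" property of the algorithm.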

Example

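T:  a b a c a a b a c a b a c a b a a b b
P:  a b a c a b

[Figure: successive alignments of P under T, with the comparisons numbered in the order they are made (the numbers shown run up to 19)]

j      0  1  2  3  4  5
P[j]   a  b  a  c  a  b
F(j)   0  0  1  0  1  2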

The KMP Algorithm

In each pass of KMPMatch, either:
    P[j] = T[i]: i increases by one, or
    P[j] ≠ T[i] & j > 0: P shifted right by at least 1, or
    P[j] ≠ T[i] & j = 0: i increases by 1

So at most 2n iterations of loop
    KMPMatch takes O(2n) = O(n) time
    KMPFailureFunction needs O(m) time
    Thus, algorithm runs in O(m + n) time
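The 2n bound on loop passes can be checked empirically. The sketch below instruments the matching loop with a counter; `countLoopPasses` is a hypothetical helper introduced only for this illustration, and it assumes a non-empty pattern.

```java
// Hypothetical instrumentation class: counts passes through the KMPMatch loop.
final class KmpIterationCountSketch {
    static int[] failureFunction(String p) {
        int[] f = new int[p.length()];
        int i = 1, j = 0;
        while (i < p.length()) {
            if (p.charAt(i) == p.charAt(j)) {
                f[i] = j + 1;
                i++;
                j++;
            } else if (j > 0) {
                j = f[j - 1];
            } else {
                i++;
            }
        }
        return f;
    }

    /** Runs the KMPMatch loop on a non-empty pattern p and returns the number of loop passes. */
    static int countLoopPasses(String t, String p) {
        int[] f = failureFunction(p);
        int i = 0, j = 0, passes = 0;
        while (i < t.length()) {
            passes++;
            if (p.charAt(j) == t.charAt(i)) {
                if (j == p.length() - 1) {
                    return passes;         // match found; stop counting
                }
                i++;                       // case 1: i advances
                j++;
            } else if (j > 0) {
                j = f[j - 1];              // case 2: pattern shifts right, i - j grows
            } else {
                i++;                       // case 3: i advances
            }
        }
        return passes;
    }

    public static void main(String[] args) {
        String t = "aaaaaaaaaa";           // n = 10
        int passes = countLoopPasses(t, "aab");
        System.out.println(passes <= 2 * t.length()); // prints true
    }
}
```

Every pass increases either i or the shift i - j, and neither can exceed n, which is why the count never goes above 2n.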

Your Turn

Get back into groups and do activity

Before Next Lecture…

Finish up assignments
Start thinking about questions for Final
