Simpson’s Hidden Talents · Simpson’s Hidden Talents. Problem AGiven two strings and find the...

Post on 27-Sep-2020

5 views 0 download

Transcript of Simpson’s Hidden Talents · Simpson’s Hidden Talents. Problem AGiven two strings and find the...

Simpson’s Hidden Talents

Problem

Given two strings and find the longest prefix of that is a suffix of

S1 S2S1S2

Problem

Given two strings and find the longest prefix of that is a suffix of

S1 S2S1S2

ababcababdaba

babacbadababc

Example:S1 :S2 :

Problem

Given two strings and find the longest prefix of that is a suffix of

S1 S2S1S2

ababcababdaba

babacbadababc

Example:S1 :S2 :

First Idea

a b a b c a b a b d a b a

b a b a c b a d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

b a b a c b a d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

a b a c b a d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

a b a c b a d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

a b a c b a d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

a b a c b a d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

a b a c b a d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

b a c b a d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

b a c b a d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

a c b a d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

a c b a d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

a c b a d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

c b a d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

c b a d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

b a d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

b a d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

a d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

a d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

a d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

d a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

a b a b c

S1 :

S2 :

First Idea

a b a b c a b a b d a b a

a b a b c

S1 :

S2 :

What if …

a a a a a a a a a a a a a

a a a a a a a a a a a a z

S1 :

S2 :

What if …

a a a a a a a a a a a a a

a a a a a a a a a a a a z

S1 :

S2 :

How many comparisons?

n + (n −1) + ...+ 2 +1= Θ(n2)

Recall the Z algorithm*

Simplest linear string matching algorithmGiven a string , denote by the length of the longest string starting at position of S, that is a prefix of S

Note: just one stringCompute iteratively

*Algorithms on strings, trees and sequences, D.Gusfield

Z[i]S[1..n]

Z[i]

i

Definitions

S1 ni

Z[i]+ i −1

Z[i]Z box at position i

DefinitionsFix a position k

Sk r[k]l[k]

r[k] Rightmost end of a Z box that startsat a position 1< j ≤ k

l[k] The left end of the Z box defined byr[k]

The Z Algorithm

Compute iteratively

Only last value of will be needed. Refer to them as

Three cases to consider

Z[i], l[i],r[i]

l[i],r[i]

l, r

The Z AlgorithmIteration kCase 1: k > r

Skr

Compute by explicitelycomparing the characters atposition and on with thecharacters at the beginningof the string

Z[k]

k

The Z AlgorithmIteration k,k ≤ rCase 2a:

αβγ γ βγ

kl rk'

Z[k'] <| β |

S α

Z[k']

The Z AlgorithmIteration k,k ≤ rCase 2b:

αβ β

kl rk'

Z[k'] ≥| β |

βS

The Z algorithm

Why does it run in linear time?

The Contest Problem

What if we have two strings?

The solution is a simple extension of the Z algorithm

Call the longest string which starts at position of and which is also a prefix of

Y[i]i S2

S1

The Contest Problem

Definition of a Y box, and are analogous, this time with respect to

l r

S2

Case 1

S2kr

S1

Case 2a

α βγ

kl r

S2

βγ γ

k'

S1

Z[k'] <| β |

Need Z values

for S1

Z[k']

Case 2b

α β

kl rS

β

k'

βS

Z[k'] ≥| β |