Linear-time computation of local periods
Gregory Kucherov, INRIA/LORIA
Nancy, France
joint work with Roman Kolpakov (Moscow) and Jean-Pierre Duval, Thierry Lecroq, Arnaud Lefebvre (Rouen)
Haifa Stringology Workshop, April 3-8 2005
2
Periodicities (repetitions) in strings
period of a word w = w[1..n]: an integer p such that w[i] = w[i+p] for all 1 ≤ i ≤ n − p
the (global) period p(w): the minimal period of w
periodicity = word w of period p(w) ≤ |w|/2
Examples: square, cube; fractional periodicity: (cyclic) root, exponent 8/3
periodicities = “runs” of squares
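A minimal Python sketch of these definitions, assuming the standard border-table (failure function) way of listing periods. The function names and the example word "abaabaab" are illustrative choices, not taken from the slides.

```python
from fractions import Fraction

def periods(w: str) -> list:
    """All periods of a non-empty word w, in increasing order."""
    n = len(w)
    # border[k] = length of the longest proper border of w[:k]; border[0] = -1
    border = [-1] + [0] * n
    k = -1
    for i in range(n):
        while k >= 0 and w[k] != w[i]:
            k = border[k]
        k += 1
        border[i + 1] = k
    # every border of length b yields the period n - b
    result = []
    b = border[n]
    while b >= 0:
        result.append(n - b)
        b = border[b]
    return result

def global_period(w: str) -> int:
    return periods(w)[0]

def exponent(w: str) -> Fraction:
    """Length divided by the global period (2 for a square, 3 for a cube)."""
    return Fraction(len(w), global_period(w))

def is_periodicity(w: str) -> bool:
    """True if the global period is at most half the length."""
    return 2 * global_period(w) <= len(w)
```

For instance, "abaabaab" has periods 3, 6 and 8, so its exponent is 8/3 and its root has length 3.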
3
Finding periodicities
CGCGGCAGTTTTGCCGACTGTTTGGGACTTGCTCGAACTTGCCTATGCCAAGCTGCCGACGATTCCGCCCACCCTGTTGGAACGCGATTTTAATTTCCCGCCTTTTTCCGAACTCGAAGCCGAAGTCGCCAAAATCGCCGATTATCAAACGCGTGCCGGAAAGGAATGCCGCCGTGCAGCCTGAAACCTCCGCCCAATACCAGCACCGTTTCGCCCAAGCCATACGCGGGGGCGAAGCCGCAGACGGTCTGCCGCAAGACCGACTGAACGTCTATATCCGCCTGATACGCAACAATATCTACAGCTTTATCGACCGTTGTTATACCGAAACGCTGCAATACTTTGACCGCGAAGAATGGGGCCGTCTGAAAGAAGGTTTCGTCCGCGACGCGTGCGCCCAAACGCCCTATTTTCAAGAAATCCCCGGCGAGTTCCTCCAATATTGCCAAAGCCTGCCGCTTTTAGACGGCATTTTGGCACTGATGGATTTTGAATATACCCAATTGCTGGCAGAAGTTGCTCAAATTCCGGATATTCCCGACATTCATTATTCAAATGACAGCAAATACACACCTTCCCCTGCGGCCTTTATCCGGCAATATCGATATGATGTTACCGATGATTTGCATGAAGCGGAAACAGCCTTGTTAATATGGCGAAACGCCGAAGATGATGTGATGTACCAAACATTGGACGGCTTCGATATGATGCTGCTAGAAATAATGGGGTTCTCCGCGCTTTCGTTTGACACCCTCGCCCAAACCCTTGTCGAATTTATGCCTGAGGACGATAATTGGAAAAATATTTTGCTTGGGAAATGGTCAGGCTGGACTGAACAAAGGATTATCATCCCCTCCTTGTCCGCCATATCCGAAAATATGGAAGACAATTCCCCGGGCC
6
Some work has been done ...
... see R. Kolpakov, G. Kucherov, Periodic structures in words, chapter of the 3rd Lothaire volume Applied Combinatorics on Words, Cambridge University Press, 2005
different results based on common simple techniques: extension functions and s-factorization
7
Rest of this talk
Basics
– extension functions
– computing periodicities in O(n log n) time
– s-factorization (Lempel-Ziv factorization)
– computing periodicities in O(n) time
Computing all local periods in O(n) time
9
Extension function: simplest definition
all values can be computed in O(n) time [Main&Lorentz 84]
a refined algorithm is presented in [Lothaire 05] (inspired by Manacher’s linear-time algorithm for computing palindromes)
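The formula defining the extension function did not survive in this transcript. The sketch below assumes the common reading: for each position i of w, the length of the longest common prefix of w and the suffix of w starting at i. It uses the well-known Z-algorithm, which is one linear-time way to get all values and is not claimed to be the algorithm of [Main&Lorentz 84] or [Lothaire 05]; the function name is hypothetical.

```python
def longest_prefix_extensions(w: str) -> list:
    """z[i] = length of the longest common prefix of w and w[i:]."""
    n = len(w)
    z = [0] * n
    if n == 0:
        return z
    z[0] = n
    left = right = 0  # w[left:right] is the rightmost segment known to match a prefix
    for i in range(1, n):
        if i < right:
            # reuse the value computed for the mirrored position inside the segment
            z[i] = min(right - i, z[i - left])
        # extend the match letter by letter past what is already known
        while i + z[i] < n and w[z[i]] == w[i + z[i]]:
            z[i] += 1
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z
```

Each letter comparison that succeeds moves `right` forward, which is why the total work stays linear in the length of w.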
10
Extension function: variants
11
Using extension functions to compute periodicities
Lemma: There exists a square of period iff
12
Using extension functions to compute periodicities
Example:
a t a c g a a c g a a c g g t a c g a a c g a
c g a a c g a a
g a a c g a a c
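The condition of the lemma is missing from this transcript, so the following is a sketch of the standard criterion under stated assumptions rather than the slide's exact statement. For a word t, a border position b and a period p, let LS be the longest backward extension (common suffix of t[:b] and t[:b+p]) and LP the longest forward extension (common prefix of t[b:] and t[b+p:]). A square of period p starting between b − p and b exists iff LS + LP ≥ p. The function names and the border b = 6 used in the example are assumptions.

```python
def extensions_at_border(t: str, b: int, p: int):
    """Backward (ls) and forward (lp) extensions of period p around border b."""
    n = len(t)
    ls = 0
    while ls < b and b + p - 1 - ls < n and t[b - 1 - ls] == t[b + p - 1 - ls]:
        ls += 1
    lp = 0
    while b + p + lp < n and t[b + lp] == t[b + p + lp]:
        lp += 1
    return ls, lp

def squares_around_border(t: str, b: int, p: int) -> list:
    """Start positions s (0-based, b - p <= s <= b) of squares t[s:s+2p]."""
    ls, lp = extensions_at_border(t, b, p)
    if ls + lp < p:
        return []  # the criterion fails: no such square
    first = max(b - ls, b - p)
    last = min(b + lp - p, b)
    return list(range(first, last + 1))

t = "atacgaacgaacggtacgaacga"
found = [t[s:s + 8] for s in squares_around_border(t, 6, 4)]
```

On the example word with b = 6 and p = 4, `found` contains the two squares listed on the slide, "cgaacgaa" and "gaacgaac", together with their neighbours "acgaacga" and "aacgaacg". The extensions are computed here by direct scanning; the extension-function algorithms cited on the earlier slides would supply all values in linear time.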
14
Using extension functions to compute periodicities
This implies (using binary division) that
– one can compute a compact representation of all squares (maximal periodicities) in O(n log n) time
– one can compute all squares in O(n log n) time [Crochemore 81, Main&Lorentz 84]
– one can test square-freeness in O(n log n) time
15
s-factorization (Lempel-Ziv factorization)
w = f_1 f_2 ... f_m, where:
– if the letter a which immediately follows f_1 ... f_{i-1} does not occur in f_1 ... f_{i-1}, then f_i = a
– otherwise f_i is the longest subword occurring at least twice in f_1 ... f_i
Example:
s-factorization (Lempel-Ziv factorization) can be computed in linear time using suffix tree or DAWG
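A naive Python sketch of this definition, useful for checking small cases by hand. It is far from linear time (the slide's suffix tree or DAWG construction is what achieves that); the function name and the example word are illustrative.

```python
def s_factorization(w: str) -> list:
    """Factors f_1, ..., f_m of the s-factorization of w (naive version)."""
    factors = []
    i = 0
    n = len(w)
    while i < n:
        length = 0
        # grow the factor while w[i:i+length+1] also occurs starting before i
        # (the earlier occurrence may overlap the factor itself)
        while i + length < n and w.find(w[i:i + length + 1], 0, i + length) != -1:
            length += 1
        if length == 0:
            length = 1  # new letter: the factor is that single letter
        factors.append(w[i:i + length])
        i += length
    return factors
```

For example, `s_factorization("aabaabab")` gives the factors a, a, b, aaba, b: the fourth factor "aaba" is the longest prefix of the remaining suffix that already occurs earlier in the word.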
18
Why s-factorization is useful here
lemma of [Main 89]
19
Computing (a compact representation of) all squares in linear time
1. compute the s-factorization of w (in O(n))
2. for each factor f_i
A. compute all maximal periodicities ending inside f_i and crossing the border between f_{i-1} and f_i
B. recover all maximal periodicities occurring inside f_i from a left copy of f_i
Important: the number of maximal periodicities is O(n) while the number of squares can be Ω(n log n)
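To make the distinction between maximal periodicities and squares concrete, here is a brute-force Python sketch. It is not the linear-time algorithm outlined above (it ignores the s-factorization entirely), and the function names and the tuple format (start, end, period) are assumptions made for illustration.

```python
def smallest_period(s: str) -> int:
    return next(p for p in range(1, len(s) + 1)
                if all(s[k] == s[k + p] for k in range(len(s) - p)))

def maximal_periodicities(w: str) -> list:
    """Runs (start, end, p): w[start:end] has minimal period p, length >= 2p,
    and cannot be extended left or right while keeping period p."""
    n = len(w)
    runs = []
    for p in range(1, n // 2 + 1):
        i = 0
        while i + p < n:
            if w[i] != w[i + p]:
                i += 1
                continue
            j = i
            while j + p < n and w[j] == w[j + p]:
                j += 1
            # w[i:j+p] is a maximal segment with period p
            if j - i >= p and smallest_period(w[i:j + p]) == p:
                runs.append((i, j + p, p))
            i = j
    return sorted(runs)

def squares_in_runs(runs: list) -> list:
    """(start, p) for every square whose period is the minimal period of its run."""
    return [(s, p) for (start, end, p) in runs
            for s in range(start, end - 2 * p + 1)]
```

On "mississippi" this reports four runs, (1, 8, 3), (2, 4, 1), (5, 7, 1) and (8, 10, 1), which together encode five squares: the run "ississi" alone accounts for both "ississ" and "ssissi".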
20
Using extension functions + s-factorization to compute periodicities
This implies that
– one can compute a compact representation of all squares (maximal periodicities) in O(n) time [Kolpakov,Kucherov 99]
– one can compute all squares (but also cubes, ...) in O(n + S) time, where S is the number of squares reported
– one can test square-freeness in O(n) time [Crochemore 83, Main&Lorentz 85]
21
Local periods
minimal (local) square at position i = minimal square centered at i
local period at i (denoted LP_w(i)) = root length of the minimal square at i
internal square
right-external square
left- and right-external square
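A brute-force Python sketch of the local period, assuming the usual convention that a square may stick out of the word on either side and only the letters that actually exist are compared. Positions are represented as cuts c between w[c-1] and w[c] (0-based slicing, 1 ≤ c ≤ n − 1); the function names and this indexing are assumptions, not the slides' notation.

```python
def local_period(w: str, c: int) -> int:
    """Root length of the minimal square centered at cut c (w[:c] | w[c:])."""
    n = len(w)
    for p in range(1, n + 1):
        # compare w[k] with w[k+p] for every k in the left root whose
        # counterpart k+p still lies inside the word
        if all(w[k] == w[k + p] for k in range(max(0, c - p), min(c, n - p))):
            return p

def square_kind(w: str, c: int) -> str:
    """Classify the minimal square at cut c as internal or external."""
    p = local_period(w, c)
    left_external = p > c             # left root starts before the word
    right_external = c + p > len(w)   # right root ends after the word
    if left_external and right_external:
        return "left- and right-external"
    if left_external:
        return "left-external"
    if right_external:
        return "right-external"
    return "internal"
```

For w = "abaab" the local periods at cuts 1 to 4 are 2, 3, 1, 3: the cut a b a | a b has the internal square "aa", while the cut a b a a | b needs the right-external square with root "aab".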
22
Critical Factorization Theorem
for any position i: local period at i ≤ global period of w
Critical Factorization Theorem: For every word w, there exists a position i such that the local period at i = the global period of w
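The theorem can be checked experimentally on small words. The sketch below is a brute-force verification, assuming cuts are numbered 1 to n − 1 and external squares are allowed; all function names are hypothetical.

```python
from itertools import product

def global_period(w: str) -> int:
    n = len(w)
    return next(p for p in range(1, n + 1)
                if all(w[k] == w[k + p] for k in range(n - p)))

def local_periods(w: str) -> list:
    """Local period at each cut c = 1 .. n-1 (external squares allowed)."""
    n = len(w)
    result = []
    for c in range(1, n):
        p = 1
        while not all(w[k] == w[k + p]
                      for k in range(max(0, c - p), min(c, n - p))):
            p += 1
        result.append(p)
    return result

def critical_positions(w: str) -> list:
    """Cuts whose local period equals the global period."""
    g = global_period(w)
    return [c for c, p in enumerate(local_periods(w), start=1) if p == g]

# exhaustive check over all binary words of length 2 to 8
cft_holds = all(
    max(local_periods("".join(letters))) == global_period("".join(letters))
    for n in range(2, 9)
    for letters in product("ab", repeat=n)
)
```

For "abaab" the critical positions are the cuts 2 and 4, where the local period reaches the global period 3.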
23
Computing local periods (minimal squares)
compute separately
– internal minimal squares
– left-external and right-external minimal squares
– both left- and right-external minimal squares
focus on internal minimal squares
compute the s-factorization
for each factor, compute minimal squares ending in this factor
24
Minimal squares inside a factor
31
Minimal squares crossing factor border
focus on squares crossing the left border of the factor
focus on those of them centered inside the factor
general idea: compute squares and pick the minimal ones
be careful, the number of squares can be super-linear!!
compute maximal periodicities in increasing order of periods
only a linear number of squares need to be tested for minimality!!
39
Sketch of the proof
assume we are looking at squares of period p
consider the largest period for which squares have been found
if , then test all squares of period p (at most )
if , then either , or
at most a linear number of squares need to be tested
40
Computing (right-)external squaresComputing (right-)external squares
41
Computing (right-)external squaresComputing (right-)external squares
use extension functions!
42
Computing (right-)external squaresComputing (right-)external squares
use extension functions!
43
Computing (right-)external squaresComputing (right-)external squares
use extension functions!
44
Computing (right-)external squares
use extension functions!
for each position i, find the minimal period p of a right-external square centered at i
can be done in O(n) time
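The exact condition on the slide is lost, so the sketch below shows one plausible reading rather than the authors' procedure. For a cut c (w[:c] | w[c:]), it finds the smallest p ≤ c such that the whole suffix w[c:] re-occurs p positions to the left, which is what a square centered at c running past the right end requires. It applies a Z-function to the reversed word and then fills the answers in a single left-to-right sweep, so the total time is linear. All names are hypothetical.

```python
def z_function(s: str) -> list:
    """z[i] = length of the longest common prefix of s and s[i:]."""
    n = len(s)
    z = [0] * n
    if n:
        z[0] = n
    left = right = 0
    for i in range(1, n):
        if i < right:
            z[i] = min(right - i, z[i - left])
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z

def right_extension_periods(w: str) -> list:
    """For each cut c = 1 .. n-1: minimal p <= c with w[j] == w[j-p]
    for all j >= c, or None if no such p exists."""
    n = len(w)
    z = z_function(w[::-1])
    best = [None] * (n + 1)  # best[m]: minimal p for the suffix of length m
    covered = 0
    for p in range(1, n):
        # z[p] >= m means the suffix of length m matches p positions to the left
        while covered < z[p]:
            covered += 1
            best[covered] = p
    return [best[n - c] for c in range(1, n)]
```

For "abaab" the result is [None, None, 3, 3]: at cuts 3 and 4 the suffixes "ab" and "b" re-occur three positions to the left, while cuts 1 and 2 admit no such square. When the returned p is at most the suffix length, the square found is internal rather than right-external, so a full local-period computation would still need to combine this with the other cases.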
45
Conclusions
All local periods can be computed in O(n) time
note that the global period of w is the maximum of its local periods