On recognizing words that are squares for the shuffle product
Transcript of On recognizing words that are squares for the shuffle product
![Page 1: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/1.jpg)
On recognizing words that aresquares for the shuffle product
Romeo Rizzi 1 Stephane Vialette 2
1Department of Computer Science, University of Verona, Italy
2CNRS & LIGM, Universite Paris-Est Marne-la-Vallee, France
8th International Computer Science Symposium in RussiaUral Federal University, Ekaterinburg, Russia
June 25-29, 2013
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 1 / 34
![Page 2: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/2.jpg)
ShuffleThe shuffle u � v of words u and v over A is the finite set of allwords obtainable from merging the words u and v from left toright, but choosing the next symbol arbitrarily from u or v.
ab� cd = {abcd, acbd, acdb, cabd, cadb, cdab}
The iterated shuffle of u is the language
ε ∪ u ∪ (u � u) ∪ (u � u � u) ∪ . . .
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 2 / 34
![Page 3: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/3.jpg)
Iterated shuffle
There are basically two kinds of questions that can be addresseddepending on whether or not the shuffled element is given as a part ofthe input:
“Given u, v ∈ A∗, is u in the iterated shuffle of v?”, and“Given u ∈ A∗, is u in the iterated shuffle of some v ∈ A∗?”
For two applications of the shuffle product, we obtain:“Given u, v ∈ A∗, is u ∈ v � v?”, and“Given u ∈ A∗, does there exist v ∈ A∗ such that u ∈ v � v?”
As we shall see soon, these two problems dramatically differ incomplexity.
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 3 / 34
![Page 4: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/4.jpg)
Some good news
Given words u, v1 and v2, it can be tested in O(|u|2/ log(|u|)
)time whether or not u ∈ v1 � v2 [Leeuwen, and Nivat, 1982].
The shuffle u � v of words u and v can be computed inO
((|u|+ |v|)
(|u|+|v||u|
))time [Spehner. 1986].
Given words u1, u2, . . . , uk , the shuffle �ki=1ui can be computed in
O((|u1|+|u2|+...+|uk ||u1|,|u2|,...,|uk |
))time [Allauzen. 2000].
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 4 / 34
![Page 5: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/5.jpg)
Some bad news
Given words u, v1, v2, . . . , vn ∈ A∗, it is NP-complete to decidewhether or not u ∈ �k
i=1vi . [Mansfield. 1983;Warmuth, andHaussler. 1984].
This remains true even if the alphabet has size 3 [Warmuth, andHaussler. 1984].
For two words u and v, it is NP-complete to decide whether ornot u is in the iterated shuffle of v [Warmuth, and Haussler.1984].
This remains true even if the alphabet has size 3 [Warmuth,and Haussler. 1984].
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 5 / 34
![Page 6: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/6.jpg)
Our main result
TheoremGiven u ∈ A∗, it is NP-complete to decide whether or not u is theshuffle of some word v ∈ A∗ with itself
(i.e., does there exist some v ∈ A∗ such that u ∈ v � v?).
This result was first claimed by K. Iwama [Iwama. 1983], but itturns out that the proof has a serious flaw.
This result was recently proved independently by Buss and Soltys[Buss, and Soltys. 2012].
The problem is polynomial-time solvable in case each characteroccurs at most four times.
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 6 / 34
![Page 7: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/7.jpg)
Stack Exchange discussion board
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 7 / 34
![Page 8: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/8.jpg)
Words and linear graphsLet u = u1 u2 . . . un ∈ An be a word on some alphabet A.
The graph associated to u, denoted V(Gu), is defined by
V(Gu) = {u1, u2, . . . , un}, andE(Gu) = {{ui , uj} : i 6= j ∧ ui = uj}.
The structure of this underlying graph is linear, i.e., the set ofvertices is equipped with a natural total order < defined byui < uj if and only if i < j.
In other words, the vertices of Gu correspond to the letters of u inthe left to right order and there is an edge between any twoidentical distinct letter of u.
Clearly, Gu is the disjoint union of cliques, one clique for eachdistinct letter of u.
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 8 / 34
![Page 9: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/9.jpg)
Words and linear graphs
For u = ababbbaa we obtain:
Gu a b a b b b a a
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 9 / 34
![Page 10: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/10.jpg)
Words, linear graphs, and inclusion-free matchings
Let u = u1 u2 . . . un ∈ An be a word on some alphabet A, and let Gube the linear graph associated to u.
A matching is perfect if it covers all the vertices of the graph.
In case the set of vertices is equipped with a total order, amatching M is said to be inclusion-free if there do not exist(independent) edges (ui , uj) and (uk , u`) in M such thatui < uk < u` < uj or uk < ui < ij < u`.
LemmaLet u ∈ A∗ for some alphabet A, and Gu be the corresponding lineargraph. Then, there exists v ∈ A∗ such that u ∈ v � v if and only ifthere exists an inclusion-free perfect matching in Gu.
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 10 / 34
![Page 11: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/11.jpg)
Words, linear graphs, and inclusion-free matchings
Let u = u1 u2 . . . un ∈ An be a word on some alphabet A, and let Gube the linear graph associated to u.
A matching is perfect if it covers all the vertices of the graph.
In case the set of vertices is equipped with a total order, amatching M is said to be inclusion-free if there do not exist(independent) edges (ui , uj) and (uk , u`) in M such thatui < uk < u` < uj or uk < ui < ij < u`.
LemmaLet u ∈ A∗ for some alphabet A, and Gu be the corresponding lineargraph. Then, there exists v ∈ A∗ such that u ∈ v � v if and only ifthere exists an inclusion-free perfect matching in Gu.
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 10 / 34
![Page 12: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/12.jpg)
Linear graph and inclusion-free perfect matching
u = a b b aa b b a
Gu a b a b b b a a
M a b a b b b a a
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 11 / 34
![Page 13: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/13.jpg)
Being a square for the shuffle product
TheoremGiven u ∈ A∗, it is NP-complete to decide whether or not u is theshuffle of some word v ∈ A∗ with itself.
We propose a polynomial-time reduction from the NP-completeLongest Common Subsequence for binary words[Maier.1978]:
“Given a collection of words U = {u1, u2, . . . , um}, ui ∈ {0, 1}∗ for1 ≤ i ≤ m, and a positive integer k, decide whether there exists asubsequence of size k common to all sequences of U ?”
Without loss of generality, we may assume that |ui | = |uj | for1 ≤ i < j ≤ m, and that that we are looking for a commonsubsequence with p letters 0 and q letters 1, k = p + q.
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 12 / 34
![Page 14: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/14.jpg)
Lossless transmission
It will convenient to see the reduction as a flow-like procedure, wheresome piece of information (the common subsequence) emitted fromgadget Ws (the source) propagates lossless to gadget Wt (the sink)going through all gadgets Wi , 1 ≤ i ≤ m (every such gadget beingassociated to an input word of our input instance of 01-LCS).
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 13 / 34
![Page 15: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/15.jpg)
Being a square for the shuffle product - the big picture
SOURCE TARGET
LOSSLESS TRANSMISSION
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 14 / 34
![Page 16: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/16.jpg)
Being a square for the shuffle product - the big picture
TARGETSecond
word
Last
word
First
wordSOURCE
Emit word Transmit word Transmit word
Transmit wordTransmit word
LOSSLESS TRANSMISSION LINE
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 15 / 34
![Page 17: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/17.jpg)
The constructed string
The string of interest is defined by:
w = Ws W1 W2 . . . Wm Wt ,
where Ws, W1, W2, . . ., Wm and Wt are words in A∗.
Ws is the “source”.
Each Wi is associated to an input binary string.
Wt is the “target”.
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 16 / 34
![Page 18: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/18.jpg)
Designing the source and the target
SourceWs = s′ 0pq s′ s (0p 1)q 0p s
TargetWt = t (0p 1)q 0q t t ′ 0pq t ′
Recall that we are looking for a common subsequence with pletters 0 and q letters 1, k = p + q.
Letters s, s′, t and t ′ do not occur in any other gadget word.
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 17 / 34
![Page 19: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/19.jpg)
Designing the source : the big picture
Reverse direction: Easy observations for the source
Rizzi & Vialette Being a square for the shu�e product CSR 2013 23 / 35
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 18 / 34
![Page 20: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/20.jpg)
Designing a string gadget
For any input binary string ui = ui,1ui,2 . . . ui,n , define
Wi = xi W ′i xi yi W ′
i yi ,
where
W ′i = ui,1 zi ui,2 zi . . . ui,n−1 zi ui,n .
Letters xi , yi and zi only occur in the gadget word Wi .
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 19 / 34
![Page 21: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/21.jpg)
Designing a string gadget: the big picture
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 20 / 34
![Page 22: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/22.jpg)
Forward direction
Suppose that there exists a common subsequence v of the wordsu1, u2, . . . , um with p occurrences of the letter 0 and q occurrenceof the letter 1.
The solution v propagates lossless from source gadget Ws to targetgadget Wt going through all string gadgets Wi , 1 ≤ i ≤ m (everysuch gadget being associated to an input word of our inputinstance of 01-LCS).
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 21 / 34
![Page 23: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/23.jpg)
Reverse direction
Suppose that w is a square for the shuffle product.
According to our previous lemma, this amount to saying that Gwhas an inclusion-free perfect matching M.
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 22 / 34
![Page 24: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/24.jpg)
Reverse direction: Easy observations for the source
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 23 / 34
![Page 25: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/25.jpg)
Reverse direction: Transmitting from the source
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 24 / 34
![Page 26: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/26.jpg)
Reverse direction: Internal transmission
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 25 / 34
![Page 27: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/27.jpg)
Reverse direction: Internal transmission
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 26 / 34
![Page 28: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/28.jpg)
Iterated shuffle
TheoremIt is NP-complete to decide whether or not a word u ∈ A∗ is in theiterated shuffle of some word v ∈ A∗ with u 6= v.
Proof.In the previous reduction, some letters occur exactly two times inthe constructed word w.
In this case, w cannot be the shuffle of k ≥ 3 identical copies ofsome word v ∈ A∗.
In other words, if w is in the iterated shuffle of some distinct wordv ∈ A∗, then w is a square for the shuffle product.
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 27 / 34
![Page 29: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/29.jpg)
Turning the problem into an optimisation task“Let u ∈ A∗. Find the largest v � u that is a square for the shuffleproduct.”Define
f (u) := max{|v| : v � u and ∃w � v such that v ∈ w � w}g(n, k) := min{f (u) : u ∈ Σn and |Σ| = k}
Theorem ([Axenovich, Person, and Puzynina. 2013])g(n, 2) = n − o(n).I.e., any binary word of length n can be split into two identicalsubwords and, perhaps, a remaining subword of length o(n).
TheoremThere is polynomial-time approximation scheme (PTAS) for computingf (u).
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 28 / 34
![Page 30: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/30.jpg)
Being the shuffle of a word with its reverse“Given u ∈ A∗, does there exist v ∈ A∗ such that u ∈ v � vR?”
Theorem ([Henshall,Rampersad, and Shallit. 2011])Let u be some word over some binary alphabet A. It is polynomial-timesolvable to determine whether or not there exists v ∈ A∗ such thatu ∈ v � vR.
Proof.If there exists v ∈ A∗ such that u ∈ v � vR, then u is an abeliansquare (i.e., u = v v′, where v′ is a permutation of v).
If u is a binary abelian square, then there exists v ∈ A∗ such thatu ∈ v � vR.
Notice that the equivalence is no longer true for larger alphabets!
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 29 / 34
![Page 31: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/31.jpg)
Words and linear graphsLet u = u1 u2 . . . un ∈ An be a word on some alphabet A, and let Gube the linear graph associated to u.
A matching is perfect if it covers all the vertices of the graph.
In case the set of vertices is equipped with a total order, amatching M is said to be 2-nested if the following two conditionshold:
I For every edge (u, v) in M, one endpoint is in the first half of thenodes of Gu and the other is in the second half;
I M can be partitioned into two disjoint matchings M1 and M2 suchthat both M1 and M2 are nested matchings
LemmaLet u ∈ A∗ for some alphabet A, and Gu be the corresponding lineargraph. Then, there exists v ∈ A∗ such that u ∈ v � vR if and only ifthere exists a 2-nested perfect matching in Gu.
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 30 / 34
![Page 32: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/32.jpg)
Words and linear graphsLet u = u1 u2 . . . un ∈ An be a word on some alphabet A, and let Gube the linear graph associated to u.
A matching is perfect if it covers all the vertices of the graph.
In case the set of vertices is equipped with a total order, amatching M is said to be 2-nested if the following two conditionshold:
I For every edge (u, v) in M, one endpoint is in the first half of thenodes of Gu and the other is in the second half;
I M can be partitioned into two disjoint matchings M1 and M2 suchthat both M1 and M2 are nested matchings
LemmaLet u ∈ A∗ for some alphabet A, and Gu be the corresponding lineargraph. Then, there exists v ∈ A∗ such that u ∈ v � vR if and only ifthere exists a 2-nested perfect matching in Gu.
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 30 / 34
![Page 33: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/33.jpg)
Being the shuffle of a word with its reverse
ABACBACA ∈ ABCA� (ABCD)R
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 31 / 34
![Page 34: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/34.jpg)
Problems left open
“How hard is the problem of detecting squares for the shuffle productfor bounded alphabet words?”
It is proved in [Buss, and Soltys. 2012] that the problem isNP-complete for an alphabet with 9 symbols (it is claimed thatthis can be improved to 7 letters).
It is claimed without proof in [Aoki, Uehara, and Yamazaki.2001] (Fact 2 Subsection 2.2) that detecting squares for the shuffleproduct is NP-complete for binary words.
This result – that would be an important improvement over [Buss,and Soltys. 2012] and our result – is yet to be confirmed.
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 32 / 34
![Page 35: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/35.jpg)
Problems left open
“Let u, v1, v2, . . . , vk ∈ A∗. Decide whether or not u ∈ �ki=1vi . Where
does fall the problem of deciding whether or not u ∈ �ki=1vi in the
W-hierarchy for parameter k?”
Parameterized complexity [Downey, and Fellows. 1999].
In particular, what about the (parameterized) complexity ofdeciding whether or not u ∈ �k
i=1vi for parameter k?
Given words u, v1, v2, . . . , vk ∈ A∗, it is NP-complete to decidewhether or not u ∈ �k
i=1vi [Mansfield. 1983] and [Warmuth,and Haussler. 1984].
The question which lies at the heart of this is: How hard is thisproblem compared to LCS?
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 33 / 34
![Page 36: On recognizing words that are squares for the shuffle product](https://reader031.fdocuments.us/reader031/viewer/2022012915/61c63a3c812a0c79ba74b726/html5/thumbnails/36.jpg)
Problems left open
“How hard is the problem of detecting whether of word is in the shuffleof another word with its reverse?”
Polynomial time solvable for binary words[Henshall,Rampersad, and Shallit. 2011].
How useful are perfect 2-nested matchings?
RNA pseudoknotted structures: 2-intervals, RNA Bi-secondarystructures, LAPCS(S,S), . . .
Rizzi & Vialette () Being a square for the shuffle product CSR 2013 34 / 34