Automata and Logic Characterization of Floyd Languages


Page 1: Automata and Logic Characterization of Floyd Languages

September 20, 2012

Automata and Logic Characterization of Floyd Languages

• Violetta Lonati DSI - Università degli Studi di Milano

• Dino Mandrioli DEI - Politecnico di Milano

• Matteo Pradella DEI - Politecnico di Milano

Page 2: Automata and Logic Characterization of Floyd Languages

Rather unusual presentation

• No outline at the beginning
• Only …

Page 3: Automata and Logic Characterization of Floyd Languages

1. Short summary of Floyd languages and grammars (they are a little outdated …)

• In 1963 R. Floyd introduced Operator Precedence Grammars (OPGs), a subclass of context-free grammars, with the goal of developing efficient parsing techniques.

• OPGs, here named Floyd Grammars (FGs) after their inventor, are inspired by the structure of arithmetic expressions (and their operators).

Page 4: Automata and Logic Characterization of Floyd Languages

The basics of Floyd Grammars (1)

• operator form (normal for CF):
– No adjacent nonterminals

• precedences
– balanced letters (A → aBb, …) are equal in precedence (≐)
– precedences between letters inspired by arithmetic's precedences, e.g. + ⋖ ×

• adjacent letter precedences determine the syntax tree:
– S → A; A → bAc | bc
– ⋖ b ⋖ b ≐ c ⋗ c ⋗
– Reduction: ⋖ b ≐ c ⋗ is replaced by A (reverse of A → bc)
– ⋖ b A c ⋗ : b ≐ c across A (nonterminals are “transparent”)
– Reduction: ⋖ b ≐ A c ⋗ is replaced by A (reverse of A → bAc)
– Reduction: A is replaced by S (reverse of S → A)

Page 5: Automata and Logic Characterization of Floyd Languages

The basics of Floyd Grammars (2)

• G's (conflict-free) Operator Precedence Matrix (OPM)

• b ⋖ L(A)
• R(A) ⋗ c

[Figure: syntax tree for A → bAc, annotated with the precedence relations between adjacent letters]

Page 6: Automata and Logic Characterization of Floyd Languages

The basics of Floyd Grammars (3)

• G1 = {E → E + T | T, T → T × a | a}

• G2 = {E → E + T | T, T → T × F | F, F → (E) | a}

• G1's precedences are:
• a ⋗ +, a ⋗ ×
• + ⋗ +, + ⋖ ×, + ⋖ a
• × ≐ a
• NB: implicitly: # ⋖ every letter, every letter ⋗ #

[Figure: syntax tree of a sample expression generated by G1, annotated with precedence relations]
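The precedences listed for G1 can be recomputed mechanically from the grammar. The following is a minimal sketch, assuming the usual definitions of the left and right terminal sets L(A) and R(A) (the terminals that can appear first or last in a string derived from A, possibly next to one nonterminal) and the delimiter convention # ⋖ L(S), R(S) ⋗ #. The function names `terminal_sets` and `precedence_matrix` are hypothetical and not taken from the talk.

```python
# Grammar G1 from the slide; dictionary keys are the nonterminals.
G1 = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "×", "a"], ["a"]],
}

def terminal_sets(grammar, leftmost=True):
    """Compute L(A) (leftmost=True) or R(A) (leftmost=False) by fixpoint iteration."""
    sets = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, rules in grammar.items():
            for rhs in rules:
                seq = rhs if leftmost else rhs[::-1]
                new = set()
                if seq[0] in grammar:
                    # Border symbol is a nonterminal: inherit its set ...
                    new |= sets[seq[0]]
                    # ... and, in operator form, the next symbol is a terminal.
                    if len(seq) > 1:
                        new.add(seq[1])
                else:
                    new.add(seq[0])
                if not new <= sets[nt]:
                    sets[nt] |= new
                    changed = True
    return sets

def precedence_matrix(grammar, axiom):
    """Map each pair (a, b) of terminals to the set of relations holding between them."""
    left = terminal_sets(grammar, leftmost=True)
    right = terminal_sets(grammar, leftmost=False)
    opm = {}

    def add(a, rel, b):
        opm.setdefault((a, b), set()).add(rel)

    for rules in grammar.values():
        for rhs in rules:
            for i in range(len(rhs) - 1):
                x, y = rhs[i], rhs[i + 1]
                if x not in grammar and y not in grammar:
                    add(x, "≐", y)                      # adjacent terminals
                elif x not in grammar:                  # terminal, then nonterminal
                    for b in left[y]:
                        add(x, "⋖", b)
                    if i + 2 < len(rhs):
                        add(x, "≐", rhs[i + 2])         # nonterminals are transparent
                else:                                   # nonterminal, then terminal
                    for a in right[x]:
                        add(a, "⋗", y)
    for b in left[axiom]:
        add("#", "⋖", b)
    for a in right[axiom]:
        add(a, "⋗", "#")
    return opm

opm_g1 = precedence_matrix(G1, "E")
conflict_free = all(len(rels) == 1 for rels in opm_g1.values())
```

An entry holding more than one relation would signal a precedence conflict, in which case the grammar would not be an FG.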

Page 7: Automata and Logic Characterization of Floyd Languages

2. A question raised by a reviewer

• “Why studying operator precedence languages now-a-days? just for fun??”

• Certainly we had and still have fun investigating FG properties (fun is a subjective feeling, but this should hardly be an exception within a TCS community …)

• However, not just for fun:

Page 8: Automata and Logic Characterization of Floyd Languages

2.1 FGs have been abandoned

• Unlike more powerful classes (LR), they cannot generate all deterministic CF languages
– (but this is more a theoretical than a practical weakness)

• They were originally motivated by parsing, and new powerful parsing techniques emerged … though these rarely exhibited the simplicity and efficiency of FG-based ones.

Page 9: Automata and Logic Characterization of Floyd Languages

2.2 A more recent and still quite alive and productive result: Model checking (MC)

(Remark: Both FGs and MC contributed to granting a Turing award …)

• What has MC to do with FGs?
– MC is rooted in basic closure properties + decidability of the emptiness property
– These properties are typically enjoyed by regular (finite-state, FS) languages
– MC exploits the automata-theoretic and logic (MSO) characterizations of FS languages

Page 10: Automata and Logic Characterization of Floyd Languages

2.3 A large amount of literature strove to extend the scope of MC beyond the limits of FS machines

• The typical goal is to keep the properties that allow for the application of MC algorithms

• Among the various attempts, Visibly Pushdown Languages (VPLs) have certainly been quite successful

• VPLs generalize parenthesis languages:
– { ( } = Σc , { ) } = Σr , VT = Σi
– Calls (open parentheses) and returns (closed ones) are not necessarily matched:
  • Unmatched returns at the beginning of the string
  • Unmatched calls at the end (acceptance with non-empty stack)

Page 11: Automata and Logic Characterization of Floyd Languages

• VPLs inherit the main properties of regular languages:
– Closed w.r.t. Boolean operations
– Closed w.r.t. concatenation, Kleene *, prefix, suffix, …
  • By keeping the partitioning of Σ unaffected
  • With some “care” about reversal and homomorphism
– Deterministic VPAs are equivalent to nondeterministic ones …
  • With a typical power-set construction
– MSO logic characterization
– In summary: they take up and extend the original work by McNaughton and others on tree automata.

Page 12: Automata and Logic Characterization of Floyd Languages

2.4 Somewhat surprisingly … (at least for us)

• VPLs are a proper subclass of FLs
– Crespi-Reghizzi and Mandrioli (JCSS, 2012, # 6)
• Precisely, they are all and only those FLs characterized by a partitioned precedence matrix

Page 13: Automata and Logic Characterization of Floyd Languages

2.5 FLs also share the classical closure properties enjoyed by regular languages and VPLs

• FLs are closed w.r.t. (Crespi and Mandrioli, 1978 and … 2010)
– Boolean operations
– Concatenation and Kleene * (more difficult to prove than for other classes of languages)
– Prefix and suffix
– …
• Thus they are perfect candidates to further extend MC techniques to infinite-state machines

Page 14: Automata and Logic Characterization of Floyd Languages

2.6 But studying FGs was abandoned a long time ago …

• (Somewhat surprisingly) an automata family associated with (accepting all and only) FLs was still lacking

• (Less surprisingly) an (MSO) logic characterization was also lacking
• These are two important contributors to the power of MC
• So: not just for fun
• Incidentally: FLs, unlike general deterministic languages, enjoy a local parsability property which enables parallel and incremental parsing (Barenghi et al., SLE 2012), which “now-a-days” is probably more interesting than 40 years ago

Page 15: Automata and Logic Characterization of Floyd Languages

3. Floyd automata (FAs)

• The transition function can be seen as the union of two disjoint functions:
– δpush: Q × Σ → 2^Q
– δflush: Q × Q → 2^Q

• Push and mark moves both push the input symbol on the top of the stack, together with the new state computed by push; such moves differ only in the marking of the symbol on top of the stack.

• The flush move is more complex: the symbols on the top of the stack are removed until the first marked symbol (included), and the state of the next symbol below them in the stack is updated by flush according to the pair of states that delimit the portion of the stack to be removed.
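The three moves can be sketched as a small simulator. This is only an illustration under several assumptions that are not on the slide: the automaton is deterministic (so δpush and δflush return one state instead of a set), the input is closed by a delimiter #, the move is chosen by the precedence between the top stack symbol and the look-ahead (⋖ gives a mark move, ≐ a push move, ⋗ a flush move), and acceptance requires a stack holding only # with a final state. The relations are written in ASCII as "<", "=", ">". The example language { bⁿcⁿ | n ≥ 1 } and all names (`OPM`, `PUSH`, `FLUSH`, `accepts`) are hypothetical.

```python
# Precedence matrix for { b^n c^n | n >= 1 }: b ⋖ b, b ≐ c, c ⋗ c, plus the delimiter.
OPM = {("#", "b"): "<", ("b", "b"): "<", ("b", "c"): "=",
       ("c", "c"): ">", ("c", "#"): ">"}

# Hypothetical deterministic transition tables.
PUSH = {("q0", "b"): "q1", ("q1", "b"): "q1", ("q1", "c"): "q1"}   # Q x Sigma -> Q
FLUSH = {("q1", "q1"): "q1", ("q1", "q0"): "q1"}                   # Q x Q -> Q
INITIAL, FINAL = "q0", {"q1"}

def accepts(word):
    # Each stack entry is (symbol, marked?, state); '#' sits at the bottom.
    stack = [("#", False, INITIAL)]
    tape = list(word) + ["#"]
    pos = 0
    while True:
        top_symbol, _, top_state = stack[-1]
        look = tape[pos]
        if top_symbol == "#" and look == "#":
            return top_state in FINAL
        relation = OPM.get((top_symbol, look))
        if relation in ("<", "="):
            # Mark move ('<') or push move ('='): both consume the input symbol.
            new_state = PUSH.get((top_state, look))
            if new_state is None:
                return False
            stack.append((look, relation == "<", new_state))
            pos += 1
        elif relation == ">":
            # Flush move: drop everything down to the topmost marked symbol
            # (included) and update the state of the symbol just below it.
            marked = [k for k, entry in enumerate(stack) if entry[1]]
            if not marked:
                return False
            i = marked[-1]
            below_symbol, below_marked, below_state = stack[i - 1]
            new_state = FLUSH.get((top_state, below_state))
            if new_state is None:
                return False
            del stack[i - 1:]
            stack.append((below_symbol, below_marked, new_state))
        else:
            return False  # no precedence relation: the string is rejected
```

Note that the flush move does not consume input, so several flushes may fire on the same look-ahead symbol. In this toy example the precedence matrix alone enforces the matching of b's and c's, which is why two states suffice.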

Page 16: Automata and Logic Characterization of Floyd Languages

• A language that can be modeled by an FA (but not by a VPA):
– the stack management of a simple programming language that is able to handle nested exceptions
– two procedures, called a and b; calls and returns are denoted by calla, callb, reta, retb, respectively
• During execution, it is possible to install an exception handler hnd.
• rst is issued when an exception occurs, or after a correct execution to uninstall the handler. With a rst the stack is “flushed”, restoring the state right before the last hnd.
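The stack discipline described above can be sketched directly, without any automaton. The function `run` below is a hypothetical illustration: it assumes that a return must match the call on top of the stack and that a rst without an installed handler is an error, neither of which is stated on the slide.

```python
def run(trace):
    """Return the stack left after the trace, or None if the trace is rejected."""
    stack = []
    for event in trace:
        if event in ("calla", "callb", "hnd"):
            stack.append(event)
        elif event in ("reta", "retb"):
            expected = "call" + event[-1]       # reta matches calla, retb matches callb
            if not stack or stack[-1] != expected:
                return None                      # assumption: unmatched returns are rejected
            stack.pop()
        elif event == "rst":
            if "hnd" not in stack:
                return None                      # assumption: rst needs an installed handler
            # Flush: remove everything down to the last hnd, handler included.
            while stack.pop() != "hnd":
                pass
        else:
            raise ValueError(f"unknown event: {event}")
    return stack
```

A single rst may remove an unbounded number of pending calls, whereas a VPA pops at most one stack symbol per input symbol; this is the intuition behind the claim that the language is beyond VPAs.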

Page 17: Automata and Logic Characterization of Floyd Languages

• Deterministic FAs are as powerful as nondeterministic ones
– (as happens for FSMs and VPAs)
– the proof is based on, but is not just a rephrasing of, the normal power-set construction …

Page 18: Automata and Logic Characterization of Floyd Languages

4. The “traditional” MSO characterization

• φ := a(x) | x ∈ X | x ≤ y | x ↷ y | x = y + 1 | ¬φ | φ ∨ φ | ∃x.φ | ∃X.φ

• The only “novelty” w.r.t. the standard Büchi syntax is the ‘↷’ relation
– Which somewhat resembles the ‘---->’ relation between two “matching positions” in VPLs.

Page 19: Automata and Logic Characterization of Floyd Languages

4.1 Here comes review # 2 (plus others)

• “The MSO characterization for a class of languages is an interesting result which adds to a theory, though it is often quite a standard exercise, as it seems to be the case also for FL”
– (Fortunately, also: “Overall, the results are interesting and can be accepted for presentation at ICTCS.”)

• Side personal trouble: why only for MSO and not for FAs? ….

Page 20: Automata and Logic Characterization of Floyd Languages

• Indeed, the basic (and most original) construction due to Büchi, which builds an automaton starting from an MSO formula, has been adapted in the subsequent literature to many other automata families, including tree automata, VPAs, … and it works for FAs too, with a couple of non-trivial technical caveats due to the need to extend precedence relations when changing alphabets.

Page 21: Automata and Logic Characterization of Floyd Languages

• Indeed, “coding” (FS) automata moves in terms of logic formulas is a not-too-difficult exercise and has been repeated without serious obstacles for other automata (e.g. tree automata).

• For VPAs the authors introduced the x ---> y relation between “matching positions” and built a suitable formula to control the correct match when reading the return symbol

Page 22: Automata and Logic Characterization of Floyd Languages

4.2 All this easily rephrased for FAs?

• Major difference w.r.t. all previous cases (to the best of our knowledge):
– The relation x ↷ y (roughly, begin-end of a right-hand side) is no longer one-to-one
• Equivalently:
– There is no one-to-one correspondence <read symbol – automaton transition>, i.e., unlike previous cases, FAs are not real-time machines

Page 23: Automata and Logic Characterization of Floyd Languages

4.3 Our approach

• The fundamental difference between FLs and all other languages studied in this type of literature is that the latter are “explicit structure” or “explicit parentheses” languages (regular and linear ones being very special and simple cases thereof), whereas FLs, like other general CF languages, have an implicit syntax structure, in this case determined by the OPM.

• Recognizing a string of an FL requires a real, non-trivial parsing; and this has to be coded by means of suitable MSO formulas.

Page 24: Automata and Logic Characterization of Floyd Languages

After a few different tries …

• The main idea:
• Follow the key of FG parsing, i.e. the look-ahead, look-back induced by the ⋗ and ⋖ relations:
– They determine the (not one-to-one) x ↷ y relation

[Figure: positions x and y of a string linked by the relation, with ⋖ ⋖ ⋗ marked between the letters]
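To see why the relation is not one-to-one, it can be computed on a small string. The sketch below reads x ↷ y as "x is the position just below a stack portion that is flushed when the look-ahead is at position y"; this reading is an assumption made for illustration, and the precise definition is the one in the cited report. It uses G1's precedences as listed earlier in the talk, written in ASCII as "<", "=", ">", with # at positions 0 and n+1. The function name `chain_pairs` is hypothetical.

```python
# G1's precedences, plus the delimiter convention for '#'.
OPM = {
    ("a", "+"): ">", ("a", "×"): ">",
    ("+", "+"): ">", ("+", "×"): "<", ("+", "a"): "<",
    ("×", "a"): "=",
    ("#", "a"): "<", ("#", "+"): "<", ("#", "×"): "<",
    ("a", "#"): ">", ("+", "#"): ">",
}

def chain_pairs(word):
    """Return the (x, y) pairs found by an operator-precedence scan of #word#."""
    tape = ["#"] + list(word) + ["#"]
    stack = [(0, False)]           # entries are (position, marked?)
    pairs = []
    y = 1                          # position of the look-ahead symbol
    while not (len(stack) == 1 and y == len(tape) - 1):
        x = stack[-1][0]
        relation = OPM.get((tape[x], tape[y]))
        if relation in ("<", "="):
            stack.append((y, relation == "<"))   # look-back: remember where '<' occurred
            y += 1
        elif relation == ">":
            # Look-ahead y closes the portion opened at the topmost marked position.
            while not stack.pop()[1]:
                pass
            pairs.append((stack[-1][0], y))
        else:
            raise ValueError("string not compatible with the precedence matrix")
    return pairs

pairs = chain_pairs("a+a×a")
```

On `a+a×a` the scan yields the pairs (0, 2), (2, 4), (2, 6) and (0, 6): position 2 is related to both 4 and 6, and position 6 is reached from both 0 and 2, so neither end of the relation is unique.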

Page 25: Automata and Logic Characterization of Floyd Languages

Obvious?

• Perhaps; however, from (another) reviewer, who also claimed "a fairly trivial exercise":
– "page 8: hnd(x+1) and rst(y-1): shouldn't this be hnd(x) and rst(y) for example, if z is 2 then x should be 1 and y should be 3"
• Once the new relation is well established, a few more “technicalities” (e.g., the automaton can enter different state (types) in the same position) required several weeks and pages for the authors to come up with a (hopefully) complete proof
– Of course simpler, shorter, and quicker (and more “standard”) proofs would be quite welcome
• “(I have not checked the cited technical report but I have a rough idea of what should be done)”.
• Instead, if you are curious and (not convinced but lazy), or you just want to compare your proof with our own, you can always go to http://arxiv.org/abs/1204.4639

Page 26: Automata and Logic Characterization of Floyd Languages

(Very personal) conclusions

• FGs, FLs, FAs are a rich mine of theoretical properties (not only those addressed in this contribution) with important practical impact in different fields such as MC and parsing

• Worth further investigation, not just for fun:
– ω-languages
– Local parsability (extensions)
– Pairing with semantic analysis
– ….