an efficient Bottom-up parser for a large and useful class of context-free grammars. the “ L ”...

An efficient bottom-up parser for a large and useful class of context-free grammars.

The “L” stands for left-to-right scan of the input; the “R” for constructing a rightmost derivation in reverse.

Why LR parsers are attractive:
(1) LR parsers can be constructed for most programming languages.
(2) The LR parsing method is more general than the LL parsing method.
(3) LR parsers can detect syntactic errors as soon as possible.

But it is too much work to implement an LR parser by hand for a typical programming-language grammar. =====> Parser Generator

The techniques for producing LR parsing tables:
Simple LR (SLR) - LR(0) items, FOLLOW
Canonical LR (CLR) - LR(1) items
Lookahead LR (LALR) - ① LR(1) items
                      ② LR(0) items, lookahead


[Figure: model of an LR parser. A driver routine reads the input a1 … ai … an $, maintains the stack, and consults the parsing table.]

Stack : S0 X1 S1 X2 S2 ••• Xm Sm, where Si : state and Xi ∈ V.

Configuration of an LR parser :

(S0 X1 S1 ••• Xm Sm, ai ai+1 ••• an $)
 stack contents       unscanned input

LR Parsing Table (ACTION table + GOTO table) :
rows are states; the ACTION table has one column per terminal and the GOTO table has one column per nonterminal.

The LR parsing algorithm ::= same as the shift-reduce parsing algorithm.

Four actions : shift, reduce, accept, error

1. ACTION[Sm, ai] = shift S
   ::= (S0 X1 S1 ••• Xm Sm, ai ai+1 ••• an $)
       ⇒ (S0 X1 S1 ••• Xm Sm ai S, ai+1 ••• an $)

2. ACTION[Sm, ai] = reduce A → α and |α| = r
   ::= (S0 X1 S1 ••• Xm Sm, ai ai+1 ••• an $)
       ⇒ (S0 X1 S1 ••• Xm-r Sm-r, ai ai+1 ••• an $), GOTO(Sm-r, A) = S
       ⇒ (S0 X1 S1 ••• Xm-r Sm-r A S, ai ai+1 ••• an $)

3. ACTION[Sm, ai] = accept : parsing is completed.

4. ACTION[Sm, ai] = error : the parser has discovered an error and calls an error recovery routine.

G: 1. LIST → LIST , ELEMENT
   2. LIST → ELEMENT
   3. ELEMENT → a

Parsing Table :

state |  a    ,    $   | LIST  ELEMENT
  0   |  s3            |  1      2
  1   |       s4   acc |
  2   |       r2   r2  |
  3   |       r3   r3  |
  4   |  s3            |         5
  5   |       r1   r1  |

where sj means shift and stack state j, ri means reduce by production numbered i, acc means accept, and blank means error.

Input : a , a    Parsing Configuration :

STACK                      INPUT    ACTION
0                          a,a$     s3          (initial configuration)
0 a 3                      ,a$      r3 GOTO 2
0 ELEMENT 2                ,a$      r2 GOTO 1
0 LIST 1                   ,a$      s4
0 LIST 1 , 4               a$       s3
0 LIST 1 , 4 a 3           $        r3 GOTO 5
0 LIST 1 , 4 ELEMENT 5     $        r1 GOTO 1
0 LIST 1                   $        accept
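The shift, reduce, accept and error steps can be sketched as a small table-driven driver. This is a minimal sketch, not a definitive implementation: the names `PRODUCTIONS`, `ACTION`, `GOTO` and `lr_parse` are hypothetical, the table entries are those of the LIST grammar in the text, and the stack holds states only, since the grammar symbols are implied by the states.

```python
# Productions of G, numbered as in the text: number -> (left side, length of right side).
PRODUCTIONS = {1: ("LIST", 3), 2: ("LIST", 1), 3: ("ELEMENT", 1)}

# ACTION[state][terminal]; a missing entry means "error".
ACTION = {
    0: {"a": ("shift", 3)},
    1: {",": ("shift", 4), "$": ("accept", None)},
    2: {",": ("reduce", 2), "$": ("reduce", 2)},
    3: {",": ("reduce", 3), "$": ("reduce", 3)},
    4: {"a": ("shift", 3)},
    5: {",": ("reduce", 1), "$": ("reduce", 1)},
}

# GOTO[state][nonterminal].
GOTO = {0: {"LIST": 1, "ELEMENT": 2}, 4: {"ELEMENT": 5}}


def lr_parse(tokens):
    """Return the production numbers used, in the order the reductions happen.

    Raises SyntaxError when the table has no entry (an error action).
    """
    stack = [0]                      # state stack, initial state 0
    tokens = list(tokens) + ["$"]    # append the end marker
    pos = 0
    reductions = []
    while True:
        state, lookahead = stack[-1], tokens[pos]
        kind, arg = ACTION[state].get(lookahead, ("error", None))
        if kind == "shift":
            stack.append(arg)
            pos += 1
        elif kind == "reduce":
            lhs, rhs_len = PRODUCTIONS[arg]
            del stack[len(stack) - rhs_len:]      # pop |α| states
            stack.append(GOTO[stack[-1]][lhs])    # push GOTO(S(m-r), A)
            reductions.append(arg)
        elif kind == "accept":
            return reductions
        else:
            raise SyntaxError(f"unexpected {lookahead!r} in state {state}")
```

Calling `lr_parse(["a", ",", "a"])` follows the same sequence of configurations as the trace above and returns the reductions in order: 3, 2, 3, 1.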

The method for constructing an LR parsing table from a grammar

① SLR ② LALR ③ CLR

Definition : an LR(0) item is a production with a dot at some position of the right side.

ex) A → XYZ ∈ P : [A → .XYZ] [A → X.YZ] [A → XY.Z] [A → XYZ.]

mark symbol ::= the symbol after the dot, if it exists.
kernel item ::= [A → α.β] if α ≠ ε, or A = S'.
closure item ::= [A → .α] : the result of performing the CLOSURE operation.
reduce item ::= [A → α.]

[A → α.β] means that an input string derivable from α has just been seen; if we next see an input string derivable from β, we may be able to reduce by the production A → αβ.
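One possible encoding of LR(0) items is a triple of left side, right side and dot position. The sketch below enumerates the items of A → XYZ under that assumption; the function names are hypothetical and chosen only for illustration.

```python
def lr0_items(lhs, rhs):
    """Return every LR(0) item of the production lhs -> rhs as (lhs, rhs, dot)."""
    rhs = tuple(rhs)
    return [(lhs, rhs, dot) for dot in range(len(rhs) + 1)]


def show(item):
    """Render an item in the bracket notation used in the text."""
    lhs, rhs, dot = item
    return f"[{lhs} → {''.join(rhs[:dot])}.{''.join(rhs[dot:])}]"


def mark_symbol(item):
    """The symbol after the dot, or None when the dot is at the right end."""
    _, rhs, dot = item
    return rhs[dot] if dot < len(rhs) else None


def is_reduce_item(item):
    """A reduce item has its dot at the right end of the production."""
    _, rhs, dot = item
    return dot == len(rhs)


items = lr0_items("A", "XYZ")
print(" ".join(show(item) for item in items))
```

A production with n symbols on its right side has n + 1 items, and exactly one of them is a reduce item.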

Definition : Augmented Grammar
G = (VN, VT, P, S)
G' = (VN ∪ {S'}, VT, P ∪ {S' → S}, S')
where S' is a new start symbol not in VN.

The purpose of this new starting production is to indicate to the parser when it should stop parsing and announce acceptance of the input. That is, acceptance occurs when and only when the parser is about to reduce by S' → S.

If S ⇒* αAω ⇒ αβ1β2ω by a rightmost derivation, then αβ1 is a viable prefix.

“A viable prefix is a prefix of a right sentential form that does not continue past the right end of the handle of that sentential form.”

We say item [A → β1.β2] is valid for a viable prefix αβ1 if there is a rightmost derivation S ⇒* αAω ⇒ αβ1β2ω.

“In general, an item will be valid for many viable prefixes.”

Canonical collection of LR(0) items ::= the set of valid items for each viable prefix that can appear on the stack of an LR parser.
Computation : CLOSURE & GOTO functions

Definition : CLOSURE(I)

CLOSURE(I) = I ∪ {[B → .γ] | [A → α.Bβ] ∈ CLOSURE(I), B → γ ∈ P}

Meaning : [A → α.Bβ] in CLOSURE(I) indicates that, at some point in the parsing process, we next expect to see a substring derivable from Bβ as input. If B → γ is a production, we would also expect to see a substring derivable from γ at this point. For this reason, we also include [B → .γ] in CLOSURE(I).

Computing Algorithm:

Algorithm CLOSURE(I) ;
  CLOSURE := I ;
  repeat
    if [A → α.Bβ] ∈ CLOSURE and B → γ ∈ P then
      if [B → .γ] ∉ CLOSURE then
        CLOSURE := CLOSURE ∪ {[B → .γ]}
  until no change

Ex 1) E' → E
      E → E + T | T
      T → T * F | F
      F → (E) | id

CLOSURE({[E' → .E]}) = {[E' → .E], [E → .E+T], [E → .T], [T → .T*F], [T → .F], [F → .(E)], [F → .id]}.
CLOSURE({[E → E.+T]}) = {[E → E.+T]}.

Ex 2) S → AS | b
      A → SA | a

CLOSURE({[S → A.S]}) = {[S → A.S], [S → .AS], [S → .b], [A → .SA], [A → .a]}.
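The CLOSURE computation can be sketched in a few lines. This sketch uses a worklist instead of the repeat/until loop, which computes the same set, and it assumes items are encoded as (left side, right side, dot position). The dictionary `GRAMMAR` is the expression grammar of example 1, written here with T → T * F, and `GRAMMAR2` is the grammar of example 2; all names are hypothetical.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

GRAMMAR2 = {
    "S": [("A", "S"), ("b",)],
    "A": [("S", "A"), ("a",)],
}


def closure(items, grammar):
    """Return the LR(0) closure of a set of items (lhs, rhs, dot)."""
    result = set(items)
    worklist = list(items)
    while worklist:
        _, rhs, dot = worklist.pop()
        # Only a nonterminal mark symbol B contributes new items [B -> .γ].
        if dot < len(rhs) and rhs[dot] in grammar:
            for production in grammar[rhs[dot]]:
                new_item = (rhs[dot], production, 0)
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


start = closure({("E'", ("E",), 0)}, GRAMMAR)
print(len(start))  # number of items in CLOSURE({[E' -> .E]})
```

When the mark symbol is a terminal, as in [E → E.+T], nothing is added and the closure is the item itself.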

Definition : GOTO(I,X)

GOTO(I,X) = CLOSURE({[A → αX.β] | [A → α.Xβ] ∈ I}).

Meaning : If I is the set of items that are valid for some viable prefix γ, then GOTO(I,X) is the set of items that are valid for the viable prefix γX.

ex) I = {[E' → E.], [E → E.+T]}
    GOTO(I,+) = CLOSURE({[E → E+.T]})
              = {[E → E+.T], [T → .T*F], [T → .F], [F → .(E)], [F → .id]}

C0 = {CLOSURE({[S' → .S]})} ∪ {GOTO(I,X) | I ∈ C0, X ∈ V}

We are now ready to give the algorithm that constructs C0, the canonical collection of sets of LR(0) items for an augmented grammar.

Construction algorithm of C0.

Algorithm Canonical_Collection;
begin
  C0 := { CLOSURE({[S' → .S]}) };
  repeat
    for I ∈ C0 do
      Closure := CLOSURE(I);
      for each X ∈ MARK SYMBOL of Closure do
        J := GOTO(I,X);
        if ∃ Ji ∈ C0 such that Ji = J then GOTO[I,X] := Ji
        else GOTO[I,X] := J; C0 := C0 ∪ {J}
        fi
      end for
    end for
  until no change
end.

G : LIST → LIST , ELEMENT
    LIST → ELEMENT
    ELEMENT → a

Augmented Grammar

G' : ACCEPT → LIST
     LIST → LIST , ELEMENT
     LIST → ELEMENT
     ELEMENT → a

C0 :
I0 : CLOSURE({[ACCEPT → .LIST]})
   = {[ACCEPT → .LIST], [LIST → .LIST,ELEMENT], [LIST → .ELEMENT], [ELEMENT → .a]}.
GOTO(I0,LIST) = I1 = {[ACCEPT → LIST.], [LIST → LIST.,ELEMENT]}.
GOTO(I0,ELEMENT) = I2 = {[LIST → ELEMENT.]}.
GOTO(I0,a) = I3 = {[ELEMENT → a.]}.
GOTO(I1,,) = I4 = {[LIST → LIST,.ELEMENT], [ELEMENT → .a]}.
GOTO(I4,ELEMENT) = I5 = {[LIST → LIST,ELEMENT.]}.
GOTO(I4,a) = I3.
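The construction of C0 for the LIST grammar can be reproduced with a short sketch. It assumes items encoded as (left side, right side, dot position) and tries the grammar symbols in the fixed order LIST, ELEMENT, a, "," so that the state numbers come out as in the text; the names `closure`, `goto` and `canonical_collection` are hypothetical.

```python
GRAMMAR = {
    "ACCEPT": [("LIST",)],
    "LIST": [("LIST", ",", "ELEMENT"), ("ELEMENT",)],
    "ELEMENT": [("a",)],
}
SYMBOLS = ["LIST", "ELEMENT", "a", ","]  # fixed order, to match the numbering I0..I5


def closure(items):
    """LR(0) closure of a set of items (lhs, rhs, dot)."""
    result, worklist = set(items), list(items)
    while worklist:
        _, rhs, dot = worklist.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for production in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], production, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(lhs, rhs, dot + 1) for lhs, rhs, dot in items
             if dot < len(rhs) and rhs[dot] == symbol}
    return closure(moved)


def canonical_collection():
    """Return (states, transitions) where transitions[(i, X)] = j means GOTO(Ii, X) = Ij."""
    states = [closure({("ACCEPT", ("LIST",), 0)})]
    transitions = {}
    i = 0
    while i < len(states):               # states found later are processed too
        for symbol in SYMBOLS:
            target = goto(states[i], symbol)
            if not target:               # empty set: no transition on this symbol
                continue
            if target not in states:
                states.append(target)
            transitions[(i, symbol)] = states.index(target)
        i += 1
    return states, transitions


states, transitions = canonical_collection()
print(len(states), sorted(transitions.items()))
```

The loop stops once no GOTO produces a set that is not already in the collection, which corresponds to the "until no change" condition of the algorithm.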

[Figure: GOTO graph of C0 for G', with edges labeled LIST, ELEMENT and ,]

GOTO graph ::= a directed graph in which the nodes are labeled by the sets of items and the edges by grammar symbols.

Ex 1) G : PR → b DL ; SL e      (PR ≡ P)
          DL → d ; DL | d      (DL ≡ D)
          SL → s ; SL | s      (SL ≡ S)

Ex 2) G : S → S + A | A
          A → (S) | a(S) | a

• For an ε-production, the LR(0) item [A → .] is both a closure item and a reduce item.

renaming G : P → bD;Se
             D → d;D | d
             S → s;S | s

C0 :
[P' → .P] [P → .bD;Se]
[P' → P.]
[P → b.D;Se] [D → .d;D] [D → .d]
[P → bD.;Se]   (I3)
[D → d.;D] [D → d.]
[P → bD;.Se] [S → .s;S] [S → .s]
[D → d;.D] [D → .d;D] [D → .d]
[P → bD;S.e]   (I7)
[S → s.;S] [S → s.]
[D → d;D.]
[P → bD;Se.]
[S → s;.S] [S → .s;S] [S → .s]
[S → s;S.]   (I12)

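[Figure: layout of the parsing table. Rows are states; the columns under VT ∪ {$} hold the actions shift, reduce, accept and error; the columns under VN hold the GOTO entries.]

Three methods
SLR (simple LR) - C0, FOLLOW
CLR (canonical LR) - C1
LALR (lookahead LR) - ① C1
                      ② C0, lookahead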

Parsing Table

State i is constructed from Ii, where Ii ∈ C0.

The size of the parsing table depends on the number of states of C0.

But |C0| << |C1|.

The size of the parsing table :
SLR : |V| x |C0|
CLR : |V| x |C1|
LALR : |V| x |C0|

SLR method ::= the method of constructing the SLR parsing table from C0.

Constructing Algorithm: C0 = {I0, I1, I2, ..., In}

1. ACTION[i,a] := "shift j" if [A → α.aβ] ∈ Ii and GOTO(Ii,a) = Ij.

2. ACTION[i,a] := "reduce A → α", for all a ∈ FOLLOW(A), if [A → α.] ∈ Ii.

3. ACTION[i,$] := "accept" if [S' → S.] ∈ Ii.

4. GOTO[i,A] := j if GOTO(Ii,A) = Ij.

5. "error" for all undefined entries; the initial state is i if [S' → .S] ∈ Ii.

For reduce items, FOLLOW is used to decide on which input symbols to reduce.
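Since rule 2 depends on FOLLOW, a sketch of how the FOLLOW sets might be computed is given here. It assumes a grammar without ε-productions, which holds for the LIST grammar used in the text, so FIRST of a production only needs its first symbol; the names `first_sets` and `follow_sets` are hypothetical.

```python
GRAMMAR = [
    ("ACCEPT", ("LIST",)),
    ("LIST", ("LIST", ",", "ELEMENT")),
    ("LIST", ("ELEMENT",)),
    ("ELEMENT", ("a",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_sets():
    """FIRST of each nonterminal; valid only when there are no ε-productions."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = first[head] if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def follow_sets(start="ACCEPT"):
    """FOLLOW of each nonterminal, iterated until nothing changes."""
    first = first_sets()
    follow = {n: set() for n in NONTERMINALS}
    follow[start].add("$")                     # end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for k, symbol in enumerate(rhs):
                if symbol not in NONTERMINALS:
                    continue
                if k + 1 < len(rhs):           # something follows inside the production
                    nxt = rhs[k + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                          # symbol ends the production
                    new = follow[lhs]
                if not new <= follow[symbol]:
                    follow[symbol] |= new
                    changed = True
    return follow


follow = follow_sets()
print({name: sorted(symbols) for name, symbols in follow.items()})
```

For a grammar with ε-productions the FIRST and FOLLOW steps would also have to look past nullable symbols, which this sketch deliberately leaves out.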

G : 0. A → L      (A : ACCEPT, L : LIST, E : ELEMENT)
    1. L → L , E
    2. L → E
    3. E → a

C0 :
I0 : [A → .L] [L → .L,E] [L → .E] [E → .a]
I1 : [A → L.] [L → L.,E]
I2 : [L → E.]
I3 : [E → a.]
I4 : [L → L,.E] [E → .a]
I5 : [L → L,E.]

FOLLOW(A) = {$}
FOLLOW(L) = {,, $}
FOLLOW(E) = {,, $}

Parsing Table :

state |  a    ,    $   |  L    E
  I0  |  s3            |  1    2
  I1  |       s4   acc |
  I2  |       r2   r2  |
  I3  |       r3   r3  |
  I4  |  s3            |       5
  I5  |       r1   r1  |
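Rules 1 to 4 of the SLR construction can be sketched as a direct table fill. The sketch below does not recompute C0; it copies the item sets I0 to I5, the GOTO transitions and the FOLLOW sets of this example, encodes each item as (production number, dot position), and records a conflict whenever two different actions land in the same entry. All identifiers are hypothetical.

```python
# Productions numbered as in the text; 0 is the augmenting production A -> L.
PRODUCTIONS = [("A", ("L",)), ("L", ("L", ",", "E")), ("L", ("E",)), ("E", ("a",))]
NONTERMINALS = {"A", "L", "E"}
FOLLOW = {"A": {"$"}, "L": {",", "$"}, "E": {",", "$"}}

# C0 as sets of (production number, dot position).
C0 = [
    {(0, 0), (1, 0), (2, 0), (3, 0)},  # I0
    {(0, 1), (1, 1)},                  # I1
    {(2, 1)},                          # I2
    {(3, 1)},                          # I3
    {(1, 2), (3, 0)},                  # I4
    {(1, 3)},                          # I5
]
# GOTO(Ii, X) = Ij for every transition of the example.
GOTO_FN = {(0, "L"): 1, (0, "E"): 2, (0, "a"): 3, (1, ","): 4, (4, "E"): 5, (4, "a"): 3}


def build_slr_table():
    """Return (action, goto, conflicts) for the example grammar."""
    action, goto, conflicts = {}, {}, []

    def set_action(state, terminal, entry):
        old = action.setdefault((state, terminal), entry)
        if old != entry:                       # two different actions in one cell
            conflicts.append((state, terminal, old, entry))

    for i, items in enumerate(C0):
        for prod, dot in items:
            lhs, rhs = PRODUCTIONS[prod]
            if dot < len(rhs):
                symbol = rhs[dot]
                if symbol not in NONTERMINALS:                 # rule 1: shift
                    set_action(i, symbol, f"s{GOTO_FN[(i, symbol)]}")
            elif prod == 0:                                    # rule 3: accept
                set_action(i, "$", "acc")
            else:                                              # rule 2: reduce
                for terminal in FOLLOW[lhs]:
                    set_action(i, terminal, f"r{prod}")
    for (i, symbol), j in GOTO_FN.items():                     # rule 4: GOTO
        if symbol in NONTERMINALS:
            goto[(i, symbol)] = j
    return action, goto, conflicts


action, goto, conflicts = build_slr_table()
print(action[(1, "$")], action[(4, "a")], conflicts)
```

Entries that are absent from `action` play the role of the blank (error) cells of rule 5, and an empty `conflicts` list means the grammar is SLR(1) under this construction.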

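G : 1. S → L = R
    2. S → R
    3. L → * R
    4. L → id
    5. R → L

C0 :
I0 : [S' → .S] [S → .L=R] [S → .R] [L → .*R] [L → .id] [R → .L]
I1 : [S' → S.]
I2 : [S → L.=R] [R → L.]
I3 : [S → R.]
I6 : [S → L=.R] [R → .L] [L → .*R] [L → .id]
Other item sets :
[L → *.R] [R → .L] [L → .*R] [L → .id]
[L → id.]
[L → *R.]
[R → L.]
[S → L=R.]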

Consider I2 :

ACTION[2,=] := "shift 6"
ACTION[2,=] := "reduce R → L"    (∵ = ∈ FOLLOW(R))

Both entries fall in the same cell, so there is a shift-reduce conflict and the grammar is not SLR(1).
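The conflict in I2 can be checked mechanically. The sketch below applies the shift and reduce rules to the two items of I2 only, assuming FOLLOW(R) = {=, $} for this grammar (the text states only that = ∈ FOLLOW(R)); the helper name `slr_actions` is hypothetical.

```python
NONTERMINALS = {"S", "L", "R"}

# State I2 from the text: items as (left side, right side, dot position).
I2 = [("S", ("L", "=", "R"), 1), ("R", ("L",), 1)]
FOLLOW = {"R": {"=", "$"}}   # assumption: full FOLLOW(R) for this grammar


def slr_actions(items, follow):
    """Map each terminal to the set of SLR actions it receives in one state."""
    actions = {}
    for lhs, rhs, dot in items:
        if dot < len(rhs):
            if rhs[dot] not in NONTERMINALS:       # terminal after the dot: shift
                actions.setdefault(rhs[dot], set()).add("shift")
        else:                                      # dot at the end: reduce on FOLLOW(lhs)
            for terminal in follow[lhs]:
                actions.setdefault(terminal, set()).add(f"reduce {lhs} → {' '.join(rhs)}")
    return actions


actions = slr_actions(I2, FOLLOW)
conflicting = {terminal for terminal, acts in actions.items() if len(acts) > 1}
print(conflicting)
```

Only the "=" column receives two actions; on "$" the state simply reduces by R → L.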

Shift-reduce conflict vs. Reduce-reduce conflict
