LR(1) & LALR PARSER
Click here to load reader
-
Upload
zia-mcfadden -
Category
Documents
-
view
189 -
download
23
description
Transcript of LR(1) & LALR PARSER
![Page 1: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/1.jpg)
LR(1) & LALR PARSER
![Page 2: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/2.jpg)
INTRODUCTION
The LALR ( Look Ahead-LR ) parsing technique is
between SLR and Canonical LR, both in terms of power of parsing grammars and ease of implementation. This method is often used in practice because the tables obtained by it are considerably smaller than the Canonical LR tables, yet most common syntactic constructs of programming languages can be expressed conveniently by an LALR grammar. The same is almost true for SLR grammars, but there are a few constructs that can not be handled by SLR techniques.
![Page 3: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/3.jpg)
CONSTRUCTING LALR PARSING TABLES
A core is a set of LR (0) (SLR) items for the grammar, and an LR (1) (Canonical LR) grammar may produce more than two sets of items with the same core.
The core does not contain any look ahead information. Example: Let s1 and s2 are two states in a Canonical LR
grammar.
S1 – {C ->c.C, c/d; C -> .cC, c/d; C -> .d, c/d}S2 – {C ->c.C, $; C -> .cC, $; C -> .d, $}
These two states have the same core consisting of only the production rules without any look ahead information.
![Page 4: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/4.jpg)
CONSTRUCTION IDEA:
1. Construct the set of LR (1) items.
2. Merge the sets with common core together as one set, if no conflict (shift-shift or shift-reduce) arises.
3. If a conflict arises it implies that the grammar is not LALR.
4. The parsing table is constructed from the collection of merged sets of items using the same algorithm for LR (1) parsing.
![Page 5: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/5.jpg)
ALGORITHM:
• Input: An augmented grammar G’.
• Output: The LALR parsing table actions and goto for G’.
![Page 6: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/6.jpg)
Method:
1. Construct C= {I0, I1, I2,…, In}, the collection of sets of LR(1) items.
2. For each core present in among the set of LR(1) items , find all sets having the core, and replace these sets by their union.
3. Parsing action table is constructed as for Canonical LR.4. The goto table is constructed by taking the union of all sets of
items having the same core. If J is the union of one or more sets of LR (1) items, that is, J=I1 U I2 U … U Ik, then the cores of goto(I1,X), goto(I2,X),…, goto(Ik, X) are the same as all of them have same core. Let K be the union of all sets of items having same core as goto(I1, X). Then goto(J,X)=K.
![Page 7: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/7.jpg)
EXAMPLE
GRAMMAR:
1. S’ -> S
2. S -> CC
3. C -> cC
4. C -> d
![Page 8: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/8.jpg)
SET OF ITEMS:
I0 : S’ -> .S, $ S -> .CC, $ C -> .c C, c /d C -> .d, c /d
I1: S’ -> S., $
I2: S -> C.C, $ C -> .Cc, $ C -> .d, $
I3: C -> c. C, c /d C -> .Cc, c /d
C -> .d, c /d
I4: C -> d., c /d
I5: S -> CC., $
I6: C -> c.C, $ C -> .cC, $ C -> .d, $
I7: C -> d., $
I8: C -> cC., c /d
I9: C -> cC., $
![Page 9: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/9.jpg)
The goto graph:
![Page 10: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/10.jpg)
CANONICAL PARSING TABLE
![Page 11: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/11.jpg)
LALR PARSER
Merge the Cores:
• 3 & 6
• 4 & 7
• 8 & 9
![Page 12: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/12.jpg)
LALR PARSING TABLE
![Page 13: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/13.jpg)
SHIFT-REDUCE CONFLICT
COMPARISON OF LR (1) AND LALR:1. If LR (1) has shift-reduce conflict then LALR will
also have it.2. If LR (1) does not have shift-reduce conflict LALR
will also not have it.3. Any shift-reduce conflict which can be removed by
LR (1) can also be removed by LALR.4. For cases where there are no common cores SLR and
LALR produce same parsing tables.
![Page 14: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/14.jpg)
SHIFT-REDUCE CONFLICT
LR(1) Parser-
How Does a shift-Reduce conflict arises?
IA : { E->α..βc, F->γ. } & FOLLOW(F)={β}
For Eg. I3 : {E->L.=R , R-> L. } & Follow(E)={=}
For Solving this Problem by LALR Parser:
• IA={E->α.βc,ω1 ; F->γ.,δ1}
• IB={E->α.βc,ω2 ; F->γ.,δ2}
• IAB={E->α.βc,ω1|ω2 ; F->γ.,δ1|δ2}
![Page 15: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/15.jpg)
SHIFT-REDUCE CONFLICT
COMPARISON OF SLR AND LALR:1. If SLR has shift-reduce conflict then LALR may or may not
remove it.2. SLR and LALR tables for a grammar always have same number of
states.
Hence, LALR parsing is the most suitable for parsing general programming languages.The table size is quite small as compared to LR (1) , and by carefully designing the grammar it can be made free of conflicts. For example, in a language like Pascal LALR table will have few hundred states, but a Canonical LR will have thousands of states. So it is more convenient to use an LALR parsing.
![Page 16: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/16.jpg)
REDUCE-REDUCE CONFLICT
The Reduce-Reduce conflicts still might just remain .This claim may be better comprehended if we take the example of the following grammar:
• IA={ A->γ1.,δ1 ; ->γ2.,δ2}If δ1≠δ2 then NO reduce-reduce conflict.(mutual exclusive sets)• IA={ A->γ1.,δ1 ; ->γ2.,δ2}• IB={ A->γ1.,δ3 ; B->γ2.,δ4}• IAB={ A->γ1.,δ1|δ3 ; B->γ2.,δ2|δ4}Conflicts arises when δ1=δ4 or δ2=δ3.
![Page 17: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/17.jpg)
Example of R-R Conflict
• S'-> S
• S -> aAd
• S -> bBd
• S -> aBe
• S -> bAe
• A -> c
• B -> c
![Page 18: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/18.jpg)
Generating the LR (1) items for the above grammar
• I0 : S’-> .S , $• S-> . aAd, $• S-> . bBd, $ • S-> . aBe, $• S-> . bAe, $• I1: S’-> S ., $ • I2: S-> a . Ad, $• S-> a . Be, $ • A-> .c, d • B->.c, e • I3: S-> b . Bd, $• S-> b . Ae, $ • A->.c, e• B->.c,d• I4: S->aA.d, $
![Page 19: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/19.jpg)
Generating the LR (1) items for the above grammar
• I5: S-> aB.e,$• I6: A->c. , d ; B->c. , e • I7: S->bB.d, $ • I8: S->bA.e, $• I9: B->c. ,d ; A->c. , e • I10: S->aAd. , $ • I11: S->aBe., $• I12: S->bBd., $• I13: S->aBe., $
• The underlined items are of our interest. We see that when we make the Parsing table for LR (1), we will get something like this…
![Page 20: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/20.jpg)
The LR (1) Parsing Table
![Page 21: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/21.jpg)
The LALR Parsing table
LR(1) Parsing table on reduction to the LALR parsing table
![Page 22: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/22.jpg)
Conclusion
• So, we find that the LALR gains reduce-reduce conflict whereas the corresponding LR (1) counterpart was void of it. This is a proof enough that LALR is less potent than LR (1).
• But, since we have already proved that the LALR is void of shift-reduce conflicts (given that the corresponding LR(1) is devoid of the same), whereas SLR (or LR (0)) is not necessarily void of shift-reduce conflict, the LALR grammar is more potent than the SLR grammar
![Page 23: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/23.jpg)
Conclusion
SHIFT-REDUCE CONFLICT present in
SLR (Some of them are solved in….)
LR (1) (All those solved are preserved in…)
LALR
So, we have answered all the queries on LALR that we raised intuitively.
![Page 24: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/24.jpg)
24
Shift-Reduce Parsers
• Reviewing some technologies:– Phrase– Simple phrase– Handle of a sentential form
S
A b C
b C a C
b C b a C
A sentential form
handle Simple phrase
![Page 25: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/25.jpg)
25
Shift-reduce parser
• A parse stack– Initially empty, contains symbols already parsed
• Elements in the stack are not terminal or nonterminal symbols
– The parse stack catenated with the remaining input always represents a right sentential form
– Tokens are shifted onto the stack until the top of the stack contains the handle of the sentential form
![Page 26: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/26.jpg)
26
Shift-reduce parser
• Two questions1. Have we reached the end of handles and how
long is the handle?
2. Which nonterminal does the handle reduce to?
• We use tables to answer the questions– ACTION table– GOTO table
![Page 27: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/27.jpg)
27
Shift-reduce parser
• LR parsers are driven by two tables:– Action table, which specifies the actions to take
• Shift, reduce, accept or error
– Goto table, which specifies state transition
• We push states, rather than symbols onto the stack
• Each state represents the possible subtree of the parse tree
![Page 28: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/28.jpg)
28
Shift-reduce parser
![Page 29: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/29.jpg)
29
![Page 30: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/30.jpg)
30
![Page 31: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/31.jpg)
31
<program>
begin <stmts> end $
SimpleStmt ; <stmts>
SimpleStmt ; <stmts>
R 4
R 2
R 2
![Page 32: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/32.jpg)
32
LR Parsers• LR(1):
– left-to-right scanning– rightmost derivation(reverse)– 1-token lookahead
• LR parsers are deterministic– no backup or retry parsing actions
• LR(k) parsers – decide the next action by examining the tokens
already shifted and at most k lookahead tokens– the most powerful of deterministic bottom-up
parsers with at most k lookahead tokens.
![Page 33: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/33.jpg)
33
LR(0) Parsing
• A production has the form – AX1X2…Xj
• By adding a dot, we get a configuration (or an item)– A•X1X2…Xj
– AX1X2…Xi • Xi+1 … Xj
– AX1X2…Xj •
• The • indicates how much of a RHS has been shifted into the stack.
![Page 34: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/34.jpg)
34
LR(0) Parsing
• An item with the • at the end of the RHS
– AX1X2…Xj •
– indicates (or recognized) that RHS should be reduced to LHS
• An item with the • at the beginning of RHS– A•X1X2…Xj
– predicts that RHS will be shifted into the stack
![Page 35: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/35.jpg)
35
LR(0) Parsing
• An LR(0) state is a set of configurations– This means that the actual state of LR(0)
parsers is denoted by one of the items.
• The closure0 operation:– if there is an configuration B • A in the set
then add all configurations of the form A • to the set.
• The initial configuration– s0 = closure0({S • $})
![Page 36: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/36.jpg)
36
LR(0) Parsing
![Page 37: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/37.jpg)
37
![Page 38: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/38.jpg)
38
LR(0) Parsing• Given a configuration set s, we can compute its
successor, s', under a symbol X– Denoted go_to0(s,X)=s'
![Page 39: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/39.jpg)
39
LR(0) Parsing• Characteristic finite state machine (CFSM)
– It is a finite automaton, p.148, para. 2.– Identifying configuration sets and successor operation with CFSM
states and transitions
![Page 40: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/40.jpg)
40
LR(0) Parsing• For example, given grammar G2
S'S$SID|
![Page 41: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/41.jpg)
41
LR(0) Parsing• CFSM is the goto table of LR(0) parsers.
![Page 42: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/42.jpg)
42
![Page 43: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/43.jpg)
43
LR(0) Parsing• Because LR(0) uses no lookahead, we must
extract the action function directly from the configuration sets of CFSM
• Let Q={Shift, Reduce1, Reduce2 , …, Reducen}– There are n productions in the CFG
• S0 be the set of CFSM states
– P:S02Q
• P(s)={Reducei | B • s and production i is
B } (if A • a s for a Vt Then {Shift} Else )
![Page 44: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/44.jpg)
44
LR(0) Parsing• G is LR(0) if and only if s S0 |P(s)|=1
• If G is LR(0), the action table is trivially extracted from P– P(s)={Shift} action[s]=Shift
– P(s)={Reducej}, where production j is the augmenting production, action[s]=Accept
– P(s)={Reducei}, ij, action[s]=Reducei
– P(s)= action[s]=Error
![Page 45: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/45.jpg)
45
• Consider G1
SE$EE+T | TTID|(E)
CFSM for G1
![Page 46: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/46.jpg)
46
LR(0) Parsing
• Any state s S0 for which |P(s)|>1 is said to be inadequate
• Two kinds of parser conflicts create inadequacies in configuration sets– Shift-reduce conflicts
– Reduce-reduce conflicts
![Page 47: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/47.jpg)
47
LR(0) Parsing• It is easy to introduce inadequacies in
CFSM states– Hence, few real grammars are LR(0). For
example,• Consider -productions
– The only possible configuration involving a -production is of the form A •
– However, if A can generate any terminal string other than , then a shift action must also be possible (First(A))
• LR(0) parser will have problems in handling operator precedence properly
![Page 48: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/48.jpg)
48
LR(1) Parsing
• An LR(1) configuration, or item is of the form– AX1X2…Xi • Xi+1 … Xj, l where l Vt{}
• The look ahead commponent l represents a possible lookahead after the entire right-hand side has been matched
• The appears as lookahead only for the augmenting production because there is no lookahead after the endmarker
![Page 49: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/49.jpg)
49
LR(1) Parsing
• We use the following notation to represent the set of LR(1) configurations that shared the same dotted production
AX1X2…Xi • Xi+1 … Xj, {l1…lm}
={AX1X2…Xi • Xi+1 … Xj, l1}
{AX1X2…Xi • Xi+1 … Xj, l2}
…
{AX1X2…Xi • Xi+1 … Xj, lm}
![Page 50: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/50.jpg)
50
LR(1) Parsing
• There are many more distinct LR(1) configurations than LR(0) configurations.
• In fact, the major difficulty with LR(1) parsers is not their power but rather finding ways to represent them in storage-efficient ways.
![Page 51: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/51.jpg)
51
LR(1) Parsing• Parsing begins with the configuration
– closure1({S • $, {}})
![Page 52: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/52.jpg)
52
LR(1) Parsing• Consider G1
SE$EE+T | TTID|(E)
• closure1(S • E$, {})
S • E$, {}
E • E+T, {$}
E • T, {$}
E • E+T, {+}
E • T, {+}
T • ID, {+}
T • (E), {+}T • ID, {$}
T • (E), {$}
closure1(S • E$, {})={S • E$, {};E • E+T, {$+}E • T, {$+}T • ID, {$+}T • (E), {$+}}
How many configures?
![Page 53: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/53.jpg)
53
LR(1) Parsing• Given an LR(1) configuration set s, we
compute its successor, s', under a symbol X– go_to1(s,X)
![Page 54: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/54.jpg)
54
LR(1) Parsing
• We can build a finite automation that is analogue of the LR(0) CFSM – LR(1) FSM, LR(1) machine
• The relationship between CFSM and LR(1) macine– By merging LR(1) machine’s configuration
sets, we can obtain CFSM
![Page 55: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/55.jpg)
55
![Page 56: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/56.jpg)
56
• G3
SE$
EE+T|T
TT*P|P
PID|(E)
Is G3 an LR(0)Grammar?
![Page 57: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/57.jpg)
57
![Page 58: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/58.jpg)
58
LR(1) Parsing• The go_to table used to drive an LR(1) is
extracted directly from the LR(1) machine
![Page 59: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/59.jpg)
59
LR(1) Parsing• Action table is extracted directly from the
configuration sets of the LR(1) machine
• A projection function, P– P : S1Vt2Q
• S1 be the set of LR(1) machine states
• P(s,a)={Reducei | B •,a s and
production i is B } (if A • a,b s Then {Shift} Else )
![Page 60: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/60.jpg)
60
LR(1) Parsing• G is LR(1) if and only if
s S1 a Vt |P(s,a)|1
• If G is LR(1), the action table is trivially extracted from P– P(s,$)={Shift} action[s][$]=Accept
– P(s,a)={Shift}, a$ action[s][a]=Shift
– P(s,a)={Reducei}, action[s][a]=Reducei
– P(s,a)= action[s][a]=Error
![Page 61: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/61.jpg)
61
![Page 62: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/62.jpg)
62
SLR(1) Parsing
• LR(1) parsers are the most powerful class of shift-reduce parsers, using a single lookahead– LR(1) grammars exist for virtually all
programming languages– LR(1)’s problem is that the LR(1) machine
contains so many states that the go_to and action tables become prohibitively large
![Page 63: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/63.jpg)
63
SLR(1) Parsing
• In reaction to the space inefficiency of LR(1) tables, computer scientists have devised parsing techniques that are almost as powerful as LR(1) but that require far smaller tables
1. One is to start with the CFSM, and then add lookahead after the CFSM is build– SLR(1)
2. The other approach to reducing LR(1)’s space inefficiencies is to merger inessential LR(1) states– LALR(1)
![Page 64: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/64.jpg)
64
SLR(1) Parsing
• SLR(1) stands for Simple LR(1)– One-symbol lookahead – Lookaheads are not built directly into
configurations but rather are added after the LR(0) configuration sets are built
– An SLR(1) parser will perform a reduce action for configuration B • if the lookahead symbol is in the set Follow(B)
![Page 65: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/65.jpg)
65
SLR(1) Parsing
• The SLR(1) projection function, from CFSM states, – P : S0Vt2Q
– P(s,a)={Reducei | B •,a Follow(B) and
production i is B } (if A • a s for a Vt Then {Shift} Else )
![Page 66: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/66.jpg)
66
SLR(1) Parsing• G is SLR(1) if and only if
s S0 a Vt |P(s,a)|1
• If G is SLR(1), the action table is trivially extracted from P– P(s,$)={Shift} action[s][$]=Accept
– P(s,a)={Shift}, a$ action[s][a]=Shift
– P(s,a)={Reducei}, action[s][a]=Reducei
– P(s,a)= action[s][a]=Error
• Clearly SLR(1) is a proper superset of LR(0)
![Page 67: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/67.jpg)
67
SLR(1) Parsing• Consider G3
– It is LR(1) but not LR(0)
– See states 7,11
– Follow(E)={$,+,)}
SE$
EE+T|T
TT*P|P
PID|(E)
![Page 68: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/68.jpg)
68
SE$
EE+T|T
TT*P|P
PID|(E)
G3 is both SLR(1) and LR(1).
![Page 69: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/69.jpg)
69
Limitations of the SLR(1) Technique• The use of Follow sets to estimate the
lookaheads that predict reduce actions is less precise than using the exact lookaheads incorporated into LR(1) configurations– Consider G4
Elem(List, Elem)
ElemScalar
ListList,Elem
List Elem
Scalar ID
Scalar(Scalar)
Fellow(Elem)={“)”,”,”,….}
![Page 70: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/70.jpg)
70
Fellow(Elem)={“)”,”,”,….}
LR(1) lookahead for ElemScalar • is “,”
![Page 71: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/71.jpg)
71
LALR(1)
• LALR(1) parsers can be built by first constructing an LR(1) parser and then merging states– An LALR(1) parser is an LR(1) parser in which
all states that differ only in the lookahead components of the configurations are merged
– LALR is an acronym for Look Ahead LR
![Page 72: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/72.jpg)
72
The core of a configuration
• The core of the above two configurations is the same
EE+T
T T*P
T P
P id
P (E)
![Page 73: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/73.jpg)
73
States Merge• Cognate(s)={c|cs, core(s)=s}
EE+T
T T*P
T P
P id
P (E)
s
Cognate( ) =
![Page 74: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/74.jpg)
74
LALR(1)• LALR(1) machine
![Page 75: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/75.jpg)
75
LALR(1)
• The CFSM state is transformed into its LALR(1) Cognate – P : S0Vt2Q
– P(s,a)={Reducei | B •,a Cognate(s) and
production i is B } (if A • a s Then {Shift} Else )
![Page 76: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/76.jpg)
76
LALR(1) Parsing• G is LALR(1) if and only if
s S0 a Vt |P(s,a)|1
• If G is LALR(1), the action table is trivially extracted from P– P(s,$)={Shift} action[s][$]=Accept
– P(s,a)={Shift}, a$ action[s][a]=Shift
– P(s,a)={Reducei}, action[s][a]=Reducei
– P(s,a)= action[s][a]=Error
![Page 77: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/77.jpg)
77
LALR(1) Parsing• Consider G5
<stmt>ID
<stmt><var>:=<expr>
<var> ID<var> ID [<expr>] <expr><var>
• Assume statements are separated by ;’s, the grammar is not SLR(1) because; Follow(<stmt>) and
; Follow(<var>), since <expr><var>
![Page 78: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/78.jpg)
78
LALR(1) Parsing
• However, in LALR(1), if we use <var> ID the next symbol must be :=so action[ 1, := ] = reduce(<var> ID)
action[ 1, ; ] = reduce(<stmt> ID)
action[ 1,[ ] = shift
• There is no conflict.
![Page 79: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/79.jpg)
79
LALR(1) Parsing• A common technique to put an LALR(1) grammar
into SLR(1) form is to introduce a new nonterminal whose global (i.e. SLR) lookaheads more nearly correspond to LALR’s exact look aheads– Follow(<lhs>) = {:=}
<stmt>ID
<stmt><var>:=<expr>
<var> ID<var> ID [<expr>] <expr><var>
<stmt>ID
<stmt><lhs>:=<expr>
<lhs> ID<lhs> ID [expr>]
<var> ID<var> ID [expr>] <expr><var>
![Page 80: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/80.jpg)
80
LALR(1) Parsing• At times, it is the CFSM itself that is at fault.
S(Exp1)S[Exp1]S(Exp2]S[Exp2)<Exp1>ID<Exp2>ID
• A different expression nonterminal is used to allow error or warning diagnostics
![Page 81: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/81.jpg)
81
Building LALR(1) Parsers
• In the definition of LALR(1)– An LR(1) machine is first built, and then its
states are merged to form an automaton identical in structure to the CFSM
• May be quite inefficient
– An alternative is to build the CFSM first.• Then LALR(1) lookaheads are “propagated” from
configuration to configuration
![Page 82: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/82.jpg)
82
Building LALR(1) Parsers
• Propagate links:– Case 1: one configuration is created from
another in a previous state via a shift operation
A •X , L1 A X• , L2
![Page 83: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/83.jpg)
83
Building LALR(1) Parsers
• Propagate links:– Case 2: one configuration is created as the
result of a closure or prediction operation on another configuration
B •A , L1
A • , L2
L2={ x|xFirst(t) and t L1 }
![Page 84: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/84.jpg)
84
Building LALR(1) Parsers• Step 1: After the CFSM is built, we can create all
the necessary propagate links to transmit lookaheads from one configuration to another
• Step 2: spontaneous lookaheads are determined– By including in L2, for configuration A,L2, all
spontaneous lookaheads induced by configurations of the form B A,L1
• These are simply the non- values of First()
• Step 3: Then, propagate lookaheads via the propagate links– See figure 6.25
![Page 85: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/85.jpg)
85
Building LALR(1) Parsers
![Page 86: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/86.jpg)
86
![Page 87: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/87.jpg)
87
![Page 88: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/88.jpg)
88
Building LALR(1) Parsers
• A number of LALR(1) parser generators use lookahead propagation to compute the parser action table– LALRGen uses the propagation algorithm
– YACC examines each state repeatedly
• An intriguing alternative to propagating LALR lookaheads is to compute them as needed by doing a backward search through the CFSM– Read it yourself. P. 176, Para. 3
![Page 89: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/89.jpg)
89
Building LALR(1) Parsers
• An intriguing alternative to propagating LALR lookaheads is to compute them as needed by doing a backward search through the CFSM– Read it yourself. P. 176,
Para. 3
![Page 90: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/90.jpg)
90
Calling Semantic Routines in Shift-Reduce Parsers
• Shift-reduce parsers can normally handle larger classes of grammars than LL(1) parsers, which is a major reason for their popularity
• Shift-reduce parsers are not predictive, so we cannot always be sure what production is being recognized until its entire right-hand side has been matched– The semantic routines can be invoked only after a
production is recognized and reduced• Action symbols only at the extreme right end of a right-hand
side
![Page 91: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/91.jpg)
91
Calling Semantic Routines in Shift-Reduce Parsers
• Two common tricks are known that allow more flexible placement of semantic routine calls
• For example,<stmt>if <expr> then <stmts> else <stmts> end if
• We need to call semantic routines after the conditional expression else and end if are matched– Solution: create new nonterminals that generate
<stmt>if <expr> <test cond>
then <stmts> <process then part>
else <stmts> end if
<test cond><process than part>
![Page 92: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/92.jpg)
92
Calling Semantic Routines in Shift-Reduce Parsers
• If the right-hand sides differ in the semantic routines that are to be called, the parser will be unable to correctly determine which routines to invoke– Ambiguity will manifest. For example,
<stmt>if <expr> <test cond1> then <stmts> <process then part> else <stmts> end if;
<stmt>if <expr> <test cond2> then <stmts> <process then part> end if;
<test cond1><test cond2><process than part>
![Page 93: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/93.jpg)
93
Calling Semantic Routines in Shift-Reduce Parsers
• An alternative to the use of –generating nonterminals is to break a production into a number of pieces, with the breaks placed where semantic routines are required
<stmt><if head><then part><else part>
<if head>if <expr>
<then part>then <stmts>
<else part>then <stmts> end if;
– This approach can make productions harder to read but has the advantage that no –generating are needed
![Page 94: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/94.jpg)
94
Uses (and Misuses) of Controlled Ambiguity
• Research has shown that ambiguity, if controlled, can be of value in producing efficient parsers for real programming languages.
• The following grammar is not LR(1)Stmt if Expr then Stmt
Stmt if Expr then Stmt else Stmt
![Page 95: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/95.jpg)
95
Uses (and Misuses) of Controlled Ambiguity
• The following grammar is not ambiguous<stmt><matched>|<unmatched>
<matched> if <logic_expr> then <matched> else <matched>
| <any_non-if_statement>
<unmatched> if <logic_expr> then <stmt>
| if <logic_expr> then <matched> else <unmatched>
![Page 96: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/96.jpg)
96
Uses (and Misuses) of Controlled Ambiguity
• In LALRGen, when conflicts occur, we may use the option resolve to give preference to earlier productions.
• In YACC, conflicts are solved in– 1. shift is preferred to reduce;– 2. earlier productions are preferred.
• Grammars with conflicts are usually smaller.• Application: operator precedence and
associativity.
![Page 97: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/97.jpg)
97
Uses (and Misuses) of Controlled Ambiguity
• We no longer specify a parser purely in terms of the grammar it parses, but rather we explicitly include auxiliary rules to disambiguate conflicts in the grammar.– We will present how to specify auxiliary rules
in yacc when we introduce the yacc.
![Page 98: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/98.jpg)
98
Uses (and Misuses) of Controlled Ambiguity
• The precedence values assigned to operators resolve most shift-reduce conflicts of the formExpr Expr OP1 Expr
Expr Expr OP2 Expr
• Rules:– If Op1 has a higher precedence than Op2, we Reduce.– If Op2 has a higher precedence, we shift.– If Op1 and Op2 have the same precedence, we use the
associativity definitions.• Right-associative, we shift. Left-associative, we reduce. • No-associative, we signal, and error.
![Page 99: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/99.jpg)
99
EE+EEE*EEID
EE+EEE*E
EID
EE+EEE+EEE*EEID
EE*EEE+EEE*EEID
EE+EEE+EEE*E
EE*EEE+EEE*E
EID
EID
ID
E
E
E
+
*
ID
ID
![Page 100: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/100.jpg)
100
Uses (and Misuses) of Controlled Ambiguity (Cont’d)
• More example: Page 234 of “lex&yacc”
%nonassoc LOWER_THAN_ELSE%nonassoc ELSE%%
stmt: IF expr stmt %prec LOWER_THAN_ELSE | IF expr stmt ELSE stmt;
![Page 101: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/101.jpg)
101
Uses (and Misuses) of Controlled Ambiguity (Cont’d)
• If your language uses a THEN keyword (like Pascal does):
%nonassoc THAN%nonassoc ELSE%%
stmt: IF expr THAN stmt | IF expr THAN stmt ELSE stmt ;
![Page 102: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/102.jpg)
102
Optimizing Parse Tables
• A number of improvements can be made to decrease the space needs of an LR parse– Merging go_to and action tables to a single
table
![Page 103: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/103.jpg)
103
![Page 104: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/104.jpg)
104
![Page 105: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/105.jpg)
105
Optimizing Parse Tables
• Encoding parse table– As integers
• Error entries as zeros
• Reduce actions as positive integers
• Shift actions as negative integers
• Single reduce states– E.g. state 4 in Figure 6.35
• State 5 can be eliminated
![Page 106: LR(1) & LALR PARSER](https://reader038.fdocuments.us/reader038/viewer/2022102406/56812c4b550346895d90d04d/html5/thumbnails/106.jpg)
106
Optimizing Parse Tables
• See figure 6.36