Chapter 5: LL(1) Grammars and Parsers
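• Naming of parsing techniques
– Techniques are named after the way the token sequence is parsed
– L: leftmost derivation, R: rightmost derivation (the first L in both names means the input is scanned left to right)
• Top-down → LL
• Bottom-up → LR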
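The LL(1) Predict Function
• Given the productions
$A \to \alpha_1$
$A \to \alpha_2$
…
$A \to \alpha_n$
• During a (leftmost) derivation
$\ldots A \ldots \Rightarrow \ldots \alpha_1 \ldots$ or
$\ldots \alpha_2 \ldots$ or
$\ldots \alpha_n \ldots$
• Deciding which production to match
– Using lookahead symbols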
The LL(1) Predict Function
• The limitation of LL(1)
– LL(1) contains exactly those grammars that have disjoint predict sets for productions that share a common left-hand side
– Single-symbol lookahead
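The disjointness condition can be checked mechanically. The sketch below is not taken from the slides: it assumes the usual definition, Predict(A → α) = First(α), plus Follow(A) when α can derive λ, and it uses a hypothetical toy grammar. The function names `predict_sets` and `is_ll1` are illustrative only.

```python
from itertools import combinations

# Hypothetical toy grammar; an empty right-hand side stands for lambda.
GRAMMAR = [
    ("E", ("T", "E'")),
    ("E'", ("+", "T", "E'")),
    ("E'", ()),
    ("T", ("id",)),
]


def predict_sets(grammar, start):
    """Return one predict set per production, in grammar order."""
    nonterminals = {lhs for lhs, _ in grammar}
    nullable = set()
    first = {a: set() for a in nonterminals}
    follow = {a: set() for a in nonterminals}
    follow[start].add("$")  # $ is the end-of-file token

    def first_of(symbols):
        """First set of a symbol string, and whether it can derive lambda."""
        result = set()
        for sym in symbols:
            if sym not in nonterminals:  # a terminal starts the string
                result.add(sym)
                return result, False
            result |= first[sym]
            if sym not in nullable:
                return result, False
        return result, True

    changed = True
    while changed:  # iterate until First, Follow and nullable stop growing
        changed = False
        for lhs, rhs in grammar:
            f, empty = first_of(rhs)
            if not f <= first[lhs] or (empty and lhs not in nullable):
                first[lhs] |= f
                if empty:
                    nullable.add(lhs)
                changed = True
            for i, sym in enumerate(rhs):
                if sym in nonterminals:
                    f, empty = first_of(rhs[i + 1:])
                    new = (f | follow[lhs]) if empty else f
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True

    result = []
    for lhs, rhs in grammar:
        f, empty = first_of(rhs)
        result.append((f | follow[lhs]) if empty else f)
    return result


def is_ll1(grammar, start):
    """True if productions sharing a left-hand side have disjoint predict sets."""
    predicts = predict_sets(grammar, start)
    for (i, (lhs_i, _)), (j, (lhs_j, _)) in combinations(enumerate(grammar), 2):
        if lhs_i == lhs_j and predicts[i] & predicts[j]:
            return False
    return True
```

For the toy grammar the two `E'` productions get the predict sets `{"+"}` and `{"$"}`, so `is_ll1(GRAMMAR, "E")` holds; two productions such as S → a b and S → a c share the lookahead `a` and fail the check.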
Not in extended BNF form
$: end-of-file token
The LL(1) Parse Table
• The parsing information contained in the Predict function can be conveniently represented in an LL(1) parse table
$T : V_n \times V_t \to P \cup \{\text{Error}\}$
where $P$ is the set of all productions
• The definition of $T$
– $T[A][t] = A \to X_1 \ldots X_m$ if $t \in \text{Predict}(A \to X_1 \ldots X_m)$
– $T[A][t] = \text{Error}$ otherwise
The LL(1) Parse Table
• A grammar G is LL(1) if and only if all entries in T contain a unique prediction or an error flag
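As a rough illustration of this test, the sketch below fills a table from predict sets and reports any entry that would need two productions. The grammar is the labeled-statement grammar discussed later in this chapter; the predict sets were worked out by hand for this example, and `build_table` is a hypothetical helper, not a routine from the text.

```python
# Productions 1-4 of the labeled-statement grammar; () is the lambda production.
PRODUCTIONS = [
    ("<stmt>", ("<label>", "<unlabeled stmt>")),
    ("<label>", ("ID", ":")),
    ("<label>", ()),
    ("<unlabeled stmt>", ("ID", ":=", "<exp>", ";")),
]
# Hand-computed predict sets, one per production (an assumption of this sketch).
PREDICT = [{"ID"}, {"ID"}, {"ID"}, {"ID"}]


def build_table(productions, predict):
    """Fill T[A][t] with production numbers; absent entries mean Error.

    Returns (table, conflicts), where each conflict is
    (nonterminal, terminal, production already stored, clashing production).
    """
    table = {}
    conflicts = []
    numbered = enumerate(zip(productions, predict), start=1)
    for number, ((lhs, _), lookaheads) in numbered:
        for terminal in sorted(lookaheads):
            key = (lhs, terminal)
            if key in table:
                conflicts.append((lhs, terminal, table[key], number))
            else:
                table[key] = number
    return table, conflicts


table, conflicts = build_table(PRODUCTIONS, PREDICT)
# The grammar is LL(1) exactly when the conflict list is empty.
print(conflicts)
```

Here the only conflict is at T[<label>][ID], where productions 2 and 3 compete, so this grammar is not LL(1).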
Building Recursive Descent Parsers from LL(1) Tables
• Similar to the implementation of a scanner, there are two kinds of parsers
– Built-in
• The parsing decisions recorded in LL(1) tables can be hardwired into the parsing procedures used by recursive descent parsers
– Table-driven
Building Recursive Descent Parsers from LL(1) Tables
• The form of a parsing procedure:
– The name of the nonterminal the parsing procedure handles
– A sequence of parsing actions
Building Recursive Descent Parsers from LL(1) Tables
• Example of a parsing procedure for <statement> in Micro
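The figure for this slide did not survive in the transcript, so the following is only a sketch of what such a procedure might look like. It assumes a simplified statement grammar (an assignment `ID := <expression> ;` or `read ( <id list> ) ;`), made-up token names, and a stand-in `expression` that accepts a single ID or literal. The point is that the branch chosen on the lookahead is the hardwired table entry.

```python
class ParseError(Exception):
    """Raised when the lookahead token fits no prediction."""


class MicroStatementParser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # $ marks end of input
        self.pos = 0

    def next_token(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.next_token() != expected:
            raise ParseError(f"expected {expected}, got {self.next_token()}")
        self.pos += 1

    def statement(self):
        # The branch taken corresponds to the entry T[<statement>][lookahead].
        lookahead = self.next_token()
        if lookahead == "ID":
            self.match("ID")
            self.match("ASSIGN")
            self.expression()
            self.match("SEMICOLON")
        elif lookahead == "READ":
            self.match("READ")
            self.match("LPAREN")
            self.id_list()
            self.match("RPAREN")
            self.match("SEMICOLON")
        else:
            raise ParseError(f"no production for <statement> on {lookahead}")

    def id_list(self):
        self.match("ID")
        while self.next_token() == "COMMA":
            self.match("COMMA")
            self.match("ID")

    def expression(self):
        # Simplified stand-in: a single ID or INTLITERAL.
        if self.next_token() in ("ID", "INTLITERAL"):
            self.match(self.next_token())
        else:
            raise ParseError("expected an expression")
```

Calling `MicroStatementParser(["ID", "ASSIGN", "INTLITERAL", "SEMICOLON"]).statement()` consumes all four tokens, while a statement starting with `ASSIGN` raises `ParseError`.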
Building Recursive Descent Parsers from LL(1) Tables
• An algorithm that automatically creates parsing procedures like the one in Figure 5.6 from an LL(1) table
Building Recursive Descent Parsers from LL(1) Tables
• The data structure for describing grammars
– Names of the symbols in the grammar
Building Recursive Descent Parsers from LL(1) Tables
• gen_actions()
– Takes the grammar symbols and generates the actions necessary to match them in a recursive descent parse
An LL(1) Parser Driver
• Rather than using the LL(1) table to build parsing procedures, it is possible to use the table in conjunction with a driver program to form an LL(1) parser
• Smaller and faster than a corresponding recursive descent parser
• Changing a grammar and building a new parser is easy
– New LL(1) tables are computed and substituted for the old tables
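A minimal sketch of such a driver follows. It is not the driver from the text's figure: the balanced-parentheses grammar, its hand-built table, and the name `ll1_driver` are assumptions made for illustration. The driver itself contains no grammar knowledge, which is why swapping in new tables is enough to change the language.

```python
class ParseError(Exception):
    """Raised when the table entry is Error or a terminal fails to match."""


# Hypothetical grammar for balanced parentheses; () is the empty right-hand side.
PRODUCTIONS = {
    1: ("P", ("S", "$")),
    2: ("S", ("(", "S", ")", "S")),
    3: ("S", ()),
}
# T[A][t] -> production number; a missing entry means Error.
TABLE = {
    ("P", "("): 1, ("P", "$"): 1,
    ("S", "("): 2, ("S", ")"): 3, ("S", "$"): 3,
}


def ll1_driver(table, productions, start, tokens):
    """Parse tokens (ending in "$") and return the productions applied, in order."""
    nonterminals = {lhs for lhs, _ in productions.values()}
    stack = [start]
    pos = 0
    applied = []
    while stack:
        top = stack[-1]
        lookahead = tokens[pos] if pos < len(tokens) else None
        if top in nonterminals:
            number = table.get((top, lookahead))
            if number is None:
                raise ParseError(f"T[{top}][{lookahead}] = Error")
            stack.pop()
            # Push the right-hand side in reverse so its first symbol is on top.
            stack.extend(reversed(productions[number][1]))
            applied.append(number)
        elif top == lookahead:
            stack.pop()  # terminal on the stack matches the input
            pos += 1
        else:
            raise ParseError(f"expected {top}, found {lookahead}")
    if pos != len(tokens):
        raise ParseError("input remains after the parse stack emptied")
    return applied
```

For the input `["(", ")", "$"]` the driver applies productions 1, 2, 3, 3, which is the leftmost derivation of the string; the input `["(", "$"]` raises `ParseError`.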
Example: $A \to aBcD$
The driver replaces A on the parse stack with D c B a (the right-hand side is pushed in reverse, so that a ends up on top)
LL(1) Action Symbols
• During parsing, the appearance of an action symbol in a production will serve to initiate the corresponding semantic action: a call to the corresponding semantic routine.
• For example, gen_actions("ID:=<expression> #gen_assign;") generates
match(ID); match(ASSIGN); expression(); gen_assign(); match(semicolon);
• What are action symbols? (See the next slide.)
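One way to picture the translation in this example is the sketch below. It is an assumption-laden reconstruction, not the routine from the text: the tokenizing regular expression and the `TERMINAL_NAMES` mapping (`:=` to `ASSIGN`, `;` to `semicolon`) are invented here so that the output reproduces the action string shown on the slide.

```python
import re

# Assumed mapping from punctuation to the token names used in match(...).
TERMINAL_NAMES = {":=": "ASSIGN", ";": "semicolon"}


def gen_actions(rhs):
    """Translate a production's right-hand side into parsing actions."""
    actions = []
    # Symbols: <nonterminal>, #action_symbol, :=, a word, or one punctuation mark.
    for symbol in re.findall(r"<[^>]+>|#\w+|:=|\w+|\S", rhs):
        if symbol.startswith("<"):
            # Nonterminal: call its parsing procedure.
            actions.append(f"{symbol[1:-1].replace(' ', '_')}();")
        elif symbol.startswith("#"):
            # Action symbol: call the semantic routine of the same name.
            actions.append(f"{symbol[1:]}();")
        else:
            # Terminal: match it against the input.
            actions.append(f"match({TERMINAL_NAMES.get(symbol, symbol)});")
    return "".join(actions)


print(gen_actions("ID:=<expression> #gen_assign;"))
```

The three branches correspond to the three kinds of symbols: nonterminals become procedure calls, terminals become `match` calls, and action symbols become calls to semantic routines.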
LL(1) Action Symbols
• The semantic routine calls pass no explicit parameters
– Necessary parameters are transmitted through a semantic stack
– Semantic stack $\neq$ parse stack
• The semantic stack is a stack of semantic records.
• Action symbols are pushed onto the parse stack
– See Figure 5.11
Difference between Fig. 5.9 and Fig. 5.11
Making Grammars LL(1)
• Not all grammars are LL(1). However, some non-LL(1) grammars can be made LL(1) by simple modifications.
• When is a grammar not LL(1)?
– When some parse table entry contains more than one production, e.g. T[<stmt>][ID] = 2, 3
• This is called a conflict, which means we do not know which production to use when <stmt> is on the stack top and ID is the next input token.
Making Grammars LL(1)
• Major LL(1) prediction conflicts
– Common prefixes
– Left recursion
• Common prefixes
<stmt> → if <exp> then <stmt list> end if ;
<stmt> → if <exp> then <stmt list> else <stmt list> end if ;
• Solution: factoring transform
– See Figure 5.12
Making Grammars LL(1)
<stmt> → if <exp> then <stmt list> <if suffix>
<if suffix> → end if ;
<if suffix> → else <stmt list> end if ;
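The factoring step that produces this grammar can be sketched as follows. This is a deliberately reduced version, not the algorithm of Figure 5.12: it only factors a prefix shared by all alternatives of one nonterminal, and the helper names and the caller-supplied suffix name are assumptions of the sketch.

```python
def common_prefix(a, b):
    """Longest common prefix of two symbol lists."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]


def left_factor(lhs, alternatives, suffix_name):
    """Factor the prefix shared by all alternatives into a new nonterminal."""
    prefix = alternatives[0]
    for alt in alternatives[1:]:
        prefix = common_prefix(prefix, alt)
    if not prefix:
        return {lhs: alternatives}  # nothing to factor
    return {
        lhs: [prefix + [suffix_name]],
        suffix_name: [alt[len(prefix):] for alt in alternatives],
    }


IF_ALTERNATIVES = [
    ["if", "<exp>", "then", "<stmt list>", "end", "if", ";"],
    ["if", "<exp>", "then", "<stmt list>", "else", "<stmt list>", "end", "if", ";"],
]
factored = left_factor("<stmt>", IF_ALTERNATIVES, "<if suffix>")
for nonterminal, alts in factored.items():
    for alt in alts:
        print(nonterminal, "->", " ".join(alt))
```

After factoring, the two `<if suffix>` alternatives start with `end` and `else`, so one token of lookahead is enough to choose between them.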
Making Grammars LL(1)
• Grammars with left-recursive productions can never be LL(1)
$A \to A\alpha$
– Why? Expanding A consumes no input, so A will again be the top stack symbol with the same lookahead, and hence the same production would be predicted forever
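Making Grammars LL(1)
• Solution: Figure 5.13
– Original productions: $A \to A\alpha$, $A \to \beta_1$, …, $A \to \beta_m$
– Replaced by: $A \to N\,T$, $N \to \beta_1$, …, $N \to \beta_m$, $T \to \alpha T$, $T \to \lambda$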
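The transformation can be sketched in a few lines. This version handles only immediate left recursion on a single nonterminal and leaves the choice of the two new nonterminal names to the caller; it is an illustration under those assumptions, not a transcription of Figure 5.13.

```python
def remove_left_recursion(lhs, alternatives, head_name, tail_name):
    """Rewrite A -> A alpha | beta as A -> N T, N -> beta, T -> alpha T | lambda.

    head_name plays the role of N and tail_name the role of T.
    An empty list stands for the lambda production.
    """
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == lhs]
    betas = [alt for alt in alternatives if not alt or alt[0] != lhs]
    if not alphas:
        return {lhs: alternatives}  # no immediate left recursion
    return {
        lhs: [[head_name, tail_name]],
        head_name: betas,
        tail_name: [alpha + [tail_name] for alpha in alphas] + [[]],
    }


# Hypothetical example: E -> E + T | T
rewritten = remove_left_recursion(
    "E", [["E", "+", "T"], ["T"]], "<E head>", "<E tail>"
)
for nonterminal, alts in rewritten.items():
    for alt in alts:
        print(nonterminal, "->", " ".join(alt) or "lambda")
```

Both grammars generate a β followed by any number of α strings, but in the rewritten one each α is produced to the right, so the parser always consumes input before predicting the tail again.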
Making Grammars LL(1)
• Other transformations may be needed
– No common prefixes, no left recursion
1 <stmt> → <label> <unlabeled stmt>
2 <label> → ID :
3 <label> → λ
4 <unlabeled stmt> → ID := <exp> ;
– Yet there is still a conflict: T[<label>][ID] = 2, 3
Making Grammars LL(1)
• Example
A: B := C ;
B := C ;
• Two tokens of lookahead seem to be needed!
• LL(2)?
Making Grammars LL(1)
• Solution: an equivalent LL(1) grammar!
<stmt> → ID <suffix>
<suffix> → : <unlabeled stmt>
<suffix> → := <exp> ;
<unlabeled stmt> → ID := <exp> ;
• Example
A: B := C ;
B := C ;
Making Grammars LL(1)
• In Ada, we may declare arrays as
A: array(I .. J, BOOLEAN)
• A straightforward grammar for array bounds
<array bound> → <expr> .. <expr>
<array bound> → ID
• Solution
<array bound> → <expr> <bound tail>
<bound tail> → .. <expr>
<bound tail> → λ
Making Grammars LL(1)
• Greibach Normal Form (GNF)
– Every production is of the form $A \to a\alpha$
• $a$ is a terminal and $\alpha$ is a (possibly empty) string of variables
– Every context-free language $L$ without $\lambda$ can be generated by a grammar in Greibach Normal Form
– Factoring of common prefixes is easy
• Given a grammar G, we can
– G ⇒ GNF ⇒ no common prefixes, no left recursion (but the result may still not be LL(1))
The If-Then-Else Problem in LL(1) Parsing
• "Dangling else" problem in Algol 60, Pascal, and C
– The else clause is optional
• $BL = \{\, [^i\, ]^j \mid i \ge j \ge 0 \,\}$
– [ stands for: if <expr> then <stmt>
– ] stands for: else <stmt>
• BL is not LL(1) and in fact not LL(k) for any k
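The If-Then-Else Problem in LL(1)
• First try
– G1:
S → [ S CL
S → λ
CL → ]
CL → λ
• G1 is ambiguous: e.g., [[] has two parse trees
– In the first, the inner S → [ S CL uses CL → ] and the outer CL → λ, so ] pairs with the inner [
– In the second, the inner S → [ S CL uses CL → λ and the outer CL → ], so ] pairs with the outer [
• Why is an ambiguous grammar not LL(1)? Try to answer this question using these two parse trees.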
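The If-Then-Else Problem in LL(1)
• Second try
– G2:
S → [ S
S → S1
S1 → [ S1 ]
S1 → λ
• G2 is not ambiguous: e.g., [[] has the single derivation
S ⇒ [ S ⇒ [ S1 ⇒ [ [ S1 ] ⇒ [ [ ]
• The problem is
– $[ \in \mathrm{First}([\,S)$ and $[ \in \mathrm{First}(S1)$
– $[[ \in \mathrm{First}_2([\,S)$ and $[[ \in \mathrm{First}_2(S1)$
– G2 is not LL(1), nor is it LL(k) for any k.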
The If-Then-Else Problem in LL(1)
• Solution: conflicts + special rules
– G3:
1 G → S ;
2 S → if S E
3 S → Other
4 E → else S
5 E → λ
• G3 is ambiguous
• We can enforce that T[E, else] = 4th rule.
• This essentially forces "else" to be matched with the nearest unpaired "if".
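The effect of forcing T[E, else] to the fourth rule can be seen in a small recursive descent sketch for G3, where the forced entry is hardwired into the procedure for E. The tuple-based tree representation and the function names are choices made for this illustration only.

```python
def peek(tokens, pos):
    """Current token, or $ once the input is exhausted."""
    return tokens[pos] if pos < len(tokens) else "$"


def parse_G(tokens):
    """Rule 1: G -> S ;"""
    tree, pos = parse_S(tokens, 0)
    if peek(tokens, pos) != ";" or pos + 1 != len(tokens):
        raise ValueError("expected a single ';' at the end of the input")
    return tree


def parse_S(tokens, pos):
    if peek(tokens, pos) == "Other":          # rule 3: S -> Other
        return "Other", pos + 1
    if peek(tokens, pos) == "if":             # rule 2: S -> if S E
        body, pos = parse_S(tokens, pos + 1)
        else_part, pos = parse_E(tokens, pos)
        return ("if", body, else_part), pos
    raise ValueError(f"unexpected token {peek(tokens, pos)}")


def parse_E(tokens, pos):
    if peek(tokens, pos) == "else":
        # Forced choice: on 'else' always take rule 4 (E -> else S),
        # even though rule 5 (E -> lambda) also predicts 'else'.
        return parse_S(tokens, pos + 1)
    return None, pos                          # rule 5: E -> lambda


tree = parse_G(["if", "if", "Other", "else", "Other", ";"])
print(tree)
```

The resulting tree is `("if", ("if", "Other", "Other"), None)`: the `else` branch is attached to the inner `if`, and the outer `if` has no else part.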
The If-Then-Else Problem in LL(1)
• An alternative solution
– Change the language
• If all if statements are terminated with an end if, or some equivalent symbol, the problem disappears.
S → if S E
S → Other
E → else S end if
E → end if
Properties of LL(1) Parsers
• A correct, leftmost parse is guaranteed
• All grammars in the LL(1) class are unambiguous
– In an ambiguous grammar, some string has two or more distinct leftmost parses
– So at some point more than one correct prediction is possible, which an LL(1) table does not allow
• All LL(1) parsers operate in linear time and, at most, linear space