0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of...
Compilation 0368-3133
Lecture 4: Syntax Analysis: Parsing
Noam Rinetzky
The Real Anatomy of a Compiler
[Figure: compiler pipeline. Source text (txt) → Process text input → characters → Lexical Analysis → tokens → Syntax Analysis → AST → Sem. Analysis → Annotated AST → Intermediate code generation → IR → Intermediate code optimization → IR → Code generation → Symbolic Instructions (SI) → Target code optimization → SI → Machine code generation → MI → Write executable output → Executable code (exe). The Lexical Analysis and Syntax Analysis stages are highlighted.]
Broad kinds of parsers
• Parsers for arbitrary grammars
  – Earley's method, CYK method
  – Usually not used in practice (though this might change)
• Top-down parsers
  – Construct the parse tree in a top-down manner
  – Find the leftmost derivation
• Bottom-up parsers
  – Construct the parse tree in a bottom-up manner
  – Find the rightmost derivation in reverse order
CFG terminology
Symbols:
  Terminals (tokens): ; := ( ) id num print
  Non-terminals: S E L
Start non-terminal: S
  Convention: the non-terminal appearing in the first derivation rule
Grammar productions (rules): N → μ
  S → S ; S
  S → id := E
  E → id
  E → num
  E → E + E
CFG terminology
• Derivation: a sequence of replacements of non-terminals using the derivation rules
• Language: the set of strings of terminals derivable from the start symbol
• Sentential form: the result of a partial derivation
  – May contain non-terminals
Derivations
• Show that a sentence ω is in a grammar G
  – Start with the start symbol
  – Repeatedly replace one of the non-terminals by a right-hand side of a production
  – Stop when the sentence contains only terminals
• Given a sentential form αNβ and a rule N → µ: αNβ => αµβ
• ω is in L(G) if S =>* ω
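A single derivation step can be sketched in a few lines of Python. This is only an illustration: the grammar is the one from the CFG terminology slide, and the names `GRAMMAR`, `derive_step` and `leftmost_derivation` are made up for this sketch.

```python
# Sentential forms are lists of symbols; the non-terminals are the keys of GRAMMAR.
GRAMMAR = {
    "S": [["S", ";", "S"], ["id", ":=", "E"]],
    "E": [["id"], ["num"], ["E", "+", "E"]],
}

def derive_step(form, index, alternative):
    """Rewrite alpha N beta into alpha mu beta, where N = form[index]."""
    nonterminal = form[index]
    if nonterminal not in GRAMMAR:
        raise ValueError(f"{nonterminal!r} is not a non-terminal")
    mu = GRAMMAR[nonterminal][alternative]
    return form[:index] + mu + form[index + 1:]

def leftmost_derivation(choices, start="S"):
    """Apply each chosen alternative to the leftmost non-terminal."""
    form = [start]
    steps = [form]
    for alternative in choices:
        index = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        form = derive_step(form, index, alternative)
        steps.append(form)
    return steps

# S => id := E => id := E + E => id := id + E => id := id + num
steps = leftmost_derivation([1, 2, 0, 1])
for form in steps:
    print(" ".join(form))
```

The last sentential form contains only terminals, so the derivation stops there and the sentence is in L(G).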
Predictive parsing
• Recursive descent
• LL(k) grammars
Predictive parsing
• Given a grammar G and a word w, attempt to derive w using G
• Idea
  – Apply a production to the leftmost non-terminal
  – Pick the production rule based on the next input token
• General grammar
  – More than one option for choosing the next production based on a token
• Restricted grammars (LL)
  – Know exactly which single rule to apply
  – May require some lookahead to decide
Boolean expressions example
Input: not ( not true or false )
E → LIT | ( E OP E ) | not E
LIT → true | false
OP → and | or | xor
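Boolean expressions example
Input: not ( not true or false )
The production to apply is known from the next token
E → LIT | ( E OP E ) | not E
LIT → true | false
OP → and | or | xor
Derivation:
E => not E
  => not ( E OP E )
  => not ( not E OP E )
  => not ( not LIT OP E )
  => not ( not true OP E )
  => not ( not true or E )
  => not ( not true or LIT )
  => not ( not true or false )
[Figure: parse tree for this derivation. The root E has children "not" and E; that E has children ( E OP E ); the left E derives not E, then LIT, then true; OP derives or; the right E derives LIT, then false.]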
Implementation via recursion
E → LIT | ( E OP E ) | not E
LIT → true | false
OP → and | or | xor

E() {
  if (current ∈ {TRUE, FALSE}) LIT();
  else if (current == LPAREN) {
    match(LPAREN);
    E(); OP(); E();
    match(RPAREN);
  }
  else if (current == NOT) { match(NOT); E(); }
  else error;
}
LIT() {
  if (current == TRUE) match(TRUE);
  else if (current == FALSE) match(FALSE);
  else error;
}
OP() {
  if (current == AND) match(AND);
  else if (current == OR) match(OR);
  else if (current == XOR) match(XOR);
  else error;
}
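The pseudocode above can be turned into a small runnable recognizer. The following Python version is a sketch under a few assumptions: tokens are whitespace-separated strings, "$" marks the end of input, and the names `BoolParser`, `ParseError` and `accepts` are invented here.

```python
class ParseError(Exception):
    pass

class BoolParser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks the end of input
        self.pos = 0

    @property
    def current(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.current != expected:
            raise ParseError(f"expected {expected!r}, found {self.current!r}")
        self.pos += 1

    def E(self):
        if self.current in ("true", "false"):    # E -> LIT
            self.LIT()
        elif self.current == "(":                 # E -> ( E OP E )
            self.match("(")
            self.E()
            self.OP()
            self.E()
            self.match(")")
        elif self.current == "not":               # E -> not E
            self.match("not")
            self.E()
        else:
            raise ParseError(f"unexpected token {self.current!r} in E")

    def LIT(self):
        if self.current in ("true", "false"):
            self.match(self.current)
        else:
            raise ParseError(f"unexpected token {self.current!r} in LIT")

    def OP(self):
        if self.current in ("and", "or", "xor"):
            self.match(self.current)
        else:
            raise ParseError(f"unexpected token {self.current!r} in OP")

def accepts(text):
    """Return True if the whole token string is derivable from E."""
    parser = BoolParser(text.split())
    try:
        parser.E()
        parser.match("$")  # no tokens may be left over
        return True
    except ParseError:
        return False

print(accepts("not ( not true or false )"))
```

Each function looks only at the current token to choose its alternative, which is exactly the "production known from the next token" property of the example grammar.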
FIRST sets
• FIRST(X) = { t | X →* t β } ∪ { ε | X →* ε }
  – FIRST(X) = all terminals that can appear first in some derivation from X
    • plus ε if ε can be derived from X
• Example:
  – FIRST(LIT) = { true, false }
  – FIRST( ( E OP E ) ) = { '(' }
  – FIRST(not E) = { not }
FIRST(α) can be defined for any sequence of symbols α
Computing FIRST sets
• FIRST(t) = { t } // t is a terminal
• ε ∈ FIRST(X) if
  – X → ε, or
  – X → A1 .. Ak and ε ∈ FIRST(Ai) for i = 1…k
• FIRST(α) ⊆ FIRST(X) if
  – X → A1 .. Ak α and ε ∈ FIRST(Ai) for i = 1…k
FIRST(X) is computed for non-terminals
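These constraints are usually solved by iterating until nothing changes. A minimal Python sketch follows; the grammar encoding (a dict from non-terminal to a list of right-hand sides, with the empty list standing for ε) and the names `compute_first` and `first_of_sequence` are assumptions of this example.

```python
EPS = "ε"

def compute_first(grammar):
    """grammar maps each non-terminal to a list of right-hand sides (lists of symbols)."""
    first = {nt: set() for nt in grammar}

    def first_of_sequence(symbols):
        result = set()
        for sym in symbols:
            # FIRST(t) = {t} for a terminal t
            sym_first = first[sym] if sym in grammar else {sym}
            result |= sym_first - {EPS}
            if EPS not in sym_first:
                return result
        result.add(EPS)  # every symbol was nullable, or the sequence is empty
        return result

    changed = True
    while changed:  # repeat until a fixed point is reached
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_sequence(rhs)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

BOOL = {
    "E": [["LIT"], ["(", "E", "OP", "E", ")"], ["not", "E"]],
    "LIT": [["true"], ["false"]],
    "OP": [["and"], ["or"], ["xor"]],
}
bool_first = compute_first(BOOL)
print(sorted(bool_first["E"]))

# A grammar with an ε alternative: X -> + E | ε
nullable_first = compute_first({"E": [["T", "X"]], "X": [["+", "E"], []], "T": [["int"]]})
print(sorted(nullable_first["X"]))
```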
Follow sets
• FOLLOW(X) = { t | S →* α X t β }
  – t is a terminal or $
FOLLOW sets: Constraints
• $ ∈ FOLLOW(S)
• FIRST(β) \ {ε} ⊆ FOLLOW(X)
  – For each A → α X β
• FOLLOW(A) ⊆ FOLLOW(X)
  – For each A → α X β and ε ∈ FIRST(β)
Example: FOLLOW sets
• E → T X        X → + E | ε
• T → ( E ) | int Y        Y → * T | ε

Non-terminal   FOLLOW
E              ), $
T              +, ), $
X              $, )
Y              +, ), $

• $ ∈ FOLLOW(S)
• FIRST(β) \ {ε} ⊆ FOLLOW(X)
  – For each A → α X β
• FOLLOW(A) ⊆ FOLLOW(X)
  – For each A → α X β and ε ∈ FIRST(β)
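The FOLLOW table for this example can be reproduced mechanically by applying the three constraints until a fixed point is reached. The Python sketch below is one way to do it; the grammar encoding and the function names are assumptions of this example, not part of the lecture.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E": [["T", "X"]],
    "X": [["+", "E"], []],           # [] stands for ε
    "T": [["(", "E", ")"], ["int", "Y"]],
    "Y": [["*", "T"], []],
}

def first_of(symbols, first):
    """FIRST of a sequence of symbols, given FIRST of the non-terminals."""
    out = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})  # a terminal's FIRST set is itself
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)
    return out

def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def compute_follow(grammar, start):
    first = compute_first(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                       # $ ∈ FOLLOW(S)
    changed = True
    while changed:
        changed = False
        for a, alternatives in grammar.items():
            for rhs in alternatives:
                for i, x in enumerate(rhs):      # production A -> α X β
                    if x not in grammar:
                        continue
                    beta_first = first_of(rhs[i + 1:], first)
                    new = beta_first - {EPS}     # FIRST(β) \ {ε} ⊆ FOLLOW(X)
                    if EPS in beta_first:        # FOLLOW(A) ⊆ FOLLOW(X)
                        new |= follow[a]
                    if not new <= follow[x]:
                        follow[x] |= new
                        changed = True
    return follow

follow = compute_follow(GRAMMAR, "E")
for nt in "ETXY":
    print(nt, sorted(follow[nt]))
```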
Prediction Table
• For each production A → α:
  – T[A, t] = α if t ∈ FIRST(α)
  – T[A, t] = α if ε ∈ FIRST(α) and t ∈ FOLLOW(A)
    • t can also be $
• T is not well defined ⇒ the grammar is not LL(1)
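As a rough illustration of the two rules, the Python sketch below fills the table for the grammar E → T X, X → + E | ε, T → ( E ) | int Y, Y → * T | ε. The FIRST(α) sets are hand-computed for this sketch and the FOLLOW sets are taken from the FOLLOW example; `build_table` is a made-up name. An entry that receives two different alternatives is reported as a conflict, which is the "T is not well defined" case.

```python
EPS = "ε"

# (A, alpha, FIRST(alpha)); the FIRST sets are hand-computed assumptions of this sketch
PRODUCTIONS = [
    ("E", ("T", "X"), {"(", "int"}),
    ("X", ("+", "E"), {"+"}),
    ("X", (), {EPS}),
    ("T", ("(", "E", ")"), {"("}),
    ("T", ("int", "Y"), {"int"}),
    ("Y", ("*", "T"), {"*"}),
    ("Y", (), {EPS}),
]
FOLLOW = {"E": {")", "$"}, "T": {"+", ")", "$"}, "X": {"$", ")"}, "Y": {"+", ")", "$"}}

def build_table(productions, follow):
    """Return (table, conflicts); the grammar is LL(1) only if conflicts is empty."""
    table, conflicts = {}, []
    for a, alpha, first_alpha in productions:
        lookaheads = first_alpha - {EPS}          # t ∈ FIRST(α)
        if EPS in first_alpha:                    # ε ∈ FIRST(α): also t ∈ FOLLOW(A)
            lookaheads |= follow[a]
        for t in sorted(lookaheads):
            if (a, t) in table and table[(a, t)] != alpha:
                conflicts.append((a, t))
            else:
                table[(a, t)] = alpha
    return table, conflicts

table, conflicts = build_table(PRODUCTIONS, FOLLOW)
print(conflicts)

# S -> A a b, A -> a | ε has a FIRST/FOLLOW conflict on (A, a)
_, bad = build_table([("A", ("a",), {"a"}), ("A", (), {EPS})], {"A": {"a"}})
print(bad)
```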
LL(k) grammars
• A grammar is in the class LL(k) when its sentences can be derived via:
  – Top-down derivation
  – Scanning the input from left to right (L)
  – Producing the leftmost derivation (L)
  – With lookahead of k tokens (k)
• A language is said to be LL(k) when it has an LL(k) grammar
LL(1) grammars
• A grammar is in the class LL(1) iff
  – For every two productions A → α and A → β we have
    • FIRST(α) ∩ FIRST(β) = {} // including ε
    • If ε ∈ FIRST(α) then FIRST(β) ∩ FOLLOW(A) = {}
    • If ε ∈ FIRST(β) then FIRST(α) ∩ FOLLOW(A) = {}
Problem: Non-LL Grammars
Back to problem 1: common prefix
term → ID | indexed_elem
indexed_elem → ID [ expr ]
• FIRST(term) = { ID }
• FIRST(indexed_elem) = { ID }
• FIRST/FIRST conflict
Solution: left factoring
• Rewrite the grammar to be in LL(1)
  term → ID | indexed_elem
  indexed_elem → ID [ expr ]
becomes
  term → ID after_ID
  after_ID → [ expr ] | ε
Intuition: just like factoring x*y + x*z into x*(y+z)
Left factoring: another example
  S → if E then S else S | if E then S | T
becomes
  S → if E then S S' | T
  S' → else S | ε
Problem: null production
S → A a b
A → a | ε

bool S() {
  return A() && match(token('a')) && match(token('b'));
}
bool A() {
  return match(token('a')) || true;
}

• What happens for input "ab"?
• What happens if you flip the order of the alternatives and try "aab"?
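Both questions can be answered by running the naive parser. The Python transliteration below is a sketch: `NaiveParser`, `accepts` and the `epsilon_first` flag (which flips the order of the alternatives of A) are names invented for this illustration.

```python
class NaiveParser:
    """Recursive descent without lookahead for S -> A a b, A -> a | ε."""

    def __init__(self, text, epsilon_first=False):
        self.text = text
        self.pos = 0
        self.epsilon_first = epsilon_first

    def match(self, ch):
        if self.pos < len(self.text) and self.text[self.pos] == ch:
            self.pos += 1
            return True
        return False

    def A(self):
        if self.epsilon_first:
            return True                   # A -> ε is tried first and always succeeds
        return self.match("a") or True    # A -> a is tried first, ε is the fallback

    def S(self):
        return self.A() and self.match("a") and self.match("b")

def accepts(text, epsilon_first=False):
    parser = NaiveParser(text, epsilon_first)
    return parser.S() and parser.pos == len(text)

print(accepts("ab"))                        # A consumes the only 'a', so S fails
print(accepts("aab"))
print(accepts("ab", epsilon_first=True))
print(accepts("aab", epsilon_first=True))   # A consumes nothing, so 'b' meets 'a'
```

Both "ab" and "aab" are in the language, yet each ordering of the alternatives rejects one of them: the parser commits to an alternative of A without looking at what follows.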
Problem: null production
S → A a b
A → a | ε
• FIRST(S) = { a }        FOLLOW(S) = { $ }
• FIRST(A) = { a, ε }     FOLLOW(A) = { a }
• FIRST/FOLLOW conflict
Back to problem 2: null production
S → A a b
A → a | ε
• FIRST(S) = { a }        FOLLOW(S) = { $ }
• FIRST(A) = { a, ε }     FOLLOW(A) = { a }
• FIRST/FOLLOW conflict
Solution: substitution
  S → A a b
  A → a | ε
Substitute A in S:
  S → a a b | a b
Left factoring:
  S → a after_A
  after_A → a b | b
Back to problem 3: Left recursion
E → E - term | term
• Left recursion cannot be handled with a bounded lookahead
• What can we do?
Left recursion removal
G1: N → Nα | β
G2: N → βN'
    N' → αN' | ε
• L(G1) = β, βα, βαα, βααα, …
• L(G2) = same
• For our 3rd example:
    E → E - term | term
  becomes
    E → term TE
    TE → - term TE | ε
p. 130
Can be done algorithmically. Problem: the grammar becomes mangled beyond recognition
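The G1 to G2 rewrite is mechanical for immediate left recursion. The Python sketch below handles only that case (a rule that refers to itself as its first symbol); the function name and the convention that an empty list stands for ε are assumptions of this example.

```python
def remove_immediate_left_recursion(nonterminal, alternatives, fresh):
    """Rewrite N -> N α | β as N -> β N', N' -> α N' | ε (N' is named by `fresh`)."""
    # α parts: left-recursive alternatives with the leading N stripped
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # β parts: alternatives that do not start with N
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: alternatives}   # nothing to do
    return {
        nonterminal: [beta + [fresh] for beta in others],
        fresh: [alpha + [fresh] for alpha in recursive] + [[]],  # [] stands for ε
    }

result = remove_immediate_left_recursion("E", [["E", "-", "term"], ["term"]], "TE")
for lhs, alternatives in result.items():
    print(lhs, "->", " | ".join(" ".join(alt) or "ε" for alt in alternatives))
```

Indirect left recursion (for example A → B x, B → A y) needs the more general algorithm and is not covered by this sketch.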
LL(k) Parsers
• Recursive descent
  – Manual construction
  – Uses recursion
• Wanted
  – A parser that can be generated automatically
  – Does not use recursion
LL(k) parsing via pushdown automata
• The pushdown automaton uses
  – Prediction stack
  – Input stream
  – Transition table
    • non-terminals x tokens -> production alternative
    • The entry indexed by non-terminal N and token t contains the alternative of N that must be predicted when the current input starts with t
LL(k) parsing via pushdown automata
• Two possible moves
  – Prediction
    • When the top of the stack is a non-terminal N, pop N and look up table[N,t]. If table[N,t] is not empty, push table[N,t] on the prediction stack; otherwise, syntax error
  – Match
    • When the top of the prediction stack is a terminal T, it must be equal to the next input token t. If (t == T), pop T and consume t. If (t ≠ T), syntax error
• Parsing terminates when the prediction stack is empty
  – If the input is empty at that point, success. Otherwise, syntax error
Example transition table
(1) E → LIT
(2) E → ( E OP E )
(3) E → not E
(4) LIT → true
(5) LIT → false
(6) OP → and
(7) OP → or
(8) OP → xor

Rows: non-terminals. Columns: input tokens. Each entry: which rule should be used.

        (    )    not   true  false  and   or    xor   $
E       2         3     1     1
LIT                     4     5
OP                                   6     7     8
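Model of non-recursive predictive parser
[Figure: a predictive parsing program reads the input (a + b $), maintains a stack (X Y Z $, with X on top), consults the parsing table, and produces the output.]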
Running parser example
A → aAb | c
Input: aacbb$

Parsing table:
      a            b     c
A     A → aAb            A → c

Input suffix   Stack content   Move
aacbb$         A$              predict(A,a) = A → aAb
aacbb$         aAb$            match(a,a)
acbb$          Ab$             predict(A,a) = A → aAb
acbb$          aAbb$           match(a,a)
cbb$           Abb$            predict(A,c) = A → c
cbb$           cbb$            match(c,c)
bb$            bb$             match(b,b)
b$             b$              match(b,b)
$              $               match($,$): success
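The trace above can be reproduced with a short table-driven loop. This Python sketch assumes single-character tokens and pushes "$" at the bottom of the prediction stack, as the trace does; `TABLE`, `NONTERMINALS` and `ll1_parse` are names chosen for this illustration.

```python
# Parsing table for A -> aAb | c: (non-terminal, token) -> right-hand side
TABLE = {("A", "a"): ["a", "A", "b"], ("A", "c"): ["c"]}
NONTERMINALS = {"A"}

def ll1_parse(tokens, start="A"):
    """Return (accepted, trace) for a sequence of single-character tokens."""
    stack = ["$", start]            # the top of the stack is the end of the list
    tokens = list(tokens) + ["$"]
    pos = 0
    trace = []
    while stack:
        top, t = stack[-1], tokens[pos]
        if top in NONTERMINALS:     # prediction move
            rhs = TABLE.get((top, t))
            if rhs is None:
                trace.append(f"predict({top},{t}) = ERROR")
                return False, trace
            trace.append(f"predict({top},{t}) = {top} -> {''.join(rhs)}")
            stack.pop()
            stack.extend(reversed(rhs))   # leftmost symbol ends up on top
        elif top == t:              # match move
            trace.append(f"match({top},{t})")
            stack.pop()
            pos += 1
        else:
            trace.append(f"match({top},{t}) = ERROR")
            return False, trace
    return pos == len(tokens), trace

ok, trace = ll1_parse("aacbb")
print(ok)
for move in trace:
    print(move)

bad_ok, bad_trace = ll1_parse("abcbb")
print(bad_ok, bad_trace[-1])
```

No recursion is needed: the prediction stack plays the role that the call stack plays in recursive descent.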
Errors
Handling Syntax Errors
• Report and locate the error
• Diagnose the error
• Correct the error
• Recover from the error in order to discover more errors
  – without reporting too many "strange" errors
Error Diagnosis
• Line number
  – may be far from the actual error
• The current token
• The expected tokens
• Parser configuration
Error Recovery
• Becomes less important in interactive environments
• Example heuristics:
  – Search for a semicolon and ignore the statement
  – Try to "replace" tokens for common errors
  – Refrain from reporting 3 subsequent errors
• Globally optimal solutions
  – For every input w, find a valid program w' with a "minimal distance" from w
Illegal input example
A → aAb | c
Input: abcbb$

Parsing table:
      a            b     c
A     A → aAb            A → c

Input suffix   Stack content   Move
abcbb$         A$              predict(A,a) = A → aAb
abcbb$         aAb$            match(a,a)
bcbb$          Ab$             predict(A,b) = ERROR
Error handling in LL parsers
S → a c | b S
Input: c$

Parsing table:
      a           b           c
S     S → a c     S → b S

Input suffix   Stack content   Move
c$             S$              predict(S,c) = ERROR

• Now what?
  – Predict b S anyway: "missing token b inserted in line XXX"
Error handling in LL parsers
S → a c | b S
Input: c$

Parsing table:
      a           b           c
S     S → a c     S → b S

Input suffix   Stack content   Move
bc$            S$              predict(S,b) = S → b S
bc$            bS$             match(b,b)
c$             S$              Looks familiar?

• Result: infinite loop
Error handling and recovery
• x = a * (p + q * ( -b * (r-s);
• Where should we report the error?
• The valid prefix property
The Valid Prefix Property
• For every prefix of tokens
  – t1, t2, …, ti that the parser identifies as legal:
    • there exist tokens ti+1, ti+2, …, tn such that t1, t2, …, tn is a syntactically valid program
• If every token is considered as a single character:
  – For every prefix word u that the parser identifies as legal, there exists w such that u.w is a valid program
Recovery is tricky
• Heuristics for dropping tokens, skipping to a semicolon, etc.
Building the Parse Tree
Adding semantic actions
• Can add an action to perform on each production rule
• Can build the parse tree
  – Every function returns an object of type Node
  – Every Node maintains a list of children
  – Function calls can add new children
Building the parse tree
Node E() {
  result = new Node(); result.name = "E";
  if (current ∈ {TRUE, FALSE}) // E → LIT
    result.addChild(LIT());
  else if (current == LPAREN) { // E → ( E OP E )
    result.addChild(match(LPAREN));
    result.addChild(E());
    result.addChild(OP());
    result.addChild(E());
    result.addChild(match(RPAREN));
  }
  else if (current == NOT) { // E → not E
    result.addChild(match(NOT));
    result.addChild(E());
  }
  else error;
  return result;
}
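A runnable counterpart of this pseudocode might look as follows in Python. It is a sketch under assumptions: tokens are whitespace-separated strings, `match` returns a leaf Node for the consumed token, and the names `Node`, `TreeBuilder`, `parse` and `leaves` are chosen here for illustration.

```python
class ParseError(Exception):
    pass

class Node:
    def __init__(self, name):
        self.name = name
        self.children = []

    def add_child(self, child):
        self.children.append(child)

class TreeBuilder:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks the end of input
        self.pos = 0

    @property
    def current(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.current != expected:
            raise ParseError(f"expected {expected!r}, found {self.current!r}")
        self.pos += 1
        return Node(expected)               # leaf node for the matched token

    def one_of(self, name, choices):
        """Shared body of LIT and OP: a node with a single token child."""
        if self.current not in choices:
            raise ParseError(f"unexpected token {self.current!r} in {name}")
        result = Node(name)
        result.add_child(self.match(self.current))
        return result

    def E(self):
        result = Node("E")
        if self.current in ("true", "false"):     # E -> LIT
            result.add_child(self.one_of("LIT", ("true", "false")))
        elif self.current == "(":                  # E -> ( E OP E )
            result.add_child(self.match("("))
            result.add_child(self.E())
            result.add_child(self.one_of("OP", ("and", "or", "xor")))
            result.add_child(self.E())
            result.add_child(self.match(")"))
        elif self.current == "not":                # E -> not E
            result.add_child(self.match("not"))
            result.add_child(self.E())
        else:
            raise ParseError(f"unexpected token {self.current!r} in E")
        return result

def parse(text):
    builder = TreeBuilder(text.split())
    tree = builder.E()
    builder.match("$")
    return tree

def leaves(node):
    """Tokens at the leaves, left to right."""
    if not node.children:
        return [node.name]
    return [leaf for child in node.children for leaf in leaves(child)]

tree = parse("not ( not true or false )")
print([child.name for child in tree.children])
print(" ".join(leaves(tree)))
```

Reading the leaves from left to right gives back the input, which is a quick sanity check that the tree is a parse tree for it.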
Parser for Fully Parenthesized Expressions
static int Parse_Expression(Expression **expr_p) {
    Expression *expr = *expr_p = new_expression();

    /* try to parse a digit */
    if (Token.class == DIGIT) {
        expr->type = 'D'; expr->value = Token.repr - '0';
        get_next_token();
        return 1;
    }

    /* try to parse a parenthesized expression */
    if (Token.class == '(') {
        expr->type = 'P'; get_next_token();
        if (!Parse_Expression(&expr->left)) Error("missing expression");
        if (!Parse_Operator(&expr->oper)) Error("missing operator");
        /* right operand; assumes Expression has a right field mirroring left */
        if (!Parse_Expression(&expr->right)) Error("missing expression");
        if (Token.class != ')') Error("missing )");
        get_next_token();
        return 1;
    }

    /* neither alternative applies */
    return 0;
}
Bottom-Up Parsing
Bottom-Up Parsing
• Goal: Build a parse tree
  – Report an error if the input is not a legal program
• How:
  – Read the input left-to-right
  – Construct a subtree for the first left-most tree node whose children have been constructed
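Bottom-up parsing
E → E * T
E → T
T → T + F
T → F
F → id
F → num
F → ( E )
(Non-standard precedence)
[Figure: parse tree for the input 1 + 2 * 3, built bottom-up. With this grammar + binds tighter than *, so the tree groups the input as (1 + 2) * 3.]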
Bottom-up parsing: LR(k) Grammars
• A grammar is in the class LR(k) when its sentences can be derived via:
  – Bottom-up derivation
  – Scanning the input from left to right (L)
  – Producing the rightmost derivation (R)
    • In reverse order
  – With lookahead of k tokens (k)
• A language is said to be LR(k) if it has an LR(k) grammar
• The simplest case is LR(0), which we will discuss
Terminology: Reductions & Handles
• The opposite of derivation is called reduction
  – Let A → α be a production rule
  – Derivation: βAµ ⇒ βαµ
  – Reduction: βαµ ⇒ βAµ
• A handle is the reduced substring
  – α is the handle for βαµ
Use Shift & Reduce
In each stage, we shift a symbol from the input to the stack, or reduce according to one of the rules.
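How does the parser know what to do?
[Figure: shift-reduce parser model. The parser reads the input token stream ( ( 23 + 7 ) * x ), i.e. LP LP Num OP Num RP OP Id RP, maintains a stack, consults the action table and the goto table, and outputs a tree with nodes Op(*), Op(+), Num(23), Num(7) and Id(b).]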
How does the parser know what to do?
• A state will keep the info gathered on handle(s)
  – A state in the "control" of the PDA
  – Also (part of) the stack alphabet
  – A state is a set of LR(0) items
• A table will tell it "what to do" based on the current state and the next token
  – The transition function of the PDA
• A stack will record the "nesting level"
  – The stack contains a sequence of prefixes of handles
Important Bottom-Up LR-Parsers
• LR(0): simplest, explains basic ideas
• SLR(1): simple, explains lookahead
• LR(1): complicated, very powerful, expensive
• LALR(1): complicated, powerful enough, used by automatic tools
LR(0) vs SLR(1) vs LR(1) vs LALR(1)
• All use shift/reduce
• Main difference: how to identify a handle
  – Technically: using different sets of states
    • More expensive ⇒ more states ⇒ more specific choice of which reduction rule to use
  – But the usage of the states is the same in all parsers
• Reduction is the same in all techniques
  – Once the handle is determined
LR(0) Parsing
LR item
N → α•β
  α: already matched
  β: to be matched (input)
Hypothesis about αβ being a possible handle: so far we've matched α, expecting to see β
Example: LR(0) Items
• All items can be obtained by placing a dot at every position for every production
• Before • = reduced
  – Matched prefix
• After • = may be reduced
  – May be matched by suffix
[The grammar (1) to (5) and its 15 LR(0) items are repeated from the previous slide.]
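The "dot at every position" rule can be sketched in a few lines of Python. This is only an illustration: the tuple representation `(lhs, rhs, dot)` and the helper names `lr0_items`, `show`, and `is_reduce_item` are assumptions made for this sketch, not part of the lecture.

```python
# Grammar from the slide; each production is (left-hand side, right-hand side).
GRAMMAR = [
    ("S", ("E", "$")),
    ("E", ("T",)),
    ("E", ("E", "+", "T")),
    ("T", ("id",)),
    ("T", ("(", "E", ")")),
]

def lr0_items(grammar):
    """Place the dot at every position 0..len(rhs) of every production."""
    return [(lhs, rhs, dot)
            for lhs, rhs in grammar
            for dot in range(len(rhs) + 1)]

def show(item):
    """Render an item in the slide's notation, e.g. 'S -> • E $'."""
    lhs, rhs, dot = item
    return f"{lhs} -> {' '.join(rhs[:dot] + ('•',) + rhs[dot:])}"

def is_reduce_item(item):
    """A reduce item has the dot at the very end; otherwise it is a shift item."""
    _, rhs, dot = item
    return dot == len(rhs)

ITEMS = lr0_items(GRAMMAR)
for number, item in enumerate(ITEMS, start=1):
    kind = "reduce" if is_reduce_item(item) else "shift"
    print(f"{number}: {show(item)}  ({kind} item)")
```

A production with n symbols on its right-hand side contributes n + 1 items, which is why the five productions above give 15 items.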
![Page 66: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/66.jpg)
LR(0) items
N → α • β    Shift item
N → αβ •    Reduce item
States are sets of items
![Page 67: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/67.jpg)
LR(0) Items
• A derivation rule with a location marker (●) is called an LR(0) item
Grammar:
E → E * B | E + B | B
B → 0 | 1
![Page 68: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/68.jpg)
PDA States
• A PDA state is a set of LR(0) items. E.g., q13 = { E → E ● * B, E → E ● + B, B → 1 ● }
• Intuitively, if we matched 1, then the state will remember the 3 possible alternative rules and where we are in each of them:
(1) E → E ● * B    (2) E → E ● + B    (3) B → 1 ●
Grammar:
E → E * B | E + B | B
B → 0 | 1
![Page 69: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/69.jpg)
LR(0) Shift/Reduce Items
N → α • β    Shift item
N → αβ •    Reduce item
![Page 70: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/70.jpg)
Intuition
• Read input tokens left-to-right and remember them in the stack
• When a right-hand side of a rule is found, remove it from the stack and replace it with the non-terminal it derives
• Remembering a token is called shift
  – Each shift moves to a state that remembers what we've seen so far
• Replacing RHS with LHS is called reduce
  – Each reduce goes to a state that determines the context of the derivation
![Page 71: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/71.jpg)
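Model of an LR parser
[Figure: the LR parser reads the input id + id + id $, keeps a stack of alternating states and symbols (terminals and non-terminals), here 0 T 2 + 7 id 5, consults the Action table and the Goto table, and produces the output.]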
![Page 72: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/72.jpg)
LR parser stack
• Sequence made of state, symbol pairs
• For instance, a possible stack for the grammar
S → E $
E → T
E → E + T
T → id
T → ( E )
could be: 0 T 2 + 7 id 5 (the stack grows to the right)
![Page 73: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/73.jpg)
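Form of LR parsing table
[Table layout: one row per state (0, 1, ...); the terminal columns hold the shift/reduce actions and the non-terminal columns hold the goto part.]
Entries:
sn – shift state n
rk – reduce by rule k
gm – goto state m
acc – accept
(empty) – error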
![Page 74: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/74.jpg)
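LR parser table example
(action columns: id, +, (, ), $; goto columns: E, T)

| STATE | id | + | ( | ) | $ | E | T |
|---|---|---|---|---|---|---|---|
| 0 | s5 | | s7 | | | g1 | g6 |
| 1 | | s3 | | | acc | | |
| 2 | | | | | | | |
| 3 | s5 | | s7 | | | | g4 |
| 4 | r3 | r3 | r3 | r3 | r3 | | |
| 5 | r4 | r4 | r4 | r4 | r4 | | |
| 6 | r2 | r2 | r2 | r2 | r2 | | |
| 7 | s5 | | s7 | | | g8 | g6 |
| 8 | | s3 | | s9 | | | |
| 9 | r5 | r5 | r5 | r5 | r5 | | |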
![Page 75: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/75.jpg)
Shift move
[Figure: the LR parsing program with state q on top of the stack, the input … a … $ with current token a, and the action/goto tables.]
• action[q, a] = sn
![Page 76: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/76.jpg)
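Result of shift
[Figure: the LR parsing program after the shift; the token a and then the state n have been pushed on top of q, so the stack is … q a n; the input is … a … $.]
• action[q, a] = sn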
![Page 77: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/77.jpg)
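Reduce move
[Figure: the LR parsing program; above state q the stack holds the 2*n entries for σ1 … σn and their states, with qn on top; the input is … a … $.]
• action[qn, a] = rk
• Production: (k) A → σ1 … σn
• Top of stack looks like q1 σ1 … qn σn for some q1 … qn
• goto[q, A] = qm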
![Page 78: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/78.jpg)
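Result of reduce move
[Figure: the LR parsing program after the reduce; the 2*n entries were popped and A, then qm, were pushed on top of q, so the stack is … q A qm; the input is … a … $.]
• action[qn, a] = rk
• Production: (k) A → σ1 … σn
• Top of stack looks like q1 σ1 … qn σn for some q1 … qn
• goto[q, A] = qm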
![Page 79: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/79.jpg)
Accept move
[Figure: the LR parsing program with state q on top of the stack and current input token a.]
If action[q, a] = accept, parsing is completed
![Page 80: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/80.jpg)
Error move
[Figure: the LR parsing program with state q on top of the stack and current input token a.]
If action[q, a] = error (usually an empty entry), parsing discovered a syntactic error
![Page 81: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/81.jpg)
Example
Z → E $
E → T | E + T
T → i | ( E )
![Page 82: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/82.jpg)
Example: parsing with LR items
Grammar: Z → E $,  E → T | E + T,  T → i | ( E )
Input: i + i $
Initial item: Z → • E $
Additional items: E → • T,  E → • E + T,  T → • i,  T → • ( E )
Why do we need these additional LR items? Where do they come from? What do they mean?
![Page 83: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/83.jpg)
ε-closure
• Given a set S of LR(0) items
• If P → α • N β is in state S
• then for each rule N → μ in the grammar, state S must also contain N → • μ

Grammar: Z → E $,  E → T | E + T,  T → i | ( E )
ε-closure({ Z → • E $ }) = { Z → • E $,  E → • T,  E → • E + T,  T → • i,  T → • ( E ) }
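The ε-closure rule can be sketched as a small worklist loop. This is a minimal sketch, not the lecture's code: it assumes items are encoded as `(production index, dot position)` pairs, and the names `GRAMMAR`, `NONTERMINALS`, and `closure` are chosen for illustration.

```python
# Grammar from the slide: Z -> E $, E -> T | E + T, T -> i | ( E )
GRAMMAR = [
    ("Z", ("E", "$")),
    ("E", ("T",)),
    ("E", ("E", "+", "T")),
    ("T", ("i",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    """Return the ε-closure of a set of items (production index, dot)."""
    result = set(items)
    worklist = list(items)
    while worklist:
        prod, dot = worklist.pop()
        rhs = GRAMMAR[prod][1]
        # The dot stands before a non-terminal N: add N -> • μ for every rule of N.
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for k, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (k, 0) not in result:
                    result.add((k, 0))
                    worklist.append((k, 0))
    return frozenset(result)

# ε-closure({ Z -> • E $ }) contains the initial item of every production.
Q0 = closure({(0, 0)})
print(sorted(Q0))
```

The loop keeps going because newly added items may themselves have the dot before a non-terminal: E → • T is what brings in T → • i and T → • ( E ).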
![Page 84: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/84.jpg)
Example: parsing with LR items
Grammar: Z → E $,  E → T | E + T,  T → i | ( E )
Input: i + i $
Items: Z → • E $,  E → • T,  E → • E + T,  T → • i,  T → • ( E )
• Items denote possible future handles
• Remember position from which we're trying to reduce
![Page 85: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/85.jpg)
Example: parsing with LR items
Grammar: Z → E $,  E → T | E + T,  T → i | ( E )
Input: i + i $
Items: Z → • E $,  E → • T,  E → • E + T,  T → • i,  T → • ( E )
Match items with current token: T → i •    Reduce item!
![Page 86: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/86.jpg)
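Example: parsing with LR items
Grammar: Z → E $,  E → T | E + T,  T → i | ( E )
Input (the first i has been reduced to T): T + i $
Items: Z → • E $,  E → • T,  E → • E + T,  T → • i,  T → • ( E )
E → T •    Reduce item!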
![Page 87: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/87.jpg)
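Example: parsing with LR items
Grammar: Z → E $,  E → T | E + T,  T → i | ( E )
Input (T has been reduced to E; partial tree E over T over i): E + i $
Items: Z → • E $,  E → • T,  E → • E + T,  T → • i,  T → • ( E )
E → T •    Reduce item!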
![Page 88: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/88.jpg)
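Example: parsing with LR items
Grammar: Z → E $,  E → T | E + T,  T → i | ( E )
Input (partial tree E over T over i): E + i $
Items: Z → • E $,  E → • T,  E → • E + T,  T → • i,  T → • ( E )
After matching E: Z → E • $,  E → E • + T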
![Page 89: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/89.jpg)
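Example: parsing with LR items
Grammar: Z → E $,  E → T | E + T,  T → i | ( E )
Input (partial tree E over T over i): E + i $
Items: Z → • E $,  E → • T,  E → • E + T,  T → • i,  T → • ( E )
After matching E: Z → E • $,  E → E • + T
After matching +: E → E + • T,  T → • i,  T → • ( E )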
![Page 90: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/90.jpg)
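Example: parsing with LR items
Grammar: Z → E $,  E → T | E + T,  T → i | ( E )
Input (the second i has been reduced to T): E + T $
Items: Z → • E $,  E → • T,  E → • E + T,  T → • i,  T → • ( E )
After matching E: Z → E • $,  E → E • + T
After matching +: E → E + • T,  T → • i,  T → • ( E )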
![Page 91: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/91.jpg)
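Example: parsing with LR items
Grammar: Z → E $,  E → T | E + T,  T → i | ( E )
Input: E + T $
Items: Z → • E $,  E → • T,  E → • E + T,  T → • i,  T → • ( E )
After matching E: Z → E • $,  E → E • + T
After matching +: E → E + • T,  T → • i,  T → • ( E )
After matching T: E → E + T •    Reduce item!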
![Page 92: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/92.jpg)
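Example: parsing with LR items
Grammar: Z → E $,  E → T | E + T,  T → i | ( E )
Input (E + T has been reduced to E): E $
Items: Z → • E $,  E → • T,  E → • E + T,  T → • i,  T → • ( E )
After matching E: Z → E • $,  E → E • + T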
![Page 93: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/93.jpg)
Example: parsing with LR items
Grammar: Z → E $,  E → T | E + T,  T → i | ( E )
Input: E $
Items: Z → • E $,  E → • T,  E → • E + T,  T → • i,  T → • ( E )
After matching E: Z → E • $,  E → E • + T
After matching $: Z → E $ •    Reduce item!
![Page 94: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/94.jpg)
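Example: parsing with LR items
Grammar: Z → E $,  E → T | E + T,  T → i | ( E )
Input: E $ has been reduced to Z, the root of the parse tree
Items: Z → • E $,  E → • T,  E → • E + T,  T → • i,  T → • ( E )
After matching E: Z → E • $,  E → E • + T
After matching $: Z → E $ •    Reduce item!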
![Page 95: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/95.jpg)
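GOTO/ACTION tables
(GOTO table: columns i through T; ACTION table: the action column)

| State | i | + | ( | ) | $ | E | T | action |
|---|---|---|---|---|---|---|---|---|
| q0 | q5 | | q7 | | | q1 | q6 | shift |
| q1 | | q3 | | | q2 | | | shift |
| q2 | | | | | | | | Z → E $ |
| q3 | q5 | | q7 | | | | q4 | shift |
| q4 | | | | | | | | E → E + T |
| q5 | | | | | | | | T → i |
| q6 | | | | | | | | E → T |
| q7 | q5 | | q7 | | | q8 | q6 | shift |
| q8 | | q3 | | q9 | | | | shift |
| q9 | | | | | | | | T → ( E ) |

empty – error move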
![Page 96: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/96.jpg)
LR(0) parser tables
• Two types of rows:
  – Shift row – tells which state to GOTO for current token
  – Reduce row – tells which rule to reduce (independent of current token)
    • GOTO entries are blank
![Page 97: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/97.jpg)
LR parser data structures
• Input – remainder of text to be processed
• Stack – sequence of pairs N, qi
  – N – symbol (terminal or non-terminal)
  – qi – state at which decisions are made
• Initial stack contains q0
[Figure: input suffix + i $; stack q0 i q5 (the stack grows to the right).]
![Page 98: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/98.jpg)
LR(0) pushdown automaton
• Two moves: shift and reduce
• Shift move
  – Remove first token from input
  – Push it on the stack
  – Compute next state based on GOTO table
  – Push new state on the stack
  – If new state is error – report error
[Figure: before the shift the input is i + i $ and the stack is q0; after the shift the input is + i $ and the stack is q0 i q5 (the stack grows to the right). Table row used: q0 with i → q5, ( → q7, E → q1, T → q6, action shift.]
![Page 99: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/99.jpg)
LR(0) pushdown automaton
• Reduce move
  – Using a rule N → α
  – Symbols in α and their following states are removed from stack
  – New state computed based on GOTO table (using top of stack, before pushing N)
  – N is pushed on the stack
  – New state pushed on top of N
[Figure: before the reduce the input is + i $ and the stack is q0 i q5; after "Reduce T → i" the input is still + i $ and the stack is q0 T q6 (the stack grows to the right). Table row used: q0 with i → q5, ( → q7, E → q1, T → q6, action shift.]
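The two moves can be sketched as operations on a Python list that alternates states and symbols. This is an illustrative sketch only: the function names `shift` and `reduce`, their parameters, and the dictionary form of the GOTO table are assumptions made here, not the lecture's interface.

```python
def shift(stack, tokens, next_state):
    """Shift move: remove the first input token, push it, then push the new state."""
    token = tokens.pop(0)
    stack.append(token)
    stack.append(next_state)

def reduce(stack, lhs, rhs_len, goto):
    """Reduce move for a rule lhs -> α with len(α) == rhs_len.

    Pops the symbols of α together with their following states (2 * rhs_len
    entries), looks up the GOTO entry using the state that is now on top
    (before lhs is pushed), then pushes lhs and the new state.
    """
    del stack[len(stack) - 2 * rhs_len:]
    exposed_state = stack[-1]
    stack.append(lhs)
    stack.append(goto[(exposed_state, lhs)])

# The two steps shown on the slides, using GOTO entries of row q0.
GOTO = {("q0", "i"): "q5", ("q0", "T"): "q6"}
stack = ["q0"]
tokens = ["i", "+", "i", "$"]

shift(stack, tokens, GOTO[("q0", "i")])
print(stack, tokens)   # stack is q0 i q5, remaining input is + i $

reduce(stack, "T", 1, GOTO)
print(stack, tokens)   # stack is q0 T q6, input unchanged
```

Note that a reduce never consumes input; only the stack changes.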
![Page 100: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/100.jpg)
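GOTO/ACTION table

| State | i | + | ( | ) | $ | E | T |
|---|---|---|---|---|---|---|---|
| q0 | s5 | | s7 | | | s1 | s6 |
| q1 | | s3 | | | s2 | | |
| q2 | r1 | r1 | r1 | r1 | r1 | r1 | r1 |
| q3 | s5 | | s7 | | | | s4 |
| q4 | r3 | r3 | r3 | r3 | r3 | r3 | r3 |
| q5 | r4 | r4 | r4 | r4 | r4 | r4 | r4 |
| q6 | r2 | r2 | r2 | r2 | r2 | r2 | r2 |
| q7 | s5 | | s7 | | | s8 | s6 |
| q8 | | s3 | | s9 | | | |
| q9 | r5 | r5 | r5 | r5 | r5 | r5 | r5 |

Grammar: (1) Z → E $  (2) E → T  (3) E → E + T  (4) T → i  (5) T → ( E )
Warning: numbers mean different things!
rn = reduce using rule number n
sm = shift to state m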
![Page 101: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/101.jpg)
Parsing id + id $

| Stack | Input | Action |
|---|---|---|
| 0 | id + id $ | s5 |

Initialize with state 0 (the stack grows to the right).
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )
rn = reduce using rule number n; sm = shift to state m
[Table: the action/goto table from the LR parser table example is repeated on this slide.]
![Page 102: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/102.jpg)
Parsing id + id $

| Stack | Input | Action |
|---|---|---|
| 0 | id + id $ | s5 |

Initialize with state 0 (the stack grows to the right).
[The action/goto table, grammar, and rn/sm legend are repeated from the first "Parsing id + id $" slide.]
![Page 103: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/103.jpg)
Parsing id + id $

| Stack | Input | Action |
|---|---|---|
| 0 | id + id $ | s5 |
| 0 id 5 | + id $ | r4 |

[The action/goto table, grammar, and rn/sm legend are repeated from the first "Parsing id + id $" slide.]
![Page 104: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/104.jpg)
Parsing id + id $

| Stack | Input | Action |
|---|---|---|
| 0 | id + id $ | s5 |
| 0 id 5 | + id $ | r4 |

pop id 5
[The action/goto table, grammar, and rn/sm legend are repeated from the first "Parsing id + id $" slide.]
![Page 105: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/105.jpg)
Parsing id + id $

| Stack | Input | Action |
|---|---|---|
| 0 | id + id $ | s5 |
| 0 id 5 | + id $ | r4 |

push T 6
[The action/goto table, grammar, and rn/sm legend are repeated from the first "Parsing id + id $" slide.]
![Page 106: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/106.jpg)
Parsing id + id $

| Stack | Input | Action |
|---|---|---|
| 0 | id + id $ | s5 |
| 0 id 5 | + id $ | r4 |
| 0 T 6 | + id $ | r2 |

[The action/goto table, grammar, and rn/sm legend are repeated from the first "Parsing id + id $" slide.]
![Page 107: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/107.jpg)
Parsing id + id $

| Stack | Input | Action |
|---|---|---|
| 0 | id + id $ | s5 |
| 0 id 5 | + id $ | r4 |
| 0 T 6 | + id $ | r2 |
| 0 E 1 | + id $ | s3 |

[The action/goto table, grammar, and rn/sm legend are repeated from the first "Parsing id + id $" slide.]
![Page 108: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/108.jpg)
Parsing id + id $

| Stack | Input | Action |
|---|---|---|
| 0 | id + id $ | s5 |
| 0 id 5 | + id $ | r4 |
| 0 T 6 | + id $ | r2 |
| 0 E 1 | + id $ | s3 |
| 0 E 1 + 3 | id $ | s5 |

[The action/goto table, grammar, and rn/sm legend are repeated from the first "Parsing id + id $" slide.]
![Page 109: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/109.jpg)
Parsing id + id $

| Stack | Input | Action |
|---|---|---|
| 0 | id + id $ | s5 |
| 0 id 5 | + id $ | r4 |
| 0 T 6 | + id $ | r2 |
| 0 E 1 | + id $ | s3 |
| 0 E 1 + 3 | id $ | s5 |
| 0 E 1 + 3 id 5 | $ | r4 |

[The action/goto table, grammar, and rn/sm legend are repeated from the first "Parsing id + id $" slide.]
![Page 110: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/110.jpg)
Parsing id + id $

| Stack | Input | Action |
|---|---|---|
| 0 | id + id $ | s5 |
| 0 id 5 | + id $ | r4 |
| 0 T 6 | + id $ | r2 |
| 0 E 1 | + id $ | s3 |
| 0 E 1 + 3 | id $ | s5 |
| 0 E 1 + 3 id 5 | $ | r4 |
| 0 E 1 + 3 T 4 | $ | r3 |

[The action/goto table, grammar, and rn/sm legend are repeated from the first "Parsing id + id $" slide.]
![Page 111: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/111.jpg)
Parsing id + id $

| Stack | Input | Action |
|---|---|---|
| 0 | id + id $ | s5 |
| 0 id 5 | + id $ | r4 |
| 0 T 6 | + id $ | r2 |
| 0 E 1 | + id $ | s3 |
| 0 E 1 + 3 | id $ | s5 |
| 0 E 1 + 3 id 5 | $ | r4 |
| 0 E 1 + 3 T 4 | $ | r3 |
| 0 E 1 | $ | acc |

The last action follows the table on this slide, where the entry for state 1 on $ is acc.
[The action/goto table, grammar, and rn/sm legend are repeated from the first "Parsing id + id $" slide.]
![Page 112: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/112.jpg)
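LR(0) automaton example
States (sets of LR(0) items):
q0 = { Z → • E $,  E → • T,  E → • E + T,  T → • i,  T → • ( E ) }
q1 = { Z → E • $,  E → E • + T }
q2 = { Z → E $ • }
q3 = { E → E + • T,  T → • i,  T → • ( E ) }
q4 = { E → E + T • }
q5 = { T → i • }
q6 = { E → T • }
q7 = { T → ( • E ),  E → • T,  E → • E + T,  T → • i,  T → • ( E ) }
q8 = { T → ( E • ),  E → E • + T }
q9 = { T → ( E ) • }
Transitions:
q0: E → q1, T → q6, i → q5, ( → q7
q1: + → q3, $ → q2
q3: T → q4, i → q5, ( → q7
q7: E → q8, T → q6, i → q5, ( → q7
q8: + → q3, ) → q9
Legend: reduce states (q2, q4, q5, q6, q9) and shift states (q0, q1, q3, q7, q8) are drawn differently.
Annotations: an edge labeled ( is marked "read input (", and an edge labeled E is marked "managed to reduce E".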
![Page 113: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/113.jpg)
States and LR(0) Items
• The state will "remember" the potential derivation rules given the part that was already identified
• For example, if we have already identified E then the state will remember the two alternatives:
(1) E → E * B,  (2) E → E + B
• Actually, we will also remember where we are in each of them:
(1) E → E ● * B,  (2) E → E ● + B
• A derivation rule with a location marker is called an LR(0) item.
• The state is actually a set of LR(0) items. E.g., q13 = { E → E ● * B,  E → E ● + B }
Grammar:
E → E * B | E + B | B
B → 0 | 1
![Page 114: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/114.jpg)
GOTO/ACTION tables
[Table: the GOTO table (columns i, +, (, ), $, E, T) and the ACTION table (shift, or the rule to reduce by) for states q0 to q9, repeated from the earlier "GOTO/ACTION tables" slide; the reduce row of q9 uses T → ( E ).]
empty = error move
![Page 115: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/115.jpg)
LR(0) parser tables
• Actions determined by topmost state
• Two types of rows:
  – Shift row – tells which state to GOTO for current token
  – Reduce row – tells which rule to reduce (independent of current token)
    • GOTO entries are blank
![Page 116: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/116.jpg)
GOTO/ACTION tables
[Table: the GOTO table (columns i, +, (, ), $, E, T) and the ACTION table (shift, or the rule to reduce by) for states q0 to q9, repeated from the earlier "GOTO/ACTION tables" slide; the reduce row of q9 uses T → ( E ).]
empty = error move
![Page 117: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/117.jpg)
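GOTO/ACTION table

| State | i | + | ( | ) | $ | E | T | action |
|---|---|---|---|---|---|---|---|---|
| q0 | s5 | | s7 | | | s1 | s6 | shift |
| q1 | | s3 | | | s2 | | | shift |
| q2 | r1 | r1 | r1 | r1 | r1 | r1 | r1 | Z → E $ |
| q3 | s5 | | s7 | | | | s4 | shift |
| q4 | r3 | r3 | r3 | r3 | r3 | r3 | r3 | E → E + T |
| q5 | r4 | r4 | r4 | r4 | r4 | r4 | r4 | T → i |
| q6 | r2 | r2 | r2 | r2 | r2 | r2 | r2 | E → T |
| q7 | s5 | | s7 | | | s8 | s6 | shift |
| q8 | | s3 | | s9 | | | | shift |
| q9 | r5 | r5 | r5 | r5 | r5 | r5 | r5 | T → ( E ) |

Grammar: (1) Z → E $  (2) E → T  (3) E → E + T  (4) T → i  (5) T → ( E )
Warning: numbers mean different things!
rn = reduce using rule number n
sm = shift to state m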
![Page 118: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/118.jpg)
LR(0) Parsing of i + i

| Stack | Input | Action |
|---|---|---|
| q0 | i + i $ | shift |
| q0 i q5 | + i $ | reduce 4 |
| q0 T q6 | + i $ | reduce 2 |
| q0 E q1 | + i $ | shift |
| q0 E q1 + q3 | i $ | shift |
| q0 E q1 + q3 i q5 | $ | reduce 4 |
| q0 E q1 + q3 T q4 | $ | reduce 3 |
| q0 E q1 | $ | shift |
| q0 E q1 $ q2 | $ | reduce 1 |
| q0 $ | $ | accept |

Grammar: (1) Z → E $  (2) E → T  (3) E → E + T  (4) T → i  (5) T → ( E )
Warning: numbers mean different things!
rn = reduce using rule number n
sm = shift to state m
[Table: the GOTO/ACTION table with sm/rn entries for states q0 to q9 is repeated from the previous slide.]
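The trace above can be reproduced with a small table-driven loop. The sketch below is an illustration under stated assumptions, not the course's implementation: the dictionaries `NEXT`, `REDUCE`, and `RULES` are a hand transcription of the GOTO/ACTION table and grammar from the slides, the function name `parse` is made up, and accepting right after reducing by rule 1 is a simplification chosen here.

```python
# rule number -> (left-hand side, length of right-hand side)
RULES = {1: ("Z", 2), 2: ("E", 1), 3: ("E", 3), 4: ("T", 1), 5: ("T", 3)}

# Shift rows of the table: (state, symbol) -> next state.
NEXT = {
    (0, "i"): 5, (0, "("): 7, (0, "E"): 1, (0, "T"): 6,
    (1, "+"): 3, (1, "$"): 2,
    (3, "i"): 5, (3, "("): 7, (3, "T"): 4,
    (7, "i"): 5, (7, "("): 7, (7, "E"): 8, (7, "T"): 6,
    (8, "+"): 3, (8, ")"): 9,
}

# Reduce rows of the table: state -> rule number (independent of the token).
REDUCE = {2: 1, 4: 3, 5: 4, 6: 2, 9: 5}

def parse(tokens):
    """Run the LR(0) automaton; return the list of actions taken."""
    stack = [0]          # alternating states and symbols, state on top
    pos = 0
    trace = []
    while True:
        state = stack[-1]
        if state in REDUCE:
            rule = REDUCE[state]
            lhs, rhs_len = RULES[rule]
            trace.append(f"reduce {rule}")
            del stack[len(stack) - 2 * rhs_len:]   # pop symbols and their states
            if lhs == "Z":                         # start rule reduced: done
                trace.append("accept")
                return trace
            stack += [lhs, NEXT[(stack[-1], lhs)]]
        else:
            if pos >= len(tokens) or (state, tokens[pos]) not in NEXT:
                raise SyntaxError(f"unexpected input at position {pos}")
            token = tokens[pos]
            trace.append("shift")
            stack += [token, NEXT[(state, token)]]  # empty entry would be an error
            pos += 1

TRACE = parse(["i", "+", "i", "$"])
print(TRACE)
```

Because the topmost state alone decides between shifting and reducing, the loop never inspects the current token before a reduce, which is the defining property of LR(0).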
![Page 119: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/119.jpg)
Constructing an LR parsing table
• Construct a (determinized) transition diagram from LR items
• If there are conflicts – stop
• Fill table entries from diagram
![Page 120: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/120.jpg)
LR item
N → α • β
[Figure: the input is split at the dot; α is labeled "already matched" and β is labeled "to be matched".]
Hypothesis about αβ being a possible handle: so far we've matched α, expecting to see β
![Page 121: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/121.jpg)
Types of LR(0) items
N → α • β    Shift item
N → αβ •    Reduce item
![Page 122: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/122.jpg)
LR(0) automaton example
[Figure: the LR(0) automaton for the grammar Z → E $, E → T | E + T, T → i | ( E ), with states q0 to q9 and edges labeled T, (, i, E, +, $, ); repeated from the earlier "LR(0) automaton example" slide. Reduce states and shift states are drawn differently.]
![Page 123: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/123.jpg)
Computing item sets
• Initial set
  – Z is the start symbol
  – ε-closure({ Z → • α | Z → α is in the grammar })
• Next set from a set S and the next symbol X
  – step(S, X) = { N → α X • β | N → α • X β in the item set S }
  – nextSet(S, X) = ε-closure(step(S, X))
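The definitions of step and nextSet translate almost literally into set comprehensions. The following is a minimal sketch assuming items are `(production index, dot position)` pairs; the names `step`, `next_set`, and `closure`, and the encoding of the grammar, are choices made for this illustration.

```python
# Grammar: Z -> E $, E -> T | E + T, T -> i | ( E )
GRAMMAR = [
    ("Z", ("E", "$")),
    ("E", ("T",)),
    ("E", ("E", "+", "T")),
    ("T", ("i",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    """ε-closure: keep adding N -> • μ while some dot stands before N."""
    result = set(items)
    changed = True
    while changed:
        changed = False
        for prod, dot in list(result):
            rhs = GRAMMAR[prod][1]
            if dot < len(rhs) and rhs[dot] in NONTERMINALS:
                for k, (lhs, _) in enumerate(GRAMMAR):
                    if lhs == rhs[dot] and (k, 0) not in result:
                        result.add((k, 0))
                        changed = True
    return frozenset(result)

def step(items, symbol):
    """Move the dot over `symbol` in every item where the dot stands before it."""
    return {(prod, dot + 1)
            for prod, dot in items
            if dot < len(GRAMMAR[prod][1]) and GRAMMAR[prod][1][dot] == symbol}

def next_set(items, symbol):
    """nextSet(S, X) = ε-closure(step(S, X))."""
    return closure(step(items, symbol))

INITIAL = closure({(0, 0)})
AFTER_E = next_set(INITIAL, "E")        # { Z -> E • $, E -> E • + T }
AFTER_PAREN = next_set(INITIAL, "(")    # T -> ( • E ) plus its closure
print(sorted(AFTER_E), sorted(AFTER_PAREN))
```

An empty result, such as `next_set(INITIAL, ")")`, corresponds to an empty table entry, that is, an error move.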
![Page 124: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/124.jpg)
Operations for transition diagram construction
• Initial = { S' → • S $ }
• For an item set I:
Closure(I) = I ∪ { X → • μ | X → μ is in the grammar and N → α • X β is in Closure(I) }
• Goto(I, X) = { N → α X • β | N → α • X β in I }
![Page 125: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/125.jpg)
Initial example
• Initial = { S → • E $ }
Grammar:
(1) S → E $
(2) E → T
(3) E → E + T
(4) T → id
(5) T → ( E )
![Page 126: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/126.jpg)
Closure example
• Initial = { S → • E $ }
• Closure({ S → • E $ }) = { S → • E $,  E → • T,  E → • E + T,  T → • id,  T → • ( E ) }
Grammar:
(1) S → E $
(2) E → T
(3) E → E + T
(4) T → id
(5) T → ( E )
![Page 127: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/127.jpg)
Goto example
• Initial = { S → • E $ }
• Closure({ S → • E $ }) = { S → • E $,  E → • T,  E → • E + T,  T → • id,  T → • ( E ) }
• Goto({ S → • E $,  E → • E + T,  T → • id }, E) = { S → E • $,  E → E • + T }
Grammar:
(1) S → E $
(2) E → T
(3) E → E + T
(4) T → id
(5) T → ( E )
![Page 128: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/128.jpg)
Constructing the transition diagram
• Start with state 0 containing the item set Closure({ S → • E $ })
• Repeat until no new states are discovered
  – For every state p containing item set Ip, and symbol N, compute state q containing item set Iq = Closure(goto(Ip, N))
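The construction loop can be sketched as a worklist algorithm. This is only a sketch under assumptions: items are `(production index, dot position)` pairs, the helper names (`closure`, `goto`, `build_automaton`, `has_conflict`) are made up for the example, and states are numbered in discovery order, so the numbers will generally differ from the q0 to q9 labels on the slides.

```python
GRAMMAR = [
    ("S", ("E", "$")),
    ("E", ("T",)),
    ("E", ("E", "+", "T")),
    ("T", ("id",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    result, worklist = set(items), list(items)
    while worklist:
        prod, dot = worklist.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for k, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (k, 0) not in result:
                    result.add((k, 0))
                    worklist.append((k, 0))
    return frozenset(result)

def goto(items, symbol):
    return {(p, d + 1) for p, d in items
            if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}

def build_automaton():
    """Return (states, transitions); a state's number is its list index."""
    states = [closure({(0, 0)})]          # state 0 = Closure({S -> • E $})
    transitions = {}                      # (state number, symbol) -> state number
    worklist = [0]
    while worklist:                       # repeat until no new states appear
        p = worklist.pop()
        symbols = sorted({GRAMMAR[k][1][d] for k, d in states[p]
                          if d < len(GRAMMAR[k][1])})
        for x in symbols:
            target = closure(goto(states[p], x))
            if target not in states:
                states.append(target)
                worklist.append(len(states) - 1)
            transitions[(p, x)] = states.index(target)
    return states, transitions

def has_conflict(state):
    """LR(0) conflict: a reduce item sharing a state with any other item."""
    reduce_items = [(k, d) for k, d in state if d == len(GRAMMAR[k][1])]
    return bool(reduce_items) and len(state) > 1

STATES, TRANSITIONS = build_automaton()
print(len(STATES), "states,", len(TRANSITIONS), "transitions")
```

For this grammar no state mixes a reduce item with another item, so the table can be filled directly from the transitions.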
![Page 129: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/129.jpg)
LR(0) automaton example
[Figure: the LR(0) automaton for the grammar Z → E $, E → T | E + T, T → i | ( E ), with states q0 to q9 and edges labeled T, (, i, E, +, $, ); repeated from the earlier "LR(0) automaton example" slide. Reduce states and shift states are drawn differently.]
![Page 130: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/130.jpg)
Automaton construction example
130
Grammar: (1) S → E $   (2) E → T   (3) E → E + T   (4) T → id   (5) T → ( E )
Initialize
  q0: S → •E$
![Page 131: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/131.jpg)
Automaton construction example
131
Grammar: (1) S → E $   (2) E → T   (3) E → E + T   (4) T → id   (5) T → ( E )
Apply Closure
  q0: S → •E$, E → •T, E → •E+T, T → •i, T → •(E)
![Page 132: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/132.jpg)
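Automaton construction example
132
Grammar: (1) S → E $   (2) E → T   (3) E → E + T   (4) T → id   (5) T → ( E )
  q0: S → •E$, E → •T, E → •E+T, T → •i, T → •(E)
Transitions out of q0:
  on T to q6: E → T•
  on ( to a new state: T → (•E), E → •T, E → •E+T, T → •i, T → •(E)
  on i to q5: T → i•
  on E to q1: S → E•$, E → E•+T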
![Page 133: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/133.jpg)
Automaton construction example
133
Grammar: (1) S → E $   (2) E → T   (3) E → E + T   (4) T → id   (5) T → ( E )
[Diagram: the complete LR(0) automaton, with the same ten states q0 to q9 and the same transitions as on the "LR(0) automaton example" slide, using S in place of Z as the start symbol]
• A terminal transition corresponds to a shift action in the parse table
• A non-terminal transition corresponds to a goto action in the parse table
• A single reduce item corresponds to a reduce action
![Page 134: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/134.jpg)
Are we done?
• Can make a transition diagram for any grammar
• Can make a GOTO table for every grammar
• Cannot make a deterministic ACTION table for every grammar
134
![Page 135: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/135.jpg)
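LR(0) conflicts
135
Grammar: Z → E$   E → T   E → E+T   T → i   T → (E)   T → i[E]
  q0: Z → •E$, E → •T, E → •E+T, T → •i, T → •(E), T → •i[E]
  q0 on i to q5: T → i•, T → i•[E]   (shift/reduce conflict)
  The transitions out of q0 on T, ( and E lead to states elided (…) on the slide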
![Page 136: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/136.jpg)
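LR(0) conflicts
136
Grammar: Z → E$   E → T   E → E+T   T → i   V → i   T → (E)
  q0: Z → •E$, E → •T, E → •E+T, T → •i, T → •(E), …
  q0 on i to q5: T → i•, V → i•   (reduce/reduce conflict)
  The transitions out of q0 on T, ( and E lead to states elided (…) on the slide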
![Page 137: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/137.jpg)
LR(0) conflicts
• Any grammar with an ε-rule cannot be LR(0)
• Inherent shift/reduce conflict
  – A → ε• is a reduce item
  – P → α•Aβ is a shift item
  – A → ε• can always be predicted from P → α•Aβ
137
![Page 138: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/138.jpg)
Conflicts
• Can construct a diagram for every grammar, but some may introduce conflicts
• shift-reduce conflict: an item set contains at least one shift item and one reduce item
• reduce-reduce conflict: an item set contains two reduce items
138
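The two definitions translate directly into a check over one item set. The Python sketch below follows the slide's wording: any item whose dot is not at the end counts as a shift item. It assumes items are written as (left-hand side, right-hand side, dot position) triples, and `find_conflicts` is a name made up for this illustration.

```python
def find_conflicts(item_set):
    """Classify an LR(0) item set using the definitions on the slide.

    Each item is (lhs, rhs, dot), where rhs is a tuple of grammar symbols.
    """
    reduce_items = [item for item in item_set if item[2] == len(item[1])]
    shift_items = [item for item in item_set if item[2] < len(item[1])]
    conflicts = []
    if shift_items and reduce_items:
        conflicts.append("shift-reduce")
    if len(reduce_items) >= 2:
        conflicts.append("reduce-reduce")
    return conflicts


# State q5 for the grammar containing both T -> i and T -> i[E].
q5_shift_reduce = {("T", ("i",), 1), ("T", ("i", "[", "E", "]"), 1)}
# State q5 for the grammar containing both T -> i and V -> i.
q5_reduce_reduce = {("T", ("i",), 1), ("V", ("i",), 1)}
```

Applied to the two q5 states from the conflict slides, the function reports one shift-reduce conflict and one reduce-reduce conflict respectively.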
![Page 139: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/139.jpg)
LR variants
• LR(0) – what we’ve seen so far
• SLR(1)
  – Removes infeasible reduce actions via FOLLOW set reasoning
• LR(1)
  – LR(0) with one lookahead token in items
• LALR(1)
  – LR(1) with merging of states with the same LR(0) component
139
![Page 140: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/140.jpg)
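LR(0) GOTO/ACTIONS tables
140
| State | i | + | ( | ) | $ | E | T | action |
|---|---|---|---|---|---|---|---|---|
| q0 | q5 | | q7 | | | q1 | q6 | shift |
| q1 | | q3 | | | q2 | | | shift |
| q2 | | | | | | | | Z → E$ |
| q3 | q5 | | q7 | | | | q4 | shift |
| q4 | | | | | | | | E → E+T |
| q5 | | | | | | | | T → i |
| q6 | | | | | | | | E → T |
| q7 | q5 | | q7 | | | q8 | q6 | shift |
| q8 | | q3 | | q9 | | | | shift |
| q9 | | | | | | | | T → (E) |
Columns i through T form the GOTO table; the last column is the ACTION table.
• ACTION table is determined only by the state, ignores input
• GOTO table is indexed by state and a grammar symbol from the stack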
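To see how the two tables drive a parse, here is a small Python sketch of a table-driven LR(0) loop. The table contents are copied from the slide; the function name `lr0_parse`, the dictionary encoding, and the convention that the token list ends with `$` are assumptions made for this example.

```python
# GOTO table from the slide: (state, grammar symbol) -> next state.
GOTO = {
    (0, "i"): 5, (0, "("): 7, (0, "E"): 1, (0, "T"): 6,
    (1, "+"): 3, (1, "$"): 2,
    (3, "i"): 5, (3, "("): 7, (3, "T"): 4,
    (7, "i"): 5, (7, "("): 7, (7, "E"): 8, (7, "T"): 6,
    (8, "+"): 3, (8, ")"): 9,
}
# ACTION table from the slide: each state is "shift" or a reduction,
# stored as (left-hand side, right-hand side).
ACTION = {
    0: "shift", 1: "shift", 3: "shift", 7: "shift", 8: "shift",
    2: ("Z", ("E", "$")),
    4: ("E", ("E", "+", "T")),
    5: ("T", ("i",)),
    6: ("E", ("T",)),
    9: ("T", ("(", "E", ")")),
}


def lr0_parse(tokens):
    """Return the reductions performed, in order, or None on a syntax error."""
    stack = [0]   # stack of automaton states
    pos = 0       # index of the next unread token
    reductions = []
    while True:
        action = ACTION[stack[-1]]
        if action == "shift":
            if pos == len(tokens) or (stack[-1], tokens[pos]) not in GOTO:
                return None
            stack.append(GOTO[(stack[-1], tokens[pos])])
            pos += 1
            continue
        lhs, rhs = action
        reductions.append(f"{lhs} -> {' '.join(rhs)}")
        if lhs == "Z":  # reducing by Z -> E $ means accept
            return reductions if pos == len(tokens) else None
        del stack[-len(rhs):]  # pop one state per right-hand-side symbol
        stack.append(GOTO[(stack[-1], lhs)])
```

Note that the action is chosen from the state alone; the input token is consulted only to pick the GOTO entry when shifting. For the input `i + i $` the reductions come out as T → i, E → T, T → i, E → E+T, Z → E$, which is a rightmost derivation in reverse.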
![Page 141: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/141.jpg)
SLR parsing
• A handle should not be reduced to a non-terminal N if the lookahead is a token that cannot follow N
• A reduce item N → α• is applicable only when the lookahead is in FOLLOW(N)
  – If b is not in FOLLOW(N), we proved there is no derivation S ⇒* βNb
  – Thus, it is safe to remove the reduce item from the conflicted state
• Differs from LR(0) only in the ACTION table
  – Now a row in the parsing table may contain both shift actions and reduce actions, and we need to consult the current token to decide which one to take
141
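The FOLLOW-set reasoning can be tried out on the conflicted state q5 = {T → i•, T → i•[E]}. The Python sketch below assumes the grammar with T → i[E] from the shift/reduce conflict slide and relies on that grammar having no ε-rules, which keeps the FIRST computation short. All function names are illustrative.

```python
GRAMMAR = [
    ("Z", ["E", "$"]),
    ("E", ["T"]),
    ("E", ["E", "+", "T"]),
    ("T", ["i"]),
    ("T", ["(", "E", ")"]),
    ("T", ["i", "[", "E", "]"]),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_sets():
    """FIRST sets by fixpoint iteration (valid only without ε-rules)."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = first[head] if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def follow_sets():
    """FOLLOW sets by fixpoint iteration (valid only without ε-rules)."""
    first = first_sets()
    follow = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for pos, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                if pos + 1 < len(rhs):
                    nxt = rhs[pos + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:
                    new = follow[lhs]  # sym ends the rule: inherit FOLLOW(lhs)
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


def q5_action(lookahead):
    """SLR decision in state q5 = {T -> i•, T -> i•[E]}."""
    if lookahead == "[":
        return "shift"
    if lookahead in follow_sets()["T"]:
        return "reduce T -> i"
    return "error"
```

For this grammar the computed FOLLOW(T) is {+, ), ], $}. Since `[` is not in it, the shift on `[` and the reduction by T → i never compete for the same lookahead, and the LR(0) conflict disappears.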
![Page 142: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/142.jpg)
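SLR action table
142
SLR – uses 1 token of look-ahead (grammar as before, with T → i and T → i[E]). The column headers are the lookahead token from the input.
| State | i | + | ( | ) | [ | ] | $ |
|---|---|---|---|---|---|---|---|
| 0 | shift | | shift | | | | |
| 1 | | shift | | | | | accept |
| 2 | | | | | | | |
| 3 | shift | | shift | | | | |
| 4 | | E → E+T | | E → E+T | | | E → E+T |
| 5 | | T → i | | T → i | shift | | T → i |
| 6 | | E → T | | E → T | | | E → T |
| 7 | shift | | shift | | | | |
| 8 | | shift | | shift | | | |
| 9 | | T → (E) | | T → (E) | | | T → (E) |
vs.
LR(0) – no look-ahead
| state | action |
|---|---|
| q0 | shift |
| q1 | shift |
| q2 | |
| q3 | shift |
| q4 | E → E+T |
| q5 | T → i |
| q6 | E → T |
| q7 | shift |
| q8 | shift |
| q9 | T → (E) |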
![Page 143: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/143.jpg)
LR(1) grammars
• In SLR: a reduce item N → α• is applicable only when the lookahead is in FOLLOW(N)
• But FOLLOW(N) merges lookahead for all alternatives for N
  – Insensitive to the context of a given production
• LR(1) keeps lookahead with each LR item
• Idea: a more refined notion of follows, computed per item
143
![Page 144: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/144.jpg)
LR(1) items
• An LR(1) item is a pair
  – LR(0) item
  – Lookahead token
• Meaning
  – We matched the part left of the dot, and are looking to match the part on the right of the dot, followed by the lookahead token
• Example
  – The production L → id yields the following LR(1) items
144
Grammar: (0) S’ → S   (1) S → L=R   (2) S → R   (3) L → *R   (4) L → id   (5) R → L
LR(0) items
  [L → ● id]
  [L → id ●]
LR(1) items
  [L → ● id, *]   [L → ● id, =]   [L → ● id, id]   [L → ● id, $]
  [L → id ●, *]   [L → id ●, =]   [L → id ●, id]   [L → id ●, $]
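The eight LR(1) items listed for L → id are simply every LR(0) item paired with every terminal of the grammar. A short Python sketch of that pairing follows; the tuple encoding and the helper names `lr0_items`, `lr1_items` and `show` are assumptions made for this illustration. Which of these items actually appear in a given parser state is decided by the LR(1) closure, which is not shown here.

```python
from itertools import product


def lr0_items(lhs, rhs):
    """All dotted versions of the production lhs -> rhs."""
    return [(lhs, tuple(rhs), dot) for dot in range(len(rhs) + 1)]


def lr1_items(lhs, rhs, lookaheads):
    """Pair every LR(0) item with every candidate lookahead token."""
    return list(product(lr0_items(lhs, rhs), lookaheads))


def show(item, lookahead):
    """Render an LR(1) item in the bracket notation used on the slide."""
    lhs, rhs, dot = item
    body = " ".join(rhs[:dot] + ("●",) + rhs[dot:])
    return f"[{lhs} → {body}, {lookahead}]"


# Terminals of the example grammar: *, =, id and the end marker $.
items = lr1_items("L", ["id"], ["*", "=", "id", "$"])
```

`items` holds 8 entries, in the same order as the list on the slide.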
![Page 145: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/145.jpg)
LR(1) items
• An LR(1) item is a pair
  – LR(0) item
  – Lookahead token
• Meaning
  – We matched the part left of the dot, and are looking to match the part on the right of the dot, followed by the lookahead token
• Example
  – The production L → id yields the following LR(1) items
• Reduce only if the expected lookahead matches the input
  – [L → id ●, =] will be used only if the next input token is =
145
![Page 146: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/146.jpg)
LALR(1)
• LR(1) tables have a huge number of entries
• Often don’t need such refined observation (and cost)
• Idea: find states with the same LR(0) component and merge their lookahead components as long as there are no conflicts
• LALR(1) is not as powerful as LR(1) in theory but works quite well in practice
  – Merging cannot introduce new shift-reduce conflicts, only reduce-reduce conflicts, which are unlikely in practice
146
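The merging idea can be illustrated on two LR(1) states that share the LR(0) component {L → id •}. This Python sketch covers only the lookahead merging; a full LALR(1) construction would also redirect transitions to the merged states and check the result for reduce-reduce conflicts. The state encoding and the name `merge_by_core` are assumptions for this example.

```python
from collections import defaultdict


def merge_by_core(lr1_states):
    """Group LR(1) states by LR(0) core and union the lookaheads per item.

    Each state is a set of (lr0_item, lookahead) pairs.
    """
    merged = defaultdict(lambda: defaultdict(set))
    for state in lr1_states:
        core = frozenset(item for item, _ in state)
        for item, lookahead in state:
            merged[core][item].add(lookahead)
    return {core: dict(items) for core, items in merged.items()}


# Two LR(1) states with the same core {L -> id •} but different lookaheads.
L_ID_DONE = ("L", ("id",), 1)
state_a = {(L_ID_DONE, "="), (L_ID_DONE, "$")}
state_b = {(L_ID_DONE, "$")}
lalr = merge_by_core([state_a, state_b])
```

After merging, a single state remains in which L → id • carries the lookahead set {=, $}.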
![Page 147: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/147.jpg)
Summary
147
![Page 148: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/148.jpg)
LR is More Powerful than LL
• Any LL(k) language is also in LR(k), i.e., LL(k) ⊂ LR(k)
  – LR is more popular in automatic tools
• But less intuitive
• Also, the lookahead is counted differently in the two cases
  – In an LL(k) derivation, the algorithm sees the left-hand side of the rule + k input tokens and then must select the derivation rule
  – In LR(k), the algorithm “sees” the entire right-hand side of the derivation rule + k input tokens and then reduces
    • LR(0) sees the entire right-hand side, but no input token
148
![Page 149: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/149.jpg)
Using tools to parse + create AST
terminal Integer NUMBER;
terminal PLUS, MINUS, MULT, DIV;
terminal LPAREN, RPAREN;
terminal UMINUS;
nonterminal Integer expr;
precedence left PLUS, MINUS;
precedence left DIV, MULT;
precedence left UMINUS;

expr ::= expr:e1 PLUS expr:e2
         {: RESULT = new Integer(e1.intValue() + e2.intValue()); :}
       | expr:e1 MINUS expr:e2
         {: RESULT = new Integer(e1.intValue() - e2.intValue()); :}
       | expr:e1 MULT expr:e2
         {: RESULT = new Integer(e1.intValue() * e2.intValue()); :}
       | expr:e1 DIV expr:e2
         {: RESULT = new Integer(e1.intValue() / e2.intValue()); :}
       | MINUS expr:e1 %prec UMINUS
         {: RESULT = new Integer(0 - e1.intValue()); :}
       | LPAREN expr:e1 RPAREN
         {: RESULT = e1; :}
       | NUMBER:n
         {: RESULT = n; :}
       ;
149
![Page 150: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/150.jpg)
GrammarHierarchy
150
Non-ambiguous CFG
LR(1)
LALR(1)
SLR(1)
LL(1)
LR(0)
![Page 151: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/151.jpg)
Earley Parsing
151
Jay Earley, PhD
![Page 152: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/152.jpg)
Earley Parsing
• Invented by Jay Earley [PhD, 1968]
• Handles arbitrary context-free grammars
  – Can handle ambiguous grammars
• Complexity O(N³) where N = |input|
• Uses dynamic programming
  – Compactly encodes ambiguity
152
![Page 153: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/153.jpg)
Dynamic programming
• Break a problem P into subproblems P1 … Pk
  – Solve P by combining solutions for P1 … Pk
  – Memoize (store) solutions to subproblems instead of recomputing them
• Bellman-Ford shortest path algorithm
  – Sol(x, y, i) = minimum of
    • Sol(x, y, i-1)
    • Sol(t, y, i-1) + weight(x, t) for edges (x, t)
153
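The recurrence can be written out in a few lines of Python. The sketch below fixes the target y and reads Sol(x, y, i) as the cost of the cheapest path from x to y that uses at most i edges. It assumes the graph is given as a dictionary from edges (x, t) to weights and has no negative cycles; the function name `shortest_to` is made up for this example.

```python
import math


def shortest_to(target, nodes, edges):
    """Cheapest path cost from every node to `target`.

    edges maps (x, t) to weight(x, t). Round i of the loop computes
    Sol(x, target, i) from the stored values Sol(., target, i-1).
    """
    sol = {x: (0 if x == target else math.inf) for x in nodes}  # Sol(x, y, 0)
    for _ in range(len(nodes) - 1):
        prev = sol              # memoized solutions for i-1
        sol = dict(prev)        # first option: Sol(x, y, i-1)
        for (x, t), weight in edges.items():
            # second option: Sol(t, y, i-1) + weight(x, t)
            sol[x] = min(sol[x], prev[t] + weight)
    return sol


example_edges = {("a", "b"): 4, ("a", "c"): 1, ("c", "b"): 2}
costs = shortest_to("b", ["a", "b", "c"], example_edges)
```

In the small example the direct edge a to b costs 4, but the two-edge path through c costs 3, so the second round improves Sol(a, b, 1) = 4 to Sol(a, b, 2) = 3.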
![Page 154: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/154.jpg)
Earley Parsing
• Dynamic programming implementation of a recursive descent parser
  – S[N+1]: a sequence of N+1 sets of “Earley states” (S[0] … S[N])
    • N = |INPUT|
    • An Earley state (item) s is a sentential form + aux info
  – S[i]: all parse trees that can be produced (by an RDP) after reading the first i tokens
    • S[i+1] is built using S[0] … S[i]
154
![Page 155: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/155.jpg)
Earley Parsing
• Parses arbitrary grammars in O(|input|³)
  – O(|input|²) for unambiguous grammars
  – Linear for most LR(k) languages
• Dynamic programming implementation of a recursive descent parser
  – S[N+1]: a sequence of N+1 sets of “Earley states” (S[0] … S[N])
    • N = |INPUT|
    • An Earley state s is a sentential form + aux info
  – S[i]: all parse trees that can be produced (by an RDP) after reading the first i tokens
    • S[i+1] is built using S[0] … S[i]
155
![Page 156: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/156.jpg)
Earley States
• s = <constituent, back>
  – constituent: a dotted rule for A → αβ
      A → •αβ   predicted constituents
      A → α•β   in-progress constituents
      A → αβ•   completed constituents
  – back: previous Earley state in the derivation
156
![Page 157: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/157.jpg)
157
![Page 158: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/158.jpg)
Earley Parser
Input = x[1…N]
S[0] = {<E’ → •E, 0>}; S[1] = … = S[N] = {}
for i = 0 ... N do
  until S[i] does not change do
    foreach s ∈ S[i]
      if s = <A → …•a…, b> and a = x[i+1] then   // scan
        S[i+1] = S[i+1] ∪ {<A → …a•…, b>}
      if s = <A → …•X…, b> and X → α then   // predict
        S[i] = S[i] ∪ {<X → •α, i>}
      if s = <A → …•, b> and <X → …•A…, k> ∈ S[b] then   // complete
        S[i] = S[i] ∪ {<X → …A•…, k>}
158
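The pseudocode above can be turned into a small recognizer. The Python sketch below is one possible reading of it, not the lecture's code: it assumes the grammar is a dictionary from each nonterminal to a list of right-hand sides (tuples of symbols), represents a state as (lhs, rhs, dot, back), and only answers whether the input is accepted, without building parse trees.

```python
def earley_recognize(grammar, start, tokens):
    """Return True if `tokens` can be derived from `start`.

    grammar maps each nonterminal to a list of right-hand sides (tuples);
    any symbol that is not a key of `grammar` is treated as a terminal.
    """
    n = len(tokens)
    sets = [set() for _ in range(n + 1)]
    for rhs in grammar[start]:
        sets[0].add((start, rhs, 0, 0))
    for i in range(n + 1):
        changed = True
        while changed:  # "until S[i] does not change"
            changed = False
            for lhs, rhs, dot, back in list(sets[i]):
                new = set()
                if dot < len(rhs):
                    sym = rhs[dot]
                    if sym in grammar:  # predict
                        new = {(sym, alt, 0, i) for alt in grammar[sym]}
                    elif i < n and tokens[i] == sym:  # scan
                        sets[i + 1].add((lhs, rhs, dot + 1, back))
                else:  # complete
                    new = {
                        (l2, r2, d2 + 1, b2)
                        for l2, r2, d2, b2 in sets[back]
                        if d2 < len(r2) and r2[d2] == lhs
                    }
                if not new <= sets[i]:
                    sets[i] |= new
                    changed = True
    return any(
        lhs == start and dot == len(rhs) and back == 0
        for lhs, rhs, dot, back in sets[n]
    )


EXPR = {"S'": [("E",)], "E": [("E", "+", "E"), ("n",)]}
```

With the ambiguous grammar E → E + E | n, the input n + n is accepted and the incomplete input n + is rejected. Because S[i] is reprocessed until it stops changing, completions over ε-rules are not lost, at the cost of repeated passes over the set.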
![Page 159: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/159.jpg)
Example
159
PRACTICAL EARLEY PARSING 621
S0:
  S′ → •E , 0
  E → •E + E , 0
  E → •n , 0
n
S1:
  E → n• , 0
  S′ → E• , 0
  E → E • +E , 0
+
S2:
  E → E + •E , 0
  E → •E + E , 2
  E → •n , 2
n
S3:
  E → n• , 2
  E → E + E• , 0
  E → E • +E , 2
  S′ → E• , 0
FIGURE 1. Earley sets for the grammar E → E + E | n and the input n + n. Items in bold are ones which correspond to the input’s derivation.
Earley recommended using lookahead for the COMPLETER step [2]; it was later shown that a better approach was to use lookahead for the PREDICTOR step [8]; later it was shown that prediction lookahead was of questionable value in an Earley parser which uses finite automata [9] as ours does.
In terms of implementation, the Earley sets are built in increasing order as the input is read. Also, each set is typically represented as a list of items, as suggested by Earley [1, 2]. This list representation of a set is particularly convenient, because the list of items acts as a ‘work queue’ when building the set: items are examined in order, applying SCANNER, PREDICTOR and COMPLETER as necessary; items added to the set are appended onto the end of the list.
3. THE PROBLEM OF ϵ
At any given point i in the parse, we have two partially-constructed sets. SCANNER may add items to Si+1 and Si may have items added to it by PREDICTOR and COMPLETER. It is this latter possibility, adding items to Si while representing sets as lists, which causes grief with ϵ-rules.
When COMPLETER processes an item [A→ •, j ] whichcorresponds to the ϵ-rule A → ϵ, it must look throughSj for items with the dot before an A. Unfortunately,for ϵ-rule items, j is always equal to i—COMPLETER
is thus looking through the partially-constructed set Si.³
Since implementations process items in Si in order, if an item [B → . . . • A . . . , k] is added to Si after COMPLETER has processed [A → •, j ], COMPLETER will never add [B → . . . A • . . . , k] to Si. In turn, items resulting directly and indirectly from [B → . . . A • . . . , k] will be omitted too. This effectively prunes potential derivation paths, which can cause correct input to be rejected. Figure 2 gives an example of this happening.
³ j = i for ϵ-rule items because they can only be added to an Earley set by PREDICTOR, which always bestows added items with the parent pointer i.
Grammar:
  S′ → S
  S → AAAA
  A → a
  A → E
  E → ϵ
S0:
  S′ → •S , 0
  S → •AAAA , 0
  A → •a , 0
  A → •E , 0
  E → • , 0
  A → E• , 0
  S → A • AAA , 0
a
S1:
  A → a• , 0
  S → A • AAA , 0
  S → AA • AA , 0
  A → •a , 1
  A → •E , 1
  E → • , 1
  A → E• , 1
  S → AAA • A , 0
FIGURE 2. An unadulterated Earley parser, representing sets using lists, rejects the valid input a. Missing items in S0 sound the death knell for this parse.
Two methods of handling this problem have been proposed. Grune and Jacobs aptly summarize one approach:
‘The easiest way to handle this mare’s nest is to stay calm and keep running the Predictor and Completer in turn until neither has anything more to add.’ [10, p. 159]
Aho and Ullman [11] specify this method in their presentation of Earley parsing and it is used by ACCENT [12], a compiler–compiler which generates Earley parsers.
The other approach was suggested by Earley [1, 2]. He proposed having COMPLETER note that the dot needed to be moved over A, then looking for this whenever future items were added to Si. For efficiency’s sake, the collection of non-terminals to watch for should be stored in a data structure which allows fast access. We used this method initially for the Earley parser in the SPARK toolkit [13].
In our opinion, neither approach is very satisfactory. Repeatedly processing Si, or parts thereof, involves a lot of activity for little gain; Earley’s solution requires an extra, dynamically-updated data structure and the unnatural mating of COMPLETER with the addition of items. Ideally, we want a solution which retains the elegance of Earley’s algorithm, only processes items in Si once and has no run-time overhead from updating a data structure.
4. AN ‘IDEAL’ SOLUTION
Our solution involves a simple modification to PREDICTOR, based on the idea of nullability. A non-terminal A is said to be nullable if A ⇒∗ ϵ; terminal symbols, of course, can never be nullable. The nullability of non-terminals in a grammar may be easily precomputed using well-known techniques [14, 15]. Using this notion, our PREDICTOR can be stated as follows (our modification is in bold):
If [A → . . . • B . . . , j ] is in Si, add [B → •α, i] to Si for all rules B → α. If B is nullable, also add [A → . . . B • . . . , j ] to Si.
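The precomputation of nullable non-terminals mentioned in the excerpt is a standard fixpoint iteration. A minimal Python sketch follows, assuming the grammar is a dictionary from each non-terminal to a list of right-hand sides, with the empty tuple standing for ϵ; the function name is chosen for this illustration and the code is not taken from the paper.

```python
def nullable_nonterminals(grammar):
    """Return the set of non-terminals A with A =>* ϵ."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            if lhs in nullable:
                continue
            # A is nullable if some right-hand side consists only of
            # nullable symbols (an empty right-hand side qualifies).
            if any(all(sym in nullable for sym in rhs) for rhs in alternatives):
                nullable.add(lhs)
                changed = True
    return nullable


# The grammar of Figure 2.
FIGURE_2 = {
    "S'": [("S",)],
    "S": [("A", "A", "A", "A")],
    "A": [("a",), ("E",)],
    "E": [()],
}
```

For the grammar of Figure 2 every non-terminal is nullable, which is why the modified PREDICTOR can advance the dot over each A without waiting for COMPLETER.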
THE COMPUTER JOURNAL, Vol. 45, No. 6, 2002
if s = <A → …•a…, b> and a = x[i+1] then   // scan
  S[i+1] = S[i+1] ∪ {<A → …a•…, b>}
if s = <A → …•X…, b> and X → α then   // predict
  S[i] = S[i] ∪ {<X → •α, i>}
if s = <A → …•, b> and <X → …•A…, k> ∈ S[b] then   // complete
  S[i] = S[i] ∪ {<X → …A•…, k>}
![Page 160: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/160.jpg)
Earley Parsing
160
Jay Earley, PhD
![Page 161: 0368-3133 Lecture 4 - TAUmaon/teaching/2016-2017/compilation/compilatio… · Broad kinds of parsers • Parsers for arbitrary grammars – Earley’smethod, CYK method – Usually,](https://reader035.fdocuments.us/reader035/viewer/2022071301/609e46a0fb7d8f028d5654f9/html5/thumbnails/161.jpg)
161