CS400 Compiler Construction
Sheldon X. Liang, Ph.D.
Department of Computer Science, Azusa Pacific University, Azusa, CA 91702, Tel: (800) 825-5278
http://www.apu.edu/clas/computerscience/
October 2, 2015, Azusa, CA



Syntax Analysis, Part I

Chapter 4


Position of a Parser in the Compiler Model

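[Figure: Source Program → Lexical Analyzer → (Token, tokenval) → Parser and rest of front-end → Intermediate representation. The parser sends "Get next token" requests to the lexical analyzer, and both use the Symbol Table. The lexical analyzer reports lexical errors; the parser and rest of the front-end report syntax and semantic errors.]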


Keep in mind the following questions

• Parser
  – Twofold:
  – Checking syntax
  – Invoking semantic actions

• Error Handling
  – Two abilities:
  – Identifying errors
  – Locating errors

• Chomsky Hierarchy
  – From Turing machine to finite automaton
  – From no restrictions to more restrictions


The Parser

• A parser implements a context-free (C-F) grammar
• The role of the parser is twofold:
  1. To check syntax (= string recognizer)
     – And to report syntax errors accurately
  2. To invoke semantic actions
     – For static semantics checking, e.g. type checking of expressions, functions, etc.
     – For syntax-directed translation of the source code to an intermediate representation
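The two roles can be seen side by side in a very small hand-written parser. The sketch below is only an illustration, not part of the course material: the toy grammar `expr -> NUM { '+' NUM }`, the function name `parse_sum`, and the whitespace-separated token format are all assumptions made for the example.

```python
def parse_sum(tokens):
    """Recognize  expr -> NUM { '+' NUM }  and compute its value.

    Role 1 (syntax check): raise SyntaxError at the first offending token.
    Role 2 (semantic action): accumulate the sum while parsing.
    """
    pos = 0

    def expect_number():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            found = tokens[pos] if pos < len(tokens) else "end of input"
            raise SyntaxError(f"token {pos}: expected a number, found {found!r}")
        value = int(tokens[pos])  # semantic action: convert the lexeme to a value
        pos += 1
        return value

    total = expect_number()
    while pos < len(tokens):
        if tokens[pos] != "+":
            raise SyntaxError(f"token {pos}: expected '+', found {tokens[pos]!r}")
        pos += 1
        total += expect_number()  # semantic action: add the next operand
    return total


if __name__ == "__main__":
    print(parse_sum("1 + 2 + 39".split()))
    try:
        parse_sum("1 + + 2".split())
    except SyntaxError as err:
        print("syntax error:", err)
```

The error message carries the token index, which is the "locating errors" half of error handling; the running total stands in for whatever semantic action a real front end would perform.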


Syntax-Directed Translation

• One of the major roles of the parser is to produce an intermediate representation (IR) of the source program using syntax-directed translation methods

• Possible IR output:
  – Abstract syntax trees (ASTs)
  – Control-flow graphs (CFGs) with triples, three-address code, or register transfer list notation
  – WHIRL (SGI Pro64 compiler) has 5 IR levels!
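To make two of these IR forms concrete, here is a minimal sketch that walks an AST and emits three-address code. The node classes `Num`, `Var`, `BinOp`, the function `to_three_address`, and the temporary names `t1`, `t2`, … are hypothetical choices for this example; a real compiler would build the AST during parsing rather than by hand.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Num:
    value: int


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str
    left: object
    right: object


def to_three_address(node):
    """Return (code, result): a list of instructions and the name holding the value."""
    code = []
    temps = count(1)

    def visit(n):
        if isinstance(n, Num):
            return str(n.value)
        if isinstance(n, Var):
            return n.name
        left = visit(n.left)    # operands are translated first (post-order)
        right = visit(n.right)
        temp = f"t{next(temps)}"
        code.append(f"{temp} = {left} {n.op} {right}")
        return temp

    result = visit(node)
    return code, result


if __name__ == "__main__":
    # AST for  a + b * 2
    tree = BinOp("+", Var("a"), BinOp("*", Var("b"), Num(2)))
    for line in to_three_address(tree)[0]:
        print(line)
```

Each instruction has at most one operator and names its result, which is what makes this representation convenient to place in the basic blocks of a control-flow graph.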


WHIRL

• Winning Hierarchical Intermediate Representation

• 5 Levels:
  – VH, H, M, L, VL
  – Lowering happens when needed
  – Each optimization is performed at the right level


Error Handling

• A good compiler should assist in identifying and locating errors
  – Lexical errors: important, compiler can easily recover and continue
  – Syntax errors: most important for compiler, can almost always recover
  – Static semantic errors: important, can sometimes recover
  – Dynamic semantic errors: hard or impossible to detect at compile time, runtime checks are required
  – Logical errors: hard or impossible to detect


Viable-Prefix Property

• The viable-prefix property of LL/LR parsers allows early detection of syntax errors
  – Goal: detect an error as soon as possible, without consuming unnecessary further input
  – How: detect an error as soon as the prefix of the input does not match a prefix of any string in the language
• Examples: …for (;)… and …DO 10 I = 1;0…
  – In each, the input read so far is the prefix; the error is detected at the first token that cannot extend it
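The idea can be illustrated on a much simpler language than the ones above. The sketch below assumes the language of balanced parentheses and uses a hand-written scanner rather than an LL or LR parser; the function name `first_error` is hypothetical. It stops at the first character after which the input read so far can no longer be extended to a string in the language.

```python
def first_error(text):
    """Return the index at which balanced-parentheses input first fails, or None.

    A prefix can still be completed to a balanced string exactly when it
    contains only parentheses and no ')' lacks an earlier matching '('.
    """
    depth = 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                return i  # no string in the language starts with text[:i + 1]
            depth -= 1
        else:
            return i      # character outside the alphabet
    # Every prefix was viable; the whole input is in the language only if closed.
    return len(text) if depth > 0 else None


if __name__ == "__main__":
    for sample in ["(())", "())(((", "(("]:
        print(repr(sample), first_error(sample))
```

In `"())((("` the error is reported at index 2 and the remaining characters are never examined, which is the "without consuming unnecessary further input" goal stated above.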


Error Recovery Strategies

• Panic mode
  – Discard input until a token in a set of designated synchronizing tokens is found
• Phrase-level recovery
  – Perform local correction on the input to repair the error
• Error productions
  – Augment grammar with productions for erroneous constructs
• Global correction
  – Choose a minimal sequence of changes to obtain a global least-cost correction
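Panic mode is the easiest of these strategies to sketch. The example below assumes a toy statement form `NAME = NUMBER ;` with whitespace-separated tokens and uses `;` as the only synchronizing token; the names `SYNC` and `parse_statements` are made up for the illustration.

```python
SYNC = {";"}  # assumed set of synchronizing tokens


def parse_statements(tokens):
    """Check statements of the form  NAME = NUMBER ;  using panic-mode recovery.

    Returns (number of well-formed statements, list of error token positions).
    """
    expected = [
        str.isidentifier,
        lambda t: t == "=",
        str.isdigit,
        lambda t: t == ";",
    ]
    accepted, errors, pos = 0, [], 0
    while pos < len(tokens):
        for check in expected:
            if pos < len(tokens) and check(tokens[pos]):
                pos += 1
                continue
            errors.append(pos)  # report where the error was detected
            # Panic mode: discard tokens until a synchronizing token is found.
            while pos < len(tokens) and tokens[pos] not in SYNC:
                pos += 1
            pos += 1            # consume the synchronizing token and resume
            break
        else:
            accepted += 1       # all four parts of the statement matched
    return accepted, errors


if __name__ == "__main__":
    print(parse_statements("x = 1 ; y 2 ; z = 3 ;".split()))
```

After the error in the second statement, parsing resumes at `z`, so the third statement is still checked instead of producing a cascade of follow-on errors.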


Grammars (Recap)

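• A grammar is a 4-tuple $G = (N, T, P, S)$ where
  – $T$ is a finite set of tokens (terminal symbols)
  – $N$ is a finite set of nonterminals
  – $P$ is a finite set of productions of the form $\alpha \to \beta$, where $\alpha \in (N \cup T)^* \, N \, (N \cup T)^*$ and $\beta \in (N \cup T)^*$
  – $S \in N$ is a designated start symbol
• In a context-free grammar, $\alpha$ is a single nonterminal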
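The 4-tuple maps directly onto a small data structure. This is a sketch under assumptions: the class name `Grammar`, its field names, and the `is_well_formed` check are illustrative, and symbols are represented as plain strings. The example instance is the expression grammar used in the derivation example later in this deck.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset  # N
    terminals: frozenset     # T
    productions: tuple       # P, as (lhs, rhs) pairs of symbol tuples
    start: str               # S

    def is_well_formed(self):
        """Check the side conditions stated in the definition."""
        if self.nonterminals & self.terminals:
            return False     # N and T must be disjoint
        if self.start not in self.nonterminals:
            return False     # S must be a nonterminal
        symbols = self.nonterminals | self.terminals
        for lhs, rhs in self.productions:
            if not all(s in symbols for s in lhs + rhs):
                return False
            if not any(s in self.nonterminals for s in lhs):
                return False  # the left-hand side must contain a nonterminal
        return True


EXPR = Grammar(
    nonterminals=frozenset({"E"}),
    terminals=frozenset({"+", "*", "(", ")", "-", "id"}),
    productions=(
        (("E",), ("E", "+", "E")),
        (("E",), ("E", "*", "E")),
        (("E",), ("(", "E", ")")),
        (("E",), ("-", "E")),
        (("E",), ("id",)),
    ),
    start="E",
)

if __name__ == "__main__":
    print(EXPR.is_well_formed())
```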


Notational Conventions Used

• Terminals: $a, b, c, \ldots \in T$; specific terminals: 0, 1, id, +
• Nonterminals: $A, B, C, \ldots \in N$; specific nonterminals: expr, term, stmt
• Grammar symbols: $X, Y, Z \in (N \cup T)$
• Strings of terminals: $u, v, w, x, y, z \in T^*$
• Strings of grammar symbols: $\alpha, \beta, \gamma \in (N \cup T)^*$


Derivations (Recap)

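• The one-step derivation is defined by $\alpha A \beta \Rightarrow \alpha \gamma \beta$, where $A \to \gamma$ is a production in the grammar
• In addition, we define
  – $\Rightarrow$ is leftmost, $\Rightarrow_{lm}$, if $\alpha$ does not contain a nonterminal
  – $\Rightarrow$ is rightmost, $\Rightarrow_{rm}$, if $\beta$ does not contain a nonterminal
  – Reflexive-transitive closure $\Rightarrow^*$ (zero or more steps)
  – Positive closure $\Rightarrow^+$ (one or more steps)
• The language generated by G is defined by $L(G) = \{ w \in T^* \mid S \Rightarrow^+ w \}$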


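Derivation (Example)

Grammar $G = (\{E\}, \{+, *, (, ), -, \mathrm{id}\}, P, E)$ with productions P =
  $E \to E + E$
  $E \to E * E$
  $E \to ( E )$
  $E \to - E$
  $E \to \mathrm{id}$

Example derivations:
  $E \Rightarrow - E \Rightarrow - \mathrm{id}$
  $E \Rightarrow_{rm} E + E \Rightarrow_{rm} E + \mathrm{id} \Rightarrow_{rm} \mathrm{id} + \mathrm{id}$
  $E \Rightarrow^* E$
  $E \Rightarrow^* \mathrm{id} + \mathrm{id}$
  $E \Rightarrow^+ \mathrm{id} * \mathrm{id} + \mathrm{id}$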
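The one-step derivation can be replayed mechanically. The following sketch assumes sentential forms are tuples of symbol strings and that `E` is the only nonterminal, as in the grammar above; the names `PRODUCTIONS` and `derive` are hypothetical.

```python
# Right-hand sides of the productions E -> ... from the example grammar.
PRODUCTIONS = {
    ("E", "+", "E"),
    ("E", "*", "E"),
    ("(", "E", ")"),
    ("-", "E"),
    ("id",),
}


def derive(form, rhs, leftmost=True):
    """Apply E -> rhs to the leftmost (or rightmost) E of a sentential form."""
    if rhs not in PRODUCTIONS:
        raise ValueError(f"E -> {' '.join(rhs)} is not a production")
    positions = [i for i, symbol in enumerate(form) if symbol == "E"]
    if not positions:
        raise ValueError("no nonterminal left to rewrite")
    i = positions[0] if leftmost else positions[-1]
    return form[:i] + rhs + form[i + 1:]


if __name__ == "__main__":
    # Rightmost derivation of  id + id
    form = ("E",)
    for rhs in [("E", "+", "E"), ("id",), ("id",)]:
        form = derive(form, rhs, leftmost=False)
        print(" ".join(form))
```

Choosing `positions[0]` corresponds to requiring that everything to the left of the rewritten nonterminal is terminal-only, and `positions[-1]` to the same condition on the right.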


Chomsky Hierarchy: Language Classification


Avram Noam Chomsky is an American linguist, philosopher, and lecturer. He is a professor emeritus of linguistics at MIT. Chomsky is well known in the academic and scientific community as the father of modern linguistics. In the 1950s, Chomsky began developing his theory of generative grammar, which has had a profound influence on linguistics. He established the Chomsky hierarchy, a classification of formal languages in terms of their generative power. His naturalistic approach to the study of language has affected the philosophy of language and mind.


Chomsky Hierarchy: Language Classification

• A grammar G is said to be
  – Regular if it is right linear, where each production is of the form $A \to w B$ or $A \to w$, or left linear, where each production is of the form $A \to B w$ or $A \to w$ (with $w \in T^*$)
  – Context free if each production is of the form $A \to \alpha$, where $A \in N$ and $\alpha \in (N \cup T)^*$
  – Context sensitive if each production is of the form $\alpha A \beta \to \alpha \gamma \beta$, where $A \in N$, $\alpha, \gamma, \beta \in (N \cup T)^*$, $|\gamma| > 0$
  – Unrestricted
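These definitions are purely syntactic, so they can be checked production by production. The sketch below is an illustration under assumptions: productions are `(lhs, rhs)` pairs of symbol tuples, the function name `classify` is hypothetical, and the result describes the grammar as written, not the language it generates (a context-free grammar may still generate a regular language).

```python
def classify(productions, nonterminals):
    """Return the most restrictive class whose production form every rule fits."""

    def is_nt(symbol):
        return symbol in nonterminals

    def single_lhs(lhs):
        return len(lhs) == 1 and is_nt(lhs[0])

    def right_linear(lhs, rhs):  # A -> w B  or  A -> w
        return single_lhs(lhs) and not any(is_nt(s) for s in rhs[:-1])

    def left_linear(lhs, rhs):   # A -> B w  or  A -> w
        return single_lhs(lhs) and not any(is_nt(s) for s in rhs[1:])

    def context_sensitive(lhs, rhs):  # alpha A beta -> alpha gamma beta, |gamma| > 0
        if len(rhs) < len(lhs):
            return False
        for i, symbol in enumerate(lhs):
            alpha, beta = lhs[:i], lhs[i + 1:]
            if (is_nt(symbol) and rhs[:i] == alpha
                    and rhs[len(rhs) - len(beta):] == beta):
                return True
        return False

    if (all(right_linear(l, r) for l, r in productions)
            or all(left_linear(l, r) for l, r in productions)):
        return "regular"
    if all(single_lhs(l) for l, _ in productions):
        return "context free"
    if all(context_sensitive(l, r) for l, r in productions):
        return "context sensitive"
    return "unrestricted"


if __name__ == "__main__":
    anbn = [(("S",), ("a", "S", "b")), (("S",), ("a", "b"))]
    print(classify(anbn, {"S"}))
```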


Chomsky Hierarchy

$L(\text{regular}) \subset L(\text{context free}) \subset L(\text{context sensitive}) \subset L(\text{unrestricted})$

where $L(T) = \{ L(G) \mid G \text{ is of type } T \}$, that is, the set of all languages generated by grammars G of type T.

Examples:
  – Every finite language is regular! (construct a FSA for strings in L(G))
  – $L_1 = \{ a^n b^n \mid n \ge 1 \}$ is context free
  – $L_2 = \{ a^n b^n c^n \mid n \ge 1 \}$ is context sensitive


[Figure: nested sets, from outermost to innermost: unrestricted, context sensitive, context free, regular]


Parsing

• Universal (any C-F grammar)
  – Cocke-Younger-Kasami
  – Earley
• Top-down (C-F grammar with restrictions)
  – Recursive descent (predictive parsing)
  – LL (Left-to-right, Leftmost derivation) methods
• Bottom-up (C-F grammar with restrictions)
  – Operator precedence parsing
  – LR (Left-to-right, Rightmost derivation) methods
    • SLR, canonical LR, LALR
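As a small illustration of a universal method, the sketch below implements Cocke-Younger-Kasami recognition. It assumes the grammar has already been converted to Chomsky normal form; the function name `cyk`, the dictionary layout, and the example grammar for $\{ a^n b^n \mid n \ge 1 \}$ (S → A B | A X, X → S B, A → a, B → b) are choices made for this example.

```python
def cyk(word, unary, binary, start="S"):
    """CYK recognition for a grammar in Chomsky normal form.

    unary:  dict  terminal -> set of nonterminals A with  A -> terminal
    binary: dict  (B, C)   -> set of nonterminals A with  A -> B C
    """
    n = len(word)
    if n == 0:
        return False  # this sketch does not handle the empty string
    # table[i][k] holds the nonterminals that derive word[i : i + k + 1]
    table = [[set() for _ in range(n - i)] for i in range(n)]
    for i, ch in enumerate(word):
        table[i][0] = set(unary.get(ch, ()))
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                for b in table[i][split - 1]:
                    for c in table[i + split][length - split - 1]:
                        table[i][length - 1] |= binary.get((b, c), set())
    return start in table[0][n - 1]


# Assumed CNF grammar for { a^n b^n | n >= 1 }
UNARY = {"a": {"A"}, "b": {"B"}}
BINARY = {("A", "B"): {"S"}, ("A", "X"): {"S"}, ("S", "B"): {"X"}}

if __name__ == "__main__":
    for w in ["ab", "aabb", "aab", "abab"]:
        print(w, cyk(w, UNARY, BINARY))
```

The three nested loops over length, start position, and split point give the cubic running time that makes universal methods less attractive than LL or LR parsing when the grammar meets their restrictions.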


Got it? Revisit the following questions

• Parser
  – Twofold:
  – Checking syntax
  – Invoking semantic actions

• Error Handling
  – Two abilities:
  – Identifying errors
  – Locating errors

• Chomsky Hierarchy
  – From Turing machine to finite automaton
  – From no restrictions to more restrictions


Thank you very much!

Questions?
