Theory of Compilation
![Page 1: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/1.jpg)
THEORY OF COMPILATION
Lecture 01 - Introduction
Eran Yahav
![Page 2: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/2.jpg)
2
Who?
Eran Yahav
Taub 734
Tel: [email protected] 13:30-14:30
http://www.cs.tecnion.ac.il/~yahave
TAs:
• Adi Sosnovich
• Maya Arbel
• Guy Hefetz
![Page 3: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/3.jpg)
3
What is a Compiler?
“A compiler is a computer program that transforms source code written in a programming language (source language) into another language (target language). The most common reason for wanting to transform source code is to create an executable program.”
--Wikipedia
![Page 4: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/4.jpg)
4
What is a Compiler?
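[Diagram: Source text (txt, in the source language) → Compiler → Executable code (exe, in the target language)]
Source languages: C, C++, Pascal, Java, PostScript, TeX, Perl, JavaScript, Python, Ruby, Prolog, Lisp, Scheme, ML, OCaml
Target languages: IA32, IA64, SPARC, C, C++, Pascal, Java, Java Bytecode, …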
![Page 5: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/5.jpg)
5
What is a Compiler?
[Diagram: Source text (txt) → Compiler → Executable code (exe)]
Source text:
    int a, b;
    a = 2;
    b = a*2 + 1;
Executable code:
    MOV R1,2
    SAL R1
    INC R1
    MOV R2,R1
![Page 6: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/6.jpg)
6
Anatomy of a Compiler
[Diagram: Source text (txt) → Compiler [Frontend (analysis) → Semantic Representation → Backend (synthesis)] → Executable code (exe)]
Source text:
    int a, b;
    a = 2;
    b = a*2 + 1;
Executable code:
    MOV R1,2
    SAL R1
    INC R1
    MOV R2,R1
![Page 7: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/7.jpg)
7
Modularity
[Diagram: three source languages (txt) enter through frontends SL1, SL2, and SL3; every frontend produces the same Semantic Representation; backends TL1, TL2, and TL3 turn it into executables (exe) for three targets]
Source text:
    int a, b;
    a = 2;
    b = a*2 + 1;
Code for one target:
    MOV R1,2
    SAL R1
    INC R1
    MOV R2,R1
Code for another target:
    SET R1,2
    STORE #0,R1
    SHIFT R1,1
    STORE #1,R1
    ADD R1,1
    STORE #2,R1
![Page 8: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/8.jpg)
8
Anatomy of a Compiler
[Diagram: Source text (txt) → Compiler → Executable code (exe). Inside the compiler: Frontend (analysis) = Lexical Analysis → Syntax Analysis (Parsing) → Semantic Analysis; then the Intermediate Representation (IR); then Backend (synthesis) = Code Generation]
Source text:
    int a, b;
    a = 2;
    b = a*2 + 1;
Executable code:
    MOV R1,2
    SAL R1
    INC R1
    MOV R2,R1
![Page 9: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/9.jpg)
9
Interpreter
[Diagram: Source text (txt) → Interpreter [Frontend (analysis) → Semantic Representation → Execution Engine]; the Execution Engine reads Input and produces Output]
Source text:
    int a, b;
    a = 2;
    b = a*2 + 1;
![Page 10: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/10.jpg)
10
Compiler vs. Interpreter
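[Diagram, compiler: Source text (txt) → Frontend (analysis) → Semantic Representation → Backend (synthesis) → Executable code (exe)]
[Diagram, interpreter: Source text (txt) → Frontend (analysis) → Semantic Representation → Execution Engine; the Execution Engine reads Input and produces Output]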
![Page 11: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/11.jpg)
11
Compiler vs. Interpreter
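[Diagram, compiler: b = a*2 + 1; → Frontend (analysis) → Semantic Representation → Backend (synthesis) → generated code; the generated code maps input 3 to output 7]
Generated code:
    MOV R1,8(ebp)
    SAL R1
    INC R1
    MOV R2,R1
[Diagram, interpreter: b = a*2 + 1; → Frontend (analysis) → Semantic Representation → Execution Engine; the Execution Engine maps input 3 to output 7]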
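The contrast on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple-based tree, the stack-machine instruction names (`PUSH`, `LOAD`, `MUL`, `ADD`), and the function names are all hypothetical and do not come from the lecture. The interpreter computes a value while it walks the semantic representation; the compiler walks it once and emits code that is run afterwards.

```python
# Hypothetical semantic representation of  a*2 + 1  as nested tuples.
EXPR = ("add", ("mul", ("var", "a"), ("const", 2)), ("const", 1))


def interpret(node, env):
    """Interpreter: walk the tree and compute the value immediately."""
    kind = node[0]
    if kind == "const":
        return node[1]
    if kind == "var":
        return env[node[1]]
    left = interpret(node[1], env)
    right = interpret(node[2], env)
    return left * right if kind == "mul" else left + right


def compile_expr(node):
    """Compiler: walk the tree once and emit code for a small stack machine."""
    kind = node[0]
    if kind == "const":
        return [("PUSH", node[1])]
    if kind == "var":
        return [("LOAD", node[1])]
    op = "MUL" if kind == "mul" else "ADD"
    return compile_expr(node[1]) + compile_expr(node[2]) + [(op, None)]


def run(code, env):
    """Execute previously emitted code; the tree is no longer needed."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        elif op == "LOAD":
            stack.append(env[arg])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left * right if op == "MUL" else left + right)
    return stack.pop()


code = compile_expr(EXPR)
print(interpret(EXPR, {"a": 3}))  # 7
print(run(code, {"a": 3}))        # 7
```

Both paths agree on the slide's example (input 3, output 7); the difference is when the tree walk happens.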
![Page 12: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/12.jpg)
12
Just-in-time Compiler (Java example)
[Diagram: Java Source (txt) → Java source to Java bytecode compiler → Java Bytecode (txt) → Java Virtual Machine; the Java Virtual Machine reads Input and produces Output]
Just-in-time compilation: bytecode interpreter (in the JVM) compiles program fragments during interpretation to avoid expensive re-interpretation.
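A rough sketch of the idea, not of how any real JVM works: the runner below interprets a fragment until it has run a few times, then translates it once into a Python closure and uses that from then on. The threshold, the tuple-based tree, and all names (`JitRunner`, `compile_fragment`, `HOT_THRESHOLD`) are assumptions made for illustration; a real just-in-time compiler emits machine code rather than closures.

```python
HOT_THRESHOLD = 3  # hypothetical: compile a fragment once it has run this often

EXPR = ("add", ("mul", ("var", "a"), ("const", 2)), ("const", 1))  # a*2 + 1


def interpret(node, env):
    """Slow path: re-walk the tree on every execution."""
    kind = node[0]
    if kind == "const":
        return node[1]
    if kind == "var":
        return env[node[1]]
    left = interpret(node[1], env)
    right = interpret(node[2], env)
    return left * right if kind == "mul" else left + right


def compile_fragment(node):
    """Translate the tree once into a closure (stand-in for native code)."""
    kind = node[0]
    if kind == "const":
        value = node[1]
        return lambda env: value
    if kind == "var":
        name = node[1]
        return lambda env: env[name]
    left, right = compile_fragment(node[1]), compile_fragment(node[2])
    if kind == "mul":
        return lambda env: left(env) * right(env)
    return lambda env: left(env) + right(env)


class JitRunner:
    """Interpret a fragment until it becomes hot, then switch to compiled code."""

    def __init__(self, fragment):
        self.fragment = fragment
        self.runs = 0
        self.compiled = None

    def run(self, env):
        if self.compiled is not None:
            return self.compiled(env)
        self.runs += 1
        if self.runs >= HOT_THRESHOLD:
            self.compiled = compile_fragment(self.fragment)
        return interpret(self.fragment, env)


runner = JitRunner(EXPR)
print([runner.run({"a": n}) for n in range(5)])  # [1, 3, 5, 7, 9]
```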
![Page 13: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/13.jpg)
13
Why should you care?
• Every person in this class will build a parser some day
  • Or wish she knew how to build one…
• Useful techniques and algorithms
  • Lexical analysis / parsing
  • Semantic representation
  • …
  • Register allocation
• Understand programming languages better
• Understand internals of compilers
• Understand (some) details of target architectures
![Page 14: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/14.jpg)
14
Why should you care?
[Diagram: Source → Compiler → Target, surrounded by the related topics below]
• Useful formalisms
  • Regular expressions
  • Context-free grammars
  • Attribute grammars
• Data structures
• Algorithms
• Programming Languages
• Software Engineering
• Runtime environment
• Garbage collection
• Architecture
![Page 15: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/15.jpg)
15
Why should you care?
![Page 16: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/16.jpg)
16
Course Overview
[Diagram: Source text (txt) → Compiler [Lexical Analysis → Syntax Analysis (Parsing) → Semantic Analysis → Inter. Rep. (IR) → Code Gen.] → Executable code (exe)]
![Page 17: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/17.jpg)
17
Journey inside a compiler
[Pipeline: Lexical Analysis → Syntax Analysis → Sem. Analysis → Inter. Rep. → Code Gen.]
Source text (txt):
    x = b*b - 4*a*c
Token stream:
    <ID,"x"> <EQ> <ID,"b"> <MULT> <ID,"b"> <MINUS> <INT,4> <MULT> <ID,"a"> <MULT> <ID,"c">
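A minimal lexer sketch that produces this token stream, using Python's `re` module. The token names follow the slide; the regular expressions themselves (for example, what counts as an identifier) are assumptions, since the slide does not define the language's lexical rules.

```python
import re

# Token classes as (name, pattern); the patterns are assumed, not from the slide.
TOKEN_SPEC = [
    ("INT", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("EQ", r"="),
    ("MULT", r"\*"),
    ("MINUS", r"-"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Turn source text into a list of tokens, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"illegal character {text[pos]!r} at offset {pos}")
        kind = match.lastgroup  # name of the alternative that matched
        if kind == "INT":
            tokens.append((kind, int(match.group())))
        elif kind == "ID":
            tokens.append((kind, match.group()))
        elif kind != "SKIP":
            tokens.append((kind,))
        pos = match.end()
    return tokens


print(tokenize("x = b*b - 4*a*c"))
```

Only identifiers and integers carry an attribute; operator tokens are fully described by their kind.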
![Page 18: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/18.jpg)
18
Journey inside a compiler
[Pipeline: Lexical Analysis → Syntax Analysis → Sem. Analysis → Inter. Rep. → Code Gen.]
Token stream:
    <ID,"x"> <EQ> <ID,"b"> <MULT> <ID,"b"> <MINUS> <INT,4> <MULT> <ID,"a"> <MULT> <ID,"c">
[Syntax tree: parse tree of the token stream, built from expression, term, and factor nodes over the leaves 'b', 'b', '4', 'a', 'c', joined by MULT and MINUS]
![Page 19: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/19.jpg)
19
Journey inside a compiler
[Pipeline: Lexical Analysis → Syntax Analysis → Sem. Analysis → Inter. Rep. → Code Gen.]
Abstract syntax tree:
    MINUS
      MULT
        'b'
        'b'
      MULT
        MULT
          '4'
          'a'
        'c'
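One way to get from tokens to such a tree is a recursive-descent parser. The sketch below assumes a small grammar in the spirit of the expression/term/factor nodes shown earlier (expression: terms separated by MINUS; term: factors separated by MULT; factor: ID or INT); the lecture does not spell out its grammar, so treat this as an illustration. Trees are plain nested tuples.

```python
def parse_expression(tokens):
    """Parse a token list into an AST of nested tuples."""
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else None

    def factor():
        nonlocal pos
        if peek() not in ("ID", "INT"):
            raise SyntaxError(f"expected ID or INT at token {pos}")
        leaf = tokens[pos][1]
        pos += 1
        return leaf

    def term():
        nonlocal pos
        node = factor()
        while peek() == "MULT":
            pos += 1
            node = ("MULT", node, factor())  # left-associative
        return node

    def expression():
        nonlocal pos
        node = term()
        while peek() == "MINUS":
            pos += 1
            node = ("MINUS", node, term())
        return node

    tree = expression()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return tree


# Tokens for the right-hand side  b*b - 4*a*c
TOKENS = [("ID", "b"), ("MULT",), ("ID", "b"), ("MINUS",), ("INT", 4),
          ("MULT",), ("ID", "a"), ("MULT",), ("ID", "c")]
print(parse_expression(TOKENS))
# ('MINUS', ('MULT', 'b', 'b'), ('MULT', ('MULT', 4, 'a'), 'c'))
```

Because `term` is called from `expression`, multiplication binds tighter than subtraction, which gives the tree shape on the slide.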
![Page 20: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/20.jpg)
20
Journey inside a compiler
[Pipeline: Lexical Analysis → Syntax Analysis → Sem. Analysis → Inter. Rep. → Code Gen.]
Annotated abstract syntax tree:
    MINUS (type: int, loc: R1)
      MULT (type: int, loc: R1)
        'b' (type: int, loc: sp+16)
        'b' (type: int, loc: sp+16)
      MULT (type: int, loc: R2)
        MULT (type: int, loc: R2)
          '4' (type: int, loc: const)
          'a' (type: int, loc: sp+8)
        'c' (type: int, loc: sp+24)
![Page 21: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/21.jpg)
21
Journey inside a compiler
[Pipeline: Lexical Analysis → Syntax Analysis → Sem. Analysis → Inter. Rep. → Code Gen.]
[Diagram: the annotated abstract syntax tree for b*b - 4*a*c, every node carrying a type and a location (variables at sp+8, sp+16, sp+24; intermediate results in R1 and R2)]
Intermediate representation:
    R2 = 4*a
    R1 = b*b
    R2 = R2*c
    R1 = R1-R2
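Lowering a tree to this kind of linear code can be sketched as a post-order walk that emits one instruction per operator node. Unlike the slide, which already uses registers R1 and R2, this sketch allocates a fresh temporary (`t1`, `t2`, …) for every result and leaves register assignment to a later phase; that choice and the names are assumptions.

```python
import itertools


def lower(ast):
    """Return (instructions, name holding the result) for a nested-tuple AST."""
    code = []
    temps = (f"t{i}" for i in itertools.count(1))
    symbols = {"MULT": "*", "MINUS": "-"}

    def visit(node):
        if not isinstance(node, tuple):
            return str(node)  # leaf: a variable name or a constant
        op, left, right = node
        left_name = visit(left)
        right_name = visit(right)
        target = next(temps)  # fresh temporary for this node's value
        code.append(f"{target} = {left_name} {symbols[op]} {right_name}")
        return target

    result = visit(ast)
    return code, result


AST = ("MINUS", ("MULT", "b", "b"), ("MULT", ("MULT", 4, "a"), "c"))
code, result = lower(AST)
print("\n".join(code))
# t1 = b * b
# t2 = 4 * a
# t3 = t2 * c
# t4 = t1 - t3
```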
![Page 22: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/22.jpg)
22
Journey inside a compiler
[Pipeline: Lexical Analysis → Syntax Analysis → Sem. Analysis → Inter. Rep. → Code Gen.]
[Diagram: the annotated abstract syntax tree for b*b - 4*a*c, every node carrying a type and a location (variables at sp+8, sp+16, sp+24; intermediate results in R1 and R2)]
Intermediate representation:
    R2 = 4*a
    R1 = b*b
    R2 = R2*c
    R1 = R1-R2
Assembly code:
    MOV R2,(sp+8)
    SAL R2,2
    MOV R1,(sp+16)
    MUL R1,(sp+16)
    MUL R2,(sp+24)
    SUB R1,R2
![Page 23: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/23.jpg)
23
Error Checking
In every stage…
• Lexical analysis: illegal tokens
• Syntax analysis: illegal syntax
• Semantic analysis: incompatible types, undefined variables, …
Every phase tries to recover and proceed with compilation (why?)
• Divergence is a challenge
![Page 24: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/24.jpg)
24
Errors in lexical analysis
    pi = 3.141.562   → Illegal token
    pi = 3oranges    → Illegal token
    pi = oranges3    → <ID,"pi">, <EQ>, <ID,"oranges3">
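The three cases above can be reproduced with a small scanner. It is a sketch under assumed lexical rules (identifiers start with a letter or underscore; numbers have at most one decimal point); the slides do not state the actual rules. The scanner first cuts the input into word-like chunks and only then classifies each chunk, so `3.141.562` is rejected as a whole instead of being split into smaller legal pieces.

```python
import re

CHUNK = re.compile(r"[A-Za-z0-9_.]+|\S")   # word-like run, or one other symbol
ID = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")  # assumed identifier rule
NUM = re.compile(r"\d+(\.\d+)?")            # assumed number rule


def scan(text):
    """Classify each chunk; raise SyntaxError on an illegal token."""
    tokens = []
    for chunk in CHUNK.findall(text):
        if ID.fullmatch(chunk):
            tokens.append(("ID", chunk))
        elif NUM.fullmatch(chunk):
            tokens.append(("NUM", chunk))
        elif chunk == "=":
            tokens.append(("EQ",))
        else:
            raise SyntaxError(f"illegal token {chunk!r}")
    return tokens


for line in ["pi = 3.141.562", "pi = 3oranges", "pi = oranges3"]:
    try:
        print(line, "->", scan(line))
    except SyntaxError as error:
        print(line, "->", error)
```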
![Page 25: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/25.jpg)
25
Error detection: type checking
    x = 4*a*"oranges"
Annotated abstract syntax tree:
    MULT (type: int, loc: R2)
      MULT (type: int, loc: R2)
        '4' (type: int, loc: const)
        'a' (type: int, loc: sp+8)
      "oranges" (type: string, loc: const)
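A type checker for this example can be sketched as a bottom-up walk that computes a type for every node and reports the first mismatch. The rule that MULT needs two int operands, the symbol table contents, and the convention that string literals keep their quotes are assumptions made for illustration.

```python
SYMBOLS = {"a": "int"}  # hypothetical symbol table: variable -> declared type


def type_of(node):
    """Return the type of an AST node, or raise on a semantic error."""
    if isinstance(node, tuple):
        op, left, right = node
        left_type, right_type = type_of(left), type_of(right)
        if left_type != "int" or right_type != "int":
            raise TypeError(
                f"{op}: incompatible operand types {left_type} and {right_type}")
        return "int"
    if isinstance(node, int):
        return "int"
    if node.startswith('"'):
        return "string"  # string literal, quotes kept by convention
    if node not in SYMBOLS:
        raise NameError(f"undefined variable {node}")
    return SYMBOLS[node]


# 4 * a * "oranges"
AST = ("MULT", ("MULT", 4, "a"), '"oranges"')
try:
    type_of(AST)
except TypeError as error:
    print("type error:", error)
# type error: MULT: incompatible operand types int and string
```

The inner `4*a` checks fine; the error surfaces at the outer MULT, where an int meets a string.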
![Page 26: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/26.jpg)
26
The Real Anatomy of a Compiler
[Diagram: each phase and the data it passes on]
Source text (txt)
→ Process text input → characters
→ Lexical Analysis → tokens
→ Syntax Analysis → AST
→ Sem. Analysis → Annotated AST
→ Intermediate code generation → IR
→ Intermediate code optimization → IR
→ Code generation → Symbolic Instructions (SI)
→ Target code optimization → SI
→ Machine code generation → MI
→ Write executable output → Executable code (exe)
![Page 27: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/27.jpg)
27
Optimizations
• "Optimal code" is out of reach
  • Many problems are undecidable or too expensive (NP-complete)
  • Use approximation and/or heuristics
  • Must preserve correctness, should (mostly) improve code
• Many optimization heuristics
  • Loop optimizations: hoisting, unrolling, …
  • Peephole optimizations
  • Constant propagation
    • Leverage compile-time information to save work at runtime (pre-computation)
  • Dead code elimination
• Majority of compilation time is spent in the optimization phase
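As a small illustration of two of these heuristics, the sketch below runs constant propagation and then dead code elimination over straight-line code for the earlier example `a = 2; b = a*2 + 1`. The instruction format `(dest, op, x, y)` and the function names are hypothetical, and both passes only handle code without branches or loops.

```python
def propagate_constants(code):
    """Substitute known constants and fold operations on constants."""
    known = {}  # variable -> value known at compile time
    out = []
    for dest, op, x, y in code:
        x = known.get(x, x) if isinstance(x, str) else x
        y = known.get(y, y) if isinstance(y, str) else y
        if op == "copy" and isinstance(x, int):
            value = x
        elif op in ("*", "+") and isinstance(x, int) and isinstance(y, int):
            value = x * y if op == "*" else x + y
        else:
            value = None
        if value is None:
            known.pop(dest, None)  # dest no longer holds a known constant
            out.append((dest, op, x, y))
        else:
            known[dest] = value
            out.append((dest, "copy", value, None))
    return out


def eliminate_dead_code(code, live_out):
    """Drop assignments whose result is never used (backward liveness pass)."""
    live = set(live_out)
    kept = []
    for dest, op, x, y in reversed(code):
        if dest in live:
            live.discard(dest)
            live.update(v for v in (x, y) if isinstance(v, str))
            kept.append((dest, op, x, y))
    kept.reverse()
    return kept


# a = 2; t = a*2; b = t + 1
PROGRAM = [("a", "copy", 2, None), ("t", "*", "a", 2), ("b", "+", "t", 1)]
folded = propagate_constants(PROGRAM)
print(folded)                              # every right-hand side is now a constant
print(eliminate_dead_code(folded, {"b"}))  # [('b', 'copy', 5, None)]
```

If only `b` is needed afterwards, the whole computation collapses to `b = 5` at compile time.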
![Page 28: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/28.jpg)
28
Machine code generation
• Register allocation
  • Optimal register assignment is NP-Complete
  • In practice, known heuristics perform well
• Assign variables to memory locations
• Instruction selection
  • Convert IR to actual machine instructions
• Modern architectures
  • Multicores
  • Challenging memory hierarchies
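One family of register allocation heuristics treats the problem as graph coloring: variables that are live at the same time "interfere" and must not share a register. The sketch below is a plain greedy coloring, picked here only as an example of such a heuristic; the slides do not say which heuristics they have in mind. The interference graph is written by hand for four hypothetical temporaries computing b*b - 4*a*c (`t1 = b*b`, `t2 = 4*a`, `t3 = t2*c`, `t4 = t1 - t3`).

```python
def allocate_registers(interference, num_registers):
    """Greedy coloring: map each variable to a register index or "spill"."""
    assignment = {}
    # Handle the most constrained variables (most neighbours) first.
    order = sorted(interference, key=lambda v: (-len(interference[v]), v))
    for var in order:
        taken = {assignment[n] for n in interference[var] if n in assignment}
        free = [r for r in range(num_registers) if r not in taken]
        assignment[var] = free[0] if free else "spill"
    return assignment


# t1 is still needed while t2 and t3 are computed, so it interferes with both.
GRAPH = {
    "t1": {"t2", "t3"},
    "t2": {"t1"},
    "t3": {"t1"},
    "t4": set(),
}
print(allocate_registers(GRAPH, 2))
# {'t1': 0, 't2': 1, 't3': 1, 't4': 0}
```

With two registers nothing spills, and the result mirrors the two-register usage in the slides' code, where the b*b value stays in one register while the other is reused. With a single register, `t2` and `t3` would be marked `"spill"` and have to live in memory.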
![Page 29: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/29.jpg)
29
Compiler Construction Toolset
• Lexical analysis generators
  • lex
• Parser generators
  • yacc
• Syntax-directed translators
• Dataflow analysis engines
![Page 30: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/30.jpg)
30
Summary
• A compiler is a program that translates code from a source language to a target language
• Compilers play a critical role
  • Bridge from programming languages to the machine
  • Many useful techniques and algorithms
  • Many useful tools (e.g., lexer/parser generators)
• A compiler is constructed from modular phases
  • Reusable
  • Different front/back ends
![Page 31: Theory of Compilation](https://reader031.fdocuments.us/reader031/viewer/2022012918/5681332c550346895d9a2347/html5/thumbnails/31.jpg)
31
Coming up next
Lexical analysis