1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is...
-
Upload
nicholas-phelps -
Category
Documents
-
view
217 -
download
0
Transcript of 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is...
![Page 1: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/1.jpg)
1
Structure of a Compiler
• Front end of a compiler is efficient and can be automated
• Back end is generally hard to automate and finding the optimum solution requires exponential time
• Intermediate code generation can effect the performance of the back end
InstructionSelection
InstructionScheduling
RegisterAllocation
Scanner ParserSemantic Analysis
Code Optimization
IntermediateCode
GenerationIR
![Page 2: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/2.jpg)
2
Intermediate Representations
• Abstract Syntax Trees (AST)
• Directed Acyclic Graphs (DAG)
• Control Flow Graphs (CFG)
• Static Single Assignment Form (SSA)
• Stack Machine Code
• Three Address Code
• Hybrid approaches mix graphical and linear representations
– SGI and SUN compilers use three address code but provide ASTs for loops if-statements and array references
– Use three-address code in basic blocks in control flow graphs
high-level
low-level
Graphical IRs
Linear IRs
![Page 3: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/3.jpg)
3
Abstract Syntax Trees (ASTs)
if (x < y)
x = 5*y + 5*y/3;
else
y = 5;
x = x+y;
Statements
<
AssignStmt
+
*
x
IfStmt
AssignStmt AssignStmt
x x y+ yxy
/
5 y 3*
5 y
5
![Page 4: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/4.jpg)
4
Directed Acyclic Graphs (DAGs)
• Use directed acyclic graphs to represent expressions
– Use a unique node for each expression
if (x < y)
x = 5*y + 5*y/3;
else
y = 5;
x = x+y;
Statements
<
AssignStmt
*
IfStmt
AssignStmt AssignStmt
x +y
/
5
3
![Page 5: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/5.jpg)
5
Control Flow Graphs (CFGs)
• Nodes in the control flow graph are basic blocks
– A basic block is a sequence of statements always entered at the beginning of the block and exited at the end
• Edges in the control flow graph represent the control flow
if (x < y)
x = 5*y + 5*y/3;
else
y = 5;
x = x+y;
if (x < y) goto B1 else goto B2
x = 5*y + 5*y/3 y = 5
x = x+y
B1 B2
B0
B3
• Each block has a sequence of statements• No jump from or to the middle of the block• Once a block starts executing, it will execute till the end
![Page 6: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/6.jpg)
6
Stack Machine Code if (x < y)
x = 5*y + 5*y/3;
else
y = 5;
x = x+y;
load x load y iflt L1 goto L2L1: push 5 load y multiply push 5 load y multiply push 3 divide add store x goto L3L2: push 5 store yL3: load x load y add store x
pops the toptwo elements andcompares them
pops the top two elements, multipliesthem, and pushes theresult back to the stack
pushes the valueat the location x to the stack
stores the value at the top of the stack to the location x
![Page 7: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/7.jpg)
7
Three-Address Code
• Each instruction can have at most three operands
• Assignments– x := y– x := y op z op: binary arithmetic or logical operators– x := op y op: unary operators (minus, negation, integer to
float conversion)• Branch
– goto L Execute the statement with labeled L next
• Conditional Branch– if x relop y goto L relop: <, =, <=, >=, ==, !=
• if the condition holds we execute statement labeled L next• if the condition does not hold we execute the statement following
this statement next
![Page 8: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/8.jpg)
Data structures for three address codes
• Quadruples
– Has four fields: op, arg1, arg2 and result
• Triples
– Temporaries are not used and instead references to instructions are made
• Indirect triples
– In addition to triples we use a list of pointers to triples
![Page 9: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/9.jpg)
Quadruples
• A record structure with 4 fields
– op, arg1, arg2 and result
• Examples
– For x := y op z we have:
• y in arg1, z in arg2 and x in result
– For unary operators, arg2 not used
• Content of fields are pointers to ST entries
![Page 10: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/10.jpg)
Triples
• Temps generated in quadruples must be entered in symbol table
• To avoid this, we can refer to a temp value by the location of the relevant statement
– We can have records with only 3 fields
• op, arg1 and arg2
– Fields arg1 and arg2 can be pointers to ST entries or to triple structure for temp values
![Page 11: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/11.jpg)
Indirect Triples
• Listing of pointers to triples, rather than triples themselves
• Example
– We can use an array to list pointers to triples in the desired order
![Page 12: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/12.jpg)
Example
• b * minus c + b * minus c
t1 = minus ct2 = b * t1t3 = minus ct4 = b * t3t5 = t2 + t4a = t5
Three address code
minus*
minus c t3*+=
c t1b t2t1
b t4t3t2 t5t4t5 a
arg1 resultarg2op
Quadruples
minus*
minus c*+=
cb (0)
b (2)(1) (3)a
arg1 arg2op
Triples
(4)
012345
minus*
minus c*+=
cb (0)
b (2)(1) (3)a
arg1 arg2op
Indirect Triples
(4)
012345
(0)(1)
(2)(3)(4)(5)
op353637383940
![Page 13: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/13.jpg)
Examples
1. X=(a+b)*-c/d
13
![Page 14: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/14.jpg)
14
Three-Address Code Generation for a Simple Grammar
Productions Semantic RulesS id := E id.place lookup(id.name);
S.code E.code || gen(id.place ‘:=‘ E.place);E E1 + E2 E.place newtemp();
E.code E1.code || E2.code || gen(E.place ‘:=‘ E1.place ‘+’ E2.place);E E1 * E2 E.place newtemp();
E.code E1.code || E2.code || gen(E.place ‘:=‘ E1.place ‘*’ E2.place);E ( E1 )E.code E1.code;
E.place E1.place;E E1 E.place newtemp();
E.code E1.code || gen(E.place ‘:=‘ ‘uminus’ E1.place);E id E.place lookup(id.name);
E.code ‘’ (empty string)
Attributes: E.place: location that holds the value of expression EE.code: sequence of instructions that are generated for E
Procedures: newtemp(): Returns a new temporary each time it is calledgen(): Generates instruction (have to call it with appropriate arguments)lookup(id.name): Returns the location of id from the symbol table
![Page 15: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/15.jpg)
15
Stack Machine Code Generation for a Simple Grammar
Productions Semantic RulesS id := E id.place lookup(id.name);
S.code E.code || gen(‘store’ id.place);E E1 + E2 E.code E1.code || E2.code || gen(‘add’);
(arguments for the add instruction are in the top of the stack)E E1 * E2 E.code E1.code || E2.code || gen(‘multiply’);E ( E1 )E.code E1.code; E E1 E.code E1.code || gen( ‘negate‘);E id E.code gen(‘load’ id.place)
Attributes: E.code: sequence of instructions that are generated for E(no place for an expression is needed since the result of an expressionis stored in the operand stack)
Procedures: newtemp(): Returns a new temporary each time it is calledgen(): Generates instruction (have to call it with appropriate arguments)lookup(id.name): Returns the location of id from the symbol table
![Page 16: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/16.jpg)
16
Code Generation for Boolean Expressions
• Two approaches
– Numerical representation
– Implicit representation
• Numerical representation
– Use 1 to represent true, use 0 to represent false
– For three-address code store this result in a temporary
– For stack machine code store this result in the stack
• Implicit representation
– For the boolean expressions which are used in flow-of-control statements (such as if-statements, while-statements etc.) boolean expressions do not have to explicitly compute a value, they just need to branch to the right instruction
– Generate code for boolean expressions which branch to the appropriate instruction based on the result of the boolean expression
![Page 17: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/17.jpg)
17
Boolean Expressions: Numerical Representation
Attributes : E.place: location that holds the value of expression E E.code: sequence of instructions that are generated for Eid.place: location for id
Global variable: nextstat: Returns the location of the next instruction to be generated(each call to gen() increments nextstat by 1)
Productions Semantic Rules
E id1 relop id2 E.place newtemp();E.code gen(‘if’ id1.place relop.op id2.place ‘goto’ nextstat+3);
|| gen(E.place ‘:=‘ ‘0’) || gen(‘goto’ nextstat+2)|| gen(E.place ‘:=‘ ‘1’);
E E1 and E2 E.place newtemp();E.code E1.code || E2.code
|| gen(E.place ‘:=‘ E1.place ‘and’ E2.place);
![Page 18: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/18.jpg)
18
Boolean Expressions: Implicit Representation
Productions Semantic Rules
E id1 relop id2 E.code gen(‘if’ id1.place relop.op id2.place ‘goto’ E.true)|| gen(‘goto’ E.false);
E E1 and E2 E1.true newlabel();E1.false E. false; (short-circuiting)E2.true E. true;E2.false E. false;E.code E1.code || gen(E1.true ‘:’) || E2.code ;
Attributes : E.code: sequence of instructions that are generated for EE.false: instruction to branch to if E evaluates to falseE.true: instruction to branch to if E evaluates to true(E.code is synthesized whereas E.true and E.false are inherited)id.place: location for id
can be any relational operator:==, <=, >= !=
These places will be filled with lables later on when they become available
This generated labelwill be inserted to the place for E1.true in the code generated for E1
![Page 19: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/19.jpg)
19
Example
100 if x < y goto 103101 t1 := 0102 goto 104103 t1 := 1104 if a = b goto 107105 t2 := 0106 goto 108107 t2 := 1108 t3 := t1 and t2
Input boolean expression:x < y and a == b
Numerical representation:
if x < y goto L1goto LFalse
L1: if a = b goto LTruegoto LFalse...
LTrue:
LFalse:
Implicit representation:
These are the locations ofthree-address code instructions, they are not labels
These labels will be generatedlater on, and will be insertedto the corresponding places
![Page 20: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/20.jpg)
20
Flow-of-Control Statements
If-then-else
• Branch based on the result of boolean expression
Loops
• Evaluate condition before loop (if needed)
• Evaluate condition after loop
• Branch back to the top if condition holds
Merges test with last block of loop body
While, for, do, and until all fit this basic model
Pre-test
Loop body
Post-test
Next block
![Page 21: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/21.jpg)
21
Flow-of-Control Statements: Code Structure
E.code
S1.code
goto S.next
S2.code
to E.true
to E.false
E.true:
E.false:
S.next:
S if E then S1 else S2
if E evaluates to true
if E evaluates to false
E.code
S1.code
goto S.begin
to E.true
to E.false
E.true:
E.false:
S.begin:
S while E do S1
Another approach is to place E.code after S1.code
![Page 22: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/22.jpg)
22
Flow-of-Control Statements
Productions Semantic RulesS if E then S1 else S2 E.true newlabel();
E.false newlabel(); S1.next S. next;S2.next S. next;S.code E.code || gen(E.true ‘:’) || S1.code
|| gen(‘goto’ S.next) || gen(E.false ‘:’) || S2.code ;
S while E do S1 S.begin newlabel();E.true newlabel(); E.false S. next;S1.next S. begin;S.code gen(S.begin ‘:’) || E.code || gen(E.true ‘:’) || S1.code
|| gen(‘goto’ S.begin);
S S1 ; S2 S1.next newlabel();S2.next S.next;S.code S1.code || gen(S1.next ‘:’) || S2.code
Attributes : S.code: sequence of instructions that are generated for SS.next: label of the instruction that will be executed immediately after S(S.next is an inherited attribute)
![Page 23: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/23.jpg)
23
Example
Input code fragment:
while (a < b) { if (c < d) x = y + z; else x = y – z}
L1: if a < b goto L2goto LNext
L2: if c < d goto L3goto L4
L3: t1 := y + zx := t1goto L1
L4: t2 := y – zx := t2goto L1
LNext: ...
![Page 24: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/24.jpg)
24
Backpatching
• E.true, E.false, S.next may not be computed in a single pass (they are inherited attributes)
• Backpatching is a technique for generating labels for E.true, E.false, S.next and inserting them to the appropriate locations
• Basic idea
– Keep lists E.truelist, E.falselist, S.nextlist
• E.truelist: the list of instructions where the label for E.true have to be inserted when it becomes available
• S.nextlist: the list of instructions where the label for S.next have to be inserted when it becomes available
– When labels E.true, E.false, S.next are computed these labels are inserted to the instructions in these lists
![Page 25: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/25.jpg)
25
Flow-of-Control Statements: Case Statements
Case Statements1 Evaluate the controlling expression2 Branch to the selected case3 Execute the code for that case4 Branch to the statement after the case
Part 2 is the key
Strategies
• Linear search (nested if-then-else constructs)
• Build a table of case values & binary search it
• Directly compute an address (requires dense case set)
– Use an array of labels that is addressed by the case value
![Page 26: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/26.jpg)
26
Type Conversions
Mixed-type expressions
• Insert conversions as needed from conversion table
– i2f r1, r2 (convert the integer value in register r1 to float, and store the result in register r2)
• Most languages have symmetric conversion tables
Typical Addition
Table
![Page 27: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/27.jpg)
translation of a simple if-statement•
•
![Page 28: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/28.jpg)
Backpatching
• Previous codes for Boolean expressions insert symbolic labels for jumps
• It therefore needs a separate pass to set them to appropriate addresses• We can use a technique named backpatching to avoid this• We assume we save instructions into an array and labels will be
indices in the array• For nonterminal B we use two attributes B.truelist and B.falselist
together with following functions:– makelist(i): create a new list containing only I, an index into the array
of instructions– Merge(p1,p2): concatenates the lists pointed by p1 and p2 and returns
a pointer to the concatenated list– Backpatch(p,i): inserts i as the target label for each of the instruction
on the list pointed to by p
![Page 29: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/29.jpg)
Backpatching for Boolean Expressions•
•
![Page 30: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/30.jpg)
Backpatching for Boolean Expressions• Annotated parse tree for x < 100 || x > 200 && x ! = y
![Page 31: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/31.jpg)
Flow-of-Control Statements
![Page 32: 1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.](https://reader035.fdocuments.us/reader035/viewer/2022081519/56649e745503460f94b73de0/html5/thumbnails/32.jpg)
Translation of a switch-statement