CSE P501 – Compiler Construction
description
Transcript of CSE P501 – Compiler Construction
Spring 2014 Jim Hogg - UW - CSE - P501 K-1
CSE P501 – Compiler Construction
Mapping source code to x86
Basic statements and expressions
Objects, method calls, and dynamic dispatch
Spring 2014 Jim Hogg - UW - CSE - P501 K-2
Review: Variables
For MiniJava, data is held in either A stack frame - for variables local to the method An object - for fields
Local variables accessed via ebp ('above' ebp)
eg: mov eax, [ebp - 12]
Fields accessed via object address in a register
Instead of ebp pointing to start of frame, use ecx pointing to start of object
Details later
How to translate language constructs from MiniJava into assembly code?
X = Y, X op Y, IF-THEN, IF-THEN-ELSE, WHILE, etc
If you know assembler code for any chip, the translation templates should be obvious
If you don't know assembler code, imagine translating MiniJava constructs into a primitive language whose only control flow is GOTO
Spring 2014 Jim Hogg - UW - CSE - P501 K-3
Basic Statements and Expressions
Conventions in Examples
Examples show code snippets in isolation
Real code generator needs to deal with things like: Which registers are in use at which point in the program Which registers to spill into memory (pushed onto stack or stored in
stack frame) when a new register is needed and no free ones are available
Register eax used below as a generic example Rename as needed for more complex code
Examples include a few peephole optimizations
Spring 2014 Jim Hogg - UW - CSE - P501 K-4
Spring 2014 Jim Hogg - UW - CSE - P501 K-5
Code Generation for Constants
Source 17
Idea Realize constant value in a register
x86 mov eax, 17
Peephole Optimization if constant is 0, then: xor eax, eax
Spring 2014 Jim Hogg - UW - CSE - P501 K-6
Assignment Statement
Source var = exp; // var is a local variable
x86 <evaluate exp into, say, eax> mov [ebp-offsetvar], eax
Spring 2014 Jim Hogg - UW - CSE - P501 K-7
Unary Minus
Source -exp
x86 <evaluate exp into eax> neg eax
Optimization Collapse -(-exp) to exp
Unary plus is a no-op
Spring 2014 Jim Hogg - UW - CSE - P501 K-8
Binary + Source
exp1 + exp2
x86<evaluate exp1 into eax><evaluate exp2 into edx>add eax, edx
Optimizations If exp2 is a simple variable or constant: add eax,
exp2 Change exp1 + (-exp2) into exp1 - exp2 If exp2 is 1 then: inc eax
Spring 2014 Jim Hogg - UW - CSE - P501 K-9
Binary -, *
Same as + Use sub for – (but not commutative!) Use imul for *
Optimizations Use left shift to multiply by powers of 2
If your multiplier is really slow or you’ve got free scalar units and multiplier is busy, consider: 10*x = (8*x)+(2*x)
But this may perform worse on a different micro-architecture
Use x+x instead of 2*x, etc (faster) Use dec for x-1
Spring 2014 Jim Hogg - UW - CSE - P501 K-10
Integer Division
Slow on x86 Divides 64-bit integer in edx:eax by a 32-bit integer (eg,
in ebx) ~50 clock latency
Sourceexp1 / exp2
x86<evaluate exp1 into eax><evaluate exp2 into ebx>cdq ; extend eax into edx:eax, clobbers edxidiv ebx ; => quotient in eax; remainder in edx
Spring 2014 Jim Hogg - UW - CSE - P501 K-11
Control Flow
Basic idea: decompose higher level operation into jmps and conditional jmps
In following slides, jfalse is used to mean jump when a condition is false
No such instruction on x86! Will have to use appropriate instructions to set condition
codes, followed by conditional jumps Normally wouldn’t actually generate the value “true” or
“false” in a register
Spring 2014 Jim Hogg - UW - CSE - P501 K-12
While Source
while (cond) stm
x86loop: <evaluate cond>
jfalse done<code for stm>jmp loop
done:
Note In generated asm code we’ll need to create a
unique label name for each loop, conditional statement, etc. eg: L123:
Spring 2014 Jim Hogg - UW - CSE - P501 K-13
Optimization for While
Put the test at the endjmp test
loop: <code for stm>test: <evaluate cond>
jtrue loop
Why bother? Pulls one jmp instruction out of the loop May avoid pipeline stall on jmp on each iteration But many chips assume backwards-branch is always taken Hardware branch-predictor will 'tune in' to the pattern
Easy to do from AST or other IR; not so easy if generating code on-the-fly (eg, recursive-descent, 1-pass compiler)
Spring 2014 Jim Hogg - UW - CSE - P501 K-14
Do-While
Sourcedo stm while (cond);
x86loop: <code for stm>
<evaluate cond>jtrue loop
Spring 2014 Jim Hogg - UW - CSE - P501 K-15
IF-THEN
Sourceif (cond) stm
x86<evaluate cond>jfalse skip<code for stm>
skip:
Spring 2014 Jim Hogg - UW - CSE - P501 K-16
IF-THEN-ELSE
Sourceif (cond) stm1 else stm2
x86<evaluate cond>jfalse else<code for stm1>jmp done
else: <code for stm2>done:
Spring 2014 Jim Hogg - UW - CSE - P501 K-17
Jump Chaining
Observation: naïve code gen can will produce jumps to jumps
Optimization: if a jump targets an unconditional jump, change target of first jump to the target of the second
Repeat until no further changes
Spring 2014 Jim Hogg - UW - CSE - P501 K-18
IF-ELSEIF-ELSE
Sourceif (cond1) stm1 else if (cond2) stm2else stm3
x86<evaluate cond1>jfalse elseif<code for stm1>jmp done
elseif:<evaluate cond2>jfalse else<code for stm2>jmp done
else:<code for stm3>jmp done
done:Note: jump-to-next redundancy
Spring 2014 Jim Hogg - UW - CSE - P501 K-19
Boolean Expressions What do we do with this?
x > y Expression evaluates to true or false
Could generate the value in a register (eg: 0|1) But normally we don’t want/need the value; we’re
only trying to decide whether to jump
Eg: if (exp1 > exp2) stm<evaluate exp1 to eax><evaluate exp2 to edx>cmp eax, edxjle L123<code for stm>
L123:
Spring 2014 Jim Hogg - UW - CSE - P501 K-20
Boolean Operators: !
Source ! exp eg: if (!exp) stm
To compile: <evaluate exp into eax as 0 or 1> jnz L123, where L123 is label after stm
Spring 2014 Jim Hogg - UW - CSE - P501 K-21
Boolean Operators: && and ||
In C/C++/Java/C#, these are short-circuit operators Right operand is evaluated only if needed if (cond1 || cond2)
if cond1 evaluates to true, then no need to evaluate cond2 if (cond1 && cond2)
if cond1 evaluates to false, then no need to evaluate cond2 if (p != null && p.x == 2) . . .
this check relies upon short-circuit evaluation!
Basically, generate if statements that jump appropriately and only evaluate operands when needed
Spring 2014 Jim Hogg - UW - CSE - P501 K-22
Example: Code for &&
Source
if (exp1 && exp2) stm
x86 <evaluate exp1>jfalse skip <evaluate exp2>jfalse skip<code for stm>
skip:
Spring 2014 Jim Hogg - UW - CSE - P501 K-23
Example: Code for ||
Source if (exp1 || exp2) stm
x86<evaluate exp1>jtrue doit <evaluate exp2>jfalse skip
doit: <code for stm>skip:
Spring 2014 Jim Hogg - UW - CSE - P501 K-24
Realizing Boolean Values
If a boolean value needs to be stored in a variable or method call parameter, generate code to create result
Typical representations: 0 for false, +1 or -1 for true
C specifies 0 and 1 if stored; we’ll use that Best choice can depend on machine instructions;
normally some convention is established during the primeval history of the architecture (eg: VAX blbc)
Spring 2014 Jim Hogg - UW - CSE - P501 K-25
Boolean Values: Example
Source var = bexp ;
x86<code for bexp>jfalse genFalsemov eax, 1jmp storeIt
genFalse:mov eax, 0
storeIt: mov [ebp+offsetvar], eax
Spring 2014 Jim Hogg - UW - CSE - P501 K-26
Better, If Enough Registers Source
var = bexp ; x86
xor eax, eax<code for bexp>jfalse storeItinc eax
storeIt: mov [ebp+offsetvar], eax
Or use conditional move instruction; avoids pipeline stalls
mov eax, 1 ; assume bexp true<code for bexp>cmovz eax, 0mov [ebp+offsetvar], eax
Spring 2014 Jim Hogg - UW - CSE - P501 K-27
Alternatively: setcc
Source var = x < y;
x86mov eax, [ebp+offsetx] ; load xcmp eax, [ebp+offsety] ; compare to ysetl al ; set AL to 0|1 if less thanmovzx eax,al ; zero-extend to 32 bitsmov [ebp+offsetvar],eax ; generated by asg stm
Gnu mnemonic for movzx (byte->dbl word) is movzbl Or use conditional move (cmovcc) instruction for
sequences like:x = y < z ? y : z
Spring 2014 Jim Hogg - UW - CSE - P501 K-28
Other Control Flow: switch
Naïve: generate a chain of nested if-then-else’s
Better: switch is intended to allow an O(1) selection, provided the set of switch values is reasonably compact
Idea: create a 1-D array of jumps or labels and use the switch expression to select the right one
Also need to generate an IF check that expression lies within bounds
Spring 2014 Jim Hogg - UW - CSE - P501 K-29
Switch
Source
switch (exp) {
case 0: stms0;
case 1: stms1;case 2: stms2;
}
X86-ish<evaluate exp into eax>“if (eax < 0 || eax > 2) jmp done”mov eax, swtab[eax * 4]jmp eax
L0: <code for stms0>jmp done
L1: <code for stms1>jmp done
L2: <code for stms2>jmp done
done:
.dataswtab dd L0 ; switch table
dd L1dd LAssume no
"fallthru"
Spring 2014 Jim Hogg - UW - CSE - P501 K-30
Arrays
C/C++/Java have 1-dimensional arrays 0-origin array with n elements contains variables: a[0] … a[n-
1]
C/C++ n-dimensional arrays Multiple dimensions; row major order
Java n-dimensional "jagged" arrays
Spring 2014 Jim Hogg - UW - CSE - P501 K-31
0-Origin 1-D Integer Arrays
Sourceexp1[exp2]
x86<evaluate exp1 into eax><evaluate exp2 into edx>address is [eax+4*edx] ; 4 bytes per element
0 1 2 3 4 contiguous cells in memory
Spring 2014 Jim Hogg - UW - CSE - P501 K-32
2-D Arrays
C, etc (the curly-brace languages) use row-major order
Fortran (and Excel) uses column-major order Exercises: What is the layout? How do you calculate location of a[i, j]?
What happens when you pass array references between Fortran and C code?
Java has "jagged" arrays - array of arrays
0,0 0,1 0,2 0,3 0,4 1,0 1,1 1,2 1,3 1,4 ...
1 5 -4 23 1 2 6
3 99
Java jagged arrays
int[][] array = {{3, 4, 5}, {77, 50}};
int[][] a = { {1,5,-4,23}, {1,2,6,3}, {99} }
How to find A[row, col]
Spring 2014 Jim Hogg - UW - CSE - P501 K-33
0,0 0,1 0,2 0,3 0,41,0 1,1 1,2 1,3 1,42,0 2,1 2,2 2,3 2,43,0 3,1 3,2 3,3 3,4
row
column
row-major layout
0,0 0,1 0,2 0,3 0,4 1,0 1,1 1,2 1,3 1,4 2,0 2,1 2,2 2,3 2,4
int A[row, col] = @A + (row * numcols + col) * 4 // 0-basedA[row, col] = @A + (row – rowmin) * (colmax – colmin + 1) * elemsize + (col – colmin) * elemsize
Spring 2014 Jim Hogg - UW - CSE - P501 K-34
Code Generation for Objects Representation Method calls Inheritance and overriding
Strategies for implementing code generators
Code improvement – optimization
Next