Code Generation - Western Michigan University

44
Code Generation Code Generation Chapter 8 Chapter 8: Code Generation 1 CS5810 Spring 2009

Transcript of Code Generation - Western Michigan University

Page 1: Code Generation - Western Michigan University

Code GenerationCode Generation

Chapter 8

Chapter 8: Code Generation 1CS5810 Spring 2009

Page 2: Code Generation - Western Michigan University

Code GenerationCode Generation

Front End

Code Optimizer

Code Generator

Intermediate  Intermediate Code  Code 

Targetprogram

• Severe requirements– Target program preserve the semantics– Effective use of available resourcesEffective use of available resources– Itself must be efficient

• Primary tasksI t ti l ti– Instruction selection

– Register allocation and assignment– Instruction ordering

Chapter 8: Code Generation 2CS5810 Spring 2009

Page 3: Code Generation - Western Michigan University

Input to the Code GeneratorInput to the Code Generator

• IR produced by the front endIR produced by the front end– Three address code: quadruples, triples

Virtual machine: bytecodes stack machine code– Virtual machine: bytecodes, stack‐machine code

– Linear representations: postfix notation

G hi l t ti DAG t t– Graphical representations: DAG, syntax tree

• Symbol table– Determine the run‐time addresses of the names in IR

Chapter 8: Code Generation CS5810 Spring 2009 3

Page 4: Code Generation - Western Michigan University

The Target ProgramThe Target Program

• RISC (reduced instruction set computer)SC ( educed st uct o set co pute )– Many registers, three‐address instructions, simple addressing modes, simple instruction set arch

• CISC (complex instruction set computer)– few registers, two‐address instructions, various addressing modes variable length instruction setaddressing modes, variable‐length instruction set

• Stack‐based MachinePushing operands onto a stack– Pushing operands onto a stack 

– Almost disappeared, then revived with Java Virtual Machine (JVM)

Chapter 8: Code Generation CS5810 Spring 2009 4

Page 5: Code Generation - Western Michigan University

The Target ProgramThe Target Program

• Absolute machine‐language programsAbsolute machine language programs– Can be placed in a fixed location in memory and immediately executed

• Relocatable machine‐language programs– Subprograms can be complied separately, then linked together and loaded for execution by linking loader

A bl l• Assembly‐language program– Code generation easier, need assembly step

Chapter 8: Code Generation CS5810 Spring 2009 5

Page 6: Code Generation - Western Michigan University

Instruction SelectionInstruction Selection

• Map the IR into a code sequence that can beMap the IR into a code sequence that can be executed by the target machine

x = y + z a = b + c; d = a + ex = y + z

LD R0, b

a = b + c; d = a + e

LD R0, yADD R0 R0

,ADD R0, R0, cST a, R0

ADD R0, R0, zST x, R0

,LD R0, aADD R0, R0, e

Chapter 8: Code Generation 6CS5810 Spring 2009

, ,ST d, R0

Page 7: Code Generation - Western Michigan University

Register AllocationRegister Allocation

• Register allocation: select the set of vars that will greside in registers 

• Register assignment: pick the specific register that a var will reside inthat a var will reside in

L R1 aL  R1, aA  R1, bM R0 c

t = a + bt = t * c

M  R0, cD  R0, dST R1 t

t = t  / d

Chapter 8: Code Generation 7CS5810 Spring 2009

ST  R1, t

Page 8: Code Generation - Western Michigan University

Evaluation OrderEvaluation Order

• The order can affect the efficiencyThe order can affect the efficiency– Some orders require fewer registers

Picking best order is NP complete– Picking best order is NP‐complete

Chapter 8: Code Generation CS5810 Spring 2009 8

Page 9: Code Generation - Western Michigan University

The Target LanguageThe Target Language

• Assume following kinds of instructionsAssume following kinds of instructions– Load: LD dst, addr (LD r, x)

Store: ST x r– Store: ST x, r

– Computation: OP dst, src1, src2 (unary operators do not have src2)do not have src2)

– Unconditional jumps: BR L

Conditional jumps: Bcond r L (BLTZ r L)– Conditional jumps: Bcond r, L (BLTZ r, L) 

Chapter 8: Code Generation CS5810 Spring 2009 9

Page 10: Code Generation - Western Michigan University

The Target LanguageThe Target Language

• Addressing modedd ess g ode– Variable name x– Indexed address a(r)

• LD R1, a(R2) is R1 = contents(a+contents(R2))

– An integer indexed by a register• LD R1 100(R2) is R1 = contents(100+contents(R2))• LD R1, 100(R2)  is R1 = contents(100+contents(R2))

– Indirect addressing modes• LD R1, *100(R2) is R1=contents(contents(100+contents(R2))

– Immediate constant address mode• LD R1, #100

Chapter 8: Code Generation CS5810 Spring 2009 10

Page 11: Code Generation - Western Michigan University

The Target Language ExampleThe Target Language Examplex=y‐z b=a[i]

LD  R1, yLD  R2, z

LD  R1, iMUL  R1, R1, 8,

SUB R1, R1, R2ST  x, R1

, ,LD R2, a(R1)ST  b, R2, ,

x=*p if x < y goto L

LD  R1, pLD R2 0(R1)

LD  R1, xLD  R2, y

Chapter 8: Code Generation 11CS5810 Spring 2009

LD R2, 0(R1)ST  x, R2 Chapter 6: Intermediate Code 

Generation

SUB  R1, R1, R2BLTZ  R1, L

Page 12: Code Generation - Western Michigan University

Program and Instruction CostsProgram and Instruction Costs

• Determining the actual cost of compiling andDetermining the actual cost of compiling and running a program is complex– Cost of an instruction is one plus the costs– Cost of an instruction is one plus the costs associated with the addressing modes

• LD R0, R1: cost 1LD R0, R1: cost 1

• LD R0, M: cost 2

• LD R1, *100(R2): cost 3

Chapter 8: Code Generation CS5810 Spring 2009 12

Page 13: Code Generation - Western Michigan University

TEST YOURSELF #1TEST YOURSELF #1

• Generate code for the following three addressGenerate code for the following three address sequences 

x = a[i]y = b[i]

y = *qq = q + 4

Z = x * y *p = yp = p + 4

CS5810 Spring 2009 13Chapter 8: Code Generation

Page 14: Code Generation - Western Michigan University

Basic Blocks and Flow GraphsBasic Blocks and Flow Graphs

• Partition IR into basic blocks, which are maximal a t t o to bas c b oc s, c a e a asequences of consecutive three‐address instructions that– The flow of control can only enter the basic block through the first instructionControl leave the block without halting/branching– Control leave the block without halting/branching, except at the last instruction

• The basic blocks become the nodes of a flow graph, whose edges indicate which blocks can follow which other blocks

Chapter 8: Code Generation CS5810 Spring 2009 14

Page 15: Code Generation - Western Michigan University

Create Basic BlocksCreate Basic Blocks

• An instruction is a leader ifAn instruction is a leader if – The first instruction in the IR

The target of a jump– The target of a jump

– Immediately follows a jump

F h l d it b i bl k i t f• For each leader, its basic block consists of itself and all the instructions upto (but not i l di ) th t l d th d f th IRincluding) the next leader, or the end of the IR

Chapter 8: Code Generation CS5810 Spring 2009 15

Page 16: Code Generation - Western Michigan University

TEST YOURSELF #2TEST YOURSELF #2

• Construct basic blocks1) i=12) J = 1Construct basic blocks 

of the following IR3) T1 = 10 * I4) T2 = t1 + j5) T3 = 8 * t26) T4 t3 88

for i from 1 to 10 dofor j from 1 to 10 do

6) T4 = t3 – 887) A[t4] = 0.08) J = j+ 19) If j<=10 goto (3)for j from 1 to 10 do

a[i,j]=0.0for i from 1 to 10 do

9) If j<=10 goto (3)10) I = I + 111) If I <= 10 goto (2)12) I = 1for i from 1 to 10 do

a[i,i]=1.0

)13) T5 = I – 114) T 6 = 88 * t515) A[t6] = 1.0

CS5810 Spring 2009 16Chapter 8: Code Generation

16) I = i+117) I i<= 10 goto (13)

Page 17: Code Generation - Western Michigan University

Next‐Use InformationNext Use Information

• Knowing when the value of a var will be usedKnowing when the value of a var will be used next is essential for good code generation

• j uses the value of x computed at i if• j uses the value of x computed at i if– statement i assigns to x

If j h d d l h j– If j has x as an operand, and control can reach j from i without an intervening assignment to x

Chapter 8: Code Generation CS5810 Spring 2009 17

Page 18: Code Generation - Western Michigan University

Next‐Use AlgorithmNext Use Algorithm

• Input: a basic block BInput: a basic block B• Output: attach to i: x=y+z the next‐use info• Method: start at the last statement B• Method: start at the last statement B

– Attach to i the information currently found in symbol tablesymbol table

– set x to “not live” and “no next use”– Set y/z to “live” and the next uses of y/z to Iy/ y/– Note the steps of 2 and 3 may not be interchanged

Chapter 8: Code Generation CS5810 Spring 2009 18

Page 19: Code Generation - Western Michigan University

Optimization of Basic BlocksOptimization of Basic Blocks

• Substantial improvement can be gained by Substa t a p o e e t ca be ga ed byperforming local optimization

• DAG Representation of basic blocksp– A node for each initial values of the vars– Node for each statement 

• Children are the statements of the last definitions• Labeled by operators and vars for which it is the last definition in  the block

– Some nodes are “output nodes” whose vars are live on exit 

• Part of global flow analysis discussed later• Part of global flow analysis, discussed later

Chapter 8: Code Generation CS5810 Spring 2009 19

Page 20: Code Generation - Western Michigan University

DAGs for Basic BlocksDAGs for Basic Blocks

c+

‐a = b + cb = a – d

b,d

d0+

b

c = b + cd = a ‐ d

a

c0b0

Chapter 8: Code Generation 20CS5810 Spring 2009

Page 21: Code Generation - Western Michigan University

Finding Local Common SubexpressionsFinding Local Common Subexpressions

+c

+

‐a = b + cb = a – d a

b,d a = b + cd = a – d

c0

d0+

b0

c = b + cd = a ‐ d

a d   a  dc = d + c

00

Chapter 8: Code Generation 21CS5810 Spring 2009

Page 22: Code Generation - Western Michigan University

Dead Code EliminationDead Code Elimination

• Delete any root that has no live variables andDelete any root that has no live variables, and repeat the procedure

a = b + cb = b – d a b c

+e

+

c0 d0

+ ‐

b0

b = b  dc = c + de = b + c

c

0 00e = b + c

Chapter 8: Code Generation CS5810 Spring 2009 22

Page 23: Code Generation - Western Michigan University

Use of Algebraic LawsUse of Algebraic Laws

• Identities– x+0=0+x=x; x*1=1*x=x; x‐0=x; x/1=x;

• Reduction in strenth– x2=x*x; 2*x = x+x; x/2=x*0.5

• Commutativityx*y=y*x– x*y=y*x

• Associativity– a = b+c; e=c+d+b;; ;– a = b+c; t=c+d; e = t+b– a = b + c; e = a+d

Chapter 8: Code Generation CS5810 Spring 2009 23

Page 24: Code Generation - Western Michigan University

A Simple Code GeneratorA Simple Code Generator

• Generate code for a single basic blockGenerate code for a single basic block

• How to use registers?I t hi hit t ll f th– In most machine architectures, some or all of the operands must be in registers

Registers make good temporaries– Registers make good temporaries

– Hold values that are computed in one basic block and used in other blocksand used in other blocks

– Often used with run‐time storage management

Chapter 8: Code Generation CS5810 Spring 2008 24

Page 25: Code Generation - Western Michigan University

Register and Address DescriptorsRegister and Address Descriptors

• For each available register a registerFor each available register, a register descriptor (RD) keeps track of the vars whose current value in that registercurrent value in that register– Initially empty

• For each var an address descriptor (AD) keeps• For each var, an address descriptor (AD) keeps track of the locations where the current value of the var can be foundof the var can be found– Location can be a register, a memory address, etc. 

Chapter 8: Code Generation CS5810 Spring 2008 25

Page 26: Code Generation - Western Michigan University

Code‐generation AlgorithmCode generation Algorithm

• For a three‐address instruction, e.g. x=y+z, g y– Use getReg(x=y+z) to select registers Rx, Ry, Rz for x, y , z– If y is not in Ry, issue an instruction LD Ry, y’

Si il l f• Similarly for x

– Issue the instruction ADD Rx, Ry, Rz• Copy statement x=yCopy statement x y

– If y is not already in register, generate LD Ry, y’– Adjust RD for Ry  so it includes x– Change AD for x so its only location is Ry

• Ending the basic blockIf x is used at other blocks issue ST x R– If x is used at other blocks, issue ST x, Rx

Chapter 8: Code Generation CS5810 Spring 2008 26

Page 27: Code Generation - Western Michigan University

Managing Register and Address Descriptors

• For LD R xFor LD R, x– Change RD for R so it holds only x

– Change AD for x by adding R as an additional locationChange AD for x by adding R as an additional location

• For ST x, R– Change AD for x to include its own memory location– Change AD for x to include its own memory location

• For ADD Rx, Ry, RzChange RD for R so it holds only x– Change RD for Rx  so it holds only x

– Change AD for x so its only location is Rx– Remove R from the AD of any var other than x– Remove Rx from the AD of any var other than x

Chapter 8: Code Generation CS5810 Spring 2008 27

Page 28: Code Generation - Western Michigan University

ExampleExamplet = a – bu = a – cv = t + u

da = dd = v + u

• t, u, v are temp vars, while a, b, c, d are globalA i t h• Assume registers are enough– Reuse registers whenever possible

Chapter 8: Code Generation 28CS5810 Spring 2008

Page 29: Code Generation - Western Michigan University

ExampleExample 

a b c d

R1 R2 R3 va b c d t u

LD R1 a; LD R2 b; SUB R2 R1 R2t = a ‐ b

a b c d

LD R1, a; LD R2, b; SUB R2, R1, R2

a t a,R1 b c d R2

R1 R2 R3 va b c d t u,

LD R3, c;  SUB R1, R1, R3u = a ‐ c

u t c a b c,R3 d R2 R1

R1 R2 R3 va b c d t u

Chapter 8: Code Generation 29CS5810 Spring 2008

Page 30: Code Generation - Western Michigan University

t ADD R3, R2, R1v = t + u

R1 R2 R3 va b c d t uu t v a b c d R2 R1 R3

LD R2, da = d

u a,d v R2 b c d,R2 R1 R3

R1 R2 R3 va b c d t u

Chapter 8: Code Generation CS5810 Spring 2008 30

Page 31: Code Generation - Western Michigan University

ADD R1 R3 R1d = v + u ADD R1, R3, R1d = v + u

R1 R2 R3 va b c d t ud a v R2 b c R1 R3

ST a, R2; ST d, R1R1 R2 R3 va b c d t ud a v a,R2 b c d,R1 R3

Chapter 8: Code Generation CS5810 Spring 2008 31

Page 32: Code Generation - Western Michigan University

Function getRegFunction getReg

• Consider picking R for y in x = y + zConsider picking Ry for y in x = y + z– If y in a register, do nothing

If y not in a register and there is a empty one– If y not in a register and there is a empty one, choose it as Ry

– Let v be one of the var in R– Let v be one of the var in R • We are OK if v is somewhere besides R

• We are OK if v is xe a e O s

• We are OK if v is not used later

• Spill: ST v, R

Chapter 8: Code Generation CS5810 Spring 2008 32

Page 33: Code Generation - Western Michigan University

Peephole OptimizationPeephole Optimization

• Exam a sliding window and replace instructionExam a sliding window and replace instruction sequence with a shorter or faster sequence– Redundant instruction elimination– Redundant‐instruction elimination

– Flow‐of‐control optimization

Algebraic simplifications– Algebraic simplifications

– Use a machine idioms

Chapter 8: Code Generation CS5810 Spring 2008 33

Page 34: Code Generation - Western Michigan University

Eliminating Redundant Loads and Stores

LD a, R0ST R0, a

Chapter 8: Code Generation CS5810 Spring 2008 34

Page 35: Code Generation - Western Michigan University

Eliminating Unreachable CodeEliminating Unreachable Code

if debug==1 goto L1if debug gotogoto L2L1: print debugging infop gg gL2:

if debug!=1 goto L2

L2:

g gL1: print debugging infoL2:

Chapter 8: Code Generation CS5810 Spring 2008 35

Page 36: Code Generation - Western Michigan University

Flow‐of‐Control OptimizationsFlow of Control Optimizations

goto L1 goto L2g…L1: goto L2

…L1: goto L2g

if < b t L1 if a < b goto L2if a < b goto L1…L1 goto L2

if a < b goto L2…L1: goto L2

Chapter 8: Code Generation 36CS5810 Spring 2008

L1: goto L2Chapter 6: Intermediate Code Generation

L1: goto L2

Page 37: Code Generation - Western Michigan University

Optimal Code Generation for Expressions

• We can choose registers optimallyWe can choose registers optimally– If a basic block consists of a single expression, or

It is sufficient to generate code for a block one– It is sufficient to generate code for a block one expression at a time

Chapter 8: Code Generation CS5810 Spring 2008 37

Page 38: Code Generation - Western Michigan University

Ershov NumbersErshov Numbers

• Assign the nodes of an expression tree aAssign the nodes of an expression tree a number that tells how many registers needed– Label leaf 1– Label leaf 1

– The label of an interior node with one child is the label of its childlabel of its child

– The label of an interior node with two children• the larger one if the labels are differentthe larger one if the labels are different

• One plus the label if the labels are the same

Chapter 8: Code Generation CS5810 Spring 2008 38

Page 39: Code Generation - Western Michigan University

Ershov NumbersErshov Numberst4

t2

t1

be

t3

a ba

dc

t1 = a – bt2 = c + d(a‐b)+e*(c+d) t2 = c + dt3 = e * t2t4 = t1 + t3

(a b)+e (c+d)

Chapter 8: Code Generation CS5810 Spring 2008 39

t4 = t1 + t3

Page 40: Code Generation - Western Michigan University

Generating Code From Labeled Expression Tree

• Recursive algorithm starting at the root ecu s e a go t sta t g at t e oot– Label k means k registers will be used– Rb, Rb+1, …, Rb+k‐1 , where b>=1 is a base 

• To generate machine code for a node with label k and two children with equal labels– Recursively generate code for the right child, using base b+1: Rb+1, Rb+2, …, Result in Rb+k‐1

– Recursively generate code for the right child using– Recursively generate code for the right child, using base b: Rb, Rb+1, …, Result in Rb+k‐2

– Generate instruction OP Rb+k‐1 , Rb+k‐2 ,Rb+k‐1

Chapter 8: Code Generation CS5810 Spring 2008 40

Page 41: Code Generation - Western Michigan University

Generating Code From Labeled Expression Tree

• To generate machine code for a node withTo generate machine code for a node with label k and two children with unequal labels– Recursively generate code for the child with label– Recursively generate code for the child with label k, using base b: Rb, Rb+1, …, Result in Rb+k‐1

– Recursively generate code for the child with labelRecursively generate code for the child with label m, using base b: Rb, Rb+1, …, Result in Rb+m‐1

– Generate instruction OP Rb+k 1 , Rb+m 1 ,Rb+k 1Generate instruction OP Rb+k‐1 , Rb+m‐1 ,Rb+k‐1• For a leaf x, if base is b generate LD Rb, x

Chapter 8: Code Generation CS5810 Spring 2008 41

Page 42: Code Generation - Western Michigan University

Ershov NumbersErshov Numberst4

t2

t1

be

t3

a ba

LD R3 ddc

LD R3, dLD R2, cADD R3, R2, R3

LD R2, bLD R1 aADD R3, R2, R3

LD R2, e MUL R3, R2, R3

LD R1, aSUB R2, R1, R2ADD R3 R2 R3

Chapter 8: Code Generation CS5810 Spring 2008 42

MUL R3, R2, R3 ADD R3, R2, R3

Page 43: Code Generation - Western Michigan University

Insufficient Supply of RegistersInsufficient Supply of Registers

• Input: a labeled tree and a number r of registersput: a abe ed t ee a d a u be o eg ste s• For a node N with at least one child labeled r or greaterg– recursively generate code for the big child with b=1. The result will appear in RrG hi i i ST R– Generate machine instruction ST tk, Rr

– If the little child has label r or greater, b=1. If the label is j<r, then b=r‐j. The result in Rrj r, then b r j. The result in Rr

– Generate the instruction LD Rr‐1, tk– Generate OP Rr, Rr, Rr‐1 or OP Rr, Rr‐1, Rr

Chapter 8: Code Generation CS5810 Spring 2008 43

Page 44: Code Generation - Western Michigan University

Insufficient Supply of RegistersInsufficient Supply of Registerst4

t2

t1

be

t3

a ba

LD R2, d dc

LD R1, cADD R2, R1, R2

LD R2, bLD R1, a

LD R1, e MUL R2, R1, R2

,SUB R2, R1, R2LD R1, t3

Chapter 8: Code Generation CS5810 Spring 2008 44

ST t3, R2,

ADD R2, R2, R1