Code Generation and Optimisation


Transcript of Code Generation and Optimisation

Page 1: Code Generation and Optimisation

Code Generation and Optimisation

Register Allocation via Graph Colouring

Page 2: Code Generation and Optimisation

Broad problem

Source program:

y := 10; x := y*y*y; print x

Unlimited-register code produced by the current compiler:

li y, 10
mul t, y, y
mul x, t, y
print x

Problem: to automatically convert unlimited-register code into limited-register code.

Page 3: Code Generation and Optimisation

Broad solution

The number of registers used in a piece of code can be reduced by:

register reuse: when two registers are never simultaneously live they can be remapped to a single register;

register spilling: register values can be stored in memory and loaded into registers only when needed.

Page 4: Code Generation and Optimisation

Example: register reuse

Since x and t are never live at the same time they can be remapped to a single register r:

li y, 10
mul t, y, y
mul x, t, y
print x

becomes

li r, 10... 

Page 5: Code Generation and Optimisation

Spill memory

Suppose registers a, b, and y are spilt to memory. They might be arranged in memory as follows:

Memory:

spill+0: a
spill+1: b
spill+2: y

The spill register points to the beginning of the spilt registers in memory.

Page 6: Code Generation and Optimisation

Loads and Stores

To access spilt registers, we need to extend our target language with instructions for accessing memory.

Concrete syntax of MIPS3:

instr ::= … (MIPS2)
        | load r0, r1, i     (r0 ⟵ mem[r1+i])
        | store r0, r1, i    (mem[r1+i] ⟵ r0)

Abstract syntax of MIPS3:

data Instr = …
           | LOAD(Reg, Reg, Int)
           | STORE(Reg, Reg, Int)

Page 7: Code Generation and Optimisation

Example: register spilling

Suppose y has been spilt to memory. We must wrap accesses to y with load and/or store instructions.

mul y, x, x
...
print y

becomes

mul t, x, x
store t, spill, 2
...
load t, spill, 2
print t

Page 8: Code Generation and Optimisation

Register allocation

Aims:

Introduce as much register reuse as possible. (The number of registers used will fall significantly, but may still be beyond the limit of the target machine.)

Spill registers to memory only if necessary. Memory access is expensive!

Page 9: Code Generation and Optimisation

INTERFERENCE GRAPHS

When are registers live at the same time?

Page 10: Code Generation and Optimisation

Interference graphs

Using the results of liveness analysis, we can construct an interference graph:

each register r is represented by a node labelled r in the interference graph;

there is an edge between r1 and r2 if liveness analysis states that, at some point in the code, r1 and r2 are live at the same time.
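The two rules above can be sketched in a few lines of Haskell. This is only an illustration: the names liveSets and interferenceEdges are assumptions of this sketch, and the live-in sets were worked out by hand for the four-instruction example from the start of the chapter (li y, 10; mul t, y, y; mul x, t, y; print x).

```haskell
import Data.List (nub, sort)

type Reg = String

-- Live-in sets, one per instruction, for
--   li y, 10; mul t, y, y; mul x, t, y; print x
-- (computed by hand for this sketch).
liveSets :: [[Reg]]
liveSets = [[], ["y"], ["t", "y"], ["x"]]

-- Every unordered pair of distinct registers that appear together
-- in at least one live set becomes an edge of the interference graph.
interferenceEdges :: [[Reg]] -> [(Reg, Reg)]
interferenceEdges sets =
  sort (nub [(a, b) | s <- sets, a <- s, b <- s, a < b])
```

Here interferenceEdges liveSets contains the single edge between t and y. There is no edge between x and t, which agrees with the earlier claim that they can share a register.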

Page 11: Code Generation and Optimisation

Exercise 1

Draw the interference graph for the following code.

L1:   li a, 0
loop: addi b, a, 1
L2:   add c, c, b
L3:   muli a, b, 2
L4:   blt a, n, loop
L5:   print c

Label  Live-in
L1     c, n
loop   a, c, n
L2     b, c, n
L3     b, c, n
L4     a, c, n
L5     c

Page 12: Code Generation and Optimisation

GRAPH COLOURING

Page 13: Code Generation and Optimisation

Register allocation

Let colours 1…N represent the N registers available in the target machine.

Register allocation is similar to graph colouring:

colour the interference graph with at most N colours, such that no two nodes connected by an edge have the same colour.
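As a rough illustration of the colouring condition, the following Haskell sketch checks whether a proposed assignment is a legal colouring with at most n colours. The edge list edges and the function validColouring are hypothetical names introduced here, and the graph is made up for the example rather than taken from the slides.

```haskell
type Reg = String
type Colour = Int

-- Hypothetical interference graph, given as a list of edges.
edges :: [(Reg, Reg)]
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]

-- A colouring is legal for n colours if every colour used lies in 1..n
-- and no edge joins two registers that were given the same colour.
-- Registers missing from the colouring are treated as uncoloured.
validColouring :: Int -> [(Reg, Reg)] -> [(Reg, Colour)] -> Bool
validColouring n es col =
  all inRange (map snd col) && all differ es
  where
    inRange k = k >= 1 && k <= n
    differ (x, y) = case (lookup x col, lookup y col) of
      (Just cx, Just cy) -> cx /= cy
      _                  -> True
```

For instance, with two colours the assignment a=1, b=2, c=2, d=1 passes the check, whereas a=1, b=1, c=2, d=2 fails because a and b are joined by an edge.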

Page 14: Code Generation and Optimisation

Register allocation

But there is one big difference: if the graph is not N-colourable then registers may be spilt (assigned no colour).

The number of spills should be kept to a minimum!

Page 15: Code Generation and Optimisation

Example 1

Suppose the target machine has 3 registers. Consider register allocation on the following interference graph.

[Figure: interference graph over the nodes a, b, c and d]

Good answer:

Register  Colour
a         1
b         2
c         2
d         3

Bad answer:

Register  Colour
a         1
b         2
c         3
d         Spilt

Page 16: Code Generation and Optimisation

Algorithm

Register allocation, like graph colouring, is NP-complete. That is, no polynomial-time algorithm that always finds an optimal allocation is known.

We use a mild variant of Chaitin's algorithm. It is a polynomial-time heuristic: it does not guarantee the fewest spills, but it typically gives good results.

Page 17: Code Generation and Optimisation

Algorithm

Has two components:

The basic colouring algorithm that is parameterised by the order in which nodes should be coloured.

The simplification algorithm which decides a good order to colour the nodes.

Page 18: Code Generation and Optimisation

Basic colouring algorithm

Inputs:

an interference graph g;

a list of registers rs (the order).

Outputs:

a colouring c that partially-maps registers to colours, initially empty. Uncoloured registers are to be spilt.

Page 19: Code Generation and Optimisation

Basic colouring algorithm

Let neighbours(g, c, r) denote the colours (if any) of the registers connected to r.

Basic colouring algorithm:

foreach r in rs
  possible := {1..N} – neighbours(g, c, r) ;
  if possible ≠ {} then c[r] := minimum(possible)

Page 20: Code Generation and Optimisation

Exercise 2

Assuming the target machine has just two registers, apply the basic colouring algorithm twice to the following interference graph.

[Figure: interference graph over the nodes a, b, c, d, e, f, g and h]

First use the node ordering b, e, f, h, g, c, d, a, and then use the node ordering h, d, g, a, c, f, e, b.

Page 21: Code Generation and Optimisation

Order matters!

When using the basic colouring algorithm, some node orderings lead to fewer spills than others.

The simplification algorithm aims to find a good ordering for the basic colouring algorithm.
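To see the effect of ordering concretely, here is a small Haskell sketch of the basic colouring algorithm run on a made-up graph, the path a - b - c - d, with two colours. The adjacency-list representation, the graph path and the helper spilt are assumptions of this sketch, and the number of colours is passed as a parameter n instead of being a global constant.

```haskell
import Data.List ((\\))
import Data.Maybe (fromMaybe, mapMaybe)

type Reg = String
type Colour = Int
type IG = [(Reg, [Reg])]          -- adjacency lists
type Colouring = [(Reg, Colour)]

-- Hypothetical graph: the path a - b - c - d.
path :: IG
path = [("a", ["b"]), ("b", ["a", "c"]), ("c", ["b", "d"]), ("d", ["c"])]

-- Colours already given to the neighbours of r.
neighbours :: IG -> Colouring -> Reg -> [Colour]
neighbours g c r = mapMaybe (\n -> lookup n c) (fromMaybe [] (lookup r g))

-- Greedy colouring with n colours in the given order. A register with
-- no free colour is left out of the result, meaning it is to be spilt.
basic :: Int -> [Reg] -> IG -> Colouring -> Colouring
basic _ [] _ c = c
basic n (r : rs) g c =
  case [1 .. n] \\ neighbours g c r of
    []      -> basic n rs g c
    (x : _) -> basic n rs g ((r, x) : c)

-- Registers in the ordering that received no colour.
spilt :: Int -> [Reg] -> IG -> [Reg]
spilt n order g = [r | r <- order, lookup r (basic n order g []) == Nothing]
```

With the ordering a, b, c, d nothing is spilt, but with the ordering a, d, b, c both a and d take colour 1, b takes colour 2, and c is left with no colour and must be spilt.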

Page 22: Code Generation and Optimisation

Chaitin’s observation

Suppose that:

N is the number of available registers;

graph g contains a node r with fewer than N neighbours;

g-r is the graph obtained by removing r and its edges.

If g-r is N-colourable then so is g: when r and its edges are added back to g-r, the neighbours of r use fewer than N colours, so at least one colour is left for r.

Page 23: Code Generation and Optimisation

Node ordering

Chaitin's observation suggests the following recursively-defined node ordering.

To colour the nodes in graph g, first colour the nodes in g-r, where r has fewer than N neighbours, then colour node r.

(Note, this does not help us when there exists no node with fewer than N neighbours.)

Page 24: Code Generation and Optimisation

Deterministic simplification algorithm

Inputs:

an interference graph g.

Outputs:

A list of registers rs representing an order in which to colour the nodes.

Let pick(g) return the register in g with the fewest neighbours; if more than one exists, return the first alphabetically.

Page 25: Code Generation and Optimisation

Deterministic simplification algorithm

Let del(r, g) return graph g with register r, and its edges, removed.

Deterministic simplification algorithm:

rs := []
while non-empty(g)
  r := pick(g) ;
  rs := [r] ++ rs ;
  g := del(r, g)
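The simplification algorithm can be tried out with the following Haskell sketch. It assumes an adjacency-list representation of the graph, and the graph example (a triangle a, b, c with d attached to c) is made up for illustration; it is not one of the exercise graphs.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

type Reg = String
type IG = [(Reg, [Reg])]          -- adjacency lists

-- Hypothetical graph: a triangle a-b-c with d hanging off c.
example :: IG
example =
  [ ("a", ["b", "c"]), ("b", ["a", "c"])
  , ("c", ["a", "b", "d"]), ("d", ["c"]) ]

-- Register with the fewest neighbours; ties go to the alphabetically first.
-- Only called on a non-empty graph.
pick :: IG -> Reg
pick g = fst (minimumBy (comparing key) g)
  where key (r, ns) = (length ns, r)

-- Remove r and every edge that mentions it.
del :: Reg -> IG -> IG
del r g = [(x, filter (/= r) ns) | (x, ns) <- g, x /= r]

-- The first register picked ends up last in the ordering.
simplify :: IG -> [Reg]
simplify [] = []
simplify g = simplify (del r g) ++ [r]
  where r = pick g
```

On this graph the registers are picked in the order d, a, b, c, so simplify example gives the colouring order c, b, a, d: the low-degree node d is coloured last, when a free colour is guaranteed to remain for it if N is at least 2.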

Page 26: Code Generation and Optimisation

Exercise 3

Give the output of the simplification algorithm on the following interference graph.

[Figure: interference graph over the nodes a, b, c, d, e, f, g and h]

Page 27: Code Generation and Optimisation

REGISTER ALLOCATION

register: a device for storing small amounts of data

allocate: to apportion for a specific purpose

Webster’s Dictionary

Page 28: Code Generation and Optimisation

Substitution

If register x is coloured c, then the register allocator replaces all occurrences of x in the code with rc, the target register numbered c.

For example, if registers x, y, and t are coloured 1, 2, and 1 then

li y, 10
mul t, y, y
mul x, t, y
print x

becomes

li r2, 10
mul r1, r2, r2
mul r1, r1, r2
print r1
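A minimal Haskell sketch of this substitution step follows. The cut-down Instr type (with curried constructors rather than the tupled ones used in the abstract syntax earlier) and the functions rename and substitute are assumptions made for illustration.

```haskell
type Reg = String
type Colour = Int

-- A cut-down instruction type, assumed here for illustration only.
data Instr
  = LI Reg Int
  | MUL Reg Reg Reg
  | PRINT Reg
  deriving (Eq, Show)

-- Rename one register: a register coloured k becomes "r<k>";
-- an uncoloured register is left alone for the spiller to handle.
rename :: [(Reg, Colour)] -> Reg -> Reg
rename col r = maybe r (\k -> "r" ++ show k) (lookup r col)

-- Apply the renaming to every register mentioned in the code.
substitute :: [(Reg, Colour)] -> [Instr] -> [Instr]
substitute col = map step
  where
    f = rename col
    step (LI d n)    = LI (f d) n
    step (MUL d a b) = MUL (f d) (f a) (f b)
    step (PRINT a)   = PRINT (f a)
```

Applying substitute with x, y and t coloured 1, 2 and 1 to the four-instruction program reproduces the translated code shown above.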

Page 29: Code Generation and Optimisation

Spilling

If register r is not coloured, then spill-code must be inserted to transfer the value of r to and from memory as required.

For example, if registers x and y are to be spilt to memory locations spill[offsetx] and spill[offsety] then

add x, x, y

becomes

load t1, spill, offsetx
load t2, spill, offsety
add t1, t1, t2
store t1, spill, offsetx
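The rewriting of a single add instruction might be sketched in Haskell as below. This is an illustration under several assumptions: the cut-down Instr type, the function name spillAdd, the use of a list of (register, offset) pairs to record spill locations, and the availability of the temporaries t1 and t2 at this point in the code.

```haskell
type Reg = String

-- Cut-down instruction type, assumed for illustration.
data Instr
  = ADD Reg Reg Reg
  | LOAD Reg Reg Int    -- LOAD r0 r1 i:  r0 <- mem[r1+i]
  | STORE Reg Reg Int   -- STORE r0 r1 i: mem[r1+i] <- r0
  deriving (Eq, Show)

-- Rewrite one ADD given the spill offsets of the spilt registers.
-- Spilt operands are loaded into t1 and t2 first; a spilt destination
-- is computed in t1 and then stored back. Other instructions pass through.
spillAdd :: [(Reg, Int)] -> Instr -> [Instr]
spillAdd offs (ADD d a b) = loads ++ [ADD d' a' b'] ++ stores
  where
    (a', la) = use "t1" a
    (b', lb) = use "t2" b
    d'       = maybe d (const "t1") (lookup d offs)
    loads    = la ++ lb
    stores   = [STORE d' "spill" o | Just o <- [lookup d offs]]
    use t r = case lookup r offs of
      Just o  -> (t, [LOAD t "spill" o])
      Nothing -> (r, [])
spillAdd _ i = [i]
```

Taking offsetx = 0 and offsety = 1 as example values, spillAdd [("x", 0), ("y", 1)] (ADD "x" "x" "y") yields the same load, load, add, store sequence as the slide.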

Page 30: Code Generation and Optimisation

COALESCING

Reducing unnecessary move instructions

Page 31: Code Generation and Optimisation

Coalescing

If we have an instruction

move x, y

and there is no edge between x and y in the interference graph then the instruction can be eliminated.

Registers x and y can be coalesced into a new register xy whose neighbours are the union of those of x and of y.
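The merge of two nodes can be sketched in Haskell as follows. The adjacency-list representation, the function coalesce and the graph example are assumptions of this sketch, and the merged node is simply named by concatenating the two register names, which assumes no existing register already has that name.

```haskell
import Data.List (nub, sort, union)
import Data.Maybe (fromMaybe)

type Reg = String
type IG = [(Reg, [Reg])]          -- adjacency lists

-- Hypothetical graph in which x and y do not interfere.
example :: IG
example = [("x", ["a"]), ("y", ["b"]), ("a", ["x"]), ("b", ["y"])]

-- Merge x and y into one node whose neighbours are the union of theirs.
-- Returns Nothing when x and y interfere, since they cannot be coalesced.
coalesce :: Reg -> Reg -> IG -> Maybe IG
coalesce x y g
  | y `elem` nbrs x = Nothing
  | otherwise       = Just ((xy, merged) : rest)
  where
    nbrs r   = fromMaybe [] (lookup r g)
    xy       = x ++ y
    merged   = sort (nbrs x `union` nbrs y)
    -- Every other node that pointed at x or y now points at xy.
    rest     = [(r, nub (map rename ns)) | (r, ns) <- g, r /= x, r /= y]
    rename n = if n == x || n == y then xy else n
```

In the example, coalescing x and y produces a node xy with neighbours a and b, so xy has more neighbours than either x or y had alone.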

Page 32: Code Generation and Optimisation

Snag

Since a coalesced register xy may have more neighbours than x or y alone, a graph that is N-colourable before coalescing may not be after.

Coalescing can be done selectively during register allocation without compromising N-colourability: see Section 11.2 of Appel [1].

[1] Andrew Appel, Modern Compiler Implementation in ML.

Page 33: Code Generation and Optimisation

SUMMARY

What have we learnt?

Page 34: Code Generation and Optimisation

Summary

Register allocation transforms unlimited-register code into limited-register code.

Aim to introduce as few spills as possible by reusing registers.

Chaitin's algorithm is not optimal, but it is efficient and typically gives good results.

Coalescing allows unnecessary move instructions to be removed.

Page 35: Code Generation and Optimisation

Limitations

A long-living but infrequently-used variable may get mapped to a register r (and not spilt).

This seems wasteful of a valuable resource: it would be better to map a frequently-used variable to register r.

Page 36: Code Generation and Optimisation

Acknowledgements

Many ideas in this chapter were taken from:

Gregory Chaitin of IBM: see “Register Allocation and Spilling via Graph Colouring”.

Andrew Appel: see “Modern Compiler Implementation in ML”.

Page 37: Code Generation and Optimisation

IMPLEMENTATION

Page 38: Code Generation and Optimisation

Interference graph (1)

Interference graph: each register r is mapped to a list of registers that are live at the same time as r.

type IG = Map Reg [Reg]

Compute all registers live at the same time as r:

liveWith :: (Liveness, Reg) -> [Reg]
liveWith(live, r) = bigUnion([rs | (l, rs) <- live, member(r, rs)])

Page 39: Code Generation and Optimisation

Interference graph (2)

Construct the interference graph for a given code sequence:

interferenceGraph :: Code -> IG
interferenceGraph(code) = [(r, liveWith(live, r)) | r <- regs]
  where
    root = fresh()
    g = cfg(code, root)
    live = liveness(g)
    regs = bigUnion([use(i) ++ def(i) | i <- code])

Page 40: Code Generation and Optimisation

Basic colouring

A colour is an integer between 1 and numRegs and a colouring maps registers to colours:

type Colour = Int

type Colouring = Map Reg Colour

Colour graph g in the order specified by rs, given an initial colouring c:

basic :: ([Reg], IG, Colouring) -> Colouring
basic([], g, c) = c
basic(r:rs, g, c) =
  case possible of
    -- no colour is free: leave r uncoloured (to be spilt)
    [] -> basic(rs, g, c)
    -- otherwise give r the smallest free colour
    x:xs -> basic(rs, g, insert(c, r, x))
  where
    possible = [1..numRegs] \\ neighbours(g, c, r)

Page 41: Code Generation and Optimisation

Simplification algorithm

Return registers in a good order for colouring:

simplify :: IG -> [Reg]
simplify([]) = []
simplify(g) = simplify(del(r, g)) ++ [r]
  where r = pick(g)

Colour the registers in the order given by the simplification algorithm:

colour :: IG -> Colouring
colour(g) = basic(simplify(g), g, [])