Code optimization by partial redundancy elimination using Eliminatability paths (E-paths)

Computer Science & Engineering, Indian Institute of Technology, Bombay

Prof. Dhananjay M Dhamdhere


These slides are based on

D. M. Dhamdhere: “E-path_PRE---Partial redundancy elimination made easy”, SIGPLAN Notices, v 37, n 8 (2002), 53-65.

D. M. Dhamdhere: “Eliminatability path---A versatile basis for partial redundancy elimination”, 2002.

Dheeraj Kumar: “Syntactic and Semantic Partial Redundancy elimination”, M. Tech. dissertation, I.I.T. Bombay, 2006.


Partial redundancy elimination

Partial redundancy
An expression e in statement s is partially redundant if its value is identical with the value of e in some path from the start of the program to s

Partial redundancy elimination
-- A partially redundant occurrence of e is made totally redundant by inserting evaluations of e in some path(s) from the start of the program to s
-- The totally redundant occurrence of e is now eliminated


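An example of PRE

[Figure: a flow graph in which nodes 1 and 2 lead to node 3, shown before and after PRE. Before: a*b occurs in two nodes, one of them node 3. After: t=a*b appears in two nodes and node 3 uses t.]

-- Insert a*b in node 2

-- Delete a*b from node 3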
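The effect of this transformation can be checked by counting evaluations of a*b along each path. The sketch below assumes that the original a*b outside node 3 sits in node 1 and that nodes 1 and 2 are the two predecessors of node 3; the dictionaries `before` and `after` and the helper `evaluations` are illustrative names, not part of the slides.

```python
# Hypothetical encoding of the three-node example: node id -> list of statements.
before = {1: ["u = a*b"], 2: [], 3: ["x = a*b"]}
after = {1: ["t = a*b"], 2: ["t = a*b"], 3: ["x = t"]}


def evaluations(program, path):
    """Count evaluations of a*b executed along a path given as a list of node ids."""
    return sum(stmt.count("a*b") for node in path for stmt in program[node])


for path in ([1, 3], [2, 3]):
    print(path, evaluations(before, path), evaluations(after, path))
```

Along the path through node 1 the count drops from two evaluations to one, and along the path through node 2 it stays at one, so no path becomes more expensive.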


Partial redundancy elimination

PRE subsumes 3 important classical optimizations:

Common subexpression elimination (CSE)
- Expression e is computed along all paths reaching its occurrence

Loop invariant movement
- A loop-invariant expression is available along the looping edge. Hence it is partially redundant.

Classical code motion
- A less known optimization. It is in fact partial redundancy elimination in specific situations.


PRE subsumes 3 optimizations

[Figure: flow graph with nodes 1 to 6 containing an assignment a=.. and occurrences of a*b.]

1. CSE

- a*b of node 5 is a CSE.

2. Loop invariant movement

- a*b of node 4 is partially redundant

3. Code movement

- a*b of node 6 can be moved to node 3.


Benefits and costs of PRE

Benefits:
Execution efficiency through a reduction in the number of expression occurrences along a graph path

Costs:
- Use of compiler generated temporaries to hold values of expressions
- Lifetimes of compiler generated temporaries increase register pressure
- Insertion of new blocks due to edge placement

Desirable goals: Computational optimality and lifetime optimality


Data flow concepts used in partial redundancy elimination

Availability: An expression e is available at a program point p if its value is computed along ALL paths from the start of the program to p

Partial availability: An expression e is partially available at a program point p if its value is computed along SOME path from the start of the program to p

Availability = Total redundancy

Partial availability = Partial redundancy
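The two properties differ only in how information from predecessors is combined: ALL paths for availability, SOME path for partial availability. The following is a minimal round-robin sketch for a single expression, assuming the usual textbook formulation OUT = Comp + Transp . IN; the function name `solve_forward`, the four-node graph and the entry node 0 are illustrative assumptions.

```python
def solve_forward(nodes, preds, comp, transp, entry, all_paths):
    """Availability (all_paths=True) or partial availability (all_paths=False)."""
    # Optimistic start for the all-paths problem, pessimistic for the some-path problem.
    out = {n: all_paths for n in nodes}
    inn = {n: False for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                new_in = False  # nothing is computed before the program starts
            else:
                vals = [out[p] for p in preds[n]]
                new_in = all(vals) if all_paths else any(vals)
            new_out = comp[n] or (transp[n] and new_in)
            if (new_in, new_out) != (inn[n], out[n]):
                inn[n], out[n] = new_in, new_out
                changed = True
    return inn, out


# Hypothetical graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3, with a*b computed only in node 1.
nodes = [0, 1, 2, 3]
preds = {0: [], 1: [0], 2: [0], 3: [1, 2]}
comp = {0: False, 1: True, 2: False, 3: False}
transp = {n: True for n in nodes}

avin, _ = solve_forward(nodes, preds, comp, transp, 0, all_paths=True)
pavin, _ = solve_forward(nodes, preds, comp, transp, 0, all_paths=False)
print(avin[3], pavin[3])  # False True
```

At entry to node 3 the expression is partially available but not available, so an occurrence of a*b in node 3 would be partially redundant rather than totally redundant.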



Anticipatability: An expression e is anticipatable (that is, “very busy”) at a program point p if it is computed along ALL paths from p to an exit of the program

Safety of a computation (Kennedy 1972): An expression e is safe at a program point p if it is either available or anticipatable at p
- Insertion of e at p is a “new” computation if e is not safe at p.
- It increases the execution time of the program. It may also raise “new” exceptions
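Anticipatability is the backward counterpart of availability, and safety is simply the disjunction of the two. Here is a small sketch for a single expression, assuming the standard formulation ANTIN = Antloc + Transp . ANTOUT; the function names, the five-node graph and the assumption that a*b is not available anywhere in it are illustrative.

```python
def anticipatability(nodes, succs, antloc, transp):
    """Round-robin solution of ANTIN/ANTOUT (backward, all-paths) for one expression."""
    antin = {n: True for n in nodes}   # optimistic start
    antout = {n: True for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in reversed(nodes):
            # At an exit node (no successors) nothing is anticipated.
            new_out = bool(succs[n]) and all(antin[s] for s in succs[n])
            new_in = antloc[n] or (transp[n] and new_out)
            if (new_in, new_out) != (antin[n], antout[n]):
                antin[n], antout[n] = new_in, new_out
                changed = True
    return antin, antout


def is_safe(available, anticipatable):
    """Kennedy's safety condition at a program point."""
    return available or anticipatable


# Hypothetical graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3, 2 -> 4; a*b occurs only in node 3.
nodes = [0, 1, 2, 3, 4]
succs = {0: [1, 2], 1: [3], 2: [3, 4], 3: [], 4: []}
antloc = {n: n == 3 for n in nodes}
transp = {n: True for n in nodes}

antin, antout = anticipatability(nodes, succs, antloc, transp)
print(is_safe(False, antout[1]), is_safe(False, antout[2]))  # True False
```

Inserting a*b at the exit of node 1 is safe because every continuation computes it, whereas at the exit of node 2 the path to node 4 would execute a computation the original program never performed.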


“Safe” insertion of computations

[Figure: two flow graphs, one with nodes 11, 12, 13 and one with nodes 21, 22, 23, each containing occurrences of a*b.]

-- Insertion of a*b in node 12 is safe, however in 22 it is unsafe

-- Insertion in edge (22,23) is safe!

-- a*b is anticipatable in node 12, but not anticipatable in node 22


Some partial redundancies cannot be eliminated through safe code insertion

[Figure: flow graph with nodes i, k, m and n; a*b occurs in two nodes and t=a*b marks a candidate insertion. Regions of the graph are labelled: a*b available; a*b anticipatable; a*b ¬available, ¬anticipatable.]

-- Insertion in the in-edge of node n is unsafe because a*b is not anticipatable


Performing Partial Redundancy Elimination

Identify partially redundant occurrences of an expression e in a program

Insert occurrences of e at some program points where e is safe

Delete partially redundant occurrences of e which have become totally redundant

Classical PRE: Elimination of partial redundancies in a program through safe insertion of computations.

- Can be looked upon as `code movement’ from the point of original occurrence to the point of insertion

- It cannot eliminate all partial redundancies in a program!


A brief history of PRE

Morel, Renvoise (1979): Bidirectional data flows for code placement in nodes (MRA). Lacks both computational and lifetime optimality.

Dhamdhere (1988): Computational optimality and reduced lifetimes of temporaries compared with Morel-Renvoise, through placement in nodes and edges (EPA).

Knoop, Rüthing, Steffen (1992): Lazy code motion (LCM) offering computational optimality and lifetime optimality through a priori edge splitting and placement in nodes. Drechsler and Stadel (1993) reformulated LCM to handle basic blocks.

Bodik, Gupta, Soffa (1998): Complete elimination of partial redundancies through selective code expansion (ComPRE). Based on the work by Steffen (1996).

Kennedy et al (1999): PRE in SSA representation of programs (SSAPRE).

Dhamdhere (2002): Eliminatability path --- A versatile basis for PRE (E-path_PRE). Develops a concept originating in Dhaneshwar, Dhamdhere (1995) and uses it for evaluation of PRE algorithms and development of new ones.

Xue, Knoop (2006) and Dheeraj Kumar, Dhamdhere (2006)


Morel-Renvoise Algorithm (MRA)

Performs insertions strictly in nodes of the program graph

Placement possibility (PP) of e at entry/exit of basic blocks: whether it is feasible and safe to place expression e at entry/exit of a block

Insert e at the exit of a basic block b if it can be placed at the exit of b but not at its entry

Delete an existing occurrence of e in a basic block if it can be placed at the entry of that block


Morel-Renvoise Algorithm (MRA) (simplified equations)

[Equations not preserved in this transcript.]


Morel-Renvoise Algorithm (MRA)

[Figure: the six-node flow graph (nodes 1 to 6) before and after MRA. Before: a=.. and occurrences of a*b. After: occurrences rewritten as t=a*b or replaced by t.]

1. a*b is inserted in node 2. Insertion in node 3 would have been lifetime optimal.

2. a*b of node 4 cannot be optimized because it cannot be inserted in node 1.

3. a*b is saved in t in nodes 2 and 4. a*b of node 6 is replaced by use of t.


Edge placement algorithm (Dhamdhere 1988)

Performs insertions both in nodes and along edges in the program graph

An expression is hoisted as far up as possible to obtain computational optimality

It is then subjected to sinking (without affecting computational optimality) to obtain lifetime optimality

It is placed along an edge only if it cannot be placed in a node

Edge placement is performed only along a critical edge, i.e., an edge from a “branch” node to a “join” node



A. Computational optimality:

The ∏ term of PPIN is dropped. Hence PPIN can be true even if PPOUT of a predecessor is false.

If PP is true for entry of a basic block i but PP is false for exit of a predecessor j, e is placed along the edge (j,i).

-- It is called edge placement. A basic block is inserted in the edge if e is to be placed along it.

-- Edge placement is performed only along a “critical edge”, i.e. along an edge from a “branch” node to a “join” node.

Placement into nodes is done as in MRA.
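Critical edges can be found directly from the successor lists of the flow graph. The sketch below treats a “branch” node as one with more than one successor and a “join” node as one with more than one predecessor; the function name `critical_edges` and the small graph are illustrative assumptions.

```python
def critical_edges(succs):
    """Return edges (j, i) that run from a branch node j to a join node i."""
    preds = {}
    for j, targets in succs.items():
        for i in targets:
            preds.setdefault(i, []).append(j)
    return [
        (j, i)
        for j, targets in succs.items()
        for i in targets
        if len(targets) > 1 and len(preds[i]) > 1
    ]


# Hypothetical graph: 1 -> 2, 1 -> 4, 2 -> 3, 3 -> 4.
succs = {1: [2, 4], 2: [3], 3: [4], 4: []}
print(critical_edges(succs))  # [(1, 4)]
```

Only edge (1, 4) qualifies: code placed at the exit of node 1 would also execute on the path to node 2, and code placed at the entry of node 4 would also execute on arrival from node 3, so a new block on the edge is the only place that affects this path alone.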


Edge placement algorithm (Dhamdhere 1988)

B. Reducing lifetimes of expression variables:

Move insertion points as far down as possible without sacrificing computational optimality (it is achieved by the ∑ term)



[Equations garbled in this transcript: apx-PPIN_i is defined in terms of Pavin_i, Antloc_i, Transp_i and apx-PPOUT_i, and apx-PPOUT_i is defined over the successors of block i in terms of PPIN.]

EPA solution technique (“hoisting-followed-by-sinking” approach):

1. Solve the unidirectional data flow problem obtained by omitting the ∑ term from the PPIN equation. This hoists e as far up as possible and provides computational optimality.

2. Solve a second data flow to incorporate the ∑ term: examine all predecessors of a block i and change PPIN of block i from true to false if the ∑ term is false for its predecessors. This sinks the hoisted expression as far down as possible without compromising computational optimality.


Edge placement algorithm (EPA)

[Figure: the six-node flow graph (nodes 1 to 6) before and after EPA. Before: a=.. and occurrences of a*b. After: occurrences rewritten as t=a*b or replaced by t.]

1. a*b is inserted in node 3. However, EPA does not provide lifetime optimality in some cases.

2. a*b is inserted in edge (1,4). This is computationally optimal.


Lazy code motion (KRS 92)

All “join” edges are split a priori by inserting blocks along them

D-Safe-earliest points: An expression is placed at the earliest points where it is anticipatable.

Evaluation of an expression is delayed to the latest point where it can be placed without losing computational optimality.

Thus, it conceptually performs “hoisting-followed-by-sinking”, as in the edge placement algorithm.

Insertion and saving are performed uniformly.

Data flow equations are not given here. (Drechsler and Stadel reformulated them.)


Lazy code motion (KRS)

[Figure: the six-node flow graph (nodes 1 to 6) before and after LCM. After: t=a*b is placed in blocks on edges (3,6) and (1,4), and occurrences of a*b are replaced by t.]

1. Edges (1,4), (3,6), (5,6) and (5,4) are split a priori

2. a*b is inserted in edge (3,6). LCM provides lifetime optimality

3. a*b is inserted in edge (1,4). As in EPA, this is computationally optimal

4. Empty blocks: removed


Eliminatability paths offer ..

A conceptual basis for PRE:
- Identifies partial redundancies which can be eliminated through insertion of code in safe places
  * We call them eliminatable partial redundancies
- A simple method for identifying safe insertion points which offer lifetime optimality
- Thus, no “hoisting-followed-by-sinking”

Computationally optimal PRE:
- Elimination of all eliminatable partial redundancies identified by E-paths through appropriate insertions provides computational optimality

PRE with lifetime optimality:
- Insertions performed using the notion of E-paths provide lifetime optimality

A versatile basis for PRE:
- Classical PRE: PRE performed by insertion, deletion and saving of expressions over a program graph
- PRE over SSA representations of programs

Simplicity:
- Insertion, deletion and save points are identified using simple and well-known data flow concepts of availability and anticipatability

A basis for evaluating effectiveness of an approach to PRE:
- Does the approach provide computational optimality? (i.e. does it eliminate all partial redundancies which can be eliminated?)
- Does the approach provide lifetime optimality?


Eliminatability Paths (E-paths)

A path i .. k in a program control flow graph is an E-path for an expression e if

- Node i contains a locally available occurrence of e and node k contains a locally anticipatable occurrence of e

- Nodes in the path (i .. k) are empty wrt e, i.e. they do not contain an occurrence of e or a definition of any of its operands

- e is safe at the exit of each node in [i .. k), i.e., it is either available or anticipatable at the exit of each node in [i .. k).

Path [i .. k) includes node i, but excludes node k.

Path (i .. k) excludes nodes i and k.
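The three conditions translate directly into a predicate over a candidate path. This is a minimal sketch for one expression, assuming the local properties (Comp, Antloc, Transp) and the global properties at node exits (Avout, Antout) have already been computed; the function name `is_e_path` and the sample values are illustrative.

```python
def is_e_path(path, comp, antloc, transp, avout, antout):
    """Check the three E-path conditions along `path`, a list of nodes i .. k."""
    i, k, inner = path[0], path[-1], path[1:-1]
    # Condition 1: locally available occurrence in i, locally anticipatable one in k.
    if not (comp[i] and antloc[k]):
        return False
    # Condition 2: nodes in (i .. k) are empty with respect to e.
    if any(comp[n] or antloc[n] or not transp[n] for n in inner):
        return False
    # Condition 3: e is safe at the exit of every node in [i .. k).
    return all(avout[n] or antout[n] for n in path[:-1])


# Hypothetical path i, m, n, k: available at the exits of i and m, anticipatable at the exit of n.
path = ["i", "m", "n", "k"]
comp = {"i": True, "m": False, "n": False, "k": True}
antloc = {"i": False, "m": False, "n": False, "k": True}
transp = {n: True for n in path}
avout = {"i": True, "m": True, "n": False, "k": True}
antout = {"i": False, "m": False, "n": True, "k": False}

print(is_e_path(path, comp, antloc, transp, avout, antout))  # True
```

If node n were neither available nor anticipatable at its exit, condition 3 would fail and the occurrence in k would not be eliminatable by safe insertion.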


Eliminatability Path*

[Figure: flow graph with nodes i, m, n and k on a path from i to k; a*b occurs in two nodes.]

- a*b available at exit of [i .. m]

- a*b anticipatable at exit of [n .. k)

- Occurrence of a*b in node k is said to be “eliminatable”

* Dhaneshwar, Dhamdhere (1995) used eliminatability of expressions, but did not define or use E-paths explicitly.


Properties of E-paths: 1

PRE using E-paths provides computational optimality

Use of this property:

- Use it to evaluate computational optimality of a PRE algorithm.

- A PRE algorithm possesses computational optimality if it can eliminate partial redundancy of e in EACH node k such that an E-path i .. k exists in G.


Properties of E-paths: 2

If i .. k is an E-path and j is a node in (i .. k]
- For each in-edge (g, j) such that node g is not in an E-path:
    if node g has a successor s which is not in an E-path
    then insert e in edge (g, j)
    else insert e in node g

- Such insertion provides lifetime optimality of the temporary variable used to hold value of e

Use of the property:

- Check whether a PRE algorithm provides lifetime optimality by comparing program points where insertions are made


Lifetime optimality using E-paths

[Figure: flow graph with nodes i, m, j, k, g1 and g2; a*b occurs in two nodes and t=a*b marks the two insertion points.]

- i .. k is an E-path

- Insertion in edge (g1, j) and node g2 is lifetime optimal


Evaluating MRA using E-paths

[Figure: the six-node flow graph (nodes 1 to 6) before and after MRA, as shown earlier.]

0. Three E-paths exist: 4 .. 5, 5 .. 4 and 5 .. 6.

1. 5 .. 6 is an E-path. Insertion in node 3 would have been lifetime optimal.

2. 5 .. 4 is an E-path. Hence a*b of node 4 is eliminatable, but not eliminated!


PRE using E-paths

For an E-path i .. k

a) Insertions: For a node j in (i .. k]

- Insert e in edge (g, j) if g is not in an E-path and has a successor which is not in an E-path

- Insert e in predecessor g if g is not in an E-path and all its successors are in E-paths

b) Save: Save the computation of e in node i, unless i is the end-node of some E-path h .. i (in which case it would be deleted).

c) Deletion: Delete the occurrence of e in node k.
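Steps (a) to (c) can be sketched as a single function over one E-path. The code assumes that the set of all nodes lying on some E-path and the set of E-path end-nodes are already known; the function name `e_path_pre_actions`, its parameters, and the graph (including node x, a successor of g1 outside every E-path) are illustrative assumptions rather than the author's implementation.

```python
def e_path_pre_actions(path, preds, succs, in_e_path, end_nodes):
    """Return (edge insertions, node insertions, save node or None, delete node)."""
    i, k = path[0], path[-1]
    edge_inserts, node_inserts = [], []
    for j in path[1:]:                        # nodes in (i .. k]
        for g in preds[j]:
            if g in in_e_path:
                continue
            if any(s not in in_e_path for s in succs[g]):
                edge_inserts.append((g, j))   # g also leads outside the E-paths
            else:
                node_inserts.append(g)        # every successor of g is in an E-path
    save = None if i in end_nodes else i      # (b) no save if i's occurrence is deleted
    return edge_inserts, node_inserts, save, k  # (c) delete the occurrence in k


# Hypothetical graph: i -> m -> j -> k, g1 -> j, g1 -> x, g2 -> j.
path = ["i", "m", "j", "k"]
preds = {"m": ["i"], "j": ["m", "g1", "g2"], "k": ["j"]}
succs = {"i": ["m"], "m": ["j"], "j": ["k"], "k": [], "g1": ["j", "x"], "g2": ["j"]}

print(e_path_pre_actions(path, preds, succs, set(path), {"k"}))
# ([('g1', 'j')], ['g2'], 'i', 'k')
```

Under these assumptions the result mirrors the rule above: g1 has a successor outside the E-path, so the insertion goes into edge (g1, j), while g2 leads only into the E-path, so the insertion goes into node g2 itself.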


PRE using E-paths

E-path i .. k may contain 3 kinds of segments

- Avail . ¬Ant segment
- Avail . Ant segment
- ¬Avail . Ant segment : This is called the “E-path suffix”.

Find a node m : ¬Avail(m) . Anticipatable(m) . ∑ Avail(p), p ∈ pred(m)

This is the start node of the E-path suffix.

- Trace Avail . ¬Ant segment backwards from m to find node i, the start of the E-path, and perform a save in it

- Trace ¬Avail . Ant segment forward from m
  a) to perform appropriate insertion for in-edges
  b) to find k and perform a deletion


Segments in an E-Path

[Figure: an E-path through nodes 1, 2, 3, 4, 5, ..., 10 with two occurrences of a*b; the start node of the E-path suffix is marked.]

a) 1 .. 2 : Avail · ¬Ant.

b) 3 .. 4 : Avail · Ant.

c) 5 .. 10 : ¬Avail · Ant (E-path suffix).

E-path suffix: insertions may be needed in paths joining it


Simple data flows for E-path_PRE@

Comp : e is locally available (i.e. downwards exposed) in node

Antloc : e is locally anticipatable (i.e. upwards exposed) in node

Transp : node does not contain definitions of e’s operands

@ : Terminology is from Morel-Renvoise algorithm
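The three local properties can be computed in one pass over a basic block. The sketch below assumes a simplified statement form, a pair (defined variable, right-hand side string), and compares the right-hand side textually with the expression; the function name `local_properties` and this encoding are illustrative assumptions.

```python
def local_properties(block, expr, operands):
    """Return (Comp, Antloc, Transp) of `expr` for a block of (variable, rhs) pairs."""
    transp = all(var not in operands for var, _ in block)
    comp = antloc = False
    killed = False                 # has an operand been redefined so far?
    for var, rhs in block:
        if rhs == expr:
            if not killed:
                antloc = True      # upwards exposed: no operand defined before it
            comp = True            # downwards exposed unless killed later
        if var in operands:
            killed = True
            comp = False           # a later definition invalidates the earlier value
    return comp, antloc, transp


ops = {"a", "b"}
print(local_properties([("x", "a*b")], "a*b", ops))              # (True, True, True)
print(local_properties([("a", "1"), ("x", "a*b")], "a*b", ops))  # (True, False, False)
print(local_properties([("x", "a*b"), ("a", "1")], "a*b", ops))  # (False, True, False)
```

A statement such as a = a*b is handled in evaluation order: the right-hand side is examined first, so the occurrence can be upwards exposed, and the assignment then kills it so that it is not downwards exposed.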


Simple data flows for E-path_PRE

Availability and Anticipatability (i.e. very busy exps.)

Eps-in/Eps-out (Node is in E-path suffix)

SA_in/SA_out (A save should be “performed above”)


Efficiency of E-path_PRE data flows

The generalized theory of bit-vector data flow analysis by Khedker, Dhamdhere (1994) defines two concepts for determining the cost of data flow analysis

- Information flow path (ifp): A graph path along which data flow information may “flow” during data flow analysis.
  (Information “flow”: Values of data flow properties change from `lattice top’ to `lattice bot’ during iterative data flow analysis)

- “Width” of a graph (reduces to depth of a graph for unidirectional data flows)

The number of bit-vector operations during work-list iterative df analysis depends on the length of an ifp, and the number of iterations during round-robin iterative df analysis depends on the width of a graph



The Eps_in/out data flow of E-path_PRE has been designed to have “short” information flow paths. This fact may also lead to small width of a program graph.

Short information flow paths and small width lead to smaller solution times of data flows. This fact is borne out by experimentation --- comparison with the “later” data flow of Drechsler, Stadel (1993) (Dhamdhere 2002):

- In worklist solution: No. of bit vector operations is 80% smaller

- In round-robin iterative solution: No. of iterations is 37% smaller


Code placement models in PRE

Node model

- Simple node model
  Each node contains a single statement
- Basic block model
  Each node is a basic block

Insertion and Saving model

- Saving in situ
  Value of an expression is saved in the place where it is located
- Saving in entry/exit of node
  An expression is moved to node entry/exit if its value is to be saved
- Insertion at entry/exit of node
- Unified insertion and saving
  This is possible only when saving is done at node entry/exit


Code placement models in PRE

Morel-Renvoise Algorithm (MRA):
- Basic blocks, saving in situ, insertion at exit

Edge placement algorithm (EPA):
- Basic blocks, saving in situ, insertions at node exit and in critical edges (edge splitting performed on a needs basis)

Lazy Code Motion (LCM):
- Simple nodes, unified saving and insertion, insertion at node entries and in blocks inserted in join edges in a priori edge splitting

E-path_PRE:
- Basic blocks, saving in situ, insertions at node exit and in critical edges

SIM-PRE:
- Basic blocks, saving in situ, insertion strictly along edges


Evaluation of code placement models using E-paths

Morel-Renvoise algorithm (MRA)

Missed opportunities of optimization (seen before)

Lazy code motion (LCM)

Performs insertion in a join edge (p,j) even if it could have been performed in node p

[Figure: flow graph with nodes 1, 2 and 3 showing two occurrences of a*b and the point where a*b is inserted.]


Evaluation of code placement models using E-paths

Optimal code motion (OCM) Knoop et al 1994

- Basic blocks, Hybrid model, Insertions at node entry and exit

- Hybrid: Uniform insertion and saving model but saving is performed in situ

No insertions and savings will be performed at entry to a node (Lemmas 19 and 23). Hence this feature is redundant.


Evaluation of code placement models using E-paths

Complete elimination of partial redundancies (ComPRE)

Bodik, Gupta and Soffa (1998) (when adapted to classical PRE)

- Simple nodes, unified saving and insertion only in edges

An expression in a node is redundantly hoisted into its entry-edges

- Addressing this problem will require an additional data flow problem, making it less efficient than E-path_PRE.


Later work

SIM-PRE by J. Xue, J. Knoop (2006): inserts only along edges


SIM-PRE by Xue and Knoop

[Data flow equations not preserved in this transcript.]

This data flow traces an E-path!


SIM-PRE by Xue and Knoop

[Figure: flow graph with nodes i, m, j, k, g1, g2 and l; a*b occurs in two nodes and t=a*b marks the insertion points.]

- i .. k is an E-path

- Insertion in edge (g1, j) and node g2 is lifetime optimal

- SIM-PRE inserts in edges (g1, j) and (g2, l)


SIM-PRE by Xue and Knoop

SIM-PRE performs better than E-path_PRE in bit vector operations (Graphic is from J. Xue, J. Knoop (2006))

However, it adds almost 50% more new blocks than E-path_PRE (Dheeraj Kumar, 2006)


Work by Dheeraj Kumar (2006)@

Simplified the data flows of E-path_PRE

Eps_in/out data flow finds nodes {i} that belong to an E-path and have Antout_i = true

SA_in/out data flow finds nodes {i} that belong to an E-path and have Avout_i = true

@ : M. Tech. dissertation, IIT Bombay


Work by Dheeraj Kumar (2006)

Simplified data flows of E-path_PRE (Proposal 2):


Work by Dheeraj Kumar (2006)

Experimental results

- SPECcpu2000 benchmark under GCC 3.4.3

- Proposal 2 performance

  * Bit map operations 5.5% smaller than SIM-PRE in worklist and 15.8% smaller in iterative

  * Introduced 30% fewer blocks than SIM-PRE


Thus, eliminatability paths offer ..

A conceptual basis for PRE

A versatile basis for PRE

A basis for evaluating effectiveness of an approach to PRE

(Efficiency is a bonus!)
