Scalar - University of Texas at Austin

Scalar Optimization

Organization

1. What kinds of optimizations are useful?
2. Program analysis for determining opportunities for optimization: dataflow analysis
   - lattice algebra
   - solving equations on lattices
   - applications to dataflow analysis
3. Speeding up dataflow analysis:
   - exploitation of structure
   - sparse representations: control dependence, SSA form, sparse dataflow evaluator graphs

Optimizations performed by most compilers

- Constant propagation: replace constant-valued variables with constants.
- Common sub-expression elimination: avoid re-computing a value if the value has been computed earlier in the program.
- Loop invariant removal: move computations into less frequently executed portions of the program.
- Strength reduction: replace expensive operations (like multiplication) with simpler operations (like addition).
- Dead code removal: eliminate unreachable code and code that is irrelevant to the output of the program.

Optimization example:

DO J = 1, N
  DO I = 1, N
    A(I,J) = ...

If we assume column-major order of storage, an N x N array A, and w bytes per array element:

Address of A(I,J) = BaseAddress(A) + (J-1)*N*w + (I-1)*w
                  = BaseAddress(A) + J*N*w + I*w - (N+1)*w

Only the term I*w depends on I, so the rest of the computation is invariant in the inner loop and can be hoisted out of it.

Further hoisting of invariant code is possible since only the subterm J*N*w depends on J.

Since I and J are incremented by 1 each time through the loop, expressions like I*w and J*N*w can be strength reduced by replacing them with additions.

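Pseudo-code for original loop nest (N is the loop bound and array dimension, w the element size in bytes):

DO J = 1, N
  DO I = 1, N
    t = BaseAddress(A) + J*N*w + I*w - (N+1)*w
    STORE(t, ...)

Pseudo-code for optimized loop nest:

t1 = BaseAddress(A) - (N+1)*w
DO J = 1, N
  t1 = t1 + N*w
  t2 = t1
  DO I = 1, N
    t2 = t2 + w
    STORE(t2, ...)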
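The two loop nests above should touch the same sequence of addresses. The following is a minimal Python sketch that checks this, assuming column-major storage of a square array; the names `base`, `n`, and `w` (base address, array dimension, element size in bytes) are illustrative and do not come from the slides.

```python
def original_addresses(base, n, w):
    """Addresses from the unoptimized nest: full address arithmetic per element."""
    out = []
    for j in range(1, n + 1):
        for i in range(1, n + 1):
            out.append(base + (j - 1) * n * w + (i - 1) * w)
    return out


def optimized_addresses(base, n, w):
    """Addresses after hoisting invariant terms and strength-reducing the multiplies."""
    out = []
    t1 = base - (n + 1) * w        # invariant part, hoisted out of both loops
    for _j in range(n):
        t1 = t1 + n * w            # addition replaces the multiplication by J
        t2 = t1
        for _i in range(n):
            t2 = t2 + w            # addition replaces the multiplication by I
            out.append(t2)
    return out
```

The inner loop of the optimized version performs one addition per element instead of two multiplications and several additions.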


Terminology

A definition of a variable is a statement that may assign to that variable.

Definitions of x:
(i)  x := ...
(ii) ... F(x,y) ... (call by reference)

In the second example, invocation of F may write to x, so to be safe, we declare the invocation to be a definition of x.

A use of a variable is a statement that may read the value of that variable.

Uses of x:
(i)  y := x
(ii) ... F(x,z) ...

Aliasing: occurs in a program when two or more names refer to the same storage location.

Examples:

(i)  procedure f(var x, y)
       ...
     ... f(z,z) ... f(a,b) ...

     Within f, reference parameters x and y may be aliases.

(ii) x := ...
     y := &x
     *y := ...
     ... := x ...

     x and *y are aliases for the same location.

Our position:

We will not perform analysis for variables that may be aliased (such as reference parameters, local variables whose addresses have been taken, etc.).

This implies we can determine syntactically where all definitions and uses of a variable are.

More refined approach: perform alias analysis.

For the next few slides, we will focus on constant propagation to illustrate the general approach to dataflow analysis.

Examples:

(i)  Before:                        After:
     x := 1;                        x := 1;
     y := x + 2;                    y := 3;
     if x > z then y := 5 fi;       if 1 > z then y := 5 fi;
     ... y ...                      ... y ...

Constant propagation may simplify control flow as well:

(ii) Before:                        After:
     x := 1;                        x := 1;
     y := x + 2;                    y := 3;                    (dead code)
     if y > x then y := 5 fi;       if true then y := 5 fi;    (simplify)
     ... y ...                      ... 5 ...

Why do opportunities for constant propagation arise in programs?

- constant declarations for modularity
- macros
- procedure inlining: small methods in OO languages
- machine-specific values

Overview of algorithm:

1. Build the control flow graph of the program: this makes the flow of control in the program explicit.
2. Perform "symbolic evaluation" to determine constants.
3. Replace constant-valued variable uses by their values and simplify expressions and control flow.

Step 1: build the control flow graph (CFG).

Example:

...
x := 1;
y := x + 2;
if (y > x) then y := 5 fi;
... y ...

[Figure: control flow graph (CFG) of the example, with nodes START, x := 1, y := x + 2, switch y > x, y := 5, merge, and ...y...; state vectors are shown on the CFG edges.]

Algorithm for building CFG: easy, the only complication being breaks and GOTOs (need to identify jump targets).

Basic block: straight-line code without any branches or merging of control flow.

Nodes of CFG: statements (or basic blocks), switches, merges.

Edges of CFG: represent possible control flow sequence.

Symbolic evaluation of CFG for constant propagation

Propagate values from the following lattice:

                 ⊤                   (definitely not a constant)
  false  true  ...  -1  0  1  ...
                 ⊥                   (may or may not be constant)

Examples: join(⊤, 0) = ⊤, join(0, -1) = ⊤, meet(0, -1) = ⊥, meet(⊤, 1) = 1.

Two operators:
- join(a, b): lowest value above both a and b (also written as a ⊔ b)
- meet(a, b): highest value below both a and b (also written as a ⊓ b)

Symbolic interpretation of expressions: EVAL(e, Vin): if any argument of e is ⊤ (or ⊥) in Vin, return ⊤ (or ⊥ respectively); otherwise, evaluate e normally and return that value.

Algorithm:

1. Associate one state vector with each edge of the CFG, initializing all entries to ⊥. Initialize the work-list to empty.
2. Set each entry of the state vector on the edge out of START to ⊤, and place this edge on the worklist.
3. while worklist is not empty do
     Get edge from worklist;
     Let state vector on edge be Vin;
     // Symbolically evaluate target node of the edge,
     // using the state vectors on its inputs,
     // and propagate result state vector to output edge of node.
     if (target node is assignment statement x := e)
       Propagate Vin[EVAL(e, Vin)/x] to output edge;
     elseif (target node is switch(p))
       if EVAL(p, Vin) is ⊤, Propagate Vin to all outputs of switch;
       elseif EVAL(p, Vin) is true, Propagate Vin to true side of switch;
       else Propagate Vin to false side of switch;
     else // target node is merge
       Propagate join of state vectors on all inputs to output;
     If this changes the output state vector, enqueue output edge on worklist;
   od;
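The work-list algorithm above can be sketched in Python as follows. This is an illustrative sketch, not the slides' own code: the encoding of the lattice (`TOP`, `BOT`), of expressions as tuples, and of CFG edges as `(source, target, label)` triples are all assumptions made here, and `propagate` joins the new vector with the one already on the edge, which is one common way to realize "propagate, and enqueue if changed".

```python
import operator

TOP, BOT = "TOP", "BOT"   # TOP: definitely not a constant; BOT: no information yet
OPS = {"+": operator.add, ">": operator.gt}


def join(a, b):
    """Lowest lattice value above both a and b."""
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP


def eval_expr(e, vin):
    """EVAL(e, Vin): e is a constant, a variable name, or a tuple (op, e1, e2)."""
    if isinstance(e, str):
        return vin[e]
    if not isinstance(e, tuple):
        return e
    a, b = eval_expr(e[1], vin), eval_expr(e[2], vin)
    if TOP in (a, b):
        return TOP
    if BOT in (a, b):
        return BOT
    return OPS[e[0]](a, b)


def constant_propagation(nodes, edges, variables):
    """nodes: name -> ('assign', x, e) | ('switch', p) | ('merge',) | ('end',).
    edges: list of (src, dst, label); label is True/False on switch outputs.
    Returns a dict mapping each edge to its state vector."""
    state = {e: {v: BOT for v in variables} for e in edges}
    worklist = []

    def outs(n, label=None):
        return [e for e in edges if e[0] == n and (label is None or e[2] is label)]

    def propagate(vec, targets):
        for e in targets:
            new = {v: join(state[e][v], vec[v]) for v in variables}
            if new != state[e]:
                state[e] = new
                worklist.append(e)

    propagate({v: TOP for v in variables}, outs("START"))
    while worklist:
        edge = worklist.pop()
        vin, target = state[edge], edge[1]
        node = nodes[target]
        if node[0] == "assign":
            _, x, e = node
            propagate({**vin, x: eval_expr(e, vin)}, outs(target))
        elif node[0] == "switch":
            p = eval_expr(node[1], vin)
            propagate(vin, outs(target) if p == TOP else outs(target, p))
        elif node[0] == "merge":
            merged = {v: BOT for v in variables}
            for e in edges:
                if e[1] == target:
                    merged = {v: join(merged[v], state[e][v]) for v in variables}
            propagate(merged, outs(target))
    return state


# Example program: x := 1; y := x + 2; if y > x then y := 5 fi; ... y ...
NODES = {"a": ("assign", "x", 1), "b": ("assign", "y", ("+", "x", 2)),
         "s": ("switch", (">", "y", "x")), "c": ("assign", "y", 5),
         "m": ("merge",), "u": ("end",)}
EDGES = [("START", "a", None), ("a", "b", None), ("b", "s", None),
         ("s", "c", True), ("s", "m", False), ("c", "m", None), ("m", "u", None)]
RESULT = constant_propagation(NODES, EDGES, ["x", "y"])
```

In this run the false side of the switch is never placed on the work-list, so its state vector stays at ⊥ and the merge sees only the value of y from the true side.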


Running example:

[Figure: control flow graph (CFG) with nodes START, x := 1, y := x + 2, switch y > x, y := 5, merge, and ...y...; state vectors are shown on the CFG edges.]

Algorithm can be quite subtle:

[Figure: CFG of a loop with nodes START, x := 1, merge, ...x..., switch p, and x := x + 1.]

First time through the loop, the use of x in the loop is determined to be constant 1. Next time through the loop, it reaches its final value ⊤.

Complexity of algorithm:

Height of lattice = 2, so each state vector can change value at most 2V times (V is the number of variables).

So the while loop in the algorithm is executed at most 2EV times (E is the number of CFG edges).

Cost of each iteration: O(V).

So the overall algorithm takes O(EV^2) time.

Questions:

- Can we use the same work-list based algorithm with different lattices to solve other analysis problems?
- Can we improve the efficiency of the algorithm?

Need to separate what is being computed from how it is being computed (use algebras once again).

Lattice algebraic approach to dataflow analysis

Abstractly, our work-list algorithm can be viewed as one solution procedure for solving a set of lattice algebraic equations.

Dataflow lattices:
- partially ordered set of finite height
- meet and join operations with appropriate algebraic properties are defined for all pairs of values from the po-set

These properties imply that the lattice has a least and a greatest element.

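Examples:

[Figure: two lattices. Left: the lattice for constant propagation, with middle elements false, true, ..., -1, 0, 1, .... Right: the power set of variables {x,y,z}, with {x,y,z} at one end, then {x,y}, {x,z}, {y,z}, then {x}, {y}, {z}, and {} at the other end.]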

Monotonic function: If D is a partially ordered set and f: D → D, f is said to be monotonic if x ≤ y ⇒ f(x) ≤ f(y).

Intuitively, if the input of a monotonic function is increased, the output either stays the same or increases as well.

Examples of monotonic functions on CP lattice:
- identity: f(x) = x
- bottom function: f(x) = ⊥
- constant function: f(x) = c, for a fixed lattice value c

Examples of non-monotonic functions on CP lattice:
- f(x) = if (x = ⊥) then 1 else 0 (here ⊥ ≤ 0, but f(⊥) = 1 and f(0) = 0 are incomparable)

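Theorem: Let D be a lattice of finite height and f: D → D be monotonic. Then the equation x = f(x) has a least and a greatest solution, given by the limits of the chains ⊥, f(⊥), f(f(⊥)), ... and ⊤, f(⊤), f(f(⊤)), ...

Proof:
  ⊥ ≤ f(⊥)             (definition of ⊥)
  f(⊥) ≤ f(f(⊥))       (monotonicity of f)
  ...
  ⊥ ≤ f(⊥) ≤ f(f(⊥)) ≤ ...

Since the lattice has finite height, this chain has some largest element l, and f(l) = l. So l solves the equation. It is also easy to show that l is the least solution to the equation: any solution s satisfies ⊥ ≤ s, and applying f repeatedly to both sides keeps every element of the chain below s, so l ≤ s.

A similar argument shows that a greatest solution exists.

[Figure: the solutions of x = f(x) lie between the least solution and the greatest solution.]

Examples:
- f(x) = c (a constant function): limit(⊥, f(⊥), f(f(⊥)), ...) = c and limit(⊤, f(⊤), f(f(⊤)), ...) = c
- f(x) = x: limit(⊥, f(⊥), f(f(⊥)), ...) = ⊥ and limit(⊤, f(⊤), f(f(⊤)), ...) = ⊤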
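The chains in the theorem can be computed directly by repeated application of f. The sketch below is illustrative only: it uses the power set of {x, y, z} ordered by containment as the lattice, and the function `f` is a hypothetical monotonic function chosen so that the least and greatest solutions differ.

```python
def limit(f, start):
    """Follow start, f(start), f(f(start)), ... until the value stops changing.

    Termination relies on f being monotonic over a lattice of finite height,
    with start being the least or the greatest element.
    """
    x = start
    while True:
        nxt = f(x)
        if nxt == x:
            return x
        x = nxt


BOTTOM = frozenset()              # least element of the power set lattice
TOP_SET = frozenset("xyz")        # greatest element


def f(s):
    """A hypothetical monotonic function: always add x, always drop z."""
    return frozenset({"x"}) | (s - {"z"})


LEAST = limit(f, BOTTOM)          # chain {}, {x}, {x}
GREATEST = limit(f, TOP_SET)      # chain {x,y,z}, {x,y}, {x,y}
```

Here the equation s = f(s) has two solutions, {x} and {x,y}; iterating from the bottom element finds the first and iterating from the top element finds the second.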


Corollary:

If f, g, h, etc. are monotonic functions, the system of equations

  x = f(x, y, z, ...)
  y = g(x, y, z, ...)
  z = h(x, y, z, ...)
  ...

has least and greatest solutions given by the limits of the obvious chains (e.g., the least solution is obtained by starting with ⊥ for all variables, substituting into the right-hand sides to get new values for all variables, and iterating till convergence occurs).

Connection between constant propagation and lattice equations: the iterative procedure is just a method to solve lattice equations.

[Figure: CFG of the running example (START, x := 1, y := x + 2, switch y > x, y := 5, merge, ...y...) with its edges labeled S0 to S6.]

Equations (2 variables):
  S0 = [⊤,⊤]
  S1 = S0{1 / x}
  S2 = S1{2 / y}
  S3 = if (S2[y] < S2[x]) or (S2[y] = ⊤) or (S2[x] = ⊤) then S2 else ⊥
  .....
  S6 = S4 U S5
  ......

S0, S1, S2, ... : C x C
Lattice C: true false ... -1 0 1 ....

Question: since equations have many solutions in general, which one should we compute?

For CP, the least solution gives more accurate information than other solutions.

[Figure: CFG of a loop (START, x := 1, merge, y := ..., switch p(y), ...x...) with two state vectors on each edge: the least solution, which carries [1,T] on the edges after x := 1, and the greatest solution, which carries [T,T] everywhere.]

In general, if the confluence operator is join, compute the least solution; otherwise compute the greatest solution.

General specification of dataflow problem:

- Lattice: finite height
- Rules for writing down equations from CFG
- Confluence operator

No special arguments about termination or complexity are needed.

Constant propagation is an example of a FORWARD-FLOW, ALL-PATHS problem.

Intuitively, data is propagated forward in the CFG, and a value is constant at a point p only if it is the same constant for all paths from start to p.

General classification of dataflow problems:

            ALL PATHS                  ANY PATH
FORWARD     constant propagation,      reaching definitions
            available expressions
BACKWARD    very busy expressions      live variables

Available expressions: FORWARD FLOW, ALL PATHS

Definition: An expression "x op y" is available at a point p if every path from START to p contains an evaluation of "x op y" after which there are no assignments to x or y.

[Figure: CFG in which the two branches out of START compute y := x + 1 and z := x + 1 before a merge; x+1 is "available" at the use ...x+1... after the merge.]

Equations:
- START: E0 = { }
- x := y op z, with E0 on the input edge and E1 on the output edge: E1 = {y op z} U (E0 - Ex), where Ex is all expressions involving x

Lattice: powerset of all expressions in procedure, ordered by containment
Confluence operator: meet (intersection)
Compute greatest solution
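The equations above can be solved by starting every point at the full set of expressions and iterating downward, which yields the greatest solution. The following Python sketch is an assumption-laden illustration: the node encoding `(lhs, expr)`, the `preds` map, and the tiny example CFG are all hypothetical, and the transfer function follows the slide's equation literally, so it assumes the assigned variable is not an operand of its own right-hand side.

```python
def available_expressions(nodes, preds, all_exprs):
    """nodes: name -> (lhs, expr); expr is a tuple (y, op, z) or None.
    preds: name -> list of predecessor names.
    Returns the set of expressions available at the entry of each node."""
    # Greatest solution: every node except START starts at the full set.
    avail_in = {n: set() if n == "START" else set(all_exprs) for n in nodes}

    def out(n):
        lhs, expr = nodes[n]
        gen = {expr} if expr is not None else set()
        kept = {e for e in avail_in[n] if lhs not in (e[0], e[2])}   # E0 - Ex
        return gen | kept                                            # E1

    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == "START":
                continue
            new = set(all_exprs)
            for p in preds[n]:
                new &= out(p)          # confluence operator: meet (intersection)
            if new != avail_in[n]:
                avail_in[n], changed = new, True
    return avail_in


X_PLUS_1 = ("x", "+", "1")
NODES = {"START": (None, None), "p": (None, None),
         "a": ("y", X_PLUS_1), "b": ("z", X_PLUS_1), "m": (None, None)}
PREDS = {"p": ["START"], "a": ["p"], "b": ["p"], "m": ["a", "b"]}
AVAIL = available_expressions(NODES, PREDS, {X_PLUS_1})
```

Because both branches evaluate x + 1, the expression is available at the merge node `m`; replacing one branch with an assignment to x removes it.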


Reaching definitions: FORWARD FLOW, ANY PATH

A definition d of a variable v is said to reach a point p if there is a path from START to p which contains d, and which does not contain any definitions of v after d.

[Figure: CFG with five definitions labeled d0 to d4 among the statements a := READ(), b := READ(), d := b - a, d := b + d, and d := a + b, together with switches d > 0 and d > b, two merges, print(d), and END; edges are annotated with sets of reaching definitions such as {}, {d0}, {d0,d1}, {d0,d1,d2}, and {d0,d1,d2,d3}.]

Lattice: powerset of definitions in procedure

Equations:
- START: {}
- d: x := e, with Din on the input edge and Dout on the output edge: Dout = {d} U (Din - dx), where dx is the set of all definitions of x

Confluence operator: join (union)
Compute least solution
Complexity: D*E*D (D is number of definitions)
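A minimal Python sketch of these equations follows, iterating upward from empty sets to reach the least solution. The encoding is an assumption made for illustration: each assignment node doubles as its own definition label, `defs` maps a node to the variable it defines, and the small loop CFG is hypothetical rather than the example from the slide.

```python
def reaching_definitions(defs, preds):
    """defs: node -> variable it defines (None if the node is not an assignment).
    preds: node -> list of predecessor nodes.
    Returns the set of definitions reaching the entry of each node."""
    reach_in = {n: set() for n in defs}          # least solution: start from {}

    def out(n):
        x = defs[n]
        if x is None:
            return reach_in[n]
        # Dout = {d} U (Din - dx)
        return {n} | {d for d in reach_in[n] if defs[d] != x}

    changed = True
    while changed:
        changed = False
        for n in defs:
            new = set()
            for p in preds.get(n, []):
                new |= out(p)                    # confluence operator: join (union)
            if new != reach_in[n]:
                reach_in[n], changed = new, True
    return reach_in


# Hypothetical loop: d0: x := 1; merge m; d1: x := x + 1 (back to m); use u.
DEFS = {"START": None, "d0": "x", "m": None, "d1": "x", "u": None}
PREDS = {"d0": ["START"], "m": ["d0", "d1"], "d1": ["m"], "u": ["m"]}
REACH = reaching_definitions(DEFS, PREDS)
```

Both definitions of x reach the use `u`, since one arrives from before the loop and the other along the back edge.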


Many intermediate representations record reaching definitions information in graphical form:

- def-use chain: edge whose source is a definition of variable v, and whose destination is a use of v reached by that definition
- use-def chain: reverse of def-use chain

[Figure: the reaching-definitions example CFG with its def-use chains drawn in.]

Live variable analysis: BACKWARD FLOW, ANY PATH

A variable x is said to be live at a point p if x is used before being assigned on some path from p to END (used in register allocation).

Lattice: powerset of variables ordered by containment

Equations:
- END: {}
- x := y op z, with E0 on the edge after the statement and E1 on the edge before it: E1 = {y,z} U (E0 - {x})

Confluence operator: join (union)
Compute least solution
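The backward equations above can be sketched in Python as follows. The statement encoding `(defined variable, used variables)`, the `succs` map, and the three-statement example are assumptions introduced for illustration.

```python
def live_variables(stmts, succs):
    """stmts: node -> (defined variable or None, set of used variables).
    succs: node -> list of successor nodes.
    Returns the set of variables live at the entry of each node."""
    live_in = {n: set() for n in stmts}          # least solution: start from {}
    changed = True
    while changed:
        changed = False
        for n in stmts:
            live_out = set()
            for s in succs.get(n, []):
                live_out |= live_in[s]           # confluence operator: join (union)
            defined, used = stmts[n]
            new = set(used) | (live_out - {defined})   # E1 = {y,z} U (E0 - {x})
            if new != live_in[n]:
                live_in[n], changed = new, True
    return live_in


# Hypothetical straight-line program: x := 1; y := x + 2; print(y)
STMTS = {"s1": ("x", set()), "s2": ("y", {"x"}), "s3": (None, {"y"}),
         "END": (None, set())}
SUCCS = {"s1": ["s2"], "s2": ["s3"], "s3": ["END"]}
LIVE = live_variables(STMTS, SUCCS)
```

Information flows from END toward START here: x is live only between its definition and its use in the second statement.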


Very busy expressions: BACKWARD FLOW, ALL PATHS

An expression e = "y op z" is said to be very busy at a point p if it is evaluated on every path from p to END before an assignment to y or z.

Lattice: powerset of expressions ordered by containment

Equations:
- END: {}
- x := y op z, with E0 on the edge after the statement and E1 on the edge before it: E1 = {y op z} U (E0 - Ex), where Ex is the set of expressions containing x

Confluence operator: meet (intersection)
Compute greatest solution

Pragmatics of dataflow analysis:

- Compute and store information at basic block level.
- Use bit vectors to represent sets.

Question: can we speed up dataflow analysis?

Two approaches:
- exploit structure in control flow graph
- exploit sparsity
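As an illustration of the bit-vector representation mentioned above, the sketch below encodes sets of definitions as Python integers, so that a transfer function of the form gen U (in - kill) becomes two bitwise operations. The helper names `to_bits` and `transfer` and the numbering of definitions in `INDEX` are hypothetical.

```python
def to_bits(items, index):
    """Encode a set of items as an integer with one bit per item."""
    bits = 0
    for item in items:
        bits |= 1 << index[item]
    return bits


def transfer(gen, kill, d_in):
    """out = gen U (in - kill), computed on bit vectors."""
    return gen | (d_in & ~kill)


INDEX = {"d0": 0, "d1": 1, "d2": 2}            # assumed numbering of definitions
GEN = to_bits({"d2"}, INDEX)
KILL = to_bits({"d0", "d2"}, INDEX)
OUT = transfer(GEN, KILL, to_bits({"d0", "d1"}, INDEX))
```

Union, intersection, and set difference each cost one machine operation per word of the vector, which is why this representation is attractive at basic block level.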


Optimizing Dataflow Analysis

Constant propagation on CFG: O(EV^2)
Reaching definitions on CFG: O(EN^2)
Available expressions on CFG: O(EA^2)

Two approaches to speeding up dataflow analysis:
- exploit structure in the program
- exploit sparsity in the dataflow equations: usually, a dataflow equation involves only a small number of dataflow variables

Exploiting program structure

- Work-list algorithm did not enforce any particular order for processing equations.
- Should exploit program structure to avoid revisiting equations unnecessarily.

[Figure: a conditional on p with x = 2 on one branch and x = 3 on the other, followed by y = ....x....; the corresponding equations are labeled e1, e2, and e3.]

- We should schedule e3 after we have processed e1 and e2; otherwise e3 will have to be done twice.
- If this is within a loop nest, this can be a big win.

General approach to exploiting structure: elimination

- Identify regions of CFG that can be preprocessed by collapsing a region into a single node with the same input-output behavior as the region.
- Solve dataflow equations iteratively on the collapsed graph.
- Interpolate dataflow solution into collapsed regions.

What should be a region?
- basic blocks
- basic blocks, if-then-else, loops
- intervals
- ...

Structured programs: limit in which no iteration is required

Example: reaching definitions in structured language

To summarize the effect of a region, compute gen and kill for the region.
The dataflow equation for the region can be written using gen and kill.

[Figure: region R with input edge "in" and output edge "out".]

gen[R]: set of definitions in R from which there is a path to exit free of other definitions of the same variable
kill[R]: set of definitions in program that do not reach exit of R even if they reach the beginning of R

out = gen[R] U (in - kill[R])

Rules for each kind of region:

Single definition, R = (d: a = b + c):
  gen[R] = {d}
  kill[R] = Da (all definitions of a)
  out[R] = gen[R] U (in[R] - kill[R])

Sequence, R = R1 followed by R2:
  gen[R] = gen[R2] U (gen[R1] - kill[R2])
  kill[R] = kill[R1] U kill[R2]
  in[R1] = in[R]
  in[R2] = gen[R1] U (in[R] - kill[R1])

Conditional, R = predicate p choosing between R1 and R2:
  gen[R] = gen[R1] U gen[R2]
  kill[R] = kill[R1] ∩ kill[R2]
  in[R1] = in[R2] = in[R]

Loop, R = predicate p with body R1:
  gen[R] = gen[R1]
  kill[R] = kill[R1]
  in[R1] = in[R] U gen[R]
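The gen/kill rules for regions compose bottom-up, which the following Python sketch illustrates for single definitions, sequences, and conditionals. The function names and the three definitions d0, d1, d2 are hypothetical. One choice here should be read as an assumption: the conditional rule takes the intersection of the two kill sets, on the reasoning that a definition fails to reach the exit only if both branches kill it.

```python
def statement(d, defs_of_var):
    """Region for a single definition d; defs_of_var is every definition of the
    same variable (d included). Returns (gen, kill)."""
    return {d}, set(defs_of_var)


def sequence(r1, r2):
    """Region for R1 followed by R2."""
    (g1, k1), (g2, k2) = r1, r2
    return g2 | (g1 - k2), k1 | k2


def if_then_else(r1, r2):
    """Region for a conditional whose branches are R1 and R2."""
    (g1, k1), (g2, k2) = r1, r2
    return g1 | g2, k1 & k2


def transfer(region, d_in):
    """out = gen U (in - kill)."""
    gen, kill = region
    return gen | (d_in - kill)


# Hypothetical definitions: d0 and d2 define a, d1 defines b.
DEFS_A, DEFS_B = {"d0", "d2"}, {"d1"}
SEQ = sequence(statement("d0", DEFS_A), statement("d2", DEFS_A))
COND = if_then_else(statement("d2", DEFS_A), statement("d1", DEFS_B))
OUT = transfer(COND, {"d0"})
```

With these summaries, the output of a whole region is computed from its input in one step, without iterating over the statements inside it.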


Observations:

- For structured programs, we can solve dataflow problems like reaching definitions purely by elimination, without any iteration (complexity: O(EV)).
- For structured programs, we can even solve the dataflow problem directly on the abstract syntax tree (no need to build the control flow graph).
- For less structured programs (like reducible programs), we must build the control flow graph to identify regions like intervals, but there is still no need to iterate.

Exploiting sparsity to speed up dataflow analysis

Example: constant propagation

- CFG algorithm for constant propagation used the control flow graph to propagate state vectors.
- Propagating information for all variables in lock-step forces a lot of useless copying of information from one vector to another (consider a variable that is defined at top of procedure and used only at bottom).

Solution:
- do constant propagation for each variable separately
- propagate information directly from definitions to uses, skipping over irrelevant portions of the control flow graph

Subtle point: in what order should we process variables?

Constant propagation using def-use chains

1. Associate a cell with each lhs and rhs occurrence of all variables, initialized to ⊥.
2. Propagate ⊤ along each def-use edge out of START, and enqueue target statements of def-use edges onto worklist.
3. Enqueue all definitions with constant RHS onto worklist.
4. while (worklist is not empty) do
     dequeue definition d from work-list;
     evaluate RHS of d using cell values for RHS variables and update LHS cell;
     if this changes LHS cell value,
       propagate new value along def-use chains to each use
       (take join of cell value at use and LHS cell value);
       if cell value at use changes and target statement is a definition,
         enqueue target statement onto worklist;
   od;

Example:

[Figure: control flow graph (CFG) with nodes START, x := 1, y := x + 2, switch y > x, y := 45, y := 5, merge, and ...y..., annotated with def-use edges and a cell for the value at each definition/use.]

Complexity: O(size of def-use chains)

This can be as large as O(N^2 V), where N is the size of the set of CFG nodes. However, with SSA form, it can be reduced to O(EV).

Problem with algorithm: loss of accuracy.

Propagation along def-use chains cannot determine directly that y := 45 is dead code, so the last use of y is not marked constant.

One possibility: repeated cycles of reaching definition computation, constant propagation and dead code elimination.

Is there a better way?

Key idea:
- find unreachable statements during constant propagation
- do not propagate values out of unreachable definitions

One approach: use control dependence and def-use chains

Intuitive idea of control dependence: Node n is control dependent on predicate p if p determines whether n is executed.

Convention: assume START is a predicate, so unconditionally executed statements are control dependent on START.

CDG: Control dependence graph

[Figure: control flow graph (CFG) with nodes START, x := 1, y := x + 2, switch y > x, y := 45, y := 5, merge, and ...y..., annotated with control dependence edges.]

Algorithm: Propagate "liveness" along control dependence edges while propagating constants along def-use chains.

[Figure: CFG with nodes START, x := 1, y := x + 2, switch y > x, y := x, y := 5, merge, and ...y..., annotated with control dependence edges and def-use chains.]

Constant propagation

1. Associate a cell with each lhs and rhs occurrence of all variables, and with each statement, initialized to ⊥.
2. Propagate ⊤ along each def-use edge and control dependence edge out of START. If the value in any target cell changes, enqueue target statement onto worklist.
3. while (worklist is not empty) do
     dequeue statement d from work-list;
     if control dependence cell of statement is ⊤ (top)
       switch (type of d)
       case (definition):
         Evaluate RHS of d using cell values for RHS variables and update LHS cell.
         If this changes LHS cell value,
           propagate new value along def-use chains to each use
           (take join of cell value at use and LHS cell value).
         If cell value at use changes, enqueue target statement onto worklist.
       case (switch):
         Evaluate predicate and propagate along appropriate control dependence edges out of predicate.
         If cell value at target changes,
           enqueue target statement onto worklist.
     fi;
   od;

Observations:

- We do not propagate information out of dead (unreachable) statements.
- However, precision of information is still not as good as the CFG algorithm: we still propagate information out of statements that are executed but are irrelevant to output (other sort of dead statements, as in Slide ...).
- Need an algorithm to compute control dependence in general graphs.
- Size of CDG: O(EN) (can be reduced).

Solutions:

- Require that a variable assigned on one side of a conditional be assigned on both sides of the conditional, by inserting dummy assignments of the form x := x. Programmers don't want to do this.
- Make the compiler insert dummy assignments. Hard to figure out in the presence of unstructured control flow.
- Use SSA form: ensure that every use is reached by exactly one definition, by inserting phi-functions at merges to combine multiple reaching definitions.

SSA algorithm for Constant Propagation

[Figure: CFG of the running example (START, x := 1, y := x + 2, switch y > x, y := 5, merge, ...y...) with a phi-function Φ at the merge.]

- phi-function is like a pseudo-assignment
- phi-function combines different reaching definitions at a merge into a single one at output of merge

Where should phi-functions be placed?
- One possibility: one phi-function for every variable at every merge in CFG.
- Minimal SSA form: permit def-use chains to bypass a merge if same definition reaches all sides of merge (e.g., variable x in example).
- Computing minimal SSA form: O(|E|) per variable (Pingali and Bilardi PLDI 96).

Constant propagation:
- similar to previous algorithm, but at merge, propagate join of inputs only from live sides of merge
- control dependence at merge: compute for each side of the merge separately

Same idea can be applied to other dataflow analysis problems:

- perform dataflow analysis for each sub-problem separately (e.g., for each expression separately in available expressions problem)
- build a sparse graph in which only statements that modify or use dataflow information for the sub-problem are present, and solve the problem on that graph

Sparse dataflow evaluator graph can be built in O(|E|) time per problem (Pingali and Bilardi PLDI 96).

[Figure: a control flow graph (START, switches p1 and p2, assignments x := ..., uses ..x+y.., END) next to the sparse dataflow evaluator graph for availability of x+y, in which statements are replaced by Av := 1 (available), Av := 0 (not available), and uses ...Av..., with φ nodes at merges; confluence operator: meet.]