Class Data Flow Analysis

download Class Data Flow Analysis

of 44

Transcript of Class Data Flow Analysis

  • 8/9/2019 Class Data Flow Analysis

    1/44

    Data Flow Analysis

    October 5, 2004

  • 8/9/2019 Class Data Flow Analysis

    2/44

  • 8/9/2019 Class Data Flow Analysis

    3/44

  • 8/9/2019 Class Data Flow Analysis

    4/44

    Compiler Structure

    Source code parsed to produce abstract syntax tree.

    Abstract syntax tree transformed to control flow graph.

    Data flow analysisoperates on the control flow graph(and other intermediate representations).

    Source

    code

    Abstract

    Syntax

    tree

    Control

    Flow

    Graph

    Object

    code

  • 8/9/2019 Class Data Flow Analysis

    5/44

    Abstract Syntax Tree (AST)

    Programs are written in text

    as seuences of characters

    may be aw!ward to wor! with.

    "irst step# Con$ert to structured representation.

    %se lexer (li!e lex) to recogni&e to!ens

    %se parser (li!e yacc) to group to!ens structurally

    often produce to produce AST

  • 8/9/2019 Class Data Flow Analysis

    6/44

    Abstract Syntax Tree 'xample

    x # a b*

    y # a + b

    ,hile (y - a)

    a # a /*

    x # a b

    0

    program

    while

    block

    =

    x +

    a b

    >

    =

    a +

    a 1

    y a

  • 8/9/2019 Class Data Flow Analysis

    7/44

    ASTs

    ASTs are abstract

    don1t contain all information in the program

    e.g.2 spacing2 comments2 brac!ets2 parenthesis.

    Any ambiguity has been resol$ed

    e.g.2 a b c produces the same AST as

    (a b) c.

  • 8/9/2019 Class Data Flow Analysis

    8/44

    3isad$antages of ASTs

    ASTs ha$e many similar forms

    e.g.2 for while2 repeat 2 until2 etc

    e.g.2 if2 42 switch

    'xpressions in AST may be complex2 nested

    (56 + y) ( & - 7 4 /6 + & # & 68)

    ,ant simpler representation for analysis

    9 at least for dataflow analysis.

  • 8/9/2019 Class Data Flow Analysis

    9/44

    Control:"low ;raph (C";)

    A directed graphwhere

    'ach node represents a statement

    'dges represent control flow

    Statements may be

    Assignments x y op &or x op &

    Copy statements x y

  • 8/9/2019 Class Data Flow Analysis

    10/44

  • 8/9/2019 Class Data Flow Analysis

    11/44

    >ariations on C";s

    %sually don1t include declarations (e.g. int x*).

    ?ay want a uniue entry and exit point.

    ?ay group statements into basic blocks.

    A basic blockis a seuence of instructions with nobranches into or out of the bloc!.

  • 8/9/2019 Class Data Flow Analysis

    12/44

    Control:"low ;raph with

  • 8/9/2019 Class Data Flow Analysis

    13/44

    C"; $s. AST

    C";s are much simpler than ASTs

    "ewer forms2 less redundancy2 only simpleexpressions

  • 8/9/2019 Class Data Flow Analysis

    14/44

    3ata "low Analysis

    A framewor! for pro$ing facts about program

    Beasons about lots of little facts

    =ittle or no interaction between facts

    ,or!s best on properties about how programcomputes

  • 8/9/2019 Class Data Flow Analysis

    15/44

    A$ailable 'xpressions

    An expression e x op y is availableat a programpoint p2 if

    on e$ery path from the entry node of the graph to node p2 eis computed at least once2 and

    And there are no definitions of x or y since the most recentoccurance of e on the path

    ptimi&ation Df an expression is a$ailable2 it need not be recomputed At least2 if it is in a register somewhere

  • 8/9/2019 Class Data Flow Analysis

    16/44

    3ata "low "acts

    Ds expression e a$ailable4

    "acts#

    a b is a$ailablea + b is a$ailable

    a / is a$ailable

  • 8/9/2019 Class Data Flow Analysis

    17/44

    ;en and Eill

    ,hat is the effect of each

    x = a + b

    stmt gen kill

    y = a * b

    a = a + 1

    a + b

    a * b

    a + ba * b

    a + 1

    statement on the set of facts?

  • 8/9/2019 Class Data Flow Analysis

    18/44

    {a+b}

    {a + b,a * b}

    {a + b,a * b}

    {a + b}

    {a+b}{a+b}{a+

    b}

    Computing A$ailable 'xpressions

  • 8/9/2019 Class Data Flow Analysis

    19/44

    Terminology

    Ajoin pointis a program point where two branchesmeet

    A$ailable expressions is a forward, mstproblem

    !orward 3ata "low from in to out

    "st At Foint point2 property must hold on allpaths that are Foined.

  • 8/9/2019 Class Data Flow Analysis

    20/44

    3ata "low 'uations

    =et s be a statement

    succ(s) immediate successor statements of s0

    Pred(s) immediate predecessor statements of s0

    Dn(s) program point Fust before executing s

    ut(s) program point Fust after executing s

    Dn(s) s 2preds!ut(s1)

    ut(s) ;en(s) [(Dn(s) G Eill(s))

    Hote these are also called transfer functions

  • 8/9/2019 Class Data Flow Analysis

    21/44

    =i$eness Analysis

    A $ariable $ is live at a programpoint p if

    $ will be used on some execution pathoriginating from p before $ is o$erwritten

    ptimi&ation

    Df a $ariable is not li$e2 no need to !eep it

    in a register Df a $ariable is dead at assignment2 caneliminate assignment.

  • 8/9/2019 Class Data Flow Analysis

    22/44

    3ata "low 'uations

    A$ailable expressions is a forward must analysis

    3ata flow propagate in same direction as C"; edges

    'xpression is a$ailable if a$ailable on all paths

    =i$eness is a bac!ward may problem

    to !ow if $ariable is li$e2 need to loo! at future uses

    >ariable is li$e if a$ailable on some path

    Dn(s) ;en(s) [(ut(s) G Eill(s))

    ut(s) s 2succs!Dn(s1)

  • 8/9/2019 Class Data Flow Analysis

    23/44

    ;en and Eill

    ,hat is the effect of each

    x = a + b

    y = a * b

    a = a + 1

    stmt gen kill

    y > a

    a, b

    a, b

    a, y

    a

    x

    y

    a

    statement on the set of facts?

  • 8/9/2019 Class Data Flow Analysis

    24/44

    {x

    }

    Computing =i$e >ariables

  • 8/9/2019 Class Data Flow Analysis

    25/44

    {x, y, a}

    {x

    }

    Computing =i$e >ariables

  • 8/9/2019 Class Data Flow Analysis

    26/44

    {x, y, a}

    {x

    }

    {x, y, a}

    Computing =i$e >ariables

  • 8/9/2019 Class Data Flow Analysis

    27/44

    {x, y, a}

    {x

    }

    {x,y, a}

    {y, a,b}

    Computing =i$e >ariables

  • 8/9/2019 Class Data Flow Analysis

    28/44

    {x, y, a}

    {x}

    {x,y, a}

    {y, a,b}

    {y, a,

    b}

    Computing =i$e >ariables

  • 8/9/2019 Class Data Flow Analysis

    29/44

    {x,y, a,b}

    {x

    }

    {x,y, a}

    {y, a,b}

    {y, a,

    b}

    Computing =i$e >ariables

  • 8/9/2019 Class Data Flow Analysis

    30/44

    {x,y, a,b}

    {x

    }

    {x, y, a,b}

    {y, a,b}

    {y, a,

    b}

    Computing =i$e >ariables

  • 8/9/2019 Class Data Flow Analysis

    31/44

    {x, y, a,b}

    {x

    }

    {x, y, a,b}

    {y, a,b}

    {y, a,

    b}

    Computing =i$e >ariables

    {x, a, b}

  • 8/9/2019 Class Data Flow Analysis

    32/44

    {x, y, a,b}

    {x

    }

    {x, y, a,b}

    {y, a,b}

    {y, a,

    b}

    Computing =i$e >ariables

    {x, a, b}

    {a, b}

  • 8/9/2019 Class Data Flow Analysis

    33/44

    >ery

  • 8/9/2019 Class Data Flow Analysis

    34/44

    Code Ioisting

    Code hoisting finds expressions that are alwayse$aluated following some point in a program2regardless of the execution path and mo$es them tothe latest point beyond which they would always be

    e$aluated.

    Dt is a transformation that almost always reducesthe space occupied but that may affect its executiontime positi$ely or not at all.

  • 8/9/2019 Class Data Flow Analysis

    35/44

    Beaching 3efinitions

    A definition of a variable$ is an assignment to $

    A definition of $ariable $ reachespoint p if

    There is no inter$ening assignment to $

    Also called def#seinformation

    ,hat !ind of problem4

    "orward or bac!ward4 "orward

    ?ay or must4 may

  • 8/9/2019 Class Data Flow Analysis

    36/44

    Space of 3ata "low Analyses

    ?ost data flow analyses can be classified this way

    A few don1t fit# bidirectional

    =ots of literature on data flow analysis

    May Must

    Forward

    Backward

    eachi!g

    de"i!itio!s

    #$ailable

    expressio!s

    %i$e

    &ariables

    &ery busy

    expressio!s

  • 8/9/2019 Class Data Flow Analysis

    37/44

    3ata "low "acts and lattices

    Typically2 data flow facts form a lattice

    'xample2 A$ailable expressions

    top

    bottom

  • 8/9/2019 Class Data Flow Analysis

    38/44

    Partial rders

    Apartial orderis a pair (P2 ) such that

    P P

    is refle$ive# x x is anti#symmetric# x y and y x implies x y

    is transitive# x y and y & implies x &

  • 8/9/2019 Class Data Flow Analysis

    39/44

    =attices

    A partial order is a lattice if uandtare defined so that

    uis the meetor greatest lower boundoperation

    x uy x and x uy y Df & x and & y then & x uy

    tis theFoinor least upper boundoperation x x ty and y x ty Df x & and y &2 then x ty &

  • 8/9/2019 Class Data Flow Analysis

    40/44

    =attices (cont.)

    A finite partial order is a lattice if meet and Foin exist fore$ery pair of elements

    A lattice has uniue elements botand topsuch that

    x u? ? x t?x

    x u> x x t> >

    Dn a lattice

    x y iff x uy x

    x y iff x ty y

  • 8/9/2019 Class Data Flow Analysis

    41/44

    %seful =attices

    (6S2 ) forms a lattice for any set S. 6Sis the powerset of S (set of all subsets)

    Df (S2 ) is a lattice2 so is (S2)

    i.e.2 lattices can be flipped

    The lattice for constant propagation

    1 ' (

    >

    ?

  • 8/9/2019 Class Data Flow Analysis

    42/44

    "orward ?ust 3ata "low Algorithm

    ut(s) ;en(s) for all statements s

    , all statements0 (wor!list)

    Bepeat

    Ta!e s from ,

    Dn(s)

    s 2preds!ut(s1)

    Temp ;en(s) [(Dn(s) G Eill(s))

    Df (temp J ut (s))

    ut(s) temp

    , , [succ(s)

    0

    %ntil ,

  • 8/9/2019 Class Data Flow Analysis

    43/44

    ?onotonicity

    A function f on a partial order is monotonic if

    x y implies f(x) f(y)

    'asy to chec! that operations to compute Dn andut are monotonic

    Dn(s) s 2preds!ut(s1)

    Temp ;en(s) [(Dn(s) G Eill(s))

    Putting the two together

    Temp fs(s 2preds!ut(s1))

  • 8/9/2019 Class Data Flow Analysis

    44/44

    Termination

    ,e !now algorithm terminates because

    The lattice has finite height

    The operations to compute Dn and ut aremonotonic

    n e$ery iteration we remo$e a statement

    from the wor!list andKor mo$e down thelattice.