1
Control Flow Analysis
• Topic today
  o Representation and Analysis Paper (Sections 1, 2)
• For next class:
  o Read Representation and Analysis Paper (Section 3)
  o Do problems 1 and 2 from the Representation and Analysis Paper
2
Intermediate Representations
• Program analyses grew out of the compiler world, where they were used to help optimize code.
• Many optimizations depend on analyses of intermediate representations of software, such as:
  o Parse trees
  o Abstract syntax trees
  o Three-address code
3
Code Compilation and Analysis
Source program → Parsing, lexical analysis → Intermediate representation → Code generation, optimization → Target code → Code execution
• Analyze the intermediate representation, and perform additional analysis on the results
• Use this information for code optimization techniques
4
Tree Representations
• Representations
  o Parse trees represent concrete syntax
  o Abstract syntax trees represent abstract syntax
• Concrete versus abstract syntax
  o Concrete syntax shows structure and is language-specific
  o Abstract syntax shows structure without the language-specific detail
5
Example: Grammar
Example
1. a := b + c
2. a = b + c;
• Grammar for 1
  o stmtlist → stmt | stmt stmtlist
  o stmt → assign | if-then | …
  o assign → ident “:=” ident binop ident
  o binop → “+” | “-” | …
• Grammar for 2
  o stmtlist → stmt “;” | stmt “;” stmtlist
  o stmt → assign | if-then | …
  o assign → ident “=” ident binop ident
  o binop → “+” | “-” | …
7
Example: Parse Tree
Example
1. a := b + c
2. a = b + c;
Parse tree for 1
stmtlist
  stmt
    assign
      ident (a)
      “:=”
      ident (b)
      binop (“+”)
      ident (c)
Parse tree for 2
stmtlist
  stmt
    assign
      ident (a)
      “=”
      ident (b)
      binop (“+”)
      ident (c)
  “;”
8
Example: Abstract Syntax Tree
Example
1. a := b + c
2. a = b + c;
Abstract syntax tree for 1 and 2
assign
  a
  add
    b
    c
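A minimal Python sketch of the point above: two statements with different concrete syntax map to the same abstract syntax tree. The class names (Var, BinOp, Assign) and the helper parse_assignment are hypothetical, and the "parser" only handles the single statement shape used in this example.

```python
from dataclasses import dataclass

# Hypothetical AST node types for this example only.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class BinOp:
    op: str
    left: Var
    right: Var

@dataclass(frozen=True)
class Assign:
    target: Var
    value: BinOp

OP_NAMES = {"+": "add", "-": "sub"}

def parse_assignment(text):
    """Parse 'x := y + z' or 'x = y + z;' into an Assign node.

    Concrete details (':=' versus '=', the trailing ';') are discarded,
    which is why both spellings produce the same tree.
    """
    tokens = text.replace(";", " ").replace(":=", " = ").split()
    target, equals, left, op, right = tokens
    if equals != "=" or op not in OP_NAMES:
        raise ValueError(f"unsupported statement: {text!r}")
    return Assign(Var(target), BinOp(OP_NAMES[op], Var(left), Var(right)))

tree1 = parse_assignment("a := b + c")
tree2 = parse_assignment("a = b + c;")
print(tree1 == tree2)
```

Frozen dataclasses compare by value, so the equality check is a direct test that the two trees have the same structure.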
9
Three Address Code
• General form: x := y op z
• May include temporary variables (intermediate values)
• May reference arrays: a[t1]
• Specific forms (examples)
  o Assignment
    • Binary: x := y op z
    • Unary: x := op y
    • Copy: x := y
  o Jumps
    • Unconditional: goto (L)
    • Conditional: if x relational-op y goto (L)
  o …
10
Example: Three Address Code
Source code:
    if a > 10 then
        x = y + z
    else
        x = y - z
    …
Three address code:
    1. if a > 10 goto (4)
    2. x = y - z
    3. goto (5)
    4. x = y + z
    5. …
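The translation in this example follows a fixed pattern: a conditional jump to the then-part, the else-part, an unconditional jump past the then-part, and finally the then-part. A rough Python sketch of that pattern is shown here; the function name lower_if_else and its parameters are made up for illustration and do not come from the slides.

```python
def lower_if_else(cond, then_stmts, else_stmts, start=1):
    """Lay out an if-then-else as numbered three-address statements.

    Layout: conditional goto, else-part, goto past the then-part, then-part.
    """
    # The then-part starts after the conditional goto, the else-part,
    # and the unconditional goto that ends the else-part.
    then_first = start + 1 + len(else_stmts) + 1
    after = then_first + len(then_stmts)
    code = [f"if {cond} goto ({then_first})"]
    code += else_stmts
    code.append(f"goto ({after})")
    code += then_stmts
    return [f"{number}. {stmt}" for number, stmt in enumerate(code, start)]

for line in lower_if_else("a > 10", ["x = y + z"], ["x = y - z"]):
    print(line)
```

For the statement on this slide the sketch prints statements 1 through 4 of the three address code; statement 5 is whatever follows the if.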
11
Larger Example: 3 Address Code
(1) i := m-1
(2) j := n
(3) t1 := 4*n
(4) v := a[t1]
(5) i := i+1
(6) t2 := 4*i
(7) t3 := a[t2]
(8) if t3 < v goto (5)
(9) j := j-1
(10) t4 := 4*j
(11) t5 := a[t4]
(12) if t5 > v goto (9)
(13) if i >= j goto (23)
(14) t6 := 4*i
(15) x := a[t6]
(16) t7 := 4*i
(17) t8 := 4*j
(18) t9 := a[t8]
(19) a[t7] := t9
(20) t10 := 4*j
(21) a[t10] := x
(22) goto (5)
(23) t11 := 4*i
(24) x := a[t11]
(25) t12 := 4*i
(26) t13 := 4*n
(27) t14 := a[t13]
(28) a[t12] := t14
(29) t15 := 4*n
(30) a[t15] := x
12
Control Flow Graph
• One of the most basic program representations
• Nodes represent statements or basic blocks
• Edges represent flow of control between nodes
• To build a CFG:
  o Construct basic blocks
  o Join blocks together with labeled edges
13
Basic Blocks
• A basic block is a sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end without halt or possibility of branch except at the end
• A basic block may or may not be maximal
  o For compiler optimizations, maximal basic blocks are desirable
  o For software engineering tasks, basic blocks that represent one source code statement are often used
14
Computing Basic Blocks (algorithm)
Input: A sequence of procedure statements
Output: A list of basic blocks with each statement in exactly one block
Method:
1. Determine the set of leaders (the first statements of basic blocks) using the following rules:
  o The first statement in the procedure is a leader.
  o Any statement that is the target of a conditional or unconditional goto statement is a leader.
  o Any statement that immediately follows a conditional or unconditional goto statement is a leader.
2. Construct the basic blocks using the leaders. For each leader, its basic block consists of the leader and all statements up to but not including the next leader or the end of the procedure.
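The leader rules above can be sketched in a few lines of Python. This is an illustrative sketch, not a reference implementation: it assumes statements are plain strings numbered from 1, and that jump targets always appear in the form "goto (N)" as in the 30-statement example. The names CODE, find_leaders, and basic_blocks are hypothetical.

```python
import re

# The 30-statement three-address-code example; statement k is CODE[k - 1].
CODE = [
    "i := m-1", "j := n", "t1 := 4*n", "v := a[t1]",
    "i := i+1", "t2 := 4*i", "t3 := a[t2]", "if t3 < v goto (5)",
    "j := j-1", "t4 := 4*j", "t5 := a[t4]", "if t5 > v goto (9)",
    "if i >= j goto (23)",
    "t6 := 4*i", "x := a[t6]", "t7 := 4*i", "t8 := 4*j", "t9 := a[t8]",
    "a[t7] := t9", "t10 := 4*j", "a[t10] := x", "goto (5)",
    "t11 := 4*i", "x := a[t11]", "t12 := 4*i", "t13 := 4*n",
    "t14 := a[t13]", "a[t12] := t14", "t15 := 4*n", "a[t15] := x",
]

GOTO = re.compile(r"goto \((\d+)\)")

def find_leaders(code):
    """Return the sorted statement numbers that start a basic block."""
    if not code:
        return []
    leaders = {1}  # rule 1: the first statement is a leader
    for number, stmt in enumerate(code, start=1):
        match = GOTO.search(stmt)
        if match:
            leaders.add(int(match.group(1)))  # rule 2: target of a goto
            if number < len(code):
                leaders.add(number + 1)       # rule 3: statement after a goto
    return sorted(leaders)

def basic_blocks(code):
    """Return (first, last) statement numbers for each basic block."""
    leaders = find_leaders(code)
    # Each block ends just before the next leader, or at the last statement.
    lasts = [leader - 1 for leader in leaders[1:]] + [len(code)]
    return list(zip(leaders, lasts))

print(find_leaders(CODE))
print(basic_blocks(CODE))
```

On the example, this yields leaders 1, 5, 9, 13, 14, and 23, and therefore six blocks.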
15
Example: Compute Basic Blocks on this 3 Address Code
(Statements (1) through (30) of the larger three-address-code example.)
16
Example: Compute Basic Blocks on this 3 Address Code: leaders
Leaders: (1), (5), (9), (13), (14), (23)
• (1) is the first statement in the procedure
• (5), (9), and (23) are targets of goto statements
• (9), (13), (14), and (23) immediately follow goto statements
17
Example: Compute Basic Blocks on this 3 Address Code: blocks
• B1: statements (1) to (4)
• B2: statements (5) to (8)
• B3: statements (9) to (12)
• B4: statement (13)
• B5: statements (14) to (22)
• B6: statements (23) to (30)
18
Computing Control Flow Graph from Basic Blocks (algorithm)
Input: A list of basic blocks
Output: A list of control-flow graph (CFG) nodes and edges
Method:
1. Create entry and exit nodes; create edge (entry, B1); create edge (Bk, exit) for each Bk that represents an exit from the program.
2. Add a CFG edge from Bi to Bj if Bj can immediately follow Bi in some execution, i.e.,
  o There is a conditional or unconditional goto from the last statement of Bi to the first statement of Bj, or
  o Bj immediately follows Bi in the order of the program and Bi does not end in an unconditional goto statement.
3. Label edges that represent conditional transfers of control as “T” (true) or “F” (false); other edges are unlabeled.
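The edge rules above might be sketched in Python as follows. The sketch makes two assumptions that are not in the slides: each block is described by its first statement number, last statement number, and the text of its last statement, and the only exit from the program is falling off the end of the last block. BLOCKS and build_cfg are hypothetical names.

```python
import re

# Basic blocks of the 30-statement example, in program order:
# name -> (first statement, last statement, text of the last statement).
BLOCKS = {
    "B1": (1, 4, "v := a[t1]"),
    "B2": (5, 8, "if t3 < v goto (5)"),
    "B3": (9, 12, "if t5 > v goto (9)"),
    "B4": (13, 13, "if i >= j goto (23)"),
    "B5": (14, 22, "goto (5)"),
    "B6": (23, 30, "a[t15] := x"),
}

# Group 1 is present only for a conditional goto; group 2 is the target.
GOTO = re.compile(r"^(if .* )?goto \((\d+)\)$")

def build_cfg(blocks):
    """Return CFG edges as (source, destination, label) triples.

    The label is "T" or "F" for conditional transfers, None otherwise.
    """
    names = list(blocks)  # dict order is program order
    first_to_block = {first: name for name, (first, _, _) in blocks.items()}
    edges = [("entry", names[0], None)]
    for index, name in enumerate(names):
        match = GOTO.match(blocks[name][2])
        is_goto = match is not None
        is_conditional = is_goto and match.group(1) is not None
        if is_goto:
            target = first_to_block[int(match.group(2))]
            edges.append((name, target, "T" if is_conditional else None))
        if not is_goto or is_conditional:
            # Control can fall through to the next block (or off the end).
            successor = names[index + 1] if index + 1 < len(names) else "exit"
            edges.append((name, successor, "F" if is_conditional else None))
    return edges

for edge in build_cfg(BLOCKS):
    print(edge)
```

A jump target that is not the first statement of some block would raise a KeyError here, which is a reasonable signal that the blocks were not built from the leader rules.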
19
Example: Compute Control Flow Graph from Basic Blocks
Nodes: B1, B2, B3, B4, B5, B6 (the basic blocks of statements (1) through (30) of the larger three-address-code example)
20
Example: Compute Control Flow Graph from Basic Blocks
• Entry → B1
• B1 → B2
• B2 → B2 (T), B2 → B3 (F)
• B3 → B3 (T), B3 → B4 (F)
• B4 → B6 (T), B4 → B5 (F)
• B5 → B2
• B6 → Exit
21
Computing Control Flow from Source Code (example)
Procedure AVG
S1   count = 0
S2   fread(fptr, n)
S3   while (not EOF) do
S4     if (n < 0)
S5       return (error)
       else
S6       nums[count] = n
S7       count ++
       endif
S8     fread(fptr, n)
     endwhile
S9   avg = mean(nums,count)
S10  return(avg)
22
Computing Control Flow from Source Code (example)
Procedure AVG (statements S1 to S10), one CFG node per statement:
• entry → S1 → S2 → S3
• S3 → S4 (T), S3 → S9 (F)
• S4 → S5 (T), S4 → S6 (F)
• S5 → exit
• S6 → S7 → S8 → S3
• S9 → S10 → exit
23
Computing Control Flow from Source Code (maximal basic blocks)
Procedure AVG (statements S1 to S10), grouped into maximal basic blocks:
• Blocks: {S1, S2}, {S3}, {S4}, {S5}, {S6, S7, S8}, {S9, S10}
• entry → {S1, S2} → {S3}
• {S3} → {S4} (T), {S3} → {S9, S10} (F)
• {S4} → {S5} (T), {S4} → {S6, S7, S8} (F)
• {S5} → exit
• {S6, S7, S8} → {S3}
• {S9, S10} → exit
24
Computing Control Flow from Source Code (another example)
Procedure Trivial
S1   read (n)
S2   switch (n)
       case 1:
S3       write (“one”)
         break
       case 2:
S4       write (“two”)
       case 3:
S5       write (“three”)
         break
       default
S6       write (“Other”)
     endswitch
end Trivial
25
Computing Control Flow from Source Code (another example)
Procedure Trivial (statements S1 to S6), one CFG node per statement:
• entry → S1 → S2
• S2 → S3, S2 → S4, S2 → S5, S2 → S6
• S3 → exit
• S4 → S5 (case 2 has no break, so control falls through)
• S5 → exit
• S6 → exit
26
Computing Control Flow from Source Code (maximal basic blocks)
Procedure Trivial (statements S1 to S6), grouped into maximal basic blocks:
• Blocks: {S1, S2}, {S3}, {S4}, {S5}, {S6}
• entry → {S1, S2}
• {S1, S2} → {S3}, {S1, S2} → {S4}, {S1, S2} → {S5}, {S1, S2} → {S6}
• {S3} → exit
• {S4} → {S5}
• {S5} → exit
• {S6} → exit
27
Applications of Control Flow
• Program understanding:
  o In CFGs, program structure and flow are explicit
• Complexity:
  o Cyclomatic (McCabe’s)
  o Computed in several ways:
    • Edges - nodes + 2
    • Number of regions in CFG
    • Number of decision statements + 1 (if program is structured)
• Indicates number of test cases needed; indicates difficulty of maintaining
[Figure: example CFG with nodes 1 to 8]
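As a small worked example, the following Python sketch computes cyclomatic complexity for the 8-node example CFG in two of the ways listed above. The edge list is an assumption: it is reconstructed from the branch list and example paths given for this graph in the testing discussion, since the figure itself is not reproduced here. The function names are hypothetical.

```python
from collections import Counter

# Assumed edges of the 8-node example CFG (reconstructed, see lead-in).
EDGES = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5),
         (4, 8), (5, 6), (5, 7), (6, 7), (7, 4)]

def cyclomatic_complexity(edges):
    """Edges - nodes + 2, for a connected CFG with one entry and one exit."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2

def decisions_plus_one(edges):
    """Number of decision nodes + 1, assuming every decision is two-way."""
    out_degree = Counter(source for source, _ in edges)
    return sum(1 for degree in out_degree.values() if degree > 1) + 1

print(cyclomatic_complexity(EDGES))
print(decisions_plus_one(EDGES))
```

With 10 edges, 8 nodes, and decisions at nodes 1, 4, and 5, both computations give 4.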
28
Applications of Control Flow
• Testing: branch, path, basis path
  o Branch: must test 1→2, 1→3, 4→5, 4→8, 5→6, 5→7
  o Path: infinite number because of loop
  o Basis path: set of paths such that each path executes at least one edge that the others do not (cyclomatic complexity gives max necessary); example: 1,2,4,8; 1,3,4,5,6,7,4,8
[Figure: example CFG with nodes 1 to 8]
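To make the branch criterion concrete, this Python sketch checks which of the listed branch edges a set of test paths leaves untested. The edge set is an assumption reconstructed from the branch list and the two example paths; the names EDGES, BRANCH_EDGES, and uncovered_branches are hypothetical.

```python
# Assumed edges of the 8-node example CFG (reconstructed, see lead-in).
EDGES = {(1, 2), (1, 3), (2, 4), (3, 4), (4, 5),
         (4, 8), (5, 6), (5, 7), (6, 7), (7, 4)}

# Edges leaving the decision nodes 1, 4, and 5, as listed for branch testing.
BRANCH_EDGES = {(1, 2), (1, 3), (4, 5), (4, 8), (5, 6), (5, 7)}

def edges_of(path):
    """Return the set of edges traversed by a path given as a node list."""
    return set(zip(path, path[1:]))

def uncovered_branches(paths):
    """Return the branch edges that none of the given paths executes."""
    covered = set()
    for path in paths:
        steps = edges_of(path)
        if not steps <= EDGES:
            raise ValueError(f"path {path} uses an edge that is not in the CFG")
        covered |= steps
    return BRANCH_EDGES - covered

EXAMPLE_PATHS = [[1, 2, 4, 8], [1, 3, 4, 5, 6, 7, 4, 8]]
print(sorted(uncovered_branches(EXAMPLE_PATHS)))
```

Under the assumed edge set, the two example paths alone leave branch 5→7 untested, so they illustrate basis paths rather than form a complete set. Adding a path such as 1,2,4,5,7,4,8 would cover it.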
29
Applications of Control Flow
• Support for Other Analyses:
  o Dominator
  o Postdominator
  o Data dependence
  o Control dependence
  o Points-to
  o Regression test selection
[Figure: example CFG with nodes 1 to 8]
30
Terminology: Levels of Analysis
Local: within a single basic block or statement
Intraprocedural: within a single procedure, function, or method (sometimes intramethod)
Interprocedural: across procedure boundaries, procedure call, shared globals, etc.
Intraclass: within a single class
Interclass: across class boundaries
. . .
Exercise
• How would we represent interprocedural control flow; that is, control flow involving an entire program containing several procedures?
31
Step 1: Draw individual CFGs
Procedure Main
begin
  S1
  if (P1) then
    call A()
  else
    call B()
  endif
  call A()
  S2
end
Procedure A
begin
  S3
  call B()
  S4
end
Procedure B
begin
  S5
end
32
Step 2: Connect CFGs (how?)
33