Automatically Finding Patches Using Genetic Programming (published in ICSE’09)

Westley Weimer, Thanh Vu Nguyen, Claire Le Goues, and Stephanie Forrest

Presenter: Jihun Park

SELab

2014.01.10

LAB Seminar

Introduction

• Software bug

– Maintenance takes the majority of total software dev. cost.

– Fixing bugs is an inevitable, difficult, and tedious manual process.

• Automatic program repair

– Complicated input: difficult formal specification, program annotations, special coding practice

– Drawbacks: harmful repairs, restricted property, just narrowing down to a few lines

• GOAL: Suggesting an automatic patch generation technique using Genetic Programming

– With simple inputs,

– Not sacrificing required functionality,

– Generating a concrete patch

Outline

• Introduction

• Motivating Example

• Background

• Approach Overview

• Genetic Programming (GP) for Program Repair

• Experiments

• Conclusion

• Discussion

Motivating Example

 1 /* requires: a >= 0, b >= 0 */
 2 void gcd(int a, int b) {
 3   if (a == 0) {
 4     printf("%d", b);
 5   }
 6   while (b != 0)
 7     if (a > b)
 8       a = a - b;
 9     else
10       b = b - a;
11   printf("%d", a);
12   exit(0);
13 }

With positive and negative test cases,

Positive test case: gcd(1071, 1029) = 21

Negative test case: gcd(0, 55) => infinite loop

Motivating Example

• Locate suspicious code.

• Insert/remove/replace statements with existing ones to fix the negative TC.

Motivating Example

 1 /* requires: a >= 0, b >= 0 */
 2 void gcd(int a, int b) {
 3   if (a == 0) {
 4     printf("%d", b);
 5     exit(0);
 6     a = a - b;
 7   }
 8   while (b != 0)
 9     if (a > b)
10       a = a - b;
11     else
12       b = b - a;
13   printf("%d", a);
14   exit(0);
15 }

Positive test case: gcd(1071, 1029) = 21

Negative test case: gcd(0, 55) = 55 (was an infinite loop before the change)

Motivating Example

If we find a fix, minimize it by deleting extra statements.

 1 /* requires: a >= 0, b >= 0 */
 2 void gcd(int a, int b) {
 3   if (a == 0) {
 4     printf("%d", b);
 5     exit(0);
 6   }
 7   while (b != 0)
 8     if (a > b)
 9       a = a - b;
10     else
11       b = b - a;
12   printf("%d", a);
13   exit(0);
14 }

Positive test case: gcd(1071, 1029) = 21

Negative test case: gcd(0, 55) = 55
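The two test cases can be checked mechanically. The sketch below is a hypothetical Python port of the gcd listing, not the authors' harness: `run_gcd`, the `patched` flag, and the `max_steps` budget (a stand-in for a test timeout) are all names assumed for illustration.

```python
def run_gcd(a, b, patched, max_steps=10_000):
    """Python port of the gcd listing; returns printed values, or None on timeout."""
    printed = []
    if a == 0:
        printed.append(b)
        if patched:
            return printed  # corresponds to the inserted exit(0)
    steps = 0
    while b != 0:
        steps += 1
        if steps > max_steps:
            return None  # this harness treats the run as an infinite loop
        if a > b:
            a = a - b
        else:
            b = b - a
    printed.append(a)
    return printed


# Test cases from the slides: (arguments, expected printed output).
positive = ((1071, 1029), [21])
negative = ((0, 55), [55])


def passes(test, patched):
    args, expected = test
    return run_gcd(*args, patched=patched) == expected


# (positive passes?, negative passes?) for each program version.
results = {
    "original": (passes(positive, False), passes(negative, False)),
    "patched": (passes(positive, True), passes(negative, True)),
}
```

Under these assumptions the original program passes only the positive test case, while the patched program passes both.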

Background

• Genetic programming

– Applying a Genetic Algorithm (GA) to a computer program

• Representation: Represent a computer program as an individual (chromosome), e.g. Program → 0 1 0 1 1 0 0 1.

• Crossover: Crossover two individuals to make a new child.

• Mutation: Change individuals with a mutation operator.

• Selection: Select the next generation by assessing individuals using a fitness function (individuals that work better than others).

[Figure: bit-string chromosomes such as 0 1 0 1 1 0 0 1 illustrating each step]

Approach Overview

• Representation: Represent a computer program as an individual (chromosome), e.g. Program → {stmt1,w1} {stmt2,w2} …

• Crossover: Crossover two individuals to make a new child.

• Mutation: Change individuals with a mutation operator.

• Selection: Select the next generation by assessing individuals using a fitness function (individuals that pass many test cases).

• Minimization: Minimize the final solution by removing extra statements.

[Figure: chromosomes drawn as weighted statement lists, e.g. {stmt1,w1} {stmt2,w2} … and {stmt1,w1} {stmt5,w2} …]

Program Representation

• Represent program statements as AST nodes (using CIL*).

[Figure: Original AST]

Statement sequence
  if
    Compare op: == (a, 0)
    List<Stmt>
      Method invocation: printf("%d", b)
  while
    …

[Figure: High-level AST]

Statement sequence
  if (a == 0)
    printf("%d", b)
  while (b != 0)
    if (a > b)
      a = a - b;
      b = b - a;
  printf("%d", a)
  exit(0);

*: “Cil: An infrastructure for C program analysis and transformation”, G. C. Necula, S. McPeak, S. P. Rahul, and W. Weimer, International Conference on Compiler Construction (CC), 2002



• Pairing each statement with a weight.

Node representation

[3] if (a == 0) [4] printf("%d", b)

[6] while (b != 0)

[7] if (a > b) [8] a = a - b; [10] b = b - a;

[11] printf("%d", a) [12] exit(0);

Chromosome representation: a list of $\langle stmt_i, prob_i \rangle$

{{[3]},0.1} {{[4]},1.0} {{[6]},0.1} {{[7]},0.1} {{[8]},0} {{[10]},0.1} {{[11]},0} {{[12]},0}

Weighting policy

– Visited by only negative TCs: 1.0

– Visited by only positive TCs: 0.0

– Visited by both TCs: 0.1
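The weighting policy above can be reproduced from statement coverage. This is a minimal sketch, assuming coverage is given as sets of line numbers; the two sets below were obtained by hand-tracing the original gcd listing on gcd(0, 55) and gcd(1071, 1029), and `statement_weights` is a name chosen for illustration.

```python
def statement_weights(neg_cov, pos_cov):
    """Map each visited statement to its weight under the slide's policy."""
    weights = {}
    for stmt in sorted(neg_cov | pos_cov):
        if stmt in neg_cov and stmt in pos_cov:
            weights[stmt] = 0.1  # visited by both kinds of test case
        elif stmt in neg_cov:
            weights[stmt] = 1.0  # visited only by negative test cases
        else:
            weights[stmt] = 0.0  # visited only by positive test cases
    return weights


# Hand-traced coverage (line numbers of the original listing).
neg_cov = {3, 4, 6, 7, 10}           # gcd(0, 55): loops forever through 6, 7, 10
pos_cov = {3, 6, 7, 8, 10, 11, 12}   # gcd(1071, 1029)

weights = statement_weights(neg_cov, pos_cov)
```

The resulting weights match the chromosome shown on the slide: only statement [4] gets weight 1.0, so mutation concentrates there.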

Mutation

• Mutation algorithm

– For a $stmt_i$, mutation is applied if the following condition is satisfied:

$rand(0,1) \le prob_i \land rand(0,1) \le W_{mut}$

• Three mutation operators

– Insert: insert a statement $stmt_j$ after $stmt_i$: $Path[i] \leftarrow \langle \{stmt_i; stmt_j\}, prob_i \rangle$

– Swap: insert a statement $stmt_j$ instead of $stmt_i$: $Path[i] \leftarrow \langle stmt_j, prob_i \rangle$

– Delete: delete $stmt_i$: $Path[i] \leftarrow \langle \{\}, prob_i \rangle$

* NOTE: $prob_i$ is not changed.
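The mutation step might be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: a chromosome is modelled as a list of `(statements, prob)` pairs, the operator is picked uniformly at random, and `stmt_j` is drawn uniformly from the statements already in the chromosome. The function name `mutate` and the `w_mut` value in the example are arbitrary.

```python
import random


def mutate(path, w_mut, rng):
    """Return a mutated copy of path, a list of (statements, prob) pairs."""
    # Candidate stmt_j values: statements that already exist in the program.
    pool = [s for stmts, _ in path for s in stmts]
    result = []
    for stmts, prob in path:
        if pool and rng.random() <= prob and rng.random() <= w_mut:
            other = rng.choice(pool)
            op = rng.choice(("insert", "swap", "delete"))
            if op == "insert":
                stmts = stmts + [other]   # {stmt_i; stmt_j}
            elif op == "swap":
                stmts = [other]           # stmt_j instead of stmt_i
            else:
                stmts = []                # {}
        result.append((stmts, prob))      # prob_i is never changed
    return result


path = [
    (["if (a == 0)"], 0.1),
    (['printf("%d", b);'], 1.0),
    (["a = a - b;"], 0.0),
]
mutated = mutate(path, w_mut=1.0, rng=random.Random(0))
```

Statements with weight 0.0 are effectively never mutated, which is how the weights restrict the search to code on the failing path.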

Crossover

• Randomly select a cutoff point, then combine from both parents.

Parents:

{stmt1,w1} {stmt2,w2} {stmt3,w3} {stmt4,w4} {stmt5,w5} {stmt6,w6}

{stmt1’,w1} {stmt2’,w2} {stmt3’,w3} {stmt4’,w4} {stmt5’,w5} {stmt6’,w6}

Randomly selected cutoff point (between positions 1 and 2 in this example)

Children (next generation):

{stmt1,w1} {stmt2’,w2} {stmt3’,w3} {stmt4’,w4} {stmt5’,w5} {stmt6’,w6}

{stmt1’,w1} {stmt2,w2} {stmt3,w3} {stmt4,w4} {stmt5,w5} {stmt6,w6}

|Parent generation| = pop_size/2 → |Parents + Children| = pop_size → selection process → |Next generation| = pop_size/2
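One-point crossover as drawn on this slide can be sketched in a few lines. The function name `crossover` and the tuple encoding of `{stmt, w}` pairs are assumptions for illustration. In a real run the cutoff would be drawn at random, for example with `random.randrange(1, len(parent_a))`; it is fixed to 1 here to reproduce the slide.

```python
def crossover(parent_a, parent_b, cutoff):
    """One-point crossover: swap the tails of two equal-length chromosomes."""
    child_1 = parent_a[:cutoff] + parent_b[cutoff:]
    child_2 = parent_b[:cutoff] + parent_a[cutoff:]
    return child_1, child_2


# Chromosomes as lists of (statement, weight) pairs; both parents share weights.
parent_a = [(f"stmt{i}", f"w{i}") for i in range(1, 7)]
parent_b = [(f"stmt{i}'", f"w{i}") for i in range(1, 7)]

child_1, child_2 = crossover(parent_a, parent_b, cutoff=1)
```

Because both parents carry the same weight at each position, the children also keep w1 to w6 in order, as on the slide.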

Selection

• The fitness function is used by the selection process to assess each chromosome.

• The fitness function encodes software requirements.

– The positive test cases: necessary functionality that cannot be sacrificed.

– The negative test cases: the fault to be repaired.

– A chromosome that cannot be compiled: zero fitness score.

$fitness(P) = W_{PosT} \times |\{t \in PosT \mid P \text{ passes } t\}| + W_{NegT} \times |\{t \in NegT \mid P \text{ passes } t\}|$

$W_{PosT}$: Weight for positive test cases

$W_{NegT}$: Weight for negative test cases
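The fitness formula translates directly into code. In this sketch `passes` is a hypothetical predicate that says whether the variant passes a given test, and the test labels are made up. The weights 1 and 10 are the values reported in the experimental setup.

```python
def fitness(passes, pos_tests, neg_tests, w_pos, w_neg, compiles=True):
    """Weighted count of passed positive and negative test cases."""
    if not compiles:
        return 0  # a chromosome that cannot be compiled scores zero
    pos_passed = sum(1 for t in pos_tests if passes(t))
    neg_passed = sum(1 for t in neg_tests if passes(t))
    return w_pos * pos_passed + w_neg * neg_passed


# Hypothetical variant: passes two of three positive tests and the negative one.
passing = {"pos1", "pos2", "neg1"}
score = fitness(
    passes=lambda t: t in passing,
    pos_tests=["pos1", "pos2", "pos3"],
    neg_tests=["neg1"],
    w_pos=1,
    w_neg=10,
)
```

Here the score is 1 × 2 + 10 × 1 = 12, so with these weights fixing the fault counts for more than any single positive test.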

Selection (cont’d)

• The selection process determines the next generation.

• Stochastic Universal Sampling (SUS) is used.

– The probability of selection is proportional to relative fitness in the population.

[Figure: A Stochastic Universal Sampling (SUS) example. Individuals A to G occupy segments of a line from 0 to F (total fitness = F) in proportion to their fitness; N pointers are spaced F/N apart, with the start point ∈ [0, F/N).]

Termination criterion: A chromosome passes all test cases
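A sketch of Stochastic Universal Sampling, assuming non-negative fitness values with a positive total. The function name `sus_select` is chosen for illustration, and the start point is passed in explicitly so the example is reproducible; in practice it would be drawn as `random.random() * (total / n)`.

```python
def sus_select(population, fitnesses, n, start):
    """Select n individuals with evenly spaced pointers over cumulative fitness."""
    total = sum(fitnesses)
    if n <= 0 or total <= 0:
        raise ValueError("need n > 0 and positive total fitness")
    step = total / n
    if not 0 <= start < step:
        raise ValueError("start must lie in [0, F/N)")
    selected = []
    idx = 0
    upper = fitnesses[0]  # right edge of the current individual's segment
    for i in range(n):
        pointer = start + i * step
        # Advance to the segment containing the pointer (guard against rounding).
        while pointer >= upper and idx < len(fitnesses) - 1:
            idx += 1
            upper += fitnesses[idx]
        selected.append(population[idx])
    return selected


# Total fitness F = 8, N = 4 pointers spaced F/N = 2 apart, starting at 1.
chosen = sus_select(["A", "B", "C", "D"], [4, 2, 1, 1], n=4, start=1.0)
```

With pointers at 1, 3, 5 and 7, individual A (fitness 4) is picked twice, B once, C not at all, and D once.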

Repair Minimization

• Using a tree differencing algorithm and delta debugging, minimize the final result.

[Figure: the original program is compared with the final result that passes all TCs; the difference is a list of changes (Removed, Added, Added, …, marked X, O, O in the slide), and the goal is to find the minimal subset of that difference.]

Delta debugging: finding the minimum difference between two test cases, one of which fails and the other passes.
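The minimization idea can be illustrated with a greedy reduction that drops one change at a time and keeps the drop if all test cases still pass. This is a deliberate simplification, not the full delta debugging algorithm, which tries larger chunks first. The names `minimize` and `passes_all_tests` and the change descriptions are hypothetical.

```python
def minimize(changes, passes_all_tests):
    """Greedily remove changes that are not needed for the tests to pass."""
    kept = list(changes)
    shrunk = True
    while shrunk:
        shrunk = False
        for change in list(kept):
            candidate = [c for c in kept if c != change]
            if passes_all_tests(candidate):
                kept = candidate  # the change was unnecessary
                shrunk = True
    return kept


# Difference between the original gcd and the first repair found.
changes = ["insert exit(0); after line 4", "insert a = a - b; after line 4"]


def passes_all_tests(subset):
    # Stand-in oracle: only the inserted exit(0) is needed to pass both tests.
    return "insert exit(0); after line 4" in subset


minimal = minimize(changes, passes_all_tests)
```

In this example the extra `a = a - b;` insertion is discarded, which mirrors the minimized patch in the motivating example.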

Experimental Setup

• Goal of experiment

1. Evaluate performance and scalability

2. Measure run-time cost

3. Evaluate the success rate

4. Understand how test cases affect repair quality

• Test cases

– 1 fault-revealing (negative) test case

– A small number (2-6) of positive test cases

  • Non-crashing fuzz inputs (randomly generated)

  • Manually created simple positive test cases

Experimental Setup (Cont’d)

• Parameters

– pop_size: 40

– maximum of ten generations

– $W_{PosT} = 1$ and $W_{NegT} = 10$

• 10 subject programs

Experimental Results

• 54% of the time is spent executing test cases, and 30% is spent compiling program variants.

• On average, 5.5 insertions, deletions, and swaps are applied to a variant between generations.

• The average initial repair was evolved using 3.5 crossovers and 1.8 mutations over 6.0 generations.

• All of the repairs (1) compile, (2) fix the defect, and (3) avoid compromising required functionality in the positive test cases provided.

Related work

• W. Weimer et al.: automatic repair with specifications.

– Requires formal specifications, which are rarely available.

– Repairs may sacrifice other required functionality.

– Only repairs single-threaded violations of temporal safety.

• T. Ball et al., S. Chaki et al., and A. Groce et al.: trace localization, minimization, and explanation.

– Narrow down a large counterexample backtrace to a few lines.

– Only deal with faults found by static analysis.

• Arcuri: repairing software bugs automatically using GP

– Needs a formal specification as the oracle.

– No evaluation on real bugs and real software.

Related work

• A Systematic Study of Automated Program Repair: Fixing 55 out of 105 Bugs for $8 Each, Claire Le Goues, Michael Dewey-Vogt, Stephanie Forrest, and Westley Weimer (ICSE’12)

– Uses a similar technique based on Genetic Programming

– Runs on the Amazon cloud service

Conclusion

• Presenting a fully automated technique for repairing bugs using Genetic Programming (GP).

• Suggesting a novel program representation and genetic operators for GP.

• Suggesting a patch minimization approach using delta debugging and a tree differencing algorithm.

Discussion

• Pros

– Positive test cases are much easier to obtain than formal specifications or code annotations.

– The weighted representation is what makes the approach feasible, particularly in terms of scalability.

• Cons

– Assumes that the defect is reproducible and the negative test case is deterministic.

– Assumes that the path taken by the negative TC differs from the path taken by the positive TCs.

– Assumes that the repair can be constructed from statements already extant in the program.

Thank you for listening

• A delta debugging example
