Transcript of Slides from my oral defense
Coevolutionary Automated Software Correction: A Proof of Concept
Master's Oral Defense
September 8, 2008
Josh Wilkerson
Committee:
Dr. Daniel Tauritz – Chair
Dr. Bruce McMillin
Dr. Thomas Weigert
Page 2: Motivation
In 2002 the National Institute of Standards and Technology (NIST) stated [9]:
– Software errors cost the U.S. economy $59.5 billion a year
– Approximately 0.6% of gross domestic product
– 30% of these costs could be removed by earlier, more effective software defect detection and an improved testing infrastructure
Page 3: Problem Statement
Software debugging:
– Test the software
– Locate the identified errors
– Correct the errors
A time-consuming yet critical process
Many publications address automating the testing process
None fully automates both the testing and correction phases
Page 4: The System Envisioned
Page 5: Most Related Work
Paolo Tonella [14] and Stefan Wappler [6,15,16]
– Unit testing of object-oriented software
– Used evolutionary methods
– Focused only on testing; did not address correction
Timo Mantere [7,8]
– Two-population testing system using genetic algorithms
– Optimized program parameters through evolution
– The more control the EA has over the program, the better the results
Page 6: Technical Background
Christopher Rosin [10,11] and John Cartlidge [1]
– Extensive analysis of coevolution
– Outlined many potential problems that can occur during coevolution
Koza [2,3,4,5]
– Popularized genetic programming in the 1990s
– Father of modern genetic programming
Page 7: CASC Evolutionary Model
Page 8: CASC Evolutionary Model
Page 9: Parsing in the CASC System
The program population is based on the program to be corrected (seed program)
Page 10: Parsing in the CASC System: Step 1
The ANTLR system is used to create parsing tools (only done once for each language)
The parser created is based on a provided grammar (C++)
The resulting parser is dependent on the ANTLR libraries
Page 11: Parsing in the CASC System: Step 2
The system reads in the source code for the program to correct
The code to evolve is extracted in preprocessing
Page 12: Parsing in the CASC System: Step 3
The preprocessed source code to evolve is provided to the parsing tools
Page 13: Parsing in the CASC System: Step 4
The parsing tools produce the Abstract Syntax Tree (AST) for the evolvable code
The AST produced is heavily dependent on the ANTLR libraries
These dependencies incur unnecessary computational cost
Page 14: Parsing in the CASC System: Step 5
The ANTLR AST is provided to the CASC AST translator
The AST translator removes the ANTLR dependencies from the AST
The result is a lightweight version of the AST
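A minimal sketch of what such a translation might look like is given below. LibNode, LiteNode, and translate are all hypothetical names, assuming the parser-produced tree exposes a node type, token text, and a child list; this is an illustration, not the actual ANTLR API or the CASC code.

#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a node of the library-backed AST produced
// by the ANTLR-generated parser (assumed interface for illustration).
struct LibNode {
    virtual ~LibNode() = default;
    virtual int nodeType() const = 0;
    virtual std::string tokenText() const = 0;
    virtual std::vector<const LibNode*> children() const = 0;
};

// Lightweight AST node: plain data, no parser-library dependencies.
struct LiteNode {
    int type;                                       // grammar node type
    std::string text;                               // token text
    std::vector<std::unique_ptr<LiteNode>> children;
};

// Recursively copy the library AST into the lightweight representation,
// keeping only the structure the evolutionary operators need.
std::unique_ptr<LiteNode> translate(const LibNode& src) {
    auto dst = std::make_unique<LiteNode>();
    dst->type = src.nodeType();
    dst->text = src.tokenText();
    for (const LibNode* child : src.children())
        dst->children.push_back(translate(*child));
    return dst;
}

Once translated, trees can be copied and modified without linking against the parser runtime.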
Page 15: Parsing in the CASC System: Step 6
The lightweight AST is provided to the CASC coevolutionary system
Copies of the AST are randomly modified (the initial variation phase)
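As an illustration of this initial variation, the sketch below randomly modifies one node per AST copy; the collectNodes helper, the choice of modification, and the reuse of LiteNode from the earlier sketch are assumptions for illustration.

#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Gather pointers to every node in a tree (LiteNode is the lightweight
// node type from the sketch above).
void collectNodes(LiteNode& node, std::vector<LiteNode*>& out) {
    out.push_back(&node);
    for (auto& child : node.children)
        collectNodes(*child, out);
}

// Randomly modify one node of an AST copy to seed initial variation.
// The modification shown (reversing a node's children) is a stand-in
// for the grammar-aware changes the real system would apply.
void randomlyVary(LiteNode& root, std::mt19937& rng) {
    std::vector<LiteNode*> nodes;
    collectNodes(root, nodes);
    std::uniform_int_distribution<std::size_t> pick(0, nodes.size() - 1);
    LiteNode* target = nodes[pick(rng)];
    std::reverse(target->children.begin(), target->children.end());
}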
Page 16: CASC Evolutionary Model
Page 17: CASC Evolutionary Model
Page 18: CASC Evolutionary Model
Page 19: CASC Evolutionary Model
Reproduction
– Parents selected using tournament selection
– Uniform crossover with bias
– For programs, the child subtrees of the root were used for crossover
Mutation
– Each offspring has a chance to mutate
– Only specific nodes are considered for program mutation
– Genes to be mutated are altered based on a Gaussian distribution
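As a concrete illustration of the mutation step, the following sketch applies a Gaussian perturbation to a single numeric gene. The function name, the per-offspring rate check, and the scaling of the standard deviation by the mutative proportion are assumptions, not the CASC implementation.

#include <cmath>
#include <random>

// Hypothetical Gaussian mutation of one numeric gene. `rate` is the
// per-offspring mutation probability (Mutation Rate) and `proportion`
// scales the size of the change (Mutative Proportion).
double mutateGene(double gene, double rate, double proportion,
                  std::mt19937& rng) {
    std::uniform_real_distribution<double> coin(0.0, 1.0);
    if (coin(rng) >= rate)
        return gene;  // this offspring is not mutated
    // Perturbation drawn from a Gaussian centered on the current value,
    // with a standard deviation that grows with the mutative proportion
    // (assumed scaling).
    std::normal_distribution<double> noise(0.0, proportion * (std::abs(gene) + 1.0));
    return gene + noise(rng);
}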
Page 20: CASC Evolutionary Model
Page 21: CASC Evolutionary Model
Page 22: CASC Evolutionary Model: Fitness Evaluation
For each individual:
– Randomly select a set of (unique) opponents
– Check a hash table to retrieve results of repeat pairings
– Execute the program with the test case as input for each new pairing
– Apply the fitness function to the program output and store the fitness for the trial
– Set the individual's fitness to the average fitness across all trials
Program compilation is performed as needed
Program errors/time-outs result in an arbitrarily low fitness
This is done in parallel, using the NIC-Cluster and MPI
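The sketch below illustrates the hash-table memoization of repeat pairings described above; the key layout, the integer IDs, and the function names are hypothetical, assuming each program-test pairing can be identified by a pair of IDs.

#include <cstdint>
#include <unordered_map>

// Hypothetical cache of pairing results: a (program ID, test case ID)
// pair is packed into one 64-bit key so that repeat pairings reuse the
// stored fitness instead of re-running the program.
using PairKey = std::uint64_t;

PairKey makeKey(std::uint32_t programId, std::uint32_t testId) {
    return (static_cast<PairKey>(programId) << 32) | testId;
}

// Placeholder for compiling (if needed) and executing the program with
// the test case as input, then applying the fitness function.
double runPairing(std::uint32_t programId, std::uint32_t testId) {
    // ... compile, execute, score output; arbitrarily low fitness on
    // error or time-out ...
    return 0.0;  // placeholder value
}

std::unordered_map<PairKey, double> resultCache;

double evaluatePairing(std::uint32_t programId, std::uint32_t testId) {
    const PairKey key = makeKey(programId, testId);
    auto hit = resultCache.find(key);
    if (hit != resultCache.end())
        return hit->second;  // repeat pairing: retrieved from the table
    const double fitness = runPairing(programId, testId);
    resultCache.emplace(key, fitness);
    return fitness;
}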
Page 23: CASC Evolutionary Model
Page 24: CASC Evolutionary Model
Page 25: CASC Evolutionary Model
Page 26: Experimental Setup
Proof of concept
Correction of an insertion sort implementation
Test case: unsorted data array
Page 27: Experimental Setup
Fitness function
Scoring method (a sketch follows this list)
For each element x in the output data array:
– For each element a before x in the array, decrement score if x < a, increment score otherwise
– For each element b after x in the array, decrement score if x > b, increment score otherwise
Normalized to fall between 0 and 1
-1 assigned to programs with errors/time-outs
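A minimal C++ sketch of this scoring method is given below. The function name is hypothetical, and the normalization (mapping the raw score range [-N, N], with N = SIZE·(SIZE-1) comparisons, onto [0, 1]) is an assumed reading of the slide, not necessarily the system's exact formula.

#include <cstddef>
#include <vector>

// Score an output array per the slide's method: each element is
// compared with every element before and after it, +1 for each pair in
// sorted order and -1 otherwise, then normalized to [0, 1].
double sortednessScore(const std::vector<int>& data) {
    const std::size_t n = data.size();
    long score = 0;
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < i; ++j)      // elements before x
            score += (data[i] < data[j]) ? -1 : 1;
        for (std::size_t j = i + 1; j < n; ++j)  // elements after x
            score += (data[i] > data[j]) ? -1 : 1;
    }
    // Raw score lies in [-N, N] with N = n*(n-1); map onto [0, 1]
    // (assumed normalization; a fully sorted array scores 1.0).
    const double N = static_cast<double>(n) * (n - 1);
    return N == 0 ? 1.0 : (score + N) / (2.0 * N);
}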
Page 28: Experimental Setup
Four seed programs used
– Each has one common error and one unique error (of varying severity)
Four different configurations used
– Mutation Rate: Likelihood of an offspring being mutated
– Mutative Proportion: Amount of change mutation incurs

                      Config 0   Config 1   Config 2   Config 3
Mutation Rate         Moderate   High       Moderate   High
Mutative Proportion   Moderate   Moderate   High       High
Page 29: Results
A total of 16 experiments per full run
High computational complexity and limited computing resources
Five full runs were completed, totaling 80 experiments
Page 30: Summary of Results

Seed Program : Config     Best (Std. Dev.)   Average (Std. Dev.)
A : Base                  0.526 (0.262)      0.163 (0.157)
A : Enhanced Rate         0.557 (0.283)      0.170 (0.166)
A : Enhanced Proportion   0.537 (0.226)      0.196 (0.133)
A : Enhanced Both         0.559 (0.255)      0.175 (0.153)
B : Base                  0.965 (0.353)      0.275 (0.374)
B : Enhanced Rate         0.975 (0.357)      0.276 (0.370)
B : Enhanced Proportion   0.950 (0.432)      0.587 (0.458)
B : Enhanced Both         0.959 (0.434)      0.415 (0.463)
C : Base                  0.707 (0.224)      0.372 (0.196)
C : Enhanced Rate         0.717 (0.224)      0.366 (0.179)
C : Enhanced Proportion   0.716 (0.217)      0.369 (0.172)
C : Enhanced Both         0.717 (0.224)      0.377 (0.181)
D : Base                  1.0 (0.282)        -0.484 (0.535)
D : Enhanced Rate         1.0 (0.948)        -0.568 (0.572)
D : Enhanced Proportion   1.0 (0.946)        -0.554 (0.587)
D : Enhanced Both         1.0 (0.946)        -0.601 (0.604)

In run three, both the program A and program B experiments found a solution in the initial population (these runs are omitted from the table)
16 of the 80 experiments (20%) reported success
Page 31: Summary of Results (same table as Page 30)
75% of the experiments reported a best fitness above 0.7
Page 32: Summary of Results (same table as Page 30)
There was high variation in the experiment endpoints
There is a large number of possible solutions for each seed program
Page 33: Summary of Results (same table as Page 30)
The seed program D experiments were the toughest for the system
The seeded error resulted in either a 0 or -1 fitness
Experiments were either hit or miss
Page 34: Discussion of False Positives
A number of the programs returned by successful experiments still contain an error
For example, this is the evolvable section from a solution:
for(m=0; m-1 < SIZE-1; m=m+1)
{
for(n=m+1; n>0 && data[n] < data[n-1]; n=n-1)
Swap(data[n], data[n-1]);
}
When m is SIZE-1, n is initialized to SIZE (an invalid array index)
Tough to catch
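For contrast, here is a minimal sketch of one possible fix, assuming the intended behavior is a standard insertion sort; this corrected version is illustrative, not one produced by the system:

// Hypothetical corrected evolvable section: the outer loop now stops at
// m = SIZE-2, so the inner loop starts at n = m+1 <= SIZE-1, which is
// always a valid index into data.
for(m=0; m < SIZE-1; m=m+1)
{
for(n=m+1; n>0 && data[n] < data[n-1]; n=n-1)
Swap(data[n], data[n-1]);
}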
Page 35: Conclusion
The goal: demonstrate a proof-of-concept coevolutionary system for integrated automated software testing and correction
A prototype Coevolutionary Automated Software Correction (CASC) system was introduced
80 experiments were conducted
16 successes, with 75% of best-of-experiment fitnesses above 0.7 (out of 1.0)
These experiments indicate the validity of the CASC system concept
Further work is required to determine scalability
An article on this work has been submitted to IEEE TSE
Page 36: Work in Progress and Future Work
Evolve the complete parse tree
– Preliminary results using a GP evolutionary model are favorable
Cut down on run-times
– Add symmetric multiprocessing (server-client) functionality
– More efficient compilation
– Acquire additional computing resources (e.g., NSF TeraGrid)
Investigate the potential benefits of co-optimization [12,13]
Page 37: Work in Progress and Future Work
Implement adaptive parameter control
Investigate options for detecting errors like false positives
Parameter sensitivity analysis
Page 38: References
[1] J. P. Cartlidge. Rules of Engagement: Competitive Coevolutionary Dynamics in Computational Systems. PhD thesis, University of Leeds, 2004.
[2] J. R. Koza. Genetic Programming: On the Programming of Computers by the Means of Natural Selection. MIT Press, Cambridge MA, 1992.
[3] J. R. Koza. Genetic Programming II: Automatic Discovery of Reusable Programs. MIT Press, Cambridge MA, 1994.
[4] J. R. Koza. Genetic Programming III: Darwinian Invention and Problem Solving. Morgan Kaufmann, 1999.
[5] J. R. Koza. Genetic Programming IV: Routine Human-Competitive Machine Intelligence. Kluwer Academic Publishers, 2003.
[6] F. Lammermann and S. Wappler. Benefits of software measures for evolutionary white-box testing. In Proceedings of GECCO 2005 - the Genetic and Evolutionary Computation Conference, pages 1083–1084, Washington DC, 2005. ACM, ACM Press.
Page 39: References
[7] T. Mantere and J. T. Alander. Developing and testing structural light vision software by co-evolutionary genetic algorithm. In QSSE 2002, the Proceedings of the Second ASERC Workshop on Quantitative and Soft Computing based Software Engineering, pages 31–37. Alberta Software Engineering Research Consortium (ASERC) and the Department of Electrical and Computer Engineering, University of Alberta, Feb. 2002.
[8] T. Mantere and J. T. Alander. Testing digital halftoning software by generating test images and filters co-evolutionarily. In Proceedings of SPIE Vol. 5267 Intelligent Robots and Computer Vision XXI: Algorithms, Techniques, and Active Vision, pages 257–258. SPIE, Oct. 2003.
[9] M. Newman. Software Errors Cost U.S. Economy $59.5 Billion Annually. NIST News Release, June 2002.
[10] C. D. Rosin and R. K. Belew. Methods for competitive coevolution: Finding opponents worth beating. In L. Eshelman, editor, Proceedings of the Sixth International Conference on Genetic Algorithms, pages 373–380, San Francisco, CA, 1995. Morgan Kaufmann.
[11] C. D. Rosin and R. K. Belew. New methods for competitive coevolution. Evolutionary Computation, 5(1):1–29, 1997.
Page 40: References
[12] T. Service. Co-optimization: A generalization of coevolution. Master's thesis, Missouri University of Science and Technology, 2008.
[13] T. Service and D. Tauritz. Co-optimization algorithms. In Proceedings of GECCO 2008 - the Genetic and Evolutionary Computation Conference, pages 387–388, 2008.
[14] P. Tonella. Evolutionary testing of classes. In Proceedings of the 2004 ACM SIGSOFT international symposium on Software testing and analysis, pages 119–128, Boston, Massachusetts, 2004. ACM Press.
[15] S. Wappler and F. Lammermann. Using evolutionary algorithms for the unit testing of object-oriented software. In Proceedings of GECCO 2005 - the Genetic and Evolutionary Computation Conference, pages 1053–1060, Washington DC, 2005. ACM, ACM Press.
[16] S. Wappler and J. Wegener. Evolutionary unit testing of object-oriented software using strongly-typed genetic programming. In Proceedings of GECCO 2006 - the Genetic and Evolutionary Computation Conference, pages 1925–1932, Seattle, Washington, 2006. ACM, ACM Press.
Page 41: Questions?
Page 42: Koza's GP Evolutionary Model
Back to future work slide
Page 43: Diversity in New Experiments
[Figure: Program Population Diversities Under New Evolutionary Model – population standard deviation (0–0.8) vs. generation (0–50) for Exp 1, Exp 2, and Exp 3]