The Robust Optimization of Non-Linear Requirements Models
Gregory Gay, Thesis Defense
West Virginia University, greg@greggay.com
Consider a requirements model…
Contains:
- Various goals of a project.
- Methods for reaching those goals.
- Risks that prevent those goals.
- Mitigations that remove risks (but carry costs).
A solution: balance between cost and attainment. This is a non-linear optimization problem!
2
Understanding the Solution Space
Open and pressing issue. [Harman ‘07] Many SE problems are over-constrained.
No right answer, so give partial solutions.
Robustness of solutions is key – many algorithms give brittle results. [Harman ‘01]
Important to present insight into the neighborhood. What happens if I do B instead of A?
3
The Naïve Approach
Naive approaches to understanding the neighborhood: run N times and report
- (a) the solutions appearing in more than N/2 cases, or
- (b) results with a 95% confidence interval.
Both are flawed – they require multiple trials!
Neighborhood assessment must be fast [Feather ’08], real-time if possible. [Nielsen ‘93] states that results must arrive within:
- 1 second, before the mind begins to drift.
- 10 seconds, before the mind has completely moved on.
4
Research Goals
Two important concerns:
Is demonstrating solution robustness a time-consuming task?
Must solution quality be traded against solution robustness?
5
What We Want 6
These are the features we want out of an algorithm:
                         Feature                                     Algorithm
Neighborhood Assessment  High-Quality Results*                       Yes
                         Low Result Variance (Tame)                  Yes
                         Scores Plateau to Stability (Well-behaved)  Yes
Speed                    Real-Time Results                           Yes
                         Scalability                                 Yes
*(We want “more for less”)
Why Do We Care? 7
The later in the development process a defect is identified, the more expensive it becomes to fix, with the cost growing exponentially.
(Paraphrased from Barry Boehm [Boehm ‘81])
Roadmap 8
Model Algorithms Experiments Future Work Conclusions
The Defect Detection and Prevention Model
Used at NASA JPL by Martin Feather’s “Team X” [Cornford ’01, Feather ‘02, Feather ’08, Jalali ‘08]
Early-lifecycle requirements model.
Light-weight ontology represents:
- Requirements: project objectives, weighted by importance.
- Risks: events that damage attainment of requirements.
- Mitigations: precautions that remove risk, carry a cost value.
- Mappings: directed, weighted edges between requirements and risks and between risks and mitigations.
- Part-of relations: provide structure between model components.
9
Light-weight != Trivial 10
Why Use DDP?
Three Reasons:
1. Demonstrably useful. [Feather ‘02, Feather ‘08]
- Cost savings often over $100,000.
- Numerous design improvements seen in DDP sessions.
- Overall shift in risks in JPL projects.
2. Availability of real-world models, now and in the future. [Jalali ‘08]
11
The Third Reason 12
DDP is representative of other requirements tools: a set of influences, expressed in a hierarchy, with relationships modeled through equations.
Sensitivity analysis [Cruz ‘73] not applicable to DDP.
Soft-Goals [Mylopoulos ‘99]
QOC [MacLean ‘96]
Using DDP
Input = Set of enabled mitigations. Output = Two values: (Cost, Attainment)
Those values are normalized and combined into a single score [Jalali ‘08]:
13
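The formula itself did not survive in this transcript. As a minimal sketch only, assuming cost and attainment are each scaled to [0, 1] by hypothetical maxima and then combined as closeness to the ideal point (zero cost, full attainment), a single score could be computed as below. The function name, the normalization bounds, and the Euclidean form are all assumptions made for illustration, not the definition from [Jalali ‘08].

```python
import math


def combined_score(cost, attainment, max_cost, max_attainment):
    """Hypothetical single score in [0, 1]; higher is better.

    Assumption: both inputs are normalized by known maxima, and the score is
    one minus the scaled distance to the ideal point (cost 0, attainment 1).
    """
    norm_cost = cost / max_cost
    norm_attainment = attainment / max_attainment
    # Distance from the ideal point; sqrt(2) is the largest possible distance.
    distance = math.sqrt(norm_cost ** 2 + (norm_attainment - 1.0) ** 2)
    return 1.0 - distance / math.sqrt(2.0)


# Cheaper solutions with the same attainment score higher.
cheap = combined_score(cost=20, attainment=8, max_cost=100, max_attainment=10)
pricey = combined_score(cost=80, attainment=8, max_cost=100, max_attainment=10)
```

Collapsing two objectives into one number is what lets the search algorithms later in the talk sort configurations with a plain comparison.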
Roadmap 14
Model Algorithms Experiments Future Work Conclusions
Search-Based Software Engineering 15
Four factors must be met: [Harman ’01, Harman ‘04]
1. A large search space.
2. Low computational complexity.
3. Approximate continuity (in the score space).
4. No known optimal solutions.
DDP problem fits all:
1. Some models have up to 2^99 ≈ 6.33 × 10^29 possible settings.
2. Calculating the score is fast; algorithms run in O(N^2).
3. Discrete variables, but continuous score space.
4. Solutions depend on project settings; optimal not known.
No single solution, so reformulate as a search problem and find several!
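The search-space size quoted for the largest model can be checked directly, assuming each of its 99 mitigations is a simple on/off switch (an assumption consistent with the quoted figure):

```python
mitigations = 99                 # size of the largest model discussed
settings = 2 ** mitigations      # every mitigation is either enabled or disabled
formatted = f"{settings:.2e}"    # scientific notation with two decimals
print(formatted)
```

Rounded to two decimals the value is 6.34e+29; the slide's 6.33 is the same number truncated.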
Theory of KEYS
Theory: A minority of variables control the majority of the search space. [Menzies ‘07]
If so, then a search that (a) finds those keys and (b) explores their ranges will rapidly plateau to stable, optimal solutions.
This is not new: narrows, master-variables, back doors, and feature subset selection all work on the same theory. [Amarel ’86, Crawford ’94, Kohavi ’97, Menzies ’03, Williams ’03]
Everyone reports them, but few exploit them!
16
KEYS Algorithm
Two components: greedy search and a Bayesian ranking method (BORE = “Best or Rest”).
Each round, a greedy search:
- Generate 100 configurations of mitigations 1…M.
- Score them.
- Sort top 10% of scores into “Best” grouping, bottom 90% into “Rest.”
- Rank individual mitigations using BORE.
- The top-ranking mitigation is fixed for all subsequent rounds.
Stop when every mitigation has a value; return final cost and attainment values.
17
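The round structure above can be sketched in a few lines. This is an illustration under stated assumptions, not the thesis implementation: `toy_score` is a made-up stand-in for a DDP model call in which three "key" mitigations dominate, and `bore_rank` uses an assumed best/rest likelihood form for the ranking step.

```python
import random


def toy_score(config):
    """Hypothetical model: three key mitigations dominate, the rest barely matter."""
    weights = [8.0, -6.0, 5.0] + [0.1] * (len(config) - 3)
    return sum(w for w, enabled in zip(weights, config) if enabled)


def bore_rank(best, rest, fixed):
    """Rank each unfixed (index, value) pair from best/rest frequency counts."""
    p_best = len(best) / (len(best) + len(rest))
    p_rest = 1.0 - p_best
    ranks = {}
    for i in range(len(best[0])):
        if i in fixed:
            continue
        for value in (False, True):
            like_best = p_best * sum(c[i] == value for c in best) / len(best)
            like_rest = p_rest * sum(c[i] == value for c in rest) / len(rest)
            if like_best + like_rest > 0:
                # Assumed form: normalized likelihood times a support term.
                ranks[(i, value)] = like_best ** 2 / (like_best + like_rest)
    return ranks


def keys(num_mitigations, score, samples=100, rng=None):
    """Fix one top-ranked mitigation setting per round until all are set."""
    rng = rng or random.Random(0)
    fixed = {}
    while len(fixed) < num_mitigations:
        configs = [
            tuple(fixed[i] if i in fixed else rng.random() < 0.5
                  for i in range(num_mitigations))
            for _ in range(samples)
        ]
        configs.sort(key=score, reverse=True)       # one model call per config
        cut = max(1, samples // 10)                  # top 10% is "best"
        ranks = bore_rank(configs[:cut], configs[cut:], fixed)
        (index, value), _ = max(ranks.items(), key=lambda item: item[1])
        fixed[index] = value                         # stays fixed from now on
    return tuple(fixed[i] for i in range(num_mitigations))


result = keys(8, toy_score)
```

On the toy model the three key mitigations end up enabled, disabled, and enabled respectively, which is the behaviour the theory of keys predicts: the few influential settings are found and pinned early.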
BORE Ranking Heuristic
We don’t have to actually search for the keys, just keep frequency counts for “best” and “rest” scores.
BORE [Clark ‘05] based on Bayes’ theorem. Use those frequency counts to calculate:
To avoid low-frequency evidence, add support term:
18
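The two formulas are missing from this transcript. A hedged reconstruction, assuming the usual best/rest application of Bayes' theorem, where E is one mitigation setting and freq(E | best) is the fraction of "best" configurations that contain it (the symbol names are assumptions):

```latex
\begin{align*}
like(best \mid E) &= freq(E \mid best) \cdot P(best) \\
like(rest \mid E) &= freq(E \mid rest) \cdot P(rest) \\
P(best \mid E) &= \frac{like(best \mid E)}{like(best \mid E) + like(rest \mid E)}
\end{align*}
```

With a support term equal to like(best | E), so that settings seen only rarely in "best" are not ranked highly:

```latex
P(best \mid E) \cdot support(best \mid E)
  = \frac{like(best \mid E)^{2}}{like(best \mid E) + like(rest \mid E)}
```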
KEYS vs KEYS2
KEYS fixes a single top-ranked mitigation each round.
KEYS2 [Gay ‘10] incrementally sets more (1 in round 1, 2 in round 2… M in round M). Slightly less tame, much faster.
19
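A small worked example shows why the incremental schedule is faster. Reading the description literally (round r fixes r mitigations, 100 model calls per round), the number of rounds drops sharply; the function names here are hypothetical.

```python
def rounds_keys(num_mitigations):
    """KEYS fixes one mitigation per round."""
    return num_mitigations


def rounds_keys2(num_mitigations):
    """KEYS2 fixes r mitigations in round r, stopping once all are set."""
    rounds, fixed = 0, 0
    while fixed < num_mitigations:
        rounds += 1
        fixed += rounds
    return rounds


CALLS_PER_ROUND = 100  # configurations scored in each round
for m in (31, 58, 99):  # mitigation counts of models 2, 4, and 5
    print(m, rounds_keys(m) * CALLS_PER_ROUND, rounds_keys2(m) * CALLS_PER_ROUND)
```

For the 99-mitigation model this gives 99 rounds (9,900 model calls) for KEYS against 14 rounds (1,400 model calls) for KEYS2.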
Discovering KEYS2 20
KEYS2 is simple, developing it was not! Many different heuristics tried:
Simple: Complex: Set more in powers of 2 Shift best/rest slope ln(round) Neighborhood exploration top N% KEYS-R
Lesson in simplicity.
Benchmarked Algorithms
KEYS must be benchmarked against standard SBSE techniques: Simulated Annealing, MaxWalkSat, A* Search.
Chosen techniques are discrete, sequential, unconstrained algorithms. [Gu ‘97]
Constrained searches work towards a pre-determined number of solutions; unconstrained searches adjust to their goal space.
21
Simulated Annealing
Classic, yet common, approach. [Kirkpatrick ’83]
- Choose a random starting position.
- Look at a “neighboring” configuration.
- If it is better, go to it.
- If not, move based on guidance from a probability function (biased by the current temperature).
- Over time, temperature lowers. Wild jumps stabilize to small wiggles.
22
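The steps above can be sketched as follows. This is a generic simulated annealing loop, not the benchmarked implementation: `toy_score` is a made-up stand-in for a model call, and the linear cooling schedule and single-flip neighborhood are assumptions.

```python
import math
import random


def toy_score(config):
    """Hypothetical model: three key mitigations dominate, the rest barely matter."""
    weights = [8.0, -6.0, 5.0] + [0.1] * (len(config) - 3)
    return sum(w for w, enabled in zip(weights, config) if enabled)


def simulated_annealing(num_mitigations, score, steps=1000, rng=None):
    rng = rng or random.Random(0)
    current = [rng.random() < 0.5 for _ in range(num_mitigations)]
    current_score = score(current)
    best, best_score = list(current), current_score
    for step in range(steps):
        temperature = 1.0 - step / steps          # cools towards (not to) zero
        neighbor = list(current)
        flip = rng.randrange(num_mitigations)     # neighbor = one setting flipped
        neighbor[flip] = not neighbor[flip]
        neighbor_score = score(neighbor)
        delta = neighbor_score - current_score
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if delta > 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = neighbor, neighbor_score
        if current_score > best_score:
            best, best_score = list(current), current_score
    return best, best_score


best_config, best_value = simulated_annealing(8, toy_score)
```

Early on, the acceptance probability for a worse neighbor is high (wild jumps); near the end it is close to zero (small wiggles).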
MaxWalkSat 23
Hybridized local/random search. [Kautz ‘96, Selman ‘93]
Start with a random configuration. Then either perform:
- Local Search: Move to a neighboring configuration with a better score. (70%)
- Random Search: Change one random mitigation setting. (30%)
Keeps working towards a score threshold. Allotted a certain number of resets, which it will use if it fails to pass the threshold within a certain number of rounds.
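A minimal sketch of that loop, under stated assumptions: `toy_score` is a made-up stand-in for a model call, the local step picks the best single-flip neighbor, and the reset and round limits are placeholder values rather than the settings used in the experiments.

```python
import random


def toy_score(config):
    """Hypothetical model: three key mitigations dominate, the rest barely matter."""
    weights = [8.0, -6.0, 5.0] + [0.1] * (len(config) - 3)
    return sum(w for w, enabled in zip(weights, config) if enabled)


def maxwalksat(num_mitigations, score, threshold,
               max_resets=5, max_rounds=200, p_local=0.7, rng=None):
    rng = rng or random.Random(0)
    best, best_score = None, float("-inf")
    for _ in range(max_resets):                       # each reset restarts randomly
        current = [rng.random() < 0.5 for _ in range(num_mitigations)]
        for _ in range(max_rounds):
            if rng.random() < p_local:
                # Local search: try every single flip, keep the best if it improves.
                neighbors = []
                for i in range(num_mitigations):
                    candidate = list(current)
                    candidate[i] = not candidate[i]
                    neighbors.append(candidate)
                top = max(neighbors, key=score)
                if score(top) > score(current):
                    current = top
            else:
                # Random search: flip one randomly chosen mitigation.
                i = rng.randrange(num_mitigations)
                current[i] = not current[i]
            current_score = score(current)
            if current_score > best_score:
                best, best_score = list(current), current_score
            if best_score >= threshold:
                return best, best_score
    return best, best_score


found, found_score = maxwalksat(8, toy_score, threshold=13.0)
```

The random moves are what let the search escape local maxima that a purely greedy local search would get stuck on.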
A* Search 24
Best-first path-finding heuristic. [Hart ‘68]
Uses distance from origin (G) and estimated cost to goal (H), and moves to the neighbor that minimizes G+H.
Moves to new location and adds the previous location to a closed list to prevent backtracking.
Optimal search because H always underestimates the remaining cost. Stops after being stuck for 10 rounds.
Other Methods 25
[Gu ‘97]’s survey lists hundreds of methods!
- Gradient descent methods and sensitivity analysis assume a continuous range for model variables. DDP models are discrete!
- Integer programming could still be used (CPLEX [Mittelmann ‘07]). Too slow! [Coarfa ‘00] SE problems are over-constrained, so a solution over all constraints is not possible. [Harman ’07]
- Parallel algorithms: communications overhead overwhelms benefits.
Roadmap 26
Model Algorithms Experiments Future Work Conclusions
Experiment 1: Costs and Attainments 27
We use real-world models 2, 4, and 5 (1 and 3 are too small and were only used for debugging). Models discussed in [Feather ‘02, Jalali ‘08, Menzies ‘03].
Run each algorithm 1000 times per model.
- Removed outlier problems by generating a lot of data points.
- Still a small enough number to collect results in a short time span.
Graph cost and attainment values. Values towards the bottom-right are better.
Experiment 1 Results 28
[Figure: plots arranged by model and algorithm; cost on the Y-axis, attainment on the X-axis.]
Experiment 1 Results 29
[Figure annotations: “Bad” and “Good” regions.]
Experiment 1 Results 30
Summing it Up… 31
Feature                                     Simulated Annealing  MaxWalkSat  A*     KEYS  KEYS2
High-Quality Results                        No                   No          Yes    Yes   Yes
Low Result Variance (Tame)                  No                   No          No     Yes   Yes
Scores Plateau to Stability (Well-behaved)  ?                    ?           ?      ?     ?
Real-Time Results                           ?                    ?           ?      ?     ?
Scalability                                 ?                    ?           ?      ?     ?
Experiment 2: Runtimes 32
For each model:
- Run each algorithm 100 times.
- Record runtime using the Unix “time” command.
- Divide runtime by 100 to get the average.
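The same averaging procedure can be sketched in-process. Note the swap: this uses Python's `time.perf_counter` instead of the Unix “time” command used in the experiments, and `run_once` is a hypothetical callable standing in for one run of an algorithm on one model.

```python
import time


def average_runtime(run_once, repeats=100):
    """Time `repeats` back-to-back runs and return the mean seconds per run."""
    start = time.perf_counter()
    for _ in range(repeats):
        run_once()
    elapsed = time.perf_counter() - start
    return elapsed / repeats


# Example with a placeholder workload in place of a real algorithm run.
mean_seconds = average_runtime(lambda: sum(range(10_000)), repeats=100)
```

Timing the whole batch and dividing, rather than timing each run, keeps measurement overhead out of the per-run figure.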
Experiment 2 Results 33
Algorithm            Model 2 (31 mitigations)  Model 4 (58 mitigations)  Model 5 (99 mitigations)
Simulated Annealing  0.577                     1.258                     0.854
MaxWalkSat           0.122                     0.429                     0.398
A* Search            0.003                     0.017                     0.048
KEYS                 0.011                     0.053                     0.115
KEYS2                0.006                     0.018                     0.038
Summing it Up… 34
Feature                                     Simulated Annealing  MaxWalkSat  A*     KEYS  KEYS2
High-Quality Results                        No                   No          Yes    Yes   Yes
Low Result Variance (Tame)                  No                   No          No     Yes   Yes
Scores Plateau to Stability (Well-behaved)  ?                    ?           ?      ?     ?
Real-Time Results                           No                   No          Yes    Yes   Yes
Scalability                                 ?                    ?           ?      ?     ?
Experiment 3: Scale-Up Study 35
By 2013, we expect DDP models 8x larger than those used in this thesis (from 2008)
KEYS and KEYS2 are “real time” now, but will they scale?
Year Num. Variables
2004 30
2008 100
2010 300
2013 800
Artificial Model Generation 36
To test scalability, a generator builds synthesized models by:
- Studying the existing real-world models and collecting statistics on their internal structure.
- Mutating them into larger models based on user-input size, density, and distribution-altering parameters.
Models built that were 2, 4, and 8 times larger than existing models.
Scale-Up Results (1) 37
Scale-Up Results (2) 38
Scale-Up Results (3) 39
Fit                       KEYS Runtimes  KEYS Model Calls  KEYS2 Runtimes  KEYS2 Model Calls
Exponential               0.82           0.83              0.88            0.93
Polynomial (of degree 2)  0.99           0.99              0.99            0.98
KEYS and KEYS2 fit to O(N^2).
Both scale to larger models, but KEYS requires exponentially more model calls (thus, large jump in execution time).
Thus, we recommend KEYS2.
Summing it Up… 40
Feature                                     Simulated Annealing  MaxWalkSat  A*     KEYS  KEYS2
High-Quality Results                        No                   No          Yes    Yes   Yes
Low Result Variance (Tame)                  No                   No          No     Yes   Yes
Scores Plateau to Stability (Well-behaved)  ?                    ?           ?      ?     ?
Real-Time Results                           No                   No          Yes    Yes   Yes
Scalability                                 No                   No          Maybe  No    Yes
Decision Ordering Diagrams 41
Design of KEYS2 automatically provides a way to explore the decision neighborhood.
Decision ordering diagrams: a visual format that ranks decisions from most to least important. [Gay ‘10]
Decision Ordering Diagrams (2) 42
These diagrams can be used to assess solution robustness in linear time by (A) considering the variance in performance after applying X decisions.
Spread = measure of variance (75th percentile minus 25th percentile).
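The spread measure is the interquartile range of the scores observed after a given number of decisions. A minimal sketch using the standard library; the function name and the sample scores are made up, and the "inclusive" quantile method is one of several reasonable choices:

```python
import statistics


def spread(scores):
    """75th percentile minus 25th percentile of a list of scores."""
    q1, _median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return q3 - q1


# Hypothetical scores sampled after applying the first X decisions.
after_few_decisions = [1, 2, 3, 4, 5]
after_many_decisions = [3.0, 3.0, 3.1, 3.1, 3.2]
```

A tame algorithm shows the spread shrinking as more decisions are applied, as in the second list, where the scores have nearly converged.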
Decision Ordering Diagrams (3) 43
These diagrams can be used to assess solution robustness in linear time by (B) comparing the results of using the first X decisions to those of using X-1 or X+1.
Decision Ordering Diagrams (4) 44
Useful under three conditions: (a) output scores are well-behaved, (b) variance is tamed, (c) they are generated in a timely manner.
Summing it Up… 45
Feature                                     Simulated Annealing  MaxWalkSat  A*     KEYS  KEYS2
High-Quality Results                        No                   No          Yes    Yes   Yes
Low Result Variance (Tame)                  No                   No          No     Yes   Yes
Scores Plateau to Stability (Well-behaved)  No                   No          No     Yes   Yes
Real-Time Results                           No                   No          Yes    Yes   Yes
Scalability                                 No                   No          Maybe  No    Yes
Roadmap 46
Model Algorithms Experiments Future Work Conclusions
Future Work 47
Weakness in KEYS2: it must call the model 100x per round.
70% of execution time is spent calling the model.
To improve: call the model less. Problem: do this without losing support!
Future Directions 48
Two main parts: cache and active learning.
Don’t call the model, just look up scores. Clustering must be efficient [Jiang ’08, Poshyvanyk ‘08].
- First round: generate 100 random configurations.
- Greedily cluster them, assign identifiers to each, build a hierarchical distance tree.
- After this, can simply drop new instances into the tree (as the number of clusters will remain fixed).
- The model only needs to be called for new instances.
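The "look up, don't call" idea can be illustrated with a much simpler stand-in than the proposed cluster tree: an exact-match cache keyed on the configuration. This is a deliberately simplified substitute, not the clustering scheme described above, and the class name is hypothetical.

```python
class CachedModel:
    """Wrap a scoring function so repeated configurations cost no model call."""

    def __init__(self, score):
        self._score = score
        self._cache = {}
        self.calls = 0  # number of real model calls made so far

    def __call__(self, config):
        key = tuple(config)              # lists are not hashable, tuples are
        if key not in self._cache:
            self.calls += 1
            self._cache[key] = self._score(key)
        return self._cache[key]


# Placeholder model: the score is just the number of enabled mitigations.
model = CachedModel(sum)
first = model([1, 0, 1])
second = model((1, 0, 1))    # same configuration, served from the cache
```

An exact-match cache only helps when configurations repeat; the cluster-based design aims to reuse scores for configurations that are merely similar.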
Future Directions (2) 49
Most learners passive – blindly generating or reading data.
Active learners exercise control over what data they learn on [Cohn ‘94].
Many clusters are linear – their members form a smooth area of the search space with similar scores [Shepperd ‘97].
If a new configuration falls into a linear region, don’t poll the model, just interpolate its score.
Train KEYS2 to outright ignore linear regions with poor scores.
KEYS as a Component 50
Treatment Learning (NASA Ames)
- Work with Misty Davies, Karen Gundy-Burlet.
- Design process: run simulations, then apply Treatment Learning via TAR3 or TAR4.1 [Menzies ‘03b].
- KEYS is a multi-stage version of TAR4.1.
- Future: KEYS as part of the simulator.
Bayesian Nets (Tsinghua University)
- Work with Hongyu Zhang.
- Bayes Net = directed graph of variables and their influences.
- Variable ordering problem [Hsu ‘04]: order of variables examined by the learning process is crucial to performance!
- Need a ranking from best -> worst.
- Genetic algorithms sometimes used, but they are slow.
- KEYS offers a potential solution to the VOP.
Roadmap 51
Model Algorithms Experiments Future Work Conclusions
Conclusions 52
Optimization tools can study the space of requirements, risks, and mitigations.
Finding a balance between costs and attainment is hard!
Such solutions can be brittle, so we must comment on solution robustness. [Harman ‘01]
Conclusions (2) 53
Pre-experimental concerns:
- An algorithm would need to trade solution quality for robustness (variance vs score).
- Demonstrating solution robustness is time-consuming and requires multiple procedure calls.
KEYS2 defies both concerns.
- Generates higher quality solutions than standard methods, and generates results that are tame and well-behaved (thus, we can generate decision ordering graphs to assess robustness).
- Is faster than other techniques, and can generate decision ordering graphs in O(N^2).
We Recommend KEYS2 54
Feature                                     Simulated Annealing  MaxWalkSat  A*     KEYS  KEYS2
High-Quality Results                        No                   No          Yes    Yes   Yes
Low Result Variance (Tame)                  No                   No          No     Yes   Yes
Scores Plateau to Stability (Well-behaved)  No                   No          No     Yes   Yes
Real-Time Results                           No                   No          Yes    Yes   Yes
Scalability                                 No                   No          Maybe  No    Yes
References 55
Slide 3:
[Harman ‘01] M. Harman and B.F. Jones. Search-based software engineering. Journal of Information and Software Technology, 43:833–839, December 2001.
[Harman ‘07] Mark Harman. The current state and future of search based software engineering. In Future of Software Engineering, ICSE’07, 2007.
Slide 4:
[Feather ‘08] M. Feather, S. Cornford, K. Hicks, J. Kiper, and T. Menzies. Application of a broad-spectrum quantitative requirements model to early-lifecycle decision making. IEEE Software, 2008.
[Nielsen ‘93] J. Nielsen. Usability Engineering. Academic Press, 1993.
Slide 7:
[Boehm ‘81] B. W. Boehm. Software Engineering Economics. Prentice Hall PTR, Upper Saddle River, NJ, USA, 1981.
Slide 9:
[Cornford ‘01] S.L. Cornford, M.S. Feather, and K.A. Hicks. DDP a tool for life-cycle risk management. In IEEE Aerospace Conference, Big Sky, Montana, pages 441–451, March 2001.
[Feather ‘02] M.S. Feather and T. Menzies. Converging on the optimal attainment of requirements. In IEEE Joint Conference On Requirements Engineering ICRE’02 and RE’02, 9-13th September, University of Essen, Germany, 2002.
[Jalali ‘08] Tim Menzies, Omid Jalali, and Martin Feather. Optimizing requirements decisions with keys. In Proceedings PROMISE ’08 (ICSE), 2008.
Slide 12:
[Cruz ‘73] J.B. Cruz, editor. System Sensitivity Analysis. Dowden, Hutchinson, & Ross, Stroudsburg, PA, 1973.
[MacLean ‘96] A. MacLean, R.M. Young, V. Bellotti, and T.P. Moran. Questions, options and criteria: Elements of design space analysis. In T.P. Moran and J.M. Carroll, editors, Design Rationale: Concepts, Techniques, and Use, pages 53–106. Lawrence Erlbaum Associates, 1996.
[Mylopoulos ‘99] J. Mylopoulos, L. Cheng, and E. Yu. From object-oriented to goal-oriented requirements analysis. Communications of the ACM, 42(1):31–37, January 1999.
References (2) 56
Slide 15:
[Harman ‘04] Mark Harman and John Clark. Metrics are fitness functions too. In 10th International Software Metrics Symposium (METRICS 2004), pages 58–69, Chicago, IL, USA, 2004. IEEE Computer Society Press, Los Alamitos, CA, USA.
Slide 16:
[Amarel ‘86] S. Amarel. Program synthesis as a theory formation task: Problem representations and solution methods. In R. S. Michalski, J. G. Carbonell, and T. M. Mitchell, editors, Machine Learning: An Artificial Intelligence Approach: Volume II, pages 499–569. Kaufmann, Los Altos, CA, 1986.
[Crawford ‘94] J. Crawford and A. Baker. Experimental results on the application of satisfiability algorithms to scheduling problems. In AAAI ’94, 1994.
[Kohavi ‘97] Ron Kohavi and George H. John. Wrappers for feature subset selection. Artificial Intelligence, 97(1-2):273–324, 1997.
[Menzies ‘03] T. Menzies and H. Singh. Many maybes mean (mostly) the same thing. In M. Madravio, editor, Soft Computing in Software Engineering. Springer-Verlag, 2003.
[Menzies ‘07] T. Menzies, D. Owen, and K. Richardson. The strangest thing about software. Computer, 40(1):54–60, January 2007.
[Williams ’03] R. Williams, C.P. Gomes, and B. Selman. Backdoors to typical case complexity. In Proceedings of IJCAI 2003, 2003.
Slide 18:
[Clark ‘05] R. Clark. Faster treatment learning. Master’s thesis, Computer Science, Portland State University, 2005.
Slide 19:
[Gay ‘10] Gregory Gay, Tim Menzies, Omid Jalali, Gregory Mundy, Beau Gilkerson, Martin Feather, and James Kiper. Finding robust solutions in requirements models. Automated Software Engineering, 17(1):87–116, 2010.
References (3) 57
Slide 21:
[Gu ‘97] Jun Gu, Paul W. Purdom, John Franco, and Benjamin W. Wah. Algorithms for the satisfiability (SAT) problem: A survey. In DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pages 19–152. American Mathematical Society, 1997.
Slide 22:
[Kirkpatrick ‘83] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. Optimization by simulated annealing. Science, 220(4598):671–680, 13 May 1983.
Slide 23:
[Kautz ‘96] Henry Kautz and Bart Selman. Pushing the envelope: Planning, propositional logic and stochastic search. In Proceedings of the Thirteenth National Conference on Artificial Intelligence and the Eighth Innovative Applications of Artificial Intelligence Conference, pages 1194–1201, Menlo Park, August 4–8 1996. AAAI Press / MIT Press. Available from http://www.cc.gatech.edu/~jimmyd/summaries/kautz1996.ps.
[Selman ‘93] Bart Selman, Henry A. Kautz, and Bram Cohen. Local search strategies for satisfiability testing. In Michael Trick and David Stifler Johnson, editors, Proceedings of the Second DIMACS Challenge on Cliques, Coloring, and Satisfiability, Providence RI, 1993.
Slide 24:
[Hart ‘68] P.E. Hart, N.J. Nilsson, and B. Raphael. A formal basis for the heuristic determination of minimum cost paths. IEEE Transactions on Systems Science and Cybernetics, 4:100–107, 1968.
References (4) 58
Slide 25:
[Coarfa ‘00] Cristian Coarfa, Demetrios D. Demopoulos, Alfonso San Miguel Aguirre, Devika Subramanian, and Moshe Y. Vardi. Random 3-SAT: The plot thickens. In Principles and Practice of Constraint Programming, pages 143–159, 2000.
[Mittelmann ‘07] H.D. Mittelmann. Recent benchmarks of optimization software. In 22nd European Conference on Operational Research, 2007.
Slide 48:
[Jiang ‘08] Y. Jiang, B. Cukic, and T. Menzies. Does transformation help? In Defects 2008, 2008.
[Poshyvanyk ‘08] D. Poshyvanyk, A. Marcus, and R. Ferenc. Using the conceptual cohesion of classes for fault prediction in object oriented systems. IEEE Transactions on Software Engineering, 34:287–300, 2008.
Slide 49:
[Cohn ‘94] David Cohn, Les Atlas, and Richard Ladner. Improving generalization with active learning. Machine Learning, 15(2):201–221, 1994.
[Shepperd ‘97] M. Shepperd and C. Schofield. Estimating software project effort using analogies. IEEE Transactions on Software Engineering, 23(12), November 1997.
Slide 50:
[Menzies ’03b] T. Menzies and Y. Hu. Data mining for very busy people. IEEE Computer, November 2003. Available from http://menzies.us/pdf/03tar2.pdf.
[Hsu ‘04] W.H. Hsu. Genetic wrappers for feature selection in decision tree induction and variable ordering in Bayesian network structure learning. Information Sciences, 163(1-3):103–122, June 2004.
Questions? 59
Want to contact me later?
Email: greg@greggay.com
Facebook: http://facebook.com/greg.gay
Twitter: http://twitter.com/Greg4cr
More about me: http://www.greggay.com