A Comparison of Evaluation Methods in Coevolution 20070921
A Comparison of Evaluation Methods in Coevolution
Ting-Shuo Yo
Supervisors: Edwin D. de Jong, Arno P.J.M. Siebes
Final Presentation INF/SCR-06-54 Applied Computing Science, ICS
Outline
● Introduction
● Evaluation methods in coevolution
● Performance measures
● Test problems
● Results and discussion
● Concluding remarks
Introduction
● Evolutionary computation
● Coevolution
● Coevolution for test-based problems
● Motivation of this study
Genetic Algorithm
Initialization
While (not TERMINATE)
    1. EVALUATION
    2. SELECTION
    3. REPRODUCTION (crossover, mutation, ...)
    4. REPLACEMENT
End
(Diagram labels: Population, Parents, Offspring)
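The loop on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: the bit-string representation, the `one_max` fitness, truncation selection, one-point crossover, and a fixed generation count as the TERMINATE condition are all hypothetical choices.

```python
import random

def one_max(bits):
    """Hypothetical fitness function: the number of 1 bits."""
    return sum(bits)

def run_ga(pop_size=20, length=16, generations=30, seed=0):
    rng = random.Random(seed)
    # Initialization
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):  # While (not TERMINATE)
        # 1. EVALUATION: rank individuals by fitness
        ranked = sorted(population, key=one_max, reverse=True)
        # 2. SELECTION: keep the better half as parents
        parents = ranked[:pop_size // 2]
        # 3. REPRODUCTION: one-point crossover plus bit-flip mutation
        offspring = []
        while len(offspring) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < 1.0 / length else bit
                     for bit in child]
            offspring.append(child)
        # 4. REPLACEMENT: parents survive, offspring fill the rest
        population = parents + offspring
    return max(population, key=one_max)

best = run_ga()
print(one_max(best), best)
```

Because the parents survive replacement in this sketch, the best fitness in the population never decreases from one generation to the next.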
Coevolution
Initialization
While (not TERMINATE)
    1. EVALUATION
    Subpopulation: 2. SELECTION 3. REPRODUCTION 4. REPLACEMENT
    Subpopulation: 2. SELECTION 3. REPRODUCTION 4. REPLACEMENT
    ................
End
Test-Based Problems
[Figure: plot of f(x) against x showing the original function and a regression curve; tests t1 t2 t3 t4 t5 t6 t7 t8 t9 t10; solutions s1, s2, s3]
Coevolution for Test-Based Problems
Solution population ↔ Test population
Interaction:
● Does the solution solve the test?
● How well does the solution perform on the test?
1. EVALUATION
Solutions: the more tests a solution solves, the better.
Tests: the fewer solutions pass a test, the better.
2. SELECTION 3. REPRODUCTION 4. REPLACEMENT (in each population)
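A minimal sketch of this two-population loop is given below. The toy problem is an assumption made for illustration only: solutions and tests are integers, and the hypothetical interaction `solves` returns 1 when the solution is at least as large as the test. The selection and mutation operators are likewise placeholders, not the operators used in the study.

```python
import random

def solves(solution, test):
    """Hypothetical interaction: 1 if the solution meets the test's threshold."""
    return 1 if solution >= test else 0

def evaluate(solutions, tests):
    # Solutions: the more tests solved, the better.
    sol_fit = [sum(solves(s, t) for t in tests) for s in solutions]
    # Tests: the fewer solutions that pass, the better.
    test_fit = [sum(1 - solves(s, t) for s in solutions) for t in tests]
    return sol_fit, test_fit

def next_generation(pop, fitness, rng):
    # 2. SELECTION: keep the better half.
    ranked = [ind for _, ind in sorted(zip(fitness, pop), reverse=True)]
    parents = ranked[:len(pop) // 2]
    # 3. REPRODUCTION: mutate a copy of each parent by +1 or -1.
    offspring = [p + rng.choice((-1, 1)) for p in parents]
    # 4. REPLACEMENT: parents and offspring form the new population.
    return parents + offspring

def coevolve(generations=10, size=6, seed=0):
    rng = random.Random(seed)
    solutions = [rng.randint(0, 10) for _ in range(size)]
    tests = [rng.randint(0, 10) for _ in range(size)]
    for _ in range(generations):
        sol_fit, test_fit = evaluate(solutions, tests)  # 1. EVALUATION
        solutions = next_generation(solutions, sol_fit, rng)
        tests = next_generation(tests, test_fit, rng)
    return solutions, tests

print(evaluate([5, 1], [0, 3, 9]))  # ([2, 1], [0, 1, 2])
```

Evaluation is the only step that couples the two populations; selection, reproduction and replacement run separately in each.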
Motivation
● Coevolution provides a way to select tests adaptively → stability and efficiency
● Solution concept → stability
● Efficiency depends on selection and evaluation.
● Compared to evaluation based on all relevant information, how do different coevolutionary evaluation methods perform?
Concepts for Coevolutionary Evaluation Methods
● Interaction
● Distinction and informativeness
● Dominance and multi-objective approach
Interaction
● A function that returns the outcome of interaction between two individuals from different subpopulations.
  – Checkers players: which one wins
  – Test / Solution: whether the solution succeeds in solving the test
● Interaction matrix

       S1  S2  S3  S4  S5   sum
T1      0   1   0   0   1     2
T2      0   0   1   1   0     2
T3      0   1   1   0   0     2
T4      1   0   0   0   0     1
T5      1   0   1   0   0     2
sum     2   2   3   1   1
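The row and column sums of the interaction matrix can be reproduced with a short sketch. The variable names are illustrative, and reading an entry of 1 as "the solution solves the test" is an assumption about the orientation of the matrix.

```python
# Interaction matrix from the slide: rows are tests T1..T5,
# columns are solutions S1..S5.
G = [
    [0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0],
]

row_sums = [sum(row) for row in G]        # one total per test
col_sums = [sum(col) for col in zip(*G)]  # one total per solution

print(row_sums)  # [2, 2, 2, 1, 2]
print(col_sums)  # [2, 2, 3, 1, 1]
```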
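Distinction
● Ability to keep diversity in the other subpopulation.
● Informativeness

Distinctions made by test case T3 (an entry of 1 in row Si, column Sj means that T3 ranks Si above Sj):

       S1  S2  S3  S4  S5
S1      -   0   0   0   0
S2      1   -   0   1   1
S3      1   0   -   1   1
S4      0   0   0   -   0
S5      0   0   0   0   -
sum     2   0   0   2   2   total 6

Interaction matrix (rows: test cases, columns: solutions):

       S1  S2  S3  S4  S5   sum
T1      0   1   0   0   1     2
T2      0   0   1   1   0     2
T3      0   1   1   0   0     2
T4      1   0   0   0   0     1
T5      1   0   1   0   0     2
sum     2   2   3   1   1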
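The distinction count for T3 can be checked with a small sketch. The helper name `distinctions` is hypothetical; it lists every ordered pair of solutions that a test separates, using the interaction matrix from the slide.

```python
# Interaction matrix from the slide: rows are tests, columns are solutions.
G = [
    [0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0],
]

def distinctions(outcomes):
    """Ordered pairs (i, j) where individual i scores higher than j on this test."""
    n = len(outcomes)
    return [(i, j) for i in range(n) for j in range(n)
            if outcomes[i] > outcomes[j]]

t3 = distinctions(G[2])
print(t3)       # [(1, 0), (1, 3), (1, 4), (2, 0), (2, 3), (2, 4)]
print(len(t3))  # 6
```

With zero-based indices, the pair (1, 0) corresponds to S2 > S1, which matches the first column of the distinction matrix for T3.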
Dominance and MO approach
● Keep the best for each objective.
● MO: number of individuals that dominate it
[Figure: objective space with axes f1 and f2, showing non-dominated and dominated individuals]
S1 is dominated by S2 iff:
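The usual Pareto dominance condition can be written as below. It assumes that every objective f_i is to be maximised; the number of objectives k and the relation symbol are notation introduced here, not taken from the slides.

```latex
S_1 \prec S_2 \iff
\bigl(\forall i \in \{1,\dots,k\}:\ f_i(S_2) \ge f_i(S_1)\bigr)
\ \wedge\
\bigl(\exists j \in \{1,\dots,k\}:\ f_j(S_2) > f_j(S_1)\bigr)
```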
Evaluation Methods
● AS: Averaged Score
● WS: Weighted Score
● AI: Averaged Informativeness
● WI: Weighted Informativeness
● MO: Multi-Objective
AS and WS
● AS: (# positive interactions) / (# all interactions)
● WS: each interaction is weighted differently.

Rows: test cases; columns: solutions.

       S1   S2   S3   S4   S5   sum   AS
T1      0    1    0    0    1     2   0.4
T2      0    0    1    1    0     2   0.4
T3      0    1    1    0    0     2   0.4
T4      1    0    0    0    0     1   0.2
T5      1    0    1    0    0     2   0.4
sum     2    2    3    1    1
AS    0.4  0.4  0.6  0.2  0.2
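The AS values follow directly from the matrix, as the sketch below shows. The slides do not say how WS weights each interaction, so `weighted_score` uses an assumed scheme chosen only for illustration: a solved test is worth 1 divided by the number of solutions that solve it, so rarely solved tests count for more.

```python
# Interaction matrix from the slide: rows are tests, columns are solutions.
G = [
    [0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0],
]
n_tests, n_solutions = len(G), len(G[0])

# AS: (# positive interactions) / (# all interactions)
as_tests = [sum(row) / n_solutions for row in G]
as_solutions = [sum(col) / n_tests for col in zip(*G)]

def weighted_score(matrix):
    """Assumed WS scheme: each solved test contributes 1 / (number of solvers)."""
    scores = [0.0] * len(matrix[0])
    for row in matrix:
        solvers = sum(row)
        for j, outcome in enumerate(row):
            if outcome:
                scores[j] += 1.0 / solvers
    return scores

print(as_tests)           # [0.4, 0.4, 0.4, 0.2, 0.4]
print(as_solutions)       # [0.4, 0.4, 0.6, 0.2, 0.2]
print(weighted_score(G))  # [1.5, 1.0, 1.5, 0.5, 0.5]
```

Under this assumed weighting S1 and S3 tie even though S3 solves more tests, because S1 is the only solution that solves T4.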
AI and WI
● AI: # of distinctions it makes
● WI: each distinction is weighted differently.

       S1>S2  S1>S3  S1>S4  S1>S5  ...  total
T1       1      1      0      1    ...    5
T2       0      0      0      1    ...    2
T3       1      1      0      0    ...    6
T4       0      1      0      1    ...    2
T5       0      0      0      0    ...    1

In the algorithm, a weighted sum of AS and informativeness is actually used: 0.3 × informativeness + 0.7 × AS.
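The weighted sum can be sketched as follows for the tests in the interaction matrix from the earlier slides (not the illustrative distinction table above). Two details are assumptions: informativeness is normalised by the number of ordered solution pairs so that it lies in [0, 1], and AS for a test is taken as the fraction of positive interactions, as on the AS slide. The function names are hypothetical.

```python
# Interaction matrix from the earlier slides: rows are tests, columns are solutions.
G = [
    [0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0],
]

def count_distinctions(outcomes):
    """AI numerator: number of ordered pairs (a, b) with outcome a > outcome b."""
    return sum(1 for a in outcomes for b in outcomes if a > b)

def combined_fitness(matrix, w_info=0.3, w_as=0.7):
    n = len(matrix[0])
    max_pairs = n * (n - 1)  # assumed normaliser: all ordered pairs
    fitness = []
    for row in matrix:
        informativeness = count_distinctions(row) / max_pairs
        averaged_score = sum(row) / n
        fitness.append(w_info * informativeness + w_as * averaged_score)
    return fitness

ai_counts = [count_distinctions(row) for row in G]
print(ai_counts)  # [6, 6, 6, 4, 6]
print([round(f, 2) for f in combined_fitness(G)])
```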
MO
● Objectives: each individual in the other subpopulation.
● MO: number of individuals that dominate it.
● Non-dominated individuals have the highest fitness value.
[Figure: objective space with axes f1 and f2, showing non-dominated and dominated individuals]
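A minimal sketch of the dominance count is shown below. It treats each test as one objective for the solutions and assumes maximisation; `dominates` and `dominator_counts` are hypothetical names, and the matrix is the interaction matrix from the earlier slides.

```python
# Interaction matrix from the earlier slides: rows are tests, columns are solutions.
G = [
    [0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0],
]

def dominates(a, b):
    """True if a is at least as good as b on every objective and better on one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def dominator_counts(vectors):
    """For each individual, the number of individuals that dominate it."""
    return [sum(dominates(other, v) for other in vectors) for v in vectors]

# One outcome vector per solution: its results against T1..T5.
solutions = [list(col) for col in zip(*G)]
counts = dominator_counts(solutions)
print(counts)  # [0, 0, 0, 1, 1]
```

Here S1, S2 and S3 are non-dominated, S4 is dominated by S3, and S5 is dominated by S2. A lower count means a better individual.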
Performance Measures
● Objective Fitness (OF)
  – Evaluation against a fixed set of test cases
  – Here we use "all possible test cases" since we have picked problems with small sizes.
● Objective Fitness Correlation (OFC)
  – Correlation between OFs and fitness values in the coevolution (subjective fitness, SF).
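The slides do not say which correlation coefficient OFC uses. The sketch below assumes Pearson's coefficient, and the OF and SF values are made-up numbers for illustration only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equally long sequences of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical fitness values for four individuals.
objective_fitness = [0.2, 0.4, 0.6, 0.8]
subjective_fitness = [0.1, 0.3, 0.2, 0.7]
print(pearson(objective_fitness, subjective_fitness))
```

A value close to 1 would mean that the subjective fitness ranks individuals much as the objective fitness does. The function raises `ZeroDivisionError` when either sequence is constant.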
Experimental Setup
● Controlled experiments: GAAS
  – GA with AS from exhaustive evaluation.
● Compare the OF based on the same number of interactions.
Test Problems
● Majority Function Problem (MFP)
  – 1D cellular automata problem
  – Two parameters: radius (r) and problem size (n)

A sample IC (initial configuration) with n = 9:
  0 1 0 1 0 0 1 1 1

A sample rule with r = 1 (input: the target bit with its neighbor bits):
  Input    000  001  010  011  100  101  110  111
  Output     0    0    0    1    0    1    1    1
The output row is the boolean-vector representation of this rule.
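One update step of such a cellular automaton can be sketched as below, using the sample rule and sample IC from the slide. The wrap-around (periodic) boundary and the function names are assumptions; the slides do not state how the edges are handled.

```python
def step(cells, rule, r=1):
    """Apply a boolean-vector rule once, with wrap-around neighbours (assumed)."""
    n = len(cells)
    nxt = []
    for i in range(n):
        index = 0
        # Read the 2r+1 neighbourhood bits as a binary number, left to right.
        for k in range(-r, r + 1):
            index = (index << 1) | cells[(i + k) % n]
        nxt.append(rule[index])
    return nxt

def majority(cells):
    """Target output of the majority function: 1 if more than half the bits are 1."""
    return 1 if 2 * sum(cells) > len(cells) else 0

rule = [0, 0, 0, 1, 0, 1, 1, 1]     # outputs for inputs 000 ... 111
ic = [0, 1, 0, 1, 0, 0, 1, 1, 1]    # sample IC with n = 9

print(step(ic, rule))  # [1, 0, 1, 0, 0, 0, 1, 1, 1]
print(majority(ic))    # 1
```

The sample rule sets each cell to the majority of its own neighbourhood, which is a local decision; the task is for the whole automaton to settle on the global majority of the IC.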
Test Problems
● Symbolic Regression Problem (SRP)
  – Curve fitting with Genetic Programming trees
  – Two measures: sum of error and hit
[Figure: a GP tree built from +, -, * and x nodes, next to a plot of f(x) against x; plot labels: original function, regression curve, hit, 2x]
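The two measures can be sketched as below. The target x^6 - 2x^4 + x^2 is an assumption based on the title of the SRP results slide; the sample points, the absolute-error form of the error, and the hit tolerance of 0.01 are hypothetical choices, not values from the slides.

```python
def target(x):
    """Assumed original function: x^6 - 2x^4 + x^2."""
    return x**6 - 2 * x**4 + x**2

def sum_of_error(candidate, points):
    """Total absolute error of a candidate curve over the sample points."""
    return sum(abs(candidate(x) - target(x)) for x in points)

def hits(candidate, points, tolerance=0.01):
    """Number of sample points where the candidate is within the tolerance."""
    return sum(1 for x in points
               if abs(candidate(x) - target(x)) <= tolerance)

points = [i / 10 for i in range(-10, 11)]  # 21 sample points in [-1, 1]

def shifted(x):
    """A deliberately poor candidate: the target moved up by 1."""
    return target(x) + 1.0

print(sum_of_error(target, points), hits(target, points))  # 0.0 21
print(hits(shifted, points))                               # 0
```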
Test Problems
● Parity Problem (PP)
  – Determine whether the number of 1's in a bit string is odd or even
  – Two parameters: odd/even and bit string length (n)
A problem with n = 10:
  0 1 0 1 0 0 1 1 1 1
Test Problems: PP
A solution tree
[Figure: a GP tree of AND, OR and NOT nodes over the inputs D0 D1 D2 D3 D4, evaluated on the Boolean vector 0 0 0 1 0; the 5-even parity output is false (0)]
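The target output of the parity problem can be written directly, which makes the two examples from the slides easy to check. The function name and the `kind` parameter are illustrative.

```python
def parity(bits, kind="even"):
    """True if the number of 1's in bits is even (kind='even') or odd (kind='odd')."""
    ones_are_odd = sum(bits) % 2 == 1
    return ones_are_odd if kind == "odd" else not ones_are_odd

# 5-even parity on the Boolean vector from the tree example: one 1, so false.
print(parity([0, 0, 0, 1, 0], "even"))                # False

# The n = 10 sample contains six 1's, so odd parity is false.
print(parity([0, 1, 0, 1, 0, 0, 1, 1, 1, 1], "odd"))  # False
```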
Results of MFP (r=2, n=9)
Results of SRP (x^6 − 2x^4 + x^2)
Results of PP (odd, n=10)
Summary of Results
Multi-objective Approach
● One run for COMO in MFP.
● OF drops when NDR rises.
● Why high NDR?
  – Duplicate solutions
  – Too many objectives
MO approach to improve WI
[Figure for MFP; label: MO-WS-WI]
[Figure for SRP; labels: MO-WS-WI, MO-WS-AI, WeiSum-AS-WI, MO-AS-WI, MO-AS-AI]
[Figure for PP; label: MO-AS-WI]
Conclusions
● MO2 approach with weighted informativeness (MO-AS-WI and MO-WS-WI) outperforms other evaluation methods in coevolution.
● MO1 approach does not work well because there are usually too many objectives. This shows up as a high NDR and results in a random search.
● Coevolution is efficient for the MFP and SRP.
Issues
● The test problems used are small, and there is no proof of generalizability to larger problems.
● Implication for statistical learning: select not only difficult but also informative data for training.
Questions?
Thank you!
Average Score
Rows: test cases; columns: solutions.

       S1   S2   S3   S4   S5   sum   AS
T1      0    1    0    0    1     2   0.4
T2      0    0    1    1    0     2   0.4
T3      0    1    1    0    0     2   0.4
T4      1    0    0    0    0     1   0.2
T5      1    0    1    0    0     2   0.4
sum     2    2    3    1    1
AS    0.4  0.4  0.6  0.2  0.2

Max(O(m),O(n))
Weighted Score
Rows: test cases; columns: solutions.

       S1   S2   S3   S4   S5   sum
T1      0    1    0    0    1     2
T2      0    0    1    1    0     2
T3      0    1    1    0    0     2
T4      1    0    0    0    0     1
T5      1    0    1    0    0     2
sum     2    2    3    1    1

Max(O(m),O(n))
Average Informativeness
Max(O(mn^2), O(nm^2))
Weighted Informativeness
Max(O(mn^2), O(nm^2))
MO
Max(O(mn^2), O(nm^2))