Some Comments on GD and IGD and Relations to the …taemo.gforge.inria.fr/2010/slides/04_lara.pdf1...


Some Comments on GD and IGD and Relations to the Hausdorff Distance

O. Schütze, X. Esquivel, A. Lara, C. Coello

CINVESTAV-IPN
Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional
Mexico City, Mexico


Outline

Introduction and Background
• Trade off for the design of indicators for the evaluation of MOEAs
• Metric / Hausdorff distance

Investigation of the Indicators
• GD
• IGD

A ‘New’ Indicator
• Metric properties
• Extension to continuous models


Multi-Objective Optimization

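Multi-Objective Optimization Problem:

$\min F = \begin{cases} f_1: Q \subset \mathbb{R}^n \to \mathbb{R} \\ \quad\vdots \\ f_k: Q \subset \mathbb{R}^n \to \mathbb{R} \end{cases} \qquad \text{(MOP)}$

PQ = set of optimal solutions (Pareto set)
F(PQ) = the image of PQ (Pareto front)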

[Figure: Pareto set in parameter space (f1, f2 plotted over x) and Pareto front in objective space (f1 vs. f2)]

First we consider discrete (or discretized) models, i.e., |Q|<∞.


Outliers in Stochastic Search Algorithms

Example: Consider the MOP

$F: [0,1]^n \to \mathbb{R}^k, \quad F(x) = \begin{pmatrix} x_1 \\ g(x) \end{pmatrix}$

where $g: [0,1]^n \to \mathbb{R}^{k-1}$ (→ Okabe, ZDT).

Assume a point x=(ε,z), z∈[0,1]^{n-1}, is a member of the archive/population. Further, assume that new candidate solutions are chosen uniformly at random from the domain.

Then the probability of finding a point that dominates x is less than ε (→ objective 1), since any dominating point must have a first coordinate of at most ε. The distance of x to PQ can be ‘large’.

[Figure: the point (ε,x2)]
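The probability bound above can be checked empirically. The following is a minimal sketch, assuming n = k = 2 and a hypothetical choice of g (the slide does not specify one); `F`, `dominates`, and `dominating_fraction` are illustrative names only.

```python
import random

def F(x):
    # Hypothetical instance of F(x) = (x_1, g(x)) with n = k = 2.
    # This g is an assumption for illustration, not the g used on the slide.
    x1, x2 = x
    return (x1, 1.0 - x1 ** 0.5 + x2)

def dominates(a, b):
    """True if objective vector a dominates b (minimization)."""
    return (all(ai <= bi for ai, bi in zip(a, b))
            and any(ai < bi for ai, bi in zip(a, b)))

def dominating_fraction(eps, z, samples=50_000, seed=0):
    """Estimate the probability that a uniform random point dominates x = (eps, z)."""
    rng = random.Random(seed)
    target = F((eps, z))
    hits = sum(
        dominates(F((rng.random(), rng.random())), target)
        for _ in range(samples)
    )
    return hits / samples

# The estimate should stay below eps, because a dominating point needs x_1 <= eps.
print(dominating_fraction(eps=0.05, z=0.9))
```

The point (0.05, 0.9) is far from the Pareto set of this assumed problem, yet it is rarely replaced under uniform sampling, which is how such outliers can survive in an archive.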


Example

P: hypothetical Pareto front

X1: perfect approximation of P, except for one outlier

X2: none of the elements is ‘near’ P

Question: Which approximation is ‘better’?

Extreme situations:

-- pessimistic view (Hausdorff distance): dH(X1,P)=9, dH(X2,P)=2.83

-- averaged result (Generational distance): GD(X1,P)=0.81, GD(X2,P)=2.83


Outlier Trade Off

Trade off for the indicator D when measuring results of MOEAs (the design of MOEAs is influenced by D):

Use of a Metric

+ Greedy search = shortest path to the set of interest (→ triangle inequality)

-- Penalization of single outliers of the candidate set

Averaging the Results

+ Single outliers do not have a major influence on the result

-- The greedy search is not necessarily the shortest path to the set of interest


Metric

Definition: Suppose X is a set and d: X×X → R is a function. Then d is called a metric on X if, and only if, for each a,b,c∈X:

(a) $d(a,b) \geq 0$ and $d(a,b) = 0 \Leftrightarrow a = b$ (Positive Property)
(b) $d(a,b) = d(b,a)$ (Symmetric Property)
(c) $d(a,c) \leq d(a,b) + d(b,c)$ (Triangle Inequality)

Variants:

-- d is called a semi-metric if properties (a) and (b) are satisfied

-- A pseudo-metric is a semi-metric that satisfies the relaxed triangle inequality:

$d(a,c) \leq \sigma \, (d(a,b) + d(b,c)), \quad \sigma \geq 1$


Hausdorff Distance

Definition: Let u,v∈R^n, A,B⊂R^n, and ||.|| be a vector norm. The Hausdorff distance dH is defined as follows:

(a) $dist(u,A) := \inf_{v \in A} \|u - v\|$
(b) $dist(B,A) := \sup_{u \in B} dist(u,A)$
(c) $d_H(A,B) := \max(dist(A,B), dist(B,A))$

[Figure: a point u and the sets A and B]

Remarks:

(i) dist(A,B) is not symmetric: if B is a proper subset of A, then dist(B,A)=0 and dist(A,B)>0.

(ii) dH is a metric on the set of discrete sets. It can also be used for continuous spaces. In that case, dH(A,B)=0 ⇔ clos(A)=clos(B).
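Definitions (a) to (c) translate directly into code for finite sets. This is a minimal sketch assuming the Euclidean norm and finite sets (so inf and sup become min and max); the function names and the sample sets are illustrative.

```python
import math

def dist_point_set(u, A):
    """dist(u, A): smallest Euclidean distance from u to an element of the finite set A."""
    return min(math.dist(u, v) for v in A)

def dist_set_set(B, A):
    """dist(B, A): largest dist(u, A) over all u in B."""
    return max(dist_point_set(u, A) for u in B)

def hausdorff(A, B):
    """d_H(A, B) = max(dist(A, B), dist(B, A))."""
    return max(dist_set_set(A, B), dist_set_set(B, A))

A = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
B = [(0.0, 0.0), (1.0, 0.0)]  # proper subset of A

print(dist_set_set(B, A))  # 0.0: every point of B also lies in A
print(dist_set_set(A, B))  # 3.0: (4, 0) is 3 away from its nearest point in B
print(hausdorff(A, B))     # 3.0
```

The two one-sided values illustrate remark (i): dist is not symmetric, and taking the maximum in (c) restores symmetry.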


Discussion of GD (1)

GD as proposed by Van Veldhuizen, applied to general finite sets X, Y⊂R^k using dist:

$GD(X,Y) = \frac{1}{|X|} \left( \sum_{i=1}^{|X|} dist(x_i, Y)^p \right)^{1/p}$

Metric properties:

-- positive property: NO. It holds that GD(X,Y)=0 ⇔ X⊂Y (X can be a proper subset of Y (*))

-- symmetric property: NO. In case (*), GD(X,Y)=0 but GD(Y,X)>0

-- triangle inequality: NO (→ next slide)
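The failure of the positive and symmetric properties can be seen on a small example. This is a minimal sketch assuming the Euclidean norm and p = 2; the sets X and Y are made up for illustration.

```python
import math

def dist(u, A):
    """Smallest Euclidean distance from u to an element of the finite set A."""
    return min(math.dist(u, v) for v in A)

def gd(X, Y, p=2):
    """GD(X, Y) = (1/|X|) * (sum_i dist(x_i, Y)^p)^(1/p), as defined on this slide."""
    return sum(dist(x, Y) ** p for x in X) ** (1.0 / p) / len(X)

Y = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
X = Y[:2]  # proper subset of Y, i.e. case (*)

print(gd(X, Y))  # 0.0 although X != Y
print(gd(Y, X))  # positive, so GD is not symmetric
```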


Discussion of GD (2)

1.) Normalization strategy of GD: Let A1={a} with dist(F(a),F(PQ))=1, i.e., GD(F(A1),F(PQ))=1.

Now let An be the multiset consisting of n copies of a, An={a,…,a}, then

$GD(F(A_n), F(P_Q)) = \frac{\|(1,\ldots,1)^T\|_p}{n} = \frac{\sqrt[p]{n}}{n} \to 0$ for n → ∞ (p > 1)

2.) Investigate (relaxed) triangle inequality: let X,Z⊂R^k s.t. GD(X,Z)>0. Let

rhs(Y) := GD(X,Y)+GD(Y,Z)

and define Yn := X ∪ {y1,y2,…,yn}

such that Σi dist(yi,Z) < ∞. Then GD(X,Yn)=0 and GD(Yn,Z) → 0 for n → ∞.
⇒ GD does not satisfy any relaxed triangle inequality since rhs(Yn) → 0.

Note: for p>1, any set {y1,..,yn}⊂F(Q) (if compact) can be taken!!
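Both effects on this slide can be reproduced numerically. The sketch below assumes the Euclidean norm and p = 2, and uses made-up points; in particular, the sequence y_i = (1, 2^-i) is a hypothetical choice whose distances to Z are summable.

```python
import math

def dist(u, A):
    """Smallest Euclidean distance from u to an element of the finite set A."""
    return min(math.dist(u, v) for v in A)

def gd(X, Y, p=2):
    """GD(X, Y) = (1/|X|) * (sum_i dist(x_i, Y)^p)^(1/p)."""
    return sum(dist(x, Y) ** p for x in X) ** (1.0 / p) / len(X)

# 1.) n copies of a point at distance 1 from the reference set P.
P = [(0.0, 0.0)]
for n in (1, 10, 100):
    A_n = [(1.0, 0.0)] * n
    print(n, gd(A_n, P))  # equals n**(1/p) / n and shrinks with n

# 2.) Y_n = X ∪ {y_1, ..., y_n}, with sum_i dist(y_i, Z) finite.
X, Z = [(0.0, 0.0)], [(1.0, 0.0)]
for n in (1, 10, 100):
    Y_n = X + [(1.0, 2.0 ** -i) for i in range(1, n + 1)]
    rhs = gd(X, Y_n) + gd(Y_n, Z)
    print(n, gd(X, Z), rhs)  # GD(X, Z) stays at 1 while rhs shrinks
```

Because rhs can be made arbitrarily small while GD(X,Z) stays fixed, no constant σ can rescue the inequality.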


New Variant of GD

Nearby modification: take the power mean of the distances:

$GD_p(X,Y) = \left( \frac{1}{|X|} \sum_{i=1}^{|X|} dist(x_i, Y)^p \right)^{1/p} = \frac{1}{\sqrt[p]{|X|}} \left( \sum_{i=1}^{|X|} dist(x_i, Y)^p \right)^{1/p}$

-- same (poor) metric properties, but

-- better averaging: GDp(F(An),F(PQ))=1 for all n∈N

-- (needed for the upcoming indicator)
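The improved averaging is easy to confirm on the multiset example. A minimal sketch, assuming the Euclidean norm and p = 2; the points are illustrative.

```python
import math

def dist(u, A):
    """Smallest Euclidean distance from u to an element of the finite set A."""
    return min(math.dist(u, v) for v in A)

def gd_p(X, Y, p=2):
    """GD_p(X, Y) = ((1/|X|) * sum_i dist(x_i, Y)^p)^(1/p): power mean of the distances."""
    return (sum(dist(x, Y) ** p for x in X) / len(X)) ** (1.0 / p)

P = [(0.0, 0.0)]
for n in (1, 10, 100):
    A_n = [(1.0, 0.0)] * n  # n copies of a point at distance 1 from P
    print(n, gd_p(A_n, P))  # stays at 1.0 for every n
```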


Discussion IGD

IGD as proposed by Coello & Cruz, applied to general finite sets X, Y⊂R^k using dist:

$IGD(X,Y) = \frac{1}{|Y|} \left( \sum_{i=1}^{|Y|} dist(y_i, X)^p \right)^{1/p}$

-- same metric properties as GD since IGD(A,B) = GD(B,A)

-- same modification: take the power mean of the distances:

$IGD_p(X,Y) = \left( \frac{1}{|Y|} \sum_{i=1}^{|Y|} dist(y_i, X)^p \right)^{1/p} = \frac{1}{\sqrt[p]{|Y|}} \left( \sum_{i=1}^{|Y|} dist(y_i, X)^p \right)^{1/p}$


A “New” Indicator

Observation: GD(X,Y) is an ‘averaged version’ of dist(X,Y), same for IGD → combine GD and IGD as for dH:

$\Delta_p(X,Y) = \max\left( GD_p(X,Y), IGD_p(X,Y) \right)$

Proposition 1: ∆p is a semi-metric for 1≤p<∞ and a metric for p=∞

Remark: for p=∞ the indicator ∆p coincides with dH

Proposition 2: let |X|,|Y|,|Z|≤N, then

$\Delta_p(X,Z) \leq \sqrt[p]{N} \left( \Delta_p(X,Y) + \Delta_p(Y,Z) \right)$
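A compact way to experiment with the combined indicator is sketched below. It assumes the Euclidean norm, treats p=∞ as the maximum, and uses hypothetical data: P is 11 points on a line and X1 equals P except that one point is replaced by an outlier at distance 9. These sets are not the ones plotted in the talk.

```python
import math

def dist(u, A):
    """Smallest Euclidean distance from u to an element of the finite set A."""
    return min(math.dist(u, v) for v in A)

def power_mean(values, p):
    """Power mean of non-negative values; p = inf gives the maximum."""
    if math.isinf(p):
        return max(values)
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)

def gd_p(X, Y, p):
    return power_mean([dist(x, Y) for x in X], p)

def igd_p(X, Y, p):
    return power_mean([dist(y, X) for y in Y], p)

def delta_p(X, Y, p):
    """Delta_p(X, Y) = max(GD_p(X, Y), IGD_p(X, Y))."""
    return max(gd_p(X, Y, p), igd_p(X, Y, p))

# Hypothetical data (assumption): reference set and an approximation with one outlier.
P = [(float(i), 0.0) for i in range(11)]
X1 = P[:10] + [(10.0, 9.0)]

for p in (1, 2, math.inf):
    print(p, delta_p(X1, P, p))
```

For this made-up data the values are 9/11, 9/√11, and 9 for p = 1, 2, ∞: a small p averages the outlier away, while p=∞ reports the Hausdorff distance.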


Interpretation of p for the Trade Off

         p=1     p=2     p=5     p=10    p=∞
N=2      0.541   0.15    0.026   0.008   0
N=4      0.249   0.06    0.019   0.009   0
N=6      0.105   0.033   0.008   0.003   0
N=10     0.02    0.004   0.002   0.001   0
N=100    0       0       0       0       0

Table: Percentage of triangle violations (σ=1) for different values of p. Here, we have taken 100,000 different sets A,B,C with |A|,|B|,|C|=N, k=2, each entry randomly chosen within [0,1].

The larger the value of p, the ‘nearer’ ∆p is to a metric (but: how to choose p? what is the influence of N?)
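An experiment of the kind summarized in the table could be set up as follows. This is only a sketch: the sampling scheme, the seed, the number of trials, and the floating-point tolerance `tol` are assumptions, so the numbers it prints are not expected to match the table exactly.

```python
import math
import random

def dist(u, A):
    return min(math.dist(u, v) for v in A)

def power_mean(values, p):
    if math.isinf(p):
        return max(values)
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)

def delta_p(X, Y, p):
    gd = power_mean([dist(x, Y) for x in X], p)
    igd = power_mean([dist(y, X) for y in Y], p)
    return max(gd, igd)

def violation_rate(N, p, trials=2000, k=2, seed=0, tol=1e-12):
    """Fraction of random triples (A, B, C) violating the triangle inequality (sigma = 1)."""
    rng = random.Random(seed)

    def random_set():
        return [tuple(rng.random() for _ in range(k)) for _ in range(N)]

    violations = 0
    for _ in range(trials):
        A, B, C = random_set(), random_set(), random_set()
        if delta_p(A, C, p) > delta_p(A, B, p) + delta_p(B, C, p) + tol:
            violations += 1
    return violations / trials

for p in (1, 2, math.inf):
    # Multiply by 100 to compare with the percentages in the table.
    print(p, 100.0 * violation_rate(N=2, p=p))
```

For p=∞ the rate should be zero up to rounding, since ∆p then coincides with the Hausdorff distance.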


Example

P: hypothetical Pareto front

X1: perfect approximation of P, except for one outlier

X2: none of the elements is ‘near’ P

Question: Which approximation is ‘better’?

             p=1      p=2     p=5     p=10    p=∞
∆p(P,X1)     0.8182   2.714   4.047   5.571   9
∆p(P,X2)     2.828    2.828   2.828   2.828   2.828


Extension to Continuous Models

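Now consider continuous models

In general: k objectives → PQ is (k-1)-dimensional

GDp: A finite, PQ compact → GD turns into a continuous SOP

IGDp: PQ continuous → the power mean of IGDp turns into an integral.

Example: k=2, F(PQ) connected and parametrized by $\gamma: [m_1, M_1] \to \mathbb{R}^2$, then

$IGD_p(F(A), F(P_Q)) = \left( \frac{1}{M_1 - m_1} \int_{m_1}^{M_1} dist(\gamma(t), F(A))^p \, dt \right)^{1/p}$

[Figure: Pareto front in objective space (f1 vs. f2), with bounds m1, M1 on f1 and m2, M2 on f2]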
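The integral form of IGDp for k=2 can be approximated by numerical quadrature. The sketch below assumes a ZDT1-like front γ(t) = (t, 1 − √t) on [m1, M1] = [0, 1], the Euclidean norm, and a midpoint rule; `gamma`, `igd_p_continuous`, and the archive `FA` are hypothetical.

```python
import math

def dist(u, A):
    """Smallest Euclidean distance from u to an element of the finite set A."""
    return min(math.dist(u, v) for v in A)

def gamma(t):
    # Assumed parametrization of a connected front: f2 = 1 - sqrt(f1).
    return (t, 1.0 - math.sqrt(t))

def igd_p_continuous(FA, p=2, m1=0.0, M1=1.0, steps=2000):
    """Midpoint-rule approximation of
    (1/(M1 - m1) * integral_{m1}^{M1} dist(gamma(t), FA)^p dt)^(1/p)."""
    h = (M1 - m1) / steps
    total = sum(dist(gamma(m1 + (j + 0.5) * h), FA) ** p for j in range(steps))
    return (total * h / (M1 - m1)) ** (1.0 / p)

# Hypothetical archive image: five points lying on the front.
FA = [gamma(t) for t in (0.0, 0.25, 0.5, 0.75, 1.0)]
print(igd_p_continuous(FA))
```

Even an archive that lies entirely on the front gets a positive value here, because the integral also measures how well the whole front is covered.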


Discretization of F(PQ)

Task: PQ given analytically, compute an approximation Y of F(PQ) with dH(Y,F(PQ))<δ (a priori defined approximation quality)

For k=2: use continuation-like methods: select the step size t such that ||F(x+tv)-F(x)||∞ ≈ Θδ, where Θ<1 is a safety factor (selection of t based on Lipschitz estimations)
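A simplified version of this step-size control is sketched below. It is not the continuation method from the talk: it assumes the front is already available as a parametrized curve `gamma` (a ZDT1-like stand-in) and replaces the Lipschitz-based step selection with plain step halving; `theta` plays the role of Θ and `h0` is a hypothetical initial step.

```python
import math

def gamma(t):
    # Assumed parametrization of the front for k = 2; stands in for F along the Pareto set.
    return (t, 1.0 - math.sqrt(t))

def max_norm(u, v):
    """Infinity-norm distance between two points."""
    return max(abs(a - b) for a, b in zip(u, v))

def discretize(delta, theta=0.9, a=0.0, b=1.0, h0=0.1):
    """Walk along gamma, halving the step until consecutive samples
    are at most theta * delta apart in the infinity norm."""
    ts = [a]
    while ts[-1] < b:
        t = ts[-1]
        h = min(h0, b - t)
        while max_norm(gamma(t + h), gamma(t)) > theta * delta:
            h /= 2.0
        ts.append(min(b, t + h))
    return [gamma(t) for t in ts]

coarse = discretize(0.4)
fine = discretize(0.01)
print(len(coarse), len(fine))
```

For a front that is monotone in both objectives, as assumed here, samples spaced this way stay within δ of every point on the curve; for general curves the gap between samples needs an additional bound, which is where the Lipschitz estimates come in.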

[Figure: discretizations of the Pareto front (PF) of OKA2 in objective space (f1 vs. f2), for δ=0.01 and δ=0.4]


Numerical Example

NSGA-II applied to ZDT1

[Figure: populations pop1, …, pop7 and the Pareto front in objective space]

Y = F(PQ), Yi = F(popi)

∆2(Y1,Y)=3.03
∆2(Y2,Y)=2.71
∆2(Y3,Y)=1.43
∆2(Y4,Y)=0.77
∆2(Y5,Y)=0.31
∆2(Y6,Y)=0.12
∆2(Y7,Y)=0.007


Discussion

Conclusions
• New indicator ∆p proposed for the evaluation of MOEAs.
• ∆p is a semi-metric, and a pseudo-metric for bounded archive sizes
• p can (in principle) be used to handle the ‘outlier trade off’

Open Questions
• How to choose p?
• How to measure the distance to a metric?
• How to adapt the selection mechanisms in order to improve ∆p? (∆p is NOT compliant with the dominance relation!)


Thank you for your attention!