
A stochastic global optimization method for (urban?) multi-parameter systems

Rodolphe Le Riche
CNRS and Ecole des Mines de St-Etienne

Purpose of this presentation

Object of the discussion : cities as entities for heating consumption and the related public policies.

Cities are analyzed, modeled and optimized for heat consumption (including the effects of solar passive input).

Cities are objects that are described at different parameterization levels : entire city (coarser level), districts, buildings (finer level).

In mechanical engineering, multiple levels of parameters are often associated with composite materials.

Talk : describe an idea for multi-level optimization that emerged in composite design. Any use for cities ?

Notation and application example
Composite design at various scales

Three description scales : macro ; meso, $v$ (plate stiffnesses) ; micro, $x$ (fiber positions).

$y(x)$ : numerical simulator of the structure (stress, strains, strength, mass, ...)

$\min_{x \in S} f(y(x))$

$v(x)$ : map from the fiber positions $x$ to the plate stiffnesses $v$.

v(x) is numerically inexpensive. x(v) does not exist : there are many fiber positions for one choice of plate stiffnesses.

Notation and application example
Heating consumption at various scales

Three description scales : macro = city ; meso, $v$ = « IRIS » (district) ; micro, $x$ = building.

IRIS : Ilôt regroupé pour l'information statistique ("block grouped for statistical information", the French statistical district unit).

$y(x)$ : numerical simulator of the structure (solar energy, thermal simulation)

$\min_{x \in S} f(y(x))$

Scope of application : global, derivative-free problems

$\min_{x \in S} f(x)$

[Figure: a 1D objective function $f$ with a local optimum $x_l$ (in its neighborhood $V(x_l)$) and the global optimum $x^*$.]

Here, we focus on

$S \subset \mathbb{R}^n$ or $\mathbb{N}^n$ or $\mathbb{R}^{n_1} \times \mathbb{N}^{n_2}$

i.e., continuous, integer, or mixed optimization.

Flow chart of a general stochastic optimizer

● Initialize t and $p_t(x)$

● Sample $x_{t+1} \sim p_t(x)$

● Calculate $f(x_{t+1})$

● Update the distribution :
$p_{t+1}(x) = \mathrm{Update}(x_1, f(x_1), \ldots, x_{t+1}, f(x_{t+1}))$, or more often $p_{t+1}(x) = \mathrm{Update}(p_t(x), x_{t+1}, f(x_{t+1}))$

● Stop, or [ t = t+1 and go back to Sample ]

with different p's if x is continuous or discrete or mixed.
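As an illustration, a minimal Python sketch of this loop; the sampler and update rule are placeholders to be supplied by each specific algorithm, and all names here are illustrative, not from the slides:

import numpy as np

def stochastic_optimize(f, sample, update, p0, t_max=100):
    # Generic stochastic optimizer: sample x_{t+1} ~ p_t, evaluate f, update p_t.
    p = p0
    best_x, best_f = None, np.inf
    for t in range(t_max):
        x = sample(p)           # Sample x_{t+1} ~ p_t(x)
        fx = f(x)               # Calculate f(x_{t+1})
        if fx < best_f:
            best_x, best_f = x, fx
        p = update(p, x, fx)    # p_{t+1} = Update(p_t(x), x_{t+1}, f(x_{t+1}))
    return best_x, best_f

# Degenerate example: pure random search, where p_t stays a fixed uniform box
rng = np.random.default_rng(0)
box = (np.array([-5.0, -5.0]), np.array([5.0, 5.0]))
x_best, f_best = stochastic_optimize(
    f=lambda x: np.sum(x**2),
    sample=lambda p: p[0] + rng.random(2) * (p[1] - p[0]),
    update=lambda p, x, fx: p,    # no learning in this degenerate case
    p0=box)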

A simple example in $\mathbb{R}^n$ : the (1+1)-ES

Initializations : t = 0, x, f(x), m, C, $t_{max}$

While t < $t_{max}$ do
    Sample x' ~ N(m, C)
    Calculate f(x'), t = t + 1
    If f(x') < f(x) then x = x', f(x) = f(x') Endif
    Update m (e.g., m = x) and C
End while

N(m, C) : normal law, with $C = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \sigma_3^2, \ldots)$ for variables seen as independent.
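A minimal Python sketch of this (1+1)-ES, assuming an isotropic covariance C = σ²I adapted with the classical 1/5th success rule; that adaptation rule is one common choice, the slide leaves "Update m and C" open:

import numpy as np

def es_1p1(f, x0, sigma=1.0, t_max=1000, seed=0):
    # (1+1)-ES: sample x' ~ N(m, sigma^2 I), keep it only if it improves.
    rng = np.random.default_rng(seed)
    m = np.asarray(x0, dtype=float)
    fm = f(m)
    for t in range(t_max):
        x = m + sigma * rng.standard_normal(m.size)  # sample x' ~ N(m, C)
        fx = f(x)
        if fx < fm:                  # accept the improvement, m = x
            m, fm = x, fx
            sigma *= 1.5             # success: increase the step size
        else:
            sigma *= 1.5 ** (-0.25)  # failure: decrease it slowly (1/5th rule)
    return m, fm

# e.g., es_1p1(lambda x: np.sum(x**2), x0=[3.0, -2.0])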

Illustration : adaptation of a 2D Gaussian with the (1+1)-ES

Discrete variables : the Univariate Marginal Distribution Algorithm (UMDA)

( Baluja 1994 – as PBIL – and Mühlenbein 1996)

$x \in S \equiv \{1, 2, \ldots, A\}^n$ (alphabet of cardinality A)

e.g., $\{-45^\circ, 0^\circ, 45^\circ, 90^\circ\}^n$ (fiber orientations)

e.g., $\{\mathrm{matl}_1, \ldots, \mathrm{matl}_A\}^n$ (material choice)

The algorithm is that of a population-based stochastic optimizer (as before, but with many x's at each iteration), with a different sampling and updating of $p_t$. $p_t$ assumes that the variables are independent (dropping t) :

$p(x) = \prod_{i=1}^{n} p_i(x_i)$, with $\sum_{j=1}^{A} p_{ij} = 1$ for each i.

[Figure: bar chart of a marginal $p_i(x_i)$ over the alphabet $1, 2, \ldots, A$, with bar heights $p_{i1}, p_{i2}, \ldots, p_{iA}$.]

UMDA (2)

Sampling : for $i = 1, \ldots, n$, draw $u_i \sim U[0,1]$ and set

If $0 \le u_i \le p_{i1}$, then $x_i = 1$

If $\sum_{j=1}^{k-1} p_{ij} \le u_i < \sum_{j=1}^{k} p_{ij}$, then $x_i = k$

If $\sum_{j=1}^{A-1} p_{ij} \le u_i \le 1$, then $x_i = A$

[Figure: the unit interval of u, split at $0, p_1, p_1 + p_2, \ldots, 1$ into cells labeled $1, 2, \ldots, k, \ldots, A$.]

Learning : select the $\mu$ best points out of $\lambda$, with $f(x^{1:\lambda}) \le f(x^{2:\lambda}) \le \cdots \le f(x^{\lambda:\lambda})$,

$p_{ij} = \frac{1}{\mu} \sum_{k=1}^{\mu} I(x_i^{k:\lambda} = j)$, where $I(x_i^k = j) = 1$ if $x_i^k = j$, and $0$ otherwise.

$p_{ij}$ is the frequency of value j at position i among the $\mu$ best points.

$p_{ij} \ge p_{\min} > 0$ (a minimum frequency, imposed for ergodicity).
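A numpy sketch of one UMDA iteration along these lines, using 0-based symbols 0, ..., A−1 instead of the slide's 1, ..., A; the value of the minimum frequency p_min is an illustration parameter:

import numpy as np

def umda_step(f, P, lam, mu, p_min=0.05, rng=None):
    # One UMDA iteration. P is an (n, A) matrix of marginals p_ij (rows sum to 1).
    rng = rng or np.random.default_rng()
    n, A = P.shape
    # Sampling: inverse-CDF per variable -- x_i = k when the cumulative
    # probability up to k-1 is <= u_i < cumulative probability up to k
    cum = np.cumsum(P, axis=1)
    u = rng.random((lam, n))
    X = (u[:, :, None] >= cum[None, :, :]).sum(axis=2)
    X = np.minimum(X, A - 1)           # guard against float round-off
    # Learning: frequencies of each symbol among the mu best points
    F = np.array([f(x) for x in X])
    best = X[np.argsort(F)[:mu]]
    for i in range(n):
        P[i] = np.bincount(best[:, i], minlength=A) / mu
    P = np.maximum(P, p_min)           # minimum frequency, for ergodicity
    P /= P.sum(axis=1, keepdims=True)  # renormalize each row
    return P, X[np.argmin(F)], F.min()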

Application to composite design for frequency

[Figure: the density learned by UMDA (2D) vs. the contour lines of the penalized objective function.]

Independent densities can represent neither curvatures nor couplings between variables.

( from Grosset, L., Le Riche, R. and Haftka, R.T., A double-distribution statistical algorithm for composite laminate optimization, SMO, 2006 )

Stochastic discrete optimization : learning the dependencies between variables

More sophisticated discrete optimization methods attempt to learn the couplings between variables. For example, with pairwise dependencies :

A chain of pairwise dependencies, $X_{1:n} \to X_{2:n} \to \cdots \to X_{n:n}$ :

$p(x) = p(x_{1:n})\, p(x_{2:n} \mid x_{1:n}) \cdots p(x_{n:n} \mid x_{n-1:n})$

Trade-off : richer probabilistic structures better capture the objective function landscape but they also have more parameters → need more f evaluations to be learned.

MIMIC ( Mutual Information Maximizing Input Clustering ) algorithm : De Bonet, Isbell, and Viola, 1997.
BMDA ( Bivariate Marginal Distribution Algorithm ) : Pelikan and Mühlenbein, 1999.
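For illustration, a short Python sketch of sampling from such a chain, assuming the first marginal and the pairwise conditional tables have already been learned (e.g., from mutual information, as in MIMIC); names and shapes are illustrative:

import numpy as np

def sample_chain(p_first, cond, rng=None):
    # Draw x from p(x) = p(x_1) p(x_2|x_1) ... p(x_n|x_{n-1}).
    # p_first: (A,) marginal of the first variable in the chain;
    # cond[k]: (A, A) table whose row a is p(next variable | previous = a).
    rng = rng or np.random.default_rng()
    x = [rng.choice(len(p_first), p=p_first)]
    for T in cond:
        x.append(rng.choice(T.shape[1], p=T[x[-1]]))
    return np.array(x)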

Multi-level parameter optimization with DDOA

( from Grosset, L., Le Riche, R. and Haftka, R.T., A double-distribution statistical algorithm for composite laminate optimization, SMO, 2006 )

Mathematical motivation : create couplings between variables using many independent distributions (in x and v spaces).

Numerical motivation : take into account expert knowledge in the optimization to improve efficiency. E.g. in composites, the lamination parameters v (the plate stiffnesses) make physical sense.

Example in composites : use of the lamination parameters

v = lamination parameters = geometric contribution of the plies to the stiffness.

Inexpensive to calculate from x (fiber angles).

Simplifications : fewer v's than fiber angles. Often, the v's are taken as continuous.

But f(v) typically does not exist (e.g., ply failure criterion).

( Liu, Haftka, and Akgün, « Two-level composite wing structural optimization using response surfaces », 2000.
Merval, Samuelides, and Grihon, « Lagrange-Kuhn-Tucker coordination for multilevel optimization of aeronautical structures », 2008. )

Past examples in composites

Initial problem : optimize a composite structure made of several assembled panels by changing each ply orientation
→ many discrete variables

Decomposed problem :

Structure level : optimize the composite structure made of several assembled panels by changing the lamination parameters of each panel
→ few continuous variables

Laminate level : minimize the distance to the target (optimal) lamination parameters by changing the ply orientations
→ few discrete variables

BUT, for such a sequential approach to make sense, $\hat{f}(v)$ must exist and must guide the search toward the optimal regions (i.e., this prohibits the emergence of solutions at the finer scales).

[Figure: the objective function seen from the x and v spaces, with the density p(x), the inexpensive map v(x), and $\hat{f}(v)$; the converse map from v does not exist.]

The DDOA stochastic optimization algorithm

$p_{DDOA}(x) = p_{X \mid v(X) = V}(x)$

Since v(x) is inexpensive, learning densities in the x AND v spaces at the same time comes at virtually no extra cost.

The DDOA algorithm : X | v(X)=V ?

Simple mathematical illustration :

$p_X(x) = \frac{1}{2\pi} \exp\left(-\frac{1}{2} x^T x\right)$ ,  $p_V(v) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}(v-1)^2\right)$

$v = x_1 + x_2$

$p_{X \mid v(X) = 1}(x)$ is a degenerate Gaussian along $x_1 + x_2 = 1$

(the cross-section of the 2D bell curve along the blue line, plus normalization)

Intermediate step for a given v :

The DDOA algorithm : X | v(X)=V ?

$p_{X \mid v(X) = V}(x) = p_{X \mid v(X) = v}(x)\; p_V(v(x)) = \ldots = \frac{1}{2\pi} \exp\left(-\frac{1}{4}(x_1 - x_2)^2 - \frac{1}{2}(x_1 + x_2 - 1)^2\right)$

$p_{X \mid v(X) = V}(x)$ is a coupled distribution that merges the effects of X and V.

Analytical calculation in the Gaussian case. In practice, use simulations ...
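A possible simulation check of this Gaussian example, anticipating the flow chart of the next slide: draw v_target from p_V, draw many candidates from p_X, and keep the candidate whose v(x) is closest to v_target. The empirical moments then approach those of the analytic coupled density (ρ and N are illustration parameters, not from the slides):

import numpy as np

rng = np.random.default_rng(0)
rho, N = 200, 20_000
samples = np.empty((N, 2))
for k in range(N):
    v_t = 1.0 + rng.standard_normal()              # v_target ~ p_V = N(1, 1)
    cand = rng.standard_normal((rho, 2))           # rho candidates ~ p_X = N(0, I)
    i = np.argmin(np.abs(cand.sum(axis=1) - v_t))  # keep the closest in v-space
    samples[k] = cand[i]
print(samples.mean(axis=0))                   # close to (0.5, 0.5)
print(np.var(samples.sum(axis=1)))            # close to 1 : the (x1+x2-1)^2/2 term
print(np.var(samples[:, 0] - samples[:, 1]))  # close to 2 : the (x1-x2)^2/4 term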

The DDOA algorithm (flow chart)

Choose λ, μ, ρ such that ρ >> 1 and λ > μ
Initialize $p_V(v)$ and $p_X(x)$

For i = 1, ..., λ do
    Sample $v^{target}$ from $p_V(v)$
    Sample ρ (>> 1) x's from $p_X(x)$
    $x^{(i)}$ = the sampled x whose v(x) is closest to $v^{target}$   ← this implements the sampling of X | v(X) = $v^{target}$
    Calculate $f(x^{(i)})$
End for

Rank the proposed points $x^{1:\lambda}, \ldots, x^{\lambda:\lambda}$
Update $p_V(v)$ and $p_X(x)$ from $x^{1:\lambda}, \ldots, x^{\mu:\lambda}$

Stop ? If no, go back to the top ...
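A Python sketch of one iteration of this flow chart; the callables sample_x, sample_v, v_of_x, and update are placeholders for the problem-specific densities and the inexpensive v(x) map:

import numpy as np

def ddoa_iteration(f, v_of_x, sample_x, sample_v, update, lam, mu, rho):
    # One DDOA iteration: lambda points, each obtained by v-space matching.
    pop, scores = [], []
    for i in range(lam):
        v_target = sample_v()                    # v_target ~ p_V(v)
        cand = [sample_x() for _ in range(rho)]  # rho >> 1 draws from p_X(x)
        x = min(cand, key=lambda c: np.linalg.norm(
            np.atleast_1d(v_of_x(c)) - v_target))
        pop.append(x)                            # approx. X | v(X) = v_target
        scores.append(f(x))
    order = np.argsort(scores)                   # rank the lambda points
    update([pop[j] for j in order[:mu]])         # re-learn p_V and p_X from the mu best
    return pop[order[0]], scores[order[0]]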

● $p_X(x)$ and $p_V(v)$ can be simple densities, without couplings between variables (→ easy to learn), yet $p_{DDOA}(x)$ is a coupled density.

[Figure: f(x) and the selected points; the independent density $p_X(x) = \prod_{i=1}^{n} p_i(x_i)$; the coupled density $p_{DDOA}(x)$.]

● One half of the algorithm searches in a low-dimensional space.

Application of DDOA to composite design for frequency

Additional slides

Introduction to stochastic optimization

Random numbers are versatile search engines (they work in $\mathbb{R}^n$ and/or $\mathbb{N}^n$). They can also yield efficient methods.

Let pt(x) denote the probability density function of x at iteration t (e.g., after t evaluations of f). It represents the belief at t that the optimum x* is at x.

How to « sample $p_t(x)$ » once (Scilab notation) ?

● if x is uniform between m and M, X ~ U[m, M], call
x = m + rand(n,1).*(M-m)

● if x is (multi-)Gaussian with mean m and covariance matrix C, X ~ N(m, C), call
x = m + grand(1,'mn',zeros(n,1),C)
(the mean argument of grand must be an n×1 column vector, hence zeros(n,1))
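For readers without Scilab, equivalent numpy calls (a sketch; n, m, M, and C are example placeholders):

import numpy as np

rng = np.random.default_rng()
n = 3
m, M = np.zeros(n), np.ones(n)   # example bounds
C = np.diag([0.1, 0.2, 0.3])     # example diagonal covariance

x_unif = m + rng.random(n) * (M - m)                   # X ~ U[m, M]
x_gauss = m + rng.multivariate_normal(np.zeros(n), C)  # X ~ N(m, C)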

Application to a laminate frequency problem (1)

$\max_x f_1(x_1, \ldots, x_{15})$, the first eigenfrequency of a simply supported plate,
such that $0.48 \le \mathrm{eff}(x) \le 0.52$,
where $x_i \in \{0^\circ, 15^\circ, \ldots, 90^\circ\}$

The constraint is enforced by a penalty and creates a narrow ridge in the design space.

( from Grosset, L., Le Riche, R. and Haftka, R.T., A double-distribution statistical algorithm for composite laminate optimization, SMO, 2006 )

Application to a laminate frequency problem (2)

Compare UMDA to a GA (genetic algorithm) and SHC (Stochastic Hill Climber)

Reliability = probability of finding the optimum at a given cost.

UMDA performs fairly well on this problem.

Optimum : $[90^\circ_4 / {\pm}75^\circ / {\pm}60^\circ_2 / {\pm}45^\circ_5 / {\pm}30^\circ_5]_s$
