A stochastic global optimization method for (urban?) multi ......Introduction to stochastic...
Transcript of A stochastic global optimization method for (urban?) multi ......Introduction to stochastic...
A stochastic global optimization method for (urban?) multi-parameter systems
Rodolphe Le RicheCNRS and Ecole des Mines de St-Etienne
Purpose of this presentation
Discussion object : cities as an entity for heating consumption and related public policies.
Cities are analyzed, modeled and optimized for heat consumption (including the effects of solar passive input).
Cities are objects that are described at different parameterization levels : entire city (coarser level), districts, buildings (finer level).
In mechanical engineering, multi-levels of parameters are often associated to composite materials.
Talk : describe an idea for multi-level optimization that emerged in composite design. Any use for cities ?
Notation and application exampleComposite design at various scales
macro
meso , v
micro , x
y(x) numerical simulator of the
structure (stress, strains, strength,
mass, ...)
minx∈S
f ( y (x))
( fiber positions )( fiber positions )
( plate stiffnesses )
v(x)
v(x) is numerically inexpensive.x(v) doesn't exist : there are many fiber positions for one choice of plate stiffnesses.
Notation and application exampleHeating consumption at various scales
macro
meso , v
micro , x
y(x) numerical simulator of the structure (solar energy, thermal
simulation)
minx∈S
f ( y (x))
« IRIS » (district)
IRIS : Ilôt regroupé pour l'information statistique
building
city
Scope of application : global, derivative free problems
minx∈S
f x
localoptimum
x*xl
V x l
f
global optimum
Here, we focus on
S ⊂ ℝn or ℕn or {ℝn1∪ℕn2 }
i.e., continuous or integer or mixte optimization.
Flow chart of a general stochastic optimizer
● Initialize t and pt(x)
● Sample xt+1 ~ pt(x)
● Calculate f(xt+1)
● Update the distribution
pt+1(x) = Update( x1, f(x1), … , xt+1, f(xt+1) )or more often pt+1(x) = Update( pt(x) , xt+1, f(xt+1) )
● Stop or [ t = t+1 and go back to Sample ]
with different p's if x is continuous or discrete or mixed.
A simple example in Rn : ES-(1+1)
Initializations : x, f(x), m, C, tmax
.
While t < tmax
do,
Sample N(m,C) --> x'Calculate f(x') , t = t+1If f(x')<f(x), x = x' , f(x) = f(x') EndifUpdate m (e.g., m=x) and C
End while
N(m,C)
Normal law
C = [σ1
2 0 ...
0 σ22 0...
0 0 σ32 ...] for variables seen as independent.
Illustration : adaptation of 2D Gaussian with ES-(1+1)
Discrete variables : The Univariate Marginal Density Algorithm (UMDA)
( Baluja 1994 – as PBIL – and Mühlenbein 1996)
x∈S ≡ {1,2 , , A}n (alphabet of cardinality A )e.g. {−45o , 0o , 45o , 90o
}n (fiber orientations)
e.g. {matl1 , , matlA}n (material choice)
The algorithm is that of a population based stochastic optimization (see before + many x's at each iteration) with different sampling and updating of pt.pt assumes that the variables are independent (drop t ),
p x =∏i=1
n
pi xi
1 2 ... A0
0.2
0.4
0.6
pi
xi
piA
pi1
pi2
∑j=1
A
pij= 1
UMDA (2)
For i=1, n ui ~ U [0,1]
If 0 ≤ ui ≤ pi1 ⇒ xi=1
If ∑j=1
k
pij≤ ui ∑
j=1
k1
pij
⇒ xi=k
If ∑j=1
A−1
pij≤ ui ≤ 1 ⇒ xi=A
Sampling :
Learning :
u
0 p1 p1+p2 ... 1
1 2 Ak
Select the μ best points out of λ , f x1 : ≤ f x2 : ≤ ≤ f x :
pij=
∑k=1 :
:
I xik= j
1, I xi
k= j =1 if xi
k= j , =0 otherwise
pij is the frequency of j at position i in the bests
pij≥
1 (minimum frequency for ergodicity)
Application to composite design for frequency
density learned by UMDA (2D)
contour lines of the penalized objective function
Independent densities can neither represent curvatures nor variables' couplings.
( from Grosset, L., Le Riche, R. and Haftka, R.T., A double-distribution statistical algorithm for composite laminate optimization, SMO, 2006 )
Stochastic discrete optimization : learning the variables dependencies
More sophisticated discrete optimization methods attempt to learn the couplings between variables. For example, with pairwise dependencies :
X1:n
X2:n ... X
n:n
p(x) = p (x1 :n) p(x2:n∣x1:n)…p (xn :n∣xn−1:n)
Trade-off : richer probabilistic structures better capture the objective function landscape but they also have more parameters → need more f evaluations to be learned.
MIMIC ( Mutual Information Maximizing Input Clustering ) algorithm : De Bonnet, Isbell and Viola, 1997.BMDA ( Bivariate Marginal Distribution Algorithm ) : Pelikan and Muehlenbein, 1999.
Multi-level parameter optimization with DDOA
( from Grosset, L., Le Riche, R. and Haftka, R.T., A double-distribution statistical algorithm for composite laminate optimization, SMO, 2006 )
Mathematical motivation : create couplings between variables using many independent distributions (in x and v spaces).
Numerical motivation : take into account expert knowledge in the optimization to improve efficiency. E.g. in composites, the lamination parameters v (the plate stiffnesses) make physical sense.
Example in composites Use of the lamination parameters
v = lamination parameters = geometric contribution of the plies to the stiffness.
Inexpensive to calculate from x (fiber angles).
Simplifications : fewer v's than fiber angles. Often, the v's are taken as continuous.
But f(v) typically does not exist (e.g., ply failure criterion).
( Liu, Haftka, and Akgün, « Two-level composite wing structural optimization using response surfaces », 2000.Merval, Samuelides and Grihon, « Lagrange-Kuhn-Tucker coordination for multilevel optimization of aeronautical structures », 2008. )
Past examples in composites
Optimize a composite structure made of several assembled panelsby changing each ply orientation
→ many discrete variables
Structure levelOptimize a composite structure made of several assembled panelsby changing the lamination parameters of each panel
→ few continuous variables
Laminate levelMinimize the distance to target lamination parametersby changing the ply orientations
→ few discrete variables
Decomposed problem : Initial problem :
optimal v's
BUT for such a sequential approach to make sense, must exist and guide to optimal regions (i.e., prohibits emergence of solutions at finer scales).
f̂ (v )
ob
ject
ive
fun
ctio
n
x v
p(x)
does not exist
v(x)
The DDOA stochastic optimization algorithm
pDDOA(x) =
pX | v (X )=V (x )
If v(x) is costless, it does not cost to learn densities in the x AND v spaces at the same time.
The DDOA algorithm : X | v(X)=V ?
Simple mathematical illustration :
pX (x ) =1
2πexp(−1
2xT x) pV (v ) =
1
√2πexp(−1
2(v−1)2)
v = x1+ x2
pX | v( X )=v=1(x)
is a degenerated Gaussian along
x1+x2 = 1
(cross-section of the 2D bell curve along the blue line + normalization)
Intermediate step for a given v :
The DDOA algorithm : X | v(X)=V ?
pX |v (X )=V (x ) = pX | v(X )=v pV (v (x )) = ... =
12π
exp(−14
(x1−x2)2−
12
(x1+x2−1)2)
pX | v( X )=V (x)
is a coupled distribution that merges the effects of X and V
Analytical calculation in the Gaussian case. In practice, use simulations ...
The DDOA algorithm (flow chart)
Choose , μ, such that >>1 and >μλ ρ ρ λInitialize pV(v)and pX(x)
For i=1, doλ
Sample vtarget from pV(v)Sample >>1 x's from pρ X(x)x(i) = the closest x to vtarget
Calculate f(x(i))
end For
Rank x(1: ), … , x( : ) the proposed pointsλ λ λUpdate pV(v) and pX(x) from x(1: ), … , x(μ: )λ λ
Stop ? If no, go back to top ...
sampling of X | v(x)=v
● pX(x) and p
V(v) can be simple densities, without variables couplings (→
easy to learn), yet pDDOA
(x) is a coupled density.
f(x) and selected points p X x = ∏i=1
npi xi p
DDOA(x)
One half of the algorithm searches in a low dimension space.
Application of DDOA to composite design for frequency
additional slides
Introduction to stochastic optimization
Random numbers are versatile search engines (work both in Rn and / or Nn ). They can also yield efficient methods.
Let pt(x) denote the probability density function of x at iteration t (e.g., after t evaluations of f). It represents the belief at t that the optimum x* is at x.
How to « sample pt(x) » once (Scilab notation) ?● if x is uniform between m and M, X ~ U[m,M], call
x = m + rand(n,1).*(M-m)● if x is (multi-)Gaussian with mean m and covariance matrix C, X ~ N(m,C),call
x = m + grand(1,'mn',0,C)
Application to a laminate frequency problem (1)
maxx f 1 x1 , , x15 , the first eigenfreq. of a simply supported platesuch that 0.48 ≤ eff x ≤ 0.52
where x i∈{0o , 15o , , 90o}
the constraint is enforced by penalty and creates a narrow ridge in the design space
( from Grosset, L., Le Riche, R. and Haftka, R.T., A double-distribution statistical algorithm for composite laminate optimization, SMO, 2006 )
Application to a laminate frequency problem (2)
Compare UMDA to a GA (genetic algorithm) and SHC (Stochastic Hill Climber)
Reliability = probability of finding the optimum at a given cost.
UMDA performs fairly well on this problem.
Optimum : [904o /±75o/±602
o/±455o/±305
o]s
Example in composites Use of the lamination parameters
Simplifications : fewer v's than fiber angles. Often, the v's are taken as continuous. But f(v) typically does not exist (e.g., ply failure criterion).