
A stochastic global optimization method for (urban?) multi-parameter systems

Rodolphe Le Riche
CNRS and Ecole des Mines de St-Etienne

Purpose of this presentation

Object of the discussion : cities as entities for heating consumption and the related public policies.

Cities are analyzed, modeled and optimized for heat consumption (including the effects of solar passive input).

Cities are objects that are described at different parameterization levels : entire city (coarser level), districts, buildings (finer level).

In mechanical engineering, multiple levels of parameters are often associated with composite materials.

Talk : describe an idea for multi-level optimization that emerged in composite design. Any use for cities ?

Notation and application example
Composite design at various scales

Three description scales : macro ; meso, $v$ (plate stiffnesses) ; micro, $x$ (fiber positions).

$y(x)$ : numerical simulator of the structure (stress, strains, strength, mass, ...)

$\min_{x \in S} f(y(x))$

$v(x)$ : map from the fiber positions $x$ to the plate stiffnesses $v$.

v(x) is numerically inexpensive. x(v) does not exist : there are many fiber positions for one choice of plate stiffnesses.

Notation and application example
Heating consumption at various scales

Three description scales : macro = city ; meso, $v$ = « IRIS » (district) ; micro, $x$ = building.

IRIS : Ilôt regroupé pour l'information statistique ("block grouped for statistical information", the French statistical district unit).

$y(x)$ : numerical simulator of the structure (solar energy, thermal simulation)

$\min_{x \in S} f(y(x))$

Scope of application : global, derivative-free problems

$\min_{x \in S} f(x)$

[Figure: a 1D objective function $f$ with a local optimum $x_l$ (in its neighborhood $V(x_l)$) and the global optimum $x^*$.]

Here, we focus on

$S \subset \mathbb{R}^n$ or $\mathbb{N}^n$ or $\mathbb{R}^{n_1} \times \mathbb{N}^{n_2}$

i.e., continuous, integer, or mixed optimization.

Flow chart of a general stochastic optimizer

● Initialize t and $p_t(x)$

● Sample $x_{t+1} \sim p_t(x)$

● Calculate $f(x_{t+1})$

● Update the distribution :
$p_{t+1}(x) = \mathrm{Update}(x_1, f(x_1), \ldots, x_{t+1}, f(x_{t+1}))$, or more often $p_{t+1}(x) = \mathrm{Update}(p_t(x), x_{t+1}, f(x_{t+1}))$

● Stop, or [ t = t+1 and go back to Sample ]

with different p's if x is continuous or discrete or mixed.
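As an illustration, a minimal Python sketch of this loop; the sampler and update rule are placeholders to be supplied by each specific algorithm, and all names here are illustrative, not from the slides:

import numpy as np

def stochastic_optimize(f, sample, update, p0, t_max=100):
    # Generic stochastic optimizer: sample x_{t+1} ~ p_t, evaluate f, update p_t.
    p = p0
    best_x, best_f = None, np.inf
    for t in range(t_max):
        x = sample(p)           # Sample x_{t+1} ~ p_t(x)
        fx = f(x)               # Calculate f(x_{t+1})
        if fx < best_f:
            best_x, best_f = x, fx
        p = update(p, x, fx)    # p_{t+1} = Update(p_t(x), x_{t+1}, f(x_{t+1}))
    return best_x, best_f

# Degenerate example: pure random search, where p_t stays a fixed uniform box
rng = np.random.default_rng(0)
box = (np.array([-5.0, -5.0]), np.array([5.0, 5.0]))
x_best, f_best = stochastic_optimize(
    f=lambda x: np.sum(x**2),
    sample=lambda p: p[0] + rng.random(2) * (p[1] - p[0]),
    update=lambda p, x, fx: p,    # no learning in this degenerate case
    p0=box)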

A simple example in $\mathbb{R}^n$ : the (1+1)-ES

Initializations : t = 0, x, f(x), m, C, $t_{max}$

While t < $t_{max}$ do
    Sample x' ~ N(m, C)
    Calculate f(x'), t = t + 1
    If f(x') < f(x) then x = x', f(x) = f(x') Endif
    Update m (e.g., m = x) and C
End while

N(m, C) : normal law, with $C = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \sigma_3^2, \ldots)$ for variables seen as independent.
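A minimal Python sketch of this (1+1)-ES, assuming an isotropic covariance C = σ²I adapted with the classical 1/5th success rule; that adaptation rule is one common choice, the slide leaves "Update m and C" open:

import numpy as np

def es_1p1(f, x0, sigma=1.0, t_max=1000, seed=0):
    # (1+1)-ES: sample x' ~ N(m, sigma^2 I), keep it only if it improves.
    rng = np.random.default_rng(seed)
    m = np.asarray(x0, dtype=float)
    fm = f(m)
    for t in range(t_max):
        x = m + sigma * rng.standard_normal(m.size)  # sample x' ~ N(m, C)
        fx = f(x)
        if fx < fm:                  # accept the improvement, m = x
            m, fm = x, fx
            sigma *= 1.5             # success: increase the step size
        else:
            sigma *= 1.5 ** (-0.25)  # failure: decrease it slowly (1/5th rule)
    return m, fm

# e.g., es_1p1(lambda x: np.sum(x**2), x0=[3.0, -2.0])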

Illustration : adaptation of a 2D Gaussian with the (1+1)-ES

Discrete variables : the Univariate Marginal Distribution Algorithm (UMDA)

( Baluja 1994 – as PBIL – and Mühlenbein 1996)

$x \in S \equiv \{1, 2, \ldots, A\}^n$ (alphabet of cardinality A)

e.g., $\{-45^\circ, 0^\circ, 45^\circ, 90^\circ\}^n$ (fiber orientations)

e.g., $\{\mathrm{matl}_1, \ldots, \mathrm{matl}_A\}^n$ (material choice)

The algorithm is that of a population-based stochastic optimizer (as before, but with many x's at each iteration), with a different sampling and updating of $p_t$. $p_t$ assumes that the variables are independent (dropping t) :

$p(x) = \prod_{i=1}^{n} p_i(x_i)$, with $\sum_{j=1}^{A} p_{ij} = 1$ for each i.

[Figure: bar chart of a marginal $p_i(x_i)$ over the alphabet $1, 2, \ldots, A$, with bar heights $p_{i1}, p_{i2}, \ldots, p_{iA}$.]

UMDA (2)

Sampling : for $i = 1, \ldots, n$, draw $u_i \sim U[0,1]$ and set

If $0 \le u_i \le p_{i1}$, then $x_i = 1$

If $\sum_{j=1}^{k-1} p_{ij} \le u_i < \sum_{j=1}^{k} p_{ij}$, then $x_i = k$

If $\sum_{j=1}^{A-1} p_{ij} \le u_i \le 1$, then $x_i = A$

[Figure: the unit interval of u, split at $0, p_1, p_1 + p_2, \ldots, 1$ into cells labeled $1, 2, \ldots, k, \ldots, A$.]

Learning : select the $\mu$ best points out of $\lambda$, with $f(x^{1:\lambda}) \le f(x^{2:\lambda}) \le \cdots \le f(x^{\lambda:\lambda})$,

$p_{ij} = \frac{1}{\mu} \sum_{k=1}^{\mu} I(x_i^{k:\lambda} = j)$, where $I(x_i^k = j) = 1$ if $x_i^k = j$, and $0$ otherwise.

$p_{ij}$ is the frequency of value j at position i among the $\mu$ best points.

$p_{ij} \ge p_{\min} > 0$ (a minimum frequency, imposed for ergodicity).
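A numpy sketch of one UMDA iteration along these lines, using 0-based symbols 0, ..., A−1 instead of the slide's 1, ..., A; the value of the minimum frequency p_min is an illustration parameter:

import numpy as np

def umda_step(f, P, lam, mu, p_min=0.05, rng=None):
    # One UMDA iteration. P is an (n, A) matrix of marginals p_ij (rows sum to 1).
    rng = rng or np.random.default_rng()
    n, A = P.shape
    # Sampling: inverse-CDF per variable -- x_i = k when the cumulative
    # probability up to k-1 is <= u_i < cumulative probability up to k
    cum = np.cumsum(P, axis=1)
    u = rng.random((lam, n))
    X = (u[:, :, None] >= cum[None, :, :]).sum(axis=2)
    X = np.minimum(X, A - 1)           # guard against float round-off
    # Learning: frequencies of each symbol among the mu best points
    F = np.array([f(x) for x in X])
    best = X[np.argsort(F)[:mu]]
    for i in range(n):
        P[i] = np.bincount(best[:, i], minlength=A) / mu
    P = np.maximum(P, p_min)           # minimum frequency, for ergodicity
    P /= P.sum(axis=1, keepdims=True)  # renormalize each row
    return P, X[np.argmin(F)], F.min()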

Application to composite design for frequency

[Figure: the density learned by UMDA (2D) vs. the contour lines of the penalized objective function.]

Independent densities can represent neither curvatures nor couplings between variables.

( from Grosset, L., Le Riche, R. and Haftka, R.T., A double-distribution statistical algorithm for composite laminate optimization, SMO, 2006 )

Stochastic discrete optimization : learning the dependencies between variables

More sophisticated discrete optimization methods attempt to learn the couplings between variables. For example, with pairwise dependencies :

A chain of pairwise dependencies, $X_{1:n} \to X_{2:n} \to \cdots \to X_{n:n}$ :

$p(x) = p(x_{1:n})\, p(x_{2:n} \mid x_{1:n}) \cdots p(x_{n:n} \mid x_{n-1:n})$

Trade-off : richer probabilistic structures better capture the objective function landscape but they also have more parameters → need more f evaluations to be learned.

MIMIC ( Mutual Information Maximizing Input Clustering ) algorithm : De Bonet, Isbell, and Viola, 1997.
BMDA ( Bivariate Marginal Distribution Algorithm ) : Pelikan and Mühlenbein, 1999.
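For illustration, a short Python sketch of sampling from such a chain, assuming the first marginal and the pairwise conditional tables have already been learned (e.g., from mutual information, as in MIMIC); names and shapes are illustrative:

import numpy as np

def sample_chain(p_first, cond, rng=None):
    # Draw x from p(x) = p(x_1) p(x_2|x_1) ... p(x_n|x_{n-1}).
    # p_first: (A,) marginal of the first variable in the chain;
    # cond[k]: (A, A) table whose row a is p(next variable | previous = a).
    rng = rng or np.random.default_rng()
    x = [rng.choice(len(p_first), p=p_first)]
    for T in cond:
        x.append(rng.choice(T.shape[1], p=T[x[-1]]))
    return np.array(x)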

Multi-level parameter optimization with DDOA

( from Grosset, L., Le Riche, R. and Haftka, R.T., A double-distribution statistical algorithm for composite laminate optimization, SMO, 2006 )

Mathematical motivation : create couplings between variables using many independent distributions (in x and v spaces).

Numerical motivation : take into account expert knowledge in the optimization to improve efficiency. E.g. in composites, the lamination parameters v (the plate stiffnesses) make physical sense.

Example in composites : use of the lamination parameters

v = lamination parameters = geometric contribution of the plies to the stiffness.

Inexpensive to calculate from x (fiber angles).

Simplifications : fewer v's than fiber angles. Often, the v's are taken as continuous.

But f(v) typically does not exist (e.g., ply failure criterion).

( Liu, Haftka, and Akgün, « Two-level composite wing structural optimization using response surfaces », 2000.
Merval, Samuelides, and Grihon, « Lagrange-Kuhn-Tucker coordination for multilevel optimization of aeronautical structures », 2008. )

Past examples in composites

Initial problem : optimize a composite structure made of several assembled panels by changing each ply orientation
→ many discrete variables

Decomposed problem :

Structure level : optimize the composite structure made of several assembled panels by changing the lamination parameters of each panel
→ few continuous variables

Laminate level : minimize the distance to the target (optimal) lamination parameters by changing the ply orientations
→ few discrete variables

BUT, for such a sequential approach to make sense, $\hat{f}(v)$ must exist and must guide the search toward the optimal regions (i.e., this prohibits the emergence of solutions at the finer scales).

[Figure: the objective function seen from the x and v spaces, with the density p(x), the inexpensive map v(x), and $\hat{f}(v)$; the converse map from v does not exist.]

The DDOA stochastic optimization algorithm

$p_{DDOA}(x) = p_{X \mid v(X) = V}(x)$

Since v(x) is inexpensive, learning densities in the x AND v spaces at the same time comes at virtually no extra cost.

The DDOA algorithm : X | v(X)=V ?

Simple mathematical illustration :

$p_X(x) = \frac{1}{2\pi} \exp\left(-\frac{1}{2} x^T x\right)$ ,  $p_V(v) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}(v-1)^2\right)$

$v = x_1 + x_2$

$p_{X \mid v(X) = 1}(x)$ is a degenerate Gaussian along $x_1 + x_2 = 1$

(the cross-section of the 2D bell curve along the blue line, plus normalization)

Intermediate step for a given v :

The DDOA algorithm : X | v(X)=V ?

$p_{X \mid v(X) = V}(x) = p_{X \mid v(X) = v}(x)\; p_V(v(x)) = \ldots = \frac{1}{2\pi} \exp\left(-\frac{1}{4}(x_1 - x_2)^2 - \frac{1}{2}(x_1 + x_2 - 1)^2\right)$

$p_{X \mid v(X) = V}(x)$ is a coupled distribution that merges the effects of X and V.

Analytical calculation in the Gaussian case. In practice, use simulations ...
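A possible simulation check of this Gaussian example, anticipating the flow chart of the next slide: draw v_target from p_V, draw many candidates from p_X, and keep the candidate whose v(x) is closest to v_target. The empirical moments then approach those of the analytic coupled density (ρ and N are illustration parameters, not from the slides):

import numpy as np

rng = np.random.default_rng(0)
rho, N = 200, 20_000
samples = np.empty((N, 2))
for k in range(N):
    v_t = 1.0 + rng.standard_normal()              # v_target ~ p_V = N(1, 1)
    cand = rng.standard_normal((rho, 2))           # rho candidates ~ p_X = N(0, I)
    i = np.argmin(np.abs(cand.sum(axis=1) - v_t))  # keep the closest in v-space
    samples[k] = cand[i]
print(samples.mean(axis=0))                   # close to (0.5, 0.5)
print(np.var(samples.sum(axis=1)))            # close to 1 : the (x1+x2-1)^2/2 term
print(np.var(samples[:, 0] - samples[:, 1]))  # close to 2 : the (x1-x2)^2/4 term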

The DDOA algorithm (flow chart)

Choose λ, μ, ρ such that ρ >> 1 and λ > μ
Initialize $p_V(v)$ and $p_X(x)$

For i = 1, ..., λ do
    Sample $v^{target}$ from $p_V(v)$
    Sample ρ (>> 1) x's from $p_X(x)$
    $x^{(i)}$ = the sampled x whose v(x) is closest to $v^{target}$   ← this implements the sampling of X | v(X) = $v^{target}$
    Calculate $f(x^{(i)})$
End for

Rank the proposed points $x^{1:\lambda}, \ldots, x^{\lambda:\lambda}$
Update $p_V(v)$ and $p_X(x)$ from $x^{1:\lambda}, \ldots, x^{\mu:\lambda}$

Stop ? If no, go back to the top ...
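A Python sketch of one iteration of this flow chart; the callables sample_x, sample_v, v_of_x, and update are placeholders for the problem-specific densities and the inexpensive v(x) map:

import numpy as np

def ddoa_iteration(f, v_of_x, sample_x, sample_v, update, lam, mu, rho):
    # One DDOA iteration: lambda points, each obtained by v-space matching.
    pop, scores = [], []
    for i in range(lam):
        v_target = sample_v()                    # v_target ~ p_V(v)
        cand = [sample_x() for _ in range(rho)]  # rho >> 1 draws from p_X(x)
        x = min(cand, key=lambda c: np.linalg.norm(
            np.atleast_1d(v_of_x(c)) - v_target))
        pop.append(x)                            # approx. X | v(X) = v_target
        scores.append(f(x))
    order = np.argsort(scores)                   # rank the lambda points
    update([pop[j] for j in order[:mu]])         # re-learn p_V and p_X from the mu best
    return pop[order[0]], scores[order[0]]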

● $p_X(x)$ and $p_V(v)$ can be simple densities, without couplings between variables (→ easy to learn), yet $p_{DDOA}(x)$ is a coupled density.

[Figure: f(x) and the selected points; the independent density $p_X(x) = \prod_{i=1}^{n} p_i(x_i)$; the coupled density $p_{DDOA}(x)$.]

● One half of the algorithm searches in a low-dimensional space.

Application of DDOA to composite design for frequency

Additional slides

Introduction to stochastic optimization

Random numbers are versatile search engines (they work in $\mathbb{R}^n$ and/or $\mathbb{N}^n$). They can also yield efficient methods.

Let pt(x) denote the probability density function of x at iteration t (e.g., after t evaluations of f). It represents the belief at t that the optimum x* is at x.

How to « sample $p_t(x)$ » once (Scilab notation) ?

● if x is uniform between m and M, X ~ U[m, M], call
x = m + rand(n,1).*(M-m)

● if x is (multi-)Gaussian with mean m and covariance matrix C, X ~ N(m, C), call
x = m + grand(1,'mn',zeros(n,1),C)
(the mean argument of grand must be an n×1 column vector, hence zeros(n,1))
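For readers without Scilab, equivalent numpy calls (a sketch; n, m, M, and C are example placeholders):

import numpy as np

rng = np.random.default_rng()
n = 3
m, M = np.zeros(n), np.ones(n)   # example bounds
C = np.diag([0.1, 0.2, 0.3])     # example diagonal covariance

x_unif = m + rng.random(n) * (M - m)                   # X ~ U[m, M]
x_gauss = m + rng.multivariate_normal(np.zeros(n), C)  # X ~ N(m, C)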

Application to a laminate frequency problem (1)

$\max_x f_1(x_1, \ldots, x_{15})$, the first eigenfrequency of a simply supported plate,
such that $0.48 \le \mathrm{eff}(x) \le 0.52$,
where $x_i \in \{0^\circ, 15^\circ, \ldots, 90^\circ\}$

The constraint is enforced by a penalty and creates a narrow ridge in the design space.

( from Grosset, L., Le Riche, R. and Haftka, R.T., A double-distribution statistical algorithm for composite laminate optimization, SMO, 2006 )

Application to a laminate frequency problem (2)

Compare UMDA to a GA (genetic algorithm) and SHC (Stochastic Hill Climber)

Reliability = probability of finding the optimum at a given cost.

UMDA performs fairly well on this problem.

Optimum : $[90^\circ_4 / {\pm}75^\circ / {\pm}60^\circ_2 / {\pm}45^\circ_5 / {\pm}30^\circ_5]_s$
