Simulated annealing


Transcript of Simulated annealing

Page 1: Simulated annealing

Simulated annealing... an overview

Page 2: Simulated annealing

Contents
1. Annealing & Stat. Mechs
2. The Method
3. Combinatorial minimization
   ▫ The traveling salesman problem
4. Continuous minimization
   ▫ Thermal simplex
5. Applications

Page 3: Simulated annealing

So in the first place...what is simulated annealing?

• According to Wikipedia : "A probabilistic metaheuristic for the optimization problem of locating a good approximation to the global optimum of a given function in a large search space"

• When you need some "good enough" solution

• For problems with many local minima

• Often used in very large discrete spaces

Page 4: Simulated annealing

Annealing

Originally a blacksmithing technique in which the metal is heated and then cooled down slowly

Used to improve ductility and allow further manipulation/shaping

Page 5: Simulated annealing

Some notions of Stat.Mechs

So why is that, physically?

• Gibbs free energy

• Each configuration is possible, but weighted by a Boltzmann factor : P(E) ∝ exp(−E / kT)

• Slow cooling : minimum energy configuration

• Fast cooling (quenching) : polycrystals, ...

Page 6: Simulated annealing

Some notions of Stat.Mechs

Example : spins of a chain of atoms

• Possible states : S_i = ±1

• Energy of a link : E_ij = J S_i S_j

• Maximum/minimum energy is ±NJ (for N the chain length)

• Distribution of energy states given by Boltzmann...

• Thus at low temperature : all spins align

• But is low temperature enough in physical systems?
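
To make the Boltzmann weighting above concrete, here is a minimal sketch (not from the slides; J, T and the chain length are arbitrary illustrative values) that evaluates the energy of one spin configuration and its unnormalized Boltzmann factor:

```python
import numpy as np

def chain_energy(spins, J=1.0):
    # Sum of E_ij = J * S_i * S_j over nearest-neighbour links of an open chain
    return J * np.sum(spins[:-1] * spins[1:])

rng = np.random.default_rng(0)
spins = rng.choice([-1, 1], size=20)  # one random configuration, S_i = +/-1
T = 0.5                               # temperature, in units where k_B = 1
E = chain_energy(spins)
weight = np.exp(-E / T)               # unnormalized Boltzmann factor of this state
print(E, weight)
```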

Page 7: Simulated annealing

How SA works...
So basically, we are going to do the same thing with functions!

• Start by "baking" up the system (high randomization)
• Gradually cool down (structure appears)
• Enjoy

Page 8: Simulated annealing

The Method (with a big M)

Element 1 : Description
An exact description of the possible configurations (states) of the system

Example : the N-queens problem

Possible representation of the system : a vector

In this case, {7,5,2,6,3,7,8,4}

Page 9: Simulated annealing

The Method
Element 2 : Generator of random changes

Some kind of way of evolving the system → allowed moves

Requires some insight of the way the system is working

In this case : select an attacked queen, and move it to some random spot on the same row

Page 10: Simulated annealing

The Method
Element 3 : Objective function

Basically, the function to optimize; might not always be obvious in discrete systems

Analog of energy

In this case : the number of pairs of queens attacking each other

Page 11: Simulated annealing

The Method
Element 4 : Acceptance probability function

The probability of taking the step to the new proposed state

Generally : P = 1 if the proposed state has lower energy, and P = exp(−(E_2 − E_1)/T) otherwise (the Metropolis criterion)

Formally, some function of the form P(E_1, E_2, T)

Page 12: Simulated annealing

The Method
Element 5 : Annealing schedule

The specific way in which the temperature is going to flow from high to low

Will make the difference between a working algorithm and PAIN

The meaning of fast/slow cooling and hot/cold is highly case-specific

Here : T(n) = 100/n
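
Putting the five elements together, here is a minimal sketch of how they might look in code for the N-queens example (not from the slides; for simplicity the generator moves a random queen rather than specifically an attacked one, and the conflict count is spelled out explicitly):

```python
import math
import random

def conflicts(cols):
    """Element 3, objective function: number of attacking queen pairs.
    cols[r] is the column of the queen in row r (one queen per row)."""
    n, count = len(cols), 0
    for i in range(n):
        for j in range(i + 1, n):
            if cols[i] == cols[j] or abs(cols[i] - cols[j]) == j - i:
                count += 1
    return count

def random_move(cols):
    """Element 2, generator: give one randomly chosen queen a new random column."""
    new = cols[:]
    new[random.randrange(len(cols))] = random.randrange(len(cols))
    return new

def anneal_queens(n=8, steps=20000):
    state = [random.randrange(n) for _ in range(n)]  # Element 1: description (a vector)
    energy = conflicts(state)
    for k in range(1, steps + 1):
        T = 100.0 / k                                # Element 5: schedule T(n) = 100/n
        proposal = random_move(state)
        e_new = conflicts(proposal)
        # Element 4: Metropolis-style acceptance probability
        if e_new <= energy or random.random() < math.exp(-(e_new - energy) / T):
            state, energy = proposal, e_new
        if energy == 0:
            break
    return state, energy

print(anneal_queens())
```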

Page 13: Simulated annealing

The Method
Further considerations

• Resets
• Specific heat calculation

Page 14: Simulated annealing

Combinatorial minimization

• Type of minimization where there is no continuous spectrum of values for the energy-function equivalent

• Can be hard to conceptualize :
  ▫ The energy function might not be obvious
  ▫ The most efficient way to get to neighbour states might also not be obvious, and can require a lot of thinking

Page 15: Simulated annealing

Combinatorial minimization
The Traveling Salesman Problem

Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?

This is harder than it looks...

Possible solutions grow as O(n!)

Exact computation of solutions grows as O(exp(n))

Page 16: Simulated annealing

Combinatorial minimization
The Traveling Salesman Problem : Description

• A state can be defined as one distinct possible route passing through all cities

• Supposing N cities, a vector of length N can specify the order in which to visit them
  For example with N=6 : {C1, C4, C2, C3, C6, C5}

• Must also specify the position of each city
  Supposing 2D, that's {x_i, y_i}

Page 17: Simulated annealing

Combinatorial minimization
The Traveling Salesman Problem : Generator

This one can be a bit tricky...

1. Take two random cities and swap them

2. Take two consecutive cities and swap them

3. Take two non-consecutive cities and swap the whole segment between them

Page 18: Simulated annealing

Combinatorial minimization
The Traveling Salesman Problem : Objective function

Pretty simple. We have the position of each city, so the objective is basically the total distance function :

E = Σ_i sqrt( (x_i − x_{i+1})² + (y_i − y_{i+1})² )

Note : point N is the same as the first point (the route returns to its origin)

Page 19: Simulated annealing

Combinatorial minimization
The Traveling Salesman Problem : Annealing schedule

This is the part that requires experimentation... In the literature, some considerations :

▫ If the square in which the cities are located has a side of N^(1/2), then temperatures above N^(1/2) can be considered hot, and temperatures below 1 are cold

▫ Every 100 steps OR 10 successful reconfigurations, multiply the temperature by 0.9

▫ Could also be some continuous equivalent...
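
A rough sketch of how these pieces could fit together (not from the slides; the city coordinates, the stopping threshold and the use of the segment-reversal move alone are illustrative choices):

```python
import math
import random

def tour_length(order, xy):
    """Objective function: total Euclidean length of the closed route."""
    total = 0.0
    n = len(order)
    for i in range(n):
        x1, y1 = xy[order[i]]
        x2, y2 = xy[order[(i + 1) % n]]   # the last point links back to the first
        total += math.hypot(x2 - x1, y2 - y1)
    return total

def propose(order):
    """Generator: reverse the whole segment between two random positions (move 3)."""
    i, j = sorted(random.sample(range(len(order)), 2))
    return order[:i] + order[i:j + 1][::-1] + order[j + 1:]

def anneal_tsp(xy):
    n = len(xy)
    T = math.sqrt(n)                       # "hot" is around N^(1/2)
    order = list(range(n))
    energy = tour_length(order, xy)
    while T > 1e-3:                        # well below 1, i.e. "cold"
        successes = 0
        for _ in range(100):               # at most 100 steps per temperature block
            candidate = propose(order)
            e_new = tour_length(candidate, xy)
            if e_new <= energy or random.random() < math.exp(-(e_new - energy) / T):
                order, energy = candidate, e_new
                successes += 1
            if successes >= 10:            # ... OR 10 successful reconfigurations
                break
        T *= 0.9                           # then multiply the temperature by 0.9
    return order, energy

cities = [(4 * random.random(), 4 * random.random()) for _ in range(16)]  # square of side sqrt(16)
print(anneal_tsp(cities))
```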

Page 20: Simulated annealing

Combinatorial minimization
The Traveling Salesman Problem : Some results

[Route plots at T = 1.2 and T = 0.8]

Page 21: Simulated annealing

Combinatorial minimization
The Traveling Salesman Problem : Some results

[Route plots at T = 0.4 and T = 0.0]

Page 22: Simulated annealing

Combinatorial minimization
The Traveling Salesman Problem : Some results

Constraints :

Page 23: Simulated annealing

Continuous minimization

Kinda simpler, at least conceptually :

• Description : System state is some point x

• Generator : x+dx where dx is generated somewhat randomly

This is where we actually have some room to mess around... the way dx is specified is entirely up to us

• Function : the function itself.

• Annealing schedule : Should be gradual once again, but strongly depends on the function being minimized.
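
A minimal sketch of the continuous case (not from the slides; the Gaussian step, the 1-D test function and the geometric cooling are illustrative assumptions):

```python
import math
import random

def f(x):
    # An illustrative 1-D test function with many local minima.
    return x * x + 10.0 * math.sin(3.0 * x)

def anneal_continuous(f, x0=5.0, T0=10.0, cooling=0.999, steps=20000, step_size=0.5):
    x, fx = x0, f(x0)
    T = T0
    for _ in range(steps):
        dx = random.gauss(0.0, step_size)   # generator: x -> x + dx, dx chosen by us
        x_new = x + dx
        f_new = f(x_new)
        # Metropolis acceptance: always take downhill moves, sometimes uphill ones
        if f_new <= fx or random.random() < math.exp(-(f_new - fx) / T):
            x, fx = x_new, f_new
        T *= cooling                         # gradual (geometric) cooling
    return x, fx

print(anneal_continuous(f))
```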

Page 24: Simulated annealing

Continuous minimization

Return of the AMOEBA!

An example of implementation (NR Webnote 1)


Page 25: Simulated annealing

Continuous minimization
An example of implementation

Except here’s a twist...

•A positive thermal fluctuation is added to all of the old values of the simplex

•Another fluctuation is subtracted from the value of the proposal point

•Thus the new point is favored over old points for high temperatures
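
A hedged sketch of just this thermal-fluctuation trick (the logarithmic form of the fluctuation follows the Numerical Recipes amebsa idea; the function name and interface are mine):

```python
import math
import random

def thermal_view(y_stored, y_trial, T):
    """Return 'thermalized' values used when comparing points inside a simplex step.

    A positive fluctuation of typical size T is ADDED to every stored vertex value
    and SUBTRACTED from the trial value, so at high T the trial point tends to win
    the comparison even when it is not actually better.
    """
    fluct = lambda: -T * math.log(1.0 - random.random())   # positive, ~exponential(T)
    y_stored_hot = [y + fluct() for y in y_stored]
    y_trial_hot = y_trial - fluct()
    return y_stored_hot, y_trial_hot
```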

Page 26: Simulated annealing

Applications

• Lenses : Merit function depends on a lot of factors : curvature radii, densities, thickness, etc.

• Placement problems : When you have to place a lot of stuff in a very limited space... how do you arrange them optimally? Used in logic boards, processors

Page 27: Simulated annealing

Conclusion

•Simulated annealing is cool (after some time)

• Allows searching a large parameter space without getting bogged down in local minima

• Very useful for discrete, combinatorial problems, for which there are not a lot of algorithms (since most are gradient based)

•Questions?

Page 28: Simulated annealing

Simulated annealing
Round 2 !

Page 29: Simulated annealing

Revisiting...
1. Specific heat calculation

2. Parameter spaces & objective functions

3. Simulated Annealing vs MCMC

Page 30: Simulated annealing

Specific heat calculation

Reminder : we can define an equivalent to specific heat for a given problem through the energy (objective) function :

C_v(T) = ( <E²> − <E>² ) / T²

But how many steps do we need to get some acceptable value for Cv?

Answer : it basically depends on E, and more specifically on the variance of Cv.

Page 31: Simulated annealing

Specific heat calculation
Thus this will be case-specific.

All points considered to get <E> and <E2> are taken at the same temperature; thus the annealing schedule should be arranged with blocks of constant temperature

C_v's variance is usually pretty large, so we need a lot of data. As an example, for a function :

... at least 10k points/block.
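
A small sketch of the block estimate described above (not from the slides; the energy samples are assumed to come from SA steps run at one constant temperature):

```python
import numpy as np

def specific_heat(energies, T):
    """Estimate C_v = (<E^2> - <E>^2) / T^2 from samples taken at constant T."""
    e = np.asarray(energies, dtype=float)
    return (np.mean(e**2) - np.mean(e)**2) / T**2

# Usage: collect the energy values of a constant-temperature block (>= ~10k points),
# compute the estimate for that block, then continue cooling.
# cv = specific_heat(block_energies, T_block)
```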

Page 32: Simulated annealing

Parameter spaces & objective functions
Generally speaking...

• Parameter space :

1. Figure out what parameters you need to describe a state exactly; these can be integers, real numbers, or whatever else you need

2. Write it out in vector form (or possibly even matrix form), i.e. [P_1, P_2, ... P_n]

3. The set of all possible vectors (varying the parameters that you defined in 1) is your parameter space

• Objective function :

1. Write out the objective function in terms of the previously defined parameters

Page 33: Simulated annealing

Parameter spaces & objective functions

The traveling salesman problem

• Parameter space :

Set of all possible ways to visit all cities exactly once. We have N cities, and we need 3 pieces of info for each : the position x_i, the position y_i, and the rank R_i at which the city will be visited

• Objective function : the total length of the route (the sum of the distances between consecutively visited cities)

Page 34: Simulated annealing

Parameter spaces & objective functions
The knapsack problem

Page 35: Simulated annealing

Parameter spaces & objective functions
The knapsack problem

• Parameter space :

Set of all possible ways to fill the sack without busting. Suppose we have N types of objects; we need 3 pieces of info for each object : the number n_i of objects of that type in the sack, the weight m_i, and some value V_i

• Objective function : the total value of the sack, Σ_i n_i V_i, to be maximized, subject to the total weight Σ_i n_i m_i staying under the capacity
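
A hedged sketch of what such an objective could look like in code (the large penalty for overweight configurations is my own illustrative choice, not from the slides):

```python
def knapsack_energy(counts, values, weights, capacity, penalty=1e6):
    """Energy to MINIMIZE: negative total value, plus a large penalty
    if the total weight busts the capacity. counts[i] = n_i objects of type i."""
    total_value = sum(n * v for n, v in zip(counts, values))
    total_weight = sum(n * m for n, m in zip(counts, weights))
    overweight = max(0.0, total_weight - capacity)
    return -total_value + penalty * overweight
```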

Page 36: Simulated annealing

Parameter spaces & objective functions
The spanning tree problem

Given a weighted graph (a set of vertices and weighted links between them), what is the minimum-weight subgraph that connects all vertices?

Page 37: Simulated annealing

Parameter spaces & objective functions
The spanning tree problem

• Parameter space :

Set of all the M links, to which we attribute an activation value µ_i ∈ {0, 1} and a weight x_i

For N the number of vertices, there are N − 1 degrees of freedom

Page 38: Simulated annealing

Parameter spaces & objective functions
The spanning tree problem

• Objective function : E = Σ_i µ_i x_i , where µ_i = 1 or 0 (active/inactive link)
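
A sketch of how that objective might be evaluated in practice, with a union-find connectivity check and a penalty when the active links do not connect all vertices (the penalty term is an illustrative assumption, not from the slides):

```python
def spanning_energy(mu, links, weights, n_vertices, penalty=1e6):
    """mu[i] in {0, 1} activates link i; links[i] = (a, b) joins two vertices.
    Energy = total active weight, plus a penalty per disconnected component."""
    parent = list(range(n_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    total = 0.0
    components = n_vertices
    for active, (a, b), w in zip(mu, links, weights):
        if active:
            total += w
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[ra] = rb
                components -= 1
    return total + penalty * (components - 1)
```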

Page 39: Simulated annealing

Parameter spaces & objective functions
The N-queens problem

• Parameter space :

Set of all possible ways to place the N queens, supposing there is one per row. For each queen we need only the column position x_i :

• Objective function : the number of pairs of queens attacking each other (same column or same diagonal)

Page 40: Simulated annealing

Parameter spaces & objective functions
The N-queens problem

Page 41: Simulated annealing

Simulated Annealing vs MCMC
Needs of each method : Continuous case

SA :
• Generator function : some way to generate a proposal point x+dx
• Acceptance function (almost always) :
• Annealing schedule (how, specifically, will the temperature decrease)
• Usually stops after a set number of steps (has no memory of the chain, usually)

MCMC :
• Generator function : some way to generate a proposal point x+dx
• Acceptance function (commonly) :
• Number of chains, starting points, possibly with different temperatures
• Usually stops either after a set number of steps or when some condition of minimal error is fulfilled in the variance of the chain (requires keeping the chain in memory!)

Page 42: Simulated annealing

Simulated Annealing vs MCMC
Needs of each method : Combinatorial case

SA :
• Generator function : some way to generate a proposal neighbour state
• Acceptance function (almost always) :
• Annealing schedule (how, specifically, will the temperature decrease)
• Usually stops after a set number of steps (has no memory of the chain, usually)

MCMC :
• Generator function : some way to generate a proposal neighbour state
• Acceptance function (commonly) :
• Number of chains, starting points, possibly with different temperatures
• Usually stops either after a set number of steps or when some condition of minimal error is fulfilled in the variance of the chain (requires keeping the chain in memory!)

Page 43: Simulated annealing

Simulated Annealing vs MCMC
Some more considerations...

• The choice of method is highly case-dependent.

▫ Continuous : MCMCs should be able to handle most cases; SA only to be used for particularly badly behaved objective functions

▫ Combinatorial : MCMCs can work... but most objective functions in combinatorial problems tend to have a lot of very deep minima, so SA is usually best

• SA is more demanding computationally, but can search a wider parameter space

• MCMCs can easily be parallelized (more chains)