Sampling algorithms - University of Sheffield


Transcript of Sampling algorithms - University of Sheffield

Page 1

Sampling algorithms

* Purpose

* Construction

* Performance measures

SANDA DEJANIC

EAWAG ZÜRICH, SWITZERLAND

This project has received funding from the European Union's Seventh Framework Programme for Research, Technological Development and Demonstration under grant agreement no. 607000.

Page 2

What is the probability of the outcome being 6 + 6?
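This can be checked by brute-force enumeration; a minimal sketch in Python, assuming two fair dice:

```python
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))
p_double_six = Fraction(sum(1 for a, b in outcomes if a == 6 and b == 6),
                        len(outcomes))
print(p_double_six)  # 1/36
```

Only one of the 36 equally likely outcomes is (6, 6), so the probability is 1/36.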

Page 3

How likely is it that we pick the right parameters for our simulation?

Page 4

Likelihood distribution

Quantification of how well our parameters describe reality
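As a sketch of this idea, assuming a Gaussian observation model (the data values and σ below are made-up illustrations, not from the slides):

```python
import math

def log_likelihood(mu, data, sigma=1.0):
    """Log-likelihood of a Gaussian observation model: quantifies how
    well the parameter mu describes the observed data."""
    return sum(-0.5 * math.log(2 * math.pi * sigma**2)
               - (x - mu)**2 / (2 * sigma**2) for x in data)

data = [4.8, 5.1, 5.3, 4.9]
# A parameter near the data mean explains the observations better.
print(log_likelihood(5.0, data) > log_likelihood(2.0, data))  # True
```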

Page 5

Take into account all available knowledge!

Let's look at the Monty Hall problem

Page 6

Solution to the Monty Hall problem
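The solution can also be verified by simulation; a minimal sketch, assuming the host always opens a non-winning door the player did not choose:

```python
import random

def monty_hall(switch, trials=100_000, seed=0):
    """Simulate the Monty Hall game and return the win frequency."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        # Host opens a door that is neither the pick nor the car.
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials

print(monty_hall(switch=True))   # close to 2/3
print(monty_hall(switch=False))  # close to 1/3
```

Switching doubles the chance of winning, which is the (initially counterintuitive) solution: it is an example of updating on all available knowledge.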

Page 7

Introducing the prior

Some of the parameters have to lie within certain limits.

Example:

* the width of a water pipe cannot be less than zero.

Page 8

Define prior

Mathematically describe what we know about the prior distribution of all parameters

Common case:

* uniform distribution

* normal distribution

* distribution taken from literature
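A minimal sketch of such a prior definition in Python; the parameter names, limits, and distribution values below are hypothetical examples, not from the slides:

```python
import math

def log_prior(width, slope):
    """Joint log-prior for two hypothetical parameters:
    - width: uniform on (0, 2] m  (a pipe width cannot be negative)
    - slope: normal, mean 0.01, sd 0.005 (e.g. taken from literature)
    """
    if not (0 < width <= 2.0):
        return -math.inf          # outside the physical limits
    lp_width = -math.log(2.0)     # density of Uniform(0, 2]
    sd = 0.005
    lp_slope = (-0.5 * math.log(2 * math.pi * sd**2)
                - (slope - 0.01)**2 / (2 * sd**2))
    return lp_width + lp_slope

print(log_prior(-0.1, 0.01))  # -inf: a negative pipe width is impossible
```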

Page 9

Bayes' Theorem

Constructing the posterior distribution:

Posterior ∝ Likelihood × Prior

p(θ | y) ∝ p(y | θ) · p(θ)

Page 10

Optimization comes down to finding the maximum of the target distribution (posterior distribution)
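As an illustration, a brute-force grid search for the maximum of the posterior (the MAP estimate); the data are made-up, and a flat prior on [0, 10] with a Gaussian likelihood (sd = 1) is assumed:

```python
import math

def log_posterior(mu, data):
    # Unnormalised log-posterior: flat prior on [0, 10] + Gaussian likelihood.
    if not (0 <= mu <= 10):
        return -math.inf
    return sum(-(x - mu)**2 / 2 for x in data)

data = [4.8, 5.1, 5.3, 4.9]
grid = [i / 1000 for i in range(10001)]
mu_map = max(grid, key=lambda m: log_posterior(m, data))
print(mu_map)  # 5.025 — the sample mean
```

With this flat prior the MAP estimate coincides with the maximum-likelihood estimate, here the sample mean.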

Page 11

Sampling has an aim to describe the whole target distribution (posterior distribution)

Page 12

Optimization vs Sampling

Optimization calibrates the model and allows us to make predictions.

Sampling gives the uncertainty intervals of the prediction.

Page 13

Sometimes describing the posterior is not so easy…

Page 14

Like exploring a dark room with a flashlight

Page 15

Developing algorithms to deal with complex posteriors…

MCMC

Markov Chain Monte Carlo

Page 16

Metropolis algorithm

1. Begin with an initial value.

2. Generate a candidate from a proposal distribution.

3. Evaluate the acceptance probability: for a symmetric proposal, α = min(1, p(candidate) / p(current)).

4. Generate a uniformly distributed random number u from Unif[0, 1] and accept the candidate if u < α.

5. Increase the counter and go to 2.

Convergence is guaranteed, but…

generally it can be inefficient due to the scale and shape of the posterior.
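The steps above can be sketched in Python; the target (a standard normal) and the Gaussian proposal scale are assumptions chosen so the answer is known:

```python
import math
import random

def metropolis(log_post, x0, n_steps, step_sd=0.5, seed=1):
    """Metropolis sampler with a symmetric Gaussian proposal (a sketch)."""
    rng = random.Random(seed)
    x, lp = x0, log_post(x0)                     # 1. initial value
    chain = []
    for _ in range(n_steps):
        cand = x + rng.gauss(0, step_sd)         # 2. propose a candidate
        lp_cand = log_post(cand)
        # 3.-4. accept with probability min(1, p(cand) / p(x))
        if math.log(rng.random()) < lp_cand - lp:
            x, lp = cand, lp_cand
        chain.append(x)                          # 5. record and continue
    return chain

# Target: standard normal, so the correct posterior mean is 0.
chain = metropolis(lambda x: -0.5 * x * x, x0=3.0, n_steps=20000)
post_burn = chain[2000:]
print(sum(post_burn) / len(post_burn))  # near 0 after burn-in
```

Because the proposal is symmetric, the acceptance ratio reduces to p(candidate)/p(current); it is evaluated in log space for numerical stability.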

Page 17

Adaptive algorithms

Learn from the past and adapt the proposal distribution.

Reminder:

* a Markov chain updates based only on the previous step

Possibility of restrained exploration.

Example:

* ignoring some of the modes
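One simple adaptation scheme, sketched here as an illustration (the multiplicative tuning rule and the target acceptance rate of 0.44 are common heuristics, not from the slides), rescales the proposal based on the running acceptance rate:

```python
import math
import random

def adaptive_metropolis(log_post, x0, n_steps, target_accept=0.44, seed=2):
    """Metropolis sampler that tunes the proposal scale so the acceptance
    rate approaches a target (~0.44 is a common choice in one dimension)."""
    rng = random.Random(seed)
    x, lp, scale = x0, log_post(x0), 1.0
    accepted, chain = 0, []
    for i in range(1, n_steps + 1):
        cand = x + rng.gauss(0, scale)
        lp_cand = log_post(cand)
        if math.log(rng.random()) < lp_cand - lp:
            x, lp = cand, lp_cand
            accepted += 1
        chain.append(x)
        if i % 100 == 0:  # adapt every 100 steps: widen or shrink the proposal
            rate = accepted / i
            scale *= math.exp(0.1 if rate > target_accept else -0.1)
    return chain, scale

chain, scale = adaptive_metropolis(lambda x: -0.5 * x * x, 0.0, 10000)
```

In practice adaptation is usually restricted to the burn-in phase (or made to diminish over time), since a chain that keeps adapting no longer updates based only on the previous step.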

Page 18

Chain Diagnostics

Page 19

Can we think of a clever way to make sure our posterior is explored entirely?

Page 20

Ensemble based sampling algorithms

Stretch move vs. Differential evolution

Page 21

Ensemble based sampling algorithms

* Many walkers exploring the parameter space

* Learning from each other

* Jumping according to already accepted positions

* Each having its own chain

* Each having the burn-in phase

* Possibility of running the model on multiple cores
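A sketch of the stretch move of Goodman and Weare, updating the walkers one at a time: each walker jumps along the line towards another, randomly chosen walker. The target (a 2-D standard normal) and the initial positions are made-up for illustration; a = 2 is the conventional stretch scale:

```python
import math
import random

def stretch_move_sampler(log_post, walkers, n_steps, a=2.0, seed=3):
    """Ensemble sampler using the Goodman & Weare stretch move (a sketch)."""
    rng = random.Random(seed)
    dim = len(walkers[0])
    chains = [[] for _ in walkers]
    lps = [log_post(w) for w in walkers]
    for _ in range(n_steps):
        for k in range(len(walkers)):
            j = rng.randrange(len(walkers) - 1)
            if j >= k:
                j += 1                      # pick a different walker
            # Draw z from g(z) ~ 1/sqrt(z) on [1/a, a] by inverting its CDF.
            z = (rng.random() * (math.sqrt(a) - 1 / math.sqrt(a))
                 + 1 / math.sqrt(a)) ** 2
            cand = [walkers[j][d] + z * (walkers[k][d] - walkers[j][d])
                    for d in range(dim)]
            lp_cand = log_post(cand)
            # Acceptance includes z**(dim-1) to preserve detailed balance.
            if math.log(rng.random()) < (dim - 1) * math.log(z) + lp_cand - lps[k]:
                walkers[k], lps[k] = cand, lp_cand
            chains[k].append(list(walkers[k]))  # each walker keeps its own chain
    return chains

rng = random.Random(0)
walkers = [[rng.gauss(0, 2), rng.gauss(0, 2)] for _ in range(10)]
chains = stretch_move_sampler(lambda x: -0.5 * (x[0]**2 + x[1]**2),
                              walkers, 2000)
```

The move is affine invariant, so it needs almost no tuning; in parallel implementations the ensemble is split into two halves so that each walker jumps using only walkers from the other half.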

Page 22
Page 23

Standard Diagnostics

Chain diagnostics as for Metropolis:

* Mean

* Gelman and Rubin

* Marginals
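The Gelman and Rubin diagnostic can be sketched as follows, comparing between-chain and within-chain variance (the example chains below are synthetic):

```python
import random

def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) from chains of equal length.
    Values near 1 suggest the chains agree; R-hat < 1.1 is a common rule."""
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    B = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)  # between-chain
    W = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m              # within-chain
    var_hat = (n - 1) / n * W + B / n
    return (var_hat / W) ** 0.5

# Chains exploring the same region give R-hat near 1; disjoint chains do not.
rng = random.Random(4)
good = [[rng.gauss(0, 1) for _ in range(1000)] for _ in range(2)]
bad = [[rng.gauss(0, 1) for _ in range(1000)],
       [rng.gauss(5, 1) for _ in range(1000)]]
print(gelman_rubin(good), gelman_rubin(bad))
```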

Page 24

Appraisal of the performance

* Burn-in: moment of convergence

* Robustness: reliable, scalable, available

* Effective sample size

(Cartoon: "I am not mean, everybody is just too sensitive…" – Posterior)

Page 25

Measuring the entropy

Analytically: H = −∫ p(θ | y) log p(θ | y) dθ

Empirically: taking the mean log-posterior from the ensemble, H ≈ −(1/N) Σ log p(θᵢ | y)
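A sketch of the empirical estimate: with the θᵢ drawn from the posterior, minus the mean log-posterior estimates the entropy. The standard normal below is a made-up check with a known answer, H = ½ log(2πe) ≈ 1.419:

```python
import math
import random

def empirical_entropy(log_posts):
    """Empirical entropy estimate: minus the mean log-posterior of the ensemble."""
    return -sum(log_posts) / len(log_posts)

rng = random.Random(6)
samples = [rng.gauss(0, 1) for _ in range(20000)]
log_posts = [-0.5 * math.log(2 * math.pi) - 0.5 * x * x for x in samples]
print(empirical_entropy(log_posts))  # near 1.419, the standard-normal entropy
```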

Page 26

Burn in period

Page 27

Robustness

Sensitivity to tuning parameters

Page 28

Effective sample size

* Acceptance rate

* Correlation

  * within the chains

  * within the ensemble

* Thinning
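A crude sketch of an autocorrelation-based effective-sample-size estimate for a single chain (the 0.05 cutoff is an arbitrary truncation heuristic, not from the slides):

```python
import random

def effective_sample_size(chain, max_lag=200):
    """Crude ESS estimate: N / (1 + 2 * sum of positive autocorrelations)."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain) / n
    tau = 1.0
    for lag in range(1, min(max_lag, n)):
        acf = sum((chain[i] - mean) * (chain[i + lag] - mean)
                  for i in range(n - lag)) / (n * var)
        if acf < 0.05:
            break            # truncate once the correlation has died out
        tau += 2 * acf
    return n / tau

rng = random.Random(5)
iid = [rng.gauss(0, 1) for _ in range(2000)]
# Strongly autocorrelated chain: each value stays close to the previous one.
sticky = [0.0]
for _ in range(1999):
    sticky.append(0.95 * sticky[-1] + rng.gauss(0, 1))
print(effective_sample_size(iid), effective_sample_size(sticky))
```

A highly correlated chain carries far fewer effective samples than its raw length suggests, which is why acceptance rate, correlation, and thinning all matter.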

Page 29

Packages

Implementation in R and Julia on GitHub:

* MCMCEnsembleSampler

Original implementation in Python:

* emcee (Goodman and Weare)

Page 30

There are many tools… Use the right one.

Page 31

Sanda Dejanic