Bayesian Inference
Anders Gorm Pedersen, Molecular Evolution Group, Center for Biological Sequence Analysis, Technical University of Denmark (DTU)

Transcript of the presentation

Page 1: Bayesian Inference Anders Gorm Pedersen Molecular Evolution Group

Bayesian Inference

Anders Gorm Pedersen

Molecular Evolution Group
Center for Biological Sequence Analysis
Technical University of Denmark (DTU)

Page 2: Bayesian Inference Anders Gorm Pedersen Molecular Evolution Group

Bayes Theorem

Reverend Thomas Bayes (1702-1761)

P(A | B) = P(B | A) × P(A) / P(B)

P(M | D) = P(D | M) × P(M) / P(D)

P(D|M): Probability of data given model = likelihood

P(M): Prior probability of model

P(D): Essentially a normalizing constant so posterior will sum to one

P(M|D): Posterior probability of model
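
The relationship can be made concrete with a few lines of Python (a sketch added here for illustration; the function name and structure are not from the slides):

```python
def posteriors(likelihoods, priors):
    """Apply Bayes' theorem to every model M: P(M|D) = P(D|M) * P(M) / P(D).

    P(D) is obtained by summing P(D|M) * P(M) over all models, which is
    exactly what makes the returned posterior probabilities sum to one.
    """
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    evidence = sum(joint)                 # P(D), the normalizing constant
    return [j / evidence for j in joint]  # P(M|D) for each model
```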

Page 3: Bayesian Inference Anders Gorm Pedersen Molecular Evolution Group

Bayesians vs. Frequentists

• Meaning of probability:

– Frequentist: long-run frequency of an event in a repeatable experiment

– Bayesian: degree of belief, a way of quantifying uncertainty

• Finding probabilistic models from empirical data:

– Frequentist: parameters are fixed constants whose true values we are trying to find good (point) estimates for.

– Bayesian: uncertainty concerning parameter values is expressed by means of a probability distribution over possible parameter values

Page 4: Bayesian Inference Anders Gorm Pedersen Molecular Evolution Group

Bayes theorem: example

William is concerned that he may have reggaetonitis - a rare, inherited disease that affects 1 in 1000 people

Prior probability distribution

Doctor Bayes investigates William using a new, efficient test:

If you have the disease, the test will always be positive.
If not: false positive rate = 1%

William tests positive. What is the probability that he actually has the disease?

Page 5: Bayesian Inference Anders Gorm Pedersen Molecular Evolution Group

Bayes theorem: example II

Model parameter: d (does William have the disease?)

Possible parameter values: d=Y, d=N

Possible outcomes (data):
+ : Test is positive
- : Test is negative

Observed data: +

Likelihoods:
P(+|Y) = 1.0 (Test positive given disease)
P(-|Y) = 0.0 (Test negative given disease)
P(+|N) = 0.01 (Test positive given no disease)
P(-|N) = 0.99 (Test negative given no disease)

Page 6: Bayesian Inference Anders Gorm Pedersen Molecular Evolution Group

Bayes theorem: example III

P(M | D) = P(D | M) × P(M) / P(D)

P(Y|+) = P(+|Y) × P(Y) / P(+) = P(+|Y) × P(Y) / [P(+|Y) × P(Y) + P(+|N) × P(N)]

= (1 × 0.001) / (1 × 0.001 + 0.01 × 0.999) ≈ 9.1%

Another way of understanding the result:

(1) Test 1,000 people.
(2) Of these, 1 has the disease and tests positive.
(3) 1% of the remaining 999 (about 10 people) will give a false positive test

=> Out of 11 positive tests, 1 is true: P(Y|+) = 1/11 = 9.1%
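
The arithmetic on this slide can be checked with a short, self-contained Python snippet (values taken from the example above; the variable names are mine):

```python
# Disease example: prior P(Y) = 0.001, sensitivity P(+|Y) = 1.0,
# false positive rate P(+|N) = 0.01.
prior_Y, prior_N = 0.001, 0.999
lik_pos_Y, lik_pos_N = 1.0, 0.01

evidence = lik_pos_Y * prior_Y + lik_pos_N * prior_N  # P(+)
posterior_Y = lik_pos_Y * prior_Y / evidence          # P(Y|+)
print(f"P(Y|+) = {posterior_Y:.3f}")                  # prints 0.091, i.e. about 9.1%
```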

Page 7: Bayesian Inference Anders Gorm Pedersen Molecular Evolution Group

Bayes theorem: example IV

In Bayesian statistics we use data to update our ever-changing view of reality

Page 8: Bayesian Inference Anders Gorm Pedersen Molecular Evolution Group

MCMC: Markov chain Monte Carlo

Problem: for complicated models, the parameter space is enormous.

It is not easy (and may not even be possible) to find the posterior distribution analytically.

Solution: MCMC = Markov chain Monte Carlo

Start in random position on probability landscape.

Attempt step of random length in random direction.

(a) If move ends higher up: accept move

(b) If move ends lower down: accept move with probability

P(accept) = P_low / P_high

Record the parameter values for accepted moves in a file.

After many, many repetitions, points will be sampled in proportion to the height of the probability landscape.
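
A minimal sketch of this accept/reject rule in Python, for a one-dimensional landscape (the target density and all names below are illustrative, not part of the slides):

```python
import math
import random

def target(x):
    # Hypothetical unnormalized probability landscape with two peaks.
    return math.exp(-(x - 1.0) ** 2) + 0.5 * math.exp(-((x + 2.0) ** 2) / 0.5)

def metropolis(n_steps, step_size=0.5, x=0.0):
    samples = []
    p_current = target(x)
    for _ in range(n_steps):
        # Attempt a step of random length in a random direction.
        x_new = x + random.uniform(-step_size, step_size)
        p_new = target(x_new)
        # (a) ends higher up: always accept; (b) ends lower down: accept with P_low / P_high.
        if p_new >= p_current or random.random() < p_new / p_current:
            x, p_current = x_new, p_new
        samples.append(x)  # record the current parameter value
    return samples

samples = metropolis(100_000)  # sampled in proportion to the landscape's height
```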

Page 9: Bayesian Inference Anders Gorm Pedersen Molecular Evolution Group

MCMCMC: Metropolis-coupled Markov Chain Monte Carlo

Problem: if there are multiple peaks in the probability landscape, then MCMC may get stuck on one of them

Solution: Metropolis-coupled Markov Chain Monte Carlo = MCMCMC = MC³

MC³ essential features:
• Run several Markov chains simultaneously
• One chain is “cold”: this chain performs the MCMC sampling
• The rest of the chains are “heated”: they move faster across valleys
• Each turn the cold and warm chains may swap positions (swap probability is proportional to the ratio between heights)

More peaks will be visited

More chains means a better chance of visiting all important peaks, but each additional chain increases run-time
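
The chain-swapping idea can be sketched in the same style (a rough illustration only: it reuses the hypothetical target() from the previous sketch and the standard min(1, ratio) swap acceptance; real implementations such as MrBayes are considerably more sophisticated):

```python
import math
import random

def target(x):
    # Same hypothetical two-peaked landscape as in the previous sketch.
    return math.exp(-(x - 1.0) ** 2) + 0.5 * math.exp(-((x + 2.0) ** 2) / 0.5)

def mc3(n_steps, temperatures=(1.0, 2.0, 4.0), step_size=0.5):
    """Metropolis-coupled MCMC: chain 0 is cold, the others are heated."""
    xs = [0.0] * len(temperatures)
    cold_samples = []
    for _ in range(n_steps):
        # Ordinary Metropolis update within each chain; heated chains see the
        # flattened landscape target(x) ** (1 / T), so they cross valleys more easily.
        for i, temp in enumerate(temperatures):
            x_new = xs[i] + random.uniform(-step_size, step_size)
            p_old = target(xs[i]) ** (1.0 / temp)
            p_new = target(x_new) ** (1.0 / temp)
            if p_new >= p_old or random.random() < p_new / p_old:
                xs[i] = x_new
        # Propose swapping the states of two randomly chosen chains.
        i, j = random.sample(range(len(temperatures)), 2)
        pi, pj = target(xs[i]), target(xs[j])
        ratio = ((pj ** (1 / temperatures[i])) * (pi ** (1 / temperatures[j]))) / \
                ((pi ** (1 / temperatures[i])) * (pj ** (1 / temperatures[j])))
        if random.random() < min(1.0, ratio):
            xs[i], xs[j] = xs[j], xs[i]
        cold_samples.append(xs[0])  # only the cold chain is used for inference
    return cold_samples
```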

Page 10: Bayesian Inference Anders Gorm Pedersen Molecular Evolution Group

MCMCMC for inference of phylogeny

Result of run:

(a) Substitution parameters

(b) Tree topologies

Page 11: Bayesian Inference Anders Gorm Pedersen Molecular Evolution Group

Posterior probability distributions of substitution parameters

Page 12: Bayesian Inference Anders Gorm Pedersen Molecular Evolution Group

Posterior Probability Distribution over Trees

• MAP (maximum a posteriori) estimate of phylogeny: the tree topology occurring most often in the MCMCMC output

• Clade support: posterior probability of a group = frequency of the clade in the sampled trees

• 95% credible set of trees: order trees from highest to lowest posterior probability, then add trees with the highest probability until the cumulative posterior probability is 0.95
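
These summaries are straightforward to compute once the sampled topologies are in hand. A sketch follows, assuming the topologies have been read in as canonical strings so that identical topologies compare equal; clade support would additionally require splitting each sampled tree into clades, which is omitted here:

```python
from collections import Counter

def summarize_trees(sampled_topologies, credibility=0.95):
    """MAP tree and credible set from a list of sampled topology strings."""
    counts = Counter(sampled_topologies)
    total = len(sampled_topologies)

    # MAP estimate: the topology occurring most often in the sample.
    map_tree, map_count = counts.most_common(1)[0]

    # 95% credible set: add topologies from highest to lowest posterior
    # probability until the cumulative probability reaches 0.95.
    credible_set, cumulative = [], 0.0
    for tree, count in counts.most_common():
        credible_set.append(tree)
        cumulative += count / total
        if cumulative >= credibility:
            break

    return map_tree, map_count / total, credible_set
```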