INTRODUCTION TO BAYESIAN INFERENCE PART...

41
INTRODUCTION TO BAYESIAN INFERENCE PART 1 CHRIS BISHOP

Transcript of INTRODUCTION TO BAYESIAN INFERENCE PART...

Page 1: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

INTRODUCTION TOBAYESIAN INFERENCE – PART 1

CHRIS BISHOP

Page 2: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian
Page 3: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

http://research.microsoft.com/~cmbishop

Page 4: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

First Generation

“Artificial Intelligence” (GOFAI)

Within a generation ... the problem of creating ‘artificial intelligence’ will largely be solved

Marvin Minsky (1967)

Expert Systems (1980s)

knowledge-based AI

rules elicited from humans

Combinatorial explosion

General theme: hand-crafted rules

Page 5: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Second Generation

Neural networks, support vector machines

Difficult to incorporate complex domain knowledge

General theme: black-box statistical models

Page 6: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Third Generation

General theme: deep integration of domainknowledge and statistical learning

Bayesian framework

Probabilistic graphical models

Fast inference using local message-passing

Origins: Bayesian networks, decision theory, HMMs, Kalman filters, MRFs, mean field theory, ...

Page 7: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Probability Theory

Apples and Oranges

Fruit is orange, what is probability that box was blue?

Page 8: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

The Rules of Probability

Sum rule

Product rule

Page 9: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Bayes’ Theorem

Page 10: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Oranges and Apples

Suppose

Suppose we select an orange

Then

and hence

Page 11: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Probability Densities

Page 12: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Bayesian Inference

Consistent use of probability to quantify uncertainty

Predictions involve marginalisation, e.g.

posterior likelihood function prior

Page 13: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Why is prior knowledge important?

?

x

y

Page 14: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Probabilistic Graphical Models

Combine probability theory with graphs

new insights into existing models

framework for designing new models

Graph-based algorithms for calculation and computation (c.f. Feynman diagrams in physics)

efficient software implementation

Directed graphs to specify the model

Factor graphs for inference and learning

Page 15: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Decomposition

Consider an arbitrary joint distribution

By successive application of the product rule:

Page 16: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Directed Graphs

Arrows indicate causal relationships

Page 17: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

MAAS

Manchester Asthma and Allergies Study

Goal: discover environmental and genetic causes of asthma

1,186 children monitored since birth

640k SNPs per child

Many environment and physiological measurements:

skin and IgE blood tests at age 1, 3, 5, and 8

wheezing, methacholine response,

pets, parental smoking, day-care, breast feeding, ...

Page 18: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian
Page 19: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

{0,1} {0,1} {0,1} {0,1}

Page 20: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian
Page 21: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

21

Page 22: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Factor Graphs

Page 23: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

From Directed Graph to Factor Graph

Page 24: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Inference on Graphs

Page 25: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Factor Trees: Separation

v w x

f1(v,w) f2(w,x)

y

f3(x,y)

z

f4(x,z)

Page 26: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Messages: From Factors To Variables

w x

f2(w,x)

y

f3(x,y)

z

f4(x,z)

Page 27: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Messages: From Variables To Factors

x

f2(w,x)

y

f3(x,y)

z

f4(x,z)

Page 28: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

What if the graph is not a tree?

Keep iterating the messages:

loopy belief propagation

Page 29: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

What if marginalisations are not tractable?

True distribution Monte Carlo VMP / Loopy BP / EP

Page 30: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Illustration: Bayesian Ranking

Goal: global ranking from noisy partial rankings

Conventional approach: Elo (used in chess)

maintains a single strength value for each player

cannot handle team games, or > 2 players

Ralf Herbrich

Tom Minka

Thore Graepel

Page 31: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Two Player Match Outcome Model

y12

1 2

s1 s2

Page 32: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Two Team Match Outcome Model

y12

t1 t2

s2 s3s1 s4

Page 33: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Multiple Team Match Outcome Model

s1 s2 s3 s4

t1

y12

t2 t3

y23

Page 34: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

s1 s2 s3 s4

t1

y12

t2 t3

y23

Gaussian Prior Factors

Ranking Likelihood Factors

Page 35: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

0

5

10

15

20

25

30

35

40Level

0 100 200 300 400

Number of Games

char (Elo)

SQLWildman (Elo)

char (TrueSkill™)

SQLWildman (TrueSkill™)

Page 36: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Skill Dynamics

y12

¼1 ¼2

y12

¼1’ ¼2’

s1 s2 s1’ s2’

Page 37: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

TrueSkillTM

Xbox 360 Live: launched September 2005

TrueSkillTM for ranking and to match players

10M active users, 2.5M matches per day

“Planet-scale” application of Bayesian methods

Page 38: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

John Winn

Page 39: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

research.microsoft.com/infernet

Tom Minka

John Winn

John Guiver

Anitha Kannan

Page 40: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Infer.Net demonstration

Page 41: INTRODUCTION TO BAYESIAN INFERENCE PART 1mlg.eng.cam.ac.uk/mlss09/mlss_slides/Bishop-MLSS-09-1.pdfINTRODUCTION TO BAYESIAN INFERENCE ... knowledge and statistical learning ... Bayesian

Different?

True False

Prob. Cure

(Treated)Prob. Cure

(Placebo)

Prob. Cure

(All)