Introduction to Bayesian Belief Nets Russ Greiner Dep’t of Computing Science Alberta Ingenuity Centre for Machine Learning University of Alberta bn.html

Page 1:

Introduction to Bayesian Belief Nets

Russ Greiner
Dep’t of Computing Science
Alberta Ingenuity Centre for Machine Learning
University of Alberta

Page 2:




Page 3:



Gates says [LATimes, 28/Oct/96]:

Microsoft’s competitive advantage is its expertise in “Bayesian networks”

Current Products:
  Microsoft Pregnancy and Child Care (MSN)
  Answer Wizard (Office 95, Office 2000)
  Print Troubleshooter
  Excel Workbook Troubleshooter
  Office 95 Setup Media Troubleshooter
  Windows NT 4.0 Video Troubleshooter
  Word Mail Merge Troubleshooter

Page 4:


Motivation (II)

US Army: SAIP (Battalion Detection from SAR, IR, … Gulf War)
NASA: Vista (DSS for Space Shuttle)
GE: Gems (real-time monitor for utility generators)
Intel: (infer possible processing problems from end-of-line tests on semiconductor chips)
KIC:
  medical: sleep disorders, pathology, trauma care, hand and wrist evaluations, dermatology, home-based health evaluations
  DSS for capital equipment: locomotives, gas-turbine engines, office equipment

Page 5:


Motivation (III)

Lymph-node pathology diagnosis
Manufacturing control
Software diagnosis
Information retrieval

Types of tasks:
  Classification/Regression
  Sensor Fusion
  Prediction/Forecasting

Page 6:


Outline

Existing uses of Belief Nets (BNs)
How to reason with BNs
Specific Examples of BNs
Contrast with Rules, Neural Nets, …
Possible applications of BNs
Challenges
  How to reason efficiently
  How to learn BNs

Page 7:


[Figure: a patient (“Blah blah ouch yak ouch …”) describing a problem to a physician]

Symptoms: chief complaint, history, …
Physical exam, test results, …
Treatment, …

Page 8:


Objectives: Decision Support System

Determine
  which tests to perform
  which repair to suggest
  based on costs, sensitivity/specificity, …
Use all sources of information
  symbolic (discrete observations, history, …)
  signal (from sensors)
Handle partial information
Adapt to track fault distribution

Page 9:


Underlying Task

Situation: Given observations {O1=v1, …, Ok=vk} (symptoms, history, test results, …),
what is the best DIAGNOSIS Dxi for the patient?

Approach 1: Use a set of “obs1 & … & obsm ⇒ Dxi” rules
  Example: If Temp>100 & BP = High & Cough = Yes ⇒ DiseaseX

but…
  Rules are seldom completely certain
  Need a rule for each situation:
    for each diagnosis Dxr
    for each set of possible values vj for Oj
    for each subset of obs. {Ox1, Ox2, …} ⊆ {Oj}
  Can’t use the example rule if we only know Temp and BP

Page 10:


Underlying Task, II

Situation: Given observations {O1=v1, …, Ok=vk} (symptoms, history, test results, …),
what is the best DIAGNOSIS Dxi for the patient?

Approach 2: Compute probabilities of Dxi given observations { obsj }:

  P( Dx = u | O1= v1, …, Ok= vk )

Challenge: How to express probabilities?

Page 11:

How to deal with Probabilities

Sufficient: “atomic events”:

  P( Dx = u, O1=v1, …, Ok=vk, …, ON=vN )
  for all 2^(1+N) values u ∈ {T, F}, vj ∈ {T, F}

  P( Dx=T, O1=T, O2=T, …, ON=T ) = 0.03
  P( Dx=T, O1=T, O2=T, …, ON=F ) = 0.4
  …
  P( Dx=T, O1=F, O2=F, …, ON=T ) = 0
  …
  P( Dx=F, O1=F, O2=F, …, ON=F ) = 0.01

• Then: Marginalize:

$$P(Dx = u, O_1 = v_1, \ldots, O_k = v_k) = \sum_{v_{k+1}, \ldots, v_N} P(Dx = u, O_1 = v_1, \ldots, O_k = v_k, \ldots, O_N = v_N)$$

$$P(Dx = u \mid O_1 = v_1, \ldots, O_k = v_k) = \frac{P(Dx = u, O_1 = v_1, \ldots, O_k = v_k)}{P(O_1 = v_1, \ldots, O_k = v_k)}$$

• But… even if binary Dx and 20 binary obs.’s: > 2,097,000 numbers!
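To make the size of the "atomic events" table concrete, here is a minimal sketch that counts joint entries for one binary diagnosis plus N binary observations. The function names are illustrative and not from the slides.

```python
def num_atomic_events(n_obs):
    """Entries in the full joint over one binary Dx and n_obs binary observations."""
    return 2 ** (1 + n_obs)


def terms_to_marginalize(r):
    """Joint entries that must be added up to sum out r binary variables."""
    return 2 ** r


# Growth of the table as observations are added.
for n in (3, 10, 20):
    print(n, num_atomic_events(n))

# With 20 observations the table has 2**21 entries, the "> 2,097,000 numbers" on the slide.
print(num_atomic_events(20))
```

The doubling per added variable is what motivates looking for a more compact representation.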

Page 12:


Problems with “Atomic Events”

Representation is not intuitive
  Should make “connections” explicit
  use “local information”: P(Jaundice | Hepatitis), P(LightDim | BadBattery), …
Too many numbers – O(2^N)
  Hard to store
  Hard to use [Must add 2^r values to marginalize r variables]
  Hard to learn [Takes O(2^N) samples to learn 2^N numbers]

⇒ Include only necessary “connections”: Belief Nets

Page 13:


? Hepatitis?

? Hepatitis, not Jaundiced but +BloodTest?



Page 14:


Hepatitis Example

• (Boolean) Variables:
  H  Hepatitis
  J  Jaundice
  B  (positive) Blood test

• Want P( H=1 | J=0, B=1 ), …, P( H=1 | B=1, J=1 ), P( H=1 | B=0, J=0 ), …

• Alternatively…

Option 1:

  J  B  H  P(J, B, H)
  0  0  0  0.03395
  0  0  1  0.0095
  0  1  0  0.0003
  0  1  1  0.1805
  1  0  0  0.01455
  1  0  1  0.038
  1  1  0  0.00045
  1  1  1  0.722

… Marginalize/Conditionalize, to get P( H=1 | J=0, B=1 ) …
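The marginalize/conditionalize step of Option 1 can be sketched as follows. The table entries are copied as transcribed from the slide, reading the seventh row as J=1, B=1, H=0; the helper names `prob` and `posterior_h1` are illustrative only.

```python
# Joint P(J, B, H) as transcribed from the slide; keys are (j, b, h).
JOINT = {
    (0, 0, 0): 0.03395, (0, 0, 1): 0.0095,
    (0, 1, 0): 0.0003,  (0, 1, 1): 0.1805,
    (1, 0, 0): 0.01455, (1, 0, 1): 0.038,
    (1, 1, 0): 0.00045, (1, 1, 1): 0.722,
}


def prob(j=None, b=None, h=None):
    """Probability of a partial assignment; a None argument is summed out."""
    return sum(
        p for (jj, bb, hh), p in JOINT.items()
        if (j is None or jj == j) and (b is None or bb == b) and (h is None or hh == h)
    )


def posterior_h1(j, b):
    """P(H=1 | J=j, B=b): joint mass with H=1 divided by the evidence mass."""
    return prob(j=j, b=b, h=1) / prob(j=j, b=b)


print(posterior_h1(j=0, b=1))
```

Every query touches the whole table, which is workable for three variables but not for twenty.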

Page 15:


Encoding Causal Links

Simple Belief Net:

  CP table columns: h, P(B=1 | H=h), P(B=0 | H=h)

  Node ~ Variable
  Link ~ “Causal dependency”
  “CPTable” ~ P(child | parents)

Page 16:





Encoding Causal Links

  h  P(B=1 | H=h)
  1  0.95
  0  0.03

  h  b  P(J=1 | h, b)
  1  1  0.8
  1  0  0.8
  0  1  0.3
  0  0  0.3

P(J | H, B=0) = P(J | H, B=1) for all J, H  ⇒  P( J | H, B ) = P( J | H )

J is INDEPENDENT of B, once we know H
Don’t need B → J arc!

Page 17:





Encoding Causal Links

  h  P(B=1 | H=h)
  1  0.95
  0  0.03

  h  P(J=1 | h)
  1  0.8
  0  0.3

P(J | H, B=0) = P(J | H, B=1) for all J, H  ⇒  P( J | H, B ) = P( J | H )

J is INDEPENDENT of B, once we know H
Don’t need B → J arc!


Page 19:


Sufficient Belief Net

  h  P(B=1 | H=h)
  1  0.95
  0  0.03

  h  P(J=1 | h)
  1  0.8
  0  0.3

Requires:
  P(H=1) known
  P(J=1 | H=1) known
  P(B=1 | H=1) known
(Only 5 parameters, not 7)

Hence:

$$P(H=1 \mid J=0, B=1) = \frac{1}{\alpha}\, P(H=1)\, P(J=0 \mid H=1)\, P(B=1 \mid J=0, H=1)$$

where α is the normalizing constant and P(B=1 | J=0, H=1) = P(B=1 | H=1).
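A minimal sketch of this computation for the net J ← H → B, using the two CP tables from the slide. The prior P(H=1) is not given in the transcript, so the value 0.1 in the usage line is purely an assumption for illustration, as are the function names.

```python
P_B1_GIVEN_H = {1: 0.95, 0: 0.03}  # P(B=1 | H=h), from the slide
P_J1_GIVEN_H = {1: 0.8, 0: 0.3}    # P(J=1 | H=h), from the slide


def bernoulli(p_true, value):
    """Probability that a binary variable takes `value`, given P(value=1) = p_true."""
    return p_true if value == 1 else 1.0 - p_true


def posterior_h1(prior_h1, j, b):
    """P(H=1 | J=j, B=b), using P(B | J, H) = P(B | H)."""
    score = {}
    for h, prior in ((1, prior_h1), (0, 1.0 - prior_h1)):
        score[h] = prior * bernoulli(P_J1_GIVEN_H[h], j) * bernoulli(P_B1_GIVEN_H[h], b)
    alpha = score[0] + score[1]  # normalizing constant, equal to P(J=j, B=b)
    return score[1] / alpha


# Hypothetical prior P(H=1) = 0.1: no jaundice, but a positive blood test.
print(posterior_h1(prior_h1=0.1, j=0, b=1))
```

Only five numbers are stored (one prior and four conditionals), yet any of the posterior queries from the full joint table can still be answered.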

Page 20:



B does depend on J: If J=1, then likely that H=1 ⇒ B=1

but… ONLY THROUGH H: If we know H=1, then it is likely that B=1… it doesn’t matter whether J=1 or J=0!

  P(B=1 | J=0, H=1) = P(B=1 | H=1)

N.b., B and J ARE correlated a priori: P(B | J) ≠ P(B)
GIVEN H, they become uncorrelated: P(B | J, H) = P(B | H)




Page 21:


Factored Distribution

Symptoms independent, given Disease
  H  Hepatitis
  J  Jaundice
  B  (positive) Blood test
  P( B | J ) ≠ P( B ) but P( B | J, H ) = P( B | H )

ReadingAbility and ShoeSize are dependent:
  P(ReadAbility | ShoeSize) ≠ P(ReadAbility)
but become independent, given Age:
  P(ReadAbility | ShoeSize, Age) = P(ReadAbility | Age)

[Figure: network over Age, ShoeSize and Reading]

Page 22:


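“Naïve Bayes”

Classification Task:
  Given { O1 = v1, …, On = vn }
  Find hi that maximizes P(H = hi | O1 = v1, …, On = vn)

[Figure: H with children O1, O2, …, On]

Independent: P(Oj | H, Ok, …) = P(Oj | H)

$$P(H = h_i \mid O_1 = v_1, \ldots, O_n = v_n) = \frac{1}{\alpha}\, P(H = h_i) \prod_j P(O_j = v_j \mid H = h_i)$$

Find

$$\arg\max_{h_i}\; P(H = h_i) \prod_j P(O_j = v_j \mid H = h_i)$$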

Page 23:


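Naïve Bayes (con’t)

$$P(H = h_i \mid O_1 = v_1, \ldots, O_n = v_n) = \frac{1}{\alpha}\, P(H = h_i) \prod_j P(O_j = v_j \mid H = h_i)$$

Normalizing term:

$$\alpha = P(O_1 = v_1, \ldots, O_n = v_n) = \sum_i P(H = h_i) \prod_j P(O_j = v_j \mid H = h_i)$$

(No need to compute, as same for all hi)

Easy to use for Classification
Can use even if some vj’s not specified

If k Dx’s and n Oi’s, requires only k priors, n * k pairwise-conditionals
(Not 2^(n+k)… relatively easy to learn)

For binary H and n binary Oj’s: 1 + 2n numbers, rather than 2^(n+1) – 1

[Figure: H with children O1, O2, …, On]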
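The Naïve Bayes rule above, including the case where some observations are not specified, can be sketched as below. The hypotheses, probabilities, and function name are hypothetical illustrations, not values from the talk; unobserved variables are simply left out of the product.

```python
from math import prod


def naive_bayes_posterior(priors, cond, observed):
    """Posterior over hypotheses for binary observations.

    priors:   {h: P(H=h)}
    cond:     {h: [P(O_j=1 | H=h) for each observation j]}
    observed: {j: 0 or 1} for the observations whose values are known
    """
    scores = {}
    for h, prior in priors.items():
        likelihoods = (
            cond[h][j] if v == 1 else 1.0 - cond[h][j]
            for j, v in observed.items()
        )
        scores[h] = prior * prod(likelihoods)
    alpha = sum(scores.values())  # normalizing term, same for every h
    return {h: s / alpha for h, s in scores.items()}


# Hypothetical model: two diagnoses, two binary observations.
priors = {"flu": 0.3, "cold": 0.7}
cond = {"flu": [0.9, 0.6], "cold": [0.2, 0.5]}

posterior = naive_bayes_posterior(priors, cond, {0: 1})  # only O_0 is known
print(max(posterior, key=posterior.get), posterior)
```

For classification alone the division by `alpha` could be skipped, since it does not change which hypothesis scores highest.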

Page 24:


Bigger Networks

Intuition: Show CAUSAL connections:
  GeneticPH CAUSES Hepatitis; Hepatitis CAUSES Jaundice

If GeneticPH, then expect Jaundice: GeneticPH ⇒ Hepatitis ⇒ Jaundice
But only via Hepatitis: GeneticPH and not Hepatitis ⇏ Jaundice

  P( J | D ) ≠ P( J ) but P( J | D, H ) = P( J | H )

  d  i  P(H=1 | d, i)
  1  1  0.82
  1  0  0.10
  0  1  0.45
  0  0  0.04

  h  P(J=1 | h)
  1  0.8
  0  0.3

  h  P(B=1 | h)
  1  0.98
  0  0.01









Page 25:


Belief Nets

DAG structure
  Each node ~ Variable v
  v depends (only) on its parents
    + conditional prob: P(vi | parenti = 0,1,… )
  v is INDEPENDENT of non-descendants, given assignments to its parents

Given H = 1,
  - D has no influence on J
  - J has no influence on B
  - etc.
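The independence claim can be checked by brute-force enumeration on the five-variable network (d, i, h, j, b). The CP tables for H, J and B are taken from the "Bigger Networks" slide; the priors for d and i do not appear in the transcript, so the values below are assumptions chosen only to make the sketch runnable.

```python
from itertools import product

P_D1, P_I1 = 0.1, 0.2  # hypothetical priors, not from the slides
P_H1 = {(1, 1): 0.82, (1, 0): 0.10, (0, 1): 0.45, (0, 0): 0.04}  # P(H=1 | d, i)
P_J1 = {1: 0.8, 0: 0.3}    # P(J=1 | h)
P_B1 = {1: 0.98, 0: 0.01}  # P(B=1 | h)
NAMES = ("d", "i", "h", "j", "b")


def bern(p, v):
    return p if v == 1 else 1.0 - p


def joint(d, i, h, j, b):
    """Product of each node's CP-table entry given its parents."""
    return (bern(P_D1, d) * bern(P_I1, i) * bern(P_H1[(d, i)], h)
            * bern(P_J1[h], j) * bern(P_B1[h], b))


def prob(**fixed):
    """Marginal probability of a partial assignment, by enumerating all worlds."""
    total = 0.0
    for world in product((0, 1), repeat=len(NAMES)):
        assignment = dict(zip(NAMES, world))
        if all(assignment[k] == v for k, v in fixed.items()):
            total += joint(**assignment)
    return total


def cond(query, given):
    return prob(**query, **given) / prob(**given)


print(cond({"j": 1}, {"d": 1}), cond({"j": 1}, {"d": 0}))  # differ: D influences J
print(cond({"j": 1}, {"d": 1, "h": 1}), cond({"j": 1}, {"d": 0, "h": 1}))  # equal given H
```

Enumeration is exponential in the number of variables, so it is useful here only as a check on a tiny network.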




Page 26:


Less Trivial Situations

• N.b., obs1 is not always independent of obs2 given H
• Eg, FamilyHistoryDepression ‘causes’ MotherSuicide and Depression;
  MotherSuicide causes Depression (w/ or w/o F.H.Depression)
• Here, P( D | MS, FHD ) ≠ P( D | FHD )!
• Can be done using Belief Network, but need to specify:
    P( FHD )           1
    P( MS | FHD )      2
    P( D | MS, FHD )   4

  CP table: P(MS=1 | FHD=f), indexed by f
  CP table: P(D=1 | FHD=f, MS=m), indexed by f, m

Page 27:


Example: Car Diagnosis

Page 28:



Page 29:



A Logical Alarm Reduction Mechanism
• 8 diagnoses, 16 findings, …

Page 30:


Troop Detection

Page 31:


ARCO1: Forecasting Oil Prices

Page 32:


ARCO1: Forecasting Oil Prices

Page 33:


Forecasting Potato Production

Page 34:


Warning System

Page 35:


Extensions

Find best values (posterior distr.) for SEVERAL (> 1) “output” variables
Partial specification of “input” values
  only subset of variables
  only “distribution” of each input variable
General Variables
  Discrete, but domain > 2
  Continuous (Gaussian: x = Σi bi yi for parents {Y})
Decision Theory: Decision Nets (Influence Diagrams)
  Making Decisions, not just assigning prob’s
Storing P( v | p1, p2, …, pk )
  General “CP Tables”: O(2^k)
  Noisy-Or, Noisy-And, Noisy-Max
  “Decision Trees”
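As an illustration of why Noisy-Or is cheaper to store than a general CP table, the sketch below uses the standard noisy-OR form: one link probability per parent (plus an optional leak term), from which all 2^k table rows can be generated. The numbers and function names are hypothetical.

```python
from itertools import product


def noisy_or(link_probs, parent_values, leak=0.0):
    """P(v=1 | parents) under a noisy-OR model.

    link_probs[i] is the probability that parent i, when on, turns v on by itself;
    leak is the probability that v is on even when every parent is off.
    """
    p_off = 1.0 - leak
    for q, on in zip(link_probs, parent_values):
        if on:
            p_off *= 1.0 - q  # each active parent independently fails to trigger v
    return 1.0 - p_off


def full_cpt(link_probs, leak=0.0):
    """Expand the k noisy-OR parameters into the full 2**k-row CP table."""
    k = len(link_probs)
    return {pv: noisy_or(link_probs, pv, leak) for pv in product((0, 1), repeat=k)}


cpt = full_cpt([0.8, 0.5, 0.3])  # 3 stored numbers generate 8 table rows
print(len(cpt), cpt[(1, 1, 0)])
```

The saving comes at the cost of assuming that the parents act independently on the child.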

Page 36:


Outline

Existing uses of Belief Nets (BNs)
How to reason with BNs
Specific Examples of BNs
Contrast with Rules, Neural Nets, …
Possible applications of BNs
Challenges
  How to reason efficiently
  How to learn BNs

Page 37:


Belief Nets vs Rules

Both have “Locality”
  Specific clusters (rules / connected nodes)
Often same nodes (rep’ning Propositions) but
  BN: Cause ⇒ Effect, “Hep ⇒ Jaundice”, P(J | H)
  Rule: Effect ⇒ Cause, “Jaundice ⇒ Hep”
  WHY?: Easier for people to reason CAUSALLY, even if use is DIAGNOSTIC
BN provide OPTIMAL way to deal with
  + Uncertainty
  + Vagueness (var not given, or only dist)
  + Error
  … Signals meeting Symbols …
BN permits different “direction”s of inference

Page 38:


Belief Nets vs Neural Nets

Both have “graph structure” but
  BN: Nodes have SEMANTICs; Combination Rules: Sound Probability
  NN: Nodes: arbitrary; Combination Rules: Arbitrary
So harder to
  Initialize NN
  Explain NN
  (But perhaps easier to learn NN from examples only?)
BNs can deal with
  Partial Information
  Different “direction”s of inference

Page 39:


Belief Nets vs Markov Nets

Each uses “graph structure” to FACTOR a distribution
  … explicitly specify dependencies, implicitly …
but subtle differences…
  BNs capture “causality”, “hierarchies”
  MNs capture “temporality”

Technical:
  BNs use DIRECTED arcs ⇒ allow “induced dependencies”
    I(A, {}, B)    “A independent of B, given {}”
    ¬I(A, C, B)   “A dependent on B, given C”
  MNs use UNDIRECTED arcs ⇒ allow other independencies
    I(A, BC, D)   A independent of D, given B, C
    I(B, AD, C)   B independent of C, given A, D

[Figures: a directed graph over A, B, C and an undirected graph over A, B, C, D]
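An "induced dependency" can be seen in a tiny directed example. The sketch assumes a hypothetical network A → C ← B with A and B independent fair coins and C = A OR B; none of these numbers come from the slides.

```python
from itertools import product

NAMES = ("a", "b", "c")


def joint(a, b, c):
    """Hypothetical v-structure: A, B are independent fair coins, C = A OR B."""
    return 0.25 if c == (a | b) else 0.0


def prob(**fixed):
    """Marginal probability of a partial assignment, by enumeration."""
    return sum(
        joint(*world) for world in product((0, 1), repeat=3)
        if all(world[NAMES.index(k)] == v for k, v in fixed.items())
    )


def cond(query, given):
    return prob(**query, **given) / prob(**given)


p_a = prob(a=1)
p_a_given_b = cond({"a": 1}, {"b": 1})            # same as p_a: I(A, {}, B)
p_a_given_c = cond({"a": 1}, {"c": 1})
p_a_given_bc = cond({"a": 1}, {"b": 1, "c": 1})   # differs: not I(A, C, B)
print(p_a, p_a_given_b, p_a_given_c, p_a_given_bc)
```

Once C is known, learning B changes the belief in A, even though A and B are independent beforehand.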



Page 40:


Uses of Belief Nets #1

Medical Diagnosis: “Assist/Critique” MD
  identify diseases not ruled-out
  specify additional tests to perform
  suggest treatments appropriate/cost-effective
  react to MD’s proposed treatment
Decision Support: Find/repair faults in complex machines
  [Device, or Manufacturing Plant, or …]
  … based on sensors, recorded info, history, …
Preventative Maintenance: Anticipate problems in complex machines
  [Device, or Manufacturing Plant, or …]
  … based on sensors, statistics, recorded info, device history, …

Page 41:


Uses (con’t)

Logistics Support: Stock warehouses appropriately
  … based on (estimated) freq. of needs, costs,
Diagnose Software: Find most probable bugs, given
  program behavior, core dump, source code, …
Part Inspection/Classification:
  … based on multiple sensors, background, model of production, …
Information Retrieval: Combine information from various sources,
  based on info from various “agents”, …
General: Partial Info, Sensor fusion
  - Classification
  - Interpretation
  - Prediction
  - …

Page 42:


Challenge #1: Computational Efficiency

For given BN: General problem is
  Given O1 = v1, …, On = vn
  Compute P(H | O1 = v1, …, On = vn)

+ If BN is “poly tree”, efficient alg.
- If BN is gen’l DAG (>1 path from X to Y)
  - NP-hard in theory
  - slow in practice

Tricks: Get approximate answer (quickly)
  + Use abstraction of BN
  + Use “abstraction” of query (range)





Page 43:


#2a: Obtaining Accurate BN

BN encodes distribution over n variables
  Not O(2^n) values, but “only” Σi 2^(ki)
  (Node ni binary, with ki parents)
  Still lots of values! … structure …

Qualitative Information
  Structure: “What depends on what?”
  • Easy for people (background knowledge)
  • But NP-hard to learn from samples…

Quantitative Information
  Actual CP-tables
  • Easy to learn, given lots of examples.
  • But people have hard time…

Knowledge acquisition: from human experts
Simple learning algorithm
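The parameter count Σi 2^(ki) for a network of binary nodes can be sketched as follows. The two example structures are the hepatitis and depression networks from earlier slides; the function name and dictionary encoding are illustrative choices.

```python
def num_parameters(parents):
    """Numbers needed for a BN over binary nodes: one per row of each CP table.

    parents maps each node name to the list of its parents.
    """
    return sum(2 ** len(pa) for pa in parents.values())


def full_joint_size(parents):
    """Independent entries in the full joint over the same binary variables."""
    return 2 ** len(parents) - 1


hepatitis_net = {"H": [], "J": ["H"], "B": ["H"]}
depression_net = {"FHD": [], "MS": ["FHD"], "D": ["FHD", "MS"]}

print(num_parameters(hepatitis_net), full_joint_size(hepatitis_net))    # 5 vs 7
print(num_parameters(depression_net), full_joint_size(depression_net))  # 7 vs 7
```

The depression network is fully connected, so factoring it saves nothing; the saving appears only when parent sets are small.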

Page 44:


Notes on Learning

Mixed Sources: Person provides structure; Algorithm fills-in numbers.

Just Human Expert: People produce CP-table, as well as structure
  Relatively few values really required
    Esp. if NoisyOr, NoisyAnd, NaiveBayes, …
  Actual values not that important
    … Sensitivity studies

Just Learning Algorithm: algorithms that learn from sample
  structure
  values

Page 45:


My Current Work

Learning Belief Nets
  Model selection: Challenging myth that MDL is appropriate criteria
  Learning “performance system”, not model
Validating Belief Nets
  “Error bars” around answers
Adaptive User Interfaces
Efficient Vision Systems
Foundations of Learnability
  Learning Active Classifiers
  Sequential learners
Condition Based maintenance, Bio-signal interpretation, …

Page 46:


#2b: Maintaining Accurate BN

The world changes.
Information in BN* may be
  perfect at time t
  sub-optimal at time t + 20
  worthless at time t + 200

Need to MAINTAIN a BN over time, using
  on-going human consultant
  Adaptive BN
    Dirichlet distribution (variables)
    Priors over BNs

Page 47:


Conclusions

Belief Nets are PROVEN TECHNOLOGY
  Medical Diagnosis
  DSS for complex machines
  Forecasting, Modeling, InfoRetrieval, …
Provide effective way to
  Represent complicated, inter-related events
  Reason about such situations
    • Diagnosis, Explanation, ValueOfInfo
    • Explain conclusions
    • Mix Symbolic and Numeric observations
Challenges
  Efficient ways to use BNs
  Ways to create BNs
  Ways to maintain BNs
  Reason about time

Page 48:


Extra Slides

AI Seminar: Friday, noon, CSC3-33. Free PIZZA!

Crusher Controller
Formal Framework
Decision Nets
Developing the Model
Why Reasoning is Hard
Learning Accurate Belief Nets

Page 49:


References

• Overview textbooks:
  Judea Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, 1988.
  Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, Prentice Hall, 1995. (See esp. Ch 14, 15, 19.)
• General info re BayesNets
  Proceedings for Uncertainty in AI
• Learning:
  David Heckerman, A tutorial on learning with Bayesian networks,
• Software:
  General:

Page 50:


Decision Net: Test/Buy a Car

Page 51:


Utility: Decision Nets

Given c(action, state) ∈ R (cost function)

$$C_p(a) = E_s[\, c(a, s) \,] = \sum_{s \in S} p(s \mid obs)\, c(a, s)$$

Best (immediate) action:

$$a^* = \arg\min_{a \in A} \{\, C_p(a) \,\}$$

Decision Net (like Belief Net) but…
  3 types of nodes
    chance (like Belief net)
    action – repair, sensing
    cost/utility
  Links for “dependency”
Given observations, obs, computes best action, a*

Sequence of Actions: MDPs, POMDPs, …
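The expected-cost rule can be sketched in a few lines. The belief p(s | obs) is assumed to have already been computed by the belief net; the states, actions, and cost values below are hypothetical.

```python
def expected_cost(action, belief, cost):
    """C_p(a) = sum over states s of p(s | obs) * c(a, s)."""
    return sum(p * cost[(action, s)] for s, p in belief.items())


def best_action(actions, belief, cost):
    """a* = the action with the smallest expected cost under the current belief."""
    return min(actions, key=lambda a: expected_cost(a, belief, cost))


# Hypothetical example: posterior over device state, and cost of each (action, state) pair.
belief = {"ok": 0.7, "faulty": 0.3}
cost = {
    ("repair", "ok"): 10.0, ("repair", "faulty"): 10.0,
    ("wait", "ok"): 0.0,    ("wait", "faulty"): 50.0,
}

print(best_action(["repair", "wait"], belief, cost))
```

This covers only the immediate decision; sequences of actions would need the MDP or POMDP machinery mentioned on the slide.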

Page 52:


Decision Net: Drill for Oil?


Page 53:


Formal Framework

Always true:
  P(x1, …, xn) = P(x1) P(x2 | x1) P(x3 | x2, x1) … P(xn | xn-1, …, x1)

Given independencies,
  P(xk | x1, …, xk-1) = P(xk | pak) for some pak ⊆ {x1, …, xk-1}

Hence

$$P(x_1, \ldots, x_n) = \prod_i P(x_i \mid pa_i)$$

So just connect each y ∈ pai to xi… ⇒ DAG structure

Note:
- Size of BN is $\sum_i 2^{|pa_i|}$, so better to use small pai.
- pai = {1, …, i – 1} is never incorrect… but seldom min’l… (so hard to store, learn, reason with, …)
- Order of variables can make HUGE difference:
  can have |pai| = 1 for one ordering, |pai| = i – 1 for another
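As a worked instance of this factorization, consider the three-variable hepatitis network with the ordering H, J, B. The only assumption used is the independence P(B | H, J) = P(B | H) stated earlier in the talk.

```latex
\begin{align*}
P(H, J, B) &= P(H)\, P(J \mid H)\, P(B \mid H, J) && \text{(chain rule, always true)} \\
           &= P(H)\, P(J \mid H)\, P(B \mid H)    && \text{(since } pa_B = \{H\}\text{)}
\end{align*}
```

With pa_H = {}, pa_J = {H} and pa_B = {H}, the size is 2^0 + 2^1 + 2^1 = 5 numbers, matching the "5 parameters, not 7" count given for that network.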

Page 54:


Developing the Model

Source of information
  + (Human) Expert(s)
  + Data from earlier Runs
  + Simulator

Typical Process
  1. Develop / Refine Initial Prototype
  2. Test Prototype ↦ Accurate System
  3. Deploy System
  4. Update / Maintain System

Page 55:


Develop/Refine Prototype

Requires expert; useful to have data

Initial Interview(s):
  To establish “what relates to what”
  Expert time: ≈ ½ day

Iterative process: (Gradual refinement)
  To refine qualitative connections
  To establish correct operations
    Expert presents “Good Performance”
    KE implements Expert’s claims
    KE tests on examples (real data or expert), and reports to Expert
  Expert time: ≈ 1 – 2 hours / week for ?? weeks
  (Depends on complexity of device, and accuracy of model)

Page 56:


Why Reasoning is Hard

BN reasoning may look easy:
  Just “propagate” information from node to node

  z  P(A=t | Z=z)
  t  1.0
  f  0.0

  z  P(B=t | Z=z)
  t  0.0
  f  1.0

  a  b  P(C=t | a, b)
  t  t  1.0
  t  f  0.0
  f  t  0.0
  f  f  0.0

Challenge: What is P(C=t)?
  A = Z = ¬B
  P( A = t ) = P( B = f ) = ½
  So… ? P( C = t ) = P( A = t, B = t ) = P( A = t ) * P( B = t ) = ½ * ½ = ¼
  Wrong: P( C = t ) = 0 !

Need to maintain dependencies!
  P( A = t, B = t ) = P( A = t ) * P( B = t | A = t )
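The gap between the naive product and the true value can be checked by enumerating the joint. The three CP tables are those on the slide; P(Z=t) = ½ is implied by A = Z and P(A=t) = ½. The helper names are illustrative.

```python
from itertools import product

P_Z_T = 0.5                          # implied by A = Z and P(A=t) = 1/2
P_A_T = {True: 1.0, False: 0.0}      # P(A=t | Z=z)
P_B_T = {True: 0.0, False: 1.0}      # P(B=t | Z=z)
P_C_T = {(True, True): 1.0, (True, False): 0.0,
         (False, True): 0.0, (False, False): 0.0}  # P(C=t | a, b)


def bern(p_true, value):
    return p_true if value else 1.0 - p_true


def joint(z, a, b, c):
    return (bern(P_Z_T, z) * bern(P_A_T[z], a)
            * bern(P_B_T[z], b) * bern(P_C_T[(a, b)], c))


def marginal(index):
    """P(variable at position `index` is true), summing the joint over all worlds."""
    return sum(joint(*w) for w in product((True, False), repeat=4) if w[index])


p_a, p_b, p_c = marginal(1), marginal(2), marginal(3)
naive_p_c = p_a * p_b  # treats A and B as independent, which they are not
print(p_a, p_b, naive_p_c, p_c)
```

The naive product gives ¼, while the enumeration gives 0, because A and B can never both be true.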

Page 57:


Crusher Controller

Given observations
  History, sensor readings, schedule, …
Specify best action for crusher
  “stop immediately”, “increase roller speed by ”
Best == minimize expected cost …

Initially: just recommendation to human operator
Later: Directly implement (some) actions

? Request values of other sensors?

Page 58:


Approach

1. For each state s (“Good flow”, “tooth about to enter”, …)
   for each action a (“Stop immediately”, “Change p7 += 0.32”, …)
   determine utility of performing a in s
   (Cost of lost production if stopped; … of reduced production efficiency if continue; …)

2. Use observations to estimate (dist over) current states
   Infer EXPECTED UTILITY of each action, based on distr.

3. Return action with highest Expected Utility

Page 59:


Details

Inputs
  Sensor Readings (history): Camera, microphone, power-draw
  Parameter settings
  Log files, Maintenance records
  Schedule (maintenance, anticipated load, …)
Outputs
  Continue as is
  Adjust parameters: GapSize, ApronFeederSpeed, 1J_ConveyorSpeed
  Shut down immediately
  Stop adding new material
  Tell operator to look
State “CrusherEnvironment”
  #TeethMissing
  NextUncrushableEntry
  Control Parameters

Page 60:


Benefits

Increase Crusher Effectiveness
  Find best settings for parameters
  To maximize production of well-sized chunks
Reduce Down Time
  Know when maintain/repair is critical
Reduce Damage to Crusher
Usable Model of Crusher
  Easy to modify when needed
  Training
  Design of next generation
Prototype for design of {control, diagnostician} of other machines

Page 61:


My Background

PhD, Stanford (Computer Science)
  Representational issues, Analogical Inference
  … everything in Logic
PostDoc at UofToronto (CS)
  Foundations of learnability, logical inference, DB, control theory, …
  … everything in Logic
Industrial research (Siemens Corporate Research)
  Need to solve REAL problems
  Theory Revision, Navigational systems, …
  … logic is not be-all-and-end-all!
Prof at UofAlberta (CS)
  Industrial problems (Siemens, BioTools, Syncrude)
  Foundations of learnability, probabilistic inference, …

Page 62:


Less Trivial Situations

• N.b., obs1 is not always independent of obs2 given H
• Eg, FamilyHistoryDepression ‘causes’ MotherSuicide and Depression;
  MotherSuicide causes Depression (w/ or w/o F.H.Depression)
• Here, P( D | MS, FHD ) ≠ P( D | FHD )!
• Can be done using Belief Network, but need to specify:
    P( FHD )           1
    P( MS | FHD )      2
    P( D | MS, FHD )   4

  f  P(MS=1 | FHD=f)
  1  0.10
  0  0.03

  f  m  P(D=1 | FHD=f, MS=m)
  1  1  0.97
  1  0  0.90
  0  1  0.08
  0  0  0.04
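Using the two tables above, P(D=1 | FHD=f) can be obtained by summing out MS, which shows numerically that MS still matters once FHD is known. This is a small sketch; the function name is illustrative, and no prior on FHD is needed for this particular query.

```python
P_MS1 = {1: 0.10, 0: 0.03}  # P(MS=1 | FHD=f), from the slide
P_D1 = {(1, 1): 0.97, (1, 0): 0.90, (0, 1): 0.08, (0, 0): 0.04}  # P(D=1 | FHD=f, MS=m)


def p_d1_given_fhd(f):
    """P(D=1 | FHD=f) = sum over m of P(MS=m | FHD=f) * P(D=1 | FHD=f, MS=m)."""
    total = 0.0
    for m in (0, 1):
        p_m = P_MS1[f] if m == 1 else 1.0 - P_MS1[f]
        total += p_m * P_D1[(f, m)]
    return total


for f in (1, 0):
    print(f, p_d1_given_fhd(f), P_D1[(f, 1)], P_D1[(f, 0)])
```

For FHD=0, for example, the marginal P(D=1 | FHD=0) lies between the MS=0 and MS=1 rows, so conditioning on MS changes the answer.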