CS 330 - Artificial Intelligencecaora/cs330/Materials/fall... · CS 330 - Artificial Intelligence -...

CS 330 - Artificial Intelligence - Model Based theory Instructor: Renzhi Cao Computer Science Department Pacific Lutheran University Fall 2018 Special appreciation to Ian Goodfellow, Joshua Bengio, Aaron Courville, Michael Nielsen, Andrew Ng, Katie Malone, Sebastian Thrun, Ethem Alpaydin, Christopher Bishop, Geoffrey Hinton, Tom Mitchell.



Notice

Homework is due today

Read materials on course website.

Quiz 2 on Thursday, study guide posted on Sakai

Lab 3 on Thursday for implementation of NB algorithm


Bayesian decision theory

Naive Bayes Classifier


Bayesian decision theory

• Example: Play Tennis


Bayesian decision theory

• Learning Phase

Outlook      Play=Yes   Play=No
Sunny        2/9        3/5
Overcast     4/9        0/5
Rain         3/9        2/5

Temperature  Play=Yes   Play=No
Hot          2/9        2/5
Mild         4/9        2/5
Cool         3/9        1/5

Humidity     Play=Yes   Play=No
High         3/9        4/5
Normal       6/9        1/5

Wind         Play=Yes   Play=No
Strong       3/9        3/5
Weak         6/9        2/5

P(Play=Yes) = 9/14
P(Play=No) = 5/14
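The learning phase is just counting. The sketch below assumes the training set is the classic 14-day Play Tennis table from Tom Mitchell's textbook; those rows are not reproduced in this transcript, so they are an assumption, chosen because they reproduce every fraction in the tables. The names `DATA`, `FEATURES` and `learn` are illustrative.

```python
from collections import Counter
from fractions import Fraction

# Assumed training data: (Outlook, Temperature, Humidity, Wind, Play).
# Not taken from the slides; this is the classic Play Tennis table.
DATA = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]
FEATURES = ["Outlook", "Temperature", "Humidity", "Wind"]


def learn(data):
    """Estimate class priors and per-feature likelihoods by counting."""
    class_counts = Counter(row[-1] for row in data)
    priors = {c: Fraction(n, len(data)) for c, n in class_counts.items()}

    pair_counts = Counter()
    for row in data:
        label = row[-1]
        # zip stops after the four feature columns, skipping the label.
        for feature, value in zip(FEATURES, row):
            pair_counts[(feature, value, label)] += 1

    likelihoods = {
        key: Fraction(n, class_counts[key[2]]) for key, n in pair_counts.items()
    }
    return priors, likelihoods


priors, likelihoods = learn(DATA)
print(priors["Yes"], priors["No"])
print(likelihoods[("Outlook", "Sunny", "No")])
```

Combinations that never occur in the data, such as Overcast with Play=No, are simply absent from `likelihoods`; a lookup with `likelihoods.get(key, Fraction(0))` returns the 0/5 shown in the table. `Fraction` also reduces values, so 6/9 is stored as 2/3.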


Bayesian decision theory

• Prediction Phase

– Given a new instance, predict its label:
x’ = (Outlook=Sunny, Temperature=Cool, Humidity=High, Wind=Strong)

– Look up the tables obtained in the learning phase:

P(Outlook=Sunny|Play=No) = 3/5
P(Temperature=Cool|Play=No) = 1/5
P(Humidity=High|Play=No) = 4/5
P(Wind=Strong|Play=No) = 3/5
P(Play=No) = 5/14

P(Outlook=Sunny|Play=Yes) = 2/9
P(Temperature=Cool|Play=Yes) = 3/9
P(Humidity=High|Play=Yes) = 3/9
P(Wind=Strong|Play=Yes) = 3/9
P(Play=Yes) = 9/14

– Decision making with the MAP rule

P(Yes|x’) ∝ [P(Sunny|Yes) P(Cool|Yes) P(High|Yes) P(Strong|Yes)] P(Play=Yes) ≈ 0.0053

P(No|x’) ∝ [P(Sunny|No) P(Cool|No) P(High|No) P(Strong|No)] P(Play=No) ≈ 0.0206

Since P(Yes|x’) < P(No|x’), we label x’ as “No”.
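The MAP computation above can be checked with a few lines of Python. This is a minimal sketch that hard-codes only the table entries needed for x’; the names `PRIOR`, `LIKELIHOOD` and `map_scores` are illustrative, not part of the course material.

```python
from fractions import Fraction as F
from math import prod

# Values copied from the learning-phase tables.
PRIOR = {"Yes": F(9, 14), "No": F(5, 14)}
LIKELIHOOD = {
    "Yes": {
        "Outlook=Sunny": F(2, 9),
        "Temperature=Cool": F(3, 9),
        "Humidity=High": F(3, 9),
        "Wind=Strong": F(3, 9),
    },
    "No": {
        "Outlook=Sunny": F(3, 5),
        "Temperature=Cool": F(1, 5),
        "Humidity=High": F(4, 5),
        "Wind=Strong": F(3, 5),
    },
}


def map_scores(evidence):
    """Unnormalized posterior for each label: prior times product of likelihoods."""
    return {
        label: PRIOR[label] * prod(LIKELIHOOD[label][e] for e in evidence)
        for label in PRIOR
    }


x_new = ["Outlook=Sunny", "Temperature=Cool", "Humidity=High", "Wind=Strong"]
scores = map_scores(x_new)
prediction = max(scores, key=scores.get)

for label, score in scores.items():
    print(label, round(float(score), 4))
print("prediction:", prediction)
```

The scores are not probabilities because the shared denominator P(x’) is dropped; it does not affect which label wins.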


Probability theory

A murder mystery

As the clock strikes midnight in the Old Tudor Mansion, a raging storm rattles the shutters and fills the house with the sound of thunder. The dead body of Mr Black lies slumped on the floor of the library, blood still oozing from the fatal wound. Quick to arrive on the scene is the famous sleuth Dr Bayes, who observes that there were only two other people in the Mansion at the time of the murder. So who committed this dastardly crime? Was it the fine upstanding pillar of the establishment Major Grey? Or was it the mysterious and alluring femme fatale Miss Auburn?


Probability theory

A murder mystery

• The discovered body alone leaves the identity of the murderer uncertain.

• What framework can we use for manipulating uncertain quantities?

• Probability


Probability theory

A murder mystery

• Assume it is unlikely that someone with Major Grey's impeccable credentials would commit such a heinous crime.

Define a random variable Murderer

P(Murderer = Auburn) = 0.7
P(Murderer = Grey) = 1 - P(Murderer = Auburn) = 0.3


Probability theory

A murder mystery

• What distribution does the variable Murderer follow?

Bernoulli distribution.

Randomly pick a value: Major Grey with probability 30%, Miss Auburn with probability 70%.

Sampling: the process of picking a value for a random variable so that the probability of picking each particular value is given by a certain distribution.
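Sampling from this Bernoulli prior can be sketched with Python's standard `random` module. The function name `sample_murderer`, the seed, and the number of draws are arbitrary choices for illustration.

```python
import random

P_AUBURN = 0.7  # prior from the slide; Grey gets the remaining 0.3


def sample_murderer(rng):
    """Draw one value of Murderer from the Bernoulli prior."""
    return "Auburn" if rng.random() < P_AUBURN else "Grey"


rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_murderer(rng) for _ in range(10_000)]
auburn_fraction = draws.count("Auburn") / len(draws)
print(f"fraction of Auburn draws: {auburn_fraction:.3f}")
```

With many draws the printed fraction should settle near 0.7, which is what it means for the samples to follow the distribution.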


Probability theory

Small practice

• Estimate the probability of the following events:
a. After visiting a product page on Amazon, a user chooses to buy the product.
b. After receiving an email, a user chooses to reply to it.
c. It will rain tomorrow where you live.
d. When a murder is committed, the murderer turns out to be a member of the victim’s family.

• Compare with your partners.


Probability theory

Incorporating evidence

Dr Bayes searches the mansion thoroughly. He finds that the only weapons available are an ornate ceremonial dagger and an old army revolver. “One of these must be the murder weapon”, he concludes.

Introduce a new random variable - weapon (the choice of murder weapon).

Two possible values: revolver or dagger.


Probability theory

Major Grey: ex-military experience, familiar with guns.

Miss Auburn: no experience with operation of an old revolver

1. Assume Major Grey were the murderer.

We might believe that the probability of his choosing a revolver rather than a dagger for the murder is 90%.

2. Assume Miss Auburn were the murderer.

We might believe that the probability of her using a revolver would be much smaller, like 20%.

Conditional probability

• P(weapon = revolver | murderer = Grey) = 0.9

• P(weapon = dagger | murderer = Grey)?


Probability theory

conditional on Major Grey being the murderer


Probability theory

conditional on Miss Auburn being the murderer


Probability theory

Conditional probability table (CPT)
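As a small sketch, the CPT for weapon given murderer can be held as a nested dictionary, one row per value of the conditioning variable. The revolver probabilities (0.9 and 0.2) come from the slides; the dagger entries are filled in as their complements, since the two weapons are the only options and each row must sum to 1. The names `CPT_WEAPON` and `check_cpt` are illustrative.

```python
import math

# P(weapon | murderer): one row per murderer.
CPT_WEAPON = {
    "Grey": {"revolver": 0.9, "dagger": 1 - 0.9},
    "Auburn": {"revolver": 0.2, "dagger": 1 - 0.2},
}


def check_cpt(cpt):
    """Return True if every row is a valid distribution (sums to 1)."""
    return all(math.isclose(sum(row.values()), 1.0) for row in cpt.values())


print(check_cpt(CPT_WEAPON))
print(round(CPT_WEAPON["Grey"]["dagger"], 2))
```

Note that rows sum to 1 but columns need not: 0.9 + 0.2 is not a probability of anything.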


Probability theory

Independent variables

The probability of each choice of weapon changes depending on the value of murderer.

We say weapon and murderer are dependent variables.

Random variable raining: whether it was raining outside the Old Tudor Mansion at the time of the murder. Does that tell us anything about the murderer?

P(raining | murderer) = P(raining)

We say raining and murderer are independent variables.


Probability theory

Incorporating evidence

Searching carefully around the library, Dr Bayes spots a bullet lodged in the book case. “Hmm, interesting”, he says, “I think this could be an important clue”.


Probability theory

Joint probability

Our intuition is that Major Grey is now more likely to be the murderer, because of his military background and experience with a revolver. But how can we use this information?

Joint probability: P(murderer = Auburn, weapon = revolver) = P(murderer = Auburn) * P(weapon = revolver | murderer = Auburn) = 0.7 * 0.2 = 0.14
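The same product rule gives the joint probability for Grey, and normalizing the two joint values gives the posterior after observing the revolver. A minimal sketch using only the numbers stated so far (variable names are illustrative):

```python
PRIOR = {"Grey": 0.3, "Auburn": 0.7}
P_REVOLVER = {"Grey": 0.9, "Auburn": 0.2}  # P(weapon = revolver | murderer)

# Product rule: P(murderer, revolver) = P(murderer) * P(revolver | murderer)
joint = {m: PRIOR[m] * P_REVOLVER[m] for m in PRIOR}

# Bayes' rule: divide by P(revolver), the sum of the joint over murderers.
p_revolver = sum(joint.values())
posterior = {m: joint[m] / p_revolver for m in joint}

for m in PRIOR:
    print(m, "joint:", round(joint[m], 2), "posterior:", round(posterior[m], 2))
```

The posterior for Grey comes out at about 0.66, up from the prior of 0.3.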


Probability theory

End?

So, after all this hard work, have we finally solved our murder mystery? Well, given the evidence so far it appears that Grey is more likely to be the murderer, but the probability of his guilt currently stands at 66%, which feels too small for a conviction. But how high a probability would we need? To find an answer we turn to William Blackstone’s principle of 1765:

“Better that ten guilty persons escape than one innocent suffer.”

We therefore need a probability of guilt for our murderer which exceeds 10/(1+10), or about 91%.

More evidence!!!


Probability theory

Final clue

Dr Bayes pulls out his trusty magnifying glass and continues his investigation of the crime scene. As he examines the floor near Mr Black’s body he discovers a hair lying on top of the pool of blood. “Aha”, exclaims Dr Bayes, “this hair must belong to someone who was in the room when the murder took place!” Looking more closely at the hair, Dr Bayes sees that it is not the lustrous red of Miss Auburn’s vibrant locks, nor indeed the jet black of the victim’s hair, but the distinguished silver of Major Grey!


Probability theory

New evidence: hair

The hair is powerful evidence indicating that Major Grey was present at the time of the murder, but there is also the possibility that the hair was stolen by Miss Auburn and planted at the crime scene to mislead our perceptive sleuth.

Random variable hair: true if it’s Major Grey’s hair.


Probability theory

Final question?

The question to ask when considering a conditional independence assumption is: “Does learning about one variable tell me anything about the other variable, if I already know the value of the conditioning variable?”

In this case, that would be: “Does learning about the hair tell me anything about the choice of weapon, if I already know who the murderer is?”


Probability theory

The Bayesian network!

The Bayesian network (or Bayes net) is a different way of using a graph to represent a probabilistic model.
• There is a variable node for each variable in the model.
• Parent variables are connected directly to child variables using directed edges.


Bayesian networks

• The Naive Bayes assumption of conditional independence is too restrictive.

• But it is intractable without some such assumptions.

• A Bayesian network describes conditional independence among subsets of variables. It allows combining prior knowledge about independences among variables with observed training data.


Bayesian networks

Definition: X is conditionally independent of Y given Z if the probability distribution governing X is independent of the value of Y given the value of Z; that is, if:
(∀ xi, yj, zk) P(X = xi | Y = yj, Z = zk) = P(X = xi | Z = zk)

Example: Two coins, a regular coin and a fake two-tailed coin (P(h) = 0). Choose a coin and toss it two times. Define:
A = the first coin toss results in heads
B = the second coin toss results in heads
C = the regular coin has been selected

A and B are dependent: if A happens, we know the coin is regular, so the probability of B changes.
Given C (the regular coin has been selected), A and B are independent: P(A | B, C) = P(A | C)
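The coin example can be verified by enumerating every outcome. The sketch below assumes each coin is chosen with probability 1/2, which the slide does not state; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

P_COIN = {"regular": Fraction(1, 2), "two_tailed": Fraction(1, 2)}  # assumed
P_HEADS = {"regular": Fraction(1, 2), "two_tailed": Fraction(0)}


def prob(event):
    """Total probability of the (coin, toss1, toss2) outcomes where event holds."""
    total = Fraction(0)
    for coin, t1, t2 in product(P_COIN, "HT", "HT"):
        p = P_COIN[coin]
        for toss in (t1, t2):
            p *= P_HEADS[coin] if toss == "H" else 1 - P_HEADS[coin]
        if event(coin, t1, t2):
            total += p
    return total


def a(coin, t1, t2):
    return t1 == "H"  # first toss is heads


def b(coin, t1, t2):
    return t2 == "H"  # second toss is heads


def c(coin, t1, t2):
    return coin == "regular"  # regular coin selected


def both(*events):
    """Event that holds when all the given events hold."""
    return lambda *outcome: all(e(*outcome) for e in events)


p_b = prob(b)
p_b_given_a = prob(both(a, b)) / prob(a)
p_a_given_c = prob(both(a, c)) / prob(c)
p_a_given_bc = prob(both(a, b, c)) / prob(both(b, c))

print("P(B) =", p_b, " P(B|A) =", p_b_given_a)
print("P(A|C) =", p_a_given_c, " P(A|B,C) =", p_a_given_bc)
```

P(B|A) differs from P(B), so A and B are dependent, while P(A|B,C) equals P(A|C), so they are conditionally independent given C.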


Bayesian networks

• Represent dependence/independence via a directed graph:
– a set of nodes, one for each random variable
– a directed, acyclic graph (link ≈ "directly influences")
– a conditional distribution for each node given its parents

p(X1, X2, ..., XN) = Π_i p(Xi | Pa(Xi))

Left-hand side: the full joint distribution. Right-hand side: the graph-structured approximation.

Pa(X) = immediate parents of X in the graph

• A simple, graphical notation for conditional independence assertions, and it specifies a joint distribution in a structured form.


Simple practice

• What is P(A,B,C)?

Graph: A → C ← B

p(A,B,C) = p(C|A,B) p(A) p(B)

“Explaining away” effect: given C, observing A makes B less likely, e.g., the earthquake/burglary/alarm example.

A and B are (marginally) independent but become dependent once C is known.
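Explaining away can be seen numerically by enumerating the joint p(C|A,B) p(A) p(B). All probabilities in the sketch below are hypothetical values made up for illustration (think A = earthquake, B = burglary, C = alarm); none of them come from the slides.

```python
from itertools import product

# Hypothetical numbers, for illustration only.
P_A = 0.1
P_B = 0.1
P_C_GIVEN = {(0, 0): 0.01, (0, 1): 0.9, (1, 0): 0.9, (1, 1): 0.99}  # P(C=1 | a, b)


def joint(a, b, c):
    """p(a, b, c) = p(c | a, b) p(a) p(b) for the graph A -> C <- B."""
    pa = P_A if a else 1 - P_A
    pb = P_B if b else 1 - P_B
    pc = P_C_GIVEN[(a, b)] if c else 1 - P_C_GIVEN[(a, b)]
    return pc * pa * pb


def prob_b(given):
    """P(B=1 | given), where given maps 'a' and/or 'c' to an observed 0/1 value."""
    num = den = 0.0
    for a, b, c in product((0, 1), repeat=3):
        observed = {"a": a, "c": c}
        if any(observed[name] != value for name, value in given.items()):
            continue  # outcome contradicts the evidence
        p = joint(a, b, c)
        den += p
        if b == 1:
            num += p
    return num / den


print("P(B)         =", round(prob_b({}), 3))
print("P(B | A)     =", round(prob_b({"a": 1}), 3))
print("P(B | C)     =", round(prob_b({"c": 1}), 3))
print("P(B | C, A)  =", round(prob_b({"c": 1, "a": 1}), 3))
```

Without C, learning A leaves P(B) unchanged. Once C is observed, B becomes likely, and then also learning A pushes the probability of B back down, because A already explains C.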


Simple practice

• What is P(A,B,C)?

Graph: A, B, C (no edges)

p(A,B,C) = p(A) p(B) p(C)


Simple practice

• What is P(A,B,C)?

Graph: B ← A → C

p(A,B,C) = p(B|A) p(C|A) p(A)

B and C are conditionally independent given A,

e.g., A is a disease, and we model B and C as conditionally independent symptoms given A.


Simple practice

• What is P(A,B,C)?

Graph: A → B → C

p(A,B,C) = p(C|B) p(B|A) p(A)

Markov dependence


Bayesian networks

Properties of a Bayesian network:

• Requires that the graph is acyclic (no directed cycles)

• Two components
– The graph structure (conditional independence assumptions)
– The numerical probabilities (for each variable given its parents)


Bayesian networks

What is the relationship between a Bayesian network and Naive Bayes?

• Naive Bayes is a special case of a Bayesian network: the class variable is the only parent of every feature variable.


Bayesian networks

Why do we favor BNs?

• Representation cost:
– In the previous example, we have five binary variables: F, A, S, H, N. So the full joint distribution needs 2^5 - 1 = 31 probability statements. But with a BN, we only need 2 + 2 + 8 + 4 + 4 = 20 CPT entries.

• Efficient learning computation

• Incorporation of domain knowledge
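The representation-cost count can be reproduced from a graph's parent sets. The slide showing the graph is not in this transcript, so the structure below (F and A are parents of S; S is the parent of H and N) is an assumption, chosen because it matches the sum 2 + 2 + 8 + 4 + 4. All variables are taken to be binary.

```python
# Assumed structure: Flu -> Sinus <- Allergy, Sinus -> Headache, Sinus -> Nose.
PARENTS = {"F": [], "A": [], "S": ["F", "A"], "H": ["S"], "N": ["S"]}


def cpt_entries(parents):
    """Entries per CPT for binary variables: 2 values x 2^(number of parents) rows."""
    return {var: 2 ** (len(ps) + 1) for var, ps in parents.items()}


entries = cpt_entries(PARENTS)
bn_total = sum(entries.values())
full_joint = 2 ** len(PARENTS) - 1  # free entries in the unstructured joint table

print(entries)
print("Bayesian network:", bn_total, " full joint:", full_joint)
```

The gap widens quickly as variables are added: the full joint doubles with every new binary variable, while a CPT grows only with the number of parents of its node.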


Bayesian network learning

There are several cases:
• Network structure is known or unknown
• Variable values might be fully observed or partly observed


The parameters are the conditional probability tables, like those we calculated in the Naive Bayes algorithm.


» End here, practice and review


What to do?


Case 1: known structure and observed data

Flu  Allergy  Sinus  Headache  Nose
1    0        1      1         0
1    1        1      1         1
0    1        1      0         1
0    0        0      0         0
1    0        0      0         0
1    1        1      0         1
0    0        1      0         1


Case 1: known structure and observed data

• Build CPTs
• Calculate P(F=1, A=1, S=1, H=1, N=1)
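With a known structure and fully observed data, each CPT entry is a relative frequency, and the joint is the product of one entry per node. The sketch below uses the seven rows from the Case 1 table; the graph itself is not in this transcript, so the parent sets are an assumption (F and A are parents of S; S is the parent of H and N). Function names are illustrative.

```python
from fractions import Fraction

COLUMNS = ["F", "A", "S", "H", "N"]
ROWS = [  # the seven observations from the Case 1 table
    (1, 0, 1, 1, 0),
    (1, 1, 1, 1, 1),
    (0, 1, 1, 0, 1),
    (0, 0, 0, 0, 0),
    (1, 0, 0, 0, 0),
    (1, 1, 1, 0, 1),
    (0, 0, 1, 0, 1),
]
# Assumed structure, not confirmed by the transcript.
PARENTS = {"F": [], "A": [], "S": ["F", "A"], "H": ["S"], "N": ["S"]}


def cond_prob(var, value, given):
    """Estimate P(var=value | given) as a relative frequency over matching rows.

    Raises ZeroDivisionError if no row matches the parent configuration.
    """
    idx = COLUMNS.index
    matching = [r for r in ROWS if all(r[idx(k)] == v for k, v in given.items())]
    hits = sum(1 for r in matching if r[idx(var)] == value)
    return Fraction(hits, len(matching))


def joint(assignment):
    """Product over nodes of P(node | its parents) for a full assignment."""
    p = Fraction(1)
    for var, parents in PARENTS.items():
        given = {q: assignment[q] for q in parents}
        p *= cond_prob(var, assignment[var], given)
    return p


all_ones = dict.fromkeys(COLUMNS, 1)
print("P(F=1) =", cond_prob("F", 1, {}))
print("P(S=1 | F=1, A=1) =", cond_prob("S", 1, {"F": 1, "A": 1}))
print("P(F=1, A=1, S=1, H=1, N=1) =", joint(all_ones))
```

With so few rows, some entries come out as exactly 0 or 1; in practice a smoothing term is often added to the counts, which this sketch leaves out.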

Case 2: known structure and unobserved data

Flu  Allergy  Sinus  Headache  Nose
1    0        ?      1         0
1    1        ?      1         1
0    1        ?      0         1
0    0        ?      0         0
1    0        ?      0         0
1    1        ?      0         1
0    0        ?      0         1


Case 2: known structure and unobserved data

• Build CPTs
• Calculate P(F=1, A=1, S=1, H=1, N=1)


Bayesian network learning

• Case 3: the network structure is unknown.

Example

• Suppose we choose the ordering M, J, A, B, E

P(J | M) = P(J)? No
P(A | J, M) = P(A | J)? No
P(A | J, M) = P(A)? No
P(B | A, J, M) = P(B | A)? Yes
P(B | A, J, M) = P(B)? No
P(E | B, A, J, M) = P(E | A)? No
P(E | B, A, J, M) = P(E | A, B)? Yes
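The Yes/No answers above fix the parent set of each node: a node keeps only the earlier variables that it cannot drop. A minimal sketch that records those parent sets and counts the independent parameters, assuming all five variables are binary (the helper names are illustrative):

```python
ORDER = ["M", "J", "A", "B", "E"]

# Parent sets implied by the answers for the ordering M, J, A, B, E.
PARENTS = {
    "M": [],
    "J": ["M"],       # P(J | M) != P(J)
    "A": ["J", "M"],  # neither J nor M can be dropped
    "B": ["A"],       # P(B | A, J, M) = P(B | A)
    "E": ["A", "B"],  # P(E | B, A, J, M) = P(E | A, B)
}


def respects_order(parents, order):
    """Check that every parent appears before its child in the ordering."""
    position = {var: i for i, var in enumerate(order)}
    return all(position[p] < position[v] for v, ps in parents.items() for p in ps)


def independent_parameters(parents):
    """One free number per parent configuration of each binary node."""
    return {var: 2 ** len(ps) for var, ps in parents.items()}


params = independent_parameters(PARENTS)
print(respects_order(PARENTS, ORDER))
print(params, "total:", sum(params.values()))
```

The total depends on the chosen ordering: a different ordering of the same variables can lead to a sparser or denser graph for the same joint distribution.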


Summary

• Bayesian networks provide a natural representation for (causally induced) conditional independence

• Topology + CPTs = compact representation of joint distribution

• Generally easy for domain experts to construct

End of murder mystery

P(murderer = Grey)   P(murderer = Auburn)
0.3                  0.7

Murderer   P(weapon = revolver | murderer)
Grey       0.9
Auburn     0.2

Murderer   P(hair = true | murderer)
Grey       0.5
Auburn     0.1

What is P(murderer, weapon, hair)?

Because weapon and hair are conditionally independent given murderer:
P(murderer, weapon, hair) = P(murderer) * P(weapon | murderer) * P(hair | murderer)

What is P(murderer | weapon, hair)?

P(murderer | weapon, hair) = P(murderer, weapon, hair) / P(weapon, hair)
= P(murderer) * P(weapon | murderer) * P(hair | murderer) / P(weapon, hair)
∝ P(murderer) * P(weapon | murderer) * P(hair | murderer)

P(murderer | weapon = revolver, hair = true)
∝ P(murderer) * P(weapon = revolver | murderer) * P(hair = true | murderer)

P(murderer = Grey | weapon = revolver, hair = true) ∝ 0.3 * 0.9 * 0.5 = 0.135

P(murderer = Auburn | weapon = revolver, hair = true) ∝ 0.7 * 0.2 * 0.1 = 0.014

Normalizing the two unnormalized values so that they sum to 1:

P(murderer = Grey | weapon = revolver, hair = true) = 0.135/(0.135+0.014) ≈ 0.91
P(murderer = Auburn | weapon = revolver, hair = true) = 0.014/(0.135+0.014) ≈ 0.09
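The whole calculation (multiply the prior by each likelihood, then normalize) fits in a short function. This is a sketch using the three tables from the slides; the function name `posterior` is illustrative, and it relies on the evidence variables being conditionally independent given the murderer.

```python
PRIOR = {"Grey": 0.3, "Auburn": 0.7}
P_REVOLVER = {"Grey": 0.9, "Auburn": 0.2}  # P(weapon = revolver | murderer)
P_HAIR = {"Grey": 0.5, "Auburn": 0.1}      # P(hair = true | murderer)


def posterior(prior, *likelihoods):
    """Multiply the prior by each likelihood table, then normalize to sum to 1."""
    unnormalized = dict(prior)
    for table in likelihoods:
        for murderer in unnormalized:
            unnormalized[murderer] *= table[murderer]
    total = sum(unnormalized.values())
    return {m: p / total for m, p in unnormalized.items()}


result = posterior(PRIOR, P_REVOLVER, P_HAIR)
for murderer, p in result.items():
    print(murderer, round(p, 2))
```

Calling `posterior(PRIOR, P_REVOLVER)` with only the first table gives the earlier 66% figure for Grey, so the same function covers each stage of the investigation.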


End for today