CPSC 7373: Artificial Intelligence Lecture 4: Uncertainty
Jiang Bian, Fall 2012, University of Arkansas at Little Rock
Chapter 13: Uncertainty
• Outline
– Uncertainty
– Probability
– Syntax and Semantics
– Inference
– Independence and Bayes' Rule
Uncertainty
Let action At = leave for airport t minutes before flight.
Will At get me there on time?
Problems:
1. partial observability (road state, other drivers' plans, etc.)
2. noisy sensors (traffic reports)
3. uncertainty in action outcomes (flat tire, etc.)
4. immense complexity of modeling and predicting traffic
Hence a purely logical approach either:
– risks falsehood: "A25 will get me there on time", or
– leads to conclusions that are too weak for decision making: "A25 will get me there on time, if there's no accident on the bridge and it doesn't rain and my tires remain intact, etc."
(A1440 might reasonably be said to get me there on time, but I'd have to stay overnight in the airport …)
Methods for handling uncertainty
• Default or nonmonotonic logic:
– Assume my car does not have a flat tire
– Assume A25 works unless contradicted by evidence
– Issues: What assumptions are reasonable? How to handle contradiction?
• Rules with fudge factors:
– A25 |→0.3 get there on time
– Sprinkler |→0.99 WetGrass
– WetGrass |→0.7 Rain
– Issues: problems with combination, e.g., does Sprinkler cause Rain?
• Probability
– Models the agent's degree of belief, given the available evidence
– e.g., A25 will get me there on time with probability 0.04
Probability
Probabilistic assertions summarize effects of:
– laziness: failure to enumerate exceptions, qualifications, etc.
– ignorance: lack of relevant facts, initial conditions, etc.
Subjective probability:
• Probabilities relate propositions to the agent's own state of knowledge
– e.g., P(A25 | no reported accidents) = 0.06
• These are not assertions about the world
• Probabilities of propositions change with new evidence:
– e.g., P(A25 | no reported accidents, 5 a.m.) = 0.15
Bayes Network: Example
[Diagram: a Bayes network for diagnosing why a car won't start. Nodes: Battery Age, Alternator Broken, Fan Belt Broken, Not Charging, Battery Dead, Battery Flat, No Oil, No Gas, Fuel Line Blocked, Starter Broken, Car Won't Start, Battery Meter, Lights, Oil Light, Gas Gauge, Dip Stick.]
Probabilities: Coin Flip
• Suppose the probability of heads is 0.5. What's the probability of it coming up tails?
– P(H) = 1/2
– P(T) = __?__
Probabilities: Coin Flip
• Suppose the probability of heads is 1/4. What's the probability of it coming up tails?
– P(H) = 1/4
– P(T) = __?__
Probabilities: Coin Flip
• Suppose the probability of heads is 1/2, and the coin flips are independent. What's the probability of it coming up three heads in a row?
– P(H) = 1/2
– P(H, H, H) = __?__
Probabilities: Coin Flip
• Xi = result of the i-th coin flip;
• Xi ∈ {H, T}; and
• P(Xi=H) = 1/2 ∀i
• P(X1=X2=X3=X4) = __?__
Probabilities: Coin Flip
• Xi = result of the i-th coin flip;
• Xi ∈ {H, T}; and
• P(Xi=H) = 1/2 ∀i
• P(X1=X2=X3=X4) = __?__
– P(X1=X2=X3=X4=H) + P(X1=X2=X3=X4=T) = 1/16 + 1/16 = 1/8
Probabilities: Coin Flip
• Xi = result of the i-th coin flip;
• Xi ∈ {H, T}; and
• P(Xi=H) = 1/2 ∀i
• P({X1,X2,X3,X4} contains at least 3 H) = __?__
Probabilities: Coin Flip
• Xi = result of the i-th coin flip;
• Xi ∈ {H, T}; and
• P(Xi=H) = 1/2 ∀i
• P({X1,X2,X3,X4} contains at least 3 H) = __?__
– P(HHHH) + P(HHHT) + P(HHTH) + P(HTHH) + P(THHH) = 5 * 1/16 = 5/16
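A quick way to sanity-check these counts is to enumerate all 2^4 equally likely outcomes (a minimal Python sketch, assuming fair and independent flips):

```python
from itertools import product

# All 16 equally likely outcomes of four fair coin flips.
outcomes = list(product("HT", repeat=4))

# P(at least 3 heads): count qualifying outcomes out of 16.
p_at_least_3h = sum(o.count("H") >= 3 for o in outcomes) / len(outcomes)

# P(all four flips equal): only HHHH and TTTT qualify.
p_all_equal = sum(len(set(o)) == 1 for o in outcomes) / len(outcomes)

print(p_at_least_3h)  # 0.3125 = 5/16
print(p_all_equal)    # 0.125  = 1/8
```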
Probabilities: Summary
• Complementary probability:
– P(A) = p; then P(¬A) = 1 − p
• Independence:
– X ⊥ Y; then P(X) P(Y) = P(X, Y)
– P(X, Y) is the joint probability; P(X) and P(Y) are the marginals
Dependence
• Given P(X1=H) = 1/2, and:
– P(X2=H|X1=H) = 0.9
– P(X2=T|X1=T) = 0.8
• P(X2=H) = __?__
Dependence
• Given P(X1=H) = 1/2, and:
– P(X2=H|X1=H) = 0.9
– P(X2=T|X1=T) = 0.8
• P(X2=H) = __?__
– P(X2=H|X1=H) P(X1=H) + P(X2=H|X1=T) P(X1=T)
– = 0.9 * 1/2 + (1 − 0.8) * 1/2 = 0.55
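The same total-probability computation, as a sketch:

```python
# Total probability for the dependent coin flips:
# P(X2=H) = P(X2=H|X1=H) P(X1=H) + P(X2=H|X1=T) P(X1=T)
p_x1_h = 0.5
p_x2h_given_x1h = 0.9
p_x2t_given_x1t = 0.8  # so P(X2=H|X1=T) = 1 - 0.8 = 0.2

p_x2_h = p_x2h_given_x1h * p_x1_h + (1 - p_x2t_given_x1t) * (1 - p_x1_h)
print(p_x2_h)  # ≈ 0.55
```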
What have we learned?
• Total probability:
– P(Y) = Σi P(Y | X=i) P(X=i)
• Negation of probabilities:
– P(¬X | Y) = 1 − P(X | Y)
– What about P(X | ¬Y)?
What have we learned?
• Negation of probabilities:
– P(¬X | Y) = 1 − P(X | Y)
– What about P(X | ¬Y)? There is no such rule: P(X | ¬Y) ≠ 1 − P(X | Y).
– You can negate the event (X), but you can never negate the conditional variable (Y).
Example: Weather
• Given,
– P(D1=Sunny) = 0.9
– P(D2=Sunny|D1=Sunny) = 0.8
– P(D2=Rainy|D1=Sunny) = ??
Example: Weather
• Given,
– P(D1=Sunny) = 0.9
– P(D2=Sunny|D1=Sunny) = 0.8
– P(D2=Rainy|D1=Sunny) = 1 − P(D2=Sunny|D1=Sunny) = 0.2
– P(D2=Sunny|D1=Rainy) = 0.6
– P(D2=Rainy|D1=Rainy) = 1 − P(D2=Sunny|D1=Rainy) = 0.4
– Assume the transition probabilities from D2 to D3 are the same:
– P(D2=Sunny) = ??; and P(D3=Sunny) = ??
Example: Weather
• Given,
– P(D1=Sunny) = 0.9
– P(D2=Sunny|D1=Sunny) = 0.8
– P(D2=Sunny|D1=Rainy) = 0.6
– Assume the transition probabilities from D2 to D3 are the same.
– P(D2=Sunny) = 0.78
• P(D2=Sunny|D1=Sunny) P(D1=Sunny) + P(D2=Sunny|D1=Rainy) P(D1=Rainy) = 0.8 * 0.9 + 0.6 * 0.1 = 0.78
– P(D3=Sunny) = 0.756
• P(D3=Sunny|D2=Sunny) P(D2=Sunny) + P(D3=Sunny|D2=Rainy) P(D2=Rainy) = 0.8 * 0.78 + 0.6 * 0.22 = 0.756
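The D2 and D3 answers can be reproduced by iterating the one-step transition (a sketch; the helper name `next_sunny` is ours, not from the lecture):

```python
# One step of the weather Markov chain:
# P(next=Sunny) = P(S|S) P(today=S) + P(S|R) P(today=R)
def next_sunny(p_sunny, p_s_given_s=0.8, p_s_given_r=0.6):
    return p_s_given_s * p_sunny + p_s_given_r * (1 - p_sunny)

p_d1 = 0.9
p_d2 = next_sunny(p_d1)  # ≈ 0.78
p_d3 = next_sunny(p_d2)  # ≈ 0.756
```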
Example: Cancer
• There exists a type of cancer that 1% of the population carries.
– P(C) = 0.01; P(¬C) = 1 − 0.01 = 0.99
• There exists a test for the cancer.
– P(+|C) = 0.9; P(-|C) = 0.1
– P(+|¬C) = 0.2; P(-|¬C) = 0.8
• P(C|+) = ??
– Joint probabilities:
• P(+, C) = ??; P(-, C) = ??
• P(+, ¬C) = ??; P(-, ¬C) = ??
Example: Cancer
• There exists a type of cancer that 1% of the population carries.
– P(C) = 0.01; P(¬C) = 1 − 0.01 = 0.99
• There exists a test for the cancer.
– P(+|C) = 0.9; P(-|C) = 0.1
– P(+|¬C) = 0.2; P(-|¬C) = 0.8
• P(C|+) = ??
– Joint probabilities:
• P(+, C) = 0.009; P(-, C) = 0.001
• P(+, ¬C) = 0.198; P(-, ¬C) = 0.792
Example: Cancer
• There exists a type of cancer that 1% of the population carries.
– P(C) = 0.01; P(¬C) = 1 − 0.01 = 0.99
• There exists a test for the cancer.
– P(+|C) = 0.9; P(-|C) = 0.1
– P(+|¬C) = 0.2; P(-|¬C) = 0.8
• P(C|+) = 0.043
– P(+, C) / (P(+, C) + P(+, ¬C))
– 0.009 / (0.009 + 0.198) = 0.043
– Joint probabilities:
• P(+, C) = 0.009; P(-, C) = 0.001
• P(+, ¬C) = 0.198; P(-, ¬C) = 0.792
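These joint probabilities and the posterior can be checked in a few lines:

```python
# Joint probabilities for the cancer test, then Bayes rule by normalization.
p_c = 0.01
p_pos_given_c = 0.9
p_pos_given_nc = 0.2

p_pos_and_c = p_pos_given_c * p_c           # ≈ 0.009
p_pos_and_nc = p_pos_given_nc * (1 - p_c)   # ≈ 0.198

# P(C|+) = P(+,C) / (P(+,C) + P(+,¬C))
p_c_given_pos = p_pos_and_c / (p_pos_and_c + p_pos_and_nc)
print(round(p_c_given_pos, 3))  # 0.043
```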
Bayes Rule
P(A|B) = P(B|A) P(A) / P(B)
– P(A|B): POSTERIOR
– P(B|A): LIKELIHOOD
– P(A): PRIOR
– P(B): MARGINAL LIKELIHOOD
• Total probability: P(B) = P(B|A) P(A) + P(B|¬A) P(¬A)
Bayes Rule: Cancer Example
P(C|+) = P(+|C) P(C) / P(+)
– P(C|+): POSTERIOR; P(+|C): LIKELIHOOD; P(C): PRIOR; P(+): MARGINAL LIKELIHOOD
Bayes Network
• Graphically,
A → B
– A: not observable
– B: observable
• Diagnostic reasoning: P(A|B) or P(A|¬B)
• How many parameters? P(A); P(B|A), P(B|¬A)
Two-test cancer example
C → T1, C → T2
P(C) = 0.01; P(¬C) = 0.99
P(+|C) = 0.9; P(-|C) = 0.1
P(-|¬C) = 0.8; P(+|¬C) = 0.2
• P(C|T1=+, T2=+) = P(C|++) = ??
Two-test cancer example
C → T1, C → T2
P(C) = 0.01; P(¬C) = 0.99
P(+|C) = 0.9; P(-|C) = 0.1
P(-|¬C) = 0.8; P(+|¬C) = 0.2
• P(C|T1=+, T2=+) = P(C|++) = 0.1698
Bayes Rule: Compute
P(A|B) = P(B|A) P(A) / P(B)
P(A|B) + P(¬A|B) = 1
Trick: drop the normalizer P(B) and compute pseudo-probabilities:
P'(A|B) = P(B|A) P(A)
P'(¬A|B) = P(B|¬A) P(¬A)
Then normalize:
P(A|B) = P'(A|B) / (P'(A|B) + P'(¬A|B))
i.e., 1/P(B) = 1/(P'(A|B) + P'(¬A|B)) = 1/(P(B|A) P(A) + P(B|¬A) P(¬A))
Two-test cancer example
C → T1, C → T2
P(C) = 0.01; P(¬C) = 0.99
P(+|C) = 0.9; P(-|C) = 0.1
P(-|¬C) = 0.8; P(+|¬C) = 0.2
P(C|++) = ??

      Prior   +     +     P'
C     0.01    0.9   0.9   0.0081
¬C    0.99    0.2   0.2   0.0396

P'(C|++) = P(+|C) P(+|C) P(C) = 0.0081
P'(¬C|++) = P(+|¬C) P(+|¬C) P(¬C) = 0.0396
1/(P'(C|++) + P'(¬C|++)) = 1/(0.0081 + 0.0396) = 1/0.0477
P(C|++) = P'(C|++) / 0.0477 = 0.0081/0.0477 = 0.1698
P(¬C|++) = P'(¬C|++) / 0.0477 = 0.0396/0.0477 = 0.8302
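The normalization trick from this slide, written out (variable names are ours):

```python
# Normalization trick for P(C | T1=+, T2=+): compute unnormalized
# pseudo-probabilities P' and divide by their sum. The two tests are
# conditionally independent given C, so the likelihoods multiply.
p_c, p_nc = 0.01, 0.99
p_pos_c, p_pos_nc = 0.9, 0.2

pp_c = p_pos_c * p_pos_c * p_c      # P'(C|++)  ≈ 0.0081
pp_nc = p_pos_nc * p_pos_nc * p_nc  # P'(¬C|++) ≈ 0.0396

p_c_given_pp = pp_c / (pp_c + pp_nc)
print(round(p_c_given_pp, 4))  # 0.1698
```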
Two-test cancer example
C → T1, C → T2
P(C) = 0.01; P(¬C) = 0.99
P(+|C) = 0.9; P(-|C) = 0.1
P(-|¬C) = 0.8; P(+|¬C) = 0.2
P(C|+-) = ??
Two-test cancer example
C → T1, C → T2
P(C) = 0.01; P(¬C) = 0.99
P(+|C) = 0.9; P(-|C) = 0.1
P(-|¬C) = 0.8; P(+|¬C) = 0.2
P(C|+-) = 0.0056

      Prior   +     -     P'      P
C     0.01    0.9   0.1   0.0009  0.0056
¬C    0.99    0.2   0.8   0.1584  0.9943
Conditional Independence
• 2-Test Cancer Example
• We assume not only that T1 and T2 are identically distributed, but also that they are conditionally independent given C:
– P(T2|C,T1) = P(T2|C)
C → T1, C → T2
P(C) = 0.01; P(¬C) = 0.99
P(+|C) = 0.9; P(-|C) = 0.1
P(-|¬C) = 0.8; P(+|¬C) = 0.2
Conditional Independence
A → B, A → C
• Given B ⊥ C | A:
– Does B ⊥ C | A imply B ⊥ C?
Conditional Independence
A → B, A → C
• Given B ⊥ C | A:
– Does B ⊥ C | A imply B ⊥ C? No.
• Intuitively, getting a positive test result for cancer gives us information about whether you have cancer or not.
– So if you get a positive test result, you raise the probability of having cancer relative to the prior probability.
– With that increased probability, we predict that another test will give a positive result with higher likelihood than if we hadn't taken the previous test.
Two-test cancer example
C → T1, C → T2
P(C) = 0.01; P(¬C) = 0.99
P(+|C) = 0.9; P(-|C) = 0.1
P(-|¬C) = 0.8; P(+|¬C) = 0.2
P(T2=+|T1=+) = ??
Conditional independence: cancer example
• Conditional independence: given that I know C, knowledge of the first test gives me no more information about the second test.
– It only gives me information if C is unknown.
C → T1, C → T2
P(C) = 0.01; P(¬C) = 0.99
P(+|C) = 0.9; P(-|C) = 0.1
P(-|¬C) = 0.8; P(+|¬C) = 0.2
P(T2=+|T1=+)
= P(+2|+1,C) P(C|+1) + P(+2|+1,¬C) P(¬C|+1)
= P(+2|C) * 0.043 + P(+2|¬C) * (1 − 0.043)
= 0.9 * 0.043 + 0.2 * 0.957
= 0.2301
(using conditional independence: P(+2|+1,C) = P(+2|C))
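A sketch of the same computation; note the slide's 0.2301 comes from rounding P(C|+1) to 0.043, while the unrounded posterior gives approximately 0.2304:

```python
# P(T2=+ | T1=+) by conditioning on C and using conditional independence:
# P(+2|+1) = P(+2|C) P(C|+1) + P(+2|¬C) P(¬C|+1)
p_c_given_p1 = 0.009 / (0.009 + 0.198)  # P(C|+1) ≈ 0.0435 (slide rounds to 0.043)
p_p2_given_p1 = 0.9 * p_c_given_p1 + 0.2 * (1 - p_c_given_p1)
print(round(p_p2_given_p1, 4))  # 0.2304
```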
Absolute and Conditional Independence
A → C ← B
• A ⊥ B
• A ⊥ B | C?
• Does A ⊥ B imply A ⊥ B | C?
• Does A ⊥ B | C imply A ⊥ B?
Confounding Cause
S → H ← R
P(S) = 0.7; P(R) = 0.01
P(H|S, R) = 1
P(H|¬S, R) = 0.9
P(H|S, ¬R) = 0.7
P(H|¬S, ¬R) = 0.1
P(R|S) = ??
Confounding Cause
S → H ← R
P(S) = 0.7; P(R) = 0.01
P(H|S, R) = 1
P(H|¬S, R) = 0.9
P(H|S, ¬R) = 0.7
P(H|¬S, ¬R) = 0.1
P(R|S) = 0.01, since R ⊥ S: P(R|S) = P(R)
Explaining Away
S → H ← R
P(S) = 0.7; P(R) = 0.01
P(H|S, R) = 1; P(H|¬S, R) = 0.9; P(H|S, ¬R) = 0.7; P(H|¬S, ¬R) = 0.1
• Explaining away: if we know that we are happy, then sunny weather can explain away the cause of the happiness.
– If I know that it's sunny, it becomes less likely that I received a raise.
– If we see an effect that could be caused by multiple causes, seeing one of those causes can explain away any other potential cause of the effect.
P(R|H,S) = ??
Explaining Away
S → H ← R
P(S) = 0.7; P(R) = 0.01
P(H|S, R) = 1; P(H|¬S, R) = 0.9; P(H|S, ¬R) = 0.7; P(H|¬S, ¬R) = 0.1
P(R|H,S) = 0.0142
P(R|H,S) = P(H|R,S) P(R|S) / P(H|S)
= P(H|R,S) P(R) / (P(H|R,S) P(R) + P(H|¬R,S) P(¬R))
= 1 * 0.01 / (1 * 0.01 + 0.7 * (1 − 0.01)) = 0.0142
P(R|H) = ??
Explaining Away
S → H ← R
P(S) = 0.7; P(R) = 0.01
P(H|S, R) = 1; P(H|¬S, R) = 0.9; P(H|S, ¬R) = 0.7; P(H|¬S, ¬R) = 0.1
P(R|H) = P(H|R) P(R) / P(H)
= (P(H|R,S) P(S) + P(H|R,¬S) P(¬S)) P(R) / P(H)
= 0.97 * 0.01 / 0.5245
= 0.01849
P(R|H) = 0.01849
P(H) = 0.5245
P(R|H,S) = 0.0142
P(R|H,¬S) = ??
Explaining Away
S → H ← R
P(S) = 0.7; P(R) = 0.01
P(H|S, R) = 1; P(H|¬S, R) = 0.9; P(H|S, ¬R) = 0.7; P(H|¬S, ¬R) = 0.1
P(R|H,¬S) = P(H|R,¬S) P(R|¬S) / P(H|¬S)
= P(H|R,¬S) P(R) / (P(H|R,¬S) P(R) + P(H|¬R,¬S) P(¬R))
= 0.9 * 0.01 / (0.9 * 0.01 + 0.1 * 0.99) = 0.0833
P(R|H) = 0.01849
P(H) = 0.5245
P(R|H,S) = 0.0142
P(R|H,¬S) = 0.0833
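All four quantities from the explaining-away slides can be reproduced by enumeration (a sketch; the dictionary encoding of P(H|S,R) is ours):

```python
# Explaining away in the happiness network: compare P(R|H,S), P(R|H),
# and P(R|H,¬S).
p_s, p_r = 0.7, 0.01
p_h = {  # P(H | S, R) for each (S, R) combination
    (True, True): 1.0, (False, True): 0.9,
    (True, False): 0.7, (False, False): 0.1,
}

def p_r_given_h_and_s(s):
    # Bayes rule with S fixed; R ⊥ S, so P(R|S) = P(R).
    num = p_h[(s, True)] * p_r
    den = num + p_h[(s, False)] * (1 - p_r)
    return num / den

p_r_given_hs = p_r_given_h_and_s(True)    # ≈ 0.0142
p_r_given_hns = p_r_given_h_and_s(False)  # ≈ 0.0833

# P(R|H) by enumerating S: P(H) and P(H, R) first.
p_h_total = sum(p_h[(s, r)] * (p_s if s else 1 - p_s) * (p_r if r else 1 - p_r)
                for s in (True, False) for r in (True, False))  # ≈ 0.5245
p_h_and_r = sum(p_h[(s, True)] * (p_s if s else 1 - p_s)
                for s in (True, False)) * p_r                    # ≈ 0.0097
p_r_given_h = p_h_and_r / p_h_total                              # ≈ 0.0185
```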
Conditional Dependence
S → H ← R
P(S) = 0.7; P(R) = 0.01
P(H|S, R) = 1; P(H|¬S, R) = 0.9; P(H|S, ¬R) = 0.7; P(H|¬S, ¬R) = 0.1
P(R|H,S) = 1.42% ≠ P(R|H) = 1.849%
P(R|S) = 0.01 = P(R), since R ⊥ S
P(R|H,¬S) = 8.33%
R ⊥ S, but not R ⊥ S | H.
Independence does not imply conditional independence!
Absolute and Conditional Independence
A → C ← B
• A ⊥ B
• Does A ⊥ B imply A ⊥ B | C? No: conditioning on the common effect C makes A and B dependent (explaining away).
• Does A ⊥ B | C imply A ⊥ B? No: in the two-test cancer example, T1 ⊥ T2 | C, yet T1 and T2 are dependent.
Bayes Networks
A → C ← B; C → D; C → E
• Bayes networks define probability distributions over graphs of random variables.
• Instead of enumerating all combinations of the random variables, the Bayes network is defined by probability distributions that are inherent to each individual node.
• The joint probability represented by a Bayes network is the product of probabilities defined over the individual nodes, where each node's probability is conditioned only on its incoming arcs.
• P(A) P(B) P(C|A,B) P(D|C) P(E|C)
• 10 probability values:
– P(A), P(B): 1 + 1
– P(C|A,B): 4
– P(D|C), P(E|C): 2 + 2
• An unstructured joint distribution over five binary variables would need 2^5 − 1 = 31 probability values.
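A sketch of the factored joint for this network; the numeric values of the node distributions are made-up placeholders (the lecture gives none for this net), only the factorization P(A) P(B) P(C|A,B) P(D|C) P(E|C) is from the slide:

```python
# Hypothetical conditional probability tables for the A,B -> C -> D,E network.
p_a, p_b = 0.3, 0.6  # hypothetical priors
p_c_given_ab = {(True, True): 0.9, (True, False): 0.5,
                (False, True): 0.4, (False, False): 0.1}
p_d_given_c = {True: 0.8, False: 0.2}
p_e_given_c = {True: 0.7, False: 0.3}

# One full assignment, e.g. P(A=T, B=T, C=T, D=T, E=T), is just the
# product of the per-node conditionals:
p_joint = (p_a * p_b * p_c_given_ab[(True, True)]
           * p_d_given_c[True] * p_e_given_c[True])
```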
Bayes Network: Quiz 1
A → B, C, D; B → E; C, D → F
How many probability values are required to specify this Bayes network?
Bayes Network: Quiz 1
A → B, C, D; B → E; C, D → F
How many probability values are required to specify this Bayes network? 13
– P(A): 1
– P(B|A), P(C|A), P(D|A): 2 + 2 + 2
– P(E|B): 2
– P(F|C,D): 4
Bayes Network: Quiz 2
A, B, C → D; D → E; D → F; C, D → G
How many probability values are required to specify this Bayes network?
Bayes Network: Quiz 2
A, B, C → D; D → E; D → F; C, D → G
– P(A), P(B), P(C) = 3
– P(D|A, B, C) = 8
– P(E|D), P(F|D), P(G|D, C) = 2 + 2 + 4 = 8
Total: 19
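Both quiz answers follow from the rule that a binary node with k parents needs 2^k probability values (a sketch; the function and node maps are ours):

```python
# Count the probability values a Bayes network over binary variables needs:
# each node with k parents contributes 2^k values.
def num_parameters(parents):
    """parents maps node name -> number of parents."""
    return sum(2 ** k for k in parents.values())

# Quiz 1: A -> B, C, D; B -> E; C, D -> F
quiz1 = {"A": 0, "B": 1, "C": 1, "D": 1, "E": 1, "F": 2}
print(num_parameters(quiz1))  # 13

# Quiz 2: A, B, C -> D; D -> E, F; C, D -> G
quiz2 = {"A": 0, "B": 0, "C": 0, "D": 3, "E": 1, "F": 1, "G": 2}
print(num_parameters(quiz2))  # 19
```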
Bayes Network: Example
[Diagram: the car-diagnosis network from earlier: Battery Age, Alternator Broken, Fan Belt Broken, Not Charging, Battery Dead, Battery Flat, No Oil, No Gas, Fuel Line Blocked, Starter Broken, Car Won't Start, Battery Meter, Lights, Oil Light, Gas Gauge, Dip Stick.]
A full joint distribution over these 16 binary variables would require 2^16 − 1 = 65535 probability values. How many does the Bayes network need?
Bayes Network: Example
[Diagram: the car-diagnosis network, annotated with the number of probability values contributed by each node.]
2^16 − 1 = 65535 probability values for the full joint distribution, but the Bayes network needs only:
1 + 1 + 1 + 1 + 1 + 1 + 1 + 2 + 2 + 2 + 2 + 4 + 4 + 4 + 4 + 16 = 47
D-Separation
A → B → C; A → D → E
Yes or No?
– C ⊥ A?
– C ⊥ A | B?
– C ⊥ D?
– C ⊥ D | A?
– E ⊥ C | D?
D-Separation
A → B → C; A → D → E
– C ⊥ A? No. A influences C by virtue of B.
– C ⊥ A | B? Yes. If you know B, knowledge of A won't tell you anything more about C.
– C ⊥ D? No. If I know D, I can infer more about C through A.
– C ⊥ D | A? Yes.
– E ⊥ C | D? Yes.
• C and A are not independent, but C and A are independent given B.
• C and D are not independent, but C and D are independent given A.
• E and C are independent given D.
D-Separation
A → C ← B; C → D; C → E
Yes or No?
– A ⊥ E?
– A ⊥ E | B?
– A ⊥ E | C?
– A ⊥ B?
– A ⊥ B | C?
D-Separation
A → C ← B; C → D; C → E
– A ⊥ E? No: A influences E through C.
– A ⊥ E | B? No: knowing B does not block the path A → C → E.
– A ⊥ E | C? Yes: C blocks the path.
– A ⊥ B? Yes: C is an unobserved common effect.
– A ⊥ B | C? No.
• EXPLAINING AWAY EFFECT: the knowledge of A will discredit the information given by B on its influence on C.
D-Separation: Reachability
• Active triplets render variables dependent:
– a chain A → B → C with B unobserved
– a common cause A ← B → C with B unobserved
– a common effect A → B ← C with B (or a descendant of B) observed
• Inactive triplets render variables independent:
– cut off by a known variable in the middle, which separates (d-separates) the left variable from the right variable: a chain or common cause with the middle variable observed, or a common effect with the middle variable and its descendants unobserved
D-Separation: Quiz
[Diagram: a Bayes network over A, B, C, D, E, F, G, H.]
Yes or No?
– F ⊥ A?
– F ⊥ A | D?
– F ⊥ A | G?
– F ⊥ A | H?
Bayes Network: Summary
• Graph structure
• Compact representation
• Conditional independence
• Next: applications