Part II: Graphical models
Challenges of probabilistic models
• Specifying well-defined probabilistic models with many variables is hard (for modelers)
• Representing probability distributions over those variables is hard (for computers/learners)
• Computing quantities using those distributions is hard (for computers/learners)
Representing structured distributions
Four random variables:
X1 coin toss produces heads
X2 pencil levitates
X3 friend has psychic powers
X4 friend has two-headed coin
Domain: {0,1} × {0,1} × {0,1} × {0,1}
Joint distribution

The joint distribution assigns a probability to each of the 16 assignments 0000, 0001, 0010, …, 1111.

• Requires 15 numbers to specify the probability of all values x1, x2, x3, x4
– N binary variables require 2^N − 1 numbers
• Similar cost when computing conditional probabilities
P(x2, x3, x4 | x1 = 1) = P(x1 = 1, x2, x3, x4) / P(x1 = 1)
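To make the cost concrete, the full-table representation can be sketched as follows; the joint here is a randomly generated, hypothetical distribution:

```python
import itertools
import random

# A hypothetical joint distribution over four binary variables, stored as a
# full table: one probability per assignment (x1, x2, x3, x4).
random.seed(0)
weights = [random.random() for _ in range(16)]
total = sum(weights)
joint = {assignment: w / total
         for assignment, w in zip(itertools.product([0, 1], repeat=4), weights)}

# 16 entries summing to 1, so 2**4 - 1 = 15 free numbers.
assert len(joint) == 16

# Conditioning also touches the whole table:
# P(x2, x3, x4 | x1 = 1) = P(x1 = 1, x2, x3, x4) / P(x1 = 1)
p_x1 = sum(p for (x1, *_), p in joint.items() if x1 == 1)
conditional = {rest: joint[(1, *rest)] / p_x1
               for rest in itertools.product([0, 1], repeat=3)}
assert abs(sum(conditional.values()) - 1.0) < 1e-9
```

Both the table size and the conditioning step scale exponentially in the number of variables.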
How can we use fewer numbers?
Four random variables:
X1 coin toss produces heads
X2 coin toss produces heads
X3 coin toss produces heads
X4 coin toss produces heads
Domain: {0,1} × {0,1} × {0,1} × {0,1}
Statistical independence

• Two random variables X1 and X2 are independent if P(x1|x2) = P(x1)
– e.g. coin flips: P(x1=H|x2=H) = P(x1=H) = 0.5
• Independence makes it easier to represent and work with probability distributions
• We can exploit the product rule:
P(x1, x2, x3, x4) = P(x1 | x2, x3, x4) P(x2 | x3, x4) P(x3 | x4) P(x4)

If x1, x2, x3, and x4 are all independent…

P(x1, x2, x3, x4) = P(x1) P(x2) P(x3) P(x4)
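Under full independence the same 16-entry table is generated from just four numbers; a minimal sketch with hypothetical marginals:

```python
import itertools

# If x1..x4 are independent, four numbers P(xi = 1) specify the whole joint.
p = [0.5, 0.1, 0.2, 0.3]  # hypothetical marginals P(xi = 1)

def joint_prob(x):
    # P(x1, x2, x3, x4) = P(x1) P(x2) P(x3) P(x4)
    result = 1.0
    for xi, pi in zip(x, p):
        result *= pi if xi == 1 else 1 - pi
    return result

# The implied table still has 16 entries, but is generated from 4 parameters.
table = {x: joint_prob(x) for x in itertools.product([0, 1], repeat=4)}
assert abs(sum(table.values()) - 1.0) < 1e-9
assert abs(table[(1, 1, 1, 1)] - 0.5 * 0.1 * 0.2 * 0.3) < 1e-9
```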
Expressing independence
• Statistical independence is the key to efficient probabilistic representation and computation
• This has led to the development of languages for indicating dependencies among variables
• Some of the most popular languages are based on “graphical models”
Part II: Graphical models
• Introduction to graphical models
– representation and inference
• Causal graphical models
– causality
– learning about causal relationships
• Graphical models and cognitive science
– uses of graphical models
– an example: causal induction
Graphical models
• Express the probabilistic dependency structure among a set of variables (Pearl, 1988)
• Consist of
– a set of nodes, corresponding to variables
– a set of edges, indicating dependency
– a set of functions defined on the graph that specify a probability distribution
Undirected graphical models
• Consist of
– a set of nodes
– a set of edges
– a potential for each clique, multiplied together to yield the distribution over variables
• Examples
– statistical physics: Ising model, spin glasses
– early neural networks (e.g. Boltzmann machines)

(Figure: an undirected graph over nodes X1–X5.)
Directed graphical models

(Figure: a directed graph over nodes X1–X5.)

• Consist of
– a set of nodes
– a set of edges
– a conditional probability distribution for each node, conditioned on its parents, multiplied together to yield the distribution over variables
• Constrained to directed acyclic graphs (DAGs)
• Called Bayesian networks or Bayes nets
Bayesian networks and Bayes
• Two different problems
– Bayesian statistics is a method of inference
– Bayesian networks are a form of representation
• There is no necessary connection
– many users of Bayesian networks rely upon frequentist statistical methods
– many Bayesian inferences cannot be easily represented using Bayesian networks
Properties of Bayesian networks
• Efficient representation and inference
– exploiting dependency structure makes it easier to represent and compute with probabilities
• Explaining away
– pattern of probabilistic reasoning characteristic of Bayesian networks, especially early use in AI
Efficient representation and inference
Four random variables:
X1 coin toss produces heads
X2 pencil levitates
X3 friend has psychic powers
X4 friend has two-headed coin
(Figure: Bayes net X4 → X1 ← X3 → X2, with distributions P(x4), P(x3), P(x2|x3), P(x1|x3, x4).)
The Markov assumption
Every node is conditionally independent of its non-descendants, given its parents
P(xi | xi+1, …, xk) = P(xi | Pa(Xi))

where Pa(Xi) is the set of parents of Xi

P(x1, …, xk) = ∏i P(xi | Pa(Xi))

(via the product rule)
Efficient representation and inference
Four random variables:
X1 coin toss produces heads
X2 pencil levitates
X3 friend has psychic powers
X4 friend has two-headed coin
(Figure: Bayes net X4 → X1 ← X3 → X2, with distributions P(x4), P(x3), P(x2|x3), P(x1|x3, x4).)
P(x1, x2, x3, x4) = P(x1|x3, x4)P(x2|x3)P(x3)P(x4)
numbers needed: P(x1|x3, x4): 4, P(x2|x3): 2, P(x3): 1, P(x4): 1
total = 7 (vs 15)
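The seven-number representation can be made concrete; the CPD values below are hypothetical, but the factorization is the one in the text:

```python
import itertools

# Hypothetical CPDs for the coin/psychic network: 7 numbers instead of 15.
p_x3 = 0.01                    # P(x3 = 1): friend has psychic powers
p_x4 = 0.01                    # P(x4 = 1): friend has two-headed coin
p_x2_given_x3 = {0: 0.0, 1: 0.9}              # P(x2 = 1 | x3): pencil levitates
p_x1_given_x3x4 = {(0, 0): 0.5, (0, 1): 1.0,  # P(x1 = 1 | x3, x4): heads
                   (1, 0): 0.9, (1, 1): 1.0}

def bern(p, x):
    return p if x == 1 else 1 - p

def joint(x1, x2, x3, x4):
    # P(x1, x2, x3, x4) = P(x1|x3, x4) P(x2|x3) P(x3) P(x4)
    return (bern(p_x1_given_x3x4[(x3, x4)], x1)
            * bern(p_x2_given_x3[x3], x2)
            * bern(p_x3, x3) * bern(p_x4, x4))

# The product of CPDs defines a proper distribution over all 16 assignments.
total = sum(joint(*x) for x in itertools.product([0, 1], repeat=4))
assert abs(total - 1.0) < 1e-9
```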
Reading a Bayesian network
• The structure of a Bayes net can be read as the generative process behind a distribution
• Gives the joint probability distribution over variables obtained by sampling each variable conditioned on its parents
Reading a Bayesian network
Four random variables:
X1 coin toss produces heads
X2 pencil levitates
X3 friend has psychic powers
X4 friend has two-headed coin
(Figure: Bayes net X4 → X1 ← X3 → X2, with distributions P(x4), P(x3), P(x2|x3), P(x1|x3, x4).)
P(x1, x2, x3, x4) = P(x1|x3, x4)P(x2|x3)P(x3)P(x4)
Reading a Bayesian network
• The structure of a Bayes net can be read as the generative process behind a distribution
• Gives the joint probability distribution over variables obtained by sampling each variable conditioned on its parents
• Simple rules for determining whether two variables are dependent or independent
• Independence makes inference more efficient
Computing with Bayes nets
(Figure: Bayes net X4 → X1 ← X3 → X2, with distributions P(x4), P(x3), P(x2|x3), P(x1|x3, x4).)
P(x1, x2, x3, x4) = P(x1|x3, x4)P(x2|x3)P(x3)P(x4)
P(x2, x3, x4 | x1 = 1) = P(x1 = 1, x2, x3, x4) / P(x1 = 1)
First compute the denominator:

P(x1 = 1) = Σ_{x2, x3, x4} P(x1 = 1, x2, x3, x4)   (sum over 8 values)
Substituting the factorization:

P(x1 = 1) = Σ_{x2, x3, x4} P(x1 = 1 | x3, x4) P(x2 | x3) P(x3) P(x4)
Since Σ_{x2} P(x2 | x3) = 1:

P(x1 = 1) = Σ_{x3, x4} P(x1 = 1 | x3, x4) P(x3) P(x4)   (sum over 4 values)
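The savings can be checked directly: summing the factored joint over all 8 settings and summing the reduced expression over 4 settings give the same marginal (CPD values hypothetical, as before):

```python
import itertools

# Same hypothetical CPDs as the network in the text.
p_x3, p_x4 = 0.01, 0.01
p_x2_given_x3 = {0: 0.0, 1: 0.9}
p_x1_given_x3x4 = {(0, 0): 0.5, (0, 1): 1.0, (1, 0): 0.9, (1, 1): 1.0}

def bern(p, x):
    return p if x == 1 else 1 - p

def joint(x1, x2, x3, x4):
    return (bern(p_x1_given_x3x4[(x3, x4)], x1) * bern(p_x2_given_x3[x3], x2)
            * bern(p_x3, x3) * bern(p_x4, x4))

# Naive: sum the joint over all 8 settings of (x2, x3, x4).
naive = sum(joint(1, x2, x3, x4)
            for x2, x3, x4 in itertools.product([0, 1], repeat=3))

# Structured: x2 sums out to 1, leaving a sum over the 4 settings of (x3, x4).
structured = sum(p_x1_given_x3x4[(x3, x4)] * bern(p_x3, x3) * bern(p_x4, x4)
                 for x3, x4 in itertools.product([0, 1], repeat=2))

assert abs(naive - structured) < 1e-9
```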
Computing with Bayes nets
• Inference algorithms for Bayesian networks exploit dependency structure
• Message-passing algorithms
– “belief propagation” passes simple messages between nodes, exact for tree-structured networks
• More general inference algorithms
– exact: “junction-tree”
– approximate: Monte Carlo schemes (see Part IV)
Explaining away

(Figure: Rain → Grass Wet ← Sprinkler.)

Assume grass will be wet if and only if it rained last night, or if the sprinklers were left on:

P(W = w | R, S) = 1 if S = s or R = r
P(W = w | R, S) = 0 if R = ¬r and S = ¬s.

P(R, S, W) = P(R) P(S) P(W | R, S)
Compute probability it rained last night, given that the grass is wet:

P(r | w) = P(w | r) P(r) / P(w)
Expanding the denominator:

P(r | w) = P(w | r) P(r) / Σ_{r′, s′} P(w | r′, s′) P(r′, s′)
Since P(w | r′, s′) = 1 unless both causes are absent:

P(r | w) = P(r) / [P(r, s) + P(r, ¬s) + P(¬r, s)]
Collecting terms, P(r, s) + P(r, ¬s) = P(r):

P(r | w) = P(r) / [P(r) + P(¬r, s)]
Since Rain and Sprinkler are independent a priori:

P(r | w) = P(r) / [P(r) + P(¬r) P(s)] > P(r)

(the denominator lies between P(s) and 1)
Compute probability it rained last night, given that the grass is wet and sprinklers were left on:

P(r | w, s) = P(w | r, s) P(r | s) / P(w | s)

Both P(w | r, s) and P(w | s) equal 1.
So:

P(r | w, s) = P(r | s) = P(r)
Putting the two results together:

P(r | w) = P(r) / [P(r) + P(¬r) P(s)] > P(r)
P(r | w, s) = P(r | s) = P(r)

“Discounting” to prior probability.
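The two results can be checked numerically; a minimal sketch with hypothetical priors for Rain and Sprinkler and the deterministic wet-iff-rain-or-sprinkler rule from the text:

```python
# Deterministic-OR sprinkler network with hypothetical priors.
p_r, p_s = 0.1, 0.2   # P(rain), P(sprinkler); independent a priori

def p_w_given(r, s):
    # Grass is wet iff it rained or the sprinkler was on.
    return 1.0 if (r or s) else 0.0

# P(r | w) = P(r) / (P(r) + P(not r) P(s))
p_w = sum(p_w_given(r, s) * (p_r if r else 1 - p_r) * (p_s if s else 1 - p_s)
          for r in (0, 1) for s in (0, 1))
p_r_given_w = p_r / p_w

# P(r | w, s) = P(r): the sprinkler fully explains the wet grass,
# since P(w | r, s) = P(w | s) = 1.
p_r_given_ws = p_r

assert p_r_given_w > p_r           # wet grass raises belief in rain
assert p_r_given_ws == p_r         # observing the sprinkler discounts it
```

With these numbers P(r | w) ≈ 0.1 / 0.28 ≈ 0.36, which then drops back to the prior 0.1 once the sprinkler is observed.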
Contrast w/ production system

(Figure: Rain → Grass Wet ← Sprinkler.)

• Formulate IF-THEN rules:
– IF Rain THEN Wet
– IF Wet THEN Rain
– IF Wet AND NOT Sprinkler THEN Rain
• Rules do not distinguish directions of inference
• Requires a combinatorial explosion of rules
Contrast w/ spreading activation

(Figure: Rain → Grass Wet ← Sprinkler.)

• Excitatory links: Rain → Wet, Sprinkler → Wet
• Observing rain, Wet becomes more active.
• Observing grass wet, Rain and Sprinkler become more active.
• Observing grass wet and sprinkler, Rain cannot become less active. No explaining away!
Contrast w/ spreading activation

• Excitatory links: Rain → Wet, Sprinkler → Wet
• Inhibitory link between Rain and Sprinkler
• Observing grass wet, Rain and Sprinkler become more active.
• Observing grass wet and sprinkler, Rain becomes less active: explaining away.
Contrast w/ spreading activation

(Figure: Rain, Sprinkler, and Burst pipe all linked to Grass Wet.)

• Each new variable requires more inhibitory connections
• Not modular
– whether a connection exists depends on what others exist
– big holism problem: combinatorial explosion
Contrast w/ spreading activation
(Figure: the interactive activation model of letter perception; McClelland & Rumelhart, 1981.)
Graphical models
• Capture dependency structure in distributions
• Provide an efficient means of representing and reasoning with probabilities
• Allow kinds of inference that are problematic for other representations: explaining away
– hard to capture in a production system
– more natural than with spreading activation
Part II: Graphical models
• Introduction to graphical models
– representation and inference
• Causal graphical models
– causality
– learning about causal relationships
• Graphical models and cognitive science
– uses of graphical models
– an example: causal induction
Causal graphical models
• Graphical models represent statistical dependencies among variables (i.e. correlations)
– can answer questions about observations
• Causal graphical models represent causal dependencies among variables (Pearl, 2000)
– express underlying causal structure
– can answer questions about both observations and interventions (actions upon a variable)
Bayesian networks
Four random variables:
X1 coin toss produces heads
X2 pencil levitates
X3 friend has psychic powers
X4 friend has two-headed coin
(Figure: Bayes net X4 → X1 ← X3 → X2, with distributions P(x4), P(x3), P(x2|x3), P(x1|x3, x4).)

Nodes: variables
Links: dependency
Each node has a conditional probability distribution
Data: observations of x1, …, x4
Causal Bayesian networks
Four random variables:
X1 coin toss produces heads
X2 pencil levitates
X3 friend has psychic powers
X4 friend has two-headed coin
(Figure: the same network, with edges now read causally.)

Nodes: variables
Links: causality
Each node has a conditional probability distribution
Data: observations of and interventions on x1, …, x4
Interventions
Four random variables:
X1 coin toss produces heads
X2 pencil levitates
X3 friend has psychic powers
X4 friend has two-headed coin
(Figure: the same network with the incoming link to X2 cut; an intervention, e.g. holding down the pencil, sets X2 directly.)

• Cut all incoming links for the node that we intervene on
• Compute probabilities with the “mutilated” Bayes net
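The mutilated-network computation can be sketched by cutting the link into the intervened node, reusing the hypothetical CPDs from earlier (function and parameter names are illustrative, not from the slides):

```python
import itertools

# Observing vs intervening on x2 (the pencil) in the four-variable network.
p_x3, p_x4 = 0.01, 0.01
p_x2_given_x3 = {0: 0.0, 1: 0.9}   # pencil only levitates given psychic powers
p_x1_given_x3x4 = {(0, 0): 0.5, (0, 1): 1.0, (1, 0): 0.9, (1, 1): 1.0}

def bern(p, x):
    return p if x == 1 else 1 - p

def joint(x1, x2, x3, x4, intervene_x2=None):
    # Intervening cuts the x3 -> x2 link: x2 is set by hand, not sampled.
    if intervene_x2 is not None:
        p2 = 1.0 if x2 == intervene_x2 else 0.0
    else:
        p2 = bern(p_x2_given_x3[x3], x2)
    return (bern(p_x1_given_x3x4[(x3, x4)], x1) * p2
            * bern(p_x3, x3) * bern(p_x4, x4))

def p_x3_given_x2(value, intervene=None):
    num = sum(joint(x1, value, 1, x4, intervene)
              for x1, x4 in itertools.product([0, 1], repeat=2))
    den = sum(joint(x1, value, x3, x4, intervene)
              for x1, x3, x4 in itertools.product([0, 1], repeat=3))
    return num / den

# Seeing the pencil levitate is strong evidence of psychic powers...
assert p_x3_given_x2(1) > 0.99
# ...but making it "levitate" ourselves tells us nothing: back to the prior.
assert abs(p_x3_given_x2(1, intervene=1) - p_x3) < 1e-9
```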
Learning causal graphical models

• Strength: how strong is a relationship?
• Structure: does a relationship exist?

(Figure: candidate structures over B, C, E: h1 with B → E ← C, h0 with B → E only.)
Causal structure vs. causal strength

• Strength: how strong is a relationship?

(Figure: the two candidate structures over B, C, E.)
Causal structure vs. causal strength

• Strength: how strong is a relationship?
– requires defining the nature of the relationship

(Figure: the structures labeled with strengths: w0 on B → E, w1 on C → E.)
Parameterization

• Structures: h1 = (B → E ← C), h0 = (B → E)
• Generic parameterization:

C B   h1: P(E = 1 | C, B)   h0: P(E = 1 | C, B)
0 0   p00                   p0
1 0   p10                   p0
0 1   p01                   p1
1 1   p11                   p1
Parameterization

• Structures: h1 = (B → E ← C), h0 = (B → E)
• Linear parameterization (w0, w1: strength parameters for B, C):

C B   h1: P(E = 1 | C, B)   h0: P(E = 1 | C, B)
0 0   0                     0
1 0   w1                    0
0 1   w0                    w0
1 1   w1 + w0               w0
Parameterization

• Structures: h1 = (B → E ← C), h0 = (B → E)
• “Noisy-OR” parameterization (w0, w1: strength parameters for B, C):

C B   h1: P(E = 1 | C, B)   h0: P(E = 1 | C, B)
0 0   0                     0
1 0   w1                    0
0 1   w0                    w0
1 1   w1 + w0 − w1 w0       w0
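The noisy-OR table for h1 can be written compactly as P(E = 1 | B, C) = 1 − (1 − w0 B)(1 − w1 C); a small sketch (the w0, w1 values are hypothetical):

```python
# Noisy-OR parameterization for h1 (both B -> E and C -> E present).
# w0 is the strength of B, w1 the strength of C, as in the table above.
def noisy_or(b, c, w0, w1):
    # P(E = 1 | B = b, C = c) = 1 - (1 - w0 * b) * (1 - w1 * c)
    return 1 - (1 - w0 * b) * (1 - w1 * c)

w0, w1 = 0.8, 0.5  # hypothetical strengths
assert noisy_or(0, 0, w0, w1) == 0                              # no causes
assert abs(noisy_or(0, 1, w0, w1) - w1) < 1e-9                  # C alone
assert abs(noisy_or(1, 0, w0, w1) - w0) < 1e-9                  # B alone
assert abs(noisy_or(1, 1, w0, w1) - (w0 + w1 - w0 * w1)) < 1e-9 # both
```

Each cause has an independent chance of producing the effect, which is why the strengths combine like probabilities of independent events rather than adding linearly.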
Parameter estimation
• Maximum likelihood estimation:
maximize ∏i P(bi, ci, ei; w0, w1)
• Bayesian methods: as in Part I
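A minimal sketch of maximum likelihood estimation for the noisy-OR model, using a hypothetical data set and a crude grid search in place of a proper optimizer:

```python
import itertools
import math

def noisy_or(b, c, w0, w1):
    return 1 - (1 - w0 * b) * (1 - w1 * c)

# Hypothetical data: (b, c, e) triples; B (the background cause) is always on.
data = [(1, 1, 1)] * 6 + [(1, 1, 0)] * 4 + [(1, 0, 1)] * 2 + [(1, 0, 0)] * 8

def log_likelihood(w0, w1):
    ll = 0.0
    for b, c, e in data:
        p = noisy_or(b, c, w0, w1)
        ll += math.log(p if e == 1 else 1 - p)
    return ll

# Grid search over (w0, w1), avoiding the boundary values 0 and 1.
grid = [i / 100 for i in range(1, 100)]
w0_hat, w1_hat = max(itertools.product(grid, grid),
                     key=lambda w: log_likelihood(*w))

# With B always on, P(e | c = 0) = w0, so the MLE should match the empirical
# rate 2/10; w1 then accounts for the extra effects observed when C is on.
assert abs(w0_hat - 0.2) < 0.05
assert abs(w1_hat - 0.5) < 0.05
```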
Causal structure vs. causal strength

• Structure: does a relationship exist?

(Figure: the two candidate structures over B, C, E.)
Approaches to structure learning

• Constraint-based:
– dependency from statistical tests (e.g. χ²)
– deduce structure from dependencies

(Pearl, 2000; Spirtes et al., 1993)
– attempts to reduce an inductive problem to a deductive problem
• Bayesian:
– compute posterior probability of structures, given observed data

P(h | data) ∝ P(data | h) P(h)

compare P(h1 | data) with P(h0 | data)

(Heckerman, 1998; Friedman, 1999)
Bayesian Occam’s Razor

For any model h, Σd P(d | h) = 1

(Figure: P(d | h) plotted over all possible data sets d, for h0 (no relationship) and h1 (relationship).)
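The normalization constraint behind Bayesian Occam’s razor is easy to verify for a toy model: each data set’s probability comes from a proper distribution, so summing over all possible data sets gives 1, and a model that fits more data sets must assign each one less probability. A sketch for a simple coin-flip model (parameters hypothetical):

```python
import itertools

# A model h that assigns each of n binary outcomes probability p.
p, n = 0.3, 4

def p_data(d, p):
    # Probability of one particular data set d = (x1, ..., xn) under h.
    result = 1.0
    for x in d:
        result *= p if x == 1 else 1 - p
    return result

# Summing over all 2**n possible data sets must give exactly 1.
total = sum(p_data(d, p) for d in itertools.product([0, 1], repeat=n))
assert abs(total - 1.0) < 1e-9
```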
Causal graphical models
• Extend graphical models to deal with interventions as well as observations
• Respecting the direction of causality results in efficient representation and inference
• Two steps in learning causal models
– strength: parameter estimation
– structure: structure learning
Part II: Graphical models
• Introduction to graphical models
– representation and inference
• Causal graphical models
– causality
– learning about causal relationships
• Graphical models and cognitive science
– uses of graphical models
– an example: causal induction
![Page 64: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/64.jpg)
Uses of graphical models
• Understanding existing cognitive models
– e.g., neural network models
• Representation and reasoning
– a way to address holism in induction (cf. Fodor)
• Defining generative models
– mixture models, language models (see Part IV)
• Modeling human causal reasoning
![Page 65: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/65.jpg)
Human causal reasoning
• How do people reason about interventions?
(Gopnik, Glymour, Sobel, Schulz, Kushnir & Danks, 2004; Lagnado & Sloman, 2004; Sloman & Lagnado, 2005; Steyvers, Tenenbaum, Wagenmakers & Blum, 2003)
• How do people learn about causal relationships?
– parameter estimation (Shanks, 1995; Cheng, 1997)
– constraint-based models (Glymour, 2001)
– Bayesian structure learning (Steyvers et al., 2003; Griffiths & Tenenbaum, 2005)
![Page 66: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/66.jpg)
Causation from contingencies
“Does C cause E?” (rate on a scale from 0 to 100)

                 E present (e+)   E absent (e−)
C present (c+)         a                b
C absent (c−)          c                d
![Page 67: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/67.jpg)
Two models of causal judgment
• Delta-P (Jenkins & Ward, 1965):
• Power PC (Cheng, 1997):
ΔP ≡ P(e+|c+) − P(e+|c−) = a/(a+b) − c/(c+d)
power = ΔP / (1 − P(e+|c−)) = ΔP / (1 − c/(c+d))
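Both quantities follow directly from the contingency counts; a small sketch (the function names are ours):

```python
def delta_p(a, b, c, d):
    """Delta-P = P(e+|c+) - P(e+|c-) from contingency counts a, b, c, d."""
    return a / (a + b) - c / (c + d)

def causal_power(a, b, c, d):
    """Cheng's (1997) generative causal power: Delta-P / (1 - P(e+|c-))."""
    return delta_p(a, b, c, d) / (1 - c / (c + d))

# e.g. effect on 6 of 8 trials with the cause present, 2 of 8 with it absent
print(delta_p(6, 2, 2, 6))       # 0.5
print(causal_power(6, 2, 2, 6))  # 0.5 / 0.75, i.e. about 0.667
```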
![Page 68: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/68.jpg)
Buehner and Cheng (1997)
[Figure: human judgments (“People”) compared with ΔP and Power model predictions across contingency conditions; axis values 0.00, 0.25, 0.50, 0.75, 1.00]
![Page 69: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/69.jpg)
Buehner and Cheng (1997)
[Figure: People vs. ΔP vs. Power; conditions with constant ΔP produce changing judgments]
![Page 70: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/70.jpg)
Buehner and Cheng (1997)
[Figure: People vs. ΔP vs. Power; conditions with constant causal power produce changing judgments]
![Page 71: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/71.jpg)
Buehner and Cheng (1997)
[Figure: People vs. ΔP vs. Power; conditions with ΔP = 0 produce changing judgments]
![Page 72: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/72.jpg)
Causal structure vs. causal strength

• Strength: how strong is a relationship?
• Structure: does a relationship exist?

[Graphs: h1: B → E ← C, with strengths w0 and w1; h0: B → E, with strength w0 alone]
![Page 73: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/73.jpg)
Causal strength
• Assume structure:
ΔP and causal power are maximum-likelihood estimates of the strength parameter w1 under different parameterizations of P(E|B,C):
– linear parameterization → ΔP; noisy-OR parameterization → causal power
[Graph: B → E ← C, with strengths w0 (background) and w1 (cause)]
![Page 74: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/74.jpg)
Causal structure

• Hypotheses:  h1: B → E ← C (links from both B and C);  h0: B → E (background alone)

• Bayesian causal inference:

support = P(d | h1) / P(d | h0)

the likelihood ratio (Bayes factor), which gives the evidence in favor of h1, where

P(d | h1) = ∫₀¹ ∫₀¹ P(d | w0, w1) p(w0, w1 | h1) dw0 dw1

P(d | h0) = ∫₀¹ P(d | w0) p(w0 | h0) dw0
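The two marginal likelihoods can be approximated numerically. The sketch below assumes a noisy-OR parameterization for h1 and uniform priors on the strengths, and evaluates the integrals on a grid; these implementation choices are ours (cf. the “causal support” model of Griffiths & Tenenbaum, 2005, which takes the log of this ratio):

```python
import numpy as np

def causal_support(a, b, c, d, grid=201):
    """Approximate log[P(data|h1) / P(data|h0)] on a grid.
    h1: noisy-OR with background w0 and cause strength w1 (uniform priors);
    h0: background w0 alone. Counts a, b, c, d as in the contingency table."""
    w = np.linspace(1e-6, 1 - 1e-6, grid)
    # h0: single integral over w0 (same rate with and without the cause)
    p_d_h0 = np.mean(w**(a + c) * (1 - w)**(b + d))
    # h1: double integral over (w0, w1)
    w0, w1 = np.meshgrid(w, w, indexing="ij")
    p_e_cplus = w0 + w1 - w0 * w1  # noisy-OR: P(e+|c+)
    p_d_h1 = np.mean(p_e_cplus**a * (1 - p_e_cplus)**b
                     * w0**c * (1 - w0)**d)
    return np.log(p_d_h1 / p_d_h0)

print(causal_support(8, 0, 0, 8))  # strong evidence for a causal link (> 0)
print(causal_support(4, 4, 4, 4))  # no contingency: evidence against (< 0)
```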
![Page 75: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/75.jpg)
Buehner and Cheng (1997)

[Figure: correlations of model predictions with human judgments; ΔP: r = 0.89; Power: r = 0.88; Support: r = 0.97]
![Page 76: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/76.jpg)
The importance of parameterization
• Noisy-OR incorporates mechanism assumptions:
– generativity: causes increase the probability of their effects
– each cause is sufficient to produce the effect
– causes act via independent mechanisms
(Cheng, 1997)
• Consider other models:
– statistical dependence: χ² test
– generic parameterization (cf. Anderson, 1990)
![Page 77: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/77.jpg)
[Figure: human judgments compared with Support (noisy-OR), χ², and Support (generic)]
![Page 78: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/78.jpg)
Generativity is essential
• Predictions result from a “ceiling effect”
– ceiling effects only matter if you believe a cause increases the probability of an effect
[Figure: conditions with P(e+|c+) = P(e+|c−) equal to 8/8, 6/8, 4/8, 2/8, 0/8; predicted support (scale 0–100) varies across these ΔP = 0 conditions]
![Page 79: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/79.jpg)
Blicket detector (Dave Sobel, Alison Gopnik, and colleagues)
See this? It’s a blicket machine. Blickets make it go.
Let’s put this one on the machine.
Oooh, it’s a blicket!
![Page 80: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/80.jpg)
– Two objects: A and B
– Trial 1: A and B on detector – detector active
– Trial 2: A on detector – detector active
– 4-year-olds judge whether each object is a blicket
• A: a blicket (100% say yes)
• B: probably not a blicket (34% say yes)
“Backwards blocking” (Sobel, Tenenbaum & Gopnik, 2004)
[Figure: the AB trial followed by the A trial]
![Page 81: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/81.jpg)
Possible hypotheses
[Figure: the full hypothesis space of candidate causal graphs over A, B, and E]
![Page 82: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/82.jpg)
Bayesian inference
• Evaluating causal models in light of data:

P(hi | d) = P(d | hi) P(hi) / Σj P(d | hj) P(hj)

• Inferring a particular causal relation:

P(A → E | d) = Σ_{hj ∈ H} P(A → E | hj) P(hj | d)
![Page 83: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/83.jpg)
Bayesian inference
With a uniform prior on hypotheses and the generic parameterization:

[Chart: probability of being a blicket for A and B; values shown: 0.32, 0.32, 0.34, 0.34]
Modeling backwards blocking
Assume…
• Links can only exist from blocks to detectors
• Blocks are blickets with prior probability q
• Blickets always activate detectors, but detectors never activate on their own– deterministic Noisy-OR, with wi = 1 and w0 = 0
![Page 85: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/85.jpg)
Modeling backwards blocking
P(E=1 | A, B) under each hypothesis (columns: h00, h01, h10, h11):

P(E=1 | A=0, B=0):  0  0  0  0
P(E=1 | A=1, B=0):  0  0  1  1
P(E=1 | A=0, B=1):  0  1  0  1
P(E=1 | A=1, B=1):  0  1  1  1

[Graphs: h00: no links; h01: B → E; h10: A → E; h11: A → E ← B]

P(h00) = (1 − q)²   P(h10) = q(1 − q)
P(h01) = (1 − q)q   P(h11) = q²

P(B → E | d) = P(h01) + P(h11) = q
![Page 86: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/86.jpg)
P(B → E | d) = [P(h01) + P(h11)] / [P(h01) + P(h10) + P(h11)] = q / (q + q(1 − q)) = 1 / (2 − q)

After the AB trial, only hypotheses that predict E = 1 when A = 1, B = 1 survive:

P(E=1 | A=1, B=1):  0  1  1  1   (columns: h00, h01, h10, h11; h00 is ruled out)

P(h00) = (1 − q)²   P(h10) = q(1 − q)
P(h01) = (1 − q)q   P(h11) = q²
Modeling backwards blocking
![Page 87: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/87.jpg)
P(B → E | d) = P(h11) / [P(h10) + P(h11)] = q

After the A trial, h01 is ruled out as well:

P(E=1 | A=1, B=0):  0  1  1   (columns: h01, h10, h11)
P(E=1 | A=1, B=1):  1  1  1

P(h01) = (1 − q)q   P(h10) = q(1 − q)   P(h11) = q²
Modeling backwards blocking
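The derivations above can be checked by brute-force enumeration of the four hypotheses; a sketch under the stated assumptions (links with prior probability q, deterministic noisy-OR; the function name is ours):

```python
from itertools import product

def posterior_blicket(q, trials):
    """Enumerate hypotheses h_{ab} (a: A -> E link, b: B -> E link) with
    prior q per link and deterministic noisy-OR (w_i = 1, w_0 = 0).
    Each trial is (A_on, B_on, E). Returns P(A blicket), P(B blicket)."""
    post = {}
    for ha, hb in product([0, 1], repeat=2):
        prior = (q if ha else 1 - q) * (q if hb else 1 - q)
        lik = 1.0
        for a_on, b_on, e in trials:
            pred = 1 if (ha and a_on) or (hb and b_on) else 0
            lik *= 1.0 if pred == e else 0.0  # deterministic likelihood
        post[(ha, hb)] = prior * lik
    z = sum(post.values())
    p_a = (post[(1, 0)] + post[(1, 1)]) / z
    p_b = (post[(0, 1)] + post[(1, 1)]) / z
    return p_a, p_b

q = 1 / 3
# Trial 1: A and B on the detector, detector activates
p_a, p_b = posterior_blicket(q, [(1, 1, 1)])
print(p_b)       # rises above the prior, to 1 / (2 - q)
# Trial 2: A alone activates, so B falls back to its prior: backwards blocking
p_a, p_b = posterior_blicket(q, [(1, 1, 1), (1, 0, 1)])
print(p_a, p_b)  # A: 1.0, B: q
```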
![Page 88: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/88.jpg)
Manipulating prior probability (Tenenbaum, Sobel, Griffiths, & Gopnik, submitted)
[Figure: judgments at three stages: Initial, after the AB trial, and after the A trial]
![Page 89: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/89.jpg)
Summary
• Graphical models provide solutions to many of the challenges of probabilistic models:
– defining structured distributions
– representing distributions over many variables
– efficiently computing probabilities
• Causal graphical models provide tools for defining rational models of human causal reasoning and learning
![Page 90: Part II: Graphical models. Challenges of probabilistic models Specifying well-defined probabilistic models with many variables is hard (for modelers)](https://reader035.fdocuments.us/reader035/viewer/2022081514/56649d245503460f949fa782/html5/thumbnails/90.jpg)