Bayesian Networks. Graphical Models Bayesian networks Conditional random fields etc.

Bayesian Networks

Graphical Models

• Bayesian networks

• Conditional random fields

• etc.

Topic=Course

“Grading” “Instructor” “Syllabus” “Vapnik”“Exams”

)|()|()|()|()(

CourseVapnikP

courseExamsPCourseSyllabusPCourseInstructorPCourseGradingPCourseP

VapnikExamsSyllabusInstructorGradingP

)|()(),...,|( 111 jii

ijnj cCxXPcCPxXxXcCP Naive Bayes:

Bayesian networks

• Idea is to represent dependencies (or causal relations) for all the variables so that space and computation-time requirements are minimized.

Topic=RussianMathematicians

Topic=Course

Vapnik Syllabus

true 0.01

false 0.99

Course

true false

true 0.9 0.1

false 0.2 0.8

Syllabus

Course

true 0.2

false 0.8

Russian Math

true false

True True 0.95 0.05

True False 0.8 0.2

False True 0.6 0.4

false false 0.05 0.95

Vapnik

CourseR.MConditional probabilitytables for each node

Topic=RussianMathematicians

Topic=Course

Vapnik Syllabus

Semantics of Bayesian networks

• If network is correct, can calculate full joint probability distribution from network.

where parents(Xi) denotes specific values of parents of Xi.

Compare with Naïve Bayes.

))(...)(|)((

ruleChain ))(...)(|)((

))(...)()((

XparentsxXP

xXxXxXP

Example

• Calculate

)]()()()[( fSyllabusfVapnikfCoursethRussianMatP

ExamplesWhat is unconditional (marginal) probability that Vapnik is true?

What is the unconditional (marginal) probability that Russian Mathematicians is true?

Causal inferenceEvidence is cause, inference is probability of effect

Example: Instantiate evidence Course= true. What is P(Syllabus | Course)?

Different types of inference in Bayesian Networks

Diagnostic inference

Evidence is effect, inference is probability of cause

Example: Instantiate evidence Syllabus = true. What is P(Course | Syllabus)?

Example: What is P(Course|Vapnik)?

Inter-causal inference

Explain away different possible causes of effect

Example: What is P(Course|Vapnik,RussianMath)?

Why is P(Course|Vapnik,RussianMath) < P(Course|Vapnik)?

Complexity of Bayesian Networks

For n random Boolean variables:

• Full joint probability distribution: 2n entries

• Bayesian network with at most k parents per node:

– Each conditional probability table: at most 2k entries

– Entire network: n 2k entries

What are the advantages of Bayesian networks?

• Intuitive, concise representation of joint probability distribution (i.e., conditional dependencies) of a set of random variables.

• Represents “beliefs and knowledge” about a particular class of situations.

• Efficient (?) (approximate) inference algorithms

• Efficient, effective learning algorithms

Issues in Bayesian Networks

• Building / learning network topology

• Assigning / learning conditional probability tables

• Approximate inference via sampling

• In general, however, exact inference in Bayesian networks is too expensive.

Approximate inference in Bayesian networks

Instead of enumerating all possibilities, sample to estimate probabilities.

X1 X2 X3 Xn

General question: What is P(X|e)?

Notation convention: upper-case letters refer to random variables; lower-case letters refer to specific values of those variables

Direct Sampling

• Suppose we have no evidence, but we want to determine P(C,S,R,W) for all C,S,R,W.

• Direct sampling:

– Sample each variable in topological order, conditioned on values of parents.

– I.e., always sample from P(Xi | parents(Xi))

1. Sample from P(Cloudy). Suppose returns true.

2. Sample from P(Sprinkler | Cloudy = true). Suppose returns false.

3. Sample from P(Rain | Cloudy = true). Suppose returns true.

4. Sample from P(WetGrass | Sprinkler = false, Rain = true). Suppose returns true.

Here is the sampled event: [true, false, true, true]

Example

• Suppose there are N total samples, and let NS (x1, ..., xn) be the observed frequency of the specific event x1, ..., xn.

• Suppose N samples, n nodes. Complexity O(Nn).

• Problem 1: Need lots of samples to get good probability estimates.

• Problem 2: Many samples are not realistic; low likelihood.

),...,(),...,(

lim 11

),...,(),...,(

nnS xxP

Markov Chain Monte Carlo Sampling

• One of most common methods used in real applications.

• Uses idea of “Markov blanket” of a variable Xi:

– parents, children, children’s parents

Illustration of “Markov Blanket”

• Recall that: By construction of Bayesian network, a node is conditionaly independent of its non-descendants, given its parents.

• Proposition: A node Xi is conditionally independent of all other nodes in the network, given its Markov blanket.

Markov Chain Monte Carlo Sampling Algorithm

• Start with random sample from variables: (x1, ..., xn). This is the current “state” of the algorithm.

• Next state: Randomly sample value for one non-evidence variable Xi , conditioned on current values in “Markov Blanket” of Xi.

Example

• Query: What is P(Rain | Sprinkler = true, WetGrass = true)?

• MCMC: – Random sample, with evidence variables fixed:

[true, true, false, true]

– Repeat:

1. Sample Cloudy, given current values of its Markov blanket: Sprinkler = true, Rain = false. Suppose result is false. New state:

[false, true, false, true]

2. Sample Rain, given current values of its Markov blanket:

Cloudy = false, Sprinkler = true, WetGrass = true. Suppose

result is true. New state: [false, true, true, true].

• Each sample contributes to estimate for query

P(Rain | Sprinkler = true, WetGrass = true)

• Suppose we perform 100 such samples, 20 with Rain = true and 80 with Rain = false.

• Then answer to the query is

Normalize (20,80) = .20,.80

• Claim: “The sampling process settles into a dynamic equilibrium in which the long-run fraction of time spent in each state is exactly proportional to its posterior probability, given the evidence.”

• That is: for all variables Xi, the probability of the value xi of Xi appearing in a sample is equal to P(xi | e).

Claim (again)

• Claim: MCMC settles into behavior in which each state is sampled exactly according to its posterior probability, given the evidence.

Bayesian Networks. Graphical Models Bayesian networks Conditional random fields etc.

Documents

Transcript of Bayesian Networks. Graphical Models Bayesian networks Conditional random fields etc.

Learning Bayesian Networks and Causal Discoverychirayukong.github.io/infsci2725/resources/08_Learning Bayesian... · Learning Bayesian Networks and Causal Discovery Bayesian networks

1 Chapter 14 Probabilistic Reasoning. 2 Outline Syntax of Bayesian networks Semantics of Bayesian networks Efficient representation of conditional distributions.

Bayesian Networks - Department of Computer Sciencephi/ai/slides/lecture-bayesian-networks.pdf · Bayesian Networks 3 A simple, graphical notation for conditional independence assertions

Lecture 29 Conditional Independence, Bayesian networks intro Ch 6.3, 6.3.1, 6.5, 6.5.1, 6.5.2 1.

Bayesian Networks - cs.cmu.edumgormley/courses/10601bd-f18/slides/lecture21... · • A Bayesian Network is a directed graphical model • It consists of a graph G and the conditional

Bayesian Belief Network - ULisboa · 3 Naive Bayes assumption of conditional independence too restrictive But it's intractable without some such assumptions... Bayesian Belief networks

Probabilistic Reasoning Bayesian Belief Networks Constructing Bayesian Networks Representing Conditional Distributions Summary.

Conditional Degree of Belief and Bayesian Inference

Learning Bayesian Networks in R · 2013-07-10 · Bayesian Networks Essentials Bayesian Networks Bayesian networks [21, 27] are de ned by: anetwork structure, adirected acyclic graph

Bayesian Networks - cs.jhu.eduphi/ai/slides-2017/lecture-bayesian-networks.pdf · Bayesian Networks 3 A simple, graphical notation for conditional independence assertions and hence

Bayesian Networks. Outline Introduction to Bayesian Networks Conditional probability and Bayes ’ Theorem Analyzing a Bayesian Network Practical Uses for.

Bayesian Networks - Chonbuknlp.chonbuk.ac.kr/.../lecture-bayesian-networks.pdf · 2017. 11. 28. · Bayesian Networks 3 A simple, graphical notation for conditional independence assertions

Bayesian Network Modelling › about › slides › slides-ibm16.pdf · What Are Bayesian Networks? Conditional Linear Gaussian Bayesian Networks CLGBNs contain both discrete and

Bayesian networks Chapter 14 Section 1 – 2. Bayesian networks A simple, graphical notation for conditional independence assertions and hence for compact.

Bayesian Conditional Generative Adverserial Networks · We propose a new GAN called Bayesian Conditional Gener- ative Adversarial Networks (BC-GANs) that use a random generator function

Bayesian Networks Chapter 14 Section 1, 2, 4. Bayesian networks A simple, graphical notation for conditional independence assertions and hence for compact.

Bayesian networks - NTNU · Bayesian networks A simple, graphical notation for conditional independence assertions ... (CPT) giving the distribution over Xi for each combination of

Bayesian networks - University of Washington · Bayesian networks A simple, graphical notation for conditional independence assertions and hence for compact speciﬁcation of full

Bayesian Networks

Bayesian Networks Graphical Models - cs.ubc.casiamakx/532R/slides/directed_models.pdf · Learning objectives what is a Bayesian network? factorization conditional independencies how