BAYESIAN NETWORKS
Ivan Bratko
Faculty of Computer and Information Sc.
University of Ljubljana
BAYESIAN NETWORKS
Bayesian networks, or belief networks: an approach to handling uncertainty in knowledge-based systems
Mathematically well-founded in probability theory, unlike many other, earlier approaches to representing uncertain knowledge
Type of problems intended for belief nets: given that some things are known to be true, how likely are some other events?
BURGLARY EXAMPLE
We have an alarm system to warn about burglary.
We have received an automatic alarm phone call; how likely is it that there actually was a burglary?
We cannot tell for sure whether there was a burglary, so we characterize it probabilistically instead.
BURGLARY EXAMPLE
There are a number of events involved:
burglary
sensor that may be triggered by burglar
lightning that may also trigger the sensor
alarm that may be triggered by sensor
call that may be triggered by sensor
BAYES NET REPRESENTATION
There are variables (e.g. burglary, alarm) that can take values (e.g. alarm = true, burglary = false).
There are probabilistic relations among variables, e.g.:
if burglary = true
then it is more likely that alarm = true
EXAMPLE BAYES NET
burglary → sensor ← lightning
sensor → alarm
sensor → call
PROBABILISTIC DEPENDENCIES AND CAUSALITY
Belief networks define probabilistic dependencies (and independencies) among the variables
They may also reflect causality (burglar triggers sensor)
EXAMPLE OF REASONING IN BELIEF NETWORK
In a normal situation, burglary is not very likely.
We receive an automatic warning call. Since the sensor causes the warning call, the probability of the sensor being on increases; since burglary is a cause for triggering the sensor, the probability of burglary increases.
Then we learn there was a storm. Lightning may also trigger the sensor. Since lightning now also explains how the call happened, the probability of burglary decreases.
TERMINOLOGY
Bayes network =
belief network =
probabilistic network =
causal network
BAYES NETWORKS, DEFINITION
A Bayes net is a DAG (directed acyclic graph)
Nodes ~ random variables
Link X → Y intuitively means:
“X has direct influence on Y”
For each node: conditional probability table quantifying effects of parent nodes
MAJOR PROBLEM IN HANDLING UNCERTAINTY
In general, with uncertainty, the problem is the handling
of dependencies between events.
In principle, this can be handled by specifying the
complete probability distribution over all possible
combinations of variable values.
However, this is impractical or impossible: for n binary variables, 2^n - 1 probabilities are needed, which is too many!
Belief networks usually allow this number to be reduced in practice.
BURGLARY DOMAIN
Five events: B, L, S, A, C
Complete probability distribution:
p( B L S A C) = ...
p( ~B L S A C) = ...
p( ~B ~L S A C) = ...
p( ~B L ~S A C) = ...
... Total: 32 probabilities
WHY DID BELIEF NETS BECOME SO POPULAR?
If some things are mutually independent then not all conditional probabilities are needed.
p(XY) = p(X) p(Y|X), p(Y|X) needed
If X and Y independent:
p(XY) = p(X) p(Y), p(Y|X) not needed!
Belief networks provide an elegant way of stating independences
EXAMPLE FROM J. PEARL
Burglary → Alarm ← Earthquake
Alarm → John calls
Alarm → Mary calls
Burglary causes alarm
Earthquake causes alarm
When they hear the alarm, neighbours John and Mary phone
Occasionally John confuses the phone ringing for the alarm
Occasionally Mary fails to hear the alarm
PROBABILITIES
P(B) = 0.001, P(E) = 0.002
P(J | A) = 0.90
P(J | ~A) = 0.05
P(M | A) = 0.70
P(M | ~A) = 0.01
P(A | B E) = 0.95
P(A | B ~E) = 0.95
P(A | ~B E) = 0.29
P(A | ~B ~E) = 0.001
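As a minimal sketch of how these tables define a full distribution, the joint probability of any complete assignment can be computed as a product with one factor per node, each conditioned on its parents. The numbers are copied from the slide; the helper names `bernoulli` and `joint` are illustrative assumptions.

```python
from itertools import product

# Conditional probability tables copied from the slide (Pearl's alarm example).
P_B = 0.001
P_E = 0.002
P_A = {(True, True): 0.95, (True, False): 0.95,
       (False, True): 0.29, (False, False): 0.001}   # P(A | B, E)
P_J = {True: 0.90, False: 0.05}                       # P(J | A)
P_M = {True: 0.70, False: 0.01}                       # P(M | A)

def bernoulli(p_true, value):
    """Probability that a binary variable takes `value`, given P(true)."""
    return p_true if value else 1.0 - p_true

def joint(b, e, a, j, m):
    """P(B=b, E=e, A=a, J=j, M=m) as a product of one factor per node."""
    return (bernoulli(P_B, b) * bernoulli(P_E, e)
            * bernoulli(P_A[(b, e)], a)
            * bernoulli(P_J[a], j) * bernoulli(P_M[a], m))

# Both neighbours call and the alarm rang, but no burglary and no earthquake.
p_example = joint(b=False, e=False, a=True, j=True, m=True)
print(p_example)

# Sanity check: the 32 joint probabilities sum to 1.
total = sum(joint(*values) for values in product([True, False], repeat=5))
print(total)
```

The example assignment multiplies 0.999 * 0.998 * 0.001 * 0.90 * 0.70, so it comes out small even though both neighbours call.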
HOW ARE INDEPENDENCIES STATED IN BELIEF NETS
[Figure: network over nodes A, B, C, D, in which A and B influence D only through C]
If C is known to be true, then prob. of D independent of A, B
p( D | A B C) = p( D | C)
A1, A2, ..... non-descendants of C
B1 B2 ... parents of C
C
D1, D2, ... descendants of C
C is independent of C's non-descendants given C's parents
p( C | A1, ..., B1, ..., D1, ...) = p( C | B1, ..., D1, ...)
INDEPENDENCE ON NONDESCENDANTS REQUIRES CARE
EXAMPLE
[Figure: network with nodes a, b, c, d, e, f. Node b is the parent of c; d is a descendant of c; a, e and f are nondescendants of c.]
By applying rule about nondescendants:
p(c|ab) = p(c|b)
Because: c independent of c's nondesc. a given c's parents (node b)
INDEPENDENCE ON NONDESCENDANTS REQUIRES CARE
But, for this Bayesian network:
p(c|bdf) ≠ p(c|bd)
Although f is c's nondesc., it cannot be ignored:
knowing f, e becomes more likely;
e may also cause d, so when e becomes more likely, c becomes less likely.
Problem is that descendant d is given.
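The effect can be checked numerically with a small sketch. It assumes only the part of the network that the text describes (b → c → d ← e → f; node a is left out), and every probability below is a hypothetical value chosen for illustration, not taken from the slides.

```python
from itertools import product

# Hypothetical numbers, chosen only for illustration; they are not from the slides.
P_B = 0.5
P_C = {True: 0.8, False: 0.1}                      # p(c | b)
P_E = 0.1
P_F = {True: 0.9, False: 0.1}                      # p(f | e)
P_D = {(True, True): 0.95, (True, False): 0.8,
       (False, True): 0.8, (False, False): 0.05}   # p(d | c, e)

NAMES = ["b", "c", "e", "d", "f"]

def weight(p_true, value):
    """Probability that a binary variable takes `value`, given P(true)."""
    return p_true if value else 1.0 - p_true

def joint(w):
    """Joint probability of a complete assignment w (dict name -> bool)."""
    return (weight(P_B, w["b"]) * weight(P_C[w["b"]], w["c"])
            * weight(P_E, w["e"]) * weight(P_D[(w["c"], w["e"])], w["d"])
            * weight(P_F[w["e"]], w["f"]))

def prob(query, given):
    """p(query = true | every variable in `given` is true), by enumeration."""
    num = den = 0.0
    for values in product([True, False], repeat=len(NAMES)):
        w = dict(zip(NAMES, values))
        if all(w[g] for g in given):
            den += joint(w)
            if w[query]:
                num += joint(w)
    return num / den

# Without d in the condition, f can be ignored.
print(prob("c", ["b"]), prob("c", ["b", "f"]))
# Once descendant d is given, adding f lowers the probability of c.
print(prob("c", ["b", "d"]), prob("c", ["b", "d", "f"]))
```

With these assumed numbers the first pair of values agrees and the second pair does not, which is the situation the slide warns about.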
SAFER FORMULATION OF INDEPENDENCE
C is independent of C's nondescendants given
C's parents (only) and not C's descendants.
STATING PROBABILITIES IN BELIEF NETS
For each node X with parents Y1, Y2, ..., specify conditional probabilities of form:
p( X | Y1 Y2 ...) for all possible states of Y1, Y2, ...
Y1 → X ← Y2
Specify:
p( X | Y1, Y2)
p( X | ~Y1, Y2)
p( X | Y1, ~Y2)
p( X | ~Y1, ~Y2)
BURGLARY EXAMPLE
p(burglary) = 0.001
p(lightning) = 0.02
p(sensor | burglary lightning) = 0.9
p(sensor | burglary ~lightning) = 0.9
p(sensor | ~burglary lightning) = 0.1
p(sensor | ~burglary ~lightning) = 0.001
p(alarm | sensor) = 0.95
p(alarm | ~sensor) = 0.001
p(call | sensor) = 0.9
p(call | ~sensor) = 0.0
BURGLARY EXAMPLE
10 numbers plus structure of network
are equivalent to
2^5 - 1 = 31 numbers required to specify
complete probability distribution (without
structure information).
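The two counts can be reproduced with a short sketch. It assumes all variables are binary, so each node needs one number per combination of its parents' values; the function names are illustrative.

```python
# Parents of each node in the burglary network, as described on the slides.
parents = {
    "burglary": [],
    "lightning": [],
    "sensor": ["burglary", "lightning"],
    "alarm": ["sensor"],
    "call": ["sensor"],
}

def network_parameters(parents):
    """One probability per combination of parent values, for binary nodes."""
    return sum(2 ** len(ps) for ps in parents.values())

def full_joint_parameters(n):
    """Independent entries of a joint distribution over n binary variables."""
    return 2 ** n - 1

print(network_parameters(parents))           # 10
print(full_joint_parameters(len(parents)))   # 31
```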
EXAMPLE QUERIES FOR BELIEF NETWORKS
p( burglary | alarm) = ?
p( burglary lightning) = ?
p( burglary | alarm ~lightning) = ?
p( alarm ~call | burglary) = ?
Probabilistic reasoning in belief nets
Easy in forward direction, from ancestors to descendants, e.g.:
p( alarm | burglary lightning) = ?
In backward direction, from descendants to ancestors,
apply Bayes' formula
p( B | A) = p(B) * p(A | B) / p(A)
BAYES' FORMULA
p( X | Y) = p(X) * p(Y | X) / p(Y)
A variant of Bayes' formula to reason about probability of hypothesis H given evidence E in presence of background knowledge B:
p( H | E B) = p( H | B) * p( E | H B) / p( E | B)
REASONING RULES
1. Probability of conjunction:
p( X1 X2 | Cond) = p( X1 | Cond) * p( X2 | X1 Cond)
2. Probability of a certain event:
p( X | Y1 ... X ...) = 1
3. Probability of impossible event:
p( X | Y1 ... ~X ...) = 0
4. Probability of negation:
p( ~X | Cond) = 1 – p( X | Cond)
5. If condition involves a descendant of X then use Bayes' theorem:
If Cond0 = Y Cond where Y is a descendant of X in belief net
then p(X|Cond0) = p(X|Cond) * p(Y|X Cond) / p(Y|Cond)
6. Cases when condition Cond does not involve a descendant of X:
(a) If X has no parents then p(X|Cond) = p(X), p(X) given
(b) If X has parents Parents then
p(X | Cond) = Σ_S p(X | S) * p(S | Cond), where the sum is over all S in possible_states(Parents)
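As a rough worked instance of rules 5 and 6, the sketch below computes p(burglary | sensor) with the numbers from the burglary example. It relies on burglary and lightning being parentless and therefore independent of each other, so the probability of a parent state is a product of priors; the variable names are illustrative.

```python
# Numbers from the burglary example slides.
p_burglary = 0.001
p_lightning = 0.02
p_sensor_given = {  # p(sensor | burglary, lightning)
    (True, True): 0.9, (True, False): 0.9,
    (False, True): 0.1, (False, False): 0.001,
}

def weight(p_true, value):
    """Probability that a binary variable takes `value`, given P(true)."""
    return p_true if value else 1.0 - p_true

# Rule 6(b): sum over all states S of sensor's parents.
# Both parents are root nodes, so p(S) is a product of priors (rule 6(a)).
p_sensor = sum(
    p_sensor_given[(b, l)] * weight(p_burglary, b) * weight(p_lightning, l)
    for b in (True, False) for l in (True, False)
)

# p(sensor | burglary): the same sum with burglary fixed to true.
p_sensor_given_burglary = sum(
    p_sensor_given[(True, l)] * weight(p_lightning, l) for l in (True, False)
)

# Rule 5 (Bayes' theorem) with X = burglary, Y = sensor and an empty Cond.
p_burglary_given_sensor = p_burglary * p_sensor_given_burglary / p_sensor
print(p_sensor, p_burglary_given_sensor)
```

Because p(call | ~sensor) = 0.0 in this example, a call implies the sensor was on, so this value coincides with p(burglary | call).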
A SIMPLE IMPLEMENTATION IN PROLOG
In: I. Bratko, Prolog Programming for Artificial Intelligence, Third edition, Pearson Education 2001 (Chapter 15)
An interaction with this program:
?- prob( burglary, [call], P).
P = 0.232137
Now we learn there was a heavy storm, so:
?- prob( burglary, [call, lightning], P).
P = 0.00892857
Lightning explains call, so burglary seems less likely. However, if the weather was fine then burglary becomes more likely:
?- prob( burglary, [call,not lightning],P).
P = 0.473934
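The three answers above can be cross-checked with a short Python sketch. Note that this is not the rule-based Prolog program from the book: it swaps in brute-force enumeration of the full joint distribution, which is practical only because the network has five binary variables. The probabilities are those of the burglary example; the names `joint` and `prob` are illustrative.

```python
from itertools import product

VARS = ["burglary", "lightning", "sensor", "alarm", "call"]
PARENTS = {
    "burglary": [], "lightning": [],
    "sensor": ["burglary", "lightning"],
    "alarm": ["sensor"], "call": ["sensor"],
}
# p(node = true | parent values), keyed by the tuple of parent values.
CPT = {
    "burglary": {(): 0.001},
    "lightning": {(): 0.02},
    "sensor": {(True, True): 0.9, (True, False): 0.9,
               (False, True): 0.1, (False, False): 0.001},
    "alarm": {(True,): 0.95, (False,): 0.001},
    "call": {(True,): 0.9, (False,): 0.0},
}

def joint(world):
    """Probability of one complete assignment (dict variable -> bool)."""
    p = 1.0
    for var in VARS:
        key = tuple(world[par] for par in PARENTS[var])
        p_true = CPT[var][key]
        p *= p_true if world[var] else 1.0 - p_true
    return p

def prob(query, evidence):
    """p(query = true | evidence), where evidence is a dict variable -> bool."""
    num = den = 0.0
    for values in product([True, False], repeat=len(VARS)):
        world = dict(zip(VARS, values))
        if any(world[v] != val for v, val in evidence.items()):
            continue
        p = joint(world)
        den += p
        if world[query]:
            num += p
    return num / den

print(prob("burglary", {"call": True}))
print(prob("burglary", {"call": True, "lightning": True}))
print(prob("burglary", {"call": True, "lightning": False}))
```

The printed values agree with the Prolog answers to the displayed precision.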
COMMENTS
In the worst case, the complexity of reasoning in belief networks grows exponentially with the number of nodes.
Substantial algorithmic improvements are required to reason efficiently in large networks.
d-SEPARATION
Follows from basic independence assumption of Bayes
networks
d-separation = direction-dependent separation
Let E = set of “evidence nodes” (subset of variables in
Bayes network)
Let Vi, Vj be two variables in the network
d-SEPARATION
Nodes Vi and Vj are conditionally independent given set E if E
d-separates Vi and Vj
E d-separates Vi, Vj if all (undirected) paths (Vi,Vj) are “blocked”
by E
If E d-separates Vi, Vj, then Vi and Vj are conditionally
independent, given E
We write I(Vi,Vj | E)
This means: p(Vi,Vj | E) = p(Vi | E) * p(Vj | E)
BLOCKING A PATH
A path between Vi and Vj is blocked by nodes E if there is a “blocking node” Vb on the path. Vb blocks the path if one of the following holds:
1. Vb in E and both arcs on path lead out of Vb, or
2. Vb in E and one arc on path leads into Vb and one out, or
3. neither Vb nor any descendant of Vb is in E, and both arcs on path lead into Vb
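The three blocking conditions can be sketched as a check on a single path. The representation is an assumption made for illustration: the network is a set of (parent, child) arcs, the path is a list of node names, and the evidence is a set of node names. A full d-separation test would apply this check to every undirected path between Vi and Vj.

```python
def descendants(node, children):
    """All nodes reachable from `node` by following arcs forward."""
    seen, stack = set(), list(children.get(node, []))
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(children.get(n, []))
    return seen

def path_blocked(path, edges, evidence):
    """True if some interior node of `path` blocks it, given `evidence`."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
    for prev, vb, nxt in zip(path, path[1:], path[2:]):
        if (prev, vb) in edges and (nxt, vb) in edges:
            # Condition 3: both arcs lead into Vb; it blocks only if
            # neither Vb nor any of its descendants is in the evidence.
            if vb not in evidence and not (descendants(vb, children) & evidence):
                return True
        elif vb in evidence:
            # Conditions 1 and 2: Vb is a common cause or a link in a chain.
            return True
    return False

# Burglary network from the earlier slides.
edges = {("burglary", "sensor"), ("lightning", "sensor"),
         ("sensor", "alarm"), ("sensor", "call")}

# No evidence: sensor is an unobserved common consequence, so the path is blocked.
print(path_blocked(["burglary", "sensor", "lightning"], edges, set()))      # True
# Observing call (a descendant of sensor) unblocks the path.
print(path_blocked(["burglary", "sensor", "lightning"], edges, {"call"}))   # False
# Observing the common cause sensor blocks the path between alarm and call.
print(path_blocked(["alarm", "sensor", "call"], edges, {"sensor"}))         # True
```

The second case is the situation in the burglary story: once the call is known, burglary and lightning are no longer independent, which is why learning about the storm changes the probability of burglary.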
CONDITION 1
Vb is a common cause:
Vi ← Vb → Vj
CONDITION 2
Vb is a “closer, more direct cause” of Vj than Vi is
Vi → Vb → Vj
CONDITION 3
Vb is a common consequence of Vi, Vj, and neither Vb nor any descendant Vd of Vb is in E
Vi → Vb ← Vj
Vb → Vd
Vb not in E, Vd not in E