On the Power of Belief Propagation: A Constraint Propagation Perspective
Rina Dechter
Bozhena Bidyuk
Robert Mateescu
Emma Rollon
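Distributed Belief Propagation
How many people?
[Figure: five people in a line. Counts 1, 2, 3, 4 are passed along the line in one direction and 4, 3, 2, 1 in the other, so every person arrives at the total 5.]
[Figure: the two message directions are labeled causal support and diagnostic support.]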
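The counting example can be sketched in a few lines. This is only an illustration of the two message passes on a chain; the function name `chain_counts` and the list names are made up for the sketch and do not come from the slides.

```python
def chain_counts(n):
    """Two message passes on a chain of n nodes; every node learns n."""
    # from_left[i]: count received by node i from its left neighbor (0 at the left end)
    from_left = [0] * n
    for i in range(1, n):
        from_left[i] = from_left[i - 1] + 1
    # from_right[i]: count received by node i from its right neighbor (0 at the right end)
    from_right = [0] * n
    for i in range(n - 2, -1, -1):
        from_right[i] = from_right[i + 1] + 1
    # Each node adds itself to the two counts it received
    totals = [from_left[i] + 1 + from_right[i] for i in range(n)]
    return from_left, from_right, totals


from_left, from_right, totals = chain_counts(5)
print(from_left[1:])    # messages passed rightward: [1, 2, 3, 4]
print(from_right[:-1])  # messages passed leftward: [4, 3, 2, 1]
print(totals)           # [5, 5, 5, 5, 5]
```

Each node uses only what its neighbors send, which is the sense in which the computation is distributed.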
Belief Propagation in Polytrees
BP on Loopy Graphs
Pearl (1988): use of BP on loopy networks
McEliece et al. (1988): IBP’s success on coding networks
Lots of research into convergence … and accuracy (?), but:
Why does IBP work well for coding networks?
Can we characterize other good problem classes?
Can we have any guarantees on accuracy (even if it converges)?
[Figure: network fragment over nodes U1, U2, U3, X1, X2, with messages on its edges indexed X1(u1), U2(x1), X2(u1), U3(x1).]
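Arc-consistency
Sound
Incomplete
Always converges (polynomial)
[Figure: constraint graph over variables A, B, C, D, each with domain {1, 2, 3}, and constraints A < B, A < D, D < C, B = C. Tuples shown per constraint:]
A < B: (1, 2), (2, 3)
A < D: (1, 2), (2, 3)
D < C: (1, 2), (2, 3)
B = C: (1, 1), (2, 2), (3, 3)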
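A minimal sketch of arc-consistency on this example follows. It assumes the constraints are the ordinary predicates `<` and `=` over the domain {1, 2, 3}, not the tuple lists printed next to the figure; the helper names `revise` and `arc_consistency` are illustrative.

```python
def revise(domains, x, y, allowed):
    """Remove values of x that have no supporting value in the domain of y."""
    keep = {a for a in domains[x] if any(allowed(a, b) for b in domains[y])}
    changed = keep != domains[x]
    domains[x] = keep
    return changed


def arc_consistency(domains, constraints):
    """constraints maps (x, y) to a predicate on (value of x, value of y)."""
    arcs = dict(constraints)
    # Add the reverse direction of every constraint
    for (x, y), pred in constraints.items():
        arcs[(y, x)] = lambda b, a, pred=pred: pred(a, b)
    changed = True
    while changed:  # repeat until no domain shrinks (fixed point)
        changed = False
        for (x, y), pred in arcs.items():
            if revise(domains, x, y, pred):
                changed = True
    return domains


domains = {v: {1, 2, 3} for v in "ABCD"}
constraints = {
    ("A", "B"): lambda a, b: a < b,
    ("A", "D"): lambda a, d: a < d,
    ("D", "C"): lambda d, c: d < c,
    ("B", "C"): lambda b, c: b == c,
}
result = arc_consistency(domains, constraints)
print(result)  # {'A': {1}, 'B': {3}, 'C': {3}, 'D': {2}}
```

Every pass can only delete values, and there are finitely many values to delete, which is why the procedure always converges in polynomial time.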
Flattening the Bayesian Network
[Figure: join-graph with clusters 1: A, 2: AB, 3: AC, 4: ABD, 5: BCF, 6: DFG; edges are labeled with the shared variables A, B, C, D, F.]
Belief network
A P(A)
1 .2
2 .5
3 .3
… 0
A B P(B|A)
1 2 .3
1 3 .7
2 1 .4
2 3 .6
3 1 .1
3 2 .9
… … 0
A C P(C|A)
1 2 1
3 2 1
… … 0
A B D P(D|A,B)
1 2 3 1
1 3 2 1
2 1 3 1
2 3 1 1
3 1 2 1
3 2 1 1
… … … 0
B C F P(F|B,C)
1 2 3 1
3 2 1 1
… … … 0
D F G P(G|D,F)
1 2 3 1
2 1 3 1
… … … 0
Flat constraint network
A
1
2
3
A B
1 2
1 3
2 1
2 3
3 1
3 2
A C
1 2
3 2
A B D
1 2 3
1 3 2
2 1 3
2 3 1
3 1 2
3 2 1
B C F
1 2 3
3 2 1
D F G
1 2 3
2 1 3
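Flattening keeps exactly the tuples with nonzero probability. A minimal sketch for P(B|A), assuming the table is stored as a dictionary from (A, B) tuples to probabilities; the name `flatten` is illustrative, and the explicit (1, 1) entry stands in for the "… 0" row.

```python
def flatten(cpt):
    """Return the relation of tuples whose probability is strictly positive."""
    return {assignment for assignment, p in cpt.items() if p > 0}


p_b_given_a = {
    (1, 2): 0.3, (1, 3): 0.7,
    (2, 1): 0.4, (2, 3): 0.6,
    (3, 1): 0.1, (3, 2): 0.9,
    (1, 1): 0.0,  # one example of a zero-probability tuple ("… 0" row)
}
r_ab = flatten(p_b_given_a)
print(sorted(r_ab))  # [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)]
```

The numeric values are discarded; only the zero / nonzero pattern survives in the flat network.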
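Belief Zero Propagation = Arc-Consistency
[Figure: join-graph with clusters 1: A, 2: AB, 3: AC, 4: ABD, 5: BCF, 6: DFG; edges are labeled with the shared variables A, B, C, D, F.]
Belief propagation message from cluster i to cluster j:
$h_i^j = \sum_{elim(i,j)} \Big( p_i \cdot \prod_{k \in ne_j(i)} h_k^i \Big)$
Arc-consistency message from cluster i to cluster j on the flat network:
$h_i^j = \pi_{l_{ij}} \Big( R_i \bowtie \big( \bowtie_{k \in ne_j(i)} h_k^i \big) \Big)$
Example at cluster 2 (AB), with incoming messages $h_1^2$, $h_4^2$, $h_5^2$.
A B P(B|A)
1 2 >0
1 3 >0
2 1 >0
2 3 >0
3 1 >0
3 2 >0
… … 0
A B
1 2
1 3
2 1
2 3
3 1
3 2
Incoming messages, probabilistic form:
A h_1^2(A)
1 >0
2 >0
3 >0
… 0
B h_4^2(B)
1 >0
2 >0
3 >0
… 0
B h_5^2(B)
1 >0
3 >0
… 0
Incoming messages, relational form:
h_1^2: A
1
2
3
h_4^2: B
1
2
3
h_5^2: B
1
3
Updated belief:
$Bel(A,B) = P(B|A) \cdot h_1^2 \cdot h_4^2 \cdot h_5^2$
A B Bel(A,B)
1 3 >0
2 1 >0
2 3 >0
3 1 >0
… … 0
Updated relation:
$R(A,B) = R(A,B) \bowtie h_1^2 \bowtie h_4^2 \bowtie h_5^2$
A B
1 3
2 1
2 3
3 1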
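The correspondence at cluster AB can be checked numerically. In this sketch only the zero pattern of the three incoming messages is taken from the slide; the value 1.0 is a placeholder for any positive number, and the names `h_1_2`, `h_4_2`, `h_5_2`, `belief`, and `relation_update` are illustrative.

```python
p_b_given_a = {(1, 2): 0.3, (1, 3): 0.7, (2, 1): 0.4,
               (2, 3): 0.6, (3, 1): 0.1, (3, 2): 0.9}

# Incoming messages; missing keys mean zero
h_1_2 = {1: 1.0, 2: 1.0, 3: 1.0}  # over A
h_4_2 = {1: 1.0, 2: 1.0, 3: 1.0}  # over B
h_5_2 = {1: 1.0, 3: 1.0}          # over B; B = 2 is ruled out


def belief(p, h_a, h_b1, h_b2):
    """Product of the local function and the incoming messages (unnormalized)."""
    return {(a, b): pr * h_a.get(a, 0.0) * h_b1.get(b, 0.0) * h_b2.get(b, 0.0)
            for (a, b), pr in p.items()}


def relation_update(r, s_a, s_b1, s_b2):
    """Join of the relation with the three unary message relations."""
    return {(a, b) for (a, b) in r if a in s_a and b in s_b1 and b in s_b2}


bel = belief(p_b_given_a, h_1_2, h_4_2, h_5_2)
support = {t for t, v in bel.items() if v > 0}
relation = relation_update(set(p_b_given_a), set(h_1_2), set(h_4_2), set(h_5_2))
print(sorted(support))      # [(1, 3), (2, 1), (2, 3), (3, 1)]
print(support == relation)  # True
```

A product is zero exactly when one of its factors is zero, so the nonzero entries of the updated belief coincide with the tuples of the updated relation.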
Flat Network - Example
[Figure: join-graph over clusters A, AB, AC, ABD, BCF, DFG, with the six tables labeled R1 to R6.]
A P(A)
1 .2
2 .5
3 .3
… 0
A C P(C|A)
1 2 1
3 2 1
… … 0
A B P(B|A)
1 2 .3
1 3 .7
2 1 .4
2 3 .6
3 1 .1
3 2 .9
… … 0
B C F P(F|B,C)
1 2 3 1
3 2 1 1
… … … 0
A B D P(D|A,B)
1 2 3 1
1 3 2 1
2 1 3 1
2 3 1 1
3 1 2 1
3 2 1 1
… … … 0
D F G P(G|D,F)
1 2 3 1
2 1 3 1
… … … 0
IBP Example – Iteration 1
[Figure: join-graph over clusters A, AB, AC, ABD, BCF, DFG, with the six tables labeled R1 to R6.]
A P(A)
1 >0
3 >0
… 0
A C P(C|A)
1 2 1
3 2 1
… … 0
A B P(B|A)
1 3 1
2 1 >0
2 3 >0
3 1 1
… … 0
B C F P(F|B,C)
1 2 3 1
3 2 1 1
… … … 0
A B D P(D|A,B)
1 3 2 1
2 3 1 1
3 1 2 1
3 2 1 1
… … … 0
D F G P(G|D,F)
2 1 3 1
… … … 0
IBP Example – Iteration 2
[Figure: join-graph over clusters A, AB, AC, ABD, BCF, DFG, with the six tables labeled R1 to R6.]
A P(A)
1 >0
3 >0
… 0
A C P(C|A)
1 2 1
3 2 1
… … 0
A B P(B|A)
1 3 1
3 1 1
… … 0
B C F P(F|B,C)
3 2 1 1
… … … 0
A B D P(D|A,B)
1 3 2 1
3 1 2 1
… … … 0
D F G P(G|D,F)
2 1 3 1
… … … 0
IBP Example – Iteration 3
[Figure: join-graph over clusters A, AB, AC, ABD, BCF, DFG, with the six tables labeled R1 to R6.]
A P(A)
1 >0
3 >0
… 0
A C P(C|A)
1 2 1
3 2 1
… … 0
A B P(B|A)
1 3 1
… … 0
B C F P(F|B,C)
3 2 1 1
… … … 0
A B D P(D|A,B)
1 3 2 1
3 1 2 1
… … … 0
D F G P(G|D,F)
2 1 3 1
… … … 0
IBP Example – Iteration 4
[Figure: join-graph over clusters A, AB, AC, ABD, BCF, DFG, with the six tables labeled R1 to R6.]
A P(A)
1 1
… 0
A C P(C|A)
1 2 1
3 2 1
… … 0
A B P(B|A)
1 3 1
… … 0
B C F P(F|B,C)
3 2 1 1
… … … 0
A B D P(D|A,B)
1 3 2 1
… … … 0
D F G P(G|D,F)
2 1 3 1
… … … 0
IBP Example – Iteration 5
[Figure: join-graph over clusters A, AB, AC, ABD, BCF, DFG, with the six tables labeled R1 to R6.]
A P(A)
1 1
… 0
A C P(C|A)
1 2 1
… … 0
A B P(B|A)
1 3 1
… … 0
B C F P(F|B,C)
3 2 1 1
… … … 0
A B D P(D|A,B)
1 3 2 1
… … … 0
D F G P(G|D,F)
2 1 3 1
… … … 0
A B C D F G Belief
1 3 2 2 1 3 1
… … … … … … 0
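The whole example can be replayed on the flat network. This sketch simplifies the schedule: instead of sending messages edge by edge along the join-graph, it intersects the projections of every shared variable in each round, assuming all separators are single variables. The names `propagate` and `flat`, and the assignment of the labels R1 to R6 to particular tables, are assumptions of the sketch.

```python
def propagate(relations):
    """relations: name -> (scope, set of tuples). Prune until nothing changes."""
    rels = {name: (scope, set(tuples)) for name, (scope, tuples) in relations.items()}
    changed = True
    while changed:
        changed = False
        # Domain of a variable: intersection of its projections from all relations
        domains = {}
        for scope, tuples in rels.values():
            for pos, var in enumerate(scope):
                proj = {t[pos] for t in tuples}
                domains[var] = domains[var] & proj if var in domains else proj
        # Drop every tuple that uses a value outside the current domains
        for name, (scope, tuples) in list(rels.items()):
            kept = {t for t in tuples
                    if all(t[i] in domains[v] for i, v in enumerate(scope))}
            if kept != tuples:
                rels[name] = (scope, kept)
                changed = True
    return rels


flat = {
    "R1": (("A",), {(1,), (2,), (3,)}),
    "R2": (("A", "B"), {(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)}),
    "R3": (("A", "C"), {(1, 2), (3, 2)}),
    "R4": (("A", "B", "D"), {(1, 2, 3), (1, 3, 2), (2, 1, 3),
                             (2, 3, 1), (3, 1, 2), (3, 2, 1)}),
    "R5": (("B", "C", "F"), {(1, 2, 3), (3, 2, 1)}),
    "R6": (("D", "F", "G"), {(1, 2, 3), (2, 1, 3)}),
}
pruned = propagate(flat)
for name, (scope, tuples) in pruned.items():
    print(name, scope, sorted(tuples))
```

The surviving tuples are A=1, B=3, C=2, D=2, F=1, G=3, the same single assignment that remains in the tables of the last iteration.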
IBP – Inference Power for Zero Beliefs
Theorem: Iterative BP performs arc-consistency on the flat network.
Soundness:
Inference of zero beliefs by IBP converges
All the inferred zero beliefs are correct
Incompleteness:
Iterative BP is as weak and as strong as arc-consistency
Continuity Hypothesis: IBP is sound for zero beliefs, so is IBP also accurate for extreme beliefs? Tested empirically.
Experimental Results
We investigated empirically if the results for zero beliefs extend to ε-small beliefs (ε > 0)
Algorithms:
IBP
IJGP
Measures:
Exact/IJGP histogram
Recall absolute error
Precision absolute error
Network types:
Coding
Linkage analysis*
Grids*
Two-layer noisy-OR*
CPCS54, CPCS360
* Instances from the UAI08 competition
[Figure label: network types are grouped by "Have determinism?" into YES and NO.]
Networks with Determinism: Coding
N=200, 1000 instances, w*=15
Nets w/o Determinism: bn2o
[Figure: three panels, w* = 24, w* = 27, w* = 26.]
Nets with Determinism: Linkage
[Figure: panels for pedigree1 (w* = 21) and pedigree37 (w* = 30), with i-bound = 3 and i-bound = 7. Axes: Percentage and Absolute Error. Series: Exact Histogram, IJGP Histogram, Recall Abs. Error, Precision Abs. Error.]
The Cutset Phenomenon & Irrelevant Nodes
Observed variables break the flow of inference
IBP is exact when evidence variables form a cycle-cutset
Unobserved variables without observed descendants send zero information to their parent variables; they are irrelevant
In a network without evidence, IBP converges in one iteration, top-down
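The claim about unobserved variables can be checked in Pearl's message notation. As a sketch, assume X is an unobserved node with a single parent U and no observed descendants, so that the diagnostic support collected at X is λ(x) = 1 for every x. The symbols λ_X(u) and λ(x) are notation assumed here, not taken from the slides.

```latex
\lambda_X(u) \;=\; \sum_{x} P(x \mid u)\,\lambda(x)
            \;=\; \sum_{x} P(x \mid u)
            \;=\; 1 \qquad \text{for every } u .
```

The message is the same for every value of u, so after normalization it leaves the parent's belief unchanged.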
Nodes with extreme support
Observed variables with extreme priors or extreme support can nearly cut information flow:
[Figure: "Average Error vs. Priors". x-axis: prior (0.00001, 0.01, 0.2, 0.5, 0.8, 0.99, 0.99999). y-axis: average error, ticks from 0 to 0.0002. Series: Root, B1, C1, B2, C2, Sink. Inset network over nodes A, B1, B2, C1, C2, D, ED, EB, EC.]
Conclusion: For Networks with Determinism
IBP converges and is sound for zero beliefs
IBP’s power to infer zeros is as weak or as strong as arc-consistency
However: inference of extreme beliefs can be wrong.
Cutset property (Bidyuk and Dechter, 2000):
Evidence and inferred singletons act like a cutset
If zeros form a cycle-cutset, all beliefs are exact
Extensions to epsilon-cutset were supported empirically.