Variable and Value Ordering for MPE Search
Sajjad Siddiqi and Jinbo Huang
Most Probable Explanation (MPE)

N: a Bayesian network; X: {X, Y, Z}; e: {X=x}

[Figure: network with Y as the parent of X and Z, alongside the joint distribution table Pr(X, Y, Z) over the eight instantiations of X, Y, Z, with probabilities 0.05, 0.3, 0.05, 0.1, 0.1, 0.2, 0.1, 0.1.]

The MPE is the most likely instantiation of X consistent with the evidence e: here, the row with the maximum probability (0.3) among those compatible with X=x.
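The definition above can be sketched by brute force: enumerate every instantiation consistent with the evidence and keep the most probable one. The table below reuses the slide's eight probabilities, but the mapping of rows to value combinations is hypothetical (the original row labels were garbled).

```python
from itertools import product

# Hypothetical joint distribution over three binary variables X, Y, Z,
# reusing the slide's eight probabilities; the assignment of rows to
# value combinations is illustrative only.
probs = [0.05, 0.3, 0.05, 0.1, 0.1, 0.2, 0.1, 0.1]
joint = dict(zip(product([0, 1], repeat=3), probs))  # keys are (x, y, z)

def mpe(evidence):
    """Most probable instantiation consistent with `evidence`,
    a dict mapping positions {0: X, 1: Y, 2: Z} to values."""
    best, best_p = None, -1.0
    for inst, p in joint.items():
        if all(inst[i] == v for i, v in evidence.items()) and p > best_p:
            best, best_p = inst, p
    return best, best_p

inst, p = mpe({0: 0})  # evidence: X = 0
```

This is exponential in the number of variables, which is exactly what the inference and search methods on the following slides try to avoid.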
Exact MPE by Inference

Variable Elimination (Bucket Elimination)
– Exponential in the treewidth of the elimination order.
Compilation to Decomposable Negation Normal Form (DNNF)
– Exploits local structure, so treewidth is not necessarily the limiting factor.
Both methods can run out of time or memory.
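As a sketch of how max-product variable (bucket) elimination computes the MPE probability: represent each CPT as a (scope, table) factor, and eliminate a variable by multiplying the factors in its bucket and maximizing the variable out. The function names and the toy network (Y with children X and Z, echoing the slide's example) are illustrative, not the authors' implementation.

```python
from itertools import product

def multiply(f, g, domains):
    """Pointwise product of two factors, each a (scope, table) pair."""
    scope = f[0] + tuple(v for v in g[0] if v not in f[0])
    table = {}
    for vals in product(*(domains[v] for v in scope)):
        a = dict(zip(scope, vals))
        table[vals] = (f[1][tuple(a[v] for v in f[0])] *
                       g[1][tuple(a[v] for v in g[0])])
    return scope, table

def max_out(f, var):
    """Maximize a variable out of a factor (the 'max' elimination step)."""
    scope = tuple(v for v in f[0] if v != var)
    table = {}
    for vals, p in f[1].items():
        key = tuple(x for v, x in zip(f[0], vals) if v != var)
        table[key] = max(table.get(key, 0.0), p)
    return scope, table

def mpe_prob(factors, order, domains):
    """MPE probability by max-product bucket elimination.
    Assumes every variable in `order` appears in some factor."""
    factors = list(factors)
    for var in order:
        bucket = [f for f in factors if var in f[0]]
        factors = [f for f in factors if var not in f[0]]
        combined = bucket[0]
        for f in bucket[1:]:
            combined = multiply(combined, f, domains)
        factors.append(max_out(combined, var))
    result = 1.0
    for _, table in factors:   # only constant factors remain
        result *= table[()]
    return result

# Toy network: Y is the parent of X and Z.
domains = {'X': [0, 1], 'Y': [0, 1], 'Z': [0, 1]}
factors = [
    (('Y',), {(0,): 0.6, (1,): 0.4}),                      # Pr(Y)
    (('X', 'Y'), {(0, 0): 0.8, (1, 0): 0.2,
                  (0, 1): 0.3, (1, 1): 0.7}),              # Pr(X | Y)
    (('Z', 'Y'), {(0, 0): 0.5, (1, 0): 0.5,
                  (0, 1): 0.9, (1, 1): 0.1}),              # Pr(Z | Y)
]
mpe_p = mpe_prob(factors, ['X', 'Z', 'Y'], domains)  # 0.4 * 0.7 * 0.9
```

The intermediate factor created when eliminating a variable ranges over all its bucket's neighbors, which is why the cost is exponential in the treewidth of the order.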
Exact MPE by Searching

[Figure: depth-first search tree branching on X, then Y, then Z; each leaf is a complete instantiation with its probability (0.1, 0.2, 0.1, 0.1).]
Exact MPE by Searching

Depth-First Search
– Exponential in the number of variables X.
Depth-First Branch-and-Bound Search
– Computes an upper bound on any extension to the current assignment.
– Backtracks when the upper bound <= the current solution.
– Reduces the complexity of the search.
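The branch-and-bound loop just described can be sketched as follows. The bound function here is a hypothetical stand-in (a constant 1.0, which never prunes); in practice it would come from a relaxation such as mini-buckets or node splitting, introduced on later slides.

```python
def bnb(variables, domains, joint, bound_fn, assignment, best):
    """Depth-first branch-and-bound; returns (best_prob, best_assignment).
    `joint` scores a complete assignment; `bound_fn` upper-bounds the
    best completion of a partial one."""
    if len(assignment) == len(variables):
        p = joint(assignment)
        return (p, dict(assignment)) if p > best[0] else best
    if bound_fn(assignment) <= best[0]:   # bound cannot beat current solution
        return best                       # -> backtrack (prune)
    var = variables[len(assignment)]
    for val in domains[var]:
        assignment[var] = val
        best = bnb(variables, domains, joint, bound_fn, assignment, best)
        del assignment[var]
    return best

# Tiny illustration with a hypothetical 2-variable table.
table = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
best = bnb(['X', 'Y'], {'X': [0, 1], 'Y': [0, 1]},
           lambda a: table[(a['X'], a['Y'])],
           lambda a: 1.0,                # trivial (never-pruning) bound
           {}, (0.0, None))
```

The tighter `bound_fn` is, the earlier branches fall below the current solution and get pruned.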
Exact MPE by B-n-B Search

[Figure: the same search tree; once the leaf with probability 0.2 is found (current solution = 0.2), any branch whose upper bound <= 0.2 is pruned.]
Computing Bounds: Mini-Buckets

Splits a bucket into two or more mini-buckets, ignoring certain dependencies among variables:
– The new network is easier to solve.
– The solution can only grow in one direction (an upper bound).
– Focuses on generating tighter bounds.
Mini-buckets is a special case of node splitting.
Node Splitting

[Figure: a network N containing nodes X, Y, Z, Q, R is relaxed by splitting into a network N', in which Y is replaced by clones Ŷ1 and Ŷ2 (fully split) and Q by clones Q̂1 and Q̂2; split variables = {Q, Y}.]
Node SplittingNode Splitting
e: an instantiation of variables X in N.e: a compatible assignment to their
clones in N´e.g. if e = {Y=y}, then e = {Y1=y,
Y2=y}
thenMPEp (N, e) <= MPEp (N´, e, e)
= total number of instantiations of clone variables
^̂ ^̂
Computing Bounds: Node Splitting (Choi et al., 2007)

The split network is easier to solve; its MPE computes the bound.
Search is performed only over the 'split variables' instead of all variables.
Focuses on good network relaxations, trying to reduce the number of splits.
B-n-B Search for MPE

[Figure: search tree over the split variables Y and Q. The root bound is MPE(N', {X=x}); branching on Y=y gives the bound MPE(N', {X=x, Y=y, Ŷ1=y, Ŷ2=y}); a leaf gives the exact solution MPE(N', {X=x, Y=y, Ŷ1=y, Ŷ2=y, Q=q, Q̂1=q, Q̂2=q}). Search space = 4 for two split variables with binary domains.]
B-n-B Search for MPE

Leaves of the search tree give candidate MPE solutions.
Elsewhere we get upper bounds used to prune the search.
A branch is pruned if its bound <= the current solution.
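A sketch of the kind of upper bound such a relaxation yields: maximize each factor independently over its unassigned variables and multiply the maxima. This mimics the effect of splitting, in that dependencies between factors (shared unassigned variables) are ignored; the function name and toy factors are hypothetical.

```python
from itertools import product

def relaxed_bound(factors, domains, assignment):
    """Upper bound on the best completion of `assignment`:
    each (scope, table) factor is maximized independently, so
    dependencies between factors are ignored."""
    bound = 1.0
    for scope, table in factors:
        best = 0.0
        for vals in product(*(domains[v] for v in scope)):
            consistent = all(assignment.get(v, x) == x
                             for v, x in zip(scope, vals))
            if consistent:
                best = max(best, table[vals])
        bound *= best
    return bound

domains = {'X': [0, 1], 'Y': [0, 1], 'Z': [0, 1]}
factors = [(('X', 'Y'), {(0, 0): 0.8, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.9}),
           (('Y', 'Z'), {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.4, (1, 1): 0.6})]

# Exact best over complete assignments vs. the relaxed bound.
exact = max(factors[0][1][(x, y)] * factors[1][1][(y, z)]
            for x in [0, 1] for y in [0, 1] for z in [0, 1])
bound = relaxed_bound(factors, domains, {})  # >= exact, by construction
```

Assigning a shared variable (here Y) before calling the bound restores the lost dependency for that variable, which is why searching over the split variables and bounding the rest is sound.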
Choice of Variables to Split

Reduce the number of split variables:
– Heuristic based on the reduction in the size of jointree cliques and separators.
Split enough variables to reduce the treewidth to a certain threshold (at which the network becomes easy to solve).
Variable and Value Ordering

Reduce the search space using efficient variable and value ordering.
Choi et al. (2007) do not address this and use a neutral heuristic.
Several heuristics are analyzed and their strengths combined to produce an effective heuristic.
This scales up the technique.
Entropy-based Ordering

Computation: for each split variable, compute the probabilities of its values, e.g. Pr(Y=y), Pr(Y=ȳ), Pr(Q=q), Pr(Q=q̄), and from them entropy(Y) and entropy(Q).
Do the same for the clones, and take average probabilities:
P̂r(Y=y) = [Pr(Y=y) + Pr(Ŷ1=y) + Pr(Ŷ2=y)] / 3
Entropy-based Ordering

Favor those instantiations that are more likely to be part of MPEs:
– Prefer Y over Q if entropy(Y) < entropy(Q).
– Prefer Y=y over Y=ȳ if Pr(Y=y) > Pr(Y=ȳ).
Static and dynamic versions.
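A minimal sketch of the static version of this heuristic, assuming the marginals have already been computed (e.g. from the compiled DNNF): order variables by increasing entropy, and within a variable try the most probable value first. The marginals below are hypothetical.

```python
import math

def entropy(dist):
    """Shannon entropy (bits) of a marginal distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

# Hypothetical averaged marginals for two split variables.
marginals = {'Y': [0.9, 0.1], 'Q': [0.5, 0.5]}

# Variable order: lower entropy first (a more skewed marginal is
# more informative about the likely MPE value).
var_order = sorted(marginals, key=lambda v: entropy(marginals[v]))

# Value order within each variable: most probable value first.
val_order = {v: sorted(range(len(m)), key=lambda i: -m[i])
             for v, m in marginals.items()}
```

In the static version this ordering is computed once before the search; the dynamic version would recompute the marginals at every search node.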
Entropy-based Ordering

Probabilities computed using DNNF:
– Evaluation and differentiation of the arithmetic circuit (AC).
Experiments:
– The static heuristic is significantly faster than the neutral one.
– The dynamic heuristic is generally too expensive to compute, and slower.
Nogood Learning

g = {X=x, Y=y, Z=z} is a nogood if MPEp(N', g, ĝ) <= current solution.

[Figure: search path X=x, Y=y, Z=z with bounds 1.5, 1.3, 1.2, 0.5 and current solution = 1.0.]

Let g' = g \ {Y=y}. If MPEp(N', g', ĝ') <= current solution, then g' is also a nogood and replaces g (the nogood is shrunk).
Nogood-based Ordering

Scores:
– S(X=x) = number of occurrences of X=x in nogoods.
– S(X) = [S(X=x) + S(X=x̄)] / 2 (for binary variables).
Dynamic ordering: prefer higher scores.
Impractical: the overhead of repeated bound computation during learning.
Score-based OrderingScore-based Ordering
A more effective approach based on nogoods.Scores of variables/values tell how can a nogood
be obtained quickly (backtrack early).
XX
YY
ZZ
bound=1.5bound=1.5
bound=1.3bound=1.3
bound=1.2bound=1.2
bound=0.5bound=0.5
yy
xx
zz
S(X=x) += 1.5-1.3=0.2S(X=x) += 1.5-1.3=0.2
S(Y=y) += 1.3-1.2=0.1S(Y=y) += 1.3-1.2=0.1
S(Z=z) += 1.2-0.5=0.7S(Z=z) += 1.2-0.5=0.7
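The score bookkeeping can be sketched as below: on each backtrack, every assignment on the pruned branch is credited with the drop in the bound it caused, so assignments that tighten the bound fastest float to the top of the ordering. The data structures and function name are hypothetical.

```python
from collections import defaultdict

scores = defaultdict(float)

def record_backtrack(branch):
    """`branch` lists (var, val, bound_before, bound_after) for each
    assignment on a pruned branch; credit each with its bound drop."""
    for var, val, before, after in branch:
        scores[(var, val)] += before - after

# The pruned branch from the slide's figure.
record_backtrack([('X', 'x', 1.5, 1.3),
                  ('Y', 'y', 1.3, 1.2),
                  ('Z', 'z', 1.2, 0.5)])

# Dynamic ordering then prefers the highest-scoring assignment.
best_assignment = max(scores, key=scores.get)  # ('Z', 'z')
```

Unlike nogood learning, this needs no extra bound computations: the bounds are already produced by the search itself.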
Improved Heuristic

Periodically reinitialize scores (focus on the recent past).
Use the static entropy-based order as the initial order of variables/values.
Experimental Setup

Intel Core Duo 2.4 GHz and AMD Athlon 64 X2 Dual Core 4600+ machines, both with 4 GB of RAM, running Linux.
A memory limit of 1 GB on each MPE query.
C2D DNNF compiler [Darwiche, 2004; 2005].
A trivial seed of 0 as the initial MPE solution to start the search.
Network variables are split until the treewidth <= 10.
Comparing search spaces on grid networks

Comparing search time on grid networks
Comparing nogood learning and score-based DVO on grid networks
Results on grid networks, 25 queries per network

Random networks, 20 queries per network

Networks for genetic linkage analysis, which are some of the hardest networks

Only SC-DVO succeeded

Comparison with SamIam on grid networks

Comparison with (Marinescu & Dechter, 2007) on grid networks (SMBBF: static mini-bucket best-first), with parameter 'i = 20', where 'i' controls the size of the mini-buckets.

We tried a few cases from the random and genetic linkage analysis networks which SMBBF could not solve (4 random networks of sizes 100, 110, 120, and 130, and pedigree13 from the genetic linkage analysis networks).
Conclusion

A novel and efficient heuristic for dynamic variable ordering for computing the MPE in Bayesian networks.
A significant improvement in time and space over less sophisticated heuristics and other MPE tools.
Many hard network instances solved for the first time.