
Semantically-Linked Bayesian Networks: A Framework for Probabilistic Inference Over Multiple Bayesian Networks

PhD Dissertation Defense

Rong Pan
Advisor: Dr. Yun Peng
Department of Computer Science and Electrical Engineering
University of Maryland Baltimore County
Aug 2, 2006


Outline

► Motivations
► Background
► Overview
► How Knowledge is Shared
► Inference on SLBN
► Concept Mapping using SLBN
► Future works

Motivations (1)

► Separately developed BNs about
  related domains
  different aspects of the same domain

Motivations (2)

► Existing approach: Multiply Sectioned Bayesian Networks (MSBN)
  Every subnet is sectioned from a global BN
  Strictly consistent subnets
  Exactly identical shared variables with the same distribution
  All parents of a shared variable must appear in one subnet

(Figure: sectioning a global BN into subnets)

Motivations (3)

► Existing approach: Agent Encapsulated Bayesian Networks (AEBN)
  Distribution BN model for a specific application
  Hierarchical global structure
  Very restricted expressiveness
  Exactly identical shared variables with different prior distributions

(Figure: an AEBN agent with an output variable, input variables, and local variables)

Motivations (4)

► A distributed BN model is expected, with these features:
  Uncertainty reasoning over separately developed BNs
  Variables shared by different BNs can be similar but not identical
  Principled and well justified
  Supports various applications

Background: Bayesian Network

► DAG
► Variables with finite states
► Edges: causal influences
► Conditional Probability Table (CPT)

► Example CPT:

  $P(A' \mid A) = \begin{pmatrix} 0.8 & 0.2 \\ 0.35 & 0.65 \end{pmatrix}$

Background: Evidences in BN

Scenario                                                      Mammal (T / F)   Male Mammal (T / F)
Original BN                                                   80.0 / 20.0      40.0 / 60.0
Hard evidence: Male_Mammal = True                             100 / 0          100 / 0
Soft evidence: Q(Male_Mammal) = (0.5, 0.5)                    83.3 / 16.7      50.0 / 50.0
Virtual evidence: L(Male_Mammal) = 0.8/0.2                    90.9 / 9.09      72.7 / 27.3
Virtual evidence equivalent to the soft evidence:
L(Male_Mammal) = 0.3/0.2                                      83.3 / 16.7      50.0 / 50.0
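These updates are simple arithmetic, so the table can be checked directly. A minimal sketch in Python using the numbers above (prior P(Male_Mammal) = (0.4, 0.6)); only the Male_Mammal marginal is computed here, since updating the rest of the BN needs the full model:

```python
import numpy as np

prior = np.array([0.4, 0.6])        # P(Male_Mammal = True/False) in the original BN

# Virtual evidence: multiply by the likelihood ratio L(Male_Mammal) = 0.8/0.2, renormalize
L = np.array([0.8, 0.2])
ve = prior * L
ve /= ve.sum()
print(ve)                           # -> [0.727, 0.273], the slide's 72.7 / 27.3

# Soft evidence forces the marginal to Q(Male_Mammal) = (0.5, 0.5); the equivalent
# virtual evidence is the ratio Q/P, which is proportional to the slide's 0.3/0.2
L_equiv = np.array([0.5, 0.5]) / prior
print(L_equiv / L_equiv.sum())      # -> [0.6, 0.4], i.e. the ratio 3 : 2 = 0.3 : 0.2
```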

Background: Jeffrey's Rule (Soft Evidence)

► Given external observations Q(B_i), the rest of the BN is updated by Jeffrey's rule:

  $Q(A) = \sum_i P(A \mid B_i)\, Q(B_i)$

  where P(A | B_i) is the conditional probability before the evidence and Q(B_i) is the soft evidence (a numeric sketch follows below).

► Multiple soft evidences
  Problem: updating one variable's distribution to its target value can drive other variables' distributions off their targets
  Solution: IPFP
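As a sanity check, Jeffrey's rule is one line of linear algebra once the pre-evidence conditionals P(A | B_i) are at hand. A minimal sketch (the conditional table below is illustrative, not taken from the slides):

```python
import numpy as np

P_A_given_B = np.array([[0.9, 0.1],    # row i holds P(A | B_i); illustrative numbers
                        [0.3, 0.7]])
Q_B = np.array([0.5, 0.5])             # soft evidence Q(B)

Q_A = Q_B @ P_A_given_B                # Q(A) = sum_i P(A | B_i) Q(B_i)
print(Q_A)                             # -> [0.6, 0.4]
```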

Background: Iterative Proportional Fitting Procedure (IPFP)

► Q_0: an initial distribution on the set of variables X
► {P(S_i)}: a consistent set of n marginal probability distributions, where S_i ⊆ X
► The IPFP process (a code sketch follows below):

  $Q_i(X) = \begin{cases} 0 & \text{if } Q_{i-1}(S_j) = 0 \\ Q_{i-1}(X)\,\dfrac{P(S_j)}{Q_{i-1}(S_j)} & \text{otherwise} \end{cases}$

  where i is the iteration number and j = (i − 1) mod n + 1

► The distribution after IPFP satisfies the given constraints {P(S_i)} and has minimum cross-entropy to the initial distribution Q_0
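A minimal IPFP sketch, assuming each constraint is a marginal over a single variable (the definition above allows any subset S_j ⊆ X):

```python
import numpy as np

def ipfp(Q0, constraints, max_iters=10000):
    """IPFP on a joint probability table Q0 (one numpy axis per variable).

    `constraints` is a list of (axis, target) pairs: the target marginal P(S_j)
    for one variable.  Step i scales Q_{i-1} by P(S_j) / Q_{i-1}(S_j), with
    j cycling over the constraints; cells where Q_{i-1}(S_j) = 0 stay 0.
    """
    Q = Q0.astype(float).copy()
    n = len(constraints)
    for i in range(max_iters):
        axis, target = constraints[i % n]
        target = np.asarray(target, dtype=float)
        others = tuple(a for a in range(Q.ndim) if a != axis)
        marg = Q.sum(axis=others)                                   # Q_{i-1}(S_j)
        ratio = np.divide(target, marg, out=np.zeros_like(marg), where=marg > 0)
        Q *= ratio.reshape([-1 if a == axis else 1 for a in range(Q.ndim)])
        if all(np.allclose(Q.sum(axis=tuple(a for a in range(Q.ndim) if a != ax)), t)
               for ax, t in constraints):
            break                                                   # all constraints met
    return Q
```

The worked soft-evidence example later in this deck (Q(A) = (0.6, 0.4), Q(B) = (0.5, 0.5)) converges in a handful of iterations with this routine.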

SLBN: Overview (1)

► Semantically-Linked Bayesian Networks (SLBN)
  A theoretical framework that supports probabilistic inference over separately developed BNs

(Figure: two BNs linked at similar variables; the global knowledge spans both)

SLBN: Overview (2)

► Features
  Inference over separate BNs that share semantically similar variables
  Global knowledge: J-graph
  Principled, well-justified

► In SLBN
  BNs are linked at the similar variables
  Probabilistic influences are propagated via the shared variables
  The inference process utilizes soft evidence (Jeffrey's rule), virtual evidence, IPFP, and traditional BN inference

How knowledge is shared: Semantic Similarity (1)

What is similarity?

Similar:
Pronunciation: 'si-m&-l&r, 'sim-l&r
Function: adjective
1: having characteristics in common
2: alike in substance or essentials
3: not differing in shape but only in size or position
  — www.merriam-webster.com

Examples: High-tech Company Employee vs. High-income People; Computer Keyboard vs. Typewriter

How knowledge is shared: Semantic Similarity (2)

► Natural language's definition of "similar" is vague
  Hard to formalize
  Hard to quantify
  Hard to utilize in intelligent systems

► Semantic similarity of concepts
  Sharing of common instances
  Quantified and utilized with direction
  Quantified by the ratio of the shared instances to all the instances, i.e., a conditional probability:
  P(High-tech Company Employee | High-income People)

(Figure: Man vs. Woman)

How knowledge is shared: Variable Linkage (1)

► In Bayesian Networks (BN) / SLBN
  Concepts are represented by variables
  Semantic similarities are between propositions

When we say "High-tech Company Employee" is similar to "High-income People", we mean "High-tech Company Employee = True" is similar to "High-income People = True".

How knowledge is shared: Variable Linkage (2)

► Variable linkages
  Represent semantic similarities in SLBN
  Are between variables in different BNs

► A linkage from A to B is a tuple $L_{AB} = \langle A, B, N_A, N_B, S_{AB} \rangle$, where
  A: source variable
  B: destination variable
  N_A: source BN
  N_B: destination BN
  S_AB: quantification of the similarity

► $S_{AB}$ is an m × n matrix, $S_{AB}(i, j) = P(b_j \mid a_i)$ (a sketch of this record follows below)
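For concreteness, a linkage can be carried around as a small record; a hypothetical sketch (names and numbers are illustrative, not the dissertation's implementation):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class VariableLinkage:
    """L_AB = <A, B, N_A, N_B, S_AB>, as defined above."""
    source: str        # A: source variable
    dest: str          # B: destination variable
    source_bn: str     # N_A: source BN
    dest_bn: str       # N_B: destination BN
    S: np.ndarray      # S_AB: m x n matrix with S[i, j] = P(b_j | a_i)

    def __post_init__(self):
        # each row of S_AB is the conditional distribution P(B | A = a_i)
        assert np.allclose(self.S.sum(axis=1), 1.0), "rows of S_AB must sum to 1"

# a linkage claiming strong (but not exact) similarity between two boolean concepts
link = VariableLinkage("HighTechEmployee", "HighIncome", "BN1", "BN2",
                       np.array([[0.9, 0.1],
                                 [0.2, 0.8]]))
```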

How knowledge is shared: Variable Linkage (3)

► Variable Linkage vs. BN Edge

                             Variable Linkage                 BN Edge
Representation of            Semantic similarity              Causal influences
Conditional probability      Quantification of similarity;    Conditional dependency;
                             invariant w.r.t. any event       may be changed by events
Prob. influence propagation  Along the direction              Both directions (π-msgs and λ-msgs)

How knowledge is shared: Variable Linkage (4)

► Expressiveness of variable linkage
  Logical relationships defined in OWL syntax: Equivalent, Union, Intersection, Subclass, and Complement
  Relaxation of the logical relationships by replacing set inclusion with overlapping: Overlap, Superclass, Subclass
  Equivalence relations where the same concept is modeled as different variables

How knowledge is shared: Examples (1)

(Figures: linkage examples for the Identical and Union relationships)

How knowledge is shared: Examples (2)

(Figures: linkage examples for the Overlap and Superclass relationships)

How knowledge is shared: Consistent Linked Variables

► The prior beliefs on the linked variables on both sides must be consistent with the variable linkage:

  $P_2(B) = \sum_i P_S(B \mid A = a_i)\, P_1(A = a_i)$

  There must exist a single distribution consistent with the prior beliefs on A and B and the linkage's similarity.

► Consistency is examined by IPFP (a quick check is sketched below)

(Figure: two linked BNs with distributions P_1(A) and P_2(B) and linkage similarity P_S(B | A))
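The consistency condition is a matrix-vector product, so it can be tested before running any IPFP. A small sketch (the priors and linkage matrix are illustrative):

```python
import numpy as np

P1_A = np.array([0.3, 0.7])          # P1(A) in the source BN (illustrative)
S_AB = np.array([[0.9, 0.1],         # P_S(B | A = a_i), hypothetical similarity
                 [0.2, 0.8]])
P2_B = np.array([0.41, 0.59])        # P2(B) in the destination BN

implied = P1_A @ S_AB                # sum_i P_S(B | A = a_i) P1(A = a_i)
print(implied, np.allclose(implied, P2_B))   # [0.41 0.59] True -> consistent
```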

Inference on SLBN: The Process

1. Enter evidence (BN belief update with traditional inference)
2. Propagate (SLBN rules for probabilistic influence propagation)
3. Enter soft/virtual evidences (BN belief update with soft evidence)
4. Updated result

Inference on SLBN: The Theory

(Diagram: theoretical basis — Bayes' rule, Jeffrey's rule, and IPFP; existing implementations — BN inference, soft evidence, and virtual evidence; SLBN's implementation builds on all of these)

Inference on SLBN: Assumptions/Restrictions

► All linked BNs are consistent with the linkages
► One variable can only be involved in one linkage
► Causal precedence in all linked BNs is consistent

(Figures: linked BNs with consistent vs. inconsistent causal sequences)

Inference on SLBN: Assumptions/Restrictions (Cont.)

► For a variable linkage, the causes/effects of the source are also the causes/effects of the destination
  Linkages cannot cross each other

(Figure: crossed linkages)

Inference on SLBN: SLBN Rules for Probabilistic Influence Propagation (1)

► Some hard evidence influences the source from the bottom
► Propagated influences are represented by soft evidences
► Beliefs of the destination BN are updated with SE

(Figure: linked BNs with variables Y1, Y2, Y3, X1)

Inference on SLBN: SLBN Rules for Probabilistic Influence Propagation (2)

► Some hard evidence influences the source from the top
► Additional soft evidences are created to cancel the influences from the linkage to parent(dest(L))

(Figure: linked BNs with variables Y1, Y2, Y3, X1)

Inference on SLBN: SLBN Rules for Probabilistic Influence Propagation (3)

► Some hard evidence influences the source from both top and bottom
► Additional soft evidences are created to propagate the combined influences from the linkage to parent(dest(L))

(Figure: linked BNs with variables Y1, Y2, Y3, X1)

Inference on SLBN: Belief Update with Soft Evidence (1)

► Represent soft evidences by virtual evidences
  Belief update with soft evidence is IPFP
  Belief update with one virtual evidence is one step of IPFP

► Therefore, we can
  Use virtual evidence to implement IPFP on a BN
  Use virtual evidence to implement soft evidence

► SE → VE, in two ways (see the sketch below):
  Iterate on the whole BN
  Iterate on the soft evidence variables
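The SE-to-VE conversion behind both options is a one-liner: a soft evidence Q(X) is absorbed by a virtual evidence whose likelihood ratio is Q(X)/P(X). A minimal sketch (assuming the current marginal is available from any BN engine):

```python
import numpy as np

def soft_to_virtual(current, target):
    """Likelihood ratio L(X) = Q(X)/P(X): applied as virtual evidence, it moves
    the marginal P(X) exactly onto Q(X) -- one IPFP step on the BN."""
    P, Q = np.asarray(current, dtype=float), np.asarray(target, dtype=float)
    L = np.divide(Q, P, out=np.zeros_like(Q), where=P > 0)
    return L / L.sum()               # scale is irrelevant; only the ratio matters

# Male_Mammal example from the Background slides: P = (0.4, 0.6), Q = (0.5, 0.5)
print(soft_to_virtual([0.4, 0.6], [0.5, 0.5]))    # -> [0.6 0.4], the 0.3/0.2 ratio
```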

Inference on SLBN: Belief Update with Soft Evidence (2)

► Iterate on the whole BN
Soft evidence targets: Q(A) = (0.6, 0.4), Q(B) = (0.5, 0.5); virtual evidences are entered on A and B alternately:

       A              B
  (0.3, 0.7)    (0.8, 0.2)     initial beliefs
  (0.6, 0.4)    (0.66, 0.34)   after VE on A
  (0.47, 0.53)  (0.5, 0.5)     after VE on B
  (0.6, 0.4)    (0.60, 0.40)
  (0.55, 0.45)  (0.5, 0.5)
  ...
  (0.6, 0.4)    (0.5, 0.5)     converged

Inference on SLBN: Belief Update with Soft Evidence (2, cont.)

► Iterate on the SE variables
Soft evidence targets: Q(A) = (0.6, 0.4), Q(B) = (0.5, 0.5)

             A = t   A = f
P(A, B):   B = t   0.2     0.6
           B = f   0.1     0.1

IPFP with Q(A), Q(B) gives

             A = t   A = f
Q(A, B):   B = t   0.236   0.264
           B = f   0.364   0.136

(Figure: a single virtual evidence node attached to the SE variables A and B)
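These numbers can be reproduced by running IPFP directly on the joint table, as in the sketch below:

```python
import numpy as np

# P(A, B) from the slide, stored with axis 0 = A (t, f) and axis 1 = B (t, f)
Q = np.array([[0.2, 0.1],
              [0.6, 0.1]])
for _ in range(100):   # alternate the two marginal constraints until convergence
    Q *= (np.array([0.6, 0.4]) / Q.sum(axis=1))[:, None]   # fit Q(A) = (0.6, 0.4)
    Q *= (np.array([0.5, 0.5]) / Q.sum(axis=0))[None, :]   # fit Q(B) = (0.5, 0.5)
print(Q.round(3))      # -> [[0.236 0.364], [0.264 0.136]], matching Q(A, B) above
```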

Inference on SLBN: Belief Update with Soft Evidence (3)

► Existing approach: Big-Clique

                     Big-Clique              Iteration on whole BN   Iteration on SE variables
BN inference basis   Rewrites Junction-Tree  Wrapper of any method   Wrapper of any method
Time per iteration   O(e^|C|)                O(BN Inf.)              O(e^|V|)
Space                O(e^|C|)                O(|V|)                  O(e^|V|)

C: the big clique; V: the SE variables; |C| ≥ |V|

► Iteration on the whole BN suits small BNs with many soft evidences
► Iteration on the SE variables suits large BNs with a few soft evidences

J-Graph (1): Overview

► Joint-graph (J-graph) is a graphical probability model that represents
  The joint distribution of SLBN
  The interdependencies between variables across variable linkages

► Usage
  Check if all assumptions are satisfied
  Justify the inference process

J-Graph (2): Definition

► J-Graph is constructed by merging all linked BNs and linkages into one graph
  DAG
  Variable nodes and linkage nodes
  Edges: all edges in the linked BNs have a representation in the J-graph
  CPTs: variable nodes keep their original CPTs, Q(A | π(A)) = P(A | π(A)); a linkage node for L_AB carries the similarity, Q(B | A) = P_S(B | A)

► Q: distribution in the J-graph; P: original distribution

J-Graph (3): Example

(Figure: BN1 with variables A, B, C, D and BN2 with A', B', C', D', merged into a J-graph with linkage nodes B→B' and C→C')

► Linkage nodes represent the linked variables and the linkage itself
► Linkage nodes encode the similarity of the linkage in their CPT
► The CPTs are merged by IPFP

Concept Mapping using SLBN (1): Motivations

► Ontology mappings are seldom certain
  Existing approaches
  ► use hard thresholds to filter mappings
  ► throw the similarities away after mappings are created
  ► produce mappings that are identical and 1-to-1

  But
  ► often one concept is similar to more than one concept
  ► semantically similar concepts are hard to represent logically

Concept Mapping using SLBN (2): The Framework

(Diagram: ontologies Onto1 and Onto2 are translated into BN1 and BN2 by BayesOWL; a learner extracts probabilistic information from the WWW to create variable linkages; SLBN combines BN1, BN2, and the linkages)

Concept Mapping using SLBN (3): Objective

► Discover new and complex concept mappings
  Make full use of the learned similarity in SLBN's inference
  Create an expression for a concept in another ontology
  ► e.g., find how similar "Onto1:B ∪ Onto1:C" is to "Onto2:A"

► Experiments have shown encouraging results

Concept Mapping using SLBN (4): Experiment

► The Artificial Intelligence sub-domain from the ACM Topic Taxonomy and the DMOZ (Open Directory) hierarchies

Learned similarities (joint distributions over each pair of concepts):

  P(dmoz.sw, acm.rs)   = (0.60, 0.12; 0.21, 0.07)
  P(dmoz.sw, acm.sn)   = (0.58, 0.13; 0.25, 0.04)
  P(dmoz.sw, acm.krfm) = (0.65, 0.30; 0.04, 0.01)

  J(dmoz.sw, acm.rs)   = 0.64
  J(dmoz.sw, acm.sn)   = 0.61
  J(dmoz.sw, acm.krfm) = 0.49
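The Jaccard coefficients follow from the learned joints as intersection over union. A quick check for the first pair, assuming the reconstructed layout above (first entry = both concepts true):

```python
import numpy as np

# P(dmoz.sw, acm.rs): rows dmoz.sw = (T, F), columns acm.rs = (T, F) -- assumed layout
P = np.array([[0.60, 0.12],
              [0.21, 0.07]])
J = P[0, 0] / (P[0, 0] + P[0, 1] + P[1, 0])   # shared instances / union of instances
print(round(J, 2))                            # -> 0.65, the slide's 0.64 up to rounding
```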

After SLBN inference:

  J(dmoz.sw, acm.rs ∪ acm.sn) = 0.7250

  Q(acm.rs = True ∨ acm.sn = True | dmoz.sw = True) = 0.9646

Future Works

► Modeling with SLBN
  Discover semantically similar concepts by machine learning algorithms
  Create effective and correct linkages from the learned similarities

► Distributed inference methods

► Loosening the restrictions
  Inference with linkages in both directions
  Using functions to represent similarities

Thank You!

► Questions?Questions?

Background: Semantics of BN

► Chain rule (illustrated below):

  $P(X) = \prod_i P(a_i \mid \pi(a_i))$

  where π(a_i) is the parent set of a_i.

► d-separation:

(Figure: serial, diverging, and converging connections among variables A, B, C, with instantiated and non-instantiated middle nodes)

d-separated variables do not influence each other.
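A two-line illustration of the chain rule for a chain A → B → C (all numbers hypothetical):

```python
import numpy as np

P_A = np.array([0.6, 0.4])
P_B_given_A = np.array([[0.7, 0.3],   # row: state of A
                        [0.2, 0.8]])
P_C_given_B = np.array([[0.9, 0.1],   # row: state of B
                        [0.5, 0.5]])

# P(A, B, C) = P(A) * P(B | A) * P(C | B), one numpy axis per variable
joint = P_A[:, None, None] * P_B_given_A[:, :, None] * P_C_given_B[None, :, :]
print(joint.sum())                    # -> 1.0, a valid joint distribution
```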