Learning Markov Logic Networks Using Structural Motifs

Post on 23-Feb-2016

43 views 0 download

Tags:

description

Learning Markov Logic Networks Using Structural Motifs. Stanley Kok Dept. of Computer Science and Eng. University of Washington Seattle, USA Joint work with Pedro Domingos. Outline. Background Learning Using Structural Motifs Experiments Future Work. Background - PowerPoint PPT Presentation

Transcript of Learning Markov Logic Networks Using Structural Motifs

1

Learning Markov Logic Networks Using Structural Motifs

Stanley KokDept. of Computer Science and Eng.

University of WashingtonSeattle, USA

Joint work with Pedro Domingos

Background Learning Using Structural Motifs Experiments Future Work

Background Learning Using Structural Motifs Experiments Future Work

2

Outline

3

Markov Logic Networks[Richardson & Domingos, MLJ’06]

A logical KB is a set of hard constraintson the set of possible worlds

Let’s make them soft constraints:When a world violates a formula,it becomes less probable, not impossible

Give each formula a weight(Higher weight Stronger constraint)

Teaches(p,c) ) Professor(p)2.7

4

Markov Logic A Markov logic network (MLN) is a set of

pairs (F,w) F is a formula in first-order logic w is a real number

vector of truth assignments to ground atoms

partition function

weight ofith formula

#true groundingsof ith formula

5

MLN Structure Learning

Input: Relational Data

Advises

Pete Sam

Pete Saul

Paul Sara

… …

TAs

Sam CS1

Sam CS2

Sara CS1

… …

Teaches

Pete CS1

Pete CS2

Paul CS2

… …

2.7 Teaches(p, c) Æ TAs(s, c) ) Advises(p, s)

1.4 Advises(p, s) Æ Teaches(p, c) ) TAs(s, c)

1.1 :TAs(s, c) ˅ : Advises (s, p)

Output: MLN

MLN Structure Learner

Generate-and-test or greedy MSL [Kok & Domingos, ICML’05] BUSL [Mihalkova & Mooney, ICML’07]

Computationally expensive; large search space Susceptible to local maxima

6

Previous Systems

7

AdvisesPete Sam Pete SaulPaul Sara

… …TAs

Sam CS1Sam CS2Sara CS1

… …

TeachesPete CS1Pete CS2Paul CS2

… …Sam

Pete CS1CS2CS3CS4CS5CS6CS7CS8

PaulPatPhil

SaraSaulSue

TAs

Advises Teaches

PetePaulPatPhil

SamSaraSaulSue

CS1 CS2

CS3 CS4

CS5 CS6

CS7 CS8

Teaches

TAs

Advises

Professor

Student

Course

‘Lifts’

LHL System[Kok & Domingos, ICML’09]

Trace paths &convert paths to first-order clauses

Background Learning Using Structural Motifs Experiments Future Work

8

Outline

First MLN structure learner that can learn long clauses Capture more complex dependencies Explore a larger space of clauses

Learning Using Structural Motifs (LSM)

Student1

Prof1

Course1

Student2

Course2

Student3

Course3

Prof2

Student4

Course4

Course5

Student5

Prof3

Course6

Student6

Prof4

Course7

Student7

Prof5

Student8

Course8

Course9

Student9

Prof6

Student10

Course10

Course11

Student11

Prof7

Student12

Course12

Course13

Student13

Prof8

Student13

Course1310

LHL Recap

S1 S2 S3 S4

P1

S5 S6 S7

C1

C2

Teaches

Advises

TAs

S9 S10 S11

P2

S12 S13

C3

C4

C5

S8

S20 S21 S22

P7

S23 S24

C7

C8

S33 S34 S35

P9

S36 S37

C13

C14

S25 S26

P8

S29 S30 S31

C10

C11

S32

C9

C12

S28S27

S14 S15 S16

P4P3

C6

S17 S18 S19

P6P5

C15

S38 S39 S40

C16

S41

P10 P11

11

Repeated Patterns

Student1

Prof1

Course1

Student2

Course2

Student3

Course3

Prof2

Student4

Course4

Course5

Student5

Prof3

Course6

Student6

Prof4

Course7

Student7

Prof5

Student8

Course8

Course9

Student9

Prof6

Student10

Course10

Course11

Student11

Prof7

Student12

Course12

Course13

Student13

Prof8

Student13

Course13

12

Repeated Patterns

Student1

Prof1

Course1

Student2

Course2

Student3

Course3

Prof2

Student4

Course4

Course5

Student5

Prof3

Course6

Student6

Prof4

Course7

Student7

Prof5

Student8

Course8

Course9

Student9

Prof6

Student10

Course10

Course11

Student11

Prof7

Student12

Course12

Course13

Student13

Prof8

Student13

Course13

Course

Student

Prof

Teaches

Advises

TAs

{ Teaches(p,c),TAs(s,c),

Advises(p,s) }

Finds literals that are densely connected Random walks & hitting times

Groups literals into structural motifs

Cluster nodes into high-level concepts Symmetrical paths & nodes

13

Structural motif = set of literals→ a set of clauses= a subspace of clauses

TAs(s,c)

:Teaches(p,c) ˅ TAs(s,c) ˅ Advises(p,s) Teaches(p,c) ˅ :TAs(s,c)

………

Learning Using Structural Motifs (LSM)

14

LSM’s Three Components

Advises

Pete

Sam

Pete Saul

Paul Sar

… …

TAs

Sam CS1

Sam CS2

Sara CS1

… …

Teaches

Pete CS1

Pete CS2

Paul CS2

… …

2.7 Teaches(p, c) Æ TAs(s, c) ) Advises(p, s)

1.4 Advises(p, s) Æ Teaches(p, c)) TAs(s, c)

-1.1 TAs(s, c) ) Advises(s, p)

Input: Relational DB Output:MLN

LSM

IdentifyMotifs

FindPaths

CreateMLN

AA

Random Walk Begin at node A Randomly pick neighbor n

E

F

D

B

C15

Random Walk Begin at node A Randomly pick neighbor n Move to node n

E

F

D A

2B

C16

Random Walk Begin at node A Randomly pick neighbor n Move to node n Repeat

E

F

D A

B

2C17

Expected number of steps starting from node i before node j is visited for first time Smaller hitting time → closer to start node i

Truncated Hitting Time [Sarkar & Moore, UAI’07]

Random walks are limited to T steps Computed efficiently & with high probability by

sampling random walks [Sarkar, Moore & Prakash ICML’08]

18

Hitting Time from node i to j

Finding Truncated Hitting Time By Sampling

E

F

D 1

B

C

A

A

T=5

19

Finding Truncated Hitting Time By Sampling

E

F

4 A

B

C

D

A D

T=5

20

Finding Truncated Hitting Time By Sampling

5

F

D A

B

C

E

A D E

T=5

21

Finding Truncated Hitting Time By Sampling

E

F

4 A

B

C

D

A D E D

T=5

22

Finding Truncated Hitting Time By Sampling

E

6

D A

B

CF

A D E D F

T=5

23

Finding Truncated Hitting Time By Sampling

5

F

D A

B

C

E

A D E D F E

T=5

24

Finding Truncated Hitting Time By Sampling

A D E D F E

T=5

E

F

D A

B

C

hAD=1hAE=2

hAF=4

hAA=0hAB=5

hAC=5

25

26

Symmetrical Paths

S1 S2 S3 S4

P1

S5 S6 S7

C1

C2

TeachesAdvises

TAs

S10 S11

P2

S12 S13

C3

C4

S8

S9

P1→S2P1, Advises, S2

P1→S3P1, Advises, S3 0, Advises, 1 0, Advises, 1

Physics History

Symmetrical

27

Symmetrical Paths

S1 S2 S3 S4

P1

S5 S6 S7

C1

C2

TeachesAdvises

TAs

S10 S11

P2

S12 S13

C3

C4

S8

S9

P1→S2

P1, Advises, S1, TAs, C1, TAs, S2 P1, Advises, S2

P1→S3P1, Advises, S3 0, Advises, 1 0, Advises, 1

0, Advises, 1, TAs, 2, TAs, 3 P1, Advises, S4, TAs, C1, TAs, S3 0, Advises, 1, TAs, 2, TAs, 3

Physics History

28

Symmetrical Nodes

S1 S2 S3 S4

P1

S5 S6 S7

C1

C2

TeachesAdvises

TAs

S10 S11

P2

S12 S13

C3

C4

S8

S9

P1→S2

P1, Advises, S1, TAs, C1, TAs, S2 P1, Advises, S2

P1→S3P1, Advises, S3 0, Advises, 1 0, Advises, 1

0, Advises, 1, TAs, 2, TAs, 3 P1, Advises, S4, TAs, C1, TAs, S3 0, Advises, 1, TAs, 2, TAs, 3

… …

Physics History

Symmetrical

Sym. nodes have identical truncated hitting times

Sym. nodes have identical path distributions in a

sample of random walks

29

Learning Using Structural Motifs

Advises

Pete

Sam

Pete Saul

Paul Sar

… …

TAs

Sam CS1

Sam CS2

Sara CS1

… …

Teaches

Pete CS1

Pete CS2

Paul CS2

… …

2.7 Teaches(p, c) Æ TAs(s, c) ) Advises(p, s)

1.4 Advises(p, s) Æ Teaches(p, c)) TAs(s, c)

-1.1 TAs(s, c) ) Advises(s, p)

Input: Relational DB Output:MLN

LSM

IdentifyMotifs

FindPaths

CreateMLN

30

Sample Random Walks

S1 S2 S3 S4

P1

S5 S6 S7

C1

C2

Teaches

Advises

TAs

S10 S11

P2

S12 S13

C3

C4

S9

Physics History

S8

0,Advises,1,TAs,2

…1…

31

Estimate TruncatedHitting Times

S1 S2 S3 S4

P1

S5 S6 S7

C1

C2

S10 S11

P2

S12 S13

C3

C4

S9

Physics History

S8

3.2

3.21

3.55

3.52 3.52 3.52 3.52

3.55 3.55 3.55 3.93

3.99

3.99

4 4

4 4

4

0

32

Prune ‘Faraway’ Nodes

S1 S2 S3 S4

P1

S5 S6 S7

C1

C2

Physics

S8

3.2

3.21

3.55

3.52 3.52 3.52 3.52

3.55 3.55 3.55 3.93S10 S11

P2

S12 S13

C3

C4

S9

History

3.99

3.99

4 4

4 4

4

0

33

Group Nodes with Similar Hitting Times

S1 S2 S3 S4

P1

S5 S6 S7

C1

C2

S8

3.2

3.21

3.55 3.55 3.55 3.55

Candidate symmetrical nodes

3.523.523.52 3.52

0

34

Cluster Nodes

S1 S2 S3 S4

P1

S5 S6 S7

C1

C2

Cluster nodes with similar path distributions

S8

0,Advises,1

0.5

…0,Advises,2,…,1 0.1

35

Create ‘Lifted’ Graph

S1 S2 S3 S4

S5 S6 S7

C1 C2

S8

P1

Teaches

Advises

TAs

Professor

Course

Student

36

Extract Motif with DFS

S1 S2 S3 S4

S5 S6 S7

C1 C2

S8

P1

Teaches

Advises

TAs

Professor

Course

Student

37

Create Motif

S1

C1

P1

Teaches

Advises

TAs{ Teaches(p,c),

TAs(s,c), Advises(p,s) }

{ Teaches(P1,C1),TAs(S1,C1),

Advises(P1,S1) }

true grounding of

Motif

38

Restart from Next Node

S1 S2 S3 S4

P1

S5 S6 S7

C1

C2

S10 S11

P2

S12 S13

C3

C4

S9

Physics History

S8

C2

39

S1 S2 S3 S4

P1

S5 S6 S7

C1

C2

Physics

S8

C2

Student1

Professor

Student

Course1

Course2

different motif over sameset of constants

Restart from Next Node

40

Select Motifs

Choose motifs with large #true groundings

{ Teaches(p,c),TAs(s,c),

Advises(p,s) }

Motif Est. #True Gndings

100

{ Teaches(p,c),…} 20

… …

Pass selected motifs to FindPaths & CreateMLN

41

LSM

Advises

Pete

Sam

Pete Saul

Paul Sar

… …

TAs

Sam CS1

Sam CS2

Sara CS1

… …

Teaches

Pete CS1

Pete CS2

Paul CS2

… …

2.7 Teaches(p, c) Æ TAs(s, c) ) Advises(p, s)

1.4 Advises(p, s) ) Teaches(p, c) Æ TAs(s, c)

-1.1 TAs(s, c) ) Advises(s, p)

Input: Relational DB Output:MLN

LSM

IdentifyMotif

FindPaths

CreateMLN

42

FindPaths

1+ 11+ 1

{ Teaches(p,c),TAs(s,c),

Advises(p,s) }

p

s

c

Teaches

TAs

Advises

Paths Found

Advises(p,s)

Advises(p,s) ,Teaches (p,c)

Advises(p,s) ,Teaches (p,c),TAs(s,c)

c

p

s

Advises(p, s) Æ Teaches(p, c) Æ TAs(s, c) Advises(p, s) V :Teaches(p, c) V :TAs(s, c) Advises(p, s) V Teaches(p, c) V :TAs(s, c) …

:Advises(p, s) V :Teaches(p, c) V :TAs(s, c)

43

Clause Creation

1+ 11+ 1

1+ 11+ 1

Advises(p, s)

Teaches(p,c)

TAs(s,c)

Æ

Æ

1+ 11+ 1

44

Clause Pruning

1+ 11+ 1

: Advises(p, s) V :Teaches(p, c) V TAs(s, c)

Advises(p, s) V :Teaches(p, c) V TAs(s, c)…: Advises(p, s) V :Teaches(p, c): Advises(p, s) V TAs(s, c)

… : Advises(p, s) : Teaches(p, c)

:Teaches(p, c) V TAs(s, c)

TAs(s, c)

Score -1.15 -1.17

-2.21 -2.23 -2.03

-3.13 -2.93 -3.93

…`

45

Clause Pruning

1+ 11+ 1

: Advises(p, s) V :Teaches(p, c) V TAs(s, c)

Advises(p, s) V :Teaches(p, c) V TAs(s, c)…: Advises(p, s) V :Teaches(p, c): Advises(p, s) V TAs(s, c)

… : Advises(p, s) : Teaches(p, c)

:Teaches(p, c) V TAs(s, c)

TAs(s, c)

Score -1.15 -1.17

-2.21 -2.23 -2.03

-3.13 -2.93 -3.93

Compare each clause against its sub-clauses (taken individually)

Add all clauses to empty MLN Train weights of clauses Remove clauses with absolute weights

below threshold

46

1+ 11+ 1

MLN Creation

Background Learning Using Structural Motifs Experiments Future Work

47

Outline

IMDB Created from IMDB.com DB Movies, actors, etc., and relationships 17,793 ground atoms; 1224 true ones

UW-CSE Describes academic department Students, faculty, etc., and relationships 260,254 ground atoms; 2112 true ones

48

Datasets

Cora Citations to computer science papers Papers, authors, titles, etc., and their

relationships 687,422 ground atoms; 42,558 true ones

49

Datasets

Five-fold cross validation Inferred prob. true for groundings of each pred.

Groundings of all other predicates as evidence For Cora, inferred four predicates jointly too

SameCitation, SameTitle, SameAut, SameVenue MCMC to eval test atoms: 106 samples or 24 hrs Evaluation measures: CLL, AUC

50

Methodology

LSM, LHL, BUSL, MSLTwo lengths per system

Short length of 4 Long length of 10

51

Methodology

Series10

0.2

0.4

0.6

0.8

1

52

AUC & CLL

Series1-1

-0.8

-0.6

-0.4

-0.2

0

Series1-1

-0.8

-0.6

-0.4

-0.2

0

AU

C

Series10

0.2

0.4

0.6

0.8

1

CLL

Short Clauses Long Clauses↑5%

↑45%

LSM LHL BUSL MSL LSM LHL BUSL MSL

LSM LHL BUSL MSLLSM LHL BUSL MSL

Series10123456789

10

Series10

100000

200000

53

Runtimeshr

Short Clauses Long Clauses

LSM LHL BUSL MSL LSM LHL BUSL MSL

1.3320.6

1.92

230,000

1.83

9.96

1.83 9.81

10,000X

same local maxima

VenueOfCit(v,c) ᴧ VenueOfCit(v,c') ᴧ AuthorOfCit(a,c) ᴧ AuthorOfCit(a',c') ᴧ SameAuthor(a,a') ᴧ

TitleOfCit(t,c) ᴧ TitleOfCit(t',c') ) SameTitle(t,t') SameCitation(c,c') ᴧ TitleOfCit(t,c) ᴧ TitleOfCit(t',c') ᴧ

HasWordTitle(t,w) ᴧ HasWordTitle(t',w) ᴧ AuthorOfCit(a,c) ᴧ AuthorOfCit(a',c') ᴧ SameAuthor(a,a')

54

Long Rules Learned on Cora

Background Learning Using Structural Motifs Experiments Future Work

55

Outline

Discover motifs at multiple granularities Combine LSM with generate-and-test

approaches Apply LSM to larger, richer domains, e.g.,

Web

56

Future Work

LSM finds structural motifs in data random walks & hitting times

Accurately and efficiently learns long clauses by searching within motifs

Outperforms state-of-the-art structure learners

57

Conclusion

58

AuthorOfCit(a,c) ᴧ AuthorOfCit(a',c') ᴧ SameAuthor(a,a') ᴧ TitleOfCit(t,c) ᴧ TitleOfCit(t',c') ᴧ SameTitle(t,t') ) SameCitation(c,c')

AuthorHasWord(a,w) ᴧ AuthorHasWord(a',w') ᴧ AuthorHasWord(a'',w) ᴧ AuthorHasWord(a'',w') ) SameAuthor(a,a')

59

Long Rules Learned (Cora)