Ensemble Solutions for Link-Prediction in Knowledge Graphs

Denis Krompaß (1,2) and Volker Tresp (1,2)

(1) Department of Computer Science, Ludwig Maximilian University
(2) Corporate Technology, Siemens AG

12.09.2015


Outline

1. Knowledge Graphs, what are they and what are they good for?

2. Representation Learning in Knowledge Graphs
   – State of the Art Latent Variable Models
   – Integrating Prior Knowledge about Relation-Types

3. Analyzing the Complementary “Potential” of State of the Art Representation Learning Algorithms


Knowledge Graphs

Store facts about the world as relations between entities.

Entities are no longer just strings but real-world objects with attributes, taxonomic information and relations to other objects. Example: (AlbertEinstein, bornIn, Ulm)

Providing a machine with semantic information:
• Search engines
• Information retrieval
• Word-sense disambiguation
• …

Prominent Examples:
• Google Knowledge Graph
• IBM Watson
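The triple notation used on this slide can be sketched in a few lines of Python. Only the Einstein triple comes from the slide; the other triples and the helper name `objects_of` are illustrative assumptions, not part of the talk.

```python
from collections import defaultdict

# Each fact is a (subject, predicate, object) triple.
# Only the first triple appears on the slide; the others are illustrative.
triples = {
    ("AlbertEinstein", "bornIn", "Ulm"),
    ("AlbertEinstein", "profession", "Physicist"),
    ("Ulm", "locatedIn", "Germany"),
}

# Index the triples by (subject, predicate) for fast lookup.
index = defaultdict(set)
for s, p, o in triples:
    index[(s, p)].add(o)


def objects_of(subject, predicate):
    """Return all known objects for a subject/predicate pair (hypothetical helper)."""
    return index.get((subject, predicate), set())


print(objects_of("AlbertEinstein", "bornIn"))  # {'Ulm'}
```

A pair that is missing from the index simply returns an empty set; whether that missing fact is false or merely unknown is exactly what link-prediction tries to decide.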


Learning in Knowledge Graphs

1. Link-Prediction

2. Link-based Clustering

3. Disambiguation

[Figure: Knowledge Graph Triples such as (AlbertEinstein, bornIn, Ulm) are fed into a Latent Variable Model, which yields embedding vectors and Similarities.]

Latent representations (or embeddings) for Entities and Relation-Types that disentangle complex relationships observed in the data (semantics).


State of the Art Latent Variable Models

1. RESCAL
   – Third-Order Tensor Factorization Method
   – Least-Squares Cost Function

2. TransE
   – Distance-based Method
   – Ranking Cost Function

3. Google Knowledge Vault
   – Multi-way Neural Network (mwNN)
   – Logistic Cost Function

Problem:
• Large Knowledge Graphs contain millions of entities and thousands of relation-types
• Low-dimensional representations have to be learned
• Goal: find ways to increase prediction quality under this constraint
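The three model families on this slide differ mainly in how they turn the embeddings of a triple into a score. The following is a minimal sketch of the commonly published scoring functions, not the authors' implementation: the function names, the toy embeddings with d = 2, and all weight values are assumptions chosen for illustration. TransE is shown with the L2 norm; the L1 norm is also used in practice.

```python
import math


def rescal_score(a_s, R_p, a_o):
    """RESCAL: bilinear form a_s^T R_p a_o (higher = more plausible)."""
    return sum(a_s[i] * R_p[i][j] * a_o[j]
               for i in range(len(a_s)) for j in range(len(a_o)))


def transe_score(a_s, r_p, a_o):
    """TransE: negative Euclidean distance between a_s + r_p and a_o."""
    return -math.sqrt(sum((s + r - o) ** 2 for s, r, o in zip(a_s, r_p, a_o)))


def mwnn_score(a_s, r_p, a_o, W, w):
    """Multi-way neural network: one tanh hidden layer over the
    concatenated embeddings, followed by a logistic output unit."""
    x = a_s + r_p + a_o  # list concatenation of the three embeddings
    hidden = [math.tanh(sum(w_ij * x_j for w_ij, x_j in zip(row, x)))
              for row in W]
    logit = sum(w_i * h_i for w_i, h_i in zip(w, hidden))
    return 1.0 / (1.0 + math.exp(-logit))


# Toy embeddings with d = 2 (all values are made up for illustration).
einstein, ulm = [1.0, 0.0], [0.0, 1.0]
born_in_matrix = [[0.0, 1.0], [0.0, 0.0]]  # RESCAL relation matrix
born_in_vector = [-1.0, 1.0]               # TransE / mwNN relation vector
W = [[0.1] * 6, [-0.1] * 6]                # hidden-layer weights (2 units)
w = [1.0, -1.0]                            # output weights

print(rescal_score(einstein, born_in_matrix, ulm))  # 1.0
print(transe_score(einstein, born_in_vector, ulm))
print(mwnn_score(einstein, born_in_vector, ulm, W, w))
```

Note that the three scores live on different scales (an unbounded bilinear value, a non-positive distance, and a probability), which is why their outputs cannot be averaged directly.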


Prior Knowledge about Relation-Types

Denis Krompaß, Stephan Baier and Volker Tresp. Type-Constrained Representation Learning in Knowledge Graphs. 14th International Semantic Web Conference (ISWC), 2015

Domain and Range Constraints for Relation-Types:
• Type-Constraints (from the schema)
• Local closed-world assumption (from the data)

Integrated into model training and into the prediction of new triples for latent variable models with low-dimensional embeddings:
• RESCAL
• TransE
• Google Knowledge Vault Neural Network

Link-Prediction Improvement:
• +77% (Freebase*)
• +40% (YAGO2*)
• +54% (DBpedia-Music*)

*Results on large samples from these knowledge graphs
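To make the idea of domain and range constraints concrete, here is a minimal sketch of how a local closed-world assumption could be derived from the data and used to restrict candidate triples. It assumes that the domain of a relation-type is the set of entities observed as its subject and the range is the set observed as its object; the function names and the toy triples are illustrative assumptions, not taken from the evaluated datasets.

```python
from collections import defaultdict


def local_closed_world(triples):
    """Approximate domain and range per relation-type from observed triples:
    domain = entities seen as subject, range = entities seen as object."""
    domain, range_ = defaultdict(set), defaultdict(set)
    for s, p, o in triples:
        domain[p].add(s)
        range_[p].add(o)
    return domain, range_


def candidate_triples(relation, domain, range_):
    """Enumerate only those (s, p, o) candidates that satisfy the constraints."""
    return [(s, relation, o)
            for s in sorted(domain[relation])
            for o in sorted(range_[relation])]


# Toy data (illustrative only).
observed = [
    ("AlbertEinstein", "bornIn", "Ulm"),
    ("MaxPlanck", "bornIn", "Kiel"),
    ("Ulm", "locatedIn", "Germany"),
]
domain, range_ = local_closed_world(observed)
candidates = candidate_triples("bornIn", domain, range_)
print(len(candidates))  # 4 instead of 5 * 5 = 25 unconstrained subject/object pairs
```

With schema-based type-constraints the two dictionaries would instead be filled from the declared domain and range classes of each relation-type; the filtering step stays the same.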


Complementary Prediction?

• State-of-the-art models differ in many aspects, giving diverse predictors

• Analysis of the degree to which the models are complementary
  – Combine through arithmetic mean
  – Use Platt scaling for mapping the different outputs to probabilities

• 70% Training Set

• 10% Validation Set: Hyperparameter Tuning + Platt Scaling

• 20% Test Set
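The combination scheme on this slide can be sketched as follows: each model's raw scores are mapped to probabilities with a sigmoid fitted on the validation set (Platt scaling), and the ensemble prediction is the arithmetic mean of those probabilities. This is a simplified sketch under stated assumptions: the fit uses plain gradient descent and omits the target smoothing of Platt's original method, and the function names, learning rate and all score values are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_platt(scores, labels, lr=0.1, epochs=2000):
    """Fit p = sigmoid(a * score + b) by gradient descent on the log loss.
    (Simplified: Platt's original method also smooths the target labels.)"""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def ensemble_probability(raw_scores, calibrators):
    """Arithmetic mean of the calibrated probabilities of all models."""
    probs = [sigmoid(a * s + b) for s, (a, b) in zip(raw_scores, calibrators)]
    return sum(probs) / len(probs)


# Toy validation scores for two models on different scales (made-up numbers).
labels = [1, 1, 0, 0]
model_a_val = [2.0, 1.5, -1.0, -2.5]    # e.g. a bilinear score
model_b_val = [-0.2, -0.5, -3.0, -4.0]  # e.g. a negative distance

calibrators = [fit_platt(model_a_val, labels), fit_platt(model_b_val, labels)]

print(ensemble_probability([1.8, -0.3], calibrators))   # plausible triple
print(ensemble_probability([-2.0, -3.5], calibrators))  # implausible triple
```

Fitting the calibration on the validation set rather than the training set keeps the probability mapping from being tuned on data the models have already seen.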


Results

1. The ensemble always has much better link-prediction quality

2. Best complement is between TransE and mwNN

3. RESCAL provides complementary predictions only in the case of the Freebase dataset

4. For the local closed-world assumption, very similar observations could be made


Summary

• Models are complementary to each other
  – This applies especially when low-dimensional embeddings are used (d=10)
  – Ensemble with d=10 comparable to best single predictor with d=100
  – Up to more than 10% improvement on top of the improvements achieved when Type-Constraints or the Local closed-world assumption are exploited


Questions?


http://www.dbs.ifi.lmu.de/~krompass/
