Post on 16-Apr-2017
DBtrends
Exploring Query Logs for Ranking RDF Data
AKSW
Edgard Marx, Amrapali Zaveri, Diego Moussallem, Sandro Rautenberg
12th International Conference on Semantic Systems
Outline
• Motivation
• Background
• Ranking using Query Logs
• Evaluation
• Results
• Discussion
• Conclusion
• Future Works
We Have Data
"The size of LOD by 2014 was 31 billion triples" (http://linkeddatacatalog.dws.informatik.uni-annheim.de/state/)
"Facebook users generates 2.7 billion Like actions per day and 300 million new photos are uploaded daily" (Josh Constine, 2012, techcrunch.com)
"Google Processing 20,000 Terabytes A Day, And Growing" (Erick Schonfeld, 2008, techcrunch.com)
Motivation
Things
Background
Web of Data
• Semantic Search
• Entity Search
• Question Answering
• Named Entity Recognition
• Link Discovery
• Machine Learning
Use RDF Data
E=MC²
Ranking Functions (Types)
Query: "Give me all persons"
Processing & Ranking: Retrieve (Persons), Sort, Answer
• Query dependent
• Query independent
Ranking RDF Data
[Timeline figure. Works shown: Page et al., Lee et al., Cheng et al. (Property), Thalhammer et al. Years shown: 1999, 2001, 2011, 2014. Label: Web of Data]
Benchmarks
DBtrends Benchmark (Marx, 2016)
• 60 users from different countries (USA, India)
• 9 entity ranking functions applied to the DBpedia knowledge base
• Users sort relevant classes, properties and entities extracted from the top twenty entities belonging to the top four classes
• Tasks were executed using Amazon Mechanical Turk
Previous Benchmarks
• Not publicly available
• Evaluate performance of 30 profiles
Why use query logs?
• Query logs provide relevant information about users' preferences
• They refer to real-world entities
E=MC²
Ranking using Query Logs
Questions
• How to map real-world entities to the Web of Data?
• How to measure their relevance?
• Where to find a good and trustworthy query log?
How to map real-world resources?
• Rocha et al. (2004)
• Ding et al. (2005)
• Hogan et al. (2006)
• Alsarem et al. (2015)
How to measure the resource's relevance?
[Figure: Query Logs (search...) and the Web of Data]
• Users search (more often) for things that are relevant
• Query logs register how often something is searched
• Query logs can be used to better estimate a resource's relevance by looking at how often it is searched
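The idea in these bullets can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the DBtrends implementation: the query log is assumed to be a plain list of search strings, the entity-to-label map is written by hand, `relevance_from_log` is a hypothetical helper name, and labels are matched by exact, case-insensitive comparison.

```python
from collections import Counter

def relevance_from_log(query_log, labels):
    """Estimate relevance as the share of logged queries that match an entity label.

    query_log: iterable of raw search strings (assumed format).
    labels: dict mapping an entity identifier to its label (assumed input).
    """
    counts = Counter(q.strip().lower() for q in query_log)
    total = sum(counts.values())
    if total == 0:
        return {entity: 0.0 for entity in labels}
    return {entity: counts[label.lower()] / total
            for entity, label in labels.items()}

# Toy data, invented for illustration only.
log = ["New York", "new york", "Leipzig", "New York ", "Berlin"]
labels = {"dbr:New_York_City": "New York", "dbr:Leipzig": "Leipzig"}

scores = relevance_from_log(log, labels)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Entities whose label is searched more often end up earlier in `ranking`. A real system would also need the label-to-entity mapping discussed earlier in the deck, which this sketch takes as given.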
Where to find a good and trustworthy query log?
• Public API
• Filters
Geographic: Country, State, City
Period: Day, Week, Month, Year
DBtrends Ranking Function
[Figure: the label “New York” of dbr:New_York_City is looked up in Trends; classes shown: dbo:City and dbo:Place. Numbers shown: 36, 18, 9. Step markers: 1 to 4]
• First, the labels of the entities are extracted and used to acquire the search history in query logs, e.g. Google Trends (steps 1-2)
• Thereafter, the entity ranks are used as a base to propagate the rank to the classes (steps 3-4)
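The two steps can be sketched as follows. The slides do not state how ranks are propagated, so summing the scores of a class's instances is purely an assumption of this sketch, and `propagate_to_classes` is a hypothetical name. The search-interest values are invented and hard-coded rather than fetched from any query-log service.

```python
from collections import defaultdict

def propagate_to_classes(entity_scores, entity_types):
    """Aggregate entity scores per class.

    Summation is an assumption of this sketch, not the documented
    DBtrends propagation rule.
    """
    class_scores = defaultdict(float)
    for entity, score in entity_scores.items():
        for cls in entity_types.get(entity, ()):
            class_scores[cls] += score
    return dict(class_scores)

# Steps 1-2: search interest obtained for each entity label (invented values).
entity_scores = {"dbr:New_York_City": 36.0, "dbr:Central_Park": 4.0}

# rdf:type statements for each entity (toy data).
entity_types = {
    "dbr:New_York_City": ["dbo:City", "dbo:Place"],
    "dbr:Central_Park": ["dbo:Place"],
}

# Steps 3-4: propagate the entity ranks to the classes.
class_scores = propagate_to_classes(entity_scores, entity_types)
class_ranking = sorted(class_scores, key=class_scores.get, reverse=True)
```

Other aggregations (mean, maximum, a decay per hierarchy level) would fit the same interface; which one matches the original function cannot be determined from the slides.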
Entity Ranking Functions
• DBtrends
• MIXED-RANK
• DB-IN
• DB-OUT
• DB-RANK
• PAGE-IN
• PAGE-OUT
• PAGE-RANK
• E-PAGE-IN
• SEO-PA
• SHARED-LINKS
+
Evaluation
Property/Class Ranking Functions
Property
• Relin
• RandomRank
• Instances
• Instances
Class
• Instances
• Instances
Results
Entity
• MIXED-RANK
• PAGE-RANK
• E-PAGE-IN
• SHARED-LINKS
• SEO-PA
• DB-OUT
• PAGE-IN
• DBtrends
• PAGE-OUT
• DB-IN
• DB-RANK
Discussion
• Functions that take external information into account provide more insight into a resource's relevance
• RDF links reflect natural connections rather than a resource's relevance
Discussion
• There is no pattern in the impact distribution of query logs
• Queries can (but do not necessarily) help to improve a ranking function
• Internal agreement ~63%
Results
Property
• RandomRank
• Relin
• Instances
• Instances
Class
• Instances
• Instances
Discussion
Property
• Internal agreement ~37%
• Ranks are very sparse
• Not conclusive
Discussion
Class
• dbo:PopulatedPlace
• dbo:Settlement
• dbo:Place
• owl:Thing
A simple sort can be very effective
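A simple sort of classes by their number of instances might look like the sketch below. It assumes the data is available as in-memory `(subject, predicate, object)` tuples using prefixed names; `rank_classes_by_instances` is a hypothetical helper and the triples are toy data, not DBpedia counts.

```python
from collections import Counter

RDF_TYPE = "rdf:type"

def rank_classes_by_instances(triples):
    """Rank classes by how many distinct subjects are typed with them."""
    # A set removes duplicate type statements so each instance counts once.
    typed = {(s, o) for s, p, o in triples if p == RDF_TYPE}
    counts = Counter(cls for _, cls in typed)
    return [cls for cls, _ in counts.most_common()]

# Toy triples, invented for illustration only.
triples = [
    ("dbr:New_York_City", RDF_TYPE, "dbo:Settlement"),
    ("dbr:New_York_City", RDF_TYPE, "dbo:PopulatedPlace"),
    ("dbr:New_York_City", RDF_TYPE, "dbo:Place"),
    ("dbr:Leipzig", RDF_TYPE, "dbo:PopulatedPlace"),
    ("dbr:Leipzig", RDF_TYPE, "dbo:Place"),
    ("dbr:Central_Park", RDF_TYPE, "dbo:Place"),
    ("dbr:New_York_City", "rdfs:label", "New York"),
]

ranking = rank_classes_by_instances(triples)
```

On a real knowledge base the same count would more likely be computed with an aggregate query against the store than by loading triples into memory.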
Discussion
Caveats
• Confidence in executing the tasks:
Indians 90%
Americans 60%
• Ranks produced by Indians were more sparse
• Abstract entities appear before entities
Summary
• Entity ranking functions produce better results when considering external information
• A simple sort by the number of instances can be very effective for ranking classes
• Query logs can (but do not necessarily) improve entity ranking functions
Future Works
• Extend the evaluation to other countries and ranking functions
• Evaluate the impact of context-aware ranking functions
• Use other similarity ranking functions