Entity Ranking Using Wikipedia as a Pivot
(CIKM '10) Rianne Kaptein, Pavel Serdyukov, Arjen de Vries, Jaap Kamps
2010/12/14 Yu-wen Hsu
Outline
- Introduction
- From Wikipedia Entities to Web Entities and back
- Entity Ranking on Wikipedia
- Entity Ranking on Web
- Conclusion
Introduction
- Entity ranking is the task of finding documents that represent entities of the correct type and are relevant to a query.
- The goal is to present a ranked list of entities directly, rather than a list of web pages with relevant but potentially redundant information about these entities.
Entity ranking differs from document retrieval on at least three points:
i) returned documents have to represent an entity
ii) this entity should belong to a specified entity type
iii) to create a diverse result list, an entity should only be returned once.
Main Goal
To rank Web entities:
1. Associate target entity types with the query
2. Rank Wikipedia pages according to their similarity with the query and target entity types
3. Find web entities corresponding to the Wikipedia entities
Using Wikipedia as a pivot
- Entities: Wikipedia pages
- The name of the entity: the title of the page
- The content of the page: the representation of the entity
- Each Wikipedia page is assigned to a number of categories: topical, type, and administrative categories.
From Wikipedia Entities to Web Entities and back
From Web to Wikipedia
- Do repositories such as Wikipedia provide enough clues to find the corresponding entities on the Web?
- Do they contain enough entities to cover the complete range of entities needed to satisfy all kinds of information needs?
From Wikipedia to Web
- Use external links
Entity Ranking on Wikipedia: Entity Types
Entity Type Assignment
- Exploit the existing Wikipedia categorization of documents
- Pseudo-relevance feedback: from the top retrieved documents, we extract the categories that are most frequently assigned
- We take the top 10 results and look at the 2 most frequently occurring categories belonging to these documents
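The pseudo-relevance feedback step above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function name `assign_entity_types`, the `doc_categories` mapping, and the alphabetical tie-breaking are assumptions made for the example. Only the cut-offs (top 10 results, 2 categories) come from the slide.

```python
from collections import Counter

def assign_entity_types(ranked_docs, doc_categories, top_docs=10, top_cats=2):
    """Return the categories assigned most often to the top-ranked documents.

    ranked_docs: document ids in baseline ranking order (hypothetical input).
    doc_categories: dict mapping a document id to its Wikipedia categories.
    """
    counts = Counter()
    for doc_id in ranked_docs[:top_docs]:
        # Count each category at most once per document.
        counts.update(set(doc_categories.get(doc_id, ())))
    # Sort by descending frequency; ties are broken alphabetically (an assumption)
    # so that the result is deterministic.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [category for category, _ in ordered[:top_cats]]

ranking = ["d1", "d2", "d3"]
categories = {
    "d1": ["National parks", "Protected areas"],
    "d2": ["National parks", "Forests"],
    "d3": ["National parks", "Protected areas"],
}
print(assign_entity_types(ranking, categories))
# ['National parks', 'Protected areas']
```

The returned categories then play the role of automatically assigned target entity types for the query.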
Entity Types: Scoring Entities
- Estimate background probabilities
- Smooth the probabilities of a term occurring in a category name with the background collection
Notation (the symbols themselves are not preserved in this transcript):
- the name of the category
- the category
- the query terms
- the document
- the entire Wikipedia document collection
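The smoothing formula from this slide is not preserved in the transcript. As a rough sketch only, the snippet below assumes linear interpolation (Jelinek-Mercer smoothing) between the category-name estimate and the background collection estimate. The function name, the argument names, and the weight `lam=0.9` are all hypothetical and may differ from what the authors used.

```python
from collections import Counter

def smoothed_term_prob(term, category_name_terms, collection_counts,
                       collection_size, lam=0.9):
    """Probability of `term` given a category name, smoothed with the collection.

    Assumed form: lam * P(term | category name) + (1 - lam) * P(term | collection).
    """
    name_counts = Counter(category_name_terms)
    if category_name_terms:
        p_name = name_counts[term] / len(category_name_terms)
    else:
        p_name = 0.0
    # Background probability: relative frequency of the term in the whole collection.
    p_background = collection_counts.get(term, 0) / collection_size
    return lam * p_name + (1.0 - lam) * p_background

# Toy numbers, invented for illustration.
name_terms = ["national", "parks"]
background = {"national": 50, "parks": 10, "forest": 40}
print(smoothed_term_prob("parks", name_terms, background, collection_size=100))
print(smoothed_term_prob("forest", name_terms, background, collection_size=100))
```

With smoothing, a term that never occurs in the category name ("forest" here) still receives a small non-zero probability from the background collection, which keeps later comparisons between categories well defined.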
Similarity between two categories
The entity type score for a document in relation to a query topic
Score Normalization
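The formulas for these three steps are not preserved in the transcript, so the following is a sketch under explicit assumptions rather than the authors' method. It assumes that category similarity is the negative KL divergence between smoothed term distributions, that a document's entity type score is its best score over all pairs of target and document categories, and that scores are min-max normalized. All function and variable names are hypothetical.

```python
import math

def category_similarity(p_target, p_doc_cat):
    """Negative KL divergence between two term distributions (assumed measure).

    Both arguments map terms to probabilities. p_doc_cat must be smoothed so
    that every term of p_target has a non-zero probability in it.
    """
    return -sum(
        p * math.log(p / p_doc_cat[term])
        for term, p in p_target.items()
        if p > 0.0
    )

def entity_type_score(target_cats, doc_cats, distributions):
    """Best similarity over all (target category, document category) pairs."""
    return max(
        category_similarity(distributions[ct], distributions[cd])
        for ct in target_cats
        for cd in doc_cats
    )

def min_max_normalize(scores):
    """Rescale a dict of scores to the range [0, 1] (assumed normalization)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

# Toy distributions, invented for illustration.
dists = {
    "A": {"x": 0.5, "y": 0.5},
    "B": {"x": 0.9, "y": 0.1},
}
print(entity_type_score(["A"], ["A", "B"], dists))
print(min_max_normalize({"d1": -2.0, "d2": 0.0, "d3": -1.0}))
```

Under these assumptions, identical categories score 0 and all other pairs score below 0, so a document that shares a category with the target types gets the highest possible entity type score. Normalizing puts this score on a scale comparable with the retrieval score before the two are combined.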
Entity Ranking on Wikipedia: Experimental Setup
Data sets:
- INEX: specific entity types, e.g. countries, national parks
- TREC: general entity types (people, organizations, products)
  - Advantage: clear, with few options, so a type can be easily selected
  - Disadvantage: covers only a small part of all possible entity ranking queries
  - More specific entity types were therefore assigned manually
- Rerank the top 2,500 results of the baseline using entity types that are either:
  - manually assigned (author)
  - automatically assigned (PRF)
- Evaluation measures:
  - TREC 2009: P10 and NDCG@20
  - INEX: P10 and MAP
- Topic sets:
  - INEX 2006-2008: 79 topics
  - INEX 2009: a selection of 55 topics from the 2006-2008 topics
- Only the so-called 'primary' pages are counted.
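For readers unfamiliar with the measures named above, here is a minimal sketch of precision at 10 (P10) and of average precision, whose mean over all topics gives MAP. These are the standard textbook definitions. The function names and the toy data are made up for the example, and official evaluations would normally use the track's own evaluation tooling.

```python
def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k results that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k

def average_precision(ranked, relevant):
    """Mean of the precision values at the rank of each relevant result."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """MAP: average of average_precision over (ranked, relevant) pairs, one per topic."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

ranked = ["a", "b", "c", "d"]
relevant = {"a", "c"}
print(precision_at_k(ranked, relevant, k=2))   # 0.5
print(average_precision(ranked, relevant))     # (1/1 + 2/3) / 2
```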
Entity Ranking on the Web
We have three approaches for finding web pages associated with Wikipedia pages:
1. External links: use the External links section of the Wikipedia page.
2. Anchor text: use the Wikipedia page title as a query to retrieve pages from the anchor text index.
3. Combined: not all Wikipedia pages have external links, and not all external links of Wikipedia pages are part of the ClueWeb collection. When fewer than 3 web pages are found, we fill up the results to 3 pages using the top pages retrieved using anchor text.
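The combined approach can be sketched as follows. This is an illustrative reading of the slide, not the authors' code: the function name, the `in_collection` set standing in for the ClueWeb collection, and the example URLs are all hypothetical.

```python
def combined_pages(external_links, anchor_results, in_collection, min_pages=3):
    """Combine external links with anchor-text results for one Wikipedia page.

    external_links: URLs from the page's External links section.
    anchor_results: URLs retrieved from the anchor text index, best first.
    in_collection: set of URLs that exist in the web collection.
    """
    # Keep only external links that are actually part of the web collection.
    pages = [url for url in external_links if url in in_collection]
    # Fill up to min_pages with the top anchor-text results, skipping duplicates.
    for url in anchor_results:
        if len(pages) >= min_pages:
            break
        if url not in pages:
            pages.append(url)
    return pages

collection = {"http://a.example", "http://b.example",
              "http://c.example", "http://d.example"}
external = ["http://a.example", "http://x.example"]  # x is not in the collection
anchor = ["http://a.example", "http://b.example",
          "http://c.example", "http://d.example"]
print(combined_pages(external, anchor, collection))
# ['http://a.example', 'http://b.example', 'http://c.example']
```

External links come first in this sketch, and anchor-text results are only used to fill the list when fewer than 3 pages are found.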
Conclusion
Our experiments show that our Wikipedia-as-a-pivot approach outperforms a baseline of full-text search.
Both external links on Wikipedia pages and searching an anchor text index of the web are effective approaches for finding homepages of entities represented by Wikipedia pages.