LiNE, NATIONAL SEMINAR ORGANIZED BY KULISAA, 15.01.2015

ENTITY SEARCH ON THE WEB

Tanmay Mondal, MSLIS student, DRTC, Bangalore, Karnataka-560059



Information

• A huge amount of information is held in various databases

• It is growing exponentially

• A document collection is the set of all web pages indexed by search engines

TIME FACTOR

• A traditional information extraction approach is to scan every document in a collection

• Search options: OR, AND, *, NOT, "..."

• Finding the right information at the right time is time-consuming for users
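The search options listed above (OR, AND, *, NOT, "...") can be sketched over a small inverted index. This is only a minimal illustration: the three-document collection and the helper names (`term`, `and_`, `or_`, `not_`, `prefix`, `phrase`) are assumptions made for the example, not part of any particular search engine.

```python
from collections import defaultdict

# Hypothetical three-document collection, used only for illustration.
DOCS = {
    1: "entity search on the web",
    2: "document search with boolean operators",
    3: "web of entities and semantic relations",
}


def build_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


INDEX = build_index(DOCS)
ALL_IDS = set(DOCS)


def term(word):
    """Documents containing the exact term."""
    return set(INDEX.get(word.lower(), set()))


def and_(a, b):
    """AND: documents present in both result sets."""
    return a & b


def or_(a, b):
    """OR: documents present in either result set."""
    return a | b


def not_(a):
    """NOT: documents of the collection that are absent from the result set."""
    return ALL_IDS - a


def prefix(stem):
    """Wildcard (stem*): documents containing any term that starts with stem."""
    hits = set()
    for word, doc_ids in INDEX.items():
        if word.startswith(stem.lower()):
            hits |= doc_ids
    return hits


def phrase(words):
    """Quoted phrase: documents containing the words as a contiguous sequence."""
    needle = f" {' '.join(words.lower().split())} "
    return {doc_id for doc_id, text in DOCS.items() if needle in f" {text.lower()} "}


print(and_(term("search"), not_(term("web"))))  # search AND NOT web
print(prefix("entit"))                          # entit*
print(phrase("on the web"))                     # "on the web"
```

Every operator still returns documents, so the user has to read them to find the actual answer; this is the time cost the slide points at.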

• Person

• Location

• Organization

• Nationality

• Religion

• Product

• Phone Number

• Email Address/URL

• Distance

• Date

• Time

• Money

• Generic Number

Query: “Countries where I can pay in Euro”

Results: Germany, Spain, Italy, etc.
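The Euro query can be sketched as a lookup over typed entities with attributes, so the result is a list of entities rather than a list of pages. The small knowledge base and the function name `find_entities` are assumptions made for this illustration only.

```python
# Tiny hand-written knowledge base (an assumption for this sketch);
# a real entity search engine would hold a very large entity store.
ENTITIES = [
    {"name": "Germany", "type": "Country", "currency": "Euro"},
    {"name": "Spain", "type": "Country", "currency": "Euro"},
    {"name": "Italy", "type": "Country", "currency": "Euro"},
    {"name": "India", "type": "Country", "currency": "Indian rupee"},
    {"name": "Japan", "type": "Country", "currency": "Yen"},
]


def find_entities(entity_type, **attributes):
    """Return the names of entities of the given type whose attributes all match."""
    return [
        entity["name"]
        for entity in ENTITIES
        if entity["type"] == entity_type
        and all(entity.get(key) == value for key, value in attributes.items())
    ]


# "Countries where I can pay in Euro"
results = find_entities("Country", currency="Euro")
print(results)  # ['Germany', 'Spain', 'Italy']
```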

OUR QUERY: for specific information

When people use retrieval systems, they are often not searching for documents or text passages

Named entities (persons, organizations, locations, products, ...) play a central role in answering such information needs

At least 20-30% of the queries submitted to Web search engines are simply entities

~71% of Web search queries contain named entities

Entity Search

An entity is any object or thing that can be uniquely identified in the real world

Entity search presents a ranked list of entities directly, rather than a list of web pages

It better matches search queries against a database containing hundreds of millions of "entities" (people, places, organizations) and their semantic relations.

Entities are everywhere

Web of Documents ==>> Web of Entities

Entity & its facets

An entity must be distinguishable from other entities

Type of an entity refers to a generic class into which the given entity is classified.

Attribute refers to a property (predicate) associated with an entity.

Value refers to the value of an attribute (for a given entity).

Relation connects an entity with other entities and so provides more information about it

Entity: Prof. S.R. Ranganathan is a person, IBM is an organization
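The facets above (type, attribute, value, relation) can be sketched as a small data structure. The class name `Entity`, the identifiers `e1` to `e3`, and the specific attribute values and relation are assumptions chosen for illustration; only the two example entities and their types come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    identifier: str   # distinguishes this entity from all others
    name: str
    type: str         # generic class, e.g. "Person" or "Organization"
    attributes: dict = field(default_factory=dict)  # attribute -> value
    relations: list = field(default_factory=list)   # (relation, target identifier)

    def relate(self, relation, other):
        """Record a named relation from this entity to another entity."""
        self.relations.append((relation, other.identifier))


# The two entities from the slide; attribute values are illustrative.
ranganathan = Entity("e1", "Prof. S.R. Ranganathan", "Person",
                     {"nationality": "Indian"})
ibm = Entity("e2", "IBM", "Organization",
             {"industry": "Information technology"})

# An illustrative relation to a third entity.
drtc = Entity("e3", "DRTC", "Organization", {"location": "Bangalore"})
ranganathan.relate("founded", drtc)

print(ranganathan.type, ranganathan.attributes["nationality"], ranganathan.relations)
```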

Examples

Popular Entity Search:

• Product search: various products like books, electronics, clothes, etc.

• People search: experts, friends, profiles of famous persons, etc.

• Location search: addresses, businesses, government offices, etc.

• And many more searches based on entities only...


Idea about entity search engine

Main Work of ESE

Entity Retrieval: entity search engines return a ranked list of the entities most relevant to a user query

Entity Relationship / Fact Mining and Navigation: discovers interesting relationships and facts about the entities associated with the query

Prominence Ranking: detects the popularity of an entity and enables users to browse entities in different categories

Entity Description Retrieval: retrieves the description blocks for each entity; information about an object in a web page is generally grouped together as an object block
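Entity retrieval and prominence ranking can be sketched together: score each entity by how many query terms appear in its description, then break ties by prominence. This is a deliberately simple stand-in for a real ranking model. The prominence numbers are invented for the example, and `rank_entities` is a hypothetical helper name.

```python
# Illustrative entity records; the prominence scores are made-up values.
ENTITIES = [
    {"name": "WorldCat", "description": "library catalogue of books", "prominence": 0.6},
    {"name": "IMDb", "description": "database of films and television", "prominence": 0.8},
    {"name": "GeoNames", "description": "database of geographical names", "prominence": 0.5},
]


def tokenize(text):
    """Lowercase the text and split it into a set of terms."""
    return set(text.lower().split())


def rank_entities(query, entities):
    """Return entity names ordered by term overlap, then by prominence."""
    query_terms = tokenize(query)
    scored = []
    for entity in entities:
        overlap = len(query_terms & tokenize(entity["description"]))
        if overlap:  # keep only entities that match at least one query term
            scored.append((overlap, entity["prominence"], entity["name"]))
    # Highest overlap first, then highest prominence, then name for stability.
    scored.sort(key=lambda item: (-item[0], -item[1], item[2]))
    return [name for _, _, name in scored]


print(rank_entities("geographical database", ENTITIES))
print(rank_entities("books database", ENTITIES))
```

In the second query every entity matches exactly one term, so the order is decided by prominence alone.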

FEATURES OF ENTITY SEARCH ENGINE (ESE)

• ESE provides explicit and easily understandable information

• It gathers/aggregates information from different sources and keeps it in one place

• It extracts entities in a structured form

• It understands the meaning of our query

• It provides the most useful information about entities

• It shows important relations with other entities

• Duplication of entities can be avoided

contd...

• More structured than document-based search

• Based on different categories, we can search entities

• Users don’t need to visit different sites for a particular entity

• It helps users to retrieve pinpointed answers without wasting much time

• It provides sources of information for detailed or document information
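Two of the features above, aggregating information from different sources and avoiding duplicate entities, can be sketched as a merge keyed on a normalized name. Real systems need far more careful matching. The source labels, the records, and the helper names `normalize` and `aggregate` are assumptions for illustration.

```python
def normalize(name):
    """Normalize a name so trivially different spellings map to one key."""
    return " ".join(name.lower().split())


def aggregate(records):
    """Merge records about the same entity and remember where each came from."""
    merged = {}
    for record in records:
        key = normalize(record["name"])
        entry = merged.setdefault(
            key, {"name": record["name"].strip(), "attributes": {}, "sources": []}
        )
        for attribute, value in record["attributes"].items():
            # Keep the first value seen for each attribute.
            entry["attributes"].setdefault(attribute, value)
        if record["source"] not in entry["sources"]:
            entry["sources"].append(record["source"])
    return list(merged.values())


# Hypothetical records harvested from two sources.
records = [
    {"name": "IBM", "source": "source-a", "attributes": {"type": "Organization"}},
    {"name": " ibm ", "source": "source-b", "attributes": {"headquarters": "Armonk, New York"}},
    {"name": "DRTC", "source": "source-a", "attributes": {"location": "Bangalore"}},
]

entities = aggregate(records)
for entity in entities:
    print(entity["name"], entity["attributes"], entity["sources"])
```

Keeping the `sources` list alongside the merged attributes corresponds to the last feature: the user can still reach the original documents for detail.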

Different ESE

• Entity Relationship Query (http://idir.uta.edu/erq/)

• EntityCube (http://entitycube.research.microsoft.com/)

• Okkam Entity Name System (http://api.okkam.org/)

• Yatedo (http://www.yatedo.com/)

• DBpedia (http://dbpedia.org/About)

• WorldCat (https://www.worldcat.org/)

• GeoNames (http://www.geonames.org/)

• WolframAlpha (http://www.wolframalpha.com/)

• GeneView (http://bc3.informatik.hu-berlin.de/)

• Sindice (http://sindice.com/)

• IMDb (http://www.imdb.com)

OKKAM ENS

Entity identifiers should not be multiplied beyond necessity

Every entity (individual, instance, “thing”) is assigned a global identifier, ideally unique

A repository of more than 7.5 million entities in a more structured form

Sources of Information

1. Wikipedia: provides lists of different types of entities

2. GeoNames: contains over a million geographical names

3. OkkamDBManager: databases like extranets, online shops or publishing houses

4. OkkamManualEntry: insert new entities in a manual way
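The idea of one global identifier per entity, never multiplied beyond necessity, can be sketched as a registry that mints an identifier on first sight and returns the same one afterwards. This is not the OKKAM API: the class `EntityRegistry`, its method name, and the `urn:uuid:` identifier format are assumptions made for the example.

```python
import uuid


class EntityRegistry:
    """Hypothetical in-memory registry: one identifier per (name, type) pair."""

    def __init__(self):
        self._identifiers = {}

    def identifier_for(self, name, entity_type):
        """Return the existing identifier for an entity, minting one if needed."""
        key = (name.strip().lower(), entity_type.strip().lower())
        if key not in self._identifiers:
            # Mint a new identifier only when the entity is not yet known.
            self._identifiers[key] = f"urn:uuid:{uuid.uuid4()}"
        return self._identifiers[key]


registry = EntityRegistry()
first = registry.identifier_for("IBM", "Organization")
again = registry.identifier_for(" ibm ", "organization")
other = registry.identifier_for("Bangalore", "Location")

print(first == again)  # same entity, same identifier
print(first == other)  # different entity, different identifier
```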

Transforming unstructured information into structured data

Wolfram|Alpha

Wolfram|Alpha is an engine for computing answers and providing knowledge

It generates output by doing computations from its own internal knowledge base, instead of searching the web and returning links

It is an online service that answers factual queries directly by computing the answer

Its goal is to make all systematic knowledge immediately computable and accessible to everyone

Example query: 5 nearest stars

Example query: weather report of India in 2010

My Library

Entities are for use

Each entity has its own attributes & relations

Every entity has its importance

Save the time spent finding entities

Entities are growing rapidly