New Approach To Personal Network Search Based On Information Extraction (Tin180 Com)


Page 1: New Approach To Personal Network Search Based On Information Extraction  (Tin180 Com)

Personal Social Network: A New Approach to Personal Network Search based on Information Extraction

Jie Tang, Mingcai Hong, Jing Zhang, Bangyong Liang, and Juanzi Li

Knowledge Engineering Group, Department of Computer Science and Technology, Tsinghua University

Sep. 5th, 2006

Page 2

Personal Social Network

• Personal social networks are an important research area.
• A person usually has different types of information:
  – Personal profile (including portrait, homepage, position, affiliation, publications, and documents)
  – Contact information (including address, email, telephone, and fax number)
  – Friends
• Unfortunately, this information is often hidden in heterogeneous and distributed web pages.

Page 3

Our Approach

• Personal Social Network = Building + Search + Mining
  – Building: doc collection, annotation, integration
  – Search: person search, publication search, association search
  – Mining: expert finding, research interest finding

Page 4

Processing Flow

[Diagram: Query (submitted to) → Returned pages (fed to) → Classification Model (extracting and saving to) → Ontology base]
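The flow on this slide can be sketched as a small pipeline. This is only an illustration under assumptions: `search_web`, `looks_like_homepage`, and `extract_profile` are hypothetical stand-ins (the slides do not name the search engine, the classifier, or the ontology schema), the pages are made-up data, and a plain dict plays the role of the ontology base.

```python
import re
from typing import Dict, List

# Hypothetical in-memory "web": URL -> page text (made-up data, not from the slides).
FAKE_WEB = {
    "http://example.org/~alice": "Alice Smith homepage. Email: alice@example.org",
    "http://example.org/news": "Conference news mentioning Alice Smith.",
}

def search_web(query: str) -> List[str]:
    """Stand-in for the search step: return URLs whose text contains the query."""
    return [url for url, text in FAKE_WEB.items() if query.lower() in text.lower()]

def looks_like_homepage(text: str) -> bool:
    """Stand-in for the classification model (a keyword rule, not a trained model)."""
    return "homepage" in text.lower()

def extract_profile(text: str) -> Dict[str, str]:
    """Stand-in for extraction: pull out an email address if one is present."""
    match = re.search(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", text)
    return {"email": match.group(0)} if match else {}

def process_query(query: str, ontology_base: Dict[str, Dict[str, str]]) -> None:
    """Query -> returned pages -> classification -> extraction -> ontology base."""
    for url in search_web(query):
        text = FAKE_WEB[url]
        if looks_like_homepage(text):
            record = extract_profile(text)
            record["source"] = url
            ontology_base[query] = record

ontology_base: Dict[str, Dict[str, str]] = {}
process_query("Alice Smith", ontology_base)
```

Only the page classified as a homepage contributes a record; the news page is returned by the search step but filtered out before extraction.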

Page 5

Building the Personal Network

• >400,000 persons
• >700,000 publications

Page 6

Annotation using SVMs

• Personal profile: e.g. image, affiliation, etc.
• Contact information: fax, email, phone, etc.

[Diagram: two SVM models, a start line model and an end line model. Training data and test data each pass through feature extraction (position feature, positive word feature, ...), producing a start line feature set and an end line feature set. The start position model and the end position model are applied to the feature sets to produce the identified blocks of information.]
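The two-model scheme on this slide (one model for start lines, one for end lines) can be illustrated roughly as follows. This is a sketch under assumptions, not the authors' implementation: the keyword list, the four features, and the function names are hypothetical, and hand-set linear weights stand in for the decision functions that trained SVMs would provide.

```python
from typing import List, Tuple

# Hypothetical "positive words" hinting at contact information (assumed list).
POSITIVE_WORDS = ("email", "phone", "fax", "address", "tel")

def line_features(lines: List[str], i: int) -> List[float]:
    """Features for line i: relative position, then positive-word flags for
    this line, the previous line, and the next line."""
    def has_word(j: int) -> float:
        if 0 <= j < len(lines):
            return float(any(w in lines[j].lower() for w in POSITIVE_WORDS))
        return 0.0
    position = i / max(len(lines) - 1, 1)
    return [position, has_word(i), has_word(i - 1), has_word(i + 1)]

# Hand-set linear weights standing in for two trained SVMs (w . x + b > 0).
# The position feature is extracted but given zero weight in this toy setup.
START_W, START_B = [0.0, 1.0, -1.0, 0.0], -0.5  # keyword here, none just before
END_W, END_B = [0.0, 1.0, 0.0, -1.0], -0.5      # keyword here, none just after

def decide(w: List[float], b: float, x: List[float]) -> bool:
    """Linear decision function of the kind a linear SVM would learn."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b > 0

def identify_blocks(lines: List[str]) -> List[Tuple[int, int]]:
    """Pair each predicted start line with the nearest following end line."""
    blocks, start = [], None
    for i in range(len(lines)):
        x = line_features(lines, i)
        if start is None and decide(START_W, START_B, x):
            start = i
        if start is not None and decide(END_W, END_B, x):
            blocks.append((start, i))
            start = None
    return blocks

# Made-up page content for illustration.
page = [
    "Welcome to my homepage",
    "Email: someone@example.org",
    "Phone: +00 000 0000",
    "Fax: +00 000 0001",
    "Research interests: information extraction",
]
blocks = identify_blocks(page)
```

Here `blocks` holds (start line, end line) index pairs. In a trained setting the two weight vectors would be learned from annotated pages rather than set by hand.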

Page 7

Person Search

Search for a person using the name or other information, e.g. affiliation

Page 8

Publication Search

Searching for a publication using an IR model
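The slide does not say which IR model is used. As one plausible illustration, the sketch below ranks publication titles against a query with a TF-IDF vector space model and cosine similarity; the function names and the sample titles are assumptions made for the example.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer (deliberately simplistic)."""
    return text.lower().split()

def build_index(titles: List[str]) -> Tuple[Dict[str, float], List[Dict[str, float]]]:
    """Compute smoothed IDF weights and one TF-IDF vector per title."""
    tokenized = [tokenize(t) for t in titles]
    n = len(titles)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(toks).items()} for toks in tokenized]
    return idf, vectors

def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query: str, titles: List[str], top_k: int = 3) -> List[str]:
    """Return up to top_k titles with non-zero similarity, best first."""
    idf, vectors = build_index(titles)
    q = {t: c * idf.get(t, 0.0) for t, c in Counter(tokenize(query)).items()}
    scored = [(cosine(q, v), title) for v, title in zip(vectors, titles)]
    scored = [(s, t) for s, t in scored if s > 0]
    scored.sort(key=lambda pair: -pair[0])
    return [t for _, t in scored[:top_k]]

# Made-up titles for illustration.
titles = [
    "information extraction from web pages",
    "support vector machines for text classification",
    "social network search",
]
results = search("information extraction", titles)
```

Query terms that never occur in the collection get zero weight, so titles sharing no indexed term with the query are dropped rather than returned with a zero score.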

Page 9

Publication Online-View

Page 10

Association Search

Finding associations between persons
- high efficiency
- Top-K associations

Usage:
- to find a partner
- to find a person with the same interests
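One way to read "Top-K associations" is as the K shortest connection paths between two persons in a relationship graph (for example, co-authorship). The slides do not describe the actual algorithm, so the following is only a sketch under that assumption; the graph, the names, and the `max_hops` bound are hypothetical.

```python
import heapq
from typing import Dict, List, Set

def top_k_associations(
    graph: Dict[str, Set[str]],
    source: str,
    target: str,
    k: int,
    max_hops: int = 4,
) -> List[List[str]]:
    """Return up to k simple paths from source to target, shortest first.

    Best-first search over partial paths; max_hops bounds the search so it
    stays cheap on large graphs (an assumed efficiency measure).
    """
    results: List[List[str]] = []
    heap = [(0, [source])]  # (number of hops, path so far)
    while heap and len(results) < k:
        hops, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            results.append(path)
            continue
        if hops == max_hops:
            continue
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in path:  # keep paths simple (no repeated person)
                heapq.heappush(heap, (hops + 1, path + [neighbor]))
    return results

# Made-up undirected relationship graph.
graph = {
    "Ann": {"Bob", "Cat"},
    "Bob": {"Ann", "Dan"},
    "Cat": {"Ann", "Dan"},
    "Dan": {"Bob", "Cat", "Eve"},
    "Eve": {"Dan"},
}
paths = top_k_associations(graph, "Ann", "Dan", k=2)
```

Because paths are popped in order of hop count, the search can stop as soon as K paths reach the target, without enumerating longer ones.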

Page 11

Expert Finding

Finding experts on a topic
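The slide gives no detail on how experts are ranked. A minimal sketch, assuming that expertise can be approximated by how strongly a person's publication titles match the topic, might look like this; the scoring rule, `find_experts`, and the sample records are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def find_experts(
    publications: List[Tuple[List[str], str]],
    topic: str,
    top_k: int = 3,
) -> List[str]:
    """Rank authors by summed topic-term overlap across their publication titles."""
    topic_terms = set(topic.lower().split())
    if not topic_terms:
        return []
    scores: Dict[str, float] = defaultdict(float)
    for authors, title in publications:
        title_terms = set(title.lower().split())
        # Fraction of topic terms that appear in this title (assumed scoring rule).
        overlap = len(topic_terms & title_terms) / len(topic_terms)
        if overlap > 0:
            for author in authors:
                scores[author] += overlap
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:top_k]]

# Made-up (authors, title) records for illustration.
publications = [
    (["Author A", "Author B"], "information extraction from the web"),
    (["Author A"], "semantic annotation and information extraction"),
    (["Author C"], "graph search algorithms"),
]
experts = find_experts(publications, "information extraction")
```

A fuller system would likely also weigh citations, venues, or the association graph, none of which is modeled here.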

Page 12

Research Interest Finding

Finding research interests for a person

Page 13

Homepage:

http://keg.cs.tsinghua.edu.cn/persons/tj