Effective Latent Space Graph-based Re-ranking Model with Global Consistency

28
1 Effective Latent Space Graph- based Re-ranking Model with Global Consistency Hongbo Deng, Michael R. Lyu and Irwin King Department of Computer Science and Engineering The Chinese University of Hong Kong Feb. 12, 2009 WSDM 2009

description

WSDM 2009. Effective Latent Space Graph-based Re-ranking Model with Global Consistency. Hongbo Deng, Michael R. Lyu and Irwin King Department of Computer Science and Engineering The Chinese University of Hong Kong Feb. 12, 2009. Outline. Introduction Related work Methodology - PowerPoint PPT Presentation

Transcript of Effective Latent Space Graph-based Re-ranking Model with Global Consistency

Page 1: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

1

Effective Latent Space Graph-based Re-ranking Model with Global Consistency

Hongbo Deng, Michael R. Lyu and Irwin King

Department of Computer Science and Engineering

The Chinese University of Hong Kong

Feb. 12, 2009

WSDM 2009

Page 2: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

2Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Outline

Introduction Related work Methodology

Graph-based re-ranking model Learning a latent space graph A case study and the overall algorithm

Experiments Conclusions and Future Work

Page 3: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

3Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Introduction

Problem definition Given a set of documents D A term vector di = xi

Relevance scores using VSM or LM A connected graph

Explicit link (e.g., hyperlinks) Implicit link (e.g., inferred from the content information)

Many other features

How to leverage the interconnection between documents/entities to improve the

ranking of retrieved results

with respect to the query?

d1

d1

d2

d3

d4d5

q

d1

d2

d3

d4d5

q

d1

d2

d3

d4d5

d2

d3

d4

d5

d2

d1

d3

d4

d5

Page 4: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

4Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Introduction

Initial ranking scores: relevance Graph structure: centrality (importance, authority) Simple method: Combine those two parts linearly Limitations:

Do not make full use of the information Treat each of them individually

What we have done? Propose a joint regularization framework Combine the content with link information in a latent

space graph

dnq

Page 5: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

5Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Related work

Using some variations of PageRank and HITS Centrality within graphs (Kurland

and Lee, SIGIR’05 & SIGIR’ 06) Improve Web search results using

affinity graph (Zhang et al., SIGIR’05)

Improve an initial ranking by random walk in entity-relation networks (Minkov et al., SIGIR’06)

Structural re-ranking model

Regularization framework

Learning a latent space

Linear combination, treat the content and link individually

Structural re-ranking model

Regularization framework Graph Laplacians for label propagation

(two classes) (Zhu et al., ICML’03, Zhou et al., NIPS’03)

Extent the graph harmonic function to multiple classes (Mei et al., WWW’08)

Score regularization to adjust ad-hoc retrieval scores (Diaz, CIKM’05)

Enhance learning to rank with parameterized regularization models (Qin et al., WWW’08)

Query-independent settingsDo not consider multiple relationships between objects.

Learning a latent space Latent Semantic Analysis (LSA)

(Deerwester et al., JASIS’90) Probabilistic LSI (pLSI) (Hofmann,

SIGIR’99) pLSI + PHITS (Cohn and Hofmann,

NIPS’00) Combine content and link for

classification using matrix factorization (Zhu et al., SIGIR’07)

Learning a latent space

Regularization framework

Use the joint factorization to learning the latent feature.

Difference: leverage the latent feature for building a latent space graph.

Page 6: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

6Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Methodology

+

Graph-based re-ranking model

Learning a latent space graph

Case study:Expert finding

Graph-based re-ranking model

Page 7: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

7Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Graph-based re-ranking model

Intuition: Global consistency: similar documents are most likely

to have similar ranking scores with respect to a query. The initial ranking scores provides invaluable

information Regularization framework

Global consistency Fit initial scores

Parameter

III. Methodology

Page 8: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

8Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Optimization problem

A closed-form solution

Connection with other methods μα 0, return the initial scores μα 1, a variation of PageRank-based model μα (0, 1), combine both information simultaneously∈

Graph-based re-ranking modelIII. Methodology

Page 9: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

9Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Learning a latent space graph

Methodology

+

Graph-based re-ranking model

Case study:Expert finding

Page 10: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

10Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Learning a latent space graph

Objective: incorporate the content with link information (or relational data) simultaneously Latent Semantic Analysis Joint factorization

Combine the content with relational data Build latent space graph

Calculate the weight matrix W

III. Methodology

Page 11: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

11Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Latent Semantic Analysis

Map documents to vector space of reduced dimensionality

SVD is performed on the matrix

The largest k singular values

Reformulated as an optimization problem

III. Methodology - Learning a latent space graph

Page 12: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

12Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Embedding multiple relational data Taking the papers as an

example Paper-term matrix C Paper-author matrix A

A unified optimization problem

C

NxK

+

X VC VA

III. Methodology - Learning a latent space graph

A NxMNxL Conjugate Gradient

Page 13: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

13Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Build latent space graph

The edge weight wij is defined

W

III. Methodology - Learning a latent space graph

Page 14: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

14Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Methodology

+

Graph-based re-ranking model

Learning a latent space graph

Case study:Expert finding

Page 15: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

15Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Case study: Application to expert finding

Utilize statistical language model to calculate the initial ranking scores The probability of a query given a document

Infer a document model θd for each document The probability of the query generated by the document

model θd

The product of terms generated by the document model (Assumption: each term are independent)

III. Methodology

Page 16: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

16Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Case study: Application to expert finding

Expert Finding: Identify a list of experts in the academic field for a given

query topic (e.g., “data mining” “Jiawei Han, etc”) Publications as representative of their expertise Use DBLP dataset to obtain the publications

Authors have expertise in the topic of their papers Overall aggregation of their publications

Refine the ranking scores of papers, then aggregate the refined scores to re-rank the experts

III. Methodology

Page 17: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

17Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Case study: Application to expert findingIII. Methodology

Page 18: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

18Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Experiments

DBLP Collection A subset of the DBLP records (15-CONF) Statistics of the 15-CONF collection

Page 19: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

19Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Benchmark Dataset

A benchmark dataset with 16 topics and expert lists

IV. Experiments

Page 20: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

20Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Evaluation Metrics

Precision at rank n (P@n):

Mean Average Precision (MAP):

Bpref: The score function of the number of non-relevant candidates

IV. Experiments

Page 21: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

21Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Preliminary Experiments

Evaluation results (%)

PRRM may not improve the performance GBRM achieve the best results

IV. Experiments

Page 22: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

22Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Details of the resultsIV. Experiments

Page 23: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

23Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Effect of parameter μα

μα 0, return the initial scores (baseline)

μα 1, discard the initial scores, consider the global consistency over the graph

IV. Experiments

Page 24: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

24Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Effect of parameter μα

Robust, achieve the best results when μα (0.5, 0.7)

IV. Experiments

Page 25: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

25Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Effect of graph construction

Different dimensionality (kd) of the latent feature, which is used to calculate the weight matrix W

Become better for greater kd, because higher dimensional space can better capture the similarities

kd, = 50 achieve better results than tf.idf

IV. Experiments

Page 26: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

26Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Effect of graph construction

Different number of nearest neighbors (knn)

Tends to degrade a little with increasing knn knn = 10 achieve the best results Average processing time: increase linearly with the

increase of knn

IV. Experiments

Page 27: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

27Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Conclusions and Future Work

Conclusions Leverage the graph-based model for the query-dependent

ranking problem Integrate the latent space with the graph-based re-ranking

model Address expert finding task in a the academic field using the

proposed method The improvement in our proposed model is promising

Future work Extend our framework to consider more features Apply the framework to other applications and large-scale

dataset

Page 28: Effective Latent Space Graph-based  Re-ranking Model with Global Consistency

28Hongbo Deng, Michael R. Lyu and Irwin KingDepartment of Computer Science and Engineering

The Chinese University of Hong Kong WSDM 2009

Q&A

Thanks!