Orthogonal query recommendation (RecSys 2013)

Post on 16-Jul-2015

Orthogonal Query Recommendation

(Puya) Hossein Vahabi 1, Margareta Ackerman 2, David Loker 3, Ricardo Baeza-Yates 4, Alejandro Lopez-Ortiz 3

1 Microblr Ltd, UK & Italy

2 CalTech, USA

3 University of Waterloo, Canada

4 Yahoo! Research Lab, Barcelona, Spain

ACM RecSys 2013

The 7th ACM Recommender Systems Conference

Outline

Motivation

Orthogonal Queries

How to find Orthogonal Queries efficiently?

Recommendation algorithm and caching schemes

Evaluation

Conclusions

MOTIVATION

If a user is not satisfied with the answers, a query recommendation can be useful.

Query recommendation: to click or not to click?

Sometimes users know what they are looking for, but they don't know which keywords to use:

“Daisy Duke” but looking for “Catherine Bach”

“diet supplement” but looking for “body building supplements”

Traditional query recommendation algorithms fail. Why?

Why do traditional query recommendations fail?

Because they are looking for highly related queries, while I'm looking for FAA (Federal Aviation Administration).

No results for queries in the long tail: NO RESULTS.

How to deal with poorly formulated or long-tail queries?

ORTHOGONAL QUERIES

Definition: Orthogonal Queries

«Orthogonal queries are related queries that have (almost) no common terms with the user's query.»

HOW TO FIND ORTHOGONAL QUERIES EFFICIENTLY?

[Figures: Term Overlap and Result Overlap]

Examples: Orthogonal Queries

Result overlap and term overlap, considering 100 results per query.

Same keywords: result overlap of 0.408.

Different keywords: result overlap of 0.02.
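As a minimal sketch of how the two measures might be computed: the slides do not spell out the formulas, so the code below assumes both overlaps are Jaccard coefficients, one over query terms and one over the top 100 result URLs. The function names and the sample result lists are hypothetical.

```python
def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def term_overlap(query_a, query_b):
    """Overlap between the lower-cased, whitespace-split terms of two queries."""
    return jaccard(query_a.lower().split(), query_b.lower().split())


def result_overlap(results_a, results_b, k=100):
    """Overlap between the top-k result URLs returned for two queries."""
    return jaccard(results_a[:k], results_b[:k])


# Hypothetical result lists, for illustration only (not data from the slides).
results_first = ["url1", "url2", "url3", "url4"]
results_second = ["url3", "url4", "url5", "url6"]

print(term_overlap("daisy duke", "catherine bach"))   # no shared terms
print(result_overlap(results_first, results_second))  # 2 shared URLs out of 6
```

Under this assumption, two queries can share no terms at all and still have a non-zero result overlap, which is the situation the definition of orthogonal queries is after.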

Range Identification

ResultOverlap vs Average TermOverlap

Pearson correlation coefficient: 0.567, indicating a significant positive correlation.

TermOverlap vs Average ResultOverlap

Pearson correlation coefficient: 0.686, indicating a strong positive correlation.
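For readers who want to reproduce this kind of analysis on their own query log, here is a small sketch of a Pearson correlation between one overlap measure and the bucket-wise average of the other. The bucketing step (`average_by_bucket`, bucket width 0.1) is an assumption about how the averages could be formed, not a description of the authors' procedure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def average_by_bucket(pairs, width=0.1):
    """Group (x, y) pairs into buckets of x and return (bucket_start, mean_y) points."""
    buckets = {}
    for x, y in pairs:
        # round() guards against float artefacts such as 0.3 / 0.1 == 2.999...
        buckets.setdefault(int(round(x / width, 9)), []).append(y)
    return [(b * width, sum(v) / len(v)) for b, v in sorted(buckets.items())]


# Hypothetical (result_overlap, term_overlap) pairs, for illustration only.
pairs = [(0.05, 0.0), (0.07, 0.2), (0.35, 0.5), (0.62, 0.6), (0.90, 1.0)]
points = average_by_bucket(pairs)
print(pearson([x for x, _ in points], [y for _, y in points]))
```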

Range Selection

We have found that there is a strong positive correlation between TermOverlap and ResultOverlap.

Based on the ResultOverlap, we want to know whether two queries have (almost) no common terms.
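The idea can be sketched as a simple range test on the result overlap: a candidate is kept when its overlap with the user's query is above a lower bound (so the two queries are still related) and below an upper bound (so few common terms are expected). The bounds `low` and `high` below are placeholders; the transcript does not preserve the range that was actually selected.

```python
def in_range(overlap, low, high):
    """True if a result overlap lies in the half-open range (low, high]."""
    return low < overlap <= high


def orthogonal_candidates(result_overlaps, low, high):
    """Keep candidates whose result overlap is in (low, high], highest overlap first."""
    kept = [(q, o) for q, o in result_overlaps.items() if in_range(o, low, high)]
    return [q for q, _ in sorted(kept, key=lambda pair: (-pair[1], pair[0]))]


# Hypothetical candidates and placeholder bounds, for illustration only.
overlaps = {"query_a": 0.02, "query_b": 0.408, "query_c": 0.0}
print(orthogonal_candidates(overlaps, low=0.0, high=0.1))  # ['query_a']
```

Here `query_b` is dropped because its results overlap too much (it is likely to share terms with the user's query), and `query_c` is dropped because it shares no results and is therefore probably unrelated.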

[Figure: range selection, ResultOverlap vs. avg. TermOverlap]

Orthogonal Queries: Algorithm

RECOMMENDATION ALGORITHM AND CACHING SCHEMES

Orthogonal Query (OQ) Recommendation

OQ efficient computation: minHash technique plus caching.

OQ recommendation: we rank the orthogonal queries based on the cache policy priority.
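To illustrate why minHash helps here, the sketch below estimates the overlap of two result sets from short signatures instead of comparing the full sets. It is a generic MinHash implementation under assumed choices (seeded BLAKE2b hashes, 256 hash functions, result URLs as set elements), not the authors' code.

```python
import hashlib


def _hash(seed, item):
    """Deterministic 64-bit hash of an item under a given seed."""
    digest = hashlib.blake2b(f"{seed}:{item}".encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items, num_hashes=256):
    """For each seed, keep the minimum hash value over the set."""
    items = set(items)
    if not items:
        raise ValueError("cannot build a signature for an empty set")
    return [min(_hash(seed, item) for item in items) for seed in range(num_hashes)]


def estimate_overlap(sig_a, sig_b):
    """Fraction of matching positions, which estimates the Jaccard overlap of the sets."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


# Hypothetical result sets sharing 50 of 150 distinct URLs (true overlap 1/3).
results_a = [f"url{i}" for i in range(100)]
results_b = [f"url{i}" for i in range(50, 150)]
print(estimate_overlap(minhash_signature(results_a), minhash_signature(results_b)))
```

With the signatures of cached queries computed once and stored, comparing an incoming query against the cache costs one signature computation plus cheap position-wise comparisons.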

Learning to Cache

How to best fill the cache?

MCQ policy, Most Clicks Query

MFFQS policy, Most Frequent Final Query in Session

MFQ policy, Most Frequent Query

MRQ policy, Most Recent Query

S-xmin policy, Test session of x minutes.

We want to learn what the best cache policy is. No complicated machine learning techniques!
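One way to read these policies is as priority functions over a query log: each policy assigns a score to every query, and the cache keeps the highest-scoring ones. The sketch below follows that reading for MCQ, MFFQS, MFQ and MRQ; the `LogEntry` layout and the tie-breaking rule are assumptions, and the S-xmin policy is left out because the slide does not describe it in enough detail.

```python
from collections import Counter, namedtuple

# Hypothetical log record: one entry per submitted query.
LogEntry = namedtuple("LogEntry", "session_id query timestamp clicks")


def policy_scores(log, policy):
    """Priority of each query under the named cache policy."""
    scores = Counter()
    if policy == "MCQ":        # Most Clicks Query: total clicks received
        for e in log:
            scores[e.query] += e.clicks
    elif policy == "MFQ":      # Most Frequent Query: number of submissions
        for e in log:
            scores[e.query] += 1
    elif policy == "MRQ":      # Most Recent Query: latest submission time
        for e in log:
            scores[e.query] = max(scores[e.query], e.timestamp)
    elif policy == "MFFQS":    # Most Frequent Final Query in Session
        last = {}
        for e in log:
            if e.session_id not in last or e.timestamp >= last[e.session_id].timestamp:
                last[e.session_id] = e
        for e in last.values():
            scores[e.query] += 1
    else:
        raise ValueError(f"unknown policy: {policy}")
    return scores


def fill_cache(log, policy, size):
    """Queries to cache: highest priority first, ties broken alphabetically."""
    scores = policy_scores(log, policy)
    return sorted(scores, key=lambda q: (-scores[q], q))[:size]


# Made-up log, for illustration only.
log = [
    LogEntry("s1", "daisy duke", 1, 0),
    LogEntry("s1", "catherine bach", 2, 3),
    LogEntry("s2", "daisy duke", 3, 1),
    LogEntry("s2", "dukes of hazzard", 4, 0),
]
for name in ("MCQ", "MFFQS", "MFQ", "MRQ"):
    print(name, fill_cache(log, name, size=2))
```

Because every policy yields a score, the same score can double as the ranking priority when orthogonal queries are drawn from the cache.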

Satisfied Sessions with Retype [figure; label: Click]

Dataset [figure; label: x min. Test Sessions]

Learning to Cache

The best cache policy has the highest probability of containing the last query of the test sessions.

What is the best policy based on cache size?

What is the best policy for shorter or longer test sessions?
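The selection criterion can be sketched as a hit ratio: the share of test sessions whose last query is found in the cache. The session representation (an ordered list of query strings) and the function name are assumptions made for this example.

```python
def hit_ratio(cache, test_sessions):
    """Percentage of test sessions whose last query is present in the cache."""
    if not test_sessions:
        raise ValueError("need at least one test session")
    cached = set(cache)
    hits = sum(1 for session in test_sessions if session and session[-1] in cached)
    return 100.0 * hits / len(test_sessions)


# Made-up test sessions, for illustration only.
sessions = [
    ["daisy duke", "catherine bach"],
    ["diet supplement", "body building supplements"],
    ["faa"],
    ["some rare query", "another rare query"],
]
small_cache = ["catherine bach"]
large_cache = ["catherine bach", "body building supplements", "faa"]
print(hit_ratio(small_cache, sessions))  # 25.0
print(hit_ratio(large_cache, sessions))  # 75.0
```

Running the same function for each policy, cache size and test-session length gives the kind of comparison the two questions above ask for.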

Normalized Hits vs Cache Size (test sessions of 30 min.)

MCQ has the highest hit ratio!

Hit Ratio (%) by test session length and cache policy (cache size: 80k)

MCQ has the highest hit ratio!

Learning to Cache

We also ran an optimization to learn which caching policy was best specifically for OQ, and MCQ was again the best!

EVALUATION

How to evaluate query recommenders?

A recommender system should predict the last query of the satisfied sessions with retype based on the first one!
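A sketch of this protocol as a success-at-k measure follows. It assumes that S@10, the figure reported in the results, means the percentage of sessions whose last query appears among the top 10 recommendations produced for the first query; the `recommend` callable and the toy recommender are hypothetical stand-ins for any of the compared algorithms.

```python
def success_at_k(recommend, sessions, k=10):
    """Percentage of sessions whose last query is in the top-k recommendations for the first query."""
    if not sessions:
        raise ValueError("need at least one session")
    successes = 0
    for session in sessions:
        first, last = session[0], session[-1]
        if last in recommend(first)[:k]:
            successes += 1
    return 100.0 * successes / len(sessions)


# Toy recommender backed by a fixed table, for illustration only.
table = {
    "daisy duke": ["dukes of hazzard", "catherine bach"],
    "diet supplement": ["weight loss pills"],
}


def toy_recommend(query):
    """Ranked recommendations for a query (empty list if the query is unknown)."""
    return table.get(query, [])


eval_sessions = [
    ["daisy duke", "catherine bach"],
    ["diet supplement", "body building supplements"],
]
print(success_at_k(toy_recommend, eval_sessions, k=10))  # 50.0
print(success_at_k(toy_recommend, eval_sessions, k=1))   # 0.0
```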

...

Comparison

We compared OQ recommendation with 6 other query recommendation algorithms:

CG: Cover Graph

UF-IQF: User Frequency-Inverse Query Frequency

OQ: Orthogonal Query Recommendation

SC: Short Cut

QFG: Query Flow Graph

TQG: Term-Query Graph

SQ: Similar Query

Results, S@10 as a percentage (cache size: 80k)

We are the best for the long tail.

UF-IQF seems to be better than us, but this is because in many cases the last query of a session is just a reformulation of the initial query or just a spelling error.

Is combining OQ with other query recommendation algorithms useful?

Successful Results Overlap: S@10

Overlap of the successful orthogonal recommendations w.r.t. the successful baseline query recommendations.

OQ SUCCEEDS WHERE OTHERS FAIL!

User Study

For each algorithm: generate 5 recommendations.

For each initial query: mix the recommendations of the different algorithms.

Ask 10 assessors to evaluate the best recommendations.

Results: User Study on top-5 Recommendations

In 45% of cases OQ is judged to be useful or somewhat useful.

OQ is the best for queries in the long tail.

Example

CONCLUSIONS

Conclusions & Future Work

We have defined Orthogonal Queries as queries with (almost) no term overlap with the user's query.

We have presented how to efficiently find Orthogonal Queries by resorting to the result set.

YOU MIGHT FIND ANOTHER WAY

We have presented how to learn from data which caching policy is best.

We have designed a simple way to recommend orthogonal queries.

Future work is to find more good techniques and variants of this.

THANK YOU