Temporal Diversity in RecSys - SIGIR2010


Paper here: http://www.cs.ucl.ac.uk/staff/n.lathia/publications/sigir10.html


Temporal Diversity in Recommender Systems

Neal Lathia¹, Stephen Hailes¹, Licia Capra¹, Xavier Amatriain²

¹ Dept. Computer Science, University College London; ² Telefonica Research, Barcelona

ACM SIGIR 2010, Geneva

n.lathia@cs.ucl.ac.uk

@neal_lathia, @xamat

EU i-Tour Project

recommender systems

● many examples over different web domains

● a lot of research: accuracy

● multiple dimensions of usage that equate to user satisfaction

● design a methodology to evaluate recommender systems that are iteratively updated; explore the temporal dimension of filtering algorithms¹

evaluating collaborative filtering over time

¹ N. Lathia, S. Hailes, L. Capra. Temporal Collaborative Filtering with Adaptive Neighbourhoods. ACM SIGIR 2009, Boston, USA

temporal diversity

● ...is not concerned with diversity of a single set of recommendations (e.g., are you recommended all six Star Wars movies at once?)

● ...is concerned with the sequence of recommendations that users see (are you recommended the same items every week?)

contributions

● is temporal recommendation diversity important?

● how to measure temporal diversity and novelty?

● how much temporal diversity do state-of-the-art CF algorithms provide?

● how to improve temporal diversity?

is diversity important?

data perspective: growth & activity

demographics (in paper): ~104 respondents

procedure

● claim: recommender system for “popular movies”

● rate week 1's recommendations

● movie titles, links to IMDB, DVD covers

● (click through buffer screen)

● rate week 2's recommendations

● (click through buffer screen)

● ....

overview of the surveys

(diagram: each survey ran over five weeks, W1–W5)

Survey 1: Popular Movies – No Change

Survey 2: Popular Movies, Change Each Week

Survey 3: Random Movies

Closing Questions


74% important / very important, 23% neutral

86% important / very important

95% important / very important

surprise, unrest, rude / compliments, “spot on”

how did this affect the way people rated?

S3 Random: Always Bad

S2 Popular: Quite Good

S1 Starts off Quite Good

S1 Ends off Bad

...ANOVA details in paper...

is diversity important? (yes)

how to measure temporal diversity?

measuring temporal diversity

● compare this week's top-N list with last week's: diversity = (number of new items) / N

● example: 3 new items in a top-10 list gives diversity = 3/10
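A minimal sketch of this metric in Python (the function names are ours, not the paper's): compare the current top-N list with the previous one and count the fraction of new items. The paper also defines a companion novelty measure against everything the user has ever been recommended; a similar calculation covers it.

```python
def temporal_diversity(previous_list, current_list):
    """Fraction of the current top-N that did not appear in the previous top-N."""
    new_items = set(current_list) - set(previous_list)
    return len(new_items) / len(current_list)


def temporal_novelty(all_previously_recommended, current_list):
    """Fraction of the current top-N never recommended to this user before."""
    unseen = set(current_list) - set(all_previously_recommended)
    return len(unseen) / len(current_list)


# Example matching the slide: 3 of the 10 recommended items are new.
last_week = ["m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8", "m9", "m10"]
this_week = ["m1", "m2", "m3", "m4", "m5", "m6", "m7", "m11", "m12", "m13"]
print(temporal_diversity(last_week, this_week))  # 0.3
```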

how much temporal diversity do state-of-the-art CF algorithms provide?

3 algorithms – 3 influential factors

● baseline – popularity ranking

● item-based kNN

● singular value decomposition

● profile size vs. diversity

● ratings added vs. diversity

● time between sessions vs. diversity

profile size vs. diversity

(plot: diversity as a function of profile size for the baseline, kNN, and SVD algorithms)

main results

● as profile size increases, diversity decreases

● the more ratings added in the current session, the more diversity is experienced in the next session

● more time between sessions leads to more diversity

consequences

● want to avoid profiles that grow too large

● (conflict #1) want to encourage users to rate as much as possible

● (conflict #2) want users to visit often, but diversity increases if they don't

● how does this relate back to traditional evaluation metrics?

accuracy vs. diversity

(plot: accuracy vs. diversity for the baseline, kNN, and SVD algorithms; axes labelled "more accurate" and "more diverse")

how to improve temporal diversity?

3 methods

● temporal switching

● temporal user-based switching

● re-ranking frequent visitors' lists

temporal switching

● “jump” between algorithms each week

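A rough sketch of the switching idea, assuming each CF algorithm exposes a recommend(user_id, n) method (a hypothetical interface, not from the paper): the system cycles through its algorithms from week to week, so successive lists differ even when each individual algorithm is stable.

```python
class TemporalSwitcher:
    """Serve each week's top-N with a different underlying CF algorithm."""

    def __init__(self, algorithms):
        # e.g. [popularity_baseline, item_knn, svd]: any objects exposing .recommend()
        self.algorithms = algorithms

    def recommend(self, user_id, week, n=10):
        # Cycle through the algorithms so the active one "jumps" every week.
        algorithm = self.algorithms[week % len(self.algorithms)]
        return algorithm.recommend(user_id, n)
```

The temporal user-based switching variant listed above makes this choice per user rather than by the global week (one reading is to drive it by the user's own session index); the details are in the paper, but the same structure applies.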

re-ranking frequent visitors' lists

● (like we did in survey 2; Amazon did this in 1998!)
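One way to read the re-ranking idea (a sketch under our own assumptions, not the paper's exact procedure): for users who return before their top-N has had a chance to change, demote items they were recently shown so that lower-ranked candidates surface, much as survey 2 rotated the popular-movie list each week.

```python
def rerank_for_frequent_visitor(scored_candidates, recently_shown, n=10):
    """Re-rank a scored candidate list so items not yet shown surface first.

    scored_candidates: list of (item, score) pairs, best score first.
    recently_shown: set of items recommended in the user's recent sessions.
    """
    fresh = [pair for pair in scored_candidates if pair[0] not in recently_shown]
    seen = [pair for pair in scored_candidates if pair[0] in recently_shown]
    return [item for item, _ in fresh + seen][:n]


# Example: "a" and "b" were shown last session, so "c" and "d" move up.
candidates = [("a", 0.9), ("b", 0.8), ("c", 0.7), ("d", 0.6)]
print(rerank_for_frequent_visitor(candidates, {"a", "b"}, n=3))  # ['c', 'd', 'a']
```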

contributions/summary

● temporal diversity is important

● defined (simple, extendable) metric to measure temporal recommendation diversity

● analysed factors that influence diversity; most accurate algorithm is not the most diverse

● hybrid-switching/re-ranking can improve diversity

Temporal Diversity in Recommender Systems

Neal Lathia¹, Stephen Hailes¹, Licia Capra¹, Xavier Amatriain²

¹ Dept. Computer Science, University College London; ² Telefonica Research, Barcelona

ACM SIGIR 2010, Geneva

n.lathia@cs.ucl.ac.uk

@neal_lathia, @xamat

Supported by: EU FP7 i-Tour, Grant 234239