Towards Diverse Recommendation

Neil Hurley
Complex Adaptive Systems Laboratory, Computer Science and Informatics
University College Dublin
Clique Strategic Research Cluster (clique.ucd.ie)
October 2011
DiveRS: International Workshop on Novelty and Diversity in Recommender Systems
Outline

1 Setting the Context
2 Novelty and Diversity in Information Retrieval
  IR Measures of Diversity
  IR Measures of Novelty
3 Diversity Research in Recommender Systems
  Concentration Measures of Diversity
  Serendipity
Setting the Context
Recommendation Performance I
Much effort has been spent on improving the performance of recommenders from the point of view of rating prediction.
It is a well-defined statistical problem; we have agreed objective measures of prediction quality.
Efficient algorithms have been developed that are good at maximising predictive accuracy.
It is not a completely solved problem (e.g. dealing with dynamic data).
But there are well-accepted evaluation methodologies and quality measures.
Recommendation Performance II
But good recommendation is not about the ability to predict past ratings.
Recommendation quality is subjective; people’s tastes fluctuate; people can be influenced and persuaded; recommendation can be as much about psychology as statistics.
A number of ‘qualities’ are talked about more and more with regard to other dimensions of recommendation:
Novelty
Interestingness
Diversity
Serendipity
User satisfaction
Recommendation Performance III
Clearly, user surveys may be the only way to determine subjective satisfaction with a system.
(Castagnos et al., 2010) present useful survey results on the importance of diversity.
In order to make progress on recommendation algorithms that seek improvements along these dimensions, we need agreed (objective?) measures of these qualities and agreed evaluation methodologies.
Agenda
This talk focuses on measures of novelty and diversity, rather than algorithms for diversification.
We first look at how these concepts are defined in IR research.
We then examine ideas that have emerged from the RS community.
Novelty and Diversity in Information Retrieval
The Probability Ranking Principle
“If a reference retrieval system’s response to each request is a ranking of the documents in the collection in order of decreasing probability of relevance ... the overall effectiveness of the system to its user will be the best that is obtainable” (W.S. Cooper)

Nevertheless, relevance measured for each single document has been challenged since as long ago as 1964:
Goffman (1964): “... one must define relevance in relation to the entire set of documents rather than to only one document”
Boyce (1982): “... A retrieval system which aspires to the retrieval of relevant documents should have a second stage which will order the topical set in a manner so as to provide maximum informativeness”
The Maximal Marginal Relevance (MMR) criterion
“Reduce redundancy while maintaining query relevance in re-ranking retrieved documents” (Carbonell and Goldstein 1998). Given a set of retrieved documents R and a query Q, incrementally rank the documents according to

MMR ≜ arg max_{Di ∈ R \ S} [ λ sim1(Di, Q) − (1 − λ) max_{Dj ∈ S} sim2(Di, Dj) ]

where S is the set of documents already ranked from R.
Iterative greedy approach to increasing the diversity of aranking.
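A minimal Python sketch of this greedy loop; the callables `relevance` and `sim`, standing in for sim1(·, Q) and sim2(·, ·), are hypothetical placeholders for real similarity functions:

```python
# Greedy MMR re-ranking in the style of Carbonell and Goldstein (1998).
# `relevance(d)` plays the role of sim1(d, Q) and `sim(di, dj)` of sim2;
# both are caller-supplied stand-ins, not a prescribed implementation.

def mmr_rerank(docs, relevance, sim, lam=0.7, k=None):
    """Repeatedly pick the document maximising
    lam * relevance(d) - (1 - lam) * max_{s in selected} sim(d, s)."""
    k = len(docs) if k is None else k
    remaining, selected = list(docs), []
    while remaining and len(selected) < k:
        def score(d):
            # Redundancy w.r.t. the documents already ranked (0 for the first pick).
            redundancy = max((sim(d, s) for s in selected), default=0.0)
            return lam * relevance(d) - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With λ = 1 this degenerates to a plain relevance ranking; lowering λ trades relevance for reduced redundancy.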
The Expected Metric Principle
“In a probabilistic context, one should directly optimize for the expected value of the metric of interest” (Chen and Karger 2006).
Chen and Karger (2006) introduce a greedy optimisation framework in which the next document is selected to greedily optimise the chosen objective.
An objective such as mean k-call at n, where k-call is 1 if the top-n results contain at least k relevant documents, naturally increases result-set diversity.
For 1-call, this results in an approach of selecting the next document under the assumption that all documents selected so far are not relevant.
PMP – rank each document independently by its probability of relevance, Pr(r | d).

k-call at n – rank according to Pr(at least k of r0, ..., rn−1 | d0, d1, ..., dn−1).

Consider a query such as “Trojan Horse”, whose meaning is ambiguous. The PMP criterion would determine the most likely meaning and present a ranked list reflecting that meaning. A 1-call at n criterion would present a result pertaining to each possible meaning, with the aim of getting at least one right.
Figure: Results from Chen and Karger (2006) on the TREC 2004 Robust Track.
MSL = Mean Search Length (mean of the rank of the first relevant document, minus one)
MRR = Mean Reciprocal Rank (mean of the reciprocal rank of the first relevant document)
Agrawal et al. (2009) propose a similar approach, with an objective function that maximises the probability of finding at least one relevant result.
They dub their approach the result diversification problem and state it as
S = arg max_{S ⊆ D, |S| = k} Pr(S | q)

Pr(S | q) = Σ_c Pr(c | q) ( 1 − Π_{d ∈ S} (1 − V(d | q, c)) )

where
S is the retrieved result set of k documents;
c ∈ C ranges over a set of categories;
V(d | q, c) is the likelihood of the document satisfying the user intent, given the query q and category c.
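The objective can be evaluated directly, and since it is submodular (as Agrawal et al. note), greedy selection gives a (1 − 1/e)-approximation. A sketch with toy category and quality values — the distributions below are illustrative, not from the paper:

```python
# Pr(S|q) from Agrawal et al. (2009): probability that at least one
# document in S satisfies the user's intended category.
# `p_cat` is a toy Pr(c|q); `V[d][c]` is a toy V(d|q,c).

def coverage_objective(S, p_cat, V):
    """Pr(S|q) = sum_c Pr(c|q) * (1 - prod_{d in S} (1 - V(d|q,c)))."""
    total = 0.0
    for c, pc in p_cat.items():
        miss = 1.0  # probability that no document in S satisfies category c
        for d in S:
            miss *= 1.0 - V[d].get(c, 0.0)
        total += pc * (1.0 - miss)
    return total

def greedy_select(docs, p_cat, V, k):
    """Greedily add the document with the largest marginal gain."""
    S = []
    for _ in range(k):
        best = max((d for d in docs if d not in S),
                   key=lambda d: coverage_objective(S + [d], p_cat, V))
        S.append(best)
    return S
```

Note how the greedy step prefers a weaker document in an uncovered category over a stronger one in an already-covered category, which is exactly the diversifying behaviour of the objective.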
Zhai and Lafferty (2006) propose risk minimisation of a loss function over possible returned document rankings, measuring how unhappy the user is with that set.
Axioms of Diversification (Gollapudi and Sharma 2009)
r(·) : D × Q → R+, a measure of relevance
d(·, ·) : D × D → R+, a distance (dissimilarity) function

Diversification objective:

R*_k = arg max_{Rk ⊆ D, |Rk| = k} f(Rk, q, r(·), d(·, ·))

What properties should f(·) satisfy?
Axioms of Diversification (Gollapudi and Sharma 2009) I
1 Scale Invariance – insensitive to scaling distances and relevance by a constant.
2 Consistency – making the output documents more relevant and more diverse, and the other documents less relevant and less diverse, should not change the output of the ranking.
3 Richness – it should be possible to obtain any set as output by an appropriate choice of r(·) and d(·, ·).
4 Stability – the output should not change arbitrarily with size: R*_k ⊆ R*_{k+1}.
5 Independence of Irrelevant Attributes – f(R) is independent of r(u) and d(u, v) for u, v ∉ R.
Axioms of Diversification (Gollapudi and Sharma 2009) II
6 Monotonicity – addition of a document to R should not decrease the score: f(R ∪ {d}) ≥ f(R).
7 Strength of Relevance – no f(·) ignores the relevance scores.
8 Strength of Similarity – no f(·) ignores the similarity scores.

No function satisfies all 8 axioms.

MaxSum Diversification: a weighted sum of the sums of relevance and dissimilarity of items in the selected set; satisfies all axioms except stability.
MaxMin Diversification: a weighted sum of the minimum relevance and minimum dissimilarity of items in the selected set; satisfies all axioms except consistency and stability.
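The two objectives can be written down in a few lines. The relevance and distance values below are toy inputs, and the paper's exact scaling constants for the two terms are omitted for simplicity:

```python
# Toy versions of the MaxSum and MaxMin objectives analysed by
# Gollapudi and Sharma (2009); `lam` weights relevance against diversity.
from itertools import combinations

def max_sum_objective(S, rel, dist, lam=0.5):
    """Weighted sum of total relevance and total pairwise distance."""
    relevance = sum(rel[d] for d in S)
    diversity = sum(dist[frozenset(p)] for p in combinations(S, 2))
    return lam * relevance + (1 - lam) * diversity

def max_min_objective(S, rel, dist, lam=0.5):
    """Weighted sum of the minimum relevance and minimum pairwise distance."""
    return (lam * min(rel[d] for d in S)
            + (1 - lam) * min(dist[frozenset(p)] for p in combinations(S, 2)))
```

MaxSum rewards total spread, while MaxMin is driven entirely by the worst item and worst pair — which is why the two fail different axioms.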
IR Measures of Diversity
S-recall (Zhai and Lafferty 2006)
S-recall at rank n is defined as the number of subtopics retrieved up to rank n, divided by the total number of subtopics. Let Si ⊆ S be the set of subtopics in the i-th document di; then

S-recall@n = |∪_{i=1}^{n} Si| / |S|

Let minrank(S, k) = the size of the smallest subset of documents that covers at least k subtopics.
It is usually most useful to consider S-recall@n where n = minrank(S, |S|).
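As a sketch, S-recall@n reduces to a set union over the per-document subtopic sets (the subtopic assignments used below are illustrative toy data):

```python
# S-recall@n (Zhai and Lafferty 2006): fraction of all subtopics covered
# by the top-n ranked documents. `subtopics[i]` is the subtopic set of
# the document at rank i+1.

def s_recall_at(n, subtopics, all_subtopics):
    covered = set().union(*subtopics[:n])
    return len(covered) / len(all_subtopics)
```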
S-precision (Zhai and Lafferty 2006)
S-precision at rank n is the ratio of the minimum rank at which a given recall value can optimally be achieved, to the first rank at which the same recall value actually has been achieved. Let k = |∪_{i=1}^{n} Si|. Then

S-precision@n = minrank(S, k) / m*

where m* = min { j : |∪_{i=1}^{j} Si| ≥ k }
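Since minrank is a set-cover-style quantity, the sketch below brute-forces it, which is feasible only for toy collections; the subtopic sets in the test are illustrative:

```python
# S-precision@n (Zhai and Lafferty 2006). minrank is computed by
# exhaustive search over document subsets (exponential; toy use only).
from itertools import combinations

def minrank(subtopic_sets, k):
    """Size of the smallest subset of documents covering >= k subtopics."""
    for size in range(1, len(subtopic_sets) + 1):
        for combo in combinations(subtopic_sets, size):
            if len(set().union(*combo)) >= k:
                return size
    return None

def s_precision_at(n, ranked_subtopics, collection_subtopics):
    k = len(set().union(*ranked_subtopics[:n]))
    # m*: first rank at which the ranking itself covers k subtopics
    covered, m_star = set(), 0
    for j, s in enumerate(ranked_subtopics, 1):
        covered |= s
        if len(covered) >= k:
            m_star = j
            break
    return minrank(collection_subtopics, k) / m_star
```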
α-NDCG (Clarke et al. 2008)
Standard NDCG (Normalised Discounted Cumulative Gain) calculates a gain for each document based on its relevance, with a logarithmic discount for the rank it appears at. Extended for diversity evaluation, the gain is incremented by 1 for each new subtopic, and by α^k (0 ≤ α ≤ 1) for a subtopic that has been seen k times in previously-ranked documents.
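The diversity-aware gain can be sketched as follows, using the slide's convention that a subtopic already seen k times contributes α^k (Clarke et al. write the same geometric discount as (1 − α)^k, with α as the redundancy penalty); the per-rank subtopic sets in the test are toy data:

```python
# Unnormalised alpha-DCG: per-rank gain with geometric discounting of
# repeated subtopics, divided by the usual log2 rank discount.
import math

def alpha_dcg(ranked_subtopics, alpha=0.5):
    seen, dcg = {}, 0.0
    for rank, subs in enumerate(ranked_subtopics, 1):
        # A fresh subtopic contributes alpha**0 = 1; repeats are discounted.
        gain = sum(alpha ** seen.get(s, 0) for s in subs)
        for s in subs:
            seen[s] = seen.get(s, 0) + 1
        dcg += gain / math.log2(rank + 1)
    return dcg
```

Normalising by the ideal (maximising) ordering's score gives α-NDCG; the ideal ordering is itself NP-hard to compute in general, so it is usually approximated greedily.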
Intent-aware Precision (Agrawal et al. 2009)
Intent-aware precision precIA is calculated by first computing precision for each distinct subtopic separately, then averaging these precisions according to a distribution of the proportion of users interested in each subtopic:

precIA@n = Σ_{s ∈ S} Pr(s | q) (1/n) Σ_{i=1}^{n} I(s ∈ di)
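A sketch of the computation, with a toy subtopic distribution Pr(s|q) and toy per-document subtopic sets:

```python
# Intent-aware precision (Agrawal et al. 2009): precision is computed per
# subtopic, then averaged under the subtopic distribution Pr(s|q).

def prec_ia_at(n, ranked_subtopics, p_sub):
    top = ranked_subtopics[:n]
    # For each subtopic s: Pr(s|q) * (fraction of top-n docs containing s).
    return sum(p * sum(s in d for d in top) / n for s, p in p_sub.items())
```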
IR Measures of Novelty
Novelty Measures (Agrawal et al. 2009)
KL-divergence D(di || dj) is used to measure the novelty of di with respect to dj.
Alternatively, di can be modelled as a mixture of dj and a background model: the higher the weight of dj in the mixture, the less novel di is with respect to dj.
Pairwise measures are combined to give an overall measure of novelty with respect to all documents in the result set.
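A minimal sketch of the KL-divergence variant, comparing smoothed unigram language models of the two documents; the tokenisation and smoothing constant here are illustrative choices, not the paper's exact setup:

```python
# Novelty of d_i w.r.t. d_j as KL-divergence D(d_i || d_j) between their
# smoothed unigram language models: high when d_i's language is poorly
# predicted by d_j's.
import math
from collections import Counter

def unigram_lm(tokens, vocab, eps=0.01):
    """Additively smoothed unigram distribution over a shared vocabulary."""
    counts = Counter(tokens)
    denom = len(tokens) + eps * len(vocab)
    return {w: (counts[w] + eps) / denom for w in vocab}

def kl_novelty(doc_i, doc_j):
    vocab = set(doc_i) | set(doc_j)
    p, q = unigram_lm(doc_i, vocab), unigram_lm(doc_j, vocab)
    return sum(p[w] * math.log(p[w] / q[w]) for w in vocab)
```

An identical pair of documents scores zero novelty; the more d_i's vocabulary diverges from d_j's, the larger the score.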
Summary of IR Research
It has long been recognised that the probability ranking principle does not adequately measure result-list quality: the usefulness of a document depends on what other documents are on the list.
Considering that each document consists of a set of subtopics, information nuggets or facets:
The novelty of a document is a measure of its non-redundancy, where it is redundant w.r.t. a facet if that facet is already covered by another document.
The diversity of a result list is a measure of the number of relevant facets it contains.
There is no complete consensus here – e.g. Gollapudi and Sharma (2009) define “novelty” as the fraction of topics covered.
Consider selecting the document with least redundancy vs. selecting the document that most improves overall diversity.
DiveRS: International Workshop on Novelty and Diversity in Recommender Systems
Towards Diverse Recommendation
Novelty and Diversity in Information retrieval
IR Measures of Novelty
Summary of IR Research
Long recognised that the probability ranking principle doesnot adequately measure result list quality
– the usefulness of a document depends on what otherdocuments are on the list.
Considering that each document consists of a set of subtopics,information nuggets or facets
The novelty of a document is a measure of how muchredundancy it contains, where it is redundant w.r.t. a facet, ifthat facet is already covered by another document.The diversity of a result list is a measure of the number ofrelevant facets it contains.
No complete consensus here – e.g. Gollapudi and Sharma (2009)define “novelty” as fraction of topics covered
Consider selecting document with least redundancy vsselecting document that improves overall diversity.
DiveRS: International Workshop on Novelty and Diversity in Recommender Systems
Towards Diverse Recommendation
Novelty and Diversity in Information retrieval
IR Measures of Novelty
Summary of IR Research
Long recognised that the probability ranking principle doesnot adequately measure result list quality
– the usefulness of a document depends on what otherdocuments are on the list.
Considering that each document consists of a set of subtopics,information nuggets or facets
The novelty of a document is a measure of how muchredundancy it contains, where it is redundant w.r.t. a facet, ifthat facet is already covered by another document.
The diversity of a result list is a measure of the number ofrelevant facets it contains.
No complete consensus here – e.g. Gollapudi and Sharma (2009)define “novelty” as fraction of topics covered
Consider selecting document with least redundancy vsselecting document that improves overall diversity.
DiveRS: International Workshop on Novelty and Diversity in Recommender Systems
Towards Diverse Recommendation
Novelty and Diversity in Information retrieval
IR Measures of Novelty
Summary of IR Research
Long recognised that the probability ranking principle doesnot adequately measure result list quality
– the usefulness of a document depends on what otherdocuments are on the list.
Considering that each document consists of a set of subtopics,information nuggets or facets
The novelty of a document is a measure of how muchredundancy it contains, where it is redundant w.r.t. a facet, ifthat facet is already covered by another document.The diversity of a result list is a measure of the number ofrelevant facets it contains.
No complete consensus here – e.g. Gollapudi and Sharma (2009)define “novelty” as fraction of topics covered
Consider selecting document with least redundancy vsselecting document that improves overall diversity.
Summary of IR Research
In general, IR lines of research w.r.t. diversity and novelty consider the following:

Relevance scores for documents are not independent – we need to consider relevance w.r.t. the entire result set, rather than each document in turn.
Diversity is related to query ambiguity – the difference between selecting documents according to the probability of each meaning, or selecting documents to cover all meanings, so that at least one is relevant.
Diversity is a measure of a set; novelty is a measure of each document w.r.t. a particular set in which it is contained.
Diversity Research in Recommender Systems
Diversity – The Long Tail Problem
Figure: Sales demand for 1000 products.
Figure: The top 2% most popular products account for 13% of sales.
Figure: The least popular items account for 30% of sales.
“Less is More”

– Chris Anderson [The Long Tail: Why the Future of Business is Selling Less of More]
Recommenders and The Long Tail Problem
To support an increase in sales, we need to increase the diversity of the set of recommendations made to end-users.

Recommend items in the long tail that are highly likely to be liked by the current user.

This implies finding items that are liked by the current user and by relatively few other users.
Diversity – The End-user Perspective
Definition
The diversity of a set L of size p is the average dissimilarity of the items in the set:

f_D(L) = \frac{2}{p(p-1)} \sum_{i \in L} \sum_{j \in L,\, j < i} (1 - s(i, j))
We have found it useful to define novelty (or relative diversity) as follows:

Definition

The novelty of an item i in a set L is

n_L(i) = \frac{1}{p-1} \sum_{j \in L,\, j \neq i} (1 - s(i, j))
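The two definitions above can be sketched directly, assuming only a pairwise similarity matrix s(i, j) with values in [0, 1]; the matrix below is a hypothetical two-genre example in the spirit of the toy example later in the talk:

```python
import numpy as np

def set_diversity(sim, items):
    """f_D(L): average pairwise dissimilarity of the items in L."""
    p = len(items)
    total = 0.0
    for a in range(p):
        for b in range(a):  # pairs j < i
            total += 1.0 - sim[items[a], items[b]]
    return 2.0 * total / (p * (p - 1))

def item_novelty(sim, items, i):
    """n_L(i): average dissimilarity of item i to the rest of L."""
    return sum(1.0 - sim[i, j] for j in items if j != i) / (len(items) - 1)

# Hypothetical similarity matrix: items 0,1 form one genre, 2,3 another.
sim = np.array([[1.0, 1.0, 0.0, 0.0],
                [1.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 1.0],
                [0.0, 0.0, 1.0, 1.0]])

L = [0, 1, 2]
print(set_diversity(sim, L))   # pairs (0,1)->0, (0,2)->1, (1,2)->1, so 2/3
print(item_novelty(sim, L, 0)) # (0 + 1)/2 = 0.5
```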
Figure: User profile from the Movielens dataset, |P_u| = 764, N = 20, |T_u| = 0.1 × |P_u|. 40% of the most novel items accrue no hits at all.
Other Definitions of Novelty/Diversity in RS
Castells et al. (2011) outlines some of the ways that novelty impacts on recommender system design.

It distinguishes item popularity from item similarity, and user-relative measures from global measures.

Popularity-based novelty:

novelty(i) = -\log p(i)  or  1 - \log p(K|i)   (global measure)
novelty(i) = -\log p(i|u)  or  1 - \log p(K|i, u)   (user perspective)

Similarity perspective:

novelty(i|S) = \sum_{j \in S} p(j|S)\, d(i, j)
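The popularity-based variant can be sketched as follows; the interaction data and item names are hypothetical, and base-2 self-information (-log2 p(i)) is assumed:

```python
import math
from collections import Counter

def popularity_novelty(interactions, item):
    """Global popularity-based novelty, -log2 p(i), where p(i) is the
    fraction of interactions involving item i (a sketch of the
    self-information measure discussed by Castells et al.)."""
    counts = Counter(i for _, i in interactions)
    p = counts[item] / len(interactions)
    return -math.log2(p)

# Hypothetical (user, item) interactions: 'a' is popular, 'b' is rare.
interactions = [(1, 'a'), (2, 'a'), (3, 'a'), (4, 'b')]
print(popularity_novelty(interactions, 'a'))  # -log2(3/4), low novelty
print(popularity_novelty(interactions, 'b'))  # -log2(1/4) = 2.0, high novelty
```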
Novelty for Recommender Systems
Pablo Castells introduces rank-sensitive and relevance-aware measures of recommendation set diversity and novelty.

Recommendation Novelty Metric

m(R|u) = \sum_n disc(n)\, p(rel|i_n, u)\, novelty(i_n|u)

Novelty-Based Diversity Metric

novelty(R|u) = \sum_n \sum_{j \in u} disc(n)\, p(rel|i_n, u)\, p(j|u)\, d(i_n, j)

diversity(R|u) = \sum_{k < n} disc(n)\, disc(k)\, p(rel|i_n, u)\, p(rel|i_k, u)\, d(i_n, i_k)
Concentration Measures of Diversity
Evaluating Diversity
In our 2009 RecSys paper, we evaluated our diversification method on test sets T(µ) consisting of items chosen from the top 100 × (1 − µ)% most novel items in the user profiles.
Toy Example
We motivate our diversity methodology using a toy example in which a user base of four users, u1, u2, u3, u4, is recommended items from a catalogue of four items, i1, i2, i3, i4.

The system recommends N = 2 items to each user.

Any particular scenario can be represented in a table that indicates whether a user actually likes an item or not (1 or 0) and, in parentheses, the probability that the recommender system will recommend the corresponding item to the user.

Assume that G1 = {i1, i2} is a single genre (e.g. horror movies) and G2 = {i3, i4} is another.

A simple similarity measure is used: s(i1, i2) = s(i3, i4) = 1, and cross-genre similarities are zero.
Biased but Full Recommended Set Diversity
      i1       i2       i3       i4
u1    1 (1)    1 (0)    1 (1/2)  0 (1/2)
u2    1 (0)    1 (1)    1 (1/2)  0 (1/2)
u3    0 (1/2)  1 (1/2)  1 (1)    1 (0)
u4    1 (1)    0 (0)    1 (1/2)  1 (1/2)
The system always recommends an item from G1 and an item from G2.

The probability of i1 being recommended to a randomly selected user – (1/4)(1 + 0 + 1/2 + 1) = 5/8 – is higher than that of i2 (3/8), for instance.

Recommendations do not spread evenly across the product catalogue.

The system is biased towards consistently recommending i1 to u1 but never recommending i2 to u1.
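The marginal probabilities quoted above can be checked mechanically; a minimal sketch using exact fractions over the recommendation probabilities in the table (rows are users u1..u4, columns items i1..i4):

```python
from fractions import Fraction as F

# Recommendation probabilities from the "biased but diverse" scenario.
prob = [
    [F(1),    F(0),    F(1, 2), F(1, 2)],  # u1
    [F(0),    F(1),    F(1, 2), F(1, 2)],  # u2
    [F(1, 2), F(1, 2), F(1),    F(0)],     # u3
    [F(1),    F(0),    F(1, 2), F(1, 2)],  # u4
]

def marginal(item):
    """Probability the item is recommended to a randomly chosen user."""
    col = [row[item] for row in prob]
    return sum(col) / len(col)

print(marginal(0))  # i1: (1 + 0 + 1/2 + 1)/4 = 5/8
print(marginal(1))  # i2: (0 + 1 + 1/2 + 0)/4 = 3/8
```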
No System Level Biases
      i1       i2       i3       i4
u1    1 (1)    1 (1/3)  1 (1/3)  0 (1/3)
u2    0 (1/3)  1 (1)    1 (1/3)  1 (1/3)
u3    1 (1/3)  0 (1/3)  1 (1)    1 (1/3)
u4    1 (1/3)  1 (1/3)  0 (1/3)  1 (1)
The probability of recommending i1 to a randomly chosen relevant user (i.e. u1, u3 or u4) is (1/3)(1 + 1/3 + 1/3) = 5/9; similarly for i2, i3 and i4.

However, focusing on the set of items that are relevant to u1 (i.e. i1, i2 and i3), the algorithm is three times as likely to recommend i1 as either of the other relevant items.
No System or User Level Biases
      i1       i2       i3       i4
u1    1 (1/3)  1 (1/3)  1 (1/3)  0 (1)
u2    0 (1)    1 (1/3)  1 (1/3)  1 (1/3)
u3    1 (1/3)  0 (1)    1 (1/3)  1 (1/3)
u4    1 (1/3)  1 (1/3)  0 (1)    1 (1/3)
There is the same probability of recommending any relevant item to a user.

There is the same probability that an item is recommended when it is relevant.
Algorithm Diversity
Definition

We define an algorithm to be fully diverse from the user perspective if it recommends any of the user's set of relevant items with equal probability.

Definition

We define an algorithm to be fully diverse from the system perspective if the probability of recommending an item, when it is relevant, is equal across all items.
Lorenz Curve and the Gini Index
A plot of the cumulative proportion of the product catalogue against the cumulative proportion of sales.
69% of the sales come from the 10% top-selling products.
G = 0 implies equal sales across all products; G = 1 when a single product gets all sales.
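A minimal sketch of the Gini index over a sales distribution, using the standard discrete Lorenz-curve formula; the sales figures are hypothetical:

```python
def gini(sales):
    """Gini index of a sales distribution: 0 means perfectly even sales,
    approaching 1 as a single product captures all sales."""
    x = sorted(sales)
    n = len(x)
    total = sum(x)
    # Discrete form: G = 2 * sum(i * x_i) / (n * total) - (n + 1)/n,
    # with items ranked in increasing order of sales (i = 1..n).
    weighted = sum((i + 1) * v for i, v in enumerate(x))
    return 2.0 * weighted / (n * total) - (n + 1) / n

print(gini([10, 10, 10, 10]))  # 0.0: equal sales to all products
print(gini([0, 0, 0, 40]))     # 0.75 here; tends to 1 as n grows
```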
Measuring Recommendation Success
The unit of success in recommender systems is the hit, interpreted as the recommendation of a product known to be liked by the user.
Hits Inequality – Concentration Curves of Hits
The Lorenz curve and Gini index measure inequality within the hit distribution over all items in the product catalogue.

The concentration curve and concentration index of hits vs popularity measure the bias of the hit distribution towards popular items.

The concentration curve and concentration index of hits vs novelty measure the bias of the hit distribution towards novel items.
Concentration Curves
If n products accrue hits {h_1, ..., h_n}, the concentration curve depends on the correlation between hits and popularity.
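A concentration curve of hits against popularity can be sketched as below; the popularity and hit counts are hypothetical, and items are ordered by increasing popularity so that a curve lying below the diagonal indicates hits concentrated on the popular end of the catalogue:

```python
import numpy as np

def concentration_curve(popularity, hits):
    """Cumulative share of hits {h_1, ..., h_n} with products ordered
    by increasing popularity."""
    order = np.argsort(popularity)               # least popular first
    h = np.asarray(hits, dtype=float)[order]
    return np.concatenate(([0.0], np.cumsum(h) / h.sum()))

# Hypothetical catalogue where hits correlate strongly with popularity.
pop  = [100, 50, 10, 5]
hits = [40, 20, 5, 1]
curve = concentration_curve(pop, hits)
print(curve)  # rises from 0 to 1, staying well below the diagonal early on
```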
Temporal Diversity
Lathia et al. (2010) investigates diversity over time – do recommendations change over time?

Here, diversity is measured between two recommended sets formed at different points in time:

diversity(R_{i+1}, R_i) = \frac{1}{n} |R_{i+1} \setminus R_i|

and novelty is measured as the number of new items over all time:

novelty(R_{i+1}) = \frac{1}{n} |R_{i+1} \setminus \bigcup_{j=1}^{i} R_j|

kNN algorithms exhibit more temporal diversity than SVD matrix factorisation.

Switching between multiple algorithms is offered as one means to improve temporal diversity.
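The two temporal measures reduce to plain set operations; a minimal sketch with hypothetical recommendation lists of size n = 3:

```python
def temporal_diversity(R_new, R_old, n):
    """Lathia et al. temporal diversity: fraction of the new list
    not present in the previous one."""
    return len(set(R_new) - set(R_old)) / n

def temporal_novelty(R_new, history, n):
    """Fraction of the new list never recommended in any earlier list."""
    seen = set().union(*history) if history else set()
    return len(set(R_new) - seen) / n

R1, R2, R3 = ['a', 'b', 'c'], ['b', 'c', 'd'], ['a', 'd', 'e']
print(temporal_diversity(R2, R1, 3))      # only 'd' is new vs R1: 1/3
print(temporal_novelty(R3, [R1, R2], 3))  # only 'e' was never seen: 1/3
```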
Serendipity
Measuring the Unexpected
Serendipity – the extent to which recommendations may positively surprise users.

Murakami et al. (2008) propose to measure unexpectedness as the "distance between results produced by the method to be evaluated and those produced by a primitive prediction method":

unexpectedness = \frac{1}{n} \sum_{i=1}^{n} \max(\Pr(s_i) - \mathrm{Prim}(s_i),\, 0) \times rel(s_i) \times \frac{1}{i} \sum_{j=1}^{i} rel(s_j)
Ge et al. (2010) follow a similar approach, such that if R_1 is the recommended set returned by the RS and R_2 is the set returned by the primitive method, then

serendipity = \frac{1}{|R_1 \setminus R_2|} \sum_{s_j \in R_1 \setminus R_2} rel(s_j)
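The Ge et al. (2010) measure can be sketched as follows, taking rel(s_j) as a binary relevance indicator; the lists and relevance set are hypothetical:

```python
def serendipity(recommended, primitive, relevant):
    """Ge et al. (2010): average relevance of recommendations that a
    primitive baseline would not also have produced."""
    unexpected = [i for i in recommended if i not in primitive]
    if not unexpected:
        return 0.0
    return sum(1 for i in unexpected if i in relevant) / len(unexpected)

recommended = ['a', 'b', 'c', 'd']   # recommender's list R_1
primitive   = ['a', 'b']             # e.g. a most-popular baseline R_2
relevant    = {'c', 'x'}             # items the user actually likes
# Unexpected items are {'c', 'd'}; only 'c' is relevant.
print(serendipity(recommended, primitive, relevant))  # 0.5
```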
Novelty vs Serendipity
Novelty with regard to a given set is a measure of how different an item is from the other items in the set; it does not involve any notion of relevance.

Is a serendipitous recommendation equivalent to a relevant novel recommendation?

To me, serendipity encapsulates a higher degree of risk – a novel item with a low chance of relevance according to our model, which nevertheless turns out to be relevant.
Conclusions
IR research gives some directions on how to define and evaluate diversity and novelty.

We can ask:

Are these adequate for RS research?
Can we map them to the needs of RS evaluation?
How are they deficient?

Recent research is beginning to clarify these issues for RS.

I believe that objective measures are possible.

I look forward to some interesting discussions on these issues!
Thank You
My research is sponsored by Science Foundation Ireland under grant 08/SRC/I1407: Clique: Graph and Network Analysis Cluster.
References I
Agrawal, R., Gollapudi, S., Halverson, A. and Ieong, S.: 2009, Diversifying search results, Proceedings of the Second ACM International Conference on Web Search and Data Mining, WSDM '09, ACM, New York, NY, USA, pp. 5–14. URL: http://doi.acm.org/10.1145/1498759.1498766

Boyce, B. R.: 1982, Beyond topicality: A two stage view of relevance and the retrieval process, Inf. Process. Manage. 18(3), 105–109.
References II
Carbonell, J. and Goldstein, J.: 1998, The use of MMR, diversity-based reranking for reordering documents and producing summaries, Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '98, ACM, New York, NY, USA, pp. 335–336. URL: http://doi.acm.org/10.1145/290941.291025

Castells, P., Vargas, S. and Wang, J.: 2011, Novelty and Diversity Metrics for Recommender Systems: Choice, Discovery and Relevance, International Workshop on Diversity in Document Retrieval (DDR 2011) at the 33rd European Conference on Information Retrieval (ECIR 2011).
References III
Chen, H. and Karger, D. R.: 2006, Less is more: probabilistic models for retrieving fewer relevant documents, in E. N. Efthimiadis, S. T. Dumais, D. Hawking and K. Järvelin (eds), SIGIR, ACM, pp. 429–436.

Clarke, C. L., Kolla, M., Cormack, G. V., Vechtomova, O., Ashkan, A., Büttcher, S. and MacKinnon, I.: 2008, Novelty and diversity in information retrieval evaluation, Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '08, ACM, New York, NY, USA, pp. 659–666. URL: http://doi.acm.org/10.1145/1390334.1390446
References IV
Ge, M., Delgado-Battenfeld, C. and Jannach, D.: 2010, Beyond accuracy: evaluating recommender systems by coverage and serendipity, Proceedings of the Fourth ACM Conference on Recommender Systems, RecSys '10, ACM, New York, NY, USA, pp. 257–260. URL: http://doi.acm.org/10.1145/1864708.1864761

Goffman, W.: 1964, On relevance as a measure, Information Storage and Retrieval 2(3), 201–203.

Gollapudi, S. and Sharma, A.: 2009, An axiomatic approach for result diversification, Proceedings of the 18th International Conference on World Wide Web, WWW '09, ACM, New York, NY, USA, pp. 381–390. URL: http://doi.acm.org/10.1145/1526709.1526761
References V
Lathia, N., Hailes, S., Capra, L. and Amatriain, X.: 2010, Temporal diversity in recommender systems, in F. Crestani, S. Marchand-Maillet, H.-H. Chen, E. N. Efthimiadis and J. Savoy (eds), SIGIR, ACM, pp. 210–217.

Murakami, T., Mori, K. and Orihara, R.: 2008, Metrics for evaluating the serendipity of recommendation lists, in K. Satoh, A. Inokuchi, K. Nagao and T. Kawamura (eds), New Frontiers in Artificial Intelligence, Vol. 4914 of Lecture Notes in Computer Science, Springer Berlin / Heidelberg, pp. 40–46.

Zhai, C. and Lafferty, J.: 2006, A risk minimization framework for information retrieval, Inf. Process. Manage. 42, 31–55. URL: http://dx.doi.org/10.1016/j.ipm.2004.11.003