Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Implicit Feedback Recommendation via Implicit-to-Explicit OLR Mapping. Denis Parra (Pitt), Alexandros Karatzoglou (TID), Xavier Amatriain (TID), Idil Yavuz (Pitt). CARS 2011, October 23rd, 2011

description

Presentation at the CARS Workshop in the context of the Conference of Recommender Systems 2011, held in Chicago.

Transcript of Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Page 1: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Implicit Feedback Recommendation via Implicit-to-Explicit OLR Mapping

Denis Parra (Pitt), Alexandros Karatzoglou (TID), Xavier Amatriain (TID), Idil Yavuz (Pitt)

CARS 2011, October 23rd, 2011

Page 2: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Outline

• Introduction
• Datasets
• Models
• Results
• Discussion
• Conclusion

Page 3: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

A Clearer Outline

• This presentation has, IMHO, 3 sections:
1. Good News
2. Not that Good News
3. Good News

• Which represent, respectively:
1. Results of the first study on last.fm, presented at UMAP 2011
2. Initial results of the study we present here
3. Expected results after analysis of 2. (once I finish my comps), and your feedback!

Page 4: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Parra, Amatriain "Walk the Talk" 4

Introduction

• Most recommender system approaches rely on explicit information from the users, but…
• Explicit feedback: scarce (people are not especially eager to rate or to provide personal info)
• Implicit feedback: less scarce, but (Hu et al., 2008):
– There’s no negative feedback … and if you watch a TV program just once or twice?
– Noisy … but explicit feedback is also noisy (Amatriain et al., 2009)
– Preference & Confidence … we aim to map the I.F. to preference (our main goal)
– Lack of evaluation metrics … if we can map I.F. and E.F., we can have a comparable evaluation

7/12/2011

Page 5: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Section I : Good News

• Last.fm User Study
• Linear regression results

Page 6: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Recalling the 1st study (1/5)

• Last.fm users (114 in total after filtering)
• For each user, we crawled all the albums they listened to, in order to send them a personalized survey

Page 7: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping


Recalling the 1st study (2/5)

• What items should they rate? Item (album) sampling:
– Implicit Feedback (IF): playcount for a user on a given album. Changed to scale [1-3], 3 means being more listened to.
– Global Popularity (GP): global playcount for all users on a given album. Changed to scale [1-3], 3 means being more listened to.
– Recentness (R): time elapsed since the user played a given album. Changed to scale [1-3], 3 means being listened to more recently.
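The slides do not say how the raw values were cut into the [1-3] scale. As a minimal sketch, assuming tercile cut points (an assumption, as are the helper name `to_three_levels` and the toy numbers), the quantization could look like this:

```python
import numpy as np

def to_three_levels(values):
    """Map raw values to levels 1-3 using tercile cut points (assumed rule)."""
    values = np.asarray(values, dtype=float)
    low, high = np.quantile(values, [1 / 3, 2 / 3])
    # np.digitize yields 0, 1 or 2 for the three bins; shift to 1..3.
    return np.digitize(values, [low, high], right=True) + 1

# Toy playcounts (illustrative only, not data from the study).
playcounts = [1, 2, 3, 10, 20, 30, 100, 200, 300]
if_levels = to_three_levels(playcounts)

# Recentness: a smaller elapsed time should give a higher level,
# so the scale is flipped after binning.
days_since_play = [1, 5, 400]
recentness_levels = 4 - to_three_levels(days_since_play)
```

The same helper would apply to global popularity; only recentness needs the flip, because there 3 means "listened to more recently".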


Page 8: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping


Recalling the 1st study (3/5)

• Demographics Survey + Rating 100 albums


Page 9: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping


Recalling the 1st study (4/5)

• Gender
• Age
• Country
• Hours per week spent on internet [int_hrs_per_week]
• Hours per week listening to music online [msc_hrs_per_week]
• Number of concerts per year [conc_per_year]
• Do you read specialized music blogs or magazines? [blogs_mag]
• Do you have experience evaluating music online? [rate_music]
• How frequently do you buy physical music records? [buy_records]
• How frequently do you buy music online? [buy_online]
• Do you prefer listening to single tracks, whole albums or either way? [track_or_CD]


Page 10: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Recalling the 1st study (5/5)

• Prediction of rating by multiple Linear Regression evaluated with RMSE.

• Results showed that implicit feedback (play count of the album by a specific user) and recentness (how recently an album was listened to) were important factors, while global popularity had a weaker effect.

• Results also showed that listening style (whether the user preferred to listen to single tracks, CDs, or either) was an important factor, while the other survey variables were not.
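As a rough illustration of the setup described above (rating predicted by multiple linear regression, evaluated with RMSE), here is a minimal sketch on synthetic data. The coefficients, noise level, and variable names are assumptions made for the example; they are not the study's data or fitted values.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic predictors on the 1-3 scale (toy data, not the study's).
implicit = rng.integers(1, 4, size=n)
recentness = rng.integers(1, 4, size=n)
popularity = rng.integers(1, 4, size=n)

# Toy ratings generated from assumed coefficients plus Gaussian noise.
rating = (1.0 + 0.8 * implicit + 0.4 * recentness
          + 0.05 * popularity + rng.normal(0, 0.5, n))

# Ordinary least squares with an intercept column.
X = np.column_stack([np.ones(n), implicit, recentness, popularity])
coef, *_ = np.linalg.lstsq(X, rating, rcond=None)

# In-sample RMSE of the fit versus a predict-the-mean baseline.
pred = X @ coef
rmse = float(np.sqrt(np.mean((rating - pred) ** 2)))
baseline_rmse = float(np.sqrt(np.mean((rating - rating.mean()) ** 2)))
```

Comparing `rmse` against `baseline_rmse` mirrors the idea of checking whether the predictors improve on a simple average.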

Page 11: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

... but

• Linear Regression didn’t account for the nested nature of ratings

• And ratings were treated as continuous, when they are actually ordinal.

User 1: 1 3 5 3 0 4 5 2 2 1 5 4 3 2

. . .

User n: 3 2 1 0 4 5 2 5 4 3 2 1 3 5

Page 12: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

So, Ordinal Logistic Regression!

• Actually, Mixed-Effects Ordinal Multinomial Logistic Regression

• Mixed-effects: accounts for the nested nature of ratings
• We obtain a distribution over ratings (ordinal multinomial) for each (USER, ITEM) pair -> we predict the rating using the expected value.

• … And we can compare the inferred ratings with a method that directly uses implicit information (playcounts) to recommend (Hu, Koren et al., 2008)
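To make the "distribution over ratings, then expected value" step concrete, here is a minimal sketch using the standard cumulative-logit (proportional odds) form. The thresholds, the 0-5 rating scale, and the function names are assumptions for illustration; in a mixed-effects model, `eta` would be the fixed-effects linear predictor plus a per-user random intercept.

```python
import numpy as np

def category_probabilities(eta, thresholds):
    """P(rating = k), assuming P(rating <= k) = sigmoid(threshold_k - eta)."""
    thresholds = np.asarray(thresholds, dtype=float)
    cumulative = 1.0 / (1.0 + np.exp(-(thresholds - eta)))
    # Pad with 0 and 1 so that successive differences give each category.
    cumulative = np.concatenate([[0.0], cumulative, [1.0]])
    return np.diff(cumulative)

def expected_rating(eta, thresholds, levels):
    """Expected value of the rating under the category distribution."""
    probs = category_probabilities(eta, thresholds)
    return float(np.dot(probs, levels))

levels = np.array([0, 1, 2, 3, 4, 5])           # assumed rating scale
thresholds = [-2.0, -1.0, 0.0, 1.0, 2.0]        # hypothetical cut points

low = expected_rating(-1.0, thresholds, levels)
mid = expected_rating(0.0, thresholds, levels)
high = expected_rating(1.0, thresholds, levels)
```

A larger `eta` shifts probability mass toward higher ratings, so the expected rating grows with it.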

Page 13: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Model [to predict rating user x item]

• Model

• Predicted value

Page 14: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Final MELR Model with 4 fixed effects

Page 15: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Section II : Not that Good News

• Datasets I and II
• Results measured as MAP and nDCG.
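For readers unfamiliar with the two metrics, the following is a minimal sketch of one common definition of each. Several variants exist (cutoffs, gain functions, normalization), and the slides do not state which one was used, so the exact formulas here are assumptions.

```python
import math

def average_precision(ranked_items, relevant):
    """Mean of precision@rank over the ranks where a relevant item appears."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked_items, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(per_user):
    """MAP: average of per-user AP; per_user maps user -> (ranking, relevant set)."""
    scores = [average_precision(r, rel) for r, rel in per_user.values()]
    return sum(scores) / len(scores) if scores else 0.0

def ndcg(ranked_items, gains):
    """DCG with a log2 discount, normalized by the ideal ordering's DCG."""
    dcg = sum(gains.get(item, 0.0) / math.log2(rank + 1)
              for rank, item in enumerate(ranked_items, start=1))
    ideal = sorted(gains.values(), reverse=True)[:len(ranked_items)]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0

# Toy ranking for one hypothetical user.
ap = average_precision(["a", "b", "c", "d"], {"a", "c"})
score = ndcg(["a", "b", "c", "d"], {"a": 3.0, "c": 1.0})
```

MAP needs a binary notion of relevance, while nDCG can use graded gains such as ratings or playcounts.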

Page 16: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Datasets

• D1: users, albums, if, re, gp, ratings, demographics/consumption

• D2: users, albums, if, re, gp, NO RATINGS.

Page 17: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Experiments

• First step: build MELR model using D1
• For D1 and D2: split dataset in 5 parts to perform a 5-fold cross validation
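The 5-fold split can be sketched as follows. This is a generic random split over row indices; whether the original experiments split by rows, by users, or with stratification is not stated, so treat that choice (and the helper name `k_fold_indices`) as an assumption.

```python
import numpy as np

def k_fold_indices(n_rows, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_rows)
    folds = np.array_split(shuffled, k)
    for i in range(k):
        test_idx = folds[i]
        # Training rows are all folds except the held-out one.
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

# Example with a hypothetical dataset of 23 rows.
splits = list(k_fold_indices(23, k=5))
```

Each row lands in exactly one test fold, so averaging a metric over the five folds uses every observation once for evaluation.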

Page 18: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Results

Page 19: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Section III: Expected Good News

• After Analyzing our data/process

Page 20: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Lessons / Challenges (1/2)

Problem / Challenge

1. Ground truth: Playcounts of albums or tracks?

2. Quantization of playcounts (implicit feedback), recentness, and overall number of listeners of an album (global popularity): [1-3] scale vs. raw playcounts

3. Defining Relevancy of recommended elements (to compare with the raw playcounts)

Page 21: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Lessons / Challenges (2/2)

Problem / Challenge

4. Additional/Alternative metrics for evaluation [MAP and nDCG used in the paper]

5. New Survey (in order to deal with the issues of identifying “actual” relevancy in dataset 2)

6. Significance of level-2 variables: track_or_CD (study 1 vs. 2, where concerts_per_year was significant)

Page 22: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

… so

• Lots of work to do [after my comps]
• Questions, Suggestions?

Page 23: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping


Is this the end?


Page 24: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Thanks!

• Denis Parra [email protected]

Page 25: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Backup slides

Page 26: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Complete Results for D2

Method       AVG(MAP)      AVG(nDCG)     SD(MAP)       SD(nDCG)
Koren        0.101428473   0.271841949   0.001504383   0.001906666
KorenLog     0.123444659   0.295368576   0.002105906   0.002239792
logit_3      0.12225702    0.294351155   0.000868738   0.001100289
popularity   0.01777665    0.136672774   0.000937768   0.00086512
linear_2     0.123417895   0.295026219   0.001830371   0.001554675
linear_3     0.122317675   0.294211465   0.001744534   0.00212961

Page 27: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Distribution of ratings

Page 28: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Actual Distribution of ratings

Page 29: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Some Intuition About the Results

• Distributions of ratings in both datasets

Page 30: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping


4 Regression Analysis

• Including Recentness increases R² by more than 10% [1 -> 2]
• Including GP increases R², but not by much compared to RE + IF [1 -> 3]
• Not including GP, but including the interaction between IF and RE, improves the variance of the DV explained by the regression model [2 -> 4]

M1: implicit feedback

M2: implicit feedback & recentness

M3: implicit feedback, recentness, global popularity

M4: Interaction of implicit feedback & recentness
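The comparison of M1 to M4 by explained variance can be sketched as below, on synthetic data. Two things are assumptions here: that M4 contains the main effects of IF and RE plus their product, and the toy coefficients used to generate the ratings. The resulting numbers say nothing about the study's actual R² values.

```python
import numpy as np

def r_squared(X, y):
    """In-sample R^2 of an ordinary least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coef
    return 1.0 - residual.var() / y.var()

rng = np.random.default_rng(1)
n = 300
imp = rng.integers(1, 4, size=n).astype(float)   # implicit feedback (IF)
rec = rng.integers(1, 4, size=n).astype(float)   # recentness (RE)
pop = rng.integers(1, 4, size=n).astype(float)   # global popularity (GP)

# Toy ratings with an IF x RE interaction (assumed generating process).
y = 0.5 + 0.6 * imp + 0.3 * rec + 0.2 * imp * rec + rng.normal(0, 0.7, n)

r2 = {
    "M1": r_squared(np.column_stack([imp]), y),
    "M2": r_squared(np.column_stack([imp, rec]), y),
    "M3": r_squared(np.column_stack([imp, rec, pop]), y),
    "M4": r_squared(np.column_stack([imp, rec, imp * rec]), y),
}
```

Because in-sample R² never decreases when predictors are added to a nested model, a cross-validated error check is a useful complement to this kind of comparison.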

Page 31: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping


4.1 Regression Analysis

• We tested conclusions of regression analysis by predicting the score, checking RMSE in 10-fold cross validation.

• Results of regression analysis are supported.

Model                                                    RMSE1    RMSE2
User average                                             1.5308   1.1051
M1: Implicit feedback                                    1.4206   1.0402
M2: Implicit feedback + recentness                       1.4136   1.034
M3: Implicit feedback + recentness + global popularity   1.4130   1.0338
M4: Interaction of Implicit feedback * recentness        1.4127   1.0332

Page 32: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping


4.2 Regression Analysis – Track or Album

• Including this variable, which seemed to have an effect in the general analysis, helped to improve the accuracy of the model

Model                                                    Tracks   Tracks/Albums   Albums
User average                                             1.1833   1.1501          1.1306
M1: Implicit feedback                                    1.0417   1.0579          1.0257
M2: Implicit feedback + recentness                       1.0383   1.0512          1.0169
M3: Implicit feedback + recentness + global popularity   1.0386   1.0507          1.0159
M4: Interaction of Implicit feedback * recentness        1.0384   1.049           1.0159

Page 33: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping


Is this the end?
