Fast incremental matrix factorization for recommendation with positive-only feedback
João Vinagre1,2
Alípio Jorge1,2
João Gama1,3
1 LIAAD - INESC TEC, Porto, Portugal
2 Faculdade de Ciências – Universidade do Porto
3 Faculdade de Economia – Universidade do Porto
UMAP 2014, July 7-11, Aalborg, Denmark
Outline
1. Introduction
2. Incremental matrix factorization
3. Evaluation protocol
4. Experiments and results
5. Conclusions and future work
1. Introduction
● Contributions:
● Incremental matrix factorization algorithm
– for positive-only feedback (aka binary)
– online learning with SGD
● Online evaluation protocol
– online monitoring
– evaluation over time
2. Incremental MF: motivation
● Q: Why incremental?
● Recommendation algorithms deal with ever-growing user feedback:
– continuous data flow
– variable rate
– unpredictable order
● User behavior is naturally complex:
– preference drifts / shifts
– moods
● A: Typical data stream mining problem
2. Incremental MF: motivation
● Q: Why use positive-only feedback?
● ratings data (explicit):
● positive-only feedback (typically implicit)
– like button
– web access logs
– music listening / playlisting
– shopping history
– news reading
– event participation
– …
● A: Widely available and less intrusive (both for user and for system)
2. Matrix factorization – Batch SGD
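$R \approx A \cdot B^T$ (R: user-item ratings matrix; A: user factor matrix; B: item factor matrix)
● Training: minimize the prediction error for known ratings → stochastic gradient descent (SGD)
● Process is iterative (several passes through the dataset)
● Prediction: $\hat{R}_{ui} = A_u \cdot B_i^T$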
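A minimal sketch of batch SGD training for the factorization above, assuming a squared-error loss with L2 regularization. The function names, the hyperparameters (`k`, `lr`, `reg`, `n_epochs`) and the random initialization are illustrative assumptions, not values taken from the slides.

```python
import numpy as np

def batch_sgd_mf(ratings, n_users, n_items, k=10, lr=0.05, reg=0.01,
                 n_epochs=50, seed=0):
    """Factorize known ratings into A (users x k) and B (items x k).

    ratings: list of (u, i, r) tuples with integer user and item indices.
    All hyperparameter defaults are assumptions chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    A = rng.normal(0.0, 0.1, size=(n_users, k))
    B = rng.normal(0.0, 0.1, size=(n_items, k))
    ratings = list(ratings)
    for _ in range(n_epochs):                     # several passes over the data
        for idx in rng.permutation(len(ratings)):  # shuffled order in each pass
            u, i, r = ratings[idx]
            err = r - A[u] @ B[i]                 # prediction error for (u, i)
            a_u = A[u].copy()                     # keep old A_u for the B_i step
            A[u] += lr * (err * B[i] - reg * A[u])
            B[i] += lr * (err * a_u - reg * B[i])
    return A, B

def rmse(ratings, A, B):
    """Root mean squared error of the predictions A_u . B_i on known ratings."""
    errs = [r - A[u] @ B[i] for u, i, r in ratings]
    return float(np.sqrt(np.mean(np.square(errs))))
```

Calling `batch_sgd_mf(ratings, n_users, n_items)` on a small list of `(u, i, r)` tuples and comparing `rmse` before and after training shows the iterative error reduction the slide refers to.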
2. Matrix Factorization: Incremental SGD
● Positive-only feedback: assume $R_{ui} = 1$ for every observed (u,i)
● Incremental learning:
● for each newly observed (u,i), $A_u$ and $B_i$ are adjusted
● only one pass is performed (i.e. one iteration)
● Recommendation:
● sort items i by ascending $|1 - \hat{R}_{ui}|$ for each user u (predictions closest to 1 rank first)
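The incremental scheme on this slide can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the class name, the hyperparameters, the random initialization of unseen users and items, the simultaneous update of both vectors, and the exclusion of already-seen items from recommendations are all choices made for this example.

```python
import numpy as np

class ISGD:
    """Sketch of incremental SGD matrix factorization for positive-only feedback."""

    def __init__(self, k=10, lr=0.05, reg=0.01, seed=0):
        self.k, self.lr, self.reg = k, lr, reg   # assumed hyperparameters
        self.rng = np.random.default_rng(seed)
        self.A = {}      # user id -> factor vector A_u
        self.B = {}      # item id -> factor vector B_i
        self.seen = {}   # user id -> set of items already observed

    def _init_vector(self):
        # Assumption: new users and items start from small random factors.
        return self.rng.normal(0.0, 0.1, size=self.k)

    def update(self, u, i):
        """One SGD step for a newly observed (u, i), with target R_ui = 1."""
        if u not in self.A:
            self.A[u] = self._init_vector()
        if i not in self.B:
            self.B[i] = self._init_vector()
        a, b = self.A[u], self.B[i]
        err = 1.0 - a @ b
        # Both vectors are updated from the same pre-update values.
        self.A[u] = a + self.lr * (err * b - self.reg * a)
        self.B[i] = b + self.lr * (err * a - self.reg * b)
        self.seen.setdefault(u, set()).add(i)

    def recommend(self, u, n=10):
        """Top-n unseen items, ranked by how close the prediction is to 1."""
        if u not in self.A:
            return []
        candidates = [i for i in self.B if i not in self.seen[u]]
        scores = {i: abs(1.0 - self.A[u] @ self.B[i]) for i in candidates}
        return sorted(candidates, key=scores.get)[:n]
```

Each event triggers exactly one call to `update`, so the cost per event does not depend on how much data has been seen before, which is the property the single-pass bullet relies on.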
3. Evaluation protocol
● Traditional train/test holdout → assumes stationary data
● Our approach: prequential evaluation
for each (u,i):
1 – recommend items to u (if u is known)
2 – score recommendation, given the observed i
3 – update the model with (u,i)
● Continuous monitoring
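The three prequential steps above can be sketched as a single loop over the event stream. The `PopularityModel` below is a hypothetical stand-in recommender used only to keep the example self-contained; any model exposing `knows`, `recommend` and `update` would fit. Scoring each event as 1 when the observed item is in the top-n list and 0 otherwise, and smoothing with a moving average, are assumptions about how the monitoring might be done.

```python
from collections import Counter

class PopularityModel:
    """Hypothetical stand-in: recommends the most frequently observed items."""

    def __init__(self):
        self.counts = Counter()
        self.users = set()

    def knows(self, u):
        return u in self.users

    def recommend(self, u, n=10):
        return [item for item, _ in self.counts.most_common(n)]

    def update(self, u, i):
        self.users.add(u)
        self.counts[i] += 1

def prequential_recall(model, stream, n=10):
    """Test-then-learn loop; returns one 0/1 score per event of a known user."""
    scores = []
    for u, i in stream:
        if model.knows(u):
            recs = model.recommend(u, n)          # 1 - recommend items to u
            scores.append(1 if i in recs else 0)  # 2 - score against observed i
        model.update(u, i)                        # 3 - update the model
    return scores

def moving_average(values, window):
    """Running mean over the last `window` scores, for monitoring over time."""
    out, total = [], 0
    for idx, v in enumerate(values):
        total += v
        if idx >= window:
            total -= values[idx - window]
        out.append(total / min(idx + 1, window))
    return out
```

Plotting `moving_average(prequential_recall(model, stream), window)` against the event index gives a curve of accuracy over time instead of a single aggregate number.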
4. Experiments and results: algorithms and data
● Algorithms:
● UserKNN (UKNN): incremental user-based neighborhood (classic reference);
● (W)BPRMF: (Weighted) Bayesian Personalized Ranking Matrix Factorization (Rendle et al. 2009);
● ISGD: incremental SGD matrix factorization (the algorithm proposed here).
● Datasets:
Dataset          Events    # Users   # Items   Time frame   Sparsity
Lastfm-600k      493,063       164    65,013    8 months    99.11%
Music-playlist   111,942    10,392    26,117   45 months    99.96%
Music-listen     335,731     4,768    15,323   12 months    99.90%
Movielens        226,310     6,014     3,232   34 months    98.84%
4. Experiments and results: overall results
Dataset          Algorithm   Recall@10   Update time
Lastfm-600k      BPRMF       0.003         28.061
Lastfm-600k      WBPRMF      0.003         29.194
Lastfm-600k      ISGD        0.034          1.106
Lastfm-600k      UKNN        0.006        290.133
Music-playlist   BPRMF       0.020          1.889
Music-playlist   WBPRMF      0.057          2.156
Music-playlist   ISGD        0.171          0.949
Music-playlist   UKNN        0.132        190.250
Music-listen     BPRMF       0.028          0.846
Music-listen     WBPRMF      0.056          1.187
Music-listen     ISGD        0.061          0.118
Music-listen     UKNN        0.139        328.917
Movielens-1M     BPRMF       0.080          0.173
Movielens-1M     WBPRMF      0.084          0.229
Movielens-1M     ISGD        0.050          0.016
Movielens-1M     UKNN        0.110         84.927
4. Experiments and results: evolving results
5. Conclusions
● Algorithm:
● ISGD is clearly faster than the tested alternatives;
● accuracy of ISGD is competitive, but not a clear winner.
● Evaluation protocol:
● allows a fine-grained assessment of results, by continuously monitoring the learning process.
5. Future work
● Deal with variability in results:
● assess the sensitivity of algorithms to data meta-features
– domain
– sparseness
– user-item ratio
– user behaviour (in)consistency
– noise, shilling attacks, etc.
● Verify convergence between incremental and batch
● Deal with temporal effects:
● forgetting
● time-awareness
● drift/shift detection → automatic parameter tuning