Matrix Factorisation
Transcript of Matrix Factorisation
![Page 1: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/1.jpg)
Matrix Factorization Techniques for Recommender Systems
Presented by Peng Xu
Supervised by Prof. Michel Desmarais
By Yehuda Koren, Robert Bell, and Chris Volinsky
1
![Page 2: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/2.jpg)
Contents
1. Introduction
2. Recommender System Strategies
3. Matrix Factorization Methods
4. A Basic Matrix Factorization Model
5. Learning Algorithms
6. Adding Biases
7. Additional Input Sources
8. Temporal Dynamics
9. Inputs With Varying Confidence Levels
10. Netflix Prize Competition
2
![Page 3: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/3.jpg)
1. Introduction
Why a recommender system? (from a corporation's perspective)
• Modern consumers are inundated with choices.
• Recommendations are key to enhancing user satisfaction and loyalty.
• They are particularly useful for entertainment products because of the huge volume of available data.
3
![Page 4: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/4.jpg)
2. Recommender System Strategies
Content Filtering
Based on the profile!
Main problem:
External information needed!
Example: Music Genome Project
Collaborative Filtering
Based on the interactions!
Main problem:
Cold Start!
Example: Netflix
4
![Page 5: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/5.jpg)
Collaborative Filtering
Major appeal:
• It is domain free!
• It can address data aspects that are often elusive and difficult to profile.
• It is generally more accurate than content-based techniques.
Two areas of collaborative filtering:
• Neighborhood methods
• Latent factor models
5
![Page 6: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/6.jpg)
Neighborhood Methods
This method centers on computing the relationships between items or between users.
Item-oriented approach
User-oriented approach (illustrated by Figure 1)
6
![Page 7: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/7.jpg)
Latent factor models
This method tries to explain the user’s ratings by characterizing both items and users on some factors inferred from the rating patterns.
That means you don't need to know the factors beforehand.
7
![Page 8: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/8.jpg)
3. Matrix Factorization Methods
Q: How do we realize latent factor models?
A: Matrix factorization.
Q: Why has it become popular?
A: 1. Good scalability
   2. Accurate prediction
   3. Flexibility
8
![Page 9: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/9.jpg)
4. A Basic Matrix Factorization Model
Idea: each item $i$ is represented by a vector $q_i \in \mathbb{R}^f$, and each user $u$ by a vector $p_u \in \mathbb{R}^f$,
and the inner product $q_i^T p_u$ captures the interaction between user $u$ and item $i$.
Thus user $u$'s rating of item $i$, which is denoted by $r_{ui}$, can be estimated by

$$\hat{r}_{ui} = q_i^T p_u$$
9
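The inner-product estimate on this slide can be sketched in a few lines of NumPy; the factor vectors and the dimensionality $f=3$ below are made-up toy values, not from the slides:

```python
import numpy as np

# Hypothetical latent factor vectors with f = 3 factors
q_i = np.array([1.0, 0.5, -0.2])  # item i's factor vector
p_u = np.array([0.8, -0.1, 0.4])  # user u's factor vector

# Estimated rating: the inner product q_i^T p_u
r_ui_hat = q_i @ p_u
print(r_ui_hat)
```

Each coordinate of the factor space can be read as one inferred "factor" (e.g. comedy vs. drama); the dot product adds up how strongly the user's preferences align with the item's profile.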
![Page 10: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/10.jpg)
A Basic Matrix Factorization Model
• First, we would naturally think of SVD (Singular Value Decomposition), a well-established technique for identifying latent factors.
• However, two problems arise:
  1. Sparseness of the data
  2. Overfitting
10
![Page 11: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/11.jpg)
A Basic Matrix Factorization Model
• Therefore, we model only the observed ratings directly, while avoiding overfitting through a regularized model. That is, to learn the factor vectors ($p_u$ and $q_i$), the system minimizes the regularized squared error on the set of known ratings:

$$\min_{q^*, p^*} \sum_{(u,i) \in \kappa} \left( r_{ui} - q_i^T p_u \right)^2 + \lambda \left( \|q_i\|^2 + \|p_u\|^2 \right)$$

Here, $\kappa$ is the set of $(u,i)$ pairs for which $r_{ui}$ is known (the training set).
11
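As a concrete illustration of this objective, the following sketch evaluates the regularized squared error over a tiny training set; the ratings, factor values, and $\lambda$ are invented for the example:

```python
import numpy as np

def regularized_error(kappa, P, Q, lam):
    """Sum over known ratings of (r_ui - q_i^T p_u)^2
    plus lambda * (||q_i||^2 + ||p_u||^2) for each term."""
    total = 0.0
    for (u, i), r_ui in kappa.items():
        err = r_ui - Q[i] @ P[u]
        total += err ** 2 + lam * (Q[i] @ Q[i] + P[u] @ P[u])
    return total

# Tiny example: 2 users, 2 items, f = 2 factors (all values invented)
P = np.array([[0.5, 0.1], [0.3, 0.9]])   # user factor vectors p_u
Q = np.array([[1.0, 0.2], [0.4, 0.8]])   # item factor vectors q_i
kappa = {(0, 0): 4.0, (1, 1): 2.0}       # known ratings r_ui
print(regularized_error(kappa, P, Q, lam=0.1))
```

Note that the sum runs only over the observed $(u,i)$ pairs in $\kappa$, which is exactly how the model sidesteps the sparseness problem mentioned on the previous slide.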
![Page 12: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/12.jpg)
5. Learning Algorithms
There are two approaches to minimizing the objective function:
1. Stochastic gradient descent
   • Ease of implementation
   • Relatively fast running time
2. Alternating least squares (ALS)
   • When the system can use parallelization
   • When the system is centered on implicit data
12
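A minimal sketch of the stochastic-gradient-descent variant: for each known rating the prediction error $e_{ui} = r_{ui} - q_i^T p_u$ is computed, and $q_i$ and $p_u$ are moved opposite the gradient of the regularized objective. The learning rate, regularization constant, and toy data below are arbitrary choices for the sketch:

```python
import numpy as np

def sgd_epoch(kappa, P, Q, gamma=0.05, lam=0.05):
    """One pass of stochastic gradient descent over the known ratings.

    For each observed r_ui:
      e_ui = r_ui - q_i^T p_u
      q_i <- q_i + gamma * (e_ui * p_u - lam * q_i)
      p_u <- p_u + gamma * (e_ui * q_i - lam * p_u)
    """
    for (u, i), r_ui in kappa.items():
        e = r_ui - Q[i] @ P[u]
        q_old = Q[i].copy()           # use the pre-update q_i for p_u's step
        Q[i] += gamma * (e * P[u] - lam * Q[i])
        P[u] += gamma * (e * q_old - lam * P[u])

# Toy data: predictions should approach the observed ratings
rng = np.random.default_rng(0)
P = rng.normal(scale=0.1, size=(2, 3))   # 2 users, f = 3 factors
Q = rng.normal(scale=0.1, size=(2, 3))   # 2 items
kappa = {(0, 0): 4.0, (0, 1): 3.0, (1, 0): 5.0}
for _ in range(200):
    sgd_epoch(kappa, P, Q)
print([round(Q[i] @ P[u], 2) for (u, i) in kappa])
```

ALS instead fixes all $p_u$ and solves a least-squares problem for each $q_i$ (and vice versa), which is why it parallelizes well: each user's and each item's subproblem is independent.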
![Page 13: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/13.jpg)
6. Adding Biases
Problems:
1. Some users tend to rate high while others tend to rate low.
2. Some products are widely perceived as better than others.
Therefore, it would be unwise to explain the full rating value by an interaction of just the form $q_i^T p_u$.
13
![Page 14: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/14.jpg)
Adding Biases
• So, let's consider a first-order approximation of the bias involved in a rating:

$$b_{ui} = \mu + b_i + b_u$$

• The overall average rating is denoted by $\mu$; the parameters $b_u$ and $b_i$ indicate the observed deviations of user $u$ and item $i$, respectively.
14
![Page 15: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/15.jpg)
Adding Biases
• So, the estimation becomes

$$\hat{r}_{ui} = \mu + b_i + b_u + q_i^T p_u$$

• And the objective function becomes

$$\min_{p^*, q^*, b^*} \sum_{(u,i) \in \kappa} \left( r_{ui} - \mu - b_u - b_i - p_u^T q_i \right)^2 + \lambda \left( \|p_u\|^2 + \|q_i\|^2 + b_u^2 + b_i^2 \right)$$
15
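The biased estimate decomposes a prediction into global average, item bias, user bias, and the interaction term. A numeric sketch with invented values (a slightly critical user, a well-liked item):

```python
import numpy as np

def predict(mu, b_u, b_i, p_u, q_i):
    """Biased matrix-factorization estimate: mu + b_i + b_u + q_i^T p_u."""
    return mu + b_i + b_u + q_i @ p_u

# Hypothetical numbers: global mean 3.7, user bias -0.3, item bias +0.5
mu, b_u, b_i = 3.7, -0.3, 0.5
p_u = np.array([0.2, -0.1])
q_i = np.array([0.4, 0.3])
print(predict(mu, b_u, b_i, p_u, q_i))
```

Separating the biases this way lets the factor vectors model only the part of the signal that is genuinely about user-item interaction.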
![Page 16: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/16.jpg)
7. Additional Input Sources
• $N(u)$ denotes the set of items for which user $u$ expressed an implicit preference, and item $i$ is associated with $x_i \in \mathbb{R}^f$. Accordingly, a user who showed a preference for items in $N(u)$ is characterized by the vector

$$\sum_{i \in N(u)} x_i$$

Normalizing it, we get

$$|N(u)|^{-0.5} \sum_{i \in N(u)} x_i$$
16
![Page 17: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/17.jpg)
Additional Input Sources
• Similarly, we use $A(u)$ and $y_a \in \mathbb{R}^f$ to model another type of information known as user attributes. $A(u)$ denotes the set of attributes of user $u$, and $y_a$ indicates the response due to attribute $a$. Thus user $u$'s response to all his attributes is denoted by

$$\sum_{a \in A(u)} y_a$$

• And thus our estimation becomes

$$\hat{r}_{ui} = \mu + b_i + b_u + q_i^T \left[ p_u + |N(u)|^{-0.5} \sum_{i \in N(u)} x_i + \sum_{a \in A(u)} y_a \right]$$

17
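In this extended model the user representation inside the inner product is augmented by normalized implicit-feedback item vectors and by attribute vectors. A toy sketch, with all vectors and counts invented for illustration:

```python
import numpy as np

def predict_extended(mu, b_u, b_i, q_i, p_u, x_items, y_attrs):
    """mu + b_i + b_u + q_i^T (p_u + |N(u)|^-0.5 * sum(x_i) + sum(y_a))."""
    implicit = np.sum(x_items, axis=0) / np.sqrt(len(x_items))
    attrs = np.sum(y_attrs, axis=0)
    return mu + b_i + b_u + q_i @ (p_u + implicit + attrs)

# Hypothetical numbers: the user implicitly preferred 4 items (N(u))
# and has 2 attributes (A(u)), with f = 2 factors
x_items = np.full((4, 2), 0.1)                 # x_i vectors, i in N(u)
y_attrs = np.array([[0.2, 0.0], [0.0, 0.1]])   # y_a vectors, a in A(u)
p_u = np.array([0.5, -0.2])
q_i = np.array([0.3, 0.6])
print(predict_extended(3.7, -0.3, 0.5, q_i, p_u, x_items, y_attrs))
```

Because the implicit-feedback and attribute terms do not depend on any explicit rating, this formulation can produce non-trivial predictions even for users with no ratings at all, easing the cold-start problem mentioned earlier.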
![Page 18: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/18.jpg)
8. Temporal Dynamics
• In reality, product perception and popularity, as well as customers' inclinations, may change over time; thus the system should account for temporal effects reflecting the dynamic, time-drifting nature of user-item interactions:

$$\hat{r}_{ui}(t) = \mu + b_i(t) + b_u(t) + q_i^T p_u(t)$$
18
![Page 19: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/19.jpg)
9. Inputs With Varying Confidence Levels
• The matrix factorization model can readily accept varying confidence levels. If the confidence in observing $r_{ui}$ is denoted by $c_{ui}$, then the model becomes:

$$\min_{p^*, q^*, b^*} \sum_{(u,i) \in \kappa} c_{ui} \left( r_{ui} - \mu - b_u - b_i - p_u^T q_i \right)^2 + \lambda \left( \|p_u\|^2 + \|q_i\|^2 + b_u^2 + b_i^2 \right)$$
19
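A sketch of the confidence-weighted objective: each squared-error term is scaled by $c_{ui}$ before the regularization penalty is added. All numbers below are hypothetical (e.g. a confidence of 5 for an item watched repeatedly vs. 1 for a single view):

```python
import numpy as np

def weighted_objective(kappa, conf, mu, b_u, b_i, P, Q, lam):
    """Confidence-weighted regularized squared error:
    sum over (u,i) of c_ui * (r_ui - mu - b_u - b_i - p_u^T q_i)^2
    + lambda * (||p_u||^2 + ||q_i||^2 + b_u^2 + b_i^2)."""
    total = 0.0
    for (u, i), r_ui in kappa.items():
        e = r_ui - mu - b_u[u] - b_i[i] - P[u] @ Q[i]
        total += conf[(u, i)] * e ** 2
        total += lam * (P[u] @ P[u] + Q[i] @ Q[i] + b_u[u] ** 2 + b_i[i] ** 2)
    return total

mu = 3.0
b_u, b_i = [0.1, -0.2], [0.3, 0.0]
P = np.array([[0.5, 0.1], [0.2, 0.4]])
Q = np.array([[0.3, 0.2], [0.1, 0.6]])
kappa = {(0, 0): 4.0, (1, 1): 2.0}
conf = {(0, 0): 1.0, (1, 1): 5.0}   # watched repeatedly vs. once
print(weighted_objective(kappa, conf, mu, b_u, b_i, P, Q, lam=0.1))
```

High-confidence observations thus dominate the fit, while low-confidence ones (e.g. inferred from sparse implicit signals) have proportionally less influence.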
![Page 20: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/20.jpg)
10. Netflix Prize Competition
• Started in 2006, ended in 2009
• A $1M prize for the first team to improve the algorithm's RMSE (Root Mean Square Error) performance by 10%
• The authors of this paper were members of the final winning team
20
![Page 21: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/21.jpg)
Netflix Prize Competition
21
![Page 22: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/22.jpg)
Netflix Prize Competition
22
![Page 23: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/23.jpg)
Summary
• Matrix factorization techniques have become a dominant methodology within collaborative filtering recommenders.
• Experience with datasets such as the Netflix Prize data has shown that they deliver accuracy superior to classical nearest-neighbor techniques.
23
![Page 24: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/24.jpg)
Summary
• At the same time, they offer a compact, memory-efficient model that systems can learn relatively easily.
• What makes these techniques even more convenient is that models can integrate naturally many crucial aspects of the data, such as multiple forms of feedback, temporal dynamics, and confidence levels.
24
![Page 25: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/25.jpg)
Any questions?
25
![Page 26: Matrix Factorisation](https://reader035.fdocuments.us/reader035/viewer/2022082205/5695d4111a28ab9b02a02b98/html5/thumbnails/26.jpg)
Thank you!
26