Restricted Boltzmann Machines for Collaborative Filtering · 22/10/2019 · Presenter: Vijay Shankar Venkataraman


Restricted Boltzmann Machines for Collaborative Filtering

Authors: Ruslan Salakhutdinov, Andriy Mnih, and Geoffrey Hinton
Proceedings of the 24th International Conference on Machine Learning (ICML). ACM, 2007

Presenter: Vijay Shankar Venkataraman
Facilitators: Omar Nada, Jesse Cresswell

Oct 22, 2019

Netflix Prize Dataset (2006)

● Features
○ <user, movie, date of grade, grade>
○ 480,189 users rated 17,770 movies on a 5-point scale

● Training Data (100,480,507 ratings)
○ Ratings given by 480,189 users to 17,770 movies
○ Training set - 99,072,112
○ Probe set - 1,408,395

● Qualifying set (2,817,131 ratings)
○ Quiz set - 1,408,342
○ Test set - 1,408,789

2

Grand Prize of $1M for improving the RMSE by 10% over Netflix’s Cinematch model!

Collaborative Filtering

Key Ideas

● Filtering - predict content likely to get high ratings from a given user

● Collaborative - identify users who have rated content similarly

Methods - use neighbourhood-based or model-based approaches

3

(2013). Retrieved from https://en.wikipedia.org/wiki/Collaborative_filtering#/media/File:Collaborative_filtering.gif

Restricted Boltzmann Machines

Energy based unsupervised models

Boltzmann distribution over joint configurations

Conditional independence: given the visible units, the hidden units are independent, and vice versa

4
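As a minimal numerical sketch of these three ideas (the helper names, shapes, and toy numbers are illustrative, not from the paper): the energy of a binary RBM is E(v, h) = -aᵀv - bᵀh - vᵀWh, and because there are no intra-layer connections, each conditional factorizes into independent sigmoids.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def energy(v, h, W, a, b):
    """E(v, h) = -a'v - b'h - v'Wh for a binary RBM."""
    return -a @ v - b @ h - v @ W @ h

def p_h_given_v(v, W, b):
    """Hidden units are conditionally independent given v."""
    return sigmoid(b + v @ W)

def p_v_given_h(h, W, a):
    """Visible units are conditionally independent given h."""
    return sigmoid(a + W @ h)

# Toy example: 3 visible units, 2 hidden units.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(3, 2))  # weights
a = np.zeros(3)                         # visible biases
b = np.zeros(2)                         # hidden biases
v = np.array([1.0, 0.0, 1.0])
h = np.array([1.0, 1.0])
print(energy(v, h, W, a, b))
print(p_h_given_v(v, W, b))
```

Low-energy configurations are the probable ones: p(v, h) ∝ exp(-E(v, h)), the Boltzmann distribution.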


Restricted Boltzmann Machines
Learning Algorithm

- Make observed states more probable
- Maximize the log probability

7

Expensive! Needs MCMC!

Restricted Boltzmann Machines
Learning Algorithm

- Make observed states more probable
- Maximize the log probability
- Exact gradient needs MCMC - expensive!
- Approximate by Contrastive Divergence

8

Restricted Boltzmann Machines
Intuition for Contrastive Divergence

10
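The intuition above in code: instead of running the Markov chain to equilibrium, CD-1 takes a single Gibbs step from the data and uses the reconstruction as the negative phase. This is a sketch for a binary RBM under my own naming; the paper applies the same rule to its softmax visible units.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, a, b, lr=0.1, rng=None):
    """One CD-1 update: positive phase from the data, negative phase
    from a single Gibbs step rather than a full MCMC run."""
    rng = np.random.default_rng(0) if rng is None else rng
    # Positive phase: hidden probabilities and a binary sample.
    ph0 = sigmoid(b + v0 @ W)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one step of Gibbs sampling (the "reconstruction").
    pv1 = sigmoid(a + W @ h0)
    ph1 = sigmoid(b + pv1 @ W)
    # Gradient approximation: <v h'>_data - <v h'>_reconstruction.
    W += lr * (np.outer(v0, ph0) - np.outer(pv1, ph1))
    a += lr * (v0 - pv1)
    b += lr * (ph0 - ph1)
    return W, a, b
```

One step is a biased but cheap estimate of the log-likelihood gradient; in practice it moves the model distribution toward the data.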

Restricted Boltzmann Machines for CF

11

One RBM per user: the visible layer has softmax units over the 5 possible ratings for the movies that user rated, and all users share the same weights

Marginal Distributions

Learning
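A sketch of the model's visible layer (the tensor shapes are my own convention, not the paper's): each rated movie gets K = 5 softmax visible units, so the conditional over a movie's rating given the hidden vector is a softmax over 5 logits.

```python
import numpy as np

def p_v_given_h(h, W, a):
    """Rating probabilities for each movie: K = 5 softmax visible units.
    Shapes (illustrative): W (num_movies, K, F), a (num_movies, K), h (F,)."""
    logits = a + W @ h                            # (num_movies, K)
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

# Toy check: 4 movies, K = 5 ratings, F = 3 hidden units.
rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(4, 5, 3))
a = np.zeros((4, 5))
h = np.array([1.0, 0.0, 1.0])
probs = p_v_given_h(h, W, a)
print(probs.sum(axis=1))  # each row sums to 1
```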


Restricted Boltzmann Machines for CF

14

Inference

- Use the hidden vector to generate rating probabilities

- Expected value works well

- Expensive to calculate for many movies

Restricted Boltzmann Machines for CF

15

Inference

- Paper uses a mean-field approximation: replace sampled binary hidden states with their real-valued probabilities, so prediction takes a single deterministic pass
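A sketch of why mean-field inference is cheap (function names and tensor shapes are my own convention): one bottom-up pass gives the hidden probabilities, one top-down pass gives each movie's rating distribution, and the prediction is its expected value. No sampling, no averaging over hidden configurations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict_ratings(v, W_user, W_all, a, b):
    """Mean-field prediction. Illustrative shapes:
    v (num_rated, K) one-hot ratings, W_user (num_rated, K, F),
    W_all (num_movies, K, F), a (num_movies, K), b (F,)."""
    # Bottom-up: hidden probabilities instead of binary samples.
    h_hat = sigmoid(b + np.einsum('ik,ikf->f', v, W_user))
    # Top-down: softmax rating distribution for every movie.
    logits = a + W_all @ h_hat                    # (num_movies, K)
    logits -= logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    # Expected rating per movie, on the 1..K scale.
    return p @ np.arange(1, p.shape[1] + 1)
```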

Restricted Boltzmann Machines for CF
Conditional RBMs

Conditional term: a binary vector r indicates which movies the user rated (including qualifying-set movies whose ratings are unknown) and shifts the hidden unit biases

Learning the conditional term: its weights are updated with the same contrastive divergence rule
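A sketch of the conditional term (the names and shapes are my convention): the rated/unrated indicator r contributes an extra additive term to the hidden pre-activations through a matrix of conditional weights, here called D.

```python
import numpy as np

def hidden_probs_conditional(v, r, W, D, b):
    """Hidden probabilities in a conditional RBM: the binary vector r of
    which movies the user rated shifts the hidden biases through D.
    Illustrative shapes: v (num_rated, K), W (num_rated, K, F),
    r (num_movies,), D (num_movies, F), b (F,)."""
    pre = b + np.einsum('ik,ikf->f', v, W) + r @ D
    return 1.0 / (1.0 + np.exp(-pre))
```

Even a user with no observed ratings (e.g. in the qualifying set) then influences the hidden units through r alone.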

Restricted Boltzmann Machines for CF

17

One last trick - Conditional Factored RBMs: factorize each weight matrix into a product of two low-rank matrices
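The payoff of factoring is a much smaller parameter count. Below is a back-of-the-envelope comparison at the sizes quoted on the experiments slide (F = 500, C = 30); the exact factorization shape is my reading of the low-rank scheme, so treat it as a sketch.

```python
# Parameter counts: full vs factored weights.
# M movies, K = 5 rating levels, F = 500 hidden units, C = 30 factors.
M, K, F, C = 17770, 5, 500, 30

full = M * K * F               # one weight per (movie, rating, hidden unit)
factored = M * K * C + C * F   # W^k ~ A^k B with A^k: (M, C), B: (C, F)

print(full, factored, full / factored)
```

At these sizes the factored model has roughly 16x fewer weight parameters, which speeds up learning and reduces overfitting.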

Experiments

18

Comparison between RBMs

RBM: F = 100
Conditional factored RBM: F = 500, C = 30

Experiments

19

Comparison with SVDs

Very different errors

Key takeaways

20

Energy based models like RBMs can learn good representations even for large datasets

Learning in RBMs does not rely on backpropagation

At the time, world-class performance on the Netflix dataset

Very different errors from Matrix Factorization Techniques

Discussion Points

Why is the mean-field inference faster?

How do we tackle the cold start problem?

How useful are energy based methods in machine learning today?

Why are the errors so different from matrix factorizations? Are the latent representations very different?

Do variational autoencoders do better than RBMs at generating latent representations?

21