NYU Guest Lecture II: Recommender Systems and the Netflix...

Data Mining - Volinsky - 2011 - Columbia University 1 Data Mining - Volinsky - 2011 - Columbia University NYU Guest Lecture II: Recommender Systems and the Netflix Prize Chris Volinsky AT&T Research


Page 1: NYU Guest Lecture II: Recommender Systems and the Netflix ...people.stern.nyu.edu/ja1517/pdsfall2012/NYU2-NetflixPrize.pdfLord of the Rings: The Return of the King 4.545 133597 The


NYU Guest Lecture II: Recommender Systems and the Netflix Prize

Chris Volinsky

AT&T Research


Recommender Systems

•  Systems that take user preferences about items as input and output recommendations
•  Early examples
   •  Bellcore Music Recommender (1995)
   •  MIT Media Lab: Firefly (1996)

Best example: Amazon.com
Worst example: Amazon.com
Also: Netflix, eBay, Google Reader, iTunes Genius, digg.com, Hulu.com


Recommender Systems

•  Basic idea
   –  Recommend item i to user u for the purpose of
      •  Exposing them to something they would not otherwise have seen
      •  Leading customers to the Long Tail
      •  Increasing customers’ satisfaction
•  Data for recommender systems (need to know who likes what)
   –  Purchases/rentals
   –  Ratings
   –  Web page views
   –  Which do you think is best?


Recommender Systems

•  Two types of data:
•  Explicit data: user provides information about their preferences
   –  Pro: high-quality ratings
   –  Con: hard to get; people cannot be bothered
•  Implicit data: infer whether or not user likes product based on behavior
   –  Pro: much more data available, less invasive
   –  Con: inference often wrong (does purchase imply preference?)
•  In either case, data is just a big matrix
   –  Users x items
   –  Entries binary or real-valued

[Figure: example ratings matrix]


The Netflix Prize


Netflix

•  A US-based DVD rental-by-mail company
•  >10M customers, 100K titles, ships 1.9M DVDs per day

Good recommendations = happy customers


Netflix Prize

•  October 2006:
   •  Netflix offers $1,000,000 for an improved recommender algorithm
•  Training data
   •  100 million ratings
   •  480,000 users
   •  17,770 movies
   •  6 years of data: 2000-2005
•  Test data
   •  Last few ratings of each user (2.8 million)
   •  Evaluation via RMSE: root mean squared error
   •  Netflix Cinematch RMSE: 0.9514
•  Competition
   •  $1 million grand prize for 10% improvement
   •  If 10% not met, $50,000 annual “Progress Prize” for best improvement
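The evaluation metric above can be sketched in a few lines. This is a minimal illustration, not the contest's scoring code: the function name `rmse` and the example numbers are hypothetical, and the grand prize target is simply 90% of the Cinematch RMSE quoted on the slide.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(actual))

CINEMATCH_RMSE = 0.9514                      # from the slide
GRAND_PRIZE_TARGET = 0.90 * CINEMATCH_RMSE   # a 10% improvement, roughly 0.8563

# Hypothetical predictions and true scores, for illustration only.
example_predicted = [3.5, 4.0, 2.0, 5.0]
example_actual = [4, 4, 1, 5]
example_rmse = rmse(example_predicted, example_actual)
```

Because the errors are squared before averaging, one prediction that is off by a full star costs as much as four predictions that are each off by half a star.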

Training data:

date         score   movie   user
2002-01-03   1       21      1
2002-04-04   5       213     1
2002-05-05   4       345     2
2002-05-05   4       123     2
2003-05-03   3       768     2
2003-10-10   5       76      3
2004-10-11   4       45      4
2004-10-11   1       568     5
2004-10-11   2       342     5
2004-12-12   2       234     5
2005-01-02   5       76      6
2005-01-31   4       56      6

Test data:

date         score   movie   user
2003-01-03   ?       212     1
2002-05-04   ?       1123    1
2002-07-05   ?       25      2
2002-09-05   ?       8773    2
2004-05-03   ?       98      2
2003-10-10   ?       16      3
2004-10-11   ?       2450    4
2004-10-11   ?       2032    5
2004-10-11   ?       9098    5
2004-12-12   ?       11012   5
2005-01-02   ?       664     6
2005-01-31   ?       1526    6

What would you do?


Netflix Prize

•  Hold-out set created by taking the last 9 ratings for each user
   –  Non-random, biased set
•  Hold-out set split randomly three ways:
   –  Probe Set: appended to training data to allow unbiased estimation of RMSE
   –  Submit ratings for the (Quiz+Test) Sets
   –  Netflix returns RMSE on the Quiz Set only
   –  Quiz Set results posted on public leaderboard, but Test Set used to determine the winner!
      »  Prevents overfitting
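The hold-out construction described above can be sketched as follows. This is only an illustration under stated assumptions: the tuple layout `(user, movie, score, date)`, the function name `split_holdout`, and the equal three-way split are hypothetical choices, not details taken from Netflix.

```python
import random
from collections import defaultdict

def split_holdout(ratings, n_last=9, seed=0):
    """Hold out each user's last n_last ratings, then split them three ways.

    ratings: list of (user, movie, score, date) tuples, date as an ISO string.
    Returns (training, probe, quiz, test).
    """
    if n_last < 1:
        raise ValueError("n_last must be at least 1")
    by_user = defaultdict(list)
    for row in ratings:
        by_user[row[0]].append(row)

    training, holdout = [], []
    for rows in by_user.values():
        rows.sort(key=lambda r: r[3])      # ISO dates sort chronologically
        training.extend(rows[:-n_last])    # everything but the last n_last
        holdout.extend(rows[-n_last:])     # the user's most recent ratings

    rng = random.Random(seed)
    rng.shuffle(holdout)                   # random split of the hold-out set
    third = len(holdout) // 3
    probe = holdout[:third]                # scores released alongside training data
    quiz = holdout[third:2 * third]        # scored on the public leaderboard
    test = holdout[2 * third:]             # scored privately to pick the winner
    return training, probe, quiz, test
```

Since the hold-out rows are each user's most recent ratings, they are not a random sample of all ratings, which is the bias the slide points out.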


Data Characteristics

[Figure: Mean Score vs. Date of Rating. x-axis: Date (2000 to 2006); y-axis: Mean Score (3.2 to 3.8).]

[Figure: Percentage of ratings at each Rating value (1 to 5), for Training (m = 3.60) and Probe (m = 3.67).]


Ratings per movie/user

Mean Rating   # Ratings   User ID
1.90          17,651      305344
1.81          17,432      387418
1.22          16,560      2439493
4.26          15,811      1664010
4.08          14,829      2118461
1.37          9,820       1461435

Avg #ratings/user: 208

Avg #ratings/movie: 5627


Data Characteristics

•  Most Loved Movies

Count     Avg rating   Title
137812    4.593        The Shawshank Redemption
133597    4.545        Lord of the Rings: The Return of the King
180883    4.306        The Green Mile
150676    4.460        Lord of the Rings: The Two Towers
139050    4.415        Finding Nemo
117456    4.504        Raiders of the Lost Ark

•  Most Rated Movies
   –  Miss Congeniality
   –  Independence Day
   –  The Patriot
   –  The Day After Tomorrow
   –  Pretty Woman
   –  Pirates of the Caribbean

•  Highest Variance
   –  The Royal Tenenbaums
   –  Lost In Translation
   –  Pearl Harbor
   –  Miss Congeniality
   –  Napoleon Dynamite
   –  Fahrenheit 9/11


The competition progresses…

•  Cinematch beaten in two weeks
•  Halfway to 10% in 6 weeks
•  Our team, BellKor (Bob Bell and Yehuda Koren), took over the lead in the summer…
•  With 48 hours to go to the $50K Progress Prize, we had a comfortable lead… with another submission in pocket

[Figure: Top contenders for Progress Prize 2007. x-axis: date (10/2/2006 to 10/2/2007); y-axis: % improvement (0 to 10). Series: ML@Toronto, How low can he go?, wxyzConsulting, Gravity, BellKor. Reference level: Grand prize.]


[Figure: timeline with ticks at 8pm, 6am, and 10/1 8pm]

Leaderboard 05:00 pm Sept 30


Leaderboard 06:00 pm Sept 30

wanna split 50/50?


[Figure: timeline with ticks at 8pm, 6am, and 10/1 8pm]

•  ARRRRGH! We have one more chance….


A Nervous Last Day

•  Start in virtual tie on Quiz data
•  Unclear who leads on Test data
•  Can Gravity/Dinosaurs improve again?
•  More offers for combining
•  Can we squeeze out a few more points?
•  Improved our mixing strategy
•  Created a second team, KorBell, just to be safe


[Figure: timeline with ticks at 8pm, 6am, and 10/1 8pm]

Our final submission(s)…


So The Drama

Timeline:
–  2008:
   •  BellKor merges with BigChaos to win 2nd $50K Progress Prize (9.4%)
–  2009:
   •  June 26: BellKor, BigChaos and Pragmatic Theory merge, passing the 10% threshold; this begins the 30-day ‘last call’ period.
   •  Lead next best team by 0.37%
   •  Only two teams within 0.80%
–  But soon, the cavalry charges
   •  Mega-mergers start to form.
–  July 25 (25 hours left):
   •  We crawl up to 10.08%...
   •  The Ensemble, a coalition of 23 independent teams, combines to pass us on the public leaderboard


Test Set Results

•  BellKor’s Pragmatic Chaos: 0.8567
•  The Ensemble: 0.8567
•  Tie breaker was submission date/time
•  We won by 20 minutes!

But really:
•  BellKor’s Pragmatic Chaos: 0.856704
•  The Ensemble: 0.856714
•  Also, a combination of the BPC (10.06%) and Ensemble (10.06%) scores results in a 10.19% improvement!


Our Approach

•  Our Prize winning solutions were an ensemble of many separate solution sets
   –  Progress Prize 2007: 103 sets
   –  Progress Prize 2008 (w/ BigChaos): 205 sets
   –  Grand Prize 2009 (w/ BigChaos and Pragmatic Theory): > 800 sets!!
•  We used two main classes of models
   –  Nearest Neighbors
   –  Latent Factor Models (via Singular Value Decomposition)
   –  Also regularized regression, not a big factor
   –  Teammates used neural nets and other methods
   –  Approaches mainly algorithmic, not statistical in nature


Data representation (excluding dates)

[Figure: ratings matrix with movies 1-6 as rows and users 1-12 as columns. Blank cell: unknown rating; filled cell: rating between 1 and 5.]


Nearest Neighbors


Nearest Neighbors

[Figure: ratings matrix with movies 1-6 as rows and users 1-12 as columns. Blank cell: unknown rating; filled cell: rating between 1 and 5.]


Nearest Neighbors

[Figure: ratings matrix with movies 1-6 as rows and users 1-12 as columns; the cell for movie 1 and user 5 is marked “?”.]

Estimate rating of movie 1 by user 5


Nearest Neighbors

[Figure: ratings matrix with movies 1-6 as rows and users 1-12 as columns; the cell for movie 1 and user 5 is marked “?”.]

Neighbor selection: Identify movies similar to 1, rated by user 5


Nearest Neighbors

[Figure: ratings matrix with movies 1-6 as rows and users 1-12 as columns; the cell for movie 1 and user 5 is marked “?”.]

Compute similarity weights: s13=0.2, s16=0.3


Nearest Neighbors

[Figure: ratings matrix with movies 1-6 as rows and users 1-12 as columns; the cell for movie 1 and user 5 is filled in with 2.6.]

Predict by taking weighted average: (0.2*2+0.3*3)/(0.2+0.3)=2.6


Nearest Neighbors

–  To predict the rating for user u on item i:
   •  Use user u’s ratings of similar movies:

$$\hat{r}_{ui} = \frac{\sum_{j \in N(i,u)} s_{ij}\, r_{uj}}{\sum_{j \in N(i,u)} s_{ij}}$$

where
   $r_{ui}$ = rating for user u and item i
   $b_{ui}$ = baseline rating for user u and item i
   $s_{ij}$ = similarity between items i and j
   $N(i,u)$ = neighborhood of item i for user u (might be fixed at k)
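The weighted average above can be written out as a short sketch. The function name `predict_item_item`, the dictionary layout, and the choice to ignore non-positive similarities are assumptions made for illustration; the numbers reproduce the worked example from the preceding slides (s13 = 0.2, s16 = 0.3, with user 5's ratings of 2 and 3).

```python
def predict_item_item(user_ratings, similarities, k=None):
    """Similarity-weighted average of one user's ratings on neighboring items.

    user_ratings: {item j: r_uj} for items the user has rated
    similarities: {item j: s_ij}, similarity of item j to the target item i
    k: optional neighborhood size (keep only the k most similar rated items)
    """
    # Assumption: only positively similar items act as neighbors.
    neighbors = [(similarities[j], r) for j, r in user_ratings.items()
                 if similarities.get(j, 0) > 0]
    neighbors.sort(reverse=True)           # most similar first
    if k is not None:
        neighbors = neighbors[:k]
    total_weight = sum(s for s, _ in neighbors)
    if total_weight == 0:
        return None                        # no usable neighbors
    return sum(s * r for s, r in neighbors) / total_weight

# Worked example from the slides: movies 3 and 6 are the neighbors of movie 1.
prediction = predict_item_item({3: 2, 6: 3}, {3: 0.2, 6: 0.3})  # about 2.6
```

Returning `None` when there are no neighbors leaves the fallback (for example a baseline rating) to the caller.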


Nearest Neighbors

•  Useful to “center” the data, and model residuals
•  What is $s_{ij}$?
   –  Cosine similarity
   –  Correlation
•  What is $N(i,u)$?
   –  Top-k
   –  Threshold
•  What is $b_{ui}$?
•  How to deal with missing values?
•  Choose several different options and throw them in!
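Two of the similarity choices listed above can be sketched as follows. This is one possible treatment, not the lecture's: both functions use only the users who rated both items, which is an assumed answer to the missing-values question, and the function names and dictionary layout are hypothetical.

```python
import math

def cosine_similarity(ratings_i, ratings_j):
    """Cosine between two items' rating vectors, over users who rated both.

    ratings_i, ratings_j: {user: rating}
    """
    common = ratings_i.keys() & ratings_j.keys()
    dot = sum(ratings_i[u] * ratings_j[u] for u in common)
    norm_i = math.sqrt(sum(ratings_i[u] ** 2 for u in common))
    norm_j = math.sqrt(sum(ratings_j[u] ** 2 for u in common))
    if norm_i == 0 or norm_j == 0:
        return 0.0
    return dot / (norm_i * norm_j)

def pearson_correlation(ratings_i, ratings_j):
    """Correlation: cosine similarity after centering each item on its mean."""
    common = ratings_i.keys() & ratings_j.keys()
    if len(common) < 2:
        return 0.0                         # too few co-raters to say anything
    mean_i = sum(ratings_i[u] for u in common) / len(common)
    mean_j = sum(ratings_j[u] for u in common) / len(common)
    centered_i = {u: ratings_i[u] - mean_i for u in common}
    centered_j = {u: ratings_j[u] - mean_j for u in common}
    return cosine_similarity(centered_i, centered_j)
```

Raw ratings are all positive, so cosine similarity on uncentered data is always positive; centering first is what lets the measure go negative for items that the same users rate in opposite directions.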


Nearest Neighbors, cont

•  This is called “item-item” NN
   –  Can also do user-user
   –  Which do you think is better?
•  Advantages of NN
   –  Few modeling assumptions
   –  Easy to explain to users
   –  Most popular RS tool


Nearest Neighbors, Modified

•  Problem with traditional k-NN:
   –  Similarity weights are calculated globally, and
   –  do not account for correlation among the neighbors
•  We estimate the weights ($w_{ij}$) simultaneously via a least squares optimization:

$$\min_{w} \sum_{v \neq u} \Big( (r_{vi} - b_{vi}) - \sum_{j \in N(i,u)} w_{ij}\,(r_{vj} - b_{vj}) \Big)^2$$

•  Basically, a regression using the ratings in the neighborhood.
   –  Shrinkage helps address correlation
   –  Adds lots of parameters
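The least squares problem above can be sketched as a small ridge regression solved through the normal equations. This is an illustration under simplifying assumptions, not the team's implementation: it assumes every other user v has rated all neighbor items (no missing values), uses a single hypothetical `shrinkage` penalty on the squared weights, and the function name is made up.

```python
def fit_neighbor_weights(target_residuals, neighbor_residuals, shrinkage=0.0):
    """Solve min_w sum_v (y_v - sum_j w_j x_vj)^2 + shrinkage * sum_j w_j^2.

    target_residuals:   [r_vi - b_vi] for each other user v
    neighbor_residuals: one row per user v, [r_vj - b_vj] for each neighbor j
    Returns the weights w_ij as a list, one per neighbor.
    """
    n = len(neighbor_residuals[0])
    # Augmented normal equations: [X'X + shrinkage*I | X'y]
    system = []
    for a in range(n):
        row = [sum(x[a] * x[b] for x in neighbor_residuals) for b in range(n)]
        row[a] += shrinkage
        row.append(sum(x[a] * y
                       for x, y in zip(neighbor_residuals, target_residuals)))
        system.append(row)
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(system[r][col]))
        if abs(system[pivot][col]) < 1e-12:
            raise ValueError("singular system; try a positive shrinkage value")
        system[col], system[pivot] = system[pivot], system[col]
        for r in range(n):
            if r != col:
                factor = system[r][col] / system[col][col]
                system[r] = [v - factor * p
                             for v, p in zip(system[r], system[col])]
    return [system[a][n] / system[a][a] for a in range(n)]
```

Because all weights are fitted jointly, two highly correlated neighbors share the credit instead of each being counted in full, and a positive `shrinkage` value pulls the weights toward zero when the data are thin.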


Matrix Decomposition - SVD

[Figure: the items x users ratings matrix is approximated (~) by the product of a 6 x 3 item-factor matrix and a 3 x 12 user-factor matrix.]

Example with 3 factors (concepts)

Each user and each item is described by a feature vector across concepts


Latent factor models – Singular Value Decomposition

[Figure: movies placed in a two-factor space. One axis runs from “Geared towards females” to “Geared towards males”; the other from “serious” to “escapist”. Movies shown: The Princess Diaries, The Lion King, Braveheart, Lethal Weapon, Independence Day, Amadeus, The Color Purple, Dumb and Dumber, Ocean’s 11, Sense and Sensibility.]

SVD finds concepts


First 2 Singular Vectors


Factorization-based modeling

[Figure: the items x users ratings matrix is approximated (~) by the product of a 6 x 3 item-factor matrix and a 3 x 12 user-factor matrix.]

•  This is a non-standard way to use SVD!
   –  Usually for reducing dimensionality, here for filling in missing data!
   –  Special techniques to do SVD w/ missing data
      •  Alternating Least Squares = variant of EM algorithms
•  Probably most popular model among contestants
   –  12/11/2006: Simon Funk describes an SVD-based method: “Try This At Home”
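Fitting factors only to the observed ratings can be sketched as below. Note the swap: the slide names Alternating Least Squares, while this sketch uses plain stochastic gradient descent, which is the style of update popularized by the “Try This At Home” post. All names, hyperparameters, and ratings here are hypothetical.

```python
import math
import random

def predict(p_u, q_i):
    """Predicted rating: dot product of a user vector and an item vector."""
    return sum(a * b for a, b in zip(p_u, q_i))

def factorize(ratings, n_factors=2, epochs=500, learning_rate=0.02,
              regularization=0.02, seed=0):
    """Fit user and item factors by SGD on the known ratings only.

    ratings: list of (user, item, rating); missing entries are never touched.
    Returns (user_factors, item_factors) as dicts of lists.
    """
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    p = {u: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for i in items}
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(p[u], q[i])
            for f in range(n_factors):
                pu, qi = p[u][f], q[i][f]
                # Gradient step on squared error plus an L2 penalty
                p[u][f] += learning_rate * (err * qi - regularization * pu)
                q[i][f] += learning_rate * (err * pu - regularization * qi)
    return p, q

def training_rmse(ratings, p, q):
    """RMSE of the factor model on the ratings it was fitted to."""
    total = sum((r - predict(p[u], q[i])) ** 2 for u, i, r in ratings)
    return math.sqrt(total / len(ratings))

# Hypothetical toy ratings: (user, item, rating)
toy_ratings = [(0, 0, 5), (0, 1, 4), (0, 2, 1), (1, 0, 4), (1, 1, 5), (1, 3, 1),
               (2, 2, 5), (2, 3, 4), (2, 0, 1), (3, 2, 4), (3, 3, 5), (3, 1, 2)]
user_factors, item_factors = factorize(toy_ratings)
```

Once fitted, `predict(user_factors[u], item_factors[i])` gives an estimate for any user and item pair, including pairs that have no rating in the data, which is the fill-in use of the factorization.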


Latent Factor Models, Modified

•  Problem with traditional SVD:
   –  User and item factors are determined globally
   –  Each user described as a fixed linear combination across factors
   –  What if there are different people in the household?
•  Let the linear combination change as a function of the item rated.
•  Substitute $p_u$ with $p_u(i)$, and add similarity weights:

$$\sum_{j \in N(u,i)} s_{ij}\,\big(r_{uj} - p_u(i)^T q_i\big)^2 + \lambda\big(\|p_u(i)\|^2 + \|q_i\|^2\big)$$

•  Again, adds lots of parameters


Incorporating Implicit Data

•  Implicit Data: what you choose to rate is an important piece of information, separate from how you rate it.
•  Why?
•  Can be fit in NN or SVD


But what about content?

•  What if we collect meta-data on the movies?
   –  Actors
   –  Director
   –  Genre
   –  Year of release
   –  Etc.
•  More valuable would be data on users! Why?
•  Can be useful for cold start.


Temporal Effects

It took us 1.5 years to figure out how to use the dates!

Account for temporal effects in two ways

•  Directly in the SVD model: allow user factor vectors $p_u$ to vary with time, $p_u(t)$
   –  Many ways to do this, adds many parameters, need regularization
   –  DTTAH
•  Observed: number of user ratings on a given date is a proxy for how long ago the movie was seen:
   –  Some movies age better than others


Memento vs Patch Adams

[Figure: two panels, Memento (127318 samples) and Patch Adams (121769 samples). x-axis: frequency, binned as 1, 2, 3-4, 5-8, 9-16, 17-32, 33-64, 65-128, 129-256, 257+; y-axis: rating (3.2 to 4.2 for Memento, 3.2 to 4 for Patch Adams).]
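The frequency bins on the Memento and Patch Adams plots double in width (1, 2, 3-4, 5-8, and so on up to 257+). A minimal sketch of how such a bin might be computed from the number of ratings a user entered on one date; the doubling rule is read off the axis labels, and the function names and tuple layout are hypothetical.

```python
from collections import Counter

BIN_LABELS = ["1", "2", "3-4", "5-8", "9-16", "17-32",
              "33-64", "65-128", "129-256", "257+"]

def daily_frequencies(ratings):
    """Count how many ratings each user entered on each date.

    ratings: iterable of (user, movie, score, date) tuples.
    Returns a Counter keyed by (user, date).
    """
    return Counter((user, date) for user, _, _, date in ratings)

def frequency_bin(n_ratings_that_day):
    """Map a daily rating count to its power-of-two bin label."""
    if n_ratings_that_day < 1:
        raise ValueError("count must be at least 1")
    # (n - 1).bit_length() is 0 for 1, 1 for 2, 2 for 3-4, 3 for 5-8, ...
    index = (n_ratings_that_day - 1).bit_length()
    return BIN_LABELS[min(index, len(BIN_LABELS) - 1)]
```

A user who rates hundreds of movies in one sitting is mostly rating from memory, so a high bin suggests the movies were seen long ago; averaging a movie's scores within each bin gives curves like the ones in the figure.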


Netflix Prize: Lessons Learned


The Power of Ensembles

•  Ensembles of models
•  Over 800 in Grand Prize solution!
•  Some of our “models” were blends of other models!
   –  or models on residuals of other models
   –  Why: because it worked
•  Black Box magic at its worst
•  However, mega blends are not needed in practice
   –  Best model: complex model with implicit and explicit data, time-varying coefficients, millions of parameters, and regularization.
   –  This did as well as our 2007 Progress Prize winning combo of 107 models!
   –  Even a small handful of ‘simple’ models gets you 80-85% of the way
•  In practice, .01% does not matter much
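Why blending helps can be seen with just two models. The sketch below finds the single weight that minimizes squared error for a linear blend of two prediction sets; it is a toy illustration with hypothetical names and numbers, not the procedure used to combine the hundreds of sets in the prize solutions.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predictions and true ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual))
                     / len(actual))

def best_blend_weight(pred_a, pred_b, actual):
    """Weight w minimizing sum((w*a + (1 - w)*b - y)^2), in closed form."""
    numerator = sum((y - b) * (a - b)
                    for a, b, y in zip(pred_a, pred_b, actual))
    denominator = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    if denominator == 0:
        return 0.5        # identical predictions: every weight gives the same blend
    return numerator / denominator

def blend(pred_a, pred_b, w):
    """Linear blend of two prediction lists with weight w on the first."""
    return [w * a + (1 - w) * b for a, b in zip(pred_a, pred_b)]

# Hypothetical held-out ratings and two models' predictions for them.
actual = [4, 3, 5, 2, 1]
model_a = [3.5, 3.5, 4.0, 2.5, 2.0]
model_b = [4.5, 2.0, 4.5, 3.0, 1.0]
w = best_blend_weight(model_a, model_b, actual)
blended = blend(model_a, model_b, w)
```

On the data used to fit the weight, the blend can never be worse than either model alone, because weights of 0 and 1 recover the individual models. The gain is largest when the two models make different kinds of errors.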

Courtesy LingPipe Blog


Collaboration vs. Competition

•  By mid-2007 it was clear the winner would be a combination of the top-tier players
   –  Strategizing, back-room deals, and sharing of information became key components of the competition
   –  Developing ways to estimate the gain from collaborating without actually sharing solution sets was important.
•  But Netflix designed the competition with collaboration in mind
   –  Public leaderboard
   –  Requirement to publish winning methodology
   –  Open forums
   –  KDD Workshops in 2007 and 2008 with encouragement for top teams to participate
•  The interplay between collaboration and competition was fascinating


Is this the way to do science?

•  Netflix Prize advanced the science of CF (collaborative filtering) by a large amount
   –  Got new people involved in the field
   –  Made significant advances
•  However, it was a perfect storm of circumstances
   –  Interesting, compelling domain
   –  Great data
   –  Great organization of competition
   –  Well defined, objective goal
   –  Large data, but accessible to anyone with a (good) PC
   –  Luck!
•  “You look at the cumulative hours and you’re getting Ph.D.’s for a dollar an hour!” - Reed Hastings, Netflix CEO


Discussion: Recommending TV Shows