Fast ALS-Based Matrix Factorization for Recommender Systems

David Zibriczky, LAWA Workpackage Meeting, 16th January 2013

Transcript of Fast ALS-Based Matrix Factorization for Recommender Systems

Page 1: Fast ALS-Based Matrix Factorization for Recommender Systems

Fast ALS-Based Matrix Factorization for Recommender Systems

David Zibriczky

LAWA Workpackage Meeting

16th January, 2013

Page 2: Fast ALS-Based Matrix Factorization for Recommender Systems


Problem setting


Page 3: Fast ALS-Based Matrix Factorization for Recommender Systems

Item Recommendation

• Classical item recommendation problem (see Netflix)

• Explicit feedback (ratings)


[Figure: one user's ratings for The Matrix, The Matrix 2, Twilight and The Matrix 3; a single known rating (5), the rest unknown (?)]

Page 4: Fast ALS-Based Matrix Factorization for Recommender Systems

Collaborative Filtering (Explicit)

• Classical item recommendation problem (see Netflix)

• Explicit feedback (ratings)

• Collaborative Filtering

• Based on other users


[Figure: ratings of several users for The Matrix, The Matrix 2, Twilight and The Matrix 3; the unknown ratings (?) are predicted from the feedback of similar users]

Page 5: Fast ALS-Based Matrix Factorization for Recommender Systems

Collaborative Filtering (Implicit)

• Items are not only movies (live content, products, holidays, …)

• Implicit feedback (buy, view, …)

• Less information about preferences


[Figure: implicit feedback matrix over Item1–Item4 with many unknown preferences (?)]

Page 6: Fast ALS-Based Matrix Factorization for Recommender Systems

Industrial motivation

• Keeping the response time low

• Up-to-date user models: adaptation should be fast

• Items may change rapidly, so training time can become a bottleneck of live performance

• Increasing amount of data per customer → increasing training time

• Limited resources


Page 7: Fast ALS-Based Matrix Factorization for Recommender Systems


Model


Page 8: Fast ALS-Based Matrix Factorization for Recommender Systems

Preference Matrix

• Matrix representation

• Implicit feedback: assume positive preference

• Value = 1

• Estimation of unknown preference?

• Sorting items by estimated preference → item recommendation


R      Item1  Item2  Item3  Item4
User1  1      ?      ?      ?
User2  ?      ?      1      ?
User3  1      1      ?      ?
User4  ?      1      ?      1

Page 9: Fast ALS-Based Matrix Factorization for Recommender Systems

Matrix Factorization

$R = P Q^T$,  $\hat{r}_{ui} = p_u^T q_i$

$R_{N \times M}$: preference matrix

$P_{N \times K}$: user feature matrix

$Q_{M \times K}$: item feature matrix

$N$: #users

$M$: #items

$K$: #features

$K \ll M$, $K \ll N$


[Figure: $R \approx P Q^T$; each entry $r_{ui}$ is estimated by the dot product of the user feature vector $p_u := (P_u)^T$ and the item feature vector $q_i := (Q_i)^T$]
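To make the model concrete, here is a minimal numpy sketch (not from the slides) that scores and ranks items for one user with small random feature matrices, following the notation above; the sizes and values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 4, 4, 2                       # #users, #items, #features (K << N, M in practice)

P = rng.normal(scale=0.1, size=(N, K))  # user feature matrix P (N x K)
Q = rng.normal(scale=0.1, size=(M, K))  # item feature matrix Q (M x K)

R_hat = P @ Q.T                         # estimated preferences, r_hat_ui = p_u^T q_i

u = 2
ranking = np.argsort(-R_hat[u])         # items sorted by estimated preference for user u
print(ranking)                          # -> item recommendation list for user u
```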

Page 10: Fast ALS-Based Matrix Factorization for Recommender Systems


Objective Function


Page 11: Fast ALS-Based Matrix Factorization for Recommender Systems

Preference Matrix


R      Item1  Item2  Item3  Item4
User1  1
User2                1
User3  1      1
User4         1             1

Page 12: Fast ALS-Based Matrix Factorization for Recommender Systems

• Zero value for unknown preferences ("zero examples"); in practice there are many 0s and few 1s

Preference Matrix


R      Item1  Item2  Item3  Item4
User1  1      0      0      0
User2  0      0      1      0
User3  1      1      0      0
User4  0      1      0      1
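A minimal sketch (not from the slides) of how this zero-filled preference matrix can be built from raw implicit events; the event list is hypothetical and chosen to reproduce the table above.

```python
import numpy as np

# Hypothetical implicit events as (user_index, item_index) pairs.
events = [(0, 0), (1, 2), (2, 0), (2, 1), (3, 1), (3, 3)]

N, M = 4, 4
R = np.zeros((N, M))      # unknown preferences are stored as zero examples
for u, i in events:
    R[u, i] = 1.0         # implicit feedback -> assumed positive preference

print(R)                  # reproduces the 0/1 table above
```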

Page 13: Fast ALS-Based Matrix Factorization for Recommender Systems

• Zero value for unknown preferences ("zero examples"); in practice there are many 0s and few 1s

• $c_{ui}$: confidence of a known feedback (constant, or a function of the context of the event)

• Zero examples are less important than known feedback, but they still matter

Confidence Matrix


R      Item1  Item2  Item3  Item4
User1  1      0      0      0
User2  0      0      1      0
User3  1      1      0      0
User4  0      1      0      1

C      Item1  Item2  Item3  Item4
User1  c11    1      1      1
User2  1      1      c23    1
User3  c31    c32    1      1
User4  1      c42    1      c44
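The slides leave the confidence values open (constant, or a function of the event context). As one hedged example, a common choice in the implicit feedback literature is $c_{ui} = 1 + \alpha r_{ui}$, which keeps weight 1 on zero examples and puts a larger weight on known feedback; `alpha` is a hypothetical tuning parameter, not something prescribed by the slides.

```python
import numpy as np

R = np.array([[1, 0, 0, 0],
              [0, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 1, 0, 1]], dtype=float)

alpha = 9.0               # assumed confidence boost for observed events
C = 1.0 + alpha * R       # zero examples keep confidence 1, known feedback gets 1 + alpha

print(C)
```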

Page 14: Fast ALS-Based Matrix Factorization for Recommender Systems

• Objective function:

Weighted Sum of Squared Errors


$f(P, Q) = \mathrm{WSSE} = \sum_{(u,i)} c_{ui} \, (r_{ui} - \hat{r}_{ui})^2$, with the unknowns $P = ?$ and $Q = ?$

C      Item1  Item2  Item3  Item4
User1  c11    1      1      1
User2  1      1      c23    1
User3  c31    c32    1      1
User4  1      c42    1      c44

R      Item1  Item2  Item3  Item4
User1  1      0      0      0
User2  0      0      1      0
User3  1      1      0      0
User4  0      1      0      1
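A dense sketch of the objective, only to pin down the notation; a real implementation would never materialize the full error matrix but would exploit the sparsity of the positive examples.

```python
import numpy as np

def wsse(P, Q, R, C):
    """f(P, Q) = sum over all (u, i) of c_ui * (r_ui - p_u^T q_i)^2."""
    E = R - P @ Q.T            # residuals r_ui - r_hat_ui
    return float(np.sum(C * E ** 2))
```

The value of `wsse(P, Q, R, C)` can be tracked over the alternating optimization steps shown on the next slides.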

Page 15: Fast ALS-Based Matrix Factorization for Recommender Systems


Optimizer


Page 16: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Alternating Least Squares

• Ridge Regression:
  $p_u = (Q^T C^u Q)^{-1} Q^T C^u r_u$   ($r_u$: the $u$-th row of $R$)
  $q_i = (P^T C^i P)^{-1} P^T C^i r^c_i$   ($r^c_i$: the $i$-th column of $R$)

Q^T = [  0.1  -0.4   0.8   0.6 ]
      [  0.6   0.7  -0.7  -0.2 ]

P = [ -0.2   0.6 ]
    [  0.6   0.4 ]
    [  0.7   0.2 ]
    [  0.5  -0.2 ]

R      Item1  Item2  Item3  Item4
User1  1      0      0      0
User2  0      0      1      0
User3  1      1      0      0
User4  0      1      0      1
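A direct, naive implementation of the user step above (the item step is symmetric in $P$ and $Q$). A small ridge term `reg` is added for numerical stability; the formula on the slide omits it.

```python
import numpy as np

def als_update_users(P, Q, R, C, reg=1e-6):
    """Recompute every user vector as p_u = (Q^T C^u Q + reg*I)^(-1) Q^T C^u r_u,
    where C^u is the diagonal matrix built from row u of C and r_u is row u of R."""
    N, K = P.shape
    for u in range(N):
        Cu = np.diag(C[u])
        A = Q.T @ Cu @ Q + reg * np.eye(K)
        b = Q.T @ Cu @ R[u]
        P[u] = np.linalg.solve(A, b)
    return P

# One ALS iteration alternates this user step with the analogous item step on Q.
```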

Page 17: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Alternating Least Squares

• Ridge Regression step: the item feature matrix $Q$ has been recomputed from the fixed $P$
  $q_i = (P^T C^i P)^{-1} P^T C^i r^c_i$

Q^T = [  0.3  -0.3   0.7   0.7 ]
      [  0.7   0.8  -0.5  -0.1 ]

(P and R unchanged)

Page 18: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Alternating Least Squares

• Ridge Regression step: the user feature matrix $P$ has been recomputed from the fixed $Q$
  $p_u = (Q^T C^u Q)^{-1} Q^T C^u r_u$

P = [ -0.2   0.7 ]
    [  0.6   0.5 ]
    [  0.8   0.2 ]
    [  0.6  -0.2 ]

(Q and R unchanged)

Page 19: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Alternating Least Squares

• Complexity of the naive solution: $O(IK^2NM + IK^3(N+M))$
  ($E$: number of examples, $I$: number of iterations)

• Improvement (Hu, Koren, Volinsky); a small numpy sketch of this precomputation follows this list:
  Ridge Regression: $p_u = (Q^T C^u Q)^{-1} Q^T C^u r_u$
  $Q^T C^u Q = Q^T Q + Q^T (C^u - I) Q = COV_Q^0 + COV_Q^+$; recomputing this product for every user is what makes the naive $O(IK^2NM)$ term costly
  $COV_Q^0$ is user independent and only needs to be calculated once, at the start of each iteration
  Calculating $COV_Q^+$ needs only $\#P(u)^+$ steps
    ∘ $\#P(u)^+$: number of positive examples of user $u$
  Complexity: $O(IK^2E + IK^3(N+M)) = O(IK^2(E + K(N+M)))$
  Codename: IALS

• Complexity issues on large datasets:
  If $K$ is low: $O(IK^2E)$ is dominant
  If $K$ is high: $O(IK^3(N+M))$ is dominant
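A sketch of the precomputation trick for the user step: $Q^T Q$ is computed once per iteration, and only the positive examples of each user contribute to the correction term and to the right-hand side. The `reg` ridge term is again an added assumption.

```python
import numpy as np

def ials_update_users(P, Q, R, C, reg=1e-6):
    """User step with Q^T C^u Q = Q^T Q + Q^T (C^u - I) Q: the first term (COV_Q^0)
    is user independent, the second (COV_Q^+) touches only user u's positive items."""
    N, K = P.shape
    QtQ = Q.T @ Q                                     # COV_Q^0, computed once per iteration
    for u in range(N):
        pos = np.nonzero(R[u])[0]                     # P(u)^+: positive examples of user u
        Qp, cp = Q[pos], C[u, pos]
        A = QtQ + Qp.T @ ((cp - 1.0)[:, None] * Qp) + reg * np.eye(K)
        b = Qp.T @ (cp * R[u, pos])                   # zero examples contribute nothing here
        P[u] = np.linalg.solve(A, b)
    return P
```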


Page 20: Fast ALS-Based Matrix Factorization for Recommender Systems


Problem: Complexity


Page 21: Fast ALS-Based Matrix Factorization for Recommender Systems

Ridge Regression with Coordinate Descent

R      Item1  Item2  Item3  Item4
User1  1      0      0      0

Q^T = [  0.9  -0.4   0.8   0.6 ]
      [  0.6   0.7  -0.7  -0.2 ]
      [ -0.1  -0.4  -0.1   0.6 ]

P = [ ?  ?  ? ]

Page 22: Fast ALS-Based Matrix Factorization for Recommender Systems

• Initialize with zero values

Ridge Regression with Coordinate Descent

R      Item1  Item2  Item3  Item4
User1  1      0      0      0

Q^T = [  0.9  -0.4   0.8   0.6 ]
      [  0.6   0.7  -0.7  -0.2 ]
      [ -0.1  -0.4  -0.1   0.6 ]

P = [ 0  0  0 ]

Page 23: Fast ALS-Based Matrix Factorization for Recommender Systems

Ridge Regression with Coordinate Descent

P = [ 0.51  0  0 ]

R      Item1  Item2  Item3  Item4
User1  1      0      0      0

Q^T = [  0.9  -0.4   0.8   0.6 ]
      [  0.6   0.7  -0.7  -0.2 ]
      [ -0.1  -0.4  -0.1   0.6 ]

• Target (residual) vector: $e_u = r_u - p_u Q^T$, weighted by the confidences $C^u$ in the sums below

• Optimize only one feature of $p_u$ at a time: $p_{uk} = \frac{\sum_{i=1}^{M} c_{ui} q_{ik} e_{ui}}{\sum_{i=1}^{M} c_{ui} q_{ik} q_{ik}} = \frac{SQE}{SQQ}$

• Update the residuals after each feature change: $e_{ui} \leftarrow e_{ui} - \Delta p_{uk} \, q_{ik}$

• Apply more iterations (sweep over the features several times; a sketch follows below)
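A sketch of the coordinate descent update for a single user, as described above; `n_inner` controls how many times the features are swept ("apply more iterations"). No regularization is included, matching the formula on the slide.

```python
import numpy as np

def cd_update_user(p_u, Q, r_u, c_u, n_inner=3):
    """Optimize p_u one feature at a time: p_uk = SQE / SQQ with
    SQE = sum_i c_ui q_ik e_ui and SQQ = sum_i c_ui q_ik^2, keeping the residuals
    e_ui = r_ui - p_u^T q_i up to date after every feature change."""
    K = Q.shape[1]
    e = r_u - Q @ p_u                   # current residuals
    for _ in range(n_inner):
        for k in range(K):
            q_k = Q[:, k]
            e += p_u[k] * q_k           # remove feature k's current contribution
            p_u[k] = np.sum(c_u * q_k * e) / np.sum(c_u * q_k * q_k)   # SQE / SQQ
            e -= p_u[k] * q_k           # put the updated contribution back
    return p_u
```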

Page 24: Fast ALS-Based Matrix Factorization for Recommender Systems

Ridge Regression with Coordinate Descent

P = [ 0.51  0.10  0 ]

(R, Q^T and the update formulas as on the previous slides)


Page 25: Fast ALS-Based Matrix Factorization for Recommender Systems

Ridge Regression with Coordinate Descent

P = [ 0.51  0.10  0.08 ]

(R, Q^T and the update formulas as on the previous slides)


Page 26: Fast ALS-Based Matrix Factorization for Recommender Systems

Ridge Regression with Coordinate Descent

P = [ 0.47  0.10  0.08 ]

(R, Q^T and the update formulas as on the previous slides)


Page 27: Fast ALS-Based Matrix Factorization for Recommender Systems

Ridge Regression with Coordinate Descent

P = [ 0.46  0.11  0.07 ]

(R, Q^T and the update formulas as on the previous slides)


Page 28: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

• Ridge Regression with Coordinate Descent

Q^T = [ 0.1  0.4  1.1  0.6 ]
      [ 0.6  0.7  1.5  1.0 ]

P = [ 0.3  0 ]
    [ 0    0 ]
    [ 0    0 ]
    [ 0    0 ]

R      Item1  Item2  Item3  Item4
User1  1      0      0      0
User2  0      0      1      0
User3  1      1      0      0
User4  0      1      0      1

Page 29: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

• Ridge Regression with Coordinate Descent

Q^T = [ 0.1  0.4  1.1  0.6 ]
      [ 0.6  0.7  1.5  1.0 ]

P = [ 0.3  -0.1 ]
    [ 0     0   ]
    [ 0     0   ]
    [ 0     0   ]

(R unchanged)

Page 30: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

• Ridge Regression with Coordinate Descent

Q^T = [ 0.1  0.4  1.1  0.6 ]
      [ 0.6  0.7  1.5  1.0 ]

P = [ 0.3  -0.1 ]
    [ 0.1   0   ]
    [ 0     0   ]
    [ 0     0   ]

(R unchanged)

Page 31: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

• Ridge Regression with Coordinate Descent

Q^T = [ 0.1  0.4  1.1  0.6 ]
      [ 0.6  0.7  1.5  1.0 ]

P = [ 0.3  -0.1 ]
    [ 0.1   0.5 ]
    [ 0     0   ]
    [ 0     0   ]

(R unchanged)

Page 32: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

• Ridge Regression with Coordinate Descent

Q^T = [ 0.1  0.4  1.1  0.6 ]
      [ 0.6  0.7  1.5  1.0 ]

P = [  0.3  -0.1 ]
    [  0.1  -0.5 ]
    [ -0.4   0.2 ]
    [  0.5  -0.4 ]

(R unchanged)

Page 33: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

• Ridge Regression with Coordinate Descent

Q^T = [ 0.1  0  0  0 ]
      [ 0    0  0  0 ]

P = [  0.3  -0.1 ]
    [  0.1  -0.5 ]
    [ -0.4   0.2 ]
    [  0.5  -0.4 ]

(R unchanged)

Page 34: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

• Ridge Regression with Coordinate Descent

Q^T = [ 0.1  0  0  0 ]
      [ 0.6  0  0  0 ]

P = [  0.3  -0.1 ]
    [  0.1  -0.5 ]
    [ -0.4   0.2 ]
    [  0.5  -0.4 ]

(R unchanged)

Page 35: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

• Ridge Regression with Coordinate Descent

Q^T = [ 0.1  0.4  0  0 ]
      [ 0.6  0    0  0 ]

P = [  0.3  -0.1 ]
    [  0.1  -0.5 ]
    [ -0.4   0.2 ]
    [  0.5  -0.4 ]

(R unchanged)

Page 36: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

• Ridge Regression with Coordinate Descent

Q^T = [ 0.1  0.4  -0.1  0.2 ]
      [ 0.6  0.7   0.8  0.5 ]

P = [  0.3  -0.1 ]
    [  0.1  -0.5 ]
    [ -0.4   0.2 ]
    [  0.5  -0.4 ]

(R unchanged)

Page 37: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

• Ridge Regression with Coordinate Descent

Q^T = [ 0.1  0.4  -0.1  0.2 ]
      [ 0.6  0.7   0.8  0.5 ]

P = [ 0.2  0 ]
    [ 0    0 ]
    [ 0    0 ]
    [ 0    0 ]

(R unchanged)

Page 38: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

• Ridge Regression with Coordinate Descent

Q^T = [ 0.1  0.4  -0.1  0.2 ]
      [ 0.6  0.7   0.8  0.5 ]

P = [ 0.2  -0.1 ]
    [ 0     0   ]
    [ 0     0   ]
    [ 0     0   ]

(R unchanged)

Page 39: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

• Ridge Regression with Coordinate Descent

Q^T = [ 0.1  0.4  -0.1  0.2 ]
      [ 0.6  0.7   0.8  0.5 ]

P = [  0.2  -0.1 ]
    [  0.1  -0.4 ]
    [ -0.3   0.1 ]
    [  0.5  -0.6 ]

(R unchanged)

Page 40: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

• Complexity of the naive solution: $O(IKNM)$

• Ridge regression with coordinate descent calculates the features directly from the examples, so the covariance precomputation of IALS cannot be applied here.


Page 41: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent Improvement

• Synthetic examples (Pilászy, Zibriczky, Tikk)

• Solution of ridge regression with CD: $p_{uk} = \frac{\sum_{i=1}^{M} c_{ui} q_{ik} e_{ui}}{\sum_{i=1}^{M} c_{ui} q_{ik} q_{ik}} = \frac{SQE}{SQQ}$

• Calculate the statistics of the zero examples as for a user who watched nothing ($SQE^0$ and $SQQ^0$)

• The solution is calculated incrementally: $p_{uk} = \frac{SQE}{SQQ} = \frac{SQE^0 + SQE^+}{SQQ^0 + SQQ^+}$   ($M + \#P(u)^+$ steps)

• Eigenvalue decomposition: $Q^T Q = S \Lambda S^T = (S\sqrt{\Lambda})(\sqrt{\Lambda} S^T) = G^T G$

• Zero examples are compressed into synthetic examples: $Q_{M \times K} \rightarrow G_{K \times K}$

• $SGG^0 = SQQ^0$, but needs only $K$ steps to compute: $p_{uk} = \frac{SGE^0 + SQE^+}{SGG^0 + SQQ^+}$   ($K + \#P(u)^+$ steps)

• $SGE^0$ is calculated the same way as $SQE^0$, but using only $K$ steps

• Complexity: $O(IK(E + KM + KN)) = O(IK(E + K(M+N)))$

(A small numpy sketch of the synthetic-example compression follows below.)
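A sketch of the compression step only, under my reading of the eigendecomposition above ($G = \sqrt{\Lambda} S^T$): the zero-example statistic $SQQ^0$ computed from the small $K \times K$ matrix $G$ agrees with the one computed from the full $M \times K$ matrix $Q$, but needs only $K$ steps per feature.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K = 1000, 8
Q = rng.normal(size=(M, K))

# Q^T Q = S Lambda S^T  ->  G = sqrt(Lambda) S^T, so that G^T G = Q^T Q.
lam, S = np.linalg.eigh(Q.T @ Q)
G = np.sqrt(lam)[:, None] * S.T          # K synthetic examples (K x K)

k = 3
SQQ0_from_Q = np.sum(Q[:, k] ** 2)       # M steps over the real zero examples
SGG0_from_G = np.sum(G[:, k] ** 2)       # only K steps over the synthetic examples
print(np.isclose(SQQ0_from_Q, SGG0_from_G))   # True: the statistics agree
```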


Page 42: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

• Complexity of the naive solution: $O(IKNM)$

• Ridge regression with coordinate descent calculates the features directly from the examples; the covariance precomputation of IALS cannot be applied here.

• Synthetic examples

• Codename: IALS1

• Complexity reduction (IALS → IALS1): from $O(IK^2(E + K(N+M)))$ down to $O(IK(E + K(M+N)))$

• IALS1 requires a higher $K$ for the same accuracy as IALS.


Page 43: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

...does it work in practice?


Page 44: Fast ALS-Based Matrix Factorization for Recommender Systems

• Average Rank Position (ARP) on a subset of a proprietary implicit feedback dataset; lower values are better.

• IALS1 offers better time-accuracy tradeoffs, especially when $K$ is large.

Comparison


K      IALS ARP   IALS time (s)   IALS1 ARP   IALS1 time (s)
5      0.1903     153             0.1898      112
10     0.1578     254             0.1588      134
20     0.1427     644             0.1432      209
50     0.1334     2862            0.1344      525
100    0.1314     11441           0.1325      1361
250    0.1311     92944           0.1312      6651
500    N/A        N/A             0.1282      24697
1000   N/A        N/A             0.1242      104611

[Figure: ARP vs. training time (s) for IALS and IALS1]

Page 45: Fast ALS-Based Matrix Factorization for Recommender Systems

Conclusion

• Explicit feedback is rarely provided, or not at all.

• Implicit feedback is more general.

• Alternating Least Squares has complexity issues.

• An efficient solution uses approximation and synthetic examples.

• IALS1 offers better time-accuracy tradeoffs, especially when $K$ is large.

• IALS is an approximate algorithm too, so why not make it even more approximate?


Page 46: Fast ALS-Based Matrix Factorization for Recommender Systems


Other algorithms


Page 47: Fast ALS-Based Matrix Factorization for Recommender Systems

Model – Tensor Factorization


• Different preferences during the day

• Time period 1: 06:00-14:00

[Table: R1, the binary preference matrix restricted to events in time period 1 (06:00-14:00); a few 1s, the rest unknown]

Page 48: Fast ALS-Based Matrix Factorization for Recommender Systems

• Different preferences during the day

• Time period 2: 14:00-22:00

Model – Tensor Factorization


[Tables: R1 (time period 1) and R2 (time period 2), each a sparse binary preference matrix over the same users and items]

Page 49: Fast ALS-Based Matrix Factorization for Recommender Systems

Model – Tensor Factorization

• Different preferences during the day

• Time period 3: 22:00-06:00


[Tables: R1, R2 and R3, the sparse binary preference matrices for time periods 1-3]

Page 50: Fast ALS-Based Matrix Factorization for Recommender Systems

Model – Tensor Factorization


[Figure: the slices R1, R2, R3 hold the entries $r_{uit}$; they are jointly factorized into the user feature matrix P, the item feature matrix Q and the time feature matrix T]

$R_{N \times M \times L}$: preference tensor

$P_{N \times K}$: user feature matrix

$Q_{M \times K}$: item feature matrix

$T_{L \times K}$: time feature matrix

$N$: #users

$M$: #items

$L$: #time periods

$K$: #features

$\hat{r}_{uit} = \sum_{k} p_{uk} \, q_{ik} \, t_{tk}$,  $R = P \circ Q \circ T$
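A minimal sketch of the three-way prediction above; the shapes follow the notation on this slide, the values are random placeholders.

```python
import numpy as np

def predict(P, Q, T, u, i, t):
    """r_hat_{uit} = sum_k p_uk * q_ik * t_tk."""
    return float(np.sum(P[u] * Q[i] * T[t]))

rng = np.random.default_rng(0)
N, M, L, K = 4, 4, 3, 2                    # #users, #items, #time periods, #features
P = rng.normal(scale=0.1, size=(N, K))
Q = rng.normal(scale=0.1, size=(M, K))
T = rng.normal(scale=0.1, size=(L, K))

print(predict(P, Q, T, u=0, i=1, t=2))
```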

Page 51: Fast ALS-Based Matrix Factorization for Recommender Systems

• Data sets: Netflix (rating 5), IPTV provider VOD rentals, grocery purchases

• Evaluation metrics: Recall@20, Precision-Recall@20

• Number of features: 20

Comparison – ITALS vs. IALS


Test case (K=20)      IALS    ITALS
Netflix Probe         0.087   0.097
Netflix Time Split    0.054   0.071
IPTV VOD 1 day        0.063   0.112
IPTV VOD 1 week       0.055   0.100
Grocery               0.065   0.103

Page 52: Fast ALS-Based Matrix Factorization for Recommender Systems

Comparison – ITALS vs. IALS


Page 53: Fast ALS-Based Matrix Factorization for Recommender Systems

Objective Function – Ranking-based objective function


• Ranking-based objective function approach:
  • $r_{ui} - r_{uj}$: difference of preference between items $i$ and $j$
  • $\hat{r}_{ui} - \hat{r}_{uj}$: estimated difference of preference between items $i$ and $j$
  • $s_j$: importance of item $j$ in the objective function

• Model: Matrix Factorization

• Optimizer: Alternating Least Squares

• Name: RankALS

$f(\theta) = \sum_{u \in U} \sum_{i \in I} c_{ui} \sum_{j \in I} s_j \left[ (\hat{r}_{ui} - \hat{r}_{uj}) - (r_{ui} - r_{uj}) \right]^2$
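A naive evaluation of this objective (quadratic in the number of items per user), useful only to pin down the notation; RankALS itself derives ALS updates that avoid enumerating all item pairs. The importance weights `s` are taken here as a given vector.

```python
import numpy as np

def ranking_objective(P, Q, R, C, s):
    """f(theta) = sum_u sum_i c_ui sum_j s_j [ (r_hat_ui - r_hat_uj) - (r_ui - r_uj) ]^2."""
    R_hat = P @ Q.T
    total = 0.0
    for u in range(R.shape[0]):
        d_hat = R_hat[u][:, None] - R_hat[u][None, :]   # estimated differences over (i, j)
        d = R[u][:, None] - R[u][None, :]               # observed differences over (i, j)
        total += np.sum(C[u][:, None] * s[None, :] * (d_hat - d) ** 2)
    return float(total)
```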

Page 54: Fast ALS-Based Matrix Factorization for Recommender Systems

Comparison – RankIALS vs. IALS


Page 55: Fast ALS-Based Matrix Factorization for Recommender Systems

Comparison – RankIALS vs. IALS


Page 56: Fast ALS-Based Matrix Factorization for Recommender Systems

Related Publications

• Alternating Least Squares with Coordinate Descent
  I. Pilászy, D. Zibriczky, D. Tikk: Fast ALS-based matrix factorization for explicit and implicit feedback datasets. RecSys 2010.

• Tensor Factorization
  B. Hidasi, D. Tikk: Fast ALS-Based Tensor Factorization for Context-Aware Recommendation from Implicit Feedback. ECML PKDD 2012.

• Personalized Ranking
  G. Takács, D. Tikk: Alternating least squares for personalized ranking. RecSys 2012.

• IPTV Case Study
  D. Zibriczky, B. Hidasi, Z. Petres, D. Tikk: Personalized recommendation of linear content on interactive TV platforms: beating the cold start and noisy implicit user feedback. TVMMP @ UMAP 2012.
