Music Recommender Systems


description

Algorithms for recommender systems.

Transcript of Music Recommender Systems

Page 1: Music Recommender Systems

Music Recommender Systems

超群

http://www.fuchaoqun.com

Page 2: Music Recommender Systems

Who is Using Recommender Systems?

Page 3: Music Recommender Systems

Recommender Systems

• Summary :

– http://en.wikipedia.org/wiki/Recommender_system

• Keywords:

recommender systems, association rules, collaborative filtering, slope one, SVD, KNN, ...

Page 4: Music Recommender Systems

Algorithms

• Association Rules

• Slope one

• SVD

• ….

Page 6: Music Recommender Systems

Association Rules

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Egg
3    Diaper, Beer, Cola
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Cola

Itemset        Count
Beer, Diaper   3
Bread, Milk    3
Beer, Bread    2
Diaper, Milk   2
Beer, Milk     1
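
The pair counts on this slide can be reproduced by counting, for every pair of items, how many transactions contain both. The following is a minimal sketch in plain Python; the function name `count_pairs` and the dictionary layout are illustrative choices, not something taken from the slides.

```python
from collections import Counter
from itertools import combinations

# Transactions copied from the slide (TID -> set of items).
transactions = {
    1: {"Bread", "Milk"},
    2: {"Bread", "Diaper", "Beer", "Egg"},
    3: {"Diaper", "Beer", "Cola"},
    4: {"Bread", "Milk", "Diaper", "Beer"},
    5: {"Bread", "Milk", "Diaper", "Cola"},
}


def count_pairs(transactions):
    """Count how many transactions contain each unordered pair of items."""
    counts = Counter()
    for items in transactions.values():
        # sorted() gives every pair one canonical (alphabetical) order,
        # so ("Beer", "Diaper") and ("Diaper", "Beer") share a key.
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return counts


pair_counts = count_pairs(transactions)
for pair in [("Beer", "Diaper"), ("Bread", "Milk"), ("Beer", "Milk")]:
    print(pair, pair_counts[pair])
```

Enumerating every pair is fine for five transactions; Apriori and FP-growth exist because this brute-force count becomes too expensive on large catalogs.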

Page 7: Music Recommender Systems

Association Rules

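• Support: $s(X \rightarrow Y) = \frac{\sigma(X \cup Y)}{N}$

• Confidence: $c(X \rightarrow Y) = \frac{\sigma(X \cup Y)}{\sigma(X)}$

Here $\sigma(\cdot)$ is the number of transactions that contain the given itemset and $N$ is the total number of transactions.

• Algorithms: Apriori algorithm, FP-growth algorithm

• http://en.wikipedia.org/wiki/Association_rule_learning

• Demo: Python + Orange

http://www.fuchaoqun.com/2008/08/data-mining-with-python-orange-association_rule/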
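
As a rough illustration of the two measures, the sketch below computes the support and confidence of a rule X → Y on the five example transactions from the earlier slide. Support is taken as the fraction of all transactions that contain both X and Y, and confidence as the fraction of transactions containing X that also contain Y. The helper names are hypothetical.

```python
# Example transactions from the Association Rules slide.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Egg"},
    {"Diaper", "Beer", "Cola"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Cola"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def support(x, y, transactions):
    """Fraction of all transactions that contain both X and Y."""
    return support_count(x | y, transactions) / len(transactions)


def confidence(x, y, transactions):
    """Fraction of the transactions containing X that also contain Y."""
    return support_count(x | y, transactions) / support_count(x, transactions)


s = support({"Diaper"}, {"Beer"}, transactions)
c = confidence({"Diaper"}, {"Beer"}, transactions)
print(f"Diaper -> Beer: support={s:.2f}, confidence={c:.2f}")
```

Note that `confidence` divides by the count of X, so this sketch assumes X occurs in at least one transaction.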

Page 8: Music Recommender Systems

Algorithms

• Association Rules

• Slope one

• SVD

• ….

Page 9: Music Recommender Systems

Slope One

User   That is it   Straight Through My Heart
Jim    4            5
Mike   2            4
Fred   3            ?

Page 10: Music Recommender Systems

Slope One

• By Daniel Lemire in 2005

– http://www.daniel-lemire.com/fr/abstracts/SDM2005.html

• Simpler Could Be Better

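• Weighted Average: $P(B) = \frac{m\,(R_A + r_{A,B}) + n\,(R_C + r_{C,B})}{m + n}$

Here $R_A$ and $R_C$ are the user's own ratings of items A and C, $r_{A,B}$ is the average difference (rating of B minus rating of A) over the $m$ users who rated both A and B, and $r_{C,B}$ is the same average over the $n$ users who rated both C and B.

• http://en.wikipedia.org/wiki/Slope_One

• Implementations:

http://taste.sourceforge.net/ (Java)

http://code.google.com/p/openslopeone (PHP&MySQL)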
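
To make the weighted average concrete, here is a small sketch of a weighted Slope One prediction applied to the Jim / Mike / Fred table from the previous slide. It stores each average difference as "target item minus rated item", so the difference is added to the user's own rating. The function name `predict` and the nested-dict layout are assumptions made for this example and are not taken from the implementations linked above.

```python
# Ratings from the Slope One example slide; Fred has not rated the second song.
ratings = {
    "Jim": {"That is it": 4, "Straight Through My Heart": 5},
    "Mike": {"That is it": 2, "Straight Through My Heart": 4},
    "Fred": {"That is it": 3},
}


def predict(ratings, user, target):
    """Weighted Slope One estimate of `user`'s rating for `target`.

    Returns None when no other user has rated both `target` and any
    item that `user` has rated.
    """
    numerator = 0.0
    denominator = 0
    for item, user_rating in ratings[user].items():
        if item == target:
            continue
        # Differences (target - item) over users who rated both items.
        diffs = [r[target] - r[item]
                 for r in ratings.values()
                 if target in r and item in r]
        if not diffs:
            continue
        avg_diff = sum(diffs) / len(diffs)
        # Each item's estimate is weighted by how many users support it.
        numerator += len(diffs) * (user_rating + avg_diff)
        denominator += len(diffs)
    return numerator / denominator if denominator else None


print(predict(ratings, "Fred", "Straight Through My Heart"))
```

With only one rated item there is a single term in the average: Jim and Mike rate the second song 1 and 2 points higher, the mean difference is 1.5, and Fred's estimate is 3 + 1.5.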

Page 11: Music Recommender Systems

Algorithms

• Association Rules

• Slope one

• SVD

• ….

Page 12: Music Recommender Systems

Similarity

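Cosine similarity between two rating vectors $i$ and $j$, each with $m$ components:

$$\cos(i, j) = \frac{R_{i,1} R_{j,1} + R_{i,2} R_{j,2} + \cdots + R_{i,m} R_{j,m}}{\sqrt{R_{i,1}^2 + R_{i,2}^2 + \cdots + R_{i,m}^2}\,\sqrt{R_{j,1}^2 + R_{j,2}^2 + \cdots + R_{j,m}^2}}$$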
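
A minimal sketch of cosine similarity between two equal-length rating vectors follows. The input vectors are made up for illustration, and returning 0.0 for an all-zero vector is a convention chosen here, not something the slides specify.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for vectors with no ratings
    return dot / (norm_a * norm_b)


# Made-up rating vectors: the first two point in the same direction.
print(cosine_similarity([1, 2, 3], [2, 4, 6]))
print(cosine_similarity([1, 0], [0, 1]))
```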

Page 14: Music Recommender Systems

SVD In Image Compression

[Figure: three panels labeled Original, K=10, and K=20]

Page 15: Music Recommender Systems

Process SVD

1. Define the original user-item matrix R, of size m x n, which holds the ratings of m users on n items. r_ij is the rating of user u_i on item i_j.

2. Preprocess the user-item matrix R to eliminate all missing data values.

3. Compute the SVD of R to obtain matrices U, S and V, of size m x m, m x n and n x n respectively. They are related by R = U * S * V^T.

4. Reduce the dimensionality by keeping only the k largest diagonal entries (singular values) of S, giving a k x k matrix S_k. Similarly, keep the corresponding columns of U and V to form U_k, of size m x k, and V_k^T, of size k x n. The "reduced" user-item matrix R' is obtained by R' = U_k * S_k * V_k^T, and r'_ij denotes the rating of user u_i on item i_j in this reduced matrix.

5. Compute sqrt(S_k) and then calculate two matrix products: U_k * sqrt(S_k)^T, which represents the m users, and sqrt(S_k) * V_k^T, which represents the n items, in the k-dimensional feature space. We are particularly interested in the latter matrix, of size k x n.

6. Run KNN on the user matrix and the item matrix to find similar users and items, or multiply the two matrices to get each user's predicted rating for every item.
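
The steps above can be sketched with NumPy as follows. Everything concrete here is an assumption made for illustration: the 4 x 3 rating matrix is made up, missing values are filled with the item (column) mean as one possible preprocessing choice, and k = 2 is arbitrary. The sketch uses the thin SVD (`full_matrices=False`), whose factors are smaller than the m x m and n x n matrices described in step 3 but give the same truncated result.

```python
import numpy as np

# Step 1: made-up ratings, 4 users x 3 items; np.nan marks a missing rating.
R = np.array([
    [5.0, 3.0, np.nan],
    [4.0, np.nan, 1.0],
    [1.0, 1.0, 5.0],
    [np.nan, 2.0, 4.0],
])

# Step 2: fill each missing value with that item's mean rating (an assumption).
col_means = np.nanmean(R, axis=0)
R_filled = np.where(np.isnan(R), col_means, R)

# Step 3: thin SVD; s holds the singular values in descending order.
U, s, Vt = np.linalg.svd(R_filled, full_matrices=False)

# Step 4: keep the k largest singular values and the matching vectors.
k = 2
U_k = U[:, :k]            # m x k
S_k = np.diag(s[:k])      # k x k
Vt_k = Vt[:k, :]          # k x n
R_reduced = U_k @ S_k @ Vt_k

# Step 5: user and item representations in the k-dimensional feature space.
# sqrt(S_k) is diagonal, so it equals its own transpose.
sqrt_S_k = np.sqrt(S_k)
user_features = U_k @ sqrt_S_k     # m x k, one row per user
item_features = sqrt_S_k @ Vt_k    # k x n, one column per item

# Step 6: multiplying the two gives a predicted rating for every user/item pair.
predicted = user_features @ item_features
print(np.round(predicted, 2))
```

The rows of `user_features` and the columns of `item_features` are the vectors a KNN search would compare to find similar users or similar items.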

Page 16: Music Recommender Systems

Demo

from Here

Which two people have the most similar tastes?

Which two seasons are the most similar?

Page 17: Music Recommender Systems

Demo

Page 18: Music Recommender Systems

Demo

Page 19: Music Recommender Systems

SVD

• SVD

– matlab

– LAPACK, BLAS (Fortran)

– numpy, scipy (Python)

– SVDLIBC, Meschach (C)

– http://en.wikipedia.org/wiki/Singular_value_decomposition

– ……

• KNN:

– matlab

– FLANN

– ……

• All in one solution:

– DIVISI

– ……

Page 20: Music Recommender Systems

MAGIC DIVISI !

#!/usr/bin/env python
# coding=utf-8

import divisi
from divisi.cnet import *

data = divisi.SparseLabeledTensor(ndim=2)

# read some ratings into data
# data[user_id, song_id] = 4

svd_result = data.svd(k=128)

# get songs that the user may like
# predict_features(svd_result, user_id).top_items(100)
# get similar songs
# feature_similarity(svd_result, song_id).top_items(100)
# get users that have similar tastes
# concept_similarity(svd_result, user_id).top_items(100)

Page 21: Music Recommender Systems

Music Recommender Systems

• Data collection

• Data Cleaning

• Data Preprocessing

• Data Mining

• Tracking & Optimization

Page 22: Music Recommender Systems

Data collection

• User ratings

• User collections

• User listening logs

• User view logs

• ….

Page 23: Music Recommender Systems

Data Cleaning

• Missing data

• Wrong data

• Noisy data

• Duplicate data

• ….

UserId  SongId  Times
3306    3654    200
3306    6950    236
3306    6528    268
3306    5874
3306    9527    foo
3306    5624    1000000
3306    9635    5
3306    6950    236
….      ….      ….
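
A minimal sketch of how the problem rows in this table might be filtered out, assuming the log arrives as string triples. The cap `MAX_PLAYS` is a hypothetical threshold picked only to reject the 1000000 row; the slides do not say how noisy values are detected, and this sketch makes no judgment about whether a low count such as 5 is noise.

```python
# Rows copied from the slide as (user_id, song_id, times) strings.
raw_rows = [
    ("3306", "3654", "200"),
    ("3306", "6950", "236"),
    ("3306", "6528", "268"),
    ("3306", "5874", ""),         # missing play count
    ("3306", "9527", "foo"),      # wrong (non-numeric) play count
    ("3306", "5624", "1000000"),  # implausibly large play count
    ("3306", "9635", "5"),
    ("3306", "6950", "236"),      # duplicate of the second row
]

MAX_PLAYS = 10_000  # hypothetical cap, not taken from the slides


def clean(rows, max_plays=MAX_PLAYS):
    """Drop missing, non-numeric, out-of-range and duplicate rows."""
    seen = set()
    cleaned = []
    for user_id, song_id, times in rows:
        times = times.strip()
        if not times:
            continue  # missing data
        if not times.isdigit():
            continue  # wrong data
        plays = int(times)
        if plays > max_plays:
            continue  # noisy data
        key = (user_id, song_id)
        if key in seen:
            continue  # duplicate data: keep the first occurrence
        seen.add(key)
        cleaned.append((user_id, song_id, plays))
    return cleaned


for row in clean(raw_rows):
    print(row)
```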

Page 24: Music Recommender Systems

Data Preprocessing

UserId  SongId  Times
3306    3654    200
3306    6950    236
3306    6528    268
3306    5874    325
3306    9527    126
3306    5624    98
3306    9635    115
3306    6962    210
….      ….      ….

UserId  SongId  Weight
3306    3654    0.62
3306    6950    0.73
3306    6528    0.82
3306    5874    1
3306    9527    0.39
3306    5624    0.30
3306    9635    0.35
3306    6962    0.65
….      ….      ….
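
The weights on this slide are consistent with dividing each play count by the user's largest play count (325 here) and rounding to two decimals. The slides do not state the rule, so the sketch below is an inference from the numbers, and `to_weights` is a hypothetical helper name.

```python
# Play counts for user 3306, copied from the slide (song_id -> times).
plays = {
    3654: 200, 6950: 236, 6528: 268, 5874: 325,
    9527: 126, 5624: 98, 9635: 115, 6962: 210,
}


def to_weights(plays):
    """Scale one user's play counts into (0, 1] by their maximum count."""
    top = max(plays.values())
    return {song: round(count / top, 2) for song, count in plays.items()}


weights = to_weights(plays)
for song, weight in weights.items():
    print(song, weight)
```

Scaling per user keeps heavy listeners from dominating the similarity computation. The sketch assumes the user has at least one play; `max` raises `ValueError` on an empty dictionary.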

Page 25: Music Recommender Systems

Data Mining

UserId  SongId  Weight
3306    3654    0.62
3306    6950    0.73
3306    6528    0.82
3306    5874    1
3306    9527    0.39
3306    5624    0.30
3306    9635    0.35
3306    6962    0.65
….      ….      ….

UserId  Similar Users' Ids
….      ….

SongId  Similar Songs' Ids
….      ….
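
One way to produce the "similar songs" table from the weight table is a nearest-neighbour search over song vectors, where each song is described by the weights its listeners gave it. The sketch below uses cosine similarity on sparse dictionaries. The weights for users 3307 and 3308 are made up for illustration, and the slides do not say which similarity measure or neighbour count was used in practice.

```python
import math

# user_id -> {song_id: weight}; users 3307 and 3308 are made-up examples.
weights = {
    3306: {3654: 0.62, 6950: 0.73, 6528: 0.82},
    3307: {3654: 0.50, 6950: 0.60, 5874: 1.00},
    3308: {6528: 0.90, 5874: 0.40},
}


def song_vectors(weights):
    """Invert the table into song_id -> {user_id: weight}."""
    vectors = {}
    for user_id, songs in weights.items():
        for song_id, weight in songs.items():
            vectors.setdefault(song_id, {})[user_id] = weight
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b[u] for u, w in a.items() if u in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similar_songs(weights, song_id, n=2):
    """Return the n songs most similar to `song_id`, best first."""
    vectors = song_vectors(weights)
    target = vectors[song_id]
    scores = [(other, cosine(target, vec))
              for other, vec in vectors.items() if other != song_id]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:n]


print(similar_songs(weights, 3654))
```

Similar users can be found the same way by comparing the rows of `weights` directly instead of the inverted song vectors.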

Page 26: Music Recommender Systems

Tracking & Optimization

• Recommended result

• Users view and click what they like

• Store users' clicks

• Data Mining

• Better recommendation

Page 27: Music Recommender Systems

That's it, thanks. Q&A