ICONIP 2010, Sydney, Australia 1
An Enhanced Semi-supervised Recommendation Model Based
on Green’s Function
Dingyan Wang and Irwin King
Dept. of Computer Science & Engineering
The Chinese University of Hong Kong
Outline
• Background
• Motivation
• An Enhanced Model
• Experimental Analysis
• Conclusion
Background
• Recommendation in Collaborative Filtering
Background
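• Significance
– Consumer satisfaction
– Profit
• Mathematical Form
– User-item matrix $R_0$ (rows: users, columns: items; "?" marks a rating for prediction)
– Task: complete the matrix by rating prediction

$$R_0 = \begin{pmatrix}
2 & 3 & 4 & 5 & ? & 1 & ? \\
1 & ? & ? & 3 & ? & 4 & 2 \\
? & 2 & 3 & 4 & 4 & 3 & ? \\
2 & 4 & 5 & 1 & 3 & ? & ? \\
? & 1 & 5 & ? & 5 & ? & 2 \\
3 & 2 & 4 & 3 & ? & ? & ? \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots
\end{pmatrix}$$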
Background
• Traditional Recommendation Methods– Memory-based method
• Item-based method, WWW ’01 & SIGIR ’06
• User-based method, SIGIR ’06
– Model-based method
• Probabilistic matrix factorization, SIGIR ’07 & ’04
Background
• A Novel View of Recommendation [Green’s function recommendation, KDD ’07 & WWW ’10]
– Label propagation on a graph
– Label prediction with semi-supervised learning
Motivation
• Higher accuracy in label propagation recommendation
• Importance of graph construction
• Accuracy Reduction
– Data Sparsity
• Some items have no similarity information
– Information Loss
• Similarity in a local view
An Enhanced Model
• An Enhanced Model Based on Green’s Function
[Pipeline diagram: User-Item Rating Matrix $R_0$ → Enhanced Item-Graph Construction → Green’s Function Calculation → Label Propagation → Predicted User-Item Matrix $R_0'$]
An Enhanced Model
• Enhanced Item-Graph Construction
– Global similarity between items
• Latent-feature vector similarity
– Local similarity between items
• Similarity derived from ratings
– Global and local consistent similarity
• Linear combination of global and local similarity
An Enhanced Model
• Global Similarity Calculation
– Latent features extraction
• Probabilistic matrix factorization (PMF), NIPS ’08

$$R \approx UV$$

$R$: $M \times N$ rating matrix; $V$: $K \times N$ item-latent matrix; $U$: $M \times K$ user-latent matrix; $R_{ij}$: rating of user $i$ for item $j$; $I_{ij}$: indicator showing whether user $i$ rated item $j$.

$$p(R \mid U, V, \sigma^2) = \prod_{i=1}^{m} \prod_{j=1}^{n} \left[ \mathcal{N}(R_{ij} \mid U_i^T V_j, \sigma^2) \right]^{I_{ij}}$$

$$\min \|R_0 - \hat{R}_0\| = \min L(Y, U, V) = \min \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{n} I_{ij} (R_{ij} - \hat{R}_{ij})^2 + \frac{\lambda_U}{2} \|U\|_{Fro}^2 + \frac{\lambda_V}{2} \|V\|_{Fro}^2$$
An Enhanced Model
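• Local Similarity Calculation
– Cosine Similarity

$$Sim(j, k) = \frac{r_j \cdot r_k}{\|r_j\| \, \|r_k\|}$$

– Pearson Correlation Coefficient (PCC)

$$Sim(j, k) = \frac{\sum_{u \in U(j) \cap U(k)} (r_{u,j} - \bar{r}_j)(r_{u,k} - \bar{r}_k)}{\sqrt{\sum_{u \in U(j) \cap U(k)} (r_{u,j} - \bar{r}_j)^2} \, \sqrt{\sum_{u \in U(j) \cap U(k)} (r_{u,k} - \bar{r}_k)^2}}$$

Here $r_j$ is the rating vector of item $j$, $U(j)$ is the set of users who rated item $j$, and $\bar{r}_j$ is the mean rating of item $j$.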
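The two local similarities above can be sketched as follows. This is a minimal illustration, assuming ratings sit in a dense NumPy array where 0 marks a missing rating; the toy matrix and the function names are hypothetical, and taking each item mean over that item's own raters is an assumption, since the slides do not say how the mean is computed.

```python
import numpy as np

def cosine_sim(R, j, k):
    """Cosine similarity between the rating columns of items j and k."""
    rj, rk = R[:, j], R[:, k]
    denom = np.linalg.norm(rj) * np.linalg.norm(rk)
    return float(rj @ rk / denom) if denom > 0 else 0.0

def pcc_sim(R, j, k):
    """Pearson correlation over the users who rated both items j and k."""
    both = (R[:, j] > 0) & (R[:, k] > 0)
    if not both.any():
        return 0.0
    # Assumption: item means are taken over each item's own raters.
    mean_j = R[R[:, j] > 0, j].mean()
    mean_k = R[R[:, k] > 0, k].mean()
    dj = R[both, j] - mean_j
    dk = R[both, k] - mean_k
    denom = np.sqrt((dj ** 2).sum()) * np.sqrt((dk ** 2).sum())
    return float((dj * dk).sum() / denom) if denom > 0 else 0.0

# Hypothetical toy data: 4 users (rows) x 3 items (columns), 0 = not rated.
R = np.array([[5, 4, 0],
              [3, 2, 0],
              [1, 0, 4],
              [4, 5, 2]], dtype=float)

print(cosine_sim(R, 0, 1), pcc_sim(R, 0, 2))
```

Items with no common raters get similarity 0 here, which is exactly the data-sparsity case the Motivation slide points at.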
An Enhanced Model
• Global And Local Consistent Similarity (GLCS)
– Global similarity from item latent matrix $V$

$$sim(v_j, v_k) = cosine(v_j, v_k)$$

– Global and local similarity combination

$$GLCS(j, k) = \alpha \, sim(v_j, v_k) + (1 - \alpha) \, sim(j, k)$$

– Weighted undirected item-graph

$$G = (\mathcal{V}, E, W), \quad W_{jk} = GLCS(j, k)$$
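A small sketch of the combination step may help here. It assumes cosine similarity for both the global part (columns of the item-latent matrix) and the local part (columns of the rating matrix), and it calls the weight parameter `alpha`; that name, the helper functions, and the toy data are hypothetical.

```python
import numpy as np

def cosine_matrix(X):
    """Pairwise cosine similarity between the columns of X."""
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero columns
    Xn = X / norms
    return Xn.T @ Xn

def glcs(V, R, alpha):
    """W[j, k] = alpha * sim(v_j, v_k) + (1 - alpha) * sim(j, k)."""
    global_sim = cosine_matrix(V)  # from the K x N item-latent matrix
    local_sim = cosine_matrix(R)   # from the M x N rating matrix (0 = missing)
    return alpha * global_sim + (1.0 - alpha) * local_sim

rng = np.random.default_rng(0)
V = rng.normal(size=(2, 3))  # hypothetical latent features, K = 2, N = 3
R = np.array([[5, 4, 0],
              [3, 2, 0],
              [1, 0, 4],
              [4, 5, 2]], dtype=float)

W = glcs(V, R, alpha=0.5)
print(W)
```

Because the latent part is defined for every pair of items, `W` carries a weight even for pairs with no common raters. Cosine similarity of latent vectors can be negative; how negative weights are handled is not stated in the slides.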
An Enhanced Model
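• Green’s Function Calculation (An Example)
– Given an item-graph
[Figure: item-graph with nodes 1 to 5 and edge weights 1-2: 0.2, 1-3: 0.8, 1-4: 0.5, 2-3: 0.25, 2-4: 0.1, 3-5: 0.4, 4-5: 0.6]
– Calculate the Laplacian matrix $L = D - W$

$$W = \begin{pmatrix}
1 & 0.2 & 0.8 & 0.5 & 0 \\
0.2 & 1 & 0.25 & 0.1 & 0 \\
0.8 & 0.25 & 1 & 0 & 0.4 \\
0.5 & 0.1 & 0 & 1 & 0.6 \\
0 & 0 & 0.4 & 0.6 & 1
\end{pmatrix}$$

$$D = \begin{pmatrix}
2.5 & 0 & 0 & 0 & 0 \\
0 & 1.55 & 0 & 0 & 0 \\
0 & 0 & 2.45 & 0 & 0 \\
0 & 0 & 0 & 2.2 & 0 \\
0 & 0 & 0 & 0 & 2
\end{pmatrix}$$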
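The example can be checked numerically. The sketch below uses the weight matrix from the slide and assumes, as the slide's numbers suggest, that D is the diagonal matrix of row sums of W (self-similarities included).

```python
import numpy as np

# Weight matrix W of the five-item example graph.
W = np.array([
    [1.0, 0.2,  0.8,  0.5, 0.0],
    [0.2, 1.0,  0.25, 0.1, 0.0],
    [0.8, 0.25, 1.0,  0.0, 0.4],
    [0.5, 0.1,  0.0,  1.0, 0.6],
    [0.0, 0.0,  0.4,  0.6, 1.0],
])

D = np.diag(W.sum(axis=1))  # degree matrix: row sums of W on the diagonal
L = D - W                   # graph Laplacian

print(np.diag(D))
print(L)
```

Every row of L sums to zero, so the all-ones vector is an eigenvector with eigenvalue 0; this is the zero-mode that is discarded when L is inverted.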
An Enhanced Model
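• Green’s Function Calculation
– Defined as the inverse of matrix $L$ with the zero-mode discarded

$$G = L^{-1}_{(+)} = \frac{1}{(D - W)_{+}} = \sum_{i=2}^{n} \frac{v_i v_i^T}{\lambda_i}$$

where $L v_i = \lambda_i v_i$, $0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n$, and the sum runs without the zero-mode $\lambda_1 = 0$.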
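One way to compute this in practice is through a symmetric eigendecomposition, as sketched below. The function name and the tolerance used to detect the zero-mode are assumptions for illustration; the weight matrix is the five-item example.

```python
import numpy as np

def greens_function(W, tol=1e-10):
    """Invert L = D - W on the non-zero eigenmodes only."""
    D = np.diag(W.sum(axis=1))
    L = D - W
    # eigh returns ascending eigenvalues and orthonormal eigenvectors
    # (as columns) for a symmetric matrix.
    eigvals, eigvecs = np.linalg.eigh(L)
    keep = eigvals > tol  # discard the zero-mode(s)
    # sum_i v_i v_i^T / lambda_i over the kept modes
    return (eigvecs[:, keep] / eigvals[keep]) @ eigvecs[:, keep].T

W = np.array([
    [1.0, 0.2,  0.8,  0.5, 0.0],
    [0.2, 1.0,  0.25, 0.1, 0.0],
    [0.8, 0.25, 1.0,  0.0, 0.4],
    [0.5, 0.1,  0.0,  1.0, 0.6],
    [0.0, 0.0,  0.4,  0.6, 1.0],
])

G = greens_function(W)
print(np.round(G, 3))
```

For a connected graph such as this one, the result coincides with the Moore-Penrose pseudo-inverse of L, so L G equals the identity minus the projection onto the constant vector.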
An Enhanced Model
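• Label Propagation Recommendation
– Rating $R_{ij}$ as label $y_j$
– Closed-form label propagation:

$$y_{jk} = \begin{cases} 1, & k = \arg\max_{k} \sum_{i=1}^{l} G_{ji} \, y_{ik} \\ 0, & \text{otherwise} \end{cases} \qquad l < j \le n$$

[Figure: label propagation from labeled data to unlabeled data]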
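The closed form can be sketched for a single user as follows. This is an illustration under stated assumptions: the user's rated items are the labeled nodes, each rating value 1 to 5 is one class, and 0 marks an unrated item. The function names and the example rating vector are hypothetical.

```python
import numpy as np

def greens_function(W, tol=1e-10):
    """Inverse of L = D - W with the zero-mode discarded."""
    L = np.diag(W.sum(axis=1)) - W
    eigvals, eigvecs = np.linalg.eigh(L)
    keep = eigvals > tol
    return (eigvecs[:, keep] / eigvals[keep]) @ eigvecs[:, keep].T

def propagate_ratings(G, user_ratings, n_classes=5):
    """Fill one user's missing ratings (0 = missing) by label propagation."""
    labeled = np.flatnonzero(user_ratings > 0)
    unlabeled = np.flatnonzero(user_ratings == 0)
    # One-hot labels: Y[i, k-1] = 1 if labeled item i has rating k.
    Y = np.zeros((labeled.size, n_classes))
    Y[np.arange(labeled.size), user_ratings[labeled].astype(int) - 1] = 1.0
    # scores[j, k-1] = sum over labeled i of G[j, i] * y[i, k]
    scores = G[np.ix_(unlabeled, labeled)] @ Y
    predicted = user_ratings.copy()
    predicted[unlabeled] = scores.argmax(axis=1) + 1
    return predicted

W = np.array([
    [1.0, 0.2,  0.8,  0.5, 0.0],
    [0.2, 1.0,  0.25, 0.1, 0.0],
    [0.8, 0.25, 1.0,  0.0, 0.4],
    [0.5, 0.1,  0.0,  1.0, 0.6],
    [0.0, 0.0,  0.4,  0.6, 1.0],
])
G = greens_function(W)

user_ratings = np.array([5, 0, 4, 0, 1])  # hypothetical user: items 2 and 4 unrated
predicted = propagate_ratings(G, user_ratings)
print(predicted)
```

The Green's function is computed once for the item-graph and reused for every user; only the label matrix changes from user to user.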
Experimental Analysis
• Dataset
– MovieLens dataset

#Rating | #Item | #User | Rating Range | #Training Data | #Test Data | Sparsity Level
100,000 | 1682 | 943 | 1~5 | 80,000 | 20,000 | 6.3%

• Metrics
– Mean Absolute Error (MAE)
– Mean Zero-one Error (MZOE)
– Root Mean Squared Error (RMSE)
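The slides do not spell out the metric formulas. The sketch below uses the usual definitions as an assumption: MAE averages absolute errors, MZOE is the fraction of predictions that differ from the true rating, and RMSE is the square root of the mean squared error. The example ratings are made up.

```python
import math

def mae(true, pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(true, pred)) / len(true)

def mzoe(true, pred):
    """Mean zero-one error: share of predictions that miss the exact rating."""
    return sum(1 for t, p in zip(true, pred) if t != p) / len(true)

def rmse(true, pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true, pred)) / len(true))

# Hypothetical test ratings and predictions.
true = [5, 3, 4, 1]
pred = [4, 3, 2, 1]

print(mae(true, pred))   # 0.75
print(mzoe(true, pred))  # 0.5
print(rmse(true, pred))  # sqrt(1.25), about 1.118
```

MZOE suits label propagation because the predicted ratings are discrete classes; RMSE penalizes the error of 2 on the third item more heavily than MAE does.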
Experimental Analysis
• Impact of Weight Parameter
[Figure: impact of the weight parameter; panels for k=10 and k=5]
Experimental Analysis
• Performance Comparison
– Previous Green’s function model (GCOS, GPCC), [KDD ’07]
– Item-based recommendation (ICOS, IPCC)
– User-based recommendation (UCOS, UPCC)
Conclusion
• Latent features provide global similarity.
• Global and local consistent similarity can improve item-graph construction.
• The enhanced model outperformed other memory-based methods and the previous Green’s function model.
Q&A
Thank you!
PMF
• Probabilistic Matrix Factorization
– Define a conditional distribution over the observed ratings as:

$$p(R \mid U, V, \sigma^2) = \prod_{i=1}^{m} \prod_{j=1}^{n} \left[ \mathcal{N}(R_{ij} \mid U_i^T V_j, \sigma^2) \right]^{I_{ij}}$$

where $\mathcal{N}$ is the Gaussian distribution and

$$I_{ij} = \begin{cases} 1, & R_{ij} > 0 \\ 0, & R_{ij} = 0 \end{cases}$$
PMF
• PMF
– Assume zero-mean spherical Gaussian priors on user and item feature vectors:

$$p(U \mid \sigma_U^2) = \prod_{i=1}^{m} \mathcal{N}(U_i \mid 0, \sigma_U^2 I)$$

$$p(V \mid \sigma_V^2) = \prod_{j=1}^{n} \mathcal{N}(V_j \mid 0, \sigma_V^2 I)$$

– By Bayesian inference:

$$p(U, V \mid R, \sigma^2, \sigma_U^2, \sigma_V^2) \propto p(R \mid U, V, \sigma^2) \, p(U \mid \sigma_U^2) \, p(V \mid \sigma_V^2)$$
PMF
• PMF
– Optimization: maximizing the log of the posterior distribution is equivalent to minimizing:

$$\min \|R_0 - \hat{R}_0\| = \min L(Y, U, V) = \min \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{n} I_{ij} (R_{ij} - \hat{R}_{ij})^2 + \frac{\lambda_U}{2} \|U\|_{Fro}^2 + \frac{\lambda_V}{2} \|V\|_{Fro}^2$$

– Use gradient descent in Y, U, V to reach a local optimum.
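A minimal gradient-descent sketch of this optimization is given below. It is not the authors' implementation: it updates only U and V (the role of Y is not recoverable from the transcript), uses batch updates, and the function names, hyperparameters, and toy rating matrix are all hypothetical.

```python
import numpy as np

def objective(R, U, V, lam_u=0.1, lam_v=0.1):
    """Squared error on observed entries plus Frobenius-norm penalties."""
    I = (R > 0).astype(float)  # indicator of observed ratings
    E = I * (R - U @ V)
    return (0.5 * (E ** 2).sum()
            + 0.5 * lam_u * (U ** 2).sum()
            + 0.5 * lam_v * (V ** 2).sum())

def pmf(R, K=2, lam_u=0.1, lam_v=0.1, lr=0.01, epochs=500, seed=0):
    """Factor R (M x N, 0 = missing) as U @ V with U: M x K and V: K x N."""
    rng = np.random.default_rng(seed)
    M, N = R.shape
    I = (R > 0).astype(float)
    U = 0.1 * rng.standard_normal((M, K))
    V = 0.1 * rng.standard_normal((K, N))
    for _ in range(epochs):
        E = I * (R - U @ V)            # errors on observed entries only
        grad_U = -E @ V.T + lam_u * U  # dL/dU
        grad_V = -U.T @ E + lam_v * V  # dL/dV
        U -= lr * grad_U
        V -= lr * grad_V
    return U, V

# Hypothetical toy data: 4 users x 3 items, 0 = not rated.
R = np.array([[5, 4, 0],
              [3, 2, 0],
              [1, 0, 4],
              [4, 5, 2]], dtype=float)

U, V = pmf(R)
print(objective(R, U, V))
print(np.round(U @ V, 2))
```

The columns of `V` are the item latent vectors, which is where the global similarity between items comes from.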
Algorithm
• Algorithm