IPTV Recommender Systems
Paolo Cremonesi
Paolo Cremonesi - Recommender Systems
Agenda
• IPTV architecture
• Recommender algorithms
• Evaluation of different algorithms
• Multi-model systems
Valentino Rossi
IPTV architecture
[Diagram: Customers, Service Provider, Network Provider, Content Provider; services: Live TV and VOD; customer device: set-top box (decoder)]
IPTV architecture
• IPTV is a video service supplied by a telecom service provider that owns the network infrastructure and controls content distribution over the broadband network for reliable delivery to the consumer (generally to the TV/IP STB).
• Services
- Broadcast TV (BTV) services, which consist of the simultaneous reception by users of a traditional TV channel, free-to-air or pay TV. BTV services are usually implemented using IP multicast protocols.
- Video On Demand (VOD) services, which consist of viewing multimedia content made available by the service provider upon request. VOD services are usually implemented using IP unicast protocols.
IPTV Platform: Now
• Hundreds of live channels
• Thousands of VOD items
• Customers face difficulties finding the "right" content
• Customer frustration
• Customer purchases
IPTV Platform: with a recommender system
Today, recommendations based on your personal taste are:
From this...
To this.
IPTV recommender needs
• Improve user satisfaction
• Sell new content to users
- VOD
- Pay-per-view channels
• Targeted advertising
Agenda
• IPTV architecture
• Recommender algorithms
• Evaluation of different algorithms
• Multi-model systems
Recommender System: how it works
[Diagram: user data (user's taste, viewings (fruitions) and ratings) and content metadata feed the recommender system, which outputs content recommendations]
Problem formulation
[Diagram: users' ratings and items' metadata feed the recommender, which outputs a ranked list of items (Item 1, Item 2, Item 3, ..., Item X); the top N items of the list are recommended]
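The step from ranked list to top-N can be sketched in a few lines of Python. This is only an illustration: the scores, the item names, and the rule of skipping items the user has already seen are assumptions, not part of the slides.

```python
def top_n(scores, already_seen, n):
    """Return the n highest-scoring items the user has not seen yet."""
    candidates = {item: s for item, s in scores.items() if item not in already_seen}
    # Highest score first; ties are broken by item name to keep the order deterministic.
    return sorted(candidates, key=lambda item: (-candidates[item], item))[:n]


# Hypothetical scores produced by some recommender for one user.
scores = {"Item1": 4.2, "Item2": 3.1, "Item3": 4.8, "Item4": 2.0}
ranked = top_n(scores, already_seen={"Item3"}, n=2)
print(ranked)
```

Any of the algorithms discussed later could supply the `scores` dictionary; only the ranking step is shown here.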
Recommendation techniques
Recommender algorithms
• Collaborative filtering (users with similar taste)
- User-based
- Item-based
• Content-based filtering (similar items)
Memory vs. model based
Algorithm families classified as model-based or memory-based:
• Content-based
• Dimensional-reduction
• Item-based
• User-based
Collaborative Filtering
[Figure: example ratings (4, 5, 2, 2, 3) with one unknown rating "?" to be predicted]
Neighborhood
• User-based: similar users rate an item similarly
• Item-based: similar items are rated similarly by a user
NB: similarity means correlation
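A minimal sketch of the user-based idea, taking "similarity means correlation" literally as the Pearson correlation over co-rated items. The toy ratings, the mean-centred weighted average, and the choice to ignore negatively correlated users are assumptions made for illustration.

```python
from math import sqrt


def pearson(ra, rb):
    """Pearson correlation computed over the items rated by both users."""
    common = sorted(set(ra) & set(rb))
    if len(common) < 2:
        return 0.0
    ma = sum(ra[i] for i in common) / len(common)
    mb = sum(rb[i] for i in common) / len(common)
    num = sum((ra[i] - ma) * (rb[i] - mb) for i in common)
    den = sqrt(sum((ra[i] - ma) ** 2 for i in common)) * sqrt(
        sum((rb[i] - mb) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict_user_based(ratings, user, item):
    """Predict a rating as the user's mean plus the neighbours' weighted deviations."""
    mean_u = sum(ratings[user].values()) / len(ratings[user])
    num = den = 0.0
    for other, r in ratings.items():
        if other == user or item not in r:
            continue
        sim = pearson(ratings[user], r)
        if sim <= 0:  # assumption: only positively correlated users are neighbours
            continue
        mean_o = sum(r.values()) / len(r)
        num += sim * (r[item] - mean_o)
        den += sim
    return mean_u + num / den if den else mean_u


# Hypothetical ratings: U2 agrees with U1, U3 has opposite taste.
ratings = {
    "U1": {"I1": 4, "I2": 5, "I3": 2},
    "U2": {"I1": 4, "I2": 5, "I3": 2, "I4": 5},
    "U3": {"I1": 2, "I2": 1, "I3": 5, "I4": 1},
}
prediction = predict_user_based(ratings, "U1", "I4")
print(round(prediction, 2))
```

The item-based variant follows the same pattern with the roles of users and items swapped.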
Collaborative filtering: User Rating Matrix
[Figure: user rating matrix, indexed by user and item]
User rating matrix (URM)

Explicit URM
     I1  I2  I3  I4
U1   3   4   0   1
U2   2   2   1   0
U3   2   0   0   4
U4   1   5   0   1
U5   3   0   1   0

Implicit URM
     I1  I2  I3  I4
U1   0   1   0   1
U2   0   0   1   1
U3   1   1   0   0
U4   1   1   0   1
U5   0   1   1   1
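As a rough sketch, the explicit URM above can be held as a list of rows, with 0 standing for "no rating". Binarizing it (1 if a rating exists) is one assumed way to obtain an implicit matrix; the result is derived from the explicit example and is not the implicit table shown on the slide.

```python
# Explicit URM from the slide: rows are users U1..U5, columns are items I1..I4.
explicit_urm = [
    [3, 4, 0, 1],
    [2, 2, 1, 0],
    [2, 0, 0, 4],
    [1, 5, 0, 1],
    [3, 0, 1, 0],
]

# Assumption: a rating greater than 0 counts as an interaction.
binarized_urm = [[1 if r > 0 else 0 for r in row] for row in explicit_urm]

# Profile length = number of items each user has interacted with.
profile_lengths = [sum(row) for row in binarized_urm]

# Density = fraction of user-item pairs with an interaction.
n_users, n_items = len(explicit_urm), len(explicit_urm[0])
density = sum(profile_lengths) / (n_users * n_items)

print(profile_lengths, density)
```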
Dimensional-reduction collaborative model
• Items and users can be described by a number (K) of unknown features
• $a_{uf}$: describes whether feature f is important for user u
• $b_{if}$: describes whether feature f is present in item i
• $r_{ui}$: rating assigned by (or estimated for) user u to item i
$$r_{ui} = \sum_{f=1}^{K} a_{uf} \cdot b_{if}$$
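The formula is a dot product between the user's feature vector and the item's feature vector. A tiny sketch, with K = 3 and made-up feature values (both are assumptions):

```python
def predict_rating(a_u, b_i):
    """r_ui = sum over the K latent features of a_uf * b_if."""
    return sum(a * b for a, b in zip(a_u, b_i))


# Hypothetical latent features, K = 3.
a_u = [0.9, 0.1, 0.5]  # importance of each feature for user u
b_i = [4.0, 1.0, 2.0]  # presence of each feature in item i
r_ui = predict_rating(a_u, b_i)
print(round(r_ui, 2))
```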
Singular Value Decomposition
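$R = U S V^T$ (the matrix is labeled A in the figure)
$R_k = U_k S_k V_k^T$
Dimensions: $R$ and $R_k$ are m x n, $U_k$ is m x k, $S_k$ is k x k, $V_k^T$ is k x n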
Singular Value Decomposition
$U^T \cdot U = I$
$V^T \cdot V = I$
$R = U \cdot S \cdot V^T$
Singular Value Decomposition
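$R_k$: best rank-k approximation of R according to the Frobenius norm (computed over all entries of R), not according to the least-squares error on the known ratings only
$U_k^T \cdot U_k = I$
$V_k^T \cdot V_k = I$
$R_k = U_k \cdot S_k \cdot V_k^T$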
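A short NumPy sketch of the rank-k truncation, using the explicit URM from the earlier slide as R. The choice k = 2 is arbitrary. The last lines check the orthogonality relations and that the Frobenius error of $R_k$ equals the norm of the discarded singular values.

```python
import numpy as np

# Explicit URM from the earlier slide (missing ratings stored as 0).
R = np.array(
    [[3, 4, 0, 1], [2, 2, 1, 0], [2, 0, 0, 4], [1, 5, 0, 1], [3, 0, 1, 0]],
    dtype=float,
)

# Thin SVD: U is m x r, s holds the singular values, Vt is r x n.
U, s, Vt = np.linalg.svd(R, full_matrices=False)

k = 2  # assumed latent size, chosen only for the example
U_k, S_k, Vt_k = U[:, :k], np.diag(s[:k]), Vt[:k, :]
R_k = U_k @ S_k @ Vt_k  # rank-k approximation, still m x n

frob_error = np.linalg.norm(R - R_k, "fro")
print(np.allclose(U_k.T @ U_k, np.eye(k)), np.allclose(Vt_k @ Vt_k.T, np.eye(k)))
print(np.isclose(frob_error, np.sqrt(np.sum(s[k:] ** 2))))
```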
Folding-in
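• New rows/columns of A are projected (folded in) into the existing latent space without computing a new SVD
• e.g., a new user u
$u' = u \, V_k \, S_k^{-1}$
[Figure: $A_k = U_k S_k V_k^T$; the new row u is projected to u', in the same space as the rows of $U_k$]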
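Folding-in can be sketched with NumPy as follows. The matrix is the explicit URM from the earlier slide, and k = 2 and the new user's ratings are assumptions. As a sanity check, folding in a row that was already in the matrix returns the corresponding row of $U_k$.

```python
import numpy as np

R = np.array(
    [[3, 4, 0, 1], [2, 2, 1, 0], [2, 0, 0, 4], [1, 5, 0, 1], [3, 0, 1, 0]],
    dtype=float,
)
U, s, Vt = np.linalg.svd(R, full_matrices=False)

k = 2  # assumed latent size
V_k = Vt[:k, :].T  # n x k
S_k = np.diag(s[:k])  # k x k
S_k_inv = np.diag(1.0 / s[:k])


def fold_in_user(u):
    """Project a new user's rating row into the latent space: u' = u V_k S_k^-1."""
    return u @ V_k @ S_k_inv


# Hypothetical new user who rated only the second item.
new_user = np.array([0.0, 5.0, 0.0, 0.0])
u_prime = fold_in_user(new_user)

# Estimated ratings for the new user in the original item space.
estimated = u_prime @ S_k @ V_k.T
print(u_prime.shape, estimated.shape)
```

No new decomposition is computed: only two small matrix products are needed per new user.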
Collaborative Filtering: pro & cons
• Pro: There is no need for content
• Cons:
- Cold start: we need enough users in the system to find a match.
- Sparsity: when the user rating matrix is sparse, it is hard to find a neighbourhood.
- First rater: cannot recommend an item that has not previously been rated by anyone else.
- Popularity bias: cannot recommend items to someone with unique tastes; tends to recommend popular items (dataset coverage).
Content-based Filtering
...fine-tuning a discovery that could lead to the first therapeutic use of the controversial procedure. If the animal studies prove promising, researchers could begin testing the new cells on human eyes within two years...
[Figure: terms extracted from the text: Term 1, Term 2, Term 3]
• Similar items contain the same terms
• The more a term occurs in an item, the more representative it is
• The more a term occurs in the collection, the less representative it is (i.e., it is less useful for distinguishing a specific item)
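The last two bullets describe the intuition behind TF-IDF weighting. The slides do not name a specific formula, so the sketch below assumes one common variant, weight = term count in the item × log(N / number of items containing the term), on a made-up collection of three tokenized items.

```python
from collections import Counter
from math import log


def tf_idf(docs):
    """Weight each term by its count in the item times log(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        term_freq = Counter(doc)
        weights.append({t: term_freq[t] * log(n_docs / doc_freq[t]) for t in term_freq})
    return weights


# Hypothetical tokenized item descriptions.
docs = [
    ["the", "stem", "cells", "cells"],
    ["the", "cells", "therapy"],
    ["the", "football", "match"],
]
weights = tf_idf(docs)
print(weights[0])
```

A term that appears in every item ("the") gets weight 0, while a term repeated inside one item ("cells" in the first item) gets a higher weight there than in items where it occurs once.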
Content-based filtering: Item-Content Matrix
[Figure: item-content matrix, indexed by word and item]
Content-based Filtering: techniques
User-item similarity
[Figure: labels Term 1, Term 2, Term 3]
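One way to read "user-item similarity" is to represent both the user and the items as vectors over the same terms and compare them with the cosine. In the sketch below the user profile is the sum of the vectors of the items the user liked; that construction, the item vectors, and the names are assumptions, not details given on the slide.

```python
from math import sqrt


def cosine(x, y):
    """Cosine of the angle between two term vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = sqrt(sum(a * a for a in x))
    norm_y = sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0


# Hypothetical item vectors over (Term 1, Term 2, Term 3).
items = {"A": [1, 0, 1], "B": [1, 1, 0], "C": [0, 0, 1], "D": [0, 1, 0]}
liked = ["A"]

# Assumed profile: sum of the term vectors of the liked items.
profile = [sum(column) for column in zip(*(items[i] for i in liked))]

scores = {i: cosine(profile, v) for i, v in items.items() if i not in liked}
best = max(scores, key=scores.get)
print(best, scores)
```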
Content-based Filtering: pro & cons
• Pro:
- No need for data on other users
- No cold-start, sparsity, or first-rater problems
- Able to recommend to users with unique tastes
- Able to recommend new and unpopular items
- Can provide explanations about recommended items
- Well-known technology
• Cons:
- Requires structured content
- Lower accuracy
- Users' tastes must be represented as a function of the content
- Unable to exploit quality judgments of other users
Content-based Filtering: Latent Semantic Analysis
A: m x n matrix
- Terms in rows
- Items in columns
SVD: $A_k = U_k \cdot S_k \cdot V_k^T$
Pseudo terms: $U_k \cdot \sqrt{S_k}$
Pseudo items: $V_k \cdot \sqrt{S_k}$
Similarity: cosine
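A small NumPy sketch of the pseudo-term and pseudo-item construction. The term-item matrix is made up (two items about one topic, two about another) and k = 2 is an assumption. Items on the same topic end up with cosine close to 1 in the latent space, items on different topics close to 0.

```python
import numpy as np

# Hypothetical term-item matrix: terms in rows, items in columns.
A = np.array(
    [[2, 1, 0, 0], [1, 2, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]],
    dtype=float,
)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2  # assumed latent size
sqrt_S_k = np.diag(np.sqrt(s[:k]))
pseudo_terms = U[:, :k] @ sqrt_S_k  # m x k
pseudo_items = Vt[:k, :].T @ sqrt_S_k  # n x k


def cosine(x, y):
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))


sim_same_topic = cosine(pseudo_items[0], pseudo_items[1])
sim_other_topic = cosine(pseudo_items[0], pseudo_items[2])
print(round(sim_same_topic, 3), round(sim_other_topic, 3))
```

Splitting $S_k$ as two square roots keeps the product of pseudo terms and pseudo items equal to $A_k$.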
Recommender architecture
[Diagram components: Resources management (Items, Storage, Features extraction, Features representation); Users management (Users, Infer and learn profile, Interests/tastes representation); Filter (Compute user-item correlation, Items retrieval, Items recommendation); feedback]
Explicit vs implicit ratings
Datasets
• User-item rating matrix
- 23942 users
- 564 movies
- 56686 ratings
• Movie metadata (textual information)
- Title
- Genre
- Director
- Cast
- Duration
- …
Real datasets composed of movies and user fruitions (viewings), plus some extra information
• Implicit vs explicit
• How to determine the implicit rating
- VOD
- TV (EPG)
Some problems with IPTV recommender
• Cold start
• Multi-language content (e.g., Switzerland)
• New user problem (user-based algorithms)
• New item problem (all collaborative algorithms)
• Semantic problem (e.g., house and home)
Agenda
• IPTV architecture
• Recommender algorithms
• Evaluation of different algorithms
• Multi-model systems
Problem
• Many works do not clearly describe the methods used for performance evaluation and model comparison
• Different dataset partitioning methodologies and evaluation metrics lead to divergent results
• The Netflix prize has improperly focused research attention on
- Hold-out
- RMSE
Objective
• Design a new methodology to compare different algorithms according to
- how often the user watches TV (length of the user profile)
- whether the user prefers "blockbuster" movies (user preference for popular versus unpopular movies and programs)
• Design a multi-model system
Metrics
• Error metrics
- Mean Square Error (MSE)
- Root Mean Square Error (RMSE)
- Mean Absolute Error (MAE)
Only for explicit datasets
Top-N recommender systems
• Accuracy metrics
- Recall
- Precision
- Fallout
- F-measure
☺ Both implicit and explicit datasets
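For reference, the metrics listed above can be sketched with their standard textbook definitions. The slides do not give the formulas, so the definitions below (in particular fallout as recommended-but-not-relevant items over all non-relevant items, and F-measure as the harmonic mean of precision and recall) and the toy data are assumptions.

```python
from math import sqrt


def rmse(predicted, actual):
    """Root mean square error between predicted and actual ratings."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def top_n_metrics(recommended, relevant, n_items):
    """Accuracy metrics for one top-N list, given the relevant items of the test set."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended)
    recall = hits / len(relevant)
    fallout = (len(recommended) - hits) / (n_items - len(relevant))
    f_measure = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )
    return {"precision": precision, "recall": recall, "fallout": fallout, "f": f_measure}


# Hypothetical example: a catalogue of 10 items, 4 recommended, 3 relevant.
m = top_n_metrics(["a", "b", "c", "d"], {"a", "c", "e"}, n_items=10)
print(m)
print(rmse([4, 3, 5], [5, 3, 3]), mae([4, 3, 5], [5, 3, 3]))
```

The error metrics need actual rating values, whereas the accuracy metrics only need to know which items are relevant.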
Accuracy metrics
Netflix dataset: test user profile
Netflix dataset: Global effects algorithm
RMSE: 0.95, Recall: 1%, F-measure: 0.01
Netflix dataset: Adjusted cosine algorithm
RMSE: 1.6, Recall: 8%, F-measure: 0.16
Netflix dataset: SVD algorithm
RMSE: 2.7, Recall: 17%, F-measure: 0.28
Quality evaluation
• Focus on future performance on new data
• Proper partitioning of the original data set into:
- training set
- test set
• Test set must be different and independent from the training set
• Active user: should be left out of the model
Hold-out
Leave-one-out
K-fold
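The three partitioning schemes (hold-out, leave-one-out, k-fold) can be sketched as follows. The sketch splits individual (user, item, rating) triples, which is an assumption: the slides do not say whether the split is by rating or by user. The data, the fixed seed, and the function names are illustrative.

```python
import random


def hold_out(ratings, test_fraction, seed=0):
    """Single random split into a training set and a test set."""
    shuffled = ratings[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def k_fold(ratings, k, seed=0):
    """Yield k (train, test) pairs; every rating is tested exactly once."""
    shuffled = ratings[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, folds[i]


def leave_one_out(ratings):
    """Yield one (train, test) pair per rating, with a single rating in the test set."""
    for i in range(len(ratings)):
        yield ratings[:i] + ratings[i + 1 :], [ratings[i]]


# Hypothetical dataset of 10 (user, item, rating) triples.
ratings = [(f"U{u}", f"I{i}", (u + i) % 5 + 1) for u in range(1, 6) for i in range(1, 3)]

train, test = hold_out(ratings, test_fraction=0.2)
folds = list(k_fold(ratings, k=5))
loo = list(leave_one_out(ratings))
print(len(train), len(test), len(folds), len(loo))
```

In all three cases the test set is disjoint from the training set, as required by the quality evaluation slide.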
Agenda
• IPTV architecture
• Recommender algorithms
• Evaluation of different algorithms
• Multi-model systems
Recommender system architecture
[Diagram components: Web Services; Inputs: Items' Content (ICM), Users' Ratings (URM); Batch Processing; Model Repository; Real-time Recommendation; Business Rules; Real-time calls from the STB clients through the STB server]
Proposed approach
• Batch system
- Statistical analysis of the dataset
- Definition of a number of models
- Accuracy evaluation for different user profiles
• Run-time system
- User profile analysis
- Selection of the best candidate model
- Recommendation
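The run-time selection step might look like the sketch below. The recall values are those of the "All" block of the "User profile length – NM recall" table in this deck; the lookup-table design, the function names, and the handling of profiles shorter than 2 are assumptions, not the author's implementation.

```python
# Recall (%) per user group and algorithm, as measured by the batch system
# (values from the "All" block of the NM recall table in this deck).
RECALL_BY_GROUP = {
    "2-9": {"SVD": 11.21, "Cos-like": 19.35, "NBN_S": 19.65, "NBN_I": 17.23, "NBN_U": 21.65},
    "10-19": {"SVD": 11.23, "Cos-like": 13.11, "NBN_S": 12.09, "NBN_I": 13.62, "NBN_U": 12.60},
    "20-inf": {"SVD": 9.91, "Cos-like": 8.01, "NBN_S": 6.45, "NBN_I": 6.65, "NBN_U": 6.52},
}


def profile_group(profile_length):
    """User profile analysis: map the number of rated items to a user group."""
    if profile_length >= 20:
        return "20-inf"
    if profile_length >= 10:
        return "10-19"
    return "2-9"  # assumption: shorter profiles fall back to the smallest group


def select_model(profile_length):
    """Selection of the best candidate model for this user's group."""
    recalls = RECALL_BY_GROUP[profile_group(profile_length)]
    return max(recalls, key=recalls.get)


for length in (5, 12, 40):
    print(length, select_model(length))
```

The selected name would then be used to fetch the corresponding model from the model repository and produce the recommendation.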
Multi-model recommender engine
Dataset statistical analysis (example)
[Figure: percentage of rated items in the top-rated (0 to 1) versus position of the items in the top-rated (logarithmic axis, 10^0 to 10^4); curves: NM, ML, NF]
Dataset statistical analysis (example)
• User groups: 2...9, 10...19, 20 or more
• Item popularity: popular, non-popular
Popular vs. unpopular: SVD algorithm - NF
[Figure: recall (0 to 0.35) versus latent size (0 to 1000); curves: all, popular, unpopular]
Popular vs. unpopular: SVD algorithm - NM
[Figure: recall (0.05 to 0.25) versus latent size (5 to 300); curves: all, popular, unpopular]
User profile length – NM recall
All
Group    SVD      Cos-like  NBN_S    NBN_I    NBN_U
2-9      11.21%   19.35%    19.65%   17.23%   21.65%
10-19    11.23%   13.11%    12.09%   13.62%   12.60%
20-inf   9.91%    8.01%     6.45%    6.65%    6.52%

Popular
Group    SVD      Cos-like  NBN_S    NBN_I    NBN_U
2-9      22.21%   31.21%    31.54%   26.17%   33.93%
10-19    24.12%   27.12%    24.61%   27.36%   25.59%
20-inf   25.72%   22.71%    20.71%   21.14%   20.93%

Unpopular
Group    SVD      Cos-like  NBN_S    NBN_I    NBN_U
2-9      9.92%    0.81%     0.13%    2.64%    1.48%
10-19    10.01%   1.23%     0.19%    0.56%    0.25%
20-inf   10.14%   0.70%     0.01%    0.10%    0.01%

Best average algorithm (item-based): 15.94%
Multi-model (overall): 20.92%