Recommender systems tutorial.pdf
Transcript of Recommender systems tutorial.pdf
-
7/25/2019 Recommender systems tutorial.pdf
1/144
- 1 -
Tutorial: Recommender Systems
International Joint Conference on Artificial Intelligence
Beijing, August 4, 2013
Dietmar Jannach
TU Dortmund
Gerhard Friedrich
Alpen-Adria-Universität Klagenfurt
-
- 2 - Dietmar Jannach, Markus Zanker and Gerhard Friedrich
-
Recommender Systems
Application areas
-
In the Social Web
-
Even more
Personalized search
"Computational advertising"
-
About the speakers
Gerhard Friedrich
Professor at University Klagenfurt, Austria
Dietmar Jannach
Professor at TU Dortmund, Germany
Research background and interests
Application of Intelligent Systems technology in business
Recommender systems implementation & evaluation
Product configuration systems
Web mining
Operations research
-
Agenda
What are recommender systems for?
Introduction
How do they work (Part I)?
Collaborative Filtering
How to measure their success?
Evaluation techniques
How do they work (Part II)?
Content-based Filtering
Knowledge-Based Recommendations
Hybridization Strategies
Advanced topics
Explanations
Human decision making
-
Why use Recommender Systems?
Value for the customer
Find things that are interesting
Narrow down the set of choices
Help me explore the space of options
Discover new things
Entertainment
Value for the provider
Additional and probably unique personalized service for the customer
Increase trust and customer loyalty
Increase sales, click-through rates, conversion etc.
Opportunities for promotion, persuasion
Obtain more knowledge about customers
-
Real-world check
Myths from industry
Amazon.com generates X percent of their sales through the recommendation lists (30
-
Problem domain
Recommendation systems (RS) help to match users with items
Ease information overload
Sales assistance (guidance, advisory, persuasion, ...)
RS are software agents that elicit the interests and preferences of individual consumers [...] and make recommendations accordingly. They have the potential to support and improve the quality of the decisions consumers make while searching for and selecting products online. [Xiao & Benbasat, MISQ, 2007]
Different system designs / paradigms
Based on availability of exploitable data
Implicit and explicit user feedback
Domain characteristics
-
Recommender systems
RS seen as a function [AT05]
Given:
User model (e.g. ratings, preferences, demographics, situational context)
Items (with or without description of item characteristics)
Find:
Relevance score. Used for ranking.
Finally:
Recommend items that are assumed to be relevant
But:
Remember that relevance might be context-dependent
Characteristics of the list itself might be important (diversity)
-
Paradigms of recommender systems
Recommender systems reduce information overload by estimating relevance
Personalized recommendations
Collaborative: "Tell me what's popular among my peers"
Content-based: "Show me more of the same what I've liked"
Knowledge-based: "Tell me what fits based on my needs"
Hybrid: combinations of various inputs and/or composition of different mechanisms
-
Recommender systems: basic techniques
Collaborative
Pros: No knowledge-engineering effort, serendipity of results, learns market segments
Cons: Requires some form of rating feedback, cold start for new users and new items
Content-based
Pros: No community required, comparison between items possible
Cons: Content descriptions necessary, cold start for new users, no surprises
Knowledge-based
Pros: Deterministic recommendations, assured quality, no cold start, can resemble sales dialogue
Cons: Knowledge engineering effort to bootstrap, basically static, does not react to short-term trends
-
Collaborative Filtering (CF)
The most prominent approach to generate recommendations
used by large, commercial e-commerce sites
well-understood, various algorithms and variations exist
applicable in many domains (books, movies, DVDs, ...)
Approach
use the "wisdom of the crowd" to recommend items
Basic assumption and idea
Users give ratings to catalog items (implicitly or explicitly)
Customers who had similar tastes in the past will have similar tastes in the future
-
1992: Using collaborative filtering to weave an information tapestry, D. Goldberg et al., Communications of the ACM
Basic idea: "Eager readers read all docs immediately, casual readers wait for the eager readers to annotate"
Experimental mail system at Xerox PARC that records reactions of users when reading a mail
Users are provided with personalized mailing list filters instead of being forced to subscribe
Content-based filters (topics, from/to/subject, ...)
Collaborative filters
E.g. mails to [all] which were replied to by [John Doe] and which received positive ratings from [X] and [Y].
-
1994: GroupLens: an open architecture for collaborative filtering of netnews, P. Resnick et al., ACM CSCW
Tapestry system does not aggregate ratings and requires users to know each other
Basic idea: "People who agreed in their subjective evaluations in the past are likely to agree again in the future"
Builds on newsgroup browsers with rating functionality
-
User-based nearest-neighbor collaborative filtering (1)
The basic technique:
Given an "active user" (Alice) and an item I not yet seen by Alice
The goal is to estimate Alice's rating for this item, e.g., by
finding a set of users (peers) who liked the same items as Alice in the past and who have rated item I
using, e.g., the average of their ratings to predict whether Alice will like item I
doing this for all items Alice has not seen and recommending the best-rated ones
        Item1  Item2  Item3  Item4  Item5
Alice   5      3      4      4      ?
User1   3      1      2      3      3
User2   4      3      4      3      5
User3   3      3      1      5      4
User4   1      5      5      2      1
-
User-based nearest-neighbor collaborative filtering (2)
Some first questions
How do we measure similarity?
How many neighbors should we consider?
How do we generate a prediction from the neighbors' ratings?
        Item1  Item2  Item3  Item4  Item5
Alice   5      3      4      4      ?
User1   3      1      2      3      3
User2   4      3      4      3      5
User3   3      3      1      5      4
User4   1      5      5      2      1
-
Measuring user similarity
A popular similarity measure in user-based CF: Pearson correlation
$sim(a,b) = \frac{\sum_{p \in P} (r_{a,p} - \bar{r}_a)(r_{b,p} - \bar{r}_b)}{\sqrt{\sum_{p \in P} (r_{a,p} - \bar{r}_a)^2} \sqrt{\sum_{p \in P} (r_{b,p} - \bar{r}_b)^2}}$
a, b: users
r_{a,p}: rating of user a for item p
P: set of items rated by both a and b
Possible similarity values between -1 and 1; \bar{r}_a, \bar{r}_b = user's average ratings
        Item1  Item2  Item3  Item4  Item5
Alice   5      3      4      4      ?
User1   3      1      2      3      3
User2   4      3      4      3      5
User3   3      3      1      5      4
User4   1      5      5      2      1
sim(Alice, User1) = 0.85, sim(Alice, User2) = 0.70, sim(Alice, User4) = -0.79
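The similarity values above can be checked with a few lines of Python. This is only a sketch: it assumes that each user's average is taken over the items rated by both users (the set P), and the function name `pearson` is chosen for illustration.

```python
from math import sqrt

# Rating matrix from the slide; None marks the rating to be predicted.
ratings = {
    "Alice": [5, 3, 4, 4, None],
    "User1": [3, 1, 2, 3, 3],
    "User2": [4, 3, 4, 3, 5],
    "User3": [3, 3, 1, 5, 4],
    "User4": [1, 5, 5, 2, 1],
}

def pearson(a, b):
    """Pearson correlation over the items rated by both a and b."""
    common = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(common) < 2:
        return 0.0
    # Assumption: averages are computed over the co-rated items only.
    mean_a = sum(x for x, _ in common) / len(common)
    mean_b = sum(y for _, y in common) / len(common)
    num = sum((x - mean_a) * (y - mean_b) for x, y in common)
    den = sqrt(sum((x - mean_a) ** 2 for x, _ in common)) * sqrt(
        sum((y - mean_b) ** 2 for _, y in common))
    return num / den if den else 0.0

sims = {user: pearson(ratings["Alice"], row)
        for user, row in ratings.items() if user != "Alice"}
for user, s in sims.items():
    print(f"sim(Alice, {user}) = {s:.2f}")
```

User1 comes out as the most similar neighbor and User4 as negatively correlated, which matches the slide up to rounding.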
-
Pearson correlation
Takes differences in rating behavior into account
Works well in usual domains, compared with alternative measures such as cosine similarity
[Figure: ratings (axis 0 to 6) of Alice, User1 and User4 for Item1 to Item4]
-
Making predictions
A common prediction function:
$pred(a,i) = \bar{r}_a + \frac{\sum_{b \in N} sim(a,b) \cdot (r_{b,i} - \bar{r}_b)}{\sum_{b \in N} sim(a,b)}$ (N: set of neighbors of a)
Calculate whether the neighbors' ratings for the unseen item i are higher or lower than their average
Combine the rating differences, using the similarity as a weight
Add/subtract the neighbors' bias from the active user's average and use this as a prediction
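A minimal sketch of this prediction step for Alice and Item5. It assumes that User1 and User2 are chosen as the two nearest neighbors, takes their similarities (0.85 and 0.70) from the previous slide, and computes each neighbor's average over all of that neighbor's ratings; these choices are assumptions for illustration.

```python
def predict(active_mean, neighbors):
    """neighbors: list of (similarity, neighbor_rating_for_item, neighbor_mean)."""
    weight_sum = sum(sim for sim, _, _ in neighbors)
    if weight_sum == 0:
        return active_mean  # no usable neighbors: fall back to the user's average
    # Similarity-weighted deviation of the neighbors from their own averages.
    bias = sum(sim * (r - mean) for sim, r, mean in neighbors) / weight_sum
    return active_mean + bias

alice_mean = (5 + 3 + 4 + 4) / 4        # Alice's average rating
user1_mean = (3 + 1 + 2 + 3 + 3) / 5    # assumption: average over all of User1's ratings
user2_mean = (4 + 3 + 4 + 3 + 5) / 5

# (similarity to Alice, rating for Item5, neighbor's average)
neighbors = [(0.85, 3, user1_mean), (0.70, 5, user2_mean)]
prediction = predict(alice_mean, neighbors)
print(f"predicted rating of Alice for Item5: {prediction:.2f}")
```

Both neighbors rated Item5 above their own average, so the prediction ends up above Alice's average of 4.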
-
Making recommendations
Making predictions is typically not the ultimate goal
Usual approach (in academia)
Rank items based on their predicted ratings
However
This might lead to the inclusion of (only) niche items
In practice also: Take item popularity into account
Approaches
"Learning to rank"
Optimize according to a given rank evaluation metric (see later)
-
Improving the metrics / prediction function
Not all neighbor ratings might be equally "valuable"
Agreement on commonly liked items is not so informative as agreement on controversial items
Possible solution: Give more weight to items that have a higher variance
Value of number of co-rated items
Use "significance weighting", by e.g., linearly reducing the weight when the number of co-rated items is low
Case amplification
Intuition: Give more weight to "very similar" neighbors, i.e., where the similarity value is close to 1.
Neighborhood selection
Use similarity threshold or fixed number of neighbors
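Two of these adjustments can be sketched as small functions that rescale a similarity value. The threshold of 50 co-rated items and the exponent 2.5 are assumptions (values commonly reported in the CF literature), not numbers from the slide.

```python
def significance_weight(sim, n_corated, threshold=50):
    """Linearly shrink the similarity when fewer than `threshold` items are co-rated."""
    return sim * min(n_corated, threshold) / threshold

def case_amplify(sim, rho=2.5):
    """Emphasize similarities close to 1 and damp small ones; the sign is kept."""
    return sim * abs(sim) ** (rho - 1)

# With only 4 co-rated items, a similarity of 0.85 is reduced strongly.
print(significance_weight(0.85, n_corated=4))
# Amplification widens the gap between a strong and a weak neighbor.
print(case_amplify(0.85), case_amplify(0.30))
```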
-
Memory-based and model-based approaches
User-based CF is said to be "memory-based"
the rating matrix is directly used to find neighbors / make predictions
does not scale for most real-world scenarios
large e-commerce sites have tens of millions of customers and millions of items
Model-based approaches
based on an offline pre-processing or "model-learning" phase
at run-time, only the learned model is used to make predictions
models are updated / re-trained periodically
large variety of techniques used
model-building and updating can be computationally expensive
-
2001: Item-based collaborative filtering recommendation algorithms, B. Sarwar et al., WWW 2001
Scalability issues arise with U2U if many more users than items (m >> n, m = |users|, n = |items|)
e.g. Amazon.com
Space complexity O(m^2) when pre-computed
Time complexity for computing Pearson O(m^2 n)
High sparsity leads to few common ratings between two users
Basic idea: "Item-based CF exploits relationships between items first, instead of relationships between users"
-
Item-based collaborative filtering
Basic idea:
Use the similarity between items (and not users) to make predictions
Example:
Look for items that are similar to Item5
Take Alice's ratings for these items to predict the rating for Item5
        Item1  Item2  Item3  Item4  Item5
Alice   5      3      4      4      ?
User1   3      1      2      3      3
User2   4      3      4      3      5
User3   3      3      1      5      4
User4   1      5      5      2      1
-
The cosine similarity measure
Produces better results in item-to-item filtering
for some datasets, no consistent picture in literature
Ratings are seen as vectors in n-dimensional space
Similarity is calculated based on the angle between the vectors
$sim(\vec{a}, \vec{b}) = \frac{\vec{a} \cdot \vec{b}}{|\vec{a}| \cdot |\vec{b}|}$
Adjusted cosine similarity
take average user ratings into account, transform the original ratings
U: set of users who have rated both items a and b
$sim(a,b) = \frac{\sum_{u \in U} (r_{u,a} - \bar{r}_u)(r_{u,b} - \bar{r}_u)}{\sqrt{\sum_{u \in U} (r_{u,a} - \bar{r}_u)^2} \sqrt{\sum_{u \in U} (r_{u,b} - \bar{r}_u)^2}}$
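As an illustration, the adjusted cosine similarity between Item5 and Item1 can be computed on the example rating matrix. The sketch assumes that each user's average is taken over all items that user has rated; the function name and the column-index convention are chosen for illustration.

```python
from math import sqrt

# Example rating matrix; None marks Alice's unknown rating for Item5.
ratings = {
    "Alice": [5, 3, 4, 4, None],
    "User1": [3, 1, 2, 3, 3],
    "User2": [4, 3, 4, 3, 5],
    "User3": [3, 3, 1, 5, 4],
    "User4": [1, 5, 5, 2, 1],
}

def adjusted_cosine(ratings, a, b):
    """Adjusted cosine similarity of the items in columns a and b."""
    num = den_a = den_b = 0.0
    for row in ratings.values():
        if row[a] is None or row[b] is None:
            continue  # only users who rated both items contribute
        known = [r for r in row if r is not None]
        mean = sum(known) / len(known)  # assumption: average over all rated items
        da, db = row[a] - mean, row[b] - mean
        num += da * db
        den_a += da * da
        den_b += db * db
    den = sqrt(den_a) * sqrt(den_b)
    return num / den if den else 0.0

sim_5_1 = adjusted_cosine(ratings, 4, 0)  # Item5 vs. Item1
print(f"sim(Item5, Item1) = {sim_5_1:.2f}")
```

Alice is skipped here because she has not rated Item5; her ratings for the most similar items would then be used to predict it.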
-
Pre-processing for item-based filtering
Item-based filtering does not solve the scalability problem itself
Pre-processing approach by Amazon.com (in 2003)
Calculate all pair-wise item similarities in advance
The neighborhood to be used at run-time is typically rather small, because only items are taken into account which the user has rated
Item similarities are supposed to be more stable than user similarities
Memory requirements
Up to N^2 pair-wise similarities to be memorized (N = number of items) in theory
In practice, this is significantly lower (items with no co-ratings)
Further reductions possible
Minimum threshold for co-ratings (items which are rated at least by n users)
Limit the size of the neighborhood (might affect recommendation accuracy)
-
More on ratings
Pure CF-based systems only rely on the rating matrix
Explicit ratings
Most commonly used (1 to 5, 1 to 7 Likert response scales)
Research topics
"Optimal" granularity of scale; indication that 10-point scale is better accepted in movie domain
Multidimensional ratings (multiple ratings per movie)
Challenge
Users not always willing to rate many items; sparse rating matrices
How to stimulate users to rate more items?
Implicit ratings
clicks, page views, time spent on some page, demo downloads, ...
Can be used in addition to explicit ones; question of correctness of interpretation
-
Data sparsity problems
Cold start problem
How to recommend new items? What to recommend to new users?
Straightforward approaches
Ask/force users to rate a set of items
Use another method (e.g., content-based, demographic or simply non-personalized) in the initial phase
Alternatives
Use better algorithms (beyond nearest-neighbor approaches)
Example:
In nearest-neighbor approaches, the set of sufficiently similar neighbors might be too small to make good predictions
Assume "transitivity" of neighborhoods
-
Example algorithms for sparse datasets
Recursive CF
Assume there is a very close neighbor n of u who however has not rated the target item i yet.
Idea:
Apply CF method recursively and predict a rating for item i for the neighbor
Use this predicted rating instead of the rating of a more distant direct neighbor
        Item1  Item2  Item3  Item4  Item5
Alice   5      3      4      4      ?
User1   3      1      2      3      ?
User2   4      3      4      3      5
User3   3      3      1      5      4
User4   1      5      5      2      1
sim(Alice, User1) = 0.85: predict rating for User1
-
Graph-based methods
"Spreading activation" (sketch)
Idea: Use paths of lengths > 3 to recommend items
Length 3: Recommend Item3 to User1
Length 5: Item1 also recommendable
-
More model-based approaches
Plethora of different techniques proposed in the last years, e.g.,
Matrix factorization techniques, statistics
singular value decomposition, principal component analysis
Association rule mining
compare: shopping basket analysis
Probabilistic models
clustering models, Bayesian networks, probabilistic Latent Semantic Analysis
Various other machine learning approaches
Costs of pre-processing
Usually not discussed
Incremental updates possible?
-
2000: Application of Dimensionality Reduction in Recommender System, B. Sarwar et al., WebKDD Workshop
Basic idea: Trade more complex offline model building for faster online prediction generation
Singular Value Decomposition for dimensionality reduction of rating matrices
Captures important factors/aspects and their weights in the data
factors can be genre, actors but also non-understandable ones
Assumption that k dimensions capture the signals and filter out noise (k = 20 to 100)
Constant time to make recommendations
Approach also popular in IR (Latent Semantic Indexing), data compression, ...
-
A picture says ...
[Figure: Bob, Mary, Alice and Sue plotted in a two-dimensional space; both axes range from -1 to 1]
-
Matrix factorization
SVD: $M_k = U_k \times \Sigma_k \times V_k^T$
U_k     Dim1  Dim2
Alice   0.47  0.30
Bob     0.44  0.23
Mary    0.70  0.06
Sue     0.31  0.93
V_k^T   (one column per item)
Dim1    0.44  0.57  0.06  0.38  0.57
Dim2    0.58  0.66  0.26  0.18  0.36
\Sigma_k  Dim1  Dim2
Dim1      5.63  0
Dim2      0     3.23
Prediction: $\hat{r}_{ui} = \bar{r}_u + U_k(Alice) \times \Sigma_k \times V_k^T(EPL) = 3 + 0.84 = 3.84$
-
Association rule mining
Commonly used for shopping behavior analysis
aims at detection of rules such as "If a customer purchases baby food then he also buys diapers in 70% of the cases"
Association rule mining algorithms
can detect rules of the form X => Y (e.g., baby food => diapers) from a set of sales transactions D = {t1, t2, ..., tn}
measure of quality: support, confidence
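The two quality measures can be illustrated on a handful of made-up transactions. The sketch uses the usual definitions: support is the fraction of transactions that contain all items of a set, and confidence of X => Y is support(X and Y together) divided by support(X). The transaction data is invented for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, x, y):
    """Confidence of the rule x => y."""
    support_x = support(transactions, x)
    if support_x == 0:
        return 0.0
    return support(transactions, set(x) | set(y)) / support_x

# Hypothetical sales transactions D.
transactions = [
    {"baby food", "diapers", "milk"},
    {"baby food", "diapers"},
    {"baby food", "bread"},
    {"diapers", "beer"},
    {"milk", "bread"},
]

rule_support = support(transactions, {"baby food", "diapers"})
rule_confidence = confidence(transactions, {"baby food"}, {"diapers"})
print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
```

Here two of the three baby-food baskets also contain diapers, so the rule holds in about two thirds of the cases.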
-
Probabilistic methods
Basic idea (simplistic version for illustration):
given the user/item rating matrix
determine the probability that user Alice will like an item i
base the recommendation on these probabilities
Calculation of rating probabilities based on Bayes' Theorem
How probable is rating value "1" for Item5 given Alice's previous ratings?
Corresponds to conditional probability P(Item5 = 1 | X), where
X = Alice's previous ratings = (Item1 = 1, Item2 = 3, Item3 = ...)
Can be estimated based on Bayes' Theorem
Usually more sophisticated methods used
Clustering
pLSA
-
2008: Factorization meets the neighborhood: a multifaceted collaborative filtering model, Y. Koren, ACM SIGKDD
Stimulated by work on Netflix competition
Prize of $1,000,000 for accuracy improvement of 10% RMSE compared to own Cinematch system
Very large dataset (~100M ratings, ~480K users, ~18K movies)
Last ratings/user withheld (set K)
Root mean squared error metric optimized to 0.8567
$RMSE = \sqrt{\frac{\sum_{(u,i) \in K} (\hat{r}_{ui} - r_{ui})^2}{|K|}}$
-
2008: Factorization meets the neighborhood: a multifaceted collaborative filtering model, Y. Koren, ACM SIGKDD
Merges neighborhood models with latent factor models
Latent factor models
good to capture weak signals in the overall data
Neighborhood models
good at detecting strong relationships between close items
Combination in one single prediction function
Local search method such as stochastic gradient descent to determine parameters
Add penalty for high values to avoid over-fitting
$\hat{r}_{ui} = \mu + b_u + b_i + p_u^T q_i$
$\min_{p_*, q_*, b_*} \sum_{(u,i) \in K} (r_{ui} - \mu - b_u - b_i - p_u^T q_i)^2 + \lambda (\|p_u\|^2 + \|q_i\|^2 + b_u^2 + b_i^2)$
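A small sketch of how such a model can be fitted with stochastic gradient descent. It covers only the biased latent factor part (global mean, user and item biases, factor vectors) with an L2 penalty, not the neighborhood component of the paper. Learning rate, penalty weight, number of factors and epochs are assumed values, and the data is the example matrix from the earlier slides.

```python
import random
from math import sqrt

def train_mf(ratings, n_factors=2, lr=0.01, reg=0.05, epochs=200, seed=0):
    """Fit r_ui ~ mu + b_u + b_i + p_u . q_i by SGD on (user, item, rating) triples."""
    rng = random.Random(seed)
    users = {u for u, _, _ in ratings}
    items = {i for _, i, _ in ratings}
    mu = sum(r for _, _, r in ratings) / len(ratings)
    b_u = {u: 0.0 for u in users}
    b_i = {i: 0.0 for i in items}
    p = {u: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for i in items}

    def predict(u, i):
        return mu + b_u[u] + b_i[i] + sum(x * y for x, y in zip(p[u], q[i]))

    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(u, i)
            # Gradient steps; the `reg` terms penalize large parameter values.
            b_u[u] += lr * (err - reg * b_u[u])
            b_i[i] += lr * (err - reg * b_i[i])
            for f in range(n_factors):
                pu, qi = p[u][f], q[i][f]
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return predict, mu

def rmse(ratings, predict):
    return sqrt(sum((predict(u, i) - r) ** 2 for u, i, r in ratings) / len(ratings))

matrix = {
    "Alice": [5, 3, 4, 4, None],
    "User1": [3, 1, 2, 3, 3],
    "User2": [4, 3, 4, 3, 5],
    "User3": [3, 3, 1, 5, 4],
    "User4": [1, 5, 5, 2, 1],
}
ratings = [(u, i, r) for u, row in matrix.items()
           for i, r in enumerate(row) if r is not None]

predict, mu = train_mf(ratings)
baseline_rmse = rmse(ratings, lambda u, i: mu)  # always predict the global mean
model_rmse = rmse(ratings, predict)
print(f"baseline RMSE {baseline_rmse:.3f}, model RMSE {model_rmse:.3f}")
print(f"prediction for Alice, Item5: {predict('Alice', 4):.2f}")
```

On such a tiny matrix the numbers mean little; the point is only the shape of the update rule and of the penalty term.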
-
Summarizing recent methods
Recommendation is concerned with learning from noisy observations (x, y), where $f(x) = \hat{y}$ has to be determined such that $\sum_{\hat{y}} (\hat{y} - y)^2$ is minimal.
A variety of different learning strategies have been applied trying to estimate f(x)
Non-parametric neighborhood models
MF models, SVMs, Neural Networks, Bayesian Networks, ...
-
Collaborative Filtering Issues
Pros: well-understood, works well in some domains, no knowledge engineering required
Cons: requires user community, sparsity problems, no integration of other knowledge sources, no explanation of results
What is the best CF method?
In which situation and which domain? Inconsistent findings; always the same domains and data sets; differences between methods are often very small (1/100)
How to evaluate the prediction quality?
MAE / RMSE: What does an MAE of 0.7 actually mean?
Serendipity: Not yet fully understood
What about multi-dimensional ratings?
-
One Recommender Systems research question
What should be in that list?
Recommender Systems in e-Commerce
-
Another question both in research and practice
How do we know that these are good recommendations?
Recommender Systems in e-Commerce
-
This might lead to ...
What is a good recommendation?
What is a good recommendation strategy?
What is a good recommendation strategy for my business?
Recommender Systems in e-Commerce
"We hope you will also buy ..."
"These have been in stock for quite a while now ..."
-
What is a good recommendation?
What are the measures in practice?
Total sales numbers
Promotion of certain items
Click-through rates
Interactivity on platform
Customer return rates
Customer satisfaction and loyalty
-
Purpose and success criteria (1)
Different perspectives/aspects
Depends on domain and purpose
No holistic evaluation scenario exists
Retrieval perspective
Reduce search costs
Provide "correct" proposals
Assumption: Users know in advance what they want
Recommendation perspective
Serendipity: identify items from the Long Tail
Users did not know about existence
-
When does a RS do its job well?
"Recommend widely unknown items that users might actually like!"
20% of items accumulate 74% of all positive ratings
Recommend items from the long tail
-
Purpose and success criteria (2)
Prediction perspective
Predict to what degree users like an item
Most popular evaluation scenario in research
Interaction perspective
Give users a "good feeling"
Educate users about the product domain
Convince/persuade users: explain
Finally, conversion perspective
Commercial situations
Increase "hit", "click-through", "lookers to bookers" rates
Optimize sales margins and profit
-
How do we as researchers know?
Test with real users
A/B tests
Example measures: sales increase, click-through rates
Laboratory studies
Controlled experiments
Example measures: satisfaction with the system (questionnaires)
Offline experiments
Based on historical data
Example measures: prediction accuracy, coverage
-
Empirical research
Characterizing dimensions:
Who is the subject that is in the focus of research?
What research methods are applied?
In which setting does the research take place?
Subject: Online customers, students, historical online sessions, computers, ...
Research method: Experiments, quasi-experiments, non-experimental research
Setting: Lab, real-world scenarios
-
Research methods
Experimental vs. non-experimental (observational) research methods
Experiment (test, trial):
"An experiment is a study in which at least one variable is manipulated and units are randomly assigned to different levels or categories of manipulated variable(s)."
Units: users, historic sessions, ...
Manipulated variable: type of RS, groups of recommended items, explanation strategies ...
Categories of manipulated variable(s): content-based RS, collaborative RS
-
Experiment designs
-
Evaluation in information retrieval (IR)
Recommendation is viewed as information retrieval task:
Retrieve (recommend) all items which are predicted to be "good" or "relevant".
Common protocol:
Hide some items with known ground truth
Rank items or predict ratings -> Count -> Cross-validate
Ground truth established by human domain experts
                         Reality
                         Actually Good         Actually Bad
Prediction  Rated Good   True Positive (tp)    False Positive (fp)
            Rated Bad    False Negative (fn)   True Negative (tn)
-
Metrics: Precision and Recall
Precision: a measure of exactness, determines the fraction of relevant items retrieved out of all items retrieved
$Precision = \frac{tp}{tp + fp}$
E.g. the proportion of recommended movies that are actually good
Recall: a measure of completeness, determines the fraction of relevant items retrieved out of all relevant items
$Recall = \frac{tp}{tp + fn}$
E.g. the proportion of all good movies recommended
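Both measures reduce to set operations on the recommended and the relevant items. A minimal sketch; the item IDs are only illustrative.

```python
def precision_recall(recommended, relevant):
    """Return (precision, recall) for one user's recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)  # true positives
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

precision, recall = precision_recall(
    recommended=["Item 345", "Item 237", "Item 187"],
    relevant=["Item 237", "Item 899"],
)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

One of three recommended items is relevant, and one of the two relevant items was found.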
-
Dilemma of IR measures in RS
IR measures are frequently applied, however:
Ground truth for most items actually unknown
What is a relevant item?
Different ways of measuring precision possible
Results from offline experimentation may have limited predictive power for online user behavior.
-
Metrics: Rank Score (position matters)
For a user:
Actually good: Item 237, Item 899
Recommended (predicted as good): Item 345, Item 237, Item 187
Hit: Item 237
Rank Score extends recall and precision to take the positions of correct items in a ranked list into account
Particularly important in recommender systems as lower ranked items may be overlooked by users
Learning-to-rank: Optimize models for such measures (e.g., AUC)
-
Accuracy measures
Datasets with items rated by users
MovieLens datasets 100K-10M ratings
Netflix 100M ratings
Historic user ratings constitute ground truth
Metrics measure error rate
Mean Absolute Error (MAE) computes the deviation between predicted ratings and actual ratings
$MAE = \frac{1}{|K|} \sum_{(u,i) \in K} |\hat{r}_{ui} - r_{ui}|$
Root Mean Square Error (RMSE) is similar to MAE, but places more emphasis on larger deviation
$RMSE = \sqrt{\frac{1}{|K|} \sum_{(u,i) \in K} (\hat{r}_{ui} - r_{ui})^2}$
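The difference between the two error measures shows up as soon as one prediction is far off. A short sketch with made-up predicted and actual ratings:

```python
from math import sqrt

def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root mean square error; squaring penalizes large deviations more."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Hypothetical withheld ratings and the corresponding predictions.
predicted = [4.0, 3.0, 5.0, 2.0]
actual = [4.0, 3.0, 4.0, 4.0]

print(f"MAE = {mae(predicted, actual):.3f}, RMSE = {rmse(predicted, actual):.3f}")
```

The single error of two rating points pulls the RMSE well above the MAE.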
-
Offline experimentation example
Netflix competition
Web-based movie rental
Prize of $1,000,000 for accuracy improvement (RMSE) of 10% compared to own Cinematch system.
Historical dataset
~480K users rated ~18K movies on a scale of 1 to 5 (~100M ratings)
Last 9 ratings/user withheld
Probe set: for teams for evaluation
Quiz set: evaluates teams' submissions for leaderboard
Test set: used by Netflix to determine winner
Today
Rating prediction only seen as an additional input into the recommendation process
-
An imperfect world
Offline evaluation is the cheapest variant
Still, gives us valuable insights
and lets us compare our results (in theory)
Dangers and trends:
Domination of accuracy measures
Focus on small set of domains (40% on movies in CS)
Alternative and complementary measures:
Diversity, Coverage, Novelty, Familiarity, Serendipity, Popularity, Concentration effects (Long tail)
-
Online experimentation example
Effectiveness of different algorithms for recommending cell phone games [Jannach, Hegelich 09]
Involved 150,000 users on a commercial mobile internet portal
Comparison of recommender methods
-
Details and results
Recommender variants included:
Item-based collaborative filtering
SlopeOne (also collaborative filtering)
Content-based recommendation
Hybrid recommendation
Top-rated items (non-personalized)
Top-sellers (non-personalized)
Findings:
Personalized methods increased sales up to 3.6% compared to non-personalized
Choice of recommendation algorithm depends on user situation (e.g. avoid content-based RS in post-sales situation)
-
Non-experimental research
Quasi-experiments
Lack random assignments of units to different treatments
Non-experimental / observational research
Surveys / Questionnaires
Longitudinal research
Observations over long period of time
E.g. customer life-time value, returning customers
Case studies
Focus group
Interviews
Think-aloud protocols
-
Quasi-experimental
SkiMatcher Resort Finder introduced by Ski-Europe.com to provide users with recommendations based on their preferences
Conversational RS
question and answer dialog
matching of user preferences with knowledge base
Delgado and Davidson evaluated the effectiveness of the recommender over a 4-month period in 2001
Classified as a quasi-experiment as users decide for themselves if they want to use the recommender or not
-
SkiMatcher Results
                          July     August   September  October
Unique Visitors           10,714   15,560   18,317     24,416
  SkiMatcher Users        1,027    1,673    1,878      2,558
  Non-SkiMatcher Users    9,687    13,887   16,439     21,858
Requests for Proposals    272      506      445        641
  SkiMatcher Users        75       143      161        229
  Non-SkiMatcher Users    197      363      284        412
Conversion                2.54%    3.25%    2.43%      2.63%
  SkiMatcher Users        7.30%    8.55%    8.57%      8.95%
  Non-SkiMatcher Users    2.03%    2.61%    1.73%      1.88%
Increase in Conversion    359%     327%     496%       475%
[Delgado and Davidson, ENTER 2002]
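The conversion figures follow directly from the visitor and request counts: conversion is requests divided by users per group, and the last row appears to be the SkiMatcher conversion expressed as a percentage of the non-SkiMatcher conversion. A short check under that reading (the variable names are chosen for illustration):

```python
months = ["July", "August", "September", "October"]
users = {
    "SkiMatcher": [1027, 1673, 1878, 2558],
    "Non-SkiMatcher": [9687, 13887, 16439, 21858],
}
requests = {
    "SkiMatcher": [75, 143, 161, 229],
    "Non-SkiMatcher": [197, 363, 284, 412],
}

def conversion(group):
    """Requests for proposals per user, month by month."""
    return [r / u for r, u in zip(requests[group], users[group])]

ski = conversion("SkiMatcher")
non = conversion("Non-SkiMatcher")
# Assumption: "increase" is the ratio of the two conversion rates in percent.
ratio_pct = [round(100 * s / n) for s, n in zip(ski, non)]

for month, s, n, ratio in zip(months, ski, non, ratio_pct):
    print(f"{month}: {s:.2%} vs. {n:.2%} -> {ratio}%")
```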
-
Interpreting the Results
The nature of this research design means that questions of causality cannot be answered (lack of random assignments), such as
Are users of the recommender systems more likely to convert?
Does the recommender system itself cause users to convert?
Some hidden exogenous variable might influence the choice of using RS as well as conversion.
However, significant correlation between using the recommender system and making a request for a proposal
Size of effect has been replicated in other domains
Tourism [Jannach et al., JITT 2009]
Electronic consumer products
-
Observational research
Increased demand in niches/long tail products
Ex-post from web shop data [Zanker et al., EC-Web, 2006]
-
What is popular?
User-centric evaluation / User studies
Increased interest in recent years
Various numbers of workshops
From: Jannach et al., Proceedings EC-Web 2012
-
What are the next topics?
Two additional major paradigms of recommender systems
Content-based
Knowledge-based
Hybridization: take the best of different paradigms
Advanced topics: recommender systems are about human decision making
-
Content-based recommendation
-
Collaborative filtering does NOT require any information about the items,
However, it might be reasonable to exploit such information
E.g. recommend fantasy novels to people who liked fantasy novels in the past
What do we need:
Some information about the available items such as the genre ("content")
Some sort of user profile describing what the user likes (the preferences)
The task:
Learn user preferences
Locate/recommend items that are "similar" to the user preferences
Paradigmsofrecommendersystems
-
7/25/2019 Recommender systems tutorial.pdf
80/144
- 80 -
Dietmar Jannach, Markus Zanker and Gerhard Friedrich
Content-based: "Show me more of the same what I've liked"
What is the "content"?
The genre is actually not part of the content of a book
Most CB recommendation methods originate from the Information Retrieval (IR) field:
The item descriptions are usually automatically extracted (important words)
Goal is to find and rank interesting text documents (news articles, web pages)
Here:
Classical IR-based methods based on keywords
No expert recommendation knowledge involved
User profile (preferences) is learned rather than explicitly elicited
Content representation and item similarities
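Simple approach
Compute the similarity of an unseen item with the user profile based on the keyword overlap (e.g. using the Dice coefficient)
$sim(b_i, b_j) = \frac{2 \cdot |keywords(b_i) \cap keywords(b_j)|}{|keywords(b_i)| + |keywords(b_j)|}$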
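The keyword-overlap similarity can be sketched in a few lines of Python. This is only an illustration: the function name `dice_similarity` and the sample keyword sets are made up for the example, and treating two empty keyword sets as dissimilar is an assumption.

```python
def dice_similarity(keywords_a, keywords_b):
    """Dice coefficient: 2 * |A & B| / (|A| + |B|) for two keyword collections."""
    a, b = set(keywords_a), set(keywords_b)
    if not a and not b:
        return 0.0  # assumption: two empty keyword sets count as dissimilar
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical keyword sets for a user profile and an unseen book
profile = {"fantasy", "dragons", "magic", "quest"}
book = {"fantasy", "magic", "school"}
print(dice_similarity(profile, book))  # 2 shared keywords, 4 + 3 keywords in total
```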
Term Frequency - Inverse Document Frequency (TF-IDF)
Simple keyword representation has its problems
In particular when automatically extracted because
Not every word has similar importance
Longer documents have a higher chance to have an overlap with the user profile
Standard measure: TF-IDF
Encodes text documents as weighted term vectors
TF: Measures how often a term appears (density in a document)
Assuming that important terms appear more often
Normalization has to be done in order to take document length into account
IDF: Aims to reduce the weight of terms that appear in all documents
TF-IDF
Compute the overall importance of keywords
Given a keyword i and a document j
$TFIDF(i,j) = TF(i,j) \cdot IDF(i)$
Term frequency (TF)
Let $freq(i,j)$ be the number of occurrences of keyword i in document j
Let $maxOthers(i,j)$ denote the highest number of occurrences of another keyword of j
$TF(i,j) = \frac{freq(i,j)}{maxOthers(i,j)}$
Inverse Document Frequency (IDF)
N: number of all recommendable documents
n(i): number of documents in which keyword i appears
$IDF(i) = \log \frac{N}{n(i)}$
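A minimal Python sketch of these definitions, assuming documents are given as token lists. The helper names (`tf`, `idf`, `tfidf`), the toy corpus, and the fallbacks for one-keyword documents and unseen terms are assumptions made for the example.

```python
import math
from collections import Counter


def tf(term, doc_tokens):
    """freq(i, j) divided by maxOthers(i, j), the top count among the other keywords."""
    counts = Counter(doc_tokens)
    freq = counts[term]
    others = [c for t, c in counts.items() if t != term]
    max_others = max(others) if others else freq  # assumption: fallback for one-keyword docs
    return freq / max_others if max_others else 0.0


def idf(term, docs):
    """log(N / n(i)); returns 0.0 for terms that appear in no document (assumption)."""
    n_i = sum(1 for d in docs if term in d)
    return math.log(len(docs) / n_i) if n_i else 0.0


def tfidf(term, doc_tokens, docs):
    return tf(term, doc_tokens) * idf(term, docs)


# Hypothetical toy corpus of three tokenized documents
docs = [
    ["dragon", "magic", "dragon", "quest"],
    ["magic", "school", "magic"],
    ["crime", "detective", "quest"],
]
print(tfidf("dragon", docs[0], docs))  # frequent here, rare elsewhere: high weight
print(tfidf("quest", docs[1], docs))   # term absent from the document: weight 0
```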
Example TF-IDF representation
Figure taken from http://informationretrieval.org
More on the vector space model
Vectors are usually long and sparse
Improvements
Remove stop words ("a", "the", ..)
Use stemming
Size cut-offs (only use top n most representative words, e.g. around 100)
Use additional knowledge, use more elaborate methods for feature selection
Detection of phrases as terms (such as United Nations)
Limitations
Semantic meaning remains unknown
Example: usage of a word in a negative context
"there is nothing on the menu that a vegetarian would like.."
Usual similarity metric to compare vectors: Cosine similarity (angle)
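Cosine similarity on sparse term vectors might look as follows in Python. The dict-based vector representation and the sample weights are assumptions chosen for brevity.

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse term-weight vectors (dicts)."""
    dot = sum(w * vec_b.get(t, 0.0) for t, w in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero vector is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical term weights: b points in the same direction as a, c shares no term
a = {"magic": 1.0, "dragon": 2.0}
b = {"magic": 2.0, "dragon": 4.0}
c = {"crime": 3.0}
print(cosine_similarity(a, b), cosine_similarity(a, c))
```

Because only the angle matters, document length (the scale of the vector) does not affect the score.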
Recommending items
Simple method: nearest neighbors
Given a set of documents D already rated by the user (like/dislike)
Find the n nearest neighbors of a not-yet-seen item i in D
Take these ratings to predict a rating/vote for i
(Variations: neighborhood size, lower/upper similarity thresholds)
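The nearest-neighbor method above can be sketched as a majority vote. The Dice-based similarity, the function names, the strict lower threshold `min_sim`, and the sample data are all assumptions for illustration; any vector similarity could be substituted.

```python
def dice(a, b):
    """Keyword-overlap similarity between two keyword sets."""
    return 2 * len(a & b) / (len(a) + len(b)) if (a or b) else 0.0


def predict_like(item_keywords, rated_docs, n=3, min_sim=0.0):
    """Majority vote among the n rated documents most similar to the unseen item.

    rated_docs is a list of (keyword_set, liked) pairs, liked being True or False.
    Returns None when no rated document exceeds the similarity threshold.
    """
    scored = [(dice(item_keywords, kw), liked) for kw, liked in rated_docs]
    neighbors = sorted(
        (s for s in scored if s[0] > min_sim), key=lambda s: s[0], reverse=True
    )[:n]
    if not neighbors:
        return None
    likes = sum(1 for _, liked in neighbors if liked)
    return likes * 2 > len(neighbors)


# Hypothetical rated documents D and an unseen item
rated = [
    ({"fantasy", "magic"}, True),
    ({"fantasy", "dragons", "quest"}, True),
    ({"crime", "detective"}, False),
    ({"crime", "magic"}, False),
]
item = {"fantasy", "magic", "school"}
print(predict_like(item, rated))
```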
Query-based retrieval: Rocchio's method
The SMART System: Users are allowed to rate (relevant/irrelevant) retrieved documents (feedback)
The system then learns a prototype of relevant/irrelevant documents
Queries are then automatically extended with additional terms/weights of relevant documents
Rocchio details
Document collections $D^+$ and $D^-$
$\alpha$, $\beta$, $\gamma$ used to fine-tune the feedback
Often only positive feedback is used
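The slide's formula did not survive extraction. The sketch below uses the standard textbook form of the Rocchio update, $Q_{i+1} = \alpha Q_i + \beta \frac{1}{|D^+|} \sum_{d \in D^+} d - \gamma \frac{1}{|D^-|} \sum_{d \in D^-} d$, which may differ in detail from the slide. The default parameter values and the clipping of negative weights are assumptions, not taken from the tutorial.

```python
def rocchio_update(query, relevant, irrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query vector toward relevant and away from irrelevant documents.

    All vectors are sparse dicts mapping term -> weight.
    """
    terms = set(query)
    for d in relevant + irrelevant:
        terms |= set(d)

    def centroid(docs, term):
        return sum(d.get(term, 0.0) for d in docs) / len(docs) if docs else 0.0

    updated = {}
    for t in terms:
        w = (alpha * query.get(t, 0.0)
             + beta * centroid(relevant, t)
             - gamma * centroid(irrelevant, t))
        updated[t] = max(w, 0.0)  # assumption: clip negative weights to zero
    return updated


# Hypothetical feedback round for the query "jaguar"
q = rocchio_update(
    {"jaguar": 1.0},
    relevant=[{"jaguar": 1.0, "cat": 1.0}],
    irrelevant=[{"jaguar": 1.0, "car": 1.0}],
)
print(q)
```

Passing an empty `irrelevant` list gives the positive-feedback-only variant mentioned above.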
Probabilistic methods
Recommendation as classical text classification problem
Long history of using probabilistic methods
Simple approach:
2 classes: like/dislike
Simple Boolean document representation
Calculate probability that document is liked/disliked based on Bayes' theorem
Remember: P(Label=1|X) = k * P(X|Label=1) * P(Label=1)
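A minimal naive Bayes sketch for the two-class Boolean case. The Laplace smoothing, the function names, and the training data are assumptions added for the example; the slide only states the Bayes rule itself.

```python
def train(docs, vocab):
    """docs: list of (set_of_words, label) pairs, label 1 = like, 0 = dislike."""
    model = {}
    for label in (0, 1):
        in_class = [words for words, y in docs if y == label]
        prior = len(in_class) / len(docs)
        # P(word present | label) with Laplace smoothing (an added assumption)
        p_word = {
            w: (sum(w in d for d in in_class) + 1) / (len(in_class) + 2)
            for w in vocab
        }
        model[label] = (prior, p_word)
    return model


def prob_like(model, words):
    """P(Label=1 | X), assuming word occurrences are conditionally independent."""
    scores = {}
    for label, (prior, p_word) in model.items():
        score = prior
        for w, p in p_word.items():
            score *= p if w in words else (1 - p)
        scores[label] = score
    total = scores[0] + scores[1]  # this is P(X), i.e. 1/k in the formula above
    return scores[1] / total if total else 0.5


# Hypothetical training documents
vocab = {"recommender", "intelligent", "learning", "school"}
docs = [
    ({"recommender", "intelligent"}, 1),
    ({"recommender", "learning"}, 1),
    ({"school"}, 0),
    ({"learning", "school"}, 0),
]
model = train(docs, vocab)
print(prob_like(model, {"recommender", "intelligent"}))
```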
Improvements
Side note: Conditional independence of events does in fact not hold
"New"/"York" and "Hong"/"Kong"
Still, good accuracy can be achieved
Boolean representation simplistic
Keyword counts lost
More elaborate probabilistic methods
E.g. estimate probability of term v occurring in a document of class C by relative frequency of v in all documents of the class
Other linear classification algorithms (machine learning) can be used
Support Vector Machines, ..
Limitations of content-based recommendation methods
Keywords alone may not be sufficient to judge quality/relevance of a document or web page
Up-to-dateness, usability, aesthetics, writing style
Content may also be limited / too short
Content may not be automatically extractable (multimedia)
Ramp-up phase required
Some training data is still required
Web 2.0: Use other sources to learn the user preferences
Overspecialization
Algorithms tend to propose "more of the same"
E.g. too similar news items
Why do we need knowledge-based recommendation?
Products with low number of available ratings
Time span plays an important role
Five-year-old ratings for computers
User lifestyle or family situation changes
Customers want to define their requirements explicitly
"The color of the car should be black"
Knowledge-based recommendation
Knowledge-based: "Tell me what fits based on my needs"
Knowledge-based recommendation I
Explicit domain knowledge
Sales knowledge elicitation from domain experts
System mimics the behavior of experienced sales assistant
Best-practice sales interactions
Can guarantee correct recommendations (determinism) with respect to expert knowledge
Conversational interaction strategy
Opposed to one-shot interaction
Elicitation of user requirements
Transfer of product knowledge (educating users)
Knowledge-based recommendation II
Different views on knowledge
Similarity functions
Determine matching degree between query and item (case-based RS)
Utility-based RS
E.g. MAUT: Multi-attribute utility theory
Logic-based knowledge descriptions (from domain expert)
E.g. hard and soft constraints
Constraint-based recommendation I
A knowledge-based RS formulated as constraint satisfaction problem
Def.
$X_I$, $X_U$: Variables describing items and user model with domain D
(e.g. lower focal length, purpose)
KB: Knowledge base comprising constraints and domain restrictions
(e.g. IF purpose = on travel THEN lower focal length
Multi-Attribute Utility Theory (MAUT)
Each item is evaluated according to a predefined set of dimensions that provide an aggregated view on the basic item properties
E.g. quality and economy are dimensions in the domain of digital cameras
id        value   quality  economy
price     ≤250    5        10
          >250    10       5
mpix      ≤8      4        10
          >8      10       6
opt-zoom  ≤9      6        9
          >9      10       6
...       ...     ...      ...
Customer-specific item utilities with MAUT
Customer interests:
customer  quality  economy
Cu1       80%      20%
Cu2       40%      60%
Item utilities:
     quality                   economy                      utility: cu1  utility: cu2
P1   (5,4,6,6,3,7,10) = 41     (10,10,9,10,10,10,6) = 65    45.8 [8]      55.4 [6]
P2   (5,4,6,6,10,10,8) = 49    (10,10,9,10,7,8,10) = 64     52.0 [7]      58.0 [1]
P3   (5,4,10,6,10,10,8) = 53   (10,10,6,10,7,8,10) = 61     54.6 [5]      57.8 [2]
...  ...                       ...                          ...
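The utility columns are weighted sums of the dimension scores, which a short Python sketch can reproduce. The dictionary layout and the function name `utility` are assumptions; the numbers are the ones from the tables.

```python
def utility(interests, scores):
    """Weighted sum of dimension scores; interests are weights that sum to 1."""
    return sum(interests[dim] * scores[dim] for dim in interests)


customers = {
    "cu1": {"quality": 0.8, "economy": 0.2},
    "cu2": {"quality": 0.4, "economy": 0.6},
}
items = {
    "P1": {"quality": 41, "economy": 65},
    "P2": {"quality": 49, "economy": 64},
    "P3": {"quality": 53, "economy": 61},
}

for name, scores in items.items():
    print(name, {c: round(utility(w, scores), 1) for c, w in customers.items()})
```

For example, P1 for cu1 is 0.8 * 41 + 0.2 * 65 = 45.8, matching the table.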
Constraint-based recommendation II
BUT: What if no solution exists?
$KB \cup I$ not satisfiable: debugging of knowledge base
$SRS \cup KB \cup I$ not satisfiable but $KB \cup I$ correct: debugging of user requirements
Application of model-based diagnosis for debugging user requirements
Diagnoses: $(SRS \setminus \Delta) \cup KB \cup I$ is satisfiable
Repairs: $(SRS \setminus \Delta) \cup \Delta_{repair} \cup KB \cup I$ is satisfiable
Conflict sets: $CS \subseteq SRS$: $CS \cup KB \cup I$ not satisfiable
Example: find minimal relaxations (minimal diagnoses)
User model (SRS)
R1   Motives           Landscape
R2   Brand preference  Canon
R3   Max. cost         350 EUR
Knowledge base:
     LHS                  RHS
C1   TRUE                 Brand = Brand pref.
C2   Motives = Landscape  Low. foc. Length =
Product catalogue:
Powershot XY
Brand               Canon
Lower focal length  35
Upper focal length  140
Price               420 EUR
Lumix
Brand               Panasonic
Lower focal length  28
Upper focal length  112
Price               319 EUR
Computation of minimal revisions of requirements
Do you want to relax your brand preference?
Accept Panasonic instead of Canon brand
Or is photographing landscapes with a wide-angle lens and maximum cost less important?
Lower focal length > 28mm and Price > 350 EUR
Optionally guided by some predefined weights or past community behavior
Be aware of possible revisions (e.g. age, family status, ...)
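A brute-force sketch of finding minimal diagnoses for the camera example. The right-hand side of the focal-length constraint is truncated in the transcript, so the threshold (lower focal length at most 28 for landscape motives) and the price check are assumptions inferred from the relaxations listed above. The exhaustive search is an illustration only, not the model-based diagnosis algorithm itself.

```python
from itertools import combinations

catalogue = [
    {"name": "Powershot XY", "brand": "Canon", "lower_focal": 35, "price": 420},
    {"name": "Lumix", "brand": "Panasonic", "lower_focal": 28, "price": 319},
]

# Each requirement is a predicate over a product. The focal-length threshold and
# the price check are assumptions (see the lead-in); they are not in the transcript.
requirements = {
    "R1": lambda p: p["lower_focal"] <= 28,  # Motives = Landscape
    "R2": lambda p: p["brand"] == "Canon",   # Brand preference = Canon
    "R3": lambda p: p["price"] <= 350,       # Max. cost = 350 EUR
}


def satisfiable(active):
    """True if at least one product fulfils every requirement in `active`."""
    return any(all(requirements[r](p) for r in active) for p in catalogue)


def minimal_diagnoses():
    """Smallest requirement sets whose removal makes the rest satisfiable."""
    found = []
    names = sorted(requirements)
    for size in range(len(names) + 1):
        for delta in combinations(names, size):
            if any(set(d) <= set(delta) for d in found):
                continue  # a subset is already a diagnosis, so delta is not minimal
            if satisfiable(set(names) - set(delta)):
                found.append(delta)
    return found


print(minimal_diagnoses())
```

Under these assumptions the two results correspond to the two questions above: relax the brand preference, or relax both the landscape motive and the maximum cost.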
Constraint-based recommendation III
More variants of recommendation task
Customers may not know what they are seeking
Find "diverse" sets of items
Notion of similarity/dissimilarity
Idea that users navigate a product space
If recommendations are more diverse, then users can navigate via critiques on recommended "entry points" more efficiently (fewer steps of interaction)
Bundling of recommendations
Find item bundles that match together according to some knowledge
E.g. travel packages, skin care treatments or financial portfolios
RS for different item categories, CSP restricts configuring of bundles
Conversational strategies
Process consisting of multiple conversational moves
Resembles natural sales interactions
Not all user requirements known beforehand
Customers are rarely satisfied with the initial recommendations
Different styles of preference elicitation:
Free text query interface
Asking technical/generic properties
Images/inspiration
Proposing and critiquing
Example: adaptive strategy selection
State model, different actions possible
Propose item, ask user, relax/tighten result set, ...
[Ricci et al., JITT, 2009]
Limitations of knowledge-based recommendation methods
Cost of knowledge acquisition
From domain experts
From users
Remedy: exploit web resources
Accuracy of preference models
Very fine-granular preference models require many interaction cycles with the user or sufficiently detailed data about the user
Remedy: use collaborative filtering to estimate the preferences of a user
However: preference models may be unstable
E.g. asymmetric dominance effects and decoy items
Hybrid recommender systems
All three base techniques are naturally incorporated by a good sales assistant (at different stages of the sales act) but have their shortcomings
Idea of crossing two (or more) species/implementations
hybrida [lat.]: denotes an object made by combining two different elements
Avoid some of the shortcomings
Reach desirable properties not present in individual approaches
Different hybridization designs
Monolithic: exploiting different features
Parallel use of several systems
Pipelined invocation of different systems
Monolithic hybridization design
Only a single recommendation component
Hybridization is "virtual" in the sense that
Features/knowledge sources of different paradigms are combined
Monolithic hybridization designs: Feature combination
"Hybrid" user features:
Social features: Movies liked by user
Content features: Comedies liked by user, dramas liked by user
Hybrid features: users who like many movies that are comedies, ...
the common knowledge engineering effort that involves inventing good features to enable successful learning [BHC98]
Monolithic hybridization designs: Feature augmentation
Content-boosted collaborative filtering [MMN02]
Based on content features additional ratings are created
E.g. Alice likes Items 1 and 3 (unary ratings)
Item7 is similar to 1 and 3 by a degree of 0.75
Thus Alice likes Item7 by 0.75
Item matrices become less sparse
Recommendation of research papers [TMA+04]
Citations interpreted as collaborative recommendations
Integrated in content-based recommendation method
Parallelized hybridization design
Output of several existing implementations combined
Least invasive design
Weighting or voting scheme applied
Weights can be learned dynamically
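A weighting scheme can be sketched as a linear combination of the individual recommenders' scores. The function name, the example scores, and the fixed equal weights are assumptions for illustration; in practice the weights would be tuned or learned.

```python
def weighted_hybrid(score_lists, weights):
    """Combine per-item scores of several recommenders by a weighted sum."""
    items = set().union(*score_lists)
    return {
        item: sum(w * scores.get(item, 0.0) for scores, w in zip(score_lists, weights))
        for item in items
    }


# Hypothetical outputs of two independent recommenders (missing items score 0)
rec1 = {"Item1": 0.5, "Item3": 0.3, "Item4": 0.1}
rec2 = {"Item1": 0.8, "Item2": 0.9, "Item3": 0.4}

combined = weighted_hybrid([rec1, rec2], weights=[0.5, 0.5])
ranking = sorted(combined, key=combined.get, reverse=True)
print(ranking)
```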
Parallelized hybridization design: Switching
Special case of dynamic weights (all weights except one are 0)
Requires an oracle that decides which recommender is used
Example:
Switching is based on some quality criteria:
E.g. if too few ratings in the system, use knowledge-based, else collaborative
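The switching rule from the example can be sketched as a simple oracle. The threshold `min_ratings` and the two stub recommenders are hypothetical placeholders, not part of the tutorial.

```python
def switching_hybrid(ratings, knowledge_based, collaborative, min_ratings=20):
    """Oracle: use exactly one recommender, chosen by the number of ratings."""
    if len(ratings) < min_ratings:
        return knowledge_based(ratings)
    return collaborative(ratings)


# Hypothetical stand-ins for the two recommenders
def kb_recommender(ratings):
    return ["kb-item"]


def cf_recommender(ratings):
    return ["cf-item"]


print(switching_hybrid([5, 4, 3], kb_recommender, cf_recommender))  # too few ratings
print(switching_hybrid([5] * 50, kb_recommender, cf_recommender))   # enough ratings
```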
Pipelined hybridization designs
One recommender system pre-processes some input for the subsequent one
Cascade
Meta-level
Refinement of recommendation lists (cascade)
Learning of model (e.g. collaborative knowledge-based meta-level)
Pipelined hybridization designs: Cascade
Recommendation list is continually reduced
First recommender excludes items
Remove absolute no-go items (e.g. knowledge-based)
Second recommender assigns score
Ordering and refinement (e.g. collaborative)
Recommender 1
Item   Score  Rank
Item1  0.5    1
Item2  0
Item3  0.3    2
Item4  0.1    3
Item5  0
Recommender 2
Item   Score  Rank
Item1  0.8    2
Item2  0.9    1
Item3  0.4    3
Item4  0
Item5  0
Recommender cascaded (rec1, rec2)
Item   Score  Rank
Item1  0.80   1
Item2  0.00
Item3  0.40   2
Item4  0.00
Item5  0.00
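The cascaded result can be reproduced with a short sketch: the first recommender acts as a filter (score 0 means excluded) and the second one scores the survivors. The function name and dictionary layout are assumptions; the scores are the ones from the slide.

```python
def cascade(rec1_scores, rec2_scores):
    """Items excluded by rec1 (score 0) stay at 0; the rest take rec2's score."""
    return {
        item: (rec2_scores.get(item, 0.0) if s1 > 0 else 0.0)
        for item, s1 in rec1_scores.items()
    }


rec1 = {"Item1": 0.5, "Item2": 0.0, "Item3": 0.3, "Item4": 0.1, "Item5": 0.0}
rec2 = {"Item1": 0.8, "Item2": 0.9, "Item3": 0.4, "Item4": 0.0, "Item5": 0.0}

result = cascade(rec1, rec2)
ranking = [i for i in sorted(result, key=result.get, reverse=True) if result[i] > 0]
print(result)
print(ranking)
```

Item2 has the highest score from the second recommender but never reaches it, because the first recommender has already excluded it.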
Pipelined hybridization designs: Meta-level
Successor exploits a model $\Delta$ built by predecessor
$rec_{meta-level}(u,i) = rec_n(u, i, \Delta_{rec_{n-1}})$
$\Delta_{rec_{n-1}}$ is the model built by $RS_{n-1}$ and exploited by $RS_n$
Examples:
Fab system: content-based, collaborative recommendation [BS97]
Online news domain
Content-based recommender builds user models based on weighted term vectors
Collaborative filtering identifies similar peers based on weighted term vectors but makes recommendations based on ratings
Collaborative, constraint-based meta-level RS
Collaborative filtering identifies similar peers
A constraint base is learned by exploiting the behavior of similar peers
Learned constraints are employed to compute recommendations
What is the best hybridization strategy?
Only few works that compare strategies from the meta-perspective
For instance, [Burke02]
Most datasets do not allow comparing different recommendation paradigms
I.e. ratings, requirements, item features, domain knowledge, critiques rarely available in a single dataset
Some conclusions are supported by empirical findings
Monolithic: pre-processing effort traded in for more knowledge included
Parallel: requires careful design of scores from different predictors
Pipelined: works well for two antithetic approaches
Netflix competition: stacking recommender systems
Weighted design based on >100 predictors (recommendation functions)
Adaptive switching of weights based on user model, parameters
Advanced topics I
Explanations in recommender systems
Motivation
"The digital camera Profishot is a must-buy for you because ...."
Why should recommender systems deal with explanations at all?
The answer is related to the two parties providing and receiving recommendations:
A selling agent may be interested in promoting particular products
A buying agent is concerned about making the right buying decision
Explanations in recommender systems
Additional information to explain the system's output following some objectives
Objectives of explanations
Transparency
Validity
Trustworthiness
Persuasiveness
Effectiveness
Efficiency
Satisfaction
Relevance
Comprehensibility
Education
Explanations in general
How? and Why? explanations in expert systems
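Form of abductive reasoning
Given: $KB \models_{RS} i$ (item i is recommended by method RS)
Find $KB' \subseteq KB$ s.t. $KB' \models_{RS} i$
Principle of succinctness
Find smallest subset $KB' \subseteq KB$ s.t. $KB' \models_{RS} i$, i.e. for all $KB'' \subset KB'$ holds $KB'' \not\models_{RS} i$
But additional filtering
Some parts relevant for deduction might be obvious for humans
[Friedrich & Zanker, AI Magazine, 2011]
Taxonomy for generating explanations in RS
Major design dimensions of current explanation components: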
Category of reasoning model for generating explanations
White box
Black box
RS paradigm for generating explanations
Determines the exploitable semantic relations
Information categories
RS paradigms and their ontologies
Classes of objects
Users
Items
Properties
N-ary relations between them
Collaborative filtering
Neighborhood-based CF (a)
Matrix factorization (b)
Introduces additional factors as proxies for determining similarities
RS paradigms and their ontologies
Content-based
Properties characterizing items
TF*IDF model
Knowledge-based
Properties of items
Properties of user model
Additional mediating domain concepts
Similarity between items
Similarity between users
Tags
Tag relevance (for item)
Tag preference (of user)
Thermencheck.com (hot spring resorts)
The water has favorable properties for X, but it is unknown if it also cures Y.
It offers organic food, but no kosher food.
It offers services for families with small children, such as X, Y and Z.
It is a spa resort of medium size offering around 1000 beds.
Results from testing the explanation feature
Knowledgeable explanations significantly increase the users' perceived utility
Perceived utility strongly correlates with usage intention etc.
[Figure: path model connecting Explanation, Trust, Perceived Utility, Usage exp., Recommend others and Intention to repeated usage; ** marks significant effects]