Recommender systems tutorial.pdf

  • 7/25/2019 Recommender systems tutorial.pdf

    1/144

    - 1 -

    Tutorial: Recommender Systems

    International Joint Conference on Artificial Intelligence
    Beijing, August 4, 2013

    Dietmar Jannach
    TU Dortmund

    Gerhard Friedrich
    Alpen-Adria-Universität Klagenfurt

    - 2 - Dietmar Jannach, Markus Zanker and Gerhard Friedrich

    Recommender Systems

    Application areas

    In the Social Web

    Even more
      Personalized search
      "Computational advertising"

    About the speakers

    Gerhard Friedrich
      Professor at University Klagenfurt, Austria
    Dietmar Jannach
      Professor at TU Dortmund, Germany
    Research background and interests
      Application of Intelligent Systems technology in business
      Recommender systems implementation & evaluation
      Product configuration systems
      Web mining
      Operations research

    Agenda

    What are recommender systems for?
      Introduction
    How do they work (Part I)?
      Collaborative Filtering
    How to measure their success?
      Evaluation techniques
    How do they work (Part II)?
      Content-based Filtering
      Knowledge-Based Recommendations
      Hybridization Strategies
    Advanced topics
      Explanations
      Human decision making

    Why use Recommender Systems?

    Value for the customer
      Find things that are interesting
      Narrow down the set of choices
      Help me explore the space of options
      Discover new things
      Entertainment
    Value for the provider
      Additional and probably unique personalized service for the customer
      Increase trust and customer loyalty
      Increase sales, click-through rates, conversion etc.
      Opportunities for promotion, persuasion
      Obtain more knowledge about customers

    Real-world check

    Myths from industry
      Amazon.com generates X percent of their sales through the recommendation lists (30

    Problem domain

    Recommendation systems (RS) help to match users with items
      Ease information overload
      Sales assistance (guidance, advisory, persuasion, ...)
    "RS are software agents that elicit the interests and preferences of individual consumers [...] and make recommendations accordingly. They have the potential to support and improve the quality of the decisions consumers make while searching for and selecting products online."
      [Xiao & Benbasat, MISQ, 2007]
    Different system designs / paradigms
      Based on availability of exploitable data
      Implicit and explicit user feedback
      Domain characteristics

    Recommender systems

    RS seen as a function [AT05]
    Given:
      User model (e.g. ratings, preferences, demographics, situational context)
      Items (with or without description of item characteristics)
    Find:
      Relevance score. Used for ranking.
    Finally:
      Recommend items that are assumed to be relevant
    But:
      Remember that relevance might be context-dependent
      Characteristics of the list itself might be important (diversity)

    Paradigms of recommender systems

    Recommender systems reduce information overload by estimating relevance
    Personalized recommendations
    Collaborative: "Tell me what's popular among my peers"
    Content-based: "Show me more of what I've liked"
    Knowledge-based: "Tell me what fits based on my needs"
    Hybrid: combinations of various inputs and/or composition of different mechanisms

    Recommender systems: basic techniques

    Collaborative
      Pros: No knowledge-engineering effort, serendipity of results, learns market segments
      Cons: Requires some form of rating feedback, cold start for new users and new items
    Content-based
      Pros: No community required, comparison between items possible
      Cons: Content descriptions necessary, cold start for new users, no surprises
    Knowledge-based
      Pros: Deterministic recommendations, assured quality, no cold start, can resemble sales dialogue
      Cons: Knowledge engineering effort to bootstrap, basically static, does not react to short-term trends

    Collaborative Filtering (CF)

    The most prominent approach to generate recommendations
      used by large, commercial e-commerce sites
      well understood, various algorithms and variations exist
      applicable in many domains (books, movies, DVDs, ...)
    Approach
      use the "wisdom of the crowd" to recommend items
    Basic assumption and idea
      Users give ratings to catalog items (implicitly or explicitly)
      Customers who had similar tastes in the past will have similar tastes in the future

    1992: Using collaborative filtering to weave an information tapestry, D. Goldberg et al., Communications of the ACM

    Basic idea: "Eager readers read all docs immediately, casual readers wait for the eager readers to annotate"
    Experimental mail system at Xerox PARC that records reactions of users when reading a mail
    Users are provided with personalized mailing list filters instead of being forced to subscribe
      Content-based filters (topics, from/to/subject)
      Collaborative filters
        E.g. mails to [all] which were replied to by [John Doe] and which received positive ratings from [X] and [Y].

    1994: GroupLens: an open architecture for collaborative filtering of netnews, P. Resnick et al., ACM CSCW

    Tapestry system does not aggregate ratings and requires users to know each other
    Basic idea: "People who agreed in their subjective evaluations in the past are likely to agree again in the future"
    Builds on newsgroup browsers with rating functionality

    User-based nearest-neighbor collaborative filtering (1)

    The basic technique:
      Given an "active user" (Alice) and an item I not yet seen by Alice
      The goal is to estimate Alice's rating for this item, e.g., by
        finding a set of users (peers) who liked the same items as Alice in the past and who have rated item I
        using, e.g., the average of their ratings to predict if Alice will like item I
        doing this for all items Alice has not seen and recommending the best-rated

             Item1  Item2  Item3  Item4  Item5
    Alice      5      3      4      4      ?
    User1      3      1      2      3      3
    User2      4      3      4      3      5
    User3      3      3      1      5      4
    User4      1      5      5      2      1

    User-based nearest-neighbor collaborative filtering (2)

    Some first questions
      How do we measure similarity?
      How many neighbors should we consider?
      How do we generate a prediction from the neighbors' ratings?

    Measuring user similarity

    A popular similarity measure in user-based CF: Pearson correlation

      $$sim(a,b) = \frac{\sum_{p \in P} (r_{a,p} - \bar{r}_a)(r_{b,p} - \bar{r}_b)}{\sqrt{\sum_{p \in P} (r_{a,p} - \bar{r}_a)^2}\,\sqrt{\sum_{p \in P} (r_{b,p} - \bar{r}_b)^2}}$$

      $a, b$: users
      $r_{a,p}$: rating of user $a$ for item $p$
      $P$: set of items rated by both $a$ and $b$
      $\bar{r}_a, \bar{r}_b$: the users' average ratings
      Possible similarity values are between -1 and 1

             Item1  Item2  Item3  Item4  Item5
    Alice      5      3      4      4      ?
    User1      3      1      2      3      3
    User2      4      3      4      3      5
    User3      3      3      1      5      4
    User4      1      5      5      2      1

    sim(Alice, User1) = 0.85, sim(Alice, User2) = 0.70, sim(Alice, User4) = -0.79
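    The similarity values for the rating table can be checked with a few lines of Python. This is a minimal sketch, not the authors' code: it assumes that each user's average is taken over the items co-rated with Alice (Item1 to Item4), which is the choice that reproduces the 0.85 and -0.79 on the slide. The helper name `pearson` is introduced here for illustration only.

```python
from math import sqrt

# Rating table from the slide; None marks Alice's unknown rating for Item5.
ratings = {
    "Alice": [5, 3, 4, 4, None],
    "User1": [3, 1, 2, 3, 3],
    "User2": [4, 3, 4, 3, 5],
    "User3": [3, 3, 1, 5, 4],
    "User4": [1, 5, 5, 2, 1],
}

def pearson(a, b):
    """Pearson correlation of two rating lists over their co-rated items.

    Assumption: the user averages are computed over the co-rated items only.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(pairs) < 2:
        return 0.0
    mean_a = sum(x for x, _ in pairs) / len(pairs)
    mean_b = sum(y for _, y in pairs) / len(pairs)
    num = sum((x - mean_a) * (y - mean_b) for x, y in pairs)
    den = (sqrt(sum((x - mean_a) ** 2 for x, _ in pairs))
           * sqrt(sum((y - mean_b) ** 2 for _, y in pairs)))
    return num / den if den else 0.0

# Similarity of every other user to Alice.
sims = {user: pearson(ratings["Alice"], r)
        for user, r in ratings.items() if user != "Alice"}
```

    Under this assumption User1 and User4 round to 0.85 and -0.79; User2 comes out at about 0.707, which the slide shows as 0.70.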

    Pearson correlation

    Takes differences in rating behavior into account
    Works well in usual domains, compared with alternative measures such as cosine similarity

    [Figure: ratings (axis 0 to 6) of Alice, User1 and User4 for Item1 to Item4]

    Making predictions

    A common prediction function:
      Calculate whether the neighbors' ratings for the unseen item i are higher or lower than their average
      Combine the rating differences, using the similarity as a weight
      Add/subtract the neighbors' bias from the active user's average and use this as a prediction
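    The three steps above are commonly written as $pred(a,i) = \bar{r}_a + \frac{\sum_{b \in N} sim(a,b)\,(r_{b,i} - \bar{r}_b)}{\sum_{b \in N} |sim(a,b)|}$, where $N$ is the set of neighbors. The following sketch applies it to Alice and Item5, assuming the two most similar users (User1 and User2) as neighbors and neighbor averages taken over all of their ratings. The function name and the choice of neighbors are illustrative assumptions.

```python
def predict(active_mean, neighbors):
    """Similarity-weighted prediction for one item.

    neighbors: list of (similarity, neighbor_rating_for_item, neighbor_mean).
    Falls back to the active user's mean if no neighbor carries weight.
    """
    num = sum(sim * (rating - mean) for sim, rating, mean in neighbors)
    den = sum(abs(sim) for sim, _, _ in neighbors)
    if den == 0:
        return active_mean
    return active_mean + num / den

# Alice's average over her four known ratings.
alice_mean = (5 + 3 + 4 + 4) / 4

# (similarity to Alice, rating for Item5, the neighbor's own average rating)
neighbors = [
    (0.85, 3, (3 + 1 + 2 + 3 + 3) / 5),  # User1
    (0.70, 5, (4 + 3 + 4 + 3 + 5) / 5),  # User2
]

pred_item5 = predict(alice_mean, neighbors)
```

    With these inputs the prediction is 4 + (0.85 * 0.6 + 0.70 * 1.2) / 1.55, roughly 4.87.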

    Making recommendations

    Making predictions is typically not the ultimate goal
    Usual approach (in academia)
      Rank items based on their predicted ratings
    However
      This might lead to the inclusion of (only) niche items
      In practice also: Take item popularity into account
    Approaches
      "Learning to rank"
        Optimize according to a given rank evaluation metric (see later)

    Improving the metrics / prediction function

    Not all neighbor ratings might be equally "valuable"
      Agreement on commonly liked items is not so informative as agreement on controversial items
      Possible solution: Give more weight to items that have a higher variance
    Value of number of co-rated items
      Use "significance weighting", by e.g., linearly reducing the weight when the number of co-rated items is low
    Case amplification
      Intuition: Give more weight to "very similar" neighbors, i.e., where the similarity value is close to 1.
    Neighborhood selection
      Use similarity threshold or fixed number of neighbors
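    Two of these adjustments are easy to sketch in code. The snippet below is illustrative only: the cut-off of 50 co-rated items and the exponent 2.5 are assumed example values, not parameters given on the slide, and both function names are hypothetical.

```python
def significance_weight(sim, n_corated, threshold=50):
    """Linearly shrink a similarity when fewer than `threshold` items are co-rated.

    Assumption: threshold=50 is only an example cut-off.
    """
    return sim * min(n_corated, threshold) / threshold

def case_amplify(sim, rho=2.5):
    """Emphasize similarities close to 1 by computing sim * |sim|^(rho - 1).

    Assumption: rho=2.5 is only an example exponent; the sign is preserved.
    """
    return sim * abs(sim) ** (rho - 1)

# A similarity of 0.8 based on only 25 co-rated items is halved.
weak_evidence = significance_weight(0.8, 25)

# Amplification leaves 1.0 untouched and pushes 0.5 towards 0.
amplified = case_amplify(0.5)
```

    Both functions can be applied to the similarity before it is used as a weight in the prediction function.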

    Memory-based and model-based approaches

    User-based CF is said to be "memory-based"
      the rating matrix is directly used to find neighbors / make predictions
      does not scale for most real-world scenarios
      large e-commerce sites have tens of millions of customers and millions of items
    Model-based approaches
      based on an offline pre-processing or "model-learning" phase
      at run-time, only the learned model is used to make predictions
      models are updated / re-trained periodically
      large variety of techniques used
      model-building and updating can be computationally expensive

    2001: Item-based collaborative filtering recommendation algorithms, B. Sarwar et al., WWW 2001

    Scalability issues arise with U2U if many more users than items (m >> n, m = |users|, n = |items|)
      e.g. Amazon.com
      Space complexity O(m^2) when pre-computed
      Time complexity for computing Pearson O(m^2 n)
    High sparsity leads to few common ratings between two users
    Basic idea: "Item-based CF exploits relationships between items first, instead of relationships between users"

    Item-based collaborative filtering

    Basic idea:
      Use the similarity between items (and not users) to make predictions
    Example:
      Look for items that are similar to Item5
      Take Alice's ratings for these items to predict the rating for Item5

             Item1  Item2  Item3  Item4  Item5
    Alice      5      3      4      4      ?
    User1      3      1      2      3      3
    User2      4      3      4      3      5
    User3      3      3      1      5      4
    User4      1      5      5      2      1

    The cosine similarity measure

    Produces better results in item-to-item filtering
      for some datasets, no consistent picture in literature
    Ratings are seen as vector in n-dimensional space
    Similarity is calculated based on the angle between the vectors
    Adjusted cosine similarity
      take average user ratings into account, transform the original ratings
      U: set of users who have rated both items a and b
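    Both measures can be tried on the running example. The sketch below compares Item5 with Item1 using the four users who rated both (Alice is left out because her Item5 rating is unknown). It assumes that the adjusted variant subtracts each user's average over all of that user's ratings; the function names are illustrative.

```python
from math import sqrt

# Users who have rated both Item1 and Item5 (the set U).
ratings = {
    "User1": [3, 1, 2, 3, 3],
    "User2": [4, 3, 4, 3, 5],
    "User3": [3, 3, 1, 5, 4],
    "User4": [1, 5, 5, 2, 1],
}

def item_column(j):
    """Ratings of all users in U for the item at index j."""
    return [row[j] for row in ratings.values()]

def cosine(a, b):
    """Cosine of the angle between two rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def adjusted_cosine(i, j):
    """Cosine on ratings from which each user's own average was subtracted."""
    num = den_i = den_j = 0.0
    for row in ratings.values():
        mean = sum(row) / len(row)  # assumption: average over all of the user's ratings
        di, dj = row[i] - mean, row[j] - mean
        num += di * dj
        den_i += di * di
        den_j += dj * dj
    return num / (sqrt(den_i) * sqrt(den_j))

cos_item5_item1 = cosine(item_column(4), item_column(0))
adj_item5_item1 = adjusted_cosine(4, 0)
```

    The plain cosine is very close to 1 here (about 0.99) because all ratings are positive; the adjusted variant gives about 0.80 and separates items more clearly.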

    Pre-processing for item-based filtering

    Item-based filtering does not solve the scalability problem itself
    Pre-processing approach by Amazon.com (in 2003)
      Calculate all pair-wise item similarities in advance
      The neighborhood to be used at run-time is typically rather small, because only items are taken into account which the user has rated
      Item similarities are supposed to be more stable than user similarities
    Memory requirements
      Up to N^2 pair-wise similarities to be memorized (N = number of items) in theory
      In practice, this is significantly lower (items with no co-ratings)
      Further reductions possible
        Minimum threshold for co-ratings (items which are rated by at least n users)
        Limit the size of the neighborhood (might affect recommendation accuracy)

    More on ratings

    Pure CF-based systems only rely on the rating matrix
    Explicit ratings
      Most commonly used (1 to 5, 1 to 7 Likert response scales)
      Research topics
        "Optimal" granularity of scale; indication that 10-point scale is better accepted in movie domain
        Multidimensional ratings (multiple ratings per movie)
      Challenge
        Users not always willing to rate many items; sparse rating matrices
        How to stimulate users to rate more items?
    Implicit ratings
      clicks, page views, time spent on some page, demo downloads
      Can be used in addition to explicit ones; question of correctness of interpretation

    Data sparsity problems

    Cold start problem
      How to recommend new items? What to recommend to new users?
    Straightforward approaches
      Ask/force users to rate a set of items
      Use another method (e.g., content-based, demographic or simply non-personalized) in the initial phase
    Alternatives
      Use better algorithms (beyond nearest-neighbor approaches)
      Example:
        In nearest-neighbor approaches, the set of sufficiently similar neighbors might be too small to make good predictions
        Assume "transitivity" of neighborhoods

    Example algorithms for sparse datasets

    Recursive CF
      Assume there is a very close neighbor n of u who, however, has not rated the target item i yet.
      Idea:
        Apply CF method recursively and predict a rating for item i for the neighbor
        Use this predicted rating instead of the rating of a more distant direct neighbor

             Item1  Item2  Item3  Item4  Item5
    Alice      5      3      4      4      ?
    User1      3      1      2      3      ?
    User2      4      3      4      3      5
    User3      3      3      1      5      4
    User4      1      5      5      2      1

    sim(Alice, User1) = 0.85: predict rating for User1

    Graph-based methods

    "Spreading activation" (sketch)
      Idea: Use paths of lengths > 3 to recommend items
      Length 3: Recommend Item3 to User1
      Length 5: Item1 also recommendable

    More model-based approaches

    Plethora of different techniques proposed in the last years, e.g.,
      Matrix factorization techniques, statistics
        singular value decomposition, principal component analysis
      Association rule mining
        compare: shopping basket analysis
      Probabilistic models
        clustering models, Bayesian networks, probabilistic Latent Semantic Analysis
      Various other machine learning approaches
    Costs of pre-processing
      Usually not discussed
      Incremental updates possible?

    2000: Application of Dimensionality Reduction in Recommender System, B. Sarwar et al., WebKDD Workshop

    Basic idea: Trade more complex offline model building for faster online prediction generation
    Singular Value Decomposition for dimensionality reduction of rating matrices
      Captures important factors/aspects and their weights in the data
      factors can be genre, actors but also non-understandable ones
      Assumption that k dimensions capture the signals and filter out noise (k = 20 to 100)
    Constant time to make recommendations
    Approach also popular in IR (Latent Semantic Indexing), data compression, ...

    A picture says ...

    [Figure: Bob, Mary, Alice and Sue plotted in a two-dimensional space; both axes range from -1 to 1]

    Matrix factorization

    SVD: $M_k = U_k \times \Sigma_k \times V_k^T$

    U_k        Dim1   Dim2
    Alice      0.47   0.30
    Bob        0.44   0.23
    Mary       0.70   0.06
    Sue        0.31   0.93

    V_k^T
    Dim1       0.44   0.57   0.06   0.38   0.57
    Dim2       0.58   0.66   0.26   0.18   0.36

    Sigma_k    Dim1   Dim2
    Dim1       5.63   0
    Dim2       0      3.23

    Prediction: $\hat{r}_{ui} = \bar{r}_u + U_k(\text{Alice}) \times \Sigma_k \times V_k^T(\text{EPL}) = 3 + 0.84 = 3.84$
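    The decomposition and the prediction rule can be sketched with NumPy. This is a minimal example under stated assumptions: the rating matrix `M` below is hypothetical (the slide's original matrix is not part of the transcript), missing ratings are ignored, and ratings are centered on each user's mean before the SVD so that the prediction has the form "user mean plus low-rank term".

```python
import numpy as np

# Hypothetical 4 users x 5 items rating matrix (not the slide's data).
M = np.array([
    [5, 3, 4, 4, 2],
    [3, 1, 2, 3, 3],
    [4, 3, 4, 3, 5],
    [1, 5, 5, 2, 1],
], dtype=float)

# Center each row on the user's mean rating.
user_means = M.mean(axis=1, keepdims=True)

# Thin SVD of the centered matrix: centered = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(M - user_means, full_matrices=False)

# Keep only the k strongest dimensions.
k = 2
U_k, S_k, Vt_k = U[:, :k], np.diag(s[:k]), Vt[:k, :]

def predict(u, i):
    """Predicted rating: user mean plus U_k(u) x Sigma_k x V_k^T(i)."""
    return float(user_means[u, 0] + U_k[u] @ S_k @ Vt_k[:, i])

# Rank-k reconstruction of the whole matrix, for comparison.
approx = user_means + U_k @ S_k @ Vt_k
```

    `predict(0, 3)` then plays the role of the slide's prediction for one user and one item; once the factors are stored, each prediction costs only a product of k-dimensional vectors.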

    Association rule mining

    Commonly used for shopping behavior analysis
      aims at detection of rules such as "If a customer purchases baby food then he also buys diapers in 70% of the cases"
    Association rule mining algorithms
      can detect rules of the form X => Y (e.g., baby food => diapers) from a set of sales transactions D = {t1, t2, ..., tn}
      measure of quality: support, confidence
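    Support and confidence are simple counts over the transaction set. The sketch below uses a made-up list of five transactions (an assumption for illustration, not data from the tutorial): support is the share of transactions containing an itemset, and the confidence of X => Y is support(X and Y) divided by support(X).

```python
# Hypothetical sales transactions D = {t1, ..., t5}.
transactions = [
    {"baby food", "diapers", "milk"},
    {"baby food", "diapers"},
    {"baby food", "bread"},
    {"diapers", "beer"},
    {"milk", "bread"},
]

def support(itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(x, y):
    """Confidence of the rule X => Y: support(X union Y) / support(X)."""
    return support(x | y) / support(x)

# Rule: baby food => diapers
rule_support = support({"baby food", "diapers"})
rule_confidence = confidence({"baby food"}, {"diapers"})
```

    In this toy data the rule holds in 2 of 5 transactions (support 0.4) and in 2 of the 3 transactions that contain baby food (confidence about 0.67).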

    Probabilistic methods

    Basic idea (simplistic version for illustration):
      given the user/item rating matrix
      determine the probability that user Alice will like an item i
      base the recommendation on these probabilities
    Calculation of rating probabilities based on Bayes' Theorem
      How probable is rating value "1" for Item5 given Alice's previous ratings?
      Corresponds to conditional probability P(Item5 = 1 | X), where
        X = Alice's previous ratings = (Item1 = 1, Item2 = 3, Item3 = ...)
      Can be estimated based on Bayes' Theorem
    Usually more sophisticated methods used
      Clustering
      pLSA

    2008: Factorization meets the neighborhood: a multifaceted collaborative filtering model, Y. Koren, ACM SIGKDD

    Stimulated by work on Netflix competition
      Prize of $1,000,000 for accuracy improvement of 10% RMSE compared to own Cinematch system
      Very large dataset (~100M ratings, ~480K users, ~18K movies)
      Last ratings/user withheld (set K)
    Root mean squared error metric optimized to 0.8567

      $$RMSE = \sqrt{\frac{\sum_{(u,i) \in K} (\hat{r}_{ui} - r_{ui})^2}{|K|}}$$

    2008: Factorization meets the neighborhood: a multifaceted collaborative filtering model, Y. Koren, ACM SIGKDD

    Merges neighborhood models with latent factor models
    Latent factor models
      good to capture weak signals in the overall data
    Neighborhood models
      good at detecting strong relationships between close items
    Combination in one single prediction function
      Local search method such as stochastic gradient descent to determine parameters
      Add penalty for high values to avoid overfitting

      $$\hat{r}_{ui} = \mu + b_u + b_i + p_u^T q_i$$

      $$\min_{p_*, q_*, b_*} \sum_{(u,i) \in K} (r_{ui} - \mu - b_u - b_i - p_u^T q_i)^2 + \lambda (\|p_u\|^2 + \|q_i\|^2 + b_u^2 + b_i^2)$$
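    A regularized factor model with user and item biases can be fitted by stochastic gradient descent in a few lines. The sketch below covers only the latent factor part (global mean, biases and factor vectors), not the neighborhood component of Koren's combined model. The learning rate, regularization weight, number of factors and epochs are assumed example values, and the training data is the small rating table from the running example.

```python
import random

# Known ratings as (user, item, rating); user 0 is Alice, item 4 is Item5.
table = [
    [5, 3, 4, 4, None],
    [3, 1, 2, 3, 3],
    [4, 3, 4, 3, 5],
    [3, 3, 1, 5, 4],
    [1, 5, 5, 2, 1],
]
data = [(u, i, r) for u, row in enumerate(table)
        for i, r in enumerate(row) if r is not None]

def train(data, n_users, n_items, k=2, lr=0.01, reg=0.05, epochs=300, seed=0):
    """Fit r_hat = mu + b_u + b_i + p_u . q_i by SGD (all hyperparameters assumed)."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in data) / len(data)
    b_u = [0.0] * n_users
    b_i = [0.0] * n_items
    p = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]

    def predict(u, i):
        return mu + b_u[u] + b_i[i] + sum(pf * qf for pf, qf in zip(p[u], q[i]))

    for _ in range(epochs):
        for u, i, r in data:
            err = r - predict(u, i)
            # Gradient step on the squared error plus the penalty term.
            b_u[u] += lr * (err - reg * b_u[u])
            b_i[i] += lr * (err - reg * b_i[i])
            for f in range(k):
                pu, qi = p[u][f], q[i][f]
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return mu, predict

def rmse(predict, data):
    """Root mean squared error of a predictor on (user, item, rating) triples."""
    return (sum((predict(u, i) - r) ** 2 for u, i, r in data) / len(data)) ** 0.5

mu, predict = train(data, n_users=5, n_items=5)
alice_item5 = predict(0, 4)
```

    Comparing `rmse(predict, data)` with the error of always predicting `mu` shows how much the biases and factors explain on the training ratings; a proper evaluation would use withheld ratings instead.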

    Summarizing recent methods

    Recommendation is concerned with learning from noisy observations (x, y), where $f(x) = \hat{y}$ has to be determined such that $\sum_{\hat{y}} (\hat{y} - y)^2$ is minimal.
    A variety of different learning strategies have been applied trying to estimate f(x)
      Non-parametric neighborhood models
      MF models, SVMs, Neural Networks, Bayesian Networks, ...

    Collaborative Filtering Issues

    Pros:
      well understood, works well in some domains, no knowledge engineering required
    Cons:
      requires user community, sparsity problems, no integration of other knowledge sources, no explanation of results
    What is the best CF method?
      In which situation and which domain? Inconsistent findings; always the same domains and data sets; differences between methods are often very small (1/100)
    How to evaluate the prediction quality?
      MAE / RMSE: What does an MAE of 0.7 actually mean?
      Serendipity: Not yet fully understood
    What about multi-dimensional ratings?

    Recommender Systems in e-Commerce

    One Recommender Systems research question
      What should be in that list?

    Another question both in research and practice
      How do we know that these are good recommendations?

    This might lead to
      What is a good recommendation?
      What is a good recommendation strategy?
      What is a good recommendation strategy for my business?
      "We hope you will buy also"
      "These have been in stock for quite a while now"

    What is a good recommendation?

    What are the measures in practice?
      Total sales numbers
      Promotion of certain items
      Click-through rates
      Interactivity on platform
      Customer return rates
      Customer satisfaction and loyalty

    Purpose and success criteria (1)

    Different perspectives/aspects
      Depends on domain and purpose
      No holistic evaluation scenario exists
    Retrieval perspective
      Reduce search costs
      Provide "correct" proposals
      Assumption: Users know in advance what they want
    Recommendation perspective
      Serendipity: identify items from the Long Tail
      Users did not know about existence

    When does a RS do its job well?

    "Recommend widely unknown items that users might actually like!"
    20% of items accumulate 74% of all positive ratings
    Recommend items from the long tail

    Purpose and success criteria (2)

    Prediction perspective
      Predict to what degree users like an item
      Most popular evaluation scenario in research
    Interaction perspective
      Give users a "good feeling"
      Educate users about the product domain
      Convince/persuade users: explain
    Finally, conversion perspective
      Commercial situations
      Increase "hit", "click-through", "lookers to bookers" rates
      Optimize sales margins and profit

    How do we as researchers know?

    Test with real users
      A/B tests
      Example measures: sales increase, click-through rates
    Laboratory studies
      Controlled experiments
      Example measures: satisfaction with the system (questionnaires)
    Offline experiments
      Based on historical data
      Example measures: prediction accuracy, coverage

    Empirical research

    Characterizing dimensions:
      Who is the subject that is in the focus of research?
      What research methods are applied?
      In which setting does the research take place?

    Subject:          Online customers, students, historical online sessions, computers, ...
    Research method:  Experiments, quasi-experiments, non-experimental research
    Setting:          Lab, real-world scenarios

    Research methods

    Experimental vs. non-experimental (observational) research methods
    Experiment (test, trial):
      "An experiment is a study in which at least one variable is manipulated and units are randomly assigned to different levels or categories of manipulated variable(s)."
      Units: users, historic sessions, ...
      Manipulated variable: type of RS, groups of recommended items, explanation strategies
      Categories of manipulated variable(s): content-based RS, collaborative RS

    Experiment designs

    Evaluation in information retrieval (IR)

    Recommendation is viewed as information retrieval task:
      Retrieve (recommend) all items which are predicted to be "good" or "relevant".
    Common protocol:
      Hide some items with known ground truth
      Rank items or predict ratings -> Count -> Cross-validate
    Ground truth established by human domain experts

                               Reality
                               Actually Good          Actually Bad
    Prediction   Rated Good    True Positive (tp)     False Positive (fp)
                 Rated Bad     False Negative (fn)    True Negative (tn)

    Metrics: Precision and Recall

    Precision: a measure of exactness, determines the fraction of relevant items retrieved out of all items retrieved
      E.g. the proportion of recommended movies that are actually good
    Recall: a measure of completeness, determines the fraction of relevant items retrieved out of all relevant items
      E.g. the proportion of all good movies recommended
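    In terms of the confusion matrix, precision is tp / (tp + fp) and recall is tp / (tp + fn). A minimal sketch follows, using hypothetical movie identifiers rather than data from the tutorial.

```python
def precision_recall(recommended, relevant):
    """Return (precision, recall) for one recommendation list.

    recommended: items the system returned ("rated good").
    relevant:    items that are actually good according to the ground truth.
    """
    recommended, relevant = set(recommended), set(relevant)
    tp = len(recommended & relevant)  # true positives
    precision = tp / len(recommended) if recommended else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical example: 4 recommended movies, 5 actually good movies, 2 in common.
precision, recall = precision_recall(
    ["m1", "m2", "m3", "m4"],
    ["m2", "m4", "m7", "m8", "m9"],
)
```

    Here precision is 2/4 and recall is 2/5; recommending more items tends to raise recall while lowering precision.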

    Dilemma of IR measures in RS

    IR measures are frequently applied, however:
      Ground truth for most items actually unknown
      What is a relevant item?
      Different ways of measuring precision possible
    Results from offline experimentation may have limited predictive power for online user behavior.

    Metrics: Rank Score - position matters

    Rank Score extends recall and precision to take the positions of correct items in a ranked list into account
      Particularly important in recommender systems as lower-ranked items may be overlooked by users
      Learning-to-rank: Optimize models for such measures (e.g., AUC)

    For a user:
      Actually good: Item 237, Item 899
      Recommended (predicted as good): Item 345, Item 237, Item 187
      Hit: Item 237
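    One common way to make position matter is a half-life score, where a hit at rank k contributes 2^(-(k-1)/alpha) and the sum is normalized by the best achievable score. The transcript does not preserve the slide's exact formula, so the definition and the half-life alpha = 2 used below are assumptions; the item lists are the ones from the example.

```python
def rank_score(ranked, relevant, alpha=2.0):
    """Normalized half-life rank score (assumed definition, alpha is an example value).

    A hit at position `pos` contributes 2 ** (-(pos - 1) / alpha); the total is
    divided by the score of an ideal list that ranks all relevant items first.
    """
    relevant = set(relevant)
    score = sum(2 ** (-(pos - 1) / alpha)
                for pos, item in enumerate(ranked, start=1) if item in relevant)
    best = sum(2 ** (-(pos - 1) / alpha) for pos in range(1, len(relevant) + 1))
    return score / best if best else 0.0

# The only hit (Item 237) sits at position 2 of the recommendation list.
score = rank_score(
    ["Item 345", "Item 237", "Item 187"],
    ["Item 237", "Item 899"],
)
```

    Moving Item 237 to the first position would raise the score, whereas precision and recall would stay the same.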

    Accuracy measures

    Datasets with items rated by users
      MovieLens datasets: 100K to 10M ratings
      Netflix: 100M ratings
    Historic user ratings constitute ground truth
    Metrics measure error rate
      Mean Absolute Error (MAE) computes the deviation between predicted ratings and actual ratings
      Root Mean Square Error (RMSE) is similar to MAE, but places more emphasis on larger deviation
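    The difference between the two error measures shows up as soon as one prediction is far off. The following sketch uses made-up predicted and actual ratings (an assumption for illustration only).

```python
from math import sqrt

def mae(predicted, actual):
    """Mean Absolute Error: average absolute deviation."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root Mean Square Error: squaring penalizes large deviations more."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Hypothetical ratings: deviations are 1, 0, 1 and 3.
predicted = [4, 3, 5, 2]
actual = [5, 3, 4, 5]

error_mae = mae(predicted, actual)
error_rmse = rmse(predicted, actual)
```

    The single deviation of 3 pulls RMSE (about 1.66) well above MAE (1.25).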

    Offline experimentation example

    Netflix competition
      Web-based movie rental
      Prize of $1,000,000 for accuracy improvement (RMSE) of 10% compared to own Cinematch system.
    Historical dataset
      ~480K users rated ~18K movies on a scale of 1 to 5 (~100M ratings)
      Last 9 ratings/user withheld
        Probe set: for teams for evaluation
        Quiz set: evaluates teams' submissions for leaderboard
        Test set: used by Netflix to determine winner
    Today
      Rating prediction only seen as an additional input into the recommendation process

    An imperfect world

    Offline evaluation is the cheapest variant
      Still, gives us valuable insights
      and lets us compare our results (in theory)
    Dangers and trends:
      Domination of accuracy measures
      Focus on small set of domains (40% on movies in CS)
    Alternative and complementary measures:
      Diversity, Coverage, Novelty, Familiarity, Serendipity, Popularity, Concentration effects (Long tail)

    Online experimentation example

    Effectiveness of different algorithms for recommending cell phone games [Jannach, Hegelich 09]

    Involved 150,000 users on a commercial mobile internet portal

    Comparison of recommender methods

    Details and results

    Recommender variants included:
        Item-based collaborative filtering
        SlopeOne (also collaborative filtering)
        Content-based recommendation
        Hybrid recommendation
        Top-rated items (non-personalized)
        Top sellers (non-personalized)

    Findings:
        Personalized methods increased sales up to 3.6% compared to non-personalized
        Choice of recommendation algorithm depends on user situation (e.g. avoid content-based RS in post-sales situation)

    Non-experimental research

    Quasi-experiments
        Lack random assignments of units to different treatments

    Non-experimental / observational research
        Surveys / Questionnaires
        Longitudinal research
            Observations over long period of time
            E.g. customer lifetime value, returning customers
        Case studies
        Focus group
            Interviews
            Think-aloud protocols

    Quasi-experimental

    SkiMatcher Resort Finder introduced by Ski-Europe.com to provide users with recommendations based on their preferences

    Conversational RS
        question and answer dialog
        matching of user preferences with knowledge base

    Delgado and Davidson evaluated the effectiveness of the recommender over a 4-month period in 2001
        Classified as a quasi-experiment, as users decide for themselves if they want to use the recommender or not

    SkiMatcher Results

                                July      August    September  October
    Unique Visitors             10,714    15,560    18,317     24,416
      SkiMatcher Users          1,027     1,673     1,878      2,558
      Non-SkiMatcher Users      9,687     13,887    16,439     21,858
    Requests for Proposals      272       506       445        641
      SkiMatcher Users          75        143       161        229
      Non-SkiMatcher Users      197       363       284        412
    Conversion                  2.54%     3.25%     2.43%      2.63%
      SkiMatcher Users          7.30%     8.55%     8.57%      8.95%
      Non-SkiMatcher Users      2.03%     2.61%     1.73%      1.88%
    Increase in Conversion      359%      327%      496%       475%

    [Delgado and Davidson, ENTER 2002]
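    The derived rows of the table can be recomputed from the raw counts. The sketch below does this for July, reading conversion as requests for proposals divided by unique visitors. The "Increase in Conversion" row appears to be the ratio of the two conversion rates rather than their relative difference; that reading is an inference from the numbers, not something the slide states.

```python
def conversion(requests_for_proposals, unique_visitors):
    """Share of visitors who issued a request for proposal."""
    return requests_for_proposals / unique_visitors


# July figures from the table.
overall = conversion(272, 10714)
skimatcher = conversion(75, 1027)
non_skimatcher = conversion(197, 9687)

# Ratio of the two group rates, expressed as a percentage in the table.
increase = skimatcher / non_skimatcher

july = {
    "overall %": round(overall * 100, 2),
    "SkiMatcher %": round(skimatcher * 100, 2),
    "Non-SkiMatcher %": round(non_skimatcher * 100, 2),
    "increase %": round(increase * 100),
}
```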

    Interpreting the Results

    The nature of this research design means that questions of causality cannot be answered (lack of random assignments), such as
        Are users of the recommender system more likely to convert?
        Does the recommender system itself cause users to convert?
        Some hidden exogenous variable might influence the choice of using the RS as well as conversion.

    However, there is a significant correlation between using the recommender system and making a request for a proposal

    Size of effect has been replicated in other domains
        Tourism [Jannach et al., JITT 2009]
        Electronic consumer products

    Observational research

    Increased demand in niches / long-tail products
        Ex-post from web shop data [Zanker et al., EC-Web, 2006]

    What is popular?

    User-centric evaluation / user studies
        Increased interest in recent years
        Various workshops

    [Figure from: Jannach et al., Proceedings EC-Web 2012]

    What are the next topics?

    Two additional major paradigms of recommender systems
        Content-based
        Knowledge-based

    Hybridization: take the best of different paradigms

    Advanced topics: recommender systems are about human decision making

    Content-based recommendation

    Collaborative filtering does NOT require any information about the items
        However, it might be reasonable to exploit such information
        E.g. recommend fantasy novels to people who liked fantasy novels in the past

    What do we need:
        Some information about the available items such as the genre ("content")
        Some sort of user profile describing what the user likes (the preferences)

    The task:
        Learn user preferences
        Locate/recommend items that are "similar" to the user preferences

    Paradigms of recommender systems

    Content-based: "Show me more of the same what I've liked"

    What is the "content"?

    The genre is actually not part of the content of a book

    Most CB recommendation methods originate from the Information Retrieval (IR) field:
        The item descriptions are usually automatically extracted (important words)
        Goal is to find and rank interesting text documents (news articles, web pages)

    Here:
        Classical IR-based methods based on keywords
        No expert recommendation knowledge involved
        The user profile (preferences) is learned rather than explicitly elicited

    Content representation and item similarities

    Simple approach
        Compute the similarity of an unseen item with the user profile based on the keyword overlap (e.g. using the Dice coefficient)

        $\mathrm{sim}(b_i, b_j) = \frac{2 \cdot |\mathrm{keywords}(b_i) \cap \mathrm{keywords}(b_j)|}{|\mathrm{keywords}(b_i)| + |\mathrm{keywords}(b_j)|}$
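    The keyword-overlap idea can be sketched in a few lines of Python. The keyword sets are invented for illustration, and returning 0 for two empty sets is a convention chosen here.

```python
def dice(keywords_a, keywords_b):
    """Dice coefficient: twice the overlap divided by the sum of the set sizes."""
    a, b = set(keywords_a), set(keywords_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty keyword sets
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical user profile and unseen item.
profile = {"fantasy", "dragon", "magic", "quest"}
unseen_book = {"fantasy", "magic", "detective"}

similarity = dice(profile, unseen_book)  # 2 shared keywords, set sizes 4 and 3
```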

    Term Frequency - Inverse Document Frequency (TF-IDF)

    Simple keyword representation has its problems
        In particular when automatically extracted, because
            Not every word has similar importance
            Longer documents have a higher chance to have an overlap with the user profile

    Standard measure: TF-IDF
        Encodes text documents as weighted term vectors
        TF: Measures how often a term appears (density in a document)
            Assuming that important terms appear more often
            Normalization has to be done in order to take document length into account
        IDF: Aims to reduce the weight of terms that appear in all documents

    TF-IDF

    Compute the overall importance of keywords
        Given a keyword i and a document j
        $\mathrm{TF\text{-}IDF}(i,j) = \mathrm{TF}(i,j) \cdot \mathrm{IDF}(i)$

    Term frequency (TF)
        Let freq(i,j) be the number of occurrences of keyword i in document j
        Let maxOthers(i,j) denote the highest number of occurrences of another keyword of j
        $\mathrm{TF}(i,j) = \frac{\mathrm{freq}(i,j)}{\mathrm{maxOthers}(i,j)}$

    Inverse Document Frequency (IDF)
        N: number of all recommendable documents
        n(i): number of documents in which keyword i appears
        $\mathrm{IDF}(i) = \log \frac{N}{n(i)}$
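    A small sketch of these definitions in Python. TF divides by the count of the most frequent other keyword, and IDF is taken as log(N / n(i)). The toy corpus and the fallback to 1 when a document has no other keyword are assumptions made for the example.

```python
import math
from collections import Counter


def tf(term, doc_tokens):
    """freq(i,j) divided by the highest count of any other keyword in the document."""
    counts = Counter(doc_tokens)
    others = [c for t, c in counts.items() if t != term]
    max_others = max(others) if others else 1  # assumed fallback for one-keyword documents
    return counts[term] / max_others


def idf(term, corpus):
    """log(N / n(i)); terms that occur in every document get weight 0."""
    n_i = sum(1 for doc in corpus if term in doc)
    return math.log(len(corpus) / n_i) if n_i else 0.0


def tf_idf(term, doc_tokens, corpus):
    return tf(term, doc_tokens) * idf(term, corpus)


# Hypothetical three-document corpus.
corpus = [
    ["recommender", "systems", "recommender", "tutorial"],
    ["systems", "evaluation"],
    ["systems", "tutorial", "slides"],
]

w_recommender = tf_idf("recommender", corpus[0], corpus)
w_systems = tf_idf("systems", corpus[0], corpus)
```

    Here "systems" appears in every document, so its weight drops to zero, while "recommender" is both frequent in the first document and rare in the corpus.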

    Example TF-IDF representation

    Figure taken from http://informationretrieval.org

    More on the vector space model

    Vectors are usually long and sparse

    Improvements
        Remove stop words ("a", "the", ..)
        Use stemming
        Size cut-offs (only use top n most representative words, e.g. around 100)
        Use additional knowledge, use more elaborate methods for feature selection
        Detection of phrases as terms (such as United Nations)

    Limitations
        Semantic meaning remains unknown
        Example: usage of a word in a negative context
            "there is nothing on the menu that a vegetarian would like.."

    Usual similarity metric to compare vectors: Cosine similarity (angle)
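    Cosine similarity on sparse term vectors can be sketched as follows. The vectors are stored as dictionaries from term to weight, and the example weights are invented. Because only the angle matters, a longer document pointing in the same direction as the profile still gets the maximum score.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two sparse term-weight vectors (dicts)."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical term weights.
profile = {"fantasy": 1.0, "magic": 1.0}
doc_same_direction = {"fantasy": 2.0, "magic": 2.0}
doc_unrelated = {"cooking": 1.0, "vegetarian": 1.0}

sim_same = cosine(profile, doc_same_direction)
sim_unrelated = cosine(profile, doc_unrelated)
```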

    Recommending items

    Simple method: nearest neighbors
        Given a set of documents D already rated by the user (like/dislike)
            Find the n nearest neighbors of a not-yet-seen item i in D
            Take these ratings to predict a rating/vote for i
            (Variations: neighborhood size, lower/upper similarity thresholds)

    Query-based retrieval: Rocchio's method
        The SMART System: Users are allowed to rate (relevant/irrelevant) retrieved documents (feedback)
        The system then learns a prototype of relevant/irrelevant documents
        Queries are then automatically extended with additional terms/weight of relevant documents
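    The nearest-neighbor method can be sketched as a majority vote over the n most similar rated documents. The Dice coefficient on keyword sets is used as the similarity here only to keep the example short; the rated documents and the simple majority rule are assumptions.

```python
def dice(a, b):
    """Keyword-overlap similarity between two sets."""
    return 2 * len(a & b) / (len(a) + len(b)) if (a or b) else 0.0


def knn_predict(unseen, rated_docs, n=3):
    """Majority vote of the n most similar rated documents; True means 'like'."""
    neighbors = sorted(rated_docs, key=lambda d: dice(unseen, d[0]), reverse=True)[:n]
    likes = sum(1 for _, liked in neighbors if liked)
    return likes > len(neighbors) / 2


# Hypothetical documents D already rated by the user (keywords, liked?).
rated = [
    ({"fantasy", "dragon", "magic"}, True),
    ({"fantasy", "quest", "elves"}, True),
    ({"tax", "law", "finance"}, False),
    ({"finance", "stocks"}, False),
]

likes_fantasy_item = knn_predict({"fantasy", "magic", "quest"}, rated)
likes_finance_item = knn_predict({"finance", "law"}, rated)
```

    Neighborhood size and similarity thresholds are the tuning knobs mentioned above; a threshold could be added by filtering `neighbors` before the vote.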

    Rocchio details

    Document collections D+ and D-

    α, β, γ used to fine-tune the feedback

    Often only positive feedback is used
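    The update formula itself did not survive on this slide, so the sketch below uses the textbook form of Rocchio relevance feedback as an assumption: the new query is alpha times the old query, plus beta times the centroid of the liked documents (D+), minus gamma times the centroid of the disliked documents (D-). The default parameter values and the clipping of negative weights are common choices, not taken from the slides.

```python
def rocchio(query, liked_docs, disliked_docs, alpha=1.0, beta=0.75, gamma=0.15):
    """Move a sparse query vector toward D+ and away from D- (textbook Rocchio)."""
    terms = set(query)
    for doc in liked_docs + disliked_docs:
        terms |= set(doc)

    updated = {}
    for t in terms:
        pos = sum(d.get(t, 0.0) for d in liked_docs) / len(liked_docs) if liked_docs else 0.0
        neg = sum(d.get(t, 0.0) for d in disliked_docs) / len(disliked_docs) if disliked_docs else 0.0
        weight = alpha * query.get(t, 0.0) + beta * pos - gamma * neg
        updated[t] = max(weight, 0.0)  # negative weights clipped to zero (assumed convention)
    return updated


# Hypothetical query and feedback.
query = {"ski": 1.0}
liked = [{"ski": 1.0, "alps": 1.0}]
disliked = [{"beach": 1.0}]

new_query = rocchio(query, liked, disliked)
```

    Setting gamma to 0 corresponds to using only positive feedback.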

    Probabilistic methods

    Recommendation as classical text classification problem
        Long history of using probabilistic methods

    Simple approach:
        2 classes: like/dislike
        Simple Boolean document representation
        Calculate probability that document is liked/disliked based on Bayes theorem

    Remember: P(Label=1|X) = k * P(X|Label=1) * P(Label=1)
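    A minimal sketch of the simple approach: a naive Bayes classifier over Boolean keyword features. The training documents are invented, and the Laplace smoothing is an added assumption to avoid zero probabilities. Dividing by the sum of both class scores plays the role of the constant k.

```python
def train(docs, vocabulary):
    """docs: list of (keyword_set, label) with label 1 = like, 0 = dislike."""
    model = {}
    for label in (0, 1):
        members = [kw for kw, y in docs if y == label]
        prior = len(members) / len(docs)
        # Laplace-smoothed P(term present | label); smoothing is an assumption.
        present = {
            t: (sum(t in kw for kw in members) + 1) / (len(members) + 2)
            for t in vocabulary
        }
        model[label] = (prior, present)
    return model


def posterior_like(model, keywords, vocabulary):
    """P(Label=1 | X), assuming conditional independence of the keywords."""
    scores = {}
    for label, (prior, present) in model.items():
        p = prior
        for t in vocabulary:
            p *= present[t] if t in keywords else 1 - present[t]
        scores[label] = p
    return scores[1] / (scores[0] + scores[1])


# Hypothetical training data.
vocabulary = ["recommender", "intelligent", "learning", "school"]
docs = [
    ({"recommender", "intelligent", "learning"}, 1),
    ({"learning", "school"}, 0),
    ({"recommender", "learning"}, 1),
    ({"school"}, 0),
]

model = train(docs, vocabulary)
p_like = posterior_like(model, {"recommender", "intelligent"}, vocabulary)
p_dislike_case = posterior_like(model, {"school"}, vocabulary)
```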

    Improvements

    Side note: Conditional independence of events does in fact not hold
        "New"/"York" and "Hong"/"Kong"
        Still, good accuracy can be achieved

    Boolean representation simplistic
        Keyword counts lost

    More elaborate probabilistic methods
        E.g. estimate probability of term v occurring in a document of class C by relative frequency of v in all documents of the class

    Other linear classification algorithms (machine learning) can be used
        Support Vector Machines, ..

    Limitations of content-based recommendation methods

    Keywords alone may not be sufficient to judge quality/relevance of a document or web page
        Up-to-dateness, usability, aesthetics, writing style
        Content may also be limited / too short
        Content may not be automatically extractable (multimedia)

    Ramp-up phase required
        Some training data is still required
        Web 2.0: Use other sources to learn the user preferences

    Overspecialization
        Algorithms tend to propose "more of the same"
        E.g. too similar news items

    Why do we need knowledge-based recommendation?

    Products with low number of available ratings

    Time span plays an important role
        Five-year-old ratings for computers
        User lifestyle or family situation changes

    Customers want to define their requirements explicitly
        "The color of the car should be black"

    Knowledge-based recommendation

    Knowledge-based: "Tell me what fits based on my needs"

    Knowledge-based recommendation I

    Explicit domain knowledge
        Sales knowledge elicitation from domain experts
        System mimics the behavior of experienced sales assistant
        Best-practice sales interactions
        Can guarantee correct recommendations (determinism) with respect to expert knowledge

    Conversational interaction strategy
        Opposed to one-shot interaction
        Elicitation of user requirements
        Transfer of product knowledge (educating users)

    Knowledge-Based Recommendation II

    Different views on knowledge
        Similarity functions
            Determine matching degree between query and item (case-based RS)
        Utility-based RS
            E.g. MAUT: Multi-attribute utility theory
        Logic-based knowledge descriptions (from domain expert)
            E.g. Hard and soft constraints

    Constraint-based recommendation I

    A knowledge-based RS formulated as constraint satisfaction problem

    Def.
        X_I, X_U: Variables describing items and user model with domain D (e.g. lower focal length, purpose)
        KB: Knowledge base comprising constraints and domain restrictions (e.g. IF purpose = "on travel" THEN lower focal length

    Multi-Attribute Utility Theory (MAUT)

    Each item is evaluated according to a predefined set of dimensions that provide an aggregated view on the basic item properties
        E.g. quality and economy are dimensions in the domain of digital cameras

    id        value   quality  economy
    price     <=250   5        10
              >250    10       5
    mpix      <=8     4        10
              >8      10       6
    opt-zoom  <=9     6        9
              >9      10       6
    ...       ...     ...      ...

    Customer-specific item utilities with MAUT

    Customer interests:

    customer  quality  economy
    Cu1       80%      20%
    Cu2       40%      60%

    Item utilities:

    item  quality                    economy                      utility: cu1  utility: cu2
    P1    (5,4,6,6,3,7,10) = 41      (10,10,9,10,10,10,6) = 65    45.8 [8]      55.4 [6]
    P2    (5,4,6,6,10,10,8) = 49     (10,10,9,10,7,8,10) = 64     52.0 [7]      58.0 [1]
    P3    (5,4,10,6,10,10,8) = 53    (10,10,6,10,7,8,10) = 61     54.6 [5]      57.8 [2]
    ...   ...                        ...                          ...
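    The utilities in the table are the interest-weighted sums of each item's dimension scores, for example 0.8 * 41 + 0.2 * 65 = 45.8 for P1 and Cu1. A short Python sketch reproduces them. The ranking it prints covers only the three items shown, so it need not match the bracketed numbers, which presumably refer to a larger catalogue.

```python
# Customer interests and per-dimension item scores, copied from the tables.
interests = {
    "cu1": {"quality": 0.8, "economy": 0.2},
    "cu2": {"quality": 0.4, "economy": 0.6},
}
contributions = {
    "P1": {"quality": 41, "economy": 65},
    "P2": {"quality": 49, "economy": 64},
    "P3": {"quality": 53, "economy": 61},
}


def utility(item, customer):
    """Weighted sum of the item's dimension scores with the customer's interests."""
    return sum(interests[customer][dim] * score for dim, score in contributions[item].items())


def ranking(customer):
    """Items ordered by decreasing utility for the given customer."""
    return sorted(contributions, key=lambda item: utility(item, customer), reverse=True)


for customer in interests:
    print(customer, [(item, round(utility(item, customer), 1)) for item in ranking(customer)])
```

    The quality-oriented customer Cu1 ends up with P3 on top, while the economy-oriented Cu2 gets P2 first.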

    Constraint-based recommendation II

    BUT: What if no solution exists?
        $KB \cup I$ not satisfiable: debugging of knowledge base
        $SRS \cup KB \cup I$ not satisfiable but $KB \cup I$ correct: debugging of user requirements

    Application of model-based diagnosis for debugging user requirements
        Diagnoses: $(SRS \setminus \Delta) \cup KB \cup I$ is satisfiable
        Repairs: $(SRS \setminus \Delta) \cup \Delta_{repair} \cup KB \cup I$ is satisfiable
        Conflict sets: $CS \subseteq SRS : CS \cup KB \cup I$ not satisfiable

    Example: find minimal relaxations (minimal diagnoses)

    User model (SRS)

    R1  Motives           Landscape
    R2  Brand preference  Canon
    R3  Max. cost         350 EUR

    Product catalogue:

                        Powershot XY  Lumix
    Brand               Canon         Panasonic
    Lower focal length  35            28
    Upper focal length  140           112
    Price               420 EUR       319 EUR

    Knowledge Base:

        LHS                  RHS
    C1  TRUE                 Brand = Brand pref.
    C2  Motives = Landscape  Low. foc. Length =

    Computation of minimal revisions of requirements
        Do you want to relax your brand preference?
            Accept Panasonic instead of Canon brand
        Or is photographing landscapes with a wide-angle lens and maximum cost less important?
            Lower focal length > 28mm and Price > 350 EUR
        Optionally guided by some predefined weights or past community behavior

    Be aware of possible revisions (e.g. age, family status, ...)
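    The two revisions can be found mechanically by searching for minimal sets of requirements whose removal makes the remaining ones satisfiable. The brute-force sketch below does this for the camera example. The knowledge base is truncated in the source, so the checks attached to R1 (lower focal length of at most 28 for landscapes) and R3 (price of at most 350 EUR) are assumptions inferred from the suggested revisions. Enumerating all subsets is only practical for a handful of requirements.

```python
from itertools import combinations

catalogue = {
    "Powershot XY": {"brand": "Canon", "lower_focal": 35, "price": 420},
    "Lumix": {"brand": "Panasonic", "lower_focal": 28, "price": 319},
}

# Each user requirement paired with the check it triggers on an item.
requirements = {
    "R1": lambda item: item["lower_focal"] <= 28,  # motives = landscape (assumed threshold)
    "R2": lambda item: item["brand"] == "Canon",   # brand preference
    "R3": lambda item: item["price"] <= 350,       # max. cost 350 EUR (assumed constraint)
}


def satisfiable(active):
    """True if at least one catalogue item fulfils all active requirements."""
    return any(all(requirements[r](item) for r in active) for item in catalogue.values())


def minimal_diagnoses():
    """Smallest sets of requirements to drop so that a matching item exists."""
    found = []
    names = sorted(requirements)
    for size in range(len(names) + 1):
        for dropped in combinations(names, size):
            if any(set(d) <= set(dropped) for d in found):
                continue  # a subset is already a diagnosis, so this set is not minimal
            if satisfiable([r for r in names if r not in dropped]):
                found.append(dropped)
    return found


diagnoses = minimal_diagnoses()
```

    Under these assumptions the search returns two diagnoses: drop the brand preference, or drop both the landscape motive and the cost limit.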

    Constraint-based recommendation III

    More variants of recommendation task
        Customers may not know what they are seeking

    Find "diverse" sets of items
        Notion of similarity/dissimilarity
        Idea that users navigate a product space
        If recommendations are more diverse, then users can navigate via critiques on recommended "entry points" more efficiently (fewer steps of interaction)

    Bundling of recommendations
        Find item bundles that match together according to some knowledge
            E.g. travel packages, skin care treatments or financial portfolios
            RS for different item categories, CSP restricts configuring of bundles

    Conversational strategies

    Process consisting of multiple conversational moves
        Resembles natural sales interactions
        Not all user requirements known beforehand
        Customers are rarely satisfied with the initial recommendations

    Different styles of preference elicitation:
        Free text query interface
        Asking technical/generic properties
        Images / inspiration
        Proposing and Critiquing

    Example: adaptive strategy selection

    State model, different actions possible
        Propose item, ask user, relax/tighten result set, ...

    [Ricci et al., JITT, 2009]

    Limitations of knowledge-based recommendation methods

    Cost of knowledge acquisition
        From domain experts
        From users
        Remedy: exploit web resources

    Accuracy of preference models
        Very fine-granular preference models require many interaction cycles with the user or sufficiently detailed data about the user
        Remedy: use collaborative filtering, which estimates the preference of a user
        However: preference models may be unstable
            E.g. asymmetric dominance effects and decoy items

    Hybrid recommender systems

    All three base techniques are naturally incorporated by a good sales assistant (at different stages of the sales act) but have their shortcomings

    Idea of crossing two (or more) species/implementations
        hybrida [lat.]: denotes an object made by combining two different elements
        Avoid some of the shortcomings
        Reach desirable properties not present in individual approaches

    Different hybridization designs
        Monolithic: exploiting different features
        Parallel: use of several systems
        Pipelined: invocation of different systems

    Monolithic hybridization design

    Only a single recommendation component

    Hybridization is "virtual" in the sense that
        Features/knowledge sources of different paradigms are combined

    Monolithic hybridization designs: Feature combination

    "Hybrid" user features:
        Social features: Movies liked by user
        Content features: Comedies liked by user, dramas liked by user
        Hybrid features: users who like many movies that are comedies, ...

    "the common knowledge engineering effort that involves inventing good features to enable successful learning" [BHC98]

    Monolithic hybridization designs: Feature augmentation

    Content-boosted collaborative filtering [MMN02]
        Based on content features additional ratings are created
        E.g. Alice likes Items 1 and 3 (unary ratings)
            Item 7 is similar to 1 and 3 by a degree of 0.75
            Thus Alice likes Item 7 by 0.75
        Item matrices become less sparse

    Recommendation of research papers [TMA+04]
        Citations interpreted as collaborative recommendations
        Integrated in content-based recommendation method

    Parallelized hybridization design

    Output of several existing implementations combined

    Least invasive design

    Weighting or voting scheme applied
        Weights can be learned dynamically
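    A weighting scheme can be sketched as a weighted sum of the scores that each recommender assigns to an item. The scores and the equal weights below are hypothetical; in practice the weights might be learned, as noted above.

```python
def weighted_hybrid(score_lists, weights):
    """Combine per-item scores of several recommenders by a weighted sum."""
    items = set().union(*score_lists)
    return {
        item: sum(w * scores.get(item, 0.0) for w, scores in zip(weights, score_lists))
        for item in items
    }


# Hypothetical outputs of two existing recommenders.
rec1 = {"Item1": 0.5, "Item2": 0.0, "Item3": 0.3}
rec2 = {"Item1": 0.8, "Item2": 0.9, "Item3": 0.4}

combined = weighted_hybrid([rec1, rec2], weights=[0.5, 0.5])
top_item = max(combined, key=combined.get)

# Putting all weight on one recommender simply reproduces its output.
only_rec2 = weighted_hybrid([rec1, rec2], weights=[0.0, 1.0])
```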

    Parallelized hybridization design: Switching

    Special case of dynamic weights (all weights except one are 0)

    Requires an oracle that decides which recommender is used

    Example:
        Switching is based on some quality criteria:
            E.g. if too few ratings in the system, use knowledge-based, else collaborative

    Pipelined hybridization designs

    One recommender system pre-processes some input for the subsequent one
        Cascade
        Meta-level

    Refinement of recommendation lists (cascade)

    Learning of model (e.g. collaborative knowledge-based meta-level)

    Pipelined hybridization designs: Cascade

    Recommendation list is continually reduced

    First recommender excludes items
        Remove absolute no-go items (e.g. knowledge-based)

    Second recommender assigns score
        Ordering and refinement (e.g. collaborative)

           Recommender 1   Recommender 2   Recommender cascaded (rec1, rec2)
    item   score  rank     score  rank     score  rank
    Item1  0.5    1        0.8    2        0.80   1
    Item2  0               0.9    1        0.00
    Item3  0.3    2        0.4    3        0.40   2
    Item4  0.1    3        0               0.00
    Item5  0               0               0.00
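    The cascaded scores in the example follow a simple rule: an item keeps the second recommender's score only if the first recommender did not exclude it (score 0). The sketch below reproduces the example under that reading of the numbers.

```python
# Scores from the example.
rec1 = {"Item1": 0.5, "Item2": 0.0, "Item3": 0.3, "Item4": 0.1, "Item5": 0.0}
rec2 = {"Item1": 0.8, "Item2": 0.9, "Item3": 0.4, "Item4": 0.0, "Item5": 0.0}


def cascade(first, second):
    """Keep the second recommender's score only for items the first did not exclude."""
    return {item: (second[item] if first[item] > 0 else 0.0) for item in first}


def ranked(scores):
    """Items with a non-zero score, best first."""
    return sorted((i for i, s in scores.items() if s > 0), key=scores.get, reverse=True)


cascaded = cascade(rec1, rec2)
final_list = ranked(cascaded)
```

    Item2 has the highest score in the second recommender but never reaches the final list, because the first recommender already removed it.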

    Pipelined hybridization designs: Meta-level

    Successor exploits a model $\Delta$ built by predecessor

        $rec_{meta\text{-}level}(u,i) = rec_n(u,i,\Delta_{rec_{n-1}})$

        $\Delta_{rec_{n-1}}$ is the model built by $RS_{n-1}$ and exploited by $RS_n$

    Examples:
        Fab system: content-based, collaborative recommendation [BS97]
            Online news domain
            Content-based recommender builds user models based on weighted term vectors
            Collaborative filtering identifies similar peers based on weighted term vectors but makes recommendations based on ratings
        Collaborative, constraint-based meta-level RS
            Collaborative filtering identifies similar peers
            A constraint base is learned by exploiting the behavior of similar peers
            Learned constraints are employed to compute recommendations

    What is the best hybridization strategy?

    Only few works that compare strategies from the meta-perspective
        For instance, [Burke02]
        Most datasets do not allow to compare different recommendation paradigms
            I.e. ratings, requirements, item features, domain knowledge, critiques rarely available in a single dataset
        Some conclusions are supported by empirical findings
            Monolithic: preprocessing effort traded in for more knowledge included
            Parallel: requires careful design of scores from different predictors
            Pipelined: works well for two antithetic approaches

    Netflix competition: stacking recommender systems
        Weighted design based on >100 predictors (recommendation functions)
        Adaptive switching of weights based on user model, parameters

    Advanced topics I

    Explanations in recommender systems

    Motivation

    "The digital camera Profishot is a must-buy for you because ...."

    Why should recommender systems deal with explanations at all?

    The answer is related to the two parties providing and receiving recommendations:
        A selling agent may be interested in promoting particular products
        A buying agent is concerned about making the right buying decision

    Explanations in recommender systems

    Additional information to explain the system's output following some objectives

    Objectives of explanations

    Transparency
    Validity
    Trustworthiness
    Persuasiveness
    Effectiveness
    Efficiency
    Satisfaction
    Relevance
    Comprehensibility
    Education

    Explanations in general

    How? and Why? explanations in expert systems

    Form of abductive reasoning
        Given: $KB \models_{RS} i$ (item i is recommended by method RS)
        Find $KB' \subseteq KB$ s.t. $KB' \models_{RS} i$

    Principle of succinctness
        Find smallest subset $KB' \subseteq KB$ s.t. $KB' \models_{RS} i$,
        i.e. for all $KB'' \subset KB'$ holds $KB'' \not\models_{RS} i$

    But additional filtering
        Some parts relevant for deduction might be obvious for humans

    [Friedrich & Zanker, AI Magazine, 2011]

    Taxonomy for generating explanations in RS

    Major design dimensions of current explanation components:
        Category of reasoning model for generating explanations
            White box
            Black box
        RS paradigm for generating explanations
            Determines the exploitable semantic relations
        Information categories

    RS paradigms and their ontologies

    Classes of objects
        Users
        Items
        Properties

    N-ary relations between them

    Collaborative filtering
        Neighborhood-based CF (a)
        Matrix factorization (b)
            Introduces additional factors as proxies for determining similarities

    RS paradigms and their ontologies

    Content-based
        Properties characterizing items
        TF*IDF model

    Knowledge-based
        Properties of items
        Properties of user model
        Additional mediating domain concepts

    Similarity between items

    Similarity between users

    Tags
        Tag relevance (for item)
        Tag preference (of user)

    Thermencheck.com (hot spring resorts)

    The water has favorable properties for X, but it is unknown if it also cures Y.

    It offers organic food, but no kosher food.

    It offers services for families with small children, such as X, Y and Z.

    It is a spa resort of medium size offering around 1000 beds.

    Results from testing the explanation feature

    Knowledgeable explanations significantly increase the users' perceived utility

    Perceived utility strongly correlates with usage intention etc.

    [Figure: path model relating Explanation, Trust, Perceived Utility, Positive Usage exp., Recommend others, and Intention to repeated usage; ** sign.]