Webtech recommender systems_presentation

Recommender Systems, Simona Dakova, Web Technologies, Prof. Dr. Ulrik Schroeder, WS 2010/11. The slides are licensed under a Creative Commons Attribution 3.0 License.

Transcript of Webtech recommender systems_presentation

Page 1: Webtech recommender systems_presentation

Recommender Systems

Simona Dakova

Web Technologies – Prof. Dr. Ulrik Schroeder – WS 2010/11

The slides are licensed under a Creative Commons Attribution 3.0 License

Page 2: Webtech recommender systems_presentation

Overview

Motivation

Netflix Prize Competition

Collaborative filtering approaches

Content-based techniques

Hybrid recommenders

Summary


Page 3: Webtech recommender systems_presentation

We live in information overload!


“We are leaving the age of information and entering the age of recommendation” - The Long Tail (Chris Anderson)

Page 4: Webtech recommender systems_presentation

Netflix: 2/3 of the movies rented were recommended

Google News: 38% more click-throughs

Amazon: 35% sales from recommendations

They try to attract you!


Page 5: Webtech recommender systems_presentation

Why recommenders?

Enhance e-commerce and boost sales

Browsers into buyers

Recommender vs. Search:

Discover the items you are looking for that match your preferences

Limited list of results

Personalize your website content to the profile of an individual user

Discover interesting items

Automated personalization

Increase usage and satisfaction


Page 6: Webtech recommender systems_presentation

Netflix Prize Competition

$1,000,000 if you “only” improve the existing system by 10%!

Contest started in 2006

Annual progress prize: $50,000

Gained great popularity in academic circles

The Winner

BellKor's Pragmatic Chaos

10.06% improvement in July 2009


Page 7: Webtech recommender systems_presentation

Recommender System = ?

Definition:

Algorithms/Systems for information filtering attempting to recommend certain items the user might like

Items:

Advertising messages, Investment choices, Restaurants, Cafes, Music tracks, Movies, TV programs, Books, Clothes, Supermarket goods, Tags, News articles, Online mates, Research papers


Page 8: Webtech recommender systems_presentation

User Profiling

Understand people's needs and interests

Explicit Data Collection

Ask for rating of items

Rank a set of items

Ask for detailed information/feedback

CON: not well received by users, not ubiquitous

Implicit Data Collection

Purchasing history

Items viewed

Navigational patterns

Obtain list of watched/listened items

Analyze social data

CON: Privacy concerns


Page 9: Webtech recommender systems_presentation

Technology overview

RECOMMENDERS

Collaborative filtering (CF)

  Memory-based CF algorithms

  Model-based CF algorithms

Content-based filtering (CB)

Hybrid recommenders

Page 10: Webtech recommender systems_presentation

Collaborative filtering (CF)

Prediction based on past ratings

Memory-based CF algorithms:

• compute similarities between users/items

• make predictions according to the calculated weight (similarity)

Model-based CF algorithms:

• learn a model from users' ratings

• use the model to predict the probabilistic rating of the active user on a given item

Page 11: Webtech recommender systems_presentation

Memory-based CF Algorithms


Page 12: Webtech recommender systems_presentation

Memory-based CF Algorithms

Use the entire user-item matrix or a sample of it

Steps:

1. Identify the neighbors of the active user/item

Similarity computation

Pearson correlation

Vector cosine-based similarity

2. Neighborhood-based prediction / Top-N recommendation
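The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular system: the ratings are the toy matrix from the content-boosted example later in the deck, the function names are made up for this sketch, and the prediction uses the common mean-centered weighted average over neighbors as one possible choice.

```python
from math import sqrt

# Toy ratings (user -> {item: rating}); a missing key means "not rated".
ratings = {
    "u1": {"i1": 5, "i2": 8, "i4": 7},
    "u2": {"i1": 10, "i3": 1},
    "u3": {"i1": 2, "i3": 10, "i4": 9},
    "u4": {"i2": 2, "i3": 9, "i4": 9},
    "ua": {"i1": 2, "i3": 9, "i4": 10},  # the active user
}

def co_rated(a, b):
    """Items rated by both users."""
    return sorted(set(ratings[a]) & set(ratings[b]))

def pearson(a, b):
    """Pearson correlation over co-rated items (0.0 if undefined)."""
    items = co_rated(a, b)
    if len(items) < 2:
        return 0.0
    mean_a = sum(ratings[a][i] for i in items) / len(items)
    mean_b = sum(ratings[b][i] for i in items) / len(items)
    dev_a = [ratings[a][i] - mean_a for i in items]
    dev_b = [ratings[b][i] - mean_b for i in items]
    num = sum(x * y for x, y in zip(dev_a, dev_b))
    den = sqrt(sum(x * x for x in dev_a)) * sqrt(sum(y * y for y in dev_b))
    return num / den if den else 0.0

def cosine(a, b):
    """Vector cosine similarity over co-rated items."""
    items = co_rated(a, b)
    dot = sum(ratings[a][i] * ratings[b][i] for i in items)
    norm = (sqrt(sum(ratings[a][i] ** 2 for i in items))
            * sqrt(sum(ratings[b][i] ** 2 for i in items)))
    return dot / norm if norm else 0.0

def predict(user, item, sim=pearson):
    """Step 2: mean-centered, similarity-weighted average over neighbors."""
    mean_user = sum(ratings[user].values()) / len(ratings[user])
    num = den = 0.0
    for other, row in ratings.items():
        if other == user or item not in row:
            continue
        w = sim(user, other)
        mean_other = sum(row.values()) / len(row)
        num += w * (row[item] - mean_other)
        den += abs(w)
    return mean_user + num / den if den else mean_user

print(pearson("ua", "u3"), cosine("ua", "u3"), predict("ua", "i2"))
```

With so few co-rated items the similarities are noisy (two co-rated items always give a Pearson value of +1, -1, or 0), which is the data sparsity problem in miniature.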

Page 13: Webtech recommender systems_presentation

User-based vs. Item-based


User-based = You may like it because your “friends” liked it

Item-based = You may like it because you like similar items

i1 i2 i3 i4 i5

u1 5 8 7 8

u2 10 1

u3 2 10 9 9

u4 2 9 9 10

u5 1 5 1

ua 2 9 10

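For contrast with the user-based view, here is a minimal item-based sketch. It assumes the toy ratings from the content-boosted example later in the deck, plain cosine similarity between item columns, and a hypothetical `min_overlap` threshold so that items sharing a single rater are not treated as perfectly similar. All function names are illustrative.

```python
from math import sqrt

# Toy ratings (user -> {item: rating}); a missing key means "not rated".
ratings = {
    "u1": {"i1": 5, "i2": 8, "i4": 7},
    "u2": {"i1": 10, "i3": 1},
    "u3": {"i1": 2, "i3": 10, "i4": 9},
    "u4": {"i2": 2, "i3": 9, "i4": 9},
    "ua": {"i1": 2, "i3": 9, "i4": 10},
}

def item_column(item):
    """All ratings given to one item: user -> rating."""
    return {u: row[item] for u, row in ratings.items() if item in row}

def item_cosine(i, j, min_overlap=2):
    """Cosine similarity of two items over the users who rated both."""
    col_i, col_j = item_column(i), item_column(j)
    users = set(col_i) & set(col_j)
    if len(users) < min_overlap:
        return 0.0
    dot = sum(col_i[u] * col_j[u] for u in users)
    norm = (sqrt(sum(col_i[u] ** 2 for u in users))
            * sqrt(sum(col_j[u] ** 2 for u in users)))
    return dot / norm if norm else 0.0

def predict_item_based(user, item):
    """Weighted average of the user's own ratings on similar items."""
    num = den = 0.0
    for other_item, r in ratings[user].items():
        if other_item == item:
            continue
        w = item_cosine(item, other_item)
        num += w * r
        den += w
    return num / den if den else None

# u1 has not rated i3; i4 is the item most similar to i3 in this data,
# so u1's rating for i4 dominates the prediction.
print(item_cosine("i3", "i4"), predict_item_based("u1", "i3"))
```

In practice the item-item similarities are often precomputed offline, which is one reason item-based CF tends to scale better than user-based CF.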

Page 14: Webtech recommender systems_presentation

Model-based CF Algorithms


Page 15: Webtech recommender systems_presentation

Model-Based CF Algorithms

Train your system to recognize complex patterns in user-item data (ratings)

Make the recommendation based on the trained model

Relies on machine learning and data mining algorithms

[Figure: all ratings → Train → MODEL (only a subset of the ratings) → RECOMMENDATION]
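As one possible instance of the train-then-recommend idea, the sketch below fits a small matrix factorization model with stochastic gradient descent. The slides do not name a specific algorithm, so this choice, the hyperparameters (`k`, `epochs`, `lr`, `reg`), and all function names are assumptions made for illustration; the ratings are the toy matrix from the content-boosted example later in the deck.

```python
import random
from math import sqrt

# (user, item, rating) triples from the toy matrix.
data = [
    ("u1", "i1", 5), ("u1", "i2", 8), ("u1", "i4", 7),
    ("u2", "i1", 10), ("u2", "i3", 1),
    ("u3", "i1", 2), ("u3", "i3", 10), ("u3", "i4", 9),
    ("u4", "i2", 2), ("u4", "i3", 9), ("u4", "i4", 9),
    ("ua", "i1", 2), ("ua", "i3", 9), ("ua", "i4", 10),
]

def train_mf(data, k=2, epochs=300, lr=0.01, reg=0.02, seed=0):
    """Learn k latent factors per user and item; returns (mu, P, Q)."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in data})
    items = sorted({i for _, i, _ in data})
    mu = sum(r for _, _, r in data) / len(data)  # global mean rating
    P = {u: [rng.uniform(-0.1, 0.1) for _ in range(k)] for u in users}
    Q = {i: [rng.uniform(-0.1, 0.1) for _ in range(k)] for i in items}
    for _ in range(epochs):
        for u, i, r in data:
            err = r - (mu + sum(p * q for p, q in zip(P[u], Q[i])))
            for f in range(k):
                p, q = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * q - reg * p)
                Q[i][f] += lr * (err * p - reg * q)
    return mu, P, Q

def predict(model, user, item):
    """Use only the trained model, not the raw ratings."""
    mu, P, Q = model
    return mu + sum(p * q for p, q in zip(P[user], Q[item]))

def rmse(model, data):
    """Root mean squared error of the model on the given ratings."""
    return sqrt(sum((r - predict(model, u, i)) ** 2 for u, i, r in data) / len(data))

model = train_mf(data)
print(rmse(model, data), predict(model, "ua", "i2"))
```

After training, predictions need only the compact factor tables, which is why model-based methods can answer requests quickly even though building the model is expensive.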

Page 16: Webtech recommender systems_presentation

Limitations and problems of CF

Depend on human ratings

Data sparsity

Cold start: new user and new item problem

Scalability

Synonymy

Shilling attacks

Gray/Black sheep


Page 17: Webtech recommender systems_presentation

Content-based recommenders


Page 18: Webtech recommender systems_presentation

Content-based recommendation (CB)

For items containing textual information (keywords)

Information Retrieval

Compares similarity of the features of given items

Example: Movie recommendation application

Analyze common features among the movies

Recommend only the movies that have a high degree of similarity to whatever the user’s preferences are

[Figure labels: large similarity, small similarity]
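The movie example can be sketched as follows. The movie titles and keywords are invented for illustration, and the user profile is simply the bag of keywords of the movies the user liked; real systems typically weight terms (for example with TF-IDF) rather than using raw counts.

```python
from collections import Counter
from math import sqrt

# Hypothetical catalogue: title -> descriptive keywords.
movies = {
    "Movie A": ["space", "adventure", "robots"],
    "Movie B": ["space", "robots", "war"],
    "Movie C": ["romance", "comedy"],
    "Movie D": ["comedy", "adventure"],
}

def cosine(a, b):
    """Cosine similarity between two keyword-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (sqrt(sum(v * v for v in a.values()))
            * sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def recommend(liked, top_n=2):
    """Rank unseen movies by similarity to the user's keyword profile."""
    profile = Counter()
    for title in liked:
        profile.update(movies[title])
    scores = {
        title: cosine(profile, Counter(keywords))
        for title, keywords in movies.items()
        if title not in liked
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend(["Movie A"]))
```

Note that no other user's data is involved: the recommendation depends only on item features and the active user's own history.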

Page 19: Webtech recommender systems_presentation

Limitations and problems of CB


Limited content analysis

Explicitly associated features

Multimedia data – relies on tagging

Same set of features – indistinguishable

Overspecialization

Difficult to recognize synonyms, concepts, or new emerging words

New user problem

Page 20: Webtech recommender systems_presentation

Hybrid recommenders


Page 21: Webtech recommender systems_presentation

Hybrid recommenders

Use a combination of CF and CB

Implementing methods separately and combining their predictions

Incorporating CB characteristics into a CF approach or vice versa

Constructing a general unifying model that incorporates both

Example: content-boosted collaborative filtering

Sparse user-item ratings (x = missing rating):

      i1   i2   i3   i4
u1     5    8    x    7
u2    10    x    1    x
u3     2    x   10    9
u4     x    2    9    9
ua     2    x    9   10

The content predictor fills in the missing ratings:

      i1   i2   i3   i4
u1     5    8    7    7
u2    10    4    1    8
u3     2    5   10    9
u4     6    2    9    9
ua     2    3    9   10

Collaborative filtering on the filled-in matrix produces the RECOMMENDATION
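The content-boosted pipeline can be sketched as below. This is a simplified illustration under several assumptions: the item features are invented, the content predictor is a plain Jaccard-weighted average of the user's own ratings (so its filled-in values differ from the ones on the slide), and the final step is ordinary user-based CF with Pearson correlation on the filled-in matrix.

```python
from math import sqrt

items = ["i1", "i2", "i3", "i4"]
ratings = {  # None marks a missing rating (the "x" cells)
    "u1": [5, 8, None, 7],
    "u2": [10, None, 1, None],
    "u3": [2, None, 10, 9],
    "u4": [None, 2, 9, 9],
    "ua": [2, None, 9, 10],
}
# Hypothetical item features, not taken from the slides.
features = {
    "i1": {"drama"}, "i2": {"drama", "comedy"},
    "i3": {"action"}, "i4": {"action", "comedy"},
}

def jaccard(a, b):
    """Overlap of two feature sets."""
    return len(a & b) / len(a | b)

def content_predict(row, j):
    """Content predictor: feature-similarity-weighted mean of own ratings."""
    num = den = 0.0
    for k, r in enumerate(row):
        if r is not None:
            w = jaccard(features[items[j]], features[items[k]])
            num += w * r
            den += w
    rated = [r for r in row if r is not None]
    return num / den if den else sum(rated) / len(rated)

def densify(sparse):
    """Keep real ratings, fill the gaps with content-based pseudo-ratings."""
    return {
        u: [r if r is not None else content_predict(row, j)
            for j, r in enumerate(row)]
        for u, row in sparse.items()
    }

def pearson(x, y):
    """Pearson correlation of two equally long rating vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sqrt(sum((a - mx) ** 2 for a in x))
           * sqrt(sum((b - my) ** 2 for b in y)))
    return num / den if den else 0.0

def cf_predict(dense, user, j):
    """User-based CF on the filled-in matrix."""
    mean_u = sum(dense[user]) / len(dense[user])
    num = den = 0.0
    for other, row in dense.items():
        if other == user:
            continue
        w = pearson(dense[user], row)
        num += w * (row[j] - sum(row) / len(row))
        den += abs(w)
    return mean_u + num / den if den else mean_u

dense = densify(ratings)
print(cf_predict(dense, "ua", items.index("i2")))
```

Because every user now has a full rating vector, the similarity computation no longer depends on a handful of co-rated items, which is how the hybrid addresses sparsity.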

Page 22: Webtech recommender systems_presentation

Pros/Cons of Hybrid Recommenders

Advantages

Address limitations of pure CF or CB systems

Provide more accurate recommendations

Performance improvement

Overcome sparsity

Disadvantages

Complexity

Expensive to build


Page 23: Webtech recommender systems_presentation

The winning solution of the Netflix contest

A blend of several complex algorithms into a hybrid recommender system

Main improvement:

Incorporate temporal effects that cause movie and user biases as well as the changing user preferences


Page 24: Webtech recommender systems_presentation

Summary

Techniques, with their advantages and limitations

Collaborative: memory-based algorithms (neighborhood-based CF, Top-N recommendation)

• Advantages: easy implementation; no content considered

• Limitations: data sparsity; cold start problem; limited scalability

Collaborative: model-based algorithms (machine learning / data mining algorithms)

• Advantages: deal better with sparsity and scalability; intuitive rationale

• Limitations: expensive modeling; trade-off between performance and scalability

Content-based (information retrieval)

• Advantages: no data about other users needed; recommendations for new/unpopular items; predictions for users with unique tastes

• Limitations: limited content analysis; overspecialization; new user problem

Hybrids (combination of collaborative and content-based approaches)

• Advantages: overcome limitations of pure collaborative and content-based recommendations; more accurate recommendations; performance improvement

• Limitations: complexity; expensive to build
