Recommender Systems


Description: A survey of current recommendation technologies, including the author's latest research at IMPCA, Curtin University of Technology.

Transcript of Recommender Systems

Page 1: Recommender Systems

RecSys: Recommender Systems

Tran The Truyen

http://truyen.vietlabs.com

Page 2: Recommender Systems

The world is an over-crowded place

Page 3: Recommender Systems

They all want to get our attention

Page 4: Recommender Systems

We are overloaded

• Thousands of news articles and blog posts each day

• Millions of movies, books and music tracks online

• In Hanoi, > 50 TV channels and thousands of programs each day

• In New York, several thousand ad messages sent to us per day

Page 5: Recommender Systems

But we really need and consume only a few of them!

Page 6: Recommender Systems

Sometimes, all we need is this

Page 7: Recommender Systems

Or, just this

DON’T DISTURB!

Page 8: Recommender Systems

Help me!

Page 9: Recommender Systems

Can Google help?

• Yes, but only when we really know what we are looking for

• What if I just want some interesting music tracks?

– By the way, what does "interesting" mean?

Page 10: Recommender Systems

Can Facebook help?

• Yes, I tend to find my friends' stuff interesting

• What if I have only a few friends, and what they like does not always interest me?

Page 11: Recommender Systems

Can experts help?

• Yes, but it won't scale well

– Everyone receives exactly the same advice!

• It is what they like, not what I like!

– With movies, for example, expert approval does not guarantee mass appeal

Page 12: Recommender Systems

OK, here is the idea called RecSys:

• To recommend to us something we may like

– It may not be popular

– The world is long-tailed

• How?

– Based on our history of using services

– Based on other people like us

– Ever heard of "collective intelligence"?

I like these bits

Page 13: Recommender Systems

Hang on, what is long-tailed?

• Popularised by Chris Anderson, Wired 2004

[Figure: the short-tailed (bell-shaped) distribution versus the long-tailed distribution]

Page 14: Recommender Systems

Ever heard of

• GroupLens?

• Amazon recommendation?

• Netflix Cinematch?

• Google News personalization?

• Netflix Prize $1mil challenge?

• Strands?

• TiVo?

• Findory?

Page 15: Recommender Systems
Page 16: Recommender Systems

Want some evidence? (Celma & Lamere, ISMIR 2007)

• Netflix

– 2/3 of rented movies come from recommendations

• Google News

– 38% more click-throughs are due to recommendations

• Amazon

– 35% of sales come from recommendations

Page 17: Recommender Systems

What can be recommended?

• Advertising messages

• Investment choices

• Restaurants

• Cafes

• Music tracks

• Movies

• TV programs

• Books

• Clothes

• Supermarket goods

• Tags

• News articles

• Online mates (Dating services)

• Future friends (Social network sites)

• Courses in e-learning

• Drug components

• Research papers

• Citations

• Code modules

• Programmers

Page 18: Recommender Systems

But, what do recommender systems do, exactly?

1. Predict how much you may like a certain

product/service

2. Compose a list of N best items for you

3. Compose a list of N best users for a certain

product/service

4. Explain to you why these items are recommended to

you

5. Adjust the prediction and recommendation based on

your feedback and other people

Page 19: Recommender Systems

Graph representation

[Figure: bipartite user-item graph linking users (Me, My friend, You, Another guy) to items (Titanic, Taken, Panda); a "?" marks the unknown preference to predict]
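A minimal sketch of how such a bipartite graph might be held in code, assuming a plain Python dict mapping each user to the items they have rated; the names and ratings below are illustrative, not taken from the slide:

# Users map to their rated items; the rating is the edge weight.
ratings = {
    "Me":          {"Titanic": 5, "Panda": 4},
    "My friend":   {"Titanic": 4, "Taken": 3},
    "You":         {"Taken": 5, "Panda": 2},
    "Another guy": {"Titanic": 2},
}

def items_of(user):
    """Items connected to a user in the bipartite graph."""
    return set(ratings.get(user, {}))

def users_of(item):
    """Users connected to an item."""
    return {u for u, r in ratings.items() if item in r}

# The "?" in the figure is simply an edge that does not exist yet:
print("Has 'Me' rated 'Taken'?", "Taken" in ratings["Me"])  # False -> something to predict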

Page 20: Recommender Systems

We must also take good care of

• Data normalisation

• Removal or reduction of noise

• Protection of users’ privacy

• Attacks: someone just doesn't like your system

Page 21: Recommender Systems

Task 1: Preference prediction

• Collaborative filtering

– User-based method

– Item-based method

– Matrix Factorization

• Content-based filtering

• Hybrid:

– Linear/sequential/switching combination

– Semi-Restricted Boltzmann Machines

Page 22: Recommender Systems

Collaborative filtering (1)

• User-based method (1994, GroupLens)

– Many people liked "Kung Fu Panda"

– Can you tell how much I would like it?

– The idea is to pick about 20-50 people who share a similar taste with me; how much I will like it then depends on how much THEY liked it.

– In short: you may like it because your "friends" liked it

[Figure: user × item rating matrix (users 1-6, items 1-8) with known ratings and empty cells to predict]
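A minimal sketch of the user-based idea in plain Python. The rating dict R, the cosine similarity choice and the small neighbourhood size are illustrative assumptions, not the GroupLens implementation:

import math

# Hypothetical user -> {item: rating} dict, mirroring the matrix in the figure.
R = {
    1: {1: 5, 2: 4, 4: 5, 6: 3, 7: 4},
    2: {1: 3, 3: 5, 5: 4, 8: 5},
    3: {2: 4, 4: 5, 7: 4},
    4: {1: 5, 3: 4, 5: 3, 8: 5},
    5: {2: 2, 5: 3, 7: 5},
    6: {1: 4, 4: 5, 6: 2},
}

def sim(u, v):
    """Cosine similarity between two users over their co-rated items."""
    common = set(R[u]) & set(R[v])
    if not common:
        return 0.0
    num = sum(R[u][i] * R[v][i] for i in common)
    den = math.sqrt(sum(R[u][i] ** 2 for i in common)) * \
          math.sqrt(sum(R[v][i] ** 2 for i in common))
    return num / den if den else 0.0

def predict(u, item, k=3):
    """Predict user u's rating for an item as a similarity-weighted
    average over the k most similar users who have rated it
    (the slide suggests 20-50 neighbours in practice)."""
    neighbours = sorted(
        ((sim(u, v), v) for v in R if v != u and item in R[v]),
        reverse=True)[:k]
    num = sum(s * R[v][item] for s, v in neighbours)
    den = sum(abs(s) for s, _ in neighbours)
    return num / den if den else None

print(predict(1, 8))  # how much might user 1 like item 8?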

Page 23: Recommender Systems

Collaborative filtering (2)

• Item-based method (2001, deployed at Amazon)

– I have watched so many good & bad movies

– Would you recommend that I watch "Taken"?

– The idea is to pick from my previous list 20-50 movies that share a similar audience with "Taken"; how much I will like it then depends on how much I liked those earlier movies

– In short: I tend to watch this movie because I have watched those movies … or

– People who have watched those movies also liked this movie (Amazon style)

[Figure: the same user × item rating matrix, now read column-wise to compare items by the audiences they share]
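A minimal sketch of the item-based idea, with an illustrative item -> {user: rating} dict; this is the textbook recipe described above, not Amazon's actual implementation:

import math

# Hypothetical item -> {user: rating} dict (a transpose of the user x item matrix).
C = {
    "Taken":   {1: 4, 2: 5, 4: 3},
    "Titanic": {1: 5, 3: 4, 4: 4},
    "Panda":   {2: 5, 3: 3, 4: 5},
}

def item_sim(a, b):
    """Cosine similarity between two items over the users who rated both."""
    common = set(C[a]) & set(C[b])
    if not common:
        return 0.0
    num = sum(C[a][u] * C[b][u] for u in common)
    den = math.sqrt(sum(C[a][u] ** 2 for u in common)) * \
          math.sqrt(sum(C[b][u] ** 2 for u in common))
    return num / den if den else 0.0

def predict(user, target, k=20):
    """Score the target item from the k most similar items this user has rated."""
    rated = [i for i in C if i != target and user in C[i]]
    neighbours = sorted(((item_sim(target, i), i) for i in rated), reverse=True)[:k]
    num = sum(s * C[i][user] for s, i in neighbours)
    den = sum(abs(s) for s, _ in neighbours)
    return num / den if den else None

print(predict(3, "Taken"))  # should user 3 watch "Taken"?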

Page 24: Recommender Systems

Collaborative filtering (3)

• Matrix Factorization (2006, Netflix challenge)

– You may have watched thousands of movies

– But perhaps I can tell that these movies belong to 10 groups, like Action, Sci-Fi, Animation, etc.

– So 10 numbers are enough to describe your taste

– Likewise, "Titanic" has been watched by millions of people, but perhaps … 10 numbers are enough to describe its features

– Magic: these hidden aspects can be discovered automatically by Matrix Factorization!

~ [0.1 0.3 0.2 0.9 0.5 0.4 0.7 0.3 0.8 1.5]
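A minimal sketch of matrix factorization trained with stochastic gradient descent, in the spirit of the Netflix-challenge approach sketched above; the rating triples, the number of hidden factors and the learning-rate/regularisation settings are illustrative assumptions:

import random

# (user, item, rating) triples; all values here are made up for illustration.
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 4), (2, 2, 5)]
n_users, n_items, k = 3, 3, 10          # k hidden factors per user and per item
lr, reg, epochs = 0.01, 0.05, 200       # learning rate, regularisation, passes

random.seed(0)
P = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]  # user factors
Q = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]  # item factors

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

for _ in range(epochs):
    for u, i, r in ratings:
        err = r - dot(P[u], Q[i])          # prediction error on this observed rating
        for f in range(k):                 # gradient step on both factor vectors
            pu, qi = P[u][f], Q[i][f]
            P[u][f] += lr * (err * qi - reg * pu)
            Q[i][f] += lr * (err * pu - reg * qi)

# Predict an unseen (user, item) pair, e.g. user 0 and item 2:
print(round(dot(P[0], Q[2]), 2))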

Page 25: Recommender Systems

Problems with collaborative filtering

• Scale

– Netflix (2007): 5M users, 50K movies, 1.4B ratings

• Sparse data

– I have rated only one book at Amazon!

• Cold start

– New users and items have no history

• Popularity bias

– Everyone reads "Harry Potter"

• Hacking

– Someone who reads "Harry Potter" also "reads" "Kama Sutra"

Page 26: Recommender Systems

Content-based method

• Web page: words, hyperlinks, images, tags, comments, titles, URL, topic

• Music: genre, rhythm, melody, harmony, lyrics, metadata, artists, bands, press releases, expert reviews, loudness, energy, time, spectrum, duration, frequency, pitch, key, mode, mood, style, tempo

• User: age, sex, job, location, time, income, education, language, family status, hobbies, general interests, Web usage, computer usage, fan club membership, opinions, comments, tags, mobile usage

• Context: time, location, mobility, activity, socializing, emotion

Page 27: Recommender Systems

Content-based method (2)

• Can we acquire those content pieces automatically?

– Fairly easy for text

– Difficult for music and video, beyond the digital signal itself, e.g. music genre classification reaches 60-80% accuracy

– A lot of noise, e.g. misplaced tags

– Attacks

• What can we do with these?

– Compute similarity between items or users

– Query items that are similar to a given item

– Match an item's content against a user's profile

Page 28: Recommender Systems

Content-based method (3)

• Measuring similarity

– Cosine, TF-IDF as in standard Information Retrieval

– KL-divergence for probability-oriented guys

– Euclidean, dimensionality reduction if you want

– Anything you can think of!
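A minimal sketch of the first option, TF-IDF weighting plus cosine similarity as in standard Information Retrieval; the toy item descriptions and whitespace tokenisation are illustrative assumptions:

import math
from collections import Counter

docs = {
    "Titanic": "romantic drama ship disaster ocean love",
    "Taken":   "action thriller kidnap rescue paris",
    "Panda":   "animation comedy kungfu panda action",
}

tokenised = {name: text.split() for name, text in docs.items()}
n_docs = len(tokenised)
df = Counter(w for words in tokenised.values() for w in set(words))  # document frequency

def tfidf(words):
    """Term frequency weighted by inverse document frequency."""
    tf = Counter(words)
    return {w: tf[w] * math.log(n_docs / df[w]) for w in tf}

def cosine(a, b):
    common = set(a) & set(b)
    num = sum(a[w] * b[w] for w in common)
    den = math.sqrt(sum(x * x for x in a.values())) * \
          math.sqrt(sum(x * x for x in b.values()))
    return num / den if den else 0.0

vecs = {name: tfidf(words) for name, words in tokenised.items()}
print(cosine(vecs["Taken"], vecs["Panda"]))    # share the word "action"
print(cosine(vecs["Taken"], vecs["Titanic"]))  # no overlap -> 0.0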

Page 29: Recommender Systems

Hybrid: Semi-Restricted Boltzmann Machines (2009, IMPCA)

• A probabilistic combination of

– Item-based method

– User-based method

– Matrix Factorization

– (Maybe) content-based method

• It looks like a Neural Network

– But it is not really one ☺

• It really is a type of Markov random field, which is, in turn, a type of Graphical Model

– Self-advertising: I work on this stuff for a living!

[Figure: graphical model linking Item X to Users A, B and C]

Page 30: Recommender Systems

But, what do recommender systems do, exactly?

1. Predict how much you may like a certain

product/service

2. Compose a list of N best items for you

3. Compose a list of N best users for a certain

product/service

4. Explain to you why these items are recommended to

you

5. Adjust the prediction and recommendation based on

your feedback and other people

Page 31: Recommender Systems

Task 2, 3: Top-N recommendation

• Top-N item list:

– Find similar users and collect what they like

– Filter out the items the user has already rated

– Rank the remaining items by considering

• The number of times each item is liked by those users

• The popularity of the item

• The associated ratings

• The similarity between each item in the list and what the user has rated

• Switching the roles of item and user, we get the top-N user list (a code sketch of the item-list recipe follows below)
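A minimal sketch of that recipe, assuming a user -> {item: rating} dict R and a user-similarity function sim like the ones sketched for Page 22; for brevity it ranks only by similarity-weighted votes, one of the signals listed above:

def top_n(user, R, sim, n=10, k=20):
    """Recommend n items: gather items liked by the k most similar users,
    drop items the user has already rated, and rank the rest by
    similarity-weighted votes."""
    neighbours = sorted(
        ((sim(user, v), v) for v in R if v != user), reverse=True)[:k]
    scores = {}
    for s, v in neighbours:
        for item, rating in R[v].items():
            if item in R[user]:            # filter out already-rated items
                continue
            scores[item] = scores.get(item, 0.0) + s * rating
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Usage, with the R and sim from the Page 22 sketch:
# print(top_n(1, R, sim, n=5))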

Page 32: Recommender Systems

But, what do recommender systems do, exactly?

1. Predict how much you may like a certain

product/service

2. Compose a list of N best items for you

3. Compose a list of N best users for a certain

product/service

4. Explain to you why these items are recommended to

you

5. Adjust the prediction and recommendation based on

your feedback and other people

Page 33: Recommender Systems

Task 4: Explanation

• This is a current hit …

• More on this artist …

• Try something from similar artists …

• Someone similar to you also likes this …

• As you listened to that, you may want this …

• These two go together …

• This is most popular in your group …

• This is highly rated …

• Try something new …

Page 34: Recommender Systems

Task 4: Explanation (2)

• Examples from Strands.com

– Welcome back (recently viewed)

– For you today

– New for you

– Hot / Most popular of this type

– Other people also do this …

– Similar or related products

– Complementary accessories

– This goes with this …

– Gift idea

– Shopping assistant

Page 35: Recommender Systems

But, what do recommender systems do, exactly?

1. Predict how much you may like a certain

product/service

2. Compose a list of N best items for you

3. Compose a list of N best users for a certain

product/service

4. Explain to you why these items are recommended to

you

5. Adjust the prediction and recommendation based on

your feedback and other people

Page 36: Recommender Systems

Task 5: Online updating

• New items and users arrive every hour or minute

• The two worlds:

– Most songs and books stay interesting for a long time (the tail is really long)

– Most news articles are read on the day and forgotten the next day

• But tracking back is useful to follow an event or scandal

• Online updating of large-scale neighbour-based systems is NOT easy at all

Page 37: Recommender Systems

Evaluation

• How do we know the recommendation is good?

– How good is good?

– Measures should be automated

• Practice: training/testing split (e.g. 80/20)

• Popular criteria

– Prediction error: ZOE, MAE, RMSE

– Hit recall/precision/F-measure, rank utility, ROC curve
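A minimal sketch of the evaluation protocol above: an 80/20 train/test split plus MAE and RMSE on the held-out ratings. The predict function named in the usage comment is a hypothetical placeholder for whatever model was trained:

import math, random

def split(ratings, test_frac=0.2, seed=42):
    """Shuffle (user, item, rating) triples and split into train/test."""
    data = ratings[:]
    random.Random(seed).shuffle(data)
    cut = int(len(data) * (1 - test_frac))
    return data[:cut], data[cut:]

def mae(pairs):
    """Mean absolute error over (true, predicted) pairs."""
    return sum(abs(t - p) for t, p in pairs) / len(pairs)

def rmse(pairs):
    """Root mean squared error over (true, predicted) pairs."""
    return math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))

# Usage, with any model trained on the train split:
# train, test = split(all_ratings)
# pairs = [(r, predict(u, i)) for u, i, r in test]
# print("MAE:", mae(pairs), "RMSE:", rmse(pairs))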

Page 38: Recommender Systems

Evaluation (2)

• Yet little on

– Relevance

– Usefulness

– % Increase in purchase

– % Reduction in cost

– Novelty/surprise/long-tails

– Diversity

– Coverage

– Explainability

Page 39: Recommender Systems

A question: Can we make use of these information sources?

• Blogs

• Social Media

• Online comments

• Online stores

• Review sites

• Locations

• Mobility

Page 40: Recommender Systems

A case-study: Strands

• Services for any online retailer

– Retailers send product and purchase information to the Strands server (one retailer per account) through APIs

– Strands returns recommendations for each visitor

• The same logic applies to social media servers

• moneyStrands for personal financial management (e.g. investment recommendation)

• MyStrands for music personalization

Page 41: Recommender Systems

Want more practical hints?

• New books:

–Toby Segaran, Programming Collective Intelligence, O'Reilly, 2007

–Satnam Alag, Collective Intelligence in Action, Manning Publications, 2009

• Check out real deployments at:

– TechCrunch

– ReadWriteWeb

Page 42: Recommender Systems

Want more of the state of the art?

• Research in Recommender Systems is becoming mainstream, as evidenced by the recent ACM RecSys conference.

• Other places:

– ICWSM: Weblogs and Social Media

– WebKDD: Web Knowledge Discovery and Data Mining

– WWW: The original WWW conference

– SIGIR: Information Retrieval

– ACM KDD: Knowledge Discovery and Data Mining

– ICML: Machine Learning

Page 43: Recommender Systems

Questions left to you

• Will you trust such Recommender Systems?

• Will you implement and deploy one here?

• Will you do research?

– PhD scholarships available (as of 19/4/09)

– See http://truyen.vietlabs.com/scholarship.html

– Warning: you are going to waste 3-5 years of your youth!