CS6120: Intelligent Media Systems
Transcript of CS6120: Intelligent Media Systems
04/03/2014
1
CS6120: Intelligent Media Systems
Dr. Derek Bridge
School of Computer Science & Information Technology
UCC
Social Recommender Systems
β’ In social recommender systems,
β user profiles include relationships between users
MODIFICATIONS TO COLLABORATIVE RECOMMENDERS
04/03/2014
2
k-Nearest Neighbours for User-Based Collaborative Recommending
for each candidate item, π for each user π£ in ππ except for active user π’, i.e. other users who have rated π
compute π ππ π’, π£
let ππ be the set of π nearest neighbours, i.e. the π users who have rated π and who are most similar to π’ and whose similarity to π’ is positive
let ππ’,π be the weighted average of the ratings for π in ππ
recommend the candidates in descending order of predicted rating
Modification: Friends
for each candidate item, π
for each user π£ who is a friend of π and has rated π
compute π ππ π’, π£
let ππ be the set of π nearest neighbours, i.e. the π users who have rated π and who are most similar to π’ and whose similarity to π’ is positive
let ππ’,π be the weighted average of the ratings for π in ππ
recommend the candidates in descending order of predicted rating
Modification: Friends
Problem:
too few friends
β’ Coverage and accuracy problems β not enough friends to make
good predictions
Problem:
same bed, different dreams
β’ The interest graph is not equal to the social graph
04/03/2014
3
Modification: Trust
for each candidate item, π for each user π£ in ππ except for active user π’, i.e. other users who have rated π
compute πππππ π, π
let ππ be the set of π nearest neighbours, i.e. the π users who have rated π and who π’ trusts most and whose trust by π’ is positive
let ππ’,π be the weighted average of the ratings for π in ππ
recommend the candidates in descending order of predicted rating
Explicit Trust vs Implicit Trust
β’ Explicit β In some systems, users
select who they trust β’ e.g. Epinions
β NB π‘ππ’π π‘(π’, π£) means π’ trusts π£βs opinion
β’ Implicit β In other systems, trust is
inferred from user interaction β’ likes, shares, comments,
visits to profile, chats,β¦
β’ But your immediate web of trust is small
β coverage and accuracy problems
Trust Propagation
β’ Extend your web of trust using transitivity
β if π’ trusts π£ and π£ trusts π€, then π’ trusts π€
β’ Quantify this, e.g. by multiplication π‘ππ’π π‘ π’,π€ =π‘ππ’π π‘(π’, π£) Γ π‘ππ’π π‘(π£,π€)
0.7 0.4
0.28
04/03/2014
4
Trust Aggregation
β’ Aggregation
β for quantifying the trust when there is more than one path from π’ to π€
β Options β’ use the score of the
shortest path
β’ use the highest score
β’ average the scores
β’ β¦
0.7 0.4
0.28 0.1728
0.3 0.9
0.8 0.8
Discussion
Using user-user similarity
1. Sparsity can reduce coverage and accuracy
2. Cold-start problem with new users who have few ratings
3. Prone to attacks
4. The anonymity makes this less transparent
Using trust
1. Trust reaches more users (if it is propagated enough)
2. Reduced cold-start problem:
β each new trust relationship brings more coverage and accuracy than each new rating
β can import your social graph from another site
3. Attacks are more difficult
4. More transparent: more like word-of-mouth
Discussion
β’ Trust β a relationship between two
users
β’ Reputation β the communityβs view of a
user β e.g. by aggregating trust or
by votes on reviews/ comments/ answers to questions
β e.g. karma in reddit β e.g. reputation and badges
in stackoverflow β e.g. an academicβs H-score
β’ In a user-based collaborative recommender you could use a combination of β user-user similarity β trust β reputation
β’ In an item-based collaborative recommender, you could weight the users by trust or reputation within the item-item similarity
04/03/2014
5
NEWSFEED PRIORITIZATION
Facebook NewsFeed
β’ Stories from your friends β the average user has 1500
candidate stories β so Facebook must show you a
subset β it scores them and shows those
with highest scores
β’ Facebook terminology β Object: a status update, a
comment, a fan page,β¦ β Edge: an interaction with an
object β’ creating it, liking it, commenting on
it, sharing it, hiding it, stopping notifications, friending a person, tagging a photo, adding a location to a photo, joining an event,β¦
EdgeRank
β’ An object is more likely to show up in your NewsFeed if people you know have been interacting with it recently
β’ Score will be higher β the better you know them
β the more they interact with it
β the more recent the interaction
From edgerank.net
04/03/2014
6
NewsFeed Prioritization
β’ EdgeRank β Affinity, π’π:
β’ a measure of your engagement with this friend or page
β Weight, π€π: β’ e.g. commenting has more
weight than liking β’ e.g. changes to relationship
status have high weight
β Time decay factor, ππ: β’ recency of activity
β’ Since 2010, Facebook uses 100,000 signals β family vs close friend vs
acquaintances vs friends of friends
β number of friends interacting with an object; total number of people interacting with an object
β type of post that you seem to interact with more
β device; speed of Internet connection
β β¦
WHO TO FRIEND OR FOLLOW
Facebook: People You May Know
β’ Social graph β friends and family β edges in the graph are
bidirectional because connections require consent
β’ Recommendations β ideally people you might
know, not total strangers β primarily based on number
of mutual friends that you interact with a lot
β’ From Facebook: β βWe show you people
based on mutual friends, work and education information, networks youβre part of, contacts youβve imported and many other factors.β
04/03/2014
7
Twitter: Who To Follow
β’ Interest graph
β more connections to sources of info than to friends & family
β edges in the graph are often unidirectional
β’ Recommendations
β can be βstrangersβ
β users you might be similar to
β users you might be interested in
General Link Prediction Methods
β’ These are methods based on the structure of the graph β e.g. nodes represent users β e.g. edges represent relationships (friend, follower) or
interactions β obviously cannot take into account extrinsic factors,
e.g. a user relocates and becomes a (physical) neighbour of another
β’ If there is no edge between nodes π’ and π£, these methods give a score β the missing edges with the highest scores are
recommended
General Link Prediction Methods
Shortest paths
β’ Find the shortest path in the graph from π’ to π£
β’ Calculate its length, π
β’ The score of the missing edge from π’ to π£ is negative π
Neighbourhoods
β’ Let ππππ (π’) and ππππ (π£) be the neighbours in the graph of π’ and π£
β’ The score of the missing edge from π’ to π£ is, e.g. β size of intersection: ππππ (π’) β© ππππ (π£)
β Jaccard: ππππ (π’)β©ππππ (π£)
ππππ (π’)βͺππππ (π£)
β preferential attachment: ππππ (π’) Γ ππππ (π£)
04/03/2014
8
General Link Prediction Methods
All paths
β’ Find all paths in the graph from π’ to π£
β’ Let ππ be the set of paths from π’ to π£ that are of length π
β’ Katz: the score of the missing edge from π’ to π£ is π½π Γ ππβπ=1
where 0 < π½ β€ 1
Random walks
β’ Inspired by PageRank
β’ Randomly surf
β starting at, e.g., π’
β count how many times you visit the other nodes such as π£
β use this as the score for the missing edge
WTF: Twitterβs Early Method
β’ First, find the active userβs circle-of-trust β found by a kind of
personalized PageRank β’ start at the active user β’ randomly walk the graph β’ teleportation takes you
back to the active user β’ score users by how often
they get visited
β’ Let C be the users in the circle-of-trust
β’ Let F be the users who are followed by users in C
β’ Next, use a random walk algorithm called SALSA β start at a user in C β repeatedly
β’ traverse a link to a user in F β’ traverse a link back to a
user in C
β’ Last, recommend users in C and F with high scores β users in C are similar to the
active user β users in F are interesting to
the active user
Twitterβs Current Method
β’ There are no real details! β’ It is a weighted hybrid (ensemble) of 20+ scoring methods
β dynamically adjusts the weighting
β’ As well as graph structure, they also use: β data about nodes (users), e.g. how many times a user has
retweeted β data about the edges, e.g. timestamps β other kinds of edges (interactions), e.g. retweets, favourites,
replied
β’ They could use, but probably donβt: β tweet content
β’ They certainly do: β lots of A/B testing!
04/03/2014
9
Ongoing Research
β’ Similarly, recent research papers have discussed data that is extrinsic to the graph
β Facebook: demographics
β LinkedIn: overlap in the time two people belonged to an organization, organization size, role,β¦
β’ Unclear how much of this is in their current algorithms
GROUP RECOMMENDER SYSTEMS
Consuming Items Together
04/03/2014
10
Aggregation of each Group Memberβs Predicted Rating
β’ Group recommenders:
β the most common approach:
Predict rating for each member
Aggregate predicted
ratings
Candidate
Items
Recommendations
Aggregate predicted
ratings
Aggregate, e.g. average
Predict rating for each member
User-based collaborative
filtering
Aggregation of each Group Memberβs Predicted Rating
β’ E.g.:
Group prediction
2
2 5
3
Rating Aggregation Methods
β’ Average:
β the mean predicted rating for the item
β’ Median:
β the middle predicted rating for the item
β’ Multiplicative:
β multiply the predicted ratings
β’ Least Misery:
β the minimum predicted rating
β’ Most Pleasure:
β the maximum predicted rating
β’ Borda count:
β sum the inverse ranks
04/03/2014
11
Issues in Aggregation
β’ Normalize each userβs ratings before prediction and aggregation
β’ Manipulation of outcome β if ratings are explicit ones,
users can see othersβ ratings, and they know the aggregation method
β how would you manipulate Least Misery?
β what about Median?
β on the other hand, in this scenario, you sometimes get conformity effects
β’ People expect fairness over time β if the same group gets a
subsequent recommendation, a member who βlost outβ first time should carry more weight
β but group membership often varies a bit over time
β’ Social factors influence the item the group will agree on β personalities
β relationships within the group
Issues in Aggregation
β’ A group recommender could anticipate social factors in the aggregation:
Group prediction
2
2 5
2.6
User-based collaborative
filtering
Aggregate, e.g. average
Social modification
of predictions
3
2 3
Issues in Group Recommenders
β’ Explanations β’ Active versus passive groups
β Active: β’ e.g. negotiating over a movie/ vacation
β Passive: β’ e.g. choosing music or temperature for a shared space such
as a shop, gym or office
β’ What does a rating mean? β my opinion of the item β my opinion of the item for this group β my opinion of the experience