Suggesting Friends Using the Implicit Social Graph · Suggesting Friends Using the Implicit Social...
Transcript of Suggesting Friends Using the Implicit Social Graph · Suggesting Friends Using the Implicit Social...
Suggesting Friends Using the Implicit Social Graph
Title
Yusuf Simonson
The Social Graph
Gmail’s Social Graph
The Problem
• Users do not explicitly maintain group listsUsers do not explicitly maintain group lists– Membership changes dynamicallyToo time consuming and tedious– Too time consuming and tedious
The Implicit Social Graph
• Billions of vertices – one for each emailBillions of vertices one for each email address
• A group is a unique combination of one or• A group is a unique combination of one or more contacts with whom a user has interacted with in an email threadinteracted with in an email thread
The Implicit Social Graph
• Group membership represented throughGroup membership represented through edges
• An active user has on average 350 groups• An active user has on average 350 groups• Groups have a mean size of 6• Edges have direction and weight
Interaction Weight
• Iout = set of outgoing interactionsout g g• Iin = set of incoming interactions• t = current time• tnow = current time• t(i) = timestamp of interaction i• λ = half‐life of interaction weights
Use Cases
• “Don’t forget Bob”Don t forget Bob• “Got the wrong Bob?”
Restrictions
• Observability only of a user’s egocentricObservability only of a user s egocentric network
• Message contents are not included• Message contents are not included
Core Routine
• S = set of contactsS = set of contacts that make up the group to begroup to be expanded
• F = map of friend• F = map of friend sugges ons → confidence scoreconfidence score
UpdateScore Implementation #1
• Sums IR scores of all groups g pwith overlap to the seed
• Does not consider the d f i il it f thdegree of similarity of the seed group to candidate groups
• Biased toward groups with high IR, users in many groups
UpdateScore Implementation #2
• Scores weighted byScores weighted by similarity of the seed group to thegroup to the candidate group
UpdateScore Implementation #3
• Counts the numberCounts the number of groups a contact belongs to that havebelongs to that have any overlap with the seedseed
• Does not use IRBi d d• Biased toward users in many groups
UpdateScore Implementation #4
• Sum of IR scores forSum of IR scores for all the groups the candidate usercandidate user belongs to
• Biased toward• Biased toward frequently contacted usersusers
Evaluation
• Randomly sampled real email trafficRandomly sampled real email traffic• Removed traffic from suspected bots and inactive usersinactive users
• Sample a few contacts from each group for h dthe seed
• Measure ability of algorithms to account for the rest of the group
Results
Results
Results
“Don’t Forget Bob”
• Straightforward implementation of FriendStraightforward implementation of Friend Suggest algorithm
“Got the wrong Bob?”
• Iterate through eachIterate through each recipient
• Find similarly named• Find similarly named recipientsIf h i• If their score > current recipient’s,
if hnotify the user
“Got the wrong Bob?”
Potential Applications
• Photo document sharing sitesPhoto, document sharing sites• IM communicationO li l d i i i• Online calendar invitations
• Comments on blog posts• Text messaging, phone activity
Future Work
• Study relative importance of differentStudy relative importance of different interaction types to determine social relationshipsrelationships
• Use of Friend Suggest to identify trusted usersR d ti– Recommendations
– Content sharing
Q&AQ&A