1
Yuxiao Dong*$, Jie Tang$, Sen Wu$, Jilei Tian#, Nitesh V. Chawla*, Jinghai Rao#, Huanhuan Cao#
Link Prediction and Recommendation across
Multiple Heterogeneous Networks
*University of Notre Dame $Tsinghua University #Nokia Research China
Yuxiao Dong, Jie Tang, Sen Wu, Jilei Tian, Nitesh V. Chawla, Jinghai Rao, Huanhuan Cao. Link Prediction and Recommendation across Heterogeneous Social Networks. In IEEE ICDM'12.
3
Topic: Transfer Link Prediction Framework
[Figure: source network, target network, general features, and transfer model]
5
Link Prediction and Recommendation
G = (V, E): social network
v_s: a particular user
C: candidates for v_s
Y: candidates' rank
Input: G, v_s, C
Output: f: (G, v_s, C) → Y
What is link prediction?
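The formulation above can be sketched as a ranking function. This is a minimal illustration, assuming the network is stored as a dict that maps each node to its set of neighbors; the names rank_candidates and shared_neighbors are hypothetical, and the scorer is only a placeholder, not the model from the slides.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Set

Graph = Dict[Hashable, Set[Hashable]]  # assumed representation: node -> set of neighbors


def rank_candidates(
    G: Graph,
    v_s: Hashable,
    C: Iterable[Hashable],
    score: Callable[[Graph, Hashable, Hashable], float],
) -> List[Hashable]:
    """Return the candidates C ordered from most to least likely link for v_s."""
    # Sort by descending score; ties are broken by repr() so the order is deterministic.
    return sorted(C, key=lambda c: (-score(G, v_s, c), repr(c)))


def shared_neighbors(G: Graph, u: Hashable, v: Hashable) -> float:
    """Placeholder scorer: number of neighbors that u and v have in common."""
    return float(len(G.get(u, set()) & G.get(v, set())))


# Toy undirected network (hypothetical data).
G = {
    "a": {"b", "c"},
    "b": {"a", "d", "e"},
    "c": {"a", "d"},
    "d": {"b", "c"},
    "e": {"b"},
}
Y = rank_candidates(G, "a", ["e", "d"], shared_neighbors)
```

Here Y plays the role of the output ranking: any scoring function with the same signature can be passed in place of the placeholder.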
7
Ranking Factor Graph Model (RFG)
[Figure: factor graph with latent variables, attribute factors, and social factors]
Joint distribution: attribute factors and social factors
8
Ranking Factor Graph Model
– Joint distribution: attribute factors and social factors
– Attribute factor: [equation]
– Social factor: [equation]
– Exponential-linear functions to initialize factors
Model Initialization
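One common reading of "exponential-linear" is a factor of the form exp(w · x), with x a feature vector and w a weight vector. The sketch below assumes that form. The exact factor definitions are not preserved in this transcript, so the feature layout and the names attribute_factor, social_factor, and unnormalized_score are hypothetical.

```python
import math
from typing import Sequence


def exp_linear(weights: Sequence[float], features: Sequence[float]) -> float:
    """Exponential-linear potential exp(w . x); the result is always positive."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return math.exp(sum(w * x for w, x in zip(weights, features)))


def attribute_factor(alpha, x_i, y_i: int) -> float:
    """Assumed attribute factor: ties candidate i's features x_i to its label y_i in {0, 1}."""
    return exp_linear(alpha, [y_i * x for x in x_i])


def social_factor(beta, g_c) -> float:
    """Assumed social factor over a group of labels, summarized by the features g_c."""
    return exp_linear(beta, g_c)


def unnormalized_score(alpha, beta, xs, ys, group_features) -> float:
    """Product of all factors; dividing by a partition function would give a probability."""
    score = 1.0
    for x_i, y_i in zip(xs, ys):
        score *= attribute_factor(alpha, x_i, y_i)
    for g_c in group_features:
        score *= social_factor(beta, g_c)
    return score
```

The partition function is left out on purpose: computing it exactly is the expensive part of learning in a factor graph, and this sketch only shows how the factors multiply.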
9
Ranking Factor Graph Model
– RFG objective function: [equation]
– Learning[1]: [equation]
1. Wenbin Tang, Honglei Zhuang, Jie Tang. Learning to Infer Social Ties in Large Networks. In ECML/PKDD'11, pp. 381-397.
Objective function and model learning
10
Still Problems?
– Unbalanced Data: the number of potential candidates grows exponentially (d(v_s)^(n-1)) as the number of hops n increases.
– Few Training Data: obtaining sufficient training data is difficult.
Challenges in traditional link prediction problem
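The growth of the candidate set with the hop limit can be checked with a breadth-first search. This is a sketch under assumptions: the network is an undirected dict of neighbor sets, candidates are taken to be all non-neighbors within n hops, and regular_tree is a hypothetical helper that builds toy data.

```python
from collections import deque


def candidates_within_hops(G, v_s, max_hops):
    """Return nodes whose shortest-path distance from v_s is between 2 and max_hops."""
    dist = {v_s: 0}
    queue = deque([v_s])
    while queue:
        u = queue.popleft()
        if dist[u] == max_hops:
            continue  # do not expand beyond the hop limit
        for w in G.get(u, set()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    # Distance 0 is v_s itself and distance 1 is an existing neighbor: neither is a candidate.
    return {v for v, d in dist.items() if d >= 2}


def regular_tree(branching, depth):
    """Hypothetical toy network: a rooted tree stored as an undirected adjacency dict."""
    G = {0: set()}
    frontier, next_id = [0], 1
    for _level in range(depth):
        new_frontier = []
        for parent in frontier:
            for _child in range(branching):
                G[parent].add(next_id)
                G[next_id] = {parent}
                new_frontier.append(next_id)
                next_id += 1
        frontier = new_frontier
    return G


tree = regular_tree(branching=3, depth=3)
sizes = [len(candidates_within_hops(tree, 0, n)) for n in (2, 3)]
print(sizes)  # [9, 36]
```

Going from two to three hops quadruples the candidate set in this toy tree, while the number of true future links does not grow with it, which is the imbalance the slide refers to.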
13
Transfer Link Prediction Framework
[Figure: source network, target network, general features, and transfer model]
14
General social factors
– What general factors drive people to make friends in the real world and form links in social networks?
– Homophily
– Social Balance
– Preferential Triad Closure
How do we create social connections?
15
General social factors
The principle of homophily suggests that users with similar characteristics tend to associate with each other.
Homophily / Social Balance / Preferential Triad Closure
1. The likelihood of two users creating a link increases when the number of their common neighbors increases in the four networks.
2. This effect of homophily is more pronounced when the number reaches 100, where the probabilities are all higher than 50% in the four networks.
16
Social balance theory is based on the principles that “the friend of my friend is my friend” and “the enemy of my enemy is my friend”.
Users are more likely (more than 80% likelihood) to establish balanced triangles of friendship in all four online networks.
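The two principles quoted above correspond to the classic rule that a signed triangle is balanced when the product of its edge signs is positive. The sketch below assumes undirected edges labeled +1 (friend) or -1 (enemy); how the slides define balance for unsigned or directed networks is not preserved in this transcript, and the names is_balanced and balanced_fraction are hypothetical.

```python
from itertools import combinations


def is_balanced(s_xy, s_yz, s_xz):
    """A signed triangle is balanced when the product of its three signs is positive."""
    return s_xy * s_yz * s_xz > 0


def balanced_fraction(signs):
    """signs maps frozenset({u, v}) -> +1 or -1; returns the share of balanced triangles."""
    nodes = sorted({n for edge in signs for n in edge})
    balanced = total = 0
    # Brute-force enumeration of node triples: fine for a toy example, too slow for large graphs.
    for x, y, z in combinations(nodes, 3):
        edges = [frozenset(p) for p in ((x, y), (y, z), (x, z))]
        if all(e in signs for e in edges):
            total += 1
            balanced += is_balanced(*(signs[e] for e in edges))
    return balanced / total if total else 0.0


# Toy signed network (hypothetical data).
signs = {
    frozenset(("a", "b")): +1,
    frozenset(("b", "c")): +1,
    frozenset(("a", "c")): +1,
    frozenset(("a", "d")): -1,
    frozenset(("b", "d")): -1,
    frozenset(("c", "d")): +1,
}
fraction = balanced_fraction(signs)
```

In this toy network the triangles a-b-c and a-b-d are balanced, while a-c-d and b-c-d are not.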
17
Preferential attachment and triadic closure are two basic models of the nature of human social interaction and agency on a local scale.
[Figure: example with two students, Lady Gaga, and Barack Obama]
Why?
18
The four networks share a very similar distribution over the probabilities of closed-triad formation in all six cases, even though the networks themselves are very different.
The enumeration is conditioned on whether X, Y, Z are opinion leaders (green marks an opinion leader).
20
Transfer Ranking Factor Graph Model
– RFG objective function: [equation]
– TRFG objective function: [equation]
– Attribute factors in the source network
– Attribute factors in the target network
– General social factors across source and target networks
Bridge source & target networks
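The bridging idea can be sketched as two per-network scores that keep separate attribute weights but share one vector of social-factor weights. This is an illustration only: the transcript does not preserve the objective function, so the function names, the feature layout, and the omission of the normalization term are all assumptions.

```python
def linear_score(weights, feature_rows):
    """Sum of w . x over all feature rows (the log of a product of exp-linear factors)."""
    return sum(sum(w * x for w, x in zip(weights, row)) for row in feature_rows)


def network_log_potential(attr_weights, social_weights, attr_rows, social_rows):
    """Unnormalized log-score of one network: attribute part plus social part."""
    return linear_score(attr_weights, attr_rows) + linear_score(social_weights, social_rows)


def transfer_log_potential(alpha_src, alpha_tgt, mu_shared, source, target):
    """Source and target keep their own attribute weights but share mu_shared."""
    return (
        network_log_potential(alpha_src, mu_shared, source["attr"], source["social"])
        + network_log_potential(alpha_tgt, mu_shared, target["attr"], target["social"])
    )


# Hypothetical toy features: the two networks may have different attribute dimensions,
# but their social-factor features must have the same layout to share weights.
source = {"attr": [[1.0, 0.0]], "social": [[1.0]]}
target = {"attr": [[2.0]], "social": [[3.0]]}
total = transfer_log_potential([0.5, 0.1], [0.25], [2.0], source, target)
```

Because mu_shared appears in both terms, evidence from the source network influences the social-factor weights that are applied to the target network.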
21
Experiment Setup
– Randomly select 2000 nodes as the source users[2] from the network.
– For each source user, we generate the candidate list for her/him.
1. Epinions, Slashdot, Wikivote are available at http://snap.stanford.edu and Twitter is available at http://arnetminer.org/reciprocal
2. Lars Backstrom, Jure Leskovec. Supervised Random Walks: Predicting and Recommending Links in Social Networks. In WSDM'11.
Network    #nodes    #edges    +edges   Description
Epinions   131,828   841,372   85%      Who-trust-whom online social website
Slashdot    82,144   549,202   78%      User community based technology news website
Wikivote     7,115   103,689   79%      Who-vote-whom network for admins in Wikipedia
Twitter     63,803   153,098   38%      Who-follow-whom micro-blogging network
Facebook     4,039    88,234   100%     Facebook friendship network
Networks and Candidate Generation
23
Experiment Setup
– Unsupervised methods
  • Common neighbors
  • Adamic/Adar
  • Jaccard Index
  • Preferential Attachment
– Supervised methods
  • SVMRank (SVM-light)
  • Logistic Regression (Weka)
  • Ranking Factor Graph Model
Baseline Predictors
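The four unsupervised baselines have standard textbook definitions, sketched here for an undirected network stored as a dict of neighbor sets. The storage format and the toy data are assumptions; the slides do not say how the baselines were implemented or how directed edges were handled.

```python
import math


def common_neighbors(G, u, v):
    """Number of neighbors shared by u and v."""
    return len(G[u] & G[v])


def jaccard(G, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = G[u] | G[v]
    return len(G[u] & G[v]) / len(union) if union else 0.0


def adamic_adar(G, u, v):
    """Shared neighbors weighted by 1 / log(degree), so rare neighbors count more."""
    # A shared neighbor of two distinct nodes has degree >= 2; the guard avoids log(1) = 0.
    return sum(1.0 / math.log(len(G[w])) for w in G[u] & G[v] if len(G[w]) > 1)


def preferential_attachment(G, u, v):
    """Product of the two degrees."""
    return len(G[u]) * len(G[v])


# Toy undirected network (hypothetical data).
G = {
    "a": {"b", "c"},
    "b": {"a", "d", "e"},
    "c": {"a", "d"},
    "d": {"b", "c"},
    "e": {"b"},
}
scores_ad = (
    common_neighbors(G, "a", "d"),
    jaccard(G, "a", "d"),
    adamic_adar(G, "a", "d"),
    preferential_attachment(G, "a", "d"),
)
```

Each function returns a score for a single (u, v) pair; ranking the candidates of a source user then amounts to sorting them by one of these scores.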
26
Results
Transfer: multiple source networks to one target network
[Figure: results with 4, 3, and 2 source networks]
27
Summary
– Study the novel problem of Transfer Link Prediction across Multiple Heterogeneous Networks
– Propose a transfer ranking factor graph model that leverages the observed general factors, and demonstrate its effectiveness on five real-world networks
– Discover general social theories of link formation across networks, including homophily, social balance, and preferential triadic closure