1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer...

23
1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM TJ Watson Research Center, USA 3 Huazhong University of Science and Technology, China 4 Beijing University of Aeronautics and Astronautics, China Social Action Tracking via Noise Tolerant Time-varying Factor Graphs
  • date post

    19-Dec-2015
  • Category

    Documents

  • view

    218
  • download

    4

Transcript of 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer...

Page 1: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

1

1Chenhao Tan, 1Jie Tang, 2Jimeng Sun, 3Quan Lin, 4Fengjiao Wang

1Department of Computer Science and Technology, Tsinghua University, China2IBM TJ Watson Research Center, USA

3Huazhong University of Science and Technology, China4Beijing University of Aeronautics and Astronautics, China

Social Action Tracking via Noise Tolerant Time-varying Factor Graphs

Page 2: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

3

Motivation

• 500 million users• the 3rd largest “Country” in the world• More visitors than Google• Action: Update statues, create event

• More than 4 billion images• Action: Add tags, Add favorites

• 2009, 2 billion tweets per quarter• 2010, 4 billion tweets per quarter• Action: Post tweets, Retweet

Page 3: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

4

User Action in Social Networks

Add photo to her favorites

Post tweets on “Haiti

Earthquake”

Publish in KDD

Conference

Twitter Flickr Arnetminer

Page 4: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

5

User Action in Social Networks

Questions:- What factors influence you to add a photo into your favorite list?

- If you post a tweet on “Haiti Earthquake”, will your friends retweet it or reply?

Challenge: - How to track and model users’ actions? - How to predict users’ actions over time?

Page 5: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

6

John

Time t

John

Time t+1

Action Prediction:Will John post a tweet on “Haiti Earthquake”?

Attributes:1. Always watch news2. Enjoy sports3. ….

Influence1

Personal attributes4 Dependence2

Complex Factors

Correlation3

Page 6: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

7

Problem formulation

Gt =(Vt, Et, Xt, Yt)

Input: Gt =(Vt, Et, Xt, Yt)

t = 1,2,…T

Output:

F: f(Gt) ->Yt

Nodes at time t

Edges at time t

Attribute matrix at time t

Actions at time t

Page 7: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

8

NTT-FGM Model

Continuous latent action statePersonal attributes

Correlation

Dependence

Influence

Action Personal attributes

Page 8: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

9

How to estimate the parameters?

Model Instantiation

Page 9: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

10

Model Learning

Extremely time costing!!

Our solution: distributed learning (MPI)

Page 10: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

12

• Data Set

• Baseline– SVM– wvRN (Macskassy, 2003)

• Evaluation Measure:Precision, Recall, F1-Measure

Action Nodes #Edges Action Stats

Twitter Post tweets on “Haiti Earthquake”

7,521 304,275 730,568

Flickr Add photos into favorite list

8,721 485,253 485,253

Arnetminer Issue publications on KDD

2,062 34,986 2,960

Experiment

Page 11: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

13

Performance Analysis

Page 12: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

14

Factor Contribution Analysis

• NTT-FGM: Our model• NTT-FGM-I: Our model ignoring influence• NTT-FGM-CI: Our model ignoring influence and correlation

Page 13: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

15

Efficiency Performance

Page 14: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

16

Conclusion• Formally formulate the problem of social

action tracking

• Propose a unified model: NTT-FGM to simultaneously model various factors

• Present an efficient learning algorithm and develop a distributed implementation

• Validate the proposed approach on three different data sets, and our model achieves a better performance

Page 15: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

17

Thank you!QA?

Data & Code: http://arnetminer.org/stnt

Welcome to our poster!

Page 16: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

18

Statistical Study: InfluenceY-axis: the likelihood that the user also performs the action at t

X-axis: the percentage of one’s friends who perform an action at t − 1

Page 17: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

19

Statistical Study: DependenceY-axis: the likelihood that a user performs an action

X-axis: different time windows

Page 18: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

20

Statistical Study: CorrelationY-axis: the likelihood that two friends(random) perform an action together

X-axis: different time windows

Page 19: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

21

Appendix

Page 20: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

22

Appendix

Page 21: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

23

Appendix

Page 22: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

24

Prediction

• Based on the learning parameters we just need to solve the following equations:

Page 23: 1 1 Chenhao Tan, 1 Jie Tang, 2 Jimeng Sun, 3 Quan Lin, 4 Fengjiao Wang 1 Department of Computer Science and Technology, Tsinghua University, China 2 IBM.

25

Latent State Analysis

Action Bias Factor: f(y12|z1

2)

Influence Factor: g(z11,z1

2)

Correlation Factor: h(z12,z2

2), h(z12,x1

2)