
Discovering Leaders from Community Actions

Amit Goyal1

Francesco Bonchi2

Laks V.S. Lakshmanan1

Oct 27, 2008

Context & Motivations: Viral Marketing


Word of Mouth and Viral Marketing

We are more influenced by our friends than by strangers

68% of consumers consult friends and family before purchasing home electronics (Burke 2003)

Amit Goyal (University of British Columbia) http://cs.ubc.ca/~goyal/


Viral Marketing

Also known as Targeted Advertising

Initiate a chain reaction through the word-of-mouth effect

Low investment, maximum gain


Viral Marketing as an Optimization Problem

Given: a network with influence probabilities

Problem: select the top-k leaders such that, by targeting them, the spread of influence is maximized

Hao Ma et al 2008, Domingos et al 2001, Richardson et al 2002, Kempe et al 2003

How to calculate true influence probabilities?


A pattern mining approach

We propose a completely different approach based on frequent pattern mining.

We focus on the actions performed by users:
• Joining a community (as in a flickr/facebook community)
• Rating a song or a movie (as in Y! Music, Y! Movie)

Importance of time in which actions are performed

Assumption: Users can see their friends’ actions


Our Contributions

Formally define the notion of leaders and its various flavors

Efficient algorithms for extracting these leaders

Demonstrate the utility and scalability of our algorithms via an extensive set of experiments on a real-world dataset:
• Yahoo! Messenger (social graph)
• Yahoo! Movies ratings (actions log)


Rest of the talk

Framework definition:
• Influence propagation on the social network
• Various notions of leaders

Algorithms

Experiments

Related Work

Conclusion


Framework Definition


Input Data (1)

A social network, i.e., an undirected graph G=(V,E) where nodes are users and edges represent social ties.

Users declare their friends, e.g., on Facebook or Yahoo! Messenger.


Input Data (2)

An actions log sorted in chronological order, i.e., a relation

Actions(User, Action, Time)

Example: Jack joined Yoga community at time 5

Assumption: Users can see their friends’ actions (feeds)
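A minimal sketch of how the Actions(User, Action, Time) relation could be held in memory, assuming integer timestamps. The class name `ActionTuple` and the helper `performed_at` are hypothetical; the Jill and Mary rows come from the talk's running Yoga example.

```python
from typing import NamedTuple


class ActionTuple(NamedTuple):
    """One row of the Actions(User, Action, Time) relation."""
    user: str
    action: str
    time: int


# Rows may arrive in any order; the log is kept sorted chronologically.
log = [
    ActionTuple("Mary", "join Yoga", 1000),
    ActionTuple("Jack", "join Yoga", 5),
    ActionTuple("Jill", "join Yoga", 8),
]
log.sort(key=lambda row: row.time)


def performed_at(rows, action):
    """Map each user to the time at which they performed `action`."""
    return {row.user: row.time for row in rows if row.action == action}


yoga_times = performed_at(log, "join Yoga")
```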


Action Propagation

Jack and Jill are friends. Jack and Mary are friends. The action is “Joining the Yoga community”.

Jack joined the Yoga community at time 5, Jill at time 8, and Mary at time 1000.

The action propagated from Jack to Jill (after 3 time units) and from Jack to Mary (after 995 time units).
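The propagation rule on this slide can be sketched as follows: an action propagates from u to v when the two are friends and u performed it before v. The function name `propagation_edges` is hypothetical, and treating simultaneous actions as "no propagation" is an assumption.

```python
def propagation_edges(friendships, times):
    """Return {(u, v): elapsed} for each pair of friends where u acted before v.

    friendships: iterable of 2-element sets (undirected social ties)
    times:       user -> time at which the user performed the action
    """
    edges = {}
    for pair in friendships:
        u, v = sorted(pair)
        if u not in times or v not in times:
            continue  # at least one of the two friends never performed the action
        if times[u] == times[v]:
            continue  # assumption: simultaneous actions do not propagate
        if times[u] > times[v]:
            u, v = v, u  # orient the edge from the earlier to the later user
        edges[(u, v)] = times[v] - times[u]
    return edges


friendships = [{"Jack", "Jill"}, {"Jack", "Mary"}]
times = {"Jack": 5, "Jill": 8, "Mary": 1000}
edges = propagation_edges(friendships, times)
```

Each edge carries the elapsed time, which is what a time threshold is later compared against.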


Propagation Graph

[Figure: propagation graph for “Joining the Yoga community”. Jack joined at time 5, Jill at time 8, and Mary at time 1000; Joey and Ben joined at times 12 and 15.]

Can we say that Mary was influenced by Jack? No.


User Influence Graph

When an action propagates from user u to user v, we may think of v as being influenced by u

Influence should decay in time

Size of influence graph << Size of PG

[Figure: two panels for the Yoga community example. Propagation Graph: Jack (time 5), Jill (time 8), Mary (time 1000), and Joey and Ben (times 12 and 15). User Influence Graph for Jack: the same graph without Mary.]
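One way to read the influence graph of a user is as the part of the propagation graph reachable from that user within a time window π. The sketch below follows that reading. The friendship edges and the assignment of times 12 and 15 to Joey and Ben are assumptions made for illustration, since the figure itself did not survive; the function name `influence_set` is hypothetical, and the window is treated as inclusive.

```python
from collections import deque


def influence_set(user, friends, times, pi):
    """Users reachable from `user` along time-increasing friendship edges
    who acted at most `pi` time units after `user`."""
    deadline = times[user] + pi
    reached = set()
    queue = deque([user])
    while queue:
        u = queue.popleft()
        for v in friends.get(u, ()):
            if v in reached or v not in times:
                continue
            # Follow the edge only forward in time and inside the window.
            if times[u] < times[v] <= deadline:
                reached.add(v)
                queue.append(v)
    return reached


# Assumed social ties (undirected, stored as adjacency sets).
friends = {
    "Jack": {"Jill", "Mary", "Joey"},
    "Jill": {"Jack", "Joey", "Ben"},
    "Joey": {"Jack", "Jill", "Ben"},
    "Ben": {"Jill", "Joey"},
    "Mary": {"Jack"},
}
times = {"Jack": 5, "Jill": 8, "Joey": 12, "Ben": 15, "Mary": 1000}

jack_followers = influence_set("Jack", friends, times, pi=10)
```

With π = 10, Mary falls outside the window, so Jack's influence graph is much smaller than the full propagation graph.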


Leaders – first definition

Who should be a leader?
• For an action, a leader should influence a sufficiently large number of users ( > ψ )
• For an action, a leader should influence these users within a reasonable amount of time ( < π )
• A leader should act as a leader in a sufficiently large number of actions ( > σ )

If ψ = 2, π = 15, σ = 1, then both Jack and Jill are leaders
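The thresholds can be checked against per-action follower counts. This is a minimal sketch, assuming the counts (already restricted to the time window π) sit in a dictionary keyed by (user, action), and assuming inclusive comparisons, since those are what let both Jack and Jill qualify in the example. The function name `find_leaders` and the counts below are hypothetical.

```python
def find_leaders(follower_counts, psi, sigma):
    """Users who reach at least `psi` followers in at least `sigma` actions.

    follower_counts: (user, action) -> number of users influenced within pi
    """
    actions_led = {}
    for (user, _action), count in follower_counts.items():
        if count >= psi:
            actions_led[user] = actions_led.get(user, 0) + 1
    return {user for user, led in actions_led.items() if led >= sigma}


# Illustrative counts for a single action.
follower_counts = {
    ("Jack", "join Yoga"): 3,
    ("Jill", "join Yoga"): 2,
    ("Joey", "join Yoga"): 1,
    ("Ben", "join Yoga"): 0,
}

leaders = find_leaders(follower_counts, psi=2, sigma=1)
```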

[Figure: the Yoga community propagation graph (Jack at time 5, Jill at time 8, Mary at time 1000, Joey and Ben at times 12 and 15), with edges labelled by elapsed time: 3, 7, 4, 7, 3 and 995.]


Tribe Leader

A leader may influence different users for different actions

What if a leader leads a fixed set of users for different actions?

We call such leaders Tribe Leaders

Tribes can be considered small communities

[Figure: Jack leading the same set of users for actions A1, A2 and A3, which are three different actions.]


Additional Constraint: Genuineness

It may happen that a user acts as a leader but is in fact always a follower of other leaders

We want to avoid this kind of fake leader.

[Figure: Tom, Jill and Jack linked by actions A1, A2 and A3, which are three different actions; gen(Jill) = 1/3.]

Another constraint: confidence
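The slide does not spell out the formula for genuineness. The sketch below assumes it is the fraction of a user's led actions in which the user was not influenced by another leader of the same action, a reading that reproduces gen(Jill) = 1/3. The function name, the data layout and the example values are all hypothetical.

```python
from fractions import Fraction


def genuineness(user, leaders_of, followers_of):
    """Fraction of the actions led by `user` in which no other leader influenced `user`.

    leaders_of:   action -> set of users who lead that action
    followers_of: (leader, action) -> set of users influenced by that leader
    """
    led = [a for a, leaders in leaders_of.items() if user in leaders]
    if not led:
        raise ValueError(f"{user} leads no action")
    genuine = 0
    for action in led:
        dominated = any(
            user in followers_of.get((other, action), set())
            for other in leaders_of[action]
            if other != user
        )
        if not dominated:
            genuine += 1
    return Fraction(genuine, len(led))


# Assumed scenario: Jill leads A1, A2 and A3, but follows Tom on A1 and Jack on A2.
leaders_of = {"A1": {"Tom", "Jill"}, "A2": {"Jack", "Jill"}, "A3": {"Jill"}}
followers_of = {("Tom", "A1"): {"Jill"}, ("Jack", "A2"): {"Jill"}}

gen_jill = genuineness("Jill", leaders_of, followers_of)
```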

Algorithms

But how will I discover the leaders?


Algorithms: Overview

Assumptions:
• The social graph is huge: millions of nodes
• The actions log is huge: millions of tuples
• For an action, size of the user Influence Graph << size of the Propagation Graph, for all users

Our algorithms are able to extract the patterns (leaders and tribe leaders) in no more than one scan of the action log table.


Algorithms: Overview

Scan the action log table with a window of size π, backward in time, i.e., starting from the most recent timestamp (the bottom of the table if we assume tuples to be ordered by time).

Efficiently compute the influence matrix, i.e., a Users x Actions matrix in which IMπ(u, a) is the number of users influenced by u w.r.t. action a within time π

Compute leaders from IM

[Figure: user influence graph for Jack (Jack at time 5, Jill at time 8, Joey and Ben at times 12 and 15).]

IM10(Jack, “joining yoga community”) = 3


Computing Influence Matrix (1)

We use a bit vector to track which users are influenced by a given user. It is updated incrementally.

A locking mechanism uses another bit vector: 0 => free bit; 1 => occupied bit

The node-to-bit-index mapping is stored in a queue. Bits must be dynamically allocated.

Time window on the propagation graph: nodes R, S, T, W, V

Node  InfVec
R     01010111
S     01000110
T     00010110
W     00000110
V     00000100

Queue (head first): (V,2) (W,1) (T,4) (S,6) (R,0)

Lock bit vector: 01010111


Computing Influence Matrix (2)

Slide up the current window: delete node V
• Delete the entry from the queue
• Update the lock
• Update the influence vectors

State after deleting V (bit 2 is freed and cleared everywhere):

Node  InfVec
R     01010011
S     01000010
T     00010010
W     00000010

Queue (head first): (W,1) (T,4) (S,6) (R,0)

Lock bit vector: 01010011


Computing Influence Matrix (3)

New node P added
• Issue a lock, add an entry to the queue
• Compute its influence vector by propagation
• Number of followers of P = 4, so IM(P,a) = 4

State after adding P (P reuses the freed bit 2):

Node  InfVec
P     01010111
R     01010011
S     01000010
T     00010010
W     00000010

Queue (head first): (W,1) (T,4) (S,6) (R,0) (P,2)

Lock bit vector: 01010111
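The three steps above (slide the window, issue a lock, propagate vectors) can be sketched for a single action using Python integers as bit vectors. This is an illustration rather than the authors' implementation. The function name, the lowest-free-bit allocation policy, the timestamps and the friendship edges are assumptions; the edges were chosen to be consistent with the influence vectors shown on the slides.

```python
from collections import deque


def follower_counts(rows, friends, pi):
    """Scan one action's (user, time) rows backward in time; return user -> followers within pi."""
    lock = 0          # bit i set => bit i is occupied by a node in the window
    window = deque()  # (user, time, bit); the head holds the most recent timestamp
    infvec = {}       # user -> bits of the user itself and everyone it influences
    counts = {}
    for user, t in sorted(rows, key=lambda row: row[1], reverse=True):
        # Slide the window: drop nodes more than pi time units after `user`.
        while window and window[0][1] > t + pi:
            old, _, old_bit = window.popleft()
            lock &= ~(1 << old_bit)       # release the lock
            del infvec[old]
            for u in infvec:              # clear the freed bit everywhere
                infvec[u] &= ~(1 << old_bit)
        # Issue a lock on the lowest free bit.
        bit = 0
        while (lock >> bit) & 1:
            bit += 1
        lock |= 1 << bit
        # Propagate: OR in the vectors of friends that acted later.
        vec = 1 << bit
        for v, tv, _ in window:
            if v in friends.get(user, ()) and tv > t:
                vec |= infvec[v]
        infvec[user] = vec
        window.append((user, t, bit))
        counts[user] = bin(vec).count("1") - 1  # exclude the user's own bit
    return counts


# Hypothetical timestamps; V drops out of the window when P is processed.
rows = [("P", 1), ("R", 2), ("S", 3), ("T", 4), ("W", 5), ("V", 12)]
friends = {
    "P": {"R"}, "R": {"P", "S", "T"}, "S": {"R", "W"},
    "T": {"R", "W"}, "W": {"S", "T", "V"}, "V": {"W"},
}
counts = follower_counts(rows, friends, pi=10)
```

Because every node still in the window lies within π of the node being added, a plain OR over its later-acting friends yields exactly its followers within π.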


Mining Tribe Leaders

The Influence Matrix is not enough

We use an influence cube: Users x Actions x Users

ICπ(u,a,v) = 1, when user v is influenced by user u for action a within time π

We do not explicitly compute the whole cube due to sparsity.

The problem is the same as discovering the existence of frequent itemsets of size larger than a given threshold
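To make the frequent-itemset connection concrete: treat each action led by u as a transaction whose items are u's followers for that action. A tribe exists when some set of at least ψ followers appears in at least σ transactions. The brute-force check below is only a sketch of that equivalence, not the optimized method used in the talk; the function name, the inclusive thresholds and the follower names are hypothetical.

```python
from itertools import combinations


def has_tribe(followers_by_action, psi, sigma):
    """True if at least `psi` users follow the leader together in at least `sigma` actions.

    followers_by_action: action -> set of users influenced by the leader
                         (the non-empty slices of the sparse influence cube)
    """
    transactions = list(followers_by_action.values())
    if sigma < 1 or len(transactions) < sigma:
        return False
    # An itemset of size >= psi has support >= sigma exactly when some
    # sigma transactions share at least psi items.
    return any(
        len(set.intersection(*combo)) >= psi
        for combo in combinations(transactions, sigma)
    )


jack_slices = {
    "A1": {"Ann", "Bob", "Cid"},
    "A2": {"Ann", "Bob", "Dee"},
    "A3": {"Ann", "Bob"},
}
```

This enumeration is exponential in the number of actions, so it is only suitable for small examples.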


Algorithms - Final Comments

The only truly mandatory threshold is π (time threshold)

Influence Matrix: O(TAn^2) bit-level operations
• T = total number of tuples in the action log
• A = total number of distinct actions
• n = maximum number of nodes visible in any position of the time window
• n << N, where N is the total number of users

Tribe Leaders:
• Influence Cube: O(TAn^2)
• Finding the existence of frequent itemsets: exponential in the number of followers
• But very fast due to optimizations (Bonchi 2003)


Experiments

Enough talking, show me the results, dude!


Data Preparation

Data
• Social graph: Yahoo! Instant Messenger
• Actions log: Yahoo! Movies
• Action = user u rated movie m at time t
• The two sources were joined through common user identifiers

Started from Yahoo! Instant Messenger subgraph of “most active” users (110M nodes) and 21M ratings from Yahoo! Movies.

Ended with 217.5K nodes, 221.4K edges and 1.8M ratings.


Data characteristics: connected components

Giant component: 94K users (43.2% of connected users)

Total 46,650 connected components


Leaders Vs. Tribe leaders

π: threshold on time
σ: threshold on number of actions
ψ: threshold on number of influenced users


Number of leaders found


Run-time


Genuineness: an almost binary concept!


Top-10 tribe leaders w.r.t. tribe size

• Tribe leaders exhibit high confidence.

• Tribe leaders with low genuineness were found dominated by other tribe leaders present in the tables.

• We found many users acting as leader in many actions but not being a tribe leader.


Related Work (1)

Identifying influential users: Domingos et al 2001, Richardson et al 2002, Kempe et al 2005

Identifying influential bloggers: Agarwal et al 2008

Identifying communities in Social Networks: Hopcroft et al 2003, Kumar et al 2006, Backstrom et al 2006, Tantipathananandh et al 2007, Huang et al 2008, Friedland et al 2007


Related Work (2)

Influence and Correlation in Social Networks: Anagnostopoulos et al 2008

Revenue maximization: Hartline et al 2008

Near optimal sensor placement for outbreak detection: Leskovec et al 2007

Heat Diffusion Model: Hao Ma et al 2008 (CIKM)


Conclusions

Proposed framework based on frequent pattern mining for discovering leaders in social networks

Formally define the problem of extracting leaders from the social graph and actions log.
• Various notions of leader, tribe leader
• Their confidence and genuineness variants

Efficient algorithms for extracting leaders of various flavors
• Just one pass over the actions log table

Demonstrate the utility and scalability of our algorithms via an extensive set of experiments on a real-world dataset:
• Yahoo! Messenger (social graph)
• Yahoo! Movies ratings (actions log)


Ongoing/Future Work

Gurumine: Pattern Mining System for Discovering Leaders and Tribes (Demo paper to appear in ICDE 2009)

Leadership Cube: What kind of leaders attract what kind of followers for what kind of actions?

Viral Marketing

Stronger notions of influence?


Thanks!


Backup


Additional constraint: confidence

Similarly to association rules, we can have a confidence measure for leaders.

Leadership confidence = # actions in which the user is a leader / # actions performed

Example: Let's say Jack performed 10 actions and acted as a leader in 7 of them (i.e., more than ψ users followed within a short time); then conf(Jack) = 7/10
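The confidence ratio is simple enough to compute directly. A minimal sketch, assuming each user's performed and led actions are available as sets; the function name and the action labels are hypothetical.

```python
from fractions import Fraction


def leadership_confidence(actions_performed, actions_led):
    """Fraction of the performed actions in which the user acted as a leader."""
    if not actions_performed:
        raise ValueError("the user performed no actions")
    # Only actions the user actually performed can count as led.
    led = actions_led & actions_performed
    return Fraction(len(led), len(actions_performed))


# Jack performed 10 actions and acted as a leader in 7 of them.
jack_performed = {f"A{i}" for i in range(1, 11)}
jack_led = {f"A{i}" for i in range(1, 8)}

conf_jack = leadership_confidence(jack_performed, jack_led)
```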
