MobiHoc 2014 - Kennesaw State University
MINIMUM-SIZED INFLUENTIAL NODE SET
SELECTION FOR
SOCIAL NETWORKS UNDER THE
INDEPENDENT CASCADE MODEL
MobiHoc 2014
Jing (Selena) He
Department of Computer Science, Kennesaw State University
Shouling Ji and Raheem Beyah
School of Electrical and Computer Engineering, Georgia Institute of Technology
Zhipeng Cai
Department of Computer Science, Georgia State University
INTRODUCTION
What is a social network?
The graph of relationships and interactions within a group of
individuals.
SOCIAL NETWORK AND SPREAD OF
INFLUENCE
Social networks play a fundamental
role as a medium for the spread of
INFLUENCE among their members
Opinions, ideas, information,
innovation…
Direct marketing uses “word-of-mouth”
effects to significantly increase profits
(Facebook, Twitter, Myspace, …)
MOTIVATION
• 900 million users, Apr. 2012
• The 3rd largest “country” in the world
• More visitors than Google
• Action: Update statuses, create events
• More than 4 billion images
• Action: Add tags, add favorites
• 2009, 2 billion tweets per quarter
• 2010, 4 billion tweets per quarter
• Action: Post tweets, retweet
Social networks have already become a bridge connecting
our real daily life and the virtual web space
MOTIVATION (CONT.)
• Modeling and tracking users’ actions in
social networks is a very important issue
and can benefit many real applications
– Advertising
– Social recommendation
– Expert finding
– Marketing
–…
APPLICATION
Who are the opinion
leaders in a community?
[Figure: example social network of marketer Alice and users George, Frank, Ada, Eve, David, Bob, and Carol]
Find a minimum-sized node (user) set in a social network
that can influence every node in the network
OUTLINE
Network Model
Model of influence
Minimum-sized Influence Node Set selection problem
Problem definition
Greedy Algorithm
Proof of performance bound
Experiments
Data and setting
Results
NETWORK MODEL
A social network is represented as an undirected graph
Nodes start either active or inactive
An active node may trigger activation of neighboring nodes based on a pre-defined threshold τ
Monotonicity assumption: active nodes never deactivate
MODEL OF INFLUENCE
If u1 is active, then the active node set I = {u1}
P1(I) = 1
P2(I) = 0.5
P3(I) = 0.7
P4(I) = 0.6
MODEL OF INFLUENCE
If u1 and u4 are active, then the active node set I = {u1, u4}
P1(I) = 1 – (1 – P11)(1 – P14) = 1
P2(I) = 1 – (1 – P21)(1 – P24) = 0.9
P3(I) = 1 – (1 – P31)(1 – P34) = 0.97
P4(I) = 1 – (1 – P41)(1 – P44) = 1
Pii = 1, if ui ϵ I
Pii = 0, otherwise
$P_i(I) = 1 - \prod_{u_j \in I} (1 - P_{ij}) \ge \tau$
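The calculation on this slide can be sketched in Python. This is a minimal illustration, not the authors' code. The matrix `P` holds the pairwise probabilities of the four-node example as inferred from the values on the slides (P12 = 0.5, P13 = 0.7, P14 = 0.6, P23 = 0.4, P24 = 0.8, P34 = 0.9), assumed symmetric because the graph is undirected. The name `influence_prob` and the 0-based indexing are assumptions made for this sketch.

```python
from math import prod

# Pairwise influence probabilities of the four-node example (0-based:
# row/column 0 is u1). The diagonal is left at 0 and handled in code,
# since Pii = 1 only when ui is in the active set.
P = [
    [0.0, 0.5, 0.7, 0.6],
    [0.5, 0.0, 0.4, 0.8],
    [0.7, 0.4, 0.0, 0.9],
    [0.6, 0.8, 0.9, 0.0],
]


def influence_prob(P, i, active):
    """Return Pi(I) = 1 - product over uj in I of (1 - Pij)."""
    if i in active:  # Pii = 1 when ui is in I
        return 1.0
    return 1.0 - prod(1.0 - P[i][j] for j in active)


I = {0, 3}  # u1 and u4 are active
print([round(influence_prob(P, i, I), 2) for i in range(4)])
# -> [1.0, 0.9, 0.97, 1.0]
```

The printed values match P1(I) through P4(I) on the slide for I = {u1, u4}.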
MINIMUM-SIZED INFLUENCE NODE SET SELECTION PROBLEM (MINS)
Given
a social network G = (V, E, P)
a threshold τ
Goal
The initially selected active node set, denoted
by I, influences every node in the
network with probability at least τ
$\forall u_i \in V,\ P_i(I) = 1 - \prod_{u_j \in I} (1 - P_{ij}) \ge \tau$
Objective
Minimize the size of I
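To make the problem statement concrete, here is a hedged sketch of a feasibility check and an exhaustive search on the four-node example. It is not part of the talk: the function names `is_feasible` and `exact_mins` are hypothetical, the matrix is inferred from the example slides, and the search is exponential in |V|, so it only suits tiny instances.

```python
from itertools import combinations
from math import prod

# Pairwise probabilities inferred from the four-node example (u1..u4 -> 0..3).
P = [
    [0.0, 0.5, 0.7, 0.6],
    [0.5, 0.0, 0.4, 0.8],
    [0.7, 0.4, 0.0, 0.9],
    [0.6, 0.8, 0.9, 0.0],
]


def is_feasible(P, active, tau):
    """Check that every node ui satisfies Pi(I) >= tau for the active set I."""
    for i in range(len(P)):
        p_i = 1.0 if i in active else 1.0 - prod(1.0 - P[i][j] for j in active)
        if p_i < tau - 1e-12:  # small tolerance for floating-point error
            return False
    return True


def exact_mins(P, tau):
    """Try subsets in order of increasing size; return the first feasible one."""
    n = len(P)
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if is_feasible(P, set(subset), tau):
                return set(subset)
    return set(range(n))  # not reached for tau <= 1, because I = V is feasible


print(exact_mins(P, 0.8))
```

With τ = 0.8 no single node is feasible, so the smallest feasible set on this example has two nodes.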
CONTRIBUTION FUNCTION
$f(I) = \sum_{u_i \in V} \min(P_i(I), \tau)$
Greedy algorithm
Initialize I = empty set
While f(I) < |V|τ do
Choose u to maximize f(I ∪ {u})
I = I ∪ {u}
End while
Return I
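The pseudocode above can be sketched in Python as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the names `contribution` and `greedy_mins` are hypothetical, nodes are 0-based indices into a symmetric probability matrix inferred from the four-node example, 0 < τ ≤ 1 is assumed, and ties go to the lowest node ID.

```python
from math import prod

# Pairwise probabilities inferred from the four-node example (u1..u4 -> 0..3).
P = [
    [0.0, 0.5, 0.7, 0.6],
    [0.5, 0.0, 0.4, 0.8],
    [0.7, 0.4, 0.0, 0.9],
    [0.6, 0.8, 0.9, 0.0],
]


def contribution(P, active, tau):
    """f(I) = sum over ui in V of min(Pi(I), tau)."""
    total = 0.0
    for i in range(len(P)):
        p_i = 1.0 if i in active else 1.0 - prod(1.0 - P[i][j] for j in active)
        total += min(p_i, tau)
    return total


def greedy_mins(P, tau, eps=1e-9):
    """Add the node maximizing f(I ∪ {u}) until f(I) reaches |V| * tau."""
    n = len(P)
    selected = []  # nodes in the order they were chosen
    while contribution(P, set(selected), tau) < n * tau - eps:
        candidates = [u for u in range(n) if u not in selected]
        # max() returns the first maximal element, so ties go to the lowest ID.
        best = max(candidates, key=lambda u: contribution(P, set(selected) | {u}, tau))
        selected.append(best)
    return selected


print([f"u{u + 1}" for u in greedy_mins(P, 0.8)])
# -> ['u4', 'u1']
```

The `eps` tolerance guards the stopping test against floating-point rounding. Each round re-evaluates f for every remaining node, which is simple but costly on large graphs.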
EXAMPLE
First round: I = empty set
Second round:
I = {u1}
f(I) = 0.8 + 0.5 + 0.7 + 0.6 = 2.6
I = {u2}
f(I) = 0.5 + 0.8 + 0.4 + 0.8 = 2.5
I = {u3}
f(I) = 0.7 + 0.4 + 0.8 + 0.8 = 2.7
I = {u4}
f(I) = 0.6 + 0.8 + 0.8 + 0.8 = 3.0
τ = 0.8
$f(I) = \sum_{u_i \in V} \min(P_i(I), \tau)$
EXAMPLE
Third round:
I = {u4 ,u1}
f(I) = 0.8 + 0.8 + 0.8 + 0.8 = 3.2
I = {u4 ,u2}
f(I) = 0.8 + 0.8 + 0.8 + 0.8 = 3.2
I = {u4 ,u3}
f(I) = 0.8 + 0.8 + 0.8 + 0.8 = 3.2
Use node ID to break the tie
I = {u4 ,u1}
The greedy algorithm stops, since
f(I) = |V|τ = 4 * 0.8 = 3.2.
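As a check on the tie in the third round, the candidate I = {u4, u2} can be worked out by hand with τ = 0.8. The pairwise probabilities used here (P12 = 0.5, P14 = 0.6, P23 = 0.4, P34 = 0.9) are inferred from the earlier example values and assumed symmetric, since the graph is undirected.

```latex
\begin{align*}
P_1(I) &= 1 - (1 - P_{12})(1 - P_{14}) = 1 - 0.5 \cdot 0.4 = 0.8 \\
P_3(I) &= 1 - (1 - P_{32})(1 - P_{34}) = 1 - 0.6 \cdot 0.1 = 0.94 \\
P_2(I) &= P_4(I) = 1 \\
f(I) &= \min(0.8, \tau) + \min(1, \tau) + \min(0.94, \tau) + \min(1, \tau) = 4 \cdot 0.8 = 3.2
\end{align*}
```

Every term is capped at τ, which is why all three candidates reach the same value of 3.2.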
THEORETICAL ANALYSIS
Theorem 1. The MINS selection problem is NP-hard.
SIMULATION SETTINGS
Random graphs are generated from the random graph
model G(n,p) = {G | G has n nodes, and an edge
between any pair of nodes is generated with
probability p}.
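A minimal sketch of sampling such a graph is shown below. The function name `random_social_graph` is hypothetical, and drawing each edge's influence probability uniformly from (0, 1] is an assumption made for illustration. The slides do not state how edge probabilities were assigned in the simulations.

```python
import random


def random_social_graph(n, p, seed=None):
    """Sample an undirected G(n, p) graph with an influence probability per edge."""
    rng = random.Random(seed)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:  # edge {ui, uj} exists with probability p
                # Assumption: weight drawn uniformly from (0, 1].
                w = 1.0 - rng.random()
                P[i][j] = P[j][i] = w  # symmetric, since the graph is undirected
    return P


P = random_social_graph(n=6, p=0.5, seed=42)
edges = sum(1 for i in range(6) for j in range(i + 1, 6) if P[i][j] > 0)
print(f"sampled {edges} edges out of 15 possible")
```

The result is a symmetric matrix in which a zero entry means no edge.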
EXPERIMENT DATA
Real-world data set: an academic coauthor network
extracted from the academic search system
Arnetminer [19].
Co-authorship networks arguably capture many
of the key features of social networks more
generally.
Resulting graph: 640,134 nodes (authors) and
1,554,643 distinct edges (coauthor relations)
RESULTS: SIMULATION
RESULTS: REAL DATA
CONCLUSIONS
We introduce a new optimization problem, named the Minimum-sized Influential Node Set (MINS) selection problem. We prove that it is an NP-hard problem under the independent cascade model.
We define a polymatroid contribution function, which suggests a greedy approximation algorithm. A comprehensive theoretical analysis of its performance ratio is given.
We conduct extensive experiments and simulations to validate our proposed greedy algorithm on both real-world coauthor data sets and random graphs.
FUTURE WORK
Study more realistic network models
Directed graphs
Study more general influence models
Deal with negative influences
Study the network evolution as time changes
Q & A