Analysis of the Evolution of Events on Online Social Networks
Introduction Results. Event analysis Discussion Conclusions
E. del Val, M. Rebollo and V. Botti
Grupo Tec. Inform.-Inteligencia Artificial, Universitat Politècnica de València
CSS 2014
@mrebollo, UPV. Analysis of the Evolution of Events on Online Social Networks
Problem
Analysis of user behavior
How user interactions evolve in a social network associated with a scheduled event taking place in the real world.
Four types of events are considered: TV shows, socio-political events, conferences, and keynotes.
The Hypothesis
There are significant differences in the behavior of the participants in each type of event.
Data set
Tweets containing the hashtag of the event are retrieved
Interaction-based network: mentions, replies and retweets
Focus on scheduled events
Temporally annotated network (nodes and links)
Empirical analysis
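A minimal sketch of how such a temporally annotated interaction network could be assembled. The `Tweet` record and its field names (`user`, `time`, `kind`, `target`) are hypothetical, and keeping authors of global tweets as isolated nodes is an assumption rather than the authors' documented procedure.

```python
from collections import namedtuple

# Hypothetical tweet record; the field names are assumptions, not the authors' schema.
Tweet = namedtuple("Tweet", "user time kind target")

def build_network(tweets):
    """Return (nodes, links): dicts mapping each node and each directed link
    to the time of its first appearance. Global tweets only add a node."""
    nodes, links = {}, {}
    for tw in sorted(tweets, key=lambda t: t.time):
        nodes.setdefault(tw.user, tw.time)
        if tw.kind in ("mention", "reply", "retweet") and tw.target is not None:
            nodes.setdefault(tw.target, tw.time)
            links.setdefault((tw.user, tw.target), tw.time)
    return nodes, links

def snapshot(nodes, links, t):
    """Cumulative network with everything first seen at or before time t."""
    return ({n for n, t0 in nodes.items() if t0 <= t},
            {e for e, t0 in links.items() if t0 <= t})

tweets = [
    Tweet("ana", 1, "global", None),
    Tweet("bob", 2, "mention", "official"),
    Tweet("official", 3, "reply", "bob"),
    Tweet("cat", 4, "retweet", "official"),
]
nodes, links = build_network(tweets)
nodes_at_2, links_at_2 = snapshot(nodes, links, 2)
```

Storing the first-appearance time on both nodes and links lets any snapshot be rebuilt by filtering, which is one simple way to track how a network evolves.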
Events
TV shows: La voz, Topchef, BreakingBad, OperacionPalace
Keynotes: applekeynote, nuevosiphone, innovationreinvented (Nokia)
Socio-political: lomce, viacatalana, EU elections debate
Conferences: TEDValencia, seo4seos, Twitter awards 2013, InternetOfThings forum, CW'13 (web conference)
Event characterization
Event                   N        E        clust.  d   path  comp.   k     m     GC      %sl
#lomce #24O             61,653   97,570   0.07    27  6.93  8,088   1.58  0.73  49,164  10.80
#viacatalana            41,166   76,094   0.07    23  8.28  8,705   1.85  0.56  30,745  12.19
#topchef12              26,044   27,155   0.05    25  8.74  10,689  1.04  0.66  12,794  12.29
#lavoz                  45,914   39,891   0.07    9   2.11  19,398  0.87  0.81  17,941  6.95
#breakingbad            151,473  120,661  0.05    13  3.24  67,060  0.80  0.72  71,130  14.26
#operacionPalace        107,606  195,470  0.08    23  8.26  19,504  1.82  0.49  85,768  15.06
#applekeynote           3,367    1,729    0.04    4   1.27  1,827   0.51  0.93  700     1.53
#nuevosiPhone           9,509    10,600   0.05    9   2.09  1,227   1.12  0.62  7,799   6.47
#innovationreinvented   110      95       0.04    3   1.34  32      0.86  0.83  26      10.00
#IoTWF                  4,150    17,545   0.22    9   3.96  330     4.23  0.42  3,680   30.83
#cw13                   1,051    2,608    0.20    9   3.90  60      2.48  0.56  966     19.76
#seo4seos               367      1,474    0.35    6   2.91  16      4.02  0.30  347     31.07
#tedxvalencia           325      843      0.17    8   3.60  45      2.59  0.32  276     8.60
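As a worked illustration, two ratios can be derived from the table for one event of each type. This sketch assumes that N and E are the numbers of nodes and links and that GC is the number of nodes in the giant component; the slides do not define the columns, so these readings are assumptions. For the four rows used here, E/N reproduces the k column.

```python
# (N, E, GC) copied from the event characterization table.
events = {
    "#topchef12":    (26044, 27155, 12794),
    "#viacatalana":  (41166, 76094, 30745),
    "#applekeynote": (3367, 1729, 700),
    "#tedxvalencia": (325, 843, 276),
}

def derived(n, e, gc):
    """Links per node, and share of nodes in the giant component (in %)."""
    return round(e / n, 2), round(100 * gc / n, 1)

for tag, (n, e, gc) in events.items():
    links_per_node, gc_share = derived(n, e, gc)
    print(f"{tag:15s} E/N={links_per_node:.2f}  GC share={gc_share:.1f}%")
```

Under these assumptions, the socio-political and conference examples place a much larger share of their nodes in the giant component than the TV show and keynote examples.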
Message type
tv: linear, global, during
soc: linear, inter., always
keyn: global, before
conf: linear, inter., during
[Figure 2: Evolution of the number of global and individual messages (mentions, retweets, and replies) in different types of events. Panels: (a) TV show: #topchef12; (b) Socio-political: #viacatalana; (c) Keynote: #applekeynote; (d) Conference: #tedxValencia. Axes: snapshot (x), number of tweets (y). Series: global, retweets, mentions, replyTo.]
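A rough sketch of how per-type message curves like those in Figure 2 could be tabulated. It assumes that the curves are cumulative counts per snapshot and that each message is available as a hypothetical `(snapshot_index, kind)` pair; neither assumption is confirmed by the slides.

```python
from collections import Counter
from itertools import accumulate

KINDS = ("global", "retweet", "mention", "reply")

def cumulative_counts(messages, n_snapshots, kinds=KINDS):
    """messages: iterable of (snapshot_index, kind) pairs.
    Returns {kind: list of cumulative counts, one entry per snapshot}."""
    per_snapshot = Counter(messages)
    return {
        kind: list(accumulate(per_snapshot[(s, kind)] for s in range(n_snapshots)))
        for kind in kinds
    }

messages = [(0, "global"), (0, "mention"), (1, "global"), (1, "global"), (2, "retweet")]
counts = cumulative_counts(messages, 3)
```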
Network size (links and nodes)
tv: always, similar
soc: always, ↑↑ links
keyn: before, ↑↑ links
conf: but after, ↑↑ links
[Figure 3: Evolution of the number of nodes and links in different types of events. Panels: (a) TV show: #topchef12; (b) Socio-political: #viacatalana; (c) Keynote: #applekeynote; (d) Conference: #tedxValencia. Axes: snapshot (x), number of links or nodes (y). Series: links, nodes.]
… the number of nodes and links increases moderately. People in the analyzed events prefer to talk and interact before the event rather than during the keynote.
In the analyzed conference networks, the number of nodes grows rapidly before the event starts (see Figure 3d). During the event, there is a further substantial increase in the number of nodes. Towards the end of the event and after it, the number of nodes remains almost constant. If the conference spans two or three days, the largest increase in new nodes occurs during the first day. The number of links evolves similarly to the number of nodes, but interactions grow at a higher rate. This means that conference attendees are more social and interact with other attendees. This behavior is similar to that of participants in socio-political networks.
4.3. Symmetric links
Interactions among users can be unidirectional or bidirectional. A unidirectional link means that a user sends an individual message to another user and never receives a response. This usually happens when users interact with celebrities or the official account of the event. A bidirectional (symmetric) link means that a user A sends a message to another user B and user B also sends a message to A, which implies that there is …
Symmetric links
tv: low, decreasing
soc: low, constant
keyn: lowest
conf: typically high, constant
[Figure 4: Evolution of the number of symmetric links in different types of events. Panels: (a) TV show: #topchef12; (b) Socio-political: #viacatalana; (c) Keynote: #applekeynote; (d) Conference: #tedxValencia. Axes: snapshot (x), % of symmetric links (y).]
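The percentage of symmetric links can be sketched as follows. The normalisation used here (reciprocated directed links divided by all directed links, with self-mentions ignored) is an assumption; the slides do not state the exact formula.

```python
def symmetric_link_percentage(links):
    """links: iterable of directed (source, target) pairs.
    A link is symmetric when the reverse link is also present."""
    links = {(u, v) for u, v in links if u != v}  # ignore self-mentions
    if not links:
        return 0.0
    symmetric = sum(1 for (u, v) in links if (v, u) in links)
    return 100.0 * symmetric / len(links)

links = {
    ("fan1", "star"),   # unanswered mention of a celebrity
    ("fan2", "star"),
    ("ana", "bob"),     # reciprocated conversation
    ("bob", "ana"),
}
pct = symmetric_link_percentage(links)
```

In this toy network the two fan-to-celebrity links are never answered, so only the conversation between `ana` and `bob` counts as symmetric.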
Cumulative degree distribution and CCDF
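A short sketch of an empirical complementary cumulative distribution function (CCDF) of node degrees. The convention P(K >= k) is an assumption, since some authors use P(K > k), and the degree list is illustrative data only.

```python
from collections import Counter

def degree_ccdf(degrees):
    """Return sorted (k, P(K >= k)) pairs for a list of node degrees."""
    n = len(degrees)
    counts = Counter(degrees)
    ccdf, remaining = [], n
    for k in sorted(counts):
        ccdf.append((k, remaining / n))  # fraction of nodes with degree >= k
        remaining -= counts[k]
    return ccdf

ccdf = degree_ccdf([1, 1, 1, 2, 2, 5])
```

Plotting these pairs on log-log axes is a common way to inspect heavy-tailed degree distributions.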
Path length and diameter
all: grows before and remains constant
no significant differences
[Figure 6: Evolution of the average path length and diameter in different types of events. Panels: (a) TV show: #topchef12; (b) Socio-political: #viacatalana; (c) Keynote: #applekeynote; (d) Conference: #tedxValencia. Axes: timeStep (x), number of steps (y). Series: path length, diameter.]
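Average path length and diameter can be sketched with a breadth-first search from every node. Treating links as undirected and averaging only over pairs that are connected are assumptions made here; the slides do not say how disconnected pairs or link direction were handled.

```python
from collections import deque

def path_stats(links):
    """Average shortest-path length and diameter over connected node pairs,
    treating the links as undirected (an assumption)."""
    adj = {}
    for u, v in links:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    total = pairs = diameter = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # plain BFS from `source`
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total += sum(dist.values())
        pairs += len(dist) - 1
        diameter = max(diameter, max(dist.values()))
    return (total / pairs if pairs else 0.0), diameter

avg, diam = path_stats({("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")})
```

Running one BFS per node costs O(N * (N + E)), so for the largest networks in the table a sampled set of source nodes may be a more practical choice.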
Clustering
not significant in the evolution
differences in the values (↑ conf, ↓↓ keyn)
[Figure 7: Evolution of the average clustering in different types of events. Panels: (a) TV show: #topchef12; (b) Socio-political: #viacatalana; (c) Keynote: #applekeynote; (d) Conference: #tedxValencia. Axes: timeStep (x), clustering degree (y). Series: clustering dynamics.]
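A minimal sketch of the average local clustering coefficient. It assumes links are treated as undirected and that nodes with fewer than two neighbours contribute zero; the authors' exact definition is not given in the slides.

```python
from itertools import combinations

def average_clustering(links):
    """Mean local clustering coefficient on the undirected version of the graph."""
    adj = {}
    for u, v in links:
        if u != v:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # clustering is taken as 0 for degree < 2
        # Count pairs of neighbours that are themselves linked.
        closed = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * closed / (k * (k - 1))
    return total / len(adj)

c = average_clustering({("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")})
```

In the toy graph, `a` and `b` sit in a closed triangle, `c` has one closed pair out of three, and `d` has a single neighbour, giving an average of 7/12.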
Nodes in the giant component
all: created before the event
tv and keyn: low
soc and conf: high
[Figure 8: Evolution of the % of nodes that are part of the giant component in different types of events. Panels: (a) TV show: #topchef12; (b) Socio-political: #viacatalana; (c) Keynote: #applekeynote; (d) Conference: #tedxValencia. Axes: timeStep (x), % of nodes that are in the giant component (y).]
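The share of nodes in the giant component can be sketched with a union-find structure. Using weakly connected components (link direction ignored) and counting isolated nodes in the denominator are assumptions of this sketch.

```python
def giant_component_share(nodes, links):
    """Percentage of nodes in the largest weakly connected component."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in links:
        parent.setdefault(u, u)
        parent.setdefault(v, v)
        parent[find(u)] = find(v)  # merge the two components

    if not parent:
        return 0.0
    sizes = {}
    for n in parent:
        root = find(n)
        sizes[root] = sizes.get(root, 0) + 1
    return 100.0 * max(sizes.values()) / len(parent)

share = giant_component_share({"a", "b", "c", "d", "e"}, {("a", "b"), ("b", "c")})
```

Here `d` and `e` only posted global messages, so three of the five nodes form the largest component.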
Betweenness
tv: official + celebrities
soc: politics, journalists, bloggers
keyn: media, tech sites, bloggers
conf: official + speakers
[Figure 9: Evolution of the betweenness of the ten nodes with the highest betweenness value in different types of events. Panels: (a) TV show: #topchef12; (b) Socio-political: #viacatalana; (c) Keynote: #applekeynote; (d) Conference: #tedxValencia.]
The nodes with the highest betweenness value in the analyzed keynote networks are those that represent media, technology web pages, and bloggers. In general, the betweenness of these nodes increases before the event and after the event (see Figure 9c). During the event, however, the betweenness values remain constant, since there are few interactions (new links) between nodes. There is a large gap between the nodes that represent media or web pages and the rest of the nodes, which represent bloggers or users.
In the analyzed conference networks, the nodes with the highest betweenness are official accounts and speakers. In general, the official account has a higher betweenness than the speakers (see Figure 9d). During the event, the betweenness of the official account increases at a higher rate than that of the speakers' accounts. Among the speakers, there is also a difference between those who participate in the first sessions and those who participate in later sessions. The betweenness of the early speakers increases from the beginning, whereas the betweenness of the later speakers is initially almost constant and starts to increase once the speaker participates in the conference. After the event, the betweenness of all nodes remains almost constant.
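For readers who want to reproduce a betweenness ranking, the sketch below uses Brandes' algorithm on a directed, unweighted graph. The choice of directed paths and of unnormalized scores is an assumption; the excerpt does not state which variant the authors computed.

```python
from collections import deque

def betweenness(links):
    """Unnormalized betweenness (Brandes' algorithm), directed and unweighted."""
    adj = {}
    for u, v in set(links):
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, [])
    score = dict.fromkeys(adj, 0.0)
    for s in adj:
        sigma = dict.fromkeys(adj, 0)   # number of shortest paths from s
        dist = dict.fromkeys(adj, -1)   # -1 marks "not reached yet"
        preds = {n: [] for n in adj}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to s.
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

score = betweenness({("fan1", "official"), ("fan2", "official"), ("official", "speaker")})
top10 = sorted(score, key=score.get, reverse=True)[:10]
```

In this toy conference network every path from a fan to the speaker passes through the official account, which is the kind of brokering role a high betweenness value indicates.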
5.2. Indegree and Outdegree
In the analyzed TV show networks, the nodes that receive the most mentions, retweets, or replies are the official accounts and the celebrities that participate in the event. The indegree of the official accounts increases linearly until the end of the event and then remains constant (see Figure 10b). There are …
Indegree and outdegree
tv, soc and keyn: indegree same as betweenness; outdegree anonymous
conf: balanced
[Figure 10: Evolution of the indegree and outdegree of the ten nodes with the highest indegree and outdegree value in different types of events. Panels: (a) TV show: #topchef12, indegree; (b) TV show: #topchef12, outdegree; (c) Socio-political: #viacatalana, indegree; (d) Socio-political: #viacatalana, outdegree; (e) Keynote: #applekeynote, indegree; (f) Keynote: #applekeynote, outdegree; (g) Conference: #tedxValencia, indegree; (h) Conference: #tedxValencia, outdegree.]
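Selecting the nodes with the highest indegree and outdegree can be sketched as below. It assumes each distinct directed link counts once, which may differ from the authors' counting if repeated interactions were weighted.

```python
from collections import Counter

def top_degrees(links, k=10):
    """Return the k nodes with the highest indegree and outdegree
    as two lists of (node, degree) pairs."""
    links = set(links)  # count each distinct directed link once
    indegree = Counter(target for _, target in links)
    outdegree = Counter(source for source, _ in links)
    return indegree.most_common(k), outdegree.most_common(k)

links = {
    ("fan1", "official"), ("fan2", "official"),
    ("fan1", "star"), ("official", "star"),
}
top_in, top_out = top_degrees(links, k=2)
```

In the toy data the accounts that receive messages (`official`, `star`) differ from the most active sender (`fan1`), mirroring the indegree versus outdegree contrast summarized on the slide.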
Eigenvector
↑↑ heterogeneous (even in the same event type)
not significant
[Figure 11: Evolution of the eigenvector centrality of the ten nodes with the highest eigenvector value in different types of events. Panels: (a) TV show: #topchef12; (b) Socio-political: #viacatalana; (c) Keynote: #applekeynote; (d) Conference: #tedxValencia.]
… connection join the network and the eigenvector centrality of the speakers decreases steadily. In contrast, the speakers that participate in the event later on have an eigenvector centrality that increases smoothly before their participation. Then, when the speakers participate in the event, their eigenvector centrality increases sharply. This means that other nodes with a high degree of connection establish a connection with the speakers. After this increase, the eigenvector centrality of the last speakers remains almost constant or decreases slightly. As the event draws to a close, the eigenvector centrality of all the nodes remains almost constant.
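Eigenvector centrality rewards nodes that are linked to other well-connected nodes, which is why a speaker's score jumps when highly connected users start interacting with them. A minimal power-iteration sketch follows; treating links as undirected and iterating on A + I are assumptions made for numerical convenience, not the authors' stated method.

```python
def eigenvector_centrality(links, iterations=200, tol=1e-10):
    """Power iteration on the undirected adjacency matrix, L2-normalized."""
    adj = {}
    for u, v in links:
        if u != v:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    x = dict.fromkeys(adj, 1.0)
    for _ in range(iterations):
        # One step on A + I: the shift avoids oscillation on bipartite
        # graphs and leaves the leading eigenvector unchanged.
        new = {n: x[n] + sum(x[m] for m in adj[n]) for n in adj}
        norm = sum(val * val for val in new.values()) ** 0.5
        new = {n: val / norm for n, val in new.items()}
        if sum(abs(new[n] - x[n]) for n in adj) < tol:
            return new
        x = new
    return x

centrality = eigenvector_centrality({("hub", "a"), ("hub", "b"), ("hub", "c")})
```

For this star graph the hub's score is the square root of three times each leaf's score, as expected from the leading eigenvector of the adjacency matrix.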
6. Discussion
After the analysis of the networks, we observed that the networks generated from the Twitter events can be classified into two main groups based on the type of interactions between users. One group consists of the TV show and keynote networks. The other group consists of the socio-political and conference networks.
In the group of TV show and keynote networks, users tend to participate in the event through global messages. The majority of interactions are unidirectional, from unknown users to official accounts or celebrities. This fact is clearly reflected in the structural properties of the networks. The number of nodes is higher than the number of links, which means that users prefer to participate through global messages rather than interact with other users. The small proportion of individual messages are mentions, usually from anonymous users to a celebrity or an official account that does not respond or interact with them. This is reflected in the low percentage of symmetric links. One of the effects of the lack of symmetry in the …
Discussion
Two clearly differentiated types

TV shows and keynotes
  global, unidirectional messages
  asymmetric: celebrities - anonymous
  low clustering and long paths
  TV: after & before; keyn: during

Socio-political and conferences
  real communication among participants
  higher symmetry, reciprocity
  conf: official account more participative
Conclusions
  official event account barely influences
  communities created around 'persons of interest'
  giant component created before event
  2 groups clearly differentiated
  significant differences among the 4 groups

Future work
  follow user activity to complete the information
  include decay in the links
  include specific temporal measures; multiplex structure
  create models for events and user profiles