Analysis of the Evolution of Events on Online Social Networks


description

Talk at the Computational Social Science Workshop (CSS'14) in Lucca, September 2014 (ECCS satellite workshop)

Transcript of Analysis of the Evolution of Events on Online Social Networks

Page 1: Analysis of the Evolution of Events on Online Social Networks

Introduction Results. Event analysis Discussion Conclusions

E. del Val, M. Rebollo and V. Botti

Grupo Tec. Inform.-Inteligencia Artificial, Universitat Politècnica de València

CSS 2014

@mrebollo UPV

Page 2: Analysis of the Evolution of Events on Online Social Networks


Problem

Analysis of user behavior
How user interactions evolve in a social network in connection with a scheduled event taking place in the real world.

4 types of events are considered: TV shows, socio-political, conferences, keynotes

Page 3: Analysis of the Evolution of Events on Online Social Networks

The Hypothesis

There are significant differences in the behavior of the participants in each type of event

Page 4: Analysis of the Evolution of Events on Online Social Networks

Data set

tweets containing the hashtag of the event are retrieved
interaction-based network: mentions, replies and retweets
focus on scheduled events
temporally annotated network (nodes and links)
empirical analysis
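A minimal sketch of how such a network could be assembled, assuming each tweet is already available as a dict with the hypothetical fields `user`, `time`, `mentions`, `reply_to` and `retweet_of` (illustrative names, not the Twitter API schema or the authors' own format). Tweets that address nobody are counted as global messages; every other tweet adds a time-stamped directed link.

```python
from collections import namedtuple

# A directed, time-stamped interaction between two users.
Link = namedtuple("Link", "source target kind time")

def build_interaction_network(tweets):
    """Return (nodes, links, n_global) for a list of tweet dicts.

    nodes maps each user to the time of first appearance, links is a
    list of Link tuples, n_global counts tweets with no target user.
    Field names are assumptions made for this sketch.
    """
    nodes, links, n_global = {}, [], 0
    for tw in sorted(tweets, key=lambda t: t["time"]):
        user, when = tw["user"], tw["time"]
        nodes.setdefault(user, when)
        targets = [(u, "mention") for u in tw.get("mentions", [])]
        if tw.get("reply_to"):
            targets.append((tw["reply_to"], "reply"))
        if tw.get("retweet_of"):
            targets.append((tw["retweet_of"], "retweet"))
        if not targets:
            n_global += 1
        for target, kind in targets:
            nodes.setdefault(target, when)
            links.append(Link(user, target, kind, when))
    return nodes, links, n_global

tweets = [
    {"user": "ana", "time": 1},
    {"user": "bob", "time": 2, "mentions": ["ana"]},
    {"user": "ana", "time": 3, "reply_to": "bob"},
    {"user": "eve", "time": 4, "retweet_of": "bob"},
]
nodes, links, n_global = build_interaction_network(tweets)
```

Keeping the timestamp on both nodes and links is what allows the network to be replayed snapshot by snapshot later on.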

Page 5: Analysis of the Evolution of Events on Online Social Networks

Events

TV shows: La voz, Topchef, BreakingBad, OperacionPalace

Keynotes: applekeynote, nuevosiphone, innovationreinvented (Nokia)

Socio-political: lomce, viacatalana, EU elections debate

Conferences: TEDValencia, seo4seos, Twitter awards 2013, InternetOfThings forum, CW'13 (web conference)

Page 6: Analysis of the Evolution of Events on Online Social Networks

Event characterization

Event                  N        E        clust.  d   path  comp.   k     m     GC      %sl
#lomce #24O            61,653   97,570   0.07    27  6.93  8,088   1.58  0.73  49,164  10.80
#viacatalana           41,166   76,094   0.07    23  8.28  8,705   1.85  0.56  30,745  12.19
#topchef12             26,044   27,155   0.05    25  8.74  10,689  1.04  0.66  12,794  12.29
#lavoz                 45,914   39,891   0.07    9   2.11  19,398  0.87  0.81  17,941  6.95
#breakingbad           151,473  120,661  0.05    13  3.24  67,060  0.80  0.72  71,130  14.26
#operacionPalace       107,606  195,470  0.08    23  8.26  19,504  1.82  0.49  85,768  15.06
#applekeynote          3,367    1,729    0.04    4   1.27  1,827   0.51  0.93  700     1.53
#nuevosiPhone          9,509    10,600   0.05    9   2.09  1,227   1.12  0.62  7,799   6.47
#innovationreinvented  110      95       0.04    3   1.34  32      0.86  0.83  26      10.00
#IoTWF                 4,150    17,545   0.22    9   3.96  330     4.23  0.42  3,680   30.83
#cw13                  1,051    2,608    0.20    9   3.90  60      2.48  0.56  966     19.76
#seo4seos              367      1,474    0.35    6   2.91  16      4.02  0.30  347     31.07
#tedxvalencia          325      843      0.17    8   3.60  45      2.59  0.32  276     8.60
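A small worked check on three rows of the table, under two assumptions that the slide does not state explicitly: that k is the number of links per node (E / N, which reproduces the k column for these rows) and that GC is the number of nodes in the giant component.

```python
# (N, E, GC) copied from three rows of the event characterization table.
rows = {
    "#lomce #24O": (61653, 97570, 49164),
    "#applekeynote": (3367, 1729, 700),
    "#seo4seos": (367, 1474, 347),
}

def links_per_node(n, e):
    # Assumed meaning of the k column: links divided by nodes.
    return e / n

def giant_share(n, gc):
    # Percentage of nodes inside the giant component (assumed meaning of GC).
    return 100.0 * gc / n

summary = {
    name: (round(links_per_node(n, e), 2), round(giant_share(n, gc), 1))
    for name, (n, e, gc) in rows.items()
}
```

Under these assumptions the keynote has about 21% of its nodes in the giant component while the conference has about 95%, which is in line with the low/high contrast reported for the giant component.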

Page 7: Analysis of the Evolution of Events on Online Social Networks

Message type

tv: linear, global, during
soc: linear, inter., always
keyn: global, before
conf: linear, inter., during

Figure 2: Evolution of the number of global and individual messages (mentions, retweets, and reply to) in different types of events. Panels: (a) TV show: #topchef12; (b) Socio-political: #viacatalana; (c) Keynote: #applekeynote; (d) Conference: #tedxValencia. Axes: snapshot (x), number of tweets (y). Series: global, retweets, mentions, replyTo.
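A sketch of how curves like these could be produced, assuming a snapshot is a fixed-width time window and that the plotted values are cumulative counts per message type (both are assumptions; the slide does not define either). The function name and the `(time, kind)` input format are hypothetical.

```python
from collections import Counter

def messages_per_snapshot(messages, width):
    """Cumulative message counts per type at the end of each snapshot.

    messages is a list of (time, kind) pairs with integer times;
    width is the snapshot length in the same time unit.
    """
    if not messages:
        return []
    last = max(t for t, _ in messages) // width
    per_bin = [Counter() for _ in range(last + 1)]
    for t, kind in messages:
        per_bin[t // width][kind] += 1
    running, series = Counter(), []
    for counts in per_bin:
        running.update(counts)        # add this snapshot to the running total
        series.append(dict(running))  # freeze a copy for the curve
    return series

messages = [(0, "global"), (1, "retweet"), (5, "global"),
            (6, "mention"), (12, "replyTo")]
series = messages_per_snapshot(messages, 5)
```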

Page 8: Analysis of the Evolution of Events on Online Social Networks

Network size (links and nodes)

tv: always, similar
soc: always, ↑↑ links
keyn: before, ↑↑ links
conf: but after, ↑↑ links

Figure 3: Evolution of the number of nodes and links in different types of events. Panels: (a) TV show: #topchef12; (b) Socio-political: #viacatalana; (c) Keynote: #applekeynote; (d) Conference: #tedxValencia. Axes: snapshot (x), number of links or nodes (y). Series: links, nodes.

the number of nodes and links increases moderately. People in the analyzed events prefer to talk and interact before the event rather than during the keynote.

In the analyzed conference networks, the number of nodes grows rapidly before the event starts (see Figure 3d). Then, during the event, there is also an important increase in the number of nodes. Towards the end of the event as well as after the event, the number of nodes remains almost constant. If the conference lasts two or three days, the most significant increase in the number of new nodes occurs during the first day. The number of links evolves similarly to the number of nodes. However, the number of interactions increases at a higher rate. This means that conference attendees are more social and interact with other attendees. This behavior is similar to the behavior of participants in socio-political networks.
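The comparison of node growth against link growth can be sketched as follows, assuming links arrive as hypothetical `(time, source, target)` tuples and that a snapshot is a fixed-width time window. A link is counted once per distinct ordered pair, which is an assumption about the counting convention.

```python
def growth_curve(links, width):
    """Cumulative (n_nodes, n_links) at the end of each snapshot.

    links is a list of (time, source, target) with integer times.
    """
    if not links:
        return []
    links = sorted(links)
    n_snapshots = links[-1][0] // width + 1
    seen_nodes, seen_links, curve = set(), set(), []
    i = 0
    for snap in range(n_snapshots):
        end = (snap + 1) * width
        # Consume every link that falls before the end of this snapshot.
        while i < len(links) and links[i][0] < end:
            _, source, target = links[i]
            seen_nodes.update((source, target))
            seen_links.add((source, target))
            i += 1
        curve.append((len(seen_nodes), len(seen_links)))
    return curve

links = [(0, "a", "b"), (1, "b", "a"), (7, "c", "a"),
         (8, "c", "b"), (11, "a", "b")]
curve = growth_curve(links, 5)
```

When the second value of each pair grows faster than the first, links are outpacing nodes, which is the pattern described for conferences.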

4.3. Symmetric links

Interactions among users can be in one direction or bidirectional. A unidirectional link means that a user sends an individual message to another user and never receives a response. This usually happens when users interact with celebrities or the official account of the events. A bidirectional (symmetric) link means that a user A sends a message to another user B and user B also sends a message to A, which implies that there is

Page 9: Analysis of the Evolution of Events on Online Social Networks

Symmetric links

tv: low, decreasing
soc: low, constant
keyn: lowest
conf: typically high, constant

Figure 4: Evolution of the number of symmetric links in different types of events. Panels: (a) TV show: #topchef12; (b) Socio-political: #viacatalana; (c) Keynote: #applekeynote; (d) Conference: #tedxValencia. Axes: snapshot (x), % of symmetric links (y).
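One way to compute the percentage of symmetric links, as a sketch: a directed link (A, B) counts as symmetric when (B, A) also exists. Whether a reciprocated pair is counted once or twice is not stated on the slide; here each direction is counted, and self-links are ignored (both are assumptions).

```python
def symmetric_link_percentage(links):
    """Percentage of distinct directed links that are reciprocated.

    links is an iterable of (source, target) pairs.
    """
    directed = {(s, t) for s, t in links if s != t}
    if not directed:
        return 0.0
    symmetric = sum(1 for s, t in directed if (t, s) in directed)
    return 100.0 * symmetric / len(directed)

# Two fans write to an official account that never answers,
# while users a and b talk to each other.
links = [("fan1", "show"), ("fan2", "show"), ("a", "b"), ("b", "a")]
pct = symmetric_link_percentage(links)
```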

Page 10: Analysis of the Evolution of Events on Online Social Networks

Cumulative degree distrib. and CCDF
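For reference, a minimal sketch of a complementary cumulative distribution function (CCDF) over node degrees. It uses the convention P(K >= k); the slide does not say which convention was plotted, so this choice is an assumption.

```python
from collections import Counter

def degree_ccdf(degrees):
    """Return a sorted list of (k, P(K >= k)) for the given degrees."""
    n = len(degrees)
    counts = Counter(degrees)
    ccdf, remaining = [], n
    for k in sorted(counts):
        ccdf.append((k, remaining / n))  # share of nodes with degree >= k
        remaining -= counts[k]
    return ccdf

degrees = [1, 1, 1, 2, 2, 5]
ccdf = degree_ccdf(degrees)
```

Plotting these pairs on log-log axes is the usual way to inspect whether a few accounts concentrate most of the interactions.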

Page 11: Analysis of the Evolution of Events on Online Social Networks

Path length and diameter

all: grows before and remains constant
no significant differences

Figure 6: Evolution of the average path length and diameter in different types of events. Panels: (a) TV show: #topchef12; (b) Socio-political: #viacatalana; (c) Keynote: #applekeynote; (d) Conference: #tedxValencia. Axes: timeStep (x), number of steps (y). Series: path length, diameter.
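A sketch of both measures using breadth-first search. It treats links as undirected and averages only over pairs of nodes that can reach each other; the authors' exact convention is not given on the slide, so both choices are assumptions.

```python
from collections import deque

def bfs_distances(adj, start):
    """Hop distance from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def path_length_and_diameter(edges):
    """Return (average shortest path length, diameter) of an undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    total, pairs, diameter = 0, 0, 0
    for node in adj:
        for other, d in bfs_distances(adj, node).items():
            if other != node:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    average = total / pairs if pairs else 0.0
    return average, diameter

# A chain a - b - c - d.
average, diameter = path_length_and_diameter([("a", "b"), ("b", "c"), ("c", "d")])
```

Running one search per node costs time proportional to nodes times links, so for the largest events in the table a sampled estimate may be more practical.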

Page 12: Analysis of the Evolution of Events on Online Social Networks

Clustering

not significant in the evolution
differences in the values (↑ conf, ↓↓ keyn)

Figure 7: Evolution of the average clustering in different types of events. Panels: (a) TV show: #topchef12; (b) Socio-political: #viacatalana; (c) Keynote: #applekeynote; (d) Conference: #tedxValencia. Axes: timeStep (x), clustering degree (y). Series: clustering dynamics.
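A sketch of the average local clustering coefficient, assuming the links are treated as undirected and that nodes with fewer than two neighbours contribute zero (a common convention, not confirmed by the slide).

```python
from itertools import combinations

def average_clustering(edges):
    """Mean over nodes of (closed neighbour pairs) / (possible neighbour pairs)."""
    adj = {}
    for u, v in edges:
        if u != v:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    total = 0.0
    for node, neighbours in adj.items():
        k = len(neighbours)
        if k < 2:
            continue  # no neighbour pair to close, contributes 0
        closed = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
        total += 2.0 * closed / (k * (k - 1))
    return total / len(adj) if adj else 0.0

# Triangle a-b-c with one extra user d who only talks to a.
clustering = average_clustering([("a", "b"), ("b", "c"), ("c", "a"), ("a", "d")])
```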

Page 13: Analysis of the Evolution of Events on Online Social Networks

Nodes in the giant component

all: created before the event
tv and keyn: low
soc and conf: high

Figure 8: Evolution of the % of nodes that are part of the giant component in different types of events. Panels: (a) TV show: #topchef12; (b) Socio-political: #viacatalana; (c) Keynote: #applekeynote; (d) Conference: #tedxValencia. Axes: timeStep (x), % of nodes that are in the giant component (y).
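A sketch of the giant component percentage using a union-find structure. It assumes links are treated as undirected and that users who only post global messages still count as isolated nodes (an assumption, although it is consistent with node counts exceeding link counts for some events).

```python
def giant_component_percentage(nodes, edges):
    """Percentage of nodes in the largest connected component.

    Every endpoint in edges must also appear in nodes.
    """
    if not nodes:
        return 0.0
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parents to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)

    sizes = {}
    for n in nodes:
        root = find(n)
        sizes[root] = sizes.get(root, 0) + 1
    return 100.0 * max(sizes.values()) / len(nodes)

# d and e only posted global messages, so they stay isolated.
share = giant_component_percentage(["a", "b", "c", "d", "e"],
                                   [("a", "b"), ("b", "c")])
```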

Page 14: Analysis of the Evolution of Events on Online Social Networks

Betweenness

tv: official + celebrities
soc: politics, journalists, bloggers
keyn: media, tech sites, bloggers
conf: official + speakers

Figure 9: Evolution of the betweenness of the ten nodes with the highest betweenness value in different types of events. Panels: (a) TV show: #topchef12; (b) Socio-political: #viacatalana; (c) Keynote: #applekeynote; (d) Conference: #tedxValencia.

The nodes with the highest betweenness value in the analyzed keynote networks are the nodes that represent media, technological web pages, and bloggers. In general, the betweenness of these nodes increases before and after the event (see Figure 9c). However, during the event, the values of betweenness remain constant since there is not a high number of interactions (new links) between nodes. There is a big difference between the nodes that represent media or web pages and the rest of the nodes that represent bloggers or users.

In the analyzed conference networks, the nodes with the highest betweenness are official accounts and speakers. In general, the official account has a higher betweenness value than the speakers (see Figure 9d). During the event, the betweenness of the official account increases at a higher rate than that of the speakers' accounts. Among the nodes that represent the speakers there is also a difference between those that participate in the first sessions and those that participate in later sessions. The betweenness of the speakers that participate in the first sessions increases from the beginning. However, the betweenness of the speakers that participate in later sessions is initially almost constant and starts to increase once the speaker participates in the conference. After the event the betweenness of all nodes remains almost constant.
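For readers who want to reproduce this kind of ranking, here is a sketch of betweenness centrality using Brandes' algorithm for unweighted graphs. This is a standard technique chosen for illustration; the source does not say which implementation the authors used. Scores are unnormalized and count ordered pairs of endpoints.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness; adj maps every node to its successors."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # nodes in order of discovery
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

# Three attendees who only interact through the official account.
adj = {"official": ["s1", "s2", "s3"],
       "s1": ["official"], "s2": ["official"], "s3": ["official"]}
scores = betweenness(adj)
top_ten = sorted(scores, key=scores.get, reverse=True)[:10]
```

Recomputing the scores on each snapshot and tracking the same ten accounts gives one curve per account over time.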

5.2. Indegree and Outdegree

In the analyzed TV show networks, the nodes that receive the most mentions, retweets, or reply-to messages are the official accounts and the celebrities that participate in the event. The indegree of the official accounts increases linearly until the end of the event, after which it remains constant (see Figure 10b). There are

Page 15: Analysis of the Evolution of Events on Online Social Networks

Indegree and outdegree

tv, soc and keyn: indegree same as betweenness; outdegree anonymous
conf: balanced

Figure 10: Evolution of the indegree and outdegree of the ten nodes with the highest indegree and outdegree value in different types of events. Panels: (a) TV show: #topchef12, indegree; (b) TV show: #topchef12, outdegree; (c) Socio-political: #viacatalana, indegree; (d) Socio-political: #viacatalana, outdegree; (e) Keynote: #applekeynote, indegree; (f) Keynote: #applekeynote, outdegree; (g) Conference: #tedxValencia, indegree; (h) Conference: #tedxValencia, outdegree.
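A sketch of how the top accounts by indegree and outdegree could be extracted from a list of directed links. Repeated messages between the same pair are collapsed into a single link here, which is an assumption about the counting convention.

```python
from collections import Counter

def top_degrees(links, n=10):
    """Return (top indegree, top outdegree) as lists of (user, degree)."""
    indegree, outdegree = Counter(), Counter()
    for source, target in set(links):   # one link per ordered pair
        outdegree[source] += 1
        indegree[target] += 1
    return indegree.most_common(n), outdegree.most_common(n)

links = [("f1", "show"), ("f2", "show"), ("f3", "show"),
         ("f1", "chef"), ("show", "chef"), ("f1", "show")]
top_in, top_out = top_degrees(links, n=2)
```

In this toy example the most mentioned accounts are the show and the chef, while the most active sender is an ordinary fan, mirroring the contrast between indegree and outdegree leaders.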

Page 16: Analysis of the Evolution of Events on Online Social Networks

Eigenvector

↑↑ heterogeneous (even in the same event type)
not significant

Figure 11: Evolution of the eigenvector of the ten nodes with the highest eigenvector value in different types of events. Panels: (a) TV show: #topchef12; (b) Socio-political: #viacatalana; (c) Keynote: #applekeynote; (d) Conference: #tedxValencia.

connection join the network and the eigenvector of the speakers decreases steadily. In contrast, the speakers that participate in the event later on have an eigenvector that increases smoothly before their participation in the event. Then, when the speakers participate in the event, their eigenvector increases sharply. This means that other nodes with a high degree of connection establish a connection with the speakers. After this increase, the eigenvector centrality of the last speakers remains almost constant or decreases slightly. As the event draws to a close, the eigenvector of all the nodes remains almost constant.
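The idea that a node scores high when well-connected nodes link to it can be sketched with a power iteration. This version treats the graph as undirected and iterates on A + I instead of A, a standard shift that keeps the same eigenvectors while avoiding oscillation on bipartite graphs; both choices are assumptions made for the sketch, not the authors' stated procedure.

```python
def eigenvector_centrality(adj, iterations=100, tol=1e-9):
    """Eigenvector centrality scaled so that the largest score is 1.

    adj maps every node to the set of its neighbours (undirected).
    """
    if not adj:
        return {}
    score = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # One multiplication by (A + I): own score plus neighbours' scores.
        new = {v: score[v] + sum(score[u] for u in adj[v]) for v in adj}
        norm = max(new.values()) or 1.0
        new = {v: x / norm for v, x in new.items()}
        if max(abs(new[v] - score[v]) for v in adj) < tol:
            return new
        score = new
    return score

# An official account connected to three otherwise unconnected users.
adj = {"official": {"u1", "u2", "u3"},
       "u1": {"official"}, "u2": {"official"}, "u3": {"official"}}
centrality = eigenvector_centrality(adj)
```

For this star the hub converges to 1 and each leaf to 1/sqrt(3), so a single new link from a high-scoring node can lift a speaker's score sharply.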

6. Discussion

After the analysis of the networks, we observed that the networks generated from the Twitter events can be classified into two main groups based on the type of interactions between users. One group consists of the TV show and keynote networks. The other group consists of the socio-political and conference networks.

In the group of TV show and keynote networks, users tend to participate in the event through global messages. The majority of interactions are unidirectional from unknown users to official accounts or celebrities. This fact is clearly reflected in the structural properties of the networks. The number of nodes is higher than the number of links, which means that users prefer to participate through global messages rather than interact with other users. The small proportion of individual messages are mentions that usually go from anonymous users to a celebrity or an official account that usually does not respond or interact with anonymous users. This fact is reflected in the low percentage of symmetric links. One of the effects of the lack of symmetry in the

Page 17: Analysis of the Evolution of Events on Online Social Networks

Discussion

two clearly differentiated types

TV shows and keynotes
  global, unidirectional messages
  asymmetric: celebrities - anonymous
  low clustering and long paths
  TV: after & before; keyn: during

socio-political and conferences
  real communication among participants
  higher symmetry, reciprocity
  conf: official account more participative

Page 18: Analysis of the Evolution of Events on Online Social Networks

Conclusions

Conclusions
  official event account barely influences
  communities created around 'persons of interest'
  giant component created before event
  2 groups clearly differentiated
  significant differences among the 4 groups

Future work
  follow user activity to complete the information
  include decay in the links
  include specific temporal measures; multiplex structure
  create models for events and user profiles
