Studying Online Interactions using Social Network Analysis
-
Upload
anatoliy-gruzd -
Category
Education
-
view
3.402 -
download
2
Transcript of Studying Online Interactions using Social Network Analysis
Studying Online Interactions using Social Network Analysis
Anatoliy [email protected]@gruzd
Canada Research Chair in Social Media Data Stewardship Associate Professor, Ted Rogers School of ManagementDirector, Social Media LabRyerson University
BCC Workshop
Arlington, TX, Oct 4, 2015
Social Network Analysis (SNA)
• Nodes = People
• Edges /Ties (lines) = Relations/
“Who retweeted/ replied/
mentioned whom”
Anatoliy Gruzd 3
How to Make Sense of Educational “Big” Data?
@gruzd
Makes it much easier to
understand what is going on in a
group
Advantages of
Social Network Analysis
Once the network is discovered, we can find out:
• How do people interact with each other,
• Who are the most/least active members of a group,
• Who is influential in a group,
• Who is susceptible to being influenced, etc…
Anatoliy Gruzd 4@gruzd
Common approach for collecting social network data:
• Self-reported social network data may not be available/accurate
• Surveys or interviews
Problems with surveys or interviews
• Time-consuming
• Questions can be too sensitive
• Answers are subjective or incomplete
• Participant can forget people and
interactions
• Different people perceive events and
relationships differently
How Do We Collect Information About Online Social Networks?
Anatoliy Gruzd 5@gruzd
• Common approach: surveys or interviews
• A sample question about students’ perceived social structures
How Do We Collect Information About Social Networks?
Please indicate on a scale from [1] to [5],
YOUR FRIENDSHIP RELATIONSHIP WITH EACH STUDENT IN THE CLASS
[1] - don’t know this person
[2] - just another member of class
[3] - a slight friendship
[4] - a friend
[5] - a close friend
Alice D. [1] [2] [3] [4] [5]
…
Richard S. [1] [2] [3] [4] [5]
Source: C. Haythornthwaite, 1999
Anatoliy Gruzd 6@gruzd
Studying Online Social Networks
http://www.visualcomplexity.com/vc
• Forum networks
• Blog networks
• Friends’ networks (Facebook,
Twitter, Google+, etc…)
• Networks of like-minded people
(YouTube, Flickr, etc…)
Anatoliy Gruzd 7@gruzd
Goal: Automated Networks Discovery
Challenge: Figuring out what content-based features of online interactions can help to uncover nodes and ties between group members
How Do We Collect Information About Online Social Networks?
Anatoliy Gruzd 8@gruzd
Automated Discovery of Social Networks
Emails
Nick
Rick
Dick
• Nodes = People
• Ties = “Who talks to whom”
• Tie strength = The number of
messages exchanged between
individuals
Anatoliy Gruzd 9@gruzd
Automated Discovery of Social Networks
“Many to Many” Communication
ChatMailing listservForum Comments
Anatoliy Gruzd 10@gruzd
Automated Discovery of Social Networks Approach 1: Chain Network (Reply-to)
FROM: SamPREVIOUS POSTER: Gabriel
....
....
....
Posting
header
Content
Anatoliy Gruzd 11@gruzd
Automated Discovery of Social Networks Approach 1: Chain Network (Reply-to)
FROM: SamPREVIOUS POSTER: Gabriel
“ Nick, Gina and Gabriel: I apologize for not backing this up
with a good source, but I know from reading about this topic that … ”
Posting
header
Content
Possible Missing Connections:
• Sam -> Nick
• Sam -> Gina
• Nick <-> Gina 12 12@gruzd
Automated Discovery of Social Networks
Approach 2: Name Network
FROM: Ann
“Steve and Natasha, I couldn't wait to see your site.
I knew it was going to [be] awesome!”
This approach looks for personal names in the content of the messages to identify social connections between group members.
Anatoliy Gruzd 14@gruzd
• Main Communicative Functions of Personal Names (Leech, 1999)
– getting attention and identifying addressee
– maintaining and reinforcing social relationships
• Names are “one of the few textual carriers of identity” in discussions on the web (Doherty, 2004)
• Their use is crucial for the creation and maintenance of a sense of community (Ubon, 2005)
Automated Discovery of Social NetworksApproach 2: Name Network
Anatoliy Gruzd 15@gruzd
Example: Twitter Networks
@John
@Peter
@Paul
• Nodes = People
• Ties = “Who retweeted/
replied/mentioned whom”
• Tie strength = The number of
retweets, replies or mentions
How to Make Sense of Social Media Data?
Anatoliy Gruzd 23@gruzd
Automated Discovery of Social Networks
Twitter Data Example
24
Name Network ties Chain Network ties
@Cheeflo -> @JoeProf @Cheeflo -> @VMosco
none
Automated Discovery of Social Networks
Twitter Data Example
25
Name Network ties Chain Network ties
@gruzd -> @sidneyeve @gruzd -> @sidneyeve
Anatoliy Gruzd
Netlytic.org cloud-based research infrastructure for automated text analysis &
discovery of social networks from social media data
Ne
two
rk
s
Sta
ts
Co
nte
nt
27
29
cMOOC Dataset
Athabasca University Course◦ CCK11: Connectivism and Connective Knowledge
Not restricted to any one
platform
“Throughout this ‘course’
participants will use a variety of
technologies, for example, blogs,
Second Life, RSS Readers,
UStream, etc.”
ANATOLIY GRUZD @gruzd
31
CCK11 MOOC Dataset
#Posts Avg charactercount
Avg word count
Blogs 812 332.3 54.2
DiscussionThreads
68 541.8 90.8
Comments 306 837.6 138.9
Tweets 1722 113.9 14.4
ANATOLIY GRUZD @gruzd
Question 1 - #LAK14 #pLASMAWhat does this visualization tell us from an instructor’s perspective?
ANATOLIY GRUZD 32@gruzd
What does this visualization tell us from a student’s perspective?
ANATOLIY GRUZD 33@gruzd
SNA MeasuresMicro-level
In-degree centrality
Out-degree centrality
Betweenness centrality
Other centrality measures (e.g., closeness, eigenvector)
Macro-level
Density
Diameter
Reciprocity
Centralization
Modularity
ANATOLIY GRUZD 35@gruzd
SNA MeasuresMicro-level
In-degree centrality
Out-degree centrality
Betweenness centrality
Other centrality measures (e.g., closeness, eigenvector)
ANATOLIY GRUZD 36
In-degree suggests “prestige” highlighting the most mentioned or replied Twitter users
@gruzd
37
#CCK11 Class Tweets (Node size=“Indegree”)
ANATOLIY GRUZD @gruzd
SNA MeasuresMicro-level
In-degree centrality
Out-degree centrality
Betweenness centrality
Other centrality measures (e.g., closeness, eigenvector)
ANATOLIY GRUZD 38
Out-degree reveals active Twitter users with a good awareness of others in the network
@gruzd
39
#CCK11 Class Tweets (Node size=“Outdegree”)
ANATOLIY GRUZD @gruzd
SNA MeasuresMicro-level
In-degree centrality
Out-degree centrality
Betweenness centrality
Other centrality measures (e.g., closeness, eigenvector)
ANATOLIY GRUZD 40
Betweenness shows actors who are located on the most number of information paths and who often connect different groups of users in the network
@gruzd
Top 10 Twitter users in the Name network based on centrality measures
ANATOLIY GRUZD 41
In-degree Out-degree Betweenness
profesortbaker cck11feeds profesortbaker
shellterrell suifaijohnmak suifaijohnmak
gsiemens web20education cck11feeds
downes daisygrisolia thbeth
zaidlearn ianchia francesbell
web20education profesortbaker hamtra
hamtra pipcleaves daisygrisolia
profesorbaker hamtra web20education
ianchia jaapsoft jaapsoft
jaapsoft thbeth gsiemens
suifaijohnmak ryantracey
@gruzd
SNA MeasuresMacro-level
Density
Diameter
Reciprocity
Centralization
Modularity
Density indicates the overall connectivity in the network (the total number of connections divided by the total number of possible connections).
It is equal to 1 when everyone is connected to everyone.
ANATOLIY GRUZD 42@gruzd
Student
InstructorStudent
Density = 1
SNA MeasuresMacro-level
Density
Diameter
Reciprocity
Centralization
Modularity
Diameter gives a general idea of how “wide” the network is; the longest of the shortest paths between any two nodes in the network.
ANATOLIY GRUZD 43@gruzd
#1
Student A Instructor
Student B
Student C
Diameter = 3
#2
#3
SNA MeasuresMacro-level
Density
Diameter
Reciprocity
Centralization
Modularity
Reciprocity shows how many online participants are having two-way conversations.
In a scenario when everyone replies to everyone, the reciprocity value will be 1.
ANATOLIY GRUZD 44@gruzd
Student
InstructorStudent
Student Reciprocity=1
SNA MeasuresMacro-level
Density
Diameter
Reciprocity
Centralization
Modularity
Centralization indicates whether a network is dominated by few central participants (values are closer to 1),
or whether more people are contributing to discussion and information dissemination (values are closer to 0).
ANATOLIY GRUZD 45@gruzd
Student
InstructorStudent
Student Centralization=1
SNA MeasuresMacro-level
Density
Diameter
Reciprocity
Centralization
Modularity
Modularity provides an estimate of whether a network consists of one coherent group of participants who are engaged in the same conversation and who are paying attention to each other (values closer to 0);
or whether a network consists of different conversations and communities with a weak overlap (values closer to 1).
ANATOLIY GRUZD 48@gruzd
Name network (on the left) and Chain network (on the right)
ANATOLIY GRUZD 51
#CCK11 Class Tweets
@gruzd