Community Detection in Social Media
* Community Detection with Edge Content in Social Media Networks
* Community Detection in Social Media by Leveraging Interactions and Intensities
Presented By: Mojtaba Rezaei & Reza Habibi Kerahroudi
University of Tehran, Networked Systems Engineering
Community Detection in Social Media
Graph Algorithms Course
Wikipedia, Twitter, YouTube
Community Detection with Edge Content in Social Media Networks
Presented By: Mojtaba Rezaei
Introduction
Most community detection algorithms use the links between the nodes to determine the dense regions in the graph.
In many recent applications, edge content is also available and can provide better supervision to the community detection process.
Introduction (cont.)
An important problem in the area of social media
is that of community detection.
In the problem of community detection, the goal is to partition the network into dense regions of the graph.
Introduction (cont.)
A lot of rich information is encoded in the content of the interactions among the actors in the network.
Example: e-mail networks.
We will see that edge content provides a number of unique distinguishing characteristics of the communities which cannot be modeled by node content.
Illustration of a social media network
The nodes represent users while the edges represent the favored images shared by the users
Introduction (cont.)
Edge-based content is much more challenging, because the different interests of the same actor node may be reflected in different edges.
We will show that such an approach provides unique insights which are not possible with the use of pure link-based or content-based methods.
Community Detection With Edge Content
Most community detection methods focus on partitioning the nodes based on linkage, whereas we are interested in partitioning the edges based on both linkage and content.
Community Detection With Edge Content (cont.)
When there are no links, the problem defaults to the pure content-based clustering problem.
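As a rough illustration of partitioning edges on both linkage and content, the sketch below scores how similar two edges are. It is not the method from the presented paper: the edge format `(u, v, text)`, the function names, the bag-of-words cosine, the endpoint-overlap linkage term, and the mixing weight `alpha` are all assumptions made for this example.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def edge_similarity(e1, e2, alpha=0.5):
    """Blend content similarity with a simple linkage signal.

    Each edge is a tuple (u, v, text). The linkage term is the Jaccard
    overlap of the two endpoint sets (an assumption for illustration).
    alpha = 1.0 ignores linkage, i.e. pure content-based similarity.
    """
    (u1, v1, t1), (u2, v2, t2) = e1, e2
    content = cosine(Counter(t1.lower().split()), Counter(t2.lower().split()))
    ends1, ends2 = {u1, v1}, {u2, v2}
    linkage = len(ends1 & ends2) / len(ends1 | ends2)
    return alpha * content + (1 - alpha) * linkage


e1 = ("a", "b", "budget meeting friday")
e2 = ("a", "c", "budget meeting monday")
e3 = ("d", "e", "holiday photos beach")
print(edge_similarity(e1, e2))  # shared endpoint and shared words
print(edge_similarity(e1, e3))  # nothing in common
```

Any standard clustering routine could then group edges using this score. Setting `alpha=1.0` reproduces the content-only case mentioned above.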
Data Sets
Enron Email Data Set
200,399 messages belonging to 158 members of senior management
Flickr Social Network Data Set
15 popular Flickr user groups, including "family", "auto", "concerts", "pet portraits", "kids and nature", "street", "art", "wide party", "folk music", "magic city", "party favors", "British politics", "youth basketball", "fast food", "fancy dress party" and "great sky".
This social media network has 4,703 users in 15 groups.
Community Detection in Social Media by Leveraging Interactions and Intensities
Presented By: Reza Habibi Kerahroudi
Introduction
User interaction networks capture users’ associations
derived from their activities in social media such as:
commenting on others’ posts, replying to comments,
referencing other users, etc.
Communities can be generally defined as groups of users that are "closely knit", in the sense that a group's internal connections are denser than its connections with the rest of the network.
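The "closely knit" definition can be checked numerically by comparing the edge density inside a group with the density of edges leaving it. The following is a minimal sketch on an undirected edge list. The function name `density_report` and the toy graph are hypothetical.

```python
def density_report(edges, group):
    """Return (internal_density, external_density) for `group`.

    internal_density: fraction of possible within-group pairs that are linked.
    external_density: fraction of possible group-to-outside pairs that are linked.
    """
    group = set(group)
    nodes = {n for e in edges for n in e}
    outside = nodes - group
    internal = sum(1 for u, v in edges if u in group and v in group)
    crossing = sum(1 for u, v in edges if (u in group) != (v in group))
    max_internal = len(group) * (len(group) - 1) / 2
    max_crossing = len(group) * len(outside)
    return (
        internal / max_internal if max_internal else 0.0,
        crossing / max_crossing if max_crossing else 0.0,
    )


# Two triangles joined by a single bridge edge (3, 4).
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
inside, outside = density_report(edges, {1, 2, 3})
print(inside, outside)  # the group is a community if inside is clearly larger
```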
Introduction (cont.)
Our focus is on revealing the types of communities generated with respect to certain events by analyzing them in the dimensions of size, topic diversity and time span.
Vertex structure
Structural similarity
ε-neighborhood
Core vertex
Direct structure reachability
Structure reachability
Structure connectivity
Structure-connected cluster
Clustering
Hub
Outlier
SCAN algorithm
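The first few SCAN concepts can be written down compactly. The sketch below follows the usual SCAN definitions as commonly stated: the vertex structure Γ(v) is v plus its neighbors, the structural similarity is |Γ(u) ∩ Γ(v)| / sqrt(|Γ(u)| · |Γ(v)|), the ε-neighborhood keeps members of Γ(v) whose similarity to v is at least ε, and v is a core if its ε-neighborhood has at least μ members. The adjacency-dict representation and the function names are assumptions for this example, and the clustering, hub and outlier steps are not shown.

```python
import math


def vertex_structure(adj, v):
    """Gamma(v): the vertex itself together with its neighbors."""
    return adj[v] | {v}


def structural_similarity(adj, u, v):
    """|Gamma(u) & Gamma(v)| / sqrt(|Gamma(u)| * |Gamma(v)|)."""
    gu, gv = vertex_structure(adj, u), vertex_structure(adj, v)
    return len(gu & gv) / math.sqrt(len(gu) * len(gv))


def eps_neighborhood(adj, v, eps):
    """Members of Gamma(v) whose structural similarity to v is at least eps."""
    return {w for w in vertex_structure(adj, v)
            if structural_similarity(adj, v, w) >= eps}


def is_core(adj, v, eps, mu):
    """v is a core vertex if its eps-neighborhood has at least mu members."""
    return len(eps_neighborhood(adj, v, eps)) >= mu


# Two triangles joined by the bridge 3-4, as an adjacency dict of sets.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
print(structural_similarity(adj, 3, 4))  # bridge endpoints share little structure
print(eps_neighborhood(adj, 3, 0.7))     # vertex 4 falls below the threshold
print(is_core(adj, 1, 0.7, 3))
```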
Getting from SCAN to WSCAN
SCAN discovers cohesive network subclusters based on parameters μ and ε, which control the minimum community size and the minimum structural similarity between two nodes of a community, respectively.
To adapt SCAN for weighted interaction networks we propose weighted structure reachability for the detection of (μ, ε)-cores.
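The slides do not give the weighted definition, so the sketch below is only one plausible way to carry structural similarity over to weighted interaction networks, not the WSCAN formula itself. It assumes a cosine similarity over interaction-weight vectors, with each vertex given a hypothetical self-weight of 1. The names `weighted_similarity` and `self_weight` are invented for this example. With all weights equal to 1 it reduces to the unweighted SCAN similarity, which is the property it is meant to illustrate.

```python
import math


def weighted_similarity(w, u, v, self_weight=1.0):
    """Cosine similarity of the weighted closed neighborhoods of u and v.

    `w` maps each node to a dict {neighbor: interaction weight}.
    Assumption: each node is included in its own neighborhood with
    weight `self_weight`, mirroring SCAN's closed neighborhood.
    """
    wu = dict(w[u])
    wu[u] = self_weight
    wv = dict(w[v])
    wv[v] = self_weight
    dot = sum(wu[x] * wv[x] for x in wu.keys() & wv.keys())
    norm_u = math.sqrt(sum(x * x for x in wu.values()))
    norm_v = math.sqrt(sum(x * x for x in wv.values()))
    return dot / (norm_u * norm_v)


# Unit weights: same two-triangle graph as an unweighted network.
unit = {
    1: {2: 1.0, 3: 1.0}, 2: {1: 1.0, 3: 1.0}, 3: {1: 1.0, 2: 1.0, 4: 1.0},
    4: {3: 1.0, 5: 1.0, 6: 1.0}, 5: {4: 1.0, 6: 1.0}, 6: {4: 1.0, 5: 1.0},
}
print(weighted_similarity(unit, 3, 4))  # matches the unweighted value

# Heavier interaction between a and b than with c.
w = {"a": {"b": 3.0, "c": 1.0}, "b": {"a": 3.0, "c": 1.0}, "c": {"a": 1.0, "b": 1.0}}
print(weighted_similarity(w, "a", "b"))
```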
Real-World Networks
For experimentation we have generated a network based on Twitter user interactions (i.e. mentions, replies, retweets), extracted from data collected via the Twitter Streaming API with topic-related keywords.
Our selected topic refers to the official Eurogroup meetings (of the eurozone's finance ministers).
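To make the construction concrete, the sketch below turns a list of already-collected interaction records into a weighted, directed interaction network. The edge weight counts how often one user mentioned, replied to, or retweeted another. The record layout (`author`, `type`, `target`) and the sample user names are hypothetical. The sketch does not call the Twitter API or reproduce its payload format.

```python
from collections import Counter

INTERACTION_TYPES = {"mention", "reply", "retweet"}


def build_interaction_network(records):
    """Count interactions per directed (author, target) pair.

    `records` is an iterable of dicts with keys "author", "type", "target"
    (a hypothetical, pre-parsed format). Self-interactions and unknown
    interaction types are skipped.
    """
    weights = Counter()
    for r in records:
        if r["type"] in INTERACTION_TYPES and r["author"] != r["target"]:
            weights[(r["author"], r["target"])] += 1
    return weights


records = [
    {"author": "alice", "type": "mention", "target": "bob"},
    {"author": "alice", "type": "retweet", "target": "bob"},
    {"author": "bob", "type": "reply", "target": "alice"},
    {"author": "carol", "type": "like", "target": "bob"},     # ignored type
    {"author": "carol", "type": "reply", "target": "carol"},  # self-interaction
]
network = build_interaction_network(records)
for (src, dst), weight in sorted(network.items()):
    print(src, "->", dst, weight)
```

The resulting weights play the role of interaction intensities between user pairs.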
Real-World Networks (cont.)
Our EUROGROUP dataset (covering 8 meetings from 13/06/12 to 30/11/12) acts as an exemplary case study of a series of events held at different time instances, having the same participants with a common generic context (i.e. the eurozone's monetary issues), but different focus (depending on the agenda).
The dataset spans 227 days and comprises 29,529 tweets, 10,305 interactions and 3,015 different users.
EUROGROUP meetings, tweets, and communities
Classification of the most significant topics based on interest intensity and diffusion