Magnet community identification on social networks

MAGNET COMMUNITY IDENTIFICATION ON SOCIAL NETWORKS

University of Illinois at Chicago, USA

ABSTRACT

A magnet community is a community that attracts significantly more interest and attention from people than other communities on similar topics.

This work studies the magnet community identification problem.

1. We observe several properties of magnet communities.

2. We formalize these properties, together with community feature extraction, as a graph ranking formulation.

INTRODUCTION

Magnet communities are communities that draw significantly more attention than others, even if they are all about the same topic.


Importance:

1. Help people understand the trends in their domains.

2. Help people make decisions when joining communities.

Our goal:

Given the communities in a domain, we want to rank them by their attractiveness to people relative to the other communities in that domain.

In the end, the top-ranked communities are the ones people tend to adhere to.


Challenges:

1. How to extract features from the heterogeneous sources of factors that affect a community’s attractiveness.

2. How to combine all the heterogeneous information into a unified ranking model.

3. How to handle noise.

Common properties:

Attention flow

Attention quality

Persistence of people’s attention

Contributions

A new direction in social network analysis, namely magnet community identification.

A definition of magnet communities based on identifying their properties.

We demonstrate the effectiveness of our framework on one particular domain of magnet community identification, namely identifying companies that act as employee magnet communities.

MAGNET COMMUNITY IDENTIFICATION FRAMEWORK

Attractiveness computation framework

$M = (m_1, m_2, \ldots, m_k)$

$M = f(F_V, F_E, M)$

$M^*$

Our objective function:

Attractiveness features

Standalone features

Attention migrating matrix as dependency features

• An attention migrating matrix: $D = (d_{ij})_{k \times k}$

• The attention vector: $A = (a_i)_{k \times 1} = D \cdot e$

• Dependency features of communities:
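A minimal sketch of the attention vector above, assuming e is the all-ones vector of length k, so that A = D · e reduces to the row sums of D. The matrix values, and the reading of d_ij as the attention that community i draws from community j, are illustrative assumptions and are not taken from the slides.

```python
import numpy as np

# Hypothetical attention migrating matrix D for k = 3 communities.
# Assumption: D[i, j] is the amount of attention (e.g. number of people)
# that community i draws from community j.
D = np.array([
    [0.0, 5.0, 2.0],
    [1.0, 0.0, 0.0],
    [3.0, 4.0, 0.0],
])

e = np.ones(D.shape[0])  # assumed all-ones vector of length k
A = D @ e                # attention vector; identical to the row sums of D

print(A)
```

Under this reading, a_i aggregates everything community i receives from the other communities. If d_ij is defined in the opposite direction, the column sums (D.T @ e) would be the quantity to use instead.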

Concrete formula of magnet community ranking framework

At least one of the following conditions holds


EVALUATION

Data collection and feature extraction

Data collection

• www.linkedin.com

• Standalone features: a company’s revenue per employee, industry, location, and age

• Information on 39,527 companies in 142 industries


Feature extraction

• Industry: count how many people flow into and out of each industry, using company-level departure and arrival data.

• Location: the popularity of each location.

• Founded year: the number of companies founded in each year.
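The industry and founded-year features above can be sketched as simple counting passes. The record layout, the company names, and the choice to ignore moves that stay within one industry are all assumptions made for illustration; the slides do not specify them.

```python
from collections import Counter

# Hypothetical company-level move records: (person, departed_company, arrived_company).
moves = [
    ("p1", "A", "B"),
    ("p2", "A", "C"),
    ("p3", "B", "C"),
    ("p4", "C", "A"),
]
# Hypothetical company metadata.
company_industry = {"A": "IT", "B": "IT", "C": "Finance"}
founded_year = {"A": 1998, "B": 2004, "C": 1998}


def industry_flows(moves, company_industry):
    """Count people flowing into and out of each industry."""
    inflow, outflow = Counter(), Counter()
    for _person, src, dst in moves:
        src_ind, dst_ind = company_industry[src], company_industry[dst]
        if src_ind == dst_ind:
            continue  # assumption: a move inside one industry is not a flow
        outflow[src_ind] += 1
        inflow[dst_ind] += 1
    return inflow, outflow


inflow, outflow = industry_flows(moves, company_industry)

# Founded-year feature: number of companies founded in each year.
companies_per_year = Counter(founded_year.values())

print(dict(inflow), dict(outflow), dict(companies_per_year))
```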

Ranking performance

Baseline description

• PageRank

• Domains: IT and financial

• The 2011 ideal employer ranking proposed by Universum Global

• The 2011 most admired company ranking by Fortune
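For reference, the PageRank baseline can be sketched with plain power iteration. How the baseline graph was built is not stated in the slides; the sketch assumes a directed graph in which an edge from i to j means attention moves from company i to company j, and the damping factor 0.85 is the conventional default rather than a value from the source.

```python
import numpy as np


def pagerank(adj, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank. adj[i, j] > 0 is an edge from node i to node j."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    out = adj.sum(axis=1, keepdims=True)
    # Row-normalise outgoing weights; nodes without out-links jump uniformly.
    safe_out = np.where(out > 0, out, 1.0)
    P = np.where(out > 0, adj / safe_out, 1.0 / n)
    r = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        r_new = (1.0 - damping) / n + damping * (P.T @ r)
        if np.abs(r_new - r).sum() < tol:
            return r_new
        r = r_new
    return r


# Hypothetical flow graph: companies 0 and 1 lose people to company 2,
# and company 2 loses people to company 0.
flow = np.array([
    [0, 0, 1],
    [0, 0, 1],
    [1, 0, 0],
])
scores = pagerank(flow)
ranking = np.argsort(-scores)  # company indices, most attractive first
print(scores, ranking)
```

In this toy graph company 2 ranks first because it receives flow from both other companies, while company 1, which receives nothing, keeps only the teleportation mass.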

Case studies: IT

Case studies: Financial

Overall correctness measures

Parameter sensitivity

Two parameters: α and μ

CONCLUSION

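Proposes magnet community identification as a research direction.

Gives a very thorough account of the problem definition and examples, the significance of the research, the challenges, the goals, and the shortcomings of traditional approaches.

The algorithmic idea is clear: three properties are proposed and quantified into an objective function.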

JOINT TOPIC MODELING FOR EVENT SUMMARIZATION ACROSS NEWS AND SOCIAL MEDIA STREAMS

Qatar Computing Research Institute, Qatar Foundation, Doha, Qatar

ABSTRACT

A novel unsupervised approach based on topic modeling that summarizes trending subjects by jointly discovering representative and complementary information from news and tweets.

1. A topic modeling formalism that combines a two-dimensional topic-aspect model with a cross-collection approach from the multi-document summarization literature.

2. Co-ranking of the news sentences and tweets on both sides.

INTRODUCTION

News: well-crafted, fact-oriented long stories written by professionals about recent past events.

Tweets: personalized, more opinionated, free-style short messages posted by ordinary people in real time.


Contributions

A novel problem: generating complementary summaries.

A principled measure for assessing the extent of sentence-level complementarity.

A topic modeling approach called the cross-collection topic-aspect model (ccTAM), which combines ccLDA and the topic-aspect mixture model to precisely estimate the proposed complementarity measure.

A gold-standard dataset of complementary summaries.

PROBLEM DEFINITION

LEARNING COMPLEMENTARY RELATION

Commonality and difference

General model and media-specific model



Measuring Commonality and Difference



Cross-collection Topic-Aspect Model (ccTAM)

Inference

Infer the general topic-word distribution $\phi_z$

Infer the collection-specific topic-word distribution $\phi_{cz}$

GENERATE COMPLEMENTARY SUMMARIES

$G = (N \cup T, E)$

$N = \{n_1, n_2, \cdots, n_{m_n}\}$, $T = \{t_1, t_2, \cdots, t_{n_t}\}$

$E = \{(p(n_i \mid t_j),\ p(t_j \mid n_i)) \mid i = 1, \cdots, m_n;\ j = 1, \cdots, n_t\}$ is the set of directed edges between the two sets of nodes, whose values are node-to-node jumping probabilities.
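To make the graph definition concrete, here is a minimal sketch that stores the two edge values p(t_j | n_i) and p(n_i | t_j) as a pair of matrices. The slides do not give the jumping-probability formula, so the sketch simply normalises a hypothetical non-negative affinity matrix `sim`; that normalisation is an assumption, not the authors' complementarity-based definition.

```python
import numpy as np


def jumping_probabilities(sim):
    """Turn an affinity matrix into the two directed edge weights of G.

    sim[i, j] >= 0 is a hypothetical affinity between news sentence n_i
    and tweet t_j. Returns (p_t_given_n, p_n_given_t) where
      p_t_given_n[i, j] = p(t_j | n_i), each row sums to 1
      p_n_given_t[i, j] = p(n_i | t_j), each column sums to 1
    """
    sim = np.asarray(sim, dtype=float)
    row = sim.sum(axis=1, keepdims=True)
    col = sim.sum(axis=0, keepdims=True)
    # Leave all-zero rows or columns as zeros instead of dividing by zero.
    p_t_given_n = np.divide(sim, row, out=np.zeros_like(sim), where=row > 0)
    p_n_given_t = np.divide(sim, col, out=np.zeros_like(sim), where=col > 0)
    return p_t_given_n, p_n_given_t


# Hypothetical affinities for m_n = 3 news sentences and n_t = 2 tweets.
sim = np.array([
    [2.0, 2.0],
    [1.0, 3.0],
    [0.0, 4.0],
])
p_t_given_n, p_n_given_t = jumping_probabilities(sim)
print(p_t_given_n)
print(p_n_given_t)
```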



Jumping Probability



Sentences/Tweets Co-ranking
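The slides do not reproduce the co-ranking update, so the following is only a sketch of one common way to co-rank two node sets with a random walk over a bipartite graph: scores are passed back and forth between news sentences and tweets through the jumping probabilities. The affinity matrix `sim`, the damping factor, and the fixed iteration count are all assumptions.

```python
import numpy as np


def co_rank(sim, damping=0.85, iters=100):
    """Mutually reinforcing random-walk scores for news sentences and tweets.

    sim[i, j] > 0 is a hypothetical affinity between news sentence i and
    tweet j (every row and column is assumed to have a positive sum).
    """
    sim = np.asarray(sim, dtype=float)
    m, n = sim.shape
    p_t_given_n = sim / sim.sum(axis=1, keepdims=True)  # rows sum to 1
    p_n_given_t = sim / sim.sum(axis=0, keepdims=True)  # columns sum to 1
    news = np.full(m, 1.0 / m)
    tweets = np.full(n, 1.0 / n)
    for _ in range(iters):
        # A sentence is important if important tweets jump to it, and vice versa.
        news = (1.0 - damping) / m + damping * (p_n_given_t @ tweets)
        tweets = (1.0 - damping) / n + damping * (p_t_given_n.T @ news)
    return news, tweets


# Hypothetical affinities: sentence 0 is strongly tied to both tweets.
sim = np.array([
    [5.0, 5.0],
    [1.0, 1.0],
])
news_scores, tweet_scores = co_rank(sim)
print(news_scores, tweet_scores)
```

Both score vectors stay normalised to sum to 1 at every step, so they can be read as visiting probabilities of the walk on each side of the graph.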

Summary Generation

Summary-level complementarity

Sentence-level complementarity

EXPERIMENTS AND RESULTS

Data collection


Gold-standard summaries

News summaries: English Wikipedia and Wikinews

Tweet summaries

Baseline methods

BL-0: LexRank

BL-1: KL-divergence (KLD)

BL-2: Cosine and language modeling (LM)

BL-3: LexRank + Complementarity (LexComp)
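BL-1 and BL-2 build on standard text similarity measures. As a sketch, the KL divergence between two add-one-smoothed unigram models and the cosine similarity between term-count vectors can be computed as below. How the baselines apply these scores to select summary sentences is not stated in the slides; the tokenisation, the smoothing, and the example texts are assumptions.

```python
import math
from collections import Counter


def unigram(tokens, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over a shared vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined on the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def cosine(a_tokens, b_tokens):
    """Cosine similarity between raw term-count vectors."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical news sentence and tweet.
news = "protesters gather in the square".split()
tweet = "huge crowd in the square tonight".split()
vocab = sorted(set(news) | set(tweet))

kld = kl_divergence(unigram(news, vocab), unigram(tweet, vocab))
cos = cosine(news, tweet)
print(kld, cos)
```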


Results and Discussions


Example of output summaries

CONCLUSIONS

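Proposes the idea of using tweets to complement news summarization.

Introduces the concept of complementarity and presents a method for measuring the complementarity of tweets.

The algorithm combines many existing models, such as the topic-aspect model, the cross-collection topic model, and the random walk model.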
