Microblogs: Information and Social Network Huang Yuxin.

Post on 13-Jan-2016

Microblogs: Information and Social Network

Huang Yuxin

Millions of users in Microblogs

• By July 2009, Twitter had attracted 41 million users.

• By March 2011, Twitter had grown to 175 million users.

• Sina Microblog had reached 100 million registered IDs by March 2011.

People can publish posts and share information on Microblogs

Social network in Microblogs

What information can we extract from Microblogs

• Plain text

• User reference (1/2 posts)

• Hashtag (1/9 posts)

• Retweet

• Emoticons

• Shortened URL (resource) (1/2 posts)

• Time

• Users' geographic info

Basic features from text of Twitter

Tiny URL

Users Post Time

Emoticons

Hashtag

Mention (User reference)
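
The features listed above can be pulled out of a post's text with a few regular expressions. The following is a minimal sketch, not a production tokenizer: the patterns, the function name `extract_features`, and the sample post (including its URL) are all made up for illustration.

```python
import re

# Hypothetical patterns; real posts need more careful tokenization.
MENTION_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#(\w+)")
URL_RE = re.compile(r"https?://\S+")
EMOTICON_RE = re.compile(r"[:;=][-^]?[)(DPp]")
RETWEET_RE = re.compile(r"^RT\s+@\w+")


def extract_features(text):
    """Return the basic surface features of one post as a dict."""
    urls = URL_RE.findall(text)
    # Remove URLs first so characters inside a URL cannot match the other patterns.
    stripped = URL_RE.sub(" ", text)
    return {
        "mentions": MENTION_RE.findall(stripped),
        "hashtags": HASHTAG_RE.findall(stripped),
        "urls": urls,
        "emoticons": EMOTICON_RE.findall(stripped),
        "is_retweet": bool(RETWEET_RE.match(text)),
    }


# Made-up sample post.
sample = "RT @alice: big #earthquake near the coast :( http://tinyurl.com/abc123"
features = extract_features(sample)
```

Post time and geographic info are not in the text itself; they would come from the post's metadata.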

What is Twitter (WWW 2010)

• People who are more active tend to have more followers

• The case is different for people with very high popularity (because they are celebrities)

Small World

• Average Path length of Twitter: 4.12
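
To make the average path length concrete, here is a small sketch that computes it with breadth-first search on a toy follow graph. The graph, the user names, and the choice to average over reachable ordered pairs are assumptions for illustration; the slide does not say how the 4.12 figure was measured.

```python
from collections import deque


def average_path_length(graph):
    """Mean shortest-path length over ordered pairs (u, v), u != v,
    where v is reachable from u by following edges."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in graph.get(node, ()):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0


# Toy follow graph (hypothetical users): an edge a -> b means "a follows b".
toy = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
apl = average_path_length(toy)
```

On a graph the size of Twitter, running BFS from every node is too expensive, so the value would normally be estimated from a sample of source nodes.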

Reciprocity?

(Whole dataset)

• 77.9% of user pairs with any link between them are connected one-way.

• 67.6% of users are not followed by any of the users they follow.

• The rate of reciprocity is higher in Asian countries than in America.

• (WWW 2010)

(Subset of active users)

• 72.4% of Twitter users follow more than 80% of their followers.

• 80.5% of users are followed back by 80% of the users they follow.

• (WSDM 2010)

The two papers reach different conclusions because they used different data extraction methods.
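
The two kinds of reciprocity statistic quoted above (a pair-level one and a per-user one) can be sketched as follows. The function names and the toy follow graph are hypothetical, and the definitions are one plausible reading of the figures, not the papers' exact procedures.

```python
def one_way_pair_fraction(follows):
    """Fraction of linked user pairs that are connected in one direction only."""
    pairs = set()
    for u, targets in follows.items():
        for v in targets:
            if u != v:  # ignore self-loops
                pairs.add(frozenset((u, v)))
    one_way = 0
    for pair in pairs:
        u, v = tuple(pair)
        mutual = v in follows.get(u, set()) and u in follows.get(v, set())
        if not mutual:
            one_way += 1
    return one_way / len(pairs) if pairs else 0.0


def follow_back_rate(follows, user):
    """Fraction of the users that `user` follows who follow `user` back."""
    followings = follows.get(user, set())
    if not followings:
        return 0.0
    back = sum(1 for v in followings if user in follows.get(v, set()))
    return back / len(followings)


# Toy graph (hypothetical users): follows[u] is the set of users u follows.
follows = {"a": {"b", "c"}, "b": {"a"}, "c": set(), "d": {"a"}}
pair_fraction = one_way_pair_fraction(follows)
rate_a = follow_back_rate(follows, "a")
```

Running either measure on the whole graph versus a subset of active users can give very different numbers, which is consistent with the gap between the two papers.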

Celebrities and Popular Topics

Users’ participation in topics

• A topic attracts only a certain group of users

Content types on twitter

• Daily Chatter

• Conversations

• Sharing Information

• Reporting and Spreading News

Understanding following behavior: statistics reported in a paper

• Why we follow: professional interest, technology, tone of presentation, keeping up with friends

• Why we unfollow: Too many posts in general, too much status/personal info, spam, duplicative posts.

Interesting Research Topics on Twitter

• Vertical Search on Twitter (partial indexing + time sensitive information retrieval)

• Static Topic Detection (topic model)

• Burst Event Detection (topic specific)

• Topic Biased Expert Recommendation (graph feature + activeness + textual feature)

• Cascading Feature Analysis (network structure + topic spreading behavior on different topics)

Related Works

People I need to follow vs. Content I need to know

• An active publisher may have interests in many topics

• My page is always filled with the latest low-value chatter

• I may only need to subscribe to certain topics from an author

• Can we automatically classify one’s content and filter out irrelevant ones?
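
As a very rough illustration of the question above, the sketch below labels each post by keyword overlap and keeps only the wanted topics. The topic names, keyword sets, and sample posts are invented; a real system would more likely use a trained classifier or a topic model.

```python
import re

# Hypothetical topics and keyword sets, chosen only for this example.
TOPIC_KEYWORDS = {
    "technology": {"python", "code", "software", "release"},
    "chatter": {"lunch", "coffee", "tired", "weekend"},
}


def classify(post):
    """Return the topic with the largest keyword overlap, or None if nothing matches."""
    words = set(re.findall(r"[a-z']+", post.lower()))
    best_topic, best_hits = None, 0
    for topic, keywords in TOPIC_KEYWORDS.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best_topic, best_hits = topic, hits
    return best_topic


def filter_timeline(posts, wanted):
    """Keep only the posts whose predicted topic is in `wanted`."""
    return [p for p in posts if classify(p) in wanted]


timeline = [
    "New software release with cleaner Python code",
    "So tired, need coffee before lunch",
    "Just landed",
]
kept = filter_timeline(timeline, {"technology"})
```

Posts that match no keyword are dropped here; whether to hide or show unclassified posts is a design choice the slide leaves open.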

Topics spread through the network

[Figure: "EARTHQUAKE" as an example topic spreading through the network]

Detecting hot Topics with community

• Temporal features of keywords

• Hot topics are biased toward a group of users or a certain time period

• Retweet trees and social networks, together with users' expertise, can all be used in model training
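
One simple way to use the temporal feature of a keyword is to flag time slots where its count jumps far above the recent average. The sketch below is an assumed baseline, not the method proposed in the slides; the window size, the threshold, and the hourly counts are invented.

```python
from statistics import mean, pstdev


def detect_bursts(counts, window=3, threshold=3.0):
    """Return indices whose count exceeds mean + threshold * std of the preceding window."""
    bursts = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        # Floor sigma at 1 so a nearly flat history does not flag tiny changes.
        if counts[i] > mu + threshold * max(sigma, 1.0):
            bursts.append(i)
    return bursts


# Made-up hourly counts of one keyword, e.g. "earthquake".
hourly = [2, 3, 2, 3, 40, 35, 4]
bursts = detect_bursts(hourly)
```

Only the first hour of the spike is flagged, because the spike itself inflates the window statistics for the following hour. Community and retweet-tree signals would have to be added on top of this.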

Topic Model with network regularization (WWW 08)

[Figure: documents linked by a network (e.g. coauthor network); each document d is associated with k topics, each described by a keyword list]

O(C, G) = L(C) + R(G, C)
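
The slide gives only the combined objective O(C, G) = L(C) + R(G, C) and does not spell out R. The sketch below assumes R is a graph-smoothness penalty that is small when linked documents have similar topic distributions; that form, the function name, and the toy data are assumptions for illustration, not taken from the slide.

```python
def graph_regularizer(theta, edges):
    """Assumed penalty: 0.5 * sum over edges of w * squared distance between topic vectors."""
    total = 0.0
    for u, v, w in edges:
        total += w * sum((a - b) ** 2 for a, b in zip(theta[u], theta[v]))
    return 0.5 * total


# Hypothetical topic distributions over k = 2 topics for three documents.
theta = {"d1": [0.9, 0.1], "d2": [0.8, 0.2], "d3": [0.1, 0.9]}
# Hypothetical network edges (u, v, weight), e.g. coauthor links.
edges = [("d1", "d2", 1.0), ("d2", "d3", 1.0)]
penalty = graph_regularizer(theta, edges)
```

Here almost all of the penalty comes from the d2-d3 edge, whose endpoints disagree on their dominant topic; a regularized model would be pushed to bring such neighbors closer together.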

Rumors have attracted much attention

Intuitions

• Rumors spread furiously and cause hot discussion

• Rumors tend to be controversial (people spreading them and people arguing against them)

• The source of a rumor (celebrities? nobody?)

• A study of how one particular rumor spreads may be interesting.

• Will celebrities clarify the truth?

Challenges

• How to differentiate rumors from personal views

• Most of the comments are subjective (expression of feelings)

Rumors vs. meaningless Topics

Suggestions and ideas are very welcome