A network based model for predicting a hashtag break out in twitter
Uploaded by: sultan-alzahrani
Category: Data & Analytics
![Page 1: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/1.jpg)
A Network-Based Model for Predicting Hashtag Breakouts in Twitter
![Page 2: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/2.jpg)
Agenda
Background
Methodology
Our visualization tool
Experiment & Results
![Page 3: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/3.jpg)
Introduction
Tweets:
• Textual content
• User interaction: retweeting, mentioning, replying, etc.
Hashtags:
• A tagging mechanism created by users
• Help in categorizing tweets
• Have become very popular in trending topics
![Page 4: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/4.jpg)
Some Definitions
Tweet Hashtag Volume: number of tweets containing a given hashtag per day.
Spike: a sharp increase in the volume.
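The volume definition above can be sketched in a few lines of Python. This is only an illustration: the `(day, text)` tuple layout, the function name `daily_hashtag_volume`, and the toy tweets are assumptions, not part of the original slides.

```python
import re
from collections import Counter
from datetime import date

def daily_hashtag_volume(tweets, hashtag):
    """Count tweets per day that contain `hashtag` (case-insensitive)."""
    # Match the tag as a whole token, so "#vote" does not match "#voter".
    tag = re.escape(hashtag.lstrip("#"))
    pattern = re.compile(r"(?<!\w)#" + tag + r"(?!\w)", re.IGNORECASE)
    volume = Counter()
    for day, text in tweets:
        if pattern.search(text):
            volume[day] += 1  # one count per tweet, even if the tag repeats
    return dict(volume)

# Hypothetical toy data, for illustration only.
tweets = [
    (date(2013, 1, 1), "Go #Vote today"),
    (date(2013, 1, 1), "#vote #vote early"),
    (date(2013, 1, 2), "every #voter counts"),
    (date(2013, 1, 2), "did you #vote?"),
]
print(daily_hashtag_volume(tweets, "#vote"))
```

Counting each tweet once, even when it repeats the tag, follows the "number of tweets" wording of the definition.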
![Page 5: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/5.jpg)
Research Question
Some hashtags become viral.
Can we predict whether a hashtag will go viral while it is still at a nascent stage?
Network-based?
Textual content-based?
![Page 6: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/6.jpg)
Viral Diffusion
Network Based Analysis
• Arruda et al. examined the role of centrality measures in disease spread on an SIR model and in rumor spreading on a social network.
• In the SIR model for rumors, infected individuals recover with some probability, while a spreader becomes a carrier through contacts in social networks.
Content Based Analysis
• Hypothesis: specific groups of words are more likely to be contained in viral tweets.
• Li et al. analyzed tweets in terms of emotional divergence (sentiment analysis) and noted that highly interactive tweets tend to contain more negative emotions than other tweets.
![Page 7: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/7.jpg)
Running average and standard deviation
20-day sliding window
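A minimal sketch of the running statistics, assuming a trailing window that includes the current day and a population standard deviation; the slides do not say which variant was used, and the name `running_stats` is hypothetical.

```python
from statistics import mean, pstdev

def running_stats(volumes, window=20):
    """Return (mean, std) of the trailing `window` days for each day.

    Days with fewer than `window` observations so far yield None.
    """
    stats = []
    for i in range(len(volumes)):
        if i + 1 < window:
            stats.append(None)  # not enough history yet
            continue
        chunk = volumes[i + 1 - window : i + 1]
        stats.append((mean(chunk), pstdev(chunk)))
    return stats

# Short toy series with window=8 to keep the example readable.
toy = [2, 4, 4, 4, 5, 5, 7, 9]
print(running_stats(toy, window=8)[-1])
```

With daily data, `window=20` corresponds to the 20-day sliding window named on the slide.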
![Page 9: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/9.jpg)
Hashtag Volume
![Page 10: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/10.jpg)
Utilizing the Three-Sigma Rule
68-95-99.7 Rule
Empirical rule
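Under a normal distribution, about 99.7% of values fall within three standard deviations of the mean, so a value above that band is rare. One way to turn this into a spike test is sketched below. The function name `find_spikes`, the choice to exclude the current day from the baseline, and the toy series are assumptions for illustration, not the exact measure used in the talk.

```python
from statistics import mean, pstdev

def find_spikes(volumes, window=20, k=3.0):
    """Return indices of days whose volume exceeds mean + k*std of the
    preceding `window` days (the day itself is excluded from the baseline)."""
    spikes = []
    for i in range(window, len(volumes)):
        baseline = volumes[i - window : i]
        threshold = mean(baseline) + k * pstdev(baseline)
        if volumes[i] > threshold:
            spikes.append(i)
    return spikes

# Hypothetical toy series: 20 quiet days, then one sharp jump.
volumes = [10, 12, 11, 9, 10, 11, 10, 12, 9, 10] * 2 + [11, 60, 12]
print(find_spikes(volumes))
```

Excluding the current day keeps a large value from inflating its own threshold.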
![Page 11: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/11.jpg)
Hashtags Distribution
![Page 12: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/12.jpg)
Accumulative Period
Break out or die out?
Build a predictive learning model based on …
![Page 14: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/14.jpg)
Break out vs Die out
[Figure panels: Break out; Non-break out (Die out)]
![Page 15: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/15.jpg)
Our Approach
Can we predict #Hashtag breakouts in Twitter at their early stages using local and global network interaction measures?
Local measures: computed on the interaction network within the 20-day accumulation window.
Global measures: computed on the interaction network from earlier activity up to the end of the current window.
1. Define a breakout measure based on the 3-sigma (empirical) rule
2. Model evolutionary episodes of hashtag volumes as:
• Accumulation, Breakout, Die-Out
3. Extract local and global network features
4. Train and test a classifier to:
• Predict whether Accumulation leads to Breakout or Die-Out
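The distinction between local and global interaction networks on this slide could be implemented roughly as follows. The `(day, source_user, target_user)` tuple layout, the integer day indices, and the function name `build_networks` are assumptions made for the sketch.

```python
def build_networks(interactions, window_start, window_end):
    """Split timestamped interactions into local and global edge sets.

    interactions: iterable of (day, source_user, target_user) tuples.
    Local  = edges observed inside [window_start, window_end].
    Global = edges observed at any time up to window_end.
    """
    local_edges, global_edges = set(), set()
    for day, source, target in interactions:
        if day > window_end:
            continue  # later interactions are not visible at prediction time
        global_edges.add((source, target))
        if day >= window_start:
            local_edges.add((source, target))
    return local_edges, global_edges

# Hypothetical toy interactions (retweets, mentions, or replies).
interactions = [
    (3, "ann", "bob"),    # before the window: global only
    (25, "bob", "cat"),   # inside the window: local and global
    (30, "cat", "ann"),   # inside the window: local and global
    (45, "dan", "ann"),   # after the window: ignored
]
local_edges, global_edges = build_networks(interactions, window_start=21, window_end=40)
print(sorted(local_edges))
print(sorted(global_edges))
```

By construction the local edge set is always a subset of the global one, so each network measure can be computed twice, once per edge set.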
![Page 16: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/16.jpg)
IDENTIFY evolutionary episodes in #Hashtag volume time-series
[Figure: hashtag volume time series annotated with Accumulation, Breakout, and Die-out episodes]
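As a rough sketch of how an accumulation episode might be labeled for training, the function below compares the days following a window against that window's three-sigma threshold. The `horizon` parameter, the function name `label_episode`, and the toy series are hypothetical; the slides do not state how far ahead a breakout is looked for.

```python
from statistics import mean, pstdev

def label_episode(volumes, start, window=20, horizon=10, k=3.0):
    """Label the accumulation window beginning at index `start`.

    Returns "Breakout" if any of the `horizon` days after the window
    exceeds mean + k*std of the window, otherwise "Die-Out".
    """
    accumulation = volumes[start : start + window]
    if len(accumulation) < window:
        raise ValueError("not enough days for a full accumulation window")
    threshold = mean(accumulation) + k * pstdev(accumulation)
    following = volumes[start + window : start + window + horizon]
    return "Breakout" if any(v > threshold for v in following) else "Die-Out"

# Hypothetical toy series: 20 quiet days followed by two different endings.
quiet = [5, 6, 5, 4, 6] * 4
print(label_episode(quiet + [7, 40, 55, 30], start=0))
print(label_episode(quiet + [5, 4, 3, 2], start=0))
```

The resulting labels would serve as the target variable for the classifier, with the network measures of each window as its features.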
![Page 17: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/17.jpg)
Trending Hashtag Forecaster
![Page 18: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/18.jpg)
Local and global network measures are computed as features
Network measures:
Eigenvector Centrality
PageRank
Closeness Centrality
Betweenness Centrality
Degree Centrality
In-degree Centrality
Out-degree Centrality
Link Rate
Distinct Link Rate
Number of uninfected neighbors of early adopters
Neighborhood average degree
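A few of the simpler measures in this list can be computed directly from a directed edge set, as in the sketch below. The normalization by (n - 1) is a common convention for degree centrality; treating neighbors in either edge direction as "neighbors of early adopters" is an assumption, as are the function names and the toy edges.

```python
from collections import defaultdict

def degree_features(edges):
    """In-, out-, and total degree centrality for a directed edge set,
    each normalized by (n - 1), where n is the number of nodes."""
    nodes = {user for edge in edges for user in edge}
    indeg, outdeg = defaultdict(int), defaultdict(int)
    for source, target in edges:
        outdeg[source] += 1
        indeg[target] += 1
    scale = max(len(nodes) - 1, 1)  # avoid division by zero
    return {
        node: {
            "indegree": indeg[node] / scale,
            "outdegree": outdeg[node] / scale,
            "degree": (indeg[node] + outdeg[node]) / scale,
        }
        for node in nodes
    }

def uninfected_neighbors(edges, adopters):
    """Users linked to an adopter (in either direction) who have not
    used the hashtag themselves."""
    neighbors = set()
    for source, target in edges:
        if source in adopters:
            neighbors.add(target)
        if target in adopters:
            neighbors.add(source)
    return neighbors - set(adopters)

# Hypothetical toy interaction edges (source interacts with target).
edges = {("ann", "bob"), ("cat", "bob"), ("bob", "dan")}
features = degree_features(edges)
print(features["bob"])
print(sorted(uninfected_neighbors(edges, adopters={"bob"})))
```

Measures such as eigenvector, closeness, and betweenness centrality or PageRank are usually taken from a graph library rather than hand-written.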
![Page 19: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/19.jpg)
PCA Ranking of Features
Exploratory method: reduces the original measure variables through an orthogonal transformation.
PCA returns a sorted list of linearly uncorrelated components, each with the variance it explains.
Components are ranked by the variance they capture among instances, highest first.
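For reference, the standard PCA formulation behind these statements is given below. The symbols are introduced here for illustration: $X$ is the mean-centered data matrix with $n$ instances (rows) and $p$ network measures (columns). How the components are mapped back to a ranking of individual features is not specified in the slides.

$$
C = \frac{1}{n-1} X^{\top} X, \qquad
C = W \Lambda W^{\top}, \qquad
\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_p), \quad
\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0
$$

$$
Z = X W, \qquad
\text{explained variance ratio of component } j = \frac{\lambda_j}{\sum_{k=1}^{p} \lambda_k}
$$

The columns of $W$ are orthonormal, which is the orthogonal transformation, and the columns of $Z$ are the linearly uncorrelated components, sorted by their variance $\lambda_j$.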
![Page 20: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/20.jpg)
PCA Ranking of Features
![Page 21: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/21.jpg)
Prediction Accuracies
• Break out
• Non-break out (die out)
![Page 22: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/22.jpg)
Conclusion and Future Work
• A content-independent, network-based classifier for predicting hashtag breakouts
• Next, we propose to study the utility of content-based features such as keywords, named entities, topics, and sentiment.
![Page 23: A network based model for predicting a hashtag break out in twitter](https://reader030.fdocuments.us/reader030/viewer/2022032422/55a8c4861a28abb6108b46b1/html5/thumbnails/23.jpg)
Thank you for listening!
Any questions?