Tweet Summarizer

Tweet Summarization

Sai Madhuri B, Srikanth K S

Project Details

Project Name: Tweet Summarization

Problem definition: For a given keyword K and hashtag set H, the unprocessed stream of tweets R is processed to obtain P, the set of tweets returned by the co-occurrence tweet extractor. P consists of subsets p1, p2, ..., pn, which correspond to the tweets segregated into subtopic 1, subtopic 2, ..., subtopic n.

Dataset: Collected through the Twitter REST API

Gold Standard: Human evaluation

Extraction

● Tweet text
● Time the tweet was created at
● Screen name of the user
● Follower count of the user
● Favorite count of the user
● Favorited flag of the tweet
● Retweeted flag of the tweet
● Retweet count of the tweet

Filtering

● Converted HTML-encoded characters into ASCII.
● Removed any non-ASCII (Unicode) characters.
● Filtered out embedded URLs.
● Removed the retweets.
● Removed the user handles (@mentions).
● Removed the hashtags.
● Removed the tweets shorter than 5 words.
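The filtering steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `clean_tweet`, the regular expressions, and the assumption that retweets are recognizable by a leading "RT" are all hypothetical choices made for the example.

```python
import html
import re

# Hypothetical patterns; the slides do not say how each item was detected.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
RETWEET_RE = re.compile(r"^\s*RT\b", re.IGNORECASE)  # assumption: retweets start with "RT"


def clean_tweet(text, min_words=5):
    """Return the cleaned tweet text, or None if the tweet is filtered out."""
    text = html.unescape(text)            # HTML-encoded characters, e.g. "&amp;" -> "&"
    if RETWEET_RE.match(text):            # drop retweets entirely
        return None
    text = URL_RE.sub(" ", text)          # strip embedded URLs
    text = MENTION_RE.sub(" ", text)      # strip user handles
    text = HASHTAG_RE.sub(" ", text)      # strip hashtags
    text = text.encode("ascii", "ignore").decode("ascii")  # drop non-ASCII characters
    text = " ".join(text.split())         # normalize whitespace
    if len(text.split()) < min_words:     # drop tweets shorter than min_words words
        return None
    return text
```

Returning `None` for dropped tweets keeps the two kinds of filtering (editing the text versus discarding the tweet) in a single pass.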

Distribution of Tweets

1752 tweets divided into 4 clusters.

Cluster 1 - tweets calling MH370 a bluff

Cluster 2 - tweets related to a certain golf star being attacked by hornets in Malaysia

Cluster 3 - tweets with information about a certain phase of MH370 search (SAR Mission)

Cluster 4 - tweets representing another phase of the 'MH370 search'

Clustering - Baseline

Bursty topic model

● Binomial distribution (Fung et al.)
● Subtopic segmentation
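The slides do not show the formula used, so the following is only one plausible reading of a binomial burst test: treat the number of tweets in a time window that contain a word as a binomial draw, and flag the word as bursty when the observed count is improbably high. The names `binomial_tail` and `is_bursty`, the `base_rate` parameter, and the `alpha` cutoff are assumptions for illustration.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_bursty(count_in_window, tweets_in_window, base_rate, alpha=0.01):
    """Flag a word as bursty in a window (hypothetical criterion).

    count_in_window:  tweets in the window that contain the word
    tweets_in_window: all tweets in the window
    base_rate:        assumed long-run fraction of tweets containing the word
    alpha:            assumed significance cutoff
    """
    return binomial_tail(count_in_window, tweets_in_window, base_rate) < alpha
```

For instance, a word seen in 30 of 100 tweets in a window, against a base rate of 5%, would be flagged, while 5 of 100 would not.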

Step 1: Associated words

Bursty topic model

Step 2: Lifetime of subtopic

- set of words in association word set

Implementation

Representation

Step 1

● Words as nodes
● Association as edge weight
● Find components to determine associated sets

Step 2

● Tune to obtain overlapping lifetimes
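Step 1 of the representation can be sketched as a connected-components search over a word graph. This is a minimal sketch under stated assumptions: the slides do not say how weak associations are handled, so the `min_weight` threshold (edges below it are ignored) and the function name `associated_word_sets` are hypothetical.

```python
from collections import defaultdict


def associated_word_sets(edges, min_weight=0.5):
    """Group words into associated sets.

    edges: iterable of (word_a, word_b, association_weight) tuples.
    min_weight: assumed cutoff; weaker associations are not treated as edges.
    Returns a list of sets, one per connected component.
    """
    graph = defaultdict(set)
    for a, b, weight in edges:
        if weight >= min_weight:
            graph[a].add(b)
            graph[b].add(a)

    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        # Depth-first traversal collects every word reachable from `start`.
        stack = [start]
        seen.add(start)
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components
```

Raising or lowering `min_weight` splits or merges components, which is one way the associated sets could be tuned.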

Ranking

LexRank

Input

- tweets from subtopics

Output

- ranked tweets per subtopic
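A simplified LexRank-style ranking for the tweets of one subtopic might look like the sketch below. It is not the authors' implementation: the whitespace tokenizer, the similarity `threshold`, the `damping` value, and the omission of self-loops are all assumptions made to keep the example short.

```python
import math
from collections import Counter


def lexrank(tweets, threshold=0.1, damping=0.85, iterations=50):
    """Rank tweets by centrality in a thresholded cosine-similarity graph."""
    n = len(tweets)
    if n == 0:
        return []
    docs = [Counter(t.lower().split()) for t in tweets]
    df = Counter(w for d in docs for w in d)           # document frequency per word
    idf = {w: math.log(n / df[w]) + 1.0 for w in df}   # smoothed IDF
    vecs = [{w: tf * idf[w] for w, tf in d.items()} for d in docs]
    norms = [math.sqrt(sum(x * x for x in v.values())) for v in vecs]

    def cosine(i, j):
        if norms[i] == 0 or norms[j] == 0:
            return 0.0
        dot = sum(x * vecs[j].get(w, 0.0) for w, x in vecs[i].items())
        return dot / (norms[i] * norms[j])

    # Unweighted graph: an edge joins two tweets whose similarity passes the threshold.
    adj = [[1.0 if i != j and cosine(i, j) >= threshold else 0.0 for j in range(n)]
           for i in range(n)]
    degree = [sum(row) for row in adj]

    # Power iteration, as in PageRank.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(adj[j][i] * scores[j] / degree[j]
                            for j in range(n) if degree[j] > 0)
            for i in range(n)
        ]
    return sorted(zip(tweets, scores), key=lambda pair: pair[1], reverse=True)
```

Tweets that resemble many other tweets in the subtopic score highest, so an off-topic tweet with no similar neighbours falls to the bottom of the ranking.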

Human Evaluation

● Choosing the number of categories

● Discarding uninformative and incoherent tweets

● Ranking the remaining tweets in each cluster by richness of information and coherence

● Combining all the ranked tweets to obtain the summary

Human Evaluation

A. Funny (Hold the front page! I've found the black box!!! Really sorry, it's been in my kit room all along)
B. Sarcastic (All psychics in the world should gather for a psychic convention to solve MH370 mysteries)
C. Uninformative ('my heart will go on' song made me think… what if a film about mh370 is made?)
D. Unrelated (Larrazabal stung by hornets in Malaysia)
E. Predictive (I am guessing flight is in the Warthon Basin floor, Indian Ocean, but it has to be proven)

ROUGE

Metrics used:

● Precision
● Recall

Evaluation models:

1. Bursty topic model
2. Clustering
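To make the two metrics concrete, here is a minimal sketch of n-gram overlap precision and recall in the spirit of ROUGE-N. The slides do not state which ROUGE variant or settings were used, so the unigram default, the whitespace tokenizer, and the function name `rouge_n` are assumptions; the actual numbers came from the ROUGE toolkit, not from this code.

```python
from collections import Counter


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall) of clipped n-gram overlap."""
    def ngrams(text):
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    overlap = sum((cand & ref).values())   # each n-gram counted at most min(cand, ref) times
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    return precision, recall


# Two differently worded tweets with the same meaning share only some words.
p, r = rouge_n("Sub search will end in another week",
               "Sub marine search will be called off in a week")
```

Here five of the candidate's seven words appear in the ten-word reference, so precision is 5/7 and recall is 5/10, even though both tweets say the same thing.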

Results

We performed human evaluation with two volunteers. The results obtained from the ROUGE evaluation toolkit are presented below.

Evaluation                        Precision   Recall
Human 1 vs Clustering             0.18060     0.08120
Human 2 vs Clustering             0.21070     0.19444
Human 1 vs Human 2                0.41358     0.20150
Human 1 vs Bursty Topic Model     0.29032     0.18947
Human 2 vs Bursty Topic Model     0.27880     0.37346

Intuitions

● Human 1 vs Human 2 difference: the evaluators may pick differently worded tweets that convey the same information.
● Example:
  Tweet 1: ‘Sub search will end in another week’
  Tweet 2: ‘Sub marine search will be called off in a week’
● The bursty topic model matches human evaluation better than clustering does because it classifies subtopics temporally.

Future

Below are a few improvements that could be made to our model:

1. Incorporating grammatical checking on tweets might yield better results.

2. Tweak LexRank to accommodate user popularity as edge weights.
