Trend Analysis

Trend Analysis: Definition, Detection, Tracking

Transcript of Trend Analysis

Page 1: Trend Analysis

TREND

Page 2: Trend Analysis

SemioNet: Semantic Social Network Analysis

TREND DETECTION, TRACKING & TRANSITION in Social Networks

1. Definition & General Idea
2. Web Samples in Trend Hunting
3. Detection Approaches
4. Architecture: TwitterMonitor
5. Detection: MemeTracker
6. Classification: ExoEndo

Page 3: Trend Analysis

REFERENCES
Mathioudakis, Michael, and Nick Koudas. "TwitterMonitor: trend detection over the Twitter stream." Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data. ACM, 2010.
Leskovec, Jure, Lars Backstrom, and Jon Kleinberg. "Meme-tracking and the dynamics of the news cycle." Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2009.
Naaman, Mor, Hila Becker, and Luis Gravano. "Hip and trendy: Characterizing emerging trends on Twitter." Journal of the American Society for Information Science and Technology 62.5 (2011): 902-918.
Becker, Hila, Mor Naaman, and Luis Gravano. "Beyond Trending Topics: Real-World Event Identification on Twitter." ICWSM 11 (2011): 438-441.

Page 4: Trend Analysis

Trend Analysis

Horizontal Analysis

The Science of Studying Changes in Social Patterns, Including Fashion, Technology & Consumer Behavior

The General Movement over TIME of a Statistically Detectable Change

Fundamentally, a Method for Understanding HOW & WHY Things have Changed – or will Change – over TIME

Page 5: Trend Analysis

APPLICATION

Page 6: Trend Analysis

APPROACH

Text Mining: Topic Identification & Clustering

"Kilroy was here" was a piece of graffiti that became popular in the 1940s, and existed under various names in different countries, illustrating how a meme can be modified through replication

A meme (/ˈmiːm/) is "an idea, behavior, or style that spreads from person to person within a culture" … through writing, speech, gestures, rituals, or other imitable phenomena with a mimicked theme. … cultural analogues to genes in that they self-replicate, mutate, and respond to selective pressures.

Page 7: Trend Analysis

One-pass
Real-time
Adjustable against spam
Theoretically sound!
Adjustable against SPURIOUS Bursts: Coincidental Burst of a Keyword over a Short Period of Time

GroupBurst: Assesses Co-occurrences of Bursty Keywords in Recent Tweets
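
The grouping idea can be illustrated with a minimal sketch. It is not TwitterMonitor's actual GroupBurst implementation: the function name `group_bursty_keywords`, the `min_cooc` threshold, the whitespace tokenization, and the use of connected components are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def group_bursty_keywords(bursty, tweets, min_cooc=2):
    """Group bursty keywords that co-occur in at least `min_cooc` recent tweets."""
    bursty = set(bursty)

    # Count how often each pair of bursty keywords appears in the same tweet.
    cooc = Counter()
    for tweet in tweets:
        present = sorted(bursty & set(tweet.lower().split()))
        cooc.update(combinations(present, 2))

    # Union-find: keywords linked by a frequent co-occurrence share a group.
    parent = {k: k for k in bursty}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), n in cooc.items():
        if n >= min_cooc:
            parent[find(a)] = find(b)

    groups = {}
    for k in bursty:
        groups.setdefault(find(k), set()).add(k)
    return sorted(sorted(g) for g in groups.values())


recent = [
    "earthquake in chile",
    "chile earthquake today",
    "ravens game tonight",
    "earthquake ravens",
]
print(group_bursty_keywords({"earthquake", "chile", "ravens"}, recent))
```

In this toy input "earthquake" and "chile" co-occur twice and end up in one group, while "ravens" shares only a single tweet with "earthquake" and stays separate.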

Context Extraction Algorithms (PCA, SVD) & Grapevine’s Entity Extractor to Add more

271 Million Monthly Active Users
500 Million Tweets (140 ch) Per Day
78% Active Users on Mobile
77% Accounts Outside U.S.
Supports 35+ languages

Page 8: Trend Analysis

MemeTracking
News Cycle
Tracking News Evolution

Quotes & Memes
Integral Part of Journalistic Practice

Travel Relatively Intact with Mutational Variants

Clustering by Graph

Page 9: Trend Analysis

Item: Each News Article/Blog Post

Phrase: A Quoted String that Occurs in One or More Items

MemeTracking …

Page 10: Trend Analysis

Phrase Graph (DAG)

“enough of senseless killing”

“Hear our voice. We have had enough of this senseless killing”

“senseless killing”

Edge P → Q
|P| < |Q|
Directed Edit Distance(P, Q) < δ
Word Consecutive Overlap(P, Q) > k
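
A minimal sketch of how such edges could be built is given here. Several parts are assumptions rather than the MemeTracker implementation: the directed distance is approximated as the number of words of P missing from the longest common subsequence with Q, the two tests are combined with "or", and the default values `delta=1` and `k=10` are illustrative.

```python
import re
from itertools import permutations


def words(phrase):
    """Lower-case word tokens of a phrase."""
    return re.findall(r"[a-z0-9']+", phrase.lower())


def lcs_len(a, b):
    """Length of the longest common subsequence of two word lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def directed_distance(p, q):
    """Assumed proxy: words of p that must be dropped to embed p in q."""
    return len(p) - lcs_len(p, q)


def consecutive_overlap(p, q):
    """Length of the longest run of consecutive words shared by p and q."""
    best = 0
    prev = [0] * (len(q) + 1)
    for x in p:
        cur = [0] * (len(q) + 1)
        for j, y in enumerate(q, 1):
            if x == y:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def phrase_graph(phrases, delta=1, k=10):
    """Return the set of edges (P, Q) from shorter to longer phrases."""
    toks = [words(p) for p in phrases]
    edges = set()
    for i, j in permutations(range(len(phrases)), 2):
        p, q = toks[i], toks[j]
        close = directed_distance(p, q) < delta or consecutive_overlap(p, q) > k
        if len(p) < len(q) and close:
            edges.add((phrases[i], phrases[j]))
    return edges


quotes = [
    "senseless killing",
    "enough of senseless killing",
    "Hear our voice. We have had enough of this senseless killing",
]
for p, q in sorted(phrase_graph(quotes)):
    print(p, "->", q)
```

With the three quotes from this slide, each shorter phrase is linked to every longer phrase that contains its words in order, which gives three edges.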

Page 11: Trend Analysis

Phrase Clusters

Heuristic
1. Start from the Roots
2. Proceed Down the DAG & Greedily Assign each Node to the Cluster to which it has the Most Edges

Given a Weighted DAG, Delete a Set of Edges of Min Total Weight So That Each of the Resulting Components is Single-Rooted. NP-hard
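
The greedy heuristic can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: edges are assumed to point from a shorter phrase to a longer phrase, so roots are the nodes with no outgoing edges; "most edges" is read as the largest total edge weight; and the input format `out_edges` and the function name `partition_dag` are hypothetical.

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # Python 3.9+


def partition_dag(out_edges):
    """Greedily assign every node to a root's cluster.

    out_edges maps node -> {target_node: edge_weight}.
    Returns a dict mapping node -> root that identifies its cluster.
    """
    # Treat edge targets as dependencies so they are visited before
    # the nodes that point to them (roots first, then down the DAG).
    deps = {node: set(targets) for node, targets in out_edges.items()}
    cluster = {}
    for node in TopologicalSorter(deps).static_order():
        targets = out_edges.get(node, {})
        if not targets:
            cluster[node] = node  # a root starts its own cluster
            continue
        # Sum the edge weight going into each already-formed cluster.
        votes = defaultdict(float)
        for target, weight in targets.items():
            votes[cluster[target]] += weight
        cluster[node] = max(votes, key=votes.get)
    return cluster


graph = {
    "a": {"r1": 1},
    "b": {"r1": 1, "r2": 3},
    "c": {"a": 2, "b": 1},
}
print(partition_dag(graph))
```

Keeping a node only in its best cluster corresponds to deleting its edges into the other clusters, so every resulting component has a single root. The result is a heuristic answer and is not guaranteed to have minimum total deleted weight.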

Directed Acyclic Graph (DAG) Partitioning

Page 12: Trend Analysis

Page 13: Trend Analysis

Result

Dataset: 3 Months (Aug 1 to Oct 31, 2008), ~1M Docs per Day from 1.65 Million Sites!

47M Phrases, 22M Distinct
9H Clustering Process Time
35,800 Non-trivial Clusters (at least two phrases)

Volume Distribution

Page 14: Trend Analysis

ThemeRiver

Page 15: Trend Analysis

Other Findings

Time lag between the news media and blogs

Quotes migrating from blogs to news media: 3.5%

Each Cluster

Modeling the news trend: Imitation, Recency
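
A toy simulation can show how imitation and recency interact. It is only a sketch under assumptions: each source picks a thread with probability proportional to an imitation term (adoptions so far, plus one) times a recency term (inverse of the thread's age), and the function name, parameters, and both functional forms are illustrative choices rather than the model fitted in the MemeTracker paper.

```python
import random


def simulate_threads(steps=200, sources_per_step=20, seed=0):
    """Return the number of adoptions each thread collected.

    One new thread appears per time step. Each source then picks a thread
    with weight (adoptions + 1) / (age + 1): popular threads attract more
    sources (imitation), old threads attract fewer (recency).
    """
    rng = random.Random(seed)
    created = []  # creation time of each thread
    counts = []   # adoptions collected by each thread
    for t in range(steps):
        created.append(t)
        counts.append(0)
        weights = [
            (counts[j] + 1) / (t - created[j] + 1) for j in range(len(created))
        ]
        picks = rng.choices(range(len(created)), weights=weights, k=sources_per_step)
        for j in picks:
            counts[j] += 1
    return counts


counts = simulate_threads()
print("largest thread:", max(counts), "of", sum(counts), "adoptions")
```

Dropping either factor from the weight is an easy way to explore what each ingredient contributes to the rise and fall of individual threads.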

Page 16: Trend Analysis
Page 17: Trend Analysis

Characterizing Trends

“trends in trend data.”
Meta Trend
Taxonomy of the Trends

Key Distinguishing Features of Trends
Not only the Textual Content

Social Network Structure: Ties

Geographic
Action: Retweet, Reply, Mention, Hashtag

Page 18: Trend Analysis

Trends

Exogenous

Broadcast-media
Broadcast of local media: “fight” (boxing event), “Ravens” (football game)
Broadcast of global/national media: “Kanye” (Kanye West acts up at the MTV Video Music Awards), “Lost Finale” (series finale of Lost)

Global News
Breaking: “earthquake” (Chile earthquake), “Tsunami” (Hawaii Tsunami warning), “Beyoncé” (Beyoncé cancels Malaysia concert)
Nonbreaking: “HCR” (health care reform), “Tiger” (Tiger Woods apologizes), “iPad” (toward the launch of Apple’s popular device)

National Holidays & Memorial Days: “Halloween,” “Valentine’s”

Local Participatory & Physical
Planned: “marathon,” “superbowl” (Super Bowl viewing parties), “patrick’s” (St. Patrick’s Day Parade)
Unplanned: “rainy,” “snow”

Endogenous

Memes: #in2010 (in December 2009, users imagine their near future), “November” (users marking the beginning of the month on November 1)

Retweets

Fan Community Activities: “2pac” (the anniversary of the death of hip-hop artist Tupac Shakur)

Characterizing Trends …

Page 19: Trend Analysis

Ttwitter: Trends from twitter.com
Tterm freq.: Trends from Simple Trend Detector
Tquality: Trends for Quality Analysis (Supervised Categories)
Tquantity: Trends for Computing Features

Page 20: Trend Analysis

Content Features
• Average number of words/characters
• Proportion of messages with URLs, unique URLs, with hashtags ex/including trend terms
• Top unique hashtag?
• Similarity to centroid

Interaction Features
• Proportion of retweets, replies, mentions

Time-based Features
• Exponential fit head, tail
• Logarithmic fit head, tail

Participation Features
• Messages per author
• Proportion of messages from top author
• Proportion of messages from top 10% of authors

Social Network Features
• Level of reciprocity
• Maximal eigenvector centrality
• Maximal degree centrality
• Transitivity
• Density
• Average component size
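
A few of the content, interaction, and participation features can be computed with a short sketch like the one below. The message format (dicts with hypothetical `author` and `text` keys) and the string tests used to spot URLs, retweets, and replies are simplifying assumptions, not the feature extraction used in the cited study.

```python
from collections import Counter


def trend_features(messages):
    """Compute a handful of per-trend features from a list of messages.

    Each message is a dict with "author" and "text" keys (assumed format).
    """
    n = len(messages)
    texts = [m["text"] for m in messages]
    authors = Counter(m["author"] for m in messages)

    def share(predicate):
        # Proportion of messages whose text satisfies the predicate.
        return sum(1 for t in texts if predicate(t)) / n

    return {
        # Content features
        "avg_words": sum(len(t.split()) for t in texts) / n,
        "prop_urls": share(lambda t: "http://" in t or "https://" in t),
        "prop_hashtags": share(lambda t: "#" in t),
        # Interaction features
        "prop_retweets": share(lambda t: t.startswith("RT @")),
        "prop_replies": share(lambda t: t.startswith("@")),
        # Participation features
        "msgs_per_author": n / len(authors),
        "prop_top_author": authors.most_common(1)[0][1] / n,
    }


sample = [
    {"author": "a", "text": "RT @b quake in Chile http://x.co"},
    {"author": "a", "text": "@c are you ok"},
    {"author": "b", "text": "#earthquake news https://y.co"},
    {"author": "c", "text": "stay safe all"},
]
for name, value in trend_features(sample).items():
    print(f"{name}: {value:.2f}")
```

Time-based and social network features need timestamps and a follower graph, which this toy message format does not carry.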

Page 21: Trend Analysis

Exogenous vs. Endogenous Trends

Content features: Exo has a higher proportion of URLs and a smaller proportion of hashtags

Interaction features: Exo has fewer retweets and a similar number of replies

Time features: Exo differs in the head period before the trend peak but exhibits similar time features in the tail period after the trend peak, compared to endogenous trends.

Social network features: Exo has fewer connections and less reciprocity

Page 22: Trend Analysis

TRANSITION
Alluvial Diagrams

Page 23: Trend Analysis

IDEA

Automatic Categorization of Trends

Photography Trend: Selfie Image
Trust Trend: Trustful Users, Trustful Tweets

Untrendy People! Users who Counteract the Trends