Ripples on the Web: Diffusion of Activity Bursts across Hyperlink Networks in Wikipedia
Brian Keegan (@bkeegan)
Yu-Ru Lin (@rhodiuslin)
David Lazer (@davidlazer)
Sunbelt XXXIII
Hamburg, Germany
May 23, 2013
Theoretical motivations
• Information seeking and sense-making
  • What kinds of information is the general population seeking following a disaster?
• Mass convergence and crisis informatics
  • What implications does rapidly emerging information have for emergency responders?
• Networks of knowledge and collaboration
  • How is this information verified and synthesized?
Twitter basically sucks
• Information seeking and sense-making
  • More noise & echo than signal, fragmented behavior & commons
• Mass convergence and crisis informatics
  • Sampling & temporal censoring
• Networks of knowledge and collaboration
  • Unverifiable, misinformation, non-cumulative
Wikipedia basically rules
• Information seeking and sense-making
  • Existing repertoires & activity around contextual information
• Mass convergence and crisis informatics
  • Fine-grained & accessible history
• Networks of knowledge and collaboration
  • Cited, debated, and cumulative account
Networks from Wikipedia data
• Markup
  • Hyperlinks: i has a link to j
• Revisions
  • Coauthorship: i shares an editor with j
• Pageview activity
  • Correlation: i’s pageviews are correlated with j’s
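The first two tie definitions above can be sketched in a few lines of Python. This is only an illustration on made-up inputs: the article names, the `links` and `editors` dictionaries, and both function names are hypothetical, and the talk does not say how its networks were actually extracted from Wikipedia markup and revision histories.

```python
from itertools import combinations

# Hypothetical toy inputs (not real Wikipedia data).
links = {"A": ["B", "C"], "B": ["C"], "C": []}          # article -> linked articles
editors = {"A": {"alice", "bob"}, "B": {"bob"}, "C": {"carol"}}  # article -> editors


def hyperlink_edges(links):
    """Directed edges (i, j): article i has a link to article j."""
    return {(i, j) for i, targets in links.items() for j in targets if i != j}


def coauthorship_edges(editors):
    """Undirected edges {i, j}: articles i and j share at least one editor."""
    return {
        frozenset((i, j))
        for i, j in combinations(sorted(editors), 2)
        if editors[i] & editors[j]
    }


print(hyperlink_edges(links))
print(coauthorship_edges(editors))
```

Hyperlinks are kept directed because "i has a link to j" is asymmetric, while sharing an editor is symmetric, so coauthorship ties are stored as unordered pairs.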
Case study
• Boston Marathon bombings
  • Two distinct dates for bursts of activity related to major developments:
    • April 15: Bombing
    • April 19: Manhunt
• New information → new articles bursting
Article dynamics – First 3 weeks
Dynamics – First 18 hours
Pageview dynamics
Pageview and editing coupling
HYPERLINK NETWORK
[Figure: 1-step hyperlink network; labeled nodes: Boston Marathon bombings; Boston Marathon; Watertown, Mass.; Boston, Mass.; Boylston St.]
[Figure: 1.5-step hyperlink network; labeled nodes: Boston Marathon bombings; Boston Marathon; Watertown, Mass.; Boston, Mass.; Boylston St.]
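The difference between the 1-step and 1.5-step views can be sketched as follows. The slides do not define the terms, so this assumes the common usage: the 1-step network keeps only ties that touch the focal article, and the 1.5-step network adds the ties among its neighbors. The node labels and function names are hypothetical, and link direction is ignored when finding neighbors.

```python
def one_step(edges, ego):
    """Edges that touch the ego directly (a star around the ego)."""
    return {(i, j) for i, j in edges if ego in (i, j)}


def one_and_a_half_step(edges, ego):
    """1-step edges plus edges among the ego's direct neighbors."""
    neighbors = {j if i == ego else i for i, j in edges if ego in (i, j)}
    nodes = neighbors | {ego}
    return {(i, j) for i, j in edges if i in nodes and j in nodes}


# Hypothetical toy edge list; "c" is two steps from the ego.
edges = {("ego", "a"), ("ego", "b"), ("a", "b"), ("b", "c")}

print(one_step(edges, "ego"))
print(one_and_a_half_step(edges, "ego"))
```

In this toy example the tie between "a" and "b" appears only in the 1.5-step view, and the tie to "c" appears in neither.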
Communities
[Figure: labeled communities: Perpetrators; MIT PD; Watertown, Mass.; Shelter in place; Holy Cross; Pressure cooker]
Burst detection
• Rolling 30-day average
• 2× SE
Largest bursts
1. Ground stop (329)
2. Boylston Street (268)
3. Google Person Finder (237)
4. Patriots’ Day (201)
5. Copley Square (171)
6. Controlled explosion (168)
7. Lenox Hotel (116)
8. Pressure cooker (83)
9. MA EMA (83)
10. BP SOU (78)
April 15
April 16
Pressure cooker
April 17
Holy Cross
April 18
MIT PD
Watertown, Mass.
Shelter in place
April 19
April 20
April 21
COAUTHORSHIP NETWORK
Coauthorship activity 4/15 – 5/1
CORRELATION NETWORKS
Temporal correlation networks
[Figure: labeled nodes: Duffel bag; New York Times; MGH]
Activity correlation network
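A correlation network of this kind can be sketched by correlating pairs of pageview series and keeping the strongly correlated pairs as ties. The slides do not state the correlation measure, the cutoff, or the time window, so Pearson correlation, the 0.8 threshold, the function names, and the toy series below are all assumptions for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlation_edges(pageviews, threshold=0.8):
    """Map each article pair (i, j) with correlation >= threshold to its r."""
    edges = {}
    for i, j in combinations(sorted(pageviews), 2):
        r = pearson(pageviews[i], pageviews[j])
        if r >= threshold:
            edges[(i, j)] = r
    return edges


# Hypothetical hourly pageviews: A and B spike together, C declines.
pageviews = {
    "A": [1, 2, 3, 4, 50],
    "B": [2, 4, 6, 8, 100],
    "C": [5, 4, 3, 2, 1],
}
print(correlation_edges(pageviews))
```

Running the same function on successive time windows of each series would give one network per window, which is one plausible reading of "temporal" here.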
DISCUSSION
Theoretical framework
• Information seeking and sense-making
  • Fine-grained traces of large-scale behavior in a complex information space
• Mass convergence and crisis informatics
  • Nearly real-time behavior captures bursts of activity related to current events
• Networks of knowledge and collaboration
  • Information seeking in knowledge network drives creation of new knowledge and relationships
Future directions
• Track diffusion of bursts across larger hyperlink network
  • Are distant bursty events responsible for a substantial fraction of editing activity?
• Synchronized and anomalous bursts of activity as narrative elements
  • Czech Republic vs. Chechnya
  • Classifying events and mobilizing resources
• Textual features predict bursts?
  • Edit distance, number of mentions, position on page, etc. convey relatedness of content
• Multilevel & longitudinal statistical model of tie formation
  • Dyadic covariates: Pageview correlation → coauthorship ties → hyperlinks
THANK YOU!
Brian Keegan
www.brianckeegan.com
@bkeegan