Internet Archives and Social Science Research - Yeungnam University
![Page 1: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/1.jpg)
BIG DATA AND SOCIAL SCIENCE THEORY: Leveraging Large-Scale Data to Discover New Patterns in Society
Monday, April 7, 2014, Cybermotions @ Korea, Yeungnam University
Matthew Weber, Rutgers University, School of Communication & Information
![Page 2: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/2.jpg)
Opportunity: The Internet Archive contains the largest single record of the history of the World Wide Web from 1995 to the present—a wealth of untapped research data.
Challenge: There is a significant lack of research-ready databases and tools available to the scholarly community.
![Page 3: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/3.jpg)
© Internet Archive 2013
![Page 4: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/4.jpg)
![Page 5: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/5.jpg)
![Page 6: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/6.jpg)
![Page 7: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/7.jpg)
![Page 8: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/8.jpg)
![Page 9: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/9.jpg)
![Page 10: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/10.jpg)
Opportunity: The ArchiveHub project aims to support the creation and dissemination of general guidelines & tools for conducting theoretically and methodologically rigorous longitudinal research using archival Web data.
![Page 11: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/11.jpg)
![Page 12: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/12.jpg)
![Page 13: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/13.jpg)
![Page 14: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/14.jpg)
| Dataset | Research Potential | Dates | Captures | Unique URLs |
| --- | --- | --- | --- | --- |
| Hurricane Katrina | Online networks and organizational resilience (Chewning, Lai and Doerfel, 2012; Perry, Taylor and Doerfel, 2003) in the wake of disasters; information dissemination | 2003 – 2012 | 1,694,236 | 663,740 |
| Superstorm Sandy | | 2003 – 2012 | 41,703,112 | 20,013,455 |
| US Senate | Study the growth of political activity in online environments (Adamic & Glance, 2005; Bruns, 2007; Chang & Park, 2012); polarization & media discourse | 109th – 112th Congresses | 26,965,770 | 8,674,397 |
| US House | | | 51,840,777 | 12,410,014 |
| Occupy Wall Street | Previous research on NGOs in the online environment (Bach & Stark, 2004; Shumate, 2003, 2012; Shumate, Fulk, & Monge, 2005); use of hyperlink data to study the formation and role of alliances between SMOs | 2010 – 2012 | 247,928,272 | 113,259,655 |
| US Media | Previous studies of news media organizations (Greer & Mensing, 2006; Weber, 2012; Weber & Monge, In Press); focus on evolutionary patterns | 2008 – 2012 | 1,315,132,555 | 539,184,823 |
![Page 15: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/15.jpg)
http://archivehub.rutgers.edu
![Page 16: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/16.jpg)
![Page 17: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/17.jpg)
Tracing the Emergence of Organizational Forms
Environment: Organizations compete for scarce resources; during periods of rapid disruption, new entrants seek “protected” niches (Weber & Monge, 2014)
Population: In digital spaces, online connections provide communicative representations of information flows (Weber & Monge, 2012)
Formation of ties (e.g. hyperlinks) can positively impact the long-term likelihood of organization survival (Weber, 2012)
Organization: Organizations adapt internally, reconfiguring team structures and developing new routines for knowledge sharing (Ellison, Gibbs & Weber, In Press; Weber & Kim, Under Review)
![Page 18: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/18.jpg)
![Page 19: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/19.jpg)
Big Data… Big Theory?
• Networks are central to social movements in that links between nodes can be influential in collective action
• Examples of nodes include participants, organizations, media and communications technologies
  – Social networks and social movements (Diani, 2003)
• The interaction between actors, and between actors and hashtags, collectively represents a networked form of organization
  – Network form of organization (Powell, 1990)
![Page 20: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/20.jpg)
H1: Over time, dyadic communication will become prevalent in an emerging networked organization.
H2: As a social movement develops as an emerging network form of organization, the organizational structure will be increasingly clustered.
![Page 21: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/21.jpg)
Data
• Triangulation of data insulates against false readings from large-scale data (see Lazer, Kennedy, King and Vespignani, 2014)
• Internet Archive:
  – 14 websites; 4,504 hyperlink dyads over a 2-month period.
• Lexis Nexis:
  – Search conducted to assess U.S. newspaper coverage of OWS from the early stages of the movement in September 2011 through Sept. 2012
  – Search OWS keywords, e.g. “Occupy Wall Street,” “Occupy Oakland”
• Twitter
  – Gnip PowerTrack
    • Search by keywords; captures a larger volume of Twitter data than other options
  – Sample includes October 17, 2011, through January 5, 2012. Initial study focused on the critical two-month period from November 1 through December 31, 2011
  – 750,816 tweets across the two-month period.
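The slide reports hyperlink dyads but not how they were derived. As a minimal sketch only, assuming link records arrive as (source URL, target URL) pairs, the Python below collapses page-level links into site-to-site dyads. The record layout, the `host` and `hyperlink_dyads` helpers, and the `.example` hostnames are all hypothetical and are not taken from the ArchiveHub tooling.

```python
from collections import Counter
from urllib.parse import urlsplit


def host(url):
    """Return the lower-cased hostname of a URL, without a leading 'www.'."""
    name = (urlsplit(url).hostname or "").lower()
    return name[4:] if name.startswith("www.") else name


def hyperlink_dyads(records):
    """Count site-to-site links from (source_url, target_url) pairs.

    Links within the same site and records with an unparseable host
    are skipped, so each key is a directed dyad between two sites.
    """
    dyads = Counter()
    for source_url, target_url in records:
        src, dst = host(source_url), host(target_url)
        if src and dst and src != dst:
            dyads[(src, dst)] += 1
    return dyads


# Hypothetical link records; real archive extracts will look different.
records = [
    ("http://www.site-a.example/about", "http://site-b.example/"),
    ("http://site-a.example/news", "http://www.site-b.example/join"),
    ("http://site-a.example/", "http://site-a.example/contact"),
]
dyads = hyperlink_dyads(records)
for (src, dst), count in dyads.items():
    print(src, "->", dst, count)
```

Collapsing to hostnames is one design choice among several; a study might instead group by registered domain or by a curated list of seed organizations.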
![Page 22: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/22.jpg)
![Page 23: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/23.jpg)
OWS News Coverage
![Page 24: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/24.jpg)
OWS on the Web
• 335 seed organizations based on records from #OccupyResearch
• Data extracted for 2011 & 2012, based on “both matching”
[Figure: Captures per Month (axis in millions, 0 to 18)]
![Page 25: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/25.jpg)
Maximal Cores (k-Coreness)
[Figure: panels for Aug. 2011 and Jan. 2012]
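The slide names k-coreness without defining it. A k-core is a maximal subgraph in which every node has at least k ties to other nodes in that subgraph, and the maximal core is the k-core with the largest k. As a rough illustration only (not necessarily the procedure behind the slide), the sketch below computes core numbers by repeatedly peeling off the lowest-degree node. Treating hyperlinks as undirected is an assumption, and the site names and edge list are hypothetical.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {node: core number} for an undirected simple graph."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-links
            adj[u].add(v)
            adj[v].add(u)

    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel the node with the fewest ties among the nodes still present.
        node = min(remaining, key=lambda n: degree[n])
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


def maximal_core(edges):
    """Return (k_max, nodes in the k_max-core); edges must be non-empty."""
    core = core_numbers(edges)
    k_max = max(core.values())
    return k_max, {node for node, c in core.items() if c == k_max}


# Hypothetical network: four mutually linked sites plus a two-site tail.
edges = [
    ("s1", "s2"), ("s1", "s3"), ("s1", "s4"),
    ("s2", "s3"), ("s2", "s4"), ("s3", "s4"),
    ("s1", "s5"), ("s5", "s6"),
]
k_max, members = maximal_core(edges)
print(k_max, sorted(members))
```

The repeated `min` scan keeps the sketch short at the cost of quadratic time; a bucket-based ordering would be the usual choice for networks with many nodes.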
![Page 26: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/26.jpg)
[Figure: Edges (axis 0 to 80,000) and Vertices (axis 60 to 180)]
![Page 27: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/27.jpg)
[Figure: Density (axis 0 to 0.08)]
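Density is the share of possible ties that are actually present. The slide does not say whether ties were treated as directed, so the sketch below supports both readings through a `directed` flag; that flag, the `density` function name, and the sample links are assumptions made for illustration.

```python
def density(edges, directed=True):
    """Observed ties divided by possible ties, ignoring self-links and repeats."""
    nodes = set()
    ties = set()
    for source, target in edges:
        nodes.update((source, target))
        if source == target:
            continue
        # Directed ties keep their orientation; undirected ties do not.
        ties.add((source, target) if directed else frozenset((source, target)))

    n = len(nodes)
    if n < 2:
        return 0.0
    possible = n * (n - 1) if directed else n * (n - 1) / 2
    return len(ties) / possible


# Hypothetical links among four sites, including one repeated link.
links = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "d"), ("a", "b")]
print(density(links))
print(density(links, directed=False))
```

With four sites there are 12 possible directed ties and 6 possible undirected ones, so the same links give a lower density under the directed reading.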
![Page 28: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/28.jpg)
[Figure: Clusters (axis 0 to 100)]
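The slide does not state how clusters were counted. One possible reading, offered purely as an assumption, is the number of connected components when link direction is ignored. The sketch below counts them with a breadth-first search over a hypothetical edge list; a community-detection method would be a different and equally plausible choice.

```python
from collections import defaultdict, deque


def components(edges):
    """Return a list of node sets, one per connected component."""
    adj = defaultdict(set)
    for source, target in edges:
        adj[source].add(target)
        adj[target].add(source)

    seen = set()
    groups = []
    for start in list(adj):
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from start.
        seen.add(start)
        group = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    group.add(nbr)
                    queue.append(nbr)
        groups.append(group)
    return groups


# Hypothetical links forming two separate groups of sites.
sample = [("a", "b"), ("b", "c"), ("d", "e")]
groups = components(sample)
print(len(groups))
```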
![Page 29: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/29.jpg)
![Page 30: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/30.jpg)
Implications
• Big Data:
  – Guiding data collection with theoretically grounded questions avoids the “needle-in-the-haystack” problem
  – Leverage advances in computing with existing theories to develop robust studies of social science phenomena
• Big Theory:
  – Expanding prior theories on networked organizational forms and form emergence (evolutionary)
  – Building toward a macro theory of organizational form emergence based on resource availability and networks
![Page 31: Internet Archives and Social Science Research - Yeungnam University](https://reader034.fdocuments.us/reader034/viewer/2022051818/54c1072c4a79595d208b4584/html5/thumbnails/31.jpg)
• Want data?
  – Email me! [email protected]
  – ArchiveHub: http://archivehub.rutgers.edu
• Collaborators
  – Kris Carpenter & Vinay Goel, Internet Archive
  – David Lazer, Northeastern University
Research supported by NSF Award #1244727 and the NetSCI Lab @ Rutgers