Chances and Challenges of Studying Social Media Data
Dr. Katrin Weller
GESIS – Leibniz-Institute for the Social Sciences
Data Archive for the Social Sciences / Computational Social Science
Cologne, Germany
Digital Studies Fellow at John W. Kluge Center
Library of Congress
Washington D.C.
E-Mail: [email protected] ● Twitter: @kwelle ● Web: www.katrinweller.net
My Background
• PhD in Information Science (until 2012, University of Düsseldorf)
• Interests: Web Science, Social Media (focus on Twitter), knowledge representation + Semantic Web, informetrics + altmetrics, scholarly communication
• Since 2013: GESIS, Social Web Data: new data types for social science research; research methods and data archiving.
• Jan-May 2015: Digital Studies Fellowship at the Library of Congress
Recent and Current Work
• Co-editor of "Twitter & Society" (Peter Lang, 2014).
• With Katharina Kinder-Kurlanda: "The hidden data of social media research"
• #FAIL! Things that didn't work out in social media research – and what we can learn from them (#fail2015a at Web Science Conference, Oxford, #fail2015b at Internet Research 16, Phoenix). https://failworkshops.wordpress.com
• Pilot project for archiving social media datasets in an election study (http://arxiv.org/abs/1312.4476v2).
Social media research 2000-2013
[Chart: No. of publications per year (Scopus), 2000-2013]
Scopus search, conducted March 2014: (TITLE-ABS-KEY("social media") OR TITLE-ABS-KEY("social web") OR TITLE-ABS-KEY("social software") OR TITLE-ABS-KEY("web 2.0")) AND PUBYEAR > 1999
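To make the search reproducible, the query quoted above can be assembled from a list of terms. The following is only a minimal sketch: the helper name `build_scopus_query` and its parameters are hypothetical, and it only builds the search string; it does not contact Scopus.

```python
# Hypothetical helper (not part of any Scopus client): it only rebuilds
# the advanced-search string quoted on the slide.
def build_scopus_query(terms, first_year):
    """Match any term in title, abstract, or keywords, from first_year onwards."""
    clauses = [f'TITLE-ABS-KEY("{term}")' for term in terms]
    # "PUBYEAR > 1999" in the original query means "published in 2000 or later".
    return f"({' OR '.join(clauses)}) AND PUBYEAR > {first_year - 1}"


TERMS = ["social media", "social web", "social software", "web 2.0"]
query = build_scopus_query(TERMS, first_year=2000)
print(query)
```

Keeping the term list separate from the query syntax makes it easier to document exactly which search produced a given publication count.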
Scopus: 2000-2013 by country
[Bar chart: number of publications by country (axis 0-7000). Countries in chart order: United States, United Kingdom, Germany, Australia, China, Spain, Canada, Italy, France, Taiwan, Netherlands, South Korea, Finland, Austria, Japan, Greece, India, Singapore, Switzerland, Hong Kong, Ireland]
Scopus: 2000-2013 by subject area
[Pie chart. Data labels in order of appearance: 10650 (36%); 5542 (19%); 2384; 2288; 2151; 1535; 773; 772; 65]
[Legend in order of appearance: Computer Science; Social Sciences; Engineering; Medicine; Business, Management and Accounting; Mathematics; Arts and Humanities; Decision Sciences; Psychology; Nursing; Economics, Econometrics and Finance; Biochemistry, Genetics and Molecular Biology; Health Professions; Environmental Science; Earth and Planetary Sciences; Agricultural and Biological Sciences; Pharmacology, Toxicology and Pharmaceutics; Physics and Astronomy; Materials Science; Multidisciplinary; Neuroscience; Immunology and Microbiology; Chemical Engineering; Veterinary; Dentistry; Chemistry; Energy]
Challenge
• Interdisciplinarity!
• "Social media research" is not a coherent research field.
• Influences from lots of different disciplines.
• Some disciplines still isolated, not all equally advanced in technical tasks.
• Challenge of keeping track of what is going on – across disciplines.
Example: Twitter research in social sciences
Weller, K. (2014). What do we get from Twitter – and What Not? A Close Look at Twitter Research in the Social Sciences. Knowledge Organization 41(3), 238-248.
Different methods even in social science Twitter research
Weller, K. (2014). What do we get from Twitter – and What Not? A Close Look at Twitter Research in the Social Sciences. Knowledge Organization 41(3), 238-248.
Example
[Chart: publications on "Twitter and elections" per year, 2008-2013 (Scopus and Web of Science)]
Weller, K. (2014). Twitter und Wahlen: Zwischen 140 Zeichen und Milliarden von Tweets. In: R. Reichert (Ed.), Big Data: Analysen zum digitalen Wandel von Wissen, Macht und Ökonomie (pp. 239-257). Bielefeld: transcript.
Year of election | Name of election | Country/region | No. of papers (2013) | Date of election
2008 | 40th Canadian General Election | Canada | 1 | 14.10.2008
2009 | European Parliament election, 2009 | Europe | 1 | 07.06.2009
2009 | German federal election, 2009 | Germany | 2 | 27.09.2009
2010 | 2010 UK general election | United Kingdom | 4 | 06.05.2010
2010 | South Korean local elections, 2010 | South Korea | 1 | 02.06.2010
2010 | Dutch general election, 2010 | Netherlands | 2 | 09.06.2010
2010 | Australian federal election, 2010 | Australia | 1 | 21.08.2010
2010 | Swedish general election, 2010 | Sweden | 1 | 19.09.2010
2010 | Midterm elections / United States House of Representatives elections, 2010 | USA | 4 | 02.11.2010
2010 | Gubernatorial elections: Georgia | USA | 1 | 02.11.2010
2010 | Gubernatorial elections: Ohio | USA | 1 | 02.11.2010
2010 | Gubernatorial elections: Rhode Island | USA | 1 | 02.11.2010
2010 | Gubernatorial elections: Vermont | USA | 1 | 02.11.2010
2010 | 2010 superintendent elections | South Korea | 1 | 17.12.2010
2011 | Baden-Württemberg state election, 2011 | Germany | 1 | 27.03.2011
2011 | Rhineland-Palatinate state election, 2011 | Germany | 1 | 27.03.2011
2011 | Scottish parliament election 2011 | Scotland | 1 | 05.05.2011
2011 | Singapore's 16th parliamentary General Election | Singapore | 1 | 07.05.2011
2011 | Norwegian local elections, 2011 | Norway | 2 | 12.09.2011
2011 | 2011 Danish parliamentary election | Denmark | 2 | 15.09.2011
2011 | Berlin state election, 2011 | Germany | 2 | 18.09.2011
2011 | Gubernatorial elections: West Virginia | USA | 1 | 04.10.2011
2011 | Gubernatorial elections: Louisiana | USA | 1 | 22.10.2011
2011 | Swiss federal election, 2011 | Switzerland | 1 | 23.10.2011
2011 | 2011 Seoul mayoral elections | South Korea | 1 | 26.10.2011
2011 | Gubernatorial elections: Kentucky | USA | 1 | 08.11.2011
2011 | Gubernatorial elections: Mississippi | USA | 1 | 08.11.2011
2011 | Spanish national election 2011 | Spain | 1 | 20.11.2011
2012 | Queensland State election | Australia | 1 | 24.03.2012
2012 | South Korean legislative election, 2012 | South Korea | 1 | 11.04.2012
2012 | French presidential election, 2012 | France | 2 | 22.04.2012
2012 | Mexican general election, 2012 | Mexico | 1 | 01.07.2012
2012 | United States presidential election, 2012 / United States House of Representatives elections, 2012 | USA | 17 | 06.11.2012
2012 | South Korean presidential election, 2012 | South Korea | 2 | 19.12.2012
2013 | Ecuadorian general election, 2013 | Ecuador | 1 | 17.02.2013
2013 | Venezuelan presidential election, 2013 | Venezuela | 1 | 14.04.2013
2013 | Paraguayan general election, 2013 | Paraguay | 1 | 21.04.2013
Big DATA? 2013: Twitter and election
No. of tweets | No. of publications (2013)
0-500 | 3
501-1,000 | 4
1,001-5,000 | 1
5,001-10,000 | 1
10,001-50,000 | 7
50,001-100,000 | 4
100,001-500,000 | 5
500,001-1,000,000 | 3
1,000,001-5,000,000 | 3
More than 5,000,000 | 3
More than 100,000,000 | 1
More than 1,000,000,000 | 1
No/insufficient information | 13
Weller, K. (2014). Twitter und Wahlen: Zwischen 140 Zeichen und Milliarden von Tweets. In: R. Reichert (Ed.), Big Data: Analysen zum digitalen Wandel von Wissen, Macht und Ökonomie (pp. 239-257). Bielefeld: transcript.
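The table of dataset sizes can be tallied to see how many of the 2013 publications do not report the size of their dataset. This is a small sketch that copies the counts from the table and assumes each publication is counted in exactly one row; the function name is hypothetical.

```python
# Counts copied from the table of dataset sizes: (size bin, publications in 2013).
# Assumption: every publication appears in exactly one row.
TWEET_BINS = [
    ("0-500", 3),
    ("501-1,000", 4),
    ("1,001-5,000", 1),
    ("5,001-10,000", 1),
    ("10,001-50,000", 7),
    ("50,001-100,000", 4),
    ("100,001-500,000", 5),
    ("500,001-1,000,000", 3),
    ("1,000,001-5,000,000", 3),
    ("More than 5,000,000", 3),
    ("More than 100,000,000", 1),
    ("More than 1,000,000,000", 1),
    ("No/insufficient information", 13),
]


def share_without_size_info(bins):
    """Return (total publications, publications without size info, their share)."""
    total = sum(count for _, count in bins)
    missing = sum(count for label, count in bins if label.startswith("No/"))
    return total, missing, missing / total


total, missing, share = share_without_size_info(TWEET_BINS)
print(f"{missing} of {total} publications ({share:.0%}) give no usable dataset size")
```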
Comparability: Twitter and elections
Data collection methods
Weller, K. (2014). Twitter und Wahlen: Zwischen 140 Zeichen und Milliarden von Tweets. In: R. Reichert (Ed.), Big Data: Analysen zum digitalen Wandel von Wissen, Macht und Ökonomie (pp. 239-257). Bielefeld: transcript.
Data source | Number
No information | 11
Collected manually from Twitter website (copy-paste / screenshot) | 6
Twitter API (no further information) | 8
Twitter Search API | 3
Twitter Streaming API | 1
Twitter REST API | 1
Twitter API user timeline | 1
Own program for accessing Twitter APIs | 4
Twitter Gardenhose | 1
Official reseller (Gnip, DataSift) | 3
YourTwapperKeeper | 3
Other tools (e.g. Topsy) | 6
Received from colleagues | 1
Comparability: Twitter and elections
Data collection periods
Weller, K. (2014). Twitter und Wahlen: Zwischen 140 Zeichen und Milliarden von Tweets. In: R. Reichert (Ed.), Big Data: Analysen zum digitalen Wandel von Wissen, Macht und Ökonomie (pp. 239-257). Bielefeld: transcript.
Period | Number of publications (2013)
0-10 hours | 1
1-2 days | 6
3-7 days | 3
8-14 days | 5
2-4 weeks | 7
1-2 months | 13
2-6 months | 5
7-12 months | 3
More than 12 months | 0
No/insufficient information | 6
What is being studied?
[Line chart: number of publications per year (axis 0-600), 2001-2013. Series labels: YouTube, Blogs, Wikis, Foursquare, MySpace]
Number of publications per year that mention the respective social media platform's name in their title. Scopus title search. For details: http://kwelle.wordpress.com/2014/04/07/bibliometric-analysis-of-social-media-research/
Challenges
• Quickly changing landscape of social media platforms
• Twitter as a model organism of social media research?
Scopus: 2000-2013 popular keywords
Social networks (897), Social network (657)
User interfaces (1,007)
Social networking (online) (2,291)
Facebook (847)
Knowledge management (860)
Web services (869), Information systems (810)
Twitter (667)
Semantics (765), Semantic Web (669)
Communication (650)
Information technology (639)
E-learning (623), Students (579), Education (520), Teaching (504)
Scopus search, conducted March 2014: (TITLE-ABS-KEY("social media") OR TITLE-ABS-KEY("social web") OR TITLE-ABS-KEY("social software") OR TITLE-ABS-KEY("web 2.0")) AND PUBYEAR > 1999
Social Media Research: Topics
• Political communication / elections
• Activism
• Popular culture, memes
• Brand communication, marketing
• Journalism (incl. agenda setting, citizen journalism, TV backchannel)
• Crisis communication, disaster response
• Scholarly communication
• Language
• And many more
• Weller, K., & Kinder-Kurlanda, K. E. (2014). "I love thinking about ethics!" Perspectives on ethics in social media research. Internet Research (IR15), Daegu, South Korea, 22-24 October 2014. Paper to be published in Selected Papers of Internet Research.
• Kinder-Kurlanda, K. E., & Weller, K. (2014). "I always feel it must be great to be a Hacker!": The role of interdisciplinary work in social media research. In Proceedings of the 2014 ACM Web Science Conference (WebSci'14), June 23-26, 2014, Bloomington, IN, USA, pp. 91-98. doi:http://dx.doi.org/10.1145/2615569.2615685.
• Weller, K., & Kinder-Kurlanda, K. (in press). Uncovering the Challenges in Collection, Sharing and Documentation: The Hidden Data of Social Media Research? To appear in Workshop on Standards and Practices in Large-Scale Social Media Research. ICWSM, Oxford, May 2015. http://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/viewFile/10657/10552.
CHANCES
• Researchers value social media as a new type of data.
• Previously "ephemeral data" become visible
• Immediate – quick reaction to events
• Structured
• "Natural" data
What I find really interesting is that structure becomes manifest in internet communication. So it’s the first time in history actually that we can, that social structures between people become manifest within a technology. (...) They become visible, they become crawlable, they become analyzable.
Some of the CHALLENGES
Preliminary results, more detailed analysis to follow.
- Interdisciplinarity
- Ethics
- Standards
- Data access & infrastructure
- …
Unregulated and developing field
• Researchers showed a high awareness of the unregulated and developing character of social media research methods.
But, I think that (…) in like a couple of years, maybe five – it depends a lot, because the subject of the research is changing every day, (…) but I think that we’re going to have, (…) more or less shared qualitative approaches with a lot of good practices.
Data Sharing
But you can’t make your data available for others to look at, which means both your study can’t really be replicated and it can’t be tested for review. But also it just means your data can’t be made available for other people to say, Ah you have done this with it, I’ll see what I can do with it, (…) There is no open data.
Data Sharing
“I think probably a couple times we’ve asked around if anyone else happened to have a particular dataset. […] but not so much, because they probably have tracked in a different data format, and then merging the two together actually becomes quite difficult as well.”
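The merging problem described in this quote can be illustrated with a small sketch: two teams store the same tweets under different field names, and each dataset is first mapped onto a shared schema before the two are combined. All field names, mappings and sample records below are illustrative assumptions, not a description of any particular tool or dataset.

```python
# Hypothetical field mappings: source key -> shared schema key.
FIELD_MAP_A = {"id_str": "tweet_id", "text": "text", "created_at": "created_at"}
FIELD_MAP_B = {"tweetId": "tweet_id", "body": "text", "postedTime": "created_at"}


def normalize(records, field_map):
    """Rename each record's keys to the shared schema, dropping unmapped fields."""
    return [
        {new: rec[old] for old, new in field_map.items() if old in rec}
        for rec in records
    ]


def merge_by_id(*datasets):
    """Combine normalized datasets, keeping the first record seen per tweet_id."""
    merged = {}
    for dataset in datasets:
        for rec in dataset:
            # IDs are compared as strings because formats differ between sources.
            merged.setdefault(str(rec["tweet_id"]), rec)
    return list(merged.values())


# Made-up sample records for illustration only.
team_a = [{"id_str": "1", "text": "Vote today", "created_at": "2013-09-22"}]
team_b = [
    {"tweetId": 1, "body": "Vote today", "postedTime": "2013-09-22"},
    {"tweetId": 2, "body": "Polls closed", "postedTime": "2013-09-22"},
]

combined = merge_by_id(normalize(team_a, FIELD_MAP_A), normalize(team_b, FIELD_MAP_B))
print(len(combined), "unique tweets after merging")
```

Even in this toy case the mapping has to be written by hand for every source, which is one reason shared documentation of data formats matters.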
Ethics / privacy
“I will not quote tweets.”
“if somebody plays a really important role in a particular event then maybe they deserve the credit of being accredited as well.”
Representativeness
Blank, G. (2014). Who uses Twitter? Representativeness of Twitter Users. Presentation at General Online Research GOR 14. Retrieved from: http://conftool.gor.de/conftool14/index.php?page=downloadPaper&filename=Blank-Who_uses_Twitter_Representativeness-119.pptx&form_id=119&form_version=final
[Bar chart] Figure 6: Political Activities of Twitter Users
Y-axis: % who have done more than never (0-100)
Categories (political activities): Interest in politics; Send political message; Contact MP online; Re-post political news; Political comment on SNS; Find political facts; Sign online petition
Series: Twitter user, Non-user
OxIS current users: 2013, N=1,613
Data Quality
• E.g. comparison of Twitter API and Reseller data.
Morstatter, F., Pfeffer, J., Liu, H., & Carley, K. M. (2013). Is the Sample Good Enough? Comparing Data from Twitter’s Streaming API with Twitter’s Firehose. Retrieved from http://arxiv.org/abs/1306.5204
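One simple way to compare a sampled dataset with a more complete one is to check whether the most frequent hashtags agree. The sketch below uses a plain set overlap (Jaccard index) on made-up data; it is an illustration under these assumptions and not the method used in the cited paper. The record layout with a `hashtags` list is also an assumption.

```python
from collections import Counter


def top_hashtags(tweets, n):
    """Return the n most frequent hashtags (case-insensitive) in a list of tweets."""
    counts = Counter(tag.lower() for tweet in tweets for tag in tweet["hashtags"])
    return [tag for tag, _ in counts.most_common(n)]


def jaccard_overlap(a, b):
    """Share of items that two lists have in common, ignoring order."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


# Made-up example: "full" stands in for complete data, "sample" for an API sample.
full = [
    {"hashtags": ["election"]},
    {"hashtags": ["election", "vote"]},
    {"hashtags": ["vote"]},
    {"hashtags": ["debate"]},
    {"hashtags": ["election"]},
]
sample = [full[0], full[3], full[4]]

overlap = jaccard_overlap(top_hashtags(full, 2), top_hashtags(sample, 2))
print(f"Top-2 hashtag overlap between sample and full data: {overlap:.2f}")
```

A low overlap would suggest that conclusions drawn from the sample may not carry over to the complete stream.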
Inequality in data access possibilities
boyd, d., and Crawford, K. 2012. Critical questions for Big Data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication & Society 15(5):662–679. DOI: 10.1080/1369118X.2012.678878
• Data haves and data have nots
– Financial reasons
– Connections to companies
– Different skills
– …
Top 5 Challenges in Twitter research
• Representativeness and validity
• Cross-platform studies
• Comparisons (e.g. different countries, points in time)
• Multi-method approaches
• Context and meaning
Bruns, Axel, and Katrin Weller. 2014. "Twitter data analytics – or: the pleasures and perils of studying Twitter (guest editorial for special issue)". Aslib Journal of Information Management 66 (3): 246-249.
Summary
Three sources of challenges in social media research:
• The variety of user interactions that count as social media and their ever-changing nature, which makes social media a moving target.
• The diversity of the research community, which challenges knowledge transfer and development of standards.
• The dependency on commercial companies to open up access to their data.
Researchers themselves only have limited means to change these sources of challenges.
Weller, K. (2015). Accepting the challenges of social media research. Online Information Review 39(3).
Summary
Currently addressed challenges
• Research infrastructure, including data collection and sharing facilities, training in new methods and technologies.
• The call for more thoughtfulness in research ethics.
• Critical considerations on big data and data quality, including reflection on the power of algorithms and misrepresentations through big data approaches.
• Requests for broader scopes by facilitating multi-method and multi-platform studies, as well as longitudinal studies.
Outlook
• Long-term preservation of social media, i.e. archiving of data as well as of social media platforms' look and feel.
• Documentation of applied research methods, which should enable comparative studies across the single use cases, quality control and verification of results.
• Assessing social media users' expectations on privacy in order to respond to them through ethical standards.