Markus Strohmaier
GESIS – Leibniz Institute for the Social Sciences & U. of Koblenz

Computational Social Science and microposts - The good, the bad and the ugly

#Microposts2014 at WWW’2014, Seoul, Korea

#Microp@WWWSeoul, 

www.markusstrohmaier.info

Found Social Data

erosion accretion

References: Webb, E. J., et al. (1966). Unobtrusive measures: Nonreactive research in the social sciences.

Everybody lies to her doctor, and the Hawthorne effect

Computational Social Science

CSSSA: http://computationalsocialscience.org/

Computational Social Science: “The science that investigates social phenomena through the medium of computing and algorithmic data processing.” [adapted from CSSSA]

• Harvard iQS
• Stanford IRiSS
• CMU CASOS
• ESRC COSMOS
• Web Observatories
• …

Computational Social Science: Example

Stanley Milgram (1967)

• Social scientist
• Theory: a small world
• 6 degrees of separation

Jure Leskovec (2008)

• Computer scientist
• (Found) data: 240 million users
• 7 degrees of separation
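One common way to formalize "degrees of separation" is the average shortest-path length between pairs of people in a social graph. The following is a minimal sketch of that measure using breadth-first search; the toy graph, the names in it, and the function names are invented for illustration, and this is not the procedure used by Milgram or Leskovec.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return the hop count from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def average_separation(graph):
    """Mean shortest-path length over all connected ordered pairs of nodes."""
    total, pairs = 0, 0
    for source in graph:
        for target, hops in bfs_distances(graph, source).items():
            if target != source:
                total += hops
                pairs += 1
    return total / pairs if pairs else float("nan")


# Hypothetical toy graph: a chain of four acquaintances (undirected).
toy_graph = {
    "ann": ["bob"],
    "bob": ["ann", "cat"],
    "cat": ["bob", "dan"],
    "dan": ["cat"],
}

print(f"average separation: {average_separation(toy_graph):.2f}")
```

Running one breadth-first search per node is fine for a toy graph; on a network with hundreds of millions of users one would more plausibly sample source nodes or use approximation techniques.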

What social media platforms are we focusing on?

Weller, K. (2014). Bibliometric analysis of social media research: Publication output for different social media platforms. Blog post, 07.04.2014. Retrieved from: http://kwelle.wordpress.com/2014/04/07/bibliometric-analysis-of-social-media-research/

Blogs
Microposts

What disciplines are interested in microposts?

Zimmer, Michael / Proferes, Nicholas. (In press). "A topology of Twitter research: Disciplines, methods, and ethics". To appear in Aslib Journal of Information Management 66(3), special issue on Twitter data analysis.

THE GOOD
Computational Social Science and Microposts

Observing political movements and conversations

First day of the protests

Days on which the internet was shut down

Based on an analysis of about 100 million tweets on Egypt

Markus Strohmaier, work at (XEROX) PARC in collaboration with Lichan Hong (then at PARC, now at Google)

Analysis of political conversations on Twitter (Egyptian revolution 2011)

Assessing online conversational practices of political parties

Lada Adamic and Natalie Glance. The political blogosphere and the 2004 US election: divided they blog. Proceedings of the 3rd international workshop on Link discovery. ACM, 2005.

Assessing online conversational practices of political parties on Twitter

During the German National Election 2013

Haiko Lietz, Claudia Wagner, Arnim Bleier, and Markus Strohmaier, When Politicians Talk: Assessing Online Conversational Practices of Political Parties on Twitter, The International AAAI Conference on Weblogs and Social Media (ICWSM2014), Ann Arbor, MI, US, 2014. 

More examples

Predicting…
• personality from Twitter (Golbeck et al. 2011)
• depression via social media (De Choudhury et al. 2013)
• elections with Twitter (Tumasjan et al. 2010)
• crime using Twitter (Gerber 2014)
• stock market indicators (Zhang et al. 2011)
• flu trends using Twitter data (Achrekar et al. 2011)
• box-office revenues (Asur & Huberman 2010)

THE BAD
Computational Social Science and Microposts

You cannot predict elections with Twitter

Party     Election results (%)   % of tweets
CDU       28.4                   18.6
CSU        6.8                    3.0
SPD       24.0                   14.7
FDP       15.2                   11.2
Linke     12.4                    8.3
Grüne     11.1                    9.3
Piraten    2.1                   34.8

Jungherr, A., Jürgens, P., and Schoen, H. 2011. Why the Pirate Party Won the German Election of 2009 or The Trouble With Predictions: A Response to Tumasjan, A., Sprenger, T. O., Sander, P. G., & Welpe, I. M. “Predicting Elections With Twitter: What 140 Characters Reveal About Political Sentiment”. In Social Science Computer Review.

Daniel Gayo-Avello: No, You Cannot Predict Elections with Twitter. IEEE Internet Computing 16(6): 91-94 (2012)
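To make the mismatch concrete, the election and tweet shares listed above can be compared directly. The sketch below uses only the percentages from this slide and computes the mean absolute error between the two columns plus the "winner" under each measure; the variable and function names are illustrative, and this is not the evaluation procedure of any of the cited papers.

```python
# Percentages copied from the slide (election results vs. share of tweets).
election_share = {
    "CDU": 28.4, "CSU": 6.8, "SPD": 24.0, "FDP": 15.2,
    "Linke": 12.4, "Grüne": 11.1, "Piraten": 2.1,
}
tweet_share = {
    "CDU": 18.6, "CSU": 3.0, "SPD": 14.7, "FDP": 11.2,
    "Linke": 8.3, "Grüne": 9.3, "Piraten": 34.8,
}


def mean_absolute_error(actual, predicted):
    """Average absolute difference in percentage points across parties."""
    return sum(abs(actual[p] - predicted[p]) for p in actual) / len(actual)


mae = mean_absolute_error(election_share, tweet_share)
top_by_votes = max(election_share, key=election_share.get)
top_by_tweets = max(tweet_share, key=tweet_share.get)

print(f"mean absolute error: {mae:.1f} percentage points")
print(f"largest vote share:  {top_by_votes}")
print(f"largest tweet share: {top_by_tweets}")
```

With these figures the party with the largest tweet share is not the party with the largest vote share, which is the point of the slide.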

You cannot identify users' perceived expertise based on tweets

Claudia Wagner, Vera Liao, Peter Pirolli, Les Nelson and Markus Strohmaier, It's not in their tweets: Modeling topical expertise of Twitter users, ASE/IEEE International Conference on Social Computing (SocialCom2012), Amsterdam, The Netherlands, 2012.

What is Peer's expertise?

It's not in his tweets

# topics

You cannot assume users behave consistently over time

Haiko Lietz, Claudia Wagner, Arnim Bleier, and Markus Strohmaier, When Politicians Talk: Assessing Online Conversational Practices of Political Parties on Twitter, The International AAAI Conference on Weblogs and Social Media (ICWSM2014), Ann Arbor, MI, US, 2014.

During the German National Election 2013

You cannot assume users behave consistently over time

votes

P. Singer, F. Flöck, C. Meinhart, E. Zeitfogel and M. Strohmaier. Evolution of Reddit: From the Front Page of the Internet to a self‐referential community? In 23rd International World Wide Web Conference (WWW2014), Web‐Science Track, Seoul, Korea, 2014.

The Frontpage of the Internet

A self-referential community

Images

Video

Text

self.reddit

THE UGLY
Computational Social Science and Microposts

What are the reasons for some of the “bad”?

Demographic Biases

Mislove, Alan, et al. "Understanding the Demographics of Twitter Users." ICWSM (2011).

Social Bots

C. Wagner, S. Mitter, C. Körner and M. Strohmaier. When social bots attack: Modeling susceptibility of users in online social networks. In Proceedings of the 2nd Workshop on Making Sense of Microposts (MSM'2012), held in conjunction with the 21st World Wide Web Conference (WWW'2012), Lyon, France, 2012.

Impact of social bot attacks

S. Mitter, C. Wagner, and M. Strohmaier. Understanding the impact of socialbot attacks in online social networks. In ACM Web Science 2013, May 2-4, Paris, France, 2013.

URL Shortener spam

F. Klien and M. Strohmaier. Short links under attack: Geographical analysis of spam in a url shortener network. In Proceedings of the 23rd Conference on Hypertext and Social Media (HT2012). ACM, 2012.

data from qr.cx

URL Shortener spam

https://www.youtube.com/watch?v=06Mhn0L23Tk

F. Klien and M. Strohmaier. Short links under attack: Geographical analysis of spam in a url shortener network. In Proceedings of the 23rd Conference on Hypertext and Social Media (HT2012). ACM, 2012.

Context Collapse

http://speakingofresearch.com/2014/02/27/fact-into-fiction-why-context-matters-with-animal-images/

Invisible Engagement

Zeynep Tufekci, Big Questions for Social Media Big Data: Representativeness, Validity and Other Methodological Pitfalls, The International AAAI Conference on Weblogs and Social Media (ICWSM2014), Ann Arbor, MI, US, 2014.

Challenges
Huberty (2014)

• N != all: we have both N < all and N > all

• All (today) != All (tomorrow): user populations change

• Online behavior != Offline behavior: multi-faceted identities

• Behavior of all (today) != Behavior of all (tomorrow): behavior changes and evolves

Mark Huberty (2014). I expected a Model T, but instead I got a loom: Awaiting the second big data revolution.
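The point "All (today) != All (tomorrow)" can be checked on any data set that records which accounts were active in which period. A minimal sketch, assuming two hypothetical sets of active user IDs, measures how much the populations overlap using the Jaccard index; the IDs and the function name are invented for illustration and do not come from Huberty (2014).

```python
def jaccard(a, b):
    """Jaccard index of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical active-user sets for two observation periods.
active_today = {"u1", "u2", "u3", "u4"}
active_tomorrow = {"u3", "u4", "u5", "u6"}

overlap = jaccard(active_today, active_tomorrow)
print(f"population overlap: {overlap:.2f}")
```

A value near 1 would suggest a stable user population, while a low value signals that conclusions drawn from one period may not carry over to the next.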

What makes matters even worse…

What about:

?…?

Zeynep Tufekci, Big Questions for Social Media Big Data: Representativeness, Validity and Other Methodological Pitfalls, The International AAAI Conference on Weblogs and Social Media (ICWSM2014), Ann Arbor, MI, US, 2014. 

Opportunities
Sometimes though,

• N = all (or almost all)

• All (today) is similar enough to All (tomorrow)

• Online behavior approximates offline behavior

• Behavior of all (today) predicts Behavior of all (tomorrow)

Where do we go from here?
• An opportunity for computational social science
– Hypothesis exploration & validation
– Triangulation (panels, statistics about societies, etc.)
– Open industrial and governmental data
– Anonymization, reproducibility and data archiving
– Living labs and mass experimentation
– …?

Thank you!

Markus Strohmaier