computational social media
lecture 03: tweeting
part 1
daniel gatica-perez
20.03.2015
this lecture
discussion about project
reading session 3
a human-centric view of twitter
0. intro
1. twitter users & uses
2. twitter-specific phenomena
3. large-scale human behavior through twitter
4. real-world events and twitter
announcements
- easter holiday: no lecture on
  - 03.04.2015
  - 10.04.2015
- after break, next lecture: 17.04.2015
projects
your project
- ideas
- resources
- evaluation
- schedule
defining your project
- something that you can do part-time in 8-10 weeks
- you can work in pairs or individually
- research question
  - model or computational task
  - survey-based study
  - explore a new idea, with potential of resulting in a novel finding
- data options
  - use ready-to-use publicly available data
  - collect raw data using public APIs
  - collect survey data from users
  - collect manual annotations (by yourself or via crowdsourcing)
examples of projects
- develop a Facebook app to collect data about X
- use Facebook user data to classify X
- use off-the-shelf NLP modules to extract topics/sentiment of a data set of YouTube video blogs or tweets to classify X
- design a crowdsourcing experiment on Amazon Mechanical Turk to label selfies according to big-five and narcissism
- design a crowdsourcing experiment to discover main categories of popular Vine videos
- use geo-localized tweets or Foursquare or Yelp check-ins to segment a city into “neighborhoods” according to level of activity
- study connections between context (e.g. day of week, season, weather, city ID) and activity using tweets
- visualize social media data in novel forms
- program a mobile social app
some resources
- data examples (out of many possible)
  - ICWSM data repository (biased to Twitter but other things too)
    http://www.icwsm.org/2014/datasets/datasets/
  - myPersonality project data
    http://mypersonality.org/wiki/doku.php?id=download_databases
  - Yelp data sets
    https://www.yelp.com/academic_dataset
    http://www.yelp.com/dataset_challenge
  - KDnuggets (most are not social data)
    http://www.kdnuggets.com/datasets/index.html
  - YouTube vlog transcriptions (my group)
  - geo-localized tweets (my group)
  - Foursquare check-in data (my group)
project evaluation
- project presentation
  - 20-minute presentation/demo
- project report
  - ACM conference paper format for short paper (4-5 pages)
  - template available at:
    http://www.acm.org/sigs/publications/proceedings-templates
  - structure: abstract, introduction, data, method, results, discussion, references
  - perhaps a conference paper can come out of it
schedule
- define project
  - final proposal by friday 17.04.2015 (better if earlier)
  - one-page max (in pdf)
    - title
    - goals
    - approach
    - how you will measure success
- final presentation and report
  - tentative date: friday 12.06.2015
twitter basic statistics
https://about.twitter.com/company, accessed March 2014
mission
“to give everyone the power to create and share ideas and
information instantly, without barriers”
twitter usage
241 million monthly active users
500 million tweets are sent per day
76% of active users are on mobile
77% of accounts are outside the U.S.A.
supports 35+ languages
Vine has more than 40 million users
2700 employees
what is twitter?
“an echo chamber of random chatter”
“the SMS of the internet”
“140 characters: somewhere between
an SMS (with larger audience)
an email (but less formal)
a blog (but less cumbersome)”
“an adrenalized Facebook where friends communicate in short bursts”
“a multipurpose tool upon which apps can be built”
“a turn-on button connected to every website and every social media site”
J. van Dijck The culture of connectivity, Oxford University Press, 2013
https://twitter.com/jack/status/20
https://about.twitter.com/milestones
https://2012.twitter.com/en/golden-tweets.html
most retweeted in 2012…
most retweeted ever
source: The Evolution of Languages on Twitter, Brian Lehman, 10.03.2014, http://blog.gnip.com/twitter-language-visualization/
language chosen by users in profile
before twitter…
古池や蛙飛び込む水の音
ふるいけやかわずとびこむみずのおと
old pond . . .
a frog leaps in
water’s sound
Bashō (17th century)
http://en.wikipedia.org/wiki/Haiku
The Dinosaur
On waking, the dinosaur was still there.
Augusto Monterroso (20th century)
http://es.wikipedia.org/wiki/Microrrelato
http://www.fi.edu/learn/case-files/fermi/full/telegram.allen.front.jpg
credit: miki yoshihito (cc): http://www.flickr.com/photos/mujitra/10511181393
credit: oregon dot (cc): http://www.flickr.com/photos/oregondot/10442457403
credit: mark hillary (cc): http://www.flickr.com/photos/markhillary/5412943529/
what is twitter made of?
https://about.twitter.com/press/brand-assets
https://about.twitter.com/milestones
J. van Dijck The culture of connectivity, Oxford University Press, 2013
follow (2006)
users subscribe to other users' tweets
hashtags # (2007, official 2009)
words articulating a topic or event
allow for search and clustering
retweet RT (2007, official 2009)
repost tweets towards one's followers
enables trends by retweeting
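The "search and clustering" role of hashtags can be illustrated with a minimal sketch. It assumes a simplified hashtag pattern (a `#` followed by word characters), which is not Twitter's actual tokenization, and the example tweets and the helper name `group_by_hashtag` are invented for illustration.

```python
import re
from collections import defaultdict

# Simplified pattern (an assumption): '#' followed by letters, digits, or underscores.
HASHTAG_RE = re.compile(r"#(\w+)")

def group_by_hashtag(tweets):
    """Map each lowercased hashtag to the list of tweets that contain it."""
    groups = defaultdict(list)
    for text in tweets:
        # A set avoids listing the same tweet twice under a repeated tag.
        for tag in {t.lower() for t in HASHTAG_RE.findall(text)}:
            groups[tag].append(text)
    return dict(groups)

# Hypothetical tweets, invented for this example.
tweets = [
    "slides are online #socialmedia",
    "heading to #BarCamp tomorrow",
    "great talks today at #barcamp",
]
groups = group_by_hashtag(tweets)
for tag in sorted(groups):
    print(tag, len(groups[tag]))
```

Lowercasing the tag is a design choice so that `#BarCamp` and `#barcamp` fall into the same cluster.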
hashtags
- link strangers into larger conversations
- facilitate impromptu interactions
- not directed communication but a stream
- enable the emergence of trending topics
D. Murthy, Twitter: Social Communication in the Twitter Age, Polity Press, 2013
https://twitter.com/chrismessina/status/223115412
https://twitter.com/ericrice
20.03.2014
http://mashable.com/2014/03/20/twitter-eliminating-at-replies-hashtags/
geolocalized tweets in Switzerland
understanding research on twitter
1. twitter users & uses
2. twitter-specific phenomena
3. large-scale human behavior through twitter
4. real-world events and twitter
reading session 3
S. Wu, J. M. Hofman, W. Mason, and D. Watts,
Who Says What to Whom on Twitter,
in Proc. WWW 2011
http://labs.yahoo.com/publication/who-says-what-to-whom-on-twitter/
topic 1: understanding twitter users & uses
who uses twitter?
users and usage
tool for self-promotion
- “influentials”:
  - celebrities, stars
  - politicians
- enables organization/management of fans/audiences/voters
tool for communication
- everyday small talk
- (citizen) journalism
- political grassroots activism
- emergencies and disasters
- community participation
J. van Dijck The culture of connectivity, Oxford University Press, 2013
D. Murthy, Twitter: Social Communication in the Twitter Age, Polity Press, 2013
J. Hagan, Tweet Science, New York Magazine, October 2011.
2006: older professional users in business and news
2009: shift to younger adults, then mainstream
shift: from “social network” to “information network”
“the impulse to make life a publicly annotated experience has blurred the
distinction between advertising and self-expression, marketing and identity” (Hagan 2011)
following vs. friending
highly asymmetric links (Kwak, 2010)
68% of users are not followed by any of their followings
D. Murthy, Twitter: Social Communication in the Twitter Age, Polity Press, 2013
H. Kwak, C. Lee, H. Park, S. Moon, What is Twitter: a social network or a news media? in Proc. WWW 2010.
cited in: J. van Dijck The culture of connectivity, Oxford University Press, 2013
“connection with very low expectation” (Murthy, 2013)
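To make the asymmetry statistic concrete, here is a minimal sketch that computes, on a toy follow graph, the share of users who are not followed back by any of the accounts they follow. The graph and the function name `share_without_reciprocity` are invented for illustration; this is not the procedure or data of Kwak et al.

```python
def share_without_reciprocity(follows):
    """follows maps each user to the set of users they follow.

    Returns the fraction of users (among those who follow at least one
    account) that are not followed back by any of their followings.
    """
    active = [u for u, out in follows.items() if out]
    if not active:
        return 0.0
    unreciprocated = sum(
        1
        for u in active
        if not any(u in follows.get(v, set()) for v in follows[u])
    )
    return unreciprocated / len(active)

# Hypothetical toy graph: two fans follow a celebrity, two peers follow each other.
follows = {
    "fan1": {"celebrity"},
    "fan2": {"celebrity", "fan1"},
    "celebrity": set(),
    "alice": {"bob"},
    "bob": {"alice"},
}
share = share_without_reciprocity(follows)
print(share)
```

In this toy graph only the mutual pair is reciprocated, so half of the users with followings get no follow back.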
weak and “strong” ties in twitter
weak tie (following): easy to follow a large number of people
B. Huberman, D. Romero, F. Wu, Social networks that matter: Twitter under the microscope. First Monday, 14(1), Jan. 2009
309k users
211k posted at least twice
206 days (avg)
strong tie (“friend”): direct communication ‘@’ (at least twice) in the observation period
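The two tie definitions can be sketched as follows: followees form the weak ties, and a "friend" is anyone the user addressed with '@' in at least two posts. The mention pattern, the sample data, and the function name `friends_of` are assumptions made for illustration, not the code used by Huberman et al.

```python
import re
from collections import Counter

# Simplified mention pattern (an assumption): '@' followed by word characters.
MENTION_RE = re.compile(r"@(\w+)")

def friends_of(user_tweets, min_posts=2):
    """Return the accounts mentioned with '@' in at least min_posts tweets."""
    counts = Counter()
    for text in user_tweets:
        # Count each account at most once per tweet.
        counts.update({name.lower() for name in MENTION_RE.findall(text)})
    return {name for name, n in counts.items() if n >= min_posts}

# Hypothetical data for one user.
followees = {"anna", "ben", "carla", "dev"}
tweets = [
    "@anna see you at the lecture?",
    "@anna thanks for the slides",
    "@ben nice post",
    "just had coffee",
]
friends = friends_of(tweets)
print(len(followees), "followees,", len(friends), "friend(s):", sorted(friends))
```

Even in this small example the set of friends is much smaller than the set of followees, which is the contrast the slide draws between weak and strong ties.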
twitter users in business
what are companies using twitter for?
1. customer support
2. brand reputation management
3. polling & product feedback
4. lead generation
5. news distribution
6. brand awareness
7. get support for a cause
8. humanizing brands
9. thought leadership
credit: grayweed @ (cc): http://www.flickr.com/photos/21986855@N07/10435967414
M. Carvill and D. Taylor, The Business of Being Social, Crimson, 2013
to be continued at reading session…
S. Wu, J. M. Hofman, W. Mason, and D. Watts,
Who Says What to Whom on Twitter,
in Proc. WWW 2011
http://labs.yahoo.com/publication/who-says-what-to-whom-on-twitter/
topic 2: studying twitter-specific phenomena
to be continued at reading session…
S. Wu, J. M. Hofman, W. Mason, and D. Watts,
Who Says What to Whom on Twitter,
in Proc. WWW 2011
http://labs.yahoo.com/publication/who-says-what-to-whom-on-twitter/
additional example (not discussed in class)
F. Kooti, H. Yang, M. Cha, K. Gummadi, and W. Mason,
The Emergence of Conventions in Online Social Networks,
in Proc. AAAI ICWSM 2012 (Best paper award)
http://smallsocialsystems.com/papers/ConventionsICWSM.pdf
topic 3: studying large-scale human behavior through twitter
example: human mood revealed by twitter
S. A. Golder and M. W. Macy, Diurnal and Seasonal Mood Vary with Work, Sleep, and Daylength Across Diverse Cultures,
Science, 30 September 2011, Vol. 333 no. 6051 pp. 1878-1881
mood
“a conscious state of mind or predominant emotion” (Merriam-Webster dictionary)
“a temporary state of mind or feeling” (Oxford dictionary)
goal and data
positive affect (PA):
enthusiasm, delight, activeness, alertness
negative affect (NA):
distress, fear, anger, guilt
PA and NA are independent dimensions
low PA: absence of positive feelings, not presence of negative ones
2.4 million twitter users worldwide
509 million tweets
up to 400 public messages per user
all users had at least 25 messages
average: 212 tweets/user
period: 02.2008 to 01.2010
only english speakers
goal: study variations in PA & NA over time of day, day of
week, and world region using longitudinal twitter data
extraction of positive affect (PA) and negative affect (NA)
Linguistic Inquiry & Word Count (LIWC)
- categories related to psychological constructs and personal concerns
- word count per category
(figure labels: PA, NA)
J. Pennebaker, R. Booth, M. Francis, UT Austin
http://www.liwc.net/
LIWC (2)
dictionary: 4,500 words
each word belongs to one or more categories
“agree” is part of: affect, positive emotions and assent
64 word categories (excluding punctuation)
http://www.liwc.net/
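The dictionary-based counting described above can be sketched with a toy lexicon. This is not the LIWC software or its dictionary: only the categories listed for "agree" come from the slide, while the other entries, the category labels, and the function name `category_rates` are invented for illustration.

```python
import re
from collections import Counter

# Toy lexicon (hypothetical). Each word maps to one or more categories.
TOY_LEXICON = {
    "agree": {"affect", "positive emotions", "assent"},  # from the slide
    "happy": {"affect", "positive emotions"},            # invented entry
    "afraid": {"affect", "negative emotions"},           # invented entry
}

def category_rates(text, lexicon=TOY_LEXICON):
    """Return, per category, the fraction of words in text that belong to it."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {}
    counts = Counter()
    for word in words:
        for category in lexicon.get(word, ()):
            counts[category] += 1
    return {category: n / len(words) for category, n in counts.items()}

rates = category_rates("I agree, I am happy but a bit afraid")
for category in sorted(rates):
    print(category, round(rates[category], 3))
```

Dividing by the total number of words gives a rate per message, so longer messages do not automatically score higher than short ones.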
http://www.liwc.net/
measurements
Fig. 1 Hourly changes in individual affect broken down by day of the week (top, PA; bottom, NA). Each series shows mean affect (black lines) and 95% confidence interval (colored regions)
annotations on the figure: wake up; asleep; sunday sleep in; saturday night fever; i hate mondays; weekend; thank god it’s friday
Fig. 2 Hourly changes in individual affect in four English-speaking regions. Each series shows mean affect (black lines) and 95% confidence interval (colored regions)
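A mean with a 95% confidence interval per hour, as plotted in these figures, could be computed along the following lines. This is a minimal sketch under a normal approximation (mean ± 1.96 standard errors); the paper's exact estimation procedure is not reproduced here, and the scores and the function name `hourly_mean_ci` are invented for illustration.

```python
import math
from collections import defaultdict
from statistics import mean, stdev

def hourly_mean_ci(observations, z=1.96):
    """observations: iterable of (hour, affect_score) pairs.

    Returns {hour: (mean, lower, upper)} using a normal-approximation interval.
    """
    by_hour = defaultdict(list)
    for hour, score in observations:
        by_hour[hour].append(score)
    summary = {}
    for hour, scores in sorted(by_hour.items()):
        m = mean(scores)
        if len(scores) > 1:
            half_width = z * stdev(scores) / math.sqrt(len(scores))
        else:
            half_width = float("nan")  # no interval from a single observation
        summary[hour] = (m, m - half_width, m + half_width)
    return summary

# Hypothetical PA scores (fraction of PA words per message) at two hours of the day.
observations = [(8, 0.06), (8, 0.04), (8, 0.05), (23, 0.02), (23, 0.04)]
summary = hourly_mean_ci(observations)
for hour, (m, lower, upper) in summary.items():
    print(hour, round(m, 3), round(lower, 3), round(upper, 3))
```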
A. Mislove et al., Pulse of the Nation: US mood throughout the day inferred from Twitter (2010)
http://www.ccs.neu.edu/home/amislove/twitterm
mood in 2013 according to twitter
https://blog.twitter.com/2014/got-a-case-of-the-mondays-youre-not-alone
Simon Rogers, 10.03.2014
ratio of tweets containing specific words in English per million posted
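The measure "tweets containing a specific word per million posted" can be written as a short function. This is a minimal sketch that assumes whitespace tokenization and case-insensitive matching; how the Twitter blog post actually counted words is not stated here, and the sample tweets and the name `per_million` are invented.

```python
def per_million(tweets, word):
    """Number of tweets containing word, scaled to one million tweets posted."""
    if not tweets:
        return 0.0
    word = word.lower()
    # Whitespace tokenization (an assumption): punctuation stays attached to words.
    hits = sum(1 for text in tweets if word in text.lower().split())
    return 1_000_000 * hits / len(tweets)

# Hypothetical sample of tweets.
tweets = ["feeling happy today", "so tired", "Happy friday", "late again"]
print(per_million(tweets, "happy"))
```

Scaling by the number of tweets posted makes values comparable across days with different overall volume.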
what to remember
twitter: an information network
celebrities, media & the rest of us: all in the same pool
everybody can talk
… though not at the same volume
research in human behavior through twitter
large-scale traces of human states like mood
biased, yet useful data
often high expectations, but precaution is needed