Talk at MIT HCI Seminar

Machine learning approaches for understanding social interactions on Twitter May 6, 2014 Alice Oh [email protected] [email protected] http://uilab.kaist.ac.kr/members/aliceoh/


Transcript of Talk at MIT HCI Seminar

Page 1: Talk at MIT HCI Seminar

Machine learning approaches for understanding social interactions on Twitter

May 6, 2014
Alice Oh
[email protected] [email protected]
http://uilab.kaist.ac.kr/members/aliceoh/

Page 2: Talk at MIT HCI Seminar

Our Research

• Topic Modeling
  • ICML 2014: Hierarchical Dirichlet scaling process
  • IJCAI 2013: Context-dependent conceptualization
  • NIPS Big Learning Workshop 2012: Distributed online learning for latent Dirichlet allocation
  • CIKM 2012: Recursive Chinese restaurant processes for modeling topic hierarchies
  • ICML 2012: Dirichlet processes with mixed random measures
• Social Media Analysis
  • ACL 2014 Workshop: Self-disclosure topic model
  • WWW 2014: Computational analysis of agenda setting theory
  • AAAI 2013: Hierarchical aspect-sentiment model
  • ICWSM 2012: Social aspects of emotions in Twitter conversations
  • ACL 2012: Self-disclosure and relationship strength in Twitter conversations
  • WSDM 2011: Aspect sentiment unification model for online review analysis

Page 3: Talk at MIT HCI Seminar

Contact Information

• At Harvard until end of July, 2014 and open for
  • Collaborations: writing papers, sharing data, etc.
  • Discussions about topic modeling and computational social science
• Going back to KAIST in August
  • http://uilab.kaist.ac.kr
  • [email protected]
  • Can recommend students for intern, postdoc, and researcher positions
• Please consider attending
  • ICWSM (program co-chair), Ann Arbor, MI
  • ACL Workshop on Social Dynamics and Personal Attributes (co-organizer), Baltimore, MD

Page 4: Talk at MIT HCI Seminar

What is topic modeling?

Page 5: Talk at MIT HCI Seminar

Blei, Communications of the ACM, 2012

Page 6: Talk at MIT HCI Seminar

Motivation

Page 7: Talk at MIT HCI Seminar

Motivation

• What are the topics discussed in the article?

• Is the article related to

• household finances?

• price of gasoline?

• price of Apple stock?

• How would you build an automatic system for answering these questions?

Page 8: Talk at MIT HCI Seminar

http://www.nytimes.com/2010/08/09/sports/autoracing/09nascar.html?hp

nascar, races, track, raceway, race, cars, fuel, auto, racing

economic, slowdown, sales, recession, costs, spending, save

fans, spectators, sports, leagues, teams, competition

Page 9: Talk at MIT HCI Seminar

http://www.nytimes.com/2010/08/09/sports/autoracing/09nascar.html?

nascar, races, track, raceway, race, cars, fuel, auto, racing

economic, slowdown, sales, recession, costs, spending, save

fans, spectators, sports, leagues, teams, competition

Topics: multinomial over words

Topic Distributions
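To make the two ingredients on this slide concrete (topics as multinomials over words, and a per-document distribution over topics), here is a minimal sketch of LDA's generative story in plain Python. The topic names, the word probabilities, and the `alpha` value are made up for illustration and do not come from the talk.

```python
import random

# Hypothetical toy topics: each topic is a multinomial over words.
# The probabilities below are invented for illustration only.
TOPICS = {
    "racing": {"nascar": 0.4, "races": 0.3, "cars": 0.2, "fuel": 0.1},
    "economy": {"recession": 0.4, "spending": 0.3, "costs": 0.2, "save": 0.1},
    "sports": {"fans": 0.4, "teams": 0.3, "leagues": 0.2, "competition": 0.1},
}


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(n_words, alpha=0.5, seed=0):
    """Generate one document: topic distribution first, then word by word."""
    rng = random.Random(seed)
    names = list(TOPICS)
    # 1. Draw the document's distribution over topics.
    theta = sample_dirichlet([alpha] * len(names), rng)
    words = []
    for _ in range(n_words):
        # 2. Draw a topic for this word position.
        topic = rng.choices(names, weights=theta)[0]
        # 3. Draw the word from that topic's multinomial over words.
        vocab = list(TOPICS[topic])
        probs = list(TOPICS[topic].values())
        words.append(rng.choices(vocab, weights=probs)[0])
    return theta, words


theta, words = generate_document(20)
print(theta)
print(" ".join(words))
```

Inference in LDA runs this story in reverse: given only the words, it recovers plausible topics and per-document topic distributions.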

Page 10: Talk at MIT HCI Seminar

Input to LDA


http://www.nytimes.com/2010/08/09/sports/autoracing/09nascar.html?

Page 11: Talk at MIT HCI Seminar

Topics Discovered by LDA

nascar  0.12     spending   0.09     sports   0.12
races   0.1      economic   0.07     team     0.11
cars    0.1      recession  0.06     game     0.1
racing  0.09     save       0.05     player   0.1
track   0.08     money      0.05     athlete  0.09
speed   0.06     cut        0.04     win      0.07
...              ...                 ...
money   0.002    speed      0.003    nascar   0.001

Topics: multinomial over vocabulary
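The slides do not say which inference algorithm produced these topics. As one common option, the sketch below runs collapsed Gibbs sampling for LDA in plain Python on a toy corpus. The corpus, the hyperparameters `alpha` and `beta`, and the function name `lda_gibbs` are assumptions for illustration.

```python
import random


def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns one word distribution per topic."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_vocab = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]            # n_dk
    topic_word = [[0] * n_vocab for _ in range(n_topics)]  # n_kw
    topic_total = [0] * n_topics                          # n_k
    assignments = []
    # Random initial topic assignment for every word token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # p(z = t | rest) is proportional to
                # (n_dt + alpha) * (n_tv + beta) / (n_t + V * beta).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_vocab * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    # Smoothed estimate of each topic's multinomial over the vocabulary.
    topics = []
    for k in range(n_topics):
        denom = topic_total[k] + n_vocab * beta
        topics.append({w: (topic_word[k][word_id[w]] + beta) / denom for w in vocab})
    return topics


# Toy corpus (hypothetical), loosely echoing the words on the slide.
docs = [
    "nascar races cars racing track".split(),
    "spending economic recession save money".split(),
    "sports team game player athlete".split(),
    "nascar cars fuel costs spending".split(),
]
topics = lda_gibbs(docs, n_topics=3)
for topic in topics:
    print(sorted(topic, key=topic.get, reverse=True)[:3])
```

Each returned dictionary plays the role of one column in the table above: a probability for every word in the vocabulary, summing to one.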

Page 12: Talk at MIT HCI Seminar

http://www.nytimes.com/2010/08/09/sports/autoracing/09nascar.html?

nascar, races, track, raceway, race, cars, fuel, auto, racing

economic, slowdown, sales, recession, costs, spending, save

fans, spectators, sports, leagues, teams, competition

Topics: multinomial over words

Topic Distributions

Page 13: Talk at MIT HCI Seminar

Graphical Representation of LDA

Topic Distributions

nascar, races, track, raceway, race, cars, fuel, auto, racing

economic, slowdown, sales, recession, costs, spending, save

fans, spectators, sports, leagues, teams, competition

Topics: multinomial over words

Topics

sales xxx slowdown recession cars races spending xxx save costs fuel

Page 14: Talk at MIT HCI Seminar

Do you feel what I feel? Social Aspects of Emotions in Twitter Conversations

Suin Kim, JinYeong Bak, Alice Oh ICWSM 2012


Page 15: Talk at MIT HCI Seminar

Twitter conversation data

• Twitter conversation data: approx 220k dyads who “reply” to each other, 1,670k conversational chains (We now have about 5x this amount)


Page 16: Talk at MIT HCI Seminar

Emotion Cycles


Page 17: Talk at MIT HCI Seminar

Emotion cycles

We propose that organizational dyads and groups inhabit emotion cycles: Emotions of an individual influence the emotions, thoughts and behaviors of others; others’ reactions can then influence their future interactions with the individual expressing the original emotion, as well as that individual’s future emotions and behaviors. People can mimic the emotions of others, thereby extending the social presence of a specific emotion, but can also respond to others’ emotions, extending the range of emotions present.


Page 18: Talk at MIT HCI Seminar

Topic model with a twist

• Dirichlet forest prior (Andrzejewski et al.)
  • Mixture of Dirichlet tree distributions
  • Dirichlet tree: generalization of the Dirichlet distribution
• Knowledge is expressed using Must-link and Cannot-link primitives
  • Must-link(love, sweetheart)
  • Cannot-link(exciting, bored)

DF-LDA

Page 19: Talk at MIT HCI Seminar

Domain knowledge in Dirichlet forest prior

Seed Words

anticipation: hope, wait, await, inspir, excit, bore, readi, expect, nervou, calm, motiv, prepar, certain, anxiou, optimist, forese

joy: awesom, amaz, wonder, excit, glad, fine, beauti, high, lucki, super, perfect, complet, special, bless, safe, proud

anger: shit, bitch, ass, mean, damn, mad, jealou, piss, annoi, angri, upset, moron, rage, screw, stuck, irrit

surprise: amaz, wow, wonder, weird, lucki, differ, awkward, confus, holi, strang, shock, odd, embarrass, overwhelm, astound, astonish

fear: scare, stress, horror, nervou, terror, alarm, behind, panic, fear, afraid, desper, threaten, tens, terrifi, fright, anxiou

sadness: sorri, bad, aw, sad, wrong, hurt, blue, dead, lost, crush, weak, depress, wors, low, terribl, lone

disgust: sick, wrong, evil, fat, ugli, horribl, gross, terribl, selfish, miser, pathet, disgust, worthless, aw, asham, fuck

acceptance: okai, ok, same, alright, safe, lazi, relax, peac, content, normal, secur, complet, numb, fulfil, comfort, defeat

Must-link within a class; Cannot-link between classes
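The rule on this slide (Must-link within a class, Cannot-link between classes) can be sketched as a small helper that expands seed-word classes into pairwise constraints. This only builds the constraint sets and does not implement the Dirichlet forest prior. The function name `build_constraints` is hypothetical, only a few seed words per class are used, and dropping conflicting pairs is an assumption of this sketch, since some stems appear in more than one class.

```python
from itertools import combinations, product

# A few stemmed seed words per emotion class, taken from the slide's lists.
SEED_WORDS = {
    "joy": ["awesom", "glad", "proud"],
    "anger": ["mad", "annoi", "upset"],
    "sadness": ["sorri", "hurt", "depress"],
}


def build_constraints(seed_words):
    """Return (must_link, cannot_link) sets of alphabetically ordered word pairs."""
    must, cannot = set(), set()
    # Must-link: every pair of distinct words inside the same class.
    for words in seed_words.values():
        for a, b in combinations(sorted(set(words)), 2):
            must.add((a, b))
    # Cannot-link: every pair of words drawn from two different classes.
    for (_, ws1), (_, ws2) in combinations(seed_words.items(), 2):
        for a, b in product(ws1, ws2):
            if a != b:  # a stem shared by two classes cannot be split from itself
                cannot.add(tuple(sorted((a, b))))
    # Assumption: a pair that is Must-linked anywhere is never Cannot-linked.
    cannot -= must
    return must, cannot


must_link, cannot_link = build_constraints(SEED_WORDS)
print(len(must_link), "must-link pairs")
print(len(cannot_link), "cannot-link pairs")
```

With the full seed lists, stems such as `excit` (anticipation and joy) or `wrong` (sadness and disgust) occur in two classes, so some conflict-handling rule like the one assumed here is needed before the constraints are passed to a model.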

Page 20: Talk at MIT HCI Seminar

Emotion Topics: How do we express emotions?

Joy
Topic 114 omg love haha thank really | Topic 107 love thank follow wow
Topic 159 good day hope morning thank | Topic 158 love thank miss hug

Anticipation
Topic 125 hope better feel thank soon | Topic 26 good thank hope miss
Topic 146 come wait week day june | Topic 146 good day time work

Anger
Topic 131 lmao fuck ass bitch shit | Topic 4 ass yo lmao nigga
Topic 19 lmao shit damn fuck oh | Topic 13 shit nigga smh yea

Fear
Topic 48 omg oh lmao shit scare | Topic 78 happen heart attack hospital
Topic 27 don’t come night sleep outside | Topic 140 time got work day

Surprise
Topic 172 yeag know think true funny | Topic 89 know don’t think look
Topic 15 think don’t know make really | Topic 94 haha dont think really

29 70 21 14 5

Sadness
Topic 6 oh sorry haha know didnt | Topic 59 hurt got good bad
Topic 106 tweet reply didn’t read sorry | Topic 155 oh really make feel

Disgust
Topic 116 oh fuck don’t ye ew | Topic 116 look haha oh know
Topic 22 don’t oh think yeah lmao | Topic 174 don’t think say people

Acceptance
Topic 43 ok oh thank cool okay | Topic 102 know try let ok
Topic 199 xx thank good okay follow | Topic 8 night love good sleep

17 7 18

Neutral
Topic 180 com www http check youtube | Topic 156 twitter facebook people account
Topic 184 account google app work email | Topic 67 food chicken cook rt

19

Page 21: Talk at MIT HCI Seminar

Emotion Topics: How do we express emotions?

Joy
Topic 114 omg love haha thank really | Topic 107 love thank follow wow

Anticipation
Topic 125 hope better feel thank soon | Topic 26 good thank hope miss

Sadness
Topic 6 oh sorry haha know didnt | Topic 59 hurt got good bad

Neutral
Topic 180 com www http check youtube | Topic 156 twitter facebook people account

Labels: Greeting, Caring, Sympathy, IT/Tech

Page 22: Talk at MIT HCI Seminar

Emotion-tagged conversations

A (Love): @amithpr @dhempe @OperaIndia - Would you have any update on @mrunmaiy's health - hope she is recovering well?
B (neut): @labnol @dhempe she is recovering but slow. The injury is on the spine therefore worrisome. Still in icu.
A (Sadness): @amithpr thanks for the update.. extremely said to hear that news..
B (neut): @labnol #prayformrun She is a fighter and will come out of this

B (neut): @AyeItsMeiMei just tell ur followers to report her for spam. then she'll be kicked off twitter
A (Anger): @Jakeosaurous dude I didn't even do shit to her I'm just here tweeting & she calls me a ugly bitch? I was like oh wow thanks?
B (neut): @AyeItsMeiMei yeah clearly shes so ugly she cant even use her real pic:P so dont feel bad
A (Love): @Jakeosaurous haha. I don't care. She's getting spammed with hate. Hahaha. (": thanks though.
B (neut): @AyeItsMeiMei np

Page 23: Talk at MIT HCI Seminar

Emotion Transitions Plutchik’s Wheel of Emotions

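[Figure: emotion transition graph laid out on Plutchik's wheel. Node labels: Joy 39.7%, Anticipation 15.1%, Anger 12.8%, Acceptance 10.4%, Sadness 9.1%, Surprise 7.4%, Disgust 2.9%, Fear 2.6%. Numeric edge labels range from 0.11 to 0.51; their endpoints are not recoverable from the transcript.]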
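A transition graph of this kind can be estimated by counting how often one emotion label follows another and normalizing per source emotion. The sketch below assumes transitions are taken between consecutive tweets in an emotion-tagged chain; the slides do not spell out the exact definition, and the example chains are invented.

```python
from collections import Counter, defaultdict


def transition_probabilities(chains):
    """Estimate P(next emotion | current emotion) from labeled chains."""
    counts = defaultdict(Counter)
    for chain in chains:
        # Pair every label with the label of the tweet that follows it.
        for current, nxt in zip(chain, chain[1:]):
            counts[current][nxt] += 1
    probs = {}
    for current, next_counts in counts.items():
        total = sum(next_counts.values())
        probs[current] = {e: c / total for e, c in next_counts.items()}
    return probs


# Hypothetical emotion-tagged chains, one label per tweet in order.
chains = [
    ["Joy", "Joy", "Anticipation"],
    ["Sadness", "Acceptance", "Joy"],
    ["Joy", "Joy", "Joy"],
]
probs = transition_probabilities(chains)
print(probs["Joy"])
```

Each row of `probs` sums to one, so a value such as `probs["Joy"]["Joy"]` reads as the estimated chance that a Joy tweet is followed by another Joy tweet.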


Page 24: Talk at MIT HCI Seminar

Defining “Influence”

[Diagram: a conversation between User A and User B along a time axis, with one tweet marked as the "emotion influencing tweet".]

Tweets, in order:
• Having a tough day today. RIP Harrison. I’ll miss you a ton :/
• Just pray about it. God will help you.
• Not really religious, but thanks man. :)
• If you need talk you know I’m here.

Emotion labels shown: (Sadness), (Acceptance), (Anticipation)

Page 25: Talk at MIT HCI Seminar

Emotion Influences: What can you say to make your partner feel better?

Joy → Sadness, Sadness → Joy
Topic 117 tweet people don’t read post | Topic 59 hurt got bad pain feel
Topic 18 wear look think love black | Topic 24 love thank great new look

Anticipation → Surprise
Topic 96 music listen play song good | Topic 178 follow tweet people twitter thank

Acceptance → Anger
Topic 31 i’m got lmax shit da | Topic 13 lmao shit nigga smh yea

Disgust → Joy
Topic 61 watch new live tv tonight | Topic 63 watch good think know look

Labels: Suggesting, Greeting, Sympathy, Swear words, Complaining

Page 26: Talk at MIT HCI Seminar

Emotion Influence: Joy to Anger

[Bar chart. X-axis: Anticipation, Joy, Surprise, Fear, Anger, Sadness, Disgust, Acceptance, Neutral. Y-axis: 0 to 0.3. Bar values (order not preserved in the transcript): 0.041, 0.071, 0.082, 0.053, 0.265, 0.061, 0.081, 0.042, 0.051.]

Expressing Anger has a 26.5% chance of changing the partner’s emotion from Joy to Anger.

Emotion Influence: Sadness to Joy

[Bar chart. X-axis: Anticipation, Joy, Surprise, Fear, Anger, Sadness, Disgust, Acceptance, Neutral. Y-axis: 0 to 0.36. Bar values (order not preserved in the transcript): 0.211, 0.23, 0.214, 0.209, 0.191, 0.237, 0.253, 0.358, 0.273.]

Expressing Joy has a 35.8% chance of changing the partner’s emotion from Sadness to Joy.
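One way to read these numbers is as conditional frequencies: among cases where a user was in emotion X and the partner then expressed emotion E, how often the user's next tweet was in emotion Y. The sketch below computes that quantity from (user's previous emotion, partner's emotion, user's next emotion) triples. This reading of "influence", the function name, and the toy data are assumptions; the talk's exact definition may differ.

```python
from collections import Counter


def influence_probabilities(triples, before, after):
    """For each emotion the partner expresses, estimate P(user moves before -> after)."""
    seen = Counter()     # partner expressed e while the user was in `before`
    changed = Counter()  # ...and the user's next tweet was in `after`
    for user_prev, partner_emotion, user_next in triples:
        if user_prev != before:
            continue
        seen[partner_emotion] += 1
        if user_next == after:
            changed[partner_emotion] += 1
    return {e: changed[e] / seen[e] for e in seen}


# Hypothetical triples: (user's emotion, partner's reply emotion, user's next emotion).
triples = [
    ("Sadness", "Joy", "Joy"),
    ("Sadness", "Joy", "Sadness"),
    ("Sadness", "Anger", "Sadness"),
    ("Joy", "Anger", "Anger"),
]
result = influence_probabilities(triples, before="Sadness", after="Joy")
print(result)
```

Under this reading, each bar in the charts corresponds to one key of the returned dictionary.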

Page 27: Talk at MIT HCI Seminar

Self-disclosure topic model

JinYeong Bak, Chin-Yew Lin, and Alice Oh
ACL 2014 Workshop on Social Dynamics and Personal Attributes

Page 28: Talk at MIT HCI Seminar

Self-disclosure Research using Twitter

• People disclose personal and secretive information
  • to build and maintain interpersonal relationships
  • to get social support

• Twitter is a great source for naturally-occurring, large-scale, longitudinal data on self-disclosure behavior

• We develop a topic model for classifying self-disclosure behavior into three categories: G (general, no disclosure), M (medium disclosure), H (high disclosure)

• We look at the correlation of self-disclosure behavior and frequency of Twitter conversations in longitudinal data
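The slides do not say which correlation measure is used. As a minimal sketch, the Pearson coefficient between a per-dyad self-disclosure score and that dyad's conversation frequency could be computed as below. The score definition, the variable names, and the numbers are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-dyad values: mean self-disclosure score and
# number of conversations in the same time window.
sd_score = [0.1, 0.2, 0.3, 0.4]
conversation_count = [2, 4, 6, 8]
r = pearson(sd_score, conversation_count)
print(round(r, 3))
```

For longitudinal data, the same function could be applied per time window to see whether the relationship changes over time.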


Page 29: Talk at MIT HCI Seminar

Self-disclosure in Twitter conversations

Conversation 1:

So my brother is going to Roskilde Festival and my mother and sister is going to England.. That leaves me, my dad and my dog.

@cccc why aren't you going to england?

@dddd because my sister is going with 3 of her friends and my mom's just there... to be there. And my sister didn't want me to come :(

Conversation 2:

I'm moving out.

@xxxx ??? What's going on bb?

@yyyy Mother. Done with her. I am planning to get out now. There's nothing I can do, we dont get along

@xxxx I'm.sorry hunn. That's rough. Where are you going to go though?

@yyyy Probably stay at a friends place in the timebeing until I find a place to live!

@xxxx :/ well I'm glad your getting out if she is being horrible to you

Conversation 3:

Oh, prepregnancy pants, you are so uncomfortable.

@eeee You can put them on? Jealous.

@ffff they are cutting into my flesh and are giving me a ridiculous muffin top. It isn't pretty. But we have company coming over.

@eeee Yea, I tried yesterday. I got one pair of shorts to button painfully and my jeans just laughed at me.

Page 30: Talk at MIT HCI Seminar

Data

• Full data
  • 88k users, 51k dyads
  • 1.3M conversations
  • 10.5M tweets
  • Longitudinal data from August 2007 to July 2013
• Labeled data (gold standard for self-disclosure level)
  • 101 conversations
  • 673 tweets

Page 31: Talk at MIT HCI Seminar

Graphical Representation of SDTM

3 sets of topics, one for G, M, and H levels

By using a topic model, we can
- classify the levels of disclosure
- discover topics associated with each level
- generalize to other social media sites using the same set of seed words

Page 32: Talk at MIT HCI Seminar

Seed Words

• Medium level: frequent trigrams for personally identifiable information

• High level: automatically extracted from the sixbillionsecrets website

Page 33: Talk at MIT HCI Seminar

Classification Results


Direct Classification using the Models

Classification with SVM using Features Learned from Models

Page 34: Talk at MIT HCI Seminar

Self-disclosure topics


Page 35: Talk at MIT HCI Seminar

SD level & conversation frequency


Page 36: Talk at MIT HCI Seminar

Sociolinguistic Analysis of Twitter in Multilingual Societies

Suin Kim, Ingmar Weber, Li Wei, and Alice Oh
Under Review

Page 37: Talk at MIT HCI Seminar

Data

Page 38: Talk at MIT HCI Seminar

Data

Page 39: Talk at MIT HCI Seminar

Visualization of the network

Page 40: Talk at MIT HCI Seminar

How are they connected?

• English monolinguals and X-EN bilinguals bridge the network

Page 41: Talk at MIT HCI Seminar

Closer look at Bilinguals: Which language do they choose?

Page 42: Talk at MIT HCI Seminar

Closer look at Bilinguals: Hashtag usage

Page 43: Talk at MIT HCI Seminar

Closer look at Bilinguals: Topics (Results of LDA)

Page 44: Talk at MIT HCI Seminar

Closer look at Bilinguals: Topics (Results of LDA)

Page 45: Talk at MIT HCI Seminar

Future directions

• Develop model for prediction of language choice in bilinguals

• Look at how English is used throughout the world

• Cognitive studies of first and second languages

• Self-disclosure and relationship building

• Email me for data sharing, collaborating, discussing, …

[email protected]