Social media analysis with NLP - Carnegie Mellon...

Page 1: Social media analysis with NLP - Carnegie Mellon Universitydemo.clab.cs.cmu.edu/NLP/S20/files/slides/L27-social... · 2020-04-29 · in social media Portrayal of characters and relationships

Social media analysis with NLP

Michael Miller Yoder

28 April 2020

Overview

1. Motivation: language in social context

2. Examples of NLP approaches to modeling identity

Effects of self-presentation on interaction in social media (Experiment 1)

Portrayal of characters and relationships in narrative (fanfiction) (Experiment 2)

language embedded in social context

What types of social contexts is language used in?

What types of social contexts?

For NLP, what is language?

[Timeline figure, 1990 to 2020: statistical machine learning NLP, followed by neural NLP. Marked on the timeline: Penn Treebank (news text, 1987-1989) and BERT.]

[Diagram: language in its SOCIAL context, surrounded by speakers, audience, situations, purposes.]

Penn Treebank

1987-1989

credit: Amir Zeldes, [Zeldes & Simonson 2016]

Typical rates in the secondary market : 8.65 % one month ; 8.65 % three months ; 8.55 % six months. BANKERS ACCEPTANCES : 8.52 % 30 days ; 8.37 % 60 days ; 8.15 % 90 days ; 7.98 % 120 days ; 7.92 % 150 days ; 7.80 % 180 days.

language is always embedded in social context

NLP + social science: applications

● hate speech detection
● community norms
● fairness and bias
● media framing
● dialectal NLP tools

Credits: Garg et al. 2017; https://criticalmediareview.wordpress.com/2015/10/19/what-is-media-framing/; www.tes.com

Models of identity

Critical identity approaches

● Sociolinguistics: “identity is the product rather than the source of linguistic and other semiotic practices … is social and cultural rather than primarily internal” [Bucholtz and Hall 2005]
  [Diagram labels: identity; society, culture]

● Gender studies: “As a shifting and contextual phenomenon, gender does not denote a substantive being” [Butler 1990]
  [Diagram labels: (changing) identity; society, culture]

● Critical race theory: “race and sex become grounded in experiences that actually represent only a subset of a much more complex phenomenon.” [Crenshaw 1989]
  [Diagram label: (intersectional) identity]

● Discourse analysis: “people have multiple identities connected not to their ‘internal states’ but to their performances in society” [Gee 2000]
  [Diagram label: identities]

Computational identity approaches

● Computer science: “classify latent user attributes, including gender, age, regional origin, and political orientation solely from Twitter user language” [Rao et al. 2010]

● Computer vision: “a [deep neural network] can be used to identify sexual orientation from facial images” [Kosinski and Wang 2018]

Can we investigate the production of identity in language with computational models?

● Avoid naturalizing structures of identity and further marginalizing those who don’t fit them [Butler 1990]

● Discover how notions of identity are being reinforced/challenged/reinvented

[Diagram labels: language + social data; machine learning, y = f(x); ?]

1. Self-presentation effects on social media

Qinlan Shen, CMU Language Technologies Institute

Alex Coda, CMU Language Technologies Institute

Carolyn P. Rosé, CMU Language Technologies Institute

Yunseok Jang, U Michigan Computer Science & Eng

Yale Song, Microsoft Research

Kapil Thadani, Yahoo Research

WebSci 2020

Explicit identity positioning

● Working identity definition: “social positioning of self and other” [Bucholtz & Hall 2010]

● How does the social positioning of self affect interaction on social media?

● Tumblr as a site with particular identity implications, as well as social interaction

Lyca / 25

Blog descriptions on Tumblr

● Free-form text bio boxes

● Labeling practices outside gender/sexuality binaries [Oakley 2016]

Examples:

max | 18yo | she/they | girl with dreams | twerfs don't follow

andre | he/him | 22 | mexican ✨trans | too many fandoms

hey! annie, she/hers, love me, infj

Identity categories

● age
● ethnicity/nationality
● fandoms
● gender
● interests
● location
● personality type
● pronouns
● relationship status
● sexual orientation
● zodiac sign

Example labels:

fandoms: shipping, star wars, lotr, homestuck
gender: woman, husband, mtf, nonbinary
age: 24, xviii, 35yo, nineteen
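
The slides do not say how labels in a blog description are assigned to identity categories. As a minimal sketch, assuming a small hand-built lexicon of patterns, the example bios could be tagged as below. The `CATEGORY_PATTERNS` lexicon and the `tag_bio` function are hypothetical: the patterns only cover labels that appear in the slide examples, and treating "girl" as a gender label is an assumption.

```python
import re

# Hypothetical mini-lexicon covering only labels seen in the slide examples;
# a real lexicon would be much larger and curated by hand.
CATEGORY_PATTERNS = {
    "age": re.compile(r"^(\d{1,2}(yo)?|xviii|nineteen)$"),
    "pronouns": re.compile(r"^(she|he|they)/(her|hers|him|they|them)$"),
    "gender": re.compile(r"^(woman|husband|mtf|nonbinary|male|girl)$"),
    "fandoms": re.compile(r"^(shipping|star wars|lotr|homestuck)$"),
    "personality type": re.compile(r"^[ei][ns][tf][jp]$"),
}

def tag_bio(bio):
    """Return {category: [labels]} for one free-form blog description."""
    # Split on the separators used in the example bios: '|' and ','.
    segments = [s.strip().lower() for s in re.split(r"[|,]", bio)]
    found = {}
    for segment in segments:
        # Try the whole segment first (for multi-word labels), then each word.
        for candidate in [segment] + segment.split():
            for category, pattern in CATEGORY_PATTERNS.items():
                if pattern.match(candidate):
                    labels = found.setdefault(category, [])
                    if candidate not in labels:
                        labels.append(candidate)
    return found

print(tag_bio("max | 18yo | she/they | girl with dreams | twerfs don't follow"))
print(tag_bio("hey! annie, she/hers, love me, infj"))
```

Pattern matching like this misses labels outside the lexicon (for example "mexican" or "trans" in the second example bio), which is one reason a curated list per category would be needed in practice.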

What effects of similarities and differences in self-positioning do we see on content propagation on Tumblr?

(self-positioning: blog descriptions; content propagation: reblogging)

Reblog prediction

● Reblog "opportunity"

● Learning to rank pairwise formulation

[Diagram: a follower and two followees, each followee with a post made at a similar time; one of the two posts is reblogged. Example blog description labels in the diagram: "she/her", "25 | nyc", "reylo fan".]
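
The slides name a pairwise learning-to-rank formulation but do not give the model. As a rough sketch, assuming a logistic model over feature differences (a common pairwise choice, not necessarily the one used in the study), each reblog "opportunity" becomes a pair of feature vectors: the post that was reblogged and the post that was not. The feature layout and the toy numbers are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_pairwise(pairs, n_features, epochs=200, lr=0.5):
    """pairs: list of (reblogged_features, not_reblogged_features)."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for pos, neg in pairs:
            diff = [p - n for p, n in zip(pos, neg)]
            # Modeled probability that the reblogged post outranks the other post.
            prob = sigmoid(sum(wi * di for wi, di in zip(w, diff)))
            # Gradient ascent on the log-likelihood of the observed preference.
            for i in range(n_features):
                w[i] += lr * (1.0 - prob) * diff[i]
    return w

def prefers_first(w, first, second):
    """True if the model ranks `first` above `second`."""
    return sum(wi * (a - b) for wi, a, b in zip(w, first, second)) > 0

# Toy features (hypothetical): [identity label match, scaled hashtag count].
pairs = [
    ([1.0, 0.2], [0.0, 0.3]),
    ([1.0, 0.5], [0.0, 0.4]),
    ([1.0, 0.1], [0.0, 0.1]),
]
w = train_pairwise(pairs, n_features=2)
print(prefers_first(w, [1.0, 0.3], [0.0, 0.3]))
```

Framing the task as a comparison between two posts seen at a similar time controls for exposure: the follower plausibly had the chance to reblog either post.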

Data

Number of users: 34,797
Reblog prediction instances: 712,670
Timeframe: June - Nov 2018

Control features

● post hashtags
● number of likes, comments
● post type (text, photo, etc)

Identity features

Example pair of blog descriptions:

followed: 22 male infj coffee 🌈
follower: they/them 29 leo infj

Features computed from the pair:

● match: age (both descriptions include an age)
● match: personality type (both descriptions include a personality type)
● mismatch: pronouns (only one description includes pronouns)
● match: infj (the same label appears in both)
● followed: 22, follower: 29 (the specific pair of labels)
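
To make the feature types above concrete, here is a small sketch that derives them from two already-tagged blog descriptions. It assumes each description has been converted to a mapping from category to a set of labels; `pair_features` and the feature names are hypothetical, and the category assigned to "coffee" is a guess.

```python
def pair_features(follower, followed):
    """Compare two blogs, each given as {category: set_of_labels}."""
    features = {}
    for category in sorted(set(follower) | set(followed)):
        both_present = category in follower and category in followed
        # Category-level feature: do both descriptions include this category?
        features["category:" + category] = "match" if both_present else "mismatch"
        if both_present:
            # Label-level feature: an identical label on both sides.
            for label in sorted(follower[category] & followed[category]):
                features["label_match:" + label] = True
            # Directional label pair, kept as (follower labels, followed labels).
            features["labels:" + category] = (
                tuple(sorted(follower[category])),
                tuple(sorted(followed[category])),
            )
    return features

# The example pair from the slide ("followed: 22, follower: 29").
followed = {"age": {"22"}, "gender": {"male"},
            "personality type": {"infj"}, "interests": {"coffee"}}
follower = {"pronouns": {"they/them"}, "age": {"29"},
            "zodiac sign": {"leo"}, "personality type": {"infj"}}
features = pair_features(follower, followed)
for name, value in features.items():
    print(name, value)
```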

Is there an effect?

Self-presentation labels are associated with content propagation

What is the nature of this effect?

● Establishing solidarity: categories and label matches were positively associated with reblogging

Examples of matches: indie / indie; sappho / sappho; any sexual orientation / any sexual orientation

Features                                            Likelihood of reblogging
Follower: presents pronouns; Followed: does not
Both: cis or cishet                                 ↑
Race/ethnicity label alignment                      ↑
Nationality label alignment                         none
Follower: gaming; Followed: manga
Follower: memes; Followed: history

Conclusion

● Evidence for an association between explicit, self-presented identity information and content propagation

○ Most studies use only content and network features to predict content propagation [Naveed et al. 2011, Zhang et al. 2016, Vosoughi et al. 2018]

● Users who presented labels that indicated shared interests or shared values were more likely to share each other’s content

2. Changes in portrayal of characters in narrative

Qinlan Shen

Luke Breitfeller

Carolyn P. Rosé

James Fiacco

Shefali Garg

Ethan Xuanyue Yang

Huiming Jin

Hariharan Muralidharan

Fanfiction

● Stories based on existing media [Fiesler et al. 2016]

● “Participatory culture” [Jenkins 2003]

● “Queer female space” [Lothian et al. 2007]

○ queer pairings (M/M, F/M, F/F)
○ female characters [Bamman & Milli 2016]
○ gender-swapping
○ desire outside heterosexual, cisgender norms

Terms:

Canon: The original work fanfiction is based on

Ship: Romantic relationship between characters

Fic: A specific fanfiction story

Image credit: thedcontinuum.wordpress.com

How do fanfiction authors use language to shift character identities from canon?

1. Locate text that is relevant for characterization

2. Test ability to capture changes in relationship portrayal

3. Describe patterns in characterization shifts

Text extraction

github.com/michaelmilleryoder/fanfiction-nlp

Based on BookNLP [Bamman et al. 2014]

Methods

● Word embeddings [Mikolov et al. 2013a] for social questions

○ Stereotypes and bias in corpora [Garg et al. 2018]

○ Framing by different social groups [An et al. 2018]

● Can word embeddings capture social framing of relationships in fanfiction?

Data

Harry Potter stories from Archive of Our Own

>179k stories (as of 2018)

Characters

● Harry Potter
● Hermione Granger
● Draco Malfoy
● Ron Weasley
● Ginny Weasley

Pairings by popularity

● Draco/Harry
● Hermione/Ron
● Draco/Hermione
● Ginny/Harry
● Harry/Hermione
● Harry/Ron

Prediction task

● Does the relationship match canon in being romantic/not romantic?

● False (relationship is changed) if

○ not romantic in canon and romantic in fanfiction or

○ romantic in canon and not romantic in fanfiction
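
The labeling rule above reduces to comparing two booleans. A minimal sketch follows, assuming the romantic/not romantic status of a pairing in a given fic is already known (the slides do not say where it comes from). The `CANON_ROMANTIC` set is filled in from general knowledge of the Harry Potter books rather than from the slides, and `matches_canon` is a hypothetical name.

```python
# Pairings that are romantic in canon. This comes from general knowledge of
# the Harry Potter books, not from the slides.
CANON_ROMANTIC = {frozenset(("Hermione", "Ron")), frozenset(("Ginny", "Harry"))}

def matches_canon(pairing, romantic_in_fic):
    """True if the fic keeps the canon status of the pairing, False if it changes it."""
    char_a, char_b = pairing.split("/")
    romantic_in_canon = frozenset((char_a, char_b)) in CANON_ROMANTIC
    return romantic_in_canon == romantic_in_fic

print(matches_canon("Draco/Harry", romantic_in_fic=True))    # changed from canon
print(matches_canon("Hermione/Ron", romantic_in_fic=True))   # same as canon
```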

Relationship representations

Harry wept at the sight of Hermione in the garden. (pair: Harry, Hermione)

Ron looked down at his shoe. Troll bogeys. He would have to tell Harry about this. (pair: Harry, Ron)

● Weighted average of word embeddings in a 10-word window around character name mentions
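
The windowed average can be sketched as follows. Several details are assumptions because the slide does not specify them: the window is taken as 10 tokens on each side of a mention, context words are weighted by inverse distance to the mention, and the tiny two-dimensional `embeddings` table is a stand-in for real pretrained vectors. `tokenize` and `pair_vector` are hypothetical names.

```python
import re

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def pair_vector(tokens, names, embeddings, window=10):
    """Weighted average of context-word vectors around mentions of any name in `names`."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for i, token in enumerate(tokens):
        if token not in names:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            context = tokens[j]
            # Skip the mention itself, other character names, and unknown words.
            if j == i or context in names or context not in embeddings:
                continue
            weight = 1.0 / abs(j - i)  # assumed weighting: closer words count more
            for k in range(dim):
                total[k] += weight * embeddings[context][k]
            weight_sum += weight
    if weight_sum == 0.0:
        return None  # no mention found, or no known context words
    return [value / weight_sum for value in total]

# Toy 2-dimensional embeddings (made up for illustration).
embeddings = {"wept": [1.0, 0.0], "garden": [0.0, 1.0]}
tokens = tokenize("Harry wept at the sight of Hermione in the garden.")
vec = pair_vector(tokens, {"harry", "hermione"}, embeddings)
print(vec)
```

A vector like this, built over all of a story's mentions of a pairing, could then serve as the input to the prediction task.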

Visualization

● Track changes in contextualized embeddings for character names across fics

○ Train RNN-based language model and take final hidden state as contextualized word representation [Peters et al. 2018]

Example: the same name in two contexts

Hermione sat in the front of the classroom. She...   [ 0.34 0.72 0.21 … ]

Fleur whistled softly. "Hermione! Come here...   [ 0.89 0.06 0.53 … ]

● Canon vector is close to the center of the fanfiction vectors: harry

● Canon vector is on the edge of fanfiction vectors: draco, remus, sirius
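
The slides present this as a visual observation and do not give a numeric measure. One hypothetical way to quantify "close to the center" versus "on the edge" is to compare the canon vector's distance from the centroid of the fanfiction vectors with the distances of the fanfiction vectors themselves. The `canon_centrality` score below is an illustrative assumption, not the authors' method.

```python
import math

def centroid(vectors):
    dim = len(vectors[0])
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def canon_centrality(canon_vec, fic_vecs):
    """Fraction of fanfiction vectors lying farther from their centroid than the canon vector.

    Near 1.0: the canon vector sits close to the center.
    Near 0.0: the canon vector sits on the edge or outside.
    """
    center = centroid(fic_vecs)
    canon_dist = euclidean(canon_vec, center)
    farther = sum(1 for v in fic_vecs if euclidean(v, center) > canon_dist)
    return farther / len(fic_vecs)

# Toy 2-dimensional vectors (made up for illustration).
fic_vecs = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
print(canon_centrality([1.0, 1.0], fic_vecs))  # canon at the center
print(canon_centrality([5.0, 5.0], fic_vecs))  # canon far outside
```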

Conclusion

● Word embedding approaches can capture types of character framing

○ See evidence of differences in characterization, relationships

● Differences often match known fanfiction trends

Computational models of identity in language

● Computational techniques to analyze and model the presentation of identity in discourse

● The effects of the choice of self-presentation (Experiment 1)

● How identities are represented in changing ways in narrative (Experiment 2)

language embedded in social context

Thank you!
