Practical Sentiment Analysis

Practical Sentiment Analysis Tutorial Jason Baldridge @jasonbaldridge Sentiment Analysis Symposium 2014 Associate Professor Co-founder & Chief Scientist Wednesday, March 5, 14

description

An informative tutorial on practical sentiment analysis, natural language processing, and semi-supervised learning. Learn how your company can leverage the crowd for sentiment analysis of structured and unstructured content. Dr. Jason Baldridge is co-founder and Chief Scientist at People Pattern, and Associate Professor of Linguistics at the University of Texas at Austin.

Transcript of Practical Sentiment Analysis

Page 1: Practical Sentiment Analysis

Practical Sentiment Analysis Tutorial

Jason Baldridge @jasonbaldridge

Sentiment Analysis Symposium 2014

Associate Professor; Co-founder & Chief Scientist

Wednesday, March 5, 14

Page 2: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

About the presenter

Associate Professor, Linguistics Department, The University of Texas at Austin (2005-present)

Ph.D., Informatics, The University of Edinburgh, 2002

MA (Linguistics), MSE (Computer Science), The University of Pennsylvania, 1998

Co-founder & Chief Scientist, People Pattern (2013-present)

Built Converseon’s Convey text analytics engine, with Philip Resnik and Converseon programmers.


Page 3: Practical Sentiment Analysis

Why NLP is hard
Sentiment analysis overview
Document classification
break
Aspect-based sentiment analysis
Visualization
Semi-supervised learning
break
Stylistics & author modeling
Beyond text
Wrap up

Page 4: Practical Sentiment Analysis

Why NLP is hard

Page 5: Practical Sentiment Analysis

Text is pervasive = big opportunities

Page 7: Practical Sentiment Analysis

Texts as bags of words (with apologies) (http://www.wordle.net/)

http://www.wired.com/magazine/2010/12/ff_ai_essay_airevolution/


Page 8: Practical Sentiment Analysis

That is of course not the full story...

Texts are not just bags-of-words.

Order and syntax affect interpretation of utterances.

[Figure: the words “leg”, “on”, “man”, “the”, “dog”, “bit”, “the”, “the” shown as a bag of words, then with “man” and “dog” picked out and the role labels Subject, Object, and Modifier (in a later build, Location).]

Page 14: Practical Sentiment Analysis

What does this sentence mean?

I saw her duck with a telescope.

[Figure: images illustrating competing readings of the sentence, with the labels “verb” and “noun”.]

Image sources: [http://casablancapa.blogspot.com/2010/05/fore.html] [http://www.supercoloring.com/pages/duck-outline/] [http://www.clker.com/clipart-green-eyes-3.html] [http://www.clker.com/clipart-3163.html] [http://www.simonpalfrader.com/category/tournament-poker]

Slide by Lillian Lee

Page 31: Practical Sentiment Analysis

Ambiguity is pervasive

the a are of I [Steve Abney]

an “are” (100 m²)

another “are”

Slide by Lillian Lee

Page 33: Practical Sentiment Analysis

And it goes further...

Rhetorical structure affects the interpretation of the text as a whole.

Max fell. (Because) John pushed him.
Explanation: pushing precedes falling

Max fell. (Then) John pushed him.
Continuation: falling precedes pushing

Page 39: Practical Sentiment Analysis

What’s hard about this story? [Slide from Jason Eisner]

John stopped at the donut store on his way home from work. He thought a coffee was good every few hours. But it turned out to be too expensive there.

Callouts shown one at a time over the story:

To get a spare tire (donut) for his car?
store where donuts shop? or is run by donuts? or looks like a big donut? or made of donut? or has an emptiness at its core?
I stopped smoking freshman year, but
Describes where the store is? Or when he stopped?
Well, actually, he stopped there from hunger and exhaustion, not just from work.
At that moment, or habitually?
That’s how often he thought it?
But actually, a coffee only stays good for about 10 minutes before it gets cold.
Similarly: In America a woman has a baby every 15 minutes. Our job is to find that woman and stop her.
the particular coffee that was good every few hours? the donut store? the situation?
too expensive for what? what are we supposed to conclude about what John did?
how do we connect “it” to “expensive”?

Page 52: Practical Sentiment Analysis

NLP has come a long way

Page 53: Practical Sentiment Analysis

Sentiment analysis overview

Page 54: Practical Sentiment Analysis

Sentiment analysis: background [slide from Lillian Lee]

People search for and are affected by online opinions.

TripAdvisor, Rotten Tomatoes, Yelp, Amazon, eBay, YouTube, blogs, Q&A and discussion sites

According to a comScore ’07 report and an ’08 Pew survey:

60% of US residents have done online product research, and 15% do so on a typical day.

73%-87% of US readers of online reviews of services say the reviews were significant influences. (more on economics later)

But, 58% of US internet users report that online information was missing, impossible to find, confusing, and/or overwhelming.

Creating technologies that find and analyze reviews would answer a tremendous information need.

Page 55: Practical Sentiment Analysis

Broader implications: economics [slide from Lillian Lee]

Consumers report being willing to pay from 20% to 99% more for a 5-star-rated item than a 4-star-rated item. [comScore]

But do the polarity and/or volume of reviews have a measurable, significant influence on actual consumer purchasing?

Implications for bang-for-the-buck, manipulation, etc.

Page 56: Practical Sentiment Analysis

Social media analytics: acting on sentiment

Richard Lawrence, Prem Melville, Claudia Perlich, Vikas Sindhwani, Estepan Meliksetian et al. In ORMS Today, Volume 37, Number 1, February 2010.

Page 57: Practical Sentiment Analysis

Polarity classification [slide from Lillian Lee]

Consider just classifying an avowedly subjective text unit as either positive or negative (“thumbs up” or “thumbs down”).

One application: review summarization.

Elvis Mitchell, May 12, 2000: It may be a bit early to make such judgments, but Battlefield Earth may well turn out to be the worst movie of this century.

Can’t we just look for words like “great”, “terrible”, “worst”?

Yes, but ... learning a sufficient set of such words or phrases is an active challenge.

Page 58: Practical Sentiment Analysis

Using a lexicon [slide from Lillian Lee]

From a small-scale human study:

Proposed word lists and resulting accuracy

Subject 1 (accuracy: 58%)
Positive: dazzling, brilliant, phenomenal, excellent, fantastic
Negative: suck, terrible, awful, unwatchable, hideous

Subject 2 (accuracy: 64%)
Positive: gripping, mesmerizing, riveting, spectacular, cool, awesome, thrilling, badass, excellent, moving, exciting
Negative: bad, cliched, sucks, boring, stupid, slow

Automatically determined from data (accuracy: 69%)
Positive: love, wonderful, best, great, superb, beautiful, still
Negative: bad, worst, stupid, waste, boring, ?, !

Page 59: Practical Sentiment Analysis

Polarity words are not enough [slide from Lillian Lee]

Can’t we just look for words like “great” or “terrible”?

Yes, but ...

This laptop is a great deal.

A great deal of media attention surrounded the release of the new laptop.

This laptop is a great deal ... and I’ve got a nice bridge you might be interested in.
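A minimal sketch of why keyword spotting falls short on the three laptop sentences. The function name `positive_hits` and the two-word lexicon are assumptions made for illustration; they are not taken from the tutorial's code.

```python
import re

# Hypothetical mini-lexicon, for illustration only.
POSITIVE_WORDS = {"great", "nice"}

def positive_hits(text):
    """Return the positive lexicon words that occur in the text, in order."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t in POSITIVE_WORDS]

examples = [
    "This laptop is a great deal.",
    "A great deal of media attention surrounded the release of the new laptop.",
    "This laptop is a great deal ... and I've got a nice bridge you might be interested in.",
]

for sentence in examples:
    # Every sentence matches "great", yet only the first is a sincere endorsement.
    print(positive_hits(sentence), "<-", sentence)
```

All three sentences trigger the lexicon, so a word-matching classifier cannot separate the endorsement from the idiom or from the sarcastic remark.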

Page 60: Practical Sentiment Analysis

Polarity words are not enough

Polarity flippers: some words change positive expressions into negative ones and vice versa.

Negation: America still needs to be focused on job creation. Not among Obama's great accomplishments since coming to office !! [From a tweet in 2010]

Contrastive discourse connectives: I used to HATE it. But this stuff is yummmmmy :) [From a tweet in 2011 -- the tweeter had already bolded “HATE” and “But”!]

Multiword expressions: other words in context can make a negative word positive:

That movie was shit. [negative]

That movie was the shit. [positive] (American slang from the 1990’s)
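One common way to handle the negation case is sketched below: a polarity word is flipped when it appears within a few tokens after a negator. The word lists, the window size of 3, and the function name `score_with_negation` are all assumptions for illustration. The sketch does not address contrastive connectives or multiword expressions such as “the shit”.

```python
import re

# Hypothetical word lists, for illustration only.
POSITIVE = {"great", "good", "yummy"}
NEGATIVE = {"bad", "hate", "terrible"}
NEGATORS = {"not", "never", "no"}

def score_with_negation(text, window=3):
    """Sum word polarities (+1 / -1), flipping any polarity word that
    appears within `window` tokens after a negator."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    negated_until = -1  # index of the last token still under negation
    for i, token in enumerate(tokens):
        if token in NEGATORS:
            negated_until = i + window
            continue
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        if i <= negated_until:
            polarity = -polarity
        score += polarity
    return score

print(score_with_negation("Not among Obama's great accomplishments since coming to office !!"))
print(score_with_negation("This laptop is a great deal."))
```

A fixed window is a crude stand-in for syntactic scope; it is shown here only because it is simple, not because the tutorial prescribes it.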

Page 61: Practical Sentiment Analysis

More subtle sentiment (from Pang and Lee)

With many texts, no ostensibly negative words occur, yet they indicate strong negative polarity.

“If you are reading this because it is your darling fragrance, please wear it at home exclusively, and tape the windows shut.” (review by Luca Turin and Tania Sanchez of the Givenchy perfume Amarige, in Perfumes: The Guide, Viking 2008.)

“She runs the gamut of emotions from A to B.” (Dorothy Parker, speaking about Katharine Hepburn.)

“Jane Austen’s books madden me so that I can’t conceal my frenzy from the reader. Every time I read ‘Pride and Prejudice’ I want to dig her up and beat her over the skull with her own shin-bone.” (Mark Twain.)

Page 62: Practical Sentiment Analysis

Thwarted expectations (from Pang and Lee)

There are also highly negative texts that use lots of positive words but are ultimately reversed by the final sentence. For example:

This film should be brilliant. It sounds like a great plot, the actors are first grade, and the supporting cast is good as well, and Stallone is attempting to deliver a good performance. However, it can’t hold up.

This is referred to as a thwarted expectations narrative because in the final sentence the author sets up a deliberate contrast to the preceding discourse, giving it more impact.

Page 63: Practical Sentiment Analysis

Polarity classification: it’s more than positive and negative

Positive: “As a used vehicle, the Ford Focus represents a solid pick.”

Negative: “Still, the Focus' interior doesn't quite measure up to those offered by some of its competitors, both in terms of materials quality and design aesthetic.”

Neutral: “The Ford Focus has been Ford's entry-level car since the start of the new millennium.”

Mixed: “The current Focus has much to offer in the area of value, if not refinement.”

http://www.edmunds.com/ford/focus/review.html

Page 64: Practical Sentiment Analysis

Other dimensions of sentiment analysis

Subjectivity: is an opinion even being expressed? Many statements are simply factual.

Target: what exactly is an opinion being expressed about?

Important for aggregating interesting and meaningful statistics about sentiment.

Also, it affects how language use indicates polarity: e.g., “unpredictable” is usually positive for movie reviews, but is very negative for a car’s steering.

Ratings: rather than a binary decision, it is often of interest to provide or interpret predictions about sentiment on a scale, such as a 5-star system.

Page 65: Practical Sentiment Analysis

Other dimensions of sentiment analysis

Perspective: an opinion can be positive or negative depending on who is saying it.

“entry-level” could be good or bad for different people

it also affects how an author describes a topic: e.g. pro-choice vs pro-life, affordable health care vs Obamacare.

Authority: was the text written by someone whose opinion matters more than others?

it is more important to identify and address negative sentiment expressed by a popular blogger than by a one-off commenter or the supplier of a product review on a sales site

follower graphs (where applicable) are very useful in this regard

Spam: is the text even valid or at least something of interest?

many tweets and blog post comments are just spammers trying to drive traffic to their sites

Page 66: Practical Sentiment Analysis

Document Classification

Page 67: Practical Sentiment Analysis

Text analysis, in brief

f(text1, text2, ...) = [output1, output2, ...]

Outputs can be:
Sentiment labels
Parts-of-speech
Named entities
Topic assignments
Geo-coordinates
Syntactic structures
Translations

Ways to obtain f:
Rules
Annotation & learning
- annotated examples
- annotated knowledge
- interactive annotation and learning
Scalable human annotation

Page 70: Practical Sentiment Analysis

Document classification: automatically label some text

Language identification: determine the language that a text is written in

Spam filtering: label emails, tweets, blog comments as spam (undesired) or ham (desired)

Routing: label emails to an organization based on which department should respond to them (e.g. complaints, tech support, order status)

Sentiment analysis: label some text as being positive or negative (polarity classification)

Georeferencing: identify the location (latitude and longitude) associated with a text
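Each task in this list has the same shape: a function from a text to a label. The sketch below illustrates that shape for the routing task with a toy keyword classifier. The department names come from the slide; the trigger words, the `default` label, and the function name `route` are assumptions for illustration.

```python
# Hypothetical trigger words per department, chosen only for illustration.
ROUTING_KEYWORDS = {
    "complaints": {"refund", "disappointed", "unacceptable"},
    "tech support": {"error", "crash", "install"},
    "order status": {"shipping", "tracking", "delivery"},
}

def route(email_text, default="general"):
    """Label an email with the department whose keywords it mentions most."""
    words = set(email_text.lower().split())
    best_label, best_hits = default, 0
    for label, keywords in ROUTING_KEYWORDS.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best_label, best_hits = label, hits
    return best_label

print(route("the app shows an error after install"))
print(route("where is my tracking number"))
```

Swapping the label set and the decision logic gives the other tasks (spam vs ham, positive vs negative, and so on); the interface stays the same.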

Page 71: Practical Sentiment Analysis

Desiderata for text analysis function f

Performant
task is well-defined
outputs are meaningful
precision, recall, etc. are measurable and sufficient for desired use

Reasonable cost (time & money)
affordable access to annotated examples and/or knowledge sources
able to exploit indirect or noisy annotations
access to unlabeled examples and ability to exploit them
tools to learn f are available or can be built within budget

Page 73: Practical Sentiment Analysis

Four sentiment datasets

Dataset | Topic | Year | # Train | # Dev | # Test | Reference
Debate08 | Obama vs McCain debate | 2008 | 795 | 795 | 795 | Shamma et al. (2009) "Tweet the Debates: Understanding Community Annotation of Uncollected Sources."
HCR | Health care reform | 2010 | 839 | 838 | 839 | Speriosu et al. (2011) "Twitter Polarity Classification with Label Propagation over Lexical Links and the Follower Graph."
STS | (Stanford) Twitter Sentiment | 2009 | - | 216 | - | Go et al. (2009) "Twitter sentiment classification using distant supervision"
IMDB | IMDB movie reviews | 2011 | 25,000 | 25,000 | - | Maas et al. (2011) "Learning Word Vectors for Sentiment Analysis"

Page 74: Practical Sentiment Analysis

Rule-based classification

Identify words and patterns that are indicative of positive or negative sentiment:

polarity words: e.g. good, great, love; bad, terrible, hate

polarity ngrams: the shit (+), must buy (+), could care less (-)

casing: uppercase often indicates subjectivity

punctuation: lots of ! and ? indicates subjectivity (often negative)

emoticons: smiles like :) are generally positive, while frowns like :( are generally negative

Use each pattern as a rule; if present in the text, the rule indicates whether the text is positive or negative.

How to deal with conflicts? (E.g. multiple rules apply, but indicate both positive and negative?)

Simple: count number of matching rules and take the max.
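The count-and-take-the-max idea can be sketched as follows. The rule set is a small illustrative sample based on the patterns listed on this slide, and returning "neutral" on a tie or when no rule fires is an assumption; the slide does not say how ties are broken.

```python
import re
from collections import Counter

# Hypothetical rules as (regular expression, label) pairs; not the tutorial's rule set.
RULES = [
    (r"\b(good|great|love)\b", "positive"),
    (r"\b(bad|terrible|hate)\b", "negative"),
    (r"\bmust buy\b", "positive"),
    (r"\bcould care less\b", "negative"),
    (r":\)", "positive"),
    (r":\(", "negative"),
]

def classify_by_rules(text):
    """Count matching rules per label and return the label with the most votes.
    Returns "neutral" when no rule fires or when the vote is tied (an assumption)."""
    votes = Counter()
    lowered = text.lower()
    for pattern, label in RULES:
        if re.search(pattern, lowered):
            votes[label] += 1
    if votes["positive"] > votes["negative"]:
        return "positive"
    if votes["negative"] > votes["positive"]:
        return "negative"
    return "neutral"

print(classify_by_rules("I love it, a must buy :)"))
print(classify_by_rules("Great screen but terrible battery"))
```

Each rule contributes at most one vote per text, so repeating the same word does not inflate the count. Weighting rules differently would be a natural next step.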

Page 75: Practical Sentiment Analysis

Simplest polarity classifier ever?

def polarity(document) =
  if (document contains "good") positive
  else if (document contains "bad") negative
  else neutral

Accuracy (%) of this classifier on each dataset:

Debate08 | HCR | STS | IMDB
20.5 | 21.6 | 19.4 | 27.4

No better than flipping a (three-way) coin?

Code and data here: https://github.com/utcompling/sastut
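For readers who want to try the baseline without the linked repository, here is a small Python sketch of the same rule together with an accuracy check. It is written independently of the repository's code. Matching on lowercased whitespace tokens is an assumption, and the four example documents are made up rather than drawn from any of the datasets.

```python
def polarity(document):
    """Baseline from the slide: 'good' -> positive, 'bad' -> negative, else neutral.
    Assumption: matching is done on lowercased whitespace tokens."""
    tokens = document.lower().split()
    if "good" in tokens:
        return "positive"
    if "bad" in tokens:
        return "negative"
    return "neutral"

def accuracy(documents, gold_labels):
    """Fraction of documents whose predicted label equals the gold label."""
    correct = sum(polarity(d) == g for d, g in zip(documents, gold_labels))
    return correct / len(gold_labels)

# Made-up examples, not taken from Debate08, HCR, STS, or IMDB.
docs = ["a good debate", "that was bad", "health care vote today", "not good at all"]
gold = ["positive", "negative", "neutral", "negative"]
print(accuracy(docs, gold))
```

The last example shows the kind of error this baseline makes: “not good at all” contains “good” and is labeled positive.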


Page 84: Practical Sentiment Analysis

The confusion matrix

We need to look at the confusion matrix and the breakdown for each label. For example, here it is for Debate08 (+ is positive, - is negative, ~ is neutral). Rows are corpus labels; columns are machine predictions.

            -      ~      +   Total
-           5    442      7     454
~           1    140      0     141
+           0    182     18     200
Total       6    764     25     795

The first row shows the outcomes of documents labeled negative in the corpus; the first column shows the outcomes of documents labeled negative by the machine. Cells on the diagonal (5, 140, 18) are correct predictions and all other cells are incorrect predictions. The bottom-right cell (795) is the total count of documents in the corpus.

Accuracy: (5+140+18)/795 = 0.205
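
The accuracy calculation can be sketched in a few lines of Python. This is only an illustration: the nested-list layout (rows are corpus labels, columns are machine predictions, both ordered negative, neutral, positive) and the function name `accuracy` are my own choices, not something from the slides.

```python
# Debate08 confusion matrix from the slide.
# Rows: corpus labels; columns: machine predictions (order: -, ~, +).
debate08 = [
    [5, 442, 7],
    [1, 140, 0],
    [0, 182, 18],
]

def accuracy(matrix):
    """Fraction of documents on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

print(round(accuracy(debate08), 3))  # 0.205
```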

Page 85: Practical Sentiment Analysis

Precision, Recall, and F-score: per category scores

Precision: the number of correct guesses (true positives) for the category divided by all guesses for it (true positives plus false positives).

Recall: the number of correct guesses (true positives) for the category divided by all the documents that truly belong to that category (true positives plus false negatives).

F-score: derived measure combining precision and recall.

P = TP/(TP+FP)

R = TP/(TP+FN)

F = 2PR/(P+R)

Confusion matrix for Debate08 (rows are corpus labels, columns are machine predictions):

            -      ~      +   Total
-           5    442      7     454
~           1    140      0     141
+           0    182     18     200
Total       6    764     25     795

Per-category scores:

          P      R      F
-       83.3    1.1    2.2
~       18.3   99.3   30.9
+       72.0    9.0   16.0
Avg     57.9   36.5   16.4

Worked examples:

P~ = 140/(140+442+182) = .183

R- = 5/(5+442+7) = .011

F+ = (2 × .72 × .09)/(.72+.09) = .16
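
As a rough sketch of how the per-category scores fall out of the matrix: precision divides each diagonal cell by its column total, recall divides it by its row total. The function name `per_label_scores` and the data layout are assumptions made for this example only.

```python
LABELS = ["-", "~", "+"]

# Rows: corpus labels; columns: machine predictions (order: -, ~, +).
debate08 = [
    [5, 442, 7],
    [1, 140, 0],
    [0, 182, 18],
]

def per_label_scores(matrix, labels):
    """Return {label: (precision, recall, f_score)} for a square confusion matrix."""
    scores = {}
    for k, label in enumerate(labels):
        tp = matrix[k][k]
        predicted = sum(row[k] for row in matrix)  # TP + FP (column total)
        actual = sum(matrix[k])                    # TP + FN (row total)
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        scores[label] = (p, r, f)
    return scores

scores = per_label_scores(debate08, LABELS)
for label, (p, r, f) in scores.items():
    print(f"{label}  P={100 * p:.1f}  R={100 * r:.1f}  F={100 * f:.1f}")
```

The macro average in the "Avg" row is then just the mean of each column of these per-label scores.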

Page 86: Practical Sentiment Analysis

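What does it tell us?

Overall accuracy is low because the model overpredicts neutral: it labels 764 of the 795 documents neutral, while only 141 are neutral in the corpus.

Precision is pretty good for negative (83.3) and okay for positive (72.0). This means the simple rules “has the word ‘good’” and “has the word ‘bad’” are good predictors when they fire. They rarely fire, though, so recall is very low (1.1 for negative, 9.0 for positive).

Confusion matrix for Debate08 (rows are corpus labels, columns are machine predictions):

            -      ~      +   Total
-           5    442      7     454
~           1    140      0     141
+           0    182     18     200
Total       6    764     25     795

Per-category scores:

          P      R      F
-       83.3    1.1    2.2
~       18.3   99.3   30.9
+       72.0    9.0   16.0
Avg     57.9   36.5   16.4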

Page 87: Practical Sentiment Analysis

Where do the rules go wrong?

Confusion matrix for STS (rows are corpus labels, columns are machine predictions):

            -      ~      +   Total
-           0     73      2      75
~           0     31      2      33
+           1     96     11     108
Total       1    200     15     216

Only one tweet is labeled negative by the machine, and it is actually positive: it uses the very positive expression “bad ass” (thus matching “bad”).

Booz Allen Hamilton has a bad ass homegrown social collaboration platform. Way cool! #ttiv

Page 88: Practical Sentiment Analysis

A bigger lexicon (rule set) and a better rule

    pos_words = {"good", "awesome", "great", "fantastic", "wonderful"}
    neg_words = {"bad", "terrible", "worst", "sucks", "awful", "dumb"}

    def polarity(document):
        # Assumes simple lowercased, whitespace-separated tokens.
        words = document.lower().split()
        num_pos = sum(word in pos_words for word in words)  # words also in pos_words
        num_neg = sum(word in neg_words for word in words)  # words also in neg_words
        if num_pos == 0 and num_neg == 0:
            return "neutral"
        elif num_pos > num_neg:
            return "positive"
        else:
            return "negative"

Accuracy on each dataset:

                  Debate08   HCR    STS    IMDB
Super simple      20.5       21.6   19.4   27.4
Small lexicon     21.5       22.1   25.5   51.4

Good improvements for five minutes of effort!

Why such a large improvement for IMDB?

Page 89: Practical Sentiment Analysis

IMDB: no neutrals!

Data is from 10-star movie ratings (>= 7 are pos, <= 4 are neg).

Compare the confusion matrices! (Rows are corpus labels; columns are machine predictions.)

“Good/Bad” rule (accuracy: 27.4)

             -       ~       +   Total
-         2324    5476    4700   12500
~            0       0       0       0
+          651    7325    4524   12500
Total     2975   12801    9224   25000

Small lexicon with counting rule (accuracy: 51.4)

             -       ~       +   Total
-         5744    3316    3440   12500
~            0       0       0       0
+         1147    4247    7106   12500
Total     6891    7563   10546   25000

Page 90: Practical Sentiment Analysis

Sentiment lexicons: Bing Liu’s opinion lexicon

Bing Liu maintains and freely distributes a sentiment lexicon consisting of lists of strings.

Distribution page (direct link to rar archive)

Positive words: 2006

Negative words: 4783

Useful properties: includes mis-spellings, morphological variants, slang, and social-media mark-up

Note: may be used for academic and commercial purposes.

Slide by Chris Potts

Page 91: Practical Sentiment Analysis

Sentiment lexicons: MPQA

The MPQA (Multi-Perspective Question Answering) Subjectivity Lexicon is maintained by Theresa Wilson, Janyce Wiebe, and Paul Hoffmann (Wiebe, Wilson, and Cardie 2005).

Note: distributed under a GNU Public License (not suitable for most commercial uses).

Slide by Chris Potts

Page 92: Practical Sentiment Analysis

Other sentiment lexicons

SentiWordNet (Baccianella, Esuli, and Sebastiani 2010) attaches positive and negative real-valued sentiment scores to WordNet synsets (Fellbaum 1998).

Note: recently changed license to permissive, commercial-friendly terms.

Harvard General Inquirer is a lexicon attaching syntactic, semantic, and pragmatic information to part-of-speech tagged words (Stone, Dunphy, Smith, and Ogilvie 1966).

Linguistic Inquiry and Word Count (LIWC) is a proprietary database consisting of a large number of categorized regular expressions. Its classifications are highly correlated with those of the Harvard General Inquirer.

Slide by Chris Potts

Page 93: Practical Sentiment Analysis

When you have a big lexicon, use it!

                  Debate08   HCR    STS    IMDB
Super simple      20.5       21.6   19.4   27.4
Small lexicon     21.5       22.1   25.5   51.4
Opinion lexicon   47.8       42.3   62.0   73.6

Using Bing Liu’s Opinion Lexicon, scores across all datasets go up dramatically.

Well above (three-way) coin-flipping!

Page 94: Practical Sentiment Analysis

If you don’t have a big lexicon, bootstrap one

There is a reasonably large literature on creating sentiment lexicons, using various sources such as WordNet (knowledge source) and review data (domain-specific data source).

Advantage of review data: often able to obtain easily for many languages.

See Chris Potts’ 2011 SAS tutorial for more details:

http://sentiment.christopherpotts.net/lexicons.html

A simple, intuitive measure is the log-likelihood ratio, which I’ll show for IMDB data.

Page 95: Practical Sentiment Analysis

Log-likelihood ratio: basic recipe

Given: a corpus of positive texts, negative texts, and a held out corpus

For each word in the vocabulary, calculate its probability in each corpus (e.g. its probability in the positive corpus).

Compute its log-likelihood ratio for positive vs negative documents.

Rank all words from highest LLR to lowest.
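
Here is one way the recipe might look in Python. The slide's exact formulas did not survive in this transcript, so everything specific below is an assumption: whitespace tokenization, add-alpha smoothed relative frequencies as the word probabilities, LLR(w) = log P(w | pos) - log P(w | neg), and a minimum-count threshold (`min_count`) standing in for a filter. The held-out corpus is not used here.

```python
import math
from collections import Counter

def llr_scores(pos_docs, neg_docs, alpha=1.0, min_count=1):
    """Rank words by log P(w | pos) - log P(w | neg), highest first.

    Assumptions (not from the slides): whitespace tokenization,
    add-alpha smoothing, and a minimum total count as the filter.
    """
    pos = Counter(w for doc in pos_docs for w in doc.lower().split())
    neg = Counter(w for doc in neg_docs for w in doc.lower().split())
    vocab = set(pos) | set(neg)
    pos_total = sum(pos.values()) + alpha * len(vocab)
    neg_total = sum(neg.values()) + alpha * len(vocab)
    scores = {}
    for w in vocab:
        if pos[w] + neg[w] < min_count:
            continue  # drop rare words
        p_pos = (pos[w] + alpha) / pos_total
        p_neg = (neg[w] + alpha) / neg_total
        scores[w] = math.log(p_pos) - math.log(p_neg)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

ranked = llr_scores(
    ["a wonderful refreshing film", "wonderful acting"],
    ["a pointless waste of film", "pointless and unfunny"],
)
print(ranked[0][0], ranked[-1][0])  # wonderful pointless
```

Words seen in only one corpus get extreme scores, which is why some kind of frequency filter matters on real data.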

Page 98: Practical Sentiment Analysis

LLR examples computed from IMDB reviews

Highest and lowest LLR values without filtering:

+
edie 16.069394855429000
antwone 15.85538381213240
din 15.747494864581600
goldsworthy 15.552434312463400
gunga 15.536930128612500

-
kornbluth -15.090106131301700
kareena -15.11542393212590
tashan -15.233206936755000
hobgoblins -15.233206936755000
slater -15.318364724832600

After filtering:

+
perfection 2.204227744897310
captures 2.0551924704260400
wonderfully 2.020824971323010
powell 1.9933170865620900
refreshing 1.867299924519800

-
pointless -2.477406360270270
blah -2.57814744950696
waste -2.673668672544840
unfunny -2.7084876042405500
seagal -3.6618321047833000

Page 99: Practical Sentiment Analysis

Top 25 filtered positive and negative words using LLR on IMDB

+: perfection captures wonderfully powell refreshing flynn delightful gripping beautifully underrated superb delight welles unforgettable touching favorites extraordinary stewart brilliantly friendship wonderful magnificent finest marie jackie

-: horrible unconvincing uninteresting insult uninspired sucks miserably boredom cannibal godzilla lame wasting remotely awful poorly laughable worst lousy redeeming atrocious pointless blah waste unfunny seagal

Some obvious film domain dependence, but also lots of generally good valence determinations.

Page 100: Practical Sentiment Analysis

Using the learned lexicon

There are various ways to use the LLR ranks:

Take the top N of positive and negative and use them as the positive and negative sets.

Combine the top N with another lexicon (e.g. the super small one or the Opinion Lexicon).

Take the top N and manually prune words that are not generally applicable.

Use the LLR values as the input to a more complex (and presumably more capable) algorithm.

Here we’ll try three things:

IMDB100: the top 100 positive and 100 negative filtered words

IMDB1000: the top 1000 positive and 1000 negative filtered words

Opinion Lexicon + IMDB1000: take the union of positive terms in Opinion Lexicon and IMDB1000, and same for the negative terms.
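
The "top N" and "union" options can be sketched as below. The helper names `top_n_lexicon` and `combine` are hypothetical, the LLR values are rounded from the IMDB examples, and the small lexicon stands in for whichever second lexicon you combine with.

```python
def top_n_lexicon(ranked, n):
    """ranked: (word, llr) pairs sorted from highest to lowest LLR; n >= 1."""
    positive = {word for word, llr in ranked[:n] if llr > 0}
    negative = {word for word, llr in ranked[-n:] if llr < 0}
    return positive, negative

def combine(lexicon_a, lexicon_b):
    """Union the positive sets and the negative sets of two lexicons."""
    return lexicon_a[0] | lexicon_b[0], lexicon_a[1] | lexicon_b[1]

# Toy ranking built from a few of the filtered IMDB examples.
ranked = [("perfection", 2.204), ("captures", 2.055), ("wonderfully", 2.021),
          ("waste", -2.674), ("unfunny", -2.708), ("seagal", -3.662)]
small = ({"good", "awesome", "great", "fantastic", "wonderful"},
         {"bad", "terrible", "worst", "sucks", "awful", "dumb"})

imdb2 = top_n_lexicon(ranked, 2)
pos_words, neg_words = combine(small, imdb2)
print(sorted(imdb2[0]), sorted(imdb2[1]))
```

One thing to watch with a plain union: a word can end up in both the positive and negative sets if the two lexicons disagree, so a real system would need a tie-breaking policy.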

Page 102: Practical Sentiment Analysis

Better lexicons can get pretty big improvements!

                             Debate08   HCR    STS    IMDB
Super simple                 20.5       21.6   19.4   27.4
Small lexicon                21.5       22.1   25.5   51.4
Opinion lexicon              47.8       42.3   62.0   73.6
IMDB100                      24.1       22.6   35.7   77.9
IMDB1000                     58.0       45.6   50.5   66.0
Opinion Lexicon + IMDB1000   62.4       49.1   56.0   66.1

Nonetheless: for the reasons mentioned previously, this strategy eventually runs out of steam. It is a starting point.

Page 103: Practical Sentiment Analysis

Machine learning for classification

The rule-based approach requires defining a set of ad hoc rules and explicitly managing their interaction.

If we instead have lots of examples of texts of different categories, we can learn a function that maps new texts to one category or the other.

What were rules become features that are extracted from the input; their importance is learned from statistics in a labeled training set.

These features are dimensions; their values for a given text place it as a point in that space.

Page 104: Practical Sentiment Analysis

Machine learning for classification

Idea: software learns from examples it has seen.

Find the boundary between different classes of things, such as spam versus not-spam emails.

[Figure: Ham vs. Spam]

Page 108: Practical Sentiment Analysis

Machine learning for classification

Given a set of labeled points, there are many standard methods for learning linear classifiers. Some popular ones are:

Naive Bayes

Logistic Regression / Maximum Entropy

Perceptrons

Support Vector Machines (SVMs)

The properties of these classifier types are widely covered in tutorials, code, and homework problems.

There are various reasons to prefer one or the other of these, depending on amount of training material, tolerance for longer training times, and the complexity of features used.
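
To make "learning a linear classifier" concrete, here is a minimal binary perceptron over bag-of-words features. It is a toy sketch, not the setup used for the experiments in this tutorial: the function names, the whitespace tokenizer, the +1/-1 label encoding, and the four training examples are all made up for illustration.

```python
from collections import Counter, defaultdict

def features(text):
    """Bag-of-words counts, one feature per lowercased token."""
    return Counter("w=" + tok for tok in text.lower().split())

def train_perceptron(examples, epochs=10):
    """examples: list of (text, label) pairs with label +1 or -1."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for text, label in examples:
            feats = features(text)
            score = sum(weights[f] * v for f, v in feats.items())
            if label * score <= 0:  # misclassified (or on the boundary)
                for f, v in feats.items():
                    weights[f] += label * v
    return weights

def predict(weights, text):
    score = sum(weights.get(f, 0.0) * v for f, v in features(text).items())
    return 1 if score > 0 else -1

train = [
    ("great movie", 1), ("wonderful acting", 1),
    ("terrible movie", -1), ("awful acting", -1),
]
w = train_perceptron(train)
print(predict(w, "great acting"), predict(w, "awful movie"))  # 1 -1
```

The learned weights play the role the hand-written rules played before: words like "movie" and "acting" end up with zero weight, while the polarity words carry the decision.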

Page 109: Practical Sentiment Analysis

Features for document classification

All of the linear classifiers require documents to be represented as points in some n-dimensional space.

Each dimension corresponds to a feature, or observation about a subpart of a document.

A feature’s value is typically the number of times it occurs.

Ex: Consider the document “That new 300 movie looks sooo friggin bad ass. Totally BAD ASS!” The feature “the lowercase form of the word ‘bad’” has a value of 2, and the feature “is_negative_word” would be 4 (“bad”,“ass”,“BAD”,“ASS”).

For many document classification tasks (e.g. spam classification), bag-of-words features are unreasonably effective.

However, for more subtle tasks, including polarity classification, we usually employ more interesting features.
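
The counting in the example above can be reproduced with a short sketch. The regular-expression tokenizer and the two-word `NEGATIVE_WORDS` set are assumptions chosen only to mirror the example; a real system would use a proper tokenizer and lexicon.

```python
import re
from collections import Counter

NEGATIVE_WORDS = {"bad", "ass"}  # toy list chosen to mirror the example

def extract_features(document):
    """Count lowercased word features plus one lexicon-based feature."""
    tokens = re.findall(r"[a-z0-9]+", document.lower())
    feats = Counter("w=" + t for t in tokens)
    feats["is_negative_word"] = sum(t in NEGATIVE_WORDS for t in tokens)
    return feats

doc = "That new 300 movie looks sooo friggin bad ass. Totally BAD ASS!"
feats = extract_features(doc)
print(feats["w=bad"], feats["is_negative_word"])  # 2 4
```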

Page 119: Practical Sentiment Analysis

Features for classification

That new 300 movie looks sooo friggin BAD ASS .

Word features: w=that w=new w=300 w=movie w=looks w=sooo w=friggin w=bad w=ass

Normalized word feature: w=so

Bigram features: bi=<START>_that bi=that_new bi=new_300 bi=300_movie bi=movie_looks bi=looks_sooo bi=sooo_friggin bi=friggin_bad bi=bad_ass bi=ass_. bi=._<END>

Part-of-speech tags: art adj noun noun verb adv adv adj noun punc

Word/tag features: wt=that_art wt=new_adj wt=300_noun wt=movie_noun wt=looks_verb wt=sooo_adv wt=friggin_adv wt=bad_adj wt=ass_noun

[Figure: parse tree over the sentence, with NP, VP, and S nodes]

Subtree features: subtree=NP_sooo_bad_ass subtree=S_NP_movie-S_VP_looks-S_VP_NP_bad_ass

FEATURE ENGINEERING... (deep learning might help ease the burden)
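
The bigram and word/tag features are easy to generate once the sentence is tokenized and tagged. In this sketch the tokens and tags are typed in by hand (a real pipeline would get the tags from a part-of-speech tagger), and the function names are my own.

```python
def bigram_features(tokens):
    """Adjacent token pairs, padded with <START> and <END> markers."""
    padded = ["<START>"] + tokens + ["<END>"]
    return ["bi=%s_%s" % (a, b) for a, b in zip(padded, padded[1:])]

def word_tag_features(tokens, tags):
    """Word/tag pairs, skipping punctuation as in the example."""
    return ["wt=%s_%s" % (w, t) for w, t in zip(tokens, tags) if t != "punc"]

tokens = "that new 300 movie looks sooo friggin bad ass .".split()
tags = "art adj noun noun verb adv adv adj noun punc".split()

print(bigram_features(tokens))
print(word_tag_features(tokens, tags))
```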

Page 120: Practical Sentiment Analysis

Complexity of features

Features can be defined on very deep aspects of the linguistic content, including syntactic and rhetorical structure.

The models for these can be quite complex, and often require significant training material to learn them, which means it is harder to employ them for languages without such resources.

I’ll show an example for part-of-speech tagging in a bit.

Also: the more fine-grained the feature, the more likely it is to be rare in one’s training corpus. This requires more training data, or effective semi-supervised learning methods.

Page 121: Practical Sentiment Analysis

Recall the four sentiment datasets

Dataset    Topic                          Year   # Train   # Dev    # Test
Debate08   Obama vs McCain debate         2008   795       795      795
HCR        Health care reform             2010   839       838      839
STS        (Stanford) Twitter Sentiment   2009   -         216      -
IMDB       IMDB movie reviews             2011   25,000    25,000   -

References:

Debate08: Shamma et al. (2009) "Tweet the Debates: Understanding Community Annotation of Uncollected Sources."

HCR: Speriosu et al. (2011) "Twitter Polarity Classification with Label Propagation over Lexical Links and the Follower Graph."

STS: Go et al. (2009) "Twitter sentiment classification using distant supervision"

IMDB: Maas et al. (2011) "Learning Word Vectors for Sentiment Analysis"

Page 122: Practical Sentiment Analysis

Logistic regression, in domain

When training on labeled documents from the same corpus.

                                           Debate08   HCR    STS    IMDB
Opinion Lexicon + IMDB1000                 62.4       49.1   56.0   66.1
Logistic Regression w/ bag-of-words        60.9       56.0   n/a    86.7
Logistic Regression w/ extended features   70.2       60.5   n/a    -

n/a: no labeled training set.

Models trained with Liblinear (via ScalaNLP Nak)

Note: for IMDB, the logistic regression classifier only predicts positive or negative (because there are no neutral training examples), effectively making it easier than for the lexicon-based method.

Page 123: Practical Sentiment Analysis

Logistic regression (using extended features), cross-domain

Rows are training corpora; columns are evaluation corpora.

               Debate08   HCR    STS
Debate08       70.2       51.3   56.5
HCR            56.4       60.5   54.2
Debate08+HCR   70.3       61.2   59.7

In-domain training examples add roughly 10-15% absolute accuracy (56.4 -> 70.2 for Debate08, and 51.3 -> 60.5 for HCR).

More labeled examples almost always help, especially if you have no in-domain training data (e.g. 56.5/54.2 -> 59.7 for STS).

Page 124: Practical Sentiment Analysis

Accuracy isn’t enough, part 1

The class balance can shift considerably without affecting the accuracy! (Rows are corpus labels; columns are machine predictions.)

D08+HCR on STS: accuracy = (58+24+47)/216 = 59.7

            -      ~      +   Total
-          58     12      5      75
~           7     24      2      33
+          34     27     47     108
Total      99     63     54     216

(Made up) Positive-heavy classifier: accuracy = (8+15+106)/216 = 59.7

            -      ~      +   Total
-           8     12     55      75
~           7     15     11      33
+           1      1    106     108
Total      16     28    172     216

Page 125: Practical Sentiment Analysis

Accuracy isn’t enough, part 1

Need to also consider the per-category precision, recall, and F-score.

Confusion matrix (accuracy: 59.7):

            -      ~      +   Total
-          58     12      5      75
~           7     24      2      33
+          34     27     47     108
Total      99     63     54     216

Per-category scores:

          P      R      F
-       58.6   77.3   66.7
~       38.1   72.7   50.0
+       87.0   43.5   58.0
Avg     61.2   64.5   58.2

Big differences in precision for the three categories!

Page 126: Practical Sentiment Analysis

Accuracy isn’t enough, part 2

Errors on neutrals are typically less grievous than positive/negative errors, yet raw accuracy makes one pay the same penalty.

D08+HCR on STS:

            -      ~      +   Total
-          58     12      5      75
~           7     24      2      33
+          34     27     47     108
Total      99     63     54     216

One solution: allow varying penalties such that no points are awarded for positive/negative errors, but some partial credit is given for positive/neutral and negative/neutral ones.
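
A partial-credit score along these lines might be computed as follows. The credit of 0.5 for confusions involving neutral is an arbitrary value picked for illustration, and the function names are hypothetical.

```python
LABELS = ["-", "~", "+"]

def credit(gold, predicted, partial=0.5):
    """Full credit if correct, partial credit if neutral is involved, else none."""
    if gold == predicted:
        return 1.0
    if "~" in (gold, predicted):  # positive/neutral or negative/neutral confusion
        return partial
    return 0.0                    # positive/negative confusion

def weighted_accuracy(matrix, partial=0.5):
    """matrix rows are corpus labels, columns are machine predictions."""
    total = sum(sum(row) for row in matrix)
    earned = sum(
        count * credit(LABELS[i], LABELS[j], partial)
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return earned / total

sts = [[58, 12, 5], [7, 24, 2], [34, 27, 47]]
print(round(weighted_accuracy(sts, partial=0.0), 3))  # 0.597, plain accuracy
print(round(weighted_accuracy(sts, partial=0.5), 3))  # 0.708
```

Setting `partial` to 0 recovers raw accuracy, so the gap between the two numbers shows how many of the errors are the less grievous neutral confusions.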

Page 127: Practical Sentiment Analysis

Accuracy isn’t enough, part 3

Who says the gold standard is correct? There is often significant variation among human annotators, especially for positive vs neutral and negative vs neutral.

Solution one: work on your annotations (including creating conventions) until you get very high inter-annotator agreement.

This arguably reduces the linguistic variability/subtlety characterized in the annotations.

Also, humans often fail to get the intended sentiment, e.g. sarcasm.

Solution two: measure performance differently. For example, given a set of examples annotated by three or more human annotators and the machine, is the machine distinguishable from the humans in terms of the amount it disagrees with their annotations?

Page 128: Practical Sentiment Analysis

Accuracy isn’t enough, part 4

Often, what is of interest is an aggregate sentiment for some topic or target. E.g. given a corpus of tweets about cars, 80% of the mentions of the Ford Focus are positive while 70% of the mentions of the Chevy Malibu are positive.

Note: you can get the sentiment value wrong for some of the documents while still getting the overall, aggregate sentiment correct (as errors can cancel each other).

Note also: generally, this requires aspect-based analysis (more later).

Page 129: Practical Sentiment Analysis

Caveat emptor, part 1

In measuring accuracy, the methodology can vary dramatically from vendor to vendor, at times in unclear ways.

For example, some seem to measure accuracy by presenting a human judge with examples annotated by a machine. The human then marks which examples they believe were incorrect. Accuracy is then num_correct/num_examples.

Problem: people get lazy and often end up giving the machine the benefit of the doubt.

I have even heard that some vendors take their high-confidence examples and do the above exercise. This is basically cheating: high-confidence machine label assignments are on average more correct than low-confidence ones.

Page 130: Practical Sentiment Analysis

Caveat emptor, part 2

Performance on in-domain data is nearly always better than out-of-domain (see the previous experiments).

The nature of the world is that the language of today is a step away from the language of yesterday (when you developed your algorithm or trained your model).

Also, because there are so many things to talk about (and because people talk about everything), a given model is usually going to end up employed in domains it never saw in its training data.

Page 131: Practical Sentiment Analysis

Caveat emptor, part 3

With nice, controlled datasets like those given previously, the experimenter has total control over which documents her algorithm is applied to.

However, a deployed system will likely confront many irrelevant documents, e.g.

documents written in other languages

documents that match, but which are not about the target of interest (e.g. Sprint the company wants tweets by its customers, but also gets many tweets from people talking about the activity of sprinting)

documents that should have matched, but were missed in retrieval

Thus, identification of relevant documents, and even of sub-documents with relevant targets, is an important component of end-to-end sentiment solutions.

Page 132: Practical Sentiment Analysis

Aspect-based sentiment analysis

Outline: Why NLP is hard, Sentiment analysis overview, Document classification, Aspect-based sentiment analysis (this section), Visualization, Semi-supervised learning, Stylistics & author modeling, Beyond text, Wrap up

Page 133: Practical Sentiment Analysis

Is it coherent to ask what the sentiment of a document is?

Documents tend to discuss many entities and ideas, and they can express varying opinions, even toward the same entity.

This is true even in tweets, e.g.

Here's a #hcr proposal short enough for Mitch McConnell to read: pass the damn bill now

This tweet is positive towards the HCR bill but negative towards Mitch McConnell.

Page 134: Practical Sentiment Analysis

Fine-grained sentiment

Two products, iPhone and Blackberry

Overall positive to iPhone, negative to Blackberry

Positive aspects/features of iPhone: touch screen, voice quality. Negative (for the mother): expensive.

Slide adapted from Bing Liu

Page 135: Practical Sentiment Analysis

Components of fine-grained analysis

Opinion targets: entities and their features/aspects

Sentiment orientations: positive, negative, or neutral

Opinion holders: persons holding the opinions

Time: when the opinions are expressed

Slide adapted from Bing Liu

Page 136: Practical Sentiment Analysis

Entity and aspect (Hu and Liu, 2004; Liu, 2006)

An entity e is a product, person, event, organization, or topic. e is represented as a hierarchy of components, sub-components, and so on.

Each node represents a component and is associated with a set of attributes of the component.

An opinion can be expressed on any node or attribute of the node.

For simplicity, we use the term aspects (features) to represent both components and attributes.

Example hierarchy:

iPhone {cost, size, appearance, ...}
    screen {...}
    battery {battery_life, size, ...}
    ...

Slide adapted from Bing Liu

Page 137: Practical Sentiment Analysis

Opinion definition (Liu, Ch. in NLP handbook, 2010)

An opinion is a quintuple (e, a, so, h, t) where:

e is a target entity.

a is an aspect/feature of the entity e.

so is the sentiment value of the opinion from the opinion holder h on feature a of entity e at time t. so is positive, negative or neutral (or more granular ratings).

h is an opinion holder.

t is the time when the opinion is expressed.

Examples from the previous passage:

(iPhone, GENERAL, +, Abc123, 5-1-2008)
(iPhone, touch_screen, +, Abc123, 5-1-2008)
(iPhone, cost, -, mother_of(Abc123), 5-1-2008)

Slide adapted from Bing Liu
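
Once opinions are in quintuple form they are ordinary structured records. A minimal sketch, assuming a named tuple with field names of my own choosing and a hypothetical `positive_share` helper for a simple aggregate:

```python
from typing import NamedTuple

class Opinion(NamedTuple):
    entity: str
    aspect: str
    orientation: str  # "+", "-", or "~"
    holder: str
    time: str

opinions = [
    Opinion("iPhone", "GENERAL", "+", "Abc123", "5-1-2008"),
    Opinion("iPhone", "touch_screen", "+", "Abc123", "5-1-2008"),
    Opinion("iPhone", "cost", "-", "mother_of(Abc123)", "5-1-2008"),
]

def positive_share(opinions, entity):
    """Fraction of opinions about `entity` that are positive (None if there are none)."""
    relevant = [o for o in opinions if o.entity == entity]
    if not relevant:
        return None
    return sum(o.orientation == "+" for o in relevant) / len(relevant)

print(positive_share(opinions, "iPhone"))
```

The same records can be grouped by aspect, holder, or time, which is what makes them convenient input for standard analysis tools.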

Page 138: Practical Sentiment Analysis

The goal: turn unstructured text into structured opinions

Given an opinionated document (or set of documents), discover all quintuples (e, a, so, h, t), or solve a simpler form of the problem, such as the document-level task considered earlier.

Having extracted the quintuples, we can feed them into traditional visualization and analysis tools.

Slide adapted from Bing Liu

Page 139: Practical Sentiment Analysis

Several sub-problems

e is a target entity: Named Entity Extraction (more)

a is an aspect of e: Information Extraction

so is sentiment: Sentiment Identification

h is an opinion holder: Information/Data Extraction

t is the time: Information/Data Extraction

73

Slide adapted from Bing Liu

All of these tasks can make use of deep language processing methods, including parsing, coreference, word sense disambiguation, etc.

Wednesday, March 5, 14

Page 140: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Named entity recognition

Given a document, identify all text spans that mention an entity (person, place, organization, or other named thing).

Requires having performed tokenization, and possibly part-of-speech tagging.

Though it is a bracketing task, it can be transformed into a sequence labeling task using BIO labels (Begin, Inside, Outside).

Usually, discriminative sequence models like Maxent Markov Models and Conditional Random Fields are trained on such sequences, and used for prediction.

74

Mr. [John Smith]Person traveled to [New York City]Location to visit [ABC Corporation]Organization.

Mr./O John/B-PER Smith/I-PER traveled/O to/O New/B-LOC York/I-LOC City/I-LOC to/O visit/O ABC/B-ORG Corporation/I-ORG
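The bracketing-to-sequence transformation can be sketched in a few lines. This is a minimal illustration, assuming entity spans are given as (start, end-exclusive, label) token offsets; the function name `spans_to_bio` is hypothetical.

```python
def spans_to_bio(tokens, spans):
    """Convert bracketed entity spans into one BIO tag per token.

    spans: list of (start, end_exclusive, label) over token indices.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = "B-" + label          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = "I-" + label          # remaining tokens of the entity
    return tags


tokens = "Mr. John Smith traveled to New York City to visit ABC Corporation".split()
spans = [(1, 3, "PER"), (5, 8, "LOC"), (10, 12, "ORG")]
tags = spans_to_bio(tokens, spans)
```

Once every token carries a tag, any sequence model (MEMM, CRF) can be trained on the token/tag pairs.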

Wednesday, March 5, 14

Page 141: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

OpenNLP Pipeline demo

Sentence detection

Tokenization

Part-of-speech tagging

Chunking

NER: persons and organizations

75

Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29. Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group. Rudolph Agnew, 55 years old and former chairman of Consolidated Gold Fields PLC, was named a director of this British industrial conglomerate.

Wednesday, March 5, 14

Page 142: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Things are tricky in Twitterland - need domain adaptation

76

.@Peters4Michigan's camp tells me he'll appear w Obama in MI tomorrow. Not as scared as other Sen Dems of the prez:

.[@Peters4Michigan]PER 's camp tells me he'll appear w [Obama]PER in [MI]LOC tomorrow. Not as scared as other Sen [Dems]ORG of the prez:

Named entities referred to with @-mentions (makes things easier, but also harder for a model trained solely on newswire text)

Tokenization: many new innovations, including “.@account” at the beginning of a tweet (which blocks it from being an @-reply to that account)

Abbreviations mess with features learned on standard text, e.g. “w” for with (as above), or even for George W. Bush:

And who changed that? Remember Dems, many on foreign soil, criticizing W vehemently? Speaking of rooting against a Prez ....

Wednesday, March 5, 14

Page 143: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Identifying targets and aspects

We can specify targets, their sub-components, and their attributes:

But language is varied and evolving, so we are likely to miss many ways to refer to targets and their aspects.

E.g. A person declaring knowledge about phones might forget (or not even know) that “juice” is a way of referring to power consumption.

Also: there are many ways of referring to product lines (and their various releases, e.g. iPhone 4s) and their competitors, and we often want to identify these semi-automatically.

Much research has worked on bootstrapping these. See Bing Liu’s tutorial for an excellent overview:

http://www.cs.uic.edu/~liub/FBS/Sentiment-Analysis-tutorial-AAAI-2011.pdf

77

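[Figure: entity hierarchy. Root node iPhone with attributes {cost, size, appearance, ...}; component nodes screen {...} and battery {battery_life, size, ...}; further components elided (...).]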

Wednesday, March 5, 14

Page 144: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Target-based feature engineering

Given a sentence like “We love how the Porsche Panamera drives, but its bulbous exterior is unfortunately ugly.”

NER to identify the “Porsche Panamera” as the target

Aspect identification to see that opinions are being expressed about the car’s driving and styling.

Sentiment analysis to identify positive sentiment toward the driving and negative toward the styling.

Targeted sentiment analysis requires positional features

use string relationship to the target or aspect

or use features from a parse of the sentence (if you can get it)

78

Wednesday, March 5, 14

Page 145: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

In addition to the standard document-level features used previously, we build features particularized for each target.

These are just a subset of the many possible features.

Positional features

79

We love how the Porsche Panamera drives, but its bulbous exterior is unfortunately ugly.
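The highlighting that showed the individual features on this slide is lost in the transcript, so the following is only a guess at the flavor of such features, not the set actually shown. It marks each word by which side of the target it falls on and whether it is within a small window; the feature names and the window size of 3 are assumptions.

```python
def positional_features(tokens, target_start, target_end, window=3):
    """Word features tagged by their position relative to a target span.

    target_start/target_end are token offsets (end exclusive).
    """
    feats = []
    for i, tok in enumerate(tokens):
        if target_start <= i < target_end:
            continue  # skip the target mention itself
        word = tok.lower()
        if i < target_start:
            side, dist = "before", target_start - i
        else:
            side, dist = "after", i - target_end + 1
        feats.append(f"{side}_target={word}")
        if dist <= window:
            feats.append(f"{side}_target_within_{window}={word}")
    return feats


sentence = ("We love how the Porsche Panamera drives , "
            "but its bulbous exterior is unfortunately ugly .")
tokens = sentence.split()
start = tokens.index("Porsche")
features = positional_features(tokens, start, start + 2)
```

Here "love" and "drives" fall inside the window around "Porsche Panamera", while "ugly" is recorded only as occurring somewhere after the target, which lets a classifier weight nearby evidence differently from distant evidence.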

Wednesday, March 5, 14

Page 149: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Challenges

Positional features greatly expand the space of possible features.

We need more training data to estimate parameters for such features.

Highly specific features increase the risk of overfitting to whatever training data you have.

Deep learning has a lot of potential to help with learning feature representations that are effective for the task by reducing the need for careful feature engineering.

But obviously: we need to be able to use this sort of evidence in order to do the job well via automated means.

80

Wednesday, March 5, 14

Page 150: Practical Sentiment Analysis

Visualization

Why NLP is hard

Sentiment analysis overview

Document classification

Aspect-based sentiment analysis

Semi-supervised learning

Stylistics & author modeling

Beyond text

Wrap up

Wednesday, March 5, 14

Page 151: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Visualize, but be careful when doing so

It's often the case that a visualization can capture nuances in the data that numerical or linguistic summaries cannot easily capture.

Visualization is an art and a science in its own right. The following advice from Tufte (2001, 2006) is easy to keep in mind (if only so that your violations of it are conscious and motivated):

Draw attention to the data, not the visualization.

Use a minimum of ink.

Avoid creating graphical puzzles.

Use tables where possible.

82

Slide by Chris Potts

Wednesday, March 5, 14

Page 152: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Sentiment lexicons: SentiWordNet

83

Slide by Chris Potts

Wednesday, March 5, 14

Page 153: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Twitter Sentiment results for Netflix.

84

Slide by Chris Potts

Wednesday, March 5, 14

Page 154: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Twitrratr blends the data and summarization together

85

Slide by Chris Potts

Wednesday, March 5, 14

Page 155: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Relationships between modifiers in WordNet similar-to graph

86

Slide by Chris Potts

Wednesday, March 5, 14

Page 156: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Relationships between modifiers in WordNet similar-to graph

87

Slide by Chris Potts

Wednesday, March 5, 14

Page 157: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Visualizing discussions: Wikipedia deletions [http://notabilia.net/]

88

Could be used as a visualization for evolving sentiment over time in a discussion among many individuals.

Wednesday, March 5, 14

Page 158: Practical Sentiment Analysis

Semi-supervised Learning

Why NLP is hard

Sentiment analysis overview

Document classification

Aspect-based sentiment analysis

Visualization

Stylistics & author modeling

Beyond text

Wrap up

Wednesday, March 5, 14

Page 161: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Scaling

90

Scaling for text analysis tasks typically requires more than big computation or big data. ➡ Most interesting tasks involve representations “below” the text itself.

Being “big” helps when you know what you are computing and how you can compute it. ➡ GIGO, and “big” garbage is still garbage.

Scaling often requires being creative about how to learn f from relatively little explicit information about the task. ➡ Semi-supervised methods and indirect supervision to the rescue.

Wednesday, March 5, 14

Page 162: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Scaling annotations

91

Wednesday, March 5, 14

Page 172: Practical Sentiment Analysis

© 2013 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Accurate tool

Extremely low annotation

?

92

Annotation is relatively expensive

We lack sufficient resources for most languages, most domains and most problems.

Semi-supervised learning approaches become essential. ➡ See Philip Resnik’s SAS 2011 keynote: http://vimeo.com/32506363

Wednesday, March 5, 14

Page 176: Practical Sentiment Analysis

© 2013 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Example: Learning part-of-speech taggers

93

They often book flights .
N Adv V N PUNC

The red book fell .
D Adj N V PUNC

POS taggers are usually trained on hundreds of thousands of annotated word tokens. What if we have almost nothing?

Wednesday, March 5, 14

Page 183: Practical Sentiment Analysis

© 2013 Jason M Baldridge Sentiment Analysis Symposium, March 2014

The overall strategy: grow, shrink, learn

94

annotation ➡ Tag Dict Generalization (cover the vocabulary) ➡ Model Minimization (remove noise) ➡ EM (train) ➡ HMM

Wednesday, March 5, 14

Page 184: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Extremely low annotation scenario [Garrette & Baldridge 2013]

Obtain word types or tokens annotated with their parts-of-speech by a linguist in under two hours

95

Types: construct a tag dictionary from scratch (not simulated)

the: D
book: N, V
often: Adv
red: Adj, N

Tokens: standard word-by-word annotation

They often book flights .
N Adv V N PUNC

The red book fell .
D Adj N V PUNC
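Both annotation styles end up constraining which tags each word may take. As a sketch only (the slide presents type and token annotation as two separate modes, and merging them here is an illustrative choice), a tag dictionary can be assembled like this; `build_tag_dictionary` is a hypothetical helper name.

```python
from collections import defaultdict


def build_tag_dictionary(type_annotations, token_annotations):
    """Map each lowercased word to the set of tags observed for it."""
    tag_dict = defaultdict(set)
    # Type annotations: a linguist lists the possible tags for a word directly.
    for word, tags in type_annotations.items():
        tag_dict[word.lower()].update(tags)
    # Token annotations: tags are read off word-by-word labeled sentences.
    for sentence in token_annotations:
        for word, tag in sentence:
            tag_dict[word.lower()].add(tag)
    return tag_dict


type_annotations = {
    "the": {"D"},
    "book": {"N", "V"},
    "often": {"Adv"},
    "red": {"Adj", "N"},
}
token_annotations = [
    list(zip("They often book flights .".split(), "N Adv V N PUNC".split())),
    list(zip("The red book fell .".split(), "D Adj N V PUNC".split())),
]
tag_dict = build_tag_dictionary(type_annotations, token_annotations)
```

Words absent from the dictionary are the ones the later generalization step has to cover.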

Wednesday, March 5, 14

Page 185: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Strategy: connect annotations to raw corpus and propagate them

96

Raw Corpus

Tokens

Types

Wednesday, March 5, 14

Page 200: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Label propagation for video recommendation, in brief

97

Alice

Bob

Eve

Basil Marceaux for Tennessee Governor

Jimmy Fallon: Whip My Hair

Radiohead: Paranoid Android

Pink Floyd: The Wall (Full Movie)

Local updates, so scales easily!
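The "local updates" remark can be made concrete with a toy propagation loop. This is a simplified sketch, not the algorithm used in any particular recommender: the edges between users and videos and the two seed labels ("politics", "music") are invented for illustration, since the slide's graph is not preserved in the transcript.

```python
from collections import defaultdict


def propagate(edges, seeds, iterations=20):
    """Spread seed label distributions over an undirected graph.

    Each unlabeled node repeatedly takes the normalized sum of its
    neighbors' label distributions; seed nodes stay clamped.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    labels = {node: dict(seeds.get(node, {})) for node in neighbors}
    for _ in range(iterations):
        updated = {}
        for node in neighbors:
            if node in seeds:
                updated[node] = dict(seeds[node])  # clamp seeds
                continue
            scores = defaultdict(float)
            for nb in neighbors[node]:             # purely local information
                for label, weight in labels[nb].items():
                    scores[label] += weight
            total = sum(scores.values())
            updated[node] = ({l: w / total for l, w in scores.items()}
                             if total else {})
        labels = updated
    return labels


# Hypothetical "who watched what" edges (assumed for this example).
EDGES = [
    ("Alice", "Basil Marceaux for Tennessee Governor"),
    ("Alice", "Jimmy Fallon: Whip My Hair"),
    ("Bob", "Jimmy Fallon: Whip My Hair"),
    ("Bob", "Radiohead: Paranoid Android"),
    ("Eve", "Radiohead: Paranoid Android"),
    ("Eve", "Pink Floyd: The Wall (Full Movie)"),
]
# Hypothetical seed labels on two videos.
SEEDS = {
    "Basil Marceaux for Tennessee Governor": {"politics": 1.0},
    "Pink Floyd: The Wall (Full Movie)": {"music": 1.0},
}
result = propagate(EDGES, SEEDS)
```

Because each update reads only a node's immediate neighbors, the work per iteration is proportional to the number of edges and can be distributed across machines, which is the scaling point made on the slide.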

Wednesday, March 5, 14

Page 204: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Tag dictionary generalization

Token nodes: TOK_the_1, TOK_dog_2, TOK_the_4, TOK_thug_5

Context feature nodes: PREV_<b>, PREV_the, NEXT_walks

Affix feature nodes: PRE1_t, PRE2_th, SUF1_g

Type nodes: TYPE_the, TYPE_thug, TYPE_dog

Type annotations: the DT; dog NN

Token annotations: the/DT dog/NN walks/VBZ

98
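The node names on this slide suggest how the propagation graph is wired: each token node links to its word type, its neighboring words, and its prefixes and suffixes. The sketch below generates such edges under two assumptions that the transcript does not confirm: the running text is "the dog walks the thug" (the slide shows only tokens 1, 2, 4, and 5), and `<b>` marks the start of the text.

```python
def token_feature_edges(tokens):
    """Link each token node to type, context, and affix feature nodes."""
    edges = []
    for i, word in enumerate(tokens, start=1):      # 1-based positions, as on the slide
        node = f"TOK_{word}_{i}"
        prev_word = tokens[i - 2] if i > 1 else "<b>"
        feats = [
            f"TYPE_{word}",
            f"PREV_{prev_word}",
            f"PRE1_{word[:1]}",
            f"PRE2_{word[:2]}",
            f"SUF1_{word[-1:]}",
        ]
        if i < len(tokens):
            feats.append(f"NEXT_{tokens[i]}")       # tokens[i] is the following word
        edges.extend((node, feat) for feat in feats)
    return edges


tokens = ["the", "dog", "walks", "the", "thug"]     # assumed running text
edges = token_feature_edges(tokens)
```

With this wiring, the unannotated "thug" shares PREV_the and SUF1_g with the annotated "dog", so a tag such as NN can reach it through those shared feature nodes.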

Wednesday, March 5, 14

Page 226: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Result:

• a tag distribution on every token

• an expanded tag dictionary (non-zero tags)

TOK_the_1 TOK_dog_2 TOK_the_4 TOK_thug_5

Tag dictionary generalization

105

Wednesday, March 5, 14

Page 232: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Results (two hours of annotation)

106

[Bar chart: total accuracy (0 to 100) for English, Kinyarwanda, and Malagasy, comparing "EM only" with "+ Our approach" under token annotations and type annotations.]

With 4 hours + a bit more ➡ 90% [Garrette, Mielens, & Baldridge 2013]

Wednesday, March 5, 14

Page 235: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Polarity classification for Twitter

107

+ Obama looks good. #tweetdebate #current

- McCain is not answering the questions #tweetdebate

+ Sen McCain would be a very popular President - $5000 tax refund per family! #tweetdebate

- "it's like you can see Obama trying to remember all the "talking points" and get his slogans out there #tweetdebate"

Logistic regression... and... done!
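To show how little machinery the labeled case needs, here is a dependency-free sketch of bag-of-words logistic regression trained by stochastic gradient ascent on the four example tweets. The whitespace tokenizer, learning rate, and epoch count are arbitrary assumptions, and four tweets are of course far too few for a real model; in practice one would use an established library and much more data.

```python
import math
from collections import defaultdict


def featurize(text):
    """Binary bag-of-words features from lowercased whitespace tokens."""
    return set(text.lower().split())


def train(examples, epochs=200, lr=0.1):
    """Fit weights by stochastic gradient ascent on the log-likelihood."""
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(epochs):
        for text, label in examples:
            feats = featurize(text)
            score = bias + sum(weights[f] for f in feats)
            prob = 1.0 / (1.0 + math.exp(-score))   # P(positive | tweet)
            error = label - prob
            bias += lr * error
            for f in feats:
                weights[f] += lr * error
    return weights, bias


def predict(model, text):
    """Return 1 (positive) or 0 (negative)."""
    weights, bias = model
    score = bias + sum(weights.get(f, 0.0) for f in featurize(text))
    return 1 if score > 0 else 0


examples = [
    ("Obama looks good. #tweetdebate #current", 1),
    ("McCain is not answering the questions #tweetdebate", 0),
    ("Sen McCain would be a very popular President - $5000 tax refund "
     "per family! #tweetdebate", 1),
    ('"it\'s like you can see Obama trying to remember all the "talking points" '
     'and get his slogans out there #tweetdebate"', 0),
]
model = train(examples)
predictions = [predict(model, text) for text, _ in examples]
```

Fitting the training tweets says nothing about generalization; the sketch only illustrates that, given labels, the learning step itself is routine.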

What if instance labels aren’t there?

Page 240: Practical Sentiment Analysis

No explicitly labeled examples?

Positive/negative ratio using polarity lexicon.
➡ Easy & works okay for many cases, but fails spectacularly elsewhere.

Emoticons as labels + logistic regression.
➡ Easy, but emoticon to polarity mapping is actually vexed.

Label propagation using the above as seeds.
➡ Noisy labels provide soft indicators; the graph smooths things out.

If you have annotations, you can use those too.
➡ Including ordered labels like star ratings: see Talukdar & Crammer 2009
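The first two options can be sketched in a few lines. This is only an illustration: the word lists, the emoticon mapping, and the function names below are made up for the example and are not the lexicon or mapping behind any reported results.

```python
import re

# Toy polarity lexicon: an illustrative assumption, not a real resource.
POSITIVE = {"love", "good", "great", "happy", "popular"}
NEGATIVE = {"hate", "bad", "silly", "petty", "awful"}

# Toy emoticon-to-polarity mapping; real mappings are messier.
EMOTICONS = {":)": "+", ";-)": "+", ":(": "-"}


def tokenize(text):
    """Lowercase the text and return its word tokens, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_ratio_label(text):
    """Return '+', '-', or None (no evidence, or a tie) from lexicon counts."""
    tokens = tokenize(text)
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos == neg:
        return None
    return "+" if pos > neg else "-"


def emoticon_label(text):
    """Return a noisy label if the emoticons in the text agree on one polarity."""
    found = {label for emo, label in EMOTICONS.items() if emo in text}
    return found.pop() if len(found) == 1 else None
```

With this toy lexicon, `lexicon_ratio_label("Obama is making the repubs look silly and petty")` returns `"-"` purely because of "silly" and "petty", whatever the writer's stance toward the person named. That is the kind of case where counting lexicon hits breaks down.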

Page 248: Practical Sentiment Analysis

Using social interaction: Twitter sentiment

Obama is making the repubs look silly and petty

bird images from http://www.mytwitterlayout.com/

http://starwars.wikia.com/wiki/R2-D2

is happy Obama is president

Obama’s doing great!

=

(hopefully)

“Obama”, “silly”, “petty”

Papers: Speriosu et al. 2011; Tan et al. KDD 2011

Page 259: Practical Sentiment Analysis

Twitter polarity graph with knowledge and noisy seeds

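[Figure: graph linking users, their tweets, and feature nodes, with polarity labels (+/-)]

Users and tweets:
Alice: "I love #NY! :)", "Ahhh #Obamacare"
Bob: "We can’t pass this :( #killthebill", "I hate #Obamacare! #killthebill"
Eve: "We need health care!", "Let’s get it passed! :)"

Feature nodes:
OpinionFinder: care, hate, love
Word n-grams: we can’t, love ny, i love
Emoticons: ;-) :( :)
Hashtags: killthebill, obamacare, ny

Seed polarity labels (+/-) are attached to nodes in the graph, and each of the six tweets carries a + and a - label score.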

Page 260: Practical Sentiment Analysis

Results: polarity assignment (positive/negative, no neutral)

Method                                   Stanford Twitter Sentiment   Obama-McCain Debate   Health Care Reform
Random                                   50.0                         50.0                  50.0
Lexicon Ratio                            72.1                         59.1                  58.1
Emoticon-trained (Logistic regression)   83.1                         61.3                  62.9
Label propagation                        84.7                         66.7                  71.2

Take-home message: label propagation can make effective use of labeled features (from external knowledge sources) and noisy annotations.
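To make the idea concrete, here is a minimal sketch of propagation over a small graph. It is a deliberately simplified scheme (seed nodes stay fixed, every other node repeatedly takes the mean of its neighbours) chosen for brevity, and is not necessarily the algorithm behind the numbers above. The graph structure, node names, and seed scores are assumptions loosely modelled on the Alice/Bob/Eve example.

```python
def propagate(edges, seeds, iterations=50):
    """Propagate polarity scores in [-1, 1] over an undirected graph.

    edges: iterable of (node, node) pairs.
    seeds: dict mapping seed nodes to fixed scores (+1 positive, -1 negative).
    Non-seed nodes repeatedly take the mean score of their neighbours.
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    scores = {node: seeds.get(node, 0.0) for node in neighbours}
    for _ in range(iterations):
        updated = {}
        for node, adjacent in neighbours.items():
            if node in seeds:
                updated[node] = seeds[node]  # seeds stay clamped
            else:
                updated[node] = sum(scores[n] for n in adjacent) / len(adjacent)
        scores = updated
    return scores


# Hypothetical graph: tweets linked to the features they contain.
edges = [
    ("tweet:I love #NY! :)", "emoticon::)"),
    ("tweet:I love #NY! :)", "word:love"),
    ("tweet:I hate #Obamacare! #killthebill", "word:hate"),
    ("tweet:I hate #Obamacare! #killthebill", "hashtag:killthebill"),
    ("tweet:We can't pass this :( #killthebill", "hashtag:killthebill"),
    ("tweet:We can't pass this :( #killthebill", "emoticon::("),
]
seeds = {"emoticon::)": 1.0, "word:love": 1.0, "word:hate": -1.0, "emoticon::(": -1.0}
scores = propagate(edges, seeds)
```

The hashtag node has no seed label of its own, yet it ends up with a strongly negative score because both tweets that use it are tied to negative seeds. That is the sense in which the graph smooths noisy evidence.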

Page 261: Practical Sentiment Analysis

Let’s not forget scalable human annotation

Mechanical Turk: can work well, but also problematic (e.g. lack of workers for many languages).

Also real-time polling and reactions, e.g., ReactLabs.

Page 262: Practical Sentiment Analysis

Stylistics & author modeling

Why NLP is hard
Sentiment analysis overview
Document classification
Aspect-based sentiment analysis
Visualization
Semi-supervised learning
Beyond text
Wrap up

Page 263: Practical Sentiment Analysis

Content-based analysis vs stylistics

For general categorization tasks, the content words are the most helpful.

E.g. to know whether a document is about sports or finance, it helps to know it contains “baseball”, “umpire” and “game” versus “money”, “stocks”, and “bonds”.

Often we filter so-called stop-words when creating features for such tasks.

Stylistics: tasks of interest include authorship attribution, status, depression, deceit, demographics, and more.

Stylistics is different from content categorization: the subtle differences in the use of function words are very important.

E.g. in authorship attribution studies, content words are often filtered out. The “stop words” instead become the “keep words”!
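The contrast can be shown with two feature extractors over the same text. This is a sketch only: the function-word list is a short illustrative sample (real stop-word lists are much longer), and the function names are hypothetical.

```python
import re
from collections import Counter

# A small illustrative function-word (stop-word) list; an assumption for this sketch.
FUNCTION_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "was",
                  "he", "she", "it", "i", "we", "but", "with", "for", "that"}


def tokens(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def content_features(text):
    """Counts of content words: function words are filtered out."""
    return Counter(t for t in tokens(text) if t not in FUNCTION_WORDS)


def style_features(text):
    """Counts of function words only: the stop words become the keep words."""
    return Counter(t for t in tokens(text) if t in FUNCTION_WORDS)
```

For topic categorization one would feed `content_features` to the classifier; for authorship work, `style_features` keeps exactly what the other extractor throws away.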

Page 267: Practical Sentiment Analysis

Author comparison: match texts on left to those by same author on the right

His manner was not effusive. It seldom was; but he was glad, I think, to see me. With hardly a word spoken, but with a kindly eye, he waved me to an armchair, threw across his case of cigars, and indicated a spirit case and a gasogene in the corner. Then he stood before the fire and looked me over in his singular introspective fashion.

For all the preposterous hat and the vacuous face, there was something noble in the simple faith of our visitor which compelled our respect. She laid her little bundle of papers upon the table and went her way, with a promise to come again whenever she might be summoned.

He was invited to Kellynch Hall; he was talked of and expected all the rest of the year; but he never came. The following spring he was seen again in town, found equally agreeable, again encouraged, invited, and expected, and again he did not come; and the next tidings were that he was married.

There are many theories about what happened, but two general narratives seem to be gaining prominence, which we will call the greed narrative and the stupidity narrative. The two overlap, but they lead to different ways of thinking about where we go from here.

He was not an ill-disposed young man, unless to be rather cold hearted and rather selfish is to be ill-disposed: but he was, in general, well respected; for he conducted himself with propriety in the discharge of his ordinary duties. Had he married a more amiable woman, he might have been made still more respectable than he was.

Our moral and economic system is based on individual responsibility. It’s based on the idea that people have to live with the consequences of their decisions. This makes them more careful deciders. This means that society tends toward justice — people get what they deserve as much as possible.

Doyle, Sherlock Holmes, A Scandal in Bohemia, 1891

Brooks, New York Times, Apr 2, 2009

Austen, Persuasion, 1818

Brooks, New York Times, Feb 19, 2009

Austen, Sense and Sensibility, 1811

Doyle, Sherlock Holmes, A Case of Identity, 1891

Page 268: Practical Sentiment Analysis

Quantitative features that help discriminate the authors

Grammatical person: 1st (we/us/our, I/me/my)

Grammatical tense: present, past

Word frequencies: frequent use of “he”

Punctuation: use of colons and semi-colons

Average word and sentence length

Syntax: prepositional adverbial phrases (“With hardly...”, “For all the...”)

These must be counted in all texts. The texts of unknown authorship should then have values most similar to those of the texts of one of the known authors.
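A rough sketch of this counting-and-comparing procedure follows. It covers only some of the features listed (person, frequency of "he", colons and semi-colons, average word and sentence length), assumes non-empty English text, and uses plain squared distance on raw values. In practice the features sit on very different scales and would normally be standardized first. The function names are hypothetical.

```python
import re

FIRST_PERSON = {"i", "me", "my", "we", "us", "our"}


def style_vector(text):
    """Return a handful of quantitative style features as a list of floats."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    lowered = [w.lower() for w in words]
    n = len(words)  # assumes the text contains at least one word
    return [
        sum(w in FIRST_PERSON for w in lowered) / n,  # first-person rate
        lowered.count("he") / n,                       # rate of "he"
        (text.count(";") + text.count(":")) / n,       # colons and semi-colons per word
        sum(len(w) for w in words) / n,                # average word length
        n / len(sentences),                            # average sentence length in words
    ]


def closest_author(unknown_text, known_texts):
    """known_texts maps author name to a sample text; returns the nearest author."""
    target = style_vector(unknown_text)

    def squared_distance(author):
        vector = style_vector(known_texts[author])
        return sum((a - b) ** 2 for a, b in zip(target, vector))

    return min(known_texts, key=squared_distance)
```

Syntax-based features such as fronted prepositional phrases need a parser or tagger and are left out of this sketch.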

Page 269: Practical Sentiment Analysis

Forensic linguistics

Forensic linguistics is a branch of applied linguistics that applies linguistic theory, research and principles to real life language in the legal context.

Even more generally, it can be viewed as analyzing examples of language to discover properties that reveal more than just what is said.

authorship (same as other examples?, plagiarism)

psychological attributes of the author (deception, depression)

similarity to other examples (e.g., trademark disputes)

Page 270: Practical Sentiment Analysis

Linguistic fingerprinting and identification

There has been much interest in finding linguistic “fingerprints”, but there are problems with the concept:

language acquisition: language is learned and continually changing

linguistic homogeneity: education, mass media

register: the same person speaks differently in different contexts, with different people

No accepted definition of general linguistic fingerprint has so far been proposed, nor are we likely to see one for these reasons.

Nonetheless, people do exhibit regularities in their speech and writing that could distinguish them from others.

This allows us to compare a limited set of authors/speakers in certain restricted conditions, just as we did in the author comparison exercise earlier in this section.

Page 271: Practical Sentiment Analysis

Identifying style

Every speaker uses language differently, leading to a unique style.

Style is both:

a collection of markers which can be observed and measured

a set of unconscious habits which can be observed and measured

Quantifying style:

word usage: presence/absence of words, relative word frequencies

type/token ratios

average word and sentence length

the number of words that occur only once (hapax legomena)
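Two of these measures, the type/token ratio and the hapax legomena, can be computed directly from token counts. The sketch below assumes a simple regular-expression tokenizer and non-empty text; the function name and the returned keys are illustrative.

```python
import re
from collections import Counter


def lexical_stats(text):
    """Return token count, type count, type/token ratio, and hapax legomena."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    # Hapax legomena: word types that occur exactly once in the text.
    hapaxes = sorted(word for word, count in counts.items() if count == 1)
    return {
        "tokens": len(tokens),
        "types": len(counts),
        "type_token_ratio": len(counts) / len(tokens),
        "hapax_legomena": hapaxes,
    }
```

Note that the type/token ratio depends heavily on text length, so it is only comparable across samples of similar size.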

Page 272: Practical Sentiment Analysis

Authorship attribution with machine learning

Machine learning provides a class of algorithms that perform unsupervised clustering.

They don’t have labels for any of the data points (e.g., documents).

Based on properties measured from the data points, coherent clusters of documents with similar properties can be identified.

A cluster can correspond to many different things, including collections of documents by the same author.

Mixture models: a popular class of probabilistic algorithms for clustering.

collections of probability distributions over the data

“soft” cluster membership: points are proportionally part of multiple clusters

a mixture of Gaussian distributions (a.k.a. normal distributions) is one of the most commonly used types of mixture model

The K-means algorithm is a related “hard” clustering algorithm that is simple and easy to understand.
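The difference between "soft" and "hard" membership can be seen by scoring one point against fixed cluster centres. This sketch makes strong simplifying assumptions: the means are given rather than learned, the Gaussians are equally weighted and share one spherical variance, and no model fitting is performed.

```python
import math


def soft_memberships(point, means, variance=1.0):
    """Membership probabilities for equally weighted spherical Gaussians."""
    # Unnormalized densities; the shared constants cancel when normalizing.
    weights = [
        math.exp(-sum((p - m) ** 2 for p, m in zip(point, mean)) / (2 * variance))
        for mean in means
    ]
    total = sum(weights)
    return [w / total for w in weights]


def hard_membership(point, means):
    """K-means-style assignment: the index of the nearest mean."""
    distances = [sum((p - m) ** 2 for p, m in zip(point, mean)) for mean in means]
    return distances.index(min(distances))
```

A point midway between two means gets a soft membership of one half in each cluster, while the hard assignment has to pick one of them (here, the first on a tie).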

Page 277: Practical Sentiment Analysis

Hard clustering into k groups

Assume you can measure various attributes for each data point, e.g.:

the weight and top speed of various vehicles

the average sentence length and average word length of various authors.

Next, you want to identify k groups of similar items based on these attributes.

How many groups? How to find them using an algorithm?

[Figure: scatter plots of weight (y-axis) against top speed (x-axis), showing candidate groupings for k=2 and k=3]

Page 278: Practical Sentiment Analysis

An authorship clustering problem

Texts from three authors (five documents each)

Arthur Conan Doyle (obtained from Project Gutenberg)

Jane Austen (obtained from Project Gutenberg)

Paul Krugman (obtained from New York Times website)

Measure the relative frequency of the words “I” and “the” in each document.

Author    Document     I     the
Austen    Emma         1.8   3.2
Austen    Mansfield    1.5   3.9
Austen    Persuasion   1.3   4.0
Austen    Pride        1.7   3.6
Austen    Sense        1.6   3.4
Doyle     City         2.1   4.7
Doyle     Gerard       3.6   6.1
Doyle     Holmes       2.8   5.3
Doyle     Hound        2.5   5.6
Doyle     Polestar     2.3   6.2
Krugman   12-01-2008   0.0   6.7
Krugman   12-07-2008   0.0   5.6
Krugman   12-15-2008   0.3   6.7
Krugman   12-19-2008   0.1   6.4
Krugman   12-22-2008   0.4   6.3

Now, “forgetting” who the authors are, let’s see if they fall into distinct clusters based on these attributes.
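The slide does not say exactly how the relative frequencies were computed. One plausible reading, assumed here, is the percentage of all word tokens in a document that are the given word. The tokenizer and function name are illustrative.

```python
import re


def relative_frequency(text, word):
    """Percentage of tokens in text equal to word (case-insensitive)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return 100.0 * tokens.count(word.lower()) / len(tokens)
```

Applying this to each document for "I" and "the" gives one two-dimensional point per document, which is the representation the clustering works with.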

Page 279: Practical Sentiment Analysis

Plot the attributes against each other

[Figure: scatter plot of the documents by author (Austen, Doyle, Krugman). x-axis: relative frequency of “I” (0 to 4); y-axis: relative frequency of “the” (0 to 7).]

Page 280: Practical Sentiment Analysis

K-means: intuition

We see the clusters quite clearly, but a computer doesn’t and we need to specify an algorithm that allows it to identify them.

The K-means algorithm is a simple algorithm for such tasks.

The basic idea:

the values for the attributes in each dimension will be similar for each document of the same author

each author is represented as the averages for the attributes of all the documents he or she wrote

but: we don’t know those averages since we “forgot” the authors!

so, we take a guess at the average for each author, and then see which documents each of our hypothesized authors was likely to have produced

these guesses will probably be wrong, but we can fix that by iteratively re-estimating them

Page 286: Practical Sentiment Analysis

K-means: algorithm

We are given N documents: $D = d_1, d_2, \ldots, d_N$

We need to output K centroids: $C = c_1, c_2, \ldots, c_K$

These centroids partition the documents into clusters based on which centroid is closest to each document.

K-means(D, K)
  C ← SelectRandomCenters(D, K)        // pick K random points (could be some of the data points in D)
  while C does change                  // stopping criterion (when does the algorithm stop?)
    for k ← 1 to K                     // (re)initialize the document clusters
      g_k ← {}
    for n ← 1 to N                     // find the closest centroid for each document; put the document in that group
      j ← argmin_i distance(c_i, d_n)
      g_j ← g_j ∪ {d_n}
    for k ← 1 to K                     // recompute centroids based on the new document clusters (the g_k's)
      c_k ← (1/|g_k|) Σ_{d in g_k} d
  return C

[Based on Manning, Raghavan, and Schutze 2008]
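The pseudocode can be sketched in plain Python roughly as follows. This is a minimal illustration, not the tutorial's own code: the names `sq_dist`, `mean`, and `kmeans`, the optional `seeds` argument, and the choice to leave an empty group's centroid unchanged are all assumptions made here.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(docs, k, seeds=None, rng=random):
    """Return (centroids, groups) once the centroids stop changing."""
    if seeds is None:
        seeds = rng.sample(docs, k)  # pick K of the data points as seeds
    centroids = [tuple(s) for s in seeds]
    while True:
        # (Re)initialize the groups, then put each document with its closest centroid.
        groups = [[] for _ in range(k)]
        for d in docs:
            j = min(range(k), key=lambda i: sq_dist(centroids[i], d))
            groups[j].append(d)
        # Recompute centroids; an empty group keeps its old centroid (assumption).
        new_centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
        if new_centroids == centroids:  # stopping criterion: C no longer changes
            return centroids, groups
        centroids = new_centroids

# Toy, made-up points: two tight pairs far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids, groups = kmeans(points, 2, seeds=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(groups)
```

Passing `seeds` explicitly makes a run reproducible; leaving it out draws K random documents as the initial centers.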

Wednesday, March 5, 14

Page 287: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Calculating distance

The documents are data points in some (possibly high-dimensional) space. We’ll work with 2D here.

Recall the Pythagorean theorem: c^2 = a^2 + b^2

Here, “a” is the distance on the x-axis and “b” is the distance on the y-axis between points di and dj.

distance(di,dj) = (xi - xj)^2 + (yi - yj)^2

Consider two data points d1 = (5,4) and d2 = (1,2).

distance(d1, d2) = (x1 - x2)^2 + (y1 - y2)^2 = (5-1)^2 + (4-2)^2 = 4^2 + 2^2 = 20

Note: we could take the square root, but it doesn’t matter since we are just comparing a bunch of squared distances.
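A small sketch of the calculation above, also checking that skipping the square root does not change which point is nearest. The `candidates` list is made up for illustration, and `sq_dist` is a name chosen here.

```python
import math

def sq_dist(p, q):
    """Squared Euclidean distance (no square root taken)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

d1, d2 = (5, 4), (1, 2)
print(sq_dist(d1, d2))             # 4^2 + 2^2 = 20
print(math.sqrt(sq_dist(d1, d2)))  # the actual Euclidean distance

# Made-up candidate points: the nearest one is the same either way,
# because the square root preserves order.
candidates = [(1, 2), (4, 4), (6, 7)]
nearest_sq = min(candidates, key=lambda c: sq_dist(d1, c))
nearest = min(candidates, key=lambda c: math.sqrt(sq_dist(d1, c)))
print(nearest_sq, nearest)
```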

126

Wednesday, March 5, 14

Page 288: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Simplified problem: just two authors and four documents

Let’s apply the K-means algorithm to four documents

Keep in mind that we are acting like we don’t know who is the author of each document.

Values are the relative frequencies of the words “I” and “the” in each document.

Author   Document     I     the
Austen   Mansfield    1.5   3.9
Austen   Persuasion   1.3   4.0
Doyle    Gerard       3.6   6.1
Doyle    Holmes       2.8   5.3

D = {d1, d2, d3, d4} = { (1.5,3.9), (1.3,4.0), (3.6,6.1), (2.8,5.3) }

Choose K = 2 (i.e., 2 authors)

Choose C = {c1, c2} = { (3.6,6.1), (2.8,5.3) } as initial seed centroids.

Wednesday, March 5, 14


Page 290: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Here’s what it looks like

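[Figure: scatter plot of the four documents d1 to d4 with the seed centroids c1 (on d3) and c2 (on d4). Horizontal axis ticks run from 1 to 4; vertical axis ticks run from 3 to 7.]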

Wednesday, March 5, 14

Page 291: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Computing the groups

Calculate the nearest centroid for each document and put it in the group for that centroid.

d1:

distance(c1,d1) = (3.6-1.5)^2 + (6.1-3.9)^2 = 9.25

distance(c2,d1) = (2.8-1.5)^2 + (5.3-3.9)^2 = 3.65

c2 is closer, so d1 is in g2

Doing this for d2, d3, and d4, we find that:

g1 = {d3} and g2 = {d1,d2,d4}

129

Wednesday, March 5, 14

Page 292: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Here’s what it looks like

[Figure: the same scatter plot, with points labeled d1, d2, d3, d4 and centroids c1, c2. Horizontal axis 1 to 4; vertical axis 3 to 7.]

Wednesday, March 5, 14

Page 293: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

New centroids

Next, we need to compute the new centroids based on these groups.

g2 has multiple elements:

sum of the x-values: 1.5+1.3+2.8 = 5.6

sum of the y-values: 3.9+4.0+5.3 = 13.2

size of g2 is 3, so c2 = (5.6/3, 13.2/3) ≈ (1.9, 4.4)

g1 stays the same:

size of g1 is 1, so we have c1 = (3.6/1, 6.1/1) = (3.6,6.1)
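The first round (assignment, then centroid update) can be reproduced with a few lines of Python. This is only a sketch using the four documents and two seed centroids from the example; the dictionary layout and variable names are choices made here.

```python
def sq_dist(p, q):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

docs = {"d1": (1.5, 3.9), "d2": (1.3, 4.0), "d3": (3.6, 6.1), "d4": (2.8, 5.3)}
centroids = {"c1": (3.6, 6.1), "c2": (2.8, 5.3)}

# Assignment step: each document joins the group of its closest centroid.
groups = {name: [] for name in centroids}
for doc_name, point in docs.items():
    closest = min(centroids, key=lambda c: sq_dist(centroids[c], point))
    groups[closest].append(doc_name)

# Update step: each centroid becomes the mean of its group's points.
new_centroids = {}
for name, members in groups.items():
    xs = [docs[m][0] for m in members]
    ys = [docs[m][1] for m in members]
    new_centroids[name] = (sum(xs) / len(xs), sum(ys) / len(ys))

print(groups)
print(new_centroids)
```

Note that 5.6/3 is about 1.87; the slide rounds it to 1.9.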

131

Wednesday, March 5, 14

Page 294: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Here’s what it looks like

[Figure: the same scatter plot, with points labeled d1, d2, d3, d4 and centroids c1, c2. Horizontal axis 1 to 4; vertical axis 3 to 7.]

Wednesday, March 5, 14

Page 295: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Re-assign groups based on new centroids

We then keep iterating until the centroids stay the same:

calculate the nearest centroid for each document and put it in that centroid's group

recalculate the centroids for the new groups

Notice that d4 is now closer to c1:

distance(c1,d4) = (3.6-2.8)^2 + (6.1-5.3)^2 = 1.28

distance(c2,d4) = (1.9-2.8)^2 + (4.4-5.3)^2 = 1.62
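As a check, here is the same comparison with the unrounded centroid c2 = (5.6/3, 13.2/3). This is a sketch with names chosen here; the distance to c2 comes out at about 1.68 rather than 1.62, but the conclusion that d4 moves to c1 is unchanged.

```python
def sq_dist(p, q):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

docs = {"d1": (1.5, 3.9), "d2": (1.3, 4.0), "d3": (3.6, 6.1), "d4": (2.8, 5.3)}
# Centroids after the first round; c2 is kept unrounded here.
centroids = {"c1": (3.6, 6.1), "c2": (5.6 / 3, 13.2 / 3)}

to_c1 = sq_dist(centroids["c1"], docs["d4"])
to_c2 = sq_dist(centroids["c2"], docs["d4"])
print(round(to_c1, 2), round(to_c2, 2))

# Re-assign every document using the updated centroids.
groups = {name: [] for name in centroids}
for doc_name, point in docs.items():
    closest = min(centroids, key=lambda c: sq_dist(centroids[c], point))
    groups[closest].append(doc_name)
print(groups)
```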

133

Wednesday, March 5, 14

Page 296: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Here’s what it looks like

[Figure: the same scatter plot, with points labeled d1, d2, d3, d4 and centroids c1, c2. Horizontal axis 1 to 4; vertical axis 3 to 7.]

Wednesday, March 5, 14

Page 297: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

The next round would be...

[Figure: the same scatter plot, with points labeled d1, d2, d3, d4 and centroids c1, c2. Horizontal axis 1 to 4; vertical axis 3 to 7.]

Wednesday, March 5, 14

Page 298: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

With the right groups

[Figure: the same scatter plot, with points labeled d1, d2, d3, d4 and centroids c1, c2. Horizontal axis 1 to 4; vertical axis 3 to 7.]

Wednesday, March 5, 14


Page 302: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Running k-means on all the documents

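[Figure: scatter plot of all the documents after running k-means. Horizontal axis: relative frequency of “I” (0 to 4); vertical axis: relative frequency of “the” (3 to 7). One document is marked “Wrong cluster!”.]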

Wednesday, March 5, 14

Page 303: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Back to authorship identification

Features were extracted from the documents and used as values for plotting each document in a multi-dimensional space.

Documents were then clustered according to K-means (other algorithms could be used).

K-means gave us a set of centroids, so we can plot other documents into the same multi-dimensional space and compute which centroid is closest to each.

138

Wednesday, March 5, 14


Page 307: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

We are given new documents of known authorship

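[Figure: the same scatter plot with new documents of known authorship added; one is labeled “(Austen)”. Horizontal axis: relative frequency of “I” (0 to 4); vertical axis: relative frequency of “the” (3 to 7).]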

Wednesday, March 5, 14

Page 308: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Attribution

The known documents provide evidence that the clusters we found are sets of documents produced by the author(s) of the known documents.

We can estimate the confidence in our authorship assignments based on how close the known documents are to each center.
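One way to picture this step: place each known document in the same space, find its nearest centroid, and use that author as the cluster's label. The sketch below is hedged in several ways. The centroid values are what the four-document example converges to if continued (they are not stated on the slides), the known-document coordinates are made up, and using the gap between the nearest and second-nearest centroid as a rough confidence signal is an assumption made here, not the tutorial's method.

```python
import math

def dist(p, q):
    """Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Centroids obtained by continuing the four-document example to convergence.
centroids = {"cluster A": (3.2, 5.7), "cluster B": (1.4, 3.95)}

# Hypothetical documents of known authorship (coordinates are made up).
known = [("Austen", (1.6, 4.1)), ("Doyle", (3.0, 5.5))]

labels = {}
margins = {}
for author, point in known:
    ranked = sorted(centroids, key=lambda c: dist(centroids[c], point))
    nearest, runner_up = ranked[0], ranked[1]
    labels[nearest] = author
    # Larger margin = the known document sits much closer to one center.
    margins[author] = dist(centroids[runner_up], point) - dist(centroids[nearest], point)

print(labels)
print(margins)
```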

140

Wednesday, March 5, 14

Page 309: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

A case study: The Federalist Papers

85 essays written in 1787 and 1788 arguing for the ratification of the new US constitution by the individual states.

Three authors, Alexander Hamilton, John Jay and James Madison, all writing under the pseudonym “Publius”.

Later, Hamilton and Madison both claimed to have written a number of the same articles. Scholarship in the 20th century revealed most of them to be Madison’s.

Alexander Hamilton: 1st US Secretary of the Treasury. 51 articles (nos. 1, 6–9, 11–13, 15–17, 21–36, 59–61, and 65–85); co-authored 18, 19 & 20 with Madison.

James Madison: 4th US President, “Father of the Constitution”. 29 articles (nos. 10, 14, 37–58 and 62–63); co-authored 18, 19 & 20 with Hamilton.

John Jay: 1st Chief Justice of the US. 5 articles (nos. 2–5 and 64).

Wednesday, March 5, 14

Page 310: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Authorship of the Federalist Papers

Historian Douglass Adair argued in 1944 that Madison was the author of many of the disputed papers.

This was confirmed by Mosteller and Wallace in 1964 using a Bayesian classification model.

Adair’s authorship determinations are still generally accepted, though twelve essays are still disputed by some scholars.

Experiment: represent each document by its counts of all words that occur 5 or more times, then cluster the documents with k-means (with some principal components analysis in between).

142

Wednesday, March 5, 14

Page 311: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Extracting “features”: frequent words and their counts

FEDERALIST No. 1

General Introduction

For the Independent Journal. Saturday, October 27, 1787

HAMILTON

To the People of the State of New York:

AFTER an unequivocal experience of the inefficacy of the subsisting federal government, you are called upon to deliberate on a new Constitution for the United States of America. The subject speaks its own importance; comprehending in its consequences nothing less than the existence of the UNION, the safety and welfare of the parts of which it is composed, the fate of an empire in many respects the most interesting in the world. It has been frequently remarked that it seems to have been reserved to the people of this country, by their conduct and example, to decide the important question, whether societies of men are really capable or not of establishing good government from reflection and choice, or whether they are forever destined to depend for their political constitutions on accident and force. If there be any truth in the remark, the crisis at which we are arrived may with propriety be regarded as the era in which that decision is to be made; and a wrong election of the part we shall act may, in this view, deserve to be considered as the general misfortune of mankind.

.... <the rest of the essays> ....

ID Author NumWords . people for jury macedon one power with an as at to more states its be by this upon government them they has the not that than a ; but state courts , is it in if may have executive would been no constitution and any on of or there all his are from their which will other
1 HAMILTON 1771 49 5 12 0 0 4 2 6 11 10 8 71 7 2 10 34 14 14 6 9 2 6 6 130 14 28 11 25 7 2 5 0 105 13 20 27 4 11 10 0 2 3 3 8 40 6 9 104 6 2 9 0 12 11 14 18 25 3
2 JAY 1841 40 21 13 0 0 10 1 13 1 15 10 52 5 2 5 15 10 14 1 9 4 22 6 105 10 44 5 29 11 8 0 0 122 16 38 34 3 4 17 0 5 8 1 0 83 1 8 81 10 0 4 0 6 4 21 11 2 4
3 JAY 1604 37 7 11 0 0 8 3 10 3 24 1 55 13 11 1 31 18 6 0 16 8 5 5 91 13 20 8 13 7 7 7 3 117 7 21 25 6 6 7 1 2 2 2 0 60 5 6 60 32 1 4 2 8 15 11 11 24 7
4 JAY 1806 32 7 12 0 0 13 2 12 3 20 2 50 13 1 9 26 14 1 0 16 12 17 1 84 14 17 9 16 10 10 5 0 125 10 28 24 12 10 9 0 17 2 1 0 90 5 11 70 24 3 4 2 11 8 19 10 15 11
5 JAY 1475 35 2 7 0 0 10 1 11 3 3 4 44 11 1 4 31 10 6 0 2 11 11 0 64 8 23 9 9 8 4 1 0 86 7 20 28 3 2 1 0 37 0 2 0 72 3 5 51 10 0 4 0 3 11 11 10 6 4
6 HAMILTON 2528 77 3 13 0 0 3 4 15 12 19 6 60 10 7 1 18 12 11 4 2 3 13 10 187 17 24 6 52 10 5 5 0 174 8 11 60 5 3 27 0 6 13 0 1 81 0 5 166 19 8 6 3 18 16 15 24 6 10
7 HAMILTON 2562 81 0 14 0 0 4 2 12 15 19 11 81 6 31 2 47 28 22 11 2 12 7 4 205 14 27 5 48 16 3 7 0 158 14 17 43 12 3 23 0 51 8 3 1 51 7 13 156 10 9 12 0 7 11 18 24 1 9
8 HAMILTON 2306 72 8 9 0 0 6 5 13 13 16 11 79 8 10 10 35 11 19 3 3 11 13 8 156 12 18 3 46 17 10 6 0 164 21 21 42 7 8 18 2 27 16 4 4 54 2 11 133 17 2 9 0 14 9 16 26 11 6
9 HAMILTON 2213 68 2 10 0 0 12 3 14 12 18 10 70 6 12 5 26 13 14 4 17 5 20 12 168 13 24 1 46 14 6 6 1 133 16 23 37 6 5 24 0 8 15 2 3 45 4 9 153 11 3 8 4 14 7 14 25 7 5
10 MADISON 3337 79 4 18 0 0 8 1 11 14 20 8 99 12 3 9 61 39 11 0 13 8 11 4 259 14 31 13 78 32 11 2 0 221 41 47 63 6 16 16 0 6 9 5 2 121 4 18 155 22 6 4 8 26 12 21 39 30 22
11 HAMILTON 2766 78 0 18 0 0 5 3 21 15 21 11 82 9 13 6 44 20 24 6 5 6 10 8 186 20 24 6 69 8 8 4 0 173 9 21 66 6 9 15 0 50 3 5 0 70 4 5 178 12 8 13 1 9 19 11 25 17 10
12 HAMILTON 2410 72 3 8 0 0 5 1 18 11 17 13 81 5 10 8 40 15 17 7 8 6 8 13 174 6 27 4 47 15 8 11 0 161 22 26 54 7 3 13 0 22 8 2 0 62 7 12 139 7 9 5 2 15 19 14 23 11 14
13 HAMILTON 1045 28 2 1 0 0 8 5 10 3 8 1 42 6 9 2 25 5 5 2 7 1 3 2 72 5 18 10 26 6 2 3 0 53 9 6 14 6 8 4 0 14 2 4 0 17 3 3 56 3 9 3 0 4 5 1 11 9 1
14 MADISON 2357 53 5 20 0 0 2 3 7 9 24 7 71 8 12 8 45 18 13 0 9 4 17 8 200 13 33 5 37 19 4 1 0 144 25 31 40 6 16 17 0 5 9 10 2 60 3 17 122 11 0 6 0 6 11 19 40 26 7
15 HAMILTON 3395 83 4 18 0 0 1 9 19 18 15 24 116 6 11 8 44 32 24 10 14 4 15 14 251 16 38 10 69 25 8 5 1 186 48 30 73 7 7 31 0 13 14 6 4 74 3 10 194 38 18 21 1 22 22 7 55 19 6
< ... 70 more lines for the other essays ...>
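A table like this can be produced with a short counting script. The sketch below is illustrative only: the tokenizer (lowercased words plus a few punctuation marks as their own tokens), the function names, and the reading of "occur 5 or more times" as a corpus-wide total are assumptions made here, and the two example texts are made up.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercased word tokens, with '.', ',' and ';' kept as separate tokens."""
    return re.findall(r"[a-z]+|[.,;]", text.lower())

def build_features(docs, min_count=5):
    """Return (vocab, rows): counts per document over tokens seen at least min_count times overall."""
    per_doc = [Counter(tokenize(text)) for text in docs]
    total = Counter()
    for counts in per_doc:
        total.update(counts)
    vocab = sorted(tok for tok, n in total.items() if n >= min_count)
    rows = [[counts[tok] for tok in vocab] for counts in per_doc]
    return vocab, rows

# Two tiny made-up texts; a low threshold keeps the example readable.
docs = [
    "The people of the state. The people decide.",
    "The government of the people; the state, the courts.",
]
vocab, rows = build_features(docs, min_count=3)
print(vocab)
for row in rows:
    print(row)
```

Each row is then a point in a space with one dimension per retained token, which is the input that dimensionality reduction and k-means operate on.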

Wednesday, March 5, 14

Page 312: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Dimensionality reduction: principal components analysis

144

Author Color

Hamilton green

Madison purple

Jay cyan

Hamilton & Madison red

Wednesday, March 5, 14


Page 314: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

K-means clustering

Three clusters (k=3): the model thinks the collaborations were most similar to Madison, plus three of Hamilton’s.

Cluster  COLLABMH  HAMILTON  JAY  MADISON
1        3         3         0    26
2        0         48        0    0
3        0         0         5    0

Four clusters (k=4): Cleanly clustered. One article (#47) attributed to Madison is astray, but it sensibly goes with the collaborative articles.

Cluster  COLLABMH  HAMILTON  JAY  MADISON
1        3         0         0    1
2        0         0         5    0
3        0         0         0    25
4        0         51        0    0

With no changing of the features or parameters, this is amazingly close to the authorship attributions given by Adair.
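Cluster-by-author tables of this kind are simple cross-tabulations of cluster assignments against the attributed authors. A minimal sketch follows; the function name `crosstab` and the six made-up assignments are illustrative and are not the Federalist data.

```python
from collections import Counter

def crosstab(clusters, authors):
    """Count how many documents of each author fall in each cluster."""
    counts = Counter(zip(clusters, authors))
    rows = sorted(set(clusters))
    cols = sorted(set(authors))
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table

# Made-up cluster assignments and author labels for six documents.
clusters = [1, 1, 2, 2, 2, 3]
authors = ["MADISON", "HAMILTON", "HAMILTON", "HAMILTON", "HAMILTON", "JAY"]

rows, cols, table = crosstab(clusters, authors)
print("Cluster", *cols)
for r, counts in zip(rows, table):
    print(r, *counts)
```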

Wednesday, March 5, 14

Page 315: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Web-scale authorship attribution

Narayanan et al. (2012) “On the Feasibility of Internet-Scale Author Identification”

The paper shows, for the first time, that large-scale authorship attribution is feasible (in their case, testing with 100,000 authors) and accurate enough to be useful in authorship attribution use cases.

For example, in their experiments, in 35% of their test cases, the correct author is in the top 20 authors predicted by their model.

Given that this is out of 100,000 authors, that is quite significant.
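To put the 35% figure in context, it helps to compare it with the chance of a uniformly random ranking placing the right author in the top 20. The sketch below computes that baseline and shows how a top-k accuracy could be measured; the function name and the toy rankings are made up, and the uniform-random baseline is an assumption made here for comparison.

```python
def top_k_accuracy(ranked_predictions, truths, k):
    """Fraction of test cases whose true author appears in the top k predictions."""
    hits = sum(1 for ranked, truth in zip(ranked_predictions, truths) if truth in ranked[:k])
    return hits / len(truths)

# Toy example with three candidate authors and k = 2.
ranked = [["a", "b", "c"], ["b", "c", "a"]]
truths = ["b", "a"]
toy_accuracy = top_k_accuracy(ranked, truths, k=2)
print(toy_accuracy)

# Chance baseline for the setting described above.
n_authors = 100_000
k = 20
chance = k / n_authors   # probability that a random ranking hits the top 20
reported = 0.35
lift = reported / chance
print(chance, round(lift))
```

Under that baseline, the reported top-20 rate is on the order of a thousand times better than chance.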

146

Wednesday, March 5, 14

Page 316: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Supervised learning and clustering

Simple note: clusters are often used as features for learning supervised classifiers.

E.g. cluster words according to their syntactic contexts and use the cluster ids as the features. This can help with some word-sense disambiguation without using a predefined set of word senses.

He paid a lot of interest after he took out that bank loan to buy his house.

He paid a lot of interest to the lecture he saw on YouTube.

[Figure: words in the two sentences are annotated with cluster IDs such as C2, C5, C8, C6, C42, C12, and C31.]

Clusters can also themselves be labeled and used for training!

Wednesday, March 5, 14


Page 319: Practical Sentiment Analysis

Relative status: who has the lead?

Communicative behaviors are “patterned and coordinated, like a dance” [Niederhoffer and Pennebaker, ‘02]

http://minimalmovieposters.tumblr.com/post/16082323317/pulp-fiction-by-ana-balderramas

adah ja ad to the adajkj the

adah ja ad at a adajkj the

adah ja ad of adajkj the

adah ja ad of adajkj the

adah to ja ad an adajkj gh

adah ja ad the adajkj forhgh

Those with less power tend to immediately match the function-word choices of those with more power. [C. Danescu-Niculescu-Mizil et al. WWW 2012]

Slide by Lillian Lee

Wednesday, March 5, 14

Page 320: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Content words still matter: detecting dodging by debaters

Slide by Philip Resnik

Wednesday, March 5, 14

Page 321: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Demographics

Predict categories such as age, race, and gender based on what a person writes.

Mixes both content and stylistic features

E.g. men and women tend to talk about different topics and they also use function words at different rates. (See James Pennebaker’s work)

Usually also relies on other inputs, such as a person’s name, when they write (e.g. tweet times), and their social network.

150

http://www.tweetolife.com/gender/ [no longer working]

Wednesday, March 5, 14

Page 322: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Example: People Pattern audience demographics (3.8k accounts)

151

Wednesday, March 5, 14

Page 323: Practical Sentiment Analysis

Beyond text

Why NLP is hard
Sentiment analysis overview
Document classification
Aspect-based sentiment analysis
Visualization
Semi-supervised learning
Stylistics & author modeling
Wrap up

Wednesday, March 5, 14

Page 324: Practical Sentiment Analysis

© 2013 Jason M Baldridge Sentiment Analysis Symposium, March 2014

What does “barbecue” mean?

153

Wednesday, March 5, 14

Page 325: Practical Sentiment Analysis

© 2013 Jason M Baldridge Sentiment Analysis Symposium, March 2014

What does “barbecue” mean? Barbecue’

153

Wednesday, March 5, 14


Page 331: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

What I thought semantics was before 2005

154

From: John Enrico and Jason Baldridge. 2011. Possessor Raising, Demonstrative Raising, Quantifier Float and Number Float in Haida. International Journal of American Linguistics. 77(2):185-218

Wednesday, March 5, 14

Page 332: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Updated perspective a la Ray Mooney (UT Austin CS)

155

http://www.cs.utexas.edu/users/ml/slides/chen-icml08.ppt

Wednesday, March 5, 14

Page 333: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Travel at the Turn of the 20th Century

http://www.lib.utexas.edu/books/travel/index.html

156

Wednesday, March 5, 14

Page 334: Practical Sentiment Analysis

© 2013 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Motivation: Google Lit Trips [http://www.googlelittrips.com/]

157

Grapes of Wrath in Google Earth


http://www.googlelittrips.com/GoogleLit/9-12/Entries/2006/11/1_The_Grapes_of_Wrath_by_John_Steinbeck.html

Wednesday, March 5, 14


Page 336: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Crisis response: Haiti earthquake

158

Wednesday, March 5, 14


Page 339: Practical Sentiment Analysis

© 2013 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Look, Mom, no hands! (Err, um... no metadata.)

159

Topics with a clear, circumscribed geographic focus emerge!

Wednesday, March 5, 14

Page 340: Practical Sentiment Analysis

© 2013 Jason M Baldridge Sentiment Analysis Symposium, March 2014

But, of course, certain kinds of metadata are now plentiful.

160

Wednesday, March 5, 14

Page 341: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Geotagged Wikipedia

161

30° 17′ N 97° 44′ W

Wednesday, March 5, 14

Page 342: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Geotagged Twitter

47°31’41’’ N 122°11’52’’ W

01:55:55 RT @USER_dc5e5498: Drop and give me 50....

05:09:29 I said u got a swisher from redmond!? He said nah kirkland! Lol..ooooooooOkay!

05:57:35 Lmao!:) havin a good ol time after work! Unexpected! #goodtimes

06:00:09 RT @USER_d5d93fec: #letsbereal .. No seriously, #letsbereal>>lol. Don't start.

06:00:37 On my way to get @USER_60939380 yeee! She want some of this strawberry! Sexy!

...

Wednesday, March 5, 14


Page 344: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Document geolocation: where is this person?

163

Wednesday, March 5, 14

Page 345: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Document geolocation

Language-model-based Information Retrieval (LMIR) [Ponte and Croft 1998, Zhai and Lafferty 2001]

Given texts annotated with locations, construct language models for points or regions on the earth’s surface.

(There are many choices for doing this.)

Given a new (unlabeled) text, rank all locations w.r.t. how good a match they are.

(There are many choices for doing this.)

Given the ranking of locations, choose a single coordinate as the location for the unlabeled text.

(There are many choices for doing this.)

164

[Serdyukov, Murdock, & van Zwol 2009; Cheng, Caverlee, & Lee 2010; Wing & Baldridge 2011]

Wednesday, March 5, 14

Page 346: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Construct pseudo-documents from a geodesic grid

Grid: equal degree/area cells; rectangular or otherwise?

Pseudo-document location is the cell center or the centroid of all documents in the cell.

Amsterdam, Zaandam, Amstelveen, Diemen, Landsmeer ...

Frankfurt, Frechen, Hürth, Brühl, Wesseling, ...
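A rough sketch of the grid step, assuming a simple equal-degree rectangular grid (one of the options mentioned above). The function names, the one-degree cell size, the whitespace tokenizer, and the approximate coordinates of the three example documents are all assumptions made for illustration. Averaging latitudes and longitudes directly is also a simplification that is only reasonable for small cells away from the poles and the date line.

```python
import math
from collections import defaultdict

def cell_of(lat, lon, cell_degrees=1.0):
    """Index of the equal-degree grid cell containing (lat, lon)."""
    return (math.floor(lat / cell_degrees), math.floor(lon / cell_degrees))

def build_pseudo_documents(docs, cell_degrees=1.0):
    """Group (lat, lon, text) triples by cell; pool their tokens and average their coordinates."""
    cells = defaultdict(lambda: {"tokens": [], "coords": []})
    for lat, lon, text in docs:
        cell = cells[cell_of(lat, lon, cell_degrees)]
        cell["tokens"].extend(text.lower().split())
        cell["coords"].append((lat, lon))
    pseudo = {}
    for key, cell in cells.items():
        n = len(cell["coords"])
        centroid = (sum(c[0] for c in cell["coords"]) / n,
                    sum(c[1] for c in cell["coords"]) / n)
        pseudo[key] = {"tokens": cell["tokens"], "location": centroid}
    return pseudo

# Made-up geotagged texts with approximate coordinates.
docs = [
    (52.37, 4.90, "Amsterdam canals"),
    (52.44, 4.83, "Zaandam windmills"),
    (50.11, 8.68, "Frankfurt airport"),
]
pseudo = build_pseudo_documents(docs)
for key in sorted(pseudo):
    print(key, pseudo[key]["tokens"], pseudo[key]["location"])
```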

Wednesday, March 5, 14


Page 348: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Generate a language model for each pseudo-document

Interpolate or smooth against the LM for the entire collection, neighboring pseudo-documents, or both.

Amsterdam, Zaandam, Amstelveen, Diemen, Landsmeer ...

Frankfurt, Frechen, Hürth, Brühl, Wesseling, ...
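As one concrete possibility, the sketch below builds unigram language models and linearly interpolates each cell's model with the collection model (Jelinek-Mercer style smoothing). The mixing weight `lam = 0.5`, the function names, and the two tiny made-up cells are assumptions for illustration; smoothing against neighboring cells is not shown.

```python
from collections import Counter

def unigram_lm(tokens):
    """Maximum-likelihood unigram probabilities for a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def interpolate(cell_lm, collection_lm, lam=0.5):
    """Mix the cell model with the collection model so every known word gets some probability."""
    return {w: lam * cell_lm.get(w, 0.0) + (1 - lam) * p for w, p in collection_lm.items()}

# Two made-up pseudo-documents.
cells = {
    "A": "beach surf beach sand".split(),
    "B": "mountain snow ski mountain".split(),
}
collection = unigram_lm([t for toks in cells.values() for t in toks])
smoothed = {name: interpolate(unigram_lm(toks), collection) for name, toks in cells.items()}

print(smoothed["A"]["beach"])     # seen in cell A
print(smoothed["A"]["mountain"])  # unseen in cell A, but no longer zero
```

Without smoothing, a single word that never occurred in a cell would give that cell zero probability for the whole query.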

Wednesday, March 5, 14

Page 349: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Generate a language model for a query document

May optionally be smoothed, depending on the ranking method.

Wednesday, March 5, 14

Page 350: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Rank all pseudo-documents w.r.t. “query” text

Kullback-Leibler divergence

Query Likelihood

[Figure: map of grid cells colored by −log(rank). Longitude 15 to 20; latitude 50 to 55; color scale from −10 to 0.]
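Both ranking criteria can be sketched in a few lines. The cell language models below are made-up, already-smoothed distributions over a six-word vocabulary, the function names are chosen here, and query words outside the vocabulary are simply skipped (an assumption made to keep the example short).

```python
import math
from collections import Counter

# Made-up smoothed unigram models for two cells (each sums to 1).
cell_lms = {
    "A": {"beach": 0.375, "surf": 0.1875, "sand": 0.1875,
          "mountain": 0.125, "snow": 0.0625, "ski": 0.0625},
    "B": {"beach": 0.125, "surf": 0.0625, "sand": 0.0625,
          "mountain": 0.375, "snow": 0.1875, "ski": 0.1875},
}

def query_lm(tokens):
    """Unsmoothed unigram model of the query text."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def kl_divergence(q, p):
    """KL(q || p), summed over query words that the cell model knows."""
    return sum(qw * math.log(qw / p[w]) for w, qw in q.items() if w in p)

def log_likelihood(tokens, p):
    """Log-probability of the query tokens under the cell model."""
    return sum(math.log(p[w]) for w in tokens if w in p)

query = "mountain snow mountain".split()
q = query_lm(query)

by_kl = sorted(cell_lms, key=lambda c: kl_divergence(q, cell_lms[c]))                 # smaller is better
by_ql = sorted(cell_lms, key=lambda c: log_likelihood(query, cell_lms[c]), reverse=True)  # larger is better
print(by_kl, by_ql)
```

With an unsmoothed query model, as here, the two criteria produce the same ordering; they can differ once the query model is smoothed.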

Wednesday, March 5, 14

Page 351: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Choose the best matching pseudo-document

Highest ranking document

Mean shift over top-K

Reranking (Learning to Rank)

[Figure: map of grid cells colored by −log(rank). Longitude 15 to 20; latitude 50 to 55; color scale from −10 to 0.]

Wednesday, March 5, 14

Page 352: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Locations of Twitter users are not uniformly distributed!

170

(Small) GeoUT (Twitter) plotted on Google Earth, one pin per user.

Density of (all) documents in GeoUT over the USA (390 million tweets)

Wednesday, March 5, 14

Page 353: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

k-d tree for geotagged Wikipedia, looking at N. America

171

Wednesday, March 5, 14


Page 355: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Pre-grid clustering [Erik Skiles, MA thesis, UT Austin, Ling]

172

Wednesday, March 5, 14


Page 357: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Four clusters on GeoUT (390 million tweets)

[Figure: density maps in five panels: All tweets, West coast, East coast, Midwest & South, and Spanish language.]

Wednesday, March 5, 14

Page 358: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Performance (kd-tree with clustering)

Wikipedia (entire world)

Half of documents geolocated within 12 km of truth

Percent of documents within 161 km (100 miles): 91%

Twitter (USA)

Half of users geolocated within 330 km of truth

Percent of documents within 161 km (100 miles): 40%

For better or worse, it soon might not matter whether you have location turned on or not... what you say is where you are / are from. (Also, other factors, e.g. who you are linked to, of course.)
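The two kinds of numbers reported here (a median error distance and the share of predictions within 100 miles, about 161 km) can be computed as in the sketch below. It uses the haversine great-circle formula with a mean Earth radius of 6371 km; the function names and the three made-up prediction/truth pairs are illustrative only.

```python
import math
from statistics import median

def haversine_km(a, b, radius_km=6371.0):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))

def evaluate(predicted, truth, threshold_km=161.0):
    """Return (median error in km, fraction of predictions within threshold_km)."""
    errors = [haversine_km(p, t) for p, t in zip(predicted, truth)]
    within = sum(e <= threshold_km for e in errors) / len(errors)
    return median(errors), within

# Made-up true and predicted coordinates for three documents.
truth = [(30.28, -97.73), (47.53, -122.20), (40.71, -74.01)]
predicted = [(30.28, -97.73), (45.52, -122.68), (40.71, -74.01)]

med, frac = evaluate(predicted, truth)
print(med, frac)
```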

Wednesday, March 5, 14


Page 367: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Geographic meaning of words (based on Wikipedia)

[Figure: maps showing the geographic distribution of four words: “mountain”, “beach”, “wine”, and “barbecue”.]

Wednesday, March 5, 14

Page 368: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Image geo-location: http://graphics.cs.cmu.edu/projects/im2gps/

Wednesday, March 5, 14

Page 369: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Entity linking

It is one thing to identify that a span of text is a named entity, and another thing entirely to pick the unique entity in the world that it refers to.

John Smith went to London to visit Apple last year.

John Smith: Which John Smith?

London: London in UK, Canada, or elsewhere?

Apple: Apple Inc. or Apple Records?

last year: 2013? 2012? 1999? 1850?

Wednesday, March 5, 14


Page 371: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Toponym (place name) resolution: geographic entity linking

They visit Portland every year.

[Figure: map with a “?” at each candidate location named Portland.]

Which Portland? (Also: Canada, Australia, Ireland...)

Wednesday, March 5, 14

Page 372: Practical Sentiment Analysis

© 2014 Jason M Baldridge Sentiment Analysis Symposium, March 2014

Toponym resolution in context

179

Although Elisha Newman made the first land entry in the township of Portland (June, 1833), he did not become a settler until three years later, by which time a few settlers had located in the town. From Mr. Newman's story, it appears that early in 1833, he was visiting friends in Ann Arbor, and during an evening conversation discussed with others the subject of unlocated lands lying west of Ann Arbor. One of the company (Joseph Wood) remarked that he had been out with the party sent to survey Ionia and other counties, and that the surveyors were struck by the valuable water-power at the mouth of the Looking Glass River, saying there would surely be a village there some day.

Mr. Newman was at once taken with the idea of locating lands at the mouth of the Looking Glass. Following up his impulse, he made ready to start at once, and, accompanied by James Newman and Joseph Wood, went out to the Looking Glass on a tour of inspection. Being satisfied with the location, he returned Eastward with his companions, and at White Pigeon made his land entry.

Newman did not return for a permanent settlement until the spring of 1836, and meanwhile, in November, 1833, Philo Bogue bought a piece of land on section 28, in the bend of the Grand River, where he proposed to set up a trading post. Unaided he rolled up a log cabin near where the Detroit, Lansing, and Northern depot was located, and when he brought the house into decent shape went over to Hunt's at Lyons for his family, whom he had left there against such time as he should have affairs prepared for their comfort.

Wednesday, March 5, 14

Page 376: Practical Sentiment Analysis

Spatial minimality

GeoNames

4048392 Portland Mills Portland Mills 39.7781 -87.00918 P PPL US IN 133 0 223 218 America/Indiana/Indianapolis 2010-02-15
4084605 Portland Portland 32.15459 -87.1686 P PPL US AL 047 0 30 41 America/Chicago 2006-01-15
4127143 Portland Portland Portlend,Портленд 33.2379 -91.51151 P PPL US AR 003 430 38 39 America/Chicago 2011-05-14
4169227 Portland Portland 30.51242 -86.19578 P PPL US FL 131 0 8 14 America/Chicago 2006-01-15
4217115 Portland Portland 34.05732 -85.03634 P PPL US GA 233 0 229 228 America/New_York 2010-09-05
4277586 Portland Portland 37.0778 -97.31227 P PPL US KS 191 0 362 364 America/Chicago 2006-01-15
4305000 Portland Portland 37.12062 -85.44608 P PPL US KY 001 0 220 223 America/Chicago 2006-01-15
4305001 Portland Portland 38.26924 -85.8108 P PPL US KY 111 0 135 138 America/Kentucky/Louisville 2006-01-15
4305002 Portland Portland 38.74812 -84.44772 P PPL US KY 191 0 265 266 America/New_York 2006-01-15
404289 Portland Portland Portlend,Портленд 38.71088 -91.71767 P PPL US MO 027 0 170 172 America/Chicago 2010-01-29
4521811 Portland Portland Portlend,Портленд 39.00341 -81.77124 P PPL US OH 105 0 187 188 America/New_York 2010-01-29
4650946 Portland Portland Portlend,Портленд 36.58171 -86.51638 P PPL US TN 165 11480 244 245 America/Chicago 2011-05-14
4720131 Portland Portland Portlend,Портленд 27.87725 -97.32388 P PPL US TX 409 15099 13 11 America/Chicago 2011-05-14
4841001 Portland Portland Portlend,Портленд 41.57288 -72.64065 P PPL US CT 007 5862 24 27 America/New_York 2011-05-14
4871855 Portland Portland 43.12858 -93.12354 P PPL US IA 033 35 327 330 America/Chicago 2011-05-14
4906524 Portland Portland 41.66253 -89.98012 P PPL US IL 195 0 190 190 America/Chicago 2006-01-15
5006314 Portland Portland Portlend,Портленд 42.8692 -84.90305 P PPL US MI 067 3883 221 223 America/Detroit 2011-05-14
5746545 Portland Portland 45.52345 -122.67621 P PPLA2 US OR 051 583776 12 15 America/Los_Angeles 2011-05-14

Toponym         # Locations
Ann Arbor       1
Detroit         >7
Ionia           >4
Lyons           >15
Portland        >17
White Pigeon    1

[Map labels: Portland, Lyons, Ionia, White Pigeon]
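The idea of spatial minimality can be sketched in a few lines: among all combinations of candidate locations, prefer the one whose points lie closest together. The code below is a brute-force toy under stated assumptions, not the SPIDER algorithm evaluated later in the deck. The function names are hypothetical, the Portland coordinates are taken from the GeoNames rows, and the coordinates for Ann Arbor and White Pigeon are approximate values supplied for illustration.

```python
import itertools
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def resolve_by_spatial_minimality(candidates):
    """candidates: {toponym: {label: (lat, lon)}} -> {toponym: chosen label}.

    Tries every combination of candidates and keeps the one with the
    smallest total pairwise distance (exponential, fine only for toy inputs).
    """
    toponyms = list(candidates)
    best, best_cost = None, float("inf")
    for combo in itertools.product(*(candidates[t].items() for t in toponyms)):
        cost = sum(haversine_km(p[1], q[1])
                   for p, q in itertools.combinations(combo, 2))
        if cost < best_cost:
            best, best_cost = combo, cost
    return {t: label for t, (label, _) in zip(toponyms, best)}


candidates = {
    # Four of the GeoNames candidates for "Portland" listed above
    "Portland": {
        "OR": (45.52345, -122.67621),
        "MI": (42.8692, -84.90305),
        "TX": (27.87725, -97.32388),
        "CT": (41.57288, -72.64065),
    },
    # Approximate coordinates, assumed for illustration
    "Ann Arbor": {"MI": (42.28, -83.74)},
    "White Pigeon": {"MI": (41.80, -85.64)},
}

resolved = resolve_by_spatial_minimality(candidates)
print(resolved)
```

With these inputs the Michigan Portland wins because it sits nearest the two unambiguous Michigan towns, which is the behavior the passage calls for.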

Page 378: Practical Sentiment Analysis

Spatial minimality often fails

I moved from Encinitas, CA, a nice beach town in North San Diego County to Asheville, NC. By far, Ashville is more hip, especially West Asheville. Asheville has a lot in common with Portland. Austin, I've never been to so I cannot comment. But what makes a place cool and hip, in my opinion are that give a area "punch". There are a lot of ingredients. One is geography. Add a college or university (and all that they bring- and draw), good restaurants, a good music scene, a progressive attitude and tolerance. Hmmm. I'm sure there are many more to ponder. But that's my start. Oh, lots of bars!

From: http://www.city-data.com/forum/austin/1694181-what-makes-city-like-austin-portland-3.html

City-data.com incorrectly marks “West” and “Portland” as cities in Texas -- presumably because of their textual and spatial proximity to “Austin”.

But: it is clear from the text that Portland, Oregon and Austin, Texas are the referents, even though their states are never mentioned and both cities are far from the other locations!

Page 385: Practical Sentiment Analysis

Toponym classifiers

Strategy: build a textual classifier per toponym by obtaining indirectly labeled examples from Wikipedia.

P(Portland-OR|music) > P(Portland-ME|music)
P(Portland-OR|wharf) < P(Portland-ME|wharf)
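To make the two inequalities concrete, here is a minimal sketch of a per-toponym text classifier. It assumes a multinomial Naive Bayes model with add-one smoothing, which is a stand-in and not necessarily the model used in the work presented. The class name and the four training snippets are invented for illustration; in the strategy above the training contexts would come from Wikipedia.

```python
import math
from collections import Counter, defaultdict


class ToponymClassifier:
    """Tiny multinomial Naive Bayes over context words, one instance per toponym."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of examples
        self.vocab = set()

    def train(self, examples):
        for label, text in examples:
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.doc_counts[label] += 1
            self.vocab.update(words)

    def posteriors(self, text):
        """Return P(label | text), ignoring words never seen in training."""
        words = [w for w in text.lower().split() if w in self.vocab]
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in self.doc_counts:
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)  # add-one smoothing
            score = math.log(self.doc_counts[label] / total_docs)
            score += sum(math.log((counts[w] + 1) / denom) for w in words)
            log_scores[label] = score
        top = max(log_scores.values())
        unnorm = {l: math.exp(s - top) for l, s in log_scores.items()}
        z = sum(unnorm.values())
        return {l: v / z for l, v in unnorm.items()}


# Hypothetical stand-ins for contexts harvested from Wikipedia
examples = [
    ("Portland-OR", "indie music scene breweries and bridges on the willamette river"),
    ("Portland-OR", "live music venues food carts and coffee"),
    ("Portland-ME", "lobster boats at the old wharf on casco bay"),
    ("Portland-ME", "fishing wharf lighthouse and harbor"),
]

clf = ToponymClassifier()
clf.train(examples)
p_music = clf.posteriors("music")
p_wharf = clf.posteriors("wharf")
print(p_music)
print(p_wharf)
```

On this toy data the posteriors order the same way as on the slide: "music" favors Portland-OR and "wharf" favors Portland-ME.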

Page 386: Practical Sentiment Analysis

Results: disambiguating toponyms

TR-CoNLL: Reuters news texts, August 1996
Perseus Civil War Corpus: books, late 19th century

                               TR-CoNLL                       Perseus Civil War Corpus
Method                         Avg. error distance  Accuracy  Avg. error distance  Accuracy
Population                     216                  81.0      1749                 59.7
SPIDER (spatial minimality)    2180                 30.9      266                  57.5
WISTR (Wiki supervised)        279                  82.3      855                  69.1
SPIDER+WISTR                   430                  81.8      201                  85.9

Take-home message: text classifiers are very effective & can be boosted by spatial minimality algorithms.

Page 389: Practical Sentiment Analysis

Back to grounding

Grounding often involves connecting text to knowledge sources and other modalities (image, video) & bootstrapping.

Semi-supervised learning methods such as label propagation help bridge the annotation gap.

Also, they can help us create models for deeper aspects of language, such as syntactic structure and logical form.
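As a reminder of how label propagation spreads a few annotations across unlabeled items, here is a minimal sketch. It assumes a small undirected weighted graph and clamped seed nodes; the function name, the graph, and the edge weights are all hypothetical and chosen only to show the mechanics.

```python
def propagate_labels(edges, seeds, labels=("pos", "neg"), iterations=50):
    """Iteratively average neighbor label distributions over a weighted graph.

    edges: {node: {neighbor: weight}}, symmetric, every node has a neighbor.
    seeds: {node: label} for the few annotated nodes, which stay fixed.
    Returns {node: {label: probability}}.
    """
    nodes = list(edges)
    uniform = {l: 1.0 / len(labels) for l in labels}
    dist = {
        n: ({l: float(l == seeds[n]) for l in labels} if n in seeds else dict(uniform))
        for n in nodes
    }
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            if n in seeds:
                updated[n] = dist[n]  # seeds stay clamped to their known label
                continue
            total = sum(edges[n].values())
            updated[n] = {
                l: sum(w * dist[m][l] for m, w in edges[n].items()) / total
                for l in labels
            }
        dist = updated
    return dist


# Hypothetical similarity graph over words; only two nodes are annotated
edges = {
    "great": {"awesome": 1.0, "meh": 0.2},
    "awesome": {"great": 1.0},
    "terrible": {"awful": 1.0, "meh": 0.2},
    "awful": {"terrible": 1.0},
    "meh": {"great": 0.2, "terrible": 0.2},
}
seeds = {"great": "pos", "terrible": "neg"}

scores = propagate_labels(edges, seeds)
print(scores["awesome"], scores["awful"], scores["meh"])
```

Two seed labels are enough to label "awesome" and "awful" here, while "meh", which is equally connected to both seeds, stays undecided.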

Page 391: Practical Sentiment Analysis

Temporality of words, by hour http://www.tweetolife.com/hour/ (no longer working)

Page 393: Practical Sentiment Analysis

Temporality of expressions, by day: http://www.google.com/trends

Page 394: Practical Sentiment Analysis

Temporality of expressions, by year: http://ngrams.googlelabs.com/

[Plotted terms: slave, trenches, aircraft, war]

Page 400: Practical Sentiment Analysis

Temporal resolution [Kumar, Lease, and Baldridge 2011]

[Figure: timeline with axis marks at 4000 BC, 2000 BC, 0 AD, and 2000 AD]

Page 402: Practical Sentiment Analysis

Lexical brain decoding [Yarkoni, Poldrack, Nichols, Van Essen & Wager (2011)]

Page 403: Practical Sentiment Analysis

More modalities: videos [Motwani & Mooney, 2012]

Page 412: Practical Sentiment Analysis

Beyond word co-occurrences for vector-space models

         bear  boat  car  cow  hadoop  snow  water  wrench
beach       3   234   42    4       1     2    325       0
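The row of counts for "beach" is an ordinary word co-occurrence vector, the baseline representation this slide proposes to move beyond. As a minimal sketch of how such vectors are compared, the code below computes cosine similarity. The "beach" counts are from the slide; the "harbor" and "cluster" vectors are invented for illustration and are not from the original.

```python
import math

CONTEXT_WORDS = ["bear", "boat", "car", "cow", "hadoop", "snow", "water", "wrench"]

# Counts from the slide
beach = [3, 234, 42, 4, 1, 2, 325, 0]

# Hypothetical vectors, made up only to have something to compare against
harbor = [0, 410, 35, 1, 0, 3, 290, 2]
cluster = [0, 1, 2, 0, 180, 0, 3, 5]


def cosine(u, v):
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


sim_harbor = cosine(beach, harbor)
sim_cluster = cosine(beach, cluster)
print(f"beach~harbor:  {sim_harbor:.3f}")
print(f"beach~cluster: {sim_cluster:.3f}")
```

Words that share contexts such as "boat" and "water" end up close together, while a word dominated by "hadoop" contexts does not.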

Page 413: Practical Sentiment Analysis

Combining distributional models with logics

Erk (2013): “Towards a semantics for distributional representations.”

Garrette et al (2012): “A formal approach to linking logical form and vector-space lexical semantics”

Beltagy et al (2013): “Montague Meets Markov: Deep Semantics with Probabilistic Logical Form”

Page 414: Practical Sentiment Analysis

Multi-component structured vector-space models

the children visit the beach

[Diagram: “visit” linked to “children” (Agent) and “beach” (Patient)]

Page 416: Practical Sentiment Analysis

Language learning in context [Kim & Mooney, 2013]

Page 417: Practical Sentiment Analysis

Wrap up

Why NLP is hard
Sentiment analysis overview
Document classification
Aspect-based sentiment analysis
Visualization
Semi-supervised learning
Stylistics & author modeling
Beyond text

Page 420: Practical Sentiment Analysis

All your meaning are belong to us

Page 421: Practical Sentiment Analysis

ALPAC yer bags

Page 422: Practical Sentiment Analysis

It’s cold out there, after the hype

Page 423: Practical Sentiment Analysis

The Hype Cycle (Gartner)

Page 425: Practical Sentiment Analysis

The Hype Cycle (Gartner, 2013)

http://www.gartner.com/newsroom/id/2575515

Page 426: Practical Sentiment Analysis

http://davidrothman.net/2009/09/02/all-your-healthbase-are-belong-to-us-want-em-back/

Grounding matters

Page 434: Practical Sentiment Analysis

Being right can be consequential too

Narayanan et al 2012: “On the Feasibility of Internet-Scale Author Identification”

Page 435: Practical Sentiment Analysis

Tremendous challenges and tremendous opportunities

• It is possible to create automated methods for opinion mining tasks that make reasonably accurate predictions that are useful for summarizing attitudes and opinions at a glance.

• The deeper you go, the better you can discriminate detailed opinions, but also the harder it is to have the necessary data to build accurate models. (The problem is “NLP-complete”.)

• Semi-supervised methods hold a lot of promise for these endeavors.

• Methods that reduce the burden of feature engineering, such as deep learning, are also important.

• A big component of this is grounding texts in the real world.
