A cognitive study of subjectivity extraction in sentiment annotation
Abhijit Mishra (1), Aditya Joshi (1,2,3), Pushpak Bhattacharyya (1)
(1) IIT Bombay, India
(2) Monash University, Australia
(3) IITB-Monash Research Academy
At 5th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis, ACL 2014, Baltimore
Subjectivity Extraction
• Goal: To identify subjective portions of text
Motivation
• Strong AI suggests that a machine must perform sentiment analysis in a manner, and with an accuracy, similar to that of human beings
• Do humans perform subjectivity extraction as well?
A “cognitive study” of subjectivity extraction in sentiment annotation
Outline
• Sentiment Oscillations & Subjectivity Extraction
• Experiment Setup
• Anticipation & Homing
• Conclusion & Future Work
Sentiment Oscillations & Subjectivity Extraction
• Subjective documents may be:
– Linear: The story was captivating. The actors did a great job. I absolutely loved the movie!
– Oscillating: The story was captivating. Only if they had better actors. But then I enjoyed the movie, on the whole.
• Humans perform subjectivity extraction either as a result of “anticipation” or as “homing”.
• Which of the two methods is adopted depends on the linear or oscillating nature of the subjective document.
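The linear/oscillating distinction above can be sketched in code. This is a minimal illustration, not the authors' method: it assumes each sentence already carries a hand-assigned polarity label in {-1, 0, +1}, and the function names `count_polarity_flips` and `document_type` are hypothetical. The slides also mention a document that lies between the two extremes, so a binary label is a simplification.

```python
from typing import List


def count_polarity_flips(polarities: List[int]) -> int:
    """Count sign changes between consecutive non-neutral sentence polarities.

    Assumes labels are -1 (negative), 0 (neutral) or +1 (positive).
    """
    non_neutral = [p for p in polarities if p != 0]
    return sum(1 for a, b in zip(non_neutral, non_neutral[1:]) if a != b)


def document_type(polarities: List[int]) -> str:
    """Label a document 'linear' if its polarity never flips, else 'oscillating'.

    The zero-flip threshold is an assumption made for this sketch.
    """
    return "linear" if count_polarity_flips(polarities) == 0 else "oscillating"


# Hand-assigned labels for the two example reviews (an assumption, not source data).
linear_example = [1, 1, 1]        # captivating story / great actors / loved it
oscillating_example = [1, -1, 1]  # captivating story / weak actors / enjoyed it

print(document_type(linear_example))
print(document_type(oscillating_example))
```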
Experiment Setup (1/2)
• A human annotator reads a document and predicts its sentiment
• A Tobii T120 eye-tracker records eye movements while he/she reads the document
• No time restriction is imposed and no user input is required, to minimize errors.
Experiment Setup (2/2)
• Dataset
– 3 movie reviews in English from IMDb
– One linear, one oscillating, one between the two extremes (D0, D1, D2 respectively)
• Three documents? Really?!
– To eliminate predictability
– To reduce errors due to fatigue
• 12 human annotators (P0, ..., P11)
Observations: Anticipation (1/2)
• In the case of linear subjective documents, an annotator reads some sentences and then begins to skip sentences.
Observations: Anticipation (2/2)
Document  Length  Average number of non-unique sentences read by participants
D0        10      21
D1        9       33.83
D2        13      50.42
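To compare the three documents on a common scale, the table can be normalized by document length. The sketch below assumes that "Length" is the number of sentences in the document (the slides do not state the unit); the name `reads_per_sentence` is hypothetical.

```python
# Values copied from the table above: document -> (length, average non-unique reads).
# Assumption: "length" is measured in sentences.
table = {
    "D0": (10, 21.0),
    "D1": (9, 33.83),
    "D2": (13, 50.42),
}


def reads_per_sentence(length: int, avg_reads: float) -> float:
    """Average number of sentence reads per sentence of the document."""
    return avg_reads / length


ratios = {doc: round(reads_per_sentence(*row), 2) for doc, row in table.items()}

for doc, ratio in ratios.items():
    print(doc, ratio)
```

Under that assumption, D0 has the lowest ratio of the three documents.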
Observations: Homing (1/3)
• In the case of oscillating subjective documents, an annotator (a) first reads all sentences and (b) then revisits some of them
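The two-step homing pattern can be illustrated on a sequence of sentence visits. This is a sketch under stated assumptions: the eye-tracker output has already been mapped to an ordered list of sentence indices, the first pass is taken to end once every sentence has been visited at least once, and the names `split_passes` and `revisited_sentences` are hypothetical.

```python
from typing import List, Tuple


def split_passes(visits: List[int], n_sentences: int) -> Tuple[List[int], List[int]]:
    """Split an ordered list of sentence visits into first pass and the rest.

    The first pass ends at the visit that completes coverage of all sentences.
    If coverage is never completed, everything counts as first pass.
    """
    seen = set()
    for i, sentence in enumerate(visits):
        seen.add(sentence)
        if len(seen) == n_sentences:
            return visits[: i + 1], visits[i + 1 :]
    return visits, []


def revisited_sentences(visits: List[int], n_sentences: int) -> List[int]:
    """Sentences read again after the first pass, sorted and de-duplicated."""
    _, second_pass = split_passes(visits, n_sentences)
    return sorted(set(second_pass))


# Made-up visit sequence for a 5-sentence document (not data from the study).
visits = [0, 1, 2, 3, 4, 1, 3, 1]
print(split_passes(visits, 5))
print(revisited_sentences(visits, 5))
```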
Observations: Homing (2/3)
• Considerable overlap between sentences that are read in the second pass
• All of them are subjective.
Participant  TFD-SE  PTFD  TFC-SE
P5           7.3     8     21
P7           3.1     5     11
P9           51.94   10    26
P11          116.6   16    56
Reading statistics for D1
TFD: Total fixation duration for subjective extract; PTFD: Proportion of total fixation duration = (TFD)/(Total duration); TFC-SE: Total fixation count for subjective extract
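The three statistics defined in the legend can be computed from a fixation log as follows. This is a sketch, not the study's pipeline: it assumes each fixation is a (sentence index, duration) pair and that the subjective extract is given as a set of sentence indices. The function name `reading_stats` and the sample data are hypothetical, and PTFD is returned as a fraction because the slides do not state the units of the table.

```python
from typing import Dict, List, Set, Tuple


def reading_stats(
    fixations: List[Tuple[int, float]], subjective: Set[int]
) -> Dict[str, float]:
    """Compute TFD-SE, PTFD and TFC-SE from (sentence_index, duration) fixations."""
    total_duration = sum(duration for _, duration in fixations)
    on_extract = [duration for sentence, duration in fixations if sentence in subjective]
    tfd_se = sum(on_extract)
    return {
        "TFD-SE": tfd_se,                  # total fixation duration on the extract
        "PTFD": tfd_se / total_duration if total_duration else 0.0,
        "TFC-SE": len(on_extract),         # number of fixations on the extract
    }


# Made-up fixations (sentence index, duration); sentences 1 and 3 form the extract.
fixations = [(0, 200.0), (1, 300.0), (2, 500.0), (1, 250.0), (3, 750.0)]
stats = reading_stats(fixations, {1, 3})
print(stats)
```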
Observations: Homing (3/3)
• Homing at a sub-sentence level
– Sarcasm
  • Multiple regressions around the sarcasm portion for participant P1, document D1
  • Participant P1 does not correctly detect the sentiment of the document
– Thwarting
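Regressions around a portion of text can be counted from a word-level fixation sequence. The sketch below assumes fixations have been mapped to word indices and treats a regression as any fixation that lands on an earlier word than the previous fixation; `count_regressions` and the sample sequence are hypothetical and do not come from the study.

```python
from typing import List, Optional, Tuple


def count_regressions(
    word_fixations: List[int], span: Optional[Tuple[int, int]] = None
) -> int:
    """Count backward eye movements in a sequence of fixated word indices.

    If span=(start, end) is given, count only regressions that land on a
    word index within that inclusive range.
    """
    count = 0
    for prev, curr in zip(word_fixations, word_fixations[1:]):
        if curr < prev:  # the eye moved back to an earlier word
            if span is None or span[0] <= curr <= span[1]:
                count += 1
    return count


# Made-up sequence: the reader returns twice to words 3-4 and once to word 1.
word_fixations = [0, 1, 2, 3, 4, 5, 3, 4, 6, 7, 3, 8, 1]
print(count_regressions(word_fixations))          # all regressions
print(count_regressions(word_fixations, (3, 4)))  # regressions landing on words 3-4
```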
Conclusion & Future Work
• Based on how sentiment changes through a document, humans may perform subjectivity extraction as a result of anticipation or homing
• Applications:
– Pricing models for crowd-sourced annotation
– Sentiment classifiers that incorporate “sentiment runlengths”
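The slides do not define “sentiment runlengths”. One plausible reading, assumed here, is the length of each maximal run of consecutive sentences sharing the same polarity. The sketch below computes such runs from per-sentence labels; the function name `sentiment_run_lengths` and the labels are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple


def sentiment_run_lengths(polarities: List[int]) -> List[Tuple[int, int]]:
    """Return (polarity, run length) for each maximal run of equal labels."""
    return [(label, len(list(run))) for label, run in groupby(polarities)]


# Hand-assigned labels (an assumption): +1 positive, -1 negative.
print(sentiment_run_lengths([1, 1, 1]))          # one long run
print(sentiment_run_lengths([1, -1, 1]))         # three runs of length 1
print(sentiment_run_lengths([1, 1, -1, -1, 1]))  # mixed runs
```

Such run lengths could then be passed to a classifier as features, for example the number of runs or the longest run.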