Sentiment Detection


Page 1: Sentiment Detection

Sentiment Detection

Rik Sarkar (03305048)

Kedar Godbole (03305805)

Page 2: Sentiment Detection

Outline

Sentiment detection: the problem statement

Difficulties in sentiment detection

Approaches to sentiment detection

Conclusion

Project proposal

Page 3: Sentiment Detection

Problem Statement

Detect the polarity of the sentiment expressed about a particular topic in a document

Polarity:

- Positive

- Negative

- Mixed

- Neutral

Page 4: Sentiment Detection

Motivation

Reviews on the Web

Opinions about a product

Opinions about the individual aspects of a product

Movie/book reviews

Feedback/evaluation forms

Page 5: Sentiment Detection

Issues

Reference to multiple objects in the same document

- The NR70 is trendy. T-Series is fast becoming obsolete.

Dependence on the context of the document

- “Unpredictable” plot (positive) vs. “unpredictable” performance (negative)

Negations have to be captured

- Monochrome display is not what the user wants

Page 6: Sentiment Detection

Issues (contd.)

Metaphors/Similes

- The metallic body is solid as a rock

Part-of and Attribute-of relationships

- The small keypad is inconvenient

Absence of a polar word

- How can someone sit through this seminar?

Page 7: Sentiment Detection

Approaches to Sentiment Detection

Based on pre-selected sets of words

Naive Bayes

Support Vector Machines

Unsupervised learning

Enhancement by NLP

Page 8: Sentiment Detection

An Unsupervised Learning Technique

Extract phrases from the review based on patterns of POS tags

JJ – Adjective

RB – Adverb

NN – Noun

First word    Second word
JJ            NN
RB            JJ
JJ            JJ
NN            JJ
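The pattern table above can be sketched in a few lines of Python. This is only an illustration: the function name `extract_phrases`, the `PATTERNS` set, and the hand-tagged example sentence are assumptions made here, and the tags are supplied by hand rather than predicted by a tagger (they are also simplified to the tag set used on the slide).

```python
# Two-word POS patterns from the slide: (tag of first word, tag of second word).
PATTERNS = {("JJ", "NN"), ("RB", "JJ"), ("JJ", "JJ"), ("NN", "JJ")}

def extract_phrases(tagged_tokens):
    """Return every adjacent word pair whose POS tags match one of the patterns."""
    phrases = []
    for (w1, t1), (w2, t2) in zip(tagged_tokens, tagged_tokens[1:]):
        if (t1, t2) in PATTERNS:
            phrases.append(f"{w1} {w2}")
    return phrases

# Hypothetical, hand-tagged sentence (tags are given, not computed).
tagged = [("the", "DT"), ("bank", "NN"), ("offers", "VBZ"),
          ("low", "JJ"), ("fees", "NN"), ("and", "CC"),
          ("very", "RB"), ("friendly", "JJ"), ("staff", "NN")]

print(extract_phrases(tagged))
# ['low fees', 'very friendly', 'friendly staff']
```

In practice the tagged tokens would come from a POS tagger; only the matching step is shown here.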

Page 9: Sentiment Detection

Unsupervised Learning

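Pointwise Mutual Information (PMI) and Semantic Orientation (SO)

$$\mathrm{PMI}(\mathit{word1}, \mathit{word2}) = \log \frac{p(\mathit{word1} \;\&\; \mathit{word2})}{p(\mathit{word1})\, p(\mathit{word2})}$$

$$\mathrm{SO}(\mathit{phrase}) = \mathrm{PMI}(\mathit{phrase}, \text{“excellent”}) - \mathrm{PMI}(\mathit{phrase}, \text{“poor”})$$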
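As a rough illustration of the two definitions on this slide, PMI can be computed from simple occurrence counts. The function names, the count-based probability estimates, and the example numbers are assumptions made for this sketch; the logarithm base (2) is also an assumption, since the slide only says "log".

```python
import math

def pmi(count_both, count_w1, count_w2, total):
    """PMI(word1, word2) = log( p(word1 & word2) / (p(word1) * p(word2)) ).

    Probabilities are estimated as count / total, so the ratio simplifies to
    count_both * total / (count_w1 * count_w2). Base 2 is an assumption.
    """
    return math.log2(count_both * total / (count_w1 * count_w2))

def semantic_orientation(pmi_with_excellent, pmi_with_poor):
    """SO(phrase) = PMI(phrase, "excellent") - PMI(phrase, "poor")."""
    return pmi_with_excellent - pmi_with_poor

# Invented counts: in 1000 text windows, word1 occurs in 100, word2 in 50,
# and both occur together in 20. Independence would predict only 5.
print(pmi(count_both=20, count_w1=100, count_w2=50, total=1000))  # 2.0
```

A PMI of zero means the two words co-occur exactly as often as independence predicts; positive values mean they co-occur more often than that.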

Page 10: Sentiment Detection

Unsupervised Learning

Determine the Semantic Orientation (SO) of the phrases

Search on AltaVista

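$$\mathrm{SO}(\mathit{phrase}) = \log \frac{\mathrm{hits}(\mathit{phrase}\ \mathrm{NEAR}\ \text{“excellent”}) \cdot \mathrm{hits}(\text{“poor”})}{\mathrm{hits}(\mathit{phrase}\ \mathrm{NEAR}\ \text{“poor”}) \cdot \mathrm{hits}(\text{“excellent”})}$$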
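The hit-count form of SO could be computed along the following lines. This is a sketch under stated assumptions: the hit counts are passed in as plain numbers (no search engine is queried), all example counts are invented, the logarithm base is taken to be 2, and the small `smoothing` term that guards against division by zero is an addition of this sketch rather than something stated on the slide.

```python
import math

def so_from_hits(near_excellent, near_poor, hits_excellent, hits_poor,
                 smoothing=0.01):
    """SO(phrase) from search hit counts.

    near_excellent: hits(phrase NEAR "excellent")
    near_poor:      hits(phrase NEAR "poor")
    hits_excellent: hits("excellent")
    hits_poor:      hits("poor")
    smoothing:      hypothetical constant added to the NEAR counts so that
                    a phrase with zero co-occurrences does not break the log.
    """
    numerator = (near_excellent + smoothing) * hits_poor
    denominator = (near_poor + smoothing) * hits_excellent
    return math.log2(numerator / denominator)

# Invented counts: the phrase appears near "excellent" four times as often
# as near "poor", and both reference words are equally frequent overall.
score = so_from_hits(near_excellent=80, near_poor=20,
                     hits_excellent=1000, hits_poor=1000)
print(score > 0)  # True: the phrase leans positive
```

Note that the probability of the phrase itself cancels out when the two PMI terms are subtracted, which is why only four hit counts are needed.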

Page 11: Sentiment Detection

Unsupervised Learning

Calculate average semantic orientation of document:

Extracted phrase          POS tags    Semantic Orientation
Low fees                  JJ NN        0.333
Online service            JJ NN        2.780
Inconveniently located    RB VB       -1.541

Average Semantic Orientation = 0.524
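The averaging step can be checked with the three phrases from the table. The decision rule shown (label the document positive when the average is above a threshold of zero) is an assumption of this sketch; the slide only reports the average.

```python
from statistics import mean

# (phrase, POS tags, semantic orientation) as listed on the slide.
phrases = [
    ("Low fees", "JJ NN", 0.333),
    ("Online service", "JJ NN", 2.780),
    ("Inconveniently located", "RB VB", -1.541),
]

def classify(scores, threshold=0.0):
    """Average the phrase scores and label the document.

    The threshold of 0.0 is a hypothetical choice for illustration.
    """
    average = mean(scores)
    label = "positive" if average > threshold else "negative"
    return average, label

average, label = classify([so for _, _, so in phrases])
print(round(average, 3), label)  # 0.524 positive
```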

Page 12: Sentiment Detection

Need for NLP

Identifying phrases is not enough – need to identify subject/object

- The NR70 is trendy. T-Series is fast becoming obsolete.

Need to identify part-of and attribute-of relationship

- The battery is long-lasting

Page 13: Sentiment Detection

Focus of the sentiment

Feature/attribute terms:

BNP - Base Noun Phrases: battery, display, keypad

dBNP - Definite Base Noun Phrases: “the display”

bBNP - Beginning Definite Base Noun Phrases: “The battery is long-lasting”

Page 14: Sentiment Detection

Sentiment Analyzer

Sentiment lexicon database

- <lexical_entry> <POS> <sent_category>

- “excellent” JJ +

Sentiment pattern database

- <predicate> <sent_category> <target>

- “I am impressed with the flash capabilities”

- impress + PP(by;with) target
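One possible in-memory representation of the two databases is sketched below. The class names, field names, and the helper `target_prepositions` are hypothetical; only the entry formats and the two example entries (“excellent” JJ + and impress + PP(by;with)) come from the slide.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LexiconEntry:
    """<lexical_entry> <POS> <sent_category>"""
    lexical_entry: str
    pos: str
    sent_category: str  # "+" or "-"

@dataclass(frozen=True)
class SentimentPattern:
    """<predicate> <sent_category> <target>"""
    predicate: str
    sent_category: str
    target: str  # where the target is found, e.g. "PP(by;with)"

# The two example entries from the slide.
LEXICON = {("excellent", "JJ"): LexiconEntry("excellent", "JJ", "+")}
PATTERNS = {"impress": SentimentPattern("impress", "+", "PP(by;with)")}

def target_prepositions(pattern):
    """Split a target spec such as 'PP(by;with)' into ['by', 'with']."""
    spec = pattern.target
    inner = spec[spec.index("(") + 1:spec.rindex(")")]
    return inner.split(";")

print(target_prepositions(PATTERNS["impress"]))  # ['by', 'with']
```

Keying the lexicon on (word, POS) keeps entries for the same word with different parts of speech separate.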

Page 15: Sentiment Detection

SA (contd.)

Identify sentences containing feature terms

Ternary expressions (T-expressions)

- +ve/-ve sentiment verbs: <target, verb, “”>

- trans verbs: <target, verb, source>

Binary expressions (B-expressions)

- <adjective, target>

Page 16: Sentiment Detection

SA (contd.)

Identify sentiment phrases within subject, object phrases

Associating sentiment with the target

- Based on sentiment patterns

“I was impressed by the flash capabilities”

“This camera takes excellent pictures”

- Based on B-expressions

“Poor performance in a dark room”
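The two association routes on this slide can be illustrated with a deliberately simplified sketch. It works on plain token lists with no parser, handles only the single "impress" pattern, and uses a two-word lexicon; all function names are hypothetical, and a real sentiment analyzer would rely on syntactic parsing rather than token positions.

```python
# Tiny illustrative lexicon: word -> sentiment category.
LEXICON = {"excellent": "+", "poor": "-"}
DETERMINERS = {"the", "a", "an"}

def b_expression_sentiment(adjective, target):
    """Sentiment for a B-expression <adjective, target>, or None if unknown."""
    polarity = LEXICON.get(adjective.lower())
    return None if polarity is None else (target, polarity)

def pattern_sentiment(tokens):
    """Match 'impressed by|with <target>' and assign '+' to the target."""
    lowered = [t.lower() for t in tokens]
    for i, token in enumerate(lowered[:-1]):
        if token == "impressed" and lowered[i + 1] in {"by", "with"}:
            target_tokens = tokens[i + 2:]
            # Drop a leading determiner so the target is the bare noun phrase.
            if target_tokens and target_tokens[0].lower() in DETERMINERS:
                target_tokens = target_tokens[1:]
            if target_tokens:
                return (" ".join(target_tokens), "+")
    return None

print(pattern_sentiment("I was impressed by the flash capabilities".split()))
# ('flash capabilities', '+')
print(b_expression_sentiment("Poor", "performance"))
# ('performance', '-')
```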

Page 17: Sentiment Detection

Other issues

Position of the sentiment words

- Words at the beginning and end of a review

Sentiment about the characters in the movie versus sentiment about the actors in the movie (abstraction).

“He played the role of a very corrupt politician”

“He played the role brilliantly”

Page 18: Sentiment Detection

Conclusion

Sentiment detection can be used in areas ranging from marketing research to movie reviews.

Sentiment Detection is a “hard” problem due to context-sensitivity, complex sentences, etc.

Statistical methods should be augmented with NLP techniques.

Page 19: Sentiment Detection

References

Yi, Nasukawa, et al. Sentiment Analyzer: Extracting Sentiments about a Given Topic using NLP techniques. Proceedings of the Third IEEE International Conference on Data Mining, p. 427, Nov 19-22, 2003

Peter D. Turney. Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews. Proceedings of the 40th Annual Meeting of ACL, p. 417-424, 2002

Matthew Hurst and Kamal Nigam. Retrieving Topical Sentiments from Online Document Collections. Document Recognition and Retrieval XI, p. 27-34, 2004

Page 20: Sentiment Detection

References (contd.)

B. Pang, L. Lee, and S. Vaithyanathan. Thumbs up? Sentiment classification using Machine Learning techniques. Proceedings of the 2002 ACL EMNLP Conference, p. 79-86, 2002

Page 21: Sentiment Detection

Project

Sentiment analyzer for a specific domain

Given a set of features and an initial list of polar words

Learns new polar words from the documents analyzed