A Joint Model of Text and Aspect Ratings for Sentiment Summarization

Ivan Titov (University of Illinois)

Ryan McDonald (Google Inc.)

ACL 2008

Introduction

An example of an aspect-based summary

Q1: Aspect identification and mention extraction (coarse or fine?)

Q2: Sentiment classification

Introduction: Extraction problem

Assumptions for their model

Ratable aspects normally represent coherent topics which can be potentially discovered from co-occurrence information in the text.

Most predictive features of an aspect rating are features derived from the text segments discussing the corresponding aspect.

Multi-Aspect Sentiment model (MAS)

This model consists of two parts:

Multi-Grain Latent Dirichlet Allocation (Titov and McDonald, 2008): builds topics.

A set of sentiment predictors: force specific topics to be correlated with a particular aspect.

MG-LDA (1)

MG-LDA is an extension of LDA (Latent Dirichlet Allocation). Standard LDA tends to build topics that globally classify terms into product instances (e.g., Creative Labs MP3 players versus iPods, or New York versus Paris hotels) rather than into ratable aspects.

MG-LDA models global topics and local topics.

The distribution of global topics is fixed for a document, while the distribution of local topics is allowed to vary across the document.

MG-LDA (2)

Ratable aspects will be captured by local topics and global topics will capture properties of reviewed items.

Example: “. . . public transport in London is straightforward, the tube station is about an 8 minute walk . . . or you can get a bus for £1.50”

This extract is a mixture of the topic London (London, tube, £) and the ratable aspect location (transport, walk, bus).

Local topics are reused between very different types of items.

MG-LDA (3)

A document is represented as a set of sliding windows, each covering T adjacent sentences.

Each window v in document d has an associated distribution over local topics and a distribution defining its preference for local topics versus global topics.

A word can be sampled using any window covering its sentence s, where the window is chosen according to a categorical distribution.

Window overlap permits the model to exploit a larger co-occurrence domain.
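The sampling steps described on this slide can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the names `covering_windows`, `sample_topic`, `psi`, `pi_local`, `theta_global`, and `theta_local` are hypothetical, windows are assumed to start at sentences 0 to n - T, and the preference for local versus global topics is reduced to a single probability per window.

```python
import random


def covering_windows(s, num_sentences, T):
    """Start indices of the T-sentence windows that cover sentence s.

    Assumption: windows start at sentences 0 .. num_sentences - T
    (a document shorter than T gets a single window starting at 0).
    """
    last_start = max(num_sentences - T, 0)
    return list(range(max(s - T + 1, 0), min(s, last_start) + 1))


def sample_topic(s, num_sentences, T, psi, pi_local,
                 theta_global, theta_local, rng):
    """Draw (window, kind, topic) for one word of sentence s."""
    windows = covering_windows(s, num_sentences, T)
    # psi[s] holds one weight per covering window (categorical distribution).
    v = rng.choices(windows, weights=psi[s])[0]
    # pi_local[v]: the window's preference for local over global topics.
    if rng.random() < pi_local[v]:
        kind, probs = "local", theta_local[v]   # varies across the document
    else:
        kind, probs = "global", theta_global    # fixed for the document
    z = rng.choices(range(len(probs)), weights=probs)[0]
    return v, kind, z


# Usage: a 5-sentence document with windows of T = 3 sentences.
n, T = 5, 3
psi = {s: [1.0] * len(covering_windows(s, n, T)) for s in range(n)}
pi_local = {v: 0.7 for v in range(n - T + 1)}
theta_local = {v: [0.5, 0.5] for v in range(n - T + 1)}
theta_global = [0.2, 0.3, 0.5]
draw = sample_topic(2, n, T, psi, pi_local, theta_global, theta_local,
                    random.Random(0))
```

Because sentence 2 is covered by three windows here, its words can borrow local-topic statistics from any of them, which is the larger co-occurrence domain mentioned above.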

Symmetrical Dirichlet prior for

Dirichlet distribution: Dir(α)

Its probability density function returns the belief that the probabilities of K rival events are x_i, given that each event has been observed α_i - 1 times.

Several images of the probability density of the Dirichlet distribution when K=3 for various parameter vectors α. Clockwise from top left: α=(6, 2, 2), (3, 7, 5), (6, 2, 6), (2, 3, 4).
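As a small illustration of the density described on this slide, the sketch below evaluates the Dirichlet density with the Python standard library only. The function names `dirichlet_pdf` and `dirichlet_mode` are hypothetical, and as a simplification the density is evaluated only at interior points of the simplex (any point with a non-positive coordinate returns 0).

```python
import math


def dirichlet_pdf(x, alpha):
    """Density of Dir(alpha) at an interior point x of the simplex."""
    if len(x) != len(alpha):
        raise ValueError("x and alpha must have the same length")
    # Simplification: boundary or off-simplex points are given density 0.
    if any(v <= 0 for v in x) or abs(sum(x) - 1.0) > 1e-9:
        return 0.0
    # log of the normalizer Gamma(sum alpha) / prod Gamma(alpha_i)
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    # log of prod x_i^(alpha_i - 1)
    log_kernel = sum((a - 1.0) * math.log(v) for a, v in zip(alpha, x))
    return math.exp(log_norm + log_kernel)


def dirichlet_mode(alpha):
    """Mode of Dir(alpha); this formula is valid only when every alpha_i > 1."""
    k, total = len(alpha), sum(alpha)
    return [(a - 1.0) / (total - k) for a in alpha]


# Usage: one of the parameter vectors from the figure caption.
alpha = (6, 2, 2)
peak = dirichlet_mode(alpha)
density_at_peak = dirichlet_pdf(peak, alpha)
```

With α = (6, 2, 2) the mode sits at (5/7, 1/7, 1/7), matching the reading of α_i - 1 as prior observation counts: the first event has been "seen" five times and the others once each.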

Multi-Aspect Sentiment Model (1)

Assumption: the text of the review discussing an aspect is predictive of its rating.

MAS introduces a classifier for each aspect, which is used to predict its rating.

Only words assigned to that topic can participate in the prediction of the sentiment rating of the aspect.

However, ratings for different aspects can be correlated. Ex. A very negative opinion about cleanliness is likely to mean low ratings not only for rooms but also for service and dining.
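The restriction that only words assigned to an aspect's topic take part in predicting that aspect's rating can be sketched as below. This is a deliberately simplified stand-in, not the MAS predictor itself: it uses a plain linear score over unigram counts, and the names `aspect_features` and `aspect_score` as well as the example weights are hypothetical.

```python
from collections import Counter


def aspect_features(words, topic_assignments, aspect_topic):
    """Count only the words whose topic assignment matches the aspect."""
    return Counter(
        w for w, z in zip(words, topic_assignments) if z == aspect_topic
    )


def aspect_score(features, weights):
    """Linear score: sum of weight * count over the aspect's words."""
    return sum(weights.get(w, 0.0) * c for w, c in features.items())


# Usage: made-up words, assignments, and weights for illustration only.
words = ["staff", "friendly", "room", "dirty"]
assignments = ["service", "service", "rooms", "rooms"]
weights = {"friendly": 1.0, "dirty": -1.5}

service_score = aspect_score(
    aspect_features(words, assignments, "service"), weights)
rooms_score = aspect_score(
    aspect_features(words, assignments, "rooms"), weights)
```

In this toy example "dirty" lowers only the rooms score, because it is not assigned to the service topic; this independence is exactly what the correlation between aspect ratings calls into question.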

Multi-Aspect Sentiment Model (2)

Reviewers also express opinions about an item in general, without referring to any particular aspect. Ex. "This product is the worst I have ever purchased" -> low ratings for every aspect.

Aspect ratings are therefore predicted from the overall sentiment rating, with a correction computed for each aspect.

N-gram model:

Inference in MAS

Inference uses Gibbs sampling. The sampling distribution includes a factor that appears only if ratings are known.
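For readers unfamiliar with Gibbs sampling in topic models, here is a minimal collapsed Gibbs sampler for plain LDA. It is not the MAS sampler: it has no windows, no global/local split, and no rating-dependent factor. The function name `gibbs_lda` and the hyperparameter defaults are assumptions chosen for illustration.

```python
import random


def gibbs_lda(docs, num_topics, vocab_size,
              alpha=0.1, beta=0.01, sweeps=50, seed=0):
    """Collapsed Gibbs sampling for plain LDA; docs are lists of word ids."""
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Random initial topic for every token, with matching counts.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(sweeps):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                # Resample the topic and restore the counts.
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return assignments, doc_topic, topic_word


# Usage: two tiny documents over a vocabulary of four word ids.
docs = [[0, 1, 0, 1], [2, 3, 2, 3]]
assignments, doc_topic, topic_word = gibbs_lda(docs, num_topics=2, vocab_size=4)
```

In a model with observed ratings, the per-topic weight would carry an extra multiplicative term reflecting how well each candidate assignment explains the ratings; that term is omitted here.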

Experiments - Corpus

Reviews of hotels from TripAdvisor.com.

10,000 reviews (109,024 sentences, 2,145,313 words in total).

Every review was rated with at least 3 aspects: service, location, and rooms.

Ratings from 1 to 5.

Result Example

Evaluation

779 random sentences labeled with one or more aspects.

164, 176, 263 sentences for service, location, and rooms, respectively.

Results: Aspect Service

Results: Aspect Location

Results: Aspect Rooms