A Unified Relevance Model for Opinion Retrieval
![Page 1: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/1.jpg)
1
A Unified Relevance Model for Opinion Retrieval
(CIKM '09) Xuanjing Huang, W. Bruce Croft
Date: 2010/02/08
Speaker: Yu-Wen, Hsu
![Page 2: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/2.jpg)
2
Outline
Introduction
Formal Framework for Opinion Retrieval
Sentiment expansion techniques
Test Collections and Sentiment Resources
Experiments
Conclusion
![Page 3: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/3.jpg)
3
Introduction
Opinion retrieval is very different from traditional topic-based retrieval.
Relevant documents should not only be relevant to the targets, but also contain subjective opinions about them.
The text collections are more informal “word of mouth” web data.
![Page 4: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/4.jpg)
4
The greatest challenge: the difficulty of representing the user’s information need.
Typical queries: content words, plus some cue words.
The majority of previous work in opinion retrieval proceeds in two steps:
documents are first ranked by topical relevance only;
candidate relevant documents are then re-ranked by their opinion scores.
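The two-step pipeline described above can be sketched as follows. This is a minimal illustration, not the implementation of any particular earlier system: the function name `two_stage_rank`, the score dictionaries, and the candidate cutoff `k` are all hypothetical.

```python
def two_stage_rank(topic_scores, opinion_scores, k):
    """Rank by topical relevance, keep the top k, then re-rank those by opinion score.

    topic_scores and opinion_scores map document ids to floats (hypothetical inputs).
    """
    # Step 1: candidate selection by topical relevance only.
    candidates = sorted(topic_scores, key=topic_scores.get, reverse=True)[:k]
    # Step 2: re-rank only the candidates by their opinion scores.
    return sorted(candidates, key=lambda d: opinion_scores.get(d, 0.0), reverse=True)


topic = {"d1": 0.9, "d2": 0.8, "d3": 0.1}
opinion = {"d1": 0.2, "d2": 0.7, "d3": 0.99}
ranking = two_stage_rank(topic, opinion, k=2)
```

Note that `d3` never reaches the second step even though it is the most opinionated document, which is the kind of separation between topic and opinion that a unified model avoids.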
![Page 5: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/5.jpg)
5
The novelty and effectiveness of the approach:
topic retrieval and sentiment classification can be converted into a unified opinion retrieval procedure through query expansion;
it is not only suitable for English data, but also effective on a Chinese benchmark collection;
it can be extended to other tasks where user queries are inadequate to express the information need, such as geographical information retrieval and music retrieval.
![Page 6: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/6.jpg)
6
Formal Framework for Opinion Retrieval
Suppose we can obtain an opinion relevance model from a query.
R: the opinion relevance model for a query
D: the document model
V: the vocabulary
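The ranking formula from this slide is not preserved in the transcript. One form that is consistent with the symbols R, D, and V defined above, and that is standard for relevance models, is the cross-entropy score below. Treating it as the slide's formula is an assumption.

```latex
\mathrm{Score}(D) = \sum_{w \in V} P(w \mid R)\,\log P(w \mid D)
```

Ranking by this sum is rank-equivalent to ranking by the negative KL divergence between R and D, because the entropy of R does not depend on the document.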
![Page 7: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/7.jpg)
7
Content words and opinion words contribute differently to opinion retrieval.
Topical relevance is determined by the matching of content words and the user query.
The sentiment and subjectivity of a document are decided by the opinion words.
We divide the vocabulary V into two disjoint subsets:
a content word vocabulary CV
an opinion word vocabulary OV
![Page 8: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/8.jpg)
8
An opinion relevance model is a unified model of both topic relevance and sentiment.
In this model, Score(D), the score of a document, is defined as (equation on the slide image):
CV: the content terms are the original query terms, or terms obtained by any query expansion technique.
OV: only opinion words, not content words, are used to expand the original query.
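A minimal sketch of scoring with the vocabulary split into CV and OV, assuming the score is a weighted sum of log document probabilities over content terms plus the same sum over opinion terms. The Dirichlet smoothing, the parameter `mu`, the fallback probability for unseen words, and all function names are assumptions of this sketch, not details taken from the slides.

```python
import math
from collections import Counter


def doc_log_prob(word, counts, doc_len, coll_prob, mu):
    """Dirichlet-smoothed log P(w|D); the smoothing choice is an assumption."""
    # Words missing from the collection model get a tiny fallback probability.
    return math.log((counts.get(word, 0) + mu * coll_prob.get(word, 1e-9)) / (doc_len + mu))


def opinion_relevance_score(doc_tokens, content_weights, opinion_weights, coll_prob, mu=2000.0):
    """Sum the weighted log-probabilities of content terms (CV) and opinion terms (OV)."""
    counts = Counter(doc_tokens)
    n = len(doc_tokens)
    topic_part = sum(p * doc_log_prob(w, counts, n, coll_prob, mu)
                     for w, p in content_weights.items())
    opinion_part = sum(p * doc_log_prob(w, counts, n, coll_prob, mu)
                       for w, p in opinion_weights.items())
    return topic_part + opinion_part


coll = {"mozart": 0.01, "genius": 0.01, "piano": 0.01}
opinionated = opinion_relevance_score(["mozart", "genius"], {"mozart": 1.0}, {"genius": 0.5}, coll)
factual = opinion_relevance_score(["mozart", "piano"], {"mozart": 1.0}, {"genius": 0.5}, coll)
```

Both toy documents match the content term equally, so the document containing the opinion word receives the higher score in a single ranking pass.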
![Page 9: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/9.jpg)
9
Sentiment expansion techniques
Query-independent sentiment expansion
Query-dependent sentiment expansion
Mixture relevance model
![Page 10: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/10.jpg)
10
*Query-independent sentiment expansion
Sentiment expansion based on seed words
positive seeds, negative seeds
advantage: they nearly always express strong sentiment
disadvantage: some of them are infrequent in the text collections
![Page 11: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/11.jpg)
11
Sentiment expansion based on text corpora
Only the most frequent opinion words are selected to expand the original queries.
Occurrences in a general corpus may be misleading.
Reliable opinionated corpora: the Cornell movie review datasets and the MPQA Corpus.
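Selecting the most frequent opinion words from an opinionated corpus can be sketched as below. The tokenized toy corpus, the lexicon, and the name `top_opinion_words` are hypothetical; a real run would read one of the corpora named above and a sentiment lexicon.

```python
from collections import Counter


def top_opinion_words(corpus_docs, opinion_lexicon, n):
    """Return the n lexicon words that occur most often in the tokenized corpus."""
    counts = Counter(tok for doc in corpus_docs for tok in doc if tok in opinion_lexicon)
    return [word for word, _ in counts.most_common(n)]


reviews = [["great", "movie", "great"], ["bad", "plot", "great"], ["bad", "acting"]]
lexicon = {"great", "bad", "awful"}
expansion_terms = top_opinion_words(reviews, lexicon, n=2)
```

Counting only inside an opinionated corpus is the point of the design: a frequent word in a general corpus may be used mostly in a non-opinionated sense.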
![Page 12: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/12.jpg)
12
Sentiment expansion based on relevance data
The goal is to automatically learn, from training data, a function for ranking documents.
Given a set of query relevance judgments, we can define the individual contribution of an opinion word to opinion retrieval.
w: an opinion word, used to expand every original query.
contribution of w: the maximum increase in the MAP of the expanded queries over the set of original queries.
![Page 13: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/13.jpg)
13
Qi: the i-th query in the training set
Qi ∪ w: the i-th expanded query
weight(w): the weight of w
AP(Qi): the average precision of the retrieved documents for Qi
AP(Qi ∪ w, weight(w)): the average precision of the retrieved documents for the expanded query when the weight of w is set to weight(w)
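The definitions above suggest computing the contribution of w as the largest mean gain in average precision over the training queries, maximized over candidate weights. The sketch below assumes a grid of candidate weights and a hypothetical retrieval callback `run_query(query, word, weight)` that returns a ranked list of document ids; neither is specified in the slides.

```python
def average_precision(ranked_ids, relevant_ids):
    """Standard (non-interpolated) average precision for one ranked list."""
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            total += hits / rank
    return total / len(relevant_ids) if relevant_ids else 0.0


def contribution(word, queries, run_query, qrels, weights):
    """Max over candidate weights of the mean AP gain from adding `word` to every query."""
    # Baseline: the original queries, with no expansion word.
    base = [average_precision(run_query(q, None, 0.0), qrels[q]) for q in queries]
    best = float("-inf")
    for wt in weights:
        expanded = [average_precision(run_query(q, word, wt), qrels[q]) for q in queries]
        gain = sum(e - b for e, b in zip(expanded, base)) / len(queries)
        best = max(best, gain)
    return best


def toy_run(query, word, weight):
    # Hypothetical retrieval stub: a heavily weighted "good" pulls d1 to the top.
    return ["d1", "d2"] if word == "good" and weight >= 0.5 else ["d2", "d1"]


gain_good = contribution("good", ["q1"], toy_run, {"q1": {"d1"}}, weights=[0.1, 0.5])
```

Words can then be sorted by this contribution and the top few used to expand every query.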
![Page 14: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/14.jpg)
14
*Query-dependent sentiment expansion
A target is always associated with some particular opinion words, e.g. Mozart: Austrian musician (genius, famous).
Extract opinion words from a set of user-provided relevant opinionated documents.
When there are no user-provided relevant documents, the relevant document set C can be acquired through pseudo-relevance feedback:
rank documents using query likelihood scores;
select some top-ranked documents as the pseudo-relevant set C.
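The pseudo-relevance feedback steps above can be sketched as follows. Dirichlet-smoothed query likelihood and raw frequency counts inside C are assumptions of this sketch; the slides do not say how opinion words in C are weighted, and all names are hypothetical.

```python
import math
from collections import Counter


def query_likelihood(query_terms, doc_tokens, coll_prob, mu):
    """Dirichlet-smoothed log query likelihood of one tokenized document."""
    counts = Counter(doc_tokens)
    n = len(doc_tokens)
    return sum(math.log((counts[t] + mu * coll_prob.get(t, 1e-9)) / (n + mu))
               for t in query_terms)


def query_dependent_opinion_words(query_terms, docs, lexicon, coll_prob,
                                  top_docs, top_words, mu=2000.0):
    """Rank docs by query likelihood, take the top ones as C, count lexicon words in C."""
    ranked = sorted(docs, key=lambda d: query_likelihood(query_terms, d, coll_prob, mu),
                    reverse=True)
    pseudo_relevant = ranked[:top_docs]  # the pseudo-relevant set C
    counts = Counter(t for d in pseudo_relevant for t in d if t in lexicon)
    return [word for word, _ in counts.most_common(top_words)]


docs = [["mozart", "genius", "famous", "genius"],
        ["mozart", "genius"],
        ["football", "terrible", "terrible", "terrible"]]
words = query_dependent_opinion_words(["mozart"], docs, {"genius", "famous", "terrible"},
                                      {"mozart": 0.1}, top_docs=2, top_words=1, mu=10.0)
```

The off-topic document is full of opinion words but never enters C, so only words that co-occur with the target are proposed for expansion.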
![Page 15: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/15.jpg)
15
*Mixture relevance model
The mixture combines query-independent and query-dependent expansion.
Query-independent: the most valuable opinion words are always general words and can be used to express opinions about any target.
Query-dependent: the words most likely to co-occur with the terms in the original query are used for expansion. These words are used to express opinions about particular targets.
![Page 16: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/16.jpg)
16
The document score is the interpolation of the scores assigned by the original query, query-independent sentiment expansion, and query-dependent sentiment expansion.
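A minimal sketch of such an interpolation, assuming a linear combination whose three weights sum to one. The parameter names `alpha` and `beta` and the function name are hypothetical; the slide's actual formula and weights are not preserved in the transcript.

```python
def mixture_score(original, independent, dependent, alpha, beta):
    """Linearly interpolate three component scores.

    alpha weights the original query, beta the query-independent expansion,
    and the remainder (1 - alpha - beta) the query-dependent expansion.
    """
    if alpha < 0.0 or beta < 0.0 or alpha + beta > 1.0:
        raise ValueError("alpha and beta must be non-negative and sum to at most 1")
    return alpha * original + beta * independent + (1.0 - alpha - beta) * dependent


combined = mixture_score(-2.0, -4.0, -6.0, alpha=0.5, beta=0.25)
```

Setting `alpha + beta` to 1 drops the query-dependent term and falls back to query-independent expansion alone.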
![Page 17: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/17.jpg)
17
Test Collections and Sentiment Resources
Both evaluations aim at locating documents that express an opinion about a given target.
![Page 18: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/18.jpg)
18
*Sentiment resources
English resources
General Inquirer: 3,600 opinion words
OpinionFinder’s Subjectivity Lexicon: >5,600
![Page 19: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/19.jpg)
19
Chinese resources
HowNet Sentiment Vocabulary: 7,000
![Page 20: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/20.jpg)
20
Experiments
![Page 21: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/21.jpg)
21
![Page 22: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/22.jpg)
22
![Page 23: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/23.jpg)
23
If retrieval effectiveness is preferred, mixture approaches should be adopted.
If retrieval efficiency is preferred, query-independent sentiment expansion should be adopted.
![Page 24: A Unified Relevance Model for Opinion Retrieval](https://reader036.fdocuments.us/reader036/viewer/2022062518/56814654550346895db369c2/html5/thumbnails/24.jpg)
24
Conclusion
We have proposed the opinion relevance model, a formal framework for directly modeling the information need for opinion retrieval.
Query terms are expanded with a small number of opinion words to represent the information need.
We propose a series of sentiment expansion approaches to find the most appropriate query-independent or query-dependent opinion words.