Abstractive Review Summarization

Review Summarization

SNLP PROJECT, AUTUMN 2015

Department of Computer Science, IIT Kharagpur

Guided By : Dr. Pawan Goyal

Overview

• Introduction
• Problem Statement
• Approach
• Summarization Framework
• Discourse Parsing
• Aspect Rhetorical Relation Graph
• Content Selection and Structuring
• Abstract Generation
• Microplanning

Introduction

Product Summary

• Plays a vital role for both customers and manufacturers.

Effective Review Summary

• Shows how good a product is across different parameters and aspects.

• Is more abstract, captures more parameters, and prioritizes them by the number of times they are mentioned.

• Manufacturers use it to improve the product.

Problem Statement

• Build an abstractive summarization system for product reviews that performs aspect-based sentiment analysis and exploits the discourse structure of the reviews, assuming no prior domain knowledge.

• Generate an aspect-based abstract from multiple reviews of a product.

• Use a product-independent, template-based NLG framework to generate the abstract from the selected content.

Approach

• First, we use the Stanford NLP API to generate the universal dependencies and the parse tree.

• Then we extract aspects from the dependency tree bank of a review.

• By analysing the reviews, we generate an annotated review with each aspect and its sentiment polarity and strength.

• Then we apply a discourse parser to each review to obtain a discourse tree for every review, and modify each discourse tree so that it contains only the aspects.

Approach (continued)

• After that, we aggregate the aspect discourse trees into a graph, use a PageRank algorithm to select a subgraph representing the most important aspects and the rhetorical relations between them, and then transform the subgraph into an aspect tree.

• Finally, we generate a natural language summary by applying a template-based NLG framework.
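The ranking step can be illustrated with a small weighted PageRank over aspect tuples. This is only a sketch under stated assumptions: the function name `weighted_pagerank`, the damping factor, the iteration count, and the choice that each edge points from the first aspect to the second are all assumptions, not details confirmed by the slides. Reversing the edge direction changes which aspects rank highest. The edge list reuses the five tuples reported in the results.

```python
from collections import defaultdict


def weighted_pagerank(edges, damping=0.85, iterations=50):
    """Rank aspects from (source, relation, target, weight) tuples.

    Assumption: each edge points from `source` to `target`, and a node
    passes its rank along its outgoing edges in proportion to their weights.
    """
    out_weight = defaultdict(float)
    nodes = set()
    for src, _rel, dst, weight in edges:
        out_weight[src] += weight
        nodes.update((src, dst))
    nodes = sorted(nodes)
    n = len(nodes)

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if out_weight[v] == 0.0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        for src, _rel, dst, weight in edges:
            new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank


# Tuples taken from the results slides: aspect, relation, aspect, weight.
edges = [
    ("camera", "Elaboration", "photo quality", 0.5),
    ("camera", "Elaboration", "auto mode", 0.705),
    ("camera", "Elaboration", "control", 0.5),
    ("photo quality", "Background", "auto mode", 0.75),
    ("control", "Contrast", "auto mode", 0.66),
]

ranks = weighted_pagerank(edges)
top_aspects = sorted(ranks, key=ranks.get, reverse=True)
for aspect in top_aspects:
    print(f"{aspect}: {ranks[aspect]:.3f}")
```

A subgraph could then be chosen by keeping the highest-ranked aspects and the edges between them; how many aspects to keep is a tuning choice the slides do not specify.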

Aspect Extraction

• From the dependency tree bank of a review, we extract all the noun phrases and the nouns.

• The noun phrases are then classified further into categories such as adjective based, noun based and determiner based.

• We then apply the dependency relations, following a specific set of rules, to identify the aspects that contribute to the sentiment of each sentence.
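A minimal sketch of the noun phrase step, assuming the sentence has already been parsed. The `Token` type, the hand-written parse of the example sentence, and the mapping of the three categories to the `amod`, `compound` and `det` relations are illustrative assumptions; the actual rule set used in the project is not given in the slides.

```python
from typing import NamedTuple


class Token(NamedTuple):
    idx: int     # 1-based position in the sentence
    word: str
    pos: str     # Penn Treebank tag, e.g. NN, JJ, DT
    head: int    # index of the governing token, 0 for the root
    deprel: str  # dependency relation to the head


def noun_phrase_candidates(tokens):
    """Return (phrase, category) pairs for each head noun in the sentence."""
    candidates = []
    for tok in tokens:
        if not tok.pos.startswith("NN"):
            continue
        if tok.deprel == "compound":
            continue  # modifier nouns are reported inside their head's phrase
        modifiers = [
            t for t in tokens
            if t.head == tok.idx and t.deprel in ("amod", "compound", "det")
        ]
        relations = {t.deprel for t in modifiers}
        # Assumed mapping from modifier relation to the category names.
        if "amod" in relations:
            category = "adjective based"
        elif "compound" in relations:
            category = "noun based"
        elif "det" in relations:
            category = "determiner based"
        else:
            category = "bare noun"
        words = sorted(modifiers + [tok], key=lambda t: t.idx)
        candidates.append((" ".join(t.word for t in words), category))
    return candidates


# Hand-written parse of "The camera has great photo quality" (hypothetical input).
sentence = [
    Token(1, "The", "DT", 2, "det"),
    Token(2, "camera", "NN", 3, "nsubj"),
    Token(3, "has", "VBZ", 0, "root"),
    Token(4, "great", "JJ", 6, "amod"),
    Token(5, "photo", "NN", 6, "compound"),
    Token(6, "quality", "NN", 3, "dobj"),
]

candidates = noun_phrase_candidates(sentence)
print(candidates)
```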

Annotated Review

• From each review sentence, we generate a graph over the words using relations such as nsubj, amod, advmod and dobj. This graph is essentially the dependency tree, presented in a more structured form for traversal.

• Given an aspect, we look for the patterns described in [5], which are built on dependency relations. Using these patterns, we trace the sentiment flow from the polar words (derived from SenticNet [6]) to the aspect words. In the end, the patterns determine the polarity of each aspect.

• The strength of the sentiment is obtained by matching the modifiers against a dictionary, which gives the final strength for each aspect.
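The polarity and strength steps above can be sketched as follows. The two dictionaries are toy stand-ins (the first for a SenticNet lookup, the second for the modifier dictionary), the default strength is an assumption, and only two simple patterns are covered; negation and the fuller pattern set from the cited work are not handled.

```python
# Toy stand-in for a polarity lexicon lookup (hypothetical values).
POLARITY = {"sharp": 1, "great": 1, "slow": -1}

# Hypothetical modifier dictionary mapping adverbs to a strength of 1 to 3.
MODIFIER_STRENGTH = {"slightly": 1, "very": 3, "really": 3}
DEFAULT_STRENGTH = 2  # assumed strength when no modifier is present


def aspect_scores(dependencies, aspects):
    """Score aspects from (relation, head, dependent) word triples.

    Two patterns are covered:
      amod(aspect, opinion)   e.g. "great lens"
      nsubj(opinion, aspect)  e.g. "the pictures are sharp"
    An advmod attached to the opinion word sets the strength.
    """
    advmods = {head: dep for rel, head, dep in dependencies if rel == "advmod"}
    scores = {}
    for rel, head, dep in dependencies:
        if rel == "amod" and head in aspects:
            aspect, opinion = head, dep
        elif rel == "nsubj" and dep in aspects:
            aspect, opinion = dep, head
        else:
            continue
        polarity = POLARITY.get(opinion)
        if polarity is None:
            continue  # the opinion word is not in the lexicon
        strength = MODIFIER_STRENGTH.get(advmods.get(opinion), DEFAULT_STRENGTH)
        scores[aspect] = polarity * strength
    return scores


# "The pictures are really sharp, the zoom is slightly slow, great lens."
dependencies = [
    ("nsubj", "sharp", "pictures"),
    ("advmod", "sharp", "really"),
    ("nsubj", "slow", "zoom"),
    ("advmod", "slow", "slightly"),
    ("amod", "lens", "great"),
]

scores = aspect_scores(dependencies, {"pictures", "zoom", "lens"})
print(scores)
```

Multiplying a polarity of ±1 by a strength of 1 to 3 keeps every score inside the [-3, +3] range used by the summarization framework.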

[Figure: summarization pipeline. Review 1, Review 2, …, Review N → Discourse parser → ADT 1, ADT 2, …, ADT N → ARRG generation → ARRG → Weighted page ranking → Sub Graph Generation → Microplanning → Sentence realization]

Summarization Framework

• Generates a summary from multiple input reviews based on an Aspect Hierarchy Tree (AHT) that reflects the importance of aspects as well as the relationships between them.

• In our framework, an AHT is generated automatically from the set of input reviews, where each sentence of every review is marked with the aspects present in that sentence and the polarity of the opinions about them.

• P/S (polarity/strength) scores are integer values in the range [-3, +3], where +3 is the most positive and -3 is the most negative polarity value.

Abstract Generation

• The automatic generation of a natural language summary in our system involves two tasks:

i) Microplanning, which covers lexical selection.

ii) Sentence realization, which produces English text from the output of the microplanner.

Microplanning and Sentence Realization

• Once the content is selected and structured, it is passed to the microplanning module, which performs lexical choice.

• Lexical choice is an important component of microplanning.

• Lexical choice is formulated in our system based on a “formal” style and “fluent” connectivity among other lexical units.

• In sentence realization, we generate abstract sentences for aspects with no children and supporting sentences for aspects with children.
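A minimal sketch of template-based realization in this spirit. The quantifier thresholds, the phrase table and the function names are hypothetical; only the general sentence shape is borrowed from the sample summary in the results. Here a supporting clause is appended when the aspect has children, which is one possible reading of the bullet above.

```python
def quantifier(count, total):
    """Pick a quantifier phrase from the share of reviewers (assumed thresholds)."""
    share = count / total
    if share == 1.0:
        return "All reviewers"
    if share >= 0.6:
        return "Most reviewers"
    if share >= 0.4:
        return "About half of the reviewers"
    return "Some reviewers"


# Hypothetical lexical choice: P/S score in [-3, +3] mapped to a verb phrase.
OPINION_PHRASES = {
    3: "absolutely liked",
    2: "really liked",
    1: "liked",
    0: "had mixed feelings about",
    -1: "disliked",
    -2: "really disliked",
    -3: "absolutely disliked",
}


def realize(aspect, count, total, score, children=()):
    """Fill one sentence template for an aspect and its child aspects."""
    sentence = (
        f"{quantifier(count, total)} ({count} people) mentioned the {aspect} "
        f"and {OPINION_PHRASES[score]} this feature"
    )
    if children:
        # Supporting clause for aspects that have children in the aspect tree.
        sentence += ", mainly because of the " + " and ".join(children)
    return sentence + "."


print(realize("size", 36, 51, 2))
print(realize("pictures", 29, 51, 2, children=["clarity"]))
```

Keeping the quantifier and the opinion phrase in separate lookups means the wording can be tuned without touching the template itself.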

Results Obtained

We have obtained the discourse parse trees of the reviews and identified the aspects with their polarity and strength. Rhetorical relations among the EDUs are also identified. A small snapshot of the result is provided below:

( Nucleus (span 1 3) (rel2par Joint)
  ( Satellite (leaf 1) (rel2par Attribution) (text _!_!I want to start off!__!) )
  ( Nucleus (span 2 3) (rel2par span)
    ( Satellite (leaf 2) (rel2par Attribution) (text _!_!saying!__!) )
    ( Nucleus (leaf 3) (rel2par span) (text _!_!that this camera is small for a reason . <s>!__!) )
  )

Results Obtained

From the discourse tree, we have generated the aspect discourse tree (ADT), which identifies the underlying aspect of each EDU and the rhetorical relations among them.

room,Evaluation,small,0.261905
room,Evaluation,camera,0.33333
room,Evaluation,size,0.357143
memory,Elaboration,size,0.547619
size,Manner-Means,camera,0.761905
camera,Evaluation,size,0.166667
camera,Evaluation,memory,0.261905
camera,Contrast,small,0.5

Results Obtained

We have generated the ARRG of the product from the ADTs produced by the previous component. The output snippet is provided below:

camera,Elaboration,auto mode,0.23
photo quality,Background,auto mode,0.75
camera,Elaboration,photo quality,0.5
camera,Elaboration,auto mode,0.33
camera,Elaboration,photo quality,0.2
####,####,####,####
camera,Elaboration,control,0.5
control,Contrast,auto mode,0.66
camera,Elaboration,auto mode,0.075
camera,Elaboration,control,0.2
camera,Elaboration,auto mode,0.375
####,####,####,####

Results Obtained

We have generated the tuples with the highest strength by applying page ranking on the ARRG:

photo quality,Background,auto mode,0.75
camera,Elaboration,photo quality,0.5
camera,Elaboration,auto mode,0.705
camera,Elaboration,control,0.5
control,Contrast,auto mode,0.66
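As a worked check on these numbers: the five tuples above are numerically consistent with taking, for each (aspect, relation, aspect) triple, the maximum weight inside each "####"-delimited block of the ARRG snippet and then summing those maxima across blocks (for example 0.33 + 0.375 = 0.705 for camera, Elaboration, auto mode). This is an observation about the sample data only. The slides attribute the selection to page ranking, so the rule and the function name `aggregate` below are assumptions.

```python
from collections import defaultdict


def aggregate(groups):
    """Sum, across blocks, the per-block maximum weight of each triple.

    `groups` holds one list of (aspect, relation, aspect, weight) tuples per
    '####'-delimited block of the ARRG snippet. Assumed rule, see lead-in.
    """
    totals = defaultdict(float)
    for group in groups:
        best = {}
        for src, rel, dst, weight in group:
            key = (src, rel, dst)
            best[key] = max(best.get(key, 0.0), weight)
        for key, weight in best.items():
            totals[key] += weight
    return dict(totals)


# The two blocks of the ARRG snippet from the results.
groups = [
    [
        ("camera", "Elaboration", "auto mode", 0.23),
        ("photo quality", "Background", "auto mode", 0.75),
        ("camera", "Elaboration", "photo quality", 0.5),
        ("camera", "Elaboration", "auto mode", 0.33),
        ("camera", "Elaboration", "photo quality", 0.2),
    ],
    [
        ("camera", "Elaboration", "control", 0.5),
        ("control", "Contrast", "auto mode", 0.66),
        ("camera", "Elaboration", "auto mode", 0.075),
        ("camera", "Elaboration", "control", 0.2),
        ("camera", "Elaboration", "auto mode", 0.375),
    ],
]

result = aggregate(groups)
for (src, rel, dst), weight in result.items():
    print(f"{src},{rel},{dst},{weight:.3f}")
```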

Results Obtained

We used the tuple set above to generate our review summary and obtained the following result:

All customers (51 people) who reviewed the camera felt that it was great.
Most shoppers (36 people) mentioned the size and they really liked this feature.
Accordingly almost half (29 people) of the users commented about the pictures and they really liked this feature mainly because of clarity.
About 45.0% of reviewer commented about the software and they absolutely liked it.
About 45.0% of reviewer commented about the small size and they really liked this feature.
In relation to the aspect, About 30.0% of the shoppers mentioned the use and they absolutely liked it.
8 reviewers commented about the flash and in overall they felt that it was fine mainly because of pictures.

Conclusion

• We have presented a framework for abstractive summarization of product reviews based on discourse structure.

• For content selection, we propose a graph model based on the importance of aspects and the association relations between them. It assumes no prior domain knowledge and takes advantage of the discourse structure of the reviews.

• For abstract generation, we propose a product-independent, template-based natural language generation (NLG) framework that takes aspects and their structured relations as input and generates an abstractive summary.

•In addition, we plan to develop and evaluate an end-to-end system, in which the aspect extraction and polarity estimation of aspects are automated.

References

[1] Abstractive Summarization of Product Reviews Using Discourse Structure. http://www.aclweb.org/anthology/D/D14/D14-1168.pdf

[2] Opinosis: A Graph-Based Approach to Abstractive Summarization of Highly Redundant Opinions. http://lexitron.nectec.or.th/public/COLING-2010_Beijing_China/PAPERS/pdf/PAPERS039.pdf

[3] S. Poria, E. Cambria, G. Winterstein, and G.-B. Huang. Sentic patterns: Dependency-based rules for concept-level sentiment analysis. Knowledge-Based Systems 69, pp. 45-63 (2014)

[4] S. Poria, E. Cambria, A. Gelbukh, F. Bisio, and A. Hussain. Sentiment data flow analysis by means of dynamic linguistic patterns. IEEE Computational Intelligence Magazine 10(4), pp. 26-36 (2015)


THANK YOU !!
