CER: Complementary Entity Recognition via Knowledge Expansion on Large Unlabeled Product Reviews

Hu Xu, University of Illinois at Chicago
Sihong Xie, Lehigh University
Lei Shu, University of Illinois at Chicago
Philip S. Yu, University of Illinois at Chicago, Tsinghua University

BigData ’16


Page 2

My Black Friday Experience

Page 3

I have a case and want to add a new GPU.

Page 4

This is what a compatible GPU should look like.

Page 5

I found a 1070 GPU at a good price on Newegg.

Page 6

Unfortunately, the GPU is too long to fit in… and non-refundable.

Page 7

In the end, I damaged my case a little…

Page 8

What’s a better way to avoid this?

Page 9

So we need to identify the fact that the GPU does not like some cases…

Page 10

Roadmap

• Preliminary: Complementary Entity Recognition

• Method Overview

• Basic Recognition via Dependency Paths

• Knowledge Expansion on a Large Amount of Reviews

• Experiments

• Conclusions

Page 11

Sentiment Analysis on Reviews (Liu, 2012)

• Product reviews contain a huge amount of information about first-hand user experiences in a sadly unstructured text format.

• Aspect-level sentiment analysis on product reviews is a key task to understand customers’ opinions on opinion targets: products and aspects (features) of products.

• We focus on complementary entities in reviews.


Page 12

What’s an Entity?

• Something that has separate and distinct existence and objective or conceptual reality. (Merriam-Webster)

• We are interested in entities related to products.

• Named Entity

• e.g., Samsung Galaxy S6, Microsoft Surface

• General Entity

• tablet, cellphone, computer, etc.

Page 13

What’s a Complementary Entity?

• Customers also express their opinions on a relation between a reviewed product and another product.

• One relation type is the complementary relation: two products (entities) should work together.

• Definition:

• target entity: the reviewed product;

• complementary entity: the related product in a complementary relation.

• Example:

• This card works with my phone.

Page 14

A Few Examples

Page 15

A Few Examples with Opinions

Page 16

Complementary Entity Recognition (CER)

• Extract complementary entities from review sentences. (The target entity can be obtained from the title of the reviewed product.)

• e.g., extract phone from “It works with my phone.”

• Differences from Named Entity Recognition (NER):

• Including general entities: e.g., case, phone.

• Context dependent

• e.g., “It works with my iPhone 7” vs “I like my iPhone 7.”

• We only focus on CER in this paper since sentiment classification is an independent task and requires different techniques.

Page 17

Roadmap

• Preliminary: Complementary Entity Recognition

• Method Overview

• Basic CER via Dependency Paths

• Knowledge Expansion on a Large Amount of Reviews

• Experiments

• Conclusions

Page 18

Method Overview

• We propose an unsupervised method with two components:

• Basic CER via Dependency Paths

• extract complementary entities from review sentences.

• Knowledge Expansion on a Large Amount of Reviews

• improve the precision by using the knowledge expanded on a large amount of reviews

• the contexts of complementary entities can be noisy

• high quality knowledge can reduce such noise via high precision paths.

Page 19

Method Overview

[Diagram with boxes: Test Review, Dependency Paths for CER, Domain Reviews, Knowledge Expansion, Complementary Entity]

Page 20

Noisy Context of Complementary Entity

• Taking a micro SD card as the target entity for example:

• Similar context can be used for other purposes:

• It works in my phone.

• It works in practice.

• It works in airplane mode.

• The verbs used in the context of complementary entities are open-ended and domain-specific:

• It works with my phone.

• I use it with my phone.

• I insert this card into my phone.

• This card like my phone.

Page 21

Knowledge Expansion

• We introduce two kinds of knowledge to help filter out noise:

• Candidate Complementary Entities:

• e.g., Samsung Galaxy S6, MS Surface, phone, tablet, etc. for micro SD card

• Domain-Specific Verbs:

• e.g., use, work, fit, insert for micro SD card

• We observe that products under the same category share similar context knowledge.

• We group reviews under the same category together in case the number of reviews for a specific product is limited.
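
The filtering idea above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the two knowledge sets are copied from the examples on this slide, the (verb, noun phrase) pairs are hand-written from the sentences on the previous slide, and the rule "keep a pair only if both the verb and the noun phrase are known" is an assumption made for this sketch. The name `keep_extraction` is hypothetical.

```python
# Toy domain knowledge for the micro SD card category (examples taken from the slide).
CANDIDATE_ENTITIES = {"samsung galaxy s6", "ms surface", "phone", "tablet"}
DOMAIN_VERBS = {"use", "work", "fit", "insert"}


def keep_extraction(verb_lemma: str, noun_phrase: str) -> bool:
    """Assumed rule: keep a pair only if both the verb and the noun phrase are known."""
    return (verb_lemma.lower() in DOMAIN_VERBS
            and noun_phrase.lower() in CANDIDATE_ENTITIES)


# Hand-written (verb lemma, extracted noun phrase) pairs for four example sentences.
raw_extractions = [
    ("work", "phone"),          # It works in my phone.
    ("work", "practice"),       # It works in practice.
    ("work", "airplane mode"),  # It works in airplane mode.
    ("insert", "phone"),        # I insert this card into my phone.
]

# Noisy pairs such as ("work", "practice") are dropped by the knowledge filter.
filtered = [pair for pair in raw_extractions if keep_extraction(*pair)]
print(filtered)
```

Only the two pairs whose noun phrase is a known candidate entity survive; "practice" and "airplane mode" are filtered out even though they share the same surface context.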

Page 22

Roadmap

• Preliminary: Complementary Entity Recognition

• Method Overview

• Basic CER via Dependency Paths

• Knowledge Expansion on a Large Amount of Reviews

• Experiments

• Conclusions

Page 23

Dependency Parsing (De Marneffe and Manning, 2008)

• A sentence can be parsed into a tree structure with words as nodes and typed grammar relations as edges.

• e.g., It works with my phone.
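
As a rough illustration of "words as nodes, typed relations as edges", the example sentence can be written down as a list of labeled edges. The parse below is hand-written in the collapsed Stanford typed-dependency style rather than produced by a parser, and the names `Dep`, `PARSE`, and `children` are hypothetical helpers for this sketch.

```python
from typing import NamedTuple


class Dep(NamedTuple):
    relation: str   # typed grammatical relation (edge label)
    head: str       # governor word (parent node)
    dependent: str  # dependent word (child node)


# Hand-written collapsed dependencies for "It works with my phone."
PARSE = [
    Dep("nsubj", "works", "It"),         # "It" is the subject of "works"
    Dep("prep_with", "works", "phone"),  # "phone" attaches to "works" via "with"
    Dep("poss", "phone", "my"),          # "my" is the possessive modifier of "phone"
]


def children(parse, head):
    """Return {relation: dependent} for every edge leaving the given head word."""
    return {dep.relation: dep.dependent for dep in parse if dep.head == head}


print(children(PARSE, "works"))
print(children(PARSE, "phone"))
```

The verb "works" is the root of the tree; following its prep_with edge leads to "phone", which is the kind of route a dependency path describes.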


Page 24

Dependency Path

• Paths that pass through nodes that are complementary entities can be used to extract complementary entities.
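
A minimal sketch of path-based extraction follows. The path used here (a verb "work" linked to a noun by a prep_with edge) is a hypothetical example chosen for illustration and is not claimed to be one of the numbered paths in the talk; the parses are hand-written, with head words given as lemmas.

```python
# Each edge is (relation, head lemma, dependent); parses are hand-written, not parser output.
SENTENCES = {
    "It works with my phone.": [
        ("nsubj", "work", "it"),
        ("prep_with", "work", "phone"),
        ("poss", "phone", "my"),
    ],
    "I like my phone.": [
        ("nsubj", "like", "i"),
        ("dobj", "like", "phone"),
        ("poss", "phone", "my"),
    ],
}

# Hypothetical path: <verb "work"> --prep_with--> <noun>; the noun is extracted.
PATH = {"head": "work", "relation": "prep_with"}


def extract(edges, path):
    """Return the dependents of every edge that matches the path's head and relation."""
    return [dependent for relation, head, dependent in edges
            if relation == path["relation"] and head == path["head"]]


for text, edges in SENTENCES.items():
    print(text, "->", extract(edges, PATH))
```

Both sentences mention "phone", but only the first one matches the path, which mirrors the context dependence that separates CER from NER.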


Page 25

Dependency Paths

[Table: numbered dependency paths for CER]

• These paths (e.g., Path 6) may have low precision because of noisy contexts; domain knowledge can help reduce such noise.

Page 26

Roadmap

• Preliminary: Complementary Entity Recognition

• Method Overview

• Basic CER via Dependency Paths

• Knowledge Expansion on a Large Amount of Reviews

• Experiments

• Conclusions

Page 27

Knowledge Expansion

• To improve CER’s precision, we use domain knowledge to filter out noise.

• Similarly, we use dependency paths to extract high-quality knowledge from a large amount of reviews.

• The dependency paths must be of high precision to ensure the quality of the knowledge.

• We use only the general seed verbs work and fit.

Page 28

Knowledge Expansion

[Diagram: seed verbs (work, fit)]

Page 29

Knowledge Expansion

[Diagram: seed verbs (work, fit); candidate complementary entities (tablet, phone, Samsung Galaxy S6, Microsoft Surface Pro 4)]

Page 30

Knowledge Expansion

[Diagram: seed verbs (work, fit); candidate complementary entities (tablet, phone, Samsung Galaxy S6, Microsoft Surface Pro 4); domain-specific verbs (work, use, insert, fit, like)]
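
The expansion shown across the last three slides can be sketched as a two-step bootstrap. This is a simplified illustration under stated assumptions, not the method as published: the (verb, noun phrase) pairs are toy data standing in for what high-precision dependency paths would extract from domain reviews, and the `min_count` threshold and the function name `expand` are hypothetical.

```python
from collections import Counter

# Toy (verb lemma, noun phrase) pairs, assumed to come from high-precision paths.
PAIRS = [
    ("work", "phone"), ("work", "tablet"), ("fit", "phone"),
    ("use", "phone"), ("use", "tablet"), ("insert", "tablet"),
    ("like", "phone"), ("buy", "store"),
]
SEED_VERBS = {"work", "fit"}


def expand(pairs, seed_verbs, min_count=1):
    """Grow candidate entities from seed verbs, then domain verbs from those entities."""
    # Step 1: noun phrases seen with a seed verb become candidate complementary entities.
    entity_counts = Counter(noun for verb, noun in pairs if verb in seed_verbs)
    entities = {noun for noun, count in entity_counts.items() if count >= min_count}
    # Step 2: verbs seen with a candidate entity become domain-specific verbs.
    verb_counts = Counter(verb for verb, noun in pairs if noun in entities)
    verbs = {verb for verb, count in verb_counts.items() if count >= min_count}
    return entities, verbs


entities, verbs = expand(PAIRS, SEED_VERBS)
print(sorted(entities))
print(sorted(verbs))
```

On this toy input the seeds work and fit lead to the entities phone and tablet, which in turn bring in the verbs use, insert, and like, while the unrelated pair ("buy", "store") is never reached.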

Page 31

Dependency Paths for Knowledge Expansion

Page 32

Roadmap

• Preliminary: Complementary Entity Recognition

• Method Overview

• Basic CER via Dependency Paths

• Knowledge Expansion on a Large Amount of Reviews

• Experiments

• Conclusions

Page 33

Experiment Setting

• Dataset:

• We annotated 7 products for testing purposes;

• We collected 6000 reviews for each category of products, used for knowledge expansion.

• We compare 10 methods:

• 2 Noun Phrase Chunkers, NER, CRF, Sceptre, “My” Entity Path, CER, CER1K+, CER3K+, CER6K+.

Page 34

Experiment Results

• CER6K+ performs best:

• F1-score is more than 70%.

• CER3K+ (with 3000 domain reviews) is already good enough.

• Other baselines are not designed for this task.
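
For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision and recall. The sketch below uses the standard set-based definition on made-up entity sets; the slides do not state the exact matching criterion used in the experiments, so exact string match is an assumption here, and the data are not from the paper.

```python
def precision_recall_f1(predicted: set, gold: set):
    """Standard set-based precision, recall, and F1 with exact string matching."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: two of three predictions are correct, one gold entity is missed.
gold = {"phone", "tablet", "camera"}
predicted = {"phone", "tablet", "practice"}

p, r, f1 = precision_recall_f1(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```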

Page 35

Experiment Results

Page 36

Domain Knowledge

Page 37

Roadmap

• Preliminary: Complementary Entity Recognition (CER)

• Method Overview

• Basic Recognition via Dependency Paths

• Knowledge Expansion on a Large Amount of Reviews

• Experiments

• Conclusions

Page 38

Conclusions

• We introduce a novel task, Complementary Entity Recognition (CER), and an unsupervised method for recognition.

• We utilize big data to expand domain knowledge and use the domain knowledge to improve the performance of recognition.

• Future work includes:

• sentiment classification for complementary entities;

• automatic knowledge accumulation from data.

Page 39

Q&A

• The annotated dataset can be found at:

• https://www.cs.uic.edu/~hxu/CER_dataset.html

• For details, please see the original paper:

• Hu Xu, Sihong Xie, Lei Shu, Philip S. Yu, CER: Complementary Entity Recognition via Knowledge Expansion on Large Unlabeled Product Reviews, IEEE International Conference on Big Data 2016, Washington D.C., Dec 5-8, 2016.