Post on 17-Jun-2015
Marc-Antoine Dupré, Alexander Patronas, Erhard Dinhobl, Ksenija Ivekovic, Martin Trenkwalder
Web Opinion Mining
Roadmap
What is opinion mining and why?
Objects, model and task
Words and phrases
Sentiment classification
Feature-based opinion mining
Opinion spam
Tools for opinion mining
Questions:
What do users think about a specific product?
Which of our customers are unsatisfied? Why?
Which product is more popular among users?
Answer: Web Opinion Mining
Web Opinion Mining
Facebook, blogs, … > opinion
Wikipedia > fact
Opinions: underlying question
“What do people in America think about Barack Obama?”
Opinions are mostly in the deep web
AI algorithms necessary
Useful: market intelligence (better ads)
Objects, Model
Opinion holder / object / opinion
Features of object:
F = {f1, f2, f3, …}, fi ∈ F
fi is defined by a set of words or phrases Wi
W = {W1, W2, W3, …}, Wi ∈ W
O is some object (event, person, product, …)
An opinion holder j comments on a subset Sj of the features F of object O. For each feature fk ∈ Sj, holder j uses a word or phrase from Wk to refer to the feature and expresses a positive, negative or neutral opinion on fk.
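The model above can be written down as a small data structure. The following is only a sketch: the class names `ObjectModel` and `Opinion`, and the synonym sets for the camera, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ObjectModel:
    """An object O; `synonyms` maps each feature fi to its word set Wi."""
    name: str
    synonyms: dict = field(default_factory=dict)  # fi -> Wi

    @property
    def features(self) -> set:
        """The feature set F of the object."""
        return set(self.synonyms)

    def feature_for(self, phrase: str) -> Optional[str]:
        """Return the feature whose word set contains `phrase`, if any."""
        for feature, words in self.synonyms.items():
            if phrase.lower() in words:
                return feature
        return None


@dataclass
class Opinion:
    holder: str       # opinion holder j
    feature: str      # a feature fk from Sj
    orientation: str  # "positive", "negative" or "neutral"


# Hypothetical example object with two features and illustrative word sets.
camera = ObjectModel(
    name="camera",
    synonyms={
        "picture": {"picture", "photo", "image"},
        "battery life": {"battery life", "battery runtime"},
    },
)
opinion = Opinion(
    holder="jprice174",
    feature=camera.feature_for("photo"),
    orientation="positive",
)
```

Keeping the word sets Wi next to the features makes it easy to map different wordings of a review back to the same feature.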
Task
One document: one opinion from one holder
Opinion: positive, negative, neutral
3 levels:
Document: determine the class
Sentence (one opinion): determine the sentence type (objective or subjective) and the sentence class (neutral, positive, negative)
Feature: determine the words and phrases
Words and Phrases
Words are often context dependent ("long": long loading time vs. long battery runtime)
3 approaches to get a wordlist:
Manual approach
Corpus-based approach
Dictionary-based approach
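The slide only names the dictionary-based approach. One common reading of it, assumed here, is to start from a few seed words with known orientation and grow the list through a dictionary's synonyms and antonyms. The sketch below uses a hypothetical mini-thesaurus (`SYNONYMS`, `ANTONYMS`) in place of a real dictionary such as WordNet.

```python
from collections import deque

# Hypothetical mini-thesaurus standing in for a real dictionary.
SYNONYMS = {
    "good": ["great", "fine"],
    "great": ["amazing"],
    "bad": ["poor"],
    "poor": ["awful"],
}
ANTONYMS = {"good": ["bad"], "bad": ["good"]}


def expand_wordlist(seeds):
    """Grow a {word: orientation} list from seed words.

    Synonyms inherit the orientation of the word they were reached from;
    antonyms receive the opposite orientation.
    """
    flip = {"positive": "negative", "negative": "positive"}
    orientation = dict(seeds)
    queue = deque(seeds)
    while queue:
        word = queue.popleft()
        polarity = orientation[word]
        neighbours = [(s, polarity) for s in SYNONYMS.get(word, [])]
        neighbours += [(a, flip[polarity]) for a in ANTONYMS.get(word, [])]
        for other, other_polarity in neighbours:
            if other not in orientation:  # the first label assigned wins
                orientation[other] = other_polarity
                queue.append(other)
    return orientation


wordlist = expand_wordlist({"good": "positive"})
```

A list built this way is context independent, so a word like "long" would still need the corpus-based or manual approach to be labeled correctly per domain.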
Sentiment Classification
Classify documents (e.g. reviews) based on the overall sentiment expressed by opinion holders
Positive, negative or neutral
Useful, but doesn’t find what the reviewer liked or disliked!
A negative sentiment on an object doesn’t mean that the opinion holder dislikes everything about the object, and vice versa
Need to go to the sentence level and the feature level
Feature-based Opinion Mining
Objective: find what reviewers like and dislike
Features and components
Three tasks:
Extract object features that have been commented on in each review
Determine whether opinions on the feature are positive, negative or neutral
Group synonyms and produce summary
Different Review Formats
GREAT Camera., Jun 3, 2004 Reviewer: jprice174 from Atlanta, Ga.
I did a lot of research last year before I bought this camera... It kinda hurt to leave behind my beloved nikon 35mm SLR, but I was going to Italy, and I needed something smaller, and digital.
The pictures coming out of this camera are amazing. The 'auto' feature takes great pictures most of the time. And with digital, you're not wasting film if the picture doesn't come out. …
….
Extracting Object Features
1. Part-of-speech tagging
Features are nouns and noun phrases
2. Frequent feature generation
Association mining to generate candidate features
Feature pruning
3. Infrequent feature generation
Opinion word extraction
Finding infrequent features using opinion words
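Steps 1 and 2 can be sketched as follows. This is a deliberate simplification, not the full method: instead of association mining it only counts single nouns and keeps those above a minimum support, and it assumes the sentences are already tagged with Penn Treebank-style tags. The function name `frequent_features` and the `min_support` parameter are hypothetical.

```python
from collections import Counter


def frequent_features(tagged_sentences, min_support=0.5):
    """Return nouns that occur in at least `min_support` of the sentences."""
    counts = Counter()
    for sentence in tagged_sentences:
        # Count each noun once per sentence, as in itemset support.
        nouns = {word.lower() for word, tag in sentence if tag.startswith("NN")}
        counts.update(nouns)
    threshold = min_support * len(tagged_sentences)
    return {noun for noun, n in counts.items() if n >= threshold}


# Hand-tagged toy input; a real pipeline would run a POS tagger first.
sentences = [
    [("The", "DT"), ("pictures", "NNS"), ("are", "VBP"), ("amazing", "JJ")],
    [("Great", "JJ"), ("pictures", "NNS"), ("and", "CC"), ("battery", "NN")],
    [("The", "DT"), ("pictures", "NNS"), ("come", "VBP"), ("out", "RP"), ("hazy", "JJ")],
]
features = frequent_features(sentences, min_support=0.5)
```

Here "pictures" appears in all three sentences and is kept, while "battery" appears only once and would have to be recovered in step 3 through nearby opinion words.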
Identifying Orientation of Opinion Sentence
Use the dominant orientation of the opinion words as the sentence orientation
If positive opinion words prevail, the opinion sentence is regarded as positive, and vice versa
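A minimal sketch of this rule, assuming a small hand-made opinion word list (`OPINION_WORDS` is hypothetical). The slide does not say how ties are handled; labeling them neutral is an assumption made here.

```python
# Hypothetical opinion word list: +1 marks a positive word, -1 a negative one.
OPINION_WORDS = {"amazing": 1, "great": 1, "good": 1, "hazy": -1, "blurry": -1}


def sentence_orientation(sentence):
    """Label a sentence by the dominant orientation of its opinion words."""
    tokens = [t.strip(".,!?'\"").lower() for t in sentence.split()]
    score = sum(OPINION_WORDS.get(t, 0) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # assumption: ties and sentences without opinion words
```

For example, `sentence_orientation("The pictures coming out of this camera are amazing.")` finds one positive word and no negative ones, so the sentence is labeled positive.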
Feature-based Summary
GREAT Camera., Jun 3, 2004 Reviewer: jprice174 from Atlanta, Ga.
I did a lot of research last year before I bought this camera... It kinda hurt to leave behind my beloved nikon 35mm SLR, but I was going to Italy, and I needed something smaller, and digital.
The pictures coming out of this camera are amazing. The 'auto' feature takes great pictures most of the time. And with digital, you're not wasting film if the picture doesn't come out. …
….
Feature Based Summary:
Feature1: picture
Positive: 12
The pictures coming out of this camera are amazing.
Overall this is a good camera with a really good picture clarity.
…
Negative: 2
The pictures come out hazy if your hands shake even for a moment during the entire process of taking a picture.
Focusing on a display rack about 20 feet away in a brightly lit room during day time, pictures produced by this camera were blurry and in a shade of orange.
Feature2: battery life
…
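A summary of this shape can be produced by grouping opinion sentences per feature and orientation. The sketch below assumes that feature extraction and orientation labeling have already happened, so its input is a list of (feature, orientation, sentence) triples; the function names `summarize` and `render` are hypothetical.

```python
from collections import defaultdict


def summarize(opinions):
    """Group (feature, orientation, sentence) triples by feature."""
    summary = defaultdict(lambda: {"positive": [], "negative": []})
    for feature, orientation, sentence in opinions:
        if orientation in ("positive", "negative"):  # neutral is not listed
            summary[feature][orientation].append(sentence)
    return dict(summary)


def render(summary):
    """Format the grouped opinions like the feature-based summary above."""
    lines = []
    for i, (feature, groups) in enumerate(summary.items(), start=1):
        lines.append(f"Feature{i}: {feature}")
        for orientation in ("positive", "negative"):
            sentences = groups[orientation]
            lines.append(f"{orientation.capitalize()}: {len(sentences)}")
            lines.extend(sentences)
    return "\n".join(lines)


# Toy input using sentences from the example review.
opinions = [
    ("picture", "positive", "The pictures coming out of this camera are amazing."),
    ("picture", "negative", "The pictures come out hazy if your hands shake."),
    ("picture", "neutral", "I took pictures in Italy."),
]
summary = summarize(opinions)
text = render(summary)
```

Grouping synonyms (the third task of feature-based opinion mining) would happen before this step, so that "photo" and "picture" end up under the same feature.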
Opinion Spam
Reviews contain rich user opinions on products and services, which can influence the purchase decisions of users
Generally three types of spam reviews:
Untruthful opinions
Reviews on brands only
Non-reviews
Tools for Sentiment Analysis [1/2]
APIs
Evri – semantic search engine, very powerful API
OpenDover – Java-based web service
Blogosphere/Twittersphere
RankSpeed – search by criteria
Twitrratr – simple search tool (keyword based)
TwitterSentiment – project from Stanford University, classifiers built with machine learning algorithms, transparent
Tools for Sentiment Analysis [2/2]
Newspaper
Newssift – sentiment search tool on newspapers (by Financial Times)
Applications
LingPipe – Java tool
Radian6 – commercial social media monitoring application
RapidMiner – open-source machine learning and data mining tool (Community Edition)
LIVE DEMO (evri)
Thank you for your attention!