Introduction to Sentiment Mining

Post on 20-May-2015

14 December 2011 in Kassel, Germany

Sentiment Mining

Prof. Maurice Mulvenna

University of Kassel 14 December 2011

Outline
§  Ulster
§  What is Sentiment Mining
§  Why Sentiment Mining
§  Challenges
§  Methods
§  Data Sources
§  Applications
§  Examples
§  Simple Keyword-based Prototype
§  Some Results

The Right Choice

Four campuses, one university: Coleraine, Jordanstown, Magee, Belfast

•  Largest university in Ireland, with a student body of over 25,500 local, national and international students
•  International reputation in research
•  “Excellence” in teaching
•  Graduate employment well above national average
•  Excellent study and recreational facilities

University of Ulster in Top 10 UK universities in applications

What to Study

Around 600 degree programmes:
§  Computing and Multimedia
§  Electrical and Mechanical Engineering
§  Humanities/Performing Arts
§  Life and Health Sciences
§  Social Sciences
§  Art, Design and Built Environment
§  Business and Management

Faculty of Computing and Engineering

Within the Faculty there are:
§  5 Schools
§  Approximately 3000 students
§  200 staff
§  Extensive specialist facilities on the Coleraine, Jordanstown and Magee Campuses

What is Sentiment Mining
§  Also referred to as sentiment analysis or opinion mining
§  It refers to the application of natural language processing, computational linguistics, and text analytics to identify and extract subjective information in source materials. (Wikipedia)
§  Its aim is to determine
    •  the attitude or mood of a user or user group (e.g. happy or sad)
    •  the contextual polarity of statements or larger documents (e.g. positive or negative)
    •  the intended emotional communication (e.g. sarcasm or irony)

Why Sentiment Mining
§  Capture and analyse public opinion
§  Capture the word-of-mouth effect
§  Evaluate the social profile of individuals
§  News detection and analysis
§  Quantify the emotional state of users (e.g. duress, stress, sadness, anger)
§  Feedback mechanism to e.g. policy makers
§  National (e.g. UK riots) and international (الربيع العربي or ‘Arab Spring’) events that impact and resonate in people’s daily lives

Marketing

http://mashable.com/2011/11/23/kindle-fire-nook-ipad-online-buzz/

Challenges
§  Sentiment is a subjective measure and as such is subject to interpretation
§  Data volumes
    •  Number of statements, users, documents, etc.
    •  Size of documents and their complexity (topic, sentence, paragraph, chapter, document level)
§  Noise and unstructured data
§  Slang, vernaculars, abbreviations (e.g. wdc, cu, ru, lol)
§  Language heterogeneity
    •  Demographic dependencies
    •  Social dependency
§  Ambivalence
§  Complexity of NLP tasks
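One common way to reduce the noise from slang and abbreviations is to normalise text before any sentiment scoring. The following is a minimal sketch of that idea, not something taken from the talk: the `ABBREVIATIONS` table, its expansions, and the `normalise` function are illustrative assumptions, and a real lexicon would be far larger and dependent on language and demographics.

```python
import re

# Hypothetical abbreviation table; entries and expansions are assumptions
# chosen for illustration, not a lexicon from the slides.
ABBREVIATIONS = {
    "cu": "see you",
    "ru": "are you",
    "lol": "laughing out loud",
}


def normalise(text: str) -> str:
    """Lower-case the text, drop punctuation, and expand known abbreviations."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    # Unknown tokens pass through unchanged.
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in tokens)
```

Tokens missing from the table are left as they are, so an unrecognised abbreviation degrades gracefully instead of being dropped.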

Methods
§  Keyword-based approaches
§  Machine learning techniques
    •  Latent semantic analysis
    •  Support vector machines
    •  “Bag of words” methods
    •  Naive Bayes classifiers
    •  Other NLP tools that allow the detailed parsing of text-related sources, including the underlying grammar
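To make the keyword-based approach concrete, here is a minimal sketch under stated assumptions. The word lists `POSITIVE`, `NEGATIVE` and `NEGATIONS`, the one-token negation rule, and the function names are all hypothetical; this is not the prototype presented in the talk, only an illustration of the general technique of counting polarity keywords.

```python
import re

# Tiny illustrative lexicons (assumptions); real keyword lists are much larger.
POSITIVE = {"good", "great", "happy", "love", "excellent"}
NEGATIVE = {"bad", "poor", "sad", "hate", "terrible"}
NEGATIONS = {"not", "no", "never"}


def keyword_polarity(text: str) -> int:
    """Return a signed score: > 0 positive, < 0 negative, 0 neutral or mixed."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, tok in enumerate(tokens):
        # +1 for a positive keyword, -1 for a negative one, 0 otherwise.
        value = (tok in POSITIVE) - (tok in NEGATIVE)
        # Flip the contribution when the previous token is a negation word.
        if value and i > 0 and tokens[i - 1] in NEGATIONS:
            value = -value
        score += value
    return score


def label(text: str) -> str:
    """Map the signed score to a coarse polarity label."""
    score = keyword_polarity(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A scorer like this is fast and transparent, but it cannot detect sarcasm, irony or ambivalence, which is one reason the machine learning techniques listed above are used as well.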

Data Sources
§  Any single document or document collection (e.g. reviews of any kind: travel, food, movie, etc.)
§  Social media networks (e.g. Twitter)
§  Spoken communication (either directly or after converting it into a textual representation)

→ Any source in which an opinion or emotion is expressed or communicated

Applications
§  Reputation Management
§  Customer Profiling
§  Product Management
§  News Detection and Analysis
§  Public Opinion Analysis
§  Affective Computing, where systems should interpret the emotional state of users and adapt their behaviour accordingly, also providing an appropriate response to the emotions detected

The essence of the book is Lanier's attempt to answer the question: "What happens when we stop shaping technology and technology starts shaping us?"

Prototype Architecture

Prototype Interface

Topic Map & Tag Clouds

Timeline

Runtime (figure: keyword lists of 100 and 6500 keywords)

Evaluation & Accuracy (figure: manual classification and SentiGen)
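Accuracy against a manual classification is usually computed as the share of items where the automatic label matches the human label. The sketch below shows that calculation; the function names are assumptions, and no figures from the talk are reproduced here.

```python
from collections import Counter


def accuracy(predicted, manual):
    """Fraction of items whose predicted label equals the manual label."""
    if len(predicted) != len(manual):
        raise ValueError("predicted and manual must have the same length")
    if not manual:
        raise ValueError("at least one labelled item is required")
    matches = sum(p == m for p, m in zip(predicted, manual))
    return matches / len(manual)


def confusion(predicted, manual):
    """Count (manual label, predicted label) pairs to show where errors fall."""
    return Counter(zip(manual, predicted))
```

The confusion counts complement the single accuracy number by showing which classes are mistaken for which, for example positive statements scored as neutral.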

Thank you