The usual goal of sentiment analysis is to provide numeric measures of positive or negative valence for brands, products, and commodities, which can be aggregated over time or geographical regions to analyze patterns and trends. Dr. Shlomo Argamon discusses some new methods he and his team are developing which extend this paradigm in two ways. First, their systems analyze more aspects of each individual sentiment expression, including different types of attitude ("unwieldy" vs. "unreliable"), comparisons ("X is better than Y"), evaluative trends ("X is improving"), and modality ("possibly" vs. "likely" vs. "definitely"). Secondly, they are combining sentiment analysis with their methods for automated authorship profiling, which label texts with author characteristics such as gender, age, native language, education level, and so forth. When this is done, a new type of analysis emerges: data mining can be used to find "sentimental market segments", discovering, for example, that opinion is trending upwards for males aged 20-30, but downwards for 30-50 year-olds who did not attend college. He presents some of their research results and discusses the implications for future applications and developments in sentiment analytics.

  • 1. Sentimental Market Segmentation Shlomo Argamon Illinois Institute of Technology Department of Computer Science Chicago, IL Sentiment Analysis Symposium April 13, 2010, New York, NY

2. Sentiment Analysis 3.

  • Why??

Sentiment Analysis 4. Sentiment Analysis What are they thinking? What do they want? What will they buy? 5. Where's the ROI?

  • What should I fix?
    • Find comparatively negative aspects of my product or positive areas about competitors
  • How'm I doing?
    • Examine sentiment trends to gauge the effects of marketing or new products
  • Where's the action?
    • Find the customers' unfulfilled needs
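
The "How'm I doing?" question amounts to tracking an aggregate sentiment score over time. A minimal sketch, assuming each document has already been given a numeric score in [-1, 1] and a month label; the `scored_docs` data and both function names are hypothetical illustrations, not part of the talk:

```python
from collections import defaultdict
from statistics import mean


def monthly_sentiment(scored_docs):
    """Average the sentiment scores of the documents within each month."""
    by_month = defaultdict(list)
    for month, score in scored_docs:
        by_month[month].append(score)
    return {month: mean(scores) for month, scores in sorted(by_month.items())}


def trend_slope(values):
    """Least-squares slope of the values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


# Hypothetical (month, score) pairs; scores are assumed to lie in [-1, 1].
scored_docs = [
    ("2008-08", 0.2), ("2008-08", 0.4),
    ("2008-09", 0.5), ("2008-09", 0.7),
    ("2008-10", 0.8), ("2008-10", 1.0),
]
averages = monthly_sentiment(scored_docs)
slope = trend_slope(list(averages.values()))
print(averages, slope)
```

A positive slope suggests sentiment is improving over the period; comparing slopes before and after a campaign is one simple way to examine its effect.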

6. More Generally The Market (potential) Customers 7. Customer Model Perceptions Choices Products & Features Advertising Potential Customer Opinions Texts Needs & Wants 8. Customer Model Perceptions Products & Features Advertising Potential Customer Opinions Texts Needs & Wants Choices 9. Customer Model Perceptions Products & Features Advertising Potential Customer Opinions Texts Needs & Wants Choices 10. Customer Model Perceptions Products & Features Advertising Potential Customer Opinions Texts Needs & Wants Choices 11. Customer Model Products & Features Advertising Potential Customer Opinions Texts Needs & Wants Perceptions Perceptions Perceptions Choices 12. Customer Model Products & Features Advertising Opinions Texts Needs & Wants Perceptions Perceptions Perceptions Potential Customer Potential Customer Potential Customer Potential Customer Choices 13. Customer Model Products & Features Advertising Opinions Texts Needs & Wants Perceptions Perceptions Perceptions Potential Customer Potential Customer Potential Customer Potential Customer Choices 14. Customer Model Products & Features Advertising Opinions Texts Needs & Wants Perceptions Perceptions Perceptions Potential Customer Potential Customer Potential Customer Potential Customer Choices ? 15. Market Segmentation

  • Product/brand segmentation
    • What products are close to which?
    • Based on customer needs and perceptions, not product features
  • Customer/community segmentation
    • Meaningful subsets of potential customers
    • Relative to a given market!
    • Know their characteristics

16. Perceptual Maps 17. Community Maps 18. Understand The Community

  • Not just what they are saying
      • Who is saying it?
        • What groups of people have similar opinions (about X)?
        • What kinds of people are they?
      • How do they see things?
        • How do they group products, brands, or features?
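
The "who is saying it?" question can be approached by grouping opinions about one target by author profile. A rough sketch, assuming each opinion record already carries a sentiment score and profile labels; the records, field names, and age bands below are invented for illustration:

```python
from collections import defaultdict
from statistics import mean


def sentiment_by_segment(opinions, target):
    """Mean sentiment toward `target` for each (gender, age_band) segment."""
    segments = defaultdict(list)
    for op in opinions:
        if op["target"] == target:
            segments[(op["gender"], op["age_band"])].append(op["score"])
    return {seg: mean(scores) for seg, scores in segments.items()}


# Hypothetical opinion records; the profile labels would come from
# an authorship-profiling step rather than from the authors themselves.
opinions = [
    {"target": "X", "gender": "male", "age_band": "20-30", "score": 1.0},
    {"target": "X", "gender": "male", "age_band": "20-30", "score": 0.5},
    {"target": "X", "gender": "female", "age_band": "30-50", "score": -0.5},
    {"target": "Y", "gender": "female", "age_band": "30-50", "score": 1.0},
]
segments = sentiment_by_segment(opinions, "X")
for segment, avg in sorted(segments.items()):
    print(segment, avg)
```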

19. Sentiment Analysis 20. Document Filter ( topic, source ) Sentiment Classifier Trend Snapshot Multidimensional 21. Topic Classifier Topic/Sentiment Correlation Trend Snapshot Multidimensional Sentiment Finder 22. Target Finder Target/Sentiment Correlation Trend Snapshot Multidimensional Sentiment Finder 23. But

  • Who are they and what do they think??
  • We still need
  • More detail on their opinions
  • Profiles of the writers

24. Detailed Sentiment Finder Target Finder Sentiment Complexes Authorship Profiling Demographic Trends Perceptual Map Customer Map Demographic Profiles 25. Detailed Sentiment Finding 26. Detailed Sentiment Finding Complex Sentiment Expressions FindChunks ( attitudes, targets,hinges, ) Chunks Expression Linkage Disambiguation Texts Lexicon Linkage Rules Dependency Parsing Syntactic Relations 27. Different kinds of sentiment 28. Syntactic Linkage 29. More complex patterns 30. Sentiment expressions

  • [I]_evaluator [couldn't]_polarity bring myself to [like]_attitude [him]_target.
  • [It]_target-1 is [not]_polarity [as [good]_attitude as]_comparator [the Minolta D7]_target-2.
  • [Gap.Com]_target is an [excellent]_attitude example of [a retailer]_superordinate [using its online shopping store as an extension and expansion of its retailing]_aspect.
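
One way to picture the output for examples like these is a record with one slot per labeled chunk. The following is a minimal sketch of such a record, not the data model used by the system in the talk; the class name, the field names, and the `polarity_sign` helper (which assumes that any polarity chunk flips the attitude) are all assumptions:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SentimentExpression:
    """One extracted expression; every field name here is hypothetical."""
    attitude: str                         # evaluative word, e.g. "like"
    target: str                           # what is being evaluated
    evaluator: Optional[str] = None       # who holds the opinion
    polarity: Optional[str] = None        # shifter such as "not"
    comparator: Optional[str] = None      # e.g. "as ... as"
    target_2: Optional[str] = None        # second target in a comparison
    superordinate: Optional[str] = None   # class the target is judged within
    aspect: Optional[str] = None          # respect in which it is judged

    def polarity_sign(self, base_sign: int) -> int:
        """Flip the attitude's base sign when a polarity chunk is present."""
        return -base_sign if self.polarity else base_sign


first = SentimentExpression(attitude="like", target="him",
                            evaluator="I", polarity="couldn't")
second = SentimentExpression(attitude="good", target="It", polarity="not",
                             comparator="as ... as", target_2="the Minolta D7")
third = SentimentExpression(
    attitude="excellent", target="Gap.Com", superordinate="a retailer",
    aspect="using its online shopping store as an extension "
           "and expansion of its retailing")

# The base sign +1 assumes each attitude word is positive in some lexicon.
for expr in (first, second, third):
    print(expr.target, expr.attitude, expr.polarity_sign(+1))
```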

31. Authorship Profiling 32. Authorship Profiling

  • Infer things about the author from the style of the language
    • Gender
    • Age
    • Native language
    • Personality type
    • Education level
    • Etc

33. Capturing language style

  • Linguistic variation orthogonal to topic
  • Function words
  • Parts-of-speech
  • Syntactic structures
  • Morphology
  • Linguistic complexity
  • Vocabulary size
  • Mistakes
  • Slang
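
As an illustration of a topic-independent style feature, the sketch below computes relative frequencies of a few function words. The word list is a small hypothetical sample chosen for the example, and the tokenizer is deliberately crude; neither is taken from the talk:

```python
import re
from collections import Counter

# Small illustrative sample; a practical list would be much longer.
FUNCTION_WORDS = ["the", "this", "that", "of", "to", "and", "in",
                  "for", "with", "not", "i", "you", "she"]


def function_word_features(text):
    """Relative frequency of each listed function word in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {word: counts[word] / total for word in FUNCTION_WORDS}


features = function_word_features("She said that the keys were not in the car.")
print(features["the"], features["she"])
```

Because function words occur in texts on any subject, a feature vector like this says little about topic, which is what makes it useful as a style signal for a downstream classifier.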

34. Male/Female Classification

  • 20th Century narrative fiction: 79%
  • 20th Century non-fiction: 83%
  • 21st Century blogs: 77%
  • 17th-19th Century French lit.: 76%

  • Female features: she, for, with, not, and, in, I, you, pronouns, present-tense verbs
  • Male features: the, this, that, those, as, one, of, to, prepositions, adjectives, numbers

35. Age Classification

  • Blogs, classified as teens, twenties, thirties-plus: 75%

Word          10s   20s   30s
maths         105     3     2
homework      137    18    15
bored         384   111    47
sis            74    26    10
boring        369   102    63
awesome       292   128    57
mum           125    41    23
crappy         46    28    11
mad           216    80    53
dumb           89    45    22

Word          10s   20s   30s
semester       22    44    18
apartment      18   123    55
drunk          77    88    41
beer           32   115    70
student        65    98    61
album          64    84    56
college       151   192   131
someday        35    40    28
dating         31    52    37
bar            45   153   111

Word          10s   20s   30s
marriage       27    83   141
development    16    50    82
campaign       14    38    70
tax            14    38    72
local          38   118   185
democratic     13    29    59
son            51    92   237
systems        12    36    55
provide        15    54    69
workers        10    35    46

36. Other dimensions

  • Native language: ~80%
  • Personality:
    • Neuroticism: ~68%
    • Extraversion: ~55-70%

37. Prototype Results 38. Simple Prototype

  • 53,983 blog snippets from the ICWSM task corpus (Aug-Sep, 2008)
  • 268,665 sentiment expressions found
  • Examples:
    • I think that Sarah [Palin]_target would be a [terrible]_attitude [vice president]_superordinate
    • [the game]_target was [too simplistic]_attitude [to serve as proper material for argument]_aspect

39. Unsupervised Profiling

  • Gender (Male/Female)
  • Age (Younger/Older)
    • Based on features from previous studies
  • Education level (LowerEd, MediumEd, HigherEd)
    • Based on linguistic complexity
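
The slide does not say how linguistic complexity is measured. Purely as an illustration, the sketch below uses two crude proxies (average sentence length and average word length) and made-up thresholds to assign the three education labels; none of these choices come from the talk:

```python
import re


def complexity_score(text):
    """Crude proxy: mean words per sentence plus mean letters per word."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    words_per_sentence = len(words) / len(sentences)
    letters_per_word = sum(len(w) for w in words) / len(words)
    return words_per_sentence + letters_per_word


def education_label(text, low=10.0, high=20.0):
    """Map the score to a label; the thresholds are arbitrary placeholders."""
    score = complexity_score(text)
    if score < low:
        return "LowerEd"
    if score < high:
        return "MediumEd"
    return "HigherEd"


print(education_label("I like it. It is fun."))
```

No labeled training data is needed for a rule like this, which is in the spirit of an unsupervised profile, but the thresholds would have to be calibrated against the corpus at hand.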

40. Trend Analysis 41. Apartment sentiment by gender 42. Apartments - gender difference 43. Sarah Palin - gender difference 44. Market Mapping 45. Perceptual Map - Relationships 46. Perceptual Map - Relationships 47. Perceptual Map - Issues 48. Perceptual Map - Issues 49. Community Map - Issues 50. Perceptual Map - Politicians 51. Community Map - Relationships 52. Community Map - Marriage 53. Sentimental Market Segmentation

  • Construct perceptual and community maps, by:
    • Detailed extraction of sentiment expressions
      • Semantic and structural detail
    • Authorship profiling
      • Tells us what kinds of people are writing which opinions
      • (Also need to attribute third-party sources)
    • Dimensionality reduction over author opinions and profiles (PCA, MDS, etc.)
    • Currently in process of obtaining IP protection
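
To make the dimensionality-reduction step concrete, here is a small standard-library sketch of PCA that places each author on a single map axis using the leading principal component, found by power iteration. The author matrix, its column meanings, and the function names are hypothetical; a real map would normally keep two components and use a numerical library:

```python
import math


def center(rows):
    """Subtract each column's mean so the data is centered at the origin."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of rows that are already centered."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(cov, iterations=200):
    """Leading eigenvector of a symmetric matrix via power iteration."""
    d = len(cov)
    v = [1.0 / (i + 1) for i in range(d)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def project(rows, component):
    """Coordinate of each centered row along the given component."""
    return [sum(a * b for a, b in zip(r, component)) for r in rows]


# Hypothetical authors: [sentiment toward X, sentiment toward Y, older (0/1)].
authors = [
    [1.0, 0.8, 0.0],
    [0.9, 1.0, 0.0],
    [-1.0, -0.9, 1.0],
    [-0.8, -1.0, 1.0],
]
centered = center(authors)
component = top_component(covariance(centered))
coords = project(centered, component)
print(coords)
```

In this toy data the two younger, positive authors land on one side of the axis and the two older, negative authors on the other, which is the kind of grouping a community map is meant to expose.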

54. Acknowledgments

  • Sentiment analyzer:
    • Ken Bloom
    • with Navendu Garg and Casey Whitelaw
  • Authorship profiling:
    • Moshe Koppel and James W. Pennebaker
    • with Jonathan Schler and Sterling Stein

55. Thank you

