Summit slide loop ny

Post on 04-Jul-2015

Introduction to Text Analytics

October 2, 2013

Dr. Stuart Shulman

Phone No.: +1-413-345-8939

E-mail: stu.shulman@visioncritical.com

The Value Proposition

Our solution helps users easily discover information to:

• streamline business processes

• increase ROI & create new business opportunities

• identify positive and negative trends

• discover unique, rare or unexpected information

How Do These Tools Help Analysts?

What This Means for Analysis

The Core Methods

Coding and Classifying Text Data

Iteration and Re-Use Are Critical Techniques

Measure Everything Starting With Human Agreement

The Core DiscoverText Approach

An Indispensable Role for Humans

Innovation Happens in Groups

“CoderRank” – A Lifetime Accuracy Measurement

Vision Critical Patent Pending – “Enhanced Machine Learning”

Five Essential Tools for Text Analytics

1. Search

2. Filtering on Metadata

3. Human Coding

4. Automated Clustering

5. Machine Classification

A Social Media Use Case

Sifting and Sorting Relevant Data

Great Researchers Demand Transparent Tools

The HMC is a Leading Edge Gnip Customer

Gnip Data Streams and Search Filters

Fair Warning

This part of the presentation contains strong and potentially quite offensive, inappropriate, disturbing, or just completely stupid language.

Studying Media Campaign Effects

Create Custom Machine Classifiers

Yes

No

No

Search is Fundamental for Purposive Sampling

Defined Search Speeds Up Discovery

Tumblr. – “The Wild West of the Internet”

Stupid Stuff People Do & Tweet

redacted

redacted

Are These Tweets Just Social Garbage?

redacted

redacted

Signs of Health Fear Engagement

redacted

redacted

An IdeaScreen Use Case

Concept Testing Data

Raw VoC Data: A Fortune 500 Tech Company

Near Duplicate Clusters Can Be Interesting
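
The slide does not say how near-duplicate clusters are detected. As a rough sketch only, one common approach compares word shingles with Jaccard similarity and groups items greedily. The function names, the shingle size, the 0.5 threshold, and the sample texts are all illustrative assumptions, not DiscoverText's method or data.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles for a text (k=3 is an assumption)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_near_duplicates(texts, threshold=0.5):
    """Greedy clustering: each text joins the first cluster whose
    representative (first member) is similar enough, else starts a new one."""
    signatures = [shingles(t) for t in texts]
    clusters = []  # each cluster is a list of indices into texts
    for i, sig in enumerate(signatures):
        for cluster in clusters:
            if jaccard(sig, signatures[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Made-up free-text responses, for illustration only.
texts = [
    "please add a longer battery life to the tablet",
    "please add a longer battery life to the phone",
    "the price is too high for students",
]
clusters = cluster_near_duplicates(texts)
print(clusters)
```

Greedy grouping against a single representative is simple but order-dependent; a production tool would likely use something more robust.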

Two Naturally Occurring Clusters of Free Text

Wherever Humans Go in Numbers, There Are Clusters

1st Wave of Human Coding Blazes a Trail

A "Simple" Coding Scheme with No Coder Training

Filtering Based on Classifier Scores
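
A minimal sketch of what score-based filtering could look like, assuming each item carries a classifier confidence between 0 and 1. The field names (`text`, `score`), the 0.8 cutoff, and the sample items are hypothetical and do not come from the slides.

```python
def filter_by_score(items, threshold=0.8):
    """Keep only items whose classifier score meets the threshold."""
    return [item for item in items if item["score"] >= threshold]


# Hypothetical scored items, for illustration only.
scored_items = [
    {"text": "item one", "score": 0.95},
    {"text": "item two", "score": 0.40},
    {"text": "item three", "score": 0.80},
]
kept = filter_by_score(scored_items)
print([item["text"] for item in kept])
```

Raising the threshold trades recall for precision: fewer items pass, but those that do are more likely to belong to the target class.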

Testing Coder Agreement on a Small Sample

Measuring Inter-Coder Agreement
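
The slides do not name the agreement statistic used. One widely used option for two coders is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below assumes that statistic; the function name, code labels, and sample codings are illustrative.

```python
from collections import Counter


def cohen_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who labeled the same items in the same order."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(codes_a)
    # Observed agreement: share of items given the same code by both coders.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement: based on how often each coder used each code.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Made-up codings of six items by two coders.
coder_1 = ["concern", "advocacy", "concern", "other", "advocacy", "concern"]
coder_2 = ["concern", "advocacy", "other", "other", "advocacy", "advocacy"]
kappa = cohen_kappa(coder_1, coder_2)
print(round(kappa, 2))  # 0.52
```

Here the coders agree on 4 of 6 items (0.67 raw agreement), but kappa is lower because some of that agreement would occur by chance.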

Validation of Coders & Codes

Text Analytics is a Series of Buckets & Datasets

Breaking Down Concerns by Subtype

Breaking Down Advocacy by Pro and Con

A New Vision Critical Front End

The First Preview of the New Release

The New VC Front End for DiscoverText

Coding Items to Train a Classifier

Leverage Item Metadata While Coding or Filtering

Code Items in a List View