Welcome - Text Analytics Summit 2010

Seth Grimes @sethgrimes Text Analytics Summit 2010 #TAS10 The Big Questions Facing the Text Analytics Industry

description

Welcome address slides for the Text Analytics Summit 2010, presented by conference chair Seth Grimes, Alta Plana Corporation.

Transcript of Welcome - Text Analytics Summit 2010

Page 1: Welcome - Text Analytics Summit 2010

Seth Grimes @sethgrimes

Text Analytics Summit 2010 #TAS10

The Big Questions Facing the Text Analytics Industry

Page 2: Welcome - Text Analytics Summit 2010

>> Past, Present & Future

He who controls the present, controls the past. He who controls the past, controls the future.

-- derived from George Orwell’s 1984

Page 3: Welcome - Text Analytics Summit 2010

>> The (Near) Past: Lacking Use Cases

In 1999 –

“The nascent field of text data mining (TDM) has the peculiar distinction of having a name and a fair amount of hype but as yet almost no practitioners.”

-- Prof. Marti A. Hearst, “Untangling Text Data Mining”

Page 4: Welcome - Text Analytics Summit 2010

>> So “Big Questions”…

Whatever you call it – “text analytics” ≈ “text mining” ≈ “text data mining” – a lot has happened since.

How is the industry developing?

• Solution providers.

• Customers & prospects.

• Technology & solutions.

Page 5: Welcome - Text Analytics Summit 2010

>> What’s Past is Prologue

“Don't look back. Something might be gaining on you.”

-- Satchel Paige

Page 6: Welcome - Text Analytics Summit 2010

>> The Present: Today’s Market

I estimate a $425 million global market in 2009.

• Up about 25% from $350 million in 2008, up in turn 40% from $250 million in 2007.

• Covers software licenses, vendor-provided support, and professional services.
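The growth percentages follow from simple year-over-year arithmetic. The sketch below takes the slide's rounded estimates at face value; the dictionary and function names are illustrative and not from the talk. On these rounded figures the 2008 step works out to 40%, and the 2009 step to roughly 21%, so the stated "about 25%" is looser than the rounded dollar amounts alone imply.

```python
# Market estimates in millions of US dollars, as quoted on the slide.
market_by_year = {2007: 250, 2008: 350, 2009: 425}


def yoy_growth(previous: float, current: float) -> float:
    """Return year-over-year growth as a percentage."""
    return (current - previous) / previous * 100.0


years = sorted(market_by_year)

# Map each year to its growth over the preceding year.
growth = {
    year: yoy_growth(market_by_year[prev], market_by_year[year])
    for prev, year in zip(years, years[1:])
}

for year, pct in growth.items():
    print(f"{year}: {pct:.1f}% over {year - 1}")
```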

Hundreds of millions of dollars more in value created by:

• Universities and research centers, especially in the life sciences.

• Government, particularly for intelligence & counter-terrorism.

• OEM licensees, for listening platforms, e-discovery, etc.

• Systems integrators and consultants.

Page 7: Welcome - Text Analytics Summit 2010

>> Applications Today

Broadly grouped --

• Intelligence and counter-terrorism.

• Life sciences.

• Content management, publishing & search.

• Customer & market intelligence.

• E-discovery.

• Enterprise feedback.

• Law enforcement.

• Risk, fraud, compliance, and investigation.

Page 8: Welcome - Text Analytics Summit 2010

>> Today’s Text Analytics Players

BI, data mining, and analytics.

Enterprise- and specialized-application focus.

Search tools and services.

Software-tool, OEM suppliers.

Text analytics pure-plays, diverse applications.

Web services (APIs).

Page 9: Welcome - Text Analytics Summit 2010

>> Market Trends

Stronger than ever:

• Life sciences.

• Intelligence & counter-terrorism.

Continued steep growth:

• Media & publishing. Seek to mine and to classify/process. For users, semantic annotations ease navigation and boost findability.

• Customer experience. Key to quality, satisfaction.

• Market intelligence including competitive intelligence. Aggregates and details are both important.

New on the scene – or at least newly visible:

• Social-media monitoring, measurement, analysis.

“The Diverse and Exploding Digital Universe,” (IDC, 2008)

Page 10: Welcome - Text Analytics Summit 2010

>> Technology Initiatives

Now and near future.

• Semantic search. Guha (IBM), McCool (Stanford), Miller (W3C): “The addition of explicit semantics can improve [navigational and research] search” (2003).

• Question answering. Matthew Glotzbach, Google: “Question answering is the future of enterprise search” (2006).

• Sentiment analysis & social-media analytics. Bing Liu, Univ of Illinois: “The Web has dramatically changed the way that people express their views and opinions.”

Page 11: Welcome - Text Analytics Summit 2010

>> Technology Initiatives 2

Now and near future.

• Customer experience. Bruce Temkin, ex-Forrester Research: “The future is clearly about analyzing feedback in any form that your customers give it. That’s a trend that won’t go away.”

• Text visualization. We’re still coming to terms with the idea of actually extracting and exploiting the information content of rich media.

• Web 3.0 & the Semantic Web. Ronen Feldman, Bar-Ilan University and Hebrew University: “Text analytics [is] driving the Semantic Web” (2006).

Page 12: Welcome - Text Analytics Summit 2010

>> Search, from Keywords to Intelligence

Text analytics enables smarter search that better responds to user goals.

Page 13: Welcome - Text Analytics Summit 2010

>> Question Answering

Text analytics (information extraction) feeds curated knowledge bases. Search is transformed from information retrieval to information access.
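As a rough illustration of extraction feeding a knowledge base that then answers questions directly, here is a minimal sketch. The toy sentences, the single "X is the capital of Y" pattern, and all function names are assumptions made for the example; real information extraction relies on far richer linguistic processing.

```python
import re

# Toy corpus; sentences and pattern are illustrative assumptions only.
documents = [
    "Paris is the capital of France.",
    "Canberra is the capital of Australia.",
    "The weather in Paris was mild.",
]

CAPITAL_PATTERN = re.compile(
    r"(?P<city>[A-Z]\w+) is the capital of (?P<country>[A-Z]\w+)"
)


def extract_facts(texts):
    """Information extraction: emit (subject, relation, object) facts."""
    facts = []
    for text in texts:
        for match in CAPITAL_PATTERN.finditer(text):
            facts.append((match.group("country"), "capital", match.group("city")))
    return facts


def build_knowledge_base(facts):
    """Index facts by (subject, relation) for direct lookup."""
    return {(subject, relation): obj for subject, relation, obj in facts}


def answer(kb, subject, relation):
    """Return the stored answer, or None when the fact is unknown."""
    return kb.get((subject, relation))


knowledge_base = build_knowledge_base(extract_facts(documents))
print(answer(knowledge_base, "France", "capital"))
```

The lookup returns an answer rather than a ranked list of documents, which is the shift from information retrieval to information access that the slide describes.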

Page 14: Welcome - Text Analytics Summit 2010

>> Sentiment Analysis

Two assertions:

• Human communications are inherently subjective.

• Opinion often masquerades as Fact.

Page 15: Welcome - Text Analytics Summit 2010

>> Sentiment Analysis… & Social Media

“Sentiment analysis is the task of identifying positive and negative opinions, emotions, and evaluations.”

-- Wilson, Wiebe & Hoffmann, 2005, “Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis”
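To make the definition concrete, the sketch below labels text as positive, negative, or neutral using a tiny hand-made word list and a one-word negation check. The lexicon, the negation rule, and the function name are assumptions for illustration. This is a much simpler technique than the contextual, phrase-level approach of the cited paper.

```python
import re

# Tiny hand-made lexicon; an illustrative assumption, not a real resource.
POSITIVE = {"good", "great", "love", "excellent", "helpful"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "slow"}
NEGATORS = {"not", "never", "no"}


def polarity(text: str) -> str:
    """Classify text by summing word polarities, flipping after a negator."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        value = (token in POSITIVE) - (token in NEGATIVE)
        # A negator immediately before an opinion word reverses its polarity.
        if value and i > 0 and tokens[i - 1] in NEGATORS:
            value = -value
        score += value
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for sentence in (
    "The support team was great and helpful",
    "The service was not good",
    "The report arrived on Tuesday",
):
    print(f"{polarity(sentence):8s} {sentence}")
```

Even this toy version shows why context matters: without the negation check, "not good" would be scored as positive.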

Page 16: Welcome - Text Analytics Summit 2010

>> Finding Business Value

In customer-experience initiatives, “more unsolicited, unstructured data [implies] increasing use of text analytics.”

-- Bruce Temkin, ex-Forrester Research

Page 17: Welcome - Text Analytics Summit 2010

>> Text Visualization

Page 18: Welcome - Text Analytics Summit 2010

>> Looking Ahead

Page 19: Welcome - Text Analytics Summit 2010

>> The Semantic Web Vision

Linked Data: “exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web.”

An open architecture, coordinated through standards from the W3C (World Wide Web Consortium).

“The Semantic Web is a web of data, in some ways like a global database.” -- Tim Berners-Lee, 1998
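The "web of data" idea can be sketched as a set of subject-predicate-object statements that link to one another. The example below uses plain Python tuples rather than an RDF library. The `ex:` identifiers and the `match` helper are hypothetical, and the facts themselves come from this deck's own title slide and description.

```python
# Each fact is a (subject, predicate, object) triple. The "ex:" names are
# made-up identifiers standing in for real URIs.
triples = {
    ("ex:SethGrimes", "ex:chairs", "ex:TextAnalyticsSummit2010"),
    ("ex:SethGrimes", "ex:affiliatedWith", "ex:AltaPlanaCorporation"),
    ("ex:TextAnalyticsSummit2010", "ex:hashtag", "#TAS10"),
}


def match(pattern, store):
    """Return triples matching a pattern; None acts as a wildcard."""
    return sorted(
        triple
        for triple in store
        if all(p is None or p == v for p, v in zip(pattern, triple))
    )


# Follow a link: from the person to the event, then to the event's hashtag.
events = [obj for _, _, obj in match(("ex:SethGrimes", "ex:chairs", None), triples)]
hashtags = [
    obj
    for event in events
    for _, _, obj in match((event, "ex:hashtag", None), triples)
]
print(hashtags)
```

Because the object of one statement is the subject of another, separate pieces of data connect into something that can be queried like a database.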

Page 20: Welcome - Text Analytics Summit 2010

>> Web 3.0

Web 3.0 = Web 2.0 + the Semantic Web + semantic tools. Recurring themes:

• Semantically enriched -- context sensitive -- localized.

Text analytics enables Web 3.0 and the Semantic Web.

• Automated content categorization and classification.

• Text augmentation: metadata generation, content tagging.

• Information extraction to databases.

• Exploratory analysis and visualization.

Page 21: Welcome - Text Analytics Summit 2010

>> In Sum

Robust growth.

Consolidation and emergence.

Technical challenges.

New frontiers.

… and two days to learn more.

Page 22: Welcome - Text Analytics Summit 2010

Seth Grimes @sethgrimes

Text Analytics Summit 2010 #TAS10

The Big Questions Facing the Text Analytics Industry