Future of text analysis forrester briefing

The Future of Text Analysis Dr. Stuart Shulman Texifter, LLC 5/26/22


A fall 2011 briefing for personnel at Forrester in Cambridge MA.

Transcript of Future of text analysis forrester briefing

Page 1: Future of text analysis   forrester briefing

The Future of Text Analysis

Dr. Stuart Shulman

Texifter, LLC

Thursday, April 13, 2023

Page 2: Future of text analysis   forrester briefing

Briefing Agenda

• R&D in Annotation and Public Comments
• “The Future of Text Analysis” – The vision
• “What is DiscoverText?” – The software
• The Features – The basics
– Capturing social media and importing other text
– Creating archives, buckets and datasets
– Coding a dataset or training a classifier

Page 3: Future of text analysis   forrester briefing

Dr. Stuart W. Shulman
Founder & CEO, Texifter, LLC
Assistant Professor, Department of Political Science
University of Massachusetts Amherst
Director, Qualitative Data Analysis Program (QDAP)
Associate Director, National Center for Digital Government
Editor, Journal of Information Technology & Politics
413-545-5375
[email protected]
people.umass.edu/stu/

Page 7: Future of text analysis   forrester briefing

Major Project Components

Credentials

The Future of Projects

Projects leverage users’ credentials to control access to documents, tools, and resources

Documents Peers

Advanced ‘Social’ Search

Tools for Tagging Shared Analysis

Metadata Networks Filtering

Qualitative & Quantitative Findings

Page 8: Future of text analysis   forrester briefing

The Future of Documents

Import & archive data from multiple sources into a single, searchable, unified repository

Email

Files Web

Page 9: Future of text analysis   forrester briefing

The Future of Search

eDiscovery will search, merge, filter & classify unlimited amounts of text and other data

Filter

Search

Classify

Report

Page 10: Future of text analysis   forrester briefing

Well Worth Reading

Page 11: Future of text analysis   forrester briefing

The Future of Tools

Duplicate & near duplicate detection

Dynamic user-seeded tag clouds

Adaptable, intuitive and reusable topic models & shared memos

Sentiment detection, redaction & seamless adjudication

Text processing tools will enable quicker processing and more accurate results
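
One common way to approach the duplicate and near duplicate detection named on this slide is to compare documents by their overlapping word sequences ("shingles"). The sketch below is illustrative only and is not taken from the briefing or from DiscoverText: the function names, the 3-word shingle size, and the 0.6 similarity threshold are all assumptions chosen for the example.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased, split on whitespace)."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts become a single shingle (or nothing if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_near_duplicates(docs, threshold=0.6, k=3):
    """Greedy one-pass clustering (hypothetical helper).

    Each document joins the first cluster whose seed document is at least
    `threshold` similar; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of document indices.
    """
    clusters = []  # pairs of (seed shingle set, member indices)
    for index, doc in enumerate(docs):
        doc_shingles = shingles(doc, k)
        for seed_shingles, members in clusters:
            if jaccard(doc_shingles, seed_shingles) >= threshold:
                members.append(index)
                break
        else:
            clusters.append((doc_shingles, [index]))
    return [members for _, members in clusters]


comments = [
    "Please protect our national forests from new road construction today",
    "Please protect our national forests from new road construction now",
    "I oppose the proposed rule on mercury emissions",
]
groups = cluster_near_duplicates(comments)
```

In this example the first two comments differ only in their final word, so they fall into one cluster while the third stands alone. A production system would likely need a scalable approximation such as MinHash rather than pairwise comparison against every cluster seed.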

Page 12: Future of text analysis   forrester briefing

The Future of Peer Relations

Utilize trusted peers to scale your knowledge resources, increase productivity & lower total project costs

Page 13: Future of text analysis   forrester briefing

Peer Groups

Securely segment your peers into project groups by agency, firm, department, location, or affiliation, while controlling their access via credentials

Page 14: Future of text analysis   forrester briefing

Security & Credentials

Data will be encrypted, secure and accessible only by peers who are granted specific permissions via their credentials
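
The idea of granting access through peer groups and credentials can be sketched as a simple permission lookup. This is a minimal illustration under stated assumptions, not a description of how DiscoverText implements security: the `Project` and `Peer` classes, the `can` helper, and the action names ("read", "code") are all hypothetical, and encryption is out of scope here.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    name: str
    # Maps a peer-group name to the set of actions that group may perform.
    group_permissions: dict = field(default_factory=dict)


@dataclass
class Peer:
    name: str
    # Names of the peer groups this person belongs to.
    groups: set = field(default_factory=set)


def can(peer, action, project):
    """Return True if any of the peer's groups is granted `action` on `project`."""
    return any(
        action in project.group_permissions.get(group, set())
        for group in peer.groups
    )


project = Project(
    "public-comments",
    {"agency-coders": {"read", "code"}, "outside-reviewers": {"read"}},
)
alice = Peer("alice", {"agency-coders"})
bob = Peer("bob", {"outside-reviewers"})
carol = Peer("carol")  # belongs to no group, so has no access
```

Here `can(alice, "code", project)` is true, `can(bob, "code", project)` is false, and `carol` is denied everything because access is granted only through group membership.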

Page 15: Future of text analysis   forrester briefing

Coding, Tagging or Labeling

Annotation enhances your analysis by applying human interpretation to machine results

Page 16: Future of text analysis   forrester briefing

Coding in Flexible Teams

Page 17: Future of text analysis   forrester briefing

Crowdsourcing

“This is really the biggest paradigm shift in innovation since the Industrial Revolution”
- MIT professor Eric von Hippel, specialist in innovation management

Crowdsourcing will bring widely distributed wisdom to the process of text analysis

Page 18: Future of text analysis   forrester briefing

Active Machine Learning

Active Learning Loop: Search, Code, Validate, Share, Analyze

By utilizing information and decisions previously captured, we can enhance future machine-based decisions
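
The loop on this slide, in which earlier human coding decisions improve later machine decisions, resembles what is usually called uncertainty sampling. The following is a rough sketch under assumptions, not the method used in DiscoverText: `TinyNaiveBayes`, `active_learning_loop`, and the `ask_human` callback (standing in for a human coder) are hypothetical names, and the classifier is a deliberately small two-class naive Bayes.

```python
import math
from collections import Counter


class TinyNaiveBayes:
    """Two-class multinomial naive Bayes with add-one smoothing (illustrative)."""

    def fit(self, texts, labels):
        self.counts = {0: Counter(), 1: Counter()}
        self.docs = Counter(labels)
        for text, label in zip(texts, labels):
            self.counts[label].update(text.lower().split())
        self.vocab = set(self.counts[0]) | set(self.counts[1])
        return self

    def prob_positive(self, text):
        """Estimated probability that text belongs to class 1."""
        size = len(self.vocab) + 1
        totals = {c: sum(self.counts[c].values()) for c in (0, 1)}
        log_odds = math.log((self.docs[1] + 1) / (self.docs[0] + 1))
        for word in text.lower().split():
            p1 = (self.counts[1][word] + 1) / (totals[1] + size)
            p0 = (self.counts[0][word] + 1) / (totals[0] + size)
            log_odds += math.log(p1 / p0)
        log_odds = max(-50.0, min(50.0, log_odds))  # avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-log_odds))


def active_learning_loop(labeled, unlabeled, ask_human, rounds=3):
    """Repeatedly train, pick the item the model is least sure about,
    ask a human to code it, and fold that decision back into training."""
    labeled, pool = list(labeled), list(unlabeled)
    model = TinyNaiveBayes()
    for _ in range(rounds):
        if not pool:
            break
        model.fit([t for t, _ in labeled], [y for _, y in labeled])
        # Uncertainty sampling: probability closest to 0.5 is least certain.
        item = min(pool, key=lambda t: abs(model.prob_positive(t) - 0.5))
        pool.remove(item)
        labeled.append((item, ask_human(item)))
    model.fit([t for t, _ in labeled], [y for _, y in labeled])
    return model, labeled


seed = [("great product love it", 1), ("terrible service hate it", 0)]
pool = ["love the great support", "hate the terrible delays", "it arrived"]
# Stand-in for a human coder; a real loop would prompt a person instead.
coder = lambda text: 1 if ("love" in text or "great" in text) else 0
model, coded = active_learning_loop(seed, pool, coder, rounds=3)
```

Each round spends human effort on the item the current model finds hardest, which is the usual argument for active learning over coding items in arbitrary order.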

Page 19: Future of text analysis   forrester briefing

What is DiscoverText?

DiscoverText is a:
• personal or organizational archive in the cloud
• search engine for eDiscovery
• social media comment aggregator
• de-duplication and near duplicate clustering engine
• FOIA redaction toolkit
• coding, reporting and validation team workbench
• repository of human annotation (text about text), and
• customizable machine-learning classifier

– (beta launched April 2011)
