Introduction to Text Analysis

Post on 30-Nov-2014


Slides from my text analysis workshop at the MLA, a part of "Getting Started in the Digital Humanities with Help from DHCommons."


MLA Annual Convention

Getting Started in the Digital Humanities

January 9, 2014

Lauren F. Klein

Georgia Institute of Technology

lauren.klein@lmc.gatech.edu

@laurenfklein

Introduction to Text Analysis

• What is text analysis?
• Why should you use it?
• How do you use it?
  – Examples
  – Tools

What is Text Analysis?

According to Geoffrey Rockwell:

• “Text analysis systems can search large texts quickly. They do this by preparing electronic indexes to the text so that the computer does not have to read through the entire text. When finding words can be done so quickly that it is "interactive", it changes how you can work with the text - you can serendipitously explore without being frustrated by the slowness of the search process.

• “Text analysis systems can conduct complex searches. Text analysis systems will often allow you to search for lists of words or for complex patterns of words. For example you can search for the co-occurrence of two words.

• “Text analysis systems can present the results in ways that suit the study of texts. Text analysis systems can display the results in a number of ways; for example, a Keyword In Context display shows you all the occurrences of the found word with one line of context.”

http://tada.mcmaster.ca/Main/WhatTA
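The three capabilities Rockwell describes (indexed search, co-occurrence search, and a Keyword In Context display) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the sample sentence, and the window sizes are hypothetical and are not taken from any of the tools mentioned in these slides.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_index(tokens):
    """Map each word to the list of token positions where it occurs."""
    index = defaultdict(list)
    for position, word in enumerate(tokens):
        index[word].append(position)
    return index

def cooccurrences(index, word_a, word_b, window=5):
    """Return (position_a, position_b) pairs at most `window` tokens apart."""
    return [(a, b)
            for a in index.get(word_a, [])
            for b in index.get(word_b, [])
            if a != b and abs(a - b) <= window]

def kwic(tokens, index, keyword, width=3):
    """Return one Keyword In Context line per occurrence of `keyword`."""
    lines = []
    for pos in index.get(keyword, []):
        left = " ".join(tokens[max(0, pos - width):pos])
        right = " ".join(tokens[pos + 1:pos + 1 + width])
        lines.append(f"{left} [{tokens[pos]}] {right}".strip())
    return lines

# Hypothetical sample text, used only to exercise the functions.
sample = "The whale surfaced. The captain saw the whale and the sea was calm."
tokens = tokenize(sample)
index = build_index(tokens)  # built once; later lookups do not rescan the text

for line in kwic(tokens, index, "whale", width=2):
    print(line)
```

Because the index is built once, each later lookup is a dictionary access rather than a pass over the whole text, which is what makes the "interactive" exploration in the quotation possible.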

http://www.wordle.net

Mark Hansen and Ben Rubin, Movable Type

Why Use Text Analysis?

Geoff Rockwell, again:

• “Text analysis tools aid the interpreter asking questions of electronic texts.”
• “Text analysis practices encourage reflection on the questions asked and formalization of queries.”
• “Text analysis is a way of targeting rereading that tests intuitions.”

Ted Underwood:

• “Proving a literary thesis with statistical analysis is often like cracking a nut with a jackhammer. You can do it: but the results are not necessarily better than you would get by hand.”

What I think (in the spirit of Movable Type):

• Text analysis as “a way to tell a new story.”

How to Use Text Analysis?

Ben Blatt, http://www.slate.com/articles/arts/culturebox/2013/11/hunger_games_catching_fire_a_textual_analysis_of_suzanne_collins_novels.html

Sarah Lohman, http://www.fourpoundsflour.com/the-gallery-data-visualization-of-a-timeline-of-taste/

Daniel, http://lkleincourses.lmc.gatech.edu/dh12/2012/02/22/the-role-of-senses-in-a-study-in-scarlet/

Ted Underwood and Jordan Sellers, http://journalofdigitalhumanities.org/1-2/the-emergence-of-literary-diction-by-ted-underwood-and-jordan-sellers/

Rob Nelson, http://dsl.richmond.edu/dispatch/

Matt Jockers, http://www.nbcnews.com/technology/data-mining-classics-makes-beautiful-science-954577

Matt Jockers, from Macroanalysis (Univ. of Illinois Press, 2013)

Lauren Klein, from “The Image of Absence” (American Literature 85.4)

Tools for Text Analysis

• Wordle
• Google Ngram Viewer
• IBM Many Eyes
• Voyant
• MONK (requires institutional access)
• MALLET
• Stanford’s Natural Language Processing Toolkit
• R

Google Ngram Viewer

https://books.google.com/ngrams

IBM Many Eyes

http://www-958.ibm.com/software/analytics/manyeyes/

Voyant Tools

http://voyant-tools.org/

MALLET

http://mallet.cs.umass.edu/

Stanford NLP Toolkit

http://nlp.stanford.edu/downloads/

R Programming Language

http://www.r-project.org/

TAPoR (Text Analysis PoRtal)

http://tapor.ca/

More Lists of Tools

• http://toolingup.stanford.edu/?page_id=367

• http://guides.library.upenn.edu/dhtextanalysis

• http://dirt.projectbamboo.org/categories/text-mining

Many Eyes Demo

http://lkle.in/1bTr2eT

Voyant Tools Demo

http://lkle.in/1e186zN