Data visualization and digital humanities research: a survey of available data sets and tools
[Slide background graphic: binary digits interleaved with the topic words Literary Criticism, Shakespeare, Translation, Linguistics, Digital Collections, Topic Mapping, History]
Data visualization and digital humanities research:
a survey of available data sets and tools
LITA National Forum 2011, St. Louis, MO
Friday, September 30, 2011
Erik Mitchell, University of Maryland
Susan Sharpless Smith, Wake Forest University
Motivation
“Digital humanities needs gateway drugs. Kudos to the pushers on the Google Books team.” - Dan Cohen http://www.dancohen.org/2010/12/19/
“Linked open data could have the same leveraging effect that the World Wide Web had on computing, said Micki McGee, an assistant professor of sociology at Fordham University.” - Steve Kolowich, The Promise of Digital Humanities, Inside Higher Ed
Birth of a word
“Imagine if you could record your life, everything you said, everything you did, available in a perfect memory store at your fingertips.”
- Deb Roy, The Birth of a Word, http://www.ted.com/
Overview
• Discuss examples of data-focused research tools
• Explore tools
• Consider roles for librarians
• Wrap-up / Q&A
Taxonomy of uses
Resource type | Research methods
Discovery | Text searching, citation chaining, concept exploration
Visualization | Mapping, graphing, charting
Analysis / publishing | Dataset publishing, statistical analysis, annotation
Searching and Discovery
Examples:
• BYU Corpora http://corpus.byu.edu/
• WOK (Web of Knowledge) Citation Mapping
Visualization
Free Visualization Tools
Analysis and publishing
NodeXL http://nodexl.codeplex.com/
Tool Comparison - linguistics
Evaluation areas | Tool features
Index approach features | Concordancing, lemmatization, semantic relationships, collocation/KWIC, sense disambiguation
External links / interoperability | Links to lexical databases (e.g. WordNet), data export, metadata structures, common search features
Dataset population | Population definition, open or closed, data source, synchronic/diachronic, mono-, bi-, or plurilingual?
Tool exploration
• Discover / Search
    • What kinds of discovery tools exist, and how common are the discovery features across different datasets / systems?
• Visualization
    • What visualization features exist? Are there products that are easy to use? Are the skills transferable?
• Analysis / Annotation
    • What analytical tools are included? What analysis techniques are common?
Perseus
http://www.perseus.tufts.edu
JSTOR Data For Research
http://dfr.jstor.org
Wordseer
Aditi Muralidharan, Marti Hearst
http://bebop.berkeley.edu/wordseer
Google’s Ngram Viewer
books.google.com/ngrams
culturomics.org
“But here's the rub. Google Books, as others point out, wasn't really built for research. . . That means Google Books didn't come with the interfaces scholars need for vast data manipulation . . .” - http://chronicle.com/article/The-Humanities-Go-Google/65713/
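As a rough illustration of what an n-gram trend chart measures, the sketch below computes the share of all n-grams in a given year that match a phrase. It is plain Python, not Google's implementation; the `toy_corpus` texts and the function names are hypothetical, and whitespace tokenization is a simplifying assumption.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return successive n-token tuples from a token list."""
    return zip(*(tokens[i:] for i in range(n)))

def relative_frequency(texts_by_year, phrase):
    """Map each year to (count of phrase) / (total n-grams of the same length)."""
    target = tuple(phrase.lower().split())
    n = len(target)
    result = {}
    for year, text in sorted(texts_by_year.items()):
        counts = Counter(ngrams(text.lower().split(), n))
        total = sum(counts.values())
        result[year] = counts[target] / total if total else 0.0
    return result

# Hypothetical toy corpus: one short "book" per year.
toy_corpus = {
    1900: "the steam engine and the steam ship",
    2000: "the search engine and the web page",
}
freq = relative_frequency(toy_corpus, "steam engine")
for year, share in freq.items():
    print(year, round(share, 3))
```

Normalizing by the yearly total matters because the number of digitized books differs widely from year to year; raw counts alone would mostly track corpus size.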
TED talk on Google Ngram Viewer
http://www.ted.com/talks/what_we_learned_from_5_million_books.html
Concordancing
Eric Lease Morgan - http://dh.crc.nd.edu/sandbox/cyl/catalog/
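For readers new to concordancing, here is a minimal keyword-in-context (KWIC) sketch. It is not the code behind the catalog linked above; the `kwic` function, its `width` parameter, and the sample text are assumptions made for illustration.

```python
import re

def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word match of `keyword`."""
    lines = []
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

sample = (
    "To be, or not to be, that is the question: "
    "Whether 'tis nobler in the mind to suffer"
)
for line in kwic(sample, "to", width=20):
    print(line)
```

Aligning every hit on the keyword is what lets a researcher scan the surrounding words for recurring patterns (collocations).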
Google’s public data explorer
http://www.google.com/publicdata/
Data analysis - NodeXL
http://nodexl.codeplex.com/
Analyzing Social Media Networks with NodeXL: Insights from a Connected World
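Network analysis tools of this kind report per-vertex metrics; one of the simplest is degree centrality. The sketch below is plain Python rather than NodeXL's own interface, and the edge list and names in it are hypothetical.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Degree of each vertex divided by (n - 1), for an undirected graph."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {v: 0.0 for v in neighbors}
    return {v: len(adj) / (n - 1) for v, adj in neighbors.items()}

# Hypothetical "who replied to whom" edge list.
edges = [("ann", "bob"), ("ann", "cy"), ("ann", "dee"), ("bob", "cy")]
centrality = degree_centrality(edges)
for person, score in sorted(centrality.items()):
    print(person, round(score, 2))
```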
Data cleaning – Google Refine
http://code.google.com/p/google-refine
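A common data-cleaning step is grouping variant spellings of the same value. The sketch below is loosely modeled on key-collision ("fingerprint") clustering; it is a simplified assumption of how such a key can be built, not Google Refine's actual code, and the sample `names` are hypothetical.

```python
import string
from collections import defaultdict

def fingerprint(value):
    """Build a key: trim, lowercase, drop ASCII punctuation, sort unique tokens."""
    cleaned = value.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(cleaned.split())))

def cluster(values):
    """Group values that share a fingerprint; keep only groups with variants."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]

# Hypothetical messy column of institution names.
names = [
    "Wake Forest University",
    "wake forest university.",
    "University, Wake Forest",
    "University of Maryland",
]
for group in cluster(names):
    print(group)
```

Each cluster is a candidate for merging; a person should still review the groups, since distinct values can collide on the same key.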
Data visualization – Google Fusion Tables
http://google.com/fusiontables
http://www.google.com/fusiontables/DataSource?dsrcid=332788
Research/teaching need
• Researcher needs range from advanced linguistic analysis and IT support to basic digital content and infrastructure
Corpus-based research
Librarian contributions
• Domain-specific and tool-type-specific comparisons
• IT and research support: data analysis, data curation, identification of tools and data sources
• Shift from “reference” to “research”, in sync with the move from resource discovery to thematic analysis
Next steps
• Build new skills, develop new systems
• Create tutorials and guides
• Explore connections between data curation, publishing, and these tools: is there a connection?
• Explore the role of library discovery systems and consider new feature implementation.
Sites of interest

Data analysis
• Google Refine
• RapidMiner
• Lingua tools (http://search.cpan.org/~emorgan/)
• http://alias-i.com/lingpipe/web/competition.html
• Digital Resource Tools

Visualization
• NodeXL
• Google Public Data Explorer
• Google Fusion Tables
• http://bit.ly/lita_datatools
• Projectbamboo.org

Data publishing
• Corpus of Contemporary American English
• British National Corpus
• http://corpus.byu.edu/
• JSTOR DFR
• digitalresearchtools.pbworks.com

Discovery
• Wordseer
• Perseus (Tufts)
• Google Ngram Viewer
• Corpus.byu.edu