Dh2014 e mopcobre-complete
Distributed “Forms of Attention”: eMOP and the Cobre Tool
Anton duPlessis, Laura Mandell, James Creel, and Alexy Maslov
Texas A&M University
DH2014 July 10, 2014
Introduction
Distributed Reading
Causer, T., J. Tonra, and V. Wallace. “Transcription Maximized; Expense Minimized?: Crowdsourcing and editing The Collected Works of Jeremy Bentham.” Literary and Linguistic Computing 27.2 (2012), pp. 119-137.
Causer, T., and V. Wallace. “Building a Volunteer Community: Results and Findings from Transcribe Bentham.” Digital Humanities Quarterly 6.2 (2012). http://www.digitalhumanities.org/dhq/vol/6/2/000125/000125.html.
Gibbs, Frederick W. “New Textual Traditions from Community Transcription.” Digital Medievalist 7 (2011). http://www.digitalmedievalist.org/journal/7/gibbs/
Holley, Rose. “How Good Can It Get: Analysing and Improving OCR Accuracy in Large Scale Historic Newspaper Digitisation Programs.” D-Lib Magazine 15.3/4 (2009).
---. “Many Hands Make Light Work.” March 2009. National Library of Australia. ISBN 978‐0‐642‐27694‐0
Crowdsourcing
Reading
Guillory, John. “Close Reading: Prologue and Epilogue,” ADE Bulletin 149 (2010): 8-14.
Hayles, N. Katherine. “Hyper and Deep Attention: The Generational Divide in Cognitive Models,” Profession 2007: 187-199.
Commentary
vs.
Contribution
Bruno Latour
Cobre: Overview
• Developed for Los Primeros Libros Project
  • an international collaboration to digitize and provide access to 16th Century New World imprints (1539 – 1600)
  • http://primeroslibros.org
  • http://libros.library.tamu.edu
• Create opportunities for academic investigation and instruction
• Interface leverages scrolling filmstrip view of tiled thumbnails
• Magnification and comparison tools facilitate detailed examination
• View and compare multiple exemplars of the same work in ways that would be impossible with the physical books
  • Compare state, emission, edition, etc. of an exemplar
  • Examine variations in print, missing / obstructed text, missing / misnumbered / misbound / damaged pages, fire marks, marginalia and other copy-specific attributes
• Synchronous examination of multiple books permits parallel comparison
Cobre: Suite of Tools
• Reading Tools
  – Book View
  – Reading View
  – Detailed View
  – Repository View
  – Comparison View
    • Quick Comparison View
• Annotations
  – Structural
    • table of contents
  – Non-structural
    • copy specific features
  – Transcription
    • capability to view and correct the OCR output of a text
• Editing Tools
  – Basic
  – Canonical
    • abstract construct that permits alignment of different exemplars of the same work by leveraging the structural metadata
  – Frankenbook
    • application of the canonical construct using images drawn from any exemplar(s) to replace the placeholders to create custom editions via a drag and drop method
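The canonical and Frankenbook ideas above can be illustrated with a minimal sketch. It assumes a canonical work is an ordered list of structural labels (placeholders) and that each exemplar maps some of those labels to page images. All names here (`align`, `frankenbook`, the labels and file names) are hypothetical, and the fixed preference order stands in for the manual drag and drop choice made in Cobre.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical data: each exemplar maps a structural label to a page image.
# Gaps model missing or damaged pages in that copy.
CANONICAL: List[str] = ["title page", "f. 1r", "f. 1v"]
EXEMPLARS: Dict[str, Dict[str, str]] = {
    "copy_A": {"title page": "A_001.jpg", "f. 1v": "A_003.jpg"},
    "copy_B": {"f. 1r": "B_002.jpg", "f. 1v": "B_003.jpg"},
}


def align(canonical: List[str],
          exemplars: Dict[str, Dict[str, str]]) -> Dict[str, Dict[str, Optional[str]]]:
    """For each canonical slot, give the matching image in every exemplar (or None)."""
    return {
        label: {name: pages.get(label) for name, pages in exemplars.items()}
        for label in canonical
    }


def frankenbook(canonical: List[str],
                exemplars: Dict[str, Dict[str, str]],
                preference: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Fill each placeholder with the image from the first preferred exemplar that has it."""
    edition = []
    for label in canonical:
        chosen = None
        for name in preference:
            image = exemplars[name].get(label)
            if image is not None:
                chosen = image
                break
        edition.append((label, chosen))
    return edition


alignment = align(CANONICAL, EXEMPLARS)
edition = frankenbook(CANONICAL, EXEMPLARS, ["copy_A", "copy_B"])
```

Here `alignment` shows which copies can supply each slot, and `edition` is a custom page sequence that draws on both copies.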
Cobre: Book View
Cobre: DSpace View
Cobre: Detailed View
Cobre: Transcription Tool
Cobre: Annotation Tool
Cobre: Comparative View
Cobre: Quick Comparative View
Cobre: The Advantage
New Features supporting transcription for eMOP
• A systematic workflow for getting EEBO and ECCO content and metadata into Cobre
• Accept existing OCR text as transcriptions in XML import
• Editors for human transcription/revision of pages
• Addition of transcriptions to XML export
New Ingestion Workflow
The Bitstream Metadata Bitstream (BMB)
• DSpace does not support bitstream (i.e. file) level metadata of the detail required for annotation and transcription.
• We include an additional bitstream that contains metadata about the page-image bitstreams – the Bitstream Metadata Bitstream
• The BMB is an XML file with “chunks” that describe one or more pages.
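As a rough illustration of the BMB idea, the sketch below reads a small XML file whose "chunks" describe pages and collects any transcription text attached to each page-image bitstream. The element and attribute names (`bmb`, `chunk`, `page`, `bitstream`, `transcription`) are assumptions made for this example and are not the actual Cobre schema.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Hypothetical BMB layout: names below are illustrative, not the real schema.
SAMPLE_BMB = """\
<bmb>
  <chunk>
    <page bitstream="page_0001.jpg" sequence="1">
      <transcription source="ocr">The first page of text.</transcription>
    </page>
    <page bitstream="page_0002.jpg" sequence="2">
      <transcription source="ocr">The second page of text.</transcription>
    </page>
  </chunk>
  <chunk>
    <page bitstream="page_0003.jpg" sequence="3"/>
  </chunk>
</bmb>
"""


def read_transcriptions(bmb_xml: str) -> Dict[str, Optional[str]]:
    """Map each page-image bitstream name to its transcription text, if any."""
    root = ET.fromstring(bmb_xml)
    pages: Dict[str, Optional[str]] = {}
    # A chunk may describe one page or several, so walk every page in every chunk.
    for chunk in root.findall("chunk"):
        for page in chunk.findall("page"):
            node = page.find("transcription")
            pages[page.get("bitstream")] = node.text if node is not None else None
    return pages


pages = read_transcriptions(SAMPLE_BMB)
```

A page with no transcription element maps to `None`, so a harvester could tell pages that still need OCR or human transcription apart from those that already carry text.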
Accepting transcriptions from the BMB XML – a view in the DSpace Source
A file attached to the item
Example snippet of its contents
OCR text in the BMB accepted upon harvest into Cobre
Transcription Editor
Invoked with a click
Transcription Editor on Detail View
Transcription Editor on Comparison View
Vetting Transcriptions
Administrative users can indicate whether a transcription is vetted as acceptable
Vetted transcriptions will appear in BMB XML export
Click this
And export these
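The vetting rule can be sketched as a filter applied at export time: only transcriptions that an administrative user has marked as vetted are written out. The `Transcription` class, the `export_vetted` function, and the XML element names are hypothetical and only illustrate the rule; they are not Cobre's code or schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class Transcription:
    bitstream: str        # name of the page-image file (hypothetical field)
    text: str             # transcribed or corrected text of the page
    vetted: bool = False  # set to True by an administrative user


def export_vetted(transcriptions: List[Transcription]) -> str:
    """Serialize only vetted transcriptions into a BMB-like XML string."""
    root = ET.Element("bmb")
    chunk = ET.SubElement(root, "chunk")
    for t in transcriptions:
        if not t.vetted:
            continue  # unvetted work stays in the tool and is not exported
        page = ET.SubElement(chunk, "page", bitstream=t.bitstream)
        ET.SubElement(page, "transcription").text = t.text
    return ET.tostring(root, encoding="unicode")


exported = export_vetted([
    Transcription("page_0001.jpg", "Corrected text of page one.", vetted=True),
    Transcription("page_0002.jpg", "Uncorrected OCR of page two."),
])
```

In this sketch the second page is left out of `exported` because it has not been vetted.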
Results of Usability Studies
Confusions