The opportunistic librarian: A Leuven confession
Demmy Verbeke
Libraries and DH acrl.ala.org/dh (Posner 2013)
Why the Digital Humanities? (Spiro 2011; Vandegrift & Varner 2013)
1. Provide wide access to cultural information
2. Enhance teaching and learning
3. Transform scholarly communication
4. Make a public impact
5. Enable manipulation of data
Leiden University Library (1610)
Joe & Rika Mansueto Library, University of Chicago
Saltire Centre, Glasgow Caledonian University
The collaboration triangle
- Humanities research
- Information technology
- Information management
DH
R&D in libraries (Nowviskie 2013, Nowviskie 2014)
"When a library can both support basic digital scholarship needs
through distributed services and create a critical mass of staffing
and intellectual energy in something like a center (however
conceived), it has set the conditions for the advancement of
knowledge itself, through the fulfillment of research desires yet
unknown, un-expressed." (Nowviskie 2014)
A library supporting research
Centre of expertise for:
- Plagiarism
- Copyright
- Open Access
- Digital Humanities
- Academic Bibliography & Institutional Repository
Reference librarians
Institutional context
- since November 2011: DH Task Force, Arts Faculty (vice dean of
  research, faculty librarian, head of the faculty's computer
  department, research support officer of the faculty, and all
  interested researchers)
- 2014: 3 new academic positions: Tenure track professor in DH (Arts
  Faculty), Computer Science for DH (Department of Computer Science),
  Human-Media Interaction (Institute for Media Studies)
- 2015: Advanced Master in Digital Humanities
A library supporting DH
White paper "Digital Humanities en/in KU Leuven bibliotheken" of the
Library Council of the Humanities and Social Sciences Group
(February 2013)
Intention to focus on:
- digitisation projects
- supporting relevant grant applications
- partnering in DH projects, from inception to completion (and
  beyond)
- providing training in DH tools
- playing an expert role in the field of scholarly communication
Project example: Portable Light Dome (Mini-dome)
www.arts.kuleuven.be/info/ONO/Meso/digitalisatie
Project example RICH - Reflectance Imaging for Cultural
Heritage www.illuminare.be/rich_project
Project example Europeana Photography
www.europeana-photography.eu
Project example: OCR/NER for 17th-, 18th- and 19th-C Dutch books
Funding: SUpport action Centre for CompEtEnce in Digitisation
(www.succeed-project.eu)
Team:
- Digitisation services of University Library (Diewer van der
  Meijden, Mark Verbrugge, Bruno Vandermeulen)
- LIBIS (Sam Alloing)
- Arts Faculty Library (Demmy Verbeke)
- Student workers (Jolien Berckmans, Els Meskens)
Support: INL (Instituut voor Nederlandse Lexicologie)
KU Leuven & succeed: goals
End goal:
- integration of OCR in digitisation workflow at KU Leuven
- integration of NER in digitisation workflow at KU Leuven
Specifically:
- learn from digitising textual material with a view to OCR (rather
  than as a representation of the book as physical object)
- understand OCR possibilities
- learn how to enrich textual material with NER
- develop workflows, identify infrastructure problems, etc.
KU Leuven & succeed: corpus
13 books from the pretiosa collection of the Gulden Librije:
- translations from Latin
- monolingual Dutch (so without Latin original)
- books with comparable, simple typefaces (no Gothic)
- books that have not been digitised yet
Titles:
Augustinus, Stad Gods (1876-8);
Augustinus, Belydenis (1741);
Boëthius, Vertroostinge der wysgeerte (1703);
Horatius, Over de dichtkunst (1866);
Horatius, Hekeldichten en brieven (1728);
Nepos, Leevens van doorlugtige mannen (1796);
Nepos, Leeven der doorluchtige veld-ooversten (1726);
Ovidius, Treur-digten (1814-5);
Ovidius, Treur-gesangen (1692);
Seneca, Christelycke Seneca (1705);
Tacitus, Vande ghedenkwaerdige geschiedenissen der Romeinen (1645);
Vergilius, Wercken (1737);
Vergilius, Aeneis (1662)
KU Leuven & succeed: tools (tool: purpose)
- ABBYY FineReader Engine SDK 11: OCR
- User Pattern Trainer of ABBYY FineReader: train OCR
- IMPACT historical lexicon for Dutch, integrated as a FineReader
  external dictionary: improve OCR
- Aletheia: build ground truth
- ocrevalUAtion: compare OCR results
- NER tool for Europeana Newspapers: NER
- NE Attestation Tool: manually correct NER
- NERT: build training & test set
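
Comparing OCR results against ground truth usually comes down to a character error rate: the edit distance between the OCR output and the manually verified text, divided by the length of the verified text. The sketch below illustrates that calculation only. It is not the implementation used by ocrevalUAtion or any other tool listed above, and the sample strings are invented for illustration rather than taken from the project corpus.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions and substitutions separating two strings."""
    # previous[j] holds the distance between the reference prefix
    # processed so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(ground_truth: str, ocr_output: str) -> float:
    """Edit distance normalised by the length of the ground truth."""
    if not ground_truth:
        raise ValueError("ground truth must not be empty")
    return edit_distance(ground_truth, ocr_output) / len(ground_truth)


if __name__ == "__main__":
    # Hypothetical sample strings, invented for this sketch.
    ground_truth = "doorlugtige mannen"
    ocr_output = "doorlugtigc manncn"
    rate = character_error_rate(ground_truth, ocr_output)
    print(f"character error rate: {rate:.3f}")
```

Running the same function on the output of two OCR configurations (for example, with and without a trained pattern or an external dictionary) against the same ground-truth page gives a simple way to see whether a change helped.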
Conclusion
"This is one of the great opportunity spaces that the Digital
Humanities opens up, giving archivists, librarians, and curators a
chance to not simply enlarge but completely re-envision their
communities, publics, and missions." (Burdick et al. 2012, 48-49)
References
@viroviacum
Anne Burdick and others, Digital_Humanities (Cambridge: MIT Press,
2012)
Christian Clausner, Stefan Pletschacher and Apostolos
Antonacopoulos, "Efficient OCR Training Data Generation with
Aletheia", in Proceedings of the 11th International Association for
Pattern Recognition (IAPR) Workshop on Document Analysis Systems
(DAS 2014)
William A. Kretzschmar and William Gray Potter, "Library
Collaboration with Large Digital Humanities Projects", Literary and
Linguistic Computing 25, no. 4 (2010): 439–445