British Library Labs 21st Century Curatorship Talk
Transcript of British Library Labs 21st Century Curatorship Talk
British Library Labs and Lessons for the Library
Mahendra Mahey, Manager of British Library Labs
Ben O’Steen, Technical Lead, British Library Labs
21st Century Curatorship Talks
Thursday 18 September 2014, 1500 - 1600
Meeting Room K, British Library, St Pancras, London
http://labs.bl.uk #bl_labs [email protected]
Funded by the Andrew Mellon Foundation
Stakeholders involved in Labs
Diagram labels: BL Labs; British Library; Digital Scholarship; Digital Research; Access & Reuse Group; Curators / Researchers; Developers / Technical Staff; Digital Content; Universities & wider (e.g. companies, start-ups, independent scholars etc.); Researchers; Developers; United Kingdom; The World
How Labs works…
Diagram labels: question / idea; Competition; Contact; Events; Meetings and visits; BL Labs; Experimenting with our digital collections; BL Digital Collections / Data; Other Digital Collection / Data; Researchers; Developers; Data Driven Projects; Outputs from engagement; Open Software; Publications; Tools & services to support Digital Scholarship; Case Studies; Audience Research; Data
http://labs.bl.uk/Digital+Collections
Example Digital research methods
http://labs.bl.uk/Launch+Event (has some examples from researchers)
Corpus analysis tools
Text Mining
Visualisations
Location based searching
Geotagging
Annotation
Natural Language Processing
Using Application Programming Interfaces for datasets e.g. Metadata, Images
Transcribing
Crowdsourcing / Human Computation
The winners of the Labs 2013 competition
Pieter Francois (left) and Dan Norton (right) each received a cheque for £2000 in November 2013 as winners of the first British Library Labs Competition 2013
Two entries chosen in June 2013
They both worked in residence with Labs from July to October 2013 to complete their projects
Pieter Francois (left) and Dan Norton (right) with Adam Farquhar (middle), Head of Digital Scholarship
Mixing the Library: The Disc Jockey & the Digital Collection
http://www.tompro.co.uk
http://www.ablab.org/shetland
http://www.ablab.org/pd/di/
Prototype design
Annotation
Preview ‘item’
Selected ‘right’ channel ‘item’
Selected ‘left’ channel ‘item’
Collection ‘stalks’ made of ‘items’. Each ‘item’ is a URL. The order of the ‘items’ can be ‘shuffled’ and sent to the ‘left’ or ‘right’ channels
‘Play back’ of ‘items’ (Blue) and annotations (Yellow)
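The ‘stalk’ and ‘channel’ idea described above can be sketched roughly as follows. This is an illustrative model only, not the prototype's actual code: the class names `Stalk` and `Mixer`, their methods, and the example.org URLs are all hypothetical.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Stalk:
    """A named, ordered collection of items; each item is just a URL."""
    name: str
    items: List[str] = field(default_factory=list)

    def shuffle(self, seed: Optional[int] = None) -> None:
        """Reorder the items in place; a seed makes the order repeatable."""
        random.Random(seed).shuffle(self.items)


@dataclass
class Mixer:
    """Holds the item currently selected on each of the two channels."""
    left: Optional[str] = None
    right: Optional[str] = None

    def send(self, stalk: Stalk, index: int, channel: str) -> None:
        """Send one item from a stalk to the 'left' or 'right' channel."""
        if channel not in ("left", "right"):
            raise ValueError("channel must be 'left' or 'right'")
        setattr(self, channel, stalk.items[index])


# Hypothetical usage with placeholder URLs.
stalk = Stalk("demo", [
    "http://example.org/item/1",
    "http://example.org/item/2",
    "http://example.org/item/3",
])
mixer = Mixer()
mixer.send(stalk, 0, "left")    # first item goes to the left channel
mixer.send(stalk, 2, "right")   # third item goes to the right channel
stalk.shuffle(seed=42)          # reorder the stalk for the next selection
print(mixer.left, mixer.right)
```

Treating each item as a plain URL keeps the model independent of any one collection, which matches the slide's description of items drawn from different sources.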
Basic functioning prototype: http://212.71.253.54:8000/a
Living Lab: Library of the Future, see: http://alturl.com/284zw
Pieter Francois
https://www.youtube.com/watch?v=xK80Jy0ijkA
Winners of 2014 Competition
Victorian Meme Machine
Bob Nicholson of Edge Hill University
Anna Gerber and Desmond Schmidt from Queensland University
Blog posting: http://goo.gl/iJy0aT
YouTube: http://goo.gl/mBTlk2
Blog: http://goo.gl/ofpNosl
YouTube: http://goo.gl/iseHTE
Text to Image Linking Tool (TILT)
Bob Nicholson
https://www.youtube.com/watch?v=zK95lzaPNp0
Anna Gerber and Desmond Schmidt
https://www.youtube.com/watch?v=Bl4bjZSJ4cY&feature=youtu.be
Text to Image Linking Tool (TILT)
The story of one digital collection…
The story of 68,000 books and 1 million images
and Flickr
Image: Artwork by Alicia Martin
http://mechanicalcurator.tumblr.com
http://www.flickr.com/photos/britishlibrary/
Extracting Images from OCR
Digitisation
<?xml version="1.0" encoding="UTF-8" ?>
<mets:mets xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:mets="http://www.loc.gov/METS/" xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/version18/mets.xsd info:lc/xmlns/premis-v2
Optical Character Recognition
ALTO XML
Image snipped out algorithmically from ALTO XML
Image taken from page 207 of 'London and its Environs. A picturesque survey of the metropolis and the suburbs ... Translated by Henry Frith. With ... illustrations'
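As a rough illustration of how an image region might be located from ALTO XML, the sketch below reads the position attributes (`HPOS`, `VPOS`, `WIDTH`, `HEIGHT`) of `Illustration` elements and turns them into crop boxes. It is a minimal sketch under stated assumptions, not the method Labs used: the sample document is invented, the function names are hypothetical, and coordinates are assumed to be in pixels (ALTO files can declare other measurement units).

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def illustration_boxes(alto_xml):
    """Return (left, top, right, bottom) boxes for each Illustration element."""
    root = ET.fromstring(alto_xml)
    boxes = []
    for elem in root.iter():
        if local_name(elem.tag) != "Illustration":
            continue
        left = int(float(elem.get("HPOS", 0)))
        top = int(float(elem.get("VPOS", 0)))
        width = int(float(elem.get("WIDTH", 0)))
        height = int(float(elem.get("HEIGHT", 0)))
        boxes.append((left, top, left + width, top + height))
    return boxes


# Invented sample page: one text block and one illustration (assumed pixel units).
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page ID="P1"><PrintSpace>
    <TextBlock ID="T1" HPOS="100" VPOS="50" WIDTH="800" HEIGHT="200"/>
    <Illustration ID="I1" HPOS="120" VPOS="300" WIDTH="640" HEIGHT="480"/>
  </PrintSpace></Page></Layout>
</alto>"""

print(illustration_boxes(SAMPLE_ALTO))  # [(120, 300, 760, 780)]
```

Each box uses the (left, upper, right, lower) ordering that image libraries such as Pillow accept for cropping, so the page scan could then be cropped to each region.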
Face Recognition of 19th Century Faces
The face-recognition algorithm worked better for women’s faces than for men’s
The Mechanical Curator
http://mechanicalcurator.tumblr.com
• #similar_to_77576796197_published_date
• #similar_to_77576796197_slantyness
• #similar_to_77576796197_bubblyness_x
• #similar_to_77576796197_bubblyness_y
• #new_train_of_thought
Image from ‘A Lost Estate’, by Mary E. Mann, Volume: 02, Page: 91, 1889, London, Bentley & Son
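The tags listed above follow a visible pattern: `#similar_to_`, a number, then a feature name. A small sketch of splitting such a tag into its parts is given below. The slide does not say what the number refers to, so the code simply returns it as a string; the function name `parse_similarity_tag` is hypothetical.

```python
import re

# '#similar_to_' + digits + '_' + feature name (letters, digits, underscores).
TAG_PATTERN = re.compile(r"^#similar_to_(\d+)_(\w+)$")


def parse_similarity_tag(tag):
    """Return (number, feature) for a '#similar_to_...' tag, else None."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_similarity_tag("#similar_to_77576796197_bubblyness_x"))
# ('77576796197', 'bubblyness_x')
print(parse_similarity_tag("#new_train_of_thought"))
# None
```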
1,020,418 images!
http://www.flickr.com/photos/britishlibrary/
Each image has a URL
Some metadata, but you can add tags!
Flickr has an API, so researchers and developers can build apps and query the data
Flickr Commons – 1,020,418 images!
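As a hedged illustration of querying the collection through the Flickr API, the sketch below builds a request for the public `flickr.photos.search` REST method. The API key and account ID are placeholders you would supply yourself, and the helper names `build_search_url` and `search_photos` are hypothetical; only the URL builder is exercised here, so no network access is needed to try it.

```python
import json
import urllib.parse
import urllib.request

FLICKR_REST = "https://api.flickr.com/services/rest/"


def build_search_url(api_key, user_id, tag, per_page=10):
    """Build a flickr.photos.search URL for one account's photos with a tag."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,        # placeholder: obtain your own key from Flickr
        "user_id": user_id,        # placeholder: the account's Flickr ID
        "tags": tag,
        "per_page": per_page,
        "format": "json",
        "nojsoncallback": 1,       # ask for plain JSON rather than JSONP
    }
    return FLICKR_REST + "?" + urllib.parse.urlencode(params)


def search_photos(api_key, user_id, tag, per_page=10):
    """Call the API and return the list of photo records (needs network access)."""
    url = build_search_url(api_key, user_id, tag, per_page)
    with urllib.request.urlopen(url) as response:
        payload = json.load(response)
    return payload["photos"]["photo"]


# Only the URL is built here; 'YOUR_API_KEY' and 'ACCOUNT_ID' are placeholders.
print(build_search_url("YOUR_API_KEY", "ACCOUNT_ID", "map"))
```

Each returned photo record carries an ID that can be used in further API calls, for example to read the tags users have added.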
Flickr in numbers
>190,000,000 image views from launch on 13 December 2013 to 18 September 2014
553 images seen fewer than 10 times
103,000 tags added
Labs has been involved in a number of funded research projects and 4 grassroots crowdsourcing efforts.
Tagging a million images - Metadata games and other projects
http://www.metadatagames.org/
Games developed using Flickr sets
http://goo.gl/j6fxac
Cardiff University’s Lost Visions Project
Flickr coverage in the media!
Opportunities – increasing traffic to Library services
You can purchase a ‘High Res’ copy
View in the Library Item Viewer
Download .pdf
All illustrations in book
Other illustrations in books published in same year
View the item in the Library Catalogue
Tags auto generated
User generated tag
Grouping for image
Creative Uses
http://goo.gl/qPPgxX
http://goo.gl/OH6FSn
Jura’s Sound Skateboard
Lessons learned…in getting digital content
• A filter was necessary to choose content, given the amount of content and the size and time period of the project
• Getting the story behind the collection was crucial, usually from the curator
• Sometimes access and reuse requests are needed for content
• Getting the curators on board (engaging them with the competition, getting them to be judges) and rewarding them afterwards is important (e.g. technical quick wins from working with the Labs technical lead)
Lessons learned…metadata
• Metadata cleansing is needed: there are duplicate records, and records are not always linked when updated
• Lots of digital content doesn’t have metadata; initiate crowdsourcing, perhaps?
• There is limited subject classification in the metadata for 19th century books
Lessons learned…technical
• Some content is only available on site due to licensing restrictions
• Labs highlights where systems don’t join up, and this can be flagged internally
• Some restrictions mean that workarounds have to be developed for researchers to work with the content
Lessons learned…human
• Working on site brings internal systems and process challenges; the issues are not insurmountable and workarounds are possible
• Starting a dialogue with the right person is the most important lesson I learned about the Library (obvious but true)
Poster given at Digital Humanities 2014
http://figshare.com/articles/Interoperable_Infrastructures_for_Digital_Research_A_proposed_pathway_for_enabling_transformation/1092550
Adam Farquhar, James Baker
What do digital researchers want?
• Scalable access to large quantities of digital content
• To work with all types of content - text, image, audio, video
• To work the way they want to, use any work flow, address any sort of problem
• To work across collections irrespective of content owner or licence terms
What do researchers get?
• Restrictive, prospective and incompatible infrastructures
• Assets distributed unevenly across organisations and systems
Proposed pathway
• Use off-the-shelf technologies and services
• Bring computational capacity to data
• Provide researchers with something they know and use - a file system and a desktop
• Offer research libraries a cost-effective model that scales with use
Five Principles
• Keep it simple
• Lower the bar
• Bring your own tools
• Be creative
• Enable users to start small and grow big