PSAE REPORT Zion-Benton Twp. H.S. District 126 Dr. Chris Clark, Superintendent August 27, 2012
Digital Berkshire, April 2012: Chris Clark, British Library PT#2
● Scale & materiality – Not individual, standard documents but vast collections of them; authenticity demands multiplicity of versions
● Cost – Preservation not by individuals but large organizations
● Intellectual Property – If content worth saving someone is making money from it
http://www.economist.com/node/15557443
http://youtube-global.blogspot.com/2011/05/thanks-youtube-community-for-two-big.html
“The challenge for libraries is to find ways to preserve platform dependent digital works and to prevent the loss of complex digital media…. Since we cannot possibly save everything, we need to carefully consider which digital materials are the most important to preserve and try to anticipate the needs of future scholars and researchers” Marlene Manoff, 2006
If preservation priority is X and user need is Y, what are the values of X and Y?
If sustainability means that information is kept useful and available, then the LOCKSS approach has real merit! It implies that SERVICES must be preserved as well.
Abundance of stored content: attention is scarce & must be earned
Content, Services
Maintain active presence on platforms to focus in 2012+
Continue to assess partnerships
[Diagram labels: Digital Research & Curator Team; machines; commercial; curatorial services; academic; funding bodies; digital consortia; media; users; digital scholars; social networks; systems & services; marketing; eIS/STM; exhibitions; search & retrieval; preservation services]
Digital Research & Curator Team
Consolidate
Collaborate
Extend
Digital Curation as collaborative process: acquisitions, workflows, tools, project management, funding, exhibitions & marketing
Digital Scholarship: horizon scanning, Tech Watch, communities of practice, consortia
Training & development: seminars, conferences, events, ‘Digital Conversations’ + ‘Tooling up’
Europeana – SB Berlin
Centenary of the outbreak of the First World War
Will create a European corpus of digitised materials concerning the First World War in all its aspects
Will contribute to Europeana a substantial collection of more than 400,000 outstanding sources
User generated content
Roadshows in 10 countries to create unique pan-European archive
Preston event produced more than 2300 images from letters, diaries, medals, pictures, trench art, and more
British Newspaper Archive
British Library and brightsolid online publishing
Up to 40 million newspaper pages from the British Library's collection over 10 years
Collection includes runs of most newspapers published in the UK since 1800
Over 4m pages added since launch
Google Books
A 6 year project starting June 2011
250,000 Books, 1700-1870
From the French Revolution to the end of slavery.
Material in major European languages
Focus on books that are not yet freely available in digital form online
Access via Google Books and BL
Storage at Google and BL
Contract and terms available on the web!
Broadcast News
TV & radio news receivable in the UK, since May 2010, e.g. Al-Jazeera English, CNN, France 24, Russia Today
Search subtitles (where available)
AHRC-funded project looking at speech-to-text technologies for opening up audio and video archives
Project will index 3,00 hours of TV news and 3,000 hours of radio content
IMPACT Historic Text
Improve the digital accessibility of printed text produced before 1900
OCR does not produce satisfactory results for old books, magazines and newspapers
Historic materials have archaic fonts, complex layouts, warped or degraded pages
Manual post-correction is slow and expensive
Early music on-line: digitised 300 volumes (21k images) of rare early printed music from the British Library’s collections
Open educational licence encourages use and re-purposing of content and embedding in teaching and research
Detailed inventories of the books’ contents created for the first time, with access points for composer and title
Data included in British Library catalogue, COPAC and RISM music database, with links to digitised content
Digital images provided to Aruspix, which is developing an OCR and transcription tool for early music
www.earlymusiconline.org
Personal digital archives
Data analysis beyond documents
Use computer forensics
Capture, management, description, and preservation of personal digital collections to facilitate access and analysis
Archives range from poets (W Cope) and playwrights (H Pinter) to computer scientists (D Michie) and biologists
Web archives
Create a research collection of UK websites
Develop high-impact data analytical access services
Demonstrate the potential of domain level web archives, or the “haystacks”
UK web domain > 9m .uk domain names
Estimate 110TB/crawl
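To put the two figures above in proportion, a back-of-envelope sketch follows. It assumes decimal terabytes and takes the domain count as exactly 9 million; both are simplifications made here, not figures from the talk. Because the slide says "more than 9m", the result is an upper bound on the average.

```python
# Rough average crawl volume per .uk domain name.
# Assumptions (not from the talk): 1 TB = 10**12 bytes, exactly 9m domains.
CRAWL_BYTES = 110 * 10**12   # "Estimate 110TB/crawl"
DOMAIN_NAMES = 9_000_000     # "> 9m .uk domain names", taken at its lower bound

avg_megabytes_per_domain = CRAWL_BYTES / DOMAIN_NAMES / 10**6

print(f"Average per domain name: at most about {avg_megabytes_per_domain:.1f} MB")
```

The small per-domain average is one way to read the “haystacks” remark: the research value lies in analysis across the whole domain rather than in any single site.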
Goal
Builds on previous crowdsourcing projects, e.g. UK SoundMap
Addressed key challenges – awareness, engagement, productivity at scale
Approach
• Accessible and convenient application
• Immediate results and feedback
• Competitive tools
• Recognition and visible contribution
Ordnance Surveyors Drawing 40 (detail). Pen and Ink on paper. 1801. British Library, Maps OSD 40(3).
What is georeferencing?
Results:
725 maps assigned spatial metadata over 5 days
Publicity minimal – social media key
~90 participants, top five completed half the work
Data quality good: <3% had errors
T-Pen Transcription UI http://www.youtube.com/watch?v=sOnJtWtCFZc
Evolution by projects and commercial ties tends to reduce interoperability and inconveniences the researcher
International collaborations, such as International Image Interoperability Framework, seek a shared canvas
ARROW project - a tool to assist ‘diligent search’ and provide faster answers to: Rights status? – Rightsholders? – Can I digitise?
[Timeline: 2008 to 2013]
ARROW: 29 partners (libraries, BIP, Reprographic Rights Organisation (UK)); 12 countries (Austria, Denmark, France, Finland, Germany, Italy, the Netherlands, Norway, Slovenia, Spain, Sweden, UK); pilots: Germany, France, Spain, UK; books only
ARROW Plus: 36 partners; 14 countries (Austria, Belgium, Bulgaria, Germany, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, the Netherlands, Poland, Portugal, Spain); books and images in books
ARROW benefits
Automated (where it can be – still some manual processes)
Therefore saves time and cost
ARROW search = 5% of manual search time
National partners working together across different sectors
Domain partners working together across countries
Persistent enquiry: can I use this?
Open Knowledge Foundation
Creative Commons Licenses
Persistent URLs
Six decades into the computer revolution, four decades since the invention of the microprocessor, and two decades into the rise of the modern internet, all of the technology required to transform industries through software finally works and can be delivered at global scale.
Marc Andreessen, ‘Why software is eating the world’, Wall Street Journal, August 20 2011
Our vision: In 2020, the British Library will be a leading hub in the global information network, advancing knowledge through our collections, expertise and partnerships, for the benefit of the economy and society and the enrichment of cultural life.
If Andreessen is right, we may not be talking in 2020 about digital libraries and digital curators but an agency for the curation and creation of software.
@chrisleeclark
www.bl.uk