Digitisation Workshop Pres 2008(V1)
Digitisation: Revolutionising Library Management
Day 2
Sydney, April 2007
Mal Booth – Head,
Research Centre
Where am I from?
The Memorial’s Research Centre functions as a library and an archive. We develop, manage and provide public access to Australia’s official, personal, & published records of war.
Global trends in digitisation
• Faster, better, cheaper equipment & storage
• Better DAMS & CMS software
• Institutional repositories
• More audio & film
• Collaboration
• Shared collections (eg. Picture Australia)
• Mass digitisation programs: Google, Microsoft, Yahoo, Open Content Alliance (OCA), Internet Archive
I’m not sure what these are, but they are important!
• Dynamism
• Preservation (as a benefit & obligation)
• Playing
• Management & planning
• Compromise
• Access
Recent Digitisation Examples
• WW1, WW2, Korea & Vietnam unit war diaries
• 260k+ images of our collections
• Official histories (published works)
• Digitisation on demand
Digitisation on demand
Currently running at 90,000 pp p.a.
About one fifth of these images
What we will cover today
1. GETTING STARTED
a. Why and what to digitise?
b. How (preservation/access) & Principles
c. Copyright and IP considerations (briefly)
d. Resources needed; in-house or outsource?
e. Process outline: from planning to long term maintenance (life-cycle)
2. METHODS, CONTENT & STORAGE
a. Production: file formats & standards, scanners & cameras, software
b. Output: indexing, access, search optimisation, delivery options
c. Storage, ongoing maintenance & management requirements
d. Just doing it, lessons learned & key issues
Why and what to digitise?
WHY
• Increase & broaden access (remote & 24/7)
• Fragile, valuable &/or unique materials (loss or damage would be catastrophic)
• Support research & education
• Anticipating future use or re-use
• Improved search & retrieval
• Promoting knowledge, understanding & recognition of collections
• Relationships to other collections
• Preservation of at-risk collections by risk reduction & conservation
WHAT: popular collections; fragile/unique; at-risk; significant priorities; relationships (corporate or collaborative); & what you have the right to digitise!
How: some Principles* - Collections (organised groups of objects)
• Agreed collection development policy
• Sound description
• Lifecycle curation
• Broad access to all
• Respect for IP
• Evaluation for use & usefulness
• Interoperability
• Integration of staff & user workflows
• Sustainability & continued usability
* NISO Framework of Guidance for the Building of Good Digital Collections
How: some Principles - Objects (digital assets)
• Production ensures collection priorities & maintains interoperability and re-use
• Preservability: persistence & accessibility over time; across evolving media, software & formats
• Meaningful outside its context: portable, reusable, interoperable
• Persistent identifiers: URLs or URIs
• Authentication: veracity, accuracy & authenticity
• Inclusion of associated metadata: descriptive, administrative & structural
How: some Principles - Metadata
(selection and implementation of information about objects: descriptive; administrative; technical; structural; & preservation)
• Appropriate to materials, users and use
• Support for interoperability: mappings & crosswalks between schemes
• Use of authority control and content standards
• Includes a clear statement on conditions of use for the objects (eg. fair use)
• Support for long term management, eg. PREMIS
• Metadata records are treated as digital objects
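To make "mappings & crosswalks between schemes" concrete, here is a minimal sketch of a crosswalk from MARC tags to Dublin Core elements. The tag-to-element table, the record layout (a dict of tag to list of values) and the sample record are all assumptions made for illustration; a real project would follow a published crosswalk and a proper MARC parser.

```python
# Illustrative crosswalk: MARC tag -> Dublin Core element.
# The tag choices are assumptions for this sketch, not a published mapping.
MARC_TO_DC = {
    "245": "title",
    "100": "creator",
    "260": "publisher",
    "650": "subject",
    "520": "description",
}


def crosswalk(marc_record):
    """Convert {tag: [values]} into {dc_element: [values]}.

    Tags with no entry in MARC_TO_DC are skipped rather than guessed at.
    """
    dc = {}
    for tag, values in marc_record.items():
        element = MARC_TO_DC.get(tag)
        if element is None:
            continue
        dc.setdefault(element, []).extend(values)
    return dc


# Hypothetical record, invented for the example.
record = {
    "245": ["Unit war diary (hypothetical example)"],
    "650": ["World War, 1914-1918", "Military records"],
    "999": ["local field with no mapping"],
}
dc_record = crosswalk(record)
```

Keeping the mapping in a plain table, separate from the conversion code, means cataloguers can review and change it without touching the program.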
How: some Principles - Initiatives (the creation & management of collections)
• A substantial design and planning component
• Appropriate staffing and expertise
• Best practice project management
• An evaluation plan
• A project report that documents the process & outcomes
• Consideration of the entire lifecycle (ongoing management)
Copyright & Intellectual Property (1)
Concerns:
• What sort of items are protected by copyright?
• What is the duration of copyright protection?
• What sorts of activities infringe copyright?
• When is a copyright licence required?
• Understanding the “exceptions” to copyright infringement
See: Copyright and Cultural Institutions: Short Guidelines for Digitisation by Emily Hudson and Andrew Kenyon
& ACC’s Special case exception: education, libraries, collections (deals with the new section 200AB)
IFLA/IPA Statement on Orphaned Works
Resources required (1)
• Hardware – scanners, cameras, computers, monitors, digital storage, memory & processing power
• Software – scanning, OCR, office apps, image editing & management, DAM?, video/audio capture, metadata capture?, file conversion, calibration
• Furnishings – for staff, computers, scanners, storage
• Facility space – scanning, preparation & storage, QA
• Specialist staff – curatorial, cataloguers, IT/DBA, web, scanning, project management, conservators
• Training needs
• Conservation needs – archival supplies & consultancies
• Budget funds – salaries, hardware/software purchases & lease, licenses, running/ongoing costs, contingency
• Corporate support – context within corporate or other priorities and strategies
WW1 Diaries scanning facilities
Approximately 200,000 high res. images per year
Outsource or in-house?
Outsource:
• Contractor responsible for capital equipment, training and technology obsolescence costs
• No need to find scanning space
• Less need for digitisation knowledge
• Economies of scale (& capability for large volumes & throughput)
• The bureau may be able to achieve a better quality result & have a broader range of services
• A better fix on costs and timescales (but these can vary widely)
In-house:
• Better institutional knowledge, understanding & capacity
• Less risk than working with external parties
• Better ability to meet specific needs and deadlines?
• Lower costs for oversized or non-standard materials?
• QA may be more efficient
• Saving on transport and insurance and less risk with onsite scanning
• Assured staff and expertise
Dealing with an external bureau
• Clear contracts are important
• Choosing a bureau – check with reference sites
• Range and scope of material - non-standard materials
• Collaboration with others to achieve further economies of scale may be possible
• QA can be a project killer
• Metadata – what will the bureau record?
• Consider partial outsourcing or bringing a specialist partner onsite
Some funding options
• Program funding – dependent on corporate priorities
• User pays – but will they?
• Grants - eg. http://www.nla.gov.au/chg/
• Donors or sponsors - from or associated with a web presence
• Collection Depreciation – depends on valuation and an accounting standard
• As a training activity – can be viable learning experience for a small team & project
• New policy proposals
“Investing in an Intangible Asset”
• The benefits of long term preservation of digital assets are difficult to value (reliably and objectively), but the costs of inaction are high. More information on costs and benefits is needed.
• Digital preservation is still new, so there is scope for market creation & development, research and experimentation.
• Information managers know why such programs are important, but find it hard to communicate this to those who control our finances. Business cases based on empirical evidence need something like the balanced scorecard approach to bridge the gap between us and decision makers.
• Digital preservation is still an organisational innovation and must be managed effectively as it is dependent on independently driven technological developments.
From DCC’s Investment in an Intangible Asset
The AWM Document Digitisation Process
Cornell’s digital imaging process map
• Radiating out from the goals and deliverables of the project are the institutional resources
• The outer wheel represents the processes or stages of digital imaging initiatives – clockwise from Selection
Draft DCC Curation Lifecycle
PRODUCTION: file formats and standards
Commonly used formats:
• TIFF
• JPEG
• GIF
Future formats:
• JPEG 2000
• PNG
PRODUCTION: file formats – how and where they are used
PRODUCTION: scanners & cameras
• Flatbed scanners
• Map/plan scanners
• Overhead scanners
• Digital cameras
• Book scanners
• Book-edge scanners
• Microfilm and slide scanners
PRODUCTION: software
Image editing software
• Consider: cost; hardware requirements; usability; functionality
• Options: Adobe Photoshop CS3 (expensive/best) & Photoshop Elements (cheap); Gimp (free); + proprietary software for RAW files
• Derivative and PDF production: Acrobat Writer (expensive); ImageMagick (conversion software); Ghostscript (PDF interpreter); & pdftk (PDF toolkit)
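As an illustration of derivative production with ImageMagick, the sketch below builds a command that turns a TIFF master into a smaller JPEG access copy. The file names, the 1024-pixel limit and the quality setting are assumptions chosen for the example, and the program name is `convert` as in older ImageMagick releases (ImageMagick 7 installs it as `magick`). The `-resize` and `-quality` options are standard, and the trailing `>` in the geometry means "shrink only, never enlarge".

```python
import subprocess
from pathlib import Path


def derivative_command(master, out_dir, max_px=1024, quality=85):
    """Build an ImageMagick command for a JPEG access copy of one master image.

    All defaults here are illustrative assumptions, not recommended settings.
    """
    master = Path(master)
    target = Path(out_dir) / (master.stem + ".jpg")
    return [
        "convert",                              # "magick" on ImageMagick 7
        str(master),
        "-resize", f"{max_px}x{max_px}>",       # fit within the box, shrink only
        "-quality", str(quality),               # JPEG compression quality
        str(target),
    ]


def make_derivative(master, out_dir):
    """Run the command; requires ImageMagick to be installed and on the PATH."""
    subprocess.run(derivative_command(master, out_dir), check=True)


# Hypothetical file names, for illustration only.
cmd = derivative_command("masters/diary_001.tif", "access")
```

Building the command separately from running it makes it easy to log exactly what was done to each master, which helps later QA.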
Other useful open source software:
• JHOVE object validation
• FedoraCommons object repository management system
• ebXML e-business suite
• Xena digital document preservation software (from NAA)
• DSpace institutional repository system
• DROID automated batch identification of file formats (from TNA UK)
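Tools such as DROID identify formats from byte signatures inside the file rather than trusting the file extension. The sketch below is a much-simplified illustration of that idea for the image formats named earlier in this deck; it is not how DROID itself is implemented, and the function names are invented for the example. The byte patterns are the well-known leading signatures of each format.

```python
# Leading byte signatures ("magic numbers") for common image formats.
SIGNATURES = [
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\x00\x00\x00\x0cjP  \r\n\x87\n", "JPEG 2000 (JP2)"),
]


def identify(header):
    """Return a format name for the first bytes of a file, or 'unknown'."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def identify_file(path):
    """Read just enough of the file to compare against every signature."""
    longest = max(len(magic) for magic, _ in SIGNATURES)
    with open(path, "rb") as fh:
        return identify(fh.read(longest))
```

For example, `identify(b"GIF89a...")` reports a GIF even if the file has been renamed with a `.tif` extension, which is exactly the kind of mismatch worth catching before files go into long term storage.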
OUTPUT
Indexing
• Most descriptive metadata will come from your MARC records
• If a separate database is needed: Access, SQL & Oracle
Access options (also part of just doing it)
• Collection OPACs, databases, Zoomify, EAD, DVDs, CDs
• Other: Blogs, Facebook ArtShare, Flickr, Facebook page
Search engine optimisation
• How can I create a Google-friendly site?
STORAGE & MAINTENANCE
Storage
Consider: Speed (read/write, data transfer); Capacity; Reliability (stability, redundancy); Standardization; Cost; & Fitness to task
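Capacity is the easiest of these to estimate up front. The sketch below works out the uncompressed size of one scan and the storage a year of scanning would need. The page size (A4 at 300 ppi, 2480 x 3508 pixels), 24-bit colour and the two-copy redundancy are assumptions for illustration; the 200,000 images per year is the throughput quoted earlier for the WW1 diaries scanning facilities.

```python
def uncompressed_bytes(width_px, height_px, bits_per_pixel=24):
    """Size of one uncompressed raster image in bytes."""
    return width_px * height_px * bits_per_pixel // 8


def annual_storage_tb(images_per_year, bytes_per_image, copies=2):
    """Storage needed per year in terabytes (10**12 bytes), with redundancy."""
    return images_per_year * bytes_per_image * copies / 1e12


# Assumed scan: A4 at 300 ppi, 24-bit colour, stored uncompressed.
a4_300ppi = uncompressed_bytes(2480, 3508)

# 200,000 images per year, kept as two copies (an assumed redundancy level).
yearly_tb = annual_storage_tb(200_000, a4_300ppi)
print(f"{a4_300ppi / 1e6:.1f} MB per image, {yearly_tb:.1f} TB per year")
```

Under these assumptions each master is about 26 MB and a year of scanning needs roughly 10 TB; lossless compression or a different resolution would change the figures considerably, so treat this as a planning aid only.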
Management, maintenance & preservation
• Digital preservation practices
• Preservation metadata
• Trusted digital repositories?
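One widely used digital preservation practice is fixity checking: record a checksum for each file when it is created, then recompute it periodically to detect silent corruption. The sketch below uses SHA-256 from Python's standard library; the function names and the idea of keeping the recorded digest alongside the object's preservation metadata are assumptions for illustration.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, recorded_digest):
    """True if the file still matches the digest recorded at ingest."""
    return sha256_of(path) == recorded_digest
```

Reading in chunks keeps memory use flat even for very large master files, so the same check can be run across a whole store on a schedule.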
What we are finding
What we want
● Accuracy / authenticity
● Searchability
● Easy navigation & download
● Cost effectiveness
● Good quality product
● Text capture and search (OCR) where poss.
● Integration
● Scalability
● Web interactivity
● Simple solutions
Lessons
● Cost estimates escalate
● Technology has limits, but is improving
● You learn with new technology by doing
● There is more to copyright than owning it
● Anticipate needs & increasing expectations
● $ hard to find for access (sponsorship?)
● Better management & storage of assets
● A need to educate managers & suppliers!
● Keeping trained staff is a challenge
● Costs/benefits of new technologies (risk?)
● Importance of QA in projects!
● Need for a strategic plan(s)
● Be prepared to compromise
Enterprise Content Management: management, search & web facilities for digital assets and services
• Extensive digital asset management features
• Excellent electronic document & record management
• Intuitive web content management features
• Facilitate simple and complex workflow processes
• Extensive and unified searching constructs
• Scalable
• Compliant with all government recordkeeping requirements & emerging digital preservation standards
• Integrate easily with existing Memorial systems
• Simple to administer in terms of security, auditing & storage management
ECM - Conceptual Overview (diagram)
Core components: Digital Asset Management; Electronic Document & Records Management; Web Content Management
Other labels in the diagram: Other Corporate Systems; Record Management; E-mail; Memorial Intranet; AJRP Website; Lotus Notes; OAI Interface; FIRST OPAC; MICA OPAC (CAS); CMS; Digital Object Mgmt System (DOMS); Biographical Databases & War Diaries; RecordSearch (NAA); Collection Mgmt (MICA); Library System (FIRST); Fund Raising System (Raisers Edge); Financial & HR System (SAP); POS System (Advance Retail); CAS; Internal Orders; OnLine Shop; Search; Photocopy Quotes; ReQuest; eSales; PICTION
implementing user-friendly technologies
• make sure they are findable and useable
• pick a few “winners” & lead by example
• collaborate & network
• get involved in your core business
• don't leave it to IT-staff
• learn to compromise (the 80:20 rule)
• experiment
• start now! it is sometimes easier to seek forgiveness than gain permission
JISC 2007 – five key issues for digitisation
1. Re-focus on the user (simple, easily found & used output)
2. Aggregate and present content that can resonate with multiple communities
3. Learn from Google & YouTube but keep our values
4. New business models are needed, collaborating with and without the private sector
5. More collaboration between publishers, curators, funders, users, vendors and standards bodies