Digital Repository Service (DRS)
Harvard University Library OIS
Presented by: Wendy Gogel & Andrea Goethals
Today’s Agenda
April 26, 2010
  How DRS began
  Building the collections
  Where DRS is now
  Where DRS is going
How DRS began
The formative years
November 1997: Library Digital Initiative (LDI) Proposal
“…create the first-generation technical infrastructure to support storage of and access to digital library materials.”
July 1998: LDI was approved and funded
December 1998: planning for DRS began
October 2000: launch
The Digital Repository Service (DRS) provides a set of professionally managed services to ensure the usability of securely stored digital objects over time.
DRS is both a preservation and an access repository.
DRS is …
Technical infrastructure
  Deposit tools
  Delivery services
  Management tools
  Storage system
People
  Technical expertise and advice
  Content and system monitoring and management
  Preservation planning and activities
  User support and guidance
Content
Building the collections
Content
Programs & Projects
  LDI Internal Challenge Grant Program, 1999-2007
  Harvard Art Museum inventory project, 2005-2009
  Open Collections Program, 2002-2010
  Google Books project, 2005-2009
  Web Archiving, 2007-ongoing
Content
Digitizing Facilities
  Harvard College Library Imaging Services
  HCL Fine Arts Library Digital Imaging Lab
  Harvard Art Museum Digital Imaging and Visual Resources
  Harvard College Library Audio Preservation Services
  Peabody Museum of Archaeology and Ethnology
Metadata
Audio
Matins for Sunday after the Elevation of the Holy Cross
Laura Boulton (1899-1980) Collection of Byzantine and Orthodox Musics
Archive of World Music
One of a series of Byzantine hymns and liturgies recorded in a monastery on Patmos, 1960.
Logbook (Part I, p. 1-10)
Where DRS is now
DRS by the numbers
109 TB of content
356 TB total (counting all copies)
15 M files
  Includes compressed archives; in reality closer to 707 M files
  857,000 compressed Google books containing 676 M files
  7,300 compressed web harvests containing 17.5 M web files
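The "closer to 707 M files" figure can be roughly reproduced from the other numbers on this slide. The sketch below assumes the estimate was made by replacing each compressed archive with the files it contains; the slide does not state the method, and the variable and function names are illustrative only.

```python
# Figures quoted on the slide, in millions of files.
top_level_files_m = 15.0       # files as counted by DRS (each archive counts once)
google_archives_m = 0.857      # compressed Google books
google_inner_files_m = 676.0   # files inside those archives
web_archives_m = 0.0073        # compressed web harvests
web_inner_files_m = 17.5       # web files inside those harvests


def expanded_file_count_m():
    """Assumed method: swap each archive for the files it contains."""
    archives = google_archives_m + web_archives_m
    inner = google_inner_files_m + web_inner_files_m
    return top_level_files_m - archives + inner


total_m = expanded_file_count_m()
print(f"about {total_m:.1f} M files")
```

Under that assumption the result lands between 707 M and 708 M, in line with the slide's rounded figure.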
Format distribution: file count
  JP2 40%
  TEXT 19%
  JPEG 16%
  TIFF 16%
  ZIP 8%
Chart legend (MIME types): image/jp2, text/plain, image/jpeg, image/tiff, application/zip, text/xml, audio/x-wave, application/x-gzip, image/x-photo-cd, audio/x-pn-realaudio, application/pdf, audio/x-aiff, application/x-icc, image/gif
Format distribution: file size
  ZIP 53%
  TIFF 26%
  JP2 16%
  WAVE 3%
  JPEG 2%
Chart legend (MIME types): application/zip, image/tiff, image/jp2, audio/x-wave, image/jpeg, application/x-gzip, audio/x-pn-realaudio, audio/x-aiff, image/x-photo-cd, text/plain, text/xml, application/pdf, application/x-icc, image/gif
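Reading the two format charts together gives a rough sense of which formats hold unusually large or small files. The sketch below divides each format's share of total size by its share of the file count; a ratio above 1 suggests files larger than the repository average. It uses only the labeled percentages from the two charts, which are rounded, so the ratios are approximate. The dictionary and function names are illustrative.

```python
# Labeled slices from the two pie charts (percent, rounded on the slides).
count_share = {"JP2": 40, "TEXT": 19, "JPEG": 16, "TIFF": 16, "ZIP": 8}
size_share = {"ZIP": 53, "TIFF": 26, "JP2": 16, "WAVE": 3, "JPEG": 2}


def relative_avg_size(fmt):
    """Share of bytes divided by share of files for one format."""
    return size_share[fmt] / count_share[fmt]


# Only formats labeled on both charts can be compared.
for fmt in sorted(set(count_share) & set(size_share)):
    print(f"{fmt}: {relative_avg_size(fmt):.2f}x the average file size")
```

ZIP stands out because each archive is counted as a single file while holding many files' worth of data.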
Content Growth
[Chart: DRS content in TB (vertical axis, 0 to 120) by month, from 2000-10 through 2010-02]
DRS Architecture
[Diagram labels]
  DRS Web Admin Tools
  Delivery Services
  Ingest Services
  Consistency Validation Service
  Content Storage Service
  Metadata Storage
  Database
  TCP/IP
  NFS
DRS Architecture
[Diagram labels]
Sites
  Site 1 Cambridge
  Site 2 Boston
  Site 3 Westborough
Storage copies
  Disk archive (High use, copy 1)
  Disk archive (High use, copy 2)
  Disk archive (Low use, copy 1)
  Tape archive (High use, copy 3)
  Tape archive (Low use, copy 2)
  Tape archive (High use, copy 4)
  Tape archive (Low use, copy 3)
  Media only
Services and tools
  DRS Web Admin Tools
  Load Balanced Delivery Services
  DRS Loader
  BatchBuilder
  SFTP Drop Boxes
  Consistency Validation Service
  Access Management Service
  Name Resolution Service
  Web Archiving Service
  Metadata Storage
  Database
  SAM/QFS
  TCP/IP
  NFS
Users and clients
  Depositors
  Catalogs, Web Sites, Google
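The copy labels in the diagram can be cross-checked against the storage totals quoted under "DRS by the numbers". The sketch below is a back-of-the-envelope estimate only. It assumes that high-use content is kept in four copies and low-use content in three (as the copy numbering in the diagram suggests), and that the 356 TB total is simply the sum of all copies of the 109 TB of content. Neither assumption is confirmed by the slides, and all names are illustrative.

```python
CONTENT_TB = 109.0    # unique content, from "DRS by the numbers"
TOTAL_TB = 356.0      # all copies, from "DRS by the numbers"
COPIES_HIGH_USE = 4   # assumption: two disk copies plus two tape copies
COPIES_LOW_USE = 3    # assumption: one disk copy plus two tape copies


def average_copies():
    """Average number of stored copies per TB of content."""
    return TOTAL_TB / CONTENT_TB


def implied_high_use_fraction():
    """Solve 4*h + 3*(1 - h) = average copies for h, the high-use share."""
    avg = average_copies()
    return (avg - COPIES_LOW_USE) / (COPIES_HIGH_USE - COPIES_LOW_USE)


print(f"average copies per TB: {average_copies():.2f}")
print(f"implied high-use share: {implied_high_use_fraction():.0%}")
```

If the assumptions hold, the average falls between three and four copies, which is consistent with a mix of low-use and high-use content.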
DRS third-party components
Open source software
  Castor (XML to Java mapping)
  XX XML Validator
  Java Swing (UI toolset)
  iText PDF creator
  JHOVE
  Apache Lucene, Struts, Tomcat
  jQuery (JavaScript tools)
  Apache Log4j (logging)
  Giffy (TIFF-to-GIF converter)
  XML tools (Xerces, Xalan, JAXB, JDOM)
  Linux
COTS software
  Luratech Image Server
  Real Media Helix Streaming Server
  Oracle database
  Sun Solaris
  Sun SAM/QFS Storage Archive Manager
Common OIS software
  Access Management Service
  Name Resolution Server
  Open Source Berkeley DB
Where DRS is going
DRS 2
Why?
1. To better support digital preservation planning & activities
2. To better support operational & collection management needs of DRS depositors, collection managers, library administrators & repository staff
DRS 2 Process
Phases of work: DRS 2.1, 2.2, 2.3, etc.
Themed phases
  DRS 2.1: “Object Security and Integrity”
  DRS 2.2: “Management and Monitoring”
  DRS 2.3: “Delivery and Dissemination”
Includes support for new formats
  DRS 2.1: PDFs, opaque objects
  DRS 2.2: more audio formats (MP3, MP4/AAC)
  DRS 2.3: drawings, dissemination
DRS 2 Deliverables
  New backend
  New deposit tools
  New management tools
  New dissemination tools
  Support for new formats
DRS 2 Timing
July 2010 (for testing):
  New backend
  New deposit environment (5 object types)
August 2011, release:
  New backend
  New deposit environment (16 object types)
  New management interface
  Enhanced rights metadata
  Enhanced audio support
  Migrated content
Spring 2012, release:
  Dissemination
  Support for drawing formats
Questions?