Digital Repository Service (DRS) Harvard University Library OIS presented by: Wendy Gogel & Andrea Goethals


Digital Repository Service (DRS)

Harvard University Library OIS

presented by: Wendy Gogel & Andrea Goethals


Today’s Agenda

April 26, 2010

How DRS began
Building the collections
Where DRS is now
Where DRS is going


How DRS began


The formative years

November 1997: Library Digital Initiative (LDI) Proposal
  “…create the first-generation technical infrastructure to support storage of and access to digital library materials.”
July 1998: LDI was approved and funded
December 1998: planning for DRS began


October 2000 launch

Digital Repository Service (DRS)
  provides a set of professionally managed services to ensure the usability of securely stored digital objects over time
  is both a preservation and an access repository


DRS is …

Technical infrastructure
  Deposit tools
  Delivery services
  Management tools
  Storage system

People
  Technical expertise and advice
  Content and system monitoring and management
  Preservation planning and activities
  User support and guidance

Content


Building the collections


Content

Programs & Projects
  LDI Internal Challenge Grant Program: 1999-2007
  Harvard Art Museum inventory project: 2005-2009
  Open Collections Program: 2002-2010
  Google Books project: 2005-2009
  Web Archiving: 2007-ongoing


Content

Digitizing Facilities
  Harvard College Library Imaging Services
  HCL Fine Arts Library Digital Imaging Lab
  Harvard Art Museum Digital Imaging and Visual Resources
  Harvard College Library Audio Preservation Services
  Peabody Museum of Archaeology and Ethnology


Metadata


Metadata


Audio

Matins for Sunday after the Elevation of the Holy Cross

Laura Boulton (1899-1980) Collection of Byzantine and Orthodox Musics

Archive of World Music

One of a series of Byzantine hymns and liturgies recorded in a monastery on Patmos, 1960.

Logbook (Part I, p. 1-10)


Where DRS is now


DRS by the numbers

109 TB of content
  356 TB total (counting all copies)

15 M files
  Includes compressed archives - in reality closer to 707 M files
  857,000 compressed Google books containing 676 M files
  7,300 compressed web harvests containing 17.5 M web files
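The figures on this slide can be cross-checked with a little arithmetic. The sketch below is not part of the presentation; it assumes that the 15 M figure counts each compressed archive as a single file (as "Includes compressed archives" suggests) and that all slide figures are rounded. The constant and function names are made up for illustration.

```python
# Figures as quoted on the slide (rounded values, not exact counts).
CONTENT_TB = 109                   # content, counted once
TOTAL_TB = 356                     # content counting all copies
STORED_FILES = 15_000_000          # assumption: each archive counted as one file
GOOGLE_ARCHIVES = 857_000          # compressed Google books
GOOGLE_INNER_FILES = 676_000_000   # files inside those archives
WEB_ARCHIVES = 7_300               # compressed web harvests
WEB_INNER_FILES = 17_500_000       # files inside those harvests


def expanded_file_count() -> int:
    """Swap each compressed archive for the files it contains."""
    archives = GOOGLE_ARCHIVES + WEB_ARCHIVES
    return STORED_FILES - archives + GOOGLE_INNER_FILES + WEB_INNER_FILES


def average_copies() -> float:
    """Average number of stored copies per terabyte of content."""
    return TOTAL_TB / CONTENT_TB


if __name__ == "__main__":
    print(f"expanded file count: {expanded_file_count() / 1e6:.1f} M")
    print(f"average copies: {average_copies():.2f}")
```

Under these assumptions the expanded count comes to about 707.6 M files, in line with the "closer to 707 M" on the slide, and the two storage totals imply a little over three copies of the content on average.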


Format distribution: file count

JP2 40%
TEXT 19%
JPEG 16%
TIFF 16%
ZIP 8%

Chart legend (MIME types): image/jp2, text/plain, image/jpeg, image/tiff, application/zip, text/xml, audio/x-wave, application/x-gzip, image/x-photo-cd, audio/x-pn-realaudio, application/pdf, audio/x-aiff, application/x-icc, image/gif


Format distribution: file size

ZIP 53%
TIFF 26%
JP2 16%
WAVE 3%
JPEG 2%

Chart legend (MIME types): application/zip, image/tiff, image/jp2, audio/x-wave, image/jpeg, application/x-gzip, audio/x-pn-realaudio, audio/x-aiff, image/x-photo-cd, text/plain, text/xml, application/pdf, application/x-icc, image/gif
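The two format charts rank formats differently because one weighs each file equally and the other weighs it by size. The following is a minimal sketch of that calculation, not the method used to produce the charts; the inventory is made-up sample data and the `shares` function is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical inventory of (MIME type, size in bytes). Not DRS data.
INVENTORY = [
    ("image/jp2", 2_000_000),
    ("image/jp2", 3_000_000),
    ("text/plain", 10_000),
    ("text/plain", 10_000),
    ("application/zip", 15_000_000),
]


def shares(records):
    """Return {mime: (percent of file count, percent of total bytes)}."""
    counts = defaultdict(int)
    sizes = defaultdict(int)
    for mime, size in records:
        counts[mime] += 1
        sizes[mime] += size
    total_count = sum(counts.values())
    total_size = sum(sizes.values())
    return {
        mime: (100 * counts[mime] / total_count, 100 * sizes[mime] / total_size)
        for mime in counts
    }


if __name__ == "__main__":
    for mime, (by_count, by_size) in shares(INVENTORY).items():
        print(f"{mime}: {by_count:.0f}% of files, {by_size:.1f}% of bytes")
```

In this sample a single large ZIP file is a small share of the file count but most of the bytes, the same pattern the slides show for ZIP (8% of files, 53% of size).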


Content Growth

[Chart: content size in TB (vertical axis, 0 to 120) over time (horizontal axis, 2000-10 through 2010-02)]


DRS Architecture

[Architecture diagram; labels follow]

Components
  Ingest Services
  Delivery Services
  DRS Web Admin Tools
  Consistency Validation Service
  Content Storage Service
  Metadata Storage
  Database

Connections
  TCP/IP
  NFS


DRS Architecture

[Architecture diagram; labels follow]

Sites
  Site 1 Cambridge
  Site 2 Boston
  Site 3 Westborough

Storage copies
  Disk archive (High use, copy 1)
  Disk archive (High use, copy 2)
  Disk archive (Low use, copy 1)
  Tape archive (High use, copy 3)
  Tape archive (Low use, copy 2)
  Tape archive (High use, copy 4)
  Tape archive (Low use, copy 3)
  Media only

Services and tools
  Load Balanced Delivery Services (label appears twice)
  DRS Loader
  DRS Web Admin Tools
  BatchBuilder
  SFTP Drop Boxes
  Web Archiving Service
  Consistency Validation Service
  Access Management Service
  Name Resolution Service
  Metadata Storage
  Database
  SAM/QFS

Users and access points
  Depositors
  Catalogs – Web Sites - Google

Connections
  TCP/IP
  NFS
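The copy labels in the diagram describe two replication tiers: high-use content with four copies and low-use content with three. A minimal sketch of that policy as data follows. The tuple layout and the `copies_by_media` helper are illustrative assumptions, and site assignments are left out because the flattened transcript does not preserve which archive sits at which site.

```python
from collections import Counter

# (use class, media, copy number), transcribed from the diagram labels.
# Site placement is deliberately omitted: it cannot be recovered here.
COPIES = [
    ("high", "disk", 1),
    ("high", "disk", 2),
    ("high", "tape", 3),
    ("high", "tape", 4),
    ("low", "disk", 1),
    ("low", "tape", 2),
    ("low", "tape", 3),
]


def copies_by_media(use_class: str) -> Counter:
    """Count the copies of one use class on each kind of media."""
    return Counter(media for cls, media, _ in COPIES if cls == use_class)


if __name__ == "__main__":
    for use_class in ("high", "low"):
        per_media = copies_by_media(use_class)
        print(use_class, dict(per_media), "total:", sum(per_media.values()))
```

Read this way, high-use content has two disk and two tape copies, while low-use content has one disk and two tape copies.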


DRS third-party components

Open source software
  Castor (XML to Java mapping)
  XX XML Validator
  Java Swing (U/I toolset)
  iText PDF creator
  JHOVE
  Apache Lucene, Struts, Tomcat
  JQUERY (javascript tools)
  Apache Log4j (logging)
  Giffy (Tiff-to-Gif Converter)
  XML tools (Xerces, Xalan, JaxB, JDOM)
  Linux

COTS software
  Luratech Image Server
  Real Media Helix Streaming Server
  Oracle database
  SUN Solaris
  SUN SAM/QFS Storage Archive Manager

Common OIS software
  Access Management Service
  Name Resolution Server

Open Source Berkeley DB


Where DRS is going


DRS 2

Why?
1. To better support digital preservation planning & activities
2. To better support operational & collection management needs of DRS depositors, collection managers, library administrators & repository staff


DRS 2 Process

Phases of work
  DRS 2.1, 2.2, 2.3, etc.

Themed phases
  DRS 2.1: “Object Security and Integrity”
  DRS 2.2: “Management and Monitoring”
  DRS 2.3: “Delivery and Dissemination”

Includes support for new formats
  DRS 2.1: PDFs, opaque objects
  DRS 2.2: more audio formats (MP3, MP4/AAC)
  DRS 2.3: drawings, dissemination


DRS 2 Deliverables

New backend
New deposit tools
New management tools
New dissemination tools
Support for new formats


DRS 2 Timing

July 2010 (for testing):
  New backend
  New deposit environment (5 object types)

August 2011, release:
  New backend
  New deposit environment (16 object types)
  New management interface
  Enhanced rights metadata
  Enhanced audio support
  Migrated content


DRS 2 Timing

Spring 2012, release:
  Dissemination
  Support for drawing formats


Questions?