1 Designing Storage Architecture for Digital Collections 2012.

1

http://www.digitalpreservation.gov/meetings/storage12.html

Designing Storage Architecture for Digital Collections 2012

2

DSA 2012 - GOALS

The DSA 2012 meeting brought together technical and industry experts:

Library of Congress IT and subject matter experts;

government specialists with an interest in preservation;

decision-makers from a wide range of organizations with digital preservation requirements; and

recognized authorities and practitioners of digital preservation.

3

DSA 2012 TECHNOLOGY OVERVIEW

Technology Overview
State of the Industry
Digital Content Architecture/Storage Tiers
File Fixity Checking
Density of Storage Media – Future
Systems for Data Provenance

4

STORAGE TIERS

Storage Tiers at LC
Tier 0 - High-Speed Data
Tier 1 - Transactional Data
Tier 2 - Active Data
Tier 3 - Data at Rest
Tier 4 - Backup Data
Tier 5 - Long-term Storage Offline

5

FILE FIXITY

File Fixity and Data Integrity in a Large Archive (NAVCC)

• At least 2 copies of everything digital
• Test and monitor for the failures
• Refresh the damaged copy from the good copy
• This process must be as automated as possible
• Someday we’re going to lose something
  - What’s that likelihood?
  - What costs are reasonable to reduce that?
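
The test-and-refresh cycle in the bullets above can be sketched in a few lines of Python. This is a minimal illustration, not the NAVCC implementation: the function names `sha256_of` and `check_and_repair`, the choice of SHA-256, and the assumption that a trusted digest was recorded at ingest are all hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_and_repair(copy_a: Path, copy_b: Path, expected: str) -> str:
    """Check two copies against a recorded digest; refresh a damaged copy.

    Returns "ok", "repaired_a", "repaired_b", or "lost" when neither copy
    matches the recorded digest (hypothetical status labels).
    """
    a_good = copy_a.exists() and sha256_of(copy_a) == expected
    b_good = copy_b.exists() and sha256_of(copy_b) == expected
    if a_good and b_good:
        return "ok"
    if a_good:
        # Copy B is missing or damaged: refresh it from the good copy.
        shutil.copyfile(copy_a, copy_b)
        return "repaired_b"
    if b_good:
        # Copy A is missing or damaged: refresh it from the good copy.
        shutil.copyfile(copy_b, copy_a)
        return "repaired_a"
    return "lost"
```

Run on a schedule over every object, a check like this covers the "test and monitor" and "refresh" steps without manual work; the "lost" result is the case the last bullet asks about.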

6

Provenance
Need to build provenance into all new systems.
Add provenance to legacy systems.
Use of layers makes it simpler to do this.
Why is provenance hard to get and maintain?
Need to look at it in terms of multiple layers, and let every layer create and transmit the provenance information it understands:
Operating systems
Database systems
Workflow engines
Applications
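
One way to picture the layered approach is a shared trail to which each layer appends only the facts it knows. The Python sketch below is purely illustrative: `ProvenanceTrail`, `ProvenanceRecord`, the layer names, and every field value are hypothetical and do not come from any system discussed at the meeting.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProvenanceRecord:
    layer: str                # which layer produced this record
    details: Dict[str, str]   # only the facts that layer understands


@dataclass
class ProvenanceTrail:
    records: List[ProvenanceRecord] = field(default_factory=list)

    def add(self, layer: str, **details: str) -> "ProvenanceTrail":
        """Append one layer's record and return the trail for chaining."""
        self.records.append(ProvenanceRecord(layer, dict(details)))
        return self

    def layers(self) -> List[str]:
        """List the contributing layers in the order they reported."""
        return [record.layer for record in self.records]


# Each layer contributes what it knows and passes the trail along.
trail = (
    ProvenanceTrail()
    .add("operating_system", path="/archive/item-001.tif", owner="ingest")
    .add("database", table="objects", row_id="42")
    .add("workflow_engine", step="checksum-verify", status="passed")
    .add("application", action="ingest", operator="example-user")
)
```

The design point is that no single layer needs to understand the whole picture; the trail is only as complete as the set of layers that write to it.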

7

DSA 2012 PANELS

How to Store Data Over Time
Panel 1 - Industry Architects: How would you store data over time?

Future of Magnetic Tape
Panel 2 - Tape vendors: What do you see in the future market?

Future of Hierarchical Storage Management
Panel 3 - Hierarchical Storage Management system vendors: What do you see happening in the large archival customer base? What do you see as the challenges?

8

National Digital Stewardship Alliance: An initiative of NDIIPP to promote broadened access to digital materials, build community around digital content stewardship activities, and foster the development of standards, best practices, and infrastructure for digital preservation.

9

NDIIPP/NDSA

NDSA Levels of Digital Preservation

Goal: A tool for mitigating technical digital preservation risks

Level One (Protect Your Data)
Level Two (Know Your Data)
Level Three (Monitor Your Data)
Level Four (Repair Your Data)

Each level addresses, “what can I do about...?”
Storage and geographic location
File fixity and data integrity
Information security
Metadata
File formats

10

NDSA Preliminary Preservation Storage Survey

87% of respondents are responsible for their content for an indefinite period of time, more or less forever.

64% of respondents are planning to make significant changes in the technologies in their preservation storage architecture in the next three years.

74% of respondents report a strong preference to host and control their own technical infrastructure for preservation storage.

50% of respondents are considering or currently contracting out storage services to be managed by another organization or company.

69% of respondents are considering or currently participating in a distributed storage cooperative or system (e.g., LOCKSS Alliance, MetaArchive, Data-PASS).

51% of respondents are considering or already using a cloud storage provider to keep one copy of their content.

11

What are the biggest challenges you see for Storage Architectures for Digital Preservation over the next five years?

[Chart: Biggest Challenges. Categories: Cost, Scalability, System, Security, Data Integrity, Metadata/Provenance]

12

What do you see as under-tapped or untapped opportunities for meeting these challenges?

[Chart: Untapped Opportunities. Categories: Tools, Standards, Collaboration, Other]

13