Long-Term Preservation


Page 1: Long-Term Preservation

Long-Term Preservation

Page 2: Long-Term Preservation

Technical Approaches to Long-Term Preservation

• The challenge is to interpret formats
• A similar development: sound carriers
• From phonograph to MP3

Page 3: Long-Term Preservation

• Those who do not keep up with this development will soon lack support:
– new audio documents are produced only in current formats
– spare parts for out-of-date equipment are hard to come by.

Page 4: Long-Term Preservation

• Technical approaches to long-term preservation of digital documents fall into two categories:
– those that aim to preserve the original state of documents along with systems that are suitable for rendering the documents in their original format
– those that aim to continually transform digital documents into the formats of state-of-the-art rendition systems while retaining their original “look and feel.”

Page 5: Long-Term Preservation

Migration
• advantages:
– well known
– documents available all the time
– possibly improved quality
• disadvantages:
– reduced authenticity
– hard to automate
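
A minimal sketch of one migration step, assuming a plain-text document stored in a legacy character encoding. The function name `migrate_text` and the record fields are hypothetical and chosen for illustration only; real migrations of complex formats are far harder to automate. Keeping a checksum of the source bytes is one way to document what the migrated copy was derived from, since the bytes themselves change.

```python
import hashlib


def migrate_text(data: bytes, source_encoding: str = "latin-1") -> dict:
    """Convert a text document from a legacy encoding to UTF-8.

    Hypothetical helper: the returned record layout is an assumption,
    not part of any preservation standard.
    """
    text = data.decode(source_encoding)   # interpret the old format
    migrated = text.encode("utf-8")       # write the current format
    return {
        "content": migrated,
        "migrated_from": source_encoding,
        # Checksum of the original bytes, kept as evidence of the source.
        "source_sha256": hashlib.sha256(data).hexdigest(),
    }


legacy = "Universität".encode("latin-1")
record = migrate_text(legacy)
print(record["content"].decode("utf-8"))
```

The text reads the same after migration, but the stored bytes differ from the original, which is the loss of authenticity noted above.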

Page 6: Long-Term Preservation

Hardware Museums
• The mission of a hardware museum is to collect (and keep operational) all relevant computing systems so that future generations may view our documents in their original environments.

Page 7: Long-Term Preservation

• Hardware museums are not feasible in practice:
– too many items
– additional software and hardware required
– hard to maintain

Hardware museum at the Universität der Bundeswehr München, Germany

Page 8: Long-Term Preservation

Emulation
• Emulators allow the function of processors and other hardware components to be simulated by software.
• When using emulation, the following items have to be preserved for each digital document (using, e.g., migration):
– the character stream and the metadata
– a specification of the hardware that can be interpreted by the emulator
– the complete software of the rendition system (in the form of binary data streams).

Page 9: Long-Term Preservation

• If interested persons would like to access a document conserved in this way, say, 100 years from now, they would have to proceed as follows:
1. Create an emulator
– Load the hardware specification into an emulator to obtain a software implementation that is functionally equivalent to the original hardware.
2. Install software
– On the emulated computer, install the system software and the application programs needed for rendering the document.

Page 10: Long-Term Preservation

3. Render documents
– Load the character stream of the digital document into the emulated computer and start the rendition software to access the document.
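
The three steps can be sketched with a toy machine. Everything here is an assumption made for illustration: the three-instruction "hardware", the opcode numbers, and the names `create_emulator`, `hardware_spec`, and `rendition_software` are hypothetical, and a real emulator would be vastly more complex. The sketch only shows how the three preserved items (hardware specification, software, character stream) come together.

```python
# Primitive operations a future emulator author would have to implement.
def op_read(machine):
    # Move the next character of the document into the register.
    machine["reg"] = machine["input"].pop(0) if machine["input"] else None


def op_emit(machine):
    # Render the character currently held in the register.
    if machine["reg"] is not None:
        machine["output"].append(machine["reg"])


def op_restart_if_data(machine):
    # Jump back to the first instruction while characters remain.
    if machine["reg"] is not None:
        machine["pc"] = 0


PRIMITIVES = {
    "read": op_read,
    "emit": op_emit,
    "restart_if_data": op_restart_if_data,
}


def create_emulator(hardware_spec):
    """Step 1: turn the preserved hardware specification into a runnable machine."""
    instruction_set = {code: PRIMITIVES[name] for code, name in hardware_spec.items()}

    def run(program, character_stream):
        machine = {"pc": 0, "reg": None, "input": list(character_stream), "output": []}
        while machine["pc"] < len(program):
            opcode = program[machine["pc"]]
            machine["pc"] += 1
            instruction_set[opcode](machine)
        return "".join(machine["output"])

    return run


# The three preserved items (all hypothetical toy values).
hardware_spec = {1: "read", 2: "emit", 3: "restart_if_data"}
rendition_software = [1, 2, 3]            # binary program for the toy machine
character_stream = "Long-Term Preservation"

emulated_computer = create_emulator(hardware_spec)                   # step 1
rendered = emulated_computer(rendition_software, character_stream)   # steps 2 and 3
print(rendered)
```

Note that only `create_emulator` and the primitives have to be written anew in the future; the specification, the program, and the document are taken from the archive unchanged.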

Page 11: Long-Term Preservation

• advantages of emulation:
– relatively small cost per document
– cost proportional to actual use
– one emulator suffices for many documents
– high authenticity
• Whenever an old format becomes obsolete (while new ones become popular), new conversion techniques and tools have to be developed that achieve the required transformation.

Page 12: Long-Term Preservation

Standard Formats
• costs proportional to number of formats
• standards for simple character sequences and for complex document types

Page 13: Long-Term Preservation

Legal and Social Concerns
• Long-term preservation of digital documents involves legal and social concerns:
1. “Digital Rights Management” (DRM) and copy protection
2. reserved software rights
3. Should hardware manufacturers provide emulators?
4. criteria for selection
5. costs as a limiting factor
6. making costs affordable
7. balance of interests between shareholders

Page 14: Long-Term Preservation

OAIS Models

• Open Archival Information System Reference Model

• an ISO standard on the long-term preservation of digital documents.

• two complementary points of view: an information model and a process model

Page 15: Long-Term Preservation

The Information Model
• Data Object and Information Object
• The knowledge that is required to understand data is called the Knowledge Base
• In order to understand the data one needs additional information.
• Example: along with the source code of a Java program, a book about the programming language Java must be available (Representation Information)

Page 16: Long-Term Preservation
Page 17: Long-Term Preservation

• The Content Information is the information object proper which contains all the information necessary to interpret data

• Preservation Description Information (PDI) denotes all the information required to suitably preserve the corresponding Content Information.

Page 18: Long-Term Preservation

• Content Information and PDI are combined into one logical entity, the Information Package.

• Packaging Information specifies how Content Information and PDI are actually related to each other, e.g., by describing the directory structure of a CD-ROM.

Page 19: Long-Term Preservation

• Descriptive Information yields information about the content of the Information Package and thus allows the Information Package to be found in the archive.
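
The parts of the information model can be sketched as plain data classes. This is an illustrative reading of the terms above, not a schema taken from the OAIS standard: the field names, the example values, and the `find` helper are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ContentInformation:
    data_object: bytes                 # the raw data to be preserved
    representation_information: str    # what is needed to interpret the data


@dataclass
class PreservationDescriptionInformation:
    # Free-form entries; the keys used below are illustrative assumptions.
    entries: dict[str, str] = field(default_factory=dict)


@dataclass
class InformationPackage:
    content: ContentInformation
    pdi: PreservationDescriptionInformation
    packaging_information: str         # how content and PDI are laid out together


@dataclass
class DescriptiveInformation:
    package_id: str
    keywords: list[str]


def find(index: list[DescriptiveInformation], keyword: str) -> list[str]:
    """Return the ids of all packages whose description mentions the keyword."""
    return [d.package_id for d in index if keyword in d.keywords]


package = InformationPackage(
    content=ContentInformation(
        data_object=b"class Hello { }",
        representation_information="A book about the programming language Java",
    ),
    pdi=PreservationDescriptionInformation({"provenance": "submitted by the author"}),
    packaging_information="CD-ROM: /content/Hello.java, /pdi/provenance.txt",
)
index = [DescriptiveInformation(package_id="pkg-001", keywords=["java", "source code"])]
print(find(index, "java"))
```

Keeping Descriptive Information outside the package reflects its role: it is what the archive searches, while the package itself is what the archive preserves.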

Page 20: Long-Term Preservation

Modeling Context and Processes
• In order to define the processes that are going on in the archive in more detail, the OAIS Reference Model starts by considering the context of the archive.

• An archive’s purpose is to maintain documents, which are submitted to it and which are to be made available to future users.

Page 21: Long-Term Preservation

• Producers, i.e., authors, institutions, etc. that deliver documents to the archive.

• Management defines the specific purpose of the archive, e.g., which documents are to be collected and which are not.

Page 22: Long-Term Preservation

• The OAIS Reference Model distinguishes three kinds of Information Packages according to their relation to the environment of the archive:
– Submission Information Packages (SIP) are sent to the archive by Producers
– Archival Information Packages (AIP) are preserved in the archive
– Dissemination Information Packages (DIP) are passed from the archive to Consumers.

Page 23: Long-Term Preservation
Page 24: Long-Term Preservation

• The Ingest process receives an SIP from the Producer and prepares it for storage and administration within the archive.

• SIPs must be transformed into AIPs, and Descriptive Information corresponding to the AIPs has to be created.

• The AIP is passed on to the Archival Storage process, and the corresponding Descriptive Information to the Data Management process.
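
The Ingest flow described on this slide can be sketched as a single function. The class layouts, the `ingest` function, the id scheme, and the use of two dictionaries to stand in for the Archival Storage and Data Management processes are all assumptions for illustration; the OAIS Reference Model does not prescribe an implementation.

```python
from dataclasses import dataclass


@dataclass
class SIP:
    producer: str
    title: str
    payload: bytes


@dataclass
class AIP:
    aip_id: str
    payload: bytes
    provenance: str


@dataclass
class DescriptiveInformation:
    aip_id: str
    title: str
    producer: str


def ingest(sip: SIP, archival_storage: dict, data_management: dict) -> str:
    """Transform an SIP into an AIP plus Descriptive Information and hand both on."""
    aip_id = f"aip-{len(archival_storage) + 1:04d}"   # hypothetical id scheme
    aip = AIP(aip_id, sip.payload, f"received from {sip.producer}")
    description = DescriptiveInformation(aip_id, sip.title, sip.producer)
    archival_storage[aip_id] = aip           # AIP goes to Archival Storage
    data_management[aip_id] = description    # description goes to Data Management
    return aip_id


storage, catalog = {}, {}
new_id = ingest(SIP("Example Press", "Annual Report", b"%PDF-"), storage, catalog)
print(new_id, catalog[new_id].title)
```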

Page 25: Long-Term Preservation

• The Data Management process manages the Descriptive Information and also the data that are necessary to run the system.

• The Administration process handles routine work in the archive, e.g., it negotiates with Producers the prerequisites for sending documents to the archive.

Page 26: Long-Term Preservation

DSEP Model

• Deposit System for Electronic Publications
• The business routine of a library can be subdivided into four domains:
– Acquisition of stock
– Capturing metadata
– Preservation and maintenance
– Providing access

Page 27: Long-Term Preservation
Page 28: Long-Term Preservation

• The process Delivery & Capture transforms documents into SIPs conforming to the DSEP standards.

• The process Packaging & Delivery unpacks the DIP and transforms it into a format that can be used by the library system.

Page 29: Long-Term Preservation