January 2006 Archival Storage Strategies and Technologies Presentation

Porter-Roth Associates 1 Archival Storage Strategies & Technologies AIIM Presentation January 25, 2006

Transcript of January 2006 Archival Storage Strategies and Technologies Presentation

Page 1: January 2006 Archival Storage Strategies and Technologies Presentation

Porter-Roth Associates 1

Archival Storage Strategies & Technologies

AIIM Presentation

January 25, 2006

Page 2: January 2006 Archival Storage Strategies and Technologies Presentation

Bud Porter-Roth
Porter-Roth Associates

[email protected]

http://www.erms.com

Page 3: January 2006 Archival Storage Strategies and Technologies Presentation

Agenda

Introduction
The Preservation Problem
Recommendations

Page 4: January 2006 Archival Storage Strategies and Technologies Presentation

Introduction

Basic Need

Compliance

Disaster Recovery

Page 5: January 2006 Archival Storage Strategies and Technologies Presentation

Introduction

Flash Drives

File Systems

E-Mail Servers

Local Drives

Web Servers

Imaging Repositories

Paper Files

Electronic Document Repositories

Microfilm

Business Systems

Video Libraries

Photographs

PDAs

Page 6: January 2006 Archival Storage Strategies and Technologies Presentation

The Preservation Problem

The problem is actually two separate, largely unrelated issues:

Hardware and software to store and read documents

Hardware, OS, applications

The format that the documents are in

Word, PDF, XML

Page 7: January 2006 Archival Storage Strategies and Technologies Presentation

The Preservation Problem

The “Problem” in brief:

Software formats change and become unsupported
Software formats fall out of favor over time and disappear
Hardware drives change and become unsupported
Storage media change over time and become obsolete:

Floppy disks
Optical disks (WORM, CD, DVD)
Tape (many flavors)
Portable storage media like the “Memory Stick” in use today

Taken together, these issues mean that digital documents will very likely have to be converted from one format or medium to another over time, and that for the foreseeable future this conversion will be a continuing process.

Page 8: January 2006 Archival Storage Strategies and Technologies Presentation

The Preservation Problem

TIFF (Tagged Image File Format), usually with ITU-T Group 4 compression
JPEG (Joint Photographic Experts Group)
GIF (Graphics Interchange Format)
PNG (Portable Network Graphics)
Native file formats (Word, Excel, etc.), also known as “Born Digital” documents
PDF, PDF/A, PDF/X
Many other proprietary electronic formats
Paper
Film

Page 9: January 2006 Archival Storage Strategies and Technologies Presentation

The Preservation Problem

What is the best option for preserving electronic documents over archival time spans? (Disregarding the hardware storage issues)

TIFF?
A “digital picture” of your page
Widely adopted standard for document imaging
Not human readable without the software
No access to underlying text without OCR

XML?
A format description of the page (a style sheet)
Good for describing logical structure, but not appearance
Many incompatible domain-specific schemas

Native Format (e.g., MS Word)?
Several ubiquitous, but closed, proprietary formats
Can you spell WordPerfect?

PDF? PDF/A?

Microsoft Metro (renamed XPS)?

Page 10: January 2006 Archival Storage Strategies and Technologies Presentation

Desirable Properties of a Format

Device independence
Can be reliably and consistently rendered without regard to the hardware/software platform

Self-contained
Contains all resources necessary for rendering

Self-documenting
Contains its own description

Transparency
Amenable to direct analysis with basic tools
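
As an illustration of "direct analysis with basic tools", the sketch below identifies several of the formats discussed in this presentation from their leading bytes alone, with no vendor software. The function names `sniff_format` and `sniff_file` are hypothetical, and the signature table is only a small illustrative subset; it is not a complete format-identification tool.

```python
from typing import Optional

# Well-known leading byte signatures for formats named in this presentation.
# This table is a small illustrative subset, not a complete registry.
SIGNATURES = [
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"%PDF-", "PDF"),
]


def sniff_format(header: bytes) -> Optional[str]:
    """Return a format name if the header starts with a known signature."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return None


def sniff_file(path: str) -> Optional[str]:
    """Read the first few bytes of a file and try to identify its format."""
    with open(path, "rb") as f:
        return sniff_format(f.read(16))
```

A format that can be recognized this way, from a few documented bytes, is easier to triage decades later than one that needs its original application just to be identified.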

Page 11: January 2006 Archival Storage Strategies and Technologies Presentation

Adobe PDF and PDF/A

PDF is a ubiquitous open format for electronic documents

Proprietary, but with a publicly available specification

Companies other than Adobe make PDF products

Many statutory, regulatory, and institutional policies mandate the retention of PDF-based documents over multiple generations of technology

The feature-rich nature of PDF can complicate preservation efforts

Page 12: January 2006 Archival Storage Strategies and Technologies Presentation

PDF/A

PDF/A is intended to address three primary issues:

Define a file format that preserves the static visual appearance of electronic documents over time

Provide a framework for recording metadata about electronic documents

Provide a framework for defining the logical structure and semantic properties of electronic documents
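
The metadata framework can be made concrete with a small sketch. PDF/A files declare their conformance in embedded XMP metadata through the `pdfaid:part` property, so a plain byte search can report what a file claims to be. This is a minimal sketch: the function name `claimed_pdfa_part` is hypothetical, it assumes the XMP packet is stored uncompressed in the file, and it reports only a claim, not validated conformance.

```python
import re
from typing import Optional

# Matches both common XMP serializations of the PDF/A identifier:
#   attribute form:  pdfaid:part="1"
#   element form:    <pdfaid:part>1</pdfaid:part>
PDFA_PART = re.compile(rb'pdfaid:part\s*(?:=\s*["\']|>)\s*(\d+)')


def claimed_pdfa_part(data: bytes) -> Optional[int]:
    """Return the PDF/A part number a file claims in its XMP metadata, if any.

    Assumption: the XMP packet is present as plain, uncompressed bytes.
    A match means the file *claims* PDF/A; it does not prove conformance.
    """
    match = PDFA_PART.search(data)
    return int(match.group(1)) if match else None
```

For example, `claimed_pdfa_part(open("report.pdf", "rb").read())` (with `report.pdf` standing in for any local file) returns the claimed part number, or `None` when no identifier is found.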

Page 13: January 2006 Archival Storage Strategies and Technologies Presentation

PDF/A

PDF/A constraints include:

Audio and video content are forbidden
JavaScript and executable file launches are prohibited
All fonts must be embedded and also must be legally embeddable for unlimited, universal rendering
Color spaces must be specified in a device-independent manner
Encryption is disallowed
Use of standards-based metadata is mandated
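
Some of the constraints above can be spot-checked with a crude byte scan for the PDF name tokens associated with the forbidden features. This is a heuristic sketch only: the function name `pdfa_risk_flags` and the marker table are illustrative assumptions, tokens hidden inside compressed streams will be missed, a token can appear in harmless content, and font embedding and color spaces are not checked at all. It does not replace a real PDF/A validator.

```python
from typing import List

# Raw PDF name tokens that hint at features PDF/A forbids.
# Illustrative subset; presence of a token is a hint, not proof.
RISK_MARKERS = {
    b"/Encrypt": "encryption",
    b"/JavaScript": "JavaScript",
    b"/Launch": "executable file launch",
    b"/Sound": "audio content",
    b"/Movie": "video content",
}


def pdfa_risk_flags(data: bytes) -> List[str]:
    """Return a sorted list of forbidden-feature hints found in raw PDF bytes."""
    return sorted(label for marker, label in RISK_MARKERS.items() if marker in data)
```

An empty result means only that none of these tokens appear in plain sight; a non-empty result is a reason to run the file through a proper validator before accepting it into an archive.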

Page 14: January 2006 Archival Storage Strategies and Technologies Presentation

PDF/A

However…

PDF/A alone does not guarantee preservation

PDF/A alone does not guarantee exact replication of source material

The intent of PDF/A is not to claim that PDF-based solutions are the best way to preserve electronic documents

But once you have decided to use a PDF-based approach, PDF/A defines an archival profile of PDF that is more amenable to long-term preservation

Page 15: January 2006 Archival Storage Strategies and Technologies Presentation

PDF/A… Nevertheless

PDF/A may not be the last preservation format you will use or need

However, proper application of PDF/A should result in reliable, predictable, and unambiguous access to the full information content of electronic documents

Page 16: January 2006 Archival Storage Strategies and Technologies Presentation

Microsoft XPS

XPS is an abbreviation for the XML Paper Specification.

The XML Paper Specification describes the XPS Document format. A document in XPS Document format (XPS Document) is a paginated representation of electronic paper described in an XML-based format. The XPS Document format is an open, cross-platform document format that allows customers to effortlessly create, share, print, and archive paginated documents.

XPS Documents use a file container that conforms to the Open Packaging Conventions. The new file formats in the next version of the Microsoft Office System, codenamed Office "12," also use the Open Packaging Conventions for organizing data into files, allowing businesses to be able to manage Office "12" documents and XPS Documents in the same manner.

The XPS Document format is a fixed-layout document interchange format, a native Windows Vista spool file format, and a PDL (Page Description Language, used by printing devices).

http://www.microsoft.com/whdc/xps/default.mspx
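
Because an Open Packaging Conventions container is a ZIP archive with a `[Content_Types].xml` part at its root, the shared packaging described above can be checked with standard tools. The sketch below uses only Python's `zipfile` module; the function name `looks_like_opc_package` is hypothetical, and the check covers the container only, not whether the contents are a valid XPS or Office document.

```python
import io
import zipfile


def looks_like_opc_package(source) -> bool:
    """Return True if source (a path or binary file object) is a ZIP archive
    containing the root [Content_Types].xml part that OPC packages carry."""
    if not zipfile.is_zipfile(source):
        return False
    with zipfile.ZipFile(source) as package:
        return "[Content_Types].xml" in package.namelist()


# Usage: build a tiny in-memory package and test it.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("[Content_Types].xml", "<Types/>")
print(looks_like_opc_package(buffer))
```

The same check applies to both XPS Documents and the Office "12" formats, which is the practical meaning of managing them "in the same manner".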

Page 17: January 2006 Archival Storage Strategies and Technologies Presentation

Recommendations

This is still a wild frontier, with no certain outcome or single standard

“The good thing about standards is that there are so many of them…”

When in doubt about long-term storage of vital documents, paper or film is still a good answer

Beware of new technologies, even ones that are “standards”

TIFF, JPEG, PDF, and PDF/A are recommended

The weight of in-place document formats means that change will be very slow, and may not happen at all unless a dramatic “out of the blue” technology appears

Page 18: January 2006 Archival Storage Strategies and Technologies Presentation

Conclusion & Questions

Finally!

Questions?