Cochrane von Suchodoletz File Creation, Rendering and Formats

File Creation, Rendering and Formats. Euan Cochrane, Archives New Zealand & Dirk von Suchodoletz, University of Freiburg. Future Perfect 2012, 26 March 2012, Wellington, New Zealand

Page 1: Cochrane von Suchodoletz File Creation, Rendering and Formats

File Creation, Rendering and Formats

Euan Cochrane, Archives New Zealand &

Dirk von Suchodoletz, University of Freiburg

Future Perfect 2012

26 March 2012

Wellington, New Zealand

Page 2: Cochrane von Suchodoletz File Creation, Rendering and Formats

Contents

Euan

•Files, formats and their relationships to creating applications

•Files, formats and their relationships to rendering applications

Dirk

•Maintaining the ability to use older rendering applications

Euan

•Context and conclusions

Page 3: Cochrane von Suchodoletz File Creation, Rendering and Formats

Digital Preservation

• What is digital preservation?

Maintaining the full information content of digital objects [across time]

Maintaining the ability to render digital objects [across time]

“The goal of digital preservation is the accurate rendering of authenticated content over time”

• What is a file format?

“[pre-defined/particular] way that information is encoded for storage in a computer file”

Page 4: Cochrane von Suchodoletz File Creation, Rendering and Formats

File Creation and Formats

• In 2007, over 90% of HTML documents did not conform to standards

• Microsoft Office 2007 (and possibly 2010) create ODS files differently to most open source office suites.

• Microsoft Office 2007 and 2010 create Microsoft Office 97-2003 formatted files differently to Microsoft Office 97-2003

Page 5: Cochrane von Suchodoletz File Creation, Rendering and Formats

Format Standards are Often Ambiguous or not Available

• The JPEG standard specifies an end of image marker but not an end of file marker – Different apps write them differently

• LibreOffice 3.5 (14 February 2012) now “supports” Visio file import. This support is based on reverse engineering, as the format standard is not publicly available, and it is not complete
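The JPEG point above can be made concrete with a small sketch. It assumes only the two standard JPEG marker codes (FF D8 for start of image, FF D9 for end of image); the function name and the two sample byte strings are hypothetical and stand in for the output of two different writing applications.

```python
JPEG_SOI = b"\xff\xd8"  # start-of-image marker
JPEG_EOI = b"\xff\xd9"  # end-of-image marker


def bytes_after_eoi(data: bytes) -> int:
    """Return how many bytes follow the last EOI marker in a JPEG byte string.

    Heuristic only: it takes the last FF D9 pair as the end of the image,
    so trailing data that itself contains FF D9 would be under-counted.
    """
    if not data.startswith(JPEG_SOI):
        raise ValueError("data does not start with a JPEG SOI marker")
    pos = data.rfind(JPEG_EOI)
    if pos == -1:
        raise ValueError("no EOI marker found")
    return len(data) - (pos + len(JPEG_EOI))


# Two made-up byte strings: one writer stops at EOI, the other pads the file.
strict = JPEG_SOI + b"\x00image-data\x00" + JPEG_EOI
padded = strict + b"\x00\x00\x00\x00"

print(bytes_after_eoi(strict), bytes_after_eoi(padded))
```

Both samples contain the same image data, yet they differ as files, which is the kind of variation between creating applications that the slide describes.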

Page 6: Cochrane von Suchodoletz File Creation, Rendering and Formats

“Rendering Matters” Research

• Compared the rendering of ~100 files on old software running on old hardware (the “control”) to:

1. LibreOffice version 3.3.0

2. Microsoft Office 2007

3. WordPerfect Office X5

4. Control Software running on emulated hardware

Pages 7 to 15: Cochrane von Suchodoletz File Creation, Rendering and Formats (no text captured in the transcript)

Page 16: Cochrane von Suchodoletz File Creation, Rendering and Formats

Summary Research Results

• [The choice of] Rendering [Environment] Matters

• MS-Office 2007 was a better rendering tool for the old files than either LibreOffice or WordPerfect Office

• The use of particular attributes/features in office files is inconsistent but most are used at least once.

• At least one “odd”/rare attribute/feature is included in most office files

Page 17: Cochrane von Suchodoletz File Creation, Rendering and Formats

Original Environments (OE)

Original creating application best candidate to render documents properly

Proprietary format knowledge embedded in the application

One environment renders all objects of a certain type

Keeping original software (and hardware) environments has impact on preservation and access workflows

Page 18: Cochrane von Suchodoletz File Creation, Rendering and Formats

Components of Access through OE

Emulators for different computer architectures

Software archive of all required applications, operating systems, and additional components such as fonts and codecs

Workflows on object ingest

Access systems for end users

Page 19: Cochrane von Suchodoletz File Creation, Rendering and Formats

Emulators

Wide range available for all relevant computer architectures

Many Open Source

Not yet DP aware – long term availability to be secured

DP community should seek more influence

Page 20: Cochrane von Suchodoletz File Creation, Rendering and Formats

Software Archive

Preserve the relevant software components and operational knowledge

Page 21: Cochrane von Suchodoletz File Creation, Rendering and Formats

Necessary Workflows

Freiburg digital preservation group leads the state-sponsored, two-year bwFLA project

bwFLA project providing access to complex, interactive digital objects

Provide extended ingest workflows with feedback loop

Page 22: Cochrane von Suchodoletz File Creation, Rendering and Formats

Extended Ingest Workflow

Make use of the donor's expertise to collect complete information and components

Extend software archive if necessary

Add necessary technical metadata

Record knowledge on object handling

Let the donor check and sign-off the rendering results

Page 23: Cochrane von Suchodoletz File Creation, Rendering and Formats

Access Workflows

Provide a reading room system or extension

– Pre-configure emulator to the OE required by the object

– Prepare the inclusion of the object into the original environment

– Automate the startup of the OSE

– Provide the user information and hints on how to interact with the OE & automate parts of this

– Allow or disallow, to a certain degree, saving results from the original environment or capturing certain states (e.g. using screenshots)
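The access steps above can be sketched as a lookup from an object's format to a pre-configured original environment, followed by an ordered launch plan. This is an illustration only, not the bwFLA design: every name here (`OriginalEnvironment`, `ENVIRONMENTS`, `plan_access`, the format and emulator labels) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OriginalEnvironment:
    emulator: str          # emulator for the required computer architecture
    operating_system: str  # operating system taken from the software archive
    application: str       # original creating or rendering application


# Hypothetical registry; a real one would be filled in during ingest.
ENVIRONMENTS = {
    "example-wordprocessor-v1": OriginalEnvironment(
        emulator="example-pc-emulator",
        operating_system="example-os-1",
        application="example-wordprocessor 1.0",
    ),
}


def plan_access(object_path: str, format_id: str, allow_save: bool = False) -> list[str]:
    """Return the ordered steps needed to present one object in its OE."""
    env = ENVIRONMENTS.get(format_id)
    if env is None:
        raise LookupError(f"no original environment registered for {format_id!r}")
    return [
        f"configure {env.emulator} with {env.operating_system}",
        f"attach {object_path} to the environment",
        f"start {env.application} and open the object",
        "show interaction hints to the user",
        "allow saving results" if allow_save else "disallow saving results",
    ]


for step in plan_access("letter.doc", "example-wordprocessor-v1"):
    print(step)
```

Keeping the registry separate from the planning function means the ingest workflow can extend the set of environments without touching the access system.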

Page 24: Cochrane von Suchodoletz File Creation, Rendering and Formats

Access System

Many components already exist, developed by past DP projects

Next step: Make them a usable “product”

Page 25: Cochrane von Suchodoletz File Creation, Rendering and Formats

Reading Room Access System

Make emulation accessible to standard users, such as those in memory institutions

Robust platform, extension to standard reading room systems

Unified access to a wide range of different emulators + preconfigured environments

Page 26: Cochrane von Suchodoletz File Creation, Rendering and Formats

Context and Conclusions

• Making decisions about preservation strategies

• When to Normalise?

• Variation in format implementation doesn’t matter if you maintain a compatible rendering environment

• Variation in rendering across environments doesn’t matter if you maintain the “right” rendering environment

• There are practical options for maintaining rendering environments

Page 27: Cochrane von Suchodoletz File Creation, Rendering and Formats

Thank you