Supporting Preservation of Research Data in the Chemical Sciences

Supporting Preservation of Research Data in the Chemical Sciences. Dr. Simon Coles, School of Chemistry, University of Southampton, 2nd June 2009

description

A skim of the experience of a repository data project facing the challenge of digital preservation.

Transcript of Supporting Preservation of Research Data in the Chemical Sciences

Page 1: Supporting Preservation of Research Data in the Chemical Sciences

Supporting Preservation of Research Data in the Chemical Sciences.

Dr. Simon Coles School of Chemistry, University of Southampton 2nd June 2009

Page 2:

Representation Information for Crystallography Data

• Representation Information (RI), from the OAIS Model, is any information required to render, process, interpret, use and understand data.

• A registry/repository for RI (RRoRI) has been developed by the DCC and the CASPAR Project.

• The crystallography domain and the workflow of the NCS are examined to identify significant RI.

• RI networks relating to the CIF file format are formulated and ingested into the RRoRI.

• A use case scenario describes how the RI stored in the RRoRI may be used to gain access to the information content of a CIF instance by someone unfamiliar with that file format.
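As an illustration of the use case above (not part of the original slides): once structural RI tells a newcomer that a CIF file is plain text, that a data block opens with `data_`, that data names begin with an underscore, and that `#` starts a comment, the simplest tag-value items can be read with very little code. The sketch below is a minimal Python example under those assumptions; the function name `read_simple_cif_items` and the sample text are hypothetical, and it deliberately ignores `loop_` tables and multi-line text fields, so it is not a full CIF parser.

```python
def read_simple_cif_items(text):
    """Collect single-line tag-value items from CIF-like text.

    Assumption: only simple '_tag value' lines are handled; loop_ tables
    and multi-line (semicolon-delimited) text fields are skipped.
    """
    items = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if line.startswith("data_"):
            items["data_block"] = line[len("data_"):]
            continue
        if line.startswith("_"):
            parts = line.split(None, 1)  # split tag from the rest of the line
            if len(parts) == 2:
                tag, value = parts
                items[tag] = value.strip().strip("'\"")
    return items


# Hypothetical sample, not real experimental data.
SAMPLE_CIF = """\
# illustrative CIF fragment
data_example
_cell_length_a    5.4310(2)
_symmetry_space_group_name_H-M  'F d -3 m'
"""

if __name__ == "__main__":
    for tag, value in read_simple_cif_items(SAMPLE_CIF).items():
        print(tag, "=", value)
```

Reading the values is only the structural half of the problem: knowing what `_cell_length_a` means, and in which units, is the semantic RI that a CIF dictionary would supply.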

Page 3:

Preservation Planning for Crystallography Data

• The original plan was to apply a DRAMBORA assessment to each of the repositories in the federation as a means of raising awareness of curation and preservation issues.

• Now covers the notion of trust and trustworthiness, with a brief look at several preservation planning tools including: the DCC Curation Lifecycle Model; the OAIS Reference Model; audit and certification instruments (TRAC, NESTOR, DRAMBORA, Data Seal of Approval); PLATO and PLATTER (from the PLANETS Project); and cost models (PrestoSpace, LIFE2 projects).

• Raises curation and preservation issues that are likely to be relevant in the context of a crystallography community and the eCrystals federation.

Page 4:

Preservation Metadata for Crystallography Data

• The original aim was to augment the eBank-UK application profile with preservation metadata specifically for crystallography data

• Superseded by the development of the crystallography Data Commons initiative

• Proposed the following…

Page 5:

Resources

• Data Set/Collection

• Raw Data

• Derived Data

• Result Data

• Transient Data

• Workflow?

Page 6:

Publication/Dissemination

Persistent Identifier

Preservation Policy/strategy

Rights management: binding intellectual property rights that may limit the ability to preserve and disseminate the digital object over time e.g. use and reuse

Technical environment: describing the technical requirements needed to render and use the digital object e.g. File format, software, instrumentation

Provenance: the custodial history of the object

Context: contextual information indicating how the object was created and under what circumstances

Authenticity: validating that the digital object is in fact what it purports to be, and has not been altered in an undocumented way e.g. checksum
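The checksum mentioned under Authenticity can be sketched as follows (an illustration, not part of the original slides). The idea is to record a digest when the object is deposited and recompute it later; a mismatch shows that the bytes have changed. This is a minimal Python sketch using the standard `hashlib` module. The choice of SHA-256 and the helper names `sha256_of_file` and `is_unaltered` are assumptions made for illustration; the slides do not name an algorithm.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large data files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unaltered(path, recorded_checksum):
    """Compare the current digest with the one recorded at deposit time."""
    return sha256_of_file(path) == recorded_checksum.lower()
```

A matching checksum only shows that the bytes are unchanged since the digest was recorded. Documenting authorised changes, such as a format migration, belongs to the provenance and preservation activity elements.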

Page 7:

Management

• Embargo e.g. policy

• Representation Information: any information required to render, process, use, reuse, interpret and understand the object e.g. Specifications; File formats; Software; Hardware; Semantics

• Preservation activity: actions taken to preserve the digital object, and any consequences of these actions that impact its look, feel, or functionality