Metadata approaches for digital presentation

87
Metadata approaches for digital preservation Michael Day UKOLN, University of Bath & Digital Curation Centre http://www.ukoln.ac.uk/ DELOS Summer School on Preservation 2007 Santa Croce in Fossabanda, Pisa, Italy 3-8 June 2007

description

Slides from a presentation given at the DELOS Summer School on Preservation 2007, Santa Croce in Fossabanda, Pisa, Italy 3-8 June 2007

Transcript of Metadata approaches for digital presentation

Page 1: Metadata approaches for digital presentation

Metadata approaches for digital preservation

Michael DayUKOLN, University of Bath & Digital

Curation Centrehttp://www.ukoln.ac.uk/

DELOS Summer School on Preservation 2007

Santa Croce in Fossabanda, Pisa, Italy3-8 June 2007

Page 2: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 2

Session outline

Section 1: Introducing metadata Section 2: Metadata in support of digital preservation Practical exercise Section 2 (continued) Break Section 3: The PREMIS Data Dictionary Section 4: Preservation support in other metadata standards Summing-up

Page 3: Metadata approaches for digital presentation

Section 1:Introducing metadata

Page 4: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 4

Defining metadata (1)

Some definitions:o Literally, "data about data"

Defines the basic concept, but is (perhaps) not very meaningful

Refers to everything and nothing (Wendy Duff, 2004)

o "Machine-understandable information about Web resources or other things" - Tim Berners-Lee, W3C (1997)

Page 5: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 5

Defining metadata (2)

"Structured data about resources that can be used to help support a wide range of operations" - Michael Day, 2001

"Structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use or manage" information objects - NISO, 2004o Both hint at the many roles metadata can

support

Page 6: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 6

Defining metadata (3)

Metadata is now typically defined by functiono "Data associated with objects which relieves

their potential users of having to have full advance knowledge of their existence or characteristics" (Dempsey & Heery, 1998)

o Popular categorisation: Descriptive metadata Structural metadata Administrative metadata

Page 7: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 7

Metadata functions

Resource disclosure & discovery The retrieval and use of resources Resource management, including preservation Verification of authenticity Intellectual property rights management Commerce Content-rating Authentication and authorisation Personalisation and localisation of services …

Page 8: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 8

Application areas

"Web resources or other things," e.g.: Web sites, Web pages, digital images, databases,

books, museum objects, archival records, collections, services, geographical locations, organisations, events, concepts, … even metadata itself

Page 9: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 9

Metadata locations

Within a resource, e.g.:

o Title page and table of contents (books), META tags in document headers (Web pages), ID3 metadata (MP3), "file properties" (office documents), EXIF data (images)

Directly linked to the resource, e.g.:

o Link rel="meta" elements (Web pages) Independently managed in a separate database; can

be linked by identifiers

o This is the most common approach

Page 10: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 10

Metadata is important

… "is recognised as a critically important, and yet increasingly problematic and complex concept with relevance for information objects as they move through time and space" -- Gilliland-Swetland (2004)

Page 11: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 11

Metadata standards (1)

But there are a large (and growing) number of metadata initiatives, formats, schemas, etc.o For example, see James Turner's MetaMap for

one attempt to visualise the metadata information space:

http://mapageweb.umontreal.ca/turner/meta/english/

Page 12: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 12

© 2004 MetaMap version 1.2 presented by James M. Turner, Véronique Moal, & Julie Desnoyers

Page 13: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 13

Metadata standards (2)

Typically defined by "resource management communities"o Different traditions, perspectives, functional

requirements Typically comprise:

o A "conceptual model" (sometimes not explicit)o A set of named components ("terms", "elements"

etc) and documentation on their meaning and useo A specification of how to represent a metadata

instance in a digital format (binding)

Page 14: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 14

Some examples (1)

Bibliographic:o MARC (Machine-Readable Cataloguing) formats,

e.g. MARC21, UNIMARCExchange format since 1960sContent often defined by family of related

standards, e.g. the ISBD series, AACR2, RDAo MODS (Metadata Object Description Schema)o ONIX

Used by publishers and the book trade

Page 15: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 15

Some examples (2)

Archives and records:o ISAD(G) General International Standard Archival

Descriptiono EAD (Encoded Archival Description)o EAC (Encoded Archival Context)o Recordkeeping metadata

ERMS (The National Archives, UK)RKMS (Monash University, Australia)ISO 23081-1:2006 (Metadata for records)

Page 16: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 16

Some examples (3)

Museum Objects:o SPECTRUM

Digital images:o VRA Core, ANSI/NISO Z39.87-2006

Government information:o AGLS, e-GMS

Learning objects:o IEEE LOM, UK LOM Core, IMS specifications

Multimedia:o MPEG-7, MPEG-21

Page 17: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 17

Metadata implementation

Many different ways to implement metadata

o Databases (internally)

o Structured formats (for harvesting or exchange)ISO 2709 (MARC)Attribute-value pairsHTML/XHTML (e.g., for header information)Extensible Markup Language (XML)

Many existing metadata standards utilise XML, e.g. Dublin Core, METS, MODS

Modularity

Page 18: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 18

Initial summing-up (1)

Metadata is ubiquitous Metadata enables people and software applications

to do things (functions)o Not only about "discovery"o Different functions require different metadata

There are many different standards Challenges remain in working across standards

(interoperability), or in using standards in combination (modularity)

Page 19: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 19

Initial summing-up (2)

XML is a current popular choice for implementation, at least to facilitate metadata harvesting (e.g. OAI-PMH) or exchange

Page 20: Metadata approaches for digital presentation

Section 2:Metadata in support of digital preservation

Page 21: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 21

Wider roles of metadata

Early recognition that metadata was not only useful for resource discovery

Resource managemento Managing accesso Managing resourceso Recording contexts (technical and other)

Some examples:o Records management and archiveso Digitisation initiativeso Digital preservation

Page 22: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 22

Preservation metadata (1)

Definitions:o All of the various types of data that allow the re-creation

and interpretation of the structure and content of digital data over time (Ludäsher, Marciano and Moore, 2001)

o "… the information a repository uses to support the digital preservation process" -- PREMIS working group (2005)

o All digital preservation strategies depend, to some extent, upon the creation, capture and maintenance of appropriate metadata

o "Preserving the right metadata is key to preserving digital objects" -- ERPANET Briefing Paper (Duff, Hofman & Troemel, 2003)

Page 23: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 23

Preservation metadata (2)

Preservation metadata fulfil a range of different roles, e.g.:o "… metadata accompanies and makes reference

to each digital object and provides associated descriptive, structural, administrative, rights management, and other kinds of information" (Lynch, 1999)

o Spans the categories of administrative, structural, descriptive and technical metadata

Page 24: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 24

Preservation metadata (3)

Metadata is key to the understanding and reuse of digital information, e.g.:o "… it is impossible to conduct a correct analysis

of a data set without knowing how the data was cleaned, calibrated, what parameters were used in the process, etc." -- Deelman, et al. (2004)

o Growing emphasis on open access to research data (OECD working group)

o The 'data deluge'

Page 25: Metadata approaches for digital presentation

Practical exercise

Page 26: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 26

Start with an object ...

Facsimile of PhaistosDisk (Crete), photographcourtesy of ArchaeologyData Service

Page 27: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 27

Ask some questions

Select a digital objecto e.g., an image, a word file, a Web page, a database,

audio file, movie, ... Consider what information would be necessary to aid

future understanding or re-useo e.g. descriptive, technical, structural, administrative,

legal, contextual, ... Consider what information would be needed in order for

future generations to be sure that the object is authentic, usable and understandable (the performance)

Page 28: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 28

Generate metadata

Broadly group the information identified into categories, i.e. to create a schemao e.g. groups might include: descriptive, technical,

structural, administrative, legal, contextual, ... Consider whether the metadata types are intrinsic or

extrinsic to the resource, e.g. could it be automatically generated

Reporting back:o A short summary of the main preservation metadata

requirements of the chosen object (5 minutes maximum per group)

Page 29: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 29

Summing-up

Metadata might be required for:

o Identification

o Technical metadata

o User environments (e.g., scenarios)

o Significant characteristics

o Fixity and authenticity

o Provenance (e.g., tracking changes)

o Rights

o Structure and packaging

o Description

Page 30: Metadata approaches for digital presentation

Section 2 (continued):Metadata in support of digital preservation

Page 31: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 31

Preservation metadata (4)

Current position:o Early initiatives tended to be theoretical in

nature (e.g., metadata frameworks); current ones (e.g. PREMIS) have a more practical focus

o There is now some consensus in cultural heritage domain on the types of metadata requiredA major influence on this has been the

Reference Model for an Open Archival Information System (OAIS)

Page 32: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 32

OAIS background

Reference Model for an Open Archival Information System (OAIS)o Nothing to do with the OAI (Open Archives

Initiative) or OAI-PMHo Development led by the Consultative Committee

for Space Data Systems (CCSDS)o Issued as CCSDS Recommendation (Blue Book)

650.0-B-1 (January 2002)o Also adopted as: ISO 14721:2003o http://public.ccsds.org/publications/archive/

650x0b1.pdf

Page 33: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 33

OAIS definitions

Provides definitions of terms, e.g.:o OAIS - "An archive, consisting of an organization of

people and systems, that has accepted the responsibility to preserve information and make it available for a Designated Community”

o Designated Community - the community of stakeholders and users that the OAIS serves

o Knowledge Base - a set of information, incorporated by a user or system, that allows that user or system to understand the received information

Page 34: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 34

OAIS definitions

o Information Object - Data Object + Representation Information

o Representation Information - any information required to render, interpret and understand digital data

o Information Package - Conceptual linking of Content Information + Preservation Description Information + Packaging Information (Submission, Archival and Dissemination Information Packages)

o Preservation Description Information - information (metadata) about Provenance, Context, Reference, Fixity information

Page 35: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 35

OAIS Functional Model

Six entitieso Ingesto Archival Storageo Data Managemento Administrationo Preservation Planningo Access

Described using UML diagrams ...

Page 36: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 36

OAIS Functional Model

4-1.

2

MANAGEMENT

Ingest

Data Management

SIP

AIPDIP

queries

result setsAccess

PRODUCER

CONSUMER

Descriptive Info

AIP

orders

Descriptive Info

Archival Storage

Administration

Preservation Planning

OAIS Functional Entities (Figure 4-1)

Page 37: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 37

OAIS Information Objects

o Information Object (basic concept):Data Object (bit-stream)Representation Information (permits “the full

interpretation of Data Object into meaningful information”)

o Information Object Classes:Content InformationPreservation Description Information (PDI)Packaging InformationDescriptive Information

Page 38: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 38

OAIS Information Objects

InformationObject

RepresentationInformation

1+

interpretedusing1+Data

Object

interpretedusing

PhysicalObject

DigitalObject

BitSequence

1+ OAIS Information Object (Figure 4-10)

Page 39: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 39

OAIS Information Objects

Representation Information:o Any information required to render, interpret and

understand digital data (includes file formats, software, algorithms, standards, semantic information etc.)

o Representation Information is recursive in natureo Essential that Representation Information itself is curated

and preserved to maintain access to (render and interpret) digital data e.g. Format registries (Global Digital Format Registry

(GDFR), PRONOM, Digital Curation Centre Representation Information Registry)

Page 40: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 40

OAIS Information Objects

Interpreted using

Semantic information

Structure Information

Other Representation

Informationadds meaning to

Representation Information

*

1

*

1

4-11

.1

OAIS Representation Information Object (Figure 4-11)

Page 41: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 41

OAIS Information Packages

o Information package:Container that encapsulates Content

Information and PDIPackages for submission (SIP), archival

storage (AIP) and dissemination (DIP)AIP = “... a concise way of referring to a set of

information that has, in principle, all of the qualities needed for permanent, or indefinite, Long Term Preservation of a designated Information Object”

Page 42: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 42

OAIS Information Packages

Information Package Concepts and Relationships (Figure 2-3)

Page 43: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 43

OAIS Information Packages

Archival Information Package (AIP):o Content Information

Original target of preservationInformation Object (Data Object & Representation

Information)o Preservation Description Information (PDI)

Other information (metadata) “which will allow the understanding of the Content Information over an indefinite period of time”

A set of Information ObjectsIn part based on categories discussed in CPA/RLG

report: Preserving Digital Information (1996)

Page 44: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 44

OAIS Information Packages

PreservationDescriptionInformation

Reference Information

ProvenanceInformation

ContextInformation

FixityInformation

PDI Preservation Description Information (Figure 4-16)

Page 45: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 45

OAIS Information Packages

o Fixity - supporting data integrity checking mechanisms

o Reference - for supporting identification and location over time

o Context - documenting the relationship of the Content Information to its environment

o Provenance - documents the history of the Content Information

Page 46: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 46

OAIS Information Packages

Page 47: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 47

OAIS Information Model

Also defines:o Archival Information Units and Archival

Information CollectionsRecognises the complexity some some

objects, addresses granularityo Information Package transformations

For Ingest and Access

Page 48: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 48

Preservation metadata standards

Two main triggers in library world:o An urgent practical response to the growing

amount of digital content needing management:National Library of Australia (1999)Harvard University LibraryNational Library of New Zealand (2003)

o Research projectsUK Cedars project outline specification (2000)NEDLIB project (2000)

Page 49: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 49

OCLC/RLG Metadata Framework

Metadata Framework Working Group (2000 - 2002)o Sponsored by OCLC and RLGo Preservation Metadata Framework (2002)

Structured around the OAIS information model and the work of earlier initiatives

o Framework was a set of recommendations, not a specification for implementation

o Led directly to the development of the PREMIS Working Group

Page 50: Metadata approaches for digital presentation

Break

Page 51: Metadata approaches for digital presentation

Section 3:The PREMIS Data Dictionary

Page 52: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 52

PREMIS Working Group (1)

PREMIS Working Group (2003 - 2005)o Preservation Metadata: Implementation Strategieso Sponsored by OCLC and RLGo International working group and advisory committee

Primarily practical focusMembers from the US, the UK, the Netherlands,

Germany, Australia and New Zealando Chaired by Priscilla Caplan and Rebecca Guenther

Page 53: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 53

PREMIS Working Group (2)

Main objectives:o A 'core' set of preservation metadata elements

(Data Dictionary)o Strategies for encoding, packaging, storing,

managing, and exchanging metadata Outputs:

o Implementation Survey report (Sept. 2004)o PREMIS Data Dictionary 1.0 (May 2005)

All WG documents are available from: http://www.oclc.org/research/projects/pmwg/

Page 54: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 54

PREMIS survey (1)

Implementing Preservation Repositories for Digital Materials (2004)

o Review of current practice within cultural heritage organisationsBased on responses to questionnaire together

with follow-up interviewsQuestions about business plans, policies,

preservation strategies, as well as metadataAnalysis based on ~50 responsesSnapshot of practice, noting trends

Page 55: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 55

PREMIS survey (2)

Findings:

o Very little current experience of digital preservation; no knowledge whether the metadata collected will be adequate

o The OAIS model has informed the implementation of many repositories

o METS was the most commonly-used scheme for non-descriptive metadata

o Metadata is stored both in databases and together with content data objects

Page 56: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 56

PREMIS survey (3)

Trends identified:

o Redundant storage of metadata both within databases (for ease of use) and encapsulated with data objects (self-documenting)

o METS is commonly used for the packaging of different metadata

o OAIS is just the starting point

o The retention of the original versions of objects to reduce risks

o The use of multiple preservation strategies

Page 57: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 57

PREMIS data dictionary (1)

Background:o OAIS remains the conceptual foundation (but

some differences in terminology)o The data dictionary is a translation of the OAIS-

based 2002 Framework into a set of implementable semantic units

o Preservation metadata = "the information a repository uses to support the digital preservation process"

Page 58: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 58

PREMIS data dictionary (2)

o Defines metadata that supports "maintaining viability, renderability, understandability, authenticity, and identity in a preservation context."

o New 'canonical' definition of preservation metadatao Core metadata = "things that most working

repositories are likely to need to know in order to support digital preservation."

o Recognition of the need for automatic capture of metadata

Page 59: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 59

PREMIS data dictionary (3)

o The Data Dictionary is implementation independent, i.e. does not define how it should be stored

o Based on simple data model that defines five types of entities

o Defines semantic units for Objects, Events, Agents and Rights

Page 60: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 60

PREMIS data model (1)

Intellectual entities

Objects

Events

Rights

Agents

Page 61: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 61

PREMIS data model (2)

o Entities:Digital Object, Intellectual Entity, Event,

Agent, & Rightso Relationships are statements of association

between instances of entitieso Semantic Units are the properties of an entity,

and have values

Page 62: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 62

PREMIS data model (3)

o Digital Object = a discrete unit of informationFiles = named and ordered sequence of bytes

known by an operating systemBitstream = a set of bits embedded within a

fileRepresentation = the set of files needed for a

"complete and reasonable" rendering of an Intellectual Entity

Page 63: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 63

PREMIS data model (4)

o Intellectual Entity = a coherent set of content that can be viewed as a single unit

o Event = an action involving at least one Object or Agent known to the repositoryDocuments actions that modify Digital

Objects, records validity checks, etc.Objects can be associated with any number

of events

Page 64: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 64

PREMIS data model (5)

o Agent = persons, organisations, or programs associated with preservation eventsNot the main focus of the data dictionary

o Rights Statements = assertions of rights pertaining to Objects or AgentsWG concentrates on rights and permissions

associated with preservation activities

Page 65: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 65

PREMIS data model (6)

o Relationships:Relationships between Objects:

Structural relationships, e.g. how files combine to make up an Intellectual Entity

Derivation relationships, e.g. resulting from format transformations or replications

Dependency relationships, e.g. when Objects depend on others, e.g. fonts, DTDs, etc.

1:1 principle

Page 66: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 66

PREMIS documentation

o Data Dictionary, v 1.0Defines semantic units for Objects, Events, Agents and

RightsImplementation independent

Defines semantics

o Separate proposed XML bindings (PREMIS schemas)

o PREMIS Maintenance Agency (Library of Congress)

o Editorial Committee and Implementers' Group (PIG)

Page 67: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 67

Limits to scope

o Does not focus on descriptive metadataDomain specific and dealt with by many other schemes

o Does not define the specific characteristics of Agentso Does not directly consider rights and permissions not directly

associated with preservation actions, e.g. access or reuseo Does not deal with technical metadata for all different types of

digital file (left to format experts)o Does not deal with the detailed documentation of media or

hardware (left to media and hardware specialists)o Does not consider in detail the business rules of a repository,

e.g. roles, policies, and strategies (but this could be added to data model)

Page 68: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 68

Some issues

o The PREMIS Data Dictionary is an important contribution to the ongoing development of preservation metadata

o It is, however, implementation independentDefinition of semantics and a suggested XML binding

o ConformanceNon-PREMIS elements not conflict with or overlap with

PREMIS semantic unitsNeed for more harmonisation (?)

o The exchange of ObjectsMandatory metadata needs to be able to be extracted and

packaged with the objecto The use of controlled vocabularies

Page 69: Metadata approaches for digital presentation

Section 4:Preservation support within other metadata standards

Page 70: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 70

Some relevant domains

Archives and records managemento Focus on integrity and authenticityo The development of recordkeeping systems

Digitisation initiativeso Focus on digitisation processeso Preparation of 'master' files with all appropriate metadata

Learning object managemento Primary focus on the management of objects (e.g. IP rights),

rather than preservation

Commercial content (e.g. television companies)o Not covered here

Page 71: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 71

Recordkeeping metadata (1)

Research projects in the mid-1990so Pittsburgh Project "Functional Requirements for Evidence in

Recordkeeping":Defined fundamental properties of records based on their

role as evidence of business transactionsRevealed need for influence on the design of recordkeeping

systems (automatic capture of metadata)Business Acceptable Communications (BAC) reference

model for metadata (1995)o University of British Columbia (UBC) project:

Also stressed the evidentiary value of recordsImportance of authenticity and integrity

Page 72: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 72

Recordkeeping metadata (2)

Follow-up research:o InterPARES project (international)o Australian Recordkeeping Metadata Schema (RKMS)o National standards:

National Archives of Australia - Recordkeeping Metadata Standard

The National Archives (UK) - Requirements for Electronic Records Management Systems - Metadata standard

o Preservation approach (encapsulation in XML)Public Record Office Victoria - Victorian Electronic

Records Strategy (VERS)

Page 73: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 73

Recordkeeping metadata (3)

ISO 23081-1:2006

o Information and documentation -- Records management processes -- Metadata for records -- Part 1: Principles

o Developed by ISO Technical Committee TC 46, Information and documentation, Subcommittee SC11 Archives/Records Management

o First of a family of standards

o Builds on the framework of the ISO 15489 Records Management standard

Page 74: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 74

Recordkeeping metadata (4)

ISO 23081-1 definitions:o Builds on ISO 15489 definition: "data describing the

context, content and structure of records and their management through time"

o "As such, metadata are structured or semi-structured information that enables the creation, registration, classification, access, preservation and disposition of records through time and within and across domains … [and] can be used to identify, authenticate and contextualise records and the people, processes and systems that create, manage, maintain and use them and the policies that govern them"

Page 75: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 75

Recordkeeping metadata (5)

ISO 23081-1 general principleso Metadata capture is built into business processes

Defines the critical characteristics of records, must be explicit

o Need to define roles and responsibilitiesRecords managers, information professionals,

executives, unit managers, system administrators, …o Long-term preservation is just one of the roles fulfilled by

recordkeeping metadata

Page 76: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 76

Recordkeeping metadata (6)

ISO 23081-1 types of metadatao About the record itself

Should includes metadata about structure, format and technical dependencies

o About business rules, policies and mandateso About agents (people) - for accountabilityo About business activities or processeso About records management processes

Page 77: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 77

Recordkeeping metadata (7)

Automatic capture and sharing of metadatao Monash University Clever Recordkeeping

Metadata (CRKM) projectFocus on interoperability:

Enabling information originating in one context to be (re)used in other ways

With a high degree of automation

Relies on standardsMetadata registries for storing standardised

representations of schemas

Page 78: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 78

Digitisation initiatives (1)

Digitisation initiatives focus on:o The technical information that needs to be captured as

part of the digitisation process, e.g.:NISO Z39.87 Technical Metadata for Digital Still Images

o Ways of packaging content and metadata in order to create standardised packagese.g. for collecting all the individual page images that

comprise a book and enabling their display in the ight order

Metadata Encoding & Transmission Standard (METS) http://www.loc.gov/standards/mets/

Page 79: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 79

METS basics (1)

Originated in digitisation projects, i.e. Making of America II

An XML-based framework for packaging various types of metadata (and data), including:o METS Header

o Descriptive Metadata - for discovery and retrieval

o Administrative Metadata - enabling managers to administer the object (as part of a collection)

o Structural Metadata - describing how individual components relate to one anotherFile Section, Structural Map, Structural Links, Behavior

Page 80: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 80

METS basics (2)

o Implemented very widely in digital library projects, e.g. Oxford Digital Library

o Supports InteroperabilityDifferent metadata can be combined within a

METS container, e.g. MODS, MARC in XML, DC in XML, etc.

o Supports the portability of objects

o METS can be seen as a type of Information Package (in OAIS terms), combining both data and metadata

Page 81: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 81

Learning objects (1)

Learning objectso Any digital resource that can be (re)used to support learning

"… any entity - digital or non-digital - that may be used for learning, education or training" - IEEE Learning Object Metadata (LOM) standard

o Essentially modularIncludes: images (graphs, photographs), Web sites,

presentations, quizzes, bibliographies, multimedia, etc.o There is a major focus on re-useo Learning objects (at all levels of granularity) are included in

institutional repositories

Page 82: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 82

Learning objects (2)

o But learning objects reflect many of the difficulties found with other digital objectsTechnical dependence on other resources

(linear navigation, embedded content or software), complicated because of added granularity

Page 83: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 83

Learning Objects (3)

IEEE Learning Object Metadata (LOM)o Institute of Electrical and Electronics Engineerso Two main foci:

Resource discoveryDescribes the structure of learning objects and

management processes (e.g. rights management)Used by JORUM (UK LOM Core)

o There is now an increasing consideration of the potential role of long-term digital preservation in the learning object arena:JISC report (2004)JORUM watch reports

Page 84: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 84

Preserving metadata

Several problems:

o Need to preserve both object and metadataKeep both metadata instances and information

about all of the metadata schemas in useSchemas constantly evolve, so we will need

accurate registries of metadata schemas

o Need to maintain links between object/s and metadata

o Semantic drift

Page 85: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 85

Summing up

Metadata is essential to support the long-term management and preservation of digital objects

There is now the beginning of consensus on what particular types of metadata might be required to support some preservation processes (e.g., the OAIS model, PREMIS Data Dictionary) and packaging (e.g. METS, MPEG-21 DIDL)

There is growing experience with the practical implementation of preservation metadata, e.g. using the PREMIS Data Dictionary

There is much more still to be done

Page 86: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 86

Further reading

OAIS Reference Model (2002):http://public.ccsds.org/publications/archive/650x0b1.pdf

PREMIS Data Dictionary for Preservation Metadata (2005):http://www.oclc.org/research/projects/pmwg/

DPC Technology Watch Report on "Preservation Metadata" by Brian Lavoie and Richard Gartner (2005): http://www.dpconline.org/docs/reports/dpctw05-01.pdf

DCC Digital Curation Manual Instalmentso "Metadata" by Michael Day (2005), "Preservation Metadata" by

Priscilla Caplan (2006), and "Archival Metadata" by Wendy Duff and Marlene van Ballegooie (2006)

o http://www.dcc.ac.uk/resource/curation-manual/

Page 87: Metadata approaches for digital presentation

DELOS Summer School 2007, Pisa, Italy © 2007 Michael Day, UKOLN, University of Bath 87

Acknowledgements

UKOLN is funded by the Museums, Libraries and Archives Council, the Joint Information Systems Committee (JISC) of the UK higher and further education funding councils, as well as by project funding from the JISC, the European Union, and other sources. UKOLN also receives support from the University of Bath, where it is based.

http://www.ukoln.ac.uk/

The Digital Curation Centre is funded by the JISC and the UK Research Councils' e-Science Core Programme.

http://www.dcc.ac.uk/