A CIDOC CRM-compatible metadata model for digital preservation

Information Systems and Databases Laboratory
Department of Informatics
Athens University of Economics and Business

Panos Constantopoulos and Vicky Dritsou

CIDOC CRM Workshop, 24 October 2006

Structure of the presentation

• Introduction to Digital Preservation

• Metadata

• Existing proposals

• A conceptual preservation metadata model

• Properties of the model

• Model concepts

• Schema

• The complete model

• Conclusion

• Further research

Introduction to Digital Preservation (1/2)

• Two types of perils for digital content exist

- Physical: physical destruction of file systems, corruption of digital media, fire, earthquake

- Technological: obsolete or incompatible systems, software and formats

• Physical perils are more straightforward to confront

- By saving multiple copies of digital content:

• On different media

• At different geographic locations

• Technological hazards require a more complex policy to be applied

- By following the appropriate preservation strategy
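The replication rule for physical perils can be checked mechanically. The following is a minimal sketch, assuming a hypothetical `Copy` record and illustrative thresholds of two media and two locations; none of these names or values come from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One stored copy of a digital object (field names are illustrative)."""
    medium: str    # e.g. "disk", "tape"
    location: str  # e.g. a city or site name


def is_adequately_replicated(copies, min_media=2, min_locations=2):
    """Return True if the copies span enough distinct media and locations.

    The default thresholds are assumptions, not values from the slides.
    """
    media = {c.medium for c in copies}
    locations = {c.location for c in copies}
    return len(media) >= min_media and len(locations) >= min_locations


# Hypothetical example data.
spread_out = [Copy("disk", "Athens"), Copy("tape", "Patras")]
same_place = [Copy("disk", "Athens"), Copy("disk", "Athens")]
```

Here `is_adequately_replicated(spread_out)` holds while `is_adequately_replicated(same_place)` does not, since the second set uses one medium at one site.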

Introduction to Digital Preservation (2/2)

• Digital preservation strategies for technological hazards

- Information migration

- Technology emulation

- Technology preservation

- Backwards compatibility

- Reliance on standards

- Encapsulation

- Transformation to non-digital form

- Digital archeology

• Most strategies require some information to be collected and stored

- This is achieved by using metadata

Metadata

• Defined as “data about data” or, equivalently, “information about information”

• Metadata properties

- Not necessarily digital

- Not autonomous

• Digital information needs to pre-exist

- Supplementary

- Dynamic character

• Metadata types

- Descriptive

- Structural

- Administrative

• Preservation metadata

- They contain elements from all three types

• But which metadata should we choose?

Existing proposals

• Several approaches exist

• We have studied five widely known ones:

- Dublin Core (DC)

- Open Archival Information System (OAIS)

- CURL Exemplars in Digital Archives (CEDARS)

- Pittsburgh Project (PP)

- National Library of Australia (NLA)

• Discussion

- None contains inter-related concepts; all are element lists

- DC: access-oriented, inadequate for preservation

- OAIS, CEDARS: very detailed, difficult to use

- PP: detailed, necessary/optional elements, use instructions

- NLA: structured elements, object types

A conceptual preservation metadata model

• A parsimonious metadata set derived from

- comparison of the aforementioned proposals

- CIDOC CRM

• Metadata elements

- Title, Identifier, Subject, Language, Type, Format, Technical Equipment

- Information Carrier, Activity, Right, Actor, Effect, History

• Relations

Properties of the model

• Each element forms a concept

• Contains relationships among concepts

• Results in a conceptual model

- Compatible with CIDOC CRM

• A small number of new concepts

• Can serve as an application ontology

- A guide for preservation

• Independent of preservation strategy

- Elements contain all the information required by each strategy

- Further details can be added by extending concepts

Model concepts (1/3)

• Main concept: Digital Object

- Subclass of E73 Information Object

- Has attributes: Title, Subject, Type, Size, Identifier, Language, Digital Content

• Identifiers may be global or local

- Global identifiers must be unique

• Digital Content allows separation of content from descriptive/administrative aspects

- Stored in an Information Carrier

- Digital Objects can consist of other digital objects (Complex Objects)

- Type: image, text, sound, multimedia, …

- Each object type can be formatted in one of a number of specific formats
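The Digital Object concept can be pictured as a small record type. This is a minimal sketch, not the authors' implementation: the Python class and field names, the byte unit for size, and the `find_duplicate_global_ids` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Identifier:
    value: str
    is_global: bool = False  # global identifiers must be unique


@dataclass
class DigitalObject:
    """Sketch of the Digital Object concept (subclass of E73 Information Object)."""
    title: str
    subject: str
    type: str       # e.g. "image", "text", "sound", "multimedia"
    size: int       # unit (bytes) is an assumption
    language: str
    identifiers: List[Identifier] = field(default_factory=list)
    # Content is held apart from the descriptive/administrative fields.
    digital_content: Optional[bytes] = None
    # A Complex Object consists of other digital objects.
    parts: List["DigitalObject"] = field(default_factory=list)

    def is_complex(self) -> bool:
        return bool(self.parts)


def find_duplicate_global_ids(objects):
    """Return the global identifier values shared by more than one object."""
    seen, duplicates = set(), set()
    for obj in objects:
        global_ids = {i.value for i in obj.identifiers if i.is_global}
        duplicates |= global_ids & seen
        seen |= global_ids
    return duplicates


# Hypothetical example data.
page = DigitalObject("Page 1", "manuscript", "image", 2048, "el",
                     identifiers=[Identifier("urn:example:1", is_global=True)])
book = DigitalObject("Book", "manuscript", "multimedia", 4096, "el",
                     identifiers=[Identifier("urn:example:1", is_global=True)],
                     parts=[page])
```

In this example `book.is_complex()` is true, and `find_duplicate_global_ids([page, book])` reports the shared identifier, which violates the uniqueness rule for global identifiers.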

Schema (1/3)

[Diagram: Digital Object and its attributes, labelled with CIDOC CRM classes and properties]

• Classes: Digital Object (E73 Information Object); Digital Content (E73 Information Object); Complex Object (E73 Information Object); Size (E54); Title (E35); Subject (E1 CRM Entity); Information Carrier (E84); Type (E55); Format (E29 Design); Language (E56); Object Identifier (E41 Appellation); Global Identifier (E41); Local Identifier (E41)

• Properties: is identified by (P1); has size (P43); has title (P102); has subject (P129 is about); has language (P72); is saved to (P128 is carried by); has type (P67.1); is formatted in; contains (P106 is composed of)
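One way to make the CIDOC CRM compatibility of this schema concrete is a lookup table from model labels to CRM codes. The sketch below copies the codes from the Schema (1/3) labels; the dictionary and function names are hypothetical, and `None` marks a relation for which the slide gives no CRM property.

```python
# Model concept -> CIDOC CRM class code, as labelled on the Schema (1/3) slide.
CONCEPT_TO_CRM_CLASS = {
    "Digital Object": "E73",
    "Digital Content": "E73",
    "Complex Object": "E73",
    "Size": "E54",
    "Title": "E35",
    "Subject": "E1",
    "Information Carrier": "E84",
    "Type": "E55",
    "Format": "E29",
    "Language": "E56",
    "Object Identifier": "E41",
    "Global Identifier": "E41",
    "Local Identifier": "E41",
}

# Model relation -> CIDOC CRM property code; None where the slide gives no code.
RELATION_TO_CRM_PROPERTY = {
    "is identified by": "P1",
    "has size": "P43",
    "has title": "P102",
    "has subject": "P129",
    "has language": "P72",
    "is saved to": "P128",
    "has type": "P67.1",
    "contains": "P106",
    "is formatted in": None,
}


def crm_property(relation):
    """Return the CRM property code for a model relation, or None if unmapped."""
    return RELATION_TO_CRM_PROPERTY[relation]


def unmapped_relations():
    """List the relations that carry no CRM property code on the slide."""
    return sorted(r for r, p in RELATION_TO_CRM_PROPERTY.items() if p is None)
```

A table like this keeps the model vocabulary and the CRM vocabulary side by side, so the relations without a CRM code are easy to list.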

Model concepts (2/3)

• Activities have digital objects as input and output, are carried out by Actors and are subject to Rights

• Activity types: Creation, Deletion, Modification, Alteration, Copy, Read

- In all of them, except for Read and Deletion, we assume that the output is a new object

• We keep the sequence of performed Activities by assigning the appropriate attribute

• Effects can be used as a space-saving device when versions need not be kept
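The activity rules on this slide can be sketched as a record with a `previous` link that keeps the sequence of performed activities. This is an illustrative sketch under stated assumptions: the class, field names and the `activity_sequence` helper are hypothetical, objects are referred to by plain string identifiers, and Rights and Effects are left out.

```python
from dataclasses import dataclass
from typing import Optional

# Activity types named on the slide; Read and Deletion yield no new object.
ACTIVITY_TYPES = {"Creation", "Deletion", "Modification", "Alteration", "Copy", "Read"}
NO_OUTPUT_TYPES = {"Read", "Deletion"}


@dataclass
class Activity:
    activity_type: str
    actor: str
    input_id: Optional[str] = None         # object taken as input
    output_id: Optional[str] = None        # new object given as output
    previous: Optional["Activity"] = None  # link to the preceding activity

    def __post_init__(self):
        if self.activity_type not in ACTIVITY_TYPES:
            raise ValueError(f"unknown activity type: {self.activity_type}")
        if self.activity_type in NO_OUTPUT_TYPES and self.output_id is not None:
            raise ValueError(f"{self.activity_type} does not produce a new object")
        if self.activity_type not in NO_OUTPUT_TYPES and self.output_id is None:
            raise ValueError(f"{self.activity_type} must produce a new object")


def activity_sequence(last):
    """Follow the 'previous' links and return activity types, oldest first."""
    chain = []
    while last is not None:
        chain.append(last.activity_type)
        last = last.previous
    return list(reversed(chain))


# Hypothetical example data.
created = Activity("Creation", actor="archivist", output_id="obj-1")
copied = Activity("Copy", actor="archivist", input_id="obj-1",
                  output_id="obj-2", previous=created)
read = Activity("Read", actor="reader", input_id="obj-2", previous=copied)
```

Calling `activity_sequence(read)` walks back through the chain and lists Creation, Copy and Read in the order they were performed.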

Schema (2/3)

[Diagram: Activity and its relations to Digital Object, Actor, Right, Effect and History]

• Classes: Digital Object (E73); Activity (E7); Activity Type (E55); Actor (E39); Right (E30); Effect (E3 Condition State); History

• Properties: takes as input (P16 used specific object); gives as output (P123 resulted in); has type (P21 had general purpose); previous (P134 continued); is subject to (P104); held by (P105); carries out (P14 performed); has effect; contains; to perform

Model concepts (3/3)

• Activities require the appropriate Technical Equipment to be performed

- Software

- Hardware

• These are all specializations of E71 Man-Made Thing

- The software needed depends on the Type and Format of the object

• Information carrier also requires Technical Equipment

- For reading the object

- For writing the object
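The dependency of Technical Equipment on Type, Format and Information Carrier can be illustrated with two lookup tables. This is a sketch only: the registry names, the function and every entry in the tables are invented for illustration and do not come from the presentation.

```python
# Hypothetical registry: (object type, format) -> software able to handle it.
SOFTWARE_FOR = {
    ("image", "TIFF"): ["image viewer A"],
    ("text", "PDF"): ["document reader B"],
}

# Hypothetical registry: carrier -> hardware required for reading and writing.
HARDWARE_FOR = {
    "CD-ROM": {"read": ["CD drive"], "write": ["CD writer"]},
}


def equipment_for(obj_type, obj_format, carrier, mode="read"):
    """Collect the software and hardware needed to read or write an object.

    Raises KeyError when nothing is registered for the given type/format or
    carrier, which in a preservation setting would flag an at-risk object.
    """
    software = SOFTWARE_FOR[(obj_type, obj_format)]
    hardware = HARDWARE_FOR[carrier][mode]
    return {"software": list(software), "hardware": list(hardware)}
```

For a TIFF image on a CD-ROM, `equipment_for("image", "TIFF", "CD-ROM")` returns the registered viewer and drive; an unregistered format raises `KeyError`.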

Schema (3/3)

[Diagram: Technical Equipment and its relations to Digital Object, Information Carrier, Type and Format]

• Classes: Digital Object (E73); Information Carrier (E84); Type (E55); Format (E29); Technical Equipment (E71 Man-Made Thing); Hardware (E71); Software (E71)

• Properties: is saved to (P128 is carried by); has type (P67.1); is formatted in; is supported by (P103 was intention of); requires for reading (P16 used specific object); requires for writing (P16 used specific object)

The complete model

[Diagram: the three schemas combined into one model]

• Classes: Digital Object (E73); Digital Content (E73); Complex Object (E73); Size (E54); Title (E35); Subject (E1); Language (E56); Type (E55); Format (E29); Information Carrier (E84); Object Identifier (E41); Global Identifier (E41); Local Identifier (E41); Activity (E7); Activity Type (E55); Actor (E39); Right (E30); Effect (E3); History; Technical Equipment (E71); Hardware (E71); Software (E71)

• Properties: is identified by (P1); has size (P43); has title (P102); has subject (P129); has language (P72); is saved to (P128); has type (P67.1); is formatted in; contains (P106); contains; takes as input (P16); gives as output (P123); has type (P21); previous (P134); is subject to (P104); held by (P75); carries out (P14); has effect; to perform; is supported by (P103); requires for reading (P16); requires for writing (P16)

Conclusion

• Metadata elements drawn from existing metadata sets

• Conceptual model for digital preservation

- Previous works included only lists of metadata elements

- Extensible as needed

• Compatible with CIDOC CRM

• Digital objects as

- digital surrogates of non-digital objects

- cultural objects by themselves

Further research

• Historical processes:

- interpretation

- CIDOC CRM domain of application

• Preservation processes:

- decision and production processes

- prescription and monitoring

Explore differences in modelling requirements

Thank you for your attention!