A CIDOC CRM – compatible metadata model for digital preservation
Information Systems and Databases Laboratory
Department of Informatics
Athens University of Economics and Business
Panos Constantopoulos and Vicky Dritsou
CIDOC CRM Workshop, 24 October 2006
Structure of the presentation
• Introduction to Digital Preservation
• Metadata
• Existing proposals
• A conceptual preservation metadata model
• Properties of the model
• Model concepts
• Schema
• The complete model
• Conclusion
• Further research
Introduction to Digital Preservation (1/2)
• Two types of perils for digital content exist
- Physical: physical destruction of file systems, corruption of digital media, fire, earthquake
- Technological: obsolete systems, non-compatible systems, software and formats
• Physical perils are more straightforward to confront
- By saving multiple copies of digital content:
• On different media
• At different geographic locations
• Technological hazards require a more complex policy to be applied
- By following the appropriate preservation strategy
Introduction to Digital Preservation (2/2)
• Digital preservation strategies for technological hazards
- Information migration
- Technology emulation
- Technology preservation
- Backwards compatibility
- Reliance on standards
- Encapsulation
- Transformation to non-digital form
- Digital archeology
• Most strategies require some information to be collected and stored
- This is achieved by using metadata
Metadata
• Defined as “data for data” or otherwise “information about information”
• Metadata properties
- Not necessarily digital
- Not autonomous
• Digital information needs to pre-exist
- Supplementary
- Dynamic character
• Metadata types
- Descriptive
- Structural
- Administrative
• Preservation metadata
- They contain elements from all three types
• But which metadata should we choose?
Existing proposals
• Several approaches exist
• We have studied five widely known ones:
- Dublin Core (DC)
- Open Archival Information System (OAIS)
- CURL Exemplars in Digital Archives (CEDARS)
- Pittsburgh Project (PP)
- National Library of Australia (NLA)
• Discussion
- None contains inter-related concepts (all are element lists)
- DC: Access-oriented, inadequate
- OAIS, CEDARS: very detailed, difficult to use
- PP: detailed, necessary/optional elements, use instructions
- NLA: Structured elements, object types
A conceptual preservation metadata model
• A parsimonious metadata set derived from
- comparison of the aforementioned proposals
- CIDOC CRM
• Metadata elements
- Title
- Identifier
- Subject
- Language
- Type
- Format
- Technical Equipment
- Information Carrier
- Activity
- Right
- Actor
- Effect
- History
• Relations
Properties of the model
• Each element forms a concept
• Contains relationships among concepts
• Results in a conceptual model
- Compatible with CIDOC CRM
• A small number of new concepts
• Can serve as an application ontology
- A guide for preservation
• Independent from preservation strategy
- Elements contain all the information required from each strategy
- Further details can be added with the extension of concepts
Model concepts (1/3)
• Main concept: Digital Object
- Subclass of E73 Information Object
- Has attributes: Title, Subject, Type, Size, Identifier, Language, Digital Content
• Identifiers may be global or local
- Global identifiers must be unique
• Digital Content allows separation of content from descriptive/administrative aspects
- Stored in an Information Carrier
- Digital Objects can consist of other digital objects (Complex Objects)
- Type: image, text, sound, multimedia,…
- Each object type can be formatted in one of a number of specific formats
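The Digital Object concept above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class and field names (`DigitalObject`, `Identifier`, `scope`, `parts`, `size_bytes`) are hypothetical and do not come from the presentation or from any CIDOC CRM library. The sketch mirrors the listed attributes, the global/local identifier distinction, and complex objects built from other objects.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Identifier:
    """Object Identifier (E41 Appellation); the 'scope' field is an assumption."""
    value: str
    scope: str  # "global" or "local"


@dataclass
class DigitalObject:
    """Digital Object, described in the slides as a subclass of E73 Information Object."""
    title: str
    subject: str
    object_type: str          # e.g. "image", "text", "sound", "multimedia"
    file_format: str          # one of the specific formats for that type
    size_bytes: int           # unit chosen for this sketch only
    language: Optional[str]
    identifiers: List[Identifier]
    carrier: str              # Information Carrier that stores the digital content
    parts: List["DigitalObject"] = field(default_factory=list)

    def is_complex(self) -> bool:
        """A complex object is one that consists of other digital objects."""
        return bool(self.parts)


def global_ids_unique(objects: List[DigitalObject]) -> bool:
    """Return True if no global identifier value is shared by two listed objects."""
    seen = set()
    for obj in objects:
        for ident in obj.identifiers:
            if ident.scope != "global":
                continue
            if ident.value in seen:
                return False
            seen.add(ident.value)
    return True
```

Local identifiers are deliberately ignored by `global_ids_unique`, since the slide requires uniqueness only for global ones.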
Schema (1/3)
[Diagram: the Digital Object and its attributes; CIDOC CRM equivalents in parentheses]
• Classes
- Digital Object (E73 Information Object)
- Size (E54)
- Title (E35)
- Subject (E1 CRM Entity)
- Information Carrier (E84)
- Type (E55)
- Object Identifier (E41 Appellation)
- Global Identifier (E41)
- Local Identifier (E41)
- Language (E56)
- Format (E29 Design)
- Digital Content (E73 Information Object)
- Complex Object (E73 Information Object)
• Properties
- is identified by (P1)
- has size (P43)
- has title (P102)
- has subject (P129 is about)
- has language (P72)
- is saved to (P128 is carried by)
- has type (P67.1)
- is formatted in
- contains (P106 is composed of)
Model concepts (2/3)
• Activities have digital objects as input and output, are carried out by Actors and are subject to Rights
• Activity types: Creation, Deletion, Modification, Alteration, Copy, Read
- In all of them, except Read and Deletion, we assume that the output is a new object
• We keep the sequence of performed Activities by linking each Activity to the previous one (P134 continued)
• Effects can be used as a space-saving device when versions need not be kept
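The Activity concept can be sketched as a record linked to its predecessor. This is a rough sketch under assumptions: the `Activity` class, its field names, and the `history` helper are hypothetical. The rule that Read and Deletion produce no new object follows the slide, while allowing an activity without an input (for example a Creation) is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional

# Activity types listed on the slide.
ACTIVITY_TYPES = {"Creation", "Deletion", "Modification", "Alteration", "Copy", "Read"}
# Per the slide, every type except Read and Deletion outputs a new object.
NO_OUTPUT_TYPES = {"Read", "Deletion"}


@dataclass
class Activity:
    activity_type: str
    actor: str                       # who carries out the activity
    input_id: Optional[str]          # identifier of the input object (None is allowed here)
    output_id: Optional[str]         # identifier of the new object, if any
    previous: Optional["Activity"] = None  # link that keeps the sequence of activities

    def __post_init__(self) -> None:
        if self.activity_type not in ACTIVITY_TYPES:
            raise ValueError(f"unknown activity type: {self.activity_type}")
        produces_output = self.activity_type not in NO_OUTPUT_TYPES
        if produces_output and self.output_id is None:
            raise ValueError(f"{self.activity_type} must output a new object")
        if not produces_output and self.output_id is not None:
            raise ValueError(f"{self.activity_type} must not output an object")


def history(last: Activity) -> List[Activity]:
    """Follow the 'previous' links from the latest activity; return oldest first."""
    chain: List[Activity] = []
    current: Optional[Activity] = last
    while current is not None:
        chain.append(current)
        current = current.previous
    chain.reverse()
    return chain
```

Storing only a `previous` link is enough to rebuild the full sequence, which is one way to read the P134 mapping in the schema.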
Schema (2/3)
[Diagram: Activities, Actors, Rights and Effects; CIDOC CRM equivalents in parentheses]
• Classes
- Digital Object (E73)
- Activity (E7)
- Activity Type (E55)
- Actor (E39)
- Right (E30)
- Effect (E3 Condition State)
- History
• Properties
- takes as input (P16 used specific object)
- gives as output (P123 resulted in)
- has type (P21 had general purpose)
- previous (P134 continued)
- is subject to (P104)
- held by (P105)
- carries out (P14 performed)
- has effect
- contains
- to perform
Model concepts (3/3)
• Activities require the appropriate Technical Equipment to be performed
- Software
- Hardware
• These are all specializations of E71 Man-Made Thing
- The software needed depends on the Type and Format of the object
• Information carrier also requires Technical Equipment
- For reading the object
- For writing the object
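The dependency of Technical Equipment on object Type, Format and Information Carrier can be sketched as two lookup tables. Everything concrete here is a placeholder invented for illustration: the registry names, the `equipment_needed` function, and entries such as "image-viewer" or "cd-drive" do not come from the presentation.

```python
from typing import Dict, Set, Tuple

# Hypothetical registry: (object type, format) -> software that supports it.
SOFTWARE_FOR: Dict[Tuple[str, str], Set[str]] = {
    ("image", "TIFF"): {"image-viewer"},
    ("text", "PDF"): {"pdf-reader"},
}

# Hypothetical registry: carrier -> hardware required for reading and for writing.
CARRIER_EQUIPMENT: Dict[str, Dict[str, Set[str]]] = {
    "CD-ROM": {"read": {"cd-drive"}, "write": {"cd-writer"}},
}


def equipment_needed(object_type: str, file_format: str,
                     carrier: str, operation: str) -> Set[str]:
    """Collect the software and hardware needed for one operation on one object.

    'operation' is "read" or "write"; unknown keys yield no equipment.
    """
    software = SOFTWARE_FOR.get((object_type, file_format), set())
    hardware = CARRIER_EQUIPMENT.get(carrier, {}).get(operation, set())
    return software | hardware
```

Keeping software keyed by type and format, and hardware keyed by carrier and operation, follows the two separate dependencies named on the slide.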
Schema (3/3)
[Diagram: Technical Equipment, Formats and Information Carriers; CIDOC CRM equivalents in parentheses]
• Classes
- Digital Object (E73)
- Information Carrier (E84)
- Type (E55)
- Technical Equipment (E71 Man-Made Thing)
- Hardware (E71)
- Software (E71)
- Format (E29)
• Properties
- is saved to (P128 is carried by)
- has type (P67.1)
- is formatted in
- is supported by (P103 was intention of)
- requires for reading (P16 used specific object)
- requires for writing (P16 used specific object)
The complete model
[Diagram: all concepts and relations of the model; CIDOC CRM equivalents in parentheses]
• Classes
- Digital Object (E73)
- Digital Content (E73)
- Complex Object (E73)
- Size (E54)
- Title (E35)
- Subject (E1)
- Language (E56)
- Type (E55)
- Format (E29)
- Object Identifier (E41)
- Global Identifier (E41)
- Local Identifier (E41)
- Information Carrier (E84)
- Technical Equipment (E71)
- Hardware (E71)
- Software (E71)
- Activity (E7)
- Activity Type (E55)
- Actor (E39)
- Right (E30)
- Effect (E3)
- History
• Properties
- is identified by (P1)
- has size (P43)
- has title (P102)
- has subject (P129)
- has language (P72)
- has type (P67.1)
- is formatted in
- is saved to (P128)
- contains (P106)
- is supported by (P103)
- requires for reading (P16)
- requires for writing (P16)
- takes as input (P16)
- gives as output (P123)
- has type (P21)
- previous (P134)
- is subject to (P104)
- held by (P75)
- carries out (P14)
- has effect
- contains
- to perform
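One way to use the model as an application ontology is to hold its relations as a table mapping each local label to its CIDOC CRM property. The sketch below is an assumption-laden illustration: the `Relation` type and `relations_of` helper are hypothetical, and only relations whose domain and target are stated in the slide text are included, with property codes copied from the schema slides.

```python
from typing import List, NamedTuple


class Relation(NamedTuple):
    domain: str        # concept the relation starts from
    label: str         # relation name used in the model
    crm_property: str  # CIDOC CRM property code shown on the schema slides
    target: str        # concept the relation points to


# Subset of the model: only relations whose endpoints are named in the slide text.
RELATIONS: List[Relation] = [
    Relation("Digital Object", "has title", "P102", "Title"),
    Relation("Digital Object", "has size", "P43", "Size"),
    Relation("Digital Object", "has subject", "P129", "Subject"),
    Relation("Digital Object", "has language", "P72", "Language"),
    Relation("Digital Object", "is identified by", "P1", "Object Identifier"),
    Relation("Activity", "takes as input", "P16", "Digital Object"),
    Relation("Activity", "gives as output", "P123", "Digital Object"),
    Relation("Activity", "is subject to", "P104", "Right"),
    Relation("Actor", "carries out", "P14", "Activity"),
]


def relations_of(domain: str) -> List[Relation]:
    """Return the relations that start from the given concept, in table order."""
    return [r for r in RELATIONS if r.domain == domain]
```

A table like this keeps the local vocabulary and the CRM mapping side by side, so new concepts can be added as extra rows.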
Conclusion
• Metadata elements drawn from existing metadata sets
• Conceptual model for digital preservation
- Previous works included only lists of metadata elements
- Extensible as needed
• Compatible with CIDOC CRM
• Digital objects as
- digital surrogates of non-digital objects
- cultural objects by themselves
Further research
• Historical processes:
- interpretation
- CIDOC CRM domain of application
• Preservation processes:
- decision and production processes
- prescription and monitoring
• Explore differences in modelling requirements
Thank you for your attention!