DURAARK presentation at DEDICATE final seminar, October 21st 2013, Michelle Lindlar
1 / 23 21 / 10 / 13
DURAARK Preserving Architectural Knowledge
Michelle Lindlar (LUH / TIB)
DEDICATE – Final Seminar, Glasgow, October 21st 2013
2 / 23 21 / 10 / 13
TIB (Technische Informationsbibliothek) is the German National Library of Science and Technology
Why architectural data?
• subjects: engineering, architecture, chemistry, computer science, mathematics and physics
• Competence centre for non-textual materials (KNM)
• 2007 – 2011: DFG-funded PROBADO3D project: metadata and content-based search for digital architectural 3D models
http://www.probado.de/en_3d.html
Why digital preservation?
• 2009 – 2011: Goportis digital preservation pilot project, together with our Goportis partners ZB MED and ZBW
• Since 2012: Goportis digital preservation system hosted by TIB
A few words about TIB
3 / 23 21 / 10 / 13
DURAARK (DURAble Architectural Knowledge)
FP7 – ICT – Digital Preservation (STReP)
February 2013 – January 2016
Goal: Develop methods and tools for sustainable long-term preservation of building data (3D and BIM models, metadata, related knowledge & Web data)
Scope:
• address all layers of digital preservation (bit, logical, semantic)
• interlinked curation and preservation workflows
• focus on two file formats: IFC and E57
• incorporate existing OAIS-compliant digital preservation system
Project overview
4 / 23 21 / 10 / 13
Tangible outcomes
Semantic enrichment: Vocabularies for description of built structures and enrichment techniques based on a unified and sustainable naming scheme
Tailored Workflows: Thoroughly investigate the long-term archiving requirements of institutional stakeholders (libraries/archives) and SMEs. Develop corresponding workflows.
Sustainability of file formats: Address the problem of digital decay by using Industry Foundation Classes (IFC) and E57 as open and already well-established file formats suited for long-term preservation. Ensure availability of characterization tools for those formats.
Goal and Tangible Outcomes
5 / 23 21 / 10 / 13
DURAARK – an interdisciplinary project
6 / 23 21 / 10 / 13
TUE, Department of the Built Environment, Eindhoven University of Technology
- WP3 leader, semantics & metadata
CITA, Center for Information Technology and Architecture, Copenhagen
- WP7 leader, evaluation, test
LUH: German National Library of Science and Technology (TIB) & L3S Research Center, Hannover
- Coordinator
- WP3 Semantic Enrichment
- WP6 leader, long-term preservation
Luleå University of Technology
- WP8 leader, dissemination/exploitation
Fraunhofer Austria
- WP2 leader, system specification & integration
UBO: Universität Bonn
- Technical Coordinator
- WP4/WP5: change management, shape recognition
Catenda, SME
- User perspective, market requirements, evaluation
Consortium
Jakob Beetz (Eindhoven University of Technology)
7 / 23 21 / 10 / 13
3 layers of a digital object
8 / 23 21 / 10 / 13
risks:
• media obsolescence
• technical failure
• human error
• DRM
possible actions:
• media migration, refreshing, replication
• technological redundancy, ideally with geographic spread
• error detection, monitoring, recovery & disaster planning
• controlled storage with regular maintenance
• security and trust
Solved through "good IT practice" (which, of course, needs to be implemented …)
1. Bit(stream) [Physical] preservation layer
http://commons.wikimedia.org/wiki/File:Compact_Floppy.jpg
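The error detection and replication actions listed for the bit layer can be illustrated with a fixity check. This is only a minimal sketch, assuming SHA-256 checksums recorded at ingest; the function names `sha256_of` and `verify_replicas` are hypothetical and not part of any DURAARK or TIB system.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large 3D models do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_replicas(replica_paths, expected_digest):
    """Return the replicas whose current checksum differs from the recorded one."""
    return [str(p) for p in replica_paths if sha256_of(p) != expected_digest]
```

Run periodically over all copies, a check like this flags a damaged replica so that it can be restored from an intact one.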
9 / 23 21 / 10 / 13
risks:
• software / file format obsolescence
• software / OS / hardware dependencies
• additionally: configuration / package dependencies
• lack of compliance with format standards ("malformed objects")
• DRM
possible actions:
• migration, emulation, normalization
• "hardware museum"
• data/information extraction
• extensive technical metadata capturing
• definition of significant properties (what to preserve)
Established basic processes … but they require adaptation for new formats.
2. Logical [object] preservation layer
http://www.flickr.com/photos/89771128@N02/8451172304/in/pool-2121762@N23
10 / 23 21 / 10 / 13
risks:
• terminology and concepts change over time
• context and provenance may be lost (purpose, setting, limitations, cultural context, related objects)
possible actions:
• semantic enrichment
• tracing of metadata
• audit trail capturing
• migration at semantic level
• documentation of context
• document intended meaning / interpretation
Least developed area of digital preservation
3. Semantic [interpretability] preservation layer
11 / 23 21 / 10 / 13
DURAARK Stack
12 / 23 21 / 10 / 13
Use Cases (1/2)
13 / 23 21 / 10 / 13
DURAARK stakeholders
producers
long-term data stewards
14 / 23 21 / 10 / 13
DURAARK stakeholders
consumers
long-term data stewards
15 / 23 21 / 10 / 13
Curation and Preservation
producer / consumer: creates data to be preserved by the long-term data steward
long-term data steward: actions need to meet the requirements of the producer / consumer
DCC Curation Lifecycle Model: http://www.dcc.ac.uk/resources/curation-lifecycle-model
16 / 23 21 / 10 / 13
Consumer Use Cases
• result of stakeholder analysis
• describe desired use, re-use, access
• will be addressed in the geometric and semantic enrichment processing layer
Knowing why something should be preserved helps us in evaluating the characteristics to be preserved
Use Cases (2/2)
17 / 23 21 / 10 / 13
OAIS: Information Object
http://public.ccsds.org/publications/archive/650x0m2.pdf
18 / 23 21 / 10 / 13
Metadata: Technical
"Metadata that describes the technical state of and process used to create a file. Often closely related either to its file format or the original software used to create the file, e.g. scanning equipment and settings used to create or modify a digital object."
http://www.digitalpreservation.gov/ndsa/ndsa-glossary.html
Information needed in order to maintain access to the file
Significant properties: criteria which an institution considers important factors of an object's quality, structure or behaviour, which should be preserved over time, i.e. over the course of digital preservation actions.
http://public.ccsds.org/publications/archive/650x0m2.pdf
Technical Metadata
19 / 23 21 / 10 / 13
File format characterization
Existing tools for various file formats:
JHOVE, Tika, fido, FITS, DROID, …
Few existing tools for IFC and E57:
E57 validator, IFC validator
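As an illustration of the first step of file format characterization, the sketch below identifies the two DURAARK target formats by their leading bytes. It assumes that IFC files in STEP physical file encoding begin with `ISO-10303-21;` and that E57 files begin with the ASCII signature `ASTM-E57`. The function names are hypothetical, and this is not how any of the tools named above is implemented.

```python
IFC_SPF_MAGIC = b"ISO-10303-21;"  # start of a STEP physical file (text-based)
E57_MAGIC = b"ASTM-E57"           # signature at the start of the binary E57 header


def identify_format(first_bytes):
    """Guess the format from the leading bytes of a file."""
    if first_bytes.startswith(E57_MAGIC):
        return "E57"
    # Text-based IFC files may be preceded by whitespace, so strip it first.
    if first_bytes.lstrip().startswith(IFC_SPF_MAGIC):
        return "IFC-SPF"
    return "unknown"


def identify_file(path, probe_size=64):
    """Read only the first bytes of a file and identify its format."""
    with open(path, "rb") as handle:
        return identify_format(handle.read(probe_size))
```

Identification by signature only says what a file claims to be. Validation against the format specification is a separate step, which is where the IFC and E57 validators come in.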
20 / 23 21 / 10 / 13
National Library of Australia: Testing Software Tools of Potential Interest for Digital Preservation
http://www.openplanetsfoundation.org/system/files/Digital%20Preservation%20Project%20Report%20-%20Testing%20Software%20Tools.pdf
21 / 23 21 / 10 / 13
IFC extraction:
• geometry types
• schema version
• implementation level
• application
• version of application
• measurement units
• MVD
• geotagged
• gross area
• number of stories
• …
E57 extraction:
• geo-referenced (yes/no)
• total square metre
• number of floors
• resolution settings
• quality settings
• sensor model, sensor serial number, …
• total number of scans
• total number of points
• intensity (yes/no)
• colour (yes/no)
• reasons for spatial disturbance: distribution of detected elements
• sub quality parameters (positioning) – in %, e.g., distance error matched references; occupied quadrants
• sub quality parameters (references) – in %, e.g., point drift, longitudinal mismatch
• …
Potential candidates for technical metadata
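Two of the IFC candidates above, schema version and number of stories, can be read directly from the text of an IFC STEP file. The following is a rough sketch, assuming the schema is declared in a `FILE_SCHEMA` header entry and that each storey is one `IFCBUILDINGSTOREY` instance. The function name is hypothetical, and a real extractor would use a proper STEP parser rather than regular expressions.

```python
import re

# Header entry such as: FILE_SCHEMA(('IFC2X3'));
SCHEMA_RE = re.compile(r"FILE_SCHEMA\s*\(\s*\(\s*'([^']*)'", re.IGNORECASE)
# Data entries such as: #12=IFCBUILDINGSTOREY(...);
STOREY_RE = re.compile(r"=\s*IFCBUILDINGSTOREY\s*\(", re.IGNORECASE)


def ifc_technical_metadata(text):
    """Extract a few technical metadata candidates from IFC STEP file text."""
    # The header section ends at the first ENDSEC; marker.
    header_end = text.find("ENDSEC;")
    header = text if header_end == -1 else text[:header_end]
    match = SCHEMA_RE.search(header)
    return {
        "schema_version": match.group(1) if match else None,
        "number_of_stories": len(STOREY_RE.findall(text)),
    }
```

The other candidates, such as measurement units or the MVD, would need deeper parsing of the header and data sections and are left out of this sketch.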
22 / 23 21 / 10 / 13
Currently developing a stakeholder questionnaire covering the following areas:
– data holdings (formats, SW, produced internally / externally)
– data storage / management (data carriers, backup practices, archiving practices)
– access (when, for what reason)
– experience with data loss (yes/no, reasons)
Looking for interested institutions and multipliers!
Want to help?