![Page 1: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/1.jpg)
The Right Combination: Using DDI and PREMIS for data preservation
Parul Sharma & Sally Vermaaten
March 2012
![Page 2: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/2.jpg)
Outline
1. The context – drivers for preservation
2. The problem – challenges faced when trying to re-use data
3. Our solution – metadata for data management & preservation
4. Our recommendations – strategies for making the right metadata choices
![Page 3: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/3.jpg)
1. THE CONTEXT: DRIVERS FOR PRESERVATION
![Page 4: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/4.jpg)
Data is a cross-domain concern
Geospatial data
Scientific data
Statistical data
Financial and commercial data
![Page 5: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/5.jpg)
There are many drivers for data preservation
Legal mandates
Verification
Uniqueness of data
Cost of data collection
Data re-use
![Page 6: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/6.jpg)
An example of data re-use at Statistics New Zealand
![Page 7: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/7.jpg)
2. THE PROBLEM: CHALLENGES FACED WHEN TRYING TO RE-USE DATA
![Page 8: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/8.jpg)
Common challenges to re-use/preservation of any type of digital object
![Page 9: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/9.jpg)
Common challenges to re-use/preservation of any type of digital object
I can’t find it
I can’t open it (wrong hardware/software)
I’m not sure it is the right thing
![Page 10: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/10.jpg)
Unique challenges to re-use/preservation of structured data
![Page 11: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/11.jpg)
Unique challenges to re-use/preservation of structured data
I’m not sure it is the authoritative data
I don’t understand the meaning of the data - data is not self-descriptive
I can’t use the data because I can’t harmonize it with other data
![Page 12: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/12.jpg)
3. OUR SOLUTION: METADATA FOR DATA MANAGEMENT & PRESERVATION
![Page 13: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/13.jpg)
Our solutions
![Page 14: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/14.jpg)
Our solutions
![Page 15: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/15.jpg)
Our solutions
![Page 16: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/16.jpg)
Our solutions
![Page 17: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/17.jpg)
Our solutions
![Page 18: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/18.jpg)
To support these processes…
Metadata is key
We could invent our own standard for recording metadata, but there is a better way…
![Page 19: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/19.jpg)
How?
Discover! Dublin Core
+ Describe! Data Documentation Initiative (DDI)
+ Preserve! PREservation Metadata: Implementation Strategies (PREMIS)
![Page 20: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/20.jpg)
Comparison of standards coverage
| Dublin Core | DDI | PREMIS |
| --- | --- | --- |
| Discovery information about a resource (e.g. Title, Creator, Publication date) | Surveys and outputs (Series and Studies) | Objects (significant characteristics, checksums, basic identifying information) |
| | Methodology & quality information | Events (preservation actions) |
| | Classifications used | Agents |
| | Dataset descriptions | Rights |
| | Variables used | |
| | Links to documentation | |
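As a rough illustration of the PREMIS column, the sketch below computes a checksum for an object and records the result of a later fixity check as a preservation event. It uses only the Python standard library. The function names and the dictionary keys are hypothetical and are not PREMIS element names; a real implementation would serialise this information using the PREMIS schema.

```python
import hashlib
from datetime import datetime, timezone


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def fixity_event(object_id: str, data: bytes, recorded_checksum: str) -> dict:
    """Compare a fresh checksum with the recorded one and describe the check.

    The keys below are illustrative only; they are not PREMIS element names.
    """
    outcome = "pass" if sha256_of(data) == recorded_checksum else "fail"
    return {
        "object": object_id,
        "event_type": "fixity check",
        "event_time": datetime.now(timezone.utc).isoformat(),
        "outcome": outcome,
    }


content = b"id,age\n1,34\n"            # stand-in for the bytes of a data file
stored = sha256_of(content)            # recorded when the object was ingested
print(fixity_event("data.csv", content, stored)["outcome"])         # pass
print(fixity_event("data.csv", content + b"x", stored)["outcome"])  # fail
```

Keeping the checksum with the object and the check with the event mirrors the split between Objects and Events in the comparison.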
![Page 21: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/21.jpg)
Metadata to support re-use
DDI
PREMIS
![Page 22: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/22.jpg)
4. OUR RECOMMENDATIONS: STRATEGIES FOR MAKING THE RIGHT METADATA CHOICES
![Page 23: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/23.jpg)
Metadata Top Tips
1. Create structures that will allow you to re-use metadata tools
2. Use standards that are fit for your content so users can re-use
3. Consider overlap between standards so you’re using the right standard for the right job
4. Provide standards-based tools and capture at point of creation to improve quality and efficiency
![Page 24: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/24.jpg)
1. Create structures that will allow you to re-use metadata tools
Set yourself up to be able to use the same tools to harvest and mine your metadata (e.g. handy reports, searching across content types) by:
– developing a standard structure that can support all your content types
– and recording generic information in generic metadata standards
![Page 25: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/25.jpg)
Data_1500
  DublinCore.xml
  PREMIS.xml
  Original
    data.sas7bdat
    questionnaire.doc
  ArchiveMaster
    Data
      data.csv
    Documentation
      questionnaire.pdf
    Metadata
      DDI.xml
Database_0120
  DublinCore.xml
  PREMIS.xml
  Original
    database.mdb
  ArchiveMaster
    Header
      metadata.xsd
      metadata.xml
    Content
      Schema1
        Table1
          table.xsd
          table.xml
Non-format specific metadata: DublinCore.xml, PREMIS.xml
Format specific structure & metadata: ArchiveMaster
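Because every package carries a DublinCore.xml at its top level, one tool can report across content types. The sketch below is a minimal illustration of that idea, assuming each DublinCore.xml holds a `dc:title` element inside a wrapper element named `metadata`; the wrapper element, the helper names, and the example titles are assumptions made for this sketch, not details from the slides.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace


def write_dublin_core(package: Path, title: str) -> None:
    """Write a minimal DublinCore.xml at the top level of a package.

    The <metadata> wrapper element is an assumption made for this sketch.
    """
    package.mkdir(parents=True, exist_ok=True)
    root = ET.Element("metadata")
    ET.SubElement(root, f"{{{DC_NS}}}title").text = title
    ET.ElementTree(root).write(str(package / "DublinCore.xml"), encoding="utf-8")


def harvest_titles(archive: Path) -> dict:
    """Map each package name to its dc:title, whatever the content type."""
    titles = {}
    for dc_file in sorted(archive.glob("*/DublinCore.xml")):
        element = ET.parse(str(dc_file)).getroot().find(f"{{{DC_NS}}}title")
        titles[dc_file.parent.name] = element.text if element is not None else None
    return titles


# Build two throwaway packages (hypothetical titles) and harvest them.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp)
    write_dublin_core(archive / "Data_1500", "Example survey dataset")
    write_dublin_core(archive / "Database_0120", "Example database")
    harvested = harvest_titles(archive)
    print(harvested)
```

The harvesting function never looks inside ArchiveMaster, so it works the same way for a statistical dataset and a database.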
![Page 26: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/26.jpg)
2. Use standards that are fit for your content so users can re-use
Enable future re-use and understanding by recording format- or content-specific metadata in fit-for-purpose standards, e.g.
DDI for statistical data
SIARD for databases
MIX for images
![Page 27: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/27.jpg)
3. Consider overlap between standards so you’re using the right standard for the right job

| Information | DDI | PREMIS | Dublin Core | Useful to duplicate? |
| --- | --- | --- | --- | --- |
| Basic identifying information | Title, Creator, PublicationDate, ID | | Title, Creator, Date, Identifier | Yes |
| Access information | Access Conditions | Rights entity | Rights | No – PREMIS is most expressive and generic location |
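One way to handle the duplicated identifying information is to record it once and derive the second copy with a crosswalk. The sketch below assumes the first group of field names in the comparison (Title, Creator, PublicationDate, ID) belongs to DDI and the second (Title, Creator, Date, Identifier) to Dublin Core. The mapping, the function name, and the example values are illustrative only, and plain dictionaries stand in for real DDI and Dublin Core XML.

```python
# Field names follow the overlap noted above; the crosswalk itself is illustrative.
DDI_TO_DUBLIN_CORE = {
    "Title": "Title",
    "Creator": "Creator",
    "PublicationDate": "Date",
    "ID": "Identifier",
}


def to_dublin_core(ddi_record: dict) -> dict:
    """Copy the overlapping identifying fields into a Dublin Core style record.

    Fields outside the crosswalk (for example access conditions) are left out,
    so they stay recorded in one place only.
    """
    return {
        dc_field: ddi_record[ddi_field]
        for ddi_field, dc_field in DDI_TO_DUBLIN_CORE.items()
        if ddi_field in ddi_record
    }


ddi_record = {
    "Title": "Example survey",          # hypothetical values
    "Creator": "Example agency",
    "PublicationDate": "2012-03-01",
    "ID": "Data_1500",
    "Access Conditions": "Restricted",
}
print(to_dublin_core(ddi_record))
```

Access information is deliberately not copied, in line with the recommendation to keep it in PREMIS rather than duplicate it.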
![Page 28: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/28.jpg)
4. Provide standards-based tools and capture at point of creation to improve quality and efficiency
At first, you may need to capture or collate all metadata about data yourself. Think ahead about tools you might be able to provide to data experts to allow them to record the information directly in the standard if possible.
![Page 29: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/29.jpg)
![Page 30: Parul Sharma Sally Vermaaten Right Combination](https://reader036.fdocuments.us/reader036/viewer/2022062418/5557a380d8b42a696c8b467b/html5/thumbnails/30.jpg)
Takeaways
1. Organisations have many reasons to re-use data over time
2. There are unique challenges to preserving data
3. Where possible, save yourself some work and make your metadata more harvestable and data more understandable by using international standards like DDI and PREMIS
4. When you use metadata standards like DDI and PREMIS together:
• create generic structures
• use fit-for-purpose standards for specific content
• consider information overlap
• ‘delegate’ metadata capture where possible