1
Integration – the heart of researcher centric research data management systems
Steve Mackey
15 January 2015
2
Agenda
• Who we are, what we do
• How it works
• RDM systems, where it fits
• Workflows
• Integrations
21 October 2014
3
Archive storage with a difference
Flagship Arkivum100 service with 100% data integrity guarantee
World-wide professional indemnity insurance – Arkivum100
Long term contracts for enterprise data archiving
Fully automated and managed solution
Audited and certified to ISO27001
Data escrow, exit plan, no lock-in
4
Keeping Data Alive for 25+ Years
3-5 year obsolescence of servers, operating systems and software
Adding media – effectively continual process
Monthly checks and maintenance updates
Annual data retrieval and integrity checks
Hardware refresh
Software migration
Hardware migration
Tape format migration – LTO n to LTO n+2
Support and admin staff migration
Change of supplier of products and services
5
Arkivum Appliance
• CIFS/NFS presentation (integrates easily to local file systems)
• Simple administration of user access permissions and storage allocations
• Robust REST API for application integration
• GUI for file ingest status, recovery pre-staging, security
• Ingest triggered by: timeout, checksum exchange, manifest (bulk)
• Checksum/fixity chain of custody from ingest through replication
• Immutable (WORM)
• Regular (6 monthly) data copy read verify
• Offline Escrow data copy (open source, self describing)
• Data encryption throughout, keys only held by customer
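The manifest and checksum bullets above can be illustrated with a minimal sketch that builds a checksum manifest for a directory before a bulk ingest. The slides do not specify the appliance's manifest format or hash algorithm, so the use of SHA-256, the relative-path keys and the function names `sha256_of` and `build_manifest` are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map every file under root (relative POSIX path) to its checksum.

    Hypothetical manifest layout: the real ingest format is not given
    in the slides.
    """
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
```

A manifest of this kind gives the depositing system a record of what it sent, which can later be compared with the checksums the archive reports.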
[Diagram, built up over several slides, showing the archive data flow. Labels: Original Datasets & Files; Copy for ingest; Arkivum Gateway on Appliance; Arkivum Service; Arkivum/100; Encrypted Archive; Archive Copy 1; Archive Copy 2; Escrow Copy; Validated Archive; Cached Copy; Decrypted object]
http://datablog.is.ed.ac.uk/2013/12/06/the-four-quadrants-of-research-data-curation-systems/
PURE, Elements, Converis
EPrints, DSpace, Hydra
Figshare, Re3data.org, Landing pages, CKAN
Institutional storage
17
Workflows
• RDM Workflow - The sequence of repeatable processes (steps) through which Research Data passes during its lifecycle, including the steps involved in its creation, curation, preservation, access and eventual disposal.
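As a rough illustration of the definition above, a workflow can be modelled as an ordered list of repeatable steps applied to a dataset record. This is only a sketch: the record structure, the `make_step` and `run_workflow` helpers and the `history` field are hypothetical, and the stage names are simply those listed in the definition.

```python
from typing import Callable

Record = dict[str, object]
Step = Callable[[Record], Record]

# Lifecycle stages named in the definition, in order.
LIFECYCLE = ["creation", "curation", "preservation", "access", "disposal"]


def make_step(name: str) -> Step:
    """Build a placeholder step that only records that it ran."""
    def step(record: Record) -> Record:
        history = list(record.get("history", []))
        history.append(name)
        # Return a new record so each step leaves its input untouched.
        return {**record, "history": history}
    return step


def run_workflow(record: Record, steps: list[Step]) -> Record:
    """Pass a dataset record through each step in sequence."""
    for step in steps:
        record = step(record)
    return record
```

Because each step is a plain function, the same sequence can be rerun for every dataset, which is what makes the process repeatable.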
18
RDM Workflows Report
• JISC Research Data Spring
• A Consortial Approach to Building an Integrated RDM System – “Small and Specialist”
• http://dx.doi.org/10.6084/m9.figshare.1476832
[Workflow diagram]
Systems shown: Researcher, Web browser, Local Research Data, HR system, CRIS (Elements), Figshare (Amazon), DataCite (BL), Journal, Repository (DSpace), Archive (Arkivum)
Numbered flows:
1. Researcher details
2. Data files
3. Data Description
4. Mint DOI
5. Data DOI
6. Data DOI
7. Article
8. Data DOI
9. Article and Article DOI
10. Article and Article DOI
11. Article DOI
12. Dataset Description and Data DOI
13. Dataset Description and Data DOI
14. Data files
15. Data is safe
16. Data is safe
One further flow, labelled "Article DOI", is unnumbered.
21
Why integrate?
• Simpler and easier RDM processes from a Researcher perspective, which both encourages adoption and lowers the cost of institutional support to the research base.
• Clear and repeatable RDM processes that help ensure higher levels of quality and consistency in RDM across the research base.
• Ability to deploy RDM as community-driven shared service(s) so that smaller institutions can ‘join forces’ to benefit from having access to a common RDM infrastructure.
• Scaling RDM up across a large research base using automation and ‘factory’ type approaches to achieve ‘economies of scale’ and move away from RDM being a manual and labour intensive endeavour.
• Specifically for Archive layer storage this may include:
– Confirmation of integrity of received files via checksums/fixity
– File archive status reporting
– Trigger for original file deletion
– File location, data pool management
– File recovery staging
– Encryption key management
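Two of the archive-layer items above, integrity confirmation and the trigger for original file deletion, can be combined in a small sketch: the local original is removed only once the checksum reported by the archive matches the local file. How the archive reports its checksum is not described in the slides, so `archive_checksum` is simply passed in as a string here, and SHA-256 and the function names are assumptions.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def delete_if_archived(path: Path, archive_checksum: str) -> bool:
    """Delete the local original only when the archive's checksum matches.

    archive_checksum is assumed to be a hex SHA-256 string obtained from
    the archive by some means outside this sketch.
    """
    if file_sha256(path) != archive_checksum.strip().lower():
        return False  # mismatch: keep the original for investigation
    path.unlink()
    return True
```

Keeping the original on any mismatch means a failed or corrupted transfer never results in the only good copy being removed.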