RDAP14: It’s a Real World: Developing Preservation Policy for Dryad

It’s a Real World: Developing Preservation Policy for Dryad. Ayoung Yoon (Dryad preservation working group, Doctoral Candidate at UNC-CH), Sara Mannheimer (Former Dryad curator, Data management librarian at Montana State University), Elena Feinstein, Jane Greenberg, Ryan Scherle, Dryad Digital Repository. March 26, 2014, Research Data Access & Preservation Summit (RDAP) 2014

description

Research Data Access and Preservation Summit, 2014. San Diego, CA, March 26-28, 2014. Ayoung Yoon (Dryad preservation working group, Doctoral Candidate at UNC-CH); Sara Mannheimer (Former Dryad curator, Data management librarian at Montana State University); Elena Feinstein, Jane Greenberg, Ryan Scherle (Dryad Digital Repository)

Transcript of RDAP14: It’s a Real World: Developing Preservation Policy for Dryad

Page 1: RDAP14: It’s a Real World: Developing Preservation Policy for Dryad

It’s a Real World: Developing Preservation Policy for Dryad

Ayoung Yoon (Dryad preservation working group, Doctoral Candidate at UNC-CH)
Sara Mannheimer (Former Dryad curator, Data management librarian at Montana State University)

Elena Feinstein, Jane Greenberg, Ryan Scherle, Dryad Digital Repository

March 26, 2014
Research Data Access & Preservation Summit (RDAP) 2014

Page 2: RDAP14: It’s a Real World: Developing Preservation Policy for Dryad

Outline
• Introduction
• What is Dryad Digital Repository?
• Preservation policy development process
• Dryad preservation policy
• Lessons learned and open questions
• Conclusion
• Acknowledgement

Page 3: RDAP14: It’s a Real World: Developing Preservation Policy for Dryad

Introduction
• “Data deluge”
• Journal and funding agency mandates
• Benefits of archiving and preserving research data:
  – Facilitates:
    • verification of research
    • accessibility and discoverability
    • opportunities for data reuse
    • increased citations
    • research visibility
  – Prevents:
    • redundant data collection
    • inefficient legacy data curation
    • burden of sharing-on-request
• Challenges of data archiving:
  – Wider variety of file formats than most digital archival materials
  – New versions as data sets are added to and updated
  – Security considerations
  – Large amounts of data

Benefits adapted from Beagrie N, Lavoie BF, Woollard M (2010) Keeping research data safe 2. HEFCE

Page 4: RDAP14: It’s a Real World: Developing Preservation Policy for Dryad

Why preservation policy?

• A preservation policy supports strategic planning for implementation
• Communicates trustworthiness and commitment to preservation to stakeholders
• Few data preservation policies exist. Some examples:
  – CERN: CMS data
  – Archaeology Data Service
  – NSIDC Data Management Policies
  – Odum Institute Preservation Policy
  – ICPSR
  – DataONE

Page 5: RDAP14: It’s a Real World: Developing Preservation Policy for Dryad

Dryad Digital Repository
• A curated, general-purpose repository that makes the data underlying scientific and medical publications discoverable, freely reusable, and citable (http://datadryad.org/).
• Facilitates data availability, data sharing, and scholarly communication.
• Originally partnered with leading journals and scientific societies in evolutionary biology and ecology.
• Broad collecting policy: almost any data is accepted, as long as it is associated with a publication.

Page 6: RDAP14: It’s a Real World: Developing Preservation Policy for Dryad

Common filetypes in Dryad

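[Bar chart: number of files per file type, horizontal axis from 0 to 1200. File types shown, in the order they appear on the chart: WAV, HTML, Phylip, R script, JPEG Image, Newick tree file, RTF, XML, GZip archive, MS Word OpenXML, MS Word 97-2007, Nexus, PDF, FASTA, MS Excel OpenXML, Zip archive, CSV, MS Excel 97-2007]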
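A tally of this kind can be approximated by counting file extensions. The following is a minimal sketch, not Dryad's actual tooling: it assumes the files sit in a local directory tree, and the function name count_file_types is hypothetical. File extension is only a rough proxy for format, so a real survey would likely rely on a format identification tool instead.

```python
from collections import Counter
from pathlib import Path


def count_file_types(root):
    """Count files under `root` by lowercase extension (hypothetical helper)."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "(none)".
            counts[path.suffix.lower() or "(none)"] += 1
    return counts


if __name__ == "__main__":
    # Print the most common extensions in the current directory tree.
    for extension, total in count_file_types(".").most_common(10):
        print(f"{extension}\t{total}")
```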

Page 7: RDAP14: It’s a Real World: Developing Preservation Policy for Dryad

Dryad and Preservation Needs
• Preservation is a major part of Dryad’s mission.
• Current preservation actions:
  – MD5 checksums
  – provenance metadata
  – informal encouragement of preferred formats
• Developing and implementing a formal preservation policy will:
  – guide current and future preservation practice
  – facilitate the long-term preservation of the repository’s digital assets
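The MD5 checksum action listed above can be illustrated with a short fixity check. This is a minimal sketch using Python's standard hashlib module, not a description of how Dryad computes or stores its checksums; the function names md5_checksum and verify_fixity are hypothetical.

```python
import hashlib


def md5_checksum(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large data files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, recorded_checksum):
    """Compare a file's current MD5 digest with a previously recorded one."""
    return md5_checksum(path) == recorded_checksum.lower()
```

A repository could record the digest at ingest and rerun verify_fixity periodically; a mismatch signals that the stored file has changed. MD5 serves here to detect accidental corruption, not to resist deliberate tampering.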

Page 8: RDAP14: It’s a Real World: Developing Preservation Policy for Dryad

Policy Development Process

• 2012: An initial preservation plan (version 1.0)
• Feb 2013: Preservation Working Group
• May 2013: Version 2.0 presented to the Dryad Board of Directors
• July 2013: Version 2.0 revised in cooperation with Dryad staff
• Nov 2013:
  – Version 2.4 approved by Dryad Board of Directors
  – Preservation Working Group dissolved

Preservation Task Force formed

Page 9: RDAP14: It’s a Real World: Developing Preservation Policy for Dryad

Preservation Policy
• Purpose
• Scope and content coverage
• Overview of preservation strategies
• Format support and levels of preservation
  – e.g. preferred formats and format support levels
• Implementing the strategy
  – e.g. integration of OAIS functional activities, pre-ingest and ingest, archival storage, authenticity and integrity, security, versioning, and withdrawal of collections
• Sustainability plans
  – e.g. technical sustainability, institutional and financial sustainability
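One way a repository might record format support levels is a simple lookup from file extension to level. The sketch below is purely illustrative: the level names and the extensions assigned to each level are hypothetical placeholders, not the contents of Dryad's policy.

```python
from pathlib import Path

# Hypothetical level names and extension assignments, for illustration only.
SUPPORT_LEVELS = {
    "preferred": {".csv", ".txt", ".xml"},
    "accepted": {".xlsx", ".docx", ".zip"},
}


def support_level(filename, levels=SUPPORT_LEVELS):
    """Return the support level for a file, based on its extension."""
    suffix = Path(filename).suffix.lower()
    for level, extensions in levels.items():
        if suffix in extensions:
            return level
    # Anything not listed falls into a catch-all category.
    return "unclassified"
```

Keeping the table as data, separate from the lookup function, would let the levels be revised as the policy changes without touching the code.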

Page 10: RDAP14: It’s a Real World: Developing Preservation Policy for Dryad

Lessons Learned and Open Questions

• A negotiation between what is ideal and what is realistic
  – Adopting the international standards, models, and best practices that exist for long-term preservation
    • Open Archival Information System (OAIS) reference model (ISO 14721:2003)
    • PREMIS (PREservation Metadata: Implementation Strategies)
  – Other standards and guidelines on audit and certification for building a trusted digital repository
    • Trustworthy Repositories Audit & Certification: Criteria and Checklist (TRAC) and Data Seal of Approval (DSA)

Page 11: RDAP14: It’s a Real World: Developing Preservation Policy for Dryad

Lessons Learned and Open Questions

• Aligning with other internal and institutional policies
  – Follow Dryad’s internal policies: we looked primarily to Dryad’s Terms of Service document (https://datadryad.org/pages/policies), which includes policies on submission, content, payment, usage, and privacy
  – Comply with Dryad’s unofficial policies, which have yet to be finalized
    • A policy-in-progress: Dryad’s policy on versioning
  – Comply with policies from partner institutions
    • Dryad functions as a partnership between the University of North Carolina at Chapel Hill (UNC), Duke University (Duke), and North Carolina State University (NC State)

Page 12: RDAP14: It’s a Real World: Developing Preservation Policy for Dryad

Lessons Learned and Open Questions

• Structuring the policy according to Dryad’s specific needs
  – Meeting specific organizational needs is fundamentally important and should be the first consideration in all work, as each organization has different goals, priorities, and capabilities.
  – Data depositors’ requirements: minimum requirements
    • balance “minimum efforts” and having “enough” representation information
    • compensated by other factors

Page 13: RDAP14: It’s a Real World: Developing Preservation Policy for Dryad

Conclusion
• Policy creation and planning are just first steps; implementation will require further consideration
• Future plans
  – Potential for implementing TRAC / DSA in the future
  – Divide policy and implementation into separate documents
  – New Task Force

Page 14: RDAP14: It’s a Real World: Developing Preservation Policy for Dryad

Acknowledgement

• This work was supported in part by the National Science Foundation (NSF), award number 1147166, ABI Development: Dryad: scalable and sustainable infrastructure for the publication of data.

Page 15: RDAP14: It’s a Real World: Developing Preservation Policy for Dryad

Thank you!

Ayoung Yoon
Doctoral candidate
University of North Carolina at Chapel Hill
[email protected]

Sara Mannheimer
Data management librarian
Montana State University
[email protected]