
Supplementary Data and Publishers

Neil Beagrie, Julia Chruszcz,

and Peter Williams

Charles Beagrie Ltd

Dryad UK April 2010

Overview

• Consultancy for Dryad Sustainability: covered areas of the draft business plan and sustainability for Dryad
• Presenting one of the contributions (publishers) to the section on Comparators and Costs
• Outcomes from desk research and 12 interviews with publishers/data publishers, plus some additional input drawn from Keeping Research Data Safe
• Very brief presentation: article in preparation for the Learned Publishing Oct 2010 issue. KRDS2 available from JISC shortly (or from me now)

Interviewees

• Journal of Clinical Investigation
• Journal of the American Medical Association
• Molecular Phylogenetics and Evolution (Elsevier)
• Journal of Heredity (OUP)
• Ecological Society of America
• Wiley-Blackwell + Ecology Letters
• Royal Society
• Federation of American Societies for Experimental Biology
• OECD Publishing
• Internet Archaeology and Archaeology Data Service
• Pangaea: Publishing Network for Geoscientific & Environmental Data
• Dataverse Network (Social Sciences, Harvard)

Some Findings: growth

• Many interviewees stated that supplementary data and materials are showing rapid growth
• 3 gave figures: from 32 articles in 2000 to 251 in 2009 (an increase of 684%, i.e. to 784% of the 2000 level); from 6% in 2005 to 38% in 2009; from 2% a decade ago to 87% in 2009.
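The first of these figures can be checked with a line of arithmetic: 251 articles is about 784% of the 2000 count of 32, which corresponds to an increase of about 684%. A minimal sketch follows. The function names are illustrative and do not come from the study.

```python
def percent_increase(old: float, new: float) -> float:
    """Percentage increase going from old to new (illustrative helper)."""
    if old <= 0:
        raise ValueError("old must be positive")
    return (new - old) / old * 100.0


def percent_of_original(old: float, new: float) -> float:
    """New value expressed as a percentage of the old value."""
    if old <= 0:
        raise ValueError("old must be positive")
    return new / old * 100.0


if __name__ == "__main__":
    # Article counts quoted on the slide: 32 in 2000, 251 in 2009.
    print(round(percent_increase(32, 251)))     # 684
    print(round(percent_of_original(32, 251)))  # 784
```

The other two figures are shares of articles (for example 6% to 38%), so they are better read as percentage-point changes than as relative growth.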

Some Findings: workflow

• Supplementary data have grown organically at the various journals investigated (author driven);
• Both the work and the costs are being absorbed into the daily running of journals;
• In 4 cases there was minimal impact on work duties; in 5 others there was a significant but often unquantified impact (two of these might be considered data publications, with a focus on publishing data papers or datasets); and in 3 cases the information was not available or unknown;
• The differences can be explained by the level of effort or importance applied, and/or by high levels of automation in the workflow: the greatest effort is associated with copy editing, format migration, addition of metadata, etc., whilst the least effort is required for simply hosting the material.

Some Findings: costs

• These were in most cases unknown or only partially known;
• Costs mentioned but usually not quantified include: digital storage costs; salary costs of journal staff; and long-term preservation costs;
• Detailed cost information was really only available from Internet Archaeology via the Archaeology Data Service, which had participated in an activity-based costing study (KRDS2);
• Internet Archaeology archiving costs reflect those for a “dataset publisher”, so they are only a comparator for part of Dryad’s content: large datasets.

Some Findings: revenue

• Only author fees and journal subscription fees were mentioned as current revenue sources for the supplementary materials in journals;
• 3 journals interviewed have author charges for supplementary materials (see next slide);
• The data archiving and sharing organisations interviewed relied primarily on (uncertain) research grants and temporary or recurrent core funding, but one had access to a small endowment and another has a charging policy for some depositors.

Some Findings: author charges

• Journal of Clinical Investigation: authors are charged $300 for supplemental data to appear online with accepted articles;
• Ecological Archives: submission of ‘appendices and supplements’ is free up to 10 MB. Above this, there is a fee of $250 for the first 1 GB and $50 for each subsequent GB. The fee for publication of a data paper is $250 for publication of the abstract in the relevant journal plus publication of up to 10 MB in Ecological Archives. An additional $250 is charged for data sets between 10 MB and 1 GB, and for larger datasets there is an additional $50 per GB fee;
• The Federation of American Societies for Experimental Biology (FASEB) charges $100 for each supplemental file.
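The tiered Ecological Archives schedule above is easier to follow as a small calculation. The sketch below is only an illustration of the fees as quoted on the slide. It assumes 1 GB = 1024 MB and that a partial gigabyte beyond the first is charged as a whole one; neither detail is stated in the source, and the function names are hypothetical.

```python
import math

MB_PER_GB = 1024  # assumption: the slide does not say whether 1 GB is 1000 or 1024 MB


def supplement_fee(size_mb: float) -> int:
    """Fee in USD for 'appendices and supplements' of the given size."""
    if size_mb < 0:
        raise ValueError("size_mb must be non-negative")
    if size_mb <= 10:
        return 0        # free up to 10 MB
    if size_mb <= MB_PER_GB:
        return 250      # flat fee covering the first 1 GB
    # Assumption: any started GB beyond the first is charged in full.
    extra_gb = math.ceil((size_mb - MB_PER_GB) / MB_PER_GB)
    return 250 + 50 * extra_gb


def data_paper_fee(size_mb: float) -> int:
    """Fee in USD for a data paper: $250 for the abstract plus up to 10 MB,
    then the same size-based charges as for supplements."""
    return 250 + supplement_fee(size_mb)


if __name__ == "__main__":
    for size in (5, 500, 3 * MB_PER_GB):
        print(size, supplement_fee(size), data_paper_fee(size))
```

Under these assumptions a 500 MB supplement costs $250, while a data paper with a 500 MB dataset costs $500.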

Summary KRDS Activity Model

Pre-Archive Phase
• Outreach
• Initiation
• Creation

Archive Phase
• Acquisition
• Disposal
• Ingest
• Archival Storage
• Preservation Planning
• First Mover Innovation
• Data Management
• Access

Support Services
• Administration
• Common Services
• Estates

KRDS: what did we learn?

Whole of Service costing / Seeing the “Big Picture”

Selection of 2009 Allocation of UKDA Activity Costs

• Acquisition: 5.8%
• Ingest: 21.5%
• Archival Storage + Preservation Planning: 3.1%
• Access: 16.9%

KRDS: Implications

• Changing view of digital preservation costs:
– “getting stuff in and out” costs much higher than “keeping it (bit preservation + migration)”;
– Staff costs c. 70% of total costs;
– Importance of economies of scale and automation.

Further Information

“Keeping Research Data Safe” (KRDS1) Final report and Executive Summary at http://www.jisc.ac.uk/publications/publications/keepingresearchdatasafe.aspx

Keeping Research Data Safe 2 (KRDS2) webpage at www.beagrie.com/jisc.php

KRDS2 report available from JISC website early May 2010 or email [email protected]