Open access data

A centre of expertise in digital information management www.ukoln.ac.uk UKOLN is supported by: Open access data Michael Day Digital Curation Centre UKOLN, University of Bath [email protected] Impact from Software workshop, Cardiff University, 15 May 2013

description

Slides from a presentation given at an "Impact from Software" workshop, Cardiff University, 15 May 2013

Transcript of Open access data

Page 1: Open access data

A centre of expertise in digital information management

www.ukoln.ac.uk

UKOLN is supported by:

Open access data

Michael Day Digital Curation Centre

UKOLN, University of Bath

[email protected]

Impact from Software workshop, Cardiff University, 15 May 2013

Page 2: Open access data

Presentation outline

• Open Science

– Royal Society, Science as an open enterprise (2012)

• The changing requirements of funding bodies

– RCUK, EPSRC …

• Emerging Research Data Management (RDM) practice

• Citation of research data

• New ways of measuring “impact” (altmetrics)

Page 3: Open access data

Science as an open enterprise (1)

• Royal Society, Science as an open enterprise (June 2012) http://royalsociety.org/policy/projects/science-public-enterprise/report/

– Report of a Working Group chaired by Professor Geoffrey Boulton

– “Realising the benefits of open data requires a more intelligent openness, one where data are effectively communicated. For this, data must fulfil four fundamental requirements, something not always achieved by generic metadata. They must be accessible, intelligible, assessable and usable” (p. 14)

Page 4: Open access data

Science as an open enterprise (2)

• Recommendation 1 (p. 71)

– “Scientists should communicate the data they collect and the models they create, to allow free and open access, and in ways that are intelligible, assessable and usable for other specialists in the same or linked fields wherever they are in the world. Where data justify it, scientists should make them available in an appropriate data repository. Where possible, communication with a wider public audience should be made a priority, and particularly so in areas where openness is in the public interest.”

Page 5: Open access data

Science as an open enterprise (3)

• Recommendation 2 (p. 72)

– “Universities and research institutes should play a major role in supporting an open data culture by: recognising data communication by their researchers as an important criterion for career progression and reward; developing a data strategy and their own capacity to curate their own knowledge resources and support the data needs of researchers; having open data as a default position, and only withholding access when it is optimal for realising a return on public investment.”

Page 6: Open access data

Science as an open enterprise (4)

• Recommendation 3 (p. 73)

– “Assessment of university research should reward open data on the same scale as journal articles and other publications. Assessment should also include measures that reward collaborative ways of working”

• Implications for research evaluation exercises:

– Report argues that “the skill and creativity required to successfully acquire data represents a high level of scientific excellence and should be rewarded as such”

Page 7: Open access data

Science as an open enterprise (5)

• Science as an open enterprise (p. 73):

– “Dataset metrics should:

• a. Ensure the default approach is that datasets which underpin submitted scientific articles are accessible and usable, at a minimum by scientists in the same discipline.

• b. Give credit by using internationally recognised standards for data citation.

• c. Provide standards for the assessment of datasets, metadata and software that combines appropriate expert review with quantitative measures of citation and reuse.

• d. Offer clear rules on the delineation of what counts as a dataset for the purposes of review, and when datasets of extended scale and scope should be given increased weight.

• e. Seek ways of recognising and rewarding creative and novel ways of communal working, by using appropriately validated social metrics.”

– “These principles should be adopted by the UK Higher Education Funding Councils as part of their Research Excellence Framework (REF). The REF is a powerful driver for how universities evaluate and reward their researchers. Use in the REF of metrics that record citable open data deposition would be a powerful motivation for data release”

Page 8: Open access data

Panton Principles

• Panton Principles, Principles for open data in science. Murray-Rust, Peter; Neylon, Cameron; Pollock, Rufus; Wilbanks, John (19 Feb 2010).

– “By open data in science we mean that it is freely available on the public internet permitting any user to download, copy, analyse, re-process, pass them to software or use them for any other purpose without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. To this end data related to published science should be explicitly placed in the public domain.”

– Endorsed by the Open Knowledge Foundation

• http://pantonprinciples.org/

Page 9: Open access data

OECD Principles and Guidelines

• OECD Principles and Guidelines for Access to Research Data from Public Funding (2007) http://www.oecd.org/science/sci-tech/38500813.pdf

• Principle A: Openness

– “Openness means access on equal terms for the international research community at the lowest possible cost, preferably at no more than the marginal cost of dissemination. Open access to research data from public funding should be easy, timely, user-friendly and preferably Internet-based.”

Page 10: Open access data

UKRIO Code of Practice for Research

• UK Research Integrity Office, Code of Practice for Research: Promoting good practice and preventing misconduct (September 2009): http://www.ukrio.org/what-we-do/code-of-practice-for-research/

– 3.12.5 Organisations should have in place procedures, resources (including physical space) and administrative support to assist researchers in the accurate and efficient collection of data and its storage in a secure and accessible form.

– 3.12.6 Researchers should consider how data will be gathered, analysed and managed, and how and in what form relevant data will eventually be made available to others, at an early stage of the design of the project.

– 3.12.7 Researchers should collect data accurately, efficiently and according to the agreed design of the research project, and ensure that it is stored in a secure and accessible form.

Page 11: Open access data

RCUK Common Principles (1)

• RCUK Common Principles on Data Policy http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx

– Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner that does not harm intellectual property.

– Institutional and project specific data management policies and plans should be in accordance with relevant standards and community best practice. Data with acknowledged long-term value should be preserved and remain accessible and usable for future research.

– To enable research data to be discoverable and effectively re-used by others, sufficient metadata should be recorded and made openly available to enable other researchers to understand the research and re-use potential of the data. Published results should always include information on how to access the supporting data.

Page 12: Open access data

RCUK Common Principles (2)

• RCUK Common Principles (continued):

– RCUK recognises that there are legal, ethical and commercial constraints on release of research data. To ensure that the research process is not damaged by inappropriate release of data, research organisation policies and practices should ensure that these are considered at all stages in the research process.

– To ensure that research teams get appropriate recognition for the effort involved in collecting and analysing data, those who undertake Research Council funded work may be entitled to a limited period of privileged use of the data they have collected to enable them to publish the results of their research. The length of this period varies by research discipline and, where appropriate, is discussed further in the published policies of individual Research Councils.

– In order to recognise the intellectual contributions of researchers who generate, preserve and share key research datasets, all users of research data should acknowledge the sources of their data and abide by the terms and conditions under which they are accessed.

– It is appropriate to use public funds to support the management and sharing of publicly-funded research data. To maximise the research benefit which can be gained from limited budgets, the mechanisms for these activities should be both efficient and cost-effective in the use of public funds.

Page 13: Open access data

Funding body requirements (1)

• Changing expectations of funding bodies:

– Institutions need to inform themselves about the policies (mandates) of their main funders with respect to research data management

– There is an explicit link now being made between research income and appropriate data management infrastructures being in place

Page 15: Open access data

EPSRC Policy Framework (1)

• EPSRC Policy Framework on Research Data (2011) http://www.epsrc.ac.uk/about/standards/researchdata/Pages/policyframework.aspx

• EPSRC framework expected all institutions receiving grant funding:

– To develop a roadmap aligning their policies and processes with EPSRC’s expectations by 1st May 2012

– To be fully compliant with these expectations by 1st May 2015

Page 16: Open access data

EPSRC Policy Framework (2)

• Examples of expectations:

– Appropriate metadata (including unique IDs) to be made freely available on the Internet within 12 months of data generation

– Data not generated in digital format should be stored in a manner to facilitate it being shared

– Data should be securely preserved for a minimum of 10 years after privileged access expires or the last date access was requested by a third party

– Adequate resources to be provided from within existing funding streams

– EPSRC will monitor progress and compliance, and reserves the right to impose appropriate sanctions
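
The two time-based expectations above (metadata within 12 months, preservation for at least 10 years) can be illustrated with a small date calculation. This is only a sketch under stated assumptions: the function names are hypothetical, and reading the 10-year clock as running from the later of the two dates is an interpretation of the slide wording, not a statement of EPSRC policy.

```python
from datetime import date
from typing import Optional


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years; 29 Feb falls back to 28 Feb if needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def metadata_deadline(data_generated: date) -> date:
    """Latest date for publishing metadata: 12 months after data generation."""
    return add_years(data_generated, 1)


def earliest_disposal_date(privileged_access_ends: date,
                           last_third_party_request: Optional[date] = None) -> date:
    """Minimum preservation end date.

    Assumption of this sketch: the 10 years run from whichever is later,
    the end of privileged access or the last third-party access request.
    """
    start = privileged_access_ends
    if last_third_party_request is not None and last_third_party_request > start:
        start = last_third_party_request
    return add_years(start, 10)


if __name__ == "__main__":
    print(metadata_deadline(date(2013, 5, 15)))
    print(earliest_disposal_date(date(2014, 1, 1), date(2016, 3, 1)))
```

Each new third-party request after privileged access has ended pushes the disposal date further out, which is why retention costs are hard to fix in advance.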

Page 17: Open access data

Funding body requirements (3)

• Implications for researchers and institutions:

– Increasing number of research councils and funding bodies have requirements for data management and sharing

– Potential loss of research income if these mandates are not met

– Both institutions and researchers need to determine the costs associated with short and longer-term management and curation

– Responsibility for data management infrastructure seems to be shifting more to HEIs, but institutional infrastructures and services are still emerging

Page 18: Open access data

Good practice in RDM

• UK landscape quite variable

– Jisc MRD Programme projects have kick-started a lot of activity in UK HEIs

– Other HEIs getting involved, e.g. prompted by the EPSRC Policy Framework (Digital Curation Centre Institutional Engagements)

• Summary of good practice identified to date:

– Sarah Jones, Graham Pryor and Angus Whyte, How to Develop RDM Services - a guide for HEIs. Digital Curation Centre, 2013. http://www.dcc.ac.uk/resources/how-guides/how-develop-rdm-services

Page 19: Open access data

Citation of research data (1)

• Providing citation infrastructures for research data is seen as vitally important for the promotion of data sharing

– Facilitates discovery, retrieval and attribution (as it has for published research outputs)

• “… the most important condition for sharing their data is to receive proper citation credit when others use their data. For 92% of the respondents, it is important that their data are cited when used by other researchers.” (Tenopir, et al., 2011, p. 9)

• “Promotion of data citation will foster a scholarly communication system that allows for identification, retrieval, and attribution of research data” (Mooney and Newton, 2012)

– Linking data sharing with the de facto reward system of science

Page 20: Open access data

Citation of research data (2)

• Need for internationally recognised standards for data citation:

– Royal Society, Science as an open enterprise (p. 73) identified the need to use citation standards, but also explicitly linked this to the REF: “Use in the REF of metrics that record citable open data deposition would be a powerful motivation for data release.”

– EPSRC Policy Framework recommended the use of a “robust digital object identifier” and suggested DataCite

Page 21: Open access data

DataCite (1)

• DataCite (http://www.datacite.org) is a not-for-profit organisation that aims to promote and support the sharing of research data

– Membership organisation – current UK members are the British Library and the Digital Curation Centre (associate)

– They are developing an infrastructure that supports methods of data citation, discovery, and access

– They are currently leveraging the DOI (Digital Object Identifier) infrastructure, which is also used for research articles

– They can provide DOIs for datasets

– DataCite DOIs have to resolve to a public landing page with information about the dataset and a direct link to it

May-13

Learning material produced by RDMRose

http://www.sheffield.ac.uk/is/research/projects/rdmrose

Page 22: Open access data

DataCite (2)

• Basic form of DataCite citations:

• Creator (PublicationYear): Title. Publisher. Identifier

• Version and ResourceType are optional extra elements

• For citation purposes, DataCite recommends that DOI names are displayed as linkable, permanent URLs

• Example:

– University of Poppleton (2011): Precipitation measurements 1905-2010 taken at Western Bank weather station. Meteorological service, The University of Poppleton. http://dx.doi.org/10.1594/UoP.MS.298
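
The basic citation form above can be sketched in a few lines of Python. This is an illustrative sketch, not DataCite software: the `DatasetRecord` class and both function names are hypothetical, the resolver prefix is the one shown in the example, and the placement of the optional Version and ResourceType elements is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetRecord:
    """Hypothetical container for the citation elements named on the slide."""
    creator: str
    publication_year: int
    title: str
    publisher: str
    doi: str                              # bare DOI name, e.g. "10.1594/UoP.MS.298"
    version: Optional[str] = None         # optional extra element
    resource_type: Optional[str] = None   # optional extra element


def doi_to_url(doi: str, resolver: str = "http://dx.doi.org/") -> str:
    """Display a DOI name as a linkable URL via the resolver used on the slide."""
    return resolver + doi


def format_citation(record: DatasetRecord) -> str:
    """Build 'Creator (PublicationYear): Title. Publisher. Identifier'."""
    parts = [f"{record.creator} ({record.publication_year}): {record.title}."]
    if record.version:
        # Assumed position: directly after the title.
        parts.append(f"Version {record.version}.")
    parts.append(f"{record.publisher}.")
    if record.resource_type:
        # Assumed position: directly after the publisher.
        parts.append(f"{record.resource_type}.")
    parts.append(doi_to_url(record.doi))
    return " ".join(parts)


if __name__ == "__main__":
    example = DatasetRecord(
        creator="University of Poppleton",
        publication_year=2011,
        title=("Precipitation measurements 1905-2010 taken at "
               "Western Bank weather station"),
        publisher="Meteorological service, The University of Poppleton",
        doi="10.1594/UoP.MS.298",
    )
    print(format_citation(example))
```

Run as a script, this reproduces the example citation on the slide. Storing the bare DOI name and adding the resolver prefix only at display time keeps the identifier independent of any one resolver address.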

May-13

Learning material produced by RDMRose

http://www.sheffield.ac.uk/is/research/projects/rdmrose

Page 23: Open access data

DataCite metadata (1)

• DataCite:

– DataCite Metadata Schema (currently v. 2.2, 2011) defines core metadata properties

– Looks a little bit like Dublin Core, but the schema incorporates other elements of unique identifier-based infrastructures (e.g. ORCID – researcher IDs)

– http://schema.datacite.org (doi:10.5438/0005)

Page 24: Open access data

DataCite metadata (2)

• Mandatory Properties:

– Identifier

– Creator

– Title

– Publisher

– PublicationYear

• Administrative Metadata

– LastMetadataUpdate

– MetadataVersionNumber

• Optional Properties:

– Subject

– Contributor

– Date

– Language

– ResourceType

– AlternateIdentifier

– RelatedIdentifier

– Size

– Format

– Version

– Rights

– Description
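
The property lists above lend themselves to a simple completeness check before a record is registered. The sketch below is illustrative only: it models a record as a flat Python dictionary keyed by the property names on the slide, whereas the real schema is XML with nested elements, and `check_record` is a hypothetical helper rather than part of any DataCite tool.

```python
MANDATORY = ("Identifier", "Creator", "Title", "Publisher", "PublicationYear")
ADMINISTRATIVE = ("LastMetadataUpdate", "MetadataVersionNumber")
OPTIONAL = (
    "Subject", "Contributor", "Date", "Language", "ResourceType",
    "AlternateIdentifier", "RelatedIdentifier", "Size", "Format",
    "Version", "Rights", "Description",
)


def check_record(record: dict) -> tuple:
    """Return (missing mandatory properties, unrecognised property names)."""
    known = set(MANDATORY) | set(ADMINISTRATIVE) | set(OPTIONAL)
    # A mandatory property counts as missing if absent or empty.
    missing = [name for name in MANDATORY if not record.get(name)]
    unknown = sorted(name for name in record if name not in known)
    return missing, unknown


if __name__ == "__main__":
    record = {
        "Identifier": "10.1594/UoP.MS.298",
        "Creator": "University of Poppleton",
        "Title": "Precipitation measurements 1905-2010 taken at "
                 "Western Bank weather station",
        "PublicationYear": 2011,
        "Colour": "blue",  # not a schema property, used to show the check
    }
    missing, unknown = check_record(record)
    print("Missing mandatory properties:", missing)
    print("Unrecognised properties:", unknown)
```

Here the check reports `Publisher` as missing and `Colour` as unrecognised, which is the kind of feedback a repository could give a depositor before requesting a DOI.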

Page 25: Open access data

Citation of research data (3)

• Issues include:

– At what granularity should data be made citable?

– How to credit each contributor in a dataset that is assembled from very many contributions?

– Where in a research paper should a data citation be given (e.g. a paper describing a dataset versus subsequent papers using it)?

– What to do with frequently updated data?

• For more guidance on these matters, see:

– Ball, A., & Duke, M. (2011a). Data Citation and Linking. DCC Briefing Papers. Edinburgh: Digital Curation Centre. Retrieved from http://www.dcc.ac.uk/resources/briefing-papers/introduction-curation/data-citation-and-linking

– Ball, A., & Duke, M. (2011b). How to Cite Datasets and Link to Publications. DCC How-To Guides. Edinburgh: Digital Curation Centre. Retrieved from http://www.dcc.ac.uk/resources/how-guides/cite-datasets

May-13

Learning material produced by RDMRose

http://www.sheffield.ac.uk/is/research/projects/rdmrose

Page 26: Open access data

New ways of measuring “impact”

• Royal Society, Science as an open enterprise (p. 73)

– “Seek ways of recognising and rewarding creative and novel ways of communal working, by using appropriately validated social metrics”

• Social metrics = alternative metrics = Altmetrics (Jason Priem):

– “Altmetrics measure the number of times a research output gets cited, tweeted about, liked, shared, bookmarked, viewed, downloaded, mentioned, favourited, reviewed, or discussed. It harvests these numbers from a wide variety of open source web services that count such instances, including open access journal platforms, scholarly citation databases, web-based research sharing services, and social media.” - http://aoasg.org.au/altmetrics-and-open-access-a-measure-of-public-interest/

– Gives more rapid feedback on “impact” than the bibliometric evaluation of research papers, and records a wider range of usage types (e.g. Priem et al., 2012)

• Example:

– Impact Story: http://impactstory.org/

Page 27: Open access data

Summing up

• Open Science is now on the agenda of many policy makers and scientists

• The data policies of funding bodies (e.g. RCUK) increasingly stress the importance of making publicly-funded research data available for others to use

– See also: US Office of Science and Technology Policy, Expanding Public Access to the Results of Federally Funded Research (2013)

• Data publication and citation are being promoted as a means to align research data with the impact metrics collected for other kinds of research outputs

• There is significant interest in developing new ways of measuring impact (e.g. Altmetrics)

Page 28: Open access data

References

• Mooney, H, Newton, MP. (2012). The Anatomy of a Data Citation: Discovery, Reuse, and Credit. Journal of Librarianship and Scholarly Communication 1(1):eP1035. doi:10.7710/2162-3309.1035

• Priem, J, Piwowar, HA, Hemminger, BM. (2012). Altmetrics in the Wild: Using Social Media to Explore Scholarly Impact. arXiv:1203.4745v1

• Tenopir, C, Allard, S, Douglass, K, Aydinoglu, AU, Wu, L, et al. (2011). Data Sharing by Scientists: Practices and Perceptions. PLoS ONE 6(6): e21101. doi:10.1371/journal.pone.0021101

Page 29: Open access data

Questions?

Page 30: Open access data

Acknowledgments

• The Digital Curation Centre (DCC) is a world-leading centre of expertise in digital information curation with a focus on building capacity, capability and skills for research data management across the UK's higher education research community. The DCC is funded by JISC.

• More information is available from: http://www.dcc.ac.uk/

• UKOLN receives support from JISC and the University of Bath, where it is based.

• More information is available from: http://www.ukoln.ac.uk/

Page 31: Open access data

Thank you!

And what the dead had no speech for, when living,
They can tell you, being dead: the communication
Of the dead is tongued with fire beyond the language of the living.

(T. S. Eliot, Little Gidding)