Research data: burden or treasure? (Talk from #fote13)



A talk at #fote13 (fote-conference.com) about why we should *all* - as taxpayers - care about reuse of research data

Transcript of Research data: burden or treasure? (Talk from #fote13)

Page 1: Research data: burden or treasure? (Talk from #fote13)

Research data: burden or treasure?

Kevin Ashley Digital Curation Centre

www.dcc.ac.uk

@kevingashley

[email protected]

Reusable with attribution: CC-BY

The DCC is supported by Jisc & FP7

Page 2: Research data: burden or treasure? (Talk from #fote13)


164 universities in UK*

*2011 HESA data

2013-10-11 Kevin Ashley – FOTiE 2013 - CC-BY

71 (43%) > 5% research income

115 (70%) > £1m income from research

Page 3: Research data: burden or treasure? (Talk from #fote13)

£4.4 billion total research grants

Page 4: Research data: burden or treasure? (Talk from #fote13)

Funders are making demands

Page 5: Research data: burden or treasure? (Talk from #fote13)


http://www.epsrc.ac.uk/about/standards/researchdata/Pages/expectations.aspx

EPSRC expects all those institutions it funds
• to develop a roadmap that aligns … with EPSRC’s expectations by 1st May 2012;
• to be fully compliant … by 1st May 2015.

Page 6: Research data: burden or treasure? (Talk from #fote13)

2012-06-15 Kevin Ashley, DCC; IRWM12, ULCC; CC-BY

• Awareness of regulatory environment
• Data access statement
• Policies and processes
• Data storage
• Structured metadata descriptions
• DOIs for data
• Securely preserved for a minimum of 10 years from last use
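As an illustration only, the last expectation can be read as a rolling clock: each use of a dataset pushes its earliest disposal date out by another 10 years. The Python sketch below assumes that reading. The DatasetRecord fields, the function names and the example DOI are hypothetical and are not taken from EPSRC's text.

```python
from dataclasses import dataclass
from datetime import date

MIN_RETENTION_YEARS = 10  # slide: "a minimum of 10 years from last use"


@dataclass
class DatasetRecord:
    """Hypothetical minimal record; field names are illustrative only."""
    title: str
    doi: str               # placeholder identifier, not a real DOI
    access_statement: str
    last_used: date


def record_use(record: DatasetRecord, when: date) -> None:
    """Register a use; only a later date moves the retention clock."""
    if when > record.last_used:
        record.last_used = when


def earliest_disposal_date(record: DatasetRecord,
                           years: int = MIN_RETENTION_YEARS) -> date:
    """Earliest date the data could be disposed of under a rolling rule."""
    d = record.last_used
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February with a non-leap target year: round up to 1 March.
        return date(d.year + years, 3, 1)


record = DatasetRecord(
    title="Example survey data",
    doi="10.1234/example",
    access_statement="Available on request",
    last_used=date(2013, 10, 11),
)
print(earliest_disposal_date(record))  # 2023-10-11
record_use(record, date(2020, 2, 29))
print(earliest_disposal_date(record))  # 2030-03-01
```

The point of the sketch is that "from last use" makes retention open-ended: a dataset that keeps being reused is never eligible for disposal.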

Page 7: Research data: burden or treasure? (Talk from #fote13)


How much data do we have?

• Edinburgh – provision for 5 Petabytes
• Oxford – guessing 3 PB/year
• For comparison – LHC @ CERN – 15 PB/year
• £2m investment in storage not unusual


Page 8: Research data: burden or treasure? (Talk from #fote13)

The Data Deluge is upon us

Sensors’ ability to produce data outstrips IT’s ability to process it

Page 9: Research data: burden or treasure? (Talk from #fote13)

Research Data Centres – the solution!

MANY AREAS OF RESEARCH HAVE NO DATA CENTRE TO SERVE THEM

Page 10: Research data: burden or treasure? (Talk from #fote13)


Cloud – sorted!

• Sorry, but it isn’t.
• See David Rosenthal’s analysis of the economics of Amazon for preservation: “Distributed digital preservation in the cloud”, IJDC 8(1), 2013, doi:10.2218/ijdc.v8i1.248


Page 11: Research data: burden or treasure? (Talk from #fote13)


Cost of data for 100 years – local vs Amazon S3

Data from blog.dshr.org/2013/01/talk-at-idcc2013.html © David Rosenthal, used under CC-BY-SA licence


Page 12: Research data: burden or treasure? (Talk from #fote13)


Cost of data for 100 years – local vs Amazon S3 AND Glacier

Data from blog.dshr.org/2013/01/talk-at-idcc2013.html © David Rosenthal, used under CC-BY-SA licence


Page 13: Research data: burden or treasure? (Talk from #fote13)


That looks like a problem

• Funder requirements exist for a reason:
  – That data is valuable
• Value to funder, society from reuse
• Value to the institution is there also


BIS business case: £1.5m investment in research data services pays back 2.5 times after 5 years
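One way to read the BIS figure, as a rough sketch: if "pays back 2.5 times" means a gross return of 2.5 times the sum invested (an assumption, since the slide does not say whether the figure is gross or net), the arithmetic works out as below. The function and key names are illustrative.

```python
def payback(investment: float, multiple: float, years: int) -> dict:
    """Gross return, net gain and average yearly return for a payback multiple.

    Assumes the multiple is gross, i.e. it includes the original investment.
    """
    gross = investment * multiple
    return {
        "gross_return": gross,
        "net_gain": gross - investment,
        "average_per_year": gross / years,
    }


# Figures from the slide: £1.5m invested, 2.5x payback, 5 years.
result = payback(1_500_000, 2.5, 5)
print(result)
```

On that reading, £1.5m returns £3.75m over the five years, a net gain of £2.25m.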

Page 14: Research data: burden or treasure? (Talk from #fote13)


Integrity

• Not everyone publishes here

• Almost all fraud connected to unavailable data

• People suffer & die due to research fraud

• When your research is reproducible – it gets cited


Page 15: Research data: burden or treasure? (Talk from #fote13)


Citability

• Making data available increases citations
• Everyone – academic, funder, institution – loves citations
• Want evidence?
  – Alter, Pienta, Lyle – 240%, social sciences *
  – Piwowar, Vision – 9% (microarray data) †
  – Henneken, Accomazzi – 20% (astronomy) #


† Piwowar H, Vision TJ. (2013) Data reuse & the open data citation advantage. PeerJ PrePrints 1:e1v1 http://dx.doi.org/10.7287/peerj.preprints.1v1

* Amy Pienta, George Alter, Jared Lyle, (2010) The Enduring Value of Social Science Research: The Use and Reuse of Primary Research Data. http://hdl.handle.net/2027.42/78307

# Edwin Henneken, Alberto Accomazzi, (2011) Linking to Data - Effect on Citation Rates in Astronomy. http://arxiv.org/abs/1111.3618

Page 16: Research data: burden or treasure? (Talk from #fote13)


Value in the institution

• New research depends on the old – well-managed data resources are like well-equipped labs

• Teaching more effective when real data from research is used


Page 17: Research data: burden or treasure? (Talk from #fote13)


Wherever it is, it has value

Want a 400% -> 1200% return on your investment?


Try BADC!

http://www.jisc.ac.uk/whatwedo/programmes/di_directions/strategicdirections/badc.aspx

Page 18: Research data: burden or treasure? (Talk from #fote13)

Commercial services

Page 19: Research data: burden or treasure? (Talk from #fote13)


Can we find it?

• Data must be discoverable to be reused
• Alone, or in conjunction with publication
• Institutional catalogues, national data registries – JISC is piloting through DCC

Page 20: Research data: burden or treasure? (Talk from #fote13)


Page 21: Research data: burden or treasure? (Talk from #fote13)


Page 22: Research data: burden or treasure? (Talk from #fote13)

Jisc – through DCC – can help

Page 23: Research data: burden or treasure? (Talk from #fote13)

http://dataintelligence.3tu.nl/en/home/

http://www.sheffield.ac.uk/is/research/projects/rdmrose

Choice of RDM training materials for librarians

Up-skilling for data

http://datalib.edina.ac.uk/mantra/libtraining.html


Page 24: Research data: burden or treasure? (Talk from #fote13)

[Diagram: Idea, Develop, Fund, Plan, Record, Process, Publish, Read]

Page 25: Research data: burden or treasure? (Talk from #fote13)

[Diagram, shown twice: Idea, Develop, Fund, Plan, Record, Process, Publish, Read]

Page 26: Research data: burden or treasure? (Talk from #fote13)

[Diagram: Idea, Develop, Fund, Plan, Record, Process, Publish, Read]

Page 27: Research data: burden or treasure? (Talk from #fote13)


Data reuse stories

• The palaeontologist who saved years of work with archaeological data

• The ‘noise’ from research radar that mapped dust from Eyjafjallajökull

• The 19th-century logs and photographs that help us model climate change


Often your data tells stories that your publications do not

Page 28: Research data: burden or treasure? (Talk from #fote13)

3TU treasure chest

Page 29: Research data: burden or treasure? (Talk from #fote13)


Thanks for your attention

[email protected]

@kevingashley


Page 30: Research data: burden or treasure? (Talk from #fote13)


DCC ‘institutional engagement’

Assess needs

Make the case

Develop support and services

RDM policy development

Customised Data Management Plans

DAF & CARDIO assessments

Guidance and training

Workflow assessment

DCC support team

Advocacy with senior management

Institutional data catalogues

Pilot RDM tools

…and support policy implementation