Research data: burden or treasure? (Talk from #fote13)
Research data: burden or treasure?
Kevin Ashley Digital Curation Centre
www.dcc.ac.uk
@kevingashley
Reusable with attribution: CC-BY
The DCC is supported by Jisc & FP7
164 universities in UK*
*2011 HESA data
2013-10-11 Kevin Ashley – FOTiE 2013 - CC-BY
71 (43%) > 5% research income
115 (70%) > £1m income from research
£4.4 billion total research grants
Funders are making demands
http://www.epsrc.ac.uk/about/standards/researchdata/Pages/expectations.aspx
EPSRC expects all those institutions it funds
• to develop a roadmap that aligns … with EPSRC’s expectations by 1st May 2012;
• to be fully compliant … by 1st May 2015.
2012-06-15 Kevin Ashley, DCC; IRWM12, ULCC; CC-BY
• Awareness of regulatory environment
• Data access statement
• Policies and processes
• Data storage
• Structured metadata descriptions
• DOIs for data
• Securely preserved for a minimum of 10 years from last use
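The retention expectation ("a minimum of 10 years from last use") restarts the clock every time the data is accessed. A minimal sketch of that rule follows; the function names are hypothetical, and rolling a 29 February anniversary forward to 1 March is an assumption made here so the period is never cut short, not something the EPSRC text specifies.

```python
from datetime import date

RETENTION_YEARS = 10  # "minimum of 10 years from last use" on the slide


def earliest_disposal_date(last_use: date, years: int = RETENTION_YEARS) -> date:
    """Return the first date on which the retention period has elapsed."""
    try:
        return last_use.replace(year=last_use.year + years)
    except ValueError:
        # last_use was 29 February and the target year is not a leap year.
        # Assumption: roll forward to 1 March rather than back to 28 February.
        return date(last_use.year + years, 3, 1)


def must_still_be_kept(last_use: date, today: date) -> bool:
    """True while the dataset is still inside its retention window."""
    return today < earliest_disposal_date(last_use)
```

A dataset last used on the day of this talk (2013-10-11) would need to be kept until at least 2023-10-11, and any later access pushes that date out again.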
How much data do we have?
• Edinburgh – provision for 5 petabytes
• Oxford – guessing 3 PB/year
• For comparison – LHC @ CERN – 15 PB/year
• £2m investment in storage not unusual
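To get a feel for these figures, a rough sketch of how long a fixed provision lasts at a steady intake rate. Pairing Edinburgh's 5 PB provision with Oxford's guessed growth rate is purely illustrative (they are different institutions), and the function name is hypothetical.

```python
def years_until_full(capacity_pb: float, growth_pb_per_year: float) -> float:
    """Years before a fixed store fills, assuming constant growth and no deletion."""
    if growth_pb_per_year <= 0:
        raise ValueError("growth rate must be positive")
    return capacity_pb / growth_pb_per_year


# Figures quoted on the slide
EDINBURGH_PROVISION_PB = 5
OXFORD_GUESS_PB_PER_YEAR = 3
LHC_PB_PER_YEAR = 15

if __name__ == "__main__":
    print(round(years_until_full(EDINBURGH_PROVISION_PB, OXFORD_GUESS_PB_PER_YEAR), 1))
    print(round(years_until_full(EDINBURGH_PROVISION_PB, LHC_PB_PER_YEAR), 2))
```

Under these assumptions a 5 PB store would fill in under two years at 3 PB/year, and in about four months at the LHC's rate.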
The Data Deluge is upon us
Sensors’ ability to produce data outstrips IT’s ability to process it
Research Data Centres – the solution!
MANY AREAS OF RESEARCH HAVE NO DATA CENTRE TO SERVE THEM
Cloud – sorted!
• Sorry, but it isn’t.
• See David Rosenthal’s analysis of the economics of Amazon for preservation: “Distributed digital preservation in the cloud”, IJDC 8(1), 2013, doi:10.2218/ijdc.v8i1.248
Cost of data for 100 years – local vs Amazon S3
Data from blog.dshr.org/2013/01/talk-at-idcc2013.html © David Rosenthal, used under CC-BY-SA licence
Cost of data for 100 years – local vs Amazon S3 AND Glacier
Data from blog.dshr.org/2013/01/talk-at-idcc2013.html © David Rosenthal, used under CC-BY-SA licence
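A 100-year comparison like this is very sensitive to how quickly each option's price falls year on year. The sketch below is not Rosenthal's model and uses none of his data; it is a deliberately simplified illustration (fixed annual price decline, no discounting, no migration costs), and every number in the usage example is invented.

```python
def cost_over_horizon(first_year_cost: float,
                      annual_price_decline: float,
                      years: int = 100) -> float:
    """Total spend when the yearly cost falls by a fixed fraction each year.

    first_year_cost: cost of keeping the data in year 1 (any currency unit)
    annual_price_decline: e.g. 0.2 means next year costs 80% of this year
    """
    if not 0.0 <= annual_price_decline < 1.0:
        raise ValueError("annual_price_decline must be in [0, 1)")
    total = 0.0
    yearly = first_year_cost
    for _ in range(years):
        total += yearly
        yearly *= 1.0 - annual_price_decline
    return total


if __name__ == "__main__":
    # Invented figures: same starting cost, different rates of price decline.
    fast = cost_over_horizon(100.0, annual_price_decline=0.20)
    slow = cost_over_horizon(100.0, annual_price_decline=0.03)
    print(f"fast-declining price: {fast:.0f}")
    print(f"slow-declining price: {slow:.0f}")
```

Even with an identical starting price, the option whose price falls more slowly ends up costing several times more over the century, which is why the assumed rate of decline matters more than today's price.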
That looks like a problem
• Funder requirements exist for a reason:
  – That data is valuable
• Value to funder, society from reuse
• Value to the institution is there also
BIS business case: £1.5m investment in research data services pays back 2.5 times after 5 years
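As a quick worked version of the BIS figure: the sketch below reads "pays back 2.5 times" as a gross return of 2.5 times the investment, which is an interpretation, not something the slide spells out. The function name is hypothetical.

```python
def payback(investment_gbp: float, multiple: float) -> tuple[float, float]:
    """Return (gross_return, net_gain) for an investment repaid `multiple` times over."""
    gross = investment_gbp * multiple
    return gross, gross - investment_gbp


if __name__ == "__main__":
    gross, net = payback(1_500_000, 2.5)
    print(f"gross return after 5 years: £{gross:,.0f}")
    print(f"net gain after 5 years:     £{net:,.0f}")
```

On that reading, £1.5m invested returns £3.75m over five years, a net gain of £2.25m.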
Integrity
• Not everyone publishes here
• Almost all fraud connected to unavailable data
• People suffer & die due to research fraud
• When your research is reproducible – it gets cited
Citability
• Making data available increases citations
• Everyone – academic, funder, institution – loves citations
• Want evidence?
  – Alter, Pienta, Lyle – 240%, social sciences *
  – Piwowar, Vision – 9% (microarray data) †
  – Henneken, Accomazzi – 20% (astronomy) #
* Pienta A, Alter G, Lyle J. (2010) The Enduring Value of Social Science Research: The Use and Reuse of Primary Research Data. http://hdl.handle.net/2027.42/78307
† Piwowar H, Vision TJ. (2013) Data reuse & the open data citation advantage. PeerJ PrePrints 1:e1v1. http://dx.doi.org/10.7287/peerj.preprints.1v1
# Henneken E, Accomazzi A. (2011) Linking to Data - Effect on Citation Rates in Astronomy. http://arxiv.org/abs/1111.3618
Value in the institution
• New research depends on the old – well-managed data resources are like well-equipped labs
• Teaching more effective when real data from research is used
Wherever it is, it has value
Want a 400% to 1200% return on your investment?
Try BADC!
http://www.jisc.ac.uk/whatwedo/programmes/di_directions/strategicdirections/badc.aspx
Commercial services
Can we find it?
• Data must be discoverable to be reused
• Alone, or in conjunction with publication
• Institutional catalogues, national data registries – Jisc is piloting through DCC
Jisc – through DCC – can help
Up-skilling for data
Choice of RDM training materials for librarians
http://dataintelligence.3tu.nl/en/home/
http://www.sheffield.ac.uk/is/research/projects/rdmrose
http://datalib.edina.ac.uk/mantra/libtraining.html
Idea → Develop → Fund → Plan → Record → Process → Publish → Read
Data reuse stories
• The palaeontologist who saved years of work with archaeological data
• The ‘noise’ from research radar that mapped dust from Eyjafjallajökull
• The 19th-century logs and photographs that help us model climate change
Often your data tells stories that your publications do not
3TU treasure chest
Thanks for your attention
[email protected]
@kevingashley
DCC ‘institutional engagement’
Assess needs
Make the case
Develop support and services
RDM policy development
Customised Data Management Plans
DAF & CARDIO assessments
Guidance and training
Workflow assessment
DCC support team
Advocacy with senior management
Institutional data catalogues
Pilot RDM tools
…and support policy implementation