Page 1: Data: institutional & data centre roles The DCC experience · e-irg.eu/documents/10920/260645/eirg-20140609-ashley.pdf

Data: institutional & data centre roles The DCC experience

Kevin Ashley Digital Curation Centre

www.dcc.ac.uk @kevingashley

[email protected]

Reusable with attribution: CC-BY The DCC is supported by Jisc


A summary

• Why data reuse?

• What stops us?

• How data management helps

• Barriers & costs

• The case for reuse - again

2014-06-09 Kevin Ashley –e-IRG2014 - CC-BY 2


My home – the DCC

• Mission – to increase capability and capacity for research data services in UK institutions

• Not just a UK problem – an international one

• Training, shared services, guidance, policy, standards, futures


DCC guidance


[Figure labels: Sweden, Denmark, Canada]


Data reuse stories

• The palaeontologist who saved years of work with archaeological data


What a palaeontologist looks at

[Figure: timeline from now to 100 million years ago, marked at 1m, 25m, 50m and 75m years]


What a palaeontologist looks at

[Figure: timeline from now to 100 million years ago, marked at 1m, 25m, 50m and 75m years; a second timeline from now to 1 million years, marked at 750,000, 500,000 and 100,000 years]


What an archaeologist looks at

[Figure: timeline from now to 1 million years, marked at 750,000, 500,000 and 100,000 years; a second timeline from now to 100,000 years ago, marked at 75,000, 50,000 and 25,000 years]


Data reuse stories

• The palaeontologist who saved years of work with archaeological data

• The 19th-century ships’ logs that help us model climate change


The Old Weather project

Data for research, not from research


Data reuse stories

• The palaeontologist who saved years of work with archaeological data

• The 19th-century ships’ logs that help us model climate change

• The ‘noise’ from research radar that mapped dust from Eyjafjallajökull


Data reuse - messages


Often your data tells stories that your publications do not

Not all data comes from other researchers

One person’s noise is another person’s signal

Discipline-bounded data discovery doesn’t give us all we need or want


Why care?

• Data is expensive – an investment

• Reuse:

– More research

– Teaching & Learning

– Planning

• Impact – with or without publication

• Accountability

• Legal & regulatory requirements


Why does this matter?

• Research quality – How close can we get to the truth?

• Research speed – How quickly can we get to the truth?

• Research finance – How much does the truth cost?

• Improving one or more of these is of interest to all actors:

– Researchers as data creators

– Researchers as data reusers

– Research institutions

– Funders – hence government and society


G8 UK – endorses OA (open access)

Open Data Charter

Policy paper, 18 June 2013


Funder requirements

• UK

• USA – NSF, NEH, NIH

• Europe

• Most place burden on researcher – some on the institution


http://www.epsrc.ac.uk/about/standards/researchdata/Pages/policyframework.aspx


RCUK policy - The 1-minute version

• Research data are a public good – make openly available in timely & responsible way

• Have policies & plans. Data with long-term value should be preserved & usable

• Metadata for discovery & reuse. Link publications & data

• Sometimes law, ethics get in the way. We understand.

• Limited embargos OK. Recognition is important – always cite data sources

• OK to use public money to do this. Do it efficiently.


EPSRC policy points

• Awareness of regulatory environment

• Data access statement

• Policies and processes

• Data storage

• Structured metadata descriptions

• DOIs for data

• Securely preserved for a minimum of 10 years from last use


Compliance expected by 2016
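
As a rough illustration of the "minimum of 10 years from last use" point above, the Python sketch below shows how a repository might track the earliest date on which a dataset's minimum retention period has elapsed. The function names and the handling of 29 February are assumptions made for this example; they are not taken from the EPSRC policy text.

```python
from datetime import date

# Minimum retention period, taken from the policy point above.
RETENTION_YEARS = 10


def earliest_disposal_date(last_use: date, years: int = RETENTION_YEARS) -> date:
    """Return the date on which the minimum retention period has elapsed."""
    try:
        return last_use.replace(year=last_use.year + years)
    except ValueError:
        # last_use is 29 February and the target year is not a leap year.
        # Falling back to 28 February is an assumption of this sketch.
        return last_use.replace(year=last_use.year + years, day=28)


def record_access(current_last_use: date, access_day: date) -> date:
    """Each new use restarts the clock, so keep the later of the two dates."""
    return max(current_last_use, access_day)


if __name__ == "__main__":
    last_use = record_access(date(2014, 6, 9), date(2015, 3, 1))
    print(earliest_disposal_date(last_use))
```

The point of the sketch is that the retention clock is tied to use, not to deposit: a dataset that keeps being accessed never reaches its disposal date.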


DCC Policy Summary

http://www.dcc.ac.uk/resources/policy-and-legal


Compliance

Benefits


Some institutional roles

• Leadership – coordinate action

• Audit – who has what, where does it go?

• Advice on access – data, wherever it is

• Preservation – permanence

• Citability

• Data/publication linking

• Promoting data in teaching

• Selection

• Education – early career researchers


Who (in the UK) is addressing RDM?

Library

IT

Research Office


How?

• Create policy – collaborate with others

• Develop existing digital services

• Learn about audit tools (DCC & others)

• Learn about data & sources

• Reskill subject librarians

• Learn about your own data

• Bridge between publishers & researchers


Understanding Data Requirements

http://www.dcc.ac.uk/


What stops data reuse

• Loss

• Destruction

• Pride

• Gluttony

• Ineptitude

• Concealment

• Bureaucracy

• Complexity

• Procrastination

• Lack of potential


“Departments don’t have guidelines or norms for personal back-up and researcher procedure, knowledge and diligence varies tremendously. Many have experienced moderate to catastrophic data loss”

Incremental Project Report, June 2010

http://www.flickr.com/photos/mattimattila/3003324844/


Roles and Responsibilities

What data to keep


Excuses – and responses

• “People will ask questions”

– So use a data centre or repository

• “It will be misinterpreted”

– Stuff happens. Also, openness encourages correction

• “It’s not interesting”

– Let others be the judge – your noise is my signal

• “I might get another paper out of it”

– Up to a point. We might get more research out of it

• “I don’t have permission”

– A real problem. But solvable at senior level

• “It’s too bad/complicated”

– See above

• “It’s not a priority”

– Unfortunately, funders are making it so. But if you looked at the evidence, it would be your priority as well


See e.g. Carly Strasser’s blog: http://datapub.cdlib.org/2013/04/24/closed-data-excuses-excuses/


Data centres are good value!

• See Jisc reports on ADS, BADC, UKDA:

• Returns on investment between 400% and 1200%


What about collaboration?

• Collaborate within the university

• Collaborate with partners

• Collaborate with regional, national services

• Not everything can be done well locally

• Infrastructure needed at research group, institution, national, (discipline) & international level


Pimp your data – make it findable & reusable

gking.harvard.edu/data


On costs

• Costs of data curation relatively simple to measure: see work of 4C (4cproject.eu)

• Charging and payment are more complex

• Funder rules can lead to perverse, inefficient payment systems

• Fundamental question is ‘who pays’. This changes the answer to ‘what does it cost’


Commercial services


The UK funding model

[Diagram: two funding sources]

• Research Council – project funding, to PI

• Higher Education Funding Council – annual funding, per institution

The complexities appear in every country, just in different ways


What it means

• Project funding can only be spent during projects on direct project costs

• Project funding comes with overheads, which universities must use for research infrastructure

• Ongoing (‘QR’) money is continuous, relates to research ranking

• Important to distinguish business-as-usual from exceptional requirements


A research lifecycle

[Figure: resources plotted against time. Labels: exceptional zone, normal zone, project end point, business-as-usual threshold, eligible for project funding]


Funders’ view

We have money

We have rules about how you use money to meet requirements

We have requirements

Over to you!


Being clever with costs

• Ongoing costs beyond project end cannot be charged to a grant, but…

• ‘Pay once, store forever’ charges are acceptable.

• Thus, incentive to outsource long-term curation

• Yet universities are only acting as last-resort option in any case – discipline data archives preferred

Many of these are run by funders
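
One way to see how a single ‘pay once, store forever’ charge can be finite is to assume that the yearly cost of storing a dataset falls by a constant factor each year, so the total is the sum of a geometric series. The Python sketch below works under that assumption only; the function name, the parameters and the example figures are hypothetical and do not describe any particular funder's or institution's charging model.

```python
def pay_once_charge(first_year_cost: float, annual_cost_ratio: float) -> float:
    """One-off charge covering storage indefinitely.

    first_year_cost:   cost of storing the data in year one.
    annual_cost_ratio: next year's cost divided by this year's cost
                       (assumed constant, and below 1 so the series converges).

    Total = c + c*r + c*r**2 + ... = c / (1 - r)
    """
    if not 0 <= annual_cost_ratio < 1:
        raise ValueError("annual_cost_ratio must be at least 0 and below 1")
    return first_year_cost / (1 - annual_cost_ratio)


if __name__ == "__main__":
    # Hypothetical figures: 100 units in year one, costs halving every year.
    print(pay_once_charge(100.0, 0.5))
```

Under this assumption the charge is a fixed multiple of the first-year cost, which is why it can be paid during the project even though the storage continues after the project ends.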


Closing thoughts

• Library/data centre roles:

– selecting content

– protecting it

– enabling and encouraging reuse

– assisting with data management planning

• Library:

– helping users find the most relevant content – much research data does not come from research

• Data centre:

– setting standards

– enabling uptake


Infrastructure levels

• Truly international – instruments, standards

• National variation, international core:

– Training

– Data management planning

– Policy

– ..


My message to researchers

• The credit belongs to you

• The data belongs to all of us

• Share, and we all reap the benefits
