Introduction to Research Data Management
Oxford Brookes University, Faculty of Technology, Design & Environment
Dr Angus Whyte, DCC
27th Sept 2012
This work is licensed under a Creative Commons Attribution 2.5 UK: Scotland License
The Digital Curation Centre
• Consortium of 3 units in Universities of Bath (UKOLN), Edinburgh (DCC Centre) and Glasgow (HATII)
• Launched 1st March 2004
• National centre since 2004 – address challenges in digital curation that cross institutions or disciplines
• Funded by JISC, plus HEFCE funding from 2011 for:
– support to national cloud services
– targeted institutional development
DCC Mission
“What’s it got to do with me?” Drivers and benefits to HEIs developing infrastructure and services to support research data management.
Introduction
• What is research data management?
• Why is it important?
• What risks does it address?
• What benefits does it provide?
• What is good practice?
What is Research Data Management?
Caring for, facilitating access to, preserving and adding value to research data throughout its lifecycle.
The organisation, resources and technology required to support and sustain this.
What Kinds of Data?
…whatever is produced in research or evidences its outputs
RDM… data-centred project management
• Planning data management
• Creating data
• Naming and describing
• Storing active data
• Selecting or disposing
• Depositing and sharing
• Protecting sensitive data
• Licensing access
An emerging art for institutions
A design space bounded by two principles… and constraints (£££, REF)
*Jo Walsh & Rufus Pollock, Open Knowledge Foundation, http://www.okfn.org/files/talks/xtech_2007/
Why is RDM Important?
“Rapid and pervasive technological change has created new ways of acquiring, storing, manipulating and transmitting vast data volumes, as well as stimulating new habits of communication and collaboration amongst scientists. These changes challenge many existing norms of scientific behaviour”
Convergence in research policy
Why is RDM Important?
“We have opened up much public data already, but need to go much further in making this data accessible. We believe publicly funded research should be freely available. We have commissioned independent groups of academics and publishers to review the availability of published research, and to develop action plans for making this freely available”
Convergence in research policy
Policy moves towards openness
The Organisation for Economic Co-operation and Development describes data as a public good that should be made available
Research Councils UK, in its code of good research conduct, says data should be preserved and accessible for 10+ years
Research funder data policies are increasingly demanding of institutional commitment and provision...
RCUK Common Principles on Data Policy
Public good: Publicly funded research data are produced in the public interest and should be made openly available with as few restrictions as possible
Planning for preservation: Institutional and project specific data management policies and plans needed to ensure valued data remains usable
Discovery: Metadata should be available and discoverable; Published results should indicate how to access supporting data
Confidentiality: Research organisation policies and practices to ensure legal, ethical and commercial constraints assessed; research process should not be damaged by inappropriate release
First use: Provision for a period of exclusive use, to enable research teams to publish results
Recognition: Data users should acknowledge data sources and terms & conditions of access
Public funding: Use of public funds for RDM infrastructure is appropriate and must be efficient and cost-effective.
http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
Funder Expectations
EPSRC expects all those institutions it funds
• to develop a roadmap that aligns their policies and processes with EPSRC’s expectations by 1st May 2012;
• to be fully compliant with these expectations by 1st May 2015.
• Compliance will be monitored and non-compliance investigated.
• Failure to share research data could result in the imposition of sanctions.
Funder Expectations
Applications submitted on or after 1st November 2012 will need to take account of the new guidance and application form requirements.
The key changes are that:
All proposals will be required to contain …a new ‘Technical Summary’
Those with digital outputs or digital technologies that are essential to their planned research outcomes will be expected to submit a technical attachment.
Current technical appendix section of the Je-S form will be removed.
http://www.ahrc.ac.uk/News-and-Events/News/Pages/Changes-to-all-AHRC-Research-Grant-and-Fellowships-applications.aspx
Data Policies by Funder
http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
It’s not just top-down!
• Data intensive research
• Demand from public to engage, criticise
• Citizen science – new stakeholders in research
• Digital engagement and open data in creative industries and built environment
• Demands more planning and support
From citizen science…
Responding to more demand for public engagement
Crowd sourcing discovery
Increases complexity of data management
…to digital engagement
Established in e.g. planning and creative industries
New opportunities from open data
Public demand for data & engagement
“We have opened up much public data already, but need to go much further in making this data accessible. We believe publicly funded research should be freely available. We have commissioned independent groups of academics and publishers to review the availability of published research, and to develop action plans for making this freely available”
Open data in public governance
Open data in art and design
Bus routes data sculpture
• “a 3D data sculpture of the Sunday Minneapolis / St. Paul public transit system, where the horizontal axes represent directional movement and the vertical represents time. the piece titled "bus structure 2am-2pm" is constructed of 47 horizontal layers, each forming a map of the bus routes that run during a given interval of time. looking down from the top, one sees the Sunday bus map of the Twin Cities, while looking from the side, the times appears as strata building upwards. within each layer, every transit route that operates at that time is represented by wood balls placed at its scheduled stops, connected by the horizontal copper rods. each route moves through time and space differently, carving out its own trail that may or may not meet conveniently with other routes.
• in total 42 routes, 47 intervals of time & 296 bus stops are depicted by about a half-mile of copper rod & 6,000 wood balls, suspended in the air by hundreds of blue threads.”
http://infosthetics.com/archives/2008/05/bus_routes_data_sculpture.html
Reusing public data to create an object with reuse value?
…. University information
http://data.southampton.ac.uk/
…. and scholarly publication
http://openarchaeologydata.metajnl.com/
DOI plus citation = career reward for data mgmt
1. Deposit data in repository
2. Submit data paper
•Context
•Method
•Data scope
•Data description
3. Peer reviewed
4. Data & paper DOIs
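The deposit-then-cite steps above end with a DOI that others can quote. As a rough illustration only, the sketch below assembles a citation string from a small metadata record. The `DatasetRecord` fields, the citation layout and all example values (including the placeholder DOI) are assumptions made for this example; they are not taken from the journal or from any repository's required format.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DatasetRecord:
    """Hypothetical minimal description of a deposited dataset."""
    creators: list[str]
    year: int
    title: str
    repository: str
    doi: str  # bare DOI; the value used below is a placeholder, not a real DOI


def format_citation(record: DatasetRecord) -> str:
    """Build a readable citation that ends with a resolvable DOI link."""
    authors = "; ".join(record.creators)
    return (
        f"{authors} ({record.year}). {record.title} [dataset]. "
        f"{record.repository}. https://doi.org/{record.doi}"
    )


# Illustrative values only: names, title, repository and DOI are invented.
example = DatasetRecord(
    creators=["Researcher, A.", "Colleague, B."],
    year=2012,
    title="Example excavation survey data",
    repository="Example Repository",
    doi="10.1234/example",
)
print(format_citation(example))
```

The same record could be reused for the data paper's own DOI, so that the dataset and the paper cite each other consistently.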
Common practice in Universities
‘Departments typically don’t have guidelines or norms for personal back-up and researcher procedure, knowledge and diligence varies tremendously. Many have experienced moderate to catastrophic data loss.’
Incremental Project Scoping Study and Implementation Plan http://www.lib.cam.ac.uk/preservation/incremental/documents/Incremental_Scoping_Report_170910.pdf
‘The current environment is such that responsibility for good data management is devolved to individual researchers and in practice PIs set the 'rules' and establish the cultural practices of the research groups and this means there is good data management practice going on in pockets but no consistency across groups. There is also consequently a high risk of data losses by a number of means’.
MaDAM Project Requirements Analysis http://www.merc.ac.uk/sites/default/files/MaDAM_Requirements%20_%20gap%20analysis-v1.4-FINAL.pdf
Risks if you don’t address…
• Loss of funding
• Legal non-compliance (DPA, FOI, etc.)
• Research integrity, reputation
• Inability to verify, scrutinise
• Loss of data or (re)usability
• Outputs lack visibility
• Diminished public communication
Benefits if you do…
• Secure storage for sensitive data
• Improved access for scholarly communication
• Scrutiny and verification of research
• Research integrity, reputation
• Secondary use and data mining
• Opportunities for collaboration
• Increased visibility, citation
• Knowledge transfer, public communication
Benefits from Infrastructure Projects in JISC MRD http://www.jisc.ac.uk/media/documents/programmes/mrd/RDM_Benefits_FinalReport-Sept.pdf
E.g. MaDAM project
Benefits from Infrastructure Projects in JISC MRD http://www.jisc.ac.uk/media/documents/programmes/mrd/RDM_Benefits_FinalReport-Sept.pdf
Pilot project offering secure storage, description, flexible sharing
•“I can put my hands straight on my data, through one application”
•“I can easily share & find data within my research group”
•“I have support in data management planning”
•“I can publish my data, under my control, with the wider community”
•“I’m not repeating experiments unnecessarily”
•“I’m freed up from some of my data management duties to concentrate on my research”
Researchers spending less time managing data, getting more value for their efforts and freeing more time for research.
HALOGEN (History, Archaeology, Linguistics, Onomastics, GENetics):
Throwing light on the past through cross-disciplinary databasing
http://www.le.ac.uk/halogen
• Portable Antiquities Scheme (British Museum)
• Place-names (Nottingham)
• Surnames
• Genetics
• IT hosting and GIS
• Best practice: #JISCMRD, UKRDS, DCC, RIN, international
Collaboration opportunities from data integration
Direct benefits from HALOGEN
• New research opportunities
– Cross database work – seed new research samples
• Verification, re-purposing, re-use of data
– Cleaning & enhancing private research datasets for reuse & correlation
– Increased transparency
– Excellent training for best practice in research data management
• Increasing research productivity
– Build in cleaning, annotation, enhancement into normal research workflows
– Research datasets may immediately be reusable and interoperable
• Impact & Knowledge Transfer
– Reuse IT infrastructure: EU FP7 Mintweld (industrial engineering) & BRICCS National Health Service/University Trust data sharing.
• Increasing skills base of researchers/students/staff
Data access raises visibility
Data with DOI = citeable research output
Taking it step by step…
• Awareness and training
• ‘Audits’ to assess current assets, practices and requirements, gaps in provision
• Identifying quick wins while developing long-term plan
• Not reinventing: integrating, adapting, augmenting
– e.g. policies, doctoral training, storage
Who to involve?
• Researcher(s)
• Research support officers / project staff
• Lab technicians
• Librarians / Data Centre staff
• Faculty ethics committees
• Institutional legal/IP advisors
• FOI officer / DPA officer / records manager
• Computing support
• Institutional compliance officers
• Funders
• Archive / long-term data repository
• Senior management
• Others...
Thank you!
What are key issues for you…
DCC support activities
Delivering support
Customised Data Management Plans – templates / guidance to be added to DMP Online
Training – institutional/disciplinary tailored courses, online resources
Incremental – repackaging existing support to raise awareness and make guidance more meaningful to researchers
Developing strategic institutional RDM framework
Strategy development – getting key people together to discuss/plan for RDM
Policy development – scoping, defining, embedding research data policies
Costing – assist with the development of costing and pricing for RDM services
Risk management – identify risks in RDM practice and recommend mitigations
Institutional data catalogues - recommend options for exposing metadata about your research data via CRIS systems, repositories, or a mix of these
Needs assessment
CARDIO Tool – collaborative assessment & benchmarking of RDM strengths/weaknesses
Data Asset Framework – interviews to scope current RDM practice and recommend improvements
Workflow assessment – methodology for analysing current RDM workflows
Roles & responsibilities
Liz Lyon, “The Informatics Transform: Re-Engineering Libraries for the Data Decade”, International Journal of Digital Curation, Volume 7, Issue 1, 2012