DAF methodology & Glasgow Uni scoping study

… because good research needs good data

Tools of the Trade Workshop, Manchester, 19th May 2010

Funded by:

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland License. To view a copy of this license, (a) visit http://creativecommons.org/licenses/by-nc-sa/2.5/scotland/; or (b) send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.

DAF methodology & Glasgow Uni scoping study

Sarah Jones

DCC, University of Glasgow

[email protected]

www.data-audit.eu/

Background to DAF project

“JISC should develop a Data Audit Framework to enable all universities and colleges to carry out an audit of departmental data collections, awareness, policies and practice for data curation and preservation”

Liz Lyon, Dealing with Data: Roles, Rights, Responsibilities and Relationships, (2007)

The methodology

http://www.data-audit.eu/DAF_Methodology.pdf

Stage 1: planning

Objective

Determine what you want to find out and prepare work in advance

Process

- Define scope / expected outcomes

- Research organisational context

- Set up survey, interviews, meetings…

Stage 2: identifying data

Objective

Create inventory to understand scale of data

Process

Engage researchers to:

- Identify key data assets

- Classify data to restrict scope
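
The inventory and classification steps above can be pictured as a small asset register. The following is only a minimal sketch: the field names, the three classification labels and the `in_scope` helper are illustrative assumptions, not definitions taken from these slides.

```python
from dataclasses import dataclass
from enum import Enum


class Classification(Enum):
    # Illustrative categories only; the slides do not list the classes used.
    VITAL = 1
    IMPORTANT = 2
    MINOR = 3


@dataclass
class DataAsset:
    # Hypothetical minimal fields for one inventory entry.
    name: str
    owner: str
    classification: Classification


def in_scope(inventory, threshold=Classification.IMPORTANT):
    """Return assets classified at or above the threshold.

    A lower enum value means a more crucial asset.
    """
    return [a for a in inventory if a.classification.value <= threshold.value]


# Made-up example entries, not data from any audit.
inventory = [
    DataAsset("survey responses", "Researcher A", Classification.VITAL),
    DataAsset("site photographs", "Researcher B", Classification.IMPORTANT),
    DataAsset("draft figures", "Researcher A", Classification.MINOR),
]

# Restrict the scope of later assessment to the more crucial assets.
shortlist = in_scope(inventory)
```

Classifying each asset as it is recorded is what lets the in-depth assessment in the next stage concentrate on a shortlist rather than the whole inventory.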

Stage 3: assessing data management

Objective

Identify weaknesses in data management and potential risks

Process

- In-depth assessment of most crucial assets, given purpose of audit

- Discussion on lifecycle of data to assess data management

Stage 4: recommendations

Objective

Recommend changes to improve data management

Process

- Collate audit results

- Analyse data

- Suggest changes to mitigate weaknesses

DAF pilot implementations

• Early test cases: GeoSciences; Archaeology; Mechanical Engineering; Humanities

• University of Edinburgh: Physiology; Divinity; History; Brain Imaging; Astronomy

• University College London: Archaeology; Scandinavian Studies; Physics & Astronomy; Life & Medical Sciences

• Imperial College London: Chemical Engineering; Physics; Business School

• King’s College London: Geography; Psychiatry; Environmental Research; Biomedical and Health Sciences

• DataShare examples: Cardiac group; Dept of International Development; Social Sciences

Workshop on next steps for DAF

• Many of the pilots found the actual process of gathering information on data management was more valuable than the asset register. The DAF approach was felt to be useful for defining requirements to improve data management. (JISC funded RDMI projects)

• A suggestion was made to enhance DAF with practical examples / guidance from the pilot studies. (Implementation Guide)

• Align the DAF process with other data management planning tools. (IDMP project between AIDA, DAF, DRAMBORA, LIFE)

GU scoping studies

• Digital Preservation Advisory Board established at GU in 2008

• Keen to identify scale of digital preservation needs across the uni

• Scoping studies ran in 2009 in:

  - Archaeology
  - Chemistry
  - Corporate Communications
  - Court Office
  - English Language
  - Electronics and Electrical Engineering
  - Evolutionary Ecology and Biology
  - MRC Social and Public Health Sciences Unit

Methodology

• Semi-structured interviews

  - interview framework sent in advance
  - some background research done before interview, e.g. reading staff profile
  - recorded (with permission), then transcribed and sent for comments

• Spoke with HoDs, researchers, teaching, admin and support staff

• Reviewed preliminary findings and increased scope

  - added more PhDs and ECRs, as most researchers we’d spoken to were senior
  - added Corporate Communications for ‘web’ perspective

• Spoke to additional key people at the Uni, e.g. William Nixon, repository manager; James Currall, security expert

Interview framework

1. what digital material is being created

2. how this is being created and maintained

3. any issues that have been encountered

4. plans for the long-term e.g. preservation, reuse

5. requirements for support and services.

http://www.gla.ac.uk/media/media_126658_en.pdf

What did we find?

Pockets of good practice…

… but a lot of confusion and need for support

• to connect data with documentation, we name files using a code number which is the person’s initials, the lab book number in roman numerals and then the experiment number

• We produced documentation workflows on how to take material from the DAT machines, how to transfer these into computer files, guidelines on transcription and anonymisation, and making derivatives. It’s all very well documented which means there is consistency across the team, which is vitally important.

• It makes a huge difference if somebody can come and talk through problems and solutions with you. A personal contact like the RDOs is helpful.
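
The file-naming convention quoted above (initials, lab book number in roman numerals, experiment number) could be generated along these lines. This is a minimal sketch: the hyphen separator, the function names and the example values are assumptions for illustration, since the quote does not say how the three parts are joined.

```python
def to_roman(n: int) -> str:
    """Convert a positive integer below 4000 to a roman numeral."""
    if not 0 < n < 4000:
        raise ValueError("roman numerals are only defined here for 1..3999")
    numerals = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    parts = []
    for value, symbol in numerals:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


def file_code(initials: str, lab_book: int, experiment: int) -> str:
    """Build a code linking a data file to its lab book entry.

    The hyphen separator is an assumption, not part of the quoted convention.
    """
    return f"{initials.upper()}-{to_roman(lab_book)}-{experiment}"


# Hypothetical example: researcher "SJ", lab book 4, experiment 12.
code = file_code("sj", 4, 12)
```

Because the code is derived from the lab book and experiment numbers, anyone holding a data file can find the matching paper documentation without opening the file.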

Procedures for creation & management

• the network has always been the bane of everyone’s lives to find stuff on - you end up opening umpteen files to see if it’s the one you’re after

• Digital images are a classic case in point as many still have the numerical ordering and cryptic letter sequence auto-generated by the camera.

• the paper records system hasn’t transferred easily to the digital

• The volume of data produced makes maintenance a bit like drinking from a fire hose.

• the licence is very expensive and if this weren’t renewed it wouldn’t be possible to continue to access the data

• They had major problems last year moving from ArcGIS 9.1 to 9.3 – everything stopped working as they’d changed the geo-database format. It was not straightforward to fix…

Storage and backup

• Research groups tend to run their own little fiefdom. The correlation seems to be the more computers they have, the less IT expertise there is.

• If they throw some money at the problem they can install another networked drive and the problem goes away for a while

• Insufficient backup space is a recurring problem, but it’s not really a lack of space, it’s more an issue of not being able to control what people store on their hard drives.

• People bring in sticks with 4GB of data on that simply no longer work and nothing can be done to retrieve it.

• large and reliable storage is expensive. You need this for home directories but things that are to be archived or backed up could be punted out of the way to cheaper storage.

Selection / long-term preservation

• It’s one thing to keep something going, but are people still able to use it in the same way?

• If the website comes to an end, the data could still be preserved, but you lose the richness of being able to search that, or see it on a map, or have them synchronised.

• it’s like giving your baby away

• If I know the code will be public I’ll pay more attention to properly annotating it with comments so other people can understand it.

• Probably only one tenth of what’s currently held should be retained.

• How do you decide what can be deleted? I’m not confident to make that decision.

• Archiving is to allow someone else to reuse it

What next…

• DPAB continues to address this at senior management level

• JISC-funded Incremental project (part of MRD programme)

• Ensuring researchers can find guidance and support when needed

• Making data training and guidance more understandable to researchers

• Offering tailored support and partnering

http://www.lib.cam.ac.uk/preservation/incremental/index.html