RoaDMaP Music Case Study Report · 2017-06-13

Project Information

Project Identifier: To be completed by JISC

Project Title: Leeds RoaDMaP (Leeds Research Data Management Pilot)

Project Hashtag: #leedsrdm

Start Date: 01/01/2012

End Date: 30/06/2013

Lead Institution: University of Leeds

Project Director: Brian Clifford, Deputy University Librarian and Head of Learning and Research Support, [email protected]

Project Manager: Rachel Proudfoot

Contact email: [email protected]

Partner Institutions: Digital Curation Centre, F5, National Instruments

Project Webpage URL: http://library.leeds.ac.uk/roadmap-project

Programme Name: JISC Managing Research Data Programme 2011-13

Programme Manager: Simon Hodson

Document Information

Author(s): Dr Ian Sapiro, Brenda Phillips

Project Role(s): Music Case Study Lead, Project Officer

Title: RoaDMaP Project: Music Case Study Report

Reporting Period: From 01-01-2012 to 30-06-2012

Date

Filename

URL

If this report is on your project web site

Access

Project and JISC internal

General dissemination

Document History

Version: 1.0

Date: 30/05/2013

Comments: Final draft

RoaDMaP Music Case Study Report

1. Project Background

a) Enough context to make the project intelligible to non-specialists

In 2005, the University of Leeds received a donation of analogue reels of magnetic tape from film composer Trevor Jones on long-term loan for the purposes of research and teaching, the agreement being that the materials could be digitised to render them usable. Digitisation has therefore taken place as funding has been available, with a small proportion of the archive digitised through an AHRC-funded project in 2008, allowing some research to be carried out using these materials. Where digitised, this archival material takes the form of sound files, accessed principally as ProTools sessions in line with standard practice in the film-music industry. Jones has made two further donations of materials for digitisation and use, in 2008 and 2013, incorporating a wider range of media and recording types (e.g. stereo and 5.1 surround-sound mixes, in addition to 24-track recordings and demos).

The AHRC states that a Technical Plan (TP) should be provided for all applications where digital outputs or digital technologies are an essential part of the planned research outcomes. A digital output or digital technology is defined as an activity which involves the creation, gathering, collecting and/or processing of digital information. The purpose of the TP is to demonstrate to the AHRC that technical provisions within a research proposal have been adequately addressed in terms of: (a) Delivering the planned digital output or the digital technology from a practical and methodological perspective; (b) Doing so in a way which satisfies the AHRC's requirements for preservation and sustainability. The AHRC has a responsibility to ensure that the research that it funds is achievable and high-quality, and that the outputs of the research will, wherever appropriate, be accessible to the community over the longer term.

b) Staff, dates, locations

The work on the Trevor Jones archive has been led by Professor David Cooper, Dean of the Faculty of Performance, Visual Arts and Communications (PVAC), and supported by Dr Ian Sapiro, Lecturer in Music. Both have worked on the archive since its delivery in 2005.

c) Collaborative partners

Trevor Jones himself retains an interest in our activities, though there are no collaborative partners as such.

d) Description of data sets

Overall, the data set is the film works of Trevor Jones between the late 1970s and mid 1990s, when the industry as a whole made the transition from analogue to digital recording technology. Owing to the passage of time, older analogue tapes exhibit a tendency to squeal or stick to the guides and heads in the recorder's tape path after a long period of storage, and even where materials have been vacuum sealed, tapes can show some signs of mould or mildew. In almost all cases, therefore, the taped material needs to be baked to enable the reels to be unwound and played without tearing or distorting the recorded contents, so that the audio data can be transferred into a digital format. The dataset comprises some 24-track sound mixes and some 8-track sound mixes, with occasional mixes of other sizes and some stereo pairs. While the data itself is largely static, the associated metadata for the materials may change and be increased as the recordings become available and research is carried out. The dataset consists of the digital copies of analogue tapes digitised using a small grant from the AHRC.

Improved project response for TP purposes re data/file formats: It is anticipated that video sources in the archive will be digitised in uncompressed AVI format or similar, the exact specifications to be decided in consultation with the company that carries out the digitisation and the technical experts at the University of Leeds, to ensure that the resulting digital files are of high quality. RoaDMaP personnel will advise on best practice in all cases.

2. Established Practice and Challenges

a) Current Data Management Planning practice

A Technical Plan must be submitted when applying for external funding from the AHRC for a research project in which electronic data are to be generated.

Historically, nothing has been undertaken beyond the requirements of the funding body application. However, the Technical Plan is similar to a DMP. Neither the Principal Investigator (PI: Professor Cooper) nor the Co-Investigator (CI: Dr Sapiro) had ever completed a DMP for everyday research, hence they had little to no experience of data management planning. While there was an intention to keep a second copy of all digital materials on DVD, this has never been carried out owing to time constraints.

b) Known challenges or issues

1. The security of the data, and prevention of loss/lack of access, since the data is unique and cannot be replaced.

2. It is possible that the format of the digital archival materials may need to change at some point in the future to keep pace with modern technology, but the use of non-proprietary formats, where possible, mitigates this somewhat. Should funding be secured for further digitisation, the amount of research data would increase significantly.

3. There is little proper metadata for cataloguing of the archival materials at present, though creation of this data is something the researchers wish to do. At present, familiarity with small subsets from within the digital collection is sufficient for reuse by the current research team only.

4. Owing to difficulties of access and the nature of the current storage of the data, data are not currently made available to others. Should a secure method of making them available become possible it would be investigated further.

5. At present, nothing different happens to the data in the longer term and it is not logged in a data centre or other repository. Although the data is irreplaceable, it is unclear how long it should be kept, since it is unclear how much use it has for teaching and research in the long term.

6. More needs to be known about storage solutions, and about what support is available centrally to assist with the management of digital research data.

Improved project response for TP purposes: re sustainability of electronic resources: The repository system is fully supported by the University of Leeds, so access to data will be maintained through future advances in technology, and the systems will be updated as web-standards and environments change. The University will manage and preserve the repository beyond the grant period, making it a long-term sustainable resource for research and teaching. Adherence to the recommendations of the JISC-funded RoaDMaP project will ensure data are stored and managed in line with best practice, ensuring the sustainability of the resource in the long term. Owing to the scarcity of the sorts of materials to be held in the repository (as outlined in the case for support) it is unclear what time-span is required for storage, but the technology employed will ensure that all materials remain accessible for at least ten years beyond the end of the project. Additional items could be added to the repository as and when Jones undertakes further scoring projects, and additional metadata generated from ongoing research would also be added as appropriate.

c) Key players: researchers, support staff

In addition to the staff listed above, developments in the project team's experience and knowledge around DMP, as supported by RoaDMaP, highlighted that adopting a mixed-team approach to meeting the Technical Plan and data management planning requirements is not only supportive but will strengthen the project's delivery outcomes. The project now recognises and endorses a cross-team approach to managing and reporting on the TP, i.e.:

Improved project response for TP purposes re project management of technical aspects, management, and reporting structures and timetabling: the Post-Doctoral Research Assistant (PDRA) will be the principal liaison with industry specialists, with Library personnel, the repository and Leeds Research Data Management Pilot (RoaDMaP) teams and technical personnel, University of Leeds Information Systems Services (ISS) and the website design team [...] and materials processed externally (whether audio, video or textual) will be digitised by experts in the field.

d) Training requirements

1. Help with DMP: guidance around what is needed and how to complete a DMP, along with some proper/formal notification that one is needed, and a system for checking they exist.

2. Assistance in the creation of a DMP, raising awareness of the issues surrounding the long-term storage and retrieval of data, and guidance on data storage solutions would also be helpful. The training should be provided by experts, regardless of which organisation provides it.

3. Collection of appropriate metadata, guidance on appropriate catalogue presentation style/system, creation of a DMP.

3. Methodology

Define scope of case study: who, why & Interaction with RoaDMaP project: when, how

The case study centres around the film-music research materials in the Trevor Jones Archive, housed at the University of Leeds. The principal researchers are Professor David Cooper and Dr Ian Sapiro. The materials are a unique collection relating to Jones's back catalogue from his career as a film composer. Professor Cooper and Dr Sapiro have been in communication with the University Library for some time regarding potential storage and preservation of the digital materials produced through research into the collection. Interaction with the RoaDMaP project began close to the start of RoaDMaP, though the IT Manager for the Faculty of PVAC, Tim Banks, was aware of the research and materials before RoaDMaP started. Dr Sapiro has been part of RoaDMaP throughout as a representative of the Music case study.

4. Benefits for the case study project and for RoaDMaP including how they were/will be measured

a) Who will (did) benefit?

Immediate beneficiaries are the research team and also the composer, who may now gain access to the materials himself. Wider beneficiaries are those who might use the research materials, which may be made available to the whole scholarly community through the storage and retrieval system arranged through the RoaDMaP project, and the structured metadata generated following consultation with and guidance from the RoaDMaP project team.

Improved project response for TP purposes re project deliverables: the repository facility will be fully searchable, enabling the project team to locate and study specific items within the digitised collection, and will also enable metadata to be appended to items as the project progresses. The testing referred to above will ensure the system is fit-for-purpose and can ably manage the stresses of extensive use by multiple users.

The RoaDMaP project has also benefitted from dealing with the range of media types and file formats associated with this case study, and the unique way in which the various resources relate to each other. The case study has encouraged broader thinking regarding concepts of cataloguing, metadata structures, and the presentation of digitally stored materials, as well as questions of access and data storage life.

b) Who may need to (did) change behaviour: how will we know?

The research team have changed their behaviour with regard to data storage, and are now more aware of issues regarding the secure and redundant storage of research data, especially those which cannot be replaced easily (or at all). Measurement of this change will be evidenced through behaviour in future projects, notably the potential AHRC large-grant project to digitise and work with the rest of the Trevor Jones Archive. The proposed digital storage system for this project demonstrates a change in thinking towards this important aspect of managing research data, and if the project receives funding there are targets and milestones built into the proposal in this area.

c) Benefits during project/post-project

Some University server space has been secured for a small proportion of the digital research materials generated from the Trevor Jones Archive; digital copies of some of the sound materials and the majority of the paper-based items will be stored on a secure server with appropriate safeguards (both in terms of redundancy and access).

1. Tim Banks has confirmed long-term availability of a couple of terabytes of storage on the University N: drive as agreed by ISS for the Trevor Jones data. Storage is currently on a number of individual Mac-formatted drives.

2. The project was keen for RoaDMaP to gain a greater understanding about the nature of its research data e.g. file types, the nature of research data in the arts and how it differs from the sciences, i.e.

Life of data is less of an iterative process than may be the norm within other arts disciplines

Act of digitisation contributes to the destruction of the data

The tape baking outcomes will produce 'a different state of unknown' by the nature of the baking process itself

d) Dissemination, evidence of impact, legacy

The influence of RoaDMaP is evidenced through the Technical Plan submitted with the recent AHRC large-grant application, which received very favourable feedback in peer review. The researchers have included recommendations made by the RoaDMaP team through work on the case study, which will be implemented should funding be secured. Potential impact includes the use of the storage and metadata structure in a commercial-facing implementation, since the proposed AHRC-funded research project includes an investigation of the implementation of the repository system in the context of the film-music industry (and beyond).

Improved project response for TP purposes re project timetabling: Year 1 will see the digitisation of all analogue sound reels and the majority of the related paperwork, with the digital files delivered to the RA as they become available. Basic metadata relating to the materials will be entered into the repository with the digital files.

Year 2 will see the completion of the digitisation of materials with the processing of all remaining audio, video and textual sources, and additional metadata will be appended to digital items as outlined in the case for support. The web interface through which the materials in the repository will be accessed will be created and subjected to rigorous in-house and some external testing across the year. The digital materials will be fully accessible to the project team through the website during the final year of the project, and metadata will continue to be added as it is generated by the research (e.g. metadata pertaining to relationships between items rather than the items themselves).

Improved project response for TP purposes re their monitoring processes: RoaDMaP personnel will advise on the required metadata for repository items. [...] The Special Collections staff at the University of Leeds's Brotherton Library will assess and monitor the presentation of the digital materials from a bibliographic viewpoint, and RoaDMaP personnel will advise on the required metadata for repository items. Additionally, the CI is a member of the RoaDMaP project team and will ensure this project meets standards and good practice developed by that JISC-funded project.

5. What we wish we'd known before our research project

Research data lessons learned with the benefit of hindsight

Greater understanding of the metadata requirements would have been valuable when the original materials were digitised, since this data will have to be generated after the event, which can be more difficult. Increased knowledge of open file formats and appropriate standards of quality for files (especially sound and video files) would also have been useful.

6. Next steps

a) Agreed / recommended activity to change or embed data management practice, What will be done differently as a result of the project?

1. With a much clearer focus on data sharing and re-use, as already identified, the entire data storage and management process will be different in future projects owing to the RoaDMaP project. This includes the selection of file formats, collection of metadata, storage plans and structures, and general management of the data with a view to potential rather than just current use. The actions below represent some of these developments:

Improved project response for TP purposes re data preservation and sustainability: Advice has been taken from members of the repository and Library teams regarding the collection of metadata to ensure that digital items can be found within the repository, and Library, IT and RoaDMaP personnel and external technical experts have been consulted in order to ascertain appropriate file formats and resolutions for preserving the digital materials long-term without loss of detail. One of the project research questions relates to the application of the digital repository framework in the industrial film-making process, and the advice and guidance of the repository team has been invaluable in developing an understanding of the ways in which digital objects can be preserved and used.

2. Develop an outline description of a music data object for the purposes of the Trevor Jones data (e.g. 24 tracks stored as one large multi-track file or downloaded as 24 separate track files, together with linkages and contextual documentation).

3. Indicate, as appropriate, where a distinction is to be made between tape and cue, supported by a basic readme file to assist in searching the repository (i.e. by film/list of cues/single cue) and to document file relationships and ease-of-streaming requirements.

4. Digitise any remaining hard-copy documentation to allow the appropriate metadata to be captured.

5. Further explore the potential for the existing database/metadata of each tape reel to be imported into EPRINTS, including researching the different sound/audio file XML schemas (e.g. Library of Congress), and customising the EPRINTS schema via a well-formed XML file. With reference to EPRINTS security, it was noted that the data can also be securely hosted elsewhere and accessed via a secure link rather than through the repository itself.

Some resources of interest:

The Centre for Digital Music, a multidisciplinary research group in the field of Music & Audio Technology, Music Ontology Specification on main concepts and properties for describing music (i.e. artists, albums, tracks, but also performances, arrangements, etc.) on the Semantic Web, and their adoption of the 5 compulsory and 12 optional fields of the DataCite metadata schema: http://rdm.c4dm.eecs.qmul.ac.uk/content/dspace-metadata-schemas-and-data-submission

Library of Congress Digital Library Standards - XML Schemas that detail technical metadata for audio- and video-based digital objects: http://www.loc.gov/standards/amdvmd/
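As a rough illustration of step 5, the sketch below turns rows of a tape-reel spreadsheet into a well-formed XML file using only the Python standard library. The column names, reel identifiers and element names are hypothetical placeholders; they are not the EPRINTS import schema or the Library of Congress audio schema, which would need to be mapped separately.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical export of the existing tape-reel database.
# Column names and values are illustrative only.
SAMPLE_CSV = """reel_id,film,track_count,format
TJ-0001,Example Film,24,WAV
TJ-0002,Example Film,8,WAV
"""


def reels_to_xml(csv_text: str) -> ET.Element:
    """Build one <reel> element per CSV row under a <reels> root."""
    root = ET.Element("reels")
    for row in csv.DictReader(io.StringIO(csv_text)):
        reel = ET.SubElement(root, "reel", id=row["reel_id"])
        for column in ("film", "track_count", "format"):
            ET.SubElement(reel, column).text = row[column]
    return root


root = reels_to_xml(SAMPLE_CSV)
xml_text = ET.tostring(root, encoding="unicode")

# Re-parsing is a cheap well-formedness check: it raises ParseError on failure.
ET.fromstring(xml_text)
print(xml_text)
```

Well-formedness is only the first hurdle; validating against whichever schema the repository finally adopts would be a separate step.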

6. Undertake preparation to make one Trevor Jones score data set repository deposit ready; include differing metadata requirements as they address metadata's different purposes, plus definitions/glossary, csv spreadsheets re data records, characteristics of discipline, etc., as appropriate. This metadata would contain a contextual rationale explaining WHY the decisions were made, what data was kept and why, plus any technical background information that would help a user understand and re-use the data.

Some resources of interest:

Monash University in Australia host some small music collections:

"Australian Archive of Jewish Music Collection" - a small number of mp3 files; "The Kartomi Collection of Traditional Musical Arts in Sumatra" - mixed files - e.g. field notes (PDF); audio (mp3); image (jpeg); video (wmv):

http://arrow.monash.edu.au/vital/access/manager/Community/monash:62951

The Edinburgh DataShare service - A Collection of Dinka Songs

Each folder in the collection holds multiple files including readme files with information about the dataset, filenaming etc, audio (wav), notation files in XML:

http://datashare.is.ed.ac.uk/handle/10283/155
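One possible shape for the "music data object" and readme file discussed in steps 2, 3 and 6 is sketched below. It is an assumption-laden outline, not an agreed design: the class names, the cue label "1M1" and the file names are all invented for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TrackFile:
    number: int    # position on the multi-track tape, e.g. 1 to 24
    filename: str  # hypothetical file name of the digitised track


@dataclass
class CueObject:
    film: str
    cue: str
    tracks: list[TrackFile] = field(default_factory=list)
    related_documents: list[str] = field(default_factory=list)

    def readme(self) -> str:
        """Render a plain-text readme listing the files in this object."""
        lines = [
            f"Film: {self.film}",
            f"Cue: {self.cue}",
            f"Tracks: {len(self.tracks)}",
        ]
        lines += [f"  {t.number:02d}  {t.filename}" for t in self.tracks]
        if self.related_documents:
            lines.append("Related documents:")
            lines += [f"  {doc}" for doc in self.related_documents]
        return "\n".join(lines) + "\n"


# Example: one cue held as 24 separate track files plus a scanned cue sheet.
cue = CueObject(
    film="Example Film",
    cue="1M1",
    tracks=[TrackFile(n, f"1M1_track{n:02d}.wav") for n in range(1, 25)],
    related_documents=["1M1_cue_sheet.pdf"],
)
print(cue.readme())
```

Keeping the tracks as a list means the same structure covers 24-track, 8-track and stereo material without special cases.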

Improved project response for TP purposes re documenting the resource to be funded: The digital materials are documented directly through the metadata stored alongside them in the repository, which will be supplemented with additional data as the project progresses (see case for support). Since the repository will be fully searchable, the metadata will allow the project team to locate and retrieve items via keyword search, and will also allow the interlinking of related items in the collections (e.g. a page of score, the various audio versions of the music which exist, any paperwork pertaining to that particular cue, etc.).
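The keyword search and interlinking described above can be pictured with a small in-memory index. This is only a toy model under stated assumptions: the item identifiers, titles and keywords are made up, and a real repository would provide its own search layer.

```python
from collections import defaultdict

# Hypothetical repository records; identifiers, titles and keywords are invented.
ITEMS = {
    "score-001": {"title": "Score page, cue 1M1", "keywords": {"score", "1m1"}},
    "audio-001": {"title": "24-track mix, cue 1M1", "keywords": {"audio", "24-track", "1m1"}},
    "audio-002": {"title": "Stereo mix, cue 1M1", "keywords": {"audio", "stereo", "1m1"}},
    "paper-001": {"title": "Cue sheet, cue 2M3", "keywords": {"paperwork", "2m3"}},
}


def build_index(items):
    """Map each keyword to the set of item ids that carry it."""
    index = defaultdict(set)
    for item_id, record in items.items():
        for keyword in record["keywords"]:
            index[keyword].add(item_id)
    return index


def search(index, *keywords):
    """Return sorted ids of items matching every keyword (case-insensitive)."""
    if not keywords:
        return []
    hits = set.intersection(*(index.get(k.lower(), set()) for k in keywords))
    return sorted(hits)


def related(index, items, item_id, link_keyword):
    """Return other items that share link_keyword with item_id."""
    if link_keyword not in items[item_id]["keywords"]:
        return []
    return sorted(index[link_keyword] - {item_id})


index = build_index(ITEMS)
print(search(index, "audio", "1M1"))
print(related(index, ITEMS, "score-001", "1m1"))
```

Here the shared cue label acts as the link between a page of score and its audio versions, which is why relationship metadata matters as much as per-item metadata.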

7. Explore the potential to build a metadata field under the audio subsets to store/aid reuse/discovery, which at a minimum will provide contextual data re the sound file schema if not included as part of data structure itself. This can then be stored as part of the data record and once supported by a request layer to be devised for sharing data could meet ethics, copyright and IPR requirements.

Improved project response for TP purposes re data development methods and content selection: Metadata will be collected for all items in the collection to ensure the repository can be searched satisfactorily and that retrieval of items for use in the research is efficient and comprehensive. Standards for metadata and the presentation of digital materials will be agreed with experts from the University of Leeds Library Special Collections and central IT staff, and the RoaDMaP project team will assist in ensuring research materials are stored securely in line with best practice.

Graphic for illustrative purposes only

8. Upload data onto revised storage drive.

9. The AHRC awarded funding to Professor Cooper and Dr Sapiro for their £570,000 project to work with the Trevor Jones Archive materials. Securing this funding leads to further next steps as follows:

a) Expansion of the dataset to include all items within the Jones collection

b) Appointment of a Post-Doctoral Research Assistant to work on the project. The PDRA will require training in aspects of data management and matters regarding metadata and digital objects as outlined in 2d, above

c) Consolidation of all objects for digitisation in order that all media types are accounted for, suitable metadata can be generated for each item, and appropriate file formats are selected to ensure use, reuse and long-term preservation

d) Securing of required repository space for the digital materials arising through the project, and establishment of relevant security and redundancy

7. Summary and Key Points

a) Take away messages for readers

RoaDMaP helped the research team understand the importance and value of data management planning, and of keeping data in a secure environment with suitable redundancy. The project was also instrumental in encouraging the researchers to consider the research value of their data more widely beyond their own specific project and to think properly about data reuse when creating digital items and associated metadata.

Improved project response for TP purposes re advice sought on planning your proposed project: Discussions have been held with members of the University of Leeds's central digital storage team and the faculty IT Manager regarding the management and capabilities of the digital repository, and advice has been taken from RoaDMaP personnel regarding appropriate file formats and standards for digital materials, as well as from Library Special Collections' staff regarding the nature and quantity of metadata required within the repository. Personnel from the University's central IT service and the faculty IT Manager have also been consulted regarding issues and standards of storage and access for digital content, as have the Leeds UK Research Data Service team and the RoaDMaP project team.

Improved project response for TP purposes re infrastructural support, hardware, software and relevant technical expertise: Backup of the data is inherent in the repository and will maintain best practice as determined by the outcomes of RoaDMaP. Digitisation carried out by external companies will be undertaken by experts, with companies selected following a competitive tendering process, and technical standards, requirements and deadlines will be negotiated and agreed in advance of work starting. General technical support will be available through the School of Music, faculty and central IT services at Leeds.

Improved project response for TP purposes re backup procedures that the project will use to safeguard its electronic resources during development: The repository team will supervise the retrieval of data should it be necessary following server failure or similar problems. Leeds has stringent regulations regarding electronic data and digital materials will be safeguarded, sustained and fully backed up in line with best practice as determined by RoaDMaP.
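One common way to confirm that files retrieved from a backup match the originals is to keep a manifest of checksums. The sketch below assumes SHA-256 and a simple path-to-digest dictionary; the report does not say which fixity method the repository uses, and the file name is a placeholder.

```python
from __future__ import annotations

import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large audio files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Record a checksum for every file under root, keyed by relative path."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return relative paths that are missing or whose checksum has changed."""
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems


# Demonstration on a temporary directory with a placeholder file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "cue_1M1.wav").write_bytes(b"placeholder audio bytes")
    manifest = build_manifest(root)
    clean = verify(root, manifest)
    (root / "cue_1M1.wav").write_bytes(b"corrupted")
    damaged = verify(root, manifest)

print(clean, damaged)
```

Storing the manifest separately from the data means a damaged or incomplete restore can be detected before researchers rely on it.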

The input of the RoaDMaP team was significant with regard to the Technical Plan submitted as part of the successful AHRC large-grant application. The technical aspects of the project were praised in peer review, and it is apparent that this contributed to the award of substantial research funding to digitise and work with the Trevor Jones materials. The benefit of their expertise was invaluable, and will continue to be so over the duration of the project. Construction of a DMP, while not required by the AHRC, enabled the research team to understand the project data more completely, and the DMP will be of immense use in the actual management of the project across the next three years.

8. Recommendations

The funding award from the AHRC demonstrates that training in the value and purpose of DMPs is of immense value to all researchers who generate data in whatever form. Raising awareness of data reuse and storage, the need for high-quality, accurate metadata, and the advantages of open file formats will lead to future research projects having added value beyond the specific studies for which data is originally generated and used, and will enable PIs and CIs to demonstrate to funding bodies that data management has been fully considered in the planning of a research project.