Best Practices in Research Data Management (RDM)
By Rahul & Kayla
Today’s Objectives
• Why Research Data Management?
• Identify common data management issues
• Best practices for managing data
• Future of RDM at RITMO
What is Data?
• “Research data, unlike other types of information, is collected, observed, or created, for purposes of analysis to produce original research results”
• Experimental
• Simulation data
• Derived or compiled data
Why Should I Manage it?
• Compliance with UiO policy and guidelines
UiO wishes to manage research data according to international standards, such as the FAIR principles
Findable: Data should have rich metadata and a persistent identifier.
Accessible: Data should be retrievable, with clear authorization and authentication procedures.
Interoperable: Metadata should use a formal, accessible, shared, and broadly applicable language for knowledge representation.
Reusable: The data should be well defined so that they can be replicated or combined in different settings.
• Science & Personal Benefits
Managing data saves you time and effort and avoids duplicated work: “good RDM = good research”
Managing and sharing data increases the impact and visibility of research
Encourages improvement and validation of research methods
“Cyberinfrastructure, Data, and Libraries, Part 1: A Cyberinfrastructure Primer for Librarians.” D-Lib Magazine, September/October 2007, Volume 13, Number 9/10.
What if I Don’t Consider RDM?
Data Sharing and Management Snafu in 3 Short Acts: A data management horror story
http://www.youtube.com/watch?v=N2zK3sAtr-4
Eight “Issues” in Research Data Management
• Responsibility
• Data Management Plans
• Records Management
• File Management
• File Naming
• Metadata
• Backup and Security
• Long Term Planning
Best Practices: Responsibility
• Define roles and assign responsibilities for data management
• Identify the skills needed to perform the tasks outlined in the data management plan and match them to available staff
• Develop training plans for continuity
• Assign responsible parties and monitor results
Issue: Data Management Plans
[Figure: data lifecycle] Creating data → Processing data → Analysing data → Preserving data → Giving access to data → Re-using data
Best Practices: Data Management Plans
• What types of data will be created?
• Who will own, have access to, and be responsible for managing these data?
• What equipment and methods will be used to capture and process data?
• Where will data be stored during and after the project?
Issue: File Management
• Does this sound familiar?
• Inconsistently labeled files
• in multiple versions…
• inside poorly structured folders…
• stored on multiple media…
• in multiple locations…
• and in various formats…
Best Practices: File Naming
• Avoid special characters in a file name.
• Use capitals or underscores instead of periods or spaces.
• Use 25 or fewer characters.
• Use documented & standardized descriptive information about the project/experiment.
• Use the ISO 8601 date format: YYYYMMDD.
• Or use an agreed-upon, consistent date format.
• Include a version number.
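The naming rules above can be sketched as a small helper. This is only an illustration under stated assumptions: the function name `build_file_name`, the example project and description, and the choice to count the 25 characters without the file extension are all hypothetical, not part of the slides.

```python
import re
from datetime import date

MAX_LENGTH = 25  # limit from the slide; assumed here to exclude the extension


def build_file_name(project: str, description: str, day: date, version: int, extension: str) -> str:
    """Build a file name from project, description, ISO 8601 date and version."""

    def clean(text: str) -> str:
        # Drop special characters, spaces and periods; capitalise each word instead.
        words = re.findall(r"[A-Za-z0-9]+", text)
        return "".join(word[:1].upper() + word[1:] for word in words)

    stem = f"{clean(project)}{clean(description)}_{day:%Y%m%d}_v{version:02d}"
    if len(stem) > MAX_LENGTH:
        raise ValueError(f"'{stem}' has {len(stem)} characters; use {MAX_LENGTH} or fewer")
    return f"{stem}.{extension.lstrip('.')}"


# Hypothetical example values.
print(build_file_name("mocap", "pilot", date(2024, 1, 31), 2, "csv"))
# prints: MocapPilot_20240131_v02.csv
```

Raising an error for long names, rather than silently truncating them, keeps the descriptive part of the name a deliberate choice.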
Issue: Metadata
What is Metadata?
• “Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use or manage an information resource **.”
• A note to the future…
• How will someone make sense of your data e.g. the cells and values of your spreadsheet?
• What universal or disciplinary standards could be used to label your data?
• How can you describe a data set to make it discoverable?
[**2004, NISO, Understanding Metadata, pg. 1]
Common metadata fields
• Title
• Creator
• Identifier
• Subject
• Funders
• Rights
• Access information
• Language
• Dates
• Location
• Methodology
• Data processing
• Sources
• List of file names
• File Formats
• File structure
• Variable list
• Code lists
• Versions
• Checksums
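A few of the fields above could be kept in a plain-text record next to the data. The sketch below is one possible form, not a prescribed schema: the lower-case key names, the choice of which fields are required, and every value in `record` are assumptions made for illustration.

```python
import json

# Field names come from the list above; treating these as required is an assumption.
REQUIRED_FIELDS = ["title", "creator", "identifier", "subject", "rights", "dates", "file_formats"]


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or empty in a metadata record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Hypothetical record; every value is a placeholder.
record = {
    "title": "Example pilot recording",
    "creator": "Example Researcher",
    "identifier": "",
    "subject": "motion capture",
    "rights": "restricted",
    "dates": "20240131",
    "file_formats": ["csv"],
}

print(missing_fields(record))        # reports the empty identifier
print(json.dumps(record, indent=2))  # plain-text form that can be stored with the data
```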
Best Practices: Metadata
• Create a Data Dictionary
• Describe the contents of data files
• Define the parameters and the units of each parameter
• Explain the formats for dates, time, and other parameters
• Define any coded values
• Describe quality flags or qualifying values
• Define missing values
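A data dictionary covering the points above might look like the following sketch. The variables, units, codes, and the missing-value marker `-999` are hypothetical examples, and the helper names are not from any library.

```python
import csv
import io

# Hypothetical data dictionary: one entry per column of a hypothetical data file.
DATA_DICTIONARY = [
    {"variable": "participant_id", "description": "Anonymous participant code",
     "unit": "", "format": "text", "codes": "", "missing": ""},
    {"variable": "session_date", "description": "Date of recording",
     "unit": "", "format": "YYYYMMDD", "codes": "", "missing": ""},
    {"variable": "reaction_time", "description": "Time from stimulus to response",
     "unit": "ms", "format": "integer", "codes": "", "missing": "-999"},
    {"variable": "quality_flag", "description": "Quality of the trial",
     "unit": "", "format": "integer", "codes": "0=good; 1=suspect; 2=bad", "missing": "-999"},
]


def dictionary_to_csv(entries: list) -> str:
    """Serialise the data dictionary as CSV text so it can be stored with the data."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(entries[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


def undocumented_columns(header: list, entries: list) -> list:
    """Return the columns of a data file that the dictionary does not describe."""
    documented = {entry["variable"] for entry in entries}
    return [column for column in header if column not in documented]


print(undocumented_columns(["participant_id", "reaction_time", "notes"], DATA_DICTIONARY))
# prints: ['notes']
```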
Issue: Backup & Security
• How often should data be backed up?
• How many copies of data should you have?
• Where can you store your data?
Best Practices: Backup & Security
• Keep original RAW data untouched and saved
• Use copy to process into cooked data
• Storing data unencrypted is ideal, because it will be most easily read by you and others in the future. If you do need to encrypt your data because it involves human subjects, then:
• Keep passwords and keys on paper (2 copies), and in a PGP (Pretty Good Privacy) encrypted digital file
• Storing data uncompressed is also ideal, but if you need to compress it to conserve space, limit compression to your 2nd backup copy
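Keeping raw data untouched and processing only a copy can be sketched as below. This is a minimal illustration, assuming SHA-256 as the checksum and hypothetical function names; it is not an institutional backup procedure.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_working_copy(raw_file: Path, working_dir: Path) -> Path:
    """Copy a raw file into a working folder and verify the copy by checksum."""
    working_dir.mkdir(parents=True, exist_ok=True)
    copy = working_dir / raw_file.name
    shutil.copy2(raw_file, copy)  # the raw file itself is only read, never modified
    if sha256_of(raw_file) != sha256_of(copy):
        raise OSError(f"Copy of {raw_file} does not match the original")
    return copy
```

Recording the checksum of each raw file alongside its metadata also makes it possible to confirm later that a backup copy is still intact.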
Issue: Long-Term Planning
• What will happen to my data after my project ends?
• How can I appraise the value of my data?
• What are my options for archiving and preserving my data?
• What are my options for publishing and sharing data?
Best Practices: Long-Term Planning
• When choosing a file format, select a consistent format that can be read well into the future and is independent of changes in applications.
• Non-proprietary, open, documented-standard, unencrypted, uncompressed, ASCII-formatted files are the most likely to remain readable in the future.
Future of RDM at RITMO
• HTD vs Data Managers
• Long-term work will be done through the HTD project
• Short-term information and changes will come from the data managers
• UiO
• Information, procedures, and regulations are ever changing
• Data managers will keep information current
• Department
• Due to the current state of IT and other such systems, data management will still occur in individual department infrastructure
• Data managers will work to create consistency and eventually merge all data systems
Using BIDS at RITMO
• The Brain Imaging Data Structure (BIDS) is a potential standard for organizing, annotating, and describing the data collected.
• Currently BIDS is widely used for fMRI, EEG, iEEG, MEG, EMG, and behavioural studies
• The BIDS structure will be further adapted to organize data from the motion capture system and eye-tracking.
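To give a feel for BIDS-style organization, the sketch below builds a path following the general `sub-<label>/[ses-<label>/]<datatype>/` pattern with `key-value` entities in the file name. The function `bids_path` is hypothetical and does not validate against the specification; the entities, suffixes, and extensions allowed for each data type should be checked in the BIDS specification itself.

```python
from pathlib import PurePosixPath


def bids_path(subject: str, task: str, datatype: str, suffix: str,
              extension: str, session: str = "") -> PurePosixPath:
    """Build a BIDS-style relative path (illustrative sketch, not a validator)."""
    for label in (subject, task, session):
        # BIDS labels are alphanumeric; reject anything else early.
        if label and not label.isalnum():
            raise ValueError(f"label '{label}' must be alphanumeric")

    folders = [f"sub-{subject}"]
    entities = [f"sub-{subject}"]
    if session:
        folders.append(f"ses-{session}")
        entities.append(f"ses-{session}")
    entities.append(f"task-{task}")

    file_name = "_".join(entities) + f"_{suffix}.{extension.lstrip('.')}"
    return PurePosixPath(*folders, datatype, file_name)


# Hypothetical example values.
print(bids_path("01", "rest", "eeg", "eeg", "edf"))
# prints: sub-01/eeg/sub-01_task-rest_eeg.edf
print(bids_path("01", "rest", "eeg", "eeg", "edf", session="02"))
# prints: sub-01/ses-02/eeg/sub-01_ses-02_task-rest_eeg.edf
```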
RDM Coming Soon at RITMO
• Folder structure
• Data drives will be reorganized for both FrontNeuro and FourMs to the same consistent folder structure
• Including metadata introduction files and basic DMPs
• Data management resources at UiO
• A website detailing ethics and RDM information is being created, compiling information from UiO as well as other organizations such as NSD
Learn More
• Data Management Principles & Education:
• Research Data MANTRA
• DataONE: Best Practices
• UK Data Archives
• MIT Data Management and Publishing Guide
• Data Management Plans
• Digital Curation Centre
• DMPTool2
• DataONE: Data Management Planning