Collaborative Data Management using OSF
Collaborative data management using OSF
Tobin Magle, Data Management Specialist
Morgan Library, 12-07-2016
http://www.slideshare.net/CTobinMagle/collaborative-data-management-using-osf
Outline
• Intro to data management services
• What is data management?
• Why should I care?
• Data Management Planning
• Collaboration tool: Open Science Framework
My Background: molecular microbiology
(1) Magle CT, et al. Infect Immun. 2014 Feb;82(2):618-25. doi: 10.1128/IAI.00444-13. Epub 2013 Nov 25.
(2) Sun W, Tanaka TQ, Magle CT, et al. Sci Rep. 2014 Jan 17;4:3743. doi: 10.1038/srep03743.
Workshops
One on one meetings
• How do I write a DMP?
• How do I organize my data?
• How do I clean and format my data?
• How do I automate my analyses?
• How do I get my data ready to share?
Data archiving service
• CSU Digital Repository
• Over 100 Datasets
• Satisfy requirements for manuscripts and grants
• At no cost <1 TB
• $150/TB for 5 years
• $300/TB for >5 years
Data Management Services
https://lib.colostate.edu/services/data-management
What is data management?
The policies, practices and procedures needed to manage the storage, access and preservation of data produced from a research project
data management != data sharing
Collaboration
Why should I care?
• Good for research integrity
• Good for you
• Public good
• Collaboration is hard
Full lecture by Keith Baggerly, Bioinformatician (University of Texas, MD Anderson Cancer Center)
https://www.youtube.com/watch?v=7gYIs7uYbMo
http://www.nytimes.com/2011/07/08/health/research/08genes.html
Where does data management fit into research?
Throughout the whole research cycle
The research cycle
[Figure, built up over successive slides. Stages: Hypothesis, Experimental design, Raw data, Tidy Data, Results, Article. Other labels: Data Management Plans, Cleaning, Analysis, Sharing, Archiving, Open Data, Closed Data, Code, Reproducible Research, Reuse.]
Version Control
Metadata
Collaboration
What is a data management plan?
• A description of how you plan to describe, preserve and share your research data.
• Often required by funders
• Collaboration takes extra planning
Successful DMPs include
• A data inventory
• A strategy for describing the data
• A plan for preserving the data
• A method for access to the data
http://help.osf.io/m/60347/l/618674-creating-a-data-management-plan-dmp
Successful collaborative DMPs include
• A data inventory
• A strategy for describing the data
• A plan for preserving the data
• A method for access to the data
• Assigned Roles
• Shared work space
• Context
• Version control
http://help.osf.io/m/60347/l/618674-creating-a-data-management-plan-dmp
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3143734
Shared workspace: OSF
• Components
• Add-ons
• Contributors
• Wiki
http://help.osf.io/m/collaborating/l/524109-using-the-wiki
http://www.slideshare.net/DuraSpace/121014-slides-roadmap-to-the-future-of-share
Organization rules
• Be consistent
• One directory per project
• Separate subdirectories for
  • Raw data
  • Processed data
  • Code
  • Output
• Make raw data read-only
• Make README files
http://help.osf.io/m/60347/l/611391-organizing-files
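The organization rules above can be sketched as a small Python script. This is a minimal illustration, not something taken from the slides: the subdirectory names, the README wording, and the helper names `create_project` and `lock_raw_data` are all assumptions.

```python
import stat
from pathlib import Path

# Hypothetical subdirectory names, one per category listed on the slide.
SUBDIRS = ("raw_data", "processed_data", "code", "output")


def create_project(root):
    """Create one project directory with a README and the four subdirectories."""
    root = Path(root)
    for name in SUBDIRS:
        (root / name).mkdir(parents=True, exist_ok=True)
    readme = root / "README.txt"
    if not readme.exists():
        # Placeholder README text; replace with a real project description.
        readme.write_text(
            f"Project: {root.name}\n"
            "raw_data/        original files, kept read-only\n"
            "processed_data/  cleaned files derived from raw_data/\n"
            "code/            scripts for cleaning and analysis\n"
            "output/          figures, tables, reports\n"
        )
    return root


def lock_raw_data(root):
    """Remove write permission from every file under raw_data/."""
    locked = []
    for path in sorted((Path(root) / "raw_data").rglob("*")):
        if path.is_file():
            mode = path.stat().st_mode
            path.chmod(mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
            locked.append(path)
    return locked
```

Call `create_project("my_project")` once, copy the original files into `raw_data/`, then call `lock_raw_data("my_project")` so that later cleaning steps have to write to `processed_data/` instead of overwriting the originals.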
Components
• “Subprojects”
• Separate privacy settings, contributors, wiki, add-ons, and files.
• Examples:
  • Different projects: https://osf.io/82fba/
  • Clinical: https://osf.io/gq4mz/
  • Mix: https://osf.io/ezcuj/
  • File types: https://osf.io/if7ug/
  • Manuscript sections: https://osf.io/zmja2/
Demo: add files and components
Add-ons
Now: OSF, OpenSesame
Soon: OSF
29 grants to develop open tools and services: https://cos.io/pr/2015-09-24/
Demo: Link add-ons
Context: Wiki
• Evolves during project
• Describe the project
• Goals
• Progress report
• Code book:
  • ID systems (for records)
  • Variable systems
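A code book like the one mentioned above could be kept as structured data and rendered as a table for the wiki. The sketch below is only an illustration: every variable name, ID pattern, and value code is invented, and it assumes the wiki page accepts a Markdown table.

```python
# Hypothetical code book: every variable name, ID pattern, and code below is
# invented for illustration and does not come from the slides.
CODEBOOK = [
    {"variable": "sample_id",
     "meaning": "Record ID: site code plus running number",
     "values": "FC-001, FC-002, ..."},
    {"variable": "treatment",
     "meaning": "Experimental group",
     "values": "0 = control, 1 = treated"},
    {"variable": "od600",
     "meaning": "Optical density at 600 nm",
     "values": "non-negative number"},
]


def codebook_to_markdown(entries):
    """Render code book entries as a Markdown table for pasting into a wiki page."""
    lines = ["| Variable | Meaning | Values |", "| --- | --- | --- |"]
    for entry in entries:
        lines.append(
            f"| {entry['variable']} | {entry['meaning']} | {entry['values']} |"
        )
    return "\n".join(lines)


print(codebook_to_markdown(CODEBOOK))
```

Keeping the code book as data means the same entries can also be reused to check column names in the analysis scripts.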
Contributors
• Control who can see what
  • Administrator
  • Read/Write
  • Read only
• Separate for each component
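The idea that permissions are set separately for each component can be pictured with a toy model. This is not OSF's API or implementation; the `Permission` enum, the `can` helper, and the component and contributor names are assumptions made for illustration.

```python
from enum import IntEnum


class Permission(IntEnum):
    """The three levels named on the slide, ordered from least to most access."""
    READ = 1    # Read only
    WRITE = 2   # Read/Write
    ADMIN = 3   # Administrator


# Hypothetical project: each component keeps its own contributor list.
PROJECT = {
    "raw-data": {"ana": Permission.ADMIN, "ben": Permission.READ},
    "manuscript": {"ana": Permission.ADMIN, "ben": Permission.WRITE},
}


def can(project, component, user, needed):
    """Return True if `user` holds at least `needed` permission on `component`."""
    level = project.get(component, {}).get(user)
    return level is not None and level >= needed


print(can(PROJECT, "raw-data", "ben", Permission.WRITE))    # False
print(can(PROJECT, "manuscript", "ben", Permission.WRITE))  # True
```

In this sketch the same contributor can edit the manuscript component but only read the raw data component, which is the kind of split the per-component settings allow.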
Demo: Add contributor
Version control
• Who did what when?
• Native in OSF
• Git Integration
• One file name, many versions
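"One file name, many versions" can be illustrated with a small in-memory model that records who saved what and when. This is a sketch of the concept only, not how OSF or Git store versions; the class names, file name, and contributor names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Version:
    number: int         # 1 for the first upload, 2 for the next, ...
    author: str         # who
    saved_at: datetime  # when
    content: bytes      # what


class VersionedFiles:
    """Toy store: saving under an existing name adds a version, never overwrites."""

    def __init__(self):
        self._history = {}

    def save(self, name, content, author, saved_at=None):
        versions = self._history.setdefault(name, [])
        version = Version(
            number=len(versions) + 1,
            author=author,
            saved_at=saved_at or datetime.now(timezone.utc),
            content=content,
        )
        versions.append(version)
        return version

    def history(self, name):
        """All versions of one file name, oldest first."""
        return list(self._history.get(name, []))

    def latest(self, name):
        return self._history[name][-1]


store = VersionedFiles()
store.save("analysis.R", b"x <- 1", author="ana")
store.save("analysis.R", b"x <- 2", author="ben")
for v in store.history("analysis.R"):
    print(v.number, v.author, v.saved_at.isoformat())
```

The file keeps a single name throughout, so there is no need for names like `analysis_final_v2.R`; the history answers who changed it and when.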
Demo: version history
Need help?
• Email: [email protected]
• DMPTool: http://dmptool.org/
• OSF: https://osf.io/
• Data Management Services website: http://lib.colostate.edu/services/data-management
• Slides: http://www.slideshare.net/CTobinMagle/collaborative-data-management-using-osf