Data Management: Tips & Tools
Transcript of Data Management: Tips & Tools
![Page 1: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/1.jpg)
Data Management
Stephanie Wright
University of Washington
[email protected]
SPATIAL / IsoCamp, June 2015
Tips & Tools
![Page 2: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/2.jpg)
Who Am I?
![Page 3: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/3.jpg)
• Computing Trainer
• Cruise Ship Lecturer (Love Boat)
• Library Merger Manager
• Atmospheric Sciences Librarian
• Assessment Librarian
• Data Services Coordinator
HTTP://GUIDES.LIB.WASHINGTON.EDU/SWRIGHT
![Page 4: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/4.jpg)
Disclaimer
I am not a scientist
I am a librarian …
![Page 5: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/5.jpg)
Disclaimer
I am not a scientist
More like this…
![Page 6: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/6.jpg)
What Do I Do?
• Data Management Plans (DMPs)
• Courses
• Consultations
• Research Projects
• DataONE, RDA, eScience Institute
• Institutional Data Repository (DRUW)
![Page 7: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/7.jpg)
Why?
![Page 8: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/8.jpg)
THEN NOW
![Page 9: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/9.jpg)
THEN
NOW
![Page 10: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/10.jpg)
THEN NOW
![Page 11: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/11.jpg)
A Real Life Example
![Page 12: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/12.jpg)
![Page 13: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/13.jpg)
Many tables
![Page 14: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/14.jpg)
my spreadsheet
No headings
![Page 15: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/15.jpg)
Embedded figures
![Page 16: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/16.jpg)
my spreadsheet
![Page 17: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/17.jpg)
my spreadsheet
![Page 18: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/18.jpg)
my spreadsheet
![Page 19: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/19.jpg)
![Page 20: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/20.jpg)
?
![Page 21: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/21.jpg)
One More Example
https://www.youtube.com/watch?v=66oNv_DJuPc
Data Sharing and Management Snafu in 3 Short Acts
![Page 22: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/22.jpg)
Why Does It Matter?
From Flickr by tomhilton
![Page 23: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/23.jpg)
HTTP://WWW.SPARC.ARL.ORG/ISSUES/OPEN-DATA/DATA-SHARING-INITIATIVE/POLICIES
… “Federal agencies investing in research and development (more than $100 million in annual expenditures) must have clear and coordinated policies for increasing public access to research products.”
![Page 24: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/24.jpg)
![Page 25: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/25.jpg)
![Page 26: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/26.jpg)
![Page 27: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/27.jpg)
“The best thing to do with your data will be thought of by someone else.”
“We need open data because we don’t just want to use a car we want to poke around in the engine, see how it works and then rebuild it.”
~ Rufus Pollock, Founder and President of Open Knowledge Foundation (www.okfn.org)
![Page 28: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/28.jpg)
From Flickr by cogdog
![Page 29: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/29.jpg)
WICHERTS JM, BAKKER M, MOLENAAR D (2011) WILLINGNESS TO SHARE RESEARCH DATA IS RELATED TO THE STRENGTH OF THE EVIDENCE AND THE QUALITY OF REPORTING OF STATISTICAL RESULTS. PLOS ONE 6(11): E26828. DOI:10.1371/JOURNAL.PONE.0026828
https://doi.org/10.1371/journal.pone.0026828
![Page 30: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/30.jpg)
How To Do It?
![Page 31: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/31.jpg)
Data planning is more efficient than data forensics.
DATA MANAGEMENT PLANNING
• What will be collected
• Methods
• Standards
• Sharing/access
• Long-term storage
![Page 32: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/32.jpg)
COLLECTING
• Keep raw data raw
• Use scripts to process data
![Page 33: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/33.jpg)
ORGANIZING
• Machine readable
• Human readable
• Works well with default ordering
![Page 34: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/34.jpg)
AVOID
• spaces
• punctuation
• special characters
• case sensitivity
20130503_DOEProject_DesignDocument_Smith_v2-01.docx
20130709_DOEProject_MasterData_Jones_v1-00.xlsx
20130825_DOEProject_Ex1Test1_Data_Gonzalez_v3-03.xlsx
20130825_DOEProject_Ex1Test1_Documentation_Gonzalez_v3-03.xlsx
20131002_DOEProject_Ex1Test2_Data_Gonzalez_v1-01.xlsx
20141023_DOEProject_ProjectMeetingNotes_Kramer_v1-00.docx
Eaffinis_nanaimo_2010_counts.xls
• Eaffinis: study organism
• nanaimo: site name
• 2010: year
• counts: what was measured
![Page 35: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/35.jpg)
YYYYMMDD
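The naming pattern on the previous slide can be sketched in a few lines of Python. This is only an illustration: the helper names `clean_part` and `build_filename` are hypothetical, and the field order (date, project, description, author, version) is simply read off the example file names above.

```python
import re
from datetime import date

# Anything that is not a letter or digit (spaces, punctuation,
# special characters) is removed from each part of the name.
_UNSAFE = re.compile(r"[^A-Za-z0-9]+")


def clean_part(text: str) -> str:
    """Strip characters the slide says to avoid from one name part."""
    return _UNSAFE.sub("", text)


def build_filename(day: date, project: str, description: str, author: str,
                   major: int, minor: int, extension: str) -> str:
    """Assemble YYYYMMDD_Project_Description_Author_vM-mm.ext (hypothetical helper)."""
    stamp = day.strftime("%Y%m%d")  # YYYYMMDD sorts chronologically as plain text
    parts = [stamp, clean_part(project), clean_part(description), clean_part(author)]
    version = f"v{major}-{minor:02d}"
    return "_".join(parts + [version]) + "." + extension.lstrip(".")


design = build_filename(date(2013, 5, 3), "DOE Project", "Design Document",
                        "Smith", 2, 1, "docx")
notes = build_filename(date(2014, 10, 23), "DOE Project", "Project Meeting Notes",
                       "Kramer", 1, 0, "docx")

# Default (alphabetical) ordering is also date ordering.
in_order = sorted([notes, design])
print(in_order)
```

Because the date comes first and is zero-padded, an ordinary alphabetical sort in a file browser lists the files oldest to newest.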
![Page 36: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/36.jpg)
NOBLE, WILLIAM S. (2009) "A QUICK GUIDE TO ORGANIZING COMPUTATIONAL BIOLOGY PROJECTS." PLOS COMPUTATIONAL BIOLOGY. 5(7): DOI/10.1371/JOURNAL.PCBI.1000424
• Pick a method that works for you and stick to it
• DOCUMENT IT!
![Page 37: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/37.jpg)
METADATA
• Who?
• What?
• Where?
• When?
• How?
• Why?
![Page 38: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/38.jpg)
Digital context
• Name of the data set
• The name(s) of the data file(s) in the data set
• Date the data set was last modified
• Example data file records for each data type file
• Pertinent companion files
• List of related or ancillary data sets
• Software (including version number) used to prepare/read the data set
• Data processing that was performed
Personnel & stakeholders
• Who collected
• Who to contact with questions
• Funders
Scientific context
• Scientific reason why the data were collected
• What data were collected
• What instruments (including model & serial number) were used
• Environmental conditions during collection
• Temporal & spatial resolution
• Standards or calibrations used
Information about parameters
• How each was measured or produced
• Units of measure
• Format used in the data set
• Precision & accuracy if known
Information about data
• Definitions of codes used
• Quality assurance & control measures
• Known problems that limit data use (e.g. uncertainty, sampling problems)
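One lightweight way to act on this checklist is to keep a small machine-readable metadata record next to the data and check it for gaps. The sketch below is an assumption about how that might look, not a metadata standard: the field names are a hypothetical subset of the list above, and every value in the example record is invented for illustration.

```python
import json

# Hypothetical field names drawn from the checklist above.
REQUIRED_FIELDS = [
    "dataset_name",
    "file_names",
    "last_modified",
    "collected_by",
    "contact",
    "funders",
    "purpose",
    "instruments",
    "units",
    "code_definitions",
    "known_problems",
]


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or left empty."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Illustrative record only; none of these values come from a real data set.
record = {
    "dataset_name": "Example count data (hypothetical)",
    "file_names": ["Eaffinis_nanaimo_2010_counts.xls"],
    "last_modified": "20100901",
    "collected_by": "A. Researcher",
    "contact": "",
    "purpose": "Illustrate a metadata record",
    "units": {"count": "individuals per sample"},
    "code_definitions": {"NA": "not sampled"},
}

gaps = missing_fields(record)
print("Still to document:", gaps)

# The record can be saved beside the data as plain, open-format text.
as_text = json.dumps(record, indent=2)
```

Running a check like this before depositing data makes it harder to forget who to contact or which instrument was used.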
![Page 39: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/39.jpg)
WORKFLOW
Simple: Flow chart
[Flow chart: Temperature data + Salinity data → Data import into Excel → Data in spreadsheet → Quality control & data cleaning → “Clean” T & S data → Analysis: mean, SD → Summary statistics → Graph production]
![Page 40: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/40.jpg)
Simple: Commented script
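As a rough idea of what a commented script for the temperature and salinity workflow could look like, here is a minimal Python sketch. The column names, the `-999` missing-value code, and the sample readings are all assumptions made up for the example; the raw text is never modified, only read.

```python
import csv
import io
import statistics

# Raw data stay raw: this text is only read, never edited.
# The readings and the -999 missing-value code are invented for illustration.
RAW_CSV = """temperature_c,salinity_psu
12.1,30.2
12.4,30.5
-999,30.4
12.9,
13.0,30.9
"""


def load_rows(text: str) -> list:
    """Step 1: import the raw records as a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows: list, missing: str = "-999") -> list:
    """Step 2: quality control. Drop records with blank or missing-code values."""
    kept = []
    for row in rows:
        values = [row["temperature_c"], row["salinity_psu"]]
        if any(value in ("", missing) for value in values):
            continue
        kept.append({"temperature_c": float(values[0]),
                     "salinity_psu": float(values[1])})
    return kept


def summarize(rows: list, column: str) -> dict:
    """Step 3: analysis. Count, mean, and sample standard deviation of one column."""
    values = [row[column] for row in rows]
    return {"n": len(values),
            "mean": statistics.mean(values),
            "sd": statistics.stdev(values)}


raw_rows = load_rows(RAW_CSV)
clean_rows = clean(raw_rows)
temperature_summary = summarize(clean_rows, "temperature_c")
salinity_summary = summarize(clean_rows, "salinity_psu")
print(temperature_summary)
print(salinity_summary)
```

Each function corresponds to one box in the flow chart, so the script doubles as documentation of what was done to the data.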
![Page 41: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/41.jpg)
Resulting output
More Fancy: Kepler, Taverna
![Page 42: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/42.jpg)
From Flickr by cogdog
![Page 43: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/43.jpg)
BACKING UP: 3 places, 3 ways
From Flickr by lippo
From Flickr by see phar
Original
Near
Far
What software?
What hardware?
What personnel?
How often?
Set up reminders!
Test system
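"Test system" can be as simple as confirming that every file in the original location has an identical copy in the backup. The following is a sketch under stated assumptions: `sha256_of` and `verify_backup` are hypothetical helpers, and the demo builds throwaway folders rather than touching real data.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(original_dir: Path, backup_dir: Path) -> list:
    """List files that are missing from the backup or differ from the original."""
    problems = []
    originals = sorted(p for p in original_dir.rglob("*") if p.is_file())
    for original in originals:
        relative = original.relative_to(original_dir)
        copy = backup_dir / relative
        if not copy.is_file():
            problems.append(f"missing: {relative}")
        elif sha256_of(copy) != sha256_of(original):
            problems.append(f"differs: {relative}")
    return problems


def demo():
    """Build a scratch 'original' and 'near' copy, then damage the copy on purpose."""
    with tempfile.TemporaryDirectory() as scratch:
        root = Path(scratch)
        original = root / "original"
        near = root / "near"
        original.mkdir()
        (original / "counts.csv").write_text("site,count\nnanaimo,12\n")
        (original / "notes.txt").write_text("collected 2010\n")
        shutil.copytree(original, near)
        healthy = verify_backup(original, near)
        # Simulate silent corruption and an accidental deletion in the backup.
        (near / "counts.csv").write_text("site,count\nnanaimo,21\n")
        (near / "notes.txt").unlink()
        damaged = verify_backup(original, near)
        return healthy, damaged


healthy_report, damaged_report = demo()
print(healthy_report)
print(damaged_report)
```

A check like this could run on the same schedule as the backup reminders, once for the "near" copy and once for the "far" copy.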
![Page 44: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/44.jpg)
SHARING
Repositories
• Institutional
• Disciplinary
• Journal
• re3data.org
Sustainable formats
• Open, non-proprietary
• Commonly used in your discipline
• Not encrypted or compressed
![Page 45: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/45.jpg)
Review your DMP
Did you do what you said you would?
![Page 46: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/46.jpg)
Photo credit Michael Ham
![Page 47: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/47.jpg)
How Do I Learn More?
• Funding Mandates:
  http://chronicle.com/article/Where-Should-You-Keep-Your/231065/
  http://datapub.cdlib.org/2013/02/28/the-new-ostp-policy-what-it-means/
• File Naming Conventions: http://www.exadox.com/en/articles/file-naming-convention-ten-rules-best-practice
• Folder Structures: http://www.damlearningcenter.com/resources/articles/best-practices-for-folder-organization/
• Metadata: http://www.dcc.ac.uk/resources/metadata-standards
• DataONE Primer: https://www.dataone.org/best-practices
• Software Carpentry: http://software-carpentry.org/
• Research Data Alliance: https://rd-alliance.org/
• Your Library: http://guides.lib.washington.edu/dmg
![Page 48: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/48.jpg)
Tools
• Data Mgmt Planning
  DMPTool https://dmptool.org/
• Metadata
  Morpho https://www.dataone.org/software-tools/morpho
  NOAA MERMaid http://www.ncddc.noaa.gov/metadata-standards/mermaid/
• Workflows
  Kepler https://kepler-project.org/
  Taverna http://www.taverna.org.uk/
• Sharing
  re3data http://www.re3data.org/
  GitHub https://github.com/
• Miscellaneous
  EZID http://ezid.cdlib.org/
  ImpactStory https://impactstory.org/
  ORCID http://orcid.org/
![Page 49: Data Management: Tips & Tools](https://reader030.fdocuments.us/reader030/viewer/2022032513/55d23937bb61ebbe1c8b45d4/html5/thumbnails/49.jpg)
Any Other Questions?
Stephanie Wright
Web data.blogspot.com
Twitter @UWLibsData
Email [email protected]