DATAVERSE COMMUNITY MEETING… · Dataverse > 70,000 datasets > 2.5 M downloads > 340,000 files...
DATAVERSE COMMUNITY MEETING
10 Years Sharing Data with Dataverse
#dataverse2017
<2006
Once there was the VDC
2
2006
And then came the Dataverse Network
2015
2 14
2006
Now we have the Dataverse
2015 2017
2 14 23
RESEARCHERS ARE SHARING AND USING DATA
200 datasets/month
4,000 files/month
60,000 downloads/month
Harvard Dataverse
> 70,000 datasets
> 2.5 M downloads
> 340,000 files
< 2006
When we started, there were very few journals with data policies and no data requirements from funders
2006 2015 2017
weak = recommend; strong = require
Weak data sharing and strong data sharing vs. disciplines
Castro, Crosas, Garnett, Sheridan, Altman, 2017, Journal of Scholarly Publishing, Forthcoming
Now, journals across disciplines start supporting data policies
[Figure: weak vs. strong data sharing policies by discipline. Categories: Genetics Journals, Biomedical Journals, Computational Sciences, Economics, Open Access Journals, Ecology. Labeled values: 33%, 4%]
2006 2015 2017
And funders require data sharing
PRIVATE RESEARCH FUNDERS
Bill and Melinda Gates Foundation Information Sharing Approach
Sloan Foundation Data Sharing Policy
Wellcome Trust Data Sharing Policy
Arnold Foundation
Moore Foundation
Robert Wood Johnson Foundation
HHMI Policy on the Sharing of Publication-Related Materials, Data and Software
PUBLIC RESEARCH FUNDERS
Department of Agriculture
Department of Commerce
Department of Defense
Department of Education
Department of Energy
Department of Health and Human Services
  Agency for Healthcare Research and Quality (AHRQ)
  Assistant Secretary for Preparedness and Response (ASPR)
  Centers for Disease Control and Prevention (CDC)
  Food and Drug Administration (FDA)
  National Institutes of Health (NIH)
Department of Homeland Security
Department of Housing and Urban Development
Department of the Interior
Department of Labor
Department of Transportation
Department of Veterans Affairs
Environmental Protection Agency (EPA)
WE ARE EXPERIENCING A CULTURAL CHANGE
WE ARE THE CULTURAL CHANGE!
King, 1995, Replication, Replication
Altman and King, 2007, A Proposed Standard for the Scholarly Citation of Quantitative Data
Altman et al, 2001, A Digital Library for the Dissemination and Replication of Quantitative Social Science
King, 2007, An Introduction to the Dataverse Network as an Infrastructure for Data Sharing
Crosas, Honaker, King, Sweeney, 2015, Automating Open Science for Big Data
Crosas, 2012, The Dataverse Network: an open source application for sharing, discovering, and preserving research data
Altman and Crosas, 2013, The Evolution to Data Citation: from principles to implementation
Crosas, 2013, A Data Sharing Story
2014, Joint Declaration of Data Citation Principles
Pepe et al, 2014, How Do Astronomers Share Data?
Goodman et al, 2014, Ten Simple Rules for the Care and Feeding of Scientific Data
Castro et al, 2015, Achieving Human and Machine Accessibility of Cited Data
Sweeney, Crosas, Bar-Sinai, 2015, Sharing Sensitive Data with Confidence: The DataTags System
Meyer et al, 2016, Data Publication with the Structural Biology Data Grid Supports Live Analysis
Wilkinson et al, 2016, The FAIR Guiding Principles for Scientific Data Management and Stewardship
Bierer, Crosas, Pierce, 2017, Data Authorship as an Incentive to Data Sharing
The Dataverse project and team leading many aspects of data sharing
2017
METRICS FROM LAST YEAR, JUNE 2016 TO JUNE 2017
AN ACTIVE TEAM AND COMMUNITY
22 COMMUNITY CALLS
190 ATTENDEES
25 ORGANIZATIONS/UNIVERSITIES
10 COUNTRIES
Community
975 GOOGLE GROUP MESSAGES
Community
7,114 IRC MESSAGES
Community
245 UNIQUE USERS
12 SPRINTS (STARTED IN JANUARY 2017)
IQSS Dataverse Team
220 STANDUP MEETINGS
IQSS Dataverse Team
52,000 SLACK MESSAGES
IQSS Dataverse Team
43 GITHUB CONTRIBUTORS
Code
334 PULL REQUESTS
Code
8,335 GITHUB COMMITS
Code
1,153 SUPPORT TICKETS
Support
DATAVERSE CUP 2017
A VISION: DATAVERSE AS A KEY PART OF THE FULL RESEARCH DATA LIFECYCLE
TOWARDS A DATA-CENTRIC RESEARCH LIFECYCLE
Data Collection
Lab E-Notebooks
Instruments
Surveys...
Assign DUA & metadata
Cloud Computing and Storage
Run data & code
Explore & Visualize data
Track Provenance
Journals & Funders
Data Citation
Work with Sensitive Data
FROM DATA COLLECTION, TO COMPUTING AND SHARING
RESEARCH COLLABORATIONS
Data Privacy
Big Data
Data Policies
Replication
...
COMMUNITY
STANDARDS AND BEST PRACTICES
INSTITUTIONS REQUIREMENTS
JOURNALS REQUIREMENTS
FUNDERS REQUIREMENTS
TECHNOLOGY ADVANCES