Trailblazing in the Wilderness of Data Management
Where are we going, and how do we get there from here?
Stephanie Wright
Data Services Coordinator
University of Washington Libraries
AGENDA
• Definitions
• Why venture out
• Paths already taken
  – Assessments of needs
  – Existing programs
  – Tools & resources
• Blazing your own trail
Montana State University – 21 June 2013
Definitions
• Data
• Data Management
• Big Data
• Long Tail of Data
• Acronyms
www.lib.washington.edu
DATA
By data, we do not mean a synonym for information. We mean research data: that which is collected, observed, or created for purposes of analysis to produce original research results.
Research data may be created in tabular, textual, statistical, numeric, geospatial, image, multimedia or other formats.
(Adapted from DISC-UK DataShare Project, p. 16)
Data can be produced from a variety of processes (e.g., observation, experimentation, simulation, derivation, compilation), represented in numerous forms and stored in many digital formats (e.g., ASCII, PDF, SPSS, Excel, TIFF, Java, FITS, CIF, ZVI). The scope of this definition includes data from disciplines in the sciences, social sciences, and humanities.
(Adapted from MIT Libraries, “What is Data?”, 2009)
DATA MANAGEMENT
Pertains to the collection, cleaning, storage, sharing, access, disposal, preservation and/or archiving of research data.
(Adapted from University of North Carolina, Research Data Stewardship Report, 2012)
BIG DATA
• Volume
• Velocity
• Variety
25 Definitions of Big Data: http://www.opentracker.net/article/25-definitions-big-data
– Now over 30 definitions
LONG TAIL OF DATA
Image credit: disruptormonkey.typepad.com
Acronyms
• RDM – Research Data Management
• IR – Institutional Repository
• DR – Data Repository
• DMP – Data Management Plan
Why Venture Out
• Funding agencies
• Universities
• Researchers
• Libraries
Image credit: National Park Service, Yellowstone photo collection, (http://www.nps.gov/features/yell/slidefile/mammals/wolf/Images/15314.jpg)
Funding Agencies
• 1998: NSF
• 2003: NIH
• 2011: NSF
• 2013: NSF, OSTP, OMB, NIH
Universities
• Competitiveness
• Reduce duplication of effort
• Preserve the research record of the institution
• Encourage innovation & discovery
Researchers
• Verifiability & reproducibility
• Increased citation rates for publications
  – (Piwowar et al., 2007)
• Preservation of individual scholarly record
• Save time by planning early
Libraries
• Digital Preservation Network (DPN)
“The Digital Preservation Network is being created by research-intensive universities to ensure long-term preservation of the complete digital scholarly record.”
http://d-p-n.org/
NSF Proposal & Award Policies & Procedures Guide (Oct 2012)
“Instructions for preparation of the Biographical Sketch have been revised to rename the "Publications" section to "Products" .... (P)roducts may include, but are not limited to, publications, data sets, software, patents, and copyrights.”
Paths Already Taken
• Assessments
• Existing programs
• Tools & Resources
Image credit: John W. Ridge (http://commons.wikimedia.org/wiki/File:Yellowstone_Trail_Map.jpg)
Assessments
• UNC (2012) “Research Data Stewardship Report”
• University of Colorado Boulder (2012) “Research Data Management @ UCB”
• Purdue “Data Curation Profiles Directory” (http://docs.lib.purdue.edu/dcp/)
• More: Georgia Tech, Cornell, Houston, Oregon….
Findings
• Researchers use a wide variety of data types – across disciplines
• Most researchers rely on themselves for data management
• Researchers want to maintain control of their data
• Many are unaware of existing services
• They want tools that work in existing workflows
What’s Needed
• Creating & maintaining DMPs
• Best practices guidance all along lifecycle
• Storage
  – Short-term access
  – Long-term access
  – Backup
  – Versioning
  – Security
• Metadata creation
Existing Programs
• Cornell
  – Research Data Management Service Group
    • Sr VP for Research and University Librarian
    • Faculty Advisory Board
      – 9 faculty across disciplines
      – OSP & Office of Research Integrity & Assurance
    • Management Council
      – 2 librarians, 2 faculty, 2 IT, 1 research institute
• Purdue
  – D2C2: Distributed Data Curation Center
    • Executive Committee
      – Dean of Libraries, VP of Research & VP of IT
    • Library: consulting & metadata support
    • IT: storage & research computing support
• University of Washington
  – Data Services Program (1.5 FTE)
    • Data Services Coordinator
    • Data Services Communications & Curriculum Librarian
  – Data Services Team (10 members)
  – Partnerships
    • Research Centers (eSci, CSDE, IHME)
    • Office of Research (OSP)
    • Campus IT
    • iSchool
Tools & Resources
• Data Mgmt Planning: DMPTool
• Metadata & Sharing: DataUP
• Sharing & Storage: DataBib
• Citation: EZID
• Best Practices: DMVitals
Blazing Your Own Trail
Image credit: Michigan State University Department of History, HST 321: History of the American West (http://history.msu.edu/hst321/files/2010/07/colter.jpg)
Where do you want to go?
• Identify needs
• Consider potential partners
• Scope
  – Disciplines
  – Specific areas of the data lifecycle
• Determine priorities
  – New services? Enhance existing? Market existing?
MSU Strategic Plan
• Objective L1
  – Assess, and improve where needed, student learning of critical knowledge & skills
• Objective D1
  – Elevate the research excellence and recognition of MSU faculty
    • D1.2
• Objective D2
  – Enhance infrastructure in support of research, discovery and creative activities
Campus IT
• Support for active data storage
• Data security guidance
• Backup services
• Development of tools that can be inserted into existing workflows
Office of Research
• Guidance on legal / ethical considerations
• Incorporate DM planning into grant submission process
• New faculty data management orientations
Libraries
• Market and provide access to existing RDM resources
• Provide learning opportunities on RDM best practices
• DMP consultation
• Storage (final)
• Metadata consultation
University
• University policy on data management
• Integrate RDM activities into T&P process
• Consider campus policy on open data
Questions
Thank you!
Stephanie Wright
Data Services Coordinator
[email protected]
@shefw
http://guides.lib.washington.edu/swright
Data Management Guide: http://guides.lib.washington.edu/dmg
ResearchWorks Data Services: http://researchworks.lib.washington.edu/rw-data.html