What to do about data? An overview of guidelines and policies for dataset collection development


description

Datasets are increasingly emerging as a ‘new currency’ in collection development. While purchasing models may in some ways mirror more traditional forms of electronic information, there are many unique considerations in the collection and acquisition of datasets. The purpose of this study is to determine the extent to which academic libraries have formalized dataset collection development policies and to highlight some of the key considerations in the development of such policies. The focus here is on commercially available datasets, rather than datasets produced at home institutions.


Sarah Young, Health Science and Policy Librarian, Cornell University, [email protected]

Why purchase data?

Secondary datasets are increasingly important to researchers as they attempt to answer questions, make predictions and test hypotheses in new and powerful ways. For libraries that strive to provide information to support research needs, these datasets can be considered a ‘new currency’ in collection development. There are many unique considerations in the collection and acquisition of datasets.

Methods

Existing dataset collection development policies, guidelines and programs were gathered from web searches of academic library websites, calls to listservs and personal communications. A total of 18 policies, guidelines, or programs were identified and considered in this work. A literature review was conducted with a focus on the collection of commercially available datasets. For references and links to dataset collection development policies, please see the handout.

Getting a dataset collection development program off the ground:

Getting the word out

Liaison librarians and subject selectors can and should be involved in working with researchers and faculty across disciplines, particularly in the beginning stages of the dataset evaluation process. They can help determine if free datasets, or datasets already held in library collections, meet researcher needs and can get the word out to departments.

Handling requests

Requests can be handled on an ad-hoc basis or via formal application procedures. Two institutions examined in this study (the University of Cincinnati and the University of Illinois) provided an online application process through which researchers could apply for library support for dataset purchasing.

Negotiating licenses

License negotiation can be lengthy and tedious; commercial vendors selling datasets are often accustomed to working with individual researchers rather than with libraries or under institutional licensing arrangements.

Datasets in the Workflow

Decide whether datasets will be treated like other electronic acquisitions. Licenses may be negotiated by e-resource acquisitions departments with expertise in negotiating terms of use. Datasets should be integrated into the normal cataloguing workflow and considered part of the digital preservation program.

Purpose

The purpose of this overview was to get a sense of current approaches to dataset collection development at other research institutions, to determine key considerations in dataset purchasing, and to highlight particular challenges in implementing a dataset collection development program.

Considerations in dataset purchasing

Cost

The amount the library is willing and able to contribute to a given dataset should be considered, with joint purchases between the library and the researchers when possible.

Format

Data should be provided in a format that can be supported by the library and used by the researcher. Consider readability in commonly used statistical software.

Documentation

Datasets that come with adequate documentation and relevant metadata are preferred. Consider the language and ease of cataloguing.

Storage needs

Datasets should comply with the library's existing storage capabilities. Confidential data requires special storage and access considerations.

Quality

The commercial supplier of the data and the data itself should be vetted for quality and reliability, and long-term access ensured.

Terms of Use

Datasets purchased should be institutionally accessible to all faculty, students and staff. Terms should be in accordance with those for other electronic resource purchases made by the library. Consider fair use and the rights of scholars to data derivatives.

Scope and Relevance

Datasets with a broad subject appeal to the research community, supporting the mission of the institution, should be prioritized. Consider currency, the value of historical data, and geographic scope. Will the value of a dataset increase or decrease over time?

Acknowledgements

Thanks to all of those who took the time to thoughtfully respond to listserv inquiries!


New England Area Librarian e-Science Symposium Poster Session, April 9, 2014


Dataset purchasing policies, guidelines and programs

*Brown University http://library.brown.edu/about/datacenter

Carleton College https://apps.carleton.edu/campus/library/assets/FacParticipation2012_13rev.pdf

Duke University not online; personal communication

Emory University https://edc.library.emory.edu/content/policy

Georgetown University http://guides.library.georgetown.edu/datapolicy

Harvard University https://hcl.harvard.edu:8001/forms/requests/data_purchase_guidelines.cfm

*MIT http://libguides.mit.edu/ssds/suggest

New York University not online; personal communication

*Yale University http://csssi.yale.edu/collections/data?destination=node%2F18

James Madison University http://www.lib.jmu.edu/faculty/datasetcdpolicy.aspx

*McMaster University http://library.mcmaster.ca/maps/Library_Data_Service_Collection_Policy.pdf

*Michigan State University http://libguides.lib.msu.edu/dataservicescollectiondevpolicy

*NC State University http://libguides.lib.msu.edu/dataservicescollectiondevpolicy

*Texas A&M http://library.tamu.edu/about/collections/collection-development/tamu-purchased-data-value-statement.html

**University of Cincinnati http://webcentral.uc.edu/taftawards/programdetail.cfm?programid=8

**University of Illinois at Urbana-Champaign http://www.library.illinois.edu/sc/datagis/purchase/description2013.html

University of New Hampshire http://www.library.unh.edu/research-support/data-services

University of North Carolina http://library.unc.edu/services/data/purchase/

* Denotes institutions with detailed dataset collection development policies online.

** Denotes institutions with formal application processes for dataset purchasing programs.


Other data policies to consider:

ICPSR (Inter-University Consortium for Political and Social Research)

http://www.icpsr.umich.edu/icpsrweb/content/datamanagement/policies/colldev.html

UK Data Archive

http://www.data-archive.ac.uk/media/54773/ukda067-rms-collectionsdevelopmentpolicy.pdf

References

Church, J. (2008). International Survey Data: Challenges and Strategies for Collection Development. DttP: A Quarterly Journal of Government Information Practice & Perspective, 36(1), 12–16.

Davis, H. M., & Vickery, J. N. (2007). Datasets, a Shift in the Currency of Scholarly Communication: Implications for Library Collections and Acquisitions. Serials Review, 33(1), 26–32. doi:10.1016/j.serrev.2006.11.004

Dollar, D., Eow, G., Linden, J., & Grafe, M. (2013). Distinctive Collections: The Space Between “General” and “Special” Collections and Implications for Collection Development. In Proceedings of the Charleston Library Conference. Charleston, SC: Purdue University Press. doi:10.5703/1288284315094

Erwin, T., Sweetkind-Singer, J., & Larsgaard, M. L. (2009). The National Geospatial Digital Archives—Collection Development: Lessons Learned. Library Trends, 57(3), 490–515.

Florance, P. (2006). GIS collection development within an academic library. Library Trends, 55(2), 222–234.

Lee, S. D. (2002). Electronic collection development: a practical guide. New York; London: Neal-Schuman Publishers; Library Association Pub.

Mooney, H., Hogenboom, K., Bordelon, B., Partlo, K., Hudson, M., & Jankowska, M. (2013, May 29). Strategies and Models for Data Collection Development. Presented at the International Association for Social Science Information Services & Technology (IASSIST) 2013, Cologne, Germany. Retrieved from http://iassistdata.org/downloads/2013/2013_c2_mooney_etal.pdf

Teper, T. H., Hogenboom, K., & Wiley, L. N. (2011). Collecting Small Data. Research Library Issues: A Quarterly Report from ARL, CNI, and SPARC, 276, 12–19.

Walters, W. H. (1999). Building and maintaining a numeric data collection. Journal of Documentation, 55(3), 271–287.