Task Force on Research Data Implementation
Summary Report
December 2014
Prepared by:
Ron Jantz
Yu-Hung Lin
Aletia Morgan
Laura Palumbo, Chair
Minglu Wang
Krista White
Ryan Womack
Yingting Zhang
Yini Zhu
Revised January 2015
Introduction:
Last year, the Office of Science and Technology Policy mandated that the direct results of
research funded by federal agencies with research budgets of more than one hundred million
dollars be made publicly accessible.1 This followed the 2011 policy change by the National
Science Foundation, which required researchers to submit a data management plan outlining how
their funded data would be managed, shared, and preserved.2 As a result, researchers are
complying with these new mandates by seeking easy and effective ways to share, access, and
preserve research data.
Academic libraries have begun to fill the demand for digital repositories which allow their
researchers’ data to be discoverable, accessible, and preserved for the long term. Rutgers
University Libraries are poised to offer exceptional research data services, and in July 2014 the
Task Force on Research Data Implementation began work in order to “establish an
administrative and evaluation framework for the deposit of research data” in accordance with the
Libraries’ and the University’s Strategic Plans.3 The Task Force was charged with the
completion of ten items to prepare for the ongoing and efficient acceptance of research data (see
Task Force Charge, Appendix A). The following sections of this report will address each of these
task items individually.
Environmental Scan of Institutional and Data Repositories
1. Review the administrative structure of other data repositories that might serve as models.
2. Review the evaluation process for technical, legal, and confidential issues involving data
deposit at other institutions that might serve as models.
The Task Force completed a review of thirty-seven repositories to assess their administrative
structure, and their evaluation processes for technical, legal, and confidential issues in fulfillment
of the first two task items of our charge. The repositories were evaluated based on the
Association of Research Libraries Systems and Procedures Exchange Center: Research Data
Management Services (ARL SPEC) Kit 334 (July 2013), which “…surveys ARL member
libraries on their activities related to access, management, and archiving of research data at their
institutions.”
The Task Force developed a set of thirty-four review criteria to analyze the Research Data
Management (RDM) systems of the reviewed institutions, which were categorized into five
areas: Research Data Management Services (RDMS); Data Archiving Services; RDM Service
Staffing; Partnerships; and Research Data Policy. These criteria were reviewed based on publicly
available information from the repositories’ and libraries’ websites, and the findings were
summarized in an Interim Report, dated October 2014. Since the Interim Report was written,
additional information was sought from selected repository managers via phone conversations
and is included in this report where relevant to the remaining items of our charge. (See Task
Force on Research Data Implementation Interim Report, October 2014.)
1 http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf
2 http://www.nsf.gov/bfa/dias/policy/dmp.jsp
3 Excerpts from the Task Force’s Interim Report are used throughout the Final Report where relevant.
Following are summarized excerpts from the Interim Report. From our research, we discovered
that:
• Almost all of the institutions reviewed provided research data management training and
consulting, typically in data management plan preparations. This is an area to be
leveraged to increase library visibility and to establish additional connections with
research faculty.
• About half of the reviewed repositories were operated by libraries, and many worked in
collaboration with outside units and offices such as the Office of the Vice President for
Research, and the Office of Information Technology.
• The number of research data management service staff members was dependent on each
institution’s funding and culture. Staffing numbers ranged in size, from one or two staff
to as many as eighteen at one institution.
• Most repositories place the responsibility for the evaluation of data on the principal
investigator. Only two of the reviewed repositories placed curation responsibilities
exclusively with librarians, although others used teams including librarians.
• Over three-fourths of the reviewed repositories allowed self-deposit of data or
self-deposit and mediated deposit.
• Data deposit agreements were common, and most shared a similar format. Depositors
typically needed to agree that they were legally allowed to deposit the data for public
access; that the data does not contain any personal or sensitive information; that the
depositor holds the institution harmless from any liability incurred as a result of the
deposit and public access of the data; and that the repository may enact certain described
operations in order to provide for data discovery, maintenance, and preservation.
• Privacy and security issues were typically addressed by agreements wherein the depositor
stated that the data was free of any confidential or sensitive data; and by stripping of
identifying information, or in some cases by encryption. Responsibility for the protection
of confidential data was placed with the principal investigator or the researcher depositor.
• Information about repository storage capacity was limited. Restrictions to file sizes and
file types were more prevalent, with offerings ranging from 10 – 500 GB free of charge;
and acceptance of most standard file types associated with open source and widely used
proprietary software was common.
• Funding models for storage and preservation of research data have not been established
for many repositories, although a few did provide information about costs of services.
These are described below.
We found that the institutions reviewed were at varying stages with regard to acceptance of
research data. A few were not accepting data or had a limited number of datasets, but many had
respectable quantities of data, and some were well-established, data-exclusive repositories.
Eighteen institutions had repositories operated by libraries, and many worked in collaboration
with outside units and offices such as the Office of the Vice President for Research, and the
Office of Information Technology. Staff responsible for the repositories’ activities varied by the
size of the institution, with the largest data management teams at two institutions consisting of
fifteen to eighteen members. Collaborative efforts with units outside of the libraries and a team
approach in general seem to make the most sense for larger institutions.
Most repositories place the responsibility of the evaluation of data on the principal investigator.
Only two repositories placed curation responsibilities exclusively with librarians, although others
used teams including librarians. Over three-fourths of the reviewed repositories allowed self-
deposit of data or self-deposit and mediated deposit, similar to the process used by RUcore in
acceptance of research documents, albeit with additional forms and guidance. Data deposit
agreements were common, and most shared a similar format as noted above. Responsibility for
the protection of confidential data was placed with the principal investigator or the researcher
depositor. Based on our findings, we believe that researcher responsibility for legal issues and
self-deposit make the most sense from a liability and efficiency standpoint, with case-by-case
exceptions.
Information about repository storage capacity was limited. Restrictions to file sizes and file types
were more prevalent, with offerings ranging from 10 – 500 GB free of charge; and acceptance of
most standard file types associated with open source and widely used proprietary software was
common. Funding models for storage and preservation of research data do not appear to have
been established for most repositories, although a few did provide information about costs.
Ongoing funding for data preservation and repository growth is an important aspect of data
management services that will need to be addressed in order to create sustainable systems,
particularly where staff time is a factor.
Although the cost of storage is relatively low, some repositories have fees based on storage
volume, possibly as a way to quantify their services. At the mission-driven, non-institutional
organizations (ICPSR, Dryad, Odum), a more explicit funding arrangement is specified. ICPSR
funds its operations through grant-funding for major subject areas, with some chargebacks to
users for deposit. Dryad has some tiers of pricing, with the lowest price set at $65 per deposit.
Odum also charges fees tailored to each project.
Among university repositories, a few specify limits to the free service offered. The University of
Edinburgh allows up to 500GB of space for each researcher for free. Michigan indicates there
will be charges for extra metadata work by librarians. Stanford recommends that grant proposals
include IT costs, and that it may charge for data over 10GB in the future (not at present).
Syracuse also mentions an evolving funding model. Princeton and Berkeley charge strictly by
size. Princeton charges $0.006/MB (or $6/GB) as a one-time charge. Berkeley charges
$0.14/month for each GB stored.
The Johns Hopkins model is unusual, partly because it was designed from the start to become
self-funding once initial grants ran out. For small collections, a $1600 charge is standard. For large
collections (2TB or more), 2% of the total grant funding is billed to support the data repository.
Finally, Purdue has the most detailed funding model. Central university funding pays for the
following free allocations: 10GB for 3 years for trial projects, 1GB for 10 years for a small
publication, and 100GB for 10 years for a grant-funded project or publication. Additional space
is billed per GB on a yearly, or a 10-year basis. (See https://purr.purdue.edu/about/pricing for full
details.)
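The size-based charges quoted above can be compared with a little arithmetic. The following is a minimal sketch, not any institution's actual billing logic: the function names are hypothetical, the rates are the ones cited in this report, and it assumes that every Johns Hopkins collection under 2 TB counts as "small" and that 2 TB equals 2,000 GB.

```python
# Illustrative only: rates are those quoted in this report; function names
# and the small/large cutoff handling are assumptions made for this sketch.

def princeton_cost(size_gb: float) -> float:
    """One-time charge at $6 per GB (the $0.006/MB figure)."""
    return round(6.00 * size_gb, 2)


def berkeley_cost(size_gb: float, months: int) -> float:
    """Recurring charge at $0.14 per GB per month of storage."""
    return round(0.14 * size_gb * months, 2)


def johns_hopkins_cost(size_gb: float, grant_total: float) -> float:
    """Flat $1,600 for small collections; 2% of grant funding at 2 TB or more.

    Assumption: anything under 2 TB (taken here as 2,000 GB) is "small".
    """
    if size_gb >= 2000:
        return round(0.02 * grant_total, 2)
    return 1600.0


if __name__ == "__main__":
    size_gb, months, grant = 50, 120, 500_000  # hypothetical project
    print("Princeton:    ", princeton_cost(size_gb))
    print("Berkeley:     ", berkeley_cost(size_gb, months))
    print("Johns Hopkins:", johns_hopkins_cost(size_gb, grant))
```

Under these assumptions a monthly rate of $0.14/GB overtakes a one-time $6/GB charge after roughly 43 months of storage (6 / 0.14 ≈ 42.9), which is why the retention period matters as much as the dataset size when comparing models.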
Our review also revealed that services in addition to the technical, legal, and administrative
aspects of data ingest, such as data management training and consulting, are flourishing in
institutions even without data capabilities in their repositories. This is an area that can be
leveraged for enhanced library visibility in anticipation of the acceptance of data into RUcore.
The above findings from our Interim Report guided the completion of the remaining items of our
charge.
Rutgers Major Stakeholders in Research Data: Policies and Practices
3. Consult with appropriate major stakeholders to ensure that RUL workflows and practices facilitate and
do not conflict with policies and practices of those departments, especially the office of the Vice President
for Research, and Research and Sponsored Programs.
Although Rutgers is currently without an existing University-wide Research Data Policy, we
assumed that a data policy would be in place in the near future, and would resemble the data
policies we have seen at other institutions. Based on our research, the commonalities in the best
policies seem to be that the university owns the data; the principal investigator is responsible for
making sure it is preserved; and protocols exist in the event the PI leaves the institution. We
allowed this to guide our conversations with Rutgers research offices.
Several members of the Research Data Task Force have periodic meetings with campus
stakeholders in the disposition of research data as part of their normal responsibilities, including
regular meetings with Terri Kinzy (Associate Vice President for Research Administration) and
Eileen Murphy (Director of Research Development) since they assumed their new leadership
positions in the Office of Research & Economic Development (ORED) in late 2013. The
Research Data Manager, Aletia Morgan, and Data Librarian, Ryan Womack, initially met with
them in December to introduce them to RUL, RUcore, and the existing and potential RUL data
management services available to campus researchers. Since then, Aletia has continued to meet
with them periodically, and as available, she and Yingting Zhang have been regular attendees at
the monthly Research Facilitators meetings sponsored by ORED.
Given the evolution in this relationship, we were able to talk with Eileen and Terri specifically
regarding the RUL Research Data Task Force, and its goal of finalizing service processes and
expectations that can make RUL a trusted partner in the management, preservation, sharing, and
reuse of research data. They are excited about the potential for the Libraries to facilitate
researcher compliance with funder requirements. Their primary piece of advice as we move
forward with the service is to “keep it simple”. They were emphatic that if we make the
submission process cumbersome and judgmental, researchers will simply go elsewhere with their
data.
A key question critical to the acceptance of research data is whether the researcher has the
authority as data custodian to determine whether datasets are appropriate for preservation
services, or whether the library’s workflow process should include a detailed assessment of each
dataset with respect to the quality, human subject protection, ownership (copyright), and
commercial value of any data. Eileen recommended that we speak directly to Judith Neubauer,
Associate Vice President for Research Regulatory Affairs in ORED, and that our workflow
should be crafted to be consistent with her guidance. On September 2, Aletia Morgan and Laura
Palumbo met with Judy Neubauer, Eileen Murphy and Terri Kinzy to discuss the role of the
Libraries with regard to the assessment of research data.
Judy Neubauer was very helpful, and was clear in her belief that the researcher is the ultimate
custodian of the data generated from research, and should be the final arbiter regarding the
suitability for data deposit in any sharing resource or repository, whether in RUcore or
elsewhere. We discussed the current absence of a University Research Data Policy, and while
she did not see this as an impediment to the work of the Libraries, she did say that it should be a
very simple document that defines the researcher as the responsible party regarding the use of
any data generated from funded research.
As another important stakeholder in the research process, we met with leadership of the Office of
Research and Sponsored Programs (ORSP). On November 14, Laura Palumbo and Aletia
Morgan met with Diane Ambrose and Cassandra Burrows (Senior Associate Director and
Assistant Director, respectively). Our primary question was whether their unit would want to be
notified and given any right of approval for any dataset deposits. Additionally, we hoped to
identify processes that would improve our awareness of projects that could yield data for RUcore
deposit, and ways to promote our services among researchers.
Our conversation was very helpful. In brief, Diane and Cassandra felt it unnecessary for us to
notify any ORSP staff when researchers deposit data with RUcore. Certainly, their staff would
be available to respond to any questions we have, but they would not expect this to happen often.
We progressed from this issue to ways to promote awareness of RUcore among the research
community. Their recommendation is that we present updates on RUcore and the RUL research
data services to ORSP staff periodically, to ensure that they are aware of what we can do to help
researchers in their data management efforts, including development of data management plans.
Further, we talked about developing processes to promote RUcore to researchers at the time of
funding commitments, and also sharing a list of awards with us periodically, so that we can reach
out to researchers with projects that might yield data suitable for RUcore preservation.
On November 13, Yingting Zhang had a conversation with Donna Hoagland, Director of the
Institutional Review Board for New Brunswick-Piscataway, about working with the Libraries on
issues related to human subject data, and she was supportive of a future collaboration with us.
Additionally, Laura Palumbo and Aletia Morgan met with Paula Bistak (Executive Director
Human Subjects Protection Program) and Michelle Gibel Watkinson (Senior IRB
Administrator), to discuss the question of how or whether the IRB would want to review any
RUcore data submissions; and to understand whether IRB approvals typically address the
protection, sharing, maintenance or disposition of any research data. This will obviously not be
an issue during the initial implementation, as no projects with human subject privacy issues will
be accepted. But as we move forward, we will want to make sure that we are working in
accordance with IRB requirements. Paula and Michelle were supportive of further collaboration
between the Libraries and their offices, and it was suggested that the Libraries work with them to
develop guidance for researchers regarding the levels of sharing permitted through RUcore, from
public access, to restricted access to a defined user group, or preservation only.
The charge for the task force to consult with the leaders of the Rutgers research enterprise
yielded valuable information. The relationship we had already forged with these offices
encouraged candid discussions that will help us ensure that our final service recommendations
and ultimate implementation practices are consistent with their standards and expectations.
Principles for Prioritization of Data Deposit Projects
4. Establish principles for prioritization of data deposit projects based on RUL strategic priorities. This
should include a definition of various types of potential projects to ensure that we have the resources both
to host and to sustain projects, i.e., federal grants, non-grant funded research, etc.
Research data to be accepted into RUcore will demonstrate the scholarly and scientific research
being conducted by Rutgers’ researchers. This data will contribute to the advancement of
knowledge and research in diverse subject areas, including the sciences, health sciences, social
sciences, and humanities, as described in the Libraries’ Strategic Plan. By accepting and publicly
sharing research data, we will highlight the unique ability of the Libraries to facilitate discovery,
access, and reuse of Rutgers University’s research. Research data will be accepted insofar as the
Libraries have “…the resources, including, but not limited to, expertise, technology, and funding,
to support the project, both initially and ongoing.” (Digital Projects Evaluation Process, Rutgers
University Libraries, March 2013).
Research data may be the result of unfunded as well as grant-funded research, to allow for a
broad spectrum of research areas to be included; however, projects that require data deposit to
comply with funder mandates may be given preference. Working with Rutgers’ researchers, the
Libraries will provide access to Rutgers’ research data. The Principal Investigator or Primary
Responsible Researcher (hereafter the Responsible Researcher, a term meant to include both
titles) will be responsible for assuring that the data can be shared publicly in accordance with
University policies and Federal and other funders’ directives, and in compliance with any legal
restrictions. Through a deposit agreement, they will attest that by
sharing the data they will not be in violation of any confidentiality agreements, copyright laws,
or other laws, and will hold Rutgers University Libraries harmless from any damages resulting
from the sharing or misuse of the data. (See the following Evaluation of Data Projects for
Deposit for information concerning the deposit agreement.)
The Task Force has developed Research Data Service Guidelines for acceptance of data into
RUcore. These guidelines were drafted assuming a staged approach, with the initial
implementation of mediated ingest limited to data projects that involve no human or animal
subjects and no commercial interests, and that are typically less than 100 GB in data volume per
project, although larger quantities may be considered. We believe that we currently have the staffing,
storage, and system requirements to accept these projects immediately.
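The initial-implementation criteria described above amount to a simple screening rule. The sketch below is one hedged way to express it; the `DataProject` fields, the `triage` function, and the three outcome labels are hypothetical names invented for illustration and do not come from the Research Data Service Guidelines themselves.

```python
from dataclasses import dataclass

# "typically less than 100 GB of data volume per project" (initial implementation)
INITIAL_SIZE_GUIDE_GB = 100


@dataclass
class DataProject:
    """Hypothetical summary of a proposed deposit, for illustration only."""
    has_human_subjects: bool
    has_animal_subjects: bool
    has_commercial_interests: bool
    size_gb: float


def triage(project: DataProject) -> str:
    """Return an illustrative routing label for a proposed data project."""
    if (project.has_human_subjects
            or project.has_animal_subjects
            or project.has_commercial_interests):
        # Outside the initial guidelines: left for the full implementation,
        # or handled as a special data project.
        return "defer-or-special-project"
    if project.size_gb >= INITIAL_SIZE_GUIDE_GB:
        # Larger quantities "may be considered", so this is not a rejection.
        return "case-by-case"
    return "initial-mediated-deposit"


if __name__ == "__main__":
    print(triage(DataProject(False, False, False, 12.5)))
    print(triage(DataProject(False, False, False, 250)))
    print(triage(DataProject(True, False, False, 1)))
```

The size check is deliberately a soft outcome rather than a refusal, mirroring the statement that larger quantities may be considered.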
During the initial implementation of data acceptance, the deposit process will be mediated by
members of the RUL Research Data Team. Researchers will work directly with a Project
Manager from the RUL Research Data Team who will guide the researcher through the deposit
process and see it through to completion (see RUL Research Data Team and Data Deposit
Workflow for more information). Data projects outside the guidelines for the initial
implementation, such as those with human subjects, would be considered in the full
implementation of research data services, or on a case-by-case basis as a special data project. We
envision development of a full implementation of data acceptance that would allow researchers
to self-deposit data, in addition to providing mediated deposit when necessary. (See the
following Evaluation of Data Projects for Deposit for more information about self-deposit of
data.)
Please see Appendix B for the Research Data Service Guidelines, which outlines acceptance of
data projects for initial mediated deposit; the subsequent self-deposit and mediated full
implementation; and special projects which will be assessed on a case-by-case basis.
Evaluation of Data Projects for Deposit
5. Develop a framework for evaluation for data deposit in RUcore that includes a questionnaire or series
of questionnaires to be used for each data deposit, covering technical, legal, and confidential criteria
(similar to the Digital Projects Evaluation Process approved by Cabinet in March 2013).
The Task Force has created separate high-level criteria intake questionnaires for the initial
acceptance of mediated data projects, and for the full implementation of data acceptance, which
also includes self-deposit. The questionnaires for each stage of data acceptance ensure that the
requirements of the Guidelines are met, and are to be signed by the Responsible Researcher.
Once the questionnaire has been completed and it has been determined that the high-level criteria
are met, an application form is completed by the Responsible Researcher to establish a minimum
amount of metadata. During mediated data deposit, the questions would be asked of a researcher
by the appropriate member of the RUL Research Data Team, and/or a subject liaison. (See
Proposed RUL Research Data Team.) Once the project application is complete, the Responsible
Researcher would sign a Data Deposit Agreement, allowing RUL to accept the data.
The data deposit agreements reviewed during our environmental scan of data repositories
typically state that the Responsible Researcher is responsible for ensuring that they are legally
allowed to deposit the data for public access; that the data does not contain any personal or
sensitive information; that the depositor holds the institution harmless from any liability incurred
as a result of the deposit and public access of the data; and that the repository may enact certain
described operations in order to provide for data discovery, maintenance, and preservation.
The Task Force drafted a suggested data deposit agreement adapted from a previously created
document prepared by the Copyright Librarian. We have made suggested modifications based on
our review of existing deposit agreements, as well as data policies found at peer and aspirant
institutions, with the assumption that similar research data policy would be adopted by Rutgers.
In short, we found that most data policies assert that the University owns the research data; that
the Principal Investigator or Responsible Researcher is the custodian of that data; and that
the data would remain with the institution should the Responsible Researcher
leave. The suggested deposit agreement will need to be modified to align with a University data
policy, when such a policy is adopted.
During the full implementation of data services, data deposit may be automated as well as
mediated. Mediated data deposit will still be an option for researchers needing assistance, and for
projects which are very large, complex, or which would require infrastructure modifications, i.e.
a special research data project. For self-deposited data, the forms would be online and would
require NetID authentication and an electronic signature. Guideline questions would be affirmed
by the researcher, preliminary metadata entered, and the deposit agreement accepted. This self-
deposit process would include a waiting period, during which time the RUL Research Data Team
would review documents before the data becomes public.
Discussions with repository director Lisa Johnston at the University of Minnesota reveal that
their waiting period is two days for self-deposit, during which time a cursory review of file types
is performed. A conversation with Purdue’s PURR manager Scott Brandt indicates that no
checking of self-deposited data is done; the researcher alone is responsible for compliance with
any restrictions on sharing the data, limiting the library’s liability for these issues. This is also the
case at Penn State, where Mike Giarlo indicated that uploaded data becomes “live” immediately.
For RUcore, we believe that the appropriate time frame for review of the high-level criteria
intake questionnaires would be five working days. The time frame for further review of
applications and documentation will depend on the complexity of the project.
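To make the proposed five-working-day review window concrete, the following minimal sketch computes a due date from a submission date. It rests on assumptions not stated in this report: working days are taken to be Monday through Friday, University holidays are ignored, and counting starts on the day after submission. The function name `review_due` is hypothetical.

```python
from datetime import date, timedelta


def review_due(submitted: date, working_days: int = 5) -> date:
    """Return the date on which the given number of working days has elapsed.

    Assumptions: Monday-Friday working week, no holiday calendar, and the
    submission day itself is not counted.
    """
    current = submitted
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday is 0, Friday is 4
            remaining -= 1
    return current


if __name__ == "__main__":
    # A questionnaire submitted on Monday, 5 January 2015
    print(review_due(date(2015, 1, 5)))
```

A production version would need a holiday calendar and an agreed rule for submissions received after business hours.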
Should researchers need assistance with copyright issues, guidance would be available through
referral to the Copyright Librarian. Researchers needing assistance with issues concerning
intellectual property and commercial interests would be directed to seek advice from the Office
of Technology Commercialization. Researchers with human or animal subjects’ data would be
directed by the Research Data Manager to a suitable repository for data sharing during the initial
implementation of data acceptance.
For both mediated and self-deposited data in RUcore, the responsibility for compliance with any
legal restrictions would lie with the Principal Investigator/Responsible Researcher. They will
assume liability for determining if their data is free from any copyright or intellectual property
constraints, sensitive or confidential information, any restrictions on public accessibility, or any
other legal and ethical issues which might prevent their depositing and sharing the data publicly.
The Libraries will be exempt from liability by not assuming responsibility for these
determinations.
Please see Appendix C for the initial implementation Mediated Research Data Projects
Questionnaire; Appendix D for the Flowchart for the Mediated Data Projects
Questionnaire; Appendix E for the full implementation Self-Deposit, Mediated, and Special
Research Data Projects Questionnaire; Appendix F for the Full Implementation Flowchart
for Mediated and Self-Deposited Data Projects; and Appendix G for the Research Data
Project Application for the Responsible Researcher. Also included in Appendix H is a
proposed Research Data Deposit Agreement, which can be modified when a Rutgers
University Data Policy becomes available.
Evaluation Guidelines for Subject Liaisons
6. Develop a corresponding guide on evaluation criteria to provide clarity to subject librarians. (Similar
to the Deed of Gift Explanation in the RUL Deed of Gift).
Guidelines have been drafted to better enable subject liaisons who have completed the
RUresearch training course “Supporting Faculty Research Data Needs”, to work directly with
researchers in assisting with data deposit. If a subject liaison has not been trained, they will work
with an appropriate Project Manager from the RUL Research Data Team until such time as they
are able to manage a data project without assistance. Project Managers will provide assistance to
researchers with forms and referrals to other personnel or offices if necessary, and enter metadata
into RUcore. (For more details about the Project Manager’s duties, please see Data Deposit
Workflow.)
Please see Appendix I for the General Guidelines for Librarians advising on Research Data
Projects for RUcore deposit. Project Managers also have available to them more detailed
information about consulting with researchers regarding data in the RUresearch Sakai site.
Proposed RUL Research Data Team
7. Recommend assignments for functional responsibility in the area of data deposit.
Based on our review of other repositories, and given the background and experience of current
staff, we believe that RUL has the staffing capacity to accept research data immediately. After
consideration of the staffing arrangements at other institutions, we propose the following
organizational structure for a RUL Research Data Team, which would serve researchers at all
Rutgers locations.
The RUL Data Team should consist of existing Libraries personnel, who are already well
qualified for the review and acceptance of research data. The Team should be led by a full-time
Data Manager, whose time is one hundred percent attributable to the activities related to data
acceptance in RUcore. The Research Data Manager would be responsible for the overall leadership
of the Research Data Service, including participation in national and international organizations
related to Research Data. The Research Data Manager would also coordinate the work of
student interns from SCI and elsewhere, and work with others assisting on data projects (e.g.,
departmental graduate students, postdocs, and research fellows).
We envision a Data Team that consists of two parts: a Core Data Team, who will be responsible
for preliminary review of data projects and who will also serve as Project Managers when
appropriate; and an Expanded Data Team, who will act as Project Managers and oversee data
projects to their completion. The Core Data Team as a whole could meet on a weekly basis, if
projects are awaiting review. If there are issues with rights, commercialization, sensitive
information, or other legal issues, the Responsible Researcher will be referred to the appropriate
personnel or office for guidance, such as the Copyright Librarian, the Repository Collection
Librarian, the Office of Technology Commercialization, or the Institutional Review Board, for
resolution of any issues before the data project is accepted.
The Expanded Data Team would meet as needed to review prioritization and scheduling of
projects, and members of the expanded team who have been assigned to specific data projects
would meet separately as needed to move these projects forward. These are the personnel whom
we would assign to each group:
Core Data Team
Research Data Manager
Data Librarians- 2
Chemistry & Physics Librarian/Science Data Specialist
Metadata Librarian for Continuing Resources, Scholarship and Data
Digital Data Curator
Digital Library Architect
Expanded Data Team-includes the above plus:
Digital Humanities Librarians- 2
Health Sciences Librarians- 2
Social Science Librarian
Physical Sciences Librarian, Newark
Life Sciences Librarian, Newark
With the exception of the Research Data Manager, these personnel would be responsible for data
duties on a part-time basis, and as RUL Research Data Services grow, additional staffing and
resources should be allocated to the Research Data Service. Job descriptions may need to
accommodate changes in time spent on data in the future. We should seek to include
representatives on the data team from Camden as well.
Data Deposit Workflow
8. Chart a workflow for the data deposit evaluation process.
The preceding guidelines and questionnaires establish the process for evaluation of data for
deposit. These have been drafted so that they can be adapted to accommodate a self-deposit
process, where the responsible researcher would be enabled to answer relevant questions about
his or her data directly in an online form. On the following pages are a flowchart which provides
an overview of the workflow of the data deposit process, and narratives which describe the
process in more detail.
[Flowchart: RUL Research Data Project Workflow]
Project Workflow for Initial Implementation: Mediated Deposit
Potential data management projects will come to the attention of the RUL Research Data Team
from any of several sources, including the researcher, a research assistant or agent, another
librarian, or earlier Research Data Management Plan support. The Research Data
Manager (RDM) will typically work with the Responsible Researcher (RR) to identify the basic
characteristics of the project as specified in the high level criteria intake Questionnaire, although
this may be done by a member of the Core Data Team if this is where the researcher has made
initial contact.
The RDM (or initially contacted member of the Core Data Team) will review the attributes of the
project as identified in the intake Questionnaire to determine its likely suitability for the initial
implementation of RUL Research Data Services. If the project appears to conform to the
specifications of the initial implementation of the Research Data Service, the RDM or Team
member will notify the rest of the Core Data Team. The group will identify an appropriate
Project Manager (PM) from the Data Team to lead the further review and implementation of the
project.
The Project Manager will ensure that the appropriate departmental Liaison Librarian is aware of
the project (if not already part of the Data Team), and will work with the researcher to complete
the Project Application, which will supply metadata and collection information necessary for
RUcore ingest. With the Project Application complete, the PM will review the project with the
Data Team, who will confirm that the project is suitable for deposit in the initial implementation.
Ideally the entire review process will not take more than two weeks. The Data Team may bring
in other librarians or staff members as needed, who may participate in the project as an
opportunity for shared learning.
The PM will notify the researcher of acceptance of the data project, and the researcher will then
be required to sign a Data Deposit Agreement. After the Responsible Researcher signs the Data
Deposit Agreement, the PM will collect the data and any other additional documents or resources
not already provided. The PM will then work with the other members of the RUL Data Team,
including metadata specialists as needed, to complete the ingest of the data and documentation
into RUcore, and will enter metadata into WMS. The minimum metadata required for ingest
consists of the identifiers described in the Project Application.
Additional metadata may be available, and the PM will enter this metadata as required. Once the
collection is created and the ingest is complete, the project data will be available on RUcore.
The datasets and links to all related events such as scholarly publications will be available for
public access and download.
If the project does not appear to be suitable for the initial implementation, either based on the
responses to the intake Questionnaire or the information gathered in the more formal Project
Application process, the RDM or PM will communicate with the Core Data Team to review
whether the project might meet the guidelines for acceptance of data during the full
implementation, or whether it is a candidate for treatment as a special project. Proposed special
projects will be reviewed by the Associate University Librarian for Digital Library Systems for
acceptance. If there are unresolved rights issues, technical issues, or if the project is otherwise
outside of the criteria for the initial implementation, the RDM or PM will assist with referrals to
appropriate personnel to resolve these issues, and attend further consultations as necessary.
Project Workflow for Full Implementation: Mediated or Self-Deposit
The full implementation of the RUL Research Data Service will offer two routes for research
data to come in to RUcore: mediated deposit and self-deposit. The mediated project process in the full
implementation will be the same as was described previously for the initial implementation, with
the additional capability of accepting datasets with more diverse characteristics and varied
requirements for controlled access. If the researcher desires a mediated process, that workflow
will be followed.
The development of tools for self-deposit of research datasets will be modeled after the existing
Faculty Deposit for scholarly publications. The online self-deposit process will require the
depositor to provide information about the project by completing the Questionnaire, the
Application Form, and the Deposit Agreement.
If the data is self-deposited, an automatic notification will be sent to the RUL Core Data Team,
who will review the completed Questionnaire, Application, signed Deposit Agreement, and data
files to determine if the project can be accepted into RUcore directly, whether it will require
mediated deposit, or if it should be considered a special data project. The researcher will receive
online confirmation that the data has been successfully uploaded, that the project is being
reviewed, and that notification of the outcome of the review will be sent within five working
days.
The Core Data Team will determine who will be the Project Manager (PM) for the duration of
the project. The PM will be a member of the Expanded Data Team, who may decide to include a
subject specialist or other librarian or staff member to assist with the project. If the Data Team
determines that there are unresolved rights issues, technical issues, or issues concerning sensitive
data, the PM will contact the researcher with referrals to appropriate personnel to resolve these
issues, and will also work with the researcher as needed to resolve any outstanding issues. The
PM will collect any outstanding documentation.
If the project does not appear to have unresolved issues, the data will be given a brief review to
check for de-identification of confidential or sensitive information, if applicable. The review will
also confirm that descriptive documentation is provided in the form of a “README” file, so that
other researchers will be able to understand and use the data files; that file names are
meaningful; that the file types can be accepted into RUcore; that the files can be opened and read
in the appropriate application; that sufficient supplementary documentation, such as codebooks
or questionnaires, is provided; and that any URLs are persistent. This brief review for
completeness should take no more than five working days. After that time, the researcher will be
notified of the acceptance of the project and of the name of the RUL Project Manager who
will become the primary contact for questions concerning the data project.
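Parts of this brief review could be supported by a simple script. The following is a minimal sketch only, not a description of any existing RUcore tool. The accepted file extensions, the rule used to flag uninformative file names, and the function name review_deposit are all assumptions made for illustration; the report does not enumerate accepted file types. Checks that require human judgment, such as de-identification and the adequacy of codebooks, are deliberately left out.

```python
from pathlib import Path

# Hypothetical list for illustration; the report does not enumerate
# the file types that RUcore can accept.
ACCEPTED_EXTENSIONS = {".csv", ".txt", ".pdf", ".xml", ".json"}


def review_deposit(folder):
    """Return a sorted list of problems found in a deposit folder."""
    files = [p for p in Path(folder).rglob("*") if p.is_file()]
    problems = []

    # A README file must accompany the data.
    if not any(p.stem.lower() == "readme" for p in files):
        problems.append("no README file")

    for p in files:
        if p.stem.lower() == "readme":
            continue
        # File type must be one the repository can accept.
        if p.suffix.lower() not in ACCEPTED_EXTENSIONS:
            problems.append(f"{p.name}: file type not on the accepted list")
        # An empty file cannot be opened and read meaningfully.
        if p.stat().st_size == 0:
            problems.append(f"{p.name}: file is empty")
        # Assumed heuristic: very short or purely numeric names are uninformative.
        if len(p.stem) < 3 or p.stem.isdigit():
            problems.append(f"{p.name}: file name is not descriptive")

    return sorted(problems)
```

An empty list would mean only that the automated checks passed; the Project Manager would still complete the remaining items by hand.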
The PM will then work with the other members of the RUL Data Team to complete the ingest of
the data and documentation into RUcore, and will enter metadata into WMS. The minimum
metadata required for ingest consists of the identifiers
described in the Project Application. Additional metadata may be necessary, and the PM or other
subject specialist or staff member will enter this metadata as required, but this should not delay
the ingest and release of data through RUcore. Additional metadata for project completeness
should be entered within two to four weeks after publication of the data in RUcore.
Rutgers Major Stakeholders in Research Data
9. Determine the major stakeholders at Rutgers who need to be familiar with the RUL data deposit
process.
In addition to creating a high-quality data preservation and sharing service that meets the needs
of our institutional researchers, it is important to recognize that outreach and communication are
a critically important part of the initiative. Researchers are typically busy people; they are
initially unlikely to reach out to the library as a provider of data services, since we are still
thought of by many as “the book people”. In a recent conversation with an IT professional in
SAS we heard, “RUcore is the best-kept secret on campus.” This is something we need to see
change. Fortunately, the Open Access initiative and the new SOAR portal to RUcore offer an
opportunity to raise the profile of the Libraries and RUcore in the minds of faculty members.
The RUL Research Data Service will allow the Libraries to become a valued partner in
compliance with research funder data management requirements.
Stakeholders for the RUL Research Data Service include many university organizations. These
include:
Office of Research & Economic Development
Office of Research & Sponsored Programs
Institutional Review Boards
Office of Information Technology
Research faculty
Unit research facilitators
Department staff charged with supporting researchers
Academic & research unit computing staff
RUL librarians & IT staff
Peer institutions
The task force can envision three different levels of awareness that we need to develop among
our various stakeholders.
Recognition: The RUL Research Data Team would seek to create a University-wide awareness of
the existence of the RUL Research Data Service, and its potential to support research data
management through Data Management Planning as well as data preservation and sharing.
Outreach activities to support this level of awareness could include content in the RUL web site,
the ORED Newsletter, regular Rutgers email newsletters, plus periodic presentations to
established regular meetings of academic departments, OIT, and other university units. Special
marketing and promotional materials such as bookmarks, brochures, flyers, QR codes, and
digital displays could be created. Additionally, it is clear that any data service needs to be
publicized through outside journal articles and conference proceedings, which will create
additional visibility for RUL and Rutgers in general.
Implementation: Once we begin implementation beyond support for Research Data Management
Plan development, researchers and their support administrators will need clear documentation
and educational materials to support the actual process of data preservation. During the initial
implementation, the nature of datasets we will be accepting for ingest is straightforward and can
be easily described to potential depositors through our general outreach activities noted above, as
well as through more detailed presentations about the actual project processing. As we progress
toward full implementation, we will need to develop additional documents to help researchers
identify datasets that require and are suitable for preservation and/or sharing. Additionally,
informational materials should be developed and sent to all researchers as they deposit scholarly
articles to encourage them to connect their data with their publications.
Development and Support: For the long-term success of RUL Research Data Services, it will be
critical that the RUL Research Data Team maintain and enhance relations with key university
stakeholders such as ORED, IRB, and Academic Unit Deans. We must prove the value and
effectiveness of our services in research data management, preservation, and sharing, while
meeting the expectations for ease of use and efficiency. The development of a regular reporting
process by which we will inform these units of RUcore metrics that include collection access,
dataset citations, and measures of researcher satisfaction, will invite recognition and support.
Ultimately, our goal is to develop and implement communication vehicles and feedback
mechanisms to ensure that RUcore is accepted as a high-quality data repository for research
products.
RUcore Technical Support for Research Data
10. Consult with CISC to assess RUcore hardware and software infrastructure to support immediate,
three year and five year needs.
On November 26, the Guidelines for Research Data Services were reviewed with the Cyber-
Infrastructure Steering Committee (CISC), and projections of storage needs were discussed. As a
result, suggestions were incorporated into the Guidelines included in this report. The Task Force
proposed the following projected storage capacity needs, which were approved by CISC.
For the initial implementation of mediated data acceptance, which would last for
approximately one year, the storage needs are not anticipated to exceed 2 Terabytes
(TB).
For the first three years, the storage needs would be up to 20 TB.
For the first five years, it is not expected that storage needs would exceed 100 TB.
It was suggested that very large data projects which might exceed the existing available storage
capacity should be considered, and that additional storage could be purchased for the researcher
for a fee. Storage fees could be established as a reasonable method of quantifying data services
provided to the researcher.
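To put the projected capacities in perspective, they can be compared with the 100 GB per-project limit proposed for the initial implementation in the Research Data Service Guidelines (Appendix B). The short calculation below is an illustrative sketch only. It assumes decimal units (1 TB = 1000 GB) and that every project uses the full 100 GB, which gives a lower bound on the number of projects each capacity could hold; the function name is hypothetical.

```python
GB_PER_TB = 1000       # assumption: decimal units
PROJECT_CAP_GB = 100   # per-project limit without fees (Appendix B)

# Projected storage needs from this section, in terabytes.
PROJECTED_TB = {"first year": 2, "first three years": 20, "first five years": 100}


def max_full_size_projects(capacity_tb, cap_gb=PROJECT_CAP_GB):
    """Number of projects that fit if each one uses the full per-project cap."""
    return (capacity_tb * GB_PER_TB) // cap_gb


for horizon, capacity in PROJECTED_TB.items():
    print(f"{horizon}: {capacity} TB holds at least "
          f"{max_full_size_projects(capacity)} projects of {PROJECT_CAP_GB} GB")
```

Under these assumptions the first-year projection of 2 TB would hold at least 20 projects at the maximum size, and considerably more if typical datasets are smaller.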
Conclusions and Recommendations:
Researchers are working to comply with federal mandates to make funded research publicly
accessible, and are seeking easy and efficient methods of safely sharing and preserving their
data. There has been a rapid advancement of academic libraries into this arena, in an effort to
help researchers fulfill the requirements for public access to federally funded research. In
addition to institutional repositories, data-specific repositories such as Dryad and ICPSR
continue to grow. Academic libraries with institutional repositories see the opportunity to
become part of the research workflow, and are actively promoting their research data services to
their communities. We believe that Rutgers University Libraries currently have the expertise,
experience, and system capabilities to accept research data in RUcore immediately, as evidenced
by the development of the pilot data portal in 2010, and the acceptance of pilot data projects
starting in 2012. Further, if we do not soon establish ourselves as participants in the sharing and
preservation of research data, we will be left behind as researchers find other ways to comply
with funding requirements.
We would seek to replicate the successes of our peer and aspirant institutions in the acceptance
of research data. Our review of institutional and data repositories found similarities in the way in
which others are facing the challenge of providing research data services, and in their research
data policies. We assumed that Rutgers would adopt a policy similar to those we reviewed,
and allowed this assumption to guide our thinking about data acceptance. Based on our research, we
found that the best data policies share several features: the university owns the data;
the Principal Investigator is the custodian of the data; and protocols exist in the event the
Principal Investigator leaves the institution. The Interim Report submitted by the Task Force in
October 2014 reviews institutional data policies in more detail, and provides examples of data
policies that were considered to have been thorough and well-written.
We also found that most of the institutional and data repositories we reviewed offered self-
deposit of data, or both self-deposit and mediated data deposit. Of the thirty-seven repositories
we reviewed, the following institutions offer self-deposit or a combination of self-deposit and
mediated deposit, with CIC institutions shown in boldface:
University of California at Berkeley
University of California at San Diego (iDASH)
University of California at San Francisco
Columbia University
Cornell University (eCommons)
Dryad
University of Edinburgh
University of Guelph
Harvard University
ICPSR
University of Illinois at Chicago
University of Illinois at Urbana-Champaign
Indiana University
University of Iowa
Johns Hopkins University
University of Maryland
University of Michigan
University of Minnesota
MIT
New York University
Ohio State University
University of Oregon
Penn State University
University of Pittsburgh
Purdue University
Stanford University
Syracuse University (QDR)
University of Texas
University of Virginia
University of Wisconsin- Madison
The Task Force believes that we should allow self-deposit of data as many of our peer and
aspirant institutions have done, and as the Libraries are already doing with scholarly articles.
With the acceptance and coming implementation of the Rutgers Open Access Policy, we can
soon expect to receive data associated with published articles in a self-deposit process. Self-
deposit is already familiar to researchers through submittals to the government’s PubMed
Central, and other established data repositories. Self-deposit of data, we feel, is a way of heeding
the advice we received from ORED with regard to data deposit: “Keep it simple.”
In order to obtain more details about the deposit processes reviewed in the Interim Report, Aletia
Morgan and Laura Palumbo had subsequent conversations with repository managers at the
University of Maryland, the University of Minnesota, Purdue University, and Penn State
University to verify procedures and staffing. They found that Maryland and Minnesota do
cursory reviews of self-deposited data for formatting issues or other obvious problems, with the
University of Minnesota aiming to complete this review within two days. Penn State allows self-
deposit with immediate visibility of uploaded data, and without review of any kind. Only
minimal metadata is required, and the researchers create their own metadata exclusively. Purdue
also performs no review of deposited data, leaving full responsibility for compliance with any
restrictions up to the researcher. While the Task Force recommends a brief review for obvious
problems and technical issues, we agree that reviewing data in detail would mean
accepting responsibility for any errors and legal violations.
Staffing of data service teams at these institutions was also discussed, and it was determined that
a limited number of full-time staff typically work with teams of librarians and others with part-
time data responsibilities to accomplish the tasks associated with data acceptance. Time spent on
data related tasks varied, based on how much data the repository is currently accepting. It was
discovered that some staff at Purdue receive funding from grant money to accomplish their work.
We feel that RUL currently has sufficient well-trained staff to accept research data into RUcore,
as described previously.
In order to create a sustainable service, funding should be sought once the Libraries have begun
to accept research data on a regular basis. The most logical source of this funding would be from
the Office of Research and Economic Development, whose goal it is to help researchers obtain
grants and comply with funder directives. Some additional funding can be achieved by
establishing fees for additional storage capacity, which can be passed on to funders by
incorporation into grant proposals. However, because storage is relatively inexpensive, this
probably will not be a major source of income. If the Libraries can establish research data
acceptance as a core service, funding can be provided through budgeting from departments who
would benefit from this service.
Efforts should be made to maintain the relationships we have established with the Office of
Research and Economic Development, the Office of Research and Sponsored Programs, and the
Institutional Review Boards. We should continue to work to establish integration of data
acceptance in RUcore into their workflows, so that researchers are aware of the availability of
RUcore for sharing and preservation of research data; and of the related services that RUL can
provide, such as the preparation of data management plans and consulting on data projects.
These offices indicated that they are happy to work with the Libraries to establish workflows for
our mutual benefit.
The acceptance of research data into RUcore is an important service to faculty which would
highlight the expertise of the Libraries, and which would allow us to establish deeper
relationships with our research communities. It could also become a source of funding as a core
service to researchers. However, research data services must be easy to use in order to be of
value to time-pressed researchers, and to be seen as worthy of financial support. RUL research
data services will not be used if they do not become more visible soon. Researchers are obligated
to comply with funding mandates, and will find ways to do so without the Libraries if we do not
take action. We propose the immediate acceptance of mediated data projects as described in this
report, which will provide the basis for further learning and expansion of our data services. With
the benefit of this additional experience, and the resulting deepening relationships with research
faculty, we will be ready for the establishment of seamless online data acceptance, such as is
being done by our peer and aspirant institutions.
Appendices
Appendix A: Task Force on Research Data Implementation Charge
Appendix B: Research Data Service Guidelines
Appendix C: Mediated Research Data Projects Questionnaire
Appendix D: Flowchart for Mediated Data Projects Questionnaire
Appendix E: Self-Deposit, Mediated, and Special Research Data Projects Questionnaire
Appendix F: Full Implementation Flowchart for Mediated and Self-Deposited Data Projects
Appendix G: Research Data Project Application for the Responsible Researcher
Appendix H: Research Data Deposit Agreement
Appendix I: General Guidelines for Librarians advising on Research Data Projects for RUcore
Deposit
Appendix A: Task Force on Research Data Implementation Charge
Rutgers Libraries Task Force on Research Data Implementation
The task force on Research Data Implementation is charged with establishing an administrative and
evaluation framework for the deposit of research data in RUcore. This implementation process will
inform the development of a university data policy by the office of General Counsel working with our
Libraries Copyright and Licensing Librarian.
The task force should involve other individuals as necessary to do its work, and engage at the outset with
the office of the Vice President for Research and the Office of Research and Sponsored Programs to
ensure that the implementation addresses issues of importance to the research faculty and appropriate
administrative offices. The task force should also liaise with the Committee on Scholarly Communication
through its chair, Laura Mullen.
Janice Pilch, Copyright and Licensing Librarian, will work separately on drafting a data policy. When
your draft implementation plan is ready, Janice can review it from the copyright perspective. We believe
this two part process will work effectively.
This is a Cabinet Task Force under the joint leadership of Grace Agnew and Melissa Just who will
oversee and guide its work on behalf of Cabinet. We expect the plan to be completed no later than
December 2014, and Cabinet would expect a progress report mid-way through the process.
The charge to the Task Force is to:
1. Review the administrative structure of other data repositories that might serve as models.
2. Review the evaluation process for technical, legal, and confidential issues involving data deposit at
other institutions that might serve as models.
3. Consult with appropriate major stakeholders to ensure that RUL workflows and practices facilitate
and do not conflict with policies and practices of those departments, especially the office of the Vice
President for Research, and Research and Sponsored Programs.
4. Establish principles for prioritization of data deposit projects based on RUL strategic priorities. This
should include a definition of various types of potential projects to ensure that we have the resources
both to host and to sustain projects, i.e., federal grants, non-grant funded research, etc.
5. Develop a framework for evaluation for data deposit in RUcore that includes a questionnaire or series
of questionnaires to be used for each data deposit, covering technical, legal, and confidential criteria
(similar to the Digital Projects Evaluation Process approved by Cabinet in March 2013).
6. Develop a corresponding guide on evaluation criteria to provide clarity to subject librarians. (Similar
to the Deed of Gift Explanation in the RUL Deed of Gift).
7. Recommend assignments for functional responsibility in the area of data deposit.
8. Chart a workflow for the data deposit evaluation process.
9. Determine the major stakeholders at Rutgers who need to be familiar with the RUL data deposit
process.
10. Consult with CISC to assess RUcore hardware and software infrastructure to support immediate,
three year and five year needs.
Task Force Members:
Laura Palumbo, Chair
Ron Jantz
Yu-Hung Lin
Aletia Morgan
Minglu Wang
Krista White
Ryan Womack
Yingting Zhang
Yini Zhu
6/26/14
Appendix B
Research Data Service Guidelines
Research data to be accepted into RUcore will demonstrate the scholarly and scientific research
being conducted by Rutgers’ researchers. This data will contribute to the advancement of
knowledge and research in diverse subject areas, including the sciences, health sciences, social
sciences, and humanities, as described in the Libraries’ Strategic Plan. By accepting and publicly
sharing research data, we will highlight the unique ability of the Libraries to facilitate discovery,
access, and reuse of Rutgers University’s research.
Research data may be the result of unfunded as well as grant-funded research, to allow for a
broad spectrum of research areas to be included; however, projects which require data deposit to
comply with funder mandates may be given preference. Working with Rutgers’ researchers, the
Libraries will provide access to Rutgers’ research data. The Principal Investigator or Primary
Responsible Researcher, hereafter to be referred to as the Responsible Researcher and which
shall be meant to include both titles, will be responsible for assuring that the data can be shared
publicly in accordance with University policies, Federal and other funders’ directives, and is in
compliance with any legal restrictions.
Research data will be accepted insofar as the Libraries have “…the resources, including, but not
limited to, expertise, technology, and funding, to support the project, both initially and ongoing.”
(Digital Projects Evaluation Process, Rutgers University Libraries, March 2013).
We anticipate a phased approach to acceptance of data, in order to be able to scale services as
projects become more complex. During the initial implementation of data acceptance, the deposit
process will be mediated by members of the RUL Research Data Team. Researchers will work
directly with a trained Project Manager, who will guide the researcher through the deposit
process and see it through to completion. For the initial implementation of data acceptance, we
recommend the following guidelines:
Initial Implementation: Mediated Data Acceptance
All projects will be considered, regardless of funding status. Projects which require data
deposit to comply with funder mandates may be given preference.
One of the Responsible Researchers (or co-PIs) must be Rutgers faculty or staff. This
Responsible Researcher must initiate the data deposit process.
Datasets which are associated with Rutgers graduate students’ deposited electronic theses
and dissertations will be accepted.
Short-term embargoes will be allowed, such as until the publication of a book or article.
The Responsible Researcher will be responsible for determining that the data is
appropriate for public sharing. Through a deposit agreement, they will attest that by
sharing the data they will not be in violation of any confidentiality agreements, copyright
laws, or other laws, and will hold Rutgers University Libraries harmless from any
damages resulting from the sharing or misuse of the data.
Total data volume will be 100 GB or less per project without storage fees. Projects
requiring more than 100 GB will be reviewed by the Associate University Librarian for
Digital Library Systems and considered on a case-by-case basis.
Data which requires media transformation or digitization will be referred to the
Repository Collection Librarian for necessary transformation/digitization.
The data will not be derived from human or animal subjects. Sensitive or confidential
data, if de-identified, will be accepted later in the full implementation of data services.
The data will not be the result of research conducted on behalf of or in conjunction with
any outside commercial interests. Projects which involve commercial interests will be
accepted later during the full implementation of data services.
Research data which would require technical or system modifications within RUcore will
be considered on a case-by-case basis.
Full Implementation: Self-Deposited and Mediated Data Acceptance
After the initial acceptance of data as described above, self-deposit of data by the Responsible
Researcher is recommended, provided that the necessary storage capacity, staffing, and technical
components are in place. During the full implementation of data services, researchers would
have the option to use online self-deposit forms, or to choose mediated deposit as was offered in
the initial implementation. Self-deposited data will require the researcher to affirm that high level
criteria are met, which will be reviewed by the Data Team. If these criteria are met, Data Deposit
and Application Forms will then be completed online, and additional internal reviews will be
performed. Once the internal review has been completed, the data will be made accessible in
accordance with any restrictions which have been placed on the data (see Evaluation of Data
Projects for Deposit and Data Deposit Workflow for details).
Datasets that meet the initial implementation requirements will continue to be accepted in
the full implementation of data services.
Data involving human and animal subjects will be permitted with proper de-
identification; IRB and other approvals must be in place. The Responsible Researcher
will be responsible for determining that confidentiality and any other legal requirements
are met.
Projects requiring up to 500 GB of storage will be accepted, and fees may be assessed
accordingly.
Projects which involve commercial interests will be considered.
We anticipate that the acceptance of research data into RUcore will be an evolving process,
during which Rutgers Libraries will adapt and grow as new challenges are presented. We cannot
foresee what new developments might arise; however, we believe that RUL will be agile enough
to accommodate changing research data needs. Special, complex, and/or very large
research data projects will be considered on a case-by-case basis.
Special Research Data Projects, to be assessed on a per case basis
Data projects which will require extensive staff time to develop may be accepted; fees
may be assessed.
Data which will require the purchase of additional storage capacity will be considered,
and storage costs will be assessed.
Projects which will require system modifications will be considered.
Data which requires media transformation or digitization will be considered.
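The three sets of guidelines above can be read as a simple routing rule. The sketch below is one possible interpretation, offered for illustration and not as an adopted procedure. The names Route, Project, and classify are hypothetical, and several simplifications are assumptions: projects over 100 GB are routed to the full implementation (in the initial implementation they would instead be reviewed case by case), digitization and system modifications are both treated as special projects, and any project with an unmet precondition is marked for referral.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    INITIAL = "initial implementation (mediated deposit)"
    FULL = "full implementation (self-deposit or mediated)"
    SPECIAL = "special project (case-by-case review)"
    REFERRAL = "refer researcher for guidance before acceptance"


@dataclass
class Project:
    size_gb: float
    rutgers_faculty_or_staff: bool = True
    etd_dataset: bool = False              # tied to a deposited thesis or dissertation
    human_or_animal_subjects: bool = False
    deidentified_and_approved: bool = False  # de-identified, IRB and other approvals in place
    commercial_interests: bool = False
    needs_system_modifications: bool = False
    needs_digitization: bool = False


def classify(p: Project) -> Route:
    """Route a project according to the guidelines, under the stated assumptions."""
    # A Rutgers Responsible Researcher (or an associated ETD) is required.
    if not (p.rutgers_faculty_or_staff or p.etd_dataset):
        return Route.REFERRAL
    # Human or animal subjects data needs de-identification and approvals first.
    if p.human_or_animal_subjects and not p.deidentified_and_approved:
        return Route.REFERRAL
    # Very large or technically demanding projects are handled case by case.
    if p.size_gb > 500 or p.needs_system_modifications or p.needs_digitization:
        return Route.SPECIAL
    # Anything beyond the initial limits waits for the full implementation.
    if p.size_gb > 100 or p.human_or_animal_subjects or p.commercial_interests:
        return Route.FULL
    return Route.INITIAL
```

For example, classify(Project(size_gb=50)) returns Route.INITIAL, while the same project with commercial_interests=True would be routed to Route.FULL.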
Appendix C
Mediated Research Data Projects Questionnaire
During the initial implementation of Research Data Services, data deposit will be a mediated
process. Researchers will work in conjunction with members of the proposed Libraries’ Research
Data Team, and other appropriate personnel in RUL. Members of the Research Data Team will
guide the researcher through the process of data deposit, and will begin by soliciting answers to
the following high-level criteria intake questionnaire.
The responsibility for compliance with any legal restrictions lies with the Principal
Investigator/Responsible Researcher (hereafter referred to as the Responsible Researcher). They
will assume liability for determining if their data is free from any copyright or intellectual
property constraints, sensitive or confidential information, any restrictions on public
accessibility, or any other legal and ethical issues which might prevent their depositing and
sharing the data publicly. The Libraries will be exempt from liability by not assuming
responsibility for these determinations.
Guidance is available to researchers needing assistance with copyright issues from the Copyright
Librarian. Researchers needing assistance with issues concerning intellectual property and
commercial interests should seek advice from the Office of Technology Commercialization.
Researchers with human or animal subjects’ data should contact the Research Data Manager for
advice concerning a suitable repository for data sharing during the initial implementation of data
acceptance.
This high-level criteria intake questionnaire can be sent to the Responsible Researcher as a
pre-consultation step, or completed at the initial consultation with the Data Project Manager. This
information will determine whether the project can be considered for deposit under the
Guidelines for Research Data Services during the initial implementation of data acceptance.
Please have the Responsible Researcher answer the following questions, and sign the completed
form:
1. Is the Responsible Researcher Rutgers faculty or staff? For datasets associated with
Rutgers electronic theses and dissertations, please contact the Repository Collection
Librarian, Rhonda Marker, at rmarker@rci.rutgers.edu.
2. Are all data in digital format? For projects requiring digitization, please contact the
Repository Collection Librarian at rmarker@rci.rutgers.edu.
3. Is the project without human or animal subjects? Data involving human or animal
subjects will be accepted in the full implementation of data services. Please contact the
Research Data Manager, Aletia Morgan, at aletia.morgan@rutgers.edu for guidance
concerning an appropriate repository for human or animal subjects’ data.
4. Is the project independent of support or participation of outside commercial interests?
Projects with outside commercial interests will be considered during the full
implementation of data services. For assistance with commercial issues, please consult
with the Office of Technology Commercialization at 848-932-0115 or
lbdars@otc.rutgers.edu.
5. Have all copyright, licensing, and other legal restrictions been met? If unsure, please
consult with the Office of Technology Commercialization at 848-932-0115 or
lbdars@otc.rutgers.edu, and with the Copyright Librarian, Janice Pilch, at
janice.pilch@rutgers.edu.
6. May the data be made publicly accessible? We can accept data with short-term
embargoes, such as until the publication of a book or article; for projects that require
permanent preservation without public access, please contact the Research Data Manager
at aletia.morgan@rutgers.edu.
7. Is the data the final version intended for public release? We cannot offer working storage
for ongoing research projects.
8. Is the data volume less than 100 GB for this project? If not, please consult with the
Repository Collection Librarian at rmarker@rci.rutgers.edu.
9. Is the research funded? If so, a copy of the funded grant application will need to be
provided. The grant application is for internal use only and will not be made publicly
accessible, unless so desired.
10. If all of the above conditions are met, the Project Manager should assist the researcher, as
needed, with completion of the Research Data Project Application for the Responsible
Researcher. If the above questions cannot be affirmed, please consult the Research Data
Manager at aletia.morgan@rutgers.edu to see if the data can be accepted at this time.
All of the above questions have been correctly and truthfully answered in the affirmative.
Responsible Researcher Date
Appendix D: Flowchart for Mediated Data Projects Questionnaire
START
1. Is the Responsible Researcher Rutgers faculty or staff?
   Yes: Go to step 2.
   No: Is the researcher a grad student with a thesis or dissertation?
       Yes: See the Repository Collection Librarian.
       No: Data cannot be accepted; contact the RDM for alternatives.
2. Is the data in digital format?
   Yes: Go to step 3.
   No: See the Repository Collection Librarian.
3. Are there human or animal subjects?
   No: Go to step 4.
   Yes: This data can be accepted during the Full Implementation of RD
   Services. See the RDM for further instructions.
4. Are there commercial interests?
   No: Go to step 5.
   Yes: This data can be accepted during the Full Implementation of RD Services. See
   the Office of Technology Commercialization for assistance.
5. Have all copyright, licensing, and other legal restrictions been met?
   Yes: Go to step 6.
   No: See the Copyright Librarian and/or the Office of Technology
   Commercialization for assistance.
6. Can the data be shared publicly now?
   Yes: Go to step 7.
   No: Can the data be shared eventually (i.e., after an embargo period)?
       Yes: Go to step 7.
       No: Contact the RDM for more information.
7. Is the data the final version for public release?
   Yes: Go to step 8.
   No: Working storage is not an option. Contact the RDM for assistance.
8. Is the data volume less than 100 GB for this project?
   Yes: Go to step 9.
   No: Contact the RDM. Project will be considered by AUL for Digital Library Systems.
9. Is the research funded?
   Yes: Provide a copy of your grant documents for internal use, then go to step 10.
   No: Go to step 10.
10. Sign form. Proceed to application form and deposit agreement.
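The screening logic in this flowchart can also be written as a short script. The following Python sketch is illustrative only and is not part of any tool described in this report: the `Intake` record, its field names, and the `screen` function are hypothetical, and the referral messages paraphrase the questionnaire in Appendix C. It assumes the 100 GB threshold of the initial implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple

INITIAL_LIMIT_GB = 100  # volume threshold for the initial implementation


@dataclass
class Intake:
    """Hypothetical record of questionnaire answers; field names are illustrative."""
    rutgers_faculty_or_staff: bool
    thesis_or_dissertation: bool
    all_digital: bool
    human_or_animal_subjects: bool
    commercial_interests: bool
    legal_restrictions_met: bool
    shareable_now_or_after_embargo: bool
    final_version: bool
    volume_gb: float
    funded: bool


def screen(a: Intake) -> Tuple[List[str], List[str]]:
    """Return (referrals, reminders); no referrals means every question was affirmed."""
    referrals: List[str] = []
    reminders: List[str] = []
    if not a.rutgers_faculty_or_staff:
        if a.thesis_or_dissertation:
            referrals.append("See the Repository Collection Librarian.")
        else:
            referrals.append("Data cannot be accepted; contact the RDM for alternatives.")
    if not a.all_digital:
        referrals.append("Digitization needed; see the Repository Collection Librarian.")
    if a.human_or_animal_subjects:
        referrals.append("Accepted in the full implementation; see the RDM.")
    if a.commercial_interests:
        referrals.append("Considered in the full implementation; see the OTC.")
    if not a.legal_restrictions_met:
        referrals.append("See the Copyright Librarian and/or the OTC.")
    if not a.shareable_now_or_after_embargo:
        referrals.append("Contact the RDM about preservation without public access.")
    if not a.final_version:
        referrals.append("Working storage is not an option; contact the RDM.")
    if a.volume_gb >= INITIAL_LIMIT_GB:
        referrals.append("Volume is not under the limit; contact the RDM.")
    if a.funded:
        # Funding does not block deposit; it only adds a document to collect.
        reminders.append("Provide a copy of the grant documents for internal use.")
    return referrals, reminders
```

An empty referral list corresponds to reaching the "Sign form" step; any referral means the project needs a consultation before it can proceed. Collecting every referral, rather than stopping at the first one, is a design choice of this sketch so that a researcher sees all outstanding issues at once.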
Appendix E
Self-Deposit, Mediated, and Special Research Data Projects Questionnaire
The responsibility for compliance with any legal restrictions lies with the Principal
Investigator/Responsible Researcher (hereafter referred to as the Responsible Researcher). The
Responsible Researcher will assume liability for determining if their data is free from any
copyright or intellectual property constraints, sensitive or confidential information, any
restrictions on public accessibility, or any other legal and ethical issues which might prevent
depositing and sharing the data publicly.
Guidance is available to researchers needing assistance with copyright issues from the Copyright
Librarian. Researchers needing assistance with issues concerning intellectual property and
commercial interests should seek advice from the Office of Technology Commercialization.
Those with questions about data from human or animal subjects should contact the appropriate
Institutional Review Board.
Please answer the following questions:
1. Is the Responsible Researcher Rutgers faculty or staff? For datasets associated with
Rutgers electronic theses and dissertations, please contact the Repository Collection
Librarian, Rhonda Marker, at rmarker@rci.rutgers.edu.
2. Are all data in digital format? For projects requiring digitization, please contact the
Repository Collection Librarian at rmarker@rci.rutgers.edu.
3. Has IRB approval been obtained for projects which contain human or animal subjects?
Can you provide copies of all IRB approvals from all institutions? These documents can
be restricted from public access as necessary. Has all information directly identifying
the subjects been removed as required by the IRB, and have any other measures necessary to
prevent the disclosure of the subjects’ identities been taken? Prior to depositing this data,
you will be required to sign an agreement attesting that you have taken all actions as
necessary to permit you to lawfully share the data through RUcore.
4. Have all copyright, licensing, and other legal restrictions been met? All commercial
interests must be disclosed. For questions about these issues, please consult with the
Office of Technology Commercialization at 848-932-0115 or lbdars@otc.rutgers.edu,
and with the Copyright Librarian, Janice Pilch, at janice.pilch@rutgers.edu.
5. If the materials include photographs, audio recordings, or audiovisual recordings, do you
have signed release forms for use of a person’s image or voice? Copies of signed release
forms must be provided, but will be restricted from public access.
6. May the data be made publicly accessible? We can accept data with short-term
embargoes, such as until the publication of a book or article; for projects that require
permanent preservation without public access, please contact the Research Data Manager
at aletia.morgan@rutgers.edu.
7. Is the data the final version intended for public release? We cannot offer working storage
for ongoing research projects.
8. Is the data volume less than 500 GB for this project? If not, please consult with the
Repository Collection Librarian at rmarker@rci.rutgers.edu.
9. Is the research funded? If so, a copy of the funded grant application will need to be
provided. This document is for internal use only and will not be made publicly accessible.
10. If all of the above conditions are met, the Responsible Researcher, in addition to
providing all necessary documentation, will be required to fill out a project application
and sign a Data Deposit Agreement, prior to upload of data. If the above questions cannot
be affirmed, please consult the Research Data Manager at aletia.morgan@rutgers.edu to
see if the data can be accepted at this time.
All of the above questions have been correctly and truthfully answered in the affirmative.
Responsible Researcher Date
Appendix F: Full Implementation Flowchart for Mediated and Self-Deposited Data
Projects
START
1. Is the Responsible Researcher Rutgers faculty or staff?
   Yes: Go to step 2.
   No: Is the researcher a grad student with a thesis or dissertation?
       Yes: See the Repository Collection Librarian.
       No: Data cannot be accepted; contact the RDM for alternatives.
2. Is the data in digital format?
   Yes: Go to step 3.
   No: See the Repository Collection Librarian.
3. Are there human or animal subjects?
   No: Go to step 4.
   Yes: Has IRB approval been obtained?
       No: Contact your local IRB.
       Yes: Has all personal, confidential, or sensitive information been removed?
           No: Contact ORED and RDM for assistance.
           Yes: Go to step 4.
4. Have all copyright, licensing, and other legal restrictions been met?
   Yes: Go to step 5.
   No: See the Copyright Librarian and/or the Office of Technology
   Commercialization for assistance.
5. Have all commercial interests been disclosed?
   Yes: Go to step 6.
   No: All commercial interests must be disclosed. See the Office of Technology
   Commercialization for assistance.
6. Have release forms been obtained for images or audiovisual recordings?
   Yes or N/A: Go to step 7.
   No: Copies of signed release forms must be provided.
7. Can the data be shared publicly now?
   Yes: Go to step 8.
   No: Can the data be shared eventually (i.e., after an embargo period)?
       Yes: Go to step 8.
       No: Contact the RDM for more information.
8. Is the data the final version for public release?
   Yes: Go to step 9.
   No: Working storage is not an option. Contact the RDM for assistance.
9. Is the data volume less than 500 GB for this project?
   Yes: Go to step 10.
   No: Contact the RDM. Project will be considered by AUL for Digital Library Systems.
10. Is the research funded?
   Yes: Provide a copy of your grant documents for internal use, then go to step 11.
   No: Go to step 11.
11. Sign form. Proceed to application form and deposit agreement.
Appendix G
Research Data Project Application for the Responsible Researcher
Please complete the following information about your research data:
1. Project Title
2. Name of Principal Investigator/Responsible Researcher, Department and lab group,
e-mail address and Rutgers NetID. The Principal Investigator/Responsible Researcher must
be affiliated with Rutgers.
3. List the contributors whose names will be associated with the project. Please list
departments and lab groups, e-mail addresses and Rutgers NetIDs. If not Rutgers
affiliates, for each researcher please list his or her current institutional association and
department, e-mail address, phone number, and physical address. Briefly describe their
role(s) in the project.
4. An abstract describing the project.
5. Keywords which will help identify your data domain(s).
6. Any links or additional resources associated with the project.
7. Documentation to allow users to understand the nature of your data, even if they are not
familiar with your subject area. A README file should be included which describes
your data files. Other supplementary files such as codebooks or questionnaires should
also be provided.
8. A list of files or data objects and the format of these files/objects. All data objects must
be in digital format; for data which requires digitization please contact the Repository
Collection Librarian, Rhonda Marker, at rmarker@rci.rutgers.edu.
9. If your data contains copyrighted material, please provide proof of permission to share
the data publicly via RUcore.
10. Funding sources for the project, if any. Please provide a copy of your grant application
and approval for internal use.
11. Any constraint on sharing the data not previously described, such as an embargo period.
If use of your data will be restricted to Rutgers researchers only, or to researchers who must
meet certain criteria, please describe the archiving and re-use conditions you believe are
appropriate to your data. We will contact you to discuss this further.
12. Any approvals obtained or in process, which are not already provided.
(For Full Implementation of data services projects only)
13. If your project contains data from human or animal subjects, please describe the level of
the sensitivity of the data. Indicate if the data contains confidential information, and
describe the methods you used to anonymize your data. You must submit your IRB
application and approval along with the data.
I have provided all of the above information as the Responsible Researcher for this data project.
All of the information is correct, and I have not knowingly withheld or misrepresented any
information.
Responsible Researcher Date
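Items 7 and 8 of this application ask for a README and a list of files with their formats. One way a researcher might draft that file list is sketched below in Python. This is only an illustration under stated assumptions: the function name `build_manifest`, the CSV column names, and the use of the file extension as a stand-in for "format" are all hypothetical choices, not requirements of RUcore.

```python
import csv
from pathlib import Path


def build_manifest(data_dir: str, manifest_path: str) -> int:
    """Write one CSV row per file (relative path, format, size) and return the file count."""
    root = Path(data_dir)
    manifest = Path(manifest_path).resolve()
    rows = []
    for path in sorted(root.rglob("*")):
        # Skip directories, and skip the manifest itself if it lives in data_dir.
        if not path.is_file() or path.resolve() == manifest:
            continue
        # Assumption: the extension is a good enough label for the file format.
        file_format = path.suffix.lstrip(".").lower() or "unknown"
        rows.append([path.relative_to(root).as_posix(), file_format, path.stat().st_size])
    with open(manifest_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file", "format", "size_bytes"])
        writer.writerows(rows)
    return len(rows)
```

The resulting CSV can be attached to the application or pasted into the README; it does not replace the descriptive documentation requested in item 7.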
Appendix H
Research Data Deposit Agreement
As the Responsible Researcher and custodian of this research data (“Work”), I hereby grant to
Rutgers University Libraries the non-exclusive right to retain, reproduce, and distribute the
deposited work in whole or in part, in and from its electronic format, without fee. This agreement
does not represent a transfer of copyright to Rutgers University Libraries.
Rutgers University Libraries may make and keep more than one copy of the Work for purposes
of security, backup, preservation, and access, and may migrate the Work to any medium or
format for the purpose of preservation and access in the future. Rutgers University Libraries will
not make any alteration, other than as allowed by this agreement, to the Work.
I represent and warrant to Rutgers University Libraries that the Work is my original work. I also
represent that the Work does not, to the best of my knowledge, infringe or violate any rights of
others. It does not contain any confidential or sensitive information.
I further represent and warrant that I have obtained all necessary rights to permit Rutgers
University Libraries to reproduce and distribute the Work and that any third-party owned content
is clearly identified and acknowledged within the Work.
By granting this license, I acknowledge that I have read and agreed to the terms of this
agreement and all related RUcore and Rutgers policies at
http://rucore.libraries.rutgers.edu/policies/ and http://policies.rutgers.edu/ . I hold Rutgers
University Libraries harmless from any damages incurred as a result of public sharing and/or
reuse of this data.
Name (please print)____________________________________________________________________
Signature ___________________________________________ Date ____________________________
Address ______________________________________________________________________________
City __________________________________________ State ____________ Zip _________________
Telephone Number (____) _____________ E-mail: _________________________________________
Author and Title of Work Department and School
_____________________________________________________________________________________
Co-Researchers Department and School
_____________________________________________________________________________________
Appendix I
General Guidelines for Librarians advising on Research Data Projects for RUcore deposit
This document uses the term Responsible Researcher (RR) to indicate the lead researcher
responsible for the data project under consideration. In the case of a grant-funded project, the
Responsible Researcher is the Principal Investigator. If the project is not grant-funded, the
Responsible Researcher is the person responsible for creating the data or supervising data
collection.
About the Responsible Researcher
1. Librarians may be approached by various levels of personnel who have worked on a data
project: graduate students, administrative staff, and others. While these contacts may be
useful for providing information about the project, and may ultimately be involved in the
data transfer and review, the librarian must determine who the Responsible Researcher is for
the project. Only the RR can vouch for the ultimate veracity and compliance of the data, and
RUL will need the RR to participate in the deposit process. The RR for the project must be
currently employed by Rutgers.
2. Joint research projects with non-Rutgers partners must proceed with contact through the
Rutgers RR. The Rutgers RR must consider whether rights and permissions are jointly held,
and whether it is acceptable to all parties for any non-Rutgers or shared data to be deposited at
Rutgers.
Copyright, Commercial Interests, Sensitive Information, and other rights issues
3. Copyright to data is held by the University, but the RR as the creator and steward of the data
has the right to share the data in accordance with funder requirements and in furtherance of
scholarly communication. The RR should be familiar with these policy issues and be able to
affirm that they are the person with the right to deposit the data (in anticipation of an
accepted data policy). Guidance may be obtained from the Copyright Librarian.
4. If data is based on other data sources (e.g., extracts from commercial databases, data shared
under a use agreement), the RR should verify that the data to be shared does not violate any
usage terms. The RR should provide information about the usage terms in these cases.
5. If data includes audiovisual materials involving people, the RR must provide copies of the
release forms or consent agreements for the study participants.
6. The RR should be prepared to submit grant documents and IRB approval documents along
with the data submission. These materials will not be publicly shared, but are necessary for
RUL staff to ensure that the data will be properly maintained over time in accordance with
the original grant and IRB terms.
7. The librarian should obtain from the RR information about any related publications, and offer
to have these deposited in RUcore.
8. If a project involves potential patent or commercialization issues, the RR should verify that
the public release of data is permissible and does not interfere with potential
commercialization. The RR should consult with the Office of Technology
Commercialization in such cases. In practice, the RR will likely be well aware of such issues
and have been working with OTC from the start of the project.
9. If the data involves human subjects or other sensitive information, the RR should be sure that
the data for public release has no personally identifiable information (PII) or similar issues.
If the project has been through IRB review, many of these issues will already have been
considered, and the distribution of public release data should be discussed in the IRB
documents. HIPAA regulations would apply to health data, and FERPA regulations would
apply to educational data, but even in cases where no statutory obligations hold, ethical
concerns would prohibit certain kinds of data release. The RR should consider these issues
in the preparation of data for widespread public dissemination.
Storage, Access, and Files
10. The RR should understand that RUcore does not provide working storage, and that data
provided should be the final form intended for release. To a limited degree, we can
accommodate versioning of files that change over the course of a project, but each version
should be intended for release.
11. The RR should understand that RUcore is primarily intended for public data distribution.
Embargoes of data for a limited time (e.g., until publication of a related article/book) can be
accommodated. There should be a compelling reason for data to be subject to longer term
embargoes.
12. Similarly, data access can be restricted to specific user groups, such as Rutgers users or
individual users who have signed a usage agreement, but there should be a compelling reason
for these restrictions.
13. The RR should be encouraged to deposit data in widely used, open formats when available.
These formats will help ensure a wider audience and a greater ability to preserve the data over the long term.
Data in proprietary or unusual formats will be accepted, however, if that is necessary to
support the researchers’ workflow. All data should be accompanied by a “ReadMe” file, so
that users have sufficient information to be able to understand and use the data.
14. Data deposit is intended for “born digital” files or files that are already digitized. If the RR has
analog materials that need to be converted prior to deposit, the librarian should advise the RR
about the digitization process offered by the Libraries.
15. The librarian should ascertain the approximate total file size and number of files that the RR
intends to deposit. If a large quantity of storage space is needed, the librarian can advise on
the costs of storage. If there are a large number of files which will require distinct metadata,
the librarian should consult with the Head of Central Technical Services to estimate the
complexity of the project. Multiple files can be bundled into zip files, and RUcore’s
Directory Ingest tool can be used to represent complex structures, so the absolute quantity of
files is less important than whether or not the project will require extra staff time for
metadata work.
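To support the conversation described in item 15, a librarian or researcher could estimate the file count and total size of a project folder and bundle it into a single zip file before deposit. The Python sketch below is a minimal illustration, not a description of RUcore's Directory Ingest tool: the function names are hypothetical, and treating a gigabyte as 1024^3 bytes is an assumption, since this report does not define the unit.

```python
import zipfile
from pathlib import Path
from typing import Tuple

GB = 1024 ** 3  # assumption: binary gigabytes


def summarize(data_dir: str) -> Tuple[int, int]:
    """Return (number of files, total size in bytes) under data_dir."""
    files = [p for p in Path(data_dir).rglob("*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


def over_limit(total_bytes: int, limit_gb: int) -> bool:
    """True when the project is not under the given volume threshold."""
    return total_bytes >= limit_gb * GB


def bundle(data_dir: str, zip_path: str) -> int:
    """Bundle every file under data_dir into one zip, keeping relative paths."""
    root = Path(data_dir)
    target = Path(zip_path).resolve()
    count = 0
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            # Skip directories, and skip the archive itself if it sits in data_dir.
            if path.is_file() and path.resolve() != target:
                archive.write(path, arcname=path.relative_to(root).as_posix())
                count += 1
    return count
```

For example, `over_limit(total, 100)` would flag a project for consultation under the initial implementation, and `over_limit(total, 500)` under the full implementation. Bundling reduces the number of objects to describe, but it does not remove the need to decide how much distinct metadata the project requires.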