Transcript of Institutional Repositories, Presentation for Kentuckiana Metroversity Meeting

Page 1: Institutional Repositories

Presentation for Kentuckiana Metroversity Meeting
February 10, 2005
Eric Weig, Head, Digital Programs
University of Kentucky Libraries

Page 2: Reality Check

Users want access to everything in digital format.

In academia, gigabytes of digital files stored on thousands of disks on personal and institutional departmental computers are not managed and preserved in a systematic way. In part, this is due to the continued trust in and reliance on paper as the medium of record.

More and more of the output of the academy never makes it to paper, and the digital file is the only record of these activities.

Digital preservation is an unsettled and continually evolving field.

It is essential to any college or university’s mission and history that materials with long term value are managed and preserved, and that other researchers are provided access to them where appropriate.

Page 3: Trusted Digital Repository

A trusted digital repository is one whose mission is to provide reliable, long-term access to managed digital resources to its designated community, now and in the future. To meet expectations, all trusted digital repositories must:

accept responsibility for the long-term maintenance of digital resources on behalf of its depositors and for current and future users;

have an organizational system that supports not only long-term viability of the repository, but also the digital information for which it has responsibility;

demonstrate fiscal responsibility and sustainability;

design its system(s) in accordance with commonly accepted conventions and standards to ensure the ongoing management, access, and security of materials deposited within it;

be depended upon to carry out its long-term responsibilities to depositors and users openly and explicitly;

have policies, practices, and performance that can be audited and measured.

(Attributes of a Trusted Digital Repository: Attributes and Responsibilities, An RLG-OCLC Report, May 2002.)

Page 4: Broad Definition of Institutional Repository

Any collection of digital material hosted, owned or controlled, or disseminated by a college or university.

Page 5: Narrowed Common Institutional Repository Definition

A digital archive of the intellectual product created by the faculty, research staff, and students of a college or university.

Institutionally defined vs. subject based repository
Content possessing enduring value
Cumulative and perpetual
Open and interoperable
Collects, stores, and disseminates
Designed for long-term preservation

Page 6: Kentuckiana Digital Library

A digital archive of historically significant material documenting the history and heritage of Kentucky.

Subject based vs. institutionally defined repository
Content possessing enduring value
Cumulative and perpetual
Open and interoperable
Collects, stores, and disseminates
Designed for long-term preservation

Page 7: Institutionally or Subject Based

Institutionally: Original research and other intellectual property generated by an institution's constituent population representative of many academic fields.

Subject Based: Centered around a given topic or subject area.

Page 8: Content Possessing Enduring Value

Make accessible an institution's intellectual capital and digital assets.

Scholarly content that may include pre-prints and other works-in-progress, peer-reviewed articles, monographs, enduring teaching materials, data sets and other research material, conference papers, electronic theses and dissertations.

Materials documenting an institution’s history.

For subject based institutional repositories, this is material that has gone through a formal selection process where relevance and significance have been established, most often by scholars or content experts.

Page 9: Cumulative and Perpetual

Material submitted to the repository is regulated by developed criteria and policies that oversee who is allowed to submit material and what attributes are required for submitted material.

Once material is submitted, it cannot be removed except under special, clearly defined circumstances.

Institutional commitment to maintain repository in perpetuity.

Page 10: Collects, Stores, and Disseminates

Repository has the ability to ingest new material, store that material, and provide access through digital dissemination of stored material.

This process relies on a secure network infrastructure, established policies and practices regulating collection, retention and access, as well as the human capital required to manage, develop, and maintain the repository’s administrative and technical infrastructure.

Page 11: Open and Interoperable

Goal of providing a repository that is accessible to the institution’s populace as well as to the larger national and international research community.

Access to some content may be limited due to copyright restrictions, policies established by a particular research community (limiting access to departmental working papers to members of that department, for example), and even monetary access fees for certain data.

Page 12: Open and Interoperable, cont.

Repository systems support interoperability in order to provide broad access via multiple search engines and other current and future discovery tools.

Developed on non-proprietary infrastructure that is open and well defined.

Uses accepted and developed standards for data integrity and extensibility.

Descriptions of policies and practices regarding the digital repository are well documented and made publicly available where appropriate.

Model of Choice: OAIS (Open Archival Information System)

Page 13: Designed for Long-Term Preservation

Wherever possible, define and promote digital file formats and standards that promote long term viability and future migration of digital content.

Keep up with software and hardware upgrades.

Collect and maintain administrative metadata concerning your digital assets.

Establish redundancy in data storage.

Develop, maintain and implement migration plans that include integrity checks to assure that migrated data has not suffered data loss.

Institution establishes recurring funding to adequately sustain the repository.
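The integrity check mentioned in the migration item above can be illustrated with a checksum comparison. This is only a sketch under stated assumptions: it covers a bit-for-bit storage migration (copying files to new media), not a format migration, and the function names `sha256_of` and `verify_migration` are hypothetical, not part of any repository product named in this presentation.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_migration(source_dir: Path, target_dir: Path) -> list[str]:
    """List files that are missing from, or differ in, the migrated copy."""
    problems = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        target_file = target_dir / relative
        if not target_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(source_file) != sha256_of(target_file):
            problems.append(f"changed: {relative.as_posix()}")
    return problems
```

An empty list means every source file arrived unchanged. In practice a repository would more likely store each checksum as preservation metadata at ingest and compare against that stored value, rather than rehash the source on every check.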

Page 14: OAIS (Open Archival Information System)

The Consultative Committee for Space Data Systems (CCSDS) was established in 1982 to provide an international forum for space agencies interested in the collaborative development of standards for data handling in support of space research.

At the request of the ISO, CCSDS assumed the task of coordinating the development of archive standards for the long-term storage of digital data.

A reference model was developed to establish common terms and concepts, provide a framework for establishing the significant entities and relationships among entities in an archive environment, and serve as the foundation for the development of standards supporting the archive environment. Draft OAIS reference model released in May 1999.

An OAIS is an archive, consisting of an organization of people and systems, that has accepted the responsibility to preserve information and establish long term access for a Designated Community. Long Term is long enough to be concerned with the impacts of changing technologies, including support for new media and data formats, or with a changing user community. Long Term may extend indefinitely. (Reference Model for an Open Archival Information System (OAIS), Management Council of the Consultative Committee for Space Data Systems, January, 2002.)

Page 15: OAIS Model

[Diagram: OAIS functional model. Labels shown: Producer, Consumer, Management; functional entities Ingest, Data management, Archival storage, Access, Administration, Preservation planning; information flows SIP, AIP, DIP, and descriptive info.]

Page 16: OAIS Information Package Types

Information Packages: Gathered metadata concerning digital objects.

SIP (Submission Information Package): sent from the information/metadata producer to the archive.

AIP (Archival Information Package): information package actually stored by the archive; the SIP metadata reformatted for preservation and archival storage.

DIP (Dissemination Information Package): information package transferred from the archive in response to a request by a consumer; the AIP metadata reformatted for various distribution methods established to meet user needs.
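The flow from SIP to AIP to DIP can be pictured as a series of reformatting steps. The sketch below is an illustration only: the class fields and the `ingest` and `disseminate` functions are hypothetical names chosen for this example, and the OAIS reference model itself does not prescribe any particular data structure.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SIP:
    """What the producer sends to the archive."""
    identifier: str
    descriptive: dict
    files: list


@dataclass
class AIP:
    """What the archive stores: SIP content plus preservation information."""
    identifier: str
    descriptive: dict
    files: list
    preservation: dict = field(default_factory=dict)


@dataclass
class DIP:
    """What a consumer receives in response to a request."""
    identifier: str
    descriptive: dict
    files: list


def ingest(sip: SIP) -> AIP:
    """Reformat a SIP for archival storage, recording when it was ingested."""
    return AIP(
        identifier=sip.identifier,
        descriptive=dict(sip.descriptive),
        files=list(sip.files),
        preservation={"ingested": datetime.now(timezone.utc).isoformat()},
    )


def disseminate(aip: AIP, wanted_fields: list) -> DIP:
    """Build a DIP exposing only the descriptive fields a consumer asked for."""
    subset = {k: v for k, v in aip.descriptive.items() if k in wanted_fields}
    return DIP(aip.identifier, subset, list(aip.files))
```

For example, `disseminate(ingest(SIP("item-1", {"title": "T", "author": "A"}, ["t.pdf"])), ["title"])` yields a DIP carrying only the title, while the preservation information stays inside the archive.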

Page 17: A Bit About Metadata

Descriptive: describes a resource for purposes such as discovery and identification. It can include elements such as title, abstract, author, and keywords.

Structural: indicates how compound objects are put together, for example, how pages are ordered to form chapters.

Administrative: provides information to help manage a resource, such as when and how it was created, file type and other technical information, and who can access it. Subsets of administrative data are:

Rights management metadata, which deals with intellectual property rights.

Preservation metadata, which contains information needed to archive and preserve a resource.
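The three metadata types can be seen side by side in a single record. The following is a minimal sketch with invented values; the field names, file names, and the `reading_order` helper are all hypothetical and do not follow any specific metadata schema.

```python
# Hypothetical record for one digitized item, grouped by metadata type.
record = {
    "descriptive": {
        "title": "Example Thesis",
        "author": "A. Student",
        "keywords": ["repositories"],
    },
    "structural": {
        "chapters": [
            {"label": "Chapter 1", "pages": ["p001.tif", "p002.tif"]},
            {"label": "Chapter 2", "pages": ["p003.tif"]},
        ]
    },
    "administrative": {
        "technical": {"created": "2005-02-10", "file_type": "image/tiff"},
        "rights": {"access": "campus-only"},
        "preservation": {"checksum_algorithm": "sha256"},
    },
}


def reading_order(rec: dict) -> list:
    """Use the structural metadata to list page files in reading order."""
    return [
        page
        for chapter in rec["structural"]["chapters"]
        for page in chapter["pages"]
    ]
```

Here the descriptive block supports discovery, the structural block lets software reassemble the pages into chapters, and the administrative block holds the technical, rights, and preservation details.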

Page 18: OAIS Metadata Solutions

METS (Metadata Encoding & Transmission Standard): A standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library, expressed using the XML schema language of the World Wide Web Consortium.

OAI (Open Archives Initiative): The Open Archives Initiative Protocol for Metadata Harvesting provides an application-independent interoperability framework based on metadata harvesting. There are two classes of participants in the OAI-PMH framework:

Data Providers administer systems that support the OAI-PMH as a means of exposing metadata; and

Service Providers use metadata harvested via the OAI-PMH as a basis for building value-added services.
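To make the harvesting relationship concrete, the sketch below shows how a service provider might build an OAI-PMH `ListRecords` request for a data provider. The base URL is a hypothetical placeholder and the helper name `list_records_url` is invented for this example; `verb`, `metadataPrefix`, `from`, and `set` are request arguments defined by the protocol, and `oai_dc` is the Dublin Core format every data provider is required to support. No network request is made here.

```python
from urllib.parse import urlencode

# Hypothetical data provider endpoint, used for illustration only.
BASE_URL = "https://repository.example.edu/oai"


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, set_spec=None):
    """Build the URL a harvester would request to list a provider's records."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # harvest only records changed since this date
    if set_spec:
        params["set"] = set_spec    # restrict the harvest to one set
    return f"{base_url}?{urlencode(params)}"
```

A harvester would fetch this URL, parse the XML response, and repeat the request with the returned resumption token until the provider reports no more records.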

Page 19: Software Solutions: Open Source

DSpace: Current digital repository system that most closely aligns with the OAIS model. Open source in nature and freely distributed to academic institutions.

EPrints: Repository system built around the OAI protocol for self-archiving, harvesting, storing, and distributing OAI-compliant metadata.

FEDORA: Flexible and Extensible Digital Object and Repository Architecture system, designed by the Cornell Digital Library Research Group.

Page 20: Software Solutions: Open Source, cont.

CDSware: Electronic web preprint server. Complies with the Open Archives Initiative metadata harvesting protocol (OAI-PMH) and uses MARC 21 as its bibliographic standard.

DLXS (Digital Library Extension Service): Open source software developed by the University of Michigan Digital Library Production Service.

Greenstone: Software for building and distributing digital library collections. Produced by the New Zealand Digital Library Project at the University of Waikato.

Page 21: Software Solutions: Vendor Platforms

Ebrary
Endeavor Encompass
CONTENTdm: XML-based digital library management software licensed by OCLC.
Exlibris DigiTool

Page 22: Software: Vendor vs. Open Source

Open source software is a less expensive option in terms of software licensing, but can require more expertise to implement and manage. However, open source software with a well established user community is often very approachable even with limited technical expertise.

Page 23: Software: Vendor vs. Open Source, cont.

The future of vendor supplied software is dependent upon profit. Thus, one should be prepared for the possibility that, if the software becomes unprofitable for a vendor, the vendor may cease its development. It can be a risky investment to be among the early implementers of a new vendor produced solution.

Page 24: When Choosing Vendor Solution

Make sure vendor software utilizes underlying data structures that adhere to well-established open standards, thus allowing your data to remain non-proprietary in nature.

Check vendor customer lists to make contacts with and gather feedback from others already using the software.

Page 25: The Future of Open Source vs. Vendor Software

There has been a lot of leveraging over the last five years to provide open source solutions for digital repositories. Key advances concerning technological standards like OAI and METS have more recently provided a level of consensus that is leading to even more open source possibilities.

METS Implementation Registry: http://sunsite.berkeley.edu/mets/registry/

Page 26: IR Projects

Institutional Based:
DSpace at MIT
California Digital Library’s eScholarship
ePrints.org
University of Virginia’s Fedora Project
CODA at the California Institute of Technology
Ohio Knowledge Bank at Ohio State University

Subject Based:
Online Archive of California
American Memory

Page 27: References for Further Reading

Crow, Raym, “The Case for Institutional Repositories: A SPARC Position Paper,” August 2002. <http://www.arl.org/sparc/IR/ir.html>.

Drake, Miriam A., “Institutional Repositories: Hidden Treasures,” Searcher: The Magazine for Database Professionals, vol. 12 no. 5 (May 2004). <http://www.infotoday.com/searcher/may04/drake.shtml>.

Johnson, Richard K., “Institutional Repositories: Partnering with Faculty to Enhance Scholarly Communication,” D-Lib Magazine vol. 8 no. 11 (November 2002). <http://www.dlib.org/dlib/november02/johnson/11johnson.html>.

Lavoie, Brian, “Meeting the Challenges of Digital Preservation: The OAIS Reference Model,” OCLC Newsletter, no. 243 (January/February 2000): 26-30. <http://www.oclc.org/research/publications/archive/2000/lavoie/>.

Lynch, Clifford A., “Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age,” ARL no. 226 (February 2003): 1-7. <http://www.arl.org/newsltr/226/ir.html>.

Tennant, Roy, “Institutional Repositories,” Library Journal (September 2002). <http://www.libraryjournal.com/article/CA242297?display=searchResults&stt=001&text=institutional%2Brepositories>.