Tracking Metadata Use for Digital Collections

Ellen Knutson, Carole L. Palmer, Michael Twidale
eknutson@uiuc.edu, clpalmer@uiuc.edu, twidale@uiuc.edu

Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign

Institute of Museum and Library Services Digital Collections and Content
http://imlsdcc.grainger.uiuc.edu

Funded by a National Leadership Grant from the Institute of Museum and Library Services

Figures: Number of Participating Institutions by Type; Metadata by Collaboration; Percentage of Institutions by Metadata Scheme; Metadata for Collections Containing Images

Introduction

How can resource developers best represent digital collections and items to meet the requirements of divergent service providers and user communities?

Our first goal is to establish a baseline that describes the institutions, collections, and initial metadata schemes of Institute of Museum and Library Services (IMLS) National Leadership Grant (NLG) digital collection projects.

Method

We performed a content analysis of 95 NLG proposals funded from 1998 to 2002. Where a proposal did not specify the information, we also consulted the project web site.

Interviews are ongoing with a selected group of grantees. Preliminary results from 13 interviews are reported here.

Conclusion from Content Analysis

MARC and Dublin Core (DC) are the most often used metadata schemes in these projects. Academic libraries make up the majority of the institutions involved in creating digital collections. Institutions working alone were more likely to select the MARC standard, while institutions working on a collaborative project were more likely to select DC. For collections that include images (80% of collections), DC is the most popular scheme. MARC is also used frequently, either on its own or combined with DC, CIMI (Consortium for the Computer Interchange of Museum Information) standards, VRA Core (Visual Resources Association Core), EAD, or TEI Header.

Themes from Interviews

Scheme Selection Factors
• Previous experience
• Use by partners and peers
• Ease of training
• Coverage of basic elements

Basis for Local Variations
• Need for specific information
• Responsiveness to existing data
• Adherence to project goals

Richness vs. Simplicity
• MARC preferred for field richness
• Dublin Core preferred for ease of application
• Nonstandard use of fields more prevalent with Dublin Core

Key Challenges/Problems
• Software/technical limitations
• Disparate formats
• Disparate metadata schemes
• Priorities/Time

Key Collection Description Fields
• Institutions
• Subject
• Description
• Format of Material
• Original Collection

Projects and Collections
Participants could envision their project as both a single collection and multiple collections. Multiple collections were generally delineated by institution in collaborative projects and by format in non-collaborative projects.

[Figure: Number of Participating Institutions by Type. Bar chart, scale 0 to 90. Categories: Academic Libraries, Museums, Historical Societies, Public Libraries, State Libraries, Other, Non Profit Organizations, School District, Botanical Gardens, Archives, Library Consortium, Academic Department/Institute.]

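Metadata by Collaboration (counts; columns indicate whether the project was collaborative)

Scheme              No   Yes
Dublin Core          6    15
Local                1     4
MARC                12     3
Other                1     4
Unknown              6     4
Multiple Schemes    19    20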
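The Metadata by Collaboration counts can be converted to within-group shares, which makes the MARC versus Dublin Core contrast easier to compare across groups of different size. The following is a minimal Python sketch, not part of the original analysis. It assumes the "No" and "Yes" columns mean non-collaborative and collaborative projects; the names `COUNTS` and `column_shares` are illustrative.

```python
# Counts transcribed from the "Metadata by Collaboration" table.
# Assumption: tuple order is (No, Yes), read as
# (non-collaborative, collaborative) projects.
COUNTS = {
    "Dublin Core": (6, 15),
    "Local": (1, 4),
    "MARC": (12, 3),
    "Other": (1, 4),
    "Unknown": (6, 4),
    "Multiple Schemes": (19, 20),
}


def column_shares(counts):
    """Return each scheme's share within the No column and within the Yes column."""
    no_total = sum(no for no, _ in counts.values())
    yes_total = sum(yes for _, yes in counts.values())
    return {
        scheme: (no / no_total, yes / yes_total)
        for scheme, (no, yes) in counts.items()
    }


if __name__ == "__main__":
    for scheme, (alone, collab) in column_shares(COUNTS).items():
        print(f"{scheme:<17} alone {alone:6.1%}   collaborative {collab:6.1%}")
```

Under that reading of the columns, the shares point the same way as the stated conclusion: MARC accounts for a larger share of projects in the "No" column, and Dublin Core for a larger share in the "Yes" column.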

Further Research

The DCC project is currently surveying the 95 institutions to further develop the baseline analysis. As our research continues, we will investigate other factors at play in metadata applications and how they evolve as projects progress and collections are used. Over the next two years, we will conduct interviews and focus groups with a representative group of NLG grant awardees, and we will administer a second large-scale survey in the final year of the project.

In the count of institutions by type, the "Other" category includes three government agencies, two special libraries, two companies, two herbaria, a zoo, a Native American tribe, a state park, and a national historic site.

The “Other” metadata schemes specified include: EAD, TEI Header, CSDGM, VRA Core, CIMI standards, and Ebind.

Percentage of Institutions by Metadata Scheme (pie chart labels)

Dublin Core                      23%
Local                             5%
MARC                             16%
Other                             6%
Unknown                          11%
Other                            40%
Dublin Core/MARC                  8%
Dublin Core/Others (not MARC)     6%
Other Multi                       3%
MARC/Others (not Dublin Core)    16%
Dublin Core/MARC/TEI or EAD       6%

[Figure: Metadata for Collections Containing Images. Stacked bar chart, scale 0 to 20. Bars by metadata scheme: Dublin Core; MARC/Others (not Dublin Core); MARC; Dublin Core/MARC; Dublin Core/Others (not MARC); Dublin Core/MARC/TEI or EAD; Local; Other; Other Multi; Unknown. Series by collection content: Images; Images/Text; Images/Text/Other; Images/Other (Not text).]