Collection-Level Description Gordon Dunsire Depute Director, Centre for Digital Library Research Presentation for a workshop at the Libraries in the Digital Age Conference, May 26-30 2003, Dubrovnik

Collection-Level Description

Gordon Dunsire, Depute Director, Centre for Digital Library Research

Presentation for a workshop at the Libraries in the Digital Age Conference, May 26-30 2003, Dubrovnik

Overview

• What is a collection?

• What is collection-level description?

• Why is it important?

• Development in UK

• Some practical issues

• Scottish Collections Network (SCONE)

• Information environments

What is a collection?

• “Any aggregation of individual items (objects, resources)”
  – CD Focus briefing paper 1
  – Size is not a factor
  – 1 item is possible
  – Varying degrees of permanence
  – Physical juxtaposition not necessary; collections can be distributed across multiple locations

• Cross-domain
  – Libraries, museums, art galleries, archives, digital

• Definition is too vague to be practicable
  – Limit to “useful” collections

• “Useful” defined in terms of “Functional granularity”

Functional granularity

• “… useful or necessary for the purposes of resources discovery or collection management”
  – Heaney
  – As deemed by “the institution”
  – Might include user groups as well as owners and administrators

• Exclude
  – Dynamic collections (results of retrieval)
  – Single persons (unless significant)

What is CLD?

• Collection-Level Description
  – Metadata at the level of aggregation:

Title: William Speirs Bruce Collection

Description: Collection of material on oceanography and Arctic and Antarctic exploration, bequeathed by Dr. William Speirs Bruce, Polar explorer and oceanographer (1867-1921).

Location: Edinburgh University Library. Main Library

Collectors: William S. (William Speirs) Bruce (1867-1921) [Collecting: Closed]

Subjects: Antarctica--Discovery and exploration

Part of: Edinburgh University Library. Department of Special Collections printed books collections
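
As a rough illustration, a record like the one above could be held as a simple structured object. The sketch below is an assumption made for illustration only: the class name `CollectionLevelDescription` and its field names are hypothetical and are not taken from any schema named in this presentation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CollectionLevelDescription:
    """Hypothetical container for metadata at the level of the aggregation."""
    title: str
    description: str
    location: str
    collectors: List[str] = field(default_factory=list)
    collecting_status: Optional[str] = None  # e.g. "Closed"
    subjects: List[str] = field(default_factory=list)
    part_of: Optional[str] = None            # title of the super-collection


# Values copied from the example record on the slide.
bruce = CollectionLevelDescription(
    title="William Speirs Bruce Collection",
    description=(
        "Collection of material on oceanography and Arctic and Antarctic "
        "exploration, bequeathed by Dr. William Speirs Bruce, Polar explorer "
        "and oceanographer (1867-1921)."
    ),
    location="Edinburgh University Library. Main Library",
    collectors=["William S. (William Speirs) Bruce (1867-1921)"],
    collecting_status="Closed",
    subjects=["Antarctica--Discovery and exploration"],
    part_of=(
        "Edinburgh University Library. Department of Special Collections "
        "printed books collections"
    ),
)

print(bruce.title)
```

Keeping collectors and subjects as lists reflects that a collection may have several of each; a single string would work equally well for this one record.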

Confusing terms

• Collection-Level Description
  – The complete metadata for a collection
  – The process of creating a CLD

• Collection-Description
  – A finding-aid for the collection (e.g. catalogue)

• Description
  – An attribute of a Collection giving a short summary of the collection history and contents, etc.

Why is CLD important? (1)

• Ideally, all metadata/retrieval is at the level of the work (item-level description, ILD)

• But in the real world …
  – Online ILD metadata not available
    • Legacy; institutional policies
  – Wide variation in ILD structure and content standards
    • Between domains; within domains
    • Within single institutions!

Why is CLD important? (2)

• CLD offers broader coverage
  – More stuff can be found
  – Cheaper to implement
  – High recall, low precision

• Some metadata cannot be accommodated in ILD without extensive duplication
  – E.g. Collection title, Collector, Owner, Location, etc.

Why is CLD important? (3)

• Collaborative management
  – Collaborative acquisition policies
  – Preservation and storage
  – Priorities for digitisation, wider access, etc.

• Landscaping in distributed digital information environments
  – Portals
  – Broad overview, then more precise discovery

Landscaping

• Search term or profile parameter, e.g. name, subject, education level, accessibility

• Retrieve relevant CLDs to create broad "map" of concentrations of resources: peaks of significance; "lodes" for further exploration

• CLDs link to digital collections, and online (analytic) finding aids

• Local ILDs for resource discovery: cross-searching possible with Z39.50/OAI
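
The retrieval step of landscaping can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not how any service named here works: the function `landscape` and the record layout are hypothetical, and only the first sample record comes from this presentation (the "Example …" records are invented for illustration).

```python
from collections import Counter

# Hypothetical CLD records reduced to the fields this sketch needs.
# Only the first record is from the presentation; the others are invented.
CLDS = [
    {"title": "William Speirs Bruce Collection",
     "location": "Edinburgh University Library. Main Library",
     "subjects": ["Antarctica--Discovery and exploration"]},
    {"title": "Example Polar Voyages Collection",
     "location": "Example Library A",
     "subjects": ["Antarctica--Maps", "Oceanography"]},
    {"title": "Example Whaling Logbooks",
     "location": "Example Library A",
     "subjects": ["Whaling--Antarctica"]},
    {"title": "Example Local History Collection",
     "location": "Example Library B",
     "subjects": ["Local history"]},
]


def landscape(clds, term):
    """Count matching CLDs per location to show where resources concentrate."""
    needle = term.lower()
    hits = [c for c in clds
            if any(needle in s.lower() for s in c["subjects"])]
    return Counter(c["location"] for c in hits)


peaks = landscape(CLDS, "antarctica")
for location, count in peaks.most_common():
    print(location, count)
```

A simple substring match gives the high-recall, low-precision behaviour expected at collection level; a real landscape would more plausibly match against controlled subject headings.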

Development of CLD in UK

• Entity-relationship model
  – Michael Heaney
  – Also covers analytic finding aids: collection-descriptions (C-Ds)

• Database schema
  – For RSLP by UKOLN; simplifies Heaney’s model

• Implementation
  – JISC IE Services Registry; simplifies RSLP

Heaney’s Analytic Model

Heaney's components

• Entities
  – Collection; Agent; Location

• Relationships
  – Collection:IsLocatedIn:Location
  – Administrator[Agent]:Administers:Location
  – Collector[Agent]:Collects:Collection
  – * Collection:HasPart:Collection
  – * Collection:IsDescribedBy:C-D[Collection]

* Heaney focussed on single collections
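
The Entity:Relationship:Entity notation above maps naturally onto subject-predicate-object triples. The following Python sketch is an illustration only: the `Relationship` type and the `objects_of` helper are hypothetical, and treating the fields of the William Speirs Bruce record shown earlier as instances of these relationships is an assumption made here, not something the model prescribes.

```python
from typing import List, NamedTuple


class Relationship(NamedTuple):
    """One Entity:Relationship:Entity statement."""
    subject: str
    predicate: str
    obj: str


BRUCE = "William Speirs Bruce Collection"
SPECIAL = ("Edinburgh University Library. Department of Special Collections "
           "printed books collections")

# Assumed mapping of the example record's fields onto the relationships.
RELATIONSHIPS = [
    Relationship(BRUCE, "IsLocatedIn",
                 "Edinburgh University Library. Main Library"),
    Relationship("William S. (William Speirs) Bruce", "Collects", BRUCE),
    Relationship(SPECIAL, "HasPart", BRUCE),
]


def objects_of(subject: str, predicate: str) -> List[str]:
    """Return every object linked to `subject` by `predicate`."""
    return [r.obj for r in RELATIONSHIPS
            if r.subject == subject and r.predicate == predicate]


print(objects_of(BRUCE, "IsLocatedIn"))
print(objects_of(SPECIAL, "HasPart"))
```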

CLD in practice (1)

• Collection titles
  – If no specific title, derive from name of institution or user group defining the collection

• Collection hierarchies
  – Multi-level granularity (6 levels in SWOP)
  – Polyhierarchy: one physical super-collection, but many virtual
  – Data redundancy; inheritance from super-collection
    • E.g. location, owner, access
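
One way to avoid repeating location, owner and access on every sub-collection is to look a missing value up in the super-collection. The sketch below assumes a single physical parent per collection and uses invented collection names; the `Collection` class and its `get` method are hypothetical and do not describe any implementation mentioned in this presentation.

```python
class Collection:
    """Hypothetical collection node with one physical super-collection."""

    def __init__(self, title, parent=None, **attributes):
        self.title = title
        self.parent = parent          # physical super-collection, if any
        self.attributes = attributes  # values recorded at this level only

    def get(self, name):
        """Return the nearest value for `name`, walking up the hierarchy."""
        node = self
        while node is not None:
            if name in node.attributes:
                return node.attributes[name]
            node = node.parent
        return None


# Invented three-level hierarchy for illustration.
library = Collection("Example Library holdings",
                     location="Example Library, Main Building",
                     access="Open to the public")
special = Collection("Example Special Collections", parent=library,
                     access="By appointment")
letters = Collection("Example Letters Collection", parent=special)

print(letters.get("location"))  # inherited from the top level
print(letters.get("access"))    # overridden at the middle level
```

Virtual super-collections in a polyhierarchy would need a separate list of parents that is not used for inheritance; that case is left out of this sketch.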

CLD in practice (2)

• Content interoperability
  – Cross-searching names and subjects in landscapes
  – Varying standards in different organizations

• Agent names (persons and organizations)
  – Much wider range than item-level description
    • Owners, administrators in addition to creators, subjects, to be included in name authority files

• Subjects
  – Collections on specific subjects
  – General collections; subject strengths

CLD in practice (3)

• Dates
  – 18th century books on classical Greece collected from 1890 to 1930
  – Dates of: manufacture; subject; aggregation

• Significance
  – Quantity vs quality; subjective; dynamic
  – 5 first editions with manuscript notes by Robert Burns, or 50000 items by and about Burns?

SCONE story (1)

• CAIRNS
  – Z39.50 clump for distributed searching
  – Metadata for Z servers (service-level description!)
  – Associated metadata for collection-descriptions (catalogue indexes, etc.)
  – Associated metadata for CLDs
  – Access (SQL) database

SCONE story (2)

• SCONE project
  – Collaborative collection management
    • HE/FE plus public libraries sector (SEED)
    • CDLR as lead site
  – Test datasets
    • SLIR; SWOP; ESH; Websites
  – Then Heaney's model and RSLP schema

• SCONE service
  – 2600 CLDs

SCONE story (3)

• SQL database (MS SQL Server)
  – Uses Heaney’s analysis rather than RSLP
  – Fully relational, normal form
  – Incorporates additional metadata not specified
    • Subject strengths (RCO)
    • Service-level description elements (CAIRNS)

• ColdFusion Web data server

• DreamWeaver Website maintenance
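
To give a feel for what a fully relational, normalised layout based on Collection, Agent and Location entities might look like, here is a small sketch. It is not the SCONE schema: every table and column name is hypothetical, and Python's built-in `sqlite3` module stands in for MS SQL Server purely so the example is self-contained. The sample row reuses the William Speirs Bruce record shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical normalised tables: one per entity, plus a link table so that
# an agent can hold several roles (collector, owner, administrator, ...).
conn.executescript("""
CREATE TABLE location (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE agent    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE collection (
    id          INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    location_id INTEGER REFERENCES location(id),
    parent_id   INTEGER REFERENCES collection(id)
);
CREATE TABLE collection_agent (
    collection_id INTEGER NOT NULL REFERENCES collection(id),
    agent_id      INTEGER NOT NULL REFERENCES agent(id),
    role          TEXT NOT NULL,
    PRIMARY KEY (collection_id, agent_id, role)
);
""")

conn.execute("INSERT INTO location (id, name) VALUES (?, ?)",
             (1, "Edinburgh University Library. Main Library"))
conn.execute("INSERT INTO agent (id, name) VALUES (?, ?)",
             (1, "William S. (William Speirs) Bruce"))
conn.execute("INSERT INTO collection (id, title, location_id) VALUES (?, ?, ?)",
             (1, "William Speirs Bruce Collection", 1))
conn.execute("INSERT INTO collection_agent VALUES (?, ?, ?)",
             (1, 1, "Collector"))

# Join the entities back together into a flat, CLD-like row.
rows = conn.execute("""
    SELECT c.title, l.name, a.name, ca.role
    FROM collection c
    JOIN location l          ON l.id = c.location_id
    JOIN collection_agent ca ON ca.collection_id = c.id
    JOIN agent a             ON a.id = ca.agent_id
""").fetchall()

for row in rows:
    print(row)
```

Storing each agent and location once and linking to it is what removes the duplication that a flat record per collection would carry.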

SCONE futures

• CC-interop (COPAC/Clumps interoperability) project
  – Cross-relates SCONE to major UK schemas
  – SCONE clone for RIDING clump

• HaIRST (institutional resources) and SPEIR (Scottish portals) projects
  – SCONE used for landscaping

JISC Information Environment

• “the set of network or online services that support publishing and use of information and learning resources”

• Functional model for resource discovery has 4 stages
  – Observes that some components already exist or are under development

JISC IE Functional Model

• 1: Enter
  – Initial landscape: presentation of collections & services for local service or user profile

• 2: Survey
  – Modify set of collections & services

• 3: Discover
  – Item-level searching using distributed (Z39.50) or physical (OAI harvested; FTP) union catalogue

• 4: Detail
  – Further information about items

IE for Scotland (A)

• Entry
  – Initial landscape [Scottish Cultural Portal; SCONE]

• Survey
  – Collection descriptions service [SCONE]
  – Landscaper
  – Collection-level descriptions

IE for Scotland (B)

• Discover
  – Distributed union catalogue [CAIRNS]
  – Harvested union catalogue [HaIRST]
  – Union catalogue [COPAC]

• Detail
  – Item metadata

Links

• Me
  – [email protected]

• SCONE service
  – http://scone.strath.ac.uk/service/index.cfm
  – “About SCONE” for more information

• CDLR (other projects)
  – http://cdlr.strath.ac.uk

• JISC Information Environment
  – http://www.jisc.ac.uk/index.cfm?name=about_info_env