T. Baker / 27 March 2000 A Registry for Dublin Core Thomas Baker, GMD IuK 2000: "Information, Knowledge and Knowledge Management" Darmstadt, 27 March 2000

Page 1:

T. Baker / 27 March 2000

A Registry for Dublin Core

Thomas Baker, GMD

IuK 2000: "Information, Knowledge and Knowledge Management"

Darmstadt, 27 March 2000

Page 2:

Metadata is a language

• A metadata "sentence" might say:
– "This watercolor has Painter Joseph Beuys, Title Ohne Titel, and Date Painted 1959."

• Dublin Core was designed as a simple metadata language -- a "pidgin" of general concepts for coarse-grained resource discovery.

• In unqualified Dublin Core, the sentence above would say:
– "This resource has Creator Joseph Beuys, Title Ohne Titel, and Date 1959."
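The restatement of the local sentence in unqualified Dublin Core can be sketched in Python. This is only an illustration: the mapping table `LOCAL_TO_DC` and the function `to_unqualified_dc` are assumptions made for this example, and only the property names and values come from the slide.

```python
# Hypothetical mapping from local (museum) terms to unqualified Dublin Core
# element names; the property names are taken from the example sentence.
LOCAL_TO_DC = {
    "Painter": "Creator",
    "Title": "Title",
    "Date Painted": "Date",
}


def to_unqualified_dc(statements):
    """Restate (property, value) pairs using Dublin Core element names.

    Properties with no entry in the mapping are dropped, since the
    simpler language has no word for them.
    """
    return [
        (LOCAL_TO_DC[prop], value)
        for prop, value in statements
        if prop in LOCAL_TO_DC
    ]


watercolor = [
    ("Painter", "Joseph Beuys"),
    ("Title", "Ohne Titel"),
    ("Date Painted", "1959"),
]
dc_record = to_unqualified_dc(watercolor)
```

The specific meaning of "Painter" and "Date Painted" is lost in the result, which is the price of the coarser but more widely understood vocabulary.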

Page 3:

Like languages, schemas evolve

• Like words in languages, metadata terms may be coined, adopted, approved by official bodies, change meaning, or fall from use.

• As in languages, need for simplicity is inevitably in tension with need for complexity
– Dublin Core Element Set is almost always too simple to use "as is", so it is extended locally
– If not managed, the proliferation of local extensions threatens interoperability in broader context

Page 4:

Registries as dictionaries

• Metadata systems, like languages, need dictionaries for tracking usage and managing change

• Like language dictionaries, registries can:
– Prescribe good grammar and good usage guidelines
– Describe how implementors are actually using metadata
– Translate between natural languages
– Define the "parts of speech" of metadata grammar -- building blocks of sentences

Page 5:

Requirements for the DCMI registry

• Users and implementors need
– a dictionary of terms
– a place to publish project- or discipline-specific adaptations to share with colleagues and partners

• Dublin Core Metadata Initiative needs
– to manage its namespace (as a standards agency)
– to provide machine-readable schemas for loading into editors and search engines
– to provide crosswalks to related schemas
– to link the (English) standard to translations in other languages (25 to date)

Page 6:

DCMI Registry Working Group

• An RDF schema registry has already been deployed to support review processes within DCMI

• Working Group: propose policy guidelines for managing the DCMI namespace with this registry

• Later: encourage implementors of DC-related schemas (adaptations, profiles, translations) to put RDF schemas on the Web and link to them

Page 7:

RDF as a publication format for schemas

• How standards are currently defined:
– in HTML pages and paper documents
– no explicit hyperlinks between related elements in different standards

• RDF Schema format (based on XML)
– URIs provide cross-references to related schemas and documentation
– on Web, browse related namespaces (and profiles)
– richer thesaurus relations will support crosswalks between elements that are not exactly equivalent
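To make the idea of URI cross-references concrete, the sketch below builds a tiny RDF/XML schema fragment with Python's standard library and then reads the links back out. The `LOCAL` namespace and its `painter` property are hypothetical, and declaring it an `rdfs:subPropertyOf` of `dc:creator` is an assumption chosen for illustration, not something the DCMI registry is known to contain.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
DC = "http://purl.org/dc/elements/1.1/"
LOCAL = "http://example.org/museum-schema#"  # hypothetical local namespace

ET.register_namespace("rdf", RDF)
ET.register_namespace("rdfs", RDFS)


def build_schema():
    """Describe one local property and link it by URI to a Dublin Core element."""
    root = ET.Element(f"{{{RDF}}}RDF")
    prop = ET.SubElement(
        root, f"{{{RDF}}}Property", {f"{{{RDF}}}about": LOCAL + "painter"}
    )
    label = ET.SubElement(prop, f"{{{RDFS}}}label")
    label.text = "Painter"
    # The cross-reference: a URI pointing into another schema.
    ET.SubElement(
        prop, f"{{{RDFS}}}subPropertyOf", {f"{{{RDF}}}resource": DC + "creator"}
    )
    return root


def cross_references(root):
    """Return (term URI, related term URI) pairs found in a schema."""
    pairs = []
    for prop in root.iter(f"{{{RDF}}}Property"):
        about = prop.get(f"{{{RDF}}}about")
        for link in prop.iter(f"{{{RDFS}}}subPropertyOf"):
            pairs.append((about, link.get(f"{{{RDF}}}resource")))
    return pairs


schema = build_schema()
xml_text = ET.tostring(schema, encoding="unicode")
reparsed = ET.fromstring(xml_text)
links = cross_references(reparsed)
```

Because the link is an ordinary URI, a tool that has only this file can still discover where the related schema lives, which is what HTML pages and paper documents do not offer.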

Page 8:

Current prototype DCMI registry

• Based on MyRDF Toolkit

– Eric Miller, Online Computer Library Center, Dublin (Ohio) and RDF Working Group of W3C

– To define, search, and navigate among distributed collections of RDF schemas

– Works like the Web itself: to add a schema, make it available in RDF on the Web and create hyperlinks to and from that schema

– Points towards a scalable ecology of metadata registries on the Web

Page 9:

Linking multiple translations of a standard

• Model: Multilingual Dublin Core Project

– University of Library and Information Science, Tsukuba, Japan

– Uses RDF schemas to share machine-readable tokens for translations of DC terms in Japanese, Arabic, Punjabi (26 languages to date)

– Java applets for displaying fonts

– Raises policy questions: How can we manage the evolution of Dublin Core as a multilingual standard? How can other language communities help shape the global (English) standard?

Page 10:

Tracking linguistic variation and equivalences

• Model: MetaForm Project

– State and University Library, Goettingen

– Local "manifestations" of Dublin Core for specific projects introduce variations -- like "dialects"

– "Crosscuts" -- how are elements used in different implementations?

– Provides "mappings" and "crosswalks" between Dublin Core and other schemas of similar scope

– Demonstrates the sort of output one would want from queries to a distributed registry

Page 11:

Mapping between schemas via an interlingua

• Model: DESIRE Project Metadata Registry

– UK Office of Library and Information Networking, Bath

– Maps various schemas to a core of shared concepts (interlingua)

• In DESIRE Registry, based on ISO Basic Semantic Registry

– Suggests how interoperability might be achieved among multiple schemas in a scalable manner (n-to-1 instead of n-to-n)
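The scaling argument can be illustrated with a short Python sketch. It counts undirected crosswalks (one per pair of schemas, or one per schema to the shared core) and shows a translation routed through the core. The schema names, field names, and core concepts in `TO_CORE` are invented for this example and are not taken from the DESIRE registry.

```python
def pairwise_crosswalks(n):
    """Crosswalks needed if every pair of n schemas is mapped directly."""
    return n * (n - 1) // 2


def interlingua_crosswalks(n):
    """Crosswalks needed if each schema is mapped once to a shared core."""
    return n


# Hypothetical mappings from two schemas to an invented core of concepts.
TO_CORE = {
    "museum": {"Painter": "agent", "Date Painted": "date"},
    "library": {"Author": "agent", "Published": "date"},
}


def translate(term, source, target):
    """Translate a term between schemas by way of the shared core concept."""
    concept = TO_CORE[source].get(term)
    if concept is None:
        return None
    for candidate, core in TO_CORE[target].items():
        if core == concept:
            return candidate
    return None
```

With ten schemas, direct mapping needs 45 crosswalks while the interlingua needs 10, and adding an eleventh schema costs one new mapping instead of ten.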

Page 12:

Namespaces versus Profiles

• Implementors usually need to "mix and match"
– use parts of one standard with parts of another
– coin some local terms to fill in gaps

• Application profiles (DESIRE Project)
– schemas are defined in namespaces
– namespace semantics reused in application profiles

• Registries should include application profiles
– expressible using RDF schema format
– will help implementors learn from peers

Page 13:

Annotation vocabularies

• Registration authorities or third parties layer annotations on metadata schemas or elements

• For example, DCMI could "recommend" an element or qualifier -- whether it is in DCMI's own namespace or elsewhere

• RDF schemas support this

• Supports notion of "publish first, filter later" (Wilensky talk this morning)

Page 14:

DCMI Usage Committee

• Reviews elements and qualifiers in light of grammatical principle

• Levels of annotation (under discussion)

– local terms in use by projects

– terms proposed for review by Usage Committee

– terms found conforming to grammar principles

– terms recommended by Usage Committee

– terms that have become obsolete

• Versioning and life-cycle of terms (under discussion)
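The annotation levels listed above could be modeled as a status attached to each registered term, which is then used to filter after publication. The slide marks these levels as still under discussion, so the `TermStatus` enumeration, the sample `REGISTRY` entries, and their assigned statuses are purely hypothetical.

```python
from enum import Enum


class TermStatus(Enum):
    """Annotation levels named on the slide (wording abbreviated)."""
    LOCAL = "local term in use by projects"
    PROPOSED = "proposed for review by Usage Committee"
    CONFORMING = "found conforming to grammar principles"
    RECOMMENDED = "recommended by Usage Committee"
    OBSOLETE = "obsolete"


# Hypothetical registry entries; the statuses are invented for illustration.
REGISTRY = [
    ("creator", TermStatus.RECOMMENDED),
    ("painter", TermStatus.LOCAL),
    ("datePainted", TermStatus.PROPOSED),
]


def filter_terms(entries, wanted):
    """Publish first, filter later: select terms by annotation level."""
    return [term for term, status in entries if status in wanted]


recommended = filter_terms(REGISTRY, {TermStatus.RECOMMENDED})
in_progress = filter_terms(REGISTRY, {TermStatus.LOCAL, TermStatus.PROPOSED})
```

Keeping the status separate from the term definition means a third party could layer its own annotations over the same entries without changing them.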

Page 15:

Metadata grammar

• DCMI grammar: Elements and Qualifiers
– Resources (Web pages, books, museum objects) have things like Creators and Titles -- Elements
– Elements are modified by Qualifiers (adjectives)

• RDF grammar: Resources, Relations, Classes
– RDF and DC grammars are close conceptually, though terminologies differ
– Using RDF schemas for the DCMI registry is helping clarify differences and harmonize grammars
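The Element and Qualifier grammar can be sketched as a small data structure in which a qualifier refines an element the way an adjective refines a noun, so removing it still leaves a valid statement. The `Statement` class and the qualifier name "Painted" are assumptions for illustration and are not DCMI-approved terms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Statement:
    """One metadata statement about a resource: an element, a value, and an
    optional qualifier that narrows the element's meaning."""
    element: str
    value: str
    qualifier: Optional[str] = None

    def unqualified(self):
        """Drop the qualifier, keeping the broader element and its value."""
        return Statement(self.element, self.value)


# Hypothetical qualified statement: "Date (Painted) 1959".
painted = Statement("Date", "1959", qualifier="Painted")
plain = painted.unqualified()
```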

Page 16:

[email protected]