Metadata: Information Access for the New Century

Howard Rosenbaum

[email protected]

Indiana Library Federation Annual Meeting

March 15, 2000

http://www.slis.indiana.edu/hrosenba/www/Pres/metadata00/index.html

Metadata revealed!

3.15.00

I. Introduction

• The state of the net today

• What is metadata and why do we need it?

II. What different metadata schemes are available?

• Dublin Core and Warwick Framework

• Digital Object Identifier

• Resource Description Framework

• Persistent URL

III. What does metadata mean for information management in libraries?

I. Introduction

• The state of the net today

“...At some point, the Internet has to stop looking like the world’s biggest rummage sale.

For taming this particular frontier, the right people are librarians, not cowboys. The Internet is made of information, and nobody knows more about how to order information than librarians, who have been pondering that problem for thousands of years.”

Rennie, J. (1997). Civilizing the Internet. Scientific American. 6.

How many are online?

World total: 275.54 million

Africa: 2.46 million

Asia/Pacific: 54.90 million

Europe: 71.99 million

Middle East: 1.29 million

Canada & USA: 136.06 million

South America: 8.79 million

http://www.nua.ie/surveys/how_many_online/index.html

According to the Internet Software Consortium, in January 2000 there were 72,398,092 hosts on the net

http://www.isc.org/ds/WWW-200001/report.html

“The publicly indexable World Wide Web now contains about 800 million pages, encompassing about 6 terabytes of text data”

“Our results show that search engines are increasingly falling behind in their efforts to index the web”

Lawrence and Giles (1999). Accessibility of information on the web. Nature 400 (July 8). 107, 108

Coverage

Search engine coverage of the publicly indexable web has decreased substantially “with no engine indexing more than about 16% of web pages”

Unequal access

Search engines are typically more likely to index sites that have more links to them (more ‘popular’ sites)

They are also more likely to index US sites than non-US sites and commercial sites rather than educational sites

Out of date

Indexing of new or modified pages by just one of the major search engines can take months

http://www.wwwmetrics.com/

Information distribution

83% of sites contain commercial content and 6% contain scientific or educational content

1.5% of sites contain pornographic content

Low metadata use

The simple HTML “keywords” and “description” metatags are only used on the homepages of 34% of sites

Only 0.3% of sites use the Dublin Core metadata standard

http://www.wwwmetrics.com/

Some observations

The Internet was never intended to be a tool for information organization and retrieval

Network resources are proliferating rapidly, so some organization and method of access (beyond browsing) is needed

These resources are increasing at an accelerating rate (we are helping)

Material on the net is quirky, transient, and chaotically archived

Because of the decentralized nature of the net, it is clear that an imposed scheme is unworkable

• So what is metadata and why do we need it?

The Internet is full. Go away.

Metadata may be one way for us to find what we need when we need it and in the form we want

“The concept of metadata predates the Web, having … been coined ... in the 1960s to describe datasets effectively. Metadata is data about data, and ... provides basic information such as the author of a work, the date of creation, links to any related works, etc.”

Miller, P. (1996). Metadata for the Masses. Ariadne. http://www.ariadne.ac.uk/issue5/metadata-masses/

When we search, a typical search engine results page contains many more irrelevant hits than relevant ones

What good is a search that returns 47,000 documents for the phrase “dublin core”?

“Metadata is information that describes other information sources. [It is] a potential remedy to the problem of finding relevant information on the Internet”

Thomas, C.T. and Griffin, L.S. (1999). Who will create metadata for the Internet. First Monday. 3(12). http://www.firstmonday.dk/issues/issue3_12/thomas/index.html

In addition, there is the interesting question of the type of metadata that is appropriate for the web

“There is an obvious requirement for metadata, [it] must be of a form suitable for interpretation both by the search engines and by human beings, and it must also be simple to create so that any web page author may easily describe the contents of their page and make it immediately both more accessible and more useful”

Miller, P. (1996). Metadata for the Masses. Ariadne. http://www.ariadne.ac.uk/issue5/metadata-masses/

Metadata is the information necessary to identify, locate, organize, and access an electronic resource

It describes what can be said about something and what people can do with it (rights)

It describes datasets concisely using a standard format

Because every record follows the same standard format, all metadata records can be treated as equal in worth

Metadata records provide information about data in much the same way that library catalogues provide information about books

A catalogue facilitates searching for particular topics or author(s) - metadata is searchable in a comparable way.

There are two levels of this problem

Organizing an existing collection so that it is accessible over the Internet

The American Memory Project at: http://lcweb2.loc.gov/ammem/ammemhome.html

Berkeley Digital Sunsite collections at: http://sunsite.berkeley.edu/Admin/collection.html

Developing schemes to organize directories of networked information and search tools

This is being done with search engines and metadata

Who uses metadata?

Business uses of metadata

External: advertising and search engine placement

Internal: management of internal digital documents

Academic uses of metadata

To provide a scheme for organizing digital information

For extending access to these materials

Types of metadata

Descriptive (access)

Description: captions, keywords or categories

Access points, location, identifier

Relationship to other objects

File type, size or creation date

Administrative

Management information

Provenance: authentication, document conversion info

Rights, terms and conditions

Structural

Putting the object together from its logical components

Benefits of metadata

For the producer

Ability to provide relevant details about the resource

Ability to provide information which is not in the resource (e.g. descriptive text for images or executable files)

Ability to highlight the most important aspects of the resource

For the indexing service

No need to guess about resource content

Highly structured data to index

Less bandwidth, more efficient, easier to maintain

For the user

More precise results via retrieval on surrogate content

Field-based searching

Access to non-textual resources

Less information overload

Metadata can support many potential applications:

Resource discovery

Content ratings

E-commerce

Authentication

Data management

Intellectual property rights management

Digital preservation

Searching, location

Quality/rating

Semantic interoperability

Resource management

There are two levels at which the problem can be attacked

Classifying and organizing a core collection of digital materials

The questions: what to collect, how to organize it, how to maintain it, and how to provide access to it

Creating directories, search tools, metadata schemes and other means of access to digital materials outside the core collection

The questions: what to include, why, maintenance, and the provision of access

These questions are becoming increasingly important in the design of digital libraries

II. What different metadata schemes are available?

• Dublin Core and Warwick Framework

One suggestion is to use “metadata”

The “Dublin Core Metadata Program” is one example

What are the necessary elements that should be used to describe networked information?

This was discussed at an OCLC workshop in 1995

Goals

Fostering a common understanding of the needs, strengths, shortcomings, and solutions of stakeholders

Reaching consensus on a core set of metadata elements to describe networked resources

http://www.oclc.org:5047/oclc/research/conferences/metadata/dublin_core_report.html

A small set of metadata elements would be valuable

It would encourage authors and publishers to provide metadata, in a form that automated resource discovery tools could collect

It would encourage the creation of network publishing tools containing a template for metadata elements, simplifying the task of creating metadata records

This type of record could serve as the basis for a more detailed cataloging record if the need arises

If something like the Dublin Core becomes a standard, metadata records can be understood across user communities

Defined Universal Bibliographic Language for INternet and Coherent Online REsource

It is a minimal information resource description set

It is intended for organization and resource discovery on the web

It will improve searching with simple resource description semantics

Researchers have built a consensus around a core element set that is

Simple and intuitive

Cross-disciplinary

International

Flexible

The Dublin Core metadata element set supports resource discovery because it is:

Easy for authors and content managers to create and maintain

Interoperable, extensible, and platform independent

Syntax-independent

Intended for, but not limited to, network resources

Intended to be embedded, but needn’t be

Not intended to meet complete metadata needs of any given community

These are the elements in the Dublin Core

Title: The name of the object

Author: The person(s) primarily responsible for the intellectual content of the object

Subject/keywords: The topic addressed by the work

Typically expressed as keywords, key phrases or classification codes that describe a topic of the resource

Description: An account of the content of the resource

May include but is not limited to an abstract, table of contents, reference to a graphical representation of content or a free-text account of the content

Publisher: The agent or agency responsible for making the object available

Date: A date associated with an event in the life cycle of the resource, typically written as YYYY-MM-DD

ObjectType: The genre of the object, such as novel, poem, or dictionary

Format: The data representation of the object, or the physical or digital manifestation of the resource

Typically the media-type or dimensions of the resource

May be used to determine the software, hardware or other equipment needed to display or operate it

Resource Identifier: An unambiguous reference to the resource within a given context, using a string or number conforming to a formal identification system

URI, URL, DOI, ISBN

Relation: Relationship to other objects

Source: Objects, either print or electronic, from which this object is derived, if applicable

Coverage: The extent or scope of the content of the resource

Will include spatial location (geographic coordinates or a place name), temporal period (a period label, date, or date range) or jurisdiction (a named administrative entity)

OtherAgent/contributor: The person(s), such as editors and transcribers, who have made other significant intellectual contributions to the work

Language: Language of the intellectual content

Rights Management: information about rights held in and over the resource

Using the Dublin Core:

dc.title=The Book of Me

dc.creator=Me

dc.subject=My life

dc.subject=All about me

dc.publisher=The Press of Me

dc.contributor=Only me

Here’s what it might look like embedded in an HTML document:

<HTML><HEAD> <TITLE>The Home Page of Me</TITLE>

<META NAME="package" CONTENT="(TYPE=begin) Dublin Core">

<META NAME="DC.title" CONTENT="The story of me"> <LINK REL=SCHEMA.dc HREF="http://purl.org/dublin_core_elements#title">

<META NAME="DC.subject" CONTENT="biography, fascinating person, me"> <LINK REL=SCHEMA.dc HREF="http://purl.org/dublin_core_elements#subject">

<META NAME="DC.description" CONTENT="A hard hitting biography"> <LINK REL=SCHEMA.dc HREF="http://purl.org/dublin_core_elements#description">

<META NAME="DC.creator" CONTENT="Howard Rosenbaum"> <LINK REL=SCHEMA.dc HREF="http://purl.org/dublin_core_elements#creator">

</HEAD>
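As a rough sketch of how an authoring tool might automate this step, the short Python script below turns the dc.* pairs from the "Using the Dublin Core" example into META tags. The function name dc_meta_tags and the record list are illustrative assumptions, not part of any Dublin Core specification, and the SCHEMA links are left out for brevity.

```python
from html import escape


def dc_meta_tags(pairs):
    """Render (element, value) pairs as HTML META tags, one tag per pair."""
    tags = []
    for element, value in pairs:
        name = "DC." + element.lower()
        # Escape quotes and angle brackets so values cannot break the markup.
        tags.append('<META NAME="%s" CONTENT="%s">'
                    % (escape(name, quote=True), escape(value, quote=True)))
    return tags


# The record from the "Using the Dublin Core" example; repeated elements are allowed.
record = [
    ("title", "The Book of Me"),
    ("creator", "Me"),
    ("subject", "My life"),
    ("subject", "All about me"),
    ("publisher", "The Press of Me"),
    ("contributor", "Only me"),
]

for tag in dc_meta_tags(record):
    print(tag)
```

A list of pairs is used instead of a dictionary because an element such as subject can occur more than once.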

The Warwick Framework

At the Warwick Workshop, researchers developed a “container architecture” known as the Warwick Framework

The goal was to create an architecture that associates diverse types of metadata with a resource

It is a mechanism for logically and physically aggregating distinct “packages” of metadata.

The Framework is an advance because:

It allows the designers of individual metadata sets to focus on specific requirements without concerns for generalization

The syntax of each metadata set can vary in conformance with semantic requirements, community practices, and functional processing requirements

The management of and responsibility for specific metadata sets is left to respective “communities of expertise”

It promotes interoperability by allowing tools and agents to selectively access and manipulate individual packages and ignore others

It permits access to different metadata sets that are related to the same object to be separately controlled

It flexibly accommodates future metadata sets by not requiring changes to existing sets or the programs that make use of them

Digital object identifiers

http://www.doi.org/

This is an initiative from international book and journal publishers

It is a new identification system to be used for all digital content

The DOI system provides a unique identification for that content, protecting intellectual property

It also provides a way to link users of the materials to the rights holders themselves to facilitate automated digital commerce in the new digital environment

Developed and tested over the last year, the DOI system is now being used by more than a dozen U.S. and European publishers in a pilot program that has been running since July

Participation in Phase Two of the Prototype was extended to all publishers at the Frankfurt Book Fair in October 1997

DOI will be a persistent means to authenticate content to ensure that what the customer is requesting is what is being sent

The DOI System has three parts: the identifier, the directory, and the database.

The identifier is made up of two components

The first element, the prefix, is assigned to the publisher by the Directory Manager

However, at this phase of the Prototype, the prefixes all begin with 10 to designate the Directory Manager making the assignment of the prefix

This is followed by a number designating the publisher who will be depositing the individual DOIs

Publishers may choose to request a prefix for each imprint or product line, or may use a single prefix

The second element, following a slash mark, is the suffix

This is the designation assigned by the publisher to the specific content being identified

Many use recognized international standards for their suffixes

If they do so, they are encouraged to indicate the standard being used by preceding it with a code

The suffix can follow any system of the publisher’s choosing, and be assigned to objects of any size - book, article, abstract, chart - or any file type - text, audio, video, image or software

An object (book) may have one DOI, and a component within that object (chapter) may have another DOI

The publisher decides the level of identification based on the nature of objects sold and distributed over the Internet

The suffix can be as simple as a sequential number or a publisher's own internal numbering system

10.1002/[ISBN]0-471-58064-3

Prefix: 10.1002 (directory: 10; registrant prefix: 1002)

Suffix: [ISBN]0-471-58064-3 (code, optional: [ISBN]; item number: 0-471-58064-3)
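The anatomy of the identifier can be sketched in a few lines of Python. This is only an illustration of the prefix/suffix split described on these slides: the function names are made up, and the bracketed code notation is taken from the slide example, not from a published DOI syntax.

```python
def split_doi(doi):
    """Split a DOI into (directory, registrant, suffix)."""
    # The first slash separates prefix from suffix; the suffix may contain more slashes.
    prefix, slash, suffix = doi.partition("/")
    if not slash or not suffix:
        raise ValueError("a DOI needs a prefix and a suffix separated by a slash")
    directory, dot, registrant = prefix.partition(".")
    if not dot or not directory or not registrant:
        raise ValueError("the prefix should look like <directory>.<registrant>")
    return directory, registrant, suffix


def split_suffix(suffix):
    """Return (code, item); code is None when no bracketed standard code is present."""
    if suffix.startswith("[") and "]" in suffix:
        code, _, item = suffix[1:].partition("]")
        return code, item
    return None, suffix


directory, registrant, suffix = split_doi("10.1002/[ISBN]0-471-58064-3")
print(directory, registrant, split_suffix(suffix))
```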

The Directory:

The DOI system acts as a routing system

Digital content may change ownership or location over time, so the DOI system uses a central directory

When a user clicks on a DOI, a message is sent to the central directory where the current web address associated with that DOI appears

This location is sent back to the user’s browser with a message telling it to “go to this particular net address.”

In a second the user sees a “response screen” - a Web page - on which the publisher offers the reader either the content itself, or further information about the object and how to obtain it

When the object moves to a new server or the copyright holder sells the product, one change is recorded in the directory and all subsequent readers are sent to the new site

The DOI remains reliable and accurate because the link to the associated information or source of the content is easily and efficiently changed
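The routing idea can be modelled with a toy lookup table. The sketch below is an assumption-laden illustration, not the actual DOI directory software: the class name DoiDirectory and both server addresses are hypothetical.

```python
class DoiDirectory:
    """A toy central directory mapping each DOI to its current web address."""

    def __init__(self):
        self._locations = {}

    def register(self, doi, url):
        # Registering a DOI again simply overwrites the old address.
        self._locations[doi] = url

    def resolve(self, doi):
        # Raises KeyError when the DOI has never been registered.
        return self._locations[doi]


directory = DoiDirectory()
doi = "10.1002/[ISBN]0-471-58064-3"

directory.register(doi, "http://old-server.example/book")
print(directory.resolve(doi))

# The content moves: one change in the directory redirects every later reader.
directory.register(doi, "http://new-server.example/book")
print(directory.resolve(doi))
```

The DOI that readers hold never changes; only the directory entry does.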

The database

Information about the object, beyond simply the response screen, is maintained by the publisher

It might include the content or the information on where and how to obtain the content or other related data

The information that the user has access to in response to a DOI query is the third component of the DOI system

The DOI is being developed to conform to, and take advantage of, all relevant international standards

The syntax of the DOI is being proposed as NISO standard Z39.84

DOI metadata will be expressed in RDF using XML

DOI conforms to the syntax for URNs laid down by IETF

DOI has aligned with Interoperability of Data in ECommerce Systems (INDECS)

INDECS uses the current major initiatives of structured metadata (including Dublin Core and IFLA) to define a common metadata model for ecommerce.

The DOI Foundation is a member of the W3C and is in close contact with standardisation activities from ISO and others as well as initiatives from WIPO and other major bodies

Examples of DOI Usage

An article reference found on the net is linked to an abstract and information about the availability of full text

A reader of one article was linked to related material including similar articles, or books.

A reader using DOIs saw the full text of an article and the table of contents of the journal in which the article appeared

She could subscribe to the journal, purchase a book, or order the content for later delivery

A user was able to use the DOI to automatically contact a help service, or download the current driver for a software product

RDF: Resource Description Framework 2/99

W3C (World Wide Web Consortium) initiative

W3C’s RDF provides a generic metadata architecture

It is a specification currently being developed to support the definition of metadata across the web.

It describes how metadata for content is defined in web documents

This metadata is descriptive information about the structure and content of information in a document

RDF is useful for describing information about indexing, navigating and searching a site, as well as push channel definitions and digital signatures

http://www.w3c.org/RDF/

RDF is the instantiation of the Warwick Framework for the Web

The basic RDF data model consists of three object types:

Resource: anything that can be specified by a URI, such as a web page, an entire web site, a specific newsgroup message

Properties: characteristics or attributes of a resource, along with some notion of meaning, valid values, etc.

Statements: resource + named property + value of property

This is expressed as a “tuple” {subject predicate object}
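A minimal way to picture the {subject predicate object} tuple is as a three-field record. The Python sketch below is only an illustration of the data model, not an RDF library: the names Statement and values_of are assumptions, and the sample values are borrowed from the W3C folio example in this deck.

```python
from collections import namedtuple

# One RDF statement: a resource, a named property, and the value of that property.
Statement = namedtuple("Statement", ["subject", "predicate", "object"])

statements = [
    Statement("http://www.w3.org/folio.html", "DC:Title", "The W3C Folio 1999"),
    Statement("http://www.w3.org/folio.html", "DC:Creator", "W3C Communications Team"),
    Statement("http://www.w3.org/folio.html", "DC:Date", "1999-03-10"),
]


def values_of(statements, subject, predicate):
    """Return every value recorded for one property of one resource."""
    return [s.object for s in statements
            if s.subject == subject and s.predicate == predicate]


print(values_of(statements, "http://www.w3.org/folio.html", "DC:Creator"))
```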

It will be the foundation for an architecture for metadata on the Web

Resource description

Electronic commerce

Site mapping

Third party rating

Digital signatures

Search engine data collection (web crawling)

Digital library collections

Distributed authoring

Using XML, it might look like this:

<RDF xmlns:DC="http://purl.org/DC">

<Description about="http://www.w3.org/folio.html">

<DC:Title>The W3C Folio 1999</DC:Title>

<DC:Creator>W3C Communications Team</DC:Creator>

<DC:Date>1999-03-10</DC:Date>

<DC:Subject>Web development, World Wide Web Consortium, Interoperability of the Web</DC:Subject>

</Description>

</RDF>
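To show that a record like this is machine-readable, the sketch below reads the same XML with Python's standard xml.etree.ElementTree module and lists each property as a (resource, property, value) triple. It assumes the simplified syntax shown on this slide, where RDF and Description carry no namespace of their own; the function name extract_statements is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_XML = """<RDF xmlns:DC="http://purl.org/DC">
  <Description about="http://www.w3.org/folio.html">
    <DC:Title>The W3C Folio 1999</DC:Title>
    <DC:Creator>W3C Communications Team</DC:Creator>
    <DC:Date>1999-03-10</DC:Date>
    <DC:Subject>Web development, World Wide Web Consortium, Interoperability of the Web</DC:Subject>
  </Description>
</RDF>"""

# ElementTree writes a namespaced tag as "{namespace-uri}localname".
DC_NS = "{http://purl.org/DC}"


def extract_statements(xml_text):
    """Return (resource, property, value) triples for every DC element found."""
    root = ET.fromstring(xml_text)
    triples = []
    for description in root.findall("Description"):
        resource = description.get("about")
        for prop in description:
            if prop.tag.startswith(DC_NS):
                triples.append((resource, prop.tag[len(DC_NS):], prop.text))
    return triples


for triple in extract_statements(RDF_XML):
    print(triple)
```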

MARBI is an ALA committee that advises LOC on changes to USMARC record formats

Proposal 93-4 set out major changes to bibliographic formats to account for networked information

They suggested a new set of data elements to add to the record and forced people to think of the definition of an online resource

The key element was remote access

They suggested changes to Field 256 “File Characteristics,” and lost

They recommended creating Field 856 “Electronic Location and Access” and won

MARC Initiatives for the identification and description of networked resources

name of resource

acronym/initialism

producer

distributor

location

contact name and address

network access

network address

hours of service

telephone

fax

network access instructions

terminal emulation

logon instructions

logoff instructions

type of resource

size of resource

frequency of update

language of resource

profile of resource

audience

restrictions on access

authorization

source machine

databases available

other providers of database

responsibility for record maintenance

date/time of last update of directory information

local access information and guidelines

cost for use

coverage

indexing terms

What's here now

What's new?

Field 856 is an embedded holdings field within the bibliographic record

It contains the information needed to locate digital information

The information identifies the location containing the item or from which it is available

It also contains information to retrieve the item by the access method identified in the first indicator

This information is sufficient to allow for the electronic transfer of a file, subscription to an electronic journal, or logon to a library catalog

856 ELECTRONIC LOCATION AND ACCESS

Indicators (of which there are always two)

First: Access method

0 Email

1 FTP

2 Remote login (Telnet)

8 Other

Second: Undefined

# Undefined

Subfield Codes

$a - Host name (R)

$b - IP address (NR)

$c - Compression information (NR)

$d - Path (R)

$f - Filename (R)

$g - Name of publication or conference (NR)

$h - Processor of request (NR)

$i - Instruction (R)

$k - Password (NR)

$l - Logon/login (NR)

$m - Contact person for information, assistance (R)

$n - Name of location of host in $a (NR)

$p - Port (NR)

$q - File mode (NR)

$s - File size (R)

$t - Terminal emulation (R)

$x - Non-public note (R)

$z - Public note (R)

$2 - Source of access (NR)
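To make the field layout concrete, here is a small Python sketch that assembles a human-readable 856 line from an access method and a list of subfields. The text layout ("856", indicators, then $-coded subfields) is only a common display convention assumed for illustration, not the machine-readable MARC exchange format, and the host, path, and filename values are hypothetical.

```python
# First-indicator values as listed on this slide.
ACCESS_METHODS = {
    "0": "Email",
    "1": "FTP",
    "2": "Remote login (Telnet)",
    "8": "Other",
}


def format_856(access_method, subfields):
    """Build a display string for field 856 from (code, value) subfield pairs."""
    if access_method not in ACCESS_METHODS:
        raise ValueError("unknown access method: %r" % (access_method,))
    body = " ".join("$%s %s" % (code, value) for code, value in subfields)
    # The second indicator is undefined, shown here as '#'.
    return "856 %s# %s" % (access_method, body)


print(format_856("1", [("a", "ftp.example.org"),
                       ("d", "/pub/docs"),
                       ("f", "report.txt")]))
```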

OCLC’s Persistent URL (PURL) project

http://purl.oclc.org/

Functionally, a PURL is a URL with three parts

Protocol: this is used to access the PURL resolver

This protocol may differ from that used to access the resource associated with the PURL

Resolver address: the IP address or domain name of the PURL resolver

This portion of the PURL is resolved by the Domain Name System (DNS)

Name: user-assigned name

Note: This may differ from the name of the resource in the associated URL

Instead of pointing directly to the location of an Internet resource, a PURL points to an intermediate resolution service

The PURL resolution service associates the PURL with the actual URL and returns the URL to the client

The client can then complete the URL transaction in the normal fashion.

The advantage of PURLs is that they persist over time no matter where the page moves

http://purl.oclc.org/your.address/yourfile.html

protocol: http; resolver address: purl.oclc.org; name: your.address/yourfile.html
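Because a PURL is an ordinary URL, its three parts can be pulled apart with a standard URL parser. The sketch below uses urlsplit from Python's standard library; the function name split_purl is an assumption made for this example.

```python
from urllib.parse import urlsplit


def split_purl(purl):
    """Return (protocol, resolver address, name) for a PURL."""
    parts = urlsplit(purl)
    # The path begins with '/', which is a separator and not part of the name.
    return parts.scheme, parts.netloc, parts.path.lstrip("/")


protocol, resolver, name = split_purl("http://purl.oclc.org/your.address/yourfile.html")
print(protocol)
print(resolver)
print(name)
```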

The model works something like this:

[Figure: the client sends a PURL to the PURL server and receives a URL back; the client then sends that URL to the resource server, which returns the resource]

III. What does metadata mean for information management in libraries?

There are social and technical issues in the use of metadata

Metadata use requires collaboration because it provides little benefit if authors simply add whatever metadata they like to their resources

People have to agree on the metadata schemes to use (purposes and values of the scheme)

To reach agreement, people must be willing to abandon old procedures and adopt new methods

It is critical to address the social aspects of metadata early and often

Issues:

Implementing new ways of storing and retrieving information requires the cooperation of various stakeholders

This means that education is necessary for those who may never see a metadata record up close and personal

It requires attention to staffing and work flow

In a practical sense, it is difficult to find people with the specialized skills needed to evaluate, implement, and maintain systems that exploit metadata

The administration should understand that involvement with metadata will require commitment of time and resources for staff training and education

Convincing creators of digital information to use metadata

For a library to use metadata, the creators of digital documents must embed it in the document

This is not trivial because adding metadata is an investment of time and effort

Librarians can help creators work with metadata but should not take responsibility for putting it in documents and files

This is because metadata is embedded in the work itself and will rarely if ever be directly controlled by the librarian

Convincing librarians to understand metadata and tools which exploit it

Metadata is a tool, not a solution

Librarians must understand metadata and possess certain skills in order to make it useful for library patrons

Introducing entirely new systems for information access may undermine the goal of providing integrated access to all the library's holdings

It may require librarians and patrons to learn new skills

For these reasons and others, some librarians may feel that it is not worthwhile to work with metadata

Who will be responsible for creating and maintaining metadata?

Publisher side

Author

Webmaster

Institution

Service side

Search service

Third party creators

How will it be done?

Automatically generated

Hand crafted

Technical Issues

Compatibility with present access mechanisms and data

One issue when introducing metadata is to retain compatibility with existing access mechanisms

The catalog is the primary access point to the vast majority of library resources

MARC has become the accepted standard for exchange of library information and has heavily influenced storage and display of information

It is not wise to become dependent on a technology that is incompatible with MARC

There should be a compelling case to believe that a technology will become dominant or that migration will be possible

There are no widely accepted metadata standards yet

Some efforts have attracted interest, but use is infrequent and inconsistent

Librarians are interested in the Dublin Core because its elements transfer relatively easily to MARC

But the DC does not work well with resources that don't behave like paper documents

Another problem with the DC is that it defines a minimal set of elements and further development seems to have stopped

There is no guarantee that metadata generated today will be useful for providing access to documents in the future

Banerjee, K. (1999). Practical Applications of Metadata at Oregon State University. http://ucs.orst.edu/~banerjek/papers/ola1999.html

Libraries Working Group

Working Group Chair: Rebecca Guenther [email protected]

Working Group Charter:

Foster increased interoperability between DC and library metadata by identifying issues and solutions;

Keep the library community informed on DC developments;

Consider reasons to experiment more widely with Dublin Core in libraries;

Build a library implementors community;

Explore the need for a cross domain namespace(s) to register non-DC elements and qualifiers needed by the library community

http://purl.oclc.org/dc/groups/libraries.htm

What will information professionals have to learn?

The range of applicable metadata schemes, their strengths and weaknesses

How to apply appropriate schemes to digital information

The ways in which various metadata schemes facilitate resource discovery

How they affect the administration of digital information, information security, documentation, data mining…

The relationship between standards and metadata

How to test and evaluate various metadata schemes

A challenge for information professionals is to work with different metadata schemes

This involves developing metadata “crosswalks”

Fluid capability to work with same data in different metadata structures

Requires agreement on semantics

Requires standardized mappings for interoperability

It will involve working with metadata in two forms

Embedded with data

Independent of items

Here is an example of a metadata crosswalk from Dublin Core to MARC

The conversion of a DC-style record involves

Skeletal record for enhancement

Incorporating DC record into MARC database

Subject and Keywords

Simple: 653$a (Index term--Uncontrolled)

Complex: If scheme=LCSH: 650$a

If scheme=LCC: 050$a

If scheme=DDC: 082$a

If scheme=(other): 650$a with $2 (code)

This enables communication of the DC record in MARC
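The Subject and Keywords rules above amount to a small lookup table. The Python sketch below encodes only the mappings listed on this slide; the function name subject_to_marc and the shape of its return value are assumptions made for illustration, not part of any published crosswalk.

```python
# MARC targets for subject schemes named on the slide.
SCHEME_TO_FIELD = {
    "LCSH": "650$a",
    "LCC": "050$a",
    "DDC": "082$a",
}


def subject_to_marc(value, scheme=None):
    """Map one DC subject to (MARC field, value, source code for $2 or None)."""
    if scheme is None:
        # Simple case: an uncontrolled index term.
        return ("653$a", value, None)
    field = SCHEME_TO_FIELD.get(scheme)
    if field is not None:
        return (field, value, None)
    # Any other scheme goes to 650$a, with the scheme code carried in $2.
    return ("650$a", value, scheme)


print(subject_to_marc("My life"))
print(subject_to_marc("Metadata", scheme="LCSH"))
print(subject_to_marc("025.3", scheme="DDC"))
print(subject_to_marc("Cataloging", scheme="mesh"))
```

Keeping the mapping in a table, separate from the function, reflects the point about standardized mappings: the table is what communities would need to agree on.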

Here are some other examples of crosswalks:

DC/MARC/GILS Crosswalk: http://www.loc.gov/marc/dccross.html

MARC/FGDC: http://alexandria.sdc.ucsb.edu/public-documents/metadata/fgdc2marc.html

GILS/MARC: http://www.usgs.gov/gils/prof_v2.html#annex_b

Also:

Dublin Core to FGDC

MARC to SGML

Crosswalks allow resource discovery across syntaxes

Examples of metadata schemes

Text Encoding Initiative (TEI) http://www.uic.edu/orgs/tei/

Global Information Locator Service (GILS) http://www.gils.net/index.html

Computer Interchange of Museum Information (CIMI) http://www.cimi.org/

Encoded Archival Description (EAD) http://lcweb.loc.gov/ead/

Content Standards for Digital Geospatial Metadata (CSDGM) http://www.fgdc.gov/metadata/contstan.html

Nordic Metadata Project http://linnea.helsinki.fi/meta/

HotOil: Distributed Searching over Heterogeneous Information Sources http://www.dstc.edu.au/Research/Projects/hotoil/

National Biological Information Infrastructure (NBII) http://www.nbii.gov/index.html

Categories for the Description of Works of Art (CDWA) http://www.ukoln.ac.uk/metadata/desire/overview/rev_03.htm

Interoperability of Data in ECommerce Systems (INDECS) http://www.indecs.org/

This presentation is on the web at: http://www.slis.indiana.edu/hrosenba/www/Pres/metadata00/index.html