Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to...

34
Resource Description: Metadata/Dublin Core CS 431 February 21, 2007 Carl Lagoze Cornell University

Transcript of Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to...

Page 1: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

Resource Description:

Metadata/Dublin Core

CS 431 – February 21, 2007

Carl Lagoze – Cornell University

Page 2: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

But first

A bit more XML

Page 3: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

Referencing a name within the

schema• “ref” attribute

• Example:

http://www.cs.cornell.edu/courses/CS431/2007s

p/examples/xml_schema/business_card.xsd

Page 4: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

Type Reuse

• Include – add definitions in same

namespace

• Redefine – extend or restrict definitions in

same namespace

• Import – Create new namespace and use

definitions from another namespace

Page 5: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

Import Example

• Address schema

– http://www.cs.cornell.edu/courses/CS431/200

7sp/examples/xml_schema/address.xsd

• Person schema

– http://www.cs.cornell.edu/courses/CS431/200

7sp/examples/xml_schema/person.xsd

• Instance document

– http://www.cs.cornell.edu/courses/CS431/200

7sp/examples/xml_schema/person.xml

Page 6: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

Include Example

• Address schema

– http://www.cs.cornell.edu/courses/CS431/200

7sp/examples/xml_schema/address.xsd

• Person schema

– http://www.cs.cornell.edu/courses/CS431/200

7sp/examples/xml_schema/person2.xsd

• Instance document

– http://www.cs.cornell.edu/courses/CS431/200

7sp/examples/xml_schema/person2.xml

Page 7: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

Redefine Example

• Address schema

– http://www.cs.cornell.edu/courses/CS431/200

7sp/examples/xml_schema/address.xsd

• Person schema

– http://www.cs.cornell.edu/courses/CS431/200

7sp/examples/xml_schema/person3.xsd

• Instance document

– http://www.cs.cornell.edu/courses/CS431/200

7sp/examples/xml_schema/person3.xml

Page 8: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

Substitution Groups

• Provide alternates for a sub-tree

• Example

– http://www.cs.cornell.edu/courses/CS431/200

7sp/examples/xml_schema/products.xsd

– http://www.cs.cornell.edu/courses/CS431/200

7sp/examples/xml_schema/products.xml

Page 9: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

Acknowledgments

• Andy Powell, Head of Development,

Eduserv Foundation, UK

• Tom Baker, Dublin Core Metadata

Initiative

• Diane Hillmann, Cornell University

• Erik Wilde, UC Berkeley School of

Information

Page 10: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity
Page 11: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

Dublin Core

• Origins at 1994 Web Conference

– Metadata was necessary for finding things on the web

– Simple cross-domain vocabulary (15 elements)

describing “document-like” objects

• 2004 ISO standard elements

– http://dublincore.org/documents/dces/

Page 12: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

The fifteen Dublin Core

Elements

Creator Title Subject

Contributor Date Description

Publisher Type Format

Coverage Rights Relation

Source Language Identifier

http://dublincore.org/documents/dces/

Page 13: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

Dublin Core Qualifiers

• From loose semantics to more specific

description

• Model of “graceful degradation”

– Support both simplicity and specificity

– Intra-domain and inter-domain semantics

• Informally three class of qualification

– Element refinement – from “date” to “date published”,

from “contributor” to “illustrator”

– Value encoding schemes – from “subject” to “LCSH

subject”

– Language

Page 14: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

The Dublin Core Vocabulary

http://dublincore.org/documents/dcmi-terms/

Elements1. Identifier

2. Title

3. Creator

4. Contributor

5. Publisher

6. Subject

7. Description

8. Coverage

9. Format

10. Type

11. Date

12. Relation

13. Source

14. Rights

15. Language

Abstract

Access rights

Alternative

Audience

Available

Bibliographic citation

Conforms to

Created

Date accepted

Date copyrighted

Date submitted

Education level

Extent

Has format

Has part

Has version

Is format of

Is part of

Is referenced by

Is replaced by

Is required by

Issued

Is version of

License

Mediator

Medium

Modified

Provenance

References

Replaces

Requires

Rights holder

Spatial

Table of contents

Temporal

Valid

RefinementsBox

DCMIType

DDC

IMT

ISO3166

ISO639-2

LCC

LCSH

MESH

Period

Point

RFC1766

RFC3066

TGN

UDC

URI

W3CTDF

SchemesCollection

Dataset

Event

Image

Interactive

Resource

Moving Image

Physical Object

Service

Software

Sound

Still Image

Text

Types

Page 15: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

Dumb-down

• the process of translating a qualified DC

metadata record into a simple DC

metadata record is normally referred to as

„dumbing-down‟

• can be separated into two parts:

– Property – from refinement to core element

– Value – from encoding to basic string

Page 16: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

Encoding DC - XML

http://dublincore.org/documents/2002/12/02/dc-xml-guidelines/

Page 17: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

Encoding DC - XHTML

http://dublincore.org/documents/dcq-html/

Page 18: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

DC Vocabulary in Context:

Model and modularity• Resources are related to each other

• There are many vocabularies

Page 19: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

One resource, on description

Play

Title

Antony and Cleopatra

createdBy

Date

1606

Subject

Roman history

William Shakespeare

Page 20: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

Relationship among many resources

PlayPlaywright

CollectionTitle

Title

Collected Works of

William Shakespeare

Antony and Cleopatra

William Shakespeare

Name

isPartOf

createdBy

Date

1606

Subject

Roman history

Birthplace

Stratford

Description 1

Description 2

Description 3

One-to-one principle

Page 21: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

...in one record

Description Package

PlayPlaywright

CollectionTitle

Title

Collected Works of

William Shakespeare

Antony and Cleopatra

William Shakespeare

Name

isPartOf

createdBy

Date

1606

Subject

Roman history

Birthplace

Stratford

Beschreibung A

Beschreibung B

Beschreibung C

Page 22: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

Dublin Core Abstract Model

Packaging multiple descriptions

and vocabularies together

Page 23: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

Resource URI

Property URI Rich representation

Property URI Value URI Vocab Enc Scheme URI

Property URIValue string Syntax Enc Scheme URI

Value string Syntax Enc Scheme URI

Resource URI

Property URI Rich representation

Property URI Value URI Vocab Enc Scheme URI

Property URI Value string Syntax Enc Scheme URI

Statement

Description

Description Set

Page 24: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

Packaging a Complex Object

Page 25: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

Applying this model in the context

of scholarly communciation

• Increasing availability of scholarly

research in open access repositories –

e.g., arXiv

– Mirrored

– Multi-format (pdf, laTex)

– Co-exist in journal published form and ePrint

form

• FRBR is a model for representing these

relationships.

Page 26: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

26

FRBR for eprints

The eprint as a scholarly work

Author’s Original 1.0 Author’s Original 1.1Version of Record

(French)

html pdf

publisher’s copyinstitutional repository

copy

scholarly work(work)

version(expression)

format(manifestation)

copy(item)

… Version of Record(English)

Page 27: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

27

Eprints application model

ScholarlyWork

Expression0..∞

isExpressedAs

Manifestation

isManifestedAs

0..∞

Copy

isAvailableAs

0..∞

0..∞

0..∞

isCreatedBy

isPublishedBy

0..∞isEditedBy

0..∞

isFundedBy

isSupervisedBy

AffiliatedInstitution

Agent

Page 28: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

28

Eprints model and FRBR

ScholarlyWork

Expression0..∞

isExpressedAs

Manifestation

isManifestedAs

0..∞

Copy

isAvailableAs

0..∞

0..∞

0..∞

isCreatedBy

isPublishedBy

0..∞isEditedBy

0..∞

isFundedBy

isSupervisedBy

AffiliatedInstitution

Agent

FRBRWork

FRBRExpression

FRBRManifestation

FRBRItem

Page 29: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

29

ScholarlyWork

Expression0..∞

isExpressedAs

Manifestation

isManifestedAs

0..∞

Copy

isAvailableAs

0..∞

0..∞

0..∞

isCreatedBy

isPublishedBy

0..∞isEditedBy

0..∞

isFundedBy

isSupervisedBy

AffiliatedInstitution

Agent

Eprints model and FRBR

the eprint (an abstract concept)

the ‘version of record’

orthe ‘french

version’or

‘version 2.1’

the PDF format of the version of

record

the publisher’s copy of the

PDF …

the author or the publisher

Page 30: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

30

Attributes

• the application model defines the entities

and relationships

• each entity needs to be described using

an agreed set of attributes

Page 31: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

31

Example attributesScholarlyWork:

titlesubjectabstractaffiliated institutionidentifier

Agent:

nametype of agentdate of birthmailboxhomepageidentifier

Expression:

titledate availablestatusversion numberlanguagegenre / typecopyright holderbibliographic citationidentifier

Manifestation:

formatdate modified Copy:

date availableaccess rightslicenceidentifier

Page 32: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

32

How is this complexity captured

in DC?• the DC Abstract Model provides the notion

of „description sets‟

• i.e. groups of related „descriptions‟

• where each „description‟ is about an

instance of one of the entities in the model

• relationships and attributes are

instantiated as metadata properties

Page 33: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity
Page 34: Resource Description: Metadata/Dublin Core · Dublin Core Qualifiers •From loose semantics to more specific description •Model of “graceful degradation” –Support both simplicity

Resources

• DCMI Abstract Model– http://dublincore.org/documents/abstract-model/

• Eprints Application Profile– http://www.ukoln.ac.uk/repositories/digirep/index/EPrints_

Application_Profile

• Eprints DC XML– http://www.ukoln.ac.uk/repositories/digirep/index/Eprints_D

C_XML

• Eprints DC XML/Instances– http://www.ukoln.ac.uk/repositories/digirep/index/Eprints_D

C_XML/Instances