RDTF Metadata Guidelines: an update

Andy Powell (and Pete Johnston)

RDTF Management Framework Project Board

29 March 2011

Functional requirement

• help libraries, museums and archives expose existing metadata (and new metadata created using existing practice) in ways that
  – support the development of aggregator services
  – integrate well with the web (and the emerging web of data)
• note: RDTF is not about re-engineering cataloguing practice in the LAM sectors

Guiding principles

• support the RDTF Vision
• informed by Paul Miller’s desk study review
• flow from the JISC IE Technical Review meeting
• in line with Linked Data principles
• based on the W3C Linked Open Data Star Scheme
• in line with Designing URI Sets for the UK Public Sector
• take into account the Europeana Data Model and ESE
• be informed by mainstream web practice and search engine behaviour and be broadly in line with the notion of “making better websites” across the library, museum and archives sectors

Draft proposal

• used the W3C Linked Open Data star scheme as framework (at 3, 4 and 5 star levels)

• three approaches
  – community formats
  – RDF data
  – Linked Data

• 196 comments – on pretty much all aspects of the draft

Guiding principles

• support the RDTF Vision
• informed by Paul Miller’s desk study review
• flow from the JISC IE Technical Review meeting
• in line with Linked Data principles
• based on the W3C Linked Open Data Star Scheme
• in line with Designing URI Sets for the UK Public Sector
• take into account the Europeana Data Model and ESE
• be informed by mainstream web practice and search engine behaviour and be broadly in line with the notion of “making better websites” across the library, museum and archives sectors

We probably failed in this!!

Re-conceptualising the guidelines

[Diagram: four-quadrant model]

• Collections of Descriptions
  – RDF: “RDF Data”
  – Not-RDF: “bulk download”
• Individual Item Descriptions
  – RDF: Linked Data
  – Not-RDF: “page per thing”

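The draft guidelines

[Diagram: four-quadrant model, axis labels “RDF” and “Not-RDF”]

The Web!

[Diagram: four-quadrant model, axis labels “RDF” and “Not-RDF”]

Possible adoption path

[Diagram: four-quadrant model, axis labels “RDF” and “Not-RDF”]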

Bulk download

• “give us what you’ve got”
• serve existing community bulk-formats (e.g. files containing collections of MARC, MODS, BibTeX, DC/XML, SPECTRUM or EAD records) or CSV over RESTful HTTP

• use sitemaps and robots.txt and/or RSS/Atom to advertise availability and GZip for compression

• for CSV, provide a column called ‘label’ or ‘title’ so we’ve got something to display

• give us separate records (for CSV, read ‘rows’) about separate resources (where you can)

• simples!
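
As an illustration of the CSV option above, here is a minimal sketch in Python. It writes one row per resource, insists on a ‘title’ or ‘label’ column, and GZips the result. The function name, the sample records and the column names other than ‘title’ are hypothetical, not part of the guidelines.

```python
import csv
import gzip
import io


def records_to_gzipped_csv(records, fieldnames):
    """Serialise one CSV row per resource and gzip the result."""
    # The guidelines ask for a 'label' or 'title' column for display.
    if "title" not in fieldnames and "label" not in fieldnames:
        raise ValueError("include a 'title' or 'label' column")
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return gzip.compress(buffer.getvalue().encode("utf-8"))


# Hypothetical sample data: separate rows about separate resources.
records = [
    {"id": "obj-1", "title": "Bronze brooch", "date": "c. 800"},
    {"id": "obj-2", "title": "Estate map", "date": "1784"},
]
payload = records_to_gzipped_csv(records, ["id", "title", "date"])
```

The resulting bytes could be served as a static file over HTTP and listed in a sitemap; how that is done will depend on the local web server.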

Page per thing

• “build better websites”
• serve an HTML page (i.e. a description) for every “thing” of interest over RESTful HTTP
• optionally serve alternative format(s) for each description (e.g. a MODS or DC/XML record) at separate URIs and link from the HTML descriptions using <link rel="alternate" … />
• use “cool” ‘http’ URIs for all descriptions
• use sitemaps and robots.txt and/or RSS/Atom to advertise availability
• optionally offer an OAI-PMH server to allow harvesting of all descriptions/formats
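
To make the “page per thing” pattern concrete, the sketch below renders an HTML description that links to alternative formats from its head. It is only an illustration: the function name, the example.org URIs and the media types chosen for the alternative records are assumptions.

```python
from html import escape


def render_description_page(title, summary, alternates):
    """Build an HTML description with links to alternative formats.

    `alternates` is a list of (media_type, uri) pairs.
    """
    links = "\n".join(
        f'    <link rel="alternate" type="{escape(mime)}" href="{escape(uri)}" />'
        for mime, uri in alternates
    )
    return f"""<!DOCTYPE html>
<html>
  <head>
    <title>{escape(title)}</title>
{links}
  </head>
  <body>
    <h1>{escape(title)}</h1>
    <p>{escape(summary)}</p>
  </body>
</html>
"""


# Hypothetical item with two alternative records at separate URIs.
page = render_description_page(
    "Estate map",
    "Hand-drawn map of the estate, 1784.",
    [
        ("application/mods+xml", "http://example.org/doc/obj-2.mods"),
        ("application/xml", "http://example.org/doc/obj-2.dc.xml"),
    ],
)
```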

RDF data

• “RDF bulk download”
• serve big buckets of RDF (as RDF/XML, N-Triples or N-Quads) over RESTful HTTP
• re-use existing conceptual models and vocabularies where you can
• assign URIs to every “thing” of interest
• use Semantic Sitemaps and the Vocabulary of Interlinked Datasets (VoID) to advertise availability of the buckets
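
A “bucket” of RDF can be as simple as a file with one triple per line. The following sketch writes N-Triples by hand using only the Python standard library. It assumes the URIs are already valid and handles only plain string literals; the example.org URIs and the tuple layout are hypothetical. In practice an RDF library would normally do this job.

```python
def escape_literal(value):
    """Escape the characters N-Triples requires in string literals."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(triples):
    """Serialise (subject, predicate, object, object_is_uri) tuples."""
    lines = []
    for subject, predicate, obj, object_is_uri in triples:
        if object_is_uri:
            obj_term = f"<{obj}>"
        else:
            obj_term = f'"{escape_literal(obj)}"'
        lines.append(f"<{subject}> <{predicate}> {obj_term} .")
    return "\n".join(lines) + "\n"


# Hypothetical data: a URI for the thing, an existing vocabulary
# (Dublin Core terms) for the properties.
bucket = to_ntriples([
    ("http://example.org/id/obj-2",
     "http://purl.org/dc/terms/title",
     'Estate map ("north field")', False),
    ("http://example.org/id/obj-2",
     "http://purl.org/dc/terms/isPartOf",
     "http://example.org/id/collection-1", True),
])
```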

Linked Data

• “W3C 5 star approach”
• serve HTML and RDF/RDFa for every “thing” of interest over RESTful HTTP
• assign ‘http’ URIs to every “thing” (and every description of a thing)
• follow “cool URIs for the semantic web” recommended practice
• become part of the web of data - link to other people’s stuff using their URIs
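
One pattern described in “cool URIs for the semantic web” is to answer a request for the URI of a thing with a 303 redirect to a description, chosen according to the Accept header. The sketch below shows only that decision logic, under stated assumptions: the /doc/ and /data/ paths on example.org are hypothetical, the Accept parsing is deliberately simplified, and HTML is the fallback.

```python
def parse_accept(header):
    """Return the media types in an Accept header, most preferred first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        prefs.append((q, media_type))
    # Stable sort: equal q-values keep their order in the header.
    prefs.sort(key=lambda pref: pref[0], reverse=True)
    return [media_type for q, media_type in prefs if q > 0]


def redirect_for(thing_id, accept_header):
    """Pick the description URI to 303-redirect to for a thing."""
    for media_type in parse_accept(accept_header):
        if media_type == "application/rdf+xml":
            return 303, f"http://example.org/data/{thing_id}"
        if media_type in ("text/html", "application/xhtml+xml"):
            return 303, f"http://example.org/doc/{thing_id}"
    # Nothing recognised (e.g. */*): fall back to the HTML description.
    return 303, f"http://example.org/doc/{thing_id}"
```

A browser asking for text/html would be sent to the HTML page, while a client asking for application/rdf+xml would be sent to the RDF description of the same thing.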

Recommendations

• use the four-quadrant model to frame the guidelines (we think all four quadrants are useful, and that there should probably be some guidance on each area)

• develop specific guidance for serving an HTML page description per 'thing' of interest (possibly with associated, and linked, alternative formats such as DC/XML)

• develop (or find) specific guidance about how to sensibly assign persistent 'http' URIs to everything of interest (including both things and descriptions of things)

Also…

• that the definition of 'open' needs more work (particularly in the context of whether commercial use is allowed) but that this needs to be sensitive to not stirring up IPR-worries in those domains where they are less of a concern currently

• that mechanisms for making statements of provenance, licensing and versioning be developed where RDF triples are being made available (possibly in collaboration with Europeana work)

• that a fuller list of relevant models that might be adopted, the relationships between them, and any vocabularies commonly associated with them be maintained separately from the guidelines themselves
