Herbert Van de Sompel Cornell University Computer Science ... · service provider data provider...

Workshop on OAI and peer review journals in Europe, Geneva, Switzerland – March 22nd to 24th 2001. Herbert Van de Sompel, Cornell University, Computer Science – Digital Library Research Group. OAI metadata harvesting specifications


Page 1

Workshop on OAI and peer review journals in Europe

Geneva, Switzerland – March 22nd to 24th 2001

Herbert Van de Sompel

Cornell University

Computer Science – Digital Library Research Group

OAI metadata harvesting specifications

Page 2

a brief history of the OAI

Page 3

The Open Archives Initiative has been set up to create a forum to discuss and solve matters of interoperability between preprint solutions, as a way to promote their global acceptance. Paul Ginsparg, Rick Luce & Herbert Van de Sompel

the OAI roots

=> Santa Fe Convention: preprint metadata harvesting

Page 4

interest from other communities

• Digital Library Federation meetings:
  ~ research library community has many materials for which they would like to ‘expose’ metadata

• OAI San Antonio meeting:
  ~ interest from librarians, publishers, others, ...

Page 5

resulting actions: organizational

• establish organizational stability for the OAI:

• institutional backing from CNI & DLF

• steering committee: policy guidance

• technical committee: technical specifications

• executive group: day to day coordination

• workshops: public dissemination, feedback

Page 6

resulting actions: technical

• [09/2000] revise specifications to allow adoption beyond preprints: technical committee

• [09/2000-01/2001] compile new specifications: editing by Carl and Herbert

• [11/2000-01/2001] alpha-test specifications: oai-alpha group

• [01/2001] discontinue the Santa Fe Convention

• [01/2001] release version 1.0 of the OAI protocol

Page 7

the OAI Metadata Harvesting protocol

Page 8

The OAMH protocol is a low-barrier interoperability specification for the recurrent exchange of metadata between systems

Page 9

the OAMH protocol

[diagram: the service provider (harvester) sends Requests to the data provider (repository), which returns Replies; the diagram is labeled "6"]

Page 10

federated services

[diagram: repositories of several kinds: A&I, image, FTXT, OPAC, e-print]

Page 11

metadata harvesting via OAMH

[diagram: a harvester collects metadata from A&I, image, OPAC, e-print and FTXT repositories]

Page 12

federated services via OAMH

[diagram: federated services built on metadata (Author, Title, Abstract, Identifier) from A&I, image, FTXT, e-print and OPAC repositories]

Page 13

core concepts in OAMH

• low-barrier interoperability

• data-provider & service-provider model

• metadata harvesting model: OAMH protocol
  • HTTP based
  • Reply: XML Schema; self contained

• shared metadata format (Dublin Core) and parallel, community-specific metadata formats

Page 14

OAI harvesting tools

[diagram: service provider (harvester) and data provider (repository); labels: Records, Datestamp, Identifier, Set]

Page 15

OAI harvesting tools

Supporting protocol requests:

• Identify
• ListMetadataFormats
• ListSets

Harvesting protocol requests:

• ListRecords
• ListIdentifiers
• GetRecord

[diagram: service provider (harvester) and data provider (repository)]
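As a rough illustration of how a harvester might address the six requests above to a repository over HTTP, the sketch below encodes each request as a URL. The repository address, the helper names `build_request_url` and `request_group`, and the encoding of the request name as a `verb` query parameter are assumptions made for this example; the slide itself only names the requests and their two groups.

```python
from urllib.parse import urlencode

# Request names, grouped as on the slide.
SUPPORTING_REQUESTS = ("Identify", "ListMetadataFormats", "ListSets")
HARVESTING_REQUESTS = ("ListRecords", "ListIdentifiers", "GetRecord")

# Hypothetical repository address, used only for this example.
BASE_URL = "http://repository.example.org/oai"


def build_request_url(base_url, verb, arguments=None):
    """Return a request URL for one of the six protocol requests.

    Assumption: the request name travels in a `verb` query parameter,
    followed by any further arguments in the order given.
    """
    if verb not in SUPPORTING_REQUESTS + HARVESTING_REQUESTS:
        raise ValueError(f"unknown request: {verb}")
    params = [("verb", verb)] + list((arguments or {}).items())
    return base_url + "?" + urlencode(params)


def request_group(verb):
    """Tell whether a request is a supporting or a harvesting request."""
    if verb in SUPPORTING_REQUESTS:
        return "supporting"
    if verb in HARVESTING_REQUESTS:
        return "harvesting"
    raise ValueError(f"unknown request: {verb}")


if __name__ == "__main__":
    for name in SUPPORTING_REQUESTS + HARVESTING_REQUESTS:
        print(request_group(name), build_request_url(BASE_URL, name))
```

Keeping the request names in two tuples mirrors the slide's grouping and lets the helper reject anything that is not one of the six requests.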

Page 16

supporting protocol requests

Request:

ListMetadataFormats

Reply:

ListMetadataFormats / Time / Request
REPEAT
• Format prefix
• Format XML schema
/REPEAT

[diagram: service provider (harvester) and data provider (repository)]
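The reply outlined above (a time, the request, then a repeated pair of format prefix and XML schema) could be read by a harvester roughly as follows. This is a minimal sketch: the sample reply, its element names, the `marc` prefix and the schema URLs are all hypothetical stand-ins chosen for illustration, not taken from the specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical reply. Element names and URLs are illustrative assumptions;
# a real reply follows the XML Schema published with the protocol.
SAMPLE_REPLY = """\
<ListMetadataFormats>
  <responseDate>2001-03-22T09:00:00</responseDate>
  <requestURL>http://repository.example.org/oai?verb=ListMetadataFormats</requestURL>
  <metadataFormat>
    <metadataPrefix>dc</metadataPrefix>
    <schema>http://repository.example.org/schemas/dc.xsd</schema>
  </metadataFormat>
  <metadataFormat>
    <metadataPrefix>marc</metadataPrefix>
    <schema>http://repository.example.org/schemas/marc.xsd</schema>
  </metadataFormat>
</ListMetadataFormats>
"""


def parse_metadata_formats(xml_text):
    """Map each format prefix in the reply to its XML schema location."""
    root = ET.fromstring(xml_text)
    formats = {}
    # One <metadataFormat> element per REPEAT block in the slide's outline.
    for fmt in root.findall("metadataFormat"):
        formats[fmt.findtext("metadataPrefix")] = fmt.findtext("schema")
    return formats


if __name__ == "__main__":
    for prefix, schema in parse_metadata_formats(SAMPLE_REPLY).items():
        print(prefix, schema)
```

A harvester could use the resulting prefixes to decide which metadata format to ask for in later harvesting requests.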

Page 17

harvesting requests

Request:

ListRecords
* metadataPrefix=dc
* from=a
* until=b
* set=klm

Reply:

ListRecords / Time / Request
REPEAT
• Identifier
• Datestamp
• Metadata
/REPEAT

[diagram: service provider (harvester) and data provider (repository)]
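A ListRecords exchange like the one above might look as follows from the harvester's side. The sketch first builds a request carrying the `metadataPrefix`, `from`, `until` and `set` arguments named on the slide, then reads identifier, datestamp and metadata out of a reply. The repository address, the function names, the `verb` parameter, the date values and the reply's element names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, from_date=None,
                     until_date=None, set_spec=None):
    """Build a ListRecords request; from, until and set are optional."""
    params = [("verb", "ListRecords"), ("metadataPrefix", metadata_prefix)]
    optional = (("from", from_date), ("until", until_date), ("set", set_spec))
    params += [(name, value) for name, value in optional if value is not None]
    return base_url + "?" + urlencode(params)


# Hypothetical reply; element names are illustrative, not from the schema.
SAMPLE_REPLY = """\
<ListRecords>
  <responseDate>2001-03-22T09:00:00</responseDate>
  <requestURL>http://repository.example.org/oai?verb=ListRecords&amp;metadataPrefix=dc</requestURL>
  <record>
    <identifier>record-1</identifier>
    <datestamp>2001-01-15</datestamp>
    <metadata><title>First example</title></metadata>
  </record>
  <record>
    <identifier>record-2</identifier>
    <datestamp>2001-02-20</datestamp>
    <metadata><title>Second example</title></metadata>
  </record>
</ListRecords>
"""


def parse_records(xml_text):
    """Return (identifier, datestamp, metadata fields) for each record."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.findall("record"):
        # Flatten the metadata element into a tag -> text mapping.
        fields = {child.tag: child.text for child in record.find("metadata")}
        records.append((record.findtext("identifier"),
                        record.findtext("datestamp"),
                        fields))
    return records


if __name__ == "__main__":
    print(list_records_url("http://repository.example.org/oai", "dc",
                           from_date="2001-01-01", until_date="2001-03-01",
                           set_spec="klm"))
    for item in parse_records(SAMPLE_REPLY):
        print(item)
```

The date strings stand in for the slide's placeholders `a` and `b`; the format a real repository expects is defined by the specification, not by this sketch.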

Page 18

Applications of the OAMH protocol?

• federated services [S&R, SDI, alerting, linking, ...]
• database synchronization
• harvesting the deep Web
• ...
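One way to picture the database synchronization use case is an incremental harvest driven by datestamps: the harvester remembers the date of its last harvest, asks only for records from that date onward, and updates its local copy. The sketch below simulates this in memory; `synchronize`, `fake_list_records`, the sample records, and the treatment of the start date as inclusive are assumptions for illustration, and no real repository is contacted.

```python
def synchronize(local_db, fetch_records, last_harvest):
    """Copy records changed since last_harvest into local_db.

    fetch_records(from_date) must yield (identifier, datestamp, metadata)
    tuples. Returns the newest datestamp seen, to be stored as the
    starting point of the next harvest.
    """
    newest = last_harvest
    for identifier, datestamp, metadata in fetch_records(last_harvest):
        local_db[identifier] = metadata  # insert or overwrite
        if datestamp > newest:           # ISO dates compare as strings
            newest = datestamp
    return newest


# Stand-in for a repository: identifier, datestamp, metadata.
REPOSITORY = [
    ("rec-1", "2001-01-10", {"title": "First"}),
    ("rec-2", "2001-02-15", {"title": "Second"}),
    ("rec-3", "2001-03-01", {"title": "Third"}),
]


def fake_list_records(from_date):
    """Hypothetical selective harvest: records dated from_date or later."""
    return [record for record in REPOSITORY if record[1] >= from_date]


if __name__ == "__main__":
    local_db = {}
    mark = synchronize(local_db, fake_list_records, "2001-02-01")
    print(sorted(local_db), mark)
```

Because only records at or after the stored date are requested, repeated runs transfer just the changes rather than the whole collection.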

Page 19

OAI - status

Page 20

• freeze specifications for 12-18 months:

• stable for experimentation; not definitive

• minimize risk for early adopters

• maximize chances for future interoperability across communities

revision of specifications

Page 21

software to run OAI repository

• eprints.org - U. Southampton

• open source metadata server - OCLC

• NT OAI server - U. Illinois

• Aleph 500 - Ex Libris

• Z39.50 / OAI gateway - Virginia Tech (ongoing)

• MARC to DC convertor - OCLC

• we expect a lot more ...

• listed on OAI site

Page 22

tools to support OAI implementation

• Hussein’s Repository explorer

• W3C XSV Schema Validator

• XML Spy

• the OAI conformance tester:

• part of OAI registration service for repositories

• listed on OAI site

Page 23

modes of running OAI 1.0 repository

• mode 0:

• no registration of repository in the OAI registry

Page 24

modes of running OAI 1.0 repository

• mode 1:

• registration of repository in public OAI registry

[includes validation of replies]

existence of the repository is visible

Page 25

modes of running OAI 1.0 repository

• mode 2:

• registration of repository in public OAI registry

• usage of the OAI format for identifiers

resolver for OAI formatted identifiers

existence of the repository is visible

Page 26

implementation status

• early adoption by preprint community
• but also by others

Page 27

• data providers:
• 20 registered repositories (US and Europe)

implementation status

• arXiv
• OCLC Thesis and Dissertations
• Perseus Digital Library
• PhysNet
• Oxford Text Archive
• Library of Congress -- American Memory
• CogPrints
• Humboldt University
• MIT Thesis
• Linguistic Data Consortium
• Resource Discovery Network

Page 28

implementation status

• service providers:
• ARC
• Open Language Archives
• soon to be listed on OAI site

Page 29

implementation status

• Mellon Foundation funding for OAI-based projects: data providers and service providers

• NSF Digital Library interest in OAI-related projects

• Close contacts with SPARC, DLF, CNI

Page 30

communication re OAI

• lists: subscribe via http://www.openarchives.org

• oai-general list

• oai-implementers list

• web: http://www.openarchives.org

• FAQ: http://www.openarchives.org/faq.htm

• mail: [email protected]

Page 31

http://www.openarchives.org

[email protected]