Augmenting interoperability across scholarly repositories
Transcript of Augmenting interoperability across scholarly repositories
RESEARCH LIBRARY: Augmenting Interoperability across Scholarly Repositories
JISC CNI Conference, York, UK, July 6th 2006, Herbert Van de Sompel
Augmenting Interoperability across Scholarly Repositories
Herbert Van de Sompel, Research Library
Los Alamos National Laboratory, USA
[Figure: Repository interfaces labeled Obtain, Harvest, Put]
This work was supported by NSF award number IIS-0430906 (Pathways)
Pathways Project
• NSF grant number IIS-0430906
• http://www.infosci.cornell.edu/pathways/
• PIs: Carl Lagoze, Sandy Payette, Herbert Van de Sompel, Simeon Warner
• Research Participants: Lyudmila Balakireva, Jeroen Bekaert, Xiaoming Liu, Chris Wilper, Zhiwu Xie
Meeting in NYC, April 20-21 2006
• Supported by Microsoft, Mellon Foundation, Coalition for Networked Information, Digital Library Federation, JISC
• Representatives from institutional Repository projects, scholarly content Repositories, Registry projects, various projects that touch on interoperability
• See http://msc.mellon.org/Meetings/Interop/ for Agenda, Participants, Topics & Goals, Terminology, Presentations, Prototype demonstration.
• Report available July 2006
And more discussions with the community
• Panel at JCDL 2006, Chapel Hill, NC
• IATUL 2006, Porto, Portugal
• ElPub 2006, Bansko, Bulgaria
• Meeting at the University of Southampton, UK
Context: the Repository model
Repository
An environment consisting of Digital Object Repositories with a Long Life Expectation:
o Scholarly repositories
- Institutional repositories
- Discipline-oriented repositories
- Publisher’s repositories
- Dataset repositories
- …
o Cultural heritage repositories
o Preservation archives
o Educational repositories
Context: compound digital objects
Digital Object
Objects of the scholarly communication system are increasingly compound in nature, simultaneously consisting of:
• Multiple media types
• Multiple content types
o Papers
o Datasets
o Simulations
o Software
o Dynamic knowledge representations
o Machine-readable chemical structures
Context: the Repository model
• We must leverage the value of the materials that become available in those distributed Repositories.
• Think about these Repositories as active nodes in a global environment, not as passive local nodes
o These Repositories are about facilitating the use and re-use of materials in many contexts
o These Repositories are the starting point of value chains
• In order to enable value chains, we need to augment interoperability across repositories
Motivation 1: Richer cross-Repository services
Distributed Repositories provide source materials for cross-Repository overlay services such as discovery services
Need: digital object representation, harvesting interface, datastream semantics
[Figure: selective collecting from Repositories into a service]
Motivation 2: Scholarly communication workflow
Distributed Repositories at the basis of a digital scholarly communication system. Scholarly communication as a global workflow across those Repositories
Need: digital object representation, obtain interface, put interface
[Figure: Digital Objects (id) in several Repositories; label: recombine & add value]
Augmenting interoperability across Repositories
[Figure: DSpace, Fedora, aDORe, ePrints, arXiv, Nature. Labels: Individual Data Models and Services; Shared Data Model and Services]
Considerations re interoperable framework
• Scholarly communication is a long-term endeavor:
• Need abstract definitions of Repository interfaces that can be instantiated on the basis of various technologies as time goes by
• Repository interfaces need to work with whichever type of identifier (current and future) because Repositories will use whichever type of identifier
• Value chains do not require transfer of all digital object content
• The content that needs to be transferred depends on the nature of the value chain
• Recording a chain of evidence of a value chain requires fine granularity of identification
• Not only identifier of the digital object but also of the repository
Augmenting interoperability across Repositories
[Figure: DSpace, Fedora, aDORe, ePrints, arXiv, Nature. Labels: Individual Data Models and Services; Obtain, Harvest, Put]
Augmenting interoperability across Repositories
Pathways Core Data Model for Cross-Repository services
Bekaert, Jeroen, Xiaoming Liu, Herbert Van de Sompel, Sandy Payette, Carl Lagoze, and Simeon Warner. Pathways Core: A Data Model for Cross-Repository Services. 2006. Poster for JCDL 2006. http://public.lanl.gov/herbertv/papers/pathways_core_poster_submit.pdf
Augmenting interoperability across Repositories
• A Surrogate is available for every Digital Object
• A Surrogate is a representation of the Digital Object according to the Pathways Core data model
• The representation is uniform across repositories; not tied to identifier type, content type, application domain.
• The Surrogate is what is used in the value chains; the Surrogate is used at Obtain, Harvest and Put interfaces.
o Expresses properties and access points for the Digital Object (see later)
o The Surrogate for a specific Digital Object can change over time
Pathways Core Surrogates (currently XML/RDF)
Augmenting interoperability across Repositories
• The Surrogates provide By-Reference access to constituent datastreams of Digital Objects
• Full asset transfer is only required for certain applications
• Static asset transform may be undesirable for dynamic objects => Live references
• Avoid IP issues at the level of the interoperability framework
• The idea is that the Surrogate itself is not encumbered by IP issues; attach - by definition - a liberal Creative Commons license to Surrogates
• Allow Surrogates to flow freely independent of business models of the underlying content
Pathways Core Surrogates (currently XML/RDF)
Augmenting interoperability across Repositories
• A Surrogate expresses access points and properties of a Digital Object, e.g.:
• Location of content streams
• providerInfo: the keys necessary to Obtain a fresh Surrogate at some later point in time:
• (Repository identifier, preferredIdentifier, versionKey)
• Lineage: A Surrogate expresses its predecessor(s)
• == providerInfo in previous life
• semantic: A Surrogate expresses the type of content.
Pathways Core Surrogates (currently XML/RDF)
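The properties listed on this slide can be sketched as a small data structure. This is only an illustration, not the Pathways Core serialization (which the slides say is currently XML/RDF): the class names, field names, example identifiers, and the choice to attach the semantic to each datastream are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ProviderInfo:
    """Keys needed to Obtain a fresh Surrogate later (field names are assumed)."""
    repository_id: str         # identifier of the Repository
    preferred_identifier: str  # identifier of the Digital Object, any scheme
    version_key: str           # distinguishes versions of the Surrogate


@dataclass(frozen=True)
class Datastream:
    """A constituent content stream, exposed by reference rather than by value."""
    location: str  # where the content can be fetched
    semantic: str  # type of content (hypothetical vocabulary)


@dataclass(frozen=True)
class Surrogate:
    provider_info: ProviderInfo
    datastreams: Tuple[Datastream, ...] = ()
    lineage: Tuple[ProviderInfo, ...] = ()  # providerInfo of predecessor(s)


# A Surrogate as exposed by a first (hypothetical) repository.
original = Surrogate(
    provider_info=ProviderInfo("repo1.example.org", "info:example/123", "v1"),
    datastreams=(Datastream("http://repo1.example.org/123/paper.pdf", "paper"),),
)

# A Surrogate for a derived object: its lineage is the predecessor's providerInfo.
derived = Surrogate(
    provider_info=ProviderInfo("repo2.example.org", "info:example/abc", "v1"),
    datastreams=original.datastreams,
    lineage=(original.provider_info,),
)
```

Because the datastreams carry only locations, the derived Surrogate points at the same content without copying it.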
Augmenting interoperability across Repositories
Obtain interface: a Repository interface that supports the request of services pertaining to individual Digital Objects (including their component Datastreams). The core service is the request of a Surrogate for a Digital Object.
Harvest interface: a Repository interface that exposes Surrogates for incremental collecting/harvesting.
Put interface: a Repository interface that supports submission of one or more Surrogates into the Repository, thereby facilitating the addition of Digital Objects to the collection of the Repository.
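The three interface definitions are deliberately abstract, so any concrete form is one instantiation among many. The toy in-memory repository below is a minimal sketch under assumptions: the class name, method signatures, the use of plain dicts as Surrogates, and the sequence-number checkpoint for incremental harvesting are all hypothetical and not taken from the Pathways work.

```python
from typing import Dict, List, Tuple

Surrogate = dict  # stand-in for a real Surrogate; only "preferredIdentifier" is used


class InMemoryRepository:
    """Toy Repository exposing Obtain, Harvest and Put over Surrogates."""

    def __init__(self, repository_id: str) -> None:
        self.repository_id = repository_id
        self._seq = 0  # increases by one on every Put
        self._store: Dict[str, Tuple[int, Surrogate]] = {}

    def put(self, surrogate: Surrogate) -> int:
        """Put: submit a Surrogate, adding (or updating) an object in the collection."""
        self._seq += 1
        self._store[surrogate["preferredIdentifier"]] = (self._seq, surrogate)
        return self._seq

    def obtain(self, preferred_identifier: str) -> Surrogate:
        """Obtain: request the Surrogate for one Digital Object."""
        return self._store[preferred_identifier][1]

    def harvest(self, since: int = 0) -> List[Tuple[int, Surrogate]]:
        """Harvest: Surrogates put after checkpoint `since`, oldest first."""
        changed = [entry for entry in self._store.values() if entry[0] > since]
        return sorted(changed, key=lambda entry: entry[0])


repo = InMemoryRepository("repo1")
repo.put({"preferredIdentifier": "obj-a"})
repo.put({"preferredIdentifier": "obj-b"})

first_pass = repo.harvest()       # everything so far
checkpoint = first_pass[-1][0]    # remember how far we got

repo.put({"preferredIdentifier": "obj-c"})
second_pass = repo.harvest(since=checkpoint)  # only what is new since then
```

The checkpoint is what makes collecting incremental: a service keeps the last sequence number it saw and asks only for later Surrogates.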
Surrogate is at the core of the value chain
[Figure: Digital Objects (id) connected through Obtain and Put interfaces. Labels: recombine & add value; Lineage; providerInfo]
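One step of such a value chain can be sketched as follows: Surrogates obtained from two repositories are recombined into a new Surrogate whose lineage records the providerInfo of each source. The function name, dict keys, and example values are assumptions for illustration only.

```python
from typing import Dict, List


def recombine(sources: List[Dict], repository_id: str,
              preferred_identifier: str, version_key: str) -> Dict:
    """Build a new Surrogate from obtained ones, recording where each came from."""
    return {
        "providerInfo": {
            "repository": repository_id,
            "preferredIdentifier": preferred_identifier,
            "versionKey": version_key,
        },
        # Lineage == the providerInfo each source had in its "previous life".
        "lineage": [dict(s["providerInfo"]) for s in sources],
        # Datastreams are carried over by reference (locations only).
        "datastreams": [ds for s in sources for ds in s["datastreams"]],
    }


# Two Surrogates as they might be obtained from hypothetical repositories.
paper = {
    "providerInfo": {"repository": "repo1", "preferredIdentifier": "id-1", "versionKey": "v1"},
    "datastreams": ["http://repo1.example.org/id-1/paper.pdf"],
}
dataset = {
    "providerInfo": {"repository": "repo2", "preferredIdentifier": "id-2", "versionKey": "v4"},
    "datastreams": ["http://repo2.example.org/id-2/data.csv"],
}

# The recombined Surrogate would then be submitted to a third repository via Put.
overlay = recombine([paper, dataset], "repo3", "id-3", "v1")
```

Following the lineage entries back through Obtain requests is what would let a service reconstruct the chain of evidence.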
Basis for a Network of Linked Digital Objects
[Figure: Repo1 (Obtain1, Harvest1, Put1) and Repo2 (Obtain2, Harvest2, Put2), each with Obtain, Harvest and Put interfaces, and a service]
[Figure: Repo1 and Repo2, each with Obtain, Harvest and Put interfaces, listed in a Service Registry]
Service Registry
provider   Obtain    Harvest    Put
Repo1      Obtain1   Harvest1   Put1
Repo2      Obtain2   Harvest2   Put2
providerInfo: (provider, preferredIdentifier, versionKey)
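The registry on this slide maps each provider to its three interfaces, so a providerInfo triple is enough to work out where to send a request. A minimal sketch of that lookup, assuming a plain dictionary as the registry and a hypothetical `resolve` helper (the interface names are the placeholders from the slide, not real endpoints):

```python
from typing import Dict, Tuple

# Registry contents as shown on the slide: provider -> interface names.
REGISTRY: Dict[str, Dict[str, str]] = {
    "Repo1": {"obtain": "Obtain1", "harvest": "Harvest1", "put": "Put1"},
    "Repo2": {"obtain": "Obtain2", "harvest": "Harvest2", "put": "Put2"},
}


def resolve(provider_info: Tuple[str, str, str], service: str,
            registry: Dict[str, Dict[str, str]] = REGISTRY) -> Tuple[str, str, str]:
    """Map (provider, preferredIdentifier, versionKey) to the interface to call."""
    provider, preferred_identifier, version_key = provider_info
    endpoint = registry[provider][service]  # KeyError if the provider is unregistered
    return endpoint, preferred_identifier, version_key


# Where would a fresh Surrogate for this object be obtained?
target = resolve(("Repo2", "id-42", "v3"), "obtain")
```

Keeping the interface locations in the registry, rather than in the Surrogate, means a Surrogate stays valid if a repository later moves its interfaces.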