NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

ResourceSync – Herbert Van de Sompel NISO Forum, September 24 2012, Denver, CO ResourceSync: Web-Based Resource Synchronization Herbert Van de Sompel Los Alamos National Laboratory @hvdsomp ResourceSync is funded by The Sloan Foundation & JISC #resourcesync

description

ResourceSync: Web-Based Resource Synchronization. Also for Data.

Herbert Van de Sompel, Digital Library Researcher, Los Alamos National Laboratory, and Co-chair of NISO’s ResourceSync Working Group

Web applications frequently leverage resources made available by remote Web servers. As resources are created, updated, or deleted these applications face challenges to remain in lockstep with the server’s change dynamics. Several approaches exist to help meet this challenge for use cases where “good enough” synchronization is acceptable. But when strict resource coverage or low synchronization latency is required, commonly accepted Web-based solutions remain elusive.

Motivated by the need to synchronize resources for applications in the realm of cultural heritage and research communication, the National Information Standards Organization (NISO) and the Open Archives Initiative (OAI) have launched the ResourceSync project that aims at designing an approach for resource synchronization that is aligned with the web architecture and that has a fair chance of adoption by different communities. The presentation will discuss some motivating use cases and will provide a perspective on the resource synchronization problem that results from ResourceSync project discussions. It will provide an overview of the ongoing thinking regarding an approach to address the challenges and will pay special attention to aspects that are relevant for the synchronization of data.

Transcript of NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Page 1: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

ResourceSync – Herbert Van de Sompel NISO Forum, September 24 2012, Denver, CO

ResourceSync: Web-Based Resource Synchronization

Herbert Van de Sompel Los Alamos National Laboratory

@hvdsomp

ResourceSync is funded by The Sloan Foundation & JISC #resourcesync

Page 2: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Los Alamos National Laboratory & OAI: Martin Klein, Robert Sanderson, Herbert Van de Sompel

Cornell University & OAI: Bernhard Haslhofer, Simeon Warner

Old Dominion University & OAI: Michael L. Nelson

University of Michigan & OAI: Carl Lagoze

NISO: Todd Carpenter, Nettie Lagace, Peter Murray

ResourceSync Core Team – NISO & OAI

Page 3: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

•  Manuel Bernhardt, Delving B.V.
•  Kevin Ford, Library of Congress
•  Richard Jones, JISC
•  Graham Klyne, JISC
•  Stuart Lewis, JISC
•  David Rosenthal, LOCKSS
•  Christian Sadilek, Red Hat
•  Shlomo Sanders, Ex Libris, Inc.
•  Sjoerd Siebinga, Delving B.V.
•  Ed Summers, Library of Congress
•  Jeff Young, OCLC Online Computer Library Center

ResourceSync Technical Group

Page 4: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

ResourceSync

ResourceSync: What & Why?

Problem Perspective & Conceptual Approach

Possible Technical Choices

Q&A

Page 5: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

ResourceSync

ResourceSync: What & Why?

Problem Perspective & Conceptual Approach

Possible Technical Choices

Q&A

Page 6: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Synchronize What?

•  Web resources – things with a URI that can be dereferenced and are cache-able (no dependency on underlying OS, technologies etc.)

•  Small websites/repositories (a few resources) to large repositories/datasets/linked data collections (many millions of resources)

•  That change slowly (weeks/months) or quickly (seconds), and where latency needs may vary

•  Focus on needs of research communication and cultural heritage organizations, but aim for generality

Page 7: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Why?

… because lots of projects and services are doing synchronization but have to resort to ad hoc, case-by-case approaches!

•  Project team involved with projects that need this

•  Experience with OAI-PMH: widely used in repos but
   o  XML metadata only
   o  Attempts at synchronizing actual content via OAI-PMH (complex object formats, dc:identifier) not successful
   o  Web technology has moved on since 1999

•  Devise a shared solution for data, metadata, linked data?

Page 8: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Use Cases – The Basics

Page 9: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Use Cases - More

Page 10: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Out Of Scope (For Now)

•  Bidirectional synchronization

•  Destination-defined selective synchronization (query)

•  Bulk URI migration

Page 11: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Use Case: arXiv Mirroring

•  1M article versions, ~800/day created or updated at 8 PM US Eastern Time

•  Metadata and full-text for each article

•  Accuracy important

•  Want low barrier for others to use

•  Look for more general solution than current homebrew mirroring (running with minor modifications since 1994!) and occasional rsync (filesystem layout specific, auth issues)

Page 12: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Use Case: DBpedia Live Duplication

•  Average of 2 updates per second

•  Want low latency => need a push technology

Page 13: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

ResourceSync

ResourceSync: What & Why?

Problem Perspective & Conceptual Approach

Possible Technical Choices

Q&A

Page 14: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

ResourceSync Problem

•  Consideration:
   •  Source (server) A has resources that change over time: they get created, modified, deleted
   •  Destination (servers) X, Y, and Z leverage (some) resources of Source A.

•  Problem:
   •  Destinations want to keep in step with the resource changes at source A: resource synchronization.

•  Goal:
   •  Design an approach for resource synchronization aligned with the Web Architecture that has a fair chance of adoption by different communities.
   •  The approach must scale better than recurrent HTTP HEAD/GET on resources.

Page 15: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Destination: 3 Basic Synchronization Needs

1.  Baseline synchronization – A destination must be able to perform an initial load or catch-up with a source

-  avoid out-of-band setup

2.  Incremental synchronization – A destination must have some way to keep up-to-date with changes at a source

-  subject to some latency; minimal: create/update/delete

-  allow catching up after the destination has been offline

3.  Audit – A destination should be able to determine whether it is synchronized with a source

-  subject to some latency
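The audit need can be illustrated with a small sketch. It assumes that both sides can be summarized as a mapping from resource URI to last-modification time; the function name `audit` and the sample inventories are hypothetical and not part of any ResourceSync draft.

```python
def audit(source, destination):
    """Compare two {uri: lastmod} inventories and report the differences.

    Timestamps are assumed to be UTC strings in one fixed ISO 8601 layout,
    so plain string comparison orders them chronologically.
    """
    # Resources the destination never loaded (baseline gap).
    missing = sorted(uri for uri in source if uri not in destination)
    # Resources the destination holds in an older state (incremental gap).
    stale = sorted(
        uri for uri in source
        if uri in destination and destination[uri] < source[uri]
    )
    # Resources the source no longer lists (missed deletes).
    extra = sorted(uri for uri in destination if uri not in source)
    return {"missing": missing, "stale": stale, "extra": extra}


# Hypothetical inventories for illustration only.
source = {
    "http://example.com/res1": "2012-08-08T08:15:00Z",
    "http://example.com/res2": "2012-08-08T13:22:00Z",
}
destination = {
    "http://example.com/res1": "2012-08-01T00:00:00Z",
    "http://example.com/res3": "2012-07-01T00:00:00Z",
}
report = audit(source, destination)
in_sync = not any(report.values())
```

An empty report in all three categories would mean the destination is synchronized, subject to the latency of whatever produced the two inventories.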

Page 16: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Source Capability 1: Describing Content

In order to advertise the resources that a source wants destinations to know about, it may describe them:

o  Publish an inventory of resource URIs and possibly associated metadata
   -  Destination GETs the Content Description
   -  Destination GETs listed resources by their URI

Page 17: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization
Page 18: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization
Page 19: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Source Capability 2: Communicating Change Events

In order to achieve lower latency, a source may communicate about changes to its resources:

o  2.1. Change Set: Publish a list of recent change events (create, update, delete resource)
   -  Destination acts upon change events, e.g. GETs created/updated resources, removes deleted resources.

Page 20: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization
Page 21: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization
Page 22: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization
Page 23: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization
Page 24: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Source Capability 2: Communicating Change Events

In order to achieve lower latency, a source may communicate about changes to its resources:

o  2.1. Change Set: Publish a list of recent change events (create, update, delete resource)
   -  Destination acts upon change events, e.g. GETs created/updated resources, removes deleted resources.

o  2.2. Push Change Set: Push a list of recent change events (create, update, delete resource) towards (a) destination(s)
   -  Destination acts upon change events, e.g. GETs created/updated resources, removes deleted resources.

Page 25: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Source Capability 3: Providing Access to Versions

In order to allow a destination to catch up with missed changes, a source may support:

o  3.1. Historical Change Sets: Provide access to change events that occurred prior to the ones listed in the current Change Set

Page 26: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization
Page 27: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization
Page 28: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization
Page 29: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Source Capability 3: Providing Access to Versions

In order to allow a destination to catch up with missed changes, a source may support:

o  3.1. Historical Change Sets: Provide access to change events that occurred prior to the ones listed in the current Change Set

o  3.2. Historical Content: Provide access to prior resource versions

Page 30: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Source Capability 4: Transferring Content

By default, content is transferred in response to a GET issued by a destination against a URI of a source’s resource. But a source may support additional mechanisms:

o  4.1. Dump: Publish a package of resource representations and necessary metadata
   -  Destination GETs the Dump
   -  Destination unpacks the Dump

Page 31: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization
Page 32: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Source Capability 4: Transferring Content

By default, content is transferred in response to a GET issued by a destination against a URI of a source’s resource. But a source may support additional mechanisms:

o  4.1. Dump: Publish a package of resource representations and necessary metadata
   -  Destination GETs the Dump
   -  Destination unpacks the Dump

o  4.2. Alternate Content Transfer: Support alternative mechanisms to optimize getting content (see later)

Page 33: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Source: Advertise Capabilities

A source needs to advertise the capabilities it supports to allow a destination to discover them

•  Some capabilities may be provided by a third party, not the source itself

o  e.g. Historical Change Sets, Historical Content

o  But the source should still make those third party capabilities discoverable - trust

Page 34: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

ResourceSync

ResourceSync: What & Why?

Problem Perspective & Conceptual Approach

Possible Technical Choices

Q&A

Page 35: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

ResourceSync: A Framework of Capabilities

•  Modular framework allowing selective deployment of capabilities

•  A Source selects which capabilities to support in order to meet local and community needs

•  A Source’s Capabilities can be discovered via capability descriptions

Page 36: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Page 37: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

BY REFERENCE!

BY VALUE!

Page 38: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Page 39: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Page 40: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Sitemap

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2012-08-08T08:15:00Z</lastmod>
  </url>
  <url>
    <loc>http://example.com/res2</loc>
    <lastmod>2012-08-08T13:22:00Z</lastmod>
  </url>
</urlset>
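A destination could turn such a Sitemap into an inventory of URIs and last-modification times. The following is a minimal sketch using only the Python standard library; the function name `read_inventory` is illustrative and not something the slides define.

```python
import xml.etree.ElementTree as ET

# Namespace of the Sitemap protocol, as used in the example above.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2012-08-08T08:15:00Z</lastmod>
  </url>
  <url>
    <loc>http://example.com/res2</loc>
    <lastmod>2012-08-08T13:22:00Z</lastmod>
  </url>
</urlset>"""


def read_inventory(xml_text):
    """Return {resource URI: last-modification timestamp} from a Sitemap."""
    # Parse from bytes so the encoding declaration is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    inventory = {}
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", namespaces=SITEMAP_NS)
        lastmod = url.findtext("sm:lastmod", namespaces=SITEMAP_NS)
        inventory[loc] = lastmod
    return inventory


inventory = read_inventory(SITEMAP_XML)
```

Each URI in the resulting mapping is a candidate for a GET during baseline synchronization.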

Page 41: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Baseline Matching - Sitemap

•  Periodic publication of up-to-date Sitemap, which is a “by reference” inventory of a Source’s resources

•  Use “as is” with resource location and last modification date as core elements

•  Introduce extension elements aimed at supporting audit: e.g. MD5 hash of content
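How a destination might use such a hash for audit can be sketched as follows. The slides do not define the extension element itself, so this sketch only assumes that the Source advertises the MD5 digest as a hexadecimal string; the helper names are hypothetical.

```python
import hashlib


def md5_hex(content: bytes) -> str:
    """Hex MD5 digest of a resource's bytes (used for fixity, not security)."""
    return hashlib.md5(content).hexdigest()


def matches_advertised(content: bytes, advertised_md5: str) -> bool:
    """True when the local copy has the digest the Source advertised."""
    # Hex digests are case-insensitive, so normalize before comparing.
    return md5_hex(content) == advertised_md5.strip().lower()


# MD5 of the empty byte string, used here as a stand-in advertised value.
ADVERTISED = "D41D8CD98F00B204E9800998ECF8427E"
unchanged = matches_advertised(b"", ADVERTISED)
changed = matches_advertised(b"edited content", ADVERTISED)
```

A mismatch tells the destination that its copy differs from what the Source described, even if the last-modification dates happen to agree.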

Page 42: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

robots.txt!

discovery
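The slide itself only names robots.txt as a discovery hook. Assuming this refers to the `Sitemap:` directive that the Sitemaps protocol allows in robots.txt, a destination could locate a Source's Sitemap roughly as sketched below; the function name and the sample file are hypothetical.

```python
def sitemap_urls(robots_txt):
    """Collect the URLs named by 'Sitemap:' lines in a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split on the first colon only, so the URL's own colons survive.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


# Hypothetical robots.txt content for illustration.
ROBOTS_TXT = """User-agent: *
Disallow: /private/
Sitemap: http://example.com/sitemap.xml
"""
found = sitemap_urls(ROBOTS_TXT)
```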

Page 43: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Baseline Matching – Dump

•  A Dump is a “by-value” inventory of a Source’s resources

•  Periodic publication of an up-to-date Dump

•  Possible technology: ZIP file consisting of:
   •  Special-purpose Sitemap that acts as a manifest for resources contained in the ZIP file
      •  Introduce an element to express correspondence between resource URI and filename in the ZIP file
   •  Resource bitstreams

•  Possible technology: WARC file
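The ZIP option can be sketched with the Python standard library. The manifest file name `manifest.xml`, the `rs:path` attribute that ties a URI to a filename, and both function names are assumptions made for illustration; the slides only say that some element should express that correspondence.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

SM = "http://www.sitemaps.org/schemas/sitemap/0.9"
RS = "http://www.openarchives.org/rs/terms/"


def build_dump(resources):
    """Pack {uri: bytes} into ZIP bytes with a Sitemap-style manifest."""
    urlset = ET.Element(f"{{{SM}}}urlset")
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for index, (uri, content) in enumerate(sorted(resources.items())):
            filename = f"resources/{index:05d}"
            archive.writestr(filename, content)
            url = ET.SubElement(urlset, f"{{{SM}}}url")
            ET.SubElement(url, f"{{{SM}}}loc").text = uri
            # Hypothetical attribute linking the URI to its file in the ZIP.
            url.set(f"{{{RS}}}path", filename)
        archive.writestr("manifest.xml", ET.tostring(urlset, encoding="utf-8"))
    return buffer.getvalue()


def unpack_dump(zip_bytes):
    """Recover {uri: bytes} from a Dump by following its manifest."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        manifest = ET.fromstring(archive.read("manifest.xml"))
        restored = {}
        for url in manifest.findall(f"{{{SM}}}url"):
            uri = url.findtext(f"{{{SM}}}loc")
            restored[uri] = archive.read(url.get(f"{{{RS}}}path"))
        return restored


# Hypothetical resources for a round trip.
resources = {
    "http://example.com/res1": b"first resource",
    "http://example.com/res2": b"second resource",
}
round_trip = unpack_dump(build_dump(resources))
```

Keeping the URI-to-filename mapping in the manifest means the ZIP's internal layout does not need to mirror the Source's URI structure.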

Page 44: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Page 45: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Change Communication – Pull Change Sets

•  Periodic publication of a Change Set that describes recent changes

•  A Change Set is a Sitemap-style document, enhanced to express change events rather than inventory. Per change event, convey:
   •  About the event:
      •  datetime
      •  event type: create/update/delete (maybe move/copy)
   •  About the changed resource:
      •  URI
      •  Information relevant for audit, e.g. fixity, size, mime type
      •  Further information to aid accessing the resource (see later)

Page 46: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Change Set, Based on Sitemap

<?xml version="1.0" encoding="UTF-8"?>
<urlset rs:type="changeset"
        xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod rs:type="updated">2012-08-08T08:15:00Z</lastmod>
  </url>
  <url>
    <loc>http://example.com/res2</loc>
    <lastmod rs:type="created">2012-08-08T10:22:00Z</lastmod>
  </url>
</urlset>
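A destination acting on this Sitemap-based Change Set might read the events and decide what to do per resource, as in the sketch below. The function names are illustrative, and the `deleted` event token is an assumption: the example only shows `updated` and `created`.

```python
import xml.etree.ElementTree as ET

SM = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
RS = "{http://www.openarchives.org/rs/terms/}"

CHANGESET_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset rs:type="changeset"
        xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod rs:type="updated">2012-08-08T08:15:00Z</lastmod>
  </url>
  <url>
    <loc>http://example.com/res2</loc>
    <lastmod rs:type="created">2012-08-08T10:22:00Z</lastmod>
  </url>
</urlset>"""


def read_changes(xml_bytes):
    """Return (datetime, event type, URI) tuples in chronological order."""
    root = ET.fromstring(xml_bytes)
    changes = []
    for url in root.findall(SM + "url"):
        lastmod = url.find(SM + "lastmod")
        changes.append(
            (lastmod.text, lastmod.get(RS + "type"), url.findtext(SM + "loc"))
        )
    # Same-layout UTC timestamps sort chronologically as strings.
    return sorted(changes)


def plan_actions(changes):
    """Map each URI to the action called for by its most recent event."""
    actions = {}
    for _when, event, uri in changes:
        # "deleted" is an assumed token; the slide shows only created/updated.
        actions[uri] = "REMOVE" if event == "deleted" else "GET"
    return actions


changes = read_changes(CHANGESET_XML)
actions = plan_actions(changes)
```

Processing events in chronological order lets a later event for the same URI override an earlier one, so a resource that was created and then deleted is not fetched.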

Page 47: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Change Set, from Scratch

<?xml version="1.0" encoding="UTF-8"?>
<changeset xmlns="http://www.openarchives.org/rs/changeset">
  <change>
    <link rel="created" length="1234" type="text/html" href="http://example.com/res1.html"/>
    <date>2012-09-25T09:00:00Z</date>
    <fixity>ni:///sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx</fixity>
  </change>
</changeset>

Page 48: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Page 49: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Change Communication – Push Change Sets

•  Use a push technology to convey changes

•  Express changes using same Sitemap-style document
   •  A Change Set in this case might convey only one change event

•  Possible technology: XMPP PubSub

Page 50: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

<XMPP PubSub Intermezzo>

XMPP Publish-Subscribe: Client to Subscription Service, Subscription Service to Client(s) communication

•  One of the XMPP (Extensible Messaging and Presence Protocol) extensions http://xmpp.org/extensions/xep-0060.html

•  Apple Notifications based on XMPP PubSub

•  Both client and server tools widely available

Page 51: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

</XMPP PubSub Intermezzo>

Source Destination PubSub Server

Page 52: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Page 53: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Change Communication Memory

•  Publication of one or more Change Sets that convey historical (rather than recent) changes

•  All historical Change Sets use same Sitemap-style document

•  Same approach irrespective of whether pull or push is used for Change Communication

Page 54: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Page 55: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Page 56: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Resource Transfer

•  Resources are obtained in bulk by obtaining a Dump

•  An individual resource is, by default, obtained by dereferencing a resource’s URI listed in:
   •  Sitemap
   •  Change Set

•  Alternative access mechanisms are introduced to obtain an individual resource:
   •  From a mirror site
   •  Access to diff with previous version instead of access to the entire changed resource
   •  Resource version

Page 57: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Page 58: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Resource Memory

•  Requires a (short or long term) archive of resource versions

•  Access to specific version can be expressed as an alternative access mechanism in e.g. Change Set.
   •  Via a link to a version resource that is the result of the change expressed in the Change Set
   •  Via a link to a Memento TimeGate that supports access to all available prior versions
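The TimeGate route relies on Memento's datetime negotiation, in which a client states the desired moment in an `Accept-Datetime` request header. The sketch below only prepares such a request and does not send it; the TimeGate URI and the function name are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def timegate_request(timegate_uri, when):
    """Return (uri, headers) for a Memento datetime-negotiation request.

    `when` must be a timezone-aware datetime; it is converted to UTC and
    rendered in the RFC 1123 style that Accept-Datetime expects.
    """
    http_date = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return timegate_uri, {"Accept-Datetime": http_date}


# Ask for the state of a resource as of the datetime of a change event.
uri, headers = timegate_request(
    "http://example.com/timegate/http://example.com/res1",  # hypothetical TimeGate
    datetime(2012, 8, 8, 8, 15, 0, tzinfo=timezone.utc),
)
```

A destination that missed an update could use the datetime of the change event it is catching up on, so that it retrieves the version that event produced rather than the current one.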

Page 59: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

<Memento Intermezzo>

http://www.mementoweb.org/

Page 60: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Original Resources and Mementos

Page 61: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Bridge from Present to Past

Page 62: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Bridge from Past to Present

Page 63: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Memento Framework

Page 64: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Page 65: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Page 66: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Page 67: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Page 68: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Page 69: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Page 70: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Page 71: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Original Resource: http://lanlsource.lanl.gov/pics/picoftheday.png

Memento Framework

Page 72: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Time Travel across Versions of a Picture of the Day

Movie at: http://www.mementoweb.org/demo/picoftheday.mov

Page 73: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Original Resource: http://dbpedia.org/resource/France

Memento Framework

Page 74: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Time-Series Analysis across DBpedia Versions

Data collected through HTTP Navigation

Paper at http://arxiv.org/abs/1003.3661

Page 75: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

</Memento Intermezzo>

http://www.mementoweb.org/

Page 76: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Page 77: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

ResourceSync Timeline

•  August 2012
   o  First draft spec shared for feedback with ResourceSync team

•  September 2012
   o  Problem Statement paper in D-Lib Magazine
   o  In-person meeting of ResourceSync Team

•  October 2012
   o  Revise spec, conduct experiments
   o  Solicit broad feedback

•  December 2012
   o  Finalize specification (?)

Page 78: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

Pointers

•  First ResourceSync draft spec (do not implement!): http://www.openarchives.org/rs/0.1/resourcesync

•  ResourceSync Simulator code on github: http://github.org/resync/simulator

•  NISO ResourceSync workspace: http://www.niso.org/workrooms/resourcesync/

•  Memento: http://mementoweb.org

Page 79: NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Synchronization

ResourceSync: Get the Sticker!

Herbert Van de Sompel Los Alamos National Laboratory

@hvdsomp

ResourceSync is funded by The Sloan Foundation & JISC