OAC Technical Summary

55
Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel Robert Sanderson Herbert Van de Sompel Prototyping Team Research Library Los Alamos Na<onal Laboratory h"p://www.openannota-on.org/ h"p://groups.google.com/group/ oac‐discuss OAC is funded by the Andrew W. Mellon Founda<on Open Annotation: Technical Overview

description

 

Transcript of OAC Technical Summary

Page 1: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

RobertSandersonHerbertVandeSompel

PrototypingTeamResearchLibrary

LosAlamosNa<onalLaboratory

h"p://www.openannota-on.org/

h"p://groups.google.com/group/oac‐discuss

OACisfundedbytheAndrewW.MellonFounda<on

Open Annotation: Technical Overview

Page 2: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Outline

•  Motivation

•  OAC’s Perspective on Annotations

•  Design Choices

•  Alpha 3 Data Model

2

Page 3: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Motivation (1)

•  Annotations are a core ingredient of scholarship:

•  Transcribe and annotate medieval manuscripts; •  Annotate maps with maps; •  Annotate video recording of endangered languages with non-

verbal communication events; •  Annotate 3D museum artifacts; •  Your use cases …

3

Page 4: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Motivation (2)

•  Existing solutions for scholarly annotation are repository-centric, tied to silos:

•  Annotation in terms of specific repository and/or collection, no global scope;

•  Only consumable in client/server combination that created the annotation;

•  Annotations not shareable beyond original server – can not create cross system services based on (enriched & merged) annotations.

4

Page 5: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Open Annotation Collaboration

•  Focus on interoperability for annotations in order to allow sharing of annotations across:

•  Annotation clients; •  Content collections; •  Services that leverage annotations.

•  Focus on annotation for scholarly purposes. But desire to make the OAC framework more broadly usable.

•  In order to gain adoption => tools, communities, integration of scholarly communication with other areas of discourse.

5

Page 6: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

OAC’s Perspective on Annotations (1)

•  The following characterize an annotation:

•  There is an author (human, software agent) of an annotation; •  The annotation occurs at some point in time; •  There is a body of an annotation; •  There is a target of an annotation; •  The body annotates the target, i.e. the body is somehow

“about” the target.

6

Page 7: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

OAC’s Perspective on Annotations (2)

•  Body and target of an annotation can be of any media type.

•  A video can annotate a Web page; a Web page can annotate a video.

•  This is contrary to the prevailing view in which the body is textual.

•  Annotation, body, and target can have different authors.

•  This is contrary to the prevailing view in which annotation and body have same authorship.

7

Page 8: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

OAC’s Perspective on Annotations (3)

•  Most (scholarly) annotations involve parts of resources (image regions, slices of a video, dimensions of a dataset).

•  The annotation framework should provide support for resource segment identification and description.

•  A variety of more complex annotations involve multiple targets (and maybe multiple bodies?).

•  The annotation framework should support this. •  So far, no compelling use cases for multiple bodies have been

identified.

8

Page 9: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Design Choices (1)

•  Interoperability specified in terms of the Architecture of the Web (URI, link, resource, representation, …) , Semantic Technologies, Linked Data.

•  Aligned with desire to more tightly embed scholarly communication in overall human discourse;

•  Aligned with trend towards machine-actionable scholarly communication system;

•  Aligned with approach we followed in other efforts (OAI-ORE for Aggregations; Memento for versioning).

9

Page 10: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Design Choices (2)

•  Client autonomy.

•  Not an annotation protocol (cf. Annotea) but an annotation model combined with a publish/subscribe mechanism;

•  No reliance on a server that helps with generation of annotations. Only a server that supports publish/discovery of the annotation created by the client is assumed.

10

Page 11: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Protocol

publish, subscribe, consume

11

Page 12: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Publish/Subscribe

publish subscribe consume

12

Page 13: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Design Choices (3)

•  An annotation is an information resource (aka document).

•  Earlier versions of the model regarded annotations as conceptual, non-information resources. This approach was related to: •  Ideas to model annotations as events; •  Ideas to model annotations as OAI-ORE Aggregations.

•  Community feedback to this approach was negative: •  Added complexity without added value; •  The use of OAI-ORE, and especially its Proxies, was deemed

artificial.

13

Page 14: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Design Choices (4)

•  Significant attention to problems related to the ephemeral nature of the Web.

•  Representations of URI-addressable resources change over time, resulting in ambiguous or incorrect annotations, as well as annotations that lack (representations of) body and/or target.

•  The approach provides hooks to allow: •  Recognizing ambiguous/incorrect annotations, e.g. via timestamp

and fixity for body and target; •  Reconstructing the annotation, e.g. via timestamps, scholarly

archives, and Memento.

14

Page 15: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Design Choices (5)

•  A model that gracefully builds from simple to complex •  The simple core of the model supports elementary annotation use

cases. •  The model becomes gradually more complex to accommodate

increasingly complex use cases.

15

Page 16: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Basic Model

•  The basic model has three resources: •  Annotation (an RDF document)

•  Default: RDF/XML but others via Content Negotiation •  Body (the comment or text of the annotation) •  Target (the resource the body is about)

16

Page 17: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Basic Model Example

17

Page 18: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Additional Relationships and Properties

•  Any of the resources can have additional information attached, such as creator, date of creation, title, etc.

18

Page 19: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Additional Properties Example

19

Page 20: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Annotation Types

•  There can be further types of Annotation, such as a Reply. •  Example: Replies are Annotations on Annotations.

20

Page 21: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Annotation Types Example

21

Page 22: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Annotation Types Example

We'll come back to these …

22

Page 23: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Inline Information

•  It is important to be able to have content contained within the Annotation document for reasons of Client Autonomy:

•  Clients may be unable to mint new URIs for every resource •  Clients may wish to transmit only a single document

•  Third parties can generate new URIs if the client does not

•  The W3C has a Content in RDF specification that describes how to do this:

•  http://www.w3.org/TR/Content-in-RDF10/

23

Page 24: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Inline Information: Body

•  We introduce a resource identified by a non resolvable URI, such as a UUID URN, as the Body. •  We then embed the data within the Annotation document using the 'chars' property from the Content in RDF ontology.

24

Page 25: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Inline Body Example

25

Page 26: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Multiple Targets

•  There are many use cases for multiple targets for an Annotation: •  Comparison of two or more resources •  Making a statement that applies to all of the resources •  Showing the provenance of resources •  Making a statement about multiple parts of a resource

•  The OAC Data Model allows for multiple targets by simply having more than one hasTarget relationship.

26

Page 27: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Multiple Targets Example

27

Page 28: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Segments of Resources

•  Most annotations are about part of a resource

•  Different segments for different media types:

•  Text: paragraph, arbitrary span of words •  Image: rectangular or arbitrary shaped area •  Audio: start and end time points, track name/number •  Video: area and time points •  Other: slice of a data set, volume in a 3d object, …

28

Page 29: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Segments of Resources (2)

•  Web Architecture Segmentation:

•  A URI with a Fragment identifies part of the resource •  Media-specific fragment identifiers; eg XPointer for XML •  W3C Media Fragments URI specification for simple segments of media: http://www.w3.org/TR/media-frags/

•  We introduce a method of constraining resources:

•  Introduce an approach for arbitrarily complex segments that cannot be expressed using Fragments •  Can be applied to Body or Target resource

29

Page 30: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Segments of Resources: Fragment URIs

•  URI Fragments are a syntax for creating subsidiary URIs that identify part of the main resource

•  The syntax is defined per media type •  X/HTML: The named anchor or identified element

•  http://www.example.net/foo.html#namedSection

•  XML: An XPointer to the element(s) •  http://www.example.net/foo.xml#xpointer(/a/b/c)

•  PDF: Many options, most relevant two operations: •  http://www.example.net/foo.pdf#page=2&viewrect=20,80,50,60

•  Plain Text: Either by character position or line position: •  http://www.example.net/foo.txt#char=0,10 •  http://www.example.net/foo.txt#line=1,5

• : 30

Page 31: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Segments of Resources: Media Fragments

•  Media Fragments allow anyone to create URIs that identify part of an image, audio or video resource.

•  The most common case is for rectangular areas of images: •  http://www.example.org/image.jpg#xywh=50,100,640,480

•  Link to the full resource as well, for all Fragment URIs

31

Page 32: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Media Fragments Example

32

Page 33: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Media Fragments Example

33

Page 34: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Complex Constraints

•  The ConstrainedTarget (CT-1) identifies the segment of interest

•  The type of description is dependent on the media type and nature of the target resource.

•  If a Fragment URI is not possible, we introduce a Constraint to describe the segment of interest

•  Media Fragments embed the segment description in the URI •  Constraints are entire resources, so can be more expressive •  Constraints may also describe 'contextual' information

34

Page 35: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Constraint Example

35

Page 36: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Inline Information: Constraints

•  We can also use inline information in the same way as for the Body resource to include the Constraint data.

36

Page 37: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Inline Constraint Example

37

Page 38: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

RDF Constraints

•  Instead of having the information in an external document, it could be within the RDF of the Annotation document.

•  We attach information to the Constraint node

•  The Annotation Ontology models its "Selectors" in this way

http://code.google.com/p/annotation-ontology/

38

Page 39: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

RDF Constraint Example

39

Page 40: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

RDF Constraint plus Media Fragment

40

Page 41: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Constrained Body

•  The body may also be constrained in the same way as Targets

41

Page 42: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Constrained Body

42

Page 43: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Annotating a Non-Document

•  The Target of an Annotation does not have to be a document: can be a Non Information Resource

•  Non Information Resource as Target •  Annotations about:

•  Concepts •  Physical things •  Locations •  or any other non-document

•  Example: A video about a real life painting

•  Non Information Resource as Body •  We'll come back to this…

43

Page 44: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Non Information Resource Example

44

Page 45: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Annotations and Time

•  Web Resources change over time, but keep the same URI

•  This is important for linking, but makes Annotation hard. We need to know when the annotation applies to the resource.

•  This is true for Body and Target(s).

•  The created time is not sufficient, as the Annotation, Body and Target could be created at different times. The Body could be about a previous state of the Target.

45

Page 46: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Web-Centric Annotation: No Persistence

Google Sidewiki Annotation on http://news.bbc.co.uk/ as of 2010-06-14

46

Page 47: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Web-Centric Annotation: No Annotations

Archived page from: http://www.dracos.co.uk/work/bbc-news-archive/2010/03/08/07.05.html

47

Page 48: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Web-Centric Annotation: Desired Cross-Linking

48

Page 49: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Annotations and Time

•  There are three different types of Annotation with respect to Time:

•  Timeless Annotations •  These are always relevant, regardless of the current state of the resource.

•  Uniform Annotations •  There is a single timestamp at which all of the resources should be considered.

•  Varied Annotations •  Each of the resources (Body, Targets) should be considered at a different time.

49

Page 50: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Timeless Annotations

•  The model for a Timeless Annotation is the base model

50

Page 51: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Uniform Annotations

•  If the same time is applicable to all resources, we attach it to the Annotation using the oac:when predicate.

51

Page 52: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Uniform Annotation Example

52

Page 53: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Varied Annotations

•  If different timestamps are required for each resource, we use oac:when from an oac:TimeConstraint.

53

Page 54: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

Varied Annotation Example

54

Page 55: OAC Technical Summary

Open Annotation: Technical Overview Robert Sanderson, Herbert Van de Sompel

http://www.openannotation.org/

http://groups.google.com/group/oac-discuss

55