Resource Identity and Semantic Extensions:
Making Sense of Ambiguity
David Booth, Ph.D.
Cleveland Clinic (contractor)
Semantic Technology Conference
25-June-2010
Latest version of these slides: http://dbooth.org/2010/ambiguity/
Also available: Companion paper
Outline
• Part 0: Myths about resource identity
• Part 1: RDF semantics and ambiguity
– Interpretations
– Interpretations of a URI
• Part 2: Constraining ambiguity through URI declarations
– Bounding ambiguity
– Ambiguity and owl:sameAs
• Part 3: Determining resource identity
– Semantic extensions
PART 0:
Myths About Resource Identity
This section makes some observations that will be further
explained in later sections.
URIs as names for resources
[Figure: a URI (http://example/#apple) and a Resource, with a question mark]
• In the semantic web, URIs are used as names for resources
– Separate from a URI's use as a locator
• “Resource” == “Thing” == Universal class
– E.g., people, proteins, medications, concepts, etc.
• But which resource does a URI name?
Resource identity
• Which resource does a given URI denote?
• Because a URI can denote any resource, this
question is central to RDF semantics
• This is the question of resource identity – the
determination of which resource a URI denotes
Myth 1: A URI denotes only one resource
Myth:
• “By design a URI identifies one resource”
– W3C Architecture of the World Wide Web
Reality:
• True as an ideal
• True for one interpretation of one RDF graph, but . . .
• Different interpretations of the same graph may map
the same URI to different resources
• Different graphs may permit different interpretations
Myth 2: RDF semantics are global
Myth (a variation of #1):
• There is only one giant graph, with global semantics
• E.g., “owl:sameAs makes a very strong statement”
– The implication is that it must hold universally
Reality:
• RDF semantics are defined for a given graph
• There are many graphs
• The “meaning” of a URI depends on the graph
– A URI may denote different resources in different graphs
Myth 3: Resource ambiguity is due to sloppiness
Myth:
• A URI's resource identity can be uniquely defined if
you are precise enough
Reality:
• Ambiguity is unavoidable
– . . . with vanishingly few exceptions
• Always possible to make ever finer resource
distinctions
• See examples in In Defense of Ambiguity by Pat
Hayes and Harry Halpin
Myth 4: Truth is absolute
Myth:
• “If your RDF models the world as flat, then it is
wrong”
Reality:
• “Truth” is irrelevant; what matters is usefulness
• Different apps have different needs
– Flat world model may be best for street navigation: precise enough, and simpler than round world model
• Different apps need different models
PART 1:
RDF Semantics and Ambiguity
This section examines some consequences of standard
RDF semantics.
“Interpretations” in RDF Semantics
• An interpretation maps URIs to resources
[Figure: an interpretation maps the URIs http://example/#plum, http://example/#apple, http://example/#pear, http://example/#banana and http://example/#orange to resources]
An interpretation applied to a single URI
• An interpretation maps that URI to one resource
– Associates the name with a particular resource
[Figure: an interpretation maps the URI http://example/#apple to a resource]
Multiple interpretations
• RDF semantics does not constrain a graph to a
unique interpretation
• Different interpretations may map the same URI to
different resources
[Figure: interpretations i1, i2 and i3 map the URI http://example/#apple to different resources]
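The idea that one graph admits several interpretations can be sketched in Python. This is only a toy model, not RDF's actual model theory: the three-item universe of resources and their names are assumptions invented for illustration, and a real universe need not be finite.

```python
from itertools import product

# Hypothetical toy universe; plain strings stand in for real-world things.
RESOURCES = ["apple-the-fruit", "apple-the-company", "one-specific-apple"]
URIS = ["http://example/#apple", "http://example/#pear"]

def all_interpretations(uris, resources):
    """Yield every mapping that sends each URI to exactly one resource."""
    for combo in product(resources, repeat=len(uris)):
        yield dict(zip(uris, combo))

interps = list(all_interpretations(URIS, RESOURCES))

# Each interpretation picks one resource per URI, but different
# interpretations may pick different resources for the same URI.
apple_referents = {i["http://example/#apple"] for i in interps}
print(len(interps), sorted(apple_referents))
```

With no assertions to constrain it, every resource in the toy universe remains a possible referent of http://example/#apple.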
Many interpretations
• There may be many interpretations
– Potentially infinite
[Figure: many interpretations of http://example/#apple]
RDF semantics constrains the possible interpretations for a given graph
• For a given graph, RDF semantics constrains the
possible interpretations
[Figure: possible interpretations of http://example/#apple]
Adding assertions reduces the set of possible interpretations
• When RDF graphs are merged, the constraints of both graphs must be satisfied
[Figure: a reduced set of possible interpretations of http://example/#apple]
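A minimal sketch of how merging narrows the possibilities, under strong simplifying assumptions: a "graph" is modeled as a list of Python predicates over an interpretation, and the facts about the toy universe (IS_FRUIT, IS_GREEN) are fixed and invented for illustration. Real RDF interpretations also vary the meaning of properties, which this sketch ignores.

```python
from itertools import product

RESOURCES = ["granny-smith", "fuji", "bartlett-pear", "apple-inc"]
APPLE = "http://example/#apple"

# Hypothetical facts about the toy universe (not from any real ontology).
IS_FRUIT = {"granny-smith", "fuji", "bartlett-pear"}
IS_GREEN = {"granny-smith", "bartlett-pear"}

# Each graph is a list of constraints that an interpretation must satisfy.
graph_a = [lambda i: i[APPLE] in IS_FRUIT]   # "apple is a fruit"
graph_b = [lambda i: i[APPLE] in IS_GREEN]   # "apple is green"

def models(graph, uris=(APPLE,)):
    """Return every interpretation that satisfies all constraints in graph."""
    result = []
    for combo in product(RESOURCES, repeat=len(uris)):
        interp = dict(zip(uris, combo))
        if all(constraint(interp) for constraint in graph):
            result.append(interp)
    return result

merged = graph_a + graph_b  # merging keeps the constraints of both graphs
print(len(models(graph_a)), len(models(graph_b)), len(models(merged)))
```

The merged graph can never have more satisfying interpretations than either input, because every constraint from both graphs still applies.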
“Interpretations of a URI”
• For a given graph, “Interpretations of a URI” == The set of resources from applying all possible interpretations to that URI
[Figure: the set of resources that the possible interpretations assign to http://example/#apple]
Resource ambiguity
• For a given RDF graph, a URI's resource is ambiguous if there exists more than one possible interpretation for that URI
– I.e., the possible interpretations map that URI to more than one resource
• Referent of a URI is almost always ambiguous!
– But that's okay – it's just life
[Figure: http://example/#apple mapped to more than one resource]
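The definitions of "interpretations of a URI" and of ambiguity translate directly into a small check. This is a sketch only: the list of satisfying interpretations is hand-written and hypothetical, standing in for whatever RDF semantics would permit for some graph.

```python
def interpretations_of(uri, satisfying_interpretations):
    """Set of resources the URI maps to across all satisfying interpretations."""
    return {interp[uri] for interp in satisfying_interpretations}

def is_ambiguous(uri, satisfying_interpretations):
    """A URI is ambiguous if more than one resource remains possible for it."""
    return len(interpretations_of(uri, satisfying_interpretations)) > 1

APPLE = "http://example/#apple"
PEAR = "http://example/#pear"

# Hypothetical interpretations that satisfy some graph.
satisfying = [
    {APPLE: "granny-smith", PEAR: "bartlett"},
    {APPLE: "fuji", PEAR: "bartlett"},
]

print(is_ambiguous(APPLE, satisfying))  # two candidate referents remain
print(is_ambiguous(PEAR, satisfying))   # only one candidate referent remains
```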
Interpretations of different URIs may overlap
• URIs X and Y may map to some of the same
resources
[Figure: overlapping sets labeled “Interpretations of X” and “Interpretations of Y”]
Effect of owl:sameAs
• X owl:sameAs Y
• Limits the interpretations for X and Y to the intersection
[Figure: “Interpretations of X” and “Interpretations of Y” overlap; “X owl:sameAs Y” limits both to the intersection]
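Treating each URI's interpretations as a set of candidate resources, the effect of owl:sameAs is a set intersection. A minimal sketch, with hypothetical resource names r1 to r4 and a hypothetical helper apply_same_as:

```python
def apply_same_as(interps_x, interps_y):
    """After asserting X owl:sameAs Y, both URIs must denote the same
    resource, so each is limited to the resources they have in common."""
    common = interps_x & interps_y
    return common, common

interps_x = {"r1", "r2", "r3"}
interps_y = {"r2", "r3", "r4"}

new_x, new_y = apply_same_as(interps_x, interps_y)
print(sorted(new_x))
```

If the two sets share nothing, the intersection is empty, which in this toy model means the graph containing the owl:sameAs assertion has no possible interpretation.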
PART 2:
Constraining Ambiguity through
URI Declarations
This section proposes a standard way to constrain
resource ambiguity.
URI Declarations
• A URI declaration provides a definition for a resource denoted by a URI
– See “URI Declaration in Semantic Web Architecture”
• Definition is provided by a set of core assertions
• Core assertions constrain the possible
interpretations for the URI
• URI declaration should be provided via the URI's follow-your-nose location
– See “Cool URIs for the Semantic Web”
Why URI Declarations?
• Easy to know what definition to use
– Dereference the URI to find its URI declaration (usually)
• Permits all users of the URI to share the same definition
– Stabilizes meaning / Avoids semantic drift
• Resource ambiguity is precisely bounded
– Interpretations can still vary within bounds
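For hash URIs such as http://example/#apple, the document to dereference is the URI with its fragment removed. The sketch below shows only that step, using the standard library; the function name follow_your_nose_location is hypothetical, and URIs served via an HTTP 303 redirect would need an actual request, which is not shown.

```python
from urllib.parse import urldefrag

def follow_your_nose_location(uri: str) -> str:
    """Return the URI without its fragment, i.e. the document a client
    would fetch when dereferencing a hash URI."""
    return urldefrag(uri).url

print(follow_your_nose_location("http://example/#apple"))
```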
Bounding the interpretations of a URI X
• URI declaration bounds the interpretations of URI X
• Use of X in graph A further limits the possible
interpretations
[Figure: the region “Interp. of X in graph A” lies inside “Interpretations of X consistent with X's URI declaration”]
Interpretations of a URI X in different graphs
• Same URI may have different possible interpretations in different graphs
– E.g., URI X is used in graphs A and B
• All are within the bounds of X's URI declaration
• When graphs are merged, the possible interpretations for
X are limited to the intersection
[Figure: regions “In graph A” and “In graph B” overlap in “In A+B”, all within “In URI declaration”]
Inconsistent combined graphs
• URI X is used in graphs A, B and C
– Graph A+B is consistent
– Graph B+C is consistent
• Graph A+C (or A+B+C) is inconsistent: no possible
interpretations
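[Figure: regions “In A”, “In B” and “In C” within “In X's URI declaration”, with overlaps “In A+B” and “In B+C”; A and C do not overlap]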
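The A, B, C situation can be sketched with sets of candidate referents for X. The resource names r1 to r5 and the helper possible_in_merge are hypothetical, and intersecting per-URI sets is a simplification of what merging graphs does under full RDF semantics.

```python
def possible_in_merge(declaration_bounds, *graph_interps):
    """Intersect the URI declaration's bounds with what each graph
    allows X to denote; an empty result means the merge is inconsistent."""
    result = set(declaration_bounds)
    for interps in graph_interps:
        result &= interps
    return result

# Hypothetical candidate referents of X.
declaration = {"r1", "r2", "r3", "r4", "r5"}
in_a = {"r1", "r2"}
in_b = {"r2", "r3"}
in_c = {"r3", "r4"}

a_b = possible_in_merge(declaration, in_a, in_b)
b_c = possible_in_merge(declaration, in_b, in_c)
a_c = possible_in_merge(declaration, in_a, in_c)

for name, interps in [("A+B", a_b), ("B+C", b_c), ("A+C", a_c)]:
    print(name, "consistent" if interps else "inconsistent", sorted(interps))
```

Each graph on its own stays within the declaration's bounds, yet A and C share no candidate, so any merge containing both leaves X with no possible interpretation.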
Splitting identities
• Use of A+C (or A+B+C) together requires splitting X's identity, e.g.:
– Mint new URI Xab to replace X in graph A to make A'
– Mint new URI Xbc to replace X in graph C to make C'
– Then merge graph A' with graph C'
• See http://dbooth.org/2007/splitting/
[Figure: regions “X in A”, “X in B” and “X in C” within “In X's URI declaration”, with “Xab in A+B” and “Xbc in B+C”]
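Splitting an identity amounts to renaming the URI in a copy of each conflicting graph before merging. A minimal sketch, assuming graphs are plain Python sets of (subject, predicate, object) tuples; the URIs, the color property, and the helper rename_uri are invented for illustration.

```python
def rename_uri(graph, old, new):
    """Return a copy of graph with every occurrence of old replaced by new."""
    return {
        tuple(new if term == old else term for term in triple)
        for triple in graph
    }

X = "http://example/#X"
XAB = "http://example/#Xab"   # newly minted URI for X as used in graph A
XBC = "http://example/#Xbc"   # newly minted URI for X as used in graph C
COLOR = "http://example/#color"  # hypothetical property

graph_a = {(X, COLOR, "red")}
graph_c = {(X, COLOR, "green")}

a_prime = rename_uri(graph_a, X, XAB)
c_prime = rename_uri(graph_c, X, XBC)
merged = a_prime | c_prime  # the two graphs no longer constrain the same URI
print(sorted(merged))
```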
Trade-off: precision versus reusability
• Broader URI declaration:
– Permits the URI to be used in more applications
– Causes more downstream contradictions when the URI is re-used in other graphs and those graphs are later combined
• Narrower URI declaration:
– Restricts the URI to fewer applications
– Reduces likelihood of downstream contradictions
• Recommendation:
– Choose the degree of precision that will best attract the community of applications that you wish to attract
– See also discussion of “clumping” in http://dbooth.org/2007/uri-decl/20100615.htm#clumping
PART 3:
Determining Resource Identity
This section proposes a standard process for determining
resource identity.
Determining resource identity
1. Select assertions – what graph?
1.a. Recursively merge ontologies and URI declarations
– Ontologies and URI declarations should be cached!
2. Apply RDF semantics
• Constrains the possible interpretations for each URI
3. Select an interpretation
RDF Semantics only defines step 2!
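The three steps can be sketched as a single function for one URI. Everything here is a simplifying assumption: graphs are lists of predicates over (uri, resource), the universe is a small list of strings, step 1.a (recursive merging and caching) is omitted, and the selection policy in step 3 is left to the caller because the slides do not prescribe one.

```python
def determine_identity(uri, graphs, resources, choose):
    """Toy version of the three steps for a single URI."""
    # Step 1: select assertions by merging the chosen graphs.
    assertions = [constraint for graph in graphs for constraint in graph]
    # Step 2: apply the semantics; keep resources that satisfy every assertion.
    possible = [r for r in resources if all(c(uri, r) for c in assertions)]
    # Step 3: select one interpretation (policy supplied by the caller).
    return choose(possible) if possible else None

APPLE = "http://example/#apple"
RESOURCES = ["granny-smith", "fuji", "apple-inc"]
FRUIT = {"granny-smith", "fuji"}  # hypothetical fact about the toy universe

ontology = [lambda uri, r: uri != APPLE or r in FRUIT]
data = [lambda uri, r: uri != APPLE or r != "fuji"]

referent = determine_identity(APPLE, [ontology, data], RESOURCES, choose=min)
print(referent)
```

Only the middle step corresponds to what RDF semantics defines; which graphs to merge and which remaining candidate to pick are decisions made outside it.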
Resource identity with RDF semantics
[Figure: flow diagram with steps “1. Select assertions”, “1.a. Get ontologies & URI declarations”, “2. Apply RDF semantics” and “3. Select an interpretation”; labels “Available assertions”, “Formal assertions”, “Informal assertions” and “Possible interpretations”; examples shown: <http://example#apple> ... and rdfs:comment " ... " .]
Semantic extensions
• Define additional entailment rules and constraints
– E.g., OWL or FruitOnt
• Must be monotonic
– All previous entailments still hold
• Further limit the set of possible interpretations
• Typically triggered by a predicate URI
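A minimal sketch of a rule-based semantic extension triggered by a predicate URI. It implements just one rule, the symmetry of owl:sameAs, over graphs modeled as sets of tuples; the registry EXTENSIONS and the function entail are hypothetical names, and a real OWL reasoner does far more.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

def same_as_symmetry(triple):
    """If X owl:sameAs Y holds, then Y owl:sameAs X is entailed."""
    s, p, o = triple
    return {(o, p, s)}

# Hypothetical registry: a predicate URI triggers its extension rule.
EXTENSIONS = {OWL_SAME_AS: same_as_symmetry}

def entail(graph, extensions=EXTENSIONS):
    """Apply the rules until nothing new is added. Triples are only ever
    added, never removed, so earlier entailments still hold (monotonic)."""
    closure = set(graph)
    changed = True
    while changed:
        changed = False
        for triple in list(closure):
            rule = extensions.get(triple[1])
            if rule is not None:
                new = rule(triple) - closure
                if new:
                    closure |= new
                    changed = True
    return closure

X, Y = "http://example/#X", "http://example/#Y"
graph = {(X, OWL_SAME_AS, Y), (X, "http://example/#color", "red")}
print(sorted(entail(graph)))
```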
Resource identity under semantic extensions
1. Select assertions – what graph?
1.a. Recursively merge ontologies and URI declarations
– Ontologies and URI declarations should be cached!
2. Apply RDF semantics + semantic extensions
• Predicate URI triggers the use of semantic extensions:
– Opaque plug-in, or
– Set of rules
3. Select an interpretation
Resource identity under semantic extensions
[Figure: flow diagram with steps “1. Select assertions”, “1.a. Get ontologies & URI declarations”, “2. Apply RDF + extension semantics” and “3. Select an interpretation”; labels “Available assertions”, “Formal assertions”, “Informal assertions”, “Possible interpretations” and “Semantic extensions” (e.g. OWL, FruitOnt); examples shown: <http://example#apple> ... and rdfs:comment " ... " .]
Summary
• Part 0: Myths about resource identity
• Part 1: RDF semantics and ambiguity
– Interpretations
– Interpretations of a URI
• Part 2: Constraining ambiguity through URI declarations
– Bounding ambiguity
– Ambiguity and owl:sameAs
• Part 3: Determining resource identity
– Semantic extensions
Splitting URI X's resource identity
• What if you really want to combine graphs A+B+C?
• URI X may be split into two URIs, e.g.:
• In graph AB = A+B, change all X to X1
• In graph BC = B+C, change all X to X2
[Figure: regions “In graph A”, “In graph B” and “In graph C”, with overlaps “In A+B” and “In B+C”]
Effect of owl:sameAs
• X owl:sameAs Y
• Each URI has a set of possible interpretations
• owl:sameAs limits this set to the intersection
[Figure: overlapping sets labeled “Interpretations for X” and “Interpretations for Y”]
Brief Description
• This presentation shows how ambiguity fits within
standard RDF semantics, explains how it relates to
owl:sameAs, and proposes a standard operational
sequence for determining the referent of a URI.
Abstract
• What does a URI denote? How should its referent be determined, even in
the presence of semantic extensions that affect the interpretation of an RDF
graph? How should ambiguity be viewed?
• One view is that a given URI has no fixed referent, but may denote different
things in different contexts. Another is that each URI should have a URI
declaration that precisely delimits its interpretation. Some suggest reusing
existing URIs in new contexts, while others prefer to mint new URIs and
then allow owl:sameAs assertions to indicate that two URIs denote the same
thing.
• This presentation sheds light on these issues by explaining how ambiguity
of a URI's referent fits within standard RDF semantics and how this ambiguity
applies to the use of owl:sameAs, and by proposing a standard operational
sequence for determining the intended referent of a URI, even in the
presence of semantic extensions.