Granularity in Library Linked Open Data
Gordon Dunsire
Keynote presentation to Code4Lib 2013,
12-14 Feb 2013, Chicago, USA
Overview
Fractals
Self-similar at all levels of granularity
Cannot determine level: all levels are equal!
Multi-faceted granularity
What is described by a bibliographic record? Or a single statement?
What is the level of description? How complete is it?
How detailed is the schema used? How dumb?
Semantic constraints? Unconstrained?
AAA! OWA! Rumsfeld and the white light!
Triple:
Subject: This resource
Predicate: has intended audience
Object: Juvenile
Has granularity?
Coarse-grained systems consist of fewer, larger components than fine-grained
systems [Wikipedia]
Resource Description Framework – Linked data
Subject: what is the statement about?
Granularity scale (coarser to finer): Super-Aggregate, Aggregate, Focus, Component, Sub-Component
Examples on the slide: Consortium collection, Library collection, Digital collection, Journals, Journal title, Journal index, Issue, Article, Festschrift, Section, Page, Paragraph, Word, Graphics, Subjects, Access, Resource, Work, Markup, RDF/XML, RDF map, URI, Node
Predicate: what is the aspect described?
Granularity scale (coarser to finer): Super-Aggregate, Aggregate, Focus, Component, Sub-Component
Examples on the slide: Access to resource, Access to content, Suitability rating, Membership category, Audience and usage, Audience, Audience of audio-visual material
isbd: International Standard Bibliographic Description
dct: Dublin Core terms
schema: Schema.org
rda: Resource Description and Access
m21: marc21rdf.info
frbrer: Functional Requirements for Bibliographic Records, entity-relationship model
unc: unconstrained version
Possible Audience map (partial)
Properties in the map, linked by rdfs:subPropertyOf:
unc:“has note on use or audience”
isbd:“has note on use or audience”
unc:“Intended audience”
rda:“Intended audience”
frbrer:“has intended audience”
m21:“Target audience”
m21:“Target audience of …”
dct:“audience”
schema:“audience”
What is the aspect described?
Granularity scale (coarser to finer): Super-Aggregate, Aggregate, Focus, Component, Sub-Component
Examples on the slide: Resource record, Manifestation record, Title and statement of responsibility (s.o.r.), Title statement, Title of manifestation, Title word, First word of title
Possible Title semantic map (partial)
Key: sP: rdfs:subPropertyOf; d: rdfs:domain; r: rdfs:range
Link labels on the slide: sP, d, r, eP
Properties: dc:“Title”, dct:“Title”, rdaopen:“Title”, rdaopen:“Title proper”, rdagrp1:“Title (Manifestation)”, rdagrp1:“Title proper (Manifestation)”, isbd:“has title”, isbd:“has title proper”
Classes: rdfs:“Literal”, rdafrbr:“Manifestation”, isbd:“Resource”
Semantic reasoning: the sub-property ladder
Map: isbd:“has title proper” rdfs:subPropertyOf dct:title
Semantic rule:
If property1 is a sub-property of property2;
Then data triple: Resource property1 “string”
Implies data triple: Resource property2 “string”
Data triple: isbd:“Resource” isbd:“has title proper” “Physics”
Machine entailment (finer to coarser, “dumb-up”): Resource dct:“has title” “Physics”
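The sub-property rule on this slide can be sketched in a few lines of Python. This is only an illustration, not the presenter's implementation: plain tuples stand in for RDF triples, the names `ex:resource`, `isbd:hasTitleProper` and `dct:title` are shortened stand-ins for the real URIs, and the map assumes each property has at most one super-property.

```python
# Minimal sketch of the sub-property rule; tuples stand in for RDF triples.
# Names are shortened, hypothetical stand-ins for the real property URIs.
SUB_PROPERTY_OF = {
    "isbd:hasTitleProper": "dct:title",  # finer property -> coarser property
}


def entail_super_properties(triples, sub_property_of):
    """Return the triples implied by climbing the sub-property ladder."""
    entailed = set()
    for subject, prop, value in triples:
        seen = {prop}
        current = prop
        # Climb one rung at a time; `seen` guards against cycles in the map.
        while current in sub_property_of and sub_property_of[current] not in seen:
            current = sub_property_of[current]
            seen.add(current)
            entailed.add((subject, current, value))
    return entailed


data = {("ex:resource", "isbd:hasTitleProper", "Physics")}
inferred = entail_super_properties(data, SUB_PROPERTY_OF)
print(sorted(inferred))
```

The entailed triple carries the same string under the coarser property, which is the "dumb-up" direction: detail is lost, but the data becomes usable by applications that only understand the coarser schema.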
Data triples from multiple schema
ex:1 frbrer:“has intended audience” “Primary school”
ex:2 isbd:“has note on use or audience” “For ages 5-9”
ex:3 rda:“Intended audience (Work)” “For children aged 7-”
ex:4 m21:“Target audience” m21terms:commonaud#j
m21terms:commonaud#j skos:prefLabel “Juvenile”
Data triples entailed from sub-property map
ex:1 unc:“has note on use or audience” “Primary school”
ex:2 unc:“has note on use or audience” “For ages 5-9”
ex:3 unc:“has note on use or audience” “For children aged 7-”
ex:4 unc:“has note on use or audience” “Juvenile”
Data triples entailed from property domains
ex:1 “is a” frbrer:“Work”
ex:2 “is a” isbd:“Resource”
ex:3 “is a” rda:“Work”
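As a rough sketch of how the two kinds of entailment on this slide could be computed together, the Python below applies a sub-property map and a domain map to the four data triples. Several things here are assumptions made for illustration: the camel-case property names are stand-ins for the real URIs, every property is mapped directly to unc:“has note on use or audience” (the real map may have intermediate steps), and the coded MARC 21 value is replaced by its skos:prefLabel so that the result matches the slide.

```python
# Sketch only: prefixed labels stand in for URIs. The maps are an assumed,
# flattened reading of the slide's partial audience map.
SUB_PROPERTY_OF = {
    "frbrer:hasIntendedAudience": "unc:hasNoteOnUseOrAudience",
    "isbd:hasNoteOnUseOrAudience": "unc:hasNoteOnUseOrAudience",
    "rda:intendedAudienceWork": "unc:hasNoteOnUseOrAudience",
    "m21:targetAudience": "unc:hasNoteOnUseOrAudience",
}
DOMAIN = {
    "frbrer:hasIntendedAudience": "frbrer:Work",
    "isbd:hasNoteOnUseOrAudience": "isbd:Resource",
    "rda:intendedAudienceWork": "rda:Work",
}
PREF_LABEL = {"m21terms:commonaud#j": "Juvenile"}

DATA = [
    ("ex:1", "frbrer:hasIntendedAudience", "Primary school"),
    ("ex:2", "isbd:hasNoteOnUseOrAudience", "For ages 5-9"),
    ("ex:3", "rda:intendedAudienceWork", "For children aged 7-"),
    ("ex:4", "m21:targetAudience", "m21terms:commonaud#j"),
]


def entail(data):
    """Return (audience notes, type statements) implied by the two maps."""
    audience_notes, types = [], []
    for subject, prop, value in data:
        if prop in SUB_PROPERTY_OF:
            # Sub-property rule: restate the value under the coarser property.
            label = PREF_LABEL.get(value, value)
            audience_notes.append((subject, SUB_PROPERTY_OF[prop], label))
        if prop in DOMAIN:
            # Domain rule: using the property implies the subject's class.
            types.append((subject, "rdf:type", DOMAIN[prop]))
    return audience_notes, types


audience_notes, types = entail(DATA)
for triple in audience_notes + types:
    print(triple)
```

After entailment, one query on the unconstrained property reaches all four resources, even though each was described with a different schema.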
What is the aspect described?
Granularity scale (coarser to finer): Super-Aggregate, Aggregate, Focus, Component, Sub-Component
Examples on the slide: Creator, Author, Screenwriter, Animation screenwriter, Children’s cartoon screenwriter
Key: s: rdfs:subPropertyOf; d: rdfs:domain; r: rdfs:range
Link labels on the slide: s, d, r, ?
Properties: rdaroles:“Creator”, rdaroles:“Author (Work)”, rdaroles:“Screenwriter (Work)”, dc:“Creator”, dct:“Creator”, dc:“Contributor”, marcrel:“Author”, marcrel:“Author of screenplay, etc.”
Classes and concepts: rda:“Work”, [rda:“Agent”], dct:“Agent”, lcsh:“Screenwriters”
Machine-generated granularity
Full-text indexing: down to word level
• A very large multilingual ontology with 5.5 millions of concepts
• A wide-coverage "encyclopedic dictionary"
• Obtained from the automatic integration of WordNet and Wikipedia
• Enriched with automatic translations of its concepts
• Connected to the Linguistic Linked Open Data cloud!
User-generated granularity
“OK for my kids (7 and 9)”
“Too childish for me (age 14)”
“Ideal for the child of ambitious parents”
“This sucks – for kids only”
“Great! Has cool stuff”
KISS
Keep it simple, stupid
Keep it simple and stupid?
The data model is very simple: triples!
The (meta)data content is complex
The Mandelbrot Set: “an example of a complex structure arising from the application of simple rules” - Wikipedia
Resource discovery is complex
AAA
Anyone can say anything about any thing
Someone will say something about every thing
In every conceivable way
Linguistically
OWA
Will all the gaps get filled?
“There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are also unknown unknowns. There are things we don't know we don't know.” - Donald Rumsfeld
Open World Assumption: the absence of a statement is not a statement of non-existence
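One way to picture the Open World Assumption is as a lookup with three possible answers instead of two. The Python below is a toy sketch under stated assumptions: the triples, the `ex:` identifiers, and the idea of storing explicit negative statements in a separate set are all hypothetical and are not taken from the presentation.

```python
# Toy sketch: closed-world versus open-world answers to "is this triple true?".
# All identifiers and statements here are hypothetical.
KNOWN_TRUE = {("ex:1", "audience", "Juvenile")}
KNOWN_FALSE = {("ex:1", "audience", "Adult")}  # an explicit negative statement


def closed_world(triple):
    """Closed world: anything not stated is treated as false."""
    return triple in KNOWN_TRUE


def open_world(triple):
    """Open world: anything not stated is unknown (None), not false."""
    if triple in KNOWN_TRUE:
        return True
    if triple in KNOWN_FALSE:
        return False
    return None


query = ("ex:2", "audience", "Juvenile")  # nobody has said anything about ex:2
print(closed_world(query))
print(open_world(query))
```

Under the closed-world reading the missing statement about `ex:2` counts as a "no"; under the open-world reading it stays a gap that someone may still fill.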