The importance of the Web for the Semantic Web
Alexandre Monnin, PhD
Associate researcher @Inria
Senior Open data Adviser for Etalab
Chair of the « PhiloWeb » community group (W3C)
Organiser of the « Les rencontres du web de données » Meetup
Twitter: @aamonnz/@PhiloWeb, Website : web-and-philosophy.org
“semantic WEB” and not “SEMANTIC web”
[C. Welty, ISWC 2007]
Above all: the Semantic Web is deeply entrenched in the Web
Why the Web?
Leslie Carr, « The Fundamentals of the Web, the Importance of Web Science »
Maybe it is a « temporary glitch? » (Leslie Carr)
A fragile reality, relying on specific architectural principles, that gave birth over the years to many innovations that may threaten its very existence.
If the Semantic Web (or Web of data) has any future, it must be aware of its roots and preserve what made the Web so incredibly successful on a previously unseen scale.
I. Naming/identifying
The basics
Kieron O’Hara, « The Web as an ethical space »
Three components of the architecture of the Web
• Identification (URI) & « addressability » (URL)
  http://www.inria.fr
  http://ns.inria.fr/fabien.gandon#me
  ldap://[2001:db8::7]/c=GB?objectClass?one
• Communication / protocol (HTTP)
  GET /centre/sophia HTTP/1.1
  Host: www.inria.fr
  User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; de-de) AppleWebKit/523.10.3 (KHTML, like Gecko) Version/3.0.4 Safari/523.10
  Accept-Encoding: gzip
  Accept: text/html,application/xhtml+xml,application/xml
  Accept-Language: en,en-us;q=0.8,fr;q=0.5,fr-fr;q=0.3
  Accept-Charset: ISO-8859-1,UTF-8;q=0.7,*;q=0.7
  Referer: http://fabien.info/
• Representation language(s) (HTML / RDF)
  Fabien works at <a href="http://www.inria.fr">Inria</a>
  <http://www.inria.fr> foaf:member data:fabien
Three functions
• Identification of resources (URI)
• Access to representations (HTTP URI)
• Encoding of representations (HTML, RDF, etc.)
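To make the identification function concrete, here is a minimal sketch that splits the three example identifiers shown earlier into their generic syntactic parts. It assumes only Python's standard `urllib.parse` module. The helper name `describe` is introduced for illustration and is not part of the talk.

```python
from urllib.parse import urlsplit

# The three example identifiers quoted on the slide.
EXAMPLES = [
    "http://www.inria.fr",
    "http://ns.inria.fr/fabien.gandon#me",
    "ldap://[2001:db8::7]/c=GB?objectClass?one",
]


def describe(uri: str) -> dict:
    """Split a URI into its generic parts (hypothetical helper)."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,      # which naming/access scheme is used
        "host": parts.hostname,      # naming authority, if any
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,  # e.g. "me" in the second example
    }


for uri in EXAMPLES:
    print(describe(uri))
```

The same generic syntax covers an HTTP page, a fragment identifier naming a person, and an LDAP query, which is the sense in which the syntax is shared across schemes.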
URIs (universal syntax)
Because the Web had to link to other competing systems: WAIS, Gopher, Prospero… Interoperability and openness gave it a decisive advantage from its inception (Google “Gopher”!).
“I originally called these things “Universal Document Identifiers” (UDIs) even before we started using them for concepts. The IETF were a bit put off, thinking it was too much hubris to call them “universal.” Now I realize that I should have held firm and said “but they are,” as any alternative system of naming you can make out there, I can map it to the character set we use in URIs and I can invent a new scheme for it. So we can map any scheme to URIs. We’d already mapped Gopher, FTP, and these sorts of things. Now, we’ve got HTTP and there will be lots of other schemes. So in a sense URIs are universal, as we’re saying anything – any name that you come across – can be mapped into this space.” (TimBL)
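The claim that any name can be mapped into URI space can be illustrated with a minimal sketch. It percent-encodes an arbitrary name and places it under a scheme. The scheme `x-shelfmark` and both function names are hypothetical and chosen only for this example; they are not registered schemes or anything from the talk.

```python
from urllib.parse import quote, unquote


def name_to_uri(scheme: str, name: str) -> str:
    """Map an arbitrary name into URI syntax under the given scheme."""
    # Percent-encode every character outside the URI unreserved set.
    return f"{scheme}:{quote(name, safe='')}"


def uri_to_name(uri: str) -> str:
    """Recover the original name from a URI built by name_to_uri."""
    _scheme, _, encoded = uri.partition(":")
    return unquote(encoded)


# "x-shelfmark" is an invented scheme for a local library naming system.
uri = name_to_uri("x-shelfmark", "Room 4/Shelf B")
print(uri)
print(uri_to_name(uri))
```

The round trip loses nothing, so the foreign naming system stays intact while becoming addressable in the same syntax as everything else.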
URIs: not « just » a universal syntax
URIs are also what… links… (yes, links!)… are made of
<a href="http://example.com/">lorem ipsum</a>
This remains true of the Semantic Web as well
The original Web: typed links…
RDF: every piece of information is decomposed into triples
subject / entity / node – relation / attribute / arc – object / value / node
e.g. Slides.html has Alexandre as its author and the Web as its theme:
Slides.html has author Alex
Slides.html has theme Web
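The decomposition into triples can be sketched in a few lines. This is an illustrative in-memory model using plain strings, not an RDF library. The names `Triple`, `GRAPH` and `match` are assumptions introduced for the example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# The two statements from the slide, one triple each.
GRAPH = {
    Triple("Slides.html", "author", "Alex"),
    Triple("Slides.html", "theme", "Web"),
}


def match(graph, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    )


print(match(GRAPH, predicate="author"))
print(match(GRAPH, subject="Slides.html"))
```

In actual RDF the subject and predicate would be URIs rather than bare strings, which is what lets statements from different sources refer to the same thing.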
The Semantic Web was mentioned by TimBL in 1994 at WWW
[Tim Berners-Lee 1994, http://www.w3.org/Talks/WWW94Tim/]
« From ADA to AAA »
• ADA (Web) = Anyone can designate anything
“philosophy may be necessary to explain what happens when the legal system hits the Web. When you make a web-page you can link to anything, you can write anything about it. But when a lawyer comes along and reserves the right to charge you to link to their page, then in a way it’s a philosophical question, as you have to tie linking to the way the protocol is defined over a name as just a reference, something that has never been controlled over the millennia. Systems where you control names haven’t worked so far, and so you need the philosophy to show how these protocols are ground out in history and in concepts for using names that lawyers” (TimBL)
« From ADA to AAA »
• AAA (Linked data) = Anyone can say anything about anything
Because we can designate anything (green lines), we can then link any things (red lines)
II. What’s being named?
Resources
Document | Properties | Correlate
UDI Papers (1992) | logical name, not a physical address, so that moving documents does not impinge on the durability of such names (some details should be obfuscated) | object or document; unit of retrieval rather than unit of storage; might identify a query formulated through a service, a question rather than a document
URI, RFC 1630 (1994) | cf. above; distinct from a file name, which is local; should remain opaque, devoid of the details attached to the technicalities of its implementation | accessible objects if URIs are also URLs
URL, RFC 1738 (1994) | non-physical address | resources (not defined), identified in an abstract way (by contrast, accessible contents for RFC 1739)
URN, RFC 1737 (1994) | name, identifier | stable resource, not accessible
IRL, RFC 1736 (1995) | address (URL), identifier (URN), description (URC) | resource – networked or non-networked
URC, IETF drafts | meta-information, list of identifiers – Document appearances database
One URI never = one « page »
[Figure: client and server diagram. Labels: electronic documents, rendering service, computers, servicing client, application, other encoding formats, RPC, psychophysical equivalents.]
Content negotiation (conneg) over HTTP
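Content negotiation is why one URI never equals one « page »: the server picks a representation according to the preferences the client sends. The following minimal sketch shows that selection. It handles only exact matches and q-values, and it ignores wildcards such as `*/*`. The function names `parse_accept` and `negotiate` are hypothetical.

```python
from typing import Optional


def parse_accept(header: str) -> list:
    """Parse an Accept-style header into (value, q) pairs, best first."""
    prefs = []
    for item in header.split(","):
        value, *params = [p.strip() for p in item.split(";")]
        q = 1.0  # default quality when no q parameter is given
        for param in params:
            if param.startswith("q="):
                q = float(param[2:])
        prefs.append((value, q))
    # sorted() is stable, so header order is kept among equal q-values.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def negotiate(header: str, available: list) -> Optional[str]:
    """Pick the client's preferred value among those the server offers."""
    for value, q in parse_accept(header):
        if q > 0 and value in available:
            return value
    return None


print(negotiate("text/html,application/xhtml+xml,application/xml",
                ["application/xml", "text/html"]))
print(negotiate("en,en-us;q=0.8,fr;q=0.5,fr-fr;q=0.3", ["fr", "fr-fr"]))
```

The same URI can thus yield HTML for a browser and RDF for a data client, while still identifying a single resource.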
A forerunner: system 33 (1991-1993)
HTTP Range 14
HTTP code | Result | Indication
200 (OK) | Representation | Information resource or non-information resource
303 (See Other) | URI | Any kind of resource
4XX, 5XX (error) | Error message | Nothing can be inferred
They did not talk about it: the resource; the state of the resource
They talked about it: the representational state of the resource (whence the acronym « REST »!)
Actually, this explains why there are no links on the Web before an actor like Google appears. Links are, rather, pointers to resources embedded in the representations of other resources (and, as such, these pointers might not dereference, and therefore might not link two relata).
Wait! How about REST?
"Resources are angels, URIs pins" "Naming is printing money"
(Larry Masinter)
Resources are « shadows »: not a bug but one of the Web’s greatest features
“7.1.2 Manipulating Shadows. Defining resource such that a URI identifies a concept rather than a document leaves us with another question: how does a user access, manipulate, or transfer a concept such that they can get something useful when a hypertext link is selected? REST answers that question by defining the things that are manipulated to be representations of the identified resource, rather than the resource itself. An origin server maintains a mapping from resource identifiers to the set of representations corresponding to each resource. A resource is therefore manipulated by transferring representations through the generic interface defined by the resource identifier.” (Roy Fielding)
Can objects be mere « shadows » ?
Not « mere » shadows, but still, that compares well to what some philosophers have to say about objects:
“the presence of an object inherently involves its absence. The reason is simply the standard one: in order for a subject to take an object as an object, there must be a separation between them – enough separation to make room for intrinsic abstraction, of detachment, of stabilization. So it is essentially an ontological theorem of this metaphysics that no object, for any given subject, will be wholly there, in the sense of being fully effectively accessible. Or, to put it more carefully: in order to be present ontologically – i.e., in order to be materially present – an object must also be (at least partially) absent metaphysically, in the sense of being partly out of effective reach.” (Brian Cantwell Smith)
Just as an object is never entirely present, a resource is never accessed as such; only its « representations » are – slices of trajectories. Many philosophers thus argue that objects are not already there, waiting to be picked up or designated. Rather, what we designate are regularities, patterns that need to be tended to and maintained, and that call for such care.
It comes with a price
The trajectory drawn by these regularities corresponds to Justin Erenkrantz’s characterization of a resource as a “network continuation”. The price is higher than expected, since identifying a resource requires one to "maintain a mapping from resource identifiers to the set of representations corresponding to each resource". The cost is so high that, eventually, everything will be 404. 404 guarantees that no higher authority is responsible for making sure that every URI dereferences. It is as much a design principle of the Web as any other.
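The cost of maintaining that mapping can be made tangible with a toy model, offered as a sketch rather than a description of any real server. The class `OriginServer`, its methods and the URI `http://example.org/slides` are all hypothetical. The status codes 200, 404 and 406 are standard HTTP codes.

```python
class OriginServer:
    """Toy origin server: maps resource identifiers to representations."""

    def __init__(self):
        # URI -> {media type -> representation body}
        self._representations = {}

    def publish(self, uri: str, media_type: str, body: bytes) -> None:
        self._representations.setdefault(uri, {})[media_type] = body

    def withdraw(self, uri: str) -> None:
        # Nobody is obliged to keep the mapping alive.
        self._representations.pop(uri, None)

    def get(self, uri: str, media_type: str):
        """Return (status, body); only representations are ever served."""
        reps = self._representations.get(uri)
        if reps is None:
            return 404, b""   # mapping no longer maintained
        if media_type not in reps:
            return 406, b""   # resource known, but not in that format
        return 200, reps[media_type]


server = OriginServer()
server.publish("http://example.org/slides", "text/html", b"<p>Slides</p>")
print(server.get("http://example.org/slides", "text/html"))
server.withdraw("http://example.org/slides")
print(server.get("http://example.org/slides", "text/html"))
```

The resource itself never appears in the model; only the mapping does, and once it lapses the URI answers 404 without any central authority stepping in.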
[Figures: successive versions of the Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch (http://lod-cloud.net/): May 2007, April 2008, September 2008, March 2009, September 2010, September 2011]
The Web as ontology: on what there is on a global scale
[Figure: Linking Open Data cloud diagram, September 2011]
From ADA and AAA to a shared world: Wikipedia and DBpedia
Conclusion:
• “The Web may fragment if the engineering isn’t right” (Kieron O’Hara)
• Just as the Web is an application built on the Internet, not the Internet, applications built on the Web are not the Web itself. They might depart from its principles, yet they build on its success.
• “The Web spreads the conditions of its initial creation” (L. Carr). Then, as an open platform, it also spreads potential threats to these conditions.
Why should we care for the Web?
• While many important players are all trying to impose their own rules, keeping data behind closed walls, silos, proprietary platforms, we can see one of them going against the grain, towards more and more openness, building a platform designed to nurture open innovation: Valve’s Steam in the field of video games.
• “So rather than having this curated store we’re going to say, “OK if we are thinking about this correctly, it really should be sort of a network API.” There should be this publishing model – and yes you have to worry about viruses and malware and stuff like that – but essentially anybody should be able to publish anything through Steam.” (Gabe Newell)
• Not unlike the Web…
Thank you very much!