Post on 22-Jul-2020
Semantic Web
Tahani Aljehani
Motivation: Example 1
• You are interested in SOAP Web architecture
• Use your favorite search engine to find the articles about SOAP
• Keywords-based search
• You'll get lots of information, both relevant and irrelevant
• Dish washing soap, facial soaps, (...)
• You still have to do a lot of work to find your required information
Motivation: Example 1
• What was the problem?
• The simple keyword matching does not take “semantics” into account
• In different contexts the word soap means different things (lexical ambiguity)
• Semantics of a word = meaning of the word
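The problem can be reproduced in a few lines. The sketch below is only an illustration: the three documents and the `keyword_search` helper are made up for this example and do not describe how any real search engine works.

```python
# Three hypothetical documents; only the first is about the SOAP protocol.
documents = [
    "SOAP is an XML-based messaging protocol for web services",
    "This dish washing soap removes grease quickly",
    "Gentle facial soap for sensitive skin",
]

def keyword_search(query, docs):
    """Return every document that contains the query word, ignoring case."""
    wanted = query.lower()
    return [d for d in docs if wanted in d.lower().split()]

# Plain keyword matching cannot tell the protocol from the cleaning product,
# so all three documents are returned.
hits = keyword_search("soap", documents)
print(len(hits), "of", len(documents), "documents match")
```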
Motivation: Example 2
• You want to know about the former kings of Saudi Arabia
• Again, the keyword former would not match the words previous or old
• Most search engines do not search intelligently, for example, by exploiting synonymy
• The information is available on the Web, but you might not find it
• The problem becomes worse when the information is in another language, e.g. Arabic
Limitation of the current Web
• Finding relevant information is hard. WHY?
• Synonymy (above, Example 2)
• Homonymy (above, Example 1)
• Spelling variants:
• e.g. “organize” in American English vs. “organise” in British English
• Spelling mistakes
• Multiple languages
• English, Arabic, French,…
Limitation of the current Web
• Tasks often require combining data on the Web
– Searching for the same information in different digital libraries
– Information may come from different web sites and needs to be combined
• Some existing Web sites provide limited facilities for combining data from various sources
– But these are not scalable
How to improve the existing Web?
• Increasing automatic linking among data
• Increasing accuracy in search
• Increasing automation in data integration
• Adding semantics to data is the solution!
What is Semantic Web
• A Web of data
• It is not a Web of pages
• It describes the relationship between things
– X is-a-student-of Y
• It describes properties of things
– price, color, ...
What is Semantic Web?
• The next generation of the WWW
• Information has machine-processable and machine-understandable semantics
• Not a separate Web but an augmentation of the current one
• RDF and ontologies form the backbone of the Semantic Web
Semantic Web
Semantics is the study of meaning
• “The Semantic Web is a major research initiative of the World Wide Web Consortium (W3C) to create a metadata-rich Web of resources that can describe themselves not only by how they should be displayed (HTML) or syntactically (XML), but also by the meaning of the metadata.”
• An enhancement to the current Web, not a replacement
• “The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.”
The Semantic Web is about…
• Web Data Annotation – connecting (syntactic) Web objects, like text chunks, images, … to their semantic notion (e.g., this image is about Innsbruck, Dieter Fensel is a professor)
• Data Linking on the Web (Web of Data) – global networking of knowledge through URI, RDF, and SPARQL (e.g., connecting my calendar with my RSS feeds, my pictures, ...)
• Data Integration over the Web – seamless integration of data based on different conceptual models (e.g., integrating data coming from my two favorite book sellers)
The structure of data integration
Same book in French
Start making queries…
• User of data “F” can now ask queries like:
– “give me the title of the original”
– well, … « donne-moi le titre de l'original »
• This information is not in the dataset “F”…
• …but can be retrieved by merging with dataset “A”!
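A minimal sketch of this merge, assuming each dataset is a set of (subject, predicate, object) triples. The `ex:` identifiers, the titles, and the property names `a:title`, `f:titre` and `f:original` are hypothetical placeholders, not the actual contents of datasets “A” and “F”.

```python
# Dataset "A": the original book (identifiers and values are hypothetical).
dataset_a = {
    ("ex:book-original", "a:title", "An Example Title"),
    ("ex:book-original", "a:author", "ex:author-1"),
}

# Dataset "F": the French translation, which refers to the original by its identifier.
dataset_f = {
    ("ex:book-french", "f:titre", "Un titre d'exemple"),
    ("ex:book-french", "f:original", "ex:book-original"),
}

def objects(graph, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}

def title_of_original(graph, book):
    """Follow f:original from the book, then read a:title of what it points to."""
    titles = set()
    for original in objects(graph, book, "f:original"):
        titles |= objects(graph, original, "a:title")
    return titles

# Because both datasets use the same identifier for the original,
# merging them is a plain set union.
merged = dataset_a | dataset_f

print(title_of_original(dataset_f, "ex:book-french"))  # nothing found in "F" alone
print(title_of_original(merged, "ex:book-french"))     # found after the merge
```

The merge works only because both datasets name the original book with the same identifier; that shared naming is what URIs provide on the real Web.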
However, more can be achieved…
• We “feel” that a:author and f:auteur should be the same
• But an automatic merge does not know that!
• Let us add some extra information to the merged data:
– a:author same as f:auteur
– both identify a “Person”
– a term that a community may have already defined:
• a “Person” is uniquely identified by his/her name and, say, homepage
• can be used as a “category” for certain types of resources
Start making richer queries!
• User of dataset “F” can now query:
– « donne-moi la page d'accueil de l'auteur de l'original »
– well… “give me the home page of the original's ‘auteur’”
• The information is not in datasets “F” or “A”, but was made available by:
– merging datasets “A” and “F”
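The effect of the extra “glue” can be sketched as follows. Everything here is an assumption made for illustration: the `ex:` identifiers, the homepage URL, the `a:homepage` property, and the `with_glue` helper, which simply copies each triple under the equivalent property name.

```python
# Merged data from "A" and "F" (all identifiers and values are hypothetical).
merged = {
    ("ex:book-original", "a:author", "ex:author-1"),
    ("ex:author-1", "a:homepage", "http://example.org/author-1"),
    ("ex:book-french", "f:original", "ex:book-original"),
}

# Extra statement: a:author is the same as f:auteur.
same_property = {("a:author", "f:auteur")}

def with_glue(graph, equivalences):
    """Copy each triple under every equivalent property name."""
    result = set(graph)
    for left, right in equivalences:
        for s, p, o in graph:
            if p == left:
                result.add((s, right, o))
            elif p == right:
                result.add((s, left, o))
    return result

def follow(graph, starts, predicate):
    """Objects reachable from any start node over the given predicate."""
    return {o for s, p, o in graph if s in starts and p == predicate}

def homepage_of_originals_auteur(graph, book):
    originals = follow(graph, {book}, "f:original")
    auteurs = follow(graph, originals, "f:auteur")  # the user asks in "F" terms
    return follow(graph, auteurs, "a:homepage")

print(homepage_of_originals_auteur(merged, "ex:book-french"))
print(homepage_of_originals_auteur(with_glue(merged, same_property), "ex:book-french"))
```

Without the glue the query finds nothing, because the original book only has an `a:author`; with it, the same query reaches the homepage.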
Combine with different datasets
• Using, e.g., the “Person”, the dataset can be combined with other sources
• For example, data in Wikipedia can be extracted using dedicated tools
– e.g., the “DBpedia” project can extract the “infobox” information from Wikipedia
So where is the Semantic Web?
• The Semantic Web provides technologies to make such integration possible!
• Hopefully you get a full picture at the end of the tutorial…
Semantic Web technology stack as a framework
Semantic Web Technologies
– Hypertext Web technologies
– Standardized Semantic Web technologies
– Unrealized Semantic Web technologies
Hypertext Web technologies
• Internationalized Resource Identifier (IRI) – a generalization of URI. The Semantic Web needs unique identification to allow provable manipulation of resources in the top layers.
• Unicode – the Semantic Web should also help to bridge documents in different human languages, so it should be able to represent them.
• XML – a markup language that enables the creation of documents composed of structured data. The Semantic Web gives meaning (semantics) to structured data.
• XML Namespaces – provide a way to use markup from multiple sources. The Semantic Web is about connecting data together, so a single document needs to be able to refer to several sources.
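As a small illustration of XML Namespaces, the sketch below parses one document that mixes markup from two sources using Python's standard `xml.etree.ElementTree` module. The two namespace URIs and the element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# One document that uses elements from two (hypothetical) vocabularies.
xml_text = """
<catalog xmlns:a="http://example.org/vocab-a#"
         xmlns:f="http://example.org/vocab-f#">
  <a:title>An Example Title</a:title>
  <f:titre>Un titre d'exemple</f:titre>
</catalog>
"""

root = ET.fromstring(xml_text)

# ElementTree expands each prefix to its full namespace URI, so elements
# from different sources stay distinguishable even if their local names clash.
tags = [child.tag for child in root]
for tag in tags:
    print(tag)
```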
Standardized Semantic Web technologies
• Resource Description Framework (RDF) – a framework for creating statements in the form of so-called triples. It makes it possible to represent information about resources in the form of a graph.
• RDF Schema (RDFS) – provides a basic vocabulary for RDF. Using RDFS it is possible, for example, to create hierarchies of classes and properties.
• Web Ontology Language (OWL) – allows stating additional constraints, such as cardinality, restrictions of values, or characteristics of properties such as transitivity. It is based on description logic and so brings reasoning power to the Semantic Web.
• SPARQL – an RDF query language. It can be used to query any RDF-based data (i.e., including statements involving RDFS and OWL). A query language is necessary to retrieve information for Semantic Web applications.
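To give a feel for what a query over triples does, here is a toy pattern matcher in plain Python. It is not SPARQL and not a real library API: `match_pattern` and `select` are hypothetical helpers, and the data reuses names that appear in the later examples of this deck. Terms starting with "?" act as variables.

```python
graph = {
    ("WillSmith", "type", "Actor"),
    ("WillSmith", "starsIn", "MenInBlack"),
    ("MenInBlack", "type", "Film"),
}

def match_pattern(graph, pattern, binding):
    """Yield extensions of `binding` under which `pattern` matches a triple."""
    for triple in graph:
        new = dict(binding)
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable must keep the value it was bound to earlier.
                if new.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield new

def select(graph, patterns):
    """Join several triple patterns, carrying variable bindings along."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in match_pattern(graph, pattern, b)]
    return bindings

# Roughly what a SPARQL query of this shape would ask:
#   SELECT ?who ?film WHERE { ?who :type :Actor . ?who :starsIn ?film . }
results = select(graph, [("?who", "type", "Actor"), ("?who", "starsIn", "?film")])
print(results)
```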
Unrealized Semantic Web technologies
• RIF or SWRL – will bring support for rules. This is important, for example, to allow describing relations that cannot be directly described using the description logic used in OWL.
• Cryptography – is important to ensure and verify that Semantic Web statements come from a trusted source. This can be achieved by an appropriate digital signature of RDF statements.
– Trust in derived statements will be supported by (a) verifying that the premises come from a trusted source and (b) relying on formal logic when deriving new information.
• User interface – the final layer that will enable humans to use Semantic Web applications.
RDF
• RDF was designed to provide a common way to describe information so it can be read and understood by computer applications.
• RDF descriptions are not designed to be displayed on the web.
• RDF documents are commonly written in XML. The XML syntax used for RDF is called RDF/XML; other serializations, such as Turtle, also exist.
RDF
• By using XML, RDF information can easily be exchanged between different types of computers using different types of operating systems and application languages
• RDF uses Web identifiers (URIs) to identify resources.
• RDF describes resources with properties and property values.
RDF
• Explanation of Resource, Property, and Property value:
– A Resource is anything that can have a URI, such as "http://www.w3schools.com/rdf"
– A Property is a Resource that has a name, such as "author" or "homepage"
– A Property value is the value of a Property, such as "Jan Egil Refsnes" or "http://www.w3schools.com" (note that a property value can be another resource)
RDF
• RDF Statements
– The combination of a Resource, a Property, and a Property value forms a Statement (known as the subject, predicate and object of a Statement).
• Statement: "The author of http://www.w3schools.com/rdf is Jan Egil Refsnes".
– The subject of the statement above is: http://www.w3schools.com/rdf
– The predicate is: author
– The object is: Jan Egil Refsnes
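The statement above could be written in RDF/XML and read back as triples roughly as follows. This is a sketch under assumptions: the `ex:` namespace for the properties is hypothetical, and `simple_triples` handles only this flat `rdf:Description` shape, not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The property namespace http://example.org/terms# is a hypothetical placeholder.
rdf_xml = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">
  <rdf:Description rdf:about="http://www.w3schools.com/rdf">
    <ex:author>Jan Egil Refsnes</ex:author>
    <ex:homepage rdf:resource="http://www.w3schools.com"/>
  </rdf:Description>
</rdf:RDF>
"""

def simple_triples(text):
    """Extract (subject, predicate, object) from flat rdf:Description blocks."""
    triples = []
    root = ET.fromstring(text)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # The object is either another resource or a literal text value.
            value = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            triples.append((subject, prop.tag, value))
    return triples

triples = simple_triples(rdf_xml)
for subject, predicate, obj in triples:
    print(subject, predicate, obj)
```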
Limitation of RDF
• This is all that RDF (together with RDF Schema) can describe:
– type
– subClassOf
– subPropertyOf
– range
– domain
– label
– comment
Examples
• type – a resource belongs to a certain class
– <WillSmith> <type> <Actor>
– This defines which properties will be relevant to Will Smith
• subClassOf – a class belongs to a parent class
– <Actor> <subClassOf> <Person>
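Taken together, the two statements let a program conclude that Will Smith is also a Person. A minimal sketch of that step, assuming triples are plain Python tuples and using a hypothetical `infer_types` helper:

```python
triples = {
    ("WillSmith", "type", "Actor"),
    ("Actor", "subClassOf", "Person"),
}

def infer_types(graph):
    """Add (x, type, Parent) whenever x has type C and C is a subClassOf Parent."""
    inferred = set(graph)
    changed = True
    while changed:  # repeat so that chains of subclasses are followed too
        changed = False
        for s, p, cls in list(inferred):
            if p != "type":
                continue
            for sub, q, parent in list(inferred):
                if q == "subClassOf" and sub == cls and (s, "type", parent) not in inferred:
                    inferred.add((s, "type", parent))
                    changed = True
    return inferred

print(("WillSmith", "type", "Person") in infer_types(triples))
```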
OWL = Web Ontology Language
Ontologies
• Ontologies?
– Definition and classification of concepts and entities, and the relationships between them.
• Provide a mechanism for defining the relationships among different words and, for the Semantic Web, relationships among different resources
Ontologies
OWL is based on the basic elements of RDF; it adds more vocabulary for describing properties and classes.
• Relationships between classes (ex: disjointWith)
• Equality (ex: sameAs)
• Richer properties (ex: symmetrical)
• Class property restrictions (ex: allValuesFrom)
Ontologies
Relationships between Classes
• disjointWith – resources belonging to one class cannot belong to the other
<Person> <disjointWith> <Country>
• complementOf – the members of one class are all the resources that do not belong to the other
<InanimateThings> <complementOf> <LivingThings>
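One practical use of disjointWith is spotting contradictory data. The sketch below assumes triples are plain tuples; the resource `Atlantis` and the `disjoint_violations` helper are made up for illustration.

```python
graph = {
    ("Person", "disjointWith", "Country"),
    ("WillSmith", "type", "Person"),
    ("Atlantis", "type", "Person"),   # hypothetical, deliberately inconsistent data
    ("Atlantis", "type", "Country"),
}

def disjoint_violations(graph):
    """Return resources that belong to two classes declared disjoint."""
    types = {}
    for s, p, o in graph:
        if p == "type":
            types.setdefault(s, set()).add(o)
    violations = set()
    for a, p, b in graph:
        if p == "disjointWith":
            for resource, classes in types.items():
                if a in classes and b in classes:
                    violations.add(resource)
    return violations

print(disjoint_violations(graph))
```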
Ontologies
Equality
• sameAs – indicates that two resources actually refer to the same real-world thing or concept
<wills> <sameAs> <wismith>
• equivalentClass – indicates that two classes have the same set of members
<CoopBoardMembers> <equivalentClass> <CoopResidents>
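A sameAs statement lets facts recorded under either name be treated as facts about one thing. The following is a rough sketch, not a standard algorithm from any OWL tool: `merge_same_as` is a hypothetical helper that rewrites every name to one representative of its sameAs group, and the extra facts about `wills` and `wismith` are invented for the example.

```python
graph = {
    ("wills", "sameAs", "wismith"),
    ("wills", "type", "Actor"),
    ("wismith", "starsIn", "MenInBlack"),
}

def canonical_names(graph):
    """Return a function mapping each name to one representative of its sameAs group."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x

    for s, p, o in graph:
        if p == "sameAs":
            root_s, root_o = find(s), find(o)
            if root_s != root_o:
                # Pick the alphabetically smaller name as the representative.
                parent[max(root_s, root_o)] = min(root_s, root_o)
    return find

def merge_same_as(graph):
    """Rewrite all triples to use representative names and drop the sameAs links."""
    find = canonical_names(graph)
    return {(find(s), p, find(o)) for s, p, o in graph if p != "sameAs"}

print(merge_same_as(graph))
```

After the rewrite, both facts are attached to the single name `wills`.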
Ontologies
Richer Properties
• Symmetric – a relationship between A and B is also true between B and A
<WillSmith> <marriedTo> <JadaPinkettSmith>
implies <JadaPinkettSmith> <marriedTo> <WillSmith>
• Transitive – a relationship between A and B and between B and C is also true between A and C
<piston> <isPartOf> <engine>
<engine> <isPartOf> <automobile>
implies <piston> <isPartOf> <automobile>
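Both kinds of property can be applied mechanically by repeating the two rules until no new triple appears. A minimal sketch, assuming the sets `SYMMETRIC` and `TRANSITIVE` stand in for the property declarations an ontology would supply:

```python
SYMMETRIC = {"marriedTo"}    # assumed to be declared symmetric
TRANSITIVE = {"isPartOf"}    # assumed to be declared transitive

graph = {
    ("WillSmith", "marriedTo", "JadaPinkettSmith"),
    ("piston", "isPartOf", "engine"),
    ("engine", "isPartOf", "automobile"),
}

def close(graph):
    """Apply the symmetric and transitive rules until nothing new is derived."""
    facts = set(graph)
    while True:
        new = set()
        for s, p, o in facts:
            if p in SYMMETRIC:
                new.add((o, p, s))
            if p in TRANSITIVE:
                for s2, p2, o2 in facts:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= facts:  # fixed point reached
            return facts
        facts |= new

closed = close(graph)
print(sorted(closed - graph))
```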
Ontologies
Richer Properties continued
• inverseOf – a relationship of type X between A and B implies a relationship of type Y between B and A
<starsIn> <inverseOf> <hasStar>
<MenInBlack> <hasStar> <WillSmith>
implies <WillSmith> <starsIn> <MenInBlack>
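The same implication as a short sketch, with triples as plain tuples and a hypothetical `apply_inverses` helper:

```python
graph = {
    ("starsIn", "inverseOf", "hasStar"),
    ("MenInBlack", "hasStar", "WillSmith"),
}

def apply_inverses(graph):
    """For each declared inverse pair, add the reversed triple."""
    inverse = {}
    for s, p, o in graph:
        if p == "inverseOf":
            inverse[s] = o
            inverse[o] = s  # inverseOf itself works in both directions
    derived = {(o, inverse[p], s) for s, p, o in graph if p in inverse}
    return set(graph) | derived

result = apply_inverses(graph)
print(("WillSmith", "starsIn", "MenInBlack") in result)
```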
Ontologies
Inferences
• Create new triples based on existing triples
• Deduce new facts based on the stated facts
<piston> <isPartOf> <engine>
<engine> <isPartOf> <automobile>
implies <piston> <isPartOf> <automobile>