10/18/2015 Page 1
Introduction to Semantic Web Design
B. Ramamurthy
04/20/23Page 2
Introduction
• The Web in its current form is an application on the Internet that delivers information. Ex: browsing daily news
• Current applications involving the web integrate data and information. Ex: online shopping
• The next-generation web is expected to integrate a variety of resources and devices and to support knowledge sharing among machines.
  – Exploit the economies of scale made possible by machine processing of knowledge.
• How do we tell machines about resources and specify concepts? How can machines acquire knowledge? How can they share knowledge with each other? How do we enable them to make decisions based on it?
• We need to specify resources, concepts, knowledge and other artifacts used in human decision making in a form usable by machines.
• Machines can then integrate and analyze information, make decisions, and collect knowledge.
• In this lecture we will examine the technology, tools, frameworks, and applications enabling the next-generation web, the semantic web.
• We will also discuss a real semantic web application: an intelligent search engine serving municipal services (Chapter 4).
References for today’s discussion
1. W3Schools tutorials (http://www.w3schools.com)
2. Taxonomies and the semantic web by Alistair Miles, CISTRANA workshop, Feb 2006, Rutherford Appleton Lab
HTML, XML, RDF, and OWL
• HTML:
  – HTML stands for Hyper Text Markup Language
  – An HTML file is a text file containing small markup tags
  – The markup tags tell the Web browser how to display the page
• XML:
  – XML stands for eXtensible Markup Language
  – XML is a markup language much like HTML
  – XML was designed to carry data, not to display data
  – XML tags are not predefined. You must define your own tags
  – XML is designed to be self-descriptive
  – XML is a W3C Recommendation
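To make the "carry data, define your own tags" point concrete, here is a minimal Python sketch that reads a small XML document with the standard library. The tag names (`services`, `service`, `name`, `channel`) and the sample values are invented for illustration; they do not come from any standard vocabulary or from the lecture.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the tag names below are our own invention,
# which XML allows because its tags are not predefined.
DOC = """
<services>
  <service id="s1">
    <name>Special collection of large items</name>
    <channel>online</channel>
  </service>
  <service id="s2">
    <name>Change of address</name>
    <channel>online</channel>
  </service>
</services>
"""

def load_services(xml_text):
    """Return a list of dicts, one per <service> element."""
    root = ET.fromstring(xml_text.strip())
    return [
        {
            "id": svc.get("id"),
            "name": svc.findtext("name"),
            "channel": svc.findtext("channel"),
        }
        for svc in root.findall("service")
    ]

services = load_services(DOC)
for svc in services:
    print(svc["id"], "->", svc["name"])
```

Nothing in the document says how the data should be displayed; the program decides what the tags mean, which is the contrast with HTML drawn above.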
HTML, XML… RDF, ..
• RDF:
  – RDF stands for Resource Description Framework
  – RDF is a framework for describing resources on the web
  – RDF provides a model for data, and a syntax so that independent parties can exchange and use it
  – RDF is designed to be read and understood by computers
  – RDF is not designed for being displayed to people
  – RDF is commonly written in XML (RDF/XML)
  – RDF is a part of the W3C's Semantic Web Activity
  – RDF is a W3C Recommendation
• Let's discuss the details.
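As one way into those details, the sketch below pulls (subject, predicate, object) triples out of a tiny RDF/XML document using only the Python standard library. The `example.org` URIs and the `ex:` vocabulary are made up for illustration, and `extract_triples` handles only the simple `rdf:Description` pattern shown here; it is not a general RDF parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical RDF/XML: the example.org URIs and ex: terms are invented.
DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">
  <rdf:Description rdf:about="http://example.org/service/large-items">
    <ex:title>Special collection of large items</ex:title>
    <ex:providedBy rdf:resource="http://example.org/city/zaragoza"/>
  </rdf:Description>
</rdf:RDF>
"""

def extract_triples(rdf_xml):
    """Extract triples from simple rdf:Description blocks only."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree expands tags to "{namespace}local"; join them
            # back into a single predicate URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            # The object is either another resource or a literal string.
            obj = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples

triples = extract_triples(DOC)
for t in triples:
    print(t)
```

Each triple is one statement about a resource, which is the data model RDF provides independently of the XML syntax used to write it down.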
HTML…OWL
• OWL:
  – OWL stands for Web Ontology Language
  – OWL is built on top of RDF
  – OWL is for processing information on the web
  – OWL was designed to be interpreted by computers
  – OWL was not designed for being read by people
  – OWL is commonly written in XML
  – OWL has three sublanguages (OWL Lite, OWL DL, and OWL Full)
  – OWL is a web standard
Web ontology
[Figure labels: Natural language (Ex: English); Programming language (Ex: Pascal); Natural language ontology; Web ontology]
A programming language is a language with strict syntax for expressing algorithms (steps) for execution by a computing device.
A web ontology is for expressing web-related concepts. Web Ontology Language (OWL) is a technology for accomplishing this. Protégé-OWL is a tool that implements OWL.
Taxonomy and web ontology
• Taxonomy is the science of classification. F: Taxonomy
• Ontology is a specification of a conceptualization. F: Ontology
• XML allows for meaningful tags. T: XML
• Resource Description Framework is an XML language for defining resources on the web (WWW). T: RDF
• Web Ontology Language (OWL) T: OWL
• RDF is an assertional language intended to be used to express propositions using precise formal vocabularies, particularly those specified using RDFS [RDF-VOCABULARY], for access and use over the World Wide Web, and is intended to provide a basic foundation for more advanced assertional languages with a similar purpose. The overall design goals emphasize generality and precision in expressing propositions about any topic, rather than conformity to any particular processing model.
RDF and OWL
• OWL is a semantic extension of RDF: it allows for specification of logical dependencies between information structures. (as defined by Miles: ref 2)
• OWL works on structured information.
• RDF is for structuring information.
• OWL is an information model.
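A rough way to picture this split: RDF supplies the structured facts, and an OWL declaration adds a logical dependency from which new facts follow. The sketch below models triples as plain Python tuples and applies a single OWL-style rule, transitivity. The `ex:` names are hypothetical, and a real OWL reasoner covers far more than this one rule.

```python
# Facts structured as (subject, predicate, object) triples, RDF style.
# All ex: names here are hypothetical.
facts = {
    ("ex:Street1", "ex:partOf", "ex:OldTown"),
    ("ex:OldTown", "ex:partOf", "ex:Zaragoza"),
    # OWL-style declaration: ex:partOf is a transitive property.
    ("ex:partOf", "rdf:type", "owl:TransitiveProperty"),
}

def infer_transitive(triples):
    """Return the triples plus everything implied by transitive properties."""
    transitive = {
        s for (s, p, o) in triples
        if p == "rdf:type" and o == "owl:TransitiveProperty"
    }
    inferred = set(triples)
    changed = True
    while changed:  # repeat until no new triple can be derived
        changed = False
        for (a, p, b) in list(inferred):
            if p not in transitive:
                continue
            for (b2, p2, c) in list(inferred):
                if p2 == p and b2 == b and (a, p, c) not in inferred:
                    inferred.add((a, p, c))
                    changed = True
    return inferred

closure = infer_transitive(facts)
print(("ex:Street1", "ex:partOf", "ex:Zaragoza") in closure)
```

The statement that Street1 is part of Zaragoza was never asserted; it follows from the structured facts plus the declared dependency.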
Intelligent Search Engine for online access to municipal services (Ch 4): problem definition
• Citizens can perform 80% of the city services from home.
• When somebody is looking for a service, they must be able to locate it easily.
• You can collect, categorize and list all the services (a taxonomy).
• However, searching through this list with traditional search engines may not yield the expected results.
  – Search results are based on the description of the services and the co-occurrence of words in the query.
  – Ex: A citizen who wants to dispose of a washing machine would have to search for “special collection of large items”.
• We cannot force citizens to learn government language.
• When a service is looked up, a set of related services should be made available.
• The search engine is a first step in the roadmap to citizen self-service.
Zaragoza Municipal services roadmap (Fig. 4.1)
[Figure 4.1 labels: Positioning; Interface; Functionality; Content Scope; Technology; Intelligent search engine; Citizen channels; Citizen self-service]
Application of semantic web
• Zaragoza used the semantic web in three ways:
1. Statistical approach to interpreting citizen requests. (fig. 4.3)
2. Enhanced keyword-based approach to interpreting citizen requests. (fig. 4.4)
3. Applying semantic distance to interpreting citizen requests. (fig. 4.5)
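The slides do not say how semantic distance is computed. One common reading, used here purely as an assumption, is the length of the shortest path between two concepts in a graph of related concepts. The concept names and edges below are invented for illustration.

```python
from collections import deque

# Hypothetical concept graph: each edge links two related concepts.
EDGES = [
    ("washing machine", "household appliance"),
    ("household appliance", "large item"),
    ("large item", "special collection of large items"),
    ("sofa", "furniture"),
    ("furniture", "large item"),
]

def build_graph(edges):
    """Build an undirected adjacency map from a list of edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def semantic_distance(graph, start, goal):
    """Number of edges on the shortest path, or None if unconnected."""
    if start not in graph or goal not in graph:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:  # breadth-first search visits nearer concepts first
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

graph = build_graph(EDGES)
print(semantic_distance(graph, "washing machine",
                        "special collection of large items"))
```

Under this reading, a smaller distance means the citizen's term is more closely related to a service, even when the two share no words.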
Usage of the three methods
• The first approach is the cheapest and consumes the fewest resources; the semantic web approach is the most expensive.
• The Zaragoza architecture arranges the three in a pipeline in which each stage is triggered only when the previous stage did not produce satisfactory results.
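The pipeline idea can be sketched in a few lines of Python. The three stage functions below are hypothetical stand-ins that return canned results, and "satisfactory" is assumed to mean "non-empty"; the real system's criterion is not given in the slides.

```python
def run_pipeline(query, stages, is_satisfactory):
    """Try each (name, stage) in order; stop at the first satisfactory result."""
    for name, stage in stages:
        results = stage(query)
        if is_satisfactory(results):
            return name, results
    return None, []

# Hypothetical stand-ins for the three approaches, cheapest first.
def statistical(query):
    return []  # pretend the statistical stage finds nothing

def enhanced_keyword(query):
    return ["special collection of large items"] if "washing" in query else []

def semantic_distance_stage(query):
    return ["general information desk"]

STAGES = [
    ("statistical", statistical),
    ("enhanced-keyword", enhanced_keyword),
    ("semantic-distance", semantic_distance_stage),
]

def non_empty(results):
    """Assumed satisfaction test: any result at all counts."""
    return len(results) > 0

print(run_pipeline("dispose of a washing machine", STAGES, non_empty))
```

Ordering the stages from cheapest to most expensive means the costly semantic stage runs only for the queries the cheaper stages could not handle.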
How does it work?
• Traditional search engines retrieve documents based on occurrences of keywords, whereas the Zaragoza SOA (ZS) has an understanding of its services, information and data.
• ZS knows that persons can change addresses, car owners pay taxes, construction work requires permits, building bars near schools is not good, etc.
• All this information is stored in an ontology: a computer-understandable description of what e-services are.
• This ontology allows ZS to understand citizens' queries and thus return meaningful results.
• ZS also uses natural language understanding software to translate citizens' free-text queries into the ontology. (see fig. 4.6)
Citizen-city government interaction (Fig. 4.6 modified)
[Figure labels, in order: Natural language query; Knowledge Tagger (KT), NLP; ZS domain ontology; Semantic query; Semantic Distance Analyzer (SD); Result]
Search vs. Intelligent Search
• Search:
  – Searches for keywords
  – Results in a ranked list of documents
  – Users need to invest time and effort to filter the right piece of information out of the overall results
• Intelligent search:
  – Searches for keywords and semantic concepts
  – Results in the actual relevant documents
  – Perceived as a search engine that understands the user
ZS Domain Ontology
• Development of an ontology starts with a detailed study of the services offered by the city.
• The objective is to extract all relevant terms belonging to this domain from existing documents.
• The ZS ontology contains four main classes: agent, process, event, object
ZS Domain Ontology (contd.)
• Agent: entity participating in an action
• Process: a series of actions that a citizen can do using the online services offered by the city government.
• Event: any social gathering or activity.
• Object: any entity that exists in the city which can be used for or by a service offered by the city government.
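As a rough illustration of these four classes, the sketch below models them as Python dataclasses. The docstrings follow the definitions above; the fields, and the idea that a process links to its agents and objects, are assumptions made for this example and are not taken from the ZS ontology itself.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Agent:
    """Entity participating in an action."""
    name: str

@dataclass
class CityObject:
    """Entity in the city that can be used for or by a city service."""
    name: str

@dataclass
class Event:
    """Social gathering or activity."""
    name: str

@dataclass
class Process:
    """Series of actions a citizen can do through the online services."""
    name: str
    # Assumed relations: which agents and objects a process involves.
    agents: List[Agent] = field(default_factory=list)
    objects: List[CityObject] = field(default_factory=list)

disposal = Process(
    name="special collection of large items",
    agents=[Agent("citizen")],
    objects=[CityObject("washing machine")],
)
print(disposal)
```

In an actual OWL ontology these would be classes and properties rather than Python types; the sketch only shows how the four categories divide up the domain.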
Using the ontology
• The approach is to establish a semantic similarity between a question provided by a citizen and the FAQs already available.
• The ontology needs to be complete, so that it contains all the terms necessary to satisfy the requests.
• The ontology is supplemented with a number of thesauri to identify synonyms. Ex: baby and infant
• Context information is used to resolve any ambiguity.
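The sketch below shows the thesaurus idea with a deliberately simple similarity measure. Jaccard overlap of term sets is a stand-in chosen for this example, not the measure ZS uses, and the thesaurus entries and FAQ texts are invented.

```python
# Hypothetical thesaurus: maps each synonym to a canonical term.
THESAURUS = {"infant": "baby", "newborn": "baby", "relocate": "move"}

# Hypothetical FAQ entries.
FAQS = [
    "how do i register a baby",
    "how do i move to a new address",
]

def normalize(text):
    """Lowercase, split on whitespace, and map synonyms to canonical terms."""
    return {THESAURUS.get(tok, tok) for tok in text.lower().split()}

def similarity(a, b):
    """Jaccard similarity of the two normalized term sets."""
    ta, tb = normalize(a), normalize(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def best_faq(question, faqs):
    """Return the FAQ most similar to the citizen's question."""
    return max(faqs, key=lambda faq: similarity(question, faq))

print(best_faq("register an infant", FAQS))
```

Without the thesaurus, "infant" and "baby" would share no terms; normalizing synonyms first is what lets the question match the FAQ.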
Natural Language Processing for ZS
• The knowledge tagger automatically annotates text according to the domain ontology.
• It uses a series of linguistic analyzers, sentence splitters, simple tokenizers, spell checkers and morphological databases.
• The outcome of this analysis is an annotated text equivalent of the query.
• The query is then synthesized in terms of the domain ontology: RDQL, SPARQL, … SQL
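These steps can be sketched end to end in Python: tokenize the free text, annotate it with ontology concepts, and synthesize a query string. Everything named here is hypothetical, including the `CONCEPTS` mapping, the `ex:` prefix, and the `ex:relatedTo` predicate. The SPARQL text is only built as a string, not executed.

```python
import re

# Hypothetical mapping from surface terms to ontology concepts.
CONCEPTS = {
    "washing machine": "ex:LargeItem",
    "dispose": "ex:DisposalProcess",
    "permit": "ex:PermitProcess",
}

def tokenize(text):
    """Simple tokenizer: lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def annotate(text):
    """Return the ontology concepts whose terms appear in the text."""
    normalized = " ".join(tokenize(text))
    found = []
    for term, concept in CONCEPTS.items():
        if re.search(rf"\b{re.escape(term)}\b", normalized):
            found.append(concept)
    return found

def to_sparql(concepts):
    """Build a SPARQL query string for services linked to every concept."""
    patterns = "\n".join(f"  ?service ex:relatedTo {c} ." for c in concepts)
    return (
        "PREFIX ex: <http://example.org/terms#>\n"
        "SELECT ?service WHERE {\n" + patterns + "\n}"
    )

concepts = annotate("I want to dispose of my washing machine")
print(to_sparql(concepts))
```

A real knowledge tagger would also need the spell checking and morphological analysis listed above; exact term matching is used here only to keep the sketch short.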
Semantic Annotation of city services
• Collect and index the information about services
• Semantic processing results in ontological entities: concepts, instances, attributes, and relations
• The output of this process is a set of semantically described services that can be checked against citizens' queries.
Overall Architecture of ZS
[Figure labels: Search clients; Search Systems web services; Ontology Systems; Ontology cache; Ontology Subsystem web services; NLP systems; NLP cache; NLP subsystem web services; Persistence; RDBMS]
Summary
• Zaragoza is a powerful SOA that uses semantic knowledge to better serve its citizens.
• Its roadmap is open, with the ability to extend the system through its web services (WS) interface.