Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December...

82
Semantic Semantic Information Information Systems Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004

Transcript of Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December...

Page 1: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Semantic Semantic Information Information

SystemsSystemsTim Finin, Anupam Joshi,

Charles Nicholas and Tim Oates

17 December 2004

Page 2: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Agenda

10:00 - 10:10 Arrival, agenda (Nicholas)

10:10 - 10:20 Introduction (Finin)

10:20 - 10:30 Semantic web and IR (Finin)

10:30 - 10:40 Trust, provenance & policy (Joshi)

10:40 - 10:50 Relational learning on the semantic web (Oates)

10:50 - 12:00 Discussion (Nicholas)

Page 3: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Semantic Information SystemsA new generation of information systems that include semantics in a more fundamental way, e.g.,

– Computer understandable metadata– Standards for expressing knowledge as well as data– Machine learning for knowledge discovery and adaptation– Automated service discovery, composition & invocation– Cooperating agent based systems

Why now?– Metcalfe’s Law and the web– Semantic web languages– Maturing learning technologies

Is this the endgame?– Hardly, it’s the opening up of a new approach toward the

familiar long term goal of building intelligent systems

Page 4: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Introduction:Introduction:Semantic WebSemantic Web

Page 5: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

“XML is Lisp's bastard nephew, with uglier syntax and no semantics. Yet XML is poised to enable the creation of a Web of data that dwarfs anything since the Library at Alexandria.”

-- Philip Wadler, Et tu XML? The fall of the relational empire, VLDB, Rome, September 2001.

Page 6: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

“The web has made people smarter. We need to understand how to use it to make machines smarter, too.”

-- Michael I. Jordan, paraphrased from a talk at AAAI, July 2002 by Michael Jordan (UC Berkeley)

Page 7: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

“The Semantic Web will globalize KR, just as the WWW globalize hypertext”

-- Tim Berners-Lee

Page 8: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

“The multi-agent systems paradigm and the web both emerged around 1990. One has succeeded beyond imagination and the other has not yet made it out of the lab.”

-- Anonymous, 2001

Page 9: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

tell

register

Page 10: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Origins of the Semantic Web• TBL’s vision for the web (89)

• Guha’s MCF (~94)

• XML+MCF=>RDF (~96)

• RDF+OO=>RDFS (~99)

• RDFS+KR=>DAML+OIL (00)

• W3C’s SW activity (01)

• W3C’s OWL (03)

• ~2M SWDs on web (04)

http://www.w3.org/History/1989/proposal.html TBL

Page 11: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

TBL’s semantic web vision“The Semantic Web will globalize KR, just as the WWW globalize hypertext” -- Tim Berners-Lee

we arehere

Page 12: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

RDF is the first SW language

<rdf:RDF ……..> <….> <….></rdf:RDF>

XML Encoding Graph

stmt(docInst, rdf_type, Document)stmt(personInst, rdf_type, Person)stmt(inroomInst, rdf_type, InRoom)stmt(personInst, holding, docInst)stmt(inroomInst, person, personInst)

Triples

RDFData Model

Good for Machine

Processing

Good For HumanViewing

Good For Reasoning

RDF is a simple language for building graph based representations

Page 13: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Simple RDF Example

http://umbc.edu/~finin/talks/idm02/

“Intelligent Information Systemson the Web and in the Aether”

http://umbc.edu/

dc:Title

dc:Creator

bib:Aff

“Tim Finin” “[email protected]

bib:namebib:email

Page 14: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

XML encoding for RDF<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"

xmlns:dc="http://purl.org/dc/elements/1.1/"

xmlns:bib="http://daml.umbc.edu/ontologies/bib/">

<description about="http://umbc.edu/~finin/talks/idm02/"> <dc:title>Intelligent Information Systems on the Web and in the

Aether</dc:Title> <dc:creator> <description> <bib:Name>Tim Finin</bib:Name> <bib:Email>[email protected]</bib:Email> <bib:Aff resource="http://umbc.edu/" /> </description> </dc:Creator></description></rdf:RDF>

Page 15: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

RDF Schema (RDFS)

• RDF Schema adds taxonomies forclasses & properties– subClass and subProperty

• and some metadata.– domain and range

constraints on properties

• Several widely usedKB tools can importand export in RDFS

Stanford Protégé KB editorStanford Protégé KB editor• Java, open sourced• extensible, lots of plug-ins• provides reasoning & server

capabilities

Page 16: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

From RDF to OWL• An OWL ontology is a set of RDF statements

– OWL defines semantics for certain statements

– Does NOT restrict what can be said -- documents can include arbitrary RDF

– But no OWL semantics for non-OWL statements

• Adds capabilities common to description logics:– cardinality constraints, defined classes (=> classification), equivalence,

local restrictions, disjoint classes, etc.

• More support for ontologies– Ontology imports ontology, versioning, …

• But not (yet) variables, quantification, & rules• A complete OWL reasoning is significantly more

complex than a complete RDFS reasoner.

Page 17: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

OWL in One Slide<rdf:RDF xmlns:rdf ="http://w3.org/22-rdf-syntax-ns#"

xmlns:rdfs=http://w3.org/rdf-schema#>

xmlns:owl="http://www.w3.org/2002/07/owl#”> <owl:Ontology rdf:about=""> <owl:imports rdf:resource="http://owl.org/owl+oil"/></owl:Ontology><owl:Class rdf:ID="Person"> <rdfs:subClassOf rdf:resource="#Animal"/> <rdfs:subClassOf> <owl:Restriction> <owl:onProperty rdf:resource="#hasParent"/> <owl:toClass rdf:resource="#Person"/> </owl:Restriction> </rdfs:subClassOf> <rdfs:subClassOf> <owl:Restriction owl:cardinality="1"> <owl:onProperty rdf:resource="#hasFather"/> </owl:Restriction> </rdfs:subClassOf> </owl:Class><Person rdf:about=“http://umbc.edu/~finin/"> <rdfs:comment>Finin is a person.</rdfs:comment></Person>

OWL is built on top of XML and RDF

It can be used to add metadata about anything which has a URI.

everything has a URI

OWL is ~= a frame based knowledge representation language

It allows the definition, sharing, composition and use of ontologies

URIs are a W3C standard generalizing URLs

Page 18: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

And on to Rules …• Approaches for adding knowledge in the form of

rules have been developed and prototyped and are entering a standardization phase– SWRL; RuleML, log:implies

• Motivation– Extending/augmenting OWL’s expressivity– Supporting policies, negotiation, justification,

argumentation, etc.

Page 19: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

A usecase: FOAF• FOAF (Friend of a Friend) is a simple ontology to describe

people and their social networks.– See the foaf project page: http://www.foaf-project.org/

• We recently crawled the web and discovered over 1,500,000 valid RDF FOAF files.– Most of these are from blogging sites which use foaf to publish

basic user info

– See http://apple.cs.umbc.edu/semdis/wob/foaf/

<foaf:Person><foaf:name>Tim Finin</foaf:name><foaf:mbox_sha1sum>2410…37262c252e</foaf:mbox_sha1sum><foaf:homepage rdf:resource="http://umbc.edu/~finin/" /><foaf:img rdf:resource="http://umbc.edu/~finin/images/passport.gif" />

</foaf:Person>

Page 20: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

FOAF: why RDF? Extensibility!• FOAF vocabulary provides 50+ basic terms for

making simple claims about people • FOAF files can use RDF terms from other ontologies

too: RSS, MusicBrainz, Dublin Core, Wordnet, Creative Commons, blood types, starsigns, …

• RDF guarantees freedom of independent extension– OWL provides fancier data-merging facilities 

• Result: Freedom to say what you like, using any RDF markup you want, and have RDF crawlers merge your FOAF documents with other’s and know when you’re talking about the same entities. 

Page 21: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

No free lunch!Consequence:

• We must plan for lies, mischief, mistakes, stale data, slander

• Dataset is out of control, distributed, dynamic

• Importance of knowing who-said-what – Anyone can describe anyone– We must record data provenance– Modeling and reasoning about trust is critical

• Legal, privacy and etiquette issues emerge

• Welcome to the real world

Page 22: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

From where will the markup come?

• A few authors will add it manually.• More will use annotation tools.

– SMORE: Semantic Markup, Ontology and RDF Editor

• Intelligent processors (e.g., NLP) can understand documents and add markup (hard) – Machine learning powered information extraction tools

show promise

• Lots of web content comes from databases & we can generate SW markup along with the HTML– See http://ebiquity.umbc.edu/

Page 23: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

From where will the markup come?

• In many tools, part of the metadata information is present, but thrown away at output – e.g., a business chart can be generated by a tool…

– …it “knows” the structure, the classification, etc. of the chart

– …but, usually, this information is lost

– …storing it in metadata is easy!

• So “semantic web aware” tools can produce lots of metadata– E.g., Adobe’s use of its XMP platform

Page 24: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Use Cases• Adding machine understandable metadata to

images, documents, software, services, etc– In support of discovering, indexing, and retrieving

• Knowledge sharing and semantic interoperability– Exchanging data and knowledge (rules, definitions) for

negotiation, argumentation, explanations, policies …

• Semantic web/grid/agent services – Annotating service descriptions with semantic info for

discovery, composition, invocation, monitoring, etc.

• Relational learning on the web– Boosted with KR reasoning and constraints

Page 25: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Information Information Retrieval and the Retrieval and the

Semantic WebSemantic Web

Page 26: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Why use IR techniques?• We will want to retrieve over the structured and

unstructured parts of a Semantic Wed Document (SWD)

• We should prepare for the appearance of text documents with embedded SW markup

• We may want to get our SWDs into conventional search engines, such as Google.

• IR techniques also have some unique characteristics that may be very usefule.g., ranking matches, measuring similaritybetween documents, relevance feedback,etc.

Page 27: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

What we have done

• Developed Swoogle – a crawler based retrieval system for SWDs

• Developed and implemented a technique to get Google to index and retrieve SWDs

• Prototyped (twice) an ngram based IR engine for SWDs

• Used these in several demonstration systems

Page 28: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Contributors include Tim Finin, Anupam Joshi, Yun Peng, R. Scott Cost, Jim Mayfield, Joel Sachs, Pavan Reddivari, Vishal Doshi, Rong Pan, Li Ding, and Drew Ogle. Partial research support was provided by DARPA contract F30602-00-0591 and by NSF by awards NSF-ITR-IIS-0326460 and NSF-ITR-IDM-0219649. November 2004.

http://swoogle.umbc.edu/

Swoogle uses four kinds of crawlers to discover semantic web documents and several analysis agents to compute metadata and relations among documents and ontologies. Metadata is stored in a relational DBMS. Services are provided to people and agents.

SWD RankA SWD’s rank is a function of its type (SWO/SWI) and the rank and types of the documents to which it’s related.

SWOs

SWIs

HTMLdocuments

Images

CGI scripts

Audiofiles

Videofiles

The web, like Gaul, is divided into three parts: the regular web (e.g. HTML), Semantic Web Ontologies (SWOs), and Semantic Web Instance files (SWIs)

SWD = SWO + SWI

Swoogle Statistics

Swoogle puts documents into a character n-gram based IR engine to compute document similarity and do retrieval from queries

SWD IR Engine

SWOOGLE 2

SWD Metadata

Web Service

Web Server

SWD Cache

The Web

The WebCandidate

URLs Web Crawler

SWD Reader

IR analyzer SWD analyzer

Human users

Intelligent Agents

discovery

digest

analysis

service

OntologyDictionary

SwoogleSearch

SwoogleStatisticsOntology Dictionary

Swoogle Search

Swoogle provides services to people via a web interface and to agents as web services.

SWDs 336,000 Classes 95,000

Triples 47,000,000 Properties 53,000

Ontologies 4,200 Individuals 7,200,000

Statistics as of November 2004

demodemodemodemo

Page 29: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Swoogle IR Search • This is work in progress, not yet fully integrated

into Swoogle• Documents are put into an ngram IR engine (after

processing by Jena) in canonical XML form– Each contiguous sequence of N characters is used as an

index term (e.g., N=5)– Queries processed the same way

• Character ngrams work almost as well as words but have some advantages– No tokenization, so works well with artificial languages

and agglutinative languages => good for RDF!

Page 30: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Why character n-grams?

• Suppose we want to find ontologies for time

• We might use the following query“time temporal interval point before after during day month year eventually calendar clock duration end begin zone”

• And have matches for documents with URIs like–http://foo.com/timeont.owl#timeInterval–http://foo.com/timeont.owl#CalendarClockInterval –http://purl.org/upper/temporal/t13.owl#timeThing

Page 31: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Another approach: URIs as words

• Remember: ontologies define vocabularies• In OWL, URIs of classes and properties are the

words• So, take a SWD, reduce to triples, extract the

URIs (with duplicates), discard URIs for blank nodes, hash each URI to a token (use MD5Hash), and index the document.

• Process queries in the same way• Variation: include literal data (e.g., strings) too.

Page 32: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Harnessing Google• Google started indexing RDF documents

some time in late 2003• Can we take advantage of this?• We’ve developed techniques to get some

structured data to be indexed by Google• And then later retrieved• Technique: give Google enhanced

documents with additional annotations containing Swangle Terms ™

Page 33: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Swangle definitionswan·gle

Pronunciation: ‘swa[ng]-g&lFunction: transitive verbInflected Forms: swan·gled; swan·gling /-g(&-)li[ng]/Etymology: Postmodern English, from C++ mangle, Date: 20th century1: to convert an RDF triple into one or more IR indexing terms 2: to process a document or query so that its content bearing markup will be indexed by an IR system Synonym: see tblify- swan·gler /-g(&-)l&r/ noun

Page 34: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Swangling• Swangling turns a SW triple into 7 word like terms

– One for each non-empty subset of the three components with the missing elements replaced by the special “don’t care” URI

– Terms generated by a hashing function (e.g., MD5)

• Swangling an RDF document means adding in triples with swangle terms.– This can be indexed and retrieved via conventional search

engines like Google

• Allows one to search for a SWD with a triple that claims “Ossama bin Laden is located at X”

Page 35: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

A Swangled Triple<rdf:RDF xmlns:s="http://swoogle.umbc.edu/ontologies/swangle.owl#"</rdf>

<s:SwangledTriple> <s:swangledText>N656WNTZ36KQ5PX6RFUGVKQ63A</s:swangledText> <rdfs:comment>Swangled text for [http://www.xfront.com/owl/ontologies/camera/#Camera, http://www.w3.org/2000/01/rdf-schema#subClassOf, http://www.xfront.com/owl/ontologies/camera/#PurchaseableItem] </rdfs:comment> <s:swangledText>M6IMWPWIH4YQI4IMGZYBGPYKEI</s:swangledText> <s:swangledText>HO2H3FOPAEM53AQIZ6YVPFQ2XI</s:swangledText> <s:swangledText>2AQEUJOYPMXWKHZTENIJS6PQ6M</s:swangledText> <s:swangledText>IIVQRXOAYRH6GGRZDFXKEEB4PY</s:swangledText> <s:swangledText>75Q5Z3BYAKRPLZDLFNS5KKMTOY</s:swangledText> <s:swangledText>2FQ2YI7SNJ7OMXOXIDEEE2WOZU</s:swangledText></s:SwangledTriple>

Page 36: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

What’s the point?• We’d like to get our documents into Google• The Swangle terms look like words to

Google and other search engines.• We use cloaking to avoid having to modify

the document– Add rules to the web server so that, when a

search spider asks for document X the document swangled(X) is returned

• Caching makes this efficient

Page 37: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

WebSearchEngine

FiltersSemanticMarkup

InferenceEngine

LocalKB

SemanticMarkup

SemanticMarkup

Extractor

Encoder

RankedPages

SemanticWeb Query

EncodedMarkup

TextQuery

TextFiltersText

Page 38: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Trust and Trust and ProvenanceProvenance

Page 39: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

?

Actionable Information: Trust and Provenance

Alice

M: “Today is sunny”

Bob trust Alice

BobE(M) Alice believes MD(M)

by default

Bob believes M

+

Trust rule

Modeling Alice’s trustworthinessObtained M from

trusts

weather.com

Page 40: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

How trust and provenance work

Bob Alice

source

believes

trusts

believes

http://foo.com/alice.rdf

ex:george space:livesIn space#US

http://foo.com/alice.rdf dc:creator ex:alice

ex:bob wob:trusts ex:alice

foo:George ex:livesIn ex:US

http://foo.com/bob.rdf

1

2

3

4

distrusts

Where doesGeorge live ?

foo:George ex:livesIn ex:Europe

Eve

believes

5

ex:bob wob:distrusts ex:Eve

Page 41: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Al-Qaeda

Mr.X

Terrorist Groups

isPresidentOf

listedIn

Company A

investsOrganization B

Osama Bin LadenmemberOf

Afghanistan

locatedIn

Mr. Y

ownedBy

locatedInrelatedTo

USlocatedIn

Kabul basedIn

Afghanistan

Company A

Osama Bin Laden

NASDAQ

CIA World Fact Book

HUMINT

Department of State Organization B

FOO News

Kabul

SIGINT

An Example

Page 42: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

How do provenance and trust help?• Disambiguation

– Provenance helps locating matching or relevant resources• Data access

– Knowledge can be grouped by Provenance besides Topic– (Trust + Provenance) enables inference on only credible information

sources, and thus controls space complexity• Credibility analysis

– (Trust) models imperfect information– (Provenance) enables explicit trace of justification– (Trust + Provenance) enable social justification as alternative of logical

justification (truth maintenance system) • Conclusive inference by adopting trusted beliefs• Resolve inconsistency by consensus

• (Trust + Provenance) enables social warfare and thus social control on the SW

Page 43: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Trust & Provenance: Complementing Security

Trust & Provenance Traditional security

Subject Content/Collaboration Reliability

“What A said is trustworthy”

“A is cooperative”

Authenticity & Authorization

“A has identity IDA”

“M is said by A”

Assumption True identity in every interaction

Intelligent Agent (with memory and learning capability)

Adoption of security protocol

Agents are telling truth

Nature Mathematical & Heuristic/Logical

Cryptography

Methods Learning models,

Deductive inference

Encryption, protocols

Digital signature

Page 44: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Issues

• Represent provenance & trust– Web Of Belief Ontology

• Create a computationally tractable model of trust– How to (mathematically) model trust

– How to initialize and propagate trust

• Maintain Provenance– Decide when apparently distinct sources are really

based on the same underlying data• Manage provenance & trust information

– Extracted from the Semantic Web, online social networks, and online reputation systems

– Effective metadata based data access service

Page 45: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Controlling Information Gathering Behavior using Policies

• Policies are rules of “correct” behavior– Policies are normative and describe what should be done in

an ideal world.

• Policies permit high-level control of “self organizing, self managing, self healing” entities – Entities? These can be programs, services, agents, devices

and people

• Using policies reduces the need to modify code in order to change systems’ behavior– We assume modifying policies will be easier than modifying

Java.

• Using policies allows humans to understand behavior without (always) looking at the code.

Page 46: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Behavior Specifications• You MUST NOT use any information from nations in the

Axis of Evil, unless expressly authorized by the DCI.• You MUST check at least three variants of a hypothesis,

including its negative• You are OBLIGED to trust anyone that SecDef declares as

trustworthy• You SHOULD trust Safire about privacy but not about the

middle east• You are OBLIGED to materialize a new view whenever

enough queries do joins across tables. • You CAN NOT use the camera functionality of your

handheld device in this building

Page 47: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Rei Policy Language• Developed several versions of Rei, a policy

specification language, encoded in (1) Prolog, (2) RDFS, (3) OWL

• Used to model different kinds of policies– Authorization for services

– Privacy in pervasive computing and the web

– Conversations between agents

– Team formation, collaboration & maintenance

• The OWL grounding enables policies that reason over SW descriptions of actions, agents, targets and context (Domain knowledge)

Page 48: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Rei Policy Language• Rei is a declarative policy language for describing

policies over actions– Reasons over domain dependent information

• Currently represented in OWL + logical variables• Based on deontic concepts

– Permission, Prohibition, Obligation, Dispensation• Models speech acts

– Delegation, Revocation, Request, Cancel• Meta policies

– Priority, modality preference

Page 49: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Hypothesis Generation

• Question answering will become “hypothesis proving” with the semantic web (Mayfield’s idea)– How about hypothesis generation

• Can be very useful in avoiding “group think”– Can we semi automatically generate hypotheses

– Variants of an existing hypothesis using domain knowledge

– When testing for X belongs to Al-Qaeda, test for X belongs to Jamat-ul-Dawa.

Page 50: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Relational Relational Learning on the Learning on the Semantic WebSemantic Web

Page 51: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

ConclusionsConclusions

Page 52: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Semantic Information Systems

• A new generation of information systems that include semantics in a more fundamental way

• Driven by a convergence of web technologies, machine learning, and mature KB techniques

• This is being done bottom up by pragmatists as well as by ‘academic’ researchers

• Good use cases are plentiful• There’s evidence that the technologies are

being taken up

Page 53: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Backup:Backup:Semantic Semantic

WebWeb

Page 54: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

New and New and Improved!Improved!100% Better100% Betterthan XML!!than XML!!

New and New and Improved!Improved!100% Better100% Betterthan XML!!than XML!!

RDFS supports simple inferences• An RDF ontology plus some RDF

statements may imply additional RDFstatements.

• This is not true of XML.• Example:

domain(parent,person)

range(parent,person)

subproperty(mother,parent)

range(mother,woman)

mother(eve,cain)

• This is part of the data model and not of the accessing/processing code

Implies:Implies: subclass(woman,person) parent(eve,cain) person(eve) person(cain) woman(eve)

Implies:Implies: subclass(woman,person) parent(eve,cain) person(eve) person(cain) woman(eve)

onto

logy

inst

ance

Page 55: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Student: Z. Ding. Faculty: Dr. Y. Peng, Dr. T. Finin, Dr. A. Joshi. Ack: DARPA Contract F30602-00-2-0591. 09/03.

Unified Ontology Support using Bayesian Networks

MotivationMotivationReasoning within an ontology is un-certain due to noisy/incomplete dataDL based reasoning is over-generalized and inadequateRelations between concepts in different ontologies are inherently uncertain

Translation ProcessTranslation ProcessExtend OWL for probability annotationDefine structural translation rules from RDF graphs to directed acyclic graphs of Bayesian networksConstruct conditional probability tables for individual variables in the BN DAG.

ApproachApproachTranslating OWL ontologies to Bayesian networks (BNs)Concept mapping as conditional probabilitiesJoining translated BNs (by probabilistic mappings) dynamicallyOntology reasoning (in and across ontologies) as Bayesian inference

BN1

onto1

P-onto1

Probabilistic ontological information

onto2

P-onto2

BN2

Probabilistic annotation

OWL-BN translation

concept mapping

Probabilistic ontological information

Page 56: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Student: Z. Ding. Faculty: Dr. Y. Peng, Dr. T. Finin, Dr. A. Joshi. Ack: DARPA Contract F30602-00-2-0591. 09/03.

Unified Ontology Support using Bayesian Networks

MotivationMotivationReasoning within an ontology is un-certain due to noisy/incomplete dataDL based reasoning is over-generalized and inadequateRelations between concepts in different ontologies are inherently uncertain

Translation ProcessTranslation ProcessExtend OWL for probability annotationDefine structural translation rules from RDF graphs to directed acyclic graphs of Bayesian networksConstruct conditional probability tables for individual variables in the BN DAG.

ApproachApproachTranslating OWL ontologies to Bayesian networks (BNs)Concept mapping as conditional probabilitiesJoining translated BNs (by probabilistic mappings) dynamicallyOntology reasoning (in and across ontologies) as Bayesian inference

Derived Conditional Probability Tables

Page 57: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

IT TALKSIT TALKShttp://daml.org/ UMBC, JHU/APL, and MIT/Sloan are working together on a set of issues to be integrated into agent-based

applications involving search and using rule-based reasoning: UMBC: Integrating communicating agents, DAML and the Web; JHU APL: DAML and information retrieval; and MIT Sloan School: DAML, rules based technology and distributed belief

Features•Generation from DB to DAML and HTML mediated by MySQL, Java servlets, and JSP.

•Generation of DAML descriptions and user profiles from HTML forms.

•Creation & use of DAML-encoded user modelsdescribing interests and ontology extensions.

•Ontologies for events, people, places, schedules,topics, etc.

•Automatic HTML form (pre) filling from DAML.

•Syncing of talks with Palm calendars via Coola.

•Automatic classification of talks into topicontology

•A XSB-based DAML/RDF reasoning engine.

•Agent-based services:• ITTALKS agent with KQML API using DAMLas content language

• Intelligent matching of people and talks based on interests, locations and schedules.

• Agents using both Jackal and FIPA’s Java Message Service

• Notification via email and mobile devices via SMS and WML.

• Discovery of relevant background papers from NEC CiteSeer

• Automatic generation and maintenance of user models

• Talk recommendations via collaborative filtering• Integration with STP (Smart Things and Places) ubiquitous computing project

UMBCAN HONORS UNIVERSITY IN MARYLAND

Webserver + Javaservlets

DAMLreasoning

engine

DAMLreasoning

engine

<daml></daml>

<daml></daml>

<daml></daml>

<daml></daml>

DAML files

Agents

Databases

People

RDBMSRDBMSDB

Email, HTML, SMS, WAP

FIPA ACL, KQML, DAML

SQLHTTP, KQML, DAML, Prolog

MapBlast, CiteSeer, Google, …HTTP

HTTP, WebScraping

WebServices

Ittalks.org a database driven web site of IT related talks at UMBC and other institutions. The database contains information on

– Seminar events– People (speakers, hosts, users, …)– Places (rooms, institutions, …)

This database is used to dynamically generate web pages and DAML descriptions for the talks and related information and serves as a focal point for agent-based services relating to these talks. To add your organization to ittalks.org and receive a domain (e.g., mit.ittalks.org) contact [email protected]. See http://daml.org/ for more information on DAML and the semantic web.

Acknowledgements: Funded by DARPA, contract F30602-00-2-0591. Students: H. Chen, F. Perich, L. Kagal, S. Tolia, Y. Zou, J. Sachs, S. Kumar and M. Gandhe. Faculty: T. Finin, A. Joshi, Y. Peng, R. Cost, C. Nicholas, R. Masuoka

http://ittalks.org/

Page 58: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

TAGA: Travel Agent Game in Agentcities

http://taga.umbc.edu/

TechnologiesFIPA (JADE, April Agent Platform)

Semantic Web (RDF, OWL)

Web (SOAP,WSDL,DAML-S)

Internet (Java Web Start )

FeaturesOpen Market FrameworkAuction ServicesOWL message contentOWL OntologiesGlobal Agent Community

MotivationMarket dynamicsAuction theory (TAC)Semantic webAgent collaboration (FIPA & Agentcities)

Travel Agents

Auction Service Agent

Customer Agent

Bulletin BoardAgent

Market Oversight Agent

Request

Direct Buy

Report Direct Buy Transactions

BidBid

CFP

Report Auction Transactions

Report Travel Package

Report Contract

Proposal

Web Service Agents

Ontologieshttp://taga.umbc.edu/ontologies/

travel.owl – travel concepts

fipaowl.owl – FIPA content lang.

auction.owl – auction services

tagaql.owl – query language

FIPA platform infrastructure services, including directory facilitators enhanced to use OWL-S for service discovery

Owl for representation and reasoning

Owl for service

descriptions

Owl for negotiatio

n

Owl as a content languag

e

Owl for publishing

communicative acts

Owl for contract

enforcement

Owl for modeling trust

Owl for authorization policies

Owl for protocol

description

OWL Everywhere

Page 59: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Context Broker Architecture

CoBrA FeaturesCoBrA Features[1] Uses OWL for context modeling and reasoning

[2] Uses abductive reasoning to detect and resolve inconsis-tent context knowledge

[3] Extend the REI policy language for privacy protection

[4] Adopt the FIPA standards for communication & knowledge sharing

EasyMeeting an intelligent meeting room prototype that provides services for speakers, audience & organizers based on their situational needs.

http://cobra.umbc.edu

Page 60: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

RGB: Research Group in a Boxeating our own dog food

• The UMBC ebiquity portal exposes ts con-tent in RDF using a set of OWL ontologies for papers, people, projects, photos, etc.

• Lab members can add arbitrary RDF assertions about any of the portal’s objects.

• It’s designed as a modular, configurable package using O/S software that others can use to create their own portals.

• An reasoning component is planned to apply ontology sanctioned inferences & heuristics.

http://ebiquity.umbc.edu/

People

Papers

Photos

Page 61: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Backup:Backup:SW & IRSW & IR

Page 62: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

DemoFind “Time” Ontology(Swoogle Search)1

2

3

4

Digest “Time” Ontology• Document view• Term view

Find Term “Person”(Ontology Dictionary)

Digest Term “Person”• Class properties• (Instance) properties

5 Swoogle Statistics

Page 63: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Find “Time” OntologyWe can use a set of keywords to search ontology. For example, “time, before, after” are basic concepts for a “Time” ontology.

Demo1

Page 64: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Usage of Terms in SWD

foaf:mbox

rdf:type

[email protected]

foaf:Person

http://www.cs.umbc.edu/~finin/foaf.rdf

wordNet:Agent

rdf:typerdfs:Class

rdfs:subClassOf

foaf:Person

http://xmlns.com/foaf/1.0/

foaf:mbox

rdfs:domain

rdf:typerdf:Property

populated Class

defined Class

populated Property

defined Property

http://foo.com/foaf.rdf#finin

foaf:mbox

rdf:type

[email protected]

foaf:Person

http://foo.com/foaf.rdf

defined Individual

Page 65: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Digest “Time” Ontology (term view)Demo2(a)

………….

TimeZone

before

intAfter

Page 66: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Document Metadata• Web document metadata

– When/how discovered/fetched– Suffix of URL– Last modified time– Document size

• SWD metadata– Language features

• OWL species• RDF encoding

– Statistical features• Defined/used terms• Declared/used namespaces• Ontology Ratio

– Ontology Rank

• Ontology annotation– Label– Version– Comment

• Related Relational Metadata– Links to other SWDs

• Imported SWDs• Referenced SWDs • Extended SWDs• Prior version

– Links to terms• Classes/Properties

defined/used

Page 67: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Digest “Time” Ontology (document view)

Demo2(b)

Page 68: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Find Term “Person”Demo

3

Not capitalized! URIref is case sensitive!

Page 69: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Term Metadata: An integrated definition

Class Definition• rdfs:subClassOf -- foaf:Agent• rdfs:label – “Person”

Properties (from SWI)• foaf:name• dc:title

Properties (from SWO)• foaf:mbox• foaf:name

foaf:name

foaf:mbox

rdfs:domain

rdfs:domain

Onto 1

owl:Classrdf:type

“Person”rdfs:label

foaf:Agent

rdfs:subClassOf

Onto 2

foaf:name

rdf:type

“Tim Finin”

SWD3

foaf:Person

Page 70: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Digest Term “Person”Demo

4

167 different properties

562 different properties

Page 71: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Demo5 Swoogle

Statistics

Page 72: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Swoogle Architecture

metadata creation

data analysis

interface

SWD discovery

SWD MetadataWeb Service

Web Server

SWD Cache

The Web

The WebCandidate

URLs Web Crawler

SWD Reader

IR analyzer SWD analyzer

Agent Service

Page 73: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Backup:Backup:Trust and Trust and

ProvenanceProvenance

Page 74: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

How trust and provenance work

Bob Alice

source

believes

trusts

believes

http://foo.com/alice.rdf

ex:george space:livesIn space#US

http://foo.com/alice.rdf dc:creator ex:alice

ex:bob wob:trusts ex:alice

foo:George ex:livesIn ex:US

http://foo.com/bob.rdf

1

2

3

4

distrusts

Where doesGeorge live ?

foo:George ex:livesIn ex:Europe

Eve

believes

5

ex:bob wob:distrusts ex:Eve

Page 75: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

How trust and provenance work

Bob Alice

source

believes

trusts

believes

http://foo.com/alice.rdf

ex:george space:livesIn space#US

http://foo.com/alice.rdf dc:creator ex:alice

ex:bob wob:trusts ex:alice

foo:George ex:livesIn ex:US

http://foo.com/bob.rdf

1

2

3

4

distrusts

Where doesGeorge live ?

foo:George ex:livesIn ex:Europe

Eve

believes

5

ex:bob wob:distrusts ex:Eve

Page 76: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

How provenance and trust help?

• Disambiguation– Provenance helps locating matching or relevant resources

• Data access– Knowledge can be grouped by Provenance besides Topic– (Trust + Provenance) enables inference on only credible

information sources, and thus controls space complexity• Credibility analysis

– (Trust) models imperfect information– (Provenance) enables to explicit justification trace– (Trust + Provenance) enables social justification as alternative

of logical justification (truth maintenance system) • Conclusive inference by adopting trusted beliefs• Resolve inconsistency by consensus

• (Trust + Provenance) enables social warfare and thus social control on the SW

1. Introduction

Page 77: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Inference

Evaluation

knows matrix

Beliefmatrix

Domainmapping

cost matrix

Queryqueue

initialize

query

BeliefFactory

GraphFactory

Trustmatrix

LocationFactory

evaluate

output

AnalyzerAgent Society•Trust Evolution•Trust Propagation

Agent Society Creator

Statistics•Accuracy•Convergence time•trust network structure

•Uniform•Locality based

•Spatial parameters•FOAF•Epinion.com•Scale free graph

•Bizrate.com•Epinion.com•Normal, Zipf

QueryFactory

• Referral policy• Feedback policy• Consult policy

feedback

input

Configuration

Page 78: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

How could a Statement be justified

I’ve been to Foo many times, and the food was good!

I’ve been to Foo many times, and the food was good!

I believe that “Restaurants with good outlook are good” “Foo has good outlook”;

I believe that “Restaurants with good outlook are good” “Foo has good outlook”;

My friends (who have similar taste as me ) said so.

My friends (who have similar taste as me ) said so.

No better alternativeNo better alternative

inductive

deductive

conclusive (mimic)

prima facie (at first view)Foo is a good

restaurant

I believe that “Good restaurants has good outlook” “Foo has good outlook”;

I believe that “Good restaurants has good outlook” “Foo has good outlook”;

abductive

1. Introduction

Page 79: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Data Model

wanted

environment

derived

Trust Network Inference

Trust

Context

Estimated Belief

Restriction

Expected Belief

approximate

input

output

evolves

•Local knowledge•Belief and domain

•Communication channel•Communication cost

•Global knowledge

Page 80: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

The Semantic Web - Trust Scenario http://www.w3c.it/talks/ICTP/slide50-0.htm

Page 81: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

Trust Inference

FOAF(F)DBLP(D)

F + D

Page RankCiteseer

Golbeck’s Trust

Unified

T. Finin

A. Joshi

L. Ding

H. Chen

P. Kolari

F. Perich

Y. Yesha

J. Golbeck

J. Hendler

Kagal

sink

hub

source

island

Finin Joshi

PengDing

Chen

KagalKolari

Chakrabarty Perich

Avancha

Yesha

Page Rank

Citeseer Rank

Golbeck’s Trust Network

DBLP Network

FOAF NetworkReputation Systems

knows

knowsknows

Page 82: Semantic Information Systems Tim Finin, Anupam Joshi, Charles Nicholas and Tim Oates 17 December 2004.

A. Joshi

L. DingH. Chen

P. Kolari

F. Perich

J. Golbeck

J. Hendler

Kagal

sink

source

island

T. Finin A. Joshi

L. Ding

H. Chen

L. Kagal

F. Perich

Golbeck’s Trust Network

DBLP Coauthor Network

FOAF Network Reputation Systems

A. Sheth

M. P. Singh

Y. Peng

6

15

1

28

T. Finin

mapTo

knows

knows

knows

co-author

hub

Google PageRank

Citeseer Rank