Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex...

Relationships at the Heart of Semantic Web Amit Sheth Large Scale Distributed Information Systems (LSDIS) Lab University Of Georgia; http://lsdis.cs.uga.edu CTO, Semagix, Inc. http://www.semagix.com November 2002 © Amit Sheth Keynote SOFSEM 2002 , Milovy , Czech Republic, Nov 25 2002


Amit P. Sheth, “Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships,” Keynote at the 29th Conference on Current Trends in Theory and Practice of Informatics (SOFSEM 2002), Milovy, Czech Republic, November 22–29, 2002.

Transcript of Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex...

Page 1: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Relationships at the Heart of Semantic Web

Amit Sheth

Large Scale Distributed Information Systems (LSDIS) Lab, University of Georgia; http://lsdis.cs.uga.edu

CTO, Semagix, Inc. http://www.semagix.com

November 2002 © Amit Sheth

Keynote

SOFSEM 2002, Milovy, Czech Republic, Nov 25 2002

Page 2: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

The Semantic Web -- a vision with several views:
• “The Web of data (and connections) with meaning in the sense that a computer program can learn enough about what data means to process it.” [B99]
• “The semantic Web is an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.” [BHL01]
• “The Semantic Web is a vision: the idea of having data on the Web defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications.” [W3C01]

Semantics: The Next Step in the Web’s Evolution

Page 3: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Semantics for the Web

On the Semantic Web every resource (people, enterprises, information services, application services, and devices) is augmented with machine-processable descriptions to support finding, reasoning about (e.g., which service is best), and using (e.g., executing or manipulating) the resource. The idea is that self-descriptions of data and other techniques would allow context-understanding programs to selectively find what users want, or allow programs to work on behalf of humans and organizations to make them more efficient and productive.

Page 4: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Move from Syntax to Semantics in Information Systems (a personal perspective)

[Figure: three generations of information systems, each adding a layer]
• Generation I (federated DB/multidatabases), 1980s: Data (schema, “semantic data modeling”). Systems: Mermaid, DDTS, Multibase, MRDSM, ADDS, IISS, Omnibase, ...
• Generation II (mediators), 1990s: Metadata (domain model). Systems: VisualHarness, InfoHarness, InfoSleuth, KMed, DL-I projects, Infoscopes, HERMES, SIMS, Garlic, TSIMMIS, Harvest, RUFUS, ...
• Generation III (information brokering), 1997...: Semantics (ontology, context, relationships, KB). Systems: Semantic Web, some DL-II projects, Semagix SCORE, Applied Semantics, VideoAnywhere, InfoQuilt, OBSERVER

Page 5: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Semantics and Relationships

Semantics is derived from relationships. Consider the linguistics perspective.

“Semantics is the study of meaning. … We may distinguish a number of legitimate ways to approach semantics:
• …
• the relationship between linguistic expressions (e.g. synonymy, antonymy, hyperonymy, etc.): sense;
• the relationship of linguistic expressions to the "real world": reference.”

Ontologies in KR help capture the above.

Quoted part from http://www.ncl.ac.uk/sml/staff/. © 2000 Jonathan West.

Page 6: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Why is this a hard problem?

Are objects/entities equivalent/equal (same)? How (well) are they related?
• partial and fuzzy match: related, relevant
• related in a “context”
• degrees: semantic similarity, semantic proximity, semantic distance, … [differentiation, disjointedness]
• Even the is-a link involves different notions: identity, unity, essence (Guarino and Welty 2002)

Semantic ambiguity, also based on incomplete, inconsistent, approximate information/knowledge

Many problems run into these issues, e.g., schema integration (in the database management area)

Page 7: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Semantics and Relationships

Increasing depth and sophistication in dealing with semantics by dealing with (identifying/searching to analyzing) documents, entities, and relationships.

[Figure: progression from Documents (past) to Entities (current) to Relationships (future)]

Page 8: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Issues - Relationships

• Identifying relationships (extraction)
• Expressing (specifying, representing) relationships
• Discovering and exploring relationships
• Hypothesizing and validating relationships
• Utilizing/exploiting relationships for semantic applications (in document search, querying metadata, analysis…)

Page 9: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Expressing Relationships

• Expressiveness of specification language
  – In relational model
  – In semantic data model, e.g., E-R variants
  – KR languages
  – In logic, e.g., description logics
  – …

Page 10: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Relationship Modeling in Various Representation Models …

[Figure: spectrum from simple taxonomies to expressive ontologies, after Deborah L. McGuinness (Stanford) and Tim Finin (UMBC)]
• Points along the spectrum: catalog/ID; terms/glossary; thesauri (“narrower term” relation); informal is-a; formal is-a; formal instance; frames (properties); value restriction; disjointness, inverse, part of…; general logical constraints
• Examples placed on the spectrum: Wordnet, UMLS, DB Schema, OO, RDF, RDFS, DAML, OWL, IEEE SUO, CYC

Page 11: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Sampling Issues in Relationships: outline of this talk
• Simple relationships (already known)
  – Representation
  – Identification/Querying: “Which entities are related to entity X via relationship R?” where R is typically specified as a join condition or path expression
• Complex relationships
  – Rho: discovery from a large document set with associated metadata and ontologies: “How is X related to Y?”
  – ISCAPEs: validation/human-directed knowledge discovery

Page 12: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Metadata and Ontology: Primary Semantic Web enablers

[Figure: layers from data up to the user; more semantics for relevance to tackle information overload!!]
• Data (heterogeneous types/media)
• Content-independent metadata (creation-date, location, type-of-sensor...)
• Content-dependent metadata (size, max colors, rows, columns...)
• Direct content-based metadata (inverted lists, document vectors, LSI)
• Domain-independent (structural) metadata (C++ class-subclass relationships, HTML/SGML Document Type Definitions, C program structure...)
• Domain-specific metadata: area, population (Census), land-cover, relief (GIS), metadata concept descriptions from ontologies
• Ontologies, classifications, domain models
• User

Page 13: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

SCORE technology

[Figure: SCORE architecture. Components shown:]
• Content Sources (structured, semi-structured, unstructured): databases, XML/feeds, websites, email, reports, documents
• Content Agents (CA), with CA Toolkit and Content Agent Monitor
• Knowledge Sources (KS) and Knowledge Agents (KA), with KA Toolkit and Knowledge Agent Monitor
• Semantic Enhancement Server: entity extraction, enhanced metadata, automatic classification, classification committee, domain experts
• Ontology and Metabase
• Semantic Query Server: ontology and metabase, main memory index
• Metadata adapters
• Enterprise Content Applications

Page 14: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Information Extraction for Metadata Creation

[Figure: metadata extractors producing metadata from WWW and enterprise repositories: Nexis, UPI, AP feeds/documents, digital maps, digital audios, digital videos, digital images, data stores, ...]

Key challenge: create/extract as much (semantic) metadata automatically as possible

Page 15: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Automatic Classification & Metadata Extraction (Web page)

[Figure: video with editorialized text on the Web, annotated with auto-categorization and semantic metadata]

Page 16: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Semantic Annotation

• COMTEX tagging: limited tagging (mostly syntactic)
• Content ‘enhancement’: rich semantic metatagging
• Value-added Voquette semantic tagging: value-added relevant metatags added by Voquette to existing COMTEX tags:
  – Private companies
  – Type of company
  – Industry affiliation
  – Sector
  – Exchange
  – Company execs
  – Competitors

Page 17: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Automatic Semantic Annotation of Text:Entity and Relationship Extraction

Page 18: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Ontology-directed Metadata Extraction (Semi-structured data)

[Figure: Web page, Extraction Agent, Enhanced Metadata Asset]

Page 19: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Semantic Metadata

Syntax Metadata

Entity and Semantic Metadata Extraction

Page 20: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Semantic Metadata Enhancement

[Figure: a content item passes through semantic metadata extraction (also syntactic), classification by a classification committee (knowledge base, machine learning & statistical techniques), and semantic enhancement, uniquely exploiting real-world semantic associations in the right context.]

Ontology used (E-Business Solution Ontology):
• Entities: Cisco Systems; Voyager Network, Siemens Network, Wipro Group, Ulysys Group; CIS-1270 Security, CIS-320 Learning, CIS-6250 Finance, CIS-1005 e-Market
• Attribute nodes: Channel Partner, Ticker, Industry, Competition, Executives, Sector
• Relationship labels: belongs to, represented by, channel partner of, competes with, provider of, works for

Content tags after extraction and classification:
• Syntactic Metadata
  – Producer: BusinessWire
  – Source: Bloomberg
  – Date: Sept. 10 2001
  – Location: San Jose, CA
  – URL: http://bloomberg.com/1.htm
  – Media: Text
• Semantic Metadata
  – Company: Cisco Systems, Inc.
  – Classification: Channel Partners, E-Business Solutions

Content tags after semantic enhancement (XML content item with enriched semantic tagging, ready to be queried):
• Semantic Metadata
  – Company: Cisco Systems, Inc.
  – Classification: Channel Partners, E-Business Solutions
  – Channel Partner: Siemens Network
  – Channel Partner: Voyager Network
  – Channel Partner: Wipro Group
  – E-Business Solution: CIS-1270 Security
  – E-Business Solution: CIS-320 Learning
  – E-Business Solution: CIS-6250 Finance
  – E-Business Solution: CIS-1005 e-Market
  – Ticker: CSCO
  – Industry: Telecommunication, . . .
  – Sector: Computer Hardware
  – Executive: John Chambers
  – Competition: Nortel Networks

Enabling powerful linking of actionable information and facilitating important semantic applications such as knowledge discovery and link analysis

(user’s task of manually retrieving all the information he needs to know is greatly minimized; he can spend more time making effective decisions)

Page 21: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Semantic Application Example – Research Dashboard

• Focused relevant content organized by topic (semantic categorization)
• Automatic content aggregation from multiple content providers and feeds
• Related relevant content not explicitly asked for (semantic associations)
• Competitive research inferred automatically
• Automatic 3rd party content integration

Page 22: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Semantic Web – Intelligent Content

[Figure: a COMPANY at the center, linked to related content]
• Content nodes: Related Stock News, Industry News, Technology Products, Competition, SEC, EPA Regulations
• Link labels: COMPANIES in Same or Related INDUSTRY; COMPANIES in INDUSTRY with Competing PRODUCTS; Impacting INDUSTRY or Filed By COMPANY; Important to INDUSTRY or COMPANY

Intelligent Content = What You Asked for + What you need to know!

Page 23: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Syntax Metadata

Semantic Metadata

led by

Same entity

Human-assisted inference

Knowledge-based andManual Associations

Page 24: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Blended Semantic Browsing and Querying (Intelligence Analyst Workbench)

Page 25: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Physical link to Relationship

<TITLE> A Scenic Sunset at Lake Tahoe </TITLE>

<p>

Lake Tahoe is a popular tourist spot and <A HREF="http://www1.server.edu/lake_tahoe.txt">some interesting facts</A> are available here. The scenic beauty of Lake Tahoe can be viewed in this photograph:
<center><IMG SRC="http://www2.server.edu/lake_tahoe.img"></center>

• Correlation achieved by using physical links
• Done manually by the user publishing the HTML document

Page 26: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

MREF

Metadata Reference Link -- complementing HREF

Creating a “logical web” through media-independent, metadata-based correlation

Page 27: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Metadata Reference Link (<A MREF …>)

• <A HREF="URL">Document Description</A>
  physical link between documents (or document components)

• <A MREF KEYWORDS=<list-of-keywords>; THRESH=<real>>Document Description</A>

• <A MREF ATTRIBUTES(<list-of-attribute-value-pairs>)>Document Description</A>

Page 28: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Abstraction Layers

[Figure: abstraction layers DATA, METADATA, and ONTOLOGY NAMESPACE, with MREF expressed in RDF]

Page 29: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Model for Logical Correlation using Ontological Terms and Metadata

Framework for Representing MREFs

Serialization (one implementation choice)

Page 30: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Correlation based on Content-based Metadata

Some interesting information on dams is available here.

“information on dams” is defined by an MREF defining keywords and metadata (which may be used for a query).

[Figure: water.gif (data) in metadata storage alongside … mpeg and … ppm items. Content-dependent metadata: height, width and size. Content-based metadata: major component (RGB) = blue.]

Page 31: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

An Example RDF Model for MREF

<?namespace href="http://www.foo.com/IQ" as="IQ"?>
<?namespace href="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
<RDF:serialization>
  <RDF:bag id="MREF:12345">
    <IQ:keyword>
      <RDF:resource id="constraint_001">
        <IQ:threshold>0.5</IQ:threshold>
        <RDF:PropValue>dam</RDF:PropValue>
      </RDF:resource>
    </IQ:keyword>
    <IQ:attribute>
      <RDF:resource id="constraint_002">
        <IQ:name>majorRGB</IQ:name>
        <IQ:type>string</IQ:type>
        <RDF:PropValue>blue</RDF:PropValue>
      </RDF:resource>
    </IQ:attribute>
  </RDF:bag>
</RDF:serialization>

Page 32: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Domain Specific Correlation

Potential locations for a future shopping mall, identified as all regions having a population greater than 500 and an area greater than 50 sq meters, with an urban land cover and moderate relief, <A MREF ATTRIBUTES(population > 500; area > 50 & region-type = ‘block’ & land-cover = ‘urban’ & relief = ‘moderate’)>can be viewed here</A>

=> media-independent relationships between domain specific metadata: population, area, land cover, relief

=> correlation between image and structured data at a higher domain specific level as opposed to physical “link-chasing” in the WWW
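
An attribute-based MREF of this kind can be read as a predicate over domain-specific metadata records. The following is a minimal sketch, assuming the metadata for each region has already been gathered into one record; the `Region` type, the sample records, and `resolve_mall_mref` are hypothetical names used only for illustration, and the thresholds follow the prose description (population greater than 500, area greater than 50).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical record holding domain-specific metadata for one region."""
    name: str
    population: int
    area: float
    region_type: str
    land_cover: str
    relief: str


def matches_mall_mref(r: Region) -> bool:
    """Evaluate the MREF attribute constraints against one metadata record."""
    return (
        r.population > 500
        and r.area > 50
        and r.region_type == "block"
        and r.land_cover == "urban"
        and r.relief == "moderate"
    )


def resolve_mall_mref(regions):
    """Return the names of all regions the MREF would correlate to."""
    return [r.name for r in regions if matches_mall_mref(r)]


# Illustrative data only; none of these values come from the talk.
REGIONS = [
    Region("block-17", 1200, 80.0, "block", "urban", "moderate"),
    Region("block-18", 300, 90.0, "block", "urban", "moderate"),
    Region("block-19", 2000, 75.0, "block", "forest", "moderate"),
]

if __name__ == "__main__":
    print(resolve_mall_mref(REGIONS))
```

The point of the sketch is that the link target is computed from metadata at the time the link is followed, not fixed by the page author as with an HREF.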

Page 33: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Repositories and the Media Types

[Figure: repositories contributing the metadata used in the correlation]
• Census DB (regions, via SQL): population, area
• TIGER/Line DB (boundaries): boundaries
• Map DB (image features, via IP routines): land cover, relief

Page 34: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships
Page 35: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Relationship Discovery

• Problem: huge volumes of data. Need to find relationships between two entities in the Semantic Web.
• Application areas such as national security, intelligence services, bioinformatics.
• Relationships can be of different kinds.

Page 36: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Example

[Figure: example graph linking the entities Mohammad Atta, Ramzi Binalshibh, Osama bin Laden, and Al Qaida (a Terrorist Organization) through the properties name, friendOf, memberOf, leaderOf, and passengerOf]

Page 37: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Semantic Association

Complex relationships which capture connectivity and similarity of entities in a knowledge base
– Complex
  • Involve multiple relations
– Connectivity
  • Includes both directed paths and undirected paths
– Similarity
  • Specific notion of an isomorphism, based on the similarity of roles of entities.

Page 38: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Representing and analyzing metadata

• By using a graph data model, Semantic Associations can be viewed in terms of graph traversals

• We can distinguish between different types of Semantic Associations based on structural properties

• For example, a path, intersecting paths, isomorphic paths.

• We use the RDF Graph Data Model to model Semantic Associations.
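
To make the graph-traversal view concrete, here is a minimal sketch that stores a knowledge base as (subject, property, object) triples and enumerates directed paths between two resources. It illustrates the idea only and is not the talk's implementation; the resource names, the `max_edges` bound, and the functions `find_paths` and `path_connected` are assumptions introduced here.

```python
from collections import defaultdict

# Hypothetical triples: two painters whose paintings hang in the same museum.
TRIPLES = [
    ("r1", "paints", "r2"),
    ("r2", "exhibited", "r3"),
    ("r4", "paints", "r5"),
    ("r5", "exhibited", "r3"),
]


def build_index(triples):
    """Map each subject to its outgoing (property, object) edges."""
    index = defaultdict(list)
    for subject, prop, obj in triples:
        index[subject].append((prop, obj))
    return index


def find_paths(triples, x, y, max_edges=4):
    """Return all simple directed paths from x to y as alternating
    node/property tuples, e.g. (x, p1, a, p2, y)."""
    index = build_index(triples)
    results = []

    def walk(node, path, visited):
        if node == y and len(path) > 1:
            results.append(tuple(path))
            return
        if (len(path) - 1) // 2 >= max_edges:  # number of edges used so far
            return
        for prop, nxt in index.get(node, []):
            if nxt not in visited:
                walk(nxt, path + [prop, nxt], visited | {nxt})

    walk(x, [x], {x})
    return results


def path_connected(triples, x, y, max_edges=4):
    """True if at least one directed path leads from x to y."""
    return bool(find_paths(triples, x, y, max_edges))


if __name__ == "__main__":
    print(find_paths(TRIPLES, "r1", "r3"))
```

Bounding the path length keeps the enumeration tractable on large graphs; whether and how the original system bounded its search is not stated in the slides.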

Page 39: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Example Graph

[Figure: example RDF graph. Schema layer: classes Artist, Painter, Sculptor, Artifact, Painting, Sculpture, Museum, and Ext. Resource; properties creates, paints, sculpts, exhibited, technique, material, fname, lname, title, file_size, last_modified, mime-type; literal types String, Date, Integer. Instance layer: resources &r1 to &r8 with literals such as “Pablo”, “Picasso”, “Rembrandt”, “August”, “Rodin”, “oil on canvas”, “Reina Sofia Museum”, “Rodin Museum”, “image/jpeg”, 2000-02-01, 2000-6-09. Edge types: typeOf (instance), subClassOf (isA), subPropertyOf.]

ρ-pathConnected(x, y): is true if there is a path <x, p1, a, p2, b, p3, …, y> in the knowledge base

Page 40: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

[Figure: the example RDF graph from the previous slide, with two intersecting paths from X and Y highlighted]

ρ-joinConnected(x, y): is true if there are two paths P1, P2 such that:
P1 = <x, pa, a, pb, b, pc, c, pd, …, k, pl, l, pm, m> and P2 = <y, pu, b, pv, …, k, pw, l, py, n>
or
P1 = <a, pa, b, pb, …, k, pk, l, pl, x> and P2 = <y, pu, b, pv, m, pw, l, …, k, p5, l, p6, n>

Page 41: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

[Figure: the example RDF graph from the previous slides, with two structurally similar paths from X and Y highlighted]

ρ-isoConnected(x, y): is true if there are two paths P1, P2 such that:
P1 = <x, pa, a, pb, b, pc, c> and P2 = <y, pu, b, pv, m, pw, l>
and the corresponding nodes (x, y), (a, b), (c, l), … and properties (pa, pu), (pb, pv), (pc, pw), … are pairwise similar.

Page 42: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Operators

• The ρ operator computes Semantic Associations between two entities.
• Three kinds of ρ operators are defined.
  – ρ-Path: this operator returns all paths between two entities in the data model.
  – ρ-Connect: this operator returns intersecting paths on which the two entities lie.
  – ρ-Iso: ρ-isomorphic paths imply a similarity of nodes and edges along the paths; this operator returns such similar paths between entities.

Page 43: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Formalism

ρ-pathConnected(x, y): is true if there is a path
– <x, p1, a, p2, b, p3, …, y> in the knowledge base

ρ-joinConnected(x, y): is true if there are two paths P1, P2 such that:

– P1 = <x, pa, a, pb, b, pc, c, pd, …, k, pl, l, pm, m> and

– P2 = <y, pu, b, pv, …, k, pw, l, py, n>

Or

– P1 = <a, pa, b, pb, …, k, pk, l, pl, x> and

– P2 = <y, pu, b, pv, m, pw, l, …, k, p5, l, p6, y>
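
The join-connected case can be sketched as a reachability test: two entities are join-connected if directed paths starting at each of them meet at a common node. This is a simplification covering only the first form of the definition (both paths start at x and y), and it reports the shared nodes, not the full paths. The triples and the function names `reachable`, `join_nodes`, and `join_connected` are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical triples: two painters whose paintings hang in the same museum.
TRIPLES = [
    ("r1", "paints", "r2"),
    ("r2", "exhibited", "r3"),
    ("r4", "paints", "r5"),
    ("r5", "exhibited", "r3"),
]


def reachable(triples, start):
    """Return every node reachable from `start` along directed edges,
    including `start` itself."""
    index = defaultdict(list)
    for subject, _prop, obj in triples:
        index[subject].append(obj)
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in index.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def join_nodes(triples, x, y):
    """Nodes (other than x and y) that lie on a path from x and on a path from y."""
    return (reachable(triples, x) & reachable(triples, y)) - {x, y}


def join_connected(triples, x, y):
    """True if paths from x and from y intersect at some third node."""
    return bool(join_nodes(triples, x, y))


if __name__ == "__main__":
    print(join_nodes(TRIPLES, "r1", "r4"))
```

In the sample data the two painters are not path-connected to each other, yet they are join-connected through the museum where both paintings are exhibited, which is the kind of indirect association the operator is meant to surface.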

Page 44: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Complex Relationship Validation

• Arise in several contexts, especially involving multiple ontologies (hence mappings)
  – information interoperability where related resources subscribe to different but related ontologies
  – information requestor and resource modelers choose to use different ontologies
  – information requests to support analysis, knowledge discovery, decision making, and learning that require linking multiple domains with different ontologies

Developing an all-encompassing, unified ontology has not been shown to be practical. Preexisting classifications/metadata standards/taxonomies are hard to ignore.

Page 45: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Complex Relationships - Cause-Effects & Knowledge discovery

[Figure: cause-effect graph over the concepts VOLCANO, LOCATION, ASH RAIN, PYROCLASTIC FLOW, ENVIRON., PEOPLE, WEATHER, PLANT, and BUILDING, with the relationships DESTROYS, COOLS TEMP, and KILLS]

Page 46: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Knowledge Discovery - Example

Earthquake Sources; Nuclear Test Sources

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationship: How do you model this?

Page 47: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Inter-Ontological Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake

<= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30

AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
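
The rule can be sketched as an executable predicate. The slide does not give the units of `dateDifference` or `distance`, so this sketch assumes days and kilometres (great-circle distance by the haversine formula); it also assumes the earthquake must not precede the test, following the prose statement. The `Event` type and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; the unit is an assumption


@dataclass(frozen=True)
class Event:
    """Hypothetical record shared by nuclear tests and earthquakes."""
    event_date: date
    latitude: float
    longitude: float


def date_difference(earlier: date, later: date) -> int:
    """Days from `earlier` to `later` (negative if `later` comes first)."""
    return (later - earlier).days


def distance(lat1, lon1, lat2, lon2) -> float:
    """Great-circle distance in kilometres (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    h = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(h)))


def may_cause(test: Event, quake: Event, max_days=30, max_distance=10000) -> bool:
    """NuclearTest Causes Earthquake, under the assumptions stated above."""
    days = date_difference(test.event_date, quake.event_date)
    near = distance(test.latitude, test.longitude,
                    quake.latitude, quake.longitude) < max_distance
    return 0 <= days < max_days and near


if __name__ == "__main__":
    # Illustrative values only, not data from the talk.
    test = Event(date(1998, 5, 11), 27.0, 72.0)
    quake = Event(date(1998, 5, 21), 30.0, 75.0)
    print(may_cause(test, quake))
```

Keeping the thresholds as parameters reflects the exploratory use described in the talk: the relationship is a hypothesis to be tested against data, so the bounds are expected to be varied.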

Page 48: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

1950

Increase in number of earthquakes since 1945

Page 49: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Knowledge Discovery – exploring relationship…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, find the number of earthquakes per year, starting from 1900

The number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
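
The grouping behind this analysis can be sketched as a count per (year, magnitude range). The slide does not say how range boundaries are handled, so the sketch assumes half-open ranges (a magnitude of exactly 6.0 falls in 6-7, and exactly 9.0 in the last group); the function names and the input format of (year, magnitude) pairs are assumptions.

```python
from collections import Counter

# Magnitude ranges from the slide, as (low, high, label); low is inclusive,
# high is exclusive. The boundary handling is an assumption.
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_bin(magnitude):
    """Return the label of the range containing `magnitude`, or None if below 5.8."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def count_by_year_and_bin(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from `start_year` onward.

    `quakes` is an iterable of (year, magnitude) pairs.
    """
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and year >= start_year:
            counts[(year, label)] += 1
    return counts


if __name__ == "__main__":
    # Illustrative values only, not data from the talk.
    sample = [(1944, 6.1), (1944, 6.5), (1944, 7.2), (1950, 5.9)]
    for key, n in sorted(count_by_year_and_bin(sample).items()):
        print(key, n)
```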

Page 50: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Knowledge Discovery - Example…

Find nuclear tests and earthquakes that may have occurred as a result of the test

KB

Page 51: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

InfoQuilt System Core capabilities

• Ability to handle heterogeneous, static or dynamic content – wrappers & extractors, with resource modeling (completeness, data characteristics, binding patterns)

• Information Extraction: Semi-Automatically or Automatically create domain-specific or contextually relevant metadata

• Domain modeling with complex (user defined, inter-ontology) relationships, domain rules and FD

• User defined Functions (esp. for fuzzy/approximate matching) and Simulation

• Post processing result analysis (e.g., chart creator)

Page 52: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

IScape (Information Scape)

A computing paradigm that allows users to query and analyze the data available from diverse autonomous sources, gain a better understanding of the domains and their interactions, and discover and study relationships.

Page 53: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

IScape …a simple example

• user’s request
  – for semantically related information (regardless of all forms of heterogeneity)
  – specified in terms of components of the knowledge base (domain model, relationships, functions, simulations)

“Find all earthquakes with epicenter less than 5000 miles from the location at latitude 60.790 North and longitude 97.570 East and find all tsunamis that they might have caused”

Next - KD using IScapes

Page 54: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Ontologies

[Figure: disaster ontology, annotated with the four kinds of content an ontology carries]
• Hierarchies: Disaster; Natural Disaster (Volcano, Earthquake); Man-made Disaster (NuclearTest)
• Terms/Concepts (Attributes): eventDate, description, site, latitude, longitude, damage, numberOfDeaths, damagePhoto, magnitude, bodyWaveMagnitude, conductedBy, explosiveYield
• Functional Dependencies (FDs): site => latitude, longitude
• Domain Rules: magnitude > 0, magnitude < 10; bodyWaveMagnitude > 0, bodyWaveMagnitude < 10
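
Domain rules and functional dependencies of this kind can be checked mechanically against records. A minimal sketch, assuming records are plain dictionaries keyed by attribute name; the helper names `rule_violations` and `fd_violations` and the sample records are hypothetical.

```python
# Domain rules from the slide, as (attribute, predicate) pairs.
DOMAIN_RULES = [
    ("magnitude", lambda v: 0 < v < 10),
    ("bodyWaveMagnitude", lambda v: 0 < v < 10),
]


def rule_violations(record):
    """Return the attributes of `record` whose values break a domain rule."""
    return [
        attr for attr, ok in DOMAIN_RULES
        if attr in record and not ok(record[attr])
    ]


def fd_violations(records, lhs, rhs):
    """Check the functional dependency lhs => rhs.

    Returns the sorted lhs values that appear with more than one
    combination of rhs values.
    """
    seen = {}
    bad = set()
    for rec in records:
        key = rec[lhs]
        value = tuple(rec[attr] for attr in rhs)
        if seen.setdefault(key, value) != value:
            bad.add(key)
    return sorted(bad)


if __name__ == "__main__":
    # Illustrative records only, not data from the talk.
    records = [
        {"site": "A", "latitude": 10.0, "longitude": 20.0, "bodyWaveMagnitude": 4.8},
        {"site": "A", "latitude": 10.0, "longitude": 20.0, "bodyWaveMagnitude": 5.1},
        {"site": "B", "latitude": 30.0, "longitude": 40.0, "bodyWaveMagnitude": 12.0},
        {"site": "B", "latitude": 31.0, "longitude": 40.0, "bodyWaveMagnitude": 4.0},
    ]
    print([rule_violations(r) for r in records])
    print(fd_violations(records, "site", ["latitude", "longitude"]))
```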

Page 55: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Knowledge Builder

Page 56: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

IScape Builder

Page 57: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

IScape Execution

[Figure: IScape execution, showing the IScape, Knowledge, Plan, Query, Data retrieved, and Final Results stages]

Page 58: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

IScape 1

IScape:

“Find all nuclear tests conducted by India or Pakistan after January 1, 1995 with seismic body wave magnitude > 4.5 and find all earthquakes that could have been caused by these tests.”

Ontology:

NuclearTest( testSite, explosiveYield, waveMagnitude, testType, eventDate, conductedBy, latitude, longitude, waveMagnitude > 0, waveMagnitude < 10, testSite -> latitude longitude );

Earthquake( eventDate, description, region, magnitude, latitude, longitude, numberOfDeaths, damagePhoto, magnitude > 0 );

Relationship:

NuclearTest Causes Earthquake <= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30 AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000

Resources:

NuclearTestsDB( testSite, explosiveYield, waveMagnitude, testType, eventDate, conductedBy, [dc] waveMagnitude > 3 );

NuclearTestSites( testSite, latitude, longitude );

SignificantEarthquakesDB( eventDate, description, region, magnitude, latitude, longitude, numberOfDeaths, damagePhoto, [dc] eventDate > “January 1, 1970” );

Sources: USGS site; http://sun00781.dn.net/nuke/hew/Library/Catalog

Page 59: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

IScape Processing Monitor

Page 60: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Future

Future work in Semantic Web will increasingly focus on all dimensions of relationships, especially complex relationships.

New Semantic Applications (business/govt. intelligence) are being enabled.

Page 61: Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating and Exploiting Complex Semantic Relationships

Further Information

• Related Paper: Sheth, Arpinar, Kashyap: Relationships at the Heart of Semantic Web: Modeling, Discovering, and Exploiting Complex Semantic Relationships, http://lsdis.cs.uga.edu/lib/2002.html

• InfoQuilt and Semantic Association Projects at the LSDIS Lab: http://lsdis.cs.uga.edu

• Green, Bean and Myaeng: The Semantics of Relationships: An Interdisciplinary Perspective, Kluwer Academic Publishers 2002.