SICoP Presentation: A story about communication. Michael Lang, BEA / Revelytix. May 2, 2007.

Page 1: SICoP Presentation

A story about communication

Michael Lang
BEA
Revelytix
May 2, 2007

Page 2: Interoperability

► This track concerns making an SOA as manageable and interoperable as possible
• Description
• Abstraction
• Governance
• Communication

► Community-based communication

Page 3: Agenda

► Communication in Indonesia

► Schemas and Languages
• Semantic Technology
• ETL and data warehousing

► Roadmap
• Transitioning from here to there

► Vocabulary Management

► Semantic Data Services

Page 4: Indonesia

Page 5: Indonesia in 1960

► Indonesia comprises several thousand islands

► 340 different languages/dialects are spoken

► Indonesia is not an integrated country

► The President decided that Indonesia needed a common vocabulary so that the groups could communicate with each other

Page 6: Indonesia Today

► The language/vocabulary was developed and is now in widespread use

► Indonesia is interoperable today

Page 7: IT Systems

► Each system is an island and it has its own language
• We call these languages schemas

► If we want to be able to speak among systems we need a common vocabulary

► Key concept #1
• Language = Vocabulary = Schema
• The language is used by a community

Page 8: Languages

► Features
• A way to represent concepts and the relationships between concepts
• A way of representing facts about the world
• A way of adding concepts, relationships and facts without breaking the language
• A syntax or grammar that shows how the language is structured and functions

Page 9: Why are there Data Warehouses?

► The Sales community uses the schema of the CRM system to capture facts and concepts

► Manufacturing uses the schema of the ERP system

► Finance wants to forecast trends
• Invent a new language that has some of the concepts and facts of both systems as well as additional concepts
• Build a new system

Page 10: XSD as a Language

► XSD is a great way to define the terms and syntax for a community
• And it is a language, and it can be extended
• But systems processing XML documents containing the data expect the structure to be fixed according to an XSD
• If the structure is different the system must be changed

► Address
• Street
• City
• Zip code

Address
• Street
• Apt number
• City
• Zip

Address
• Street1
• Street2
• City
• Postal Code
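The point about fixed structure can be illustrated with a minimal sketch. It assumes two hypothetical XML renderings of the Address variants above (the element names `AptNumber` and `PostalCode` and the sample values are invented for illustration) and a consumer coded against the first one only.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents for the two Address variants on this slide.
ADDRESS_A = (
    "<Address><Street>1 Main St</Street><AptNumber>4B</AptNumber>"
    "<City>Reston</City><Zip>20190</Zip></Address>"
)
ADDRESS_B = (
    "<Address><Street1>1 Main St</Street1><Street2>Apt 4B</Street2>"
    "<City>Reston</City><PostalCode>20190</PostalCode></Address>"
)


def read_address_a(xml_text):
    """Consumer hard-coded to the first structure (illustrative only)."""
    root = ET.fromstring(xml_text)
    fields = {}
    for tag in ("Street", "AptNumber", "City", "Zip"):
        element = root.find(tag)
        if element is None:
            # The structure differs from what the code expects.
            raise ValueError(f"missing element: {tag}")
        fields[tag] = element.text
    return fields


ok = read_address_a(ADDRESS_A)

try:
    read_address_a(ADDRESS_B)
    broke = False
except ValueError:
    broke = True  # the consumer must be changed to accept variant B
```

The same data is present in both documents, but the consumer has to be recoded before it can read the second one.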

Page 11: XSD as a Language

► The purpose of XSD is to specify syntax
• The representation of the data is determined by the schema
• IT systems that process XML count on this

► XSD has serious limitations as the technology for achieving interoperability
• Every time the language is extended the systems must be recoded

► New communities cannot enter into the vocabulary
• Adding their own concepts
• Or extending existing concepts

Page 12: Semantic Technology

RDF, OWL, SPARQL

Page 13: Semantic Technology

► The first fundamental change in Information Management since the RDBMS was developed in the early 1980s

► Key concept #2
• The “data” exists independently of any schema
• data = instances = facts

Page 14: RDF

RDF is uniquely positioned as a machine-readable lingua franca for representing relational, XML, proprietary, and most other data formats

► BUT, it is not a language
• There is no schema
• It is a way of making data accessible to ANY schema defined using OWL or RDFS

Page 15: RDF

► RDF uses a graph structure
• Can more accurately represent the entities being modeled
• Can represent objects with complex structures directly without exposing implementation techniques, e.g. hierarchical structures

► Data schemas do not need to be determined a priori

► New data from disparate sources can be integrated seamlessly on the fly by merging two (or more) RDF graphs

► RDF models data’s content and meaning, rather than just its structure or serialization
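A minimal sketch of graph merging, assuming each RDF graph is modeled as a plain Python set of (subject, predicate, object) triples. The `http://example.org/` namespace, the property names and the values are all hypothetical; a real RDF library would also keep blank nodes from different graphs apart, which this sketch ignores.

```python
EX = "http://example.org/"  # hypothetical namespace for illustration

# Facts about the same customer held by two different systems.
crm_graph = {
    (EX + "cust/42", EX + "name", "Acme Corp"),
    (EX + "cust/42", EX + "accountManager", "J. Smith"),
}
erp_graph = {
    (EX + "cust/42", EX + "name", "Acme Corp"),
    (EX + "cust/42", EX + "openOrders", 3),
}

# Merging is set union: shared triples collapse, new ones are added,
# and no target schema has to be agreed on first.
merged = crm_graph | erp_graph


def describe(graph, subject):
    """Collect predicate -> object for one subject (single-valued sketch)."""
    return {p: o for s, p, o in graph if s == subject}


customer = describe(merged, EX + "cust/42")
```

Because both systems use the same URI for the customer, the merged graph answers questions that neither source could answer alone.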

Page 16: RDF is Schemaless

► The schema can be defined using RDFS or OWL

► Since the data is independent of the schema any language can be developed and deployed
• Seamlessly incorporate new data sources, regardless of schema
• Business logic and user-interface logic are far more resilient in the face of unexpected data and schema evolution

Page 17: Creating Languages with OWL

Structuring RDF with Ontologies

► The Web Ontology Language (OWL) allows the declaration of rules to define and describe RDF vocabularies.

► OWL can be used to define the language for any community

► OWL models can be extended without breaking the system that is consuming the data

Page 18: SPARQL

SPARQL: Web-aware semantic query

► SPARQL is a SQL-like language for querying distributed RDF graphs

► SPARQL queries emphasize:
• Distributed data. Multiple data sources can be queried at once because SPARQL addresses graphs by URI.
• Ragged data. The SPARQL OPTIONAL keyword lets users explore heterogeneous data in a single query.
• Unpredictable data. The ability to query for predicates and information about predicates makes SPARQL ideal for exploring new and unexpected data.
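The "ragged data" and "unpredictable data" points can be sketched without a SPARQL engine. The query text below is ordinary SPARQL; the Python function is only a hand-written emulation of what OPTIONAL does over a set of triples, and the `ex:` names and sample data are hypothetical.

```python
# The query this sketch emulates (shown for reference, not executed here).
QUERY = """
PREFIX ex: <http://example.org/>
SELECT ?name ?phone WHERE {
  ?person ex:name ?name .
  OPTIONAL { ?person ex:phone ?phone }
}
"""

# Ragged data: Bob has no phone number.
triples = {
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:phone", "555-0100"),
    ("ex:bob", "ex:name", "Bob"),
}


def names_with_optional_phone(graph):
    """Emulate the OPTIONAL pattern: keep a row even if no phone exists."""
    rows = []
    for subject, predicate, name in graph:
        if predicate != "ex:name":
            continue
        phones = [o for s, p, o in graph if s == subject and p == "ex:phone"]
        # An unmatched OPTIONAL leaves the variable unbound (None here).
        for phone in phones or [None]:
            rows.append((name, phone))
    return sorted(rows, key=lambda row: row[0])


rows = names_with_optional_phone(triples)

# Unpredictable data: ask which predicates occur at all.
predicates = sorted({p for _, p, _ in triples})
```

Without OPTIONAL, Bob would drop out of the result entirely because the phone pattern has nothing to match.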

Page 19: Myth: None of this Works

► Not true
• Many tools are capable of producing RDF
• Some are good at representing relational data as RDF

► There are several good OWL modeling tools available

► Jena and Sesame are open source reasoner/SPARQL engines

► Many commercial products are incorporating support for semantic technology now

Page 20: Roadmap

Page 21: How to get from here to there

► Here
• Currently SOAs exchange data as XML where the document structure is determined by an XSD and the query language is XQuery

► There
• SOAs exchange data as RDF and query using SPARQL

Page 22: Semantic Alignment of Data Services

[Diagram: two Web Services, one using Vocabulary A and one using Vocabulary B. Each is described by a WSDL that references an XSD; the XSD describes the structure (elements & attributes) of the XML messages (in RDF XML), and the vocabulary describes the RDF. Arrows labeled "generate" also appear. A Mapping Vocabulary (Mapping Vocabularies A & B) relates the two with “same as” or “same class as”. Note: you can have one or more of these.]
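A minimal sketch of what a mapping vocabulary does, assuming triples are plain tuples and the “same as” statements are held in a dictionary. The `a:` and `b:` property names are hypothetical; in OWL such statements would typically be written with constructs like `owl:sameAs`, `owl:equivalentClass` or `owl:equivalentProperty` and applied by a reasoner rather than by hand.

```python
# Hypothetical mapping vocabulary: terms of Vocabulary B that mean the
# same as terms of Vocabulary A.
SAME_AS = {
    "b:postalCode": "a:zip",
    "b:street1": "a:street",
}


def align(graph, mapping):
    """Rewrite predicates into the target vocabulary where a mapping exists."""
    return {(s, mapping.get(p, p), o) for s, p, o in graph}


# A response from the service that speaks Vocabulary B.
response_b = {
    ("ex:addr/1", "b:street1", "1 Main St"),
    ("ex:addr/1", "b:postalCode", "20190"),
    ("ex:addr/1", "b:country", "US"),  # unmapped terms pass through as-is
}

aligned = align(response_b, SAME_AS)
```

Neither service changes; the mapping is a separate artifact, which is why more than one of them can exist side by side.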

Page 23: Using Service Responses

[Diagram: RDF content is extracted from each Web Service response (XML messages in RDF XML), and the RDF content from all responses is combined into KNOWN FACTS.]

Page 24: Context Based Integration

[Diagram: Vocabularies (OWL), composed at design-time, a QUERY (SPARQL) and KNOWN FACTS are submitted to a “Semantic Interpreter” or “Semantic Message Translator” (a small wrapper around Jena), which produces the NEXT SERVICE REQUEST MESSAGE. The query is designed to obtain the desired message for the next service call. The known facts are composed from previous messages in a SOA transaction plus assertions (facts) obtained from other sources.]

Page 25: Transition Plan

► Start developing Community of Interest based vocabularies as OWL models immediately

► Generate XSDs directly from the OWL models to facilitate the use of existing infrastructure

► Begin a program of representing service-based information as RDF

► Build applications that access the RDF representation using the previously developed OWL models through SPARQL/reasoner engines

Page 26: Conclusions

► Semantic Technology is the only natural integration technology available

► It is game changing in delivering capability

► The road ahead is apparent if not quite obvious yet

► Commercial tools are available and more are coming soon

► It is time to get started

Page 27: Vocabulary Management

Page 28: Designing Semantic Data Services

[Diagram: Customer Portal, Partner App, Employee Portal, EAI / ESB and BPM connect through a layer marked "?" to Structured and Unstructured Data (DB, DB, DB, XML, XML).]

Page 29: Designing Semantic Data Services

[Diagram: Customer Portal, Partner App, Employee Portal, EAI / ESB and BPM connect through Semantic Data Services to Structured and Unstructured Data (DB, DB, DB, XML, XML).]

Page 30: Vocabulary Management

► Step 1: Extract semantics from existing data
• Import, tokenize, assign definitions, …

Page 31: Vocabulary Management

Step 1: Extract semantics from existing data

[Diagram: DB, DB, DB, XML, XML]

Page 32: Vocabulary Management

► Step 1: Extract semantics from existing data
• Import, tokenize, assign definitions, …

► Step 2: Create bootstrapped vocabulary
• Discover terms, concepts, duplications, similarities, …

Page 33: Vocabulary Management

Step 2: Create bootstrapped vocabulary

[Diagram: OWL]

Page 34: Vocabulary Management

► Step 1: Extract semantics from existing data
• Import, tokenize, assign definitions, …

► Step 2: Create bootstrapped vocabulary
• Discover terms, concepts, duplications, similarities, …

► Step 3: Evolve vocabulary collaboratively
• Make meaningful for people using natural language
• Use formalisms for machines via OWL
• Web-based collaboration and sharing

Page 35: Vocabulary Management

Step 3: Evolve vocabulary collaboratively

Page 36: Vocabulary Management

► Step 1: Extract semantics from existing data
• Import, tokenize, assign definitions, …

► Step 2: Create bootstrapped vocabulary
• Discover terms, concepts, duplications, similarities, …

► Step 3: Evolve vocabulary collaboratively
• Make meaningful for people using natural language
• Use formalisms for machines via OWL
• Web-based collaboration and sharing

► Step 4: Use vocabulary
• Used by people and tools
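Steps 1 and 2 can be sketched roughly as follows. This is only an illustration under stated assumptions: the source names, the column and element names, and the tokenization rule (split on camelCase, underscores and hyphens) are all hypothetical, and it is not a description of how any particular vocabulary tool works.

```python
import re
from collections import defaultdict


def tokenize(name):
    """Split a column or element name into lowercase word tokens."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)  # camelCase
    return tuple(t.lower() for t in re.split(r"[\s_\-]+", spaced) if t)


def bootstrap_vocabulary(sources):
    """Step 2 sketch: group imported names that tokenize to the same term."""
    terms = defaultdict(set)
    for source, names in sources.items():
        for name in names:
            terms[tokenize(name)].add(f"{source}.{name}")
    return dict(terms)


# Step 1 sketch: names imported from existing (hypothetical) data sources.
sources = {
    "crm_db": ["CustomerName", "postal_code"],
    "orders_xml": ["customer_name", "ShipDate"],
}

vocabulary = bootstrap_vocabulary(sources)

# Terms that appear in more than one source are candidate duplicates.
duplicates = {term: uses for term, uses in vocabulary.items() if len(uses) > 1}
```

The output is only a starting point; assigning definitions and resolving the candidate duplicates is the collaborative work described in Step 3.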

Page 37: Vocabulary Management

Step 4: Use vocabulary to understand

Page 38: Vocabulary Management

Step 4: Use vocabulary to design services

[Diagram: Customer Portal, Partner App, Employee Portal, EAI / ESB and BPM connect through Semantic Data Services to Structured and Unstructured Data (DB, DB, DB, XML, XML). Question marks and an OWL label mark the services to be designed from the vocabulary.]

Page 39: Vocabulary Management

Step 4: Use vocabulary to design services

Page 40: Designing Semantic Data Services

Vocabulary Management:

► Bootstrap vocabularies with existing data sources

► People collaborate on creation of vocabularies

► People can understand vocabularies

► Tools can use vocabularies (OWL artifacts) to produce semantic data services

Next:

► Deploy and use semantic data services

Page 41: Designing Semantic Data Services

[Diagram: Customer Portal, Partner App, Employee Portal, EAI / ESB and BPM connect through Semantic Data Services to Structured and Unstructured Data (DB, DB, DB, XML, XML). Tool labels: BEA AquaLogic Data Services Platform (ALDSP), OWL Import, BEA AquaLogic Workshop, Mapping with MatchIT, Knoodl.]

Page 42: SICoP Knowledge Reference Model

The point of this graph is that Increasing Metadata (from glossaries to ontologies) is highly correlated with Increasing Search Capability (from discovery to reasoning).