Semantics 101

Transcript of Semantics 101

Page 1: Semantics 101

Semantics 101
Business Use Cases

Page 2: Semantics 101

Semantics 101 Overview

Intro to Semantics
Business Use Cases
MarkLogic: Semantics + Search

Page 3: Semantics 101

Intro to Semantics
From conversations to query

Page 4: Semantics 101

Intro to Semantics

A Conversation
◦ Semantics links concepts together via triples
◦ Concepts are identified by global identifiers (IRIs)
◦ Concepts can also have descriptive metadata
◦ Ordinary names are labels – descriptive, not unique

Triples
◦ A semantic “triple” consists of a
  subject (what the assertion describes)
  predicate (the relationship)
  object (the thing or descriptive metadata that is related to the subject)
  context (identifies domain of interest, optional)
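
The parts of a triple listed above can be sketched as a small Python record. This is only an illustration: the `Triple` class, the `EX` namespace and the sample property names are hypothetical and do not come from the slides.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str                   # what the assertion describes (an IRI)
    predicate: str                 # the relationship (an IRI)
    obj: str                       # a related resource or a literal value
    context: Optional[str] = None  # optional domain of interest


# Hypothetical namespace, used only for this illustration.
EX = "http://example.org/ns#"

triples = [
    Triple(EX + "Michael", EX + "hasPolicy", EX + "PolicyX"),
    Triple(EX + "PolicyX", EX + "soldBy", EX + "InsureCo", context=EX + "policies"),
    Triple(EX + "Michael", EX + "label", "Michael"),
]

# Everything asserted about Michael.
about_michael = [t for t in triples if t.subject == EX + "Michael"]
```

Keeping the context as an optional fourth field means the same record can hold plain triples and triples scoped to a domain.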

Page 5: Semantics 101

Intro to Semantics II

Assertions are Subject | Predicate | Object
◦ Michael | is | an individual.
◦ Michael | has | policy X.
◦ Policy X | is sold by | InsureCo.
◦ Michael | is married to | Jane.
◦ Jane | is a dependent of | Michael.
◦ If A is a dependent of B, then A is an Individual.
◦ => Jane | is an | Individual.
◦ “is married to” | is | a symmetric property.
◦ If A is related to B by a symmetric property, then B is related to A by the same property.
◦ => Jane | is married to | Michael.
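
The two derivations on this slide can be reproduced with a tiny forward-chaining loop. The sketch below is plain Python, not an RDF reasoner. It treats "is married to" as a property where A p B implies B p A (what OWL calls a symmetric property), and the function name `infer` is made up for the example.

```python
facts = {
    ("Michael", "is a", "Individual"),
    ("Michael", "has", "Policy X"),
    ("Policy X", "is sold by", "InsureCo"),
    ("Michael", "is married to", "Jane"),
    ("Jane", "is a dependent of", "Michael"),
}

# Properties where A p B implies B p A.
SYMMETRIC = {"is married to"}


def infer(facts):
    """Apply the two rules repeatedly until no new fact appears."""
    known = set(facts)
    while True:
        new = set()
        for s, p, o in known:
            if p == "is a dependent of":
                new.add((s, "is a", "Individual"))
            if p in SYMMETRIC:
                new.add((o, p, s))
        if new <= known:
            return known
        known |= new


inferred = infer(facts)
```

The loop stops as soon as a pass produces nothing that is not already known, so the symmetric rule cannot run forever.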

Page 6: Semantics 101

Resource Description Framework (RDF)

Declares how assertions are made.
Analogous to XML.
Directed assertions create labeled graphs.
More generalized than hierarchies (XML or JSON)
Much of the power of RDF comes from traversal of the graph
Can be expressed in multiple ways
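
To make the graph traversal point concrete, here is a minimal sketch that treats triples as labeled edges and walks them breadth-first. The names are shortened for readability, and the `Tom_Jones` edge is a hypothetical addition so the walk has a second hop.

```python
from collections import defaultdict, deque

triples = [
    ("Jane_Doe", "hasDependent", "Sarah_Doe"),
    ("Jane_Doe", "hasDependent", "Wendy_Jones"),
    ("Wendy_Jones", "hasDependent", "Tom_Jones"),  # hypothetical extra edge
    ("Jane_Doe", "label", "Jane Elizabeth Doe"),
]

# Index: subject -> list of (predicate, object) edges.
edges = defaultdict(list)
for s, p, o in triples:
    edges[s].append((p, o))


def reachable(start, predicate):
    """Breadth-first walk that follows only edges with the given label."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        for p, o in edges.get(node, []):
            if p == predicate and o not in seen:
                seen.add(o)
                order.append(o)
                queue.append(o)
    return order


dependents = reachable("Jane_Doe", "hasDependent")
```

A hierarchy would only allow one parent per node. Here any node can have any number of incoming and outgoing edges, which is why a `seen` set is needed.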

Page 7: Semantics 101

RDF as Graph

Page 8: Semantics 101

RDF as Turtle

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix individual: <http://optum.com/ns/individual#> .
@prefix property: <http://optum.com/ns/property#> .
@prefix class: <http://optum.com/ns/class#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix aetnaPerson: <http://aetna.com/ns/persons/42164323C> .
@prefix ssn: <http://ssa.gov.us/ns/ssn#> .

individual:Jane_Doe
    owl:sameAs <http://aetna.com/ns/persons/42164323C> ,
               individual:Jane_Doe ,
               ssn:351644715 ;
    rdf:type class:Individual ;
    property:location "/apps/semantics1/data/Jane_Doe.xml" ;
    rdfs:label "Jane Elizabeth Doe" ;
    property:hasDependent individual:Sarah_Doe ,
                          individual:Wendy_Jones .
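
Turtle's prefixed names are shorthand for full IRIs. The sketch below shows the expansion in plain Python using the namespaces from this slide. The `expand` helper is hypothetical and handles only the simple `prefix:local` form, not the full Turtle grammar.

```python
PREFIXES = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "individual": "http://optum.com/ns/individual#",
    "property": "http://optum.com/ns/property#",
    "class": "http://optum.com/ns/class#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def expand(name):
    """Turn a prefixed name such as 'rdfs:label' into a full IRI."""
    prefix, _, local = name.partition(":")
    return PREFIXES[prefix] + local


subject = expand("individual:Jane_Doe")
predicate = expand("rdf:type")
```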

Page 9: Semantics 101

RDF as TriplesML

<triples>
  <triple>
    <subject>http://optum.com/ns/individual#Jane_Doe</subject>
    <predicate>http://www.w3.org/1999/02/22-rdf-syntax-ns#type</predicate>
    <object>http://optum.com/ns/class#Individual</object>
  </triple>
  <triple>
    <subject>http://optum.com/ns/individual#Jane_Doe</subject>
    <predicate>http://www.w3.org/2000/01/rdf-schema#label</predicate>
    <object datatype="http://www.w3.org/2001/XMLSchema#string">Jane Elizabeth Doe</object>
  </triple>
  <triple>
    <subject>http://optum.com/ns/individual#Jane_Doe</subject>
    <predicate>http://optum.com/ns/property#hasDependent</predicate>
    <object>http://optum.com/ns/individual#Wendy_Jones</object>
  </triple>
  <triple>
    <subject>http://optum.com/ns/individual#Jane_Doe</subject>
    <predicate>http://optum.com/ns/property#location</predicate>
    <object datatype="http://www.w3.org/2001/XMLSchema#string">/apps/semantics1/data/Jane_Doe.xml</object>
  </triple>
</triples>
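
Because this serialization is ordinary XML, a general-purpose XML parser can read it. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened copy of the slide's example. It assumes the elements carry no XML namespace, as shown on the slide, and `read_triples` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

DOC = """
<triples>
  <triple>
    <subject>http://optum.com/ns/individual#Jane_Doe</subject>
    <predicate>http://www.w3.org/2000/01/rdf-schema#label</predicate>
    <object datatype="http://www.w3.org/2001/XMLSchema#string">Jane Elizabeth Doe</object>
  </triple>
  <triple>
    <subject>http://optum.com/ns/individual#Jane_Doe</subject>
    <predicate>http://optum.com/ns/property#hasDependent</predicate>
    <object>http://optum.com/ns/individual#Wendy_Jones</object>
  </triple>
</triples>
"""


def read_triples(xml_text):
    """Return (subject, predicate, object, datatype) tuples.

    datatype is None when the object element has no datatype attribute.
    """
    rows = []
    for t in ET.fromstring(xml_text.strip()).findall("triple"):
        obj = t.find("object")
        rows.append((t.findtext("subject"), t.findtext("predicate"),
                     obj.text, obj.get("datatype")))
    return rows


rows = read_triples(DOC)
```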

Page 10: Semantics 101

RDF as RDF/XML

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:class="http://optum.com/ns/class#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:property="http://optum.com/ns/property#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <class:Individual rdf:about="http://optum.com/ns/individual#Jane_Doe">
    <owl:sameAs rdf:resource="http://aetna.com/ns/persons/42164323C"/>
    <rdf:type rdf:resource="http://optum.com/ns/class#Individual"/>
    <owl:sameAs rdf:resource="http://optum.com/ns/individual#Jane_Doe"/>
    <property:hasDependent rdf:resource="http://optum.com/ns/individual#Sarah_Doe"/>
    <property:hasDependent rdf:resource="http://optum.com/ns/individual#Wendy_Jones"/>
    <owl:sameAs rdf:resource="http://ssa.gov.us/ns/ssn#351644715"/>
    <property:location rdf:datatype="http://www.w3.org/2001/XMLSchema#string">/apps/semantics1/data/Jane_Doe.xml</property:location>
    <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Jane Elizabeth Doe</rdfs:label>
  </class:Individual>
</rdf:RDF>

Page 11: Semantics 101

RDF Schema + OWL, SPIN

Establishes rules, relationships & schemas
Builds the “logic” of RDF
Analogous to XSD
Schemas use Open World Assumption
◦ You don’t know what you don’t know
Schema model is accessible to RDF
OWL – Web Ontology Language
◦ Many flavors
SPIN (SPARQL Inferencing Notation) – extension language; MarkLogic has something similar
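
The Open World Assumption can be illustrated by contrasting two ways of answering a yes/no question. This is only a conceptual sketch in plain Python. The `denied` set and both function names are hypothetical and are not part of RDF Schema or OWL.

```python
facts = {
    ("Jane_Doe", "hasDependent", "Sarah_Doe"),
    ("Jane_Doe", "hasDependent", "Wendy_Jones"),
}
# Statements explicitly known to be false (hypothetical).
denied = {("Jane_Doe", "hasDependent", "Michael")}


def ask_closed(triple):
    """Closed world: anything not stored is treated as false."""
    return triple in facts


def ask_open(triple):
    """Open world: True, False, or None when nothing is known either way."""
    if triple in facts:
        return True
    if triple in denied:
        return False
    return None


unknown = ("Jane_Doe", "hasDependent", "Tom_Jones")
closed_answer = ask_closed(unknown)
open_answer = ask_open(unknown)
```

Under the closed-world reading the missing statement is simply false. Under the open-world reading it stays undecided until more data arrives.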

Page 12: Semantics 101

SPARQL Query

SQL-like language for the web
Matches parts or all of triples
Provides four query forms
◦ Select – gets tabular results
◦ Describe – gets triples back
◦ Ask – gets a true/false answer
◦ Construct – creates new triples
Results can be serialized to various formats:
◦ RDF/XML, TriplesML (XML), JSON, Turtle, CSV, others
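
The idea of matching parts or all of a triple can be sketched in a few lines of Python. This is a toy matcher for a single pattern, not a SPARQL engine. The `match` function and the `isDependentOf` predicate are invented for the example, and Describe is left out.

```python
triples = {
    ("Jane_Doe", "hasDependent", "Sarah_Doe"),
    ("Jane_Doe", "hasDependent", "Wendy_Jones"),
    ("Jane_Doe", "label", "Jane Elizabeth Doe"),
}


def match(pattern, data):
    """Yield one {variable: value} binding per triple that fits the pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    for triple in data:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding


pattern = ("Jane_Doe", "hasDependent", "?dep")

# Select: tabular rows of bindings.
rows = sorted(b["?dep"] for b in match(pattern, triples))
# Ask: is there at least one match?
exists = any(True for _ in match(pattern, triples))
# Construct: build new triples from each binding.
built = {(b["?dep"], "isDependentOf", "Jane_Doe") for b in match(pattern, triples)}
```

All three forms share the same matching step and differ only in what they do with the bindings.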

Page 13: Semantics 101

SPARQL Update

Completes RDF CRUD capabilities
Used for inserting content and inferencing
Supported in MarkLogic 8
Can support transactions in ML8
Good for serializing between semantic dbs
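
As a rough analogy for the create, update and delete side of CRUD, the sketch below treats a triple store as a Python set. The function names are hypothetical stand-ins for the kind of insert and delete operations an update language provides. They are not SPARQL Update syntax or any MarkLogic API.

```python
store = {
    ("Jane_Doe", "label", "Jane Doe"),
    ("Jane_Doe", "hasDependent", "Sarah_Doe"),
}


def insert_data(store, triples):
    """Add triples, ignoring any that are already present."""
    store.update(triples)


def delete_data(store, triples):
    """Remove triples if they are present."""
    store.difference_update(triples)


def replace_label(store, subject, new_label):
    """An update is a delete of the old value followed by an insert."""
    old = {t for t in store if t[0] == subject and t[1] == "label"}
    delete_data(store, old)
    insert_data(store, {(subject, "label", new_label)})


replace_label(store, "Jane_Doe", "Jane Elizabeth Doe")
```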

Page 14: Semantics 101

prefix individual: <http://optum.com/ns/individual#>
prefix class: <http://optum.com/ns/class#>
prefix property: <http://optum.com/ns/property#>
prefix xs: <http://www.w3.org/2001/XMLSchema#>
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix cts: <http://marklogic.com/cts#>
prefix xdmp: <http://marklogic.com/xdmp#>
prefix fn: <http://www.w3.org/2005/xpath-functions#>

# ?query is not bound inside the query; presumably it is supplied by the calling XQuery or JavaScript.
select ?name ?depName ?primary ?primaryPath
where {
  ?dependent rdfs:label ?depName .
  filter (cts:contains(?depName, ?query))
  ?primary property:hasDependent ?dependent .
  ?primary rdfs:label ?name .
  ?primary property:location ?primaryPath .
}

Page 15: Semantics 101

Business Use Cases
Explorations

Page 16: Semantics 101

Business Use Cases

1. Document Integration
2. Taxonomy Management
3. Natural Language Processing
4. Master Data Hub
5. Fraud Analysis
6. Recommendation Engine
7. Rights & Contract Management
8. Diagnostics Systems
9. Decision Support
10. Metadata Management System

Page 17: Semantics 101

#1 Document Integration

In the relational world, objects are tables, and relationships join tables
In the semantic world, objects are documents, and relationships join documents
Related documents can be searched and composed dynamically
Eliminates duplication – one document per entity
Low cost, minimal effort, medium value

Page 18: Semantics 101

#2 Taxonomy Management

Terms, synonyms, antonyms can be standardized across systems
Category inheritance (cat -> pet -> animal)
Controlled vocabularies can include meaning
Can feed better entity extraction
Centralization of controlled vocabularies across data silos
Moderate cost, effort, value
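
Category inheritance such as cat -> pet -> animal amounts to following "broader" links upward. A minimal sketch, assuming each term has at most one parent and using a hypothetical `BROADER` table:

```python
# child -> parent links in a hypothetical controlled vocabulary.
BROADER = {"cat": "pet", "dog": "pet", "pet": "animal"}


def ancestors(term):
    """Follow parent links upward and return every broader category."""
    chain = []
    while term in BROADER:
        term = BROADER[term]
        chain.append(term)
    return chain


def is_a(term, category):
    """True if term equals the category or inherits from it."""
    return term == category or category in ancestors(term)


cat_chain = ancestors("cat")
```

A search for "animal" could then also return items tagged only as "cat", because the inheritance is derived rather than stored on each item.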

Page 19: Semantics 101

#3 Natural Language Processing

Search becomes more relevant and accurate
Descriptive content (doctor’s reports, etc.) can provide additional metadata
Better facilitates multiple languages and searches with inaccurate spelling
Documents can be made more granular for searching
Context-sensitive searches become possible
Moderate cost, effort, value

Page 20: Semantics 101

#4 Master Data Hub

From ETL to ELT
Turns schemas into logical relationships
Initial system converts databases to RDF representations of models
Inferencing & new information create canonical representations of entities
Can query canon or source simultaneously
Services architecture becomes much simpler
Medium cost, complexity, high value

Page 21: Semantics 101

#5 Fraud Analysis

Can identify potential abuses
One project detected more than $1B in fraud in the insurance industry
Requires large data and considerable processing
Uses inferencing to detect patterns of usage
Identifies individuals under multiple aliases
High cost, effort, very high value
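
One way to picture identifying individuals under multiple aliases is to merge identifiers that are linked by "same as" statements. The sketch below uses a union-find structure in plain Python. The identifiers are invented, and the slides do not say this is how the project did it.

```python
def find(parent, x):
    """Return the representative of x's group, compressing the path."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def group_aliases(same_as_pairs):
    """Merge identifiers connected by 'same as' links into alias groups."""
    parent = {}
    for a, b in same_as_pairs:
        parent[find(parent, a)] = find(parent, b)
    groups = {}
    for x in list(parent):
        groups.setdefault(find(parent, x), set()).add(x)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical identifiers, not taken from any real data set.
links = [("claimant:17", "ssn:A"), ("claimant:42", "ssn:A"), ("claimant:99", "ssn:B")]
aliases = group_aliases(links)
```

Here the two claimant records that share one identifier end up in the same group, which is the kind of pattern a fraud review would flag.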

Page 22: Semantics 101

#6 Recommendation Engine

Based upon various criteria, recommends specific products to customers
Mix of semantics, search, data analytics
Sensitive to changes in rates, offerings, provisions
Useful for insurance exchanges, works well with Optum market
Medium cost, medium complexity, high value

Page 23: Semantics 101

#7 Rights & Contract Management

Determines contract provisioning, enforcement and domains
Useful for regulatory tracking and fraud prevention
Identifies equivalent legal language across different policies
Medium cost, high complexity, medium value

Page 24: Semantics 101

#8 Diagnostics Systems

More applicable to health care, can identify symptoms and provide likely diagnoses
Can be used in conjunction with EHRs
Uses NLP and Recommendation Systems
Revenue generating potential
High cost, high complexity, high to very high value

Page 25: Semantics 101

#9 Decision Support

Tracks and weighs decision trees and timeline management
Uses semantics both to provide metrics to links and to manage timelines
This could have value in managing participant, provider and treatment timelines, as well as in establishing both auditing and action recommendations.
Medium cost, medium complexity, medium value

Page 26: Semantics 101

#10 Metadata Management System

This would be a metadata management system for ingesting, classifying, indexing, searching and managing the rights of media assets.
It moves beyond simple taxonomy systems, both by allowing for multiple concurrent taxonomies on resources and
Moderate cost, medium complexity, medium to high value

Page 27: Semantics 101

MarkLogic
Merging Semantics and Search

Page 28: Semantics 101

MarkLogic: Semantics+Search

Triple index is where queries happen
Triples “bound” to sem:triple XML structure.
sem:triples can either be in documents or bundles
SPARQL can perform cts:queries
SPARQL is parameterized from XQuery or JavaScript
Output is either triples or sequences of maps
SPARQL can output to XML, JSON, CSV, Turtle, or other formats

Page 29: Semantics 101

MarkLogic: Semantics+Search 2

Why Sem+Search?
◦ Facilitates joining complex, multipart documents
◦ Sem makes creating ad-hoc indexes easy
◦ Semantics critical for data hubs
◦ Sem+search combines searching by type & concept as well as words.
◦ Makes natural language processing (much) easier
◦ Works better for document-centric apps than traditional RDF databases
◦ Chaining of queries through XQuery or JavaScript possible
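
The point about combining search by type and concept with search by words can be pictured with a plain Python analogy. This is not MarkLogic code. The documents, triples and the `search` function are all hypothetical.

```python
# Hypothetical documents and triples.
documents = {
    "/data/Jane_Doe.xml": "Jane Elizabeth Doe policy holder asthma",
    "/data/Policy_X.xml": "Policy X covers asthma treatment",
    "/data/Tom_Jones.xml": "Tom Jones policy holder diabetes",
}
triples = {
    ("/data/Jane_Doe.xml", "type", "Individual"),
    ("/data/Tom_Jones.xml", "type", "Individual"),
    ("/data/Policy_X.xml", "type", "Policy"),
}


def search(word, doc_type):
    """Documents of the given type (from triples) whose text contains the word."""
    typed = {s for s, p, o in triples if p == "type" and o == doc_type}
    return sorted(uri for uri in typed
                  if word.lower() in documents[uri].lower().split())


hits = search("asthma", "Individual")
```

A word search alone would also return the policy document. The type constraint from the triples narrows the result to individuals.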

Page 30: Semantics 101

Questions?