XML in Healthcare and the Semantic Web
Jonathan Borden, M.D.
Center for Brain and Cranial Diseases
St. Vincent Health System, Erie PA
Invited Expert, W3C Web Ontology Working Group
Chair, ASTM E31.28 Electronic Healthcare Records
The Goal
Answer questions like:
“Of all the patients I operated on for brain tumors between 1996 and 2000, matching severity of pathology and matching clinical status, and who have the “P53” mutation, did PCV chemotherapy improve the cure rate at five years?”
Healthcare: The current situation
A disaster:
1.1 trillion $/year in the USA
30-40% overhead
mostly paper based
highly proprietary commercial systems
tens of thousands of people die each year due to poor information/errors
Most of the information is rendered useless
Strategies
Define open standards
Capture information in an electronic form
Reduce errors related to information
Define distributed, web enabled, query models
Tactics
XML, schemas, query model
Semantic Web/URI graphs
Data analysis based on actual population rather than small, potentially biased, samples
Google for biomedical information
Why XML?
Widely implemented with excellent open source tools
Life of data is longer than life of application
Data driven, platform independent
Formal schema and query models
Reinventing medical informatics
Get the data format right and the rest will follow
Structured information has been the holy grail of medical informatics for the last 30+ years
XML is the culmination of 30+ years of work in structured information
Time to do something
XML Briefly
Simplification of SGML … markup language for the web
<element> content </element>
<element attribute="value">
  <child-element another="123"/>
</element>
XML and Infosets
<patient>
  <person.name>
    <given>James</given>
    <given>Steven</given>
    <family>Smith</family>
    <suffix>3rd</suffix>
  </person.name>
</patient>

startElement("patient")
startElement("person.name")
startElement("given");
characters("James");
...
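The event stream on this slide can be reproduced with any SAX parser. The following is a minimal sketch using Python's standard `xml.sax` module; the `EventRecorder` class and the closed-off `PATIENT_XML` sample are assumptions made for illustration, not part of the original deck.

```python
import xml.sax

# The patient fragment from the slide, closed so that it parses on its own.
PATIENT_XML = """<patient>
  <person.name>
    <given>James</given>
    <given>Steven</given>
    <family>Smith</family>
    <suffix>3rd</suffix>
  </person.name>
</patient>"""


class EventRecorder(xml.sax.ContentHandler):
    """Hypothetical handler that records SAX callbacks as (event, value) pairs."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("startElement", name))

    def endElement(self, name):
        self.events.append(("endElement", name))

    def characters(self, content):
        # Ignore whitespace-only text that exists purely for indentation.
        if content.strip():
            self.events.append(("characters", content.strip()))


recorder = EventRecorder()
xml.sax.parseString(PATIENT_XML.encode("utf-8"), recorder)

# Print the first few events in the notation used on the slide.
for event, value in recorder.events[:4]:
    print(f'{event}("{value}")')
```

A parser is free to deliver long text in several `characters` calls, so a production handler would buffer text until the matching `endElement`.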
Regular Expressions
Pattern matching
"*TATA*"
bp ::= 'G' | 'T' | 'A' | 'C'
tata ::= bp*, 'T', 'A', 'T', 'A', bp*
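The grammar above maps directly onto a conventional regular expression. As a small sketch (the function name `has_tata` and the sample sequences are made up for illustration):

```python
import re

# bp   ::= 'G' | 'T' | 'A' | 'C'
# tata ::= bp*, 'T', 'A', 'T', 'A', bp*
TATA = re.compile(r"[GTAC]*TATA[GTAC]*")


def has_tata(sequence: str) -> bool:
    """True if the whole sequence is base pairs and contains TATA."""
    return TATA.fullmatch(sequence) is not None


print(has_tata("GGCTATAAAGC"))  # True: contains TATA
print(has_tata("GGCCGGAAGC"))   # False: no TATA
print(has_tata("GGNTATA"))      # False: 'N' is not a base pair in this grammar
```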
XML DTD
<!ELEMENT foo (bar*)>
<!ELEMENT bar (baz?)>
<!ATTLIST bar bop CDATA #IMPLIED>
<!ELEMENT baz (#PCDATA)>
Tree Regular Expressions
element foo {
  element bar {
    attribute bop [int]
    element baz {'xxx'}
  }
}

<foo>
  <bar bop="23">
    <baz>xxx</baz>
  </bar>
</foo>
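One way to see why schemas are "tree regular expressions": each element's content model is an ordinary regular expression over the sequence of its child element names, applied recursively down the tree. The sketch below is not a DTD validator. It assumes the content models from the DTD slide and the `[int]` constraint on `bop`, and it does not reject undeclared attributes.

```python
import re
import xml.etree.ElementTree as ET

# Content models written as regexes over a space-terminated list of child names.
CONTENT_MODELS = {
    "foo": r"(bar )*",  # <!ELEMENT foo (bar*)>
    "bar": r"(baz )?",  # <!ELEMENT bar (baz?)>
    "baz": r"",         # <!ELEMENT baz (#PCDATA)>: text only, no child elements
}


def is_valid(element: ET.Element) -> bool:
    """Check an element and its descendants against CONTENT_MODELS."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return False  # undeclared element
    children = "".join(child.tag + " " for child in element)
    if re.fullmatch(model, children) is None:
        return False
    # "attribute bop [int]": when present, bop must look like an integer.
    if element.tag == "bar" and "bop" in element.attrib:
        if re.fullmatch(r"-?\d+", element.attrib["bop"]) is None:
            return False
    return all(is_valid(child) for child in element)


good = ET.fromstring('<foo><bar bop="23"><baz>xxx</baz></bar></foo>')
bad = ET.fromstring("<foo><baz>xxx</baz></foo>")  # baz is not allowed directly in foo
print(is_valid(good), is_valid(bad))
```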
ASTM E2182/E2183
XML DTDs for Healthcare
Emphasize Human Readability
Flexibility
Openhealth reference implementation http://www.openhealth.org/ASTM
Compatible with HL7 CDA
ASTM Healthcare DTDs
clinical.header compatible with HL7 CDA
clinical.body specific to document type
  operative.report
  radiology.report
  discharge.summary
  etc.
ASTM E31.28 Clinical Header
ch.person.type = person.name, id*, addr*
ch.organization.type = organization.name?, id*, addr*
clinical.header =
  element clinical.header {
    ch.attrib, id*, version.number?, confidentiality.code*, patient.encounter?, authenticator*, legal.authenticator*, intended.recipient*, originator?, originating.organization?, transcriptionist?, provider+, service.actor*, patient, events?, codes?, related.document*
  }
ASTM E31.28 Clinical Header
service.actor = element service.actor {
  ch.attrib, xlink.attrib?,
  (person.name|organization.name), id*, addr*, type.code?, function?, date.time?
}
provider = element provider {
  ch.attrib, ch.actor.type, function?
}
ASTM E31.28 Clinical Header
patient.encounter = element patient.encounter {
  ch.attrib,
  (id? & practice.setting? & date.time? & location)
}
service.target.model = ch.actor.type & birth.date? & gender?
patient = element patient {
  ch.attrib, xlink.attrib?,
  service.target.model
}
Encounter
<encounter>
  <patient>…</patient>
  <provider>…</provider>
  <date.time>…</date.time>
  <location> … </location>
  <encounter.id>…</encounter.id>
</encounter>
XML examples
<person>
  <person.name>
    <prefix>Ms.</prefix>
    <given>Susan</given>
    <given>Samantha</given>
    <family>Jones</family>
  </person.name>
  <id type="SSN">000-11-2233</id>
</person>
XML examples
<patient>
  <person.name> … </person.name>
  <id authority="New England Medical Center">000112233</id>
</patient>
<provider>
  <person.name><prefix>Dr.</prefix><given>Amanda</given><family>Smith</family></person.name>
</provider>
Using XML to generate reports
Browser form
ASTM E2182 XML format
XSLT transform for display in browser
XSL-FO transform for printable form (e.g. PDF)
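The deck performs the display step with XSLT. Python's standard library has no XSLT engine, so the sketch below substitutes a plain ElementTree walk to show the same idea: each field of the XML report becomes a table row in XHTML. The `to_xhtml` function and the table layout are assumptions made for illustration, not the stylesheet used by the project.

```python
import xml.etree.ElementTree as ET

# A few fields from the operative report body shown later in the deck.
REPORT = """<clinical.body>
  <procedure>Right Frontal Craniotomy for Excision of Brain Tumor</procedure>
  <anesthesia>GETA</anesthesia>
  <disposition>SICU</disposition>
</clinical.body>"""


def to_xhtml(report_xml: str) -> str:
    """Render each child element of the report as a <tr> in an XHTML table."""
    body = ET.fromstring(report_xml)
    html = ET.Element("html", xmlns="http://www.w3.org/1999/xhtml")
    table = ET.SubElement(ET.SubElement(html, "body"), "table")
    for field in body:
        row = ET.SubElement(table, "tr")
        ET.SubElement(row, "th").text = field.tag   # element name as the label
        ET.SubElement(row, "td").text = field.text  # element content as the value
    return ET.tostring(html, encoding="unicode")


print(to_xhtml(REPORT))
```

An XSLT stylesheet expresses the same mapping declaratively, which is why the same source document can feed both the browser view and the XSL-FO print view.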
ASTM Opnote: Header (1/3)
<operative.report xmlns="http://www.openhealth.org/ASTM/operative.report">
  <clinical.header xmlns="http://www.openhealth.org/ASTM/clinical.header">
    <id>5556666</id>
    <patient.encounter>
      <id>ENC-11111</id>
      <practice.setting>Operation</practice.setting>
      <date.time>2000-10-15</date.time>
      <location>New England Medical Center</location>
    </patient.encounter>
    <provider>
      <person.name>
        <prefix type="title">Dr.</prefix>
        <given>Jonathan</given>
        <given>Alan</given>
        <family>Borden</family>
        <suffix type="degree">M.D.</suffix>
      </person.name>
      ...
ASTM Opnote: Header (2/3)
      …
      <id type="license" authority="MA">12345</id>
      <addr type="office">
        <house.number>750</house.number>
        <street>Washington Street</street>
        <city>Boston</city>
        <state>MA</state>
        <zip>02111</zip>
        <uri type="email">mailto:[email protected]</uri>
        <telephone>617-636-7587</telephone>
      </addr>
      <type.code>Attending</type.code>
      <function>Surgeon</function>
    </provider>
    ...
ASTM Opnote: Header (3/3)
    …
    <patient>
      <person.name>
        <given>John</given>
        <given type="MI">Q</given>
        <family>Doe</family>
        <suffix>Jr.</suffix>
      </person.name>
      <id type="patient.identifier" authority="NEMC">111223344</id>
      <id type="SSN" authority="SSA">111-22-3344</id>
      <birth.date>1955-10-21</birth.date>
    </patient>
    <codes>
      <coded.value code.system="CPT">63051</coded.value>
      <coded.value code.system="CPT">69990</coded.value>
      <coded.value code.system="ICD9">XXX.21</coded.value>
    </codes>
  </clinical.header>
ASTM Opnote: Body
  <clinical.body>
    <preoperative.diagnosis>Right Frontal Brain Tumor</preoperative.diagnosis>
    <postoperative.diagnosis>same, probable Astrocytoma</postoperative.diagnosis>
    <procedure>Right Frontal Craniotomy for Excision of Brain Tumor</procedure>
    <anesthesia>GETA</anesthesia>
    <indications>
      <p>The patient presents with severe headaches and blurred vision. An MRI demonstrates a large cystic irregularly shaped mass within the right frontal lobe.</p>
    </indications>
    <description>
      <p>The patient had application of the external fiducial markers and was brought down to the MRI suite where a head MRI was obtained using the frameless stereotactic (3D) protocol. The image set was transferred using the DICOM protocol ... </p>
    </description>
    <estimated.blood.loss>100cc</estimated.blood.loss>
    <patient.condition>Stable, extubated</patient.condition>
    <disposition>SICU</disposition>
  </clinical.body>
</operative.report>
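Because the operative note is plain namespaced XML, software can pull structured fields out of it with a few path expressions. The sketch below uses Python's `xml.etree.ElementTree`. The `OPNOTE` string is an abbreviated copy of the slides' example, shortened here for illustration, and the prefixes `op` and `ch` are arbitrary local names chosen for this example.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map; the URIs are the ones declared in the sample document.
NS = {
    "op": "http://www.openhealth.org/ASTM/operative.report",
    "ch": "http://www.openhealth.org/ASTM/clinical.header",
}

# Abbreviated operative report (most header and body fields omitted).
OPNOTE = """<operative.report xmlns="http://www.openhealth.org/ASTM/operative.report">
  <clinical.header xmlns="http://www.openhealth.org/ASTM/clinical.header">
    <id>5556666</id>
    <patient>
      <person.name><given>John</given><family>Doe</family></person.name>
      <birth.date>1955-10-21</birth.date>
    </patient>
    <codes>
      <coded.value code.system="CPT">63051</coded.value>
      <coded.value code.system="CPT">69990</coded.value>
      <coded.value code.system="ICD9">XXX.21</coded.value>
    </codes>
  </clinical.header>
  <clinical.body>
    <procedure>Right Frontal Craniotomy for Excision of Brain Tumor</procedure>
    <estimated.blood.loss>100cc</estimated.blood.loss>
  </clinical.body>
</operative.report>"""

root = ET.fromstring(OPNOTE)

# Header elements live in the clinical.header namespace.
family = root.findtext(
    "ch:clinical.header/ch:patient/ch:person.name/ch:family", namespaces=NS
)
cpt_codes = [
    cv.text
    for cv in root.iterfind("ch:clinical.header/ch:codes/ch:coded.value", NS)
    if cv.get("code.system") == "CPT"
]
# Body elements inherit the operative.report namespace from the root.
procedure = root.findtext("op:clinical.body/op:procedure", namespaces=NS)

print(family, cpt_codes, procedure)
```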
How it works
Browser
Apache
XSLT
Servlet engine
xml:db
RDF
Form generation
Form.xml
Defaults.xml
Formgen.xsl
XML + XSLT => XHTML
Workflow
Form created
Transform into ASTM XML format
XHTML editing (opnote-edit.xsl)
Sign finished product
Render as XHTML for viewing, printing
email to Medical Records and Billing
Workflow
generate
edit
sign
Billing
repository
Document analysis
Like gene sequences, it turns out that …
Medical documentation is highly repetitive
With ‘hot spots’ of unique information
Schema defines template filled with values
Easily expanded into HTML for human consumption
Easily analyzed by software
Document analysis
Integrating binary formats
MIME <-> XMTP
HL7 V2
X12 EDI
DICOM
Internet Telemedicine
The OceanMed project, 1998
Merchant vessel, e-mail access via satellite gateway
Digital camera
Web based physician access
XMTP Consult
36 year old male has itchy rash for 6 days
Hydrocortisone cream 1% to affected area t.i.d.|
reply
How it works
Messages arrive in MIME format
MIME SAX parser ‘converts’ to XML by SAX events
XMTP employs XML object model
*not necessarily* serialization format -> grove processing
XMTP
From: [email protected]
To: [email protected]
Content-type: multipart/related; charset=iso-8859-1
---------
startDocument()
startElement("MIME")
startElement("From")
characters("[email protected]")
endElement("From")
startElement("Content-Type", attribute("charset", "iso-8859-1"))
characters("multipart/related")
endElement("Content-Type")
The XMTP/MIME grove
Content-type: text/plain
From: [email protected]
Hi Sue! See you in Boston, Joe
<MIME>
<Content-Type>text/plain</Content-Type>
<From>[email protected]</From>
<Body>Hi Sue! See you in Boston, Joe</Body>
</MIME>
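The mapping from a MIME message to an XML tree can be sketched with Python's standard `email` and `xml.etree.ElementTree` modules. This is not the XMTP implementation: the `mime_to_xml` function is hypothetical, the sender address is a placeholder because the original is redacted in the transcript, and only single-part messages with simple headers are handled.

```python
from email import message_from_string
import xml.etree.ElementTree as ET

# A single-part message like the one on the slide (placeholder address).
RAW = (
    "Content-Type: text/plain\n"
    "From: joe@example.org\n"
    "\n"
    "Hi Sue! See you in Boston, Joe\n"
)


def mime_to_xml(raw: str) -> ET.Element:
    """Turn each header into a child element and the payload into <Body>."""
    msg = message_from_string(raw)
    root = ET.Element("MIME")
    for name, value in msg.items():
        ET.SubElement(root, name).text = value
    ET.SubElement(root, "Body").text = msg.get_payload().strip()
    return root


doc = mime_to_xml(RAW)
print(ET.tostring(doc, encoding="unicode"))
```

Once the message is a tree, the same XSLT and query tools used for the clinical documents apply to it, which is the point of treating MIME as a grove rather than as a serialization format.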
The HL7 Grove
Non-XML syntax => XML Infoset
MSH|PAT|Jones^James^Stephen^3rd|
startElement("patient")
startElement("person.name")
startElement("family")
characters("Jones");
endElement("family")
…
endElement("person.name")
endElement("patient")
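The same idea applies to the delimited HL7 text: split on the field and component separators, then emit elements. The sketch below handles only the simplified segment shown on the slide. It is not an HL7 V2 parser, and the component order in `NAME_PARTS` (family, given, second given, suffix) is an assumption read off the example.

```python
import xml.etree.ElementTree as ET

# Assumed meaning of the ^-separated components in the slide's name field.
NAME_PARTS = ["family", "given", "given", "suffix"]


def name_field_to_xml(field: str) -> ET.Element:
    """Build <patient><person.name>...</person.name></patient> from a name field."""
    patient = ET.Element("patient")
    name = ET.SubElement(patient, "person.name")
    for tag, value in zip(NAME_PARTS, field.split("^")):
        if value:  # skip empty components
            ET.SubElement(name, tag).text = value
    return patient


segment = "MSH|PAT|Jones^James^Stephen^3rd|"
fields = segment.split("|")
patient = name_field_to_xml(fields[2])
print(ET.tostring(patient, encoding="unicode"))
```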
Simple building blocks
XML parsers
XSLT transform engines
HTTP clients and servers
From syntax to semantics
Layer 1: syntax
  XML defines syntactic constraints on text
  other specs define syntactic constraints on binary data
Layer 2: datatypes
  integers define mapping from lexical space to value space
  “10” base 10 -> 10, “10” base 2 -> 2
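The lexical-to-value mapping can be shown in a couple of lines. In this sketch, `to_value` is a hypothetical helper wrapping Python's built-in `int`:

```python
def to_value(lexical: str, base: int) -> int:
    """Map a string from the lexical space to an integer in the value space."""
    return int(lexical, base)


print(to_value("10", 10))  # 10
print(to_value("10", 2))   # 2
# Different lexical forms can denote the same value.
print(to_value("010", 10) == to_value("10", 10))  # True
```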
The shape of information
syntax -> structure = semantics
[Figure: pattern matching transform applied to the sequence “…..TATA…..”, with labels gene, tata, snp, snp]
Semantics
Layer 3: hierarchy of classes
  the set of individuals of a given datatype or object type defines a class
Ontology: a description of a collection of classes, their properties and the relationships between them
Healthcare Ontology
RDF in Healthcare
<rdf:Description about="…/patient/12345">
  <lab:HIV>positive</lab:HIV>
  <lab:CD4>100</lab:CD4>
</rdf:Description>
<path:Biopsy about="…/patient/12345">
  <path:description>The brain demonstrates areas of PML including viral inclusion bodies</path:description>
</path:Biopsy>
RDF is...
A standard syntax to represent (edge labeled) directed graphs in XML
DLG: Semantic Networks
[Figure: semantic network. Nodes: vertebrate, mammal, bird, canary, ostrich, heart, spine, hair, fly, wings, walk, doesn’t fly, yellow. Edge labels: isa, has, can. Individuals: freddie, hugo.]
Semantic Networks
A way to represent natural language circa 1970s
A format for organizing statements in a way that can be queried by computers
Semantic Networks
“Can freddy fly?”
“Does hugo have wings?”
“Does freddy have a spine?”
“Of all the canaries, how many live in cages?”
RDF N-triples syntax
Subject predicate object .
ex:Freddy rdf:type ex:Canary .
ex:Canary rdfs:subClassOf ex:Bird .
ex:Freddy ex:color "Yellow" .
[Figure: graph of these triples. Nodes: Freddie, Canary, Bird, yellow. Edge label: isa.]
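Triples like these are enough to answer simple "isa" questions mechanically. The sketch below stores the three statements as Python tuples and follows `rdfs:subClassOf` links to decide class membership. The helper names `superclasses` and `is_a` are invented for this example, and a real RDF toolkit would do considerably more.

```python
# The three statements from the slide, as (subject, predicate, object) tuples.
TRIPLES = {
    ("ex:Freddy", "rdf:type", "ex:Canary"),
    ("ex:Canary", "rdfs:subClassOf", "ex:Bird"),
    ("ex:Freddy", "ex:color", "Yellow"),
}


def superclasses(cls, triples):
    """Return cls plus every class reachable from it via rdfs:subClassOf."""
    seen, todo = set(), [cls]
    while todo:
        current = todo.pop()
        if current not in seen:
            seen.add(current)
            todo.extend(
                o for s, p, o in triples
                if s == current and p == "rdfs:subClassOf"
            )
    return seen


def is_a(individual, cls, triples):
    """True if the individual's asserted type is cls or a subclass of cls."""
    direct = [o for s, p, o in triples if s == individual and p == "rdf:type"]
    return any(cls in superclasses(d, triples) for d in direct)


print(is_a("ex:Freddy", "ex:Bird", TRIPLES))     # True, inferred via Canary
print(is_a("ex:Freddy", "ex:Ostrich", TRIPLES))  # False, nothing supports it
```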
RDF/XML syntax
<rdf:Description rdf:ID="Freddy">
  <rdf:type rdf:resource="#Canary"/>
  <ex:color>Yellow</ex:color>
</rdf:Description>
<rdf:Description rdf:ID="Canary">
</rdf:Description>
RDF/XML syntax: typed
<ex:Canary rdf:ID="Freddy">
  <ex:color>Yellow</ex:color>
</ex:Canary>
Semantic analysis
“Of all the patients I operated on for brain tumors between 1996 and 2000, matching severity of pathology and matching clinical status, and who have the “P53” mutation, did PCV chemotherapy improve the cure rate at five years?”
Web Ontology Language (OWL)
Problem (restated): "Tell me what wines I should buy to serve with each course of the following menu. And, by the way, I don't like Sauterne."
OWL is a language for defining Web ontologies and their associated knowledge bases.
Ontologies
Ontology is a term borrowed from philosophy that refers to the science of describing the kinds of entities in the world and how they are related. In OWL, an ontology is a set of definitions of classes and properties, and constraints on the way those classes and properties can be employed.
OWL
includes
  taxonomic relations between classes,
  datatype properties: descriptions of attributes of elements of classes,
  object properties: descriptions of relations between elements of classes.
Datatype properties and object properties are collectively the properties of a class.
Simple Named Classes
class, subClassOf
Root classes: Every individual in the OWL world is a member of owl:Thing.
In our sample wines domain, we create three root classes: Winery, Region, and ConsumableThing.
<owl:Class rdf:ID="Winery"/>
<owl:Class rdf:ID="Region"/>
<owl:Class rdf:ID="ConsumableThing"/>
Simple Named Classes
class, subClassOf
<owl:Class rdf:ID="PotableLiquid">
  <rdfs:subClassOf rdf:resource="#ConsumableThing" />
</owl:Class>
<owl:Class rdf:ID="Wine">
  <rdfs:subClassOf rdf:resource="#PotableLiquid"/>
  <rdfs:label xml:lang="en">wine</rdfs:label>
  <rdfs:label xml:lang="fr">vin</rdfs:label>
  ...
</owl:Class>
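Since OWL is serialized as RDF/XML, the class hierarchy can be read with an ordinary XML parser. The sketch below extracts `subClassOf` links and walks them upward. It only understands the simple `rdf:resource` form shown on these slides (not nested restrictions), and `parent_map` and `ancestors` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# The named classes from the slides, wrapped in an rdf:RDF root so they parse.
ONTOLOGY = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="ConsumableThing"/>
  <owl:Class rdf:ID="PotableLiquid">
    <rdfs:subClassOf rdf:resource="#ConsumableThing"/>
  </owl:Class>
  <owl:Class rdf:ID="Wine">
    <rdfs:subClassOf rdf:resource="#PotableLiquid"/>
  </owl:Class>
</rdf:RDF>"""


def parent_map(xml_text):
    """Map each class ID to the IDs of its directly asserted superclasses."""
    parents = {}
    for cls in ET.fromstring(xml_text).iter(f"{{{OWL}}}Class"):
        name = cls.get(f"{{{RDF}}}ID")
        parents[name] = [
            sub.get(f"{{{RDF}}}resource").lstrip("#")
            for sub in cls.findall(f"{{{RDFS}}}subClassOf")
        ]
    return parents


def ancestors(name, parents):
    """List all superclasses of a class, nearest first."""
    result = []
    for parent in parents.get(name, []):
        result.append(parent)
        result.extend(ancestors(parent, parents))
    return result


print(ancestors("Wine", parent_map(ONTOLOGY)))
```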
Defining individuals
<Region rdf:ID="CentralCoastRegion" />
is identical to
<owl:Thing rdf:ID="CentralCoastRegion" />
<owl:Thing rdf:about="#CentralCoastRegion">
  <rdf:type rdf:resource="#Region"/>
</owl:Thing>
Grapes
<owl:Class rdf:ID="Grape" />
<owl:Class rdf:ID="WineGrape">
  <rdfs:subClassOf rdf:resource="#Grape"/>
</owl:Class>
<WineGrape rdf:ID="CabernetSauvignonGrape" />
Simple properties
Object Properties
<owl:ObjectProperty rdf:ID="madeFromGrape">
  <rdfs:domain rdf:resource="#Wine"/>
  <rdfs:range rdf:resource="#WineGrape"/>
</owl:ObjectProperty>
Property hierarchy
<owl:Class rdf:ID="WineDescriptor" />
<owl:Class rdf:ID="WineColor">
  <rdfs:subClassOf rdf:resource="#WineDescriptor" />
  ...
</owl:Class>
<owl:ObjectProperty rdf:ID="hasWineDescriptor">
  <rdfs:domain rdf:resource="#Wine" />
  <rdfs:range rdf:resource="#WineDescriptor" />
</owl:ObjectProperty>
<owl:ObjectProperty rdf:ID="hasColor">
  <rdfs:subPropertyOf rdf:resource="#hasWineDescriptor" />
  <rdfs:range rdf:resource="#WineColor" />
</owl:ObjectProperty>
Domain and range
<owl:ObjectProperty rdf:ID="locatedIn">
...
<rdfs:domain rdf:resource="http://www.w3.org/2002/07/owl#Thing" />
<rdfs:range rdf:resource="#Region" />
</owl:ObjectProperty>
Restrictions
<owl:Class rdf:ID="Wine">
  <rdfs:subClassOf rdf:resource="#PotableLiquid"/>
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#madeFromGrape"/>
      <owl:minCardinality>1</owl:minCardinality>
    </owl:Restriction>
  </rdfs:subClassOf>
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#locatedIn"/>
      <owl:minCardinality>1</owl:minCardinality>
    </owl:Restriction>
  </rdfs:subClassOf>
  ...
</owl:Class>
Vintages
<owl:Class rdf:ID="Vintage">
  <rdfs:subClassOf rdf:resource="http://www.w3.org/2002/07/owl#Thing"/>
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#vintageOf"/>
      <owl:minCardinality>1</owl:minCardinality>
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>
<owl:ObjectProperty rdf:ID="vintageOf">
  <rdfs:domain rdf:resource="#Vintage" />
  <rdfs:range rdf:resource="#Wine" />
</owl:ObjectProperty>
Datatype properties
<owl:Class rdf:ID="WineYear" />
<owl:DatatypeProperty rdf:ID="yearValue">
  <rdfs:domain rdf:resource="#WineYear" />
  <rdfs:range rdf:resource="&dt;wineYear"/>
</owl:DatatypeProperty>
&dt;wineYear ::= integer > 1700
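A datatype such as wineYear combines a lexical-to-value mapping with a constraint on the value. As a rough sketch (the function name `parse_wine_year` and the error message are made up for this example):

```python
def parse_wine_year(lexical: str) -> int:
    """Parse a wineYear: an integer strictly greater than 1700."""
    year = int(lexical)  # lexical space -> value space
    if year <= 1700:
        raise ValueError(f"wineYear must be greater than 1700, got {year}")
    return year


print(parse_wine_year("1998"))
try:
    parse_wine_year("1650")
except ValueError as err:
    print(err)
```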
Properties of individuals
<CaliforniaRegion rdf:ID="SantaCruzMountainsRegion" />
<Winery rdf:ID="SantaCruzMountainVineyard" />
<CabernetSauvignon rdf:ID="SantaCruzMountainVineyardCabernetSauvignon" >
<locatedIn rdf:resource="#SantaCruzMountainsRegion"/>
<hasMaker rdf:resource="#SantaCruzMountainVineyard" />
</CabernetSauvignon>
Ontology mapping
sameClassAs
sameIndividualAs
samePropertyAs
<owl:Class rdf:ID="TexasThings">
  <owl:sameClassAs>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#locatedIn" />
      <owl:allValuesFrom rdf:resource="#TexasRegion" />
    </owl:Restriction>
  </owl:sameClassAs>
</owl:Class>
Complex constructs
Description Logic
  unionOf
  intersectionOf
  complementOf
  oneOf
  disjointWith
Healthcare DL ontologies
OpenGALEN http://www.opengalen.org
  Open terminology
  French Ministry of Health CCAM
SNOMED http://www.snomed.org
  Closed DL terminology
Simplified Healthcare Ontology
<owl:Class rdf:ID="Provider">
  <rdfs:subClassOf rdf:resource="#Person"/>
</owl:Class>
Simplified Healthcare Ontology
Healthcare Ontology
Putting it all together
Biomedical information has many vocabularies - each in its own namespace
genetics “Bio ML”
pathology “SNOMED”
surgery “CPT”
medicine “ICD”
radiology “DICOM”
Putting it all together
[Figure: electronic medical record linking genes, diagnoses, drugs, procedures and genetics. Labels: person, MRI, Path-specimen, Gene:p53, Left temporal tumor, SNOMED: glioblastoma.]
OWL across schemas
Assimilating disparate information
[Figure: terms related across schemas. Labels: glioblastoma, p53.1, ring enhancing, enhancing astrocytoma, p53.]
UMLS next generation
Ontologies exposed as OWL on web
Cross references exposed as OWL on web
Enables searching for and reasoning about terms relating to each other
Enables searching for and reasoning about terms from multiple terminologies
Semantic analysis
[Figure: repository graph. Node labels: instance, Class, Property. Edge labels: type, subClass, domain.]
Queries: several views
Regular expression pattern matching
Query as universal/existential quantification (FOPL)
Query as DL classification
First Order Predicate Logic
(for-all ?pat
  (exists ?surgeon (last-name ?surgeon "Borden"))
  (exists ?procedure
    (craniotomy ?procedure)
    (patient ?procedure ?pat)
    (surgeon ?procedure ?surgeon)
    (between (date ?procedure) "1996" "2000")
    (sequence ?procedure "p53")
    ...
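Read as a filter over a repository of procedure records, the quantified query above selects patients for whom a matching procedure exists. The sketch below expresses that reading in Python. The `Procedure` record type, its field names, and every record in `PROCEDURES` are hypothetical, invented purely for illustration, and the elided part of the query is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Procedure:
    """Hypothetical flattened view of one procedure in the repository."""
    kind: str
    patient: str
    surgeon_last_name: str
    year: int
    mutations: frozenset


# Invented sample records; not real patient data.
PROCEDURES = [
    Procedure("craniotomy", "pat-1", "Borden", 1997, frozenset({"p53"})),
    Procedure("craniotomy", "pat-2", "Borden", 2003, frozenset({"p53"})),
    Procedure("craniotomy", "pat-3", "Smith", 1998, frozenset({"p53"})),
    Procedure("laminectomy", "pat-4", "Borden", 1999, frozenset()),
]


def matching_patients(procedures):
    """Patients with a craniotomy by Borden in 1996-2000 and a p53 finding."""
    return sorted({
        p.patient
        for p in procedures
        if p.kind == "craniotomy"
        and p.surgeon_last_name == "Borden"
        and 1996 <= p.year <= 2000
        and "p53" in p.mutations
    })


print(matching_patients(PROCEDURES))  # ['pat-1']
```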
Future directions
The technology is here …
ASTM E31.28 http://www.astm.org
Define schemas and ontologies
Standardize data formats
Collect data
just do it!
Contact Information
Jonathan Borden, M.D.
Center for Brain and Cranial Diseases
St. Vincent Health System
311 W. 24th Street
Erie, PA, 16505
www.openhealth.org/ASTM
www.openhealth.org/opnote (demo)
www.w3.org/2001/sw/WebOnt
www.jonathanborden-md.com