LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008.
LexBIG, EVS and NCBO Browser
Publish, Query, & Browse Vocabularies in
caBIG
January 2008
Agenda
• Overview
• LexGrid/LexBIG infrastructure
• Distributed LexBIG API
• EVS Services
• BioPortal Browser
• Next Steps
• LexBIG Grid Services
Team Members
Mayo: Scott Bauer, James Buntrock, Sridhar Dwarkanath, Thomas Johnson, Pradip Kanjamala, Jason Leisch, Jyoti Pathak, Kevin Peterson, Craig Stancl, Traci St. Martin
Coordination & Mentorship: Brian Davis, Bob Freimuth, Tahsin Kurc, Joshua Phillips
EVS team: Johnita Beasley, Gilberto Fragoso, Frank Hartel, Steven Hunter, Wilberto Garcia, Charles Griffin, Jason Lucas, Kim Ong, John Park, Tracy Safran, Rob Wynne
Conceptual Overview
[Figure: architecture layers. LexGrid Model & Storage; LexBIG Java API; NCI/NCBO Services; Grid Services; Browsers and Applications]
LexGrid Model & Storage
[Figure: model overview. Labels: Coding Scheme, Concepts, Relations, Properties, Associations]
[Figure: UML class diagram. Classes: codingScheme, concepts::concepts, concepts::codedEntry, concepts::property, concepts::comment, concepts::definition, concepts::presentation, relations::relations, relations::association, relations::associationInstance, relations::associationTarget. Base types: describable, versionableAndDescribable, associatableElement. Role multiplicities: 0..1 +concepts, 0..* +relations, 1..* +association, 0..* +sourceConcept, 0..* +targetConcept, 1..* +concept, 0..* +property]
Example: T005 (Virus) -> causes -> T047 (Disease or Syndrome)
• Coverage
  • Lexical Semantics (e.g. names, ids, definitions, comments)
  • Logical Semantics (e.g. associations, qualifiers)
  • Context (e.g. language, source)
• Supports Load and Representation of …
  • Web Ontology Language (OWL)
  • Open Biomedical Ontologies (OBO)
  • UMLS Rich Release Format (RRF)
  • Protégé Frames (various, requires custom loader for each unique flavor)
  • XML & Text (various, requires custom loader for each unique flavor)
• Available Renderings of the Model
  • XML Schema (master)
  • Formal (XMI, UML)
  • Data Storage/Schema (e.g. RDB, LDAP)
  • Technology-specific (e.g. Castor, EMF)
LexGrid Model & Storage
Conceptual Overview: LexBIG Java API
[Figure: LexBIG content flow. Labels: Content; Import (OWL, XML, OBO, Text, Protégé, RRF); Export (XML, OBO, Other); Representation (LexGrid Vocabulary Model); Data Repository]
LexBIG - API
[Figure: access paths. Labels: Tools and Services; Access Programming Interfaces / APIs (LexBIG, CTS, …); Apps; Java Embed; Web Services; Distributed LexBIG API; Application Server; File System: Metadata & Indexes]
• Each LexGrid ‘Node’ provides the software, metadata, indexes, and backing data store to service one or more vocabularies.
• Each LexBIG installation represents one LexGrid Node and provides a Java API to administer and query data.
LexBIG - API
• API coverage
  • Administrative Functions
  • Query Available Code Systems and Metadata
  • Query Concepts, Concept Properties, and Qualifications
  • Query Concept Relationships and Qualifications
• API characteristics
  • Conscious separation of service and data classes
  • Deferred query resolution
  • Payload optimization
  • Support of iterators
  • Defined extension points (loaders, exporters, sort algorithms, match algorithms, convenience methods)
LexBIG – API Example
• Prerequisites
  • ICD-9-CM loaded from UMLS distribution (RRF source files)
• Target a code system
  • Define a concept ‘space’ (a codedNodeSet) for ICD-9-CM, version 2007
  • Initially unrestricted and unresolved
• Restrict the space by adding constraints
  • Property ‘Semantic Type’ -> exact match -> “Disease or Syndrome”
  • Primary text match -> sounds like -> ‘infeksion’
  • Any text stemmed match -> ‘classify’ (to match ‘classified’, ‘classifying’, etc.)
  • Must contain a property with name -> ‘UMLS_CUI’
  • Concept must be active
• Indicate sort preferences and limit number returned
  • Sort by code, ascending
  • Limit to top 5
• Resolve!
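The steps above follow a build-then-resolve pattern: restrictions accumulate on the concept space and nothing is evaluated until the final resolve. A minimal sketch of that pattern in plain Java follows. It does not use the LexBIG library; `Concept`, `ConceptSpace`, `restrict`, and `resolve` are hypothetical names, the sample codes are made up, and a simple substring test stands in for the "sounds like" and stemmed matches.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Illustrative only: plain-Java stand-ins, not LexBIG classes.
final class DeferredQuerySketch {

    // Builds a restricted concept space, then resolves it once at the end.
    static List<String> demo() {
        // Made-up sample concepts; the codes are placeholders, not real ICD-9-CM codes.
        List<Concept> scheme = List.of(
            new Concept("C30", "Viral infection", "Disease or Syndrome", true),
            new Concept("C10", "Bacterial infection", "Disease or Syndrome", true),
            new Concept("C20", "Infection screening", "Diagnostic Procedure", true),
            new Concept("C05", "Retired infection code", "Disease or Syndrome", false),
            new Concept("C40", "Fracture", "Injury or Poisoning", true));

        ConceptSpace space = new ConceptSpace(scheme) // initially unrestricted and unresolved
            .restrict(c -> c.semanticType.equals("Disease or Syndrome"))
            .restrict(c -> c.name.toLowerCase(Locale.ROOT).contains("infection")) // stand-in text match
            .restrict(c -> c.active);

        // No filtering has happened yet; resolve() applies restrictions, sort, and limit.
        return space.resolve(Comparator.comparing((Concept c) -> c.code), 5)
            .stream().map(c -> c.code).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}

// Hypothetical stand-in for a coded concept.
final class Concept {
    final String code;
    final String name;
    final String semanticType;
    final boolean active;

    Concept(String code, String name, String semanticType, boolean active) {
        this.code = code;
        this.name = name;
        this.semanticType = semanticType;
        this.active = active;
    }
}

// Hypothetical concept space: restrictions are recorded, not evaluated, until resolve().
final class ConceptSpace {
    private final List<Concept> source;
    private final List<Predicate<Concept>> restrictions = new ArrayList<>();

    ConceptSpace(List<Concept> source) {
        this.source = source;
    }

    ConceptSpace restrict(Predicate<Concept> restriction) {
        restrictions.add(restriction);
        return this;
    }

    List<Concept> resolve(Comparator<Concept> order, int limit) {
        return source.stream()
            .filter(c -> restrictions.stream().allMatch(r -> r.test(c)))
            .sorted(order)
            .limit(limit)
            .collect(Collectors.toList());
    }
}
```

Deferring the work this way lets a service combine all constraints into one evaluation instead of materializing an intermediate result after each restriction.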
LexBIG – API Example
• OK, now find some relationships…
• Target a code system
  • Define an unrestricted graph for a target ontology (e.g. ICD-9-CM)
• Restrict by adding constraints
  • Restrict to parent/child relationships (UMLS-defined ‘PAR’ = has parent)
  • Restrict to the codedNodeSet defined in the previous example
• Indicate extent of navigation
  • Maximum 2 levels, moving in forward direction
  • Maximum 50 nodes resolved overall
• Resolve!
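The navigation limits above (a maximum depth and a maximum node count) can be pictured as a bounded breadth-first walk. The sketch below is plain Java under stated assumptions, not the LexBIG graph API: `BoundedGraphSketch`, `resolve`, and the A to E sample edges are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: a bounded forward walk over a small, made-up graph.
final class BoundedGraphSketch {

    // Breadth-first walk from a focus node, following edges forward.
    // Stops after maxDepth levels or maxNodes resolved nodes, whichever comes first.
    static List<String> resolve(Map<String, List<String>> forwardEdges,
                                String focus, int maxDepth, int maxNodes) {
        List<String> resolved = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(focus);
        seen.add(focus);
        int depth = 0;
        while (!frontier.isEmpty() && depth <= maxDepth) {
            int levelSize = frontier.size();
            for (int i = 0; i < levelSize; i++) {
                String node = frontier.poll();
                if (resolved.size() >= maxNodes) {
                    return resolved; // node budget exhausted
                }
                resolved.add(node);
                for (String next : forwardEdges.getOrDefault(node, List.of())) {
                    if (seen.add(next)) {
                        frontier.add(next);
                    }
                }
            }
            depth++;
        }
        return resolved;
    }

    // Made-up edges: A -> B, A -> C, B -> D, D -> E.
    static Map<String, List<String>> sampleEdges() {
        return Map.of(
            "A", List.of("B", "C"),
            "B", List.of("D"),
            "D", List.of("E"));
    }

    static List<String> demo() {
        return resolve(sampleEdges(), "A", 2, 50);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With a depth limit of 2, node E (three levels from A) is never resolved; lowering the node budget truncates the walk earlier.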
• Example of Work in Progress for Mar 2008 release…
• Feature Request: Provide a more convenient way to query hierarchies with different underlying hierarchical names and structures.
• Solution: Register additional Metadata indicating supported hierarchical relationships, root nodes, and direction of navigation to traverse from parent to child. Provide relation-independent and direction-agnostic convenience methods to allow tree building without need to know specific behavior of each ontology.
LexBIG – API
[Figure: three example hierarchies rooted at R1 with nodes C1 to C5, using the associations hasSubtype, ‘CHD’, and isa / developsFrom]
• supportedHierarchy #1
  • Hierarchy ID = is_a
  • Root = R1
  • Association = hasSubtype
  • ForwardNavigable = true
• New Convenience Methods…
  • Resolving root of the is_a type hierarchy is always ‘R1’
  • Resolving first level of tree always provides ‘C2’ & ‘C3’
  • Etc…
LexBIG – API
• supportedHierarchy #2
  • Hierarchy ID = is_a
  • Root = R1
  • Association = CHD
  • ForwardNavigable = true
• supportedHierarchy #3
  • Hierarchy ID = is_a
  • Root = R1
  • Association = isa, develops_from
  • ForwardNavigable = false
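One way to picture how such registered metadata can make tree building relation-independent and direction-agnostic is sketched below in plain Java. This is not the LexBIG convenience API: `SupportedHierarchy`, `Edge`, `children`, and `firstLevel` are hypothetical names, and the edges are illustrative rather than a transcription of the slide's figure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative only: hierarchy metadata drives traversal, so callers never
// need to know the association names or their direction.
final class HierarchySketch {

    // Hypothetical registration record mirroring the fields listed on the slide.
    static final class SupportedHierarchy {
        final String id;
        final String root;
        final Set<String> associations;
        final boolean forwardNavigable;

        SupportedHierarchy(String id, String root, Set<String> associations, boolean forwardNavigable) {
            this.id = id;
            this.root = root;
            this.associations = associations;
            this.forwardNavigable = forwardNavigable;
        }
    }

    // A directed, named relationship: source -association-> target.
    static final class Edge {
        final String source;
        final String association;
        final String target;

        Edge(String source, String association, String target) {
            this.source = source;
            this.association = association;
            this.target = target;
        }
    }

    // Returns the children of a node, reading edges forward or backward
    // depending on the registered direction.
    static List<String> children(SupportedHierarchy h, List<Edge> edges, String parent) {
        List<String> result = new ArrayList<>();
        for (Edge e : edges) {
            if (!h.associations.contains(e.association)) {
                continue;
            }
            if (h.forwardNavigable && e.source.equals(parent)) {
                result.add(e.target);
            } else if (!h.forwardNavigable && e.target.equals(parent)) {
                result.add(e.source);
            }
        }
        return result;
    }

    static List<String> firstLevel(SupportedHierarchy h, List<Edge> edges) {
        return children(h, edges, h.root);
    }

    // Ontology whose edges point from parent to child.
    static List<String> demoForward() {
        SupportedHierarchy h = new SupportedHierarchy("is_a", "R1", Set.of("hasSubtype"), true);
        List<Edge> edges = List.of(
            new Edge("R1", "hasSubtype", "C2"),
            new Edge("R1", "hasSubtype", "C3"),
            new Edge("C2", "hasSubtype", "C4"));
        return firstLevel(h, edges);
    }

    // Ontology whose edges point from child to parent, under two association names.
    static List<String> demoReverse() {
        SupportedHierarchy h = new SupportedHierarchy("is_a", "R1", Set.of("isa", "develops_from"), false);
        List<Edge> edges = List.of(
            new Edge("C2", "isa", "R1"),
            new Edge("C3", "develops_from", "R1"),
            new Edge("C4", "isa", "C2"));
        return firstLevel(h, edges);
    }

    public static void main(String[] args) {
        System.out.println(demoForward());
        System.out.println(demoReverse());
    }
}
```

Both sample ontologies yield the same first level under the root even though their association names and edge directions differ, which is the property the convenience methods aim for.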
Conceptual Overview: EVS caCORE APIs
EVS caCORE API - Distributed LexBIG
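[Figure: two access paths. Direct Invocation: LexBIG in the local JVM, with a LexBIG install connected to the database server over JDBC. Distributed API: a LexBIG Distributed API client in the local JVM calls a LexBIG API proxy on the caCORE EVS server via Spring Remoting; the server's LexBIG install connects to the database server over JDBC]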
• Query-by-example (QBE) system
• EVS 3.2 model
• Java
• Web Services
• REST (HTTP / XML)
[Figure: caCORE EVS Server. Client interfaces: Web Services, XML / HTML, Java QBE. Server components: Service Layer, Cache, DAO, LexBIG Install. The server connects to the database server over JDBC]
EVS caCORE API – QBE
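Query-by-example in general means the caller fills in some fields of an example object and receives the stored objects that agree on those fields. The sketch below shows that idea in plain Java. It is not the caCORE EVS API: `QbeSketch`, `Concept`, `matches`, and `search` are hypothetical, and the stored records are made up.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: null fields in the example act as wildcards.
final class QbeSketch {

    // Hypothetical domain object used both as stored data and as the example.
    static final class Concept {
        final String code;
        final String name;
        final String vocabulary;

        Concept(String code, String name, String vocabulary) {
            this.code = code;
            this.name = name;
            this.vocabulary = vocabulary;
        }
    }

    // A candidate matches when every populated field of the example is equal.
    static boolean matches(Concept example, Concept candidate) {
        return (example.code == null || example.code.equals(candidate.code))
            && (example.name == null || example.name.equals(candidate.name))
            && (example.vocabulary == null || example.vocabulary.equals(candidate.vocabulary));
    }

    static List<Concept> search(List<Concept> store, Concept example) {
        return store.stream()
            .filter(candidate -> matches(example, candidate))
            .collect(Collectors.toList());
    }

    static List<String> demo() {
        // Made-up records.
        List<Concept> store = List.of(
            new Concept("X1", "Heart", "VocabA"),
            new Concept("X2", "Lung", "VocabA"),
            new Concept("X3", "Heart", "VocabB"));
        // Example object: only the name is populated.
        Concept example = new Concept(null, "Heart", null);
        return search(store, example).stream()
            .map(c -> c.code)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```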
Conceptual Overview: Grid Services
Grid Services – EVS 3.1 Data Model
• The EVS Grid Service is accessible via the caGRID Portal: http://cagrid-portal.nci.nih.gov
• The following is a list of the available operations via EVS Grid Service.
Operation Name               Description
==============               ===========
getHistoryRecords            Searches a valid vocabulary in NCI Thesaurus for history information.
searchSourceByCode           Searches the Metathesaurus based on source code.
searchMetaThesaurus          Searches NCI Metathesaurus and returns Metathesaurus information that meets the search criteria.
getMetaSources               Returns all Metathesaurus sources contained in the EVS.
searchDescLogicConcept       Searches a valid vocabulary such as NCI Thesaurus and returns Description Logic concepts that meet the search criteria.
getServiceSecurityMetadata   Returns the service's security metadata.
getVocabularyNames           Returns all the vocabularies present in the Description Logic in caCORE 3.1 EVS service.

• Next slide is a screen shot of the EVSGridService information viewable from the caGRID Portal.
• The next release of caCORE/EVS 4 will support the full EVS 3.2 model on the Grid.
[Figure: call flow among Client, caGrid Node hosting Service, and LexBIG Installation]
• Client invokes caGrid Service
• caGrid Service uses Distributed LexBIG to implement call
• Distributed LexBIG returns requested information to caGrid Service
• caGrid Service returns response to client
Grid Services – LexBIG Prototype
LexBIG Model in Introduce
…
The LexBIG Model is loaded into the Introduce Grid Service Authoring Toolkit via XSD files.
Grid Services – LexBIG Prototype
Using the LexBIG Model in a caGrid Service
Services can then be defined using the imported types.
[Figure labels: LexBIG Model types loaded from XSDs; Defining Services]
‘CodingSchemeRenderingList’ is used as the output of caGrid Service ‘getSupportedCodingSchemes()’
Grid Services – LexBIG Prototype
Sample caGrid Service call ‘getSupportedCodingSchemes()’
[Figure: sequence among Client, caGrid Service, and Distributed LexBIG]
• Client calls caGrid ‘getSupportedCodingSchemes()’
• caGrid Service calls Distributed LexBIG ‘getSupportedCodingSchemes()’
• Distributed LexBIG returns result of call to caGrid Service
• Results are returned to client with all appropriate caGrid security mechanisms
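The call sequence above is a plain delegation: the grid-facing service receives the request and forwards it to the backend. A minimal plain-Java sketch of that shape follows. It uses neither caGrid nor LexBIG classes and omits the security mechanisms entirely: `VocabularyService`, `GridFacade`, and `BackendStub` are hypothetical, and the returned names are sample values.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a grid-facing facade that forwards a call to a backend.
final class GridDelegationSketch {

    // Hypothetical contract shared by both layers.
    interface VocabularyService {
        List<String> getSupportedCodingSchemes();
    }

    // Stand-in for the Distributed LexBIG layer; returns fixed sample names.
    static final class BackendStub implements VocabularyService {
        private final List<String> callLog;

        BackendStub(List<String> callLog) {
            this.callLog = callLog;
        }

        @Override
        public List<String> getSupportedCodingSchemes() {
            callLog.add("backend: getSupportedCodingSchemes");
            return List.of("NCI_Thesaurus", "ICD-9-CM");
        }
    }

    // Stand-in for the grid service: records the call, then delegates.
    static final class GridFacade implements VocabularyService {
        private final VocabularyService backend;
        private final List<String> callLog;

        GridFacade(VocabularyService backend, List<String> callLog) {
            this.backend = backend;
            this.callLog = callLog;
        }

        @Override
        public List<String> getSupportedCodingSchemes() {
            callLog.add("grid: getSupportedCodingSchemes");
            return backend.getSupportedCodingSchemes();
        }
    }

    // The client only ever talks to the facade.
    static List<String> demo(List<String> callLog) {
        VocabularyService grid = new GridFacade(new BackendStub(callLog), callLog);
        return grid.getSupportedCodingSchemes();
    }

    public static void main(String[] args) {
        List<String> callLog = new ArrayList<>();
        System.out.println(demo(callLog));
        System.out.println(callLog);
    }
}
```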
Grid Services – LexBIG Prototype
Creating a caGrid Service
With the model loaded and methods created, the service can then be deployed to a caGrid Node.
Introduce Toolkit Output: a deployed service to the Grid (caGrid Node), plus client software to access the service.
Conceptual Overview: Browsers and Applications
LexBIG GUI
Stand-alone (individual) terminologies
• Gene Ontology
• HL7
• Medical Dictionary for Regulatory Activities Terminology (MedDRA)
• National Drug File - Reference Terminology
• NCI MetaThesaurus
• NCI Thesaurus
• SNOMED Clinical Terms
• The MGED Ontology
• UMLS Semantic Network
• Zebrafish
Metathesaurus terminologies
• BioCarta Terms Derived from online maps of molecular relationships, adapted for NCI use, 0601C
• Clinical Bioinformatics Ontology, June 2005
• Canonical Clinical Problem Statement System, 1999
• Clinical Classifications Software, 2003
• Clinical Data Interchange Standards Consortium, 0601C
• COSTAR, 1989-1995
• CRISP Thesaurus, 2004
• Common Terminology Criteria for Adverse Events, 2003
• Cancer Therapy Evaluation Program (CTEP), 2004
• DSM-IV, 1994
• NCI Developmental Therapeutics Program, 0601C
• Expression Library Classification
• Food and Drug Administration, 0601C
• Gene Ontology, 2004_03_02
• Healthcare Common Procedure Coding System, 2005
• Home Health Care Classification, 2003
• Health Level Seven Vocabulary, 1998-2002
• ICPC2E-ICD10 relationships from Dr. Henk Lamberts, 1998
• HUGO Gene Nomenclature, 2004_04
• ICD10, 1998
• ICD-9-CM, 2005
• International Classification of Diseases for Oncology (ICD)
• ICPC2 - ICD10 Thesaurus, 200403
• International Classification of Primary Care, 1993
• Online Congenital Multiple Anomaly/Mental Retardation Syndromes, 1999
• NCI Mouse Terminology, 0601C
• KEGG Pathway Database, 0601C
• LOINC 2.13
• MEDLINE (1995-1999)
• McMaster University Epidemiology Terms, 1992
• Mitelman Database of Chromosome Aberrations in Cancer (MDBCAC), 2005_12
• MedDRA, 6.0
• MEDLINE (2000-2005)
• MedlinePlus Health Topics_2004_08_14, 20040814
• Online Mendelian Inheritance in Man, 1993
• Multum MediSource Lexicon, 2004_03
• Medical Subject Headings, MSH2005_2004_10_12
• UMLS Metathesaurus
• Metathesaurus FDA National Drug Code Directory, 2004_01
• Metathesaurus additional entry terms for ICD-9-CM, 2005, 2005
• ICPC2 - ICD10 Thesaurus, 7-bit Equivalents, 0403
• ICPC2 - ICD10 Thesaurus, American English Equivalents, 0403
• Metathesaurus Version of Minimal Standard Terminology Digestive Endoscopy, 2001
• Metathesaurus forms of SNOMED Clinical Terms, 2004_01_31
• NCBI Taxonomy, 2004_09_30
• NCI modified Common Terminology Criteria for Adverse Events v3.0, 2003
• NCI-GLOSS (Cancer.gov Dictionary), 0601C
• National Cancer Institute Thesaurus, 2006_01C
• NCI Metathesaurus
• NCI SEER ICD Neoplasm Code Mappings, 1999
• National Drug File - Reference Terminology, 2004_01
• National Library of Medicine Medline Data
• Omaha System, 1994
• Physician Data Query, 2005_12
• Portfolio Management Application (PMA), 2003
• Quick Medical Reference (QMR), 1996
• QMR clinically related terms from Randolph A. Miller, 1999
• RXNORM Project, META2005AA Cumulative Update 2004_11_17, 2005AA
• SNOMEDCT Clinical Terms, 2004_01_31
• Standard Product Nomenclature, 2003
• Metathesaurus Source Terminology Names
• UMDNS: product category thesaurus, 2005
• University of Washington Digital Anatomist, 1.7.3
NCI BioPortal
• http://bioportal.nci.nih.gov/ncbo/faces/index.xhtml
• Encourage use and feedback
• Notes …
• General query, navigation, and visualization in place
• Some operations not performing well and under investigation
• Bottlenecks suspected in the Distributed LexBIG API layer, based on comparison with the NCBO implementation, which works directly against the LexBIG Java API.
Future Browser Support
• Invitation for Participation!
  • NCI Terminology Open Portal (TOP) Project
  • https://gforge.nci.nih.gov/projects/openportal/
  • Every other Friday 2pm Eastern
• The OpenPortal is a collaborative effort to develop an open, site-neutral and easily extensible Web service allowing users to browse, search, and visualize ontologies stored in LexGrid repositories.
• Participation from NCBO & others.
• Will inform future changes to the LexBIG model and service layers.
Project Links
• LexBIG Project
http://gforge.nci.nih.gov/projects/lexbig/
• NCI BioPortal Project
https://gforge.nci.nih.gov/projects/lex-browser/
• NCI BioPortal Site
http://bioportal.nci.nih.gov/ncbo/faces/index.xhtml
• Open Terminology Portal Project
https://gforge.nci.nih.gov/projects/openportal/
Q & A
Additional Materials
LexBIG API - Example
// Imports assume the package layout of the LexBIG distribution; verify against the installed version.
import org.LexGrid.LexBIG.DataModel.Collections.LocalNameList;
import org.LexGrid.LexBIG.DataModel.Collections.ResolvedConceptReferenceList;
import org.LexGrid.LexBIG.DataModel.Collections.SortOptionList;
import org.LexGrid.LexBIG.DataModel.Core.ResolvedConceptReference;
import org.LexGrid.LexBIG.Impl.LexBIGServiceImpl;
import org.LexGrid.LexBIG.LexBIGService.CodedNodeSet;
import org.LexGrid.LexBIG.LexBIGService.CodedNodeSet.SearchDesignationOption;
import org.LexGrid.LexBIG.LexBIGService.LexBIGService;
import org.LexGrid.LexBIG.Utility.Constructors;
import org.LexGrid.LexBIG.Utility.ConvenienceMethods;
import org.LexGrid.LexBIG.Utility.LBConstants.MatchAlgorithms;
import org.LexGrid.LexBIG.Utility.ObjectToString;

System.out.println("Example double restriction query with additional application of sort criteria and restricted return values");

// Declare the service...
LexBIGService lbs = new LexBIGServiceImpl();

// Start with an unconstrained set of all codes for the vocabulary...
CodedNodeSet cns = lbs.getCodingSchemeConcepts("NCI_Thesaurus", null, false);

// Constrain to concepts with designations (assigned text presentations)
// that contain text that sounds like 'heart ventricle'
cns.restrictToMatchingDesignations(
    "hart ventrikle",
    SearchDesignationOption.ALL,
    MatchAlgorithms.DoubleMetaphoneLuceneQuery.toString(),
    null);

// Further restrict the results to concepts with a semantic type of
// 'Anatomical Structure'.
cns.restrictToMatchingProperties(
    Constructors.createLocalNameList("Semantic_Type"),
    "Anatomical Structure",
    "exactMatch",
    null);

// Indicate that the resulting list should be sorted,
// with best results first and then sorted by code if there is a tie.
SortOptionList sortCriteria =
    Constructors.createSortOptionList(new String[]{"matchToQuery", "code"});

// Indicate to return only the assigned UMLS_CUI and textualPresentation properties.
LocalNameList restrictTo =
    ConvenienceMethods.createLocalNameList(new String[]{"UMLS_CUI", "textualPresentation"});

// Still nothing computed yet!
// Perform the query and resolve the sorted/filtered list,
// with a maximum of 6 items returned ...
ResolvedConceptReferenceList list = cns.resolveToList(sortCriteria, restrictTo, 6);

// Print the results ...
ResolvedConceptReference[] rcr = list.getResolvedConceptReference();
for (ResolvedConceptReference rc : rcr) {
    System.out.println("Resolved Concept: " + ObjectToString.toString(rc));
}
LexBIG Model Harmonization
• These slides are examples of the harmonization process.
ConceptReferenceList as it appeared in the original EA representation
[Figure: UML class diagram (class Collections). «XSDcomplexType» ConceptReferenceList with members «XSDelement» + ConceptReferenceCollection: lbCore:ConceptReference [0] and + id: long]
LexBIG Model Harmonization
ConceptReferenceList following harmonization to caCORE modeling requirements
[Figure: UML class diagram (class Collections). «XSDcomplexType» ResolvedConceptReferenceList (+ id: Long; «XSDattribute» + incomplete: boolean). «XSDcomplexType» ConceptReferenceList (- id: Long). «XSDcomplexType» Core::ResolvedConceptReference («XSDattribute» + codingSchemeURN: String, + codingSchemeVersion: String). «XSDcomplexType» Core::ConceptReference (- id: Long; «XSDattribute» + codingScheme: String, + conceptCode: String). An «XSDextension» relationship. Association ends: +ResolvedConceptReferenceList 0..1, +ResolvedConceptReferenceCollection 0..*, +ConceptReferenceList 0..1, +ConceptReferenceCollection 1..*]
LexBIG Model Harmonization
SIW output prior to harmonization and annotation.
LexBIG Model Harmonization
After Harmonization and prior to first pass of the annotation process
LexBIG Model Harmonization
Results of the automated first pass of the SIW annotation tool.
LexBIG Model Harmonization
Documentation of Harmonization requirements
LexBIG Model Harmonization