SMART Protocols in LISC-2014
SMART Protocols: SeMAntic RepresenTation for Experimental Protocols
Olga Giraldo
Ontology engineering group (OEG)
Universidad Politécnica de Madrid
Agenda
• What is a lab protocol
• Motivation
• Our general research question
• Our assumption
• Our proposal
• Preliminary results
• Future work
What is a lab protocol
• Laboratory protocols are like cooking recipes:
• They have ingredients: reagents and samples.
• They have appliances: equipment.
• They have a total time.
• They have a list of instructions.
• They have critical steps.
• Laboratory protocols describe how to carry out an experiment.
Some problems in lab protocols
Some of them present insufficient granularity.
The instructions can be imprecise or ambiguous because they are written in natural language, for example:
• Incubate the centrifuge tubes in a water bath.
• Incubate the samples for 5 min with gentle shaking.
• Rinse DNA briefly in 1-2 ml of wash.
• Incubate at -20C overnight.
Why do we need to formalize and extract information from lab protocols?
Because we want a recommendation system…
• That matches protocols to my situation, for instance:
• samples I have,
• availability of equipment, reagents, lab conditions,
• expertise.
We also want content-based information retrieval:
• Meaningful sentences, sample used, purpose of the protocol, applicability, critical steps, etc. Also, identification of instructions.
• Example query: find all protocols for DNA extraction that have been used in Oryza sativa and that are suitable for processing a large number of samples with a low execution time.
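A minimal sketch of how such a query could be answered once protocol metadata is structured. The record fields, the thresholds, and the example records are assumptions made for illustration; they are not taken from the SMART Protocols ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtocolRecord:
    """Hypothetical metadata record for one protocol (field names are assumptions)."""
    title: str
    purpose: str
    organism: str
    max_samples_per_run: int
    execution_time_min: int


def find_protocols(records, purpose, organism, min_samples, max_time_min):
    """Return the records that satisfy every constraint, fastest first."""
    hits = [
        r for r in records
        if r.purpose == purpose
        and r.organism == organism
        and r.max_samples_per_run >= min_samples
        and r.execution_time_min <= max_time_min
    ]
    return sorted(hits, key=lambda r: r.execution_time_min)


# Made-up example records, for illustration only.
RECORDS = [
    ProtocolRecord("Protocol A", "DNA extraction", "Oryza sativa", 96, 60),
    ProtocolRecord("Protocol B", "DNA extraction", "Oryza sativa", 8, 240),
    ProtocolRecord("Protocol C", "DNA extraction", "Zea mays", 96, 45),
    ProtocolRecord("Protocol D", "RNA extraction", "Oryza sativa", 96, 30),
]

matches = find_protocols(
    RECORDS, "DNA extraction", "Oryza sativa", min_samples=48, max_time_min=90
)
print([r.title for r in matches])
```

Phrases such as "large number of samples" and "low execution time" only become filterable after they are mapped to explicit values, which is one reason the metadata has to be formalized first.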
Motivation
Currently…
Semi-structured information
Unstructured information
How to formalize the information from laboratory protocols as a knowledge base?
Ontologies + NLP tools
Our assumption
“Experimental protocols are fundamental information structures that should support the description of the processes by means of which results are generated in experimental research”
Methods to represent and extract information
• Gazetteer-based method: use existing lists of named entities (lists of proper nouns that refer to real-life entities)
• Rule-based approaches: write manual extraction rules
• Combination of the above
• Ontology model representing lab protocols
work in progress
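The gazetteer-based and rule-based methods listed above can be combined roughly as follows. This is only a sketch: the gazetteer entries and the regular expressions are assumptions chosen to fit the example instructions from the earlier slide, not the lists or rules used in the project.

```python
import re

# Tiny illustrative gazetteers (assumed entries, not lists from the project).
GAZETTEER = {
    "equipment": ["water bath", "centrifuge tubes"],
    "sample": ["DNA", "samples"],
}

# Hand-written extraction rules, expressed as regular expressions.
RULES = {
    "temperature": re.compile(r"-?\d+(?:\.\d+)?\s?°?C\b"),
    "duration": re.compile(r"\b\d+(?:-\d+)?\s?(?:min|h|s)\b|\bovernight\b"),
    "volume": re.compile(r"\b\d+(?:-\d+)?\s?(?:ml|µl)\b"),
}


def annotate(sentence):
    """Return a dict mapping each label to the strings found in the sentence."""
    found = {}
    # Gazetteer lookup: case-insensitive whole-term match.
    for label, terms in GAZETTEER.items():
        hits = [
            t for t in terms
            if re.search(r"\b" + re.escape(t) + r"\b", sentence, re.IGNORECASE)
        ]
        if hits:
            found[label] = hits
    # Rule-based extraction: every regex match is kept.
    for label, pattern in RULES.items():
        hits = pattern.findall(sentence)
        if hits:
            found[label] = hits
    return found


for text in [
    "Incubate the centrifuge tubes in a water bath.",
    "Incubate the samples for 5 min with gentle shaking.",
    "Rinse DNA briefly in 1-2 ml of wash.",
    "Incubate at -20C overnight.",
]:
    print(text, "->", annotate(text))
```

The first sentence yields equipment but no temperature or duration, which makes the missing information visible instead of leaving it implicit.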
Methodology used to develop SMART Protocols
Kick-off
• Gathering use cases.
• Gathering competency questions.
Conceptualization & Formalization
• DAKA (Domain Analysis and Knowledge Acquisition): analysis of 175 experimental protocols. [1]
• LISA (Linguistic and Semantic Analysis): identification of key metadata for reporting protocols; [2] determination of workflow aspects in protocols (implicit order in the instructions, following the input-output structure); extraction of elements pertaining to domain knowledge (e.g. classification of protocols into groups according to their purpose; within each group, basic steps or common patterns were identified according to the type of protocol).
• IO (Iterative Ontology building): design of conceptual maps and draft ontologies. The ontology modules were gathered from the DAKA and LISA activities and exchanged with domain experts.
Evaluation & Evolution
• OWL
• Correction of syntactic inconsistencies by using OWLViz [3] and OOPS [4]
• The ontology model evolves as new knowledge goes through the whole cycle.
[1] http://goo.gl/MC4mR9
[2] goo.gl/gAVnn
[3] http://protegewiki.stanford.edu/wiki/OWLViz
[4] http://oeg-lia3.dia.fi.upm.es/oops/index-content.jsp
SMART Protocols - document
• It is an extension of the IAO ontology.
• It supports rhetorical and structural components (e.g. introduction, materials, and methods).
• It supports information such as the application of the protocol, advantages and limitations, list of reagents, and critical steps.
SMART Protocols ontology is available here:
http://vocab.linkeddata.es/SMARTProtocols/
SMART Protocols - wf
• It is an extension of the P-Plan Ontology.
• It represents the workflow aspects of protocols: the implicit order in the instructions, following the input-output structure.
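A minimal sketch of how the implicit order could be recovered from the input-output structure. The three property names are the P-Plan properties reused by the workflow module; the plan, step, and variable identifiers are hypothetical, and plain tuples stand in for RDF triples so the example needs no RDF library.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# (subject, property, object) statements, deliberately listed out of order.
# All "ex:" identifiers are made up for illustration.
TRIPLES = [
    ("ex:rinse", "p-plan:isStepOfPlan", "ex:dnaExtraction"),
    ("ex:rinse", "p-plan:hasInputVar", "ex:dnaPellet"),
    ("ex:rinse", "p-plan:hasOutputVar", "ex:cleanDna"),
    ("ex:grind", "p-plan:isStepOfPlan", "ex:dnaExtraction"),
    ("ex:grind", "p-plan:hasInputVar", "ex:leafTissue"),
    ("ex:grind", "p-plan:hasOutputVar", "ex:lysate"),
    ("ex:precipitate", "p-plan:isStepOfPlan", "ex:dnaExtraction"),
    ("ex:precipitate", "p-plan:hasInputVar", "ex:incubatedLysate"),
    ("ex:precipitate", "p-plan:hasOutputVar", "ex:dnaPellet"),
    ("ex:incubate", "p-plan:isStepOfPlan", "ex:dnaExtraction"),
    ("ex:incubate", "p-plan:hasInputVar", "ex:lysate"),
    ("ex:incubate", "p-plan:hasOutputVar", "ex:incubatedLysate"),
]


def step_order(triples, plan):
    """Order the steps of a plan so that each step follows the producers of its inputs."""
    steps = {s for s, p, o in triples if p == "p-plan:isStepOfPlan" and o == plan}
    # Map each variable to the step that outputs it.
    producers = {
        o: s for s, p, o in triples if p == "p-plan:hasOutputVar" and s in steps
    }
    # A step depends on whichever step produced each of its input variables.
    graph = {step: set() for step in steps}
    for s, p, o in triples:
        if p == "p-plan:hasInputVar" and s in steps and o in producers:
            graph[s].add(producers[o])
    return list(TopologicalSorter(graph).static_order())


print(step_order(TRIPLES, "ex:dnaExtraction"))
```

Because the example is a single chain, the topological order is unique; protocols with independent branches would admit several valid orders.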
New and reused terms
Resource         No. of terms    Resource    No. of terms
OBI              15              P-Plan      3
NCI Thesaurus    9               NPO         3
CHEBI            7               EXACT       2
IAO              7               SO          2
MGED Ontology    3               MeSH        1

• Reused classes = 52
• Reused properties = 4

Property            Origin    Reused in
isManufacturedBy    OBI       SMART Protocols-Document
hasInputVar         P-Plan    SMART Protocols-Workflow
hasOutputVar        P-Plan    SMART Protocols-Workflow
isStepOfPlan        P-Plan    SMART Protocols-Workflow

Ontology                    No. of classes    No. of properties
SMART Protocols-Document    60                7
SMART Protocols-Workflow    44                1
Total                       104               8
• New terms
• Analysis of the protocols, focusing on the identification of keywords and/or constructs in English (e.g. instructions, actions).
• Writing rules.
• Executing, testing and debugging the rules.
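The write, execute, test, and debug cycle above can be sketched with a single rule for spotting instructions and their actions. The verb list, the regular expression, and the test sentences are assumptions for illustration, not the project's actual rules.

```python
import re

# Assumed list of action verbs; a real list would come from the protocol analysis.
ACTION_VERBS = {"incubate", "rinse", "centrifuge", "add", "mix"}


def extract_action(sentence):
    """Return the leading action verb if the sentence looks like an instruction, else None."""
    # Skip an optional list marker ("1.", "2)", "•", "-") before the first word.
    match = re.match(r"\s*(?:\d+[.)]\s*|[•\-]\s*)?([A-Za-z]+)", sentence)
    if match and match.group(1).lower() in ACTION_VERBS:
        return match.group(1).lower()
    return None


# (sentence, expected action) pairs used to test the rule.
CASES = [
    ("Incubate at -20C overnight.", "incubate"),
    ("• Rinse DNA briefly in 1-2 ml of wash.", "rinse"),
    ("The samples were stored on ice.", None),
    ("Centrifuge tubes are kept on ice.", None),
]


def run_cases(cases):
    """Return (sentence, expected, actual) for every case the rule gets wrong."""
    failures = []
    for sentence, expected in cases:
        actual = extract_action(sentence)
        if actual != expected:
            failures.append((sentence, expected, actual))
    return failures


for sentence, expected, actual in run_cases(CASES):
    print(f"FAIL: {sentence!r} expected {expected!r}, got {actual!r}")
```

The last case fails on purpose: "Centrifuge" is a noun modifier there, so a rule based only on the first word misfires. Failures of this kind are what the debugging step is meant to surface.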
Work in progress
Summarizing…
Our purpose is to formalize lab protocols by using ontologies and NLP tools to extract information intelligently.
Special thanks…
Supervisors
Oscar Corcho, Alexander Garcia
OEG’s colleagues
Daniel Garijo, María Poveda, Pablo Calleja, Nandana Mihindukulasooriya