Using multiple ontologies to characterise the bioactivity of small molecules


description

Presented at the 2011 ICBO workshop on working with multiple biomedical ontologies. We describe work on text mining for relationship extraction between chemical and biological entities via a language model for bioactivity.

Transcript of Using multiple ontologies to characterise the bioactivity of small molecules

Page 1: Using multiple ontologies to characterise the bioactivity of small molecules

Use of Multiple Ontologies to Characterise the Bioactivity of Small Molecules

Ying Yan 1

Janna Hastings 2,3

Jee-Hyub Kim 1

Stefan Schulz 4

Christoph Steinbeck 2

Dietrich Rebholz-Schuhmann 1

1 Text Mining, European Bioinformatics Institute, UK

2 Chemoinformatics and Metabolism, European Bioinformatics Institute, UK

3 Swiss Centre for Affective Sciences, University of Geneva, Switzerland

4 Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Austria

WoMBO @ ICBO, Buffalo, July 2011

Page 2: Using multiple ontologies to characterise the bioactivity of small molecules

Wednesday, April 12, 2023 2Multiple Ontologies for Small Molecule Bioactivity – WoMBO 2011

Bioactivity is what small molecules do in biological systems

Small molecules bind to receptors

Biochemical pathway is altered

On a macro scale, a phenotypic effect is observed

Page 3: Using multiple ontologies to characterise the bioactivity of small molecules

ChEBI is an ontology of small molecules and their properties

[Figure: ChEBI Ontology. Node labels: chemical entity, role, chemical substance, molecular entity, group, carboxylic acid, carbonyl compound, carboxy group, cefpodoxime (CHEBI:606443), application, biological role, chemical role, pharmaceutical, antibacterial drug, solvent, cyclooxygenase inhibitor. Relationship labels: has role, has part.]

Page 4: Using multiple ontologies to characterise the bioactivity of small molecules

ChEBI role assertions are sparse

• Chemical entities: 26000

• Chemical entities mapped to roles (via has role): 3000

• Mapped roles: 600
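
The sparsity can be made concrete with a little arithmetic on the counts above. This is a minimal sketch: the variable names are illustrative, and the numbers are the rounded figures shown on the slide.

```python
# Rounded counts taken from the slide; the names are illustrative only.
chemical_entities = 26000
entities_with_role = 3000
mapped_roles = 600

# Fraction of chemical entities that carry at least one role assertion.
coverage = entities_with_role / chemical_entities

# Ratio of annotated entities to distinct roles in use.
entities_per_role = entities_with_role / mapped_roles

print(f"Entities with a role assertion: {coverage:.1%}")
print(f"Annotated entities per mapped role: {entities_per_role:.1f}")
```

Roughly one entity in nine has any role assertion, which is the gap the text mining work aims to help fill.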

Page 5: Using multiple ontologies to characterise the bioactivity of small molecules

Bioactivity is reported in the scientific literature

“Resveratrol inhibits cyclooxygenase-2 transcription and activity in phorbol ester-treated human mammary epithelial cells”

“Curcumin inhibits cyclooxygenase-2 transcription in bile acid-and phorbol ester-treated human gastrointestinal epithelial cells”

Page 6: Using multiple ontologies to characterise the bioactivity of small molecules

ChEBI bioactivities are pre-coordinated

Page 7: Using multiple ontologies to characterise the bioactivity of small molecules

Bioactivity refers to multiple semantic types

Enzymes / proteins in general

Biological processes

Cellular or anatomical locations

Organism type

Page 8: Using multiple ontologies to characterise the bioactivity of small molecules

The language of bioactivity

Trigger words: inhibitor, activator, modulator, agonist, antagonist, regulator, suppressor, adaptor, stimulator, toxin, factor, messenger, blocker

Relation extraction via trigger words as features

[Figure: relation between a chemical and a target]
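
As a rough illustration of using trigger words as features, the sketch below scans a sentence for the words listed on this slide. The function name `find_triggers` and the plural-stripping rule are assumptions made for this example, not details given in the talk.

```python
import re

# Trigger words listed on the slide.
TRIGGER_WORDS = {
    "inhibitor", "activator", "modulator", "agonist", "antagonist",
    "regulator", "suppressor", "adaptor", "stimulator", "toxin",
    "factor", "messenger", "blocker",
}


def find_triggers(text):
    """Return (character offset, trigger word) pairs found in text.

    Plural forms ending in 's' are reduced to the singular; this simple
    rule is an assumption of the sketch.
    """
    hits = []
    for match in re.finditer(r"[A-Za-z]+", text):
        token = match.group().lower()
        singular = token[:-1] if token.endswith("s") else token
        if token in TRIGGER_WORDS:
            hits.append((match.start(), token))
        elif singular in TRIGGER_WORDS:
            hits.append((match.start(), singular))
    return hits


print(find_triggers("Kinase suppressor"))
```

A match only says that a bioactivity term may be present. Words such as "factor" also occur in ordinary prose, which is one source of the noise discussed later in the talk.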

Page 9: Using multiple ontologies to characterise the bioactivity of small molecules

Targets and types of interaction

Example: "beta-adrenergic receptor inhibitor", where "beta-adrenergic receptor" is the target and "inhibitor" is the type of interaction
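
The decomposition of such a term into target and type of interaction can be sketched as follows, assuming the simplest case where the interaction word is the last word of the phrase. The function name and the subset of interaction words are illustrative assumptions.

```python
# Illustrative subset of interaction words (all appear on the trigger word slide).
INTERACTION_TYPES = {
    "inhibitor", "activator", "agonist", "antagonist", "stimulator",
    "suppressor", "blocker", "modulator", "regulator",
}


def split_bioactivity_term(term):
    """Split 'X inhibitor' into (target, type of interaction).

    Returns None when the last word is not a known interaction word or
    when there is no target in front of it.
    """
    words = term.strip().split()
    if len(words) < 2:
        return None
    head = words[-1].lower()
    if head not in INTERACTION_TYPES:
        return None
    return " ".join(words[:-1]), head


print(split_bioactivity_term("beta-adrenergic receptor inhibitor"))
```

This handles only head-final noun phrases. Other constructions need syntactic analysis rather than word position.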

Page 10: Using multiple ontologies to characterise the bioactivity of small molecules

Several syntactical structures

• Noun phrase or adjective/adverb composition: Kinase suppressor, HIV transcriptase inhibitor

• Prepositional phrase modifier: Suppressor of fused protein; Oct-1 CoActivator in S phase protein

• Verb phrase as noun phrase modifier: Carbonic-anhydrase inhibitors causing adverse effects in therapeutic use

• Relative clauses as modifier: Factor that binds to inducer of short transcripts protein 1
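
To show why the structures above need different handling, here is a minimal sketch for one of them, the prepositional pattern "<trigger> of <target>". The regular expression, the shortened trigger list and the function name are assumptions for illustration. The talk itself relies on syntactic parsing, which this surface pattern does not replace.

```python
import re

# Shortened, illustrative trigger list.
TRIGGERS = "inhibitor|activator|suppressor|stimulator|antagonist|agonist"

# '<trigger> of <target>', e.g. 'Suppressor of fused protein'.
PREPOSITIONAL = re.compile(
    rf"^(?P<trigger>{TRIGGERS})s?\s+of\s+(?P<target>.+)$",
    re.IGNORECASE,
)


def match_prepositional(phrase):
    """Return (target, trigger) for '<trigger> of <target>', else None."""
    m = PREPOSITIONAL.match(phrase.strip())
    if m is None:
        return None
    return m.group("target"), m.group("trigger").lower()


print(match_prepositional("Suppressor of fused protein"))
```

Verb phrase and relative clause modifiers would not be captured reliably by a pattern like this, which is a reason to parse the sentence first.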

Page 11: Using multiple ontologies to characterise the bioactivity of small molecules

Text mining approach

1. Syntactic parsing

2. Chemical tagging (Oscar, Jochem)

3. Named entity recognition (UniProtKB, Organ, Organisms and GO Biological Process)

4. Target disambiguation (nested types)

5. Pruning ‘noisy’ results using rules

Source: MEDLINE abstracts
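
The ordering of the five steps can be sketched as a chain of stages applied to each abstract. Everything here is a placeholder: the stage functions only record that they ran, and the names `run_pipeline` and `make_stage` are invented for this example. Real stages would call a parser, the chemical taggers and the entity recognisers.

```python
from functools import reduce


def run_pipeline(document, stages):
    """Apply each stage in order; every stage takes and returns a dict."""
    return reduce(lambda doc, stage: stage(doc), stages, document)


def make_stage(name):
    """Build a placeholder stage that only records its own name."""
    def stage(doc):
        return {**doc, "steps": doc.get("steps", []) + [name]}
    return stage


# Stage order follows the numbered list on the slide.
STAGE_NAMES = (
    "syntactic_parsing",
    "chemical_tagging",
    "named_entity_recognition",
    "target_disambiguation",
    "rule_based_pruning",
)
STAGES = [make_stage(name) for name in STAGE_NAMES]

abstract = {"text": "Resveratrol inhibits cyclooxygenase-2 transcription"}
result = run_pipeline(abstract, STAGES)
print(result["steps"])
```

Keeping each step as a separate function makes it straightforward to swap one chemical tagger for another without touching the rest of the chain.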

Page 12: Using multiple ontologies to characterise the bioactivity of small molecules

Pruning out noise

Largest challenges:

• Difficulty in small molecule term recognition

• Small molecule – protein disambiguation

Remove triples from the candidate list when the putative small molecule term:

• is a role term according to ChEBI (e.g. antibiotic)

• has the suffix -ase (normally enzyme names)

• has fewer than three characters
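
The three pruning rules translate directly into a filter over candidate triples. In this sketch the triple layout (small molecule, trigger, target), the function names and the tiny role-term set are assumptions. A real run would load the role terms from ChEBI.

```python
# Tiny illustrative set of ChEBI role terms; not the full list.
CHEBI_ROLE_TERMS = {"antibiotic", "solvent", "pharmaceutical"}


def keep_triple(triple, role_terms=CHEBI_ROLE_TERMS):
    """Return False when the putative small molecule term fails a rule.

    triple is assumed to be (small_molecule, trigger, target).
    """
    molecule = triple[0].strip().lower()
    if molecule in role_terms:    # role term according to ChEBI
        return False
    if molecule.endswith("ase"):  # suffix -ase, normally an enzyme name
        return False
    if len(molecule) < 3:         # fewer than three characters
        return False
    return True


def prune(triples, role_terms=CHEBI_ROLE_TERMS):
    """Keep only the triples that pass every pruning rule."""
    return [t for t in triples if keep_triple(t, role_terms)]


candidates = [
    ("resveratrol", "inhibitor", "cyclooxygenase-2"),
    ("antibiotic", "inhibitor", "cell wall synthesis"),
    ("cellulase", "inhibitor", "growth"),
    ("NO", "activator", "guanylate cyclase"),
]
print(prune(candidates))
```

The length rule is deliberately blunt: it discards noise from abbreviations but also drops genuine short names, so these rules trade recall for precision.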

Page 13: Using multiple ontologies to characterise the bioactivity of small molecules

Results: distribution (feature/target)

Page 14: Using multiple ontologies to characterise the bioactivity of small molecules

Organ and Organism: Target vs. Location

Organ and organism often provide contextual/locational information

However, there are some true positives (as bioactivity targets)

Caesium ion antagonism to chlorpromazine- and L-dopa-produced behavioural depression in mice.

Bothrops jararaca inhibitor; thyroid stimulator

Page 15: Using multiple ontologies to characterise the bioactivity of small molecules

Noise

• On the other hand, …

• Influence of peritoneal dialysis on factors affecting oxygen transport…

• Without influence on WDS were: physostigmine, atropine …

• The cellulase component was not markedly inhibited by …

[Annotation labels on the examples: body part? species? bioactive?]

Page 16: Using multiple ontologies to characterise the bioactivity of small molecules

Tagging chemicals

Jochem – dictionary-based approach: better precision, lower recall

Oscar3 – machine learning approach: better recall, much more noise
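
The precision/recall trade-off between the two styles of tagger can be illustrated with toy data. The mention sets below are invented for this example and are not output from Jochem or Oscar3. Only the two formulas are standard.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) of predicted mentions against gold."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy mention sets, invented for illustration only.
gold = {"resveratrol", "curcumin", "atropine", "chlorpromazine"}
dictionary_style_hits = {"resveratrol", "curcumin"}
learned_style_hits = {"resveratrol", "curcumin", "atropine", "hand", "WDS"}

print(precision_recall(dictionary_style_hits, gold))
print(precision_recall(learned_style_hits, gold))
```

In this toy case the dictionary-style set makes no false claims but misses half the mentions, while the learned-style set finds more of them at the cost of spurious hits.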

Page 17: Using multiple ontologies to characterise the bioactivity of small molecules

The ontology of bioactivity

[Figure: the ontology of bioactivity. Node labels: chemical entity, bioactivity, Target, Macromolecule, Organ, Organism, Biological process. Relationship labels: has_role, has_target, is_a.]

Page 18: Using multiple ontologies to characterise the bioactivity of small molecules

Macromolecules

m1 is a beta adrenergic receptor inhibitor:

m1 subclassOf bearer of some
    (realized by only
        (Inhibition and (has target some BetaAdrenergicReceptor)))

Page 19: Using multiple ontologies to characterise the bioactivity of small molecules

Biological processes

m2 is a mitosis stimulator:

m2 subclassOf bearer of some
    (realized by only
        (Stimulation and (has target some (participant of some Mitosis))))

Page 20: Using multiple ontologies to characterise the bioactivity of small molecules

Organ as target

m3 is a thyroid stimulator:

m3 subclassOf bearer of some
    (realized by only
        (Stimulation and (has target some (has locus some ThyroidGland))))

Page 21: Using multiple ontologies to characterise the bioactivity of small molecules

Species as definitional constraint

m4 is a mouse thyroid stimulator:

m4 subclassOf bearer of some
    (realized by only
        (Stimulation and
            (has target some
                (has locus some (ThyroidGland and part of some Mouse)))))
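
The four axioms share one template and differ only in the interaction class and the expression that fills the target. The sketch below renders that template as plain strings in the notation used on the slides. The helper names `bioactivity_axiom` and `some` are invented, and the output is for reading only, not input for an OWL tool.

```python
def some(relation, filler):
    """Render an existential restriction, e.g. '(has locus some X)'."""
    return f"({relation} some {filler})"


def bioactivity_axiom(molecule, interaction, target_expression):
    """Render the subclass axiom pattern shown on the slides as a string."""
    return (
        f"{molecule} subclassOf bearer of some "
        f"(realized by only ({interaction} and "
        f"(has target some {target_expression})))"
    )


m1 = bioactivity_axiom("m1", "Inhibition", "BetaAdrenergicReceptor")
m2 = bioactivity_axiom("m2", "Stimulation", some("participant of", "Mitosis"))
m3 = bioactivity_axiom("m3", "Stimulation", some("has locus", "ThyroidGland"))
m4 = bioactivity_axiom(
    "m4",
    "Stimulation",
    some("has locus", "(ThyroidGland and part of some Mouse)"),
)

for axiom in (m1, m2, m3, m4):
    print(axiom)
```

Seen this way, adding a species constraint only narrows the innermost filler. The outer "bearer of ... realized by only" frame is the same for every target type.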

Page 22: Using multiple ontologies to characterise the bioactivity of small molecules

Contextual vs. Definitional

Organisms, organs and body parts appear frequently as contextual, locational modifiers for bioactivities

In these cases, the above formalism is too strict

We therefore introduce an additional relationship, has context, between a bioactivity and an organism, organ or body part

Non-definitional: the bioactivity can take place in many organisms, but was discovered through investigations in one organism.

Page 23: Using multiple ontologies to characterise the bioactivity of small molecules

Relating context to chemical-bioactivity associations

Context applies not to bioactivity alone but to small molecule – bioactivity associations (i.e. a ternary relationship)
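
One way to picture the ternary relationship is as a record that ties molecule, bioactivity and context together, so that the context qualifies the association rather than the bioactivity itself. This is a minimal sketch: the class name, its fields and the "Rat" context are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BioactivityAssociation:
    """A small molecule, its bioactivity, and an optional context."""
    molecule: str
    bioactivity: str
    # Organism, organ or body part; contextual, not definitional.
    context: Optional[str] = None


# The same molecule and bioactivity reported in two different contexts.
in_mouse = BioactivityAssociation("m3", "thyroid stimulator", context="Mouse")
in_rat = BioactivityAssociation("m3", "thyroid stimulator", context="Rat")

# The associations differ, but the bioactivity definition is shared.
print(in_mouse != in_rat)
print(in_mouse.bioactivity == in_rat.bioactivity)
```

Contrast this with m4 above, where the species is part of the definition of the bioactivity class itself.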

Page 24: Using multiple ontologies to characterise the bioactivity of small molecules

Next-generation curation tools

Text mining support for the human curation and knowledge discovery effort

Multiple ontology-based reasoning for automated consistency checking and error detection

Page 25: Using multiple ontologies to characterise the bioactivity of small molecules

Conclusions

• Language model for extracting small molecule bioactivity information from text

• Ontology model for accurately representing such information, and allowing automated reasoning across ontologies from chemicals to their targets

Page 26: Using multiple ontologies to characterise the bioactivity of small molecules

Future work

• Gold standard for chemical bioactivity in text to be used to evaluate our approach and to train machine learning tools

• Extending the relationship extraction approach to include chemical roles, applications and structural relationships

Page 27: Using multiple ontologies to characterise the bioactivity of small molecules

Acknowledgements

Thanks: Colin Batchelor (RSC), Adam Bernard (EBI)

Funding: BBSRC, grant agreement number BB/G022747/1 within the "Bioinformatics and biological resources" fund