ChEMBL resources and KNIME - OpenPHACTS… · ChEMBL KNIME nodes. Example: All bioactivities for...

Post on 09-Aug-2020

ChEMBL resources and KNIME

George Papadatos

georgep@ebi.ac.uk

Outline

•  ChEMBL data

•  ChEMBL nodes

•  Web services v2.0

•  UniChem

•  Cheminformatics utilities

•  myChEMBL

•  SureChEMBL and Open PHACTS

Bioactivity data

Compound

Assay/Target

>Thrombin MAHVRGLQLPGCLALAALCSLVHSQHVFLAPQQARSLLQRVRRANTFLEEVRKGNLERECVEETCSYEEAFEALESSTATDVFWAKYTACETARTPRDKLAACLEGNCAEGLGTNYRGHVNITRSGIECQLWRSRYPHKPEINSTTHPGADLQENFCRNPDSSTTGPWCYTTDPTVRRQECSIPVCGQDQVTVAMTPRSEGSSVNLSPPLEQCVPDRGQQYQGRLAVTTHGLPCLAWASAQAKALSKHQDFNSAVQLVENFCRNPDGDEEGVWCYVAGKPGDFGYCDLNYCEEAVEEETGDGLDEDSDRAIEGRTATSEYQTFFNPRTFGSGEADCGLRPLFEKKSLEDKTERELLESYIDGRIVEGSDAEIGMSPWQVMLFRKSPQELLCGASLISDRWVLTAAHCLLYPPWDKNFTENDLLVRIGKHSRTRYERNIEKISMLEKIYIHPRYNWRENLDRDIALMKLKKPVAFSDYIHPVCLPDRETAASLLQAGYKGRVTGWGNLKETWTANVGKGQPSVLQVVNLPIVERPVCKDSTRIRITDNMFCAGYKPDEGKRGDACEGDSGGPFVMKSPFNNRWYQMGIVSWGEGCDRDGKYGFYTHVFRLKKWIQKVIDQFGE

1. Scientific facts (e.g. Ki = 4.5 nM, APTT = 11 min.)

2. Organization, integration, curation and standardization of pharmacology data

3. Insight, tools and resources for translational drug discovery

ChEMBL: Data for drug discovery

KNIME at the EBI

•  Access ChEBI and ChEMBL databases via KNIME nodes

•  Trusted community nodes

•  Algorithms development

•  Document classification

•  Share example workflows and use cases

•  Provide KNIME training to scientists and researchers

•  Wellcome Trust drug discovery courses, EMBL courses

•  CDK community nodes development

http://tech.knime.org/book/embl-ebi-nodes

ChEMBL nodes

ChEMBL KNIME nodes

Example: All bioactivities for hERG

Activity value, assay description, compound, reference
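Outside KNIME, the same retrieval can be sketched against the ChEMBL web services. This is a minimal sketch, not the node's implementation: it assumes the `activity` resource of the ChEMBL data API, that `CHEMBL240` is the ChEMBL target ID for hERG, and that the listed field names are present in the JSON response. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://www.ebi.ac.uk/chembl/api/data"
HERG_TARGET = "CHEMBL240"  # assumption: ChEMBL target ID for hERG


def activity_url(target_chembl_id, limit=100, offset=0):
    """Build the URL for one page of activities recorded against a target."""
    query = urlencode(
        {"target_chembl_id": target_chembl_id, "limit": limit, "offset": offset}
    )
    return f"{BASE_URL}/activity.json?{query}"


def summarise(page):
    """Keep only the columns named on the slide: value, assay, compound, reference."""
    rows = []
    for act in page.get("activities", []):
        rows.append(
            {
                "compound": act.get("molecule_chembl_id"),
                "type": act.get("standard_type"),
                "value": act.get("standard_value"),
                "units": act.get("standard_units"),
                "assay": act.get("assay_description"),
                "reference": act.get("document_chembl_id"),
            }
        )
    return rows


def fetch_page(url):
    """Download and decode one JSON page (requires network access)."""
    with urlopen(url) as response:
        return json.load(response)
```

Calling `summarise(fetch_page(activity_url(HERG_TARGET)))` would return the first page only; further pages have to be requested with a larger `offset`.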

Example: Compound searching in ChEMBL

Query → list of nearest neighbours (NNs)
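A nearest-neighbour query of this kind can be sketched with the similarity resource of the ChEMBL web services. The sketch assumes the URL pattern `similarity/<SMILES>/<threshold>` and a `similarity` field on each returned molecule; the server may enforce a narrower threshold range than the one checked here. The function names are hypothetical.

```python
from urllib.parse import quote

BASE_URL = "https://www.ebi.ac.uk/chembl/api/data"


def similarity_url(smiles, threshold=70):
    """Build a similarity-search URL; the SMILES is percent-encoded for the path."""
    if not 0 < threshold <= 100:
        raise ValueError("threshold is a percentage in the range (0, 100]")
    return f"{BASE_URL}/similarity/{quote(smiles, safe='')}/{threshold}.json"


def rank_neighbours(page):
    """Sort returned molecules by decreasing similarity to the query."""
    return sorted(
        page.get("molecules", []),
        key=lambda mol: float(mol["similarity"]),
        reverse=True,
    )
```

Percent-encoding matters because SMILES characters such as `=`, `(`, `)`, `#` and `/` are not safe inside a URL path.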

Example: Polypharmacology profile

Compounds

Query → Find NNs → Retrieve bioactivities → Filter, summarise & pivot
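The final "filter, summarise & pivot" step can be sketched in plain Python. This is one possible reading of that step, not the workflow itself: it assumes each activity record carries `molecule_chembl_id`, `target_chembl_id` and `pchembl_value`, filters on a hypothetical potency cutoff, and summarises repeated measurements by their median.

```python
from collections import defaultdict
from statistics import median


def pivot_profile(activities, min_pchembl=5.0):
    """Return {compound: {target: median pChEMBL}} for activities above a cutoff."""
    grouped = defaultdict(list)
    for act in activities:
        raw = act.get("pchembl_value")
        if raw is None:
            continue  # no standardised potency recorded
        value = float(raw)
        if value < min_pchembl:
            continue  # filter: drop weak activities
        key = (act["molecule_chembl_id"], act["target_chembl_id"])
        grouped[key].append(value)

    profile = defaultdict(dict)
    for (compound, target), values in grouped.items():
        profile[compound][target] = median(values)  # summarise, then pivot
    return {compound: dict(targets) for compound, targets in profile.items()}
```

Note that the filter is applied before the median, so a compound with mixed results is summarised over its retained measurements only; applying the cutoff after summarising is an equally defensible choice.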

Web services v2.0

•  Many more entities → finer granularity

•  Pagination, filtering, ordering
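Pagination, filtering and ordering can be sketched as follows. The sketch assumes Django-style filter suffixes such as `__gte`, an `order_by` parameter, `limit`/`offset` paging, and a `page_meta.next` entry holding the relative path of the next page. The `fetch` callable is a hypothetical hook so that the paging logic can be exercised without network access.

```python
from urllib.parse import urlencode, urljoin

HOST = "https://www.ebi.ac.uk"


def query_url(resource, filters=None, order_by=None, limit=20, offset=0):
    """Build a filtered, ordered, paginated URL for one resource."""
    params = dict(filters or {})
    if order_by:
        params["order_by"] = order_by  # a leading "-" requests descending order
    params["limit"] = limit
    params["offset"] = offset
    return f"{HOST}/chembl/api/data/{resource}.json?{urlencode(params)}"


def iter_records(first_url, resource_key, fetch):
    """Yield records page by page until the service reports no next page."""
    url = first_url
    while url:
        page = fetch(url)
        yield from page.get(resource_key, [])
        next_path = page.get("page_meta", {}).get("next")
        url = urljoin(HOST, next_path) if next_path else None
```

For example, `query_url("activity", {"target_chembl_id": "CHEMBL240", "pchembl_value__gte": 6}, order_by="-pchembl_value")` asks for the most potent activities first.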

UniChem integration

EMBL-EBI chemistry resources

RDF and REST API interfaces

•  Atlas: ligand-induced transcript response (750)

•  PDBe: ligand structures from structurally defined protein complexes (15K)

•  ChEBI: nomenclature of primary and secondary metabolites; chemical ontology (24K)

•  SureChEMBL: chemical structures from patent literature (~17M)

•  ChEMBL: bioactivity data from literature and depositions (1.5M)

•  3rd party data: ZINC, PubChem, ThomsonPharma DOTF, IUPHAR, DrugBank, KEGG, NIH NCC, eMolecules, FDA SRS, PharmGKB, Selleck, … (~70M)

UniChem – InChI-based chemical resolver (full + relaxed ‘lenses’), >90M

UniChem REST API interface: https://www.ebi.ac.uk/unichem/

Novelty checking with UniChem

https://www.ebi.ac.uk/unichem/
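A novelty check reduces to asking whether an InChIKey is already known to any UniChem source. The sketch below assumes the legacy REST path `rest/inchikey/<InChIKey>` and a response that is a list of `src_id` / `src_compound_id` pairs; both are assumptions and may differ from the current UniChem API. The function names are hypothetical.

```python
import re

UNICHEM_REST = "https://www.ebi.ac.uk/unichem/rest"  # assumed legacy base URL
INCHIKEY_PATTERN = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def inchikey_url(inchikey):
    """Build the lookup URL after checking the standard 14-10-1 InChIKey layout."""
    if not INCHIKEY_PATTERN.match(inchikey):
        raise ValueError(f"not a well-formed InChIKey: {inchikey!r}")
    return f"{UNICHEM_REST}/inchikey/{inchikey}"


def sources_hit(matches):
    """List the distinct source IDs in which the structure was found."""
    return sorted({match["src_id"] for match in matches})


def is_novel(matches):
    """A structure counts as novel here if no source reports it."""
    return len(matches) == 0
```

"Novel" in this sense only means absent from the sources indexed by UniChem, not absent from the literature or patents as a whole.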

Cheminformatics utilities

Cheminformatics utilities (aka ‘Beaker’)

•  Chemical format conversions

•  Dynamic image generation

•  Image processing (via OSRA)

•  Descriptors and property calculations

•  Chemical modifications and standardization

https://www.ebi.ac.uk/chembl/api/utils/docs
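Calls to these utilities can be sketched as simple URL construction. The sketch assumes that GET requests take their input as a URL-safe base64 token appended to the method name, and that a method called `smiles2ctab` exists for SMILES to molfile conversion; both should be checked against the documentation linked above, and the service may no longer be available at this address.

```python
import base64

UTILS_BASE = "https://www.ebi.ac.uk/chembl/api/utils"


def beaker_get_url(method, payload):
    """Build a GET URL with the payload encoded as URL-safe base64."""
    if not method.isalnum():
        raise ValueError("method name should be a plain alphanumeric identifier")
    token = base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")
    return f"{UTILS_BASE}/{method}/{token}"
```

For instance, `beaker_get_url("smiles2ctab", "CCO")` encodes ethanol's SMILES into the request path; larger inputs such as molfiles or images are usually better sent in a POST body.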

Example: Image to Structure

image URL

myChEMBL integration

Accessing local data with myChEMBL

Using KNIME to connect to myChEMBL

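-- Substructure search with the RDKit cartridge; $${SMolecule}$$ is the KNIME flow variable holding the query
SELECT mr.*, md.chembl_id, cp.full_mwt, cp.alogp
FROM mols_rdkit mr
JOIN molecule_dictionary md ON mr.molregno = md.molregno
JOIN compound_properties cp ON md.molregno = cp.molregno
WHERE mr.m @> '$${SMolecule}$$'::qmol;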
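The same substructure query can be sketched from Python through any DB-API connection to a myChEMBL PostgreSQL instance. The sketch assumes a driver using the `%s` parameter style (as psycopg2 does) and passes the query pattern as a bound parameter instead of substituting it into the SQL text. The `LIMIT` clause and the function names are additions for illustration.

```python
SUBSTRUCTURE_SQL = """
SELECT md.chembl_id, cp.full_mwt, cp.alogp
FROM mols_rdkit mr
JOIN molecule_dictionary md ON mr.molregno = md.molregno
JOIN compound_properties cp ON md.molregno = cp.molregno
WHERE mr.m @> %s::qmol
LIMIT %s
"""


def substructure_query(pattern, limit=100):
    """Return the SQL text and its bound parameters for a substructure search."""
    if not pattern.strip():
        raise ValueError("an empty query pattern would match nothing useful")
    return SUBSTRUCTURE_SQL, (pattern, limit)


def run_substructure_search(connection, pattern, limit=100):
    """Execute the query on an open DB-API connection and return all rows."""
    sql, params = substructure_query(pattern, limit)
    cursor = connection.cursor()
    try:
        cursor.execute(sql, params)
        return cursor.fetchall()
    finally:
        cursor.close()
```

Binding the pattern as a parameter avoids quoting problems with characters that are common in SMILES and SMARTS strings.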

SureChEMBL and Open PHACTS

SureChEMBL

SciBite Termite

Open PHACTS API

https://dev.openphacts.org/docs/develop

https://github.com/openphacts/OPS-Knime/

http://rdf.ebi.ac.uk/resource/surechembl/patent/US-8877786-B2

Substituted carbamoylmethylamino acetic acid derivatives as novel NEP inhibitors

US-8877786-B2

Most relevant targets and diseases

MCS scaffold

Most relevant diseases

Most relevant targets

Patent publication date histogram

http://rdf.ebi.ac.uk/resource/surechembl/molecule/SCHEMBL371804

Foretinib, a kinase inhibitor in clinical phase II, found in 89 EP, WO and US patents
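The two SureChEMBL RDF URIs shown in this section share one prefix and differ only in resource type and identifier. A small sketch for splitting them follows; it assumes only the `patent` and `molecule` resource types that appear in the URIs above, and the function name is hypothetical.

```python
SURECHEMBL_PREFIX = "http://rdf.ebi.ac.uk/resource/surechembl/"
KNOWN_KINDS = {"patent", "molecule"}  # only the types seen in the examples above


def parse_surechembl_uri(uri):
    """Split a SureChEMBL RDF resource URI into (kind, identifier)."""
    if not uri.startswith(SURECHEMBL_PREFIX):
        raise ValueError(f"not a SureChEMBL resource URI: {uri!r}")
    kind, _, identifier = uri[len(SURECHEMBL_PREFIX):].partition("/")
    if kind not in KNOWN_KINDS or not identifier:
        raise ValueError(f"unrecognised SureChEMBL resource: {uri!r}")
    return kind, identifier
```

The returned identifier (for example a patent number or an SCHEMBL ID) can then be used as a join key when combining SureChEMBL results with other tables in a workflow.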

Summary

•  KNIME: democratizes access to data and tools

•  Access public domain structure and bioactivity data and services with KNIME

•  ChEMBL KNIME Nodes

•  UniChem

•  Cheminformatics services

•  myChEMBL

•  SureChEMBL

Publications

Acknowledgements

•  Francis Atkinson

•  Louisa Bellis

•  Jon Chambers

•  Michał Nowotka

•  Anne Hersey

•  Stefan Beisken

•  Edmund Duesbury

•  Daniela Digles

•  Thorsten Meinl

•  KNIME

•  KNIME community

All workflow examples are available on request.
