Open PHACTS: Semantic interoperability for drug discovery · 16/03/2016 · SureChEMBL system...
ACS National Meeting, San Diego
Herman van Vlijmen, 16 Mar 2016
Open PHACTS: Semantic interoperability for drug discovery
Judith Hinton Andrew, Rock Composite 22 Artwork from The Creative Center
What is Linked Data?
"LOD Cloud 2014" by Max Schmachtenberg, Christian Bizer, Anja Jentzsch and Richard Cyganiak - http://lod-cloud.net/. Licensed under CC BY-SA 3.0 via Commons - https://commons.wikimedia.org/wiki/File:LOD_Cloud_2014.svg#/media/File:LOD_Cloud_2014.svg
How can data be linked?
Requires linking to standards: common “concepts”
– Names, units, chemical structures, etc.
Data storage format
– Triples, graphs
Query tools
– SPARQL
Provenance
– Original data source
Chen et al. BMC Bioinformatics 2010, 11:255
Linked Data Storage example: RDF triples
Figure panels: Basic format; Linking data sets; Concept standards
http://www.accessola2.com/olita/insideolita/wordpress/?p=60281
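The triple idea can be sketched in a few lines of plain Python. This is only an illustration, not how Open PHACTS stores data: every fact is a (subject, predicate, object) tuple, two small data sets are linked because they share the same target identifier, and a wildcard pattern match stands in for a SPARQL-style query. All `ex:` identifiers and the compound records are hypothetical placeholders.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Data set A: compound activities (hypothetical placeholder records).
ACTIVITY_TRIPLES: List[Triple] = [
    ("ex:compound/1", "ex:activeAgainst", "ex:target/CDK2"),
    ("ex:compound/2", "ex:activeAgainst", "ex:target/CDK2"),
    ("ex:compound/3", "ex:activeAgainst", "ex:target/BCL2"),
]

# Data set B: target annotations. It links to data set A only because
# both use the same identifier for each target (the shared "concept").
TARGET_TRIPLES: List[Triple] = [
    ("ex:target/CDK2", "ex:memberOfFamily", "ex:family/kinase"),
    ("ex:target/BCL2", "ex:memberOfFamily", "ex:family/apoptosis_regulator"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None is a wildcard."""
    for triple in triples:
        if (
            (s is None or triple[0] == s)
            and (p is None or triple[1] == p)
            and (o is None or triple[2] == o)
        ):
            yield triple


def compounds_for_family(graph: Iterable[Triple], family: str) -> List[str]:
    """Join the two data sets: family -> targets -> active compounds."""
    graph = list(graph)
    targets = {s for s, _, _ in match(graph, p="ex:memberOfFamily", o=family)}
    return sorted(
        s for s, _, o in match(graph, p="ex:activeAgainst") if o in targets
    )


graph = ACTIVITY_TRIPLES + TARGET_TRIPLES
print(compounds_for_family(graph, "ex:family/kinase"))
# prints ['ex:compound/1', 'ex:compound/2']
```

The join works only because both data sets agree on the identifier for CDK2, which is the role of the "concept standards" panel.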
Examples of Linked Data challenges
Data types and units for pharmacological activity in ChEMBL
Lee and Gobbi. J. Chem. Inf. Model. 2012, 52, 285−292
Stereochemistry
Tautomerism
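The units challenge can be made concrete with a small sketch. Assuming activity values arrive as IC50 concentrations in mixed units, one common normalization is to convert everything to mol/L and report pIC50 = -log10(IC50 in mol/L). The unit table and function name below are illustrative choices, not taken from ChEMBL or Open PHACTS.

```python
import math

# Conversion factors from common concentration units to mol/L.
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def to_pic50(value: float, unit: str) -> float:
    """Convert an IC50 given in a molar unit to pIC50 (-log10 of mol/L)."""
    if unit not in UNIT_TO_MOLAR:
        raise ValueError(f"unsupported unit: {unit!r}")
    if value <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(value * UNIT_TO_MOLAR[unit])


# The same potency reported three different ways (made-up example values).
reported = [(10, "nM"), (0.01, "uM"), (1e-8, "M")]
for value, unit in reported:
    print(f"{value} {unit} -> pIC50 {to_pic50(value, unit):.2f}")
```

Units that are not molar concentrations (for example ug/mL) cannot be converted without the molecular weight, so the sketch rejects them rather than guessing.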
Why do we need Linked Data?
Multiple data sources can be queried at once
– For example: in-house data, ChEMBL, PubChem, Thomson-Reuters, DrugBank, GOSTAR, all have compound pharmacology data
– Time savings
– Certain to get full picture from private, public, and commercial data
Complex questions can be asked relatively easily
– Databases from multiple domains, e.g. compounds, diseases, genes, pathways, etc.
– Scientists will ask things they would not ask otherwise
Completely new type of analysis
– Network based queries, semantic reasoning: not possible previously
Answering more complex questions
TODAY: What are the Janssen compounds active in this Janssen assay?
WITH LINKED DATA: Give me all internal/commercial/public data on compounds that are active on my target and other closely related targets.
TODAY: What is the difference in gene expression profile between tumor and normal tissue?
WITH LINKED DATA: Given the differences in gene expression profiles between these tissues, give me the compounds with biochemical activity profiles that resemble the difference profile most.
TODAY: I have a CDK2 lead compound. Is there anything known in PubMed on toxicity of CDK2 inhibitors?
WITH LINKED DATA: Given my CDK2 lead compound, what are the most likely mechanisms by which this compound class could cause toxicity?
New types of analysis with Linked Data
TODAY: Search PubMed for potential target-disease association: “bcl2 and schizophrenia”
WITH LINKED DATA: Show me all possible direct and indirect links between bcl2 and schizophrenia, ranked by level of scientific data support
TODAY: Search a gene disease association database like DISGENET for possible genes/proteins that can serve as biomarkers for colorectal cancer
WITH LINKED DATA: Based on all data that I have access to, provide a prioritized list of potential biomarkers for colorectal cancer that satisfy specific tissue constraints and are obtainable from blood, urine, or stool
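A query for direct and indirect links, ranked by support, can be sketched as a path search over an evidence graph. The graph below is entirely made up (node names such as `process_A` and all counts are placeholders, not real findings), and scoring a path by its weakest edge is one simple assumption among many possible ranking schemes, not the method of any tool named in this talk.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical evidence graph: (node_a, node_b, supporting_record_count).
EDGES = [
    ("bcl2", "schizophrenia", 2),
    ("bcl2", "process_A", 40),
    ("process_A", "schizophrenia", 15),
    ("bcl2", "protein_B", 5),
    ("protein_B", "process_A", 30),
]


def build_adjacency(edges) -> Dict[str, Dict[str, int]]:
    """Store each undirected edge in both directions."""
    adjacency: Dict[str, Dict[str, int]] = defaultdict(dict)
    for a, b, support in edges:
        adjacency[a][b] = support
        adjacency[b][a] = support
    return adjacency


def ranked_paths(
    adjacency: Dict[str, Dict[str, int]], start: str, goal: str, max_hops: int = 3
) -> List[Tuple[List[str], float]]:
    """Enumerate simple paths of at most max_hops edges from start to goal.

    Each path is scored by its weakest edge; results are sorted by
    descending score, with shorter paths first on ties.
    """
    results: List[Tuple[List[str], float]] = []

    def walk(node: str, path: List[str], score: float) -> None:
        if node == goal:
            results.append((path, score))
            return
        if len(path) - 1 >= max_hops:  # hop budget used up
            return
        for neighbour, support in adjacency[node].items():
            if neighbour not in path:  # keep paths simple (no revisits)
                walk(neighbour, path + [neighbour], min(score, support))

    walk(start, [start], float("inf"))
    return sorted(results, key=lambda item: (-item[1], len(item[0])))


adjacency = build_adjacency(EDGES)
for path, score in ranked_paths(adjacency, "bcl2", "schizophrenia"):
    print(" -> ".join(path), "| weakest-link support:", score)
```

With this toy data the indirect route through `process_A` ranks above the direct edge, which shows how an indirect link can carry more support than a direct one.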
Example: Gene variant disease association workflow
[Figures: workflow steps 1 to 4, followed by the bcl2 - schizophrenia example]
Slides from Euretos (www.euretos.com)
Current activities in the Linked Data environment: some examples
Open PHACTS: EU and Pharma sponsored IMI (Innovative Medicines Initiative) project to develop a Linked Data database and semantic applications in the biomedical field (2011-2016)
ELIXIR: sustainable European infrastructure for biological information. Interoperability of data is a key objective
Strong emphasis on making data sources FAIR (Findable, Accessible, Interoperable, Reusable) in ongoing ELIXIR and NIH activities
Development of advanced Linked Data analysis tools
– For example: Euretos, Cambridge Semantics, Ontoforce
Pharma and Biotech companies are actively integrating internal data with public and commercial databases, working with data companies and public-private consortia
Open PHACTS consortium partners
Associated partners
Consortium partners
www.openphacts.org
Mission: Integrate multiple research biomedical data resources into a single open, sustainable and free access point
www.openphactsfoundation.org
The Open PHACTS Foundation is a registered charity dedicated to sustaining and developing the Open PHACTS Discovery Platform after completion of the IMI project
Open PHACTS data sources
Applications that use the Open PHACTS API
API freely accessible via http://dev.openphacts.org
App Ecosystem
Open PHACTS access
Free API access: http://dev.openphacts.org
Virtual Machine install of Open PHACTS behind firewall, using Docker image
– Beta testing with a Pharma partner
– Allows you to customize and load your own data
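As a rough sketch of what calling a key-protected REST API like this might look like from a script, the helper below only builds a request URL and sends nothing. The slides give just the access address, so the base URL, the `compound` endpoint, and the `uri`, `app_id` and `app_key` parameter names are all assumptions for illustration; the real API documentation should be consulted for the actual names.

```python
from typing import Dict
from urllib.parse import parse_qs, urlencode, urlsplit


def build_request_url(
    base_url: str, endpoint: str, params: Dict[str, str], app_id: str, app_key: str
) -> str:
    """Build a GET URL with credentials appended as query parameters.

    The parameter names "app_id" and "app_key" are assumed, not confirmed.
    """
    query = dict(params)
    query["app_id"] = app_id
    query["app_key"] = app_key
    return f"{base_url.rstrip('/')}/{endpoint.lstrip('/')}?{urlencode(query)}"


# Placeholder host, endpoint and identifiers; nothing is sent over the network.
url = build_request_url(
    "https://api.example.org/v1",
    "compound",
    {"uri": "http://example.org/compound/1"},
    app_id="MY_APP_ID",
    app_key="MY_APP_KEY",
)
print(url)
print(parse_qs(urlsplit(url).query)["uri"])
```

For a behind-the-firewall install, the same helper would presumably be pointed at the internal host by changing only `base_url`.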
Phenotypic Drug Discovery Workflows
Digles et al, MedChemComm, submitted
“Knowing the knowns”
Recent Open PHACTS developments: Patent Info
Huge amount of knowledge in patent corpus, most of which will never be published elsewhere, but potentially great value to drug discovery
SureChEMBL system (EBI) already extracts compounds from these documents
Open PHACTS consortium funded project to also extract gene/disease information (EMBL-EBI and SciBite)
~4 million patents in total, 260 million annotations (patent-compound, patent-gene or patent-disease associations)
Example use cases:
– For a given target, give me all the compounds that are linked to this target through patents
– For a given disease, give me all the targets that are linked to this disease through patents
– Tell me how reliable these links are
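The first and third use cases can be sketched as a join over annotation records. The sketch assumes each annotation is a (patent, entity type, entity) record and uses the number of patents in which a target and a compound co-occur as a crude stand-in for reliability. The patent and compound identifiers are made up, and this counting scheme is an illustrative assumption, not the scoring used by SureChEMBL or Open PHACTS.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical annotation records: (patent_id, entity_type, entity_id).
ANNOTATIONS = [
    ("PAT-001", "gene", "CDK2"),
    ("PAT-001", "compound", "CMPD-A"),
    ("PAT-001", "compound", "CMPD-B"),
    ("PAT-002", "gene", "CDK2"),
    ("PAT-002", "compound", "CMPD-A"),
    ("PAT-003", "gene", "BCL2"),
    ("PAT-003", "compound", "CMPD-C"),
]


def compounds_linked_to_target(annotations, target: str) -> List[Tuple[str, int]]:
    """Return (compound, patent_count) pairs for patents mentioning target.

    Sorted by descending patent count, then by compound identifier.
    """
    patents_with_target = {
        patent
        for patent, entity_type, entity in annotations
        if entity_type == "gene" and entity == target
    }
    supporting: Dict[str, Set[str]] = defaultdict(set)
    for patent, entity_type, entity in annotations:
        if entity_type == "compound" and patent in patents_with_target:
            supporting[entity].add(patent)
    return sorted(
        ((compound, len(patents)) for compound, patents in supporting.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )


print(compounds_linked_to_target(ANNOTATIONS, "CDK2"))
# prints [('CMPD-A', 2), ('CMPD-B', 1)]
```

The disease-to-target use case follows the same pattern with the entity types swapped. A raw co-occurrence count says nothing about where in the patent the entities appear, so a real reliability score would likely need more than this.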
Acknowledgements
Janssen
– Edgar Jacoby
– Jean-Marc Neefs
– Dmitrii Rassokhin
– Doug Martin
Open PHACTS and Open PHACTS Foundation
Euretos
– Albert Mons
– Arie Baak