The Open PHACTS Project: Progress and Future Sustainability


Page 1

The Open PHACTS Project: Progress and Future Sustainability

Lee Harland & Bryn Williams-Jones, Open PHACTS / ConnectedDiscovery
Tom Plasterer, AstraZeneca / Open PHACTS Rep

Page 2

Fundamental issue:
• There is a *lot* of science outside your walls
• It's a chaotic space
• Scientists want to find information quickly and easily
• Often they just "can't get there" (or don't even know where "there" is)
• And you have to manage it all (or not)

Page 3

Pre-competitive Informatics: Pharma are all accessing, processing, storing & re-processing external research data

[Diagram labels: Literature, PubChem, Genbank, Patents, Databases, Downloads; Data Integration, Data Analysis, Firewalled Databases; "Repeat @ each company"]

Lowering industry firewalls: pre-competitive informatics in drug discovery. Nature Reviews Drug Discovery (2009) 8, 701-708. doi:10.1038/nrd2944

Page 4

The Innovative Medicines Initiative
• EC funded public-private partnership for pharmaceutical research
• Focus on key problems
– Efficacy, Safety, Education & Training, Knowledge Management

The Open PHACTS Project
• Create a semantic integration hub ("Open Pharmacological Space")…
• Delivering services to support on-going drug discovery programs in pharma and public domain
• Not just another project; leading academics in semantics, pharmacology and informatics, driven by solid industry business requirements
• 23 academic partners, 8 pharmaceutical companies, 3 biotechs
• Work split into clusters:
  • Technical Build
  • Scientific Drive
  • Community & Sustainability

The Project

Page 5

"Let me compare MW, logP and PSA for known oxidoreductase inhibitors"

"What is the selectivity profile of known p38 inhibitors?"

"Find me compounds that inhibit targets in NFkB pathway assayed in only functional assays with a potency <1 μM"

Data sources: ChEMBL, DrugBank, Gene Ontology, Wikipathways, GeneGo, UniProt, UMLS, ChEBI, GVKBio, ChemSpider, ConceptWiki, TrialTrove, TR Integrity

Page 6

Business Question Driven Approach

Number | Sum | Nr of 1 | Question
15 | 12 | 9 | All oxidoreductase inhibitors active <100nM in both human and mouse
18 | 14 | 8 | Given compound X, what is its predicted secondary pharmacology? What are the on- and off-target safety concerns for a compound? What is the evidence and how reliable is that evidence (journal impact factor, KOL) for findings associated with a compound?
24 | 13 | 8 | Given a target find me all actives against that target. Find/predict polypharmacology of actives. Determine ADMET profile of actives.
32 | 13 | 8 | For a given interaction profile, give me compounds similar to it.
37 | 13 | 8 | The current Factor Xa lead series is characterised by substructure X. Retrieve all bioactivity data in serine protease assays for molecules that contain substructure X.
38 | 13 | 8 | Retrieve all experimental and clinical data for a given list of compounds defined by their chemical structure (with options to match stereochemistry or not).
41 | 13 | 8 | A project is considering Protein Kinase C Alpha (PRKCA) as a target. What are all the compounds known to modulate the target directly? What are the compounds that may modulate the target directly? i.e. return all compounds active in assays where the resolution is at least at the level of the target family (i.e. PKC) both from structured assay databases and the literature.
44 | 13 | 8 | Give me all active compounds on a given target with the relevant assay data
46 | 13 | 8 | Give me the compound(s) which hit most specifically the multiple targets in a given pathway (disease)
59 | 14 | 8 | Identify all known protein-protein interaction inhibitors

http://www.sciencedirect.com/science/article/pii/S1359644613001542

Page 7

"Provenance Everywhere"

[Diagram labels: Platform Explorer, Apps, Standards, API]

Page 8

Page 9

[Architecture diagram labels]
• Apps
• Linked Data API (RDF/XML, TTL, JSON)
• Domain Specific Services
• Core Platform: Identity Resolution Service; Semantic Workflow Engine; Identifier Management Service; Chemistry Registration, Normalisation & Q/C; Indexing
• Example label and identifiers shown: "Adenosine receptor 2a"; P12374, EC2.43.4, CS4532
• Data Cache (Virtuoso Triple Store)
• Data sources (Db), annotated with VoID and Nanopub: Public Content, Commercial, Public Ontologies, User Annotations
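The identity resolution and identifier management boxes in the diagram can be pictured with a small sketch. This is an illustration only, not the platform's implementation: the concept key `concept:adenosine-receptor-2a`, both dictionaries and both function names are hypothetical, and grouping the three identifiers under the one label is a reading of the diagram rather than something the slide states. The identifiers are kept as opaque strings because the slide does not say which source each belongs to.

```python
# Hypothetical in-memory stand-ins for two services named on the slide.
LABEL_TO_CONCEPT = {
    "adenosine receptor 2a": "concept:adenosine-receptor-2a",  # made-up key
}
CONCEPT_TO_IDENTIFIERS = {
    # Identifiers copied from the slide; their grouping is an assumption.
    "concept:adenosine-receptor-2a": ["P12374", "EC2.43.4", "CS4532"],
}


def resolve_label(label):
    """Identity resolution: free text -> concept key, or None if unknown."""
    return LABEL_TO_CONCEPT.get(label.strip().lower())


def identifiers_for(concept):
    """Identifier management: concept key -> list of known identifiers."""
    return list(CONCEPT_TO_IDENTIFIERS.get(concept, []))


concept = resolve_label("Adenosine receptor 2a")
print(identifiers_for(concept))
```

Keeping the two lookups separate mirrors the diagram, where resolving a name and managing identifier equivalences are drawn as distinct services.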

Page 10

Present Content

Page 11

Data Licensing Solution

Chose John Wilbanks as consultant

A framework built around STANDARD well-understood Creative Commons licences – and how they interoperate

Deal with the problems by:
• Interoperable licences
• Appropriate terms
• Declare expectations to users and data publishers
• One size won't fit all requirements

Page 12

It's easy to integrate, difficult to integrate well:

Page 13

What Is Gleevec?

[Labels: Imatinib, Mesylate; PubChem, DrugBank, ChemSpider]

Page 14

Dynamic Equality

Strict / Relaxed
Analysing / Browsing

chemspider:gleevec
drugbank:gleevec

LinkSet#1 {
chemspider:gleevec hasParent imatinib ...
drugbank:gleevec exactMatch imatinib ...
}
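One way to picture dynamic equality is as a choice of which link predicates count as "the same thing" for a given task. The sketch below is a minimal illustration, not the project's implementation: the three link triples are taken from the slide, but the rule that a strict setting follows only `exactMatch` while a relaxed setting also follows `hasParent` is an assumption, as are the names `STRICT`, `RELAXED` and `equivalent`.

```python
from collections import defaultdict

# Links as shown in LinkSet#1 on the slide (subject, predicate, object).
LINKSET = [
    ("chemspider:gleevec", "hasParent", "imatinib"),
    ("drugbank:gleevec", "exactMatch", "imatinib"),
]

# Assumed predicate sets for the two ends of the Strict/Relaxed scale.
STRICT = frozenset({"exactMatch"})
RELAXED = frozenset({"exactMatch", "hasParent"})


def equivalent(start, links, allowed):
    """Return every identifier reachable from start via allowed predicates."""
    neighbours = defaultdict(set)
    for subject, predicate, obj in links:
        if predicate in allowed:
            # Treat an accepted link as symmetric for equality purposes.
            neighbours[subject].add(obj)
            neighbours[obj].add(subject)
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


print(sorted(equivalent("drugbank:gleevec", LINKSET, STRICT)))
print(sorted(equivalent("drugbank:gleevec", LINKSET, RELAXED)))
```

Under these assumptions the strict setting keeps the ChemSpider record separate, which suits analysis, while the relaxed setting merges all three identifiers, which suits browsing.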

Page 15

Play! https://dev.openphacts.org/

Page 16

APPS

Page 17

http://explorer.openphacts.org

Page 18

Page 19

Example applications

Advanced analytics
• ChemBioNavigator: Navigating at the interface of chemical and biological data with sorting and plotting options
• TargetDossier: Interconnecting Open PHACTS with multiple target centric services. Exploring target similarity using diverse criteria
• PharmaTrek: Interactive Polypharmacology space of experimental annotations
• UTOPIA: Semantic enrichment of scientific PDFs

Predictions
• GARFIELD: Prediction of target pharmacology based on the Similar Ensemble Approach
• eTOX connector: Automatic extraction of data for building predictive toxicology models in eTOX project

Page 20

Page 21

Uptake at AstraZeneca: a Use Case

Applying BioAssay Ontology to facilitate HTS analysis

Linda Zander Balderud, Ola Engkvist

Chemistry Innovation Centre, Discovery Sciences, AstraZeneca

Page 22

Assay Informatics project: Benefits in Adopting BioAssay Ontology (BAO)

• Common language for assay annotation
• Improved project success analyses based on assay technologies
• Better understand the impact of technology artifacts like frequent hitters
• Assay design and screening cascade support during assay development in early projects
• Improved capability to perform combined data mining of internal and public data

[Image: FLIPR Tetra High Throughput Cellular Screening System (from Molecular Devices)]

Page 23

The BioAssay Ontology (BAO)
Computational Science, University of Miami, USA

Domain:
• Assay design
• Assay format
• Detection technology
• Meta target
• Endpoint
• Perturbagen

BioAssay Ontology imports:
• NCBI taxonomy - organism names and IDs
• UniProt - protein target names and IDs
• Unit Ontology - concentration and time unit terms
• Ontology of Biomedical Investigation – descriptions of biological assays
• Gene Ontology - biological processes
• Cell Line Ontology - cell line names
• CL – cell types
• UBERON – anatomical entities
• PATO – cell phenotype
• SAR connect – target classifications

[Diagram labels: BioAssay Ontology; Assay Information and Analysis; Protein Origin; Cell Line; Screening Cascade; Assay Information; Signalling; Assay Success; Target hits, Results; Pathways; External Assays (PubChem, ChEMBL)]

Page 24

Migration to BAO: Annotation of HTS assays

Manual annotation of protocols

HTS assay: reporter gene assay
• Assay method: reporter gene method: beta lactamase induction
• Detection technology: FRET
• Bioassay: beta lactamase assay
• Assay kit: LiveBLAzer FRET - B/G Loading Kit
• Wavelength: ex 405 em 460, 535
• Biological process
• Disease

HTS assay: FLIPR
• Assay method: molecular redistribution determination assay
• Detection technology: fluorescence intensity
• Bioassay: calcium redistribution assay
• Assay kit: Fluo-8 No Wash Calcium Assay Kit
• Wavelength: ex 480 em 530
• Biological process
• Disease

Over 900 PubChem assays have been annotated by the BioAssay Ontology team
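The two annotated protocols on this slide can be captured as structured records, which is what makes later comparisons across assays possible. The sketch below is only an illustration: the class `AssayAnnotation`, its field names and the helper `count_by` are hypothetical and are not BAO terms or an AstraZeneca schema. The field values are copied from the slide, with each wavelength assigned to the assay whose kit it appears to describe.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayAnnotation:
    """Hypothetical record mirroring the annotation fields on the slide."""
    hts_assay: str
    assay_method: str
    detection_technology: str
    bioassay: str
    assay_kit: str
    wavelength: str


ANNOTATIONS = [
    AssayAnnotation(
        hts_assay="reporter gene assay",
        assay_method="reporter gene method: beta lactamase induction",
        detection_technology="FRET",
        bioassay="beta lactamase assay",
        assay_kit="LiveBLAzer FRET - B/G Loading Kit",
        wavelength="ex 405 em 460, 535",
    ),
    AssayAnnotation(
        hts_assay="FLIPR",
        assay_method="molecular redistribution determination assay",
        detection_technology="fluorescence intensity",
        bioassay="calcium redistribution assay",
        assay_kit="Fluo-8 No Wash Calcium Assay Kit",
        wavelength="ex 480 em 530",
    ),
]


def count_by(field_name, annotations):
    """Count annotations per value of one field, e.g. detection technology."""
    return Counter(getattr(a, field_name) for a in annotations)


print(count_by("detection_technology", ANNOTATIONS))
```

With a shared vocabulary in each field, the same `count_by` call works unchanged on in-house and public assay sets.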

Page 25

Assay Development Support: Comparison study between AstraZeneca and PubChem HTS assays

412 in-house HTS assays since 2005 have been annotated according to the BioAssay Ontology. The assay design and technology of the annotated assays were analyzed together with 239 primary assays from PubChem. The analyzed PubChem assays are biochemical assays, assays detected by luminescence and/or assays using GPCR targets.

Of the annotated assays, 515 used human targets, and together they represented 311 different human targets in the study.

15 of the in-house targets were also screened in at least one PubChem assay. Eight of these were GPCR targets.

[Venn diagram of human targets: AstraZeneca 194, PubChem 102, overlap 15]
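The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch assumes that 194 and 102 in the Venn diagram are the targets unique to AstraZeneca and to PubChem respectively, with 15 shared; that reading is an assumption, though it is consistent with the stated total of 311 human targets. The variable names are illustrative.

```python
# Assay counts stated in the text.
in_house_assays = 412
pubchem_assays = 239
all_assays = in_house_assays + pubchem_assays

# Venn diagram counts, read as exclusive regions (an assumption).
az_only = 194
pubchem_only = 102
shared = 15

combined_targets = az_only + pubchem_only + shared
az_targets = az_only + shared
pubchem_targets = pubchem_only + shared

print(all_assays)                      # 651
print(combined_targets)                # 311, matching the slide
print(az_targets, pubchem_targets)     # 209 117
print(f"{shared / az_targets:.1%}")    # 7.2% of in-house targets are shared
```

On this reading only a small fraction of the in-house human targets had a counterpart among the analyzed PubChem assays.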

Page 26

Assay Development Support: Detection Technology of AZ and PubChem Biochemical Assays

[Charts: AstraZeneca; PubChem]

Page 27

Assay Development Support: Assay design of in-house and PubChem GPCR HTS

[Chart by GPCR target class]

One explanation for the low usage of the cAMP redistribution method among the annotated PubChem assays could be that no class B GPCRs have been screened.

Page 28

Sustaining The Project

The Open PHACTS Foundation

Page 29

Benefits…

• Access to a wide range of interconnected data – easily jump between pharmacology, chemistry, disease, pathways and other databases without having to perform complex mapping operations
• Query by data type, not by data source ("Protein Information" not "UniProt Information")
• API queries that seamlessly connect data (for instance the Pharmacology query draws data from ChEMBL, ChemSpider, ConceptWiki and DrugBank)
• Strong chemistry representation – all chemicals reprocessed via Open PHACTS chemical registry to ensure consistency across databases
• Built using open community standards, not an ad-hoc solution. Developed in conjunction with 8 major pharma (so your app will speak their language!)
• Simple, flexible data-joining (join compound data ignoring salt forms, join protein data ignoring species)
• Provenance everywhere – every single data point tagged with source, version, author, etc
• Nanopublication-enabled. Access to a rich dataset of established and emerging biomedical "assertions"
• Professionally Hosted (Continually Monitored)
• Developer-friendly JSON/XML methods. Consistent API for multiple services
• Seamless data upgrades. We manage updates so you don't have to
• Community-curation tools to enhance and correct content
• Access to a rich application network (many different App builders)
• Toolkits to support many different languages, workflow engines and user applications
• Private and secure, suitable for confidential analyses
• Active & still growing through a unique public-private partnership
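Two of the benefits listed above, querying by data type rather than by source and tagging every data point with provenance, can be shown together in a short sketch. Everything here is hypothetical: the `DataPoint` class, the `query` function, the subjects and the placeholder values are illustrative and are not the Open PHACTS API or real records. Only the source names and the "Pharmacology" and "Protein Information" type labels come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPoint:
    """Hypothetical data point carrying its own provenance."""
    data_type: str
    subject: str
    value: str
    source: str   # provenance: where the value came from
    version: str  # provenance: which release of the source


# Placeholder records; values and versions are not real data.
STORE = [
    DataPoint("Protein Information", "example-protein", "placeholder name",
              "UniProt", "version-placeholder"),
    DataPoint("Pharmacology", "example-compound", "placeholder activity",
              "ChEMBL", "version-placeholder"),
    DataPoint("Pharmacology", "example-compound", "placeholder structure",
              "ChemSpider", "version-placeholder"),
]


def query(data_type, subject, store=STORE):
    """Return all data points of one type for a subject, whatever the source."""
    return [p for p in store if p.data_type == data_type and p.subject == subject]


for point in query("Pharmacology", "example-compound"):
    print(point.source, point.version, point.value)
```

The caller never names a database, yet each returned point still says where it came from.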

Page 30

Kick-Starting Sustainability

[Diagram labels, partly recoverable: Open PHACTS, Apps, Collaborations, Grants, Industry, API Users]

Page 31

Page 32

Pfizer Limited – Coordinator
Universität Wien – Managing entity
Spanish National Cancer Research Centre
University of Manchester
Novartis
Merck Serono
Technical University of Denmark
University of Hamburg, Center for Bioinformatics
BioSolveIT GmbH
Consorci Mar Parc de Salut de Barcelona
Maastricht University
Aqnowledge
University of Santiago de Compostela
Rheinische Friedrich-Wilhelms-Universität Bonn
H. Lundbeck A/S
Eli Lilly
Netherlands Bioinformatics Centre
Swiss Institute of Bioinformatics
ConnectedDiscovery
EMBL-European Bioinformatics Institute
Leiden University Medical Centre
Royal Society of Chemistry
Vrije Universiteit Amsterdam
AstraZeneca
GlaxoSmithKline
Esteve
Janssen
OpenLink

[email protected] @Open_PHACTS Open PHACTS