Data-intensive drug design - Chemical Information...


Page 1: Data-intensive drug design - Chemical Information …bulletin.acscinf.org/PDFs/240nm18.pdfData-intensive drug design ... make decisions. •Today, in Big Pharma, biotechs, ... examples

Data-intensive drug design

1:45 Introductory Remarks

1:50 Strategies for the identification and generation of informative compound sets

Michael S Lajiness (Lilly)

2:15 Public-domain data resources at the European Bioinformatics Institute and their use in drug discovery

Christoph Steinbeck (EBI)

2:40 Decision making in the face of complicated drug discovery data using the Novartis system for virtual medicinal chemistry (FOCUS)

Donovan Chin (Novartis)

3:05 Integrating chemical and biological data: Insights from 10 years of VERDI

Susan Roberts, W. P. Walters, R. McLoughlin, P. Gabriel, J. Willis, T. Kramer (Vertex)

3:30 Intermission

3:45 Collaborative database and computational models for tuberculosis drug discovery decision making

Dr. Sean Ekins PhD, Dr J. Bradford PhD, K. Dole, A. Spektor, K. Gregory, D. Blondeau, Dr M. Hohman PhD, Dr B. Bunin (Collaborative Drug Discovery)

4:10 Data drive life sciences: The Pyramids meet the Tower of Babel

Dr. Rajarshi Guha (NIH)

4:35 Design principles for diversity-oriented synthesis: Facilitating downstream discovery with upfront design

Lisa Marcaurelle (Broad Institute)

5:00 Overview: Data-intensive drug design

John H Van Drie (Van Drie Research)

Page 2

Data-intensive drug design: basic principles

John H Van Drie

Van Drie Research

www.vandrieresearch.com

Page 3

Our problem resembles this

In the early days of aviation, the pilot’s cockpit was pretty simple

Twenty years later, it was still a stick-and-rudder and a few gauges.

Images from http://einhornpress.com/airplane.aspx

Page 4

Our problem resembles this

Twenty years further on, what a pilot faced got complicated.

Pilots today face something like this.

Page 5

Excel: the drug discovery chemists’ stick-and-rudder

Page 6

When there’s lots of data, and many molecules, this becomes unwieldy

• Drug discovery teams are getting more and more data. In a perfect world, more data should lead to better decisions.

• In practice, this is rarely so. Faced with many molecules and lots of data, there is a natural tendency to simply discard them, or otherwise to fail to exploit them fully.

• Excel tends to accentuate this, though drug discovery chemists will never stop using Excel.

• One must be mindful of some fundamental principles of data-intensive drug design.

Page 7

In modern drug discovery, chemistry decisions are made in many places

Chemists designing screening lib’s

HTS

Hit-list triaging

Hit-to-lead

Lead-opt.

Preclinical studies

The potential to make poor decisions (e.g. discarding a useful molecule) is enormous.

Page 8

Beware of over-reliance on Lipinski’s Rule-of-5

• It’s tempting, at each of these stages, to throw out molecules that do not conform to the Ro5.

• However, many well known drugs do not conform: Tagamet, Viagra, most statins, all HIV-protease inhibitors.

• Three assumptions in Lipinski et al.’s analysis:

(1) they only analyzed orally-available drugs

(2) they explicitly excluded drugs based on natural products, like erythromycin and Taxol

(3) over 50% of the drugs in their analysis target GPCR’s.

Page 9

Relaxing those assumptions gives rise to a different perspective

Legend: Purple – high %F; Dk Blue – med; Lt Blue – low; Gold – injectables; Red – active xport

The black ellipse shows the boundaries of the original ‘Egan Egg’ (JMC, 2000).

Data from drugbank.ca

Here I’ve taken a 400-molecule subset of the 995 molecules in DrugBank with %F listed, and plotted their PSA against calculated logP, excluding only GPCR-targeted drugs.

“In drug discovery, there are no rules” - JW

Many senior, experienced scientists were betting that Vertex’s VX-950 “could never be a drug”. Nonetheless, it has recently completed its first phase III trial for Hepatitis C viral infections.

When working in oncology, where many targets have a large binding site, e.g. Bcl-2 or other Protein-Protein Interactions (PPI’s), it is self-defeating to populate a screening deck only with “Lipinski-compliant” molecules.

Page 10

Beware of ‘hard filters’

• Lajiness’ Law: a molecule once thrown out never re-appears in your analysis.

• One alternative: de-prioritize options, but keep them all

• Another: use soft (or fuzzy) filters

• Furthermore, medicinal chemists’ use of hard filters in hit-list triaging is highly irreproducible (Lajiness, Maggiora, Shanmugasundaram, JMC, 2004).
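The two alternatives to a hard filter can be sketched in a few lines of Python. This is a minimal illustration, not a tool from the talk: the `Molecule` class, the function names, and the three compounds with their property values are all hypothetical, and the Ro5 cutoffs simply stand in for whatever criterion a team actually applies. The hard filter removes molecules for good; the soft ranking keeps every molecule and only moves the non-conforming ones down the list.

```python
from dataclasses import dataclass


@dataclass
class Molecule:
    """Hypothetical record holding the four Rule-of-5 properties."""
    name: str
    mol_wt: float     # molecular weight, Da
    clogp: float      # calculated logP
    h_donors: int     # hydrogen-bond donors
    h_acceptors: int  # hydrogen-bond acceptors


# Rule-of-5 cutoffs: MW <= 500, logP <= 5, H-bond donors <= 5, acceptors <= 10
RO5_LIMITS = {"mol_wt": 500.0, "clogp": 5.0, "h_donors": 5, "h_acceptors": 10}


def ro5_violations(mol):
    """Count how many of the four cutoffs the molecule exceeds."""
    return sum(getattr(mol, prop) > limit for prop, limit in RO5_LIMITS.items())


def hard_filter(mols, max_violations=0):
    """Hard filter: anything over the limit is discarded and never seen again."""
    return [m for m in mols if ro5_violations(m) <= max_violations]


def soft_rank(mols):
    """Soft alternative: keep every molecule, de-prioritize by violation count."""
    return sorted(mols, key=ro5_violations)


# Made-up example compounds; the values are illustrative only.
deck = [
    Molecule("cmpd_A", 350.0, 2.1, 2, 5),
    Molecule("cmpd_B", 720.0, 6.3, 4, 12),
    Molecule("cmpd_C", 510.0, 4.0, 1, 8),
]

survivors = hard_filter(deck)
ranked = soft_rank(deck)
print([m.name for m in survivors])
print([(m.name, ro5_violations(m)) for m in ranked])
```

With the soft ranking, cmpd_B and cmpd_C stay in the analysis and can be revisited later, which is exactly what Lajiness’ Law says a hard filter prevents.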

Page 11

Be aware of data generated outside your own labs

• When we’re overloaded with our own data, there’s a strong temptation to not go hunting for important clues from the literature.

• Nowadays, that hunting is pretty easy (e.g. ChEMBL, PubChem, Prous’ Integrity).

• Many drugs have been discovered based on key external pieces of data, e.g.

• isoproterenol (beta-blocker)

• captopril, enalapril (ACE inhibitors)

Page 12

Beware of the special data challenges in collaborative drug discovery

• Twenty years ago, team members were all down the hall, and met in one room to debate the meaning of data and to make decisions.

• Today, in Big Pharma, biotechs, and academia, the participants are spread across time zones, continents, etc. Letting everyone share the data, and encouraging global participation in the decision-making, is a challenge.

• If only one person is making all the chemistry decisions, you’re in trouble. “Drug discovery is a team sport.”

Page 13

Beware of delays in getting key data back to the chemistry decision-makers

E. Petrillo, Drug Discovery World, Spring 2007

Page 14

Beware of how you design and prosecute screening funnels

In vitro assay → Cell assay → In vivo assay → Clinical response

Screening funnels (also called decision trees) are the point at which all drug discovery data converges, and hence are of special interest in Data-Intensive Drug Design.

Each assay is an imperfect predictor of the downstream assay. A correlation of r² = 0.8 is considered good.

When one arranges assays in series, each of which is only 80% predictive of the following one, the net predictive power of three assays is only 0.8 × 0.8 × 0.8 ≈ 0.5, i.e. your top assay predicts the final result about as well as a coin-flip.

It is an article of faith that these correlations improve within a narrow series, which may be true, though data on that is sparse.
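The arithmetic above can be checked directly. The sketch below follows the slide’s back-of-envelope model, in which each stage’s predictive value is treated as a fraction that multiplies down the funnel; that multiplication is the slide’s simplification rather than a rigorous statistical result, and the function name is made up for illustration.

```python
import math


def chained_predictivity(stage_values):
    """Multiply per-stage predictive values (the slide's simple model)."""
    return math.prod(stage_values)


# Three assays in series, each 80% predictive of the next one.
three_stage = chained_predictivity([0.8, 0.8, 0.8])
print(f"3 stages: {three_stage:.3f}")

# Net predictive power as more layers are added to the funnel.
by_depth = {n: chained_predictivity([0.8] * n) for n in range(1, 6)}
for n, value in by_depth.items():
    print(f"{n} stage(s): {value:.3f}")
```

Under this model every extra layer in series costs another factor of 0.8, so a fourth assay already drops the net figure below one half.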

Page 15

Beware of how you design and prosecute screening funnels

In vitro assay → Cell assay → In vivo assay → Clinical response

Two points emerge from the simple analysis on the last slide:

• Beware of constructing screening funnels with too many layers in series

• Beware of how one prosecutes a screening funnel. Blind adherence may cause good molecules to be thrown out at the top.

One simple approach to handling this second issue is to consistently advance a small proportion of negatives at each step, e.g. advance 20% of the molecules which don’t satisfy the in vitro criterion into the functional assay. This allows one to constantly verify the strength of the correlation between those assays.
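A minimal Python sketch of that approach, under stated assumptions: the function name, the compound names, the IC50 values, and the 1.0 µM pass criterion are all hypothetical, and the negatives to advance are picked at random, which the slide does not specify.

```python
import random


def select_for_next_assay(results, passes, negative_fraction=0.2, seed=None):
    """Return (positives, controls): all passing compounds, plus a random
    sample of the failing ones to carry forward as correlation checks."""
    rng = random.Random(seed)
    positives = [r for r in results if passes(r)]
    negatives = [r for r in results if not passes(r)]
    n_controls = round(negative_fraction * len(negatives))
    controls = rng.sample(negatives, n_controls)
    return positives, controls


# Made-up in vitro results: (compound name, IC50 in µM).
in_vitro = [(f"cmpd_{i:02d}", 0.25 * i) for i in range(1, 16)]


def meets_criterion(result):
    """Hypothetical in vitro criterion: IC50 below 1.0 µM."""
    return result[1] < 1.0


positives, controls = select_for_next_assay(in_vitro, meets_criterion, seed=42)
print("advance on merit:  ", [name for name, _ in positives])
print("advance as checks: ", [name for name, _ in controls])
```

Once the downstream results come back, comparing how the controls fare against the positives gives a running estimate of how well the upstream assay predicts the next one.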

[Funnel diagram: +’s and -’s marked at each stage]

Page 16

Beware of the noise/uncertainty in the data

• When the amount of data is small, and one is in lead-opt. with a group of seasoned chemists, there’s a natural tendency to be aware of uncertainty in the data, e.g. not asserting that a molecule whose Ki is 1.0 µM is better than one whose Ki is 1.5 µM.

• However, in a data-intensive environment, these subtleties often get lost. One simple test is to model the noise in the data: your decisions should be robust to the addition of noise.

• Another approach: structure-activity landscapes (session this Wednesday).
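The noise test can be sketched as a small simulation. Everything specific here is an assumption made for illustration: the function name, the compound names, the use of log-normal noise, and the two-fold assay error. The idea is only to ask how often the "best" compound stays the best once plausible noise is added to the measured Ki values.

```python
import math
import random


def top_pick_stability(ki_um, fold_error=2.0, n_trials=1000, seed=0):
    """Fraction of noisy replicates in which the most potent compound
    (lowest Ki) is still the one picked from the unperturbed data."""
    rng = random.Random(seed)
    sigma = math.log(fold_error)  # assumed assay error on a log scale
    best = min(ki_um, key=ki_um.get)
    kept = 0
    for _ in range(n_trials):
        noisy = {
            name: ki * math.exp(rng.gauss(0.0, sigma))
            for name, ki in ki_um.items()
        }
        if min(noisy, key=noisy.get) == best:
            kept += 1
    return kept / n_trials


# Made-up Ki values in µM: a close call and a clear-cut difference.
close_call = top_pick_stability({"cmpd_A": 1.0, "cmpd_B": 1.5})
clear_cut = top_pick_stability({"cmpd_A": 1.0, "cmpd_C": 100.0})
print(f"1.0 vs 1.5 µM: top pick unchanged in {close_call:.0%} of trials")
print(f"1.0 vs 100 µM: top pick unchanged in {clear_cut:.0%} of trials")
```

A decision that flips in a large share of the noisy replicates, as the 1.0 versus 1.5 µM comparison does under these assumptions, is not one the data can support.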

Page 17

Be aware of these issues when creating software tools in support of drug discovery

• VERDI, FOCUS, MOBIUS and the NIH system are good examples of proprietary tools that support a data-intensive paradigm, cognizant of these principles.

Page 18

Summary: principles of data-intensive drug design (1 of 2)

1. In modern drug discovery, chemistry decisions are made in many places:

• making screening lib’s

• hit-list triaging

• hit-to-lead

• lead-opt

2. Beware of over-reliance on Lipinski’s Rule-of-5

3. Beware of ‘hard filters’

• Lajiness’ Law: a molecule once thrown out never re-appears in your analysis.

4. Be aware of data generated outside your own labs

Page 19

Summary: principles of data-intensive drug design (2 of 2)

5. Beware of the special data challenges in collaborative drug discovery

6. Beware of delays in getting key data back to the chemistry decision-makers

7. Beware of how you design and prosecute screening funnels

8. Beware of the noise/uncertainty in the data

9. Be aware of these issues when creating software tools in support of drug discovery

Page 20

Awareness of these issues may lead to

more of this…

…and less of this…