Analysing the drug targets in the human genome

www.guidetopharmacology.org. Analysing the drug targets in the human genome. Chris Southan. 17th World Congress of Basic and Clinical Pharmacology (WCP2014), Cape Town. Track 6, Tues 15th July, 15.30-17.00. IUPHAR/BPS Guide to PHARMACOLOGY Web portal Group, Centre for Integrative Physiology, Biomedical Sciences, University of Edinburgh, Hugh Robson Building, Edinburgh, EH8 9XD, [email protected]

description

This talk will be presented by Chris Southan, Database Curator and Chairperson of the NC-IUPHAR Drugs and Targets Annotation Subcommittee, at the 17th World Congress of Basic and Clinical Pharmacology (WCP2014), taking place in Cape Town this month. The abstract for the talk can be read below:

Discerning the molecular mechanisms of action (mmoa) for drugs treating human diseases is crucially important. This talk will provide an overview of target numbers in the IUPHAR/BPS Guide to PHARMACOLOGY, compare these to other sources and consider the wider implications for drug discovery. We have developed stringent mapping criteria for primary targets (i.e. identifying those direct protein interactions mechanistically causative for therapeutic efficacy). This includes inter-source corroboration by intersecting multiple drug sources inside PubChem to produce consensus structure sets. An analogous approach is used to intersect published target lists and database subsets at the UniProtKB/Swiss-Prot identity level. Our cumulative curation results reveal that structure representation differences, data provenance and variability of assay results are major issues for experimental pharmacology and global database quality. While our activity mappings encompass some polypharmacology (e.g. dual inhibitors and kinase panel screens), our strategic choice is to annotate minimal rather than maximal target sets. The consequent increased precision gives our database high utility for data mining, linking and cross-referencing. Our small-molecule figures are currently converging to ~200 human protein primary targets for ~1000 consensus chemical structures of approved drugs. Target lists from other sources are typically larger and show a degree of discordance. Comparative analysis of these lists by their UniProt ID content and Gene Ontology distributions suggests differences in curatorial selection are the main cause of divergence. The global target landscape thus shows paradoxical trends.
On the one hand, cumulative drug research output and recent expansions (e.g. epigenetic targets and orphan diseases) have pushed the number of bioactive compounds from papers or patents to above 2 million and the number of chemically modulatable human proteins to above 1500 (PMID:24204758). On the other hand, reports of Phase II clinical efficacy failure, with implicit target de-validation, are frequent. In addition, our assessment of drug approval data from 2009 to 2013 indicates that the number of new targets (i.e. first-in-class mmoas) is so low as to threaten the sustainability of the pharmaceutical industry. Causes and consequences of these paradoxes, along with utilities for minimal and maximal druggable genomes, will be discussed.

Transcript of Analysing the drug targets in the human genome

Page 1: Analysing the drug targets in the human genome

www.guidetopharmacology.org

Analysing the drug targets in the human genome

Chris Southan

17th World Congress of Basic and Clinical Pharmacology (WCP2014), Cape Town

Track 6, Tues 15th July, 15.30-17.00

IUPHAR/BPS Guide to PHARMACOLOGY Web portal Group, Centre for Integrative Physiology, School of Biomedical Sciences, University of Edinburgh, Hugh Robson Building, Edinburgh, EH8 9XD, UK.

[email protected]

Page 2: Analysing the drug targets in the human genome

Outline

• Target considerations and definitions
• Target numbers in databases and the literature
• The GToPdb approach to target mapping
• Target proteins and Gene Ontology comparisons
• The GToPdb approved drug target function distribution
• Conclusions

Page 3: Analysing the drug targets in the human genome

Target considerations

• The capture (in GToPdb and other sources) of the data-supported molecular mechanisms for a drug has crucial pharmacological, bioinformatics and cheminformatics utility

• The concept of “primary target” postulates a causal, necessary and sufficient link between a direct drug binding and clinical efficacy

• Polypharmacology (multiple efficacy targets) is important but difficult to prove experimentally or clinically

• The in vitro kinetic parameters and cross-reactivity for the same mechanisms are different and experimentally variable

• Verification of target engagement and residence time in vivo is rare

• Many proteins in listings are not bona fide drug targets (e.g. albumin, trypsin, HERG, APP, P450s) or were not initially (e.g. ACE2, BACE2)

• Phenotypic screening may be in the ascendant, but deconvolution to mechanism of action remains important

Page 4: Analysing the drug targets in the human genome

The spread of target numbers

Page 5: Analysing the drug targets in the human genome

Literature and patent target growth

Page 6: Analysing the drug targets in the human genome

Slow growth in new small-molecule targets

Page 7: Analysing the drug targets in the human genome

The GToPdb approach to target mapping

• Focus on minimal, rather than maximal relationship capture, to produce a concise “drugged genome”

• Read the papers to resolve the mechanisms
• Stringent mapping of citable data (e.g. Kd, Ki, IC50)
• Mask nutraceuticals/metabolites/hormones from these mappings
• Use consensus sets (human UniProt/Swiss-Prot identifiers) to map “out” to drugs
• Use consensus drug structures (PubChem compounds) to map “in” to targets
• Reduce complex subunit mapping to direct interactions
• Avoid matrix screening results for primary mappings
• Small-molecule and peptide focused but includes approved antibodies
• Pragmatically flexible for including dual inhibitors, proven secondary targets, non-human mappings, or effective drugs of unknown mechanism
• Encompass special relationships e.g. prodrug > drug or drug > active metabolites
• Considering new capture types (e.g. protein-protein interaction inhibitors, conjugates and nucleotide therapeutics)
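
The stringency rules on this slide can be illustrated with a minimal Python sketch. This is not the GToPdb curation process, which relies on curators reading the papers; it only shows how such rules might act as a pre-filter. The `Interaction` record, its field names, the `is_primary_candidate` helper and all example rows are hypothetical, and the accession strings are placeholders rather than real UniProt identifiers.

```python
from dataclasses import dataclass
from typing import Optional

# Affinity parameter types named on the slide as citable data.
CITABLE_PARAMETERS = {"Kd", "Ki", "IC50"}


@dataclass(frozen=True)
class Interaction:
    """Hypothetical ligand-target record; field names are illustrative only."""
    ligand: str
    target_accession: str          # placeholder, not a real UniProt accession
    parameter: Optional[str]       # e.g. "Ki"; None if no quantitative value
    citation: Optional[str]        # publication reference; None if uncited
    from_matrix_screen: bool       # e.g. a kinase panel result
    ligand_is_masked_class: bool   # nutraceutical, metabolite or hormone


def is_primary_candidate(rec: Interaction) -> bool:
    """Keep only cited, quantitative, non-matrix, non-masked interactions."""
    return (
        rec.parameter in CITABLE_PARAMETERS
        and rec.citation is not None
        and not rec.from_matrix_screen
        and not rec.ligand_is_masked_class
    )


# Invented example records, one per rule.
records = [
    Interaction("drug_A", "ACC_1", "Ki", "ref_1", False, False),     # passes
    Interaction("drug_A", "ACC_2", "IC50", "ref_2", True, False),    # matrix screen
    Interaction("hormone_B", "ACC_3", "Kd", "ref_3", False, True),   # masked class
    Interaction("drug_C", "ACC_4", None, None, False, False),        # no citable data
]

candidate_pairs = [
    (r.ligand, r.target_accession) for r in records if is_primary_candidate(r)
]
print(candidate_pairs)
```

A filter of this kind could only shortlist candidates; deciding whether a surviving interaction is the primary target still needs the literature-based judgement described in the bullets.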

Page 8: Analysing the drug targets in the human genome

Primary target annotation =

Page 9: Analysing the drug targets in the human genome

Comparing target sets
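
As a rough illustration of comparing target sets at the identifier level, the following Python sketch computes the intersect (proteins common to every list) and the differentials (proteins unique to one list). The source names and accession strings are invented placeholders, not the lists compared in the talk.

```python
# Hypothetical target lists keyed by source; accessions are placeholders.
target_lists = {
    "source_1": {"ACC_A", "ACC_B", "ACC_C", "ACC_D"},
    "source_2": {"ACC_B", "ACC_C", "ACC_D", "ACC_E"},
    "source_3": {"ACC_C", "ACC_D", "ACC_F"},
}


def intersect(lists):
    """Proteins present in every list (Boolean AND across sources)."""
    return set.intersection(*lists.values())


def differentials(lists):
    """For each source, the proteins found in no other source."""
    unique = {}
    for name, members in lists.items():
        others = set().union(
            *(s for other, s in lists.items() if other != name)
        )
        unique[name] = members - others
    return unique


consensus = intersect(target_lists)
unique_by_source = differentials(target_lists)

print(sorted(consensus))
for name, members in unique_by_source.items():
    print(name, sorted(members))
```

Using plain sets keyed by accession keeps the comparison independent of how each source names or groups its targets, which is the point of intersecting at the identifier level.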

Page 10: Analysing the drug targets in the human genome

Gene Ontology for intersects and differentials
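
One way to compare functional distributions between an intersect and a differential is to tally annotation terms for each set. The sketch below is only an assumption about how such a tally could be done; the annotation table, the term labels and the accession strings are invented placeholders, not real Gene Ontology assignments.

```python
from collections import Counter

# Hypothetical accession -> term labels; not real GO annotations.
go_annotations = {
    "ACC_A": ["kinase activity"],
    "ACC_B": ["receptor activity"],
    "ACC_C": ["receptor activity", "ion channel activity"],
    "ACC_D": ["kinase activity"],
}


def go_distribution(accessions, annotations):
    """Count how often each term occurs across a set of proteins."""
    counts = Counter()
    for accession in accessions:
        # Proteins without annotations contribute nothing.
        counts.update(annotations.get(accession, []))
    return counts


intersect_set = {"ACC_C", "ACC_D"}       # placeholder consensus targets
differential_set = {"ACC_A", "ACC_B"}    # placeholder source-specific targets

intersect_counts = go_distribution(intersect_set, go_annotations)
differential_counts = go_distribution(differential_set, go_annotations)

for term, count in sorted(intersect_counts.items()):
    print("intersect", term, count)
for term, count in sorted(differential_counts.items()):
    print("differential", term, count)
```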

Page 11: Analysing the drug targets in the human genome

GToPdb approved drug interactions

• 281 primary targets
• 501 protein interaction mappings
• 354 UniProt intersect

Page 12: Analysing the drug targets in the human genome

GToPdb approved targets: class and pathway splits

Page 13: Analysing the drug targets in the human genome

Conclusions

• Published and database target numbers diverge mainly because of different curatorial selectivity
• Current inner limit ~300, outer limit ~2000
• New mechanisms for new approved drugs are low
• GToPdb uses consensus lists as starting points for curation
• Utilities of our minimal primary target set include:
  – Validated mechanisms
  – Defining the core drugged genome and pocketome
  – Use as “small (but perfectly formed) data” to underpin “big (noisy) data”

Page 14: Analysing the drug targets in the human genome

Acknowledgments and references

Questions?

GToPdb Hosted Target Lists: http://www.guidetopharmacology.org/lists.jsp

Page 15: Analysing the drug targets in the human genome

Short student glossary for “Analysing the drug targets in the human genome”

• Target = protein to which a small-molecule drug, research compound or antibody binds for activity modulation and therapeutic effect
• UniProt = global protein database with identifiers for curated (Swiss-Prot) or automatically classified (TrEMBL) defined protein sequences
• Ligand = generic term for a molecule that makes a pharmacologically important and specific interaction
• PubChem = global chemistry and bioactivity database with defined identifiers for small molecule and peptide chemical structures
• Approved drugs = approved by national authorities as prescription medicines
• Mechanism of action = defined molecular description (e.g. enzyme inhibitor or receptor antagonist)
• Curation (for GToPdb) = abstracting published information into concise annotations for database records, including links to other resources
• Activity mapping = linking database entries between a ligand, its protein target, a publication reference, and the affinity results (e.g. IC50, Ki or Kd)
• Gene Ontology (GO) = standardised representation of different classifications of protein function and other attributes
• Intersect = proteins in common between certain sets (result of a Boolean AND)
• Differential = proteins unique to certain sets (in a Venn diagram)
• For reference, the canonical human proteome (i.e. excluding sequence variants of any type) has 20,213 Swiss-Prot entries
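
To put the target numbers in context, the short Python sketch below expresses the approximate inner (~300) and outer (~2000) target limits from the Conclusions slide as fractions of the 20,213 canonical Swiss-Prot entries quoted above. The variable names are illustrative, and the limits are the rounded values from the slides, so the percentages are only indicative.

```python
# Canonical human proteome size, as quoted in the glossary.
CANONICAL_PROTEOME = 20213

# Approximate limits from the Conclusions slide (rounded values).
target_limits = {"inner limit": 300, "outer limit": 2000}

# Fraction of the canonical proteome covered by each limit.
fractions = {
    name: count / CANONICAL_PROTEOME for name, count in target_limits.items()
}

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%} of canonical human proteins")
```

On these figures the inner limit corresponds to roughly 1.5% of the canonical proteome and the outer limit to roughly 9.9%.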