Roses 2005 Drug Discovery Today

download Roses 2005 Drug Discovery Today

of 13

Transcript of Roses 2005 Drug Discovery Today

  • 8/11/2019 Roses 2005 Drug Discovery Today

    1/13

    REVIEWSDDT Volume 10, Number 3 February 2005

    177www.drugdiscoverytoday.com

    Relevance of a drug target for a disease is ofteninferred with strong belief but fragile evidence. Here, aprogram for early identification of human disease-specific drug targets using high-throughput geneticassociations is described. Large numbers of well-characterized patients (>1000) and matched controlsare screened for genetic associations using severalthousand (>7000) single nucleotide polymorphismsfrom more than 1500 genes. The genes were selectedbecause they are members of target classes for whichthere are precedents for high-throughput chemicalscreening technology. This review summarizes themethods and intensive data analyses leading to targetgene identification for type 2 diabetes mellitus,including the statistical permutation methodologyused to correct for many variables.

    Discussions of target identification for drug discovery have becometechnically oriented and complicated over the past decade [ 1,2], a pe-riod of time that coincides with decreases in productivity across thepharmaceutical industry [ 36]. The latest technologies for selecting

    targets can be fascinating and imaginative, but are these targets rele-vant to treating human diseases [7,8]? Although methods for high-throughput chemical screening and for optimizing lead moleculeshave undoubtedly made major advances, target selection remains acrucial step [ 9]. Currently, there is no strategy of equally high-through-put for the selection of targets that are directly associated with humandiseases. This need becomes even clearer when animal models are con-sidered the ultimate target validation [7,10].

    The drug discovery process should be clear-cut identify the bestmolecule, for the most effective treatment, as fast and efficiently aspossible. Although high-throughput molecular methods have pro-gressed, matching the molecule to the appropriate clinical indication

    Allen D.Roses*Daniel K.BurnsStephanie ChissoePamela St.JeanGlaxoSmithKline R&D,Research Triangle Park,NC 27709, USA*e-mail: [email protected] MiddletonGlaxoSmithKline R&D,London,UK

    1359-6446/04/$ see front matter 2005 Elsevier Ltd. All rights reserved. PII:S1359-6446(04)03321-5

    ALLEN D.ROSES

    Allen D.Roses wasappointed Senior VP forGenetics Research atGlaxoSmithKline (GSK) in2000.Before joiningGlaxoWellcome in 1997,Roses was at DukeUniversity,where he was t he Jefferson PilotProfessor of Neurobiology and Neurology andDirector of the Center for Human Genetics.Rosesled the team that identified apolipoprotein E as amajor,widely confirmed susceptibility gene incommon late-onset Alzheimers disease.While atGSK,he was charged with organizing geneticstrategies for susceptibility gene discovery,developing and implementing pharmacogeneticsapproaches and integrating genetics into

    medicine discovery and development;translationof genetic and genomic research into pathwayanalyses,drug discovery and pharmacogenetics indevelopment is ongoing at GSK.

    Making the right choice of targets at the beginning of the pipeline will be the first step down the long road of

    creating innovative medicines

    Keynote review:Disease-specific targetselection:a critical first stepdown the right roadAllen D.Roses,Daniel K.Burns,Stephanie Chissoe,Lefkos Middleton and Pamela St.Jean

    DANIEL K.BURNS

    STEPHANIECHISSOE

    LEFKOSMIDDLETON

    PAMELAST.JEAN

    mailto:[email protected]:[email protected]:[email protected]
  • 8/11/2019 Roses 2005 Drug Discovery Today

    2/13

    REVIEWS DDT Volume 10, Number 3 February 2005

    178 www.drugdiscoverytoday.com

    for development is still an arduous task: disease-relevanttarget identification can now be an important criterionfor determining the correlation between molecule andindication. The foundation of the successful reputationof the 1980s and early 1990s, when newly discovered bi-ological pathways and receptors were selected as targets,was a rich biological literature identifying the low-hang-

    ing fruit [ 11 ]. Biology has continued to direct targets toa small extent, which is probably best illustrated by bio-pharmaceuticals [ 12 ]. For small-molecule screening overthe past decade, the selection of targets using animal mod-els, simple organism genomics and genetically manipu-lated animals has a less than successful track record[13 ,14] why not return to patients to select targets fortheir diseases?

    If the speed and efficiency of identifying targets thatare relevant to human diseases were improved, then thetechnical advances for lead validation, high-throughputchemical screening and lead optimization could be ap-plied to molecules with a greater probability of success.For example, polymorphic variants of target genes mighthave different interactions with lead molecules. Studiesof mechanism, chemical lead validation and on- or off-target effects can be conducted in directed mouse mod-els [ 15 ]. Although leads from mouse model targets arefrequently ineffective in human clinical trials (this isparticularly true in cancer and CNS diseases [ 2]), for mostdiscovery research, animal models set a gold standard fortarget selection [7,16,17]. A complimentary approachwould be to identify first those targets that are geneticallyassociated with human diseases and then create appro-priate knockout and conditional knockin models withtarget gene variants. This suggestion differs significantlyin the scope and depth of human phenotyping from pro-posals to phenotype a broad catalogue of knockout miceto suggest targets [ 10 ].

    Here, early (but not preliminary) data are presented fromthe application of a high-throughput human disease-specific

    target program (called HiTDIP within GlaxoSmithKline)that focuses on the association of tractable targets withspecific patient groups. The principles of complex geneassociation studies and disease susceptibility will be ad-dressed, particularly with respect to matching targets withclinical indications. Several reviews have recently beenpublished that cover genetic association studies [ 18 26],

    and these broad, complex, and sometimes esoteric, disci-plines are not discussed here. Rather, the focus of thisarticle is the strategy and process of genetic associationstudies to match pharmaceutical targets with clinicalindications related to several human diseases.

    The pharmaceutical industry and its business analyststrack attrition at various stages in the drug discovery anddevelopment pipeline. For example, 95% of candidatequality leads fail to produce a medicine. Of the moleculesthat enter Phase I clinical trials after surviving preclinicaltesting, only 21.5% reach the market [ 27 ]. Furthermore,the number of new molecular entities (NMEs) drugswith a novel chemical structure submitted to the FDAover the past decade has decreased ( Figure 1 ) [13 ].

    Candidate leads evaluated at Phase IIA for efficacy rep-resent products of target selection generated during thepast ten years, generally before the possible contributiveeffects of the completion of the human genome sequencecould be realized. The major sources of attrition after en-tering Phase I trials are toxicity and lack of efficacy. Onemethod of decreasing attrition at an early clinical stage isto apply prospective efficacy pharmacogenetics (PGx) atPhase IIA [ 3]. Another strategy is to select the right targetinitially.

    The pharmaceutical applications of HiTDIP method-ologies were adopted as a response to the apparent in-creased failure rate of clinical development that hasplagued the industry over the past decade. A basic hy-pothesis for this attrition might be stated simply as:A reason that many molecules are ineffective in clinicaltrials is that the selection of the originally screened tar-

    get was based on data and rationale thatare actually irrelevant to the etiology orpathogenesis of the human disease . For adrug to be successful in treating a disease,two variables must be matched the tar-get and the right therapeutic indications.

    The success rate of discovery of moleculesfrom screens is therefore directly relatedto target choice, and targets identifiedusing animal models and sophisticatedgenomic analyses have so far providedfewer molecules for NME submission thananticipated [ 28 ,29].

    A critical view of genomic applicationsfor target identificationAnalysis of the human genome sequencepromised to bring a flood of new targets,

    FIGURE 1NCE success ratios: probability of progressing through each phase. Only one in 25 NCE candidatecompounds is approved by the regulators ( Table 1 ). Note that only 25% of those molecules that survivePhase II successfully pass into Phase III. Molecules derived from genetically-associated targets couldincrease the success rate.A small increase would have a significant effect on approvals.The addition of safety pharmacogenetics can also contribute to decreased attrition during development [ 60;http://csdd.tufts.edu/NewsEvents/RecentNews.asp?newsid=4 ].

    Drug Discovery Today

    LaunchFilePhase IIIPhase IIPhase ILeadScreenProteinGene

    60%

    60%

    90%

    25%

    50%

    Candidate

    http://csdd.tufts.edu/NewsEvents/RecentNews.asp?newsid=4http://csdd.tufts.edu/NewsEvents/RecentNews.asp?newsid=4
  • 8/11/2019 Roses 2005 Drug Discovery Today

    3/13

    DDT Volume 10, Number 3 February 2005

    179www.drugdiscoverytoday.com

    REVIEWS

    and therefore increase the potential throughput of thepharmaceutical pipeline [ 28 31]. This scenario has alsoproved disappointing in not reaching the projected goals.What factors have limited target selection and drug dis-covery productivity? Although HTS technologies weresuccessfully implemented and spectacular advances inmining chemical space have been made, the universe forselecting targets expanded, and in turn almost explodedwith an inundation of information. Perhaps the best ex-planation for the initial modest success observed was thedramatic increase in the noise-to-signal ratio, which ledto a rise in the rate of attrition at considerable expense.The difficulty in making the translation from the identi-fication of all genes to selecting specific disease-relevanttargets for drug discovery was not realistically appreci-ated. There was certainly a large pharmaceutical industryinvestment in the intellectual property speculation sur-rounding the sequencing of all possible genes, with thehope that a stream of new targets for drug discoverywould result. Senior R&D scientists recently recognizedthe need for a quantal step-up in discovery [ 32 ]. To feed

    a high-throughput pipeline, a high volume flow of spe-cific, disease-relevant targets is necessary. Whether or notindividual researchers believe in a particular disease hy-pothesis, in the specific relevance of a target class to someaspect of human disease pathogenesis or in particularanimal models of human disease, the evidence that val-idates (substitute believe in, consensus view or champion,among others) the choice of a target molecule for a po-tential therapeutic strategy in humans is crucial to startingdown the right road.

    Target validation is one of those terms that scientistsuse in multiple ways. With respect to target identification

    and selection in the pharmaceutical industry, validationis interpreted as providing increased confidence to initi-ate expensive chemical screens and subsequent discoveryprograms. On occasion, support of possible relevancecomes from the sheer weight of potentially (substitutebelieved in, validated, rational or accepted, among others)relevant information. For example, a target gene could bedetermined to be expressed in the tissue that is affectedpathologically by a particular disease, differentially ex-pressed in disease-relevant tissues or have a visible effectin animal models when manipulated [ 33 ]. These datamight provide modest human disease-specific supportas the starting point for a drug development program totreat a particular disease. A putative target located on neu-ronal surfaces could be relevant to a human neurologicalor psychiatric disease but which one? Proof of conceptin humans can only occur on completion of preclinicaltesting and Phase I safety studies of an optimized lead can-didate. A Phase II clinical trial is an extremely costly hur-dle with which to justify the target choice after manyyears of confidence-building research. The situation could

    be compounded when a molecule with exceptional drugqualities, but an unclear clinical indication, is tested inseveral clinical trials involving multiple clinical endpoints.

    The success of a target is judged after many years usu-ally in hindsight by counting marketed products. The suc-cess rate of efficacy (proof of concept) studies is probablya much earlier indicator of pipeline health. If the righttarget was selected more often and this led to the selectionof effective lead candidates more frequently, then attritionwould be reduced. Shots on goal are good, but center for-wards who miss 99% of the time (and take all the shots)are not hired by professional teams. The pharmaceutical

    TABLE 1

    The pharmaceutical pipeline: definition by milestoneR&D stage (milestone passed) Description

    From target identified to screening hitand/or lead compound

    Success a in identifying a compound with the desired pharmacological activity at the desiredmolecular, cellular or mechanistic target: this compound might not have all the characteristicsrequired to be a viable drug

    From lead compound to drug candidate Success in identifying a viable drug candidate by optimizing the characteristics of the initialscreening hit and/or lead; typically, this requires appropriate potency, selectivity and efficacy but canalso involve other criteria such as bioavailability, metabolic stability and preliminary safety screening

    From drug candidate to FTIH and/or Phase I Success in progressing a drug candidate into initial studies in humans, usually (Phase I) in healthyvolunteers

    From Phase I to entry into Phase II Success in progressing a clinical development candidate into small-scale exploratory studies inpatients that have the targeted disease

    to proof of concept Usually considered to be the point at which a drug candidate has demonstrated efficacy in itsintended patient population, typically within Phase II: therefore, project attrition can be measuredfor progression to or from this milestone, as well as to or from the more traditional clinicaldevelopment milestones (Phases I III)

    From Phase II to entry into Phase III Success in progressing a medicinal candidate into large-scale (pivotal) clinical trials suitable forregistration

    From Phase III to regulatory filing Success in progressing a data package into a regulatory submission

    From regulatory review to approval Success in gaining regulatory approval

    From regulatory approval to launch Success in launching an approved product, added indication and/or label-changeaProject progression can be quoted as attrition (failure) or success. Abbreviation: FTIH, first time in human.

  • 8/11/2019 Roses 2005 Drug Discovery Today

    4/13

    REVIEWS DDT Volume 10, Number 3 February 2005

    180 www.drugdiscoverytoday.com

    industry must move away from the numbers game andstrive for specificity and speed at the earliest stages of thepipeline. Targets for specific diseases that are chosen basedon strongly held beliefs have a significant probability of being the totally wrong target. Current knowledge (sub-stitute accepted beliefs, strongly held views or reasonablygood ideas, among others) might not define disease-

    relevant hypotheses accurately. It is the result an effec-tive and safe medicine that is of importance to the pa-tients, physicians and industry. Indeed, the mechanismsof action of many successful medications are still unknown.

    Can target molecules be directly associated with humandiseases that have highly statistically significant data? Yes.Can chemical leads be produced from screening these tar-gets that can enter the pharmaceutical pipeline? Yes. Willlead candidates produce a higher rate of future success indemonstrating efficacy compared with current metrics?A decision on this aspect has yet to be reached moretime is needed to study the flow through the pipeline intohuman testing. However, when a genetically associatedtarget gene has already been screened chemically, therecan be a rapid progression from identification of the tar-get to the entry of lead molecules into clinical development.

    Gene-specific target association study designWith the human genome sequenced, it is possible to de-fine virtually all genes belonging to the known targetclasses using analogous sequence regions that define spe-cific structures or functions. However, there are few realindications as to which gene might be specifically asso-ciated with a particular disease. It is possible to test eachgene individually for disease relevance using genetic as-sociation studies, but only if sufficiently large and well-characterized patient and control groups are available.What of high-throughput association studies of many se-quence variations within all genes of each target class? Asan example hypothesis, assume there is a G-protein-cou-pled receptor (GPCR, sometimes referred to as a 7-trans-membrane repeat) target class gene variant that is asso-ciated pathologically with Alzheimers disease. WhichGPCR is it? If there are almost 500 known GPCR genes,then is the Alzheimers disease-specific GPCR the third onthe list? Is it number 222, or maybe number 407? High-throughput, gene-based single nucleotide polymorphism

    (SNP) genotyping technologies provide the opportunityfor the rapid testing of each of the GPCR variants fordisease association. Genetic association studies providean evidence-based opportunity to inform target choicerapidly and more specifically.

    There is an implied crucial assumption when using agene-disease association strategy: disease-specific associ-ations might be identified for genes that are selected sim-ply because it is known how to screen them against largechemical libraries. Seven years ago, the scientific com-munity was reluctant to take such a risk. However, morerecently, retrospective data were published that support

    this assumption. Goldstein et al. [34] examined 42 sequencevariants of genes that had been associated with a drug re-sponse at least twice. These investigators found that 21 of the 42 variants were in the target or in a known path-way of the target. It is therefore reasonable to propose thatthe probability of identifying a disease-relevant targetwould be increased by screening all potential targets for

    genetic association with well-defined diseases. That is tosay, there are now data available to support the hypoth-esis that candidate leads derived from genetically associ-ated targets can increase the probability of success (de-crease attrition) at Phase II or Phase III of clinical trials.

    Another important scientific contribution to gene-disease association studies has come from advancementsin the fields of genetic epidemiology and statisticalgenetics. These are specialized disciplines to most ofthe pharmaceutical industry, but the ability to analyzerapidly many clinical traits simultaneously in a high-throughput fashion, and with appropriate methods to de-fine statistical significance, is probably the most usefulcontribution to such studies other than the sequencedgenome template. By 2002, this advancement made itpossible to test therapeutic class genome-wide variants forstatistical associations with particular human diseases.Additional support for a disease gene-association strategycan be found in the plethora of recent studies linking spe-cific gene variants to particular diseases [ 35 40]. Whenthe effect of the variant results in expressed clinical dis-ease, it is generally viewed as a disease mutation. Wheremultiple variants of several genes contribute to the ex-pression of a disease, they are now commonly referred toas susceptibility genes. The practical problem of solelystudying disease genetics to generate targets is that mostsusceptibility genes are not drug targets, and thereforethe high-throughput methodologies currently availablecannot be used to screen for the formation of chemicalinteractions [ 41 ].

    Would the initial identification of high-throughput tar-gets with human disease-specific associations result in amore efficient pipeline with less attrition? Because therewas great confidence that the human genome projects(both public and private) would eventually provide genesequencing, and some variant, information, it was an-ticipated that a large, low-throughput resource would be

    needed; that is, prospective, well-phenotyped patient col-lections and appropriate controls, each of which wouldhave consent for commercial applications.

    The patients define the relevanceTo study the association of gene variants with the clini-cal expression of human diseases, a large number of con-senting patients and controls must be carefully examinedand the data placed into accessible databases; in addition,DNA must be collected and stored. Because the clinicalexamination of patients is performed one patient at atime, the generation of large patient collections that are

  • 8/11/2019 Roses 2005 Drug Discovery Today

    5/13

    DDT Volume 10, Number 3 February 2005

    181www.drugdiscoverytoday.com

    REVIEWS

    suitable for disease association studies using multiplemarkers is time- and resource-intensive, and consequentlyis not high throughput. Anticipatory clinical research of this type requires sustained access to clinical expertise,extensive supporting resources and commitment over aperiod of years. Few such prospective collections existand, in those academic environments that possess suchdatabases, informed consent for commercial uses is usu-ally absent.

    In 2005, the genome is sequenced, target class genevariants are known, rapid genotyping technologies areavailable and interactive databases with analytical capa-bilities have been built. Since 1997, GSK has organized

    external clinical specialists, who are expert in more thana dozen important diseases, and has accumulated morethan 80,000 patients and controls, each examined with aprospectively standardized protocol, informed consentand stored DNA samples. Several association experimentswere completed that provided exciting new putative tar-gets for pipeline consideration. The leading edge of chem-ical leads has begun to enter the pipeline.

    As molecules resulting from insights of HiTDIP reachthe published portfolio of GSK over the next few years,there will be a relatively straightforward method forthe comparison of attrition with historical metrics. With

    respect to the gene variants identified for early drugdiscovery, specific disclosure of early targets and leads inthe pipeline could be limited by regulatory and com-mercial concerns. However, the best and most rapid bio-logical and genetic validation of gene variants associatedwith disease can occur where the data are available forconfirmation by academia and industry. It is therefore

    planned that these large association experiments will bepublished in scientifically reviewed journals to enable thefull weight of academic and industry disease-specific tar-get validation to be focused in this area. Furthermore,pharmaceutical companies principally share many of thesame targets why not compete on the screening and leadchemistry of disease-relevant targets and well-designeddrug development?

    High-throughput disease-specific target discovery

    the experimentTo perform this experiment for the identification of thetargets associated with human disease, three major com-ponents are required: (i) selection of the gene targets tobe screened; (ii) well-characterized clinical data; and (iii)genetic data generation and statistical analyses.

    The targetsBefore the sequencing of the human genome, the phar-maceutical industry knew of perhaps 500 targets [ 42 ].Widely appreciated target classes, for example, nuclearreceptors, kinases and GPCRs ( Figure 2 ) now constitute~1200 genes, and there are many additional enzymaticscreens that bring one estimate of the total druggablegenome to over 3000 genes [ 43 ]. One method of dealingwith screenable targets is to create knockout or other mod-ified mice and search for phenotypes. This undoubtedlyprovides some additional support, but the phenotypiccorrelation between mouse and man can be difficult tointerpret [ 17 ]. It would seem that direct associations withhuman disease phenotypes would be a promising andefficient point to search for knockout and conditionalknockin models. Kola and Landis [ 2] listed five placesalong the pharmaceutical pipeline where attrition mightbe managed. Their first point was that building the needto get very strong evidence for proof of mechanism intothe discovery paradigm is critical. In the past, this was

    possible because an extensive literature had developed asscientists concentrated on biochemistry, physiology, phar-macology and other disciplines before suggesting targets.If the 100 best-selling drugs are examined retrospectively,the targets were initially selected because of strong andconfirmed (in the literature) biology, including studies inman [ 17 ]. The prediction that knockout models will havethe same efficiency prospectively as that provided by ret-rospective analyses will only be possible with extensiveand specific phenotyping of each knockout. Screening thephenotypes of many knockouts might be much more su-perficial [ 16 ]. Perhaps, the most efficient approach would

    FIGURE 2HiTDIP:genetic associations between tractable targets and major diseases.High-throughput analysis of approximately 7000 polymorphisms in 1800 candidategenes (numbers of validated SNPs and candidate target genes have increased overtime) are screened for association in common diseases,such as asthma,schizophrenia,depression,osteoarthritis,Alzheimer s disease, metabolic syndrome,hypertension, acute coronary syndrome and others.The approximate numbers of genes in various target-classes used in these experiments are also indicated.Genes of marketed products and other target genes such as enzymes and proteinligands are also included in the target classes.As more target groups becometractable, they are added to the screen. Between 2002 and 2004,the gene list hasexpanded from approximately 1450 genes to >1800.For the NHR and cofactorcolumn, the yellow color represents the number of NHR cofactor gene targets.Abbreviations:COPD,chronic obstructive pulmonary disorder;NHR, nuclearhormone receptor.

    Drug Discovery Today

    050

    100150

    200

    250300

    350

    Tractable gene targets... ...screened against majorunmet medical needs:

    400

    G e n e n o .

    G P C R s

    I o n c

    h a n n e

    l s

    N H R

    a n

    d c o

    f a c

    t o r s

    K i n a s e s

    P r o

    t e a s e s

    Asthma COPD Rheumatoid and osteo-arthritis Multiple sclerosis Alzheimer s disease Epilepsy Schizophrenia Migraine Unipolar and bipolar depression Hypertension Atherosclerosis Obesity Diabetes Acute coronary syndrome Irritable bowel syndrome

  • 8/11/2019 Roses 2005 Drug Discovery Today

    6/13

    REVIEWS DDT Volume 10, Number 3 February 2005

    182 www.drugdiscoverytoday.com

    be to use knockout and conditional knockin mice withgenetic variants as additional target validation for genesthat are statistically associated with the expression of humandiseases.

    Industry-initiated partnerships, such as The SNPConsortium, and efforts to sequence the human genomerapidly catalyzed SNP discovery [ 44 ]; this information wasused to identify validated SNP assays for the ~1800 tar-gets that constitute, for example, the GPCRs and kinases.For large-scale, disease-association analyses, the testing of these variants requires two sets of reagents: (i) validatedSNP assays of each gene; and (ii) DNA from clinically well-phenotyped patients, appropriate controls and integra-tion with extensive bioinformatic support systems. Thestandard candidate gene association experiment is to testvariations of a specific gene for association to a single de-fined human disease. For HiTDIP, it is possible to extendthis experiment to hundreds of genes (~1800), using thou-sands of SNP variations (~7000), and to several diseases(17 + ) in independent test and secondary screens.

    The clinical dataThe rate-limiting step in the HiTDIP program is the ex-amination, acquisition of consent and collection of DNAfrom large collections of well-phenotyped patients andcontrols. Over the past decade, the key practical problemsencountered are that the majority of patient collections

    of DNA reside in academic laboratories or small countries,and chemical-screening libraries are located in industry.Institutional review boards require that patient popula-tions have specific informed consent for the commercialuse of DNA, thus rendering many collections of patientsenrolled in academic institutions largely unusable forcommercial HTS. Patient and control clinical evaluationsare not high throughput, because they are performedon an individual basis. Therefore, even if the technicalscreening capacity were generally available, as it is now,the essential clinical populations with available, consentedDNA samples are not.

    Disease phenotype does not simply depend on a diagnostic labelOver the past seven years, GSK has spon-sored an extensive series of disease-spe-cific clinical collaborations. These net-works eventually involved ~200 specialistphysician collaborators and currently

    comprise at least 17 diseases in severalethnic populations. Large groups of highly phenotyped patients and controlshave been collected for these associations:ten of the case-control or case-controlfamily studies currently completed(asthma and type 2 diabetes were the firstin 2003 and 2004, respectively) or sched-uled to be finalized are summarized inTable 2 . Productivity gains have resulted

    in an increase in genotyping capacity, thus six of the listeddiseases are scheduled to be analyzed in 2005, with otherdiseases to follow on completion of clinical enrollments.The lag phase for HiTDIP analyses is the process of regis-tering subjects (patients and controls). After that, theprocess is industrialized with planned overlapping pri-mary, secondary and, in some cases, tertiary screeningblocks.

    Unlike more common retrospective patient collections,where information is gleaned from records, the networkphysicians were required to agree on diagnostic criteriain advance of subject collection. Data are often missingin retrospective collections, which can give rise to spec-ulation on the implications of a blank answer in therecord of an individual. GSK created a clinical databasethat encapsulates the complete clinical and demographicinformation about each patient and each control prospec-tively: a prospective study ensures that data on all indi-viduals are collected, and the database is as comprehen-sive as possible.

    Each network of clinical experts established a core data-base of phenotypes with defined clinical descriptions foreach disease, with the added caveat that additional sup-plementary clinical information that any participatingclinical investigator wanted to collect would also beincluded in the phenotypic database. This resulted in dis-ease-specific databases encompassing clinical parameters

    agreed by all network physicians, in addition to thesub-sets collected with respect to the particular researchinterests of an investigator.

    This database format provides the opportunity to testgenetic associations with more granularity than simplydiagnosis. The data can be analyzed using the physiciansagreed disease diagnosis, single symptoms or signs orgroups of clinical findings. There can be several con-tributing pathogenic processes in complex diseases, anyof which might not be expressed concurrently to produceactive disease, but each of which might be associated atsome level with similarly diagnosed patient populations.

    TABLE 2

    Listing of family and case-control studies ongoing at GSK as of September 2004Disease No. of

    sitesFamily or case-control No. of subjects

    collectedTarget no. of subjects

    Asthma 14 Family 5909 5909

    Alzheimer s disease 9 Case-control 1504 2000

    COPD 11 Case-control and family 5121 5460

    Schizophrenia 4 Case-control 1317 1953Metabolic syndrome 6 Case-control and family 4842 4842

    Osteoarthritis 8 Case-control and family 4137 5685

    Rheumatoid arthritis 1 Case-control 1736 2600

    Parkinson s disease 1 Case-control and family 3039 3039

    Unipolar depression 9 Case-control and family 3781 3861

    Obesity 2 Case-control 2019 2000

  • 8/11/2019 Roses 2005 Drug Discovery Today

    7/13

    DDT Volume 10, Number 3 February 2005

    183www.drugdiscoverytoday.com

    REVIEWS

    For example, in studying type 2 diabetes mellitus, inclu-sion of ophthalmologic examinations, renal function in-dicators and a defined neurological physical examinationenables specific sub-type association studies. If the asso-ciation of SNP variants in those patients with peripheralneuropathy was required, then performing and specifi-cally recording the presence or absence of ankle jerks on

    all patients and controls is more accurate than trying toselect patients with retrospective lack of data. Similarly,if diabetic retinopathy were an interest, few retrospective-controlled studies would have this information.

    Illustrative study example: asthmaAsthma was the first disease to be analyzed in HiTDIPagainst multiple phenotypic variations. The associationof thousands of SNP variants was measured against fiveseparate but correlated asthma-related traits in large fam-ily sets including: (i) physicians diagnosis of asthma; (ii)atopy (positive skin prick tests); (iii) atopic asthma (physi-cians diagnosis plus atopy); (iv) strict asthma (two ormore classical symptoms and a positive methacholinechallenge test or bronchodilator test); and (v) bronchialhyper-responsiveness (positive methacholine response ator below 10 mg/ml of methacholine) [ 37 ]. The subjectswere ascertained prospectively, but retrospectively groupedfor each set of clinical criteria. Association studies usinghigh-throughput genotyping of SNPs from tractable tar-gets to define associations with various clinical definitionshave been productive.

    When thinking of asthma as a complex disease and/orsyndrome, it is understandable that there might be mul-tiple genetic and environmental susceptibilities. Certainly,there should be no expectation that all or most of the phe-notypic variables would be present in all cases, as mightbe expected for a specific genetic mutation in a highlypenetrant, rare inherited disease. A comprehensive approachthat improves this situation would be to test each of theclinical forms that occur within families for genetic as-sociation with particular genetic variants. Confirmationof association could be performed in a subsequent set of families as well as in series of sporadic cases.

    When the initial pilot asthma-screening was performedin 2001 with ~2700 SNPs from 1244 genes, interestingpatterns of association were observed. Many of the asso-

    ciated gene variants were related to three or more of theselected clinical phenotype definitions. The associationof some genes with physicians diagnoses tended to sep-arate to bronchial sensitivity-related signs and symptomsand atopy-related phenotypes. These early data regardingphenotypic criteria seemed to support the convergenceof multiple susceptibility loci for the production of symp-tomatic complex disease.

    In genetic linkage or association literature, there arefrequently tacit assumptions, for example, that the expertphysicians diagnoses are sacrosanct, or that inclusionand/or exclusion criteria of any sort could often narrow

    the disease populations to be non-representative of allpatients with the disease. Many of the conflicting reportsof associations in the literature, where independent scien-tists have not confirmed published linkage or associa-tion results with the same disease, could be the result of variations in the selected phenotypes or even the criticalcontrols. With the same corps of physicians collecting the

    large prospective patient and control test series, as well asthe subsequent confirmation series, variation between in-dividual physician diagnostic skills can be minimized andthe phenotype definitions can be stabilized.

    Illustrative study example: metabolic syndromeMetabolic syndrome is usually described as a combina-tion of obesity, diabetes mellitus, hypertension and dys-lipidemia thus providing considerable opportunities forselection of who might be included in genetic studies.In 2002, The National Cholesterol Education Program(NCEP) Expert Panel on Detection, Evaluation andTreatment of High Blood Cholesterol in Adults, AdultTreatment Panel III (ATP-III) published criteria f or a clin-ical diagnosis of the metabolic syndrome *. However, theGSK Genetics of Metabolic Syndrome (GEMS) patientcollections for target association and susceptibility genestudies were initiated before this ATP-III report was cre-ated. By comparing the detailed phenotypic definitionsfor the collected patient information with those of thesamples, and subsequently contrasting this with the def-initions provided by a high level, independent expertpanel, it was possible to analyze the relevance of our con-temporary operational research definitions. In GEMS, twosimple lipid-based criteria were used to define the disease-affected individuals: low high-density lipoprotein (HDL)-cholesterol and a concomitant elevation of plasma triglyc-eride concentrations. These criteria were selected becausethey are primary features of atherogenic dyslipidemia, areclosely related to insulin resistance, detectable at an earlystage in the development of metabolic syndrome, highlyheritable and easy to measure. Wyszynski et al. [45 ] (clin-ical investigators supporting the GEMS Network) reportedthat 86% of individuals greater than 35 years of age metboth the ATP-III and GEMS criteria. Conclusions basedon genetic linkage, genetic association of susceptibilityloci and PGx studies can thus be more accurately inter-

    preted across studies. Accurate, interpretable, reproduciblephenotypic definitions, even of complex syndromes mix-ing several complex diseases, can increase the value of clinical collections and facilitate analysis. Metabolic syn-

    *NCEP criteria for clinical diagnosis of metabolic syndrome requires anythree of the following:fasting plasma glucose of at least 110 mg/dl(6.10 mmol/l);serum triglycerides of at least 150 mg/dl (1.70 mmol/l);serum HDL cholesterol of less than 40 mg/dl (1.04 mmol/l) and50 mg/dl (1.30 mmol/l) for males and females, respectively; and bloodpressure of at least 130 mm Hg systolic and 85 mm Hg diastolic,orwaist circumference (a measure of central adiposity) of more than102 cm and 88 cm for males and females,respectively.

  • 8/11/2019 Roses 2005 Drug Discovery Today

    8/13

    REVIEWS DDT Volume 10, Number 3 February 2005

    184 www.drugdiscoverytoday.com

    drome is an excellent example of variable clinical defini-tions and hypotheses defining the disease.

    Experimental data, analyses and statisticalsignificanceDuring the past two years, up to ~1800 tractable genesrepresented by up to ~7000 SNPs were genotyped againstseveral large patient and unrelated control groups fordisease associations. The SNPs were identified predomi-nantly from public databases, with focused SNP discov-ery performed as necessary. The SNPs selected were com-mon, with 28% average minor allele frequency, and mapped

    typically to intergenic and promoter regions. Genotypingwas performed using a multiplexed, bead-based approachpublished previously ( Figure 2 ) [46 ,47]. A bioinformaticsystem, called SubjectLand, was designed and implementedto contain patient data, SNP (and other genetic variation)data, analyses programs and analytical results, with theability to add-on data-mining capabilities. The standardexperiment for each disease was designed to test ~500well-phenotyped patients and ~500 matched controls inthe initial, primary screen and follow-up with a second-ary screen using an independent set of patients and con-trols ascertained by the same group of physicians.

    The data were statistically analyzedusing a gene-based approach ( Figure 3 ).Allelic, genotypic and haplotypic tests of association were conducted in the pri-mary and secondary screens. A fast fish-ers exact test in SAS/BASICS software wasused for the allelic and genotypic tests

    [48 ], whereas the composite haplotypemethod developed by Zaykin (unpub-lished results) was used for haplotypictests. The use of a gene-based approach foranalysis and replication, as used in HiTDIP,has recently been championed by Nealeand Sham [ 49 ]. Cardon and Bell [ 50 ] stressthat incorrectly adjusting for multipletesting can either unnecessarily reducestatistical power if too stringent a correc-tion is applied or increase the false-positiverate if too weak a correction is used. As aresult of the large number of tests con-ducted, adjustments for multiple testingwere made using a gene-based permuta-tion approach ( Box 1 ).

    The type 2 diabetes study made useof legacy (GlaxoWellcome, Burroughs-Wellcome, Glaxo, Smith Kline Beechamor Beecham) in-house collections ~400cases and controls were examined in theprimary screen and >1100 cases and con-trols in the secondary screen. Among the1405 genes examined in the primaryscreen, 256 genes had a p-value of 0.05.

    Of these 256 genes, 53 also had a p-value of 0.05 in theindependent secondary screen but only 21 of the 53 geneswere confirmed by passing the permutation process(Figure 4 , Box 1 ). As well as conducting further investi-gations regarding the 21 confirmed genes that passed thepermutation test and the 32 genes that did not, analysesof the probability that random genes would demonstratesimilar statistical significance were also performed. Of the21 permutation-confirmed genes, ten could be identifiedin pathways directly related to precedented mechanismsor metabolic pathways associated with disease-specific hy-potheses. In the case of type 2 diabetes mellitus, four of

    the 21 genes had already been chemically screened inlegacy companies and provided several leads. Statisticalanalyses of each disease will be submitted for formal peer-review in specialty journals by the appropriate networkphysicians.

    In some diseases, such as type 2 diabetes mellitus, thereare reasonable prior hypothesis concerning pathogenesis.For example, it is generally agreed that glucose metabo-lism and insulin sensitivity play a role. Among the HiTDIPtype 2 diabetes confirmed genes, several are supportedby prior published hypotheses and appear in expandedmetabolic pathways. Such coincidences resulting from

    FIGURE 3Experimental design algorithm for HiTDIP analyses. The basic design of the HiTDIP program was toscreen approximately 500 patients and 500 controls,all prospectively examined, with full clinicalinformation and commercial informed consent obtained.The current gene panel consists of approximately1800 genes and 7000 validated SNP assays. Although highly significant p-values will be observed for somegenes in the primary screen, the probability for occurrence of false positives is high. A secondary study of approximately 500 patients and 500 controls (obtained prospectively) is also tested.A gene with a p-valueof 0.05 in primary and secondary screens is assessed by permutation ( Box 1). Any gene with apermutation of p 0.05 is considered confirmed . Calculations of random gene association assessed bypermutation were confirmed ( Figure 4 ).

    Drug Discovery Today

    Confirmed

    genes

    Yes

    Yes

    Does gene havea permutation with

    p = 0.05?

    Does gene have any testwith p = 0.05?

    Primary screen500 cases and 500 controls

    1800 genes, 7000 SNPs

    Secondary screen500 cases and 500 controls

    1800 genes, 7000 SNPs

    Genotypic, allelicand haplotypic tests

    of association

    Genotypic, allelicand haplotypic tests

    of association

    Does gene have anytest with p = 0.05?

    YesPermutation test on

    genes withp = 0.05 in both

    screens

  • 8/11/2019 Roses 2005 Drug Discovery Today

    9/13

    DDT Volume 10, Number 3 February 2005

    185www.drugdiscoverytoday.com

    REVIEWS

    screening so many hypothesis-independent genes are, atthe very least, interesting. This provides increased prior-ity for screening specific targets in defined pathways and,subsequently, for designing relevant clinical trials. In dis-eases where there is a paucity of information to form thebasis of supposition, for example, schizophrenia, specificgenes and pathways identified in HiTDIP can provideinsights to new hypotheses. Results for four other sec-ondary screens (schizophrenia, obesity, migraine andunipolar depression) will be available in early 2005.

    Additional experimental data for the association of each gene can also be generated by testing whether the

    target gene is within a region of extended linkage dise-quilibrium (LD) [ 51 53]. This is also important becausethe particular SNP used in the HiTDIP analysis might notbe the variant responsible for the disease association, buthas simply evolved concomitantly with the disease-vari-ant. SNPs that are in LD with the causal variant can pro-vide positive association data, as is the case for susceptibil-ity genes for Alzheimers disease, migraine, Crohns diseaseand psoriasis [ 52 55]. Analyzing for regions of extendedLD is also important to define whether the associationsignal is driven by the tested HiTDIP gene and not byone of its neighboring genes [ 51 ]: documenting extended

    BOX 1

    The permutation testing process

    The objective of this study was to identify tractable genes associatedwith disease. Some small genes might have only one or two SNPsanalyzed, whereas 30 40 SNPs could be assessed in the case of someof the larger genes, such as those encoding ion channels.The greaterthe number of SNPs and tests performed on a gene, the greater the

    probability that the gene will appear significant than by chancealone.To account and correct for the variable number of testsconducted across genes,a gene-based permutation test was applied.Permutation testing is a standard method used to assess significancein the statistical analysis of genetic data [ 57]. Any gene significant ata p-value of 0.05 in the primary and secondary screens was furtherassessed by performing this permutation process on the data fromthe secondary screen.For each permutation,affection status wasshuffled among the cases and controls.The genetic data for all SNPsacross each subject was not altered.This maintains the underlyingcorrelation between SNPs within a gene.All the SNPs within the genewere analyzed using allelic,genotypic and haplotypic associationtests via the same methods used for the observed data.The smallestp-value across all tests was recorded for each permutation.Thepermutations were repeated up to 5000 times per gene.The

    empirical p-value is the proportion of minimal p-values from thepermutations that are less than the observed minimal p-value fromthe actual data.The empirical p-value is estimated using Equation i .

    r+1 [Eqn i]n+1

    where r is the number of permutation p-values as small as or smallerthan in the actual data and n is the number of permutations [58,59].

    A gene with an empirical p-value of 0.05 was considered to beconfirmed with respect to statistical association with disease. Forexample, if the smallest p-value of gene A in the observed data was0.004 and among 5000 permutation (n = 5000) a p-value of 0.004was observed 50 times (r = 50),then the permutation process wouldgenerate an empirical p-value for gene A of 0.01, and it would beclassed as a confirmed gene.

    For the type 2 diabetes study assessed by permutation, theminimal p-value from the actual secondary dataset for each of the53 genes analyzed and their empirical p-values from thepermutation process are shown in Figure i (observed and gene-based permutation results).Of the 53 genes assessed,21 genes hadempirical p-values of 0.05 and thus were considered confirmed .Theobserved and empirical p-values for gene 7, a confirmed gene,werehighly similar,differing by only ~0.0001.For gene 27,which was notconfirmed, the corresponding values showed striking differences: thepermutated p-value was approximately 0.33, whereas the observedp-value was

  • 8/11/2019 Roses 2005 Drug Discovery Today

    10/13

    REVIEWS DDT Volume 10, Number 3 February 2005

    186 www.drugdiscoverytoday.com

    LD around a HiTDIP gene can increase (or decrease) con-fidence in its validity.

    It is reasonable to expect that a large screen with manytested variables could be subject to false positives. Althoughthe same limitations also characterize susceptibility genesresulting from linkage and association studies, they arenot usually tractable targets for chemical screening.Susceptibility gene studies provide new insights that arerelevant to disease pathogenesis: we anticipate that theHiTDIP genes will also generate new theories and providesupport for existing hypotheses. One immediate strategycould be to provide validation support by concentratingthorough phenotypic studies on knockout or otherwisemanipulated mice [ 16 ]. For the type 2 diabetes example,this would mean knockouts and conditional knockinsfor ~21 genes and their specific variants associated withdisease. Each mouse line could be examined with highly

    focused and complete biochemical and physiological phe-notyping in parallel with high-throughput chemicalscreening.

    There is no documented method for predicting whichHiTDIP genes will point to lead candidates for medicines.The selection of genes with which to initiate high-through-put assay development is empirically based on severalcriteria, one of which is the relative ease of creating thechemical screening assay. A few confirmed genes for type2 diabetes mellitus were screened previously in GSK legacycompanies because of pre-existing literature rationales forpotential involvement in disease pathogenesis. Enlarging

    or repeating the chemical screens with the current GSKcompound libraries is an immediate option, as is re-eval-uating earlier lead programs. In addition, there are severalgenes that the literature indicates are involved in inter-esting, potentially novel mechanisms. There are othergenes, particularly those for which the design of high-throughput assays is relatively uncomplicated, that can

    be prioritized. In the GSK R&D structure, scientists in theCenters for Excellence in Drug Discovery (CEDDs), whoare most familiar with the disease, prioritize the targets.

    It is particularly pertinent to mention here that GSKis now dealing with a relatively large, staggered flow of new target genes for which there is excellent support forgenetic association. If the leads from screened HiTDIPgenes reduce attrition, this will be evidenced in a largerproportion of positive Phase IIA efficacy studies over thenext 35 years. Remember, hits and leads must be opti-mized and undergo preclinical regulatory testing beforehuman testing can begin.

    HiTDIP provides genetic validation for target selectionto drive the pipeline. The confirmed targets are geneti-cally associated in some manner with specific human dis-eases or therapeutic indications that can be encompassedby the disease diagnostic label. Indeed, genes that areassociated with specific clinical variables could help todefine complex disease heterogeneity. A significant dif-ference in success compared with historical target selec-tion can be tested against benchmarks. Attrition that isthe result of a lack of relevance to a particular human dis-ease might only be appreciated when there is an absenceof clinical efficacy many years down the pipeline. Reducingattrition at the proof of efficacy stage (Phase IIA) will in-crease the efficiency of the pipeline and could provide asustainable stream of effective medicines related to humandiseases. Furthermore, additional hypotheses might begenerated at Phase IIA by the application of efficacy phar-macogenetics. These hypotheses can be tested reiterativelyin Phases IIB, III and IV. Of course, the consented DNAsamples remain available for testing new target classesas new chemical synthesis capabilities are developed.

    HiTDIP to discovery shunt some matches are madein heavenTrying to decide what to wear for a particular occasion fre-

    quently involves a quick scan of your collection of clothes.Large pharmaceutical companies, particularly those withlegacy compound libraries resulting from long histories of drug discovery, have a lot of unworn items in their collec-tions. Many discovery programs are initiated and numer-ous targets are screened only for leads to be left on the shelf(some with the price tags still on!) when times and cir-cumstances changed. The initial assumption for HiTDIP wasthat by screening all the known tractable targets against sev-eral well-defined diseases, new targets would be identified.Each would then require the design of a high-throughputscreen and a new screen of the chemical compound library.

    FIGURE 4Type 2 diabetes mellitus HiTDIP results. The type 2 diabetes mellitus HiTDIPexperiment was the first to be performed with two case-control screens.The priorasthma screens used a family-based series, thus the statistical analyses weresignificantly different.The primary screen consisted of 401 cases and 400 controls and

    used 4267 validated SNP assay for 1405 genes. A set of 256 genes yielded a p-value of 0.05 (a large number illustrating the high false-positive rate that is probable with asingle screen).The secondary screen enrolled 1166 patients and 1260 controls.At thetime that the secondary screen was initiated, the genotyping capacity was limitedand thus only the 256 genes and their 845 SNPs were included. A set of 53 genesyielded a p-value of 0.05,and were assessed by permutation,with 21 genes beingconfirmed (Figure i ). Estimation of the number of genes being confirmed by chancein this study yielded a rate of approximately ten genes per 1400 tested ( Box 1).

    Drug Discovery Today

    Primary screen401 cases and 400 controls

    1405 genes, 4267 SNPs

    Genotypic,allelic andhaplotypic

    association

    Genotypic,allelic andhaplotypic

    association

    256 genes with p = 0.05

    Secondary screen1166 cases and 1260 controls

    256 genes, 845 SNPs

    Permutation

    21 confirmed genes

    53 genes with p = 0.05

  • 8/11/2019 Roses 2005 Drug Discovery Today

    11/13

    DDT Volume 10, Number 3 February 2005

    187www.drugdiscoverytoday.com

    REVIEWS

    However, once targets with highly statistically signifi-cant associations with a particular disease were defined,another surprisingly rich supply of existing leads, whichrequired no novel chemical screening, was identified im-mediately. Indeed, several targets were already screened

    against the legacy company chemical libraries, withresulting leads left on the shelf because the program wasdropped, or abandoned after a molecule failed for a par-ticular therapeutic indication. Thus, immediately aftercompleting confirmation analyses for the initial HiTDIPprograms, several targets were identified for which therewere lead or better quality molecules on the shelf. At

    present, several molecules have either entered preclinicaltesting for Phase I or have already been tested in clinicaltrials designed for different therapeutic indications.

    The match between the molecule and disease is the keydata gained from confirmed genetic association. Using mol-ecules for which considerable data and work already existswill no doubt accelerate pipeline dynamics. More shots ongoal quickly or better shots at more defined goals resultsfrom being able to mine data generated from years of pre-vious drug discovery programs. Occasionally, after con-siderable prior hard work, serendipity plays a part whichis indeed food for thought for those who wonder whatthe advantages of corporate size could be! Although legacycompany names might disappear, contributions to capa-bilities, such as prior screens of targets, are maintained.

    Whole-genome testing and pathway analysesAlthough nothing in science is unanimous, there is agrowing belief that high-density, whole-genome SNP-screening, using newer statistical methods and powerfulcomputing capacities, can identify susceptibility genes fora disease. From a practical viewpoint, and the experienceof studying almost 2000 genes selected solely because theycould be drug targets, it would seem reasonable to per-form additional genome-wide (all genes) SNP-screeningassociation studies. Family linkage studies of the pasttwo decades have defined large regions of chromosomes(110 Mb) using log of the odds scores that successfullyidentified disease and susceptibility genes. By narrowingthese large regions using candidate genes, many positiveresults were reported. Much smaller regions of extendedLD (50250 kb) containing disease susceptibility geneswere identified using large patient and control collectionsfor case-control association studies. The DNA is there, themethods are increasingly more feasible economically andthere is a growing literature of statistical methods to sup-port these large studies [ 56 ]. From an academic point of

    view, whole-genome screening studies might be too ex-pensive to be readily available, but that does not meanthey will not provide scientifically valid data and diseaseinsight. Therefore, after the timetable of HiTDIP confirma-tion screening is completed, whole-genome SNP-screen-ing will follow. Making the right choice of targets at thebeginning of the pipeline will be the first step down thelong road of creating innovative medicines.

    Within three yearsThe first practical metric that will provide an estimate of the success of HiTDIP will be comparing Phase II attrition

    BOX 2

    Learning of the power and statistical significance of association studies

    Hirschhorn et al. [18] conducted an interesting meta-analysis of 166 initialassociations to determine the probability of their being reliablyreplicated. A particularly notable finding was that of 166 initialassociations from multiple studies,only six replicated consistently acrossinvestigations (i.e. had p-values

  • 8/11/2019 Roses 2005 Drug Discovery Today

    12/13

    REVIEWS DDT Volume 10, Number 3 February 2005

    188 www.drugdiscoverytoday.com

    with historical benchmarks. If the target genes selectedresult in more frequent early-phase efficacy successes,then the quantal step-up in discovery described byWeisberg will be apparent [ 32 ]. There will be other par-allel strategies for selecting targets. However, comparisonsof the success and attrition rates of drug candidates re-sulting from screens of genetically associated targets can

    be compared with benchmarks for other historical or cur-rent strategies. A retrospective view will be of interest but,for the present, the first indications will come from de-creased attrition at Phase II. Along the way, there will bea stream of lead molecules with which to prime a high-throughput pipeline. Combining the choice of the righttarget and the application of efficacy pharmacogeneticsat proof of concept, when necessary, should result in bet-ter defined medicines for patients with complex diseases[3]. The objective of the process is to reduce attrition, cycletimes and expense, as well as provide safe and effectivemedicines.

    AcknowledgementsThis research was supported by GSK R&D. We wish to ac-knowledge the work of hundreds of GSK scientists, almosttwo hundred collaborating clinicians, many academicconsultants and others who have contributed to theHiTDIP programs. As each set of disease-specific results ispublished, it is anticipated that many of the contributors

    will be identified and will be co-authors. Alison Gatherumand Mindy McPeak were extremely helpful in preparingthe manuscript. Several important visionaries must alsobe acknowledged for a program that has taken seven yearsto begin to deliver sustained value to the pipeline: JamesNiedel, who was the Chairman of R&D in GlaxoWellcomeand supported the initiation of the clinical networks(19972000); Tachi Yamada, Chairman of R&D in GSK,who backed and expanded this strategy as a core objec-tive of the R&D organization of GSK (2000present); Jean-Paul Garnier, CEO of GSK, whose support for the early,middle and late parts of the R&D pipeline was crucial insustaining these efforts; and the thousands of people inthe GSK organization who believe in Focus on thePatient to create safe and effective medicines.

    References1 Chapman, T. (2004) Drug discovery the

    leading edge. Nature 430, 1092 Kola, I. and Landis, J. (2004) Can the

    pharmaceutical industry reduce attrition rates?Nat. Rev. Drug Discov. 3, 711715

    3 Roses, A. (2002) Genome-basedpharmacogenetics and the pharmaceuticalindustry. Nat. Rev. Drug Discov. 1, 541549

    4 Aono, S. et al. (1995) Analysis of genes forbilirubin UDP-glucuronosyltransferase in

    Gilberts syndrome. Lancet 345, 9589595 Fijal, B.A. et al. (2000) Clinical trials in thegenomic era: effects of protective genotypes onsample size and duration of trial. Control. Clin.Trials 21, 720

    6 Monaghan, G. et al. (1996) Genetic variation inbilirubin UPD-glucuronosyltransferase genepromoter and Gilberts syndrome. Lancet 347,578581

    7 Chanda, S. and Caldwell, J. (2003) Fulfilling thepromise: drug discovery in the post-genomicera. Drug Discov. Today 8, 168174

    8 Glassman, R. and Sun, A. (2004) Biotechnology:identifying advances from the hype. Nat. Rev.

    Drug Discov. 3, 1771839 Johnston, P.A. and Johnston, P.A. (2002)

    Cellular platforms for HTS: three case studies. Drug Discov. Today 7, 35336310 Austin, C.P. et al. (2004) The knockout mouse

    project. Nat. Genet. 36, 92192411 Drews, J. (2000) Drug discovery: a historical

    perspective. Science 287, 1960196412 Marecki, S. and Kirkpatrick, P. (2004)

    Efalizumab. Nat. Rev. Drug Discov. 3, 47347413 Food and Drug Administration (2004)

    Innovation stagnation, challenge andopportunity on the critical path to new medicalproducts ( http://www.fda.gov/oc/initiatives/criticalpath/whitepaper.html )

    14 Berns, A. (2001) Cancer improved mousemodels. Nature 410, 10431044

    15 Liggett, S.B. (2004) Opinion: genetically

    modified mouse models for pharmacogenomicresearch. Nat. Rev. Genet. 5, 657663

    16 Zambrowicz, B.P. et al. (2003) Predicting drugefficacy: knockouts model pipeline drugs of thepharmaceutical industry. Curr. Opin. Pharmacol.3, 563570

    17 Zambrowicz, B.P. and Sands, A. (2003)Knockouts model the 100 best-selling drugs will they model the next 100? Nat. Rev. Drug

    Discov. 2, 3851

    18 Hirschhorn, J. et al. (2002) A comprehensivereview of genetic association studies. Genet. Med. 4, 4561

    19 Neale, B. and Sham, P. (2004) The future of association studies: gene-based analysis andreplication. Am. J. Hum. Genet. 75, 353362

    20 Lohmueller, K. et al. (2003) Meta-analysis of genetic association studies supports acontribution of common variants tosusceptibility to common disease. Nat. Genet.33, 177182

    21 Ioannidis, J. et al. (2003) Genetic associations inlarge versus small studies: an empiricalassessment. Lancet 361, 567571

    22 Tabor, H.K. et al. (2002) Candidate-geneapproaches for studying complex genetic traits:

    practical considerations. Nat. Rev. Genet. 3,39139723 Thomas, D. and Clayton, D. (2004) Betting

    odds and genetic associations. J. Natl. Cancer Inst. 96, 421423

    24 Weiss, K. and Terwilliger, J. (2000) How manydiseases does it take to map a gene with SNPs.Nat. Genet. 26, 151157

    25 Risch, N. (2000) Searching for geneticdeterminants in the new millennium. Nature405, 847856

    26 Colhoun, H.M. et al. (2003) Problems of reporting genetic associations with complexoutcomes. Lancet 361, 865872

    27 Tufts Center for the Study of DrugDevelopment (2003) Post-approval R&D raises

    total drug development costs to $897 million. Impact Report: Analysis and Insight into Critical Drug Development Issues 5, MayJune

    28 Venter, J.C. (2000) Genomic impact onpharmaceutical development. Novartis

    Foundation Symposium 229, 14-1829 Collins, F. and McKusick, V. (2001) Implications

    of the Human Genome Project for medicalscience. J. Am. Med. Assoc. 285, 540544

    30 Lander, E. et al. (2001) Initial sequencing and

    analysis of the human genome. Nature 409,86092131 Venter, J. et al. (2001) The sequence of the

    human genome. Science 291, 130432 Fishman, M.C. (2004) An audience with.

    Nat. Rev. Drug Discov. 3, 29233 Lindsay, M. (2003) Innovation target

    discovery. Nat. Rev. Drug Discov. 2, 83183834 Goldstein, D. et al. (2003) Pharmacogenetics

    goes genomic. Nat. Rev. Genet. 4, 93794735 Altshuler, D. et al. (2000) The common PPAR

    gamma Pro12Ala polymorphism is associatedwith decreased risk of type 2 diabetes.Nat. Genet. 26, 7680

    36 Martin, E. et al. (2001) Association of single-nucleotide polymorphisms of the tau gene with

    late-onset Parkinson disease. J. Am. Med. Assoc.286, 2245225037 Van Eerdewegh, P. et al. (2002) Association of

    the ADAM33 gene with asthma and bronchialhyperresponsiveness. Nature 418, 426430

    38 Funke, B. et al. (2004) Association of theDTNBP1 Locus with Schizophrenia in a U.S.Population. Am. J. Hum. Genet. 75, 891898

    39 Duan, J. et al. (2004) Polymorphisms in thetrace amine receptor 4 (TRAR4) gene onchromosome 6q23.2 are associated withsusceptibility to schizophrenia. Am. J. Hum.Genet. 75, 624638

    40 Giallourakis, C. et al. (2003) IBD5 is a generalrisk factor for inflammatory bowel disease:replication of association with Crohn disease

    http://www.fda.gov/oc/initiatives/criticalpath/whitepaper.htmlhttp://www.fda.gov/oc/initiatives/criticalpath/whitepaper.htmlhttp://www.fda.gov/oc/initiatives/criticalpath/whitepaper.htmlhttp://www.fda.gov/oc/initiatives/criticalpath/whitepaper.html
  • 8/11/2019 Roses 2005 Drug Discovery Today

    13/13

    DDT Volume 10, Number 3 February 2005 REVIEWS

    and identification of a novel association withulcerative colitis. Am. J. Hum. Genet. 73,205211

    41 Schmith, V. et al. (2003) Pharmacogenetics anddisease genetics of complex diseases. Cell. Mol.

    Life Sci. 60, 1636164642 Drews, J. (1996) Genomic sciences and the

    medicine of tomorrow. Nat. Biotechnol. 14,15161518

    43 Hopkins, A.L. and Groom, C. (2002) Thedruggable genome. Nat. Rev. Drug Discov. 1,727730

    44 Sachidanandam, R. et al. (2001) A map of human genome sequence variation containing1.42 million single nucleotide polymorphisms.Nature 409, 928933

    45 Wyszynski, D.F. et al. The relationship betweenatherogenic dyslipidemia and the adulttreatment program-III definition of metabolicsyndrome: The Genetic Epidemiology of Metabolic Syndrome Project. Am. J. Cardiol.(in press)

    46 Taylor, J.D. et al. (2001) Flow cytometricplatform for high-throughput single nucleotidepolymorphism analysis. Biotechniques 30,661669

    47 Chen, J. et al. (2000) A microsphere-based assayfor multiplexed single nucleotidepolymorphism analysis using single base chainextension. Genome Res. 10, 549557

    48 Mehta, C. and Patel, N. (1983) A networkalgorithm for performing Fishers Exact Test inrc contingency tables. J. Am. Stat. Assoc. 78,427434

    49 Neale, B. and Sham, P. (2004) The futureof association studies: gene-based analysisand replication. Am. J. Hum. Genet. 75,353362

    50 Cardon, L. and Bell, J. (2001) Association studydesigns for complex diseases. Nat. Rev. Genet. 2,9199

    51 Roses, A. (2002) SNPs wheres the beef? Pharmacogenomics J. 2, 277283

    52 McCarthy, L. et al. (2001) Single-nucleotidepolymorphism alleles in the insulin receptorgene are associated with typical migraine.Genomics 78, 135149

    53 Hewett, D. et al. (2002) Identification of apsoriasis susceptibility candidate gene bylinkage disequilibrium mapping with a localizedsingle nucleotide polymorphism map. Genomics79, 305314

    54 Martin, E. et al. (2000) A test for linkage andassociation in general pedigrees: the pedigreedisequilibrium test. Am. J. Hum. Genet. 67,146154

    55 Rioux, J. et al. (2001) Genetic variation in the5q31 cytokine gene cluster conferssusceptibility to Crohn disease. Nat. Genet. 29,223228

    56 Carlson, C. et al. (2004) Mapping complexdisease loci in whole-genome associationstudies. Nature 429, 446452

    57 North, B.V. et al. (2002) A note on thecalculation of empirical P values from MonteCarlo procedures. Am. J. Hum. Genet. 71,439441

    58 Davison, A.C. and Hinkley, D.V. (1997) TheBasic Bootstraps. In Bootstrap Methods and their

    Application (Gill, R. et al. , eds), pp. 1166,Cambridge University Press

    59 Davison, A.C. and Hinkley, D.V. (1997) Tests.In Bootstrap Methods and their Application(Gill, R. et al. , eds), pp. 136187, CambridgeUniversity Press

    60 Roses, A. (2004) Pharmacogenetics and drugdevelopment: the path to safer and moreeffective drugs. Nat. Rev. Genet. 5, 645655

    Related articles in other Elsevier journals

    Target identification and validation through geneticsAllen,M.J.and Carey, A.H.(2004) Drug Discovery Today:TARGETS3,183 190

    Translating pharmacogenetics and pharmacogenomics into drug development for clinical pediatrics and beyondLeeder, J.S. (2004) Drug Discov.Today, 9,567 573

    Genome scans and candidate gene approaches in the study of common diseases and variable drug responsesGoldstein,D.B. et al. (2003) Trends Genet. 19,615 622

    SNP association studies in Alzheimer s disease highlight problems for complex disease analysisEmahazion,T. et al. Trends Genet. 17,407 413