
MOLECULAR ONCOLOGY 7 (2013) 743-755

available at www.sciencedirect.com

www.elsevier.com/locate/molonc

Review

Translating next generation sequencing to practice:

Opportunities and necessary steps

Sitharthan Kamalakaran a,*, Vinay Varadan a, Angel Janevski a, Nilanjana Banerjee a, David Tuck b, W. Richard McCombie c, Nevenka Dimitrova a, Lyndsay N. Harris d

a Philips Research North America, Briarcliff Manor, NY 10510, USA
b Oncology Global Clinical Research, Bristol-Myers Squibb, Princeton, NJ 08540, USA
c Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
d Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA

ARTICLE INFO

Article history:

Received 20 March 2013

Accepted 21 April 2013

Available online 15 May 2013

Keywords:

Next generation sequencing

Oncology

Personalized medicine

Genomics

* Corresponding author.

1574-7891/$ - see front matter © 2013 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

http://dx.doi.org/10.1016/j.molonc.2013.04.008

ABSTRACT

Next-generation sequencing (NGS) approaches for measuring RNA and DNA benefit from greatly increased sensitivity, dynamic range and detection of novel transcripts. These technologies are rapidly becoming the standard for molecular assays and represent huge potential value to the practice of oncology. However, many challenges exist in the transition of these technologies from research application to clinical practice. This review discusses the value of NGS in detecting mutations, copy number changes and RNA quantification and their applications in oncology, the challenges for adoption and the relevant steps that are needed for translating this potential to routine practice.

© 2013 Federation of European Biochemical Societies.

Published by Elsevier B.V. All rights reserved.

1. Introduction

The last decade of research has consolidated our understanding of cancer as a genetic disease caused by genomic disruptions ranging from single point mutations to deletions or amplifications of chromosomal segments and structural rearrangements that give rise to chimeric genes. These aberrations at the genomic level drive changes in gene expression, activate or silence genes and thereby perturb gene networks and pathways.

There now exists an extensive literature cataloging genomic disruptions in cancer and their effect on the biological functions of cancer cells. Several of these disruptions are important biomarkers and impact treatment options. Estrogen receptor (ER) testing has been routinely performed on breast carcinoma samples since the 1980s to determine if hormonal therapy is indicated. Similarly, EGFR mutation status has been used to determine which lung cancer patients will benefit from agents targeting EGFR. The FDA lists more than 100 indications where pharmacogenomic testing is


indicated, including 38 in oncology (http://www.fda.gov/drugs/scienceresearch/researchareas/Pharmacogenetics/ucm083378.htm). While each of these individual measures is of value, each represents only a single data point in the complex environment of cancer.

The advent of multigene analysis tools, i.e. gene arrays, CGH and next generation sequencing, has advanced the field by measuring the complexity of cancer in a more comprehensive fashion. Several multigene assays have been introduced into the clinic to predict patient outcome. Oncotype DX (Genomic Health, Redwood City, CA) quantifies the expression of 21 genes by RT-PCR and uses an algorithm to combine the expression values into a “recurrence score” to predict chemotherapy benefit for a subset of breast cancer patients. Many additional tests and markers have been reported to be of use in the management of cancer: CancerTYPE ID (bioTheranostics, Inc., San Diego, CA) to aid in the classification of the tissue of origin and tumor subtype for patients diagnosed with malignant disease, Oncotype DX Colon (Genomic Health) for assessment of risk of recurrence following surgery in stage II colon cancer patients and MammaPrint (Agendia, Irvine, CA) for identifying risk of distant recurrence following surgery. All these tests measure genes and their expression levels through capillary sequencing, microarrays, or PCR and are standardized for measuring a small subset of the tumor genome.

As the utility of these tools increases, the number of

different tests and diagnostic providers makes it increasingly

cumbersome for pathologists and oncologists to obtain

enough sample material for analysis. Next Generation

Sequencing (NGS) technologies offer the potential to measure

and quantify all these markers at once and provide a more

complete view of the tumor’s molecular state. In addition, NGS has allowed the analysis of complete human genomes at a reasonable cost: the cost of sequencing one human genome has come down from $100 million in 2001 to just under $3000 in 2012 (www.genome.gov/sequencingcosts).

NGS can now provide the following depth and breadth of genomic information in a single test (Gargis et al., 2012): (i) whole genome information at single nucleotide level with a complete catalog of mutations (Ellis and Perou, 2013); (ii) a profile of the copy number states of individual genes and many chromosomal aberrations (Ellis and Perou, 2013); (iii) the whole transcriptome landscape including mRNA levels of protein coding genes, non-coding RNA, expression of repeat rich regions, and aberrant fusion genes. We can now foresee a scenario in which DNA and RNA are extracted from a tumor sample obtained through biopsy or surgery, then sequenced and assembled to provide the full patient genome and an RNA-seq profile of the patient's transcriptome (Figure 1). These data are then mined to catalog all mutations and copy number aberrations and to quantify the expression of genes. Once the mutational, copy number and transcriptomic profile is generated, clinical decision support algorithms are used to extract and present useful and clinically actionable information from the results of the analyses. This has already been shown in pilot studies (Gargis et al., 2012) and is a major step forward in realizing the potential of personalized medicine.

The comprehensive nature of NGS also has the potential to replace a multitude of single gene tests that are currently performed on multiple discrete specimens with a single test on one specimen. This would lead to improved standardization of tests for specific genetic abnormalities, more in-depth information for clinicians and more cost-effective molecular diagnostic testing. Additionally, once sequenced, this genome information is “digitized and may be immortalized”, which enables the sequenced sample to be “frozen” in silico and accessible for further querying as the treatment progresses or whenever new clinically relevant aberrations are identified or reported. This is more advantageous than testing an archived tumor sample for prospective or retrospective analysis. It also makes a patient genome a source of data for comprehensive genome backward and genome forward approaches. In the first, mutations that are identified would drive the selection of therapies based on retrospective data on prior patients with similar genomic profiles whose outcome is known (genome backward medicine). In the second, for patients who have failed conventional therapies or for whom there are no clearly delineated guidelines on therapy choice, new therapeutic strategies could be attempted (as part of clearly defined clinical trials) based on the comprehensive analysis of the genetic makeup of the patient's tumor (genome forward medicine). Recently, Ellis and Perou offered a prospective view of how genomic profiling could help in the treatment of breast cancer (Ellis and Perou, 2013). They catalogued specific examples of mutations (PIK3CA, BRCA1, BRCA2, GATA3, MLL gene family, rare receptor tyrosine kinases) and genomic abnormalities (amplifications/gain of function mutations in HER2, FGFR1, FGF3, Cyclin D1/CDK4/CDK6, MDM2; deletions/loss of function mutations in PTEN, PIK3R1) that could be used to target therapies in breast cancers.

However, the optimal use of these novel molecular assays will be a challenge for the practicing oncologist. Many challenges must be overcome to transfer this vision from a few luminary sites and hospitals into routine medical practice. Many of the opportunities and challenges in applying next-generation sequencing for clinical applications have been reviewed elsewhere (Biesecker et al., 2012; Biesecker et al., 2009; Nekrutenko and Taylor, 2012; Treangen and Salzberg, 2011; Berg et al., 2011; Maher, 2011; McDermott et al., 2011; Ormond et al., 2010). In this review, we focus on the role of next generation sequencing technologies for cancer patients and the challenges we face in using this technology for clinical applications, and we provide a framework for oncologists to assess the promise and pitfalls of its use in routine clinical practice. We consider three major groups of challenges for the introduction of a new technology such as sequencing into clinical practice. We outline these major steps in Figure 2.

1. Technology - the feasibility of the technology is an important but not sufficient step in this process. In this step the necessary aspects of technology reproducibility and accuracy must be met. Additionally, the cost and throughput of the technology need to be within acceptable limits to enable wide introduction into practice, and if individual instruments are to enter the clinic, they need to pass regulatory approval (510(k) approval).

2. Clinical applicability - one would need to demonstrate clinically meaningful uses of the technology and their benefits for patients and providers. Additionally, software and algorithms that are used must be tested for usability and be quality controlled. Any tests that the technology is used to provide must demonstrate high sensitivity and specificity.

3. Adoption - the prescribed use of the technology must be described by professional organizations; regulatory approval must be obtained for clinical use. Additionally, infrastructure and standards must be set up to enable widespread use of the technology, including CPT coding of molecular services that will be intelligible to payors and will promote uniformity of coding among equipment and service providers. These will also include setting up IT, data storage and privacy, and educational programs for training healthcare professionals in appropriate use.

Figure 1 - A model for enabling sequencing based personalized oncology. A CLIA certified laboratory generates NGS data (Layer 1) that is transferred to a High Performance Computing environment where the requisite quality control and analysis of the data is performed (Layer 2). Clinical Decision support algorithms will extract clinically relevant pieces of information from the processed data (Layer 3). Examples of clinically relevant information could be the identification of an activating PIK3CA mutation, ERBB2 amplification, a gene signature such as breast cancer subtype or a gene fusion. This clinically relevant information is then interactively viewed through desktop/mobile devices (Layer 4).

We will now review in detail the status of NGS technology

in the context of the broad outlines described above.

2. Technology feasibility

Next Generation Sequencing has rapidly replaced other high-throughput technologies such as microarrays as the platform of choice for many genomic applications. The base-call quality of Illumina NGS machines, when integrated across all reads of a given base, exceeds that of Sanger based capillary sequencers. Ninety percent of the bases called by Illumina NGS sequencers have Phred quality scores of Q30, compared to around Q20 for Sanger based sequencers. A Phred score of Q30 corresponds to a probability of base-call error of 1 in 1000, or 99.9% accuracy. Also, throughput of the sequencers has increased dramatically so that it is now possible to generate enough sequence data to assemble a full human genome in 1 day. New sequencers such as Ion Torrent (Life Technologies, Carlsbad, CA) and Oxford Nanopore Technologies (Oxford, UK) promise even higher throughput at lower costs. The accuracy, speed and cost of assembling a human genome have met the threshold for enabling clinical use. Additionally, these instruments are undergoing regulatory approval under the Clinical Laboratory Improvement Amendments (CLIA) and FDA (510(k)). Several commercial providers have begun to offer CLIA certified lab developed tests (LDT) that use NGS technologies. Foundation Medicine (Cambridge, MA) offers a CLIA certified test that scans for somatic alterations in 236 relevant cancer-related genes. Ambry Genetics (Aliso Viejo, CA) offers CLIA certified exome sequencing for undiagnosed genetic diseases. Both Life Technologies (AmpliSeq Comprehensive Cancer Panel) and Illumina (TruSeq Amplicon - Cancer Panel) offer target selection assays for resequencing cancer genes. These methods allow for fast and efficient resequencing of key genes in formalin fixed paraffin embedded (FFPE) samples.

Figure 2 - Major challenges for introducing sequencing based oncology into routine practice.
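The Phred relationship quoted in this section (Q30 corresponding to an error probability of 1 in 1000) follows from the standard definition Q = -10 log10(P). The short Python sketch below only illustrates that conversion; the function names are hypothetical and are not taken from any sequencing vendor's software.

```python
import math


def phred_to_error_probability(q):
    """Convert a Phred quality score Q into a base-call error probability."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Convert a base-call error probability back into a Phred quality score."""
    return -10 * math.log10(p)


# Print the error probability and accuracy for a few common quality levels.
for q in (20, 30, 40):
    p = phred_to_error_probability(q)
    print(f"Q{q}: error probability {p:.4f}, accuracy {100 * (1 - p):.2f}%")
```

Under this definition each 10-point increase in Q corresponds to a tenfold drop in the expected error rate, which is why the difference between Q20 and Q30 is larger than the numbers alone suggest.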

3. Clinically meaningful applications of NGS

For NGS to become part of routine clinical practice, the medi-

cal benefits must be clearly demonstrated. In the area of

oncology, there are a number of clinical needs that can be

met by sequencing technologies.

Genetic abnormalities detectable by sequencing can be classified into three major groups: single nucleotide changes or point mutations, copy number changes (amplification, rearrangement or deletion of sections of chromosomes) and changes in expression levels of genes. In the following paragraphs, we discuss biomarkers with known clinical utility, measured by individual tests, which could be converted to a whole genome sequencing approach.

3.1. Point mutations

Point mutations that lead to constitutive activation of oncogenes or inactivation of tumor suppressor (TS) genes have been used to guide development of novel targeted therapies. The best described is the cKIT mutation in gastrointestinal stromal tumors, which can be targeted by imatinib or nilotinib. Recently, mutations in EGFR have been reported to be predictive of response to EGFR inhibitors (Lynch et al., 2004; Paez et al., 2004), and mutations in the BRAF oncogene at codon 600 have been reported to have clinical utility in melanoma, colorectal, lung and thyroid cancers (De Roock et al., 2011; Melck et al., 2010; Pao and Girard, 2011). Patients with BRAF V600E mutation-positive, inoperable or metastatic melanoma are eligible for treatment with vemurafenib. The cobas 4800 BRAF V600 Mutation Test (Roche) is used to identify patients eligible for treatment. An exemplary list of point mutations for which commercial testing is available, together with their utility in cancer treatment, is provided in Table 1.

For single gene tests to be used, the mutations in the gene have to meet a threshold of prevalence to warrant testing and have very high clinical value. Many of the genes listed in Table 1 are frequently not tested in the clinic because they fail to meet this threshold or because the resultant clinical action of the tests is not clear. Recent reports suggest that each tumor has a distinct number of mutations driving it; however, each of these mutations is present only in a small percentage of tumors. Mardis et al. (2009) identified 750 point mutations in AML, of which 64 were in coding/regulatory regions. Only 4 of these 64 point mutations could be detected in more than one sample, suggesting that individual mutations are not recurrent. Other studies have reported that while a specific mutation might not be found recurrently, common pathways can be identified that may drive pathogenesis: Stransky et al. (2011) found that more than 30% of head and neck cancer cases harbored mutations in genes that regulate squamous differentiation (e.g., NOTCH1, IRF6, and TP63). These studies suggest that methods that assess the entire genome would offer a comprehensive approach to defining perturbed pathways, as opposed to using single gene expression levels or mutations of particular codons as surrogates of perturbed pathways.

It is also increasingly recognized that cancer mutations are not limited by tissue type, although their prevalence might be. For example, BRAF mutations are common in melanoma (>50%), but have also been detected at lower frequencies in other cancers (Brose et al., 2002; Davies et al., 2002). Similarly, ERBB2 amplifications were originally described in breast cancers and form the basis of treatment with trastuzumab. Recently, it has been reported that 10% of gastric cancers also harbor the ERBB2 amplicon and these patients might benefit from trastuzumab (Herceptin) therapy (Barros-Silva et al., 2009). Currently, although cumulatively these low prevalence mutations may make up a large portion of cancer drivers, they are not routinely tested in the clinical setting. In this context, it might be more important to screen for all mutations in the patient by sequencing in an unbiased manner, instead of applying single gene tests.

Table 1 - Clinically relevant tests for mutations in cancer and their utility.

Gene | Mutation | Test type | Cancer type | Ref.
AKT1 | E17K | Prognostic | Breast, colorectal, lung and ovarian cancers | (Pao and Girard, 2011; Bleeker et al., 2008)
BRAF | V600E | Predictive of response to vemurafenib or dabrafenib | Non-Hodgkin lymphoma, colorectal cancer, malignant melanoma, thyroid carcinoma, non-small cell lung carcinoma, lung adenocarcinoma and melanoma | (De Roock et al., 2011; Melck et al., 2010; Pao and Girard, 2011; Flaherty et al., 2012)
EGFR | Exons 18-21 | Predictive of benefit to EGFR TKIs | Non-small cell lung cancer | (Linardou et al., 2009)
FLT3 | D835 | Prognostic | Acute myeloid leukemia | (Motyckova and Stone, 2010)
JAK2 | Exon 12, V617F | Prognostic | Myeloproliferative disorders, chronic myeloid leukemia | (Li et al., 2008b; Ma et al., 2009)
KIT | Exons 8, 9, 11, 17 | Predictive | Gastrointestinal stromal tumors; acute myeloid leukemia | (Motyckova and Stone, 2010; Reichardt, 2010)
KRAS | Codons 12, 13, 61 | Predictive of benefit to erlotinib | Lung adenocarcinoma, mucinous adenoma, ductal carcinoma of the pancreas, and colorectal carcinoma | (De Roock et al., 2011; Kompier et al., 2010; Monzon et al., 2009; Plesec and Hunt, 2009; Soulieres et al., 2010)
MPL | Exon 10 | Prognostic | Myeloproliferative disorders - chronic myeloid leukemia, polycythemia vera |
NPM1 | Codons 288, 290, Exon 12 | Prognostic | Acute myeloid leukemia (AML) | (Motyckova and Stone, 2010; Hollink et al., 2009)
PIK3CA | Exons 9, 20 | Prognostic | Colorectal cancer, malignant melanoma, thyroid carcinoma, non-small cell lung cancer, breast cancer, cervical cancer, and lung adenocarcinoma | (Pao and Girard, 2011; Kompier et al., 2010)
TP53 | Somatic mutations | Prognostic | Head and neck squamous cell carcinoma, leukemia, and breast cancer | (Silver et al., 2010; Petitjean et al., 2007)

Roychowdhury et al. (2011) recently reported an interesting pilot study demonstrating the value of comprehensive sequencing in a patient with metastatic colorectal cancer. A mutation in the NRAS gene and an amplification of the CDK8 locus were identified, both of which could be used to enroll the patient in a clinical trial targeting these aberrations (Roychowdhury et al., 2011). Interestingly, this patient underwent testing for KRAS and was deemed to be wild type, which is a basis for prescribing anti-EGFR therapy (Biesecker et al., 2012). Although this patient was not prescribed anti-EGFR therapy, he would have been eligible. The authors noted that the NRAS mutation is functionally equivalent to a KRAS mutation and, if known, should preclude this patient from anti-EGFR therapy. This case study also reinforces the difficulty in prescribing single gene mutation testing: while NRAS mutations are present in 18% of cutaneous melanomas (Lee et al., 2011), they are much rarer in colorectal cancers (2%) (Irahara et al., 2010). Only unbiased sequencing based mutation testing would have uncovered these important driver mutations.

Several platforms and testing services are now available

for identification of multiple mutations in cancer-related

genes in a single assay. MacConaill et al. (2009) reported a mutation profiling platform (“OncoMap”) to interrogate 400 mutations in 33 known oncogenes and tumor suppressors. Such

assays, while an improvement over single gene assays, still

offer a limited fraction of mutational information relevant to

cancer. The advent of whole genome sequencing technologies

offers very competitive price points and the ability to profile

tumors for multiple mutations with a single test. Commercial

providers such as Foundation Medicine (Cambridge, MA)

already use deep sequencing on selected sets of cancer genes from DNA extracted from routine pathology specimens to provide actionable information to treatment providers.

3.2. Chromosomal abnormalities

Regions of the genome are commonly amplified or deleted in cancer and these regions contain genes that drive cancer progression, the best example being the 17q12 amplicon that harbors the HER2 oncogene. This amplicon leads to a more aggressive type of tumor, which is now the target of a highly successful antibody therapy, trastuzumab (Herceptin). Other

amplicons in 11q13/14, 8q24, and 20q13.2 have been found in

cancers that seem to drive the cancer phenotype and have prog-

nostic significance. These regions contain gene sets, which are

important in DNA metabolism and maintenance of chromo-

somal integrity, suggesting that response to DNA damaging





agents used as anticancer therapy might be modulated by the

presence of particular amplicons. Recently authors of a large

breast cancer study proposed ER positive breast cancers

harboring amplifications in 11q13/14 as a separate subgroup

with worse outcomes (Curtis et al., 2012).

There exist several published methods for inferring gene

copy number using next generation sequencing. The sensi-

tivity and specificity of these methods exceed that of microar-

ray based techniques and require very small amounts of

tumor DNA as starting material.
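The text does not say how the published methods infer copy number, so the sketch below is only a generic read-depth illustration and not the algorithm of any cited study. It assumes a matched normal sample, a pure tumor sample, a diploid normal genome, and read counts for a single region; all function names and numbers are hypothetical.

```python
import math


def depth_ratio(tumor_reads, normal_reads, tumor_total, normal_total):
    """Library-size-normalized tumor/normal read-depth ratio for one region."""
    if normal_reads <= 0 or tumor_total <= 0 or normal_total <= 0:
        raise ValueError("normal read count and library sizes must be positive")
    # Equivalent to (tumor_reads / tumor_total) / (normal_reads / normal_total).
    return (tumor_reads * normal_total) / (normal_reads * tumor_total)


def copy_number_estimate(tumor_reads, normal_reads, tumor_total, normal_total,
                         normal_ploidy=2):
    """Scale the depth ratio by the assumed ploidy of the normal sample."""
    return normal_ploidy * depth_ratio(tumor_reads, normal_reads,
                                       tumor_total, normal_total)


def log2_ratio(tumor_reads, normal_reads, tumor_total, normal_total):
    """Log2 depth ratio; 0 means no change relative to the normal sample."""
    if tumor_reads <= 0:
        raise ValueError("log2 ratio is undefined for a region with no tumor reads")
    return math.log2(depth_ratio(tumor_reads, normal_reads,
                                 tumor_total, normal_total))


# Hypothetical counts for one region in libraries of equal size.
estimate = copy_number_estimate(tumor_reads=3000, normal_reads=1000,
                                tumor_total=1_000_000, normal_total=1_000_000)
print(f"Estimated copy number: {estimate:.1f}")
```

In real tumor samples the estimate would be diluted by normal cell contamination and distorted by mappability and GC content, which is one reason dedicated methods are needed.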

Translocations and their corresponding gene fusion prod-

ucts have been known to play an important role in the onset

and development of several cancers (Mitelman et al., 2007).

The classic example of a translocation resulting in the crea-

tion of a fusion transcript is the reciprocal translocation

t(9;22)(q34;q11) (McDermott et al., 2011; Lee et al., 2011)

causing the BCR-ABL1 fusion transcript. BCR-ABL1 fusion oc-

curs in most patients with chronic myelogenous leukemia

(CML) and a third of patients with acute lymphoblastic leuke-

mia. While the clinical impact of gene fusions has been most

prevalent in hematological malignancies, there is growing ev-

idence that they could have prognostic and predictive utility

in common solid tumors (Table 2). The clinical utility of inter-

rogating solid tumors for gene fusions can be seen from the

recent approval of crizotinib for the treatment of NSCLC that

harbor rearrangements in ALK (Kwak et al., 2010). Similar

therapeutic implications have also been recently reported

for MAST kinase rearrangements in breast cancer (Robinson

et al., 2011) and RET rearrangements in lung adenocarcinoma

(Lipson et al., 2012), thus highlighting the clinical relevance of

genomic translocations and their fusion transcripts. This pro-

vides an additional clinical indication for sequencing technol-

ogies that can detect such variations in addition to mutations,

amplifications and deletions.

Table 2 - Clinically relevant chromosomal abnormalities in cancer and their utility.

Genes | Chromosome abnormality | Test type | Cancer type | Reference
BCR-ABL1 | t(9;22) | Diagnostic/Predictive of response to multikinase inhibitors such as imatinib, nilotinib | CML/ALL | (Mitelman et al., 2007)
EML4-ALK | Fusion | Targetable with ALK-specific tyrosine kinase inhibitor | Non-small cell lung cancer | (Kwak et al., 2010)
TMPRSS2-ERG | Fusion | Diagnostic/Prognostic | Prostate | (Tomlins et al., 2005)
MAST kinase and Notch gene family fusions | Fusion | Potentially targetable by γ-secretase inhibitor | Breast | (Robinson et al., 2011)
KIF5B-RET | Fusion | Targetable by RET tyrosine kinase inhibitor | Lung adenocarcinoma | (Lipson et al., 2012; Kohno et al., 2012)
1p & 19q | Deletion | Diagnostic | Oligodendroglioma | (Barbashina et al., 2005)
9p21 | Deletion | Diagnostic | Lymphoid leukemia, lung cancer, esophageal cancer, glioma, melanoma | (Sasaki et al., 2003)
17q12 | Amplification | Prognostic/Predictive of response to trastuzumab and anthracycline-based adjuvant therapy | Breast | (Lamy et al., 2011)
11q13 | Amplification | Prognostic | Breast |
8q24 | Amplification | Prognostic | Multiple cancers |
20q13.2 | Amplification | Prognostic | Breast |

3.3. Measurements of RNA and transcription profiling

There have been several studies documenting the use of the expression of one or more genes in determining cancer prognosis or treatment. An early series of publications specifically described molecular signatures in breast cancer, primarily focused on associations between particular sets of genes with altered expression and survival (Sorlie et al., 2001; van ’t Veer et al., 2002; West et al., 2001; Perou et al., 2000). These studies led to the development of clinically used tests such as Oncotype DX, MammaPrint, Breast Cancer Index (BCI), PAM50 and others. Multi-gene tests are also useful in other tumors; one example is the ColoPrint assay for recurrence risk in Stage II and III colon cancer patients.

The ability of RNA-seq to more completely characterize the entire transcriptome in comparison to any existing single technology such as RT-PCR or microarray promises significant clinical utility (Wang et al., 2009; Ryu et al., 2011). A big advantage of using RNA sequencing would be the opportunity to apply multiple individual or multi-gene tests to the same sample in a single RNA-seq run, at a fraction of the cost of running each of these separately. For example, in breast cancer a single RNA-seq run could provide input to calculate cancer subtype using the PAM50 gene set (Parker et al., 2009), obtain ER/PR and HER2 RNA abundance values (Kamalakaran et al., 2011), calculate the Genomic Grade Index (Filho et al., 2011; Liedtke et al., 2009) based on the 97 genes and identify expression of any other amplified or deleted genes, such as EGFR, which are known to impact therapy response.

3.4. Sample processing for sequencing of tumor samples

Recent improvements in sample processing protocols have also allowed for robust extraction of DNA/RNA from FFPE samples and use of very small amounts of DNA for complete sequencing (Bonin et al., 2010; Fairley et al., 2012; van Eijk et al., 2012). Sample quality is of primary importance in generating sequence based diagnostic tests. Quality control steps such as RNA integrity number (Schroeder et al., 2006), as used in standard molecular tests, are an absolute requirement. In addition, steps are needed to evaluate and quantify the histology of the specimen, especially to establish the percentage of invasive tumor in the specimen and the heterogeneity based on staining. Additionally, following the sequencing run, general quality checkpoints are essential to ensure data integrity and quality. The quality of the reads produced by the sequencer needs to be evaluated before subsequent mapping and feature extraction are performed. While most sequencers generate a quality control (QC) report as part of their analysis pipeline, tools such as FastQC are available and should be incorporated routinely in any clinical analysis. A QC report that identifies problems as originating either in the sequencer or in the starting library material would be more useful in the clinical context. The interpretation of the reads produced by the sequencer involves data analytic steps that first require alignment of the reads to a reference genome or transcriptome and subsequent interpretation of the resulting alignments to detect the presence of mutations, copy number abnormalities and transcript abundance. Quality checkpoints for sequencing based measurements are detailed in Table 3.
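As a rough illustration of the read-level quality checkpoint described in this section, the Python sketch below computes the fraction of base calls at or above Q30 from FASTQ quality lines. It assumes the Phred+33 ASCII encoding and uses made-up quality strings; the function names are hypothetical, and the sketch is not a substitute for a full QC tool such as FastQC.

```python
def phred_scores(quality_string, offset=33):
    """Decode one FASTQ quality line into integer Phred scores.

    The offset of 33 (Phred+33 encoding) is an assumption and must match
    the encoding used when the file was written.
    """
    return [ord(ch) - offset for ch in quality_string]


def fraction_at_or_above(quality_strings, threshold=30, offset=33):
    """Return the fraction of base calls whose Phred score is >= threshold."""
    total = 0
    passing = 0
    for quality_string in quality_strings:
        for q in phred_scores(quality_string, offset):
            total += 1
            if q >= threshold:
                passing += 1
    if total == 0:
        raise ValueError("no base calls supplied")
    return passing / total


# Hypothetical quality lines for two 5-base reads (not real instrument output).
example_quality_lines = ["IIII?", "II5#I"]
q30_fraction = fraction_at_or_above(example_quality_lines)
print(f"Fraction of bases at or above Q30: {q30_fraction:.2f}")
```

A clinical pipeline could compare such a fraction against a predefined acceptance threshold before mapping, although the appropriate threshold is a laboratory decision that this sketch does not address.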

3.5. Algorithms and software

There has been an enormous body of computational biology

research on the identification of mutations, copy number

changes and quantification of gene expression from tumor

samples. However, to be truly useful in the clinic, the methods must be used in conjunction with strict controls on data quality and coverage, and with an understanding of the assumptions under which the algorithms were developed.
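To make the dependence on coverage concrete, the following Python sketch estimates the probability of observing enough variant-supporting reads at a given depth. It assumes a simple binomial sampling model with no sequencing error, a heterozygous mutation in a pure diploid sample, and an arbitrary minimum of 5 supporting reads; these are illustrative assumptions and not the statistical model of the studies cited in this section.

```python
from math import comb


def detection_power(depth, variant_fraction, min_variant_reads):
    """P(at least min_variant_reads variant reads) under Binomial(depth, variant_fraction)."""
    if not 0.0 <= variant_fraction <= 1.0:
        raise ValueError("variant_fraction must be between 0 and 1")
    # Sum the probabilities of seeing fewer than the required number of reads.
    p_too_few = sum(
        comb(depth, k)
        * variant_fraction ** k
        * (1.0 - variant_fraction) ** (depth - k)
        for k in range(min_variant_reads)
    )
    return 1.0 - p_too_few


# A heterozygous mutation present in 20% of cells is expected in about 10% of
# reads (assumption: diploid locus, no normal contamination beyond that 80%).
expected_fraction = 0.20 / 2
for depth in (30, 50, 100, 200):
    power = detection_power(depth, expected_fraction, min_variant_reads=5)
    print(f"{depth}X coverage: detection power {power:.2f}")
```

Under these assumptions the power rises steeply with depth, which is consistent in spirit with the recommendation for deeper coverage of heterogeneous tumor samples, although the exact values depend on the chosen read threshold.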

Phred quality scores (Ewing and Green, 1998; Ewing et al.,

1998) were originally introduced by Phil Green to assess the

probability of accurately calling a base in capillary sequencers.

Li et al. (Li et al., 2008a) introduced the concept of using these

quality scores in a consensus nucleotide calling algorithm to

call mutations and single nucleotide polymorphisms. These Phred scores (Ewing and Green, 1998; Ewing et al., 1998) should be 30 or higher, which corresponds to an estimated base-calling error probability of at most 1 in 1,000. Additionally, enough reads must align to each called base for accurate genotyping. To achieve an error rate of 1 genotyping error in 1 million calls, Ajay et al. (2011) required at least 50X coverage of a clinical blood sample. Other reports use a higher threshold of 100X for calling genotypes (Carter et al., 2012). Kohlmann et al. (2011) recently reported on a study across 10 different laboratories for targeted mutation screening using sequencing for clinical applications. They reported a high concordance, including a robust detection of novel variants, which were undetected by standard Sanger sequencing. Additionally, they demonstrated sensitivity to detect low-level variants present at 1–2% frequency. By comparison, the detection threshold for traditional Sanger-based sequencing is 20%, which demonstrates the power of the next generation sequencing technologies.

For normal diploid genomes, Ajay et al. (2011) reported that the conventional read depth of ~30X coverage has produced high quality genotypes, but stringent filters for data quality allowed them to accurately genotype only 30% of the genome at 30X mean coverage of the whole genome. They demonstrated that an average coverage depth of 50X was necessary to accurately genotype 95% of the genome. However, challenges exist in calling mutations from cancer genomes, as consensus calling algorithms used in this field typically assume that the genome is diploid, which is not true for large segments of cancer genomes. Detection of somatic alterations in tumor biopsy samples is complicated both by the presence of normal cells in the biopsied tissue (purity) and by the presence of multiple clonal subpopulations within tumor cells. These factors affect the depth of sequencing required to call clonal mutations at sufficient power (>0.8) in each sample, with greater than 100-fold coverage required to detect mutations that may be present in around 20% of tumor cells (Carter et al., 2012). Finally, to differentiate tumor specific mutations from the 3–4 million naturally occurring variations/single nucleotide polymorphisms, a normal sample (blood or saliva) from the same patient would need to be sequenced. The presence of copy number aberrations in tumor samples adds an additional layer of complexity in determining allele states.

Table 3 – Quality control checkpoints for sequencing based measurements.

Steps | Checkpoints | Quality control
Sample processing | Ensure percentage invasive component | H&E staining
 | Heterogeneity in positively stained cells | H&E staining
DNA sequencing and alignment | Ensure base call quality | FASTQC Phred score of Q30
 | Sensitivity and robustness of algorithms that estimate abundance levels of transcripts | Sequencing depth (fold coverage)
Detection/quantification | Sensitivity and robustness of algorithms that estimate abundance levels of transcripts | Sequencing depth (>50 M mapped)
 | Mutation calling for heterogeneous samples | Fold coverage (100X coverage)
 | Differentiating mutations from variations/polymorphisms | Comparison with normal tissue from same patient
 | Identifying true copy number changes | Mappability of region; differentiating paralogous regions
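The base-quality and coverage thresholds discussed above can be made concrete with a short calculation. The following Python sketch is illustrative only: the Phred conversion is the standard definition, whereas the binomial detection model, the function names and the calling threshold of three variant reads are simplifying assumptions introduced here, not the method of Carter et al. (2012) or of any specific variant caller.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Error probability implied by a Phred score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Phred score implied by an error probability: Q = -10 * log10(P)."""
    return -10 * math.log10(p)


def detection_power(depth: int, allele_fraction: float, min_alt_reads: int) -> float:
    """Probability of seeing at least `min_alt_reads` variant-supporting reads
    at a site covered by `depth` reads, under a simple binomial model.

    Assumptions: reads sample alleles independently and sequencing errors are
    ignored; `min_alt_reads` is a hypothetical calling threshold.
    """
    p_too_few = sum(
        math.comb(depth, k)
        * allele_fraction ** k
        * (1 - allele_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1 - p_too_few


# Q30 corresponds to an estimated error rate of 1 in 1,000 base calls.
q30_error = phred_to_error_probability(30)

# Hypothetical scenario: a heterozygous mutation in a copy-neutral region,
# present in 20% of the cells in the sample, has an expected variant
# allele fraction of about 0.20 / 2 = 0.10.
power_30x = detection_power(depth=30, allele_fraction=0.10, min_alt_reads=3)
power_100x = detection_power(depth=100, allele_fraction=0.10, min_alt_reads=3)

print(f"Q30 error probability: {q30_error:.4f}")
print(f"power at 30X: {power_30x:.2f}, power at 100X: {power_100x:.2f}")
```

Under these assumptions the power is roughly 0.59 at 30X and above 0.99 at 100X, which points in the same direction as the >0.8 power criterion cited above. Lower tumor purity, copy number changes and sequencing error would all change these numbers, so the sketch should be read as an order-of-magnitude illustration only.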





Several methods have been developed for abundance esti-

mation of genes (Baba et al., 2006; Karst et al., 2011; Orth et al.,

2010) and isoforms as well as for the detection of fusion-

transcripts (Voutilainen et al., 2006; Latoszek-Berendsen

et al., 2010). The methods differ in the units used to express transcript abundance and also show differing sensitivity to sequencing parameters such as read length, read quality, and insert size for paired-end reads. Standardization of transcript abundance units, together with evaluation of sensitivity and robustness in both technical and biological replicates, is needed before clinical adoption. Additional quality control steps are also necessary to assure optimal sample processing before proceeding with sequencing. Sequencing technologies could provide a new ‘gold standard’ if they provide more accurate prediction of clinical outcome.
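To illustrate why the choice of abundance unit matters, the sketch below computes two widely used units, RPKM (reads per kilobase per million mapped reads) and TPM (transcripts per million), from the same read counts. The text above does not name specific units, so the choice of these two is an assumption made here, and the gene names, counts and transcript lengths are hypothetical values for illustration only.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads.

    Assumption: the mapped-read total is the sum of the counts provided.
    """
    total_reads = sum(counts.values())
    return {
        gene: count * 1e9 / (lengths_bp[gene] * total_reads)
        for gene, count in counts.items()
    }


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rates = {gene: count / lengths_bp[gene] for gene, count in counts.items()}
    total_rate = sum(rates.values())
    return {gene: rate * 1e6 / total_rate for gene, rate in rates.items()}


# Hypothetical read counts and transcript lengths, for illustration only.
counts = {"geneA": 500, "geneB": 300, "geneC": 200}
lengths_bp = {"geneA": 2000, "geneB": 3000, "geneC": 1000}

rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)
for gene in counts:
    print(gene, round(rpkm_values[gene], 1), round(tpm_values[gene], 1))
```

Within one sample the two units rank genes identically, but TPM values always sum to one million while RPKM totals differ from sample to sample. A classifier threshold defined in one unit therefore cannot be applied unchanged to values expressed in the other.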

In order to translate an existing multigene signature (e.g.

Mammaprint, OncotypeDx, ColoPrint, Genomic Grade Index)

whose technical validation and medical utility has already

been established, it will be critical to show that the gene expression read-out using RNA-seq is technically equivalent to the

original microarray or RT-PCR based assay. In comparing num-

ber of sequence reads mapped to each gene and the corre-

sponding absolute intensities from array (normalized) it was

found that the correlation is greater for genes that are mapped

to by large numbers of sequence reads (Wang et al., 2009;

Marioni et al., 2008a). Marioni et al. found 81% overlap when

the differentially-expressed genes from the two technologies

were compared (Marioni et al., 2008b). This suggests that the

expression values from RNA-seq will need some mathematical

transformations for certain genes to fit the established classi-

fier. We recently reported the comparison of RNA-seq with

standardized clinical measures of ER, PR and HER2 by immunohistochemistry (IHC) or fluorescence in situ hybridization (Kamalakaran et al., 2011). We found that RNA-seq measurements were 100% concordant with IHC for ER, while PR and HER2 were 80–90% concordant and sensitive to sample quality

(heterogeneity and percentage invasive component).

In addition to establishing technical equivalence, steps

have to be taken toward analytical validation. To establish

robust performance of RNA-seq-based analysis, the process-

ing pipeline has to be repeated several times with the same

samples with low variability. Also, the sequencing based

assay and the existing assay (e.g. PCR, RT-PCR, microarray)

should be evaluated on the same set of samples with the

intended focused objective to avoid confounding issues. Stan-

dardization of RNA-seq protocols and such evaluations of

reproducibility will be key in transitioning into this new tech-

nology (Simon, 2005) and provide a path to market via CLIA

certification and/or FDA pre-market reviews.
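The two evaluations described above, repeatability of the pipeline on the same samples and agreement with an existing assay on a shared sample set, can be summarized with simple statistics. The following Python sketch uses the coefficient of variation and percent agreement as example metrics. Both the choice of metrics and all input values are assumptions made for illustration and are not prescribed by the cited guidelines.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless)."""
    return statistics.stdev(values) / statistics.mean(values)


def percent_agreement(calls_a, calls_b):
    """Percentage of samples on which two assays give the same call."""
    if len(calls_a) != len(calls_b):
        raise ValueError("both assays must be run on the same samples")
    matches = sum(a == b for a, b in zip(calls_a, calls_b))
    return 100.0 * matches / len(calls_a)


# Hypothetical expression values for one gene from three repeated runs
# of the same sample through the same pipeline.
replicate_values = [9.0, 10.0, 11.0]
cv = coefficient_of_variation(replicate_values)

# Hypothetical positive/negative calls from a sequencing based assay and an
# existing assay on the same four samples.
seq_calls = ["pos", "neg", "pos", "pos"]
reference_calls = ["pos", "neg", "neg", "pos"]
agreement = percent_agreement(seq_calls, reference_calls)

print(f"CV across replicates: {cv:.2f}; agreement: {agreement:.0f}%")
```

In practice, acceptance limits for such metrics would need to be fixed in advance as part of the validation plan.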

3.6. Clinical grade software and tools – ensuring quality, reproducibility, accuracy and usability

As described above, most existing software tools for next generation sequencing have been developed primarily in the research context. Many of these algorithms will need to be validated and quality tested for robustness before routine clinical use. Clinical decision support software that translates gigabytes of sequencing data into clinically actionable decisions is still at an embryonic stage. The bioinformatic analysis and interpretation require both standardized

genomic “content” and a host of interpretation and knowledge

discovery tools. A major challenge will be in presenting the

large volume of data that would be available from whole

genome sequencing. Databases such as MutaDATABASE

(Bale et al., 2011), a standardized and centralized warehouse

to hold disease associated variants, would be useful in prioritizing the mutations that are identified. A similar standardized database for annotating copy number variations

does not exist today. For the oncology experts, the decision

support is usually performed by a team of bioinformaticians,

statisticians and genetic counselors. The importance and rele-

vance of the mutations must be ascertained and presented to

the oncologist in an intuitive manner. Berg et al. (2011) pro-

posed a three tiered system that classifies each mutation/

variant data as clinically useful, clinically valid or clinical

implication unknown. This system would allow prioritization

of the data into useful and clinically actionable pieces of information that can then be used in disease management.

Visualizing results of sequencing analysis is currently done

on an ad hoc basis using a collection of different software tools.

OncoPrints from cBio is a tool for visualizing genomic alter-

ations, including somatic mutations, copy number alter-

ations, and mRNA expression changes across a set of

patients. Tools such as the Integrative Genomics Viewer from

the Broad Institute or the cBio cancer genomics portal from

MSKCC are meant for exploratory use for clinical discovery

studies. There is a need for an integrated software solution

that would analyze and store sequence data and present useful

genomic features for clinical actions.

4. Challenges for clinical adoption

In addition to introducing a new data type and a new modality in the clinic, sequencing will also pose several new challenges for implementation, adoption, and use in the clinical environment. The Next-generation Sequencing: Standardization of Clinical Testing (Nex-StoCT) workgroup in conjunction with the US Centers for Disease

Control and Prevention (CDC) has taken steps to define tech-

nical process elements to assure the analytical validity and

compliance of NGS tests with existing regulatory and profes-

sional quality standards (Gargis et al., 2012). These guidelines

were drafted to ensure reliable next-generation sequencing

(NGS) based testing and its application for clinically useful

decision making. These guidelines “address four topics that

are components of quality management in a clinical environ-

ment: (i) test validation, (ii) quality control (QC) procedures to

assure and maintain accurate test results, (iii) the indepen-

dent assessment of test performance through proficiency

testing (PT) or alternative approaches and (iv) reference mate-

rials (RMs)”. These recommendations are a good framework to

build upon for ensuring clinically meaningful use of NGS

technologies.

4.1. Molecular oncology education and expertise

The next generation technologies will require not only hard-

ware and software resources but also human expertise. Even





with bioinformatics resources, the interpretation of test re-

sults will require a sophisticated set of experts with knowl-

edge in molecular biology, biostatistics, bioethics and

medicine. The new era of the ‘molecular oncologist’ will

soon be upon us where students receive training not only in

medical disciplines and basic biology, but also in the interpre-

tation of high dimensional data.

4.2. Regulatory approval and clinical guidelines

Providers of clinical sequencing services (Ambry Genetics, Aliso Viejo, CA, and Foundation Medicine, Cambridge, MA) have offered their tests as Laboratory Developed Tests (LDTs) under Clinical Laboratory Improvement Amendments (CLIA) certification. However, there are no clear guidelines of operation for next generation sequencing based tests. Recently, the College of American Pathologists (CAP) updated the Molecular Pathology checklists of its CAP Accreditation Program to include a section on next generation sequencing based assays. The CAP also updated its master activity menu to include specific Test/Activity codes for the use of next generation sequencing. The checklist provides a framework for documenting processes and ensuring quality for both the analytical wet bench process of sample preparation and sequence generation and the bioinformatics process/pipeline of sequence alignment, annotation and variant calling. We believe these guidelines and processes would allow sequencing based tests to be more reliable and allow for wide deployment into clinical practice. However, these guidelines would need to be updated to include not just mutation/variant calling based NGS tests, but also RNA-seq and DNA copy number based tests.

4.3. Data storage

Raw sequencing runs generate hundreds of gigabytes of data

from a single measurement, and thus will surpass existing

clinical data management infrastructure by one or more or-

ders of magnitude. If follow-up screening measurements or

serial measurements of disease status will be performed,

sequencing data will easily be the most data rich modality

used clinically in the future. With large quantities of data

comes the need for computational resources. Currently, anal-

ysis of sequencing data can take days on large computer clus-

ters and is typically offloaded to a computational cloud whose

capacity is unlikely to be met by resources currently in place in

the clinic (Stein, 2010). It is expected that technological ad-

vances in cloud computing and cloud storage will become

available to meet the needs for the clinical setting, hiding

the details from the end user via a service that provides man-

agement and access to this data. However, this will initially

augment concerns over data safety and security, liability,

and compliance with regulatory requirements, such as how

data is stored, which parties own and have access, and details

of data deletion, archiving, and retrieval.

Long-term storage and clinical utility of sequencing data in

diagnosis, therapy planning and therapy monitoring will

require standards that will ensure interoperability with the

electronic healthcare records. HL7 has a Clinical Genomics Working Group (HL7CG) that creates and promotes standards for communicating clinical and personalized genomic data between interested parties. The goal is the personalization of the genomic data (capturing differences in an individual’s genome) and its linking to relevant clinical information. The HL7CG will develop and document scenarios and use cases in clinical genomics to determine what data needs to be exchanged. It also reviews existing genomics standard formats such as BSML (Bioinformatics Sequence Markup Language), MAGE-ML (Microarray and Gene Expression Markup Language), LSID (Life Science Identifier) and others.

Publicly funded efforts such as iRODS and Galaxy are currently helping the community to cope with the massive sharing of data and procedural knowledge about downstream sequencing analysis. With the pipelines for RNA-Seq, ChIP-Seq, exomic and full sequence analysis, the operational question is how to annotate terabytes, petabytes and exabytes of data and how to organize the data for downstream analysis. iRODS (Integrated Rule Oriented Data System) is a technical solution for metadata-driven management of files that come from different domains, reside on different devices, and are under the control of different groups. It manages descriptive metadata about each item and supports duplicate detection, archiving, data migration, access controls, authorization, and integrity checks, as well as enforcement of management policies for any desired property (e.g. rules regarding privacy for different consent types). Galaxy (http://galaxyproject.org) is an open, web-based platform for data intensive biomedical research. It is a software system that supports such research through a framework that gives experimentalists simple interfaces to powerful tools, while automatically managing the computational details. Galaxy is distributed both as a publicly available Web service, which provides tools for the analysis of genomic, comparative genomic, and functional genomic data, and as a downloadable package that can be deployed in individual laboratories. One of the early projects to fully deploy large scale medical sequencing as a productive and critical component in genomic medicine is the ClinSeq project. The ClinSeq project attempts to address issues related to the genetic architecture of disease, implementation of genomic technology, informed consent, disclosure of genetic information, and informatics challenges in archiving, analyzing, and displaying sequence data (Biesecker et al., 2009).

4.4. Privacy and confidentiality

The implementation of whole genome analysis presents several challenges for privacy and security. First, the amount of data being generated demands large computing resources not

available in most institutions. There are ongoing efforts to

ensure security of personal medical information (PMI) in

computing ‘clouds’.

In addition to security of PMI, the implications of

sequencing whole genomes on patients and their families

cannot be overlooked. In addition to tumor sequence, it is

clear that host sequence will also be produced by next gener-

ation sequencing technologies. The issues that have become

well studied in the field of human genetics are amplified in

this context, where all known heritable mutations and polymorphisms in a particular patient’s genome will become available through sequencing of their tumor sample. For

a number of reasons, including increased risk of bias,

discrimination, and stigma, genetic privacy and confidenti-

ality are sometimes thought to be more important than pri-

vacy and confidentiality in other kinds of research (Esposito

and Goodman, 2009). Clearly, safeguards must be put in place

before such sensitive information is generated and patients

must be counseled and consented appropriately regarding the known and theoretical risks this knowledge carries. For example, potential participants must be informed about which entities and persons will have access to the data. This might include investigators at other institutions, corporate sponsors, a government, employers, etc. If information obtained during research will be placed in a patient’s medical record, this too must be disclosed. Subjects should also be told of the risks of others having access to their genetic information.

The growth of bioinformatics or computational genomics

makes it clear that, in the near future, the concern will not

be so much with stored biological samples as with digitized samples, that is, electronic data that can be stored, transmitted, and

analyzed with new ease and power. It is important for institu-

tions to consider policies surrounding the use of genetic infor-

mation (Massoudi et al., 2011). These processes should

address data collection and management, encryption,

destruction of specimens and/or genetic information, and

loss of data. Researchers and research ethics reviewers should

address the issue of clinically suspicious or significant

incidental findings, and whether and how they will be communicated to subjects. Incidental findings can be of great interest to subjects, and a comprehensive consent process should make clear whether such findings will be disclosed. In the United States, the passage of the Genetic Information Nondiscrimination Act (GINA) in 2008 (http://www.genome.gov/24519851) has provided, at least in principle, sweeping protections for patients and subjects. GINA prohibits discrimination in healthcare insurance and employment based on genetic information. However, the extent to which GINA changes or reduces the risks of participation in genetic or genomic research should be included in the consent process.

Figure 3 – The current status and future vision for oncology – replacement of a multitude of single gene/panel tests by one comprehensive molecular profile.

4.5. Reimbursement

CPT coding needs to evolve in order to encompass the existing single gene tests and the addition of panels of genes and multiplexed tests as well as translated molecular signatures in a manner that also makes economic sense. CPT codes have been developed in the last two decades in two major directions: the first one in microbiology testing and the second one for inherited diseases and cancer. To date, 466 descriptors for molecular pathology services have been drafted by the Molecular Pathology Working Group (MPWG) of the AMA. Most are currently accepted for the CPT code set. Availability of the technology does not immediately translate into wide clinical use, since healthcare providers and payors need to have well established policies about the medical necessity of complex sequencing tests. The MPWG has organized two groups of CPT codes: tier 1, which provides codes for specific procedures performed in high volume (e.g. KRAS mutations), and tier 2, which provides codes for a large number of less commonly performed tests requiring more technical and professional resources (similar to the six CPT levels used for Surgical Pathology services). For example, level 2 includes trinucleotide repeat disorders; level 4, sequencing of a single target/exon; level 9 covers full sequencing of genes with more than 50 exons. It is envisioned that next generation sequencing may exist within existing tiers as medical necessity is established. Multianalyte assays with algorithmic procedures are also likely to be further specified in the context of next generation sequencing tests.

5. Conclusion

The practice of oncology is being transformed by the vast amount of knowledge gained from high throughput molecular profiling technologies. The big challenge before us is to distinguish the potentially actionable information that can be gleaned from these data from the information that provides insights into tumor biology. The former is immediately actionable and would make a difference for the individual patient at hand, while the latter would help devise therapeutic strategies for other patients. It is imperative that these cases are distinguished and that proper ethical guidelines are set up to enable oncologists to provide better care for their patients. The collection of the complete molecular profile of the tumor sample would provide an avenue for oncologists to constantly query the patient profile and update it as and when new relevant information becomes available. For example, a report on a new clinical trial targeting a mutation/aberration present in the patient would allow the oncologist to switch or update the treatment protocols and improve outcomes for the patient (Figure 3).

Conflicts of interest

W.R.M. has participated in Illumina sponsored meetings over

the past four years and received travel reimbursement and an

honorarium for presenting at these events. Illumina had no

role in decisions relating to the study/work to be published,

data collection and analysis of data and the decision to publish.

W.R.M. has participated in Pacific Biosciences sponsored

meetings over the past three years and received travel reim-

bursement for presenting at these events.

W.R.M. is a founder and shareholder of Orion Genomics,

which focuses on plant genomics and cancer genetics.

Acknowledgment

W.R.M. received support from the Cancer Center Support

Grant (CA045508) from the NCI.

References

Ajay, S.S., Parker, S.C., Ozel Abaan, H., Fuentes Fajardo, K.V., Margulies, E.H., 2011. Accurate and comprehensive sequencing of personal genomes. Genome Res. 9, 1498–1505.

Baba, F., Swartz, K., van Buren, R., et al., 2006. Syndecan-1 and syndecan-4 are overexpressed in an estrogen receptor-negative, highly proliferative breast carcinoma subtype. Breast Cancer Res. Treat. 98, 91–98.

Bale, S., Devisscher, M., Van Criekinge, W., et al., 2011. MutaDATABASE: a centralized and standardized DNA variation database. Nat. Biotechnol. 29, 117–118.

Barbashina, V., Salazar, P., Holland, E.C., Rosenblum, M.K., Ladanyi, M., 2005. Allelic losses at 1p36 and 19q13 in gliomas: correlation with histologic classification, definition of a 150-kb minimal deleted region on 1p36, and evaluation of CAMTA1 as a candidate tumor suppressor gene. Clin. Cancer Res. 11, 1119–1128.

Barros-Silva, J.D., Leitao, D., Afonso, L., et al., 2009. Association of ERBB2 gene status with histopathological parameters and disease-specific survival in gastric carcinoma patients. Br. J. Cancer 100, 487–493.

Berg, J.S., Khoury, M.J., Evans, J.P., 2011. Deploying whole genome sequencing in clinical practice and public health: meeting the challenge one bin at a time. Genet. Med. 13, 499–504.

Biesecker, L.G., Mullikin, J.C., Facio, F.M., et al., 2009. The ClinSeq Project: piloting large-scale genome sequencing for research in genomic medicine. Genome Res. 19, 1665–1674.

Biesecker, L.G., Burke, W., Kohane, I., Plon, S.E., Zimmern, R., 2012. Next-generation sequencing in the clinic: are we ready? Nat. Rev. Genet. 13, 818–824.

Bleeker, F.E., Felicioni, L., Buttitta, F., et al., 2008. AKT1(E17K) in human solid tumours. Oncogene 27, 5648–5650.

Bonin, S., Hlubek, F., Benhattar, J., et al., 2010. Multicentre validation study of nucleic acids extraction from FFPE tissues. Virchows Arch. 457, 309–317.

Brose, M.S., Volpe, P., Feldman, M., et al., 2002. BRAF and RAS mutations in human lung cancer and melanoma. Cancer Res. 62, 6997–7000.

Carter, S.L., Cibulskis, K., Helman, E., et al., 2012. Absolute quantification of somatic DNA alterations in human cancer. Nat. Biotechnol. 30, 413–421.

Curtis, C., Shah, S.P., Chin, S.-F., et al., 2012. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature. advance Online Publ.

Davies, H., Bignell, G.R., Cox, C., et al., 2002. Mutations of the BRAF gene in human cancer. Nature 417, 949–954.

De Roock, W., De Vriendt, V., Normanno, N., Ciardiello, F., Tejpar, S., 2011. KRAS, BRAF, PIK3CA, and PTEN mutations: implications for targeted therapies in metastatic colorectal cancer. Lancet Oncol. 12, 594–603.

Ellis, M.J., Perou, C.M., 2013. The genomic landscape of breast cancer as a therapeutic roadmap. Cancer Discov. 3, 27–34.

Esposito, K., Goodman, K., 2009. Genethics 2.0: phenotypes, genotypes, and the challenge of databases generated by personal genome testing. Am. J. Bioeth. 9, 19–21.

Ewing, B., Green, P., 1998. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8, 186–194.

Ewing, B., Hillier, L., Wendl, M.C., Green, P., 1998. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8, 175–185.

Fairley, J.A., Gilmour, K., Walsh, K., 2012. Making the most of pathological specimens: molecular diagnosis in formalin-fixed, paraffin embedded tissue. Curr. Drug Targets 13 (12), 1475–1487.

Filho, O.M., Ignatiadis, M., Sotiriou, C., 2011. Genomic grade index:an important tool for assessing breast cancer tumor grade andprognosis. Crit. Rev. Oncol. Hematol. 77, 20e29.

Flaherty, K.T., Robert, C., Hersey, P., et al., 2012. Improved survivalwith MEK inhibition in BRAF-mutated melanoma. N. Engl. J.Med..

Gargis, A.S., Kalman, L., Berry, M.W., et al., 2012. Assuring thequality of next-generation sequencing in clinical laboratorypractice. Nat. Biotechnol. 30, 1033e1036.

Hollink, I.H., Zwaan, C.M., Zimmermann, M., et al., 2009.Favorable prognostic impact of NPM1 gene mutations inchildhood acute myeloid leukemia, with emphasis oncytogenetically normal AML. Leukemia 23, 262e270.

Irahara, N., Baba, Y., Nosho, K., et al., 2010. NRAS mutations arerare in colorectal cancer. Diagn. Mol. Pathol. 19, 157e163.

Kamalakaran, S., Lezon-Geyda, K., Varadan, V., et al., 2011.Evaluation of ER/PR and HER2 status by RNA sequencing intissue core biopsies from preoperative clinical trial specimens.J. Clin. Oncol..

Karst, A.M., Levanon, K., Drapkin, R., 2011. Modeling high-gradeserous ovarian carcinogenesis from the fallopian tube. Proc.Natl. Acad. Sci. United State. America 108, 7547e7552.

Kohlmann, A., Klein, H.U., Weissmann, S., et al., 2011. Theinterlaboratory RObustness of Next-generation sequencing(IRON) study: a deep sequencing investigation of TET2, CBLand KRAS mutations by an international consortium involving10 laboratories. Leukemia 25 (12), 1840e1848.

Kohno, T., Ichikawa, H., Totoki, Y., et al., 2012. KIF5B-RET fusionsin lung adenocarcinoma. Nat. Med. 18, 375e377.

Kompier, L.C., Lurkin, I., van der Aa, M.N., van Rhijn, B.W., vander Kwast, T.H., Zwarthoff, E.C., 2010. FGFR3, HRAS, KRAS,NRAS and PIK3CA mutations in bladder cancer and theirpotential as biomarkers for surveillance and therapy. PLoSOne 5, e13821.

Kwak, E.L., Bang, Y.J., Camidge, D.R., et al., 2010. Anaplasticlymphoma kinase inhibition in non-small-cell lung cancer.N. Engl. J. Med. 363, 1693e1703.

Lamy, P.J., Fina, F., Bascoul-Mollevi, C., et al., 2011. Quantificationand clinical relevance of gene amplification at chromosome17q12-q21 in human epidermal growth factor receptor 2-amplified breast cancers. Breast Cancer Res. BCR 13, R15.

Latoszek-Berendsen, A., Tange, H., van den Herik, H.J.,Hasman, A., 2010. From clinical practice guidelines tocomputer-interpretable guidelines. A literature overview.Methods Inf. Med. 49, 550e570.

Lee, J.H., Choi, J.W., Kim, Y.S., 2011. Frequencies of BRAF andNRAS mutations are different in histological types and sites oforigin of cutaneous melanoma: a meta-analysis. Br. J.Dermatol. 164, 776e784.

Liedtke, C., Hatzis, C., Symmans, W.F., et al., 2009. Genomic gradeindex is associated with response to chemotherapy in patientswith breast cancer. J. Clin. Oncol. 27, 3185e3191.

Linardou, H., Dahabreh, I.J., Bafaloukos, D., Kosmidis, P.,Murray, S., 2009. Somatic EGFR mutations and efficacy oftyrosine kinase inhibitors in NSCLC. Nat. Rev. Clin. Oncol. 6,352e366.

Lipson, D., Capelletti, M., Yelensky, R., et al., 2012. Identificationof new ALK and RET gene fusions from colorectal and lungcancer biopsies. Nat. Med..

Li, H., Ruan, J., Durbin, R., 2008a. Mapping short DNA sequencingreads and calling variants using mapping quality scores.Genome Res. 18, 1851e1858.

Li, S., Kralovics, R., De Libero, G., Theocharides, A., Gisslinger, H.,Skoda, R.C., 2008b. Clonal heterogeneity in polycythemia verapatients with JAK2 exon12 and JAK2-V617F mutations. Blood111, 3863e3866.

Lynch, T.J., Bell, D.W., Sordella, R., et al., 2004. Activatingmutations in the epidermal growth factor receptor underlying

responsiveness of non-small-cell lung cancer to gefitinib.N. Engl. J. Med. 350, 2129e2139.

Ma, W., Kantarjian, H., Zhang, X., et al., 2009. Mutation profile ofJAK2 transcripts in patients with chronic myeloproliferativeneoplasias. J. Mol. Diagn. 11, 49e53.

MacConaill, L.E., Campbell, C.D., Kehoe, S.M., et al., 2009. Profilingcritical cancer gene mutations in clinical tumor samples. PLoSOne 4, e7887.

Maher, B., 2011. Human genetics: genomes on prescription.Nature 478, 22e24.

Mardis, E.R., Ding, L., Dooling, D.J., et al., 2009. Recurringmutations found by sequencing an acute myeloid leukemiagenome. N. Engl. J. Med. 361, 1058e1066.

Marioni, John C., M, C.E., Shrikant, M. Mane, Stephens, Matthew,Gilad, Yoav, 2008a. RNA-seq: an assessment of technicalreproducibility and comparison with gene expression arrays.Genome Res. 18, 1518e1519.

Marioni, J.C., Mason, C.E., Mane, S.M., Stephens, M., Gilad, Y.,2008b. RNA-seq: an assessment of technical reproducibilityand comparison with gene expression arrays. Genome Res. 18,1509e1517.

Massoudi, B.L., Goodman, K.W., Gotham, I.J., et al., 2011. Aninformatics agenda for public health: summarizedrecommendations from the. AMIA PHI Conference. (J Am MedInform Assoc..

McDermott, U., Downing, J.R., Stratton, M.R., 2011. Genomics andthe continuum of cancer care. N. Engl. J. Med. 364, 340e350.

Melck, A.L., Yip, L., Carty, S.E., 2010. The utility of BRAF testing inthe management of papillary thyroid cancer. Oncologist 15,1285e1293.

Mitelman, F., Johansson, B., Mertens, F., 2007. The impact oftranslocations and gene fusions on cancer causation. Nat. Rev.Cancer 7, 233e245.

Monzon, F.A., Ogino, S., Hammond, M.E., Halling, K.C.,Bloom, K.J., Nikiforova, M.N., 2009. The role of KRAS mutationtesting in the management of patients with metastaticcolorectal cancer. Arch. Pathol. Lab. Med. 133, 1600e1606.

Motyckova, G., Stone, R.M., 2010. The role of molecular tests inacute myelogenous leukemia treatment decisions. Curr.Hematol. Malig Rep. 5, 109e117.

Nekrutenko, A., Taylor, J., 2012. Next-generation sequencing datainterpretation: enhancing reproducibility and accessibility.Nat. Rev. Genet. 13, 667e672.

Ormond, K.E., Wheeler, M.T., Hudgins, L., et al., 2010. Challengesin the clinical application of whole-genome sequencing.Lancet 375, 1749e1751.

Orth, J.D., Thiele, I., Palsson, B.O., 2010. What is flux balanceanalysis? Nat. Biotechnol. 28, 245e248.

Paez, J.G., Janne, P.A., Lee, J.C., et al., 2004. EGFR mutations in lungcancer: correlation with clinical response to gefitinib therapy.Science 304, 1497e1500.

Pao, W., Girard, N., 2011. New driver mutations in non-small-celllung cancer. Lancet Oncol. 12, 175e180.

Parker, J.S., Mullins, M., Cheang, M.C., et al., 2009. Supervised riskpredictor of breast cancer based on intrinsic subtypes. J. Clin.Oncol. 27, 1160e1167.

Perou, C.M., Sorlie, T., Eisen, M.B., et al., 2000. Molecular portraitsof human breast tumours. Nature 406, 747e752.

Petitjean, A., Achatz, M.I., Borresen-Dale, A.L., Hainaut, P.,Olivier, M., 2007. TP53 mutations in human cancers:functional selection and impact on cancer prognosis andoutcomes. Oncogene 26, 2157e2165.

Plesec, T.P., Hunt, J.L., 2009. KRAS mutation testing in colorectalcancer. Adv. Anat. Pathol. 16, 196e203.

Reichardt, P., 2010. Optimal use of targeted agents for advancedgastrointestinal stromal tumours. Oncology 78, 130e140.

Robinson, D.R., Kalyana-Sundaram, S., Wu, Y.M., et al., 2011. Functionally recurrent rearrangements of the MAST kinase and Notch gene families in breast cancer. Nat. Med. 17, 1646–1651.

Roychowdhury, S., Iyer, M.K., Robinson, D.R., et al., 2011. Personalized oncology through integrative high-throughput sequencing: a pilot study. Sci. Transl. Med. 3, 111ra21.

Ryu, K., Park, C., Lee, Y., 2011. Hypoxia-inducible factor 1 alpha represses the transcription of the estrogen receptor alpha gene in human breast cancer cells. Biochem. Biophysical Res. Commun. 407, 831–836.

Sasaki, S., Kitagawa, Y., Sekido, Y., et al., 2003. Molecular processes of chromosome 9p21 deletions in human cancers. Oncogene 22, 3792–3798.

Schroeder, A., Mueller, O., Stocker, S., et al., 2006. The RIN: an RNA integrity number for assigning integrity values to RNA measurements. BMC Mol. Biol. 7, 3.

Silver, D.P., Richardson, A.L., Eklund, A.C., et al., 2010. Efficacy of neoadjuvant Cisplatin in triple-negative breast cancer. J. Clin. Oncol. 28, 1145–1153.

Simon, R., 2005. Roadmap for developing and validating therapeutically relevant genomic classifiers. J. Clin. Oncol. 23, 7332–7441.

Sorlie, T., Perou, C.M., Tibshirani, R., et al., 2001. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc. Natl. Acad. Sci. U S A 98, 10869–10874.

Soulieres, D., Greer, W., Magliocco, A.M., et al., 2010. KRAS mutation testing in the treatment of metastatic colorectal cancer with anti-EGFR therapies. Curr. Oncol. 17 (Suppl 1), S31–S40.

Stein, L.D., 2010. The case for cloud computing in genome informatics. Genome Biol. 11, 207.

Stransky, N., Egloff, A.M., Tward, A.D., et al., 2011. The mutational landscape of head and neck squamous cell carcinoma. Science 333 (6046), 1157–1160.

Tomlins, S.A., Rhodes, D.R., Perner, S., et al., 2005. Recurrent fusion of TMPRSS2 and ETS transcription factor genes in prostate cancer. Science 310, 644–648.

Treangen, T.J., Salzberg, S.L., 2011. Repetitive DNA and next-generation sequencing: computational challenges and solutions. Nat. Rev. Genet. 13, 36–46.

van Eijk, R., Stevens, L., Morreau, H., van Wezel, T., 2012. Assessment of a fully automated high-throughput DNA extraction method from formalin-fixed, paraffin-embedded tissue for KRAS, and BRAF somatic mutation analysis. Exp. Mol. Pathol. 94 (1), 121–125.

van ’t Veer, L.J., Dai, H., van de Vijver, M.J., et al., 2002. Gene expression profiling predicts clinical outcome of breast cancer. Nature 415, 530–536.

Voutilainen, K.A., Anttila, M.A., Sillanpaa, S.M., et al., 2006. Prognostic significance of E-cadherin-catenin complex in epithelial ovarian cancer. J. Clin. Pathol. 59, 460–467.

Wang, Z., Gerstein, M., Snyder, M., 2009. RNA-Seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 10, 57–63.

West, M., Blanchette, C., Dressman, H., et al., 2001. Predicting the clinical status of human breast cancer by using gene expression profiles. Proc. Natl. Acad. Sci. U S A 98, 11462–11467.