Molecular Oncology 7 (2013) 743-755
Available at www.sciencedirect.com
www.elsevier.com/locate/molonc
Review
Translating next generation sequencing to practice:
Opportunities and necessary steps
Sitharthan Kamalakaran a,*, Vinay Varadan a, Angel Janevski a, Nilanjana Banerjee a, David Tuck b, W. Richard McCombie c, Nevenka Dimitrova a, Lyndsay N. Harris d
a Philips Research North America, Briarcliff Manor, NY 10510, USA
b Oncology Global Clinical Research, Bristol-Myers Squibb, Princeton, NJ 08540, USA
c Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
d Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
ARTICLE INFO
Article history:
Received 20 March 2013
Accepted 21 April 2013
Available online 15 May 2013
Keywords:
Next generation sequencing
Oncology
Personalized medicine
Genomics
* Corresponding author.
1574-7891/$ - see front matter © 2013 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.molonc.2013.04.008
ABSTRACT
Next-generation sequencing (NGS) approaches for measuring RNA and DNA benefit from
greatly increased sensitivity, dynamic range and detection of novel transcripts. These tech-
nologies are rapidly becoming the standard for molecular assays and represent huge po-
tential value to the practice of oncology. However, many challenges exist in the
transition of these technologies from research application to clinical practice. This review
discusses the value of NGS in detecting mutations, copy number changes and RNA quan-
tification and their applications in oncology, the challenges for adoption and the relevant
steps that are needed for translating this potential to routine practice.
© 2013 Federation of European Biochemical Societies.
Published by Elsevier B.V. All rights reserved.
1. Introduction
The last decade of research has consolidated our understanding of cancer as a genetic disease caused by genomic disruptions ranging from single point mutations, through deletions or amplifications of chromosomal segments, to structural rearrangements that give rise to chimeric genes. The aberrations at the genomic level drive changes in gene expression, activate or silence genes and thereby perturb gene networks and pathways.
There now exists an extensive literature cataloging
genomic disruptions in cancer and their effect on the biological functions of cancer cells. Several of these disruptions are important biomarkers and impact treatment options. Estrogen receptor (ER) testing has been routinely performed on breast carcinoma samples since the 1980s to determine if hormonal therapy is indicated. Similarly, EGFR mutation status has been used to determine which lung cancer patients will benefit from agents targeting EGFR. The FDA lists more than 100 indications where pharmacogenomic testing is indicated, including 38 in oncology (http://www.fda.gov/drugs/scienceresearch/researchareas/Pharmacogenetics/ucm083378.htm). While each of these individual measures is of value, each represents only a single data point in the complex environment of cancer.
The advent of multigene analysis tools, i.e. gene arrays, CGH and next generation sequencing, has advanced the field by measuring the complexity of cancer in a more comprehensive fashion. Several multigene assays have been introduced into the clinic to predict patient outcome. Oncotype DX® (Genomic Health, Redwood, CA) quantifies the expression of 21 genes by RT-PCR and uses an algorithm to combine the expression values into a “recurrence score” to predict chemotherapy benefit for a subset of breast cancer patients. Many additional tests and markers have been reported to be of use in the management of cancer: CancerTYPE ID (bioTheranostics, Inc, San Diego, CA) to aid in the classification of the tissue of origin and tumor subtype for patients diagnosed with malignant disease, Oncotype DX Colon (Genomic Health) for assessment of risk of recurrence following surgery in stage II colon cancer patients and MammaPrint® (Agendia, Irvine, CA) for identifying risk of distant recurrence following surgery. All these tests measure genes and their expression levels through capillary sequencing, microarrays, or PCR and are standardized for measuring a small subset of the tumor genome.
As the utility of these tools increases, the number of
different tests and diagnostic providers makes it increasingly
cumbersome for pathologists and oncologists to obtain
enough sample material for analysis. Next Generation
Sequencing (NGS) technologies offer the potential to measure
and quantify all these markers at once and provide a more
complete view of the tumor’s molecular state. In addition, NGS has allowed the analysis of complete human genomes at a reasonable cost: the cost of sequencing one human genome has come down from $100 million in 2001 to just under $3000 in 2012 (www.genome.gov/sequencingcosts).
NGS can now provide the following depth and breadth of
genomic information in a single test: (Gargis et al., 2012) (i)
Whole genome information at single nucleotide level with a
complete catalog of mutations, (Ellis and Perou, 2013); (ii) A
profile of the copy number states of individual genes and
many chromosomal aberrations (Ellis and Perou, 2013); (iii)
Whole transcriptome landscape including mRNA levels of
protein coding genes, non-coding RNA, expression of repeat
rich regions, and aberrant fusion genes. We can now foresee
a scenario in which a tumor sample, once obtained through
biopsy or surgery, is used to extract DNA and RNA, sequenced
and assembled to provide the full patient genome and an
RNASeq profile of his/her transcriptome (Figure 1). This data
is then mined to catalog all mutations and copy number aber-
rations and quantify expression of genes. Once the muta-
tional, copy number and transcriptomic profile is generated,
clinical decision support algorithms are used to extract and
present useful and clinically actionable information from
the results of the analyses. This has already been shown in pi-
lot studies (Gargis et al., 2012) and is a major step forward in
realizing the potential of personalized medicine.
The comprehensive nature of NGS also has the potential to
replace a multitude of single gene tests that are currently per-
formed on multiple discrete specimens with a single test on
one specimen. This would lead to improved standardization
of tests for specific genetic abnormalities, more in depth infor-
mation for the clinicians and more cost effective molecular
diagnostic testing. Additionally, once sequenced, this genome information is “digitized and may be immortalized”, which enables the sequenced sample to be “frozen” in silico and accessible for further querying as the treatment progresses or whenever new clinically relevant aberrations are identified or reported. This is more advantageous than testing an archived tumor sample for prospective or retrospective analysis. It makes a patient genome a source of data for comprehensive genome forward and genome backward approaches. In genome backward medicine, the mutations that are identified would drive the selection of therapies based on retrospective data on prior patients with similar genomic profiles whose outcome is known. In genome forward medicine, new therapeutic strategies could be attempted (as part of clearly defined clinical trials) for patients who have failed conventional therapies, or for whom there are no clearly delineated guidelines on therapy choice, based on the comprehensive analysis of the genetic makeup of the patient’s tumor. Recently, Ellis
and Perou offered a prospective view of how genomic profiling
could help in the treatment of breast cancer (Ellis and Perou,
2013). They catalogued specific examples of mutations
(PIK3CA, BRCA1, BRCA2, GATA3, MLL gene family, rare Recep-
tor Tyrosine Kinases) and genomic abnormalities (amplifica-
tions/gain of function mutations in Her2, FGFR1, FGF3,
Cyclin D1/CDK4/CDK6, MDM2, deletions/loss of function mu-
tations in PTEN, PIK3R1) that could be used to target therapies
in breast cancers.
However, the optimal use of these novel molecular assays will be a challenge to the practicing oncologist. Many challenges remain in moving this vision from a few luminary sites and hospitals into routine medical practice. Many of the opportunities and challenges in applying next-generation sequencing for clinical applications have been reviewed elsewhere (Biesecker et al., 2012; Biesecker et al., 2009; Nekrutenko and Taylor, 2012; Treangen and Salzberg, 2011; Berg et al., 2011; Maher, 2011; McDermott et al., 2011; Ormond et al., 2010). In this review, we focus on the role of next generation sequencing technologies for cancer patients and the challenges we face in using this technology for clinical applications, and we provide a framework for oncologists to assess its promise and pitfalls in routine clinical practice. We consider three major groups of challenges for the introduction of a new technology such as sequencing into clinical practice. We outline these major steps in Figure 2.
1. Technology: the feasibility of the technology is an important but not sufficient step in this process. In this step the necessary aspects of technology reproducibility and accuracy must be met. Additionally, the cost and throughput of the technology would need to be within acceptable limits to enable wide introduction into practice, and if individual instruments are to enter the clinic, they need to pass regulatory approval (510(k) approval).
2. Clinical applicability: one would need to demonstrate clinically meaningful uses of the technology and their benefits for patients and providers. Additionally, software and algorithms that are used must be tested for usability and be quality controlled. Any tests that the technology is used to provide must demonstrate high sensitivity and specificity.
3. Adoption: the prescribed use of the technology must be described by professional organizations; regulatory approval must be obtained for clinical use. Additionally, infrastructure and standards must be set up to enable widespread use of the technology, including CPT coding of molecular services that will be intelligible to payors and will promote uniformity of coding among equipment and service providers. They will also include setting up IT, data storage and privacy, and educational programs for training healthcare professionals for appropriate use.

Figure 1 - A model for enabling sequencing based personalized oncology. A CLIA certified laboratory generates NGS data (Layer 1) that is transferred to a High Performance Computing environment where the requisite quality control and analysis of the data is performed (Layer 2). Clinical Decision support algorithms will extract clinically relevant pieces of information from the processed data (Layer 3). Examples of clinically relevant information could be the identification of an activating PIK3CA mutation, ERBB2 amplification, a gene signature such as breast cancer subtype or a gene fusion. This clinically relevant information is then interactively viewed through desktop/mobile devices (Layer 4).
We will now review in detail the status of NGS technology
in the context of the broad outlines described above.
2. Technology feasibility
Next Generation Sequencing has rapidly replaced other high-throughput technologies such as microarrays as the platform of choice for many genomic applications. The base-call quality of Illumina NGS machines, when integrated across all reads of a given base, exceeds that of Sanger based capillary sequencers. Ninety percent of the bases called by an Illumina NGS sequencer have Phred quality scores of Q30, compared to around Q20 for Sanger based sequencers. A Phred score of Q30 corresponds to a probability of base-call error of 1 in 1000, or 99.9% accuracy. Also, the throughput of the sequencers has increased dramatically, so that it is now possible to generate enough sequence data to assemble a full human genome in 1 day. New sequencers such as Ion Torrent (Life Technologies, Carlsbad, CA) and Oxford Nanopore Technologies (Oxford, UK) promise even higher throughput at lower costs. The accuracy, speed and cost of assembling a human genome have met the threshold for enabling clinical use.
Additionally, these instruments are undergoing regulatory approval under Clinical Laboratory Improvement Amendments (CLIA) and FDA (510(k)). Several commercial providers have begun to offer CLIA certified lab developed tests (LDT) that use NGS technologies. Foundation Medicine (Cambridge, MA) offers a CLIA certified test that scans for somatic alterations in 236 relevant cancer-related genes. Ambry Genetics (Aliso Viejo, CA) offers CLIA certified Exome Sequencing for undiagnosed genetic diseases. Both Life Technologies (AmpliSeq™ Comprehensive Cancer Panel) and Illumina (TruSeq Amplicon - Cancer Panel) offer target selection assays for resequencing cancer genes. These methods allow for fast and efficient resequencing of key genes in formalin fixed paraffin embedded (FFPE) samples.

Figure 2 - Major challenges for introducing sequencing based oncology into routine practice.
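
As a hedged illustration of the Phred scale quoted in this section, the sketch below converts between a quality score Q and the base-call error probability using the standard relation P = 10^(-Q/10). The function names are illustrative assumptions and are not part of any sequencer vendor's software.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Return the base-call error probability for Phred score q: P = 10^(-q/10)."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Return the Phred score for an error probability p in (0, 1]."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


# Compare the two quality levels mentioned in the text.
for q in (20, 30):
    p = phred_to_error_probability(q)
    print(f"Q{q}: error probability {p:g}, accuracy {100 * (1 - p):.1f}%")
```

Under this relation, Q20 corresponds to 1 error in 100 calls and Q30 to 1 error in 1000 calls, which is the tenfold difference between the two platforms described above.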
3. Clinically meaningful applications of NGS
For NGS to become part of routine clinical practice, the medi-
cal benefits must be clearly demonstrated. In the area of
oncology, there are a number of clinical needs that can be
met by sequencing technologies.
Genetic abnormalities detectable by sequencing can be classified into three major groups: single nucleotide changes or point mutations, copy number changes (amplification, rearrangement or deletion of sections of chromosomes) and changes in expression levels of genes. In the following paragraphs, we discuss biomarkers with known clinical utility, measured by individual tests, which could be converted to a whole genome sequencing approach.
3.1. Point mutations
Point mutations that lead to constitutive activation of oncogenes or inactivation of tumor suppressor (TS) genes have been used to guide development of novel targeted therapies; the best described is the c-KIT mutation in gastrointestinal stromal tumors that can be targeted by imatinib or nilotinib. Recently, mutations in EGFR have been reported to be predictive of response to EGFR inhibitors (Lynch et al., 2004; Paez et al., 2004), and mutations in the BRAF oncogene at codon 600 have been reported to have clinical utility in melanoma, colorectal, lung and thyroid cancers (De Roock et al., 2011; Melck et al., 2010; Pao and Girard, 2011). Patients with BRAF V600E mutation-positive, inoperable or metastatic melanoma are eligible for treatment with vemurafenib. The cobas 4800 BRAF V600 Mutation Test (Roche) is used to identify patients eligible for treatment. An exemplary list of point mutations for which commercial testing is available, and their utility in cancer treatment, is provided in Table 1.
For single gene tests to be used, the mutations in the gene have to meet a threshold of prevalence to warrant testing and have very high clinical value. Many of the genes listed in Table 1 are frequently not tested in the clinic because they fail to meet this threshold or the resultant clinical action of the tests is not clear. Recent reports suggest that each tumor has a distinct number of mutations driving it; however, each of these mutations is present only in a small percentage of tumors. Mardis et al. (2009) identified 750 point mutations in AML, of which 64 were in coding/regulatory regions. Only 4 out of these 64 point mutations could be detected in more than one sample, suggesting that most individual mutations are not recurrent. Other studies have reported that while a specific mutation might not be found recurrently, common pathways can be identified that may drive pathogenesis: Stransky et al. (2011) found that more than 30% of head and neck cancer cases harbored mutations in genes that regulate squamous differentiation (e.g., NOTCH1, IRF6, and TP63). These studies suggest that methods that assess the entire genome would offer a comprehensive approach in defining perturbed pathways as opposed to using single gene expression levels or mutation of particular codons as surrogates of perturbed pathways.
Table 1 - Clinically relevant tests for mutations in cancer and their utility.
Gene | Mutation | Test type | Cancer type | Ref.
AKT1 | E17K | Prognostic | Breast, Colorectal, Lung and Ovarian cancers | (Pao and Girard, 2011; Bleeker et al., 2008)
BRAF | V600E | Predictive of response to vemurafenib or dabrafenib | Non-Hodgkin lymphoma, colorectal cancer, malignant melanoma, thyroid carcinoma, non-small cell lung carcinoma, lung adenocarcinoma and melanoma | (De Roock et al., 2011; Melck et al., 2010; Pao and Girard, 2011; Flaherty et al., 2012)
EGFR | Exons 18-21 | Predictive of benefit to EGFR TKIs | Non-Small Cell Lung Cancer | (Linardou et al., 2009)
FLT3 | D835 | Prognostic | Acute Myeloid Leukemia | (Motyckova and Stone, 2010)
JAK2 | Exon 12, V617F | Prognostic | Myeloproliferative Disorders, Chronic Myeloid Leukemia | (Li et al., 2008b; Ma et al., 2009)
KIT | Exons 8, 9, 11, 17 | Predictive | Gastrointestinal Stromal Tumors; Acute Myeloid Leukemia | (Motyckova and Stone, 2010; Reichardt, 2010)
KRAS | Codons 12, 13, 61 | Predictive of benefit to erlotinib | Lung adenocarcinoma, mucinous adenoma, ductal carcinoma of the pancreas, and colorectal carcinoma | (De Roock et al., 2011; Kompier et al., 2010; Monzon et al., 2009; Plesec and Hunt, 2009; Soulieres et al., 2010)
MPL | Exon 10 | Prognostic | Myeloproliferative disorders: Chronic myeloid leukemia, polycythemia vera |
NPM1 | Codons 288, 290, Exon 12 | Prognostic | Acute Myeloid Leukemia (AML) | (Motyckova and Stone, 2010; Hollink et al., 2009)
PIK3CA | Exons 9, 20 | Prognostic | Colorectal cancer, Malignant melanoma, Thyroid carcinoma, Non-Small Cell Lung cancer, breast cancer, cervical cancer, and lung adenocarcinoma | (Pao and Girard, 2011; Kompier et al., 2010)
TP53 | Somatic mutations | Prognostic | Head and Neck squamous cell carcinoma, Leukemia, and Breast cancer | (Silver et al., 2010; Petitjean et al., 2007)

It is also increasingly recognized that cancer mutations are not limited by tissue type, although the prevalence of these mutations might be so. For example, BRAF mutations are
common in melanoma (>50%), but have also been detected in
lower frequencies in other cancers (Brose et al., 2002; Davies
et al., 2002). Similarly, ERBB2 amplifications were originally
described in breast cancers and form the basis of treatment
with trastuzumab. Recently, it has been reported that 10% of
gastric cancers also harbor the ERBB2 amplicon and these pa-
tients might benefit from Herceptin therapy (Barros-Silva
et al., 2009). Currently, although these low prevalence mutations may cumulatively make up a large portion of cancer drivers, they are not routinely tested in the clinical setting. In this context, it might be more important to screen for all mutations in the patient by sequencing in an unbiased manner, instead of applying single gene tests.
Roychowdhury et al. (2011) recently reported an interesting pilot study demonstrating the value of comprehensive sequencing in a patient with metastatic colorectal cancer. A mutation in the NRAS gene and an amplification of the CDK8 locus were identified, both of which could be used to enroll the patient in a clinical trial targeting these aberrations (Roychowdhury et al., 2011). Interestingly, this patient underwent testing for KRAS and was deemed to be wild type, which is a basis for prescribing anti-EGFR therapy (Biesecker et al., 2012). Although this patient was not prescribed anti-EGFR therapy, he would have been eligible. The authors noted that the NRAS mutation is functionally equivalent to a KRAS mutation and, if known, should preclude this patient from anti-EGFR therapy. This case study also reinforces the difficulty in prescribing single gene mutation testing: while NRAS mutations are present in 18% of cutaneous melanomas (Lee et al., 2011), they are much rarer in colorectal cancers (2%) (Irahara et al., 2010). Only unbiased sequencing based mutation testing would have uncovered these important driver mutations.
Several platforms and testing services are now available for identification of multiple mutations in cancer-related genes in a single assay. MacConaill et al. (2009) reported a mutation profiling platform (“OncoMap”) to interrogate 400 mutations in 33 known oncogenes and tumor suppressors. Such assays, while an improvement over single gene assays, still offer a limited fraction of the mutational information relevant to cancer. The advent of whole genome sequencing technologies offers very competitive price points and the ability to profile tumors for multiple mutations with a single test. Commercial providers such as Foundation Medicine (Cambridge, MA) already use deep sequencing on selected sets of cancer genes from DNA extracted from routine pathology specimens to provide actionable information to treatment providers.
3.2. Chromosomal abnormalities
Regions of the genome are commonly amplified or deleted in cancer and these regions contain genes that drive cancer progression; the best example is the 17q12 amplicon that harbors the HER2 oncogene. This amplicon leads to a more aggressive type of tumor, which is now the target of a highly successful antibody therapy, trastuzumab (Herceptin®). Other amplicons, in 11q13/14, 8q24 and 20q13.2, have been found in cancers; they seem to drive the cancer phenotype and have prognostic significance. These regions contain gene sets that are important in DNA metabolism and maintenance of chromosomal integrity, suggesting that response to DNA damaging agents used as anticancer therapy might be modulated by the presence of particular amplicons. Recently, the authors of a large breast cancer study proposed ER positive breast cancers harboring amplifications in 11q13/14 as a separate subgroup with worse outcomes (Curtis et al., 2012).
There exist several published methods for inferring gene copy number using next generation sequencing. The sensitivity and specificity of these methods exceed those of microarray based techniques, and the methods require only very small amounts of tumor DNA as starting material.
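
To make the idea of read-depth based copy number inference concrete, the following is a minimal sketch and not any of the published methods referred to above. It assumes that reads have already been counted in matched genomic bins for a tumor and a normal sample, centers the tumor/normal ratio on its median, and applies hypothetical log2 thresholds (+0.3 and -0.3) to label gains and losses. Real methods additionally correct for GC content, mappability, tumor purity and ploidy.

```python
import math
from statistics import median


def log2_copy_ratios(tumor_counts, normal_counts, pseudocount=0.5):
    """Per-bin log2(tumor/normal) read-count ratio, centered on the median bin."""
    if len(tumor_counts) != len(normal_counts):
        raise ValueError("tumor and normal must cover the same bins")
    # The pseudocount (an assumption) avoids division by zero in empty bins.
    raw = [(t + pseudocount) / (n + pseudocount)
           for t, n in zip(tumor_counts, normal_counts)]
    center = median(raw)  # corrects for different sequencing depth per sample
    return [math.log2(r / center) for r in raw]


def classify(log_ratio, gain_threshold=0.3, loss_threshold=-0.3):
    """Label a bin using hypothetical, illustrative thresholds."""
    if log_ratio >= gain_threshold:
        return "gain"
    if log_ratio <= loss_threshold:
        return "loss"
    return "neutral"


# Toy example: the tumor was sequenced about twice as deeply as the normal;
# bin 3 looks amplified and bin 4 looks deleted relative to the other bins.
tumor = [200, 210, 800, 95, 190]
normal = [100, 100, 100, 100, 100]
states = [classify(r) for r in log2_copy_ratios(tumor, normal)]
print(states)
```

Median centering is used here instead of scaling by total read count because a few highly amplified bins would otherwise shift the baseline of all remaining bins.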
Translocations and their corresponding gene fusion prod-
ucts have been known to play an important role in the onset
and development of several cancers (Mitelman et al., 2007).
The classic example of a translocation resulting in the crea-
tion of a fusion transcript is the reciprocal translocation
t(9;22)(q34;q11) (McDermott et al., 2011; Lee et al., 2011)
causing the BCR-ABL1 fusion transcript. BCR-ABL1 fusion oc-
curs in most patients with chronic myelogenous leukemia
(CML) and a third of patients with acute lymphoblastic leuke-
mia. While the clinical impact of gene fusions has been most
prevalent in hematological malignancies, there is growing ev-
idence that they could have prognostic and predictive utility
in common solid tumors (Table 2). The clinical utility of inter-
rogating solid tumors for gene fusions can be seen from the
recent approval of crizotinib for the treatment of NSCLC that
harbor rearrangements in ALK (Kwak et al., 2010). Similar
therapeutic implications have also been recently reported
for MAST kinase rearrangements in breast cancer (Robinson
et al., 2011) and RET rearrangements in lung adenocarcinoma
(Lipson et al., 2012), thus highlighting the clinical relevance of
genomic translocations and their fusion transcripts. This pro-
vides an additional clinical indication for sequencing technol-
ogies that can detect such variations in addition to mutations,
amplifications and deletions.
Table 2 - Clinically relevant chromosomal abnormalities in cancer and their utility.
Genes | Chromosome abnormality | Test type | Cancer type | Reference
BCR-ABL1 | t(9;22) | Diagnostic/Predictive of response to multikinase inhibitors such as imatinib, nilotinib | CML/ALL | (Mitelman et al., 2007)
EML4-ALK | Fusion | Targetable with ALK-specific tyrosine kinase inhibitor | Non-small cell lung cancer | (Kwak et al., 2010)
TMPRSS2-ERG | Fusion | Diagnostic/Prognostic | Prostate | (Tomlins et al., 2005)
MAST kinase and Notch gene family fusions | Fusion | Potentially targetable by γ-secretase inhibitor | Breast | (Robinson et al., 2011)
KIF5B-RET | Fusion | Targetable by RET tyrosine kinase inhibitor | Lung adenocarcinoma | (Lipson et al., 2012; Kohno et al., 2012)
1p & 19q | Deletion | Diagnostic | Oligodendroglioma | (Barbashina et al., 2005)
9p21 | Deletion | Diagnostic | Lymphoid leukemia, lung cancer, esophageal cancer, glioma, melanoma | (Sasaki et al., 2003)
17q12 | Amplification | Prognostic/Predictive of response to trastuzumab and anthracycline-based adjuvant therapy | Breast | (Lamy et al., 2011)
11q13 | Amplification | Prognostic | Breast |
8q24 | Amplification | Prognostic | Multiple cancers |
20q13.2 | Amplification | Prognostic | Breast |

3.3. Measurements of RNA and transcription profiling
There have been several studies documenting the use of gene expression of one or more genes in determining cancer prognosis or treatment. An early series of publications specifically described molecular signatures in breast cancer, primarily focused on associations between particular sets of genes with altered expression and survival (Sorlie et al., 2001; van ’t Veer et al., 2002; West et al., 2001; Perou et al., 2000). These studies led to the development of clinically used tests such as Oncotype DX®, MammaPrint®, Breast Cancer Index (BCI), PAM50 and others. Multi-gene tests are also useful in other tumors, for example the ColoPrint® assay for recurrence risk in Stage II and III colon cancer patients.
The ability of RNA-seq to more completely characterize the entire transcriptome in comparison to any existing single technology such as RT-PCR or microarray promises significant clinical utility (Wang et al., 2009; Ryu et al., 2011). A big advantage of using RNA sequencing would be the opportunity to apply multiple individual or multi-gene tests to the same sample in a single RNA-seq run, at a fraction of the cost of running each of these separately. For example, in breast cancer a single RNA-seq run could provide input to calculate cancer subtype using the PAM50 gene set (Parker et al., 2009), obtain ER/PR and Her2 RNA abundance values (Kamalakaran et al., 2011), calculate the Genomic Grade Index (Filho et al., 2011; Liedtke et al., 2009) based on 97 genes and identify expression of any other amplified or deleted genes such as EGFR which are known to impact therapy response.
3.4. Sample processing for sequencing of tumor samples
Recent improvements in sample processing protocols have also allowed for robust extraction of DNA/RNA from FFPE
samples and use of very small amounts of DNA for complete sequencing (Bonin et al., 2010; Fairley et al., 2012; van Eijk et al., 2012). Sample quality is of primary importance in generating sequence based diagnostic tests. Quality control steps such as the RNA integrity number (Schroeder et al., 2006), as used in standard molecular tests, are an absolute requirement. In addition, steps are needed to evaluate and quantify the histology of the specimen, especially to establish the percentage of invasive tumor in the specimen and the heterogeneity based on staining. Additionally, following the sequencing run, general quality checkpoints are essential to ensure data integrity and quality. The quality of the reads produced by the sequencer needs to be evaluated before subsequent mapping and feature extraction are performed. While most sequencers generate a quality control (QC) report as part of their analysis pipeline, tools such as FastQC are available and should be incorporated routinely in any clinical analysis. A QC report that identifies problems as originating either in the sequencer or in the starting library material would be more useful in the clinical context. The interpretation of the reads produced by the sequencer involves data analytic steps that first require alignment of the reads to a reference genome or transcriptome and subsequent interpretation of the resulting alignments to detect the presence of mutations, copy number abnormalities and transcript abundance. Quality checkpoints for sequencing based measurements are detailed in Table 3.
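
A minimal sketch of one read-level quality check of the kind discussed in this section is given below. It assumes FASTQ records of exactly four lines each with Phred+33 encoded quality strings, and it reports the fraction of base calls at or above Q30. It is an illustration only and not a substitute for a full QC tool; the function names are hypothetical.

```python
def quality_lines(fastq_lines):
    """Yield the quality string (4th line) of each 4-line FASTQ record."""
    for index, line in enumerate(fastq_lines):
        if index % 4 == 3:
            yield line.rstrip("\r\n")


def phred33_scores(quality_string):
    """Decode a Phred+33 quality string into integer Phred scores."""
    return [ord(ch) - 33 for ch in quality_string]


def fraction_at_or_above(fastq_lines, threshold=30):
    """Fraction of all base calls with Phred score >= threshold."""
    total = 0
    passing = 0
    for quality_string in quality_lines(fastq_lines):
        for score in phred33_scores(quality_string):
            total += 1
            if score >= threshold:
                passing += 1
    return passing / total if total else 0.0


# Two toy reads: 'I' decodes to Q40, '5' to Q20, '+' to Q10 and '#' to Q2.
example = [
    "@read1", "ACGT", "+", "IIII",
    "@read2", "ACGT", "+", "I5+#",
]
print(fraction_at_or_above(example))  # 5 of 8 bases are at or above Q30
```

In practice the same function could be applied to an open file handle, since it only iterates over lines; a run whose fraction falls well below the expected level would be flagged before mapping.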
3.5. Algorithms and software
There has been an enormous body of computational biology research on the identification of mutations, copy number changes and quantification of gene expression from tumor samples. However, to be truly useful in the clinic, the methods must be used in conjunction with strict controls on data quality and coverage, and with an understanding of the assumptions under which the algorithms were developed.
Phred quality scores (Ewing and Green, 1998; Ewing et al.,
1998) were originally introduced by Phil Green to assess the
probability of accurately calling a base in capillary sequencers.
Li et al. (Li et al., 2008a) introduced the concept of using these
quality scores in a consensus nucleotide calling algorithm to
Table 3 e Quality control checkpoints for sequencing based measuremen
Steps Checkpoints
| Process step | Challenge | Quality control |
|---|---|---|
| Sample processing | Ensure percentage invasive component | H&E staining |
| | Heterogeneity in positively stained cells | H&E staining |
| DNA sequencing and alignment | Ensure base call quality | FastQC Phred score of Q30 |
| | Sensitivity and robustness of algorithms that estimate abundance levels of transcripts | Sequencing depth (fold coverage) |
| Detection/quantification | Sensitivity and robustness of algorithms that estimate abundance levels of transcripts | Sequencing depth (>50 M mapped) |
| | Mutation calling for heterogeneous samples | Fold coverage (100X coverage) |
| | Differentiating mutations from variations/polymorphisms | Comparison with normal tissue from same patient |
| | Identifying true copy number changes | Mappability of region; differentiating paralogous regions |

call mutations and single nucleotide polymorphisms. These
Phred scores (Ewing and Green, 1998; Ewing et al., 1998)
must be higher than 30 to ensure correct base calls by the
sequencer. Additionally, each base called must be covered by
enough aligned reads to support accurate genotyping. In order
to achieve an accuracy of 1 genotyping error in 1 million calls,
Ajay et al. (2011) required at least 50X coverage of a clinical
blood sample. Other reports use a higher threshold of 100X for
calling genotypes (Carter et al., 2012). Kohlmann et al. (2011)
recently reported on a study across 10 different laboratories
for targeted mutation screening using sequencing for clinical
applications. They reported a high concordance, including
robust detection of novel variants that were undetected by
standard Sanger sequencing. Additionally, they demonstrated
sensitivity to detect low-level variants present at 1–2%
frequency. In comparison, the threshold is 20% for traditional
Sanger-based sequencing, which demonstrates the power and
strength of the next generation sequencing technologies.
For normal diploid genomes, Ajay et al. (2011) reported that
the conventional read depth of ~30X coverage has produced
high quality genotypes, but stringent filters for data quality
allowed them to accurately genotype only 30% of the genome
at 30X mean coverage of the whole genome. They demonstrated
that an average coverage depth of 50X was necessary to
accurately genotype 95% of the genome. However, challenges
exist in calling mutations from cancer genomes, as consensus
calling algorithms used in this field typically assume that the
genome is diploid, which is not true for large segments of
cancer genomes. Detection of somatic alterations in tumor biopsy
samples is complicated by both the presence of normal cells in
the biopsied tissue (purity) and the presence of multiple
clonal subpopulations within tumor cells. These factors affect
the depth of sequencing required to call clonal mutations at
sufficient power (>0.8) in each sample, with greater than
100-fold coverage required to detect mutations that may be
present in around 20% of tumor cells (Carter et al., 2012).
Finally, to differentiate tumor specific mutations from the
3–4 million naturally occurring variations/single nucleotide
polymorphisms, a normal sample (blood or saliva) from the
same patient would need to be sequenced. The presence of
copy number aberrations in tumor samples adds an additional
layer of complexity in determining allele states.
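The relationship between base quality, tumor composition and sequencing depth discussed above can be illustrated with a short calculation. The sketch below is a simplified illustration and not the method of Carter et al. (2012) or Ajay et al. (2011): it converts a Phred score into an error probability using the standard definition, and it approximates the power to detect a variant with a plain binomial model. The function names, the assumption of a heterozygous mutation at a diploid locus, and the requirement of at least five supporting reads are all hypothetical choices made for illustration; sequencing errors and copy number changes are ignored.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score Q into a base-call error probability."""
    return 10 ** (-q / 10)


def expected_allele_fraction(purity: float, cancer_cell_fraction: float) -> float:
    """Expected variant allele fraction for a heterozygous mutation.

    Assumes (for illustration only) that the locus is diploid in both tumor
    and normal cells and that one of the two copies carries the mutation.
    """
    return purity * cancer_cell_fraction / 2


def detection_power(depth: int, allele_fraction: float, min_alt_reads: int) -> float:
    """Probability of observing at least `min_alt_reads` variant reads.

    The variant read count is modeled as Binomial(depth, allele_fraction).
    """
    p_below = sum(
        math.comb(depth, k)
        * allele_fraction ** k
        * (1 - allele_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_below


# Q30 corresponds to 1 expected error in 1,000 base calls.
print(f"Q30 error probability: {phred_to_error_probability(30):.3f}")

# Hypothetical case: pure tumor sample, mutation present in 20% of tumor cells.
fraction = expected_allele_fraction(purity=1.0, cancer_cell_fraction=0.2)
for depth in (30, 50, 100):
    power = detection_power(depth, fraction, min_alt_reads=5)
    print(f"{depth}X coverage: power = {power:.2f}")
```

Under these simplified assumptions the power rises steeply with depth, which is consistent with the qualitative point made above that subclonal mutations in impure samples call for substantially deeper coverage than germline genotyping.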
Several methods have been developed for abundance estimation
of genes (Baba et al., 2006; Karst et al., 2011; Orth et al.,
2010) and isoforms as well as for the detection of fusion
transcripts (Voutilainen et al., 2006; Latoszek-Berendsen
et al., 2010). The methods differ not only in the units used to
express transcript abundance but also in their sensitivity to
sequencing parameters such as read length, read quality, and
insert size for paired-end reads. Standardization of transcript
abundance units, together with evaluation of sensitivity and
robustness in both technical and biological replicates, is
needed before clinical adoption. Additional quality control
steps are necessary to assure optimal sample processing before
proceeding with sequencing. Sequencing technologies could
provide a new ‘gold standard’ if they predict clinical outcome
more accurately.
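To make the point about differing abundance units concrete, the following sketch computes two commonly used units, RPKM (reads per kilobase per million mapped reads) and TPM (transcripts per million), from the same read counts. The text above does not name specific units, so the choice of RPKM and TPM is an assumption made for illustration, and the gene names, counts and lengths are hypothetical.

```python
def rpkm(counts: dict, lengths_bp: dict) -> dict:
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(counts.values())
    return {
        gene: counts[gene] * 1e9 / (lengths_bp[gene] * total_reads)
        for gene in counts
    }


def tpm(counts: dict, lengths_bp: dict) -> dict:
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rate = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    scale = sum(rate.values())
    return {gene: rate[gene] * 1e6 / scale for gene in counts}


# Hypothetical read counts and transcript lengths (in base pairs).
counts = {"GENE_A": 100, "GENE_B": 300}
lengths_bp = {"GENE_A": 1000, "GENE_B": 3000}

print(rpkm(counts, lengths_bp))
print(tpm(counts, lengths_bp))
```

TPM values always sum to one million within a sample, whereas RPKM values do not have a fixed sum, so a classifier threshold defined in one unit cannot be reused in the other without conversion. This is one reason why agreeing on a unit matters before clinical adoption.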
In order to translate an existing multigene signature (e.g.
Mammaprint, OncotypeDx, ColoPrint, Genomic Grade Index)
whose technical validation and medical utility have already
been established, it will be critical to show that the gene
expression read-out using RNA-seq is technically equivalent to
the original microarray or RT-PCR based assay. When the number
of sequence reads mapped to each gene was compared with the
corresponding normalized absolute intensities from arrays, the
correlation was found to be greater for genes that are mapped
to by large numbers of sequence reads (Wang et al., 2009;
Marioni et al., 2008a). Marioni et al. found 81% overlap when
the differentially expressed genes from the two technologies
were compared (Marioni et al., 2008b). This suggests that the
expression values from RNA-seq will need some mathematical
transformation for certain genes to fit the established
classifier. We recently reported the comparison of RNA-seq with
standardized clinical measures of ER, PR and HER2 by
immunohistochemistry (IHC) or fluorescent in-situ hybridization
(Kamalakaran et al., 2011). We found that RNA-seq measurements
were 100% concordant with IHC for ER, while PR and HER2 were
80–90% concordant and sensitive to sample quality
(heterogeneity and percentage invasive component).
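Concordance figures such as those quoted above can be computed in a few lines. The sketch below is a minimal illustration and not the analysis used in Kamalakaran et al. (2011): it dichotomizes a hypothetical RNA-seq expression value with an assumed threshold and reports the fraction of samples on which the resulting call agrees with a reference IHC call. The sample identifiers, expression values, and threshold are invented for the example.

```python
def call_status(expression: dict, threshold: float) -> dict:
    """Call a marker positive when its expression meets the threshold."""
    return {sample: value >= threshold for sample, value in expression.items()}


def concordance(calls_a: dict, calls_b: dict) -> float:
    """Fraction of shared samples on which two assays give the same call."""
    shared = sorted(set(calls_a) & set(calls_b))
    if not shared:
        raise ValueError("no samples in common between the two assays")
    agreements = sum(calls_a[s] == calls_b[s] for s in shared)
    return agreements / len(shared)


# Hypothetical expression values (arbitrary units) and reference IHC calls.
rnaseq_expression = {"s1": 8.2, "s2": 1.1, "s3": 6.5, "s4": 0.4}
ihc_calls = {"s1": True, "s2": False, "s3": True, "s4": True}

rnaseq_calls = call_status(rnaseq_expression, threshold=5.0)
print(f"Concordance: {concordance(rnaseq_calls, ihc_calls):.0%}")
```

In practice the threshold itself would have to be derived and validated on an independent sample set, since choosing it on the same samples used to report concordance would overstate agreement.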
In addition to establishing technical equivalence, steps
have to be taken toward analytical validation. To establish
robust performance of RNA-seq-based analysis, the process-
ing pipeline has to be repeated several times with the same
samples with low variability. Also, the sequencing based
assay and the existing assay (e.g. PCR, RT-PCR, microarray)
should be evaluated on the same set of samples with the
intended focused objective to avoid confounding issues.
Standardization of RNA-seq protocols and such evaluations of
reproducibility will be key in transitioning to this new
technology (Simon, 2005) and will provide a path to market via
CLIA certification and/or FDA pre-market review.
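One simple way to quantify the run-to-run variability mentioned above is the coefficient of variation (CV) of each gene across repeated runs of the same sample. The sketch below is an illustrative assumption about how such a check might look, not a prescribed validation procedure; the gene names, replicate values, and the 10% CV cut-off are hypothetical.

```python
import statistics


def coefficient_of_variation(values: list) -> float:
    """Sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.stdev(values) / mean


def flag_unstable_genes(replicates: dict, max_cv: float) -> list:
    """Return genes whose CV across replicate runs exceeds `max_cv`."""
    return sorted(
        gene
        for gene, values in replicates.items()
        if coefficient_of_variation(values) > max_cv
    )


# Hypothetical abundance estimates from three runs of the same sample.
replicates = {
    "GENE_A": [100, 102, 98],
    "GENE_B": [10, 30, 20],
}

for gene, values in replicates.items():
    print(gene, f"CV = {coefficient_of_variation(values):.2f}")
print("Unstable:", flag_unstable_genes(replicates, max_cv=0.10))
```

Genes that fail such a check in a signature would need either a protocol change or exclusion before the sequencing-based assay could be considered equivalent to the existing one.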
3.6. Clinical grade software and tools: ensuring quality, reproducibility, accuracy and usability
As described above, most existing software tools for next
generation sequencing have been developed primarily in the
research context. Many of these algorithms will need to be
validated and quality tested for robustness before routine
clinical use. Clinical decision support software that translates
gigabytes of sequencing data into clinically actionable
decisions is still in its embryonic stages. Bioinformatic
analysis and interpretation require both standardized
genomic “content” and a host of interpretation and knowledge
discovery tools. A major challenge will be presenting the
large volume of data available from whole genome sequencing.
Databases such as MutaDATABASE (Bale et al., 2011), a
standardized and centralized warehouse of disease-associated
variants, would be useful in prioritizing the mutations that
are identified. A similar standardized database for annotating
copy number variations does not exist today. For oncology
experts, decision support is usually provided by a team of
bioinformaticians, statisticians and genetic counselors. The
importance and relevance of the mutations must be ascertained
and presented to the oncologist in an intuitive manner. Berg
et al. (2011) proposed a three-tiered system that classifies
each mutation/variant as clinically useful, clinically valid or
of unknown clinical implication. This system would allow
prioritization of the data into useful and clinically actionable
pieces of information that can then be used in disease
management.
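The three-tiered classification described above maps naturally onto a small data structure. The sketch below is a hypothetical illustration of the idea, not the scheme as implemented by Berg et al. (2011): the two boolean evidence flags and the rule that maps them to tiers are simplifying assumptions, and the variants shown are placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    CLINICALLY_USEFUL = 1
    CLINICALLY_VALID = 2
    UNKNOWN_IMPLICATION = 3


@dataclass(frozen=True)
class Variant:
    gene: str
    change: str
    has_known_disease_association: bool    # assumed evidence flag
    has_established_clinical_action: bool  # assumed evidence flag


def assign_tier(variant: Variant) -> Tier:
    """Assign a tier from the two (hypothetical) evidence flags."""
    if variant.has_established_clinical_action:
        return Tier.CLINICALLY_USEFUL
    if variant.has_known_disease_association:
        return Tier.CLINICALLY_VALID
    return Tier.UNKNOWN_IMPLICATION


def prioritize(variants: list) -> list:
    """Order variants so that the most actionable tier comes first."""
    return sorted(variants, key=lambda v: assign_tier(v).value)


# Placeholder variants; real evidence would come from curated databases.
variants = [
    Variant("GENE_C", "c.3A>T", False, False),
    Variant("GENE_A", "c.1A>T", True, True),
    Variant("GENE_B", "c.2A>T", True, False),
]

for v in prioritize(variants):
    print(v.gene, v.change, assign_tier(v).name)
```

Keeping the tier assignment in one function makes the decision rule explicit and auditable, which matters if the output is shown to an oncologist as part of a report.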
Visualizing results of sequencing analysis is currently done
on an ad hoc basis using a collection of different software
tools. OncoPrints from cBio is a tool for visualizing genomic
alterations, including somatic mutations, copy number
alterations, and mRNA expression changes across a set of
patients. Tools such as the Integrative Genomics Viewer from
the Broad Institute or the cBio cancer genomics portal from
MSKCC are meant for exploratory use in clinical discovery
studies. There is a need for an integrated software solution
that would analyze and store sequence data and present useful
genomic features for clinical action.
4. Challenges for clinical adoption
In addition to introducing a new data type and a new modality
in the clinic, sequencing will pose several new challenges
for implementation, adoption, and use in the clinical
environment. The Standardization of Clinical Testing (Nex-StoCT)
workgroup in conjunction with the US Centers for Disease
Control and Prevention (CDC) has taken steps to define tech-
nical process elements to assure the analytical validity and
compliance of NGS tests with existing regulatory and profes-
sional quality standards (Gargis et al., 2012). These guidelines
were drafted to ensure reliable next-generation sequencing
(NGS) based testing and their application for clinically useful
decision making. These guidelines “address four topics that
are components of quality management in a clinical environ-
ment: (i) test validation, (ii) quality control (QC) procedures to
assure and maintain accurate test results, (iii) the indepen-
dent assessment of test performance through proficiency
testing (PT) or alternative approaches and (iv) reference mate-
rials (RMs)”. These recommendations are a good framework to
build upon for ensuring clinically meaningful use of NGS
technologies.
4.1. Molecular oncology education and expertise
The next generation technologies will require not only hard-
ware and software resources but also human expertise. Even
with bioinformatics resources, the interpretation of test re-
sults will require a sophisticated set of experts with knowl-
edge in molecular biology, biostatistics, bioethics and
medicine. The new era of the ‘molecular oncologist’ will
soon be upon us where students receive training not only in
medical disciplines and basic biology, but also in the interpre-
tation of high dimensional data.
4.2. Regulatory approval and clinical guidelines
Providers of clinical sequencing services (Ambry Genetics,
Aliso Viejo, CA, and Foundation Medicine, Cambridge, MA)
have offered their tests as Lab Developed Tests (LDT) under
Clinical Laboratory Improvement Amendments (CLIA)
certification. However, there are no clear guidelines of
operation for next generation sequencing based tests. Recently,
the College of American Pathologists (CAP) updated the
Molecular Pathology checklists of their CAP Accreditation
Program to include a section on next generation sequencing
based assays. The CAP also updated their master activity menu
to include specific Test/Activity codes for the use of next
generation sequencing. The checklist provides a framework for
documenting processes and ensuring quality for both the
analytical wet bench process of sample preparation and sequence
generation and the bioinformatics process/pipeline of sequence
alignment, annotation and variant calling. We believe these
guidelines and processes would allow sequencing based tests to
be more reliable and allow for wide deployment into clinical
practice. However, these guidelines would need to be updated to
include not just mutation/variant calling based NGS tests, but
also RNA-seq and DNA copy number based tests.
4.3. Data storage
Raw sequencing runs generate hundreds of gigabytes of data
from a single measurement, and thus will surpass existing
clinical data management infrastructure by one or more orders
of magnitude. If follow-up screening measurements or serial
measurements of disease status are performed, sequencing data
will easily be the most data-rich modality used clinically in
the future. With large quantities of data comes the need for
computational resources. Currently, analysis of sequencing data
can take days on large computer clusters and is typically
offloaded to a computational cloud whose capacity is unlikely
to be matched by resources currently in place in the clinic
(Stein, 2010). It is expected that technological advances in
cloud computing and cloud storage will become available to meet
the needs of the clinical setting, hiding the details from the
end user via a service that provides management of and access
to these data. However, this will initially heighten concerns
over data safety and security, liability, and compliance with
regulatory requirements, such as how data are stored, which
parties own and have access to them, and details of data
deletion, archiving, and retrieval.
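The order of magnitude quoted above can be checked with a back-of-envelope calculation. The sketch below is a rough estimate under stated assumptions, not a measurement: it assumes a human genome of about 3.1 billion bases and roughly two bytes of uncompressed raw data per sequenced base (one character for the base call and one for its quality score, ignoring read headers and compression). The function names and the number of time points are hypothetical.

```python
GIGABYTE = 1e9  # decimal gigabyte, in bytes


def raw_data_bytes(genome_size_bp: float, coverage: float,
                   bytes_per_base: float = 2.0) -> float:
    """Approximate uncompressed raw data volume for one sequencing run."""
    return genome_size_bp * coverage * bytes_per_base


def serial_data_bytes(per_run_bytes: float, n_timepoints: int) -> float:
    """Total volume when the same patient is sequenced repeatedly."""
    return per_run_bytes * n_timepoints


HUMAN_GENOME_BP = 3.1e9  # approximate haploid genome size (assumption)

for coverage in (30, 50, 100):
    size = raw_data_bytes(HUMAN_GENOME_BP, coverage)
    print(f"{coverage}X whole genome: about {size / GIGABYTE:.0f} GB")

# Tumor plus matched normal at 100X, repeated at 4 hypothetical time points.
pair = 2 * raw_data_bytes(HUMAN_GENOME_BP, 100)
print(f"Serial monitoring: about {serial_data_bytes(pair, 4) / GIGABYTE:.0f} GB")
```

Even at 30X the estimate is about 186 GB per genome before compression, which is in line with the "hundreds of gigabytes" stated above; compressed formats reduce this considerably, but the scaling with coverage and with the number of serial measurements remains.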
Long-term storage and clinical utility of sequencing data in
diagnosis, therapy planning and therapy monitoring will
require standards that ensure interoperability with electronic
healthcare records. HL7 has a Clinical Genomics Working Group
(HL7CG) that creates and promotes standards enabling the
communication of clinical and personalized genomic data between
interested parties. The goal is the personalization of the
genomic data (capturing differences in an individual’s genome)
and its linking to relevant clinical information. The HL7CG
will develop and document scenarios and use cases in clinical
genomics to determine what data need to be exchanged. The group
also reviews existing genomics standard formats such as BSML
(Bioinformatics Sequence Markup Language), MAGE-ML (Microarray
and Gene Expression Markup Language), LSID (Life Science
Identifier) and others.
Publicly funded efforts such as iRODS and Galaxy are currently
helping the community to cope with the massive sharing of data
and procedural knowledge about downstream sequencing analysis.
With the pipelines for RNA-Seq, ChIP-Seq, exomic and full
sequence analysis, the operational question is how to annotate
terabytes, petabytes and exabytes of data and how to organize
the data for downstream analysis. iRODS (Integrated Rule
Oriented Data System) is a technical solution for
metadata-driven management of files that come from different
domains, reside on different devices, and are under the control
of different groups. It manages descriptive metadata about each
item and supports duplicate detection, archiving, data
migration, access controls, authorization, and integrity
checks, as well as the enforcement of management policies for
any desired property (e.g. enforcement of rules regarding
privacy for different consent types). Galaxy
(http://galaxyproject.org) is an open, web-based platform for
data intensive biomedical research. It is a software system
that supports such analysis through a framework that gives
experimentalists simple interfaces to powerful tools, while
automatically managing the computational details. Galaxy is
distributed both as a publicly available Web service, which
provides tools for the analysis of genomic, comparative
genomic, and functional genomic data, and as a downloadable
package that can be deployed in individual laboratories. One of
the early projects that fully deploys large scale medical
sequencing as a productive and critical component in genomic
medicine is the ClinSeq project. The ClinSeq project attempts
to address issues related to the genetic architecture of
disease, implementation of genomic technology, informed
consent, disclosure of genetic information, and informatics
challenges in archiving, analyzing, and displaying sequence
data (Biesecker et al., 2009).
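The idea of metadata-driven file management described above for iRODS, with duplicate detection, integrity checks, and consent-dependent access rules, can be sketched generically. The code below does not use the iRODS API and should not be read as a description of how iRODS works internally; the class, the consent types, and the roles are hypothetical names chosen for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    path: str
    sha256: str
    consent_type: str


class Catalog:
    """Toy metadata catalog keyed by file path and content checksum."""

    def __init__(self, access_rules: dict):
        # Maps a consent type to the set of roles allowed to read the data.
        self.access_rules = access_rules
        self.records = {}
        self.paths_by_checksum = {}

    def register(self, path: str, content: bytes, consent_type: str) -> bool:
        """Store metadata for a file; return True if its content is a duplicate."""
        digest = hashlib.sha256(content).hexdigest()
        is_duplicate = digest in self.paths_by_checksum
        self.records[path] = FileRecord(path, digest, consent_type)
        self.paths_by_checksum.setdefault(digest, []).append(path)
        return is_duplicate

    def verify(self, path: str, content: bytes) -> bool:
        """Integrity check: does the content still match the stored checksum?"""
        return hashlib.sha256(content).hexdigest() == self.records[path].sha256

    def can_access(self, path: str, role: str) -> bool:
        """Apply the consent-based access rule for this file."""
        consent = self.records[path].consent_type
        return role in self.access_rules.get(consent, set())


rules = {
    "research_only": {"researcher"},
    "clinical_care": {"clinician", "researcher"},
}
catalog = Catalog(rules)
print(catalog.register("run1.fastq", b"ACGT", "research_only"))
print(catalog.register("copy_of_run1.fastq", b"ACGT", "clinical_care"))
print(catalog.can_access("run1.fastq", "clinician"))
```

Separating the policy table from the file records means that a change in consent rules does not require touching the stored data, which is the general appeal of rule-oriented data management.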
4.4. Privacy and confidentiality
The implementation of whole genome analysis presents several
challenges for privacy and security. First, the amount of data
being generated demands large computing resources not
available in most institutions. There are ongoing efforts to
ensure security of personal medical information (PMI) in
computing ‘clouds’.
In addition to security of PMI, the implications of
sequencing whole genomes for patients and their families
cannot be overlooked. In addition to tumor sequence, it is
clear that host sequence will also be produced by next
generation sequencing technologies. The issues that have become
well studied in the field of human genetics are amplified in
this context, where the presence of all known heritable
mutations and polymorphisms in a particular patient’s genome
will become available by sequencing their tumor sample. For
a number of reasons, including increased risk of bias,
discrimination, and stigma, genetic privacy and confidentiality
are sometimes thought to be more important than privacy and
confidentiality in other kinds of research (Esposito and
Goodman, 2009). Clearly, safeguards must be put in place
before such sensitive information is generated, and patients
must be counseled and consented appropriately for the known
and theoretical risks this knowledge carries. For example,
potential participants must be informed about which entities
and persons will have access to the data. This might include
investigators at other institutions, corporate sponsors, a
government, employers, etc. If information obtained during
research will be placed in a patient’s medical record, this too
must be disclosed. Subjects should also be told of the risks
of others having access to their genetic information.
The growth of bioinformatics or computational genomics
makes it clear that, in the near future, the concern will not
be so much with stored biological samples but with digitized
samples: electronic data that can be stored, transmitted, and
analyzed with new ease and power. It is important for
institutions to consider policies surrounding the use of genetic
information (Massoudi et al., 2011). These processes should
address data collection and management, encryption,
destruction of specimens and/or genetic information, and
loss of data. Researchers and research ethics reviewers should
address the issue of clinically suspicious or significant
incidental findings, and whether and how they will be
communicated to subjects. Incidental findings can be of great
interest to subjects, and a comprehensive consent process
should make clear whether such findings will be disclosed.
In the United States, the passage of the Genetic Information
Nondiscrimination Act (GINA) in 2008
(http://www.genome.gov/24519851) has provided, at least in
principle, sweeping protections for patients and subjects. GINA
prohibits discrimination in healthcare insurance and employment
based on genetic information. However, the extent to which GINA
changes or reduces the risks of participation in genetic or
genomic research should be included in the consent process.

Figure 3 – The current status and future vision for oncology: replacement of a multitude of single gene/panel tests by one comprehensive molecular profile.

4.5. Reimbursement
CPT coding needs to evolve in order to encompass the existing
single gene tests and the addition of panels of genes and
multiplexed tests as well as translated molecular signatures in
a manner that also makes economic sense. CPT codes have
been developed in the last two decades in two major
directions: the first one in microbiology testing and the second
one for inherited diseases and cancer. To date, 466 descriptors
for molecular pathology services have been drafted by the
Molecular Pathology Working Group (MPWG) of the AMA. Most are
currently accepted for the CPT code set. Availability of the
technology does not immediately translate into wide clinical
use since healthcare providers and payors need to have well
established policies about medical necessity of complex
sequencing tests. The MPWG has organized two groups of
CPT codes: tier 1, which provides codes for specific procedures
performed in high volume (e.g. KRAS mutations), and tier 2,
which provides codes for a large number of less commonly
performed tests requiring more technical and professional
resources (similar to the six CPT levels used for Surgical
Pathology services). For example, level 2 includes
trinucleotide repeat disorders; level 4, sequencing of a single
target/exon; and level 9, full sequencing of genes with more
than 50 exons. It is envisioned that next generation sequencing
may fit within the existing tiers as medical necessity is
established. Multianalyte assays with algorithmic procedures
are also likely to be further specified in the context of next
generation sequencing tests.
5. Conclusion
The practice of oncology is being transformed by the vast
amount of knowledge that is gained from high throughput
molecular profiling technologies. The big challenge before us is
to discriminate between the potentially actionable information
that can be gleaned from these data and the information that
provides insights into tumor biology. The former is immediately
actionable and would make a difference for the individual
patient at hand, while the latter would help devise
therapeutic strategies for other patients. It is imperative that
these cases are distinguished and proper ethical guidelines
are set up to enable oncologists to provide better care for their
patients. The collection of the complete molecular profile of
the tumor sample would provide an avenue for oncologists
to constantly query the patient profile and update it as and
when new relevant information becomes available. For
example, a report on a new clinical trial targeting a
mutation/aberration present in the patient would allow the
oncologist to switch or update the treatment protocols and
improve outcomes for the patient (Figure 3).
Conflicts of interest
W.R.M. has participated in Illumina sponsored meetings over
the past four years and received travel reimbursement and an
honorarium for presenting at these events. Illumina had no
role in decisions relating to the study/work to be published,
data collection and analysis of data and the decision to publish.
W.R.M. has participated in Pacific Biosciences sponsored
meetings over the past three years and received travel reim-
bursement for presenting at these events.
W.R.M. is a founder and shareholder of Orion Genomics,
which focuses on plant genomics and cancer genetics.
Acknowledgment
W.R.M. received support from the Cancer Center Support
Grant (CA045508) from the NCI.
References
Ajay, S.S., Parker, S.C., Ozel Abaan, H., Fuentes Fajardo, K.V., Margulies, E.H., 2011. Accurate and comprehensive sequencing of personal genomes. Genome Res. 9, 1498–1505.
Baba, F., Swartz, K., van Buren, R., et al., 2006. Syndecan-1 and syndecan-4 are overexpressed in an estrogen receptor-negative, highly proliferative breast carcinoma subtype. Breast Cancer Res. Treat. 98, 91–98.
Bale, S., Devisscher, M., Van Criekinge, W., et al., 2011. MutaDATABASE: a centralized and standardized DNA variation database. Nat. Biotechnol. 29, 117–118.
Barbashina, V., Salazar, P., Holland, E.C., Rosenblum, M.K., Ladanyi, M., 2005. Allelic losses at 1p36 and 19q13 in gliomas: correlation with histologic classification, definition of a 150-kb minimal deleted region on 1p36, and evaluation of CAMTA1 as a candidate tumor suppressor gene. Clin. Cancer Res. 11, 1119–1128.
Barros-Silva, J.D., Leitao, D., Afonso, L., et al., 2009. Association of ERBB2 gene status with histopathological parameters and disease-specific survival in gastric carcinoma patients. Br. J. Cancer 100, 487–493.
Berg, J.S., Khoury, M.J., Evans, J.P., 2011. Deploying whole genome sequencing in clinical practice and public health: meeting the challenge one bin at a time. Genet. Med. 13, 499–504.
Biesecker, L.G., Mullikin, J.C., Facio, F.M., et al., 2009. The ClinSeq Project: piloting large-scale genome sequencing for research in genomic medicine. Genome Res. 19, 1665–1674.
Biesecker, L.G., Burke, W., Kohane, I., Plon, S.E., Zimmern, R., 2012. Next-generation sequencing in the clinic: are we ready? Nat. Rev. Genet. 13, 818–824.
Bleeker, F.E., Felicioni, L., Buttitta, F., et al., 2008. AKT1(E17K) in human solid tumours. Oncogene 27, 5648–5650.
Bonin, S., Hlubek, F., Benhattar, J., et al., 2010. Multicentre validation study of nucleic acids extraction from FFPE tissues. Virchows Arch. 457, 309–317.
Brose, M.S., Volpe, P., Feldman, M., et al., 2002. BRAF and RAS mutations in human lung cancer and melanoma. Cancer Res. 62, 6997–7000.
Carter, S.L., Cibulskis, K., Helman, E., et al., 2012. Absolute quantification of somatic DNA alterations in human cancer. Nat. Biotechnol. 30, 413–421.
Curtis, C., Shah, S.P., Chin, S.-F., et al., 2012. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature, advance online publication.
Davies, H., Bignell, G.R., Cox, C., et al., 2002. Mutations of the BRAF gene in human cancer. Nature 417, 949–954.
De Roock, W., De Vriendt, V., Normanno, N., Ciardiello, F., Tejpar, S., 2011. KRAS, BRAF, PIK3CA, and PTEN mutations: implications for targeted therapies in metastatic colorectal cancer. Lancet Oncol. 12, 594–603.
Ellis, M.J., Perou, C.M., 2013. The genomic landscape of breast cancer as a therapeutic roadmap. Cancer Discov. 3, 27–34.
Esposito, K., Goodman, K., 2009. Genethics 2.0: phenotypes, genotypes, and the challenge of databases generated by personal genome testing. Am. J. Bioeth. 9, 19–21.
Ewing, B., Green, P., 1998. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8, 186–194.
Ewing, B., Hillier, L., Wendl, M.C., Green, P., 1998. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8, 175–185.
Fairley, J.A., Gilmour, K., Walsh, K., 2012. Making the most of pathological specimens: molecular diagnosis in formalin-fixed, paraffin embedded tissue. Curr. Drug Targets 13 (12), 1475–1487.
Filho, O.M., Ignatiadis, M., Sotiriou, C., 2011. Genomic grade index: an important tool for assessing breast cancer tumor grade and prognosis. Crit. Rev. Oncol. Hematol. 77, 20–29.
Flaherty, K.T., Robert, C., Hersey, P., et al., 2012. Improved survival with MEK inhibition in BRAF-mutated melanoma. N. Engl. J. Med.
Gargis, A.S., Kalman, L., Berry, M.W., et al., 2012. Assuring the quality of next-generation sequencing in clinical laboratory practice. Nat. Biotechnol. 30, 1033–1036.
Hollink, I.H., Zwaan, C.M., Zimmermann, M., et al., 2009. Favorable prognostic impact of NPM1 gene mutations in childhood acute myeloid leukemia, with emphasis on cytogenetically normal AML. Leukemia 23, 262–270.
Irahara, N., Baba, Y., Nosho, K., et al., 2010. NRAS mutations are rare in colorectal cancer. Diagn. Mol. Pathol. 19, 157–163.
Kamalakaran, S., Lezon-Geyda, K., Varadan, V., et al., 2011. Evaluation of ER/PR and HER2 status by RNA sequencing in tissue core biopsies from preoperative clinical trial specimens. J. Clin. Oncol.
Karst, A.M., Levanon, K., Drapkin, R., 2011. Modeling high-grade serous ovarian carcinogenesis from the fallopian tube. Proc. Natl. Acad. Sci. U S A 108, 7547–7552.
Kohlmann, A., Klein, H.U., Weissmann, S., et al., 2011. The interlaboratory RObustness of Next-generation sequencing (IRON) study: a deep sequencing investigation of TET2, CBL and KRAS mutations by an international consortium involving 10 laboratories. Leukemia 25 (12), 1840–1848.
Kohno, T., Ichikawa, H., Totoki, Y., et al., 2012. KIF5B-RET fusions in lung adenocarcinoma. Nat. Med. 18, 375–377.
Kompier, L.C., Lurkin, I., van der Aa, M.N., van Rhijn, B.W., van der Kwast, T.H., Zwarthoff, E.C., 2010. FGFR3, HRAS, KRAS, NRAS and PIK3CA mutations in bladder cancer and their potential as biomarkers for surveillance and therapy. PLoS One 5, e13821.
Kwak, E.L., Bang, Y.J., Camidge, D.R., et al., 2010. Anaplastic lymphoma kinase inhibition in non-small-cell lung cancer. N. Engl. J. Med. 363, 1693–1703.
Lamy, P.J., Fina, F., Bascoul-Mollevi, C., et al., 2011. Quantification and clinical relevance of gene amplification at chromosome 17q12-q21 in human epidermal growth factor receptor 2-amplified breast cancers. Breast Cancer Res. 13, R15.
Latoszek-Berendsen, A., Tange, H., van den Herik, H.J., Hasman, A., 2010. From clinical practice guidelines to computer-interpretable guidelines. A literature overview. Methods Inf. Med. 49, 550–570.
Lee, J.H., Choi, J.W., Kim, Y.S., 2011. Frequencies of BRAF and NRAS mutations are different in histological types and sites of origin of cutaneous melanoma: a meta-analysis. Br. J. Dermatol. 164, 776–784.
Liedtke, C., Hatzis, C., Symmans, W.F., et al., 2009. Genomic grade index is associated with response to chemotherapy in patients with breast cancer. J. Clin. Oncol. 27, 3185–3191.
Linardou, H., Dahabreh, I.J., Bafaloukos, D., Kosmidis, P., Murray, S., 2009. Somatic EGFR mutations and efficacy of tyrosine kinase inhibitors in NSCLC. Nat. Rev. Clin. Oncol. 6, 352–366.
Lipson, D., Capelletti, M., Yelensky, R., et al., 2012. Identification of new ALK and RET gene fusions from colorectal and lung cancer biopsies. Nat. Med.
Li, H., Ruan, J., Durbin, R., 2008a. Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res. 18, 1851–1858.
Li, S., Kralovics, R., De Libero, G., Theocharides, A., Gisslinger, H., Skoda, R.C., 2008b. Clonal heterogeneity in polycythemia vera patients with JAK2 exon12 and JAK2-V617F mutations. Blood 111, 3863–3866.
Lynch, T.J., Bell, D.W., Sordella, R., et al., 2004. Activating mutations in the epidermal growth factor receptor underlying responsiveness of non-small-cell lung cancer to gefitinib. N. Engl. J. Med. 350, 2129–2139.
Ma, W., Kantarjian, H., Zhang, X., et al., 2009. Mutation profile of JAK2 transcripts in patients with chronic myeloproliferative neoplasias. J. Mol. Diagn. 11, 49–53.
MacConaill, L.E., Campbell, C.D., Kehoe, S.M., et al., 2009. Profiling critical cancer gene mutations in clinical tumor samples. PLoS One 4, e7887.
Maher, B., 2011. Human genetics: genomes on prescription. Nature 478, 22–24.
Mardis, E.R., Ding, L., Dooling, D.J., et al., 2009. Recurring mutations found by sequencing an acute myeloid leukemia genome. N. Engl. J. Med. 361, 1058–1066.
Marioni, J.C., Mason, C.E., Mane, S.M., Stephens, M., Gilad, Y., 2008a. RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res. 18, 1518–1519.
Marioni, J.C., Mason, C.E., Mane, S.M., Stephens, M., Gilad, Y., 2008b. RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res. 18, 1509–1517.
Massoudi, B.L., Goodman, K.W., Gotham, I.J., et al., 2011. An informatics agenda for public health: summarized recommendations from the AMIA PHI Conference. J. Am. Med. Inform. Assoc.
McDermott, U., Downing, J.R., Stratton, M.R., 2011. Genomics and the continuum of cancer care. N. Engl. J. Med. 364, 340–350.
Melck, A.L., Yip, L., Carty, S.E., 2010. The utility of BRAF testing in the management of papillary thyroid cancer. Oncologist 15, 1285–1293.
Mitelman, F., Johansson, B., Mertens, F., 2007. The impact of translocations and gene fusions on cancer causation. Nat. Rev. Cancer 7, 233–245.
Monzon, F.A., Ogino, S., Hammond, M.E., Halling, K.C., Bloom, K.J., Nikiforova, M.N., 2009. The role of KRAS mutation testing in the management of patients with metastatic colorectal cancer. Arch. Pathol. Lab. Med. 133, 1600–1606.
Motyckova, G., Stone, R.M., 2010. The role of molecular tests in acute myelogenous leukemia treatment decisions. Curr. Hematol. Malig. Rep. 5, 109–117.
Nekrutenko, A., Taylor, J., 2012. Next-generation sequencing data interpretation: enhancing reproducibility and accessibility. Nat. Rev. Genet. 13, 667–672.
Ormond, K.E., Wheeler, M.T., Hudgins, L., et al., 2010. Challenges in the clinical application of whole-genome sequencing. Lancet 375, 1749–1751.
Orth, J.D., Thiele, I., Palsson, B.O., 2010. What is flux balance analysis? Nat. Biotechnol. 28, 245–248.
Paez, J.G., Janne, P.A., Lee, J.C., et al., 2004. EGFR mutations in lung cancer: correlation with clinical response to gefitinib therapy. Science 304, 1497–1500.
Pao, W., Girard, N., 2011. New driver mutations in non-small-cell lung cancer. Lancet Oncol. 12, 175–180.
Parker, J.S., Mullins, M., Cheang, M.C., et al., 2009. Supervised risk predictor of breast cancer based on intrinsic subtypes. J. Clin. Oncol. 27, 1160–1167.
Perou, C.M., Sorlie, T., Eisen, M.B., et al., 2000. Molecular portraits of human breast tumours. Nature 406, 747–752.
Petitjean, A., Achatz, M.I., Borresen-Dale, A.L., Hainaut, P., Olivier, M., 2007. TP53 mutations in human cancers: functional selection and impact on cancer prognosis and outcomes. Oncogene 26, 2157–2165.
Plesec, T.P., Hunt, J.L., 2009. KRAS mutation testing in colorectal cancer. Adv. Anat. Pathol. 16, 196–203.
Reichardt, P., 2010. Optimal use of targeted agents for advanced gastrointestinal stromal tumours. Oncology 78, 130–140.
Robinson, D.R., Kalyana-Sundaram, S., Wu, Y.M., et al., 2011. Functionally recurrent rearrangements of the MAST kinase and Notch gene families in breast cancer. Nat. Med. 17, 1646–1651.
Roychowdhury, S., Iyer, M.K., Robinson, D.R., et al., 2011. Personalized oncology through integrative high-throughput sequencing: a pilot study. Sci. Transl. Med. 3, 111ra21.
Ryu, K., Park, C., Lee, Y., 2011. Hypoxia-inducible factor 1 alpha represses the transcription of the estrogen receptor alpha gene in human breast cancer cells. Biochem. Biophys. Res. Commun. 407, 831–836.
Sasaki, S., Kitagawa, Y., Sekido, Y., et al., 2003. Molecular processes of chromosome 9p21 deletions in human cancers. Oncogene 22, 3792–3798.
Schroeder, A., Mueller, O., Stocker, S., et al., 2006. The RIN: an RNA integrity number for assigning integrity values to RNA measurements. BMC Mol. Biol. 7, 3.
Silver, D.P., Richardson, A.L., Eklund, A.C., et al., 2010. Efficacy of neoadjuvant Cisplatin in triple-negative breast cancer. J. Clin. Oncol. 28, 1145–1153.
Simon, R., 2005. Roadmap for developing and validating therapeutically relevant genomic classifiers. J. Clin. Oncol. 23, 7332–7441.
Sorlie, T., Perou, C.M., Tibshirani, R., et al., 2001. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc. Natl. Acad. Sci. U S A 98, 10869–10874.
Soulieres, D., Greer, W., Magliocco, A.M., et al., 2010. KRAS mutation testing in the treatment of metastatic colorectal cancer with anti-EGFR therapies. Curr. Oncol. 17 (Suppl 1), S31–S40.
Stein, L.D., 2010. The case for cloud computing in genome informatics. Genome Biol. 11, 207.
Stransky, N., Egloff, A.M., Tward, A.D., et al., 2011. The mutational landscape of head and neck squamous cell carcinoma. Science 333 (6046), 1157–1160.
Tomlins, S.A., Rhodes, D.R., Perner, S., et al., 2005. Recurrent fusion of TMPRSS2 and ETS transcription factor genes in prostate cancer. Science 310, 644–648.
Treangen, T.J., Salzberg, S.L., 2011. Repetitive DNA and next-generation sequencing: computational challenges and solutions. Nat. Rev. Genet. 13, 36–46.
van Eijk, R., Stevens, L., Morreau, H., van Wezel, T., 2012. Assessment of a fully automated high-throughput DNA extraction method from formalin-fixed, paraffin-embedded tissue for KRAS, and BRAF somatic mutation analysis. Exp. Mol. Pathol. 94 (1), 121–125.
van ’t Veer, L.J., Dai, H., van de Vijver, M.J., et al., 2002. Gene expression profiling predicts clinical outcome of breast cancer. Nature 415, 530–536.
Voutilainen, K.A., Anttila, M.A., Sillanpaa, S.M., et al., 2006. Prognostic significance of E-cadherin-catenin complex in epithelial ovarian cancer. J. Clin. Pathol. 59, 460–467.
Wang, Z., Gerstein, M., Snyder, M., 2009. RNA-Seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 10, 57–63.
West, M., Blanchette, C., Dressman, H., et al., 2001. Predicting the clinical status of human breast cancer by using gene expression profiles. Proc. Natl. Acad. Sci. U S A 98, 11462–11467.