Figure S1. Overview of the variant calling and verification process. This figure expands on Fig. 1c with details of verified variant identification in 547 additional validation samples. Somatic variants (SNVs, indels, focal and chromosome-arm-level CNVs, and fusion products) were first called in 197 diagnostic samples with remission DNA (for germline) using a Complete Genomics custom Whole Genome Sequencing (WGS) variant calling pipeline. Complete Genomics calls were optimized at the start of the TARGET project using 100 independently verified variants in WGS samples. Matched tumor and remission samples in 153 cases were used for somatic variant calling by both WGS and targeted capture sequencing (TCS) of genes recurrently impacted in the WGS samples. TCS confirmed 72% of WGS SNVs and 76% of WGS indels (red and green text in figures). For focal copy number (CN) alterations spanning fewer than 7 genes, 75% of recurrent WGS deletion/loss calls and 85% of gain/amplification calls matched recurrent alterations discovered by SNP6 arrays in 96 matching samples. For chromosomal junctions, we integrated WGS, clinical and RNA-seq data by majority vote, and confirmed 89% of WGS calls. An additional 29 samples from the WGS discovery cohort were verified by TCS of diagnostic cases only, as part of 146 tumors without matched remission (see top portion of the figure). The remainder of these 146 cases were not used for discovery or validation purposes; rather, we simply identified recurrence of variants that were observed and verified in other samples.

Nature Medicine: doi:10.1038/nm.4439
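The majority-vote integration of junction evidence described in the Figure S1 legend can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function name `majority_vote`, the use of `None` for an assay that is unavailable, and the rule that ties count as unconfirmed are all assumptions made here.

```python
def majority_vote(calls):
    """Return True if most available assays support the junction.

    `calls` maps an assay name to True (supports), False (does not
    support), or None (assay not available for this sample).
    Hypothetical helper; ties are treated as not confirmed.
    """
    votes = [v for v in calls.values() if v is not None]
    if not votes:
        return None  # no evidence either way
    return sum(votes) * 2 > len(votes)


# Illustrative evidence for one candidate junction (made-up values).
evidence = {"WGS": True, "clinical": True, "RNA-seq": False}
print(majority_vote(evidence))
```

With three assays, a junction is confirmed under this rule when at least two of them agree that it is present.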


Figure S2. Cellular processes and pathways commonly impacted in pediatric AML. The height of each bar indicates the percentage of samples with verified fusions (green), SNVs/indels (grey), or focal CNVs (gold) in recurrently impacted genes within 684 pediatric AML samples. See Table S2b for a list of the impacted genes.

Figure S3. Data type overlap for TARGET and TCGA diagnostic samples. UpSet plots (http://www.caleydo.org/tools/upset/) showing the set overlaps for whole genome sequencing (WGS), whole exome sequencing (WXS), mRNA sequencing, DNA methylation arrays (CpGmeth), miRNA sequencing and targeted capture sequencing (TCS) in the TARGET and TCGA cohorts. The number of assays analyzed for each type is indicated by the horizontal bar graphs, and the number in each set intersection by the vertical bar graphs. The Clinical category includes samples comprising the entire TARGET AML dataset, including those in TARGET AML subprojects (e.g. the previously reported WXS analysis, ref. 6). Data from these samples are included in the chromosomal-arm-level and karyotype-based assessments of copy loss and fusions. (a) All TARGET AML project samples available. (b) All TCGA samples used for comparisons to TARGET. (c) Assay type overlaps for TCGA and TARGET data combined.

Panel titles: (a) TARGET AML assay overlap (n=1023); (b) TCGA assay overlap (n=177); (c) Combined TARGET & TCGA assay data overlap (n=1200). * Clinical annotations include ISCN karyotype.

Figure S4. Clonality estimates are consistent by age across cohorts. Both TCGA and TARGET AML cohorts contain affected individuals between the ages of 15 and 39 (adolescent and young adult, or AYA). Mutational and karyotypic clonality were assessed in AYA patients with whole-genome or whole-exome sequencing from either cohort, resulting in estimates from 40 TARGET AML subjects and 22 TCGA AML subjects in this age group. No significant association is observed between cohort and mutational clonality estimate (p = 0.79613, Fisher’s exact test) or karyotypic clonality (p = 0.180302, Fisher’s exact test); TCGA AYA cases are older and more likely to have normal karyotype, though not significantly so. A multivariate Poisson model similarly shows little evidence for a significant cohort-wise effect. The strongest predictor of (decreasing) mutational clonality in AYAs is age at diagnosis (p = 0.28).

Figure S5. The context of genome-wide mutation burden in pediatric AML. The mutational burden of SNVs and indels is low in pediatric AML (blue), with a median of 10 mutations/case across the 197-sample WGS cohort. This places pediatric AML, along with other pediatric malignancies (rhabdoid tumor, Ewing sarcoma, medulloblastoma) and adult AML (red), among the least mutated of human cancers. Figure reproduced from the raw data reported by Lawrence and colleagues (ref. 51), updated to reflect TARGET AML results and plotted using the ggplot2 package in the R statistical environment.

Figure S6. A simplified visualization of common genomic variants in TARGET and TCGA AML data. Selected small variants are grouped into those that appear distinctive from core binding factor (CBF; t(8;21) and inv(16)) and KMT2A (aka MLL) fusions (grp1: mutations of WT1, NPM1, PTPN11, GATA2, CEBPA) and those that frequently co-occur with CBF alterations (grp2: mutations of KIT or ASXL2, loss of chr X). C, chromosomal alteration; J, junction/translocation; M, mutation; I, ITD.

Row labels, Pediatric panel: KMT2A, CBF, grp1.var, FLT3, grp2.var, NRAS, KRAS, ZEB2, MBNL1.
Row labels, Adult panel: KMT2A, KRAS, grp1.var, DNMT3A, IDH2, IDH1, CBF, TET2, TP53, NRAS, grp2.var.

Figure S7. Adult-Pediatric mutational contrasts in AML. Lollipop plots generated with ProteinPaint (https://pecan.stjude.org/#/proteinpaint) highlight differences in frequency, type, and location of sequence variants in pediatric and adult AML. The plotted data reflect all somatic coding variants identified at presentation in 177 TCGA cases and 815 TARGET AML cases (WGS + TCS). Mutations are coded by functional class: blue, missense; brown, insertion; gray, deletion; red, frameshifting; orange, stop-gain; green, tandem duplication.

Panels: (a) MYC, (b) GATA2, (c) KRAS, (d) FLT3, (e) NRAS, (f) KIT. Tracks within the panels are labeled TARGET and TCGA.

Figure S8. The impact of pediatric gene fusions on clinical outcome. (a) 199 patients evaluated for CBFA2T3-GLIS2 fusion had clinical outcome data available for analysis. Those with the fusion (n=9) had significantly worse overall survival than patients without the fusion (n=190) (p=0.0101). (b) 824 patients were evaluated for fusions involving ETS family transcription factors (ETV6, FUS, or ERG) through karyotype and/or transcriptome sequencing and had clinical outcome data available for analysis. Those with fusions (n=20) had significantly worse event-free survival than patients without a fusion (n=804) (p=0.0060). (c) 824 patients were evaluated for fusions involving KAT6A through karyotype and/or transcriptome sequencing and had clinical outcome data available for analysis. Those with fusions (n=8) had significantly worse overall survival than patients without a fusion (n=816) (p=0.0195). Differences in outcome were assessed by log-rank test. EFS, event-free survival; OS, overall survival.

Figure S9. Pediatric CBL Exonic Deletions Detected by cDNA Fragment Length Analysis. Representative examples of CBL wild-type and deletion transcripts detected by capillary electrophoresis of cDNA. The horizontal axis depicts the size of the PCR fragment (bp), while the vertical axis indicates signal strength. The WT size (full-length transcript) is 685 bp, the exon 8 deletion alone is 563 bp, and the deletion of exons 8 and 9 is 354 bp.

Figure S10. Mutational frequency differences in key myeloid genes. (a) ECOG comparison (ref. 4). (b) TCGA comparison, balanced by cytogenetic subtypes (see online Methods). Error bars indicate the empirical SD from the resampling procedure.

Panel legends: (a) TARGET, ECOG; (b) TARGET, TCGA.

Figure S11. Mutational co-occurrence in KMT2A rearranged childhood AML. We identified single copy segmental deletions of ZEB2 and/or MBNL1 in 14 patients, 6 of whom had concurrent KMT2A fusions (p=0.035, Fisher’s exact test). The row entitled “KMT2A (clinical)” shows the manually curated classification of the tumor primary cytogenetic type by combining results from clinical, genomic and RNA-seq assays. By this measure, all samples are classified as belonging to the KMT2A fusion cytogenetic group. The row entitled “KMT2A (WGS)” shows KMT2A variants found by WGS alone. Note that 2 samples have copy number alterations as well as fusions impacting KMT2A. C, copy number alteration; J, junction/translocation; M, mutation; I, ITD.

Row labels: KMT2A (clinical), KMT2A (WGS), MLLT3, NRAS, FLT3, KRAS, MLLT10, MBNL1, TMEM14E, ZEB2.
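Several legends (Figures S4 and S11) report Fisher's exact test p-values for 2x2 tables. The sketch below shows how such a two-sided p-value can be computed from the hypergeometric distribution using only the standard library. The function name and the example counts are hypothetical: the legends do not give the full contingency tables, so the numbers here are for illustration only and are not expected to reproduce the reported p-values.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of observing k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the
    # observed one (small tolerance guards against float round-off).
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical table: rows = deletion present/absent,
# columns = fusion present/absent (illustrative counts only).
print(fisher_exact_two_sided(3, 1, 1, 3))
```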

Figure S12. Clonality at presentation in pediatric AML. (a) Mutation-based inference of clonality in 197 TARGET AML cases with WGS and 177 TCGA AML cases identifies 2 or more detectable clones in the majority of patients across age ranges. (b) A similar pattern with overall fewer detectable clones was observed by karyotypic inference of clonal relationships at presentation.

Panel titles: (a) Mutational clones detected at diagnosis; (b) Karyotypic clones detected at diagnosis. Age groups: Infants (age <3), Children (age 3-15), AYA (age 15-40), Adults (age >40).

Figure S13. Gene variants alone and in combination impact pediatric AML outcomes. (a) 963 patients from the TARGET dataset with clinical results for FLT3 internal tandem duplication (ITD), NPM1, WT1, and NUP98-NSD1 fusion had clinical outcome data for analysis. Patients with a combination of FLT3 ITD and WT1 or NUP98-NSD1 versus FLT3 ITD alone or in combination with NPM1 mutation exhibit significantly decreased overall survival (p<0.001). Similar results were found for COG trial AAML0531 (b), COG trial CCG-2961 (c), and the Dutch Childhood Oncology Group (DCOG) (d). In each trial, those with FLT3 ITD plus WT1 and/or NUP98-NSD1 fusion exhibit significantly worse overall survival. The exact number of patients in each subgroup, and the total number of evaluable patients per cohort, are indicated in the table below. ITD, FLT3-ITD.

Genotype group              TARGET  AAML0531  CCG-2961  DCOG
ITD-                           687       651       435   225
ITD- NPM1+                      37        41        41    14
ITD- WT1+                       56        43        27    14
ITD- NPM1+ WT1+                  7         5         2     0
ITD- WT1+ NUP98-NSD1+            4         3         0     1
ITD- NUP98-NSD1+                 0         0         0     1
ITD+                            72        67        17    28
ITD+ NPM1+                      27        28         8     9
ITD+ WT1+                       27        21        11     9
ITD+ NPM1+ WT1+                  7         3         2     0
ITD+ WT1+ NUP98-NSD1+           17        12         4     4
ITD+ NUP98-NSD1+                21        13         9     9
ITD- NPM1+ NUP98-NSD1+           1         1         0     0
Total                          963       888       556   314
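As a quick arithmetic check on the Figure S13 table, the subgroup counts for each cohort can be summed and compared with the reported totals. The snippet below is only a consistency sketch over the numbers printed in the table; the variable and function names are illustrative, and the counts are listed in the order the genotype groups appear in the table (ITD- first, ITD- NPM1+ NUP98-NSD1+ last).

```python
# Subgroup counts per cohort, in the order the groups are listed in the table.
counts = {
    "TARGET":   [687, 37, 56, 7, 4, 0, 72, 27, 27, 7, 17, 21, 1],
    "AAML0531": [651, 41, 43, 5, 3, 0, 67, 28, 21, 3, 12, 13, 1],
    "CCG-2961": [435, 41, 27, 2, 0, 0, 17, 8, 11, 2, 4, 9, 0],
    "DCOG":     [225, 14, 14, 0, 1, 1, 28, 9, 9, 0, 4, 9, 0],
}
reported_totals = {"TARGET": 963, "AAML0531": 888, "CCG-2961": 556, "DCOG": 314}


def check_totals(counts, reported):
    """Map each cohort to True if its subgroup counts sum to the reported total."""
    return {cohort: sum(row) == reported[cohort] for cohort, row in counts.items()}


print(check_totals(counts, reported_totals))
```

All four cohorts sum to their reported totals, which supports the column alignment of the table as reconstructed from the figure.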

Figure S14. Remission rates vary for pediatric AML with FLT3-ITD according to cooperating mutations. The CCG-2961, AAML0531 and DCOG cohorts were combined to compare complete remission (CR) rates after one cycle of induction therapy for groups with FLT3-ITD cooperating mutations, as shown. CR rates are consistent with the survival outcomes (Figs. 3c and S13) among these studies: the poorest outcome group, containing FLT3-ITD and a cooperating WT1 and/or NUP98-NSD1 fusion, had the lowest CR rate, at 54.8%. The most favorable group, FLT3-ITD positive and NPM1 positive, had the highest CR rate, at 93.0% (groupwise p<0.0001, Kruskal-Wallis).

Figure S15. Novel ZEB2 and MBNL1 Deletions. (a, b) Short (<500 kbp) deletion segments along chromosome 2 (panel a, ZEB2) and chromosome 3 (panel b, MBNL1) in TARGET discovery cohort samples (n=197). (c) With the exception of one ZEB2-deleted sample (red point at top right of panel c), samples with ZEB2 and MBNL1 deletions are not impacted by large numbers of other CNVs.

Figure S16. Novel ELF1 focal deletions in the TARGET discovery cohort. (a) Genome browser view of segmental deletions covering the ELF1 locus. Patients (n=197) are in rows; blue bars indicate the length of the deletion in that genomic region. (b) Genomic deletions were confirmed in a secondary assay using the nCounter CNV assay (NanoString Technologies), with verification (boxed specimens with low probe signals, identified by green signals in the heatmap below) of all ELF1 deletions initially identified by WGS. (c) Expression values (RPKM) of ELF1 differ between those with the deletion and those with wild-type copy number (p=0.0077). (d) Unsupervised clustering of 63 differentially expressed genes (p<0.01) between patients with and without ELF1 deletion shows many genes are upregulated in the samples with ELF1 deletions. Orange labels on the y axis indicate patients with an ELF1 deletion.

Panel c: title, Expression of ELF1; y axis, ELF1 expression (RPKM), 0 to 1200; groups, ELF1 deletion and ELF1 WT.

Figure S17. Summary view of the key fusion classes in pediatric AML. Each colored region represents a fusion family. Descriptive labels are written adjacent to each family. The fusion partner genes for each family are indicated by their HGNC symbols. The lines connecting gene symbols indicate fusion partners. The thickness of each line reflects the frequency of the observed fusion.

Figure S18. Varying the age cutoff for infants (< 3 years in Figure 4b) vs. children, to < 2 or even < 1, does not substantially alter conclusions about fusion prevalence. Panel c is the same as Fig. 4b (reproduced here for comparison). Panels a and b show how samples shift between age groups if the infant-child threshold is reduced to <2 years (b) or <1 year (a). Fusions are listed in the same order as in Fig. 4b and use the same color scheme.

Age groups by panel: (a) Infants <1, Children 1-15, AYA, Adults; (b) Infants <2, Children 2-15, AYA, Adults; (c) Infants <3, Children 3-15, AYA, Adults.

Figure S19. Co-occurring mutations with CEBPA. (a) Oncoprint (http://www.cbioportal.org) showing all TARGET samples with functionally validated CSF3R mutations (ref. 20). Green indicates samples with mutations. (b) CEBPA and GATA2 mutations combinatorially impact event-free survival.

Panel b: title, CEBPA and/or GATA2 in Normal Cyto EFS; x axis, EFS (Days), 0 to 4000; y axis, Percent survival, 0 to 100; curves, CEBPA+ GATA2+, CEBPA+ GATA2-, GATA2+ CEBPA-, and Wildtype; group sizes shown on the plot, N=16, N=13, N=143, N=7; P=0.0177.

Figure S20. Patterns of mutual co-occurrence and mutual exclusion among somatic pediatric AML variants. (a) Patterns of co-occurrence and (b) mutual exclusion among variants in the TARGET cohort were evaluated using CoMEt (see online Methods). Line thickness represents -log(p-value) for the observed co-occurrence rates. Orange boxes indicate cytogenetic groups. Except for copy number alterations at the top right, which were only evaluated within the 197 samples with WGS, all other relations are among 684 samples with TCS. (c) An alternative derivation of conditional gene-gene relationships using a penalized Ising model yields similar conditional dependencies.

Figure S21. Anti-correlated DNA methylation and reduced transcription potential. By scanning 2000 bp upstream and 200 bp downstream of the transcription start site (TSS) for all known ENSEMBL isoforms of ~8000 expressed genes in AML, we fit segmented regression models of DNA methylation (X axis) against asinh(transcripts per million, TPM; Y axis) of each transcript or gene. Hyperbolic arcsine (asinh) is similar to log transformation but is defined at all points along the real number line. Since large batch effects confound the biological differences between TARGET pediatric AML and TCGA adult AML mRNA data, we opted to take the within-cohort median expression for samples with 10% or less methylation at a CpG locus. The “silencing threshold” at the locus corresponding to the gene of interest was then defined as the methylation fraction beyond which no sample in a cohort exceeded the median “unmethylated” expression level (from samples with <= 10% methylation) within its cohort. Any locus where healthy progenitors or myeloid cells showed >= 10% methylation was omitted from consideration. After these filtering steps, the most significantly associated locus (ideally correlated with r > 0.8 against its neighboring loci) was selected as a “tag CpG” for the downstream transcript(s). A tag CpG for HumanMethylation450 arrays, and either the same locus or (if not present) the best surrogate locus for HumanMethylation27 arrays passing the filters, was retained for silencing “calls”. If no suitable HumanMethylation27 locus could be found, only samples with HumanMethylation450 data were assayed for silencing of a given gene. This method identified 119 genes with recurrent silencing by promoter hypermethylation within the TARGET and TCGA datasets. Examples below include THRB and WDR35 (components of NMF signature 2 and 13 signals, respectively), CDKN2B, and ULBP1, ULBP2 and ULBP3 (NK ligands). The red line marks the empirically determined silencing threshold (% methylation).
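The silencing-threshold definition in the Figure S21 legend can be illustrated for a single CpG locus within one cohort. This is a simplified sketch under stated assumptions, not the authors' code: it skips the segmented regression, the filtering against healthy cells, and tag-CpG selection, and the function name, argument names, and toy data are all hypothetical.

```python
import math
from statistics import median


def silencing_threshold(methylation, tpm, unmeth_cutoff=0.10):
    """Estimate a silencing threshold for one locus in one cohort.

    methylation: methylation fractions (0 to 1), one per sample.
    tpm: expression in transcripts per million, same sample order.
    Returns the methylation fraction beyond which no sample exceeds the
    median asinh(TPM) of the "unmethylated" samples, or None when the
    cohort has no unmethylated sample at this locus.
    """
    expr = [math.asinh(x) for x in tpm]
    unmeth = [e for m, e in zip(methylation, expr) if m <= unmeth_cutoff]
    if not unmeth:
        return None
    baseline = median(unmeth)
    # Highest methylation at which a sample still exceeds the baseline;
    # 0.0 when no sample exceeds it at all.
    return max((m for m, e in zip(methylation, expr) if e > baseline), default=0.0)


# Toy data (made up): expression falls as methylation rises.
meth = [0.02, 0.05, 0.08, 0.30, 0.55, 0.70, 0.90]
tpm = [40, 60, 50, 55, 20, 5, 0]
threshold = silencing_threshold(meth, tpm)
silenced = [m for m in meth if m > threshold]
print(threshold, silenced)
```

In this toy cohort, samples with methylation above the returned threshold would be called silenced for the gene.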

Figure S22. Integrative analysis of gene mutations, deletion, and transcriptional silencing by promoter methylation. Silencing (gold) or mutation/deletion events (gray) for each gene (rows) are displayed for all assayed patients (columns), with the marginal total of events per patient illustrated in the upper histogram. The plotted data reflect 172 TCGA cases and 284 TARGET cases at 119 genes and are outlined in Tables S8-S9. These data represent a complete illustration of the subset shown in Fig. 5a, with differences in row/column ordering based on differing clustering solutions for greater numbers of samples and genes.

Legend keys: Cohort; Status (silenced, mutated).

Figure S23. NMF Deconvolution of genome-wide methylation patterns. DNA methylation signatures derived by non-negative matrix factorization (NMF) and in silico purification. Samples are ordered by hierarchical clustering of signatures (labeled at right) and demonstrate the relative similarity of methylation features from samples within cytogenetic categories (top ribbon). The plotted data are outlined in Table S10 and represent a complete illustration of those shown in Fig. 5b.

Legend keys: Cohort; Associations.

Figure S24. Two DNA methylation signatures mark poor prognosis. Kaplan-Meier plots for signatures 2 and 13. After stratifying by cohort and adjusting for both TP53 mutation status and white blood cell count, these two signatures predict significantly (p < 0.05) poorer event-free survival in both pediatric and adult patients with above-median scores.

Panel titles: DNA methylation signature #2; DNA methylation signature #13.

Figure S25. Unsupervised Nonnegative Matrix Factorization (NMF) Clustering of miRNA Expression. This figure is a fully annotated version of Fig. 6A in the main text. Unsupervised NMF clustering of miRNA expression patterns in pediatric AML samples revealed 4 discrete pediatric subgroups (marked by the numbered colored rectangles at the top) that were correlated with specific genomic alterations (indicated by blue bars in the gray annotation rows below the “race” and “FAB category” annotations, near the top).

Legend keys: consensus matrix; consensus matrix score; expression z-score; cluster labels 1 to 4.

Figure S26. Kaplan-Meier plots for samples expressing low and high levels of miRs let-7a-3p, let-7b-5p and 30a-3p. The expression (RPM) cut point between high and low expression groups for each miRNA was defined using the X-tile method (ref. 77), where all separation points between patients are considered and the selected cut point is the one that provides the optimal (lowest) EFS log-rank p-value. OS, overall survival.

P values shown on the plots: P<0.0001, P<0.0001, P=0.0001.
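The cut-point search described for Figures S26 to S28 (try every separation point between patients and keep the one with the lowest log-rank p-value) can be sketched with the standard library alone. This is a minimal illustration and not the X-tile software: the function names are hypothetical, the two-group log-rank test is a plain textbook implementation, and the example data are made up.

```python
import math


def logrank_p(times, events, in_high):
    """Two-group log-rank p-value (chi-square approximation, 1 df)."""
    obs_minus_exp, var = 0.0, 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n1 = sum(1 for i in at_risk if in_high[i])
        dead = [i for i in at_risk if times[i] == t and events[i]]
        d = len(dead)
        d1 = sum(1 for i in dead if in_high[i])
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if var == 0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / var
    return math.erfc(math.sqrt(chi2 / 2))  # chi-square (1 df) upper tail


def best_cut_point(expr, times, events):
    """Scan midpoints between distinct expression values; return (cut, p)."""
    values = sorted(set(expr))
    best = None
    for lo, hi in zip(values, values[1:]):
        cut = (lo + hi) / 2
        p = logrank_p(times, events, [x > cut for x in expr])
        if best is None or p < best[1]:
            best = (cut, p)
    return best


# Made-up data: high expression goes with short event-free survival.
expr = [1, 2, 3, 4, 10, 11, 12, 13]
times = [50, 60, 70, 80, 5, 6, 7, 8]
events = [1] * 8  # every patient had an event
cut, p = best_cut_point(expr, times, events)
print(cut, p)
```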

Figure S27. Kaplan-Meier plots for samples expressing low and high levels of miRs 155-5p, 3614-5p, 4662-5p and 26a-2-3p. The expression (RPM) cut point between high and low expression groups for each miRNA was defined using the X-tile method (ref. 77). OS, overall survival.

Figure S28. High expression levels of miRs 133a-3p, 212-3p, and 29c-5p have deleterious effects on event-free survival (EFS). The expression (RPM) cut point between high and low expression groups for each miRNA was defined using the X-tile method (ref. 77). EFS, event-free survival.