
Identification of novel ALK and RET gene fusions from colorectal and lung cancer biopsies

Doron Lipson, Marzia Capelletti, Roman Yelensky, Geoff Otto, Alex Parker, Mirna Jarosz, John A Curran, Sohail Balasubramanian, Troy Bloom, Kristina W Brennan, Amy Donahue, Sean R Downing, Garrett M Frampton, Lazaro Garcia, Frank Juhn, Kathy C Mitchell, Emily White, Jared White, Zac Zwirko, Tamar Peretz, Hovav Nechushtan, Lior Soussan-Gutman, Jhingook Kim, Hidefumi Sasaki, Hyeong Ryul Kim, Seung-il Park, Dalia Ercan, Christine E Sheehan, Jeffrey S Ross, Maureen T Cronin, Pasi A Jänne and Philip J Stephens

Supplementary Methods

Patients and samples. CRCs included 24 male and 16 female patients of mean age 61.8

years (range 56 to 87) with 1 Stage I, 4 Stage II, 14 Stage III and 17 Stage IV cases. NSCLCs

included 18 female and 6 male patients of mean age 66.9 years (range 36 to 82). Patients’

smoking status was unavailable. Stage included: 9 Stage I, 3 Stage II, 1 Stage III and 5 Stage

IV. For all cases CRC/NSCLC diagnosis was confirmed and thyroid cancer ruled out by at

least two pathologists reviewing clinical history, radiologic images, gross appearance and

routine histology.

Genomic DNA sequencing. DNA sequencing was performed for 2574 exons of 145 cancer

genes on indexed, adaptor ligated, hybridization-captured (Agilent SureSelect custom kit)

libraries using DNA isolated from 40 microns of formalin fixed paraffin embedded (FFPE)

tumor. For all specimens ≥25% of the nuclear area was malignant tumor cells so no

micro/macro dissection tissue enrichment was performed. Sequencing was performed on the HiSeq2000

instrument (Illumina) with 36 bp paired reads to an average depth of 229X.

Nature Medicine doi:10.1038/nm.2673

Sequence data analysis. Sequence data from gDNA and cDNA were mapped to the

reference human genome (hg19) using the BWA aligner1 and processed using publicly

available SAMtools2, Picard (http://picard.sourceforge.net) and GATK3. Genomic base

substitutions and indels were detected using custom tools optimized for mutation calling in

heterogeneous tumor samples, based on statistical modeling of sequence quality scores and

local sequence assembly. Variations were filtered using dbSNP and a custom artifact

database, then annotated for known and likely somatic mutations using COSMIC4. Copy

number alterations were detected by comparing targeted genomic DNA sequence coverage

with a process-matched normal control sample. Genomic rearrangements were detected by

clustering chimeric reads mapping to targeted introns. Expression levels were determined

by analyzing cDNA sequence coverage of targeted exons.
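
As a rough illustration of the rearrangement step described above, the sketch below groups chimeric read pairs by coarse genomic bins and keeps groups with enough support. It is not the authors' custom tool: the `ChimericPair` record, the 500 bp `window`, the `min_support` threshold and any coordinates fed to it are hypothetical choices made for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class ChimericPair(NamedTuple):
    """One read pair whose mates map to different loci (hypothetical record)."""
    chrom_a: str
    pos_a: int
    chrom_b: str
    pos_b: int


def cluster_chimeric_pairs(pairs, window=500, min_support=3):
    """Group pairs whose two ends fall in the same pair of genomic bins.

    Fixed-size bins are a crude stand-in for real clustering: pairs that
    straddle a bin edge end up in different groups.
    """
    bins = defaultdict(list)
    for pair in pairs:
        # Integer division assigns each mate position to a bin index.
        key = (pair.chrom_a, pair.pos_a // window,
               pair.chrom_b, pair.pos_b // window)
        bins[key].append(pair)
    # Keep only candidate events supported by enough independent pairs.
    return [group for group in bins.values() if len(group) >= min_support]
```

Given four toy pairs that link the same two bins and one unrelated pair, only the four-pair group is returned. A production caller would also need to handle bin edges, strand orientation and duplicate reads, none of which is modelled here.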

Analytical validation. For analytical validation, we obtained results from previous clinical

genotyping for 59/64 cases: 38 CRCs and 2 NSCLCs for KRAS codons 12 and 13 by allele-

specific primer extension (Genzyme Genetics); 3 CRCs for BRAF V600 by pyro-sequencing

(ARUP); 19 NSCLCs for exon 18 through 21 of EGFR by sequencing (Genzyme Genetics).

Mutation calls were completely concordant between methods, giving estimates of 100%

sensitivity (95% CI 74%-100%) and 100% specificity (95% CI 93%-100%) for our assay.
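
The source does not say how the confidence intervals were derived. One common choice when every call is concordant is the exact (Clopper-Pearson) binomial interval, which then has a closed form. The sketch below assumes that method; the function name and the counts in the usage note are illustrative and are not taken from the paper.

```python
def exact_ci_all_concordant(n, confidence=0.95):
    """Two-sided Clopper-Pearson interval when all n calls agree (x = n).

    In this special case the lower bound solves p**n = alpha / 2,
    so it equals (alpha / 2) ** (1 / n); the upper bound is 1.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    alpha = 1.0 - confidence
    lower = (alpha / 2.0) ** (1.0 / n)
    return lower, 1.0
```

For example, hypothetical counts of 12 and 51 concordant calls give lower bounds of about 0.74 and 0.93 respectively, which shows how strongly the interval width depends on the number of cases behind each estimate.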

RET protein immunohistochemistry. Tumors from 55 female and 62 male patients of

mean age 65.7 years (range 40 to 86) were tested for RET expression using an anti-RET

mouse monoclonal antibody (Vector Laboratories, clone: 3F8). Antibody specificity was

confirmed using control tissues which showed cytoplasmic immunoreactivity for RET in

normal human intestine and enteric ganglion cells. Twenty-two tumors were RET

immunopositive and 95 were negative.

Additional RET rearrangement screening was performed by qRT-PCR on cDNA from frozen

tumors or RET IHC in FFPE specimens. cDNA sequencing from FFPE tissues confirmed

expression of novel gene fusions. Sequencing matched normal genomic DNA from blood for

the index KIF5B-RET NSCLC patient and two additional cases confirmed the somatic origin

of the rearrangement.

CRC ALK IHC. CRC specimens were sectioned at 5 microns and stained for ALK expression

using the IgG3 Clone CD 246 anti-ALK antibody (CD 246; Dako Corp) with the Ventana

Benchmark automated IHC stainer (Ventana Medical Systems).

NSCLC RT-PCR. Specimens were identified from NSCLC patients who were never smokers or limited

former smokers (< 10 pack years) and had received no prior chemotherapy or radiation (347 Korean, 58

Japanese and 121 European from CALGB tumor bank protocol 14202). All patients provided

written informed consent and studies were approved by local Institutional Review Boards.

cDNA sequencing. RNA from 10 micron FFPE sections (Roche High Pure Kit) was reverse

transcribed with random hexamers using SuperScript®III First-Strand Synthesis System

(Invitrogen). cDNA made double-stranded using NEBNext® mRNA Second Strand

Synthesis Module (New England Biolabs)5 was used for sequencing library construction,

hybridization selection and sequencing6.

RT-PCR for fusion gene products. Tumors obtained at surgery were snap frozen in liquid

nitrogen, embedded in OCT and sectioned. RNA was prepared using Trizol (Invitrogen) and

purified using RNeasy mini-eluate cleanup kit (Qiagen). cDNA was transcribed with

Quantiscript Reverse Transcriptase (Qiagen). qRT-PCR used primers designed from

genomic sequence data reconstructing fusion events:

F 5'-AAATGAGCTCAACAGATGGCGTAA-3'; R 5'-AGAACCAAGTTCTTCCGAGGGAAT-3'.

Genotyping was performed using RT-PCR for EGFR, EML4-ALK, KRAS, BRAF, ERBB2,

CD74-ROS and KIF5B-RET. All PCR-positive specimens were verified by direct sequencing.

Specific PCR primers are available upon request.

Plasmid Construction, Cell Culture, and Transfection studies. Full length KIF5B-RET

cDNA was constructed by RT-PCR amplifying two overlapping fragments, each containing a

unique EcoRI site. PCR primers used to generate the N-terminal and C-terminal

overlapping fragments are:

F 5'-ATACGAAGTTATCAGTCGACCAGCTGACTGCTGCCTCTCAC-3';

R 5'-CAGATACTGCATCCCCTGTGAGAT-3';

F 5'-TAAGGAAATGACCAACCACCAGAA-3';

R 5'-ACGAATGGTCTAGAAAGCTTTTAACTATCAAACGTGTCCATTAATTTTGCCGCTGA-3'.

cDNA was inserted into pDNR-Dual vector (BD Biosciences) using SalI/HindIII sites and

recombined into lentiviral expression vector JP1698 as previously described6. Full length

cDNA was confirmed by sequencing. Ba/F3 cell culture, lentivirus production, titrations

and infections were performed as previously described7,8. Polyclonal cell lines were

established by blasticidin selection then cultured without interleukin-3 (IL-3). Cell

proliferation and growth assays were performed as previously described8. Gefitinib, sorafenib,

sunitinib and vandetanib were prepared in DMSO and stored at -20 °C. Protein detection by

immunoblotting was performed according to antibody manufacturer's recommendations.

Anti-RET and anti-phospho-RET (Tyr 905) antibodies were from Cell Signaling Technology.

Anti-tubulin antibody was from Sigma Aldrich.
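
The cloning strategy above relies on SalI and HindIII sites for insertion into the pDNR-Dual vector. A quick sanity check, sketched below, is to scan the outer cloning primers for those recognition sequences. The primer strings are copied from the text; the `SITES` table uses the standard recognition sequences for the named enzymes, and the helper name `find_sites` is made up for this example.

```python
# Standard recognition sequences for the enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "SalI": "GTCGAC", "HindIII": "AAGCTT"}

# Outer cloning primers copied from the Methods text (5' to 3').
FWD_OUTER = "ATACGAAGTTATCAGTCGACCAGCTGACTGCTGCCTCTCAC"
REV_OUTER = "ACGAATGGTCTAGAAAGCTTTTAACTATCAAACGTGTCCATTAATTTTGCCGCTGA"


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for each site found in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            # Continue searching one base further to catch overlapping hits.
            start = seq.find(motif, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    for name, primer in (("forward outer", FWD_OUTER), ("reverse outer", REV_OUTER)):
        print(name, find_sites(primer))
```

Because all three sites are palindromic, scanning one strand is sufficient. The EcoRI sites mentioned in the text lie within the amplified fragments rather than in the primers, so this check does not cover them.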

References

1. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754-1760 (2009).

2. Li, H., et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078-2079 (2009).

3. McKenna, A., et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 20, 1297-1303 (2010).

4. Forbes, S.A., et al. COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res 39, D945-950 (2011).

5. D'Alessio, J.M. & Gerard, G.F. Second-strand cDNA synthesis with E. coli DNA polymerase I and RNase H: the fate of information at the mRNA 5' terminus and the effect of E. coli DNA ligase. Nucleic Acids Res 16, 1999-2014 (1988).

6. Levin, J.Z., et al. Targeted next-generation sequencing of a cancer transcriptome enhances detection of sequence variants and novel fusion transcripts. Genome Biol 10, R115 (2009).

7. Ercan, D., et al. Amplification of EGFR T790M causes resistance to an irreversible EGFR inhibitor. Oncogene 29, 2346-2356 (2010).

8. Sasaki, T., et al. A novel ALK secondary mutation and EGFR signaling cause resistance to ALK kinase inhibitors. Cancer Res 71, 6051-6060 (2011).

Identification of recurrent KIF5B-RET gene fusions from lung cancer biopsies

Gene      RefSeq        Gene      RefSeq        Gene      RefSeq        Gene      RefSeq
ABL1      NM_007313     PTCH1     NM_000264     HOXA3     NM_030661     RUNX1     NM_001754
AKT1      NM_005163     PTEN      NM_000314     HSP90AA1  NM_005348     SMAD2     NM_005901
AKT2      NM_001626     RB1       NM_000321     IDH1      NM_005896     SMAD3     NM_005902
AKT3      NM_181690     RET       NM_020630     IDH2      NM_002168     SMAD4     NM_005359
ALK       NM_004304     SMO       NM_005631     IGF1R     NM_000875     SMARCA4   NM_003072
APC       NM_000038     STK11     NM_000455     IGF2R     NM_000876     SMARCB1   NM_003073
AR        NM_000044     TP53      NM_000546     IKBKE     NM_014002     SOX10     NM_006941
BRAF      NM_004333     ABL2      NM_005158     INHBA     NM_002192     SOX2      NM_003106
CCND1     NM_053056     ATM       NM_000051     IRS2      NM_003749     SRC       NM_005417
CDK4      NM_000075     AURKA     NM_003600     JAK3      NM_000215     TET2      NM_017628
CDKN2A    NM_000077     AURKB     NM_004217     KDR       NM_002253     TGFBR2    NM_003242
CEBPA     NM_004364     BCL2      NM_000633     MAP2K4    NM_003010     TOP1      NM_003286
CTNNB1    NM_001904     BCL2L1    NM_001191     MCL1      NM_021960     TSC1      NM_000368
EGFR      NM_005228     BCL2L2    NM_004050     MDM2      NM_002392     TSC2      NM_000548
ERBB2     NM_004448     BCL6      NM_001706     MDM4      NM_002393     VHL       NM_000551
ESR1      NM_000125     BRCA1     NM_007294     MEN1      NM_000244     WT1       NM_000378
FGFR1     NM_015850     BRCA2     NM_000059     MITF      NM_198159     ARFRP1    NM_003224
FGFR2     NM_000141     CBL       NM_005188     MLH1      NM_000249     BCL2A1    NM_004049
FGFR3     NM_000142     CCNE1     NM_001238     MPL       NM_005373     CDH20     NM_031891
FLT3      NM_004119     CDH1      NM_004360     MRE11A    NM_005590     CDH5      NM_001795
HRAS      NM_005343     CDH2      NM_001792     MSH2      NM_000251     EPHA3     NM_005233
JAK2      NM_004972     CDK6      NM_001259     MSH6      NM_000179     EPHA5     NM_004439
KIT       NM_000222     CDK8      NM_001260     MTOR      NM_004958     EPHA7     NM_004440
KRAS      NM_004985     CHEK1     NM_001274     MYCL1     NM_005376     EPHB1     NM_004441
MAP2K1    NM_002755     CHEK2     NM_007194     MYCN      NM_005378     FOXP4     NM_138457
MAP2K2    NM_030662     CRKL      NM_005207     NF2       NM_000268     GPR124    NM_032777
MET       NM_000245     EPHA6     NM_173655     NKX2-1    NM_003317     GUCY1A2   NM_000855
MLL       NM_005933     EPHB4     NM_004444     NTRK1     NM_002529     LRP1B     NM_018557
MYC       NM_002467     EPHB6     NM_004445     PAX5      NM_016734     LTK       NM_002344
NF1       NM_000267     ERBB3     NM_001982     PDGFRB    NM_002609     PAK3      NM_002578
NOTCH1    NM_017617     ERBB4     NM_005235     PKHD1     NM_138694     PLCG1     NM_002660
NPM1      NM_002520     FBXW7     NM_018315     PRKDC     NM_006904     PTPRD     NM_002839
NRAS      NM_002524     FGFR4     NM_002011     PTPN11    NM_002834     TBX22     NM_016954
NTRK3     NM_002530     FLT1      NM_002019     RAF1      NM_002880     USP9X     NM_001039590
PDGFRA    NM_006206     FLT4      NM_182925     RARA      NM_000964
PIK3CA    NM_006218     GATA1     NM_002049     RICTOR    NM_152756
PIK3R1    NM_181523     GNAS      NM_016592     RPTOR     NM_020761

Supplementary Table 1a. Genes sequenced across entire coding sequence (n=145)

Gene      RefSeq      Introns sequenced
ALK       NM_004304   19
BCR       NM_004327   8,13,14
BRAF      NM_004333   7,8,9,10
EGFR      NM_005228   7
ETV1      NM_004956   3,4
ETV4      NM_001986   8
ETV5      NM_004454   6,7
ETV6      NM_001987   5,6
EWSR1     NM_005243   8,9,10,11,12,13
MLL       NM_005933   6,7,8,9
RAF1      NM_002880   5,6,7,8,9
RARA      NM_000964   2
RET       NM_020630   9,10,11
TMPRSS2   NM_005656   1,2

Supplementary Table 1b. Genes sequenced across selected introns (n=14)

Columns: Sample name; Cancer; Gender; Total pairs; % pass filter; % uniquely aligned; Error rate; Read pair length; % chimeras; Mean insert size; Fold enrichment; Mean exon coverage; Median exon coverage; 100X covered exons; 150X covered exons; 250X covered exons; On-target % duplication; Est on-target library size; On-target distinct pairs; % usable pairs

SM3 CRC Male 27,019,766 94% 94% 0.23% 36, 36 1.11% 176.0 2067.3 161.7 159 81% 55% 10% 87% 2,247,862 2,246,894 8%

SM4 CRC Male 16,509,937 94% 93% 0.22% 36, 36 1.08% 174.2 2396.6 277.6 277 93% 86% 59% 71% 3,650,254 3,517,635 21%

SM6 CRC Male 19,231,329 95% 94% 0.22% 36, 36 0.92% 161.1 2486.5 261.8 251 92% 83% 50% 78% 3,306,316 3,266,131 17%

SM7 CRC Female 16,194,851 95% 90% 0.23% 36, 36 1.73% 159.1 1971.2 215.6 212 89% 77% 33% 72% 2,755,942 2,667,390 16%

SM8E CRC Female 11,945,498 91% 91% 0.28% 36, 36 1.77% 154.1 2199.1 156.1 159 79% 55% 9% 74% 1,960,454 1,916,603 16%

SM9 CRC Female 18,744,036 94% 92% 0.25% 36, 36 2.39% 141.1 2536.0 199.9 201 89% 74% 25% 83% 2,410,330 2,402,503 13%

SM11 CRC Male 16,023,607 95% 94% 0.22% 36, 36 1.21% 173.9 2345.6 215.6 218 88% 76% 36% 76% 2,787,679 2,745,027 17%

SM12 CRC Male 18,079,049 95% 89% 0.24% 36, 36 2.35% 161.8 1715.3 274.8 279 93% 86% 61% 62% 3,828,343 3,481,145 19%

SM13E CRC Female 17,743,604 91% 92% 0.29% 36, 36 1.25% 156.8 2211.9 176.3 184 83% 67% 17% 80% 2,230,888 2,217,154 12%

SM14 CRC Male 15,093,799 95% 94% 0.23% 36, 36 1.61% 154.9 2505.4 215.8 214 87% 74% 36% 77% 2,699,597 2,660,457 18%

SM16 CRC Male 20,624,203 95% 93% 0.22% 36, 36 1.16% 174.4 2330.9 303.7 301 95% 89% 67% 73% 4,096,015 3,984,814 19%

SM18 CRC Female 15,332,978 95% 93% 0.22% 36, 36 1.39% 164.7 2319.5 253.9 256 91% 84% 52% 70% 3,368,662 3,237,097 21%

SM19 CRC Male 10,247,815 92% 88% 0.28% 36, 36 2.71% 135.2 1737.8 117.9 119 62% 29% 1% 71% 1,485,566 1,431,053 14%

SM20 CRC Male 21,069,786 88% 92% 0.30% 36, 36 0.82% 158.8 2151.4 185.3 190 84% 69% 22% 81% 2,409,046 2,396,207 11%

SM21 CRC Male 17,431,311 95% 92% 0.22% 36, 36 2.34% 167.2 2273.0 253.3 250 92% 83% 50% 73% 3,287,666 3,199,956 18%

SM23 CRC Female 19,255,847 90% 89% 0.31% 36, 36 2.80% 139.4 1655.0 91.1 92 41% 7% 0% 87% 1,111,921 1,111,459 6%

SM28 CRC Female 15,487,496 90% 87% 0.41% 36, 36 3.63% 170.6 1586.9 177.2 176 83% 63% 18% 65% 2,448,828 2,283,271 15%

SM29 CRC Female 11,187,953 90% 89% 0.34% 36, 36 2.34% 169.1 1568.7 156.2 155 77% 53% 11% 59% 2,312,007 2,043,114 18%

SM32 CRC Female 18,846,140 95% 92% 0.23% 36, 36 2.68% 185.1 2147.2 275.4 274 92% 86% 59% 71% 3,804,394 3,664,524 19%

SM34 CRC Male 19,892,704 95% 87% 0.29% 36, 36 4.41% 144.2 1918.7 96.8 94 44% 12% 0% 89% 1,183,986 1,183,855 6%

SM36E CRC Male 38,125,948 64% 87% 0.84% 36, 36 2.36% 170.4 1438.7 228.2 233 85% 75% 43% 73% 3,030,432 2,948,401 8%

SM39 CRC Male 21,261,869 94% 93% 0.26% 36, 36 2.77% 171.6 2293.7 141.3 142 77% 44% 3% 87% 1,858,817 1,858,188 9%

SM40 CRC Male 24,280,140 94% 91% 0.25% 36, 36 1.79% 173.6 2054.5 141.5 145 79% 46% 2% 87% 1,926,458 1,925,750 8%

SM41E CRC Female 18,542,548 94% 94% 0.22% 36, 36 2.45% 201.6 2252.6 211.4 207 90% 77% 31% 78% 2,930,814 2,900,347 16%

SM42E CRC Male 18,988,365 95% 95% 0.22% 36, 36 0.74% 184.3 2342.0 152.7 159 81% 56% 4% 85% 2,086,042 2,083,697 11%

SM45 CRC Female 26,249,436 94% 94% 0.23% 36, 36 1.06% 181.1 2329.5 182.9 187 86% 69% 18% 87% 2,490,649 2,489,484 9%

SM54 CRC Female 25,197,937 94% 92% 0.27% 36, 36 3.02% 175.0 2258.3 167.9 167 86% 61% 9% 87% 2,217,645 2,216,705 9%

SM61 CRC Male 39,256,837 63% 89% 0.83% 36, 36 1.98% 168.1 1308.2 229.6 239 82% 73% 46% 70% 3,160,993 3,036,072 8%

SM67 CRC Male 28,245,093 96% 88% 0.31% 36, 36 3.58% 161.3 1379.8 230.3 216 86% 73% 38% 74% 3,068,169 2,997,557 11%

SM73 CRC Male 23,936,537 94% 93% 0.26% 36, 36 3.45% 196.3 2229.3 145.5 144 79% 45% 3% 88% 2,022,561 2,022,050 8%

SM74NE CRC Female 10,955,649 95% 94% 0.21% 36, 36 1.63% 199.9 2213.0 203.4 204 87% 74% 28% 64% 3,016,934 2,786,571 25%

SM75 CRC Female 9,434,682 95% 87% 0.25% 36, 36 4.62% 189.3 1572.2 152.8 147 79% 48% 9% 54% 2,439,252 2,046,162 22%

SM77E CRC Male 7,025,793 94% 91% 0.24% 36, 36 3.43% 181.4 2090.1 108.6 103 52% 18% 2% 68% 1,504,042 1,426,893 20%

SM78E CRC Female 7,055,984 95% 91% 0.23% 36, 36 3.15% 207.6 1942.8 155.3 154 80% 52% 10% 51% 2,653,615 2,149,238 30%

SM81 CRC Male 24,702,715 94% 94% 0.21% 36, 36 1.07% 182.7 2325.8 353.5 344 93% 89% 74% 74% 4,742,318 4,633,928 19%

SM83 CRC Male 19,069,353 94% 95% 0.20% 36, 36 0.84% 187.3 2328.7 287.9 288 91% 84% 62% 73% 3,862,786 3,758,336 20%

SM88 CRC Male 40,923,233 95% 93% 0.22% 36, 36 1.63% 171.2 2273.0 304.7 304 93% 87% 67% 86% 3,972,120 3,969,048 10%

SM98E CRC Male 20,847,010 94% 92% 0.24% 36, 36 1.34% 213.2 1754.8 292.6 293 92% 86% 65% 65% 4,521,756 4,189,272 20%

SM100E CRC Female 18,406,316 95% 93% 0.22% 36, 36 1.47% 186.0 2160.3 274.2 277 91% 85% 60% 71% 3,813,874 3,674,132 20%

SM110E CRC Male 15,076,598 95% 93% 0.21% 36, 36 1.17% 176.1 2065.7 216.5 223 83% 73% 39% 70% 3,018,578 2,900,594 19%

Average 19,588,594 92% 92% 0.28% 36, 36 2.08% 172.5 2068.4 206.2 205.9 83% 66% 31% 75% 2,793,090 2,691,718 15%

Median 18,795,088 94% 92% 0.24% 36, 36 1.78% 172.6 2179.7 207.4 205.5 86% 73% 30% 74% 2,727,770 2,663,924 16%

Supplementary Table 2a. CRC assay sequencing statistics

Columns: Sample name; Cancer; Gender; Total pairs; % pass filter; % uniquely aligned; Error rate; Read pair length; % chimeras; Mean insert size; Fold enrichment; Mean exon coverage; Median exon coverage; 100X covered exons; 150X covered exons; 250X covered exons; On-target % duplication; Est on-target library size; On-target distinct pairs; % usable pairs

SM44 NSCLC Female 17,364,255 95% 88% 0.26% 36, 36 2.80% 166.6 1539.2 201.4 200 85% 70% 29% 68% 2,671,898 2,535,020 15%

SM46 NSCLC Female 20,948,260 94% 93% 0.24% 36, 36 1.77% 180.5 2281.0 178.9 182 86% 68% 15% 84% 2,426,651 2,421,216 12%

SM48 NSCLC Female 14,499,848 95% 92% 0.23% 36, 36 1.11% 163.4 2013.3 211.8 217 88% 77% 34% 69% 2,905,679 2,769,458 19%

SM49A5E NSCLC Female 19,297,409 95% 85% 0.37% 36, 36 6.16% 140.9 1510.8 173.4 172 83% 61% 15% 74% 2,150,042 2,098,151 11%

SM51 NSCLC Female 15,247,561 95% 92% 0.26% 36, 36 3.21% 140.5 2497.1 140.6 139 73% 43% 5% 85% 1,654,976 1,652,751 11%

SM53 NSCLC Female 15,976,709 95% 94% 0.21% 36, 36 0.77% 175.9 2423.8 285.8 283 92% 86% 61% 69% 3,888,103 3,713,704 23%

SM55 NSCLC Male 22,322,494 95% 87% 0.32% 36, 36 2.60% 144.2 1440.1 186.6 187 85% 67% 22% 75% 2,320,955 2,276,351 10%

SM63 NSCLC Female 13,442,277 95% 94% 0.21% 36, 36 2.09% 173.6 2352.7 238.1 243 88% 80% 47% 69% 3,198,092 3,050,028 23%

SM64E NSCLC Female 17,310,249 95% 93% 0.23% 36, 36 1.21% 185.7 2028.1 293.4 293 93% 87% 64% 65% 4,219,443 3,911,941 23%

SM70E NSCLC Female 15,525,662 95% 94% 0.19% 36, 36 0.95% 201.9 2207.0 314.2 319 92% 87% 70% 61% 4,731,278 4,275,065 28%

SM71E NSCLC Male 16,233,125 95% 91% 0.25% 36, 36 2.95% 153.1 2371.5 188.4 188 86% 68% 22% 80% 2,322,055 2,303,999 14%

SM86 NSCLC Female 37,176,479 94% 92% 0.26% 36, 36 2.12% 156.8 2287.5 179.4 186 89% 71% 11% 91% 2,291,983 2,291,945 6%

SM87 NSCLC Female 20,631,365 94% 93% 0.21% 36, 36 2.70% 197.2 2123.6 330.9 317 93% 88% 70% 68% 4,666,235 4,414,860 21%

SM89 NSCLC Female 20,486,730 95% 93% 0.23% 36, 36 2.05% 172.3 2146.7 192.7 199 85% 72% 25% 82% 2,527,689 2,516,344 12%

SM90 NSCLC Female 19,073,249 94% 94% 0.21% 36, 36 1.13% 215.6 1970.2 287.2 282 92% 86% 61% 68% 4,135,076 3,915,497 21%

SM91A3 NSCLC Female 22,892,049 94% 92% 0.23% 36, 36 1.07% 183.1 1952.5 131.2 136 70% 40% 3% 87% 1,839,751 1,838,841 8%

SM92 NSCLC Male 20,707,344 95% 92% 0.23% 36, 36 1.40% 167.9 2009.9 230.1 229 88% 78% 41% 77% 3,048,336 3,003,074 15%

SM93 NSCLC Unknown 20,162,905 94% 93% 0.24% 36, 36 1.06% 221.9 1912.4 301.7 290 90% 85% 62% 65% 4,663,117 4,348,135 22%

SM96 NSCLC Female 26,403,071 95% 94% 0.21% 36, 36 0.88% 197.2 2258.9 314.6 320 93% 88% 72% 78% 4,288,387 4,237,268 16%

SM107 NSCLC Female 31,404,026 94% 92% 0.25% 36, 36 2.63% 156.6 2393.8 139.9 137 74% 41% 4% 92% 1,695,838 1,695,835 5%

SM109 NSCLC Female 18,778,284 95% 93% 0.21% 36, 36 0.79% 199.4 1944.0 294.6 293 90% 84% 63% 65% 4,408,833 4,102,176 22%

SM112 NSCLC Male 16,541,878 95% 92% 0.25% 36, 36 2.56% 155.8 2216.2 193.1 196 87% 71% 24% 78% 2,456,445 2,428,207 15%

SM113 NSCLC Female 17,328,887 95% 89% 0.26% 36, 36 2.72% 164.4 1550.8 217.3 216 90% 78% 34% 65% 2,962,814 2,764,578 16%

SM114 NSCLC Male 39,093,969 94% 95% 0.62% 49, 49 1.75% 178.8 1735.3 1096.5 1173 95% 92% 88% 47% 17,022,547 13,072,965 33%

Average 20,785,337 95% 92% 0.26% 36, 36 2.02% 174.7 2048.6 263.4 266.5 87% 74% 39% 73% 3,687,343 3,401,559 17%

Median 19,185,329 95% 93% 0.24% 36, 36 1.91% 173.0 2075.8 214.5 216.5 88% 78% 34% 72% 2,934,247 2,767,018 16%

Supplementary Table 2b. NSCLC assay sequencing statistics

Columns: Sample; Total alterations; Substitutions; INDELs; Copy number changes; Rearrangement; per-gene columns: APC TP53 KRAS BRAF FBXW7 CDH1 MYC ATM BCL2L1 BRCA2 ERBB3 GNAS PIK3CA SMAD4 ALK CDK8 LRP1B MSH6 RICTOR SMAD2 STK11

SM77E 3 3 S1346*(37%) R213*(31%) R213*(72%)
SM18 3 3 D1394fs*21(42%) R248W(43%) S464*(13%)
SM41E 3 2 1 G1357fs*7(25%) G245S(22%) G12A(32%)
SM98E 6 3 3 K1878fs*4(33%) R273C(32%) R181C(38%) V600E(33%) S353fs*24(29%) T3033fs*29(44%)
SM75 3 1 2 Q1193fs*14(18%) E1309fs*12(25%) C242F(38%)
SM16 2 1 1 E1309fs*4(38%) R306*(42%)
SM21 4 4 E1353*(31%) E1374*(31%) R175H(58%) G284R(36%)
SM9 3 2 1 E1544*(27%) K1250fs*5(26%) V272M(36%)
SM7 3 2 1 G1288*(18%) S1100fs*26(29%) R248W(55%)
SM42E 2 2 K1370*(3%) G12V(3%)
SM83 2 2 K534*(29%) C238Y(32%)
SM20 4 3 1 L1488fs*19(36%) R213*(27%) G12A(21%) E545K(25%)
SM32 3 2 1 P1324fs*91(32%) Q1406*(9%) K132R(26%)
SM40 3 1 2 P1373fs*42(37%) E1317fs*4(17%) C176Y(41%)
SM4 3 3 Q1294*(53%) G12D(32%) R278*(64%)
SM6 4 3 1 Q1406*(21%) L1129S(31%) R280T(15%) Amp
SM28 3 3 Q1429*(32%) G12C(27%) E545K(11%)
SM23 4 4 Q1625*(10%) V600E(10%) V104M(16%) R201H(15%)
SM81 3 2 1 Q789*(28%) L235fs*58(16%) L194F(40%)
SM88 3 2 1 R1399fs*9(37%) E258G(32%) R658*(25%)
SM8E 9 3 6 R1450*(15%) T1556fs*9(16%) R196*(18%) V600E(13%) P126fs*89(5%) K2811fs*46(26%) C3869fs*1(9%) T406fs*8(15%) N854fs*12(26%)
SM45 3 3 R213*(25%) R282W(24%) G12V(19%)
SM100E 3 2 1 R216*(14%) C1270*(22%) V122fs*26(25%)
SM67 3 2 1 R232*(26%) Q1131*(30%) Amp
SM110E 3 2 1 R564*(18%) F212fs*3(10%) G12V(13%)
SM3 4 3 1 R564*(64%) E1322*(24%) R282W(88%) Amp
SM54 5 1 4 S1465fs*3(18%) L519fs*18(24%) S215_V218>RR(15%) V600E(21%) T576fs*29(13%)
SM29 3 1 1 1 C135F(46%) Amp Rearranged
SM14 2 1 1 W91fs*32(26%) G12V(38%)
SM13E 1 1 T253fs*11(10%)
SM34 1 1 E204fs*43(39%)
SM19 4 2 2 E258A(58%) V600E(18%) S668fs*26(17%) D360fs*24(12%)
SM36E 3 0 2 1 K292fs*54(21%) S1982fs*22(64%) Rearranged
SM12 7 3 4 P250H(11%) V600E(10%) R74*(11%) P201fs*14(5%) P126fs*89(4%) R1875*(12%) P1087fs*5(5%)
SM11 2 2 R248W(30%) G12V(30%)
SM39 3 2 1 R273C(30%) L114fs*9(21%) A118V(37%)
SM73 2 2 S240R(20%) Q214*(22%)
SM74NE 2 2 V173M(10%) G12C(7%)
SM61 1 1 201H(22%)
SM78E 0

Total 125 80 39 4 2 42 34 10 6 5 4 3 2 2 2 2 2 2 2 1 1 1 1 1 1 1

Supplementary Table 3a. Alterations in 40 CRC cases. Mutations with the percent mutant allele frequency (in brackets) are shown.

Columns: Sample; Total alterations; Substitutions; Total INDELs; Copy number changes; Gene fusion; per-gene columns: KRAS TP53 STK11 LRP1B JAK2 CTNNB1 RET EGFR BRAF CDKN2A MDM2 PIK3CA ATM TSC1 CCNE1 NF1 RB1 APC MLH1 MSH6 CDK4

SM109 1 1 G12F(24%)
SM86 4 4 G12C(28%) V617F(4%) I35S(2%) E545K(5%)
SM51 3 3 G12C(11%) E165*(14%) G466A(12%)
SM71E 1 1 G12C(12%)
SM89 2 1 1 G12C(13%) V997fs*1(9%)
SM90 2 2 G12V(12%) M237I(5%)
SM96 3 3 G12V(14%) Q2940*(10%) H83Y(27%)
SM44 3 2 1 G12V(27%) S232fs*55(19%) S37Y(10%)
SM70E 1 1 G12V(16%)
SM107 1 1 G12A(21%)
SM91A3 7 6 1 C229fs*10(26%) D194Y(18%) K4112*(7%) V617F(10%) Y1635*(8%) E280(23%) L1129S(41%)
SM93 2 2 C242F(55%) E13*(27%)
SM63 1 1 G245S(7%)
SM48 3 2 1 K132*(31%) E3508*(16%) R124_L130delRDVARYL(22%)
SM114 3 1 1 1 R248L(29%) Hom del Rearranged
SM53 3 1 1 1 Y163C(50%) L90fs*9(41%) Amp
SM92 6 4 2 V617F(8%) G466V(21%) Amp N345K(32%) V509A(51%) Amp
SM87 2 2 Amp Amp
SM64E 1 1 D770_N771insSVD(20%)
SM113 1 1 N429fs*7(42%)
SM46 0
SM112 0
SM49A5E 0
SM55 0

Total 50 36 7 6 1 10 7 4 3 3 2 1 2 2 2 2 2 2 1 1 1 1 1 1 1 1

Supplementary Table 3b. Alterations in 24 NSCLC cases. Mutations with the percent mutant allele frequency (in brackets) are shown.

Gene   No. of Mutated Samples   Potential therapeutic treatment or clinical trial

TP53 32 Presently unknown

APC 27 Presently unknown

KRAS 10 Resistance to cetuximab and panitumumab

BRAF 6 Resistance to cetuximab and panitumumab

FBXW7 5 Potential resistance to tubulins

ATM 2 PARP inhibitors

BCL2L1 2 Presently unknown

BRCA2 2 PARP inhibitors

CDH1 2 Presently unknown

ERBB3 2 Presently unknown

GNAS 2 MEK or ERK inhibitors

PIK3CA 2 PI3 kinase/mTOR inhibitors

SMAD4 2 Prognostic factor

ALK 1 ALK inhibitors e.g. Crizotinib

CDK8 1 CDK inhibitors e.g. Flavopiridol

LRP1B 1 Presently unknown

MYC 1 Presently unknown

MSH6 1 Prognostic factor

RICTOR 1 Presently unknown

SMAD2 1 Presently unknown

STK11 1 Presently unknown

Supplementary Table 4a. CRC alterations that could be linked to a clinical treatment option or clinical trial of novel targeted therapies.

Gene   No. of Mutated Samples   Potential therapeutic treatment or clinical trial

KRAS 10 Resistance to EGFR kinase inhibitors, clinical trials of PI3K and MEK inhibitors

STK11 4 Presently unknown

JAK2 3 JAK2 inhibitors

EGFR 2 Erlotinib or gefitinib

BRAF 2 Vemurafenib and GSK 2118436

CDKN2A 2 CDK inhibitors e.g. PD0332991

RET 2 RET inhibitors e.g. Sorafenib or sunitinib

CTNNB1 2 Presently unknown

MDM2 2 Nutlins

PIK3CA 2 PI3 kinase/mTOR inhibitors

ATM 2 PARP inhibitors

TSC1 1 mTOR inhibitors

CCNE1 1 CDK4 inhibitors e.g. PD0332991

NF1 1 Presently unknown

RB1 1 Presently unknown

MLH1 1 Presently unknown

MSH6 1 Presently unknown

CDK4 1 CDK4 inhibitors e.g. PD0332991

Supplementary Table 4b. NSCLC alterations that could be linked to a clinical treatment option or clinical trial of novel targeted therapies.

Characteristic               No. of Patients (n = 117)
Gender
  Male                       62
  Female                     55
Histology
  Adenocarcinoma             83
  Squamous cell carcinoma    26
  Carcinoid                  8
Smoking
  Never                      5
  Limited former             53
  Current                    34
  Unknown                    25
Stage
  I                          77
  II                         16
  III                        13
  IV                         8
  N/A                        3
RET IHC
  0                          78
  1+/2+                      17
  3+/4+                      22

No of KIF5B-RET fusions: 1

Frequency in all patients: 1/117: 0.85%

Frequency in adenocarcinoma: 1/89: 1.1%

Supplementary Table 5a. Summary of NSCLC patients analyzed by RET Immunohistochemistry.

Characteristic               No. of Patients (n = 121)
Gender
  Male                       16
  Female                     35
  N/A                        70
Histology
  Adenocarcinoma             96
  Squamous cell carcinoma    11
  Carcinoid                  10
  Other                      4
Smoking
  Never                      25
  Limited former or current  26
  N/A                        70
Mutation
  EGFR                       47 (39%)
  KRAS                       13 (11%)
  EML4-ALK                   4 (3%)
  BRAF                       3 (2.5%)
  CD74-ROS                   3 (2.5%)
  ERBB2                      2 (2%)
  None                       49 (40%)

No of KIF5B-RET fusions: 1

Frequency in all patients: 1/121: 0.8%

Frequency in adenocarcinoma: 1/96: 1.0%

Frequency in WT patients: 1/49: 2.0%

Supplementary Table 5b. Summary of European ancestry NSCLC patients analyzed by RT-PCR.

Characteristic               No. of Patients (n = 405)
Gender
  Male                       84
  Female                     321
Histology
  Adenocarcinoma             382
  Squamous cell carcinoma    17
  Carcinoid                  2
  Other                      4
Smoking
  Never                      373
  Limited former             32
Mutation
  EGFR                       228 (56%)
  EML4-ALK                   30 (7.4%)
  KRAS                       20 (5.5%)
  ERBB2                      10 (2.5%)
  BRAF                       4 (1%)
  CD74-ROS                   3 (0.5%)
  None                       110 (27%)

Note – 1 patient has concurrent KRAS and BRAF mutations

No. of KIF5B-RET fusions: 9

Frequency in all patients: 9/405: 2%

Frequency in all adenocarcinoma: 9/382: 2.4%

Frequency in WT patients: 9/110: 8.2%

Supplementary Table 5c. Summary of Asian NSCLC patients analyzed by RT-PCR.
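The frequency lines in the RT-PCR cohort summaries are simple ratios of fusion-positive cases to the size of each subgroup. The following is a minimal sketch of that arithmetic using the counts reported for Supplementary Tables 5b and 5c; the function name `frequency_percent` and the dictionary layout are illustrative assumptions, and the one-decimal rounding is a choice made here rather than something stated in the source.

```python
def frequency_percent(positives: int, total: int, ndigits: int = 1) -> float:
    """Return positives/total as a percentage rounded to ndigits decimals."""
    if total <= 0:
        raise ValueError("total must be a positive count")
    return round(100.0 * positives / total, ndigits)

# (fusion-positive cases, subgroup size) as reported in the cohort summaries.
cohorts = {
    "Supplementary Table 5b": {
        "all patients": (1, 121),
        "adenocarcinoma": (1, 96),
        "WT patients": (1, 49),
    },
    "Supplementary Table 5c": {
        "all patients": (9, 405),
        "adenocarcinoma": (9, 382),
        "WT patients": (9, 110),
    },
}

for table, groups in cohorts.items():
    for group, (positives, total) in groups.items():
        pct = frequency_percent(positives, total)
        print(f"{table}, {group}: {positives}/{total} = {pct}%")
```

Rounding to a different number of decimals only requires changing `ndigits`.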

Supplementary Figure 1. Predicted C2orf44-ALK gene fusion variant sequence.

1 ATGGAGTTGG GAAAAGGAAA ACTACTCAGG ACTGGACTGA ATGCGTTGCA TCAAGCAGTG

61 CATCCGATCC ATGGCCTTGC CTGGACCGAT GGGAATCAAG TTGTCCTAAC TGATTTGCGG

121 CTTCACAGTG GAGAGGTCAA GTTTGGGGAC TCCAAAGTCA TTGGACAGTT TGAATGTGTC

181 TGTGGGTTGT CCTGGGCCCC ACCTGTTGCA GATGATACAC CTGTTCTACT CGCTGTCCAG

241 CATGAGAAGC ATGTCACTGT GTGGCAGCTG TGTCCCAGCC CTATGGAGTC AAGCAAATGG

301 CTGACGTCTC AGACTTGTGA GATTAGAGGA TCACTACCTA TCCTTCCCCA GGGCTGTGTG

361 TGGCACCCAA AATGTGCTAT TCTGACTGTG TTGACTGCTC AGGATGTCTC CATTTTCCCT

421 AATGTTCACT CTGATGATTC CCAGGTAAAG GCAGACATCA ACACCCAGGG CCGCATTCAC

481 TGTGCATGTT GGACCCAGGA TGGCCTGAGG CTGGTGGTGG CAGTAGGCAG CAGCCTGCAT

541 TCTTATATTT GGGACAGCGC TCAGAAGACT CTTCACAGGT GCTCCTCCTG CCTGGTGTTT

601 GATGTGGACA GCCACGTCTG CTCCATCACA GCAACTGTGG ACTCACAGGT TGCTATAGCT

661 ACTGAGCTTC CATTGGATAA GATCTGTGGC TTAAATGCAT CTGAAACCTT TAATATCCCA

721 CCTAACAGTA AAGACATGAC TCCGTATGCT TTACCAGTTA TTGGTGAAGT ACGCTCTATG

781 GATAAAGAGG CAACTGATTC TGAAACAAAT TCTGAAGTAT CAGTTTCTTC TTCCTATTTA

841 GAACCTCTGG ATCTAACTCA CATACATTTC AATCAACATA AGTCTGAGGG TAATTCTCTT

901 ATTTGTCTAA GAAAAAAGGA CTACTTGACA GGAACTGGCC AAGATTCTTC ACATTTGGTC

961 CTTGTGACCT TTAAGAAGGC AGTTACCATG ACGAGAAAAG TCACTATTCC AGGCATTCTG

1021 GTTCCTGATC TGATAGCATT TAATCTTAAA GCCCACGTAG TGGCAGTGGC TTCCAACACT

1081 TGTAATATAA TTTTGATCTA CTCTGTCATT CCATCTTCAG TCCCAAACAT CCAGCAAATT

1141 CGATTAGAGA ACACTGAAAG ACCAAAAGGG ATATGTTTCT TGACAGACCA ACTATTACTA

1201 ATTTTGGTAG GAAAACAAAA ACTCACTGAT ACAACATTTC TTCCTTCTTC AAAGTCTGAT

1261 CAGTATGCCA TTAGCTTGAT TGTTAGAGAA ATAATGTTGG AAGAAGAACC TTCAATAACA

1321 TCAGGTGAAA GCCAGACTAC CTACTCTACT TTCAGTGCTC CGTTAAATAA AGCAAATAGA

1381 AAAAAGTTAA TTGAAAGTCT TTCCCCAGAT TTTTGTCACC AAAACAAAGG GCTGTTGCTG

1441 ACAGTTAATA CCAGTAGTCA GAATGGAAGG CCTGGAAGAA CACTTATTAA AGAAATCCAG

1501 AGTCCTCTGT CTAGTATCTG TGATGGCTCC ATAGCTCTAG ATGCTGAGCC TGTTACCCAG

1561 CCAGCATCGC TGCCCAGACA CAGCAGCACA CCAGACCACA CCAGCACACT GGAGCCTCCT

1621 CGTTTGCCTC AAAGAAAGAA CTTACAAAGT GAAAAGGAAA CTTATCAGCT GTCTAAGGAA

1681 GTGGAAATTT TATCTAGGAA CCTGGTTGAA ATGCAACGGT GTCTTTCTGA ACTTACAAAC

1741 CGTCTGCATA ATGGGAAGAA ATCCTCTTCA GTGTATCCAC TCTCTCAAGA TCTTCCTTAT

1801 GTTCACATCA TTTACCAGAA ACCTTATTAT CTAGGTCCTG TTGTTGAAAA AAGAGCGGTG

1861 CTTCTCTGTG ATGGTAAACT AAGGCTCAGT ACAGTTCAGC AGACTTTTGG CCTTTCTCTC

1921 ATTGAAATGC TACATGATTC CCACTGGATT CTTCTCTCTG CTGACAGTGA GGGCTTTATC

1981 CCGTTAACCT TCACAGCCAC ACAGGAAATA ATCATAAGAG ATGGCAGCCT GTCCAGGCTG

2041 GAGTGCATTG GCACAATCTT GGCTCACTGC AACCTCCAAC TCCCGGGTTC AAACCGTTCA

2101 GAGCTCAGGG GAGGATATGG AGATCCAGGG AGGCTTCCTG TAGGAAGTGG CCTGTGTAGT

2161 GCTTCAAGGG CCAGGCTGCC AGGCCATGTT GCAGCTGACC ACCCACCTGC AGTGTACCGC

2221 CGGAAGCACC AGGAGCTGCA AGCCATGCAG ATGGAGCTGC AGAGCCCTGA GTACAAGCTG

2281 AGCAAGCTCC GCACCTCGAC CATCATGACC GACTACAACC CCAACTACTG CTTTGCTGGC

2341 AAGACCTCCT CCATCAGTGA CCTGAAGGAG GTGCCGCGGA AAAACATCAC CCTCATTCGG

2401 GGTCTGGGCC ATGGCGCCTT TGGGGAGGTG TATGAAGGCC AGGTGTCCGG AATGCCCAAC

2461 GACCCAAGCC CCCTGCAAGT GGCTGTGAAG ACGCTGCCTG AAGTGTGCTC TGAACAGGAC

2521 GAACTGGATT TCCTCATGGA AGCCCTGATC ATCAGCAAAT TCAACCACCA GAACATTGTT

2581 CGCTGCATTG GGGTGAGCCT GCAATCCCTG CCCCGGTTCA TCCTGCTGGA GCTCATGGCG

2641 GGGGGAGACC TCAAGTCCTT CCTCCGAGAG ACCCGCCCTC GCCCGAGCCA GCCCTCCTCC

2701 CTGGCCATGC TGGACCTTCT GCACGTGGCT CGGGACATTG CCTGTGGCTG TCAGTATTTG

2761 GAGGAAAACC ACTTCATCCA CCGAGACATT GCTGCCAGAA ACTGCCTCTT GACCTGTCCA

2821 GGCCCTGGAA GAGTGGCCAA GATTGGAGAC TTCGGGATGG CCCGAGACAT CTACAGGGCG

2881 AGCTACTATA GAAAGGGAGG CTGTGCCATG CTGCCAGTTA AGTGGATGCC CCCAGAGGCC

2941 TTCATGGAAG GAATATTCAC TTCTAAAACA GACACATGGT CCTTTGGAGT GCTGCTATGG

3001 GAAATCTTTT CTCTTGGATA TATGCCATAC CCCAGCAAAA GCAACCAGGA AGTTCTGGAG

3061 TTTGTCACCA GTGGAGGCCG GATGGACCCA CCCAAGAACT GCCCTGGGCC TGTATACCGG

3121 ATAATGACTC AGTGCTGGCA ACATCAGCCT GAAGACAGGC CCAACTTTGC CATCATTTTG

3181 GAGAGGATTG AATACTGCAC CCAGGACCCG GATGTAATCA ACACCGCTTT GCCGATAGAA

3241 TATGGTCCAC TTGTGGAAGA GGAAGAGAAA GTGCCTGTGA GGCCCAAGGA CCCTGAGGGG

3301 GTTCCTCCTC TCCTGGTCTC TCAACAGGCA AAACGGGAGG AGGAGCGCAG CCCAGCTGCC

3361 CCACCACCTC TGCCTACCAC CTCCTCTGGC AAGGCTGCAA AGAAACCCAC AGCTGCAGAG

3421 ATCTCTGTTC GAGTCCCTAG AGGGCCGGCC GTGGAAGGGG GACACGTGAA TATGGCATTC

3481 TCTCAGTCCA ACCCTCCTTC GGAGTTGCAC AAGGTCCACG GATCCAGAAA CAAGCCCACC

3541 AGCTTGTGGA ACCCAACGTA CGGCTCCTGG TTTACAGAGA AACCCACCAA AAAGAATAAT

3601 CCTATAGCAA AGAAGGAGCC ACACGACAGG GGTAACCTGG GGCTGGAGGG AAGCTGTACT

3661 GTCCCACCTA ACGTTGCAAC TGGGAGACTT CCGGGGGCCT CACTGCTCCT AGAGCCCTCT

3721 TCGCTGACTG CCAATATGAA GGAGGTACCT CTGTTCAGGC TACGTCACTT CCCTTGTGGG

3781 AATGTCAATT ACGGCTACCA GCAACAGGGC TTGCCCTTAG AAGCCGCTAC TGCCCCTGGA

3841 GCTGGTCATT ACGAGGATAC CATTCTGAAA AGCAAGAATA GCATGAACCA GCCTGGGCCC

3901 TGA

Nucleotides derived from C2orf44 are shown in blue and nucleotides derived from ALK are shown in red.
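As a quick sanity check on a predicted fusion coding sequence such as this one, the sketch below tests whether a nucleotide string forms a single open reading frame. It checks that the string begins with ATG, has a length that is a multiple of three, ends in a stop codon, and carries no earlier in-frame stop. The function name `is_single_orf` and the short demo string (the first listing line followed by the terminal TGA) are illustrative assumptions and are not part of the original analysis.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def is_single_orf(seq: str) -> bool:
    """Return True if seq reads ATG ... stop with no earlier in-frame stop codon."""
    # Drop the spaces that separate the 10-base groups in the listing.
    seq = "".join(seq.split()).upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # Any stop codon before the last position would truncate the product.
    return not any(codon in STOP_CODONS for codon in codons[:-1])

# Hypothetical demo input: first listing line plus the terminal TGA,
# not the full predicted sequence.
demo = ("ATGGAGTTGG GAAAAGGAAA ACTACTCAGG ACTGGACTGA "
        "ATGCGTTGCA TCAAGCAGTG TGA")
print(is_single_orf(demo))
```

To check a complete listing, strip the position labels and pass the concatenated bases to the same function.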

Supplementary Figure 2. Predicted KIF5B-RET gene fusion variant sequence.

a) K15;R12 (Variant 1): KIF5B exon 15 fused to RET exon 12.

1 ATGGCGGACC TGGCCGAGTG CAACATCAAA GTGATGTGTC GCTTCAGACC TCTCAACGAG

61 TCTGAAGTGA ACCGCGGCGA CAAGTACATC GCCAAGTTTC AGGGAGAAGA CACGGTCGTG

121 ATCGCGTCCA AGCCTTATGC ATTTGATCGG GTGTTCCAGT CAAGCACATC TCAAGAGCAA

181 GTGTATAATG ACTGTGCAAA GAAGATTGTT AAAGATGTAC TTGAAGGATA TAATGGAACA

241 ATATTTGCAT ATGGACAAAC ATCCTCTGGG AAGACACACA CAATGGAGGG TAAACTTCAT

301 GATCCAGAAG GCATGGGAAT TATTCCAAGA ATAGTGCAAG ATATTTTTAA TTATATTTAC

361 TCCATGGATG AAAATTTGGA ATTTCATATT AAGGTTTCAT ATTTTGAAAT ATATTTGGAT

421 AAGATAAGGG ACCTGTTAGA TGTTTCAAAG ACCAACCTTT CAGTTCATGA AGACAAAAAC

481 CGAGTTCCCT ATGTAAAGGG GTGCACAGAG CGTTTTGTAT GTAGTCCAGA TGAAGTTATG

541 GATACCATAG ATGAAGGAAA ATCCAACAGA CATGTAGCAG TTACAAATAT GAATGAACAT

601 AGCTCTAGGA GTCACAGTAT ATTTCTTATT AATGTCAAAC AAGAGAACAC ACAAACGGAA

661 CAAAAGCTGA GTGGAAAACT TTATCTGGTT GATTTAGCTG GTAGTGAAAA GGTTAGTAAA

1201 ACTGGAGCTG AAGGTGCTGT GCTGGATGAA GCTAAAAACA TCAACAAGTC ACTTTCTGCT

1261 CTTGGAAATG TTATTTCTGC TTTGGCTGAG GGTAGTACAT ATGTTCCATA TCGAGATAGT

1321 AAAATGACAA GAATCCTTCA AGATTCATTA GGTGGCAACT GTAGAACCAC TATTGTAATT

1381 TGCTGCTCTC CATCATCATA CAATGAGTCT GAAACAAAAT CTACACTCTT ATTTGGCCAA

1441 AGGGCCAAAA CAATTAAGAA CACAGTTTGT GTCAATGTGG AGTTAACTGC AGAACAGTGG

1501 AAAAAGAAGT ATGAAAAAGA AAAAGAAAAA AATAAGATCC TGCGGAACAC TATTCAGTGG

1561 CTTGAAAATG AGCTCAACAG ATGGCGTAAT GGGGAGACGG TGCCTATTGA TGAACAGTTT

1621 GACAAAGAGA AAGCCAACTT GGAAGCTTTC ACAGTGGATA AAGATATTAC TCTTACCAAT

1681 GATAAACCAG CAACCGCAAT TGGAGTTATA GGAAATTTTA CTGATGCTGA AAGAAGAAAG

1741 TGTGAAGAAG AAATTGCTAA ATTATACAAA CAGCTTGATG ACAAGGATGA AGAAATTAAC

1801 CAGCAAAGTC AACTGGTAGA GAAACTGAAG ACGCAAATGT TGGATCAGGA GGAGCTTTTG

1861 GCATCTACCA GAAGGGATCA AGACAATATG CAAGCTGAGC TGAATCGCCT TCAAGCAGAA

1921 AATGATGCCT CTAAAGAAGA AGTGAAAGAA GTTTTACAGG CCCTAGAAGA ACTTGCTGTC

1981 AATTATGATC AGAAGTCTCA GGAAGTTGAA GACAAAACTA AGGAATATGA ATTGCTTAGT

2041 GATGAATTGA ATCAGAAATC GGCAACTTTA GCGAGTATAG ATGCTGAGCT TCAGAAACTT

2101 AAGGAAATGA CCAACCACCA GAAAAAACGA GCAGCTGAGA TGATGGCATC TTTACTAAAA

2161 GACCTTGCAG AAATAGGAAT TGCTGTGGGA AATAATGATG TAAAGGAGGA TCCAAAGTGG

2221 GAATTCCCTC GGAAGAACTT GGTTCTTGGA AAAACTCTAG GAGAAGGCGA ATTTGGAAAA

2281 GTGGTCAAGG CAACGGCCTT CCATCTGAAA GGCAGAGCAG GGTACACCAC GGTGGCCGTG

2341 AAGATGCTGA AAGAGAACGC CTCCCCGAGT GAGCTGCGAG ACCTGCTGTC AGAGTTCAAC

2401 GTCCTGAAGC AGGTCAACCA CCCACATGTC ATCAAATTGT ATGGGGCCTG CAGCCAGGAT

2461 GGCCCGCTCC TCCTCATCGT GGAGTACGCC AAATACGGCT CCCTGCGGGG CTTCCTCCGC

2521 GAGAGCCGCA AAGTGGGGCC TGGCTACCTG GGCAGTGGAG GCAGCCGCAA CTCCAGCTCC

2581 CTGGACCACC CGGATGAGCG GGCCCTCACC ATGGGCGACC TCATCTCATT TGCCTGGCAG

2641 ATCTCACAGG GGATGCAGTA TCTGGCCGAG ATGAAGCTCG TTCATCGGGA CTTGGCAGCC

2701 AGAAACATCC TGGTAGCTGA GGGGCGGAAG ATGAAGATTT CGGATTTCGG CTTGTCCCGA

2761 GATGTTTATG AAGAGGATTC CTACGTGAAG AGGAGCCAGG GTCGGATTCC AGTTAAATGG

2821 ATGGCAATTG AATCCCTTTT TGATCATATC TACACCACGC AAAGTGATGT ATGGTCTTTT

2881 GGTGTCCTGC TGTGGGAGAT CGTGACCCTA GGGGGAAACC CCTATCCTGG GATTCCTCCT

2941 GAGCGGCTCT TCAACCTTCT GAAGACCGGC CACCGGATGG AGAGGCCAGA CAACTGCAGC

3001 GAGGAGATGT ACCGCCTGAT GCTGCAATGC TGGAAGCAGG AGCCGGACAA AAGGCCGGTG

3061 TTTGCGGACA TCAGCAAAGA CCTGGAGAAG ATGATGGTTA AGAGGAGAGA CTACTTGGAC

3121 CTTGCGGCGT CCACTCCATC TGACTCCCTG ATTTATGACG ACGGCCTCTC AGAGGAGGAG

3181 ACACCGCTGG TGGACTGTAA TAATGCCCCC CTCCCTCGAG CCCTCCCTTC CACATGGATT

3241 GAAAACAAAC TCTATGGTAG AATTTCCCAT GCATTTACTA GATTCTAG

b) K16;R12 (Variant 2): KIF5B exon 16 fused to RET exon 12.

1 ATGGCGGACC TGGCCGAGTG CAACATCAAA GTGATGTGTC GCTTCAGACC TCTCAACGAG

61 TCTGAAGTGA ACCGCGGCGA CAAGTACATC GCCAAGTTTC AGGGAGAAGA CACGGTCGTG

121 ATCGCGTCCA AGCCTTATGC ATTTGATCGG GTGTTCCAGT CAAGCACATC TCAAGAGCAA

181 GTGTATAATG ACTGTGCAAA GAAGATTGTT AAAGATGTAC TTGAAGGATA TAATGGAACA

241 ATATTTGCAT ATGGACAAAC ATCCTCTGGG AAGACACACA CAATGGAGGG TAAACTTCAT

301 GATCCAGAAG GCATGGGAAT TATTCCAAGA ATAGTGCAAG ATATTTTTAA TTATATTTAC

361 TCCATGGATG AAAATTTGGA ATTTCATATT AAGGTTTCAT ATTTTGAAAT ATATTTGGAT

421 AAGATAAGGG ACCTGTTAGA TGTTTCAAAG ACCAACCTTT CAGTTCATGA AGACAAAAAC

481 CGAGTTCCCT ATGTAAAGGG GTGCACAGAG CGTTTTGTAT GTAGTCCAGA TGAAGTTATG

541 GATACCATAG ATGAAGGAAA ATCCAACAGA CATGTAGCAG TTACAAATAT GAATGAACAT

601 AGCTCTAGGA GTCACAGTAT ATTTCTTATT AATGTCAAAC AAGAGAACAC ACAAACGGAA

661 CAAAAGCTGA GTGGAAAACT TTATCTGGTT GATTTAGCTG GTAGTGAAAA GGTTAGTAAA

1201 ACTGGAGCTG AAGGTGCTGT GCTGGATGAA GCTAAAAACA TCAACAAGTC ACTTTCTGCT

1261 CTTGGAAATG TTATTTCTGC TTTGGCTGAG GGTAGTACAT ATGTTCCATA TCGAGATAGT

1321 AAAATGACAA GAATCCTTCA AGATTCATTA GGTGGCAACT GTAGAACCAC TATTGTAATT

1381 TGCTGCTCTC CATCATCATA CAATGAGTCT GAAACAAAAT CTACACTCTT ATTTGGCCAA

1441 AGGGCCAAAA CAATTAAGAA CACAGTTTGT GTCAATGTGG AGTTAACTGC AGAACAGTGG

1501 AAAAAGAAGT ATGAAAAAGA AAAAGAAAAA AATAAGATCC TGCGGAACAC TATTCAGTGG

1561 CTTGAAAATG AGCTCAACAG ATGGCGTAAT GGGGAGACGG TGCCTATTGA TGAACAGTTT

1621 GACAAAGAGA AAGCCAACTT GGAAGCTTTC ACAGTGGATA AAGATATTAC TCTTACCAAT

1681 GATAAACCAG CAACCGCAAT TGGAGTTATA GGAAATTTTA CTGATGCTGA AAGAAGAAAG

1741 TGTGAAGAAG AAATTGCTAA ATTATACAAA CAGCTTGATG ACAAGGATGA AGAAATTAAC

1801 CAGCAAAGTC AACTGGTAGA GAAACTGAAG ACGCAAATGT TGGATCAGGA GGAGCTTTTG

1861 GCATCTACCA GAAGGGATCA AGACAATATG CAAGCTGAGC TGAATCGCCT TCAAGCAGAA

1921 AATGATGCCT CTAAAGAAGA AGTGAAAGAA GTTTTACAGG CCCTAGAAGA ACTTGCTGTC

1981 AATTATGATC AGAAGTCTCA GGAAGTTGAA GACAAAACTA AGGAATATGA ATTGCTTAGT

2041 GATGAATTGA ATCAGAAATC GGCAACTTTA GCGAGTATAG ATGCTGAGCT TCAGAAACTT

2101 AAGGAAATGA CCAACCACCA GAAAAAACGA GCAGCTGAGA TGATGGCATC TTTACTAAAA

2161 GACCTTGCAG AAATAGGAAT TGCTGTGGGA AATAATGATG TAAAGCAGCC TGAGGGAACT

2221 GGCATGATAG ATGAAGAGTT CACTGTTGCA AGACTCTACA TTAGCAAAAT GAAGTCAGAA

2281 GTAAAAACCA TGGTGAAACG TTGCAAGCAG TTAGAAAGCA CACAAACTGA GAGCAACAAA

2341 AAAATGGAAG AAAATGAAAA GGAGTTAGCA GCATGTCAGC TTCGTATCTC TCAAGAGGAT

2401 CCAAAGTGGG AATTCCCTCG GAAGAACTTG GTTCTTGGAA AAACTCTAGG AGAAGGCGAA

2461 TTTGGAAAAG TGGTCAAGGC AACGGCCTTC CATCTGAAAG GCAGAGCAGG GTACACCACG

2521 GTGGCCGTGA AGATGCTGAA AGAGAACGCC TCCCCGAGTG AGCTGCGAGA CCTGCTGTCA

2581 GAGTTCAACG TCCTGAAGCA GGTCAACCAC CCACATGTCA TCAAATTGTA TGGGGCCTGC

2641 AGCCAGGATG GCCCGCTCCT CCTCATCGTG GAGTACGCCA AATACGGCTC CCTGCGGGGC

2701 TTCCTCCGCG AGAGCCGCAA AGTGGGGCCT GGCTACCTGG GCAGTGGAGG CAGCCGCAAC

2761 TCCAGCTCCC TGGACCACCC GGATGAGCGG GCCCTCACCA TGGGCGACCT CATCTCATTT

2821 GCCTGGCAGA TCTCACAGGG GATGCAGTAT CTGGCCGAGA TGAAGCTCGT TCATCGGGAC

2881 TTGGCAGCCA GAAACATCCT GGTAGCTGAG GGGCGGAAGA TGAAGATTTC GGATTTCGGC

2941 TTGTCCCGAG ATGTTTATGA AGAGGATTCC TACGTGAAGA GGAGCCAGGG TCGGATTCCA

3001 GTTAAATGGA TGGCAATTGA ATCCCTTTTT GATCATATCT ACACCACGCA AAGTGATGTA

3061 TGGTCTTTTG GTGTCCTGCT GTGGGAGATC GTGACCCTAG GGGGAAACCC CTATCCTGGG

3121 ATTCCTCCTG AGCGGCTCTT CAACCTTCTG AAGACCGGCC ACCGGATGGA GAGGCCAGAC

3181 AACTGCAGCG AGGAGATGTA CCGCCTGATG CTGCAATGCT GGAAGCAGGA GCCGGACAAA

3241 AGGCCGGTGT TTGCGGACAT CAGCAAAGAC CTGGAGAAGA TGATGGTTAA GAGGAGAGAC

3301 TACTTGGACC TTGCGGCGTC CACTCCATCT GACTCCCTGA TTTATGACGA CGGCCTCTCA

3361 GAGGAGGAGA CACCGCTGGT GGACTGTAAT AATGCCCCCC TCCCTCGAGC CCTCCCTTCC

3421 ACATGGATTG AAAACAAACT CTATGGTAGA ATTTCCCATG CATTTACTAG ATTCTAG

c) K22;R12 (Variant 3): KIF5B exon 22 fused to RET exon 12.

1 ATGGCGGACC TGGCCGAGTG CAACATCAAA GTGATGTGTC GCTTCAGACC TCTCAACGAG

61 TCTGAAGTGA ACCGCGGCGA CAAGTACATC GCCAAGTTTC AGGGAGAAGA CACGGTCGTG

121 ATCGCGTCCA AGCCTTATGC ATTTGATCGG GTGTTCCAGT CAAGCACATC TCAAGAGCAA

181 GTGTATAATG ACTGTGCAAA GAAGATTGTT AAAGATGTAC TTGAAGGATA TAATGGAACA

241 ATATTTGCAT ATGGACAAAC ATCCTCTGGG AAGACACACA CAATGGAGGG TAAACTTCAT

301 GATCCAGAAG GCATGGGAAT TATTCCAAGA ATAGTGCAAG ATATTTTTAA TTATATTTAC

361 TCCATGGATG AAAATTTGGA ATTTCATATT AAGGTTTCAT ATTTTGAAAT ATATTTGGAT

421 AAGATAAGGG ACCTGTTAGA TGTTTCAAAG ACCAACCTTT CAGTTCATGA AGACAAAAAC

481 CGAGTTCCCT ATGTAAAGGG GTGCACAGAG CGTTTTGTAT GTAGTCCAGA TGAAGTTATG

541 GATACCATAG ATGAAGGAAA ATCCAACAGA CATGTAGCAG TTACAAATAT GAATGAACAT

601 AGCTCTAGGA GTCACAGTAT ATTTCTTATT AATGTCAAAC AAGAGAACAC ACAAACGGAA

661 CAAAAGCTGA GTGGAAAACT TTATCTGGTT GATTTAGCTG GTAGTGAAAA GGTTAGTAAA

1201 ACTGGAGCTG AAGGTGCTGT GCTGGATGAA GCTAAAAACA TCAACAAGTC ACTTTCTGCT

1261 CTTGGAAATG TTATTTCTGC TTTGGCTGAG GGTAGTACAT ATGTTCCATA TCGAGATAGT

1321 AAAATGACAA GAATCCTTCA AGATTCATTA GGTGGCAACT GTAGAACCAC TATTGTAATT

1381 TGCTGCTCTC CATCATCATA CAATGAGTCT GAAACAAAAT CTACACTCTT ATTTGGCCAA

1441 AGGGCCAAAA CAATTAAGAA CACAGTTTGT GTCAATGTGG AGTTAACTGC AGAACAGTGG

1501 AAAAAGAAGT ATGAAAAAGA AAAAGAAAAA AATAAGATCC TGCGGAACAC TATTCAGTGG

1561 CTTGAAAATG AGCTCAACAG ATGGCGTAAT GGGGAGACGG TGCCTATTGA TGAACAGTTT

1621 GACAAAGAGA AAGCCAACTT GGAAGCTTTC ACAGTGGATA AAGATATTAC TCTTACCAAT

1681 GATAAACCAG CAACCGCAAT TGGAGTTATA GGAAATTTTA CTGATGCTGA AAGAAGAAAG

1741 TGTGAAGAAG AAATTGCTAA ATTATACAAA CAGCTTGATG ACAAGGATGA AGAAATTAAC

1801 CAGCAAAGTC AACTGGTAGA GAAACTGAAG ACGCAAATGT TGGATCAGGA GGAGCTTTTG

1861 GCATCTACCA GAAGGGATCA AGACAATATG CAAGCTGAGC TGAATCGCCT TCAAGCAGAA

1921 AATGATGCCT CTAAAGAAGA AGTGAAAGAA GTTTTACAGG CCCTAGAAGA ACTTGCTGTC

1981 AATTATGATC AGAAGTCTCA GGAAGTTGAA GACAAAACTA AGGAATATGA ATTGCTTAGT

2041 GATGAATTGA ATCAGAAATC GGCAACTTTA GCGAGTATAG ATGCTGAGCT TCAGAAACTT

2101 AAGGAAATGA CCAACCACCA GAAAAAACGA GCAGCTGAGA TGATGGCATC TTTACTAAAA

2161 GACCTTGCAG AAATAGGAAT TGCTGTGGGA AATAATGATG TAAAGCAGCC TGAGGGAACT

2221 GGCATGATAG ATGAAGAGTT CACTGTTGCA AGACTCTACA TTAGCAAAAT GAAGTCAGAA

2281 GTAAAAACCA TGGTGAAACG TTGCAAGCAG TTAGAAAGCA CACAAACTGA GAGCAACAAA

2341 AAAATGGAAG AAAATGAAAA GGAGTTAGCA GCATGTCAGC TTCGTATCTC TCAACATGAA

2401 GCCAAAATCA AGTCATTGAC TGAATACCTT CAAAATGTGG AACAAAAGAA AAGACAGTTG

2461 GAGGAATCTG TCGATGCCCT CAGTGAAGAA CTAGTCCAGC TTCGAGCACA AGAGAAAGTC

2521 CATGAAATGG AAAAGGAGCA CTTAAATAAG GTTCAGACTG CAAATGAAGT TAAGCAAGCT

2581 GTTGAACAGC AGATCCAGAG CCATAGAGAA ACTCATCAAA AACAGATCAG TAGTTTGAGA

2641 GATGAAGTAG AAGCAAAAGC AAAACTTATT ACTGATCTTC AAGACCAAAA CCAGAAAATG

2701 ATGTTAGAGC AGGAACGTCT AAGAGTAGAA CATGAGAAGT TGAAAGCCAC AGATCAGGAA

2761 AAGAGCAGAA AACTACATGA ACTTACGGTT ATGCAAGATA GACGAGAACA AGCAAGACAA

2821 GACTTGAAGG GTTTGGAAGA GACAGTGGCA AAAGAACTTC AGACTTTACA CAACCTGCGC

2881 AAACTCTTTG TTCAGGACCT GGCTACAAGA GTTAAAAAGG AGGATCCAAA GTGGGAATTC

2941 CCTCGGAAGA ACTTGGTTCT TGGAAAAACT CTAGGAGAAG GCGAATTTGG AAAAGTGGTC

3001 AAGGCAACGG CCTTCCATCT GAAAGGCAGA GCAGGGTACA CCACGGTGGC CGTGAAGATG

3061 CTGAAAGAGA ACGCCTCCCC GAGTGAGCTG CGAGACCTGC TGTCAGAGTT CAACGTCCTG

3121 AAGCAGGTCA ACCACCCACA TGTCATCAAA TTGTATGGGG CCTGCAGCCA GGATGGCCCG

3181 CTCCTCCTCA TCGTGGAGTA CGCCAAATAC GGCTCCCTGC GGGGCTTCCT CCGCGAGAGC

3241 CGCAAAGTGG GGCCTGGCTA CCTGGGCAGT GGAGGCAGCC GCAACTCCAG CTCCCTGGAC

3301 CACCCGGATG AGCGGGCCCT CACCATGGGC GACCTCATCT CATTTGCCTG GCAGATCTCA

3361 CAGGGGATGC AGTATCTGGC CGAGATGAAG CTCGTTCATC GGGACTTGGC AGCCAGAAAC

3421 ATCCTGGTAG CTGAGGGGCG GAAGATGAAG ATTTCGGATT TCGGCTTGTC CCGAGATGTT

3481 TATGAAGAGG ATTCCTACGT GAAGAGGAGC CAGGGTCGGA TTCCAGTTAA ATGGATGGCA

3541 ATTGAATCCC TTTTTGATCA TATCTACACC ACGCAAAGTG ATGTATGGTC TTTTGGTGTC

3601 CTGCTGTGGG AGATCGTGAC CCTAGGGGGA AACCCCTATC CTGGGATTCC TCCTGAGCGG

3661 CTCTTCAACC TTCTGAAGAC CGGCCACCGG ATGGAGAGGC CAGACAACTG CAGCGAGGAG

3721 ATGTACCGCC TGATGCTGCA ATGCTGGAAG CAGGAGCCGG ACAAAAGGCC GGTGTTTGCG

3781 GACATCAGCA AAGACCTGGA GAAGATGATG GTTAAGAGGA GAGACTACTT GGACCTTGCG

3841 GCGTCCACTC CATCTGACTC CCTGATTTAT GACGACGGCC TCTCAGAGGA GGAGACACCG

3901 CTGGTGGACT GTAATAATGC CCCCCTCCCT CGAGCCCTCC CTTCCACATG GATTGAAAAC

3961 AAACTCTATG GTAGAATTTC CCATGCATTT ACTAGATTCT AG

d) K15;R11 (Variant 4): KIF5B exon 15 fused to part of RET exon 11.

1 ATGGCGGACC TGGCCGAGTG CAACATCAAA GTGATGTGTC GCTTCAGACC TCTCAACGAG

61 TCTGAAGTGA ACCGCGGCGA CAAGTACATC GCCAAGTTTC AGGGAGAAGA CACGGTCGTG

121 ATCGCGTCCA AGCCTTATGC ATTTGATCGG GTGTTCCAGT CAAGCACATC TCAAGAGCAA

181 GTGTATAATG ACTGTGCAAA GAAGATTGTT AAAGATGTAC TTGAAGGATA TAATGGAACA

241 ATATTTGCAT ATGGACAAAC ATCCTCTGGG AAGACACACA CAATGGAGGG TAAACTTCAT

301 GATCCAGAAG GCATGGGAAT TATTCCAAGA ATAGTGCAAG ATATTTTTAA TTATATTTAC

361 TCCATGGATG AAAATTTGGA ATTTCATATT AAGGTTTCAT ATTTTGAAAT ATATTTGGAT

421 AAGATAAGGG ACCTGTTAGA TGTTTCAAAG ACCAACCTTT CAGTTCATGA AGACAAAAAC

481 CGAGTTCCCT ATGTAAAGGG GTGCACAGAG CGTTTTGTAT GTAGTCCAGA TGAAGTTATG

541 GATACCATAG ATGAAGGAAA ATCCAACAGA CATGTAGCAG TTACAAATAT GAATGAACAT

601 AGCTCTAGGA GTCACAGTAT ATTTCTTATT AATGTCAAAC AAGAGAACAC ACAAACGGAA

661 CAAAAGCTGA GTGGAAAACT TTATCTGGTT GATTTAGCTG GTAGTGAAAA GGTTAGTAAA

1201 ACTGGAGCTG AAGGTGCTGT GCTGGATGAA GCTAAAAACA TCAACAAGTC ACTTTCTGCT

1261 CTTGGAAATG TTATTTCTGC TTTGGCTGAG GGTAGTACAT ATGTTCCATA TCGAGATAGT

1321 AAAATGACAA GAATCCTTCA AGATTCATTA GGTGGCAACT GTAGAACCAC TATTGTAATT

1381 TGCTGCTCTC CATCATCATA CAATGAGTCT GAAACAAAAT CTACACTCTT ATTTGGCCAA

1441 AGGGCCAAAA CAATTAAGAA CACAGTTTGT GTCAATGTGG AGTTAACTGC AGAACAGTGG

1501 AAAAAGAAGT ATGAAAAAGA AAAAGAAAAA AATAAGATCC TGCGGAACAC TATTCAGTGG

1561 CTTGAAAATG AGCTCAACAG ATGGCGTAAT GGGGAGACGG TGCCTATTGA TGAACAGTTT

1621 GACAAAGAGA AAGCCAACTT GGAAGCTTTC ACAGTGGATA AAGATATTAC TCTTACCAAT

1681 GATAAACCAG CAACCGCAAT TGGAGTTATA GGAAATTTTA CTGATGCTGA AAGAAGAAAG

1741 TGTGAAGAAG AAATTGCTAA ATTATACAAA CAGCTTGATG ACAAGGATGA AGAAATTAAC

1801 CAGCAAAGTC AACTGGTAGA GAAACTGAAG ACGCAAATGT TGGATCAGGA GGAGCTTTTG

1861 GCATCTACCA GAAGGGATCA AGACAATATG CAAGCTGAGC TGAATCGCCT TCAAGCAGAA

1921 AATGATGCCT CTAAAGAAGA AGTGAAAGAA GTTTTACAGG CCCTAGAAGA ACTTGCTGTC

1981 AATTATGATC AGAAGTCTCA GGAAGTTGAA GACAAAACTA AGGAATATGA ATTGCTTAGT

2041 GATGAATTGA ATCAGAAATC GGCAACTTTA GCGAGTATAG ATGCTGAGCT TCAGAAACTT

2101 AAGGAAATGA CCAACCACCA GAAAAAACGA GCAGCTGAGA TGATGGCATC TTTACTAAAA

2161 GACCTTGCAG AAATAGGAAT TGCTGTGGGA AATAATGATG TAAAGTTTGC CCACAAGCCA

2221 CCCATCTCCT CAGCTGAGAT GACCTTCCGG AGGCCCGCCC AGGCCTTCCC GGTCAGCTAC

2281 TCCTCTTCCG GTGCCCGCCG GCCCTCGCTG GACTCCATGG AGAACCAGGT CTCCGTGGAT

2341 GCCTTCAAGA TCCTGGAGGA TCCAAAGTGG GAATTCCCTC GGAAGAACTT GGTTCTTGGA

2401 AAAACTCTAG GAGAAGGCGA ATTTGGAAAA GTGGTCAAGG CAACGGCCTT CCATCTGAAA

2461 GGCAGAGCAG GGTACACCAC GGTGGCCGTG AAGATGCTGA AAGAGAACGC CTCCCCGAGT

2521 GAGCTGCGAG ACCTGCTGTC AGAGTTCAAC GTCCTGAAGC AGGTCAACCA CCCACATGTC

2581 ATCAAATTGT ATGGGGCCTG CAGCCAGGAT GGCCCGCTCC TCCTCATCGT GGAGTACGCC

2641 AAATACGGCT CCCTGCGGGG CTTCCTCCGC GAGAGCCGCA AAGTGGGGCC TGGCTACCTG

2701 GGCAGTGGAG GCAGCCGCAA CTCCAGCTCC CTGGACCACC CGGATGAGCG GGCCCTCACC

2761 ATGGGCGACC TCATCTCATT TGCCTGGCAG ATCTCACAGG GGATGCAGTA TCTGGCCGAG

2821 ATGAAGCTCG TTCATCGGGA CTTGGCAGCC AGAAACATCC TGGTAGCTGA GGGGCGGAAG

2881 ATGAAGATTT CGGATTTCGG CTTGTCCCGA GATGTTTATG AAGAGGATTC CTACGTGAAG

2941 AGGAGCCAGG GTCGGATTCC AGTTAAATGG ATGGCAATTG AATCCCTTTT TGATCATATC

3001 TACACCACGC AAAGTGATGT ATGGTCTTTT GGTGTCCTGC TGTGGGAGAT CGTGACCCTA

3061 GGGGGAAACC CCTATCCTGG GATTCCTCCT GAGCGGCTCT TCAACCTTCT GAAGACCGGC

3121 CACCGGATGG AGAGGCCAGA CAACTGCAGC GAGGAGATGT ACCGCCTGAT GCTGCAATGC

3181 TGGAAGCAGG AGCCGGACAA AAGGCCGGTG TTTGCGGACA TCAGCAAAGA CCTGGAGAAG

3241 ATGATGGTTA AGAGGAGAGA CTACTTGGAC CTTGCGGCGT CCACTCCATC TGACTCCCTG

3301 ATTTATGACG ACGGCCTCTC AGAGGAGGAG ACACCGCTGG TGGACTGTAA TAATGCCCCC

3361 CTCCCTCGAG CCCTCCCTTC CACATGGATT GAAAACAAAC TCTATGGTAG AATTTCCCAT

3421 GCATTTACTA GATTCTAG

In all panels, nucleotides derived from KIF5B are shown in blue and nucleotides derived from RET are shown in red.
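Each listing line carries a printed position label followed by six 10-base groups, so a reader reusing the sequences may want to compare the labels with the actual running count of bases. The sketch below parses such lines and reports every place where a printed label differs from the count implied by the preceding line. The helper names `parse_listing` and `label_jumps` are assumptions introduced here, and the excerpt is three consecutive lines copied from panel a). A reported difference could mean either that bases were omitted or that the printed numbering slipped; the sketch only flags the discrepancy and does not decide between the two.

```python
import re

# A listing line: a position label, then groups of A/C/G/T separated by spaces.
LINE_RE = re.compile(r"^\s*(\d+)\s+([ACGT ]+?)\s*$")

def parse_listing(text):
    """Return a list of (printed_label, bases) tuples, one per listing line."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            rows.append((int(match.group(1)), match.group(2).replace(" ", "")))
    return rows

def label_jumps(rows):
    """Return (printed_label, expected_label) wherever the two disagree."""
    jumps = []
    expected = None
    for label, bases in rows:
        if expected is not None and label != expected:
            jumps.append((label, expected))
        # Resynchronise on the printed label so each jump is reported once.
        expected = label + len(bases)
    return jumps

excerpt = """
601 AGCTCTAGGA GTCACAGTAT ATTTCTTATT AATGTCAAAC AAGAGAACAC ACAAACGGAA
661 CAAAAGCTGA GTGGAAAACT TTATCTGGTT GATTTAGCTG GTAGTGAAAA GGTTAGTAAA
1201 ACTGGAGCTG AAGGTGCTGT GCTGGATGAA GCTAAAAACA TCAACAAGTC ACTTTCTGCT
"""

rows = parse_listing(excerpt)
print(label_jumps(rows))
```

The same two functions can be run over a whole panel to obtain the concatenated sequence (`"".join(bases for _, bases in rows)`) together with a list of label discrepancies.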

[Figure: schematic of the four KIF5B-RET fusion variants. Each panel shows the KIF5B kinesin and coiled-coil domains fused to the RET tyrosine kinase domain, with protein positions above and the cDNA confirmation of the fusion junction below.]

a) K15;R12 (Variant 1, 8 cases): KIF5B Exon 15, RET Exon 12. Positions: 1, 575, 977. Junction: TGTAAAGGAGGATC

b) K16;R12 (Variant 2, 3 cases): KIF5B Exon 16, RET Exon 12. Positions: 1, 638, 1040. Junction: CTCTCAAGAGGATC

c) K22;R12 (Variant 3, 1 case): KIF5B Exon 22, RET Exon 12. Positions: 1, 852, 1254. Junction: TAAAAAGGAGGATC

d) K15;R11* (Variant 4, 1 case): KIF5B Exon 15, RET Exon 11*. Positions: 1, 575, 1027. Junction: TGTAAAGTTTGCCC

Supplementary Figure 3. KIF5B-RET fusion transcripts.

The total length and the position of the fusion breakpoint are shown above each variant protein, and capillary sequence confirmation of the exon junction boundaries derived from cDNA is shown below. *This case also harbored a KIF5B-RET gene fusion transcript variant 1.
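The 14-base junction reads shown in the panels of Supplementary Figure 3 can be used as simple signatures for assigning a cDNA sequence to a fusion variant. The sketch below is a hedged illustration of that idea: it looks for each junction string in a read and returns the matching variant names. The dictionary `JUNCTIONS` and the function `classify_read` are names introduced here, and the two demo reads are the listing lines spanning the junction in panels a) and d) of Supplementary Figure 2. This is exact string matching only; it is not the capillary sequencing or RT-PCR procedure used in the study.

```python
# Junction reads from the figure panels, with spaces removed.
JUNCTIONS = {
    "TGTAAAGGAGGATC": "K15;R12 (Variant 1)",
    "CTCTCAAGAGGATC": "K16;R12 (Variant 2)",
    "TAAAAAGGAGGATC": "K22;R12 (Variant 3)",
    "TGTAAAGTTTGCCC": "K15;R11 (Variant 4)",
}

def classify_read(read: str) -> list:
    """Return the names of all variants whose junction string occurs in read."""
    seq = "".join(read.split()).upper()  # tolerate grouped or lower-case input
    return sorted(name for motif, name in JUNCTIONS.items() if motif in seq)

# Demo reads: one listing line from panel a) and one from panel d).
read_a = "GACCTTGCAG AAATAGGAAT TGCTGTGGGA AATAATGATG TAAAGGAGGA TCCAAAGTGG"
read_d = "GACCTTGCAG AAATAGGAAT TGCTGTGGGA AATAATGATG TAAAGTTTGC CCACAAGCCA"

print(classify_read(read_a))
print(classify_read(read_d))
```

A read that returns an empty list either does not span a junction or carries a junction not covered by the four signatures.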
