Genomic DNA Variation
Computer-Aided Discovery Methods
Baylor College of Medicine course 311-405
Term 3, 2008/2009
Lecture on Wednesday, January 28th, 2009
Aleksandar Milosavljevic, PhD
http://www.brl.bcm.tmc.edu
Entering Segment 2
Segment 1 (3 weeks): Cancer Lectures (1,2,3) Lab: Genboree, Ruby
Segment 2 (4 weeks): Bringing it together: Lecture+Lab
Segment 3: Review lectures
Background reading
A broad-brush survey of trends:
CREATIVITY SUPPORT TOOLS: Accelerating Discovery and Innovation
Ben Shneiderman
A bit of history and pointers to philosophy (Karl Popper, C.S. Peirce):
THINKING WITH MACHINES: Intelligence Augmentation, Evolutionary Epistemology, and Semiotic
Peter Skagestad
Cancer Genome Variation: Methods
Recent landmark studies ( not covered this year ):
The Cancer Genome Atlas Research Network. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature. 2008 Sep 4. [Epub ahead of print].
Parsons DW et al. An integrated genomic analysis of human glioblastoma multiforme. Science. 2008 Sep 26;321(5897):1807-12. Epub 2008 Sep 4.
Jones S et al. Core signaling pathways in human pancreatic cancers revealed by global genomic analyses. Science. 2008 Sep 26;321(5897):1801-6. Epub 2008 Sep 4.
Chromosome Aberrations: References 1 of 2
Background (optional)
[Balmain 2001] Balmain, A., Cancer genetics: from Boveri and Mendel to microarrays. Nat Rev Cancer, 2001. 1(1): p. 77-82.
[Albertson et al. 2003] Albertson, D.G., et al., Chromosome aberrations in solid tumors. Nat Genet, 2003. 34(4): p. 369-76.
[Rabbitts et al. 2003] Rabbitts, T.H. and M.R. Stocks, Chromosomal translocation products engender new intracellular therapeutic technologies. Nat Med, 2003. 9(4): p. 383-6.
Chromosome Aberrations References 2 of 2
Breast cancer – copy number variation, array CGH and gene expression
[Chin K. et al. 2006] Chin K et al. Genomic and transcriptional aberrations linked to breast cancer pathophysiologies, Cancer Cell 10:529-541 2006
Prostate Cancer – aberrant fusions – via gene expression
[Tomlins et al. 2005] Tomlins SA et al., Recurrent fusion of TMPRSS2 and ETS transcription factor genes in prostate cancer. Science, 2005. 310(5748): p. 644-8.
Breast cancer – direct detection of aberrant fusions by end-sequence profiling
[Hampton OA et al] Hampton, OA et al, A sequence-level map of chromosomal breakpoints in the MCF-7 breast cancer cell line yields insights into the evolution of a cancer genome. Genome Research. 2008 Dec 9. [Epub ahead of print]
Boveri, one century ago …
Multiple cell poles cause unequal segregation of chromosomes.
a | Fertilization of sea-urchin eggs by two sperm results in multiple cell poles.
b | Chromosomes are aberrantly segregated.
[Balmain 2001]
Chin K. et al., Genomic and transcriptional aberrations linked to breast cancer pathophysiologies, Cancer Cell 10:529-541 2006.
• 100+ aggressively treated early stage breast tumors
1989-1997, before ERBB2 antagonist Trastuzumab (Herceptin) was approved for treating ERBB2+ breast cancer
ERBB2 heuristic (“paradigm”) formulated in last sentence of Chin K. et al.
“Taking ERBB2 as the paradigm (recurrently amplified, overexpressed, associated with outcome and with demonstrated functional importance in cancer) suggests FGFR1, TACC1, ADAM9, IKBKB, PNMT, and GRB7 as high-priority therapeutic targets in these regions of amplification.”
Array CGH (~3K BAC array)
Gene expression (Affymetrix U133A array)
Deletions, amplifications induce aberrant fusions
…but…
Some aberrant fusion-producing rearrangements ( reciprocal translocations, inversions ) may not affect copy number
Mapping rearrangements ( aberrant fusions ) using paired ends
Two significant types of aberrant fusions
[Rabbitts et al.]
• aberrantly amplified expression
• aberrant activation of signaling protein
BCR-ABL fusion in Chronic Myeloid Leukaemia: four decades from lesion discovery to Imatinib (Gleevec)
1960: Philadelphia chromosome discovered
1973: Chromosome translocation t(9;22) identified
1983: Activated oncogene ABL identified
2001: Drug inhibiting BCR-ABL fusion identified
Fourfold significance of recurrent chromosomal aberrations
Prognostic Marker
Drug target
Pointing to biological pathway
Early diagnostic marker
Two case studies of fusion discovery
Case Study: Prostate Cancer
Overexpression → recurrent chromosomal aberration
[Tomlins et al. 2005] Tomlins, S.A., et al., Recurrent fusion of TMPRSS2 and ETS transcription factor genes in prostate cancer. Science, 2005. 310(5748): p. 644-8.
Case Study: Breast cancer
Direct discovery of submicroscopic chromosomal aberrations
[Hampton OA et al] Hampton, OA et al. A sequence-level map of chromosomal breakpoints in the MCF-7 breast cancer cell line yields insights into the evolution of a cancer genome. Genome Research. 2008 Dec 9. [Epub ahead of print]
Case Study: Prostate Cancer
Recurrent ( > 50% cases) chromosomal aberrations discovered in leukaemias, lymphomas, and sarcomas
Carcinomas are more complex:
• more rearrangements
• submicroscopic structure
Gene overexpression → recurrent chromosomal aberration present in > 50% of prostate carcinomas
[Tomlins et al. 2005] Tomlins, S.A., et al., Recurrent fusion of TMPRSS2 and ETS transcription factor genes in prostate cancer. Science, 2005. 310(5748): p. 644-8.
Cancer Outlier Profile Analysis (COPA) using Oncomine database reveals overexpression of
ETV1 and ERG
[Tomlins et al.]
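A minimal sketch of the outlier scoring behind COPA, as commonly described: each gene's expression values are median-centered, scaled by the median absolute deviation (MAD), and the gene is scored by an upper percentile of the transformed values. The function names, the percentile interpolation, and the toy expression values are assumptions for illustration; they are not taken from Oncomine.

```python
import statistics


def copa_transform(values):
    """Median-center the values and scale them by the median absolute deviation."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; gene cannot be scaled")
    return [(v - med) / mad for v in values]


def percentile(sorted_values, q):
    """q-th percentile (0-100) with linear interpolation between ranks."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def copa_score(values, q=90):
    """Score a gene by the q-th percentile of its transformed values."""
    return percentile(sorted(copa_transform(values)), q)


# Toy data (hypothetical): one gene is high in a small subset of samples,
# the other varies evenly across samples.
outlier_gene = [1, 1, 1, 1, 2, 2, 2, 2, 10, 12]
even_gene = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ranked = sorted(
    {"outlier_gene": outlier_gene, "even_gene": even_gene}.items(),
    key=lambda item: copa_score(item[1]),
    reverse=True,
)
print([name for name, _ in ranked])
```

Genes with a strong overexpressed subset of samples rise to the top of the ranking, which is the pattern expected when only some tumors carry a given rearrangement.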
Recurrent TMPRSS2:ETV1 and TMPRSS2:ERG fusions
revealed by the study of rearrangements involving ETV1 and ERG
Expression of TMPRSS2 is regulated by androgen
[Tomlins et al.]
TMPRSS2 translocation associated with:
• Aggressive disease
Cancer Res 66:8347-51, 2006
• Reduced disease free survival
Cancer Biol Ther 6, 2007
• Higher rate of prostate cancer specific death
TMPRSS2:ERG gene fusion associated with lethal prostate cancer in a watchful waiting cohort. Oncogene, 2007
Detecting breakpoints / fusions by end-sequence profiling of genomic DNA fragments
Human Chr 20
Human Chr 3
Cancer chromosome
Paired-end shotgun sequencing
Spectral Karyotyping (SKY) of MCF-7 breast cancer cell line
• Near triploid
• Translocations involve all chromosomes except 4
Davidson et al (2000) Br J Cancer 83, 1309-17
Current model for origin of rearrangements in breast cancer:
Breakage-Fusion-Bridge (BFB) cycles initiated by “sticky” telomere ends
End-sequence profiling of cancer
First genome-wide End-Sequence Profile of cancer: MCF-7 breast cancer cell line (Volik et al, 2003 & 2006)
~20,000 BAC ends sequenced by Sanger method
~1X genome coverage
[Figure: an MCF-7 BAC (~150Kb) whose end tags (Left Tag, Right Tag) map to chromosome 17 and chromosome 20]
Whole-genome BAC-end sequencing of MCF-7 (Volik et al. 2006):
1) ~20,000 MCF-7 BACs end-sequenced
2) end-sequences mapped onto reference genome
[Figure: rearrangement-spanning MCF-7 BACs plotted by chromosome, distinguishing intrachromosomally rearranged BACs from interchromosomally rearranged BACs]
~ 600 BACs contain rearrangements (Volik et al. 2006)
~ 2.5 % of the human genome
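The classification of mapped end pairs can be sketched as follows. This is an illustrative reading of the slide, not the procedure of Volik et al.: the record layout, the `MAX_SPAN` cutoff, and the orientation rule are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class EndPair:
    """Mapped positions of the two end sequences of one clone (hypothetical layout)."""
    clone: str
    left_chrom: str
    left_pos: int
    left_strand: str
    right_chrom: str
    right_pos: int
    right_strand: str


# Assumed upper bound on the distance between the ends of an unrearranged BAC.
MAX_SPAN = 300_000


def classify(pair, max_span=MAX_SPAN):
    """Label a clone as concordant, intrachromosomal, or interchromosomal."""
    if pair.left_chrom != pair.right_chrom:
        return "interchromosomal"
    span = abs(pair.right_pos - pair.left_pos)
    # Assumed convention: ends of an unrearranged clone face each other,
    # so the lower-coordinate end is on "+" and the other end is on "-".
    first, second = sorted(
        [(pair.left_pos, pair.left_strand), (pair.right_pos, pair.right_strand)]
    )
    if span > max_span or (first[1], second[1]) != ("+", "-"):
        return "intrachromosomal"
    return "concordant"


pairs = [
    EndPair("bac1", "chr17", 1_000, "+", "chr20", 5_000, "-"),
    EndPair("bac2", "chr1", 1_000, "+", "chr1", 151_000, "-"),
    EndPair("bac3", "chr1", 1_000, "+", "chr1", 5_000_000, "-"),
]
for p in pairs:
    print(p.clone, classify(p))
```

Only clones labeled intrachromosomal or interchromosomal would be carried forward as candidate rearrangement-spanning BACs.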
[Workflow figure: six 96-BAC pools (Pool 1 to Pool 6), each with a fosmid library (Library A to Library F); three 454 PyroSeq runs (Run 1 to Run 3)]
8-10K fosmid clones selected from each library for end sequencing (Sanger)
569 non-redundant rearranged BACs
Volik et al, 2003 & 2006
Down to the basepair level:
Sequencing of Rearrangement-spanning MCF-7 BACs
Hampton OA et al.
Bridging (FES) and Outlining (454 PyroSeq)
[Figure: a BAC (134Kb) bridged by fosmids (40Kb) and outlined by PyroSeq reads mapping to chromosome 3, chromosome 17, and chromosome 20]
157 PCR-confirmed somatic breakpoint junctions
Hampton OA et al.
Genomic Aberrations in MCF-7
[Figure: chromosomes 1, 3, 17, and 20]
157 rearrangements
• detected in BACs
• PCR-validated on gDNA
83 Intrachromosomal
74 Interchromosomal
Hampton OA et al.
A majority of dispersed breakpoints fall within LCRs (low-copy repeats)
Hampton OA et al.
Detection of Fusion Transcripts
Transcript RT-PCR to validate expression of predicted fusion transcripts
[Figure: genomic fusion and predicted transcript for RAD51C and ATXN7 (labels: promoter, Exon 6, Exon 7, Exon 13); RT-PCR lanes MCF7, 10A, and N for the fusion, RAD51C, and ATXN7]
Hampton OA et al.
Expression of predicted fusion transcripts
Validation by siRNA knock-down
Hampton OA et al.
Biological validation:
siRNA knock-down of SULF2 in 3 cell lines
growth
survival
anchorage-independent growth
Hampton OA et al.
Expression of predicted fusion transcripts
Hampton OA et al.
Two Mechanisms for Double-Strand Break Repair
NAHR:
Non-Allelic Homologous Recombination
NHEJ:
Non-Homologous End-Joining
Figure 12.32 The Biology of Cancer (© Garland Science 2007)
RAD51
RAD51C
Roles of RAD51 and RAD51C in HR
NAHR: Non-Allelic Homologous Recombination
RAD51C is under-expressed in 51 out of 53 breast cancer cell lines relative to normal breast tissue
[Figure: bar chart of RAD51C expression level (y-axis) by cell line (x-axis), with normal breast and MCF-7 marked]
Cell lines: 600MPE, AU565, BT20, BT474, BT483, BT549, CAMA1, DU4475, HBL100, HCC38, HCC70, HCC202, HCC1007, HCC1008, HCC1143, HCC1187, HCC1428, HCC1500, HCC1569, HCC1599, HCC1937, HCC1954, HCC2157, HCC2185, HCC3153, HS578T, LY2, MCF10A, MCF12A, MCF7, MDAMB134, MDAMB157, MDAMB175, MDAMB231, MDAMB361, MDAMB415, MDAMB435, MDAMB436, MDAMB453, MDAMB468, SKBR3, SUM44PE, SUM52PE, SUM149PT, SUM159PT, SUM185PE, SUM190PT, SUM225CWN, SUM1315, T47D, UACC812, ZR751, ZR7530, ZR75B
RAD51C under-expression is cancer specific, not tissue specific
Does the RAD51C / ATXN7 fusion
• interfere with resolution of Holliday junctions or
• otherwise affect HR
in a dominant negative fashion?
Coverage is proportional to insert size
[Figure labels: long inserts, short inserts, 2X coverage]
coverage = L * N / G
L = insert size
N = number of inserts
G = genome size
Probability of breakpoint detection = 1 - e^(-coverage)
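As a worked illustration of the two formulas, the sketch below plugs in the MCF-7 BAC numbers quoted earlier (~20,000 BACs of ~150Kb). The genome size of 3 Gbp is a rounded assumption, and the function names are made up for this example.

```python
import math

GENOME_SIZE = 3_000_000_000  # assumed, rounded human genome size in bp


def coverage(insert_size, n_inserts, genome_size=GENOME_SIZE):
    """coverage = L * N / G"""
    return insert_size * n_inserts / genome_size


def detection_probability(cov):
    """Probability that at least one insert spans a given breakpoint: 1 - e^(-coverage)."""
    return 1.0 - math.exp(-cov)


def inserts_needed(target_probability, insert_size, genome_size=GENOME_SIZE):
    """Invert the two formulas: N = -ln(1 - p) * G / L, rounded up."""
    return math.ceil(-math.log(1.0 - target_probability) * genome_size / insert_size)


bac_coverage = coverage(150_000, 20_000)          # 1.0, i.e. ~1X
p_detect = detection_probability(bac_coverage)    # about 0.632
print(f"coverage = {bac_coverage:.2f}X, P(detect) = {p_detect:.3f}")
print("BACs needed for 95% detection:", inserts_needed(0.95, 150_000))
```

At 1X coverage roughly one breakpoint in three is expected to be missed, which is why the number of inserts must grow by about a factor of three to reach 95% detection.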
Massively parallel paired-end sequencing
fragment size   platform          run (~ cost unit)
200 bp          Illumina          > 50M reads per run (8 lanes)
3 Kbp           Illumina, SOLiD   > 50M reads per run
20 Kbp          Roche-454         < 1M reads per run
40 Kbp          diTag method      > 50M reads (54bp diTags) per run
[Figure: error rate (%) versus cycles (0 to 70) for four run types: 75 fragment, 75 paired end, 45 paired end, 35 fragment]
Left Tag Right Tag
Illumina
BLAST hits using diTag as query
Platform-independent end-sequencing
[Figure labels: Paired-end Method X, Paired-end Method Y, Paired-end Method Z; Vendor X, Vendor Y, Vendor Z; $1M genome, $100 genome]
Modular paired-end method
Effective coverage is reduced when cell population is heterogeneous
Effective coverage = Coverage * Fraction of tumor cells with rearrangement
80% tumor cells, 20% non-tumor cells: Effective coverage = Coverage * 80%
20% tumor cells, 80% non-tumor cells: Effective coverage = Coverage * 20%
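A short sketch of how sample purity changes the detection probability, combining the effective-coverage rule with the 1 - e^(-coverage) formula from the earlier slide. The nominal coverage of 2X is an arbitrary value chosen for illustration.

```python
import math


def effective_coverage(nominal_coverage, tumor_fraction):
    """Effective coverage = coverage * fraction of cells carrying the rearrangement."""
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0 and 1")
    return nominal_coverage * tumor_fraction


def detection_probability(cov):
    """1 - e^(-coverage)"""
    return 1.0 - math.exp(-cov)


NOMINAL = 2.0  # hypothetical nominal coverage
for fraction in (0.8, 0.2):
    eff = effective_coverage(NOMINAL, fraction)
    print(f"{fraction:.0%} tumor cells: effective coverage {eff:.1f}X, "
          f"P(detect) = {detection_probability(eff):.3f}")
```

Under these assumptions the same sequencing effort gives a much lower chance of seeing a breakpoint in the 20% sample than in the 80% sample.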
From the perspective of an LCR breakpoint, insert size is effectively reduced by LCR size
Probability of breakpoint detection = 1 - e^(-effective coverage)
effective coverage = W * N / G
W = insert size – LCR size (“wiggle room”)
N = number of inserts
G = genome size
[Figure labels: “wiggle room”, inserts, LCR, breakpoint]
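The "wiggle room" formula can be tried out with a few numbers. The sketch below assumes a 3 Gbp genome, a 40Kb fosmid insert, and 100,000 inserts; treating an insert shorter than the LCR as giving zero wiggle room is also an assumption of this example, not a statement from the slide.

```python
import math

GENOME_SIZE = 3_000_000_000  # assumed, rounded human genome size in bp


def lcr_effective_coverage(insert_size, lcr_size, n_inserts, genome_size=GENOME_SIZE):
    """effective coverage = W * N / G, with W = insert size - LCR size."""
    wiggle_room = max(insert_size - lcr_size, 0)  # assumed: no negative wiggle room
    return wiggle_room * n_inserts / genome_size


def detection_probability(cov):
    """1 - e^(-effective coverage)"""
    return 1.0 - math.exp(-cov)


N_INSERTS = 100_000  # hypothetical number of fosmid inserts
for lcr_size in (0, 10_000, 40_000):
    cov = lcr_effective_coverage(40_000, lcr_size, N_INSERTS)
    print(f"LCR {lcr_size:>6} bp: effective coverage {cov:.2f}X, "
          f"P(detect) = {detection_probability(cov):.3f}")
```

The last case shows why long inserts matter: once the LCR is as long as the insert, no insert can anchor both ends in unique sequence and the detection probability drops to zero.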
Laboratory exercise this week: array CGH
Analysis of array CGH data from a set of tumor samples using Genboree
– Upload array CGH data
– Perform segmentation (invoke Bioconductor tool)
– Subtract polymorphisms (databases, current literature)
– Identify recurrent amplifications or deletions
– Study correlation with gene expression
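The "identify recurrent amplifications" step can be pictured with a small stand-alone sketch. It is not the Genboree or Bioconductor workflow used in the lab: it assumes segmentation has already been done, that each segment is a (chromosome, start, end, log2 ratio) tuple with half-open coordinates, and that the threshold, bin size, and sample data below are hypothetical.

```python
from collections import Counter

AMP_THRESHOLD = 0.3    # hypothetical log2-ratio cutoff for calling a gain
BIN_SIZE = 1_000_000   # hypothetical 1 Mb bins


def amplified_bins(segments, threshold=AMP_THRESHOLD, bin_size=BIN_SIZE):
    """Return the set of (chrom, bin index) pairs covered by amplified segments."""
    bins = set()
    for chrom, start, end, log2_ratio in segments:
        if log2_ratio >= threshold:
            for index in range(start // bin_size, (end - 1) // bin_size + 1):
                bins.add((chrom, index))
    return bins


def recurrent_amplifications(samples, min_fraction=0.5):
    """Bins amplified in at least min_fraction of the samples."""
    counts = Counter()
    for segments in samples.values():
        counts.update(amplified_bins(segments))
    cutoff = min_fraction * len(samples)
    return sorted(b for b, count in counts.items() if count >= cutoff)


# Toy segmented profiles for three tumors (made-up values).
samples = {
    "tumor1": [("chr17", 35_000_000, 36_500_000, 0.9)],
    "tumor2": [("chr17", 35_200_000, 35_800_000, 0.6), ("chr8", 0, 2_000_000, 0.5)],
    "tumor3": [("chr17", 10_000_000, 11_000_000, -0.4)],
}
print(recurrent_amplifications(samples))
```

Deletions could be handled the same way with a negative threshold, and known polymorphic regions would be removed from the bins before counting.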