MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia

Post on 05-Jan-2016


Armstrong et al., Nature Genetics 30, 41-47 (2002)

colon data

gene1: Hsa.37937 3' UTR 2a 197371 MYOSIN HEAVY CHAIN, NONMUSCLE (Gallus gallus)

tumor: 1.62 1.33 0.79 0.41 0.39 0.38 1.22 1.57 0.72 0.97 1.12 0.61 0.79 0.36 0.52 0.58 0.44 0.35 0.53 0.52 0.46 0.59 0.68 0.27 0.67 0.49 0.49 0.53 0.35 1.44 0.55 0.33 1.70 0.59 0.73 1.54 1.03 0.54 0.66 0.33

mean = 0.73 std = 0.4

normal: 2.81 2.18 2.68 2.17 2.84 2.58 4.97 2.12 2.76 3.41 2.72 3.26 2.51 1.24 2.83 1.25 4.22 1.06 2.30 0.44 1.21 1.57

mean = 2.41 std = 1.05
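The summary values for gene1 can be recomputed from the listed numbers. The sketch below is only an illustration: it assumes the 40-value row is the tumor group and the 22-value row is the normal group, and that "std" means the sample standard deviation (n - 1 in the denominator). The slides do not state either point.

```python
from statistics import mean, stdev

# gene1 expression values copied from the slide.
# Assumption: the 40-value row is tumor, the 22-value row is normal.
tumor = [1.62, 1.33, 0.79, 0.41, 0.39, 0.38, 1.22, 1.57, 0.72, 0.97,
         1.12, 0.61, 0.79, 0.36, 0.52, 0.58, 0.44, 0.35, 0.53, 0.52,
         0.46, 0.59, 0.68, 0.27, 0.67, 0.49, 0.49, 0.53, 0.35, 1.44,
         0.55, 0.33, 1.70, 0.59, 0.73, 1.54, 1.03, 0.54, 0.66, 0.33]
normal = [2.81, 2.18, 2.68, 2.17, 2.84, 2.58, 4.97, 2.12, 2.76, 3.41,
          2.72, 3.26, 2.51, 1.24, 2.83, 1.25, 4.22, 1.06, 2.30, 0.44,
          1.21, 1.57]

for label, values in (("tumor", tumor), ("normal", normal)):
    # stdev() is the sample standard deviation (divides by n - 1).
    print(f"{label}: n = {len(values)}, "
          f"mean = {mean(values):.2f}, std = {stdev(values):.2f}")
```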

histograms

HISTOGRAM, BINS OF 0.5 (gene1, normal samples)

bin center: 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0
count:      1   3   2   3   5   4   2   1   0   1

NORMALIZED (FREQUENCIES)

tumor: mean = 0.73 std = 0.4; normal: mean = 2.41 std = 1.05
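The histogram with bins of 0.5 can be rebuilt from the 22 normal-sample values of gene1. This is a minimal sketch that assumes the bins are centered on multiples of 0.5 (so the bin labeled 1.0 covers 0.75 up to, but not including, 1.25). That alignment is a guess that matches the counts on the slide, not something the slide states. The helper name `centered_histogram` is made up for this example.

```python
import math
from collections import Counter

# gene1 normal-sample values copied from the slide.
normal = [2.81, 2.18, 2.68, 2.17, 2.84, 2.58, 4.97, 2.12, 2.76, 3.41,
          2.72, 3.26, 2.51, 1.24, 2.83, 1.25, 4.22, 1.06, 2.30, 0.44,
          1.21, 1.57]

def centered_histogram(values, width):
    """Count values in bins centered on multiples of `width`."""
    # A value v falls in bin k when (k - 0.5) * width <= v < (k + 0.5) * width.
    counts = Counter(math.floor((v + width / 2) / width) for v in values)
    lo, hi = min(counts), max(counts)
    # Include empty bins between the first and last occupied bin.
    return {k * width: counts.get(k, 0) for k in range(lo, hi + 1)}

hist = centered_histogram(normal, 0.5)

# Normalizing by the number of samples turns counts into frequencies.
freq = {center: n / len(normal) for center, n in hist.items()}

for center in hist:
    print(f"{center:.1f}  count = {hist[center]}  frequency = {freq[center]:.3f}")
```

Dividing each count by the number of samples is what makes histograms of the 40 tumor and 22 normal samples comparable on one axis.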

t-test

T = -9.04 P = 10 e-14
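The slides do not say which form of the t-test was used. As an illustration, the sketch below applies the pooled-variance (equal variance) two-sample statistic to the rounded summary values for gene1, assuming group sizes of 40 tumor and 22 normal samples (the number of values listed). Under those assumptions it gives a value that rounds to the reported T. The function name `pooled_t` is hypothetical.

```python
import math

def pooled_t(mean1, std1, n1, mean2, std2, n2):
    """Two-sample t statistic with a pooled variance estimate."""
    pooled_var = ((n1 - 1) * std1 ** 2 + (n2 - 1) * std2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error

# gene1: tumor (mean 0.73, std 0.4, n 40) versus normal (mean 2.41, std 1.05, n 22).
t_gene1 = pooled_t(0.73, 0.4, 40, 2.41, 1.05, 22)
print(round(t_gene1, 2))  # -9.04
```

The P value would then come from a t distribution with n1 + n2 - 2 = 60 degrees of freedom. It is not computed here because the standard library has no t distribution.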

gene1000: Hsa.37192 3' UTR 2a 186603 EUKARYOTIC INITIATION FACTOR 4B (Homo sapiens)

tumor: 0.21 0.38 0.51 0.23 0.23 0.32 0.20 0.53 0.33 0.47 0.25 0.22 0.36 0.26 0.27 0.26 0.26 0.33 0.30 0.15 0.25 0.18 0.19 0.28 0.25 0.25 0.54 0.20 0.41 0.47 0.49 0.39 0.33 0.44 0.37 0.42 0.34 0.35 0.56 0.37

mean = 0.328 std = 0.111

normal: 0.20 0.32 0.62 0.21 0.31 0.25 0.24 0.40 0.25 0.50 0.19 0.37 0.63 0.33 0.41 0.48 0.59 0.45 0.48 0.31 0.30 0.41

mean = 0.375 std = 0.134

NORMALIZED (FREQUENCIES)

t-test

T = -1.48 P = 0.15

85%

gene2000: Hsa.1829 gene 1 Human mRNA fragment for class II histocompatibility antigen beta-chain (pII-beta-4).

tumor: 1.50 2.53 2.38 3.16 3.01 2.45 1.70 2.10 3.14 2.76 1.57 4.15 3.60 5.32 2.20 1.82 2.81 5.33 4.03 2.28 1.48 2.03 1.75 1.64 2.92 1.26 1.75 2.03 2.45 2.25 2.82 3.87 1.67 1.22 2.49 1.74 4.96 1.49 1.38 5.98

mean = 2.6258 std = 1.2039

normal: 1.56 3.07 4.15 8.12 3.41 3.78 1.42 0.96 2.09 2.63 2.29 2.11 1.26 1.85 1.61 3.18 2.23 1.02 3.36 3.63 2.11 1.93

mean = 2.6261 std = 1.536

histograms

NORMALIZED (FREQUENCIES)

t-test

T = -0.001 P = 0.9992

E, C&N_log2E

colon data expression matrix E

log2 E, center, normalize
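The preprocessing named here can be sketched for a single gene (one row of E). The slide does not say what "normalize" means, so this example assumes centering to zero mean followed by scaling to unit Euclidean norm. Scaling to unit variance would be an equally plausible reading. The function name is hypothetical.

```python
import math

def log2_center_normalize(row):
    """log2-transform one gene's expression values, subtract the mean,
    and scale the centered values to unit Euclidean norm (an assumption)."""
    logs = [math.log2(x) for x in row]
    mean = sum(logs) / len(logs)
    centered = [v - mean for v in logs]
    norm = math.sqrt(sum(v * v for v in centered))
    if norm == 0:
        # A constant row carries no information after centering.
        return centered
    return [v / norm for v in centered]

# First four tumor values of gene1, used only as sample input.
profile = log2_center_normalize([1.62, 1.33, 0.79, 0.41])
print([round(v, 3) for v in profile])
```

After this step every gene has the same overall scale, so highly expressed genes do not dominate distance or correlation calculations.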

genes ordered by p-value

726 genes with p < 0.05, ordered by difference of means (normal - tumor)

after ttest 0.05 order by diffmeans

genes with p < 0.05

RANDOM DATA

sorted p

Q=0.15

I=758

how many out of 726 are false?

FDR = 0.14: 726 * 0.14 ≈ 101 false separating genes

how many genes at FDR = 0.05?

516 genes: 516 * 0.05 ≈ 26 false separating genes (26 out of 516)
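One way to read these numbers is through the Benjamini-Hochberg procedure on the sorted p-values: with N genes tested, select the largest index i for which the i-th smallest p-value is at most i*Q/N. Assuming N = 2000 genes in total (suggested by the gene numbering and the 100 + 1900 split, but not stated explicitly), a cutoff of p < 0.05 that selects 726 genes corresponds to an estimated false discovery rate of N * 0.05 / 726, which is about 0.14. The function names and the toy p-values below are made up for illustration.

```python
def bh_count(pvalues, q):
    """Number of hypotheses selected by Benjamini-Hochberg at FDR level q:
    the largest i with p_(i) <= i * q / N (0 if there is none)."""
    n = len(pvalues)
    selected = 0
    for i, p in enumerate(sorted(pvalues), start=1):
        if p <= i * q / n:
            selected = i
    return selected

def fdr_at_cutoff(n_total, n_selected, p_cutoff):
    """Estimated FDR when n_selected of n_total genes pass p < p_cutoff."""
    return n_total * p_cutoff / n_selected

# Assumption: 2000 genes tested in total; 726 pass p < 0.05 (from the slide).
q = fdr_at_cutoff(2000, 726, 0.05)
print(round(q, 2), round(726 * q))

# Hypothetical p-values, only to show the selection rule at work.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(bh_count(toy_p, 0.05))
```

The expected number of false positives is N times the p-value cutoff, here 2000 * 0.05 = 100, which agrees with the slide's 726 * 0.14 ≈ 101 up to rounding.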

random data

100 separating (p < 0.001), 1900 random

MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia

Armstrong et al, Nature Genetics 30, 41-47 (2002)

separation

E1

E2

ALL

MLL

E1 - 2E2 = 0 (separating line)

E1 - 2E2 < 0 (one side of the line)

E1 - 2E2 > 0 (other side of the line)
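The rule on this slide is a linear classifier in the plane of two expression values: the sign of E1 - 2E2 says on which side of the line a sample falls. The sketch below shows the rule only. The slide text does not say which side is ALL and which is MLL, so the sides are reported without class names, and the sample points are hypothetical.

```python
def side_of_line(e1, e2):
    """Return +1, -1 or 0 according to the sign of E1 - 2*E2."""
    score = e1 - 2 * e2
    if score > 0:
        return 1
    if score < 0:
        return -1
    return 0  # exactly on the separating line

# Hypothetical samples (E1, E2), not taken from the leukemia data.
samples = [(4.0, 1.0), (1.0, 2.0), (2.0, 1.0)]
sides = [side_of_line(e1, e2) for e1, e2 in samples]
print(sides)  # [1, -1, 0]
```

Written as a dot product, the score is w·(E1, E2) with w = (1, -2), which links this rule to the projections on w in the slides that follow.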

projection 1

E1

E2

ALL

MLL

w

+/- PROJECTIONS ON w – DO SEPARATE ALL FROM MLL

projection 2

E1

E2

ALL

MLL

+/- PROJECTIONS ON w – DO NOT SEPARATE ALL FROM MLL

projection 3

E1

E2

WELL SEPARATED CENTERS OF MASS - NO SEPARATION OF THE TWO CLOUDS

projection 4

E1

E2

WEAK SEPARATION OF CENTERS OF MASS - GOOD SEPARATION OF THE TWO CLOUDS
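The contrast between the last two projections can be put in numbers: what matters is the distance between the projected centers of mass relative to the spread of each projected cloud. The sketch below uses one common form of that ratio, (difference of means)^2 / (sum of variances), on hypothetical toy clouds that are elongated along E1 and offset in E2. All data and function names are made up for illustration.

```python
from statistics import mean, variance

# Hypothetical toy clouds: long along E1, thin along E2.
cloud_a = [(0, 0.0), (1, 0.1), (2, -0.1), (3, 0.0), (4, 0.1)]
cloud_b = [(3, 1.0), (4, 1.1), (5, 0.9), (6, 1.0), (7, 1.1)]

def project(points, w):
    """Project 2-D points on the direction w (dot product)."""
    return [w[0] * e1 + w[1] * e2 for e1, e2 in points]

def separation_score(w):
    """Squared distance of projected means over the sum of projected variances."""
    pa, pb = project(cloud_a, w), project(cloud_b, w)
    return (mean(pa) - mean(pb)) ** 2 / (variance(pa) + variance(pb))

def clouds_overlap(w):
    """True if the projected ranges of the two clouds intersect."""
    pa, pb = project(cloud_a, w), project(cloud_b, w)
    return max(pa) >= min(pb) and max(pb) >= min(pa)

for name, w in (("along E1", (1, 0)), ("along E2", (0, 1))):
    print(name, round(separation_score(w), 1), clouds_overlap(w))
```

Along E1 the centers are 3 units apart but the projected clouds overlap. Along E2 the centers are only 1 unit apart, yet the projections are cleanly separated and the score is much larger.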

Fisher to perceptron

E1

E2

ALL

MLL

OPTIMAL LINE TO PROJECT ON

FISHER
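For two clouds in the (E1, E2) plane, the standard Fisher linear discriminant takes the direction w proportional to Sw^-1 (m_b - m_a), where m_a and m_b are the cloud means and Sw is the sum of the two within-class scatter matrices. The following is a minimal two-dimensional sketch on hypothetical toy data, not the computation behind the figure. All names are illustrative.

```python
# Hypothetical toy clouds: long along E1, thin along E2.
cloud_a = [(0, 0.0), (1, 0.1), (2, -0.1), (3, 0.0), (4, 0.1)]
cloud_b = [(3, 1.0), (4, 1.1), (5, 0.9), (6, 1.0), (7, 1.1)]

def mean_vector(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter(points):
    """Entries (sxx, sxy, syy) of the symmetric 2x2 scatter matrix."""
    mx, my = mean_vector(points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    syy = sum((y - my) ** 2 for _, y in points)
    return sxx, sxy, syy

def fisher_direction(a, b):
    """w = Sw^-1 (mean_b - mean_a), with Sw the summed within-class scatter."""
    sxx, sxy, syy = (u + v for u, v in zip(scatter(a), scatter(b)))
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    (ax, ay), (bx, by) = mean_vector(a), mean_vector(b)
    dx, dy = bx - ax, by - ay
    # Inverse of [[sxx, sxy], [sxy, syy]] is (1/det) * [[syy, -sxy], [-sxy, sxx]].
    return ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)

w = fisher_direction(cloud_a, cloud_b)
proj_a = [w[0] * x + w[1] * y for x, y in cloud_a]
proj_b = [w[0] * x + w[1] * y for x, y in cloud_b]
print(w, max(proj_a) < min(proj_b))
```

On this toy data the direction points almost entirely along E2, the axis where the clouds are thin, even though the means differ more along E1.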

PERCEPTRON
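The classic perceptron rule looks for a separating line directly instead of optimizing a ratio of scatters: whenever a sample is on the wrong side, the weights are moved toward it. The sketch below assumes labels of +1 and -1 (which class gets which sign is arbitrary) and hypothetical toy data. It is guaranteed to stop only when the two clouds are linearly separable, hence the epoch limit.

```python
def train_perceptron(samples, max_epochs=1000):
    """samples: list of ((e1, e2), label) with label +1 or -1.
    Returns (w1, w2, b) such that label * (w1*e1 + w2*e2 + b) > 0 for all samples."""
    w1 = w2 = b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for (e1, e2), label in samples:
            if label * (w1 * e1 + w2 * e2 + b) <= 0:
                # Misclassified (or on the line): move the line toward the sample.
                w1 += label * e1
                w2 += label * e2
                b += label
                mistakes += 1
        if mistakes == 0:
            return w1, w2, b
    raise RuntimeError("no separating line found within max_epochs")

# Hypothetical toy clouds; the labels -1 / +1 stand in for the two classes.
cloud_a = [(0, 0.0), (1, 0.1), (2, -0.1), (3, 0.0), (4, 0.1)]
cloud_b = [(3, 1.0), (4, 1.1), (5, 0.9), (6, 1.0), (7, 1.1)]
samples = [(p, -1) for p in cloud_a] + [(p, +1) for p in cloud_b]

w1, w2, b = train_perceptron(samples)
print(all(label * (w1 * e1 + w2 * e2 + b) > 0 for (e1, e2), label in samples))
```

Unlike the Fisher direction, the result depends on the order in which samples are visited, and any line that separates the training data is accepted.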

UNSUPERVISED ANALYSIS

• GOAL A: FIND GROUPS OF GENES THAT HAVE CORRELATED EXPRESSION PROFILES. THESE GENES ARE BELIEVED TO BELONG TO THE SAME BIOLOGICAL PROCESS.

• GOAL B: DIVIDE TISSUES INTO GROUPS WITH SIMILAR GENE EXPRESSION PROFILES. THESE TISSUES ARE EXPECTED TO BE IN THE SAME BIOLOGICAL (CLINICAL) STATE.

CLUSTERING

Unsupervised analysis

Giraffe

DEFINITION OF THE CLUSTERING PROBLEM

CLUSTER ANALYSIS YIELDS DENDROGRAM

T (RESOLUTION)

Giraffe + Okapi

BUT WHAT ABOUT THE OKAPI ?

STATEMENT OF THE PROBLEM

GIVEN DATA POINTS Xi, i=1,2,...N, EMBEDDED IN D-DIMENSIONAL SPACE, IDENTIFY THE UNDERLYING STRUCTURE OF THE DATA.

AIMS: PARTITION THE DATA INTO M CLUSTERS, POINTS OF SAME CLUSTER - "MORE SIMILAR"

M ALSO TO BE DETERMINED!

GENERATE DENDROGRAM, IDENTIFY SIGNIFICANT, "STABLE" CLUSTERS

"ILL POSED": WHAT IS "MORE SIMILAR"?

RESOLUTION
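Why "more similar" is ill posed can be seen by comparing two common dissimilarity measures on the same points. In the sketch below, Euclidean distance and a correlation-based distance (1 minus the Pearson correlation) disagree on which profile is closest to a reference profile. The vectors are hypothetical and the function names are illustrative.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def correlation_distance(u, v):
    """1 - Pearson correlation; 0 for profiles with the same shape."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cu = [a - mu for a in u]
    cv = [b - mv for b in v]
    r = sum(a * b for a, b in zip(cu, cv)) / math.sqrt(
        sum(a * a for a in cu) * sum(b * b for b in cv))
    return 1 - r

# Hypothetical expression profiles over three samples.
reference = (1.0, 2.0, 3.0)
candidates = {
    "scaled": (10.0, 20.0, 30.0),   # same shape, ten times larger
    "nearby": (1.5, 1.5, 3.5),      # close in value, different shape
}

nearest_euclid = min(candidates, key=lambda k: euclidean(reference, candidates[k]))
nearest_corr = min(candidates, key=lambda k: correlation_distance(reference, candidates[k]))
print(nearest_euclid, nearest_corr)  # nearby scaled
```

Any clustering result therefore depends on the chosen measure as much as on the algorithm.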

Statement of the problem2

CLUSTER ANALYSIS YIELDS DENDROGRAM

Dendrogram2

T

LINEAR ORDERING OF DATA

YOUNG OLD

Agglomerative Hierarchical Clustering

[Figure: five data points labeled 1 to 5 and their dendrogram; leaves in the order 5 2 4 1 3]

Distance between joined clusters

Need to define the distance between the new cluster and the other clusters.

Single Linkage: distance between closest pair.

Complete Linkage: distance between farthest pair.

Average Linkage: average distance between all pairs, or distance between cluster centers.
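The linkage rules above differ only in how the pairwise distances between two clusters are summarized. A minimal sketch with Euclidean distances follows. The two small clusters are hypothetical, and the function names are illustrative rather than taken from any library.

```python
import math
from itertools import product

def pair_distances(a, b):
    """All distances between a point of cluster a and a point of cluster b."""
    return [math.dist(p, q) for p, q in product(a, b)]

def single_linkage(a, b):
    return min(pair_distances(a, b))       # closest pair

def complete_linkage(a, b):
    return max(pair_distances(a, b))       # farthest pair

def average_linkage(a, b):
    d = pair_distances(a, b)
    return sum(d) / len(d)                 # mean over all pairs

def centroid_distance(a, b):
    def centroid(cluster):
        return tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
    return math.dist(centroid(a), centroid(b))

# Hypothetical clusters in the plane.
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(3.0, 0.0), (3.0, 4.0)]

for rule in (single_linkage, complete_linkage, average_linkage, centroid_distance):
    print(rule.__name__, round(rule(cluster_a, cluster_b), 3))
```

The two variants listed under average linkage generally give different numbers, as this example shows.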


Dendrogram

The dendrogram induces a linear ordering of the data points
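A naive agglomerative procedure makes the ordering explicit: if each merge concatenates the member lists of the two joined clusters, the single list left at the end is a linear ordering in which every cluster occupies a contiguous block. The sketch below works on hypothetical one-dimensional points and uses single linkage by default. Passing `max` gives complete linkage instead. It is a quadratic-per-step illustration, not an efficient implementation.

```python
def agglomerate(points, linkage=min):
    """Agglomerative clustering of 1-D points.
    Returns (leaf_order, merge_heights); leaf_order lists point indices."""
    clusters = [[i] for i in range(len(points))]
    heights = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage([abs(points[a] - points[b])
                             for a in clusters[i] for b in clusters[j]])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        heights.append(d)
        # Concatenating keeps the members of each cluster adjacent.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters[0], heights

# Hypothetical data: three groups near 1, 5 and 9.
points = [5.0, 1.0, 9.0, 1.2, 5.1]
order, heights = agglomerate(points)
print(order)                        # [0, 4, 1, 3, 2]
print([points[i] for i in order])   # [5.0, 5.1, 1.0, 1.2, 9.0]
```

The ordering is not unique: swapping the two branches at any merge gives another ordering that is equally consistent with the same dendrogram.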


Hierarchical Clustering - Summary

• Results depend on distance update method

• Greedy iterative process

• NOT robust against noise

• No inherent measure to identify stable clusters

2 good clouds

COMPACT WELL SEPARATED CLOUDS – EVERYTHING WORKS

2 flat clouds

2 FLAT CLOUDS - SINGLE LINKAGE WORKS

filament

SINGLE LINKAGE SENSITIVE TO NOISE
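The filament effect is easiest to see in one dimension, where the single-linkage merge heights are exactly the gaps between neighboring points after sorting. In the hypothetical example below, a few noise points between two compact clouds remove the large gap that would otherwise mark the two-cluster split.

```python
def single_linkage_heights(points):
    """Merge heights of single linkage for 1-D data: the sorted neighbor gaps."""
    ordered = sorted(points)
    return sorted(b - a for a, b in zip(ordered, ordered[1:]))

# Hypothetical data: two compact clouds and a filament of noise between them.
cloud_a = [0.0, 0.5, 1.0]
cloud_b = [5.0, 5.5, 6.0]
filament = [2.0, 3.0, 4.0]

clean = single_linkage_heights(cloud_a + cloud_b)
noisy = single_linkage_heights(cloud_a + cloud_b + filament)
print(max(clean), max(noisy))  # 4.0 1.0
```

Without the filament the last merge happens at height 4.0, far above the within-cloud merges at 0.5. With it, every merge happens at 1.0 or below, so the dendrogram no longer shows two distinct clusters.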

Average linkage

[Figure: five data points labeled 1 to 5 and their dendrogram; leaves in the order 5 2 4 1 3]

Distance between joined clusters

Need to define the distance between the new cluster and the other clusters.

Average Linkage: average distance between all pairs
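Written out, the average-linkage distance and the resulting update after a merge take the following standard form. The symbols A, B, C (disjoint clusters) and d(a, b) (the distance between two data points) are notation introduced here, not taken from the slides.

```latex
\[
d_{\mathrm{avg}}(A, B) \;=\; \frac{1}{|A|\,|B|} \sum_{a \in A} \sum_{b \in B} d(a, b)
\]
\[
d_{\mathrm{avg}}(A \cup B,\, C) \;=\;
\frac{|A|\, d_{\mathrm{avg}}(A, C) + |B|\, d_{\mathrm{avg}}(B, C)}{|A| + |B|}
\]
```

The second line follows from the first because the pairs between A ∪ B and C are the pairs between A and C together with the pairs between B and C. After each merge the distances to the new cluster can therefore be updated from the old ones without revisiting the individual points.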

Dendrogram