Translational Cancer Medicine
Statistical Analysis of Microarray Data
Eric Blanc
KCL
December 16, 2013
Eric Blanc (KCL) Translational Cancer Medicine December 16, 2013 1 / 42
Outline
1 Introduction
  - Microarrays
2 Statistics of differential expression
  - Differential expression detection
  - Multiple testing corrections
  - Moderated statistics
3 Clustering and classification
  - Statistical Decision Theory
  - Examples
4 Data processing and quality control
Introduction: Genomics data sets
Genomics data sets: a single experiment returns a quantity measured for a large proportion of the constituents of the cell. For example:
  - Expression levels for most genes.
  - SNP counts for millions of SNPs covering most of the genome.
  - All binding sites for a transcription factor.
  - Protein-protein interactions for most protein pairs.
Genomics data sets are usually very large, much larger (and potentially noisier) than those obtained by conventional methods.
Statistical tools are necessary for:
  1 Quantitative assessment of the results,
  2 Discovery and identification of features in the data,
  3 Analysis of possible sources of bias, or structure in the experimental noise.
Introduction: Micro-array data sets as an example
The micro-array experiment offers an experimental read-out for a large proportion of the genome. It can be DNA (SNPs, CNV, ChIP) or RNA (mRNA, small RNAs).
The primary output of a micro-array experiment is represented by a large matrix of numbers, which contains the readout values for r rows of probes (genes, SNPs, genomic regions, ...) and c columns of different samples.
The lack of a quantitative theoretical framework for the physico-chemical processes generating the data exacerbates the need for a careful analysis of the noise structures.
Only differences in expression between conditions can be reliably measured.
We consider here expression arrays to measure mRNA expression levels.
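As a minimal sketch of the r x c layout, the matrix can be held as a list of rows (probes), each row holding one value per sample. The numbers, the column grouping and the helper name `mean_difference` are hypothetical, and the values are assumed to be on a log scale, which the slides do not state.

```python
# Hypothetical expression matrix: rows are probes, columns are samples.
# Values are assumed to be log-scale intensities (an assumption, not from the slides).
expression = [
    [7.1, 7.3, 6.9, 9.0, 9.2],  # probe 1
    [5.0, 5.2, 4.9, 5.1, 5.0],  # probe 2
]
control_columns = [0, 1, 2]  # hypothetical: samples 1-3 are controls
treated_columns = [3, 4]     # hypothetical: samples 4-5 are treated


def mean_difference(row, group_a, group_b):
    """Difference between the mean of group_b and the mean of group_a for one probe."""
    mean_a = sum(row[j] for j in group_a) / len(group_a)
    mean_b = sum(row[j] for j in group_b) / len(group_b)
    return mean_b - mean_a


# One difference per probe: the quantity that can be compared between conditions.
differences = [
    mean_difference(row, control_columns, treated_columns) for row in expression
]
print(differences)
```

Only the per-probe differences between conditions are used here, in line with the remark that absolute levels are not reliably measured.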
Introduction: Microarray technology
Probes are oligonucleotide sequences complementary to small genomic sequences (in genes, exons, regulatory sequences, around SNPs, ...).
Probes are covalently bound on the array surface so that probes sharing the same sequence are located in the same area of the array.
Introduction: Microarray technology
The sample (DNA or mRNA) is fragmented and labelled with a covalently bound fluorescent dye.
The labelled sample is hybridised on the array surface and, after washing, only probes whose target sequence is present in the sample remain hybridised.
Introduction: Micro-array data and expression matrix
[Figure: the array surface read out as an expression matrix with Nrow rows (Probe 1 to Probe 10) and Ncol columns (Sample 1 to Sample 5); the colour scale runs from low expression to high expression.]
Introduction: Overview
1 Quantitative assessment of the results
  - Identification of genes that follow a pre-defined pattern in several conditions.
2 Discovery and identification of features in the data
  - Classification and prediction of phenotype status based on expression pattern. Cancer class and patient response to treatment may be predicted from the patient's expression profile.
  - Clustering of genes that have similar expression patterns in several conditions. An example would be finding groups of co-regulated genes in a time series containing a large number of points.
3 Analysis of possible sources of bias, or structure in the experimental noise
  - Looking for "surprises"
Statistics of differential expression
Statistical models: Introduction
Is gene g expressed differently between healthy controls and patients affected by the disease under study?
The expression levels of g in Nh healthy controls and Nd patients are recorded ({xi} and {yj}).
For each gene g, a t test is carried out between expression levels {xi} and {yj} to assess the statistical significance of the expression difference, but where is the model?
The model is the mathematical description of the situation. It provides a quantitative framework to describe the main features of the system.
Statistical models: Mathematical description
Model assumptions:
  - Each sample is a "faithful representation" of the corresponding parent population.
  - The parent populations follow a pre-determined distribution (generally, the normal distribution).
  - These distributions represent the probability of observing a given value for the expression of gene g, when the patient is taken at random.
The gene expression sample averages provide estimates of the mean expression in the two parent populations, and the sample standard deviations provide estimates of the populations' dispersion.
The t test provides the probability of observing a difference at least as large as the measured one if the mean expression of gene g were equal in the two parent populations.
The model is intrinsically probabilistic, and its assumptions cannot be proven.
Statistical models: Model choice
When the sample data are normally distributed, the sample average maximises the likelihood, but when the distribution is a double exponential, the sample median is the maximum likelihood estimator.
The definition of an outlier depends on the distribution: there is a probability of 1.5 · 10^−23 of observing a data point further than 10σ away from the mean when the distribution is normal, but the probability of a point further than 10 scale units from the centre is 0.063 when the distribution is Cauchy.
[Figure: Normal vs Cauchy distributions, probability density against x for x between −4 and 4. Normal: P(|x|>5) < 10^−6. Cauchy: P(|x|>5) = 0.1.]
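The tail probabilities quoted on this slide can be checked with the closed-form tails of the two distributions. This is a small sketch for the standard normal and the standard Cauchy (location 0, scale 1); the function names are chosen here for illustration.

```python
import math


def normal_two_sided_tail(k):
    """P(|x - mean| > k * sigma) for a normal distribution."""
    return math.erfc(k / math.sqrt(2.0))


def cauchy_two_sided_tail(k):
    """P(|x - centre| > k * scale) for a Cauchy distribution."""
    return 1.0 - (2.0 / math.pi) * math.atan(k)


for k in (5, 10):
    print(k, normal_two_sided_tail(k), cauchy_two_sided_tail(k))
```

A point 10 units out is effectively impossible under the normal model and fairly ordinary under the Cauchy model, which is why "outlier" only has a meaning once the distribution is fixed.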
Differential expression detection: Application of statistical models in genomic data sets
The same statistical model applies for all genes. Usually, a normal error model is assumed to obtain closed formulae.
Statistical significance of gene expression differences across conditions is usually assessed by hypothesis testing (t tests or ANOVA).
The high level of noise in the data usually implies a large number of false positives and false negatives among genes called differentially expressed.
The large number of identical tests has two major consequences:
  - Need for multiple testing correction, and
  - Possibility of statistic moderation
Multiple testing correction: Introduction
Consider that N statistical tests are carried out in an analysis, and N P values are produced.
If the threshold for calling a test significant is $\alpha_1$, a gene that is not differentially expressed still has a probability $\alpha_1$ of falling under the threshold, that is, of being a False Positive (FP).
Therefore we have (if the N tests are independent):
$$
\begin{aligned}
P(\text{1 FP in 1 test}) &= \alpha_1 \\
P(\text{0 FP in 1 test}) &= 1 - \alpha_1 \\
P(\text{0 FP in } N \text{ tests}) &= (1 - \alpha_1)^N \\
P(\text{at least 1 FP in } N \text{ tests}) &= \alpha_N = 1 - (1 - \alpha_1)^N
\end{aligned}
$$
To ensure that P(at least 1 FP in N tests) is small, the significance cutoff for an individual test, $\alpha_1$, must be set such that $\alpha_N = 1 - (1 - \alpha_1)^N$ for the chosen $\alpha_N$, or $\alpha_1 \approx \alpha_N / N$.
This correction (due to Bonferroni) is exceedingly stringent when the number of tests is large.
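The formulas on this slide can be evaluated directly. The sketch below assumes independent tests, as the slide does; the function names are illustrative, and `expm1`/`log1p` are used only to keep precision when the thresholds are tiny.

```python
import math


def family_wise_error_rate(alpha_1, n_tests):
    """alpha_N = 1 - (1 - alpha_1)**N for N independent tests."""
    return -math.expm1(n_tests * math.log1p(-alpha_1))


def exact_per_test_threshold(alpha_n, n_tests):
    """Solve alpha_N = 1 - (1 - alpha_1)**N for alpha_1."""
    return -math.expm1(math.log1p(-alpha_n) / n_tests)


def bonferroni_threshold(alpha_n, n_tests):
    """Approximation alpha_1 ~ alpha_N / N."""
    return alpha_n / n_tests


# With an uncorrected threshold of 0.05, 20 independent tests already give
# a large chance of at least one false positive.
print(family_wise_error_rate(0.05, 20))

# Per-test thresholds for a 1 % family-wise level over one million tests.
print(exact_per_test_threshold(0.01, 10**6), bonferroni_threshold(0.01, 10**6))
```

The exact and approximate thresholds differ by well under one percent here, so the simple division by N loses very little.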
Multiple testing correction: False Discovery Rate
When there are 10^6 tests (typical for SNP arrays), the 1 % statistical significance level after Bonferroni correction corresponds to P values below 10^−8, which is unreasonably stringent for most experiments.
Controlling the False Discovery Rate is a more appropriate (less stringent) alternative to controlling the probability of the occurrence of at least one False Positive call (Family-Wise Error Rate).
The q value is defined as the expected proportion of False Positive calls among the tests for which the statistic (for example t) is above a given threshold.
So from the statistic t, the q value is computed instead of the P value, and lists of differentially expressed SNPs (or genes) are obtained from the q values.
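The slides do not say which procedure produces the q values. As one commonly used stand-in, the sketch below computes Benjamini-Hochberg adjusted P values, which control the False Discovery Rate; this is a choice made here for illustration, not necessarily the method used in the lecture.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    n = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down so that the adjusted values
    # never decrease as the raw P values increase.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical P values for three tests.
print(benjamini_hochberg([0.9, 0.001, 0.5]))
```

A list of calls at a 5 % False Discovery Rate is then every test whose adjusted value is at most 0.05.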
Parallel statistical tests: Moderated statistics
The t statistic used to assess differential expression significance is given by:
$$
t = \frac{\bar{x} - \bar{y}}{s \sqrt{1/n_x + 1/n_y}}
\quad \text{with} \quad
s^2 = \frac{\sum_{i=1}^{n_x} (x_i - \bar{x})^2 + \sum_{j=1}^{n_y} (y_j - \bar{y})^2}{n_x + n_y - 2}
$$
When s is small, the statistical significance increases, at a given value of the difference between means.
With many parallel tests and few replicates for each condition, one expects "accidentally" small standard deviations in gene expression, leading to artificially high statistical significance.
By introducing a parametric model for the distribution of the standard deviations, the t (and F) statistics can be "moderated", reducing the false positive rate.
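The pooled-variance t statistic defined on this slide can be written out in a few lines. The function name and the example values are illustrative only.

```python
import math


def pooled_t_statistic(x, y):
    """Two-sample t statistic with pooled variance s^2, as defined on the slide."""
    n_x, n_y = len(x), len(y)
    mean_x = sum(x) / n_x
    mean_y = sum(y) / n_y
    # Sums of squared deviations around each group mean.
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yj - mean_y) ** 2 for yj in y)
    s2 = (ss_x + ss_y) / (n_x + n_y - 2)
    t = (mean_x - mean_y) / math.sqrt(s2 * (1.0 / n_x + 1.0 / n_y))
    return t, s2


# Hypothetical expression values for one gene in two conditions.
t, s2 = pooled_t_statistic([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(t, s2)
```

With only three replicates per condition, s2 rests on four degrees of freedom, which is why an accidentally small value is plausible when thousands of genes are tested.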
Using parallel tests to regularise statistics: Hierarchical models
Hierarchical models impose parametric distributions on the parameters governing the pdf of each test.
In the case of a comparison between the mean expression values in two conditions with a common variance, we could for example impose that $\rho(\mu_x), \rho(\mu_y) \propto \text{cst}$ and $\rho(\sigma^2) \propto s_0^2 \chi^{-2}_{\nu_0}$.
The whole dataset is then used to fit the hyper-parameters (here $s_0$ and $\nu_0$).
The posterior values for the residual variances (& degrees of freedom) are:
$$
\tilde{s}_j^2 = \frac{\nu_0 s_0^2 + \nu s_j^2}{\nu_0 + \nu}
$$
When $s_j$ is "accidentally" small, the addition of $\nu_0 s_0^2$ regularises the value of the t statistic, as it avoids division by a small number.
The loss of statistical significance is compensated by the increase of the degrees of freedom (from $\nu$ to $\nu_0 + \nu$).
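A small numerical sketch of the shrinkage formula follows. The hyper-parameters `s2_0` and `nu_0` are set by hand to hypothetical values; in practice they would be fitted on the whole data set, which is not shown here.

```python
import math


def moderated_variance(s2_j, nu, s2_0, nu_0):
    """Posterior variance (nu_0 * s2_0 + nu * s2_j) / (nu_0 + nu)."""
    return (nu_0 * s2_0 + nu * s2_j) / (nu_0 + nu)


def t_from_variance(mean_difference, variance, n_x, n_y):
    """t statistic for a given difference of means and variance estimate."""
    return mean_difference / math.sqrt(variance * (1.0 / n_x + 1.0 / n_y))


# Hypothetical gene with an "accidentally" small sample variance.
n_x = n_y = 3
nu = n_x + n_y - 2          # residual degrees of freedom for this gene
s2_j = 0.0004               # hypothetical sample variance
s2_0, nu_0 = 0.05, 4.0      # hypothetical hyper-parameters (not fitted here)
mean_difference = 0.2

ordinary_t = t_from_variance(mean_difference, s2_j, n_x, n_y)
moderated_t = t_from_variance(
    mean_difference, moderated_variance(s2_j, nu, s2_0, nu_0), n_x, n_y
)
print(ordinary_t, moderated_t)
```

The moderated statistic is much smaller than the ordinary one for this gene, because the prior term dominates the tiny sample variance in the denominator.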
Clustering and classification of gene expression
van’t Veer et al. (2002). Nature 415 530-536.
[Figure: expression heat map from van't Veer et al. (2002), annotated with "Genes in set 1", "Genes in set 2", "Patients in cluster 1" and "Patients in cluster 2". The patients in the two clusters have different expression patterns, and the expression pattern predicts ER status.]
Statistical Decision Theory: Regression, classification and clustering
We consider a data set of N pairs (yi, xi), where i varies from 1 to N. The xi are the inputs, and the yi the outputs for the problem.
Regression: the numerical observations yi are predicted by a model which maps the explanatory variables xi onto yi. One may be interested either in model predictions Y for new explanatory variables X, or in the parameters θ identifying the model.
Classification: the observations yi are the classes from which the explanatory variables xi are drawn. One is usually interested in model predictions for new variables.
Clustering: there are no observations yi, only explanatory variables xi, which must be grouped according to similar properties.
NB: In most cases, the explanatory variables xi have many components, and are therefore represented by a vector xi.
Statistical Decision Theory: Definitions
We consider a variable y (the ER class, for example) which depends on another variable (observed or modelled) x (the expression pattern). The variables have a joint probability distribution p(x, y).
We seek a function f predicting the value of y from the knowledge of x: y = f(x).
We define a loss function L(y, f(x)) penalising the prediction errors. Typically L(y, f(x)) = (y − f(x))^2, or 0 and 1 for correct and incorrect classifications.
For the squared loss, the solution f minimising the expected loss is f(x) = E(y|x); for the 0-1 loss, it is the most probable class given x.
When the choice for f is not directed by the problem, a training data set is required to select f from broad classes of functions.
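For the squared loss L(y, f(x)) = (y − f(x))^2, the statement f(x) = E(y|x) can be checked in one step. The shorthand m(x) = E(y|x) is introduced here for convenience and is not used in the slides.

```latex
\begin{aligned}
E\left[(y - f(x))^2 \mid x\right]
  &= E\left[(y - m(x))^2 \mid x\right]
   + 2\,(m(x) - f(x))\,E\left[y - m(x) \mid x\right]
   + (m(x) - f(x))^2 \\
  &= E\left[(y - m(x))^2 \mid x\right] + (m(x) - f(x))^2 .
\end{aligned}
```

The cross term vanishes because E[y − m(x) | x] = 0. The first term does not depend on f, and the second is non-negative and equals zero only when f(x) = m(x), so the conditional mean minimises the expected squared loss.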
Statistical Decision Theory: Algorithm training procedure
We assume a training data set D = {(xi, yi)}, and a loss function L(y, f(x)).
By some minimisation algorithm, we tune the function f so that the loss over the training data is minimal.
This training of f depends on the class of acceptable functions f. It involves optimisation of internal parameters of f.
Once the algorithm is trained, it can predict output values y for inputs it has never seen (not in D).
The true value of an algorithm lies in its prediction efficiency for new data, not for the training data, even though it is optimised only against the training data set.
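As a minimal sketch of this procedure, take the class of acceptable functions to be straight lines f(x) = a + b x and the loss to be the squared error. Both choices, the function names and the data values are illustrative assumptions. The internal parameters a and b are tuned on the training set only, and the loss is then evaluated separately on data that was not used for training.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope: minimises the squared loss over the training data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mean_squared_loss(params, xs, ys):
    """Average of (y - f(x))^2 for f(x) = intercept + slope * x."""
    intercept, slope = params
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical training set D and hypothetical new data (not in D).
train_x, train_y = [0.0, 1.0, 2.0, 3.0], [0.1, 0.9, 2.2, 2.8]
new_x, new_y = [4.0, 5.0], [4.1, 4.9]

params = fit_line(train_x, train_y)
print("training loss:", mean_squared_loss(params, train_x, train_y))
print("loss on new data:", mean_squared_loss(params, new_x, new_y))
```

Only the second number says anything about how useful the fitted function is in practice.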
Statistical Decision Theory: Examples of types of prediction function f
K nearest neighbours: Consider that the classifier knows the class for the inputs xi. Then, a new point x is classified according to the classes of the K nearest points xi. Each of the nearest neighbours of x "votes" for a class, and the new point is assigned to the majority class.
Support Vector Machines (SVM): Hyper-planes (or hyper-surfaces) in the input space of the xi are constructed to optimally separate the various classes. The support vectors are the data points xi which define the separation planes. Support vector machines provide a separation of the input space into C disjoint regions, where C is the number of classes allowed for output values.
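The nearest-neighbour vote can be sketched in a few lines. Euclidean distance is assumed here (the slides do not name a metric), and the function name and example points are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort training points by Euclidean distance to the query.
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in by_distance[:k])
    # On a tied vote, the class encountered first among the neighbours wins.
    return votes.most_common(1)[0][0]


# Hypothetical two-gene expression values with known classes.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
labels = ["red", "red", "red", "green", "green", "green"]
print(knn_predict(points, labels, (0.2, 0.2), k=3))
```

There is no training step in the usual sense: the "model" is the stored training set, and K is the only tunable parameter.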
K-Nearest Neighbours (KNN algorithm): Synthetic example
The expression of 2 genes has been measured for 100 patients, 50 ER+ (red) and 50 ER− (green).
The perfect classifier allowing prediction of ER status from the expression of these two genes is known.
There are some patients whose ER status is outside of the perfect classifier domains, because the ER status is an observable, and as such is subject to experimental error.
For this example, we have set parameters such that:
  - The (0, 1) square is divided in two regions (green & red), both of equal area 0.5.
  - 100 points are randomly distributed in the (0, 1) square, such that there are 50 points in each region.
  - 40 points in the red region are assigned the red class, and 10 the green class. The same proportions are used for the green region.
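A data set with these properties can be generated as follows. The shape of the two regions is not given in the text, so the sketch assumes, purely for illustration, that the red region is the left half of the square (x1 < 0.5) and the green region the right half.

```python
import random


def make_synthetic_set(seed=0):
    """100 points in the unit square, 50 per region, with 10 flipped labels per region."""
    rng = random.Random(seed)
    points, labels = [], []
    # Assumed regions: red is x1 in [0, 0.5), green is x1 in [0.5, 1).
    for region, other, x1_low in (("red", "green", 0.0), ("green", "red", 0.5)):
        for n in range(50):
            points.append((x1_low + 0.5 * rng.random(), rng.random()))
            # 40 points keep the class of their region, 10 get the other class.
            labels.append(region if n < 40 else other)
    return points, labels


points, labels = make_synthetic_set()
print(len(points), labels.count("red"), labels.count("green"))
```

The 20 flipped labels play the role of the experimental error on the observed ER status.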
K-Nearest Neighbours (KNN algorithm): Example
[Figure, five panels with axes x1 and x2 on the (0, 1) square: "KNN training set: 100 points"; "Class prediction for black point"; "Number of neighbours : 1"; "Number of neighbours : 5"; "Number of neighbours : 59".]
Classification and Statistical Decision Theory: Lessons from the KNN example
Within a class of algorithms (here KNN), one must still choose a specific member (here, the number of neighbours).
When the number of neighbours is low, the predicted region boundaries are very complex. As the number of neighbours increases, the boundaries become smoother.
The number of misclassified training-set data points is 0 when the number of neighbours is 1, and it increases with the number of neighbours. The training error therefore always favours the most complex classifier, which shows that the training set cannot be used to select the best classifier.
Classifiers, like the usual estimators, are statistical quantities with statistical properties such as bias and variance.
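The observation about the training error can be checked numerically. The sketch below is only an illustration under assumptions that are not in the slides: the two-class data are simulated (two Gaussian clouds), and `knn_predict` is a hypothetical helper written with plain NumPy. The values of k (1, 5, 59) mirror the slides. With one neighbour, each training point is its own nearest neighbour, so the training error is zero by construction.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    # Squared distances between every query point and every training point
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = y_train[nearest].mean(axis=1)  # fraction of neighbours in class 1
    return (votes > 0.5).astype(int)

# Simulated two-class data (an assumption, not the data shown on the slides)
rng = np.random.default_rng(0)
n = 100
X = np.vstack([rng.normal([0.35, 0.35], 0.15, size=(n // 2, 2)),
               rng.normal([0.65, 0.65], 0.15, size=(n // 2, 2))])
y = np.repeat([0, 1], n // 2)

# Fraction of misclassified TRAINING points for several numbers of neighbours
# (odd k avoids ties in the vote)
train_errors = {k: float(np.mean(knn_predict(X, y, X, k) != y)) for k in (1, 5, 59)}
print(train_errors)
```

Because the error is evaluated on the same points used for the vote, it says nothing about how the classifier behaves on new data.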
![Page 37: Translational Cancer Medicine - King's College London › ton.coolen › SBGP › lecture... · Introduction Overview 1 Quantitative assessment of the results I Identi cation of genes](https://reader033.fdocuments.us/reader033/viewer/2022053011/5f0efbf47e708231d441e8f6/html5/thumbnails/37.jpg)
Statistical Decision Theory: Bias-Variance decomposition
The prediction error can be decomposed into 3 main sources:
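$$
\begin{aligned}
E\big[(y - \hat f(x))^2 \mid D\big] &= E\big[(y - f(x) + f(x) - \hat f(x))^2 \mid D\big] \\
&= E\big[(y - f(x))^2\big] + E\big[(f(x) - \hat f(x))^2\big] \\
&= \sigma^2 + E\big[(\hat f(x) - E(\hat f(x)) + E(\hat f(x)) - f(x))^2\big] \\
&= \sigma^2 + \underbrace{E\big[(\hat f(x) - E(\hat f(x)))^2\big]}_{\text{Variance}} + \underbrace{\big(E(\hat f(x)) - f(x)\big)^2}_{\text{Bias}^2}
\end{aligned}
$$
Here $f$ is the true function and $\hat f$ the model fitted to the data sample $D$.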
$\sigma^2$ is the irreducible error made on the measurement of $y$.
The variance is due to the choice of sample: other data samples would have led to slightly different fitted models $\hat f$.
The bias is due to the choice of model function (linear models may be chosen for their simplicity, even though their predictions may be biased).
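The variance and bias terms can be estimated by simulation. The following is a minimal sketch under assumptions that are not in the slides: the true function is taken to be sin(2πx), the noise is Gaussian with standard deviation 0.3, the models are polynomials fitted with `numpy.polyfit` on a fixed grid of 30 inputs, and the prediction is evaluated at the single point x0 = 0.25. Many training samples are drawn, and the spread (variance) and systematic offset (squared bias) of the predictions are measured.

```python
import numpy as np

def f(x):
    """True function (an assumption made for this illustration)."""
    return np.sin(2 * np.pi * x)

def bias_variance(degree, x0=0.25, sigma=0.3, n_train=30, n_rep=500, seed=0):
    """Estimate squared bias and variance of a polynomial fit at x0."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_train)  # fixed design; only the noise is resampled
    preds = np.empty(n_rep)
    for r in range(n_rep):
        y = f(x) + rng.normal(0.0, sigma, size=n_train)  # a new data sample D
        preds[r] = np.polyval(np.polyfit(x, y, degree), x0)
    bias2 = (preds.mean() - f(x0)) ** 2
    variance = preds.var()
    mse = np.mean((preds - f(x0)) ** 2)  # equals bias2 + variance
    return bias2, variance, mse

for degree in (1, 5):
    b2, var, mse = bias_variance(degree)
    print(f"degree {degree}: bias^2 = {b2:.4f}, variance = {var:.4f}, sum = {mse:.4f}")
```

In this setting the straight line is the simple but biased model, while the degree-5 polynomial has little bias and a larger variance; adding σ² to the sum gives the expected prediction error at x0.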
![Page 38: Translational Cancer Medicine - King's College London › ton.coolen › SBGP › lecture... · Introduction Overview 1 Quantitative assessment of the results I Identi cation of genes](https://reader033.fdocuments.us/reader033/viewer/2022053011/5f0efbf47e708231d441e8f6/html5/thumbnails/38.jpg)
Model assessment and selection: Definitions
We can define
model selection, which is choosing the best-performing model among the candidates, and
model assessment, which is estimating the prediction error of the chosen model on new data.
In data-rich situations, the data can be split into 3 parts:
the training set, against which the model parameters are optimised,
the validation set, used to estimate the prediction error for model selection, and
the test set, used only at the end to estimate the test error of the selected model.
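As a sketch of the three-way split, the example below selects the number of neighbours of a KNN classifier on a validation set and then estimates the error of the selected classifier on a test set. The simulated data, the 50/25/25 split, the candidate values of k and the helper `knn_error` are all assumptions made for illustration.

```python
import numpy as np

def knn_error(X_train, y_train, X_eval, y_eval, k):
    """Misclassification rate of a k-nearest-neighbour vote on (X_eval, y_eval)."""
    d2 = ((X_eval[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    pred = (y_train[nearest].mean(axis=1) > 0.5).astype(int)
    return float(np.mean(pred != y_eval))

# Simulated two-class data: class 1 is shifted by 1.5 in both coordinates
rng = np.random.default_rng(1)
n = 400
y = rng.integers(0, 2, size=n)
X = rng.normal(0.0, 1.0, size=(n, 2)) + 1.5 * y[:, None]

# Random split into training (50%), validation (25%) and test (25%) sets
idx = rng.permutation(n)
train, valid, test = idx[:200], idx[200:300], idx[300:]

# Model selection: choose k with the smallest validation error
candidates = [1, 5, 15, 59]
valid_errors = {k: knn_error(X[train], y[train], X[valid], y[valid], k)
                for k in candidates}
best_k = min(valid_errors, key=valid_errors.get)

# Model assessment: the test set is used once, for the selected model only
test_error = knn_error(X[train], y[train], X[test], y[test], best_k)
print(best_k, valid_errors, test_error)
```

The validation error of the selected k is optimistically biased, because k was chosen to minimise it; this is why a separate test set is kept aside.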
![Page 39: Translational Cancer Medicine - King's College London › ton.coolen › SBGP › lecture... · Introduction Overview 1 Quantitative assessment of the results I Identi cation of genes](https://reader033.fdocuments.us/reader033/viewer/2022053011/5f0efbf47e708231d441e8f6/html5/thumbnails/39.jpg)
Clustering of gene expression: Different algorithms lead to different clustering results
D’haeseleer (2005). Nat. Biotechnol. 23, 1499-1501.
(a) Original clusters
(b) Hierarchical clustering
(c) K-means
(d) Self-organising maps (SOM)
![Page 40: Translational Cancer Medicine - King's College London › ton.coolen › SBGP › lecture... · Introduction Overview 1 Quantitative assessment of the results I Identi cation of genes](https://reader033.fdocuments.us/reader033/viewer/2022053011/5f0efbf47e708231d441e8f6/html5/thumbnails/40.jpg)
Clustering of expression data: Distances and Proximity matrices
To cluster data, one needs to define the notion of proximity between data points. Formally:
The proximity $d_{ij}$ between inputs $x_i$ and $x_j$ must be defined.
In many cases (but not all), proximities enjoy the mathematical properties of distances:
- $d_{ij} \ge 0$
- $d_{ij} = 0 \Leftrightarrow x_i = x_j$
- $d_{ij} = d_{ji}$
- $d_{ij} \le d_{ik} + d_{kj}$
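The four properties can be tested directly on a proximity matrix. The sketch below uses a hypothetical helper, `check_distance_axioms`, and a toy example of three points on a line; both are assumptions made for illustration. It shows that the Euclidean distance satisfies all four properties, whereas the squared Euclidean distance fails the triangle inequality.

```python
import numpy as np

def check_distance_axioms(D, tol=1e-12):
    """Check the four distance properties on a square proximity matrix D.

    Assumes the underlying data points are all distinct, so that
    d_ij = 0 is expected on the diagonal only.
    """
    D = np.asarray(D, dtype=float)
    off_diag = ~np.eye(len(D), dtype=bool)
    return {
        "non_negative": bool(np.all(D >= -tol)),
        "zero_iff_identical": bool(np.all(np.abs(np.diag(D)) <= tol)
                                   and np.all(D[off_diag] > tol)),
        "symmetric": bool(np.allclose(D, D.T, atol=tol)),
        # d_ij <= d_ik + d_kj for every triple (i, k, j)
        "triangle": bool(np.all(D[:, None, :] <= D[:, :, None] + D[None, :, :] + tol)),
    }

points = np.array([0.0, 1.0, 2.0])                    # three points on a line
euclid = np.abs(points[:, None] - points[None, :])    # Euclidean distance in 1-D
squared = euclid ** 2                                 # squared Euclidean distance

print(check_distance_axioms(euclid))
print(check_distance_axioms(squared))  # triangle fails: 4 > 1 + 1
```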
![Page 41: Translational Cancer Medicine - King's College London › ton.coolen › SBGP › lecture... · Introduction Overview 1 Quantitative assessment of the results I Identi cation of genes](https://reader033.fdocuments.us/reader033/viewer/2022053011/5f0efbf47e708231d441e8f6/html5/thumbnails/41.jpg)
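Distances and Proximity matrices: Examples
Examples of true distances:
$$d_p(x_i, x_j) = \Big(\sum_k |x_{i,k} - x_{j,k}|^p\Big)^{1/p}$$
$p = 2$: Euclidean, $p = 1$: Manhattan, $p = \infty$: Maximum
Examples of dissimilarities that are not true distances:
$$r = 1 - \frac{x_i \cdot x_j}{\|x_i\|\,\|x_j\|} \quad \text{(Pearson correlation, for mean-centred } x_i, x_j\text{)}$$
$$D = \frac{\sum_k |x_{i,k} - x_{j,k}|}{\sum_k |x_{i,k} + x_{j,k}|} \quad \text{(Canberra)}$$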
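A short numerical illustration of these proximities follows. It is a sketch only: the function names `minkowski` and `correlation_dissimilarity` and the example vectors are assumptions, and the correlation is computed by mean-centring the vectors before applying the cosine formula.

```python
import numpy as np

def minkowski(x, y, p):
    """d_p distance; p = 1 Manhattan, p = 2 Euclidean, p = inf Maximum."""
    diff = np.abs(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    if np.isinf(p):
        return float(diff.max())
    return float((diff ** p).sum() ** (1.0 / p))

def correlation_dissimilarity(x, y):
    """1 - Pearson correlation (cosine formula on mean-centred vectors)."""
    xc = np.asarray(x, dtype=float) - np.mean(x)
    yc = np.asarray(y, dtype=float) - np.mean(y)
    return float(1.0 - xc @ yc / (np.linalg.norm(xc) * np.linalg.norm(yc)))

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 0.0, 3.0])

print(minkowski(a, b, 1))       # Manhattan: 3 + 2 + 0
print(minkowski(a, b, 2))       # Euclidean: sqrt(9 + 4 + 0)
print(minkowski(a, b, np.inf))  # Maximum: largest coordinate difference

# Two different expression profiles with the same shape are at
# (numerically) zero correlation dissimilarity, so d = 0 does not imply x_i = x_j.
print(correlation_dissimilarity(a, 2 * a + 1))
```

This is one reason correlation-based proximities are popular for expression profiles: genes whose profiles differ in scale and offset but move together are treated as close.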
![Page 42: Translational Cancer Medicine - King's College London › ton.coolen › SBGP › lecture... · Introduction Overview 1 Quantitative assessment of the results I Identi cation of genes](https://reader033.fdocuments.us/reader033/viewer/2022053011/5f0efbf47e708231d441e8f6/html5/thumbnails/42.jpg)
Distances and Proximity matrices: Optimisation of cluster assignments
Clustering is an assignment of data points into $K$ clusters such that the distances between points from the same cluster are minimised.
The optimisation can be done on the within-cluster point scatter
$$W(C) = \frac{1}{2} \sum_{k=1}^{K} \sum_{i, i' \in C_k} d(x_i, x_{i'})$$
Exhaustive search is ruled out by the combinatorial explosion of the number of possible assignments of $N$ data points to $K$ clusters:
$$S(N, K) = \frac{1}{K!} \sum_{k=1}^{K} (-1)^{K-k} \binom{K}{k} k^N$$
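Both formulas are easy to evaluate for small cases. The sketch below is illustrative only: the function names `n_assignments` and `within_cluster_scatter` are assumptions, the toy data are made up, and the Euclidean distance is used for $d$ although any proximity could be substituted.

```python
from math import comb, factorial

import numpy as np

def n_assignments(N, K):
    """S(N, K): number of ways to assign N points to K non-empty clusters."""
    total = sum((-1) ** (K - k) * comb(K, k) * k ** N for k in range(1, K + 1))
    return total // factorial(K)  # the sum is always divisible by K!

def within_cluster_scatter(X, labels):
    """W(C) with d taken to be the Euclidean distance (an assumption)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    W = 0.0
    for k in np.unique(labels):
        pts = X[labels == k]
        d = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2))
        W += 0.5 * d.sum()  # each pair (i, i') is counted twice in the double sum
    return W

print(n_assignments(10, 4))  # 34105
print(n_assignments(25, 4))  # already far too many to enumerate

X = np.array([[0.0], [1.0], [10.0], [11.0]])
print(within_cluster_scatter(X, [0, 0, 1, 1]))  # compact clusters: 2.0
print(within_cluster_scatter(X, [0, 1, 0, 1]))  # mixed clusters: 20.0
```

Even for 10 points and 4 clusters there are tens of thousands of assignments, which is why practical algorithms search only a small part of the space.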
![Page 43: Translational Cancer Medicine - King's College London › ton.coolen › SBGP › lecture... · Introduction Overview 1 Quantitative assessment of the results I Identi cation of genes](https://reader033.fdocuments.us/reader033/viewer/2022053011/5f0efbf47e708231d441e8f6/html5/thumbnails/43.jpg)
K-means: Definition
An algorithm to quickly find an approximation to the optimal assignment of points into $K$ clusters.
The number of clusters $K$ is fixed a priori, and the algorithm returns an assignment in which each observation belongs to exactly one cluster.
"Representative" points (the cluster centroids) are used to assign the data points to clusters.
The objective function is $\sum_{k=1}^{K} \sum_{x_i \in C_k} \|x_i - m_k\|^2$, where $m_k$ is the centroid of cluster $C_k$, and each observation $x_i$ is assigned to the cluster whose centroid is nearest.
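A minimal sketch of the standard iteration (often called Lloyd's algorithm) is given below. It is not taken from the slides: the function `kmeans`, the random choice of initial seeds among the observations, the stopping rule and the simulated three-cluster data are all assumptions. Each iteration assigns every point to its nearest centroid and then moves each centroid to the mean of its assigned points; neither step can increase the objective function.

```python
import numpy as np

def kmeans(X, K, n_iter=100, seed=0):
    """Alternate nearest-centroid assignment and centroid update."""
    rng = np.random.default_rng(seed)
    # Initial seeds: K distinct observations picked at random (one common choice)
    centroids = X[rng.choice(len(X), size=K, replace=False)].copy()
    objective = []
    for _ in range(n_iter):
        # Assignment step: each observation goes to the nearest centroid
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        objective.append(float(d2[np.arange(len(X)), labels].sum()))
        # Update step: each centroid becomes the mean of its members
        new_centroids = centroids.copy()
        for k in range(K):
            members = X[labels == k]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                new_centroids[k] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break  # assignments can no longer change
        centroids = new_centroids
    return labels, centroids, objective

# Simulated data: three Gaussian clusters (an assumption for illustration)
rng = np.random.default_rng(42)
centres = np.array([[0.0, 0.0], [6.0, 0.0], [3.0, 5.0]])
X = np.vstack([rng.normal(c, 1.0, size=(100, 2)) for c in centres])

labels, centroids, objective = kmeans(X, K=3)
print(len(objective), objective[-1])
```

The result depends on the initial seeds and may be a local optimum only, so in practice the algorithm is usually run from several random starts and the solution with the smallest objective is kept.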
![Page 44: Translational Cancer Medicine - King's College London › ton.coolen › SBGP › lecture... · Introduction Overview 1 Quantitative assessment of the results I Identi cation of genes](https://reader033.fdocuments.us/reader033/viewer/2022053011/5f0efbf47e708231d441e8f6/html5/thumbnails/44.jpg)
K-means: Examples
[Figure: scatter plot of the data points, 3 clusters, both axes from −2 to 8. Panel: Truth]
![Page 45: Translational Cancer Medicine - King's College London › ton.coolen › SBGP › lecture... · Introduction Overview 1 Quantitative assessment of the results I Identi cation of genes](https://reader033.fdocuments.us/reader033/viewer/2022053011/5f0efbf47e708231d441e8f6/html5/thumbnails/45.jpg)
K-means: Examples
[Figure: scatter plot of the data points, 3 clusters, both axes from −2 to 8. Panel: Starting point]
![Page 46: Translational Cancer Medicine - King's College London › ton.coolen › SBGP › lecture... · Introduction Overview 1 Quantitative assessment of the results I Identi cation of genes](https://reader033.fdocuments.us/reader033/viewer/2022053011/5f0efbf47e708231d441e8f6/html5/thumbnails/46.jpg)
K-means: Examples
[Figure: scatter plot of the data points, 3 clusters, both axes from −2 to 8. Panel: Initial seeds]
![Page 47: Translational Cancer Medicine - King's College London › ton.coolen › SBGP › lecture... · Introduction Overview 1 Quantitative assessment of the results I Identi cation of genes](https://reader033.fdocuments.us/reader033/viewer/2022053011/5f0efbf47e708231d441e8f6/html5/thumbnails/47.jpg)
K-means: Examples
[Figure: scatter plot of the data points, 3 clusters, both axes from −2 to 8. Panel: After 1 iteration]
![Page 48: Translational Cancer Medicine - King's College London › ton.coolen › SBGP › lecture... · Introduction Overview 1 Quantitative assessment of the results I Identi cation of genes](https://reader033.fdocuments.us/reader033/viewer/2022053011/5f0efbf47e708231d441e8f6/html5/thumbnails/48.jpg)
K-means: Examples
[Figure: scatter plot of the data points, 3 clusters, both axes from −2 to 8. Panel: After 2 iterations]
![Page 49: Translational Cancer Medicine - King's College London › ton.coolen › SBGP › lecture... · Introduction Overview 1 Quantitative assessment of the results I Identi cation of genes](https://reader033.fdocuments.us/reader033/viewer/2022053011/5f0efbf47e708231d441e8f6/html5/thumbnails/49.jpg)
K-means: Examples
[Figure: scatter plot of the data points, 3 clusters, both axes from −2 to 8. Panel: After 3 iterations]
![Page 50: Translational Cancer Medicine - King's College London › ton.coolen › SBGP › lecture... · Introduction Overview 1 Quantitative assessment of the results I Identi cation of genes](https://reader033.fdocuments.us/reader033/viewer/2022053011/5f0efbf47e708231d441e8f6/html5/thumbnails/50.jpg)
K-means: Examples
[Figure: scatter plot of the data points, 3 clusters, both axes from −2 to 8. Panel: After 4 iterations]
![Page 51: Translational Cancer Medicine - King's College London › ton.coolen › SBGP › lecture... · Introduction Overview 1 Quantitative assessment of the results I Identi cation of genes](https://reader033.fdocuments.us/reader033/viewer/2022053011/5f0efbf47e708231d441e8f6/html5/thumbnails/51.jpg)
K-means: Examples
[Figure: scatter plot of the data points, 3 clusters, both axes from −2 to 8. Panel: After 5 iterations]
![Page 52: Translational Cancer Medicine - King's College London › ton.coolen › SBGP › lecture... · Introduction Overview 1 Quantitative assessment of the results I Identi cation of genes](https://reader033.fdocuments.us/reader033/viewer/2022053011/5f0efbf47e708231d441e8f6/html5/thumbnails/52.jpg)
K-means: Examples
[Figure: K-means scatter plot, 3 clusters, after 6 iterations; both axes run from −2 to 8]
K-means: Examples
[Figure: K-means scatter plot, 3 clusters, after 7 iterations; both axes run from −2 to 8]
K-means: Examples
[Figure: K-means scatter plot, 3 clusters, after 8 iterations; both axes run from −2 to 8]
K-means: Examples
[Figure: K-means scatter plot, 3 clusters, at convergence; both axes run from −2 to 8]
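The iteration shown in these figures can be sketched in a few lines. This is a minimal illustration of the standard K-means (Lloyd's) procedure, not the code used to produce the slides: the function name `kmeans`, the choice of random data points as initial centroids, and the squared Euclidean distance are all assumptions made for the example.

```python
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-means on a list of equal-length tuples (hypothetical helper).

    Returns (centroids, labels, number of iterations performed).
    Assumes max_iter >= 1 and k <= len(points).
    """
    rng = random.Random(seed)
    # Assumption: initial centroids are k data points drawn at random.
    centroids = rng.sample(points, k)
    labels = None
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # no assignment changed: converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels, iteration


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
    cents, labs, n_iter = kmeans(pts, k=2)
    print(cents, labs, n_iter)
```

The procedure only guarantees a local optimum: the final assignment depends on the initial centroids, so in practice it is usually run from several starting points.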
K-means: Examples
[Figure: K-means scatter plot, 3 clusters, mislabelled points; both axes run from −2 to 8]
K-means: Examples
[Figure: unsuccessful 2-cluster assignment at convergence; x axis from −3 to 3, y axis from −6 to 6]
K-means: Examples
[Figure: unsuccessful 2-cluster assignment at convergence; both axes run from −2 to 6]
Classification and clustering: Summary
Classification and clustering are both difficult problems, with many different competing algorithms to address them.
The error made by classifiers can (in principle) be estimated using test sets (reference data sets not used for training or model selection).
An objective estimate of the error is not possible for clustering, as the true outputs are never known.
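The point about test sets can be made concrete with a small sketch. It assumes a very simple nearest-centroid classifier and a random train/test split; the names `split_train_test`, `fit_centroids`, `predict` and `error_rate` are hypothetical and chosen only for this illustration. The error is computed on samples that were never used for training.

```python
import random


def split_train_test(samples, test_fraction=0.3, seed=0):
    """Shuffle (features, label) pairs and hold out a fraction as test set."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_centroids(train):
    """Training: compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared Euclidean)."""
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))


def error_rate(centroids, test):
    """Fraction of held-out samples that are misclassified."""
    wrong = sum(1 for x, y in test if predict(centroids, x) != y)
    return wrong / len(test)


if __name__ == "__main__":
    # Toy data: class 0 near 0, class 1 near 10 (one feature per sample).
    data = [((i / 10.0,), 0) for i in range(10)]
    data += [((10 + i / 10.0,), 1) for i in range(10)]
    train, test = split_train_test(data)
    model = fit_centroids(train)
    print("test error:", error_rate(model, test))
```

No equivalent of `error_rate` exists for clustering, since there are no known labels to compare the assignments against.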
Data processing and quality control: Besides specific hybridization
There are many ways in which cross-hybridization and folding can affect the measured intensity.
Taken from Binder (2006). J. Phys. Condens. Matter 18 S491-S523
Data processing and quality control: Trivial experimental problems
False-colour imaging of a micro-array experiment
The regions with low-reliability intensity measurements are shown in green
The green regions probably highlight locations of air bubbles which have limited the hybridization
Data processing and quality control: Biological samples variability
Agreement of measured intensities between 3 biological replicates, displayed on a logarithmic scale.
The two replicates on the left agree well, while there are considerable differences with the third replicate.
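One simple way to put a number on the agreement between two replicates is the correlation of their log-transformed intensities. The sketch below is only an illustration: the function name `log2_pearson` and the choice of the Pearson coefficient are assumptions, not something taken from the slides, and the intensities are assumed to be strictly positive.

```python
import math


def log2_pearson(a, b):
    """Pearson correlation of log2 intensities between two replicates."""
    la = [math.log2(v) for v in a]
    lb = [math.log2(v) for v in b]
    n = len(la)
    mean_a, mean_b = sum(la) / n, sum(lb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(la, lb))
    var_a = sum((x - mean_a) ** 2 for x in la)
    var_b = sum((y - mean_b) ** 2 for y in lb)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    rep1 = [100.0, 200.0, 400.0, 800.0]
    rep2 = [2.0 * v for v in rep1]  # same profile, globally twice as bright
    print(log2_pearson(rep1, rep2))
```

A global scaling between two arrays becomes a constant shift on the log scale, so it does not lower the correlation; a replicate that disagrees gene by gene does.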
Data processing and quality control: Importance of processing algorithm
Agreement between the top differentially expressed genes as predicted by 4 different algorithms
The main processing step involved with micro-array data is called “Normalisation”
Many different algorithms have been proposed and implemented
There is no theoretical justification for choosing one over another
The choice of algorithm can have a dramatic influence on the outcome
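To give an idea of what a normalisation step does, here is a minimal sketch of quantile normalisation, one widely used choice. The slides do not say which four algorithms were compared, so this is an illustrative assumption; the function name `quantile_normalise` is hypothetical and ties are broken by position rather than averaged.

```python
def quantile_normalise(arrays):
    """Force every array to share the same intensity distribution.

    `arrays` is a list of equal-length lists, one per micro-array, with
    values for the same genes in the same order.
    """
    n_genes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the r-th smallest value across arrays.
    reference = [
        sum(s[r] for s in sorted_arrays) / len(arrays) for r in range(n_genes)
    ]
    normalised = []
    for a in arrays:
        # Gene indices ordered by increasing intensity on this array.
        order = sorted(range(n_genes), key=lambda i: a[i])
        out = [0.0] * n_genes
        for rank, gene in enumerate(order):
            out[gene] = reference[rank]  # replace value by reference quantile
        normalised.append(out)
    return normalised


if __name__ == "__main__":
    print(quantile_normalise([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
    # → [[5.5, 1.5, 3.5], [3.5, 1.5, 5.5]]
```

After this step every array has exactly the same set of values, and only the ranking of genes within each array is kept. Other normalisation methods make different assumptions, which is one way the choice of algorithm can change the list of top genes.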