Amin Fazel 2006
Gaussian Mixture Model Classification of Multi-Color Fluorescence In Situ Hybridization (M-FISH) Images
Amin Fazel
2006
Department of Computer Science and Electrical Engineering, University of Missouri – Kansas City
Thursday, June, 2006
CS and EE Department, UMKC
Motivation and Goals
• Chromosomes store genetic information
• Chromosome images can indicate genetic disease, cancer, radiation damage, etc.
• Research goals:
– Locate and classify each chromosome in an image
– Locate chromosome abnormalities
Karyotyping
• 46 human chromosomes form 24 types
– 22 different pairs
– 2 sex chromosomes, X and Y
• Grouped and ordered by length
Figure panels: Banding Patterns; Karyotype
Multi-spectral Chromosome Imaging
• Multiplex Fluorescence In-Situ Hybridization (M-FISH) [1996]
• Five color dyes (fluorophores)
• Each human chromosome type absorbs a unique combination of the dyes
• 32 ($2^5$) possible combinations of dyes distinguish 24 human chromosome types
Healthy Male
M-FISH Images
• 6th dye (DAPI) binds to all chromosomes
Figure panels: DAPI Channel (6th dye); M-FISH Image (5 dyes)
M-FISH Images
• An image of each dye is obtained with the appropriate optical filter
• Each pixel is a six-dimensional vector
• Each vector element gives the contribution of one dye at that pixel
• Chromosomal origin is distinguishable at a single pixel (unless chromosomes overlap)
• Unnecessary to estimate length, relative centromere position, or banding pattern
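As a minimal sketch of the six-dimensional pixel representation described above, the following stacks six single-dye images into one feature vector per pixel. The array names, the 4x4 image size, and the random intensities are illustrative assumptions, not data from the slides.

```python
import numpy as np

# Hypothetical input: six registered grayscale images (five M-FISH dyes + DAPI),
# each of height H and width W. Random values stand in for real intensities.
rng = np.random.default_rng(0)
H, W = 4, 4
channels = [rng.random((H, W)) for _ in range(6)]

# Stack along a new last axis: image_cube[r, c] is the 6-D vector of pixel (r, c).
image_cube = np.stack(channels, axis=-1)   # shape (H, W, 6)

# Flatten to one row per pixel, a common layout for pixel-wise classifiers.
pixels = image_cube.reshape(-1, 6)         # shape (H * W, 6)
```

Each row of `pixels` can then be treated as one feature vector x in the classification steps that follow.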
Bayesian Classification
• Based on probability theory
– A feature vector is denoted as $x = [x_1, x_2, \ldots, x_D]^T$
– D is the dimension of the vector
• The probability that a feature vector x belongs to class $c_k$ is $P(c_k | x)$; this posterior probability can be computed via
$$P(c_k | x) = \frac{p(x | c_k)\, P(c_k)}{p(x)}$$
• and
$$p(x) = \sum_{i=1}^{K} p(x | c_i)\, P(c_i)$$
• where $p(x | c_k)$ is the probability density function of class $c_k$, $P(c_k)$ is its prior probability, and K is the number of classes
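The Bayes rule above can be checked numerically with a small sketch. The function name `posteriors` and the three-class numbers are made up for illustration.

```python
import numpy as np

def posteriors(likelihoods, priors):
    """Return the posterior probability of every class for one feature vector.

    likelihoods[k] is the class-conditional density p(x | class k),
    priors[k] is the prior probability of class k.
    """
    likelihoods = np.asarray(likelihoods, dtype=float)
    priors = np.asarray(priors, dtype=float)
    joint = likelihoods * priors      # numerator of Bayes' rule, per class
    evidence = joint.sum()            # p(x): sum of the numerators over all classes
    return joint / evidence

# Made-up likelihoods and priors for three classes (assumption).
post = posteriors([0.2, 0.5, 0.1], [0.5, 0.25, 0.25])
```

Because every posterior shares the same denominator p(x), the class with the largest numerator is also the class with the largest posterior.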
Gaussian Probability Density Function
• In the D-dimensional space
$$N(x; \mu, \Sigma) = \frac{1}{(2\pi)^{D/2}\, |\Sigma|^{1/2}}\, e^{-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)}$$
• $\mu$ is the mean vector
• $\Sigma$ is the covariance matrix
– The Gaussian distribution carries the assumption that the class model is truly a model of one basic class
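A minimal NumPy sketch of the D-dimensional Gaussian density follows. The function name `gaussian_pdf` is an assumption, and the code uses a linear solve in place of an explicit matrix inverse, which is a numerical choice rather than something stated on the slide.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Evaluate the multivariate Gaussian density at one D-dimensional vector x."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    D = mu.shape[0]
    diff = x - mu
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu), via a solve instead of an inverse.
    maha = diff @ np.linalg.solve(sigma, diff)
    norm = (2.0 * np.pi) ** (D / 2.0) * np.sqrt(np.linalg.det(sigma))
    return np.exp(-0.5 * maha) / norm

# At the mean of a 2-D standard normal the density equals 1 / (2 * pi).
value = gaussian_pdf([0.0, 0.0], [0.0, 0.0], np.eye(2))
```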
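Gaussian Mixture Model (GMM)
• A GMM is a set of several Gaussians that try to represent groups / clusters of data
– they therefore represent different subclasses inside one class
– The PDF is defined as a weighted sum of Gaussians
$$p(x; \theta) = \sum_{c=1}^{C} p_c\, N(x; \mu_c, \Sigma_c)$$
– where $p_c$ is the weight of component c and $\theta$ collects all weights, means and covariance matrices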
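The weighted sum of Gaussians can be sketched as below. The helper names and the two-component parameters are hypothetical and chosen only to make the example easy to verify by hand.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Multivariate Gaussian density at a single vector x."""
    D = len(mu)
    diff = np.asarray(x, dtype=float) - mu
    maha = diff @ np.linalg.solve(sigma, diff)
    return np.exp(-0.5 * maha) / np.sqrt((2.0 * np.pi) ** D * np.linalg.det(sigma))

def gmm_pdf(x, weights, means, covariances):
    """Weighted sum of C Gaussian densities evaluated at x."""
    return sum(w * gaussian_pdf(x, m, s)
               for w, m, s in zip(weights, means, covariances))

# Two made-up 2-D components with equal weights (assumption).
weights = [0.5, 0.5]
means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
covariances = [np.eye(2), np.eye(2)]
density = gmm_pdf([0.0, 0.0], weights, means, covariances)
```

For the mixture to be a valid density the weights must be non-negative and sum to one, which the equal weights above satisfy.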
Gaussian Mixture Models
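Equations for GMMs:
$$p(x; \theta) = \sum_{c=1}^{C} p_c\, N(x, \mu_c, \sigma_c)$$
$$N(x, \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2 / (2\sigma^2)}$$
multi-dimensional case: $\mu$ becomes a vector $\mu$, $\sigma$ becomes a covariance matrix $\Sigma$.
$$N(x, \mu, \Sigma) = \frac{1}{(2\pi)^{D/2}\, |\Sigma|^{1/2}}\, e^{-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)}$$
assume $\Sigma$ is a diagonal matrix (shown for D = 3):
$$|\Sigma| = \prod_{i=1}^{D} \sigma_{ii}^2, \qquad \Sigma^{-1} = \begin{pmatrix} 1/\sigma_{11}^2 & 0 & 0 \\ 0 & 1/\sigma_{22}^2 & 0 \\ 0 & 0 & 1/\sigma_{33}^2 \end{pmatrix}$$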
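The benefit of the diagonal assumption is that the determinant becomes a product and the inverse becomes element-wise reciprocals. The sketch below, with made-up 3-D values and a hypothetical function name, cross-checks that shortcut against the full-matrix formula.

```python
import numpy as np

def diag_gaussian_pdf(x, mu, var):
    """Gaussian density with diagonal covariance; var holds the diagonal variances."""
    x, mu, var = (np.asarray(a, dtype=float) for a in (x, mu, var))
    D = mu.shape[0]
    det = np.prod(var)                    # determinant = product of the variances
    maha = np.sum((x - mu) ** 2 / var)    # inverse = reciprocal of each variance
    return np.exp(-0.5 * maha) / np.sqrt((2.0 * np.pi) ** D * det)

# Made-up 3-D example (assumption), compared with the full covariance formula.
x = np.array([1.0, 2.0, 3.0])
mu = np.array([0.5, 1.5, 2.0])
var = np.array([1.0, 4.0, 0.25])
sigma = np.diag(var)
diff = x - mu
maha_full = diff @ np.linalg.inv(sigma) @ diff
full = np.exp(-0.5 * maha_full) / np.sqrt((2.0 * np.pi) ** 3 * np.linalg.det(sigma))
fast = diag_gaussian_pdf(x, mu, var)
```

The two values agree, while the diagonal version needs only O(D) work per evaluation instead of a matrix inversion.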
GMM
• A Gaussian Mixture Model (GMM) is characterized by
– the number of components,
– the means and covariance matrices of the Gaussian components,
– the weight (height) of each component
GMM
• The GMM has the same dimension as the feature space (here, a 6-dimensional GMM)
• For visualization purposes, here are 2-dimensional GMMs:
Figure: 2-dimensional GMM plots (axes: value1, value2, likelihood)
GMM
• These parameters are tuned using an iterative procedure called Expectation Maximization (EM)
• EM algorithm: iteratively updates the distribution of each Gaussian component and the conditional probabilities so as to increase the likelihood of the training data.
GMM Training Flow Chart (1)
• Initialize the Gaussian means μi using the K-means clustering algorithm
• Initialize the covariance matrices to the distance to the nearest cluster
• Initialize the weights to 1 / C so that all Gaussians are equally likely
• K-means clustering
1. Initialization: random or max. distance.
2. Search: for each training vector, find the closest code word and assign this training vector to that cell
3. Centroid Update: for each cell, compute the centroid of that cell. The new code word is the centroid.
4. Repeat (2)-(3) until average distance falls below threshold
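The initialization steps above can be sketched as follows. This is one possible reading, not the author's implementation: the function names are hypothetical, the stopping test uses the improvement in average squared distance (a variation on the slide's threshold), and the covariance is taken as an isotropic matrix scaled by the squared distance to the nearest other mean, so it assumes C is at least 2.

```python
import numpy as np

def kmeans(X, C, n_iter=100, tol=1e-9, seed=0):
    """Plain K-means on X of shape (N, D); returns the C centroids."""
    rng = np.random.default_rng(seed)
    # 1. Initialization: pick C distinct training vectors at random.
    centroids = X[rng.choice(len(X), size=C, replace=False)].copy()
    prev = np.inf
    for _ in range(n_iter):
        # 2. Search: assign each vector to its closest code word.
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # 4. Stop once the average squared distance no longer improves.
        avg = (dist.min(axis=1) ** 2).mean()
        if prev - avg < tol:
            break
        prev = avg
        # 3. Centroid update (an empty cell keeps its old code word).
        for c in range(C):
            if np.any(labels == c):
                centroids[c] = X[labels == c].mean(axis=0)
    return centroids

def init_gmm(X, C):
    """Initial weights, means and covariances for EM (requires C >= 2)."""
    means = kmeans(X, C)
    weights = np.full(C, 1.0 / C)            # all components equally likely
    d = np.linalg.norm(means[:, None, :] - means[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)              # ignore distance of a mean to itself
    covs = np.array([np.eye(X.shape[1]) * d[c].min() ** 2 for c in range(C)])
    return weights, means, covs

# Two synthetic, well-separated 2-D blobs (made-up data).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(5.0, 0.5, (50, 2))])
weights, means, covs = init_gmm(X, 2)
```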
GMM Training Flow Chart (2)
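E step: Compute the conditional expectation of the complete log-likelihood (that is, evaluate the posterior probabilities that relate each cluster to each data point), assuming the current cluster parameters to be correct
$$w_{n,c} = \frac{p_c^{i}\, p(x_n | c; \theta^{i})}{\sum_{j=1}^{C} p_j^{i}\, p(x_n | j; \theta^{i})}$$
M step: Find the cluster parameters that maximize the likelihood of the data, assuming that the current data distribution (the posteriors $w_{n,c}$) is correct.
$$p_c^{i+1} = \frac{1}{N} \sum_{n=1}^{N} w_{n,c}$$
$$\mu_c^{i+1} = \frac{\sum_{n=1}^{N} w_{n,c}\, x_n}{\sum_{n=1}^{N} w_{n,c}}$$
$$\Sigma_c^{i+1} = \frac{\sum_{n=1}^{N} w_{n,c}\, (x_n - \mu_c^{i+1})(x_n - \mu_c^{i+1})^T}{\sum_{n=1}^{N} w_{n,c}}$$
Here $p_c$ is the weight of component c, the superscript i counts iterations, and N is the number of training vectors.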
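One E step followed by one M step can be sketched in NumPy as below. The function names, the synthetic two-blob data, and the starting parameters are assumptions for illustration; a practical implementation would typically also work in the log domain and regularize the covariances, which this sketch omits.

```python
import numpy as np

def gaussian_pdf(X, mu, sigma):
    """Gaussian density for every row of X (shape (N, D))."""
    D = X.shape[1]
    diff = X - mu
    maha = np.einsum("nd,nd->n", diff @ np.linalg.inv(sigma), diff)
    return np.exp(-0.5 * maha) / np.sqrt((2.0 * np.pi) ** D * np.linalg.det(sigma))

def em_step(X, weights, means, covs):
    N, C = X.shape[0], len(weights)
    # E step: posterior w[n, c] of component c for data point n.
    w = np.column_stack([weights[c] * gaussian_pdf(X, means[c], covs[c])
                         for c in range(C)])
    w /= w.sum(axis=1, keepdims=True)
    # M step: re-estimate weights, means and covariances from the posteriors.
    Nc = w.sum(axis=0)                          # sum over n of w[n, c]
    new_weights = Nc / N
    new_means = (w.T @ X) / Nc[:, None]
    new_covs = np.empty_like(np.asarray(covs, dtype=float))
    for c in range(C):
        diff = X - new_means[c]
        new_covs[c] = (w[:, c, None] * diff).T @ diff / Nc[c]
    return new_weights, new_means, new_covs, w

# Synthetic data: two 2-D blobs around (-2, -2) and (2, 2) (made-up values).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, (200, 2)), rng.normal(2.0, 1.0, (200, 2))])
weights = np.array([0.5, 0.5])
means = np.array([[-1.0, -1.0], [1.0, 1.0]])
covs = np.array([np.eye(2), np.eye(2)])
for _ in range(20):
    weights, means, covs, w = em_step(X, weights, means, covs)
```

After a few iterations the estimated means move from the rough starting guesses toward the centers of the two blobs.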
GMM Training Flow Chart (3)
• Recompute $w_{n,c}$ using the new weights, means and covariances. Stop training if the change between successive iterations is small:
– $|w_{n,c}^{i+1} - w_{n,c}^{i}| < \text{threshold}$
• or if the number of epochs reaches the specified value. Otherwise, continue the iterative updates.
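The stopping rule can be sketched independently of the EM details. Below, `step` is a hypothetical stand-in for one full EM update that returns the new posteriors, and the toy update used to exercise the loop is not EM at all; only the two exit conditions mirror the slide.

```python
import numpy as np

def train_until_converged(step, w0, threshold=1e-4, max_epochs=100):
    """Iterate `step` until the largest change falls below threshold or epochs run out."""
    w = w0
    for epoch in range(1, max_epochs + 1):
        w_new = step(w)
        # Exit 1: largest absolute change between successive iterations is small.
        if np.max(np.abs(w_new - w)) < threshold:
            return w_new, epoch
        w = w_new
    # Exit 2: the specified number of epochs was reached.
    return w, max_epochs

# Toy update that converges to 0.5 (an assumption, used only to drive the loop).
w_final, epochs_used = train_until_converged(
    lambda w: 0.5 * w + 0.25, np.array([0.0, 1.0]))
```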
GMM Test Flow Chart
• Present each input pattern x and compute the confidence for each class k:
$$P(c_k)\, p(x | c_k)$$
• where $P(c_k)$ is the prior probability of class $c_k$, estimated by counting the number of training patterns in that class
• Classify pattern x as the class with the highest confidence.
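The test procedure can be sketched as below, with one GMM per class. The class models, the training counts, and all function names are made up for illustration; in the slides each class would be a chromosome type with a GMM trained on its 6-dimensional pixel vectors.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Multivariate Gaussian density at a single vector x."""
    D = len(mu)
    diff = np.asarray(x, dtype=float) - mu
    maha = diff @ np.linalg.solve(sigma, diff)
    return np.exp(-0.5 * maha) / np.sqrt((2.0 * np.pi) ** D * np.linalg.det(sigma))

def gmm_pdf(x, gmm):
    """Class-conditional density of x under one class's GMM."""
    weights, means, covs = gmm
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, covs))

def classify(x, class_gmms, priors):
    """Return (index of the most confident class, confidence per class)."""
    conf = np.array([priors[k] * gmm_pdf(x, class_gmms[k])
                     for k in range(len(class_gmms))])
    return int(conf.argmax()), conf

# Hypothetical 2-D problem: class 0 uses one Gaussian, class 1 uses two.
gmm_a = ([1.0], [np.array([0.0, 0.0])], [np.eye(2)])
gmm_b = ([0.5, 0.5],
         [np.array([4.0, 4.0]), np.array([6.0, 4.0])],
         [np.eye(2), np.eye(2)])

# Priors estimated by counting training patterns per class (made-up counts).
counts = np.array([300, 100])
priors = counts / counts.sum()

label_near_a, _ = classify([0.5, -0.2], [gmm_a, gmm_b], priors)
label_near_b, conf_b = classify([5.0, 4.1], [gmm_a, gmm_b], priors)
```

Dividing each confidence by their sum would give the posterior probabilities, but the arg-max decision is the same either way.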
Results
Training Input Data
Results
Figure panels: One Gaussian (correctness); Two Gaussians (correctness); True label
Thanks for your patience!