Learning on Probabilistic Labels Peng Peng, Raymond Chi-wing Wong, Philip S. Yu CSE, HKUST 1.
COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’...
-
Upload
cameron-jerome-tucker -
Category
Documents
-
view
221 -
download
1
Transcript of COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’...
![Page 1: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/1.jpg)
COMP5331 1
COMP5331
Clustering
Prepared by Raymond WongSome parts of this notes are borrowed from LW Chan’s
notesPresented by Raymond Wong
raywong@cse
![Page 2: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/2.jpg)
COMP5331 2
Clustering
Computer
History
Raymond
100 40
Louis 90 45
Wyman 20 95
… … …Computer
History
Cluster 1(e.g. High Score in Computer and Low Score in History)
Cluster 2(e.g. High Score in Historyand Low Score in Computer)
Problem: to find all clusters
![Page 3: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/3.jpg)
COMP5331 3
Why Clustering?
Clustering for Utility Summarization Compression
![Page 4: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/4.jpg)
COMP5331 4
Why Clustering? Clustering for Understanding
Applications Biology
Group different species Psychology and Medicine
Group medicine Business
Group different customers for marketing Network
Group different types of traffic patterns Software
Group different programs for data analysis
![Page 5: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/5.jpg)
COMP5331 5
Clustering Methods K-means Clustering
Original k-means Clustering Sequential K-means Clustering Forgetful Sequential K-means
Clustering Hierarchical Clustering Methods
Agglomerative methods Divisive methods – polythetic
approach and monothetic approach
![Page 6: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/6.jpg)
COMP5331 6
K-mean Clustering Suppose that we have n example
feature vectors x1, x2, …, xn, and we know that they fall into k compact clusters, k < n
Let mi be the mean of the vectors in cluster i.
we can say that x is in cluster i if distance from x to mi is the minimum of all the k distances.
![Page 7: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/7.jpg)
COMP5331 7
Procedure for finding k-means Make initial guesses for the means
m1, m2, .., mk
Until there is no change in any mean Assign each data point to the cluster
whose mean is the nearest Calculate the mean of each cluster For i from 1 to k
Replace mi with the mean of all examples for cluster i
![Page 8: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/8.jpg)
COMP5331 8
Procedure for finding k-means
0
1
2
3
4
5
6
7
8
9
10
0 1 2 3 4 5 6 7 8 9 100
1
2
3
4
5
6
7
8
9
10
0 1 2 3 4 5 6 7 8 9 10
0
1
2
3
4
5
6
7
8
9
10
0 1 2 3 4 5 6 7 8 9 10
0
1
2
3
4
5
6
7
8
9
10
0 1 2 3 4 5 6 7 8 9 10
0
1
2
3
4
5
6
7
8
9
10
0 1 2 3 4 5 6 7 8 9 10
k=2
Arbitrarily choose k means
Assign each objects to most similar center
Update the cluster means
Update the cluster means
reassignreassign
![Page 9: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/9.jpg)
COMP5331 9
Initialization of k-means The way to initialize the means was not
specified. One popular way to start is to randomly choose k of the examples
The results produced depend on the initial values for the means, and it frequently happens that suboptimal partitions are found. The standard solution is to try a number of different starting points
![Page 10: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/10.jpg)
COMP5331 10
Disadvantages of k-means
Disadvantages In a “bad” initial guess, there are no
points assigned to the cluster with the initial mean mi.
The value of k is not user-friendly. This is because we do not know the number of clusters before we want to find clusters.
![Page 11: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/11.jpg)
COMP5331 11
Clustering Methods K-means Clustering
Original k-means Clustering Sequential K-means Clustering Forgetful Sequential K-means
Clustering Hierarchical Clustering Methods
Agglomerative methods Divisive methods – polythetic
approach and monothetic approach
![Page 12: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/12.jpg)
COMP5331 12
Sequential k-Means Clustering Another way to modify the k-means
procedure is to update the means one example at a time, rather than all at once.
This is particularly attractive when we acquire the examples over a period of time, and we want to start clustering before we have seen all of the examples
Here is a modification of the k-means procedure that operates sequentially
![Page 13: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/13.jpg)
COMP5331 13
Sequential k-Means Clustering Make initial guesses for the means
m1, m2, …, mk
Set the counts n1, n2, .., nk to zero Until interrupted
Acquire the next example, x If mi is closest to x
Increment ni
Replace mi by mi + (1/ni) x (x – mi)
![Page 14: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/14.jpg)
COMP5331 14
Clustering Methods K-means Clustering
Original k-means Clustering Sequential K-means Clustering Forgetful Sequential K-means
Clustering Hierarchical Clustering Methods
Agglomerative methods Divisive methods – polythetic
approach and monothetic approach
![Page 15: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/15.jpg)
COMP5331 15
Forgetful Sequential k-means This also suggests another alternative in which
we replace the counts by constants. In particular, suppose that a is a constant between 0 and 1, and consider the following variation:
Make initial guesses for the means m1, m2, …, mk
Until interrupted Acquire the next example x If mi is closest to x, replace mi by mi+a(x-mi)
![Page 16: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/16.jpg)
COMP5331 16
Forgetful Sequential k-means
The result is called the “forgetful” sequential k-means procedure.
It is not hard to show that mi is a weighted average of the examples that were closest to mi, where the weight decreases exponentially with the “age” to the example.
That is, if m0 is the initial value of the mean vector and if xj is the j-th example out of n examples that were used to form mi, then it is not hard to show that mn = (1-a)n m0 +a k=1
n (1-a)n-kxk
![Page 17: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/17.jpg)
COMP5331 17
Forgetful Sequential k-means
Thus, the initial value m0 is eventually forgotten, and recent examples receive more weight than ancient examples.
This variation of k-means is particularly simple to implement, and it is attractive when the nature of the problem changes over time and the cluster centres “drift”.
![Page 18: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/18.jpg)
COMP5331 18
Clustering Methods K-means Clustering
Original k-means Clustering Sequential K-means Clustering Forgetful Sequential K-means
Clustering Hierarchical Clustering Methods
Agglomerative methods Divisive methods – polythetic
approach and monothetic approach
![Page 19: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/19.jpg)
COMP5331 19
Hierarchical Clustering Methods
The partition of data is not done at a single step.
There are two varieties of hierarchical clustering algorithms Agglomerative – successively fusions
of the data into groups Divisive – separate the data
successively into finer groups
![Page 20: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/20.jpg)
COMP5331 20
Dendrogram
Hierarchic grouping can be represented by two-dimensional diagram known as a dendrogram.
12
34
5
01.02.03.04.05.0Distance
Dendrogram
![Page 21: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/21.jpg)
COMP5331 21
Distance
Single Linkage Complete Linkage Group Average Linkage Centroid Linkage Median Linkage
![Page 22: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/22.jpg)
COMP5331 22
Single Linkage Also, known as the nearest neighbor
technique Distance between groups is defined as
that of the closest pair of data, where only pairs consisting of one record from each group are considered
Cluster A
Cluster B
![Page 23: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/23.jpg)
COMP5331 23
0.00.30.50.80.9
0.00.40.90.10
0.00.50.6
0.00.2
0.01
2
3
4
5
1 2 3 4 5
0.00.30.50.8
0.00.40.9
0.00.5
0.0(12)
3
4
5
(12) 3 4 5
01.02.03.04.05.0Distance
Dendrogram
12
![Page 24: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/24.jpg)
COMP5331 24
0.00.30.50.8
0.00.40.9
0.00.5
0.0(12)
3
4
5
(12) 3 4 5
12
01.02.03.04.05.0Distance
Dendrogram
![Page 25: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/25.jpg)
COMP5331 25
0.00.30.50.8
0.00.40.9
0.00.5
0.0(12)
3
4
5
(12) 3 4 5
12
54
01.02.03.04.05.0Distance
Dendrogram
0.00.40.8
0.00.5
0.0(12)
3
(4 5)
(12) 3 (4 5)
![Page 26: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/26.jpg)
COMP5331 26
12
54
01.02.03.04.05.0Distance
Dendrogram
0.00.40.8
0.00.5
0.0(12)
3
(4 5)
(12) 3 (4 5)
![Page 27: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/27.jpg)
COMP5331 27
12
54
01.02.03.04.05.0Distance
3
Dendrogram
0.00.40.8
0.00.5
0.0(12)
3
(4 5)
(12) 3 (4 5)
0.00.5
0.0(12)
(3 4 5)
(12) (3 4 5)
![Page 28: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/28.jpg)
COMP5331 28
12
54
01.02.03.04.05.0Distance
3
Dendrogram
0.00.5
0.0(12)
(3 4 5)
(12) (3 4 5)
![Page 29: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/29.jpg)
COMP5331 29
12
54
01.02.03.04.05.0Distance
3
Dendrogram
0.00.5
0.0(12)
(3 4 5)
(12) (3 4 5)
![Page 30: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/30.jpg)
COMP5331 30
Distance
Single Linkage Complete Linkage Group Average Linkage Centroid Linkage Median Linkage
![Page 31: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/31.jpg)
COMP5331 31
Complete Linkage
The distance between two clusters is given by the distance between their most distant members
Cluster A
Cluster B
![Page 32: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/32.jpg)
COMP5331 32
0.00.30.50.80.9
0.00.40.90.10
0.00.50.6
0.00.2
0.01
2
3
4
5
1 2 3 4 5
0.00.30.50.9
0.00.40.10
0.00.6
0.0(12)
3
4
5
(12) 3 4 5
02.04.06.08.010.0Distance
Dendrogram
12
![Page 33: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/33.jpg)
COMP5331 33
0.00.30.50.9
0.00.40.10
0.00.6
0.0(12)
3
4
5
(12) 3 4 5
12
02.04.06.08.010.0Distance
Dendrogram
![Page 34: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/34.jpg)
COMP5331 34
0.00.30.50.9
0.00.40.10
0.00.6
0.0(12)
3
4
5
(12) 3 4 5
12
54
02.04.06.08.010.0Distance
Dendrogram
0.00.50.10
0.00.6
0.0(12)
3
(4 5)
(12) 3 (4 5)
![Page 35: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/35.jpg)
COMP5331 35
12
54
02.04.06.08.010.0Distance
Dendrogram
0.00.50.10
0.00.6
0.0(12)
3
(4 5)
(12) 3 (4 5)
![Page 36: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/36.jpg)
COMP5331 36
12
54
02.04.06.08.010.0Distance
3
Dendrogram
0.00.10
0.0(12)
(3 4 5)
(12) (3 4 5)
0.00.50.10
0.00.6
0.0(12)
3
(4 5)
(12) 3 (4 5)
![Page 37: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/37.jpg)
COMP5331 37
12
54
02.04.06.08.010.0Distance
3
Dendrogram
0.00.10
0.0(12)
(3 4 5)
(12) (3 4 5)
![Page 38: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/38.jpg)
COMP5331 38
12
54
02.04.06.08.010.0Distance
3
Dendrogram
0.00.10
0.0(12)
(3 4 5)
(12) (3 4 5)
![Page 39: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/39.jpg)
COMP5331 39
Distance
Single Linkage Complete Linkage Group Average Linkage Centroid Linkage Median Linkage
![Page 40: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/40.jpg)
COMP5331 40
Group Average Clustering
The distance between two clusters is defined as the average of the distances between all pairs of records (one from each cluster).
dAB = 1/6 (d13 + d14 + d15 + d23 + d24 + d25)
Cluster A
Cluster B
12
3
45
![Page 41: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/41.jpg)
COMP5331 41
Distance
Single Linkage Complete Linkage Group Average Linkage Centroid Linkage Median Linkage
![Page 42: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/42.jpg)
COMP5331 42
Centroid Clustering The distance between two clusters is
defined as the distance between the mean vectors of the two clusters.
dAB = dab
where a is the mean vector of the cluster A and b is the mean vector of the cluster B.
Cluster A
Cluster B
a
b
![Page 43: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/43.jpg)
COMP5331 43
Distance
Single Linkage Complete Linkage Group Average Linkage Centroid Linkage Median Linkage
![Page 44: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/44.jpg)
COMP5331 44
Median Clustering Disadvantage of the Centroid Clustering: When a
large cluster is merged with a small one, the centroid of the combined cluster would be closed to the large one, ie. The characteristic properties of the small one are lost
After we have combined two groups, the mid-point of the original two cluster centres is used as the centre of the newly combined group
Cluster ACluster B
ab
![Page 45: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/45.jpg)
COMP5331 45
Clustering Methods K-means Clustering
Original k-means Clustering Sequential K-means Clustering Forgetful Sequential K-means
Clustering Hierarchical Clustering Methods
Agglomerative methods Divisive methods – polythetic
approach and monothetic approach
![Page 46: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/46.jpg)
COMP5331 46
Divisive Methods In a divisive algorithm, we start with the
assumption that all the data is part of one cluster.
We then use a distance criterion to divide the cluster in two, and then subdivide the clusters until a stopping criterion is achieved. Polythetic – divide the data based on the
values by all attributes Monothetic – divide the data on the basis of
the possession of a single specified attribute
![Page 47: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/47.jpg)
COMP5331 47
Polythetic Approach
Distance Single Linkage Complete Linkage Group Average Linkage Centroid Linkage Median Linkage
![Page 48: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/48.jpg)
COMP5331 48
Polythetic Approach
091713363642
01110313438
07222529
0212330
077
010
01
2
3
4
5
6
7
1 2 3 4 5 6 7 D(1, *) = 26.0
D(2, *) = 22.5
D(3, *) = 20.7
D(4, *) = 17.3
D(5, *) = 18.5
D(6, *) = 22.2
D(7, *) = 25.5
A = {1 }
B = {2, 3, 4, 5, 6, 7}
![Page 49: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/49.jpg)
COMP5331 49
Polythetic Approach
091713363642
01110313438
07222529
0212330
077
010
01
2
3
4
5
6
7
1 2 3 4 5 6 7D(2, A) = 10
D(3, A) = 7
D(4, A) = 30
D(5, A) = 29
D(6, A) = 38
D(7, A) = 42A = {1 }
B = {2, 3, 4, 5, 6, 7}
![Page 50: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/50.jpg)
COMP5331 50
Polythetic Approach
091713363642
01110313438
07222529
0212330
077
010
01
2
3
4
5
6
7
1 2 3 4 5 6 7D(2, A) = 10
D(3, A) = 7
D(4, A) = 30
D(5, A) = 29
D(6, A) = 38
D(7, A) = 42A = {1 }
B = {2, 3, 4, 5, 6, 7}
D(2, B) = 25.0
D(3, B) = 23.4
D(4, B) = 14.8
D(5, B) = 16.4
D(6, B) = 19.0
D(7, B) = 22.2
![Page 51: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/51.jpg)
COMP5331 51
Polythetic Approach
091713363642
01110313438
07222529
0212330
077
010
01
2
3
4
5
6
7
1 2 3 4 5 6 7D(2, A) = 10
D(3, A) = 7
D(4, A) = 30
D(5, A) = 29
D(6, A) = 38
D(7, A) = 42A = {1 }
B = {2, 3, 4, 5, 6, 7}
D(2, B) = 25.0
D(3, B) = 23.4
D(4, B) = 14.8
D(5, B) = 16.4
D(6, B) = 19.0
D(7, B) = 22.2
2 = 15.0
3 = 16.4
4 = -15.2
5 = -12.6
6 = -19.0
7 = -19.8, 3
![Page 52: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/52.jpg)
COMP5331 52
Polythetic Approach
091713363642
01110313438
07222529
0212330
077
010
01
2
3
4
5
6
7
1 2 3 4 5 6 7D(2, A) = 10
D(3, A) = 7
D(4, A) = 30
D(5, A) = 29
D(6, A) = 38
D(7, A) = 42A = {1 }
B = {2, 3, 4, 5, 6, 7}
D(2, B) = 25.0
D(3, B) = 23.4
D(4, B) = 14.8
D(5, B) = 16.4
D(6, B) = 19.0
D(7, B) = 22.2
2 = 15.0
3 = 16.4
4 = -15.2
5 = -12.6
6 = -19.0
7 = -19.8, 3
![Page 53: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/53.jpg)
COMP5331 53
Polythetic Approach
091713363642
01110313438
07222529
0212330
077
010
01
2
3
4
5
6
7
1 2 3 4 5 6 7D(2, A) = 8.5
D(4, A) = 25.5
D(5, A) = 25.5
D(6, A) = 34.5
D(7, A) = 39.0
A = {1 }
B = {2, 4, 5, 6, 7}
, 3
![Page 54: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/54.jpg)
COMP5331 54
Polythetic Approach
091713363642
01110313438
07222529
0212330
077
010
01
2
3
4
5
6
7
1 2 3 4 5 6 7D(2, A) = 8.5
D(4, A) = 25.5
D(5, A) = 25.5
D(6, A) = 34.5
D(7, A) = 39.0
A = {1 }
B = {2, 4, 5, 6, 7}
D(2, B) = 29.5
D(4, B) = 13.2
D(5, B) = 15.0
D(6, B) = 16.0
D(7, B) = 18.6
, 3
![Page 55: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/55.jpg)
COMP5331 55
Polythetic Approach
091713363642
01110313438
07222529
0212330
077
010
01
2
3
4
5
6
7
1 2 3 4 5 6 7D(2, A) = 8.5
D(4, A) = 25.5
D(5, A) = 25.5
D(6, A) = 34.5
D(7, A) = 39.0
A = {1 }
B = {2, 4, 5, 6, 7}
D(2, B) = 29.5
D(4, B) = 13.2
D(5, B) = 15.0
D(6, B) = 16.0
D(7, B) = 18.6
2 = 21.0
4 = -12.3
5 = -10.5
6 = -18.5
7 = -20.4
, 3, 2
![Page 56: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/56.jpg)
COMP5331 56
Polythetic Approach
091713363642
01110313438
07222529
0212330
077
010
01
2
3
4
5
6
7
1 2 3 4 5 6 7D(2, A) = 8.5
D(4, A) = 25.5
D(5, A) = 25.5
D(6, A) = 34.5
D(7, A) = 39.0
A = {1 }
B = {2, 4, 5, 6, 7}
D(2, B) = 29.5
D(4, B) = 13.2
D(5, B) = 15.0
D(6, B) = 16.0
D(7, B) = 18.6
2 = 21.0
4 = -12.3
5 = -10.5
6 = -18.5
7 = -20.4
, 3, 2
![Page 57: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/57.jpg)
COMP5331 57
Polythetic Approach
091713363642
01110313438
07222529
0212330
077
010
01
2
3
4
5
6
7
1 2 3 4 5 6 7
D(4, A) = 24.7
D(5, A) = 25.3
D(6, A) = 34.3
D(7, A) = 38.0
A = {1 }
B = {4, 5, 6, 7}
, 3, 2
![Page 58: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/58.jpg)
COMP5331 58
Polythetic Approach
091713363642
01110313438
07222529
0212330
077
010
01
2
3
4
5
6
7
1 2 3 4 5 6 7
D(4, A) = 24.7
D(5, A) = 25.3
D(6, A) = 34.3
D(7, A) = 38.0
A = {1 }
B = {4, 5, 6, 7}
D(4, B) = 10.0
D(5, B) = 11.7
D(6, B) = 10.0
D(7, B) = 13.0
, 3, 2
![Page 59: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/59.jpg)
COMP5331 59
Polythetic Approach
091713363642
01110313438
07222529
0212330
077
010
01
2
3
4
5
6
7
1 2 3 4 5 6 7
D(4, A) = 24.7
D(5, A) = 25.3
D(6, A) = 34.3
D(7, A) = 38.0
A = {1 }
B = {4, 5, 6, 7}
D(4, B) = 10.0
D(5, B) = 11.7
D(6, B) = 10.0
D(7, B) = 13.0
4 = -14.7
5 = -13.6
6 = -24.3
7 = -25.0
, 3, 2 All differences are negative. The process would continue on each subgroup separately.
![Page 60: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/60.jpg)
COMP5331 60
Clustering Methods K-means Clustering
Original k-means Clustering Sequential K-means Clustering Forgetful Sequential K-means
Clustering Hierarchical Clustering Methods
Agglomerative methods Divisive methods – polythetic
approach and monothetic approach
![Page 61: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/61.jpg)
COMP5331 61
Monothetic
A B C
1 0 1 1
2 1 1 0
3 1 1 1
4 1 1 0
5 0 0 1
It is usually used when the data consistsof binary variables.
![Page 62: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/62.jpg)
COMP5331 62
Monothetic
A B C
1 0 1 1
2 1 1 0
3 1 1 1
4 1 1 0
5 0 0 1
It is usually used when the data consistsof binary variables.
1 0
1 a=3 b=1
0 c=0 d=1
AB
1234
5)03( 2
))()()((
)( 22
dcdbcaba
NbcadAB
= 1.875
Chi-Square Measure
![Page 63: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/63.jpg)
COMP5331 63
Monothetic
A B C
1 0 1 1
2 1 1 0
3 1 1 1
4 1 1 0
5 0 0 1
It is usually used when the data consistsof binary variables.
1 0
1 a=3 b=1
0 c=0 d=1
AB
Chi-Square Measure
1234
5)03( 2
))()()((
)( 22
dcdbcaba
NbcadAB
= 1.875
![Page 64: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/64.jpg)
COMP5331 64
Monothetic
A B C
1 0 1 1
2 1 1 0
3 1 1 1
4 1 1 0
5 0 0 1
It is usually used when the data consistsof binary variables.
1 0
1 a=3 b=1
0 c=0 d=1
AB
Chi-Square Measure
1234
5)03( 2
))()()((
)( 22
dcdbcaba
NbcadAB
= 1.875
Attr.
AB
a 3
b 1
c 0
d 1
N 5
1.87
2
![Page 65: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/65.jpg)
COMP5331 65
Monothetic
Attr.
AB AC BC
a 3 1 2
b 1 2 1
c 0 2 2
d 1 0 0
N 5 5 5
1.87
2.22
0.83
It is usually used when the data consistsof binary variables.
2A B C
1 0 1 1
2 1 1 0
3 1 1 1
4 1 1 0
5 0 0 1
For attribute A, 22
ACAB = 4.09
For attribute B, 22
BCAB = 2.70
For attribute C, 22
BCAC = 3.05
1 0
1 a=3 b=1
0 c=0 d=1
AB
Chi-Square Measure
![Page 66: COMP53311 Clustering Prepared by Raymond Wong Some parts of this notes are borrowed from LW Chan ’ s notes Presented by Raymond Wong raywong@cse.](https://reader036.fdocuments.us/reader036/viewer/2022062407/56649e555503460f94b4c597/html5/thumbnails/66.jpg)
COMP5331 66
Monothetic
It is usually used when the data consistsof binary variables.
A B C
1 0 1 1
2 1 1 0
3 1 1 1
4 1 1 0
5 0 0 1
For attribute A, 22
ACAB = 4.09
For attribute B, 22
BCAB = 2.70
For attribute C, 22
BCAC = 3.05
Attr.
AB AC BC
a 3 1 2
b 1 2 1
c 0 2 2
d 1 0 0
N 5 5 5
1.87
2.22
0.83
2
1 0
1 a=3 b=1
0 c=0 d=1
AB
Chi-Square Measure We choose attribute A for dividing the data into two groups.{2, 3, 4}, and {1, 5}