Ensemble Based Gustafson Kessel Fuzzy Clustering



OPEN ACCESS

J DATA SCI APPL, VOL. 1, NO. 1, PP.001-010, JULY 2018 E-ISSN 2614-7408

DOI: 10.21108/JDSA.2018.1.6

JOURNAL OF DATA SCIENCE AND ITS APPLICATIONS

Achmad Fauzi Bagus Firmansyah 1, Setia Pramana 2

1 Badan Pusat Statistik, Jalan Dr. Sutomo 6-8, Jakarta, Indonesia
[email protected]

2 Politeknik Statistika STIS, Jalan Otto Iskandardinata No. 64C, Jakarta, Indonesia
[email protected]

Received on 21-04-2018, revised on 10-07-2018, accepted on 18-07-2018

Abstract

Fuzzy clustering is a clustering method that allows data to belong to two or more clusters by combining a hard-clustering method with a fuzzy membership matrix. Two popular fuzzy clustering algorithms are Fuzzy C-Means (FCM) and Gustafson-Kessel (GK). Although GK generally performs better, it has a weakness in handling linearly correlated data. Besides that, both FCM and GK produce unstable results due to the randomization used in parameter initialization. These weaknesses can be overcome by improved covariance estimation and cluster ensembles, respectively. This study aims to implement cluster ensembles on fuzzy clustering (GK and FCM). The clustering performance of the GK-Ensemble and the FCM-Ensemble on generated datasets is investigated using the Xie-Beni index and the misclassification rate. The results show that the GK-Ensemble outperforms the FCM-Ensemble and performs best in both the overlapping and the well-separated cluster cases.

Keywords: Fuzzy C-Means (FCM) Ensemble, Gustafson-Kessel (GK) Ensemble, Optimality cluster comparison

I. INTRODUCTION

Clustering is a data exploration method for obtaining the hidden characteristics of data by forming groups without any prior information in the form of labels or a grouping mechanism [1]. Conventional hard clustering methods, such as K-Means, restrict each observation to membership in exactly one cluster. Hence, they cannot provide a proper result when data lie at the same distance from several cluster centers (centroids) or sit on a group boundary. To overcome this drawback, fuzzy clustering was introduced by Zadeh [2] and improved by Bezdek [3]. In fuzzy clustering, the uncertainty of cluster membership is described by a membership function.

Several fuzzy clustering algorithms have been proposed [4]. Among them, two are especially popular: Fuzzy C-Means (FCM) and Gustafson-Kessel (GK) [5]. The main difference between the two techniques is the distance function, which influences the cluster membership values. FCM uses the Euclidean distance, while GK uses a locally adaptive distance norm built on fuzzy covariance matrices. The Euclidean distance is simple and popular but performs well only when clusters are well separated (no overlapping cluster boundaries) and spherical in shape. GK overcomes this weakness by using a fuzzy covariance matrix that can represent both ellipsoidal


and spherical clusters. Besides that, GK accounts for the correlation between variables that often occurs at overlapping cluster boundaries [6]. Hence, GK can be regarded as the Mahalanobis form of FCM [7].

Several comparisons have been conducted which claim that the GK algorithm is more powerful [5][6]. However, GK has the disadvantage that it cannot be applied to small datasets or datasets with linearly correlated variables, which causes a mathematical problem due to a non-invertible covariance matrix [8]. In addition, both the FCM and GK algorithms give unstable results because of the randomization used in parameter initialization. Therefore, when comparing the two methods, researchers must use the same seed value [5].

The first weakness of the GK algorithm can be overcome by the improved covariance estimation developed by Babuska et al. [8]. The improvement allows GK to estimate groupings whether or not the variables are linearly correlated. The second weakness, the unstable result, can be solved by using an ensemble approach [9]. The ensemble approach in clustering combines several clustering results into a stable and robust cluster through a consensus technique [10].

This study aims to combine the fuzzy clustering algorithms Fuzzy C-Means (FCM) and Gustafson-Kessel (GK) with the ensemble approach. The performance of the hybrid approaches is investigated and compared using simulated data under different scenarios.

This paper is organized into five sections. The first section is the introduction describing the background of the research, followed by the literature review section. The third section covers the research methods, including implementation and simulation studies. The fourth section presents the results of the simulation studies and the comparison of both algorithms. The conclusion is given in the last section.

II. LITERATURE REVIEW

A. Fuzzy C-Means

The Fuzzy C-Means (FCM) algorithm, improved by Bezdek [3], can be understood as a soft form of K-Means that partitions a population of n observations into K clusters and assigns membership degrees between 0 and 1. FCM uses the Euclidean distance function to measure the similarity of the data to the centroids. Let U be the membership matrix containing the membership degrees (u_ik), V the centroid matrix containing the cluster center vectors (v_k), x_i the i-th observation, m the fuzziness parameter, and d_ik the distance from x_i to v_k. FCM minimizes the objective function J_m(U, V) as follows:

π½π‘š(𝑼, 𝑽) = βˆ‘ βˆ‘ π‘’π‘–π‘˜π‘š

𝐾

π‘˜=1

β€–π’™π’Š. βˆ’ π’—π’Œ.β€–2

𝑛

𝑖=1

(1)

where π‘’π‘–π‘˜, π’—π’Œ., and 𝑑𝑖𝑗 calculated as follow:

π‘’π‘–π‘˜π‘š =

1

βˆ‘ (𝑑𝑖𝑗/𝑑𝑖𝑙)2

π‘šβˆ’1π‘˜π‘™=1

, (2)

π’—π’Œ. =βˆ‘ π‘’π‘–π‘˜

π‘š π’™π’Š.𝑛𝑖=1

βˆ‘ π‘’π‘–π‘˜π‘š 𝑛

𝑖=1

, (3)

𝑑𝑖𝑗 = β€–π’™π’Š. βˆ’ π’—π’Œ.β€–2 = (π’™π’Š. βˆ’ π’—π’Œ.)

𝑇(π’™π’Š. βˆ’ π’—π’Œ.). (4)

The distance function contained in the objective function (1) follows the rules of Euclidean distance. Hence, FCM only provides good-quality clusters when the data are properly separated or the variables share the same units [7].
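To make the update cycle concrete, the following is a minimal base-R sketch of the FCM iteration in Eqs. (1)-(4), including the centroid-stability check described later in Section III; the function and variable names are illustrative assumptions, not the advclust implementation.

```r
# Minimal base-R sketch of FCM (Eqs. 1-4); illustrative only.
fcm <- function(X, K, m = 2, tol = 1e-5, max_iter = 100) {
  X <- as.matrix(X); n <- nrow(X); p <- ncol(X)
  U <- matrix(runif(n * K), n, K)
  U <- U / rowSums(U)                                # random init, rows sum to 1
  V_old <- matrix(Inf, K, p)
  for (it in 1:max_iter) {
    Um <- U^m
    V  <- t(Um) %*% X / colSums(Um)                  # Eq. (3): centroid update
    D2 <- sapply(1:K, function(k)                    # Eq. (4): squared distances
      rowSums(sweep(X, 2, V[k, ])^2))
    D2 <- pmax(D2, 1e-12)                            # guard against zero distances
    U  <- sapply(1:K, function(k)                    # Eq. (2): membership update
      1 / rowSums((D2[, k] / D2)^(1 / (m - 1))))
    if (max(abs(V - V_old)) < tol) break             # centroid stability (10^-5)
    V_old <- V
  }
  list(U = U, V = V, J = sum(U^m * D2))              # Eq. (1): objective value
}
```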


B. Gustafson Kessel

Gustafson and Kessel [11] developed an extension of the standard algorithm, namely Gustafson-Kessel (GK), to overcome the weakness of the FCM algorithm. The GK algorithm replaces the distance calculation with an adaptive distance norm based on the fuzzy covariance matrix in Eq. (5). Let F_k be the fuzzy covariance matrix of the k-th cluster, ρ_k the cluster volume parameter, and p the number of variables. Gustafson-Kessel minimizes the objective function J'_m(U, V) as follows:

π½π‘š(𝑼, 𝑽) = βˆ‘ βˆ‘ π‘’π‘–π‘˜π‘šπΎ

π‘˜=1 (π’™π’Š. βˆ’ π’—π’Œ.)𝑇

πœŒπ‘˜det (π‘­π‘˜)1/π‘π‘­π‘˜βˆ’1(π’™π’Š. βˆ’ π’—π’Œ.)

𝑛𝑖=1 . (5)

Each covariance matrix is used to calculate the distance between a centroid and the data, capturing the topological structure (ellipsoidal or spherical) of the cluster [6]. The value of F_k is obtained by the following calculation:

π‘­π‘˜ =βˆ‘ π‘’π‘–π‘˜

π‘š(π’™π’Š.βˆ’π’—π’Œ.)(π’™π’Š.βˆ’π’—π’Œ.)𝑇𝑛

𝑖=1

βˆ‘ π‘’π‘–π‘˜π‘šπ‘›

𝑖=1.

(6)

The distance function in this algorithm can be written as follows:

𝑑𝑖𝑗 = (π’™π’Š. βˆ’ π’—π’Œ.)π‘‡πœŒ

π‘˜det (π‘­π‘˜)1/π‘π‘­π‘˜

βˆ’1(π’™π’Š. βˆ’ π’—π’Œ.). (7)

C. Improved Gustafson Kessel

Babuska et al. (2002) [8] stated that although GK is superior to the FCM algorithm, there are problems in its implementation on small datasets or datasets with linearly correlated variables. This weakness can be overcome by limiting the ratio between the maximum and minimum eigenvalues: when the ratio exceeds a prescribed limit (e.g., 10^15), the minimum eigenvalues are increased. The covariance matrix is then reassembled according to the following equation:

𝑭 = πœ±πœ¦πœ±βˆ’πŸ, (8)

where Ξ¦ is the matrix of eigenvectors and Ξ› is the diagonal matrix of eigenvalues. In addition, the fuzzy covariance matrix can be blended with a scaled identity matrix. The blending is done proportionally and can be written as the following equation:

π‘­π‘˜ = (1 βˆ’ 𝛾)π‘­π‘˜ + 𝛾 det (𝑭0)1/𝑝𝑰,

(9)

where 𝛾 is parameter weight, 𝑭0 is a covariance matrix, and 𝑰 is identity matrix with pΓ— p dimensions.
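A minimal sketch of this conditioning step, assuming F_0 is the covariance of the whole dataset and using the symmetric eigendecomposition (so Ξ¦^(-1) = Ξ¦^T):

```r
# Sketch of the conditioning in Eqs. (8)-(9): blend the fuzzy covariance with
# a scaled identity, then cap the eigenvalue ratio at the prescribed limit.
condition_cov <- function(F_k, F0, gamma = 0, max_ratio = 1e15) {
  p <- ncol(F_k)
  F_k <- (1 - gamma) * F_k + gamma * det(F0)^(1 / p) * diag(p)  # Eq. (9)
  e   <- eigen(F_k, symmetric = TRUE)
  lam <- pmax(e$values, max(e$values) / max_ratio)   # raise tiny eigenvalues
  e$vectors %*% diag(lam, p) %*% t(e$vectors)        # Eq. (8), Phi^-1 = Phi^T here
}
```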

D. Ensemble Method

Because the membership matrix is initialized with random values, the clustering result can be unstable. To reach a stable clustering, multiple clustering results can be combined using an ensemble approach. Ensemble methods are learning algorithms that construct a set of classifiers and then classify new data by taking a vote of their predictions [12]. An ensemble works better than a single learning algorithm because it:

1) can avoid the local optima reached by each individual algorithm,

2) can produce a result that is more representative of the data than a single algorithm, and

3) can better approximate the true hypothesis by combining several results.

There are several approaches to constructing an ensemble. Five popular methods are:

1) Bayesian voting: several results are combined by computing posterior probabilities with Bayes' rule, and the final result is voted according to the posterior values. This technique needs prior knowledge and is computationally expensive.

2) Manipulating training samples: the learner is run on several subsets of a dataset to create several results, which a consensus method then combines into a final result. Techniques using this approach include bagging, AdaBoost, and cross-validation. It works especially well for unstable learning approaches.

3) Manipulating input features: subsets of input features are selected and grouped together, based on human expertise or a prior learning algorithm. This method only works well when the input features are highly redundant.

4) Manipulating output targets: the outcomes of several results are manipulated by correcting their errors, e.g., a class with many members is partitioned into several classes that are later recombined with the other classes.

5) Injecting randomness: parameters are initialized randomly and the results are combined in a consensus step.

III. RESEARCH METHODS

A. Implementation of the Algorithms

The GK-Ensemble and the FCM-Ensemble implemented here adopt the "manipulating training samples" technique but use the full dataset instead of subsets. The fuzzy clustering algorithms are run several times, and the results are combined using a consensus function. The implementation of the ensemble approach on both the GK and FCM algorithms is defined as follows:

1) Initialization: First, we initialize the parameters of each algorithm, followed by generating a membership matrix and a centroid vector. Then we set an initial value of the objective function and, for GK, the initial data covariance and fuzzy covariance matrix according to Eq. (6).

2) Distance calculation: After initializing the parameters, we calculate the distance values according to each algorithm's technique: FCM uses the Euclidean distance (Eq. 4), while GK uses the distance function defined in Eq. (7). Next, we recalculate the cluster centroids, fuzzy cluster covariances, and membership matrix. For the GK algorithm, when the ratio of the maximum and minimum eigenvalues exceeds 10^15, we recondition the fuzzy covariance matrix with the improved formula in Eq. (8) and then add a proportionally scaled identity matrix to the covariance.

3) Verification: After calculating all components, we verify whether the centroids are stable by differencing the cluster centroids against the initial or previous centroids. If the difference is less than 10^-5, the centroids are considered stable; otherwise, steps 2 and 3 are repeated until it is.

4) Ensembling: The cluster ensembling shown in Fig. 1 has two main processes: (1) standardizing the labels of the clustering results and (2) combining the results using a consensus function. The standardizing process starts by calculating a dissimilarity matrix that contains the proportional differences between two clusterings. The dissimilarity matrix (Diss) is obtained through the following calculation:

\mathbf{Diss} = \mathbf{f} - \mathbf{U}_1 \mathbf{U}_2^{T}, \qquad (10)

where f is a K Γ— K matrix whose entries take any value large enough that the dissimilarity matrix Diss is non-negative. Next, we search for the optimal labels based on Diss using the Hungarian algorithm [9]. This step produces standardized labels for the clusters; a sketch follows below.
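A minimal sketch of this label-standardization step, assuming the solve_LSAP() function from the clue R package for the Hungarian assignment; the helper name and the choice f = n are illustrative:

```r
# Sketch of label standardization (Eq. 10 plus the Hungarian step).
# U_ref and U_b are n x K membership matrices; f is any constant large
# enough to keep Diss non-negative (n works, agreements cannot exceed n).
align_labels <- function(U_ref, U_b, f = nrow(U_ref)) {
  Diss <- f - t(U_ref) %*% U_b            # Eq. (10): K x K dissimilarity
  perm <- clue::solve_LSAP(Diss)          # Hungarian algorithm, minimizes cost
  U_b[, as.integer(perm)]                 # reorder columns of U_b to match U_ref
}
```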

[Fig. 1: flowchart of the cluster ensemble procedure. The data are clustered n times (1st, 2nd, ..., n-th clustering process), yielding n cluster results; the Diss matrix is calculated, the labels are standardized, and the results are combined using the consensus function into the ensemble result.]

Fig. 1. Cluster Ensemble Procedure


The next process combines the membership matrices, whose labels have been standardized, using a consensus function. A consensus function combines multiple labelings into one clustering result (E) [10]. Many consensus functions have been introduced; in this research we use majority voting with a sum approach, which is simple and fits fuzzy clustering because the membership values range from 0 to 1. To satisfy the fuzzy membership constraint, the summed result is divided by the number of ensemble members. The consensus function used in this research can therefore be written as follows:

\mathbf{U} = \frac{1}{B} \sum_{b=1}^{B} \mathbf{U}_b, \qquad (11)

where U is the ensemble membership matrix and B is the number of ensemble members.
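A minimal sketch of this consensus step, applied to membership matrices that have already been label-standardized; the helper names are illustrative:

```r
# Sketch of the sum-and-normalize consensus in Eq. (11): average the B aligned
# membership matrices so each row remains a valid fuzzy membership vector.
consensus <- function(U_list) {
  Reduce(`+`, U_list) / length(U_list)
}

# Usage with the align_labels() sketch above, taking the first run as reference:
# aligned <- lapply(U_list, align_labels, U_ref = U_list[[1]])
# U_ens   <- consensus(aligned)
```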

The proposed FCM-Ensemble and GK-Ensemble algorithms are implemented in the R language [13][14] and are available in the advclust package, which can be downloaded from CRAN (https://cran.r-project.org/web/packages/advclust/index.html).

B. Simulation Studies

After implementing the algorithms, we simulate several datasets to compare the performance of the FCM-Ensemble and GK-Ensemble approaches. The datasets are simulated under four scenarios and generated from the multivariate normal distribution with the specifications in Table I (a generation sketch follows the table):

TABLE I
SPECIFICATION OF SIMULATED DATASETS

No | Scenario | Data Dimension (n x p) | Cluster Mean Vectors | Covariance Parameters (row-major)
I | n large, overlapping | 120 x 3 | (-2.5, -2.5, -2.5); (2, 2.5, 2); (-2.5, 2, -2) | (5.5, -1.75, 5.7, -4.75, -1.75, -4.75, 5.5); (3.75, 2.75, 0, 2.75, 4, -2.75, 0, -2.75, 5); (4.75, 3.75, 0, 3.7, 4.5, -3, 0, -3.5)
II | n small, overlapping | 40 x 3 | (-2.5, 0, -2); (-2.5, -2.5, 0) | (0.1, -0.1, 0, -0.1, 0.2, 0, -0.01, 0, 1.7); (0.2, -0.1, 0.1, -0.1, 0.2, -0.1, 0.1, -0.1, 2.5)
III | n small, well separated | 30 x 2 | (3, 2); (-3.5, -3.5); (0.5, 0.25) | (5.5, 2, 2, 1.7); (6.7, 3.8, 3.8, 2.5); (2, 0.75, 0.75, 1.2)
IV | n large, well separated | 120 x 2 | (-1, 1); (-0.75, -1); (0.5, 1); (0.5, -1) | (0.1, -0.075, -0.075, 0.2); (0.1, 0.075, 0.075, 0.2); (0.25, 0.15, 0.15, 0.2)

n: sample size; p: number of variables
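As an illustration of how such a scenario can be generated, here is a sketch using MASS::mvrnorm() for Data IV; the cluster means follow Table I, while the single shared covariance matrix is an assumption made for brevity:

```r
# Sketch of generating one scenario (Data IV) from the multivariate normal
# distribution; the shared Sigma is assumed for illustration only.
library(MASS)
set.seed(2018)
mus   <- list(c(-1, 1), c(-0.75, -1), c(0.5, 1), c(0.5, -1))
Sigma <- matrix(c(0.1, 0.075, 0.075, 0.2), 2, 2)
X      <- do.call(rbind, lapply(mus, function(mu) mvrnorm(30, mu, Sigma)))
labels <- rep(seq_along(mus), each = 30)      # 120 x 2 data, four true clusters
```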

The generated data can be visualized as biplots [15], as shown in Fig. 2. The points on a biplot represent the data, and the colors represent cluster membership. The arrow vectors represent the variables, as produced by the biplot algorithm. We then run GK and FCM on these datasets with the following parameters:

TABLE II
RANGE OF CLUSTER PARAMETERS

No | Parameter | Range
1 | m | 1.5, 1.75, 2, 2.25, 2.5, 2.75, 3, 3.25, 3.5, 3.75, 4
2 | k | 2, 3, 4, 5
3 | Ξ³ | 0, 0.3, 0.7
4 | ρ | 1

[Fig. 2: biplots of the generated data, one panel per dataset (Data I, Data II, Data III, Data IV).]

Fig. 2. Biplot of Generated Data

Data I and II simulate overlapping clusters with large and small numbers of observations, respectively. Data III and IV are well separated, with small and large numbers of observations, respectively.

We repeat the clustering process ten times and combine the results using the cluster ensemble approach. Next, we select the optimal parameters with the Xie-Beni (XB) index, which measures the separation and density quality of a fuzzy clustering result [16][17]. After deciding the best parameters for each method on every dataset, we compare the Xie-Beni indices and measure the misclassification rate of the grouping, i.e., the percentage of false assignments.

After running the process explained above, the optimal parameters of each method on every dataset are obtained by comparing the values of the Xie-Beni index. A lower Xie-Beni index shows that the fuzzy clustering represents the data better, because each cluster centroid is well separated and each cluster is dense.
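A sketch of the Xie-Beni index in its common form, the fuzzy objective of Eq. (1) divided by n times the minimum squared distance between centroids, consistent with the lower-is-better selection rule used here:

```r
# Sketch of the Xie-Beni index: compactness over separation; lower values
# indicate compact, well-separated clusters.
xie_beni <- function(X, U, V, m = 2) {
  X <- as.matrix(X); n <- nrow(X); K <- nrow(V)
  D2 <- sapply(1:K, function(k) rowSums(sweep(X, 2, V[k, ])^2))
  sum(U^m * D2) / (n * min(dist(V))^2)    # min(dist(V)) = smallest centroid gap
}
```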

C. Complexity

Both the GK-Ensemble and the FCM-Ensemble have computational complexity that can be expressed in Big-O notation. Graves and Pedrycz [18] state that GK and FCM require O(knp) operations to calculate the centroids and O(knp^2) to calculate the membership matrix. To compare the complexity in practice, we run the GK-Ensemble and the FCM-Ensemble on the Iris dataset with parameters k = 3, m = 2, b = 3, and default values otherwise, and evaluate the runtime over ten repetitions.
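A sketch of such a benchmark with the microbenchmark R package; the fcm_ensemble() and gk_ensemble() calls are hypothetical stand-ins for whichever ensemble routines are timed:

```r
# Sketch of the runtime comparison; the two ensemble wrappers are hypothetical.
library(microbenchmark)
X <- iris[, 1:4]                               # the Iris measurements
microbenchmark(
  FCM = fcm_ensemble(X, k = 3, m = 2, b = 3),  # hypothetical wrapper
  GK  = gk_ensemble(X, k = 3, m = 2, b = 3),   # hypothetical wrapper
  times = 10                                   # ten evaluations, as in Table V
)
```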

IV. RESULTS AND DISCUSSION

A. Best Parameters for Each Dataset on Both Methods

The optimal cluster parameters are shown in Table III. On Data I, the FCM-Ensemble performs best with number of clusters (k) 3 and fuzzifier (m, the degree of fuzziness) 4, giving an XB index of 0.067203. The GK-Ensemble performs best there with k = 3, weighting parameter (Ξ³) 0, and fuzzifier (m) 3.75, giving an XB index of 0.04552. The GK-Ensemble thus has a lower XB index than the FCM-Ensemble, and it can be concluded from Table III that the GK-Ensemble outperforms the FCM-Ensemble in all simulated scenarios.

TABLE III
BEST OPTIMAL CLUSTER PARAMETERS

Data | FCM-Ensemble (k, m, XB index) | GK-Ensemble (k, Ξ³, m, XB index)
Data I | 3, 4, 0.067203 | 3, 0, 3.75, 0.04552
Data II | 2, 3.75, 0.041432 | 2, 0.3, 3.75, 0.033833
Data III | 3, 3.75, 0.070331 | 3, 0.7, 3.75, 0.029317
Data IV | 4, 3.75, 0.073626 | 4, 0.7, 3.5, 0.038407

B. Comparison of GK-Ensemble and FCM-Ensemble

From Table III, both algorithms are able to produce the right number of clusters; e.g., on Data IV both algorithms produce four clusters. Furthermore, the GK-Ensemble has a lower XB index on all four simulated datasets, showing that it creates better clusters than the FCM-Ensemble in terms of separation and density.

Table IV reports the misclassification rates of the clusters produced with the best parameters from Table III.

TABLE IV
MISCLASSIFICATION RATE (%)

Data | FCM-Ensemble | GK-Ensemble
Data I | 23.37 | 1.67
Data II | 10 | 5
Data III | 26.6667 | 24.444
Data IV | 6.3889 | 7.5

Table IV shows that the GK-Ensemble has a lower misclassification rate on Datasets I, II, and III, where it outperforms the FCM-Ensemble. On Dataset IV, however, the FCM-Ensemble has a slightly lower misclassification rate.

Next, we visualize the results using biplots to show the clusters produced by both algorithms. Fig. 3 presents the biplots of the FCM-Ensemble and the GK-Ensemble on each dataset.

It can be seen from Fig. 3 (by comparison with Fig. 2) that the GK-Ensemble represents the initial classes better than the FCM-Ensemble on Data I and II. On Data I, the GK-Ensemble captures the overlapping clusters, as the blue, red, and green groups overlap, whereas the FCM-Ensemble cannot capture the overlap. On Data II, the FCM-Ensemble cannot form the groups well, while the GK-Ensemble shows the overlap, albeit over a small area. On Data III, both the GK-Ensemble and the FCM-Ensemble recover the well-separated clusters. Finally, on Data IV, both methods also recover the well-separated clusters, although the GK-Ensemble shows a little misclassification in the boundary areas adjacent to other clusters.

C. Complexity

Table V presents the runtime measurements and comparison in milliseconds. Here, min is the minimum runtime, lq the lower quartile, mean the average runtime, median the median value, uq the upper quartile, max the maximum value, and neval the number of evaluations. From Table V, the GK-Ensemble needs a longer runtime (27.67 s on average) than the FCM-Ensemble (12.4 s). Although both methods have the same asymptotic complexity, the GK-Ensemble needs more time to calculate the fuzzy covariance matrices.

TABLE V
RUNTIME (IN MILLISECONDS)

Method | min | lq | mean | median | uq | max | neval
FCM-Ensemble | 1240.859 | 1254.614 | 12874 | 12787.15 | 13234.05 | 13560.34 | 10
GK-Ensemble | 2716.109 | 27293.51 | 27672.52 | 27386.53 | 28158.37 | 28369.82 | 10

[Fig. 3: biplots of the cluster results, FCM-Ensemble (left column) and GK-Ensemble (right column), one row per dataset. Axes are the first two principal components; variance explained by PC 1 / PC 2: Data I 60.24% / 33.86%, Data II 58.56% / 31.08%, Data III 94.98% / 5.02%, Data IV 53.83% / 46.17%. Points are colored by cluster membership.]

Fig. 3. Biplot of Cluster Results on Data I-IV

V. CONCLUSION

We have developed hybrid fuzzy clustering algorithms: the Gustafson-Kessel (GK)-Ensemble and the Fuzzy C-Means (FCM)-Ensemble. The GK-Ensemble provides better results than the FCM-Ensemble on the simulated data, shown by lower Xie-Beni indices and misclassification rates. Moreover, the GK-Ensemble performs best in both the overlapping and the well-separated cluster cases. However, the GK-Ensemble needs more runtime than the FCM-Ensemble.

REFERENCES

[1] W. Wang and Y. Zhang, "On fuzzy cluster validity indices," Fuzzy Sets and Systems, (2007): 2095-2117.
[2] L. A. Zadeh, "Fuzzy sets," Information and Control, 8 (1965): 338-353.
[3] J. C. Bezdek, R. Ehrlich, and W. Full, "FCM: The fuzzy c-means clustering algorithm," Computers and Geosciences, 10 (1984): 191-203.
[4] M. S. Yang, "A survey of fuzzy clustering," Mathematical and Computer Modelling, 18(11) (1993): 1-16.
[5] B. Balasko, J. Abonyi, and B. Feil, "Fuzzy Clustering and Data Analysis Toolbox: For Use with Matlab," Veszprem, Hungary, (2002).
[6] G. Grekousis and H. Thomas, "Comparison of two fuzzy algorithms in geodemographic segmentation analysis: The fuzzy c-means and Gustafson-Kessel methods," Applied Geography, Elsevier, (2012): 125-136.
[7] R. Krishnapuram and J. Kim, "A note on the Gustafson-Kessel and adaptive fuzzy clustering algorithms," IEEE Transactions on Fuzzy Systems, (1999).
[8] R. Babuska, P. J. van der Veen, and U. Kaymak, "Improved covariance estimation for Gustafson-Kessel clustering," IEEE, (2002).
[9] X. Sevillano, F. Alias, and J. C. Socoro, "Positional and confidence voting-based consensus functions for fuzzy cluster ensembles," Fuzzy Sets and Systems, (2013): 1-40.
[10] A. Strehl and J. Ghosh, "Cluster ensembles: a knowledge reuse framework for combining multiple partitions," Journal of Machine Learning Research, (2002): 583-617.
[11] D. E. Gustafson and W. C. Kessel, "Fuzzy clustering with a fuzzy covariance matrix," (1978).
[12] T. G. Dietterich, "Ensemble methods in machine learning," in Multiple Classifier Systems (MCS 2000), Lecture Notes in Computer Science, vol. 1857, Springer, Berlin, Heidelberg, (2000).
[13] R Core Team, "R: A language and environment for statistical computing," R Foundation for Statistical Computing, Vienna, Austria, http://www.R-project.org/, (2014).
[14] S. Pramana, R. Yordani, R. Kurniawan, and B. Yuniarto, "Dasar-dasar Statistika dengan Software R: Konsep dan Aplikasi," 2nd Edition, In Media, (2017).
[15] S. Pramana, B. Yuniarto, S. Mariyah, I. Santoso, and R. Nooraeni, "Data Mining dengan R: Konsep Serta Implementasi," In Media, (2018).
[16] K. L. Wu, "Analysis of parameter selections for fuzzy c-means," Pattern Recognition, (2012): 407-415.
[17] X. L. Xie and G. Beni, "A validity measure for fuzzy clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(8) (1991): 841-847.
[18] D. Graves and W. Pedrycz, "Kernel-based fuzzy clustering and fuzzy clustering: A comparative experimental study," Fuzzy Sets and Systems, (2010): 522-543.
