International Journal of Computational Intelligence and Information Security, December 2012, Vol. 3, No. 10
ISSN: 1837-7823

A New Supervised Data Classification Method for Convex and Non Convex Classes

O. El Melhaoui, M. El Hitmy, and F. Lekhal
LABO LETAS, FS, University of Mohammed I, Oujda, Morocco.
[email protected]

Abstract

The present paper proposes a supervised data classification technique for convex and non convex classes. The technique is based on two phases: the first phase deals with the creation of prototypes and the elimination of noisy objects; the second phase consists of merging the nearest prototypes into classes. Each prototype is created using the K nearest neighbours approach and is represented by its gravity center. Objects that are very far from their neighbours are eliminated. The merge phase is inspired by the hierarchical method: it regroups the closest prototypes into the same class incrementally until the number of created classes equals the number of classes set out initially. This new technique is compared to the C-means, fuzzy C-means, competitive neural networks and fuzzy Min-Max classification methods through a number of simulations. The proposed technique has obtained good results.

Keywords: Supervised classification, Convex and non convex classes, Creation of prototypes, Elimination of noisy objects, Merge, C-means, Fuzzy C-means, Competitive neural networks, Fuzzy Min-Max classification.

1. Introduction

Data classification is an active subject; it has played a crucial and highly beneficial role in hard tasks such as data analysis, quality control, biometrics (face recognition), medicine, geology (soil texture recognition), automatic categorization of satellite pictures, etc.
Classification is an abstraction and synthesis technique: it consists of partitioning a set of data entities into separate classes according to a similarity criterion. Objects are as similar as possible within a class (intra-class homogeneity), while objects of different classes are as dissimilar as possible (inter-class heterogeneity). This process gives a simplified representation of the initial data. There are two main types of classification, unsupervised and supervised. We are interested in supervised methods, which assume that the number of classes is known initially. Among these we find C-means (CM), fuzzy C-means (FCM), support vector machines (SVM), K nearest neighbours (KNN), neural networks (NN), etc. The C-means, fuzzy C-means and standard competitive neural network methods have proved a real ability to solve nonlinear problems, and they are very popular because of their simplicity and theoretical elegance. However, they have several drawbacks, the most important being the initialization of the centers, which can lead to local solutions [4, 2, 5], and the fact that they are not convenient for non convex or complex classes. The diversity and complexity of the problem have given rise to many methods, among them fuzzy min-max classification (FMMC), which is well suited to convex or non convex classes. It consists of creating prototypes, or hyperboxes, iteratively until their stabilization; each iteration involves three stages: expansion, overlap and contraction. The present paper proposes a new technique for data classification suitable for convex and non convex classes. The learning process is made in two steps: the first step consists of creating prototypes and eliminating noisy objects; the second step consists of merging the prototypes according to some criteria.
The creation of prototypes is based on the K nearest neighbours method, while the elimination of isolated objects considers whether the object lies within a very weakly compact class or in the periphery of a class. Throughout this work a parameter D is used to measure the isolation of an object. Each object is supposed to be represented by its attribute vector, extracted from the diverse features associated with the object. The position of each object to be classified is supposed to be known in the attribute space. The objects are initially stored in a matrix whose size varies through the iterations; the objects are extracted in an orderly way from this matrix. The second step consists of regrouping the nearest prototypes into the same class iteratively. This paper is organised as follows: section 2 describes different classification techniques, including C-means, fuzzy C-means, competitive neural networks and fuzzy Min-Max classification. The proposed method is described in section 3. The results of simulations and comparisons are introduced in section 4. Finally, we give a conclusion.


2. Classification methods

2.1. Descriptive element

Let's consider a set of M objects {O_i}, characterized by N parameters regrouped in a line vector V_att = (a_1, a_2, ..., a_N). Let R_i = (a_in)_{1<=n<=N} be a line vector of R^N whose nth component a_in is the value taken by a_n for the object O_i. R_i is called the observation associated with O_i; it is also called the realization of the attribute vector for this object. R^N is the observation space, also known as the parameter space. The observations (R_i)_{1<=i<=M} are associated with C different classes (CL_s)_{1<=s<=C} with respective centers (c_s)_{1<=s<=C}; each observation R_i is associated with its membership degree u_{i,s} to the class CL_s.

2.2. C-means (CM)

The C-means method was introduced by MacQueen in 1967. CM is very popular and widely used in scientific and industrial applications because of its great utility in classification and its simplicity. It consists of looking for good centers of the C classes which minimize the intra-class variance and maximize the distance between the classes by an iterative process. The CM method determines the class centers which minimize the optimization criterion defined by equation (1) [13]:

J_m = sum_{i=1..M} sum_{s=1..C} u_{i,s} ||R_i - c_s||^2    (1)

||.|| is the Euclidean distance; u_{i,s} equals 1 if R_i belongs to CL_s, 0 if not.
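As an illustration, the alternating assign/update scheme that minimizes criterion (1) can be sketched as follows. This is a minimal sketch, not the authors' code; the function name `c_means`, the fixed iteration count and the explicitly supplied initial centers are our own choices, the latter reflecting the remark above that center initialization can lead to local solutions.

```python
import numpy as np

def c_means(R, C, centers, n_iter=100):
    """Minimal C-means sketch: alternate hard assignment and center update.

    R: (M, N) array of observations R_i; centers: (C, N) initial class centers.
    Returns the hard labels (index s with u_{i,s} = 1) and the final centers."""
    centers = centers.astype(float).copy()
    for _ in range(n_iter):
        # Distance of every observation to every center; u_{i,s} = 1 for the closest.
        d = np.linalg.norm(R[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Each center moves to the gravity center of its assigned observations.
        for s in range(C):
            if np.any(labels == s):
                centers[s] = R[labels == s].mean(axis=0)
    return labels, centers
```

On two well-separated clusters with sensible initial centers, the loop stabilizes after the first pass.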

2.3. Fuzzy C-means (FCM)

The fuzzy C-means (FCM) method was introduced by Dunn in 1973 and was improved by Bezdek in 1981. This technique, inspired by the C-means algorithm, introduces the concept of fuzzy sets: every object in the data set belongs to each class with a certain degree of membership rather than belonging entirely to one class. The underlying principle of the FCM is to form, from M objects defined by their realizations (R_i)_{1<=i<=M}, C classes (CL_s)_{1<=s<=C} by minimizing the criterion given by equation (2) while considering the membership degree of each object to the different classes. The criterion to be optimized in the FCM is given by [1, 2, 14]:

J_m = sum_{i=1..M} sum_{s=1..C} (u_{i,s})^df ||R_i - c_s||^2    (2)

under the constraints: sum_{s=1..C} u_{i,s} = 1 for i = 1...M, and 0 < sum_{i=1..M} u_{i,s} < M for s = 1...C.

df is the fuzzy degree, often taken equal to 2; u_{i,s} is the membership degree of the object i to the class CL_s. For each i, s, u_{i,s} belongs to the interval [0, 1]. In order to minimize J_m, u_{i,s} and c_s must be updated at each iteration according to [2]:

u_{i,s} = 1 / sum_{k=1..C} ( ||R_i - c_s|| / ||R_i - c_k|| )^(2/(df-1))

and

c_s = sum_{i=1..M} (u_{i,s})^df R_i / sum_{i=1..M} (u_{i,s})^df    (3)

2.4. Standard competitive neural networks (SCNN)

The competitive neural network has been widely used in various fields, including classification and compression [7, 8, 9, 10]. It divides the input data into a number of disjoint clusters; it allows the output neurons to compete, exciting one neuron and inhibiting all the others at a given time. The competitive network has two fully interconnected layers: an input layer of N neurons and an output layer of C neurons, where C represents the number of classes. The input layer receives the observation vector of dimension N, X = (x_1, x_2, ..., x_N); each neuron j in the output layer computes the Euclidean distance between its weight vector W_j = (w_j1, w_j2, ..., w_jN) and the observation vector X, and the final result of the network is the index of the winner neuron whose weight vector is closest to the observation. The component w_jn of the vector W_j is the synaptic weight of the connection between the input neuron n and the output neuron j. The standard competitive neural network adjusts the synaptic weights of the winner neuron by minimizing the criterion given by the formula [6, 15]:


E = (1/2) sum_{j=1..C} alpha_j ||X - W_j||^2    (4)

alpha_j = 1 if W_j is the closest to X, alpha_j = 0 otherwise.

Using the gradient algorithm [13], also called the Winner Takes All rule, the weights of the winner neuron are updated at each step as:

W_j(t+1) = W_j(t) + eta * alpha_j * (X - W_j)    (5)

eta is a learning rate, usually a small real number between 0 and 1.
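The Winner-Takes-All update of equation (5) can be sketched as follows; the learning rate, epoch count and initial weight vectors are illustrative assumptions, not values from the paper.

```python
import numpy as np

def train_scnn(X, W, eta=0.1, epochs=50):
    """Winner-Takes-All sketch of equation (5).

    For each observation x, only the winner neuron (closest weight vector,
    i.e. alpha_j = 1) moves toward x; all other neurons are left unchanged.
    X: (M, N) observations; W: (C, N) initial synaptic weight vectors."""
    W = W.astype(float).copy()
    for _ in range(epochs):
        for x in X:
            j = np.linalg.norm(W - x, axis=1).argmin()  # competition: winner index
            W[j] += eta * (x - W[j])                    # eq. (5) with alpha_j = 1
    return W
```

With a reasonable initialization each weight vector settles near the mean of the observations it wins, which is the behaviour criterion (4) rewards.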

2.5. Fuzzy Min-Max classification (FMMC)

2.5.1. Principle

FMMC is a classification method introduced by Simpson in 1993 [3], based on a neural network architecture. FMMC contains three layers: input, hidden and output. The number of neurons in the input layer is equal to the dimension of the data representation space. The number of neurons in the hidden layer increases over time, with the creation of prototypes. The number of neurons in the output layer is equal to the number of classes known initially. The synaptic weights associated with the connections between the input and hidden layers are formed by two matrices V and W representing the characteristics of the different prototypes in the hidden layer. The synaptic weights associated with the connections between the hidden and output layers are formed by a matrix Z characterizing the association of prototypes to classes. The learning process is made in three steps, expansion, overlapping and contraction, repeated for each training input pattern. These phases are controlled by two parameters: the sensitivity gamma and the vigilance factor theta, which control the maximum size of the created hyperboxes.

The fuzzy min-max classification neural network is built using hyperbox fuzzy sets. A hyperbox defines a region of the N-dimensional pattern space. A hyperbox B_j is defined by its min point V_j = (v_jn)_{1<=n<=N} and its max point W_j = (w_jn)_{1<=n<=N}. The fuzzy hyperbox membership function of an observation O_i to a hyperbox B_j is defined as follows [3, 11, 12]:

b_j(O_i) = (1/(2N)) sum_{n=1..N} [ max(0, 1 - max(0, gamma*min(1, a_in - w_jn))) + max(0, 1 - max(0, gamma*min(1, v_jn - a_in))) ]    (6)

where O_i = (a_i1, a_i2, ..., a_iN) is the ith input pattern and gamma is the sensitivity parameter that regulates how fast the membership value decreases as the distance between O_i and B_j increases [6]. The combination of the min-max points and the hyperbox membership function defines a fuzzy set.

2.5.2. Learning

Let's consider a training set A = {(O_i, d_k) / i = 1, 2, ..., M; k = 1, ..., C}, where O_i is the input pattern and d_k in {1, 2, ..., C} is the index of one of the C classes. The fuzzy min-max learning is an expansion/contraction process. The learning process goes through several steps:

•  Initialization:

Set initial values for gamma and theta. The first input pattern forms a hyperbox B_1, defined by its min point V_1 = (v_1n)_{1<=n<=N} and its max point W_1 = (w_1n)_{1<=n<=N}, where V_1 = W_1 = O_1.

•  Repeat:

1. Select a new input pattern (O_i, d_k) of the set A, and identify the hyperbox of the same class providing the highest degree of membership. If no such hyperbox can be found, a new hyperbox is formed and added to the neural network.

2. Expand the identified hyperbox B_j to the couple of points (v*_jn, w*_jn), where w*_jn = max(w_jn, a_in) and v*_jn = min(v_jn, a_in), 1<=n<=N.

3. Compute the size T, where T = (1/N) sum_{n=1..N} ( max(w_jn, a_in) - min(v_jn, a_in) ). If T > theta, a new hyperbox is created. Otherwise, we check whether there is an overlap between the hyperbox B_j expanded in the last expansion step and the hyperboxes which represent classes other than that of B_j.

4. If an overlap between two hyperboxes of two different classes has been detected, a contraction of the two hyperboxes is carried out. Four possible cases for the overlapping and contraction procedures are discussed in [3, 11].

•  Until stabilization of the hyperboxes.
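Steps 2 and 3 of the loop (tentative expansion plus the size test against the vigilance theta) can be sketched as follows. Returning a flag instead of immediately creating the new hyperbox is our simplification; the overlap/contraction steps are not shown.

```python
import numpy as np

def try_expand(V, W, O, theta):
    """Expansion test of the FMMC learning step.

    Tentatively expands hyperbox (V, W) to include pattern O and accepts
    the expansion only if the mean edge size T stays within theta.
    Returns (V, W, accepted); on refusal the box is left unchanged and the
    caller is expected to create a new hyperbox for O."""
    V_new = np.minimum(V, O)                 # v*_jn = min(v_jn, a_in)
    W_new = np.maximum(W, O)                 # w*_jn = max(w_jn, a_in)
    T = (W_new - V_new).mean()               # T = (1/N) sum_n (w* - v*)
    if T > theta:
        return V, W, False
    return V_new, W_new, True
```

For example, a point hyperbox at the origin refuses to absorb the pattern (1, 1) when theta = 0.5 (T would be 1.0) but accepts it when theta = 2.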


3. Proposed method

3.1. Descriptive element

Let's consider a set of M objects {O_1, O_2, ..., O_i, ..., O_M} characterized by N parameters, let R_i be the observation associated with the object O_i, and let mat_va be a matrix of M lines (representing the objects O_i) and N columns (representing the parameters a_j), defined by:

mat_va = (a_ij), 1<=i<=M, 1<=j<=N

3.2. Principle

The method proposed in this work is a supervised classification method. It is based on two steps: the first step creates the prototypes and removes the noisy objects; the second step, called the merge step, creates the classes from the created prototypes.

The objects to be classified are initially stored in a matrix called mat_va of dimension [M, N]. The mat_va matrix varies with the iteration index i and is then called mat_va_i. The first object of mat_va_i, O_1^i, is taken. The K nearest neighbours to O_1^i are found from among the M objects considered initially, and the farthest object from O_1^i among its K nearest neighbours is obtained; this object is called the Kth object. A test is then carried out to decide whether O_1^i is noisy or not: if the Euclidean distance between the Kth object and O_1^i is greater than a threshold D set out initially, then O_1^i is noisy or isolated and is removed; if not, O_1^i and its K nearest neighbours constitute a consistent prototype. All the elements of the obtained prototype are removed from mat_va_i, and mat_va_{i+1} is obtained. The merge step is inspired by the hierarchical clustering method; it consists of grouping the most similar prototypes into a class iteratively.

3.2.1. Creating the prototypes and removing the noisy objects

The phase of creating prototypes is based on the cloud of points situated in the attribute space. Objects are initially stored in the realizations matrix mat_va and are extracted from this matrix. The algorithm for this phase is:

•  Iteration t = 1, mat_va_1 = mat_va

1. Find the K nearest neighbours of the object O_1 in the cloud of points in the attribute space, where O_1 is the first element of mat_va_1. Let R_k^1 (k = 1, ..., K) be the observations associated with the K nearest neighbours of O_1.

2. Let d(R_1, R_K^1) = max_{k=1..K} d(R_1, R_k^1).

•  If d(R_1, R_K^1) <= D, the object O_1 and its K nearest neighbours form the prototype P_1 of gravity center

g_1 = ( R_1 + sum_{k=1..K} R_k^1 ) / (K + 1)

The (K+1) objects of P_1 in this case are compact and similar to each other. The (K+1) objects of P_1 are removed from the mat_va_1 matrix, and mat_va_2 is created with M-(K+1) rows; the new indexation for mat_va_2 numbers the rows from 1 to M-(K+1). The empty rows of mat_va_1 are filled with the non-empty rows immediately following them, so that mat_va_2 is formed with no empty lines.

•  If d(R_1, R_K^1) > D, then the object O_1 of observation R_1 is considered to be isolated and noisy, and it is removed from mat_va_1. Then mat_va_2 is of dimension [M-1, N]: the first line of mat_va_1 is removed and again a new indexation for mat_va_2 takes place, the lines being indexed from 1 to M-1.

Example

Let mat_va_1 = [O_1; O_2; O_3; O_4; O_5] be a matrix of objects, and K = 2. Let's consider that O_2 and O_4 are the two nearest neighbours of O_1. We assume that the distance between O_1 and the second nearest neighbour is smaller than D; then O_1, O_2 and O_4 form a prototype P whose elements are eliminated from mat_va_1, and mat_va_2 becomes [O_3; O_5].

If the distance between O_1 and the second nearest neighbour is bigger than D, then O_1 is eliminated from mat_va_1: O_1 is a noisy object and mat_va_2 becomes [O_2; O_3; O_4; O_5].

The algorithm corresponding to the creation of prototypes at a later iteration is:

•  Iteration t = J:

Let S be the number of prototypes created so far and mat_va_{J-1} the matrix formed at iteration J-1; it consists of the objects not assigned to the prototypes P_1, ..., P_S. The matrix of observations mat_va_J is formed in the following way:

1. Get the K nearest neighbours of the first object of mat_va_{J-1}, O_1^{J-1}. The Kth nearest neighbour of O_1^{J-1} is O_{K+1}^{J-1}; it is considered to be the farthest from O_1^{J-1}.

2. Let d(O_{K+1}^{J-1}, O_1^{J-1}) be the Euclidean distance between O_1^{J-1} and O_{K+1}^{J-1}.

If d(O_{K+1}^{J-1}, O_1^{J-1}) > D, then O_1^{J-1} is considered to be isolated and is eliminated.

If d(O_{K+1}^{J-1}, O_1^{J-1}) <= D, then a new prototype is created and the (K+1) objects of the prototype are removed from mat_va_{J-1}.

3. Create a new matrix mat_va_J.

4. Repeat until the last matrix mat_va_e becomes empty.
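Phase 1 as described above can be sketched as below. One reading choice is ours: the K neighbours are searched among the objects still present in the matrix (the principle in section 3.2 also mentions searching among all M initial objects), and objects left over when fewer than K neighbours remain are simply discarded.

```python
import numpy as np

def create_prototypes(R, K, D):
    """Sketch of phase 1: prototype creation and noisy-object removal.

    Repeatedly take the first remaining observation, find its K nearest
    neighbours among the remaining objects, and either form a prototype
    (gravity center of the K+1 objects) or discard the object as noisy
    when its Kth neighbour is farther than the threshold D."""
    remaining = list(range(len(R)))
    prototypes = []                          # gravity centers g of each prototype
    while remaining:
        i = remaining[0]
        others = remaining[1:]
        if len(others) < K:                  # not enough neighbours left: stop
            remaining.pop(0)
            continue
        d = np.linalg.norm(R[others] - R[i], axis=1)
        order = np.argsort(d)[:K]            # indices (into others) of K nearest
        if d[order[-1]] > D:                 # Kth neighbour too far: noisy object
            remaining.pop(0)
        else:                                # compact prototype of K+1 objects
            members = [i] + [others[j] for j in order]
            prototypes.append(R[members].mean(axis=0))
            remaining = [m for m in remaining if m not in members]
    return prototypes
```

On two tight triples of points with K = 2 and a generous D, the sketch produces one prototype per triple, each centered at the triple's gravity center.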

3.2.2. Merge phase

We assume that phase 1 ended by forming H prototypes P_1, ..., P_H. The task of the merge step is to aggregate the most similar prototypes into the appropriate class iteratively, until the number of classes created is equal to the number of classes fixed initially. This step is inspired by the hierarchical clustering method.

We choose the Euclidean distance as the similarity criterion between the prototypes. Each prototype initially represents a class, so we have H classes (Figure A: the H prototypes P_1, P_2, ..., P_H, each drawn as its own circle).

We compute the Euclidean distance between the gravity centers of the H prototypes taken two by two, so we have H(H-1)/2 distances. The different distances are stored in a matrix mat = [d_1, d_2, ..., d_{H(H-1)/2}] after being ordered in an increasing way. The algorithm for the merge phase is:

•  Iteration t = 1

Let d_1(g_s, g_r) = min_{1<=i<j<=H} d(g_i, g_j), where g_i and g_j are the gravity centers associated with the prototypes P_i and P_j respectively. P_s and P_r are considered similar prototypes; they are grouped into one class CL_1. The two classes associated with P_s and P_r are merged to form one single class, and the number of circles used to represent the classes is reduced by one unit (Figure B: the circles of P_s and P_r merged into CL_1(P_s, P_r)). In this case the number of classes becomes H-1.


•  Iteration t = 2

Let d_2(g_m, g_n) be the second minimum distance. P_m and P_n are considered to be similar prototypes; a test is then carried out:

1. If one of the prototypes is already assigned to CL_1, the other prototype is also put in CL_1, and the circle representing the other prototype is removed (Figure C). The number of classes becomes H-2.

2. If the two prototypes P_m and P_n are not assigned to the class CL_1, they are grouped into a new class CL_2 (Figure D): the circles associated with P_m and P_n are combined to form one circle associated with CL_2. The number of classes becomes H-2.

•  Iteration t = T

We assume that the number of classes is S. Let d_T(g_c, g_b) be the Tth minimum distance. P_c and P_b are considered to be similar prototypes; three cases are to be distinguished:

1. If P_c and P_b are not assigned to any class already created during the (T-1) previous iterations, a new class CL_v is produced. The number of classes becomes S-1.

2. If one of the prototypes (for example P_c) is already assigned to a class CL_i created during the T-1 previous iterations, then the other prototype P_b is assigned to the same class CL_i. The number of classes becomes S-1.

3. If the prototypes P_c and P_b are already assigned to different classes (CL_e, CL_f) (e < f) respectively, the two prototypes P_c and P_b are assigned to the class CL_e: all the prototypes already assigned to the class CL_f are assigned to the class CL_e and the class CL_f is eliminated. In fact, CL_f is combined with CL_e to form one class that we still call CL_e; the number of classes becomes S-1.

This procedure runs iteratively until the remaining number of circles associated with the classes is equal to the number of classes fixed initially.
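The three cases above amount to repeatedly merging the two closest not-yet-joined groups, a single-linkage-style agglomeration over the prototype gravity centers. The sketch below uses union-find bookkeeping, which is our implementation choice rather than the authors' circle-removal description.

```python
import numpy as np

def merge_prototypes(centers, n_classes):
    """Merge-phase sketch: group H prototype gravity centers into n_classes.

    All H(H-1)/2 pairwise distances are sorted in increasing order, and the
    closest pair whose prototypes are in different groups is merged at each
    step, exactly covering cases 1-3 of iteration t = T."""
    H = len(centers)
    label = list(range(H))                   # union-find parent of each prototype

    def find(i):                             # root = current class of prototype i
        while label[i] != i:
            label[i] = label[label[i]]       # path halving
            i = label[i]
        return i

    pairs = sorted((np.linalg.norm(centers[i] - centers[j]), i, j)
                   for i in range(H) for j in range(i + 1, H))
    count = H
    for _, i, j in pairs:
        if count == n_classes:
            break
        ri, rj = find(i), find(j)
        if ri != rj:                         # merge the two classes (case 3 keeps
            label[max(ri, rj)] = min(ri, rj) # the lower-index class, like CL_e)
            count -= 1
    return [find(i) for i in range(H)]
```

Replaying the nine gravity centers of the worked example in section 3.2.3 with n_classes = 3 reproduces the paper's final grouping {P_1, P_2, P_3}, {P_4, ..., P_7}, {P_8, P_9}.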

3.2.3. Example

This example has three classes of spherical shape. Figure 1 shows the distribution of Gaussian randomly generated data in the (X, Y) space; figure 2 shows the distribution of the gravity centers of the created prototypes for K = 10 and D = 20.

Figure 1: The distribution of Gaussian randomly generated data in the (X, Y) space.

Figure 2: Prototypes with corresponding centers obtained by the proposed method.



The first step of the proposed technique creates prototypes and eliminates noisy objects; 9 prototypes are created (Figure E: the 9 prototypes P_1, ..., P_9, each initially forming its own class). We notice from figure 2 that the prototypes of a class are all neighbours of each other. The gravity centers of the created prototypes are:

g_1 = [16.0000 60.6364], g_2 = [16.1818 70.8182], g_3 = [26.7273 64.1818]
g_4 = [35.7273 147.7273], g_5 = [42.1818 151.4545], g_6 = [44.7273 146.4545]
g_7 = [52.0909 143.4545], g_8 = [99.0909 120.3636], g_9 = [105.636 117.4545]

Table 1 gives the mutual distances between the gravity centers of all the prototypes.

Table 1: Distances calculated between the different prototype gravity centers.

1st iteration
d_1(g_5, g_6) = min_{1<=i<j<=9} d(g_i, g_j) = 5.61, so P_5 and P_6 are grouped into one class CL_1; the number of classes is 8.

2nd iteration
d_2(g_8, g_9) = 7.16, so P_8 and P_9 are grouped into the same class CL_2; the number of classes is 7.

3rd iteration
d_3(g_5, g_4) = 7.45; P_4 and P_5 are therefore grouped into the same class CL_1; the number of classes is 6.

4th iteration
d_4(g_6, g_7) = 7.95; P_6 and P_7 are grouped into the same class CL_1; the number of classes is 5.

5th iteration
d_5(g_6, g_4) = 9.08; P_4 and P_6 are already grouped in the class CL_1; the number of classes remains 5.

6th iteration
d_6(g_1, g_2) = 10.2; P_1 and P_2 are grouped into one class CL_3; the number of classes is 4.

      g1     g2     g3     g4     g5     g6     g7     g8     g9
g1    0      10.2   11.3   89.3   94.5   90.5   90.3   102    106
g2    10.2   0      12.4   79.3   84.7   80.8   81.0   96.6   100
g3    11.3   12.4   0      84.0   88.6   84.2   83.2   91.6   95.2
g4    89.3   79.3   84.0   0      7.45   9.08   16.9   69.0   76.2
g5    94.5   84.7   88.6   7.45   0      5.61   12.7   64.8   71.9
g6    90.5   80.8   84.2   9.08   5.61   0      7.95   60.3   67.5
g7    90.3   81.0   83.2   16.9   12.7   7.95   0      52.3   59.5
g8    102    96.6   91.6   69.0   64.8   60.3   52.3   0      7.16
g9    106    100    95.2   76.2   71.9   67.5   59.5   7.16   0



7th iteration
d_7(g_1, g_3) = 11.3; P_1 and P_3 are grouped into the class CL_3; the number of classes is 3.

End of algorithm

The number of classes is three, matching the real existing classes: CL_1 contains P_4, P_5, P_6 and P_7; CL_2 contains P_8 and P_9; CL_3 contains P_1, P_2 and P_3.

4. Simulations and comparisons

In this section, the classes considered are of convex and non convex types. All the data used for the simulations were generated by a Gaussian distribution routine in Matlab. The simulations differ in the number of data, the form of the classes, the degree of overlap between the classes and the class compactness. All the data are two dimensional, which has allowed us to plot them and obtain a clear view of the results.

4.1. Simulation 1

This first simulation has three classes with no overlap between them. The data are divided into two sets: one set of 90 objects used for learning and one set of 70 objects used for testing. Figures 3 and 4 show the classification results in the (A1, A2) space for the CM, FCM and SCNN methods, for bad and good initializations respectively.

Figures 3 and 4 show that the C-means (CM), fuzzy C-means (FCM) and standard competitive neural network (SCNN) methods converge to good solutions with a good initialization, but with a bad initialization they are trapped in local solutions. Figures 5 and 6 show the classification results for the fuzzy Min-Max classification (FMMC) method and the proposed method respectively. For FMMC, we set θ = 0.5 and γ = 2. For the proposed method, D is set to 1 and K to 8.
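The sensitivity of C-means to its initial centers can be reproduced in a few lines. The sketch below runs Lloyd's algorithm on hypothetical toy data (not the simulation data of this paper) from a good and a bad initialization: the former recovers the three cluster centers, the latter converges to a local solution that splits one cluster and merges two others.

```python
def kmeans(points, centers, max_iters=50):
    """Lloyd's algorithm: alternate nearest-center assignment and center update."""
    for _ in range(max_iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda c: (p[0] - centers[c][0]) ** 2
                                      + (p[1] - centers[c][1]) ** 2)
            clusters[nearest].append(p)
        new = [tuple(sum(xs) / len(xs) for xs in zip(*cl)) if cl else centers[k]
               for k, cl in enumerate(clusters)]
        if new == centers:          # assignments stable: converged
            break
        centers = new
    return centers

# three well-separated toy clusters (hypothetical data)
points = [(0, 0), (1, 0), (0, 1), (1, 1),
          (10, 0), (11, 0), (10, 1), (11, 1),
          (10, 10), (11, 10), (10, 11), (11, 11)]

good = kmeans(points, [(0.5, 0.5), (10.5, 0.5), (10.5, 10.5)])
bad = kmeans(points, [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)])
# 'good' recovers the three true cluster means; 'bad' ends in a local
# solution where no center lands on the upper-right cluster.
```

The bad run places all three initial centers inside the lower-left cluster, which is the kind of initialization that traps CM, FCM and SCNN in the figures below.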

Figure 3: Results of the classification in the (A1, A2) space for bad initialization.

Figure 4: Results of the classification in the (A1, A2) space for good initialization.

Figure 5: Results of the classification in the (A1, A2) space for FMMC.

Figure 6: Prototypes with corresponding centers obtained by the proposed method.


 


We have also studied in this simulation the impact of K on the classification results. We found that if K is less than 8, the classification results remain good but the number of prototypes is higher. If K is greater than 10, the classification results may degrade: within a class whose compactness is very low, the prototypes may not be formed at all, all the objects of that class may be considered noisy, and the class is then ignored.
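A minimal sketch of the creation phase makes the roles of K and D concrete. The grouping and noise criteria below are assumptions made for illustration (the seed choice, the exact distance test against D, and the handling of leftover objects are not fully specified here); as in the paper, each prototype is represented by its gravity center.

```python
def dist(a, b):
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def build_prototypes(objects, K, D):
    """Group each object with its K nearest unassigned neighbours into a
    prototype (kept as its gravity center); discard objects with no
    neighbour closer than D as noise. Illustrative sketch only."""
    remaining = list(objects)
    prototypes = []
    while remaining:
        seed = remaining[0]
        neighbours = sorted(remaining[1:], key=lambda p: dist(seed, p))[:K]
        if not neighbours or dist(seed, neighbours[0]) > D:
            remaining.pop(0)             # isolated object: treated as noise
            continue
        group = [seed] + neighbours
        center = (sum(p[0] for p in group) / len(group),
                  sum(p[1] for p in group) / len(group))
        prototypes.append(center)
        for p in group:
            remaining.remove(p)
    return prototypes

# five clustered objects and one distant outlier (hypothetical data)
objs = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5), (100, 100)]
protos = build_prototypes(objs, K=4, D=5)
# one prototype at the cluster's gravity center; the outlier is discarded
```

With a K larger than the size of a sparse class, no object of that class would find enough close neighbours, which mirrors the risk described above of a low-compactness class being ignored entirely.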

Table 1 gives the classification rates for the methods: C-means, fuzzy C-means, standard competitive neural

networks for bad initialization, fuzzy Min-Max classification and the proposed method.

Table 1: The classification rates obtained by the various methods considered in this work.

 

Table 1 shows that the CM, FCM and SCNN methods obtain bad classification results for a bad initialization: the algorithms converge to local solutions. The FMMC method and the proposed method converge to better solutions, but the running time of FMMC is much larger than that of the proposed method.

4.2 Simulation 2

This simulation has five classes of spherical shape which slightly overlap. The data are divided into two parts: a training set of 200 objects and a test set of 100 objects. Figure 7 shows the classification results in the (B1, B2) space for the CM, FCM and SCNN methods using a good initialization. Figure 8 shows the classification results for the FMMC method, with θ set to 0.5 and γ to 1.2.

Classification method                  Number of prototypes   Misclassified objects   Classification rate (%)   Learning time (s)
C-means                                3 classes              30                      57.14                     0.188
Fuzzy C-means                          3 classes              30                      57.14                     0.047
Standard competitive neural network    3 classes              30                      57.14                     0.016
Fuzzy Min-Max classification           48                     0                       100                       0.422
Proposed method                        9                      0                       100                       0.0780

Figure 7: Results of the classification in the (B1,B2)

space for good initialization.

Figure 8: Results of the classification in the (B1,B2)

space for FMMC.


Figure 7 shows the good convergence of the C-means (CM), fuzzy C-means (FCM) and standard competitive neural network (SCNN) methods; with a bad initialization, however, all these methods are trapped in local solutions. Figure 9 shows the classification results for the proposed method, with D = 1 and K = 15.

Figure 9: Prototypes with corresponding centers obtained by the proposed method.

We have varied the value of K in this case and obtained results similar to those of the first simulation. If K is less than 15, good classification results are obtained but at the expense of a higher number of prototypes and a longer running time. If K is greater than 15, the classification results may degrade, and the risk that the algorithm ignores the less compact classes is always present.

Table 2 gives the classification rates for C-means, fuzzy C-means and standard competitive neural networks using a good initialization, fuzzy Min-Max classification, and the proposed method.

Table 2: The classification rates obtained by the various methods considered in this work.

Classification method                  Number of prototypes   Misclassified objects   Classification rate (%)   Learning time (s)
C-means                                5 classes              3                       97                        0.609
Fuzzy C-means                          5 classes              3                       97                        0.343
Standard competitive neural network    5 classes              2                       98                        0.031
Fuzzy Min-Max classification           101                    5                       95                        1.266
Proposed method                        7                      0                       100                       0.766

 

From this table, the proposed method has the highest classification rate, 100%. The other methods, CM, FCM, SCNN and FMMC, reach classification rates of 97%, 97%, 98% and 95% respectively.

4.3 Simulation 3

This simulation has three classes of spherical form. The data are divided into two parts: a training set of 450 objects and a test set of 300 objects. Figure 10 shows the classification results in the (C1, C2) space using the CM, FCM and SCNN methods for a good initialization. Figure 11 shows the classification results for the FMMC method.

Figure 10: Results of the classification in the (C1,C2)

space for bad initialization.

Figure 11: Result of data classification in the (C1,C2)

for the FMMC method.


Figure 12 shows the data and the prototype gravity centers in the (C1, C2) space for the proposed method, with D set to 1 and K to 23.

Figure 12: Prototypes with corresponding centers obtained by the proposed method.

It is found from this experiment that if K is less than 14, noisy objects can be absorbed into prototypes, which may result in a bad classification. If K is greater than 47, there is a risk that the algorithm ignores the less compact classes. In this test, K may take a higher value than in the previous tests because all the classes are more compact and each class contains more objects. Table 3 gives the classification rates obtained by CM, FCM, SCNN, FMMC and the proposed method.

Table 3: The classification rates obtained by the various methods considered in this work. 

The highest classification rate is obtained by the proposed method: 98.33%, which means that five objects out of 300 were misclassified. For CM, FCM, SCNN and FMMC, the classification rates are 97%, 97.66%, 97.33% and 95.33% respectively.
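The reported rate follows directly from the misclassification count over the test set; a quick arithmetic check:

```python
# Classification rate = (test objects - misclassified) / test objects * 100
n_test, misclassified = 300, 5
rate = (n_test - misclassified) / n_test * 100
print(round(rate, 2))  # 98.33, as reported for the proposed method in Table 3
```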

4.4 Simulation 4

This simulation has two classes of complex shape. The data are divided into two parts: a training set of 360 objects and a test set of 180 objects. The data were synthesized by the authors using the rand routine in Matlab. Figure 13 shows the classification results in the (D1, D2) space for CM, FCM and SCNN. Figure 14 shows the classification results for FMMC, with θ = 0.3 and γ = 1.

Figure 15 shows the data and the prototype gravity centers in the (D1, D2) space for the proposed method, with D set to 1 and K to 30.

Classification method                  Number of prototypes   Misclassified objects   Classification rate (%)   Learning time (s)
C-means                                3 classes              9                       97                        0.7500
Fuzzy C-means                          3 classes              7                       97.66                     0.3280
Standard competitive neural network    3 classes              8                       97.33                     0.094
Fuzzy Min-Max classification           240                    14                      95.33                     4.9531
Proposed method                        25                     5                       98.33                     2.8430

Figure 13: Results of the classification in the (D1, D2) space for CM, FCM and SCNN.

Figure 14: Results of the classification in the (D1, D2) space for FMMC.


Figure 15: Prototype gravity centers obtained by the proposed method.

If K is less than 30, good classification results are obtained, but the computing time and the number of prototypes increase. Table 4 gives the classification rates obtained by CM, FCM, SCNN, FMMC and the proposed method.

Table 4: The classification rates obtained by the various methods considered in this work.

From this table, we notice that the performance of the proposed method is higher than that of the others: a good classification rate is obtained in a minimum running time.

4.5. Simulation 5

This simulation has two classes of complex shape. The data are divided into two parts: a training set of 556 objects and a test set of 500 objects. The data were synthesized by the authors using the rand routine in Matlab. Figure 16 shows the classification results in the (E1, E2) space for CM, FCM and SCNN. Figure 17 shows the classification results for FMMC, with θ = 0.6 and γ = 1.

Figure 18 shows the data and the prototype gravity centers in the (E1, E2) space for the proposed method, with D set to 1 and K to 30.

Classification method                  Number of prototypes   Misclassified objects   Classification rate (%)   Learning time (s)
C-means                                2 classes              28                      84.4                      2.25
Fuzzy C-means                          2 classes              28                      84.4                      1.1560
Standard competitive neural network    2 classes              26                      85.56                     1.67
Fuzzy Min-Max classification           118                    0                       100                       2.1880
Proposed method                        19                     0                       100                       0.8910

Figure 16: Results of the classification in the (E1, E2) space for CM, FCM and SCNN.

Figure 17: Results of the classification in the (E1, E2) space for FMMC.


Figure 18: Prototype gravity centers obtained by the proposed method.

If K is less than 30, there will always be a good classification, but the computing time and the number of prototypes increase. Table 5 gives the classification rates obtained by CM, FCM, SCNN, FMMC and the proposed method.

Table 5: The classification rates obtained by the various methods considered in this work.

A very high classification rate is obtained by both the proposed method and FMMC, but the execution time is lower for the proposed method. The other methods, CM, FCM and SCNN, failed to converge to the correct solution; they have a very low classification rate (about 57%).

5. Conclusion

The present paper has proposed a supervised data classification technique for convex and non convex classes. The technique is carried out in two steps: step one creates the prototypes and eliminates noisy objects; step two merges the prototypes into classes.

The technique is based on the tuning of two parameters, K and D. The prototypes are formed by the K nearest neighbours technique. The second step merges the closest prototypes into classes; the line of circles architecture has been used to assist this task. In this study, we have shown that larger values of K may lead to bad classification results, while smaller values of K lead to a higher number of prototypes and a longer running time; a good choice of K must therefore be made. Different simulations were performed to validate the proposed method; they differ in the number of data, the class compactness, the form of the classes and the degree of overlap between them. The proposed method was compared to various techniques such as C-means, fuzzy C-means, competitive neural networks and fuzzy Min-Max classification. For complex classes, the traditional methods CM, FCM and SCNN failed to converge to the true solution, whereas the proposed method obtained good results with a smaller convergence time.

In all the simulations performed in this work, the proposed method has always obtained better results in less running time.

6. References

[1] Ouariachi, H., (2001). "Classification non supervisée de données par réseaux de neurones et une approche évolutionniste: application à la classification d'images". Thèse de doctorat, Université Mohamed 1, Maroc.

[2] Nasri, M., (2004). “Contribution à la classification des données par approches évolutionnistes: simulation et

application aux images de textures”. Thèse de Doctorat, Université Mohammed 1, Oujda, Maroc.

[3] Simpson, P.K., (1992). "Fuzzy min-max neural networks. Part 1: Classification". IEEE Transactions on Neural Networks, vol. 3, pp. 776-786.

[4] Zalik, K. R. and Zalik, B., (2010). “Validity index for clusters of different sizes and densities”. Pattern

Recognition Letters, 221-234.

[5] Bouguelid, M. S., (2007). "Contribution à l'application de la reconnaissance des formes et la théorie des possibilités au diagnostic adaptatif et prédictif des systèmes dynamiques". Thèse de doctorat, Université de Reims Champagne-Ardenne.

[6] Borne, P., Benrejeb, M., Haggège, J. "Les réseaux de neurones, présentation et applications". Editions Technip.

Classification method                  Number of prototypes   Misclassified objects   Classification rate (%)   Learning time (s)
C-means                                2 classes              214                     57.2                      1.0150
Fuzzy C-means                          2 classes              214                     57.2                      0.9220
Standard competitive neural network    2 classes              214                     57.2                      0.17
Fuzzy Min-Max classification           141                    0                       100                       3.04
Proposed method                        28                     0                       100                       1.84


[7] Diab, M., (2007). "Classification des signaux EMG utérins afin de détecter les accouchements prématurés". Thèse.

[8] Stéphane, D., Postaire, J.-G., (1996). "Classification interactive non supervisée de données multidimensionnelles par réseaux de neurones à apprentissage compétitif". Thèse, France.

[9] Hao, Y., Régis, L., (1992). "Etude des réseaux de neurones en mode non supervisé: Application à la reconnaissance des formes". Thèse.

[10] Kopcso, D., Pipino, L., Rybolt, W., (1992). "Classifying the uncertainty arithmetic of individuals using competitive learning neural networks". Elsevier: Expert Systems with Applications, Vol. 4, pp. 157-169.

[11] Gabrys, B., and Bargiel, A., (2000). "General Fuzzy Min-Max Neural Network for Clustering and

Classification". IEEE Transactions on neural networks, Vol. 11, N°. 3.

[12] Chowhan, S.S., Shinde, G. N., (2011). “Iris Recognition Using Fuzzy Min-Max Neural Network”.

International Journal of Computer and Electrical Engineering, Vol. 3, No. 5.

[13] Khan, S., Ahmad, A., (2004). "Cluster center initialization algorithm for K-means clustering". Elsevier Pattern Recognition Letters 25, pp. 1293-1302.

[14] Ahmed, M. N., Yamany, S. M., Mohamed, N., Farag, A. A., and Moriarty, T., (2002). "A Modified Fuzzy C-Means Algorithm for Bias Field Estimation and Segmentation of MRI Data". IEEE Transactions on Medical Imaging, Vol. 21, No. 3.

[15] Dong-Chul Park, (2000). "Centroid Neural Network for Unsupervised Competitive Learning". IEEE Transactions on Neural Networks, Vol. 11, No. 2.