Fuzzy Signature Neural Network
Kun He
1st June 2012
A report submitted for the degree of Master of Computing of
Australian National University
Supervisor: Prof. Tom Gedeon
Acknowledgements
Thanks to my supervisor Tom Gedeon and to Dingyun Zhu for their recommendations and kind
support on this project. Thanks also to the course coordinator Weifa Liang for his guidance on
report writing, and to Wei Fan for his suggestions on fuzzy signature neural networks. Finally,
thanks to my family and my friends for their encouragement.
Abstract
In this report we first introduce the background of neural networks and fuzzy signatures,
and then focus on the fuzzy signature neural network. Fuzzy signatures are used in fuzzy rule
based systems to reduce the rule explosion issue. The neural networks we consider use Radial
Basis Functions as the activation function; these are real-valued functions whose value depends
only on the distance from the centroid of the function.
We modify the previous approach to improve the method of choosing aggregation functions,
and the method of creating the structure of the fuzzy signature neural network. The new
method reduces the risk that the results depend heavily on the manual choice of aggregation
function and the manual choice of fuzzy signature structure.
Finally, this report presents an experimental evaluation of the fuzzy signature neural network
through three experiments. The first and second experiments compare fuzzy signature based
RBF neural networks, and show that our approach outperforms the previous fuzzy signature
based RBF neural network when the datasets have significant numbers of missing values. The
third experiment compares our work with other neural networks. The results demonstrate that
our approach is viable and worth further investigation.
List of Abbreviations
FSNN Fuzzy Signature Neural Network
NN Neural Networks
ANN Artificial Neural Networks
RBF Radial Basis Function
Cascor Cascade Correlation Neural Network
sNN Symmetric Nearest Neighbor
Contents
Acknowledgements .................................................................................................................................. 1
Abstract ................................................................................................................................................... 2
List of Abbreviations................................................................................................................................. 3
List of Figures .......................................................................................................................................... 6
List of Tables ............................................................................................................................................ 7
1. Introduction...................................................................................................................................... 8
1.1 Motivation .......................................................................................................................... 8
1.2 Objectives ........................................................................................................................... 8
1.3 Contribution ....................................................................................................................... 8
1.4 Preview .............................................................................................................................. 9
2. Background and relevant knowledge .................................................................................................. 9
2.1 Neural Networks ................................................................................................................. 9
2.2 Radial Basis Function......................................................................................................... 11
2.3 Fuzzy Rules Based system .................................................................................................. 12
2.4 Fuzzy Signature ................................................................................................................. 13
3. Fuzzy Signature Neural Network ...................................................................................................... 14
3.1 Basic structure and process of fuzzy signature neural network ............................................ 14
3.2 Improvement of fuzzy signature neural network ................................................................. 15
4. Design and Implementation of Fuzzy signature neural network ......................................................... 15
4.1 Description ....................................................................................................................... 16
4.2 Construction of fuzzy signature neural network .................................................................. 17
4.2.1 Damaged data ............................................................................................................ 17
4.2.2 Clustering .................................................................................................................. 18
4.2.3 Create fuzzy signature ................................................................................................ 19
4.2.3.1 Structure of fuzzy signature .................................................................................... 19
4.2.3.2 Obtain fuzzy signature information .......................................................................... 20
4.2.3.3 Aggregation .......................................................................................................... 21
4.2.4 Create neural network ................................................................................................ 22
4.2.5 Training the Fuzzy Signature Neural Network .............................................................. 22
4.3 Testing .............................................................................................................................. 24
4.3.1 Testing network ......................................................................................................... 24
4.3.2 Extracting network information .................................................................................. 25
5. Experiments and Evaluation............................................................................................................. 25
5.1 Description of the dataset .................................................................................................. 26
5.2 Hardware and software environment information ............................................................... 27
5.3 Experiment 1: Datasets experiments with no missing data................................................... 28
5.3.1 Purpose of the experiment .......................................................................................... 28
5.3.2 Description of the experiment ..................................................................................... 28
5.3.3 Experiment Process and Discussion of Results ............................................................. 29
5.4 Experiment 2: Datasets experiments with missing data ....................................................... 30
5.4.1 Purpose of the experiment .......................................................................................... 30
5.4.2 Description of the experiment ..................................................................................... 30
5.4.3 Experiment Process and Discussion of Results ............................................................. 31
5.5 Experiment 3: Benchmarks comparison between Fuzzy signature neural network and other
approaches ..................................................................................................................................... 32
5.5.1 Purpose of the experiment .......................................................................................... 32
5.5.2 Description of the experiment ..................................................................................... 32
5.5.3 Experiment Process and Discussion of Results ............................................................. 33
6. Conclusion and Future Works........................................................................................................... 33
6.1 Conclusion ........................................................................................................................ 33
6.2 Future Work...................................................................................................................... 33
Reference ............................................................................................................................................... 35
Appendix A ............................................................................................................................................ 37
List of Figures
Figure 1: Example of a basic NN .......................................................................................................... 10
Figure 2: Example of a neural network ................................................................................................. 11
Figure 3: Example of structure of fuzzy signature of SARS patient ............................................................ 13
Figure 4: Example of aggregation of SARS patient.................................................................................. 14
Figure 5: Example of Fuzzy signature based radial basis function neural network ...................... 15
Figure 6: Construction of fuzzy signature neural network & testing suit ................................................... 16
Figure 7: Example of Agglomerative hierarchical clustering..................................................................... 18
Figure 8: Examples of structure of fuzzy signature for SARS .................................................................... 20
Figure 9: The example of structures of fuzzy signature according to Figure 8 ............................................ 20
Figure 10: Example of aggregation function selection ............................................................................ 21
Figure 11: Manhattan distance ........................................................................................................... 23
List of Tables
Table 1: Information of files................................................................................................................ 25
Table 2: Datasets details .................................................................................................................... 26
Table 3: the information of Hardware and Software environment .................................................. 28
Table 4: the detail of number of cluster selected ................................................................................... 29
Table 5: the benchmark for our approach............................................................................................. 29
Table 6: the benchmark for Fan's approach .......................................................................................... 30
Table 7: the benchmark for our approach with missing value .................................................................. 31
Table 8: the benchmark for Fan's approach with missing value ............................................................... 31
Table 9: the information of our approach that will be used..................................................................... 32
Table 10: the results of 3 different neural networks ............................................................................... 33
1. Introduction
1.1 Motivation
Human decision making is a comprehensible hierarchical process, in which cognitive processes
lead to the selection of a set of actions among many alternatives, so bio-inspired techniques are
used as a foundation for modelling human decision making. The design of intelligent systems,
in Artificial Intelligence or its descendant Computational Intelligence, is a problem of identifying
approximate models to describe a real world scenario [1]. Nowadays, AI is applied in a wide
variety of fields, such as business, mathematics, industry, medical science, and so on. However,
if those systems involve very complex structured high dimensional data, sometimes with
interdependent features and missing components, conventional AI systems are not adequate
[1]. Therefore, how to handle such datasets correctly and effectively, and how to design
structures that reflect the real world, have become key issues in decision-making under
uncertainty.
An efficiency issue called rule explosion affects fuzzy rule based systems, a kind of conventional
AI system [1]: the number of rules grows exponentially with the number of input dimensions.
To address this, the fuzzy signature method introduced by Gedeon et al. [2] can be used in
fuzzy rule based systems while mitigating the rule explosion issue.
Neural networks have very strong nonlinear fitting ability, can represent arbitrarily complex
nonlinear mappings, and also have strong robustness, memory and learning ability [3].
However, the neural network approach does not produce easily comprehensible results,
especially for networks trained on very complex structured high dimensional data [4]. The
fuzzy signature based neural network approach has been used to combine the benefits of
neural networks and fuzzy signatures. In our project, we choose the Radial Basis Function as
the activation function, and each fuzzy signature is treated as a hidden neuron of the RBF
network. The fuzzy signature neural network thus gains the advantages of neural networks
while, at the same time, solving the rule explosion efficiency issue of fuzzy rule based systems.
1.2 Objectives
The aim of this project is to implement an improved fuzzy signature neural network based on
Fan's code [3], to provide a detailed investigation, to evaluate this approach on test data, and
to compare it with the previous approach.
1.3 Contribution
The contribution of this project involves the three following areas. Firstly, the FSNN code
written by Fan is simplified and modified in order to implement and improve the FSNN.
Secondly, sequence training code is designed for the FSNN and used to compare it with other
neural networks, including the previous FSNN. Finally, the feasibility of such a neural network
and future work are discussed based on the results of the experiments.
1.4 Preview
Chapter 2 gives an overview of the relevant techniques and basic concepts, including neural
networks, radial basis function neural networks, fuzzy rule based systems and fuzzy signatures,
which are helpful for understanding the FSNN and its evaluation. Following that, Chapter 3
introduces and focuses on the fuzzy signature neural network. Chapter 4 demonstrates the
techniques and methodologies of basic fuzzy signature neural networks, describes the test
suite for the implementation, and then presents the design, implementation and testing.
Chapter 5 evaluates this approach based on the implementation in Chapter 4; we used three
different experiments to evaluate our approach in various aspects. Finally, Chapter 6 concludes
the report and indicates the weaknesses as well as suggestions for future work.
2. Background and relevant knowledge
2.1 Neural Networks
An artificial neural network, also simply called a neural network, is a mathematical or
computational model inspired by the structure and/or functional aspects of biological neural
networks. It consists of an interconnected group of artificial neurons and processes information
using a connectionist approach to computation [5]. Modern neural networks are non-linear
statistical data modelling tools, which means they can solve some problems that do not have a
known statistical model [3]. They are usually used to model complex relationships between
inputs and outputs or to find patterns in data. The following figure shows the basic principle
of NNs.
[Figure: inputs X1, X2, …, XN, weighted by W1, W2, …, WN, feed a summation ∑WX and an activation function that produces the output Y.]
Figure 1: Example of a basic NN
As Figure 1 shows, each X represents an input, either from the original data or from the output
of other neurons; the strengths of the connections between the inputs and the neuron are
called weights. Finally, the activation function converts the neuron's weighted input into its
output activation. The activation function can be a non-linear function such as the Gaussian
or sigmoid function. The activation function formula is shown below [5].
y = f( Σ_{i=1}^{n} w_i x_i )

where y is the output of the neuron
n is the number of inputs
x_i is the value of input i of the neuron
w_i is the weight on input i of the neuron
f is the activation function, e.g. the sigmoid f(x) = 1 / (1 + e^{-x})

Equation 1: basic equation of a neural network
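The computation in Equation 1 can be sketched as follows (the report's implementation is in Matlab; this illustrative Python snippet assumes a sigmoid activation):

```python
import math

def neuron_output(inputs, weights):
    """Compute y = f(sum(w_i * x_i)) with a sigmoid activation f."""
    s = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-s))

# A neuron whose weighted inputs cancel has net input 0, so it outputs sigmoid(0):
print(neuron_output([1.0, -1.0], [0.5, 0.5]))  # 0.5
```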
However, a single layer neural network cannot handle complex problems because its structure
is too simple, so we need to create multi-layer neural networks. Such a network consists of
many neurons and is therefore able to solve more complex problems. It contains three different
kinds of neurons, namely input, hidden and output neurons, located in layers called the input
layer, hidden layer and output layer respectively [5]. Figure 2 shows an example neural
network.
[Figure: an input layer x1, x2, …, xn connects through a weight matrix to a layer of hidden neurons, which connects through a second weight matrix to an output layer y1, y2.]
Figure 2: Example of a neural network
In order to create a neural network model, we need two procedures: training and testing. The
first necessary procedure is training the neural network. Through training, the neural network
finds suitable values for the weight matrix that bring the actual outputs closer to the desired
outputs. During this process, the neurons learn the weights iteratively from a number of
training examples, and the process finishes when the network has stabilized. The second
necessary procedure is testing the neural network: using the testing data and the trained
weight matrix, we compare the actual outputs with the desired outputs. From the resulting
testing accuracy, we can judge whether our neural network model is successful.
2.2 Radial Basis Function
A radial basis function (RBF) is a real-valued function whose value depends only on the
distance between a point x and some other point c, the centre [6]. A radial basis function is
written as:

∅(x, c) = ∅(‖x − c‖)

Any function ∅ satisfying the property below is also called a radial function:

∅(x) = ∅(‖x‖)

Different distance measures can be used in the radial basis function, such as the Euclidean
distance, the Lukaszyk-Karmowski metric and the taxicab distance. In our project, we choose
the Euclidean distance as the distance measure for the RBF. The definition of Euclidean
distance is below:
d(q, p) = √( Σ_{i=1}^{n} (q_i − p_i)² )

where p = (p1, p2, …, pn) and q = (q1, q2, …, qn) are two points in Euclidean n-space.

Equation 1.1: Euclidean distance
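Equation 1.1, together with a Gaussian RBF (one common choice of ∅; the width parameter sigma is an illustrative assumption), can be sketched as:

```python
import math

def euclidean(q, p):
    """Equation 1.1: Euclidean distance between points q and p in n-space."""
    return math.sqrt(sum((qi - pi) ** 2 for qi, pi in zip(q, p)))

def gaussian_rbf(x, c, sigma=1.0):
    """A common radial basis function: phi(x, c) = exp(-||x - c||^2 / (2*sigma^2)).
    Its value depends only on the distance from x to the centre c."""
    r = euclidean(x, c)
    return math.exp(-(r ** 2) / (2 * sigma ** 2))

print(euclidean([0, 0], [3, 4]))            # 5.0
print(gaussian_rbf([0.0, 0.0], [0.0, 0.0])) # 1.0 at the centre
```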
RBF neural networks have two obvious advantages: firstly, they train faster than other
multi-layer neural networks; secondly, their hidden layer is claimed to be easier to interpret
than the hidden layer of an MLP.
2.3 Fuzzy Rule Based systems
Fuzzy rule based systems are linguistic IF-THEN constructions of the general form "IF A THEN
B", where A and B are (collections of) propositions containing linguistic variables; A is called
the premise and B the consequence of the rule. In effect, the use of linguistic variables and
fuzzy IF-THEN rules exploits the tolerance for imprecision and uncertainty. In this respect,
fuzzy logic mimics the crucial ability of the human mind to summarize data and focus on
decision-relevant information.
Fuzzy rule based systems are very successful and popular in control system applications. They
outperform the conventional method of modelling non-linear control systems, which is based
on solving high order partial differential equations, through the simplicity of their inference.
However, a conventional fuzzy rule based system is also called a dense fuzzy rule based system,
because it suffers from a serious issue called rule explosion [1]. Rule explosion is the
exponential growth of the number of rules needed with respect to the number of fuzzy sets per
input dimension and the number of inputs. Equation 1.2 below gives the number of rules
required for a system with k input variables and T fuzzy sets per input dimension [7]:

|R| = O(T^k)

Equation 1.2: Rule explosion

As k or T increases, the number of rules rises sharply. There are four possible solutions for
modelling systems with a high number of inputs and/or fuzzy subsets within those inputs:
sparse fuzzy rule based systems, hierarchical fuzzy rule based systems, sparse hierarchical
fuzzy rule based systems, and fuzzy signatures. In our approach, we choose fuzzy signatures to
solve this issue.
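A quick illustration of Equation 1.2: for T = 3 fuzzy sets per input, the rule count explodes as the number of inputs k grows. A sketch in Python (the report's own code is Matlab):

```python
def rule_count(T, k):
    """Equation 1.2: a dense fuzzy rule base needs on the order of T**k
    rules for k input variables with T fuzzy sets per input dimension."""
    return T ** k

# Growth is exponential in the number of inputs k:
for k in (2, 5, 10):
    print(k, rule_count(3, k))  # 9, 243, 59049 for T = 3
```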
2.4 Fuzzy Signature
Computational Intelligence research focuses mainly on identifying approximate models for
decision support or classification where analytically unknown systems exist, especially systems
with very complex structured and/or high dimensional data, possibly with interdependent
features [3]. Traditional fuzzy logic approaches such as fuzzy rule based systems have become
popular in Computational Intelligence research because of their ability to assign linguistic
labels and to model uncertainty in many decision making and classification problems. However,
conventional fuzzy rule based systems suffer from high computational time complexity, so in
most cases applications of fuzzy rule based systems remain limited to problems with few input
dimensions and relatively simple structured data, even if the system being modelled exhibits
complex behaviour [1]. Aggregation of information in rule based fuzzy systems, including
sparse hierarchical fuzzy rule based systems, is generally done by min, max and average. This
is a restriction on conventional fuzzy systems, as it neglects the other membership values of the
same input.
A Fuzzy Signature is a Vector Valued Fuzzy Set (VVFS), where each vector component is
another VVFS (branch) or an atomic value (leaf). It can be described as below [7]:
A : X → [a_i]_{i=1}^{k}

where a_i = [a_ij]_{j=1}^{k_i} if a_i is a branch, and a_i ∈ [0, 1] if a_i is a leaf.

Equation 1.3: VVFS
If every fuzzy signature has the same structure and aggregation method, then a fuzzy signature
can be described as a vector; normally we use the min, max and average methods as
aggregation functions [2]. Figure 3 below shows an example fuzzy signature for a SARS patient,
and Figure 4 shows an aggregation based on max, then min, then average. These aggregation
functions transform the membership values of the fuzzy signature into a single fuzzy signature
and eventually a single value.
Figure 3: Example of structure of fuzzy signature of SARS patient
Figure 4: Example of aggregation of SARS patient
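The aggregation illustrated in Figures 3 and 4 can be sketched as a bottom-up reduction of a nested structure. The signature values and per-level functions below are hypothetical, chosen only to show the mechanism (the report's implementation is in Matlab):

```python
def aggregate(signature, agg_funcs, depth=0):
    """Reduce a nested-list fuzzy signature to a single value by applying
    one aggregation function per level, bottom-up."""
    if not isinstance(signature, list):
        return signature  # a leaf membership value in [0, 1]
    children = [aggregate(s, agg_funcs, depth + 1) for s in signature]
    return agg_funcs[depth](children)

mean = lambda values: sum(values) / len(values)

# Hypothetical two-branch signature: aggregate each branch with max,
# then combine the branch results with the average at the root:
sig = [[0.25, 0.75], [0.5, 0.25]]
print(aggregate(sig, [mean, max]))  # 0.625
```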
Fuzzy signatures bring three important advantages to fuzzy rule based systems. Firstly, they
reduce the high computation cost. Secondly, fuzzy signatures can handle noisy and missing
values by using specific aggregation functions. Thirdly, new information or features can be
added without redesigning the structure of the data representation.
However, in actual use, defining the structure and choosing the aggregation function rely on
professional knowledge, and in the presence of uncertain data these are difficult to define and
choose. To solve this issue, in our project we try to automate, and hence improve, the definition
of the structure and the selection of the aggregation function, to make fuzzy signatures more
general. In the next chapter, we discuss the improvements to the definition of the structure and
the selection of aggregation functions in more detail.
3. Fuzzy Signature Neural Network
3.1 Basic structure and process of fuzzy signature neural network
The fuzzy signature neural network is a type of neural network; in our project, the radial basis
function is treated as the activation function in the neurons [8]. Firstly, a number of input data,
either from the original data or from the outputs of other neurons, serve as inputs, and the
Euclidean distance is calculated from the evaluation point (each input is a point in a
multi-dimensional input space) to the sample point in each neuron. Additionally, each hidden
neuron (fuzzy signature neuron) in the neural network has a specific fuzzy signature associated
with it, and the output of the hidden neuron is the similarity between the input vector and the
fuzzy signature based on the
specific aggregation function. Thirdly, we calculate the output through the activation function,
and then use the equation below to modify the weights.

Δw_ij = α (t_j − y_j) x_i

where α is a constant with a small value, called the learning rate
t_j is the value of the desired output at dimension j
y_j is the value of the actual output at dimension j
x_i is the value of input i to the output layer

Equation 2.1: Weight update
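Equation 2.1 can be sketched as follows; the matrix shapes and learning rate are illustrative assumptions (the report's implementation is in Matlab):

```python
def update_weights(weights, hidden_outputs, desired, actual, alpha=0.1):
    """Equation 2.1 sketch: w_ij += alpha * (t_j - y_j) * x_i for the
    weight matrix between hidden and output neurons (alpha = learning rate)."""
    for i, x_i in enumerate(hidden_outputs):
        for j, (t_j, y_j) in enumerate(zip(desired, actual)):
            weights[i][j] += alpha * (t_j - y_j) * x_i
    return weights

w = [[0.0], [0.0]]  # 2 hidden neurons, 1 output neuron
w = update_weights(w, hidden_outputs=[1.0, 0.5], desired=[1.0], actual=[0.0])
print(w)  # [[0.1], [0.05]]
```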
The strengths between the input neurons and the hidden neurons are constants, which means
they do not change with training. The whole training task is performed only by the weight
matrix between the hidden neurons (fuzzy signature neurons) and the output neurons.
Therefore, the time to train these neural networks should be reduced thanks to the smaller
weight matrix and the reduced number of layers. Figure 5 shows the general architecture of the
fuzzy signature based radial basis function neural network.
[Figure: input neurons connect with fixed strengths to fuzzy signature (hidden) neurons, which connect through trainable weights to the output neurons.]
Figure 5: Example of Fuzzy signature based radial basis function neural network
3.2 Improvement of fuzzy signature neural network
In our project, we work on a network similar to the one presented by Fan; however, Fan's
application has some limitations. Firstly, the users must choose the aggregation function
themselves, so the results depend heavily on their choice. Secondly, each fuzzy signature
structure must be set by the users; in actual use, this choice must be based on professional or
practical knowledge, so selecting the structure of a fuzzy signature is quite difficult for common
users. The third weakness is that, once chosen, the same fuzzy signature structure and
aggregation function are used in constructing all fuzzy signatures in the network, so some
features in the data are likely to be missed if there are significant differences in the important
data in different clusters, as is likely for highly complex data with significantly different
sub-structures. Therefore, this implementation significantly extends the previous work,
re-designing and constructing the fuzzy signature based RBF network in a different and more
general way.
4. Design and Implementation of Fuzzy signature neural
network
4.1 Description
This chapter describes the techniques and methodology used to implement fuzzy signature
neural networks. The programming language used to implement the FSNN is Matlab. The
implementation of a fuzzy signature neural network can be divided into five modules, and the
testing into four modules. The fundamental architecture of the implementation and testing is
shown in Figure 6. The construction of the fuzzy signature neural network consists of damaging
the data with missing values, clustering, obtaining the fuzzy signature information, and creating
and training the network. The testing suite contains the remaining modules: extracting the
network information, testing the network, and collating the results. The construction of the
fuzzy signature neural network is embedded in the test suite, i.e. it forms part of the test suite.
[Figure: the construction pipeline (damaged input → clustering input → obtain fuzzy signature → create neural network → train neural network) is embedded in the testing suite (extracting network information → clustering input from testing data → testing neural network → collect the results).]
Figure 6: Construction of fuzzy signature neural network & testing suite
4.2 Construction of fuzzy signature neural network
This section describes the procedure to construct fuzzy signature neural networks in detail. The whole network procedure is divided into five different modules and will be introduced as
follows:
4.2.1 Dealing with data
In the real world, recorded data is not always perfect; missing data commonly occurs under
uncertain conditions. We add this module in order to simulate that situation. The damaged
data module is optional: we use it only when we need to simulate data with missing values.
The pseudocode below demonstrates the algorithm and process.
The damaged function rearranges the data so that it contains some missing values.
Input: data (the data we need to handle), rate (proportion of missing values in the whole data)
Output: damageddata

damaged(data, rate)
    inputCol ← number of columns of data
    inputRow ← number of rows of data
    total ← round(rate * inputCol * inputRow)      // total number of missing values in the data
    col ← total random integers in [1, inputCol]   // column positions
    row ← total random integers in [1, inputRow]   // row positions
    for i ← 1 to total
        data(row(i), col(i)) ← NaN                 // set those positions to a non-number
    end for
    damageddata ← data
end
In the above pseudocode, we first set the rate of missing values with which to damage the data.
We then calculate the total number of missing values, randomly choose that many positions in
the matrix, and set the entries at those positions to non-numbers. The damaged data set is then
ready.
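A possible Python counterpart to the pseudocode above (the report's code is Matlab; unlike the pseudocode, this sketch samples distinct positions, so exactly `total` entries are damaged rather than possibly fewer):

```python
import math
import random

def damaged(data, rate, seed=None):
    """Set round(rate * rows * cols) randomly chosen entries of a
    row-major list-of-lists matrix to NaN, simulating missing values."""
    rng = random.Random(seed)
    rows, cols = len(data), len(data[0])
    total = round(rate * rows * cols)
    # sample distinct flattened positions so exactly `total` entries are hit
    for pos in rng.sample(range(rows * cols), total):
        data[pos // cols][pos % cols] = float("nan")
    return data

d = damaged([[1.0, 2.0], [3.0, 4.0]], rate=0.5, seed=0)
print(sum(math.isnan(v) for row in d for v in row))  # 2
```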
4.2.2 Clustering
Clustering is the pre-processing method used to obtain the fuzzy signature neurons in this
implementation. Clustering is a credible unsupervised method; using a clustering technique
means that the users do not need to work out how to construct the fuzzy signatures themselves.
The second reason is that extracting fuzzy signatures manually is time consuming and can be
difficult, especially when the input data set contains large numbers of records.
There are many different clustering methods. In our approach, we use agglomerative
hierarchical clustering, which has some obvious advantages over other methods: first, users do
not need to specify the number of clusters; second, the algorithm is deterministic, which is
useful for research purposes; and third, the outputs are more informative than those of flat
clustering [9].
Agglomerative hierarchical clustering is a bottom-up hierarchical clustering method in which
each cluster can be another cluster's sub-cluster. It begins with each single object in a separate
cluster, then agglomerates similar clusters based on a similarity criterion until all data merge
into one cluster. The figure below shows an example of hierarchical clustering.
Figure 7: Example of Agglomerative hierarchical clustering
In the hierarchical clustering, the similarity criterion is based on the Ward linkage method
since it is efficient. It uses the incremental sum of squares to calculate the distance; normally
distance means the Euclidean distance. The sum of squares measure is defined as in the
formula below:
d(r, s) = \sqrt{\frac{2 n_r n_s}{n_r + n_s}} \, \lVert \bar{x}_r - \bar{x}_s \rVert_2

where \bar{x}_r and \bar{x}_s are the centroids of clusters r and s,
\lVert \bar{x}_r - \bar{x}_s \rVert_2 is the Euclidean distance between \bar{x}_r and \bar{x}_s,
n_r and n_s are the numbers of elements in clusters r and s.
Equation 2.2: Ward linkage distance (incremental sum of squares)
In our application, Matlab provides a package in the Statistics Toolbox for agglomerative hierarchical clustering with the Ward linkage method. The function linkage(inputdata, 'ward', 'euclidean') takes the input data set inputdata and the method name as arguments, then creates a hierarchical cluster tree. In this case, the formatted data is passed through this function together with a string parameter indicating the Ward linkage method. Furthermore, the function cluster(Z, 'maxclust', n) constructs clusters from the hierarchical cluster tree Z and the number of clusters n. The clustering module in this implementation uses both functions to cluster the formatted data set.
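As an illustration of this clustering step, the bottom-up merging with the Ward linkage distance can be sketched in plain Python (a simplified, illustrative version — the actual implementation uses the Matlab Statistics Toolbox, and the stopping rule here is a fixed cluster count):

```python
import math

def ward_distance(c1, c2):
    """Ward linkage: sqrt(2*n1*n2/(n1+n2)) times the Euclidean centroid distance."""
    n1, n2 = len(c1), len(c2)
    cen1 = [sum(p[d] for p in c1) / n1 for d in range(len(c1[0]))]
    cen2 = [sum(p[d] for p in c2) / n2 for d in range(len(c2[0]))]
    euclid = math.sqrt(sum((a - b) ** 2 for a, b in zip(cen1, cen2)))
    return math.sqrt(2.0 * n1 * n2 / (n1 + n2)) * euclid

def agglomerate(points, n_clusters):
    """Bottom-up clustering: start with singletons, repeatedly merge the closest pair."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        i, j = min(((a, b) for a in range(len(clusters))
                    for b in range(a + 1, len(clusters))),
                   key=lambda ab: ward_distance(clusters[ab[0]], clusters[ab[1]]))
        clusters[i] = clusters[i] + clusters[j]   # merge the two closest clusters
        del clusters[j]
    return clusters
```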
4.2.3 Create fuzzy signature
Before creating a fuzzy signature, we need to obtain its structure. In our application, each fuzzy signature has a different structure, which is created automatically. Additionally, the fuzzy signature neurons are based on the selected clusters (as described in Section 4.2.2); we then compute the membership values as described above, and finally our code chooses the best aggregation function to produce the final membership value of each hidden neuron (fuzzy signature neuron). The next three sections explain the application in detail.
4.2.3.1 Structure of fuzzy signature
Each fuzzy signature has a different structure, which has the advantage of reducing the risk of the results being affected by manual construction. In our project, we use a function called CreateS(n, input) to handle this part. Below is the pseudo code description of the algorithm and the process for creating the structure of the fuzzy signature.
The function to create the structure of a fuzzy signature
Input: input (the number of input dimensions), n (the number of fuzzy signatures we need to create)
Output: structure (a matrix which contains the structure of each fuzzy signature)
structure = CreateS(n, input)
1. For i = 1 to n // handle one fuzzy signature at a time
2.     For level = 1 to 10 // our application limits the structure to at most 10 levels
3.         Create a structure at this level, until the number of input dimensions is reached
4.     End loop
5. End loop
6. Return the structure
In our project, we divide this into three parts: each of the first two functions involves a loop, and the last function creates the structure of one fuzzy signature. Once creation is done, we obtain the structure. The matrix below is an example of the structure of fuzzy signatures for SARS, which has 8 input dimensions; here we create 2 fuzzy signatures.
Figure 8: Examples of structure of fuzzy signature for SARS
The first fuzzy signature has two levels: the first level covers the second to the seventh dimensions, and the second level runs from the 1st dimension to the end. The second fuzzy signature has 3 levels: the first covers the 1st to 3rd dimensions; after the aggregation function is applied to the first level, the second level covers the 1st to 6th dimensions, and the third level covers the 1st to 8th dimensions. The two vectors below present the detailed structures of the above two fuzzy signatures.
Figure 9: The example of structures of fuzzy signature according to Figure 8
4.2.3.2 Obtain fuzzy signature information
Each fuzzy signature demonstrates the similarity between input and the centroid of a specific
cluster in a data set. Therefore, a matrix would be extracted after the hierarchical clustering
method is applied on the formatted data set. It is essential to create a neural network that
contains the cluster’s information such as the centroid. The detailed structure information of
this matrix is shown below:
\begin{bmatrix}
c_{11} & \cdots & c_{1n} \\
c_{21} & \cdots & c_{2n} \\
\vdots &        & \vdots \\
c_{j1} & \cdots & c_{jn}
\end{bmatrix}

where j is the number of clusters,
n is the number of dimensions of the input data,
c_{jn} is the coordinate of the centroid point at dimension n in cluster j.
Equation 3.1: Information of the cluster
4.2.3.3 Aggregation
Before we do the aggregation, we need to select the aggregation function for each fuzzy signature. In our project, we have 3 aggregation functions: max, min, and average. In Fan's application, the function must be selected manually, which carries the risk of a poor choice of aggregation function. Therefore, in our application we improve this: following Mendis' work [10], we use a standard deviation comparison to select the aggregation function.
The standard deviation comparison has 4 steps. Firstly, we calculate all possible membership values of the fuzzy signature. Secondly, we find the average of all membership values. Thirdly, we find the standard deviation with respect to the average membership value. Finally, we find the position of the smallest standard deviation, which identifies the aggregation function we select. The figure below shows the process of finding the aggregation function for a 2-level fuzzy signature structure.
Figure 10: Example of aggregation function selection
(The figure shows candidate membership values of a two-level fuzzy signature aggregated by ave, max and min; the overall average is 0.42, the smallest standard deviation is 0.04, and the selected aggregation function is min/average.)
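The four-step selection can be sketched in Python (an illustrative simplification that assumes each candidate function aggregates the same rows of membership values; in the real implementation the candidates are applied across the levels of the fuzzy signature structure):

```python
def select_aggregation(membership_rows):
    """Pick the aggregation function whose aggregated results deviate least from their mean."""
    candidates = {
        "max": max,
        "min": min,
        "ave": lambda vals: sum(vals) / len(vals),
    }
    best_name, best_sd = None, None
    for name, fun in candidates.items():
        results = [fun(row) for row in membership_rows]   # step 1: all membership values
        mean = sum(results) / len(results)                # step 2: average membership value
        sd = (sum((r - mean) ** 2 for r in results) / len(results)) ** 0.5  # step 3
        if best_sd is None or sd < best_sd:               # step 4: smallest standard deviation
            best_name, best_sd = name, sd
    return best_name, best_sd
```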
4.2.4 Create neural network
Before the neural network is created, we need to set some parameters to satisfy initial
conditions. The following parameters need to be specified by the creation process:
1. Number of Fuzzy Signature neurons
2. Number of training epochs
Firstly, the number of fuzzy signature neurons and the number of training epochs are
determined by the user at the beginning of the program. The second parameter is the number
of training times for a dataset. The approach also provides some parameters automatically. The
size of the weight matrix is based on the number of hidden neurons and the number of output
dimensions. Furthermore, the neural network weight matrix values are initially set to small
random values. This provides the training procedure with a good starting stage to work
towards a solution. In this case, a function in the Matlab Statistics Toolbox called random is embedded into the implementation. It takes the number of columns and rows combined with a mean and standard deviation as arguments, then generates a matrix of random values. After all possible parameters have been determined, one data container called handles packs all of them and sends the package to the training module.
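As a rough Python illustration of this initialization step (the report uses Matlab's random-number function; the mean, standard deviation, and seed here are assumed values):

```python
import random

def init_weights(n_hidden, n_output, mean=0.0, sd=0.01, seed=42):
    """Build a weight matrix of small normally distributed random values."""
    rng = random.Random(seed)
    return [[rng.gauss(mean, sd) for _ in range(n_output)]
            for _ in range(n_hidden)]
```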
4.2.5 Training the Fuzzy Signature Neural Network
The training procedure helps us to find suitable values of the weight matrix which can make the
actual outputs more closely model the desired outputs. During this process, neurons learn the
weights iteratively by being given a number of training data. Therefore, providing a suitable
training method is an essential pre-condition for solving a specific task such as a classification
problem. In our approach, we use the training module to handle it.
The training module starts the neural network when training data is input. After receiving the training data, we calculate the Manhattan distance between the centroid of a cluster and the given input at each dimension, as in Figure 11, and the Gaussian function is then applied to those distances. The Gaussian function constrains the results to the range between 0 and 1, which represents the similarity between the given input and the centroid of the cluster. A value of 1 means the given input is the same as or very close to the centroid of the cluster; on the other hand, a value of 0 means the given input is far from the centroid. After the hidden neurons finalize the similarity
computations in all input dimensions, the results are encapsulated by the structure, that is, a
fuzzy signature.
Figure 11: Manhattan distance between the input data point and the centroid of a cluster
Manhattan distance = horizontal distance + vertical distance
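The distance-and-similarity computation can be sketched in Python (illustrative; the Gaussian width sigma is an assumed parameter, not a value given in the report):

```python
import math

def manhattan(x, centroid):
    """Sum of absolute coordinate differences between input and cluster centroid."""
    return sum(abs(a - b) for a, b in zip(x, centroid))

def gaussian_similarity(x, centroid, sigma=1.0):
    """Map the distance into (0, 1]: 1 means the input sits on the centroid."""
    d = manhattan(x, centroid)
    return math.exp(-(d ** 2) / (2.0 * sigma ** 2))
```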
In order to get a membership value of the fuzzy signature, a specific function called bestaggreatedata(f, s) takes the fuzzy signature f and structure s as arguments and produces the membership value.
After the above process, each membership value is used to compute the actual output. The formula below demonstrates the calculation of the actual output:
y = \sum_{i=1}^{n} w_i h_i

where y is the output of the neural network,
w_i is the weight of the i-th hidden neuron,
h_i is the value of the i-th hidden neuron,
n is the number of hidden neurons.
After we calculate the actual values, the neural network compares the difference between the actual values and the desired outputs, and then updates the weight matrix values based on the delta rule. The delta rule is a gradient descent update rule for tuning the weight matrix of the neurons. The training module in this implementation uses the delta rule to minimize the error between the actual output and the desired output. The learning rate can be determined by the user, with a default value of 0.01. The formula below represents a simplified form of the update function:
\Delta w_i = \alpha (t_i - y_i) x_i

where \alpha is a small constant called the learning rate,
t_i is the value of the desired output at dimension i,
y_i is the value of the actual output at dimension i,
x_i is the value of the input at dimension i.
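A single delta rule update over all weights might be sketched in Python as follows (illustrative; the weight layout and variable names are assumptions):

```python
def delta_rule_step(weights, hidden, desired, alpha=0.01):
    """Update each weight w[i][j] (hidden i -> output j) by alpha*(t_j - y_j)*h_i."""
    # actual outputs: weighted sums of the hidden-neuron values
    outputs = [sum(weights[i][j] * hidden[i] for i in range(len(hidden)))
               for j in range(len(desired))]
    for i in range(len(hidden)):
        for j in range(len(desired)):
            weights[i][j] += alpha * (desired[j] - outputs[j]) * hidden[i]
    return weights, outputs
```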
When all the training data has been used in this module, one training epoch is complete, but that is not the end of the training phase; training is not complete until all the epochs have finished.
4.3 Testing
In the testing part, we have three modules: testing the network, collecting results, and extracting network information. The two sections below describe them in detail.
4.3.1 Testing network
Before testing the network, the dataset needs to be reorganized to create more accurate benchmarks, since the training set and testing set are kept separate. The reorganization performed in this test suite follows the k-fold cross validation scheme. For example, if the variable k is set to 4 by the user, both the input data set and the desired data set are divided into 4 subsets, where three of them are randomly treated as the training set and the remaining one is treated as the test set; the total number of iterations is 4. The advantage of this is that the user is able to choose how large each test set is and how many independent iterations are averaged over.
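The k-fold reorganization can be sketched in Python (illustrative; the real test suite shuffles and splits in Matlab):

```python
import random

def kfold_splits(n_samples, k, seed=0):
    """Split sample indices into k folds; each fold serves once as the test set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # random assignment to folds
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test                          # one train/test iteration
```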
After re-organizing the raw data, the network starts the testing part. In the dataset, each desired output is a vector of binary values that represents the class to which the observation belongs. On the other hand, the actual result from the neural network is usually a vector of decimal numbers. Therefore, a specific mapping function is performed to produce an optimized output. Firstly, we obtain the actual values from the neural network. Secondly, we find the position of the maximum value. Thirdly, we create a new vector of the same size as the output vector and set all its elements to 0. Fourthly, we set the element at the position found in the second step to 1. Finally, we return this vector as the output. Once the optimized output has been generated, the module compares it with the desired output and produces the accuracy and mean squared error values, one by one. Furthermore, the final benchmarks, including the average accuracy rate and average mean squared error, are generated at the end of the procedure. This module also shows the trend of the benchmarks in diagrams, so users can easily judge the quality of the network from those benchmarks.
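The five mapping steps above amount to finding the argmax and one-hot encoding it; a minimal Python sketch (function name illustrative):

```python
def optimize_output(actual):
    """Turn a vector of decimal outputs into a binary class vector (steps 1-5 above)."""
    winner = max(range(len(actual)), key=lambda i: actual[i])  # position of the maximum
    return [1 if i == winner else 0 for i in range(len(actual))]
```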
4.3.2 Extracting network information
Once the neural network has been trained and tested, all information that directly relates to the network has been determined. This module collects the network information as a kind of benchmark and stores it in files. There are 5 files: Cluster_detail.txt, Traning_detail.txt, Testing_detail.txt, Final_result.txt, and Structure.txt. Table 1 below describes the files and the benchmarks associated with them. Appendix A at the end of this report also shows an example of network information based on the "Wine" data set that is associated with the experiments and evaluation chapter.
File name Benchmarks Description
Cluster_detail.txt This file describes cluster information in detail. It
includes the class distribution, centroid value, minimum
value and maximum value for each cluster.
Traning_detail.txt This file contains training information
Testing_detail.txt This file shows testing information based on the trained
network. It includes the desired output, optimized
output and actual output information for unmatched
results.
Structure.txt This file shows detailed information about each cluster.
E.g.: 8 inputs dimensions. 1,8; 1,2 3,7 1 8;
Final_result.txt This file shows the general result benchmarks. It
includes accuracy rate and mean square error by
iterations. Moreover, the average accuracy rate and
mean squared error is included as well.
Table 1: Information of files
5. Experiments and Evaluation
This chapter introduces three experiments, their evaluation, and comparison with other results. Experiments 1 and 2 use the same data sets; however, in experiment 2 we damage the dataset with some missing values. We wish to evaluate the advantages of our approach when the dataset has missing values. In experiment 3, our approach is compared with other neural networks, whose results are taken from the related academic literature. Before we experiment and evaluate, we need to describe the basic information for the experiments.
5.1 Description of the dataset
This section describes 10 data sets from the University of California Irvine (UCI) machine learning data set repository. Table 2 below shows the general information for these data sets [12].
Data set  Number of input dimensions  Number of output dimensions  Number of observations
Wine 13 3 178
Thyroid 21 3 7200
SARS 8 4 4000
Ionosphere 34 2 351
Horse 58 3 364
Heart 35 2 920
Heartc 35 2 303
Diabetes 8 2 768
Card 51 2 690
Cancer 8 2 699
Table 2: Datasets details
Descriptions for all data sets except SARS are listed at the UCI website. SARS is from Wong [2].
A brief synopsis of each data set is as follows:
Wine
This data set is about a chemical classification of three different types of wine grown in the same region of Italy. It consists of 13 input dimensions which describe the properties of the wine and three output dimensions which indicate the three types of wine.
Thyroid
This data set refers to a classification task that diagnoses patients’ thyroid condition. There are
21 inputs, 7200 examples. The desired output has 3 categories which indicate whether the
patient’s thyroid has over function, normal function, or under function.
SARS
This data set describes the severe acute respiratory syndrome (SARS) patients’ information
such as fever temperature at different time, blood pressure, nausea and abdominal pain by 8
input dimensions. The 4 outputs indicate 4 types of patients. They are SARS, normal,
pneumonia and hypertension. The total number of observations is 4,000 (Wong et al [2]).
Ionosphere
This data set refers to radar information that was obtained by a system in Goose Bay, Labrador.
It is used to determine whether some type of structure exists in the ionosphere. There are 34
input dimensions that indicate values of the electromagnetic signal, and 2 desired output
dimensions show whether the structure exists or not. Therefore, this is a binary classification
task. Total number of observations is 351.
Horse
This data set is about a classification task which predicts the fate of a colic horse. It demonstrates whether the horse will survive, will die or will be euthanized based on the result
of veterinary examination.
Heart
This data set represents the prediction of heart disease. There are 35 input dimensions associated with personal data such as age, sex, smoking habits, subjective patient pain
descriptions and results of various medical examinations such as blood pressure and electro
cardiogram result. The 2 output dimensions determine whether at least one of four vessels is
reduced in diameter by more than 50%.
Heartc
This is an alternate version of the “Heart” data set. The structure of input and output is the
same as the heart data set. The difference between “Heart” and “Heartc” is the “Heartc” dataset
comes from a different source, the Cleveland Clinic Foundation.
Diabetes
This data set is about a classification task that diagnoses Pima Indians' diabetes. The aim of the data set is to determine whether a Pima Indian individual is diabetes positive or not.
Card
This data set refers to a binary classification task that predicts whether a customer’s credit card
should be approved or not. There are 51 inputs that represent a real credit card application.
The 2 outputs show the decision on the credit card.
Cancer
This data set is about a classification task that diagnoses patients’ breast cancer. The main task
indicates a tumor as either benign or malignant.
5.2 Hardware and software environment information
We use the same Matlab version and computer to test our approach and Fan's, because we need to avoid the situation where different hardware and software leads to different results. Table 3 below shows the details of the hardware and software environment:
MATLAB version 7.8.0 (R2009) x64
Operating system Win7 Ultimate x64
CPU Intel® Core™ i7-2630QM CPU @ 2.00GHz
Installed memory (RAM) 6.00 GB
Hard disk 750 GB
Table 3: Information on the hardware and software environment
5.3 Experiment 1: Datasets experiments with no missing data
5.3.1 Purpose of the experiment
The aim of this experiment is to discover the feasibility and performance of fuzzy signature based neural networks on a range of data sets. This experiment obtains the results of our approach and Fan's on datasets with no missing data; through the experiment and the comparison between ours and Fan's, we try to find the advantages and disadvantages of our fuzzy signature neural network. Finally, we analyze the reasons for the differences in the results.
5.3.2 Description of the experiment
This experiment produces related benchmarks with the data sets described in section 5.1 and compares them with Fan's classification approach. To reduce the risk that each result differs due to the initial conditions, we run the experiment 5 times for each dataset, and we use the same number of hidden neurons (fuzzy signature neurons) each time. Additionally, in Fan's approach, we use the average method as the aggregation function. Both approaches are limited to 100 training epochs, with 20% of the data as the testing dataset and 80% as the training dataset. Table 4 below shows the number of fuzzy signature neurons.
Data set  Number of input dimensions  Number of hidden neurons
Wine 13 5-10
Thyroid 21 13-18
SARS 8 2-6
Ionosphere 34 18-25
Horse 58 20-25
Heart 35 17-23
Heartc 35 17-23
Diabetes 8 2-6
Card 51 20-25
Cancer 8 2-7
Table 4: details of the number of clusters (hidden neurons) selected
5.3.3 Experiment Process and Discussion of Results
As mentioned in the last section, various numbers of fuzzy neurons have been used to find the maximum accuracy rate and minimum mean squared error for each dataset. The locally optimized results for all data sets are listed in the table below. More specifically, each row in table 5 represents one specific data set benchmark, and includes the mean accuracy rate (the result of 5-fold cross validation) and the mean squared error for both the training and testing data sets. For each data set benchmark, we use the average over the 5 folds.
Data set Training data set Testing data set
Mean (%) MSE Mean (%) MSE
Wine 93 0.0679 95 0.0714
Thyroid 92.7 0.042 92.8 0.043
SARS 94.13 0.086 94.08 0.085
Ionosphere 87.28 0.115 87.42 0.126
Horse 64.9 0.166 62.7 0.167
Heart 80.76 0.144 79.96 0.149
Heartc 79.97 0.148 78.70 0.156
Diabetes 72.80 0.178 73.37 0.175
Card 82.10 0.131 83.04 0.133
Cancer 96.3506 0.041 96.285 0.042
Average 84.4 0.11 84.34 0.11
Table 5: the benchmark for our approach
In the table above, there is no generally optimum number of fuzzy neurons in this experiment;
moreover, each data set’s accuracy rate is different. Table 6 below shows Fan’s results which are
not very different.
Data set Training data set Testing data set
Mean (%) MSE Mean (%) MSE
Wine 91.64 0.0635 92.22 0.0662
Thyroid 92.6 0.0421 92.48 0.0424
SARS 93.317 0.0822 93.57 0.0829
Ionosphere 90.39 0.0949 90.85 0.0913
Horse 63.43 0.1575 63.58 0.1542
Heart 81.9 0.131 80 0.14
Heartc 80.49 0.133 80.98 0.138
Diabetes 70.78 0.204 68.31 0.191
Card 80.65 0.152 78.84 0.153
Cancer 96.14 0.038 96.42 0.0376
Average 84.13 0.11 83.73 0.096
Table 6: the benchmark for Fan's approach
From the two tables above, we can see that the overall results differ only a little between our fuzzy signature neural network and Fan's fuzzy signature RBF neural network. Although the accuracy rates of our training and testing datasets are higher than Fan's, the mean squared error of our approach on the testing dataset is also higher than Fan's. So we do not have any obvious evidence that our algorithm has advantages over Fan's approach. Additionally, for specific datasets, the accuracy rates on the wine, thyroid, SARS, diabetes, card, and cancer datasets are a little higher than Fan's, while the accuracy rates on the other datasets are a little lower. So there are 6 specific datasets where our result is higher than Fan's and 4 specific datasets where our result is slightly lower. Based on the benchmarks above, we still cannot conclude which algorithm is better. Therefore, we performed a second experiment in which datasets with missing values are used to evaluate the differences in the effects between our algorithm and Fan's.
5.4 Experiment 2: Datasets experiments with missing data
5.4.1 Purpose of the experiment
In experiment 1, we did not find any significant difference between our approach and Fan's from the benchmarks. Therefore, we test our approach and Fan's on datasets with missing data. The aim of this experiment is still to discover the feasibility and performance of fuzzy signature based neural networks on a range of data sets. Through the experiment and the comparison between ours and Fan's, we try to find the advantages and disadvantages of our fuzzy signature neural network.
5.4.2 Description of the experiment
In this experiment, we damage the datasets with 20% missing values; the method for introducing the missing values was described in section 4.2.1. For Fan's approach, we also need to handle the missing values: Fan's approach uses the average value of the dimension in place of each missing value. After dealing with the missing values, we use otherwise exactly the same data and numbers of hidden neurons as in experiment 1. All the information can be seen in section 5.3.2.
5.4.3 Experiment Process and Discussion of Results
As in experiment 1, various numbers of fuzzy neurons have been used to find the maximum accuracy rate and minimum mean squared error for each dataset. Table 7 below shows the average values of the accuracy rate and mean squared error for our approach.
Data set Training data set Testing data set
Mean (%) MSE Mean (%) MSE
Wine 93.024 0.07 91.776 0.08
Thyroid 92.556 0.042 92.94 0.041
SARS 86.436 0.1 85.6 0.1
Ionosphere 87.5982 0.12 86.91 0.11
Horse 64.662 0.15 63.966 0.18
Heart 79.466 0.14 79.626 0.14
Heartc 78.342 0.17 77.532 0.18
Diabetes 69.64 0.19 69.9 0.19
Card 77.068 0.155 75.6 0.151
Cancer 95.56 0.04 96 0.05
Average 82.43522 0.117 81.985 0.122
Table 7: the benchmark for our approach with missing value
Because the datasets have missing values, all the benchmark results have decreased slightly compared with experiment 1. However, the decrease is slight and the results remain generally stable; therefore, our approach works quite well on datasets with missing values. After that, we tested Fan's approach under the same conditions, with 20% missing values. Table 8 below shows the benchmark for Fan's approach.
Data set Training data set Testing data set
Mean (%) MSE Mean (%) MSE
Wine 57.316 0.2 55.046 0.21
Thyroid 92.538 0.05 92.84 0.05
SARS 74.56 0.1 74.28 0.1
Ionosphere 88.94 0.08 88.76 0.09
Horse 65.02 0.16 64.24 0.16
Heart 82.52 0.15 82.42 0.14
Heartc 79.94 0.15 79.22 0.15
Diabetes 71.08 0.19 70.74 0.19
Card 78.86 0.15 78.88 0.16
Cancer 96.22 0.04 96.36 0.04
Average 78.6994 0.127 78.2786 0.129
Table 8: the benchmark for Fan's approach with missing value
From Fan's benchmark, the accuracy rates of both the training datasets and the testing datasets show a larger decrease, with the accuracy rate of the training datasets going from 84.13 to 78.70, and the accuracy rate of the testing datasets from 83.73 to 78.28. Compared with our approach, this represents worse results in two aspects. The first aspect is that all of the benchmarks of our approach are higher than those of Fan's approach. Secondly, the effect of missing values on our approach is smaller than on Fan's. On two datasets, wine and SARS, the results of Fan's approach decreased sharply: the accuracy rate on the wine dataset dropped from about 92% to 57%, while in our approach the effect is slight and the result stays almost the same at about 93%. The same effect happens on the SARS dataset.
According to the results above, we conclude that Fan's fuzzy signature RBF neural network has an obvious limitation: it depends highly on data integrity. On the other hand, our approach is less affected when the datasets have missing values. Because the aggregation method of our approach and the structure of the fuzzy signature are suited to more extreme cases, it is less affected when the data is incomplete. In future work we should examine eliminating 20% of the data values and using those datasets to construct the fuzzy signatures for both our approach and Fan's.
5.5 Experiment 3: Benchmarks comparison between Fuzzy
signature neural network and other approaches
5.5.1 Purpose of the experiment
The purpose of this experiment is to evaluate the performance of our approach compared with other neural networks. We can then find ways to improve our approach.
5.5.2 Description of the experiment
In this experiment, our approach is compared with two other neural networks, Cascor and k-sNN. In order to make this comparison, we first find the benchmarks on which both Cascor and k-sNN have been tested by other researchers [13, 14]. The heart, horse, diabetes and cancer datasets have been tested with Cascor and k-sNN. After that, we find the most optimized results for each approach. Table 9 below shows our parameter values, namely the number of fuzzy neurons and the number of epochs.
Dataset Number of fuzzy neurons Epochs
Heart 30 200
Horse 35 200
Diabetes 16 400
Cancer 18 400
Table 9: the information of our approach that will be used
5.5.3 Experiment Process and Discussion of Results
We used the parameters from the last section to test our approach 5 times, found the accuracy rates of the training and testing datasets, calculated the average accuracy rates, and then obtained the other neural networks' accuracy rates. Table 10 below shows this information.
Dataset Fuzzy signature
neural network
Cascor K-sNN
Heart 80.6 80.1 75.1
Horse 68 73.6 70.9
Diabetes 76.7 76.5 69.8
Cancer 96.3 98.1 62.5
Table 10: the results of 3 different neural networks
From the table above, the best performer is Cascor; our approach is slightly lower than Cascor, while k-sNN is the worst. On the horse dataset, our approach has the worst performance; on the other hand, on the heart and diabetes datasets our approach has the best accuracy rates. From the viewpoint of processing speed, the fuzzy signature neural network is the fastest, and the others are slower.
6. Conclusion and Future Work
6.1 Conclusion
We have implemented our approach (fuzzy signature neural network) and Fan’s approach
(fuzzy signature RBF neural network), and then compared their advantages and disadvantages.
The techniques and methodologies to implement such a network, along with suitable test cases, have been demonstrated. We used real-world data sets from UCI; the benchmarks were compared across different parameters of the network and different classification methods. Our approach achieved stable and good results; especially when using datasets with missing values, it still provides stable and good benchmarks. Therefore the approach is viable and worth further investigation.
6.2 Future Work
In this report, our approach (the fuzzy signature neural network) shows good performance on classification datasets; however, this approach still has room for improvement. Therefore we give 3 major suggestions for future work, beyond the few minor suggestions already mentioned in various sections earlier in this report.
Firstly, for the aggregation function, we only choose among 3 methods, max, min, and average, to handle the aggregation process. This is somewhat simple; we could add a weight matrix during the aggregation process. For the weighted method, we may use the weights learning introduced by Mendis, 2008 [1]. It is a more complex aggregation method, and we would compare benchmarks to see the result, which we expect would improve.
Secondly, our approach is a 3-layer neural network, which means only one hidden layer is used. This has the disadvantage that it may lose some features in the fuzzy signature neural network. Based on this, we suggest adding a hidden layer and then using the back propagation algorithm to modify the weights, which may find new features in the dataset.
Finally, this approach has a limitation for general real-world tasks, because it is suited to classification datasets and has not been tested on regression datasets. In order to extend the types of dataset, for the fuzzy part we could use the polymorphic fuzzy signature instead of our fuzzy signature. At the same time, the neural network could use the Cascor neural network instead of the RBF neural network [15]. Because the polymorphic fuzzy signature and the Cascor neural network each have better performance individually, we consider that combining these two methods into our approach would give better performance [1, 15].
Reference
[1] B.S.U. Mendis, 2008, Fuzzy Signatures: Hierarchical Fuzzy Systems and Applications, PhD thesis, Department of Computer Science, The Australian National University, Australia, March
2008.
[2] K.W. Wong, T.D. Gedeon, and L.T. Kóczy, “Construction of Fuzzy Signature from Data: An Example of SARS Pre-clinical Diagnosis System”, in Proceedings of the IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2004), 2004, pp. 1353.
[3] W. Fan, Fuzzy Signature based Radial Basis Neural Network, Master thesis, Department of Computer Science, The Australian National University, Australia, Nov 2011.
[4] T. Gedeon, K. Wong, and D. Tikk, “Constructing hierarchical fuzzy rule bases for classification”, in Proceedings of the 10th IEEE International Conference on Fuzzy Systems, 2001, pp. 1388-1391.
[5] M. Minsky and S. Papert, Perceptrons, Cambridge, MA: MIT Press, 1969, pp. 115-138.
[6] M.D. Buhmann, Radial Basis Functions: Theory and Implementations, 1st edn, Cambridge University Press, 2003.
[7] B.S.U. Mendis, T.D. Gedeon, and L.T. Kóczy, “Flexibility and robustness of hierarchical fuzzy signature structures with perturbed input data”, in Proceedings of the International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU), 2006, pp. 2552-2559.
[8] P. Strumillo and W. Kaminski, “Radial Basis Function Neural Networks: Theory and Applications”, in Proceedings of the Sixth International Conference on Neural Networks and Soft Computing, Zakopane, Poland, 2006, pp. 107-119.
[9] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning, 2nd edn, New York: Springer, 2009, pp. 520-528.
[10] B.S.U. Mendis and T.D. Gedeon, “Complex Structured Decision Making Model: A hierarchical framework for complex structured data”, Information Sciences, vol. 194, pp. 85-106, July 2011.
[11] G. Karypis, E.-H. Han, and V. Kumar, “Chameleon: Hierarchical Clustering Using Dynamic Modeling”, Computer, vol. 32, no. 8, pp. 68-75, 1999.
[12] The UCI Machine Learning Repository, 1987, Center for Machine Learning and Intelligent Systems, viewed 15 Feb 2011, <http://archive.ics.uci.edu/ml/datasets.html>.
[13] N.K. Treadgold and T.D. Gedeon, “Exploring constructive cascade networks”, in Proceedings of the 27th Annual Conference of the IEEE Industrial Electronics Society (IECON '01), vol. 1, 2001, pp. 25-30.
[14] R. Nock, M. Sebban, and D. Bernard, “A Simple Locally Adaptive Nearest Neighbor Rule with Application to Pollution Forecasting”, International Journal of Pattern Recognition and Artificial Intelligence, 2003, pp. 1369-1382.
[15] S. Fahlman and C. Lebiere, “The Cascade-Correlation Learning Architecture”, in Advances in Neural Information Processing Systems 2, D.S. Touretzky, ed., 1990, pp. 524-532.
Appendix A
The files below show an example of the text benchmarks for the wine dataset, with 6 fuzzy neurons, 100 epochs, and 20% missing values.
File: cluster_detail.txt
Details for 6 clusters
Details for cluster number 1 (14 elements):
Members in class 1: 7 (50.000000%)
Members in class 2: 2 (14.285714%)
Members in class 3: 5 (35.714286%)
Cluster centroid -
13.030 2.438 2.476 20.036 108.214 2.258 1.885 0.356 1.554 5.494 0.891
2.637 865.429
Minimum values of each argument
11.960 1.090 2.100 15.200 96.000 1.100 0.550 0.130 1.140 3.050 0.590
1.560 830.000
Maximum values of each argument
13.870 4.280 3.220 27.000 124.000 3.380 3.030 0.630 2.030 10.200 1.250
3.820 920.000
Details for cluster number 2 (44 elements):
Members in class 1: 6 (13.636364%)
Members in class 2: 16 (36.363636%)
Members in class 3: 22 (50.000000%)
Cluster centroid -
12.940 2.513 2.374 19.800 100.932 2.086 1.496 0.397 1.490 5.850 0.873
2.296 688.568
Minimum values of each argument
11.450 0.940 1.700 13.200 70.000 1.350 0.470 0.140 0.410 1.740 0.540
1.270 615.000
Maximum values of each argument
14.340 5.650 2.870 25.000 151.000 3.520 3.750 0.630 2.760 13.000 1.310
3.710 795.000
Details for cluster number 3 (28 elements):
Members in class 1: 0 (0.000000%)
Members in class 2: 26 (92.857143%)
Members in class 3: 2 (7.142857%)
Cluster centroid -
12.350 2.254 2.241 20.411 90.321 2.371 2.177 0.350 1.607 3.078 1.024
2.772 376.321
Minimum values of each argument
11.030 0.740 1.710 16.000 80.000 0.980 0.340 0.170 0.680 1.900 0.580
1.330 278.000
Maximum values of each argument
13.880 5.800 2.750 26.500 116.000 3.500 3.150 0.660 2.910 7.100 1.710
3.640 438.000
Details for cluster number 4 (44 elements):
Members in class 1: 0 (0.000000%)
Members in class 2: 25 (56.818182%)
Members in class 3: 19 (43.181818%)
Cluster centroid -
12.619 2.683 2.344 21.086 94.273 1.852 1.457 0.418 1.314 4.755 0.887
2.280 520.182
Minimum values of each argument
11.460 0.890 1.360 10.600 78.000 1.150 0.470 0.170 0.420 1.280 0.480
1.290 450.000
Maximum values of each argument
14.130 5.510 3.230 28.500 123.000 3.180 5.080 0.630 3.580 10.800 1.360
3.690 607.000
Details for cluster number 5 (28 elements):
Members in class 1: 26 (92.857143%)
Members in class 2: 2 (7.142857%)
Members in class 3: 0 (0.000000%)
Cluster centroid -
13.674 1.952 2.367 16.968 106.714 2.825 2.940 0.279 1.961 5.149 1.054
3.172 1067.571
Minimum values of each argument
12.470 1.350 2.040 11.200 90.000 2.350 2.270 0.170 1.250 2.600 0.870
2.510 937.000
Maximum values of each argument
14.830 4.040 2.670 30.000 162.000 3.880 3.740 0.430 3.280 7.220 1.310
4.000 1195.000
Details for cluster number 6 (20 elements):
Members in class 1: 20 (100.000000%)
Members in class 2: 0 (0.000000%)
Members in class 3: 0 (0.000000%)
Cluster centroid -
13.921 1.769 2.498 17.200 106.650 2.908 3.082 0.295 1.908 6.323 1.117
3.008 1360.850
Minimum values of each argument
13.290 1.430 2.140 12.000 89.000 2.200 2.190 0.190 1.250 3.950 0.860
2.650 1235.000
Maximum values of each argument
14.390 2.160 2.720 22.500 132.000 3.850 3.930 0.500 2.960 8.900 1.280
3.580 1680.000
File: training_detail.txt
weight matrix learnt after run number 1
-0.0146 0.0066 0.0026 -0.0197 -0.0083 0.0163
-0.0146 0.0262 0.0096 -0.0197 0.0243 -0.0298
0.0136 0.0072 0.0068 0.0290 -0.0124 0.0058
weight matrix learnt after run number 2
-0.0368 -0.0072 -0.0028 -0.0195 -0.0077 -0.0107
0.0119 0.0037 0.0171 -0.0132 0.0007 -0.0203
-0.0044 -0.0013 -0.0086 0.0118 0.0119 0.0060
weight matrix learnt after run number 3
-0.0068 -0.0083 0.0200 0.0310 -0.0165 -0.0082
0.0346 -0.0050 -0.0272 0.0054 -0.0136 -0.0315
0.0020 0.0079 -0.0154 -0.0263 -0.0315 0.0198
weight matrix learnt after run number 4
0.0197 -0.0085 -0.0177 -0.0008 0.0456 0.0014
-0.0131 -0.0109 0.0040 0.0124 -0.0182 0.0088
-0.0052 0.0116 0.0035 0.0044 0.0219 0.0009
weight matrix learnt after run number 5
0.0149 0.0092 -0.0074 0.0064 -0.0058 -0.0034
-0.0086 -0.0037 -0.0222 -0.0125 -0.0143 -0.0187
0.0122 -0.0001 -0.0286 -0.0120 -0.0124 0.0176
File: testing_detail.txt
Failed results in run number 1
Observation 69 does not match the desired result
Actual classification - 0 1 0
Neural net classification (rounded) - 0 0 1
Neural net classification (actual) - 0.03 0.28 0.52
Observation 82 does not match the desired result
Actual classification - 0 1 0
Neural net classification (rounded) - 0 0 1
Neural net classification (actual) - 0.30 0.46 0.56
Observation 89 does not match the desired result
Actual classification - 0 1 0
Neural net classification (rounded) - 0 0 1
Neural net classification (actual) - 0.21 0.43 0.52
Observation 138 does not match the desired result
Actual classification - 0 0 1
Neural net classification (rounded) - 0 1 0
Neural net classification (actual) - -0.07 0.39 0.33
Observation 145 does not match the desired result
Actual classification - 0 0 1
Neural net classification (rounded) - 0 1 0
Neural net classification (actual) - 0.15 0.60 0.23
Observation 172 does not match the desired result
Actual classification - 0 0 1
Neural net classification (rounded) - 0 1 0
Neural net classification (actual) - -0.14 0.57 0.48
Failed results in run number 2
Observation 39 does not match the desired result
Actual classification - 1 0 0
Neural net classification (rounded) - 0 1 0
Neural net classification (actual) - 0.44 0.46 0.11
Observation 62 does not match the desired result
Actual classification - 0 1 0
Neural net classification (rounded) - 0 0 1
Neural net classification (actual) - -0.11 0.55 0.67
Observation 147 does not match the desired result
Actual classification - 0 0 1
Neural net classification (rounded) - 0 1 0
Neural net classification (actual) - 0.21 0.27 0.24
Observation 159 does not match the desired result
Actual classification - 0 0 1
Neural net classification (rounded) - 0 1 0
Neural net classification (actual) - 0.32 0.42 0.16
Observation 161 does not match the desired result
Actual classification - 0 0 1
Neural net classification (rounded) - 0 1 0
Neural net classification (actual) - -0.38 0.87 0.59
Failed results in run number 3
Observation 36 does not match the desired result
Actual classification - 1 0 0
Neural net classification (rounded) - 0 1 0
Neural net classification (actual) - 0.40 0.42 0.17
Observation 69 does not match the desired result
Actual classification - 0 1 0
Neural net classification (rounded) - 0 0 1
Neural net classification (actual) - 0.10 0.31 0.48
Observation 145 does not match the desired result
Actual classification - 0 0 1
Neural net classification (rounded) - 0 1 0
Neural net classification (actual) - 0.06 0.67 0.24
Observation 171 does not match the desired result
Actual classification - 0 0 1
Neural net classification (rounded) - 0 1 0
Neural net classification (actual) - -0.09 0.68 0.45
Observation 172 does not match the desired result
Actual classification - 0 0 1
Neural net classification (rounded) - 0 1 0
Neural net classification (actual) - -0.18 0.62 0.55
Failed results in run number 4
Observation 22 does not match the desired result
Actual classification - 1 0 0
Neural net classification (rounded) - 0 0 1
Neural net classification (actual) - 0.36 0.30 0.37
Observation 26 does not match the desired result
Actual classification - 1 0 0
Neural net classification (rounded) - 0 0 1
Neural net classification (actual) - 0.43 0.02 0.50
Observation 28 does not match the desired result
Actual classification - 1 0 0
Neural net classification (rounded) - 0 1 0
Neural net classification (actual) - 0.54 0.64 -0.02
Observation 63 does not match the desired result
Actual classification - 0 1 0
Neural net classification (rounded) - 0 0 1
Neural net classification (actual) - 0.30 0.22 0.48
Observation 64 does not match the desired result
Actual classification - 0 1 0
Neural net classification (rounded) - 1 0 0
Neural net classification (actual) - 0.50 0.48 -0.03
Observation 147 does not match the desired result
Actual classification - 0 0 1
Neural net classification (rounded) - 1 0 0
Neural net classification (actual) - 0.26 0.07 0.25
Failed results in run number 5
Observation 24 does not match the desired result
Actual classification - 1 0 0
Neural net classification (rounded) - 0 1 0
Neural net classification (actual) - 0.34 0.42 0.28
Observation 36 does not match the desired result
Actual classification - 1 0 0
Neural net classification (rounded) - 0 1 0
Neural net classification (actual) - 0.38 0.40 0.19
Observation 97 does not match the desired result
Actual classification - 0 1 0
Neural net classification (rounded) - 0 0 1
Neural net classification (actual) - -0.05 0.32 0.43
File: structure.txt
There are 6 clusters in it.
Cluster 1 :
2,5
6,11
1,13,2
Cluster 2 :
2,4
5,12
1,13,2
Cluster 3 :
2,3
4,7
8,12
1,13
Cluster 4 :
2,4
5,11
1,13
Cluster 5 :
2,7
8,10
11,12
1,13
Cluster 6 :
2,5
6,11
1,13
File final_result.txt
Final result for iteration 1
training data set success rate = 0.880282
testing data set success rate = 0.833333
training data set mean square error = 0.083936
testing data set mean square error = 0.099394
Final result for iteration 2
training data set success rate = 0.908451
testing data set success rate = 0.861111
training data set mean square error = 0.077219
testing data set mean square error = 0.098365
Final result for iteration 3
training data set success rate = 0.901408
testing data set success rate = 0.861111
training data set mean square error = 0.081429
testing data set mean square error = 0.102125
Final result for iteration 4
training data set success rate = 0.901408
testing data set success rate = 0.833333
training data set mean square error = 0.086662
testing data set mean square error = 0.084951
Final result for iteration 5
training data set success rate = 0.866197
testing data set success rate = 0.916667
training data set mean square error = 0.085341
testing data set mean square error = 0.081400
Final benchmarks
Average training success rate = 0.891549
Average testing success rate = 0.861111
Average training mean square error = 0.082917
Average testing mean square error = 0.093247
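As a check, the "Final benchmarks" above are simply the arithmetic means of the five per-iteration values; the following sketch reproduces the two average success rates from the per-iteration figures in final_result.txt:

```python
# Per-iteration success rates copied from final_result.txt above.
train_rates = [0.880282, 0.908451, 0.901408, 0.901408, 0.866197]
test_rates = [0.833333, 0.861111, 0.861111, 0.833333, 0.916667]

avg_train = sum(train_rates) / len(train_rates)
avg_test = sum(test_rates) / len(test_rates)
print(f"Average training success rate = {avg_train:.6f}")  # 0.891549
print(f"Average testing success rate = {avg_test:.6f}")    # 0.861111
```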
The screenshot for the example: