Bacterial foraging trained wavelet neural
networks: Application to bankruptcy
prediction in banks
Project Report
Institute for Research and Development in Banking Technology
(IDRBT)
Road No. 1, Castle Hills, Masab Tank,
Hyderabad - 500057
Project Supervisor:
Dr. V. Ravi
(Assistant Professor, IDRBT)
Submitted by:
Paramjeet (07HS2004)
2nd year Student
Integrated M. Sc. in Economics
Department of Humanities and Social Sciences
IIT Kharagpur
Kharagpur, West Bengal 721302
CERTIFICATE
This is to certify that this project has been successfully completed to
my satisfaction and that the goals set at the outset of this endeavor
have been worked upon to the best of the student's abilities and
resources. I hereby allow this project to be presented for evaluation
with my full consent.
Supervisor:
Dr. V. Ravi
(Assistant Professor, IDRBT)
ACKNOWLEDGEMENT
I would like to thank my supervisor, Dr. V. Ravi, who guided me through the
project and helped me sort out all the problems, be they technical or otherwise, and
without whose support the project would not have reached its present state.
I would also like to thank Mr. Nikunj Chauhan (ex-M. Tech. student at IDRBT) for
helping me develop my final algorithm.
Paramjeet
Project Supervisor
Dr. V. Ravi
Table of contents
1. Certificate
2. Acknowledgement
3. Abstract and Keywords
4. Nomenclature and Abbreviations
5. Introduction
6. Bacterial Foraging Technology
7. BFOA Algorithm
8. Wavelet Neural Network
9. Training of WNN with BFT
10. Bankruptcy Prediction
11. Results & Discussion
12. Conclusion
ABSTRACT
The present report proposes training a wavelet neural network (WNN) with the recently
proposed bacterial foraging technique (BFT) optimization algorithm in order to predict
bankruptcy in banks. The translation and dilation parameters and the weights connecting
different layers of the WNN are tuned using the BFT algorithm. The resulting neural network
is called BFTWNN. The performance of BFTWNN is compared with that of the threshold
accepting trained wavelet neural network (TAWNN) [Vinay Kumar et al. [38]] and the
original WNN. The efficacy of BFTWNN is tested on bankruptcy prediction datasets, viz. US
banks, Turkish banks and Spanish banks, with full features. Further, it is also tested on
benchmark datasets such as Iris, Wine and Wisconsin Breast Cancer with full features. The
whole experimentation is conducted using the 10-fold cross validation method. BFTWNN
outperformed TAWNN and WNN on the benchmark dataset problems by a good margin, and
it yielded results comparable to the Differential Evolution trained Wavelet Neural Network
(DEWNN) developed by Chauhan et al. [26].
Keywords: Bacterial Foraging Technique, Wavelet Neural Network, Bankruptcy Prediction,
Classification, Bacterial foraging trained wavelet neural network (BFTWNN), Threshold
Accepting trained wavelet neural network (TAWNN).
Nomenclature and abbreviations
Nomenclature
n_c    Number of chemotactic steps
n_s    Number of swim steps
n_r    Number of reproduction steps
n_ed   Number of elimination-dispersal steps
p_ed   Probability of elimination and dispersal
p      Dimension of the search space
Abbreviations
BFT    Bacterial Foraging Technique
WNN    Wavelet Neural Network
NWT    Non-Decimated Wavelet Transform
ANN    Artificial Neural Network
BFTWNN Bacterial Foraging Trained Wavelet Neural Network
TAWNN  Threshold Accepting trained Wavelet Neural Network
DEWNN  Differential Evolution trained Wavelet Neural Network
BPNN   Back Propagation Neural Network
AUC    Area Under the Receiver Operating Characteristic curve
CAMELS Capital Adequacy, Asset Quality, Management Expertise, Earning Strength,
       Liquidity, Sensitivity to market risk
NRMSE  Normalized Root Mean Square Error
1. INTRODUCTION
To tackle complex search problems of the real world, scientists have been drawing inspiration
from nature and natural creatures over the years. Darwinian evolution, the group behavior of
social insects and the foraging strategy of microbial organisms are some of the examples in
this category. The core of all these animal search strategies is optimization: the animals
usually try to maximize certain quantities through their searching strategy.
The Bacterial Foraging Technique, proposed by Passino [20] in 2002, is based entirely on the
foraging strategy of the Escherichia coli (E. coli) bacterium found in our intestine. It is
basically an evolutionary algorithm. Over the years, animals have developed foraging
strategies that maximize a function such as E/T, where E is the energy obtained from a prey
and T is the time taken in the whole process (i.e. searching for the prey, locating it and
digesting it). Maximizing this function ensures that the animals get more time for other
activities such as fighting, fleeing, mating and shelter building. Animals with a better foraging
strategy are likely to survive, while animals with a poor foraging strategy get eliminated. The
best part of BFT is that, like other metaheuristics, it is a derivative-free method.
BFT has gained much popularity and wide acceptance in solving a whole range of problems.
Acharya et al. [1] applied BFT to convert non-Gaussian data to an independent linear form in
order to recover all the components of a given source. Dasgupta et al. [34] applied adaptive
computational chemotaxis in BFT to address the slow convergence of BFT near the global
minimum and to pull a bacterium out if it gets trapped in a local minimum. Ulagammai et al.
[37] applied a BFT trained WNN to short term load forecasting, using BFT to train a feed
forward neural network preceded by wavelet transforms. The inputs are fed in as a time series
signal; the non-decimated wavelet transform (NWT) is used as the signal preprocessor in the
model, decomposing the signal into a number of wavelet coefficients that are then fed to the
multilayer network. The outputs are then combined using wavelet recombination to give the
final output.
In this paper, we propose a BFT based algorithm to train a wavelet neural network (WNN)
and test its effectiveness on bank bankruptcy datasets. In this algorithm, the weights
connecting the input and hidden layers and the hidden and output layers, as well as the
dilation and translation parameters, are updated using the BFT algorithm.
2. Overview of the BFT algorithm
BFT is based entirely on the foraging technique of the E. coli (Escherichia coli) bacterium
found in the lower intestine of warm blooded organisms. It follows a saltatory search method
for locating nutrients. The bacterium has flagella on its body to facilitate movement, and the
motion of the flagella decides the direction in which it moves. If the flagella rotate clockwise,
they move independently of one another, resulting in what we call a tumble step; this step is
used for searching for nutrient rich places or for moving away from harmful substances (such
as various ions or an acidic or basic environment). On the other hand, if the flagella rotate
anticlockwise, they form a bundle and propel the bacterium further in its current direction;
this is called a swim step, and it is taken when the bacterium is moving in the direction of
increasing nutrient concentration. The tumble step determines the direction of movement, and
swim steps are taken in the direction set by the tumble.
The bacterial foraging system consists of four principal steps, namely chemotaxis, swarming,
reproduction and elimination-dispersal. These steps are described briefly as follows.
2.1 Chemotaxis:
Chemotaxis refers to the step taken due to the presence of a chemical substance in the nearby
area. If the substance is a nutrient, the bacterium is attracted to it and takes a step in that
direction; otherwise, the bacterium takes a step away from it in order to avoid it. This step
consists of two types of movement: the tumble step and the swim step. The tumble step
determines the direction in which the swim step is to be taken, i.e. it searches for the direction
in which the nutrient concentration is increasing. After taking a tumble step, the bacterium
takes swim steps in the direction specified by the tumble, up to a predefined maximum
number of steps, after which it must tumble again. It has to be kept in mind that the bacterium
will not take a step if the nutrient concentration is less than at the previous position. Suppose
bacterium i is at the j-th chemotactic step, k-th reproduction step and l-th elimination-dispersal
step, with step size C(i); then the movement can be represented by
    θ^i(j+1, k, l) = θ^i(j, k, l) + C(i) Δ(i) / √(Δ^T(i) Δ(i))

where Δ(i) indicates a vector in a random direction, with elements in all dimensions, and
Δ^T(i) represents the transpose of the vector. If the number of chemotactic steps taken is less
than the specified limit, this step is repeated; otherwise the process is stopped.
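The chemotactic update above can be sketched in Python as follows (a minimal illustration; the function names and the uniform sampling of the direction vector are my assumptions, not from the report):

```python
import numpy as np

def tumble_direction(p, rng):
    """Generate a unit vector in a random direction (the tumble step)."""
    delta = rng.uniform(-1.0, 1.0, size=p)
    return delta / np.sqrt(delta @ delta)

def chemotactic_move(theta, step_size, rng):
    """Move a bacterium at position `theta` by `step_size` along a random tumble direction."""
    return theta + step_size * tumble_direction(theta.size, rng)
```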
2.2 Swarming:
An interesting behavior is shown by some bacteria, including E. coli and S. typhimurium,
which form intricate and stable spatio-temporal patterns (swarms) in a semisolid nutrient
medium. A group of E. coli cells arranges itself in a traveling ring, moving up the nutrient
gradient, when placed amidst a semisolid matrix with a single nutrient chemo-effecter. When
stimulated by a high amount of succinate, the cells release an attractant, aspartate, which
helps them group together and move as concentric patterns of swarms with high bacterial
density. The cell-to-cell factor can be calculated with the help of the following function:
    J_cc(θ, P(j, k, l)) = Σ_{i=1}^{S} J_cc(θ, θ^i(j, k, l))
                        = Σ_{i=1}^{S} [ -d_attract exp( -w_attract Σ_{m=1}^{p} (θ_m - θ_m^i)^2 ) ]
                          + Σ_{i=1}^{S} [ h_repellant exp( -w_repellant Σ_{m=1}^{p} (θ_m - θ_m^i)^2 ) ]

where d_attract is the depth of the attractant released by the cell, w_attract is the width of the
attractant signal, h_repellant is the height of the repellant effect and w_repellant is the
measure of the width of the repellant. J_cc(θ, P(j, k, l)) is the value added to the actual
objective function (to be minimized) to present a time-varying objective function, S is the
total number of bacteria and p is the number of variables to be optimized. For a small step
size, the value of the cell-to-cell factor is close to zero.
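The swarming term can be sketched as follows (the parameter defaults are illustrative placeholders, since the report does not state values for d_attract, w_attract, h_repellant or w_repellant):

```python
import numpy as np

def cell_to_cell(theta, population, d_attract=0.1, w_attract=0.2,
                 h_repellant=0.1, w_repellant=10.0):
    """Swarming term J_cc: attraction plus repulsion summed over all bacteria.
    `population` is an (S, p) array of bacterium positions."""
    sq_dists = np.sum((population - theta) ** 2, axis=1)  # sum over m of (theta_m - theta_m^i)^2
    attract = -d_attract * np.exp(-w_attract * sq_dists)
    repel = h_repellant * np.exp(-w_repellant * sq_dists)
    return float(np.sum(attract + repel))
```

When every bacterium sits at the same point and d_attract equals h_repellant, the attraction and repulsion cancel and the factor is zero.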
2.3 Reproduction:
(A) For the given k and l, and for each i = 1, 2, ..., S, let

    J^i_health = Σ_{j=1}^{Nc+1} J(i, j, k, l)

be the health value of bacterium i. Sort the bacteria in ascending order of J_health.
(B) J_health is the sum of all the objective values the bacterium accumulated over its
chemotactic steps, so a higher J_health means lower health. The half of the bacteria with the
highest J_health values die, and the other half undergo the reproduction step: each offspring
produced is placed at exactly the same location as its parent, which keeps the population size
constant. If the number of reproduction steps taken is less than the specified value, this step
is repeated.
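A minimal sketch of this reproduction step, assuming `health` holds the accumulated cost per bacterium (names are illustrative):

```python
import numpy as np

def reproduction(population, health):
    """Sort bacteria by accumulated cost (ascending J_health, healthiest first),
    kill the worst half and duplicate the best half at the parents' locations,
    keeping the population size constant."""
    order = np.argsort(health)                 # ascending cost: best bacteria first
    survivors = population[order[: len(population) // 2]]
    return np.concatenate([survivors, survivors.copy()])
```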
2.4 Elimination and Dispersal Step:
There may be local environmental changes, such as a rise in temperature or a sudden increase
in acidity, that displace or kill some of the bacteria in a region. These events are random in
nature, and they are simulated in the algorithm by displacing a part of the population to new
locations: some bacteria are randomly chosen and moved to new, random positions. This
event may help if the new locations are near the global minimum region, or the reverse may
happen, when a bacterium near the global minimum region gets displaced elsewhere. The
probability of these events is governed by the parameter p_ed (probability of elimination and
dispersal). If the current number of elimination-dispersal events is less than a specified
number, this process is repeated; otherwise the loop finishes.
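The elimination-dispersal event can be sketched as follows (assuming uniform re-initialization within user-supplied bounds; the function name is mine):

```python
import numpy as np

def eliminate_disperse(population, p_ed, lower, upper, rng):
    """With probability p_ed per bacterium, re-initialize it uniformly inside
    the search bounds, simulating random death and dispersal."""
    pop = population.copy()
    for i in range(len(pop)):
        if rng.random() < p_ed:
            pop[i] = rng.uniform(lower, upper, size=pop.shape[1])
    return pop
```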
The BFOA Algorithm
Parameters:
[Step 1] Initialization:
The first step is the initialization of all the parameters p, S, Nc, Nr, Ned, C(i) (i = 1, 2, ..., S)
and θ^i (i = 1, 2, ..., S), where:
p: dimension of the search space.
S: the number of bacteria in the population.
Nc: the number of chemotactic steps.
Nr: the number of reproduction steps.
Ned: the number of elimination-dispersal steps.
p_ed: the elimination-dispersal probability.
C(i): the size of the step taken in the random direction specified by a tumble.
Ns: the number of swim steps taken.
Algorithm:
[Step 2] Elimination-dispersal loop: l = l + 1
[Step 3] Reproduction loop: k = k + 1
[Step 4] Chemotactic loop: j = j + 1
For each bacterium i = 1, 2, ..., S, take a chemotactic step as follows.
Calculate the current objective function value:
    J(i, j, k, l) = J(i, j, k, l) + J_cc(θ^i(j, k, l), P(j, k, l))
(i.e. add on the cell-to-cell factor if you have chosen swarming, whose formula was discussed
previously).
(a) Save this value as J_last = J(i, j, k, l), so that a more favorable value can be looked for.
(b) Tumble: to simulate the tumble step, generate a random vector Δ(i) ∈ R^p with each
element Δ_m(i), m = 1, 2, ..., p, chosen randomly within the optimization domain.
(c) Move: the movement of the bacterium can be represented as

    θ^i(j+1, k, l) = θ^i(j, k, l) + C(i) Δ(i) / √(Δ^T(i) Δ(i))

This moves bacterium i by a step of size C(i) in the direction of the tumble.
(d) Compute the objective function value at the new point:
    J(i, j+1, k, l) = J(i, j+1, k, l) + J_cc(θ^i(j+1, k, l), P(j+1, k, l))
(e) Swim:
    i) Let m = 0 (counter for the swim length).
    ii) While m < Ns (we have not gone too far):
        Let m = m + 1.
        If J(i, j+1, k, l) < J_last (a more favorable value has been found), save the new value,
        J_last = J(i, j+1, k, l), and let

            θ^i(j+1, k, l) = θ^i(j+1, k, l) + C(i) Δ(i) / √(Δ^T(i) Δ(i))

        and use this θ^i(j+1, k, l) to compute the new J(i, j+1, k, l) as in (d).
        Else let m = Ns (this ends the while loop).
    iii) Go to the next bacterium (i + 1) if i ≠ S.
[Step 5] If j < Nc, go to step 4; in this case continue chemotaxis, since the life of the bacteria
is not over.
[Step 6] Reproduction:
[a] For the given k and l, and for each i = 1, 2, ..., S, let

    J^i_health = Σ_{j=1}^{Nc+1} J(i, j, k, l)

be the measure of all the nutrients the bacterium obtained during its lifetime (and of how
successful it was at avoiding noxious substances). Sort the bacteria and their chemotactic
parameters C(i) in ascending order of the cost J_health (higher cost means lower health).
[b] The Sr = S/2 bacteria with the highest J_health values die, and the remaining half undergo
the process of reproduction; each offspring is placed at exactly the same location as its parent.
[Step 7] If k < Nr, go to step 3; in this case, we have not reached the specified number of
reproduction steps.
[Step 8] To perform the elimination-dispersal step, choose the bacteria to be eliminated (as
decided by the parameter p_ed) and re-initialize them within the optimization domain. If
l < Ned, go to the elimination-dispersal loop (step 2) again; otherwise finish the loop. Select
the minimum value obtained over all the bacteria; this gives the minimum value of the
function. The flow chart of the above algorithm is given in Appendix 1.
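Putting the steps together, a minimal BFOA loop might look like this (the swarming term J_cc is omitted for brevity, and all parameter defaults are illustrative, not taken from the report):

```python
import numpy as np

def bfoa_minimize(f, p, S=20, Nc=30, Ns=4, Nr=4, Ned=2, ped=0.25,
                  step=0.1, lower=-5.0, upper=5.0, seed=0):
    """Minimal BFOA sketch following the steps above."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(lower, upper, size=(S, p))   # [Step 1] initialize positions
    best_x, best_f = None, np.inf

    def track(i):
        """Evaluate bacterium i and remember the best position seen so far."""
        nonlocal best_x, best_f
        J = f(theta[i])
        if J < best_f:
            best_f, best_x = J, theta[i].copy()
        return J

    for l in range(Ned):                             # [Step 2] elimination-dispersal loop
        for k in range(Nr):                          # [Step 3] reproduction loop
            health = np.zeros(S)
            for j in range(Nc):                      # [Step 4] chemotactic loop
                for i in range(S):
                    J_last = track(i)
                    delta = rng.uniform(-1, 1, size=p)
                    direction = delta / np.sqrt(delta @ delta)
                    theta[i] += step * direction     # tumble
                    J = track(i)
                    m = 0
                    while m < Ns and J < J_last:     # swim while the value improves
                        J_last = J
                        theta[i] += step * direction
                        J = track(i)
                        m += 1
                    health[i] += J_last
            order = np.argsort(health)               # [Step 6] best half splits
            survivors = theta[order[: S // 2]]
            theta = np.concatenate([survivors, survivors.copy()])
        for i in range(S):                           # [Step 8] elimination-dispersal
            if rng.random() < ped:
                theta[i] = rng.uniform(lower, upper, size=p)
    return best_x, best_f
```

On a simple sphere function this sketch converges toward the origin, illustrating how tumble, swim, reproduction and dispersal interact.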
Wavelet Neural Network:
The word wavelet is due to Grossmann and Morlet [16]. Wavelets are a class of functions
used to localize a given function in both space and scaling
(http://mathworld.wolfram.com/wavelet.html). They have advantages over traditional Fourier
methods in analyzing physical situations where the signal contains discontinuities and sharp
spikes. Wavelets were developed independently in the fields of mathematics, quantum
physics, electrical engineering and seismic geology. Interchanges between these fields during
the last few years have led to many new wavelet applications such as image compression,
radar and earthquake prediction.
A family of wavelets can be constructed from a function ψ(x), sometimes known as the
"mother wavelet", which is confined to a finite interval. "Daughter wavelets" ψ_{a,b}(x) are
then formed by translation (b) and dilation (a). Wavelets are especially useful for compressing
image data. An individual wavelet is defined by

    ψ_{a,b}(x) = |a|^{-1/2} ψ((x - b) / a)
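As a quick illustration, a daughter wavelet can be built from a mother wavelet directly from this definition (the helper name is mine, not from the report):

```python
import math

def daughter_wavelet(psi, a, b):
    """Build psi_{a,b}(x) = |a|^(-1/2) * psi((x - b) / a) from a mother wavelet."""
    return lambda x: abs(a) ** -0.5 * psi((x - b) / a)

# Morlet mother wavelet used later in the report: f(t) = cos(1.75 t) exp(-t^2 / 2)
morlet = lambda t: math.cos(1.75 * t) * math.exp(-t * t / 2)
psi_ab = daughter_wavelet(morlet, a=2.0, b=1.0)   # dilated by 2, translated by 1
```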
In the case of non-uniformly distributed training data, an efficient way of solving the problem
is to learn at multiple resolutions. Wavelets, in addition to forming an orthogonal basis, are
capable of explicitly representing the behavior of a function at various resolutions of the input
variables. Consequently, a wavelet network is first trained to learn the mapping at the coarsest
resolution level; in subsequent stages, the network is trained to incorporate elements of the
mapping at higher and higher resolutions. Such a hierarchical, multi-resolution approach has
many attractive features for solving engineering problems, resulting in a more meaningful
interpretation of the resulting mapping and more efficient training and adaptation of the
network compared to conventional methods. Wavelet theory provides useful guidelines for
the construction and initialization of networks, and consequently the training times are
significantly reduced (http://www.ncl.ac.uk/pat/neural-networks.html).
Wavelet networks employ activation functions that are dilated and translated versions of a
single function ψ: R^d → R, where d is the input dimension (Zhang et al. [40]). This function,
called the "mother wavelet", is localized both in the space and frequency domains (Becerra et
al. [5]). Based on wavelet theory, the wavelet neural network (WNN) was proposed as a
universal tool for function approximation; it shows surprising effectiveness in solving the
conventional problems of poor convergence or even divergence encountered in other kinds of
neural networks, and it can dramatically increase convergence speed (Zhang et al. [42]).
The WNN consists of three layers, namely the input layer, the hidden layer and the output
layer. Each layer is fully connected to the nodes in the next layer. The numbers of input and
output nodes depend on the numbers of inputs and outputs present in the problem, and the
number of hidden nodes can be any number from 3 to 15. The WNN is implemented here
with the Gaussian wavelet function.
Fig 1: Wavelet Neural Network (input layer, hidden layer and output layer, with weights
connecting successive layers).
The training algorithm for a WNN is as follows (Zhang et al. [42]):
1) Select the number of hidden nodes required. Initialize the dilation and translation
parameters and the weights for the connections between the input and hidden layers and
between the hidden and output layers. It should be kept in mind that the random initial
values should be limited to a small interval (this gives small error values and the algorithm
converges early).
2) The output value V_K for sample K, K = 1, 2, ..., np, where np is the number of samples,
is calculated with the following formula:
    V_K = Σ_{j=1}^{nhn} W_j f( (Σ_{i=1}^{nin} w_{ij} x_i - b_j) / a_j )    (1)

where nin is the number of input nodes, nhn is the number of hidden nodes and
K = 1, 2, ..., np.
In (1), when f(t) is taken as the Morlet mother wavelet, it has the form

    f(t) = cos(1.75 t) exp(-t^2 / 2)    (2)

and when taken as the Gaussian wavelet it becomes

    f(t) = exp(-t^2)    (3)
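A forward pass implementing Eq. (1) with either mother wavelet might be sketched as follows (the array shapes and function name are my assumptions):

```python
import numpy as np

def wnn_forward(X, w, W, a, b, wavelet="gaussian"):
    """Forward pass of the three-layer WNN of Eq. (1): hidden node j computes
    f((sum_i w_ij * x_i - b_j) / a_j); the output is the W-weighted sum.
    Shapes: X (np, nin), w (nin, nhn), W (nhn,), a and b (nhn,)."""
    t = (X @ w - b) / a                               # wavelet argument per hidden node
    if wavelet == "morlet":
        h = np.cos(1.75 * t) * np.exp(-t ** 2 / 2)    # Eq. (2)
    else:
        h = np.exp(-t ** 2)                           # Eq. (3), Gaussian
    return h @ W
```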
3) Reduce the prediction error by adjusting W_j, w_ij, a_j and b_j using ΔW_j, Δw_ij, Δa_j
and Δb_j (see formulas (4)-(7)). In the WNN, the gradient descent algorithm is employed:

    ΔW_j(t+1) = -η ∂E/∂W_j + α ΔW_j(t)    (4)

    Δw_ij(t+1) = -η ∂E/∂w_ij + α Δw_ij(t)    (5)

    Δa_j(t+1) = -η ∂E/∂a_j + α Δa_j(t)    (6)

    Δb_j(t+1) = -η ∂E/∂b_j + α Δb_j(t)    (7)

where the error function can be taken as

    E = (1/2) Σ_{k=1}^{np} (V̂_k - V_k)^2    (8)

and η and α are the learning and momentum rates, respectively.
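One update of the momentum rule in Eqs. (4)-(7), written generically for any single parameter (a sketch; the gradient itself is assumed to be supplied by backpropagation):

```python
def momentum_update(param, grad, velocity, eta=0.01, alpha=0.9):
    """One gradient-descent-with-momentum step per Eqs. (4)-(7):
    delta(t+1) = -eta * dE/dparam + alpha * delta(t)."""
    new_velocity = -eta * grad + alpha * velocity
    return param + new_velocity, new_velocity
```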
4) Return to step (2); the process continues until E satisfies the given error criterion, at
which point the training of the WNN is complete.
Some problems exist in WNN training, such as slow convergence, trapping in local minima
of the search space, and oscillation (Pan et al. 2008). We propose BFTWNN to resolve these
problems.
3. Metaheuristics used to train WNN
3.1 Threshold accepting trained WNN (TAWNN)
The threshold accepting algorithm, originally proposed by Dueck and Scheuer [14], is a faster
variant of the original simulated annealing algorithm wherein acceptance of a new move or
solution is determined by a deterministic criterion rather than a probabilistic one.
3.2 Bacterial Foraging Technique:
BFT is a novel evolutionary algorithm, proposed by Passino [20]. It is a population based
optimization algorithm that is completely based on the foraging method of the E. coli
bacterium. A fixed number of solutions is initialized randomly within an n-dimensional
search space and then evolved over time to explore the search space and locate the minima of
the objective function. Within a generation, new solutions are generated by adding a fixed
step size to each solution (the chemotactic step); the half of the solutions that are better than
the other half is selected in each reproduction step; and finally an elimination-dispersal step is
taken to disperse bacteria to random locations.
4. Training of WNN with the BFT algorithm
Applying BFT to train a WNN basically modifies steps (3) and (4) of the WNN training
algorithm described above. The output of the WNN is a function of the weights W (from the
input layer to the hidden layer), the weights w (from the hidden layer to the output layer), the
dilation parameters D, the translation parameters T and the input values X, i.e. Y = f(X, θ),
where Y is the vector of output values and θ = (D, T, W, w). During the training phase, both
the input vector X and the output vector Y are known, and the synaptic weights W and w, the
dilation parameters D and the translation parameters T are adapted by minimizing the
network error E so as to obtain the proper relationship from X to Y. In BFTWNN, the
elements of the vectors D, T, W and w are the decision variables.
The vector θ consists of:
(i) Weight values from input nodes to hidden nodes, W = {W_ij, i = 1, 2, ..., nin,
j = 1, 2, ..., nhn}, where nin is the number of input nodes and nhn is the number of hidden
nodes;
(ii) Weight values from hidden nodes to output nodes, w = {w_jk, j = 1, 2, ..., nhn,
k = 1, 2, ..., non}, where non is the number of output nodes;
(iii) Dilation parameters D = (d_1, d_2, ..., d_nhn);
(iv) Translation parameters T = (t_1, t_2, ..., t_nhn).
A population P in each generation consists of M such θ vectors, where M is the size of the
population:

    P = {θ_1, θ_2, θ_3, ..., θ_M}    (9)

The initial population is randomly initialized using the user specified lower and upper bounds
for the weights, dilation and translation parameters as follows:

    θ_i = θ_i^min + rand(0, 1) * (θ_i^max - θ_i^min)

(for faster convergence, the initial values should lie between 0 and 1). The initial NRMSE
value (represented by the function J) is stored for these initial values:

    J(i, j, k, l) = J(i, j, k, l) + J_cc(θ^i(j, k, l), P(j, k, l))    (10)
Chemotaxis is basically a search step which, with the tumble and swim steps, directs the
search towards potential areas of the optimal solution. In the tumble step we choose a unit
random vector (choosing random values for all the weights, dilation and translation
parameters and dividing them by the square root of their squared sum). This vector
determines the direction in which the bacterium proceeds; it should be kept in mind that the
NRMSE value should decrease after taking this step. The chemotactic step can be represented
by the equation

    θ(i, j+1, k, l) = θ(i, j, k, l) + C(i) Δ(i) / √(Δ^T(i) Δ(i))    (11)

and J(i, j+1, k, l) is computed as

    J(i, j+1, k, l) = J(i, j+1, k, l) + J_cc(θ^i(j+1, k, l), P(j+1, k, l))    (12)

where θ is the set of all decision vectors (i.e. consisting of all the weights, dilation and
translation parameters), C(i) is the size of the step taken and Δ(i) is a vector in a random
direction (containing random values for the weights from input to hidden and from hidden to
output, and for the dilation and translation parameters); i is the bacterium index, j is the
chemotactic index, k is the reproduction index and l is the elimination-dispersal index. The
new NRMSE value is calculated and stored. After the tumble step, swim steps are taken in the
direction of the tumble up to a maximum swim length, after which a tumble step must be
taken.
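Since the report uses NRMSE as the objective J but does not spell out its normalization, one common definition can be sketched as follows (the normalization by the RMS of the targets is an assumption):

```python
import numpy as np

def nrmse(y_true, y_pred):
    """Normalized root mean square error: RMSE divided by the RMS of the targets.
    Used here as the objective J that BFT minimizes when training the WNN."""
    err = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return err / np.sqrt(np.mean(y_true ** 2))
```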
The next step is the reproduction step, in which the health value of each bacterium is
calculated. For the given reproduction step counter k and elimination-dispersal step counter l:
(a) For each i = 1, 2, ..., P, let

    J^i_health = Σ_{j=1}^{Nc+1} J(i, j, k, l)    (13)

be the health of bacterium i, and sort the bacteria in increasing order of J_health.
(b) The Pr = P/2 bacteria with the highest J_health values die, and the other Pr bacteria with
the best values split; the copies formed are placed at exactly the same location as their parent.
The next step is the elimination and dispersal step. To simulate it, bacteria are chosen at
random (with probability p_ed) and dispersed to random locations. The chemotactic step is
repeated up to N_c times for each bacterium, the reproduction step is repeated N_r times and
the elimination-dispersal step is completed N_ed times. After performing all these steps we
obtain the M solutions retained by the bacteria, from which we select the values with the
minimum NRMSE. This set of values gives our optimum weights, dilation and translation
parameters, and these values are tested on the test data. We can also set a further stopping
condition: if the change in the objective function value over two consecutive steps is less
than a predefined value, the algorithm terminates.
5. Bankruptcy Prediction:
Bankruptcy prediction has been a subject of formal analysis since at least 1932, when
FitzPatrick [15] published a study of 20 pairs of firms, one failed and one surviving, matched
by date, size and industry, in The Certified Public Accountant. He did not perform statistical
analysis as is now common, but he thoughtfully interpreted the ratios and trends in the ratios;
his interpretation was effectively a complex, multiple variable analysis. The prediction of
bankruptcy has been a subject of extensive research since the late 1960s.
In 1967, William Beaver [4] applied t-tests to evaluate the importance of individual
accounting ratios within a similar pair-matched sample. In 1968, in the first formal multiple
variable analysis, Edward I. Altman [2] applied multiple discriminant analysis to a
pair-matched sample. One of the most prominent early models of bankruptcy prediction is the
Z-Score financial analysis tool, which is still applied today.
Banks are mostly monitored by regulators, who conduct on-site examinations on bank
premises every 12-18 months, as stipulated by the Federal Deposit Insurance Corporation
Improvement Act of 1991. Regulators indicate the safety and soundness of an institution
using a six part rating system. This rating, referred to as the CAMELS rating, evaluates banks
according to their basic functional areas: Capital adequacy, Asset quality, Management
expertise, Earning strength, Liquidity, and Sensitivity to market risk. While CAMELS ratings
clearly provide regulators with important information, Cole and Gunther [12] reported that
these CAMELS ratings decay rapidly.
Many statistical techniques, such as regression analysis and logistic regression, have been
used to solve the problem of bankruptcy prediction. These techniques make use of a
company's financial data to predict its financial state. The bankruptcy prediction problem can
also be solved using various other types of classifiers, such as case based reasoning (Jo, Han,
& Lee [18]), rough sets (McKee [24]) and data envelopment analysis (Cielen, Peters, &
Vanhoof [11]), to mention a few. Recently, Ravi Kumar and Ravi [30] proposed a fuzzy rule
based classifier for bankruptcy prediction; they reported that the fuzzy rule based classifier
outperformed the well known BPNN technique in the case of the US banks dataset. Cheng,
Chen & Fu [10] combined an RBF network with logit analysis learning to predict financial
distress; they compared the proposed technique with logit analysis and the back propagation
neural network and found their method superior to both. Ravi Kumar and Ravi [29] proposed
an ensemble classifier using a simple majority voting scheme for the bankruptcy prediction
problem, based on a host of intelligent techniques including ANFIS, RBF, SORBF1,
SORBF2, orthogonal RBF and BPNN; they reported that ANFIS, SORBF2 and BPNN are
the most prominent, as they appeared in the best ensemble classifier combinations. Ravi,
Ravi Kumar, Ravi Srinivas and Kasabov [34] proposed a semi online training algorithm for
radial basis function neural networks (SORBF) and applied it to bankruptcy prediction in
banks; semi online RBFN without linear terms performed better than techniques such as
ANFIS, BPNN, RBF and orthogonal RBF. In another work, Ravi Kumar and Ravi conducted
a comprehensive review of all the works reported using statistical and intelligent techniques
to solve the problem of bankruptcy prediction in banks and firms during 1968-2005; it
compares the techniques in terms of prediction accuracy, data sources and the timeline of
each study wherever applicable. Recently, Pramodh and Ravi [28] employed a modified great
deluge algorithm to train an auto associative neural network and applied it to bankruptcy
prediction. Further, Ravi, Kurniawan, Peter Nwee Kok Thai & Ravi Kumar [32] developed a
novel soft computing system for bank performance prediction based on BPNN, RBF, CART,
PNN, FRBC and PCA based hybrid techniques.
Most recently, to solve bankruptcy prediction problems, Ravi and Pramodh [28] proposed a
threshold accepting based training algorithm for a novel principal component neural network
(PCNN) without a formal hidden layer. They employed PCNN for bankruptcy prediction
problems and reported that PCNN outperformed BPNN, TANN, PCA-BPNN and
PCA-TANN in terms of the area under the receiver operating characteristic curve (AUC)
criterion. In PCA-BPNN and PCA-TANN, PCA is used as a preprocessor to BPNN and
TANN, respectively.
6. Result and discussion
The datasets analyzed in this work are three bankruptcy datasets, viz. the Turkish banks,
Spanish banks and US banks datasets, and three other benchmark datasets, viz. the Iris data,
Wine data and Wisconsin breast cancer data. The Turkish banks dataset is obtained from
Canbas, Cabuk & Kilic [9] and is available at
http://www.tbb.org.tr/english/bulten/yillik/2000/ratios.xls. The Banks Association of Turkey
published 49 financial ratios of the previous year for predicting the health of a bank in the
present year. However, Canbas et al. [9] chose only 12 ratios as the early warning indicators
that have discriminating ability (i.e. significance level < 5%) between healthy and failed banks
one year in advance. Among these variables, the 12th variable has some missing values,
meaning that data for some of the banks are not given, so we filled those missing values with
the mean value of the variable, following the general approach in data mining. The financial
ratios considered as predictor variables are presented in Table 1 at the end of the paper. This
dataset contains 40 banks, of which 22 went bankrupt and 18 were healthy. The Spanish banks
data is obtained from Olmeda and Fernandez [27]. The ratios for the failed banks were taken
from the last financial statement before bankruptcy was declared, and the data for the
non-failed banks were taken from 1982 statements. This dataset contains 66 banks, of which
37 went bankrupt and 29 were healthy. The US banks dataset is also obtained from Olmeda
and Fernandez [27]; the financial ratios used by them are presented in Table 1. They obtained
the data of 129 banks from Moody's Industrial Manual, where the banks went bankrupt during
1975-1982. This dataset of 129 US banks contains 65 bankrupt and 64 healthy banks. The
benchmark datasets are taken from the UCI repository (http://archives.ics.uci.edu/ml).
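The mean-imputation step used for the 12th Turkish ratio can be sketched as follows. This is a minimal illustration of the general approach, not the report's own code; the data values are hypothetical.

```python
def impute_mean(rows):
    """Replace missing entries (None) in each column with that column's
    mean, computed over the banks for which the ratio is available."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        present = [r[j] for r in rows if r[j] is not None]
        means.append(sum(present) / len(present))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]

# Example: the second ratio is missing for the first bank
data = [[0.5, None], [0.7, 2.0], [0.9, 4.0]]
print(impute_mean(data))  # -> [[0.5, 3.0], [0.7, 2.0], [0.9, 4.0]]
```

The mean is computed only over the observed values, so the imputed entry does not shift the variable's average.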
The only parameter used for WNN is the number of hidden nodes. The parameters used for
BFTWNN are the number of hidden nodes, number of chemotactic steps, number of
reproduction steps, number of elimination-dispersal steps, number of bacteria, number of
swim steps, the step size and λ (if a dynamic step size is used). The number of bacteria is
taken between 50 and 100, the number of chemotactic steps between 30 and 50, the number of
reproduction steps between 20 and 40, the number of elimination-dispersal steps between 4
and 10, and the number of swim steps between 20 and 60. The λ value is taken as 400. All of
these are flexible parameters and can be decreased in order to achieve faster convergence. The
number of hidden nodes is taken in the range 3-15, depending on the number of input nodes,
for all three algorithms.
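The parameter ranges above can be collected into a configuration sketch. The names are illustrative (not taken from the report's implementation), and the step size shown is an assumed placeholder since the text does not fix its value.

```python
# Illustrative BFTWNN hyperparameter settings, following the ranges in the text.
bft_params = {
    "n_bacteria": 50,       # S: population size, taken between 50 and 100
    "n_chemotactic": 30,    # Nc: chemotactic steps, between 30 and 50
    "n_reproduction": 20,   # Nre: reproduction steps, between 20 and 40
    "n_elim_disp": 4,       # Ned: elimination-dispersal steps, between 4 and 10
    "n_swim": 20,           # Ns: swim steps, between 20 and 60
    "step_size": 0.1,       # run-length unit (assumed value, not from the text)
    "lambda_": 400,         # λ used when the step size is made dynamic
    "n_hidden": 3,          # hidden nodes, 3-15 depending on input dimension
}

# Rough cost of one reproduction step: every bacterium moves Nc times.
print(bft_params["n_bacteria"] * bft_params["n_chemotactic"])  # -> 1500
```

Lowering the loop counts reduces the number of fitness evaluations proportionally, which is why the text notes they can be decreased for faster convergence.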
All the datasets are analyzed with WNN, TAWNN and BFTWNN using 10-fold cross
validation. The average accuracy over all the folds is computed for each of the six datasets.
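The 10-fold protocol can be outlined as below. This is a generic sketch of cross-validated accuracy averaging; `train_model` and `accuracy` are placeholders standing in for the WNN/TAWNN/BFTWNN training and scoring routines, which are not shown in the report.

```python
def ten_fold_indices(n, k=10):
    """Split sample indices 0..n-1 into k roughly equal folds."""
    folds = [[] for _ in range(k)]
    for i in range(n):
        folds[i % k].append(i)
    return folds

def cross_validate(samples, labels, train_model, accuracy, k=10):
    """Average test accuracy over k folds, as reported in the tables:
    each fold is held out once while the model trains on the rest."""
    folds = ten_fold_indices(len(samples), k)
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [(samples[i], labels[i]) for i in range(len(samples))
                 if i not in held_out]
        test = [(samples[i], labels[i]) for i in fold]
        model = train_model(train)
        scores.append(accuracy(model, test))
    return sum(scores) / k

# Trivial check: a "model" scored as always perfect averages to 100%.
acc = cross_validate(list(range(20)), [i % 2 for i in range(20)],
                     train_model=lambda train: None,
                     accuracy=lambda model, test: 100.0)
print(acc)  # -> 100.0
```

Averaging over all ten folds, as done here, is what produces the single accuracy figure per dataset reported in the tables below.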
Table 1. Financial ratios of the datasets.
S. No. Predictor variable name
Turkish banks’ data
1 Interest expenses/average profitable assets
2 Interest expenses/average non-profitable assets
3 (Share holders’ equity + total income)/(deposits + non-deposit funds)
4 Interest income/interest expenses
5 (Share holders’ equity + total income)/total assets
6 (Share holders’ equity + total income)/(total assets + contingencies & commitments)
7 Net working capital/total assets
8 (Salary and employees’ benefits + reserve for retirement)/no. of personnel
9 Liquid assets/(deposits + non-deposit funds)
10 Interest expenses/total expenses
11 Liquid assets/total assets
12 Standard capital ratio
Spanish banks’ data
1 Current assets/total assets
2 Current assets-cash/total assets
3 Current assets/loans
4 Reserves/loans
5 Net income/total assets
6 Net income/total equity capital
7 Net income/loans
8 Cost of sales/sales
9 Cash flow/loans
US banks’ data
1 Working capital/total assets
2 Retained earnings/total assets
3 Earnings before interest and taxes/total assets
4 Market value of equity/total assets
5 Sales/total assets
The average sensitivities and specificities are computed for the datasets with two-class
problems; the results for the bankruptcy datasets are presented in Table 3. It is observed that
BFTWNN surpassed the other algorithms with much better accuracy.
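The reported measures can be computed from fold-wise confusion counts. A minimal sketch follows; note that the AUC figures in the tables appear consistent with the single-point AUC, i.e. the mean of sensitivity and specificity, scaled by 100 so a perfect score reads 10000 (this scaling is inferred from the table values, not stated in the text).

```python
def sensitivity(tp, fn):
    """Percentage of bankrupt banks correctly flagged (true positive rate)."""
    return 100.0 * tp / (tp + fn)

def specificity(tn, fp):
    """Percentage of healthy banks correctly passed (true negative rate)."""
    return 100.0 * tn / (tn + fp)

def auc_from_rates(sens, spec):
    """Single-point AUC as the mean of sensitivity and specificity,
    scaled by 100 to match the tables (perfect score = 10000)."""
    return round((sens + spec) / 2 * 100)

# Example consistent with the Spanish banks BFTWNN row in Table 3:
print(auc_from_rates(91.66, 86.5))  # -> 8908
```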
TABLE 2
Average results of 10-fold cross-validation for the benchmark datasets with all features:

Dataset   BFTWNN (%)   WNN (%)   TAWNN (%)   DEWNN (%)
Iris      95.33        94.67     95.99       97.99
Wine      95.6         91.76     92.8        97.6
WBC       97.4         95.29     95.43       97.05
TABLE 3
Average results of ten-fold cross-validation for the bankruptcy datasets with specified features:

Dataset   Measure       BFTWNN (%)   DEWNN (%)   WNN (%)   TAWNN (%)
Turkish   Average       95           95          95        100
          Sensitivity   97.5         100         100       100
          Specificity   97.5         95          95        100
          AUC           9750         9750        9750      10000
Spanish   Average       88.33        89.99       86.67     88.33
          Sensitivity   91.66        91.66       89.67     79.66
          Specificity   86.5         93          81        90.5
          AUC           8908         9233        8533      8508
US        Average       91.47        93.33       85.83     90.83
          Sensitivity   88.9         97.323      85.82     90.46
          Specificity   85.5         89.78       87.5      91.54
          AUC           8720         9355.15     8666      9100
7. Conclusion
In this study, BFTWNN is developed and compared with TAWNN and the original WNN on
benchmark datasets, viz. the Iris, Wine and Wisconsin Breast Cancer datasets, as well as on
bankruptcy datasets, viz. the Turkish, Spanish and US bank datasets. The results indicate that
BFTWNN can be a very effective soft computing tool for classification problems: it
outperformed the other techniques on the benchmark datasets and yielded comparable results
on the bank datasets. Hence the present research concludes that training WNN with Bacterial
Foraging Technology solves classification problems with very good accuracy.
APPENDIX 1
FLOW CHART OF BACTERIAL FORAGING TECHNOLOGY
[The original flow chart figure could not be reproduced; its steps are summarized below.]

START
1. Initialize all variables; set all loop counters and the bacterium index to 0.
2. Increase the elimination-dispersal loop counter: l = l + 1. If l >= Ned, STOP.
3. Increase the reproduction loop counter: k = k + 1. If k >= Nre, perform
   elimination-dispersal (for i = 1, 2, ..., S, with probability ped eliminate a bacterium
   and disperse it to a random location) and go to step 2.
4. Increase the chemotactic loop counter: j = j + 1. If j < Nc, repeat step 4; otherwise
   perform reproduction (kill the worse half of the population, i.e. the bacteria with
   higher cumulative health, and split the better half into two) and go to step 3.
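The nested loop structure in the flow chart can be expressed as a Python skeleton. This is an illustrative outline only: the placeholder cost function, the omitted swim steps, and the small default loop counts are assumptions, not the report's implementation.

```python
import random

def bfoa_skeleton(S=10, Nc=5, Nre=3, Ned=2, p_ed=0.25, dim=4):
    """Loop structure of bacterial foraging optimization as in the flow
    chart: elimination-dispersal > reproduction > chemotaxis (innermost)."""
    pop = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(S)]
    fitness = lambda b: sum(x * x for x in b)   # placeholder cost (minimized)
    for l in range(Ned):                        # elimination-dispersal loop
        for k in range(Nre):                    # reproduction loop
            health = [0.0] * S
            for j in range(Nc):                 # chemotactic loop
                for i in range(S):
                    # tumble: take a small random step (swim steps omitted)
                    pop[i] = [x + random.uniform(-0.1, 0.1) for x in pop[i]]
                    health[i] += fitness(pop[i])   # accumulate health
            # reproduction: kill the worse half (higher cumulative health
            # when minimizing cost), split the better half into two copies
            order = sorted(range(S), key=lambda i: health[i])
            best = [pop[i] for i in order[: S // 2]]
            pop = best + [list(b) for b in best]
        # elimination-dispersal: move some bacteria to random locations
        for i in range(S):
            if random.random() < p_ed:
                pop[i] = [random.uniform(-1, 1) for _ in range(dim)]
    return min(pop, key=fitness)

print(bfoa_skeleton())
```

In the BFTWNN setting, the placeholder cost would be replaced by the WNN's training error on a fold, with each bacterium encoding one candidate set of network weights.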