7/29/2019 s1-ln1431401995844769-1939656818Hwf-1896444750IdV51689985614314019PDF_HI0001_2
Statistical Model of Evolutionary Algorithm for Feed-Forward ANN Architecture Optimization
Journal: Journal of Experimental & Theoretical Artificial Intelligence
Manuscript ID: TETA-2013-0034
Manuscript Type: Original Article
Keywords: Artificial neural network, crossover, schema theory, topology mutation
URL: http://mc.manuscriptcentral.com/teta
Journal of Experimental & Theoretical Artificial Intelligence
ABSTRACT: The optimization of feed-forward architecture design is part of the evolution of Artificial Neural Networks (ANNs). There is no systematic procedure to design a near-optimal architecture for a given application or task; pattern-classification methods and constructive and destructive algorithms can be used to design architectures. The proposed work develops a statistical model of an Evolutionary Algorithm (EA) to optimize the architecture. A single-point crossover is applied with selective schemas on the network space, and evolution is introduced at the mutation stage so that optimized ANNs are achieved.
Keywords: Artificial neural network, topology mutation, schema theory, crossover.
1 INTRODUCTION: Genetic algorithms were developed by John Holland [1], [2], [3], [4]. With a growing number of day-to-day applications combined with hardware enhancement, a variety of EAs are becoming more and more popular. A family of subsets of the search space and an appropriate process of re-encoding are two notions analogous to the familiar facts relating continuous maps to families of open sets or measurable functions. In order to apply an EA to a typical optimization problem, we need to model the problem in a suitable manner, i.e. to construct a search space Ω together with a positive-valued fitness function and a family of mating and mutation transforms. An EA can therefore be represented by a 4-tuple (Ω, F, M, f), where F is the family of mating transforms and M is the family of unary (mutation) transformations on Ω. The total search space Ω is divided into invariant subsets [3], and the crossover operation is performed on Ω, while the family of mutations M on Ω is ergodic, i.e. it ensures that the Markov process [5] modeling the algorithm is irreducible. The schemata correspond to invariant subsets of the search space, and the schema theorem can be reformulated in this general framework. The invariant subsets of the search space arise from the encoding process, in analogy with continuous maps and families of open sets, measurable functions and sigma-algebras. The classical Geiringer theorem is extended to represent a class of evolutionary computation techniques with crossover and mutation.
G.V.R. Sagar, [email protected], Assoc. Professor, G.P.R. Engg. College, Kurnool, AP 518007, India.
Dr. S. Venkata Chalam, [email protected], Professor, CVR Engg. College, Hyderabad, AP, India.

2.0 Representation of Evolutionary Algorithm:
Building on the mathematical foundation for representing evolutionary algorithms given in Section 1, we exploit the language of category theory [6]. To apply an evolutionary algorithm to a specific optimization problem, we need to model the problem in a suitable manner. This requires building a search space Ω which contains all the possible solutions to the problem, a computable positive-valued fitness function f : Ω → (0, ∞), and a suitable family of mating (crossover) and mutation transforms.
The category of heuristic 3-tuples: All the families F of transformations have invariant subsets [3] of Ω, which characterize the families in set-theoretic and sigma-algebra terms.

Let Γ denote a nonempty family of transformations from Ω^m into Ω for a fixed m ≥ 1 (m-fixable families). We denote the family of invariant subsets of Ω under the family Γ by

Λ_Γ = { S ⊆ Ω : T(S^m) ⊆ S for all T ∈ Γ }     (3.1)

It follows that for every element x ∈ Ω there is a unique smallest element of Λ_Γ containing x.

A heuristic 3-tuple is a triple Δ = (Ω, F, M) such that Γ = F ∪ M. For a single heuristic 3-tuple Δ = (Ω, F, M) and x ∈ Ω, S_x denotes the smallest element of Λ_F, the family of invariant subsets, containing x.

In a similar manner, given two heuristic 3-tuples Δ_1 = (Ω_1, F_1, M_1) and Δ_2 = (Ω_2, F_2, M_2), we define a function δ : Ω_1 → Ω_2 representing the reproduction transformation, called a morphism. For x, y ∈ Ω_1 and T ∈ F_1 there exists F_{x,y} ⊆ F_2 such that

δ(T(x, y)) = F_{x,y}(δ(x), δ(y))     (3.2)

Similarly, for every M ∈ M_1 and x ∈ Ω_1 there exists H_x ⊆ M_2 such that δ(M(x)) = H_x(δ(x)). The collection of all morphisms from Δ_1 into Δ_2 is denoted by M(Δ_1, Δ_2).
A Generalization of Geiringer's theorem for EAs:
A family of recombination operators (see also [7]) of a given evolutionary algorithm changes the frequency with which various elements of the search space are sampled [1], [8]. To illustrate this point, let Ω = Π_{i=1}^{n} A_i denote the search space of a given evolutionary algorithm, first discussed in [9]. Fix a population P consisting of m individuals, with m an even number. P can be thought of as an m-by-n matrix whose rows are the individuals of the population P,

P = [ a_11 a_12 ... a_1n ;
      a_21 a_22 ... a_2n ;
      ...
      a_m1 a_m2 ... a_mn ]     (1.0)
The elements of the ith column of P are members of A_i. The general Geiringer theorem [10] tells us the limiting frequency with which certain elements of the search space are sampled in the long run, provided one uses the crossover operator [19] alone. Let Φ(h, P, i), where h ∈ A_i, denote the proportion of rows, say j, of P for which a_ji = h. If one starts with a population P of individuals and runs the evolutionary algorithm in the absence of selection and mutation (crossover being the only operator involved), then, in the long run, the frequency of occurrence of the individual (h_1, h_2, ..., h_n) before time t, denoted Φ((h_1, h_2, ..., h_n), t), satisfies

lim_{t→∞} Φ((h_1, h_2, ..., h_n), t) = Π_{i=1}^{n} Φ(h_i, P, i)     (1.1)

The limiting distributions of the frequency of occurrence of individuals belonging to a certain schema under these algorithms have also been computed in [11], [12], [13]. The classical Geiringer theorem and the proposed or modified Geiringer algorithms
are established from basic facts about Markov chains [5] and random walks on groups. This is mainly a matter of formulating the statement of the theorem in a slightly different manner. This new point of view not only recovers the existing variants of Geiringer's theorem applied to EAs, but also extends the process to evolutionary algorithms more generally. Below we give a more formal description of an EA than the one given in Section 1.
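Equation (1.1) says the limiting sampling frequency under crossover alone is the product of the initial column proportions Φ(h_i, P, i). A minimal numerical sketch of this product (the names `phi` and `limiting_frequency` are illustrative, and the 4-row binary population is a made-up example, not data from the paper):

```python
from fractions import Fraction

def phi(h, P, i):
    """Proportion of rows j of population P with P[j][i] == h."""
    col = [row[i] for row in P]
    return Fraction(col.count(h), len(col))

def limiting_frequency(individual, P):
    """Right-hand side of eq. (1.1): product of column proportions."""
    prod = Fraction(1)
    for i, h in enumerate(individual):
        prod *= phi(h, P, i)
    return prod

# toy population: 4 individuals of length 3 over the binary alphabet
P = [(0, 1, 1),
     (1, 1, 0),
     (0, 0, 1),
     (1, 1, 1)]

print(limiting_frequency((0, 1, 1), P))  # (2/4)*(3/4)*(3/4) = 9/32
```

Summing the product over all possible individuals gives 1, as a limiting distribution must.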
Framework:
A population P of size m is simply an element of Ω^m (a column vector). An elementary step is a probabilistic rule which takes one population as input and produces another population of the same size as output. We shall consider the following types of elementary steps.
Selection: Consider a given population P as an input,

P = (x_1, x_2, ..., x_m)^T with x_i ∈ Ω     (1.2)

The individuals of P are evaluated,

(x_1, x_2, ..., x_m) → (f(x_1), f(x_2), ..., f(x_m))     (1.3)

A new population

P1 = (y_1, y_2, ..., y_m)^T     (1.4)

is obtained, where the y_i are chosen independently m times from the individuals of P, and y_i = x_j with probability

P(y_i = x_j) = f(x_j) / Σ_{l=1}^{m} f(x_l)

This means that the individuals of P1 are among those of P, and the expected number of occurrences of any individual of P in P1 is proportional to the number of occurrences of that individual in P times the individual's fitness value. In particular, the fitter an individual is, the more copies of it are likely to be present in P1; on the other hand, individuals with relatively small fitness values are not likely to enter P1 at all. This imitates the natural survival-of-the-fittest principle.
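The selection step (1.2)-(1.4) amounts to fitness-proportionate sampling with replacement. A minimal sketch (the helper `select` and the toy fitness are illustrative names, not from the paper):

```python
import random

def select(population, fitness, m=None, rng=random):
    """Draw m individuals i.i.d.; each draw picks x_j with
    probability f(x_j) / sum_l f(x_l), as in eq. (1.4)."""
    if m is None:
        m = len(population)
    weights = [fitness(x) for x in population]
    return [rng.choices(population, weights=weights, k=1)[0] for _ in range(m)]

P = [1, 2, 3, 4]
f = lambda x: x  # toy positive-valued fitness function
random.seed(0)
P1 = select(P, f)
print(P1)
```

Fitter individuals are sampled proportionally more often; an individual with (near-)zero relative fitness effectively never enters P1.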
Crossover: The population P1 output by the selection process is the input. Let the search space Ω be a set, and fix an ordered k-tuple of integers q = (q_1, q_2, ..., q_k) with q_1 ≤ q_2 ≤ ... ≤ q_k. Say a partition K = {P_1, P_2, ..., P_k} of the set {1, 2, ..., m} is q-fit if |P_i| = q_i, and denote by Π_q^m the family of all q-fit partitions of {1, 2, ..., m}. Let F_{q_1}, F_{q_2}, ..., F_{q_k} be fixed families of q_i-ary operations on Ω, and let p_1, p_2, ..., p_k be probability distributions on (F_{q_1})^{q_1}, (F_{q_2})^{q_2}, ..., (F_{q_k})^{q_k} respectively. Let p_m be a probability distribution on the collection Π_q^m of q-fit partitions of {1, 2, ..., m}. There then exists a 2(k+1)-tuple

(F_{q_1}, F_{q_2}, ..., F_{q_k}, p_1, p_2, ..., p_k, p_m)

called the reproduction k-tuple. According to this reproduction k-tuple, the individuals of the population are partitioned into pairwise disjoint tuples for mating according to p_m, as follows:
if the partition is

K = { (i_1^1, i_2^1, ..., i_{q_1}^1), (i_1^2, i_2^2, ..., i_{q_2}^2), ..., (i_1^j, i_2^j, ..., i_{q_j}^j), ... }

then the corresponding tuples are

Q_1 = (x_{i_1^1}, x_{i_2^1}, ..., x_{i_{q_1}^1}),  Q_2 = (x_{i_1^2}, x_{i_2^2}, ..., x_{i_{q_2}^2}),  ...,  Q_j = (x_{i_1^j}, x_{i_2^j}, ..., x_{i_{q_j}^j})     (1.5)

Having selected the partition, replace every one of the selected q_j-tuples

(x_{i_1^j}, x_{i_2^j}, ..., x_{i_{q_j}^j})     (1.6)

with the q_j-tuple

( T_1(x_{i_1^j}, x_{i_2^j}, ..., x_{i_{q_j}^j}), T_2(x_{i_1^j}, x_{i_2^j}, ..., x_{i_{q_j}^j}), ..., T_{q_j}(x_{i_1^j}, x_{i_2^j}, ..., x_{i_{q_j}^j}) )     (1.7)

for a q_j-tuple of transformations (T_1, T_2, ..., T_{q_j}) ∈ (F_{q_j})^{q_j} selected randomly according to the probability distribution p_j on (F_{q_j})^{q_j}. This gives a new population

P1 = (y_1, y_2, ..., y_m)^T     (1.8)
Notice that a single child does not have to be produced by exactly two parents; it is possible that a child has more than two parents, and asexual reproduction (mutation) is also allowed.
A general evolutionary search algorithm works as follows. Fix a cycle, say C = {S_n}_{n=1}^{j}, where each S_n is one of a finite sequence of elementary steps. Start the algorithm with an initial population P as given above, which may be selected randomly. To run the algorithm with cycle C, simply input P into S_1, run S_1, input the output of S_1 into S_2, and so on, feeding the output of S_{j-1} into S_j to produce a new output, say P1. Take P1 as the initial population and run the cycle C again. Continue this loop finitely many times, depending on the circumstances. A recombination sub-algorithm is defined by a sequence of elementary steps consisting of reproduction only.
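The cycle C of elementary steps described above can be sketched as plain function composition (illustrative names; the toy "elementary steps" here stand in for selection, crossover and mutation):

```python
def run_cycle(P, steps):
    """Feed the output of each elementary step into the next: one pass of cycle C."""
    for S in steps:
        P = S(P)
    return P

def run_ea(P, steps, cycles):
    """Run the cycle C finitely many times, as in the general search algorithm."""
    for _ in range(cycles):
        P = run_cycle(P, steps)
    return P

# toy elementary steps acting on an integer 'population'
double = lambda P: [2 * x for x in P]
inc = lambda P: [x + 1 for x in P]
print(run_ea([1, 2], [double, inc], cycles=2))  # [7, 11]
```

A recombination sub-algorithm is then simply `run_cycle` with reproduction-only steps.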
Modified Evolutionary algorithm model:
The general structure of the EA proposed in [14] uses the following operators:
a. Initialization
b. Recombination or crossover
c. Mutation
d. Selection
The framework of the EA approach requires a floating architecture and a fixed population size. The population size, the maximum size and structure of the network, and the genetic parameters are user-specified. The weight population is initialized with a user-defined number of hidden nodes for each individual in order to create a new population, and the weights are generated randomly, one weight set per member of the population.
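The initialization step described above, one randomly generated weight set per member of the population with a user-defined hidden-node count, can be sketched as follows (illustrative names; a single hidden layer is assumed for brevity):

```python
import random

def init_population(n_in, n_hidden, n_out, pop_size, rng=random):
    """Create pop_size individuals; each is a pair of randomly
    initialized weight matrices for a feed-forward net with one
    hidden layer of the user-defined size."""
    population = []
    for _ in range(pop_size):
        w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
        w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
        population.append((w1, w2))
    return population

pop = init_population(n_in=2, n_hidden=3, n_out=1, pop_size=20)
print(len(pop))  # 20
```

The weight range [-1, 1] is an assumption for illustration; the paper only states that weights are generated randomly.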
ANN Recombination or Crossover:
In the proposed method, following the above discussion, consider a search space set Ω and a family F_q of transformations from Ω^q into Ω. Fix an ordered q-tuple of transformations T_1, T_2, ..., T_q ∈ F_q. Now consider the transformation (T_1, T_2, ..., T_q) : Ω^q → Ω^q sending any given element (x_1, x_2, ..., x_q)^T into

( T_1(x_1, x_2, ..., x_q), T_2(x_1, x_2, ..., x_q), ..., T_q(x_1, x_2, ..., x_q) )^T     (1.9)
Let the cycle C = {S_n}_{n=1}^{j} (each elementary step S_n being a recombination) be the recombination sub-algorithm of the proposed EA with reproduction k-tuple (F_{q_1}, F_{q_2}, ..., F_{q_k}, p_1, p_2, ..., p_k, p_m). This heuristic search algorithm yields a Markov process whose state space is the set Ω^m of populations P of fixed size m. The transition probability p_{x→y} is simply the probability that the population y ∈ Ω^m is obtained from the population x by going through the recombination cycle once. These transition probabilities can be computed, but the Markov chain obtained is difficult to analyze directly.
Fix an EA A, and let p^n_{x→y} > 0 denote the probability that a population y is obtained from the population x upon the completion of n complete cycles of recombination. We write x →_A y for "x leads to y", and for a population P ∈ Ω^m we write [P]_A for the equivalence class of P under the equivalence relation →_A.
Therefore the Markov chain initiated at some population P ∈ Ω^m is irreducible, and its unique stationary distribution is the uniform distribution on [P]_A.
Now fix a partition K_n = (P_1, P_2, ..., P_k) ∈ Π_{q_n}^m, where q_n = (q_{n1}, q_{n2}, ..., q_{nk}), and fix a particular choice of tuples of transformations

(T_1^i, T_2^i, ..., T_{q_{ni}}^i) ∈ (F_{q_{ni}})^{q_{ni}}     (2.0)

such that p_i(T_1^i, T_2^i, ..., T_{q_{ni}}^i) > 0.
First notice that we can identify Ω^m with the set Ω^{q_{n1}} × Ω^{q_{n2}} × ... × Ω^{q_{nk}} via the partition K = (P_1, P_2, ..., P_k) as follows: given x = (x_1, x_2, ..., x_m) ∈ Ω^m, identify x with the one-point-crossover element u^x = (u_1^x, u_2^x, ..., u_k^x), where u_i^x = (x_{a_1}, x_{a_2}, ..., x_{a_{q_i}}) with a_1, a_2, ..., a_{q_i} ∈ P_i.
The set of all such transformations is denoted by

H_n = { T_K^{T_1, ..., T_k} : K is a partition in Π_{q_n}^m and T_1, ..., T_k are chosen for recombination }     (2.2)

Now consider the set of transformations H from Ω^m into itself,

H = { T : Ω^m → Ω^m, T = F_j ∘ F_{j-1} ∘ ... ∘ F_1, F_n ∈ H_n }     (2.3)

Therefore any transformation T ∈ H is a composition of bijections, hence is itself a bijection, so that H ⊆ S_{Ω^m}, where S_{Ω^m} is the group of permutations of Ω^m.
Let G denote the subgroup of S_{Ω^m} generated by H. Now, when the EA A runs a cycle on the input x, this amounts to selecting transformations from H independently and applying them consecutively, so that the output of the cycle C on the input x is T(x) for some T ∈ H chosen with some positive probability.
We now proceed to define the random walk associated to a group action.
Let Ω be a finite set and G a finite group generated by H (H ⊆ G), and let e denote the identity of the group G (e ∈ H).
Let μ be a probability distribution on G which is concentrated on H, i.e. μ(g) > 0 ⟺ g ∈ H, and denote by p^n_{x→y} the probability that the state y is reached from the state x in exactly n steps.
The random walk on the action of the group G on the set Ω is the Markov process with transition probabilities

p_{x→y} = μ({ g : g(x) = y })     (2.4)

Since H generates G, for n large enough and for every g ∈ G we have p^n_{x→g(x)} > 0: indeed g = m_{n_g} ∘ ... ∘ m_2 ∘ m_1 for some m_i ∈ H; now let n = max{ n_g : g ∈ G }.
Therefore, by the definition of the group action, the n-step transition probabilities of the Markov chain (2.4) satisfy, writing g = m_{n_g} ∘ ... ∘ m_2 ∘ m_1 and padding the composition with the identity e up to length n,

p^n_{x→g(x)} ≥ μ(e)^{n − n_g} Π_{i=1}^{n_g} μ(m_i) > 0     (2.5)
Equation (2.5) gives an irreducible Markov chain with a finite state space X, which therefore has a unique stationary distribution, denoted by π. We then have

π(x) = 1/|X|     (2.6)

Then the distribution in the next generation, say π', is given by

π'(x) = Σ_{m ∈ H} μ(m) π(m^{-1}(x))     (2.7)

      = Σ_{m ∈ H} μ(m) (1/|X|)     (2.8)

      = (1/|X|) Σ_{m ∈ H} μ(m) = 1/|X| = π(x)     (2.9)
since Σ_{m ∈ H} μ(m) = 1 and μ is concentrated on H.
The Markov chain modeling an EA A is thus a random walk associated with the action of the finite group G on the set of populations: it generates new populations using the set H and, in the long run, samples them with the uniform distribution π.
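This long-run uniformity can be checked empirically on a toy orbit: a random walk on 3-tuples whose generating set H is the identity plus two adjacent transpositions, which together generate the symmetric group S3. A minimal sketch with illustrative names (`step`, `H`), not code from the paper:

```python
import random
from collections import Counter

def step(x, generators, rng):
    """One step of the random walk: apply a generator drawn from mu
    (taken uniform on H here, for simplicity)."""
    g = rng.choice(generators)
    return g(x)

# generators H: identity and two adjacent transpositions of a 3-tuple
e = lambda x: x
t01 = lambda x: (x[1], x[0], x[2])
t12 = lambda x: (x[0], x[2], x[1])
H = [e, t01, t12]  # generates all 6 permutations of the tuple

rng = random.Random(0)
counts = Counter()
x = (0, 1, 2)
for _ in range(60000):
    x = step(x, H, rng)
    counts[x] += 1

# every state in the orbit is visited with frequency close to 1/6
freqs = {s: c / 60000 for s, c in counts.items()}
print(sorted(freqs.values()))
```

The empirical frequencies cluster around 1/|X| = 1/6, matching the uniform stationary distribution in (2.6)-(2.9).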
The proposed evolutionary algorithm given above aims to improve the behavior between parents and offspring. Single-point crossover (given in section xxx) is applied with different cutting points for each of the two parents in the population. The cutting points are extracted independently for each parent because the genotype lengths of individuals are variable. The cutting point is taken only between one layer and the next (for two hidden layers, between the second layers of the two network parents); this means that a new evolutionary weight matrix is created to make the connection between the two layers at the cutting points in the parents, producing two offspring, so that the population size is kept constant. In each offspring, node or layer creation and deletion are possible based on the predefined genetic parameters.
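The variable-length single-point crossover described above, with an independently chosen cutting point per parent restricted to layer boundaries, can be sketched as list splicing (illustrative names; the genes stand in for per-layer weight blocks):

```python
import random

def crossover(parent1, parent2, cuts1, cuts2, rng=random):
    """Single-point crossover with a separate cutting point per parent.
    cuts1/cuts2 list the admissible cut indices (layer boundaries) of
    each parent; the two genotypes may have different lengths."""
    c1 = rng.choice(cuts1)
    c2 = rng.choice(cuts2)
    child1 = parent1[:c1] + parent2[c2:]
    child2 = parent2[:c2] + parent1[c1:]
    return child1, child2  # two offspring: population size stays constant

p1 = ['a1', 'a2', 'a3', 'a4']        # boundaries after genes 2 and 3
p2 = ['b1', 'b2', 'b3', 'b4', 'b5']  # boundary after gene 2
rng = random.Random(1)
c1, c2 = crossover(p1, p2, cuts1=[2, 3], cuts2=[2], rng=rng)
print(c1, c2)
```

Note the children generally differ in length from their parents, but the total genetic material (and hence the population size) is conserved.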
3.3 Topology Mutation:
The mutation transformations consist of the transformations

M_a : Ω → Ω     (3.0)

where a ∈ ∪_{i ∈ S} A_i with S ⊆ {1, 2, ..., n}. Therefore a = (a_{i_1}, a_{i_2}, ..., a_{i_k}) for i_1 < i_2 < ... < i_k ∈ S ⊆ {1, 2, ..., n}, defined as follows: for x = (x_1, x_2, ..., x_n) ∈ Ω we have

M_a(x) = y = (y_1, y_2, ..., y_n)     (3.1)

where y_q = a_q if q = i_j for some j, and y_q = x_q otherwise.
The global behavior of evolutionary algorithms is studied by considering a group or family of subsets of the search space and predicting which of these subsets (say Q) satisfy the property that the expected number of occurrences of elements of Q increases from one generation to the next. Each such subset is called a schema. If the chromosome length is fixed to n, the search space is S = Π_{i=1}^{n} A_i, where A_i is the set of all possible alleles which may occur at the ith position in the chromosome. The next section gives the selection of offspring based on the fitness function.
3.4 Selection: A tournament is performed by choosing a group of offspring at random and reproducing the best individual from this group. Pick P challengers as a group, where P is 10% of the population size, arrange a tournament with respect to fitness between the P challengers and the rth solution, and define the score of the rth solution. The scores are determined by the minimum-distance method using the fitness function [18]. This is called P-tournament selection. Arrange the scores of all the solutions in ascending order and pick the best half of the score positions; the best half of the scores is carried to the next generation. Repeat the process r times, where r is twice the population size, to obtain the scores of r P-tournaments. The selection probabilities for P-tournament selection are given by
p(r) = ((N − r + 1)^P − (N − r)^P) / N^P,  r = 1, ..., N     (3.2)
More selection pressures and their comparison are given in [15], [16].
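The P-tournament procedure above can be sketched as follows, with the scoring simplified to win counts against P random challengers rather than the minimum-distance method of [18] (illustrative names throughout):

```python
import random

def p_tournament(population, fitness, rng=random):
    """Score each solution by how many of P random challengers it
    beats (P = 10% of the population size), then keep the best half
    for the next generation."""
    P = max(1, len(population) // 10)
    scores = []
    for r, sol in enumerate(population):
        challengers = rng.sample(population, P)
        wins = sum(fitness(sol) >= fitness(c) for c in challengers)
        scores.append((wins, fitness(sol), r))
    # sort by score (ties broken by fitness), best first; best half survives
    scores.sort(key=lambda t: (t[0], t[1]), reverse=True)
    return [population[r] for _, _, r in scores[:len(population) // 2]]

random.seed(2)
pop = list(range(1, 21))  # 20 toy solutions, fitness = value itself
best_half = p_tournament(pop, fitness=lambda x: x)
print(sorted(best_half))
```

The best individual always survives (it wins every tournament it enters), while weak individuals survive only by luck of the challenger draw, which is the selection pressure the text describes.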
6. EXPERIMENTAL SETUP
The idea proposed in this work emphasizes evolving ANNs: a new evolutionary system for evolving feed-forward ANNs from the
architecture space. In this context, the evolutionary process attempts to crossover and mutate weights before performing any structural or topology crossover and mutation; weight mutation is carried out before structural (topology) mutation. The population size in the EA is taken as 20, and 10 independent trials are run to obtain the generalized behavior. The termination criterion is a fixed iteration count, equal to 100 for the EA. Table 5.1 gives all the parameters of the algorithm; default values are taken for the considered problem. All the experiments are run by specifying the parameters and tuning the genetic parameters to obtain the best solution.
Table 5.1 Default parameters.
In this work, five benchmark problems are used to check the ANN optimization:
a) N-bit (2 and 4) parity (even) classifier
b) Pima-India diabetes classifier
c) SPECT heart disease classifier
d) Breast cancer classifier
Performance on the N-bit Parity (XOR) classification problem:
In the simultaneous evolution of architecture and connection weights, only 2-bit and 4-bit parity encoders with different network sizes are considered in this section.
FIGURE 5.8 Performance of Evolutionary ANN for 2-bit parity with initial sizes of [2 3 2 1 2].
FIGURE 5.8 Performance of Evolutionary ANN for 2-bit parity with initial sizes of [2 2 2 1 2].
For parity 2/4, all networks in the space have a maximum of 10 nodes, comprising the 2/4 inputs, the hidden nodes in layer one, the hidden nodes in layer two, 1 output node and two hidden layers, i.e. the size is [2/4 2/3 2 1 2]. This allows hidden-layer configurations of up to 5 nodes to be evolved. The average and best generations over all runs that found a solution for parity-2 use the accuracy fitness function and the smallest architecture size found. The mean square error (MSE) for 10 trial runs is given in Table 5.2, and the performance of 5 runs is shown in Fig. 5.8; runs 3, 4 and 5 completed in 50 generations, and runs 1 and 2 completed in 20 generations. The average number of hidden nodes over 10 successful trial runs is 2.1 and the average number of connections is 7.9. For ten runs,
Symbol | Parameter | Default value
N | Population size | 20
Seed | Previously saved population | none
  | Probability of inserting a hidden layer | 0.1
  | Probability of deleting a hidden layer | 0.05
  | Probability of inserting a neuron in a hidden layer | 0.05
  | Probability of deleting a neuron in a hidden layer | 0.05
  | Probability of crossover | 0.1
  | Number of network inputs | Problem specific
  | Number of network outputs | Problem specific
K | MSE in the range | 10
for the N-bit parity problems, the best individuals were found with the four genetic-parameter probabilities set to 0.05, 0.05, 0.01 and 0.01.
FIGURE 5.10 Performance of Evolutionary ANN for
4-bit parity with initial size of [4 5 4 1 2].
FIGURE 5.10 Performance of Evolutionary ANN for
4-bit parity with initial size of [4 4 5 1 2].
Table 5.2 Performance of the ANN obtained by the EA for different trials.
Performance on real-time dataset classification problems:
For the real-time datasets, all the data applied to the training and test sets are acquired from the UCI Machine Learning Repository [17]. Each input variable should be preprocessed so that its mean value, averaged over the entire training set, is close to zero, or else small compared to its standard deviation.
i) Pima-India diabetes dataset
ii) SPECT heart disease dataset
The Pima-India diabetes dataset is composed of 8 attributes plus a binary class value showing the signs of diabetes, which corresponds to the target classification value, and includes 768 instances, shown in Table 4.8. The dataset is divided into two sets, using 500 instances for training and 268 for testing.
For the Single Proton Emission Computed Tomography (SPECT) heart dataset, only 13 attributes are used as input parameters to classify the problem, with a total of 267 instances. The target value is stored as the 14th parameter in the dataset. These data are normalized before being applied to the network. The dataset is divided into two sets, using 200 instances for training and 67 for testing.
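The preprocessing described above, centering each input attribute on its training-set mean after splitting the instances into training and test sets, can be sketched as follows (illustrative names; a tiny two-attribute dataset stands in for the real ones):

```python
def train_test_split(data, n_train):
    """Split the instances into a training set and a test set."""
    return data[:n_train], data[n_train:]

def zero_mean(train, test):
    """Center every attribute on its mean over the training set only,
    so the training-set mean of each input variable is near zero."""
    n_attrs = len(train[0])
    means = [sum(row[i] for row in train) / len(train) for i in range(n_attrs)]
    center = lambda rows: [[row[i] - means[i] for i in range(n_attrs)] for row in rows]
    return center(train), center(test)

# toy stand-in for e.g. the 768-instance Pima set (500 train / 268 test)
data = [[float(i % 7), float(i % 3)] for i in range(10)]
train, test = train_test_split(data, n_train=6)
train_c, test_c = zero_mean(train, test)
col0_mean = sum(row[0] for row in train_c) / len(train_c)
print(abs(col0_mean) < 1e-12)  # True: training mean is (numerically) zero
```

Computing the means on the training split only, and reusing them for the test split, keeps the test set an independent measure of performance.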
The evolutionary process is initialized with all the networks in the architecture space at a defined architecture size, for example [x y z 1 n], i.e. x inputs, y hidden nodes in the 1st hidden layer, z hidden nodes in the 2nd hidden layer, one output layer with one node, and n the number of layers. After the evolutionary ANN process, the optimized network consists of only 2 hidden nodes in a single hidden layer with a uni-modal sigmoid activation function; the results of the real-data classification problems are shown in the figures and tables.
Trial No. | MSE ([2 3 2 1 2]) | MSE ([4 5 4 1 2])
1  | 9.0084e-003 | 3.2548e-006
2  | 2.1219e-026 | 1.3548e-002
3  | 2.0416e-014 | 6.3254e-011
4  | 1.3406e-003 | 5.4856e-019
5  | 2.1219e-026 | 9.2154e-026
6  | 9.0084e-003 | 9.3554e-004
7  | 2.1219e-026 | 2.8754e-014
8  | 2.0416e-014 | 9.2365e-013
9  | 1.3406e-003 | 3.4587e-001
10 | 3.2323e-022 | 8.2657e-016
FIGURE 5.12 Performance of Evolutionary ANN for
Pima India diabetes with initial size of [9 4 5 1 2]
Table 5.4 Results of Pima India Diabetes.
FIGURE 5.13 Performance of Evolutionary ANN for
SPECT Heart dataset with initial size of [14 4 5 1 2].
FIGURE 5.14 Performance of Evolutionary ANN for
Breast Cancer dataset with initial size of [11 4 5 1 2].
Table 5.6 Results of SPECT Heart dataset.
For the Pima-India classification, the average mean square error is 8.6214e-3. During training the network is adjusted according to its error, whereas the test process provides an independent measure of network performance during and after training. The best solution was found in less than 50 generations with the genetic-parameter probabilities set to 0.1, 0.05, 0.1 and 0.1. Another network size, [9 5 4 1 2], is also shown in Table 5.4, with a minimum of 3 hidden nodes in a single hidden layer. The results of the heart dataset are shown in Table 5.6, and a comparison with the literature is shown in Table 5.7. Ten runs are executed and the
Pima-India diabetes (Table 5.4):
Parameter | Result ([9 4 5 1 2]) | Result ([9 5 4 1 2])
Number of runs | 10 | 10
Number of generations | 40 | 61
Number of training patterns used | 500 | 500
Average training set accuracy | 76.0 | 76.5
Number of test patterns used | 268 | 268
Average test set accuracy | 81.5 | 83.5
Initial number of hidden layers / nodes | 2 / [4 5] | 2 / [5 4]
Final number of hidden layers / nodes (resulting NN) | 1 / [2] | 1 / [3]
Population size | 50 | 50
Number of inputs | 09 | 09
Number of outputs | 01 | 01

SPECT heart (Table 5.6):
Parameter | Result ([14 4 5 1 2]) | Result ([14 5 4 1 2])
Number of runs | 10 | 10
Number of generations | 90 | 103
Number of training patterns used | 200 | 200
Average training set accuracy | 86.0 | 87.2
Number of test patterns used | 67 | 67
Average test set accuracy | 85.2 | 86.5
Initial number of hidden layers / nodes | 2 / [4 5] | 2 / [5 4]
Final number of hidden layers / nodes (resulting NN) | 1 / [3] | 1 / [3]
Population size | 50 | 50
Number of inputs | 14 | 14
Number of outputs | 01 | 01
average percentage error values of the training and test process are summarized in Table 5.6; 5 trial runs are shown in Fig. 5.13, with an average mean square error of 7.7264e-3. The best solutions were reached in less than 90 generations with all four genetic-parameter probabilities set to 0.1. Another network size, [14 5 4 1 2], is also shown in the table, with a minimum of 3 hidden nodes in a single hidden layer. The results of the breast cancer dataset are shown in Table 5.8, and a comparison with the literature is shown in Table 5.9.
Table 5.8 Results of Breast Cancer dataset.
Ten runs are executed and the average percentage error values of the training and test process are summarized in Table 5.8; 5 trial runs are shown in Fig. 5.14, with an average mean square error of 5.3614e-3. The best solution was reached in less than 45 generations with all four genetic-parameter probabilities set to 0.05 in all runs. Another network size, [11 5 4 1 2], is also shown in the table, with a minimum of 3 hidden nodes in a single hidden layer.
CONCLUSION:
The optimal architecture and weights of the ANN in the learning phase are obtained using the concept of an evolutionary genetic algorithm. The proposed method of joint architecture and weight adjustment outperforms the fixed-network back-propagation at every level for the 2-bit and 4-bit parity problems, and the real-dataset classification problems reach excellent accuracy with an optimized network using fewer hidden nodes and layers.
REFERENCES:
[1] Michalewicz, Z. Genetic algorithms + data
structures = evolution programs. Springer-Verlag.
1996.
[2] Mühlenbein, H. and Mahnig, T. Evolutionary
computation and beyond. In Y. Uesaka, P.
Kanerva, and H. Asoh, editors, Foundations of
Real-World Intelligence, CSLI Publications, pp.
123-188, 2001.
[3] Mitavskiy B. Crossover Invariant Subsets of the
Search Space for Evolutionary Algorithms.
Evolutionary Computation.
http://www.math.lsa.umich.edu/vbmitavsk/
[4] J. H. Holland, "Adaptation in Natural and
Artificial Systems. Ann Arbor", MI: Univ. of
Michigan Press, 1975.
[5] Coffey, S. An Applied Probabilist's Guide to
Genetic Algorithms. A Thesis Submitted to The
University of Dublin for the degree of Master in
Science, 1999.
[6] Mac Lane, S. Categories for the working
mathematician. Graduate Texts in Mathematics
5, Springer-Verlag. 1971.
[7] Poli, R., Stephens, C., Wright, A., Rowe, J. A Schema-Theory-Based Extension of Geiringer's Theorem for Linear GP and Variable-Length GAs under Homologous Crossover, 2002.
[8] Vose, M. Generalizing the notion of a schema in
genetic algorithms Artificial Intelligence, 50(3):
385-396, 1991.
Parameter | Result ([11 4 5 1 2]) | Result ([11 5 4 1 2])
Number of runs | 10 | 10
Number of generations | 45 | 52
Number of training patterns used | 400 | 400
Average training set accuracy | 97.0 | 97.0
Number of test patterns used | 240 | 240
Average test set accuracy | 98.5 | 98.5
Initial number of hidden layers / nodes | 2 / [4 5] | 2 / [5 4]
Final number of hidden layers / nodes (resulting NN) | 1 / [2] | 1 / [2]
Population size | 50 | 50
Number of inputs | 11 | 11
Number of outputs | 01 | 01
[9] Radcliffe, N. The algebra of genetic algorithms.
Annals of Mathematics and Artificial
Intelligence, 10:339-384, 1994.
http://users.breathemail.net/njr/papers/amai94.pdf
[10] Geiringer, H. On the probability of linkage in
Mendelian heredity. Annals of Mathematical
Statistics, 15:25-57, 1944.
[11]Vose, M. and Wright, A. The simple genetic
algorithm and the Walsh transform: Part II, the
inverse. Evolutionary Computation, 6(3):275-
289, 1998.
[12]Stephens, C. and Waelbroeck, H. Schemata
evolution and building blocks. Evolutionary
Computation, 7(2):109-124, 1999.
[13] C. Stephens. The Renormalization Group and the Dynamics of Genetic Systems, to be published in Acta Physica Slovaca, 2002. http://arXiv.org/abs/cond-mat/0210271
[14] Wright, A., Rowe, J., Poli, R., and Stephens C. A
fixed point analysis of a gene pool GA with
mutation. Proceedings of the Genetic and
Evolutionary Computation Conference (GECCO)
Morgan Kaufmann. 2002.
http://www.cs.umt.edu/u/wright/.
[15] Jun He, Xin Yao. Drift analysis and average time complexity of evolutionary algorithms. Artificial Intelligence 127, 57-85, 2001.
[16] T. Chen, J. He, G. Sun, G. Chen, X. Yao. A new approach to analyzing average time complexity of population-based evolutionary algorithms on unimodal problems. IEEE Trans. Syst., Man, and Cybern., Part B 39 (5), 1092-1106, 2009.
[17] D.J. Newman, S. Hettich, C.L. Blake, and C.J. Merz. UCI repository of machine learning databases, 1998.
[18] M. Hutter, S. Legg. Fitness uniform optimization. IEEE Trans. Evol. Comput. 10 (5), 568-589, 2006.
[19] Liepins, G. and Vose, M. Characterizing crossover in Genetic Algorithms. Annals of Mathematics and Artificial Intelligence, 5: 27-34, 1992.