Methods of Combining Neural Networks and Genetic Algorithms


Talib S. Hussain
Queen's University
[email protected]

1. Introduction

In the past decade, two areas of research which have become very popular are the fields of neural networks (NNs) and genetic algorithms (GAs). Both are computational abstractions of biological information processing systems, and both have captured the imaginations of researchers all over the world. In general, NNs are used as learning systems and GAs as optimisation systems, but as many researchers have discovered, they may be combined in a number of different ways, resulting in highly successful adaptive systems. In this tutorial, a summary will be given of these combination methods. This summary is not meant to be exhaustive, but rather to be indicative of the type of research being conducted. For a more detailed discussion, see Yao (1993) and Schaffer et al. (1992).

The tutorial is broken into three sections. In the first section, a brief introduction to the foundations of neural networks and genetic algorithms is given. It is assumed that the participants have a basic understanding of both fields, and this introduction is designed as a short refresher. In the second section, a variety of approaches to integrating NNs and GAs are presented. In the final section, some of the key research issues are discussed.

1.1 Neural Networks

To set up the terminology for the rest of the paper, let us review the basics of a neural network. A neural network is a computational model consisting of a number of connected elements, known as neurons. A neuron is a processing unit that receives input from outside the network and/or from other neurons, applies a local transformation to that input, and provides a single output signal which is passed on to other neurons and/or outside the network. Each of the inputs is modified by a value associated with the connection. This value is referred to as the connection strength, or weight, and, roughly speaking, represents how much importance the neuron attaches to that input source. The local transformation is referred to as the activation function and is usually sigmoidal in nature.
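The computation performed by a single neuron as described above can be sketched in a few lines. This is a minimal illustration, not code from the tutorial; the weight and input values are arbitrary examples.

```python
import math

def neuron_output(inputs, weights, bias=0.0):
    """A neuron: weighted sum of inputs passed through a sigmoidal activation."""
    total = bias + sum(x * w for x, w in zip(inputs, weights))
    return 1.0 / (1.0 + math.exp(-total))

# A neuron that attaches more importance (weight 0.8) to its first input
# than to its second (weight 0.2).
y = neuron_output([1.0, 0.5], [0.8, 0.2])
```

The sigmoid squashes the weighted sum into the range (0, 1), so the output can itself serve as the input to further neurons.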

A typical neural network is capable of representing many functions, as proved by Kolmogorov's Theorem, but finding the best network needed to solve a specific problem is a very open-ended problem. If the developer knows the exact solution method, then he can program the network structure explicitly. However, if the problem is very complex or has no known solution, the developer may not know what structure to give the network. To this end, most neural network models include a learning rule which can change the network's structure over the course of training to arrive at a good final solution. Back-propagation is the most popular learning rule.

1.2 Genetic Algorithms

A variety of computational models based on evolutionary processes have been proposed, and the most popular models are those known as genetic algorithms. A genetic algorithm has four main elements: the genetic code, a concise representation for an individual solution; the population, a number of individual solutions; the fitness function, an evaluation of the usefulness of an individual; and the propagation techniques, a set of methods for generating new individuals. The genetic algorithm works as follows. First, a population of individuals is generated by randomly selecting different genes. The fitness of each individual is then evaluated, and the propagation techniques are applied to highly fit individuals to generate a new population - the next generation. The cycle of evaluate and propagate continues until a satisfactory solution, hopefully optimal, is found.
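The evaluate-and-propagate cycle described above can be sketched as a short program. This is an illustrative sketch only: the gene length, population size, and selection scheme are arbitrary choices, and the example fitness function (counting 1 bits, the classic "OneMax" toy problem) is not from the tutorial.

```python
import random

def run_ga(fitness, gene_len=20, pop_size=30, generations=50,
           mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Initial population: random fixed-length bit strings.
    pop = [[rng.randint(0, 1) for _ in range(gene_len)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)     # evaluate
        next_pop = pop[:2]                      # elitism: best two survive intact
        while len(next_pop) < pop_size:         # propagate
            p1, p2 = rng.sample(pop[:pop_size // 2], 2)  # pick parents from the fitter half
            split = rng.randrange(1, gene_len)           # one-point crossover
            child = p1[:split] + p2[split:]
            # Mutation: flip a small number of randomly selected bits.
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)

best = run_ga(fitness=sum)   # OneMax: fitness is the number of 1 bits in the gene
```

Note that the genetic representation (bit strings) and the fitness function are passed in as the two problem-defining elements, matching the observation above that together they determine the problem being solved.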

In a typical genetic algorithm, the genetic code is a fixed-length bit string and the population is always a fixed size. The three most common propagation techniques are elitism, mutation and crossover. In elitism, the exact individual survives into the next generation. In mutation, a new individual is created from an old one by changing a small number of randomly selected bits in its gene. In crossover, a new individual is created from two old ones by randomly selecting a split point in their genes and creating a new gene with the left part from one parent and the right part from the other. In any genetic algorithm, the two key aspects are the genetic representation and the fitness function. Together, these determine the type of problem which is being solved and the possible solutions which may be generated.

2. Combining NNs and GAs

2.1 Supportive and Collaborative

Researchers have combined NNs and GAs in a number of different ways. Schaffer et al. have noted that these combinations can be classified into one of two general types - supportive combinations, in which the NN and GA are applied sequentially, and collaborative combinations, in which they are applied simultaneously.

In a supportive approach, the GA and the NN are applied to two different stages of the problem. The most common combination is to use a GA to pre-process the data set that is used to train a NN. For instance, the GA may be used to reduce the dimensionality of the data space by eliminating redundant or unnecessary features. Supportive combinations are not highly interesting since the GA and NN are used very independently and either can easily be replaced by an alternative technique. Some other possible combinations include: using a NN to select the starting population for the GA; using a GA to analyse the representations of a NN; and using a GA and NN to solve the same problem and integrating their responses using a voting scheme (Schaffer et al.).

Alternatively, in a collaborative approach, the GA and NN are integrated into a single system in which a population of neural networks is evolved. In other words, the goal of the system is to find the optimal neural network solution. Such collaborative approaches are possible since neural network learning and genetic algorithms are both forms of search. A neural network learning rule performs a highly constrained search to optimise the network's structure, while a genetic algorithm performs a very general population-based search to find an optimally fit gene. Both are examples of biased search techniques, and "any algorithm that employs a bias to guide its future samples can be misled in a search space with the right structure. There is always an Achilles heel." (Schaffer et al., p. 4) The primary reason researchers have looked at integrating NNs and GAs is the belief that they may compensate for each other's search weaknesses.

2.2 Evolution of Connection Weights

A genetic algorithm can be applied to optimising a neural network in a variety of ways. Yao has indicated three main approaches: the evolution of weights, the evolution of topology, and the evolution of learning rules. In each case, the GA's genetic code varies greatly.

In the first, the GA is used as the learning rule of the NN. The genetic code is a direct encoding of the neural network, with each weight being represented explicitly. The population of the GA consists of NNs with the same basic topology, but with different weight values. Mutation and crossover thus affect only the weights of the individuals. A key question in such a system is whether to use binary weights or real-valued ones - the latter increases the search space greatly.

Using GAs instead of gradient descent algorithms to train the weights can result in faster and better convergence. Better still, since GAs are good at global search but inefficient at local, finely tuned search, a hybrid approach combining GAs and gradient descent is attractive (Yao).
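The weight-evolution approach can be sketched as follows, with real-valued genes and a fixed topology. This is an illustrative sketch under several assumptions not stated in the tutorial: a 2-2-1 feed-forward network, the XOR task as the fitness problem, elitist selection, and Gaussian mutation with no crossover.

```python
import math, random

def net(weights, x):
    """Fixed 2-2-1 topology; `weights` is the flat genetic code of 9 real values."""
    sig = lambda t: 1.0 / (1.0 + math.exp(-t))
    h1 = sig(weights[0] * x[0] + weights[1] * x[1] + weights[2])
    h2 = sig(weights[3] * x[0] + weights[4] * x[1] + weights[5])
    return sig(weights[6] * h1 + weights[7] * h2 + weights[8])

XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def fitness(weights):
    # Negative squared error over the training set: higher is fitter.
    return -sum((net(weights, x) - y) ** 2 for x, y in XOR)

rng = random.Random(1)
# Population: networks sharing one topology but differing in weight values.
pop = [[rng.uniform(-1, 1) for _ in range(9)] for _ in range(40)]
for _ in range(200):
    pop.sort(key=fitness, reverse=True)
    # Elitism plus Gaussian mutation of the fitter half (no crossover here).
    pop = pop[:20] + [[w + rng.gauss(0, 0.3) for w in p] for p in pop[:20]]
best = max(pop, key=fitness)
```

In a hybrid scheme of the kind mentioned above, the GA's best individual would then be handed to a gradient descent routine for the local fine tuning that the GA performs inefficiently.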

2.3 Evolution of Architectures

In the second approach, the GA is used to select general structural parameters and the neural learning is used separately to train the network and determine its fitness. This includes evolution of both the topology (i.e., connectivity pattern) and activation functions of each node, although most work has concentrated on the former and little has been done on the latter.

In architecture evolution, the genetic code can be either a direct or indirect encoding of the network's topology. In a direct encoding, each connection is explicitly represented (e.g., a matrix where 1 indicates the presence of a connection and 0 indicates no connection). In an indirect encoding, important parameters of the network are represented and the details of the exact connectivity are left to developmental rules (e.g., specify the number of hidden nodes and assume full connectivity between layers).

In both cases, the exact neural network is not specified since the weights are determined by the initialisation routine and the network's learning algorithm. Thus, the evaluation of a gene is noisy since it is dependent upon the evaluation of the trained network, and the GA finds the best set of architectural parameters rather than the best neural network.
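The two encoding styles can be contrasted in a short sketch. Both helper functions here are illustrative inventions, not from the tutorial: one draws a random direct encoding (a full connectivity matrix), the other plays the role of a developmental rule that expands an indirect encoding (a list of layer sizes) into the connections it implies.

```python
import random

# Direct encoding: a full connectivity matrix, one gene per possible connection
# (1 = connection present, 0 = no connection). n_nodes**2 genes in total.
def random_direct_encoding(n_nodes, rng):
    return [[rng.randint(0, 1) for _ in range(n_nodes)] for _ in range(n_nodes)]

# Indirect encoding: only the layer sizes are held in the gene; the
# developmental rule assumes full connectivity between adjacent layers.
def develop_layered_net(layer_sizes):
    """Expand a gene such as [2, 3, 1] into the connection list it implies."""
    offsets = [sum(layer_sizes[:i]) for i in range(len(layer_sizes))]
    connections = []
    for layer in range(len(layer_sizes) - 1):
        for src in range(layer_sizes[layer]):
            for dst in range(layer_sizes[layer + 1]):
                connections.append((offsets[layer] + src,
                                    offsets[layer + 1] + dst))
    return connections

matrix = random_direct_encoding(6, random.Random(0))   # 36 genes for 6 nodes
conns = develop_layered_net([2, 3, 1])                 # 9 connections from a 3-gene code
```

The contrast in gene length illustrates the compactness trade-off discussed later: the direct encoding grows quadratically with network size, while the indirect encoding stays small but can only express layered, fully connected structures.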

2.4 Evolution of Learning Rules

In the final approach, the GA is used similarly to the evolution of architecture, but a parametric representation of the network's learning rule is also encoded in the gene. The genetic coding of topology in this case is generally indirect.

Evolving learning rules does not refer simply to adapting learning algorithm parameters (e.g., learning rate, momentum, etc.) but to adapting the learning functions themselves. This is an area of research which has received little attention. "The biggest problem here is how to encode the dynamic behaviour of a learning rule into static genotypes. Trying to develop a universal representation scheme which can specify any kind of dynamic behaviours is clearly impractical, let alone the prohibitively long computation time required to search such a learning rule space." (Yao, p. 214)

3. Issues

Collaborative combinations of NNs and GAs have sparked the interest of a great number of researchers because of their obvious analogy to natural systems. A wide variety of systems have been developed and a number of research issues have been considered.

3.1 The Baldwin Effect

In general, one may wonder whether it really is of any use to have both neural learning and genetic search operating in the same system. Perhaps using just genetic search would work given enough time, or perhaps a very general neural learning technique would be sufficiently powerful. This is quite possibly true, but an observation from natural systems known as the Baldwin Effect provides a clearer answer.

The Baldwin Effect states that in an evolutionary system, successful genes can propagate faster (and in some cases can propagate only) if the individuals are capable of learning. This principle has been clearly demonstrated in an artificial evolutionary system by French & Messinger (1994). Thus, an evolutionary system with simple individuals which can learn is generally more successful than one with non-learning individuals, and probably also better than a single highly complex learning individual.

3.2 Generalisation

In evolving a neural network, attention must be paid to the trade-off between evolutionary fitness and generalisation ability. In many tasks, the final network is trained on a small set of data and applied to a much larger set of data. The goal of the learning is actually to develop a neural network with the best performance on the entire problem and not just the training data. However, this can easily be overlooked during the development process.

Thus, one must be careful when evolvingneural networks not to select for highly specialised,poorly generalising networks. This is especially true inproblem areas which are highly dynamic.

3.3 Encoding Methods

The two main properties of an encoding of a neural network in a GA are its compactness and representation capability. A compact encoding is useful since the GA can then be efficiently applied to problems requiring large NN solutions. An encoding should be powerful enough to represent a large class of NNs, or else the GA may not generate very good solutions. For instance, direct encoding is generally quite powerful in representation, but not compact, while parameterised encoding is compact, yet often represents a highly restrictive set of structures.

The discussion so far has focused on direct encoding and parametric encoding of neural network structure. Other possibilities also exist. In particular, grammatical encoding has recently received some attention (Gruau, 1994). Grammatical encoding is quite powerful since it is compact but can represent a great range of networks.

4. Conclusions

Neural networks and genetic algorithms are two highly popular areas of research, and integrating both techniques can often lead to highly successful learning systems. The participants of this tutorial are encouraged to try applying evolutionary neural network solutions, or even developing new combinations of their own.

References

French, R. & Messinger, A. (1994). "Genes, phenes and the Baldwin Effect: Learning and evolution in a simulated population," Artificial Life IV, 277-282.

Gruau, F. (1994). "Automatic definition of modular neural networks," Adaptive Behaviour, 3, 151-184.

Schaffer, D., Whitley, D. & Eshelman, L. (1992). "Combinations of Genetic Algorithms and Neural Networks: A survey of the state of the art," Proceedings of the International Workshop on Combinations of Genetic Algorithms and Neural Networks, D. Whitley and D. Schaffer (Eds.), Los Alamitos, CA: IEEE Computer Society Press, 1-37.

Yao, X. (1993). "Evolutionary artificial neural networks," International Journal of Neural Systems, 4, 203-222.