Evolution of Artificial Brains in Simulated Animal Behaviour

Using radial basis, linear and random functions for decision-making

BJÖRN TEGELUND
JOHAN WIKSTRÖM

Bachelor’s Thesis at CSC
Supervisor: Petter Ögren

Examiner: Mårten Björkman

TRITA xxx yyyy-nn


Abstract

In this report we simulate artificial intelligence in animals using genetic algorithms. In similar models, advanced artificial neural networks have been used for decision making. We present two simpler decision-making models. Using two models based on linear and radial basis functions, we find behaviours similar to those found in other studies, including food seeking, obstacle avoidance and predator-versus-prey dynamics.

The results show that both decision-making models are equally efficient at gathering food and avoiding obstacles. The models differed in survival strategies when faced with dangerous obstacles, and in a predator-versus-prey situation the predators based on radial basis functions performed better.

Some evolutionary phenomena were observed during the evolution of the animals, including an evolutionary arms-race between predator and prey. We hoped to find signs of mimicry as well, but classic mimicry was not found in the results.


Referat

Evolution of artificial brains in the simulation of animal behaviour

In this report we simulate artificial intelligence in animals using genetic algorithms. In similar models, advanced artificial neural networks have previously been used as decision-making models. In this report we present two simpler decision-making models. Using two models based on linear and radial basis functions, behaviours similar to those found in earlier reports are observed, including food seeking, obstacle avoidance and predator-versus-prey dynamics.

The results show that both decision-making models are equally efficient at finding food and avoiding obstacles. The models' survival strategies differ when dangerous obstacles are introduced, and in predator-versus-prey situations the predator based on radial basis functions is more efficient.

Some evolutionary phenomena were observed during the evolution of the animals, including an evolutionary arms-race between prey and predator. We also hoped to find signs of mimicry, but classic mimicry was not observed in the results.


Contents

1 Introduction
   1.1 Background
   1.2 Scope and Objectives
   1.3 Problem Statement

2 Technical Overview
   2.1 Evolutionary Concepts in Nature
       2.1.1 Overview
   2.2 Genetic Algorithms
       2.2.1 Overview
       2.2.2 Fitness
       2.2.3 Selection
       2.2.4 Mutation
       2.2.5 Crossover
   2.3 Radial Basis Functions
       2.3.1 Overview

3 Implementation
   3.1 Model
       3.1.1 Implementation in Python
       3.1.2 Simulated World
       3.1.3 Methods of Enforcing Behaviour
       3.1.4 Decision Making
       3.1.5 Linear Decision Making
       3.1.6 RBF-Based Decision Making
       3.1.7 Random Decision Making
       3.1.8 Genetic Algorithm
   3.2 Experiments
       3.2.1 Finding and eating food
       3.2.2 Avoiding bad food
       3.2.3 Predators and prey

4 Results
   4.1 Simulation results
       4.1.1 Finding and eating food
       4.1.2 Avoiding bad food
       4.1.3 Predators and prey
   4.2 Discussion
       4.2.1 Constraints and Problems
       4.2.2 Simulation Accuracy and Applications
   4.3 Conclusions and Future Work

Bibliography

List of Figures

Appendices

A Third-party libraries and tools used

B Source Code

C Statement of Collaboration


Chapter 1

Introduction

1.1 Background

Today’s environment is becoming more and more extreme. This makes it increasingly important to model and simulate different ecological scenarios. A typical scenario could be moving a species of animals from one ecosystem to another, or predicting which species may face extinction in a continuously changing environment. What these scenarios have in common is that the animals need to adapt to the changes occurring around them. They need to evolve new traits and behaviours to survive. An important tool used to model evolution is the genetic algorithm, which simulates evolution to approximate an optimal solution to an algorithmic problem.

Genetic algorithms have been used on several occasions to simulate artificial life as well as artificial intelligence, and this report is a continuation of that research, focusing on the behaviour of the animals. They have, for example, been used for training machine learning algorithms [8], for robot motion planning [5] and for training artificial neural networks in the ecosystem simulator Gaia [3]. The animals simulated in Gaia showed several interesting behaviours, two prominent ones being searching for food and obstacle avoidance. The authors of that report acknowledged that many of the behaviours observed could be implemented using simple linear associations instead of advanced and computationally expensive neural networks [3]. The goal of this report is to find out if that is possible.

1.2 Scope and Objectives

This report focuses on evolving simulated animal behaviour using two kinds of simple mathematical functions capable of linear associations, namely linear and radial basis functions. As a baseline, a decision-making model which makes random decisions is also used. Each decision-making model takes genes as input, which are produced by the genetic algorithm. Each model can be thought of as the “brain” of the simulated animal. Statistics such as learning capabilities, choice of strategy and how well the animals are able to adapt to their surroundings are then analysed.


In nature evolution is not a straightforward process. Instead, many different evolutionary phenomena affect the course a species takes when evolving. A secondary goal of this project is therefore to see which evolutionary phenomena can be observed when running the simulation, and in particular how they may relate to the animals’ behaviour.

In each experiment the animals are presented with different surroundings, which can contain food, predators and other objects. The two primary decision-making models are compared and contrasted after each experiment has been run, with the random model included when necessary.

1.3 Problem Statement

1. Is it possible to simulate evolution of animal behaviour using linear and radial basis functions in combination with a genetic algorithm?

2. Are there any significant differences between the two decision-making models?

3. Which, if any, evolutionary phenomena can be observed in the simulated animals?


Chapter 2

Technical Overview

2.1 Evolutionary Concepts in Nature

2.1.1 Overview

This section provides a short overview of the evolutionary concepts in nature which are mentioned in this report. As every solution that the genetic algorithm proposes can be thought of as an individual in a population, many natural evolutionary phenomena can be observed in these individuals as well.

Adaptation is when an individual is forced to adapt to its surroundings in order to survive. In nature adaptation occurs over generations, and the need to adapt is the primary motivator behind evolution [6, p. 8]. The fitness of an individual depends to a great extent on how well-adapted it is to its surroundings. During extreme environmental change adaptation is accelerated, and once the population can survive the adaptation rate decreases. A large population size increases the ability to adapt, as a large number of potentially helpful adaptations can be made per generation. A smaller population is more unstable and vulnerable, as random environmental changes can eradicate important individuals or genes from the population. An example of adaptation in nature is how the polar bear evolved to have thicker, whiter fur when moving to a colder, snow-covered environment.

Co-evolution is when the evolution of a species is affected by the evolution of another [6, p. 90]. This could be in a symbiotic, parasitic or predator-versus-prey manner. In a predator-versus-prey situation, if the predator evolves faster, the prey is usually faced with extinction. If it is the other way around, the predators are faced with extinction, as they may not be able to catch enough prey for their population to survive. An example of this is the evolutionary arms-race between the cheetah and its prey, which are in a constant battle over being able to run the fastest.

Mimicry is a kind of adaptation where one species mimics an object’s appearance or behaviour [6, p. 278]. This could be camouflaging oneself as a stone or evolving brightly coloured patterns, characteristic of poisonous animals, despite not being poisonous. Mimicry can be found in certain species of butterflies whose wings take on the shape, colour and texture of leaves.


S ← a random distribution of genes
while the genes have not converged towards a solution do
    F ← fitness(S)              ▷ Calculate fitness values for all s ∈ S
    X ← select(S, F)            ▷ Get a multiset of S using fitness values
    C ← crossover(X)            ▷ Apply crossover operator on multiset
    M ← mutate(C)               ▷ Apply mutation operator on crossed genes
    S ← M                       ▷ Restart with the new generation
end while

Figure 2.1. An overview of the genetic algorithm used in the experiments.


2.2 Genetic Algorithms

2.2.1 Overview

Genetic algorithms are a way of approximating solutions to NP-hard problems, a group of problems that are computationally expensive to solve exactly. Genetic algorithms are a form of machine learning algorithm typically used to solve problems with a large or complex search space, where other machine learning algorithms fail [7, p. 269-288]. Genetic algorithms resemble evolution in nature by defining a set of genomes, or individuals, where each genome represents a possible solution to the given problem. A genome is commonly stored as a string of zeros and ones or as a list of floating point values. This set of genomes undergoes several iterations, or generations, of small improvements through fitness evaluation, selection, mutation and crossover. These operators all have equivalents in natural evolution, and eventually the genomes converge to a solution which represents a local optimum in the search space. The advantage of genetic algorithms is that the crossover operator enables a wider search over the problem domain than many other approaches, using fewer calculations [4]. In the following subsections the most crucial parts of the genetic algorithm are explained, and Figure 2.1 depicts an overview of the genetic algorithm in pseudocode.
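To make the loop in Figure 2.1 concrete, the following is a minimal, self-contained Python sketch of such a generational loop. It is not the implementation used in this project; the toy fitness function, gene length and rates are assumptions chosen only for illustration.

import random

GENE_LENGTH = 12          # e.g. one gene per weight of a simple brain
POP_SIZE = 50
GENERATIONS = 100
MUTATION_RATE = 0.1
CROSSOVER_RATE = 0.3


def fitness(genome):
    # Toy fitness: how close the genes are to an arbitrary target value.
    return 1.0 / (1.0 + sum((g - 0.5) ** 2 for g in genome))


def select(population, fitnesses):
    # Fitness-proportionate (roulette wheel) selection with replacement.
    return random.choices(population, weights=fitnesses, k=len(population))


def crossover(parent_a, parent_b):
    # Single point crossover: split both genomes at the same random position.
    if random.random() > CROSSOVER_RATE:
        return parent_a[:], parent_b[:]
    point = random.randint(1, GENE_LENGTH - 1)
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])


def mutate(genome):
    # Perturb each gene with a small probability.
    return [g + random.gauss(0, 0.1) if random.random() < MUTATION_RATE else g
            for g in genome]


population = [[random.uniform(-1, 1) for _ in range(GENE_LENGTH)]
              for _ in range(POP_SIZE)]

for generation in range(GENERATIONS):
    scores = [fitness(g) for g in population]           # F <- fitness(S)
    mating_pool = select(population, scores)            # X <- select(S, F)
    next_population = []
    for i in range(0, POP_SIZE, 2):
        child_a, child_b = crossover(mating_pool[i], mating_pool[i + 1])
        next_population += [mutate(child_a), mutate(child_b)]   # C, M
    population = next_population                         # S <- M

print(max(fitness(g) for g in population))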

2.2.2 Fitness

Fitness is the only means of measuring the success of a solution. In nature, fitness is determined by the number of offspring produced by an individual and the percentage of the offspring which survive long enough to generate new offspring [2, p. 117]. In 1932, Sewall Wright suggested that species adapt until they reach a local maximum in a fitness search space, which is a multidimensional space consisting of all possible combinations of genes and their resulting fitness. To reach another local maximum in the search space, which might be higher than the previous one, the species must journey through a “valley” in the search space that would result in a temporarily lower fitness [10].


In genetic algorithms, the fitness value can be any numerical value that describes each individual’s success at solving the problem at hand, determined by a fitness function. Genetic algorithms can gain significant performance advantages from selecting the correct fitness function. The fitness function should be strictly positive for all inputs and preserve some form of relative internal ordering of the individuals. In most problems there are multiple choices of fitness function, and the best choice is unique to every problem [7, p. 272]. It is possible to force the individuals to use certain strategies by rewarding or penalising an individual’s fitness when it shows certain behaviour, but that is not desired in this project. In nature survival strategies vary, and that is a property which should be reflected in the simulation. Therefore the life length of the individuals was chosen as the fitness function, so that the animals can choose their survival strategy freely.
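As an illustration of this choice, a fitness function based purely on life length could look like the sketch below. The Animal class and its lifetime attribute are hypothetical names, not taken from the project’s source code.

from dataclasses import dataclass


@dataclass
class Animal:
    lifetime: float = 100.0   # base life length; eating green plants extends it


def fitness(animal: Animal) -> float:
    # Strictly positive and order-preserving: an animal that survives longer
    # is always considered fitter, without rewarding any particular strategy.
    return animal.lifetime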

2.2.3 Selection

Selection is the process of selecting which organisms should be allowed to reproduce and in what proportions. In nature selection is closely tied to fitness, as the individuals which display the highest fitness are selected to mate most often. In genetic algorithms there are several different ways to model the selection process, but it should always be dependent on the fitness values of the population. A good selection function should therefore strike a balance between favouring the fittest individuals and allowing less fit individuals to survive in reduced numbers. One trivial selection function could be to select all individuals that pass a certain fitness threshold [7, p. 273]. This is a bad idea, however, since there are few ways to determine the correct threshold value. In the beginning of the evolution, a high threshold will exclude most of the genomes due to low general fitness, and this will lead to a fast reduction of genetic variation. In the later stages of the evolution all individuals will pass the threshold, rendering the selection function useless.

A better approach is the so-called roulette wheel selection, where each individual is mapped to an area of what looks like a roulette wheel (see Figure 2.2). A larger fitness value means a larger area on the roulette wheel, and the selection function simply generates random values corresponding to positions on this roulette wheel. In this way, the individuals are chosen based on a biased stochastic variable, and there is room for individuals with high fitness to dominate and for individuals with low fitness to be included by chance.
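A possible Python sketch of roulette wheel selection is shown below; it is illustrative only and assumes that all fitness values are strictly positive, as required above.

import random


def roulette_select(population, fitnesses, k):
    """Pick k individuals with probability proportional to their fitness."""
    total = sum(fitnesses)
    chosen = []
    for _ in range(k):
        point = random.uniform(0, total)      # spin the wheel
        cumulative = 0.0
        for individual, fit in zip(population, fitnesses):
            cumulative += fit
            if cumulative >= point:           # the slice the point landed in
                chosen.append(individual)
                break
        else:
            chosen.append(population[-1])     # guard against float rounding
    return chosen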

2.2.4 Mutation

Mutation is a random, usually small, change in an individual’s genome. In nature this typically occurs spontaneously when new cells are formed, and there are hundreds of factors which can induce mutation [9, p. 46] [6, p. 289-290]. In genetic algorithms it is usually implemented as a small probability for each gene to mutate. When the genome consists of a series of zeros and ones, the mutation operator is a binary flip of a bit, and when the genes consist of floating point values, the mutation can be an addition or multiplication of a random value.


Figure 2.2. A visual representation of how the roulette wheel selection algorithm works: an individual with high fitness occupies a large area of the wheel while an individual with low fitness occupies a small area; the wheel is rotated and the individual at the selection point is chosen.

Figure 2.3. A schematic drawing showing single point crossover of two genomes: a random intersection point is chosen, the parent genes are split at that point, and the parts are recombined to form two descendants.

If the mutation rate is too high, it becomes hard to reach convergence, since good solutions will often be mutated into worse solutions.
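The two mutation variants mentioned above could, for example, be sketched in Python as follows; the mutation rates are arbitrary illustrative values.

import random


def mutate_bits(genome, rate=0.01):
    # For binary genomes: flip each bit with a small probability.
    return [1 - bit if random.random() < rate else bit for bit in genome]


def mutate_floats(genome, rate=0.05, sigma=0.1):
    # For floating point genomes: add a small Gaussian perturbation.
    return [g + random.gauss(0, sigma) if random.random() < rate else g
            for g in genome]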

2.2.5 Crossover

Crossover is the operation of combining two genomes into two new genomes. One common crossover operator is known as single point crossover, and it consists of splitting two genomes at the same position and merging the split parts into two new genomes. The split position is chosen randomly, and the two new genomes share no genes with each other.


Figure 2.4. Three one-dimensional RBFs with varying µ and σ values. µ determines the centre of the bell curve and σ controls its slope.

This process is displayed in Figure 2.3. There is also multiple point crossover, where there are multiple splitting positions, as well as uniform crossover, where each gene can be exchanged with a certain probability.
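The following Python sketch illustrates single point and uniform crossover as described above; it is an example, not the operator used in the simulation (see section 3.1.8 for that).

import random


def single_point_crossover(parent_a, parent_b):
    """Split both genomes at one random position and swap the tails."""
    point = random.randint(1, len(parent_a) - 1)   # never split at the ends
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def uniform_crossover(parent_a, parent_b, swap_prob=0.5):
    """Exchange each gene independently with probability swap_prob."""
    child_a, child_b = list(parent_a), list(parent_b)
    for i in range(len(child_a)):
        if random.random() < swap_prob:
            child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b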

2.3 Radial Basis Functions

2.3.1 Overview

A radial basis function (RBF) is a bell-shaped function whose value depends on the distance from some centre [1, p. 1-8], as shown in Figure 2.5. Radial basis functions are commonly used in artificial neural networks as a way to encode input information. They are favourable to use as they have locality, something which linear functions do not. Locality means that the function value is close to zero in almost the entire domain of the function. This is displayed in Figure 2.4. Locality makes them useful for function approximation, as any function can be approximated as the sum of a number of weighted radial basis functions. A property of radial basis functions which can be interpreted as both an advantage and a disadvantage is that their value never exceeds a given constant, whereas a linear function can grow to infinitely high or low values.

Radial basis functions are commonly implemented using a formula such as the one in Figure 2.5, which is a three-dimensional function centred around (µx, µy, µz). The width of the bell curve in each dimension is determined by σx, σy and σz respectively.


f(x, y, z) = Ax ∗ exp(−(x − µx)² / (2σx²)) + Ay ∗ exp(−(y − µy)² / (2σy²)) + Az ∗ exp(−(z − µz)² / (2σz²))

Figure 2.5. A sum of three radial basis functions, corresponding to three input values. Ax, Ay and Az lie within the interval [−1, 1] and ensure that f(x, y, z) can be negative.
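As an illustration, the sum in Figure 2.5 could be implemented in Python roughly as follows; the parameter values in the example call are arbitrary.

import math


def rbf_sum(x, y, z, A, mu, sigma):
    """Sum of three weighted one-dimensional RBFs, as in Figure 2.5.

    A, mu and sigma are 3-element sequences holding (A_x, A_y, A_z),
    (mu_x, mu_y, mu_z) and (sigma_x, sigma_y, sigma_z) respectively."""
    total = 0.0
    for value, a, m, s in zip((x, y, z), A, mu, sigma):
        total += a * math.exp(-((value - m) ** 2) / (2 * s ** 2))
    return total


# Example: a function peaked around (0.2, 0.5, 0.8) with mixed-sign weights.
print(rbf_sum(0.2, 0.5, 0.8, A=(1.0, -0.5, 0.3),
              mu=(0.2, 0.5, 0.8), sigma=(0.1, 0.2, 0.1)))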


Chapter 3

Implementation

3.1 Model

3.1.1 Implementation in Python

Python was chosen as the implementation language since it is a high-level language suited for quickly building prototypes. It also has an abundance of third-party libraries, which helps reduce implementation time significantly. A third-party library called DEAP (Distributed Evolutionary Algorithms in Python) was used for the genetic algorithm. DEAP proved to be flexible enough for the task, as it allows the user to define their own selection, mutation and crossover algorithms as well as mix them with the accompanying built-in algorithms. Pygame and Matplotlib were used for graphical rendering: Pygame for rendering the actual simulation and Matplotlib for producing graphs from the extracted data. PyPy, NumPy and Python’s built-in support for multiprocessing were used to decrease the runtime significantly. For a comprehensive list of the tools used, please see Appendix A.

3.1.2 Simulated World

A world model for simulating the animals was created using Python together with the libraries mentioned in the previous section. The world contains animals, which are spheres with two antennae protruding from their bodies at angles π/6 and −π/6. There are two kinds of animals in the world: herbivores and predators. The herbivores’ colours are decided by their genes. There may also be predators in the world, which are similar to herbivores in appearance but always have a strong red colour. While herbivores eat green plants, which are green circles placed randomly within the world, predators eat herbivores. Herbivores also have to avoid red plants, which are red circles. These represent “bad” or poisonous food. After a plant is eaten there is a small probability at each moment that a new one will be placed into the world at a random location. The maximum number of red and green plants possible in the world is fixed, and it becomes increasingly probable that a new plant will spawn when only a few are left.


Figure 3.1. A screenshot of the simulation, showing green plants (green circles), red plants (slightly larger red circles), herbivores (multicoloured circles with antennae) and a predator (red circle with an inner white circle and antennae).

There are also walls around the border of the world which the animals cannot pass through. The walls are coloured blue in order for the herbivores to be able to make a clear distinction between walls, plants, predators and other herbivores. How the world is represented graphically can be seen in Figure 3.1. Collision and detection, which are the only means of interaction between two objects, are governed by the following rules:

1. Interaction between two objects occurs when they collide. A collision occurs when the distance between the centres of the two objects is smaller than the sum of their radii. Exactly what happens depends on the type of the objects.

2. Detection occurs when an object crosses the antennae of either a herbivore orpredator.

At each moment, every possible collision is evaluated and the effects of the collisions that occurred are then applied immediately. If a herbivore collides with a plant, that plant is eaten. When eating a green plant the herbivore’s lifespan is increased slightly, and when eating a red plant the herbivore is killed.


Collisions between predators and plants do not have any consequences. If a predator and a herbivore collide, the herbivore is killed and the predator’s lifespan is increased. After that, a check for detection occurs and the animals which have detected objects are allowed to process their inputs, using the decision-making models described in sections 3.1.4 to 3.1.7, and apply a ∆s and a ∆r (change in speed and rotation) to their current speed and rotation. Every animal is then moved in accordance with its speed and rotation.
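A minimal Python sketch of the collision rule, together with the brute-force pairwise check whose cost is discussed in the next paragraph, could look as follows. The function names are hypothetical and the sketch ignores detection via the antennae.

import math


def collides(pos_a, radius_a, pos_b, radius_b):
    """Rule 1: a collision occurs when the distance between the centres
    is smaller than the sum of the radii."""
    dx, dy = pos_a[0] - pos_b[0], pos_a[1] - pos_b[1]
    return math.hypot(dx, dy) < radius_a + radius_b


def find_collisions(objects):
    """Brute-force pairwise check; this is the O(n^2) step discussed below."""
    hits = []
    for i in range(len(objects)):
        for j in range(i + 1, len(objects)):
            (pos_a, r_a), (pos_b, r_b) = objects[i], objects[j]
            if collides(pos_a, r_a, pos_b, r_b):
                hits.append((i, j))
    return hits


# Example: two overlapping circles and one far away; prints [(0, 1)].
print(find_collisions([((0, 0), 1.0), ((1.5, 0), 1.0), ((10, 10), 1.0)]))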

Due to performance issues, not all animals of a population are present in the same simulation. Instead, parts of the population are simulated separately. The results are then gathered and used to create the next generation. The detection and collision algorithms have a time complexity of O(n²) and therefore scale poorly; this is the reason why the division of the population is needed. In practice this means that each simulation takes four times as long if the number of animals per simulation is doubled. By making this trade-off, with a lower number of animals in each simulation, it was possible to run a higher number of iterations of the genetic algorithm in the same amount of time.
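The division of the population could be sketched as follows; simulate_group is a hypothetical stand-in for the actual simulation and is assumed to return one fitness value per group member.

def split_into_groups(population, group_size):
    """Split the population into groups that can be simulated separately."""
    return [population[i:i + group_size]
            for i in range(0, len(population), group_size)]


def evaluate_population(population, simulate_group, group_size=20):
    # simulate_group(group) is assumed to return one fitness value per member;
    # the per-group results are concatenated for the genetic algorithm.
    fitnesses = []
    for group in split_into_groups(population, group_size):
        fitnesses.extend(simulate_group(group))
    return fitnesses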

There are many specific constants which need to be fine-tuned for optimal results and performance, such as animal size, speed and life length, among others. Listing all of these and their purposes would not contribute to this report, but the interested reader can find the entire source code for the project at the location specified in Appendix B.

3.1.3 Methods of Enforcing Behaviour

In order to investigate the natural evolutionary phenomena that can be observed when using genetic algorithms, it is necessary to find methods of enforcing behaviour in the animals. It is also interesting to see which evolutionary strategies are favoured by the different brains when placed in particular situations. Methods of doing this could be either adding additional inputs to the brains or adding extra terms to the calculations to allow the approximation of more complex functions, and thus more complex behaviour. This approach does, however, come at the cost of computing power: for each gene added, the expected time to convergence increases [7].

To focus more on which natural evolutionary phenomena occur, an approach was chosen which focused on changing the animals’ environment instead of the animals themselves. One example of such an approach was to add red plants to the world. The task of eating green plants then became more difficult, as the animals also had to avoid mistakenly colliding with red plants. Another addition which allowed for more dynamics in the world was the choice to give the herbivores different colours, which also depend on their genes. This enabled the use of the evolutionary strategy known as mimicry, described in section 2.1.1.

3.1.4 Decision Making

All brains in our simulation have a similar structure: they are functions with eight inputs and two outputs. When input is received by an animal, it is in the form of eight numbers, four for each antenna.


∆r = 0                           if l4 = 0 and r4 = 0
     S(g1−3 ◦ xl)                if l4 ≠ 0 and r4 = 0
     S(g4−6 ◦ xr)                if r4 ≠ 0 and l4 = 0
     S(g1−3 ◦ xl + g4−6 ◦ xr)    if r4 ≠ 0 and l4 ≠ 0

Figure 3.2. The linear brain’s decision formula for the change of rotation ∆r. S corresponds to the sigmoid function described in section 3.1.5.

Three of the inputs for each antenna are the red, green and blue components of the currently detected object’s colour. These inputs are normalised to the interval [0, 1]. The fourth input is zero when no object is detected and one when an object is. It was deemed necessary to include the fourth input to avoid certain edge cases. For example, if an animal were to detect a black object, this would be indistinguishable from not detecting anything at all if only three inputs were used, which in turn would give no reaction.
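A sketch of how such an eight-element input vector could be assembled is given below; the assumption that colour components arrive as values between 0 and 255, and the function names, are made only for the example.

def antenna_inputs(detected_colour):
    """Four inputs per antenna: normalised R, G, B plus a detection flag."""
    if detected_colour is None:                  # nothing crosses this antenna
        return [0.0, 0.0, 0.0, 0.0]
    r, g, b = detected_colour                    # assumed 0-255 components
    return [r / 255.0, g / 255.0, b / 255.0, 1.0]


def brain_inputs(left_colour, right_colour):
    # Eight numbers in total: four for the left antenna, four for the right.
    return antenna_inputs(left_colour) + antenna_inputs(right_colour)


# A black object (0, 0, 0) on the left is still distinguishable from
# "nothing detected" thanks to the fourth input.
print(brain_inputs((0, 0, 0), None))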

The outputs produced from this consist of the values ∆r and ∆s, which denote changes in rotation and speed. Both output values are normalised to the interval [−1, 1] to account for the possibility of negative rotation and negative acceleration. These values are then translated into reasonable values in the simulation. The maximum acceleration is determined by the size of the animals, the size of the world and the maximum speed of the animals. This enables scaling of the world, without affecting the simulation itself, by tweaking those constants. The maximum allowed change in rotation is 180°, or π radians, since the interval [−π, π] covers the entire circle. A larger allowed change in rotation would have made learning harder, as there would be multiple correct responses to a situation, e.g. ∆r = v, ∆r = v + 2π, ∆r = v + 4π and so on.

3.1.5 Linear Decision Making

The linear brain is a simple model for artificial intelligence. There are in total twelve genes associated with the brain, each in the interval [−1, 1]. These genes correspond to two outputs from the brain times two antennae times three colours. Both ∆r and ∆s are calculated using the same method as mentioned in the previous section.

In Figure 3.2, the first three components of the left and right antenna vectors, the ones containing the colour data, are called xl and xr respectively. The fourth components are called l4 and r4, and show whether an object has been detected or not, as mentioned earlier. Six of the twelve genes in the linear brain apply to this equation, and they are divided into two vectors g1−3 and g4−6 using genes 1-3 and 4-6. The change in speed ∆s is calculated in the exact same way using the same inputs but genes 7-12 instead.

In Figure 3.2 a sigmoid function S is used to limit the outputs to be within [−1, 1]. The function S(x) = 2/(1 + e^(−x)) − 1 was chosen as the sigmoid function, but any function f : x → y with x ∈ [−6, 6] and y ∈ [−1, 1] would have worked, as the only concern was to limit the output range to [−1, 1].


∆r = 0                       if l4 = 0 and r4 = 0
     S(f(xl))                if l4 ≠ 0 and r4 = 0
     S(f(xr))                if l4 = 0 and r4 ≠ 0
     S(f(xl) + f(xr))        if l4 ≠ 0 and r4 ≠ 0           (3.1)

Figure 3.3. Calculating ∆r using RBF functions. 18 genes are implicitly used: nine genes for the A, σ and µ values in f (see Figure 2.5) using input xl, and nine for xr.

The sigmoid function was, however, used because linear behaviour near x = 0 was desired. That gives the best learning rate and a flatter curve at the extremes. An x value near the extremes of [−6, 6] corresponds to radical behaviour, such as turning 180° or accelerating rapidly, and an x value near 0 corresponds to making minor adjustments of speed and angle when encountering an object. A high x value also corresponds to a rare event occurring, namely that both antennae detect objects with high colour values. As the objective was to train the animals to behave as rationally as possible in response to common events, the sigmoid function was chosen to slow down the learning rate for rare, extreme events and behaviours and to accelerate the learning of common events and behaviours. These correspond to x values in the sigmoid far from zero and near zero, respectively. In this way, the animals still have the ability to make strong reactions, e.g. turning 180° when seeing a predator, but learning focuses more on the interesting behaviours, namely in which direction to turn or in which direction to accelerate. The reason for not choosing a simpler function, such as f(x) = x/6, as a normalising function was that it would have slowed down learning considerably, giving more extreme cases a larger impact than desired.
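Putting Figure 3.2 and the sigmoid together, the linear brain’s rotation output could be sketched in Python as follows; the gene layout (genes 1-3 for the left antenna, 4-6 for the right) follows the description above, while the function names are illustrative.

import math


def S(x):
    """The sigmoid used to squash outputs into [-1, 1]."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0


def linear_delta_r(genes, left, right):
    """Change of rotation for the linear brain (Figure 3.2).

    `left` and `right` are the four-element antenna inputs; genes[0:3]
    weight the left colour components and genes[3:6] the right ones."""
    xl, l4 = left[:3], left[3]
    xr, r4 = right[:3], right[3]
    if l4 == 0 and r4 == 0:
        return 0.0                       # nothing detected: keep rotation
    activation = 0.0
    if l4 != 0:
        activation += sum(g * x for g, x in zip(genes[0:3], xl))
    if r4 != 0:
        activation += sum(g * x for g, x in zip(genes[3:6], xr))
    return S(activation)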

3.1.6 RBF-Based Decision Making

In RBF-based decision making, the three colour inputs from each antenna are used in the function displayed in Figure 2.5. For each antenna, ∆r and ∆s are computed by summing the radial basis functions’ values and normalising them using the same function S as in section 3.1.5. As also mentioned in that section, ∆r and ∆s are calculated separately using the same function and inputs but different genes.

Each radial basis function has a σ and a µ which are decided by the animal’s genes. σ and µ lie in the ranges [0, 1] and [−1, 1] respectively. An additional gene is used to weight the output, which corresponds to A in Figure 2.5. This is required to allow the otherwise positive radial basis functions to produce negative values as well. This means that the RBF-based brain has a total of 36 genes which need to be trained, compared to the linear brain which only has twelve.

The difference between using radial basis functions and linear functions is that radial basis functions have a better ability to approximate an arbitrary decision-making strategy, as mentioned in section 2.3.1.


∆r = 0                  if l4 = 0 and r4 = 0
     r ∈ U([−1, 1])     otherwise                            (3.2)

Figure 3.4. Calculating ∆r using the random brain and the same notation as in Figure 3.2 and Figure 3.3. r is a uniformly distributed random number.

For example, an RBF-based brain can make a distinction between different shades of green and thus react differently to them, while a linear function can only decide whether more green produces a stronger or weaker output.

3.1.7 Random Decision Making

In random decision making, only the fourth input, which denotes whether an object has been detected or not, is used. If an object has been detected, a ∆r and a ∆s within [−1, 1] are selected randomly with uniform probability. The sigmoid function S, which is used with both other brain architectures, is not used in the random brain, as the random values which are produced can easily be generated to lie within the correct interval.
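A sketch of the random baseline from Figure 3.4 is given below for completeness; the function name is illustrative.

import random


def random_delta(left_detected, right_detected):
    """Random brain: react only to the detection flags (Figure 3.4)."""
    if not left_detected and not right_detected:
        return 0.0
    return random.uniform(-1.0, 1.0)   # uniform over [-1, 1], no sigmoid needed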

3.1.8 Genetic Algorithm

The implemented genetic algorithm is similar to the one described in section 2.2.1. The main difference is that a combination of selection methods is used: instead of only applying roulette selection, elitism is included as well. This means that 10% of the next population are exact copies of the individuals with the highest fitness in the current generation’s population. The modified algorithm is depicted in Figure 3.5.

Elitism is used because roulette selection is a highly probabilistic algorithm, and it is possible that some individuals with high fitness are not selected for the next generation. Elitism prevents these individuals from disappearing from the population by guaranteeing that their genes will survive until the next generation. Typically, these individuals also provide a stable maximum fitness for the population, as they are expected to perform equally well in the next simulation. It should, however, be noted that this is not the case in the simulations discussed in this report, as herbivore, plant and predator placement are all random.
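DEAP ships with selBest and selRoulette, so the combination of elitism and roulette selection could be expressed roughly as in the sketch below. The 10%/90% split follows the description above; the helper next_generation, the genome length and the toy fitness values are illustrative assumptions, not the project’s actual code.

import random
from deap import base, creator, tools

# One maximising fitness value per individual (its life length).
creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", list, fitness=creator.FitnessMax)


def next_generation(population):
    """10% elitism plus 90% roulette selection, as in Figure 3.5."""
    n_elite = len(population) // 10
    elite = tools.selBest(population, n_elite)                # copies of the best
    rest = tools.selRoulette(population, len(population) - n_elite)
    return [creator.Individual(ind) for ind in elite + rest]


# Tiny demonstration with random genomes and random "life lengths".
population = [creator.Individual(random.uniform(-1, 1) for _ in range(12))
              for _ in range(20)]
for individual in population:
    individual.fitness.values = (random.uniform(100, 500),)

print(len(next_generation(population)))   # same population size: 20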

Once individuals for the new generation have been selected, crossover and mutation are applied. The probability of applying crossover to a pair of individuals is 30%. The probability of applying mutation is 40%. Exactly which operators are used is described in the subsequent paragraphs of this section.

The crossover algorithm used for the experiments is a modified version of the uniform crossover algorithm mentioned in section 2.2.5.


S ← a random distribution of genomes
G ← number of generations
for g ← 1; g < G; g++ do
    fitnesses ← run simulations(S)              ▷ Simulates parts of S, collects results
    couple fitnesses(S, fitnesses)              ▷ Associates a fitness value with each animal
    B ← select best(S, length(S)/10)            ▷ Selects the 10% best genomes
    R ← select roulette(S, length(S) ∗ 9/10)    ▷ Selects the rest using roulette
    for child1 ∈ R, child2 ∈ R do
        crossover(child1, child2)               ▷ Probabilistically applies crossover
    end for
    for child ∈ R do
        mutate(child)                           ▷ Probabilistically mutates an individual
    end for
    S ← B ∪ R                                   ▷ Restart with the new generation
end for

Figure 3.5. An overview of the genetic algorithm used in the simulation.

If a uniform crossover algorithm is used, it could be the case that the µ of a radial basis function comes from one parent and the σ or A from the other. This would most likely produce an individual with lower fitness than either of its parents, as the parameters of a given radial basis function have been tuned to be used together in the same calculation. Instead, “regions” of genes are exchanged when applying crossover. A region is defined as a group of subsequent genes. For the RBF-based brains these regions are three genes long, while for the linear brains they are one gene long. This means that a radial basis function’s µ, σ and A are always copied together. When crossover is applied, one calculation, i.e. a radial basis function or linear unit, for either speed or rotation and for either the left or right antenna, is switched for the other parent’s. The probability of exchanging the first individual’s region for the second’s is 30% for each region.

A Gaussian mutator was chosen as the mutation algorithm, as the animals’ genes are floating point numbers. When mutation occurs, a random number, distributed according to a Gaussian distribution with mean 0 and standard deviation 0.1, is applied to the gene. On average, one gene is mutated per individual, as recommended in [8].
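The region-based crossover and the Gaussian mutation could be sketched as follows; the function names are hypothetical, while the region size of three genes, the 30% exchange probability per region, the standard deviation of 0.1 and the average of one mutated gene per individual follow the description above.

import random


def region_crossover(ind_a, ind_b, region_size=3, swap_prob=0.3):
    """Swap whole regions of genes so that a radial basis function's
    mu, sigma and A always travel together."""
    for start in range(0, len(ind_a), region_size):
        if random.random() < swap_prob:
            end = start + region_size
            ind_a[start:end], ind_b[start:end] = \
                ind_b[start:end], ind_a[start:end]
    return ind_a, ind_b


def gaussian_mutation(individual, sigma=0.1, genes_per_mutation=1):
    """On average, mutate one gene per individual with N(0, 0.1) noise."""
    rate = genes_per_mutation / len(individual)
    for i in range(len(individual)):
        if random.random() < rate:
            individual[i] += random.gauss(0, sigma)
    return individual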

3.2 Experiments

3.2.1 Finding and eating food

This experiment consists of placing herbivores with randomly initialised genes in a world with only green plants in it. The number of green plants is variable, but never over a given maximum, and the results are checked after 300 generations.


The population size is 200, split into groups of 20 individuals which are simulated separately. In each simulation the herbivores and plants are placed randomly in the world, with the herbivores having a random initial rotation. This is to avoid learning fixed patterns, that is, doing the same sequence of actions each simulation. The downside of this is that the fitness values are not guaranteed to increase for each generation; instead, a longer interval needs to be examined. As described in section 3.1.2, a herbivore’s life length is increased when eating a green plant, and the fitness of a herbivore is its life length.

The purpose of this experiment is to see whether the herbivores are able to learn at all and how fast learning occurs. This will be an example of adaptation, as described in section 2.1.1, where the herbivores need to adapt to their new surroundings. It is also interesting to see whether any specific behaviours develop in order to eat as many plants as possible. The results from the linear, RBF-based and random brains are compared in order to see if there are any clear differences between them when running this initially simple experiment. It is expected that both types of brains are able to perform well in this task, as the problem is easy and solvable using colour associations, as proposed in [3].

In theory, a valid strategy could be to not react to plants at all. Simply accelerating to maximum speed and “combing” the world for plants by rotating randomly is a plausible strategy. According to [3] a good strategy is to simply continue in the same direction and accelerate towards a plant when it is detected. Another strategy observed in the same paper was to follow the walls of the world. Following the walls of the world can be a good strategy for poorly adapted individuals, since it provides an easy way to cover a large area and by chance encounter green plants.

3.2.2 Avoiding bad food

This is a variation of the experiment described in the previous section, with the addition of red plants. Similar to the green plants, the number of red plants is zero at the beginning of the simulation and always below a fixed amount, and they are also spawned at random locations. A problem with this approach is that a herbivore may be placed on top of a red plant, or vice versa. Nothing is done to prevent this, as with a large enough population size this is negligible. Another issue is that red and green plants can be placed on top of, or near, each other in the world, thus making it difficult for the herbivores to eat the green plant while avoiding the red plant. The expected result of this is a lower average fitness value, but it is allowed to happen in order to see how the different brain types react to this case.

By adding red plants to the simulation, the task of living as long as possible becomes more complex. It is no longer a valid strategy to randomly pick a new direction each time a wall is encountered while not reacting to the detected plants. There are essentially two strategies for coping with this; the first is trying to avoid red plants as much as possible, and the second is to focus more on eating green plants than on avoiding red ones.


3.2.3 Predators and prey

In this experiment predators are added to the first and second experiments. The ratio of predators to herbivores is one to ten, which means that for each group of 20 herbivores there will be 2 predators. As mentioned in section 3.1.2, the predators’ objective is to eat herbivores, and the predators have a strong red colour. This colour is used because the prey are already learning to avoid red plants, and if the predators are red as well it will be easier for the prey to avoid being eaten. Because of the limitations of the linear brains, the RBF-based brains could have an unfair advantage if the predators could have multiple colours. Due to the design of the world it is beneficial to be able to treat blue, green and red objects differently, and if the colours are mixed it quickly becomes too difficult for the linear brains to handle, as their reactions are the sum of the reactions to the individual components of the colour.

In the previous experiments there is no benefit for the herbivores in changing colour. In this experiment, however, it may be beneficial to try to mimic either predators or walls. When this phenomenon occurs in nature it is known as mimicry, described in section 2.1.1. Another interesting evolutionary phenomenon which could occur is co-evolution. In this simulation that could mean the predators’ average fitness continuously increasing while the herbivores’ fitness slowly decreases, or vice versa if the herbivores become proficient at avoiding predators.


Chapter 4

Results

4.1 Simulation results

4.1.1 Finding and eating food

After running the first experiment, the conclusion can be drawn that the genetic algorithm is able to train the herbivores to eat plants. Compared to the random brain, the trained brains perform about 175% better after training for 300 generations. Most of the increase in fitness occurred between generations 1 and 100, as seen in Figure 4.1. The differences between the fitnesses of the linear and RBF-based brains are small, which could be attributed to the randomness in the experiments and the simplicity of the given task. As expected in section 3.2.1, there were mainly two prevalent food-eating strategies during the experiments: following the walls of the world while looking for nearby food, and abruptly rotating in towards the centre of the world when encountering a wall. In all cases the herbivores accelerated and turned towards the green plants, which was also expected in section 3.2.1 and mentioned in [3].

The population size of 200 was found to be adequate, as it provided a large enough genetic diversity within the population while being small enough to be computationally cheap. High genetic diversity means a higher probability of initialising an individual with relatively successful genes, thus making the initial evolution faster. When using smaller population sizes, the population and the associated fitness values became more unstable and vulnerable to small random occurrences, such as a successful individual being randomly placed in a bad starting position. This can be related back to nature as an example of how smaller populations are unstable and more vulnerable, as stated in section 2.1.1.

The minimum and maximum fitness values for each generation were also of interest. As each individual is given a base life length of 100, the minimum fitness never went below 100. Due to the large population size and the scarcity of plants in the world, it was also seldom higher than 100; there is almost always at least one unfit individual in every generation who does not eat a single plant. The maximum fitness reaches the maximum possible in a simulation after only a few generations, as shown in Figure 4.2.


Figure 4.1. A comparison between the average fitnesses of the RBF-based, linear and random brains when tasked only with eating green plants, according to section 3.2.1. It can be seen that the linear brains reach the fitness plateau faster.

This means that one or several herbivores have survived the full length of the simulation. It could be a problem if a large share of the herbivores reached the maximum, since these individuals would have the same fitness, but the average is low enough for this not to be a problem. The average fitness increases steadily before reaching a plateau. The linear brains reached this plateau faster than the RBF-based brains, which could be an example of the longer convergence time associated with a higher number of genes. This plateau represents a local maximum in the fitness search space, as mentioned in section 2.2.2. The likelihood of this local maximum being the global maximum is increased by the fact that both the linear and RBF-based brains reach the same maximum.

The colours of the herbivores converged to a random point, which reflects the decreasing genetic diversity within the population. All brains showed these tendencies in this experiment, which was expected, as there is no advantage or disadvantage associated with a certain colour. An example of how the colours varied during the experiment can be found in Figure 4.3.


Figure 4.2. A graph showing the minimum, average and maximum fitness per generation for a simulation using linear brains and only green plants, as described in section 3.2.1.

4.1.2 Avoiding bad food

Adding red plants to the simulation caused the fitness of both types of herbivores to decrease. When using a low maximum number of red plants, no particular differences between RBF-based and linear brains were found. In both cases they managed to increase their fitness by about 100%, and approximately 40% of all herbivores died from eating red plants. When the number of red plants in the simulation was doubled, differences between the two decision-making models began to appear. Fitness was in both cases increased by about 30%, but the ratio of herbivores killed from eating red plants differed, as shown in Figure 4.4. The linear brains had a death ratio between 60% and 70%, while the RBF-based brains had a ratio between 30% and 40%. This could indicate the use of different strategies. The linear brains seem to be more reckless, weighing the risk of eating a red plant against the possibility of eating many green plants. In comparison, the RBF-based brains seem to be more focused on avoiding red plants and are more cautious when approaching green plants. A visual observation of the herbivores in action confirmed that few of the RBF-based herbivores ever collided with a red plant. The difference in strategies is also supported by the fact that the maximum fitness for the linear brains is higher but also more unstable compared to the RBF-based brains.


Figure 4.3. A graph of the change in the red, green and blue components of the herbivores’ colours during the experiment, for the RBF-based brains with only green plants, according to section 3.2.1.

In the previous experiment it was always beneficial to have the highest speed possible. In this experiment there were examples of herbivores moving slower than the maximum speed, particularly after detecting a red plant. This behaviour was found with both brains. One possible explanation could be that if they are travelling at maximum speed there is not enough time to successfully steer away from a red plant before colliding with it. In this experiment the herbivores’ colours converged to a random point, similar to the previous experiment.

4.1.3 Predators and prey

In the simulation where only predators, herbivores and green plants were present, the predators dominated. They achieved high fitness values and suppressed the herbivores’ average fitness. Whereas in section 4.1.1, with only herbivores and green plants, the herbivores had a fitness increase of about 175%, they were here able to increase their fitness by at most 70-90% for linear brains and at most 40-50% for RBF-based brains.

It was found that the herbivores with linear brains had both a higher death rate from predators and a higher fitness compared to those with RBF-based brains. Linear brains had a death rate of around 50%, which was 10 percentage points higher than that of the RBF-based brains, and a 90% increase in fitness, whereas the RBF-based brains had a 30% increase.


Figure 4.4. A graph showing the ratio of herbivores killed from eating red plants for both decision-making models. No predators were included in the experiment, as specified in section 3.2.2.

This indicates that RBF-based herbivores typically get eaten early on in the simulation, while linear herbivores survive longer before getting eaten. RBF-based predators seem to be more effective than the linear predators, as they are able to eat a larger portion of their prey early on.

In the linear case, signs of an evolutionary arms-race were found, where spontaneous changes in colour triggered a decrease in the number of herbivores killed by predators, as shown in Figure 4.5. This suggests co-evolution between the herbivores and predators, as described in section 2.1.1. This in turn leads to the predators adapting to the herbivores’ new colour, and again increasing the ratio of herbivores killed by predators. The arms-race was not as apparent with the RBF-based brains, which is believed to be due to the locality property of the radial basis functions. The locality property in this case means that a small change in colour can still give the same, or a similar, reaction, which is believed to make them better predators, as the herbivores’ colours are dynamic during the initial phase of the simulation. This is an example of where genetic algorithms do not perform well: the vast majority of the RBF-based herbivores receive a low fitness value during the beginning of the simulation, as they are eaten by predators, which makes selection difficult. It seems that difficulty in selecting which individuals are allowed to pass on their genes can decrease the effectiveness of genetic algorithms.


If a population of predators had had this kind of dominance over a population of prey in nature, the prey would likely have gone extinct. However, due to the genetic algorithm the population survives and experiences the same scenario each generation.

Figure 4.5. A graph showing the herbivores’ death-by-predator percentage when using linear brains, according to the specification in section 3.2.3. Decreases in the percentage correlate with sudden changes in the herbivores’ colour.

The colours in this experiment seemed to converge towards higher values, but without any clear strategy. An example is shown in Figure 4.6. It is believed that the strong colours which result from this are a way to fool the predators’ brains into reacting strongly when detecting them. This reaction could be strong enough to send the predator in another direction, away from the herbivore. Initially it was believed that the herbivores would converge towards either a strong green, blue or red colour to mimic the objects already present in the world. An explanation as to why this did not occur could be that the motivation to increase one colour component while decreasing the two others is not apparent until perfect mimicry has already been reached. To use Sewall Wright’s analogy from section 2.2.2, the valley in the fitness landscape between the current peak and a peak where mimicry is present is too deep.

In the runs with both green and red plants it was observed that all animals had lower fitness. This was due to the herbivores dying from red plants, which led to the predators having fewer herbivores to eat. In general, it seemed easier to avoid red plants than predators, which is expected as the herbivores can only see in front of them.


A clear inverse correlation was found between the ratio of herbivores killed by red plants and the ratio killed by predators. When herbivores learnt to avoid red plants they were instead eaten by predators, and vice versa. Apart from this no new results were found.

Figure 4.6. A graph of the change in the red, green and blue components of the herbivores’ colours during the experiment, for the RBF-based brains with predators and green plants, according to section 3.2.3.

4.2 Discussion

4.2.1 Constraints and Problems

One serious and unexpected constraint was the performance of our algorithm. After profiling the code and making several improvements, performance was still a major issue. The main problem was the matching algorithm, which compared all animals and plants to each other to determine collisions and detections. Checking all members of a list against each other is an O(n²) operation in the size of the list, which becomes unacceptable as n grows. The solution was to split the population into several subpopulations, as mentioned in section 3.1.1, and run their simulations separately, sequentially or in parallel. This trick reduces the time complexity to O(K · n), where K is the fixed subpopulation size, which is smaller than n.
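The following minimal sketch illustrates the trick described above. The entity representation (plain (x, y) tuples), the distance-based collision test, the function names and the subpopulation size of 50 are assumptions made for illustration; they are not taken from the project's source code.

    import math
    from itertools import combinations

    def collides(a, b, radius=1.0):
        """Hypothetical collision test: two entities collide if their centres are close."""
        return math.hypot(a[0] - b[0], a[1] - b[1]) < 2 * radius

    def naive_collisions(entities):
        """All-pairs check: O(n^2) comparisons, which dominated the running time."""
        return [(a, b) for a, b in combinations(entities, 2) if collides(a, b)]

    def subpopulation_collisions(entities, subpop_size=50):
        """Split the population into subpopulations of fixed size K and only check
        collisions within each one, giving roughly O(K * n) comparisons in total."""
        pairs = []
        for start in range(0, len(entities), subpop_size):
            subpop = entities[start:start + subpop_size]
            pairs.extend((a, b) for a, b in combinations(subpop, 2) if collides(a, b))
        return pairs

Within each subpopulation of size K the check is still quadratic, but the total work across all n/K subpopulations grows only linearly in n.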


The algorithms for collisions and detections were also scrutinised, and floating-point calculations were replaced with pre-calculated values wherever possible. After the dot product was found to be a large consumer of computation time, an alternative implementation was developed using pre-calculated values, which sacrificed some memory and accuracy to achieve better running times. The standard Python implementation was also deemed too slow, so PyPy, a fast Just-In-Time compiler, was used to run the code most of the time. Finally, multiple cores were utilised through Python's built-in support for multiprocessing. Python was chosen over C++ since it allows for a much faster development process, which made this project feasible given the allotted amount of time.
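The report does not describe the pre-calculated dot-product implementation in detail, so the sketch below only shows the general idea under assumed names: headings are quantised to whole degrees and their unit direction vectors are looked up in a table instead of being recomputed, trading some memory and angular accuracy for speed.

    import math

    # Hypothetical lookup table: unit direction vectors for headings quantised to whole degrees.
    DIRECTIONS = [(math.cos(math.radians(d)), math.sin(math.radians(d))) for d in range(360)]

    def detects(antenna_pos, heading_deg, target_pos, cos_threshold=0.9):
        """Rough detection test: is the target roughly in front of the antenna?

        The dot product between the (precomputed) antenna direction and the
        normalised vector towards the target is compared against a threshold.
        """
        dx = target_pos[0] - antenna_pos[0]
        dy = target_pos[1] - antenna_pos[1]
        dist = math.hypot(dx, dy)
        if dist == 0:
            return True
        ux, uy = DIRECTIONS[int(heading_deg) % 360]
        return (ux * dx + uy * dy) / dist > cos_threshold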

4.2.2 Simulation Accuracy and Applications

The model created in this project is mainly intended for studying evolutionary phenomena, not for providing a biologically accurate model of an ecosystem. It is useful for modelling specific evolutionary scenarios, as the simple model reduces the number of external parameters which may affect the simulation. This could also have drawbacks, as the model may be too simplified to realistically model real-life scenarios. The most significant limitations of the model are:

• The antennae model is in most cases not a realistic input-gathering system. Each antenna can, for example, only detect one object at a time and the animals cannot control the antennae.

• The predators have an unfair advantage as the herbivores cannot know if they are being chased.

• The way reproduction works is simplified in the experiments. In real life, reproduction is a continuous process and the population size is not fixed.

• Due to the performance issues outlined in the previous section, the population sizes in each simulation are smaller than they would be in nature.

Despite these limitations the model was fully adequate to fulfil the goals of this project. A more biologically correct model would not necessarily have helped answer the questions posed in this report better, since a less complex model using fewer parameters gives a clearer connection between cause and effect. Using antennae to detect colours might not be biologically correct, but it still enabled the sought-after behaviours to develop. Using colours also made analysis easier, as opposed to, for example, texture or shape, which may have been more biologically accurate. If another system for reproduction had been used it would have been more difficult to compare generations; moreover, the desired effects still appeared using the current system.

The simple deterministic brains used in the model could in some cases be a realistic model of real organisms. When considering primitive eukaryotic and prokaryotic organisms, which have limited movement and input capabilities, this model might still contain the complexity required to model their behaviour accurately. It is, however, important to keep in mind that simulating growing populations is problematic, both because of the genetic algorithm used and because of performance issues.

4.3 Conclusions and Future Work

In this report it has been found that it is possible to simulate evolution of animal behaviour using both linear and radial basis functions. As theorised in [3], it is possible to model food-seeking strategies as well as more advanced survival strategies using linear associations. The food-seeking strategies observed in [3], namely following the walls of the world and accelerating towards food upon detection, were also developed by the animals simulated in this project.

The differences found between using radial basis functions and using linear functions were relatively small. When using linear functions the animals tended to use a more aggressive strategy when seeking food in a dangerous environment. When faced with both edible and poisonous food they chose a strategy where they risked eating poisonous food in return for eating a possibly larger portion of edible food. The animals which used radial basis functions seemed to be more restrained, minimising the risk of eating poisonous food while losing some of the edible food. Unexpectedly, both of these strategies led to similar performance of the animals' populations. Another difference found was that the predators using radial basis functions were more successful in killing prey. It is, however, not certain that this applies in all situations, as more experiments are needed to fully investigate the reasons behind this.
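As a rough sketch of the two decision models being compared, the code below follows the general shape of the formulas in Figures 2.5, 3.2 and 3.3: a sigmoid of a weighted sum for the linear brain, and a sum of Gaussian radial basis functions with amplitudes A, centres µ and widths σ for the RBF brain. The function names, the per-antenna gene layout, how the left and right antenna responses are combined into ∆r, and the output scaling are assumptions made for illustration, not the project's exact implementation.

    import math

    def sigmoid(x):
        """Squashes any real value into the interval (0, 1)."""
        return 1.0 / (1.0 + math.exp(-x))

    def linear_delta_r(antenna_rgb, weights, bias):
        """Linear brain: a weighted sum of the antenna's colour components,
        passed through a sigmoid and rescaled to a rotation change in (-1, 1)."""
        activation = sum(w * c for w, c in zip(weights, antenna_rgb)) + bias
        return 2.0 * sigmoid(activation) - 1.0

    def rbf_delta_r(antenna_rgb, amplitudes, centres, widths):
        """RBF brain: each colour component excites its own Gaussian bell curve
        (amplitude A in [-1, 1], centre mu, width sigma) and the responses are summed."""
        return sum(
            A * math.exp(-((c - mu) ** 2) / (2.0 * sigma ** 2))
            for A, mu, sigma, c in zip(amplitudes, centres, widths, antenna_rgb)
        )

The Gaussian terms also illustrate the smoothness referred to above: a small shift in a colour component only moves the response slightly along the bell curve.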

Certain similarities between the model and natural evolution were observed. When the animals were first placed into the simulated world they behaved randomly and were not able to find and eat food. The genetic algorithm incrementally improved their genes, thus making them adapt well to their new environment. The animals had colours determined by their genes, and when predators were introduced to the world the prey used this mechanism for self-defence. It was expected that the prey would use mimicry to camouflage themselves as other objects within the world, but that was not the case. Instead, a tendency to favour strong colours was found. This could in many cases induce a stronger reaction in the predators, which could be to the predator's disadvantage. This is an example of an evolutionary arms-race between the prey's ability to change their colour and the predator's ability to adapt to that change, which the prey lost in all cases during our simulations.

This kind of simulation could be used in the future to examine other functions or strategies for decision making as well. Even though mimicry was not clearly observed, it is believed that it could be, given experiments specifically targeted towards it. It is also believed that this model could be modified to realistically model simple biological organisms, such as primitive prokaryotes or bacteria. The authors of this report hope that more similar projects will be carried out in the future using genetic algorithms, as they tend to go beyond human imagination in the search for potential solutions.


Bibliography

[1] Buhmann, M. D. (2003). Radial Basis Functions: Theory and Implementations. Cambridge University Press.

[2] Darwin, C. (1861). On the Origin of Species by Means of Natural Selection; or, The Preservation of Favoured Races in the Struggle for Life. D. Appleton and Company.

[3] Gracias, N., Pereira, H., Lima, J. A., Rosa, A. (1997). Gaia: An Artificial Life Environment for Ecological Systems Simulation. In Artificial Life V: Proceedings of the Fifth International Workshop on the Synthesis and Simulation of Living Systems.

[4] Holland, J. H. (1992). Genetic Algorithms. Scientific American, 267(1), pp. 66-72.

[5] Huijsmann, R., Haasdijk, E., Eiben, A. E. (2012). An On-line On-board Distributed Algorithm for Evolutionary Robotics. In Artificial Evolution (pp. 73-84). Springer Berlin Heidelberg.

[6] King, R. C., Stansfield, W. D., Mulligan, P. K. (2006). A Dictionary of Genetics. Oxford University Press.

[7] Marsland, S. (2009). Machine Learning: An Algorithmic Perspective. CRC Press.

[8] Montana, D. J., Davis, L. (1989). Training Feedforward Neural Networks Using Genetic Algorithms. In Proceedings of the Eleventh International Joint Conference on Artificial Intelligence (Vol. 1, pp. 762-767).

[9] Vij, K., Biswas, R. (2004). Basics of DNA & Evidentiary Issues. Jaypee Brothers Publishers.

[10] Wright, S. (1932). The Roles of Mutation, Inbreeding, Crossbreeding and Selection in Evolution. In Proceedings of the Sixth International Congress on Genetics (pp. 355-366). Brooklyn Botanic Garden.


List of Figures

2.1 An overview of the genetic algorithm used in the experiments.
2.2 A visual representation of how the roulette wheel selection algorithm works.
2.3 A schematic drawing showing single point crossover of two genomes.
2.4 Three one-dimensional RBFs with varying µ and σ values. µ determines the centre of the bell curve and σ controls the slope of it.
2.5 A sum of three radial basis functions, corresponding to three input values. Ax, Ay and Az lie within the interval [−1, 1] and ensure that f(x, y, z) can be negative.
3.1 A screenshot of the simulation, showing green plants (green circles), red plants (slightly larger red circles), herbivores (multicoloured circles with antennae) and a predator (red circle with inner white circle and antennae).
3.2 The linear brain's decision formula for change of rotation ∆r. S corresponds to a sigmoid function described in section 3.1.5.
3.3 Calculating ∆r using RBF functions. 18 genes are implicitly used: nine genes for the A:s, σ:s and µ:s in f (see Figure 2.5) using input xl, and nine for xr.
3.4 Calculating ∆r using the random brain and the same notation as in Figures 3.2 and 3.3. r is a random number with uniform distribution.
3.5 An overview of the genetic algorithm used in the simulation.
4.1 A comparison between the average fitnesses of the RBF-based, linear and random brains when tasked only with eating green plants, according to section 3.2.1. It can be seen that the linear brains reach the fitness plateau faster.
4.2 A graph showing the minimum, average and maximum fitness per generation for a simulation using linear brains and only green plants, as described in section 3.2.1.
4.3 A graph of the change in the red, green, and blue components of the herbivores' colours during the experiment with the RBF-based brains and only green plants, according to section 3.2.1.
4.4 A graph showing the ratio of herbivores killed from eating red plants for both decision-making models. No predators were included in the experiment, as specified in section 3.2.2.
4.5 A graph showing the herbivores' death-by-predator percentage when using linear brains, according to the specification in section 3.2.3. Decreases in the percentage correlate to sudden changes in the herbivores' colour.
4.6 A graph of the change in the red, green, and blue components of the herbivores' colours during the experiment with the RBF-based brains with predators and green plants, according to section 3.2.3.

Note: all of the figures have been created by the authors.


Appendix A

Third-party libraries and tools used

• DEAP - Distributed Evolutionary Algorithms in Python. http://deap.gel.ulaval.ca/doc/default/index.html

• PyPy - A fast Just-in-Time compiler for Python. http://pypy.org/

• Pygame - A computer game and visualisation package for Python. http://www.pygame.org/docs/

• NumPy - A Python package for numerical calculations. http://www.numpy.org/

• matplotlib - A Python package for rendering graphs. http://matplotlib.org/


Appendix B

Source Code

The source code used to generate all of the results can be found at: https://github.com/johanwikstrm/artificialbrains


Appendix C

Statement of Collaboration

Both authors have spent an equal amount of time working on and contributing to the code, experiments and this report.