[Lecture Notes in Computer Science] Genetic Programming Volume 3905 || Iterative Filter Generation...

Iterative Filter Generation Using Genetic Programming

Marc Segond, Denis Robilliard, and Cyril Fonlupt

Laboratoire d’Informatique du Littoral, Maison de la Recherche Blaise Pascal, 50 rue Ferdinand Buisson - BP 719,

62228 CALAIS Cedex, [email protected]

http://lil.univ-littoral.fr/~segond/

Abstract. Oceanographers from the IFREMER institute hypothesize that the presence of so-called “retentive” meso-scale vortices in ocean and coastal waters could influence the demography of aquatic fauna. Up to now, identification of retentive hydro-dynamical structures on stream maps has been performed by experts using background knowledge about the area. We tackle this task with filters induced by Genetic Programming, a technique that has already been used successfully in pattern matching problems. To overcome specific difficulties associated with this problem, we introduce a refined scheme that iterates the filters' classification phase while giving them access to a memory of their previous decisions. These iterative filters achieve superior results and are compared to a set of other methods.

1 Introduction

The concentration of aquatic fauna in coastal waters seems to be correlated with the presence of physical structures that may retain eggs and larvae in favorable environmental conditions. In the case of the anchovy in the Gulf of Biscay, biologists from the IFREMER institute are studying the correlation between retentive meso-scale vortices, whose size ranges from 10 km to 200 km, and the demography of these fish. These structures are detected by experts on stream vector maps, using background expertise about plausible structures.

Maps are actually generated by hydro-dynamical simulations such as the Mars3D or the Mercator models¹. A typical stream map is a 3-dimensional matrix containing the x and y components of the stream vector at 10 meters depth on a discrete grid with 10 km by 10 km cells, collected at regular time steps, usually every 24 hours. The maps are stored in the NetCDF² format.

To verify the hypothesis at hand, the frequency and location of interesting vortices have to be recorded and summed up over many years, yielding a very large number of maps to be processed. Thus an automatic and efficient detection tool is needed to conduct the study.

1 http://www.mercator-ocean.fr
2 http://my.unidata.ucar.edu/content/software/netcdf/index.html

P. Collet et al. (Eds.): EuroGP 2006, LNCS 3905, pp. 145–153, 2006. © Springer-Verlag Berlin Heidelberg 2006

When the specialist highlights retentive structures by hand on the stream maps, he uses expertise based on his knowledge of the area, of the characteristics of the simulation model, and his understanding of the phenomenon he studies. During this process some structures that a naive observer would retain are rejected, e.g. because the stream aspect is chaotic in the neighborhood, suggesting these are only transient patterns or artifacts due to the model digitization. Thus the physics-based vortex detection problem is compounded by a hidden-criteria learning task. An efficient detection scheme for this problem must therefore build on these two aspects: using hydrodynamics and being able to learn part of the expert’s knowledge.

An ant algorithm was proposed in [1] to solve this problem. In this scheme, ants used physical information from the stream vector field, and a further parameter tuning phase brought the algorithm closer to matching the hidden criteria. This approach was satisfactory and outperformed standard vorticity threshold methods (see [3, 4] for a presentation of such techniques).

Nonetheless, the question was still open whether a supervised machine learning scheme could achieve superior results. In this paper we introduce genetic programming filters that are able to take into account the physical characteristics of the problem and to learn from example maps.

2 Genetic Programming Filters

The basic scheme is inspired by the work of Daida [2] on detecting pressure ridges in the arctic ice cover. We evolve filters (i.e. classifier programs) in a supervised learning framework. These are selected on their ability to correctly classify the cells of a stream map according to whether or not they belong to a structure of interest. Each filter classifies one map cell at a time, and it is successively applied to every cell of the map. Evaluation is done on a set of reference maps tagged by the expert (see Figure 1).

Filters are implemented with the ECJ³ Java evolutionary library, using the standard Lisp-like tree representation. Inputs available to a filter are floating point physical data such as stream strength and vorticity. We keep the closure property and use only GP nodes that return a floating point value.

The conversion between this floating point matrix and the boolean values expected for classification is done with a threshold value. Continuously increasing the threshold from 0 to 1, we obtain a monotonic increase of the true positive and false positive rates, from 0% to 100%: we can draw a Receiver Operating Characteristic (ROC) curve. This is a standard technique (see e.g. [6]) that will be used later when evaluating and comparing heuristics. The end-user will have the choice of the threshold level that corresponds to his preferred trade-off between sensitivity and specificity.
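The thresholding procedure described above can be illustrated with a minimal sketch. It assumes that filter outputs are already scaled to [0, 1] and that a cell is classified positive when its output is at or below the threshold, a convention chosen here only so that both rates grow with the threshold as stated in the text; the function name `roc_points` and the sample data are hypothetical.

```python
def roc_points(scores, labels, thresholds):
    """Return one (false positive rate, true positive rate) pair per threshold.

    scores: filter outputs in [0, 1], one per cell (assumed scaling).
    labels: expert tags, True when the cell belongs to a vortex.
    Both classes must be present, otherwise a rate is undefined.
    """
    positives = sum(1 for lab in labels if lab)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        # Assumed convention: a cell is positive when its score is <= t,
        # so raising t can only add positives (rates are non-decreasing).
        tp = sum(1 for s, lab in zip(scores, labels) if s <= t and lab)
        fp = sum(1 for s, lab in zip(scores, labels) if s <= t and not lab)
        points.append((fp / negatives, tp / positives))
    return points


# Four cells, two of them tagged as vortex cells by the expert.
example = roc_points([0.1, 0.4, 0.35, 0.8],
                     [True, True, False, False],
                     [0.0, 0.2, 0.5, 1.0])
print(example)
```

Each threshold yields one point of the curve; the end-user's trade-off corresponds to picking one of these points.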

3 http://cs.gmu.edu/eclab/projects/ecj/


Fig. 1. An example of detection performed by an expert: interesting vortices are circled in black

2.1 Basic GP Filters Presentation

The set of function and terminal nodes is shown in Table 1; it has been chosen to allow computations on the physical characteristics of the stream.

For example, it seems relevant to use information from the 8 neighbors of the cell we are working on: the “strength3x3” terminal returns the mean value of the stream strength in the neighboring cells, and the “angle3x3” terminal gives the mean value of the angle of the stream vector in those same cells. The “min” and “max” function nodes have been introduced to allow comparisons. The “curl” and “divergence” are standard operators used in vortex detection. Notice that, in order to speed up the evaluation phase, most terminal nodes (curl, divergence, strength, strength3x3, angle3x3) are pre-computed for the maps in the learning set. The evolution parameters are shown in Table 2 and are quite standard.

Table 1. Summary of non-terminal and terminal nodes used in the basic GP filters

Name         Meaning                                                   Input    Output
add          addition                                                  2 reals  1 real
sub          subtraction                                               2 reals  1 real
mul          multiplication                                            2 reals  1 real
div          protected division                                        2 reals  1 real
min          minimum of 2 arguments                                    2 reals  1 real
max          maximum of 2 arguments                                    2 reals  1 real
cos          cosine                                                    1 real   1 real
sin          sine                                                      1 real   1 real
strength     stream strength                                           null     1 real ∈ [0, 1]
strength3x3  stream strength averaged over a 3x3 cells matrix          null     1 real ∈ [0, 1]
angles3x3    stream vector angle averaged over a 3x3 cells matrix      null     1 real
curl         cell vorticity                                            null     1 real
divergence   cell divergence                                           null     1 real
erc          ephemeral random constant                                 null     1 real

Table 2. General parameters used in the GP algorithm

Name                              Value
Number of generations             80
Size of the population            600
Max depth for a tree              15
Mutation rate                     5%
Crossover rate                    85%
Reproduction rate (with elitism)  5%
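As an illustration of how such terminals could be pre-computed once per map, here is a minimal sketch. The paper does not give its discretization, so the central finite differences, the border handling (derivatives left at 0 on the map border, 3x3 window clipped at the edges), the row/column orientation (rows follow y, columns follow x) and the name `precompute_terminals` are all assumptions made for the example. Normalization of strength to [0, 1] is not shown.

```python
import math


def precompute_terminals(u, v, cell_size=1.0):
    """Pre-compute per-cell terminal values from stream components u and v.

    u[r][c], v[r][c]: x and y components of the stream vector.
    Assumed layout: row index grows with y, column index grows with x.
    """
    rows, cols = len(u), len(u[0])
    strength = [[math.hypot(u[r][c], v[r][c]) for c in range(cols)]
                for r in range(rows)]
    curl = [[0.0] * cols for _ in range(rows)]
    divergence = [[0.0] * cols for _ in range(rows)]
    strength3x3 = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Mean strength over the 3x3 window, clipped at the map border.
            window = [strength[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            strength3x3[r][c] = sum(window) / len(window)
            # Central differences, interior cells only (assumption).
            if 0 < r < rows - 1 and 0 < c < cols - 1:
                du_dx = (u[r][c + 1] - u[r][c - 1]) / (2.0 * cell_size)
                du_dy = (u[r + 1][c] - u[r - 1][c]) / (2.0 * cell_size)
                dv_dx = (v[r][c + 1] - v[r][c - 1]) / (2.0 * cell_size)
                dv_dy = (v[r + 1][c] - v[r - 1][c]) / (2.0 * cell_size)
                curl[r][c] = dv_dx - du_dy
                divergence[r][c] = du_dx + dv_dy
    return {"strength": strength, "strength3x3": strength3x3,
            "curl": curl, "divergence": divergence}


# Rigid rotation around the central cell of a 3x3 map: u = -y, v = x.
u = [[-(r - 1.0) for c in range(3)] for r in range(3)]
v = [[c - 1.0 for c in range(3)] for r in range(3)]
terminals = precompute_terminals(u, v)
print(terminals["curl"][1][1], terminals["divergence"][1][1])
```

With the tables stored this way, a terminal node only has to read one value per cell during evaluation, which is the speed-up the text refers to.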

2.2 Fitness Function Choice

One of the difficulties in Genetic Programming is to find an adequate fitness function to optimize. Basically, the fitness of individuals is evaluated by measuring their performance on a learning set of 10 maps tagged by an expert. However, the actual performance of a filter depends on the choice of the threshold level. A possible choice is maximizing the area under the ROC curve, denoted AUC (Area Under Curve); see Sebag et al. [5] for a discussion about efficient computation of this area. Optimizing the AUC delivers good results, but the ant algorithm still dominates when the threshold trade-off is aimed at very low false positive rates.
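For illustration, the AUC can be approximated from a list of ROC points with the trapezoidal rule. This is only a simple sketch, not the efficient computation discussed in [5]; the function name `auc_trapezoid` and the sample points are hypothetical.

```python
def auc_trapezoid(points):
    """Approximate the area under a ROC curve given as (fpr, tpr) pairs."""
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        # Area of the trapezoid between two consecutive ROC points.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


roc = [(0.0, 0.0), (0.0, 0.5), (0.5, 1.0), (1.0, 1.0)]
print(auc_trapezoid(roc))
```

A random classifier follows the diagonal and scores 0.5, while a perfect one scores 1.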

We therefore propose to focus on having a steeper slope in the left part of the ROC curve (low false positive rates). This is achieved by choosing a set of 10 values on the ROC x-axis, 5 in the range [0.25, 0.35], the others equally spaced on the range [0, 1] \ [0.25, 0.35], and minimizing the following fitness function:

$$f = \frac{\sum_{i=1}^{n} \frac{y_i}{x_i}}{n}$$

where $x_i$ is a value chosen on the x-axis and $y_i$ the corresponding value on the y-axis according to the ROC curve.
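The quantity f can be evaluated from a sampled ROC curve as in the following sketch. Linear interpolation between ROC points is an assumption (the paper does not say how $y_i$ is read off the curve), the sample abscissas are passed in by the caller rather than generated, each $x_i$ must be strictly positive to avoid a division by zero, and the names `roc_value_at` and `slope_fitness` are hypothetical.

```python
def roc_value_at(points, x):
    """Linearly interpolate a ROC curve, given as (fpr, tpr) pairs, at x."""
    pts = sorted(points)
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            if x1 == x0:
                # Vertical segment: keep the upper end of the curve.
                return max(y0, y1)
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the range covered by the ROC points")


def slope_fitness(points, xs):
    """Mean of y_i / x_i over the chosen abscissas xs (all must be > 0)."""
    return sum(roc_value_at(points, x) / x for x in xs) / len(xs)


roc = [(0.0, 0.0), (0.5, 1.0), (1.0, 1.0)]
print(slope_fitness(roc, [0.25, 0.5, 1.0]))
```

Each ratio y_i / x_i is the slope of the line from the origin to the ROC point at x_i, so samples concentrated at low false positive rates weigh the left part of the curve more heavily.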

2.3 First Results and Discussion

Unfortunately, the GP approach we just described fails to give conclusive results, although it relies on state-of-the-art evolutionary techniques that were previously successful on classification and pattern detection cases. On our problem, the filters' ROC curves are dominated by the results obtained from the ant algorithm. GP produces a rough and noisy classification, especially near the coast, which is reminiscent of the results obtained by vorticity analysis.

We conjecture that these filters have too limited a “sight range” to recognize global vortex shapes that can spread over 20 grid cells or more. We saw in the introduction that whether or not a structure is considered retentive certainly depends on each cell of that structure, but also on distant surrounding cells that are not part of the vortex. In this regard, the “strength3x3” and “angle3x3” nodes probably give information that is too local, and we need to add more problem-specific knowledge to allow GP to bridge the gap.

Experiments have been conducted to let the evolution process determine the size of these matrix-shaped terminals, but these were not successful, leading us to propose a solution based on the propagation of classification results across the grid, as explained in the next section.

3 Iterative Genetic Programming Filters

To remedy the failure of the previous scheme, we need to provide some means of communicating information over the grid, while keeping a manageable search space: a large increase in the number of terminals to access a variety of distant cells would prevent successful learning by GP.

Our proposal is iterative filters, i.e. filters that are applied in several successive classification steps on a map, retaining the final decision, and that have a memory of their previous decisions at each classification step (see Figure 2). If the filter operating at a given cell accesses such memory from its neighbors, information will slowly spread along the grid at every iteration.

3.1 Iterative GP Filters Presentation

From a technical point of view, two nodes are added to the terminal set: lastValue and meanLastValue.

Fig. 2. Description of the way an iterative filter works


– lastValue: returns a value that aggregates the filter results at previous iterations. This value is 0.5 during the first classification step (no previous result), and it is updated using the following equation:

$$\text{lastValue}_{i+1} = \frac{2 \cdot \text{lastValue}_i + F}{3}$$

where $\text{lastValue}_i$ is the value returned by this terminal at iteration i, and F is the classification value computed by the filter.

– meanLastValue: returns the mean of lastValue for the 8 neighboring cells.

Fig. 3. Evolution of the classification after 1, 15 and 30 iterations, without distCoast terminal

Fig. 4. Comparison of fitness evolution for different iteration parameters

Thanks to meanLastValue, a filter is now able to take into account classification results from its immediate neighbors, and, over successive iterations, it can grasp classification information about cells that are two, three or more grid cells away, depending on how many iterations we allow. The F value produced by the individual during the last iteration will be its final classification and will serve to compute its fitness.
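The iteration scheme with the lastValue and meanLastValue terminals can be sketched as follows. The evolved GP tree is replaced by a hypothetical callable `filter_fn`. The synchronous update (every cell reads the memory written at the previous iteration) and the handling of border cells (averaging over the neighbors that exist) are assumptions, since the paper does not specify them.

```python
def run_iterative_filter(filter_fn, rows, cols, iterations):
    """Apply a filter repeatedly over a grid, with a per-cell memory.

    filter_fn(r, c, last_value, mean_last_value) -> F stands in for an
    evolved GP tree (hypothetical signature).
    Returns the F values of the last iteration and the final memory.
    """
    last = [[0.5] * cols for _ in range(rows)]  # no previous result yet
    output = [[0.0] * cols for _ in range(rows)]
    for _ in range(iterations):
        new_last = [[0.0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                # Memory of the (up to) 8 neighbors, clipped at the border.
                neighbors = [last[rr][cc]
                             for rr in range(max(0, r - 1), min(rows, r + 2))
                             for cc in range(max(0, c - 1), min(cols, c + 2))
                             if (rr, cc) != (r, c)]
                mean_last = (sum(neighbors) / len(neighbors)
                             if neighbors else last[r][c])
                f = filter_fn(r, c, last[r][c], mean_last)
                output[r][c] = f
                # lastValue_{i+1} = (2 * lastValue_i + F) / 3
                new_last[r][c] = (2.0 * last[r][c] + f) / 3.0
        last = new_last
    return output, last


# A trivial stand-in filter that always answers 1.0.
final_f, memory = run_iterative_filter(lambda r, c, lv, mlv: 1.0, 2, 2, 2)
print(final_f[0][0], memory[0][0])
```

Because each cell only reads its immediate neighbors, information travels at most one grid cell per iteration, which is why the number of iterations bounds the effective sight range.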

Experiments also show that it is very difficult for a filter to avoid false positives near the coast line, which almost sets an upper bound on performance. To tackle this problem, a distCoast node is introduced that returns 1 if the cell is farther than 2 grid steps from the coast, else 0.
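A possible way to pre-compute the distCoast terminal is sketched below. The paper does not define how grid steps are counted, so the sketch assumes that the coast is represented by land cells in a boolean mask and that distance is measured in 8-neighborhood steps (Chebyshev distance); the name `dist_coast_terminal` and the mask layout are hypothetical.

```python
def dist_coast_terminal(is_land, max_steps=2):
    """Return 1.0 for cells farther than max_steps from any land cell, else 0.0.

    is_land[r][c] is True for land cells (assumed representation of the coast).
    Distance is counted in 8-neighborhood grid steps (assumption).
    """
    rows, cols = len(is_land), len(is_land[0])
    result = [[1.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Look for land within max_steps of the cell, clipped at the border.
            for rr in range(max(0, r - max_steps), min(rows, r + max_steps + 1)):
                for cc in range(max(0, c - max_steps), min(cols, c + max_steps + 1)):
                    if is_land[rr][cc]:
                        result[r][c] = 0.0
    return result


# One row of cells with land at the left end.
mask = [[True, False, False, False, False]]
print(dist_coast_terminal(mask))
```

Like the other terminals, this table depends only on the map geometry, so it can be computed once and reused across generations.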

3.2 Iterative Filters Results

In Figure 3 we plot the evolution of the classification result for three iteration limits. We observe that the classification is refined in the first iterations before becoming stable.

Figure 4 is a standard fitness versus generations plot. We can see that iterative filters have an increased efficiency, with a maximum at 6 iterations. The distCoast node also boosts the performance.

Fig. 5. ROC curve based comparison between GP filters and other methods


Fig. 6. A map filtered using the “distCoast” node

A comparison with the ant algorithm and streamlines schemes introduced in [1] is given in Figure 5 using ROC curves. Depending on the trade-off desired, either the “steeper slope” fitness function or the AUC maximization may be preferred. This plot also shows the benefits of adding the distCoast terminal. The number of false positives is reduced in the neighborhood of the coast line, as illustrated in Figure 6, which can be compared with Figure 3 in this respect.

Maximizing the ROC AUC within a 5-fold cross validation experiment (1600 training cells, 7200 test cells), we obtained a mean AUC value of 0.8955 (normalized, maximum is 1) with a standard deviation of 0.0093. We performed a similar experiment with a non-recurrent back-propagation artificial neural network (see e.g. [7]), taking 54 inputs, i.e. the same 6 terminal inputs as GP for 9 cells evenly spaced in a 70 km x 70 km area around the classification focus. Limiting the learning time to 15 min, as for GP, we obtained a mean AUC value of 0.7515 with a standard deviation of 0.0178. We cannot claim to have spent as much time tuning the artificial neural network (ANN) as we spent on the GP algorithm; nonetheless, this result gives some indication that GP is competitive with ANN for this problem.

4 Conclusion

We presented iterative GP filters for the detection of retentive meso-scale vortices on simulated stream vector fields. This scheme required considerable insight into the problem in order to develop not only suitable GP functions and terminals, but also an original iterated scheme for GP classification and an alternative to the AUC maximization fitness function.


With our GP-based filtering method, we are able to learn some part of the expert knowledge, while also performing meaningful computations in terms of vector field analysis, as can be judged by the results. We think that this iterating scheme for GP classification may well be of interest in the image analysis domain and possibly for general classification tasks.

Although preliminary work with ADFs has shown no increase in performance, we also plan to investigate this direction further.

References

1. M. Segond, C. Fonlupt, D. Robilliard, Ant Algorithms for Detection in Coastal Waters, EA’03, Vol. 1, pp. 1-100, 2003.

2. J. M. Daida, C. T. F. Bersano-Begey, S. J. Ross, J. F. Vesecky, Computer Assisted Design of Image Classification Algorithms: Dynamic and Static Fitness Evaluations in a Scaffolded Genetic Programming Environment, GP’96, 1996.

3. T. Corpetti, E. Memin, P. Perez, Dense estimation of fluid flows, IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(3), pp. 365-380, 2002.

4. T. Corpetti, E. Memin, P. Perez, Extraction of singular points from dense motion fields: an analytic approach, Journal of Mathematical Imaging and Vision, 2003.

5. M. Sebag, J. Aze, N. Lucas, ROC-Based Evolutionary Learning: Application to Medical Data Mining, Artificial Evolution 2003, pp. 384-396, 2003.

6. W.B. Langdon, B.F. Buxton, Evolving Receiver Operating Characteristics for Data Fusion, Proceedings of EuroGP’2001, pp. 87–96, LNCS 2038, Springer-Verlag, 2001.

7. T. M. Mitchell, Machine Learning, McGraw-Hill, 1997.
