Ant Colonies for Travelling Salesman Problem


BioSystems 43 (1997) 73-81

Ant colonies for the travelling salesman problem

Marco Dorigo a,*, Luca Maria Gambardella b

a IRIDIA, Université Libre de Bruxelles, Avenue Franklin Roosevelt 50, CP 194/6, 1050 Bruxelles, Belgium
b IDSIA, Corso Elvezia 36, 6900 Lugano, Switzerland

* Corresponding author. E-mail: [email protected]; http://iridia.ulb.ac.be/dorigo/dorigo.html

Received 11 October 1996; accepted 24 October 1996

Abstract

We describe an artificial ant colony capable of solving the travelling salesman problem (TSP). Ants of the artificial colony are able to generate successively shorter feasible tours by using information accumulated in the form of a pheromone trail deposited on the edges of the TSP graph. Computer simulations demonstrate that the artificial ant colony is capable of generating good solutions to both symmetric and asymmetric instances of the TSP. The method is an example, like simulated annealing, neural networks and evolutionary computation, of the successful use of a natural metaphor to design an optimization algorithm. © 1997 Elsevier Science Ireland Ltd.

Keywords: Ant colony optimization; Computational intelligence; Artificial life; Adaptive behavior; Combinatorial optimization; Reinforcement learning

Introduction

Real ants are capable of finding the shortest path from a food source to the nest (Beckers et al., 1992; Goss et al., 1989) without using visual cues (Holldobler and Wilson, 1990). They are also capable of adapting to changes in the environment, e.g. finding a new shortest path once the old one is no longer feasible due to a new obstacle (Beckers et al., 1992; Goss et al., 1989). Consider Fig. 1A: ants are moving on a straight line that connects a food source to their nest. It is well known that the primary means for ants to form and maintain the line is a pheromone trail. Ants deposit a certain amount of pheromone while walking, and each ant probabilistically prefers to follow a direction rich in pheromone. This elementary behaviour of real ants can be used to explain how they can find the shortest path that reconnects a broken line after the sudden appearance of an unexpected obstacle has interrupted the initial path (Fig. 1B). In fact, once the obstacle has appeared, those ants which are just in front of the obstacle cannot continue to follow the pheromone trail and therefore they have to


choose between turning right or left. In this situation we can expect half the ants to choose to turn right and the other half to turn left. A very similar situation can be found on the other side of the obstacle (Fig. 1C). It is interesting to note that those ants which choose, by chance, the shorter path around the obstacle will more rapidly reconstitute the interrupted pheromone trail compared to those which choose the longer path. Thus, the shorter path will receive a greater amount of pheromone per time unit and in turn a larger number of ants will choose the shorter path. Due to this positive feedback (autocatalytic) process, all the ants will rapidly choose the shorter path (Fig. 1D). The most interesting aspect of this autocatalytic process is that finding the shortest path around the obstacle seems to be an emergent property of the interaction between the obstacle shape and the ants' distributed behaviour: although all ants move at approximately the same speed and deposit a pheromone trail at approximately the same rate, it is a fact that it takes longer to contour obstacles on their longer side than on their shorter side, which makes the pheromone trail accumulate more quickly on the shorter side. It is the ants' preference for higher pheromone trail levels which makes this accumulation still quicker on the shorter path. We will now show how a similar process can be put to work in a simulated world inhabited by artificial ants that try to solve the travelling salesman problem.

Fig. 1. (A) Real ants follow a path between nest and food source. (B) An obstacle appears on the path: ants choose whether to turn left or right with equal probability. (C) Pheromone is deposited more quickly on the shorter path. (D) All ants have chosen the shorter path.

The travelling salesman problem (TSP) is the problem of finding a shortest closed tour which visits all the cities in a given set. In this article we will restrict attention to TSPs in which cities are on a plane and a path (edge) exists between each pair of cities (i.e., the TSP graph is completely connected).

Artificial ants

In this work an artificial ant is an agent which moves from city to city on a TSP graph. It chooses the city to move to using a probabilistic function both of the trail accumulated on edges and of a heuristic value, which was chosen here to be a function of the edges' length. Artificial ants probabilistically prefer cities that are connected by edges with a lot of pheromone trail and which are close by. Initially, m artificial ants are placed on randomly selected cities. At each time step they move to new cities and modify the pheromone trail on the edges used (this is termed local trail updating). When all the ants have completed a tour, the ant that made the shortest tour modifies the edges belonging to its tour (termed global trail updating) by adding an amount of pheromone trail that is inversely proportional to the tour length.

These are three ideas from natural ant behaviour that we have transferred to our artificial ant colony: (i) the preference for paths with a high pheromone level, (ii) the higher rate of growth of the amount of pheromone on shorter paths, and (iii) the trail-mediated communication among ants.



Artificial ants were also given a few capabilities which do not have a natural counterpart, but which have been observed to be well suited to the TSP application: artificial ants can determine how far away cities are, and they are endowed with a working memory M_k used to memorize cities already visited (the working memory is emptied at the beginning of each new tour, and is updated after each time step by adding the newly visited city).

There are many different ways to translate the above principles into a computational system apt to solve the TSP. In our ant colony system (ACS) an artificial ant k in city r chooses the city s to move to among those which do not belong to its working memory M_k by applying the following probabilistic formula:

s = \begin{cases} \arg\max_{u \notin M_k} \{ [\tau(r,u)] \cdot [\eta(r,u)]^{\beta} \} & \text{if } q \le q_0 \\ S & \text{otherwise} \end{cases}    (1)

where τ(r, u) is the amount of pheromone trail on edge (r, u), η(r, u) is a heuristic function, which was chosen to be the inverse of the distance between cities r and u, β is a parameter which weighs the relative importance of pheromone trail and of closeness, q is a value chosen randomly with uniform probability in [0, 1], q_0 (0 ≤ q_0 ≤ 1) is a parameter, and S is a random variable selected according to the following probability distribution, which favours edges which are shorter and have a higher level of pheromone trail:

p_k(r,s) = \begin{cases} \dfrac{[\tau(r,s)] \cdot [\eta(r,s)]^{\beta}}{\sum_{u \notin M_k} [\tau(r,u)] \cdot [\eta(r,u)]^{\beta}} & \text{if } s \notin M_k \\ 0 & \text{otherwise} \end{cases}    (2)

where p_k(r, s) is the probability with which ant k chooses to move from city r to city s.

The pheromone trail is changed both locally and globally. Global updating is intended to reward edges belonging to shorter tours. Once artificial ants have completed their tours, the best ant deposits pheromone on visited edges; that is, on those edges that belong to its tour. (The other edges remain unchanged.) The amount of pheromone Δτ(r, s) deposited on each visited edge (r, s) by the best ant is inversely proportional to the length of the tour: the shorter the tour, the greater the amount of pheromone deposited on edges. This manner of depositing pheromone is intended to emulate the property of differential pheromone trail accumulation, which in the case of real ants was due to the interplay between the length of the path and continuity of time. The global trail updating formula is τ(r, s) ← (1 - α)·τ(r, s) + α·Δτ(r, s), where Δτ(r, s) = (shortest tour)^-1. Global trail updating is similar to a reinforcement learning scheme in which better solutions get a higher reinforcement.

Local updating is intended to avoid a very strong edge being chosen by all the ants: every time an edge is chosen by an ant, its amount of pheromone is changed by applying the local trail updating formula τ(r, s) ← (1 - α)·τ(r, s) + α·τ_0, where τ_0 is a parameter. Local trail updating is also motivated by trail evaporation in real ants.
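Read as code, Eqs. (1) and (2) and the two updating formulas amount to the following Python sketch (our own reading, not the authors' implementation). Here tau and dist are assumed to be symmetric matrices, visited plays the role of the working memory M_k, and the default values of β, q_0, α and τ_0 simply mirror those reported later in the Results section.

    import random

    def choose_next_city(r, visited, tau, dist, beta=2.0, q0=0.9, rng=random):
        # candidate cities are those not in the working memory M_k
        candidates = [u for u in range(len(dist)) if u not in visited]
        weights = [tau[r][u] * (1.0 / dist[r][u]) ** beta for u in candidates]
        if rng.random() <= q0:
            # Eq. (1), exploitation: the best edge according to pheromone and closeness
            return max(zip(candidates, weights), key=lambda cw: cw[1])[0]
        # Eq. (2), biased exploration: sample S with probability proportional to tau * eta^beta
        threshold = rng.random() * sum(weights)
        cumulative = 0.0
        for u, w in zip(candidates, weights):
            cumulative += w
            if cumulative >= threshold:
                return u
        return candidates[-1]                      # guard against floating-point round-off

    def local_trail_update(tau, r, s, alpha=0.1, tau0=1e-3):
        # applied every time an ant crosses edge (r, s)
        tau[r][s] = (1 - alpha) * tau[r][s] + alpha * tau0
        tau[s][r] = tau[r][s]

    def global_trail_update(tau, best_tour, best_length, alpha=0.1):
        # only the best ant reinforces its edges, by (shortest tour)^-1
        n = len(best_tour)
        for i in range(n):
            r, s = best_tour[i], best_tour[(i + 1) % n]
            tau[r][s] = (1 - alpha) * tau[r][s] + alpha * (1.0 / best_length)
            tau[s][r] = tau[r][s]

Substituting choose_next_city for the greedy placeholder in the loop sketched earlier gives the complete ACS construction step.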



Interestingly, we can interpret the ant colony as a reinforcement learning system, in which reinforcements modify the strength (i.e. pheromone trail) of connections between cities. In fact, the above Eqs. (1) and (2) dictate that an ant can either, with probability q_0, exploit the experience accumulated by the ant colony in the form of pheromone trail (pheromone trail will tend to grow on those edges which belong to short tours, making them more desirable) or, with probability (1 - q_0), apply a biased exploration (exploration is biased towards short and high-trail edges) of new paths by choosing the city to move to randomly, with a probability distribution that is a function of both the accumulated pheromone trail, the heuristic function and the working memory M_k.

It is interesting to note that ACS employs a novel type of exploration strategy. First, there is the stochastic component S of Eq. (1): here the exploration of new paths is biased towards short and high-trail edges. (Eq. (1), which we call the pseudo-random-proportional action choice rule, is strongly reminiscent of the pseudo-random action choice rule often used in reinforcement learning; see for example Q-learning (Watkins and Dayan, 1992)). Second, local trail updating tends to encourage exploration, since each path taken has its pheromone value reduced by the local updating formula.
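The contrast with the pseudo-random rule can be made explicit with a small, purely illustrative sketch (helper names of our own): a pseudo-random rule in the Q-learning sense explores uniformly among the remaining cities, whereas the pseudo-random-proportional rule of Eq. (1) biases exploration with the same τ·η^β weights it uses for exploitation.

    import random

    def pseudo_random_choice(candidates, weights, q0, rng=random):
        if rng.random() <= q0:
            return max(zip(candidates, weights), key=lambda cw: cw[1])[0]
        return rng.choice(candidates)                             # uniform exploration

    def pseudo_random_proportional_choice(candidates, weights, q0, rng=random):
        if rng.random() <= q0:
            return max(zip(candidates, weights), key=lambda cw: cw[1])[0]
        return rng.choices(candidates, weights=weights, k=1)[0]   # exploration biased as in Eq. (2)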



Results

We applied ACS to the symmetric and asymmetric TSPs listed in Tables 1-4 and Table 7. These test problems were chosen either because there was data available in the literature to compare our results with those obtained by other naturally inspired methods or with the optimal solutions (the symmetric instances), or to show the ability of ACS in solving difficult instances of the TSP (the asymmetric instances).

Table 1. Comparison of ACS with other nature-inspired algorithms on random instances of the symmetric TSP. Comparison of average tour length obtained on five 50-city problems (City set 1 to City set 5). Results on SA, EN and SOM are from Durbin and Willshaw (1987) and Potvin (1993). FI results are averaged over 15 trials starting from different initial cities. ACS was run for 1250 iterations using m = 20 ants and the results are averaged over 15 trials. The best average tour length for each problem is in bold.

Table 2. Comparison of ACS with other nature-inspired algorithms on random instances of the symmetric TSP. Comparison on the shortest tour length obtained by SA + 3-opt (best tour length found by SA and many distinct runs of 3-opt), SOM+ (best tour length found by SOM over 4000 different runs, by processing the cities in various orders), FI, FI + 3-opt (best tour length found by FI locally optimized by 3-opt), and ACS with and without local optimization by 3-opt. The 3-opt heuristic uses the result of ACS and FI as starting configuration for local optimization. Results on SA + 3-opt and SOM+ are from Durbin and Willshaw (1987) and Potvin (1993). ACS was run for 1250 iterations using m = 20 ants and the best tour length was obtained out of 15 trials. The best tour length for each problem is in bold.

Table 3. Comparison of ACS with GA, EP, SA and the AG (Lin et al., 1993) on Oliver30, Eil50, Eil75 and KroA100. We report the best integer tour length, the best real tour length (in parentheses) and the number of tours required to find the best integer tour length (in square brackets). Results using EP are from Fogel (1993) and those using GA are from Bersini et al. (1995) for KroA100, and from Whitley et al. (1989) for Oliver30, Eil50 and Eil75. Results using SA and AG are from Lin et al. (1993). Oliver30 is from Oliver et al. (1987); Eil50 and Eil75 are from Eilon et al. (1969) and are included in TSPLIB with an additional city as Eil51.tsp and Eil76.tsp. KroA100 is also in TSPLIB. The best result for each problem is in bold. It is interesting to note that the complexity of all the algorithms is of order n^2 t, except for EP for which it is of order n t (where n is the number of cities and t is the number of tours generated). It is therefore clear that ACS and EP greatly outperform GA, SA and AG. TSPLIB: http://www.iwr.uni-heidelberg.de/iwr/comopt/soft/TSPLIB95/TSPLIB.html (maintained by G. Reinelt).

Table 4. ACS (cl = 20) performance for some bigger TSPs. First column: best result obtained by ACS out of 15 trials; we give the integer length of the shortest tour and, in square brackets, the number of tours which were generated before finding it. Second column: ACS average on 15 trials and its standard deviation (in square brackets). Third column: optimal result (for the 1577-city problem we give, in square brackets, the known lower and upper bounds, given that the optimal solution is not known). Fourth column: error percentage, a measure of the quality of the best result found by ACS. Fifth column: CPU seconds required to generate a tour on a Sun Sparc-server (50 MHz). The reason for the more than linear increase in time is that the number of failures, that is, the number of times an ant has to choose the next city outside of the candidate list, increases with the problem dimension. All problems are included in TSPLIB.

Using the test problems listed in Tables 1-3, the performance of ACS was compared with the performance of other naturally inspired global optimization methods: simulated annealing (SA); neural nets (NNs), here represented by the elastic net (EN) and by the self-organizing map (SOM); evolutionary computation (EC), here represented by the genetic algorithm (GA) and by evolutionary programming (EP); and a combination of simulated annealing and genetic algorithms (AG); moreover, we compared it with the farthest insertion (FI) heuristic. Numerical experiments were executed with ACS and FI, whereas the performance figures for the other algorithms were taken from the literature. The ACS parameters were set to the following values: m = 10, β = 2, α = 0.1, q_0 = 0.9, τ_0 = (n · L_nn)^-1, where L_nn is the tour length produced by the nearest neighbour heuristic and n is the number of cities (these values were found to be very robust across a wide variety of problems). In some experiments (Table 2), the best solution found by the heuristic was carried to its local optimum by applying 3-opt (Croes, 1958).
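As a sketch of how the τ_0 setting quoted above can be computed (helper names are ours, not from the paper): run the nearest neighbour heuristic once to obtain L_nn and set τ_0 = (n · L_nn)^-1.

    def nearest_neighbour_length(dist, start=0):
        # length of the tour built by always moving to the closest unvisited city
        n = len(dist)
        unvisited = set(range(n)) - {start}
        current, length = start, 0.0
        while unvisited:
            nxt = min(unvisited, key=lambda c: dist[current][c])
            length += dist[current][nxt]
            unvisited.remove(nxt)
            current = nxt
        return length + dist[current][start]      # close the tour

    def acs_parameters(dist):
        # parameter values reported in the text: m = 10, beta = 2, alpha = 0.1, q0 = 0.9
        n = len(dist)
        l_nn = nearest_neighbour_length(dist)
        return {"m": 10, "beta": 2.0, "alpha": 0.1, "q0": 0.9, "tau0": 1.0 / (n * l_nn)}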



The tables show that ACS finds results which are at least as good as, and often better than, those found by the other methods. Also, the best solutions found by ACS in Table 2 were local optima with respect to 3-opt.

We also ran ACS on some bigger problems to study its behaviour for increasing problem dimensions (Table 4). For these runs we implemented a slightly modified version of ACS which incorporates a more advanced data structure known as a candidate list, a data structure normally used when trying to solve big TSP problems (Reinelt, 1994; Johnson and McGeoch, 1997). A candidate list is a list of preferred cities to be visited; it is a static data structure which contains, for a given city i, the cl closest cities. In practice, an ant in ACS with a candidate list first chooses the city to move to among those belonging to the candidate list. Only if none of the cities in the candidate list can be visited does it consider the rest of the cities. In Tables 5 and 6 we study the performance of ACS for different lengths of the candidate list (ACS without a candidate list corresponds to ACS with the list length set to cl = n). We report the results obtained for the Eil51 and Pcb442 TSPs (both problems are included in TSPLIB), which show that a short candidate list improves both the average and the best performance of ACS; also, using a short candidate list takes less CPU time to build a tour than using a longer one. The results reported in Table 4 were obtained setting cl = 20.

Table 5. Comparison between candidate list sizes. Problem: Eil51. Columns: candidate list length; average number of failures for each tour built; ACS average; ACS best result; average time per trial (secs). For each candidate list length, averages are computed over 15 trials.

Table 6. Comparison between candidate list sizes. Problem: Pcb442. Columns: candidate list length; average number of failures for each tour built; ACS average; ACS best result; average time per trial (secs). For each candidate list length, averages are computed over 10 trials.
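A sketch of the candidate-list mechanism just described, using helper names of our own: each city keeps a static list of its cl closest cities; when moving, an ant first restricts its choice to the unvisited members of that list and falls back to the remaining unvisited cities only when the list is exhausted (a "failure" in the terminology of Table 4).

    def build_candidate_lists(dist, cl=20):
        # static data structure: for each city, its cl closest cities
        n = len(dist)
        return [sorted((c for c in range(n) if c != i), key=lambda c: dist[i][c])[:cl]
                for i in range(n)]

    def restricted_candidates(r, visited, candidate_lists, n):
        shortlist = [c for c in candidate_lists[r] if c not in visited]
        if shortlist:
            return shortlist                      # normal case: stay within the candidate list
        # a "failure": every candidate has been visited, consider the rest of the cities
        return [c for c in range(n) if c not in visited]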



Still more promising are the results we obtained applying ACS to some asymmetric TSP problems (Table 7). For example, ACS was able to find, in 220 sec using a Pentium PC, the optimal solution for a 43-city asymmetric problem called 43X2. The same problem could not be solved to optimality with less than 32 h of computation on a workstation by the best published code available for the asymmetric TSP, based on the Assignment Problem relaxation (Fischetti and Toth, 1992). This is one of the most difficult instances of the asymmetric TSP, and was only very recently solved to optimality by Fischetti and Toth (1994) with an algorithm based on polyhedral cuts (a branch-and-cut scheme) (1).

(1) It should be noted that the task faced by ACS, as is the case for the heuristic methods, is easier than that faced by any optimality-proving algorithm, since ACS does not have to prove the optimality of the obtained result. The results of Table 7 should therefore be taken for what they are: they suggest that ACS is a good method for finding good solutions to ATSPs in a reasonably short time, and not that ACS is competitive with exact methods.

Table 7. Comparison between exact methods and ACS for ATSP problems. The exact method is the best published deterministic code for the asymmetric TSP; results are from Fischetti and Toth (1992) and Fischetti and Toth (1994). The ry48p problem is from TSPLIB, and the 43X2 problem is from Balas et al. (1993). We report the tour length and, in parentheses, the CPU secs used to find the reported solution (experiments were run on a Pentium PC). ACS was run using 10 ants for 1500 iterations, and results were obtained out of 15 trials. The best result for each problem is in bold. Columns: problem; ACS best result; ACS average; FT-92; FT-94.

In addition to providing interesting computational results, ACS also presents some attractive characteristics due to the use of trail-mediated communication. First, communication determines a synergistic effect. This is shown for example in Fig. 2, which shows the typical result of an experiment in which we studied the average speed to find the optimal solution (defined as the inverse of the average time to find the optimal solution) as a function of the number of ants in ACS. To make the comparison fair, performance was measured by CPU time so as to discount for the higher complexity of the algorithm when ants communicate (the higher complexity is due to trail updating operations: when ants do not communicate, the trail is initially set to 1 on all edges and is not updated during computation). Second, communication increases the probability of quickly finding an optimal solution. Consider the distribution of the first finishing times, where the first finishing time is the time elapsed until the first optimal solution is found. Fig. 3 shows how this distribution changes in the communicating and the non-communicating cases. These results show that communication among ants (mediated by trail) is useful.

Fig. 2. Communication determines a synergistic effect. Communication among agents: solid line. Absence of communication: dotted line. Test problem: CCAO, a 10-city problem (Golden and Stewart, 1985). Average over 100 runs. (The use of an increasing number of ants does not improve performance in the case of absence of cooperation since we use CPU time to measure speed.)

Fig. 3. Communication changes the probability distribution of first finishing times. Communication among agents: solid line. Absence of communication: dotted line. Test problem: CCAO, a 10-city problem (Golden and Stewart, 1985). Average of 10,000 runs. Number of ants: m = 4.

Although when applied to the symmetric TSP ACS is not competitive with specialized heuristic methods like Lin-Kernighan (Lin and Kernighan, 1973), its performance can become very interesting when applied to a slightly different problem; in this article we reported some results on the asymmetric TSP.


References

Durbin, R. and Willshaw, D., 1987, An analogue approach to the travelling salesman problem using an elastic net method. Nature, 326, 689-691.

Eilon, S., Watson-Gandy, C.D.T. and Christofides, N., 1969, Distribution management: mathematical modeling and practical analysis. Oper. Res. Q., 20, 37-53.

Fischetti, M. and Toth, P., 1992, An additive bounding procedure for the asymmetric travelling salesman problem. Math. Program., 53, 173-197.

Fischetti, M. and Toth, P., 1994, A polyhedral approach for the exact solution of hard ATSP instances. Tech. Rep. No. OR-94, DEIS, Universita di Bologna, Italy.

Fleurent, C. and Ferland, J., 1994, Genetic hybrids for the quadratic assignment problem, in: DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Vol. 16, pp. 173-187.

Fogel, D., 1993, Applying evolutionary programming to selected traveling salesman problems. Cybern. Syst. Int. J., 24, 27-36.

Gambardella, L.M. and Dorigo, M., 1995, Ant-Q: a reinforcement learning approach to the traveling salesman problem, in: Proc. ML-95, 12th Int. Conf. on Machine Learning, A. Prieditis and S. Russell (eds.) (Morgan Kaufmann, San Francisco, CA) pp. 252-260.

Gambardella, L.M., Taillard, E. and Dorigo, M., 1997, Ant colonies for QAP. Tech. Rep. No. IDSIA-97-4, IDSIA, Lugano, Switzerland.

Golden, B. and Stewart, W., 1985, Empiric analysis of heuristics, in: The Traveling Salesman Problem, E.L. Lawler, J.K. Lenstra, A.H.G. Rinnooy-Kan and D.B. Shmoys (eds.) (Wiley, New York) pp. 207-250.

Goss, S., Aron, S., Deneubourg, J.L. and Pasteels, J.M., 1989, Self-organized shortcuts in the Argentine ant. Naturwissenschaften, 76, 579-581.

Holldobler, B. and Wilson, E.O., 1990, The Ants (Springer-Verlag, Berlin).

Johnson, D.S. and McGeoch, L.A., 1997, The travelling salesman problem: a case study in local optimization, in: Local Search in Combinatorial Optimization, E.H.L. Aarts and J.K. Lenstra (eds.) (Wiley, New York).

Li, Y., Pardalos, P.M. and Resende, M.G.C., 1994, A greedy randomized adaptive search procedure for the quadratic assignment problem, in: Quadratic Assignment and Related Problems, P.M. Pardalos and H. Wolkowicz (eds.) (DIMACS Series, American Mathematical Society, Rhode Island) pp. 237-261.

Lin, F.-T., Kao, C.-Y. and Hsu, C.-C., 1993, Applying the genetic approach to simulated annealing in solving some NP-hard problems. IEEE Trans. Syst., Man Cybern., 23, 1752-1767.

Lin, S. and Kernighan, B.W., 1973, An effective heuristic algorithm for the TSP. Oper. Res., 21, 498-516.

Oliver, I., Smith, D. and Holland, J.R., 1987, A study of permutation crossover operators on the travelling salesman problem, in: Proc. 2nd Int. Conf. on Genetic Algorithms, J.J. Grefenstette (ed.) (Lawrence Erlbaum, Hillsdale, New Jersey) pp. 224-230.

Potvin, J.-Y., 1993, The traveling salesman problem: a neural network perspective. ORSA J. Comput., 5 (4), 328-347.

Reinelt, G., 1994, The Traveling Salesman: Computational Solutions for TSP Applications (Springer-Verlag, Berlin).

Taillard, E., 1991, Robust taboo search for the quadratic assignment problem. Parallel Comput., 17, 443-455.

Watkins, C.J.C.H. and Dayan, P., 1992, Technical note: Q-learning. Mach. Learning, 8, 279-292.

Whitley, D., Starkweather, T. and Fuquay, D., 1989, Scheduling problems and travelling salesman: the genetic edge recombination operator, in: Proc. 3rd Int. Conf. on Genetic Algorithms, Schaffer (ed.) (Morgan Kaufmann, San Mateo, CA) pp. 133-140.