
Investigation of Parallel Particle Swarm Optimization Algorithm With Reduction of the Search Area

Algirdas Lančinskas, Julius Žilinskas Dept. of System Analysis

Institute of Mathematics and Informatics Vilnius, Lithuania

[email protected], [email protected]

Pilar Martínez Ortigosa Dept. of Computer Architecture and Electronics

University of Almería, Almería, Spain

[email protected]

Abstract— We consider a population-based Particle Swarm Optimization (PSO) algorithm and a few modifications to increase the quality of optimization. Several strategies to exchange data between processors in the parallel algorithm are investigated. An experimental investigation is performed on the Multiple Gravity Assist problem. The results are compared with the original PSO.

Keywords- global optimization; parallel computing; parallel Particle Swarm Optimization; Particle Swarm Optimization

I. INTRODUCTION

Particle Swarm Optimization (PSO) was developed by Eberhart and Kennedy [1][2] in 1995. PSO is a form of swarm intelligence in which the behavior of a biological social system, like a flock of birds or a school of fish [3], is simulated. PSO is one of the evolutionary computation techniques and, like most evolutionary computation techniques, PSO is a derivative-free search algorithm [4]. PSO is initialized with a population of random solutions, called particles, and their velocities.

With PSO, a population of particles (individuals) in an n-dimensional search space is simulated. Unlike in other evolutionary computation techniques, each particle is associated with a velocity. Particles fly through the search space with velocities which are dynamically adjusted according to their historical behavior: the best solution found by the particular particle and the best solution found by the whole population. PSO has applications in a variety of fields such as machine learning [2][5][6], operations research [7], chemistry and chemical engineering [8][9], etc.

The quality of PSO strongly depends on the population size: more particles in the population lead to a better result. On the other hand, more particles mean a larger number of function evaluations per iteration and a longer computation time. Parallel computing is useful to reduce the computation time.

II. PARTICLE SWARM OPTIMIZATION

A. Original PSO algorithm

Suppose we have a population of $N$ particles. Each $i$-th particle ($i = 1, \ldots, N$) of the population has two state parameters: its current position $\mathbf{x}_i^{(k)} = (x_{i1}^{(k)}, x_{i2}^{(k)}, \ldots, x_{in}^{(k)})$ and velocity $\mathbf{v}_i^{(k)} = (v_{i1}^{(k)}, v_{i2}^{(k)}, \ldots, v_{in}^{(k)})$, where $k$ indicates the iteration number. It also knows its previous best position (personal best experience) $\mathbf{p}_i = (p_{i1}, p_{i2}, \ldots, p_{in})$ and the best position found so far by all particles $\mathbf{p}^g = (p_1^g, p_2^g, \ldots, p_n^g)$. The maximum number of iterations is denoted by $k_{\max}$. The position of each particle is updated using the equation [10]

$\mathbf{x}_i^{(k+1)} = \mathbf{x}_i^{(k)} + \mathbf{v}_i^{(k+1)},$
$\mathbf{v}_i^{(k+1)} = w \cdot \mathbf{v}_i^{(k)} + c_1 r_1 (\mathbf{p}_i - \mathbf{x}_i^{(k)}) + c_2 r_2 (\mathbf{p}^g - \mathbf{x}_i^{(k)}), \quad i = 1, \ldots, N,$    (1)

where $r_1$ and $r_2$ represent random numbers uniformly distributed between 0 and 1. The variable $w$ is the inertia of the particle, which is reduced dynamically to decrease the search area in a gradual fashion. The two constant multipliers $c_1$ and $c_2$ are known as the "self-confidence" and "swarm confidence" factors, respectively [11]. A larger self-confidence multiplier attracts a particle toward its personal best position, and a larger swarm-confidence multiplier attracts it toward the best position found so far by all particles. Often $c_1 = c_2 = 2$ are chosen so that the products $c_1 r_1$ and $c_2 r_2$ have a mean of 1 [11]. The best fitness value of the $i$-th particle is denoted by $f_{best_i}$, and the best fitness value ever found by the whole population by $f_{best_g}$. The vector $\mathbf{v}^{\max} = (v_1^{\max}, v_2^{\max}, \ldots, v_n^{\max})$ restricts the values of the velocities $\mathbf{v}_i$, $i = 1, \ldots, N$, to the multidimensional interval $[-\mathbf{v}^{\max}, \mathbf{v}^{\max}]$. Usually $\mathbf{v}^{\max}$ is given by the equation

$\mathbf{v}^{\max} = \alpha \cdot (\mathbf{x}^{ub} - \mathbf{x}^{lb}),$

where $\mathbf{x}^{ub} = (x_1^{ub}, x_2^{ub}, \ldots, x_n^{ub})$ and $\mathbf{x}^{lb} = (x_1^{lb}, x_2^{lb}, \ldots, x_n^{lb})$ are the upper and lower bounds for the positions of particles. The factor $\alpha$ is often chosen to be 0.5 [10].

The original PSO algorithm can be described as follows:
1. Initialization
    Initialize the values $k_{\max}$, $c_1$, $c_2$, $w$, $\alpha$, etc. Set $k = 0$;


    Randomly initialize the positions $\mathbf{x}_i^{(k)}$ and velocities $\mathbf{v}_i^{(k)}$ for each particle $i = 1, \ldots, N$.
2. Optimization
    While $k < k_{\max}$ do
        For all $i = 1, 2, \ldots, N$
            Evaluate the objective function value $f_i = f(\mathbf{x}_i^{(k)})$;
            If $f_i < f_{best_i}$ then $f_{best_i} = f_i$, $\mathbf{p}_i = \mathbf{x}_i^{(k)}$;
            If $f_i < f_{best_g}$ then $f_{best_g} = f_i$, $\mathbf{p}^g = \mathbf{x}_i^{(k)}$;
            If the stopping criterion is satisfied then break, else update the velocity and position of the particle by equation (1);
        End For;
        $k = k + 1$;
    End While;
3. Report results.
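A minimal sketch of the algorithm above in Python with NumPy may help to fix the notation. The function and variable names are ours; the decreasing inertia schedule and the early stopping test used in the paper are left out for brevity:

```python
import numpy as np

def pso(f, lb, ub, N=100, k_max=20000, c1=2.0, c2=2.0, w=0.7, alpha=0.5, seed=None):
    """Synchronous PSO following equation (1); a sketch, not the exact code of the paper."""
    rng = np.random.default_rng(seed)
    n = lb.size
    v_max = alpha * (ub - lb)                        # velocity limits v^max
    x = rng.uniform(lb, ub, size=(N, n))             # positions x_i
    v = rng.uniform(-v_max, v_max, size=(N, n))      # velocities v_i
    p = x.copy()                                     # personal best positions p_i
    f_best = np.array([f(xi) for xi in x])           # personal best values
    g = p[np.argmin(f_best)].copy()                  # global best position p^g
    f_best_g = f_best.min()

    for k in range(k_max):
        r1 = rng.random((N, n))
        r2 = rng.random((N, n))
        v = w * v + c1 * r1 * (p - x) + c2 * r2 * (g - x)   # eq. (1), velocity update
        v = np.clip(v, -v_max, v_max)                       # keep v_i within [-v^max, v^max]
        x = x + v                                           # eq. (1), position update
        fx = np.array([f(xi) for xi in x])
        better = fx < f_best                                # update personal bests
        p[better] = x[better]
        f_best[better] = fx[better]
        if fx.min() < f_best_g:                             # update global best
            f_best_g = fx.min()
            g = x[np.argmin(fx)].copy()
    return g, f_best_g
```

For instance, `pso(lambda x: np.sum(x**2), np.full(6, -5.0), np.full(6, 5.0), N=20, k_max=200)` minimizes a six-dimensional sphere function.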

B. Synchronous and asynchronous designs

One of the key issues is whether a synchronous or asynchronous approach is used to update particle positions and velocities. The sequential synchronous PSO algorithm updates all particle velocities and positions at the end of every optimization iteration. In contrast, the sequential asynchronous PSO algorithm updates particle positions and velocities continuously, based on the currently available information. The difference between the two methods is similar to the difference between the Jacobi (synchronous) and Gauss–Seidel (asynchronous) methods for solving linear systems of equations [13].
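To make the distinction concrete, the inner loop of the sketch above could be rewritten asynchronously as follows (a fragment reusing the names of the previous sketch; the personal and global bests are updated immediately rather than at the end of the iteration):

```python
# Asynchronous (Gauss-Seidel style) inner loop: particle i+1 already sees the
# global best improved by particle i within the same iteration.
for i in range(N):
    r1, r2 = rng.random(n), rng.random(n)
    v[i] = w * v[i] + c1 * r1 * (p[i] - x[i]) + c2 * r2 * (g - x[i])
    v[i] = np.clip(v[i], -v_max, v_max)
    x[i] = x[i] + v[i]
    fx_i = f(x[i])
    if fx_i < f_best[i]:                 # personal best, updated immediately
        f_best[i], p[i] = fx_i, x[i].copy()
    if fx_i < f_best_g:                  # global best, updated immediately
        f_best_g, g = fx_i, x[i].copy()
```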

C. Modifications

We improve the classical PSO algorithm by introducing some modifications. The first of them is to "shake" the particles after a number of unsuccessful iterations: if the best function value has not been improved during a number of iterations, the positions of all particles are updated by the equation

$\mathbf{x}_i^{(k+1)} = \mathbf{x}_i^{(k)} + (r - \tfrac{1}{2}) \cdot (\mathbf{x}^{ub} - \mathbf{x}^{lb}) \cdot c,$

where $r$ is a uniform random number between 0 and 1. This means that every particle is moved to a random neighboring position. The constant $c$ defines how far that position can be, in other words how strong the "shake" is (a larger value of $c$ means a stronger "shake"). This modification helps to avoid convergence of particles to local minima.
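A possible implementation of the "shake" step, assuming that particles which leave the feasible box are handled elsewhere:

```python
import numpy as np

def shake(x, lb, ub, c, rng):
    """Move every particle to a random neighbouring position.

    Each coordinate is shifted by (r - 1/2) * (ub - lb) * c, where r is
    uniform in [0, 1]; a larger c gives a stronger shake.
    """
    r = rng.random(x.shape)
    return x + (r - 0.5) * (ub - lb) * c
```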

Another modification is reduction of the search area. After a fixed number of iterations we reduce the search area around the best solution found so far and restart the search in the new area.

The following equations are used to calculate the new upper and lower bounds $\mathbf{x}^{ub}$ and $\mathbf{x}^{lb}$:

$\mathbf{x}^{ub} = \mathbf{p}^g + \mathbf{\Delta}, \quad \mathbf{x}^{lb} = \mathbf{p}^g - \mathbf{\Delta}, \quad \mathbf{\Delta} = r_c \cdot (\mathbf{x}^{ub} - \mathbf{x}^{lb}),$

where $\mathbf{\Delta} = (\Delta_1, \Delta_2, \ldots, \Delta_n)$ and $0 < r_c < 1$ is the reduction coefficient. A lower reduction coefficient leads to a smaller search area after the reduction and a larger improvement of optimization quality. On the other hand, reducing to a smaller search area increases the probability of losing the real minimum. We experimentally investigate the reduction of the search area using three different reduction coefficients: 0.15, 0.05 and 0.01. The results of the experimental research are presented below. The requirement for the reduction is that the new area must be within the initial area:

$[x_i^{lb}, x_i^{ub}]_{\mathrm{new}} \subset [x_i^{lb}, x_i^{ub}]_{\mathrm{initial}}, \quad i = 1, \ldots, n.$

This requirement can be violated if any of the $\Delta_j$, $j = 1, \ldots, n$, is too large. In that case the respective $\Delta_j$ is reduced so that the new bounds satisfy the requirement.

This modification increases the quality of optimization, but it is sensitive to the quality of the solution used for the reduction of the area: the real global minimum can be lost if the area is reduced around a faulty solution. The more computation is performed before the reduction, the lower the probability of losing the real minimum. In this work the reduction is performed after 50% of the iterations have been run.
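The reduction step could be sketched as follows. The way a too-large $\Delta_j$ is trimmed is our assumption, chosen so that the new box stays symmetric around $\mathbf{p}^g$ and inside the initial area:

```python
import numpy as np

def reduce_search_area(p_g, lb0, ub0, lb, ub, rc):
    """Shrink the search area around the best known solution p_g.

    New bounds are p_g +/- Delta with Delta = rc * (ub - lb); coordinates
    where the new box would leave the initial box [lb0, ub0] get a smaller
    Delta_j so that the requirement above is satisfied.
    """
    delta = rc * (ub - lb)
    delta = np.minimum(delta, np.minimum(p_g - lb0, ub0 - p_g))
    return p_g - delta, p_g + delta
```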

III. PARALLEL PSO

PSO requires a large amount of computational resources (especially if a large population is used) due to the evaluation of the objective function for the whole population at each iteration. On the other hand, population-based algorithms are intrinsically parallel.

The population of particles can be divided into subpopulations and distributed among the processors. The size of each subpopulation can vary depending on the availability and speed of processors. Therefore, when performing computations on a homogeneous cluster, all the subpopulation sizes can be equal. On the other hand, when the cluster of processors is heterogeneous, the sizes should be different according to the performance of the assigned processor.
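One simple way to compute such subpopulation sizes, given relative processor speeds, is shown below; the paper does not prescribe a formula, so this is only an illustration:

```python
def subpopulation_sizes(N, speeds):
    """Split N particles among processors proportionally to their speeds."""
    total = sum(speeds)
    sizes = [int(N * s / total) for s in speeds]
    for i in range(N - sum(sizes)):   # distribute the rounding remainder
        sizes[i % len(sizes)] += 1
    return sizes

# e.g. subpopulation_sizes(100, [1, 1, 2, 2]) -> [17, 17, 33, 33]
```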

The main performance bottleneck in a parallel computational environment is often the communication latency between processors [14]. Therefore, designing an efficient strategy to exchange data among processors is very important in order to obtain an almost ideal speedup of the parallel algorithm [15][16]. The following strategies to exchange data between processing elements have been investigated in this paper.

Synchronous master-slave. One of the processing elements (the master) controls the process of data exchange between all processing elements. The master collects solutions from all processing elements, decides which one is the best and sends the best solution back to the remaining processing elements (the slaves). The algorithm waits until all processing elements finish their own iteration and then starts the data exchange procedure. The master processing element performs computational work as well.
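With mpi4py, the synchronous master-slave exchange of the best known (f, x) pair could look like this; a sketch, since the actual message layout used in the paper is not specified:

```python
from mpi4py import MPI

def exchange_sync_master_slave(comm, my_best, master=0):
    """my_best is an (f, x) pair; returns the best pair over all processes."""
    gathered = comm.gather(my_best, root=master)          # master collects all solutions
    best = min(gathered, key=lambda t: t[0]) if comm.Get_rank() == master else None
    return comm.bcast(best, root=master)                  # ...and sends the best one back
```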

All-to-all. Each processing element sends its best known solution to all others and receives solutions from all others.


Each processing element continues to the next iteration with the best solution that it knows after the data exchange procedure.
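In the same mpi4py setting, the all-to-all exchange reduces to a single collective call:

```python
def exchange_all_to_all(comm, my_best):
    """Every process receives every best solution and keeps the overall best."""
    return min(comm.allgather(my_best), key=lambda t: t[0])
```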

Hierarchic scatter. All processing elements are divided into two equal groups, Group1 and Group2 (if the number of processing elements is odd, the last element sends its solution to the first element and the division is performed without the last element). Each processing element from Group2 sends its best known solution to the respective element from Group1. The same scheme is then applied to Group1: it is divided into two equal groups and the elements from the second group send their data to the elements from the first group. This iterative process is continued until the group size is reduced to one processing element.
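One exchange round of the hierarchic scatter strategy could be sketched as follows; how the winning solution is propagated back to the other processes is not described in the text and is omitted here:

```python
def exchange_hierarchic_scatter(comm, my_best):
    """Halve the group repeatedly; the upper half sends to the lower half."""
    rank, group = comm.Get_rank(), comm.Get_size()
    while group > 1:
        if group % 2 == 1:                       # odd group: last element sends to the first
            if rank == group - 1:
                comm.send(my_best, dest=0, tag=0)
            elif rank == 0:
                other = comm.recv(source=group - 1, tag=0)
                my_best = min(my_best, other, key=lambda t: t[0])
            group -= 1
        half = group // 2
        if half <= rank < group:                 # upper half: send and become passive
            comm.send(my_best, dest=rank - half, tag=1)
        elif rank < half:                        # lower half: receive and keep the better
            other = comm.recv(source=rank + half, tag=1)
            my_best = min(my_best, other, key=lambda t: t[0])
        group = half
    return my_best                               # process 0 ends up with the global best
```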

Asynchronous master-slave. This strategy is similar to the synchronous master-slave strategy, but the master element has no computational work: it just waits until some slave finishes its iteration and requests a data exchange. The master receives a solution from the slave, determines the best known solution and sends it back to the slave. The algorithm does not wait for all processing elements.
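The asynchronous master-slave exchange can be sketched with point-to-point messages. The number of requests the master serves before terminating is our assumption, since the stopping rule is not spelled out in the text:

```python
from mpi4py import MPI

def master_loop(comm, n_requests):
    """Master: serve data-exchange requests one at a time, never waiting for all slaves."""
    best = (float("inf"), None)
    status = MPI.Status()
    for _ in range(n_requests):
        candidate = comm.recv(source=MPI.ANY_SOURCE, tag=0, status=status)
        best = min(best, candidate, key=lambda t: t[0])
        comm.send(best, dest=status.Get_source(), tag=1)   # answer only the requesting slave

def slave_exchange(comm, my_best, master=0):
    """Slave: called after finishing its own iteration."""
    comm.send(my_best, dest=master, tag=0)
    return comm.recv(source=master, tag=1)
```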

IV. EXPERIMENTAL RESULTS

A. Test Problem

We test our algorithm using the Multiple Gravity Assist (MGA) problem, which is one of the Space Mission Design related problems. In mathematical terms this is a global optimization problem with non-linear constraints. It can be used to locate the best possible trajectory that an interplanetary probe equipped with a chemical propulsion engine may take to go from the Earth to another planet or asteroid. The spacecraft is constrained to thrust only at planetary encounters.

The Cassini1 problem is an MGA problem related to the Cassini spacecraft trajectory design problem. The objective of this mission is to reach Saturn and to be captured by its gravity into an orbit having pericenter radius rp = 108950 km, and eccentricity e = 0.98. The planetary fly-by sequence considered is Earth, Venus, Venus, Earth, Jupiter, Saturn (as the one used by the Cassini spacecraft). The number of function variables is six [12].

The best known solution for this problem is x = [-789.806, 158.339, 449.386, 54.72, 1024.656, 4552.753] corresponding to a final objective function of 4.93.

During the tests of optimizers, the best results were achieved using the Differential Evolution algorithm, although it got stuck at 5.303, which is a very strong local optimum of this particular problem [12].

B. Description and results of experimental investigation

PSO with reduction of the search area has been experimentally tested and compared with the original PSO. The Cassini1 problem, different population sizes (20, 40, 60, 80 and 100 particles in the population) and different reduction coefficients (0.15, 0.05 and 0.01) have been used in the experiments. Due to the stochastic nature of the algorithm, 100 independent runs with 20000 iterations have been performed for each test problem, and average values are given.

To estimate the quality of an algorithm, we define as acceptable those solutions whose objective function value is lower than 5.308. After 100 independent runs, the probability of achieving an acceptable solution with a particular algorithm has been calculated. Naturally, a better algorithm must yield a larger probability of achieving an acceptable solution.

Results achieved with the modified PSO have been compared with results achieved with the original PSO using the same number of iterations. The probabilities of achieving an acceptable solution using the original PSO and PSO with the different reduction coefficients (0.15, 0.05 and 0.01) are shown in Figure 1.

The quality of the solution strongly depends on the population size: more particles in the population give a larger probability of achieving an acceptable solution. Reduction of the search area gives a significant advantage: the probability of achieving an acceptable solution when the search area is reduced is larger regardless of the population size. The best result has been achieved using the reduction coefficient 0.01 and 100 particles in the population: an acceptable solution has been achieved in all 100 independent runs. With smaller populations, however, it is better to use the reduction coefficient 0.05, due to the error of the solution achieved after the first 10000 iterations with a small population.

Once the best sequential strategy and the best population size have been selected, the sequential algorithm has been compared with the four parallel strategies. The average sequential execution time using 100 particles in the population has been 141.08 seconds. Using parallel computation it was reduced to 18.02 seconds with the synchronous hierarchic scatter strategy and to 13.6 seconds with the asynchronous master-slave strategy on 16 processing units. A comparison of the algorithm speedup and efficiency using the synchronous data exchange strategies and different numbers of processing elements is shown in Figures 2 and 3.

The best speedup and efficiency among the synchronous strategies have been achieved using the hierarchic scatter strategy. With the synchronous strategies, all processing elements have computational work. The asynchronous master-slave strategy leaves one processing element without computational work, using it only for the distribution of solutions. The speedup and efficiency achieved with the asynchronous master-slave strategy have been compared with the synchronous hierarchic scatter strategy (Figures 4 and 5), which appears to be the best of all synchronous strategies investigated.

When performing computations on a small number of processing elements (2, 4 or 6), the synchronous strategy produces better speedup and efficiency of the parallel algorithm; it is therefore useful to give computational work to all processing elements. On the other hand, if the number of processing elements is larger (8 or more), it is useful to leave one processing element without computational work, dedicating it to communication only (asynchronous master-slave strategy).


Fig. 1. Probability to achieve acceptable solution using different algorithms.

Fig. 2. Speedup of parallel algorithm using different data exchange strategies.

Fig. 3. Efficiency of parallel algorithm using different data exchange strategies.

Fig. 4. Speedup of parallel algorithm using Asynchronous master-slave and Hierarchic scatter strategies to exchange data.

Fig. 5. Efficiency of parallel algorithm using Asynchronous master-slave and Hierarchic scatter strategies to exchange data.

CONCLUSIONS

Sequential and parallel Particle Swarm Optimization algorithms and their modification reducing the search area have been investigated experimentally. Several strategies for communication between processing elements have been investigated as well.

The experimental investigation shows that reduction of the search area after a number of iterations gives a significant advantage for the quality of Particle Swarm Optimization. However, the quality of the solution around which the reduced area is specified is very important: the real global minimum can be lost if the area is reduced around a faulty solution.

Parallel computing can improve the performance of the Particle Swarm Optimization algorithm, but the strategies for communication between processing elements are very important. The best speedup and efficiency among the synchronous data exchange strategies have been achieved using the hierarchic scatter strategy. The asynchronous data exchange strategy has an advantage if the number of processing elements is larger than eight.

FUTURE WORK

In the future we plan to investigate the speedup and efficiency of the parallel Particle Swarm Optimization algorithm on homogeneous and heterogeneous clusters with a higher number of processing elements. We also plan to investigate methods of distributing particles among processing elements on a heterogeneous cluster.

ACKNOWLEDGMENT

The work has been funded by the Research Council of Lithuania (project number MIP-108/2010) and by grants from the Spanish Ministry of Science and Innovation (TIN2008-01117) and Junta de Andalucía (P06-TIC-01426, P08-TIC-3518), in part financed by the European Regional Development Fund (ERDF). The work was partially supported by the COST Action IC0805 "Open European Network for High Performance Computing on Complex Environments".

[Figure data omitted. Fig. 1 plots the probability of achieving an acceptable solution against the number of particles (20–100) for the original PSO and PSO with reduction coefficients 0.15, 0.05 and 0.01. Figs. 2 and 3 plot the speedup and efficiency against the number of processing elements (2–16) for the All-to-all, Synchronous master-slave and Hierarchic scatter strategies; Figs. 4 and 5 do the same for the Asynchronous master-slave and Hierarchic scatter strategies.]


REFERENCES

[1] R. C. Eberhart and J. Kennedy, "A new optimizer using particle swarm theory," in Proceedings of the Sixth International Symposium on Micro Machine and Human Science (MHS'95), pp. 39–43, IEEE Press, October 1995. ISBN: 0-7803-2676-8.

[2] J. Kennedy and R. C. Eberhart, "Particle swarm optimization," in Proceedings of the IEEE International Conference on Neural Networks, vol. 4, pp. 1942–1948, November 27–December 1, 1995, Perth, WA, Australia. ISBN: 0-7803-2768-3.

[3] J. K. Parrish and W. M. Hamner, "Animal Groups in Three Dimensions: How Species Aggregate," Cambridge University Press, December 1997. ISBN: 0-52146-024-8.

[4] B. Koh, A. D. George, R. T. Haftka and B. J. Fregly, "Parallel asynchronous particle swarm optimization," International Journal for Numerical Methods in Engineering, vol. 67, pp. 578–595, 2006.

[5] M. Meissner, M. Schmuker and G. Schneider, "Optimized particle swarm optimization (OPSO) and its application to artificial neural network training," BMC Bioinformatics, 7(125), 2006.

[6] T. K. Rasmussen and T. Krink, "Improved hidden Markov model training for multiple sequence alignment by a particle swarm optimization–evolutionary algorithm hybrid," BioSystems, 72(1–2):5–17, November 2003.

[7] A. M. Baltar and D. G. Fontane, "A generalized multiobjective particle swarm optimization solver for spreadsheet models: application to water quality," in Proceedings of AGU Hydrology Days 2006, pp. 1–12, March 20–22, 2006, Fort Collins, Colorado.

[8] W. Cedeno and D. K. Agrafiotis, "Using particle swarms for the development of QSAR models based on k-nearest neighbor and kernel regression," Journal of Computer-Aided Molecular Design, 17(2–4):255–263, February 2003. ISSN: 0920-654X.

[9] Q. Shen, J. H. Jiang, C. X. Jiao, S. Y. Huan, G. Shen and R. Q. Yu, "Optimized partition of minimum spanning tree for piecewise modeling by particle swarm algorithm," Journal of Chemical Information and Modeling, November 2004.

[10] J. Kennedy and R. C. Eberhart, "A discrete binary version of the particle swarm algorithm," in Proceedings of the 1997 Conference on Systems, Man and Cybernetics, pp. 4104–4109, 1997.

[11] S. Das, A. Abraham and A. Konar, "Particle swarm optimization and differential evolution algorithms: technical analysis, applications and hybridization perspectives," Studies in Computational Intelligence (SCI), vol. 116, pp. 1–38, 2008.

[12] T. Vinko, D. Izzo and C. Bombardelli, "Benchmarking different global optimization techniques for preliminary space trajectory design," 58th International Astronautical Congress, 2007.

[13] B. Koh, A. D. George, R. T. Haftka and B. J. Fregly, "Parallel asynchronous particle swarm optimization," International Journal for Numerical Methods in Engineering, vol. 67, pp. 578–595, 2006.

[14] J. F. Schutte, J. A. Reinbolt, B. J. Fregly, R. T. Haftka and A. D. George, "Parallel global optimization with the particle swarm algorithm," International Journal for Numerical Methods in Engineering, vol. 61, pp. 2296–2315, 2004.

[15] J. L. Redondo, I. Garcia and P. M. Ortigosa, "Parallel evolutionary algorithms based on shared memory programming approaches," Journal of Supercomputing, in press (doi: 10.1007/s11227-009-0374-6), 2010.

[16] J. L. Redondo, J. Fernandez, I. Garcia and P. M. Ortigosa, "Parallel algorithms for continuous competitive location problems," Optimization Methods & Software, vol. 23, no. 5, pp. 779–791, 2008.