

Fractional Particle Swarm Optimization in Multidimensional Search Space

Serkan Kiranyaz, Turker Ince, Alper Yildirim, and Moncef Gabbouj, Senior Member, IEEE

Abstract—In this paper, we propose two novel techniques, which successfully address several major problems in the field of particle swarm optimization (PSO) and promise a significant breakthrough over complex multimodal optimization problems at high dimensions. The first one, the so-called multidimensional (MD) PSO, re-forms the native structure of swarm particles in such a way that they can make interdimensional passes with a dedicated dimensional PSO process. Therefore, in an MD search space, where the optimum dimension is unknown, swarm particles can seek both positional and dimensional optima. This eventually removes the necessity of setting a fixed dimension a priori, which is a common drawback for the family of swarm optimizers. Nevertheless, MD PSO is still susceptible to premature convergence due to lack of divergence. Among the many PSO variants in the literature, none yields a robust solution, particularly over multimodal complex problems at high dimensions. To address this problem, we propose the fractional global best formation (FGBF) technique, which basically collects all the best dimensional components and fractionally creates an artificial global best (aGB) particle that has the potential to be a better "guide" than PSO's native gbest particle. This way, the potential diversity that is present among the dimensions of swarm particles can be efficiently used within the aGB particle. We investigated both individual and mutual applications of the proposed techniques over the following two well-known domains: 1) nonlinear function minimization and 2) data clustering. An extensive set of experiments shows that in both application domains, MD PSO with FGBF exhibits an impressive speed gain and converges to the global optima at the true dimension regardless of the search space dimension, the swarm size, and the complexity of the problem.

Index Terms—Fractional global best formation (FGBF), multidimensional (MD) search, particle swarm optimization (PSO).

I. INTRODUCTION

THE BEHAVIOR of a single organism in a swarm is often insignificant, but the collective and social behavior of the swarm is of paramount importance. Particle swarm optimization (PSO) was introduced by Kennedy and Eberhart [27] in 1995 as a population-based stochastic search and optimization process.

Manuscript received May 24, 2008; revised August 28, 2008 and November 24, 2008. First published August 4, 2009; current version published March 17, 2010. This paper was recommended by Associate Editor Q. Zhao. This work was supported by the Academy of Finland under Project 213462 [Finnish Centre of Excellence Program (2006-2011)].

S. Kiranyaz and M. Gabbouj are with the Department of Signal Processing, Tampere University of Technology, 33101 Tampere, Finland (e-mail: [email protected]).

T. Ince is with the Department of Computer Engineering, Izmir University of Economics, Izmir 35330, Turkey (e-mail: [email protected]).

A. Yildirim is with the Tübitak UEKAE/Iltaren, Ankara 06800, Turkey (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TSMCB.2009.2015054

It originated from the computer simulation of the individuals (particles or living organisms) in a bird flock or fish school [53], which basically show a natural behavior when they search for some target (e.g., food). The goal is, therefore, to converge to the global optima of some multidimensional (MD) and possibly nonlinear function or system. Henceforth, PSO follows the same path as other evolutionary algorithms (EAs) [4], such as the genetic algorithm (GA) [18], genetic programming [28], evolution strategies [5], and evolutionary programming [16]. The common point of all is that EAs are population based, and they can avoid being trapped in a local optimum. Thus, they can find the optimum solutions; however, this is never guaranteed.

In a PSO process, a swarm of particles (or agents), each of which represents a potential solution to an optimization problem, navigates through the search space. The particles are initially distributed randomly over the search space with a random velocity, and the goal is to converge to the global optimum of a function or a system. Each particle keeps track of its position in the search space and the best solution that it has so far achieved. This is the personal best value (the so-called pbest in [27]), and the PSO process also keeps track of the global best (GB) solution so far achieved by the swarm, with its particle index (the so-called gbest in [27]). Therefore, during their journey with discrete time iterations, the velocity of each agent in the next iteration is computed from the best position of the swarm (the personal best position of the particle gbest, as the social component), the best personal position of the particle (pbest, as the cognitive component), and its current velocity (the memory term). Both social and cognitive components contribute randomly to the position of the agent in the next iteration.

As a stochastic search algorithm in an MD search space, PSO exhibits some major problems similar to the aforementioned EAs. The first one is due to the fact that any stochastic optimization technique depends on the parameters of the optimization problem to which it is applied, and variation of these parameters significantly affects the performance of the algorithm. This problem is a crucial one for PSO, where parameter variations may result in large performance shifts [33]. The second one is due to the direct link of the information flow between particles and gbest, which then "guides" the rest of the swarm, thus resulting in the creation of similar particles with some loss of diversity. Hence, this phenomenon increases the probability of being trapped in local optima [44], and it is the main cause of the premature convergence problem, particularly when the search space is of high dimensions [50] and the problem to be optimized is multimodal [44]. Another reason for the premature convergence is that particles are flown through a single point that is (randomly) determined by the gbest and pbest positions, and this point is not even guaranteed to be a local optimum [51].


Various modifications and PSO variants have been proposed to address this problem, such as those in [1], [7]–[10], [13], [23], [25], [26], [29], [31]–[34], [39]–[42], [44], [46], [47], [51], [52], [54]–[56], and [58].

Such methods usually try to improve the diversity among the particles and the search mechanism, either by changing the update equations toward a more diversified version, by adding more randomization to the system (to particle velocities, positions, etc.), or by simply resetting some or all of them randomly when certain conditions are met. On the one hand, most of them require additional parameters to accomplish this, thus making the PSO variants even more parameter dependent. On the other hand, the main problem is, in fact, the inability to use the available diversity in one or more (dimensional) components of each particle (i.e., certain dimensions of a particle position in the search space), because all components continuously and abruptly change as the PSO process updates the particle's position. Note that one or more components of any particle might already have drifted well into a close vicinity of the global optimum (for that dimension). This potential is then wasted by the (velocity) update in the next iteration, which changes all the dimensions at once. Therefore, in the proposed method, we collect all such promising (or simply the best) components from each particle and fractionally create an artificial GB candidate, namely, aGB, which becomes the swarm's GB particle if it is better than the previous GB and the current gbest. Note that whenever a better (real) gbest particle or aGB particle emerges, it replaces the current GB particle. Without using any of the aforementioned modifications, we shall, therefore, show that the proposed fractional PSO can avoid the local optima and thus yield the optimum (or a near-optimum) solution even in high-dimensional search spaces, usually at earlier stages.

Another major drawback of the aforementioned PSO variants, including the basic method, is that they can only be applied to a search space with a fixed dimension. However, in many optimization problems, the optimum dimension is also unknown (e.g., data clustering, spatial segmentation, and optimization of dimensional functions) and should thus be determined within the PSO process. So far, only a few studies have been presented in this area, i.e., [1] and [36]. In [36], Omran et al. presented dynamic clustering PSO (DCPSO), which is, in fact, a hybrid clustering algorithm, where binary PSO is used (only) to determine the number of clusters (and, hence, the dimension of the solution space), along with the selection of initial cluster centers, while the traditional K-means method [48] performs the clustering operation in that dimension (over the initial cluster centers). In [1], Abraham et al. presented the multielitist PSO (MEPSO), another variant of the basic PSO (bPSO) algorithm, to address the premature convergence problem. In addition to being nongeneric PSO variants that are applicable only to clustering problems, neither [1] nor [36] clarifies whether it can cope with higher dimensions of the solution space, since the maximum numbers of clusters used in their experiments are only 6 and 10, respectively. This is also true for most of the static (fixed-dimensional) PSO variants, due to the aforementioned fact that the probability of getting trapped in a local optimum significantly increases in higher dimensions [50].

To address these problems, we propose an MD PSO technique, which can work along with the fractional GB formation (FGBF) scheme to avoid the premature convergence problem. The proposed methods are generic, since both the FGBF scheme and the MD search process can dynamically be integrated into PSO's native algorithm. Yet, they are not linked to each other, i.e., one can be performed without the other, but we shall show that the best performance is achieved by their mutual operation when the MD search is required by the problem. Furthermore, no additional parameter is needed to perform the proposed techniques, and MD PSO removes the need to fix the dimension of the solution space in advance.

The rest of this paper is organized as follows. Section II surveys related work on PSO. The proposed techniques, namely, MD PSO and FGBF, are presented in detail in Section III. Section IV is dedicated to applications over the two problem domains, namely, 1) nonlinear function minimization and 2) data clustering, whereas Section V presents the experiments conducted and discusses the results. Finally, Section VI concludes the paper.

II. RELATED WORK

A. bPSO Algorithm

In the bPSO method, a swarm of particles flies through an N-dimensional search space, where the position of each particle represents a potential solution to the optimization problem. Each particle a in the swarm $\xi = \{x_1, \ldots, x_a, \ldots, x_S\}$ is represented by the following characteristics:

$x_{a,j}(t)$: jth dimensional component of the position of particle a at time t;

$v_{a,j}(t)$: jth dimensional component of the velocity of particle a at time t;

$y_{a,j}(t)$: jth dimensional component of the personal best (pbest) position of particle a at time t;

$\hat{y}_j(t)$: jth dimensional component of the GB position of the swarm at time t.

Let f denote the fitness function to be optimized. Without loss of generality, assume that the objective is to find the minimum of f in the N-dimensional space. Then, the personal best of particle a can be updated in iteration t + 1 as

$$y_{a,j}(t+1) = \begin{cases} y_{a,j}(t), & \text{if } f\left(x_a(t+1)\right) > f\left(y_a(t)\right) \\ x_{a,j}(t+1), & \text{else} \end{cases} \qquad \forall j \in [1, N]. \tag{1}$$

Since gbest is the index of the GB particle, $\hat{y}(t) = y_{gbest}(t) = \arg\min_{\forall i \in [1,S]} \left( f\left(y_i(t)\right) \right)$. Then, for each iteration in a PSO process, positional updates are performed for each particle $a \in [1, S]$ and along each dimensional component $j \in [1, N]$ as follows:

$$\begin{aligned} v_{a,j}(t+1) &= w(t)\,v_{a,j}(t) + c_1 r_{1,j}(t)\left(y_{a,j}(t) - x_{a,j}(t)\right) + c_2 r_{2,j}(t)\left(\hat{y}_j(t) - x_{a,j}(t)\right) \\ x_{a,j}(t+1) &= x_{a,j}(t) + v_{a,j}(t+1) \end{aligned} \tag{2}$$


TABLE I
PSEUDOCODE FOR THE bPSO ALGORITHM

where w is the inertia weight [46], and $c_1$ and $c_2$ are the acceleration constants, which are usually set to 1.49 or 2. $r_{1,j} \sim U(0,1)$ and $r_{2,j} \sim U(0,1)$ are random variables with a uniform distribution. Recall from the earlier discussion that the first term in the summation is the memory term, which represents the contribution of the previous velocity; the second term is the cognitive component, which represents the particle's own experience; and the third term is the social component, through which the particle is "guided" by the gbest particle toward the GB solution so far obtained. Although the inertia weight w was only later added to the velocity update equation by Shi and Eberhart [46], this form is now widely accepted as the basic PSO algorithm. A larger value of w favors exploration, while a small inertia weight favors exploitation. As originally introduced, w is often linearly decreased from a high value (e.g., 0.9) to a low value (e.g., 0.4) during the iterations of a PSO run, which updates the positions of the particles using (2). Depending on the problem to be optimized, PSO iterations can be repeated until a specified number of iterations, e.g., IterNo, is exceeded, velocity updates become zero, or the desired fitness score is achieved (i.e., $f < \varepsilon_C$, where f is the fitness function, and $\varepsilon_C$ is the cutoff error). Accordingly, the general pseudocode of bPSO is presented in Table I.
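For concreteness, the following is a minimal sketch of the bPSO loop described above, combining the pbest update (1), the velocity and position updates (2), the linearly decreasing inertia weight, velocity clamping, and the cutoff-error termination. All concrete values (swarm size, bounds, seed) are illustrative choices, not taken from the paper.

```python
import numpy as np

def bpso(f, N, S=40, iter_no=1000, x_max=100.0, eps_c=1e-4,
         c1=1.49, c2=1.49, w_start=0.9, w_end=0.4):
    rng = np.random.default_rng(0)
    v_max = x_max / 2.0                                # velocity clamp range
    x = rng.uniform(-x_max, x_max, (S, N))             # positions
    v = rng.uniform(-v_max, v_max, (S, N))             # velocities
    y = x.copy()                                       # personal bests (pbest)
    fy = np.apply_along_axis(f, 1, y)
    gbest = int(np.argmin(fy))                         # index of the GB particle
    for t in range(iter_no):
        w = w_start - (w_start - w_end) * t / iter_no  # linearly decreasing inertia
        r1, r2 = rng.random((S, N)), rng.random((S, N))
        # (2): memory term + cognitive component + social component
        v = w * v + c1 * r1 * (y - x) + c2 * r2 * (y[gbest] - x)
        v = np.clip(v, -v_max, v_max)                  # velocity clamping
        x = x + v
        fx = np.apply_along_axis(f, 1, x)
        better = fx < fy                               # pbest update (1)
        y[better], fy[better] = x[better], fx[better]
        gbest = int(np.argmin(fy))
        if fy[gbest] < eps_c:                          # cutoff-error termination
            break
    return y[gbest], fy[gbest]

# Example usage: minimize the 20-D Sphere function.
best_pos, best_fit = bpso(lambda p: float(np.sum(p * p)), N=20)
```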

Velocity clamping, also called "dampening," with the user-defined maximum range $V_{max}$ (and $-V_{max}$ for the minimum), as in step 3.4.1.2, is one of the earliest attempts to control or prevent oscillations [13]. Some important PSO variants and improvements will be covered in Section II-B.

B. PSO Variants and Improvements

The first set of improvements has been proposed for enhancing the problem-dependent performance of PSO, due to the strong parameter dependency. There are mainly two types of approaches. The first one is through self-adaptation, which has been applied to PSO by Clerc [10], Yasuda et al. [57], Zhang et al. [60], and Shi and Eberhart [47]. The other approach is via hybrid techniques, which are employed along with PSO by Angeline [2], Reynolds et al. [43], Higashi and Iba [23], Esquivel and Coello Coello [15], and many others. The rest of the PSO variants presented in this section contain some improvements that try to avoid the premature convergence problem by introducing diversity to the swarm particles.

Note that, according to the velocity update equation in (2), the velocity of the gbest particle will only depend on the memory term, since $x_{gbest} = y_{gbest} = \hat{y}$. To address this problem, Van den Bergh introduced a new PSO variant—the PSO with guaranteed convergence (GCPSO) [50]. In GCPSO, a different velocity update equation is used for the gbest particle, based on two threshold values that can be adaptively set during the process. It is claimed that GCPSO usually performs better than bPSO when applied to unimodal functions and comparably for multimodal problems; however, due to its fast rate of convergence, GCPSO may be more likely to get trapped in a local minimum, to which it then converges with guarantee, whereas bPSO may not. Based on GCPSO, Van den Bergh proposed the multistart PSO (MPSO) [50], which repeatedly runs GCPSO over randomized particles and stores the (local) optimum at each iteration. Yet, similar to bPSO and many of its variants, the performance still degrades significantly as the dimension of the search space increases [50].

In [52], a cooperative approach to PSO (CPSO) has been proposed. This is another variation of bPSO, which employs cooperative behavior to improve the performance. In this approach, multiple swarms are used to optimize different components of the solution vector in a cooperative way. In 80% of all test cases run in a 30-dimensional space, CPSO performed better than bPSO. Comparable results are obtained with a recent PSO variant proposed in [31]—the comprehensive learning PSO (CLPSO). CLPSO basically follows a comprehensive learning strategy, where the historical best information of all swarm particles is used to update a particle's velocity. The authors concluded that CLPSO is not the best choice for solving unimodal functions; however, it can generate better quality solutions more frequently when compared with eight other PSO variants. A similar approach has also been presented by Mendes et al. [34], who proposed the fully informed particle swarm. In their work, using the particles' previous best values to update the velocity of the particle is the main approach, and several neighborhood topologies, such as pyramid, square, ring, circle, etc., were examined. There are many other PSO variants, which can be found in [14] and [37]. Yet, most of them present either little or moderate performance improvements at the expense of additional parameters and/or computational complexity. More importantly, they still suffer from the high dimensionality and multimodality of the problem, where it becomes easier to get trapped in local optima, particularly during the earlier stages of a PSO process.


To provide an unbiased performance measure, we shall not use any such improvements with the proposed techniques, namely, MD PSO and FGBF, as detailed in Section III.

III. MD PSO AND FGBF TECHNIQUES

In this section, we introduce two novel techniques for PSO. The first is an MD extension—the so-called MD PSO—which presents a substantial improvement over PSO via interdimensional navigation. However, it usually suffers in high dimensions from premature convergence to a local optimum, similar to other PSO variants. To remedy this shortcoming, we then propose a second technique, called FGBF, and present their mutual application over the following two typical problems: 1) nonlinear function minimization and 2) data clustering.

A. MD PSO Algorithm

Instead of operating at a fixed dimension N, the MD PSO algorithm is designed to seek both positional and dimensional optima within a dimension range ($D_{min} \le N \le D_{max}$). To accomplish this, each particle has two sets of components, which are subjected to two independent and consecutive processes. The first one is a regular positional PSO, i.e., the traditional velocity updates and the following positional moves in an N-dimensional search (solution) space. The second one is a dimensional PSO, which allows the particle to navigate through dimensions. Accordingly, each particle keeps track of its last position, velocity, and personal best position (pbest) in a particular dimension so that when it revisits the same dimension at a later time, it can perform its regular "positional" fly using this information. The dimensional PSO process of each particle may then move the particle to another dimension, where it will remember its positional status and keep "flying" within the positional PSO process in this dimension, and so on. The swarm, on the other hand, keeps track of the gbest particles in all dimensions, each of which respectively indicates the best (global) position so far achieved in that dimension and can thus be used in the regular velocity update equation for that dimension. Similarly, the dimensional PSO process of each particle uses its personal best dimension, in which the personal best fitness score has so far been achieved. Finally, the swarm keeps track of the GB dimension dbest among all the personal best dimensions. The gbest particle in the dbest dimension represents the optimum solution (and the optimum dimension).

In an MD PSO process, at time (iteration) t, each particle a in the swarm $\xi = \{x_1, \ldots, x_a, \ldots, x_S\}$ is represented by the following characteristics:

$xx_{a,j}^{xd_a(t)}(t)$: jth component (dimension) of the position of particle a in dimension $xd_a(t)$;

$vx_{a,j}^{xd_a(t)}(t)$: jth component (dimension) of the velocity of particle a in dimension $xd_a(t)$;

$xy_{a,j}^{xd_a(t)}(t)$: jth component (dimension) of the personal best (pbest) position of particle a in dimension $xd_a(t)$;

$gbest(d)$: GB particle index in dimension d;

$x\hat{y}_j^d(t)$: jth component (dimension) of the GB position of the swarm in dimension d;

$xd_a(t)$: dimension component of particle a;

$vd_a(t)$: velocity component of the dimension of particle a;

$x\tilde{d}_a(t)$: personal best dimension component of particle a.

Fig. 1 shows sample MD PSO and bPSO particles with index a. The bPSO particle, which is at a (fixed) dimension N = 5, contains only positional components, whereas the MD PSO particle contains both positional and dimensional components. In the figure, the dimension range for MD PSO is given between 2 and 9; therefore, the particle contains eight sets of positional components (one for each dimension). In this example, the current dimension where the particle a resides is 2 ($xd_a(t) = 2$), whereas its personal best dimension is 3 ($x\tilde{d}_a(t) = 3$). Therefore, at time t, a positional PSO update is first performed over the positional elements $xx_a^2(t)$, and then, the particle may move to another dimension by the dimensional PSO.

Let f denote the dimensional fitness function to be optimized within a certain dimension range ($D_{min} \le N \le D_{max}$). Without loss of generality, assume that the objective is to find the minimum (position) of f at the optimum dimension within an MD search space. Assume that particle a visits (back) the same dimension after T iterations (i.e., $xd_a(t) = xd_a(t+T)$). Then, the personal best position can be updated in iteration t + T as in (3), shown below. Furthermore, the personal best dimension of particle a can be updated in iteration t + 1 as in (4), shown below.

Recall that $gbest(d)$ is the index of the GB particle in dimension d. Then, $x\hat{y}^{dbest}(t) = xy_{gbest(dbest)}^{dbest}(t) = \arg\min_{\forall i \in [1,S]} \left( f\left(xy_i^{dbest}(t)\right) \right)$.

$$xy_{a,j}^{xd_a(t+T)}(t+T) = \begin{cases} xy_{a,j}^{xd_a(t)}(t), & \text{if } f\left(xx_a^{xd_a(t+T)}(t+T)\right) > f\left(xy_a^{xd_a(t)}(t)\right) \\ xx_{a,j}^{xd_a(t+T)}(t+T), & \text{else} \end{cases} \qquad \forall j \in [1, xd_a(t)] \tag{3}$$

$$x\tilde{d}_a(t+1) = \begin{cases} x\tilde{d}_a(t), & \text{if } f\left(xx_a^{xd_a(t+1)}(t+1)\right) > f\left(xy_a^{x\tilde{d}_a(t)}(t)\right) \\ xd_a(t+1), & \text{else} \end{cases} \tag{4}$$


Fig. 1. Sample (a) MD PSO versus (b) bPSO particle structures. For MD PSO, $[D_{min} = 2, D_{max} = 9]$ and, at the current time t, $xd_a(t) = 2$ and $x\tilde{d}_a(t) = 3$. For bPSO, N = 5.

For a particular iteration t and for a particle $a \in [1, S]$, the positional components are first updated in the current dimension $xd_a(t)$, and then, the dimensional update is performed to determine the next [(t+1)th] dimension, $xd_a(t+1)$. The positional update is performed for each dimension component $j \in [1, xd_a(t)]$ as follows:

$$\begin{aligned} vx_{a,j}^{xd_a(t)}(t+1) &= w(t)\,vx_{a,j}^{xd_a(t)}(t) + c_1 r_{1,j}(t)\left(xy_{a,j}^{xd_a(t)}(t) - xx_{a,j}^{xd_a(t)}(t)\right) + c_2 r_{2,j}(t)\left(x\hat{y}_j^{xd_a(t)}(t) - xx_{a,j}^{xd_a(t)}(t)\right) \\ xx_{a,j}^{xd_a(t)}(t+1) &= xx_{a,j}^{xd_a(t)}(t) + vx_{a,j}^{xd_a(t)}(t+1). \end{aligned} \tag{5}$$

Note that the particle's new position $xx_a^{xd_a(t)}(t+1)$ will still be in the same dimension $xd_a(t)$; however, the particle may fly to another dimension afterward with the following dimensional update equations:

$$\begin{aligned} vd_a(t+1) &= \left\lfloor vd_a(t) + c_1 r_1(t)\left(x\tilde{d}_a(t) - xd_a(t)\right) + c_2 r_2(t)\left(dbest - xd_a(t)\right) \right\rfloor \\ xd_a(t+1) &= xd_a(t) + vd_a(t+1) \end{aligned} \tag{6}$$

where $\lfloor \cdot \rfloor$ is the floor operator. Unlike in (2), an inertia weight is not used in the dimensional velocity update, since no benefit was obtained experimentally for the dimensional PSO. To avoid explosion, along with the positional velocity limit $V_{max}$, two more clamping operations are applied to the dimensional PSO components, namely, $|vd_a(t+1)| < VD_{max}$ and the initial dimension range set by the user, i.e., $D_{min} \le xd_a(t) \le D_{max}$. Accordingly, the general pseudocode of the MD PSO technique is given in Table II.
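The dimensional update (6), together with the clamping rules just described, can be sketched as follows; the function and variable names are illustrative, assuming scalar per-particle dimensional states.

```python
import math
import random

def dimensional_update(xd, vd, pd, dbest, d_min, d_max, vd_max, c1=1.49, c2=1.49):
    """xd: current dimension xd_a(t); vd: dimensional velocity vd_a(t);
    pd: personal best dimension; dbest: the swarm's best dimension."""
    r1, r2 = random.random(), random.random()
    # (6): no inertia weight in the dimensional velocity update
    vd_new = math.floor(vd + c1 * r1 * (pd - xd) + c2 * r2 * (dbest - xd))
    vd_new = max(-vd_max, min(vd_max, vd_new))    # |vd_a(t+1)| < VD_max
    xd_new = max(d_min, min(d_max, xd + vd_new))  # D_min <= xd_a(t) <= D_max
    return xd_new, vd_new
```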

Once the MD PSO process terminates, the optimum solution will be $x\hat{y}^{dbest}$ at the optimum dimension dbest, which is achieved by the particle $gbest(dbest)$, and finally, the best (fitness) score achieved will naturally be $f(x\hat{y}^{dbest})$.

B. FGBF Algorithm

FGBF is designed to avoid premature convergence by providing significant diversity, obtained from a proper fusion of the swarm's best components (the individual dimension(s) of the current position of each particle in the swarm). At each iteration in a bPSO process, an artificial GB particle (aGB) is (fractionally) formed by selecting the most promising (or simply the best) particle (dimensional) components from the entire swarm. Therefore, particularly during the initial steps, FGBF can most of the time provide a better alternative than the native gbest particle, since it has the advantage of assessing each dimension of every particle in the swarm individually and forming the aGB particle fractionally from the most promising (or simply the best) components among them. This process naturally uses the available diversity among individual dimensional components, and thus, it can prevent the swarm from getting trapped in local optima. Suppose that, for a swarm ξ, FGBF is performed in a PSO process at a (fixed) dimension N. Recall from the earlier discussion that, in a particular iteration t, each PSO particle a has the following components: 1) position ($x_{a,j}(t)$); 2) velocity ($v_{a,j}(t)$); and 3) personal best position ($y_{a,j}(t)$), $j \in [1, N]$. First, the aGB particle does not use a velocity term, since, instead of velocity updates, the aGB particle is fractionally (re-)created from the dimensions of some swarm particles. Consequently, $y_{aGB}(t)$ is set to the better of $x_{aGB}(t)$ and $y_{aGB}(t-1)$. As a result, the FGBF process creates one aGB particle, providing a (potential) GB solution ($y_{aGB}(t)$). Let f(a, j) be the dimensional fitness score of the jth component of particle a. Suppose that all dimensional fitness scores ($f(a, j), \forall a \in [1, S]$) are computed in step 3.1; the FGBF pseudocode, as given in Table III, can then be plugged in between steps 3.3 and 3.4 of bPSO's pseudocode.
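The core of FGBF at a fixed dimension N can be sketched as follows, assuming that a per-component fitness array (the f(a, j) scores of step 3.1) is already available; the function and variable names are illustrative.

```python
import numpy as np

def fgbf_step(x, f, f_dim, y_agb, fy_agb):
    """x: (S, N) particle positions; f: full fitness function;
    f_dim: (S, N) dimensional fitness scores f(a, j);
    y_agb, fy_agb: previous aGB position and its fitness."""
    S, N = x.shape
    a = np.argmin(f_dim, axis=0)      # a[j]: particle owning the best jth component
    x_agb = x[a, np.arange(N)]        # fuse the best components fractionally
    fx_agb = float(f(x_agb))
    if fx_agb < fy_agb:               # keep the better of new and previous aGB
        return x_agb, fx_agb
    return y_agb, fy_agb
```

The resulting aGB then competes with the native gbest: whichever is better serves as the swarm's GB particle in the next velocity update.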

Step 2, along with the computation of f(a, j), depends entirely on the optimization problem. It keeps track of partial fitness contributions from each individual dimension of each particle's position (the potential solution). For those problems without any constraints (e.g., nonlinear function minimization), the best dimensional components can simply be selected, whereas in others (e.g., clustering), some promising components that satisfy the constraints are first selected and grouped, and the most suitable one in each group is then used for FGBF. Here, the internal nature of the problem determines the "suitability" of the selection. Take, for instance, the function minimization problem illustrated in Fig. 2, where a 2-D space is used for illustration purposes.


TABLE II
PSEUDOCODE FOR THE MD PSO ALGORITHM

In the figure, three particles in a swarm are ranked as the first (the gbest), the third, and the eighth with respect to their proximity to the target position (the global solution) of some function. Although the gbest particle (i.e., the first-ranked particle) is the closest in the overall sense, the particles ranked third and eighth provide the best x and y dimensions (closest to the target's respective dimensions) in the entire swarm; hence, the aGB particle formed via FGBF yields a better (closer) particle than the swarm's gbest.

C. FGBF Algorithm Over MD PSO

Section III-B introduced the principles of FGBF when applied in a bPSO process at a single (fixed) dimension. In this section, we present its generalized form with the proposed MD PSO, where there is one gbest particle per (potential) dimension of the solution space.

For this purpose, recall from the earlier discussion that, in a particular iteration t, each MD PSO particle a has the following components: 1) position ($xx_{a,j}^{xd_a(t)}(t)$); 2) velocity ($vx_{a,j}^{xd_a(t)}(t)$); and 3) personal best position ($xy_{a,j}^{xd_a(t)}(t)$) for each potential dimension in the solution space (i.e., $xd_a(t) \in [D_{min}, D_{max}]$ and $j \in [1, xd_a(t)]$), and their respective counterparts in the dimensional PSO process (i.e., $xd_a(t)$, $vd_a(t)$, and $x\tilde{d}_a(t)$). The aGB particle does not need dimensional components; a single positional component with the maximum dimension $D_{max}$ is created to cover all dimensions in the range $\forall d \in [D_{min}, D_{max}]$, and, as explained earlier, there is no need for the velocity term either, since the aGB particle is fractionally (re-)created from the dimensions of some swarm particles.


TABLE III
PSEUDOCODE FOR FGBF IN bPSO

Fig. 2. Sample FGBF in 2-D space.

Furthermore, the aforementioned competitive selection ensures that $xy_{aGB}^d(t)$, $\forall d \in [D_{min}, D_{max}]$, is set to the better of $xx_{aGB}^d(t)$ and $xy_{aGB}^d(t-1)$. As a result, the FGBF process creates one aGB particle, providing (potential) GB solutions ($xy_{aGB}^d(t)$) for all dimensions in the given range (i.e., $\forall d \in [D_{min}, D_{max}]$). Let f(a, j) be the dimensional fitness score of the jth component of particle a, which has the current dimension $xd_a(t)$, $j \in [1, xd_a(t)]$. At a particular time t, all dimensional fitness scores ($f(a, j), \forall a \in [1, S]$) can be computed in step 3.1, and the FGBF pseudocode for MD PSO, as given in Table IV, can then be plugged in between steps 3.2 and 3.3 of the MD PSO pseudocode. We present the applications of both techniques on two well-known problem domains next.

IV. APPLICATIONS

Two problem domains are considered in this paper, to which the proposed PSO algorithms are applied. The first one is nonlinear function minimization, where several benchmark functions are used. This allows us to test the performance over MD search spaces and against both unimodality and multimodality.

TABLE IV
PSEUDOCODE FOR FGBF IN MD PSO

The second domain is data clustering, which introduces certain constraints in the MD solution space and allows performance evaluation in the presence of significant variation in the data distribution, using an impure validity index. Both problem domains can efficiently validate MD PSO's convergence to the global solution in the right dimension, with or without FGBF. This way, we can truly evaluate the contribution and significance of FGBF, particularly over multimodal optimization problems in high dimensions.

A. Nonlinear Function Minimization

We selected seven benchmark functions and biased them with a dimensional term to test the performance of MD PSO. The functions given in Table V provide a good mixture of complexity and modality and have been widely studied by several researchers (see, e.g., [3], [15], [23], [33], [45], and [46]). The dimensional bias term $\Psi(d)$ has the form $\Psi(d) = K|d - d_0|^{\alpha}$, where the constants K and α are properly set with respect to the dynamic range of the function to be minimized. Note that the variable $D_{min} \le d_0 \le D_{max}$ is the target dimension, in which the global minimum can truly be reached, and all functions thus have the global minimum $F_n(x, d_0) = 0$ when $d = d_0$. Sphere, De Jong, and Rosenbrock are the unimodal functions, and the rest are multimodal, meaning that they have many deceiving local minima. On the macroscopic level, Griewank demonstrates certain similarities with unimodal functions, particularly when the dimensionality is above 20; however, in low dimensions, it bears significant noise, which creates many local minima due to the second multiplication term with cosine components. Yet, with the addition of the dimensional bias term Ψ(d), even unimodal functions eventually become multimodal, since they now have a local minimum at every dimension (which is their global minimum at that dimension without Ψ(d)) but only one global minimum, at dimension $d_0$.


TABLE V
BENCHMARK FUNCTIONS WITH DIMENSIONAL BIAS


Recall from the earlier remarks that an MD PSO particle a represents a potential solution at a certain dimension, and therefore, the jth component of a d-dimensional point ($x_j, j \in [1, d]$) is stored in its positional component $xx_{a,j}^d(t)$ at time t. Step 3.1 in the MD PSO pseudocode computes the (dimensional) fitness score f(a, j) of the jth component ($x_j$), and in step 2 of the FGBF process, the index of the particle whose $x_j$ yields the minimum f(a, j) is then stored in the array a[j]. Except for the nonseparable functions Rosenbrock and Griewank, the assignment of f(a, j) for particle a is straightforward (e.g., $f(a, j) = x_j^2$ for Sphere and $f(a, j) = x_j \sin(\sqrt{|x_j|})$ for Schwefel, simply using the term with the jth component of the summation). For Rosenbrock, we can set $f(a, j) = (x_{j+1} - x_j^2)^2 + (x_j - 1)^2$, since the aGB particle, which is fractionally formed by those $x_j$ minimizing the jth summation term, eventually minimizes the function. Finally, for Griewank, one can approximate $f(a, j) \approx x_j^2$ for particle a, and the FGBF operation then finds and uses such $x_j$ that can come to a close vicinity of the global minimum at dimension j on a macroscopic scale, so that the native PSO process then has a higher chance of avoiding those noise-like local optima and, thus, eventually converging to the global optimum.
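These per-component assignments can be written directly; the sketch below assumes a position vector x of length d ≥ 2 and mirrors the terms given above. Padding the last Rosenbrock component by repetition is an illustrative choice of ours, since the summation term is only defined for j < d.

```python
import numpy as np

def f_dim_sphere(x):
    return x ** 2                              # jth summation term of Sphere

def f_dim_schwefel(x):
    return x * np.sin(np.sqrt(np.abs(x)))      # jth summation term of Schwefel

def f_dim_rosenbrock(x):
    # (x_{j+1} - x_j^2)^2 + (x_j - 1)^2, defined for j = 1, ..., d - 1;
    # the last component reuses the preceding term (illustrative padding).
    out = np.empty(len(x))
    out[:-1] = (x[1:] - x[:-1] ** 2) ** 2 + (x[:-1] - 1) ** 2
    out[-1] = out[-2]
    return out

def f_dim_griewank(x):
    return x ** 2                              # macroscopic approximation
```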

B. Data Clustering

1) Problem Definition: As the process of identifying natural groupings in MD data based on some distance metric (e.g., Euclidean), data clustering can be divided into the following two main categories: 1) hierarchical and 2) partitional [17]. Each category then has a wealth of subcategories and different algorithmic approaches for finding the clusters. Clustering can also be performed in the following two different modes: 1) hard (or crisp) and 2) fuzzy. In the former mode, the clusters are disjoint and nonoverlapping, and any data point belongs to a single cluster, whereas in the latter case, it can belong to all the clusters with some degree of membership [24]. K-means [48] is a well-known and widely used clustering method, which first assigns each data point to one of the K cluster centroids and then updates them to the mean of their associated points. Starting from a random set of K centroids, this cycle is iteratively performed until the convergence criterion $\Delta_{Kmeans} < \varepsilon$ is reached, where the objective function $\Delta_{Kmeans}$ can be expressed as

$$\Delta_{Kmeans} = \sum_{k=1}^{K} \sum_{x_p \in c_k} \| c_k - x_p \|^2 \tag{7}$$

where $c_k$ is the kth cluster center, $x_p$ is the pth data point in cluster $c_k$, and $\|\cdot\|$ is the distance metric in Euclidean space. As a hard clustering method, K-means suffers from the following drawbacks.

1) The number of clusters K needs to be set in advance.
2) The performance of the method depends on the initial (random) centroid positions, as the method converges to the closest local optimum.
3) The method is also dependent on the data distribution.
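A minimal sketch of the K-means cycle and objective (7) described above follows; the initialization, convergence threshold, and empty-cluster handling are illustrative choices.

```python
import numpy as np

def kmeans(Z, K, eps=1e-6, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    c = Z[rng.choice(len(Z), K, replace=False)]              # random initial centroids
    prev = np.inf
    for _ in range(max_iter):
        d2 = ((Z[:, None, :] - c[None, :, :]) ** 2).sum(-1)  # squared distances
        labels = d2.argmin(1)                                # assign to closest centroid
        delta = d2[np.arange(len(Z)), labels].sum()          # objective (7)
        c = np.array([Z[labels == k].mean(0) if np.any(labels == k) else c[k]
                      for k in range(K)])                    # mean of associated points
        if prev - delta < eps:                               # Delta_Kmeans converged
            break
        prev = delta
    return c, labels, delta
```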

The fuzzy version of K-means—the so-called fuzzy C-means (FCM), sometimes also called fuzzy K-means—was proposed by Bezdek [6] and has become the most popular fuzzy clustering method so far. It is a fuzzy extension of K-means; FCM usually achieves a better performance than K-means [19] and is less data dependent. However, it still suffers from the same drawbacks, i.e., the number of clusters should be fixed a priori, and, unfortunately, it may also converge to local optima [24].


Zhang and Hsu [59] proposed a novel fuzzy clustering technique—the so-called K-harmonic means (KHM)—which is less sensitive to initial conditions and promises further improvements. Experimental results demonstrate that KHM outperforms both K-means and FCM [20], [59]. There are many other variants, which are skipped here, since clustering is only an application field for the proposed PSO techniques and is, hence, out of the main scope of this paper. An extensive survey of various types of clustering techniques can be found in [24] and [38].

A hard clustering technique based on bPSO was first introduced by Omran et al. [35], and that work showed that bPSO can outperform K-means, FCM, KHM, and some other state-of-the-art clustering methods by any (evaluation) criterion. This is, indeed, an expected outcome due to PSO's aforementioned ability to cope with local optima by maintaining a guided random search operation through the swarm particles. In clustering, similar to other PSO applications, each particle represents a potential solution at a particular time t, i.e., particle a in the swarm $\xi = \{x_1, \ldots, x_a, \ldots, x_S\}$ is formed as $x_a(t) = \{c_{a,1}, \ldots, c_{a,j}, \ldots, c_{a,K}\} \Rightarrow x_{a,j}(t) = c_{a,j}$, where $c_{a,j}$ is the jth (potential) cluster centroid in the N-dimensional data space, and K is the number of clusters fixed in advance. Note that, contrary to the nonlinear function minimization in the earlier section, the data space dimension N is now different from the solution space dimension K. Furthermore, the fitness function f to be optimized is formed with respect to the following two widely used criteria in clustering.

1) Compactness: Data items in one cluster should be similar or close to each other in the N-dimensional space and different or far away from the others when belonging to different clusters.
2) Separation: Clusters and their respective centroids should be distinct and well separated from each other.

The fitness functions for clustering are then formed as a regularization function fusing both the Compactness and Separation criteria; in this problem domain, they are known as clustering validity indexes. Omran et al. used the following validity index in their work [35]:

$$f(x_a, Z) = w_1 d_{max}(x_a, Z) + w_2 \left(Z_{max} - d_{min}(x_a)\right) + w_3 Q_e(x_a)$$

where

$$Q_e(x_a) = \frac{1}{K} \sum_{j=1}^{K} \frac{\sum_{\forall z_p \in x_{a,j}} \| x_{a,j} - z_p \|}{\| x_{a,j} \|} \tag{8}$$

where $Q_e$ is the quantization error (or the average intracluster distance), and $d_{max}$ is the maximum average Euclidean distance of the data points $Z = \{z_p : z_p \in x_{a,j}\}$ to their centroids in $x_a$. $Z_{max}$ is a constant for the theoretical maximum intercluster distance, and $d_{min}$ is the minimum centroid (intercluster) distance in the cluster centroid set $x_a$. The weights $w_1$, $w_2$, and $w_3$ are user-defined regularization coefficients. Therefore, the minimization of the validity index $f(x_a, Z)$ will simultaneously try to minimize the intracluster distances (for better Compactness) and maximize the intercluster distance (for better Separation). In such a regularization approach, different priorities (weights) can be assigned to the two subobjectives via proper setting of the weight coefficients. Another traditional and well-known validity index is Dunn's index [12], which suffers from the following two drawbacks: it is 1) computationally expensive and 2) sensitive to noise [22]. Several variants of Dunn's index were proposed in [38], where the robustness against noise is improved. There are many other validity indexes, e.g., those proposed by Turi [49], Davies and Bouldin [11], Halkidi and Vazirgiannis [21], etc. A thorough survey can be found in [22]. Most of them present promising results; however, none of them can guarantee the "optimum" number of clusters in every clustering scheme. In particular, for the aforementioned PSO-based clustering in [35], the clustering scheme further depends on the weight coefficients and may, therefore, result in overclustering or underclustering, particularly for complex data distributions.

Although PSO-based clustering outperforms many well-known clustering methods, it still suffers from the following two major drawbacks: 1) the number of clusters K (which is also the solution space dimension) should still be specified in advance, and 2) similar to other bPSO applications, the method tends to get trapped in local optima, particularly when the complexity of the clustering scheme increases. This also involves the dimension of the solution space, i.e., convergence to the "optimum" number of "true" clusters can only be guaranteed for low dimensions. Recall from the earlier discussion that this is also true for the dynamic clustering schemes DCPSO [36] and MEPSO [1], both of which present results only in low dimensions (K ≤ 10 in [36] and K ≤ 6 in [1]) and for simple data distributions. The degradation is likely to be more severe, particularly for DCPSO, since it entirely relies on K-means for the actual clustering.

2) Clustering Based on MD PSO With FGBF: Based on the earlier discussion, it is obvious that the clustering problem requires the determination of the solution space dimension (i.e., the number of clusters K) and an effective mechanism to avoid local optima traps (both dimensionally and spatially), particularly for complex clustering schemes in high dimensions (e.g., K > 10). The former requirement justifies the use of the proposed MD PSO technique, while the latter calls for FGBF. At time t, particle a in the swarm $\xi = \{x_1, \ldots, x_a, \ldots, x_S\}$ has the positional component formed as $xx_a^{xd_a(t)}(t) = \{c_{a,1}, \ldots, c_{a,j}, \ldots, c_{a,xd_a(t)}\} \Rightarrow xx_{a,j}^{xd_a(t)}(t) = c_{a,j}$, meaning that it represents a potential solution (i.e., the cluster centroids) for the $xd_a(t)$ number of clusters, with the jth component being the jth cluster centroid. Apart from the regular limits, such as the (spatial) velocity $V_{max}$, the dimensional velocity $VD_{max}$, and the dimension range $D_{min} \le xd_a(t) \le D_{max}$, the N-dimensional data space is also limited by some practical spatial range, i.e., $X_{min} < xx_a^{xd_a(t)}(t) < X_{max}$. In case this range is exceeded even for a single dimension j, i.e., $xx_{a,j}^{xd_a(t)}(t)$, all positional components of the particle for the respective dimension $xd_a(t)$ are reinitialized randomly within the range (i.e., refer to step 1.3.1 in the MD PSO pseudocode), and this further contributes to the overall diversity.


The following validity index is used to obtain computational simplicity with minimal or no parameter dependency:

$$f\left(xx_a^{xd_a(t)}, Z\right) = Q_e\left(xx_a^{xd_a(t)}\right) \left(xd_a(t)\right)^{\alpha} \quad \text{where} \quad Q_e\left(xx_a^{xd_a(t)}\right) = \frac{1}{xd_a(t)} \sum_{j=1}^{xd_a(t)} \frac{\sum_{\forall z_p \in xx_{a,j}^{xd_a(t)}} \left\| xx_{a,j}^{xd_a(t)} - z_p \right\|}{\left\| xx_{a,j}^{xd_a(t)} \right\|} \tag{9}$$

where $Q_e$ is the quantization error (or the average intracluster distance), representing the Compactness term, and $(xd_a(t))^{\alpha}$ is the Separation term, derived by simply penalizing higher cluster numbers with an exponent α > 0. Using α = 1, the validity index takes its simplest form (i.e., only the numerator of $Q_e$ remains) and becomes entirely parameter-free.
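A minimal sketch of the validity index (9) follows, interpreting the denominator $\|xx_{a,j}\|$ as the number of data points assigned to the jth centroid (the convention under which $Q_e$ is an average intracluster distance) and including the empty-cluster penalty discussed below; the names are illustrative.

```python
import numpy as np

def validity_index(centroids, Z, alpha=1.0):
    """centroids: (d, N) positional component of a particle at dimension d;
    Z: (P, N) data points. Returns Qe * d**alpha, or inf if any cluster
    receives no data point (second hard-clustering constraint violated)."""
    d = len(centroids)
    dist = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=-1)
    labels = dist.argmin(1)                 # hard assignment to the closest centroid
    qe = 0.0
    for j in range(d):
        members = dist[labels == j, j]
        if members.size == 0:               # abundant centroid -> high penalty
            return np.inf
        qe += members.sum() / members.size  # average distance to the jth centroid
    return (qe / d) * d ** alpha            # Qe times the Separation term
```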

On the other hand, (hard) clustering has some constraints. Let $C_j = \{z_p : z_p \in xx_{a,j}^{xd_a(t)}(t)\}$ be the set of data points assigned to a (potential) cluster centroid $xx_{a,j}^{xd_a(t)}(t)$ of particle a at time t. The partitions $C_j$, $\forall j \in [1, xd_a(t)]$, should maintain the following constraints.

1) Each data point should be assigned to one cluster set, i.e., $\bigcup_{j=1}^{xd_a(t)} C_j = Z$.
2) Each cluster should contain at least one data point, i.e., $C_j \neq \{\emptyset\}$, $\forall j \in [1, xd_a(t)]$.
3) Two clusters should have no common data points, i.e., $C_i \cap C_j = \{\emptyset\}$, $i \neq j$ and $\forall i, j \in [1, xd_a(t)]$.

To satisfy the first and third (hard) clustering constraints, all data points are first assigned to the closest centroid before the clustering fitness score is computed via the validity index function in (9). Yet, there is no guarantee of the fulfillment of the second constraint, since $xx_a^{xd_a(t)}(t)$ is set (updated) by the internal dynamics of the MD PSO process, and hence, any dimensional component (i.e., a potential cluster candidate) $xx_{a,j}^{xd_a(t)}(t)$ can be in an abundant position (i.e., with no closest data point). To avoid this, a high penalty is set for the fitness score of the particle, i.e., $f(xx_a^{xd_a(t)}, Z) \approx \infty$ if $\{xx_{a,j}^{xd_a(t)}\} = \{\emptyset\}$ for any j.

The major outlines given so far are sufficient for the standalone application of the MD PSO technique to a dynamic clustering application; however, the FGBF operation presents further difficulties, since, for the aGB creation, the selection of the best or most promising dimensions (i.e., cluster centroids) among all dimensions of the swarm particles is not straightforward. Recall that, in step 2 of the FGBF pseudocode, the index array of the particles yielding the minimum f(a, j) for the jth dimension is given by $a[j] = \arg\min_{a \in [1,S]} (f(a, j))$, $j \in [1, D_{max}]$. This was straightforward for the nonlinear function minimization, where each dimension of the solution space is distinct and corresponds to an individual dimension of the data space. However, in the clustering application, any (potential) cluster centroid of each particle, $xx_{a,j}^{xd_a(t)}(t)$, is updated independently and can be any arbitrary point in the N-dimensional data space. Furthermore, the data points assigned to the jth dimension of a particle a ($\forall z_p \in xx_{a,j}^{xd_a(t)}(t)$) also depend on the distribution of the other dimensions (centroids), i.e., the "closest" data points are assigned to the jth centroid only because the other centroids happen to be at farther locations.

Fig. 3. Formation of the centroid subset in a sample clustering example. Black dots represent data points over the 2-D space, and each colored "+" represents one centroid (dimension) of a swarm particle.

Inserting this particular dimension (centroid) into another particle (e.g., the aGB, in case it is selected) might create an entirely different assignment (or cluster), including the possibility of having no data points assigned to it and, thus, violating the second clustering constraint. To avoid this problem, a new approach is adopted in step 2 to obtain a[j]. At each iteration, a subset among all dimensions of the swarm particles is first formed with the following verification: a dimension of any particle is selected into this subset if and only if there is at least one data point that is closest to it. Henceforth, the creation of the aGB particle within this verified subset ensures that the second clustering constraint is (always) satisfied. Fig. 3 illustrates the formation of the subset on a sample data distribution with four clusters. Note that, in the figure, all dimensions of the entire swarm's particles are shown as "+," but the red ones, belonging to the subset, have at least one (or more) closest data points, whereas the blue ones have none and are, hence, discarded.
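The subset verification can be sketched as follows: within each particle, data points are assigned to the closest centroid, and a centroid enters the subset only if at least one data point is assigned to it. The names are illustrative.

```python
import numpy as np

def verified_subset(particles, Z):
    """particles: list of (d_a, N) centroid arrays, one per swarm particle;
    Z: (P, N) data points. Returns tuples (particle index a, dimension j, centroid)."""
    subset = []
    for a, cents in enumerate(particles):
        dist = np.linalg.norm(Z[:, None, :] - cents[None, :, :], axis=-1)
        labels = dist.argmin(1)        # closest-centroid assignment within particle a
        for j in range(len(cents)):
            if np.any(labels == j):    # at least one closest data point
                subset.append((a, j, cents[j]))
    return subset
```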

Once the subset centroids are selected, the objective is to compose a[j] with the most promising $D_{max}$ centroids selected from the subset, in such a way that each dimensional component of the aGB particle with K dimensions ($xx_{aGB,j}^K(t)$, $\forall j \in [1, K]$), which is formed from a[j] (see step 3.1 of FGBF in the MD PSO pseudocode), can represent one of the true clusters, i.e., be in a close vicinity of its centroid. To accomplish this, only such $D_{max}$ dimensions that fulfill the two clustering criteria, Compactness and Separation, are selected and then stored in a[j]. To achieve well-separated clusters and to avoid the selection of more than one centroid representing the same cluster, spatially close centroids are first grouped using a minimum spanning tree (MST) [30], and then, a certain number of centroid groups, e.g., $d \in [D_{min}, D_{max}]$, can be obtained simply by breaking the (d − 1) longest MST branches. From each group, the one centroid that provides the highest Compactness score (i.e., the minimum dimensional fitness score f(a, j)) is then selected and inserted into a[j] as the jth dimensional component.

Authorized licensed use limited to: Tampereen Teknillinen Korkeakoulu. Downloaded on March 16,2010 at 08:09:04 EDT from IEEE Xplore. Restrictions apply.

Page 11: Fractional Particle Swarm Optimization in Multidimensional ...moncef/publications/md-pso-fgbf-tsmcb.pdf · swarm (personal best position of the particle gbestas the social component),

308 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 40, NO. 2, APRIL 2010

Fig. 4. Fitness score (top) and dimension (bottom) plots versus iteration number for MD PSO (top) and bPSO (bottom) operations both of which were run overDe Jong function.

be set as the jth term of the summation in the Qe expres-sion, i.e.,

f(a, j) =

∑∀zp∈xx

xda(t)a,j

∥∥∥xxxda(t)a,j − zp

∥∥∥∥∥∥xxxda(t)a

∥∥∥ . (10)
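As a rough illustration, the dimensional fitness in (10) can be computed per centroid as the summed point-to-centroid distance normalized by the cluster size; reading the norm in the denominator as the cardinality of the cluster is an assumption on the notation here:

```python
import numpy as np

def dimensional_fitness(centroid, assigned_points):
    """Compactness term f(a, j) of (10) for one centroid.

    centroid: (N,) array, the j-th dimensional component of particle a.
    assigned_points: (P_j, N) array of data points closest to this centroid.
    """
    if len(assigned_points) == 0:
        return np.inf  # empty cluster: worst possible compactness
    dists = np.linalg.norm(assigned_points - centroid, axis=1)
    return dists.sum() / len(assigned_points)
```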

In Fig. 3, a sample MST is formed using the 14 subset centroids as the nodes, and its 13 branches are shown as the red lines connecting the closest nodes (in a minimum span). Breaking the three longest branches (shown as the dashed lines) thus reveals the four groups ($G_1, \ldots, G_4$), among which the one centroid yielding the minimum $f(a, j)$ can then be selected as an individual dimension of the aGB particle with four dimensional components (i.e., $d = K = 4$, $xx_{aGB,j}^{K}(t),\ \forall j \in [1, K]$).
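A sketch of this MST-based grouping using SciPy's minimum spanning tree and connected components; the function name select_agb_dimensions and the exact edge-breaking details are illustrative assumptions, not the authors' code:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree, connected_components

def select_agb_dimensions(centroids, scores, d):
    """Group subset centroids via an MST, break the (d-1) longest branches,
    and pick the centroid with the minimum f(a, j) from each group.

    centroids: (M, N) array of verified subset centroids.
    scores: (M,) array of dimensional fitness scores f(a, j).
    d: desired number of groups (candidate number of clusters).
    Returns the indices of the d selected centroids.
    """
    # Dense pairwise distance matrix as the MST input graph.
    dists = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=2)
    mst = minimum_spanning_tree(dists).toarray()

    # Break the (d - 1) longest MST branches by zeroing them out.
    edges = np.argwhere(mst > 0)
    order = np.argsort([-mst[i, j] for i, j in edges])
    for i, j in edges[order[: d - 1]]:
        mst[i, j] = 0.0

    # The remaining connectivity defines d groups; pick the best per group.
    n_groups, labels = connected_components(mst, directed=False)
    return [int(np.argmin(np.where(labels == g, scores, np.inf)))
            for g in range(n_groups)]
```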

V. EXPERIMENTAL RESULTS

An extensive set of experiments was conducted over the two application domains discussed in Section IV, and the results will be presented in the following sections.

A. Nonlinear Function Minimization

Both proposed techniques, the standalone MD PSO and MD PSO with FGBF, are tested over the seven benchmark functions given in Table V. We use a termination criterion that combines the maximum number of iterations allowed ($iterNo = 5000$) and the cutoff error ($\varepsilon_C = 10^{-4}$). Table V also presents both the positional ($\pm x_{max}$) and dimensional ($[D_{min}, D_{max}]$) range values, whereas the others are empirically set as $V_{max} = x_{max}/2$ and $VD_{max} = 18$, respectively. Unless stated otherwise, these range values are used in all experiments presented in this section.
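The combined termination test amounts to a simple disjunction over the two conditions; a minimal sketch (constant names are illustrative):

```python
ITER_NO = 5000  # maximum number of iterations allowed
EPS_C = 1e-4    # cutoff error on the global-best fitness

def should_terminate(iteration, gbest_score):
    """Stop when either the iteration budget or the cutoff error is reached."""
    return iteration >= ITER_NO or gbest_score < EPS_C
```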

Fig. 4. Fitness score (top) and dimension (bottom) plots versus iteration number for MD PSO (left) and bPSO (right) operations, both of which were run over the De Jong function.

The first set of experiments was performed for a comparative evaluation of the standalone MD PSO versus bPSO over both unimodal and multimodal functions. Fig. 4 presents typical plots where both techniques are applied over the unimodal function De Jong using the swarm size $S = 160$. The red curves of both plots in Fig. 4, and in all the remaining figures in this section, belong to the GB particle (whether it is a new gbest or the aGB particle when FGBF is used), and the blue curve is obtained from the gbest particle when the termination criterion is met [e.g., $gbest = 74$ for bPSO and $f(y_{74}(158)) = 9.21 \times 10^{-5} < \varepsilon_C$]. Naturally, the true dimension ($d_0 = 20$) is set in advance for the bPSO process, and it converges to the global optimum within 158 iterations, as shown in the right plot, whereas MD PSO spent 700 iterations to bring the GB particle to the target dimension ($d_0 = 20$) and then only 80 more iterations to satisfy the termination criterion. Recall that its objective is to find the true dimension where the global optimum exists, and at this dimension, its internal process becomes identical to bPSO. In the overall sense, the standalone MD PSO is therefore slower, but over an extensive set of experiments, we observed that it has the same convergence behavior to the global optima as bPSO. For instance, the performance of both degrades in higher dimensions, e.g., for the same function but at $d_0 = 50$, both require five times more iterations on average to find the global minimum.

A significant speed improvement can be achieved when MD PSO is performed with FGBF. A typical MD PSO run using the swarm size $S = 320$ over another unimodal function, Sphere, but at a higher (target) dimension, is shown in Fig. 5. Note that the run with FGBF (left) took only 160 iterations, whereas the standalone MD PSO (right) completed within 3740 iterations. Note also that within a few iterations, the process with FGBF had already found the true dimension $d_0 = 40$, and after only ten iterations, the aGB particle had already come into a close vicinity of the global minimum (i.e., $f(xy_{aGB}^{40}(10)) \cong 4 \times 10^{-2}$). As shown in Fig. 6, the particle index plot for this operation clearly shows the time instances where aGB (with index number 320) becomes the GB particle, e.g., during the first 14 iterations and then occasionally in the rest of the process.

In addition to the significant speed improvement for unimodal functions, the primary contribution of the FGBF technique becomes most visible when applied over multimodal functions


Fig. 5. Fitness score in log-scale (top) and dimension (bottom) plots versus iteration number for an MD PSO run over the Sphere function with (left) and without (right) FGBF.

Fig. 6. Particle index plot for the MD PSO with FGBF operation shown in Fig. 5.

where bPSO (and the standalone MD PSO) is generally not able to converge to the global optimum even at low dimensions. Figs. 7 and 8 present two applications (standalone MD PSO versus MD PSO with FGBF, using a swarm size of 320) over the Schwefel and Giunta functions at $d_0 = 20$. Note that when FGBF is used, MD PSO can directly have the aGB (as being the GB) particle in the target dimension ($d_0 = 20$) at the beginning of the operation; furthermore, the PSO process benefits from having an aGB particle that is, indeed, in a close vicinity of the global minimum. This eventually helps the swarm to move in the right direction thereafter. Without this mechanism, both standalone PSO applications are eventually trapped in local minima due to the highly multimodal nature of these functions. This is quite evident in the right-hand plots of both figures, and except for a few minority cases, this is also true for the other multimodal functions. In higher dimensions, standalone MD PSO applications over multimodal functions yield even worse results, such as earlier traps into local minima, possibly in a wrong dimension. For example, in standalone MD PSO operations over Schwefel and Giunta with $d_0 = 80$, the GB scores at $t = 4999$ ($f(xy^{80})$) are 8955.39 and 1.83, respectively.

An observation worth mentioning here is that MD PSO with FGBF is usually affected by higher dimensions, but its performance degradation usually occurs as a certain amount of delay, not as entrapment at a local minimum. For instance, when applied over Schwefel and Giunta at $d_0 = 80$, the convergence to the global optima is still achieved, only in a slightly delayed manner, i.e., in 119 and 484 iterations, respectively. Moreover, Fig. 9 presents fitness plots for applications of MD PSO with FGBF using two different swarm sizes over two more multimodal functions, Griewank and Rastrigin. Similar to earlier results, the global minimum in the true dimension is reached for both functions; however, operations at $d_0 = 20$ (red curves) usually take a few hundred iterations less than the ones at $d_0 = 80$ (blue curves).

Unlike bPSO, the swarm size has a direct effect on the performance of MD PSO with FGBF; that is, a larger swarm size increases the speed performance, which is quite evident in Fig. 9 between the corresponding plots on the left and right sides. This is due to the fact that with the larger swarm size, the probability of having better dimensional components (closer to the global optimum at that dimension) of the aGB particle increases, thus yielding a better aGB particle formation in general. Note that this is also clear in the plots on both sides, i.e., at the beginning (e.g., within the first 10–15 iterations, when aGB is usually the GB particle), the drop in the fitness score is much steeper in the right-hand plots with respect to the ones on the left.

Fig. 7. Fitness score in log-scale (top) and dimension (bottom) plots versus iteration number for an MD PSO run over the Schwefel function with (left) and without (right) FGBF.

Fig. 8. Fitness score in log-scale (top) and dimension (bottom) plots versus iteration number for an MD PSO run over the Giunta function with (left) and without (right) FGBF.

For an overall performance evaluation, both proposed methods are tested over the seven benchmark functions using three different swarm sizes (i.e., 160, 320, and 640) and target dimensions (i.e., 20, 50, and 80). For each setting, 100 runs are performed, and the first- and second-order statistics (mean μ and standard deviation σ) of the operation time (total number of iterations) and of the two components of the solution, i.e., the fitness score achieved and the resulting dimension (dbest), are presented in Table VI. During each run, the operation terminates when the fitness score drops below the cutoff error ($\varepsilon_C = 10^{-4}$), and it is assumed that the global minimum of the function in the target dimension is reached; henceforth, the score is set to 0 and, obviously, $dbest = d_0$. Therefore, for a particular function, target dimension $d_0$, and swarm size $S$, obtaining $\mu = 0$ as the average score means that the method converges to the global minimum in the target dimension at every run. On the other hand, an average iteration number of 5000 indicates that the method cannot converge to the global minimum at all; instead, it gets trapped in a local minimum. The statistical results listed in Table VI confirm the earlier observations and remarks about the effects of modality, swarm size, and dimension on the performance (both speed and accuracy). In particular, for the standalone MD PSO application, increasing the swarm size improves the speed of convergence wherever the global minimum is reached for the unimodal functions Sphere and De Jong, while reducing the score significantly on the others. The score reduction is particularly visible in higher dimensions, e.g., for $d_0 = 80$ (compare the highlighted average scores of the top five functions). Note that particularly on De Jong at $d_0 = 50$, none of the standalone MD PSO runs with $S = 160$ can converge to the global minimum, while they all can with a higher swarm population (i.e., 320 or 640).

Fig. 9. MD PSO with FGBF operation over the Griewank (top) and Rastrigin (bottom) functions with $d_0 = 20$ (red) and $d_0 = 80$ (blue) using the swarm sizes $S = 80$ (left) and $S = 320$ (right).

Both dimension and modality have a direct effect on the performance of the standalone MD PSO. On unimodal functions, its convergence speed decreases with increasing dimension, e.g., see the highlighted average values of the iteration numbers for Sphere at $d_0 = 20$ versus $d_0 = 50$. On multimodal functions, regardless of the dimension and swarm size, all standalone MD PSO runs get trapped in local minima (except perhaps a few runs on Rosenbrock at $d_0 = 20$); however, the fitness performance still depends on the dimension; that is, the final score tends to increase in higher dimensions, indicating an earlier entrapment at a local minimum. Regardless of the swarm size, this can easily be seen in all multimodal functions, except Griewank and Giunta, both of which show higher modalities in lower dimensions. In particular, Griewank becomes a plain Sphere function when the dimensionality exceeds 20–25. This is the reason behind the performance improvement (or score reduction) from $d_0 = 20$ to $d_0 = 50$, but note that the worst performance (highest score average) is still encountered at $d_0 = 80$.

As the entire set of statistics on the right side of Table VI indicates, MD PSO with FGBF finds the global minimum at the target dimension for all runs over all functions, regardless of the dimension, swarm size, and modality, and without any exception. Moreover, the mutual application of the proposed techniques significantly improves the convergence speed, e.g., compare the highlighted average iteration numbers with those of the standalone MD PSO. Dimensionality, modality, and swarm size might still be important factors for the speed and have the same effects as mentioned earlier, i.e., the speed degrades with modality and dimensionality, whereas it improves with increasing swarm size. Their effects, however, vary significantly among the functions; e.g., as highlighted in Table VI, the swarm size can enhance the speed radically for Giunta but only marginally for Griewank. The same statement can be made concerning the dimensionality for De Jong and Sphere.

Based on the results in Table VI, we can perform comparative evaluations with some of the promising PSO variants, such as [2], [15], [44], and [45], where similar experiments are performed over some or all of these benchmark functions. They have, however, the advantage of a fixed dimension, whereas MD PSO with FGBF finds the true dimension on the fly. Furthermore, it is rather difficult to make speed comparisons, since none of them really finds the global minimum for most functions; instead, they have demonstrated some incremental performance improvements in terms of score reduction with respect to some other competing technique(s). For example, in [2], a tournament selection mechanism is formed among particles, and the method is applied over four functions (i.e., Sphere, Rosenbrock, Rastrigin, and Griewank). Although the method is performed over a reduced positional range, i.e., ±15, and at low dimensions (i.e., 10, 20, and 30), they obtained varying average scores in the range {0.3, 1194}. As a result, they reported both better and worse performances than bPSO, depending on the function. In [15], bPSO and two PSO variants, namely, GCPSO and mutation-extended PSO over three neighborhood topologies, are applied to some common multimodal functions, namely, Rastrigin, Schwefel, and Griewank. Although the dimension is rather low (i.e., 30), none of the topologies over any PSO variant converged to the global minimum, and they reported average scores varying in the range {0.0014, 4762}. In [44], a diversity-guided PSO variant, ARPSO, along with two competing methods, bPSO and GA, is applied over the multimodal functions Rastrigin, Rosenbrock, and Griewank at three different dimensions (i.e., 20, 50, and 100). The range is kept quite reduced for Rosenbrock and Rastrigin, i.e., ±100 and ±5.12, respectively, and for each run, the number of evaluations (the product of iterations and the swarm size) is kept between 400 000 and 2 000 000, depending on the dimensionality.

The experimental results have shown that none of the three methods converged to the global minimum, except ARPSO over (only) Rastrigin at dimension 20. Only when ARPSO runs until stagnation, where no fitness improvement occurs within 200 000 evaluations, can it also find the global minimum over Rastrigin at higher dimensions (i.e., 50 and 100). In a practical sense, however, this indicates that the total number of iterations might be on the order of $10^5$ or even more. Recall that the number of iterations required for MD PSO with FGBF to converge to the global minimum is less than 400 for any dimension. ARPSO performed better than bPSO and GA over Rastrigin and Rosenbrock but worse over Griewank. The CPSO proposed in [52] was applied over five functions, among which four are common (i.e., Sphere, Rastrigin, Rosenbrock, and Griewank). The dimension of all functions is fixed to 30, and in this dimension, CPSO performed better than bPSO in 80% of the experiments. Finally, in [45], dynamic sociometries via ring and star have been introduced among the swarm particles, and the performance of various combinations of swarm size and sociometry over six functions (the ones used in this paper, except Schwefel) has been reported. Although the tests are performed over comparatively reduced positional ranges and at a low dimension (i.e., 30), the experimental results indicate that none of the sociometry and swarm size combinations converged to the global minimum of the multimodal functions, except only for some dimensions of the Griewank function.

TABLE VI. STATISTICAL RESULTS FROM 100 RUNS OVER SEVEN BENCHMARK FUNCTIONS

Fig. 10. Two-dimensional synthetic data spaces carrying different clustering schemes.

B. Data Clustering

To test the application of the proposed techniques to clustering, we create 11 synthetic data spaces, as shown in Fig. 10. For illustration purposes, each data space is formed in 2-D; however, the clusters are formed with different shapes, densities, sizes, and intercluster distances to test the robustness of the clustering application of the proposed techniques against such variations. Furthermore, recall that the number of clusters determines the (true) dimension of the solution space in a PSO application, and hence, it is also kept varying among the data spaces to test the accuracy of convergence to the true (solution space) dimension. As a result, significantly varying complexity levels are established among the 11 data spaces to perform a general-purpose evaluation of each technique.
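As an aside, 2-D test clusters of this kind can be synthesized from anisotropic Gaussian blobs; a minimal sketch under that assumption (the counts, spreads, and seed below are illustrative, not the paper's C1–C11 definitions):

```python
import numpy as np

def make_blobs_2d(centers, sizes, spreads, seed=0):
    """Generate 2-D clusters with varying sizes and (anisotropic) spreads.

    centers: list of (x, y) cluster centers.
    sizes:   number of points per cluster (varying density/size).
    spreads: per-cluster (sx, sy) standard deviations (varying shape).
    """
    rng = np.random.default_rng(seed)
    points = [rng.normal(loc=c, scale=s, size=(n, 2))
              for c, n, s in zip(centers, sizes, spreads)]
    return np.vstack(points)

# e.g., three clusters of different density, size, and elongation
Z = make_blobs_2d(centers=[(0, 0), (10, 0), (5, 8)],
                  sizes=[200, 50, 400],
                  spreads=[(1.0, 1.0), (0.5, 2.5), (2.0, 0.7)])
```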

Unless stated otherwise, the maximum number of iterations is set to 2000; however, the use of a cutoff error as a termination criterion is avoided, since it is not feasible to set a unique $\varepsilon_C$ value for all clustering schemes. The same range values given in Section V-A are also used in all experiments, except the positional range $\pm x_{max}$, since it can now be set simply as the natural boundaries of the 2-D data space. The first set of clustering operations is performed for a comparative evaluation of the standalone MD PSO versus bPSO over the simple data spaces where they can yield accurate results; e.g., the results of clustering over the three data spaces in the top row of Fig. 10 are shown in Fig. 11, where each cluster is represented in one of the three color codes (i.e., red, green, and blue) for illustration purposes, and each cluster centroid (each dimensional component of the gbest particle) is shown with a white "+".

The accuracy of both bPSO and MD PSO tends to degrade with increasing dimensionality and complexity.


Fig. 11. Standalone MD PSO (and bPSO) clustering for data spaces C1–C3 shown in Fig. 10.

Fig. 12. Erroneous bPSO clustering over data spaces C4, C5, C6, and C9 shown in Fig. 10.

Fig. 12 presents typical clustering results for $K \geq 10$, obtained while running each bPSO operation until the iteration number reaches 20 000 (i.e., stagnation). $K = 10$ is, indeed, not a very high dimension for bPSO, but it particularly suffers from the highly complex clustering schemes in C4 and C5 (i.e., varying sizes, shapes, and densities among clusters). Over a simpler data space, e.g., C6 with 13 clusters, we noticed that bPSO occasionally yields accurate clustering, but for those data spaces with 20–25 clusters or more, clustering errors become inevitable regardless of the level of complexity, and the errors tend to increase significantly in higher dimensions as a natural consequence of earlier local traps. A typical example is C9, which has 42 clusters in the simplest form (uniform size, shape, and density), and the clustering result presents many overclustering and underclustering schemes with occasional mislocated centroids. Much worse performance can be expected from their application to C10 and C11.

As stated earlier, MD PSO with FGBF, besides its speed improvement, makes its primary contribution to the accuracy of the clustering, i.e., convergence to the true number of clusters $K$ and correct localization of the centroids. As the typical results shown in Fig. 13 indicate, MD PSO with FGBF meets the expectations on clustering accuracy but occasionally results in a slightly higher number of clusters. This is due to the use of a simple but quite impure validity index in (9) as the fitness function, and for some complex clustering schemes, it may, therefore, yield its minimum score at a slightly higher number of clusters. A sample clustering operation validating this fact is shown in Fig. 14. Note that the (true) number of clusters is 10, which is eventually reached at the beginning of the operation; yet, the minimum score achieved with $K = 10$ (∼750) remains higher than the one with $K = 11$ (∼610) and than the final outcome with $K = 12$ (∼570) as well. The main reason for this is that the validity index in (9) over long (and loose) clusters, such as "C" and "S" in the figure, yields a much higher fitness score with one centroid than with two or perhaps more, and therefore, over all data spaces with such long and loose clusters (e.g., C4, C8, C10, and C11), the proposed method yields slight overclustering but never underclustering.

Improving the validity index or adopting a more sophisticated one, such as Dunn's index [12] or many others, might improve the clustering accuracy; however, this is beyond the scope of this paper.

An important observation worth mentioning is that the clustering complexity (modality) affects the proposed methods' mutual performance much more than the total cluster number (dimension) does. For instance, MD PSO with FGBF clustering (with $S = 640$) over data space C9 can immediately find the true cluster number and the accurate locations of the centroids with a slight offset (see Fig. 15), whereas this takes around 1900 iterations for C8. Fig. 16 shows the time instances where aGB (with index number 640) becomes the GB particle. It immediately (at the first iteration) provides a "near-optimum" GB solution with 43 clusters, and then, the MD PSO process (at the 38th iteration) eventually finds the global optimum with 42 clusters (i.e., see the first snapshot in Fig. 15). Afterward, the ongoing PSO process corrects the slight positional offsets of the cluster centroids (e.g., compare the first and second snapshots in Fig. 15). Therefore, when the clusters are compact, are uniformly distributed, and have similar shape, density, and size, thus yielding the simplest form, it becomes quite straightforward for FGBF to select the "most promising" dimensions with a greater accuracy. As the complexity (modality) increases, different centroid assignments and clustering combinations have to be assessed to converge toward the global optimum, which eventually becomes a slow and tedious process.

Fig. 13. Typical clustering results via MD PSO with FGBF. Overclustered samples are indicated with ∗.

Fig. 14. Fitness score (top) and dimension (bottom) plots versus iteration number for an MD PSO with FGBF clustering operation over C4. Three clustering snapshots at iterations 105, 1050, and 1850 are presented below the plots.

Fig. 15. Fitness score (top) and dimension (bottom) plots versus iteration number for an MD PSO with FGBF clustering operation over C9. Three clustering snapshots at iterations 40, 950, and 1999 are presented below the plots.

Fig. 16. Particle index plot for the MD PSO with FGBF clustering operation shown in Fig. 15.

Recall from the earlier discussion of the application of the proposed methods to nonlinear function minimization (both standalone MD PSO and MD PSO with FGBF) that a certain speed improvement, in terms of a reduction in the iteration number, and a better fitness score are achieved when a larger swarm is used. However, the computational complexity (per iteration) also increases, since the number of evaluations (fitness computations) is proportional to the number of particles. The same tradeoff also exists for the clustering application; furthermore, a significantly higher computational complexity of the mutual application of the proposed methods can occur due to the spatial MST grouping for the selection of the well-separated centroids. As explained in Section IV-A, the MST is the essence of choosing the "most promising" dimensions (centroids) to form the best possible aGB particle. However, it is a costly $O(N_{SS}^2)$ operation, where $N_{SS}$ is the subset size, which is formed by those dimensions (potential centroids) having at least one closest data item. Therefore, $N_{SS}$ tends to increase if a larger swarm size is used and/or MD PSO with FGBF clustering is performed over large (with many data items) and highly complex data spaces.

Table VII presents the average processing times per iteration over all sample data spaces using four different swarm sizes. All experiments are performed on a computer with a Pentium IV 3-GHz central processing unit and 1 GB of random access memory. Note that the processing times tend to increase in general when the data spaces get larger (with more data items), but the real factor is the complexity. The processing for a highly complex data structure, such as C10, may require several times more computation (or time) than a simpler but comparably sized data space, such as C5. Therefore, on such highly complex data spaces, the swarm size should be kept low, e.g., $80 \leq S \leq 160$, for the sake of a reasonable processing time.

TABLE VII. PROCESSING TIME (IN MILLISECONDS) PER ITERATION FOR MD PSO WITH FGBF CLUSTERING USING FOUR DIFFERENT SWARM SIZES. THE NUMBER OF DATA ITEMS IS PRESENTED IN PARENTHESES WITH THE SAMPLE DATA SPACE.

VI. CONCLUSION

In this paper, we have proposed two novel PSO techniques, namely, MD PSO and FGBF, as a cure to common drawbacks of the family of PSO methods, such as the need for a priori knowledge of the search space dimension and premature convergence to local optima. The first proposed technique, i.e., the (standalone) MD PSO, efficiently addresses the former drawback by defining a new particle formation and embedding the ability of dimensional navigation into the core of the process. It basically allows particles to make interdimensional "passes" with a dedicated PSO process while performing regular positional updates in every dimension that they visit. Such flexibility negates the requirement of setting the dimension in advance, since swarm particles can now converge to the global solution at the optimum dimension simultaneously.

Although the ability to determine the optimum dimension where the global solution exists is gained with MD PSO, its convergence performance is still limited to the same level as that of bPSO, which suffers from the lack of diversity among particles. This leads to premature convergence to local optima, particularly when multimodal problems are optimized at high dimensions. Realizing that the main problem lies, in fact, in the inability to use the available diversity among the dimensional components of swarm particles, the FGBF technique proposed in this paper addresses this problem by collecting the best components and fractionally creating an aGB particle that has the potential to be a better "guide" than the contemporary gbest particle. Therefore, for those problems where either the exact dimensional fitness evaluation or its approximation is possible, FGBF can be conveniently used to improve the global convergence ability without changing the native structure of the swarm.

To test and evaluate MD PSO's performance over the first problem domain, i.e., nonlinear function minimization, seven benchmark functions were biased with a dimensional term so that they have the global minimum only at a particular dimension. We have shown that the standalone MD PSO (without FGBF) converges to the global minimum in the target dimension over unimodal functions. We then investigated the effects of swarm size, dimensionality, and modality on the performance, both accuracy and speed. A comparative evaluation with bPSO has shown that the standalone MD PSO is slower than bPSO due to its additional dimensional search process but has the same convergence behavior, as expected. The performance of both methods degrades with increasing modality and dimensionality due to the aforementioned reasons. When used with FGBF, MD PSO exhibits such an impressive speed gain that their mutual performance surpasses that of bPSO by several orders of magnitude. Experimental results show that, except in a few minority cases, the convergence to the global minimum at the target dimension is achieved within fewer than 1000 iterations on the average, mostly within a few hundred or even less. Yet, the major improvement occurs in the convergence accuracy, both positional and dimensional. MD PSO with FGBF finds the global minimum at the target dimension for all runs over all functions without any exception. This is a substantial achievement in the area of PSO-based nonlinear function minimization.

Similar remarks can be made for the applications of bPSO and the standalone MD PSO to data clustering, within which the (clustering) complexity can be thought of as synonymous with (function) modality, i.e., the speed and accuracy performances of both methods drastically degrade with increasing complexity. Needless to say, the true number of clusters has to be set in advance for bPSO, whereas MD PSO finds it on the fly and, hence, exhibits a slower convergence pace than bPSO. When it is performed with FGBF, a significant speed improvement is achieved, and only such cooperation can provide accurate clustering results over complex data spaces. Since the clustering performance also depends on the validity index used, occasional overclustering can be encountered, where we have shown that such results, indeed, correspond to the global minimum of the validity index function used. As a result, the true number of clusters and accurate centroid localization are achieved at the expense of increased computational complexity due to the usage of the MST. To keep the overall computational cost within feasible limits, we also investigated the effects of swarm size on complexity and recommended a proper range for practical use.

Overall, the proposed techniques fundamentally upgrade the particle structure and the swarm guidance, both of which accomplish substantial improvements in terms of speed and accuracy. Both techniques are modular and independent of each other, i.e., one can be performed without the other, while other PSO methods/variants can also be conveniently used with (either of) them.

REFERENCES

[1] A. Abraham, S. Das, and S. Roy, "Swarm intelligence algorithms for data clustering," in Soft Computing for Knowledge Discovery and Data Mining. New York: Springer-Verlag, 2007, pp. 279–313, Part IV.
[2] P. J. Angeline, "Using selection to improve particle swarm optimization," in Proc. IEEE Congr. Evol. Comput., 1998, pp. 84–89.
[3] P. J. Angeline, "Evolutionary optimization versus particle swarm optimization: Philosophy and performance differences," in Proc. Evol. Program. VII, Conf. EP. New York: Springer-Verlag, Mar. 1998, vol. 1447, pp. 601–610.
[4] T. Back and H. P. Schwefel, "An overview of evolutionary algorithms for parameter optimization," Evol. Comput., vol. 1, no. 1, pp. 1–23, 1993.
[5] T. Back and F. Kursawe, "Evolutionary algorithms for fuzzy logic: A brief overview," in Fuzzy Logic and Soft Computing. Singapore: World Scientific, 1995, pp. 3–10.
[6] J. C. Bezdek, Pattern Recognition With Fuzzy Objective Function Algorithms. New York: Plenum, 1981.
[7] X. Chen and Y. Li, "A modified PSO structure resulting in high exploration ability with convergence guaranteed," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 37, no. 5, pp. 1271–1289, Oct. 2007.
[8] Y.-P. Chen, W.-C. Peng, and M.-C. Jian, "Particle swarm optimization with recombination and dynamic linkage discovery," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 37, no. 6, pp. 1460–1470, Dec. 2007.
[9] K. M. Christopher and K. D. Seppi, "The Kalman swarm: A new approach to particle motion in swarm optimization," in Proc. GECCO, 2004, pp. 140–150.
[10] M. Clerc, "The swarm and the queen: Towards a deterministic and adaptive particle swarm optimization," in Proc. IEEE Congr. Evol. Comput., Jul. 1999, vol. 3, pp. 1951–1957.
[11] D. L. Davies and D. W. Bouldin, "A cluster separation measure," IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-1, no. 2, pp. 224–227, Apr. 1979.
[12] J. C. Dunn, "Well separated clusters and optimal fuzzy partitions," J. Cybern., vol. 4, pp. 95–104, 1974.
[13] R. Eberhart, P. Simpson, and R. Dobbins, Computational Intelligence PC Tools. Boston, MA: Academic, 1996.
[14] A. P. Engelbrecht, Fundamentals of Computational Swarm Intelligence. Hoboken, NJ: Wiley, 2005.
[15] S. C. Esquivel and C. A. Coello Coello, "On the use of particle swarm optimization with multimodal functions," in Proc. IEEE Congr. Evol. Comput., 2003, vol. 2, pp. 1130–1136.
[16] U. M. Fayyad, G. P. Shapire, P. Smyth, and R. Uthurusamy, Advances in Knowledge Discovery and Data Mining. Cambridge, MA: MIT Press, 1996.
[17] H. Frigui and R. Krishnapuram, "Clustering by competitive agglomeration," Pattern Recognit., vol. 30, no. 7, pp. 1109–1119, Jul. 1997.
[18] D. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning. Reading, MA: Addison-Wesley, 1989, pp. 1–25.
[19] G. Hammerly, "Learning structure and concepts in data through data clustering," Ph.D. dissertation, Univ. California, San Diego, CA, 2003.
[20] G. Hammerly and C. Elkan, "Alternatives to the k-means algorithm that find better clusterings," in Proc. 11th ACM CIKM, 2002, pp. 600–607.
[21] M. Halkidi and M. Vazirgiannis, "Clustering validity assessment: Finding the optimal partitioning of a data set," in Proc. 1st IEEE ICDM, 2001, pp. 187–194.
[22] M. Halkidi, Y. Batistakis, and M. Vazirgiannis, "On cluster validation techniques," J. Intell. Inf. Syst., vol. 17, no. 2/3, pp. 107–145, 2001.
[23] H. Higashi and H. Iba, "Particle swarm optimization with Gaussian mutation," in Proc. IEEE Swarm Intell. Symp., 2003, pp. 72–79.
[24] A. K. Jain, M. N. Murthy, and P. J. Flynn, "Data clustering: A review," ACM Comput. Rev., vol. 31, no. 3, pp. 264–323, Nov. 1999.
[25] S. Janson and M. Middendorf, "A hierarchical particle swarm optimizer and its adaptive variant," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 35, no. 6, pp. 1272–1282, Dec. 2005.
[26] B. Kaewkamnerdpong and P. J. Bentley, "Perceptive particle swarm optimization: An investigation," in Proc. IEEE Swarm Intell. Symp., Pasadena, CA, Jun. 8–10, 2005, pp. 169–176.
[27] J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proc. IEEE Int. Conf. Neural Netw., Perth, Australia, 1995, vol. 4, pp. 1942–1948.
[28] J. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection. Cambridge, MA: MIT Press, 1992.
[29] R. A. Krohling and L. S. Coelho, "Coevolutionary particle swarm optimization using Gaussian distribution for solving constrained optimization problems," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 36, no. 6, pp. 1407–1416, Dec. 2006.
[30] J. B. Kruskal, "On the shortest spanning subtree of a graph and the traveling salesman problem," in Proc. AMS, 1956, vol. 7, pp. 48–50.
[31] J. J. Liang and A. K. Qin, "Comprehensive learning particle swarm optimizer for global optimization of multimodal functions," IEEE Trans. Evol. Comput., vol. 10, no. 3, pp. 281–295, Jun. 2006.
[32] M. Lovberg, "Improving particle swarm optimization by hybridization of stochastic search heuristics and self-organized criticality," M.S. thesis, Dept. Comput. Sci., Univ. Aarhus, Aarhus, Denmark, 2002.
[33] M. Lovberg and T. Krink, "Extending particle swarm optimisers with self-organized criticality," in Proc. IEEE Congr. Evol. Comput., 2002, vol. 2, pp. 1588–1593.
[34] R. Mendes, J. Kennedy, and J. Neves, "The fully informed particle swarm: Simpler, maybe better," IEEE Trans. Evol. Comput., vol. 8, no. 3, pp. 204–210, Jun. 2004.
[35] M. Omran, A. Salman, and A. P. Engelbrecht, "Image classification using particle swarm optimization," in Proc. Conf. Simulated Evolution Learn., 2002, vol. 1, pp. 370–374.
[36] M. G. Omran, A. Salman, and A. P. Engelbrecht, "Dynamic clustering using particle swarm optimization with application in image segmentation," Pattern Anal. Appl., vol. 8, no. 4, pp. 332–344, Feb. 2006.
[37] M. G. Omran, A. Salman, and A. P. Engelbrecht, Particle Swarm Optimization for Pattern Recognition and Image Processing. Berlin, Germany: Springer-Verlag, 2006.
[38] N. R. Pal and J. Biswas, "Cluster validation using graph theoretic concepts," Pattern Recognit., vol. 30, no. 6, pp. 847–857, Jun. 1997.
[39] B. Peng, R. G. Reynolds, and J. Brewster, "Cultural swarms," in Proc. IEEE Congr. Evol. Comput., 2003, vol. 3, pp. 1965–1971.
[40] T. Peram, K. Veeramachaneni, and C. K. Mohan, "Fitness-distance-ratio based particle swarm optimization," in Proc. IEEE Swarm Intell. Symp., 2003, pp. 174–181.
[41] A. C. Ratnaweera, S. K. Halgamuge, and H. C. Watson, "Particle swarm optimization with self-adaptive acceleration coefficients," in Proc. 1st Int. Conf. Fuzzy Syst. Knowl. Discovery, 2003, pp. 264–268.
[42] A. C. Ratnaweera, S. K. Halgamuge, and H. C. Watson, "Particle swarm optimiser with time varying acceleration coefficients," in Proc. Int. Conf. Soft Comput. Intell. Syst., 2002, pp. 240–255.
[43] R. G. Reynolds, B. Peng, and J. Brewster, "Cultural swarms—Part 2: Virtual algorithm emergence," in Proc. IEEE CEC, Canberra, Australia, 2003, pp. 1972–1979.
[44] J. Riget and J. S. Vesterstrom, "A diversity-guided particle swarm optimizer—The ARPSO," Dept. Comput. Sci., Univ. Aarhus, Aarhus, Denmark, Tech. Rep., 2002.
[45] M. Richards and D. Ventura, "Dynamic sociometry in particle swarm optimization," in Proc. 6th Int. Conf. Comput. Intell. Natural Comput., Cary, NC, Sep. 2003, pp. 1557–1560.
[46] Y. Shi and R. C. Eberhart, "A modified particle swarm optimizer," in Proc. IEEE Congr. Evol. Comput., 1998, pp. 69–73.
[47] Y. Shi and R. C. Eberhart, "Fuzzy adaptive particle swarm optimization," in Proc. IEEE Congr. Evol. Comput., 2001, vol. 1, pp. 101–106.
[48] J. T. Tou and R. C. Gonzalez, Pattern Recognition Principles. London, U.K.: Addison-Wesley, 1974.
[49] R. H. Turi, "Clustering-based colour image segmentation," Ph.D. dissertation, Monash Univ., Melbourne, Australia, 2001.
[50] F. Van den Bergh, "An analysis of particle swarm optimizers," Ph.D. dissertation, Dept. Comput. Sci., Univ. Pretoria, Pretoria, South Africa, 2002.
[51] F. Van den Bergh and A. P. Engelbrecht, "A new locally convergent particle swarm optimizer," in Proc. IEEE Int. Conf. Syst., Man, Cybern., 2002, pp. 96–101.
[52] F. Van den Bergh and A. P. Engelbrecht, "A cooperative approach to particle swarm optimization," IEEE Trans. Evol. Comput., vol. 8, no. 3, pp. 225–239, Jun. 2004.
[53] E. O. Wilson, Sociobiology: The New Synthesis. Cambridge, MA: Belknap Press, 1975.
[54] X. Xie, W. Zhang, and Z. Yang, "A dissipative particle swarm optimization," in Proc. IEEE Congr. Evol. Comput., 2002, vol. 2, pp. 1456–1461.
[55] X. Xie, W. Zhang, and Z. Yang, "Adaptive particle swarm optimization on individual level," in Proc. 6th Int. Conf. Signal Process., 2002, vol. 2, pp. 1215–1218.
[56] X. Xie, W. Zhang, and Z. Yang, "Hybrid particle swarm optimizer with mass extinction," in Proc. Int. Conf. Commun., Circuits Syst., 2002, vol. 2, pp. 1170–1173.
[57] K. Yasuda, A. Ide, and N. Iwasaki, "Adaptive particle swarm optimization," in Proc. IEEE Int. Conf. Syst., Man, Cybern., 2003, vol. 2, pp. 1554–1559.
[58] W.-J. Zhang and X.-F. Xie, "DEPSO: Hybrid particle swarm with differential evolution operator," in Proc. IEEE Int. Conf. Syst., Man, Cybern., 2003, vol. 4, pp. 3816–3821.
[59] B. Zhang and M. Hsu, "K-harmonic means—A data clustering algorithm," Hewlett-Packard Labs, Palo Alto, CA, Tech. Rep. HPL-1999-124, 1999.
[60] W.-J. Zhang, Y. Liu, and M. Clerc, "An adaptive PSO algorithm for reactive power optimization," in Proc. APSCOM, Hong Kong, 2003, vol. 1, pp. 302–307.

Serkan Kiranyaz was born in Turkey in 1972. He received the B.S. degree in electrical and electronics engineering and the M.S. degree in signal and video processing from Bilkent University, Ankara, Turkey, in 1994 and 1996, respectively, and the Ph.D. degree and the Docency from Tampere University of Technology, Tampere, Finland, in 2005 and 2007, respectively.

He was a Researcher with the Nokia Research Center and later with Nokia Mobile Phones, Tampere. He is currently an Associate Professor with the Department of Signal Processing, Tampere University of Technology. He is the architect and principal developer of the ongoing content-based multimedia indexing and retrieval framework MUVIS. His research interests include multidimensional optimization, evolutionary neural networks, content-based multimedia indexing, browsing and retrieval algorithms, audio analysis and audio-based multimedia retrieval, object extraction, motion estimation and VLBR video coding, MPEG4 over IP, and multimedia processing.

Turker Ince received the B.S. degree in electrical engineering from Bilkent University, Ankara, Turkey, the M.S. degree in electrical engineering from the Middle East Technical University, Ankara, and the Ph.D. degree in electrical engineering from the University of Massachusetts, Amherst (UMass-Amherst), in 1994, 1996, and 2001, respectively.

From 1996 to 2001, he was a Research Assistant with the Microwave Remote Sensing Laboratory, UMass-Amherst. He was a Design Engineer with Aware, Inc., Boston, MA, from 2001 to 2004, and with Texas Instruments, Inc., Dallas, from 2004 to 2006. In 2006, he joined the faculty of the Department of Computer Engineering, Izmir University of Economics, Izmir, Turkey, where he is currently an Assistant Professor. His research interests include electromagnetic remote sensing and target recognition, radar signal processing, biomedical signal processing, neural networks, and global optimization techniques.


Alper Yildirim was born in Turkey in 1974. He received the B.Sc. degree in electrical and electronics engineering from Bilkent University, Ankara, Turkey, in 1996, the M.S. degree in digital and computer systems from Tampere University of Technology, Tampere, Finland, in 2001, and the Ph.D. degree in electronics engineering from Ankara University, Ankara, in 2007.

He was a Design Engineer with Nokia Mobile Phones, Tampere. He is currently a Chief Research Scientist with the Scientific and Technological Research Council of Turkey, Ankara. His research interests include digital signal processing, optimization, and radar systems.

Moncef Gabbouj (M'85–SM'95) received the B.S. degree in electrical engineering from Oklahoma State University, Stillwater, in 1985 and the M.S. and Ph.D. degrees in electrical engineering from Purdue University, West Lafayette, IN, in 1986 and 1989, respectively.

He is a Professor with the Department of Signal Processing, Tampere University of Technology, Tampere, Finland, which he headed during 2002–2007. He was a Visiting Professor at the American University of Sharjah, Sharjah, United Arab Emirates, during 2007–2008 and was a Senior Research Fellow of the Academy of Finland, Helsinki, Finland, during 1997–1998 and 2007–2008. He was a Guest Editor for Multimedia Tools and Applications and Applied Signal Processing. His research interests include multimedia content-based analysis, indexing, and retrieval, nonlinear signal and image processing and analysis, and video processing and coding.

Dr. Gabbouj is a member of the IEEE Signal Processing (SP) Society and the IEEE Circuits and Systems (CAS) Society. He is the past Chairman of the IEEE Finland Section, the IEEE CAS Society Technical Committee on Digital Signal Processing, and the IEEE SP/CAS Finland Chapter. He served as a Distinguished Lecturer for the IEEE Circuits and Systems Society during 2004–2005. He served as an Associate Editor for the IEEE TRANSACTIONS ON IMAGE PROCESSING. He was the recipient of the 2005 Nokia Foundation Recognition Award, a corecipient of the Myril B. Reed Best Paper Award at the 32nd Midwest Symposium on Circuits and Systems, and a corecipient of the NORSIG 94 Best Paper Award at the 1994 Nordic Signal Processing Symposium.
