Research Article An Improved Exact Algorithm for Least-Squares Unidimensional...

16
Hindawi Publishing Corporation Journal of Applied Mathematics Volume 2013, Article ID 890589, 15 pages http://dx.doi.org/10.1155/2013/890589 Research Article An Improved Exact Algorithm for Least-Squares Unidimensional Scaling Gintaras Palubeckis Faculty of Informatics, Kaunas University of Technology, Studentu 50-408, 51368 Kaunas, Lithuania Correspondence should be addressed to Gintaras Palubeckis; gintaras@soſten.ktu.lt Received 16 October 2012; Accepted 31 March 2013 Academic Editor: Frank Werner Copyright © 2013 Gintaras Palubeckis. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Given n objects and an × symmetric dissimilarity matrix D with zero main diagonal and nonnegative off-diagonal entries, the least-squares unidimensional scaling problem asks to find an arrangement of objects along a straight line such that the pairwise distances between them reflect dissimilarities represented by the matrix D. In this paper, we propose an improved branch- and-bound algorithm for solving this problem. e main ingredients of the algorithm include an innovative upper bounding technique relying on the linear assignment model and a dominance test which allows considerably reducing the redundancy in the enumeration process. An initial lower bound for the algorithm is provided by an iterated tabu search heuristic. To enhance the performance of this heuristic we develop an efficient method for exploring the pairwise interchange neighborhood of a solution in the search space. e basic principle and formulas of the method are also used in the implementation of the dominance test. We report computational results for both randomly generated and real-life based problem instances. In particular, we were able to solve to guaranteed optimality the problem defined by a 36 × 36 Morse code dissimilarity matrix. 1. Introduction Least-squares (or 2 ) unidimensional scaling is an important optimization problem in the field of combinatorial data analysis. It can be stated as follows. Suppose that there are objects and, in addition, there are pairwise dissimilarity data available for them. ese data form an × symmetric dissimilarity matrix, = ( ), with zero main diagonal and nonnegative off-diagonal entries. e problem is to arrange the objects along a straight line such that the pairwise distances between them reflect dissimilarities given by the matrix . Mathematically, the problem can be expressed as min Φ () = −1 =1 =+1 ( ) 2 , (1) where denotes the vector ( 1 ,..., ) and , ∈ {1, . . . , }, is the coordinate for the object on the line. We note that two (or even more) objects may have the same coordinate. In the literature, the double sum given by (1) is called a stress function [1]. Unidimensional scaling and its generalization, multi- dimensional scaling [1, 2], are dimensionality reduction techniques for exploratory data analysis and data mining [3]. e applicability of these techniques has been reported in a diversity of fields. In recent years, unidimensional scaling has found new applications in several domains including genetics, psychology, and medicine. Caraux and Pinloche [4] developed an environment to graphically explore gene expression data. is environment offers unidimensional scaling methods, along with some other popular techniques. Ng et al. [5] considered the problem of constructing linkage disequilibrium maps for human genome. ey proposed a method for this problem which is centered on the use of a specific type of the unidimensional scaling model. Armstrong et al. [6] presented a study in which they evaluated interest profile differences across gender, racial-ethnic group, and employment status. e authors used two measures of profile similarity, the so-called correlation and Euclidean dis- tances. In the former case, unidimensional scaling appeared to be a good model for interpreting the obtained results. Hubert and Steinley [7] applied unidimensional scaling to analyze proximity data on the United States Supreme Court

Transcript of Research Article An Improved Exact Algorithm for Least-Squares Unidimensional...

Page 1: Research Article An Improved Exact Algorithm for Least-Squares Unidimensional …downloads.hindawi.com/journals/jam/2013/890589.pdf · 2019-07-31 · by a 36×36 Morse code dissimilarity

Hindawi Publishing CorporationJournal of Applied MathematicsVolume 2013 Article ID 890589 15 pageshttpdxdoiorg1011552013890589

Research ArticleAn Improved Exact Algorithm for Least-SquaresUnidimensional Scaling

Gintaras Palubeckis

Faculty of Informatics Kaunas University of Technology Studentu 50-408 51368 Kaunas Lithuania

Correspondence should be addressed to Gintaras Palubeckis gintarassoftenktult

Received 16 October 2012 Accepted 31 March 2013

Academic Editor Frank Werner

Copyright copy 2013 Gintaras PalubeckisThis is an open access article distributed under the Creative Commons Attribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

Given n objects and an 119899 times 119899 symmetric dissimilarity matrix D with zero main diagonal and nonnegative off-diagonal entriesthe least-squares unidimensional scaling problem asks to find an arrangement of objects along a straight line such that thepairwise distances between them reflect dissimilarities represented by the matrixD In this paper we propose an improved branch-and-bound algorithm for solving this problem The main ingredients of the algorithm include an innovative upper boundingtechnique relying on the linear assignment model and a dominance test which allows considerably reducing the redundancy inthe enumeration process An initial lower bound for the algorithm is provided by an iterated tabu search heuristic To enhance theperformance of this heuristic we develop an efficient method for exploring the pairwise interchange neighborhood of a solutionin the search space The basic principle and formulas of the method are also used in the implementation of the dominance testWe report computational results for both randomly generated and real-life based problem instances In particular we were able tosolve to guaranteed optimality the problem defined by a 36 times 36Morse code dissimilarity matrix

1 Introduction

Least-squares (or 1198712) unidimensional scaling is an important

optimization problem in the field of combinatorial dataanalysis It can be stated as follows Suppose that there are119899 objects and in addition there are pairwise dissimilaritydata available for them These data form an 119899 times 119899 symmetricdissimilarity matrix 119863 = (119889

119894119895) with zero main diagonal

and nonnegative off-diagonal entries The problem is toarrange the objects along a straight line such that the pairwisedistances between them reflect dissimilarities given by thematrix119863 Mathematically the problem can be expressed as

minΦ (119909) =

119899minus1

sum

119894=1

119899

sum

119895=119894+1

(119889119894119895minus

10038161003816100381610038161003816119909119894minus 119909119895

10038161003816100381610038161003816)

2

(1)

where 119909 denotes the vector (1199091 119909

119899) and 119909

119894 119894 isin 1 119899

is the coordinate for the object 119894 on the line We note that two(or even more) objects may have the same coordinate In theliterature the double sumgiven by (1) is called a stress function[1]

Unidimensional scaling and its generalization multi-dimensional scaling [1 2] are dimensionality reductiontechniques for exploratory data analysis and data mining [3]The applicability of these techniques has been reported ina diversity of fields In recent years unidimensional scalinghas found new applications in several domains includinggenetics psychology and medicine Caraux and Pinloche[4] developed an environment to graphically explore geneexpression data This environment offers unidimensionalscaling methods along with some other popular techniquesNg et al [5] considered the problem of constructing linkagedisequilibrium maps for human genome They proposed amethod for this problem which is centered on the use of aspecific type of the unidimensional scalingmodel Armstronget al [6] presented a study in which they evaluated interestprofile differences across gender racial-ethnic group andemployment statusThe authors used twomeasures of profilesimilarity the so-called 119876 correlation and Euclidean dis-tances In the former case unidimensional scaling appearedto be a good model for interpreting the obtained resultsHubert and Steinley [7] applied unidimensional scaling toanalyze proximity data on the United States Supreme Court

2 Journal of Applied Mathematics

Justices They have found that agreement among the justiceswas better represented by this approach than using a hierar-chical classification method Dahabiah et al [8 9] proposeda data mining technique based on a unidimensional scalingalgorithm They applied this technique on a digestive endo-scope database consisting of a large number of pathologiesGe et al [10] developed a method for unsupervised classifica-tion of remote sensing images Unidimensional scaling playsa pivotal role in this classifier Bihn et al [11] examined thediversity of ant assemblages in the Atlantic Forest of SouthernBrazil To visualize nestedness of ant genera distribution aunidimensional scaling approach was adopted

From (1) we see that in order to solve the unidimensionalscaling problem (UDSP for short) we need to find not only apermutation of objects but also their coordinates preservingthe order of the objects as given by the permutation Thusat first glance it might seem that the UDSP is a much moredifficult problem than other linear arrangement problemsFortunately this is not the case As shown by Defays [12]the UDSP can be reduced to the following combinatorialoptimization problem

max119901isinΠ

119865 (119901) =

119899

sum

119896=1

(

119896minus1

sum

119897=1

119889119901(119896)119901(119897)

minus

119899

sum

119897=119896+1

119889119901(119896)119901(119897)

)

2

(2)

where Π is the set of all permutations of119882 = 1 119899 thatis the set of all vectors (119901(1) 119901(2) 119901(119899)) such that 119901(119896) isin119882 119896 = 1 119899 and 119901(119896) = 119901(119897) 119896 = 1 119899 minus 1 119897 = 119896 +

1 119899 Letting 119878119894 119894 isin 119882 denote the sumsum

119899

119895=1119895 = 119894119889119894119895 we can

write

119865 (119901) =

119899

sum

119896=1

(2

119896minus1

sum

119897=1

119889119901(119896)119901(119897)

minus 119878119901(119896)

)

2

=

119899

sum

119896=1

(2

119899

sum

119897=119896+1

119889119901(119896)119901(119897)

minus 119878119901(119896)

)

2

(3)

Thus theUDSP is essentially a problem over all permutationsof 119899 objects With an optimal solution 119901

lowast to (2) at handwe can compute the optimal coordinates for the objects bysetting 119909

119901lowast(1)

= 0 and using the following relationship [13]for 119896 + 1 = 2 119899

119909119901lowast(119896+1)

= 119909119901lowast(119896)

+ (

119899

sum

119897=119896+1

119889119901lowast(119896)119901lowast(119897)

minus

119896minus1

sum

119897=1

119889119901lowast(119896)119901lowast(119897)

minus

119899

sum

119897=119896+2

119889119901lowast(119896+1)119901

lowast(119897)

+

119896

sum

119897=1

119889119901lowast(119896+1)119901

lowast(119897))119899minus1

(4)

There is an important body of the literature on algorithmsfor solving the UDSP Since the feasible solutions to the prob-lem are linear arrangements of objects a natural approachto the UDSP is to apply a general dynamic programmingparadigm An algorithm based on this paradigm was pro-posed by Hubert and Arabie [14] However as pointed out

in the papers by Hubert et al [15] and Brusco and Stahl[13] the dynamic programming algorithm requires very largememory for storing the partial solutions and in fact runsout of memory long before running out of time As reportedin these papers the largest UDSP instances solved usingdynamic programmingwere of sizes 25 and 26 An alternativeapproach for solving the UDSP is to use the branch-and-bound method The first algorithm following this approachis that of Defays [12] In his algorithm during branching theobjects are assigned in an alternating fashion to either theleftmost free position or the rightmost free position in the119899-permutation under construction For example at level 4of the search tree the four selected objects are 119901(1) 119901(119899)119901(2) and 119901(119899 minus 1) and there are no object assigned topositions 3 119899 minus 2 in the permutation 119901 To fathom partialsolutions the algorithm uses a symmetry test an adjacencytest and a bound test Some more comments on the Defaysapproach were provided in the paper by Brusco and Stahl[13] The authors of that paper also proposed two improvedbranch-and-bound algorithms for unidimensional scalingThe first of them adopts the above-mentioned branchingstrategy borrowed from the Defays method Meanwhile inthe second algorithm of Brusco and Stahl an object chosenfor branching is always assigned to the leftmost free positionBrusco and Stahl [13 16] presented improved boundingprocedures and an interchange test expanding the adjacencytest used by Defays [12] These components of their branch-and-bound algorithms are briefly sketched in Section 2 andSection 4 respectively Brusco and Stahl [13] have reportedcomputational experience for both randomly generated andempirical dissimilarity matrices The results suggest thattheir branch-and-bound implementations are superior to theDefays algorithm Moreover the results demonstrate thatthese implementations are able to solve UDSP instances thatare beyond the capabilities of the dynamic programmingmethod

There are also a number of heuristic algorithms forunidimensional scaling These include the smoothing tech-nique [17] the iterative quadratic assignment heuristic [15]and simulated annealing adaptations [18 19] A heuristicimplementation of dynamic programming for escaping localoptima has been presented by Brusco et al [20] Computa-tional results for problem instances with up to 200 objectswere reported in [18ndash20]

The focus of this paper is on developing a branch-and-bound algorithm for the UDSP The primary intention isto solve exactly some larger problems than those solved byprevious methods The significant features of our algorithmare the use of an innovative upper bounding procedurerelying on the linear assignmentmodel and the enhancementof the interchange test from [13] In the current paper thisenhancement is called a dominance test A good initialsolution and thus lower bound are obtained by running aniterated tabu search algorithm at the root node of the searchtree To make this algorithm more efficient we provide anovel mechanism and underlying formulas for searching thepairwise interchange neighborhood of a solution in 119874(119899

2)

Journal of Applied Mathematics 3

time which is an improvement over a standard 119874(1198993)-time

approach from the literatureThe same formulas are also usedin the dominance test We present the results of empiricalevaluation of our branch-and-bound algorithm and compareit against the better of the two branch-and-bound algorithmsof Brusco and Stahl [13] The largest UDSP instance that wewere able to solvewith our algorithmwas the problemdefinedby a 36 times 36Morse code dissimilarity matrix

The remainder of this paper is organized as follows InSection 2 we present an upper bound on solutions of theUDSP In Section 3 we outline an effective tabu search pro-cedure for unidimensional scaling and propose a method forexploring the pairwise interchange neighborhood Section 4gives a description of the dominance test for pruning nodesin the searchThis test is used in our branch-and-bound algo-rithm presented in Section 5 The results of computationalexperiments are reported in Section 6 Finally Section 7concludes the paper

2 Upper Bounds

In this section we first briefly recall the bounds used byBrusco and Stahl [13] in their second branch-and-boundalgorithm Then we present a new upper bounding mecha-nism for the UDSP We discuss the technical details of theprocedure for computing our bound We choose a variationof this procedure which yields slightly weakened bounds butis quite fast

Like in the second algorithm of Brusco and Stahl in ourimplementation of the branch-and-bound method partialsolutions are constructed by placing each new object in theleftmost free position For a partial solution 119901 = (119901(1)

119901(119903)) 119903 isin 1 119899 minus 1 we denote by 119882(119901) the set of theobjects in 119901 Thus 119882(119901) = 119901(1) 119901(119903) and 119882(119901) =

119882 119882(119901) is the set of unselected objects At the root nodeof the search tree 119901 is empty and 119882(119901) is equal to 119882 Thefirst upper bound used by Brusco and Stahl when adopted to119901 takes the following form

119880BS = 119865119903(119901) + sum

119894isin119882(119901)

1198782

119894 (5)

where 119865119903(119901) denotes the sum of the terms in (3) corre-

sponding to assigned objects Thus 119865119903(119901) = sum

119903

119896=1(119878119901(119896)

minus

2sum119896minus1

119897=1119889119901(119896)119901(119897)

)2

To describe the second bound we define 119863 = (119889119894119895) to

be an 119899 times (119899 minus 1) matrix obtained from 119863 by sorting thenondiagonal entries of each row of119863 in nondecreasing orderLet 120578119894119896

= (119878119894minus 2sum

119896minus1

119895=1

119889119894119895)2 and 120583

119896= max

1⩽119894⩽119899120578119894119896for 119896 =

1 lceil1198992rceil For other values of 119896 120583119896is obtained from the

equality 120583119896= 120583119899minus119896+1

119896 isin 1 119899 The second upper boundproposed in [13] is given by the following expression

119865119903(119901) +

119899

sum

119896=119903+1

120583119896 (6)

The previous outlined bounds are computationally attrac-tive but not strong enough when used to prune the search

space In order to get a tighter upper bound we resort to thelinear assignment model

max119899minus119903

sum

119894=1

119899minus119903

sum

119895=1

119890119894119895119910119894119895

st119899minus119903

sum

119895=1

119910119894119895= 1 119894 = 1 119899 minus 119903

119899minus119903

sum

119894=1

119910119894119895= 1 119895 = 1 119899 minus 119903

119910119894119895isin 0 1 119894 119895 = 1 119899 minus 119903

(7)

where 119890119894119895 119894 119895 = 1 119899 minus 119903 are entries of an (119899 minus 119903) times (119899 minus

119903) profit matrix 119864 with rows and columns indexed by theobjects in the set 119882(119901) Next we will construct the matrix119864 For a permutation 119901 isin Π we denote by 119891

119901(119896)(resp

1198911015840

119901(119896)) the sum of dissimilarities between object 119901(119896) and all

objects preceding (resp following) 119901(119896) in 119901 Thus 119891119901(119896)

=

sum119896minus1

119897=1119889119901(119896)119901(119897)

1198911015840119901(119896)

= sum119899

119897=119896+1119889119901(119896)119901(119897)

From (3) we have

119865 (119901) =

119899

sum

119896=1

(2max (119891119901(119896)

1198911015840

119901(119896)) minus 119878119901(119896)

)

2

(8)

and 2max(119891119901(119896)

1198911015840

119901(119896)) ⩾ 119878

119901(119896)= 119891119901(119896)

+ 1198911015840

119901(119896)for each 119896 =

1 119899 Suppose now that 119901 = (119901(1) 119901(119903)) is a partialsolution andΠ(119901) is the set of all 119899-permutations that can beobtained by extending 119901 Assume for the sake of simplicitythat 119882(119901) = 1 119899 minus 119903 In order to construct a suitablematrix 119864 we have to bound 119891

119901(119896)and 119891

1015840

119901(119896)in (8) from above

To this end for any 119894 isin 119882(119901) let (11988910158401198941 119889

1015840

119894119899minus119903minus1) denote the

list obtained by sorting 119889119894119895 119895 isin 119882(119901) 119894 in nonincreasing

order To get the required bounds we make use of the sums120574119894119895

= sum119895

119897=11198891015840

1198941 119894 isin 119882(119901) 119895 = 0 1 119899 minus 119903 minus 1 and 120588

119894119896=

sum119903

119897=1119889119894119901(119897)

+ 120574119894119896minus119903minus1

119894 isin 119882(119901) 119896 = 119903 + 1 119899 Suppose that119901 is any permutation in Π(119901) and 119901(119896) = 119894 for a position 119896 isin

119903 + 1 119899 Then it is easy to see that 119891119901(119896)

= 119891119894⩽ 120588119894119896and

1198911015840

119901(119896)= 1198911015840

119894⩽ 120574119894119899minus119896

Indeed in the case of 119891119901(119896)

for examplewe have 119891

119901(119896)= sum119903

119897=1119889119894119901(119897)

+ sum119896minus1

119897=119903+1119889119894119901(119897)

and the last term inthis expression is less than or equal to 120574

119894119896minus119903minus1 We define 119864 to

be the matrix with entries 119890119894119895= (2max(120588

119894119903+119895 120574119894119899minus119903minus119895

) minus 119878119894)2

119894 119895 = 1 119899minus119903 Let119880lowast denote the optimal value of the linearassignment problem (LAP) (7) with the profitmatrix119864 Fromthe definition of119864 we immediately obtain an upper bound onthe optimal value of the objective function 119865

Proposition 1 For a partial solution 119901 = (119901(1) 119901(119903))119880 = 119865

119903(119901) + 119880

lowast⩾ max

119901isinΠ(119901)119865(119901)

From the definition of 120588 and 120574 values it can be observedthat with the increase of 119896 120588

119894119896increases while 120574

119894119899minus119896

decreases Therefore when constructing the 119894th row of thematrix 119864 it is expedient to first find the largest 119896 isin 119903 +

1 119899 for which 120588119894119896

⩽ 120574119894119899minus119896

We denote such 119896 by 119896 and

4 Journal of Applied Mathematics

assume that 119896 = 119903 if120588119894119903+1

gt 120574119894119899minus119903minus1

Then 119890119894119896minus119903

= (2120574119894119899minus119896

minus119878119894)2

for 119896 = 119903+1 119896 and 119890

119894119896minus119903= (2120588119894119896minus119878119894)2 for 119896 =

119896+1 119899

The most time-consuming step in the computation of theupper bound 119880 is solving the linear assignment problem Inorder to provide a computationally less expensive boundingmethod we replace 119880

lowast with the value of a feasible solutionto the dual of the LAP (7) To get the bound we take120572119894

= max1⩽119896⩽119899minus119903

119890119894119896

= 119890119894119899minus119903

= 1198782

119894 119894 = 1 119899 minus 119903

120573119895

= max1⩽119894⩽119899minus119903

(119890119894119895minus 120572119894) 119895 = 1 119899 minus 119903 Clearly 120572

119894+

120573119895

⩾ 119890119894119895for each pair 119894 119895 = 1 119899 minus 119903 Thus the vector

(1205721 120572

119899minus119903 1205731 120573

119899minus119903) satisfies the constraints of the dual

of the LAP instance of interest Let = sum119899minus119903

119894=1120572119894+ sum119899minus119903

119895=1120573119895

Then from the duality theory of linear programming we canestablish the following upper bound for 119865(119901)

Proposition 2 For a partial solution 119901 = (119901(1) 119901(119903))1198801015840= 119865119903(119901) + ⩾ max

119901isinΠ(119901)119865(119901)

Proof The inequality follows from Proposition 1 and the factthat ⩾ 119880

lowast

One point that needs to be emphasized is that 1198801015840

strengthens the first bound proposed by Brusco and Stahl

Proposition 3 1198801015840 ⩽ 119880119861119878

Proof Indeed 119865119903(119901) +sum

119894isin119882(119901)1198782

119894= 119865119903(119901) +sum

119899minus119903

119894=1120572119894⩾ 119865119903(119901) +

The equality in the previous chain of expressions comesfrom the choice of 120572

119894and our assumption on119882(119901) whereas

the inequality is due to the fact that 120573119895

⩽ 0 for each 119895 =

1 119899 minus 119903

As an example we consider the following 4 times 4 dissimi-larity matrix borrowed from Defays [12]

[

[

[

[

minus 4 2 3

4 minus 7 2

2 7 minus 4

3 2 4 minus

]

]

]

]

(9)

The same matrix was also used by Brusco and Stahl [13] toillustrate their bounding procedures From Table 2 in theirpaper we find that the solution 119901 = (2 4 1 3) is optimal forthis instance of the UDSP The optimal value is equal to 388We compute the proposed upper bound for the whole Defaysproblem In this case 119882(119901) = 1 4 and 119865

119903(119901) = 0

The matrix 119864 is constructed by noting that 119896 = 2 for each119894 = 1 4 It takes the following form

[

[

[

[

81 25 25 81

169 81 81 169

169 81 81 169

81 25 25 81

]

]

]

]

(10)

Using the expressions for120572119894and120573119895 we find that120572

1= 1205724= 81

1205722= 1205723= 169 120573

1= 1205734= 0 and 120573

2= 1205733= minus56 Thus =

sum4

119894=1120572119894+ sum4

119895=1120573119895= 500 minus 112 = 388 = max

119901isinΠ119865(119901) Hence

our bound for the Defays instance is tight For comparisonthe bound (5) is equal to sum

4

119894=1120572119894= 500 To obtain the second

bound of Brusco and Stahl we first compute the 120583 valuesSince 120583

1= 1205834= 169 and 120583

2= 1205833= 81 the bound (6) (taking

the form sum4

119894=1120583119894) is also equal to 500

3 Tabu Search

As it is well known that the effectiveness of the bound testin branch-and-bound algorithms depends on the value of thebest solution found throughout the search process A goodstrategy is to create a high-quality solution before the searchis started For this task various heuristics can be applied Ourchoice was to use the tabu search (TS) technique It should benoted that it is not the purpose of this paper to investigateheuristic algorithms for solving the UDSP Therefore werestrict ourselves merely to presenting an outline of ourimplementation of tabu search

For 119901 isin Π let 119901119896119898

denote the permutation obtainedfrom 119901 by interchanging the objects in positions 119896 and 119898The set 119873

2(119901) = 119901

119896119898| 119896 119898 = 1 119899 119896 lt 119898 is

the 2-interchange neighborhood of 119901 At each iterationthe tabu search algorithm performs an exploration of theneighborhood 119873

2(119901) A subset of solutions in 119873

2(119901) are

evaluated by computing Δ(119901 119896119898) = 119865(119901119896119898

)minus119865(119901) In orderto get better quality solutions we apply TS iteratively Thistechnique called iterated tabu search (ITS) consists of thefollowing steps

(1) initialize the search with some permutation of theobjects

(2) apply a tabu search procedure Check if the termina-tion condition is satisfied If so then stop Otherwiseproceed to (3)

(3) apply a solution perturbation procedure Return to(2)

In our implementation the search is started from a ran-domly generated permutation Also some data are preparedto be used in the computation of the differences Δ(119901 119896119898)We will defer this issue until later in this section In step (2) ofITS at each iteration of the TS procedure the neighborhood1198732(119901) of the current solution 119901 is explored The procedure

selects a permutation 119901119896119898

isin 1198732(119901) which either improves

the best available objective function value so far or if thisdoes not happen has the largest value of Δ(119901 119894 119895) amongall permutations 119901

119894119895isin 1198732(119901) for which the objects in

positions 119894 and 119895 are not in the tabu list The permutation119901119896119898

is then accepted as a new current solution The numberof iterations in the tabu search phase is bounded aboveby a parameter of the algorithm The goal of the solutionperturbation phase is to generate a permutation which isemployed as the starting point for the next invocation ofthe tabu search procedure This is achieved by taking thecurrent permutation and performing a sequence of pairwiseinterchanges of objects Throughout this process each objectis allowed to change its position in the permutation at mostonce At each step first a set of feasible pairs of permutationpositions with the largest Δ values is formed Then a pairsay (119896119898) is chosen randomly from this set and objects inpositions 119896 and119898 are switchedWe omit some of the details of

Journal of Applied Mathematics 5

the algorithm Similar implementations of the ITS techniquefor other optimization problems are thoroughly describedfor example in [21 22] We should note that also otherenhancements of the generic tabu search strategy could beused for our purposes These include for example multipathadaptive tabu search [23] and bacterial foraging tabu search[24] algorithms

In the rest of this section we concentrate on howto efficiently compute the differences Δ(119901 119896119898) Certainlythis issue is central to the performance of the tabu searchprocedure But more importantly evaluation of pairwiseinterchanges of objects stands at the heart of the dominancetest which is one of the crucial components of our algorithmAs we will see in the next section the computation of the Δvalues is an operation the dominance test is based on

Brusco [19] derived the following formula to calculate thedifference between 119865(119901

119896119898) and 119865(119901)

Δ (119901 119896119898)

= (4

119898

sum

119897=119896+1

119889119901(119896)119901(119897)

)((

119898

sum

119897=119896+1

119889119901(119896)119901(119897)

) + 2119891119901(119896)

minus 119878119901(119896)

)

+ (4

119898minus1

sum

119897=119896

119889119901(119897)119901(119898)

)((

119898minus1

sum

119897=119896

119889119901(119897)119901(119898)

) minus 2119891119901(119898)

+ 119878119901(119898)

)

+

119898minus1

sum

119897=119896+1

4 (119889119901(119897)119901(119898)

minus 119889119901(119896)119901(119897)

) ((119889119901(119897)119901(119898)

minus 119889119901(119896)119901(119897)

)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(11)

Using (11) Δ(119901 119896119898) for all permutations in 1198732(119901) can be

computed in 119874(1198993) time However this approach is actually

too slow to be efficient In order to describe a more elaborateapproach we introduce three auxiliary 119899 times 119899 matrices 119860 =

(119886119894119902)119861 = (119887

119894119902) and119862 = (119888

119894119895) defined for a permutation119901 isin Π

Let us consider an object 119894 = 119901(119904) The entries of the 119894th rowof the matrices 119860 and 119861 are defined as follows if 119902 lt 119904 then119886119894119902= sum119904minus1

119897=119902119889119894119901(119897)

119887119894119902= sum119904minus1

119897=119902119889119894119901(119897)

(119889119894119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

) if 119902 gt 119904then 119886

119894119902= sum119902

119897=119904+1119889119894119901(119897)

119887119894119902= sum119902

119897=119904+1119889119894119901(119897)

(minus119889119894119901(119897)

+2119891119901(119897)

minus119878119901(119897)

)if 119902 = 119904 then 119886

119894119902= 119887119894119902= 0 Turning now to the matrix 119862 we

suppose that 119895 = 119901(119905) Then the entries of 119862 are given by thefollowing formula

119888119894119895=

max(119904119905)minus1sum

119897=min(119904119905)+1119889119894119901(119897)

119889119895119901(119897)

if |119904 minus 119905| gt 1

0 if |119904 minus 119905| = 1

(12)

Unlike 119860 and 119861 the matrix 119862 is symmetric We use thematrices 119860 119861 and 119862 to cast (11) in a form more suitable forboth the TS heuristic and the dominance test

Proposition 4 For 119901 isin Π 119896 isin 1 119899 minus 1 and 119898 isin 119896 +

1 119899

Δ (119901 119896119898) = 4119886119901(119896)119898

(119886119901(119896)119898

+ 2119886119901(119896)1

minus 119878119901(119896)

)

+ 4119886119901(119898)119896

(119886119901(119898)119896

minus 2119886119901(119898)1

+ 119878119901(119898)

)

+ 4 (119887119901(119898)119896+1

minus 119887119901(119896)119898minus1

minus 2119888119901(119896)119901(119898)

)

(13)

Proof First notice that the sums sum119898

119897=119896+1119889119901(119896)119901(119897)

and sum119898minus1

119897=119896

119889119901(119897)119901(119898)

in (11) are equal respectively to 119886119901(119896)119898

and 119886119901(119898)119896

Next it can be seen that 119891

119901(119896)= 119886119901(119896)1

and 119891119901(119898)

= 119886119901(119898)1

Finally the third term in (11) ignoring the factor of 4 can berewritten as119898minus1

sum

119897=119896+1

119889119901(119897)119901(119898)

(119889119901(119897)119901(119898)

+ 2119891119901(119897)

minus 119878119901(119897)

)

minus

119898minus1

sum

119897=119896+1

119889119901(119897)119901(119898)

119889119901(119896)119901(119897)

minus

119898minus1

sum

119897=119896+1

119889119901(119896)119901(119897)

119889119901(119897)119901(119898)

minus

119898minus1

sum

119897=119896+1

119889119901(119896)119901(119897)

(minus119889119901(119896)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(14)

It is easy to see that the first sum in (14) is equal to 119887119901(119898)119896+1

each of the next two sums is equal to 119888

119901(119896)119901(119898) and the last sum

is equal to 119887119901(119896)119898minus1

After simple manipulations we arrive at(13)

Having the matrices 119860 119861 and 119862 and using (13) the 2-interchange neighborhood of 119901 can be explored in 119874(119899

2)

time The question that remains to be addressed is howefficiently the entries of these matrices can be updated afterthe replacement of 119901 by a permutation 119901

1015840 chosen from theneighborhood 119873

2(119901) Suppose that 1199011015840 is obtained from 119901 by

interchanging the objects in positions 119896 and119898 Then we canwrite the following formulas that are involved in updating thematrices 119860 and 119861

119886119894119902= 119886119894119902minus1

+ 1198891198941199011015840(119902) (15)

119886119894119902= 119886119894119902+1

+ 1198891198941199011015840(119902) (16)

119887119894119902= 119887119894119902minus1

+ 1198891198941199011015840(119902)

(21198861199011015840(119902)1

minus 1198781199011015840(119902)

minus 1198891198941199011015840(119902)) (17)

119887119894119902= 119887119894119902+1

+ 1198891198941199011015840(119902)

(21198861199011015840(119902)1

minus 1198781199011015840(119902)

+ 1198891198941199011015840(119902)) (18)

From now on we will assume that the positions 119904 and 119905 of theobjects 119894 and 119895 are defined with respect to the permutation 119901

1015840More formally we have 119894 = 119901

1015840(119904) 119895 = 119901

1015840(119905)The procedure for

updating 119860 119861 and 119862 is summarized in the following

(1) for each object 119894 isin 1 119899 do the following

(11) if 119904 lt 119896 then compute 119886119894119902 119902 = 119896 119898 minus 1 by

(15) and 119887119894119902 119902 = 119896 119899 by (17)

(12) if 119904 gt 119898 then compute 119886119894119902 119902 = 119898 119898minus1 119896+

1 by (16) and 119887119894119902 119902 = 119898119898 minus 1 1 by (18)

(13) if 119896 lt 119904 lt 119898 then compute 119886119894119902by (15) for 119902 =

119898 119899 and by (16) for 119902 = 119896 119896 minus 1 1

6 Journal of Applied Mathematics

(14) if 119904 = 119896 or 119904 = 119898 then first set 119886119894119904

= 0

and afterwards compute 119886119894119902by (15) for 119902 = 119904 +

1 119899 and by (16) for 119902 = 119904 minus 1 119904 minus 2 1(15) if 119896 ⩽ 119904 ⩽ 119898 then first set 119887

119894119904= 0 and afterwards

compute 119887119894119902by (17) for 119902 = 119904+1 119899 and by (18)

for 119902 = 119904 minus 1 119904 minus 2 1

(2) for each pair of objects 119894 119895 isin 1 119899 such that 119904 lt 119905

do the following

(21) if 119904 lt 119896 lt 119905 lt 119898 then set 119888119894119895= 119888119894119895+1198891198941199011015840(119896)1198891198951199011015840(119896)

minus

1198891198941199011015840(119898)

1198891198951199011015840(119898)

(22) if 119896 lt 119904 lt 119898 lt 119905 then set 119888

119894119895= 119888119894119895+

1198891198941199011015840(119898)

1198891198951199011015840(119898)

minus 1198891198941199011015840(119896)1198891198951199011015840(119896)

(23) if |119904 119905 cap 119896119898| = 1 then compute 119888119894119895anew

using (12)(24) if 119888

119894119895has been updated in (21)ndash(23) then set

119888119895119894= 119888119894119895

It can be noted that for some pairs of objects 119894 and 119895

the entry 119888119894119895remains unchanged There are five such cases

(1) 119904 119905 lt 119896 (2) 119896 lt 119904 119905 lt 119898 (3) 119904 119905 gt 119898 (4) 119904 lt 119896 lt 119898 lt 119905(5) 119904 119905 = 119896119898

We now evaluate the time complexity of the describedprocedure Each of Steps (11) to (15) requires 119874(119899) timeThus the complexity of Step (1) is 119874(119899

2) The number of

operations in Steps (21) (22) and (24) is 119874(1) and that inStep (23) is 119874(119899) However the latter is executed only 119874(119899)

timesTherefore the overall complexity of Step (2) and henceof the entire procedure is 119874(119899

2)

Formula (13) and the previous described procedure forupdating matrices 119860 119861 and 119862 lie at the core of ourimplementation of the TS method The matrices 119860 119861 and 119862

can be initialized by applying the formulas (15)ndash(18) and (12)with respect to the starting permutationThe time complexityof initialization is 119874(119899

2) for 119860 and 119861 and 119874(119899

3) for 119862 The

auxiliary matrices 119860 119861 and 119862 and the expression (13) forΔ(119901 119896119898) are also used in the dominance test presented inthe next section This test plays a major role in pruning thesearch tree during the branching and bounding process

4 Dominance Test

As is well known the use of dominance tests in enumerativemethods allows reducing the search space of the optimizationproblem of interest A comprehensive review of dominancetests (or rules) in combinatorial optimization can be found inthe paper by Jouglet and Carlier [25] For the problem we aredealing with in this study Brusco and Stahl [13] proposed adominance test for discarding partial solutions which relieson performing pairwise interchanges of objects To formulateit let us assume that 119901 = (119901(1) 119901(119903)) 119903 lt 119899 is a partialsolution to the UDSP The test checks whether Δ(119901 119896 119903) gt 0

for at least one of the positions 119896 isin 1 119903 minus 1 If thiscondition is satisfied then we say that the test fails In thiscase 119901 is thrown away Otherwise 119901 needs to be extendedby placing an unassigned object in position 119903 + 1 The firstcondition in our dominance test is the same as in the test

suggested by Brusco and Stahl To compute Δ(119901 119896 119903) weuse subsets of entries of the auxiliary matrices 119860 119861 and119862 defined in the previous section Since only the sign ofΔ(119901 119896 119903) matters the factor of 4 can be ignored in (13) andthus Δ

1015840(119901 119896 119903) = Δ(119901 119896 119903)4 instead of Δ(119901 119896 119903) can be

considered From (13) we obtain

Δ1015840(119901 119896 119903) = 119886

119901(119896)119903(119886119901(119896)119903

+ 2119886119901(119896)1

minus 119878119901(119896)

)

+ 119886119901(119903)119896

(119886119901(119903)119896

minus 2119886119901(119903)1

+ 119878119901(119903)

)

+ 119887119901(119903)119896+1

minus 119887119901(119896)119903minus1

minus 2119888119901(119896)119901(119903)

(19)

We also take advantage of the case where Δ1015840(119901 119896 119903) =

0 We use the lexicographic rule for breaking ties The testoutputs a ldquoFailrdquo verdict if the equality Δ1015840(119901 119896 119903) = 0 is foundto hold true for some 119896 isin 1 119903 minus 1 such that 119901(119896) gt 119901(119903)Indeed 119901 = (119901(1) 119901(119896 minus 1) 119901(119896) 119901(119896 + 1) 119901(119903))

can be discarded because both 119901 and partial solution 1199011015840=

(119901(1) 119901(119896 minus 1) 119901(119903) 119901(119896 + 1) 119901(119896)) are equally goodbut 119901 is lexicographically larger than 119901

1015840We note that the entries of the matrices 119860 119861 and 119862 used

in (19) can be easily obtained from a subset of entries of119860 119861and 119862 computed earlier in the search process Consider forexample 119888

119901(119896)119901(119903)in (19) It can be assumed that 119896 isin 1 119903minus

2 because if 119896 = 119903 minus 1 then 119888119901(119896)119901(119903)

= 0 For simplicity let119901(119903) = 119894 and 119901(119903 minus 1) = 119895 To avoid ambiguity let us denoteby 1198621015840 the matrix 119862 that is obtained during processing of the

partial solution (119901(1) 119901(119903 minus 2)) The entry 1198881015840

119901(119896)119894of 1198621015840 is

calculated by (temporarily) placing the object 119894 in position119903 minus 1 Now referring to the definition of 119862 it is easy to seethat 119888

119901(119896)119894= 1198881015840

119901(119896)119894+ 119889119901(119896)119895

119889119894119895 Similarly using formulas of

types (15) and (17) we can obtain the required entries of thematrices 119860 and 119861 Thus (19) can be computed in constanttime The complexity of computing Δ

1015840(119901 119896 119903) for all 119896 is

119874(119899) The dominance test used in [13] relies on (11) and hascomplexity 119874(119899

2) So our implementation of the dominance

test is significantly more time-efficient than that proposed byBrusco and Stahl

In order to make the test even more efficient we inaddition decided to evaluate those partial solutions thatcan be obtained from 119901 by relocating the object 119901(119903) fromposition 119903 to position 119896 isin 1 119903 minus 2 (here position119896 = 119903 minus 1 is excluded since in this case relocating wouldbecome a pairwise interchange of objects) Thus the testcompares 119901 against partial solutions 119901

119896= (119901(1) 119901(119896 minus

1) 119901(119903) 119901(119896) 119901(119903 minus 1)) 119896 = 1 119903 minus 2 Now supposethat 119901 and 119901

119896are extended to 119899-permutations 119901 isin Π and

respectively 119901119896isin Π such that for each 119897 isin 119903 + 1 119899 the

object in position 119897 of 119901 coincides with that in position 119897 of 119901119896

Let us define 120575119903119896to be the difference between 119865(119901

119896) and 119865(119901)

This difference can be computed as follows

120575119903119896

= (4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)((

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

) minus 2119891119901(119903)

+ 119878119901(119903)

)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(20)

Journal of Applied Mathematics 7

The previous expression is similar to the formula (6) in [19]For completeness we give its derivation in the Appendix Wecan rewrite (20) using the matrices 119860 and 119861

Proposition 5 For 119901 isin Π 119896 isin 1 119899 minus 1 and 119903 isin 119896 +

1 119899

120575119903119896

= 4119886119901(119903)119896

(119886119901(119903)119896

minus 2119886119901(119903)1

+ 119878119901(119903)

) + 4119887119901(119903)119896

(21)

The partial solution 119901 is dominated by 119901119896if 120575119903119896

gt 0Upon establishing this fact for some 119896 isin 1 119903 minus 2 thetest immediately stops with the rejection of 119901 Of course likein the case of (13) a factor of 4 in (21) can be dropped Thecomputational overhead of including the condition 120575

119903119896gt 0

in the test is minimal because all the quantities appearing onthe right-hand side of (21) except 119887

119901(119903)1 are also used in the

calculation ofΔ1015840(119901 119896 119903) (see (19)) Since the loop over 119896needsto be executed the time complexity of this part of the test is119874(119899) Certainly evaluation of relocating the last added objectin 119901 gives more power to the test Thus our dominance testis an improvement over the test described in the Brusco andStahl [13] paper

5 A Branch-and-Bound Algorithm

Our approach to the UDSP is to start with a good initial solu-tion provided by the iterated tabu search heuristic invokedin the initialization phase of the algorithm and then proceedwith the main phase where using the branch-and-boundtechnique a search tree is built For the purpose of pruningbranches of the search tree the algorithm involves three tests(1) the upper bound test (based on 119880

1015840) (2) the dominancetest (3) the symmetry testThe first two of them have alreadybeen described in the prior sections The last one is the mostsimple and cheap to apply It hinges on the fact that the valuesof the objective function 119865 at a permutation 119901 isin Π and itsreverse 119901 defined by 119901(119894) = 119901(119899 minus 119894 + 1) 119894 = 1 119899 arethe same Therefore it is very natural to demand that somefixed object in 119882 say 119908 being among those objects that areassigned to the first half of the positions in 119901 More formallythe condition is 119894 ⩽ lceil1198992rceil where 119894 is such that 119901(119894) = 119908If this condition is violated for 119901 = (119901(1) 119901(lceil1198992rceil)) then the symmetry test fails and 119901 is discarded Because ofthe requirement to fulfill the previous condition some careshould be taken in applying the dominance test In the case ofΔ1015840(119901 119896 119903) = 0 and 119901(119896) gt 119901(119903) (see the paragraph following

(19)) this test in our implementation of the algorithm gives aldquoFailrdquo verdict only if either 119903 ⩽ lceil1198992rceil or 119901(119896) = 119908

Through the preliminary computational experimentswe were convinced that the upper bound test was quiteweak when applied to smaller partial solutions In otherwords the time overhead for computing linear assignment-based bounds did not pay off for partial permutations 119901 =

(119901(1) 119901(119903)) of length 119903 that is small or modest comparedto 119899 Therefore it is reasonable to compute the upper bound1198801015840 (described in Section 2) only when 119903 ⩾ 120582(119863) where

120582(119863) is a threshold value depending on the UDSP instancematrix 119863 and thus also on 119899 The problem we face is howto determine an appropriate value for 120582(119863) We combat this

problem with a simple procedure consisting of two steps Inthe first step it randomly generates a relatively small numberof partial permutations 119901

1 119901

120591 each of length lfloor1198992rfloor For

each permutation in this sample the procedure computes theupper bound 119880

1015840 Let 11988010158401 119880

1015840

120591be the values obtained We

are interested in the normalized differences between thesevalues and 119865(119901heur) where 119901heur is the solution delivered bythe ITS algorithm So the procedure calculates the sum 119885 =

sum120591

119894=1100(119880

1015840

119894minus 119865(119901heur))119865(119901heur) and the average 119877 = 119885120591 of

such differences In the second step using119877 it determines thethreshold value This is done in the following way if 119877 lt 119877

1

then 120582(119863) = lfloor1198993rfloor if 1198771⩽ 119877 lt 119877

2 then 120582(119863) = lfloor1198992rfloor if 119877 gt

1198772 then 120582(119863) = lfloor31198994rfloor The described procedure is executed

at the initialization step of the algorithm Its parameters are1198771 1198772 and the sample size 120591

Armed with tools for pruning the search space we canconstruct a branch-and-bound algorithm for solving theUDSP In order to make the description of the algorithmmore readable we divide it into two parts namely themain procedure BampB (branch-and-bound) and an auxiliaryprocedure SelectObject The former builds a search tree withnodes representing partial solutionsThe task of SelectObjectis either to choose an object which can be used to extendthe current permutation or if the current node becomesfathomed to perform a backtracking step In the descriptionof the algorithm given in the following the set of objects thatcan be used to create new branches is denoted by119881Themainprocedure goes as follows

BampB

(1) Apply the ITS algorithm to get an initial solution 119901lowast

to a given instance of the UDSP

Set 119865lowast = 119865(119901lowast)

(2) Calculate 120582(119863)(3) Initialize the search tree by creating the root node

with the empty set 119881 attached to it

Set 119871 = 0 (119871 denotes the tree level the currentlyconsidered node belongs to)

(4) Set 119903 = 119871

(5) Call SelectObject(119903) It returns 119871 and if 119871 = 119903additionally some unassigned object (let it be denotedby V) If 119871 lt 0 then go to (7) Otherwise checkwhether 119871 lt 119903 If so then return to (4) (perform abacktracking step) If not (in which case 119871 = 119903) thenproceed to (6)

(6) Create a new node with empty 119881 Set 119901(119903 + 1) = VIncrement 119871 by 1 and go to (4)

(7) Calculate the coordinates for the objects by applyingformula (4) with respect to the permutation 119901

lowast andstop

Steps (1) and (2) of BampB comprise the initializationphase of the algorithm The initial solution to the problemis produced using ITS This solution is saved as the bestsolution found so far denoted by 119901lowast If later a better solutionis found 119901lowast is updated This is done within the SelectObject

8 Journal of Applied Mathematics

procedure In the final step BampB uses 119901lowast to calculate the

optimal coordinates for the objects The branch-and-boundphase starts at Step (3) of BampB Each node of the search treeis assigned a set 119881 The partial solution corresponding to thecurrent node is denoted by 119901 The objects of 119881 are used toextend 119901 Upon creation of a node (Step (3) for root andStep (6) for other nodes) the set 119881 is empty In SelectObject119881 is filled with objects that are promising candidates toperform the branching operation on the corresponding nodeSelectObject also decides which action has to be taken nextIf 119871 = 119903 then the action is ldquobranchrdquo whereas if 119871 lt 119903 theaction is ldquobacktrackrdquo Step (4) helps to discriminate betweenthese two cases If 119871 = 119903 then a new node in the nextlevel of the search tree is generated At that point the partialpermutation 119901 is extended by assigning the selected object Vto the position 119903 + 1 in119901 (see Step (6)) If however119871 lt 119903 and 119903is positive then backtracking is performed by repeating Steps(4) and (5) We thus see that the search process is controlledby the procedure SelectObject This procedure can be statedas follows

SelectObject(119903)

(1) Check whether the set119881 attached to the current nodeis empty If so then go to (2) If not then consider itssubset 1198811015840 = 119894 isin 119881 | 119894 is unmarked and 119906

119894(119901) gt

119865lowast If 1198811015840 is nonempty then mark an arbitrary object

V isin 1198811015840 and return with it as well as with 119871 = 119903

Otherwise go to (6)(2) If 119903 gt 0 then go to (3) Otherwise set 119881 = 1 119899

119906119894(119901) = infin 119894 = 1 119899 and go to (4)

(3) If 119903 = 119899 minus 1 then go to (5) Otherwise for each 119894 isin

119882(119901) perform the following operations Temporarilyset 119901(119903 + 1) = 119894 Apply the symmetry test thedominance test and if 119903 ⩾ 120582(119863) also the bound testto the resulting partial solution 119901

1015840= (119901(1) 119901(119903 +

1)) If it passes all the tests then do the followingInclude 119894 into 119881 If 119903 ⩾ 120582(119863) assign to 119906

119894(119901) the value

of the upper bound 1198801015840 computed for the subproblem

defined by the partial solution 1199011015840 If however 119903 lt

120582(119863) then set 119906119894(119901) = infin If after processing all

objects in 119882(119901) the set 119881 remains empty then go to(6) Otherwise proceed to (4)

(4) Mark an arbitrary object V isin 119881 and return with it aswell as with 119871 = 119903

(5) Let 119882(119901) = 119895 Calculate the value of the solution119901 = (119901(1) 119901(119899 minus 1) 119901(119899) = 119895) If 119865(119901) gt 119865

lowast thenset 119901lowast = 119901 and 119865

lowast= 119865(119901)

(6) If 119903 gt 0 then drop the 119903th element of 119901 Return with119871 = 119903 minus 1

In the previous description 119882(119901) denotes the set ofunassigned objects as before The objects included in 119881 sube

119882(119901) are considered as candidates to be assigned to theleftmost available position in 119901 At the root node 119881 = 119882(119901)In all other cases except when |119882(119901)| = 1 an object becomesa member of the set 119881 only if it passes two or for longer 119901three tests The third of them the upper bound test amounts

to check if 1198801015840 gt 119865lowast This condition is checked again on

the second and subsequent calls to SelectObject for the samenode of the search tree This is done in Step (1) for eachunmarked object 119894 isin 119881 Such a strategy is reasonable becauseduring the branch-and-bound process the value of 119865lowast mayincrease and the previously satisfied constraint1198801015840 gt 119865

lowast maybecome violated for some 119894 isin 119881 To be able to identify suchobjects the algorithm stores the computed upper bound 119880

1015840

as 119906119894(119901) at Step (3) of SelectObjectA closer look at the procedure SelectObject shows that

two scenarios are possible (1)when this procedure is invokedfollowing the creation of a newnode of the tree (which is doneeither in Step (3) or in Step (6) of BampB) (2)when a call to theprocedure ismade after a backtracking stepThe set119881 initiallyis empty and therefore Steps (2) to (5) of SelectObject becomeactive only in the first of these two scenarios Dependingon the current level of the tree 119903 one of the three cases isconsidered If the current node is the root that is 119903 = 0 thenStep (2) is executed Since in this case the dominance test isnot applicable the algorithm uses the set119881 comprising all theobjects under consideration An arbitrary object selected inStep (4) is then used to create the first branch emanating fromthe root of the tree If 119903 isin [1 119899 minus 2] then the algorithmproceeds to Step (3) The task of this step is to evaluate allpartial solutions that can be obtained from the current partialpermutation119901by assigning an object from119882(119901) to the (119903 + 1)

th position in 119901 In order not to lose at least one optimalsolution it is sufficient to keep only those extensions of 119901 forwhich all the tests were successful Such partial solutions arememorized by the set 119881 One of them is chosen in Step (4)

to be used (in Step (6) of BampB) to generate the first son ofthe current node If however no promising partial solutionhas been found and thus the set 119881 is empty then the currentnode can be discarded and the search continued from itsparent node Finally if 119903 = 119899 minus 1 then Step (5) is executedin which the permutation 119901 is completed by placing theremaining unassigned object in the last position of119901 Its valueis compared with that of the best solution found so far Thenode corresponding to 119901 is a leaf and therefore is fathomedIn the second of the above-mentioned two scenarios (Step(1) of the procedure) the already existing nonempty set 119881is scanned The previously selected objects in 119881 have beenmarked They are ignored For each remaining object 119894 it ischecked whether 119906

119894(119901) gt 119865

lowast If so then 119894 is included inthe set 1198811015840 If the resulting set 1198811015840 is empty then the currentnode of the tree becomes fathomed and the procedure goesto Step (6) Otherwise the procedure chooses a branchingobject and passes it to its caller BampB Our implementationof the algorithm adopts a very simple enumeration strategybranch upon an arbitrary object in 119881

1015840 (or 119881 when Step (4) isperformed)

It is useful to note that the tests in Step (3) are applied inincreasing order of time complexityThe symmetry test is thecheapest of them whereas the upper bound test is the mostexpensive to perform but still effective

6 Computational Results

The main goal of our numerical experiments was to pro-vide some measure of the performance of the developed

Journal of Applied Mathematics 9

BampB algorithm as well as to demonstrate the benefits ofusing the new upper bound and efficient dominance testthroughout the solution process For comparison purposeswe also implemented the better of the two branch-and-bound algorithms of Brusco and Stahl [13] In their studythis algorithm is denoted as BB3 When referring to it inthis paper we will use the same name In order to make thecomparison more fair we decided to start BB3 by invokingthe ITS heuristic Thus the initial solution from which thebranch-and-bound phase of each of the algorithms starts isexpected to be of the same quality

61 Experimental Setup Both our algorithm and BB3 havebeen coded in the C programming language and all the testshave been carried out on a PC with an Intel Core 2 DuoCPU running at 30GHz As a testbed for the algorithmsthree empirical dissimilarity matrices from the literature inaddition to a number of randomly generated ones wereconsidered The random matrices vary in size from 20 to30 objects All off-diagonal entries are drawn uniformly atrandom from the interval [1 100] The matrices of courseare symmetric

We also used three dissimilarity matrices constructedfrom empirical data found in the literature The first matrixis based onMorse code confusion data collected by Rothkopf[26] In his experiment the subjects were presented pairsof Morse code signals and asked to respond whether theythought the signals were the same or differentThese answerswere translated to the entries of the similarity matrix withrows and columns corresponding to the 36 alphanumericcharacters (26 letters in the Latin alphabet and 10 digits in thedecimal number system) We should mention that we foundsmall discrepancies between Morse code similarity matricesgiven in various publications and electronic media We havedecided to take the matrix from [27] where it is presentedwith a reference to Shepard [28] Let thismatrix be denoted by119872 = (119898

119894119895) Its entries are integers from the interval [0 100]

Like in the study of Brusco and Stahl [13] the correspondingsymmetric dissimilarity matrix 119863 = (119889

119894119895) was obtained by

using the formula 119889119894119895= 200 minus 119898

119894119895minus 119898119895119894for all off-diagonal

entries and setting the diagonal of 119863 to zero We consideredtwoUDSP instances constructed usingMorse code confusiondatamdashone defined by the 26 times 26 submatrix of119863 with rowsand columns labeled by the letters and another defined by thefull matrix119863

The second empirical matrix was taken from Groenenand Heiser [29] We denote it by 119866 = (119892

119894119895) Its entry 119892

119894119895

gives the number of citations in journal 119894 to journal 119895 Thusthe rows of 119866 correspond to citing journals and the columnscorrespond to cited journals The dissimilarity matrix 119863 wasderived from 119866 in two steps like in [13] First a symmetricsimilarity matrix 119867 = (ℎ

119894119895) was calculated by using the

formulas ℎ119894119895

= ℎ119895119894

= (119892119894119895sum119899

119897=1l = 119894 119892119894119897) + (119892119895119894sum119899

119897=1119897 = 119895119892119895119897)

119894 119895 = 1 119899 119894 lt 119895 and ℎ119894119894

= 0 119894 = 1 119899 Then 119863

was constructed with entries 119889119894119895

= lfloor100(max119897119896ℎ119897119896

minus ℎ119894119895)rfloor

119894 119895 = 1 119899 119894 = 119895 and 119889119894119894= 0 119894 = 1 119899

The third empirical matrix used in our experimentswas taken from Hubert et al [30] Its entries represent the

dissimilarities between pairs of food items As mentioned byHubert et al [30] this 45 times 45 food-item matrix 119863foodwas constructed based on data provided by Ross andMurphy[31] For our computational tests we considered submatricesof 119863food defined by the first 119899 isin 20 21 35 food itemsAll the dissimilarity matrices we just described as well asthe source code of our algorithm are publicly available athttpwwwproinktultsimgintarasudshtml

Based on the results of preliminary experiments with theBampB algorithm we have fixed the values of its parameters120591 = 10 119877

1= 0 and 119877

2= 5 In the symmetry test119908 was fixed

at 1 The cutoff time for ITS was 1 second

62 Results The first experiment was conducted on a seriesof randomly generated full density dissimilarity matricesThe results of BampB and BB3 for them are summarized inTable 1 The dimension of the matrix 119863 is encoded in theinstance name displayed in the first column The secondcolumn of the table shows for each instance the optimalvalue of the objective function 119865

lowast and in parentheses tworelated metrics The first of them is the raw value of thestress function Φ(119909) The second number in parentheses isthe normalized value of the stress function that is calculatedasΦ(119909)sum

119894lt119895119889119894119895 The next two columns list the results of our

algorithm the number of nodes in the search tree and theCPU time reported in the form hours minutes seconds orminutes seconds or seconds The last two columns give thesemeasures for BB3

As it can be seen from the table BampB is superior to BB3We observe for example that the decrease in CPU time overBB3 is around 40 for the largest three instances Also ouralgorithm builds smaller search trees than BB3

In our second experiment we opt for three empiricaldissimilarity matrices mentioned in Section 61 Table 2 sum-marizes the results of BampB and BB3 on the UDSP instancesdefined by these matrices Its structure is the same as that ofTable 1 In the instance names the letter ldquoMrdquo stands for theMorse codematrix ldquoJrdquo stands for the journal citationsmatrixand ldquoFrdquo stands for the food item data As before the numberfollowing the letter indicates the dimension of the problemmatrix Comparing results in Tables 1 and 2 we find thatfor empirical data the superiority of our algorithm over BB3is more pronounced Basically BampB performs significantlybetter than BB3 for each of the three data sets For examplefor the food-item dissimilarity matrix of size 30 times 30 BampBbuilds almost 10 times smaller search tree and is about 5 timesfaster than BB3

Table 3 presents the results of the performance of BampBon the full Morse code dissimilarity matrix as well as onlarger submatrices of the food-item dissimilarity matrixRunning BB3 on these UDSP instances is too expensive sothis algorithm is not included in the table By analyzing theresults in Tables 2 and 3 we find that for 119894 isin 22 35 theCPU time taken by BampB to solve the problem with the 119894 times 119894

food-item submatrix is between 196 (for 119894 = 30) and 282 (for119894 = 27) percent of that needed to solve the problem with the(119894 minus1)times (119894minus1) food-item submatrix For the submatrix of size35 times 35 the CPU time taken by the algorithm was more than

10 Journal of Applied Mathematics

Table 1 Performance on random problem instances

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

p20 9053452 (1797554 0284) 1264871 44 1526725 56p21 9887394 (1879277 0285) 3304678 106 4169099 145p22 12548324 (2245406 0282) 6210311 205 8017112 290p23 14632006 (2396198 0274) 12979353 450 16624663 1067p24 16211210 (2815859 0294) 31820463 1533 42435561 2555p25 15877026 (3126700 0330) 103208158 6446 137525505 9524p26 20837730 (3471478 0302) 200528002 13354 267603807 20436p27 21464870 (3585958 0311) 287794935 20377 392405143 32365p28 23938264 (3961360 0317) 581602877 44093 812649051 113054p29 28613820 (4537786 0315) 2047228373 250572 3044704554 438570p30 30359134 (4645569 0315) 3555327537 509099 5216804247 855046

Table 2 Performance on empirical dissimilarity matrices

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

M26 180577772 (19448891 0219) 108400003 7432 201264785 16143J28 60653172 (7167033 0249) 2019598593 348006 6780517853 1039066F20 19344116 (1051432 0098) 236990 27 1971840 70F21 22694178 (1229979 0102) 568998 52 4716777 162F22 26496098 (1495949 0110) 1178415 106 9993241 357F23 30853716 (1757681 0116) 2524703 252 24357098 1378F24 35075372 (2174712 0130) 6727692 1001 51593000 3304F25 40091544 (2491002 0134) 14560522 2283 139580891 10014F26 45507780 (2886048 0142) 40465339 6553 298597411 23188F27 51859210 (3336825 0148) 116161004 19298 670492972 55578F28 58712130 (3788642 0153) 253278124 44358 1366152797 203265F29 66198704 (4203981 0156) 530219761 142265 3616257778 532084F30 74310052 (4629353 0157) 972413816 320566 9360343618 1556412

14 days In fact a solution of value 120526370was found by theITS heuristic in less than 1 sThe rest of the time was spent onproving its optimalityWe also see fromTable 3 that theUDSPwith the full Morse code dissimilarity matrix required nearly21 days to solve to proven optimality using our approach

In Table 4 we display optimal solutions for the twolargest UDSP instances we have tried The second and fourthcolumns of the table contain the optimal permutations forthese instances The coordinates for objects calculated using(4) are given in the third and fifth columns The number inparenthesis in the fourth column shows the order in whichthe food items appear within Table 51 in the book by Hubertet al [30] From Table 4 we observe that only a half of thesignals representing digits in theMorse code are placed at theend of the scale Other such signals are positioned betweensignals encoding letters We also see that shorter signals tendto occupy positions at the beginning of the permutation 119901

In Table 5 we evaluate the effectiveness of tests used tofathom partial solutions Computational results are reportedfor a selection of UDSP instances The second third andfourth columns of the table present 100119873sym119873 100119873dom119873and 100119873UB119873 where119873sym (respectively119873dom and119873UB) isthe number of partial permutations that are fathomed due

to symmetry test (respectively due to dominance test anddue to upper bound test) in Step (3) of SelectObject and119873 = 119873sym+119873dom+119873UBThe last three columns give the sameset of statistics for the BB3 algorithm Inspection of Table 5reveals that119873dom is much larger than both119873sym and119873UB Itcan also be seen that119873dom values are consistently higher withBB3 thanwhen using our approach Comparing the other twotests we observe that 119873sym is larger than 119873UB in all casesexcept when applying BampB to the food-item dissimilaritysubmatrices of size 20 times 20 and 25 times 25 Generally we noticethat the upper bound test is most successful for the UDSPinstances defined by the food-item matrix

We end this section with a discussion of two issuesregarding the use of heuristic algorithms for the UDSP Froma practical point of view heuristics of course are preferableto exact methods as they are fast and can be applied to largeproblem instances If carefully designed they are able toproduce solutions of very high quality The development ofexact methods on the other hand is a theoretical avenue forcoping with different problems In the field of optimizationexact algorithms for a problem of interest provide not only asolution but also a certificate of its optimality Therefore oneof possible applications of exact methods is to use them in

Journal of Applied Mathematics 11

Table 3 Performance of BampB on larger empirical dissimilarity matrices

Instance Objective function value Number of nodes TimeF31 82221440 (5194455 0164) 2469803824 857038F32 91389180 (5703241 0166) 5040451368 1944127F33 100367800 (6412395 0174) 14318454604 5524200F34 110065196 (7104054 0180) 33035826208 12936500F35 120526370 (7841564 0185) 82986984742 34256456M36 489111494 (40682956 0230) 243227368848 49451175

Table 4 Optimal solutions for the full Morse code dissimilarity matrix and the 35 times 35 food-item dissimilarity matrix

119894

Morse code full Food item data119901(119894) 119909

119901(119894)119901(119894) 119909

119901(119894)

1 (E) ∙ 000000 orange (3) 0000002 (T) minus 583333 watermelon (2) 0285713 (I) ∙∙ 2477778 apple (1) 0285714 (A) ∙minus 3408333 banana (4) 1285715 (N) minus∙ 4411111 pineapple (5) 1628576 (M) minusminus 5233333 lettuce (6) 21257147 (S) ∙ ∙ ∙ 7030556 broccoli (7) 22314298 (U) ∙ ∙ minus 8313889 carrots (8) 22542869 (R) ∙ minus ∙ 9347222 corn (9) 228857110 (W) ∙ minus minus 10297222 onions (10) 251142911 (H) ∙ ∙ ∙∙ 11444444 potato (11) 299714312 (D) minus ∙ ∙ 12430556 rice (12) 516285713 (K) minus ∙ minus 13152778 spaghetti (19) 589714314 (V) ∙ ∙ ∙minus 14500000 bread (13) 644857115 (5) ∙ ∙ ∙ ∙ ∙ 15244444 bagel (14) 698571416 (4) ∙ ∙ ∙ ∙ minus 15991667 cereal (16) 721142917 (F) ∙ ∙ minus∙ 17075000 oatmeal (15) 725142918 (L) ∙ minus ∙∙ 18225000 pancake (18) 763714319 (B) minus ∙ ∙∙ 18952778 muffin (17) 778000020 (X) minus ∙ ∙minus 19955556 crackers (20) 889714321 (6) minus ∙ ∙ ∙ ∙ 20566667 granola bar (21) 930000022 (3) ∙ ∙ ∙ minus minus 22013889 pretzels (22) 1007428623 (C) minus ∙ minus∙ 22947222 nuts (24) 1043714324 (Y) minus ∙ minusminus 23841667 popcorn (23) 1076285725 (7) minus minus ∙ ∙ ∙ 24930556 potato chips (25) 1124857126 (Z) minus minus ∙∙ 25488889 doughnuts (26) 1200857127 (Q) minus minus ∙minus 26438889 pizza (31) 1265142928 (P) ∙ minus minus∙ 27083333 cookies (27) 1368857129 (J) ∙ minus minusminus 28294444 chocolate bar (29) 1394000030 (G) minus minus ∙ 29247222 cake (28) 1410571431 (O) minus minus minus 30022222 pie (30) 1435428632 (2) ∙ ∙ minus minus minus 31050000 ice cream (32) 1520000033 (8) minus minus minus ∙ ∙ 32036111 yogurt (33) 1571142934 (1) ∙ minus minus minus minus 33350000 butter (34) 1618000035 (9) minus minus minus minus ∙ 34186111 cheese (35) 1650857136 (0) minus minus minus minus minus 35027778

12 Journal of Applied Mathematics

Table 5 Relative performance of the tests

Instance B amp B BB3Sym Dom UB Sym Dom UB

p20 1158a 8824b 018c 1111a 8873b 016c

p25 886a 9087b 027c 848a 9143b 009c

p30 701a 9284 b 015c 629a 9364b 007c

M26 1140a 8847 b 013c 1044a 8952b 004c

J28 855a 8923 b 222c 568 a 9426b 006c

F20 801a 7563 b 1636c 1325a 8542b 133c

F25 704a 8127 b 1169c 1031a 8939b 030c

F30 895a 8391b 714c 774a 9216b 010c

F35 853a 8723b 424c mdasha mdashb mdashc

M36 882a 9114b 004c mdasha mdashb mdashc

aPercentage of partial solutions fathomed due to symmetry test

bPercentage of partial solutions fathomed due to dominance testcPercentage of partial solutions fathomed due to bound test

the process of assessment of heuristic techniques Specificallyoptimal values can serve as a reference point with which theresults of heuristics can be compared However obtaining anoptimality certificate in many cases including the UDSP isvery time consuming In order to see the time differencesbetween finding an optimal solution for UDSP with the helpof heuristics and proving its optimality (as it is done bythe branch-and-bound method described in this paper) werun the two heuristic algorithms iterated tabu search andsimulated annealing (SA) of Brusco [19] on all the probleminstances used in the main experiment For two randominstances namely p21 and p25 SA was trapped in a localmaximumTherefore we have enhanced our implementationof SA by allowing it to run in a multi start mode Thisvariation of SA located an optimal solution for the above-mentioned instances in the third and respectively secondrestarts Both tested heuristics have proven to be very fastIn particular ITS took in total only 295 seconds to findthe global optima for all 30 instances we experimented withPerforming the same task by SA required 544 seconds Suchtimes are in huge contrast to what is seen in Tables 1ndash3 Basically the long computation times recorded in thesetables are the price we have to pay for obtaining not only asolution but also a guarantee of its optimality We did notperform additional computational experiments with ITS andSA because investigation of heuristics for theUDSP is outsidethe scope of this paper

Another issue concerns the use of heuristics for providinginitial solutions for the branch-and-bound technique Wehave employed the iterated tabu search procedure for thispurpose Our choice was influenced by the fact that wecould adopt in ITS the same formulas as those used inthe dominance test which is a key component of ourapproach Obviously there are other heuristics one canapply in place of ITS As it follows from the discussionin the preceding paragraph a good candidate for this isthe simulated annealing algorithm of Brusco [19] Supposethat BampB is run twice on the same instance from differentinitial solutions both of which are optimal Then it shouldbe clear that in both cases the same number of tree nodes

is explored This essentially means that any fast heuristiccapable of providing optimal solutions is perfect to be usedin the initialization step of BampB An alternative strategyis to use in this step randomly generated permutationsThe effects of this simple strategy are mixed For examplefor problem instances F25 and F27 the computation timeincreased from 1483 and respectively 11698 seconds (seeTable 2) to 4133 and respectively 20625 seconds Howeverfor example for p25 and M26 a somewhat different picturewas observed In the case of random initial solution BampBtook 4075 and 4687 seconds for these instances respectivelyThe computation times in Table 1 for p25 and Table 2 forM26are only marginally shorter One possible explanation of thislies in the fact that as Table 5 indicates the percentage ofpartial solutions fathomed due to bound test for both p25 andM26 is quite low However in general it is advantageous touse a heuristic for obtaining an initial solution for the branch-and-bound algorithm especially because this solution can beproduced at a negligible cost with respect to the rest of thecomputation

7 Conclusions

In this paper we have presented a branch-and-bound algo-rithm for the least-squares unidimensional scaling problemThe algorithm incorporates a LAP-based upper bound testas well as a dominance test which allows reducing theredundancy in the search process drastically The results ofcomputational experiments indicate that the algorithm canbe used to obtain provably optimal solutions for randomlygenerated dissimilarity matrices of size up to 30 times 30 andfor empirical dissimilarity matrices of size up to 35 times 35In particular for the first time the UDSP instance with the36 times 36 Morse code dissimilarity matrix has been solved toguaranteed optimality Another important instance is definedby the 45 times 45 food-item dissimilarity matrix 119863food It is agreat challenge to the optimization community to develop anexact algorithm that will solve this instance Our branch-and-bound algorithm is limited to submatrices of 119863food of size

Journal of Applied Mathematics 13

around 35 times 35 We also remark that optimal solutions forall UDSP instances used in our experiments can be found byapplying heuristic algorithms such as iterated tabu search andsimulated annealing proceduresThe total time taken by eachof these procedures to reach the global optima is only a fewseconds An advantage of the branch-and-bound algorithmis its ability to certify the optimality of the solution obtainedin a rigorous manner

Before closing there are two more points worth of men-tion First we proposed a simple yet effective mechanismfor making a decision about when it is useful to apply abound test to the current partial solution and when sucha test most probably will fail to discard this solution Forthat purpose a sample of partial solutions is evaluated in thepreparation step of BampBWe believe that similar mechanismsmay be operative in branch-and-bound algorithms for otherdifficult combinatorial optimization problems Second wedeveloped an efficient procedure for exploring the pairwiseinterchange neighborhood of a solution in the search spaceThis procedure can be useful on its own in the design ofheuristic algorithms for unidimensional scaling especiallythose based on the local search principle

Appendix

Derivation of (20)We will deduce an expression for 120575

119903119896 1 ⩽ 119896 lt 119903 ⩽ 119899

assuming that 119901 is a solution to the whole problem thatis 119901 isin Π Relocating the object 119901(119903) from position 119903 toposition 119896 gives the permutation 119901

1015840= (119901(1) 119901(119896 minus

1) 119901(119903) 119901(119896) 119901(119899)) By using the definition of 119865 we canwrite

120575119903119896

= 119865 (1199011015840) minus 119865 (119901)

= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119896

119889119901(119903)119901(119897)

)

2

minus(

119903minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

+

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

+2119889119901(119903)119901(119897)

)

2

minus

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

(A1)

Let us denote the first two terms in (A1) by 1205751015840 and the

remaining two terms by 12057510158401015840 = sum119903minus1

119897=11989612057510158401015840

119897 It is easy to see that 1205751015840

stems frommoving the object 119901(119903) to position 119896 and 12057510158401015840 stemsfrom shifting objects 119901(119897) 119897 = 119896 119903 minus 1 by one position to

the right To get a simpler expression for 1205751015840 we perform thefollowing manipulations

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

(

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2(

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A2)

Continuing we get

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A3)

Finally

1205751015840= minus 4

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

= minus 4(119891119901(119903)

minus

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

(119878119901(119903)

minus 119891119901(119903)

)

= 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

((

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

) minus 2119891119901(119903)

+ 119878119901(119903)

)

(A4)

14 Journal of Applied Mathematics

Similarly for 12057510158401015840119897we have

12057510158401015840

119897= (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

+ 4119889119901(119903)119901(119897)

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

+ 41198892

119901(119903)119901(119897)minus (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

= 4119889119901(119903)119901(119897)

(119891119901(119897)

minus 119878119901(119897)

+ 119891119901(119897)

) + 41198892

119901(119903)119901(119897)

= 4119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(A5)

By summing up (A4) and (A5) for 119897 = 119896 119903 minus 1 we obtain(20)

References

[1] I Borg and P J F Groenen Modern Multidimensional ScalingTheory and Appplications Springer New York NY USA 2ndedition 2005

[2] L Hubert P Arabie and J Meulman The Structural Repre-sentation of Proximity Matrices with MATLAB vol 19 SIAMPhiladelphia Pa USA 2006

[3] S Meisel and D Mattfeld ldquoSynergies of Operations Researchand Data Miningrdquo European Journal of Operational Researchvol 206 no 1 pp 1ndash10 2010

[4] G Caraux and S Pinloche ldquoPermutMatrix a graphical envi-ronment to arrange gene expression profiles in optimal linearorderrdquo Bioinformatics vol 21 no 7 pp 1280ndash1281 2005

[5] M K Ng E S Fung and H-Y Liao ldquoLinkage disequilibriummap by unidimensional nonnegative scalingrdquo in Proceedings ofthe 1st International Symposium on Optimization and SystemsBiology (OSB rsquo07) pp 302ndash308 Beijing China 2007

[6] P I Armstrong N A Fouad J Rounds and L HubertldquoQuantifying and interpreting group differences in interestprofilesrdquo Journal of Career Assessment vol 18 no 2 pp 115ndash1322010

[7] L Hubert and D Steinley ldquoAgreement among supreme courtjustices categorical versus continuous representationrdquo SIAMNews vol 38 no 7 2005

[8] A Dahabiah J Puentes and B Solaiman ldquoDigestive casebasemining based on possibility theory and linear unidimensionalscalingrdquo in Proceedings of the 8th WSEAS International Con-ference on Artificial Intelligence Knowledge Engineering andData Bases (AIKED rsquo09) pp 218ndash223 World Scientific andEngineering Academy and Society (WSEAS) Stevens PointWis USA 2009

[9] A Dahabiah J Puentes and B Solaiman ldquoPossibilistic patternrecognition in a digestive database for mining imperfect datardquoWSEASTransactions on Systems vol 8 no 2 pp 229ndash240 2009

[10] Y Ge Y Leung and J Ma ldquoUnidimensional scaling classifierand its application to remotely sensed datardquo inProceedings of theIEEE International Geoscience and Remote Sensing Symposium(IGARSS rsquo05) pp 3841ndash3844 July 2005

[11] J H Bihn M Verhaagh M Brandle and R Brandl ldquoDosecondary forests act as refuges for old growth forest animalsRecovery of ant diversity in the Atlantic forest of BrazilrdquoBiological Conservation vol 141 no 3 pp 733ndash743 2008

[12] D Defays ldquoA short note on a method of seriationrdquo BritishJournal of Mathematical and Statistical Psychology vol 31 no1 pp 49ndash53 1978

[13] M J Brusco and S Stahl ldquoOptimal least-squares unidimen-sional scaling improved branch-and-bound procedures andcomparison to dynamic programmingrdquo Psychometrika vol 70no 2 pp 253ndash270 2005

[14] L J Hubert and P Arabie ldquoUnidimensional scaling and com-binatorial optimizationrdquo in Multidimensional Data Analysis Jde Leeuw W J Heiser J Meulman and F Critchley Eds pp181ndash196 DSWO Press Leiden The Netherlands 1986

[15] L J Hubert P Arabie and J J Meulman ldquoLinear unidimen-sional scaling in the 119871

2-norm basic optimization methods

usingMATLABrdquo Journal of Classification vol 19 no 2 pp 303ndash328 2002

[16] M J Brusco and S Stahl Branch-and-Bound Applicationsin Combinatorial Data Analysis Springer Science + BusinessMedia New York NY USA 2005

[17] V Pliner ldquoMetric unidimensional scaling and global optimiza-tionrdquo Journal of Classification vol 13 no 1 pp 3ndash18 1996

[18] A Murillo J F Vera and W J Heiser ldquoA permutation-translation simulated annealing algorithm for 119871

1and 119871

2uni-

dimensional scalingrdquo Journal of Classification vol 22 no 1 pp119ndash138 2005

[19] M J Brusco ldquoOn the performance of simulated annealing forlarge-scale 119871

2unidimensional scalingrdquo Journal of Classification

vol 23 no 2 pp 255ndash268 2006[20] M J Brusco H F Kohn and S Stahl ldquoHeuristic imple-

mentation of dynamic programming for matrix permutationproblems in combinatorial data analysisrdquo Psychometrika vol73 no 3 pp 503ndash522 2008

[21] G Palubeckis ldquoIterated tabu search for the maximum diversityproblemrdquo Applied Mathematics and Computation vol 189 no1 pp 371ndash383 2007

[22] G Palubeckis ldquoA new bounding procedure and an improvedexact algorithm for the Max-2-SAT problemrdquo Applied Mathe-matics and Computation vol 215 no 3 pp 1106ndash1117 2009

[23] J Kluabwang D Puangdownreong and S Sujitjorn ldquoMultipathadaptive tabu search for a vehicle control problemrdquo Journal ofApplied Mathematics vol 2012 Article ID 731623 20 pages2012

[24] N Sarasiri K Suthamno and S Sujitjorn ldquoBacterial foraging-tabu search metaheuristics for identification of nonlinear fric-tion modelrdquo Journal of Applied Mathematics vol 2012 ArticleID 238563 23 pages 2012

[25] A Jouglet and J Carlier ldquoDominance rules in combinato-rial optimization problemsrdquo European Journal of OperationalResearch vol 212 no 3 pp 433ndash444 2011

[26] E Z Rothkopf ldquoA measure of stimulus similarity and errors insome paired-associate learning tasksrdquo Journal of ExperimentalPsychology vol 53 no 2 pp 94ndash101 1957

[27] I Duntsch and G Gediga ldquoModal-Style Operators in Qual-itative Data Analysisrdquo Tech Rep CS-02-15 Department ofComputer Science Brock University St Catharines OntarioCanada 2002

[28] R N Shepard ldquoAnalysis of proximities as a technique for thestudy of information processing in manrdquoHuman factors vol 5pp 33ndash48 1963

Journal of Applied Mathematics 15

[29] P J F Groenen and W J Heiser ldquoThe tunneling method forglobal optimization in multidimensional scalingrdquo Psychome-trika vol 61 no 3 pp 529ndash550 1996

[30] L Hubert P Arabie and JMeulmanCombinatorial Data Anal-ysis Optimization by Dynamic Programming SIAM Philadel-phia Pa USA 2001

[31] B H Ross and G L Murphy ldquoFood for thought cross-classification and category organization in a complex real-worlddomainrdquo Cognitive Psychology vol 38 no 4 pp 495ndash553 1999

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 2: Research Article An Improved Exact Algorithm for Least-Squares Unidimensional …downloads.hindawi.com/journals/jam/2013/890589.pdf · 2019-07-31 · by a 36×36 Morse code dissimilarity

2 Journal of Applied Mathematics

Justices They have found that agreement among the justiceswas better represented by this approach than using a hierar-chical classification method Dahabiah et al [8 9] proposeda data mining technique based on a unidimensional scalingalgorithm They applied this technique on a digestive endo-scope database consisting of a large number of pathologiesGe et al [10] developed a method for unsupervised classifica-tion of remote sensing images Unidimensional scaling playsa pivotal role in this classifier Bihn et al [11] examined thediversity of ant assemblages in the Atlantic Forest of SouthernBrazil To visualize nestedness of ant genera distribution aunidimensional scaling approach was adopted

From (1) we see that in order to solve the unidimensionalscaling problem (UDSP for short) we need to find not only apermutation of objects but also their coordinates preservingthe order of the objects as given by the permutation Thusat first glance it might seem that the UDSP is a much moredifficult problem than other linear arrangement problemsFortunately this is not the case As shown by Defays [12]the UDSP can be reduced to the following combinatorialoptimization problem

max119901isinΠ

119865 (119901) =

119899

sum

119896=1

(

119896minus1

sum

119897=1

119889119901(119896)119901(119897)

minus

119899

sum

119897=119896+1

119889119901(119896)119901(119897)

)

2

(2)

where Π is the set of all permutations of119882 = 1 119899 thatis the set of all vectors (119901(1) 119901(2) 119901(119899)) such that 119901(119896) isin119882 119896 = 1 119899 and 119901(119896) = 119901(119897) 119896 = 1 119899 minus 1 119897 = 119896 +

1 119899 Letting 119878119894 119894 isin 119882 denote the sumsum

119899

119895=1119895 = 119894119889119894119895 we can

write

119865 (119901) =

119899

sum

119896=1

(2

119896minus1

sum

119897=1

119889119901(119896)119901(119897)

minus 119878119901(119896)

)

2

=

119899

sum

119896=1

(2

119899

sum

119897=119896+1

119889119901(119896)119901(119897)

minus 119878119901(119896)

)

2

(3)

Thus theUDSP is essentially a problem over all permutationsof 119899 objects With an optimal solution 119901

lowast to (2) at handwe can compute the optimal coordinates for the objects bysetting 119909

119901lowast(1)

= 0 and using the following relationship [13]for 119896 + 1 = 2 119899

119909119901lowast(119896+1)

= 119909119901lowast(119896)

+ (

119899

sum

119897=119896+1

119889119901lowast(119896)119901lowast(119897)

minus

119896minus1

sum

119897=1

119889119901lowast(119896)119901lowast(119897)

minus

119899

sum

119897=119896+2

119889119901lowast(119896+1)119901

lowast(119897)

+

119896

sum

119897=1

119889119901lowast(119896+1)119901

lowast(119897))119899minus1

(4)

There is an important body of the literature on algorithmsfor solving the UDSP Since the feasible solutions to the prob-lem are linear arrangements of objects a natural approachto the UDSP is to apply a general dynamic programmingparadigm An algorithm based on this paradigm was pro-posed by Hubert and Arabie [14] However as pointed out

in the papers by Hubert et al [15] and Brusco and Stahl[13] the dynamic programming algorithm requires very largememory for storing the partial solutions and in fact runsout of memory long before running out of time As reportedin these papers the largest UDSP instances solved usingdynamic programmingwere of sizes 25 and 26 An alternativeapproach for solving the UDSP is to use the branch-and-bound method The first algorithm following this approachis that of Defays [12] In his algorithm during branching theobjects are assigned in an alternating fashion to either theleftmost free position or the rightmost free position in the119899-permutation under construction For example at level 4of the search tree the four selected objects are 119901(1) 119901(119899)119901(2) and 119901(119899 minus 1) and there are no object assigned topositions 3 119899 minus 2 in the permutation 119901 To fathom partialsolutions the algorithm uses a symmetry test an adjacencytest and a bound test Some more comments on the Defaysapproach were provided in the paper by Brusco and Stahl[13] The authors of that paper also proposed two improvedbranch-and-bound algorithms for unidimensional scalingThe first of them adopts the above-mentioned branchingstrategy borrowed from the Defays method Meanwhile inthe second algorithm of Brusco and Stahl an object chosenfor branching is always assigned to the leftmost free positionBrusco and Stahl [13 16] presented improved boundingprocedures and an interchange test expanding the adjacencytest used by Defays [12] These components of their branch-and-bound algorithms are briefly sketched in Section 2 andSection 4 respectively Brusco and Stahl [13] have reportedcomputational experience for both randomly generated andempirical dissimilarity matrices The results suggest thattheir branch-and-bound implementations are superior to theDefays algorithm Moreover the results demonstrate thatthese implementations are able to solve UDSP instances thatare beyond the capabilities of the dynamic programmingmethod

There are also a number of heuristic algorithms forunidimensional scaling These include the smoothing tech-nique [17] the iterative quadratic assignment heuristic [15]and simulated annealing adaptations [18 19] A heuristicimplementation of dynamic programming for escaping localoptima has been presented by Brusco et al [20] Computa-tional results for problem instances with up to 200 objectswere reported in [18ndash20]

The focus of this paper is on developing a branch-and-bound algorithm for the UDSP The primary intention isto solve exactly some larger problems than those solved byprevious methods The significant features of our algorithmare the use of an innovative upper bounding procedurerelying on the linear assignmentmodel and the enhancementof the interchange test from [13] In the current paper thisenhancement is called a dominance test A good initialsolution and thus lower bound are obtained by running aniterated tabu search algorithm at the root node of the searchtree To make this algorithm more efficient we provide anovel mechanism and underlying formulas for searching thepairwise interchange neighborhood of a solution in 119874(119899

2)

Journal of Applied Mathematics 3

time which is an improvement over a standard 119874(1198993)-time

approach from the literatureThe same formulas are also usedin the dominance test We present the results of empiricalevaluation of our branch-and-bound algorithm and compareit against the better of the two branch-and-bound algorithmsof Brusco and Stahl [13] The largest UDSP instance that wewere able to solvewith our algorithmwas the problemdefinedby a 36 times 36Morse code dissimilarity matrix

The remainder of this paper is organized as follows InSection 2 we present an upper bound on solutions of theUDSP In Section 3 we outline an effective tabu search pro-cedure for unidimensional scaling and propose a method forexploring the pairwise interchange neighborhood Section 4gives a description of the dominance test for pruning nodesin the searchThis test is used in our branch-and-bound algo-rithm presented in Section 5 The results of computationalexperiments are reported in Section 6 Finally Section 7concludes the paper

2 Upper Bounds

In this section we first briefly recall the bounds used byBrusco and Stahl [13] in their second branch-and-boundalgorithm Then we present a new upper bounding mecha-nism for the UDSP We discuss the technical details of theprocedure for computing our bound We choose a variationof this procedure which yields slightly weakened bounds butis quite fast

Like in the second algorithm of Brusco and Stahl in ourimplementation of the branch-and-bound method partialsolutions are constructed by placing each new object in theleftmost free position For a partial solution 119901 = (119901(1)

119901(119903)) 119903 isin 1 119899 minus 1 we denote by 119882(119901) the set of theobjects in 119901 Thus 119882(119901) = 119901(1) 119901(119903) and 119882(119901) =

119882 119882(119901) is the set of unselected objects At the root nodeof the search tree 119901 is empty and 119882(119901) is equal to 119882 Thefirst upper bound used by Brusco and Stahl when adopted to119901 takes the following form

119880BS = 119865119903(119901) + sum

119894isin119882(119901)

1198782

119894 (5)

where 119865119903(119901) denotes the sum of the terms in (3) corre-

sponding to assigned objects Thus 119865119903(119901) = sum

119903

119896=1(119878119901(119896)

minus

2sum119896minus1

119897=1119889119901(119896)119901(119897)

)2

To describe the second bound we define 119863 = (119889119894119895) to

be an 119899 times (119899 minus 1) matrix obtained from 119863 by sorting thenondiagonal entries of each row of119863 in nondecreasing orderLet 120578119894119896

= (119878119894minus 2sum

119896minus1

119895=1

119889119894119895)2 and 120583

119896= max

1⩽119894⩽119899120578119894119896for 119896 =

1 lceil1198992rceil For other values of 119896 120583119896is obtained from the

equality 120583119896= 120583119899minus119896+1

119896 isin 1 119899 The second upper boundproposed in [13] is given by the following expression

119865119903(119901) +

119899

sum

119896=119903+1

120583119896 (6)

The previous outlined bounds are computationally attrac-tive but not strong enough when used to prune the search

space In order to get a tighter upper bound we resort to thelinear assignment model

max119899minus119903

sum

119894=1

119899minus119903

sum

119895=1

119890119894119895119910119894119895

st119899minus119903

sum

119895=1

119910119894119895= 1 119894 = 1 119899 minus 119903

119899minus119903

sum

119894=1

119910119894119895= 1 119895 = 1 119899 minus 119903

119910119894119895isin 0 1 119894 119895 = 1 119899 minus 119903

(7)

where 119890119894119895 119894 119895 = 1 119899 minus 119903 are entries of an (119899 minus 119903) times (119899 minus

119903) profit matrix 119864 with rows and columns indexed by theobjects in the set 119882(119901) Next we will construct the matrix119864 For a permutation 119901 isin Π we denote by 119891

119901(119896)(resp

1198911015840

119901(119896)) the sum of dissimilarities between object 119901(119896) and all

objects preceding (resp following) 119901(119896) in 119901 Thus 119891119901(119896)

=

sum119896minus1

119897=1119889119901(119896)119901(119897)

1198911015840119901(119896)

= sum119899

119897=119896+1119889119901(119896)119901(119897)

From (3) we have

119865 (119901) =

119899

sum

119896=1

(2max (119891119901(119896)

1198911015840

119901(119896)) minus 119878119901(119896)

)

2

(8)

and 2max(119891119901(119896)

1198911015840

119901(119896)) ⩾ 119878

119901(119896)= 119891119901(119896)

+ 1198911015840

119901(119896)for each 119896 =

1 119899 Suppose now that 119901 = (119901(1) 119901(119903)) is a partialsolution andΠ(119901) is the set of all 119899-permutations that can beobtained by extending 119901 Assume for the sake of simplicitythat 119882(119901) = 1 119899 minus 119903 In order to construct a suitablematrix 119864 we have to bound 119891

119901(119896)and 119891

1015840

119901(119896)in (8) from above

To this end for any 119894 isin 119882(119901) let (11988910158401198941 119889

1015840

119894119899minus119903minus1) denote the

list obtained by sorting 119889119894119895 119895 isin 119882(119901) 119894 in nonincreasing

order To get the required bounds we make use of the sums120574119894119895

= sum119895

119897=11198891015840

1198941 119894 isin 119882(119901) 119895 = 0 1 119899 minus 119903 minus 1 and 120588

119894119896=

sum119903

119897=1119889119894119901(119897)

+ 120574119894119896minus119903minus1

119894 isin 119882(119901) 119896 = 119903 + 1 119899 Suppose that119901 is any permutation in Π(119901) and 119901(119896) = 119894 for a position 119896 isin

119903 + 1 119899 Then it is easy to see that 119891119901(119896)

= 119891119894⩽ 120588119894119896and

1198911015840

119901(119896)= 1198911015840

119894⩽ 120574119894119899minus119896

Indeed in the case of 119891119901(119896)

for examplewe have 119891

119901(119896)= sum119903

119897=1119889119894119901(119897)

+ sum119896minus1

119897=119903+1119889119894119901(119897)

and the last term inthis expression is less than or equal to 120574

119894119896minus119903minus1 We define 119864 to

be the matrix with entries 119890119894119895= (2max(120588

119894119903+119895 120574119894119899minus119903minus119895

) minus 119878119894)2

119894 119895 = 1 119899minus119903 Let119880lowast denote the optimal value of the linearassignment problem (LAP) (7) with the profitmatrix119864 Fromthe definition of119864 we immediately obtain an upper bound onthe optimal value of the objective function 119865

Proposition 1 For a partial solution 119901 = (119901(1) 119901(119903))119880 = 119865

119903(119901) + 119880

lowast⩾ max

119901isinΠ(119901)119865(119901)

From the definition of 120588 and 120574 values it can be observedthat with the increase of 119896 120588

119894119896increases while 120574

119894119899minus119896

decreases Therefore when constructing the 119894th row of thematrix 119864 it is expedient to first find the largest 119896 isin 119903 +

1 119899 for which 120588119894119896

⩽ 120574119894119899minus119896

We denote such 119896 by 119896 and

4 Journal of Applied Mathematics

assume that 119896 = 119903 if120588119894119903+1

gt 120574119894119899minus119903minus1

Then 119890119894119896minus119903

= (2120574119894119899minus119896

minus119878119894)2

for 119896 = 119903+1 119896 and 119890

119894119896minus119903= (2120588119894119896minus119878119894)2 for 119896 =

119896+1 119899

The most time-consuming step in the computation of theupper bound 119880 is solving the linear assignment problem Inorder to provide a computationally less expensive boundingmethod we replace 119880

lowast with the value of a feasible solutionto the dual of the LAP (7) To get the bound we take120572119894

= max1⩽119896⩽119899minus119903

119890119894119896

= 119890119894119899minus119903

= 1198782

119894 119894 = 1 119899 minus 119903

120573119895

= max1⩽119894⩽119899minus119903

(119890119894119895minus 120572119894) 119895 = 1 119899 minus 119903 Clearly 120572

119894+

120573119895

⩾ 119890119894119895for each pair 119894 119895 = 1 119899 minus 119903 Thus the vector

(1205721 120572

119899minus119903 1205731 120573

119899minus119903) satisfies the constraints of the dual

of the LAP instance of interest Let = sum119899minus119903

119894=1120572119894+ sum119899minus119903

119895=1120573119895

Then from the duality theory of linear programming we canestablish the following upper bound for 119865(119901)

Proposition 2 For a partial solution 119901 = (119901(1) 119901(119903))1198801015840= 119865119903(119901) + ⩾ max

119901isinΠ(119901)119865(119901)

Proof The inequality follows from Proposition 1 and the factthat ⩾ 119880

lowast

One point that needs to be emphasized is that 1198801015840

strengthens the first bound proposed by Brusco and Stahl

Proposition 3 1198801015840 ⩽ 119880119861119878

Proof Indeed 119865119903(119901) +sum

119894isin119882(119901)1198782

119894= 119865119903(119901) +sum

119899minus119903

119894=1120572119894⩾ 119865119903(119901) +

The equality in the previous chain of expressions comesfrom the choice of 120572

119894and our assumption on119882(119901) whereas

the inequality is due to the fact that 120573119895

⩽ 0 for each 119895 =

1 119899 minus 119903

As an example we consider the following 4 times 4 dissimi-larity matrix borrowed from Defays [12]

[

[

[

[

minus 4 2 3

4 minus 7 2

2 7 minus 4

3 2 4 minus

]

]

]

]

(9)

The same matrix was also used by Brusco and Stahl [13] toillustrate their bounding procedures From Table 2 in theirpaper we find that the solution 119901 = (2 4 1 3) is optimal forthis instance of the UDSP The optimal value is equal to 388We compute the proposed upper bound for the whole Defaysproblem In this case 119882(119901) = 1 4 and 119865

119903(119901) = 0

The matrix 119864 is constructed by noting that 119896 = 2 for each119894 = 1 4 It takes the following form

[

[

[

[

81 25 25 81

169 81 81 169

169 81 81 169

81 25 25 81

]

]

]

]

(10)

Using the expressions for120572119894and120573119895 we find that120572

1= 1205724= 81

1205722= 1205723= 169 120573

1= 1205734= 0 and 120573

2= 1205733= minus56 Thus =

sum4

119894=1120572119894+ sum4

119895=1120573119895= 500 minus 112 = 388 = max

119901isinΠ119865(119901) Hence

our bound for the Defays instance is tight For comparisonthe bound (5) is equal to sum

4

119894=1120572119894= 500 To obtain the second

bound of Brusco and Stahl we first compute the 120583 valuesSince 120583

1= 1205834= 169 and 120583

2= 1205833= 81 the bound (6) (taking

the form sum4

119894=1120583119894) is also equal to 500

3 Tabu Search

As it is well known that the effectiveness of the bound testin branch-and-bound algorithms depends on the value of thebest solution found throughout the search process A goodstrategy is to create a high-quality solution before the searchis started For this task various heuristics can be applied Ourchoice was to use the tabu search (TS) technique It should benoted that it is not the purpose of this paper to investigateheuristic algorithms for solving the UDSP Therefore werestrict ourselves merely to presenting an outline of ourimplementation of tabu search

For 119901 isin Π let 119901119896119898

denote the permutation obtainedfrom 119901 by interchanging the objects in positions 119896 and 119898The set 119873

2(119901) = 119901

119896119898| 119896 119898 = 1 119899 119896 lt 119898 is

the 2-interchange neighborhood of 119901 At each iterationthe tabu search algorithm performs an exploration of theneighborhood 119873

2(119901) A subset of solutions in 119873

2(119901) are

evaluated by computing Δ(119901 119896119898) = 119865(119901119896119898

)minus119865(119901) In orderto get better quality solutions we apply TS iteratively Thistechnique called iterated tabu search (ITS) consists of thefollowing steps

(1) initialize the search with some permutation of theobjects

(2) apply a tabu search procedure Check if the termina-tion condition is satisfied If so then stop Otherwiseproceed to (3)

(3) apply a solution perturbation procedure Return to(2)

In our implementation the search is started from a ran-domly generated permutation Also some data are preparedto be used in the computation of the differences Δ(119901 119896119898)We will defer this issue until later in this section In step (2) ofITS at each iteration of the TS procedure the neighborhood1198732(119901) of the current solution 119901 is explored The procedure

selects a permutation 119901119896119898

isin 1198732(119901) which either improves

the best available objective function value so far or if thisdoes not happen has the largest value of Δ(119901 119894 119895) amongall permutations 119901

119894119895isin 1198732(119901) for which the objects in

positions 119894 and 119895 are not in the tabu list The permutation119901119896119898

is then accepted as a new current solution The numberof iterations in the tabu search phase is bounded aboveby a parameter of the algorithm The goal of the solutionperturbation phase is to generate a permutation which isemployed as the starting point for the next invocation ofthe tabu search procedure This is achieved by taking thecurrent permutation and performing a sequence of pairwiseinterchanges of objects Throughout this process each objectis allowed to change its position in the permutation at mostonce At each step first a set of feasible pairs of permutationpositions with the largest Δ values is formed Then a pairsay (119896119898) is chosen randomly from this set and objects inpositions 119896 and119898 are switchedWe omit some of the details of

Journal of Applied Mathematics 5

the algorithm Similar implementations of the ITS techniquefor other optimization problems are thoroughly describedfor example in [21 22] We should note that also otherenhancements of the generic tabu search strategy could beused for our purposes These include for example multipathadaptive tabu search [23] and bacterial foraging tabu search[24] algorithms

In the rest of this section we concentrate on howto efficiently compute the differences Δ(119901 119896119898) Certainlythis issue is central to the performance of the tabu searchprocedure But more importantly evaluation of pairwiseinterchanges of objects stands at the heart of the dominancetest which is one of the crucial components of our algorithmAs we will see in the next section the computation of the Δvalues is an operation the dominance test is based on

Brusco [19] derived the following formula to calculate thedifference between 119865(119901

119896119898) and 119865(119901)

Δ (119901 119896119898)

= (4

119898

sum

119897=119896+1

119889119901(119896)119901(119897)

)((

119898

sum

119897=119896+1

119889119901(119896)119901(119897)

) + 2119891119901(119896)

minus 119878119901(119896)

)

+ (4

119898minus1

sum

119897=119896

119889119901(119897)119901(119898)

)((

119898minus1

sum

119897=119896

119889119901(119897)119901(119898)

) minus 2119891119901(119898)

+ 119878119901(119898)

)

+

119898minus1

sum

119897=119896+1

4 (119889119901(119897)119901(119898)

minus 119889119901(119896)119901(119897)

) ((119889119901(119897)119901(119898)

minus 119889119901(119896)119901(119897)

)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(11)

Using (11) Δ(119901 119896119898) for all permutations in 1198732(119901) can be

computed in 119874(1198993) time However this approach is actually

too slow to be efficient In order to describe a more elaborateapproach we introduce three auxiliary 119899 times 119899 matrices 119860 =

(119886119894119902)119861 = (119887

119894119902) and119862 = (119888

119894119895) defined for a permutation119901 isin Π

Let us consider an object 119894 = 119901(119904) The entries of the 119894th rowof the matrices 119860 and 119861 are defined as follows if 119902 lt 119904 then119886119894119902= sum119904minus1

119897=119902119889119894119901(119897)

119887119894119902= sum119904minus1

119897=119902119889119894119901(119897)

(119889119894119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

) if 119902 gt 119904then 119886

119894119902= sum119902

119897=119904+1119889119894119901(119897)

119887119894119902= sum119902

119897=119904+1119889119894119901(119897)

(minus119889119894119901(119897)

+2119891119901(119897)

minus119878119901(119897)

)if 119902 = 119904 then 119886

119894119902= 119887119894119902= 0 Turning now to the matrix 119862 we

suppose that 119895 = 119901(119905) Then the entries of 119862 are given by thefollowing formula

119888119894119895=

max(119904119905)minus1sum

119897=min(119904119905)+1119889119894119901(119897)

119889119895119901(119897)

if |119904 minus 119905| gt 1

0 if |119904 minus 119905| = 1

(12)

Unlike 119860 and 119861 the matrix 119862 is symmetric We use thematrices 119860 119861 and 119862 to cast (11) in a form more suitable forboth the TS heuristic and the dominance test

Proposition 4 For 119901 isin Π 119896 isin 1 119899 minus 1 and 119898 isin 119896 +

1 119899

Δ (119901 119896119898) = 4119886119901(119896)119898

(119886119901(119896)119898

+ 2119886119901(119896)1

minus 119878119901(119896)

)

+ 4119886119901(119898)119896

(119886119901(119898)119896

minus 2119886119901(119898)1

+ 119878119901(119898)

)

+ 4 (119887119901(119898)119896+1

minus 119887119901(119896)119898minus1

minus 2119888119901(119896)119901(119898)

)

(13)

Proof First notice that the sums sum119898

119897=119896+1119889119901(119896)119901(119897)

and sum119898minus1

119897=119896

119889119901(119897)119901(119898)

in (11) are equal respectively to 119886119901(119896)119898

and 119886119901(119898)119896

Next it can be seen that 119891

119901(119896)= 119886119901(119896)1

and 119891119901(119898)

= 119886119901(119898)1

Finally the third term in (11) ignoring the factor of 4 can berewritten as119898minus1

sum

119897=119896+1

119889119901(119897)119901(119898)

(119889119901(119897)119901(119898)

+ 2119891119901(119897)

minus 119878119901(119897)

)

minus

119898minus1

sum

119897=119896+1

119889119901(119897)119901(119898)

119889119901(119896)119901(119897)

minus

119898minus1

sum

119897=119896+1

119889119901(119896)119901(119897)

119889119901(119897)119901(119898)

minus

119898minus1

sum

119897=119896+1

119889119901(119896)119901(119897)

(minus119889119901(119896)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(14)

It is easy to see that the first sum in (14) is equal to 119887119901(119898)119896+1

each of the next two sums is equal to 119888

119901(119896)119901(119898) and the last sum

is equal to 119887119901(119896)119898minus1

After simple manipulations we arrive at(13)

Having the matrices 119860 119861 and 119862 and using (13) the 2-interchange neighborhood of 119901 can be explored in 119874(119899

2)

time The question that remains to be addressed is howefficiently the entries of these matrices can be updated afterthe replacement of 119901 by a permutation 119901

1015840 chosen from theneighborhood 119873

2(119901) Suppose that 1199011015840 is obtained from 119901 by

interchanging the objects in positions 119896 and119898 Then we canwrite the following formulas that are involved in updating thematrices 119860 and 119861

119886119894119902= 119886119894119902minus1

+ 1198891198941199011015840(119902) (15)

119886119894119902= 119886119894119902+1

+ 1198891198941199011015840(119902) (16)

119887119894119902= 119887119894119902minus1

+ 1198891198941199011015840(119902)

(21198861199011015840(119902)1

minus 1198781199011015840(119902)

minus 1198891198941199011015840(119902)) (17)

119887119894119902= 119887119894119902+1

+ 1198891198941199011015840(119902)

(21198861199011015840(119902)1

minus 1198781199011015840(119902)

+ 1198891198941199011015840(119902)) (18)

From now on we will assume that the positions 119904 and 119905 of theobjects 119894 and 119895 are defined with respect to the permutation 119901

1015840More formally we have 119894 = 119901

1015840(119904) 119895 = 119901

1015840(119905)The procedure for

updating 119860 119861 and 119862 is summarized in the following

(1) for each object 119894 isin 1 119899 do the following

(11) if 119904 lt 119896 then compute 119886119894119902 119902 = 119896 119898 minus 1 by

(15) and 119887119894119902 119902 = 119896 119899 by (17)

(12) if 119904 gt 119898 then compute 119886119894119902 119902 = 119898 119898minus1 119896+

1 by (16) and 119887119894119902 119902 = 119898119898 minus 1 1 by (18)

(13) if 119896 lt 119904 lt 119898 then compute 119886119894119902by (15) for 119902 =

119898 119899 and by (16) for 119902 = 119896 119896 minus 1 1

6 Journal of Applied Mathematics

(14) if 119904 = 119896 or 119904 = 119898 then first set 119886119894119904

= 0

and afterwards compute 119886119894119902by (15) for 119902 = 119904 +

1 119899 and by (16) for 119902 = 119904 minus 1 119904 minus 2 1(15) if 119896 ⩽ 119904 ⩽ 119898 then first set 119887

119894119904= 0 and afterwards

compute 119887119894119902by (17) for 119902 = 119904+1 119899 and by (18)

for 119902 = 119904 minus 1 119904 minus 2 1

(2) for each pair of objects 119894 119895 isin 1 119899 such that 119904 lt 119905

do the following

(21) if 119904 lt 119896 lt 119905 lt 119898 then set 119888119894119895= 119888119894119895+1198891198941199011015840(119896)1198891198951199011015840(119896)

minus

1198891198941199011015840(119898)

1198891198951199011015840(119898)

(22) if 119896 lt 119904 lt 119898 lt 119905 then set 119888

119894119895= 119888119894119895+

1198891198941199011015840(119898)

1198891198951199011015840(119898)

minus 1198891198941199011015840(119896)1198891198951199011015840(119896)

(23) if |119904 119905 cap 119896119898| = 1 then compute 119888119894119895anew

using (12)(24) if 119888

119894119895has been updated in (21)ndash(23) then set

119888119895119894= 119888119894119895

It can be noted that for some pairs of objects 119894 and 119895

the entry 119888119894119895remains unchanged There are five such cases

(1) 119904 119905 lt 119896 (2) 119896 lt 119904 119905 lt 119898 (3) 119904 119905 gt 119898 (4) 119904 lt 119896 lt 119898 lt 119905(5) 119904 119905 = 119896119898

We now evaluate the time complexity of the describedprocedure Each of Steps (11) to (15) requires 119874(119899) timeThus the complexity of Step (1) is 119874(119899

2) The number of

operations in Steps (21) (22) and (24) is 119874(1) and that inStep (23) is 119874(119899) However the latter is executed only 119874(119899)

timesTherefore the overall complexity of Step (2) and henceof the entire procedure is 119874(119899

2)

Formula (13) and the previous described procedure forupdating matrices 119860 119861 and 119862 lie at the core of ourimplementation of the TS method The matrices 119860 119861 and 119862

can be initialized by applying the formulas (15)ndash(18) and (12)with respect to the starting permutationThe time complexityof initialization is 119874(119899

2) for 119860 and 119861 and 119874(119899

3) for 119862 The

auxiliary matrices 119860 119861 and 119862 and the expression (13) forΔ(119901 119896119898) are also used in the dominance test presented inthe next section This test plays a major role in pruning thesearch tree during the branching and bounding process

4 Dominance Test

As is well known the use of dominance tests in enumerativemethods allows reducing the search space of the optimizationproblem of interest A comprehensive review of dominancetests (or rules) in combinatorial optimization can be found inthe paper by Jouglet and Carlier [25] For the problem we aredealing with in this study Brusco and Stahl [13] proposed adominance test for discarding partial solutions which relieson performing pairwise interchanges of objects To formulateit let us assume that 119901 = (119901(1) 119901(119903)) 119903 lt 119899 is a partialsolution to the UDSP The test checks whether Δ(119901 119896 119903) gt 0

for at least one of the positions 119896 isin 1 119903 minus 1 If thiscondition is satisfied then we say that the test fails In thiscase 119901 is thrown away Otherwise 119901 needs to be extendedby placing an unassigned object in position 119903 + 1 The firstcondition in our dominance test is the same as in the test

suggested by Brusco and Stahl To compute Δ(119901 119896 119903) weuse subsets of entries of the auxiliary matrices 119860 119861 and119862 defined in the previous section Since only the sign ofΔ(119901 119896 119903) matters the factor of 4 can be ignored in (13) andthus Δ

1015840(119901 119896 119903) = Δ(119901 119896 119903)4 instead of Δ(119901 119896 119903) can be

considered From (13) we obtain

Δ1015840(119901 119896 119903) = 119886

119901(119896)119903(119886119901(119896)119903

+ 2119886119901(119896)1

minus 119878119901(119896)

)

+ 119886119901(119903)119896

(119886119901(119903)119896

minus 2119886119901(119903)1

+ 119878119901(119903)

)

+ 119887119901(119903)119896+1

minus 119887119901(119896)119903minus1

minus 2119888119901(119896)119901(119903)

(19)

We also take advantage of the case where Δ1015840(119901 119896 119903) =

0 We use the lexicographic rule for breaking ties The testoutputs a ldquoFailrdquo verdict if the equality Δ1015840(119901 119896 119903) = 0 is foundto hold true for some 119896 isin 1 119903 minus 1 such that 119901(119896) gt 119901(119903)Indeed 119901 = (119901(1) 119901(119896 minus 1) 119901(119896) 119901(119896 + 1) 119901(119903))

can be discarded because both 119901 and partial solution 1199011015840=

(119901(1) 119901(119896 minus 1) 119901(119903) 119901(119896 + 1) 119901(119896)) are equally goodbut 119901 is lexicographically larger than 119901

1015840We note that the entries of the matrices 119860 119861 and 119862 used

in (19) can be easily obtained from a subset of entries of119860 119861and 119862 computed earlier in the search process Consider forexample 119888

119901(119896)119901(119903)in (19) It can be assumed that 119896 isin 1 119903minus

2 because if 119896 = 119903 minus 1 then 119888119901(119896)119901(119903)

= 0 For simplicity let119901(119903) = 119894 and 119901(119903 minus 1) = 119895 To avoid ambiguity let us denoteby 1198621015840 the matrix 119862 that is obtained during processing of the

partial solution (119901(1) 119901(119903 minus 2)) The entry 1198881015840

119901(119896)119894of 1198621015840 is

calculated by (temporarily) placing the object 119894 in position119903 minus 1 Now referring to the definition of 119862 it is easy to seethat 119888

119901(119896)119894= 1198881015840

119901(119896)119894+ 119889119901(119896)119895

119889119894119895 Similarly using formulas of

types (15) and (17) we can obtain the required entries of thematrices 119860 and 119861 Thus (19) can be computed in constanttime The complexity of computing Δ

1015840(119901 119896 119903) for all 119896 is

119874(119899) The dominance test used in [13] relies on (11) and hascomplexity 119874(119899

2) So our implementation of the dominance

test is significantly more time-efficient than that proposed byBrusco and Stahl

In order to make the test even more efficient we inaddition decided to evaluate those partial solutions thatcan be obtained from 119901 by relocating the object 119901(119903) fromposition 119903 to position 119896 isin 1 119903 minus 2 (here position119896 = 119903 minus 1 is excluded since in this case relocating wouldbecome a pairwise interchange of objects) Thus the testcompares 119901 against partial solutions 119901

119896= (119901(1) 119901(119896 minus

1) 119901(119903) 119901(119896) 119901(119903 minus 1)) 119896 = 1 119903 minus 2 Now supposethat 119901 and 119901

119896are extended to 119899-permutations 119901 isin Π and

respectively 119901119896isin Π such that for each 119897 isin 119903 + 1 119899 the

object in position 119897 of 119901 coincides with that in position 119897 of 119901119896

Let us define 120575119903119896to be the difference between 119865(119901

119896) and 119865(119901)

This difference can be computed as follows

120575119903119896

= (4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)((

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

) minus 2119891119901(119903)

+ 119878119901(119903)

)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(20)

Journal of Applied Mathematics 7

The previous expression is similar to the formula (6) in [19]For completeness we give its derivation in the Appendix Wecan rewrite (20) using the matrices 119860 and 119861

Proposition 5 For 119901 isin Π 119896 isin 1 119899 minus 1 and 119903 isin 119896 +

1 119899

120575119903119896

= 4119886119901(119903)119896

(119886119901(119903)119896

minus 2119886119901(119903)1

+ 119878119901(119903)

) + 4119887119901(119903)119896

(21)

The partial solution 119901 is dominated by 119901119896if 120575119903119896

gt 0Upon establishing this fact for some 119896 isin 1 119903 minus 2 thetest immediately stops with the rejection of 119901 Of course likein the case of (13) a factor of 4 in (21) can be dropped Thecomputational overhead of including the condition 120575

119903119896gt 0

in the test is minimal because all the quantities appearing onthe right-hand side of (21) except 119887

119901(119903)1 are also used in the

calculation ofΔ1015840(119901 119896 119903) (see (19)) Since the loop over 119896needsto be executed the time complexity of this part of the test is119874(119899) Certainly evaluation of relocating the last added objectin 119901 gives more power to the test Thus our dominance testis an improvement over the test described in the Brusco andStahl [13] paper

5 A Branch-and-Bound Algorithm

Our approach to the UDSP is to start with a good initial solu-tion provided by the iterated tabu search heuristic invokedin the initialization phase of the algorithm and then proceedwith the main phase where using the branch-and-boundtechnique a search tree is built For the purpose of pruningbranches of the search tree the algorithm involves three tests(1) the upper bound test (based on 119880

1015840) (2) the dominancetest (3) the symmetry testThe first two of them have alreadybeen described in the prior sections The last one is the mostsimple and cheap to apply It hinges on the fact that the valuesof the objective function 119865 at a permutation 119901 isin Π and itsreverse 119901 defined by 119901(119894) = 119901(119899 minus 119894 + 1) 119894 = 1 119899 arethe same Therefore it is very natural to demand that somefixed object in 119882 say 119908 being among those objects that areassigned to the first half of the positions in 119901 More formallythe condition is 119894 ⩽ lceil1198992rceil where 119894 is such that 119901(119894) = 119908If this condition is violated for 119901 = (119901(1) 119901(lceil1198992rceil)) then the symmetry test fails and 119901 is discarded Because ofthe requirement to fulfill the previous condition some careshould be taken in applying the dominance test In the case ofΔ1015840(119901 119896 119903) = 0 and 119901(119896) gt 119901(119903) (see the paragraph following

(19)) this test in our implementation of the algorithm gives aldquoFailrdquo verdict only if either 119903 ⩽ lceil1198992rceil or 119901(119896) = 119908

Through the preliminary computational experimentswe were convinced that the upper bound test was quiteweak when applied to smaller partial solutions In otherwords the time overhead for computing linear assignment-based bounds did not pay off for partial permutations 119901 =

(119901(1) 119901(119903)) of length 119903 that is small or modest comparedto 119899 Therefore it is reasonable to compute the upper bound1198801015840 (described in Section 2) only when 119903 ⩾ 120582(119863) where

120582(119863) is a threshold value depending on the UDSP instancematrix 119863 and thus also on 119899 The problem we face is howto determine an appropriate value for 120582(119863) We combat this

problem with a simple procedure consisting of two steps Inthe first step it randomly generates a relatively small numberof partial permutations 119901

1 119901

120591 each of length lfloor1198992rfloor For

each permutation in this sample the procedure computes theupper bound 119880

1015840 Let 11988010158401 119880

1015840

120591be the values obtained We

are interested in the normalized differences between thesevalues and 119865(119901heur) where 119901heur is the solution delivered bythe ITS algorithm So the procedure calculates the sum 119885 =

sum120591

119894=1100(119880

1015840

119894minus 119865(119901heur))119865(119901heur) and the average 119877 = 119885120591 of

such differences In the second step using119877 it determines thethreshold value This is done in the following way if 119877 lt 119877

1

then 120582(119863) = lfloor1198993rfloor if 1198771⩽ 119877 lt 119877

2 then 120582(119863) = lfloor1198992rfloor if 119877 gt

1198772 then 120582(119863) = lfloor31198994rfloor The described procedure is executed

at the initialization step of the algorithm Its parameters are1198771 1198772 and the sample size 120591

Armed with tools for pruning the search space we canconstruct a branch-and-bound algorithm for solving theUDSP In order to make the description of the algorithmmore readable we divide it into two parts namely themain procedure BampB (branch-and-bound) and an auxiliaryprocedure SelectObject The former builds a search tree withnodes representing partial solutionsThe task of SelectObjectis either to choose an object which can be used to extendthe current permutation or if the current node becomesfathomed to perform a backtracking step In the descriptionof the algorithm given in the following the set of objects thatcan be used to create new branches is denoted by119881Themainprocedure goes as follows

BampB

(1) Apply the ITS algorithm to get an initial solution 119901lowast

to a given instance of the UDSP

Set 119865lowast = 119865(119901lowast)

(2) Calculate 120582(119863)(3) Initialize the search tree by creating the root node

with the empty set 119881 attached to it

Set 119871 = 0 (119871 denotes the tree level the currentlyconsidered node belongs to)

(4) Set 119903 = 119871

(5) Call SelectObject(119903) It returns 119871 and if 119871 = 119903additionally some unassigned object (let it be denotedby V) If 119871 lt 0 then go to (7) Otherwise checkwhether 119871 lt 119903 If so then return to (4) (perform abacktracking step) If not (in which case 119871 = 119903) thenproceed to (6)

(6) Create a new node with empty 119881 Set 119901(119903 + 1) = VIncrement 119871 by 1 and go to (4)

(7) Calculate the coordinates for the objects by applyingformula (4) with respect to the permutation 119901

lowast andstop

Steps (1) and (2) of BampB comprise the initializationphase of the algorithm The initial solution to the problemis produced using ITS This solution is saved as the bestsolution found so far denoted by 119901lowast If later a better solutionis found 119901lowast is updated This is done within the SelectObject

8 Journal of Applied Mathematics

procedure In the final step BampB uses 119901lowast to calculate the

optimal coordinates for the objects The branch-and-boundphase starts at Step (3) of BampB Each node of the search treeis assigned a set 119881 The partial solution corresponding to thecurrent node is denoted by 119901 The objects of 119881 are used toextend 119901 Upon creation of a node (Step (3) for root andStep (6) for other nodes) the set 119881 is empty In SelectObject119881 is filled with objects that are promising candidates toperform the branching operation on the corresponding nodeSelectObject also decides which action has to be taken nextIf 119871 = 119903 then the action is ldquobranchrdquo whereas if 119871 lt 119903 theaction is ldquobacktrackrdquo Step (4) helps to discriminate betweenthese two cases If 119871 = 119903 then a new node in the nextlevel of the search tree is generated At that point the partialpermutation 119901 is extended by assigning the selected object Vto the position 119903 + 1 in119901 (see Step (6)) If however119871 lt 119903 and 119903is positive then backtracking is performed by repeating Steps(4) and (5) We thus see that the search process is controlledby the procedure SelectObject This procedure can be statedas follows

SelectObject(119903)

(1) Check whether the set119881 attached to the current nodeis empty If so then go to (2) If not then consider itssubset 1198811015840 = 119894 isin 119881 | 119894 is unmarked and 119906

119894(119901) gt

119865lowast If 1198811015840 is nonempty then mark an arbitrary object

V isin 1198811015840 and return with it as well as with 119871 = 119903

Otherwise go to (6)(2) If 119903 gt 0 then go to (3) Otherwise set 119881 = 1 119899

119906119894(119901) = infin 119894 = 1 119899 and go to (4)

(3) If 119903 = 119899 minus 1 then go to (5) Otherwise for each 119894 isin

119882(119901) perform the following operations Temporarilyset 119901(119903 + 1) = 119894 Apply the symmetry test thedominance test and if 119903 ⩾ 120582(119863) also the bound testto the resulting partial solution 119901

1015840= (119901(1) 119901(119903 +

1)) If it passes all the tests then do the followingInclude 119894 into 119881 If 119903 ⩾ 120582(119863) assign to 119906

119894(119901) the value

of the upper bound 1198801015840 computed for the subproblem

defined by the partial solution 1199011015840 If however 119903 lt

120582(119863) then set 119906119894(119901) = infin If after processing all

objects in 119882(119901) the set 119881 remains empty then go to(6) Otherwise proceed to (4)

(4) Mark an arbitrary object V isin 119881 and return with it aswell as with 119871 = 119903

(5) Let 119882(119901) = 119895 Calculate the value of the solution119901 = (119901(1) 119901(119899 minus 1) 119901(119899) = 119895) If 119865(119901) gt 119865

lowast thenset 119901lowast = 119901 and 119865

lowast= 119865(119901)

(6) If 119903 gt 0 then drop the 119903th element of 119901 Return with119871 = 119903 minus 1

In the previous description 119882(119901) denotes the set ofunassigned objects as before The objects included in 119881 sube

119882(119901) are considered as candidates to be assigned to theleftmost available position in 119901 At the root node 119881 = 119882(119901)In all other cases except when |119882(119901)| = 1 an object becomesa member of the set 119881 only if it passes two or for longer 119901three tests The third of them the upper bound test amounts

to check if 1198801015840 gt 119865lowast This condition is checked again on

the second and subsequent calls to SelectObject for the samenode of the search tree This is done in Step (1) for eachunmarked object 119894 isin 119881 Such a strategy is reasonable becauseduring the branch-and-bound process the value of 119865lowast mayincrease and the previously satisfied constraint1198801015840 gt 119865

lowast maybecome violated for some 119894 isin 119881 To be able to identify suchobjects the algorithm stores the computed upper bound 119880

1015840

as 119906119894(119901) at Step (3) of SelectObjectA closer look at the procedure SelectObject shows that

two scenarios are possible (1)when this procedure is invokedfollowing the creation of a newnode of the tree (which is doneeither in Step (3) or in Step (6) of BampB) (2)when a call to theprocedure ismade after a backtracking stepThe set119881 initiallyis empty and therefore Steps (2) to (5) of SelectObject becomeactive only in the first of these two scenarios Dependingon the current level of the tree 119903 one of the three cases isconsidered If the current node is the root that is 119903 = 0 thenStep (2) is executed Since in this case the dominance test isnot applicable the algorithm uses the set119881 comprising all theobjects under consideration An arbitrary object selected inStep (4) is then used to create the first branch emanating fromthe root of the tree If 119903 isin [1 119899 minus 2] then the algorithmproceeds to Step (3) The task of this step is to evaluate allpartial solutions that can be obtained from the current partialpermutation119901by assigning an object from119882(119901) to the (119903 + 1)

th position in 119901 In order not to lose at least one optimalsolution it is sufficient to keep only those extensions of 119901 forwhich all the tests were successful Such partial solutions arememorized by the set 119881 One of them is chosen in Step (4)

to be used (in Step (6) of BampB) to generate the first son ofthe current node If however no promising partial solutionhas been found and thus the set 119881 is empty then the currentnode can be discarded and the search continued from itsparent node Finally if 119903 = 119899 minus 1 then Step (5) is executedin which the permutation 119901 is completed by placing theremaining unassigned object in the last position of119901 Its valueis compared with that of the best solution found so far Thenode corresponding to 119901 is a leaf and therefore is fathomedIn the second of the above-mentioned two scenarios (Step(1) of the procedure) the already existing nonempty set 119881is scanned The previously selected objects in 119881 have beenmarked They are ignored For each remaining object 119894 it ischecked whether 119906

119894(119901) gt 119865

lowast If so then 119894 is included inthe set 1198811015840 If the resulting set 1198811015840 is empty then the currentnode of the tree becomes fathomed and the procedure goesto Step (6) Otherwise the procedure chooses a branchingobject and passes it to its caller BampB Our implementationof the algorithm adopts a very simple enumeration strategybranch upon an arbitrary object in 119881

1015840 (or 119881 when Step (4) isperformed)

It is useful to note that the tests in Step (3) are applied inincreasing order of time complexityThe symmetry test is thecheapest of them whereas the upper bound test is the mostexpensive to perform but still effective

6 Computational Results

The main goal of our numerical experiments was to pro-vide some measure of the performance of the developed

Journal of Applied Mathematics 9

BampB algorithm as well as to demonstrate the benefits ofusing the new upper bound and efficient dominance testthroughout the solution process For comparison purposeswe also implemented the better of the two branch-and-bound algorithms of Brusco and Stahl [13] In their studythis algorithm is denoted as BB3 When referring to it inthis paper we will use the same name In order to make thecomparison more fair we decided to start BB3 by invokingthe ITS heuristic Thus the initial solution from which thebranch-and-bound phase of each of the algorithms starts isexpected to be of the same quality

61 Experimental Setup Both our algorithm and BB3 havebeen coded in the C programming language and all the testshave been carried out on a PC with an Intel Core 2 DuoCPU running at 30GHz As a testbed for the algorithmsthree empirical dissimilarity matrices from the literature inaddition to a number of randomly generated ones wereconsidered The random matrices vary in size from 20 to30 objects All off-diagonal entries are drawn uniformly atrandom from the interval [1 100] The matrices of courseare symmetric

We also used three dissimilarity matrices constructedfrom empirical data found in the literature The first matrixis based onMorse code confusion data collected by Rothkopf[26] In his experiment the subjects were presented pairsof Morse code signals and asked to respond whether theythought the signals were the same or differentThese answerswere translated to the entries of the similarity matrix withrows and columns corresponding to the 36 alphanumericcharacters (26 letters in the Latin alphabet and 10 digits in thedecimal number system) We should mention that we foundsmall discrepancies between Morse code similarity matricesgiven in various publications and electronic media We havedecided to take the matrix from [27] where it is presentedwith a reference to Shepard [28] Let thismatrix be denoted by119872 = (119898

119894119895) Its entries are integers from the interval [0 100]

Like in the study of Brusco and Stahl [13] the correspondingsymmetric dissimilarity matrix 119863 = (119889

119894119895) was obtained by

using the formula 119889119894119895= 200 minus 119898

119894119895minus 119898119895119894for all off-diagonal

entries and setting the diagonal of 119863 to zero We consideredtwoUDSP instances constructed usingMorse code confusiondatamdashone defined by the 26 times 26 submatrix of119863 with rowsand columns labeled by the letters and another defined by thefull matrix119863

The second empirical matrix was taken from Groenenand Heiser [29] We denote it by 119866 = (119892

119894119895) Its entry 119892

119894119895

gives the number of citations in journal 119894 to journal 119895 Thusthe rows of 119866 correspond to citing journals and the columnscorrespond to cited journals The dissimilarity matrix 119863 wasderived from 119866 in two steps like in [13] First a symmetricsimilarity matrix 119867 = (ℎ

119894119895) was calculated by using the

formulas ℎ119894119895

= ℎ119895119894

= (119892119894119895sum119899

119897=1l = 119894 119892119894119897) + (119892119895119894sum119899

119897=1119897 = 119895119892119895119897)

119894 119895 = 1 119899 119894 lt 119895 and ℎ119894119894

= 0 119894 = 1 119899 Then 119863

was constructed with entries 119889119894119895

= lfloor100(max119897119896ℎ119897119896

minus ℎ119894119895)rfloor

119894 119895 = 1 119899 119894 = 119895 and 119889119894119894= 0 119894 = 1 119899

The third empirical matrix used in our experimentswas taken from Hubert et al [30] Its entries represent the

dissimilarities between pairs of food items As mentioned byHubert et al [30] this 45 times 45 food-item matrix 119863foodwas constructed based on data provided by Ross andMurphy[31] For our computational tests we considered submatricesof 119863food defined by the first 119899 isin 20 21 35 food itemsAll the dissimilarity matrices we just described as well asthe source code of our algorithm are publicly available athttpwwwproinktultsimgintarasudshtml

Based on the results of preliminary experiments with theBampB algorithm we have fixed the values of its parameters120591 = 10 119877

1= 0 and 119877

2= 5 In the symmetry test119908 was fixed

at 1 The cutoff time for ITS was 1 second

62 Results The first experiment was conducted on a seriesof randomly generated full density dissimilarity matricesThe results of BampB and BB3 for them are summarized inTable 1 The dimension of the matrix 119863 is encoded in theinstance name displayed in the first column The secondcolumn of the table shows for each instance the optimalvalue of the objective function 119865

lowast and in parentheses tworelated metrics The first of them is the raw value of thestress function Φ(119909) The second number in parentheses isthe normalized value of the stress function that is calculatedasΦ(119909)sum

119894lt119895119889119894119895 The next two columns list the results of our

algorithm the number of nodes in the search tree and theCPU time reported in the form hours minutes seconds orminutes seconds or seconds The last two columns give thesemeasures for BB3

As it can be seen from the table BampB is superior to BB3We observe for example that the decrease in CPU time overBB3 is around 40 for the largest three instances Also ouralgorithm builds smaller search trees than BB3

In our second experiment we opt for three empiricaldissimilarity matrices mentioned in Section 61 Table 2 sum-marizes the results of BampB and BB3 on the UDSP instancesdefined by these matrices Its structure is the same as that ofTable 1 In the instance names the letter ldquoMrdquo stands for theMorse codematrix ldquoJrdquo stands for the journal citationsmatrixand ldquoFrdquo stands for the food item data As before the numberfollowing the letter indicates the dimension of the problemmatrix Comparing results in Tables 1 and 2 we find thatfor empirical data the superiority of our algorithm over BB3is more pronounced Basically BampB performs significantlybetter than BB3 for each of the three data sets For examplefor the food-item dissimilarity matrix of size 30 times 30 BampBbuilds almost 10 times smaller search tree and is about 5 timesfaster than BB3

Table 3 presents the results of the performance of BampBon the full Morse code dissimilarity matrix as well as onlarger submatrices of the food-item dissimilarity matrixRunning BB3 on these UDSP instances is too expensive sothis algorithm is not included in the table By analyzing theresults in Tables 2 and 3 we find that for 119894 isin 22 35 theCPU time taken by BampB to solve the problem with the 119894 times 119894

food-item submatrix is between 196 (for 119894 = 30) and 282 (for119894 = 27) percent of that needed to solve the problem with the(119894 minus1)times (119894minus1) food-item submatrix For the submatrix of size35 times 35 the CPU time taken by the algorithm was more than

10 Journal of Applied Mathematics

Table 1 Performance on random problem instances

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

p20 9053452 (1797554 0284) 1264871 44 1526725 56p21 9887394 (1879277 0285) 3304678 106 4169099 145p22 12548324 (2245406 0282) 6210311 205 8017112 290p23 14632006 (2396198 0274) 12979353 450 16624663 1067p24 16211210 (2815859 0294) 31820463 1533 42435561 2555p25 15877026 (3126700 0330) 103208158 6446 137525505 9524p26 20837730 (3471478 0302) 200528002 13354 267603807 20436p27 21464870 (3585958 0311) 287794935 20377 392405143 32365p28 23938264 (3961360 0317) 581602877 44093 812649051 113054p29 28613820 (4537786 0315) 2047228373 250572 3044704554 438570p30 30359134 (4645569 0315) 3555327537 509099 5216804247 855046

Table 2 Performance on empirical dissimilarity matrices

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

M26 180577772 (19448891 0219) 108400003 7432 201264785 16143J28 60653172 (7167033 0249) 2019598593 348006 6780517853 1039066F20 19344116 (1051432 0098) 236990 27 1971840 70F21 22694178 (1229979 0102) 568998 52 4716777 162F22 26496098 (1495949 0110) 1178415 106 9993241 357F23 30853716 (1757681 0116) 2524703 252 24357098 1378F24 35075372 (2174712 0130) 6727692 1001 51593000 3304F25 40091544 (2491002 0134) 14560522 2283 139580891 10014F26 45507780 (2886048 0142) 40465339 6553 298597411 23188F27 51859210 (3336825 0148) 116161004 19298 670492972 55578F28 58712130 (3788642 0153) 253278124 44358 1366152797 203265F29 66198704 (4203981 0156) 530219761 142265 3616257778 532084F30 74310052 (4629353 0157) 972413816 320566 9360343618 1556412

14 days In fact a solution of value 120526370was found by theITS heuristic in less than 1 sThe rest of the time was spent onproving its optimalityWe also see fromTable 3 that theUDSPwith the full Morse code dissimilarity matrix required nearly21 days to solve to proven optimality using our approach

In Table 4 we display optimal solutions for the twolargest UDSP instances we have tried The second and fourthcolumns of the table contain the optimal permutations forthese instances The coordinates for objects calculated using(4) are given in the third and fifth columns The number inparenthesis in the fourth column shows the order in whichthe food items appear within Table 51 in the book by Hubertet al [30] From Table 4 we observe that only a half of thesignals representing digits in theMorse code are placed at theend of the scale Other such signals are positioned betweensignals encoding letters We also see that shorter signals tendto occupy positions at the beginning of the permutation 119901

In Table 5 we evaluate the effectiveness of tests used tofathom partial solutions Computational results are reportedfor a selection of UDSP instances The second third andfourth columns of the table present 100119873sym119873 100119873dom119873and 100119873UB119873 where119873sym (respectively119873dom and119873UB) isthe number of partial permutations that are fathomed due

to symmetry test (respectively due to dominance test anddue to upper bound test) in Step (3) of SelectObject and119873 = 119873sym+119873dom+119873UBThe last three columns give the sameset of statistics for the BB3 algorithm Inspection of Table 5reveals that119873dom is much larger than both119873sym and119873UB Itcan also be seen that119873dom values are consistently higher withBB3 thanwhen using our approach Comparing the other twotests we observe that 119873sym is larger than 119873UB in all casesexcept when applying BampB to the food-item dissimilaritysubmatrices of size 20 times 20 and 25 times 25 Generally we noticethat the upper bound test is most successful for the UDSPinstances defined by the food-item matrix

We end this section with a discussion of two issuesregarding the use of heuristic algorithms for the UDSP Froma practical point of view heuristics of course are preferableto exact methods as they are fast and can be applied to largeproblem instances If carefully designed they are able toproduce solutions of very high quality The development ofexact methods on the other hand is a theoretical avenue forcoping with different problems In the field of optimizationexact algorithms for a problem of interest provide not only asolution but also a certificate of its optimality Therefore oneof possible applications of exact methods is to use them in

Journal of Applied Mathematics 11

Table 3 Performance of BampB on larger empirical dissimilarity matrices

Instance Objective function value Number of nodes TimeF31 82221440 (5194455 0164) 2469803824 857038F32 91389180 (5703241 0166) 5040451368 1944127F33 100367800 (6412395 0174) 14318454604 5524200F34 110065196 (7104054 0180) 33035826208 12936500F35 120526370 (7841564 0185) 82986984742 34256456M36 489111494 (40682956 0230) 243227368848 49451175

Table 4 Optimal solutions for the full Morse code dissimilarity matrix and the 35 times 35 food-item dissimilarity matrix

119894

Morse code full Food item data119901(119894) 119909

119901(119894)119901(119894) 119909

119901(119894)

1 (E) ∙ 000000 orange (3) 0000002 (T) minus 583333 watermelon (2) 0285713 (I) ∙∙ 2477778 apple (1) 0285714 (A) ∙minus 3408333 banana (4) 1285715 (N) minus∙ 4411111 pineapple (5) 1628576 (M) minusminus 5233333 lettuce (6) 21257147 (S) ∙ ∙ ∙ 7030556 broccoli (7) 22314298 (U) ∙ ∙ minus 8313889 carrots (8) 22542869 (R) ∙ minus ∙ 9347222 corn (9) 228857110 (W) ∙ minus minus 10297222 onions (10) 251142911 (H) ∙ ∙ ∙∙ 11444444 potato (11) 299714312 (D) minus ∙ ∙ 12430556 rice (12) 516285713 (K) minus ∙ minus 13152778 spaghetti (19) 589714314 (V) ∙ ∙ ∙minus 14500000 bread (13) 644857115 (5) ∙ ∙ ∙ ∙ ∙ 15244444 bagel (14) 698571416 (4) ∙ ∙ ∙ ∙ minus 15991667 cereal (16) 721142917 (F) ∙ ∙ minus∙ 17075000 oatmeal (15) 725142918 (L) ∙ minus ∙∙ 18225000 pancake (18) 763714319 (B) minus ∙ ∙∙ 18952778 muffin (17) 778000020 (X) minus ∙ ∙minus 19955556 crackers (20) 889714321 (6) minus ∙ ∙ ∙ ∙ 20566667 granola bar (21) 930000022 (3) ∙ ∙ ∙ minus minus 22013889 pretzels (22) 1007428623 (C) minus ∙ minus∙ 22947222 nuts (24) 1043714324 (Y) minus ∙ minusminus 23841667 popcorn (23) 1076285725 (7) minus minus ∙ ∙ ∙ 24930556 potato chips (25) 1124857126 (Z) minus minus ∙∙ 25488889 doughnuts (26) 1200857127 (Q) minus minus ∙minus 26438889 pizza (31) 1265142928 (P) ∙ minus minus∙ 27083333 cookies (27) 1368857129 (J) ∙ minus minusminus 28294444 chocolate bar (29) 1394000030 (G) minus minus ∙ 29247222 cake (28) 1410571431 (O) minus minus minus 30022222 pie (30) 1435428632 (2) ∙ ∙ minus minus minus 31050000 ice cream (32) 1520000033 (8) minus minus minus ∙ ∙ 32036111 yogurt (33) 1571142934 (1) ∙ minus minus minus minus 33350000 butter (34) 1618000035 (9) minus minus minus minus ∙ 34186111 cheese (35) 1650857136 (0) minus minus minus minus minus 35027778

12 Journal of Applied Mathematics

Table 5 Relative performance of the tests

Instance B amp B BB3Sym Dom UB Sym Dom UB

p20 1158a 8824b 018c 1111a 8873b 016c

p25 886a 9087b 027c 848a 9143b 009c

p30 701a 9284 b 015c 629a 9364b 007c

M26 1140a 8847 b 013c 1044a 8952b 004c

J28 855a 8923 b 222c 568 a 9426b 006c

F20 801a 7563 b 1636c 1325a 8542b 133c

F25 704a 8127 b 1169c 1031a 8939b 030c

F30 895a 8391b 714c 774a 9216b 010c

F35 853a 8723b 424c mdasha mdashb mdashc

M36 882a 9114b 004c mdasha mdashb mdashc

aPercentage of partial solutions fathomed due to symmetry test

bPercentage of partial solutions fathomed due to dominance testcPercentage of partial solutions fathomed due to bound test

the process of assessment of heuristic techniques Specificallyoptimal values can serve as a reference point with which theresults of heuristics can be compared However obtaining anoptimality certificate in many cases including the UDSP isvery time consuming In order to see the time differencesbetween finding an optimal solution for UDSP with the helpof heuristics and proving its optimality (as it is done bythe branch-and-bound method described in this paper) werun the two heuristic algorithms iterated tabu search andsimulated annealing (SA) of Brusco [19] on all the probleminstances used in the main experiment For two randominstances namely p21 and p25 SA was trapped in a localmaximumTherefore we have enhanced our implementationof SA by allowing it to run in a multi start mode Thisvariation of SA located an optimal solution for the above-mentioned instances in the third and respectively secondrestarts Both tested heuristics have proven to be very fastIn particular ITS took in total only 295 seconds to findthe global optima for all 30 instances we experimented withPerforming the same task by SA required 544 seconds Suchtimes are in huge contrast to what is seen in Tables 1ndash3 Basically the long computation times recorded in thesetables are the price we have to pay for obtaining not only asolution but also a guarantee of its optimality We did notperform additional computational experiments with ITS andSA because investigation of heuristics for theUDSP is outsidethe scope of this paper

Another issue concerns the use of heuristics for providinginitial solutions for the branch-and-bound technique Wehave employed the iterated tabu search procedure for thispurpose Our choice was influenced by the fact that wecould adopt in ITS the same formulas as those used inthe dominance test which is a key component of ourapproach Obviously there are other heuristics one canapply in place of ITS As it follows from the discussionin the preceding paragraph a good candidate for this isthe simulated annealing algorithm of Brusco [19] Supposethat BampB is run twice on the same instance from differentinitial solutions both of which are optimal Then it shouldbe clear that in both cases the same number of tree nodes

is explored This essentially means that any fast heuristiccapable of providing optimal solutions is perfect to be usedin the initialization step of BampB An alternative strategyis to use in this step randomly generated permutationsThe effects of this simple strategy are mixed For examplefor problem instances F25 and F27 the computation timeincreased from 1483 and respectively 11698 seconds (seeTable 2) to 4133 and respectively 20625 seconds Howeverfor example for p25 and M26 a somewhat different picturewas observed In the case of random initial solution BampBtook 4075 and 4687 seconds for these instances respectivelyThe computation times in Table 1 for p25 and Table 2 forM26are only marginally shorter One possible explanation of thislies in the fact that as Table 5 indicates the percentage ofpartial solutions fathomed due to bound test for both p25 andM26 is quite low However in general it is advantageous touse a heuristic for obtaining an initial solution for the branch-and-bound algorithm especially because this solution can beproduced at a negligible cost with respect to the rest of thecomputation

7 Conclusions

In this paper we have presented a branch-and-bound algo-rithm for the least-squares unidimensional scaling problemThe algorithm incorporates a LAP-based upper bound testas well as a dominance test which allows reducing theredundancy in the search process drastically The results ofcomputational experiments indicate that the algorithm canbe used to obtain provably optimal solutions for randomlygenerated dissimilarity matrices of size up to 30 times 30 andfor empirical dissimilarity matrices of size up to 35 times 35In particular for the first time the UDSP instance with the36 times 36 Morse code dissimilarity matrix has been solved toguaranteed optimality Another important instance is definedby the 45 times 45 food-item dissimilarity matrix 119863food It is agreat challenge to the optimization community to develop anexact algorithm that will solve this instance Our branch-and-bound algorithm is limited to submatrices of 119863food of size

Journal of Applied Mathematics 13

around 35 times 35 We also remark that optimal solutions forall UDSP instances used in our experiments can be found byapplying heuristic algorithms such as iterated tabu search andsimulated annealing proceduresThe total time taken by eachof these procedures to reach the global optima is only a fewseconds An advantage of the branch-and-bound algorithmis its ability to certify the optimality of the solution obtainedin a rigorous manner

Before closing there are two more points worth of men-tion First we proposed a simple yet effective mechanismfor making a decision about when it is useful to apply abound test to the current partial solution and when sucha test most probably will fail to discard this solution Forthat purpose a sample of partial solutions is evaluated in thepreparation step of BampBWe believe that similar mechanismsmay be operative in branch-and-bound algorithms for otherdifficult combinatorial optimization problems Second wedeveloped an efficient procedure for exploring the pairwiseinterchange neighborhood of a solution in the search spaceThis procedure can be useful on its own in the design ofheuristic algorithms for unidimensional scaling especiallythose based on the local search principle

Appendix

Derivation of (20)We will deduce an expression for 120575

119903119896 1 ⩽ 119896 lt 119903 ⩽ 119899

assuming that 119901 is a solution to the whole problem thatis 119901 isin Π Relocating the object 119901(119903) from position 119903 toposition 119896 gives the permutation 119901

1015840= (119901(1) 119901(119896 minus

1) 119901(119903) 119901(119896) 119901(119899)) By using the definition of 119865 we canwrite

120575119903119896

= 119865 (1199011015840) minus 119865 (119901)

= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119896

119889119901(119903)119901(119897)

)

2

minus(

119903minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

+

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

+2119889119901(119903)119901(119897)

)

2

minus

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

(A1)

Let us denote the first two terms in (A1) by 1205751015840 and the

remaining two terms by 12057510158401015840 = sum119903minus1

119897=11989612057510158401015840

119897 It is easy to see that 1205751015840

stems frommoving the object 119901(119903) to position 119896 and 12057510158401015840 stemsfrom shifting objects 119901(119897) 119897 = 119896 119903 minus 1 by one position to

the right To get a simpler expression for 1205751015840 we perform thefollowing manipulations

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

(

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2(

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A2)

Continuing we get

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A3)

Finally

1205751015840= minus 4

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

= minus 4(119891119901(119903)

minus

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

(119878119901(119903)

minus 119891119901(119903)

)

= 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

((

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

) minus 2119891119901(119903)

+ 119878119901(119903)

)

(A4)

14 Journal of Applied Mathematics

Similarly for 12057510158401015840119897we have

12057510158401015840

119897= (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

+ 4119889119901(119903)119901(119897)

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

+ 41198892

119901(119903)119901(119897)minus (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

= 4119889119901(119903)119901(119897)

(119891119901(119897)

minus 119878119901(119897)

+ 119891119901(119897)

) + 41198892

119901(119903)119901(119897)

= 4119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(A5)

By summing up (A4) and (A5) for 119897 = 119896 119903 minus 1 we obtain(20)

References

[1] I Borg and P J F Groenen Modern Multidimensional ScalingTheory and Appplications Springer New York NY USA 2ndedition 2005

[2] L Hubert P Arabie and J Meulman The Structural Repre-sentation of Proximity Matrices with MATLAB vol 19 SIAMPhiladelphia Pa USA 2006

[3] S Meisel and D Mattfeld ldquoSynergies of Operations Researchand Data Miningrdquo European Journal of Operational Researchvol 206 no 1 pp 1ndash10 2010

[4] G Caraux and S Pinloche ldquoPermutMatrix a graphical envi-ronment to arrange gene expression profiles in optimal linearorderrdquo Bioinformatics vol 21 no 7 pp 1280ndash1281 2005

[5] M K Ng E S Fung and H-Y Liao ldquoLinkage disequilibriummap by unidimensional nonnegative scalingrdquo in Proceedings ofthe 1st International Symposium on Optimization and SystemsBiology (OSB rsquo07) pp 302ndash308 Beijing China 2007

[6] P I Armstrong N A Fouad J Rounds and L HubertldquoQuantifying and interpreting group differences in interestprofilesrdquo Journal of Career Assessment vol 18 no 2 pp 115ndash1322010

[7] L Hubert and D Steinley ldquoAgreement among supreme courtjustices categorical versus continuous representationrdquo SIAMNews vol 38 no 7 2005

[8] A Dahabiah J Puentes and B Solaiman ldquoDigestive casebasemining based on possibility theory and linear unidimensionalscalingrdquo in Proceedings of the 8th WSEAS International Con-ference on Artificial Intelligence Knowledge Engineering andData Bases (AIKED rsquo09) pp 218ndash223 World Scientific andEngineering Academy and Society (WSEAS) Stevens PointWis USA 2009

[9] A Dahabiah J Puentes and B Solaiman ldquoPossibilistic patternrecognition in a digestive database for mining imperfect datardquoWSEASTransactions on Systems vol 8 no 2 pp 229ndash240 2009

[10] Y Ge Y Leung and J Ma ldquoUnidimensional scaling classifierand its application to remotely sensed datardquo inProceedings of theIEEE International Geoscience and Remote Sensing Symposium(IGARSS rsquo05) pp 3841ndash3844 July 2005

[11] J H Bihn M Verhaagh M Brandle and R Brandl ldquoDosecondary forests act as refuges for old growth forest animalsRecovery of ant diversity in the Atlantic forest of BrazilrdquoBiological Conservation vol 141 no 3 pp 733ndash743 2008

[12] D Defays ldquoA short note on a method of seriationrdquo BritishJournal of Mathematical and Statistical Psychology vol 31 no1 pp 49ndash53 1978

[13] M J Brusco and S Stahl ldquoOptimal least-squares unidimen-sional scaling improved branch-and-bound procedures andcomparison to dynamic programmingrdquo Psychometrika vol 70no 2 pp 253ndash270 2005

[14] L J Hubert and P Arabie ldquoUnidimensional scaling and com-binatorial optimizationrdquo in Multidimensional Data Analysis Jde Leeuw W J Heiser J Meulman and F Critchley Eds pp181ndash196 DSWO Press Leiden The Netherlands 1986

[15] L J Hubert P Arabie and J J Meulman ldquoLinear unidimen-sional scaling in the 119871

2-norm basic optimization methods

usingMATLABrdquo Journal of Classification vol 19 no 2 pp 303ndash328 2002

[16] M J Brusco and S Stahl Branch-and-Bound Applicationsin Combinatorial Data Analysis Springer Science + BusinessMedia New York NY USA 2005

[17] V Pliner ldquoMetric unidimensional scaling and global optimiza-tionrdquo Journal of Classification vol 13 no 1 pp 3ndash18 1996

[18] A Murillo J F Vera and W J Heiser ldquoA permutation-translation simulated annealing algorithm for 119871

1and 119871

2uni-

dimensional scalingrdquo Journal of Classification vol 22 no 1 pp119ndash138 2005

[19] M J Brusco ldquoOn the performance of simulated annealing forlarge-scale 119871

2unidimensional scalingrdquo Journal of Classification

vol 23 no 2 pp 255ndash268 2006[20] M J Brusco H F Kohn and S Stahl ldquoHeuristic imple-

mentation of dynamic programming for matrix permutationproblems in combinatorial data analysisrdquo Psychometrika vol73 no 3 pp 503ndash522 2008

[21] G Palubeckis ldquoIterated tabu search for the maximum diversityproblemrdquo Applied Mathematics and Computation vol 189 no1 pp 371ndash383 2007

[22] G Palubeckis ldquoA new bounding procedure and an improvedexact algorithm for the Max-2-SAT problemrdquo Applied Mathe-matics and Computation vol 215 no 3 pp 1106ndash1117 2009

[23] J Kluabwang D Puangdownreong and S Sujitjorn ldquoMultipathadaptive tabu search for a vehicle control problemrdquo Journal ofApplied Mathematics vol 2012 Article ID 731623 20 pages2012

[24] N Sarasiri K Suthamno and S Sujitjorn ldquoBacterial foraging-tabu search metaheuristics for identification of nonlinear fric-tion modelrdquo Journal of Applied Mathematics vol 2012 ArticleID 238563 23 pages 2012

[25] A Jouglet and J Carlier ldquoDominance rules in combinato-rial optimization problemsrdquo European Journal of OperationalResearch vol 212 no 3 pp 433ndash444 2011

[26] E Z Rothkopf ldquoA measure of stimulus similarity and errors insome paired-associate learning tasksrdquo Journal of ExperimentalPsychology vol 53 no 2 pp 94ndash101 1957

[27] I Duntsch and G Gediga ldquoModal-Style Operators in Qual-itative Data Analysisrdquo Tech Rep CS-02-15 Department ofComputer Science Brock University St Catharines OntarioCanada 2002

[28] R N Shepard ldquoAnalysis of proximities as a technique for thestudy of information processing in manrdquoHuman factors vol 5pp 33ndash48 1963

Journal of Applied Mathematics 15

[29] P J F Groenen and W J Heiser ldquoThe tunneling method forglobal optimization in multidimensional scalingrdquo Psychome-trika vol 61 no 3 pp 529ndash550 1996

[30] L Hubert P Arabie and JMeulmanCombinatorial Data Anal-ysis Optimization by Dynamic Programming SIAM Philadel-phia Pa USA 2001

[31] B H Ross and G L Murphy ldquoFood for thought cross-classification and category organization in a complex real-worlddomainrdquo Cognitive Psychology vol 38 no 4 pp 495ndash553 1999

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 3: Research Article An Improved Exact Algorithm for Least-Squares Unidimensional …downloads.hindawi.com/journals/jam/2013/890589.pdf · 2019-07-31 · by a 36×36 Morse code dissimilarity

Journal of Applied Mathematics 3

time which is an improvement over a standard 119874(1198993)-time

approach from the literatureThe same formulas are also usedin the dominance test We present the results of empiricalevaluation of our branch-and-bound algorithm and compareit against the better of the two branch-and-bound algorithmsof Brusco and Stahl [13] The largest UDSP instance that wewere able to solvewith our algorithmwas the problemdefinedby a 36 times 36Morse code dissimilarity matrix

The remainder of this paper is organized as follows InSection 2 we present an upper bound on solutions of theUDSP In Section 3 we outline an effective tabu search pro-cedure for unidimensional scaling and propose a method forexploring the pairwise interchange neighborhood Section 4gives a description of the dominance test for pruning nodesin the searchThis test is used in our branch-and-bound algo-rithm presented in Section 5 The results of computationalexperiments are reported in Section 6 Finally Section 7concludes the paper

2 Upper Bounds

In this section we first briefly recall the bounds used byBrusco and Stahl [13] in their second branch-and-boundalgorithm Then we present a new upper bounding mecha-nism for the UDSP We discuss the technical details of theprocedure for computing our bound We choose a variationof this procedure which yields slightly weakened bounds butis quite fast

Like in the second algorithm of Brusco and Stahl in ourimplementation of the branch-and-bound method partialsolutions are constructed by placing each new object in theleftmost free position For a partial solution 119901 = (119901(1)

119901(119903)) 119903 isin 1 119899 minus 1 we denote by 119882(119901) the set of theobjects in 119901 Thus 119882(119901) = 119901(1) 119901(119903) and 119882(119901) =

119882 119882(119901) is the set of unselected objects At the root nodeof the search tree 119901 is empty and 119882(119901) is equal to 119882 Thefirst upper bound used by Brusco and Stahl when adopted to119901 takes the following form

119880BS = 119865119903(119901) + sum

119894isin119882(119901)

1198782

119894 (5)

where 119865119903(119901) denotes the sum of the terms in (3) corre-

sponding to assigned objects Thus 119865119903(119901) = sum

119903

119896=1(119878119901(119896)

minus

2sum119896minus1

119897=1119889119901(119896)119901(119897)

)2

To describe the second bound we define 119863 = (119889119894119895) to

be an 119899 times (119899 minus 1) matrix obtained from 119863 by sorting thenondiagonal entries of each row of119863 in nondecreasing orderLet 120578119894119896

= (119878119894minus 2sum

119896minus1

119895=1

119889119894119895)2 and 120583

119896= max

1⩽119894⩽119899120578119894119896for 119896 =

1 lceil1198992rceil For other values of 119896 120583119896is obtained from the

equality 120583119896= 120583119899minus119896+1

119896 isin 1 119899 The second upper boundproposed in [13] is given by the following expression

119865119903(119901) +

119899

sum

119896=119903+1

120583119896 (6)

The previous outlined bounds are computationally attrac-tive but not strong enough when used to prune the search

space In order to get a tighter upper bound we resort to thelinear assignment model

max119899minus119903

sum

119894=1

119899minus119903

sum

119895=1

119890119894119895119910119894119895

st119899minus119903

sum

119895=1

119910119894119895= 1 119894 = 1 119899 minus 119903

119899minus119903

sum

119894=1

119910119894119895= 1 119895 = 1 119899 minus 119903

119910119894119895isin 0 1 119894 119895 = 1 119899 minus 119903

(7)

where 119890119894119895 119894 119895 = 1 119899 minus 119903 are entries of an (119899 minus 119903) times (119899 minus

119903) profit matrix 119864 with rows and columns indexed by theobjects in the set 119882(119901) Next we will construct the matrix119864 For a permutation 119901 isin Π we denote by 119891

119901(119896)(resp

1198911015840

119901(119896)) the sum of dissimilarities between object 119901(119896) and all

objects preceding (resp following) 119901(119896) in 119901 Thus 119891119901(119896)

=

sum119896minus1

119897=1119889119901(119896)119901(119897)

1198911015840119901(119896)

= sum119899

119897=119896+1119889119901(119896)119901(119897)

From (3) we have

119865 (119901) =

119899

sum

119896=1

(2max (119891119901(119896)

1198911015840

119901(119896)) minus 119878119901(119896)

)

2

(8)

and 2max(119891119901(119896)

1198911015840

119901(119896)) ⩾ 119878

119901(119896)= 119891119901(119896)

+ 1198911015840

119901(119896)for each 119896 =

1 119899 Suppose now that 119901 = (119901(1) 119901(119903)) is a partialsolution andΠ(119901) is the set of all 119899-permutations that can beobtained by extending 119901 Assume for the sake of simplicitythat 119882(119901) = 1 119899 minus 119903 In order to construct a suitablematrix 119864 we have to bound 119891

119901(119896)and 119891

1015840

119901(119896)in (8) from above

To this end for any 119894 isin 119882(119901) let (11988910158401198941 119889

1015840

119894119899minus119903minus1) denote the

list obtained by sorting 119889119894119895 119895 isin 119882(119901) 119894 in nonincreasing

order To get the required bounds we make use of the sums120574119894119895

= sum119895

119897=11198891015840

1198941 119894 isin 119882(119901) 119895 = 0 1 119899 minus 119903 minus 1 and 120588

119894119896=

sum119903

119897=1119889119894119901(119897)

+ 120574119894119896minus119903minus1

119894 isin 119882(119901) 119896 = 119903 + 1 119899 Suppose that119901 is any permutation in Π(119901) and 119901(119896) = 119894 for a position 119896 isin

119903 + 1 119899 Then it is easy to see that 119891119901(119896)

= 119891119894⩽ 120588119894119896and

1198911015840

119901(119896)= 1198911015840

119894⩽ 120574119894119899minus119896

Indeed in the case of 119891119901(119896)

for examplewe have 119891

119901(119896)= sum119903

119897=1119889119894119901(119897)

+ sum119896minus1

119897=119903+1119889119894119901(119897)

and the last term inthis expression is less than or equal to 120574

119894119896minus119903minus1 We define 119864 to

be the matrix with entries 119890119894119895= (2max(120588

119894119903+119895 120574119894119899minus119903minus119895

) minus 119878119894)2

119894 119895 = 1 119899minus119903 Let119880lowast denote the optimal value of the linearassignment problem (LAP) (7) with the profitmatrix119864 Fromthe definition of119864 we immediately obtain an upper bound onthe optimal value of the objective function 119865

Proposition 1 For a partial solution 119901 = (119901(1) 119901(119903))119880 = 119865

119903(119901) + 119880

lowast⩾ max

119901isinΠ(119901)119865(119901)

From the definition of 120588 and 120574 values it can be observedthat with the increase of 119896 120588

119894119896increases while 120574

119894119899minus119896

decreases Therefore when constructing the 119894th row of thematrix 119864 it is expedient to first find the largest 119896 isin 119903 +

1 119899 for which 120588119894119896

⩽ 120574119894119899minus119896

We denote such 119896 by 119896 and

4 Journal of Applied Mathematics

assume that 119896 = 119903 if120588119894119903+1

gt 120574119894119899minus119903minus1

Then 119890119894119896minus119903

= (2120574119894119899minus119896

minus119878119894)2

for 119896 = 119903+1 119896 and 119890

119894119896minus119903= (2120588119894119896minus119878119894)2 for 119896 =

119896+1 119899

The most time-consuming step in the computation of theupper bound 119880 is solving the linear assignment problem Inorder to provide a computationally less expensive boundingmethod we replace 119880

lowast with the value of a feasible solutionto the dual of the LAP (7) To get the bound we take120572119894

= max1⩽119896⩽119899minus119903

119890119894119896

= 119890119894119899minus119903

= 1198782

119894 119894 = 1 119899 minus 119903

120573119895

= max1⩽119894⩽119899minus119903

(119890119894119895minus 120572119894) 119895 = 1 119899 minus 119903 Clearly 120572

119894+

120573119895

⩾ 119890119894119895for each pair 119894 119895 = 1 119899 minus 119903 Thus the vector

(1205721 120572

119899minus119903 1205731 120573

119899minus119903) satisfies the constraints of the dual

of the LAP instance of interest Let = sum119899minus119903

119894=1120572119894+ sum119899minus119903

119895=1120573119895

Then from the duality theory of linear programming we canestablish the following upper bound for 119865(119901)

Proposition 2 For a partial solution 119901 = (119901(1) 119901(119903))1198801015840= 119865119903(119901) + ⩾ max

119901isinΠ(119901)119865(119901)

Proof The inequality follows from Proposition 1 and the factthat ⩾ 119880

lowast

One point that needs to be emphasized is that 1198801015840

strengthens the first bound proposed by Brusco and Stahl

Proposition 3 1198801015840 ⩽ 119880119861119878

Proof Indeed 119865119903(119901) +sum

119894isin119882(119901)1198782

119894= 119865119903(119901) +sum

119899minus119903

119894=1120572119894⩾ 119865119903(119901) +

The equality in the previous chain of expressions comesfrom the choice of 120572

119894and our assumption on119882(119901) whereas

the inequality is due to the fact that 120573119895

⩽ 0 for each 119895 =

1 119899 minus 119903

As an example we consider the following 4 times 4 dissimi-larity matrix borrowed from Defays [12]

[

[

[

[

minus 4 2 3

4 minus 7 2

2 7 minus 4

3 2 4 minus

]

]

]

]

(9)

The same matrix was also used by Brusco and Stahl [13] toillustrate their bounding procedures From Table 2 in theirpaper we find that the solution 119901 = (2 4 1 3) is optimal forthis instance of the UDSP The optimal value is equal to 388We compute the proposed upper bound for the whole Defaysproblem In this case 119882(119901) = 1 4 and 119865

119903(119901) = 0

The matrix 119864 is constructed by noting that 119896 = 2 for each119894 = 1 4 It takes the following form

[

[

[

[

81 25 25 81

169 81 81 169

169 81 81 169

81 25 25 81

]

]

]

]

(10)

Using the expressions for120572119894and120573119895 we find that120572

1= 1205724= 81

1205722= 1205723= 169 120573

1= 1205734= 0 and 120573

2= 1205733= minus56 Thus =

sum4

119894=1120572119894+ sum4

119895=1120573119895= 500 minus 112 = 388 = max

119901isinΠ119865(119901) Hence

our bound for the Defays instance is tight For comparisonthe bound (5) is equal to sum

4

119894=1120572119894= 500 To obtain the second

bound of Brusco and Stahl we first compute the 120583 valuesSince 120583

1= 1205834= 169 and 120583

2= 1205833= 81 the bound (6) (taking

the form sum4

119894=1120583119894) is also equal to 500

3 Tabu Search

As it is well known that the effectiveness of the bound testin branch-and-bound algorithms depends on the value of thebest solution found throughout the search process A goodstrategy is to create a high-quality solution before the searchis started For this task various heuristics can be applied Ourchoice was to use the tabu search (TS) technique It should benoted that it is not the purpose of this paper to investigateheuristic algorithms for solving the UDSP Therefore werestrict ourselves merely to presenting an outline of ourimplementation of tabu search

For 119901 isin Π let 119901119896119898

denote the permutation obtainedfrom 119901 by interchanging the objects in positions 119896 and 119898The set 119873

2(119901) = 119901

119896119898| 119896 119898 = 1 119899 119896 lt 119898 is

the 2-interchange neighborhood of 119901 At each iterationthe tabu search algorithm performs an exploration of theneighborhood 119873

2(119901) A subset of solutions in 119873

2(119901) are

evaluated by computing Δ(119901 119896119898) = 119865(119901119896119898

)minus119865(119901) In orderto get better quality solutions we apply TS iteratively Thistechnique called iterated tabu search (ITS) consists of thefollowing steps

(1) initialize the search with some permutation of theobjects

(2) apply a tabu search procedure Check if the termina-tion condition is satisfied If so then stop Otherwiseproceed to (3)

(3) apply a solution perturbation procedure Return to(2)

In our implementation the search is started from a ran-domly generated permutation Also some data are preparedto be used in the computation of the differences Δ(119901 119896119898)We will defer this issue until later in this section In step (2) ofITS at each iteration of the TS procedure the neighborhood1198732(119901) of the current solution 119901 is explored The procedure

selects a permutation 119901119896119898

isin 1198732(119901) which either improves

the best available objective function value so far or if thisdoes not happen has the largest value of Δ(119901 119894 119895) amongall permutations 119901

119894119895isin 1198732(119901) for which the objects in

positions 119894 and 119895 are not in the tabu list The permutation119901119896119898

is then accepted as a new current solution The numberof iterations in the tabu search phase is bounded aboveby a parameter of the algorithm The goal of the solutionperturbation phase is to generate a permutation which isemployed as the starting point for the next invocation ofthe tabu search procedure This is achieved by taking thecurrent permutation and performing a sequence of pairwiseinterchanges of objects Throughout this process each objectis allowed to change its position in the permutation at mostonce At each step first a set of feasible pairs of permutationpositions with the largest Δ values is formed Then a pairsay (119896119898) is chosen randomly from this set and objects inpositions 119896 and119898 are switchedWe omit some of the details of

Journal of Applied Mathematics 5

the algorithm Similar implementations of the ITS techniquefor other optimization problems are thoroughly describedfor example in [21 22] We should note that also otherenhancements of the generic tabu search strategy could beused for our purposes These include for example multipathadaptive tabu search [23] and bacterial foraging tabu search[24] algorithms

In the rest of this section we concentrate on howto efficiently compute the differences Δ(119901 119896119898) Certainlythis issue is central to the performance of the tabu searchprocedure But more importantly evaluation of pairwiseinterchanges of objects stands at the heart of the dominancetest which is one of the crucial components of our algorithmAs we will see in the next section the computation of the Δvalues is an operation the dominance test is based on

Brusco [19] derived the following formula to calculate thedifference between 119865(119901

119896119898) and 119865(119901)

Δ (119901 119896119898)

= (4

119898

sum

119897=119896+1

119889119901(119896)119901(119897)

)((

119898

sum

119897=119896+1

119889119901(119896)119901(119897)

) + 2119891119901(119896)

minus 119878119901(119896)

)

+ (4

119898minus1

sum

119897=119896

119889119901(119897)119901(119898)

)((

119898minus1

sum

119897=119896

119889119901(119897)119901(119898)

) minus 2119891119901(119898)

+ 119878119901(119898)

)

+

119898minus1

sum

119897=119896+1

4 (119889119901(119897)119901(119898)

minus 119889119901(119896)119901(119897)

) ((119889119901(119897)119901(119898)

minus 119889119901(119896)119901(119897)

)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(11)

Using (11) Δ(119901 119896119898) for all permutations in 1198732(119901) can be

computed in 119874(1198993) time However this approach is actually

too slow to be efficient In order to describe a more elaborateapproach we introduce three auxiliary 119899 times 119899 matrices 119860 =

(119886119894119902)119861 = (119887

119894119902) and119862 = (119888

119894119895) defined for a permutation119901 isin Π

Let us consider an object 119894 = 119901(119904) The entries of the 119894th rowof the matrices 119860 and 119861 are defined as follows if 119902 lt 119904 then119886119894119902= sum119904minus1

119897=119902119889119894119901(119897)

119887119894119902= sum119904minus1

119897=119902119889119894119901(119897)

(119889119894119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

) if 119902 gt 119904then 119886

119894119902= sum119902

119897=119904+1119889119894119901(119897)

119887119894119902= sum119902

119897=119904+1119889119894119901(119897)

(minus119889119894119901(119897)

+2119891119901(119897)

minus119878119901(119897)

)if 119902 = 119904 then 119886

119894119902= 119887119894119902= 0 Turning now to the matrix 119862 we

suppose that 119895 = 119901(119905) Then the entries of 119862 are given by thefollowing formula

119888119894119895=

max(119904119905)minus1sum

119897=min(119904119905)+1119889119894119901(119897)

119889119895119901(119897)

if |119904 minus 119905| gt 1

0 if |119904 minus 119905| = 1

(12)

Unlike 119860 and 119861 the matrix 119862 is symmetric We use thematrices 119860 119861 and 119862 to cast (11) in a form more suitable forboth the TS heuristic and the dominance test

Proposition 4 For 119901 isin Π 119896 isin 1 119899 minus 1 and 119898 isin 119896 +

1 119899

Δ (119901 119896119898) = 4119886119901(119896)119898

(119886119901(119896)119898

+ 2119886119901(119896)1

minus 119878119901(119896)

)

+ 4119886119901(119898)119896

(119886119901(119898)119896

minus 2119886119901(119898)1

+ 119878119901(119898)

)

+ 4 (119887119901(119898)119896+1

minus 119887119901(119896)119898minus1

minus 2119888119901(119896)119901(119898)

)

(13)

Proof First notice that the sums sum119898

119897=119896+1119889119901(119896)119901(119897)

and sum119898minus1

119897=119896

119889119901(119897)119901(119898)

in (11) are equal respectively to 119886119901(119896)119898

and 119886119901(119898)119896

Next it can be seen that 119891

119901(119896)= 119886119901(119896)1

and 119891119901(119898)

= 119886119901(119898)1

Finally the third term in (11) ignoring the factor of 4 can berewritten as119898minus1

sum

119897=119896+1

119889119901(119897)119901(119898)

(119889119901(119897)119901(119898)

+ 2119891119901(119897)

minus 119878119901(119897)

)

minus

119898minus1

sum

119897=119896+1

119889119901(119897)119901(119898)

119889119901(119896)119901(119897)

minus

119898minus1

sum

119897=119896+1

119889119901(119896)119901(119897)

119889119901(119897)119901(119898)

minus

119898minus1

sum

119897=119896+1

119889119901(119896)119901(119897)

(minus119889119901(119896)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(14)

It is easy to see that the first sum in (14) is equal to 119887119901(119898)119896+1

each of the next two sums is equal to 119888

119901(119896)119901(119898) and the last sum

is equal to 119887119901(119896)119898minus1

After simple manipulations we arrive at(13)

Having the matrices 119860 119861 and 119862 and using (13) the 2-interchange neighborhood of 119901 can be explored in 119874(119899

2)

time The question that remains to be addressed is howefficiently the entries of these matrices can be updated afterthe replacement of 119901 by a permutation 119901

1015840 chosen from theneighborhood 119873

2(119901) Suppose that 1199011015840 is obtained from 119901 by

interchanging the objects in positions 119896 and119898 Then we canwrite the following formulas that are involved in updating thematrices 119860 and 119861

119886119894119902= 119886119894119902minus1

+ 1198891198941199011015840(119902) (15)

119886119894119902= 119886119894119902+1

+ 1198891198941199011015840(119902) (16)

119887119894119902= 119887119894119902minus1

+ 1198891198941199011015840(119902)

(21198861199011015840(119902)1

minus 1198781199011015840(119902)

minus 1198891198941199011015840(119902)) (17)

119887119894119902= 119887119894119902+1

+ 1198891198941199011015840(119902)

(21198861199011015840(119902)1

minus 1198781199011015840(119902)

+ 1198891198941199011015840(119902)) (18)

From now on we will assume that the positions 119904 and 119905 of theobjects 119894 and 119895 are defined with respect to the permutation 119901

1015840More formally we have 119894 = 119901

1015840(119904) 119895 = 119901

1015840(119905)The procedure for

updating 119860 119861 and 119862 is summarized in the following

(1) for each object 119894 isin 1 119899 do the following

(11) if 119904 lt 119896 then compute 119886119894119902 119902 = 119896 119898 minus 1 by

(15) and 119887119894119902 119902 = 119896 119899 by (17)

(12) if 119904 gt 119898 then compute 119886119894119902 119902 = 119898 119898minus1 119896+

1 by (16) and 119887119894119902 119902 = 119898119898 minus 1 1 by (18)

(13) if 119896 lt 119904 lt 119898 then compute 119886119894119902by (15) for 119902 =

119898 119899 and by (16) for 119902 = 119896 119896 minus 1 1

6 Journal of Applied Mathematics

(14) if 119904 = 119896 or 119904 = 119898 then first set 119886119894119904

= 0

and afterwards compute 119886119894119902by (15) for 119902 = 119904 +

1 119899 and by (16) for 119902 = 119904 minus 1 119904 minus 2 1(15) if 119896 ⩽ 119904 ⩽ 119898 then first set 119887

119894119904= 0 and afterwards

compute 119887119894119902by (17) for 119902 = 119904+1 119899 and by (18)

for 119902 = 119904 minus 1 119904 minus 2 1

(2) for each pair of objects 119894 119895 isin 1 119899 such that 119904 lt 119905

do the following

(21) if 119904 lt 119896 lt 119905 lt 119898 then set 119888119894119895= 119888119894119895+1198891198941199011015840(119896)1198891198951199011015840(119896)

minus

1198891198941199011015840(119898)

1198891198951199011015840(119898)

(22) if 119896 lt 119904 lt 119898 lt 119905 then set 119888

119894119895= 119888119894119895+

1198891198941199011015840(119898)

1198891198951199011015840(119898)

minus 1198891198941199011015840(119896)1198891198951199011015840(119896)

(23) if |119904 119905 cap 119896119898| = 1 then compute 119888119894119895anew

using (12)(24) if 119888

119894119895has been updated in (21)ndash(23) then set

119888119895119894= 119888119894119895

It can be noted that for some pairs of objects 119894 and 119895

the entry 119888119894119895remains unchanged There are five such cases

(1) 119904 119905 lt 119896 (2) 119896 lt 119904 119905 lt 119898 (3) 119904 119905 gt 119898 (4) 119904 lt 119896 lt 119898 lt 119905(5) 119904 119905 = 119896119898

We now evaluate the time complexity of the describedprocedure Each of Steps (11) to (15) requires 119874(119899) timeThus the complexity of Step (1) is 119874(119899

2) The number of

operations in Steps (21) (22) and (24) is 119874(1) and that inStep (23) is 119874(119899) However the latter is executed only 119874(119899)

timesTherefore the overall complexity of Step (2) and henceof the entire procedure is 119874(119899

2)

Formula (13) and the previous described procedure forupdating matrices 119860 119861 and 119862 lie at the core of ourimplementation of the TS method The matrices 119860 119861 and 119862

can be initialized by applying the formulas (15)ndash(18) and (12)with respect to the starting permutationThe time complexityof initialization is 119874(119899

2) for 119860 and 119861 and 119874(119899

3) for 119862 The

auxiliary matrices 119860 119861 and 119862 and the expression (13) forΔ(119901 119896119898) are also used in the dominance test presented inthe next section This test plays a major role in pruning thesearch tree during the branching and bounding process

4 Dominance Test

As is well known the use of dominance tests in enumerativemethods allows reducing the search space of the optimizationproblem of interest A comprehensive review of dominancetests (or rules) in combinatorial optimization can be found inthe paper by Jouglet and Carlier [25] For the problem we aredealing with in this study Brusco and Stahl [13] proposed adominance test for discarding partial solutions which relieson performing pairwise interchanges of objects To formulateit let us assume that 119901 = (119901(1) 119901(119903)) 119903 lt 119899 is a partialsolution to the UDSP The test checks whether Δ(119901 119896 119903) gt 0

for at least one of the positions 119896 isin 1 119903 minus 1 If thiscondition is satisfied then we say that the test fails In thiscase 119901 is thrown away Otherwise 119901 needs to be extendedby placing an unassigned object in position 119903 + 1 The firstcondition in our dominance test is the same as in the test

suggested by Brusco and Stahl To compute Δ(119901 119896 119903) weuse subsets of entries of the auxiliary matrices 119860 119861 and119862 defined in the previous section Since only the sign ofΔ(119901 119896 119903) matters the factor of 4 can be ignored in (13) andthus Δ

1015840(119901 119896 119903) = Δ(119901 119896 119903)4 instead of Δ(119901 119896 119903) can be

considered From (13) we obtain

Δ1015840(119901 119896 119903) = 119886

119901(119896)119903(119886119901(119896)119903

+ 2119886119901(119896)1

minus 119878119901(119896)

)

+ 119886119901(119903)119896

(119886119901(119903)119896

minus 2119886119901(119903)1

+ 119878119901(119903)

)

+ 119887119901(119903)119896+1

minus 119887119901(119896)119903minus1

minus 2119888119901(119896)119901(119903)

(19)

We also take advantage of the case where Δ1015840(119901 119896 119903) =

0 We use the lexicographic rule for breaking ties The testoutputs a ldquoFailrdquo verdict if the equality Δ1015840(119901 119896 119903) = 0 is foundto hold true for some 119896 isin 1 119903 minus 1 such that 119901(119896) gt 119901(119903)Indeed 119901 = (119901(1) 119901(119896 minus 1) 119901(119896) 119901(119896 + 1) 119901(119903))

can be discarded because both 119901 and partial solution 1199011015840=

(119901(1) 119901(119896 minus 1) 119901(119903) 119901(119896 + 1) 119901(119896)) are equally goodbut 119901 is lexicographically larger than 119901

1015840We note that the entries of the matrices 119860 119861 and 119862 used

in (19) can be easily obtained from a subset of entries of119860 119861and 119862 computed earlier in the search process Consider forexample 119888

119901(119896)119901(119903)in (19) It can be assumed that 119896 isin 1 119903minus

2 because if 119896 = 119903 minus 1 then 119888119901(119896)119901(119903)

= 0 For simplicity let119901(119903) = 119894 and 119901(119903 minus 1) = 119895 To avoid ambiguity let us denoteby 1198621015840 the matrix 119862 that is obtained during processing of the

partial solution (119901(1) 119901(119903 minus 2)) The entry 1198881015840

119901(119896)119894of 1198621015840 is

calculated by (temporarily) placing the object 119894 in position119903 minus 1 Now referring to the definition of 119862 it is easy to seethat 119888

119901(119896)119894= 1198881015840

119901(119896)119894+ 119889119901(119896)119895

119889119894119895 Similarly using formulas of

types (15) and (17) we can obtain the required entries of thematrices 119860 and 119861 Thus (19) can be computed in constanttime The complexity of computing Δ

1015840(119901 119896 119903) for all 119896 is

119874(119899) The dominance test used in [13] relies on (11) and hascomplexity 119874(119899

2) So our implementation of the dominance

test is significantly more time-efficient than that proposed byBrusco and Stahl

In order to make the test even more efficient we inaddition decided to evaluate those partial solutions thatcan be obtained from 119901 by relocating the object 119901(119903) fromposition 119903 to position 119896 isin 1 119903 minus 2 (here position119896 = 119903 minus 1 is excluded since in this case relocating wouldbecome a pairwise interchange of objects) Thus the testcompares 119901 against partial solutions 119901

119896= (119901(1) 119901(119896 minus

1) 119901(119903) 119901(119896) 119901(119903 minus 1)) 119896 = 1 119903 minus 2 Now supposethat 119901 and 119901

119896are extended to 119899-permutations 119901 isin Π and

respectively 119901119896isin Π such that for each 119897 isin 119903 + 1 119899 the

object in position 119897 of 119901 coincides with that in position 119897 of 119901119896

Let us define 120575119903119896to be the difference between 119865(119901

119896) and 119865(119901)

This difference can be computed as follows

120575119903119896

= (4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)((

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

) minus 2119891119901(119903)

+ 119878119901(119903)

)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(20)

Journal of Applied Mathematics 7

The previous expression is similar to the formula (6) in [19]For completeness we give its derivation in the Appendix Wecan rewrite (20) using the matrices 119860 and 119861

Proposition 5 For 119901 isin Π 119896 isin 1 119899 minus 1 and 119903 isin 119896 +

1 119899

120575119903119896

= 4119886119901(119903)119896

(119886119901(119903)119896

minus 2119886119901(119903)1

+ 119878119901(119903)

) + 4119887119901(119903)119896

(21)

The partial solution 119901 is dominated by 119901119896if 120575119903119896

gt 0Upon establishing this fact for some 119896 isin 1 119903 minus 2 thetest immediately stops with the rejection of 119901 Of course likein the case of (13) a factor of 4 in (21) can be dropped Thecomputational overhead of including the condition 120575

119903119896gt 0

in the test is minimal because all the quantities appearing onthe right-hand side of (21) except 119887

119901(119903)1 are also used in the

calculation ofΔ1015840(119901 119896 119903) (see (19)) Since the loop over 119896needsto be executed the time complexity of this part of the test is119874(119899) Certainly evaluation of relocating the last added objectin 119901 gives more power to the test Thus our dominance testis an improvement over the test described in the Brusco andStahl [13] paper

5 A Branch-and-Bound Algorithm

Our approach to the UDSP is to start with a good initial solu-tion provided by the iterated tabu search heuristic invokedin the initialization phase of the algorithm and then proceedwith the main phase where using the branch-and-boundtechnique a search tree is built For the purpose of pruningbranches of the search tree the algorithm involves three tests(1) the upper bound test (based on 119880

1015840) (2) the dominancetest (3) the symmetry testThe first two of them have alreadybeen described in the prior sections The last one is the mostsimple and cheap to apply It hinges on the fact that the valuesof the objective function 119865 at a permutation 119901 isin Π and itsreverse 119901 defined by 119901(119894) = 119901(119899 minus 119894 + 1) 119894 = 1 119899 arethe same Therefore it is very natural to demand that somefixed object in 119882 say 119908 being among those objects that areassigned to the first half of the positions in 119901 More formallythe condition is 119894 ⩽ lceil1198992rceil where 119894 is such that 119901(119894) = 119908If this condition is violated for 119901 = (119901(1) 119901(lceil1198992rceil)) then the symmetry test fails and 119901 is discarded Because ofthe requirement to fulfill the previous condition some careshould be taken in applying the dominance test In the case ofΔ1015840(119901 119896 119903) = 0 and 119901(119896) gt 119901(119903) (see the paragraph following

(19)) this test in our implementation of the algorithm gives aldquoFailrdquo verdict only if either 119903 ⩽ lceil1198992rceil or 119901(119896) = 119908

Through the preliminary computational experimentswe were convinced that the upper bound test was quiteweak when applied to smaller partial solutions In otherwords the time overhead for computing linear assignment-based bounds did not pay off for partial permutations 119901 =

(119901(1) 119901(119903)) of length 119903 that is small or modest comparedto 119899 Therefore it is reasonable to compute the upper bound1198801015840 (described in Section 2) only when 119903 ⩾ 120582(119863) where

120582(119863) is a threshold value depending on the UDSP instancematrix 119863 and thus also on 119899 The problem we face is howto determine an appropriate value for 120582(119863) We combat this

problem with a simple procedure consisting of two steps Inthe first step it randomly generates a relatively small numberof partial permutations 119901

1 119901

120591 each of length lfloor1198992rfloor For

each permutation in this sample the procedure computes theupper bound 119880

1015840 Let 11988010158401 119880

1015840

120591be the values obtained We

are interested in the normalized differences between thesevalues and 119865(119901heur) where 119901heur is the solution delivered bythe ITS algorithm So the procedure calculates the sum 119885 =

sum120591

119894=1100(119880

1015840

119894minus 119865(119901heur))119865(119901heur) and the average 119877 = 119885120591 of

such differences In the second step using119877 it determines thethreshold value This is done in the following way if 119877 lt 119877

1

then 120582(119863) = lfloor1198993rfloor if 1198771⩽ 119877 lt 119877

2 then 120582(119863) = lfloor1198992rfloor if 119877 gt

1198772 then 120582(119863) = lfloor31198994rfloor The described procedure is executed

at the initialization step of the algorithm Its parameters are1198771 1198772 and the sample size 120591

Armed with tools for pruning the search space we canconstruct a branch-and-bound algorithm for solving theUDSP In order to make the description of the algorithmmore readable we divide it into two parts namely themain procedure BampB (branch-and-bound) and an auxiliaryprocedure SelectObject The former builds a search tree withnodes representing partial solutionsThe task of SelectObjectis either to choose an object which can be used to extendthe current permutation or if the current node becomesfathomed to perform a backtracking step In the descriptionof the algorithm given in the following the set of objects thatcan be used to create new branches is denoted by119881Themainprocedure goes as follows

BampB

(1) Apply the ITS algorithm to get an initial solution 119901lowast

to a given instance of the UDSP

Set 119865lowast = 119865(119901lowast)

(2) Calculate 120582(119863)(3) Initialize the search tree by creating the root node

with the empty set 119881 attached to it

Set 119871 = 0 (119871 denotes the tree level the currentlyconsidered node belongs to)

(4) Set 119903 = 119871

(5) Call SelectObject(119903) It returns 119871 and if 119871 = 119903additionally some unassigned object (let it be denotedby V) If 119871 lt 0 then go to (7) Otherwise checkwhether 119871 lt 119903 If so then return to (4) (perform abacktracking step) If not (in which case 119871 = 119903) thenproceed to (6)

(6) Create a new node with empty 119881 Set 119901(119903 + 1) = VIncrement 119871 by 1 and go to (4)

(7) Calculate the coordinates for the objects by applyingformula (4) with respect to the permutation 119901

lowast andstop

Steps (1) and (2) of BampB comprise the initializationphase of the algorithm The initial solution to the problemis produced using ITS This solution is saved as the bestsolution found so far denoted by 119901lowast If later a better solutionis found 119901lowast is updated This is done within the SelectObject

8 Journal of Applied Mathematics

procedure In the final step BampB uses 119901lowast to calculate the

optimal coordinates for the objects The branch-and-boundphase starts at Step (3) of BampB Each node of the search treeis assigned a set 119881 The partial solution corresponding to thecurrent node is denoted by 119901 The objects of 119881 are used toextend 119901 Upon creation of a node (Step (3) for root andStep (6) for other nodes) the set 119881 is empty In SelectObject119881 is filled with objects that are promising candidates toperform the branching operation on the corresponding nodeSelectObject also decides which action has to be taken nextIf 119871 = 119903 then the action is ldquobranchrdquo whereas if 119871 lt 119903 theaction is ldquobacktrackrdquo Step (4) helps to discriminate betweenthese two cases If 119871 = 119903 then a new node in the nextlevel of the search tree is generated At that point the partialpermutation 119901 is extended by assigning the selected object Vto the position 119903 + 1 in119901 (see Step (6)) If however119871 lt 119903 and 119903is positive then backtracking is performed by repeating Steps(4) and (5) We thus see that the search process is controlledby the procedure SelectObject This procedure can be statedas follows

SelectObject(119903)

(1) Check whether the set119881 attached to the current nodeis empty If so then go to (2) If not then consider itssubset 1198811015840 = 119894 isin 119881 | 119894 is unmarked and 119906

119894(119901) gt

119865lowast If 1198811015840 is nonempty then mark an arbitrary object

V isin 1198811015840 and return with it as well as with 119871 = 119903

Otherwise go to (6)(2) If 119903 gt 0 then go to (3) Otherwise set 119881 = 1 119899

119906119894(119901) = infin 119894 = 1 119899 and go to (4)

(3) If 119903 = 119899 minus 1 then go to (5) Otherwise for each 119894 isin

119882(119901) perform the following operations Temporarilyset 119901(119903 + 1) = 119894 Apply the symmetry test thedominance test and if 119903 ⩾ 120582(119863) also the bound testto the resulting partial solution 119901

1015840= (119901(1) 119901(119903 +

1)) If it passes all the tests then do the followingInclude 119894 into 119881 If 119903 ⩾ 120582(119863) assign to 119906

119894(119901) the value

of the upper bound 1198801015840 computed for the subproblem

defined by the partial solution 1199011015840 If however 119903 lt

120582(119863) then set 119906119894(119901) = infin If after processing all

objects in 119882(119901) the set 119881 remains empty then go to(6) Otherwise proceed to (4)

(4) Mark an arbitrary object V isin 119881 and return with it aswell as with 119871 = 119903

(5) Let 119882(119901) = 119895 Calculate the value of the solution119901 = (119901(1) 119901(119899 minus 1) 119901(119899) = 119895) If 119865(119901) gt 119865

lowast thenset 119901lowast = 119901 and 119865

lowast= 119865(119901)

(6) If 119903 gt 0 then drop the 119903th element of 119901 Return with119871 = 119903 minus 1

In the previous description 119882(119901) denotes the set ofunassigned objects as before The objects included in 119881 sube

119882(119901) are considered as candidates to be assigned to theleftmost available position in 119901 At the root node 119881 = 119882(119901)In all other cases except when |119882(119901)| = 1 an object becomesa member of the set 119881 only if it passes two or for longer 119901three tests The third of them the upper bound test amounts

to check if 1198801015840 gt 119865lowast This condition is checked again on

the second and subsequent calls to SelectObject for the samenode of the search tree This is done in Step (1) for eachunmarked object 119894 isin 119881 Such a strategy is reasonable becauseduring the branch-and-bound process the value of 119865lowast mayincrease and the previously satisfied constraint1198801015840 gt 119865

lowast maybecome violated for some 119894 isin 119881 To be able to identify suchobjects the algorithm stores the computed upper bound 119880

1015840

as 119906119894(119901) at Step (3) of SelectObjectA closer look at the procedure SelectObject shows that

two scenarios are possible (1)when this procedure is invokedfollowing the creation of a newnode of the tree (which is doneeither in Step (3) or in Step (6) of BampB) (2)when a call to theprocedure ismade after a backtracking stepThe set119881 initiallyis empty and therefore Steps (2) to (5) of SelectObject becomeactive only in the first of these two scenarios Dependingon the current level of the tree 119903 one of the three cases isconsidered If the current node is the root that is 119903 = 0 thenStep (2) is executed Since in this case the dominance test isnot applicable the algorithm uses the set119881 comprising all theobjects under consideration An arbitrary object selected inStep (4) is then used to create the first branch emanating fromthe root of the tree If 119903 isin [1 119899 minus 2] then the algorithmproceeds to Step (3) The task of this step is to evaluate allpartial solutions that can be obtained from the current partialpermutation119901by assigning an object from119882(119901) to the (119903 + 1)

th position in 119901 In order not to lose at least one optimalsolution it is sufficient to keep only those extensions of 119901 forwhich all the tests were successful Such partial solutions arememorized by the set 119881 One of them is chosen in Step (4)

to be used (in Step (6) of BampB) to generate the first son ofthe current node If however no promising partial solutionhas been found and thus the set 119881 is empty then the currentnode can be discarded and the search continued from itsparent node Finally if 119903 = 119899 minus 1 then Step (5) is executedin which the permutation 119901 is completed by placing theremaining unassigned object in the last position of119901 Its valueis compared with that of the best solution found so far Thenode corresponding to 119901 is a leaf and therefore is fathomedIn the second of the above-mentioned two scenarios (Step(1) of the procedure) the already existing nonempty set 119881is scanned The previously selected objects in 119881 have beenmarked They are ignored For each remaining object 119894 it ischecked whether 119906

119894(119901) gt 119865

lowast If so then 119894 is included inthe set 1198811015840 If the resulting set 1198811015840 is empty then the currentnode of the tree becomes fathomed and the procedure goesto Step (6) Otherwise the procedure chooses a branchingobject and passes it to its caller BampB Our implementationof the algorithm adopts a very simple enumeration strategybranch upon an arbitrary object in 119881

1015840 (or 119881 when Step (4) isperformed)

It is useful to note that the tests in Step (3) are applied inincreasing order of time complexityThe symmetry test is thecheapest of them whereas the upper bound test is the mostexpensive to perform but still effective

6 Computational Results

The main goal of our numerical experiments was to pro-vide some measure of the performance of the developed

Journal of Applied Mathematics 9

BampB algorithm as well as to demonstrate the benefits ofusing the new upper bound and efficient dominance testthroughout the solution process For comparison purposeswe also implemented the better of the two branch-and-bound algorithms of Brusco and Stahl [13] In their studythis algorithm is denoted as BB3 When referring to it inthis paper we will use the same name In order to make thecomparison more fair we decided to start BB3 by invokingthe ITS heuristic Thus the initial solution from which thebranch-and-bound phase of each of the algorithms starts isexpected to be of the same quality

61 Experimental Setup Both our algorithm and BB3 havebeen coded in the C programming language and all the testshave been carried out on a PC with an Intel Core 2 DuoCPU running at 30GHz As a testbed for the algorithmsthree empirical dissimilarity matrices from the literature inaddition to a number of randomly generated ones wereconsidered The random matrices vary in size from 20 to30 objects All off-diagonal entries are drawn uniformly atrandom from the interval [1 100] The matrices of courseare symmetric

We also used three dissimilarity matrices constructedfrom empirical data found in the literature The first matrixis based onMorse code confusion data collected by Rothkopf[26] In his experiment the subjects were presented pairsof Morse code signals and asked to respond whether theythought the signals were the same or differentThese answerswere translated to the entries of the similarity matrix withrows and columns corresponding to the 36 alphanumericcharacters (26 letters in the Latin alphabet and 10 digits in thedecimal number system) We should mention that we foundsmall discrepancies between Morse code similarity matricesgiven in various publications and electronic media We havedecided to take the matrix from [27] where it is presentedwith a reference to Shepard [28] Let thismatrix be denoted by119872 = (119898

119894119895) Its entries are integers from the interval [0 100]

Like in the study of Brusco and Stahl [13] the correspondingsymmetric dissimilarity matrix 119863 = (119889

119894119895) was obtained by

using the formula 119889119894119895= 200 minus 119898

119894119895minus 119898119895119894for all off-diagonal

entries and setting the diagonal of 119863 to zero We consideredtwoUDSP instances constructed usingMorse code confusiondatamdashone defined by the 26 times 26 submatrix of119863 with rowsand columns labeled by the letters and another defined by thefull matrix119863

The second empirical matrix was taken from Groenenand Heiser [29] We denote it by 119866 = (119892

119894119895) Its entry 119892

119894119895

gives the number of citations in journal 119894 to journal 119895 Thusthe rows of 119866 correspond to citing journals and the columnscorrespond to cited journals The dissimilarity matrix 119863 wasderived from 119866 in two steps like in [13] First a symmetricsimilarity matrix 119867 = (ℎ

119894119895) was calculated by using the

formulas ℎ119894119895

= ℎ119895119894

= (119892119894119895sum119899

119897=1l = 119894 119892119894119897) + (119892119895119894sum119899

119897=1119897 = 119895119892119895119897)

119894 119895 = 1 119899 119894 lt 119895 and ℎ119894119894

= 0 119894 = 1 119899 Then 119863

was constructed with entries 119889119894119895

= lfloor100(max119897119896ℎ119897119896

minus ℎ119894119895)rfloor

119894 119895 = 1 119899 119894 = 119895 and 119889119894119894= 0 119894 = 1 119899

The third empirical matrix used in our experimentswas taken from Hubert et al [30] Its entries represent the

dissimilarities between pairs of food items As mentioned byHubert et al [30] this 45 times 45 food-item matrix 119863foodwas constructed based on data provided by Ross andMurphy[31] For our computational tests we considered submatricesof 119863food defined by the first 119899 isin 20 21 35 food itemsAll the dissimilarity matrices we just described as well asthe source code of our algorithm are publicly available athttpwwwproinktultsimgintarasudshtml

Based on the results of preliminary experiments with theBampB algorithm we have fixed the values of its parameters120591 = 10 119877

1= 0 and 119877

2= 5 In the symmetry test119908 was fixed

at 1 The cutoff time for ITS was 1 second

62 Results The first experiment was conducted on a seriesof randomly generated full density dissimilarity matricesThe results of BampB and BB3 for them are summarized inTable 1 The dimension of the matrix 119863 is encoded in theinstance name displayed in the first column The secondcolumn of the table shows for each instance the optimalvalue of the objective function 119865

lowast and in parentheses tworelated metrics The first of them is the raw value of thestress function Φ(119909) The second number in parentheses isthe normalized value of the stress function that is calculatedasΦ(119909)sum

119894lt119895119889119894119895 The next two columns list the results of our

algorithm the number of nodes in the search tree and theCPU time reported in the form hours minutes seconds orminutes seconds or seconds The last two columns give thesemeasures for BB3

As it can be seen from the table BampB is superior to BB3We observe for example that the decrease in CPU time overBB3 is around 40 for the largest three instances Also ouralgorithm builds smaller search trees than BB3

In our second experiment we opt for three empiricaldissimilarity matrices mentioned in Section 61 Table 2 sum-marizes the results of BampB and BB3 on the UDSP instancesdefined by these matrices Its structure is the same as that ofTable 1 In the instance names the letter ldquoMrdquo stands for theMorse codematrix ldquoJrdquo stands for the journal citationsmatrixand ldquoFrdquo stands for the food item data As before the numberfollowing the letter indicates the dimension of the problemmatrix Comparing results in Tables 1 and 2 we find thatfor empirical data the superiority of our algorithm over BB3is more pronounced Basically BampB performs significantlybetter than BB3 for each of the three data sets For examplefor the food-item dissimilarity matrix of size 30 times 30 BampBbuilds almost 10 times smaller search tree and is about 5 timesfaster than BB3

Table 3 presents the results of the performance of BampBon the full Morse code dissimilarity matrix as well as onlarger submatrices of the food-item dissimilarity matrixRunning BB3 on these UDSP instances is too expensive sothis algorithm is not included in the table By analyzing theresults in Tables 2 and 3 we find that for 119894 isin 22 35 theCPU time taken by BampB to solve the problem with the 119894 times 119894

food-item submatrix is between 196 (for 119894 = 30) and 282 (for119894 = 27) percent of that needed to solve the problem with the(119894 minus1)times (119894minus1) food-item submatrix For the submatrix of size35 times 35 the CPU time taken by the algorithm was more than

10 Journal of Applied Mathematics

Table 1 Performance on random problem instances

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

p20 9053452 (1797554 0284) 1264871 44 1526725 56p21 9887394 (1879277 0285) 3304678 106 4169099 145p22 12548324 (2245406 0282) 6210311 205 8017112 290p23 14632006 (2396198 0274) 12979353 450 16624663 1067p24 16211210 (2815859 0294) 31820463 1533 42435561 2555p25 15877026 (3126700 0330) 103208158 6446 137525505 9524p26 20837730 (3471478 0302) 200528002 13354 267603807 20436p27 21464870 (3585958 0311) 287794935 20377 392405143 32365p28 23938264 (3961360 0317) 581602877 44093 812649051 113054p29 28613820 (4537786 0315) 2047228373 250572 3044704554 438570p30 30359134 (4645569 0315) 3555327537 509099 5216804247 855046

Table 2 Performance on empirical dissimilarity matrices

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

M26 180577772 (19448891 0219) 108400003 7432 201264785 16143J28 60653172 (7167033 0249) 2019598593 348006 6780517853 1039066F20 19344116 (1051432 0098) 236990 27 1971840 70F21 22694178 (1229979 0102) 568998 52 4716777 162F22 26496098 (1495949 0110) 1178415 106 9993241 357F23 30853716 (1757681 0116) 2524703 252 24357098 1378F24 35075372 (2174712 0130) 6727692 1001 51593000 3304F25 40091544 (2491002 0134) 14560522 2283 139580891 10014F26 45507780 (2886048 0142) 40465339 6553 298597411 23188F27 51859210 (3336825 0148) 116161004 19298 670492972 55578F28 58712130 (3788642 0153) 253278124 44358 1366152797 203265F29 66198704 (4203981 0156) 530219761 142265 3616257778 532084F30 74310052 (4629353 0157) 972413816 320566 9360343618 1556412

14 days In fact a solution of value 120526370was found by theITS heuristic in less than 1 sThe rest of the time was spent onproving its optimalityWe also see fromTable 3 that theUDSPwith the full Morse code dissimilarity matrix required nearly21 days to solve to proven optimality using our approach

In Table 4 we display optimal solutions for the twolargest UDSP instances we have tried The second and fourthcolumns of the table contain the optimal permutations forthese instances The coordinates for objects calculated using(4) are given in the third and fifth columns The number inparenthesis in the fourth column shows the order in whichthe food items appear within Table 51 in the book by Hubertet al [30] From Table 4 we observe that only a half of thesignals representing digits in theMorse code are placed at theend of the scale Other such signals are positioned betweensignals encoding letters We also see that shorter signals tendto occupy positions at the beginning of the permutation 119901

In Table 5 we evaluate the effectiveness of tests used tofathom partial solutions Computational results are reportedfor a selection of UDSP instances The second third andfourth columns of the table present 100119873sym119873 100119873dom119873and 100119873UB119873 where119873sym (respectively119873dom and119873UB) isthe number of partial permutations that are fathomed due

to symmetry test (respectively due to dominance test anddue to upper bound test) in Step (3) of SelectObject and119873 = 119873sym+119873dom+119873UBThe last three columns give the sameset of statistics for the BB3 algorithm Inspection of Table 5reveals that119873dom is much larger than both119873sym and119873UB Itcan also be seen that119873dom values are consistently higher withBB3 thanwhen using our approach Comparing the other twotests we observe that 119873sym is larger than 119873UB in all casesexcept when applying BampB to the food-item dissimilaritysubmatrices of size 20 times 20 and 25 times 25 Generally we noticethat the upper bound test is most successful for the UDSPinstances defined by the food-item matrix

We end this section with a discussion of two issuesregarding the use of heuristic algorithms for the UDSP Froma practical point of view heuristics of course are preferableto exact methods as they are fast and can be applied to largeproblem instances If carefully designed they are able toproduce solutions of very high quality The development ofexact methods on the other hand is a theoretical avenue forcoping with different problems In the field of optimizationexact algorithms for a problem of interest provide not only asolution but also a certificate of its optimality Therefore oneof possible applications of exact methods is to use them in

Journal of Applied Mathematics 11

Table 3 Performance of BampB on larger empirical dissimilarity matrices

Instance Objective function value Number of nodes TimeF31 82221440 (5194455 0164) 2469803824 857038F32 91389180 (5703241 0166) 5040451368 1944127F33 100367800 (6412395 0174) 14318454604 5524200F34 110065196 (7104054 0180) 33035826208 12936500F35 120526370 (7841564 0185) 82986984742 34256456M36 489111494 (40682956 0230) 243227368848 49451175

Table 4 Optimal solutions for the full Morse code dissimilarity matrix and the 35 times 35 food-item dissimilarity matrix

119894

Morse code full Food item data119901(119894) 119909

119901(119894)119901(119894) 119909

119901(119894)

1 (E) ∙ 000000 orange (3) 0000002 (T) minus 583333 watermelon (2) 0285713 (I) ∙∙ 2477778 apple (1) 0285714 (A) ∙minus 3408333 banana (4) 1285715 (N) minus∙ 4411111 pineapple (5) 1628576 (M) minusminus 5233333 lettuce (6) 21257147 (S) ∙ ∙ ∙ 7030556 broccoli (7) 22314298 (U) ∙ ∙ minus 8313889 carrots (8) 22542869 (R) ∙ minus ∙ 9347222 corn (9) 228857110 (W) ∙ minus minus 10297222 onions (10) 251142911 (H) ∙ ∙ ∙∙ 11444444 potato (11) 299714312 (D) minus ∙ ∙ 12430556 rice (12) 516285713 (K) minus ∙ minus 13152778 spaghetti (19) 589714314 (V) ∙ ∙ ∙minus 14500000 bread (13) 644857115 (5) ∙ ∙ ∙ ∙ ∙ 15244444 bagel (14) 698571416 (4) ∙ ∙ ∙ ∙ minus 15991667 cereal (16) 721142917 (F) ∙ ∙ minus∙ 17075000 oatmeal (15) 725142918 (L) ∙ minus ∙∙ 18225000 pancake (18) 763714319 (B) minus ∙ ∙∙ 18952778 muffin (17) 778000020 (X) minus ∙ ∙minus 19955556 crackers (20) 889714321 (6) minus ∙ ∙ ∙ ∙ 20566667 granola bar (21) 930000022 (3) ∙ ∙ ∙ minus minus 22013889 pretzels (22) 1007428623 (C) minus ∙ minus∙ 22947222 nuts (24) 1043714324 (Y) minus ∙ minusminus 23841667 popcorn (23) 1076285725 (7) minus minus ∙ ∙ ∙ 24930556 potato chips (25) 1124857126 (Z) minus minus ∙∙ 25488889 doughnuts (26) 1200857127 (Q) minus minus ∙minus 26438889 pizza (31) 1265142928 (P) ∙ minus minus∙ 27083333 cookies (27) 1368857129 (J) ∙ minus minusminus 28294444 chocolate bar (29) 1394000030 (G) minus minus ∙ 29247222 cake (28) 1410571431 (O) minus minus minus 30022222 pie (30) 1435428632 (2) ∙ ∙ minus minus minus 31050000 ice cream (32) 1520000033 (8) minus minus minus ∙ ∙ 32036111 yogurt (33) 1571142934 (1) ∙ minus minus minus minus 33350000 butter (34) 1618000035 (9) minus minus minus minus ∙ 34186111 cheese (35) 1650857136 (0) minus minus minus minus minus 35027778

12 Journal of Applied Mathematics

Table 5 Relative performance of the tests

Instance B amp B BB3Sym Dom UB Sym Dom UB

p20 1158a 8824b 018c 1111a 8873b 016c

p25 886a 9087b 027c 848a 9143b 009c

p30 701a 9284 b 015c 629a 9364b 007c

M26 1140a 8847 b 013c 1044a 8952b 004c

J28 855a 8923 b 222c 568 a 9426b 006c

F20 801a 7563 b 1636c 1325a 8542b 133c

F25 704a 8127 b 1169c 1031a 8939b 030c

F30 895a 8391b 714c 774a 9216b 010c

F35 853a 8723b 424c mdasha mdashb mdashc

M36 882a 9114b 004c mdasha mdashb mdashc

aPercentage of partial solutions fathomed due to symmetry test

bPercentage of partial solutions fathomed due to dominance testcPercentage of partial solutions fathomed due to bound test

the process of assessment of heuristic techniques Specificallyoptimal values can serve as a reference point with which theresults of heuristics can be compared However obtaining anoptimality certificate in many cases including the UDSP isvery time consuming In order to see the time differencesbetween finding an optimal solution for UDSP with the helpof heuristics and proving its optimality (as it is done bythe branch-and-bound method described in this paper) werun the two heuristic algorithms iterated tabu search andsimulated annealing (SA) of Brusco [19] on all the probleminstances used in the main experiment For two randominstances namely p21 and p25 SA was trapped in a localmaximumTherefore we have enhanced our implementationof SA by allowing it to run in a multi start mode Thisvariation of SA located an optimal solution for the above-mentioned instances in the third and respectively secondrestarts Both tested heuristics have proven to be very fastIn particular ITS took in total only 295 seconds to findthe global optima for all 30 instances we experimented withPerforming the same task by SA required 544 seconds Suchtimes are in huge contrast to what is seen in Tables 1ndash3 Basically the long computation times recorded in thesetables are the price we have to pay for obtaining not only asolution but also a guarantee of its optimality We did notperform additional computational experiments with ITS andSA because investigation of heuristics for theUDSP is outsidethe scope of this paper

Another issue concerns the use of heuristics for providinginitial solutions for the branch-and-bound technique Wehave employed the iterated tabu search procedure for thispurpose Our choice was influenced by the fact that wecould adopt in ITS the same formulas as those used inthe dominance test which is a key component of ourapproach Obviously there are other heuristics one canapply in place of ITS As it follows from the discussionin the preceding paragraph a good candidate for this isthe simulated annealing algorithm of Brusco [19] Supposethat BampB is run twice on the same instance from differentinitial solutions both of which are optimal Then it shouldbe clear that in both cases the same number of tree nodes

is explored This essentially means that any fast heuristiccapable of providing optimal solutions is perfect to be usedin the initialization step of BampB An alternative strategyis to use in this step randomly generated permutationsThe effects of this simple strategy are mixed For examplefor problem instances F25 and F27 the computation timeincreased from 1483 and respectively 11698 seconds (seeTable 2) to 4133 and respectively 20625 seconds Howeverfor example for p25 and M26 a somewhat different picturewas observed In the case of random initial solution BampBtook 4075 and 4687 seconds for these instances respectivelyThe computation times in Table 1 for p25 and Table 2 forM26are only marginally shorter One possible explanation of thislies in the fact that as Table 5 indicates the percentage ofpartial solutions fathomed due to bound test for both p25 andM26 is quite low However in general it is advantageous touse a heuristic for obtaining an initial solution for the branch-and-bound algorithm especially because this solution can beproduced at a negligible cost with respect to the rest of thecomputation

7 Conclusions

In this paper we have presented a branch-and-bound algo-rithm for the least-squares unidimensional scaling problemThe algorithm incorporates a LAP-based upper bound testas well as a dominance test which allows reducing theredundancy in the search process drastically The results ofcomputational experiments indicate that the algorithm canbe used to obtain provably optimal solutions for randomlygenerated dissimilarity matrices of size up to 30 times 30 andfor empirical dissimilarity matrices of size up to 35 times 35In particular for the first time the UDSP instance with the36 times 36 Morse code dissimilarity matrix has been solved toguaranteed optimality Another important instance is definedby the 45 times 45 food-item dissimilarity matrix 119863food It is agreat challenge to the optimization community to develop anexact algorithm that will solve this instance Our branch-and-bound algorithm is limited to submatrices of 119863food of size

Journal of Applied Mathematics 13

around 35 times 35 We also remark that optimal solutions forall UDSP instances used in our experiments can be found byapplying heuristic algorithms such as iterated tabu search andsimulated annealing proceduresThe total time taken by eachof these procedures to reach the global optima is only a fewseconds An advantage of the branch-and-bound algorithmis its ability to certify the optimality of the solution obtainedin a rigorous manner

Before closing there are two more points worth of men-tion First we proposed a simple yet effective mechanismfor making a decision about when it is useful to apply abound test to the current partial solution and when sucha test most probably will fail to discard this solution Forthat purpose a sample of partial solutions is evaluated in thepreparation step of BampBWe believe that similar mechanismsmay be operative in branch-and-bound algorithms for otherdifficult combinatorial optimization problems Second wedeveloped an efficient procedure for exploring the pairwiseinterchange neighborhood of a solution in the search spaceThis procedure can be useful on its own in the design ofheuristic algorithms for unidimensional scaling especiallythose based on the local search principle

Appendix

Derivation of (20)We will deduce an expression for 120575

119903119896 1 ⩽ 119896 lt 119903 ⩽ 119899

assuming that 119901 is a solution to the whole problem thatis 119901 isin Π Relocating the object 119901(119903) from position 119903 toposition 119896 gives the permutation 119901

1015840= (119901(1) 119901(119896 minus

1) 119901(119903) 119901(119896) 119901(119899)) By using the definition of 119865 we canwrite

120575119903119896

= 119865 (1199011015840) minus 119865 (119901)

= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119896

119889119901(119903)119901(119897)

)

2

minus(

119903minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

+

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

+2119889119901(119903)119901(119897)

)

2

minus

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

(A1)

Let us denote the first two terms in (A1) by 1205751015840 and the

remaining two terms by 12057510158401015840 = sum119903minus1

119897=11989612057510158401015840

119897 It is easy to see that 1205751015840

stems frommoving the object 119901(119903) to position 119896 and 12057510158401015840 stemsfrom shifting objects 119901(119897) 119897 = 119896 119903 minus 1 by one position to

the right To get a simpler expression for 1205751015840 we perform thefollowing manipulations

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

(

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2(

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A2)

Continuing we get

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A3)

Finally

1205751015840= minus 4

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

= minus 4(119891119901(119903)

minus

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

(119878119901(119903)

minus 119891119901(119903)

)

= 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

((

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

) minus 2119891119901(119903)

+ 119878119901(119903)

)

(A4)

14 Journal of Applied Mathematics

Similarly for 12057510158401015840119897we have

12057510158401015840

119897= (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

+ 4119889119901(119903)119901(119897)

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

+ 41198892

119901(119903)119901(119897)minus (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

= 4119889119901(119903)119901(119897)

(119891119901(119897)

minus 119878119901(119897)

+ 119891119901(119897)

) + 41198892

119901(119903)119901(119897)

= 4119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(A5)

By summing up (A4) and (A5) for 119897 = 119896 119903 minus 1 we obtain(20)

References

[1] I Borg and P J F Groenen Modern Multidimensional ScalingTheory and Appplications Springer New York NY USA 2ndedition 2005

[2] L Hubert P Arabie and J Meulman The Structural Repre-sentation of Proximity Matrices with MATLAB vol 19 SIAMPhiladelphia Pa USA 2006

[3] S Meisel and D Mattfeld ldquoSynergies of Operations Researchand Data Miningrdquo European Journal of Operational Researchvol 206 no 1 pp 1ndash10 2010

[4] G Caraux and S Pinloche ldquoPermutMatrix a graphical envi-ronment to arrange gene expression profiles in optimal linearorderrdquo Bioinformatics vol 21 no 7 pp 1280ndash1281 2005

[5] M K Ng E S Fung and H-Y Liao ldquoLinkage disequilibriummap by unidimensional nonnegative scalingrdquo in Proceedings ofthe 1st International Symposium on Optimization and SystemsBiology (OSB rsquo07) pp 302ndash308 Beijing China 2007

[6] P I Armstrong N A Fouad J Rounds and L HubertldquoQuantifying and interpreting group differences in interestprofilesrdquo Journal of Career Assessment vol 18 no 2 pp 115ndash1322010

[7] L Hubert and D Steinley ldquoAgreement among supreme courtjustices categorical versus continuous representationrdquo SIAMNews vol 38 no 7 2005

[8] A Dahabiah J Puentes and B Solaiman ldquoDigestive casebasemining based on possibility theory and linear unidimensionalscalingrdquo in Proceedings of the 8th WSEAS International Con-ference on Artificial Intelligence Knowledge Engineering andData Bases (AIKED rsquo09) pp 218ndash223 World Scientific andEngineering Academy and Society (WSEAS) Stevens PointWis USA 2009

[9] A Dahabiah J Puentes and B Solaiman ldquoPossibilistic patternrecognition in a digestive database for mining imperfect datardquoWSEASTransactions on Systems vol 8 no 2 pp 229ndash240 2009

[10] Y Ge Y Leung and J Ma ldquoUnidimensional scaling classifierand its application to remotely sensed datardquo inProceedings of theIEEE International Geoscience and Remote Sensing Symposium(IGARSS rsquo05) pp 3841ndash3844 July 2005

[11] J H Bihn M Verhaagh M Brandle and R Brandl ldquoDosecondary forests act as refuges for old growth forest animalsRecovery of ant diversity in the Atlantic forest of BrazilrdquoBiological Conservation vol 141 no 3 pp 733ndash743 2008

[12] D Defays ldquoA short note on a method of seriationrdquo BritishJournal of Mathematical and Statistical Psychology vol 31 no1 pp 49ndash53 1978

[13] M J Brusco and S Stahl ldquoOptimal least-squares unidimen-sional scaling improved branch-and-bound procedures andcomparison to dynamic programmingrdquo Psychometrika vol 70no 2 pp 253ndash270 2005

[14] L J Hubert and P Arabie ldquoUnidimensional scaling and com-binatorial optimizationrdquo in Multidimensional Data Analysis Jde Leeuw W J Heiser J Meulman and F Critchley Eds pp181ndash196 DSWO Press Leiden The Netherlands 1986

[15] L J Hubert P Arabie and J J Meulman ldquoLinear unidimen-sional scaling in the 119871

2-norm basic optimization methods

usingMATLABrdquo Journal of Classification vol 19 no 2 pp 303ndash328 2002

[16] M J Brusco and S Stahl Branch-and-Bound Applicationsin Combinatorial Data Analysis Springer Science + BusinessMedia New York NY USA 2005

[17] V Pliner ldquoMetric unidimensional scaling and global optimiza-tionrdquo Journal of Classification vol 13 no 1 pp 3ndash18 1996

[18] A Murillo J F Vera and W J Heiser ldquoA permutation-translation simulated annealing algorithm for 119871

1and 119871

2uni-

dimensional scalingrdquo Journal of Classification vol 22 no 1 pp119ndash138 2005

[19] M J Brusco ldquoOn the performance of simulated annealing forlarge-scale 119871

2unidimensional scalingrdquo Journal of Classification

vol 23 no 2 pp 255ndash268 2006[20] M J Brusco H F Kohn and S Stahl ldquoHeuristic imple-

mentation of dynamic programming for matrix permutationproblems in combinatorial data analysisrdquo Psychometrika vol73 no 3 pp 503ndash522 2008

[21] G Palubeckis ldquoIterated tabu search for the maximum diversityproblemrdquo Applied Mathematics and Computation vol 189 no1 pp 371ndash383 2007

[22] G Palubeckis ldquoA new bounding procedure and an improvedexact algorithm for the Max-2-SAT problemrdquo Applied Mathe-matics and Computation vol 215 no 3 pp 1106ndash1117 2009

[23] J Kluabwang D Puangdownreong and S Sujitjorn ldquoMultipathadaptive tabu search for a vehicle control problemrdquo Journal ofApplied Mathematics vol 2012 Article ID 731623 20 pages2012

[24] N Sarasiri K Suthamno and S Sujitjorn ldquoBacterial foraging-tabu search metaheuristics for identification of nonlinear fric-tion modelrdquo Journal of Applied Mathematics vol 2012 ArticleID 238563 23 pages 2012

[25] A Jouglet and J Carlier ldquoDominance rules in combinato-rial optimization problemsrdquo European Journal of OperationalResearch vol 212 no 3 pp 433ndash444 2011

[26] E Z Rothkopf ldquoA measure of stimulus similarity and errors insome paired-associate learning tasksrdquo Journal of ExperimentalPsychology vol 53 no 2 pp 94ndash101 1957

[27] I Duntsch and G Gediga ldquoModal-Style Operators in Qual-itative Data Analysisrdquo Tech Rep CS-02-15 Department ofComputer Science Brock University St Catharines OntarioCanada 2002

[28] R N Shepard ldquoAnalysis of proximities as a technique for thestudy of information processing in manrdquoHuman factors vol 5pp 33ndash48 1963

Journal of Applied Mathematics 15

[29] P J F Groenen and W J Heiser ldquoThe tunneling method forglobal optimization in multidimensional scalingrdquo Psychome-trika vol 61 no 3 pp 529ndash550 1996

[30] L Hubert P Arabie and JMeulmanCombinatorial Data Anal-ysis Optimization by Dynamic Programming SIAM Philadel-phia Pa USA 2001

[31] B H Ross and G L Murphy ldquoFood for thought cross-classification and category organization in a complex real-worlddomainrdquo Cognitive Psychology vol 38 no 4 pp 495ndash553 1999

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 4: Research Article An Improved Exact Algorithm for Least-Squares Unidimensional …downloads.hindawi.com/journals/jam/2013/890589.pdf · 2019-07-31 · by a 36×36 Morse code dissimilarity

4 Journal of Applied Mathematics

assume that 119896 = 119903 if120588119894119903+1

gt 120574119894119899minus119903minus1

Then 119890119894119896minus119903

= (2120574119894119899minus119896

minus119878119894)2

for 119896 = 119903+1 119896 and 119890

119894119896minus119903= (2120588119894119896minus119878119894)2 for 119896 =

119896+1 119899

The most time-consuming step in the computation of theupper bound 119880 is solving the linear assignment problem Inorder to provide a computationally less expensive boundingmethod we replace 119880

lowast with the value of a feasible solutionto the dual of the LAP (7) To get the bound we take120572119894

= max1⩽119896⩽119899minus119903

119890119894119896

= 119890119894119899minus119903

= 1198782

119894 119894 = 1 119899 minus 119903

120573119895

= max1⩽119894⩽119899minus119903

(119890119894119895minus 120572119894) 119895 = 1 119899 minus 119903 Clearly 120572

119894+

120573119895

⩾ 119890119894119895for each pair 119894 119895 = 1 119899 minus 119903 Thus the vector

(1205721 120572

119899minus119903 1205731 120573

119899minus119903) satisfies the constraints of the dual

of the LAP instance of interest Let = sum119899minus119903

119894=1120572119894+ sum119899minus119903

119895=1120573119895

Then from the duality theory of linear programming we canestablish the following upper bound for 119865(119901)

Proposition 2 For a partial solution 119901 = (119901(1) 119901(119903))1198801015840= 119865119903(119901) + ⩾ max

119901isinΠ(119901)119865(119901)

Proof The inequality follows from Proposition 1 and the factthat ⩾ 119880

lowast

One point that needs to be emphasized is that 1198801015840

strengthens the first bound proposed by Brusco and Stahl

Proposition 3 1198801015840 ⩽ 119880119861119878

Proof Indeed 119865119903(119901) +sum

119894isin119882(119901)1198782

119894= 119865119903(119901) +sum

119899minus119903

119894=1120572119894⩾ 119865119903(119901) +

The equality in the previous chain of expressions comesfrom the choice of 120572

119894and our assumption on119882(119901) whereas

the inequality is due to the fact that 120573119895

⩽ 0 for each 119895 =

1 119899 minus 119903

As an example we consider the following 4 times 4 dissimi-larity matrix borrowed from Defays [12]

[

[

[

[

minus 4 2 3

4 minus 7 2

2 7 minus 4

3 2 4 minus

]

]

]

]

(9)

The same matrix was also used by Brusco and Stahl [13] toillustrate their bounding procedures From Table 2 in theirpaper we find that the solution 119901 = (2 4 1 3) is optimal forthis instance of the UDSP The optimal value is equal to 388We compute the proposed upper bound for the whole Defaysproblem In this case 119882(119901) = 1 4 and 119865

119903(119901) = 0

The matrix 119864 is constructed by noting that 119896 = 2 for each119894 = 1 4 It takes the following form

[

[

[

[

81 25 25 81

169 81 81 169

169 81 81 169

81 25 25 81

]

]

]

]

(10)

Using the expressions for120572119894and120573119895 we find that120572

1= 1205724= 81

1205722= 1205723= 169 120573

1= 1205734= 0 and 120573

2= 1205733= minus56 Thus =

sum4

119894=1120572119894+ sum4

119895=1120573119895= 500 minus 112 = 388 = max

119901isinΠ119865(119901) Hence

our bound for the Defays instance is tight For comparisonthe bound (5) is equal to sum

4

119894=1120572119894= 500 To obtain the second

bound of Brusco and Stahl we first compute the 120583 valuesSince 120583

1= 1205834= 169 and 120583

2= 1205833= 81 the bound (6) (taking

the form sum4

119894=1120583119894) is also equal to 500

3 Tabu Search

As it is well known that the effectiveness of the bound testin branch-and-bound algorithms depends on the value of thebest solution found throughout the search process A goodstrategy is to create a high-quality solution before the searchis started For this task various heuristics can be applied Ourchoice was to use the tabu search (TS) technique It should benoted that it is not the purpose of this paper to investigateheuristic algorithms for solving the UDSP Therefore werestrict ourselves merely to presenting an outline of ourimplementation of tabu search

For 119901 isin Π let 119901119896119898

denote the permutation obtainedfrom 119901 by interchanging the objects in positions 119896 and 119898The set 119873

2(119901) = 119901

119896119898| 119896 119898 = 1 119899 119896 lt 119898 is

the 2-interchange neighborhood of 119901 At each iterationthe tabu search algorithm performs an exploration of theneighborhood 119873

2(119901) A subset of solutions in 119873

2(119901) are

evaluated by computing Δ(119901 119896119898) = 119865(119901119896119898

)minus119865(119901) In orderto get better quality solutions we apply TS iteratively Thistechnique called iterated tabu search (ITS) consists of thefollowing steps

(1) initialize the search with some permutation of theobjects

(2) apply a tabu search procedure Check if the termina-tion condition is satisfied If so then stop Otherwiseproceed to (3)

(3) apply a solution perturbation procedure Return to(2)

In our implementation the search is started from a ran-domly generated permutation Also some data are preparedto be used in the computation of the differences Δ(119901 119896119898)We will defer this issue until later in this section In step (2) ofITS at each iteration of the TS procedure the neighborhood1198732(119901) of the current solution 119901 is explored The procedure

selects a permutation 119901119896119898

isin 1198732(119901) which either improves

the best available objective function value so far or if thisdoes not happen has the largest value of Δ(119901 119894 119895) amongall permutations 119901

119894119895isin 1198732(119901) for which the objects in

positions 119894 and 119895 are not in the tabu list The permutation119901119896119898

is then accepted as a new current solution The numberof iterations in the tabu search phase is bounded aboveby a parameter of the algorithm The goal of the solutionperturbation phase is to generate a permutation which isemployed as the starting point for the next invocation ofthe tabu search procedure This is achieved by taking thecurrent permutation and performing a sequence of pairwiseinterchanges of objects Throughout this process each objectis allowed to change its position in the permutation at mostonce At each step first a set of feasible pairs of permutationpositions with the largest Δ values is formed Then a pairsay (119896119898) is chosen randomly from this set and objects inpositions 119896 and119898 are switchedWe omit some of the details of

Journal of Applied Mathematics 5

the algorithm Similar implementations of the ITS techniquefor other optimization problems are thoroughly describedfor example in [21 22] We should note that also otherenhancements of the generic tabu search strategy could beused for our purposes These include for example multipathadaptive tabu search [23] and bacterial foraging tabu search[24] algorithms

In the rest of this section we concentrate on howto efficiently compute the differences Δ(119901 119896119898) Certainlythis issue is central to the performance of the tabu searchprocedure But more importantly evaluation of pairwiseinterchanges of objects stands at the heart of the dominancetest which is one of the crucial components of our algorithmAs we will see in the next section the computation of the Δvalues is an operation the dominance test is based on

Brusco [19] derived the following formula to calculate thedifference between 119865(119901

119896119898) and 119865(119901)

Δ (119901 119896119898)

= (4

119898

sum

119897=119896+1

119889119901(119896)119901(119897)

)((

119898

sum

119897=119896+1

119889119901(119896)119901(119897)

) + 2119891119901(119896)

minus 119878119901(119896)

)

+ (4

119898minus1

sum

119897=119896

119889119901(119897)119901(119898)

)((

119898minus1

sum

119897=119896

119889119901(119897)119901(119898)

) minus 2119891119901(119898)

+ 119878119901(119898)

)

+

119898minus1

sum

119897=119896+1

4 (119889119901(119897)119901(119898)

minus 119889119901(119896)119901(119897)

) ((119889119901(119897)119901(119898)

minus 119889119901(119896)119901(119897)

)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(11)

Using (11) Δ(119901 119896119898) for all permutations in 1198732(119901) can be

computed in 119874(1198993) time However this approach is actually

too slow to be efficient In order to describe a more elaborateapproach we introduce three auxiliary 119899 times 119899 matrices 119860 =

(119886119894119902)119861 = (119887

119894119902) and119862 = (119888

119894119895) defined for a permutation119901 isin Π

Let us consider an object 119894 = 119901(119904) The entries of the 119894th rowof the matrices 119860 and 119861 are defined as follows if 119902 lt 119904 then119886119894119902= sum119904minus1

119897=119902119889119894119901(119897)

119887119894119902= sum119904minus1

119897=119902119889119894119901(119897)

(119889119894119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

) if 119902 gt 119904then 119886

119894119902= sum119902

119897=119904+1119889119894119901(119897)

119887119894119902= sum119902

119897=119904+1119889119894119901(119897)

(minus119889119894119901(119897)

+2119891119901(119897)

minus119878119901(119897)

)if 119902 = 119904 then 119886

119894119902= 119887119894119902= 0 Turning now to the matrix 119862 we

suppose that 119895 = 119901(119905) Then the entries of 119862 are given by thefollowing formula

119888119894119895=

max(119904119905)minus1sum

119897=min(119904119905)+1119889119894119901(119897)

119889119895119901(119897)

if |119904 minus 119905| gt 1

0 if |119904 minus 119905| = 1

(12)

Unlike 119860 and 119861 the matrix 119862 is symmetric We use thematrices 119860 119861 and 119862 to cast (11) in a form more suitable forboth the TS heuristic and the dominance test

Proposition 4 For 119901 isin Π 119896 isin 1 119899 minus 1 and 119898 isin 119896 +

1 119899

Δ (119901 119896119898) = 4119886119901(119896)119898

(119886119901(119896)119898

+ 2119886119901(119896)1

minus 119878119901(119896)

)

+ 4119886119901(119898)119896

(119886119901(119898)119896

minus 2119886119901(119898)1

+ 119878119901(119898)

)

+ 4 (119887119901(119898)119896+1

minus 119887119901(119896)119898minus1

minus 2119888119901(119896)119901(119898)

)

(13)

Proof First notice that the sums sum119898

119897=119896+1119889119901(119896)119901(119897)

and sum119898minus1

119897=119896

119889119901(119897)119901(119898)

in (11) are equal respectively to 119886119901(119896)119898

and 119886119901(119898)119896

Next it can be seen that 119891

119901(119896)= 119886119901(119896)1

and 119891119901(119898)

= 119886119901(119898)1

Finally the third term in (11) ignoring the factor of 4 can berewritten as119898minus1

sum

119897=119896+1

119889119901(119897)119901(119898)

(119889119901(119897)119901(119898)

+ 2119891119901(119897)

minus 119878119901(119897)

)

minus

119898minus1

sum

119897=119896+1

119889119901(119897)119901(119898)

119889119901(119896)119901(119897)

minus

119898minus1

sum

119897=119896+1

119889119901(119896)119901(119897)

119889119901(119897)119901(119898)

minus

119898minus1

sum

119897=119896+1

119889119901(119896)119901(119897)

(minus119889119901(119896)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(14)

It is easy to see that the first sum in (14) is equal to 119887119901(119898)119896+1

each of the next two sums is equal to 119888

119901(119896)119901(119898) and the last sum

is equal to 119887119901(119896)119898minus1

After simple manipulations we arrive at(13)

Having the matrices 119860 119861 and 119862 and using (13) the 2-interchange neighborhood of 119901 can be explored in 119874(119899

2)

time The question that remains to be addressed is howefficiently the entries of these matrices can be updated afterthe replacement of 119901 by a permutation 119901

1015840 chosen from theneighborhood 119873

2(119901) Suppose that 1199011015840 is obtained from 119901 by

interchanging the objects in positions 119896 and119898 Then we canwrite the following formulas that are involved in updating thematrices 119860 and 119861

119886119894119902= 119886119894119902minus1

+ 1198891198941199011015840(119902) (15)

119886119894119902= 119886119894119902+1

+ 1198891198941199011015840(119902) (16)

119887119894119902= 119887119894119902minus1

+ 1198891198941199011015840(119902)

(21198861199011015840(119902)1

minus 1198781199011015840(119902)

minus 1198891198941199011015840(119902)) (17)

119887119894119902= 119887119894119902+1

+ 1198891198941199011015840(119902)

(21198861199011015840(119902)1

minus 1198781199011015840(119902)

+ 1198891198941199011015840(119902)) (18)

From now on we will assume that the positions 119904 and 119905 of theobjects 119894 and 119895 are defined with respect to the permutation 119901

1015840More formally we have 119894 = 119901

1015840(119904) 119895 = 119901

1015840(119905)The procedure for

updating 119860 119861 and 119862 is summarized in the following

(1) for each object 119894 isin 1 119899 do the following

(11) if 119904 lt 119896 then compute 119886119894119902 119902 = 119896 119898 minus 1 by

(15) and 119887119894119902 119902 = 119896 119899 by (17)

(12) if 119904 gt 119898 then compute 119886119894119902 119902 = 119898 119898minus1 119896+

1 by (16) and 119887119894119902 119902 = 119898119898 minus 1 1 by (18)

(13) if 119896 lt 119904 lt 119898 then compute 119886119894119902by (15) for 119902 =

119898 119899 and by (16) for 119902 = 119896 119896 minus 1 1

6 Journal of Applied Mathematics

(14) if 119904 = 119896 or 119904 = 119898 then first set 119886119894119904

= 0

and afterwards compute 119886119894119902by (15) for 119902 = 119904 +

1 119899 and by (16) for 119902 = 119904 minus 1 119904 minus 2 1(15) if 119896 ⩽ 119904 ⩽ 119898 then first set 119887

119894119904= 0 and afterwards

compute 119887119894119902by (17) for 119902 = 119904+1 119899 and by (18)

for 119902 = 119904 minus 1 119904 minus 2 1

(2) for each pair of objects 119894 119895 isin 1 119899 such that 119904 lt 119905

do the following

(21) if 119904 lt 119896 lt 119905 lt 119898 then set 119888119894119895= 119888119894119895+1198891198941199011015840(119896)1198891198951199011015840(119896)

minus

1198891198941199011015840(119898)

1198891198951199011015840(119898)

(22) if 119896 lt 119904 lt 119898 lt 119905 then set 119888

119894119895= 119888119894119895+

1198891198941199011015840(119898)

1198891198951199011015840(119898)

minus 1198891198941199011015840(119896)1198891198951199011015840(119896)

(23) if |119904 119905 cap 119896119898| = 1 then compute 119888119894119895anew

using (12)(24) if 119888

119894119895has been updated in (21)ndash(23) then set

119888119895119894= 119888119894119895

It can be noted that for some pairs of objects 119894 and 119895

the entry 119888119894119895remains unchanged There are five such cases

(1) 119904 119905 lt 119896 (2) 119896 lt 119904 119905 lt 119898 (3) 119904 119905 gt 119898 (4) 119904 lt 119896 lt 119898 lt 119905(5) 119904 119905 = 119896119898

We now evaluate the time complexity of the describedprocedure Each of Steps (11) to (15) requires 119874(119899) timeThus the complexity of Step (1) is 119874(119899

2) The number of

operations in Steps (21) (22) and (24) is 119874(1) and that inStep (23) is 119874(119899) However the latter is executed only 119874(119899)

timesTherefore the overall complexity of Step (2) and henceof the entire procedure is 119874(119899

2)

Formula (13) and the previous described procedure forupdating matrices 119860 119861 and 119862 lie at the core of ourimplementation of the TS method The matrices 119860 119861 and 119862

can be initialized by applying the formulas (15)ndash(18) and (12)with respect to the starting permutationThe time complexityof initialization is 119874(119899

2) for 119860 and 119861 and 119874(119899

3) for 119862 The

auxiliary matrices 119860 119861 and 119862 and the expression (13) forΔ(119901 119896119898) are also used in the dominance test presented inthe next section This test plays a major role in pruning thesearch tree during the branching and bounding process

4 Dominance Test

As is well known the use of dominance tests in enumerativemethods allows reducing the search space of the optimizationproblem of interest A comprehensive review of dominancetests (or rules) in combinatorial optimization can be found inthe paper by Jouglet and Carlier [25] For the problem we aredealing with in this study Brusco and Stahl [13] proposed adominance test for discarding partial solutions which relieson performing pairwise interchanges of objects To formulateit let us assume that 119901 = (119901(1) 119901(119903)) 119903 lt 119899 is a partialsolution to the UDSP The test checks whether Δ(119901 119896 119903) gt 0

for at least one of the positions 119896 isin 1 119903 minus 1 If thiscondition is satisfied then we say that the test fails In thiscase 119901 is thrown away Otherwise 119901 needs to be extendedby placing an unassigned object in position 119903 + 1 The firstcondition in our dominance test is the same as in the test

suggested by Brusco and Stahl To compute Δ(119901 119896 119903) weuse subsets of entries of the auxiliary matrices 119860 119861 and119862 defined in the previous section Since only the sign ofΔ(119901 119896 119903) matters the factor of 4 can be ignored in (13) andthus Δ

1015840(119901 119896 119903) = Δ(119901 119896 119903)4 instead of Δ(119901 119896 119903) can be

considered From (13) we obtain

Δ1015840(119901 119896 119903) = 119886

119901(119896)119903(119886119901(119896)119903

+ 2119886119901(119896)1

minus 119878119901(119896)

)

+ 119886119901(119903)119896

(119886119901(119903)119896

minus 2119886119901(119903)1

+ 119878119901(119903)

)

+ 119887119901(119903)119896+1

minus 119887119901(119896)119903minus1

minus 2119888119901(119896)119901(119903)

(19)

We also take advantage of the case where Δ1015840(119901 119896 119903) =

0 We use the lexicographic rule for breaking ties The testoutputs a ldquoFailrdquo verdict if the equality Δ1015840(119901 119896 119903) = 0 is foundto hold true for some 119896 isin 1 119903 minus 1 such that 119901(119896) gt 119901(119903)Indeed 119901 = (119901(1) 119901(119896 minus 1) 119901(119896) 119901(119896 + 1) 119901(119903))

can be discarded because both 119901 and partial solution 1199011015840=

(119901(1) 119901(119896 minus 1) 119901(119903) 119901(119896 + 1) 119901(119896)) are equally goodbut 119901 is lexicographically larger than 119901

1015840We note that the entries of the matrices 119860 119861 and 119862 used

in (19) can be easily obtained from a subset of entries of119860 119861and 119862 computed earlier in the search process Consider forexample 119888

119901(119896)119901(119903)in (19) It can be assumed that 119896 isin 1 119903minus

2 because if 119896 = 119903 minus 1 then 119888119901(119896)119901(119903)

= 0 For simplicity let119901(119903) = 119894 and 119901(119903 minus 1) = 119895 To avoid ambiguity let us denoteby 1198621015840 the matrix 119862 that is obtained during processing of the

partial solution (119901(1) 119901(119903 minus 2)) The entry 1198881015840

119901(119896)119894of 1198621015840 is

calculated by (temporarily) placing the object 119894 in position119903 minus 1 Now referring to the definition of 119862 it is easy to seethat 119888

119901(119896)119894= 1198881015840

119901(119896)119894+ 119889119901(119896)119895

119889119894119895 Similarly using formulas of

types (15) and (17) we can obtain the required entries of thematrices 119860 and 119861 Thus (19) can be computed in constanttime The complexity of computing Δ

1015840(119901 119896 119903) for all 119896 is

119874(119899) The dominance test used in [13] relies on (11) and hascomplexity 119874(119899

2) So our implementation of the dominance

test is significantly more time-efficient than that proposed byBrusco and Stahl

In order to make the test even more efficient we inaddition decided to evaluate those partial solutions thatcan be obtained from 119901 by relocating the object 119901(119903) fromposition 119903 to position 119896 isin 1 119903 minus 2 (here position119896 = 119903 minus 1 is excluded since in this case relocating wouldbecome a pairwise interchange of objects) Thus the testcompares 119901 against partial solutions 119901

119896= (119901(1) 119901(119896 minus

1) 119901(119903) 119901(119896) 119901(119903 minus 1)) 119896 = 1 119903 minus 2 Now supposethat 119901 and 119901

119896are extended to 119899-permutations 119901 isin Π and

respectively 119901119896isin Π such that for each 119897 isin 119903 + 1 119899 the

object in position 119897 of 119901 coincides with that in position 119897 of 119901119896

Let us define 120575119903119896to be the difference between 119865(119901

119896) and 119865(119901)

This difference can be computed as follows

120575119903119896

= (4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)((

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

) minus 2119891119901(119903)

+ 119878119901(119903)

)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(20)

Journal of Applied Mathematics 7

The previous expression is similar to the formula (6) in [19]For completeness we give its derivation in the Appendix Wecan rewrite (20) using the matrices 119860 and 119861

Proposition 5 For 119901 isin Π 119896 isin 1 119899 minus 1 and 119903 isin 119896 +

1 119899

120575119903119896

= 4119886119901(119903)119896

(119886119901(119903)119896

minus 2119886119901(119903)1

+ 119878119901(119903)

) + 4119887119901(119903)119896

(21)

The partial solution 119901 is dominated by 119901119896if 120575119903119896

gt 0Upon establishing this fact for some 119896 isin 1 119903 minus 2 thetest immediately stops with the rejection of 119901 Of course likein the case of (13) a factor of 4 in (21) can be dropped Thecomputational overhead of including the condition 120575

119903119896gt 0

in the test is minimal because all the quantities appearing onthe right-hand side of (21) except 119887

119901(119903)1 are also used in the

calculation ofΔ1015840(119901 119896 119903) (see (19)) Since the loop over 119896needsto be executed the time complexity of this part of the test is119874(119899) Certainly evaluation of relocating the last added objectin 119901 gives more power to the test Thus our dominance testis an improvement over the test described in the Brusco andStahl [13] paper

5 A Branch-and-Bound Algorithm

Our approach to the UDSP is to start with a good initial solu-tion provided by the iterated tabu search heuristic invokedin the initialization phase of the algorithm and then proceedwith the main phase where using the branch-and-boundtechnique a search tree is built For the purpose of pruningbranches of the search tree the algorithm involves three tests(1) the upper bound test (based on 119880

1015840) (2) the dominancetest (3) the symmetry testThe first two of them have alreadybeen described in the prior sections The last one is the mostsimple and cheap to apply It hinges on the fact that the valuesof the objective function 119865 at a permutation 119901 isin Π and itsreverse 119901 defined by 119901(119894) = 119901(119899 minus 119894 + 1) 119894 = 1 119899 arethe same Therefore it is very natural to demand that somefixed object in 119882 say 119908 being among those objects that areassigned to the first half of the positions in 119901 More formallythe condition is 119894 ⩽ lceil1198992rceil where 119894 is such that 119901(119894) = 119908If this condition is violated for 119901 = (119901(1) 119901(lceil1198992rceil)) then the symmetry test fails and 119901 is discarded Because ofthe requirement to fulfill the previous condition some careshould be taken in applying the dominance test In the case ofΔ1015840(119901 119896 119903) = 0 and 119901(119896) gt 119901(119903) (see the paragraph following

(19)) this test in our implementation of the algorithm gives aldquoFailrdquo verdict only if either 119903 ⩽ lceil1198992rceil or 119901(119896) = 119908

Through the preliminary computational experimentswe were convinced that the upper bound test was quiteweak when applied to smaller partial solutions In otherwords the time overhead for computing linear assignment-based bounds did not pay off for partial permutations 119901 =

(119901(1) 119901(119903)) of length 119903 that is small or modest comparedto 119899 Therefore it is reasonable to compute the upper bound1198801015840 (described in Section 2) only when 119903 ⩾ 120582(119863) where

120582(119863) is a threshold value depending on the UDSP instancematrix 119863 and thus also on 119899 The problem we face is howto determine an appropriate value for 120582(119863) We combat this

problem with a simple procedure consisting of two steps Inthe first step it randomly generates a relatively small numberof partial permutations 119901

1 119901

120591 each of length lfloor1198992rfloor For

each permutation in this sample the procedure computes theupper bound 119880

1015840 Let 11988010158401 119880

1015840

120591be the values obtained We

are interested in the normalized differences between thesevalues and 119865(119901heur) where 119901heur is the solution delivered bythe ITS algorithm So the procedure calculates the sum 119885 =

sum120591

119894=1100(119880

1015840

119894minus 119865(119901heur))119865(119901heur) and the average 119877 = 119885120591 of

such differences In the second step using119877 it determines thethreshold value This is done in the following way if 119877 lt 119877

1

then 120582(119863) = lfloor1198993rfloor if 1198771⩽ 119877 lt 119877

2 then 120582(119863) = lfloor1198992rfloor if 119877 gt

1198772 then 120582(119863) = lfloor31198994rfloor The described procedure is executed

at the initialization step of the algorithm Its parameters are1198771 1198772 and the sample size 120591

Armed with tools for pruning the search space we canconstruct a branch-and-bound algorithm for solving theUDSP In order to make the description of the algorithmmore readable we divide it into two parts namely themain procedure BampB (branch-and-bound) and an auxiliaryprocedure SelectObject The former builds a search tree withnodes representing partial solutionsThe task of SelectObjectis either to choose an object which can be used to extendthe current permutation or if the current node becomesfathomed to perform a backtracking step In the descriptionof the algorithm given in the following the set of objects thatcan be used to create new branches is denoted by119881Themainprocedure goes as follows

BampB

(1) Apply the ITS algorithm to get an initial solution 119901lowast

to a given instance of the UDSP

Set 119865lowast = 119865(119901lowast)

(2) Calculate 120582(119863)(3) Initialize the search tree by creating the root node

with the empty set 119881 attached to it

Set 119871 = 0 (119871 denotes the tree level the currentlyconsidered node belongs to)

(4) Set 119903 = 119871

(5) Call SelectObject(119903) It returns 119871 and if 119871 = 119903additionally some unassigned object (let it be denotedby V) If 119871 lt 0 then go to (7) Otherwise checkwhether 119871 lt 119903 If so then return to (4) (perform abacktracking step) If not (in which case 119871 = 119903) thenproceed to (6)

(6) Create a new node with empty 119881 Set 119901(119903 + 1) = VIncrement 119871 by 1 and go to (4)

(7) Calculate the coordinates for the objects by applyingformula (4) with respect to the permutation 119901

lowast andstop

Steps (1) and (2) of BampB comprise the initializationphase of the algorithm The initial solution to the problemis produced using ITS This solution is saved as the bestsolution found so far denoted by 119901lowast If later a better solutionis found 119901lowast is updated This is done within the SelectObject

8 Journal of Applied Mathematics

procedure In the final step BampB uses 119901lowast to calculate the

optimal coordinates for the objects The branch-and-boundphase starts at Step (3) of BampB Each node of the search treeis assigned a set 119881 The partial solution corresponding to thecurrent node is denoted by 119901 The objects of 119881 are used toextend 119901 Upon creation of a node (Step (3) for root andStep (6) for other nodes) the set 119881 is empty In SelectObject119881 is filled with objects that are promising candidates toperform the branching operation on the corresponding nodeSelectObject also decides which action has to be taken nextIf 119871 = 119903 then the action is ldquobranchrdquo whereas if 119871 lt 119903 theaction is ldquobacktrackrdquo Step (4) helps to discriminate betweenthese two cases If 119871 = 119903 then a new node in the nextlevel of the search tree is generated At that point the partialpermutation 119901 is extended by assigning the selected object Vto the position 119903 + 1 in119901 (see Step (6)) If however119871 lt 119903 and 119903is positive then backtracking is performed by repeating Steps(4) and (5) We thus see that the search process is controlledby the procedure SelectObject This procedure can be statedas follows

SelectObject(119903)

(1) Check whether the set119881 attached to the current nodeis empty If so then go to (2) If not then consider itssubset 1198811015840 = 119894 isin 119881 | 119894 is unmarked and 119906

119894(119901) gt

119865lowast If 1198811015840 is nonempty then mark an arbitrary object

V isin 1198811015840 and return with it as well as with 119871 = 119903

Otherwise go to (6)(2) If 119903 gt 0 then go to (3) Otherwise set 119881 = 1 119899

119906119894(119901) = infin 119894 = 1 119899 and go to (4)

(3) If 119903 = 119899 minus 1 then go to (5) Otherwise for each 119894 isin

119882(119901) perform the following operations Temporarilyset 119901(119903 + 1) = 119894 Apply the symmetry test thedominance test and if 119903 ⩾ 120582(119863) also the bound testto the resulting partial solution 119901

1015840= (119901(1) 119901(119903 +

1)) If it passes all the tests then do the followingInclude 119894 into 119881 If 119903 ⩾ 120582(119863) assign to 119906

119894(119901) the value

of the upper bound 1198801015840 computed for the subproblem

defined by the partial solution 1199011015840 If however 119903 lt

120582(119863) then set 119906119894(119901) = infin If after processing all

objects in 119882(119901) the set 119881 remains empty then go to(6) Otherwise proceed to (4)

(4) Mark an arbitrary object V isin 119881 and return with it aswell as with 119871 = 119903

(5) Let 119882(119901) = 119895 Calculate the value of the solution119901 = (119901(1) 119901(119899 minus 1) 119901(119899) = 119895) If 119865(119901) gt 119865

lowast thenset 119901lowast = 119901 and 119865

lowast= 119865(119901)

(6) If 119903 gt 0 then drop the 119903th element of 119901 Return with119871 = 119903 minus 1

In the previous description 119882(119901) denotes the set ofunassigned objects as before The objects included in 119881 sube

119882(119901) are considered as candidates to be assigned to theleftmost available position in 119901 At the root node 119881 = 119882(119901)In all other cases except when |119882(119901)| = 1 an object becomesa member of the set 119881 only if it passes two or for longer 119901three tests The third of them the upper bound test amounts

to check if 1198801015840 gt 119865lowast This condition is checked again on

the second and subsequent calls to SelectObject for the samenode of the search tree This is done in Step (1) for eachunmarked object 119894 isin 119881 Such a strategy is reasonable becauseduring the branch-and-bound process the value of 119865lowast mayincrease and the previously satisfied constraint1198801015840 gt 119865

lowast maybecome violated for some 119894 isin 119881 To be able to identify suchobjects the algorithm stores the computed upper bound 119880

1015840

as 119906119894(119901) at Step (3) of SelectObjectA closer look at the procedure SelectObject shows that

two scenarios are possible (1)when this procedure is invokedfollowing the creation of a newnode of the tree (which is doneeither in Step (3) or in Step (6) of BampB) (2)when a call to theprocedure ismade after a backtracking stepThe set119881 initiallyis empty and therefore Steps (2) to (5) of SelectObject becomeactive only in the first of these two scenarios Dependingon the current level of the tree 119903 one of the three cases isconsidered If the current node is the root that is 119903 = 0 thenStep (2) is executed Since in this case the dominance test isnot applicable the algorithm uses the set119881 comprising all theobjects under consideration An arbitrary object selected inStep (4) is then used to create the first branch emanating fromthe root of the tree If 119903 isin [1 119899 minus 2] then the algorithmproceeds to Step (3) The task of this step is to evaluate allpartial solutions that can be obtained from the current partialpermutation119901by assigning an object from119882(119901) to the (119903 + 1)

th position in 119901 In order not to lose at least one optimalsolution it is sufficient to keep only those extensions of 119901 forwhich all the tests were successful Such partial solutions arememorized by the set 119881 One of them is chosen in Step (4)

to be used (in Step (6) of BampB) to generate the first son ofthe current node If however no promising partial solutionhas been found and thus the set 119881 is empty then the currentnode can be discarded and the search continued from itsparent node Finally if 119903 = 119899 minus 1 then Step (5) is executedin which the permutation 119901 is completed by placing theremaining unassigned object in the last position of119901 Its valueis compared with that of the best solution found so far Thenode corresponding to 119901 is a leaf and therefore is fathomedIn the second of the above-mentioned two scenarios (Step(1) of the procedure) the already existing nonempty set 119881is scanned The previously selected objects in 119881 have beenmarked They are ignored For each remaining object 119894 it ischecked whether 119906

119894(119901) gt 119865

lowast If so then 119894 is included inthe set 1198811015840 If the resulting set 1198811015840 is empty then the currentnode of the tree becomes fathomed and the procedure goesto Step (6) Otherwise the procedure chooses a branchingobject and passes it to its caller BampB Our implementationof the algorithm adopts a very simple enumeration strategybranch upon an arbitrary object in 119881

1015840 (or 119881 when Step (4) isperformed)

It is useful to note that the tests in Step (3) are applied inincreasing order of time complexityThe symmetry test is thecheapest of them whereas the upper bound test is the mostexpensive to perform but still effective

6 Computational Results

The main goal of our numerical experiments was to pro-vide some measure of the performance of the developed

Journal of Applied Mathematics 9

BampB algorithm as well as to demonstrate the benefits ofusing the new upper bound and efficient dominance testthroughout the solution process For comparison purposeswe also implemented the better of the two branch-and-bound algorithms of Brusco and Stahl [13] In their studythis algorithm is denoted as BB3 When referring to it inthis paper we will use the same name In order to make thecomparison more fair we decided to start BB3 by invokingthe ITS heuristic Thus the initial solution from which thebranch-and-bound phase of each of the algorithms starts isexpected to be of the same quality

61 Experimental Setup Both our algorithm and BB3 havebeen coded in the C programming language and all the testshave been carried out on a PC with an Intel Core 2 DuoCPU running at 30GHz As a testbed for the algorithmsthree empirical dissimilarity matrices from the literature inaddition to a number of randomly generated ones wereconsidered The random matrices vary in size from 20 to30 objects All off-diagonal entries are drawn uniformly atrandom from the interval [1 100] The matrices of courseare symmetric

We also used three dissimilarity matrices constructedfrom empirical data found in the literature The first matrixis based onMorse code confusion data collected by Rothkopf[26] In his experiment the subjects were presented pairsof Morse code signals and asked to respond whether theythought the signals were the same or differentThese answerswere translated to the entries of the similarity matrix withrows and columns corresponding to the 36 alphanumericcharacters (26 letters in the Latin alphabet and 10 digits in thedecimal number system) We should mention that we foundsmall discrepancies between Morse code similarity matricesgiven in various publications and electronic media We havedecided to take the matrix from [27] where it is presentedwith a reference to Shepard [28] Let thismatrix be denoted by119872 = (119898

119894119895) Its entries are integers from the interval [0 100]

Like in the study of Brusco and Stahl [13] the correspondingsymmetric dissimilarity matrix 119863 = (119889

119894119895) was obtained by

using the formula 119889119894119895= 200 minus 119898

119894119895minus 119898119895119894for all off-diagonal

entries and setting the diagonal of 119863 to zero We consideredtwoUDSP instances constructed usingMorse code confusiondatamdashone defined by the 26 times 26 submatrix of119863 with rowsand columns labeled by the letters and another defined by thefull matrix119863

The second empirical matrix was taken from Groenenand Heiser [29] We denote it by 119866 = (119892

119894119895) Its entry 119892

119894119895

gives the number of citations in journal 119894 to journal 119895 Thusthe rows of 119866 correspond to citing journals and the columnscorrespond to cited journals The dissimilarity matrix 119863 wasderived from 119866 in two steps like in [13] First a symmetricsimilarity matrix 119867 = (ℎ

119894119895) was calculated by using the

formulas ℎ119894119895

= ℎ119895119894

= (119892119894119895sum119899

119897=1l = 119894 119892119894119897) + (119892119895119894sum119899

119897=1119897 = 119895119892119895119897)

119894 119895 = 1 119899 119894 lt 119895 and ℎ119894119894

= 0 119894 = 1 119899 Then 119863

was constructed with entries 119889119894119895

= lfloor100(max119897119896ℎ119897119896

minus ℎ119894119895)rfloor

119894 119895 = 1 119899 119894 = 119895 and 119889119894119894= 0 119894 = 1 119899

The third empirical matrix used in our experimentswas taken from Hubert et al [30] Its entries represent the

dissimilarities between pairs of food items As mentioned byHubert et al [30] this 45 times 45 food-item matrix 119863foodwas constructed based on data provided by Ross andMurphy[31] For our computational tests we considered submatricesof 119863food defined by the first 119899 isin 20 21 35 food itemsAll the dissimilarity matrices we just described as well asthe source code of our algorithm are publicly available athttpwwwproinktultsimgintarasudshtml

Based on the results of preliminary experiments with theBampB algorithm we have fixed the values of its parameters120591 = 10 119877

1= 0 and 119877

2= 5 In the symmetry test119908 was fixed

at 1 The cutoff time for ITS was 1 second

62 Results The first experiment was conducted on a seriesof randomly generated full density dissimilarity matricesThe results of BampB and BB3 for them are summarized inTable 1 The dimension of the matrix 119863 is encoded in theinstance name displayed in the first column The secondcolumn of the table shows for each instance the optimalvalue of the objective function 119865

lowast and in parentheses tworelated metrics The first of them is the raw value of thestress function Φ(119909) The second number in parentheses isthe normalized value of the stress function that is calculatedasΦ(119909)sum

119894lt119895119889119894119895 The next two columns list the results of our

algorithm the number of nodes in the search tree and theCPU time reported in the form hours minutes seconds orminutes seconds or seconds The last two columns give thesemeasures for BB3

As it can be seen from the table BampB is superior to BB3We observe for example that the decrease in CPU time overBB3 is around 40 for the largest three instances Also ouralgorithm builds smaller search trees than BB3

In our second experiment we opt for three empiricaldissimilarity matrices mentioned in Section 61 Table 2 sum-marizes the results of BampB and BB3 on the UDSP instancesdefined by these matrices Its structure is the same as that ofTable 1 In the instance names the letter ldquoMrdquo stands for theMorse codematrix ldquoJrdquo stands for the journal citationsmatrixand ldquoFrdquo stands for the food item data As before the numberfollowing the letter indicates the dimension of the problemmatrix Comparing results in Tables 1 and 2 we find thatfor empirical data the superiority of our algorithm over BB3is more pronounced Basically BampB performs significantlybetter than BB3 for each of the three data sets For examplefor the food-item dissimilarity matrix of size 30 times 30 BampBbuilds almost 10 times smaller search tree and is about 5 timesfaster than BB3

Table 3 presents the results of the performance of BampBon the full Morse code dissimilarity matrix as well as onlarger submatrices of the food-item dissimilarity matrixRunning BB3 on these UDSP instances is too expensive sothis algorithm is not included in the table By analyzing theresults in Tables 2 and 3 we find that for 119894 isin 22 35 theCPU time taken by BampB to solve the problem with the 119894 times 119894

food-item submatrix is between 196 (for 119894 = 30) and 282 (for119894 = 27) percent of that needed to solve the problem with the(119894 minus1)times (119894minus1) food-item submatrix For the submatrix of size35 times 35 the CPU time taken by the algorithm was more than

10 Journal of Applied Mathematics

Table 1 Performance on random problem instances

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

p20 9053452 (1797554 0284) 1264871 44 1526725 56p21 9887394 (1879277 0285) 3304678 106 4169099 145p22 12548324 (2245406 0282) 6210311 205 8017112 290p23 14632006 (2396198 0274) 12979353 450 16624663 1067p24 16211210 (2815859 0294) 31820463 1533 42435561 2555p25 15877026 (3126700 0330) 103208158 6446 137525505 9524p26 20837730 (3471478 0302) 200528002 13354 267603807 20436p27 21464870 (3585958 0311) 287794935 20377 392405143 32365p28 23938264 (3961360 0317) 581602877 44093 812649051 113054p29 28613820 (4537786 0315) 2047228373 250572 3044704554 438570p30 30359134 (4645569 0315) 3555327537 509099 5216804247 855046

Table 2 Performance on empirical dissimilarity matrices

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

M26 180577772 (19448891 0219) 108400003 7432 201264785 16143J28 60653172 (7167033 0249) 2019598593 348006 6780517853 1039066F20 19344116 (1051432 0098) 236990 27 1971840 70F21 22694178 (1229979 0102) 568998 52 4716777 162F22 26496098 (1495949 0110) 1178415 106 9993241 357F23 30853716 (1757681 0116) 2524703 252 24357098 1378F24 35075372 (2174712 0130) 6727692 1001 51593000 3304F25 40091544 (2491002 0134) 14560522 2283 139580891 10014F26 45507780 (2886048 0142) 40465339 6553 298597411 23188F27 51859210 (3336825 0148) 116161004 19298 670492972 55578F28 58712130 (3788642 0153) 253278124 44358 1366152797 203265F29 66198704 (4203981 0156) 530219761 142265 3616257778 532084F30 74310052 (4629353 0157) 972413816 320566 9360343618 1556412

14 days In fact a solution of value 120526370was found by theITS heuristic in less than 1 sThe rest of the time was spent onproving its optimalityWe also see fromTable 3 that theUDSPwith the full Morse code dissimilarity matrix required nearly21 days to solve to proven optimality using our approach

In Table 4 we display optimal solutions for the twolargest UDSP instances we have tried The second and fourthcolumns of the table contain the optimal permutations forthese instances The coordinates for objects calculated using(4) are given in the third and fifth columns The number inparenthesis in the fourth column shows the order in whichthe food items appear within Table 51 in the book by Hubertet al [30] From Table 4 we observe that only a half of thesignals representing digits in theMorse code are placed at theend of the scale Other such signals are positioned betweensignals encoding letters We also see that shorter signals tendto occupy positions at the beginning of the permutation 119901

In Table 5 we evaluate the effectiveness of tests used tofathom partial solutions Computational results are reportedfor a selection of UDSP instances The second third andfourth columns of the table present 100119873sym119873 100119873dom119873and 100119873UB119873 where119873sym (respectively119873dom and119873UB) isthe number of partial permutations that are fathomed due

to symmetry test (respectively due to dominance test anddue to upper bound test) in Step (3) of SelectObject and119873 = 119873sym+119873dom+119873UBThe last three columns give the sameset of statistics for the BB3 algorithm Inspection of Table 5reveals that119873dom is much larger than both119873sym and119873UB Itcan also be seen that119873dom values are consistently higher withBB3 thanwhen using our approach Comparing the other twotests we observe that 119873sym is larger than 119873UB in all casesexcept when applying BampB to the food-item dissimilaritysubmatrices of size 20 times 20 and 25 times 25 Generally we noticethat the upper bound test is most successful for the UDSPinstances defined by the food-item matrix

We end this section with a discussion of two issuesregarding the use of heuristic algorithms for the UDSP Froma practical point of view heuristics of course are preferableto exact methods as they are fast and can be applied to largeproblem instances If carefully designed they are able toproduce solutions of very high quality The development ofexact methods on the other hand is a theoretical avenue forcoping with different problems In the field of optimizationexact algorithms for a problem of interest provide not only asolution but also a certificate of its optimality Therefore oneof possible applications of exact methods is to use them in

Journal of Applied Mathematics 11

Table 3 Performance of BampB on larger empirical dissimilarity matrices

Instance Objective function value Number of nodes TimeF31 82221440 (5194455 0164) 2469803824 857038F32 91389180 (5703241 0166) 5040451368 1944127F33 100367800 (6412395 0174) 14318454604 5524200F34 110065196 (7104054 0180) 33035826208 12936500F35 120526370 (7841564 0185) 82986984742 34256456M36 489111494 (40682956 0230) 243227368848 49451175

Table 4 Optimal solutions for the full Morse code dissimilarity matrix and the 35 times 35 food-item dissimilarity matrix

119894

Morse code full Food item data119901(119894) 119909

119901(119894)119901(119894) 119909

119901(119894)

1 (E) ∙ 000000 orange (3) 0000002 (T) minus 583333 watermelon (2) 0285713 (I) ∙∙ 2477778 apple (1) 0285714 (A) ∙minus 3408333 banana (4) 1285715 (N) minus∙ 4411111 pineapple (5) 1628576 (M) minusminus 5233333 lettuce (6) 21257147 (S) ∙ ∙ ∙ 7030556 broccoli (7) 22314298 (U) ∙ ∙ minus 8313889 carrots (8) 22542869 (R) ∙ minus ∙ 9347222 corn (9) 228857110 (W) ∙ minus minus 10297222 onions (10) 251142911 (H) ∙ ∙ ∙∙ 11444444 potato (11) 299714312 (D) minus ∙ ∙ 12430556 rice (12) 516285713 (K) minus ∙ minus 13152778 spaghetti (19) 589714314 (V) ∙ ∙ ∙minus 14500000 bread (13) 644857115 (5) ∙ ∙ ∙ ∙ ∙ 15244444 bagel (14) 698571416 (4) ∙ ∙ ∙ ∙ minus 15991667 cereal (16) 721142917 (F) ∙ ∙ minus∙ 17075000 oatmeal (15) 725142918 (L) ∙ minus ∙∙ 18225000 pancake (18) 763714319 (B) minus ∙ ∙∙ 18952778 muffin (17) 778000020 (X) minus ∙ ∙minus 19955556 crackers (20) 889714321 (6) minus ∙ ∙ ∙ ∙ 20566667 granola bar (21) 930000022 (3) ∙ ∙ ∙ minus minus 22013889 pretzels (22) 1007428623 (C) minus ∙ minus∙ 22947222 nuts (24) 1043714324 (Y) minus ∙ minusminus 23841667 popcorn (23) 1076285725 (7) minus minus ∙ ∙ ∙ 24930556 potato chips (25) 1124857126 (Z) minus minus ∙∙ 25488889 doughnuts (26) 1200857127 (Q) minus minus ∙minus 26438889 pizza (31) 1265142928 (P) ∙ minus minus∙ 27083333 cookies (27) 1368857129 (J) ∙ minus minusminus 28294444 chocolate bar (29) 1394000030 (G) minus minus ∙ 29247222 cake (28) 1410571431 (O) minus minus minus 30022222 pie (30) 1435428632 (2) ∙ ∙ minus minus minus 31050000 ice cream (32) 1520000033 (8) minus minus minus ∙ ∙ 32036111 yogurt (33) 1571142934 (1) ∙ minus minus minus minus 33350000 butter (34) 1618000035 (9) minus minus minus minus ∙ 34186111 cheese (35) 1650857136 (0) minus minus minus minus minus 35027778

12 Journal of Applied Mathematics

Table 5 Relative performance of the tests

Instance B amp B BB3Sym Dom UB Sym Dom UB

p20 1158a 8824b 018c 1111a 8873b 016c

p25 886a 9087b 027c 848a 9143b 009c

p30 701a 9284 b 015c 629a 9364b 007c

M26 1140a 8847 b 013c 1044a 8952b 004c

J28 855a 8923 b 222c 568 a 9426b 006c

F20 801a 7563 b 1636c 1325a 8542b 133c

F25 704a 8127 b 1169c 1031a 8939b 030c

F30 895a 8391b 714c 774a 9216b 010c

F35 853a 8723b 424c mdasha mdashb mdashc

M36 882a 9114b 004c mdasha mdashb mdashc

aPercentage of partial solutions fathomed due to symmetry test

bPercentage of partial solutions fathomed due to dominance testcPercentage of partial solutions fathomed due to bound test

the process of assessment of heuristic techniques Specificallyoptimal values can serve as a reference point with which theresults of heuristics can be compared However obtaining anoptimality certificate in many cases including the UDSP isvery time consuming In order to see the time differencesbetween finding an optimal solution for UDSP with the helpof heuristics and proving its optimality (as it is done bythe branch-and-bound method described in this paper) werun the two heuristic algorithms iterated tabu search andsimulated annealing (SA) of Brusco [19] on all the probleminstances used in the main experiment For two randominstances namely p21 and p25 SA was trapped in a localmaximumTherefore we have enhanced our implementationof SA by allowing it to run in a multi start mode Thisvariation of SA located an optimal solution for the above-mentioned instances in the third and respectively secondrestarts Both tested heuristics have proven to be very fastIn particular ITS took in total only 295 seconds to findthe global optima for all 30 instances we experimented withPerforming the same task by SA required 544 seconds Suchtimes are in huge contrast to what is seen in Tables 1ndash3 Basically the long computation times recorded in thesetables are the price we have to pay for obtaining not only asolution but also a guarantee of its optimality We did notperform additional computational experiments with ITS andSA because investigation of heuristics for theUDSP is outsidethe scope of this paper

Another issue concerns the use of heuristics for providinginitial solutions for the branch-and-bound technique Wehave employed the iterated tabu search procedure for thispurpose Our choice was influenced by the fact that wecould adopt in ITS the same formulas as those used inthe dominance test which is a key component of ourapproach Obviously there are other heuristics one canapply in place of ITS As it follows from the discussionin the preceding paragraph a good candidate for this isthe simulated annealing algorithm of Brusco [19] Supposethat BampB is run twice on the same instance from differentinitial solutions both of which are optimal Then it shouldbe clear that in both cases the same number of tree nodes

is explored This essentially means that any fast heuristiccapable of providing optimal solutions is perfect to be usedin the initialization step of BampB An alternative strategyis to use in this step randomly generated permutationsThe effects of this simple strategy are mixed For examplefor problem instances F25 and F27 the computation timeincreased from 1483 and respectively 11698 seconds (seeTable 2) to 4133 and respectively 20625 seconds Howeverfor example for p25 and M26 a somewhat different picturewas observed In the case of random initial solution BampBtook 4075 and 4687 seconds for these instances respectivelyThe computation times in Table 1 for p25 and Table 2 forM26are only marginally shorter One possible explanation of thislies in the fact that as Table 5 indicates the percentage ofpartial solutions fathomed due to bound test for both p25 andM26 is quite low However in general it is advantageous touse a heuristic for obtaining an initial solution for the branch-and-bound algorithm especially because this solution can beproduced at a negligible cost with respect to the rest of thecomputation

7 Conclusions

In this paper we have presented a branch-and-bound algo-rithm for the least-squares unidimensional scaling problemThe algorithm incorporates a LAP-based upper bound testas well as a dominance test which allows reducing theredundancy in the search process drastically The results ofcomputational experiments indicate that the algorithm canbe used to obtain provably optimal solutions for randomlygenerated dissimilarity matrices of size up to 30 times 30 andfor empirical dissimilarity matrices of size up to 35 times 35In particular for the first time the UDSP instance with the36 times 36 Morse code dissimilarity matrix has been solved toguaranteed optimality Another important instance is definedby the 45 times 45 food-item dissimilarity matrix 119863food It is agreat challenge to the optimization community to develop anexact algorithm that will solve this instance Our branch-and-bound algorithm is limited to submatrices of 119863food of size

Journal of Applied Mathematics 13

around 35 times 35 We also remark that optimal solutions forall UDSP instances used in our experiments can be found byapplying heuristic algorithms such as iterated tabu search andsimulated annealing proceduresThe total time taken by eachof these procedures to reach the global optima is only a fewseconds An advantage of the branch-and-bound algorithmis its ability to certify the optimality of the solution obtainedin a rigorous manner

Before closing there are two more points worth of men-tion First we proposed a simple yet effective mechanismfor making a decision about when it is useful to apply abound test to the current partial solution and when sucha test most probably will fail to discard this solution Forthat purpose a sample of partial solutions is evaluated in thepreparation step of BampBWe believe that similar mechanismsmay be operative in branch-and-bound algorithms for otherdifficult combinatorial optimization problems Second wedeveloped an efficient procedure for exploring the pairwiseinterchange neighborhood of a solution in the search spaceThis procedure can be useful on its own in the design ofheuristic algorithms for unidimensional scaling especiallythose based on the local search principle

Appendix

Derivation of (20)We will deduce an expression for 120575

119903119896 1 ⩽ 119896 lt 119903 ⩽ 119899

assuming that 119901 is a solution to the whole problem thatis 119901 isin Π Relocating the object 119901(119903) from position 119903 toposition 119896 gives the permutation 119901

1015840= (119901(1) 119901(119896 minus

1) 119901(119903) 119901(119896) 119901(119899)) By using the definition of 119865 we canwrite

120575119903119896

= 119865 (1199011015840) minus 119865 (119901)

= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119896

119889119901(119903)119901(119897)

)

2

minus(

119903minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

+

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

+2119889119901(119903)119901(119897)

)

2

minus

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

(A1)

Let us denote the first two terms in (A1) by 1205751015840 and the

remaining two terms by 12057510158401015840 = sum119903minus1

119897=11989612057510158401015840

119897 It is easy to see that 1205751015840

stems frommoving the object 119901(119903) to position 119896 and 12057510158401015840 stemsfrom shifting objects 119901(119897) 119897 = 119896 119903 minus 1 by one position to

the right To get a simpler expression for 1205751015840 we perform thefollowing manipulations

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

(

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2(

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A2)

Continuing we get

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A3)

Finally

1205751015840= minus 4

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

= minus 4(119891119901(119903)

minus

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

(119878119901(119903)

minus 119891119901(119903)

)

= 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

((

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

) minus 2119891119901(119903)

+ 119878119901(119903)

)

(A4)

14 Journal of Applied Mathematics

Similarly for 12057510158401015840119897we have

12057510158401015840

119897= (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

+ 4119889119901(119903)119901(119897)

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

+ 41198892

119901(119903)119901(119897)minus (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

= 4119889119901(119903)119901(119897)

(119891119901(119897)

minus 119878119901(119897)

+ 119891119901(119897)

) + 41198892

119901(119903)119901(119897)

= 4119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(A5)

By summing up (A4) and (A5) for 119897 = 119896 119903 minus 1 we obtain(20)

References

[1] I Borg and P J F Groenen Modern Multidimensional ScalingTheory and Appplications Springer New York NY USA 2ndedition 2005

[2] L Hubert P Arabie and J Meulman The Structural Repre-sentation of Proximity Matrices with MATLAB vol 19 SIAMPhiladelphia Pa USA 2006

[3] S Meisel and D Mattfeld ldquoSynergies of Operations Researchand Data Miningrdquo European Journal of Operational Researchvol 206 no 1 pp 1ndash10 2010

[4] G Caraux and S Pinloche ldquoPermutMatrix a graphical envi-ronment to arrange gene expression profiles in optimal linearorderrdquo Bioinformatics vol 21 no 7 pp 1280ndash1281 2005

[5] M K Ng E S Fung and H-Y Liao ldquoLinkage disequilibriummap by unidimensional nonnegative scalingrdquo in Proceedings ofthe 1st International Symposium on Optimization and SystemsBiology (OSB rsquo07) pp 302ndash308 Beijing China 2007

[6] P I Armstrong N A Fouad J Rounds and L HubertldquoQuantifying and interpreting group differences in interestprofilesrdquo Journal of Career Assessment vol 18 no 2 pp 115ndash1322010

[7] L Hubert and D Steinley ldquoAgreement among supreme courtjustices categorical versus continuous representationrdquo SIAMNews vol 38 no 7 2005

[8] A Dahabiah J Puentes and B Solaiman ldquoDigestive casebasemining based on possibility theory and linear unidimensionalscalingrdquo in Proceedings of the 8th WSEAS International Con-ference on Artificial Intelligence Knowledge Engineering andData Bases (AIKED rsquo09) pp 218ndash223 World Scientific andEngineering Academy and Society (WSEAS) Stevens PointWis USA 2009

[9] A Dahabiah J Puentes and B Solaiman ldquoPossibilistic patternrecognition in a digestive database for mining imperfect datardquoWSEASTransactions on Systems vol 8 no 2 pp 229ndash240 2009

[10] Y Ge Y Leung and J Ma ldquoUnidimensional scaling classifierand its application to remotely sensed datardquo inProceedings of theIEEE International Geoscience and Remote Sensing Symposium(IGARSS rsquo05) pp 3841ndash3844 July 2005

[11] J H Bihn M Verhaagh M Brandle and R Brandl ldquoDosecondary forests act as refuges for old growth forest animalsRecovery of ant diversity in the Atlantic forest of BrazilrdquoBiological Conservation vol 141 no 3 pp 733ndash743 2008

[12] D Defays ldquoA short note on a method of seriationrdquo BritishJournal of Mathematical and Statistical Psychology vol 31 no1 pp 49ndash53 1978

[13] M J Brusco and S Stahl ldquoOptimal least-squares unidimen-sional scaling improved branch-and-bound procedures andcomparison to dynamic programmingrdquo Psychometrika vol 70no 2 pp 253ndash270 2005

[14] L J Hubert and P Arabie ldquoUnidimensional scaling and com-binatorial optimizationrdquo in Multidimensional Data Analysis Jde Leeuw W J Heiser J Meulman and F Critchley Eds pp181ndash196 DSWO Press Leiden The Netherlands 1986

[15] L J Hubert P Arabie and J J Meulman ldquoLinear unidimen-sional scaling in the 119871

2-norm basic optimization methods

usingMATLABrdquo Journal of Classification vol 19 no 2 pp 303ndash328 2002

[16] M J Brusco and S Stahl Branch-and-Bound Applicationsin Combinatorial Data Analysis Springer Science + BusinessMedia New York NY USA 2005

[17] V Pliner ldquoMetric unidimensional scaling and global optimiza-tionrdquo Journal of Classification vol 13 no 1 pp 3ndash18 1996

[18] A Murillo J F Vera and W J Heiser ldquoA permutation-translation simulated annealing algorithm for 119871

1and 119871

2uni-

dimensional scalingrdquo Journal of Classification vol 22 no 1 pp119ndash138 2005

[19] M J Brusco ldquoOn the performance of simulated annealing forlarge-scale 119871

2unidimensional scalingrdquo Journal of Classification

vol 23 no 2 pp 255ndash268 2006[20] M J Brusco H F Kohn and S Stahl ldquoHeuristic imple-

mentation of dynamic programming for matrix permutationproblems in combinatorial data analysisrdquo Psychometrika vol73 no 3 pp 503ndash522 2008

[21] G Palubeckis ldquoIterated tabu search for the maximum diversityproblemrdquo Applied Mathematics and Computation vol 189 no1 pp 371ndash383 2007

[22] G Palubeckis ldquoA new bounding procedure and an improvedexact algorithm for the Max-2-SAT problemrdquo Applied Mathe-matics and Computation vol 215 no 3 pp 1106ndash1117 2009

[23] J Kluabwang D Puangdownreong and S Sujitjorn ldquoMultipathadaptive tabu search for a vehicle control problemrdquo Journal ofApplied Mathematics vol 2012 Article ID 731623 20 pages2012

[24] N Sarasiri K Suthamno and S Sujitjorn ldquoBacterial foraging-tabu search metaheuristics for identification of nonlinear fric-tion modelrdquo Journal of Applied Mathematics vol 2012 ArticleID 238563 23 pages 2012

[25] A Jouglet and J Carlier ldquoDominance rules in combinato-rial optimization problemsrdquo European Journal of OperationalResearch vol 212 no 3 pp 433ndash444 2011

[26] E Z Rothkopf ldquoA measure of stimulus similarity and errors insome paired-associate learning tasksrdquo Journal of ExperimentalPsychology vol 53 no 2 pp 94ndash101 1957

[27] I Duntsch and G Gediga ldquoModal-Style Operators in Qual-itative Data Analysisrdquo Tech Rep CS-02-15 Department ofComputer Science Brock University St Catharines OntarioCanada 2002

[28] R N Shepard ldquoAnalysis of proximities as a technique for thestudy of information processing in manrdquoHuman factors vol 5pp 33ndash48 1963

Journal of Applied Mathematics 15

[29] P J F Groenen and W J Heiser ldquoThe tunneling method forglobal optimization in multidimensional scalingrdquo Psychome-trika vol 61 no 3 pp 529ndash550 1996

[30] L Hubert P Arabie and JMeulmanCombinatorial Data Anal-ysis Optimization by Dynamic Programming SIAM Philadel-phia Pa USA 2001

[31] B H Ross and G L Murphy ldquoFood for thought cross-classification and category organization in a complex real-worlddomainrdquo Cognitive Psychology vol 38 no 4 pp 495ndash553 1999

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 5: Research Article An Improved Exact Algorithm for Least-Squares Unidimensional …downloads.hindawi.com/journals/jam/2013/890589.pdf · 2019-07-31 · by a 36×36 Morse code dissimilarity

Journal of Applied Mathematics 5

the algorithm Similar implementations of the ITS techniquefor other optimization problems are thoroughly describedfor example in [21 22] We should note that also otherenhancements of the generic tabu search strategy could beused for our purposes These include for example multipathadaptive tabu search [23] and bacterial foraging tabu search[24] algorithms

In the rest of this section we concentrate on howto efficiently compute the differences Δ(119901 119896119898) Certainlythis issue is central to the performance of the tabu searchprocedure But more importantly evaluation of pairwiseinterchanges of objects stands at the heart of the dominancetest which is one of the crucial components of our algorithmAs we will see in the next section the computation of the Δvalues is an operation the dominance test is based on

Brusco [19] derived the following formula to calculate thedifference between 119865(119901

119896119898) and 119865(119901)

Δ (119901 119896119898)

= (4

119898

sum

119897=119896+1

119889119901(119896)119901(119897)

)((

119898

sum

119897=119896+1

119889119901(119896)119901(119897)

) + 2119891119901(119896)

minus 119878119901(119896)

)

+ (4

119898minus1

sum

119897=119896

119889119901(119897)119901(119898)

)((

119898minus1

sum

119897=119896

119889119901(119897)119901(119898)

) minus 2119891119901(119898)

+ 119878119901(119898)

)

+

119898minus1

sum

119897=119896+1

4 (119889119901(119897)119901(119898)

minus 119889119901(119896)119901(119897)

) ((119889119901(119897)119901(119898)

minus 119889119901(119896)119901(119897)

)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(11)

Using (11) Δ(119901 119896119898) for all permutations in 1198732(119901) can be

computed in 119874(1198993) time However this approach is actually

too slow to be efficient In order to describe a more elaborateapproach we introduce three auxiliary 119899 times 119899 matrices 119860 =

(119886119894119902)119861 = (119887

119894119902) and119862 = (119888

119894119895) defined for a permutation119901 isin Π

Let us consider an object 119894 = 119901(119904) The entries of the 119894th rowof the matrices 119860 and 119861 are defined as follows if 119902 lt 119904 then119886119894119902= sum119904minus1

119897=119902119889119894119901(119897)

119887119894119902= sum119904minus1

119897=119902119889119894119901(119897)

(119889119894119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

) if 119902 gt 119904then 119886

119894119902= sum119902

119897=119904+1119889119894119901(119897)

119887119894119902= sum119902

119897=119904+1119889119894119901(119897)

(minus119889119894119901(119897)

+2119891119901(119897)

minus119878119901(119897)

)if 119902 = 119904 then 119886

119894119902= 119887119894119902= 0 Turning now to the matrix 119862 we

suppose that 119895 = 119901(119905) Then the entries of 119862 are given by thefollowing formula

119888119894119895=

max(119904119905)minus1sum

119897=min(119904119905)+1119889119894119901(119897)

119889119895119901(119897)

if |119904 minus 119905| gt 1

0 if |119904 minus 119905| = 1

(12)

Unlike 119860 and 119861 the matrix 119862 is symmetric We use thematrices 119860 119861 and 119862 to cast (11) in a form more suitable forboth the TS heuristic and the dominance test

Proposition 4 For 119901 isin Π 119896 isin 1 119899 minus 1 and 119898 isin 119896 +

1 119899

Δ (119901 119896119898) = 4119886119901(119896)119898

(119886119901(119896)119898

+ 2119886119901(119896)1

minus 119878119901(119896)

)

+ 4119886119901(119898)119896

(119886119901(119898)119896

minus 2119886119901(119898)1

+ 119878119901(119898)

)

+ 4 (119887119901(119898)119896+1

minus 119887119901(119896)119898minus1

minus 2119888119901(119896)119901(119898)

)

(13)

Proof First notice that the sums sum119898

119897=119896+1119889119901(119896)119901(119897)

and sum119898minus1

119897=119896

119889119901(119897)119901(119898)

in (11) are equal respectively to 119886119901(119896)119898

and 119886119901(119898)119896

Next it can be seen that 119891

119901(119896)= 119886119901(119896)1

and 119891119901(119898)

= 119886119901(119898)1

Finally the third term in (11) ignoring the factor of 4 can berewritten as119898minus1

sum

119897=119896+1

119889119901(119897)119901(119898)

(119889119901(119897)119901(119898)

+ 2119891119901(119897)

minus 119878119901(119897)

)

minus

119898minus1

sum

119897=119896+1

119889119901(119897)119901(119898)

119889119901(119896)119901(119897)

minus

119898minus1

sum

119897=119896+1

119889119901(119896)119901(119897)

119889119901(119897)119901(119898)

minus

119898minus1

sum

119897=119896+1

119889119901(119896)119901(119897)

(minus119889119901(119896)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(14)

It is easy to see that the first sum in (14) is equal to 119887119901(119898)119896+1

each of the next two sums is equal to 119888

119901(119896)119901(119898) and the last sum

is equal to 119887119901(119896)119898minus1

After simple manipulations we arrive at(13)

Having the matrices 119860 119861 and 119862 and using (13) the 2-interchange neighborhood of 119901 can be explored in 119874(119899

2)

time The question that remains to be addressed is howefficiently the entries of these matrices can be updated afterthe replacement of 119901 by a permutation 119901

1015840 chosen from theneighborhood 119873

2(119901) Suppose that 1199011015840 is obtained from 119901 by

interchanging the objects in positions 119896 and119898 Then we canwrite the following formulas that are involved in updating thematrices 119860 and 119861

119886119894119902= 119886119894119902minus1

+ 1198891198941199011015840(119902) (15)

119886119894119902= 119886119894119902+1

+ 1198891198941199011015840(119902) (16)

119887119894119902= 119887119894119902minus1

+ 1198891198941199011015840(119902)

(21198861199011015840(119902)1

minus 1198781199011015840(119902)

minus 1198891198941199011015840(119902)) (17)

119887119894119902= 119887119894119902+1

+ 1198891198941199011015840(119902)

(21198861199011015840(119902)1

minus 1198781199011015840(119902)

+ 1198891198941199011015840(119902)) (18)

From now on we will assume that the positions 119904 and 119905 of theobjects 119894 and 119895 are defined with respect to the permutation 119901

1015840More formally we have 119894 = 119901

1015840(119904) 119895 = 119901

1015840(119905)The procedure for

updating 119860 119861 and 119862 is summarized in the following

(1) for each object 119894 isin 1 119899 do the following

(11) if 119904 lt 119896 then compute 119886119894119902 119902 = 119896 119898 minus 1 by

(15) and 119887119894119902 119902 = 119896 119899 by (17)

(12) if 119904 gt 119898 then compute 119886119894119902 119902 = 119898 119898minus1 119896+

1 by (16) and 119887119894119902 119902 = 119898119898 minus 1 1 by (18)

(13) if 119896 lt 119904 lt 119898 then compute 119886119894119902by (15) for 119902 =

119898 119899 and by (16) for 119902 = 119896 119896 minus 1 1

6 Journal of Applied Mathematics

(14) if 119904 = 119896 or 119904 = 119898 then first set 119886119894119904

= 0

and afterwards compute 119886119894119902by (15) for 119902 = 119904 +

1 119899 and by (16) for 119902 = 119904 minus 1 119904 minus 2 1(15) if 119896 ⩽ 119904 ⩽ 119898 then first set 119887

119894119904= 0 and afterwards

compute 119887119894119902by (17) for 119902 = 119904+1 119899 and by (18)

for 119902 = 119904 minus 1 119904 minus 2 1

(2) for each pair of objects 119894 119895 isin 1 119899 such that 119904 lt 119905

do the following

(21) if 119904 lt 119896 lt 119905 lt 119898 then set 119888119894119895= 119888119894119895+1198891198941199011015840(119896)1198891198951199011015840(119896)

minus

1198891198941199011015840(119898)

1198891198951199011015840(119898)

(22) if 119896 lt 119904 lt 119898 lt 119905 then set 119888

119894119895= 119888119894119895+

1198891198941199011015840(119898)

1198891198951199011015840(119898)

minus 1198891198941199011015840(119896)1198891198951199011015840(119896)

(23) if |119904 119905 cap 119896119898| = 1 then compute 119888119894119895anew

using (12)(24) if 119888

119894119895has been updated in (21)ndash(23) then set

119888119895119894= 119888119894119895

It can be noted that for some pairs of objects 119894 and 119895

the entry 119888119894119895remains unchanged There are five such cases

(1) 119904 119905 lt 119896 (2) 119896 lt 119904 119905 lt 119898 (3) 119904 119905 gt 119898 (4) 119904 lt 119896 lt 119898 lt 119905(5) 119904 119905 = 119896119898

We now evaluate the time complexity of the describedprocedure Each of Steps (11) to (15) requires 119874(119899) timeThus the complexity of Step (1) is 119874(119899

2) The number of

operations in Steps (21) (22) and (24) is 119874(1) and that inStep (23) is 119874(119899) However the latter is executed only 119874(119899)

timesTherefore the overall complexity of Step (2) and henceof the entire procedure is 119874(119899

2)

Formula (13) and the previous described procedure forupdating matrices 119860 119861 and 119862 lie at the core of ourimplementation of the TS method The matrices 119860 119861 and 119862

can be initialized by applying the formulas (15)ndash(18) and (12)with respect to the starting permutationThe time complexityof initialization is 119874(119899

2) for 119860 and 119861 and 119874(119899

3) for 119862 The

auxiliary matrices 119860 119861 and 119862 and the expression (13) forΔ(119901 119896119898) are also used in the dominance test presented inthe next section This test plays a major role in pruning thesearch tree during the branching and bounding process

4 Dominance Test

As is well known the use of dominance tests in enumerativemethods allows reducing the search space of the optimizationproblem of interest A comprehensive review of dominancetests (or rules) in combinatorial optimization can be found inthe paper by Jouglet and Carlier [25] For the problem we aredealing with in this study Brusco and Stahl [13] proposed adominance test for discarding partial solutions which relieson performing pairwise interchanges of objects To formulateit let us assume that 119901 = (119901(1) 119901(119903)) 119903 lt 119899 is a partialsolution to the UDSP The test checks whether Δ(119901 119896 119903) gt 0

for at least one of the positions 119896 isin 1 119903 minus 1 If thiscondition is satisfied then we say that the test fails In thiscase 119901 is thrown away Otherwise 119901 needs to be extendedby placing an unassigned object in position 119903 + 1 The firstcondition in our dominance test is the same as in the test

suggested by Brusco and Stahl To compute Δ(119901 119896 119903) weuse subsets of entries of the auxiliary matrices 119860 119861 and119862 defined in the previous section Since only the sign ofΔ(119901 119896 119903) matters the factor of 4 can be ignored in (13) andthus Δ

1015840(119901 119896 119903) = Δ(119901 119896 119903)4 instead of Δ(119901 119896 119903) can be

considered From (13) we obtain

Δ1015840(119901 119896 119903) = 119886

119901(119896)119903(119886119901(119896)119903

+ 2119886119901(119896)1

minus 119878119901(119896)

)

+ 119886119901(119903)119896

(119886119901(119903)119896

minus 2119886119901(119903)1

+ 119878119901(119903)

)

+ 119887119901(119903)119896+1

minus 119887119901(119896)119903minus1

minus 2119888119901(119896)119901(119903)

(19)

We also take advantage of the case where Δ1015840(119901 119896 119903) =

0 We use the lexicographic rule for breaking ties The testoutputs a ldquoFailrdquo verdict if the equality Δ1015840(119901 119896 119903) = 0 is foundto hold true for some 119896 isin 1 119903 minus 1 such that 119901(119896) gt 119901(119903)Indeed 119901 = (119901(1) 119901(119896 minus 1) 119901(119896) 119901(119896 + 1) 119901(119903))

can be discarded because both 119901 and partial solution 1199011015840=

(119901(1) 119901(119896 minus 1) 119901(119903) 119901(119896 + 1) 119901(119896)) are equally goodbut 119901 is lexicographically larger than 119901

1015840We note that the entries of the matrices 119860 119861 and 119862 used

in (19) can be easily obtained from a subset of entries of119860 119861and 119862 computed earlier in the search process Consider forexample 119888

119901(119896)119901(119903)in (19) It can be assumed that 119896 isin 1 119903minus

2 because if 119896 = 119903 minus 1 then 119888119901(119896)119901(119903)

= 0 For simplicity let119901(119903) = 119894 and 119901(119903 minus 1) = 119895 To avoid ambiguity let us denoteby 1198621015840 the matrix 119862 that is obtained during processing of the

partial solution (119901(1) 119901(119903 minus 2)) The entry 1198881015840

119901(119896)119894of 1198621015840 is

calculated by (temporarily) placing the object 119894 in position119903 minus 1 Now referring to the definition of 119862 it is easy to seethat 119888

119901(119896)119894= 1198881015840

119901(119896)119894+ 119889119901(119896)119895

119889119894119895 Similarly using formulas of

types (15) and (17) we can obtain the required entries of thematrices 119860 and 119861 Thus (19) can be computed in constanttime The complexity of computing Δ

1015840(119901 119896 119903) for all 119896 is

119874(119899) The dominance test used in [13] relies on (11) and hascomplexity 119874(119899

2) So our implementation of the dominance

test is significantly more time-efficient than that proposed byBrusco and Stahl

In order to make the test even more efficient we inaddition decided to evaluate those partial solutions thatcan be obtained from 119901 by relocating the object 119901(119903) fromposition 119903 to position 119896 isin 1 119903 minus 2 (here position119896 = 119903 minus 1 is excluded since in this case relocating wouldbecome a pairwise interchange of objects) Thus the testcompares 119901 against partial solutions 119901

119896= (119901(1) 119901(119896 minus

1) 119901(119903) 119901(119896) 119901(119903 minus 1)) 119896 = 1 119903 minus 2 Now supposethat 119901 and 119901

119896are extended to 119899-permutations 119901 isin Π and

respectively 119901119896isin Π such that for each 119897 isin 119903 + 1 119899 the

object in position 119897 of 119901 coincides with that in position 119897 of 119901119896

Let us define 120575119903119896to be the difference between 119865(119901

119896) and 119865(119901)

This difference can be computed as follows

120575119903119896

= (4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)((

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

) minus 2119891119901(119903)

+ 119878119901(119903)

)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(20)

Journal of Applied Mathematics 7

The previous expression is similar to the formula (6) in [19]For completeness we give its derivation in the Appendix Wecan rewrite (20) using the matrices 119860 and 119861

Proposition 5 For 119901 isin Π 119896 isin 1 119899 minus 1 and 119903 isin 119896 +

1 119899

120575119903119896

= 4119886119901(119903)119896

(119886119901(119903)119896

minus 2119886119901(119903)1

+ 119878119901(119903)

) + 4119887119901(119903)119896

(21)

The partial solution 119901 is dominated by 119901119896if 120575119903119896

gt 0Upon establishing this fact for some 119896 isin 1 119903 minus 2 thetest immediately stops with the rejection of 119901 Of course likein the case of (13) a factor of 4 in (21) can be dropped Thecomputational overhead of including the condition 120575

119903119896gt 0

in the test is minimal because all the quantities appearing onthe right-hand side of (21) except 119887

119901(119903)1 are also used in the

calculation ofΔ1015840(119901 119896 119903) (see (19)) Since the loop over 119896needsto be executed the time complexity of this part of the test is119874(119899) Certainly evaluation of relocating the last added objectin 119901 gives more power to the test Thus our dominance testis an improvement over the test described in the Brusco andStahl [13] paper

5 A Branch-and-Bound Algorithm

Our approach to the UDSP is to start with a good initial solu-tion provided by the iterated tabu search heuristic invokedin the initialization phase of the algorithm and then proceedwith the main phase where using the branch-and-boundtechnique a search tree is built For the purpose of pruningbranches of the search tree the algorithm involves three tests(1) the upper bound test (based on 119880

1015840) (2) the dominancetest (3) the symmetry testThe first two of them have alreadybeen described in the prior sections The last one is the mostsimple and cheap to apply It hinges on the fact that the valuesof the objective function 119865 at a permutation 119901 isin Π and itsreverse 119901 defined by 119901(119894) = 119901(119899 minus 119894 + 1) 119894 = 1 119899 arethe same Therefore it is very natural to demand that somefixed object in 119882 say 119908 being among those objects that areassigned to the first half of the positions in 119901 More formallythe condition is 119894 ⩽ lceil1198992rceil where 119894 is such that 119901(119894) = 119908If this condition is violated for 119901 = (119901(1) 119901(lceil1198992rceil)) then the symmetry test fails and 119901 is discarded Because ofthe requirement to fulfill the previous condition some careshould be taken in applying the dominance test In the case ofΔ1015840(119901 119896 119903) = 0 and 119901(119896) gt 119901(119903) (see the paragraph following

(19)) this test in our implementation of the algorithm gives aldquoFailrdquo verdict only if either 119903 ⩽ lceil1198992rceil or 119901(119896) = 119908

Through the preliminary computational experimentswe were convinced that the upper bound test was quiteweak when applied to smaller partial solutions In otherwords the time overhead for computing linear assignment-based bounds did not pay off for partial permutations 119901 =

(119901(1) 119901(119903)) of length 119903 that is small or modest comparedto 119899 Therefore it is reasonable to compute the upper bound1198801015840 (described in Section 2) only when 119903 ⩾ 120582(119863) where

120582(119863) is a threshold value depending on the UDSP instancematrix 119863 and thus also on 119899 The problem we face is howto determine an appropriate value for 120582(119863) We combat this

problem with a simple procedure consisting of two steps Inthe first step it randomly generates a relatively small numberof partial permutations 119901

1 119901

120591 each of length lfloor1198992rfloor For

each permutation in this sample the procedure computes theupper bound 119880

1015840 Let 11988010158401 119880

1015840

120591be the values obtained We

are interested in the normalized differences between thesevalues and 119865(119901heur) where 119901heur is the solution delivered bythe ITS algorithm So the procedure calculates the sum 119885 =

sum120591

119894=1100(119880

1015840

119894minus 119865(119901heur))119865(119901heur) and the average 119877 = 119885120591 of

such differences In the second step using119877 it determines thethreshold value This is done in the following way if 119877 lt 119877

1

then 120582(119863) = lfloor1198993rfloor if 1198771⩽ 119877 lt 119877

2 then 120582(119863) = lfloor1198992rfloor if 119877 gt

1198772 then 120582(119863) = lfloor31198994rfloor The described procedure is executed

at the initialization step of the algorithm Its parameters are1198771 1198772 and the sample size 120591

Armed with tools for pruning the search space we canconstruct a branch-and-bound algorithm for solving theUDSP In order to make the description of the algorithmmore readable we divide it into two parts namely themain procedure BampB (branch-and-bound) and an auxiliaryprocedure SelectObject The former builds a search tree withnodes representing partial solutionsThe task of SelectObjectis either to choose an object which can be used to extendthe current permutation or if the current node becomesfathomed to perform a backtracking step In the descriptionof the algorithm given in the following the set of objects thatcan be used to create new branches is denoted by119881Themainprocedure goes as follows

BampB

(1) Apply the ITS algorithm to get an initial solution 119901lowast

to a given instance of the UDSP

Set 119865lowast = 119865(119901lowast)

(2) Calculate 120582(119863)(3) Initialize the search tree by creating the root node

with the empty set 119881 attached to it

Set 119871 = 0 (119871 denotes the tree level the currentlyconsidered node belongs to)

(4) Set 119903 = 119871

(5) Call SelectObject(119903) It returns 119871 and if 119871 = 119903additionally some unassigned object (let it be denotedby V) If 119871 lt 0 then go to (7) Otherwise checkwhether 119871 lt 119903 If so then return to (4) (perform abacktracking step) If not (in which case 119871 = 119903) thenproceed to (6)

(6) Create a new node with empty 119881 Set 119901(119903 + 1) = VIncrement 119871 by 1 and go to (4)

(7) Calculate the coordinates for the objects by applyingformula (4) with respect to the permutation 119901

lowast andstop

Steps (1) and (2) of BampB comprise the initializationphase of the algorithm The initial solution to the problemis produced using ITS This solution is saved as the bestsolution found so far denoted by 119901lowast If later a better solutionis found 119901lowast is updated This is done within the SelectObject

8 Journal of Applied Mathematics

procedure In the final step BampB uses 119901lowast to calculate the

optimal coordinates for the objects The branch-and-boundphase starts at Step (3) of BampB Each node of the search treeis assigned a set 119881 The partial solution corresponding to thecurrent node is denoted by 119901 The objects of 119881 are used toextend 119901 Upon creation of a node (Step (3) for root andStep (6) for other nodes) the set 119881 is empty In SelectObject119881 is filled with objects that are promising candidates toperform the branching operation on the corresponding nodeSelectObject also decides which action has to be taken nextIf 119871 = 119903 then the action is ldquobranchrdquo whereas if 119871 lt 119903 theaction is ldquobacktrackrdquo Step (4) helps to discriminate betweenthese two cases If 119871 = 119903 then a new node in the nextlevel of the search tree is generated At that point the partialpermutation 119901 is extended by assigning the selected object Vto the position 119903 + 1 in119901 (see Step (6)) If however119871 lt 119903 and 119903is positive then backtracking is performed by repeating Steps(4) and (5) We thus see that the search process is controlledby the procedure SelectObject This procedure can be statedas follows

SelectObject(119903)

(1) Check whether the set119881 attached to the current nodeis empty If so then go to (2) If not then consider itssubset 1198811015840 = 119894 isin 119881 | 119894 is unmarked and 119906

119894(119901) gt

119865lowast If 1198811015840 is nonempty then mark an arbitrary object

V isin 1198811015840 and return with it as well as with 119871 = 119903

Otherwise go to (6)(2) If 119903 gt 0 then go to (3) Otherwise set 119881 = 1 119899

119906119894(119901) = infin 119894 = 1 119899 and go to (4)

(3) If 119903 = 119899 minus 1 then go to (5) Otherwise for each 119894 isin

119882(119901) perform the following operations Temporarilyset 119901(119903 + 1) = 119894 Apply the symmetry test thedominance test and if 119903 ⩾ 120582(119863) also the bound testto the resulting partial solution 119901

1015840= (119901(1) 119901(119903 +

1)) If it passes all the tests then do the followingInclude 119894 into 119881 If 119903 ⩾ 120582(119863) assign to 119906

119894(119901) the value

of the upper bound 1198801015840 computed for the subproblem

defined by the partial solution 1199011015840 If however 119903 lt

120582(119863) then set 119906119894(119901) = infin If after processing all

objects in 119882(119901) the set 119881 remains empty then go to(6) Otherwise proceed to (4)

(4) Mark an arbitrary object V isin 119881 and return with it aswell as with 119871 = 119903

(5) Let 119882(119901) = 119895 Calculate the value of the solution119901 = (119901(1) 119901(119899 minus 1) 119901(119899) = 119895) If 119865(119901) gt 119865

lowast thenset 119901lowast = 119901 and 119865

lowast= 119865(119901)

(6) If 119903 gt 0 then drop the 119903th element of 119901 Return with119871 = 119903 minus 1

In the previous description 119882(119901) denotes the set ofunassigned objects as before The objects included in 119881 sube

119882(119901) are considered as candidates to be assigned to theleftmost available position in 119901 At the root node 119881 = 119882(119901)In all other cases except when |119882(119901)| = 1 an object becomesa member of the set 119881 only if it passes two or for longer 119901three tests The third of them the upper bound test amounts

to check if 1198801015840 gt 119865lowast This condition is checked again on

the second and subsequent calls to SelectObject for the samenode of the search tree This is done in Step (1) for eachunmarked object 119894 isin 119881 Such a strategy is reasonable becauseduring the branch-and-bound process the value of 119865lowast mayincrease and the previously satisfied constraint1198801015840 gt 119865

lowast maybecome violated for some 119894 isin 119881 To be able to identify suchobjects the algorithm stores the computed upper bound 119880

1015840

as 119906119894(119901) at Step (3) of SelectObjectA closer look at the procedure SelectObject shows that

two scenarios are possible (1)when this procedure is invokedfollowing the creation of a newnode of the tree (which is doneeither in Step (3) or in Step (6) of BampB) (2)when a call to theprocedure ismade after a backtracking stepThe set119881 initiallyis empty and therefore Steps (2) to (5) of SelectObject becomeactive only in the first of these two scenarios Dependingon the current level of the tree 119903 one of the three cases isconsidered If the current node is the root that is 119903 = 0 thenStep (2) is executed Since in this case the dominance test isnot applicable the algorithm uses the set119881 comprising all theobjects under consideration An arbitrary object selected inStep (4) is then used to create the first branch emanating fromthe root of the tree If 119903 isin [1 119899 minus 2] then the algorithmproceeds to Step (3) The task of this step is to evaluate allpartial solutions that can be obtained from the current partialpermutation119901by assigning an object from119882(119901) to the (119903 + 1)

th position in 119901 In order not to lose at least one optimalsolution it is sufficient to keep only those extensions of 119901 forwhich all the tests were successful Such partial solutions arememorized by the set 119881 One of them is chosen in Step (4)

to be used (in Step (6) of BampB) to generate the first son ofthe current node If however no promising partial solutionhas been found and thus the set 119881 is empty then the currentnode can be discarded and the search continued from itsparent node Finally if 119903 = 119899 minus 1 then Step (5) is executedin which the permutation 119901 is completed by placing theremaining unassigned object in the last position of119901 Its valueis compared with that of the best solution found so far Thenode corresponding to 119901 is a leaf and therefore is fathomedIn the second of the above-mentioned two scenarios (Step(1) of the procedure) the already existing nonempty set 119881is scanned The previously selected objects in 119881 have beenmarked They are ignored For each remaining object 119894 it ischecked whether 119906

119894(119901) gt 119865

lowast If so then 119894 is included inthe set 1198811015840 If the resulting set 1198811015840 is empty then the currentnode of the tree becomes fathomed and the procedure goesto Step (6) Otherwise the procedure chooses a branchingobject and passes it to its caller BampB Our implementationof the algorithm adopts a very simple enumeration strategybranch upon an arbitrary object in 119881

1015840 (or 119881 when Step (4) isperformed)

It is useful to note that the tests in Step (3) are applied inincreasing order of time complexityThe symmetry test is thecheapest of them whereas the upper bound test is the mostexpensive to perform but still effective

6 Computational Results

The main goal of our numerical experiments was to pro-vide some measure of the performance of the developed

Journal of Applied Mathematics 9

BampB algorithm as well as to demonstrate the benefits ofusing the new upper bound and efficient dominance testthroughout the solution process For comparison purposeswe also implemented the better of the two branch-and-bound algorithms of Brusco and Stahl [13] In their studythis algorithm is denoted as BB3 When referring to it inthis paper we will use the same name In order to make thecomparison more fair we decided to start BB3 by invokingthe ITS heuristic Thus the initial solution from which thebranch-and-bound phase of each of the algorithms starts isexpected to be of the same quality

61 Experimental Setup Both our algorithm and BB3 havebeen coded in the C programming language and all the testshave been carried out on a PC with an Intel Core 2 DuoCPU running at 30GHz As a testbed for the algorithmsthree empirical dissimilarity matrices from the literature inaddition to a number of randomly generated ones wereconsidered The random matrices vary in size from 20 to30 objects All off-diagonal entries are drawn uniformly atrandom from the interval [1 100] The matrices of courseare symmetric

We also used three dissimilarity matrices constructedfrom empirical data found in the literature The first matrixis based onMorse code confusion data collected by Rothkopf[26] In his experiment the subjects were presented pairsof Morse code signals and asked to respond whether theythought the signals were the same or differentThese answerswere translated to the entries of the similarity matrix withrows and columns corresponding to the 36 alphanumericcharacters (26 letters in the Latin alphabet and 10 digits in thedecimal number system) We should mention that we foundsmall discrepancies between Morse code similarity matricesgiven in various publications and electronic media We havedecided to take the matrix from [27] where it is presentedwith a reference to Shepard [28] Let thismatrix be denoted by119872 = (119898

119894119895) Its entries are integers from the interval [0 100]

Like in the study of Brusco and Stahl [13] the correspondingsymmetric dissimilarity matrix 119863 = (119889

119894119895) was obtained by

using the formula 119889119894119895= 200 minus 119898

119894119895minus 119898119895119894for all off-diagonal

entries and setting the diagonal of 119863 to zero We consideredtwoUDSP instances constructed usingMorse code confusiondatamdashone defined by the 26 times 26 submatrix of119863 with rowsand columns labeled by the letters and another defined by thefull matrix119863

The second empirical matrix was taken from Groenenand Heiser [29] We denote it by 119866 = (119892

119894119895) Its entry 119892

119894119895

gives the number of citations in journal 119894 to journal 119895 Thusthe rows of 119866 correspond to citing journals and the columnscorrespond to cited journals The dissimilarity matrix 119863 wasderived from 119866 in two steps like in [13] First a symmetricsimilarity matrix 119867 = (ℎ

119894119895) was calculated by using the

formulas ℎ119894119895

= ℎ119895119894

= (119892119894119895sum119899

119897=1l = 119894 119892119894119897) + (119892119895119894sum119899

119897=1119897 = 119895119892119895119897)

119894 119895 = 1 119899 119894 lt 119895 and ℎ119894119894

= 0 119894 = 1 119899 Then 119863

was constructed with entries 119889119894119895

= lfloor100(max119897119896ℎ119897119896

minus ℎ119894119895)rfloor

119894 119895 = 1 119899 119894 = 119895 and 119889119894119894= 0 119894 = 1 119899

The third empirical matrix used in our experimentswas taken from Hubert et al [30] Its entries represent the

dissimilarities between pairs of food items As mentioned byHubert et al [30] this 45 times 45 food-item matrix 119863foodwas constructed based on data provided by Ross andMurphy[31] For our computational tests we considered submatricesof 119863food defined by the first 119899 isin 20 21 35 food itemsAll the dissimilarity matrices we just described as well asthe source code of our algorithm are publicly available athttpwwwproinktultsimgintarasudshtml

Based on the results of preliminary experiments with theBampB algorithm we have fixed the values of its parameters120591 = 10 119877

1= 0 and 119877

2= 5 In the symmetry test119908 was fixed

at 1 The cutoff time for ITS was 1 second

62 Results The first experiment was conducted on a seriesof randomly generated full density dissimilarity matricesThe results of BampB and BB3 for them are summarized inTable 1 The dimension of the matrix 119863 is encoded in theinstance name displayed in the first column The secondcolumn of the table shows for each instance the optimalvalue of the objective function 119865

lowast and in parentheses tworelated metrics The first of them is the raw value of thestress function Φ(119909) The second number in parentheses isthe normalized value of the stress function that is calculatedasΦ(119909)sum

119894lt119895119889119894119895 The next two columns list the results of our

algorithm the number of nodes in the search tree and theCPU time reported in the form hours minutes seconds orminutes seconds or seconds The last two columns give thesemeasures for BB3

As it can be seen from the table BampB is superior to BB3We observe for example that the decrease in CPU time overBB3 is around 40 for the largest three instances Also ouralgorithm builds smaller search trees than BB3

In our second experiment we opt for three empiricaldissimilarity matrices mentioned in Section 61 Table 2 sum-marizes the results of BampB and BB3 on the UDSP instancesdefined by these matrices Its structure is the same as that ofTable 1 In the instance names the letter ldquoMrdquo stands for theMorse codematrix ldquoJrdquo stands for the journal citationsmatrixand ldquoFrdquo stands for the food item data As before the numberfollowing the letter indicates the dimension of the problemmatrix Comparing results in Tables 1 and 2 we find thatfor empirical data the superiority of our algorithm over BB3is more pronounced Basically BampB performs significantlybetter than BB3 for each of the three data sets For examplefor the food-item dissimilarity matrix of size 30 times 30 BampBbuilds almost 10 times smaller search tree and is about 5 timesfaster than BB3

Table 3 presents the results of the performance of BampBon the full Morse code dissimilarity matrix as well as onlarger submatrices of the food-item dissimilarity matrixRunning BB3 on these UDSP instances is too expensive sothis algorithm is not included in the table By analyzing theresults in Tables 2 and 3 we find that for 119894 isin 22 35 theCPU time taken by BampB to solve the problem with the 119894 times 119894

food-item submatrix is between 196 (for 119894 = 30) and 282 (for119894 = 27) percent of that needed to solve the problem with the(119894 minus1)times (119894minus1) food-item submatrix For the submatrix of size35 times 35 the CPU time taken by the algorithm was more than

10 Journal of Applied Mathematics

Table 1 Performance on random problem instances

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

p20 9053452 (1797554 0284) 1264871 44 1526725 56p21 9887394 (1879277 0285) 3304678 106 4169099 145p22 12548324 (2245406 0282) 6210311 205 8017112 290p23 14632006 (2396198 0274) 12979353 450 16624663 1067p24 16211210 (2815859 0294) 31820463 1533 42435561 2555p25 15877026 (3126700 0330) 103208158 6446 137525505 9524p26 20837730 (3471478 0302) 200528002 13354 267603807 20436p27 21464870 (3585958 0311) 287794935 20377 392405143 32365p28 23938264 (3961360 0317) 581602877 44093 812649051 113054p29 28613820 (4537786 0315) 2047228373 250572 3044704554 438570p30 30359134 (4645569 0315) 3555327537 509099 5216804247 855046

Table 2 Performance on empirical dissimilarity matrices

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

M26 180577772 (19448891 0219) 108400003 7432 201264785 16143J28 60653172 (7167033 0249) 2019598593 348006 6780517853 1039066F20 19344116 (1051432 0098) 236990 27 1971840 70F21 22694178 (1229979 0102) 568998 52 4716777 162F22 26496098 (1495949 0110) 1178415 106 9993241 357F23 30853716 (1757681 0116) 2524703 252 24357098 1378F24 35075372 (2174712 0130) 6727692 1001 51593000 3304F25 40091544 (2491002 0134) 14560522 2283 139580891 10014F26 45507780 (2886048 0142) 40465339 6553 298597411 23188F27 51859210 (3336825 0148) 116161004 19298 670492972 55578F28 58712130 (3788642 0153) 253278124 44358 1366152797 203265F29 66198704 (4203981 0156) 530219761 142265 3616257778 532084F30 74310052 (4629353 0157) 972413816 320566 9360343618 1556412

14 days In fact a solution of value 120526370was found by theITS heuristic in less than 1 sThe rest of the time was spent onproving its optimalityWe also see fromTable 3 that theUDSPwith the full Morse code dissimilarity matrix required nearly21 days to solve to proven optimality using our approach

In Table 4 we display optimal solutions for the twolargest UDSP instances we have tried The second and fourthcolumns of the table contain the optimal permutations forthese instances The coordinates for objects calculated using(4) are given in the third and fifth columns The number inparenthesis in the fourth column shows the order in whichthe food items appear within Table 51 in the book by Hubertet al [30] From Table 4 we observe that only a half of thesignals representing digits in theMorse code are placed at theend of the scale Other such signals are positioned betweensignals encoding letters We also see that shorter signals tendto occupy positions at the beginning of the permutation 119901

In Table 5 we evaluate the effectiveness of tests used tofathom partial solutions Computational results are reportedfor a selection of UDSP instances The second third andfourth columns of the table present 100119873sym119873 100119873dom119873and 100119873UB119873 where119873sym (respectively119873dom and119873UB) isthe number of partial permutations that are fathomed due

to symmetry test (respectively due to dominance test anddue to upper bound test) in Step (3) of SelectObject and119873 = 119873sym+119873dom+119873UBThe last three columns give the sameset of statistics for the BB3 algorithm Inspection of Table 5reveals that119873dom is much larger than both119873sym and119873UB Itcan also be seen that119873dom values are consistently higher withBB3 thanwhen using our approach Comparing the other twotests we observe that 119873sym is larger than 119873UB in all casesexcept when applying BampB to the food-item dissimilaritysubmatrices of size 20 times 20 and 25 times 25 Generally we noticethat the upper bound test is most successful for the UDSPinstances defined by the food-item matrix

We end this section with a discussion of two issuesregarding the use of heuristic algorithms for the UDSP Froma practical point of view heuristics of course are preferableto exact methods as they are fast and can be applied to largeproblem instances If carefully designed they are able toproduce solutions of very high quality The development ofexact methods on the other hand is a theoretical avenue forcoping with different problems In the field of optimizationexact algorithms for a problem of interest provide not only asolution but also a certificate of its optimality Therefore oneof possible applications of exact methods is to use them in

Journal of Applied Mathematics 11

Table 3 Performance of BampB on larger empirical dissimilarity matrices

Instance Objective function value Number of nodes TimeF31 82221440 (5194455 0164) 2469803824 857038F32 91389180 (5703241 0166) 5040451368 1944127F33 100367800 (6412395 0174) 14318454604 5524200F34 110065196 (7104054 0180) 33035826208 12936500F35 120526370 (7841564 0185) 82986984742 34256456M36 489111494 (40682956 0230) 243227368848 49451175

Table 4 Optimal solutions for the full Morse code dissimilarity matrix and the 35 times 35 food-item dissimilarity matrix

119894

Morse code full Food item data119901(119894) 119909

119901(119894)119901(119894) 119909

119901(119894)

1 (E) ∙ 000000 orange (3) 0000002 (T) minus 583333 watermelon (2) 0285713 (I) ∙∙ 2477778 apple (1) 0285714 (A) ∙minus 3408333 banana (4) 1285715 (N) minus∙ 4411111 pineapple (5) 1628576 (M) minusminus 5233333 lettuce (6) 21257147 (S) ∙ ∙ ∙ 7030556 broccoli (7) 22314298 (U) ∙ ∙ minus 8313889 carrots (8) 22542869 (R) ∙ minus ∙ 9347222 corn (9) 228857110 (W) ∙ minus minus 10297222 onions (10) 251142911 (H) ∙ ∙ ∙∙ 11444444 potato (11) 299714312 (D) minus ∙ ∙ 12430556 rice (12) 516285713 (K) minus ∙ minus 13152778 spaghetti (19) 589714314 (V) ∙ ∙ ∙minus 14500000 bread (13) 644857115 (5) ∙ ∙ ∙ ∙ ∙ 15244444 bagel (14) 698571416 (4) ∙ ∙ ∙ ∙ minus 15991667 cereal (16) 721142917 (F) ∙ ∙ minus∙ 17075000 oatmeal (15) 725142918 (L) ∙ minus ∙∙ 18225000 pancake (18) 763714319 (B) minus ∙ ∙∙ 18952778 muffin (17) 778000020 (X) minus ∙ ∙minus 19955556 crackers (20) 889714321 (6) minus ∙ ∙ ∙ ∙ 20566667 granola bar (21) 930000022 (3) ∙ ∙ ∙ minus minus 22013889 pretzels (22) 1007428623 (C) minus ∙ minus∙ 22947222 nuts (24) 1043714324 (Y) minus ∙ minusminus 23841667 popcorn (23) 1076285725 (7) minus minus ∙ ∙ ∙ 24930556 potato chips (25) 1124857126 (Z) minus minus ∙∙ 25488889 doughnuts (26) 1200857127 (Q) minus minus ∙minus 26438889 pizza (31) 1265142928 (P) ∙ minus minus∙ 27083333 cookies (27) 1368857129 (J) ∙ minus minusminus 28294444 chocolate bar (29) 1394000030 (G) minus minus ∙ 29247222 cake (28) 1410571431 (O) minus minus minus 30022222 pie (30) 1435428632 (2) ∙ ∙ minus minus minus 31050000 ice cream (32) 1520000033 (8) minus minus minus ∙ ∙ 32036111 yogurt (33) 1571142934 (1) ∙ minus minus minus minus 33350000 butter (34) 1618000035 (9) minus minus minus minus ∙ 34186111 cheese (35) 1650857136 (0) minus minus minus minus minus 35027778

12 Journal of Applied Mathematics

Table 5 Relative performance of the tests

Instance B amp B BB3Sym Dom UB Sym Dom UB

p20 1158a 8824b 018c 1111a 8873b 016c

p25 886a 9087b 027c 848a 9143b 009c

p30 701a 9284 b 015c 629a 9364b 007c

M26 1140a 8847 b 013c 1044a 8952b 004c

J28 855a 8923 b 222c 568 a 9426b 006c

F20 801a 7563 b 1636c 1325a 8542b 133c

F25 704a 8127 b 1169c 1031a 8939b 030c

F30 895a 8391b 714c 774a 9216b 010c

F35 853a 8723b 424c mdasha mdashb mdashc

M36 882a 9114b 004c mdasha mdashb mdashc

aPercentage of partial solutions fathomed due to symmetry test

bPercentage of partial solutions fathomed due to dominance testcPercentage of partial solutions fathomed due to bound test

the process of assessment of heuristic techniques Specificallyoptimal values can serve as a reference point with which theresults of heuristics can be compared However obtaining anoptimality certificate in many cases including the UDSP isvery time consuming In order to see the time differencesbetween finding an optimal solution for UDSP with the helpof heuristics and proving its optimality (as it is done bythe branch-and-bound method described in this paper) werun the two heuristic algorithms iterated tabu search andsimulated annealing (SA) of Brusco [19] on all the probleminstances used in the main experiment For two randominstances namely p21 and p25 SA was trapped in a localmaximumTherefore we have enhanced our implementationof SA by allowing it to run in a multi start mode Thisvariation of SA located an optimal solution for the above-mentioned instances in the third and respectively secondrestarts Both tested heuristics have proven to be very fastIn particular ITS took in total only 295 seconds to findthe global optima for all 30 instances we experimented withPerforming the same task by SA required 544 seconds Suchtimes are in huge contrast to what is seen in Tables 1ndash3 Basically the long computation times recorded in thesetables are the price we have to pay for obtaining not only asolution but also a guarantee of its optimality We did notperform additional computational experiments with ITS andSA because investigation of heuristics for theUDSP is outsidethe scope of this paper

Another issue concerns the use of heuristics for providinginitial solutions for the branch-and-bound technique Wehave employed the iterated tabu search procedure for thispurpose Our choice was influenced by the fact that wecould adopt in ITS the same formulas as those used inthe dominance test which is a key component of ourapproach Obviously there are other heuristics one canapply in place of ITS As it follows from the discussionin the preceding paragraph a good candidate for this isthe simulated annealing algorithm of Brusco [19] Supposethat BampB is run twice on the same instance from differentinitial solutions both of which are optimal Then it shouldbe clear that in both cases the same number of tree nodes

is explored This essentially means that any fast heuristiccapable of providing optimal solutions is perfect to be usedin the initialization step of BampB An alternative strategyis to use in this step randomly generated permutationsThe effects of this simple strategy are mixed For examplefor problem instances F25 and F27 the computation timeincreased from 1483 and respectively 11698 seconds (seeTable 2) to 4133 and respectively 20625 seconds Howeverfor example for p25 and M26 a somewhat different picturewas observed In the case of random initial solution BampBtook 4075 and 4687 seconds for these instances respectivelyThe computation times in Table 1 for p25 and Table 2 forM26are only marginally shorter One possible explanation of thislies in the fact that as Table 5 indicates the percentage ofpartial solutions fathomed due to bound test for both p25 andM26 is quite low However in general it is advantageous touse a heuristic for obtaining an initial solution for the branch-and-bound algorithm especially because this solution can beproduced at a negligible cost with respect to the rest of thecomputation

7 Conclusions

In this paper we have presented a branch-and-bound algo-rithm for the least-squares unidimensional scaling problemThe algorithm incorporates a LAP-based upper bound testas well as a dominance test which allows reducing theredundancy in the search process drastically The results ofcomputational experiments indicate that the algorithm canbe used to obtain provably optimal solutions for randomlygenerated dissimilarity matrices of size up to 30 times 30 andfor empirical dissimilarity matrices of size up to 35 times 35In particular for the first time the UDSP instance with the36 times 36 Morse code dissimilarity matrix has been solved toguaranteed optimality Another important instance is definedby the 45 times 45 food-item dissimilarity matrix 119863food It is agreat challenge to the optimization community to develop anexact algorithm that will solve this instance Our branch-and-bound algorithm is limited to submatrices of 119863food of size

Journal of Applied Mathematics 13

around 35 times 35 We also remark that optimal solutions forall UDSP instances used in our experiments can be found byapplying heuristic algorithms such as iterated tabu search andsimulated annealing proceduresThe total time taken by eachof these procedures to reach the global optima is only a fewseconds An advantage of the branch-and-bound algorithmis its ability to certify the optimality of the solution obtainedin a rigorous manner

Before closing there are two more points worth of men-tion First we proposed a simple yet effective mechanismfor making a decision about when it is useful to apply abound test to the current partial solution and when sucha test most probably will fail to discard this solution Forthat purpose a sample of partial solutions is evaluated in thepreparation step of BampBWe believe that similar mechanismsmay be operative in branch-and-bound algorithms for otherdifficult combinatorial optimization problems Second wedeveloped an efficient procedure for exploring the pairwiseinterchange neighborhood of a solution in the search spaceThis procedure can be useful on its own in the design ofheuristic algorithms for unidimensional scaling especiallythose based on the local search principle

Appendix

Derivation of (20)We will deduce an expression for 120575

119903119896 1 ⩽ 119896 lt 119903 ⩽ 119899

assuming that 119901 is a solution to the whole problem thatis 119901 isin Π Relocating the object 119901(119903) from position 119903 toposition 119896 gives the permutation 119901

1015840= (119901(1) 119901(119896 minus

1) 119901(119903) 119901(119896) 119901(119899)) By using the definition of 119865 we canwrite

120575119903119896

= 119865 (1199011015840) minus 119865 (119901)

= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119896

119889119901(119903)119901(119897)

)

2

minus(

119903minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

+

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

+2119889119901(119903)119901(119897)

)

2

minus

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

(A1)

Let us denote the first two terms in (A1) by 1205751015840 and the

remaining two terms by 12057510158401015840 = sum119903minus1

119897=11989612057510158401015840

119897 It is easy to see that 1205751015840

stems frommoving the object 119901(119903) to position 119896 and 12057510158401015840 stemsfrom shifting objects 119901(119897) 119897 = 119896 119903 minus 1 by one position to

the right To get a simpler expression for 1205751015840 we perform thefollowing manipulations

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

(

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2(

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A2)

Continuing we get

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A3)

Finally

1205751015840= minus 4

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

= minus 4(119891119901(119903)

minus

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

(119878119901(119903)

minus 119891119901(119903)

)

= 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

((

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

) minus 2119891119901(119903)

+ 119878119901(119903)

)

(A4)

14 Journal of Applied Mathematics

Similarly for 12057510158401015840119897we have

12057510158401015840

119897= (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

+ 4119889119901(119903)119901(119897)

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

+ 41198892

119901(119903)119901(119897)minus (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

= 4119889119901(119903)119901(119897)

(119891119901(119897)

minus 119878119901(119897)

+ 119891119901(119897)

) + 41198892

119901(119903)119901(119897)

= 4119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(A5)

By summing up (A4) and (A5) for 119897 = 119896 119903 minus 1 we obtain(20)

References

[1] I Borg and P J F Groenen Modern Multidimensional ScalingTheory and Appplications Springer New York NY USA 2ndedition 2005

[2] L Hubert P Arabie and J Meulman The Structural Repre-sentation of Proximity Matrices with MATLAB vol 19 SIAMPhiladelphia Pa USA 2006

[3] S Meisel and D Mattfeld ldquoSynergies of Operations Researchand Data Miningrdquo European Journal of Operational Researchvol 206 no 1 pp 1ndash10 2010

[4] G Caraux and S Pinloche ldquoPermutMatrix a graphical envi-ronment to arrange gene expression profiles in optimal linearorderrdquo Bioinformatics vol 21 no 7 pp 1280ndash1281 2005

[5] M K Ng E S Fung and H-Y Liao ldquoLinkage disequilibriummap by unidimensional nonnegative scalingrdquo in Proceedings ofthe 1st International Symposium on Optimization and SystemsBiology (OSB rsquo07) pp 302ndash308 Beijing China 2007

[6] P I Armstrong N A Fouad J Rounds and L HubertldquoQuantifying and interpreting group differences in interestprofilesrdquo Journal of Career Assessment vol 18 no 2 pp 115ndash1322010

[7] L Hubert and D Steinley ldquoAgreement among supreme courtjustices categorical versus continuous representationrdquo SIAMNews vol 38 no 7 2005

[8] A Dahabiah J Puentes and B Solaiman ldquoDigestive casebasemining based on possibility theory and linear unidimensionalscalingrdquo in Proceedings of the 8th WSEAS International Con-ference on Artificial Intelligence Knowledge Engineering andData Bases (AIKED rsquo09) pp 218ndash223 World Scientific andEngineering Academy and Society (WSEAS) Stevens PointWis USA 2009

[9] A Dahabiah J Puentes and B Solaiman ldquoPossibilistic patternrecognition in a digestive database for mining imperfect datardquoWSEASTransactions on Systems vol 8 no 2 pp 229ndash240 2009

[10] Y Ge Y Leung and J Ma ldquoUnidimensional scaling classifierand its application to remotely sensed datardquo inProceedings of theIEEE International Geoscience and Remote Sensing Symposium(IGARSS rsquo05) pp 3841ndash3844 July 2005

[11] J H Bihn M Verhaagh M Brandle and R Brandl ldquoDosecondary forests act as refuges for old growth forest animalsRecovery of ant diversity in the Atlantic forest of BrazilrdquoBiological Conservation vol 141 no 3 pp 733ndash743 2008

[12] D Defays ldquoA short note on a method of seriationrdquo BritishJournal of Mathematical and Statistical Psychology vol 31 no1 pp 49ndash53 1978

[13] M J Brusco and S Stahl ldquoOptimal least-squares unidimen-sional scaling improved branch-and-bound procedures andcomparison to dynamic programmingrdquo Psychometrika vol 70no 2 pp 253ndash270 2005

[14] L J Hubert and P Arabie ldquoUnidimensional scaling and com-binatorial optimizationrdquo in Multidimensional Data Analysis Jde Leeuw W J Heiser J Meulman and F Critchley Eds pp181ndash196 DSWO Press Leiden The Netherlands 1986

[15] L J Hubert P Arabie and J J Meulman ldquoLinear unidimen-sional scaling in the 119871

2-norm basic optimization methods

usingMATLABrdquo Journal of Classification vol 19 no 2 pp 303ndash328 2002

[16] M J Brusco and S Stahl Branch-and-Bound Applicationsin Combinatorial Data Analysis Springer Science + BusinessMedia New York NY USA 2005

[17] V Pliner ldquoMetric unidimensional scaling and global optimiza-tionrdquo Journal of Classification vol 13 no 1 pp 3ndash18 1996

[18] A Murillo J F Vera and W J Heiser ldquoA permutation-translation simulated annealing algorithm for 119871

1and 119871

2uni-

dimensional scalingrdquo Journal of Classification vol 22 no 1 pp119ndash138 2005

[19] M J Brusco ldquoOn the performance of simulated annealing forlarge-scale 119871

2unidimensional scalingrdquo Journal of Classification

vol 23 no 2 pp 255ndash268 2006[20] M J Brusco H F Kohn and S Stahl ldquoHeuristic imple-

mentation of dynamic programming for matrix permutationproblems in combinatorial data analysisrdquo Psychometrika vol73 no 3 pp 503ndash522 2008

[21] G Palubeckis ldquoIterated tabu search for the maximum diversityproblemrdquo Applied Mathematics and Computation vol 189 no1 pp 371ndash383 2007

[22] G Palubeckis ldquoA new bounding procedure and an improvedexact algorithm for the Max-2-SAT problemrdquo Applied Mathe-matics and Computation vol 215 no 3 pp 1106ndash1117 2009

[23] J Kluabwang D Puangdownreong and S Sujitjorn ldquoMultipathadaptive tabu search for a vehicle control problemrdquo Journal ofApplied Mathematics vol 2012 Article ID 731623 20 pages2012

[24] N Sarasiri K Suthamno and S Sujitjorn ldquoBacterial foraging-tabu search metaheuristics for identification of nonlinear fric-tion modelrdquo Journal of Applied Mathematics vol 2012 ArticleID 238563 23 pages 2012

[25] A Jouglet and J Carlier ldquoDominance rules in combinato-rial optimization problemsrdquo European Journal of OperationalResearch vol 212 no 3 pp 433ndash444 2011

[26] E Z Rothkopf ldquoA measure of stimulus similarity and errors insome paired-associate learning tasksrdquo Journal of ExperimentalPsychology vol 53 no 2 pp 94ndash101 1957

[27] I Duntsch and G Gediga ldquoModal-Style Operators in Qual-itative Data Analysisrdquo Tech Rep CS-02-15 Department ofComputer Science Brock University St Catharines OntarioCanada 2002

[28] R N Shepard ldquoAnalysis of proximities as a technique for thestudy of information processing in manrdquoHuman factors vol 5pp 33ndash48 1963

Journal of Applied Mathematics 15

[29] P J F Groenen and W J Heiser ldquoThe tunneling method forglobal optimization in multidimensional scalingrdquo Psychome-trika vol 61 no 3 pp 529ndash550 1996

[30] L Hubert P Arabie and JMeulmanCombinatorial Data Anal-ysis Optimization by Dynamic Programming SIAM Philadel-phia Pa USA 2001

[31] B H Ross and G L Murphy ldquoFood for thought cross-classification and category organization in a complex real-worlddomainrdquo Cognitive Psychology vol 38 no 4 pp 495ndash553 1999

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 6: Research Article An Improved Exact Algorithm for Least-Squares Unidimensional …downloads.hindawi.com/journals/jam/2013/890589.pdf · 2019-07-31 · by a 36×36 Morse code dissimilarity

6 Journal of Applied Mathematics

(14) if 119904 = 119896 or 119904 = 119898 then first set 119886119894119904

= 0

and afterwards compute 119886119894119902by (15) for 119902 = 119904 +

1 119899 and by (16) for 119902 = 119904 minus 1 119904 minus 2 1(15) if 119896 ⩽ 119904 ⩽ 119898 then first set 119887

119894119904= 0 and afterwards

compute 119887119894119902by (17) for 119902 = 119904+1 119899 and by (18)

for 119902 = 119904 minus 1 119904 minus 2 1

(2) for each pair of objects 119894 119895 isin 1 119899 such that 119904 lt 119905

do the following

(21) if 119904 lt 119896 lt 119905 lt 119898 then set 119888119894119895= 119888119894119895+1198891198941199011015840(119896)1198891198951199011015840(119896)

minus

1198891198941199011015840(119898)

1198891198951199011015840(119898)

(22) if 119896 lt 119904 lt 119898 lt 119905 then set 119888

119894119895= 119888119894119895+

1198891198941199011015840(119898)

1198891198951199011015840(119898)

minus 1198891198941199011015840(119896)1198891198951199011015840(119896)

(23) if |119904 119905 cap 119896119898| = 1 then compute 119888119894119895anew

using (12)(24) if 119888

119894119895has been updated in (21)ndash(23) then set

119888119895119894= 119888119894119895

It can be noted that for some pairs of objects 119894 and 119895

the entry 119888119894119895remains unchanged There are five such cases

(1) 119904 119905 lt 119896 (2) 119896 lt 119904 119905 lt 119898 (3) 119904 119905 gt 119898 (4) 119904 lt 119896 lt 119898 lt 119905(5) 119904 119905 = 119896119898

We now evaluate the time complexity of the describedprocedure Each of Steps (11) to (15) requires 119874(119899) timeThus the complexity of Step (1) is 119874(119899

2) The number of

operations in Steps (21) (22) and (24) is 119874(1) and that inStep (23) is 119874(119899) However the latter is executed only 119874(119899)

timesTherefore the overall complexity of Step (2) and henceof the entire procedure is 119874(119899

2)

Formula (13) and the previous described procedure forupdating matrices 119860 119861 and 119862 lie at the core of ourimplementation of the TS method The matrices 119860 119861 and 119862

can be initialized by applying the formulas (15)ndash(18) and (12)with respect to the starting permutationThe time complexityof initialization is 119874(119899

2) for 119860 and 119861 and 119874(119899

3) for 119862 The

auxiliary matrices 119860 119861 and 119862 and the expression (13) forΔ(119901 119896119898) are also used in the dominance test presented inthe next section This test plays a major role in pruning thesearch tree during the branching and bounding process

4 Dominance Test

As is well known the use of dominance tests in enumerativemethods allows reducing the search space of the optimizationproblem of interest A comprehensive review of dominancetests (or rules) in combinatorial optimization can be found inthe paper by Jouglet and Carlier [25] For the problem we aredealing with in this study Brusco and Stahl [13] proposed adominance test for discarding partial solutions which relieson performing pairwise interchanges of objects To formulateit let us assume that 119901 = (119901(1) 119901(119903)) 119903 lt 119899 is a partialsolution to the UDSP The test checks whether Δ(119901 119896 119903) gt 0

for at least one of the positions 119896 isin 1 119903 minus 1 If thiscondition is satisfied then we say that the test fails In thiscase 119901 is thrown away Otherwise 119901 needs to be extendedby placing an unassigned object in position 119903 + 1 The firstcondition in our dominance test is the same as in the test

suggested by Brusco and Stahl To compute Δ(119901 119896 119903) weuse subsets of entries of the auxiliary matrices 119860 119861 and119862 defined in the previous section Since only the sign ofΔ(119901 119896 119903) matters the factor of 4 can be ignored in (13) andthus Δ

1015840(119901 119896 119903) = Δ(119901 119896 119903)4 instead of Δ(119901 119896 119903) can be

considered From (13) we obtain

Δ1015840(119901 119896 119903) = 119886

119901(119896)119903(119886119901(119896)119903

+ 2119886119901(119896)1

minus 119878119901(119896)

)

+ 119886119901(119903)119896

(119886119901(119903)119896

minus 2119886119901(119903)1

+ 119878119901(119903)

)

+ 119887119901(119903)119896+1

minus 119887119901(119896)119903minus1

minus 2119888119901(119896)119901(119903)

(19)

We also take advantage of the case where Δ1015840(119901 119896 119903) =

0 We use the lexicographic rule for breaking ties The testoutputs a ldquoFailrdquo verdict if the equality Δ1015840(119901 119896 119903) = 0 is foundto hold true for some 119896 isin 1 119903 minus 1 such that 119901(119896) gt 119901(119903)Indeed 119901 = (119901(1) 119901(119896 minus 1) 119901(119896) 119901(119896 + 1) 119901(119903))

can be discarded because both 119901 and partial solution 1199011015840=

(119901(1) 119901(119896 minus 1) 119901(119903) 119901(119896 + 1) 119901(119896)) are equally goodbut 119901 is lexicographically larger than 119901

1015840We note that the entries of the matrices 119860 119861 and 119862 used

in (19) can be easily obtained from a subset of entries of119860 119861and 119862 computed earlier in the search process Consider forexample 119888

119901(119896)119901(119903)in (19) It can be assumed that 119896 isin 1 119903minus

2 because if 119896 = 119903 minus 1 then 119888119901(119896)119901(119903)

= 0 For simplicity let119901(119903) = 119894 and 119901(119903 minus 1) = 119895 To avoid ambiguity let us denoteby 1198621015840 the matrix 119862 that is obtained during processing of the

partial solution (119901(1) 119901(119903 minus 2)) The entry 1198881015840

119901(119896)119894of 1198621015840 is

calculated by (temporarily) placing the object 119894 in position119903 minus 1 Now referring to the definition of 119862 it is easy to seethat 119888

119901(119896)119894= 1198881015840

119901(119896)119894+ 119889119901(119896)119895

119889119894119895 Similarly using formulas of

types (15) and (17) we can obtain the required entries of thematrices 119860 and 119861 Thus (19) can be computed in constanttime The complexity of computing Δ

1015840(119901 119896 119903) for all 119896 is

119874(119899) The dominance test used in [13] relies on (11) and hascomplexity 119874(119899

2) So our implementation of the dominance

test is significantly more time-efficient than that proposed byBrusco and Stahl

In order to make the test even more efficient we inaddition decided to evaluate those partial solutions thatcan be obtained from 119901 by relocating the object 119901(119903) fromposition 119903 to position 119896 isin 1 119903 minus 2 (here position119896 = 119903 minus 1 is excluded since in this case relocating wouldbecome a pairwise interchange of objects) Thus the testcompares 119901 against partial solutions 119901

119896= (119901(1) 119901(119896 minus

1) 119901(119903) 119901(119896) 119901(119903 minus 1)) 119896 = 1 119903 minus 2 Now supposethat 119901 and 119901

119896are extended to 119899-permutations 119901 isin Π and

respectively 119901119896isin Π such that for each 119897 isin 119903 + 1 119899 the

object in position 119897 of 119901 coincides with that in position 119897 of 119901119896

Let us define 120575119903119896to be the difference between 119865(119901

119896) and 119865(119901)

This difference can be computed as follows

120575119903119896

= (4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)((

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

) minus 2119891119901(119903)

+ 119878119901(119903)

)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(20)

Journal of Applied Mathematics 7

The previous expression is similar to the formula (6) in [19]For completeness we give its derivation in the Appendix Wecan rewrite (20) using the matrices 119860 and 119861

Proposition 5 For 119901 isin Π 119896 isin 1 119899 minus 1 and 119903 isin 119896 +

1 119899

120575119903119896

= 4119886119901(119903)119896

(119886119901(119903)119896

minus 2119886119901(119903)1

+ 119878119901(119903)

) + 4119887119901(119903)119896

(21)

The partial solution 119901 is dominated by 119901119896if 120575119903119896

gt 0Upon establishing this fact for some 119896 isin 1 119903 minus 2 thetest immediately stops with the rejection of 119901 Of course likein the case of (13) a factor of 4 in (21) can be dropped Thecomputational overhead of including the condition 120575

119903119896gt 0

in the test is minimal because all the quantities appearing onthe right-hand side of (21) except 119887

119901(119903)1 are also used in the

calculation ofΔ1015840(119901 119896 119903) (see (19)) Since the loop over 119896needsto be executed the time complexity of this part of the test is119874(119899) Certainly evaluation of relocating the last added objectin 119901 gives more power to the test Thus our dominance testis an improvement over the test described in the Brusco andStahl [13] paper

5 A Branch-and-Bound Algorithm

Our approach to the UDSP is to start with a good initial solu-tion provided by the iterated tabu search heuristic invokedin the initialization phase of the algorithm and then proceedwith the main phase where using the branch-and-boundtechnique a search tree is built For the purpose of pruningbranches of the search tree the algorithm involves three tests(1) the upper bound test (based on 119880

1015840) (2) the dominancetest (3) the symmetry testThe first two of them have alreadybeen described in the prior sections The last one is the mostsimple and cheap to apply It hinges on the fact that the valuesof the objective function 119865 at a permutation 119901 isin Π and itsreverse 119901 defined by 119901(119894) = 119901(119899 minus 119894 + 1) 119894 = 1 119899 arethe same Therefore it is very natural to demand that somefixed object in 119882 say 119908 being among those objects that areassigned to the first half of the positions in 119901 More formallythe condition is 119894 ⩽ lceil1198992rceil where 119894 is such that 119901(119894) = 119908If this condition is violated for 119901 = (119901(1) 119901(lceil1198992rceil)) then the symmetry test fails and 119901 is discarded Because ofthe requirement to fulfill the previous condition some careshould be taken in applying the dominance test In the case ofΔ1015840(119901 119896 119903) = 0 and 119901(119896) gt 119901(119903) (see the paragraph following

(19)) this test in our implementation of the algorithm gives aldquoFailrdquo verdict only if either 119903 ⩽ lceil1198992rceil or 119901(119896) = 119908

Through the preliminary computational experimentswe were convinced that the upper bound test was quiteweak when applied to smaller partial solutions In otherwords the time overhead for computing linear assignment-based bounds did not pay off for partial permutations 119901 =

(119901(1) 119901(119903)) of length 119903 that is small or modest comparedto 119899 Therefore it is reasonable to compute the upper bound1198801015840 (described in Section 2) only when 119903 ⩾ 120582(119863) where

120582(119863) is a threshold value depending on the UDSP instancematrix 119863 and thus also on 119899 The problem we face is howto determine an appropriate value for 120582(119863) We combat this

problem with a simple procedure consisting of two steps Inthe first step it randomly generates a relatively small numberof partial permutations 119901

1 119901

120591 each of length lfloor1198992rfloor For

each permutation in this sample the procedure computes theupper bound 119880

1015840 Let 11988010158401 119880

1015840

120591be the values obtained We

are interested in the normalized differences between thesevalues and 119865(119901heur) where 119901heur is the solution delivered bythe ITS algorithm So the procedure calculates the sum 119885 =

sum120591

119894=1100(119880

1015840

119894minus 119865(119901heur))119865(119901heur) and the average 119877 = 119885120591 of

such differences In the second step using119877 it determines thethreshold value This is done in the following way if 119877 lt 119877

1

then 120582(119863) = lfloor1198993rfloor if 1198771⩽ 119877 lt 119877

2 then 120582(119863) = lfloor1198992rfloor if 119877 gt

1198772 then 120582(119863) = lfloor31198994rfloor The described procedure is executed

at the initialization step of the algorithm Its parameters are1198771 1198772 and the sample size 120591

Armed with tools for pruning the search space we canconstruct a branch-and-bound algorithm for solving theUDSP In order to make the description of the algorithmmore readable we divide it into two parts namely themain procedure BampB (branch-and-bound) and an auxiliaryprocedure SelectObject The former builds a search tree withnodes representing partial solutionsThe task of SelectObjectis either to choose an object which can be used to extendthe current permutation or if the current node becomesfathomed to perform a backtracking step In the descriptionof the algorithm given in the following the set of objects thatcan be used to create new branches is denoted by119881Themainprocedure goes as follows

BampB

(1) Apply the ITS algorithm to get an initial solution 119901lowast

to a given instance of the UDSP

Set 119865lowast = 119865(119901lowast)

(2) Calculate 120582(119863)(3) Initialize the search tree by creating the root node

with the empty set 119881 attached to it

Set 119871 = 0 (119871 denotes the tree level the currentlyconsidered node belongs to)

(4) Set 119903 = 119871

(5) Call SelectObject(119903) It returns 119871 and if 119871 = 119903additionally some unassigned object (let it be denotedby V) If 119871 lt 0 then go to (7) Otherwise checkwhether 119871 lt 119903 If so then return to (4) (perform abacktracking step) If not (in which case 119871 = 119903) thenproceed to (6)

(6) Create a new node with empty 119881 Set 119901(119903 + 1) = VIncrement 119871 by 1 and go to (4)

(7) Calculate the coordinates for the objects by applyingformula (4) with respect to the permutation 119901

lowast andstop

Steps (1) and (2) of BampB comprise the initializationphase of the algorithm The initial solution to the problemis produced using ITS This solution is saved as the bestsolution found so far denoted by 119901lowast If later a better solutionis found 119901lowast is updated This is done within the SelectObject

8 Journal of Applied Mathematics

procedure In the final step BampB uses 119901lowast to calculate the

optimal coordinates for the objects The branch-and-boundphase starts at Step (3) of BampB Each node of the search treeis assigned a set 119881 The partial solution corresponding to thecurrent node is denoted by 119901 The objects of 119881 are used toextend 119901 Upon creation of a node (Step (3) for root andStep (6) for other nodes) the set 119881 is empty In SelectObject119881 is filled with objects that are promising candidates toperform the branching operation on the corresponding nodeSelectObject also decides which action has to be taken nextIf 119871 = 119903 then the action is ldquobranchrdquo whereas if 119871 lt 119903 theaction is ldquobacktrackrdquo Step (4) helps to discriminate betweenthese two cases If 119871 = 119903 then a new node in the nextlevel of the search tree is generated At that point the partialpermutation 119901 is extended by assigning the selected object Vto the position 119903 + 1 in119901 (see Step (6)) If however119871 lt 119903 and 119903is positive then backtracking is performed by repeating Steps(4) and (5) We thus see that the search process is controlledby the procedure SelectObject This procedure can be statedas follows

SelectObject(119903)

(1) Check whether the set119881 attached to the current nodeis empty If so then go to (2) If not then consider itssubset 1198811015840 = 119894 isin 119881 | 119894 is unmarked and 119906

119894(119901) gt

119865lowast If 1198811015840 is nonempty then mark an arbitrary object

V isin 1198811015840 and return with it as well as with 119871 = 119903

Otherwise go to (6)(2) If 119903 gt 0 then go to (3) Otherwise set 119881 = 1 119899

119906119894(119901) = infin 119894 = 1 119899 and go to (4)

(3) If 119903 = 119899 minus 1 then go to (5) Otherwise for each 119894 isin

119882(119901) perform the following operations Temporarilyset 119901(119903 + 1) = 119894 Apply the symmetry test thedominance test and if 119903 ⩾ 120582(119863) also the bound testto the resulting partial solution 119901

1015840= (119901(1) 119901(119903 +

1)) If it passes all the tests then do the followingInclude 119894 into 119881 If 119903 ⩾ 120582(119863) assign to 119906

119894(119901) the value

of the upper bound 1198801015840 computed for the subproblem

defined by the partial solution 1199011015840 If however 119903 lt

120582(119863) then set 119906119894(119901) = infin If after processing all

objects in 119882(119901) the set 119881 remains empty then go to(6) Otherwise proceed to (4)

(4) Mark an arbitrary object V isin 119881 and return with it aswell as with 119871 = 119903

(5) Let 119882(119901) = 119895 Calculate the value of the solution119901 = (119901(1) 119901(119899 minus 1) 119901(119899) = 119895) If 119865(119901) gt 119865

lowast thenset 119901lowast = 119901 and 119865

lowast= 119865(119901)

(6) If 119903 gt 0 then drop the 119903th element of 119901 Return with119871 = 119903 minus 1

In the previous description 119882(119901) denotes the set ofunassigned objects as before The objects included in 119881 sube

119882(119901) are considered as candidates to be assigned to theleftmost available position in 119901 At the root node 119881 = 119882(119901)In all other cases except when |119882(119901)| = 1 an object becomesa member of the set 119881 only if it passes two or for longer 119901three tests The third of them the upper bound test amounts

to check if 1198801015840 gt 119865lowast This condition is checked again on

the second and subsequent calls to SelectObject for the samenode of the search tree This is done in Step (1) for eachunmarked object 119894 isin 119881 Such a strategy is reasonable becauseduring the branch-and-bound process the value of 119865lowast mayincrease and the previously satisfied constraint1198801015840 gt 119865

lowast maybecome violated for some 119894 isin 119881 To be able to identify suchobjects the algorithm stores the computed upper bound 119880

1015840

as 119906119894(119901) at Step (3) of SelectObjectA closer look at the procedure SelectObject shows that

two scenarios are possible (1)when this procedure is invokedfollowing the creation of a newnode of the tree (which is doneeither in Step (3) or in Step (6) of BampB) (2)when a call to theprocedure ismade after a backtracking stepThe set119881 initiallyis empty and therefore Steps (2) to (5) of SelectObject becomeactive only in the first of these two scenarios Dependingon the current level of the tree 119903 one of the three cases isconsidered If the current node is the root that is 119903 = 0 thenStep (2) is executed Since in this case the dominance test isnot applicable the algorithm uses the set119881 comprising all theobjects under consideration An arbitrary object selected inStep (4) is then used to create the first branch emanating fromthe root of the tree If 119903 isin [1 119899 minus 2] then the algorithmproceeds to Step (3) The task of this step is to evaluate allpartial solutions that can be obtained from the current partialpermutation119901by assigning an object from119882(119901) to the (119903 + 1)

th position in 119901 In order not to lose at least one optimalsolution it is sufficient to keep only those extensions of 119901 forwhich all the tests were successful Such partial solutions arememorized by the set 119881 One of them is chosen in Step (4)

to be used (in Step (6) of BampB) to generate the first son ofthe current node If however no promising partial solutionhas been found and thus the set 119881 is empty then the currentnode can be discarded and the search continued from itsparent node Finally if 119903 = 119899 minus 1 then Step (5) is executedin which the permutation 119901 is completed by placing theremaining unassigned object in the last position of119901 Its valueis compared with that of the best solution found so far Thenode corresponding to 119901 is a leaf and therefore is fathomedIn the second of the above-mentioned two scenarios (Step(1) of the procedure) the already existing nonempty set 119881is scanned The previously selected objects in 119881 have beenmarked They are ignored For each remaining object 119894 it ischecked whether 119906

119894(119901) gt 119865

lowast If so then 119894 is included inthe set 1198811015840 If the resulting set 1198811015840 is empty then the currentnode of the tree becomes fathomed and the procedure goesto Step (6) Otherwise the procedure chooses a branchingobject and passes it to its caller BampB Our implementationof the algorithm adopts a very simple enumeration strategybranch upon an arbitrary object in 119881

1015840 (or 119881 when Step (4) isperformed)

It is useful to note that the tests in Step (3) are applied inincreasing order of time complexityThe symmetry test is thecheapest of them whereas the upper bound test is the mostexpensive to perform but still effective

6 Computational Results

The main goal of our numerical experiments was to pro-vide some measure of the performance of the developed

Journal of Applied Mathematics 9

BampB algorithm as well as to demonstrate the benefits ofusing the new upper bound and efficient dominance testthroughout the solution process For comparison purposeswe also implemented the better of the two branch-and-bound algorithms of Brusco and Stahl [13] In their studythis algorithm is denoted as BB3 When referring to it inthis paper we will use the same name In order to make thecomparison more fair we decided to start BB3 by invokingthe ITS heuristic Thus the initial solution from which thebranch-and-bound phase of each of the algorithms starts isexpected to be of the same quality

61 Experimental Setup Both our algorithm and BB3 havebeen coded in the C programming language and all the testshave been carried out on a PC with an Intel Core 2 DuoCPU running at 30GHz As a testbed for the algorithmsthree empirical dissimilarity matrices from the literature inaddition to a number of randomly generated ones wereconsidered The random matrices vary in size from 20 to30 objects All off-diagonal entries are drawn uniformly atrandom from the interval [1 100] The matrices of courseare symmetric

We also used three dissimilarity matrices constructedfrom empirical data found in the literature The first matrixis based onMorse code confusion data collected by Rothkopf[26] In his experiment the subjects were presented pairsof Morse code signals and asked to respond whether theythought the signals were the same or differentThese answerswere translated to the entries of the similarity matrix withrows and columns corresponding to the 36 alphanumericcharacters (26 letters in the Latin alphabet and 10 digits in thedecimal number system) We should mention that we foundsmall discrepancies between Morse code similarity matricesgiven in various publications and electronic media We havedecided to take the matrix from [27] where it is presentedwith a reference to Shepard [28] Let thismatrix be denoted by119872 = (119898

119894119895) Its entries are integers from the interval [0 100]

Like in the study of Brusco and Stahl [13] the correspondingsymmetric dissimilarity matrix 119863 = (119889

119894119895) was obtained by

using the formula 119889119894119895= 200 minus 119898

119894119895minus 119898119895119894for all off-diagonal

entries and setting the diagonal of 119863 to zero We consideredtwoUDSP instances constructed usingMorse code confusiondatamdashone defined by the 26 times 26 submatrix of119863 with rowsand columns labeled by the letters and another defined by thefull matrix119863

The second empirical matrix was taken from Groenenand Heiser [29] We denote it by 119866 = (119892

119894119895) Its entry 119892

119894119895

gives the number of citations in journal 119894 to journal 119895 Thusthe rows of 119866 correspond to citing journals and the columnscorrespond to cited journals The dissimilarity matrix 119863 wasderived from 119866 in two steps like in [13] First a symmetricsimilarity matrix 119867 = (ℎ

119894119895) was calculated by using the

formulas ℎ119894119895

= ℎ119895119894

= (119892119894119895sum119899

119897=1l = 119894 119892119894119897) + (119892119895119894sum119899

119897=1119897 = 119895119892119895119897)

119894 119895 = 1 119899 119894 lt 119895 and ℎ119894119894

= 0 119894 = 1 119899 Then 119863

was constructed with entries 119889119894119895

= lfloor100(max119897119896ℎ119897119896

minus ℎ119894119895)rfloor

119894 119895 = 1 119899 119894 = 119895 and 119889119894119894= 0 119894 = 1 119899

The third empirical matrix used in our experimentswas taken from Hubert et al [30] Its entries represent the

dissimilarities between pairs of food items As mentioned byHubert et al [30] this 45 times 45 food-item matrix 119863foodwas constructed based on data provided by Ross andMurphy[31] For our computational tests we considered submatricesof 119863food defined by the first 119899 isin 20 21 35 food itemsAll the dissimilarity matrices we just described as well asthe source code of our algorithm are publicly available athttpwwwproinktultsimgintarasudshtml

Based on the results of preliminary experiments with theBampB algorithm we have fixed the values of its parameters120591 = 10 119877

1= 0 and 119877

2= 5 In the symmetry test119908 was fixed

at 1 The cutoff time for ITS was 1 second

62 Results The first experiment was conducted on a seriesof randomly generated full density dissimilarity matricesThe results of BampB and BB3 for them are summarized inTable 1 The dimension of the matrix 119863 is encoded in theinstance name displayed in the first column The secondcolumn of the table shows for each instance the optimalvalue of the objective function 119865

lowast and in parentheses tworelated metrics The first of them is the raw value of thestress function Φ(119909) The second number in parentheses isthe normalized value of the stress function that is calculatedasΦ(119909)sum

119894lt119895119889119894119895 The next two columns list the results of our

algorithm the number of nodes in the search tree and theCPU time reported in the form hours minutes seconds orminutes seconds or seconds The last two columns give thesemeasures for BB3

As it can be seen from the table BampB is superior to BB3We observe for example that the decrease in CPU time overBB3 is around 40 for the largest three instances Also ouralgorithm builds smaller search trees than BB3

In our second experiment we opt for three empiricaldissimilarity matrices mentioned in Section 61 Table 2 sum-marizes the results of BampB and BB3 on the UDSP instancesdefined by these matrices Its structure is the same as that ofTable 1 In the instance names the letter ldquoMrdquo stands for theMorse codematrix ldquoJrdquo stands for the journal citationsmatrixand ldquoFrdquo stands for the food item data As before the numberfollowing the letter indicates the dimension of the problemmatrix Comparing results in Tables 1 and 2 we find thatfor empirical data the superiority of our algorithm over BB3is more pronounced Basically BampB performs significantlybetter than BB3 for each of the three data sets For examplefor the food-item dissimilarity matrix of size 30 times 30 BampBbuilds almost 10 times smaller search tree and is about 5 timesfaster than BB3

Table 3 presents the results of the performance of BampBon the full Morse code dissimilarity matrix as well as onlarger submatrices of the food-item dissimilarity matrixRunning BB3 on these UDSP instances is too expensive sothis algorithm is not included in the table By analyzing theresults in Tables 2 and 3 we find that for 119894 isin 22 35 theCPU time taken by BampB to solve the problem with the 119894 times 119894

food-item submatrix is between 196 (for 119894 = 30) and 282 (for119894 = 27) percent of that needed to solve the problem with the(119894 minus1)times (119894minus1) food-item submatrix For the submatrix of size35 times 35 the CPU time taken by the algorithm was more than

10 Journal of Applied Mathematics

Table 1 Performance on random problem instances

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

p20 9053452 (1797554 0284) 1264871 44 1526725 56p21 9887394 (1879277 0285) 3304678 106 4169099 145p22 12548324 (2245406 0282) 6210311 205 8017112 290p23 14632006 (2396198 0274) 12979353 450 16624663 1067p24 16211210 (2815859 0294) 31820463 1533 42435561 2555p25 15877026 (3126700 0330) 103208158 6446 137525505 9524p26 20837730 (3471478 0302) 200528002 13354 267603807 20436p27 21464870 (3585958 0311) 287794935 20377 392405143 32365p28 23938264 (3961360 0317) 581602877 44093 812649051 113054p29 28613820 (4537786 0315) 2047228373 250572 3044704554 438570p30 30359134 (4645569 0315) 3555327537 509099 5216804247 855046

Table 2 Performance on empirical dissimilarity matrices

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

M26 180577772 (19448891 0219) 108400003 7432 201264785 16143J28 60653172 (7167033 0249) 2019598593 348006 6780517853 1039066F20 19344116 (1051432 0098) 236990 27 1971840 70F21 22694178 (1229979 0102) 568998 52 4716777 162F22 26496098 (1495949 0110) 1178415 106 9993241 357F23 30853716 (1757681 0116) 2524703 252 24357098 1378F24 35075372 (2174712 0130) 6727692 1001 51593000 3304F25 40091544 (2491002 0134) 14560522 2283 139580891 10014F26 45507780 (2886048 0142) 40465339 6553 298597411 23188F27 51859210 (3336825 0148) 116161004 19298 670492972 55578F28 58712130 (3788642 0153) 253278124 44358 1366152797 203265F29 66198704 (4203981 0156) 530219761 142265 3616257778 532084F30 74310052 (4629353 0157) 972413816 320566 9360343618 1556412

14 days In fact a solution of value 120526370was found by theITS heuristic in less than 1 sThe rest of the time was spent onproving its optimalityWe also see fromTable 3 that theUDSPwith the full Morse code dissimilarity matrix required nearly21 days to solve to proven optimality using our approach

In Table 4 we display optimal solutions for the twolargest UDSP instances we have tried The second and fourthcolumns of the table contain the optimal permutations forthese instances The coordinates for objects calculated using(4) are given in the third and fifth columns The number inparenthesis in the fourth column shows the order in whichthe food items appear within Table 51 in the book by Hubertet al [30] From Table 4 we observe that only a half of thesignals representing digits in theMorse code are placed at theend of the scale Other such signals are positioned betweensignals encoding letters We also see that shorter signals tendto occupy positions at the beginning of the permutation 119901

In Table 5 we evaluate the effectiveness of tests used tofathom partial solutions Computational results are reportedfor a selection of UDSP instances The second third andfourth columns of the table present 100119873sym119873 100119873dom119873and 100119873UB119873 where119873sym (respectively119873dom and119873UB) isthe number of partial permutations that are fathomed due

to symmetry test (respectively due to dominance test anddue to upper bound test) in Step (3) of SelectObject and119873 = 119873sym+119873dom+119873UBThe last three columns give the sameset of statistics for the BB3 algorithm Inspection of Table 5reveals that119873dom is much larger than both119873sym and119873UB Itcan also be seen that119873dom values are consistently higher withBB3 thanwhen using our approach Comparing the other twotests we observe that 119873sym is larger than 119873UB in all casesexcept when applying BampB to the food-item dissimilaritysubmatrices of size 20 times 20 and 25 times 25 Generally we noticethat the upper bound test is most successful for the UDSPinstances defined by the food-item matrix

We end this section with a discussion of two issuesregarding the use of heuristic algorithms for the UDSP Froma practical point of view heuristics of course are preferableto exact methods as they are fast and can be applied to largeproblem instances If carefully designed they are able toproduce solutions of very high quality The development ofexact methods on the other hand is a theoretical avenue forcoping with different problems In the field of optimizationexact algorithms for a problem of interest provide not only asolution but also a certificate of its optimality Therefore oneof possible applications of exact methods is to use them in

Journal of Applied Mathematics 11

Table 3 Performance of BampB on larger empirical dissimilarity matrices

Instance Objective function value Number of nodes TimeF31 82221440 (5194455 0164) 2469803824 857038F32 91389180 (5703241 0166) 5040451368 1944127F33 100367800 (6412395 0174) 14318454604 5524200F34 110065196 (7104054 0180) 33035826208 12936500F35 120526370 (7841564 0185) 82986984742 34256456M36 489111494 (40682956 0230) 243227368848 49451175

Table 4 Optimal solutions for the full Morse code dissimilarity matrix and the 35 times 35 food-item dissimilarity matrix

119894

Morse code full Food item data119901(119894) 119909

119901(119894)119901(119894) 119909

119901(119894)

1 (E) ∙ 000000 orange (3) 0000002 (T) minus 583333 watermelon (2) 0285713 (I) ∙∙ 2477778 apple (1) 0285714 (A) ∙minus 3408333 banana (4) 1285715 (N) minus∙ 4411111 pineapple (5) 1628576 (M) minusminus 5233333 lettuce (6) 21257147 (S) ∙ ∙ ∙ 7030556 broccoli (7) 22314298 (U) ∙ ∙ minus 8313889 carrots (8) 22542869 (R) ∙ minus ∙ 9347222 corn (9) 228857110 (W) ∙ minus minus 10297222 onions (10) 251142911 (H) ∙ ∙ ∙∙ 11444444 potato (11) 299714312 (D) minus ∙ ∙ 12430556 rice (12) 516285713 (K) minus ∙ minus 13152778 spaghetti (19) 589714314 (V) ∙ ∙ ∙minus 14500000 bread (13) 644857115 (5) ∙ ∙ ∙ ∙ ∙ 15244444 bagel (14) 698571416 (4) ∙ ∙ ∙ ∙ minus 15991667 cereal (16) 721142917 (F) ∙ ∙ minus∙ 17075000 oatmeal (15) 725142918 (L) ∙ minus ∙∙ 18225000 pancake (18) 763714319 (B) minus ∙ ∙∙ 18952778 muffin (17) 778000020 (X) minus ∙ ∙minus 19955556 crackers (20) 889714321 (6) minus ∙ ∙ ∙ ∙ 20566667 granola bar (21) 930000022 (3) ∙ ∙ ∙ minus minus 22013889 pretzels (22) 1007428623 (C) minus ∙ minus∙ 22947222 nuts (24) 1043714324 (Y) minus ∙ minusminus 23841667 popcorn (23) 1076285725 (7) minus minus ∙ ∙ ∙ 24930556 potato chips (25) 1124857126 (Z) minus minus ∙∙ 25488889 doughnuts (26) 1200857127 (Q) minus minus ∙minus 26438889 pizza (31) 1265142928 (P) ∙ minus minus∙ 27083333 cookies (27) 1368857129 (J) ∙ minus minusminus 28294444 chocolate bar (29) 1394000030 (G) minus minus ∙ 29247222 cake (28) 1410571431 (O) minus minus minus 30022222 pie (30) 1435428632 (2) ∙ ∙ minus minus minus 31050000 ice cream (32) 1520000033 (8) minus minus minus ∙ ∙ 32036111 yogurt (33) 1571142934 (1) ∙ minus minus minus minus 33350000 butter (34) 1618000035 (9) minus minus minus minus ∙ 34186111 cheese (35) 1650857136 (0) minus minus minus minus minus 35027778

12 Journal of Applied Mathematics

Table 5 Relative performance of the tests

Instance B amp B BB3Sym Dom UB Sym Dom UB

p20 1158a 8824b 018c 1111a 8873b 016c

p25 886a 9087b 027c 848a 9143b 009c

p30 701a 9284 b 015c 629a 9364b 007c

M26 1140a 8847 b 013c 1044a 8952b 004c

J28 855a 8923 b 222c 568 a 9426b 006c

F20 801a 7563 b 1636c 1325a 8542b 133c

F25 704a 8127 b 1169c 1031a 8939b 030c

F30 895a 8391b 714c 774a 9216b 010c

F35 853a 8723b 424c mdasha mdashb mdashc

M36 882a 9114b 004c mdasha mdashb mdashc

aPercentage of partial solutions fathomed due to symmetry test

bPercentage of partial solutions fathomed due to dominance testcPercentage of partial solutions fathomed due to bound test

the process of assessment of heuristic techniques Specificallyoptimal values can serve as a reference point with which theresults of heuristics can be compared However obtaining anoptimality certificate in many cases including the UDSP isvery time consuming In order to see the time differencesbetween finding an optimal solution for UDSP with the helpof heuristics and proving its optimality (as it is done bythe branch-and-bound method described in this paper) werun the two heuristic algorithms iterated tabu search andsimulated annealing (SA) of Brusco [19] on all the probleminstances used in the main experiment For two randominstances namely p21 and p25 SA was trapped in a localmaximumTherefore we have enhanced our implementationof SA by allowing it to run in a multi start mode Thisvariation of SA located an optimal solution for the above-mentioned instances in the third and respectively secondrestarts Both tested heuristics have proven to be very fastIn particular ITS took in total only 295 seconds to findthe global optima for all 30 instances we experimented withPerforming the same task by SA required 544 seconds Suchtimes are in huge contrast to what is seen in Tables 1ndash3 Basically the long computation times recorded in thesetables are the price we have to pay for obtaining not only asolution but also a guarantee of its optimality We did notperform additional computational experiments with ITS andSA because investigation of heuristics for theUDSP is outsidethe scope of this paper

Another issue concerns the use of heuristics for providinginitial solutions for the branch-and-bound technique Wehave employed the iterated tabu search procedure for thispurpose Our choice was influenced by the fact that wecould adopt in ITS the same formulas as those used inthe dominance test which is a key component of ourapproach Obviously there are other heuristics one canapply in place of ITS As it follows from the discussionin the preceding paragraph a good candidate for this isthe simulated annealing algorithm of Brusco [19] Supposethat BampB is run twice on the same instance from differentinitial solutions both of which are optimal Then it shouldbe clear that in both cases the same number of tree nodes

is explored This essentially means that any fast heuristiccapable of providing optimal solutions is perfect to be usedin the initialization step of BampB An alternative strategyis to use in this step randomly generated permutationsThe effects of this simple strategy are mixed For examplefor problem instances F25 and F27 the computation timeincreased from 1483 and respectively 11698 seconds (seeTable 2) to 4133 and respectively 20625 seconds Howeverfor example for p25 and M26 a somewhat different picturewas observed In the case of random initial solution BampBtook 4075 and 4687 seconds for these instances respectivelyThe computation times in Table 1 for p25 and Table 2 forM26are only marginally shorter One possible explanation of thislies in the fact that as Table 5 indicates the percentage ofpartial solutions fathomed due to bound test for both p25 andM26 is quite low However in general it is advantageous touse a heuristic for obtaining an initial solution for the branch-and-bound algorithm especially because this solution can beproduced at a negligible cost with respect to the rest of thecomputation

7 Conclusions

In this paper we have presented a branch-and-bound algo-rithm for the least-squares unidimensional scaling problemThe algorithm incorporates a LAP-based upper bound testas well as a dominance test which allows reducing theredundancy in the search process drastically The results ofcomputational experiments indicate that the algorithm canbe used to obtain provably optimal solutions for randomlygenerated dissimilarity matrices of size up to 30 times 30 andfor empirical dissimilarity matrices of size up to 35 times 35In particular for the first time the UDSP instance with the36 times 36 Morse code dissimilarity matrix has been solved toguaranteed optimality Another important instance is definedby the 45 times 45 food-item dissimilarity matrix 119863food It is agreat challenge to the optimization community to develop anexact algorithm that will solve this instance Our branch-and-bound algorithm is limited to submatrices of 119863food of size

Journal of Applied Mathematics 13

around 35 times 35 We also remark that optimal solutions forall UDSP instances used in our experiments can be found byapplying heuristic algorithms such as iterated tabu search andsimulated annealing proceduresThe total time taken by eachof these procedures to reach the global optima is only a fewseconds An advantage of the branch-and-bound algorithmis its ability to certify the optimality of the solution obtainedin a rigorous manner

Before closing there are two more points worth of men-tion First we proposed a simple yet effective mechanismfor making a decision about when it is useful to apply abound test to the current partial solution and when sucha test most probably will fail to discard this solution Forthat purpose a sample of partial solutions is evaluated in thepreparation step of BampBWe believe that similar mechanismsmay be operative in branch-and-bound algorithms for otherdifficult combinatorial optimization problems Second wedeveloped an efficient procedure for exploring the pairwiseinterchange neighborhood of a solution in the search spaceThis procedure can be useful on its own in the design ofheuristic algorithms for unidimensional scaling especiallythose based on the local search principle

Appendix

Derivation of (20)We will deduce an expression for 120575

119903119896 1 ⩽ 119896 lt 119903 ⩽ 119899

assuming that 119901 is a solution to the whole problem thatis 119901 isin Π Relocating the object 119901(119903) from position 119903 toposition 119896 gives the permutation 119901

1015840= (119901(1) 119901(119896 minus

1) 119901(119903) 119901(119896) 119901(119899)) By using the definition of 119865 we canwrite

120575119903119896

= 119865 (1199011015840) minus 119865 (119901)

= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119896

119889119901(119903)119901(119897)

)

2

minus(

119903minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

+

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

+2119889119901(119903)119901(119897)

)

2

minus

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

(A1)

Let us denote the first two terms in (A1) by 1205751015840 and the

remaining two terms by 12057510158401015840 = sum119903minus1

119897=11989612057510158401015840

119897 It is easy to see that 1205751015840

stems frommoving the object 119901(119903) to position 119896 and 12057510158401015840 stemsfrom shifting objects 119901(119897) 119897 = 119896 119903 minus 1 by one position to

the right To get a simpler expression for 1205751015840 we perform thefollowing manipulations

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

(

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2(

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A2)

Continuing we get

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A3)

Finally

1205751015840= minus 4

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

= minus 4(119891119901(119903)

minus

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

(119878119901(119903)

minus 119891119901(119903)

)

= 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

((

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

) minus 2119891119901(119903)

+ 119878119901(119903)

)

(A4)

14 Journal of Applied Mathematics

Similarly for 12057510158401015840119897we have

12057510158401015840

119897= (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

+ 4119889119901(119903)119901(119897)

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

+ 41198892

119901(119903)119901(119897)minus (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

= 4119889119901(119903)119901(119897)

(119891119901(119897)

minus 119878119901(119897)

+ 119891119901(119897)

) + 41198892

119901(119903)119901(119897)

= 4119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(A5)

By summing up (A4) and (A5) for 119897 = 119896 119903 minus 1 we obtain(20)

References

[1] I Borg and P J F Groenen Modern Multidimensional ScalingTheory and Appplications Springer New York NY USA 2ndedition 2005

[2] L Hubert P Arabie and J Meulman The Structural Repre-sentation of Proximity Matrices with MATLAB vol 19 SIAMPhiladelphia Pa USA 2006

[3] S Meisel and D Mattfeld ldquoSynergies of Operations Researchand Data Miningrdquo European Journal of Operational Researchvol 206 no 1 pp 1ndash10 2010

[4] G Caraux and S Pinloche ldquoPermutMatrix a graphical envi-ronment to arrange gene expression profiles in optimal linearorderrdquo Bioinformatics vol 21 no 7 pp 1280ndash1281 2005

[5] M K Ng E S Fung and H-Y Liao ldquoLinkage disequilibriummap by unidimensional nonnegative scalingrdquo in Proceedings ofthe 1st International Symposium on Optimization and SystemsBiology (OSB rsquo07) pp 302ndash308 Beijing China 2007

[6] P I Armstrong N A Fouad J Rounds and L HubertldquoQuantifying and interpreting group differences in interestprofilesrdquo Journal of Career Assessment vol 18 no 2 pp 115ndash1322010

[7] L Hubert and D Steinley ldquoAgreement among supreme courtjustices categorical versus continuous representationrdquo SIAMNews vol 38 no 7 2005

[8] A Dahabiah J Puentes and B Solaiman ldquoDigestive casebasemining based on possibility theory and linear unidimensionalscalingrdquo in Proceedings of the 8th WSEAS International Con-ference on Artificial Intelligence Knowledge Engineering andData Bases (AIKED rsquo09) pp 218ndash223 World Scientific andEngineering Academy and Society (WSEAS) Stevens PointWis USA 2009

[9] A Dahabiah J Puentes and B Solaiman ldquoPossibilistic patternrecognition in a digestive database for mining imperfect datardquoWSEASTransactions on Systems vol 8 no 2 pp 229ndash240 2009

[10] Y Ge Y Leung and J Ma ldquoUnidimensional scaling classifierand its application to remotely sensed datardquo inProceedings of theIEEE International Geoscience and Remote Sensing Symposium(IGARSS rsquo05) pp 3841ndash3844 July 2005

[11] J H Bihn M Verhaagh M Brandle and R Brandl ldquoDosecondary forests act as refuges for old growth forest animalsRecovery of ant diversity in the Atlantic forest of BrazilrdquoBiological Conservation vol 141 no 3 pp 733ndash743 2008

[12] D Defays ldquoA short note on a method of seriationrdquo BritishJournal of Mathematical and Statistical Psychology vol 31 no1 pp 49ndash53 1978

[13] M J Brusco and S Stahl ldquoOptimal least-squares unidimen-sional scaling improved branch-and-bound procedures andcomparison to dynamic programmingrdquo Psychometrika vol 70no 2 pp 253ndash270 2005

[14] L J Hubert and P Arabie ldquoUnidimensional scaling and com-binatorial optimizationrdquo in Multidimensional Data Analysis Jde Leeuw W J Heiser J Meulman and F Critchley Eds pp181ndash196 DSWO Press Leiden The Netherlands 1986

[15] L J Hubert P Arabie and J J Meulman ldquoLinear unidimen-sional scaling in the 119871

2-norm basic optimization methods

usingMATLABrdquo Journal of Classification vol 19 no 2 pp 303ndash328 2002

[16] M J Brusco and S Stahl Branch-and-Bound Applicationsin Combinatorial Data Analysis Springer Science + BusinessMedia New York NY USA 2005

[17] V Pliner ldquoMetric unidimensional scaling and global optimiza-tionrdquo Journal of Classification vol 13 no 1 pp 3ndash18 1996

[18] A Murillo J F Vera and W J Heiser ldquoA permutation-translation simulated annealing algorithm for 119871

1and 119871

2uni-

dimensional scalingrdquo Journal of Classification vol 22 no 1 pp119ndash138 2005

[19] M J Brusco ldquoOn the performance of simulated annealing forlarge-scale 119871

2unidimensional scalingrdquo Journal of Classification

vol 23 no 2 pp 255ndash268 2006[20] M J Brusco H F Kohn and S Stahl ldquoHeuristic imple-

mentation of dynamic programming for matrix permutationproblems in combinatorial data analysisrdquo Psychometrika vol73 no 3 pp 503ndash522 2008

[21] G Palubeckis ldquoIterated tabu search for the maximum diversityproblemrdquo Applied Mathematics and Computation vol 189 no1 pp 371ndash383 2007

[22] G Palubeckis ldquoA new bounding procedure and an improvedexact algorithm for the Max-2-SAT problemrdquo Applied Mathe-matics and Computation vol 215 no 3 pp 1106ndash1117 2009

[23] J Kluabwang D Puangdownreong and S Sujitjorn ldquoMultipathadaptive tabu search for a vehicle control problemrdquo Journal ofApplied Mathematics vol 2012 Article ID 731623 20 pages2012

[24] N Sarasiri K Suthamno and S Sujitjorn ldquoBacterial foraging-tabu search metaheuristics for identification of nonlinear fric-tion modelrdquo Journal of Applied Mathematics vol 2012 ArticleID 238563 23 pages 2012

[25] A Jouglet and J Carlier ldquoDominance rules in combinato-rial optimization problemsrdquo European Journal of OperationalResearch vol 212 no 3 pp 433ndash444 2011

[26] E Z Rothkopf ldquoA measure of stimulus similarity and errors insome paired-associate learning tasksrdquo Journal of ExperimentalPsychology vol 53 no 2 pp 94ndash101 1957

[27] I Duntsch and G Gediga ldquoModal-Style Operators in Qual-itative Data Analysisrdquo Tech Rep CS-02-15 Department ofComputer Science Brock University St Catharines OntarioCanada 2002

[28] R N Shepard ldquoAnalysis of proximities as a technique for thestudy of information processing in manrdquoHuman factors vol 5pp 33ndash48 1963

Journal of Applied Mathematics 15

[29] P J F Groenen and W J Heiser ldquoThe tunneling method forglobal optimization in multidimensional scalingrdquo Psychome-trika vol 61 no 3 pp 529ndash550 1996

[30] L Hubert P Arabie and JMeulmanCombinatorial Data Anal-ysis Optimization by Dynamic Programming SIAM Philadel-phia Pa USA 2001

[31] B H Ross and G L Murphy ldquoFood for thought cross-classification and category organization in a complex real-worlddomainrdquo Cognitive Psychology vol 38 no 4 pp 495ndash553 1999

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 7: Research Article An Improved Exact Algorithm for Least-Squares Unidimensional …downloads.hindawi.com/journals/jam/2013/890589.pdf · 2019-07-31 · by a 36×36 Morse code dissimilarity

Journal of Applied Mathematics 7

The previous expression is similar to the formula (6) in [19]For completeness we give its derivation in the Appendix Wecan rewrite (20) using the matrices 119860 and 119861

Proposition 5 For 119901 isin Π 119896 isin 1 119899 minus 1 and 119903 isin 119896 +

1 119899

120575119903119896

= 4119886119901(119903)119896

(119886119901(119903)119896

minus 2119886119901(119903)1

+ 119878119901(119903)

) + 4119887119901(119903)119896

(21)

The partial solution 119901 is dominated by 119901119896if 120575119903119896

gt 0Upon establishing this fact for some 119896 isin 1 119903 minus 2 thetest immediately stops with the rejection of 119901 Of course likein the case of (13) a factor of 4 in (21) can be dropped Thecomputational overhead of including the condition 120575

119903119896gt 0

in the test is minimal because all the quantities appearing onthe right-hand side of (21) except 119887

119901(119903)1 are also used in the

calculation ofΔ1015840(119901 119896 119903) (see (19)) Since the loop over 119896needsto be executed the time complexity of this part of the test is119874(119899) Certainly evaluation of relocating the last added objectin 119901 gives more power to the test Thus our dominance testis an improvement over the test described in the Brusco andStahl [13] paper

5 A Branch-and-Bound Algorithm

Our approach to the UDSP is to start with a good initial solu-tion provided by the iterated tabu search heuristic invokedin the initialization phase of the algorithm and then proceedwith the main phase where using the branch-and-boundtechnique a search tree is built For the purpose of pruningbranches of the search tree the algorithm involves three tests(1) the upper bound test (based on 119880

1015840) (2) the dominancetest (3) the symmetry testThe first two of them have alreadybeen described in the prior sections The last one is the mostsimple and cheap to apply It hinges on the fact that the valuesof the objective function 119865 at a permutation 119901 isin Π and itsreverse 119901 defined by 119901(119894) = 119901(119899 minus 119894 + 1) 119894 = 1 119899 arethe same Therefore it is very natural to demand that somefixed object in 119882 say 119908 being among those objects that areassigned to the first half of the positions in 119901 More formallythe condition is 119894 ⩽ lceil1198992rceil where 119894 is such that 119901(119894) = 119908If this condition is violated for 119901 = (119901(1) 119901(lceil1198992rceil)) then the symmetry test fails and 119901 is discarded Because ofthe requirement to fulfill the previous condition some careshould be taken in applying the dominance test In the case ofΔ1015840(119901 119896 119903) = 0 and 119901(119896) gt 119901(119903) (see the paragraph following

(19)) this test in our implementation of the algorithm gives aldquoFailrdquo verdict only if either 119903 ⩽ lceil1198992rceil or 119901(119896) = 119908

Through the preliminary computational experimentswe were convinced that the upper bound test was quiteweak when applied to smaller partial solutions In otherwords the time overhead for computing linear assignment-based bounds did not pay off for partial permutations 119901 =

(119901(1) 119901(119903)) of length 119903 that is small or modest comparedto 119899 Therefore it is reasonable to compute the upper bound1198801015840 (described in Section 2) only when 119903 ⩾ 120582(119863) where

120582(119863) is a threshold value depending on the UDSP instancematrix 119863 and thus also on 119899 The problem we face is howto determine an appropriate value for 120582(119863) We combat this

problem with a simple procedure consisting of two steps Inthe first step it randomly generates a relatively small numberof partial permutations 119901

1 119901

120591 each of length lfloor1198992rfloor For

each permutation in this sample the procedure computes theupper bound 119880

1015840 Let 11988010158401 119880

1015840

120591be the values obtained We

are interested in the normalized differences between thesevalues and 119865(119901heur) where 119901heur is the solution delivered bythe ITS algorithm So the procedure calculates the sum 119885 =

sum120591

119894=1100(119880

1015840

119894minus 119865(119901heur))119865(119901heur) and the average 119877 = 119885120591 of

such differences In the second step using119877 it determines thethreshold value This is done in the following way if 119877 lt 119877

1

then 120582(119863) = lfloor1198993rfloor if 1198771⩽ 119877 lt 119877

2 then 120582(119863) = lfloor1198992rfloor if 119877 gt

1198772 then 120582(119863) = lfloor31198994rfloor The described procedure is executed

at the initialization step of the algorithm Its parameters are1198771 1198772 and the sample size 120591

Armed with tools for pruning the search space we canconstruct a branch-and-bound algorithm for solving theUDSP In order to make the description of the algorithmmore readable we divide it into two parts namely themain procedure BampB (branch-and-bound) and an auxiliaryprocedure SelectObject The former builds a search tree withnodes representing partial solutionsThe task of SelectObjectis either to choose an object which can be used to extendthe current permutation or if the current node becomesfathomed to perform a backtracking step In the descriptionof the algorithm given in the following the set of objects thatcan be used to create new branches is denoted by119881Themainprocedure goes as follows

BampB

(1) Apply the ITS algorithm to get an initial solution 119901lowast

to a given instance of the UDSP

Set 119865lowast = 119865(119901lowast)

(2) Calculate 120582(119863)(3) Initialize the search tree by creating the root node

with the empty set 119881 attached to it

Set 119871 = 0 (119871 denotes the tree level the currentlyconsidered node belongs to)

(4) Set 119903 = 119871

(5) Call SelectObject(119903) It returns 119871 and if 119871 = 119903additionally some unassigned object (let it be denotedby V) If 119871 lt 0 then go to (7) Otherwise checkwhether 119871 lt 119903 If so then return to (4) (perform abacktracking step) If not (in which case 119871 = 119903) thenproceed to (6)

(6) Create a new node with empty 119881 Set 119901(119903 + 1) = VIncrement 119871 by 1 and go to (4)

(7) Calculate the coordinates for the objects by applyingformula (4) with respect to the permutation 119901

lowast andstop

Steps (1) and (2) of BampB comprise the initializationphase of the algorithm The initial solution to the problemis produced using ITS This solution is saved as the bestsolution found so far denoted by 119901lowast If later a better solutionis found 119901lowast is updated This is done within the SelectObject

8 Journal of Applied Mathematics

procedure In the final step BampB uses 119901lowast to calculate the

optimal coordinates for the objects The branch-and-boundphase starts at Step (3) of BampB Each node of the search treeis assigned a set 119881 The partial solution corresponding to thecurrent node is denoted by 119901 The objects of 119881 are used toextend 119901 Upon creation of a node (Step (3) for root andStep (6) for other nodes) the set 119881 is empty In SelectObject119881 is filled with objects that are promising candidates toperform the branching operation on the corresponding nodeSelectObject also decides which action has to be taken nextIf 119871 = 119903 then the action is ldquobranchrdquo whereas if 119871 lt 119903 theaction is ldquobacktrackrdquo Step (4) helps to discriminate betweenthese two cases If 119871 = 119903 then a new node in the nextlevel of the search tree is generated At that point the partialpermutation 119901 is extended by assigning the selected object Vto the position 119903 + 1 in119901 (see Step (6)) If however119871 lt 119903 and 119903is positive then backtracking is performed by repeating Steps(4) and (5) We thus see that the search process is controlledby the procedure SelectObject This procedure can be statedas follows

SelectObject(119903)

(1) Check whether the set119881 attached to the current nodeis empty If so then go to (2) If not then consider itssubset 1198811015840 = 119894 isin 119881 | 119894 is unmarked and 119906

119894(119901) gt

119865lowast If 1198811015840 is nonempty then mark an arbitrary object

V isin 1198811015840 and return with it as well as with 119871 = 119903

Otherwise go to (6)(2) If 119903 gt 0 then go to (3) Otherwise set 119881 = 1 119899

119906119894(119901) = infin 119894 = 1 119899 and go to (4)

(3) If 119903 = 119899 minus 1 then go to (5) Otherwise for each 119894 isin

119882(119901) perform the following operations Temporarilyset 119901(119903 + 1) = 119894 Apply the symmetry test thedominance test and if 119903 ⩾ 120582(119863) also the bound testto the resulting partial solution 119901

1015840= (119901(1) 119901(119903 +

1)) If it passes all the tests then do the followingInclude 119894 into 119881 If 119903 ⩾ 120582(119863) assign to 119906

119894(119901) the value

of the upper bound 1198801015840 computed for the subproblem

defined by the partial solution 1199011015840 If however 119903 lt

120582(119863) then set 119906119894(119901) = infin If after processing all

objects in 119882(119901) the set 119881 remains empty then go to(6) Otherwise proceed to (4)

(4) Mark an arbitrary object V isin 119881 and return with it aswell as with 119871 = 119903

(5) Let 119882(119901) = 119895 Calculate the value of the solution119901 = (119901(1) 119901(119899 minus 1) 119901(119899) = 119895) If 119865(119901) gt 119865

lowast thenset 119901lowast = 119901 and 119865

lowast= 119865(119901)

(6) If 119903 gt 0 then drop the 119903th element of 119901 Return with119871 = 119903 minus 1

In the previous description 119882(119901) denotes the set ofunassigned objects as before The objects included in 119881 sube

119882(119901) are considered as candidates to be assigned to theleftmost available position in 119901 At the root node 119881 = 119882(119901)In all other cases except when |119882(119901)| = 1 an object becomesa member of the set 119881 only if it passes two or for longer 119901three tests The third of them the upper bound test amounts

to check if 1198801015840 gt 119865lowast This condition is checked again on

the second and subsequent calls to SelectObject for the samenode of the search tree This is done in Step (1) for eachunmarked object 119894 isin 119881 Such a strategy is reasonable becauseduring the branch-and-bound process the value of 119865lowast mayincrease and the previously satisfied constraint1198801015840 gt 119865

lowast maybecome violated for some 119894 isin 119881 To be able to identify suchobjects the algorithm stores the computed upper bound 119880

1015840

as 119906119894(119901) at Step (3) of SelectObjectA closer look at the procedure SelectObject shows that

two scenarios are possible (1)when this procedure is invokedfollowing the creation of a newnode of the tree (which is doneeither in Step (3) or in Step (6) of BampB) (2)when a call to theprocedure ismade after a backtracking stepThe set119881 initiallyis empty and therefore Steps (2) to (5) of SelectObject becomeactive only in the first of these two scenarios Dependingon the current level of the tree 119903 one of the three cases isconsidered If the current node is the root that is 119903 = 0 thenStep (2) is executed Since in this case the dominance test isnot applicable the algorithm uses the set119881 comprising all theobjects under consideration An arbitrary object selected inStep (4) is then used to create the first branch emanating fromthe root of the tree If 119903 isin [1 119899 minus 2] then the algorithmproceeds to Step (3) The task of this step is to evaluate allpartial solutions that can be obtained from the current partialpermutation119901by assigning an object from119882(119901) to the (119903 + 1)

th position in 119901 In order not to lose at least one optimalsolution it is sufficient to keep only those extensions of 119901 forwhich all the tests were successful Such partial solutions arememorized by the set 119881 One of them is chosen in Step (4)

to be used (in Step (6) of BampB) to generate the first son ofthe current node If however no promising partial solutionhas been found and thus the set 119881 is empty then the currentnode can be discarded and the search continued from itsparent node Finally if 119903 = 119899 minus 1 then Step (5) is executedin which the permutation 119901 is completed by placing theremaining unassigned object in the last position of119901 Its valueis compared with that of the best solution found so far Thenode corresponding to 119901 is a leaf and therefore is fathomedIn the second of the above-mentioned two scenarios (Step(1) of the procedure) the already existing nonempty set 119881is scanned The previously selected objects in 119881 have beenmarked They are ignored For each remaining object 119894 it ischecked whether 119906

119894(119901) gt 119865

lowast If so then 119894 is included inthe set 1198811015840 If the resulting set 1198811015840 is empty then the currentnode of the tree becomes fathomed and the procedure goesto Step (6) Otherwise the procedure chooses a branchingobject and passes it to its caller BampB Our implementationof the algorithm adopts a very simple enumeration strategybranch upon an arbitrary object in 119881

1015840 (or 119881 when Step (4) isperformed)

It is useful to note that the tests in Step (3) are applied inincreasing order of time complexityThe symmetry test is thecheapest of them whereas the upper bound test is the mostexpensive to perform but still effective

6 Computational Results

The main goal of our numerical experiments was to pro-vide some measure of the performance of the developed

Journal of Applied Mathematics 9

BampB algorithm as well as to demonstrate the benefits ofusing the new upper bound and efficient dominance testthroughout the solution process For comparison purposeswe also implemented the better of the two branch-and-bound algorithms of Brusco and Stahl [13] In their studythis algorithm is denoted as BB3 When referring to it inthis paper we will use the same name In order to make thecomparison more fair we decided to start BB3 by invokingthe ITS heuristic Thus the initial solution from which thebranch-and-bound phase of each of the algorithms starts isexpected to be of the same quality

61 Experimental Setup Both our algorithm and BB3 havebeen coded in the C programming language and all the testshave been carried out on a PC with an Intel Core 2 DuoCPU running at 30GHz As a testbed for the algorithmsthree empirical dissimilarity matrices from the literature inaddition to a number of randomly generated ones wereconsidered The random matrices vary in size from 20 to30 objects All off-diagonal entries are drawn uniformly atrandom from the interval [1 100] The matrices of courseare symmetric

We also used three dissimilarity matrices constructedfrom empirical data found in the literature The first matrixis based onMorse code confusion data collected by Rothkopf[26] In his experiment the subjects were presented pairsof Morse code signals and asked to respond whether theythought the signals were the same or differentThese answerswere translated to the entries of the similarity matrix withrows and columns corresponding to the 36 alphanumericcharacters (26 letters in the Latin alphabet and 10 digits in thedecimal number system) We should mention that we foundsmall discrepancies between Morse code similarity matricesgiven in various publications and electronic media We havedecided to take the matrix from [27] where it is presentedwith a reference to Shepard [28] Let thismatrix be denoted by119872 = (119898

119894119895) Its entries are integers from the interval [0 100]

Like in the study of Brusco and Stahl [13] the correspondingsymmetric dissimilarity matrix 119863 = (119889

119894119895) was obtained by

using the formula 119889119894119895= 200 minus 119898

119894119895minus 119898119895119894for all off-diagonal

entries and setting the diagonal of 119863 to zero We consideredtwoUDSP instances constructed usingMorse code confusiondatamdashone defined by the 26 times 26 submatrix of119863 with rowsand columns labeled by the letters and another defined by thefull matrix119863

The second empirical matrix was taken from Groenenand Heiser [29] We denote it by 119866 = (119892

119894119895) Its entry 119892

119894119895

gives the number of citations in journal 119894 to journal 119895 Thusthe rows of 119866 correspond to citing journals and the columnscorrespond to cited journals The dissimilarity matrix 119863 wasderived from 119866 in two steps like in [13] First a symmetricsimilarity matrix 119867 = (ℎ

119894119895) was calculated by using the

formulas ℎ119894119895

= ℎ119895119894

= (119892119894119895sum119899

119897=1l = 119894 119892119894119897) + (119892119895119894sum119899

119897=1119897 = 119895119892119895119897)

119894 119895 = 1 119899 119894 lt 119895 and ℎ119894119894

= 0 119894 = 1 119899 Then 119863

was constructed with entries 119889119894119895

= lfloor100(max119897119896ℎ119897119896

minus ℎ119894119895)rfloor

119894 119895 = 1 119899 119894 = 119895 and 119889119894119894= 0 119894 = 1 119899

The third empirical matrix used in our experimentswas taken from Hubert et al [30] Its entries represent the

dissimilarities between pairs of food items As mentioned byHubert et al [30] this 45 times 45 food-item matrix 119863foodwas constructed based on data provided by Ross andMurphy[31] For our computational tests we considered submatricesof 119863food defined by the first 119899 isin 20 21 35 food itemsAll the dissimilarity matrices we just described as well asthe source code of our algorithm are publicly available athttpwwwproinktultsimgintarasudshtml

Based on the results of preliminary experiments with theBampB algorithm we have fixed the values of its parameters120591 = 10 119877

1= 0 and 119877

2= 5 In the symmetry test119908 was fixed

at 1 The cutoff time for ITS was 1 second

62 Results The first experiment was conducted on a seriesof randomly generated full density dissimilarity matricesThe results of BampB and BB3 for them are summarized inTable 1 The dimension of the matrix 119863 is encoded in theinstance name displayed in the first column The secondcolumn of the table shows for each instance the optimalvalue of the objective function 119865

lowast and in parentheses tworelated metrics The first of them is the raw value of thestress function Φ(119909) The second number in parentheses isthe normalized value of the stress function that is calculatedasΦ(119909)sum

119894lt119895119889119894119895 The next two columns list the results of our

algorithm the number of nodes in the search tree and theCPU time reported in the form hours minutes seconds orminutes seconds or seconds The last two columns give thesemeasures for BB3

As it can be seen from the table BampB is superior to BB3We observe for example that the decrease in CPU time overBB3 is around 40 for the largest three instances Also ouralgorithm builds smaller search trees than BB3

In our second experiment we opt for three empiricaldissimilarity matrices mentioned in Section 61 Table 2 sum-marizes the results of BampB and BB3 on the UDSP instancesdefined by these matrices Its structure is the same as that ofTable 1 In the instance names the letter ldquoMrdquo stands for theMorse codematrix ldquoJrdquo stands for the journal citationsmatrixand ldquoFrdquo stands for the food item data As before the numberfollowing the letter indicates the dimension of the problemmatrix Comparing results in Tables 1 and 2 we find thatfor empirical data the superiority of our algorithm over BB3is more pronounced Basically BampB performs significantlybetter than BB3 for each of the three data sets For examplefor the food-item dissimilarity matrix of size 30 times 30 BampBbuilds almost 10 times smaller search tree and is about 5 timesfaster than BB3

Table 3 presents the results of the performance of BampBon the full Morse code dissimilarity matrix as well as onlarger submatrices of the food-item dissimilarity matrixRunning BB3 on these UDSP instances is too expensive sothis algorithm is not included in the table By analyzing theresults in Tables 2 and 3 we find that for 119894 isin 22 35 theCPU time taken by BampB to solve the problem with the 119894 times 119894

food-item submatrix is between 196 (for 119894 = 30) and 282 (for119894 = 27) percent of that needed to solve the problem with the(119894 minus1)times (119894minus1) food-item submatrix For the submatrix of size35 times 35 the CPU time taken by the algorithm was more than

10 Journal of Applied Mathematics

Table 1 Performance on random problem instances

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

p20 9053452 (1797554 0284) 1264871 44 1526725 56p21 9887394 (1879277 0285) 3304678 106 4169099 145p22 12548324 (2245406 0282) 6210311 205 8017112 290p23 14632006 (2396198 0274) 12979353 450 16624663 1067p24 16211210 (2815859 0294) 31820463 1533 42435561 2555p25 15877026 (3126700 0330) 103208158 6446 137525505 9524p26 20837730 (3471478 0302) 200528002 13354 267603807 20436p27 21464870 (3585958 0311) 287794935 20377 392405143 32365p28 23938264 (3961360 0317) 581602877 44093 812649051 113054p29 28613820 (4537786 0315) 2047228373 250572 3044704554 438570p30 30359134 (4645569 0315) 3555327537 509099 5216804247 855046

Table 2 Performance on empirical dissimilarity matrices

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

M26 180577772 (19448891 0219) 108400003 7432 201264785 16143J28 60653172 (7167033 0249) 2019598593 348006 6780517853 1039066F20 19344116 (1051432 0098) 236990 27 1971840 70F21 22694178 (1229979 0102) 568998 52 4716777 162F22 26496098 (1495949 0110) 1178415 106 9993241 357F23 30853716 (1757681 0116) 2524703 252 24357098 1378F24 35075372 (2174712 0130) 6727692 1001 51593000 3304F25 40091544 (2491002 0134) 14560522 2283 139580891 10014F26 45507780 (2886048 0142) 40465339 6553 298597411 23188F27 51859210 (3336825 0148) 116161004 19298 670492972 55578F28 58712130 (3788642 0153) 253278124 44358 1366152797 203265F29 66198704 (4203981 0156) 530219761 142265 3616257778 532084F30 74310052 (4629353 0157) 972413816 320566 9360343618 1556412

14 days In fact a solution of value 120526370was found by theITS heuristic in less than 1 sThe rest of the time was spent onproving its optimalityWe also see fromTable 3 that theUDSPwith the full Morse code dissimilarity matrix required nearly21 days to solve to proven optimality using our approach

In Table 4 we display optimal solutions for the twolargest UDSP instances we have tried The second and fourthcolumns of the table contain the optimal permutations forthese instances The coordinates for objects calculated using(4) are given in the third and fifth columns The number inparenthesis in the fourth column shows the order in whichthe food items appear within Table 51 in the book by Hubertet al [30] From Table 4 we observe that only a half of thesignals representing digits in theMorse code are placed at theend of the scale Other such signals are positioned betweensignals encoding letters We also see that shorter signals tendto occupy positions at the beginning of the permutation 119901

In Table 5 we evaluate the effectiveness of tests used tofathom partial solutions Computational results are reportedfor a selection of UDSP instances The second third andfourth columns of the table present 100119873sym119873 100119873dom119873and 100119873UB119873 where119873sym (respectively119873dom and119873UB) isthe number of partial permutations that are fathomed due

to symmetry test (respectively due to dominance test anddue to upper bound test) in Step (3) of SelectObject and119873 = 119873sym+119873dom+119873UBThe last three columns give the sameset of statistics for the BB3 algorithm Inspection of Table 5reveals that119873dom is much larger than both119873sym and119873UB Itcan also be seen that119873dom values are consistently higher withBB3 thanwhen using our approach Comparing the other twotests we observe that 119873sym is larger than 119873UB in all casesexcept when applying BampB to the food-item dissimilaritysubmatrices of size 20 times 20 and 25 times 25 Generally we noticethat the upper bound test is most successful for the UDSPinstances defined by the food-item matrix

We end this section with a discussion of two issuesregarding the use of heuristic algorithms for the UDSP Froma practical point of view heuristics of course are preferableto exact methods as they are fast and can be applied to largeproblem instances If carefully designed they are able toproduce solutions of very high quality The development ofexact methods on the other hand is a theoretical avenue forcoping with different problems In the field of optimizationexact algorithms for a problem of interest provide not only asolution but also a certificate of its optimality Therefore oneof possible applications of exact methods is to use them in

Journal of Applied Mathematics 11

Table 3 Performance of BampB on larger empirical dissimilarity matrices

Instance Objective function value Number of nodes TimeF31 82221440 (5194455 0164) 2469803824 857038F32 91389180 (5703241 0166) 5040451368 1944127F33 100367800 (6412395 0174) 14318454604 5524200F34 110065196 (7104054 0180) 33035826208 12936500F35 120526370 (7841564 0185) 82986984742 34256456M36 489111494 (40682956 0230) 243227368848 49451175

Table 4 Optimal solutions for the full Morse code dissimilarity matrix and the 35 times 35 food-item dissimilarity matrix

119894

Morse code full Food item data119901(119894) 119909

119901(119894)119901(119894) 119909

119901(119894)

1 (E) ∙ 000000 orange (3) 0000002 (T) minus 583333 watermelon (2) 0285713 (I) ∙∙ 2477778 apple (1) 0285714 (A) ∙minus 3408333 banana (4) 1285715 (N) minus∙ 4411111 pineapple (5) 1628576 (M) minusminus 5233333 lettuce (6) 21257147 (S) ∙ ∙ ∙ 7030556 broccoli (7) 22314298 (U) ∙ ∙ minus 8313889 carrots (8) 22542869 (R) ∙ minus ∙ 9347222 corn (9) 228857110 (W) ∙ minus minus 10297222 onions (10) 251142911 (H) ∙ ∙ ∙∙ 11444444 potato (11) 299714312 (D) minus ∙ ∙ 12430556 rice (12) 516285713 (K) minus ∙ minus 13152778 spaghetti (19) 589714314 (V) ∙ ∙ ∙minus 14500000 bread (13) 644857115 (5) ∙ ∙ ∙ ∙ ∙ 15244444 bagel (14) 698571416 (4) ∙ ∙ ∙ ∙ minus 15991667 cereal (16) 721142917 (F) ∙ ∙ minus∙ 17075000 oatmeal (15) 725142918 (L) ∙ minus ∙∙ 18225000 pancake (18) 763714319 (B) minus ∙ ∙∙ 18952778 muffin (17) 778000020 (X) minus ∙ ∙minus 19955556 crackers (20) 889714321 (6) minus ∙ ∙ ∙ ∙ 20566667 granola bar (21) 930000022 (3) ∙ ∙ ∙ minus minus 22013889 pretzels (22) 1007428623 (C) minus ∙ minus∙ 22947222 nuts (24) 1043714324 (Y) minus ∙ minusminus 23841667 popcorn (23) 1076285725 (7) minus minus ∙ ∙ ∙ 24930556 potato chips (25) 1124857126 (Z) minus minus ∙∙ 25488889 doughnuts (26) 1200857127 (Q) minus minus ∙minus 26438889 pizza (31) 1265142928 (P) ∙ minus minus∙ 27083333 cookies (27) 1368857129 (J) ∙ minus minusminus 28294444 chocolate bar (29) 1394000030 (G) minus minus ∙ 29247222 cake (28) 1410571431 (O) minus minus minus 30022222 pie (30) 1435428632 (2) ∙ ∙ minus minus minus 31050000 ice cream (32) 1520000033 (8) minus minus minus ∙ ∙ 32036111 yogurt (33) 1571142934 (1) ∙ minus minus minus minus 33350000 butter (34) 1618000035 (9) minus minus minus minus ∙ 34186111 cheese (35) 1650857136 (0) minus minus minus minus minus 35027778

12 Journal of Applied Mathematics

Table 5 Relative performance of the tests

Instance B amp B BB3Sym Dom UB Sym Dom UB

p20 1158a 8824b 018c 1111a 8873b 016c

p25 886a 9087b 027c 848a 9143b 009c

p30 701a 9284 b 015c 629a 9364b 007c

M26 1140a 8847 b 013c 1044a 8952b 004c

J28 855a 8923 b 222c 568 a 9426b 006c

F20 801a 7563 b 1636c 1325a 8542b 133c

F25 704a 8127 b 1169c 1031a 8939b 030c

F30 895a 8391b 714c 774a 9216b 010c

F35 853a 8723b 424c mdasha mdashb mdashc

M36 882a 9114b 004c mdasha mdashb mdashc

aPercentage of partial solutions fathomed due to symmetry test

bPercentage of partial solutions fathomed due to dominance testcPercentage of partial solutions fathomed due to bound test

the process of assessment of heuristic techniques Specificallyoptimal values can serve as a reference point with which theresults of heuristics can be compared However obtaining anoptimality certificate in many cases including the UDSP isvery time consuming In order to see the time differencesbetween finding an optimal solution for UDSP with the helpof heuristics and proving its optimality (as it is done bythe branch-and-bound method described in this paper) werun the two heuristic algorithms iterated tabu search andsimulated annealing (SA) of Brusco [19] on all the probleminstances used in the main experiment For two randominstances namely p21 and p25 SA was trapped in a localmaximumTherefore we have enhanced our implementationof SA by allowing it to run in a multi start mode Thisvariation of SA located an optimal solution for the above-mentioned instances in the third and respectively secondrestarts Both tested heuristics have proven to be very fastIn particular ITS took in total only 295 seconds to findthe global optima for all 30 instances we experimented withPerforming the same task by SA required 544 seconds Suchtimes are in huge contrast to what is seen in Tables 1ndash3 Basically the long computation times recorded in thesetables are the price we have to pay for obtaining not only asolution but also a guarantee of its optimality We did notperform additional computational experiments with ITS andSA because investigation of heuristics for theUDSP is outsidethe scope of this paper

Another issue concerns the use of heuristics for providinginitial solutions for the branch-and-bound technique Wehave employed the iterated tabu search procedure for thispurpose Our choice was influenced by the fact that wecould adopt in ITS the same formulas as those used inthe dominance test which is a key component of ourapproach Obviously there are other heuristics one canapply in place of ITS As it follows from the discussionin the preceding paragraph a good candidate for this isthe simulated annealing algorithm of Brusco [19] Supposethat BampB is run twice on the same instance from differentinitial solutions both of which are optimal Then it shouldbe clear that in both cases the same number of tree nodes

is explored This essentially means that any fast heuristiccapable of providing optimal solutions is perfect to be usedin the initialization step of BampB An alternative strategyis to use in this step randomly generated permutationsThe effects of this simple strategy are mixed For examplefor problem instances F25 and F27 the computation timeincreased from 1483 and respectively 11698 seconds (seeTable 2) to 4133 and respectively 20625 seconds Howeverfor example for p25 and M26 a somewhat different picturewas observed In the case of random initial solution BampBtook 4075 and 4687 seconds for these instances respectivelyThe computation times in Table 1 for p25 and Table 2 forM26are only marginally shorter One possible explanation of thislies in the fact that as Table 5 indicates the percentage ofpartial solutions fathomed due to bound test for both p25 andM26 is quite low However in general it is advantageous touse a heuristic for obtaining an initial solution for the branch-and-bound algorithm especially because this solution can beproduced at a negligible cost with respect to the rest of thecomputation

7 Conclusions

In this paper we have presented a branch-and-bound algo-rithm for the least-squares unidimensional scaling problemThe algorithm incorporates a LAP-based upper bound testas well as a dominance test which allows reducing theredundancy in the search process drastically The results ofcomputational experiments indicate that the algorithm canbe used to obtain provably optimal solutions for randomlygenerated dissimilarity matrices of size up to 30 times 30 andfor empirical dissimilarity matrices of size up to 35 times 35In particular for the first time the UDSP instance with the36 times 36 Morse code dissimilarity matrix has been solved toguaranteed optimality Another important instance is definedby the 45 times 45 food-item dissimilarity matrix 119863food It is agreat challenge to the optimization community to develop anexact algorithm that will solve this instance Our branch-and-bound algorithm is limited to submatrices of 119863food of size

Journal of Applied Mathematics 13

around 35 times 35 We also remark that optimal solutions forall UDSP instances used in our experiments can be found byapplying heuristic algorithms such as iterated tabu search andsimulated annealing proceduresThe total time taken by eachof these procedures to reach the global optima is only a fewseconds An advantage of the branch-and-bound algorithmis its ability to certify the optimality of the solution obtainedin a rigorous manner

Before closing there are two more points worth of men-tion First we proposed a simple yet effective mechanismfor making a decision about when it is useful to apply abound test to the current partial solution and when sucha test most probably will fail to discard this solution Forthat purpose a sample of partial solutions is evaluated in thepreparation step of BampBWe believe that similar mechanismsmay be operative in branch-and-bound algorithms for otherdifficult combinatorial optimization problems Second wedeveloped an efficient procedure for exploring the pairwiseinterchange neighborhood of a solution in the search spaceThis procedure can be useful on its own in the design ofheuristic algorithms for unidimensional scaling especiallythose based on the local search principle

Appendix

Derivation of (20)We will deduce an expression for 120575

119903119896 1 ⩽ 119896 lt 119903 ⩽ 119899

assuming that 119901 is a solution to the whole problem thatis 119901 isin Π Relocating the object 119901(119903) from position 119903 toposition 119896 gives the permutation 119901

1015840= (119901(1) 119901(119896 minus

1) 119901(119903) 119901(119896) 119901(119899)) By using the definition of 119865 we canwrite

120575119903119896

= 119865 (1199011015840) minus 119865 (119901)

= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119896

119889119901(119903)119901(119897)

)

2

minus(

119903minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

+

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

+2119889119901(119903)119901(119897)

)

2

minus

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

(A1)

Let us denote the first two terms in (A1) by 1205751015840 and the

remaining two terms by 12057510158401015840 = sum119903minus1

119897=11989612057510158401015840

119897 It is easy to see that 1205751015840

stems frommoving the object 119901(119903) to position 119896 and 12057510158401015840 stemsfrom shifting objects 119901(119897) 119897 = 119896 119903 minus 1 by one position to

the right To get a simpler expression for 1205751015840 we perform thefollowing manipulations

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

(

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2(

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A2)

Continuing we get

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A3)

Finally

1205751015840= minus 4

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

= minus 4(119891119901(119903)

minus

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

(119878119901(119903)

minus 119891119901(119903)

)

= 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

((

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

) minus 2119891119901(119903)

+ 119878119901(119903)

)

(A4)

14 Journal of Applied Mathematics

Similarly for 12057510158401015840119897we have

12057510158401015840

119897= (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

+ 4119889119901(119903)119901(119897)

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

+ 41198892

119901(119903)119901(119897)minus (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

= 4119889119901(119903)119901(119897)

(119891119901(119897)

minus 119878119901(119897)

+ 119891119901(119897)

) + 41198892

119901(119903)119901(119897)

= 4119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(A5)

By summing up (A4) and (A5) for 119897 = 119896 119903 minus 1 we obtain(20)

References

[1] I Borg and P J F Groenen Modern Multidimensional ScalingTheory and Appplications Springer New York NY USA 2ndedition 2005

[2] L Hubert P Arabie and J Meulman The Structural Repre-sentation of Proximity Matrices with MATLAB vol 19 SIAMPhiladelphia Pa USA 2006

[3] S Meisel and D Mattfeld ldquoSynergies of Operations Researchand Data Miningrdquo European Journal of Operational Researchvol 206 no 1 pp 1ndash10 2010

[4] G Caraux and S Pinloche ldquoPermutMatrix a graphical envi-ronment to arrange gene expression profiles in optimal linearorderrdquo Bioinformatics vol 21 no 7 pp 1280ndash1281 2005

[5] M K Ng E S Fung and H-Y Liao ldquoLinkage disequilibriummap by unidimensional nonnegative scalingrdquo in Proceedings ofthe 1st International Symposium on Optimization and SystemsBiology (OSB rsquo07) pp 302ndash308 Beijing China 2007

[6] P I Armstrong N A Fouad J Rounds and L HubertldquoQuantifying and interpreting group differences in interestprofilesrdquo Journal of Career Assessment vol 18 no 2 pp 115ndash1322010

[7] L Hubert and D Steinley ldquoAgreement among supreme courtjustices categorical versus continuous representationrdquo SIAMNews vol 38 no 7 2005

[8] A Dahabiah J Puentes and B Solaiman ldquoDigestive casebasemining based on possibility theory and linear unidimensionalscalingrdquo in Proceedings of the 8th WSEAS International Con-ference on Artificial Intelligence Knowledge Engineering andData Bases (AIKED rsquo09) pp 218ndash223 World Scientific andEngineering Academy and Society (WSEAS) Stevens PointWis USA 2009

[9] A Dahabiah J Puentes and B Solaiman ldquoPossibilistic patternrecognition in a digestive database for mining imperfect datardquoWSEASTransactions on Systems vol 8 no 2 pp 229ndash240 2009

[10] Y Ge Y Leung and J Ma ldquoUnidimensional scaling classifierand its application to remotely sensed datardquo inProceedings of theIEEE International Geoscience and Remote Sensing Symposium(IGARSS rsquo05) pp 3841ndash3844 July 2005

[11] J H Bihn M Verhaagh M Brandle and R Brandl ldquoDosecondary forests act as refuges for old growth forest animalsRecovery of ant diversity in the Atlantic forest of BrazilrdquoBiological Conservation vol 141 no 3 pp 733ndash743 2008

[12] D Defays ldquoA short note on a method of seriationrdquo BritishJournal of Mathematical and Statistical Psychology vol 31 no1 pp 49ndash53 1978

[13] M J Brusco and S Stahl ldquoOptimal least-squares unidimen-sional scaling improved branch-and-bound procedures andcomparison to dynamic programmingrdquo Psychometrika vol 70no 2 pp 253ndash270 2005

[14] L J Hubert and P Arabie ldquoUnidimensional scaling and com-binatorial optimizationrdquo in Multidimensional Data Analysis Jde Leeuw W J Heiser J Meulman and F Critchley Eds pp181ndash196 DSWO Press Leiden The Netherlands 1986

[15] L J Hubert P Arabie and J J Meulman ldquoLinear unidimen-sional scaling in the 119871

2-norm basic optimization methods

usingMATLABrdquo Journal of Classification vol 19 no 2 pp 303ndash328 2002

[16] M J Brusco and S Stahl Branch-and-Bound Applicationsin Combinatorial Data Analysis Springer Science + BusinessMedia New York NY USA 2005

[17] V Pliner ldquoMetric unidimensional scaling and global optimiza-tionrdquo Journal of Classification vol 13 no 1 pp 3ndash18 1996

[18] A Murillo J F Vera and W J Heiser ldquoA permutation-translation simulated annealing algorithm for 119871

1and 119871

2uni-

dimensional scalingrdquo Journal of Classification vol 22 no 1 pp119ndash138 2005

[19] M J Brusco ldquoOn the performance of simulated annealing forlarge-scale 119871

2unidimensional scalingrdquo Journal of Classification

vol 23 no 2 pp 255ndash268 2006[20] M J Brusco H F Kohn and S Stahl ldquoHeuristic imple-

mentation of dynamic programming for matrix permutationproblems in combinatorial data analysisrdquo Psychometrika vol73 no 3 pp 503ndash522 2008

[21] G Palubeckis ldquoIterated tabu search for the maximum diversityproblemrdquo Applied Mathematics and Computation vol 189 no1 pp 371ndash383 2007

[22] G Palubeckis ldquoA new bounding procedure and an improvedexact algorithm for the Max-2-SAT problemrdquo Applied Mathe-matics and Computation vol 215 no 3 pp 1106ndash1117 2009

[23] J Kluabwang D Puangdownreong and S Sujitjorn ldquoMultipathadaptive tabu search for a vehicle control problemrdquo Journal ofApplied Mathematics vol 2012 Article ID 731623 20 pages2012

[24] N Sarasiri K Suthamno and S Sujitjorn ldquoBacterial foraging-tabu search metaheuristics for identification of nonlinear fric-tion modelrdquo Journal of Applied Mathematics vol 2012 ArticleID 238563 23 pages 2012

[25] A Jouglet and J Carlier ldquoDominance rules in combinato-rial optimization problemsrdquo European Journal of OperationalResearch vol 212 no 3 pp 433ndash444 2011

[26] E Z Rothkopf ldquoA measure of stimulus similarity and errors insome paired-associate learning tasksrdquo Journal of ExperimentalPsychology vol 53 no 2 pp 94ndash101 1957

[27] I Duntsch and G Gediga ldquoModal-Style Operators in Qual-itative Data Analysisrdquo Tech Rep CS-02-15 Department ofComputer Science Brock University St Catharines OntarioCanada 2002

[28] R N Shepard ldquoAnalysis of proximities as a technique for thestudy of information processing in manrdquoHuman factors vol 5pp 33ndash48 1963

Journal of Applied Mathematics 15

[29] P J F Groenen and W J Heiser ldquoThe tunneling method forglobal optimization in multidimensional scalingrdquo Psychome-trika vol 61 no 3 pp 529ndash550 1996

[30] L Hubert P Arabie and JMeulmanCombinatorial Data Anal-ysis Optimization by Dynamic Programming SIAM Philadel-phia Pa USA 2001

[31] B H Ross and G L Murphy ldquoFood for thought cross-classification and category organization in a complex real-worlddomainrdquo Cognitive Psychology vol 38 no 4 pp 495ndash553 1999

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 8: Research Article An Improved Exact Algorithm for Least-Squares Unidimensional …downloads.hindawi.com/journals/jam/2013/890589.pdf · 2019-07-31 · by a 36×36 Morse code dissimilarity

8 Journal of Applied Mathematics

procedure In the final step BampB uses 119901lowast to calculate the

optimal coordinates for the objects The branch-and-boundphase starts at Step (3) of BampB Each node of the search treeis assigned a set 119881 The partial solution corresponding to thecurrent node is denoted by 119901 The objects of 119881 are used toextend 119901 Upon creation of a node (Step (3) for root andStep (6) for other nodes) the set 119881 is empty In SelectObject119881 is filled with objects that are promising candidates toperform the branching operation on the corresponding nodeSelectObject also decides which action has to be taken nextIf 119871 = 119903 then the action is ldquobranchrdquo whereas if 119871 lt 119903 theaction is ldquobacktrackrdquo Step (4) helps to discriminate betweenthese two cases If 119871 = 119903 then a new node in the nextlevel of the search tree is generated At that point the partialpermutation 119901 is extended by assigning the selected object Vto the position 119903 + 1 in119901 (see Step (6)) If however119871 lt 119903 and 119903is positive then backtracking is performed by repeating Steps(4) and (5) We thus see that the search process is controlledby the procedure SelectObject This procedure can be statedas follows

SelectObject(119903)

(1) Check whether the set119881 attached to the current nodeis empty If so then go to (2) If not then consider itssubset 1198811015840 = 119894 isin 119881 | 119894 is unmarked and 119906

119894(119901) gt

119865lowast If 1198811015840 is nonempty then mark an arbitrary object

V isin 1198811015840 and return with it as well as with 119871 = 119903

Otherwise go to (6)(2) If 119903 gt 0 then go to (3) Otherwise set 119881 = 1 119899

119906119894(119901) = infin 119894 = 1 119899 and go to (4)

(3) If 119903 = 119899 minus 1 then go to (5) Otherwise for each 119894 isin

119882(119901) perform the following operations Temporarilyset 119901(119903 + 1) = 119894 Apply the symmetry test thedominance test and if 119903 ⩾ 120582(119863) also the bound testto the resulting partial solution 119901

1015840= (119901(1) 119901(119903 +

1)) If it passes all the tests then do the followingInclude 119894 into 119881 If 119903 ⩾ 120582(119863) assign to 119906

119894(119901) the value

of the upper bound 1198801015840 computed for the subproblem

defined by the partial solution 1199011015840 If however 119903 lt

120582(119863) then set 119906119894(119901) = infin If after processing all

objects in 119882(119901) the set 119881 remains empty then go to(6) Otherwise proceed to (4)

(4) Mark an arbitrary object V isin 119881 and return with it aswell as with 119871 = 119903

(5) Let 119882(119901) = 119895 Calculate the value of the solution119901 = (119901(1) 119901(119899 minus 1) 119901(119899) = 119895) If 119865(119901) gt 119865

lowast thenset 119901lowast = 119901 and 119865

lowast= 119865(119901)

(6) If 119903 gt 0 then drop the 119903th element of 119901 Return with119871 = 119903 minus 1

In the previous description 119882(119901) denotes the set ofunassigned objects as before The objects included in 119881 sube

119882(119901) are considered as candidates to be assigned to theleftmost available position in 119901 At the root node 119881 = 119882(119901)In all other cases except when |119882(119901)| = 1 an object becomesa member of the set 119881 only if it passes two or for longer 119901three tests The third of them the upper bound test amounts

to check if 1198801015840 gt 119865lowast This condition is checked again on

the second and subsequent calls to SelectObject for the samenode of the search tree This is done in Step (1) for eachunmarked object 119894 isin 119881 Such a strategy is reasonable becauseduring the branch-and-bound process the value of 119865lowast mayincrease and the previously satisfied constraint1198801015840 gt 119865

lowast maybecome violated for some 119894 isin 119881 To be able to identify suchobjects the algorithm stores the computed upper bound 119880

1015840

as 119906119894(119901) at Step (3) of SelectObjectA closer look at the procedure SelectObject shows that

two scenarios are possible (1)when this procedure is invokedfollowing the creation of a newnode of the tree (which is doneeither in Step (3) or in Step (6) of BampB) (2)when a call to theprocedure ismade after a backtracking stepThe set119881 initiallyis empty and therefore Steps (2) to (5) of SelectObject becomeactive only in the first of these two scenarios Dependingon the current level of the tree 119903 one of the three cases isconsidered If the current node is the root that is 119903 = 0 thenStep (2) is executed Since in this case the dominance test isnot applicable the algorithm uses the set119881 comprising all theobjects under consideration An arbitrary object selected inStep (4) is then used to create the first branch emanating fromthe root of the tree If 119903 isin [1 119899 minus 2] then the algorithmproceeds to Step (3) The task of this step is to evaluate allpartial solutions that can be obtained from the current partialpermutation119901by assigning an object from119882(119901) to the (119903 + 1)

th position in 119901 In order not to lose at least one optimalsolution it is sufficient to keep only those extensions of 119901 forwhich all the tests were successful Such partial solutions arememorized by the set 119881 One of them is chosen in Step (4)

to be used (in Step (6) of BampB) to generate the first son ofthe current node If however no promising partial solutionhas been found and thus the set 119881 is empty then the currentnode can be discarded and the search continued from itsparent node Finally if 119903 = 119899 minus 1 then Step (5) is executedin which the permutation 119901 is completed by placing theremaining unassigned object in the last position of119901 Its valueis compared with that of the best solution found so far Thenode corresponding to 119901 is a leaf and therefore is fathomedIn the second of the above-mentioned two scenarios (Step(1) of the procedure) the already existing nonempty set 119881is scanned The previously selected objects in 119881 have beenmarked They are ignored For each remaining object 119894 it ischecked whether 119906

119894(119901) gt 119865

lowast If so then 119894 is included inthe set 1198811015840 If the resulting set 1198811015840 is empty then the currentnode of the tree becomes fathomed and the procedure goesto Step (6) Otherwise the procedure chooses a branchingobject and passes it to its caller BampB Our implementationof the algorithm adopts a very simple enumeration strategybranch upon an arbitrary object in 119881

1015840 (or 119881 when Step (4) isperformed)

It is useful to note that the tests in Step (3) are applied inincreasing order of time complexityThe symmetry test is thecheapest of them whereas the upper bound test is the mostexpensive to perform but still effective

6 Computational Results

The main goal of our numerical experiments was to pro-vide some measure of the performance of the developed

Journal of Applied Mathematics 9

BampB algorithm as well as to demonstrate the benefits ofusing the new upper bound and efficient dominance testthroughout the solution process For comparison purposeswe also implemented the better of the two branch-and-bound algorithms of Brusco and Stahl [13] In their studythis algorithm is denoted as BB3 When referring to it inthis paper we will use the same name In order to make thecomparison more fair we decided to start BB3 by invokingthe ITS heuristic Thus the initial solution from which thebranch-and-bound phase of each of the algorithms starts isexpected to be of the same quality

61 Experimental Setup Both our algorithm and BB3 havebeen coded in the C programming language and all the testshave been carried out on a PC with an Intel Core 2 DuoCPU running at 30GHz As a testbed for the algorithmsthree empirical dissimilarity matrices from the literature inaddition to a number of randomly generated ones wereconsidered The random matrices vary in size from 20 to30 objects All off-diagonal entries are drawn uniformly atrandom from the interval [1 100] The matrices of courseare symmetric

We also used three dissimilarity matrices constructedfrom empirical data found in the literature The first matrixis based onMorse code confusion data collected by Rothkopf[26] In his experiment the subjects were presented pairsof Morse code signals and asked to respond whether theythought the signals were the same or differentThese answerswere translated to the entries of the similarity matrix withrows and columns corresponding to the 36 alphanumericcharacters (26 letters in the Latin alphabet and 10 digits in thedecimal number system) We should mention that we foundsmall discrepancies between Morse code similarity matricesgiven in various publications and electronic media We havedecided to take the matrix from [27] where it is presentedwith a reference to Shepard [28] Let thismatrix be denoted by119872 = (119898

119894119895) Its entries are integers from the interval [0 100]

Like in the study of Brusco and Stahl [13] the correspondingsymmetric dissimilarity matrix 119863 = (119889

119894119895) was obtained by

using the formula 119889119894119895= 200 minus 119898

119894119895minus 119898119895119894for all off-diagonal

entries and setting the diagonal of 119863 to zero We consideredtwoUDSP instances constructed usingMorse code confusiondatamdashone defined by the 26 times 26 submatrix of119863 with rowsand columns labeled by the letters and another defined by thefull matrix119863

The second empirical matrix was taken from Groenenand Heiser [29] We denote it by 119866 = (119892

119894119895) Its entry 119892

119894119895

gives the number of citations in journal 119894 to journal 119895 Thusthe rows of 119866 correspond to citing journals and the columnscorrespond to cited journals The dissimilarity matrix 119863 wasderived from 119866 in two steps like in [13] First a symmetricsimilarity matrix 119867 = (ℎ

119894119895) was calculated by using the

formulas ℎ119894119895

= ℎ119895119894

= (119892119894119895sum119899

119897=1l = 119894 119892119894119897) + (119892119895119894sum119899

119897=1119897 = 119895119892119895119897)

119894 119895 = 1 119899 119894 lt 119895 and ℎ119894119894

= 0 119894 = 1 119899 Then 119863

was constructed with entries 119889119894119895

= lfloor100(max119897119896ℎ119897119896

minus ℎ119894119895)rfloor

119894 119895 = 1 119899 119894 = 119895 and 119889119894119894= 0 119894 = 1 119899

The third empirical matrix used in our experimentswas taken from Hubert et al [30] Its entries represent the

dissimilarities between pairs of food items As mentioned byHubert et al [30] this 45 times 45 food-item matrix 119863foodwas constructed based on data provided by Ross andMurphy[31] For our computational tests we considered submatricesof 119863food defined by the first 119899 isin 20 21 35 food itemsAll the dissimilarity matrices we just described as well asthe source code of our algorithm are publicly available athttpwwwproinktultsimgintarasudshtml

Based on the results of preliminary experiments with theBampB algorithm we have fixed the values of its parameters120591 = 10 119877

1= 0 and 119877

2= 5 In the symmetry test119908 was fixed

at 1 The cutoff time for ITS was 1 second

62 Results The first experiment was conducted on a seriesof randomly generated full density dissimilarity matricesThe results of BampB and BB3 for them are summarized inTable 1 The dimension of the matrix 119863 is encoded in theinstance name displayed in the first column The secondcolumn of the table shows for each instance the optimalvalue of the objective function 119865

lowast and in parentheses tworelated metrics The first of them is the raw value of thestress function Φ(119909) The second number in parentheses isthe normalized value of the stress function that is calculatedasΦ(119909)sum

119894lt119895119889119894119895 The next two columns list the results of our

algorithm the number of nodes in the search tree and theCPU time reported in the form hours minutes seconds orminutes seconds or seconds The last two columns give thesemeasures for BB3

As it can be seen from the table BampB is superior to BB3We observe for example that the decrease in CPU time overBB3 is around 40 for the largest three instances Also ouralgorithm builds smaller search trees than BB3

In our second experiment we opt for three empiricaldissimilarity matrices mentioned in Section 61 Table 2 sum-marizes the results of BampB and BB3 on the UDSP instancesdefined by these matrices Its structure is the same as that ofTable 1 In the instance names the letter ldquoMrdquo stands for theMorse codematrix ldquoJrdquo stands for the journal citationsmatrixand ldquoFrdquo stands for the food item data As before the numberfollowing the letter indicates the dimension of the problemmatrix Comparing results in Tables 1 and 2 we find thatfor empirical data the superiority of our algorithm over BB3is more pronounced Basically BampB performs significantlybetter than BB3 for each of the three data sets For examplefor the food-item dissimilarity matrix of size 30 times 30 BampBbuilds almost 10 times smaller search tree and is about 5 timesfaster than BB3

Table 3 presents the results of the performance of BampBon the full Morse code dissimilarity matrix as well as onlarger submatrices of the food-item dissimilarity matrixRunning BB3 on these UDSP instances is too expensive sothis algorithm is not included in the table By analyzing theresults in Tables 2 and 3 we find that for 119894 isin 22 35 theCPU time taken by BampB to solve the problem with the 119894 times 119894

food-item submatrix is between 196 (for 119894 = 30) and 282 (for119894 = 27) percent of that needed to solve the problem with the(119894 minus1)times (119894minus1) food-item submatrix For the submatrix of size35 times 35 the CPU time taken by the algorithm was more than

10 Journal of Applied Mathematics

Table 1 Performance on random problem instances

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

p20 9053452 (1797554 0284) 1264871 44 1526725 56p21 9887394 (1879277 0285) 3304678 106 4169099 145p22 12548324 (2245406 0282) 6210311 205 8017112 290p23 14632006 (2396198 0274) 12979353 450 16624663 1067p24 16211210 (2815859 0294) 31820463 1533 42435561 2555p25 15877026 (3126700 0330) 103208158 6446 137525505 9524p26 20837730 (3471478 0302) 200528002 13354 267603807 20436p27 21464870 (3585958 0311) 287794935 20377 392405143 32365p28 23938264 (3961360 0317) 581602877 44093 812649051 113054p29 28613820 (4537786 0315) 2047228373 250572 3044704554 438570p30 30359134 (4645569 0315) 3555327537 509099 5216804247 855046

Table 2 Performance on empirical dissimilarity matrices

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

M26 180577772 (19448891 0219) 108400003 7432 201264785 16143J28 60653172 (7167033 0249) 2019598593 348006 6780517853 1039066F20 19344116 (1051432 0098) 236990 27 1971840 70F21 22694178 (1229979 0102) 568998 52 4716777 162F22 26496098 (1495949 0110) 1178415 106 9993241 357F23 30853716 (1757681 0116) 2524703 252 24357098 1378F24 35075372 (2174712 0130) 6727692 1001 51593000 3304F25 40091544 (2491002 0134) 14560522 2283 139580891 10014F26 45507780 (2886048 0142) 40465339 6553 298597411 23188F27 51859210 (3336825 0148) 116161004 19298 670492972 55578F28 58712130 (3788642 0153) 253278124 44358 1366152797 203265F29 66198704 (4203981 0156) 530219761 142265 3616257778 532084F30 74310052 (4629353 0157) 972413816 320566 9360343618 1556412

14 days In fact a solution of value 120526370was found by theITS heuristic in less than 1 sThe rest of the time was spent onproving its optimalityWe also see fromTable 3 that theUDSPwith the full Morse code dissimilarity matrix required nearly21 days to solve to proven optimality using our approach

In Table 4 we display optimal solutions for the twolargest UDSP instances we have tried The second and fourthcolumns of the table contain the optimal permutations forthese instances The coordinates for objects calculated using(4) are given in the third and fifth columns The number inparenthesis in the fourth column shows the order in whichthe food items appear within Table 51 in the book by Hubertet al [30] From Table 4 we observe that only a half of thesignals representing digits in theMorse code are placed at theend of the scale Other such signals are positioned betweensignals encoding letters We also see that shorter signals tendto occupy positions at the beginning of the permutation 119901

In Table 5 we evaluate the effectiveness of tests used tofathom partial solutions Computational results are reportedfor a selection of UDSP instances The second third andfourth columns of the table present 100119873sym119873 100119873dom119873and 100119873UB119873 where119873sym (respectively119873dom and119873UB) isthe number of partial permutations that are fathomed due

to symmetry test (respectively due to dominance test anddue to upper bound test) in Step (3) of SelectObject and119873 = 119873sym+119873dom+119873UBThe last three columns give the sameset of statistics for the BB3 algorithm Inspection of Table 5reveals that119873dom is much larger than both119873sym and119873UB Itcan also be seen that119873dom values are consistently higher withBB3 thanwhen using our approach Comparing the other twotests we observe that 119873sym is larger than 119873UB in all casesexcept when applying BampB to the food-item dissimilaritysubmatrices of size 20 times 20 and 25 times 25 Generally we noticethat the upper bound test is most successful for the UDSPinstances defined by the food-item matrix

We end this section with a discussion of two issuesregarding the use of heuristic algorithms for the UDSP Froma practical point of view heuristics of course are preferableto exact methods as they are fast and can be applied to largeproblem instances If carefully designed they are able toproduce solutions of very high quality The development ofexact methods on the other hand is a theoretical avenue forcoping with different problems In the field of optimizationexact algorithms for a problem of interest provide not only asolution but also a certificate of its optimality Therefore oneof possible applications of exact methods is to use them in

Journal of Applied Mathematics 11

Table 3 Performance of BampB on larger empirical dissimilarity matrices

Instance Objective function value Number of nodes TimeF31 82221440 (5194455 0164) 2469803824 857038F32 91389180 (5703241 0166) 5040451368 1944127F33 100367800 (6412395 0174) 14318454604 5524200F34 110065196 (7104054 0180) 33035826208 12936500F35 120526370 (7841564 0185) 82986984742 34256456M36 489111494 (40682956 0230) 243227368848 49451175

Table 4 Optimal solutions for the full Morse code dissimilarity matrix and the 35 times 35 food-item dissimilarity matrix

119894

Morse code full Food item data119901(119894) 119909

119901(119894)119901(119894) 119909

119901(119894)

1 (E) ∙ 000000 orange (3) 0000002 (T) minus 583333 watermelon (2) 0285713 (I) ∙∙ 2477778 apple (1) 0285714 (A) ∙minus 3408333 banana (4) 1285715 (N) minus∙ 4411111 pineapple (5) 1628576 (M) minusminus 5233333 lettuce (6) 21257147 (S) ∙ ∙ ∙ 7030556 broccoli (7) 22314298 (U) ∙ ∙ minus 8313889 carrots (8) 22542869 (R) ∙ minus ∙ 9347222 corn (9) 228857110 (W) ∙ minus minus 10297222 onions (10) 251142911 (H) ∙ ∙ ∙∙ 11444444 potato (11) 299714312 (D) minus ∙ ∙ 12430556 rice (12) 516285713 (K) minus ∙ minus 13152778 spaghetti (19) 589714314 (V) ∙ ∙ ∙minus 14500000 bread (13) 644857115 (5) ∙ ∙ ∙ ∙ ∙ 15244444 bagel (14) 698571416 (4) ∙ ∙ ∙ ∙ minus 15991667 cereal (16) 721142917 (F) ∙ ∙ minus∙ 17075000 oatmeal (15) 725142918 (L) ∙ minus ∙∙ 18225000 pancake (18) 763714319 (B) minus ∙ ∙∙ 18952778 muffin (17) 778000020 (X) minus ∙ ∙minus 19955556 crackers (20) 889714321 (6) minus ∙ ∙ ∙ ∙ 20566667 granola bar (21) 930000022 (3) ∙ ∙ ∙ minus minus 22013889 pretzels (22) 1007428623 (C) minus ∙ minus∙ 22947222 nuts (24) 1043714324 (Y) minus ∙ minusminus 23841667 popcorn (23) 1076285725 (7) minus minus ∙ ∙ ∙ 24930556 potato chips (25) 1124857126 (Z) minus minus ∙∙ 25488889 doughnuts (26) 1200857127 (Q) minus minus ∙minus 26438889 pizza (31) 1265142928 (P) ∙ minus minus∙ 27083333 cookies (27) 1368857129 (J) ∙ minus minusminus 28294444 chocolate bar (29) 1394000030 (G) minus minus ∙ 29247222 cake (28) 1410571431 (O) minus minus minus 30022222 pie (30) 1435428632 (2) ∙ ∙ minus minus minus 31050000 ice cream (32) 1520000033 (8) minus minus minus ∙ ∙ 32036111 yogurt (33) 1571142934 (1) ∙ minus minus minus minus 33350000 butter (34) 1618000035 (9) minus minus minus minus ∙ 34186111 cheese (35) 1650857136 (0) minus minus minus minus minus 35027778

12 Journal of Applied Mathematics

Table 5 Relative performance of the tests

Instance B amp B BB3Sym Dom UB Sym Dom UB

p20 1158a 8824b 018c 1111a 8873b 016c

p25 886a 9087b 027c 848a 9143b 009c

p30 701a 9284 b 015c 629a 9364b 007c

M26 1140a 8847 b 013c 1044a 8952b 004c

J28 855a 8923 b 222c 568 a 9426b 006c

F20 801a 7563 b 1636c 1325a 8542b 133c

F25 704a 8127 b 1169c 1031a 8939b 030c

F30 895a 8391b 714c 774a 9216b 010c

F35 853a 8723b 424c mdasha mdashb mdashc

M36 882a 9114b 004c mdasha mdashb mdashc

aPercentage of partial solutions fathomed due to symmetry test

bPercentage of partial solutions fathomed due to dominance testcPercentage of partial solutions fathomed due to bound test

the process of assessment of heuristic techniques Specificallyoptimal values can serve as a reference point with which theresults of heuristics can be compared However obtaining anoptimality certificate in many cases including the UDSP isvery time consuming In order to see the time differencesbetween finding an optimal solution for UDSP with the helpof heuristics and proving its optimality (as it is done bythe branch-and-bound method described in this paper) werun the two heuristic algorithms iterated tabu search andsimulated annealing (SA) of Brusco [19] on all the probleminstances used in the main experiment For two randominstances namely p21 and p25 SA was trapped in a localmaximumTherefore we have enhanced our implementationof SA by allowing it to run in a multi start mode Thisvariation of SA located an optimal solution for the above-mentioned instances in the third and respectively secondrestarts Both tested heuristics have proven to be very fastIn particular ITS took in total only 295 seconds to findthe global optima for all 30 instances we experimented withPerforming the same task by SA required 544 seconds Suchtimes are in huge contrast to what is seen in Tables 1ndash3 Basically the long computation times recorded in thesetables are the price we have to pay for obtaining not only asolution but also a guarantee of its optimality We did notperform additional computational experiments with ITS andSA because investigation of heuristics for theUDSP is outsidethe scope of this paper

Another issue concerns the use of heuristics for providinginitial solutions for the branch-and-bound technique Wehave employed the iterated tabu search procedure for thispurpose Our choice was influenced by the fact that wecould adopt in ITS the same formulas as those used inthe dominance test which is a key component of ourapproach Obviously there are other heuristics one canapply in place of ITS As it follows from the discussionin the preceding paragraph a good candidate for this isthe simulated annealing algorithm of Brusco [19] Supposethat BampB is run twice on the same instance from differentinitial solutions both of which are optimal Then it shouldbe clear that in both cases the same number of tree nodes

is explored This essentially means that any fast heuristiccapable of providing optimal solutions is perfect to be usedin the initialization step of BampB An alternative strategyis to use in this step randomly generated permutationsThe effects of this simple strategy are mixed For examplefor problem instances F25 and F27 the computation timeincreased from 1483 and respectively 11698 seconds (seeTable 2) to 4133 and respectively 20625 seconds Howeverfor example for p25 and M26 a somewhat different picturewas observed In the case of random initial solution BampBtook 4075 and 4687 seconds for these instances respectivelyThe computation times in Table 1 for p25 and Table 2 forM26are only marginally shorter One possible explanation of thislies in the fact that as Table 5 indicates the percentage ofpartial solutions fathomed due to bound test for both p25 andM26 is quite low However in general it is advantageous touse a heuristic for obtaining an initial solution for the branch-and-bound algorithm especially because this solution can beproduced at a negligible cost with respect to the rest of thecomputation

7 Conclusions

In this paper we have presented a branch-and-bound algo-rithm for the least-squares unidimensional scaling problemThe algorithm incorporates a LAP-based upper bound testas well as a dominance test which allows reducing theredundancy in the search process drastically The results ofcomputational experiments indicate that the algorithm canbe used to obtain provably optimal solutions for randomlygenerated dissimilarity matrices of size up to 30 times 30 andfor empirical dissimilarity matrices of size up to 35 times 35In particular for the first time the UDSP instance with the36 times 36 Morse code dissimilarity matrix has been solved toguaranteed optimality Another important instance is definedby the 45 times 45 food-item dissimilarity matrix 119863food It is agreat challenge to the optimization community to develop anexact algorithm that will solve this instance Our branch-and-bound algorithm is limited to submatrices of 119863food of size

Journal of Applied Mathematics 13

around 35 times 35 We also remark that optimal solutions forall UDSP instances used in our experiments can be found byapplying heuristic algorithms such as iterated tabu search andsimulated annealing proceduresThe total time taken by eachof these procedures to reach the global optima is only a fewseconds An advantage of the branch-and-bound algorithmis its ability to certify the optimality of the solution obtainedin a rigorous manner

Before closing there are two more points worth of men-tion First we proposed a simple yet effective mechanismfor making a decision about when it is useful to apply abound test to the current partial solution and when sucha test most probably will fail to discard this solution Forthat purpose a sample of partial solutions is evaluated in thepreparation step of BampBWe believe that similar mechanismsmay be operative in branch-and-bound algorithms for otherdifficult combinatorial optimization problems Second wedeveloped an efficient procedure for exploring the pairwiseinterchange neighborhood of a solution in the search spaceThis procedure can be useful on its own in the design ofheuristic algorithms for unidimensional scaling especiallythose based on the local search principle

Appendix

Derivation of (20)We will deduce an expression for 120575

119903119896 1 ⩽ 119896 lt 119903 ⩽ 119899

assuming that 119901 is a solution to the whole problem thatis 119901 isin Π Relocating the object 119901(119903) from position 119903 toposition 119896 gives the permutation 119901

1015840= (119901(1) 119901(119896 minus

1) 119901(119903) 119901(119896) 119901(119899)) By using the definition of 119865 we canwrite

120575119903119896

= 119865 (1199011015840) minus 119865 (119901)

= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119896

119889119901(119903)119901(119897)

)

2

minus(

119903minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

+

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

+2119889119901(119903)119901(119897)

)

2

minus

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

(A1)

Let us denote the first two terms in (A1) by 1205751015840 and the

remaining two terms by 12057510158401015840 = sum119903minus1

119897=11989612057510158401015840

119897 It is easy to see that 1205751015840

stems frommoving the object 119901(119903) to position 119896 and 12057510158401015840 stemsfrom shifting objects 119901(119897) 119897 = 119896 119903 minus 1 by one position to

the right To get a simpler expression for 1205751015840 we perform thefollowing manipulations

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

(

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2(

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A2)

Continuing we get

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A3)

Finally

1205751015840= minus 4

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

= minus 4(119891119901(119903)

minus

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

(119878119901(119903)

minus 119891119901(119903)

)

= 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

((

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

) minus 2119891119901(119903)

+ 119878119901(119903)

)

(A4)

14 Journal of Applied Mathematics

Similarly for 12057510158401015840119897we have

12057510158401015840

119897= (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

+ 4119889119901(119903)119901(119897)

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

+ 41198892

119901(119903)119901(119897)minus (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

= 4119889119901(119903)119901(119897)

(119891119901(119897)

minus 119878119901(119897)

+ 119891119901(119897)

) + 41198892

119901(119903)119901(119897)

= 4119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(A5)

By summing up (A4) and (A5) for 119897 = 119896 119903 minus 1 we obtain(20)

References

[1] I Borg and P J F Groenen Modern Multidimensional ScalingTheory and Appplications Springer New York NY USA 2ndedition 2005

[2] L Hubert P Arabie and J Meulman The Structural Repre-sentation of Proximity Matrices with MATLAB vol 19 SIAMPhiladelphia Pa USA 2006

[3] S Meisel and D Mattfeld ldquoSynergies of Operations Researchand Data Miningrdquo European Journal of Operational Researchvol 206 no 1 pp 1ndash10 2010

[4] G Caraux and S Pinloche ldquoPermutMatrix a graphical envi-ronment to arrange gene expression profiles in optimal linearorderrdquo Bioinformatics vol 21 no 7 pp 1280ndash1281 2005

[5] M K Ng E S Fung and H-Y Liao ldquoLinkage disequilibriummap by unidimensional nonnegative scalingrdquo in Proceedings ofthe 1st International Symposium on Optimization and SystemsBiology (OSB rsquo07) pp 302ndash308 Beijing China 2007

[6] P I Armstrong N A Fouad J Rounds and L HubertldquoQuantifying and interpreting group differences in interestprofilesrdquo Journal of Career Assessment vol 18 no 2 pp 115ndash1322010

[7] L Hubert and D Steinley ldquoAgreement among supreme courtjustices categorical versus continuous representationrdquo SIAMNews vol 38 no 7 2005

[8] A Dahabiah J Puentes and B Solaiman ldquoDigestive casebasemining based on possibility theory and linear unidimensionalscalingrdquo in Proceedings of the 8th WSEAS International Con-ference on Artificial Intelligence Knowledge Engineering andData Bases (AIKED rsquo09) pp 218ndash223 World Scientific andEngineering Academy and Society (WSEAS) Stevens PointWis USA 2009

[9] A Dahabiah J Puentes and B Solaiman ldquoPossibilistic patternrecognition in a digestive database for mining imperfect datardquoWSEASTransactions on Systems vol 8 no 2 pp 229ndash240 2009

[10] Y Ge Y Leung and J Ma ldquoUnidimensional scaling classifierand its application to remotely sensed datardquo inProceedings of theIEEE International Geoscience and Remote Sensing Symposium(IGARSS rsquo05) pp 3841ndash3844 July 2005

[11] J H Bihn M Verhaagh M Brandle and R Brandl ldquoDosecondary forests act as refuges for old growth forest animalsRecovery of ant diversity in the Atlantic forest of BrazilrdquoBiological Conservation vol 141 no 3 pp 733ndash743 2008

[12] D Defays ldquoA short note on a method of seriationrdquo BritishJournal of Mathematical and Statistical Psychology vol 31 no1 pp 49ndash53 1978

[13] M J Brusco and S Stahl ldquoOptimal least-squares unidimen-sional scaling improved branch-and-bound procedures andcomparison to dynamic programmingrdquo Psychometrika vol 70no 2 pp 253ndash270 2005

[14] L J Hubert and P Arabie ldquoUnidimensional scaling and com-binatorial optimizationrdquo in Multidimensional Data Analysis Jde Leeuw W J Heiser J Meulman and F Critchley Eds pp181ndash196 DSWO Press Leiden The Netherlands 1986

[15] L J Hubert P Arabie and J J Meulman ldquoLinear unidimen-sional scaling in the 119871

2-norm basic optimization methods

usingMATLABrdquo Journal of Classification vol 19 no 2 pp 303ndash328 2002

[16] M J Brusco and S Stahl Branch-and-Bound Applicationsin Combinatorial Data Analysis Springer Science + BusinessMedia New York NY USA 2005

[17] V Pliner ldquoMetric unidimensional scaling and global optimiza-tionrdquo Journal of Classification vol 13 no 1 pp 3ndash18 1996

[18] A Murillo J F Vera and W J Heiser ldquoA permutation-translation simulated annealing algorithm for 119871

1and 119871

2uni-

dimensional scalingrdquo Journal of Classification vol 22 no 1 pp119ndash138 2005

[19] M J Brusco ldquoOn the performance of simulated annealing forlarge-scale 119871

2unidimensional scalingrdquo Journal of Classification

vol 23 no 2 pp 255ndash268 2006[20] M J Brusco H F Kohn and S Stahl ldquoHeuristic imple-

mentation of dynamic programming for matrix permutationproblems in combinatorial data analysisrdquo Psychometrika vol73 no 3 pp 503ndash522 2008

[21] G Palubeckis ldquoIterated tabu search for the maximum diversityproblemrdquo Applied Mathematics and Computation vol 189 no1 pp 371ndash383 2007

[22] G Palubeckis ldquoA new bounding procedure and an improvedexact algorithm for the Max-2-SAT problemrdquo Applied Mathe-matics and Computation vol 215 no 3 pp 1106ndash1117 2009

[23] J Kluabwang D Puangdownreong and S Sujitjorn ldquoMultipathadaptive tabu search for a vehicle control problemrdquo Journal ofApplied Mathematics vol 2012 Article ID 731623 20 pages2012

[24] N Sarasiri K Suthamno and S Sujitjorn ldquoBacterial foraging-tabu search metaheuristics for identification of nonlinear fric-tion modelrdquo Journal of Applied Mathematics vol 2012 ArticleID 238563 23 pages 2012

[25] A Jouglet and J Carlier ldquoDominance rules in combinato-rial optimization problemsrdquo European Journal of OperationalResearch vol 212 no 3 pp 433ndash444 2011

[26] E Z Rothkopf ldquoA measure of stimulus similarity and errors insome paired-associate learning tasksrdquo Journal of ExperimentalPsychology vol 53 no 2 pp 94ndash101 1957

[27] I Duntsch and G Gediga ldquoModal-Style Operators in Qual-itative Data Analysisrdquo Tech Rep CS-02-15 Department ofComputer Science Brock University St Catharines OntarioCanada 2002

[28] R N Shepard ldquoAnalysis of proximities as a technique for thestudy of information processing in manrdquoHuman factors vol 5pp 33ndash48 1963

Journal of Applied Mathematics 15

[29] P J F Groenen and W J Heiser ldquoThe tunneling method forglobal optimization in multidimensional scalingrdquo Psychome-trika vol 61 no 3 pp 529ndash550 1996

[30] L Hubert P Arabie and JMeulmanCombinatorial Data Anal-ysis Optimization by Dynamic Programming SIAM Philadel-phia Pa USA 2001

[31] B H Ross and G L Murphy ldquoFood for thought cross-classification and category organization in a complex real-worlddomainrdquo Cognitive Psychology vol 38 no 4 pp 495ndash553 1999

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 9: Research Article An Improved Exact Algorithm for Least-Squares Unidimensional …downloads.hindawi.com/journals/jam/2013/890589.pdf · 2019-07-31 · by a 36×36 Morse code dissimilarity

Journal of Applied Mathematics 9

BampB algorithm as well as to demonstrate the benefits ofusing the new upper bound and efficient dominance testthroughout the solution process For comparison purposeswe also implemented the better of the two branch-and-bound algorithms of Brusco and Stahl [13] In their studythis algorithm is denoted as BB3 When referring to it inthis paper we will use the same name In order to make thecomparison more fair we decided to start BB3 by invokingthe ITS heuristic Thus the initial solution from which thebranch-and-bound phase of each of the algorithms starts isexpected to be of the same quality

61 Experimental Setup Both our algorithm and BB3 havebeen coded in the C programming language and all the testshave been carried out on a PC with an Intel Core 2 DuoCPU running at 30GHz As a testbed for the algorithmsthree empirical dissimilarity matrices from the literature inaddition to a number of randomly generated ones wereconsidered The random matrices vary in size from 20 to30 objects All off-diagonal entries are drawn uniformly atrandom from the interval [1 100] The matrices of courseare symmetric

We also used three dissimilarity matrices constructedfrom empirical data found in the literature The first matrixis based onMorse code confusion data collected by Rothkopf[26] In his experiment the subjects were presented pairsof Morse code signals and asked to respond whether theythought the signals were the same or differentThese answerswere translated to the entries of the similarity matrix withrows and columns corresponding to the 36 alphanumericcharacters (26 letters in the Latin alphabet and 10 digits in thedecimal number system) We should mention that we foundsmall discrepancies between Morse code similarity matricesgiven in various publications and electronic media We havedecided to take the matrix from [27] where it is presentedwith a reference to Shepard [28] Let thismatrix be denoted by119872 = (119898

119894119895) Its entries are integers from the interval [0 100]

Like in the study of Brusco and Stahl [13] the correspondingsymmetric dissimilarity matrix 119863 = (119889

119894119895) was obtained by

using the formula 119889119894119895= 200 minus 119898

119894119895minus 119898119895119894for all off-diagonal

entries and setting the diagonal of 119863 to zero We consideredtwoUDSP instances constructed usingMorse code confusiondatamdashone defined by the 26 times 26 submatrix of119863 with rowsand columns labeled by the letters and another defined by thefull matrix119863

The second empirical matrix was taken from Groenenand Heiser [29] We denote it by 119866 = (119892

119894119895) Its entry 119892

119894119895

gives the number of citations in journal 119894 to journal 119895 Thusthe rows of 119866 correspond to citing journals and the columnscorrespond to cited journals The dissimilarity matrix 119863 wasderived from 119866 in two steps like in [13] First a symmetricsimilarity matrix 119867 = (ℎ

119894119895) was calculated by using the

formulas ℎ119894119895

= ℎ119895119894

= (119892119894119895sum119899

119897=1l = 119894 119892119894119897) + (119892119895119894sum119899

119897=1119897 = 119895119892119895119897)

119894 119895 = 1 119899 119894 lt 119895 and ℎ119894119894

= 0 119894 = 1 119899 Then 119863

was constructed with entries 119889119894119895

= lfloor100(max119897119896ℎ119897119896

minus ℎ119894119895)rfloor

119894 119895 = 1 119899 119894 = 119895 and 119889119894119894= 0 119894 = 1 119899

The third empirical matrix used in our experimentswas taken from Hubert et al [30] Its entries represent the

dissimilarities between pairs of food items As mentioned byHubert et al [30] this 45 times 45 food-item matrix 119863foodwas constructed based on data provided by Ross andMurphy[31] For our computational tests we considered submatricesof 119863food defined by the first 119899 isin 20 21 35 food itemsAll the dissimilarity matrices we just described as well asthe source code of our algorithm are publicly available athttpwwwproinktultsimgintarasudshtml

Based on the results of preliminary experiments with theBampB algorithm we have fixed the values of its parameters120591 = 10 119877

1= 0 and 119877

2= 5 In the symmetry test119908 was fixed

at 1 The cutoff time for ITS was 1 second

62 Results The first experiment was conducted on a seriesof randomly generated full density dissimilarity matricesThe results of BampB and BB3 for them are summarized inTable 1 The dimension of the matrix 119863 is encoded in theinstance name displayed in the first column The secondcolumn of the table shows for each instance the optimalvalue of the objective function 119865

lowast and in parentheses tworelated metrics The first of them is the raw value of thestress function Φ(119909) The second number in parentheses isthe normalized value of the stress function that is calculatedasΦ(119909)sum

119894lt119895119889119894119895 The next two columns list the results of our

algorithm the number of nodes in the search tree and theCPU time reported in the form hours minutes seconds orminutes seconds or seconds The last two columns give thesemeasures for BB3

As it can be seen from the table BampB is superior to BB3We observe for example that the decrease in CPU time overBB3 is around 40 for the largest three instances Also ouralgorithm builds smaller search trees than BB3

In our second experiment we opt for three empiricaldissimilarity matrices mentioned in Section 61 Table 2 sum-marizes the results of BampB and BB3 on the UDSP instancesdefined by these matrices Its structure is the same as that ofTable 1 In the instance names the letter ldquoMrdquo stands for theMorse codematrix ldquoJrdquo stands for the journal citationsmatrixand ldquoFrdquo stands for the food item data As before the numberfollowing the letter indicates the dimension of the problemmatrix Comparing results in Tables 1 and 2 we find thatfor empirical data the superiority of our algorithm over BB3is more pronounced Basically BampB performs significantlybetter than BB3 for each of the three data sets For examplefor the food-item dissimilarity matrix of size 30 times 30 BampBbuilds almost 10 times smaller search tree and is about 5 timesfaster than BB3

Table 3 presents the results of the performance of BampBon the full Morse code dissimilarity matrix as well as onlarger submatrices of the food-item dissimilarity matrixRunning BB3 on these UDSP instances is too expensive sothis algorithm is not included in the table By analyzing theresults in Tables 2 and 3 we find that for 119894 isin 22 35 theCPU time taken by BampB to solve the problem with the 119894 times 119894

food-item submatrix is between 196 (for 119894 = 30) and 282 (for119894 = 27) percent of that needed to solve the problem with the(119894 minus1)times (119894minus1) food-item submatrix For the submatrix of size35 times 35 the CPU time taken by the algorithm was more than

10 Journal of Applied Mathematics

Table 1 Performance on random problem instances

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

p20 9053452 (1797554 0284) 1264871 44 1526725 56p21 9887394 (1879277 0285) 3304678 106 4169099 145p22 12548324 (2245406 0282) 6210311 205 8017112 290p23 14632006 (2396198 0274) 12979353 450 16624663 1067p24 16211210 (2815859 0294) 31820463 1533 42435561 2555p25 15877026 (3126700 0330) 103208158 6446 137525505 9524p26 20837730 (3471478 0302) 200528002 13354 267603807 20436p27 21464870 (3585958 0311) 287794935 20377 392405143 32365p28 23938264 (3961360 0317) 581602877 44093 812649051 113054p29 28613820 (4537786 0315) 2047228373 250572 3044704554 438570p30 30359134 (4645569 0315) 3555327537 509099 5216804247 855046

Table 2 Performance on empirical dissimilarity matrices

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

M26 180577772 (19448891 0219) 108400003 7432 201264785 16143J28 60653172 (7167033 0249) 2019598593 348006 6780517853 1039066F20 19344116 (1051432 0098) 236990 27 1971840 70F21 22694178 (1229979 0102) 568998 52 4716777 162F22 26496098 (1495949 0110) 1178415 106 9993241 357F23 30853716 (1757681 0116) 2524703 252 24357098 1378F24 35075372 (2174712 0130) 6727692 1001 51593000 3304F25 40091544 (2491002 0134) 14560522 2283 139580891 10014F26 45507780 (2886048 0142) 40465339 6553 298597411 23188F27 51859210 (3336825 0148) 116161004 19298 670492972 55578F28 58712130 (3788642 0153) 253278124 44358 1366152797 203265F29 66198704 (4203981 0156) 530219761 142265 3616257778 532084F30 74310052 (4629353 0157) 972413816 320566 9360343618 1556412

14 days In fact a solution of value 120526370was found by theITS heuristic in less than 1 sThe rest of the time was spent onproving its optimalityWe also see fromTable 3 that theUDSPwith the full Morse code dissimilarity matrix required nearly21 days to solve to proven optimality using our approach

In Table 4 we display optimal solutions for the twolargest UDSP instances we have tried The second and fourthcolumns of the table contain the optimal permutations forthese instances The coordinates for objects calculated using(4) are given in the third and fifth columns The number inparenthesis in the fourth column shows the order in whichthe food items appear within Table 51 in the book by Hubertet al [30] From Table 4 we observe that only a half of thesignals representing digits in theMorse code are placed at theend of the scale Other such signals are positioned betweensignals encoding letters We also see that shorter signals tendto occupy positions at the beginning of the permutation 119901

In Table 5 we evaluate the effectiveness of tests used tofathom partial solutions Computational results are reportedfor a selection of UDSP instances The second third andfourth columns of the table present 100119873sym119873 100119873dom119873and 100119873UB119873 where119873sym (respectively119873dom and119873UB) isthe number of partial permutations that are fathomed due

to symmetry test (respectively due to dominance test anddue to upper bound test) in Step (3) of SelectObject and119873 = 119873sym+119873dom+119873UBThe last three columns give the sameset of statistics for the BB3 algorithm Inspection of Table 5reveals that119873dom is much larger than both119873sym and119873UB Itcan also be seen that119873dom values are consistently higher withBB3 thanwhen using our approach Comparing the other twotests we observe that 119873sym is larger than 119873UB in all casesexcept when applying BampB to the food-item dissimilaritysubmatrices of size 20 times 20 and 25 times 25 Generally we noticethat the upper bound test is most successful for the UDSPinstances defined by the food-item matrix

We end this section with a discussion of two issuesregarding the use of heuristic algorithms for the UDSP Froma practical point of view heuristics of course are preferableto exact methods as they are fast and can be applied to largeproblem instances If carefully designed they are able toproduce solutions of very high quality The development ofexact methods on the other hand is a theoretical avenue forcoping with different problems In the field of optimizationexact algorithms for a problem of interest provide not only asolution but also a certificate of its optimality Therefore oneof possible applications of exact methods is to use them in

Journal of Applied Mathematics 11

Table 3 Performance of BampB on larger empirical dissimilarity matrices

Instance Objective function value Number of nodes TimeF31 82221440 (5194455 0164) 2469803824 857038F32 91389180 (5703241 0166) 5040451368 1944127F33 100367800 (6412395 0174) 14318454604 5524200F34 110065196 (7104054 0180) 33035826208 12936500F35 120526370 (7841564 0185) 82986984742 34256456M36 489111494 (40682956 0230) 243227368848 49451175

Table 4 Optimal solutions for the full Morse code dissimilarity matrix and the 35 times 35 food-item dissimilarity matrix

119894

Morse code full Food item data119901(119894) 119909

119901(119894)119901(119894) 119909

119901(119894)

1 (E) ∙ 000000 orange (3) 0000002 (T) minus 583333 watermelon (2) 0285713 (I) ∙∙ 2477778 apple (1) 0285714 (A) ∙minus 3408333 banana (4) 1285715 (N) minus∙ 4411111 pineapple (5) 1628576 (M) minusminus 5233333 lettuce (6) 21257147 (S) ∙ ∙ ∙ 7030556 broccoli (7) 22314298 (U) ∙ ∙ minus 8313889 carrots (8) 22542869 (R) ∙ minus ∙ 9347222 corn (9) 228857110 (W) ∙ minus minus 10297222 onions (10) 251142911 (H) ∙ ∙ ∙∙ 11444444 potato (11) 299714312 (D) minus ∙ ∙ 12430556 rice (12) 516285713 (K) minus ∙ minus 13152778 spaghetti (19) 589714314 (V) ∙ ∙ ∙minus 14500000 bread (13) 644857115 (5) ∙ ∙ ∙ ∙ ∙ 15244444 bagel (14) 698571416 (4) ∙ ∙ ∙ ∙ minus 15991667 cereal (16) 721142917 (F) ∙ ∙ minus∙ 17075000 oatmeal (15) 725142918 (L) ∙ minus ∙∙ 18225000 pancake (18) 763714319 (B) minus ∙ ∙∙ 18952778 muffin (17) 778000020 (X) minus ∙ ∙minus 19955556 crackers (20) 889714321 (6) minus ∙ ∙ ∙ ∙ 20566667 granola bar (21) 930000022 (3) ∙ ∙ ∙ minus minus 22013889 pretzels (22) 1007428623 (C) minus ∙ minus∙ 22947222 nuts (24) 1043714324 (Y) minus ∙ minusminus 23841667 popcorn (23) 1076285725 (7) minus minus ∙ ∙ ∙ 24930556 potato chips (25) 1124857126 (Z) minus minus ∙∙ 25488889 doughnuts (26) 1200857127 (Q) minus minus ∙minus 26438889 pizza (31) 1265142928 (P) ∙ minus minus∙ 27083333 cookies (27) 1368857129 (J) ∙ minus minusminus 28294444 chocolate bar (29) 1394000030 (G) minus minus ∙ 29247222 cake (28) 1410571431 (O) minus minus minus 30022222 pie (30) 1435428632 (2) ∙ ∙ minus minus minus 31050000 ice cream (32) 1520000033 (8) minus minus minus ∙ ∙ 32036111 yogurt (33) 1571142934 (1) ∙ minus minus minus minus 33350000 butter (34) 1618000035 (9) minus minus minus minus ∙ 34186111 cheese (35) 1650857136 (0) minus minus minus minus minus 35027778

12 Journal of Applied Mathematics

Table 5 Relative performance of the tests

Instance B amp B BB3Sym Dom UB Sym Dom UB

p20 1158a 8824b 018c 1111a 8873b 016c

p25 886a 9087b 027c 848a 9143b 009c

p30 701a 9284 b 015c 629a 9364b 007c

M26 1140a 8847 b 013c 1044a 8952b 004c

J28 855a 8923 b 222c 568 a 9426b 006c

F20 801a 7563 b 1636c 1325a 8542b 133c

F25 704a 8127 b 1169c 1031a 8939b 030c

F30 895a 8391b 714c 774a 9216b 010c

F35 853a 8723b 424c mdasha mdashb mdashc

M36 882a 9114b 004c mdasha mdashb mdashc

aPercentage of partial solutions fathomed due to symmetry test

bPercentage of partial solutions fathomed due to dominance testcPercentage of partial solutions fathomed due to bound test

the process of assessment of heuristic techniques Specificallyoptimal values can serve as a reference point with which theresults of heuristics can be compared However obtaining anoptimality certificate in many cases including the UDSP isvery time consuming In order to see the time differencesbetween finding an optimal solution for UDSP with the helpof heuristics and proving its optimality (as it is done bythe branch-and-bound method described in this paper) werun the two heuristic algorithms iterated tabu search andsimulated annealing (SA) of Brusco [19] on all the probleminstances used in the main experiment For two randominstances namely p21 and p25 SA was trapped in a localmaximumTherefore we have enhanced our implementationof SA by allowing it to run in a multi start mode Thisvariation of SA located an optimal solution for the above-mentioned instances in the third and respectively secondrestarts Both tested heuristics have proven to be very fastIn particular ITS took in total only 295 seconds to findthe global optima for all 30 instances we experimented withPerforming the same task by SA required 544 seconds Suchtimes are in huge contrast to what is seen in Tables 1ndash3 Basically the long computation times recorded in thesetables are the price we have to pay for obtaining not only asolution but also a guarantee of its optimality We did notperform additional computational experiments with ITS andSA because investigation of heuristics for theUDSP is outsidethe scope of this paper

Another issue concerns the use of heuristics for providinginitial solutions for the branch-and-bound technique Wehave employed the iterated tabu search procedure for thispurpose Our choice was influenced by the fact that wecould adopt in ITS the same formulas as those used inthe dominance test which is a key component of ourapproach Obviously there are other heuristics one canapply in place of ITS As it follows from the discussionin the preceding paragraph a good candidate for this isthe simulated annealing algorithm of Brusco [19] Supposethat BampB is run twice on the same instance from differentinitial solutions both of which are optimal Then it shouldbe clear that in both cases the same number of tree nodes

is explored This essentially means that any fast heuristiccapable of providing optimal solutions is perfect to be usedin the initialization step of BampB An alternative strategyis to use in this step randomly generated permutationsThe effects of this simple strategy are mixed For examplefor problem instances F25 and F27 the computation timeincreased from 1483 and respectively 11698 seconds (seeTable 2) to 4133 and respectively 20625 seconds Howeverfor example for p25 and M26 a somewhat different picturewas observed In the case of random initial solution BampBtook 4075 and 4687 seconds for these instances respectivelyThe computation times in Table 1 for p25 and Table 2 forM26are only marginally shorter One possible explanation of thislies in the fact that as Table 5 indicates the percentage ofpartial solutions fathomed due to bound test for both p25 andM26 is quite low However in general it is advantageous touse a heuristic for obtaining an initial solution for the branch-and-bound algorithm especially because this solution can beproduced at a negligible cost with respect to the rest of thecomputation

7 Conclusions

In this paper we have presented a branch-and-bound algo-rithm for the least-squares unidimensional scaling problemThe algorithm incorporates a LAP-based upper bound testas well as a dominance test which allows reducing theredundancy in the search process drastically The results ofcomputational experiments indicate that the algorithm canbe used to obtain provably optimal solutions for randomlygenerated dissimilarity matrices of size up to 30 times 30 andfor empirical dissimilarity matrices of size up to 35 times 35In particular for the first time the UDSP instance with the36 times 36 Morse code dissimilarity matrix has been solved toguaranteed optimality Another important instance is definedby the 45 times 45 food-item dissimilarity matrix 119863food It is agreat challenge to the optimization community to develop anexact algorithm that will solve this instance Our branch-and-bound algorithm is limited to submatrices of 119863food of size

Journal of Applied Mathematics 13

around 35 times 35 We also remark that optimal solutions forall UDSP instances used in our experiments can be found byapplying heuristic algorithms such as iterated tabu search andsimulated annealing proceduresThe total time taken by eachof these procedures to reach the global optima is only a fewseconds An advantage of the branch-and-bound algorithmis its ability to certify the optimality of the solution obtainedin a rigorous manner

Before closing there are two more points worth of men-tion First we proposed a simple yet effective mechanismfor making a decision about when it is useful to apply abound test to the current partial solution and when sucha test most probably will fail to discard this solution Forthat purpose a sample of partial solutions is evaluated in thepreparation step of BampBWe believe that similar mechanismsmay be operative in branch-and-bound algorithms for otherdifficult combinatorial optimization problems Second wedeveloped an efficient procedure for exploring the pairwiseinterchange neighborhood of a solution in the search spaceThis procedure can be useful on its own in the design ofheuristic algorithms for unidimensional scaling especiallythose based on the local search principle

Appendix

Derivation of (20)We will deduce an expression for 120575

119903119896 1 ⩽ 119896 lt 119903 ⩽ 119899

assuming that 119901 is a solution to the whole problem thatis 119901 isin Π Relocating the object 119901(119903) from position 119903 toposition 119896 gives the permutation 119901

1015840= (119901(1) 119901(119896 minus

1) 119901(119903) 119901(119896) 119901(119899)) By using the definition of 119865 we canwrite

120575119903119896

= 119865 (1199011015840) minus 119865 (119901)

= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119896

119889119901(119903)119901(119897)

)

2

minus(

119903minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

+

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

+2119889119901(119903)119901(119897)

)

2

minus

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

(A1)

Let us denote the first two terms in (A1) by 1205751015840 and the

remaining two terms by 12057510158401015840 = sum119903minus1

119897=11989612057510158401015840

119897 It is easy to see that 1205751015840

stems frommoving the object 119901(119903) to position 119896 and 12057510158401015840 stemsfrom shifting objects 119901(119897) 119897 = 119896 119903 minus 1 by one position to

the right To get a simpler expression for 1205751015840 we perform thefollowing manipulations

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

(

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2(

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A2)

Continuing we get

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A3)

Finally

1205751015840= minus 4

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

= minus 4(119891119901(119903)

minus

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

(119878119901(119903)

minus 119891119901(119903)

)

= 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

((

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

) minus 2119891119901(119903)

+ 119878119901(119903)

)

(A4)

14 Journal of Applied Mathematics

Similarly for 12057510158401015840119897we have

12057510158401015840

119897= (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

+ 4119889119901(119903)119901(119897)

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

+ 41198892

119901(119903)119901(119897)minus (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

= 4119889119901(119903)119901(119897)

(119891119901(119897)

minus 119878119901(119897)

+ 119891119901(119897)

) + 41198892

119901(119903)119901(119897)

= 4119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(A5)

By summing up (A4) and (A5) for 119897 = 119896 119903 minus 1 we obtain(20)

References

[1] I Borg and P J F Groenen Modern Multidimensional ScalingTheory and Appplications Springer New York NY USA 2ndedition 2005

[2] L Hubert P Arabie and J Meulman The Structural Repre-sentation of Proximity Matrices with MATLAB vol 19 SIAMPhiladelphia Pa USA 2006

[3] S Meisel and D Mattfeld ldquoSynergies of Operations Researchand Data Miningrdquo European Journal of Operational Researchvol 206 no 1 pp 1ndash10 2010

[4] G Caraux and S Pinloche ldquoPermutMatrix a graphical envi-ronment to arrange gene expression profiles in optimal linearorderrdquo Bioinformatics vol 21 no 7 pp 1280ndash1281 2005

[5] M K Ng E S Fung and H-Y Liao ldquoLinkage disequilibriummap by unidimensional nonnegative scalingrdquo in Proceedings ofthe 1st International Symposium on Optimization and SystemsBiology (OSB rsquo07) pp 302ndash308 Beijing China 2007

[6] P I Armstrong N A Fouad J Rounds and L HubertldquoQuantifying and interpreting group differences in interestprofilesrdquo Journal of Career Assessment vol 18 no 2 pp 115ndash1322010

[7] L Hubert and D Steinley ldquoAgreement among supreme courtjustices categorical versus continuous representationrdquo SIAMNews vol 38 no 7 2005

[8] A Dahabiah J Puentes and B Solaiman ldquoDigestive casebasemining based on possibility theory and linear unidimensionalscalingrdquo in Proceedings of the 8th WSEAS International Con-ference on Artificial Intelligence Knowledge Engineering andData Bases (AIKED rsquo09) pp 218ndash223 World Scientific andEngineering Academy and Society (WSEAS) Stevens PointWis USA 2009

[9] A Dahabiah J Puentes and B Solaiman ldquoPossibilistic patternrecognition in a digestive database for mining imperfect datardquoWSEASTransactions on Systems vol 8 no 2 pp 229ndash240 2009

[10] Y Ge Y Leung and J Ma ldquoUnidimensional scaling classifierand its application to remotely sensed datardquo inProceedings of theIEEE International Geoscience and Remote Sensing Symposium(IGARSS rsquo05) pp 3841ndash3844 July 2005

[11] J H Bihn M Verhaagh M Brandle and R Brandl ldquoDosecondary forests act as refuges for old growth forest animalsRecovery of ant diversity in the Atlantic forest of BrazilrdquoBiological Conservation vol 141 no 3 pp 733ndash743 2008

[12] D Defays ldquoA short note on a method of seriationrdquo BritishJournal of Mathematical and Statistical Psychology vol 31 no1 pp 49ndash53 1978

[13] M J Brusco and S Stahl ldquoOptimal least-squares unidimen-sional scaling improved branch-and-bound procedures andcomparison to dynamic programmingrdquo Psychometrika vol 70no 2 pp 253ndash270 2005

[14] L J Hubert and P Arabie ldquoUnidimensional scaling and com-binatorial optimizationrdquo in Multidimensional Data Analysis Jde Leeuw W J Heiser J Meulman and F Critchley Eds pp181ndash196 DSWO Press Leiden The Netherlands 1986

[15] L J Hubert P Arabie and J J Meulman ldquoLinear unidimen-sional scaling in the 119871

2-norm basic optimization methods

usingMATLABrdquo Journal of Classification vol 19 no 2 pp 303ndash328 2002

[16] M J Brusco and S Stahl Branch-and-Bound Applicationsin Combinatorial Data Analysis Springer Science + BusinessMedia New York NY USA 2005

[17] V Pliner ldquoMetric unidimensional scaling and global optimiza-tionrdquo Journal of Classification vol 13 no 1 pp 3ndash18 1996

[18] A Murillo J F Vera and W J Heiser ldquoA permutation-translation simulated annealing algorithm for 119871

1and 119871

2uni-

dimensional scalingrdquo Journal of Classification vol 22 no 1 pp119ndash138 2005

[19] M J Brusco ldquoOn the performance of simulated annealing forlarge-scale 119871

2unidimensional scalingrdquo Journal of Classification

vol 23 no 2 pp 255ndash268 2006[20] M J Brusco H F Kohn and S Stahl ldquoHeuristic imple-

mentation of dynamic programming for matrix permutationproblems in combinatorial data analysisrdquo Psychometrika vol73 no 3 pp 503ndash522 2008

[21] G Palubeckis ldquoIterated tabu search for the maximum diversityproblemrdquo Applied Mathematics and Computation vol 189 no1 pp 371ndash383 2007

[22] G Palubeckis ldquoA new bounding procedure and an improvedexact algorithm for the Max-2-SAT problemrdquo Applied Mathe-matics and Computation vol 215 no 3 pp 1106ndash1117 2009

[23] J Kluabwang D Puangdownreong and S Sujitjorn ldquoMultipathadaptive tabu search for a vehicle control problemrdquo Journal ofApplied Mathematics vol 2012 Article ID 731623 20 pages2012

[24] N Sarasiri K Suthamno and S Sujitjorn ldquoBacterial foraging-tabu search metaheuristics for identification of nonlinear fric-tion modelrdquo Journal of Applied Mathematics vol 2012 ArticleID 238563 23 pages 2012

[25] A Jouglet and J Carlier ldquoDominance rules in combinato-rial optimization problemsrdquo European Journal of OperationalResearch vol 212 no 3 pp 433ndash444 2011

[26] E Z Rothkopf ldquoA measure of stimulus similarity and errors insome paired-associate learning tasksrdquo Journal of ExperimentalPsychology vol 53 no 2 pp 94ndash101 1957

[27] I Duntsch and G Gediga ldquoModal-Style Operators in Qual-itative Data Analysisrdquo Tech Rep CS-02-15 Department ofComputer Science Brock University St Catharines OntarioCanada 2002

[28] R N Shepard ldquoAnalysis of proximities as a technique for thestudy of information processing in manrdquoHuman factors vol 5pp 33ndash48 1963

Journal of Applied Mathematics 15

[29] P J F Groenen and W J Heiser ldquoThe tunneling method forglobal optimization in multidimensional scalingrdquo Psychome-trika vol 61 no 3 pp 529ndash550 1996

[30] L Hubert P Arabie and JMeulmanCombinatorial Data Anal-ysis Optimization by Dynamic Programming SIAM Philadel-phia Pa USA 2001

[31] B H Ross and G L Murphy ldquoFood for thought cross-classification and category organization in a complex real-worlddomainrdquo Cognitive Psychology vol 38 no 4 pp 495ndash553 1999

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 10: Research Article An Improved Exact Algorithm for Least-Squares Unidimensional …downloads.hindawi.com/journals/jam/2013/890589.pdf · 2019-07-31 · by a 36×36 Morse code dissimilarity

10 Journal of Applied Mathematics

Table 1 Performance on random problem instances

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

p20 9053452 (1797554 0284) 1264871 44 1526725 56p21 9887394 (1879277 0285) 3304678 106 4169099 145p22 12548324 (2245406 0282) 6210311 205 8017112 290p23 14632006 (2396198 0274) 12979353 450 16624663 1067p24 16211210 (2815859 0294) 31820463 1533 42435561 2555p25 15877026 (3126700 0330) 103208158 6446 137525505 9524p26 20837730 (3471478 0302) 200528002 13354 267603807 20436p27 21464870 (3585958 0311) 287794935 20377 392405143 32365p28 23938264 (3961360 0317) 581602877 44093 812649051 113054p29 28613820 (4537786 0315) 2047228373 250572 3044704554 438570p30 30359134 (4645569 0315) 3555327537 509099 5216804247 855046

Table 2 Performance on empirical dissimilarity matrices

Inst Objective function value BampB BB3Number of nodes Time Number of nodes Time

M26 180577772 (19448891 0219) 108400003 7432 201264785 16143J28 60653172 (7167033 0249) 2019598593 348006 6780517853 1039066F20 19344116 (1051432 0098) 236990 27 1971840 70F21 22694178 (1229979 0102) 568998 52 4716777 162F22 26496098 (1495949 0110) 1178415 106 9993241 357F23 30853716 (1757681 0116) 2524703 252 24357098 1378F24 35075372 (2174712 0130) 6727692 1001 51593000 3304F25 40091544 (2491002 0134) 14560522 2283 139580891 10014F26 45507780 (2886048 0142) 40465339 6553 298597411 23188F27 51859210 (3336825 0148) 116161004 19298 670492972 55578F28 58712130 (3788642 0153) 253278124 44358 1366152797 203265F29 66198704 (4203981 0156) 530219761 142265 3616257778 532084F30 74310052 (4629353 0157) 972413816 320566 9360343618 1556412

14 days In fact a solution of value 120526370was found by theITS heuristic in less than 1 sThe rest of the time was spent onproving its optimalityWe also see fromTable 3 that theUDSPwith the full Morse code dissimilarity matrix required nearly21 days to solve to proven optimality using our approach

In Table 4 we display optimal solutions for the twolargest UDSP instances we have tried The second and fourthcolumns of the table contain the optimal permutations forthese instances The coordinates for objects calculated using(4) are given in the third and fifth columns The number inparenthesis in the fourth column shows the order in whichthe food items appear within Table 51 in the book by Hubertet al [30] From Table 4 we observe that only a half of thesignals representing digits in theMorse code are placed at theend of the scale Other such signals are positioned betweensignals encoding letters We also see that shorter signals tendto occupy positions at the beginning of the permutation 119901

In Table 5 we evaluate the effectiveness of tests used tofathom partial solutions Computational results are reportedfor a selection of UDSP instances The second third andfourth columns of the table present 100119873sym119873 100119873dom119873and 100119873UB119873 where119873sym (respectively119873dom and119873UB) isthe number of partial permutations that are fathomed due

to symmetry test (respectively due to dominance test anddue to upper bound test) in Step (3) of SelectObject and119873 = 119873sym+119873dom+119873UBThe last three columns give the sameset of statistics for the BB3 algorithm Inspection of Table 5reveals that119873dom is much larger than both119873sym and119873UB Itcan also be seen that119873dom values are consistently higher withBB3 thanwhen using our approach Comparing the other twotests we observe that 119873sym is larger than 119873UB in all casesexcept when applying BampB to the food-item dissimilaritysubmatrices of size 20 times 20 and 25 times 25 Generally we noticethat the upper bound test is most successful for the UDSPinstances defined by the food-item matrix

We end this section with a discussion of two issuesregarding the use of heuristic algorithms for the UDSP Froma practical point of view heuristics of course are preferableto exact methods as they are fast and can be applied to largeproblem instances If carefully designed they are able toproduce solutions of very high quality The development ofexact methods on the other hand is a theoretical avenue forcoping with different problems In the field of optimizationexact algorithms for a problem of interest provide not only asolution but also a certificate of its optimality Therefore oneof possible applications of exact methods is to use them in

Journal of Applied Mathematics 11

Table 3 Performance of BampB on larger empirical dissimilarity matrices

Instance Objective function value Number of nodes TimeF31 82221440 (5194455 0164) 2469803824 857038F32 91389180 (5703241 0166) 5040451368 1944127F33 100367800 (6412395 0174) 14318454604 5524200F34 110065196 (7104054 0180) 33035826208 12936500F35 120526370 (7841564 0185) 82986984742 34256456M36 489111494 (40682956 0230) 243227368848 49451175

Table 4 Optimal solutions for the full Morse code dissimilarity matrix and the 35 times 35 food-item dissimilarity matrix

119894

Morse code full Food item data119901(119894) 119909

119901(119894)119901(119894) 119909

119901(119894)

1 (E) ∙ 000000 orange (3) 0000002 (T) minus 583333 watermelon (2) 0285713 (I) ∙∙ 2477778 apple (1) 0285714 (A) ∙minus 3408333 banana (4) 1285715 (N) minus∙ 4411111 pineapple (5) 1628576 (M) minusminus 5233333 lettuce (6) 21257147 (S) ∙ ∙ ∙ 7030556 broccoli (7) 22314298 (U) ∙ ∙ minus 8313889 carrots (8) 22542869 (R) ∙ minus ∙ 9347222 corn (9) 228857110 (W) ∙ minus minus 10297222 onions (10) 251142911 (H) ∙ ∙ ∙∙ 11444444 potato (11) 299714312 (D) minus ∙ ∙ 12430556 rice (12) 516285713 (K) minus ∙ minus 13152778 spaghetti (19) 589714314 (V) ∙ ∙ ∙minus 14500000 bread (13) 644857115 (5) ∙ ∙ ∙ ∙ ∙ 15244444 bagel (14) 698571416 (4) ∙ ∙ ∙ ∙ minus 15991667 cereal (16) 721142917 (F) ∙ ∙ minus∙ 17075000 oatmeal (15) 725142918 (L) ∙ minus ∙∙ 18225000 pancake (18) 763714319 (B) minus ∙ ∙∙ 18952778 muffin (17) 778000020 (X) minus ∙ ∙minus 19955556 crackers (20) 889714321 (6) minus ∙ ∙ ∙ ∙ 20566667 granola bar (21) 930000022 (3) ∙ ∙ ∙ minus minus 22013889 pretzels (22) 1007428623 (C) minus ∙ minus∙ 22947222 nuts (24) 1043714324 (Y) minus ∙ minusminus 23841667 popcorn (23) 1076285725 (7) minus minus ∙ ∙ ∙ 24930556 potato chips (25) 1124857126 (Z) minus minus ∙∙ 25488889 doughnuts (26) 1200857127 (Q) minus minus ∙minus 26438889 pizza (31) 1265142928 (P) ∙ minus minus∙ 27083333 cookies (27) 1368857129 (J) ∙ minus minusminus 28294444 chocolate bar (29) 1394000030 (G) minus minus ∙ 29247222 cake (28) 1410571431 (O) minus minus minus 30022222 pie (30) 1435428632 (2) ∙ ∙ minus minus minus 31050000 ice cream (32) 1520000033 (8) minus minus minus ∙ ∙ 32036111 yogurt (33) 1571142934 (1) ∙ minus minus minus minus 33350000 butter (34) 1618000035 (9) minus minus minus minus ∙ 34186111 cheese (35) 1650857136 (0) minus minus minus minus minus 35027778

12 Journal of Applied Mathematics

Table 5 Relative performance of the tests

Instance B amp B BB3Sym Dom UB Sym Dom UB

p20 1158a 8824b 018c 1111a 8873b 016c

p25 886a 9087b 027c 848a 9143b 009c

p30 701a 9284 b 015c 629a 9364b 007c

M26 1140a 8847 b 013c 1044a 8952b 004c

J28 855a 8923 b 222c 568 a 9426b 006c

F20 801a 7563 b 1636c 1325a 8542b 133c

F25 704a 8127 b 1169c 1031a 8939b 030c

F30 895a 8391b 714c 774a 9216b 010c

F35 853a 8723b 424c mdasha mdashb mdashc

M36 882a 9114b 004c mdasha mdashb mdashc

aPercentage of partial solutions fathomed due to symmetry test

bPercentage of partial solutions fathomed due to dominance testcPercentage of partial solutions fathomed due to bound test

the process of assessment of heuristic techniques Specificallyoptimal values can serve as a reference point with which theresults of heuristics can be compared However obtaining anoptimality certificate in many cases including the UDSP isvery time consuming In order to see the time differencesbetween finding an optimal solution for UDSP with the helpof heuristics and proving its optimality (as it is done bythe branch-and-bound method described in this paper) werun the two heuristic algorithms iterated tabu search andsimulated annealing (SA) of Brusco [19] on all the probleminstances used in the main experiment For two randominstances namely p21 and p25 SA was trapped in a localmaximumTherefore we have enhanced our implementationof SA by allowing it to run in a multi start mode Thisvariation of SA located an optimal solution for the above-mentioned instances in the third and respectively secondrestarts Both tested heuristics have proven to be very fastIn particular ITS took in total only 295 seconds to findthe global optima for all 30 instances we experimented withPerforming the same task by SA required 544 seconds Suchtimes are in huge contrast to what is seen in Tables 1ndash3 Basically the long computation times recorded in thesetables are the price we have to pay for obtaining not only asolution but also a guarantee of its optimality We did notperform additional computational experiments with ITS andSA because investigation of heuristics for theUDSP is outsidethe scope of this paper

Another issue concerns the use of heuristics for providinginitial solutions for the branch-and-bound technique Wehave employed the iterated tabu search procedure for thispurpose Our choice was influenced by the fact that wecould adopt in ITS the same formulas as those used inthe dominance test which is a key component of ourapproach Obviously there are other heuristics one canapply in place of ITS As it follows from the discussionin the preceding paragraph a good candidate for this isthe simulated annealing algorithm of Brusco [19] Supposethat BampB is run twice on the same instance from differentinitial solutions both of which are optimal Then it shouldbe clear that in both cases the same number of tree nodes

is explored This essentially means that any fast heuristiccapable of providing optimal solutions is perfect to be usedin the initialization step of BampB An alternative strategyis to use in this step randomly generated permutationsThe effects of this simple strategy are mixed For examplefor problem instances F25 and F27 the computation timeincreased from 1483 and respectively 11698 seconds (seeTable 2) to 4133 and respectively 20625 seconds Howeverfor example for p25 and M26 a somewhat different picturewas observed In the case of random initial solution BampBtook 4075 and 4687 seconds for these instances respectivelyThe computation times in Table 1 for p25 and Table 2 forM26are only marginally shorter One possible explanation of thislies in the fact that as Table 5 indicates the percentage ofpartial solutions fathomed due to bound test for both p25 andM26 is quite low However in general it is advantageous touse a heuristic for obtaining an initial solution for the branch-and-bound algorithm especially because this solution can beproduced at a negligible cost with respect to the rest of thecomputation

7 Conclusions

In this paper we have presented a branch-and-bound algo-rithm for the least-squares unidimensional scaling problemThe algorithm incorporates a LAP-based upper bound testas well as a dominance test which allows reducing theredundancy in the search process drastically The results ofcomputational experiments indicate that the algorithm canbe used to obtain provably optimal solutions for randomlygenerated dissimilarity matrices of size up to 30 times 30 andfor empirical dissimilarity matrices of size up to 35 times 35In particular for the first time the UDSP instance with the36 times 36 Morse code dissimilarity matrix has been solved toguaranteed optimality Another important instance is definedby the 45 times 45 food-item dissimilarity matrix 119863food It is agreat challenge to the optimization community to develop anexact algorithm that will solve this instance Our branch-and-bound algorithm is limited to submatrices of 119863food of size

Journal of Applied Mathematics 13

around 35 times 35 We also remark that optimal solutions forall UDSP instances used in our experiments can be found byapplying heuristic algorithms such as iterated tabu search andsimulated annealing proceduresThe total time taken by eachof these procedures to reach the global optima is only a fewseconds An advantage of the branch-and-bound algorithmis its ability to certify the optimality of the solution obtainedin a rigorous manner

Before closing there are two more points worth of men-tion First we proposed a simple yet effective mechanismfor making a decision about when it is useful to apply abound test to the current partial solution and when sucha test most probably will fail to discard this solution Forthat purpose a sample of partial solutions is evaluated in thepreparation step of BampBWe believe that similar mechanismsmay be operative in branch-and-bound algorithms for otherdifficult combinatorial optimization problems Second wedeveloped an efficient procedure for exploring the pairwiseinterchange neighborhood of a solution in the search spaceThis procedure can be useful on its own in the design ofheuristic algorithms for unidimensional scaling especiallythose based on the local search principle

Appendix

Derivation of (20)We will deduce an expression for 120575

119903119896 1 ⩽ 119896 lt 119903 ⩽ 119899

assuming that 119901 is a solution to the whole problem thatis 119901 isin Π Relocating the object 119901(119903) from position 119903 toposition 119896 gives the permutation 119901

1015840= (119901(1) 119901(119896 minus

1) 119901(119903) 119901(119896) 119901(119899)) By using the definition of 119865 we canwrite

120575119903119896

= 119865 (1199011015840) minus 119865 (119901)

= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119896

119889119901(119903)119901(119897)

)

2

minus(

119903minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

+

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

+2119889119901(119903)119901(119897)

)

2

minus

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

(A1)

Let us denote the first two terms in (A1) by 1205751015840 and the

remaining two terms by 12057510158401015840 = sum119903minus1

119897=11989612057510158401015840

119897 It is easy to see that 1205751015840

stems frommoving the object 119901(119903) to position 119896 and 12057510158401015840 stemsfrom shifting objects 119901(119897) 119897 = 119896 119903 minus 1 by one position to

the right To get a simpler expression for 1205751015840 we perform thefollowing manipulations

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

(

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2(

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A2)

Continuing we get

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A3)

Finally

1205751015840= minus 4

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

= minus 4(119891119901(119903)

minus

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

(119878119901(119903)

minus 119891119901(119903)

)

= 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

((

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

) minus 2119891119901(119903)

+ 119878119901(119903)

)

(A4)

14 Journal of Applied Mathematics

Similarly for 12057510158401015840119897we have

12057510158401015840

119897= (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

+ 4119889119901(119903)119901(119897)

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

+ 41198892

119901(119903)119901(119897)minus (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

= 4119889119901(119903)119901(119897)

(119891119901(119897)

minus 119878119901(119897)

+ 119891119901(119897)

) + 41198892

119901(119903)119901(119897)

= 4119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(A5)

By summing up (A4) and (A5) for 119897 = 119896 119903 minus 1 we obtain(20)

References

[1] I Borg and P J F Groenen Modern Multidimensional ScalingTheory and Appplications Springer New York NY USA 2ndedition 2005

[2] L Hubert P Arabie and J Meulman The Structural Repre-sentation of Proximity Matrices with MATLAB vol 19 SIAMPhiladelphia Pa USA 2006

[3] S Meisel and D Mattfeld ldquoSynergies of Operations Researchand Data Miningrdquo European Journal of Operational Researchvol 206 no 1 pp 1ndash10 2010

[4] G Caraux and S Pinloche ldquoPermutMatrix a graphical envi-ronment to arrange gene expression profiles in optimal linearorderrdquo Bioinformatics vol 21 no 7 pp 1280ndash1281 2005

[5] M K Ng E S Fung and H-Y Liao ldquoLinkage disequilibriummap by unidimensional nonnegative scalingrdquo in Proceedings ofthe 1st International Symposium on Optimization and SystemsBiology (OSB rsquo07) pp 302ndash308 Beijing China 2007

[6] P I Armstrong N A Fouad J Rounds and L HubertldquoQuantifying and interpreting group differences in interestprofilesrdquo Journal of Career Assessment vol 18 no 2 pp 115ndash1322010

[7] L Hubert and D Steinley ldquoAgreement among supreme courtjustices categorical versus continuous representationrdquo SIAMNews vol 38 no 7 2005

[8] A Dahabiah J Puentes and B Solaiman ldquoDigestive casebasemining based on possibility theory and linear unidimensionalscalingrdquo in Proceedings of the 8th WSEAS International Con-ference on Artificial Intelligence Knowledge Engineering andData Bases (AIKED rsquo09) pp 218ndash223 World Scientific andEngineering Academy and Society (WSEAS) Stevens PointWis USA 2009

[9] A Dahabiah J Puentes and B Solaiman ldquoPossibilistic patternrecognition in a digestive database for mining imperfect datardquoWSEASTransactions on Systems vol 8 no 2 pp 229ndash240 2009

[10] Y Ge Y Leung and J Ma ldquoUnidimensional scaling classifierand its application to remotely sensed datardquo inProceedings of theIEEE International Geoscience and Remote Sensing Symposium(IGARSS rsquo05) pp 3841ndash3844 July 2005

[11] J H Bihn M Verhaagh M Brandle and R Brandl ldquoDosecondary forests act as refuges for old growth forest animalsRecovery of ant diversity in the Atlantic forest of BrazilrdquoBiological Conservation vol 141 no 3 pp 733ndash743 2008

[12] D Defays ldquoA short note on a method of seriationrdquo BritishJournal of Mathematical and Statistical Psychology vol 31 no1 pp 49ndash53 1978

[13] M J Brusco and S Stahl ldquoOptimal least-squares unidimen-sional scaling improved branch-and-bound procedures andcomparison to dynamic programmingrdquo Psychometrika vol 70no 2 pp 253ndash270 2005

[14] L J Hubert and P Arabie ldquoUnidimensional scaling and com-binatorial optimizationrdquo in Multidimensional Data Analysis Jde Leeuw W J Heiser J Meulman and F Critchley Eds pp181ndash196 DSWO Press Leiden The Netherlands 1986

[15] L J Hubert P Arabie and J J Meulman ldquoLinear unidimen-sional scaling in the 119871

2-norm basic optimization methods

usingMATLABrdquo Journal of Classification vol 19 no 2 pp 303ndash328 2002

[16] M J Brusco and S Stahl Branch-and-Bound Applicationsin Combinatorial Data Analysis Springer Science + BusinessMedia New York NY USA 2005

[17] V Pliner ldquoMetric unidimensional scaling and global optimiza-tionrdquo Journal of Classification vol 13 no 1 pp 3ndash18 1996

[18] A Murillo J F Vera and W J Heiser ldquoA permutation-translation simulated annealing algorithm for 119871

1and 119871

2uni-

dimensional scalingrdquo Journal of Classification vol 22 no 1 pp119ndash138 2005

[19] M J Brusco ldquoOn the performance of simulated annealing forlarge-scale 119871

2unidimensional scalingrdquo Journal of Classification

vol 23 no 2 pp 255ndash268 2006[20] M J Brusco H F Kohn and S Stahl ldquoHeuristic imple-

mentation of dynamic programming for matrix permutationproblems in combinatorial data analysisrdquo Psychometrika vol73 no 3 pp 503ndash522 2008

[21] G Palubeckis ldquoIterated tabu search for the maximum diversityproblemrdquo Applied Mathematics and Computation vol 189 no1 pp 371ndash383 2007

[22] G Palubeckis ldquoA new bounding procedure and an improvedexact algorithm for the Max-2-SAT problemrdquo Applied Mathe-matics and Computation vol 215 no 3 pp 1106ndash1117 2009

[23] J Kluabwang D Puangdownreong and S Sujitjorn ldquoMultipathadaptive tabu search for a vehicle control problemrdquo Journal ofApplied Mathematics vol 2012 Article ID 731623 20 pages2012

[24] N Sarasiri K Suthamno and S Sujitjorn ldquoBacterial foraging-tabu search metaheuristics for identification of nonlinear fric-tion modelrdquo Journal of Applied Mathematics vol 2012 ArticleID 238563 23 pages 2012

[25] A Jouglet and J Carlier ldquoDominance rules in combinato-rial optimization problemsrdquo European Journal of OperationalResearch vol 212 no 3 pp 433ndash444 2011

[26] E Z Rothkopf ldquoA measure of stimulus similarity and errors insome paired-associate learning tasksrdquo Journal of ExperimentalPsychology vol 53 no 2 pp 94ndash101 1957

[27] I Duntsch and G Gediga ldquoModal-Style Operators in Qual-itative Data Analysisrdquo Tech Rep CS-02-15 Department ofComputer Science Brock University St Catharines OntarioCanada 2002

[28] R N Shepard ldquoAnalysis of proximities as a technique for thestudy of information processing in manrdquoHuman factors vol 5pp 33ndash48 1963

Journal of Applied Mathematics 15

[29] P J F Groenen and W J Heiser ldquoThe tunneling method forglobal optimization in multidimensional scalingrdquo Psychome-trika vol 61 no 3 pp 529ndash550 1996

[30] L Hubert P Arabie and JMeulmanCombinatorial Data Anal-ysis Optimization by Dynamic Programming SIAM Philadel-phia Pa USA 2001

[31] B H Ross and G L Murphy ldquoFood for thought cross-classification and category organization in a complex real-worlddomainrdquo Cognitive Psychology vol 38 no 4 pp 495ndash553 1999

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 11: Research Article An Improved Exact Algorithm for Least-Squares Unidimensional …downloads.hindawi.com/journals/jam/2013/890589.pdf · 2019-07-31 · by a 36×36 Morse code dissimilarity

Journal of Applied Mathematics 11

Table 3 Performance of BampB on larger empirical dissimilarity matrices

Instance Objective function value Number of nodes TimeF31 82221440 (5194455 0164) 2469803824 857038F32 91389180 (5703241 0166) 5040451368 1944127F33 100367800 (6412395 0174) 14318454604 5524200F34 110065196 (7104054 0180) 33035826208 12936500F35 120526370 (7841564 0185) 82986984742 34256456M36 489111494 (40682956 0230) 243227368848 49451175

Table 4 Optimal solutions for the full Morse code dissimilarity matrix and the 35 times 35 food-item dissimilarity matrix

119894

Morse code full Food item data119901(119894) 119909

119901(119894)119901(119894) 119909

119901(119894)

1 (E) ∙ 000000 orange (3) 0000002 (T) minus 583333 watermelon (2) 0285713 (I) ∙∙ 2477778 apple (1) 0285714 (A) ∙minus 3408333 banana (4) 1285715 (N) minus∙ 4411111 pineapple (5) 1628576 (M) minusminus 5233333 lettuce (6) 21257147 (S) ∙ ∙ ∙ 7030556 broccoli (7) 22314298 (U) ∙ ∙ minus 8313889 carrots (8) 22542869 (R) ∙ minus ∙ 9347222 corn (9) 228857110 (W) ∙ minus minus 10297222 onions (10) 251142911 (H) ∙ ∙ ∙∙ 11444444 potato (11) 299714312 (D) minus ∙ ∙ 12430556 rice (12) 516285713 (K) minus ∙ minus 13152778 spaghetti (19) 589714314 (V) ∙ ∙ ∙minus 14500000 bread (13) 644857115 (5) ∙ ∙ ∙ ∙ ∙ 15244444 bagel (14) 698571416 (4) ∙ ∙ ∙ ∙ minus 15991667 cereal (16) 721142917 (F) ∙ ∙ minus∙ 17075000 oatmeal (15) 725142918 (L) ∙ minus ∙∙ 18225000 pancake (18) 763714319 (B) minus ∙ ∙∙ 18952778 muffin (17) 778000020 (X) minus ∙ ∙minus 19955556 crackers (20) 889714321 (6) minus ∙ ∙ ∙ ∙ 20566667 granola bar (21) 930000022 (3) ∙ ∙ ∙ minus minus 22013889 pretzels (22) 1007428623 (C) minus ∙ minus∙ 22947222 nuts (24) 1043714324 (Y) minus ∙ minusminus 23841667 popcorn (23) 1076285725 (7) minus minus ∙ ∙ ∙ 24930556 potato chips (25) 1124857126 (Z) minus minus ∙∙ 25488889 doughnuts (26) 1200857127 (Q) minus minus ∙minus 26438889 pizza (31) 1265142928 (P) ∙ minus minus∙ 27083333 cookies (27) 1368857129 (J) ∙ minus minusminus 28294444 chocolate bar (29) 1394000030 (G) minus minus ∙ 29247222 cake (28) 1410571431 (O) minus minus minus 30022222 pie (30) 1435428632 (2) ∙ ∙ minus minus minus 31050000 ice cream (32) 1520000033 (8) minus minus minus ∙ ∙ 32036111 yogurt (33) 1571142934 (1) ∙ minus minus minus minus 33350000 butter (34) 1618000035 (9) minus minus minus minus ∙ 34186111 cheese (35) 1650857136 (0) minus minus minus minus minus 35027778

12 Journal of Applied Mathematics

Table 5 Relative performance of the tests

Instance B amp B BB3Sym Dom UB Sym Dom UB

p20 1158a 8824b 018c 1111a 8873b 016c

p25 886a 9087b 027c 848a 9143b 009c

p30 701a 9284 b 015c 629a 9364b 007c

M26 1140a 8847 b 013c 1044a 8952b 004c

J28 855a 8923 b 222c 568 a 9426b 006c

F20 801a 7563 b 1636c 1325a 8542b 133c

F25 704a 8127 b 1169c 1031a 8939b 030c

F30 895a 8391b 714c 774a 9216b 010c

F35 853a 8723b 424c mdasha mdashb mdashc

M36 882a 9114b 004c mdasha mdashb mdashc

aPercentage of partial solutions fathomed due to symmetry test

bPercentage of partial solutions fathomed due to dominance testcPercentage of partial solutions fathomed due to bound test

the process of assessment of heuristic techniques Specificallyoptimal values can serve as a reference point with which theresults of heuristics can be compared However obtaining anoptimality certificate in many cases including the UDSP isvery time consuming In order to see the time differencesbetween finding an optimal solution for UDSP with the helpof heuristics and proving its optimality (as it is done bythe branch-and-bound method described in this paper) werun the two heuristic algorithms iterated tabu search andsimulated annealing (SA) of Brusco [19] on all the probleminstances used in the main experiment For two randominstances namely p21 and p25 SA was trapped in a localmaximumTherefore we have enhanced our implementationof SA by allowing it to run in a multi start mode Thisvariation of SA located an optimal solution for the above-mentioned instances in the third and respectively secondrestarts Both tested heuristics have proven to be very fastIn particular ITS took in total only 295 seconds to findthe global optima for all 30 instances we experimented withPerforming the same task by SA required 544 seconds Suchtimes are in huge contrast to what is seen in Tables 1ndash3 Basically the long computation times recorded in thesetables are the price we have to pay for obtaining not only asolution but also a guarantee of its optimality We did notperform additional computational experiments with ITS andSA because investigation of heuristics for theUDSP is outsidethe scope of this paper

Another issue concerns the use of heuristics for providinginitial solutions for the branch-and-bound technique Wehave employed the iterated tabu search procedure for thispurpose Our choice was influenced by the fact that wecould adopt in ITS the same formulas as those used inthe dominance test which is a key component of ourapproach Obviously there are other heuristics one canapply in place of ITS As it follows from the discussionin the preceding paragraph a good candidate for this isthe simulated annealing algorithm of Brusco [19] Supposethat BampB is run twice on the same instance from differentinitial solutions both of which are optimal Then it shouldbe clear that in both cases the same number of tree nodes

is explored This essentially means that any fast heuristiccapable of providing optimal solutions is perfect to be usedin the initialization step of BampB An alternative strategyis to use in this step randomly generated permutationsThe effects of this simple strategy are mixed For examplefor problem instances F25 and F27 the computation timeincreased from 1483 and respectively 11698 seconds (seeTable 2) to 4133 and respectively 20625 seconds Howeverfor example for p25 and M26 a somewhat different picturewas observed In the case of random initial solution BampBtook 4075 and 4687 seconds for these instances respectivelyThe computation times in Table 1 for p25 and Table 2 forM26are only marginally shorter One possible explanation of thislies in the fact that as Table 5 indicates the percentage ofpartial solutions fathomed due to bound test for both p25 andM26 is quite low However in general it is advantageous touse a heuristic for obtaining an initial solution for the branch-and-bound algorithm especially because this solution can beproduced at a negligible cost with respect to the rest of thecomputation

7 Conclusions

In this paper we have presented a branch-and-bound algo-rithm for the least-squares unidimensional scaling problemThe algorithm incorporates a LAP-based upper bound testas well as a dominance test which allows reducing theredundancy in the search process drastically The results ofcomputational experiments indicate that the algorithm canbe used to obtain provably optimal solutions for randomlygenerated dissimilarity matrices of size up to 30 times 30 andfor empirical dissimilarity matrices of size up to 35 times 35In particular for the first time the UDSP instance with the36 times 36 Morse code dissimilarity matrix has been solved toguaranteed optimality Another important instance is definedby the 45 times 45 food-item dissimilarity matrix 119863food It is agreat challenge to the optimization community to develop anexact algorithm that will solve this instance Our branch-and-bound algorithm is limited to submatrices of 119863food of size

Journal of Applied Mathematics 13

around 35 times 35 We also remark that optimal solutions forall UDSP instances used in our experiments can be found byapplying heuristic algorithms such as iterated tabu search andsimulated annealing proceduresThe total time taken by eachof these procedures to reach the global optima is only a fewseconds An advantage of the branch-and-bound algorithmis its ability to certify the optimality of the solution obtainedin a rigorous manner

Before closing there are two more points worth of men-tion First we proposed a simple yet effective mechanismfor making a decision about when it is useful to apply abound test to the current partial solution and when sucha test most probably will fail to discard this solution Forthat purpose a sample of partial solutions is evaluated in thepreparation step of BampBWe believe that similar mechanismsmay be operative in branch-and-bound algorithms for otherdifficult combinatorial optimization problems Second wedeveloped an efficient procedure for exploring the pairwiseinterchange neighborhood of a solution in the search spaceThis procedure can be useful on its own in the design ofheuristic algorithms for unidimensional scaling especiallythose based on the local search principle

Appendix

Derivation of (20)We will deduce an expression for 120575

119903119896 1 ⩽ 119896 lt 119903 ⩽ 119899

assuming that 119901 is a solution to the whole problem thatis 119901 isin Π Relocating the object 119901(119903) from position 119903 toposition 119896 gives the permutation 119901

1015840= (119901(1) 119901(119896 minus

1) 119901(119903) 119901(119896) 119901(119899)) By using the definition of 119865 we canwrite

120575119903119896

= 119865 (1199011015840) minus 119865 (119901)

= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119896

119889119901(119903)119901(119897)

)

2

minus(

119903minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

+

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

+2119889119901(119903)119901(119897)

)

2

minus

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

(A1)

Let us denote the first two terms in (A1) by 1205751015840 and the

remaining two terms by 12057510158401015840 = sum119903minus1

119897=11989612057510158401015840

119897 It is easy to see that 1205751015840

stems frommoving the object 119901(119903) to position 119896 and 12057510158401015840 stemsfrom shifting objects 119901(119897) 119897 = 119896 119903 minus 1 by one position to

the right To get a simpler expression for 1205751015840 we perform thefollowing manipulations

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

(

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2(

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A2)

Continuing we get

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A3)

Finally

1205751015840= minus 4

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

= minus 4(119891119901(119903)

minus

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

(119878119901(119903)

minus 119891119901(119903)

)

= 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

((

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

) minus 2119891119901(119903)

+ 119878119901(119903)

)

(A4)

14 Journal of Applied Mathematics

Similarly for 12057510158401015840119897we have

12057510158401015840

119897= (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

+ 4119889119901(119903)119901(119897)

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

+ 41198892

119901(119903)119901(119897)minus (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

= 4119889119901(119903)119901(119897)

(119891119901(119897)

minus 119878119901(119897)

+ 119891119901(119897)

) + 41198892

119901(119903)119901(119897)

= 4119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(A5)

By summing up (A4) and (A5) for 119897 = 119896 119903 minus 1 we obtain(20)

References

[1] I Borg and P J F Groenen Modern Multidimensional ScalingTheory and Appplications Springer New York NY USA 2ndedition 2005

[2] L Hubert P Arabie and J Meulman The Structural Repre-sentation of Proximity Matrices with MATLAB vol 19 SIAMPhiladelphia Pa USA 2006

[3] S Meisel and D Mattfeld ldquoSynergies of Operations Researchand Data Miningrdquo European Journal of Operational Researchvol 206 no 1 pp 1ndash10 2010

[4] G Caraux and S Pinloche ldquoPermutMatrix a graphical envi-ronment to arrange gene expression profiles in optimal linearorderrdquo Bioinformatics vol 21 no 7 pp 1280ndash1281 2005

[5] M K Ng E S Fung and H-Y Liao ldquoLinkage disequilibriummap by unidimensional nonnegative scalingrdquo in Proceedings ofthe 1st International Symposium on Optimization and SystemsBiology (OSB rsquo07) pp 302ndash308 Beijing China 2007

[6] P I Armstrong N A Fouad J Rounds and L HubertldquoQuantifying and interpreting group differences in interestprofilesrdquo Journal of Career Assessment vol 18 no 2 pp 115ndash1322010

[7] L Hubert and D Steinley ldquoAgreement among supreme courtjustices categorical versus continuous representationrdquo SIAMNews vol 38 no 7 2005

[8] A Dahabiah J Puentes and B Solaiman ldquoDigestive casebasemining based on possibility theory and linear unidimensionalscalingrdquo in Proceedings of the 8th WSEAS International Con-ference on Artificial Intelligence Knowledge Engineering andData Bases (AIKED rsquo09) pp 218ndash223 World Scientific andEngineering Academy and Society (WSEAS) Stevens PointWis USA 2009

[9] A Dahabiah J Puentes and B Solaiman ldquoPossibilistic patternrecognition in a digestive database for mining imperfect datardquoWSEASTransactions on Systems vol 8 no 2 pp 229ndash240 2009

[10] Y Ge Y Leung and J Ma ldquoUnidimensional scaling classifierand its application to remotely sensed datardquo inProceedings of theIEEE International Geoscience and Remote Sensing Symposium(IGARSS rsquo05) pp 3841ndash3844 July 2005

[11] J H Bihn M Verhaagh M Brandle and R Brandl ldquoDosecondary forests act as refuges for old growth forest animalsRecovery of ant diversity in the Atlantic forest of BrazilrdquoBiological Conservation vol 141 no 3 pp 733ndash743 2008

[12] D Defays ldquoA short note on a method of seriationrdquo BritishJournal of Mathematical and Statistical Psychology vol 31 no1 pp 49ndash53 1978

[13] M J Brusco and S Stahl ldquoOptimal least-squares unidimen-sional scaling improved branch-and-bound procedures andcomparison to dynamic programmingrdquo Psychometrika vol 70no 2 pp 253ndash270 2005

[14] L J Hubert and P Arabie ldquoUnidimensional scaling and com-binatorial optimizationrdquo in Multidimensional Data Analysis Jde Leeuw W J Heiser J Meulman and F Critchley Eds pp181ndash196 DSWO Press Leiden The Netherlands 1986

[15] L J Hubert P Arabie and J J Meulman ldquoLinear unidimen-sional scaling in the 119871

2-norm basic optimization methods

usingMATLABrdquo Journal of Classification vol 19 no 2 pp 303ndash328 2002

[16] M J Brusco and S Stahl Branch-and-Bound Applicationsin Combinatorial Data Analysis Springer Science + BusinessMedia New York NY USA 2005

[17] V Pliner ldquoMetric unidimensional scaling and global optimiza-tionrdquo Journal of Classification vol 13 no 1 pp 3ndash18 1996

[18] A Murillo J F Vera and W J Heiser ldquoA permutation-translation simulated annealing algorithm for 119871

1and 119871

2uni-

dimensional scalingrdquo Journal of Classification vol 22 no 1 pp119ndash138 2005

[19] M J Brusco ldquoOn the performance of simulated annealing forlarge-scale 119871

2unidimensional scalingrdquo Journal of Classification

vol 23 no 2 pp 255ndash268 2006[20] M J Brusco H F Kohn and S Stahl ldquoHeuristic imple-

mentation of dynamic programming for matrix permutationproblems in combinatorial data analysisrdquo Psychometrika vol73 no 3 pp 503ndash522 2008

[21] G Palubeckis ldquoIterated tabu search for the maximum diversityproblemrdquo Applied Mathematics and Computation vol 189 no1 pp 371ndash383 2007

[22] G Palubeckis ldquoA new bounding procedure and an improvedexact algorithm for the Max-2-SAT problemrdquo Applied Mathe-matics and Computation vol 215 no 3 pp 1106ndash1117 2009

[23] J Kluabwang D Puangdownreong and S Sujitjorn ldquoMultipathadaptive tabu search for a vehicle control problemrdquo Journal ofApplied Mathematics vol 2012 Article ID 731623 20 pages2012

[24] N Sarasiri K Suthamno and S Sujitjorn ldquoBacterial foraging-tabu search metaheuristics for identification of nonlinear fric-tion modelrdquo Journal of Applied Mathematics vol 2012 ArticleID 238563 23 pages 2012

[25] A Jouglet and J Carlier ldquoDominance rules in combinato-rial optimization problemsrdquo European Journal of OperationalResearch vol 212 no 3 pp 433ndash444 2011

[26] E Z Rothkopf ldquoA measure of stimulus similarity and errors insome paired-associate learning tasksrdquo Journal of ExperimentalPsychology vol 53 no 2 pp 94ndash101 1957

[27] I Duntsch and G Gediga ldquoModal-Style Operators in Qual-itative Data Analysisrdquo Tech Rep CS-02-15 Department ofComputer Science Brock University St Catharines OntarioCanada 2002

[28] R N Shepard ldquoAnalysis of proximities as a technique for thestudy of information processing in manrdquoHuman factors vol 5pp 33ndash48 1963

Journal of Applied Mathematics 15

[29] P J F Groenen and W J Heiser ldquoThe tunneling method forglobal optimization in multidimensional scalingrdquo Psychome-trika vol 61 no 3 pp 529ndash550 1996

[30] L Hubert P Arabie and JMeulmanCombinatorial Data Anal-ysis Optimization by Dynamic Programming SIAM Philadel-phia Pa USA 2001

[31] B H Ross and G L Murphy ldquoFood for thought cross-classification and category organization in a complex real-worlddomainrdquo Cognitive Psychology vol 38 no 4 pp 495ndash553 1999

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 12: Research Article An Improved Exact Algorithm for Least-Squares Unidimensional …downloads.hindawi.com/journals/jam/2013/890589.pdf · 2019-07-31 · by a 36×36 Morse code dissimilarity

12 Journal of Applied Mathematics

Table 5 Relative performance of the tests

Instance B amp B BB3Sym Dom UB Sym Dom UB

p20 1158a 8824b 018c 1111a 8873b 016c

p25 886a 9087b 027c 848a 9143b 009c

p30 701a 9284 b 015c 629a 9364b 007c

M26 1140a 8847 b 013c 1044a 8952b 004c

J28 855a 8923 b 222c 568 a 9426b 006c

F20 801a 7563 b 1636c 1325a 8542b 133c

F25 704a 8127 b 1169c 1031a 8939b 030c

F30 895a 8391b 714c 774a 9216b 010c

F35 853a 8723b 424c mdasha mdashb mdashc

M36 882a 9114b 004c mdasha mdashb mdashc

aPercentage of partial solutions fathomed due to symmetry test

bPercentage of partial solutions fathomed due to dominance testcPercentage of partial solutions fathomed due to bound test

the process of assessment of heuristic techniques Specificallyoptimal values can serve as a reference point with which theresults of heuristics can be compared However obtaining anoptimality certificate in many cases including the UDSP isvery time consuming In order to see the time differencesbetween finding an optimal solution for UDSP with the helpof heuristics and proving its optimality (as it is done bythe branch-and-bound method described in this paper) werun the two heuristic algorithms iterated tabu search andsimulated annealing (SA) of Brusco [19] on all the probleminstances used in the main experiment For two randominstances namely p21 and p25 SA was trapped in a localmaximumTherefore we have enhanced our implementationof SA by allowing it to run in a multi start mode Thisvariation of SA located an optimal solution for the above-mentioned instances in the third and respectively secondrestarts Both tested heuristics have proven to be very fastIn particular ITS took in total only 295 seconds to findthe global optima for all 30 instances we experimented withPerforming the same task by SA required 544 seconds Suchtimes are in huge contrast to what is seen in Tables 1ndash3 Basically the long computation times recorded in thesetables are the price we have to pay for obtaining not only asolution but also a guarantee of its optimality We did notperform additional computational experiments with ITS andSA because investigation of heuristics for theUDSP is outsidethe scope of this paper

Another issue concerns the use of heuristics for providinginitial solutions for the branch-and-bound technique Wehave employed the iterated tabu search procedure for thispurpose Our choice was influenced by the fact that wecould adopt in ITS the same formulas as those used inthe dominance test which is a key component of ourapproach Obviously there are other heuristics one canapply in place of ITS As it follows from the discussionin the preceding paragraph a good candidate for this isthe simulated annealing algorithm of Brusco [19] Supposethat BampB is run twice on the same instance from differentinitial solutions both of which are optimal Then it shouldbe clear that in both cases the same number of tree nodes

is explored This essentially means that any fast heuristiccapable of providing optimal solutions is perfect to be usedin the initialization step of BampB An alternative strategyis to use in this step randomly generated permutationsThe effects of this simple strategy are mixed For examplefor problem instances F25 and F27 the computation timeincreased from 1483 and respectively 11698 seconds (seeTable 2) to 4133 and respectively 20625 seconds Howeverfor example for p25 and M26 a somewhat different picturewas observed In the case of random initial solution BampBtook 4075 and 4687 seconds for these instances respectivelyThe computation times in Table 1 for p25 and Table 2 forM26are only marginally shorter One possible explanation of thislies in the fact that as Table 5 indicates the percentage ofpartial solutions fathomed due to bound test for both p25 andM26 is quite low However in general it is advantageous touse a heuristic for obtaining an initial solution for the branch-and-bound algorithm especially because this solution can beproduced at a negligible cost with respect to the rest of thecomputation

7 Conclusions

In this paper we have presented a branch-and-bound algo-rithm for the least-squares unidimensional scaling problemThe algorithm incorporates a LAP-based upper bound testas well as a dominance test which allows reducing theredundancy in the search process drastically The results ofcomputational experiments indicate that the algorithm canbe used to obtain provably optimal solutions for randomlygenerated dissimilarity matrices of size up to 30 times 30 andfor empirical dissimilarity matrices of size up to 35 times 35In particular for the first time the UDSP instance with the36 times 36 Morse code dissimilarity matrix has been solved toguaranteed optimality Another important instance is definedby the 45 times 45 food-item dissimilarity matrix 119863food It is agreat challenge to the optimization community to develop anexact algorithm that will solve this instance Our branch-and-bound algorithm is limited to submatrices of 119863food of size

Journal of Applied Mathematics 13

around 35 times 35 We also remark that optimal solutions forall UDSP instances used in our experiments can be found byapplying heuristic algorithms such as iterated tabu search andsimulated annealing proceduresThe total time taken by eachof these procedures to reach the global optima is only a fewseconds An advantage of the branch-and-bound algorithmis its ability to certify the optimality of the solution obtainedin a rigorous manner

Before closing there are two more points worth of men-tion First we proposed a simple yet effective mechanismfor making a decision about when it is useful to apply abound test to the current partial solution and when sucha test most probably will fail to discard this solution Forthat purpose a sample of partial solutions is evaluated in thepreparation step of BampBWe believe that similar mechanismsmay be operative in branch-and-bound algorithms for otherdifficult combinatorial optimization problems Second wedeveloped an efficient procedure for exploring the pairwiseinterchange neighborhood of a solution in the search spaceThis procedure can be useful on its own in the design ofheuristic algorithms for unidimensional scaling especiallythose based on the local search principle

Appendix

Derivation of (20)We will deduce an expression for 120575

119903119896 1 ⩽ 119896 lt 119903 ⩽ 119899

assuming that 119901 is a solution to the whole problem thatis 119901 isin Π Relocating the object 119901(119903) from position 119903 toposition 119896 gives the permutation 119901

1015840= (119901(1) 119901(119896 minus

1) 119901(119903) 119901(119896) 119901(119899)) By using the definition of 119865 we canwrite

120575119903119896

= 119865 (1199011015840) minus 119865 (119901)

= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119896

119889119901(119903)119901(119897)

)

2

minus(

119903minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

+

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

+2119889119901(119903)119901(119897)

)

2

minus

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

(A1)

Let us denote the first two terms in (A1) by 1205751015840 and the

remaining two terms by 12057510158401015840 = sum119903minus1

119897=11989612057510158401015840

119897 It is easy to see that 1205751015840

stems frommoving the object 119901(119903) to position 119896 and 12057510158401015840 stemsfrom shifting objects 119901(119897) 119897 = 119896 119903 minus 1 by one position to

the right To get a simpler expression for 1205751015840 we perform thefollowing manipulations

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

(

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2(

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A2)

Continuing we get

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A3)

Finally

1205751015840= minus 4

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

= minus 4(119891119901(119903)

minus

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

(119878119901(119903)

minus 119891119901(119903)

)

= 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

((

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

) minus 2119891119901(119903)

+ 119878119901(119903)

)

(A4)

14 Journal of Applied Mathematics

Similarly for 12057510158401015840119897we have

12057510158401015840

119897= (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

+ 4119889119901(119903)119901(119897)

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

+ 41198892

119901(119903)119901(119897)minus (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

= 4119889119901(119903)119901(119897)

(119891119901(119897)

minus 119878119901(119897)

+ 119891119901(119897)

) + 41198892

119901(119903)119901(119897)

= 4119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(A5)

By summing up (A4) and (A5) for 119897 = 119896 119903 minus 1 we obtain(20)

References

[1] I Borg and P J F Groenen Modern Multidimensional ScalingTheory and Appplications Springer New York NY USA 2ndedition 2005

[2] L Hubert P Arabie and J Meulman The Structural Repre-sentation of Proximity Matrices with MATLAB vol 19 SIAMPhiladelphia Pa USA 2006

[3] S Meisel and D Mattfeld ldquoSynergies of Operations Researchand Data Miningrdquo European Journal of Operational Researchvol 206 no 1 pp 1ndash10 2010

[4] G Caraux and S Pinloche ldquoPermutMatrix a graphical envi-ronment to arrange gene expression profiles in optimal linearorderrdquo Bioinformatics vol 21 no 7 pp 1280ndash1281 2005

[5] M K Ng E S Fung and H-Y Liao ldquoLinkage disequilibriummap by unidimensional nonnegative scalingrdquo in Proceedings ofthe 1st International Symposium on Optimization and SystemsBiology (OSB rsquo07) pp 302ndash308 Beijing China 2007

[6] P I Armstrong N A Fouad J Rounds and L HubertldquoQuantifying and interpreting group differences in interestprofilesrdquo Journal of Career Assessment vol 18 no 2 pp 115ndash1322010

[7] L Hubert and D Steinley ldquoAgreement among supreme courtjustices categorical versus continuous representationrdquo SIAMNews vol 38 no 7 2005

[8] A Dahabiah J Puentes and B Solaiman ldquoDigestive casebasemining based on possibility theory and linear unidimensionalscalingrdquo in Proceedings of the 8th WSEAS International Con-ference on Artificial Intelligence Knowledge Engineering andData Bases (AIKED rsquo09) pp 218ndash223 World Scientific andEngineering Academy and Society (WSEAS) Stevens PointWis USA 2009

[9] A Dahabiah J Puentes and B Solaiman ldquoPossibilistic patternrecognition in a digestive database for mining imperfect datardquoWSEASTransactions on Systems vol 8 no 2 pp 229ndash240 2009

[10] Y Ge Y Leung and J Ma ldquoUnidimensional scaling classifierand its application to remotely sensed datardquo inProceedings of theIEEE International Geoscience and Remote Sensing Symposium(IGARSS rsquo05) pp 3841ndash3844 July 2005

[11] J H Bihn M Verhaagh M Brandle and R Brandl ldquoDosecondary forests act as refuges for old growth forest animalsRecovery of ant diversity in the Atlantic forest of BrazilrdquoBiological Conservation vol 141 no 3 pp 733ndash743 2008

[12] D Defays ldquoA short note on a method of seriationrdquo BritishJournal of Mathematical and Statistical Psychology vol 31 no1 pp 49ndash53 1978

[13] M J Brusco and S Stahl ldquoOptimal least-squares unidimen-sional scaling improved branch-and-bound procedures andcomparison to dynamic programmingrdquo Psychometrika vol 70no 2 pp 253ndash270 2005

[14] L J Hubert and P Arabie ldquoUnidimensional scaling and com-binatorial optimizationrdquo in Multidimensional Data Analysis Jde Leeuw W J Heiser J Meulman and F Critchley Eds pp181ndash196 DSWO Press Leiden The Netherlands 1986

[15] L J Hubert P Arabie and J J Meulman ldquoLinear unidimen-sional scaling in the 119871

2-norm basic optimization methods

usingMATLABrdquo Journal of Classification vol 19 no 2 pp 303ndash328 2002

[16] M J Brusco and S Stahl Branch-and-Bound Applicationsin Combinatorial Data Analysis Springer Science + BusinessMedia New York NY USA 2005

[17] V Pliner ldquoMetric unidimensional scaling and global optimiza-tionrdquo Journal of Classification vol 13 no 1 pp 3ndash18 1996

[18] A Murillo J F Vera and W J Heiser ldquoA permutation-translation simulated annealing algorithm for 119871

1and 119871

2uni-

dimensional scalingrdquo Journal of Classification vol 22 no 1 pp119ndash138 2005

[19] M J Brusco ldquoOn the performance of simulated annealing forlarge-scale 119871

2unidimensional scalingrdquo Journal of Classification

vol 23 no 2 pp 255ndash268 2006[20] M J Brusco H F Kohn and S Stahl ldquoHeuristic imple-

mentation of dynamic programming for matrix permutationproblems in combinatorial data analysisrdquo Psychometrika vol73 no 3 pp 503ndash522 2008

[21] G Palubeckis ldquoIterated tabu search for the maximum diversityproblemrdquo Applied Mathematics and Computation vol 189 no1 pp 371ndash383 2007

[22] G Palubeckis ldquoA new bounding procedure and an improvedexact algorithm for the Max-2-SAT problemrdquo Applied Mathe-matics and Computation vol 215 no 3 pp 1106ndash1117 2009

[23] J Kluabwang D Puangdownreong and S Sujitjorn ldquoMultipathadaptive tabu search for a vehicle control problemrdquo Journal ofApplied Mathematics vol 2012 Article ID 731623 20 pages2012

[24] N Sarasiri K Suthamno and S Sujitjorn ldquoBacterial foraging-tabu search metaheuristics for identification of nonlinear fric-tion modelrdquo Journal of Applied Mathematics vol 2012 ArticleID 238563 23 pages 2012

[25] A Jouglet and J Carlier ldquoDominance rules in combinato-rial optimization problemsrdquo European Journal of OperationalResearch vol 212 no 3 pp 433ndash444 2011

[26] E Z Rothkopf ldquoA measure of stimulus similarity and errors insome paired-associate learning tasksrdquo Journal of ExperimentalPsychology vol 53 no 2 pp 94ndash101 1957

[27] I Duntsch and G Gediga ldquoModal-Style Operators in Qual-itative Data Analysisrdquo Tech Rep CS-02-15 Department ofComputer Science Brock University St Catharines OntarioCanada 2002

[28] R N Shepard ldquoAnalysis of proximities as a technique for thestudy of information processing in manrdquoHuman factors vol 5pp 33ndash48 1963

Journal of Applied Mathematics 15

[29] P J F Groenen and W J Heiser ldquoThe tunneling method forglobal optimization in multidimensional scalingrdquo Psychome-trika vol 61 no 3 pp 529ndash550 1996

[30] L Hubert P Arabie and JMeulmanCombinatorial Data Anal-ysis Optimization by Dynamic Programming SIAM Philadel-phia Pa USA 2001

[31] B H Ross and G L Murphy ldquoFood for thought cross-classification and category organization in a complex real-worlddomainrdquo Cognitive Psychology vol 38 no 4 pp 495ndash553 1999

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 13: Research Article An Improved Exact Algorithm for Least-Squares Unidimensional …downloads.hindawi.com/journals/jam/2013/890589.pdf · 2019-07-31 · by a 36×36 Morse code dissimilarity

Journal of Applied Mathematics 13

around 35 times 35 We also remark that optimal solutions forall UDSP instances used in our experiments can be found byapplying heuristic algorithms such as iterated tabu search andsimulated annealing proceduresThe total time taken by eachof these procedures to reach the global optima is only a fewseconds An advantage of the branch-and-bound algorithmis its ability to certify the optimality of the solution obtainedin a rigorous manner

Before closing there are two more points worth of men-tion First we proposed a simple yet effective mechanismfor making a decision about when it is useful to apply abound test to the current partial solution and when sucha test most probably will fail to discard this solution Forthat purpose a sample of partial solutions is evaluated in thepreparation step of BampBWe believe that similar mechanismsmay be operative in branch-and-bound algorithms for otherdifficult combinatorial optimization problems Second wedeveloped an efficient procedure for exploring the pairwiseinterchange neighborhood of a solution in the search spaceThis procedure can be useful on its own in the design ofheuristic algorithms for unidimensional scaling especiallythose based on the local search principle

Appendix

Derivation of (20)We will deduce an expression for 120575

119903119896 1 ⩽ 119896 lt 119903 ⩽ 119899

assuming that 119901 is a solution to the whole problem thatis 119901 isin Π Relocating the object 119901(119903) from position 119903 toposition 119896 gives the permutation 119901

1015840= (119901(1) 119901(119896 minus

1) 119901(119903) 119901(119896) 119901(119899)) By using the definition of 119865 we canwrite

120575119903119896

= 119865 (1199011015840) minus 119865 (119901)

= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119896

119889119901(119903)119901(119897)

)

2

minus(

119903minus1

sum

119897=1

119889119901(119903)119901(119897)

minus

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

+

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

+2119889119901(119903)119901(119897)

)

2

minus

119903minus1

sum

119897=119896

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

(A1)

Let us denote the first two terms in (A1) by 1205751015840 and the

remaining two terms by 12057510158401015840 = sum119903minus1

119897=11989612057510158401015840

119897 It is easy to see that 1205751015840

stems frommoving the object 119901(119903) to position 119896 and 12057510158401015840 stemsfrom shifting objects 119901(119897) 119897 = 119896 119903 minus 1 by one position to

the right To get a simpler expression for 1205751015840 we perform thefollowing manipulations

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

(

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2(

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

+

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A2)

Continuing we get

1205751015840= (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

minus (

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

)

2

minus 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

minus (

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

2

+ 2

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

+ 2

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

minus (

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

)

2

(A3)

Finally

1205751015840= minus 4

119896minus1

sum

119897=1

119889119901(119903)119901(119897)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

119899

sum

119897=119903+1

119889119901(119903)119901(119897)

= minus 4(119891119901(119903)

minus

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

)

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

+ 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

(119878119901(119903)

minus 119891119901(119903)

)

= 4

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

((

119903minus1

sum

119897=119896

119889119901(119903)119901(119897)

) minus 2119891119901(119903)

+ 119878119901(119903)

)

(A4)

14 Journal of Applied Mathematics

Similarly for 12057510158401015840119897we have

12057510158401015840

119897= (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

+ 4119889119901(119903)119901(119897)

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

+ 41198892

119901(119903)119901(119897)minus (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

= 4119889119901(119903)119901(119897)

(119891119901(119897)

minus 119878119901(119897)

+ 119891119901(119897)

) + 41198892

119901(119903)119901(119897)

= 4119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(A5)

By summing up (A4) and (A5) for 119897 = 119896 119903 minus 1 we obtain(20)

References

[1] I Borg and P J F Groenen Modern Multidimensional ScalingTheory and Appplications Springer New York NY USA 2ndedition 2005

[2] L Hubert P Arabie and J Meulman The Structural Repre-sentation of Proximity Matrices with MATLAB vol 19 SIAMPhiladelphia Pa USA 2006

[3] S Meisel and D Mattfeld ldquoSynergies of Operations Researchand Data Miningrdquo European Journal of Operational Researchvol 206 no 1 pp 1ndash10 2010

[4] G Caraux and S Pinloche ldquoPermutMatrix a graphical envi-ronment to arrange gene expression profiles in optimal linearorderrdquo Bioinformatics vol 21 no 7 pp 1280ndash1281 2005

[5] M K Ng E S Fung and H-Y Liao ldquoLinkage disequilibriummap by unidimensional nonnegative scalingrdquo in Proceedings ofthe 1st International Symposium on Optimization and SystemsBiology (OSB rsquo07) pp 302ndash308 Beijing China 2007

[6] P I Armstrong N A Fouad J Rounds and L HubertldquoQuantifying and interpreting group differences in interestprofilesrdquo Journal of Career Assessment vol 18 no 2 pp 115ndash1322010

[7] L Hubert and D Steinley ldquoAgreement among supreme courtjustices categorical versus continuous representationrdquo SIAMNews vol 38 no 7 2005

[8] A Dahabiah J Puentes and B Solaiman ldquoDigestive casebasemining based on possibility theory and linear unidimensionalscalingrdquo in Proceedings of the 8th WSEAS International Con-ference on Artificial Intelligence Knowledge Engineering andData Bases (AIKED rsquo09) pp 218ndash223 World Scientific andEngineering Academy and Society (WSEAS) Stevens PointWis USA 2009

[9] A Dahabiah J Puentes and B Solaiman ldquoPossibilistic patternrecognition in a digestive database for mining imperfect datardquoWSEASTransactions on Systems vol 8 no 2 pp 229ndash240 2009

[10] Y Ge Y Leung and J Ma ldquoUnidimensional scaling classifierand its application to remotely sensed datardquo inProceedings of theIEEE International Geoscience and Remote Sensing Symposium(IGARSS rsquo05) pp 3841ndash3844 July 2005

[11] J H Bihn M Verhaagh M Brandle and R Brandl ldquoDosecondary forests act as refuges for old growth forest animalsRecovery of ant diversity in the Atlantic forest of BrazilrdquoBiological Conservation vol 141 no 3 pp 733ndash743 2008

[12] D Defays ldquoA short note on a method of seriationrdquo BritishJournal of Mathematical and Statistical Psychology vol 31 no1 pp 49ndash53 1978

[13] M J Brusco and S Stahl ldquoOptimal least-squares unidimen-sional scaling improved branch-and-bound procedures andcomparison to dynamic programmingrdquo Psychometrika vol 70no 2 pp 253ndash270 2005

[14] L J Hubert and P Arabie ldquoUnidimensional scaling and com-binatorial optimizationrdquo in Multidimensional Data Analysis Jde Leeuw W J Heiser J Meulman and F Critchley Eds pp181ndash196 DSWO Press Leiden The Netherlands 1986

[15] L J Hubert P Arabie and J J Meulman ldquoLinear unidimen-sional scaling in the 119871

2-norm basic optimization methods

usingMATLABrdquo Journal of Classification vol 19 no 2 pp 303ndash328 2002

[16] M J Brusco and S Stahl Branch-and-Bound Applicationsin Combinatorial Data Analysis Springer Science + BusinessMedia New York NY USA 2005

[17] V Pliner ldquoMetric unidimensional scaling and global optimiza-tionrdquo Journal of Classification vol 13 no 1 pp 3ndash18 1996

[18] A Murillo J F Vera and W J Heiser ldquoA permutation-translation simulated annealing algorithm for 119871

1and 119871

2uni-

dimensional scalingrdquo Journal of Classification vol 22 no 1 pp119ndash138 2005

[19] M J Brusco ldquoOn the performance of simulated annealing forlarge-scale 119871

2unidimensional scalingrdquo Journal of Classification

vol 23 no 2 pp 255ndash268 2006[20] M J Brusco H F Kohn and S Stahl ldquoHeuristic imple-

mentation of dynamic programming for matrix permutationproblems in combinatorial data analysisrdquo Psychometrika vol73 no 3 pp 503ndash522 2008

[21] G Palubeckis ldquoIterated tabu search for the maximum diversityproblemrdquo Applied Mathematics and Computation vol 189 no1 pp 371ndash383 2007

[22] G Palubeckis ldquoA new bounding procedure and an improvedexact algorithm for the Max-2-SAT problemrdquo Applied Mathe-matics and Computation vol 215 no 3 pp 1106ndash1117 2009

[23] J Kluabwang D Puangdownreong and S Sujitjorn ldquoMultipathadaptive tabu search for a vehicle control problemrdquo Journal ofApplied Mathematics vol 2012 Article ID 731623 20 pages2012

[24] N Sarasiri K Suthamno and S Sujitjorn ldquoBacterial foraging-tabu search metaheuristics for identification of nonlinear fric-tion modelrdquo Journal of Applied Mathematics vol 2012 ArticleID 238563 23 pages 2012

[25] A Jouglet and J Carlier ldquoDominance rules in combinato-rial optimization problemsrdquo European Journal of OperationalResearch vol 212 no 3 pp 433ndash444 2011

[26] E Z Rothkopf ldquoA measure of stimulus similarity and errors insome paired-associate learning tasksrdquo Journal of ExperimentalPsychology vol 53 no 2 pp 94ndash101 1957

[27] I Duntsch and G Gediga ldquoModal-Style Operators in Qual-itative Data Analysisrdquo Tech Rep CS-02-15 Department ofComputer Science Brock University St Catharines OntarioCanada 2002

[28] R N Shepard ldquoAnalysis of proximities as a technique for thestudy of information processing in manrdquoHuman factors vol 5pp 33ndash48 1963

Journal of Applied Mathematics 15

[29] P J F Groenen and W J Heiser ldquoThe tunneling method forglobal optimization in multidimensional scalingrdquo Psychome-trika vol 61 no 3 pp 529ndash550 1996

[30] L Hubert P Arabie and JMeulmanCombinatorial Data Anal-ysis Optimization by Dynamic Programming SIAM Philadel-phia Pa USA 2001

[31] B H Ross and G L Murphy ldquoFood for thought cross-classification and category organization in a complex real-worlddomainrdquo Cognitive Psychology vol 38 no 4 pp 495ndash553 1999

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 14: Research Article An Improved Exact Algorithm for Least-Squares Unidimensional …downloads.hindawi.com/journals/jam/2013/890589.pdf · 2019-07-31 · by a 36×36 Morse code dissimilarity

14 Journal of Applied Mathematics

Similarly for 12057510158401015840119897we have

12057510158401015840

119897= (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

+ 4119889119901(119903)119901(119897)

(

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

+ 41198892

119901(119903)119901(119897)minus (

119897minus1

sum

119895=1

119889119901(119897)119901(119895)

minus

119899

sum

119895=119897+1

119889119901(119897)119901(119895)

)

2

= 4119889119901(119903)119901(119897)

(119891119901(119897)

minus 119878119901(119897)

+ 119891119901(119897)

) + 41198892

119901(119903)119901(119897)

= 4119889119901(119903)119901(119897)

(119889119901(119903)119901(119897)

+ 2119891119901(119897)

minus 119878119901(119897)

)

(A5)

By summing up (A4) and (A5) for 119897 = 119896 119903 minus 1 we obtain(20)

References

[1] I Borg and P J F Groenen Modern Multidimensional ScalingTheory and Appplications Springer New York NY USA 2ndedition 2005

[2] L Hubert P Arabie and J Meulman The Structural Repre-sentation of Proximity Matrices with MATLAB vol 19 SIAMPhiladelphia Pa USA 2006

[3] S Meisel and D Mattfeld ldquoSynergies of Operations Researchand Data Miningrdquo European Journal of Operational Researchvol 206 no 1 pp 1ndash10 2010

[4] G Caraux and S Pinloche ldquoPermutMatrix a graphical envi-ronment to arrange gene expression profiles in optimal linearorderrdquo Bioinformatics vol 21 no 7 pp 1280ndash1281 2005

[5] M K Ng E S Fung and H-Y Liao ldquoLinkage disequilibriummap by unidimensional nonnegative scalingrdquo in Proceedings ofthe 1st International Symposium on Optimization and SystemsBiology (OSB rsquo07) pp 302ndash308 Beijing China 2007

[6] P I Armstrong N A Fouad J Rounds and L HubertldquoQuantifying and interpreting group differences in interestprofilesrdquo Journal of Career Assessment vol 18 no 2 pp 115ndash1322010

[7] L Hubert and D Steinley ldquoAgreement among supreme courtjustices categorical versus continuous representationrdquo SIAMNews vol 38 no 7 2005

[8] A Dahabiah J Puentes and B Solaiman ldquoDigestive casebasemining based on possibility theory and linear unidimensionalscalingrdquo in Proceedings of the 8th WSEAS International Con-ference on Artificial Intelligence Knowledge Engineering andData Bases (AIKED rsquo09) pp 218ndash223 World Scientific andEngineering Academy and Society (WSEAS) Stevens PointWis USA 2009

[9] A Dahabiah J Puentes and B Solaiman ldquoPossibilistic patternrecognition in a digestive database for mining imperfect datardquoWSEASTransactions on Systems vol 8 no 2 pp 229ndash240 2009

[10] Y Ge Y Leung and J Ma ldquoUnidimensional scaling classifierand its application to remotely sensed datardquo inProceedings of theIEEE International Geoscience and Remote Sensing Symposium(IGARSS rsquo05) pp 3841ndash3844 July 2005

[11] J H Bihn M Verhaagh M Brandle and R Brandl ldquoDosecondary forests act as refuges for old growth forest animalsRecovery of ant diversity in the Atlantic forest of BrazilrdquoBiological Conservation vol 141 no 3 pp 733ndash743 2008

[12] D Defays ldquoA short note on a method of seriationrdquo BritishJournal of Mathematical and Statistical Psychology vol 31 no1 pp 49ndash53 1978

[13] M J Brusco and S Stahl ldquoOptimal least-squares unidimen-sional scaling improved branch-and-bound procedures andcomparison to dynamic programmingrdquo Psychometrika vol 70no 2 pp 253ndash270 2005

[14] L J Hubert and P Arabie ldquoUnidimensional scaling and com-binatorial optimizationrdquo in Multidimensional Data Analysis Jde Leeuw W J Heiser J Meulman and F Critchley Eds pp181ndash196 DSWO Press Leiden The Netherlands 1986

[15] L J Hubert P Arabie and J J Meulman ldquoLinear unidimen-sional scaling in the 119871

2-norm basic optimization methods

usingMATLABrdquo Journal of Classification vol 19 no 2 pp 303ndash328 2002

[16] M J Brusco and S Stahl Branch-and-Bound Applicationsin Combinatorial Data Analysis Springer Science + BusinessMedia New York NY USA 2005

[17] V Pliner ldquoMetric unidimensional scaling and global optimiza-tionrdquo Journal of Classification vol 13 no 1 pp 3ndash18 1996

[18] A Murillo J F Vera and W J Heiser ldquoA permutation-translation simulated annealing algorithm for 119871

1and 119871

2uni-

dimensional scalingrdquo Journal of Classification vol 22 no 1 pp119ndash138 2005

[19] M J Brusco ldquoOn the performance of simulated annealing forlarge-scale 119871

2unidimensional scalingrdquo Journal of Classification

vol 23 no 2 pp 255ndash268 2006[20] M J Brusco H F Kohn and S Stahl ldquoHeuristic imple-

mentation of dynamic programming for matrix permutationproblems in combinatorial data analysisrdquo Psychometrika vol73 no 3 pp 503ndash522 2008

[21] G Palubeckis ldquoIterated tabu search for the maximum diversityproblemrdquo Applied Mathematics and Computation vol 189 no1 pp 371ndash383 2007

[22] G Palubeckis ldquoA new bounding procedure and an improvedexact algorithm for the Max-2-SAT problemrdquo Applied Mathe-matics and Computation vol 215 no 3 pp 1106ndash1117 2009

[23] J Kluabwang D Puangdownreong and S Sujitjorn ldquoMultipathadaptive tabu search for a vehicle control problemrdquo Journal ofApplied Mathematics vol 2012 Article ID 731623 20 pages2012

[24] N Sarasiri K Suthamno and S Sujitjorn ldquoBacterial foraging-tabu search metaheuristics for identification of nonlinear fric-tion modelrdquo Journal of Applied Mathematics vol 2012 ArticleID 238563 23 pages 2012

[25] A Jouglet and J Carlier ldquoDominance rules in combinato-rial optimization problemsrdquo European Journal of OperationalResearch vol 212 no 3 pp 433ndash444 2011

[26] E Z Rothkopf ldquoA measure of stimulus similarity and errors insome paired-associate learning tasksrdquo Journal of ExperimentalPsychology vol 53 no 2 pp 94ndash101 1957

[27] I Duntsch and G Gediga ldquoModal-Style Operators in Qual-itative Data Analysisrdquo Tech Rep CS-02-15 Department ofComputer Science Brock University St Catharines OntarioCanada 2002

[28] R N Shepard ldquoAnalysis of proximities as a technique for thestudy of information processing in manrdquoHuman factors vol 5pp 33ndash48 1963

Journal of Applied Mathematics 15

[29] P J F Groenen and W J Heiser ldquoThe tunneling method forglobal optimization in multidimensional scalingrdquo Psychome-trika vol 61 no 3 pp 529ndash550 1996

[30] L Hubert P Arabie and JMeulmanCombinatorial Data Anal-ysis Optimization by Dynamic Programming SIAM Philadel-phia Pa USA 2001

[31] B H Ross and G L Murphy ldquoFood for thought cross-classification and category organization in a complex real-worlddomainrdquo Cognitive Psychology vol 38 no 4 pp 495ndash553 1999

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 15: Research Article An Improved Exact Algorithm for Least-Squares Unidimensional …downloads.hindawi.com/journals/jam/2013/890589.pdf · 2019-07-31 · by a 36×36 Morse code dissimilarity

Journal of Applied Mathematics 15

[29] P J F Groenen and W J Heiser ldquoThe tunneling method forglobal optimization in multidimensional scalingrdquo Psychome-trika vol 61 no 3 pp 529ndash550 1996

[30] L Hubert P Arabie and JMeulmanCombinatorial Data Anal-ysis Optimization by Dynamic Programming SIAM Philadel-phia Pa USA 2001

[31] B H Ross and G L Murphy ldquoFood for thought cross-classification and category organization in a complex real-worlddomainrdquo Cognitive Psychology vol 38 no 4 pp 495ndash553 1999

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 16: Research Article An Improved Exact Algorithm for Least-Squares Unidimensional …downloads.hindawi.com/journals/jam/2013/890589.pdf · 2019-07-31 · by a 36×36 Morse code dissimilarity

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of