Uncertainty characterization of the orthogonal …...Uncertainty characterization of the orthogonal...

11
Uncertainty characterization of the orthogonal Procrustes problem with arbitrary covariance matrices Pedro Lourenço a,n , Bruno J. Guerreiro a,b , Pedro Batista a,b , Paulo Oliveira d,c,a , Carlos Silvestre e,a a Institute for Systems and Robotics, Laboratory for Robotics and Engineering Systems, Portugal b Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal c Department of Mechanical Engineering, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal d Institute of Mechanical Engineering, Associated Laboratory for Energy, Transports and Aeronautics, Lisbon, Portugal e Department of Electrical and Computer Engineering, Faculty of Science and Technology of the University of Macau, China article info Article history: Received 18 February 2016 Received in revised form 12 May 2016 Accepted 25 July 2016 Available online 28 July 2016 Keywords: Weighted Procrustes statistics Perturbation theory Uncertainty characterization Map transformation abstract This paper addresses the weighted orthogonal Procrustes problem of matching stochastically perturbed point clouds, formulated as an optimization problem with a closed-form solution. A novel uncertainty characterization of the solution of this problem is proposed resorting to perturbation theory concepts, which admits arbitrary transformations between point clouds and individual covariance and cross- covariance matrices for the points of each cloud. The method is thoroughly validated through extensive Monte Carlo simulations, and particularly interesting cases where nonlinearities may arise are further analyzed. & 2016 Elsevier Ltd. All rights reserved. 1. Introduction The problem of nding the similarity transformation between two sets of points in n-dimensional space appears commonly in many applications of computer vision, robotics, statistics, and other elds of research. The study of this family of problems is usually known as the Procrustes analysis [1], which includes the statistical characterization of the transformation between the shape of objects [2]. One particularly important problem in this family is the so-called orthogonal Procrustes problem, which can be traced back to the work presented in [3], and consists in ex- tracting the orthogonal transformation that maps one set of points into a second set of points, with known associations between them. It is closely related to Wahba's problem [4] and to the Kabsch algorithm [5]. The generalization for rotation, translation, and scaling has been the subject of extensive research in areas such as computer vision, and can be traced back to [68]. While initially the problem was posed without any restrictions on the transformation between the sets, i.e., rotations and reections were allowed, a more evolved strategy appeared restricting the transformation to the special orthogonal group, as detailed in [6,9]. Furthermore, Goryn and Hein [10] demonstrated that the previous solutions are optimal even when both data sets are perturbed with isotropic and identical Gaussian noise. The statistical characterization of the Procrustes analysis has also been the subject of study in works such as [2,9,11,12]. Using perturbation theory, the nonlinear problem of characterizing the uncertainty was addressed with some limiting options, such as the absence of weighting of the point sets, the use of small rotations, or the same covariance for all points. Within the eld of medical imaging, the work presented in [13] also resorts to perturbation theory to present a statistical characterization of a target position, considering small rotations, isotropic uncertainty, and equal weights for each point. More recently, the work presented in [14] extends these results for anisotropic uncertainty in the compo- nents of the point space. This is achieved by considering the same covariance matrix for all points, which may weigh each compo- nent of the point space independently. The authors of [15] further expand this by considering different noise levels for each point, while keeping the linearized model for the rotation matrix. An interesting advance in the study of the uncertainty is the rst or- der error propagation proposed in [16]. The optimization problem Contents lists available at ScienceDirect journal homepage: www.elsevier.com/locate/pr Pattern Recognition http://dx.doi.org/10.1016/j.patcog.2016.07.037 0031-3203/& 2016 Elsevier Ltd. All rights reserved. n Corresponding author. E-mail addresses: [email protected] (P. Lourenço), [email protected] (B.J. Guerreiro), [email protected] (P. Batista), [email protected] (P. Oliveira), [email protected] (C. Silvestre). Pattern Recognition 61 (2017) 210220

Transcript of Uncertainty characterization of the orthogonal …...Uncertainty characterization of the orthogonal...

Page 1: Uncertainty characterization of the orthogonal …...Uncertainty characterization of the orthogonal Procrustes problem with arbitrary covariance matrices Pedro Lourençoa,n, Bruno

Pattern Recognition 61 (2017) 210–220

Contents lists available at ScienceDirect

Pattern Recognition

http://d0031-32

n CorrE-m

bguerrepbatistacsilvestr

journal homepage: www.elsevier.com/locate/pr

Uncertainty characterization of the orthogonal Procrustes problemwith arbitrary covariance matrices

Pedro Lourenço a,n, Bruno J. Guerreiro a,b, Pedro Batista a,b, Paulo Oliveira d,c,a,Carlos Silvestre e,a

a Institute for Systems and Robotics, Laboratory for Robotics and Engineering Systems, Portugalb Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugalc Department of Mechanical Engineering, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugald Institute of Mechanical Engineering, Associated Laboratory for Energy, Transports and Aeronautics, Lisbon, Portugale Department of Electrical and Computer Engineering, Faculty of Science and Technology of the University of Macau, China

a r t i c l e i n f o

Article history:Received 18 February 2016Received in revised form12 May 2016Accepted 25 July 2016Available online 28 July 2016

Keywords:Weighted Procrustes statisticsPerturbation theoryUncertainty characterizationMap transformation

x.doi.org/10.1016/j.patcog.2016.07.03703/& 2016 Elsevier Ltd. All rights reserved.

esponding author.ail addresses: [email protected]@isr.tecnico.ulisboa.pt (B.J. Guerreiro),@isr.tecnico.ulisboa.pt (P. Batista), [email protected]@umac.mo (C. Silvestre).

a b s t r a c t

This paper addresses the weighted orthogonal Procrustes problem of matching stochastically perturbedpoint clouds, formulated as an optimization problem with a closed-form solution. A novel uncertaintycharacterization of the solution of this problem is proposed resorting to perturbation theory concepts,which admits arbitrary transformations between point clouds and individual covariance and cross-covariance matrices for the points of each cloud. The method is thoroughly validated through extensiveMonte Carlo simulations, and particularly interesting cases where nonlinearities may arise are furtheranalyzed.

& 2016 Elsevier Ltd. All rights reserved.

1. Introduction

The problem of finding the similarity transformation betweentwo sets of points in n-dimensional space appears commonly inmany applications of computer vision, robotics, statistics, andother fields of research. The study of this family of problems isusually known as the Procrustes analysis [1], which includes thestatistical characterization of the transformation between theshape of objects [2]. One particularly important problem in thisfamily is the so-called orthogonal Procrustes problem, which canbe traced back to the work presented in [3], and consists in ex-tracting the orthogonal transformation that maps one set of pointsinto a second set of points, with known associations betweenthem. It is closely related to Wahba's problem [4] and to theKabsch algorithm [5]. The generalization for rotation, translation,and scaling has been the subject of extensive research in areassuch as computer vision, and can be traced back to [6–8]. Whileinitially the problem was posed without any restrictions on the

(P. Lourenço),

cnico.ulisboa.pt (P. Oliveira),

transformation between the sets, i.e., rotations and reflectionswere allowed, a more evolved strategy appeared restricting thetransformation to the special orthogonal group, as detailed in [6,9].Furthermore, Goryn and Hein [10] demonstrated that the previoussolutions are optimal even when both data sets are perturbed withisotropic and identical Gaussian noise.

The statistical characterization of the Procrustes analysis hasalso been the subject of study in works such as [2,9,11,12]. Usingperturbation theory, the nonlinear problem of characterizing theuncertainty was addressed with some limiting options, such as theabsence of weighting of the point sets, the use of small rotations,or the same covariance for all points. Within the field of medicalimaging, the work presented in [13] also resorts to perturbationtheory to present a statistical characterization of a target position,considering small rotations, isotropic uncertainty, and equalweights for each point. More recently, the work presented in [14]extends these results for anisotropic uncertainty in the compo-nents of the point space. This is achieved by considering the samecovariance matrix for all points, which may weigh each compo-nent of the point space independently. The authors of [15] furtherexpand this by considering different noise levels for each point,while keeping the linearized model for the rotation matrix. Aninteresting advance in the study of the uncertainty is the first or-der error propagation proposed in [16]. The optimization problem

Page 2: Uncertainty characterization of the orthogonal …...Uncertainty characterization of the orthogonal Procrustes problem with arbitrary covariance matrices Pedro Lourençoa,n, Bruno

P. Lourenço et al. / Pattern Recognition 61 (2017) 210–220 211

that is considered is not weighted and therefore identical isotropicnoise is assumed for all the points. The author defines a first ordererror model that is propagated through the solution, while as-suming independent and identically distributed points (no longernecessarily isotropic). It is noted that the findings of the afore-mentioned works are all restricted to three-dimensional points. In[17] a different optimization problem is proposed that accounts forindependent anisotropic noise affecting rotated-only point setsalso in three dimensions. The authors determine the theoreticallower bound for the covariance of the rotation error in that case,and through an iterative solution recurring to quaternion re-presentation reach the theoretical bound. Besides the iterativesolution, some shortcomings of this work are its limitation to thetridimensional problem with rotation-only, and the fact that, al-though anisotropic, the input covariances are normalized andshare a common normalizing factor. Regarding the stability of thesolution, Söderkvist [18] addresses the study of this issue whenthe algorithm is exposed to perturbed data sets, concluding thatthe singular values of the matrices composed with the points ineach set are closely related to the conditioning of the problem,while finding a bound for the perturbation on the rotation matrixwhen the input perturbations are bounded. In related directions ofresearch that demonstrate the relevance of pattern point match-ing, and, consequently, of point registration problems such as theProcrustes problem, the authors of [19,20] propose algorithms thatexploit different approaches to registration and matching. Fur-thermore, the latter is an iterative algorithm that, assuming a ro-tation–scaling–translation transformation between two sets ofpoints, finds the point correspondences and a variational Bayesianapproximation for the distribution of the transformation.

This paper addresses the n-dimensional (n-D) extended or-thogonal Procrustes problem considering a transformation com-posed of a rotation and a translation (no scaling). The problem isposed with individual scalar weights for each pair of points, and aclosed-form solution is presented. Data association is assumed tobe performed a priori. Founded on perturbation theory, a noveland general uncertainty description for the solution of the opti-mization problem is proposed. Building on the results presented in[13,14,16], and assuming a stochastic perturbation model for thepoint sets with individual covariance matrices for each point, aswell as cross-covariances for each pair of points, the first andsecond moments of the resulting translation and rotation arecomputed. This is achieved considering arbitrary rotations andtranslations, individual weights, and full covariance matrices forboth point sets. As a by-product of this work, an application torobotics was proposed in [21,22] within the scope of simultaneouslocalization and mapping [23]. In this application, if a landmarkmap (or set of points) is available in a coordinate frame attached tothe robot, it is possible to compute the transformation betweenthat frame and another frame fixed to the initial position of therobot. Following this idea, an online Earth-fixed trajectory andmap estimation algorithm based on the Procrustes problem wasproposed and its uncertainty characterization derived, making fulluse of the methodology proposed in this paper. This builds on theprevious works by the authors, where globally asymptoticallystable filters for simultaneous localization and mapping in a sen-sor-based or robocentric framework were proposed for bidimen-sional [24] and tridimensional mission scenarios [25]. The per-formance and consistency of the overall algorithm are validated ina real world environment for both dimensionalities, showing thatthe algorithm provides accurate and consistent estimates, and,therefore, also providing an experimental validation of the un-certainty characterization proposed in this paper.

The contributions of this paper are: (i) the full uncertaintycharacterization of the optimization problem of obtaining thetransformation between corresponding n-dimensional point sets

and its closed-form solution, while considering point sets per-turbed by anisotropic noise, and points that are not required to beindependent nor identically distributed; and (ii) a thorough vali-dation of the uncertainty characterization, using extensive MonteCarlo simulations to study the main properties of the proposedmethodology. This paper builds on the preliminary versions of thiswork presented in [21,22], by reformulating the problem of ob-taining the pose of the vehicle, while extending the derivationtherein to points of arbitrary dimensions. In contrast with thelatter work, this paper provides new theoretical results, gen-eralizes the proposed uncertainty characterization to

n, andprovides statistical validation through extensive Monte Carlo si-mulations for several dimensions and a multitude of parametercombinations.

The applications of Procrustes analysis are found in a widevariety of fields, which can benefit from the proposed approach,including rigid body motion, vibration tests of large complexstructures [26], structural and system identification, factor analysisin n-D (e.g. checking whether two matrices are equivalent), simi-larity evaluation in statistical data sets [27, Chapter 20], medicalimaging [14], photogrammetry [28], shape comparison (general-ized Procrustes analysis) [29], and quantitative psychology [30](where the problem was initially solved). In recent years severalalgorithms were developed in the field of computer vision thatavailed themselves of the Procrustes problem, from shapematching and retrieval [31] to similarity search in image collec-tions [32], among others. Shape matching is in fact a more com-plex problem, as the problem of finding the transformations iscoupled with the problem of finding the reference shape to whichall the measured shapes relate. In [33] the authors propose aunifying framework that has a closed-form computation for affine,similarity or Euclidean transformations between a set of shapes,while allowing us to find the underlying shape and accounting formissing pairs of points. All this is performed considering noise inthe measured shapes and not in the reference-space as is cus-tomary. Other applications include non-negative matrix factor-ization [34], and phase FIR filter bank design [35], whereas thework in [36] underlines the importance of addressing the problemin less common dimensionalities, such as four-dimensionalshapes. Another possible application of the Procrustes problem liesin iterative closest point algorithms such as [37], even thoughmost use quaternions to parametrize the rotation of the sets. If theregistration is performed in each step with a constrained leastsquares approach, it can benefit from the characterization hereproposed. Another interesting application of Procrustes analysis ismanifold alignment [38] in the area of machine learning. In this n-dimensional technique, it is argued that it is possible to model theunderlying structure of most datasets by manifolds, whose align-ment then allows for knowledge transfer across datasets. The au-thors of [38] demonstrate the validity of this approach by applyingthe idea to learning transfer in reinforcement learning with Mar-kov Decision Processes, alignment of the tertiary structure ofproteins, cross lingual information alignment, within others. Thesedemonstrate the real world relevance of the n-dimensional Pro-crustes problem in several fields even for dimensionalities outsideof the 2-D/3-D common problems. Furthermore, given the noisynature of these problems, the proposed uncertainty characteriza-tion can be useful to compute the reliability of the alignment re-sulting from the Procrustes procedure.

Paper structure: The paper is organized as follows. Section 2presents a brief overview of some mathematical concepts neededin the course of this paper. Section 3 presents the formulation andclosed-form solution of the weighted orthogonal Procrustes pro-blem. A novel uncertainty characterization of this problem is de-rived in Section 4 and validated in Section 5 through extensive

Page 3: Uncertainty characterization of the orthogonal …...Uncertainty characterization of the orthogonal Procrustes problem with arbitrary covariance matrices Pedro Lourençoa,n, Bruno

P. Lourenço et al. / Pattern Recognition 61 (2017) 210–220212

Monte Carlo simulations. Finally, Section 6 provides concludingremarks. Further figures depicting the simulations detailed in thissection are provided in the supplementary material, along withdetailed proofs to the results.

2. Preliminary definitions

This section serves the purpose of introducing the notationused in this paper, as well as a few definitions and propertiesneeded for the mathematical derivations in the sequel.

2.1. Notation

Throughout the paper, vectors and matrices are represented insmall and capital boldface letters, respectively. Scalar symbols areexpressed in italic: constants by capital letters, and scalar variablesin small letters. Particularly, the symbol ×0n m denotes an n�mmatrix of zeros (if only one subscript is present, the matrix issquare), In is an identity matrix with dimension n�n, and

( )…A Adiag , , n1 is a block diagonal matrix. The determinant of ageneric square matrix is denoted by | |A , and, for a generic matrix

∈ ×A n m, the Frobenius norm is adopted, i.e., ( )∥ ∥ =A AAtr T .

The operator s→ ( )× o nskew: n n yields the skew-symmetric

component of a square matrix, ( )( ) = −A A Askew T12

. Finally, the

expectation operator is denoted as ⟨·⟩, and the covariance matrixbetween two generic stochastic vectors ∈ a b, n is denoted by

( )( )Σ = − ⟨ ⟩ − ⟨ ⟩a a b babT

or Σa, if =a b. The Orthogonal Group is

denoted by { }( ) = ∈ = =×n X XX X X IO : :n n T T , and the Special Or-

thogonal Group is denoted by { }( ) = ∈ ( ) =n X XSO : O n : 1 .

2.2. Definitions and properties

In this paper, except when explicitly stated, the dimension ofthe space

n is arbitrary, i.e., all the derivations are valid for pointclouds in

n for all ≥n 2, and the rotations are expressed in thespecial orthogonal group ( )nSO . Note that the term rotation, androtation matrix, applies to all the orthogonal matrices of unitarydeterminant. This is related to the Lie algebra s ( )o n comprised ofskew-symmetric matrices that can be mapped to ( )nSO throughthe exponential map.

Definition 1. The matrix sω[ ] ∈ ( )o nS is a skew-symmetric matrixparametrized by the vector ω ω ω=[ ⋯ ] ∈ : n

T n1 p

p, with = ( − )npn n 1

2.

One possible parametrization is given by

( )

ω

ω ω ω ω

ω ω ω

ω ωω

[ ] =

( − ) ( − ) ⋯ ( − ) ( − )

⁎ ( − ) ⋯ ( − ) ( − )

⁎ ⁎ ⋮ ⋮

⁎ ⁎ ⁎ ⋱ ( − ) ( − )⁎ ⁎ ⁎ ⁎ ( − )⁎ ⁎ ⁎ ⁎ ⁎

−−

−−

−−

−−

⎢⎢⎢⎢⎢⎢⎢⎢⎢

⎥⎥⎥⎥⎥⎥⎥⎥⎥1

S

0 1 1 1 1

0 1 1 1

0

1 10 1

0

.

np npn

nn

n

npn

nn

n

n

21

22 3

11

23

2 42

2

22

1

The elements under the diagonal are automatically defined byω ω[ ] = − [ ]S ST , and are therefore omitted to avoid cluttering the

reading. Also note that, in general, ≥n np , except in the bidimen-sional case when np¼1 and the whole vector ω collapses to ω1.The unskew operator s ( ) →−

o nS : n1 p is related to this matrix as itextracts the vector that parametrizes a skew-symmetric matrix,

i.e., ( )ω ω[ ] =−S S1 .

Notice that, given ω ∈ np, other parametrizations could be

used to obtain ω[ ]S . This form was chosen without loss of gen-erality, as the results could be derived for any parametrization. Forthe tridimensional case (n¼3), this matrix is also known to encodethe cross-product, as in [ ] = ×S a b a b. Due to this fact, there is apossible anti-commutation between the two vectors given by

[ ] = − [ ] = [ ]S a b S b a S b aT . Even though these properties are solelyapplicable to the tridimensional case, a possible extension to allthe dimensions is the matrix ¯ ·⎡⎣ ⎤⎦S defined hereafter.

Definition 2. The generalized anti-commutation matrix¯ ∈ ×⎡⎣ ⎤⎦ S a n np parametrized by the vector ∈ a n, when associatedto the parametrization of ω[ ]S in (1), can be defined recursively by

for all = …i n3, , , with = ⋯⎡⎣ ⎤⎦a aa i iT

1: 1 and ¯ = [ − ]⎡⎣ ⎤⎦ a aS a1:2 2 1 .

Note that for vectors in three dimensions, this matrix degen-erates in the skew-symmetric matrix, i.e., ¯ = [ ]⎡⎣ ⎤⎦S a S a1:3 1:3 . Besidesthis fact, the matrix (2) exhibits a series of properties that are usedin this paper. These are summarized in the following lemma.

Lemma 1. The generalized anti-commutation matrix has the fol-lowing properties for every ∈ a b, n, ω ∈

np, constants α β ∈ , andinteger ≥n 2.

1. Linearity: α β α β¯ [ + ] = ¯ [ ] + ¯ [ ]S a b S a S b2. Anti-commutativity: ¯ [ ] = − ¯ [ ]S a b S b a3. Anti-commutation with a skew-symmetric matrix: ω ω[ ] = ¯ [ ]S a S a

T

4. Relation to the unskew operator: ( )− = ¯ [ ]−S ba ab S a bT T1 .

Proof. Due to the recursive nature of ¯ [ ]S a and ω[ ]S , it is possible toexpress all the sides of these identities recursively and thus theseproperties can be proven using mathematical induction. The proofis presented in Appendix A in the supplementary material. □

3. Procrustes optimization problem and closed-form solution

This section presents the formulation and closed-form solutionof the weighted extended orthogonal Procrustes problem, featur-ing individual weights for each pair of points, related by a trans-lation and rotation.

Consider the existence of two point sets in n, A and B, which

contain, respectively, the points expressed in an arbitrary frame{ }A and the same points expressed in some other frame { }B . Eachpoint ∈ai A nominally corresponds to a point ∈bi B, with

∈ ={ … }i m: 1, , and ≥m n, and that correspondence is expressedby = +a Rb ti i , where the pair ( ) ∈ ( ) × nR t, SO n fully defines the

transformation from frame { }B to frame { }A , as it represents therotation and translation from { }B to { }A . Given the relation be-tween the two sets, it is possible to define the error function

= − −e a Rb ti i i , that represents the error between the i-th pointestimate in A and its homologous in B, rotated and translatedinto frame { }A . Obtaining the pair ( )R t, is the purpose of theoptimization problem

( * ) = ( )( )

∈ ( )∈

GR t R t, arg min , ,

3nR

t

SOn

Page 4: Uncertainty characterization of the orthogonal …...Uncertainty characterization of the orthogonal Procrustes problem with arbitrary covariance matrices Pedro Lourençoa,n, Bruno

P. Lourenço et al. / Pattern Recognition 61 (2017) 210–220 213

where the function ( )G R t, is defined as

∑ σ Σ( ) = − − = ( − − )=

− −Gm m

R t a Rb t Y RX t1, :1 1

,i

m

i i iT

e1

2 2 1/22

where σ > 0i2 , ∈i , accounts for the intrinsic uncertainty of each

point pair, = [ ⋯ ] ∈ ×Y a am

n m1 and = [ ⋯ ] ∈ ×

X b bmn m

1 are, re-spectively, the concatenation of the point vectors expressed in frames{ }A and { }B , = [ ⋯ ] ∈ 1 1 1 T m is a vector of ones, and the weightmatrix Σe is a diagonal matrix whose entries are the weights

σ σ…, , m12 2 that model the point uncertainty. If, for example, one of the

points in a pair is missing, the corresponding σ −i

2 can be set to zero,thus allowing the cost function to account for missing data. Given thatthe true Σe is not known, these can be conservatively defined as

σ λ λ λΣ Σ Σ Σ= ( ) + ( ) ≥ ( + )R Ri max a max b max a bT2

i i i i , denoting λ (·)max asthe maximum eigenvalue. This weight matrix allows the use of theinformation regarding the different degrees of uncertainty of eachpoint pair.

The optimization problem (3) has a closed-form, numericallyrobust, and computationally efficient solution based on the workpresented in [6,2]. The weighted statistical properties of the pointsets A and B, in the form of their weighted centroids and cov-ariances, can be expressed in matrix form using the symmetricweight matrix Σ Σ Σ= − ∈− − − ×

W 11: e N eT

em m1 1 1 1

W, where

σ Σ= ∑ ==− −N 1 1:W i

mi

Te1

2 1 . The resulting expressions for the weigh-

ted centroids of the sets A and B, respectively μ μ ∈ ,A Bn, are

∑μ μσ Σ Σ= = ==

− − −

N N Na Y 1 X 1:

1 1, and :

1,A

W i

m

i iW

e BW

e1

2 1 1

whereas the weighted covariance Σ ∈ ×AB

n n is given by

∑ μ μσΣ = ( − )( − ) = ==

N N Na b YWX B:

1 1:

1.AB

W i

m

i i A i BT

W

T

W

T

1

2

Consider now the singular value decomposition of BT

= ( ) ( )UDV Bsvd . 4T T

The optimal rotation matrix, from the optimization problem (3), isgiven by

* = ( … ) ( )R U U V Vdiag 1, , 1, , 5T

which in fact is the optimal rotation between the two sets whentheir weighted centroids coincide. This solution is valid as long asthe covariance matrix ΣAB has rank −n 1, as reported in [6]. Theoptimal translation vector is

μ μΣ= ( − * ) = − *( )

⁎ −

Nt Y R X 1 R

1.

6We A B

1

Notice that the optimal translation is the vector that translates theweighted centroid of the points in B rotated to frame { }A to theweighted centroid of the points in A.

Some remarks on applications: The derivation here presentedand followed in the next section is detailed for the matching oftwo point clouds. Nevertheless, the problem is completelyequivalent to the general problem of matching two different ma-trices, considering that each point is a column of the corre-sponding matrix. When dealing with generalized Procrustes ana-lysis, for shape matching/registration, the problem at hand is ex-tended to include the estimation of a reference shape. This is notthe objective of the optimization problem (3). However, once areference shape is chosen or found (for example using the classicalalternation approach [1, Chapter 9] or the more evolved ap-proaches proposed in [33,29], the problem of finding the Euclideantransformation is exactly expressed by (3) and therefore all the

derivations in this and the following sections hold.

4. Uncertainty characterization

The goal of this section is to compute accurate approximationsof the uncertainty associated with the extended Procrustes pro-blem solution obtained above. The study of the statistical prop-erties of the orthogonal Procrustes problem can be traced back to[11,2], which used perturbation theory to find the approximatedistribution of the cost functional ( )G R t, , assuming Σ =− Ie

1 , smallrotations, and the same covariance matrix for each landmark.These results have been extensively applied to the field of point-based medical image registration problems, as detailed in [13–15].The work in [16] eliminates the assumptions on the rotation, butstill addresses the optimization problem without weights andconsiders that the points are independent and identically dis-tributed, albeit with anisotropic uncertainty. The analysis pre-sented hereafter is based on the aforementioned works and aimsat providing approximate uncertainty descriptions for the trans-formation parameters, *R and ⁎t . This uncertainty characterizationis achieved while considering arbitrary rotations and translationsand using individual weights as well as individual covariancematrices for each landmark and cross covariance terms.

Within the scope of perturbation theory, the error models ofthe known variables are defined as

= + ϵ + (ϵ ) ( )( ) ( )a a a 7ai i i0 1 2

= + ϵ + (ϵ )( ) ( )b b b0 1 2

( )7bi i i

for all ∈i , where ϵ is the smallness parameter, the notation(ϵ )m stands for the remaining terms of orderm or higher, the zero

order terms are the true values, hence ⟨ ⟩ =( ) ( )a ai i0 0 and ⟨ ⟩ =( ) ( )b bi i

0 0 ,

whereas the first order terms, (·)( )1 , are assumed to follow a knowndistribution with zero mean and covariance matrices defined by

Σ =⟨ ⟩( ) ( )a a:a i j

T1 1

ij and Σ =⟨ ⟩( ) ( )b b:b i j

T1 1

ij , respectively, for all ∈i j, . Notethat each of these clouds has cross covariance terms between theirpoints, as suggested in the above covariance matrices expressions.The optimal translation vector is assumed to have an error modelwith a similar structure to that of (7),

= + ϵ + (ϵ ) ( )⁎ ( ) ( )t t t . 80 1 2

4.1. Rotation uncertainty

The rotation matrix obtained through the optimization processdescribed before is restricted to the special orthogonal group ( )nSO .However, nothing is known about the components of its error model.The following lemma is introduced to address these quantities.

Lemma 2. Consider the generic error model

= + ϵ + (ϵ ) ( )( ) ( )M M M . 90 1 2

If the matrix M belongs to the orthogonal group ( )nO then ( )M 0 be-longs to the same matrix space, its determinant is equal to the de-terminant of M, and ( )M 1 has the special structure

ω ϖ= [ ] = [ ]( ) ( ) ( )M S M M S1 0 0 , with sω ϖ[ ] [ ] ∈ ( )o nS S, and

ω ϖ ∈( − )

,n n 1

2 . Furthermore, if ∈ ( )nM SO , then ∈ ( )( ) nM SO0 .

Proof. The proof is made by exploring the algebraic constraintsimposed by the matrix spaces ( )nO and ( )nSO and using errormodel (9). The identity =M M IT is expanded, resulting in an un-perturbed part that yields ∈ ( )( ) nM O0 and a perturbed part that

Page 5: Uncertainty characterization of the orthogonal …...Uncertainty characterization of the orthogonal Procrustes problem with arbitrary covariance matrices Pedro Lourençoa,n, Bruno

P. Lourenço et al. / Pattern Recognition 61 (2017) 210–220214

implies that ω ϖ= [ ] = [ ]( ) ( ) ( )M S M M S1 0 0 . Finally, the determinant ofM noting the structure of ω[ ]S is expanded. The proof is presentedin Appendix B in the supplementary material.□

From the solution to the optimization problem in [6, Lemma], itis known that * ∈ ( )nR SO . If (9) is applied to *R , the following errormodel results

ω* = [ + ϵ [ ] + (ϵ )] ( )( )R I S R . 102 0

To ascertain whether ∈ ( )( ) nR SO0 corresponds to the true rotation,consider that matrix B, used to compute the estimated rotation,can be described in terms of its error model, using that of matricesX and Y , which are a generalization of (7b) and (7a), respectively.This results in

= + ϵ + (ϵ ) ( )( ) ( )B B B , 110 1 2

with =( ) ( ) ( )B X WY

T0 0 0

and = +( ) ( ) ( ) ( ) ( )B X WY X WY

T T1 1 0 0 1

. Further-more, the singular value decomposition (4) of B can itself be ex-panded defining similar models for each of its components[39], i.e., = + ϵ + (ϵ )( ) ( )U U U0 1 2 , = + ϵ + (ϵ )( ) ( )D D D0 1 2 , and

= + ϵ + (ϵ )( ) ( )V V V0 1 2 , which yields

= + (ϵ)( ) ( ) ( )B U D V .

T0 0 0

Comparison with the error model (11) yields

=( ) ( ) ( ) ( ) ( )X WY U D V

T T0 0 0 0 0

, thus confirming that these terms arecomposed of true quantities only. Combining the error model ofthe optimal rotation matrix with its solution (5) yields

+ (ϵ) = ( … ) + (ϵ)( ) ( ) ( ) ( ) ( )R U U V Vdiag 1, , 1, .

T0 0 0 0 0

Note that, as U and V are orthogonal matrices, Lemma 2 applies,and, therefore, not only ( )U 0 and ( )V 0 are orthogonal matrices, buttheir determinants correspond to the determinants of the per-turbed versions, a fact that was used in the last expression. Fromthis, it follows that ( )R 0 is the true rotation as it is obtained fromthe singular value decomposition of ( )B 0 , itself computed using thetrue terms ( )ai

0 and ( )bi0 , for all ∈ { … }i m1, , . As ( )R 0 is the true

rotation and given the structure of (10), the vector ω can be seenas a rotation error around the true rotation matrix.

From the proof of Umeyama [6, Lemma], it is known that thematrix *BR is symmetric, and thus, using the error models (10) and(11) one can write,

ω( *) = ( ) + ϵ ( + [ ] ) =( ) ( ) ( ) ( ) ( ) ( )BR B R B R B S R 0skew skew skew .0 0 1 0 0 0

This implies that each skew operator is null, meaning that ( ) ( )B R0 0

is also symmetric and that, after left multiplication by ( )R 0 , right

multiplication by( )

RT0, and using the identity ω ω[ ] = − [ ]S ST ,

ω( [ ]) = ( ) ( )( ) ( ) ( ) ( )R B S R Bskew skew . 12T0 0 0 1

Further manipulation of this formula is needed in order to obtain atractable expression for ω. Consider the matrix ω[ ]( ) ( )R B ST0 0 ex-pressed as a summation of point terms, noting that = +( ) ( ) ( )a r ti i

0 0 0

with =(·) ( ) (·)r R b:i i0 for all ∈i ,

∑ ∑( )

ω ω ωσ σ[ ] = [ ] − [ ]( ) ( )

=

− ( ) ( )

=

− ( ) ( )⎡⎣⎢⎢

⎤⎦⎥⎥ 13N

R B S r r S r r S1,T

i

m

i i i

TT

W j

m

j i j

TT0 0

1

2 0 0

1

2 0 0

The same can be done with ( ) ( )R B0 1 , yielding

∑ σ= [ ¯ + ¯ ]( )

( ) ( )

=

− ( ) ( ) ( ) ( )R B r a r a ,14i

m

i i i i i0 1

1

2 0 1 1 0T T

where μ¯ = −(·) (·) (·)a a:i i A .Due to the skew-symmetric nature of (12), it is possible to

apply the unskew operator to both sides. This enables us to applyProperties 4 and 2 of Lemma 1 combined with (13) and (14),yielding

∑ ∑

ω

ω ωσ σ σ

( ( [ ]))

= ¯ [ ] [ ] − ¯ [ ] [ ]( )

− ( ) ( )

=

− − ( ) ( )

=

− ( ) ( )

N

S R B S

S r S r S r S r

2skew

1,

15

T

W i j

m

i j i ji

m

i i i

1 0 0

, 1

2 2 0 0

1

2 0 0

whereas the right-hand side becomes

∑ σ= ( ( )) = (¯ [ ¯ ] − ¯ [ ] ¯ )( )

− ( ) ( )

=

− ( ) ( ) ( ) ( )c S R B S a r S r a: 2skew .16i

m

i i i i i1 0 1

1

2 0 1 0 1

Using Property 3 of Lemma 1 in (15), one obtains a linear matrixequation after applying the unskew operator to both sides of (12),given by ω =A c, where the matrix ∈ ×

A n np p is defined as

∑ ∑σ σ σ= − ¯ [ ] ¯ [ ] + ¯ [ ] ¯ [ ]=

− ( ) ( )

=

− − ( ) ( )

NA S r S r S r S r:

1.

i

m

i iT

iW i j

m

i j iT

j1

2 0 0

, 1

2 2 0 0

From the linear equation now derived it is possible to obtain ω,as long as A is invertible. It can be shown that this result degen-erates into the main result of [16] when assuming independentand identically distributed points in

3. However, the existenceand uniqueness of the solution is not addressed there. That is thefocus of the forthcoming theorem, addressing the conditions un-der which the invertibility of A in n-dimensional space holds.

Theorem 1. The matrix A is invertible if and only if there are at leastn points in the set B whose connecting vectors span −

n 1.

Proof. The proof is made by manipulating =u Au 0T so that a clearrelation between the landmarks and u appears, allowing us toexplore the nature of [·]S to analyze the solutions of =u Au 0T for∥ ∥ =u 1 when A is singular. It can be found in Appendix C in thesupplementary material. □

By observation of (16), it can be seen that ⟨ ⟩ =c 0, as its un-

certain elements are only ( )ri0 and ¯ ( )ai

0 , which have zero mean.

Therefore ω⟨ ⟩ = ⟨ ⟩−A c1 is also zero, and the corresponding covar-iance matrix is given by

ωω σ σ

σ σ

σ σ

σ σ

Σ Σ= ⟨ ⟩ = [ ¯ [ ¯ ] ¯ [ ¯ ]

+ ¯ [ ]⟨ ¯ ¯ ⟩ ¯ [ ]

− ¯ [ ¯ ]⟨ ¯ ⟩ ¯ [ ]

− ¯ [ ]⟨ ¯ ⟩ ¯ [ ¯ ]]

ω−

=

− − ( ) ( ) ( ) ( )

=

− − ( ) ( ) ( )

=

− − ( ) ( ) ( ) ( )

=

− − ( ) ( ) ( ) ( ) −

A S a R R S a

S r a a S r

S a r a S r

S r a r S a A .

T

i j

m

i j i b

T Tj

i j

m

i j i i j j

i j

m

i j i i j j

i j

m

i j i i j

T

j

T

1

, 1

2 2 0 0 0 0

, 1

2 2 0 1 1 0

, 1

2 2 0 1 1 0

, 1

2 2 0 1 1 0 1

ij

T

T

The individual covariances between points of the two sets aredetailed in Appendix D of the supplementary material.

4.2. Translation uncertainty

The optimal translation between frames is given by (6) andthe associated error model is assumed to be (8). Usingthis information along with the error models for the points in bothsets, defined in (7), it is possible to expand (6) to obtainexpressions for the error model components, the true translation

σ= ∑ [ − ]( )=

− ( ) ( ) ( )t a R bN i

mi i i

0 11

2 0 0 0

W, and the first order perturbed part

Page 6: Uncertainty characterization of the orthogonal …...Uncertainty characterization of the orthogonal Procrustes problem with arbitrary covariance matrices Pedro Lourençoa,n, Bruno

P. Lourenço et al. / Pattern Recognition 61 (2017) 210–220 215

∑ ωσ= ( − [ ] − )( )

( )

=

− ( ) ( ) ( )

Nt a S r r

1.

17W i

m

i i i i1

1

2 1 0 1

It is confirmed that ( )t 1 has zero mean, noting that all the pertur-bation quantities involved have zero mean. Therefore, the covar-iance matrix of the position estimate Σt is simply given by

Σ = ⟨ ⟩( ) ( )t tt

T1 1

, and it can be computed by expanding ( )t 1 accordingto (17), using Property 3 of Lemma 1 to extract ω from the skew-symmetric matrix. This yields

ω ω

ω ω

σ σΣ Σ Σ Σ

Σ Σ

= ( + + ¯ [ ] ¯ [ ]

− ¯ [ ]⟨ ⟩ − ⟨ ⟩¯ [ ]

− + ¯ [ ]⟨ ⟩ + ⟨ ⟩¯ [ ] − )

ω=

− −( ) ( ) ( ) ( )

( ) ( ) ( ) ( )

( ) ( ) ( ) ( ) ( ) ( )

NR R S r S r

S r a a S r

R S r r r S r R ,

ti j

mi j

W

a b

T Ti j

Ti j

T

iT

j

b aT

i j

T

iT

j a b

T

, 1

2 2

20 0 0 0

0 1 1 0

0 0 1 1 0 0

ij ij

i j i j

4.3. Cross translation-rotation uncertainty

The optimal translation estimate is obtained using the optimalrotation, and for that reason there is a correlation between thetwo. The corresponding cross covariance is given by

ω

ω

σ σ

σ σ

Σ Σ Σ

Σ

= ¯ [¯ ]( − ⟨ ⟩¯ [ ] − )

+ ¯ [ ]( + ⟨¯ ⟩ ¯ [ ] − ⟨¯ ⟩)

ω−

=

− −( ) ( ) ( ) ( ) ( ) ( )

=

− −( ) ( ) ( ) ( ) ( ) ( )

N

N

A S a R r S r R R

A S r R a S r a a ,

ti j

mi j

Wi b a i

Tj b j

T

i j

mi j

Wi a b

T

iT

j i j

T

1

, 1

2 20 0 1 0 0 0

1

, 1

2 20 0 1 0 1 1

i j i

i j

where the cross covariances between points in either sets and therotation error are computed in the same manner, and are detailedin Appendix D in the supplementary material. Note that Σa bi j mayin most cases be zero due to the possible independence of the sets.

Remark 1. As the true quantities ( )R 0 , ( )ai0 , and ( )bi

0 are unknownfor all ∈i , a possible approximation is to use the optimal valuesfor the rotation and translation and the original perturbed pointsai and bj instead. This applies to all the covariance expressionsderived in this section.

5. Algorithm validation

In this section, the proposed uncertainty characterization isvalidated through extensive Monte Carlo simulations. Despite thegeneral derivation presented above, the numerical validation re-quires the particularization for one or several dimensions. Al-though bidimensional and tridimensional problems are the mostcommon in the literature, as in [24,25], the study of rotations andother quantities in seven-dimensional space has also been ad-dressed in [40]. For these reasons and to represent meaningfulexamples, this section will focus on those dimensionalities.

As the optimization and uncertainty description presented inthis paper are intended to work for any rotation and translation,and taking into account that some approximations were introduced,a Monte Carlo simulation with N¼1000 samples was performed foreach of M¼500 random configurations, to validate the proposedtechnique. In each of these configurations, the transformation isuniformly distributed, with each component of the translationvector ranging from �10 to 10 m, the locations of the points from0 to 10 m in each component, and the rotation matrix is built usingthe exponential map = θ[ ]R eS u , where θ is uniformly distributedbetween π− and π radians and u is an uniformly distributed ran-dom unit vector. In order to comprehensively test the properties ofthe proposed method, the covariance matrices used to generate the

normally distributed location errors of the points in each set arecomplete covariances with cross-correlations between all the pointsand different individual covariances. In the two-dimensional case,the eigenvalues of the covariance matrices are ( )0. 01 , 0. 052 2 .For the three-dimensional case, the eigenvalues are( )0. 01 , 0. 03 , 0. 052 2 2 , and the seven-dimensional eigenvalues are

( )0. 01 , 0. 017 , 0. 023 , 0. 03 , 0. 037 , 0. 043 , 0. 052 2 2 2 2 2 2 . In each ofthese cases, all the permutations of eigenvalues are used an equalnumber of times. The full covariance for each point cloud is then

λ λΣ = ( … )U Udiag , ,nomnm

T1 , where the eigenvector matrix U is an

nm�nm rotation matrix built following the same reasoning used tobuild R .

The meaningful comparison between two different models,usually denoted as the null model and the alternate model, is adifficult problem which is frequently performed using likelihoodratio tests. In short, the likelihood ratio expresses how many timesmore likely the data is explained under one model than the other,by obtaining the probability distribution of the test, assuming thenull model to be true (null hypothesis). Thus, the null hypothesis isrejected if its probability is below a desired significance value,typically 0.01 or 0.05. However, a central limitation for the usageof these tests is the assumption that the stochastic variable followsa specific distribution, usually Gaussian. Indeed, and as a con-sequence of the considered arbitrary rotations, the translationtends to be Gaussian distributed as the number of points increases,but may fail to be so for a reduced number of landmarks, which isin accordance with the central limit theorem.

Fig. 1 and Tables 1 and 2 summarize the likelihood ratio testsfor the comparison between the sample covariance matrices re-

sulting from the Monte Carlo simulations, Σωsim, Σt

sim, and the joint

error covariance Σξsim where ωξ = [ ]tT T T , and the covariance ma-

trices resulting from the proposed methodology and respectiveuncertainty approximation, Σω

opt , Σtopt , and Σξ

opt . The definition ofeach ratio depends on whether the quantity is a scalar or a matrix.For n¼2, the rotation error is a scalar, and therefore its covariance

will be too. Then, the ratio for the null hypothesis Σ Σ=ω ω ω=H :n sim opt2

is defined as

λ ΣΣ

χ= ∼ωω

ω

⁎ ⁎⁎N:

sim

opt N2

which is asymptotically χ2 distributed with = −⁎N N: 1 degrees offreedom and, thus, can be compared with a predefined sig-nificance level, which in the presented results is considered to be0.01. For the multivariable cases of the null hypothesis of the

general case for rotation Σ Σ=ω ω ω≥H :n sim opt3 , the translation

Σ Σ=H :t tsim

topt and the joint error covariance Σ Σ=ξ ξ ξH : sim opt , the

specific likelihood ratio test can be found in [41, Section 10.8]. Thelikelihood ratio for these cases is defined as

λ Σ= Σ⁎⁎

−− ( )

⁎ ⁎−

⎜ ⎟⎛⎝

⎞⎠N

Be

e ,pN

sim optN

B1

12 1

12 1

2tr sim opt 1

where Σ= ⁎NB :sim sim, and the comparison with the significance le-vel is achieved by noting that λ− ⁎2 log 1 is asymptotically χ2 dis-

tributed with ( + )p p 112

degrees of freedom where p is the di-mension of the vector (n for the translation, np for the rotationerror, and +n np for the joint error vector ξ). Hence, for the like-lihood ratio tests, λω

⁎ is compared with a threshold

α= ( − )ω χ= −

⁎t F: 1n 2 1

N

2 in the bidimensional case, and the remaining

likelihood tests are made by comparing λ− ⁎2 log 1 with similar

thresholds α= ( − )ωχ

≥ −t F: 1n 3 1

np

2 , α= ( − )χ

−t F: 1t1

n

2 and α= ( − )ξχ

+

t F: 11

np n

2 ,

Page 7: Uncertainty characterization of the orthogonal …...Uncertainty characterization of the orthogonal Procrustes problem with arbitrary covariance matrices Pedro Lourençoa,n, Bruno

Fig. 1. The cumulative distribution of the likelihood ratios for the 2-D case including all the covariances built with each particular simulation values and different input noisecovariance profiles (profile 1 in the left, 3 in the center, and 4 in the right columns). Rotation error ω (top row), translation error (center row) and joint error vector (bottomrow).

Table 1Worst-case covariance likelihood ratio tests for n¼2 (left) and n¼3 (right).

n=2 n=3

m ωH (%) Ht (%) ξH (%) ωH100 (%) Ht100 (%) ξH100 (%) ωH (%) Ht (%) ξH (%) ωH100 (%) Ht

100 (%) ξH100 (%)

2 77.2 27.8 29.2 100 71.4 73.23 94.4 51.8 55.4 100 90.2 90.4 67.6 13.4 9.8 95.4 60.0 60.24 96.6 65.4 70.8 100 96.2 95.4 88.4 32.8 27.8 98.0 84.8 85.85 96.8 79.4 82.8 100 97.6 97.8 93.8 50.0 47.4 98.8 93.8 93.67 96.4 87.6 89.2 100 97.0 97.0 96.0 76.6 75.2 99.0 97.8 98.810 98.8 93.2 94.8 100 98.2 98.4 98.2 92.0 91.2 97.8 96.8 97.220 97.4 95.8 96.6 100 99.0 98.8 97.8 96.6 96.0 99.0 97.8 97.650 98.2 98.6 98.4 100 99.6 99.2 98.8 98.0 97.8 99.2 99.0 98.2100 99.0 98.4 98.6 100 99.0 99.6 98.4 98.0 97.6 98.6 99.4 99.4

Table 2Worst-case covariance likelihood ratio tests for n¼7.

m ωH (%) Ht (%) ξH (%) ωH100 (%) Ht100 (%) ξH100 (%)

7 61.4 1.6 0.0 84.4 32.2 7.410 95.2 28.6 9.0 89.0 89.8 56.415 98.0 74.6 65.2 93.0 97.2 70.220 98.2 90.2 86.0 90.8 98.2 73.430 98.4 96.6 94.0 93.0 98.6 75.850 98.6 97.6 97.4 93.4 98.4 76.0100 98.4 97.8 98.2 93.0 99.0 79.6

P. Lourenço et al. / Pattern Recognition 61 (2017) 210–220216

all with α = 0.01.As the true values of the coordinates of the points and of the

rotation matrix are not available in real-world scenarios, thecovariances resulting from the uncertainty description of theprevious section must be computed using the perturbed quantities

instead, therefore depending on the actual values of the simula-tion. The covariances and respective likelihood ratios are com-puted for each sample of the Monte Carlo simulations, that is, for

each configuration, one Σsim and N instances of Σopt are computed,resulting in N�M covariances, and consequently, the samequantity of ratios and tests. The highest ratio in each configurationis used for the hypothesis tests presented in Tables 1 and 2, thusrepresenting the worst-case scenarios (in terms of likelihood ra-tios) for each configuration and number of landmarks. The valuesshown in the tables for these tests denote the percentage of suc-cessful tests among the M different configurations for n¼2, n¼3,and n¼7, respectively. As mentioned before, the Gaussian as-sumption may not be valid when dealing with an arbitrary rota-tion R , which, together with the fact that the ratio expressions andthe degrees of freedom of their approximation depend on thenumber of Monte Carlo samples, the tests are expected to fail more

Page 8: Uncertainty characterization of the orthogonal …...Uncertainty characterization of the orthogonal Procrustes problem with arbitrary covariance matrices Pedro Lourençoa,n, Bruno

P. Lourenço et al. / Pattern Recognition 61 (2017) 210–220 217

pronouncedly when the number of Monte Carlo samples increases.For this reason, an additional set of likelihood ratio tests are pre-sented using a smaller number of samples, N¼100, and denoted asH100. From the values shown in these tables, it can be concludedthat, for more than 5 points in 2-D, 7 points in 3-D, and 15 pointsin 7-D, the obtained covariance matrices pass more than 70% ofthe tests in the worst-case, and can therefore be considered as agood approximation.

The cumulative distributions of the ratios depicted in Fig. 1,however, show that the worst-case scenario in terms of likelihoodratio is not representative of the majority of the computed cov-ariance matrices. The left side of this figure presents the cumula-tive distribution of the total N�M likelihood ratios, normalized bythe respective thresholds, and therefore depict the quantity ofsamples that is explained by the model according to the null hy-pothesis (area to the left of the vertical dashed line) as well as itsvariation with the number of available points for 2-D. Comparingwith the worst cases provided in the previous tables, it can be seenthat when using the covariances built with every sample in eachconfiguration the results are significantly better even when a lowratio of points per dimension is present. In particular, even withthe minimum amount of points ( = )m n/ 1 , it is possible to achievemore than 70% of positive rotation and translation tests for all thedimensions except for n¼7, where 10 points are needed to surpassthat threshold. In fact, in most cases the growth of the cumulativefunction is quite fast, reaching high values of cumulative percen-tage for relatively low values of the ratio. It is also confirmed thatadding a few points to the =m n/ 1 case leads to a relevant im-provement of the test results for all the three n, noting that eventhe joint error ratio displays success values above 90% for eachdimensionality, respectively with m¼3, m¼5, and m¼10. None-theless, the =m n/ 1 test results are degraded with the increase ofn (from 79% in 2-D to 64% in 3-D and 22% in 7-D for the jointerror), a tendency less apparent for the tests with ≥m n/ 2. Tobetter explore the various aspects of the uncertainty character-ization, and achieve a broader validation, additional simulationsfor n¼2 were performed with several noise scales, namely:(1) nominal noise; (2) B noise scaled by 102; (3) A noise scaled by102; and (4) both multiplied by 102 which are depicted in Fig. 1.Profiles 2 and 3 are very similar and only one is shown. One im-portant fact to note prior to this analysis is that, as the covariancesare built with the noise-perturbed point sets and the estimatedrotation, increasing the noise will inevitably lead to worse testresults in general. Notwithstanding this fact, for a reasonablenumber of points (m¼10) more than 70% of the samples pass thetest for all the quantities tested in all 4 profiles. Regardless of thenoise profile, the rotation error covariances perform much betterthan the translation in these tests, as 3 points are enough to havemore than 70% of positive tests for the rotation, while for thetranslation 7 points are needed for profiles 2 and 3, and 10 pointsfor the last profile. Furthermore, with 2 points the performancemay be below 20% in the translation for profile 4, while the ro-tation is never below 60% in any profile. Even though the left andcenter columns of Fig. 1 are very similar, the translation tests areworse in the former (around 1.4% less for eachm), when the highernoise is on B, while the rotation tests are better (around 0.7%higher for each m). This may be explained by the fact that, in themathematical expressions derived in Section 4.2, the actual valuesof the points in that set are always multiplied by the rotation,hence amplifying the influence of the noise in that set. When bothinput covariances are scaled, the results worsen as every dis-tribution crosses the threshold with around 10% less samples thanin profiles 2 and 3. Note however that in realistic situations, thesenoise levels would imply that 99% of the noisy points could be in aellipse with a major axis of 1.5 m around the true value σ( )3 whichis not a usual situation. Nevertheless, to further assess the

influence of the noise in the tests, new simulations were madewhere both nominal input covariances are multiplied by 1002. Inthese tests, the rotation still performs adequately (60% for 20points, 80% for 50 points), but the translation is far from that level(with 100 points only 3.37% of the samples pass the test). Never-theless, when the true values are used to compute the covariancesin these harsh noise conditions the performance is much better(55% for rotation, 44% for translation, and 12% for the joint errorvector, all for m¼2), confirming that it is also the fact that thecovariances must be built with the perturbed values that leads tothe worsening of the results. This is intrinsic to the nonlinearproblem at hand, and not specific to the proposedcharacterization.

To provide a better understanding of the underlying problem,Fig. 2 is provided, depicting the spatial distribution of the trans-lation error (red dots) with the corresponding 99% bound in da-shed black, along with σ3 bounds given by the median ratiocovariance matrix in solid blue, and the simulation covariance indashed green. Here a difficult case is presented, where all thepoints in B are evenly distributed along the x-axis, there is atranslation of 2 units in each axis and a 180° rotation between sets,with narrow input covariances on both sets,Σ Σ= = ( × × × )− − − −diag 10 , 9 10 , 4 10 , 16 10b a

5 3 5 3 for each twopoints. The baseline (Δ), i.e., the maximum distance betweenpoints in B, increases from the leftmost figure to the rightmost.When moving downwards, the number of equidistant points in-creases, maintaining the baseline. A careful analysis of the in-formation contained in Fig. 2 indicates that the baseline is thedominant factor that can lead to highly nonlinear uncertaintydistributions. As these nonlinearities are not captured by theproposed uncertainty computation, its performance will be de-graded in difficult cases. Although a short baseline leads to highlynonlinear error distributions, as the number of points increasesthese distributions become more conventional, although not asnotoriously as with the increase of the baseline. In fact, even withonly 2 points, a baseline of 4 m leads to a median covariance quitesimilar to the experimental covariance, and conservative to theactual shape of the 99% simulation bound.

In a further simulation, the rotation, the translation and theshape of the input noise covariance (narrow versus round) werevaried for only 2 points with a fixed baseline. It was observed thatthe influence of the rotation and translation is not evidently visiblein the results, further supporting that the baseline, the number ofpoints, and the shape and size of the input noise distributions arethe most significant parameters in both the results of the opti-mization problem and the validity of the uncertainty character-ization. In fact, these two aspects of the whole problem are closelyrelated: the uncertainty characterization here proposed is notadequate when the distribution of the error cannot be sufficientlycharacterized by its first two moments, mean and covariance, eventhough it demonstrates great accuracy when the error distributionis more conventional. In summary, narrow shapes of input noise,low baselines, and low number of points lead to non-conventionalshapes of error distributions, and bettering any of these threeparameters leads to more natural error distributions, especially forthe case of the baseline. Even though this places restrictions on thebaseline and number of points available for the optimization, it isby no means a very harsh limitation, as in the vast majority ofapplications it is common practice to use a relatively large numberof points and reasonable baselines when compared to noise levels.

6. Conclusion

In this paper, the uncertainty involved in the weighted orthogonal

Page 9: Uncertainty characterization of the orthogonal …...Uncertainty characterization of the orthogonal Procrustes problem with arbitrary covariance matrices Pedro Lourençoa,n, Bruno

Fig. 2. Translation error and uncertainty for a series of combinations of baseline-number of points. Translation error Monte Carlo samples in red, 99% bound in dashed black,σ3 simulation covariance ellipse in dashed green, and the σ3 covariance ellipse resulting from the uncertainty characterization with the median likelihood ratio. The baseline

(Δ = ∥ − ∥b bmax i j for all ≠i j, ∈i j, ) increases in each figure from left to right, whereas the number of points used in the optimization increases from top to down. Notethat the scaling changes from line to line as the number of points increases (ellipses in the bottom are smaller than those at the top). (For interpretation of the references tocolor in this figure caption, the reader is referred to the web version of this paper.)

P. Lourenço et al. / Pattern Recognition 61 (2017) 210–220218

Page 10: Uncertainty characterization of the orthogonal …...Uncertainty characterization of the orthogonal Procrustes problem with arbitrary covariance matrices Pedro Lourençoa,n, Bruno

P. Lourenço et al. / Pattern Recognition 61 (2017) 210–220 219

Procrustes problem for stochastically perturbed n-dimensional pointclouds was studied thoroughly and analytical expressions were de-rived for the first and second moments of the stochastic outputs, thetranslation and rotation, as well as cross terms that characterize theanisotropic uncertainty of the problem, not imposing assumptions onthe actual rotation and translation. These were obtained after as-suming an error model based on perturbation theory for each point inthe point clouds, and advancing similar error models both for thetranslation and the rotation matrix. This novel uncertainty character-ization for the Procrustes problem was validated through extensiveMonte Carlo simulations exploring the general framework of the al-gorithm, using arbitrary rotations and translations, as well as fullcovariance matrices for each point. For a relatively low number ofpoints, and reasonable noise conditions, the proposed uncertaintycharacterization performed very well in likelihood ratio tests thatencompass a wide variety of configurations varying in translation,rotation, and spatial distribution of points. A thorough analysis of theinfluence of several parameters on the error distribution and the va-lidity of the results was performed, resulting in the conclusion thatwith a reasonable baseline the uncertainty characterization performsaccurately. These results are consistent with the assumptions, andfurther validate this approach.

Possible directions of future work include applying this meth-odology to similar optimization problems (e.g. LiDAR calibrationfrom 3-D point clouds [42]), and research on the use of the un-certainty characterization in iterative Procrustes or ICP algorithms.

Acknowledgments

This work was supported by the Fundação para a Ciência e aTecnologia (FCT) through ISR under LARSyS UID/EEA/50009/2013,and through IDMEC, under LAETA UID/EMS/50022/2013 contracts,by the University of Macau Project MYRG2015-00126-FST, and bythe Macao Science and Technology Development Fund underGrant FDCT/048/2014/A1. The work of P. Lourenço and B. Guerreirowas supported, respectively, by the PhD. Grant SFRH/BD/89337/2012 and by the Post-Doc grant SFRH/BPD/110416/2015 from FCT.

Appendix A. Supplementary data

Supplementary data associated with this article can be found in theonline version at http://dx.doi.org/10.1016/j.patcog.2016.07.037.

References

[1] J.C. Gower, G.B. Dijksterhuis, Procrustes Problems, Oxford Statistical ScienceSeries, Oxford University Press, Oxford, UK, 2004.

[2] C. Goodall, Procrustes methods in the statistical analysis of shape, J. R. Stat.Soc. Ser. B (Methodol.) 53 (2) (1991) 285–339.

[3] P.H. Schönemann, A generalized solution of the orthogonal procrustes pro-blem, Psychometrika 31 (1) (1966) 1–10.

[4] G. Wahba, Problem 65-1: a least squares estimate of satellite attitude, SIAMRev. 7 (3) (1965) 409.

[5] W. Kabsch, A solution for the best rotation to relate two sets of vectors, ActaCrystallogr. Sect. A 32 (5) (1976) 922–923.

[6] S. Umeyama, Least-squares estimation of transformation parameters betweentwo point patterns, IEEE Trans. Pattern Anal. Mach. Intell. 13 (4) (1991)376–380.

[7] B.K.P. Horn, H. Hilden, S. Negahdaripour, Closed-form solution of absoluteorientation using orthonormal matrices, J. Opt. Soc. Am. 5 (7) (1988)1127–1135.

[8] K. Arun, T.S. Huang, S.D. Blostein, Least-squares fitting of two 3-D point sets,IEEE Trans. Pattern Anal. Mach. Intell. 9 (5) (1987) 698–700.

[9] K. Kanatani, Analysis of 3-D rotation fitting, IEEE Trans. Pattern Anal. Mach.Intell. 16 (5) (1994) 543–549.

[10] D. Goryn, S. Hein, On the estimation of rigid body rotation from noisy data,

IEEE Trans. Pattern Anal. Mach. Intell. 17 (12) (1995) 1219–1220.[11] R. Sibson, Studies in the robustness of multidimensional scaling: perturba-

tional analysis of classical scaling, J. R. Stat. Soc. Ser. B (Methodol.) 41 (2)(1979) 217–229.

[12] R. Sibson, Studies in the robustness of multidimensional scaling: procrustesstatistics, J. R. Stat. Soc. Ser. B (Methodol.) 40 (2) (1978) 234–238.

[13] J. Fitzpatrick, J. West, The distribution of target registration error in rigid-bodypoint-based registration, IEEE Trans. Med. Imaging 20 (9) (2001) 917–927.

[14] A.D. Wiles, A. Likholyot, D.D. Frantz, T.M. Peters, A statistical model for point-based target registration error with anisotropic fiducial localizer error, IEEETrans. Med. Imaging 27 (3) (2008) 378–390.

[15] M.H. Moghari, P. Abolmaesumi, Distribution of target registration error foranisotropic and inhomogeneous fiducial localization error, IEEE Trans. Med.Imaging 28 (6) (2009) 799–813.

[16] L. Dorst, First order error propagation of the procrustes method for 3D attitudeestimation, IEEE Trans. Pattern Anal. Mach. Intell. 27 (2) (2005) 221–229.

[17] N. Ohta, K. Kanatani, Optimal estimation of three-dimensional rotation andreliability evaluation, IEICE Trans. Inf. Syst. E81-D 11 (1998) 1247–1252.

[18] I. Söderkvist, Perturbation analysis of the orthogonal procrustes problem, BITNumer. Math. 33 (4) (1993) 687–694.

[19] J. Tang, L. Shao, X. Zhen, Robust point pattern matching based on spectralcontext, Pattern Recognit. 47 (3) (2014) 1469–1484.

[20] J. Christmas, R. Everson, J. Bell, C. Winlove, Inexact Bayesian point patternmatching for linear transformations, Pattern Recognit. 47 (10) (2014)3265–3275.

[21] B.J. Guerreiro, Sensor-based control and localization of autonomous systems inunknown environments (Ph.D. thesis). Instituto Superior Técnico, Universityof Lisbon, December 2013.

[22] P. Lourenço, B.J. Guerreiro, P. Batista, P. Oliveira, C. Silvestre, 3-D inertial tra-jectory and map online estimation: building on a GAS sensor-based SLAMfilter, in: Proceedings of the 2013 European Control Conference, Zurich,Switzerland, 2013, pp. 4214–4219.

[23] H. Durrant-Whyte, T. Bailey, Simultaneous localisation and mapping (SLAM):Part I the essential algorithms, IEEE Robot. Autom. Mag. 13 (2) (2006) 99–110.

[24] B.J. Guerreiro, P. Batista, C. Silvestre, P. Oliveira, Globally asymptotically stablesensor-based simultaneous localization and mapping, IEEE Trans. Robot. 29 (6)(2013) 1380–1395.

[25] P. Lourenço, B.J. Guerreiro, P. Batista, P. Oliveira, C. Silvestre, Simultaneouslocalization and mapping for aerial vehicles: a 3-D sensor-based GAS filter,Auton. Robots 40 (2016) 881–902.

[26] C. Beattie, S. Smith, Optimal matrix approximants in structural identification, J.Optim. Theory Appl. 74 (1) (1992) 23–56.

[27] I. Borg, P. Groenen, Modern Multidimensional Scaling: Theory and Applica-tions, Springer Series in Statistics, Springer, New York, USA, 2007.

[28] F. Crosilla, A. Beinat, Use of generalised procrustes analysis for the photo-grammetric block adjustment by independent models, {ISPRS} J. Photogr.Remote Sens. 56 (3) (2002) 195–209.

[29] L. Igual, X. Perez-Sala, S. Escalera, C. Angulo, F.D. la Torre, Continuous gen-eralized procrustes analysis, Pattern Recognit. 47 (2) (2014) 659–671.

[30] P. Schönemann, R. Carroll, Fitting one matrix to another under choice of acentral dilation and a rigid motion, Psychometrika 35 (1970) 245–255.

[31] G. McNeill, S. Vijayakumar, Hierarchical procrustes matching for shape retrival,in: Proceedings of the 2006 Conference on Computer Vision and Pattern Re-cognition (CVPR), CVPR, San Diego, CA, USA, 2006, pp. 885–894.

[32] Y. Gong, S. Lazebnik, A. Gordo, F. Perronnin, Iterative quantization: a pro-crustean approach to learning binary codes for large-scale image retrieval,IEEE Trans. Pattern Anal. Mach. Intell. 35 (12) (2013) 2916–2929.

[33] A. Bartoli, D. Pizarro, M. Loog, Stratified generalized procrustes analysis, Int. J.Comput. Vis. 101 (2) (2012) 227–253.

[34] K. Huang, N. Sidiropoulos, A. Swami, Non-negative matrix factorization re-visited: uniqueness and algorithm for symmetric decomposition, IEEE Trans.Signal Process. 62 (1) (2014) 211–224.

[35] B.-K. Ling, N. Tian, C.-F. Ho, W.-C. Siu, K.-L. Teo, Q. Dai, Maximally decimatedparaunitary linear phase fir filter bank design via iterative svd approach, IEEETrans. Signal Process. 63 (2) (2015) 466–481.

[36] K. Lekadir, N. Keenan, D. Pennell, G.-Z. Yang, An inter-landmark approach to4-D shape extraction and interpretation: application to myocardial motionassessment in MRI, IEEE Trans. Med. Imaging 30 (1) (2011) 52–68.

[37] S. Kaneko, T. Kondo, A. Miyamoto, Robust matching of 3d contours usingiterative closest point algorithm improved by m-estimation, Pattern Recognit.36 (9) (2003) 2041–2047.

[38] C. Wang, S. Mahadevan, Manifold alignment using procrustes analysis, in:Proceedings of the 25th International Conference on Machine Learning, ACM,New York, NY, USA, 2008, pp. 1120–1127.

[39] J. Liu, X. Liu, X. Ma, First-order perturbation analysis of singular vectors insingular value decomposition, IEEE Trans. Signal Process. 56 (7) (2008)3044–3049.

[40] W.S. Massey, Cross products of vectors in higher dimensional Euclideanspaces, Am. Math. Mon. 90 (10) (1983) 697–701.

[41] T. Anderson, An introduction to multivariate statistical analysis, in: WileySeries in Probability and Statistics—Applied Probability and Statistics SectionSeries, Wiley, Hoboken, NJ, USA, 1984.

[42] B.J. Guerreiro, C. Silvestre, P. Oliveira, Automatic 2-D LiDAR geometric cali-bration of installation bias, Robot. Auton. Syst. 62 (8) (2014) 1116–1129.

Page 11: Uncertainty characterization of the orthogonal …...Uncertainty characterization of the orthogonal Procrustes problem with arbitrary covariance matrices Pedro Lourençoa,n, Bruno

P. Lourenço et al. / Pattern Recognition 61 (2017) 210–220220

Pedro Lourenço received the B.Sc. and M.Sc. degrees in Aerospace Engineering in 201

0 and 2012, respectively, both from Instituto Superior Técnico (IST), Lisbon, Portugal, onthe course of which he received the Merit Award. He is currently pursuing his Ph.D. degree in Electrical and Computer Engineering with the Institute for Systems andRobotics, Laboratory for Robotics and Engineering Systems, at IST. His research interests include sensor-based simultaneous localization and mapping with RGB-D, vision,ranging, and bearing sensors.

Bruno J. Guerreiro received the Licenciatura and Ph.D. degrees in Electrical and Computer Engineering in 2004 and 2013, respectively, both from Instituto Superior Técnico(IST), Lisbon, Portugal, and his Ph.D. thesis received the Best Ph.D. Thesis Award by the Portuguese Robotics Society. He is currently a postdoctoral researcher at the DynamicSystems and Ocean Robotics Lab of the Institute for Systems and Robotics, at IST, and has been involved in several research and development projects regarding the use ofautonomous vehicles for automatic inspection of infrastructures and archaeological sites. His research interests include the sensor-based navigation and control of au-tonomous vehicles, laser calibration methods and 3-D reconstruction, as well as nonlinear model predictive control.

Pedro Batista received the Licenciatura degree in electrical and computer engineering in 2005 and the Ph.D. degree in 2010, both from Instituto Superior Técnico (IST),Lisbon, Portugal. From 2004 to 2006, he was a Monitor with the Department of Mathematics, IST. Since 2012 he is with the Department of Electrical and ComputerEngineering of IST, he where he is currently Assistant Professor. His research interests include sensor-based navigation and control of autonomous vehicles. Dr. Batista hasreceived the Diploma de Mérito twice during his graduation and his Ph.D. thesis was awarded the Best Robotics Ph.D. Thesis Award by the Portuguese Society of Robotics.

Paulo Oliveira received the Ph.D. degree in 2002 from the Instituto Superior Técnico (IST), Lisbon, Portugal. He is currently an Associate Professor with the Department ofMechanical Engineering, IST, Lisbon, Portugal and a Researcher with the Institute for Systems and Robotics, Laboratory for Robotics and Engineering Systems, Lisbon,Portugal. His research interests include Robotics and Autonomous Vehicles with special focus on the fields of sensor fusion, navigation, positioning, and estimation. He hasparticipated in more than 15 Portuguese and European Research projects over the last 20 years.

Carlos Silvestre received the Licenciatura degree in Electrical Engineering from the Instituto Superior Tecnico (IST) of Lisbon, Portugal, in 1987 and the M.Sc. degree inElectrical Engineering and the Ph.D. degree in Control Science from the same school in 1991 and 2000, respectively. In 2011 he received the Habilitation in ElectricalEngineering and Computers also from IST. Since 2000, he is with the Department of Electrical Engineering of the Instituto Superior Tecnico, where he is currently anAssociate Professor of Control and Robotics on leave. Since 2012 he is an Associate Professor of the Department of Electrical and Computers Engineering of the Faculty ofScience and Technology of the University of Macau. Over the past years, he has conducted research on the subjects of navigation guidance and control of air and underwaterrobots. His research interests include linear and nonlinear control theory, coordinated control of multiple vehicles, gain scheduled control, integrated design of guidance andcontrol systems, inertial navigation systems, and mission control and real time architectures for complex autonomous systems with applications to unmanned air andunderwater vehicles.