1 Rotation Averaging with the Chordal Distance: Global ...calle/papers/eriksson-etal-tpami-2019.pdf1...



Rotation Averaging with the Chordal Distance: Global Minimizers and Strong Duality

Anders Eriksson, Carl Olsson, Fredrik Kahl, and Tat-Jun Chin

Abstract—In this paper we explore the role of duality principles within the problem of rotation averaging, a fundamental task in a wide range of applications. In its conventional form, rotation averaging is stated as a minimization over multiple rotation constraints. As these constraints are non-convex, this problem is generally considered challenging to solve globally. We show how to circumvent this difficulty through the use of Lagrangian duality. While such an approach is well-known it is normally not guaranteed to provide a tight relaxation. Based on spectral graph theory, we analytically prove that in many cases there is no duality gap unless the noise levels are severe. This allows us to obtain certifiably global solutions to a class of important non-convex problems in polynomial time. We also propose an efficient, scalable algorithm that outperforms general purpose numerical solvers by a large margin and compares favourably to current state-of-the-art. Further, our approach is able to handle the large problem instances commonly occurring in structure from motion settings and it is trivially parallelizable. Experiments are presented for a number of different instances of both synthetic and real-world data.

Index Terms—Rotation averaging, Structure from Motion, Lagrangian duality, graph Laplacian, chordal distance.


1 INTRODUCTION

Rotation averaging appears as a subproblem in many important applications in computer vision, robotics, sensor networks and related areas. Given a number of relative rotation estimates between pairs of poses, the goal is to compute absolute camera orientations with respect to some common coordinate system. In computer vision, for instance, non-sequential structure from motion systems such as [1], [2], [3] rely on rotation averaging to initialize bundle adjustment. The overall idea is to consider as much data as possible in each step to avoid suboptimal reconstructions. In the context of rotation averaging this amounts to using as many camera pairs as possible.

The problem can be thought of as inference on a graph. An edge $(i, j)$ in this undirected graph represents a relative rotation measurement $\tilde{R}_{ij}$ and the objective is to find the absolute orientation $R_i$ for each vertex $i$ such that $R_i \tilde{R}_{ij} = R_j$ holds (approximately in the presence of noise) for all edges. The problem is generally considered difficult due to the need to enforce non-convex rotation constraints. Indeed, both $L_1$ and $L_2$ formulations of rotation averaging can have local minima, see Figure 1. Wilson et al. [4] studied local convexity of the problem and showed that instances with large loosely connected graphs are hard to solve with local, iterative optimization methods.

In contrast, our focus is on global optimality. In this paper we show that convex relaxation methods can in fact overcome the difficulties with local minima in rotation averaging. We utilize Lagrangian duality to handle the quadratic non-convex rotation constraints. While such an approach is normally not guaranteed to provide a tight relaxation, we give analytical error bounds that guarantee there will be no duality gap. For instance, it is sufficient that each angular residual is less than 42.9° to ensure optimality for complete graphs. Additionally, we develop a scalable and efficient algorithm, based on block coordinate descent, that outperforms standard semidefinite program (SDP) solvers for this problem. Further, we experimentally compare our approach to specialized state-of-the-art solvers.

• A. Eriksson is with the School of Information Technology and Electrical Engineering, University of Queensland, Brisbane, Australia.

• C. Olsson is with Chalmers University of Technology and Lund University, Sweden. F. Kahl is with Chalmers University of Technology, Sweden. T-J. Chin is with the School of Computer Science, The University of Adelaide, Australia.

Manuscript received; revised .

Fig. 1. In many structure from motion pipelines, camera orientations are estimated with rotation averaging followed by recovery of camera centres (red) and 3D structure (blue). Here are three solutions corresponding to different local minima of the same rotation averaging problem.

1.1 Related Work

Rotation averaging has been under intense study in recent years, see [1], [5], [6], [7], [8], [9]. Despite progress in practical algorithms, they largely come without guarantees. One of the earliest averaging methods was due to Govindu [10], who showed that when representing the rotations with quaternions the problem can be viewed as a linear homogeneous least squares problem. There is however a sign ambiguity in the quaternion representation that has to be resolved before the formulation can be applied. Additionally, since a quaternion only represents a rotation if its length is one, a norm constraint has to be incorporated for each rotation. It was observed by Fredriksson and Olsson in [11] that since both the objective and the constraints are quadratic, the Lagrange dual can be computed in closed form. The resulting SDP was experimentally shown to have no duality gap for moderate noise levels. In order to avoid solving a costly SDP for large scale problems, [11] proposed a verification scheme that, given a candidate primal solution, computes a dual solution and evaluates the duality gap. In this way it is possible to test and verify optimality of a candidate solution obtained using any feasible algorithm.

A more straightforward rotation representation is using 3×3 matrices. Martinec and Pajdla [1] approximately solve the problem by ignoring the orthogonality and determinant constraints. This allows simple optimization by computing eigenvectors of the resulting homogeneous system of equations. A similar relaxation was derived by Arie-Nachimson et al. in [12]. In addition, an SDP formulation was presented which is equivalent to the one we address here, but with no performance guarantees. The tightness of SDP relaxations for 2D rotation averaging is studied in [13].

A number of robust approaches have been developed to handle outlier measurements. A sampling scheme over spanning trees of the graph is developed by Govindu in [14]. Enqvist et al. [2] also start from a spanning tree and add relative rotations that are consistent with the solution. In [15] the Weiszfeld algorithm is applied to single rotation averaging with the $L_1$ norm. In [16] convexity properties of the single rotation averaging problem are given. To our knowledge these results do not generalize to the case of multiple rotations. In [17] a robust formulation is solved using IRLS and in [18] Cramér-Rao lower bounds are computed for maximum likelihood estimators, but neither with any optimality guarantees.

A closely related problem is that of pose graph estimation, where camera orientations and positions are jointly optimized. In this context Lagrangian duality has been applied [19], [20], both for optimization and for verification, similar to [11]. In [21] a consensus algorithm that allows for efficient distributed computations is presented. A fast verification technique for pose graph estimation was given in [22]. In a recent paper [23] an SDP relaxation for pose graph estimation with performance guarantees is analyzed. It is shown that there is a noise level β for which the relaxation is guaranteed to provide the optimal solution. However, the result only shows the existence of β. Its value, which is dependent on the problem instance, is not computed. In contrast, our result for rotation averaging gives explicit noise bounds.

The main contributions of this paper are:

• We apply Lagrangian duality to the rotation averaging problem with the chordal error distance and study the properties of the obtained relaxations.

• We develop strong theoretical bounds on the noise level that guarantee exact global recovery based on spectral graph theory.

• We develop a conceptually simple and scalable algorithm which is able to handle large problem instances occurring in structure from motion problems.

• We present experimental results that confirm our theoretical findings and which compare favourably to previous state-of-the-art.

This paper is an extension of the conference paper [24].

1.2 Preliminaries and Notation

The group of all rotations about the origin in three-dimensional Euclidean space is the Special Orthogonal Group, denoted SO(3). This group is commonly represented by rotation matrices, orthogonal 3×3 real-valued matrices with positive determinant, i.e.,

$$ SO(3) = \{ R \in \mathbb{R}^{3 \times 3} \mid R^T R = I,\ \det(R) = 1 \}. \tag{1} $$

If we omit $\det(R) = 1$, we get the Orthogonal Group, O(3).

Rotations can also be parametrised using the axis-angle representation based on Euler's rotation theorem. This theorem states that any rotation in three-dimensional space is equivalent to a single rotation about a fixed axis $v$ by an angle $\alpha$. This allows us to represent $R$ using the explicit expression

$$ R = V \begin{bmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{bmatrix} V^T, \tag{2} $$

where $V$ is an orthogonal matrix that maps the rotation axis $v$ onto the z-axis. Note that the orientation angle can easily be computed from $\operatorname{tr}(R) = 1 + 2\cos\alpha$. Both of the above representations of rotations in three dimensions will be used interchangeably.

To measure distances between rotations we will in this work restrict ourselves to the chordal distance. This is the most commonly used metric when analyzing Lagrangian duality in rotation averaging. It is a convenient choice as it is quadratic in its entries, leading to a particularly simple derivation and form of the associated dual problem. The chordal distance between two rotations $R$ and $S$ is defined as their Euclidean distance in the embedding space, that is,

$$ d(R, S) = \| R - S \|_F. \tag{3} $$

Since $\|R\|_F^2 = \|S\|_F^2 = 3$ we have

$$ d(R, S)^2 = \|R\|_F^2 - 2 \operatorname{tr}(R^T S) + \|S\|_F^2 = 4(1 - \cos\alpha), \tag{4} $$

where $\alpha$ is the rotation angle of $R^T S$. It is therefore easily seen that the chordal distance is related to the residual rotation angle by

$$ d(R, S) = \sqrt{4(1 - \cos\alpha)} = \sqrt{8 \sin^2\frac{\alpha}{2}} = 2\sqrt{2}\, \sin\frac{|\alpha|}{2}. \tag{5} $$

For further details on rotation parametrizations and distance measures, see [5].
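The relation between the chordal distance and the residual angle can be checked numerically. The following is a minimal NumPy sketch, not code from the paper; the helper names (`rotation_from_axis_angle`, `chordal_distance`, `residual_angle`) and the test angles are assumptions chosen for illustration.

```python
import numpy as np

def rotation_from_axis_angle(axis, angle):
    """Rodrigues' formula: rotation by `angle` radians about `axis`."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def chordal_distance(R, S):
    """Frobenius norm of R - S, as in (3)."""
    return np.linalg.norm(R - S, ord="fro")

def residual_angle(R, S):
    """Rotation angle of R^T S, recovered from tr(R^T S) = 1 + 2 cos(alpha)."""
    cos_alpha = (np.trace(R.T @ S) - 1.0) / 2.0
    return np.arccos(np.clip(cos_alpha, -1.0, 1.0))

# Illustrative inputs: S differs from R by a 0.7 rad rotation about the z-axis.
R = rotation_from_axis_angle([1.0, 2.0, 3.0], 0.4)
S = R @ rotation_from_axis_angle([0.0, 0.0, 1.0], 0.7)

alpha = residual_angle(R, S)
d = chordal_distance(R, S)
print(alpha, d, 2.0 * np.sqrt(2.0) * np.sin(alpha / 2.0))
```

The two printed distance values should agree up to round-off, which is the content of (5).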

The optimality guarantees developed in this paper rely on some basic concepts from spectral graph theory. Let $G = (V, E)$ denote an undirected graph with vertex set $V$ and edge set $E$ and let $n = |V|$. The adjacency matrix $A$ is by definition the $n \times n$ matrix with elements

$$ a_{ij} = \begin{cases} 0 & (i, j) \notin E \\ w_{ij} & (i, j) \in E \end{cases} \quad \text{for } i, j = 1, \dots, n, \tag{6} $$

where $w_{ij}$ is a positive weight. In this paper we mostly consider unweighted graphs, where $w_{ij} = 1$, but our results extend to the weighted case where $w_{ij}$ can be interpreted as a measurement uncertainty. The degree $d_i$ is the number of edges incident to vertex $i$, and the degree matrix $D$ is the diagonal matrix $D = \operatorname{diag}(d_1, \dots, d_n)$. The Laplacian $L_G$ of $G$ is defined by

$$ L_G = D - A. \tag{7} $$

It is easily seen that for any vector $x \in \mathbb{R}^n$ we have

$$ x^T L_G x = \sum_{(i,j) \in E} w_{ij} (x_i - x_j)^2, \tag{8} $$

and therefore $L_G \succeq 0$. Furthermore, since by construction the row sums of $L_G$ are zero, the vector $(1\ 1\ \dots\ 1)^T$ is an eigenvector with eigenvalue zero. The second smallest eigenvalue $\lambda_2$ of $L_G$, also known as the algebraic connectivity or the Fiedler value, reflects the connectivity of $G$. For a connected graph $G$, which is the only case of interest to us, we always have $\lambda_2 > 0$. In general the magnitude of $\lambda_2$ reflects how well connected the overall graph is, see [25] for further analysis.
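To make the graph quantities concrete, here is a small NumPy sketch that builds a Laplacian from an edge list, checks the quadratic form identity, and reads off the Fiedler value. The function names and the 4-cycle example are assumptions for illustration; vertices are 0-indexed, and for weighted graphs the sketch uses the row sums of the adjacency matrix as degrees.

```python
import numpy as np

def graph_laplacian(n, edges, weights=None):
    """Laplacian D - A of an undirected graph on vertices 0..n-1."""
    A = np.zeros((n, n))
    for idx, (i, j) in enumerate(edges):
        w = 1.0 if weights is None else weights[idx]
        A[i, j] = A[j, i] = w
    D = np.diag(A.sum(axis=1))  # (weighted) degrees on the diagonal
    return D - A

def fiedler_value(L):
    """Second smallest eigenvalue; eigvalsh returns eigenvalues in ascending order."""
    return np.linalg.eigvalsh(L)[1]

# Illustrative example: an unweighted cycle on four vertices.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
L = graph_laplacian(4, edges)
lam2 = fiedler_value(L)

# Quadratic form identity for an arbitrary test vector.
x = np.array([1.0, -2.0, 0.5, 3.0])
quad = x @ L @ x
edge_sum = sum((x[i] - x[j]) ** 2 for i, j in edges)
print(lam2, quad, edge_sum)
```

A disconnected graph would give `lam2` equal to zero (up to round-off), which is why the analysis restricts attention to connected graphs.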

2 PROBLEM STATEMENT

The problem of rotation averaging is defined as the task of determining a set of $n$ absolute rotations $R_1, \dots, R_n$ given distinct estimated relative rotations $\tilde{R}_{ij}$. Available relative rotations are represented by the edge set $E$ of the graph $G = (V, E)$. Under ideal conditions this amounts to finding the $n$ rotations compatible with the linear relations,

$$ R_i \tilde{R}_{ij} = R_j, \tag{9} $$

for all $(i, j) \in E$. However, in the presence of noise, a solution to (9) is not guaranteed to exist. Instead, it is typically solved in a least-metric sense,

$$ \min_{R_1, \dots, R_n} \sum_{(i,j) \in E} d(R_i \tilde{R}_{ij}, R_j)^p, \tag{10} $$

where $p \ge 1$ and $d(\cdot, \cdot)$ is a distance function on the space of rotations. With the chordal distance, we define the rotation averaging problem as

$$ \arg\min_{R_1, \dots, R_n \in SO(3)} \sum_{(i,j) \in E} \| R_i \tilde{R}_{ij} - R_j \|_F^2, \tag{11} $$

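which, with trace notation, can be simplified to

$$ \arg\min_{R_1, \dots, R_n \in SO(3)} - \sum_{(i,j) \in E} \operatorname{tr}\left( R_i \tilde{R}_{ij} R_j^T \right), \tag{12} $$

which constitutes our primal problem.

It will prove convenient to next introduce a compact matrix formulation. Let

$$ \tilde{R} = \begin{bmatrix} 0 & a_{12}\tilde{R}_{12} & \dots & a_{1n}\tilde{R}_{1n} \\ a_{21}\tilde{R}_{21} & 0 & \dots & a_{2n}\tilde{R}_{2n} \\ \vdots & & \ddots & \vdots \\ a_{n1}\tilde{R}_{n1} & a_{n2}\tilde{R}_{n2} & \dots & 0 \end{bmatrix}, \tag{13} $$

where $\tilde{R}_{ij} = \tilde{R}_{ji}^T$ and $a_{ij}$ are the elements of the adjacency matrix $A$ of the graph $G$, and let

$$ R = \begin{bmatrix} R_1 & R_2 & \dots & R_n \end{bmatrix}. \tag{14} $$

We may now write the primal problem as

$$ (P) \quad \min\ -\operatorname{tr}\left( R \tilde{R} R^T \right) \quad \text{s.t. } R \in SO(3)^n. \tag{15} $$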
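The compact formulation can be illustrated with a short NumPy sketch that assembles the symmetric block matrix of measurements and evaluates the objective of (P). This is a hedged illustration, not the authors' implementation: the names `random_rotation`, `build_measurement_matrix` and `primal_objective`, the 0-based indexing, and the noise-free five-edge example are all assumptions. Because the block matrix stores every edge twice, the matrix objective equals twice the edge sum in (12), which does not change the minimizer.

```python
import numpy as np

def random_rotation(rng):
    """Random rotation from a QR factorisation, with the determinant forced to +1."""
    Q, Rm = np.linalg.qr(rng.standard_normal((3, 3)))
    Q = Q * np.sign(np.diag(Rm))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]
    return Q

def build_measurement_matrix(n, rel):
    """3n x 3n block matrix; `rel` maps each edge (i, j) to its measured relative rotation."""
    Rt = np.zeros((3 * n, 3 * n))
    for (i, j), Rij in rel.items():
        Rt[3 * i:3 * i + 3, 3 * j:3 * j + 3] = Rij
        Rt[3 * j:3 * j + 3, 3 * i:3 * i + 3] = Rij.T  # block (j, i) is the transpose
    return Rt

def primal_objective(R_stack, Rt):
    """Objective of (P) for a 3 x 3n stack of rotations."""
    return -np.trace(R_stack @ Rt @ R_stack.T)

def chordal_cost(rotations, rel):
    """Sum of squared chordal residuals over the edges, as in (11)."""
    return sum(np.linalg.norm(rotations[i] @ Rij - rotations[j]) ** 2
               for (i, j), Rij in rel.items())

rng = np.random.default_rng(0)
n = 4
edges = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]
absolute = [random_rotation(rng) for _ in range(n)]
rel = {(i, j): absolute[i].T @ absolute[j] for i, j in edges}  # noise-free measurements

Rt = build_measurement_matrix(n, rel)
objective = primal_objective(np.hstack(absolute), Rt)
print(objective, chordal_cost(absolute, rel))
```

For any candidate set of rotations the two costs are related by `chordal_cost = 6 * len(edges) + primal_objective`, since each squared residual expands to 6 minus twice a trace term.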

3 OPTIMALITY CONDITIONS

Since (P) is a constrained program it is natural to introduce Lagrange multipliers to enforce the constraints $R \in SO(3)^n$. Using the Lagrangian function there are two ways of trying to solve the original primal problem: either by computing all stationary (KKT) points and evaluating the objective values of these, or by maximizing the dual function. While the former approach is guaranteed to find an optimal point (assuming the existence of one) it is usually not feasible for large scale problems due to the non-linearity of the involved equations. If the dual function can be computed (which we will see is the case for our application) the latter approach yields a convex problem (specifically, a concave maximization) that is solvable for large scale instances. The resulting maximum yields a lower bound on the primal objective. In the following sections we will use properties of the KKT points to derive conditions that ensure that the primal and dual values are the same, enabling us to find a solution via a convex program. In this section we present the KKT conditions and derive the dual program that we will use.

3.1 Necessary Local Optimality Conditions

The constraint set $R \in SO(3)^n$ consists of two types of constraints: the orthogonality constraints $R_i^T R_i = I$ and the determinant constraints $\det(R_i) = 1$, $i = 1, \dots, n$.

Consider relaxing the rotation averaging problem by removing the determinant constraint,

$$ (P') \quad \min\ -\operatorname{tr}\left( R \tilde{R} R^T \right) \quad \text{s.t. } R \in O(3)^n. \tag{16} $$

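The constraint $R \in O(3)^n$ still requires the $R_i$'s to be orthogonal. The orthogonal matrices consist of two disjoint, non-connected sets, with determinants 1 and −1 respectively. Hence, any local minimizer to the problem (P) also has to be a local minimizer, and therefore a KKT point, to (P′). We note that orthogonality can be enforced by restricting the 3×3 diagonal blocks of the symmetric matrix $R^T R$ to be identity matrices. That is, for any block diagonal matrix of the form

$$ \Lambda = \begin{bmatrix} \Lambda_1 & 0 & 0 & \dots \\ 0 & \Lambda_2 & 0 & \dots \\ 0 & 0 & \Lambda_3 & \dots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix} \tag{17} $$

we have

$$ \operatorname{tr}\left( \Lambda (I - R^T R) \right) = 0. \tag{18} $$

Hence the Lagrangian can be written

$$ L(R, \Lambda) = -\operatorname{tr}\left( R \tilde{R} R^T \right) - \operatorname{tr}\left( \Lambda (I - R^T R) \right) = \operatorname{tr}\left( R (\Lambda - \tilde{R}) R^T \right) - \operatorname{tr}(\Lambda). \tag{19} $$

Differentiating (19) we obtain the stationarity condition for optimality as

$$ \left( \tfrac{1}{2} (\Lambda^* + \Lambda^{*T}) - \tilde{R} \right) R^{*T} = 0. \tag{20} $$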


Noting that this permits us to, without loss of generality, assume that $\Lambda$ is symmetric, we can now state the KKT conditions as

$$ \text{(Stationarity)} \quad (\Lambda^* - \tilde{R}) R^{*T} = 0 \tag{21a} $$

$$ \text{(Primal feasibility)} \quad R^* \in SO(3)^n. \tag{21b} $$

Equation (21a) states that the rows of a local minimizer $R^*$ will be eigenvectors of the matrix $\Lambda^* - \tilde{R}$ with eigenvalue zero. This allows us to compute the optimal Lagrange multiplier $\Lambda^*$ from a given minimizer $R^*$ in $SO(3)^n$. By (21a) we see that

$$ \Lambda_i^* R_i^{*T} = \sum_{j \ne i} a_{ij} \tilde{R}_{ij} R_j^{*T} \iff \Lambda_i^* = \sum_{j \ne i} a_{ij} \tilde{R}_{ij} R_j^{*T} R_i^* \tag{22} $$

for $i = 1, \dots, n$. We therefore have:

Lemma 3.1. For a stationary point $R^*$ to the primal problem (P), we can compute the corresponding Lagrangian multiplier $\Lambda^*$ in closed form via (22).
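As a quick sanity check of (22), consider the noise-free case in which the measurements satisfy (9) exactly at the stationary point. The short derivation below is an added illustration under that assumption, with unit weights so that the adjacency entries sum to the vertex degree $d_i$; it is not part of the original argument.

```latex
\tilde{R}_{ij} = R_i^{*T} R_j^{*}
\quad\Longrightarrow\quad
\Lambda_i^* = \sum_{j \neq i} a_{ij}\, R_i^{*T} R_j^{*} R_j^{*T} R_i^{*}
            = \sum_{j \neq i} a_{ij}\, I
            = d_i\, I .
```

In this special case every multiplier block is the vertex degree times the identity, which is symmetric as the stationarity condition requires.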

3.2 Sufficient Global Optimality Conditions

Next we present some simple constraints that ensure that the maximal value of the dual problem is the same as the minimum of (P). For simplicity we first consider the Lagrange dual of (P′), which is a semidefinite program that we will use for optimization in later sections. The dual problem is defined by

$$ \max_{\Lambda} \min_{R} L(R, \Lambda). \tag{23} $$

Since the (unrestricted) optimum of $\min_R L(R, \Lambda)$ is either $-\operatorname{tr}(\Lambda)$, when $\Lambda - \tilde{R} \succeq 0$, or $-\infty$ otherwise, we get

$$ (D) \quad \max_{\Lambda - \tilde{R} \succeq 0} -\operatorname{tr}(\Lambda). \tag{24} $$

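It is clear (through standard duality arguments) that (D) gives a lower bound on (P′). Furthermore, if $R^*$ is a stationary point fulfilling (21a) and (21b) with a corresponding Lagrangian multiplier $\Lambda^*$ that satisfies $\Lambda^* - \tilde{R} \succeq 0$ then $\Lambda^*$ is feasible in (D). It also follows that for the stationary point $R^*$ we have $\operatorname{tr}(R^* \Lambda^* R^{*T}) = \operatorname{tr}(R^* \tilde{R} R^{*T})$ due to (21a). This together with (18) gives us

$$ -\operatorname{tr}(\Lambda^*) = -\operatorname{tr}\left( \Lambda^* R^{*T} R^* \right) = -\operatorname{tr}\left( R^* \tilde{R} R^{*T} \right). \tag{25} $$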

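This shows that the dual problem attains the same objective value as the primal one, that is, there is no duality gap between (P′) and (D). Thus, the convex program (D) provides a way of solving the non-convex (P′) when $\Lambda^* - \tilde{R} \succeq 0$.

We further note that if $\Lambda^* - \tilde{R} \succeq 0$ then by definition it is true that

$$ x^T (\Lambda^* - \tilde{R}) x \ge 0, \tag{26} $$

for any $3n$-vector $x$. In particular, for any $R \in O(3)^n$,

$$ 0 \le \operatorname{tr}\left( R (\Lambda^* - \tilde{R}) R^T \right) = \operatorname{tr}(\Lambda^*) - \operatorname{tr}\left( R \tilde{R} R^T \right) = \operatorname{tr}\left( R^* \Lambda^* R^{*T} \right) - \operatorname{tr}\left( R \tilde{R} R^T \right), \tag{27} $$

which shows that $-\operatorname{tr}(R^* \tilde{R} R^{*T}) \le -\operatorname{tr}(R \tilde{R} R^T)$ for all $R \in O(3)^n$, that is, $R^*$ is the global optimum of (P′).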

Finally we note that since $R^*$ is also feasible in (P) and the global minimum of (P) is clearly at least as large as that of (P′), $R^*$ must also be optimal in (P) if $\Lambda^* - \tilde{R} \succeq 0$. We summarize the above discussion in the following lemma:

Lemma 3.2. If a stationary point $R^*$ (fulfilling (21a) and (21b)) with corresponding Lagrangian multiplier $\Lambda^*$ fulfills $\Lambda^* - \tilde{R} \succeq 0$ then:

1) There is no duality gap between (P) and (D).
2) $R^*$ is a global minimum for (P).
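The lemma suggests a simple numerical optimality check: form the multiplier from (22), subtract the measurement block matrix, and inspect the smallest eigenvalue. The sketch below is an illustration under stated assumptions rather than the verification scheme of any cited work. All function names are hypothetical, and the test problem is a hand-built 6-cycle in which every measured relative rotation is the same rotation about the z-axis, so that identity rotations form a stationary point with equal residual on every edge.

```python
import numpy as np

def rot_z(angle):
    """Rotation by `angle` radians about the z-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def measurement_matrix(n, rel):
    """Symmetric 3n x 3n block matrix; `rel` maps an edge (i, j) to its measured rotation."""
    Rt = np.zeros((3 * n, 3 * n))
    for (i, j), Rij in rel.items():
        Rt[3 * i:3 * i + 3, 3 * j:3 * j + 3] = Rij
        Rt[3 * j:3 * j + 3, 3 * i:3 * i + 3] = Rij.T
    return Rt

def lagrange_multiplier(n, rotations, rel):
    """Block-diagonal multiplier from (22) for a candidate stationary point."""
    Lam = np.zeros((3 * n, 3 * n))
    for (i, j), Rij in rel.items():
        Lam[3 * i:3 * i + 3, 3 * i:3 * i + 3] += Rij @ rotations[j].T @ rotations[i]
        Lam[3 * j:3 * j + 3, 3 * j:3 * j + 3] += Rij.T @ rotations[i].T @ rotations[j]
    return Lam

def certificate_margin(n, rotations, rel):
    """Smallest eigenvalue of (multiplier - measurements).

    A value >= 0 up to round-off certifies global optimality by Lemma 3.2.
    Assumes the candidate is a genuine stationary point, so the multiplier is
    symmetric; the averaging below only removes round-off asymmetry.
    """
    M = lagrange_multiplier(n, rotations, rel) - measurement_matrix(n, rel)
    return np.linalg.eigvalsh(0.5 * (M + M.T))[0]

def cycle_problem(n, delta):
    """n-cycle with identical measurements rot_z(delta) and identity candidate rotations."""
    rel = {(i, (i + 1) % n): rot_z(delta) for i in range(n)}
    rotations = [np.eye(3) for _ in range(n)]
    return rotations, rel

n = 6
margin_small = certificate_margin(n, *cycle_problem(n, 0.4))  # residual below pi/6
margin_large = certificate_margin(n, *cycle_problem(n, 0.7))  # residual above pi/6
print(margin_small, margin_large)
```

A non-negative margin certifies the candidate; a negative margin only means the certificate fails, not by itself that the candidate is suboptimal in general.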

In the remainder of this paper we will study under which conditions $\Lambda^* - \tilde{R} \succeq 0$ holds and derive an efficient implementation for solving (D).

We note that it is possible to encode the determinant constraint with quadratic expressions by considering vector products of the rows of the rotation matrix (see e.g. [22]). It would therefore be easy to derive a Lagrangian for (P) instead of (P′). This approach would introduce a set of additional dual variables. This is likely to strengthen the dual formulation for problem instances where there is a duality gap. However, under the above assumptions (which we will see in many cases hold under quite generous circumstances) it makes no difference as these would simply be set to zero.

4 MAIN RESULT

In this section, we will state our main result which gives error bounds that guarantee that strong duality holds for our primal and dual problems. From a practical point of view, the result means that it is possible to solve a convex semidefinite program and obtain the globally optimal solution to our non-convex problem, which is quite remarkable.

4.1 Strong Duality Theorem

We now return to our initial primal rotation averaging problem (11). The goal is to find rotations $R_i$ and $R_j$ such that the sum of the residuals $\| R_i \tilde{R}_{ij} - R_j \|_F^2$ is minimized. For strong duality to hold, we need to show that there is a stationary point $R^*$ for which the dual variable fulfills $\Lambda^* - \tilde{R} \succeq 0$. Our main result shows that this can be done by bounding the angular error residuals.

Theorem 4.1 (Strong Duality). Let $R_i^*$, $i = 1, \dots, n$ denote a stationary point to the primal problem (P) for a connected graph $G$ with Laplacian $L_G$. Let $\alpha_{ij}$ denote the angular residuals, i.e., $\alpha_{ij} = \angle(R_i^* \tilde{R}_{ij}, R_j^*)$. Then $R_i^*$, $i = 1, \dots, n$ will be globally optimal and strong duality will hold for (P) if

$$ |\alpha_{ij}| \le \alpha_{\max} \quad \forall (i, j) \in E, \tag{28} $$

where

$$ \alpha_{\max} = 2 \arcsin\left( \sqrt{\frac{1}{4} + \frac{\lambda_2(L_G)}{2 d_{\max}}} - \frac{1}{2} \right), \tag{29} $$

and $d_{\max}$ is the maximal vertex degree.
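The bound (29) depends only on the graph, so it can be evaluated before any optimization is run. A minimal NumPy sketch follows; the function names and the two example graphs (a triangle and a four-vertex path, 0-indexed, unweighted) are assumptions for illustration.

```python
import numpy as np

def laplacian_from_edges(n, edges):
    """Unweighted graph Laplacian D - A on vertices 0..n-1."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def alpha_max(laplacian):
    """Angular bound (29) in radians for an unweighted, connected graph."""
    lam2 = np.linalg.eigvalsh(laplacian)[1]  # Fiedler value
    d_max = np.max(np.diag(laplacian))       # maximal vertex degree
    return 2.0 * np.arcsin(np.sqrt(0.25 + lam2 / (2.0 * d_max)) - 0.5)

triangle = laplacian_from_edges(3, [(0, 1), (1, 2), (0, 2)])
path = laplacian_from_edges(4, [(0, 1), (1, 2), (2, 3)])

bound_triangle = alpha_max(triangle)
bound_path = alpha_max(path)
print(np.degrees(bound_triangle), np.degrees(bound_path))
```

The weakly connected path yields a much smaller admissible residual than the triangle, in line with the role of the Fiedler value in the theorem.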

Note that the assumptions of the theorem are only stationarity of $R^*$ and a bound on the angular error residuals. Thus an immediate consequence of the above result is the following converse:


Corollary 4.1 (Global/Local Minimizers). Any local minimizer of (P) that is not global must have at least one angular error that is larger than the bound $\alpha_{\max}$ in (29).

It is clear that (29) will give a positive bound $\alpha_{\max}$ for any connected graph, since then $\lambda_2(L_G) > 0$. Thus for any given problem instance, $\alpha_{\max}$ gives an explicit bound on the error residuals for which strong duality is guaranteed to hold. The strength of the bound will depend on the particular graph connectivity encapsulated by the Fiedler value $\lambda_2(L_G)$ and the maximal vertex degree $d_{\max}$. We will see that for tightly connected graphs the bound ensures strong duality under surprisingly generous noise levels. In [4] it was observed that local convexity at a point holds under similar circumstances.

4.1.1 Complete Graphs

Consider a graph with $n = 3$ vertices that are connected, and all degrees are equal, $d_{\max} = 2$, denoted $K_3$. The Laplacian matrix $L_{K_3}$ has the Fiedler value $\lambda_2(L_{K_3}) = 3$. This gives $\alpha_{\max} = \frac{\pi}{3}$ rad = 60°. So, any local minimizer which has angular residuals less than 60° is also a global solution. For a complete graph with $n$ vertices (Figure 2), denoted $K_n$, it follows that $d_{\max} = n - 1$ since every pair of nodes is connected. Further, it is well-known (and easy to show) that $\lambda_2(L_{K_n}) = n$, see [25]. As $n$ becomes larger than 3, we get a decreasing series of upper bounds which in the limit tends to $2 \arcsin\left(\frac{\sqrt{3} - 1}{2}\right) \approx 0.749$ rad = 42.9°. Hence, as long as the residual angular errors are less than 42.9° (which is quite generous from a practical point of view) we can compute the optimal solution via a convex program. Also note that this bound holds independently of $n$.

Fig. 2. A complete graph (left) and a cycle graph (right), both with 6 vertices.

Corollary 4.2. For a complete graph $K_n$ with $n$ vertices, the residual upper bound $\alpha_{\max} = 2 \arcsin\left(\frac{\sqrt{3} - 1}{2}\right) \approx 0.749$ rad = 42.9° ensures global optimality and strong duality for any $n$.
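The two values quoted for complete graphs follow directly from substituting the Fiedler value $n$ and the maximal degree $n - 1$ into (29). The worked steps below are an added illustration of that substitution.

```latex
\alpha_{\max}(K_n) = 2\arcsin\!\left(\sqrt{\frac{1}{4} + \frac{n}{2(n-1)}} - \frac{1}{2}\right),
\qquad
\alpha_{\max}(K_3) = 2\arcsin\!\left(\sqrt{\tfrac{1}{4} + \tfrac{3}{4}} - \tfrac{1}{2}\right)
                   = 2\arcsin\tfrac{1}{2} = \frac{\pi}{3},
\qquad
\lim_{n\to\infty} \alpha_{\max}(K_n) = 2\arcsin\!\left(\sqrt{\tfrac{3}{4}} - \tfrac{1}{2}\right)
                   = 2\arcsin\frac{\sqrt{3}-1}{2}.
```

Since $n/(n-1)$ decreases monotonically towards 1, the bound decreases monotonically from 60° towards the limit of roughly 42.9°, so the limit is a valid bound for every $n \ge 3$.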

4.1.2 Circular Graphs

Next we consider a graph that arises during circular camera motions. Such problems frequently occur when using turntables for 3D reconstruction. The simplest example is a cycle graph, see Figure 2. Here each node is connected to its two closest neighbors.

The Laplacian for the graph in Figure 3 is

$$ L_G = \begin{bmatrix} 4 & -1 & -1 & 0 & 0 & 0 & -1 & -1 \\ -1 & 4 & -1 & -1 & 0 & 0 & 0 & -1 \\ -1 & -1 & 4 & -1 & -1 & 0 & 0 & 0 \\ 0 & -1 & -1 & 4 & -1 & -1 & 0 & 0 \\ 0 & 0 & -1 & -1 & 4 & -1 & -1 & 0 \\ 0 & 0 & 0 & -1 & -1 & 4 & -1 & -1 \\ -1 & 0 & 0 & 0 & -1 & -1 & 4 & -1 \\ -1 & -1 & 0 & 0 & 0 & -1 & -1 & 4 \end{bmatrix}. $$

Fig. 3. A circular graph with connections to its four closest neighbors.

In general a graph with $n$ vertices of this type has a Laplacian that is a circulant matrix, which corresponds to a discrete periodic convolution. It is diagonalized by a discrete Fourier matrix $F_n$ with elements $f_{kl} = e^{i 2\pi (k-1)(l-1)/n}$, where $k, l = 1, \dots, n$, and its eigenvalues can be computed from $F_n c^T$, where $c$ is the first row of $L_G$ [26]. Note that the eigenvalues are real when the elements of $c$ are real (and by combining conjugate columns of $F_n$ the eigenvectors can also be selected real). For a general circular graph with $n$ vertices where each vertex is connected to its $2b$ closest neighbors we get the eigenvalues

$$ \eta_k = \begin{cases} 2b + 1 - \dfrac{\sin\left(\frac{\pi}{n}(2b+1)(k-1)\right)}{\sin\left(\frac{\pi}{n}(k-1)\right)} & k \ne 1 \\ 0 & k = 1. \end{cases} \tag{30} $$

Note that $L_G$ is essentially $(2b + 1)I - M$, where $M$ is a sliding average filter. The eigenvalues of $M$ are the Fourier coefficients of a step function, which is the sampled sinc-function. The Fiedler value is thus

$$ \lambda_2(L_G) = \eta_2 = \eta_n = 2b + 1 - \frac{\sin\left(\frac{\pi}{n}(2b+1)\right)}{\sin\left(\frac{\pi}{n}\right)}. \tag{31} $$
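The closed-form spectrum can be compared against a direct eigenvalue computation. The NumPy sketch below does this for the eight-vertex graph with four neighbors per vertex; the function names are assumptions, the index where the quotient would be 0/0 is taken as the zero eigenvalue, and the code uses 0-based frequency indices (entry `k` of the result corresponds to $\eta_{k+1}$ in the text).

```python
import numpy as np

def circular_laplacian(n, b):
    """Laplacian of a circular graph: vertex i is linked to i±1, ..., i±b (mod n). Assumes 2*b < n."""
    L = np.zeros((n, n))
    for i in range(n):
        L[i, i] = 2.0 * b
        for m in range(1, b + 1):
            L[i, (i + m) % n] = -1.0
            L[i, (i - m) % n] = -1.0
    return L

def closed_form_eigenvalues(n, b):
    """Eigenvalues from (30), with the zero eigenvalue stored at index 0."""
    eta = np.zeros(n)
    k = np.arange(1, n)
    eta[1:] = 2 * b + 1 - np.sin(np.pi * (2 * b + 1) * k / n) / np.sin(np.pi * k / n)
    return eta

n, b = 8, 2
L = circular_laplacian(n, b)
eta = closed_form_eigenvalues(n, b)

numeric = np.linalg.eigvalsh(L)  # ascending order
fourier = np.fft.fft(L[0]).real  # DFT of the first row of the circulant matrix
fiedler = np.sort(eta)[1]
print(fiedler, numeric[1])
```

Using the discrete Fourier transform of the first row reflects the circulant structure: no eigen-decomposition is needed to obtain the spectrum.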

Since this graph has $d_{\max} = 2b$ we have

$$ \frac{\lambda_2(L_G)}{2 d_{\max}} \approx \frac{1}{2} - \frac{n}{4 \pi b} \sin\left( \frac{\pi}{n} 2b \right), \tag{32} $$

for large $n$ and $b$. This expression only depends on the quotient $\frac{2b}{n}$ which corresponds to the fraction of available measurements. (Note also that when $2b + 1 < \frac{n}{2}$ the right hand side is a proper lower bound on $\frac{\lambda_2(L_G)}{2 d_{\max}}$ since $\sin(\frac{\pi}{n}) < \frac{\pi}{n}$ and $\sin(\frac{\pi}{n}(2b + 1)) > \sin(\frac{\pi}{n} 2b)$.) Inserting (32) into (29) gives the function plotted in Figure 4. For $\frac{2b}{n} = 1$ the estimate (32) yields $\alpha_{\max}$ = 42.9° which coincides with the estimate for large complete graphs. Additionally, it is strictly positive as long as $\frac{b}{n} > 0$. This analysis suggests that the case $b = 1$, which corresponds to a single cycle graph, see Figure 2, is most likely to have multiple local minima. On the other hand, with less data the residual errors are usually smaller and therefore strong duality may still hold. In fact, we show below that for a cycle graph strong duality always holds. In Figure 1, we have a real example of an orbital camera motion which is close to a cycle. It may seem hard to determine if the camera motion consists of one or more loops around the object; we give three different local minima for this example.

Fig. 4. The maximal angle residual $\alpha_{\max}$ (y-axis) versus the level of available data $\frac{2b}{n}$ (x-axis) for circular graphs.
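The dependence of the bound on the fraction of available measurements can be tabulated by inserting the large-graph estimate of the ratio between the Fiedler value and twice the maximal degree into the bound of Theorem 4.1. The sketch below is a hedged illustration; the function name `alpha_max_circular` and the sampled fractions are assumptions, and the estimate is only meaningful for large $n$ and $b$.

```python
import numpy as np

def alpha_max_circular(fraction):
    """Approximate alpha_max in radians for a circular graph.

    `fraction` stands for 2b/n in (0, 1]; the ratio below rewrites
    1/2 - n/(4*pi*b) * sin(2*pi*b/n) in terms of that fraction.
    """
    fraction = np.asarray(fraction, dtype=float)
    ratio = 0.5 - np.sin(np.pi * fraction) / (2.0 * np.pi * fraction)
    return 2.0 * np.arcsin(np.sqrt(0.25 + ratio) - 0.5)

fractions = np.linspace(0.05, 1.0, 20)
bounds_deg = np.degrees(alpha_max_circular(fractions))
for f, a in zip(fractions, bounds_deg):
    print(f"{f:.2f}  {a:.2f}")
```

The tabulated values increase with the fraction of available data and reach the complete-graph limit when the fraction equals one.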

For a cycle graph, (31) with $b = 1$ reduces to $\lambda_2(L_G) = 2(1 - \cos\frac{2\pi}{n})$. Inserting into (29) and simplifying, we get $\alpha_{\max} = 2 \arcsin\left( \sqrt{\frac{1}{4} + \sin^2(\frac{\pi}{n})} - \frac{1}{2} \right)$. For large values of $n$, the upper bound decreases rapidly. In fact, the upper bound is quite conservative and it is possible to show a stronger one using a different analysis.

Theorem 4.2. Let $R_i^*$, $i = 1, \dots, n$ denote a stationary point to the primal problem (P) for a cycle graph with $n$ vertices. Let $\alpha_{ij}$ denote the angular residuals, i.e., $\alpha_{ij} = \angle(R_i^* \tilde{R}_{ij}, R_j^*)$. Then, $R_i^*$, $i = 1, \dots, n$ will be globally optimal and strong duality will hold for (P) if $|\alpha_{ij}| \le \frac{\pi}{n}$ for all $(i, j) \in E$.

Proof. A sufficient condition for strong duality to hold is thatΛ∗ − R � 0 (Lemma 3.2), which is equivalent to DR∗(Λ∗ −R)DT

R∗ � 0 with the same notation and argument as in (35)and (36). For a cycle graph, we get DR∗(Λ∗ − R)DT

R∗ =E12+E1n −E12 −E1n−ET12 ET12+E23 −E23

−ET23. . .

. . .. . .

. . .−ET1n

. (33)

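As this matrix is symmetric, it implies for the first diagonal block that $E_{12} - E_{12}^T = E_{1n}^T - E_{1n}$. As all $E_{ij} \in SO(3)$, it follows that $E_{12} = E_{1n}^T = E$ for some rotation $E \in SO(3)$. Similarly, symmetry of the second diagonal block $E_{12}^T + E_{23}$ gives $E_{23} = E_{12} = E$, and by induction the matrix $D_{R^*}(\Lambda^* - R)D_{R^*}^T$ has the following tridiagonal (Laplacian-like) structure

$$\begin{bmatrix} E + E^T & -E & & & -E^T \\ -E^T & E + E^T & -E & & \\ & -E^T & \ddots & \ddots & \\ & & \ddots & \ddots & -E \\ -E & & & -E^T & E + E^T \end{bmatrix}. \tag{34}$$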

Note that this means that the total error is equally distributed in an optimal solution among all the residuals, in particular, $\alpha_{ij} = \alpha$ for all $(i, j) \in E$, where $\alpha$ is the residual rotation angle of $E$.

Let $v$ denote the rotation axis of $E$ and let $u$ and $w$ be an orthogonal base which is orthogonal to $v$. Then, define the two vectors $v_\pm = (\, v_{\pm,1} \;\; v_{\pm,2} \;\; \dots \;\; v_{\pm,n} \,)^T$, where $v_{\pm,i} = \cos\left(\frac{2\pi i}{n}\right)u \pm \sin\left(\frac{2\pi i}{n}\right)w$ for $i = 1, \dots, n$. Now it is straightforward to check that $v_\pm$ are eigenvectors to (34) with eigenvalues $4\sin\left(\frac{\pi}{n} \pm \alpha\right)\sin\left(\frac{\pi}{n}\right)$. The sign of the smallest of these two eigenvalues determines the positive definiteness of the matrix in (34). In other words, we have shown that if $|\alpha| \leq \frac{\pi}{n}$ then $D_{R^*}(\Lambda^* - R)D_{R^*}^T \succeq 0$.
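As a numerical sanity check of the eigenvalue formula, the following Python sketch (not part of the original paper) builds the block matrix (34) for a residual rotation $E$ about the z-axis and compares its smallest eigenvalue with $4\sin(\pi/n - \alpha)\sin(\pi/n)$. The helper names `rotation` and `cycle_matrix`, the choice $n = 8$ and the use of NumPy are assumptions made purely for illustration.

```python
import numpy as np

def rotation(axis, angle):
    """Rotation matrix about `axis` by `angle` radians (Rodrigues' formula)."""
    a = np.asarray(axis, dtype=float)
    a = a / np.linalg.norm(a)
    K = np.array([[0.0, -a[2], a[1]],
                  [a[2], 0.0, -a[0]],
                  [-a[1], a[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def cycle_matrix(n, alpha, axis=(0.0, 0.0, 1.0)):
    """Block matrix (34) for a cycle with n >= 3 vertices and residual angle alpha."""
    E = rotation(axis, alpha)
    M = np.zeros((3 * n, 3 * n))
    for i in range(n):
        j = (i + 1) % n                           # next vertex on the cycle
        M[3*i:3*i+3, 3*i:3*i+3] = E + E.T         # diagonal block
        M[3*i:3*i+3, 3*j:3*j+3] = -E              # block (i, i+1)
        M[3*j:3*j+3, 3*i:3*i+3] = -E.T            # block (i+1, i)
    return M

n = 8
for alpha in (0.9 * np.pi / n, 1.1 * np.pi / n):  # just inside / outside pi/n
    eigs = np.linalg.eigvalsh(cycle_matrix(n, alpha))
    predicted = 4 * np.sin(np.pi / n - alpha) * np.sin(np.pi / n)
    print(f"alpha={alpha:.4f}  min eigenvalue={eigs.min():.6f}  "
          f"4 sin(pi/n - alpha) sin(pi/n)={predicted:.6f}")
```

For the residual inside the bound the smallest eigenvalue should be zero up to rounding (the nullspace of the matrix), while for the residual outside the bound the negative predicted eigenvalue should appear in the spectrum.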

Requiring that the angular residuals $|\alpha_{ij}|$ must be less than $\pi/n$ for the global solution may seem like a restriction, but it is actually not. To see this, note that a non-optimal solution to the rotation averaging problem can be obtained by choosing $R_1$ such that the first residual $\alpha_{12}$ is zero, and then continuing in the same fashion such that all but the last residual $\alpha_{1n}$ in the cycle are zero. In the worst case, $\alpha_{1n} = \pi$. However, this is (obviously) non-optimal. A better solution is obtained if we distribute the angular residual error evenly so that $\alpha_{ij} = \alpha = \frac{\alpha_{1n}}{n}$ (which is always possible, see Theorem 23 in [27]). In conclusion, the angular residuals $|\alpha_{ij}|$ of the globally optimal solution for a cycle are always less than or equal to $\frac{\pi}{n}$, which means that strong duality holds. Conversely, if the angular residual is larger than $\frac{\pi}{n}$ for a local minimizer, then it does not correspond to the global solution.

4.2 Proof of Theorem 4.1

We now return to the proof of our main result. Recall that a sufficient condition for strong duality to hold is that $\Lambda^* - R \succeq 0$ (Lemma 3.2). To prove Theorem 4.1 we will show that this is true under the conditions of the theorem.

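To simplify the presentation we denote the residual rotations $E_{ij} = R_i^* R_{ij} R_j^{*T}$ and define

$$D_{R^*} = \begin{bmatrix} R_1^* & 0 & 0 & \dots \\ 0 & R_2^* & 0 & \dots \\ 0 & 0 & R_3^* & \dots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}. \tag{35}$$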

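Then

$$D_{R^*}(\Lambda^* - R)D_{R^*}^T = \begin{bmatrix} \sum_{j \neq 1} a_{1j}E_{1j} & -a_{12}E_{12} & -a_{13}E_{13} & \dots \\ -a_{12}E_{12}^T & \sum_{j \neq 2} a_{2j}E_{2j} & -a_{23}E_{23} & \dots \\ -a_{13}E_{13}^T & -a_{23}E_{23}^T & \sum_{j \neq 3} a_{3j}E_{3j} & \dots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}. \tag{36}$$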

Note that by symmetry of $\Lambda^*$, the above matrix is symmetric too, and in particular, the diagonal blocks are symmetric, that is, $\sum_{j \neq i} a_{ij}E_{ij} = \frac{1}{2}\sum_{j \neq i} a_{ij}(E_{ij} + E_{ij}^T)$. Since $D_{R^*}$ is orthogonal, the matrix $\Lambda^* - R$ is positive semidefinite if and only if $D_{R^*}(\Lambda^* - R)D_{R^*}^T$ is.

In the noise free case we note that the residual rotations will fulfill $E_{ij} = I$ and therefore

$$D_{R^*}(\Lambda^* - R)D_{R^*}^T = L_G \otimes I_3. \tag{37}$$

In the general noise case our strategy will therefore be to bound the eigenvalues of $D_{R^*}(\Lambda^* - R)D_{R^*}^T$ by those of $L_G$ for which well-known estimates exist. Thus, we will analyze the difference and define the matrix

$$\Delta = D_{R^*}(\Lambda^* - R)D_{R^*}^T - L_G \otimes I_3. \tag{38}$$

The following results characterize the eigenvalues of $\Delta$.

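Lemma 4.1. Let $\Delta_{ij}$, $i = 1, \dots, n$, $j = 1, \dots, n$ be the $3 \times 3$ sub-blocks of $\Delta$. If $\lambda$ is an eigenvalue of $\Delta$ then

$$|\lambda| \leq \sum_{j=1}^{n} \|\Delta_{ij}\| \quad \text{for some } i = 1, \dots, n. \tag{39}$$

Proof. The proof is similar to that of Gerschgorin's theorem [28]. Let $\Delta x = \lambda x$, with $\|x\| = 1$. Then $\lambda x_i = \sum_j \Delta_{ij}x_j$. Now pick $i$ such that $\|x_i\| \geq \|x_j\|$ for all $j$. Then

$$|\lambda| = \left\|\lambda \frac{x_i}{\|x_i\|}\right\| = \left\|\sum_{j=1}^{n} \Delta_{ij}\frac{x_j}{\|x_i\|}\right\| \leq \sum_{j=1}^{n} \|\Delta_{ij}\|. \tag{40}$$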
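The block version of Gerschgorin's bound in Lemma 4.1 is easy to check numerically. The sketch below is an illustration only: the function name `block_gershgorin_bound`, the random symmetric test matrix and the block size argument `b` are assumptions, not part of the paper. It compares the largest eigenvalue magnitude with the largest block row sum of spectral norms.

```python
import numpy as np

def block_gershgorin_bound(D, b=3):
    """Return max_i sum_j ||D_ij||_2 over the b-by-b blocks of a square matrix D."""
    n = D.shape[0] // b
    row_sums = [
        sum(np.linalg.norm(D[b*i:b*i+b, b*j:b*j+b], 2) for j in range(n))
        for i in range(n)
    ]
    return max(row_sums)

rng = np.random.default_rng(0)
A = rng.standard_normal((12, 12))
D = (A + A.T) / 2.0                       # symmetric matrix with 4 x 4 blocks of size 3
largest_magnitude = np.abs(np.linalg.eigvalsh(D)).max()
bound = block_gershgorin_bound(D)
print("largest |eigenvalue|:", largest_magnitude)
print("block row-sum bound :", bound)
```

Since (39) holds for some row $i$, the maximum over all rows is a valid bound for every eigenvalue.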


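Lemma 4.2. Let $\alpha_{ij}$ be the residual angle of $E_{ij}$. Then

$$\|\Delta_{ii}\| \leq 2\sum_{j \neq i} a_{ij}\sin^2\left(\frac{\alpha_{ij}}{2}\right). \tag{41}$$

If in addition $0 \leq \alpha_{\max} \leq \frac{\pi}{2}$, where $\alpha_{\max} = \max_{i \neq j}|\alpha_{ij}|$, then

$$\|\Delta_{ii}\| \leq 2d_i\sin^2\left(\frac{\alpha_{\max}}{2}\right) \tag{42}$$

for all $i = 1, \dots, n$, where $d_i$ is the degree of vertex $i$.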

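Proof. It is easy to see that by applying a change of coordinates $E_{ij}$ can be written

$$E_{ij} = V_{ij}\begin{bmatrix} \cos(\alpha_{ij}) & -\sin(\alpha_{ij}) & 0 \\ \sin(\alpha_{ij}) & \cos(\alpha_{ij}) & 0 \\ 0 & 0 & 1 \end{bmatrix}V_{ij}^T, \tag{43}$$

and hence

$$\frac{1}{2}(E_{ij} + E_{ij}^T) = V_{ij}\begin{bmatrix} \cos(\alpha_{ij}) & 0 & 0 \\ 0 & \cos(\alpha_{ij}) & 0 \\ 0 & 0 & 1 \end{bmatrix}V_{ij}^T. \tag{44}$$

Therefore

$$(\cos(\alpha_{ij}) - 1)I \preceq \frac{1}{2}(E_{ij} + E_{ij}^T) - I \preceq 0, \tag{45}$$

and since $\Delta_{ii} = \sum_{j \neq i} a_{ij}\left(\frac{1}{2}(E_{ij} + E_{ij}^T) - I\right)$ we get

$$\sum_{j \neq i} a_{ij}(\cos(\alpha_{ij}) - 1)I \preceq \Delta_{ii} \preceq 0. \tag{46}$$

Thus $\|\Delta_{ii}\| \leq \sum_{j \neq i} a_{ij}(1 - \cos(\alpha_{ij}))$. Using the identity $1 - \cos(\alpha_{ij}) = 2\sin^2\left(\frac{\alpha_{ij}}{2}\right)$ and the fact that the sine function is non-decreasing on $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$ gives the desired result.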

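Lemma 4.3. If $i \neq j$ then

$$\|\Delta_{ij}\| \leq 2a_{ij}\sin\left(\frac{|\alpha_{ij}|}{2}\right). \tag{47}$$

If in addition $0 \leq \alpha_{\max} \leq \frac{\pi}{2}$ then

$$\|\Delta_{ij}\| \leq 2a_{ij}\sin\left(\frac{\alpha_{\max}}{2}\right). \tag{48}$$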

Proof. To estimate the off-diagonal blocks $\|\Delta_{ij}\| = a_{ij}\|I - E_{ij}\|$ we note that for a unit vector $v$ we have

$$\sqrt{\|v - E_{ij}v\|^2} = \sqrt{\|v\|^2 - 2\cos\angle(v, E_{ij}v) + \|E_{ij}v\|^2} \leq \sqrt{2(1 - \cos(\alpha_{ij}))}, \tag{49}$$

where $\angle(v, E_{ij}v)$ is the angle between $v$ and $E_{ij}v$. Furthermore, we will have equality if $v$ is perpendicular to the rotation axis of $E_{ij}$. Therefore

$$\|\Delta_{ij}\| = a_{ij}\sqrt{2(1 - \cos(\alpha_{ij}))} \leq 2a_{ij}\sin\left(\frac{\alpha_{\max}}{2}\right). \tag{50}$$
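The two norm identities behind Lemmas 4.2 and 4.3 can be confirmed for a single residual rotation. The following sketch is illustrative only; the helper `rotation`, the random axis and the angle 0.7 rad are assumptions. It evaluates the spectral norms of $\frac{1}{2}(E + E^T) - I$ and $I - E$ and prints them next to $2\sin^2(\alpha/2)$ and $2\sin(|\alpha|/2)$.

```python
import numpy as np

def rotation(axis, angle):
    """Rotation matrix about `axis` by `angle` radians (Rodrigues' formula)."""
    a = np.asarray(axis, dtype=float)
    a = a / np.linalg.norm(a)
    K = np.array([[0.0, -a[2], a[1]],
                  [a[2], 0.0, -a[0]],
                  [-a[1], a[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

rng = np.random.default_rng(1)
alpha = 0.7                                   # residual angle in radians
E = rotation(rng.standard_normal(3), alpha)   # residual rotation about a random axis

# Diagonal contribution, cf. (45): || (E + E^T)/2 - I || = 1 - cos(alpha)
diag_norm = np.linalg.norm(0.5 * (E + E.T) - np.eye(3), 2)
# Off-diagonal contribution, cf. (49)-(50): || I - E || = sqrt(2 (1 - cos(alpha)))
offdiag_norm = np.linalg.norm(np.eye(3) - E, 2)

print(diag_norm, 2 * np.sin(alpha / 2) ** 2)
print(offdiag_norm, 2 * np.sin(abs(alpha) / 2))
```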

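Summarizing the results in Lemmas 4.1-4.3 we get that the eigenvalues $\lambda$ of $\Delta$ fulfill

$$|\lambda(\Delta)| \leq 2d_i\sin^2\left(\frac{\alpha_{\max}}{2}\right) + \sum_{j \neq i} 2a_{ij}\sin\left(\frac{\alpha_{\max}}{2}\right) \leq 2d_{\max}\sin\left(\frac{\alpha_{\max}}{2}\right)\left(1 + \sin\left(\frac{\alpha_{\max}}{2}\right)\right), \tag{51}$$

where $d_{\max}$ is the maximal vertex degree. Note that the same bound holds for all eigenvalues of $\Delta$, in particular, the one with the largest magnitude $\lambda_{\max}(\Delta)$.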

Now returning to our goal of showing that $D_{R^*}(\Lambda^* - R)D_{R^*}^T \succeq 0$. Let $N = [\, I \;\; I \;\; \dots \,]^T$. The columns of $N$ will be in the nullspace of $D_{R^*}(\Lambda^* - R)D_{R^*}^T$. Therefore $D_{R^*}(\Lambda^* - R)D_{R^*}^T$ is positive semidefinite if $D_{R^*}(\Lambda^* - R)D_{R^*}^T + \mu NN^T$ is, and hence it is enough to show that

$$\lambda_1\left(D_{R^*}(\Lambda^* - R)D_{R^*}^T + \mu NN^T\right) \geq 0 \tag{52}$$

for sufficiently large $\mu$. The Laplacian $L_G$ is positive semidefinite with smallest eigenvalue $\lambda_1 = 0$ and corresponding eigenvector $v = (\, 1 \;\; 1 \;\; \dots \;\; 1 \,)^T$. Furthermore, as $N = v \otimes I_3$, it is clear that for sufficiently large $\mu$ we have $\lambda_1(L_G \otimes I_3 + \mu NN^T) = \lambda_1(L_G + \mu vv^T) = \lambda_2(L_G)$. Since

$$D_{R^*}(\Lambda^* - R)D_{R^*}^T + \mu NN^T = L_G \otimes I_3 + \mu NN^T + \Delta, \tag{53}$$

we therefore get

$$\lambda_1\left(D_{R^*}(\Lambda^* - R)D_{R^*}^T + \mu NN^T\right) \geq \lambda_2(L_G) - |\lambda_{\max}(\Delta)|. \tag{54}$$

If the right-hand side is positive, then so is the left-hand side. Using (51) for $\lambda_{\max}(\Delta)$ yields the following result.

Lemma 4.4. The matrix $\Lambda^* - R$ is positive semidefinite if $0 \leq \alpha_{\max} \leq \frac{\pi}{2}$ and

$$\lambda_2(L_G) - 2d_{\max}\sin\left(\frac{\alpha_{\max}}{2}\right)\left(1 + \sin\left(\frac{\alpha_{\max}}{2}\right)\right) \geq 0. \tag{55}$$

By completing squares, one obtains the equivalent condition

$$\left(\sin\left(\frac{\alpha_{\max}}{2}\right) + \frac{1}{2}\right)^2 \leq \frac{\lambda_2(L_G)}{2d_{\max}} + \frac{1}{4}, \tag{56}$$

which proves Theorem 4.1.

What these results show is that if there is a KKT point in $(P)$, then it is also a KKT point to $(P')$. If this KKT point fulfills the prescribed error conditions it will be globally optimal in $(P')$ and strong duality holds. But a solution that is globally optimal in $(P')$ and feasible in $(P)$ will also be globally optimal in $(P)$ since the objective functions are the same. Thus, as long as there is a solution to $(P)$ with small enough errors, the programs $(P)$, $(P')$ and $(D)$ will all yield the same objective value.
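To make condition (56) concrete, the sketch below solves it for $\alpha_{\max}$, giving $2\arcsin\left(\sqrt{1/4 + \lambda_2(L_G)/(2d_{\max})} - 1/2\right)$, and evaluates it from an adjacency matrix. It is a minimal illustration, not the authors' code: the function names `angular_bound` and `cycle_adjacency` and the test graphs are assumptions, and the graph is assumed connected and unweighted ($a_{ij} \in \{0, 1\}$).

```python
import numpy as np

def angular_bound(A):
    """Largest alpha_max (radians) satisfying (56) for the graph with adjacency A."""
    A = np.asarray(A, dtype=float)
    degrees = A.sum(axis=1)
    L = np.diag(degrees) - A                      # graph Laplacian L_G
    lam2 = np.sort(np.linalg.eigvalsh(L))[1]      # algebraic connectivity lambda_2
    x = np.sqrt(0.25 + lam2 / (2.0 * degrees.max())) - 0.5
    return 2.0 * np.arcsin(x)

def cycle_adjacency(n):
    """Adjacency matrix of a cycle graph with n >= 3 vertices."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
    return A

n = 10
print("cycle graph   :", np.degrees(angular_bound(cycle_adjacency(n))), "deg")
print("complete graph:", np.degrees(angular_bound(np.ones((n, n)) - np.eye(n))), "deg")
```

For a cycle this reproduces the closed form $2\arcsin\left(\sqrt{1/4 + \sin^2(\pi/n)} - 1/2\right)$ given in Section 4.1, while densely connected graphs admit considerably larger residuals.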

5 PROBLEMS WITH UNKNOWN GRAPH TOPOLOGY

The bound (29) on $\alpha_{\max}$ depends on the algebraic connectivity of the underlying graph $\lambda_2(L_G)$. It may also be of interest to have an estimate that is independent of the particular graph and only depends on the amount of missing data. In this section we show how to derive such a bound.

Again we consider a matrix $\Delta$ defined as in (38), but with respect to the complete graph $K_n$, that is,

$$\Delta = D_{R^*}(\Lambda^* - R)D_{R^*}^T - L_{K_n} \otimes I_3. \tag{57}$$

The $3 \times 3$ sub-blocks of $\Delta$ can be written

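$$\Delta_{ii} = \sum_{j \neq i} a_{ij}\,\tfrac{1}{2}(E_{ij} + E_{ij}^T) - (n - 1)I = \sum_{j \neq i} a_{ij}\left(\tfrac{1}{2}(E_{ij} + E_{ij}^T) - I\right) - (n - 1 - d_i)I \tag{58}$$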


and

$$\Delta_{ij} = I - a_{ij}E_{ij}, \quad i \neq j. \tag{59}$$

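From Lemmas 4.2 and 4.3 we now get that

$$\|\Delta_{ii}\| \leq 2d_i\sin^2\left(\frac{\alpha_{\max}}{2}\right) + (n - 1 - d_i) \tag{60}$$

and

$$\|\Delta_{ij}\| \leq \begin{cases} 1 & a_{ij} = 0 \\ 2\sin\left(\frac{\alpha_{\max}}{2}\right) & a_{ij} = 1 \end{cases}. \tag{61}$$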

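Lemma 4.1 shows that any eigenvalue of $\Delta$ has to fulfill

$$|\lambda| \leq 2d_i\left(\sin^2\left(\frac{\alpha_{\max}}{2}\right) + \sin\left(\frac{\alpha_{\max}}{2}\right)\right) + 2(n - 1 - d_i), \tag{62}$$

for some $i$. We get that the sufficient condition $\lambda_2(L_{K_n}) - |\lambda_{\max}(\Delta)| \geq 0$ for strong duality is true if $\lambda_2(L_{K_n})$ is greater than or equal to the right-hand side of (62). As the complete graph $K_n$ has $\lambda_2(L_{K_n}) = n$, a sufficient condition for strong duality can be stated as follows,

$$1 - \frac{n}{2d_i} + \frac{1}{d_i} \geq \sin^2\left(\frac{\alpha_{\max}}{2}\right) + \sin\left(\frac{\alpha_{\max}}{2}\right). \tag{63}$$

Solving for $\alpha_{\max}$ gives

$$\alpha_{\max} \leq 2\arcsin\left(\sqrt{\frac{5}{4} - \frac{n}{2d_i} + \frac{1}{d_i}} - \frac{1}{2}\right). \tag{64}$$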

Figure 5 shows the right hand side as a function of the vertex-connectivity (normalized with $n - 2$). As long as $d_i/(n - 2) > 0.5$ we get a positive bound on $\alpha_{\max}$, that is, as long as no more than roughly half of the measurements are missing for each rotation this bound ensures that $\Lambda^* - R \succeq 0$. Since this has to hold for all $i$ and the bound is increasing with $d_i$ we get the following result.

Fig. 5. Fraction of available data $\frac{d_i}{n-2}$ (x-axis) versus $\alpha_{\max}$ (y-axis).

Theorem 5.1 (Strong Duality with Unknown Topology). The stationary point $R_i^*$, $i = 1, \dots, n$ will be globally optimal and strong duality will hold for $(P)$ if

$$|\alpha_{ij}| \leq \alpha_{\max} \quad \forall (i, j) \in E, \tag{65}$$

where

$$\alpha_{\max} \leq 2\arcsin\left(\sqrt{\frac{5}{4} - \frac{n - 2}{2d_{\min}}} - \frac{1}{2}\right), \tag{66}$$

and $d_{\min}$ is the minimal vertex degree.
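The topology-independent bound (66) only needs the number of cameras $n$ and the minimal degree $d_{\min}$. The sketch below evaluates it for a few fractions of available data; the function name `topology_free_bound`, the value $n = 100$ and the convention of returning 0 when (66) gives no positive bound are assumptions made for illustration.

```python
import numpy as np

def topology_free_bound(n, d_min):
    """Right-hand side of (66) in radians; 0.0 when the bound is not positive."""
    arg = 1.25 - (n - 2) / (2.0 * d_min)
    if arg <= 0.25:               # equivalent to d_min / (n - 2) <= 0.5
        return 0.0
    return 2.0 * np.arcsin(np.sqrt(arg) - 0.5)

n = 100
for fraction in (0.5, 0.6, 0.8, 1.0):
    d_min = int(round(fraction * (n - 2)))
    bound_deg = np.degrees(topology_free_bound(n, d_min))
    print(f"d_min/(n-2) = {fraction:.1f}: alpha_max <= {bound_deg:.2f} deg")
```

For a complete graph ($d_{\min} = n - 1$) the value coincides with the bound obtained from (56) with $\lambda_2 = n$ and $d_{\max} = n - 1$.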

6 CHORDAL ERROR BOUNDS

The result in Lemma 4.4 gives a condition for strong duality that uses the worst case residual angle $\alpha_{\max}$ and the largest degree $d_{\max}$. While such an estimate provides an intuitive and simple sufficient condition it is clear that it could be violated, for example, in the case of outlier measurements. In this section we show that by considering all residuals, and not just the worst one, we can derive conditions that will allow measurement errors that are larger than those required by (29).

We show the following stronger version of Lemma 4.4.

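Lemma 6.1. The matrix $\Lambda^* - R$ is positive semidefinite if

$$\sum_{j \neq i} 2a_{ij}\sin\left(\frac{|\alpha_{ij}|}{2}\right)\left(1 + \sin\left(\frac{|\alpha_{ij}|}{2}\right)\right) \leq \lambda_2(L_G), \tag{67}$$

for all $i = 1, \dots, n$.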

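Proof. With $\Delta$ as in Section 4.1 we have from Lemmas 4.1, 4.2 and 4.3 that the eigenvalues of $\Delta$ fulfill

$$|\lambda(\Delta)| \leq \sum_{j \neq i} 2a_{ij}\sin^2\left(\frac{\alpha_{ij}}{2}\right) + \sum_{j \neq i} 2a_{ij}\sin\left(\frac{|\alpha_{ij}|}{2}\right), \tag{68}$$

for some $i$. We therefore have that $\lambda_2(L_G) \geq |\lambda_{\max}(\Delta)|$ if (67) holds. And since $\Lambda^* - R \succeq 0$ when $\lambda_2(L_G) - |\lambda_{\max}(\Delta)| \geq 0$ this proves the result.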

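Since $\|R_iR_{ij} - R_j\|_F = 2\sqrt{2}\sin\left(\frac{|\alpha_{ij}|}{2}\right)$ we see that $\Lambda^* - R$ is positive semidefinite if for all $i = 1, \dots, n$ we have

$$e_i \leq \lambda_2(L_G), \tag{69}$$

where

$$e_i = \sum_{j \neq i} a_{ij}\left(\frac{\|R_iR_{ij} - R_j\|_F^2}{4} + \frac{\|R_iR_{ij} - R_j\|_F}{\sqrt{2}}\right). \tag{70}$$

This bound allows individual residual errors to be larger than $\alpha_{\max}$ as long as the sum (70) is sufficiently small.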
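The chordal certificate (69)-(70) can be evaluated directly from a candidate solution. The sketch below is an illustration under stated assumptions, not the authors' implementation: the function name `chordal_sums`, the dictionary layout of the relative rotations (with $R_{ji} = R_{ij}^T$ for the reverse direction) and the small cycle example with a constant residual of 0.05 rad are all hypothetical.

```python
import numpy as np

def rot_z(angle):
    """Rotation about the z-axis by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def chordal_sums(R_abs, R_rel, A):
    """e_i from (70): R_abs[i] is R_i, R_rel[(i, j)] is R_ij, A the adjacency matrix."""
    n = len(R_abs)
    e = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if i != j and A[i, j]:
                # Assumed convention: the reverse measurement is the transpose.
                Rij = R_rel[(i, j)] if (i, j) in R_rel else R_rel[(j, i)].T
                r = np.linalg.norm(R_abs[i] @ Rij - R_abs[j], "fro")
                e[i] += r ** 2 / 4.0 + r / np.sqrt(2.0)
    return e

# Small cycle example in which every measurement is off by delta radians.
n, delta = 6, 0.05
R_abs = [rot_z(2 * np.pi * i / n) for i in range(n)]
A = np.zeros((n, n))
R_rel = {}
for i in range(n):
    j = (i + 1) % n
    A[i, j] = A[j, i] = 1.0
    R_rel[(i, j)] = R_abs[i].T @ R_abs[j] @ rot_z(delta)

e = chordal_sums(R_abs, R_rel, A)
lam2 = np.sort(np.linalg.eigvalsh(np.diag(A.sum(axis=1)) - A))[1]
print("max_i e_i =", e.max(), " lambda_2(L_G) =", lam2)
print("certificate (69) holds:", bool(e.max() <= lam2))
```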

7 SOLVING THE ROTATION AVERAGING PROBLEM

The dual problem $(D)$ is a convex semidefinite program, and although it is provably solvable in polynomial time by interior point methods [29], in practice such problems quickly become intractable as the dimension of the entering variables grows.

In this section we present a first-order method for solving semidefinite programs with constant block diagonals. Our approach solves the dual of $(D)$ and consists of two simple matrix operations only, matrix multiplication and square roots of $3 \times 3$ symmetric matrices, the latter of which can be computed in closed form. Consequently, these two operations permit a simple and efficient implementation without the need for dedicated numerical libraries.

The dual of $(D)$ is given by

$$\min_{Y \succeq 0}\ \max_{\Lambda}\ -\operatorname{tr}(\Lambda) + \operatorname{tr}\left(Y(\Lambda - R)\right). \tag{71}$$


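Let the matrix $Y$ be partitioned as follows,

$$Y = \begin{bmatrix} Y_{11} & Y_{12} & \dots & Y_{1n} \\ Y_{12}^T & Y_{22} & \dots & Y_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ Y_{1n}^T & \dots & \dots & Y_{nn} \end{bmatrix} \tag{72}$$

where each block $Y_{ij} \in \mathbb{R}^{3 \times 3}$ for $i, j = 1, \dots, n$. Since $\Lambda$ is block-diagonal (17) it is clear that the inner maximization is unbounded when $Y_{ii} - I_{3 \times 3} \neq 0$ and zero otherwise. We therefore get

$$(DD) \quad \min_{Y}\ -\operatorname{tr}(RY) \quad \text{s.t.} \quad Y_{ii} = I_3,\ i = 1, \dots, n, \quad Y \succeq 0. \tag{73}$$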

Since $Y \succeq 0$ it is clear that

$$-\operatorname{tr}(\Lambda) + \operatorname{tr}\left(Y(\Lambda - R^*)\right) \geq -\operatorname{tr}(\Lambda),$$

for all $\Lambda$ of the form (17). Therefore $(DD) \geq (D)$ and assuming strong duality holds $(D) = (P)$. Furthermore if $R^*$ is the global optimum of $(P)$ then $Y = R^{*T}R^*$ is feasible in (73) which shows that $(DD) = (P)$.

Thus, when strong duality holds, recovering a primal solution to $(P)$ is then achieved by simply reading off the first three rows of $Y^*$ and choosing their signs to ensure positive determinants of the resulting rotation matrices.
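The read-off step described above can be sketched as follows. This is a minimal illustration, assuming an exactly rank-3 $Y^* = R^{*T}R^*$ with $R^* = [R_1 \ \dots \ R_n]$; the function name `recover_rotations` and the random test rotations are hypothetical, and an approximately converged $Y$ may additionally need each block projected onto $SO(3)$, which is not shown here.

```python
import numpy as np

def recover_rotations(Y):
    """Read the 3x3 blocks of the first three rows of Y, flipping signs so det > 0."""
    n = Y.shape[0] // 3
    rotations = []
    for j in range(n):
        block = Y[0:3, 3*j:3*j+3].copy()
        if np.linalg.det(block) < 0:      # choose the sign giving a proper rotation
            block = -block
        rotations.append(block)
    return rotations

def random_rotation(rng):
    """Random rotation from the QR factorisation of a Gaussian matrix (test data only)."""
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    return Q if np.linalg.det(Q) > 0 else -Q

rng = np.random.default_rng(2)
R_true = [random_rotation(rng) for _ in range(4)]
R_stack = np.hstack(R_true)               # 3 x 3n matrix [R_1 ... R_n]
Y = R_stack.T @ R_stack                   # feasible point of (DD): Y_ii = I, Y >= 0
R_rec = recover_rotations(Y)

# The recovered rotations agree with the true ones up to a common rotation R_1^T.
print(np.allclose(R_rec[2], R_true[0].T @ R_true[2]))
```

Because the objective of $(P)$ only depends on relative rotations, the common left factor $R_1^T$ does not change the cost.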

7.1 Block Coordinate Descent

In this section we present a block coordinate descent method for solving semidefinite programs with block diagonal constraints of the form (73). This method is a generalization of the row-by-row algorithms derived in [30].

Consider the following semidefinite program,

$$\min_{S \in \mathbb{R}^{3n \times 3}}\ \operatorname{tr}(W^TS) \quad \text{s.t.} \quad \begin{bmatrix} I & S^T \\ S & B \end{bmatrix} \succeq 0. \tag{74}$$

This is a subproblem that arises when attempting to solve $(DD)$ in (73) using a block coordinate descent approach, i.e., by fixing all but one row and column of blocks in (72) and reordering as necessary. It turns out that this subproblem has a particularly simple, closed form solution, established by the following lemma.

Lemma 7.1. Let $B$ be a positive semidefinite matrix. Then, the solution to (74) is given by

$$S^* = -BW\left[\left(W^TBW\right)^{\frac{1}{2}}\right]^{\dagger}. \tag{75}$$

Here $\dagger$ denotes the Moore-Penrose pseudoinverse.

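Proof. From the Schur complement, we have that the $2 \times 2$ block matrix in (74) is positive semidefinite if and only if

$$I - S^TB^{\dagger}S \succeq 0, \tag{76}$$

$$(I - BB^{\dagger})S = 0. \tag{77}$$

Hence the problem (74) is equivalent to

$$\min_{S \in \mathbb{R}^{3n \times 3}}\ \langle W, S \rangle \tag{78a}$$

$$\text{s.t.} \quad I - S^TB^{\dagger}S \succeq 0, \tag{78b}$$

$$(I - BB^{\dagger})S = 0. \tag{78c}$$

The KKT conditions for (78), with Lagrangian multipliers $\Gamma$ and $\Upsilon$, become

$$W + 2B^{\dagger}S\Gamma + (I - BB^{\dagger})\Upsilon = 0, \tag{79}$$

$$I - S^TB^{\dagger}S \succeq 0, \tag{80}$$

$$(I - BB^{\dagger})S = 0, \tag{81}$$

$$\Gamma \succeq 0, \tag{82}$$

$$(I - S^TB^{\dagger}S)\Gamma = 0. \tag{83}$$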

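Rewrite (79) and (83) as

$$B^{\dagger}S\Gamma = -\frac{1}{2}W - \frac{1}{2}(I - BB^{\dagger})\Upsilon, \tag{84}$$

$$\Gamma^T\Gamma = \Gamma^TS^TB^{\dagger}S\Gamma. \tag{85}$$

Since the pseudoinverse fulfills $B^{\dagger}BB^{\dagger} = B^{\dagger}$, combining (84) and (85) we obtain

$$\Gamma^2 = \Gamma^TS^TB^{\dagger}BB^{\dagger}S\Gamma \tag{86}$$

$$= \frac{1}{4}\left(W + (I - BB^{\dagger})\Upsilon\right)^TB\left(W + (I - BB^{\dagger})\Upsilon\right) \tag{87}$$

$$= \frac{1}{4}W^TBW. \tag{88}$$

Here the last equality follows since $B(I - BB^{\dagger}) = 0$. This gives

$$\Gamma = \frac{1}{2}\left(W^TBW\right)^{\frac{1}{2}}. \tag{89}$$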

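Inserting (89) in (84) gives

$$B^{\dagger}S\left(W^TBW\right)^{\frac{1}{2}} = -W - (I - BB^{\dagger})\Upsilon. \tag{90}$$

Multiplying with $B$ from the left on both sides and using (81), $BB^{\dagger}S = S$, we arrive at

$$S\left(W^TBW\right)^{\frac{1}{2}} = -BW, \tag{91}$$

and consequently

$$S = -BW\left[\left(W^TBW\right)^{\frac{1}{2}}\right]^{\dagger}. \tag{92}$$

Finally, since

$$\Gamma = \frac{1}{2}\left(W^TBW\right)^{\frac{1}{2}} \succeq 0, \tag{93}$$

$$I - S^TB^{\dagger}S = I - \left[\left(W^TBW\right)^{\frac{1}{2}}\right]^{\dagger}W^TBW\left[\left(W^TBW\right)^{\frac{1}{2}}\right]^{\dagger} \succeq 0, \tag{94}$$

the conditions (80) and (82) are satisfied, so (75) must be a feasible and optimal solution to (78) and consequently also to (74).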
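The closed form (75) can be checked numerically. The following sketch is not the authors' code: the helper names `sqrt_pinv` and `solve_subproblem`, the eigenvalue tolerance and the random test data are assumptions. It computes $S^*$ with a symmetric eigendecomposition and verifies feasibility in (74) together with the objective value $-\operatorname{tr}\left((W^TBW)^{1/2}\right)$ that follows from inserting (75) into $\operatorname{tr}(W^TS)$.

```python
import numpy as np

def sqrt_pinv(M, tol=1e-12):
    """Pseudoinverse of the PSD square root of a symmetric PSD matrix M."""
    w, V = np.linalg.eigh((M + M.T) / 2.0)
    inv_sqrt = np.where(w > tol, 1.0 / np.sqrt(np.clip(w, tol, None)), 0.0)
    return (V * inv_sqrt) @ V.T           # V diag(inv_sqrt) V^T

def solve_subproblem(B, W):
    """Closed-form minimiser (75) of tr(W^T S) subject to [[I, S^T], [S, B]] >= 0."""
    return -B @ W @ sqrt_pinv(W.T @ B @ W)

rng = np.random.default_rng(3)
G = rng.standard_normal((9, 9))
B = G @ G.T                               # random positive semidefinite B
W = rng.standard_normal((9, 3))
S = solve_subproblem(B, W)

block = np.block([[np.eye(3), S.T], [S, B]])
min_eig = np.linalg.eigvalsh(block).min()
objective = np.trace(W.T @ S)
reference = -np.sqrt(np.linalg.eigvalsh(W.T @ B @ W)).sum()
print("smallest eigenvalue of the block matrix:", min_eig)
print("objective:", objective, " -tr((W^T B W)^(1/2)):", reference)
```

Only a $3 \times 3$ eigendecomposition is needed per subproblem, which is what keeps each step of the block coordinate descent inexpensive.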

The resulting algorithm is summarised in Algorithm 1below.

8 EXPERIMENTAL RESULTS

In this section we present an experimental study aimed at characterizing the performance and computational efficiency of the proposed algorithm compared to existing numerical solvers.


Fig. 6. Images and reconstructions of the datasets in Table 2: (a) Gustavus, (b) Sphinx, (c) Sri Thendayuthapani, (d) Alcatraz, (e) Tooth Relic Temple, (f) Pumpkin, (g) Plaza de Armas, (h) Buddah.

Algorithm 1: A block coordinate descent algorithm for the semidefinite relaxation $(DD)$ in (73).

input: $R$, $Y^{(0)} \succeq 0$, $t = 0$.
repeat
    · Select an integer $k \in [1, \dots, n]$.
    · $B_k$: the result of eliminating the $k$th row and column from $Y^{t}$.
    · $W_k$: the result of eliminating the $k$th row and all but the $k$th column from $R$.
    · $S_k^* = -B_kW_k\left[\left(W_k^TB_kW_k\right)^{\frac{1}{2}}\right]^{\dagger}$ as in (75).
    · $Y^{t} = \begin{bmatrix} I & S_k^{*T} \\ S_k^* & B_k \end{bmatrix}$ (succeeded by the appropriate reordering).
    · $t = t + 1$
until convergence
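A compact NumPy sketch of the block coordinate descent loop is given below. It is an illustration under stated assumptions rather than the authors' implementation. In particular, the sketch updates the $k$th block row and column in place instead of reordering, sweeps cyclically over $k$, starts from $Y = I$, and takes $W_k$ as the $k$th block column of $-R$, since the part of the $(DD)$ objective $-\operatorname{tr}(RY)$ that depends on $S$ is $-2\operatorname{tr}(R_k^TS)$; this sign convention is our reading and should be checked against the definition of $R$ used in the full paper. The names `bcd`, `sqrt_pinv` and `dd_objective` are hypothetical.

```python
import numpy as np

def rot_z(angle):
    """Rotation about the z-axis by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def sqrt_pinv(M, tol=1e-12):
    """Pseudoinverse of the PSD square root of a symmetric PSD matrix M."""
    w, V = np.linalg.eigh((M + M.T) / 2.0)
    inv_sqrt = np.where(w > tol, 1.0 / np.sqrt(np.clip(w, tol, None)), 0.0)
    return (V * inv_sqrt) @ V.T

def dd_objective(R_mat, Y):
    """Objective of (DD): -tr(R Y)."""
    return -np.trace(R_mat @ Y)

def bcd(R_mat, sweeps=30):
    """Cyclic block coordinate descent for (DD); returns Y and the objective history."""
    m = R_mat.shape[0]
    n = m // 3
    Y = np.eye(m)                                   # feasible start: Y_ii = I, Y >= 0
    history = [dd_objective(R_mat, Y)]
    for _ in range(sweeps):
        for k in range(n):
            blk = np.arange(3 * k, 3 * k + 3)
            rest = np.setdiff1d(np.arange(m), blk)
            B = Y[np.ix_(rest, rest)]               # Y without block row/column k
            W = -R_mat[np.ix_(rest, blk)]           # assumed sign, see lead-in
            S = -B @ W @ sqrt_pinv(W.T @ B @ W)     # closed form (75)
            Y[np.ix_(rest, blk)] = S
            Y[np.ix_(blk, rest)] = S.T
        history.append(dd_objective(R_mat, Y))
    return Y, history

# Noise-free cycle with n cameras: R_ij = R_i^T R_j on the edges (i, i+1).
n = 5
R_abs = [rot_z(2 * np.pi * i / n) for i in range(n)]
R_mat = np.zeros((3 * n, 3 * n))
for i in range(n):
    j = (i + 1) % n
    Rij = R_abs[i].T @ R_abs[j]
    R_mat[3*i:3*i+3, 3*j:3*j+3] = Rij
    R_mat[3*j:3*j+3, 3*i:3*i+3] = Rij.T

Y, history = bcd(R_mat)
print("objective after each sweep:", np.round(history, 4))
```

Each block update solves its subproblem exactly, so the recorded objective values should be non-increasing, and every iterate stays feasible for (73).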

8.1 Synthetic Data

In our first set of experiments we compared the computational efficiency of the following algorithms:

• The Levenberg-Marquardt (LM) algorithm [31], a standard nonlinear optimization method.

• Algorithm 1, our approach developed in Section 7.

• The Staircase method of [32], a Riemannian optimization based approach with a very efficient and well-written implementation which is publicly available.

• SeDuMi [33], a well-known software package for conic optimization.

We constructed a large number of synthetic problem instances of increasing size, perturbed by varying levels of noise. Each absolute rotation was obtained by rotation about the z-axis by $2\pi/n$ rad and by construction, forming a cycle graph. The relative rotations were perturbed by noise in the form of a random rotation about an axis sampled from a uniform distribution on the unit sphere with angles normally distributed with mean 0 and variance $\sigma$. The absolute rotations were initialized (if required) in a similar fashion but with the angles uniformly distributed over $[0, 2\pi]$ rad.
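A possible way to generate such synthetic instances is sketched below. Several details are not specified in the text and are assumptions here: the $i$th absolute rotation is taken as a rotation by $2\pi i/n$ about the z-axis, the noise rotation is multiplied from the right, $\sigma$ is used as the standard deviation of the noise angle (the unit rad in Table 1 suggests this reading), and the names `rotation` and `synthetic_cycle` are hypothetical.

```python
import numpy as np

def rotation(axis, angle):
    """Rotation matrix about `axis` by `angle` radians (Rodrigues' formula)."""
    a = np.asarray(axis, dtype=float)
    a = a / np.linalg.norm(a)
    K = np.array([[0.0, -a[2], a[1]],
                  [a[2], 0.0, -a[0]],
                  [-a[1], a[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def synthetic_cycle(n, sigma, rng):
    """Ground-truth rotations and noisy relative rotations on a cycle graph."""
    R_abs = [rotation((0.0, 0.0, 1.0), 2 * np.pi * i / n) for i in range(n)]
    R_rel = {}
    for i in range(n):
        j = (i + 1) % n
        axis = rng.standard_normal(3)          # normalised Gaussian: uniform direction
        noise = rotation(axis, rng.normal(0.0, sigma))
        R_rel[(i, j)] = R_abs[i].T @ R_abs[j] @ noise
    return R_abs, R_rel

rng = np.random.default_rng(4)
R_abs, R_rel = synthetic_cycle(n=20, sigma=0.2, rng=rng)
residuals = [np.linalg.norm(R_abs[i] @ R_rel[(i, j)] - R_abs[j], "fro")
             for (i, j) in R_rel]
print("largest chordal residual of the ground truth:", max(residuals))
```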

The results, averaged over 100 runs, can be seen in Table 1. As expected, the LM algorithm is significantly faster than our algorithm as well as SeDuMi and the Staircase method, but it only manages to obtain the global optimum about 30-70% of the time. As predicted by Theorem 4.2 and the discussion in Section 4.1 on cycle graphs, Algorithm 1, the Staircase method and SeDuMi produce globally optimal solutions at every single problem instance, independent of the noise level and independent of the number of cameras. As expected, Algorithm 1 does appear to outperform SeDuMi quite significantly with respect to computational efficiency. From this table we also observe that our proposed approach appears competitive with the Staircase method of [32]. It should be noted that we here solved the problem only to moderate precision (3-4 decimal places). For greater precision we would expect the Staircase method to perform better. However, for the applications considered here, moderate accuracies are generally sufficient.

| n | σ [rad] | LM [31] success rate | LM [31] time [s] | Alg. 1 time [s] | Staircase [32] time [s] | SeDuMi [33] time [s] |
|---|---|---|---|---|---|---|
| 20 | 0.2 | 30.0% | 0.003 | 0.008 | 0.147 | 0.583 |
| 20 | 0.5 | 31.0% | 0.006 | 0.008 | 0.115 | 0.462 |
| 50 | 0.2 | 44.0% | 0.008 | 0.030 | 0.465 | 1.879 |
| 50 | 0.5 | 40.0% | 0.031 | 0.034 | 0.311 | 1.903 |
| 100 | 0.2 | 38.0% | 0.026 | 0.114 | 1.080 | 16.612 |
| 100 | 0.5 | 44.0% | 0.048 | 0.238 | 1.058 | 18.318 |
| 200 | 0.2 | 58.0% | 0.076 | 0.800 | 2.854 | 223.900 |
| 200 | 0.5 | 55.0% | 0.083 | 1.234 | 2.754 | 229.608 |

TABLE 1. Comparison of running times on synthetic data, averaged over 100 runs. The success rate, i.e. the fraction of the times the global optimum was reached by the LM algorithm, is also listed.

8.2 Real-World Data

In our second set of experiments we compared the computational efficiency on a number of publicly available real-world datasets$^1$ [2]. The results are presented in Table 2. Here, as in the previous experiment, both methods correctly produce the global optima at each instance. Algorithm 1 again performs comparably to the Staircase method and significantly outperforms SeDuMi with respect to computational cost, providing further evidence of the efficiency of the proposed algorithm. It can further be seen that Theorem 4.1 provides bounds sufficiently large to guarantee strong duality, and hence global optimality, in all the real-world instances except for the pumpkin and buddah datasets. Although strong duality does indeed hold in both of these cases, the resulting certificate is insufficient with respect to the angle bound (29) for the pumpkin dataset and both the angle and chordal bounds for the buddah dataset. The visibility graphs of both of these datasets comprise both densely and sparsely connected cameras, resulting in a large value of $d_{\max}$ in combination with a small value of $d_{\min}$ (minimum degree). Since $\lambda_2 \leq \frac{n}{n-1}d_{\min}$, a limited bound on $\alpha_{\max}$ follows directly from (29). These instances serve as a representative example of when the bounds of Theorem 4.1, although still valid and strictly positive, become too conservative in practice.

| Dataset | n | Alg. 1 time [s] | Staircase [32] time [s] | SeDuMi time [s] | Angle bound (29): max_ij \|α_ij\| | Angle bound (29): α_max | Chordal bound (69): max_i e_i | Chordal bound (69): λ2(L_G) |
|---|---|---|---|---|---|---|---|---|
| Gustavus | 57 | 0.053 | 0.98 | 4.46 | 6.33° | 8.89° | 0.371 | 4.677 |
| Sphinx | 70 | 0.057 | 0.28 | 7.54 | 5.73° | 10.20° | 1.063 | 9.296 |
| Sri Thendayuthapani | 98 | 0.01 | 0.07 | 24.76 | 5.56° | 43.31° | 2.190 | 98.000 |
| Alcatraz | 133 | 0.11 | 0.27 | 70.61 | 7.68° | 23.32° | 1.793 | 64.410 |
| Tooth Relic Temple | 162 | 0.08 | 0.16 | 155.59 | 5.72° | 11.57° | 1.927 | 32.384 |
| Pumpkin | 209 | 0.25 | 0.34 | 327.91 | 8.63° | 2.54° | 3.801 | 6.894 |
| Plaza de Armas | 240 | 0.12 | 0.18 | 691.33 | 5.73° | 25.46° | 9.301 | 128.555 |
| Buddha | 322 | 0.36 | 0.26 | 1694.41 | 7.29° | 0.94° | 3.104 | 2.156 |

TABLE 2. The average run time and largest resulting angular residual ($|\alpha_{ij}|$) and bound ($\alpha_{\max}$) on eight different real-world datasets.

9 CONCLUSIONS

In this paper we have presented a theoretical analysis of Lagrangian duality in rotation averaging based on spectral graph theory. Our main result states that for this class of problems strong duality will provably hold between the primal and dual formulations if the noise levels are sufficiently restricted. In many cases the noise levels required for strong duality not to hold can be shown to be quite severe. To the best of our knowledge, this is the first time such practically useful sufficient conditions for strong duality have been established for optimization over multiple rotations.

A scalable first-order algorithm, a generalization of coordinate descent methods for semidefinite cone programming, was also presented. Our empirical validation demonstrates the potential of this proposed algorithm, significantly outperforming existing general purpose semidefinite numerical solvers and performing comparably to, and in many instances even better than, state-of-the-art methods. We found this result particularly encouraging as, owing to its simplicity, our proposed approach would be exceedingly straightforward to parallelise, with potentially considerable speed-ups as a consequence.

1. Available from http://www.maths.lth.se/~calle

ACKNOWLEDGMENTS

This work has been funded by the Australian Research Council through grant FT170100072, the Swedish Research Council (no. 2016-04445 and 2018-05375), the Swedish Foundation for Strategic Research (Semantic Mapping and Visual Navigation for Smart Robots) and Vinnova / FFI (Perceptron, no. 2017-01942).

REFERENCES

[1] D. Martinec and T. Pajdla, “Robust rotation and translation estimation in multiview reconstruction,” in IEEE Conference on Computer Vision and Pattern Recognition, 2007.


[2] O. Enqvist, F. Kahl, and C. Olsson, “Non-sequential structure from motion,” in International Workshop on Omnidirectional Vision, Camera Networks and Non-Classical Cameras, 2011.

[3] P. Moulon, P. Monasse, and R. Marlet, “Global fusion of relative motions for robust, accurate and scalable structure from motion,” in International Conference on Computer Vision, 2013.

[4] K. Wilson, D. Bindel, and N. Snavely, “When is rotations averaging hard?” in European Conference on Computer Vision, 2016.

[5] R. Hartley, J. Trumpf, Y. Dai, and H. Li, “Rotation averaging,” International Journal of Computer Vision, vol. 103, no. 3, pp. 267–305, 2013.

[6] F. Kahl and R. Hartley, “Multiple-view geometry under the L∞-norm,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 9, pp. 1603–1617, 2008.

[7] F. Arrigoni, L. Magri, B. Rossi, P. Fragneto, and A. Fusiello, “Robust absolute rotation estimation via low-rank and sparse matrix decomposition,” in International Conference on 3D Vision, 2014.

[8] R. Tron, B. Afsari, and R. Vidal, “Intrinsic consensus on SO(3) with almost-global convergence,” in IEEE Conference on Decision and Control, 2012.

[9] L. Carlone, R. Tron, K. Daniilidis, and F. Dellaert, “Initialization techniques for 3D SLAM: A survey on rotation estimation and its use in pose graph optimization,” in International Conference on Robotics and Automation, 2015.

[10] V. Govindu, “Combining two-view constraints for motion estimation,” in IEEE Conference on Computer Vision and Pattern Recognition, 2001.

[11] J. Fredriksson and C. Olsson, “Simultaneous multiple rotation averaging using Lagrangian duality,” in Asian Conference on Computer Vision, 2012.

[12] M. Arie-Nachimson, S. Z. Kovalsky, I. Kemelmacher-Shlizerman, A. Singer, and R. Basri, “Global motion estimation from point matches,” in International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission, 2012.

[13] Y. Zhong and N. Boumal, “Near-optimal bounds for phase synchronization,” ArXiv e-prints, Mar. 2017.

[14] V. Govindu, “Robustness in motion averaging,” in European Conference on Computer Vision, 2006.

[15] R. Hartley, K. Aftab, and J. Trumpf, “L1 rotation averaging using the Weiszfeld algorithm,” in IEEE Conference on Computer Vision and Pattern Recognition, 2011.

[16] R. Hartley, J. Trumpf, and Y. Dai, “Rotation averaging and weak convexity,” in International Symposium on Mathematical Theory of Networks and Systems, 2010.

[17] A. Chatterjee and V. Madhav Govindu, “Efficient and robust large-scale rotation averaging,” in International Conference on Computer Vision, 2013.

[18] N. Boumal, A. Singer, P.-A. Absil, and V. Blondel, “Cramer-Rao bounds for synchronization of rotations,” Information and Inference, vol. 3, pp. 1–39, 2014. [Online]. Available: http://imaiai.oxfordjournals.org/content/3/1/1

[19] L. Carlone, G. C. Calafiore, C. Tommolillo, and F. Dellaert, “Planar pose graph optimization: Duality, optimal solutions, and verification,” IEEE Transactions on Robotics, vol. 32, no. 3, pp. 545–565, 2016.

[20] L. Carlone and F. Dellaert, “Duality-based verification techniques for 2D SLAM,” in International Conference on Robotics and Automation, 2015.

[21] R. Tron and R. Vidal, “Distributed 3-D localization of camera sensor networks from 2-D image measurements,” IEEE Transactions on Automatic Control, vol. 59, no. 12, pp. 3325–3340, 2014.

[22] J. Briales and J. Gonzalez-Jimenez, “Fast global optimality verification in 3D SLAM,” in International Conference on Intelligent Robots and Systems, 2016.

[23] D. M. Rosen, L. Carlone, A. S. Bandeira, and J. J. Leonard, “SE-Sync: A certifiably correct algorithm for synchronization over the special Euclidean group,” CoRR, vol. abs/1612.07386, 2016. [Online]. Available: http://arxiv.org/abs/1612.07386

[24] A. Eriksson, C. Olsson, F. Kahl, and T.-J. Chin, “Rotation averaging and strong duality,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 127–135.

[25] M. Fiedler, “Algebraic connectivity of graphs,” Czechoslovak Mathematical Journal, vol. 23, no. 2, pp. 298–305, 1973. [Online]. Available: http://eudml.org/doc/12723

[26] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd ed. The Johns Hopkins University Press, 1996.

[27] O. Enqvist, “Robust algorithms for multiple view geometry - outliers and optimality,” Ph.D. dissertation, Centre for Mathematical Sciences, Lund University, Sweden, 2011.

[28] D. G. Feingold and R. S. Varga, “Block diagonally dominant matrices and generalizations of the Gerschgorin circle theorem,” Pacific J. Math., vol. 12, no. 4, pp. 1241–1250, 1962.

[29] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.

[30] Z. Wen, D. Goldfarb, S. Ma, and K. Scheinberg, “Row by row methods for semidefinite programming,” Columbia University, Tech. Rep., 2009.

[31] S. Wright and J. Nocedal, “Numerical optimization,” Springer Science, vol. 35, pp. 67–68, 1999.

[32] N. Boumal, “A riemannian low-rank method for optimization over semidefinite matrices with block-diagonal constraints,” arXiv preprint arXiv:1506.00575, 2015.

[33] J. F. Sturm, “Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones,” Optimization methods and software, vol. 11, no. 1-4, pp. 625–653, 1999.

Anders Eriksson Dr Eriksson is an Australian Research Council Future Fellow and Associate Professor at the School of Information Technology and Electrical Engineering, University of Queensland. He received his Masters of Science degree in Electrical Engineering in 2000 and his PhD in Mathematics in 2008 from Lund University, Sweden. He was a research fellow and later an ARC DECRA fellow at the University of Adelaide from 2008-2015 and a Vice Chancellor's research fellow at Queensland University of Technology in Brisbane from 2015-2018. His research areas include optimisation theory and numerical methods applied to the fields of computer vision and machine learning.

Carl Olsson Carl Olsson received his M.Sc. degree in electrical engineering in 2005 and his Ph.D. degree in mathematics in 2009. Since 2013 he is an associate professor at Lund University and since 2016 he is also a senior researcher at Chalmers University of Technology. His research focuses on mathematical optimization with applications in computer vision.

Fredrik Kahl Fredrik Kahl received his M.Sc. degree in computer science and technology in 1995 and his Ph.D. degree in mathematics in 2001 from Lund University. His thesis was awarded the Best Nordic Thesis Award in pattern recognition and image analysis 2001-2002 at the Scandinavian Conference on Image Analysis 2003. In 2005, he received the International Conference on Computer Vision Marr Prize and in 2007, he was awarded an ERC Starting Grant from the European Research Council. Since 2013, he is a professor at Chalmers University of Technology where he leads the Computer Vision Group. Research interests include machine learning, optimization, medical image analysis and computer vision.

Tat-Jun Chin Tat-Jun Chin received the BEng degree in mechatronics engineering from Universiti Teknologi Malaysia (UTM) in 2003 and the PhD degree in computer systems engineering from Monash University, Victoria, Australia, in 2007. He was a research fellow at the Institute for Infocomm Research (I2R) in Singapore from 2007 to 2008. Since 2008, he has been at The University of Adelaide, South Australia, and is now an Associate Professor. He is an Associate Editor of IPSJ Transactions on Computer Vision and Applications (CVA), and Journal of Imaging. His research interests include robust estimation and geometric optimisation.