



Appendix B

Simple matrices

Mathematicians also attempted to develop algebra of vectors but there was no natural definition of the product of two vectors that held in arbitrary dimensions. The first vector algebra that involved a noncommutative vector product (that is, v×w need not equal w×v) was proposed by Hermann Grassmann in his book Ausdehnungslehre (1844). Grassmann's text also introduced the product of a column matrix and a row matrix, which resulted in what is now called a simple or a rank-one matrix. In the late 19th century the American mathematical physicist Willard Gibbs published his famous treatise on vector analysis. In that treatise Gibbs represented general matrices, which he called dyadics, as sums of simple matrices, which Gibbs called dyads. Later the physicist P. A. M. Dirac introduced the term "bra-ket" for what we now call the scalar product of a "bra" (row) vector times a "ket" (column) vector and the term "ket-bra" for the product of a ket times a bra, resulting in what we now call a simple matrix, as above. Our convention of identifying column matrices and vectors was introduced by physicists in the 20th century.

− Suddhendu Biswas [52, p.2]

B.1 Rank-1 matrix (dyad)

Any matrix formed from the unsigned outer product of two vectors,

Ψ = uv^T ∈ R^{M×N}    (1783)

where u ∈ R^M and v ∈ R^N, is rank-1 and called a dyad. Conversely, any rank-1 matrix must have the form Ψ. [233, prob.1.4.1] Product −uv^T is a negative dyad. For matrix products AB^T, in general, we have

R(AB^T) ⊆ R(A) ,  N(AB^T) ⊇ N(B^T)    (1784)

with equality when B = A [374, §3.3, §3.6]^{B.1} or, respectively, when B is invertible and N(A) = 0. Yet for all nonzero dyads we have

R(uv^T) = R(u) ,  N(uv^T) = N(v^T) ≡ v^⊥    (1785)

where dim v^⊥ = N−1.

B.1 Proof. R(AA^T) ⊆ R(A) is obvious.

R(AA^T) = {AA^T y | y ∈ R^m} ⊇ {AA^T y | A^T y ∈ R(A^T)} = R(A)  by (146)  ∎




[Figure 183: The four fundamental subspaces [376, §3.6] for any dyad Ψ = uv^T ∈ R^{M×N}: R(v) ⊥ N(Ψ) = N(v^T) and N(u^T) ⊥ R(Ψ) = R(u), with R^N = R(v) ⊞ N(uv^T) and R^M = N(u^T) ⊞ R(uv^T). Ψ(x) ≜ uv^T x is a linear mapping from R^N to R^M. The map from R(v) to R(u) is bijective. [374, §3.1]]

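A brief numerical sketch of (1783)-(1785) in Matlab, with arbitrary example vectors:

% a dyad u*v' is rank-1 with R(uv') = R(u) and N(uv') = N(v') = v_perp
u = [1; 2; 3];              % u in R^3  (M = 3)
v = [4; -1];                % v in R^2  (N = 2)
Psi = u*v';                 % dyad in R^(3x2)
disp(rank(Psi))             % 1
disp(null(Psi))             % orthonormal basis for N(Psi), proportional to [1; 4]
disp(v'*null(Psi))          % ~0 : nullspace is exactly v_perp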

It is obvious that a dyad can be 0 only when u or v is 0 ;

Ψ = uv^T = 0 ⇔ u = 0 or v = 0    (1786)

The matrix 2-norm for Ψ is equivalent to Frobenius’ norm;

‖Ψ‖_2 = σ_1 = ‖uv^T‖_F = ‖uv^T‖_2 = ‖u‖ ‖v‖    (1787)

When u and v are normalized, the pseudoinverse is the transposed dyad. Otherwise,

Ψ^† = (uv^T)^† = v u^T / (‖u‖² ‖v‖²)    (1788)
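Formula (1788) is easy to confirm numerically; a minimal sketch with arbitrary data:

% pseudoinverse of a dyad: pinv(u*v') = v*u'/(||u||^2 ||v||^2)   (1788)
u = randn(4,1);  v = randn(3,1);
Psi = u*v';
disp(norm(pinv(Psi) - v*u'/(norm(u)^2*norm(v)^2)))   % ~1e-16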

When dyad uv^T ∈ R^{N×N} is square, uv^T has at least N−1 0-eigenvalues and corresponding eigenvectors spanning v^⊥. The remaining eigenvector u spans the range of uv^T with corresponding eigenvalue

λ = v^T u = tr(uv^T) ∈ R    (1789)

The determinant is the product of the eigenvalues; so it is always true that

det Ψ = det(uv^T) = 0    (1790)

When λ = 1, the square dyad is a nonorthogonal projector projecting on its range (Ψ² = Ψ, §E.6); a projector dyad. It is quite possible that u ∈ v^⊥, making the remaining eigenvalue instead 0;^{B.2} λ = 0 together with the first N−1 0-eigenvalues; id est, it is possible uv^T were nonzero while all its eigenvalues are 0. The matrix

[ 1 ; −1 ] [ 1  1 ] = [ 1  1 ; −1  −1 ]    (1791)

for example, has two 0-eigenvalues. In other words, eigenvector u may simultaneously be a member of the nullspace and range of the dyad. The explanation is, simply, because u and v share the same dimension, dim u = M = dim v = N :

B.2 A dyad is not always diagonalizable (§A.5) because its eigenvectors are not necessarily independent.



Proof. Figure 183 shows the four fundamental subspaces for the dyad. Linear operator Ψ : R^N → R^M provides a map between vector spaces that remain distinct when M = N;

u ∈ R(uv^T)
u ∈ N(uv^T) ⇔ v^T u = 0
R(uv^T) ∩ N(uv^T) = ∅    (1792)  ∎
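The phenomenon is visible numerically for example matrix (1791); u = [1 −1]^T lies in v^⊥, so the trace (1789), hence every eigenvalue, vanishes although the dyad is nonzero:

% nonzero square dyad whose eigenvalues are all 0
u = [1; -1];  v = [1; 1];
Psi = u*v';                 % [1 1; -1 -1], equation (1791)
disp(eig(Psi))              % both eigenvalues 0
disp(trace(Psi))            % 0 = v'*u, consistent with (1789)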

B.1.0.1 rank-1 modification

For A ∈ R^{M×N}, x ∈ R^N, y ∈ R^M, and y^T A x ≠ 0 [238, §2.1]^{B.3}

rank( A − A x y^T A / (y^T A x) ) = rank(A) − 1    (1794)

Given nonsingular matrix A ∈ R^{N×N} such that 1 + v^T A^{−1} u ≠ 0, [178, §4.11.2] [253, App.6] [463, §2.3 prob.16] (Sherman-Morrison-Woodbury)

(A ± uv^T)^{−1} = A^{−1} ∓ A^{−1} u v^T A^{−1} / (1 ± v^T A^{−1} u)    (1795)
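A numerical check of identity (1795), taking the + sign (a sketch; random data makes A nonsingular and 1 + v^T A^{−1}u ≠ 0 almost surely):

% Sherman-Morrison: inverse of a rank-1 update, compared against direct inversion
N = 5;
A = randn(N) + N*eye(N);    % diagonally shifted so A is well conditioned
u = randn(N,1);  v = randn(N,1);
lhs = inv(A + u*v');
rhs = inv(A) - (A\u)*(v'/A)/(1 + v'*(A\u));
disp(norm(lhs - rhs))       % ~1e-15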

B.1.0.2 dyad symmetry

In the specific circumstance that v = u, then uu^T ∈ R^{N×N} is symmetric, rank-1, and positive semidefinite having exactly N−1 0-eigenvalues. In fact, (Theorem A.3.1.0.7)

uv^T ⪰ 0 ⇔ v = u    (1796)

and the remaining eigenvalue is almost always positive;

λ = u^T u = tr(uu^T) > 0 unless u = 0    (1797)

When λ = 1, the dyad becomes an orthogonal projector. Matrix

[ Ψ  u ; u^T  1 ]    (1798)

for example, is rank-1 positive semidefinite if and only if Ψ = uu^T.

B.1.1 Dyad independence

Now we consider a sum of dyads like (1783) as encountered in diagonalization and singular value decomposition:

R( Σ_{i=1}^k s_i w_i^T ) = Σ_{i=1}^k R( s_i w_i^T ) = Σ_{i=1}^k R(s_i) ⇐ w_i ∀ i are l.i.    (1799)

B.3 This rank-1 modification formula has a Schur progenitor, in the symmetric case:

minimize_c   c
subject to   [ A  Ax ; y^T A  c ] ⪰ 0    (1793)

has analytical solution by (1679b): c ≥ y^T A A^† A x = y^T A x. Difference A − A x y^T A / (y^T A x) comes from (1679c). Rank modification is provable via Theorem A.4.0.1.3.



The range of the summation is the vector sum of ranges.^{B.4} (Theorem B.1.1.1.1) Under the assumption the dyads are linearly independent (l.i.), then vector sums are unique (p.631): for {w_i} l.i. and {s_i} l.i.

R( Σ_{i=1}^k s_i w_i^T ) = R(s_1 w_1^T) ⊕ · · · ⊕ R(s_k w_k^T) = R(s_1) ⊕ · · · ⊕ R(s_k)    (1800)

B.1.1.0.1 Definition. Linearly independent dyads. [243, p.29 thm.11] [382, p.2]
The set of k dyads

{ s_i w_i^T | i = 1 ... k }    (1801)

where s_i ∈ C^M and w_i ∈ C^N, is said to be linearly independent iff

rank( SW^T ≜ Σ_{i=1}^k s_i w_i^T ) = k    (1802)

where S ≜ [s_1 · · · s_k] ∈ C^{M×k} and W ≜ [w_1 · · · w_k] ∈ C^{N×k}. △

Dyad independence does not preclude existence of a nullspace N(SW^T), as defined, nor does it imply SW^T were full-rank. In absence of assumption of independence, generally, rank SW^T ≤ k. Conversely, any rank-k matrix can be written in the form SW^T by singular value decomposition. (§A.6)
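Definition (1802) translates directly into a computation: stack the vectors as columns of S and W and test rank(S*W'). A sketch:

% linear independence of k dyads  <=>  rank(S*W') = k   (1802)
k = 3;  M = 5;  N = 4;
S = randn(M,k);  W = randn(N,k);   % generic data: each vector set is l.i.
disp(rank(S*W'))                   % k = 3 : the dyads are independent
W(:,3) = W(:,1);                   % make {w_i} linearly dependent
disp(rank(S*W'))                   % 2 < k : independence is lost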

B.1.1.0.2 Theorem. Linearly independent (l.i.) dyads.
Vectors {s_i ∈ C^M, i = 1 ... k} are l.i. and vectors {w_i ∈ C^N, i = 1 ... k} are l.i. if and only if dyads {s_i w_i^T ∈ C^{M×N}, i = 1 ... k} are l.i. ⋄

Proof. Linear independence of k dyads is identical to definition (1802).
(⇒) Suppose {s_i} and {w_i} are each linearly independent sets. Invoking Sylvester's rank inequality, [233, §0.4] [463, §2.4]

rank S + rank W − k ≤ rank(SW^T) ≤ min{rank S, rank W} (≤ k)    (1803)

Then k ≤ rank(SW^T) ≤ k which implies the dyads are independent.
(⇐) Conversely, suppose rank(SW^T) = k. Then

k ≤ min{rank S, rank W} ≤ k    (1804)

implying the vector sets are each independent. ∎

B.1.1.1 Biorthogonality condition, Range and Nullspace of Sum

Dyads characterized by biorthogonality condition W^T S = I are independent; id est, for S ∈ C^{M×k} and W ∈ C^{N×k}, if W^T S = I then rank(SW^T) = k by the linearly independent dyads theorem because (confer §E.1.1)

W^T S = I ⇒ rank S = rank W = k ≤ M = N    (1805)

To see that, we need only show: N(S) = 0 ⇔ ∃ B such that BS = I.^{B.5}
(⇐) Assume BS = I. Then N(BS) = 0 = {x | BS x = 0} ⊇ N(S). (1784)

B.4 Move of range R to inside summation admitted by linear independence of {w_i}.
B.5 Left inverse is not unique, in general.



[Figure 184: The four fundamental subspaces [376, §3.6] for doublet Π = uv^T + vu^T ∈ S^N: R(Π) = R([u v]), N(Π) = v^⊥ ∩ u^⊥, and R^N = R([u v]) ⊞ N([v^T ; u^T]). Π(x) = (uv^T + vu^T)x is a linear bijective mapping from R([u v]) to R([u v]).]

(⇒) If N(S) = 0 then S must be full-rank thin-or-square.
∴ ∃ A, B, C such that [B ; C][S A] = I (id est, [S A] is invertible) ⇒ BS = I.
Left inverse B is given as W^T here. Because of reciprocity with S, it immediately follows: N(W) = 0 ⇔ ∃ S such that S^T W = I. ∎

Dyads produced by diagonalization, for example, are independent because of their inherent biorthogonality. (§A.5.0.3) The converse is generally false; id est, linearly independent dyads are not necessarily biorthogonal.

B.1.1.1.1 Theorem. Nullspace and range of dyad sum.
Given a sum of dyads represented by SW^T where S ∈ C^{M×k} and W ∈ C^{N×k}

N(SW^T) = N(W^T) ⇐ ∃ B such that BS = I
R(SW^T) = R(S) ⇐ ∃ Z such that W^T Z = I    (1806)

Proof. (⇒) N(SW^T) ⊇ N(W^T) and R(SW^T) ⊆ R(S) are obvious.
(⇐) Assume existence of a left inverse B ∈ R^{k×N} and a right inverse Z ∈ R^{N×k}.^{B.6}

N(SW^T) = {x | SW^T x = 0} ⊆ {x | BSW^T x = 0} = N(W^T)    (1807)

R(SW^T) = {SW^T x | x ∈ R^N} ⊇ {SW^T Z y | Z y ∈ R^N} = R(S)    (1808)  ∎

B.2 Doublet

Consider a sum of two linearly independent square dyads, one a transposition of the other:

Π = uv^T + vu^T = [u v][ v^T ; u^T ] = SW^T ∈ S^N    (1809)

where u, v ∈ R^N. Like the dyad, a doublet can be 0 only when u or v is 0;

Π = uv^T + vu^T = 0 ⇔ u = 0 or v = 0    (1810)

By assumption of independence, a nonzero doublet has two nonzero eigenvalues

λ_1 ≜ u^T v + ‖uv^T‖ ,  λ_2 ≜ u^T v − ‖uv^T‖    (1811)

B.6 By counterexample, the theorem's converse cannot be true; e.g. S = W = [1 0].



[Figure 185: v^T u = 1/ζ. The four fundamental subspaces [376, §3.6] for elementary matrix E as a linear mapping E(x) = (I − uv^T/(v^T u))x: R(E) = N(v^T), N(E) = R(u), and R^N = N(u^T) ⊞ N(E) = R(v) ⊞ R(E).]

where λ_1 > 0 > λ_2, with corresponding eigenvectors

x_1 ≜ u/‖u‖ + v/‖v‖ ,  x_2 ≜ u/‖u‖ − v/‖v‖    (1812)

spanning the doublet range. Eigenvalue λ_1 cannot be 0 unless u and v have opposing directions, but that is antithetical since then the dyads would no longer be independent. Eigenvalue λ_2 is 0 if and only if u and v share the same direction, again antithetical. Generally we have λ_1 > 0 and λ_2 < 0, so Π is indefinite.

By the nullspace and range of dyad sum theorem, doublet Π has N−2 zero-eigenvalues remaining and corresponding eigenvectors spanning N([ v^T ; u^T ]). We therefore have

R(Π) = R([u v]) ,  N(Π) = v^⊥ ∩ u^⊥    (1813)

of respective dimension 2 and N−2. (Figure 184)
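Eigenvalues (1811) of the doublet are readily confirmed numerically, recalling from (1787) that ‖uv^T‖ = ‖u‖‖v‖ (a sketch, arbitrary data):

% doublet Pi = u*v' + v*u' : two nonzero eigenvalues u'*v ± ||u|| ||v||, rest 0
N = 5;
u = randn(N,1);  v = randn(N,1);
Pi = u*v' + v*u';
ev = sort(eig(Pi));
disp([ev(1) ev(end)])                                     % lambda_2 < 0 < lambda_1
disp([u'*v - norm(u)*norm(v), u'*v + norm(u)*norm(v)])    % matches (1811)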

B.3 Elementary matrix

A matrix of the form

E = I − ζ uv^T ∈ R^{N×N}    (1814)

where ζ ∈ R is finite and u, v ∈ R^N, is called an elementary matrix or rank-1 modification of the Identity. [235] Any elementary matrix in R^{N×N} has N−1 eigenvalues equal to 1 corresponding to real eigenvectors that span v^⊥. The remaining eigenvalue

λ = 1 − ζ v^T u    (1815)

corresponds to eigenvector u.^{B.7} From [253, App.7.A.26] the determinant:

det E = 1 − tr(ζ uv^T) = λ    (1816)

If λ ≠ 0 then E is invertible; [178] (confer §B.1.0.1)

E^{−1} = I + (ζ/λ) uv^T    (1817)

Eigenvectors corresponding to 0 eigenvalues belong to N(E), and the number of 0 eigenvalues must be at least dim N(E) which, here, can be at most one.

B.7 Elementary matrix E is not always diagonalizable because eigenvector u need not be independent of the others; id est, u ∈ v^⊥ is possible.



(§A.7.3.0.1) The nullspace exists, therefore, when λ = 0; id est, when v^T u = 1/ζ; rather, whenever u belongs to hyperplane {z ∈ R^N | v^T z = 1/ζ}. Then (when λ = 0) elementary matrix E is a nonorthogonal projector projecting on its range (E² = E, §E.1) and N(E) = R(u); eigenvector u spans the nullspace when it exists. By conservation of dimension, dim R(E) = N − dim N(E). It is apparent from (1814) that v^⊥ ⊆ R(E), but dim v^⊥ = N−1. Hence R(E) ≡ v^⊥ when the nullspace exists, and the remaining eigenvectors span it.

In summary, when a nontrivial nullspace of E exists,

R(E) = N(v^T) ,  N(E) = R(u) ,  v^T u = 1/ζ    (1818)

illustrated in Figure 185, which is opposite to the assignment of subspaces for a dyad (Figure 183). Otherwise, R(E) = R^N.

When E = E^T, the spectral norm is

‖E‖_2 = max{1, |λ|}    (1819)
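A numerical sketch of (1814)-(1817); generic data keeps λ nonzero so that E is invertible:

% elementary matrix E = I - zeta*u*v' : N-1 unit eigenvalues plus lambda
N = 4;  zeta = 0.7;
u = randn(N,1);  v = randn(N,1);
E = eye(N) - zeta*u*v';
lambda = 1 - zeta*v'*u;                               % (1815)
disp(eig(E))                                          % N-1 ones and lambda
disp(det(E) - lambda)                                 % ~0 : (1816)
disp(norm(inv(E) - (eye(N) + (zeta/lambda)*u*v')))    % ~0 : (1817)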

B.3.1 Householder matrix

An elementary matrix is called a Householder matrix when it has the defining form, for nonzero vector u [185, §5.1.2] [178, §4.10.1] [374, §7.3] [233, §2.2]

H = I − 2 uu^T / (u^T u) ∈ S^N    (1820)

which is a symmetric orthogonal (reflection) matrix (H^{−1} = H^T = H, §B.5.3). Vector u is normal to an N−1-dimensional subspace u^⊥ through which this particular H effects pointwise reflection; e.g. Hu^⊥ = u^⊥ while Hu = −u.

Matrix H has N−1 orthonormal eigenvectors spanning that reflecting subspace u^⊥ with corresponding eigenvalues equal to 1. The remaining eigenvector u has corresponding eigenvalue −1; so

det H = −1 (1821)

Due to symmetry of H, the matrix 2-norm (the spectral norm) is equal to the largest eigenvalue-magnitude. A Householder matrix is thus characterized,

H^T = H ,  H^{−1} = H^T ,  ‖H‖_2 = 1 ,  H ⋡ 0    (1822)

For example, the permutation matrix

Ξ = [ 1 0 0 ; 0 0 1 ; 0 1 0 ]    (1823)

is a Householder matrix having u = [0 1 −1]^T/√2. Not all permutation matrices are Householder matrices, although all permutation matrices are orthogonal matrices (§B.5.2, Ξ^T Ξ = I) [374, §3.4] because they are made by permuting rows and columns of the Identity matrix. Neither are all symmetric permutation matrices Householder matrices; e.g.

Ξ = [ 0 0 0 1 ; 0 0 1 0 ; 0 1 0 0 ; 1 0 0 0 ]    (1920)

is not a Householder matrix.



B.4 Auxiliary V -matrices

B.4.1 Auxiliary projector matrix V

It is convenient to define a matrix V that arises naturally as a consequence of translating geometric center α_c (§5.5.1.0.1) of some list X to the origin. In place of X − α_c 1^T we may write XV as in (1135) where

V = I − (1/N) 11^T ∈ S^N    (1071)

is an elementary matrix called the geometric centering matrix. Any elementary matrix in R^{N×N} has N−1 eigenvalues equal to 1. For the particular elementary matrix V, the Nth eigenvalue equals 0. The number of 0 eigenvalues must equal dim N(V) = 1, by the 0 eigenvalues theorem (§A.7.3.0.1), because V = V^T is diagonalizable. Because

V 1 = 0 (1824)

the nullspace N(V) = R(1) is spanned by the eigenvector 1. The remaining eigenvectors span R(V) ≡ 1^⊥ = N(1^T) that has dimension N−1.

Because

V² = V    (1825)

and V^T = V, elementary matrix V is also a projection matrix (§E.3) projecting orthogonally on its range N(1^T) which is a hyperplane containing the origin in R^N

V = I − 1(1^T 1)^{−1} 1^T    (1826)

The {0, 1} eigenvalues also indicate that diagonalizable V is a projection matrix. [463, §4.1 thm.4.1] Symmetry of V denotes orthogonal projection; from (2120),

V² = V ,  V^T = V ,  V^† = V ,  ‖V‖_2 = 1 ,  V ⪰ 0    (1827)

Matrix V is also circulant [197].
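Properties (1824)-(1827) of the geometric centering matrix admit a one-screen check:

% geometric centering matrix V = I - (1/N)*ones(N): orthogonal projector on N(1')
N = 6;
V = eye(N) - ones(N)/N;
disp(norm(V*ones(N,1)))       % ~0 : V*1 = 0          (1824)
disp(norm(V*V - V))           % ~0 : idempotent       (1825)
disp(norm(pinv(V) - V))       % ~0 : V is its own pseudoinverse
disp(sort(eig(V))')           % eigenvalues {0, 1, ..., 1}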

B.4.1.0.1 Example. Relationship of Auxiliary to Householder matrix.
Let H ∈ S^N be a Householder matrix (1820) defined by

u = [ 1 ; ... ; 1 ; 1+√N ] ∈ R^N    (1828)

Then we have [180, §2]

V = H [ I  0 ; 0^T  0 ] H    (1829)

Let D ∈ S_h^N and define

−HDH ≜ −[ A  b ; b^T  c ]    (1830)

where b is a vector. Then because H is nonsingular (§A.3.1.0.5) [215, §3]

−V D V = −H [ A  0 ; 0^T  0 ] H ⪰ 0 ⇔ −A ⪰ 0    (1831)

and affine dimension is r = rank A when D is a Euclidean distance matrix. □



B.4.2 Schoenberg auxiliary matrix VN

1. V_N = (1/√2) [ −1^T ; I ] ∈ R^{N×(N−1)}    (1055)

2. V_N^T 1 = 0

3. I − e_1 1^T = [ 0  √2 V_N ]

4. [ 0  √2 V_N ] V_N = V_N

5. [ 0  √2 V_N ] V = V

6. V [ 0  √2 V_N ] = [ 0  √2 V_N ]

7. [ 0  √2 V_N ][ 0  √2 V_N ] = [ 0  √2 V_N ]

8. [ 0  √2 V_N ]^† = [ 0  0^T ; 0  I ] V

9. [ 0  √2 V_N ]^† V = [ 0  √2 V_N ]^†

10. [ 0  √2 V_N ][ 0  √2 V_N ]^† = V

11. [ 0  √2 V_N ]^† [ 0  √2 V_N ] = [ 0  0^T ; 0  I ]

12. [ 0  √2 V_N ][ 0  0^T ; 0  I ] = [ 0  √2 V_N ]

13. [ 0  0^T ; 0  I ][ 0  √2 V_N ] = [ 0  √2 V_N ]

14. [ V_N  (1/√2)1 ]^{−1} = [ V_N^† ; (√2/N) 1^T ]

15. V_N^† = √2 [ −(1/N)1   I − (1/N)11^T ] ∈ R^{(N−1)×N} ,  ( I − (1/N)11^T ∈ S^{N−1} )

16. V_N^† 1 = 0

17. V_N^† V_N = I ,  V_N^T V_N = (1/2)(I + 11^T) ∈ S^{N−1}

18. V^T = V = V_N V_N^† = I − (1/N)11^T ∈ S^N

19. −V_N^† (11^T − I) V_N = I ,  ( 11^T − I ∈ EDM^N )

20. D = [d_ij] ∈ S_h^N    (1073)

    tr(−V D V) = tr(−V D) = tr(−V_N^† D V_N) = (1/N) 1^T D 1 = (1/N) tr(11^T D) = (1/N) Σ_{i,j} d_ij

Any elementary matrix E ∈ S^N of the particular form

E = k_1 I − k_2 11^T    (1832)

where k_1, k_2 ∈ R,^{B.8} will make tr(−ED) proportional to Σ_{i,j} d_ij.

B.8 If k_1 is 1−ρ while k_2 equals −ρ ∈ R, then all eigenvalues of E for −1/(N−1) < ρ < 1 are guaranteed positive and therefore E is guaranteed positive definite. [343]



21. D = [d_ij] ∈ S^N

    tr(−V D V) = (1/N) Σ_{i≠j} d_ij − ((N−1)/N) Σ_i d_ii = (1/N) 1^T D 1 − tr D

22. D = [d_ij] ∈ S_h^N

    tr(−V_N^T D V_N) = Σ_j d_1j

23. For Y ∈ S^N

    V (Y − δ(Y 1)) V = Y − δ(Y 1)
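Several of the listed identities are easily confirmed numerically; a minimal sketch of properties 2, 17, and 18:

% Schoenberg auxiliary matrix V_N (1055) and a few of its properties
N = 5;
VN = [-ones(1,N-1); eye(N-1)]/sqrt(2);           % (1055)
V  = eye(N) - ones(N)/N;                         % geometric centering matrix (1071)
disp(norm(VN'*ones(N,1)))                        % ~0 : property 2
disp(norm(pinv(VN)*VN - eye(N-1)))               % ~0 : property 17
disp(norm(VN'*VN - (eye(N-1) + ones(N-1))/2))    % ~0 : property 17
disp(norm(V - VN*pinv(VN)))                      % ~0 : property 18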

B.4.3 Orthonormal auxiliary matrix VW

Thin matrix

V_W ≜ [ −(1/√N) 1^T ; I − 11^T/(N+√N) ] ∈ R^{N×(N−1)}    (1833)

(first row −1/√N throughout; the remaining N−1 rows form I − 11^T/(N+√N))

has R(V_W) = N(1^T) and orthonormal columns. [7] We defined three auxiliary V-matrices: V, V_N (1055), and V_W, sharing some attributes listed in Table B.4.4. For example, V can be expressed

V = V_W V_W^T = V_N V_N^†    (1834)

but V_W^T V_W = I means V is an orthogonal projector (2117) and

V_W^† = V_W^T ,  ‖V_W‖_2 = 1 ,  V_W^T 1 = 0    (1835)

B.4.4 Auxiliary V -matrix Table

        dim V      rank V   R(V)     N(V^T)   V^T V              V V^T                         V V^†
V       N×N        N−1      N(1^T)   R(1)     V                  V                             V
V_N     N×(N−1)    N−1      N(1^T)   R(1)     (1/2)(I + 11^T)    (1/2)[ N−1  −1^T ; −1  I ]    V
V_W     N×(N−1)    N−1      N(1^T)   R(1)     I                  V                             V

B.4.5 More auxiliary matrices

Mathar shows [295, §2] that any elementary matrix (§B.3) of the form

V_S = I − b1^T ∈ R^{N×N}    (1836)

such that b^T 1 = 1 (confer [190, §2]), is an auxiliary V-matrix having

R(V_S^T) = N(b^T) ,  R(V_S) = N(1^T)
N(V_S) = R(b) ,  N(V_S^T) = R(1)    (1837)



Given X ∈ R^{n×N}, the choice b = (1/N)1 (V_S = V) minimizes ‖X(I − b1^T)‖_F. [192, §3.2.1]

B.5 Orthomatrices

B.5.1 Orthonormal matrix

Property Q^T Q = I completely defines orthonormal matrix Q ∈ R^{n×k} (k ≤ n); a full-rank thin-or-square matrix characterized by nonexpansivity (2121)

‖Q^T x‖_2 ≤ ‖x‖_2 ∀ x ∈ R^n ,  ‖Q y‖_2 = ‖y‖_2 ∀ y ∈ R^k    (1838)

and preservation of vector inner-product

〈Qy , Qz〉 = 〈y , z〉 (1839)

B.5.2 Orthogonal matrix & vector rotation

An orthogonal matrix is a square orthonormal matrix. Property Q^{−1} = Q^T completely defines orthogonal matrix Q ∈ R^{n×n} employed to effect vector rotation; [374, §2.6, §3.4] [376, §6.5] [233, §2.1] for any x ∈ R^n

‖Qx‖2 = ‖x‖2 (1840)

In other words, the 2-norm is orthogonally invariant. Any antisymmetric matrix constructs an orthogonal matrix; id est, for A = −A^T

Q = (I + A)−1(I − A) (1841)
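This is the Cayley transform; a numerical sketch constructing an orthogonal Q from a random antisymmetric A (I + A is always invertible here because eigenvalues of an antisymmetric matrix are purely imaginary):

% Cayley transform: antisymmetric A --> orthogonal Q = (I+A)^(-1)(I-A)   (1841)
n = 4;
B = randn(n);
A = B - B';                        % antisymmetric: A' = -A
Q = (eye(n) + A)\(eye(n) - A);
disp(norm(Q'*Q - eye(n)))          % ~0 : Q is orthogonal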

A unitary matrix is a complex generalization of orthogonal matrix; conjugate transpose defines it: U^{−1} = U^H. An orthogonal matrix is simply a real unitary matrix.^{B.9}

Orthogonal matrix Q is a normal matrix further characterized by norm:

Q^{−1} = Q^T ,  ‖Q‖_2² = 1 ,  ‖Q‖_F² = n    (1842)

Applying this characterization to Q^T, we see that it too is an orthogonal matrix because transpose inverse equals inverse transpose. Hence the rows and columns of Q each respectively form an orthonormal set. Normalcy guarantees diagonalization (§A.5.1.0.1). So, for Q ≜ SΛS^H,

SΛ^{−1}S^H = S*ΛS^T ( = SΛ*S^H ) ,  ‖δ(Λ)‖_∞ = 1 ,  1^T|δ(Λ)| = n    (1843)

characterizes an orthogonal matrix in terms of eigenvalues and eigenvectors.

All permutation matrices Ξ, for example, are nonnegative orthogonal matrices; and vice versa. Product or Kronecker product of any permutation matrices remains a permutator. Any product of permutation matrix with orthogonal matrix remains orthogonal. In fact, any product AQ of orthogonal matrices A and Q remains orthogonal by definition. Given any other dimensionally compatible orthogonal matrix U, the mapping g(A) = U^T A Q is a bijection on the domain of orthogonal matrices (a nonconvex manifold of dimension (1/2)n(n−1) [56]). [273, §2.1] [274]

B.9 Orthogonal and unitary matrices are called unitary linear operators.



The largest magnitude entry of an orthogonal matrix is 1; for each and every j ∈ 1 ... n

‖Q(j, :)‖_∞ ≤ 1 ,  ‖Q(:, j)‖_∞ ≤ 1    (1844)

Each and every eigenvalue of a (real) orthogonal matrix has magnitude 1 (Λ^{−1} = Λ*)

λ(Q) ∈ C^n ,  |λ(Q)| = 1    (1845)

but only the Identity matrix can be simultaneously orthogonal and positive definite. Orthogonal matrices have complex eigenvalues in conjugate pairs: so det Q = ±1.

B.5.3 Reflection

A matrix for pointwise reflection is defined by imposing symmetry upon the orthogonal matrix; id est, a reflection matrix is completely defined by Q^{−1} = Q^T = Q. The reflection matrix is a symmetric orthogonal matrix, and vice versa, characterized:

Q^T = Q ,  Q^{−1} = Q^T ,  ‖Q‖_2 = 1    (1846)

The Householder matrix (§B.3.1) is an example of symmetric orthogonal (reflection) matrix.

Reflection matrices have eigenvalues equal to ±1 and so det Q = ±1. It is natural to expect a relationship between reflection and projection matrices because all projection matrices have eigenvalues belonging to {0, 1}. In fact, any reflection matrix Q is related to some orthogonal projector P by [235, §1 prob.44]

Q = I − 2P    (1847)

Yet P is, generally, neither orthogonal nor invertible. (§E.3.2)

λ(Q) ∈ R^n ,  |λ(Q)| = 1    (1848)

Whereas P connotes projection on R(P), here Q connotes reflection with respect to R(P)^⊥.

Matrix I − 2(I−P) = 2P − I represents antireflection; id est, reflection about R(P). Every orthogonal matrix can be expressed as the product of a rotation and a reflection.

The collection of all orthogonal matrices of particular dimension forms a nonconvex set; topologically, it is instead referred to as a manifold.

B.5.3.0.1 Example. Pythagorean sum by antireflection sequence in R².
Figure 186 illustrates a process for determining magnitude of vector p_0 = [x y]^T ∈ R²_+. The given point p_0 is assumed to be a member of quadrant I with y ≤ x.^{B.10} The idea is to rotate p_0 into alignment with the x axis; its length then becomes equivalent to the rotated x coordinate. Rotation is accomplished by iteration (i index):

First, p_0 is projected on the x axis; [x 0]^T. Vector u_0 = [x y/2]^T is a bisector of difference p_0 − [x 0]^T, translated to projection [x 0]^T, and is the perpendicular bisector of chord p_0 p_1. Point p_1 is the antireflection of p_0;

p_1 = ( 2 u_0 u_0^T / (u_0^T u_0) − I ) p_0    (1849)

B.10 p_0 belongs to the monotone nonnegative cone K_{M+} ⊂ R²_+.



[Figure 186: [145] First two of three iterations required, in absence of square root function, to calculate √(x²+y²) to 20 significant digits (presuming infinite precision). Point p_0 = [x y]^T is given. u_0 is a bisector of difference [0 y]^T, translated to [x 0]^T, and is the perpendicular bisector of chord p_0 p_1; similarly for u_1. [255, §16/4] Assuming the Moler & Morrison condition [302] |y| ≤ |x| holds, iterate p_3 (not shown) is always on a quarter circle close to the x axis. (p_0 violates the Moler & Morrison condition, for illustration. Cartesian axes not drawn.)]

In the second iteration, Figure 186 illustrates operation on p_1 and construction of p_2. Because p_2 is the antireflection of p_1, each must be equidistant from the perpendicular bisector through u_1 of chord p_1 p_2.

A third iteration completes the process. Say p_i = [x_i y_i]^T with [x_0 y_0] ≜ [x y] the given point. It is known that the particular selection of bisector u_i = [x_i y_i/2]^T diminishes total required number of iterations to three for 20 significant digits ([302] presuming exact arithmetic).

The entire sequence of operations is illustrated programmatically by this Matlab subroutine to calculate magnitude of any vector in R² by antireflection:

function z = pythag2(x,y) %z(1)=sqrt(x^2+y^2), z(2)=y coordinate of p3
if ~[x;y]                              % zero input: nothing to do
   z = [x;y];
else
   z = sort(abs([x;y]),'descend');     % enforce |y| <= |x|
   for i=1:3
      u = [z(1); z(2)/2];              % perpendicular bisector of chord
      z = 2*u*(u'*z)/(u'*u) - z;       % antireflection (1849)
   end
end
end

Pythagorean sum sans square root was first published in 1983 by Moler & Morrison [302] who did not describe antireflection. □



B.5.3.0.2 Exercise. Pythagorean sum: ‖x ∈ R^n‖ sans square root sans sort.
Matlab subroutine pythag2() calculates magnitude of any vector in R². By constraining perpendicular bisector angle ∠u_i ≤ π/4, show that sorting condition |y| ≤ |x| is obviated:

function z = pythag2d(x,y) %z(1)=sqrt(x^2+y^2), z(2)=y coordinate of p4
if ~[x;y]                              % zero input
   z = [x;y];
elseif ~x                              % point on the y axis
   z = [abs(y); 0];
else
   z = abs([x;y]);
   for i=1:4
      u = [z(1); min(z(1), z(2)/2)];   % constrain bisector angle to pi/4
      z = 2*u*(u'*z)/(u'*u) - z;
   end
end
end

Elimination of sort() incurred an extra iteration in R².

Antireflection admits simple modifications that enable calculation of Euclidean norm for any vector in R^n. Show that Euclidean norm may be calculated with only a small number of iterations in absence of square root and sorting functions:

function z = pythag3d(x) %z(1)=||x||, z(2:end)=coordinates of p13
if ~x                                      % zero input
   z = x;
elseif ~x(1)                               % first coordinate zero: recurse
   z = [pythag3d(x(2:end)); 0];
else
   z = abs(x);
   for i=1:13
      u = [z(1); min(z(1), z(2:end)/2)];   % bisector normal to a hyperplane
      z = 2*u*(u'*z)/(u'*u) - z;
   end
end
end

Projection is on the first nonzero coordinate axis, as in §B.5.3.0.1, then antireflection is about a vector u_i normal to a hyperplane. Demonstrate that number of required iterations does not grow linearly with n; contrary to [302, §3], growth is much slower.

B.5.4 Rotation of range and rowspace

Given orthogonal matrix Q, column vectors of a matrix X are simultaneously rotated about the origin via product QX. In three dimensions (X ∈ R^{3×N}), precise meaning of rotation is best illustrated in Figure 187 where a gimbal aids visualization of what is achievable; mathematically, (§5.5.2.0.1)

Q = [ cos θ  0  −sin θ ; 0  1  0 ; sin θ  0  cos θ ] [ 1  0  0 ; 0  cos ψ  −sin ψ ; 0  sin ψ  cos ψ ] [ cos φ  −sin φ  0 ; sin φ  cos φ  0 ; 0  0  1 ]    (1850)



[Figure 187: Gimbal: a mechanism imparting three degrees of dimensional freedom to a Euclidean body suspended at its center. Each ring is free to rotate about one axis. (Drawing by courtesy of The MathWorks Inc.)]

B.5.4.0.1 Example. One axis of revolution.
Partition n+1-dimensional Euclidean space R^{n+1} ≜ [ R^n ; R ] and define an n-dimensional subspace

R ≜ {λ ∈ R^{n+1} | 1^T λ = 0}    (1851)

(a hyperplane through the origin). We want an orthogonal matrix that rotates a list in the columns of matrix X ∈ R^{(n+1)×N} through the dihedral angle between R^n and R (§2.4.3)

∠(R^n, R) = arccos( 〈e_{n+1}, 1〉 / (‖e_{n+1}‖ ‖1‖) ) = arccos( 1/√(n+1) ) radians    (1852)

The vertex-description of the nonnegative orthant in R^{n+1} is

{ [e_1 e_2 · · · e_{n+1}] a | a ⪰ 0 } = {a ⪰ 0} = R^{n+1}_+ ⊂ R^{n+1}    (1853)

Consider rotation of these vertices via orthogonal matrix

Q ≜ [ 1 (1/√(n+1))   Ξ V_W ] Ξ ∈ R^{(n+1)×(n+1)}    (1854)

where permutation matrix Ξ ∈ S^{n+1} is defined in (1920) and where V_W ∈ R^{(n+1)×n} is the orthonormal auxiliary matrix defined in §B.4.3. This particular orthogonal matrix is selected because it rotates any point in subspace R^n about one axis of revolution onto R; e.g. rotation Q e_{n+1} aligns the last standard basis vector with subspace normal R^⊥ = 1. The rotated standard basis vectors remaining are orthonormal spanning R. □

Another interpretation of product QX is rotation/reflection of R(X). Rotation of X as in QXQ^T is a simultaneous rotation/reflection of range and rowspace.^{B.11}

Proof. Any matrix can be expressed as a singular value decomposition X = UΣW^T (1734) where δ²(Σ) = Σ, R(U) ⊇ R(X), and R(W) ⊇ R(X^T). ∎

B.11 Product Q^T A Q can be regarded as coordinate transformation; e.g. given linear map y = Ax : R^n → R^n and orthogonal Q, the transformation Qy = AQx is a rotation/reflection of range and rowspace (145) of matrix A where Qy ∈ R(A) and Qx ∈ R(A^T) (146).



[Figure 188: 10×10 arrow matrix; twenty-eight nonzero (nz) entries indicated by its sparsity pattern.]

B.5.5 Matrix rotation

Orthogonal matrices are also employed to rotate/reflect other matrices like vectors: [185, §12.4.1] Given orthogonal matrix Q, the product Q^T A will rotate A ∈ R^{n×n} in the Euclidean sense in R^{n²} because Frobenius' norm is orthogonally invariant (§2.2.1);

‖Q^T A‖_F = √(tr(A^T Q Q^T A)) = ‖A‖_F    (1855)

(likewise for AQ). Were A symmetric, such a rotation would depart from S^n. One remedy is to instead form product Q^T A Q because

‖Q^T A Q‖_F = √(tr(Q^T A^T Q Q^T A Q)) = ‖A‖_F    (1856)

By §A.1.1 no.33,

vec Q^T A Q = (Q ⊗ Q)^T vec A    (1857)

which is a rotation of the vectorized A matrix because Kronecker product of any orthogonal matrices remains orthogonal; e.g. by §A.1.1 no.43,

(Q ⊗ Q)^T (Q ⊗ Q) = I    (1858)
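A sketch verifying (1856) and (1857), using a Householder matrix for Q (any orthogonal matrix would serve):

% two-sided rotation preserves Frobenius norm; its vec form is a Kronecker rotation
n = 4;
A = randn(n);
u = randn(n,1);
Q = eye(n) - 2*(u*u')/(u'*u);                        % orthogonal (Householder)
disp(norm(Q'*A*Q,'fro') - norm(A,'fro'))             % ~0 : (1856)
disp(norm(kron(Q,Q)'*A(:) - reshape(Q'*A*Q,[],1)))   % ~0 : (1857)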

Matrix A is orthogonally equivalent to B if B = S^T A S for some orthogonal matrix S. Every square matrix, for example, is orthogonally equivalent to a matrix having equal entries along the main diagonal. [233, §2.2 prob.3]

B.6 Arrow matrix

Consider a partitioned symmetric n×n-dimensional matrix A that has arrow [330] (or arrowhead [373]) form constituted by vectors a, b ∈ R^{n−1} and real scalar c :

A ≜ [ δ(a)  b ; b^T  c ] ∈ S^n    (1859)

Figure 188 illustrates sparsity pattern of an arrow matrix. Embedding of diagonal matrix δ(a) makes relative sparsity increase with dimension. Because an arrow matrix is a kind of bordered matrix, eigenvalues of δ(a) and A are interlaced;

λ_n ≤ (Ξ^T a)_{n−1} ≤ λ_{n−1} ≤ (Ξ^T a)_{n−2} ≤ · · · ≤ (Ξ^T a)_1 ≤ λ_1    (1860)

[374, §6.4] [233, §4.3] [370, §IV.4.1] denoting nonincreasingly ordered eigenvalues of A by vector λ ∈ R^n, and those of δ(a) by Ξ^T a ∈ R^{n−1} where Ξ is a permutation matrix arranging a into nonincreasing order: δ(a) = Ξ δ(Ξ^T a) Ξ^T (§A.5.1.2).



B.6.1 positive semidefinite arrow matrix

i) Nonnegative main diagonal a ⪰ 0 insures n−1 nonnegative eigenvalues in (1860). Positive semidefiniteness is left determined by smallest eigenvalue λ_n:

A ⪰ 0 ⇔ a ⪰ 0 ,  b^T( I − δ(a)δ(a)^† ) = 0 ,  c − b^T δ(a)^† b ≥ 0
     ⇔ c ≥ 0 ,  b(1 − cc^†) = 0 ,  δ(a) − c^† bb^T ⪰ 0    (1861)

Schur complement condition (§A.4) b^T( I − δ(a)δ(a)^† ) = 0 is most simply a requirement for

ii) a zero entry in vector b wherever there is a corresponding zero entry in vector a.

In other words, vector b can reside anywhere in a Cartesian subspace of R^{n−1} that is determined solely by indices of the nonzero entries in vector a.

iii) c ≥ b^T δ(a)^† b provides a tight lower bound for scalar c.

As shown in §3.5.1, b^T δ(a)^† b is simultaneously convex in vectors a and b.
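A numerical sketch of test (1861) on example data chosen to sit on the positive semidefinite boundary (condition iii holding with equality):

% positive semidefinite arrow matrix (1859), conditions (1861) vs. eigenvalues
a = [2; 3; 0; 1];              % nonnegative main diagonal
b = [1; -1; 0; 2];             % zero wherever a is zero (condition ii)
c = b'*pinv(diag(a))*b;        % tight lower bound of condition iii
A = [diag(a) b; b' c];
disp(min(eig(A)))              % ~0 : A is PSD, on the boundary
A(end,end) = c - 0.1;          % violate  c >= b'*diag(a)^dagger*b
disp(min(eig(A)))              % negative : no longer PSD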