Download - Lecture 2: Generalized Inverses - Ben-Israel.netbenisrael.net › GI-LECTURE-2.pdfLecture 2: Generalized Inverses 1. Moore’s plan ... matrix in the study of properties of such matrices


Lecture 2: Generalized Inverses


Moore’s plan

The striking analogies between the theories for linear equations in

n–dimensional Euclidean space, for Fredholm integral equations in

the space of continuous functions defined on a finite real interval,

and for linear equations in Hilbert space of infinitely many

dimensions, led Moore to lay down his well–known principle.

“The existence of analogies between central features of various

theories implies the existence of a more fundamental general

theory embracing the special theories as particular instances and

unifying them as to those central features.” (Moore, 1912)

“The effectiveness of the reciprocal of a non–singular finite

matrix in the study of properties of such matrices makes it

desirable to define if possible an analogous matrix to be

associated with each finite matrix even if it is not square or, if

square, is not necessarily non–singular.” (Moore 1935)


Desiderata

$\mathbb{C}^{m\times n}_r$ = the $m \times n$ matrices over $\mathbb{C}$ with rank $r$.

A matrix $A \in \mathbb{C}^{n\times n}$ is nonsingular if $\operatorname{rank} A = n$, or equivalently $\det A \neq 0$.

The inverse of A satisfies, by definition, the following equations,

AXA = A (1)

XAX = X (2)

(AX)∗ = AX (3)

(XA)∗ = XA (4)

AX = XA (5)

as well as the conditions

$$Ax = \lambda x \implies A^{-1}x = \frac{1}{\lambda}\,x \qquad (6)$$

$$A, B \text{ nonsingular} \implies (AB)^{-1} = B^{-1}A^{-1} \qquad (7)$$

These properties are desirable. Can one have them for a general A?


The Penrose equations

The Penrose equations for $A \in \mathbb{C}^{m\times n}$ are:

AXA = A , (1)

XAX = X , (2)

(AX)∗ = AX , (3)

(XA)∗ = XA . (4)

Let A{i, j, . . . , k} denote the set of matrices $X \in \mathbb{C}^{n\times m}$ which satisfy equations (i), (j), · · · , (k).

A matrix X ∈ A{i, j, . . . , k} is called an {i, j, . . . , k}–inverse of A, and is also denoted by $A^{(i,j,\ldots,k)}$.

In particular, a {1}–inverse, a {2}–inverse, a {1, 3}–inverse, etc.

The Moore–Penrose inverse of A is its {1, 2, 3, 4}–inverse,

denoted A†.
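The definitions above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `penrose_conditions` and the test matrix are illustrative choices, not part of the lecture. It reports which of the equations (1) to (4) a candidate X satisfies for a given A.

```python
import numpy as np

def penrose_conditions(A, X, tol=1e-10):
    """Return the set of Penrose equations (1)-(4) that X satisfies for A."""
    AX, XA = A @ X, X @ A
    checks = {
        1: np.allclose(AX @ A, A, atol=tol),          # AXA = A
        2: np.allclose(XA @ X, X, atol=tol),          # XAX = X
        3: np.allclose(AX.conj().T, AX, atol=tol),    # (AX)* = AX
        4: np.allclose(XA.conj().T, XA, atol=tol),    # (XA)* = XA
    }
    return {i for i, ok in checks.items() if ok}

# Illustrative rank-1 matrix (3 x 2); any matrix would do.
A = np.array([[1.0, 2.0], [2.0, 4.0], [0.0, 0.0]])
A_dagger = np.linalg.pinv(A, rcond=1e-10)
satisfied = penrose_conditions(A, A_dagger)
```

In this notation, X ∈ A{i, j, . . . , k} exactly when {i, j, . . . , k} is a subset of the returned set. The zero matrix, for instance, is a {2, 3, 4}–inverse of every nonzero A but not a {1}–inverse.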


Why was Moore’s work unknown in 1955?

Answer: Telegraphic style and idiosyncratic notation. Example:

(29.3) Theorem.

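$$U^C\ B^1\ \mathrm{II}\ B^2\ \mathrm{II}\ \kappa^{12} \;\cdot\supset\cdot\; \exists\,|\,\lambda^{21}\ \text{type}\ M^2_{\kappa^*}\, M^1_{\kappa} \;\ni\cdot\; S^2 \kappa^{12}\lambda^{21} = \delta^{11}_{M^1_{\kappa}} \;\cdot\; S^1 \lambda^{21}\kappa^{12} = \delta^{22}_{M^2_{\kappa^*}}$$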

English translation:

(29.3) Theorem.

For every matrix $A$ there exists a unique matrix $X : R(A) \to R(A^*)$ such that

$$AX = P_{R(A)}, \qquad XA = P_{R(A^*)}.$$


Construction of {1}–inverses

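Given $A \in \mathbb{C}^{m\times n}_r$, let $E \in \mathbb{C}^{m\times m}_m$ and $P \in \mathbb{C}^{n\times n}_n$ be such that

$$EAP = \begin{bmatrix} I_r & K \\ O & O \end{bmatrix}. \qquad (1)$$

Then for any $L \in \mathbb{C}^{(n-r)\times(m-r)}$, the $n \times m$ matrix

$$X = P \begin{bmatrix} I_r & O \\ O & L \end{bmatrix} E \qquad (2)$$

is a {1}–inverse of A. The partitioned matrices in (1), (2) must be suitably interpreted in case r = m or r = n.

Proof. Write (1) as

$$A = E^{-1} \begin{bmatrix} I_r & K \\ O & O \end{bmatrix} P^{-1},$$

then verify that any X given by (2) satisfies AXA = A. ∎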
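The construction can be tried out numerically. This is a sketch under stated assumptions: instead of computing E and P by elimination from a given A, it picks E, P, K and L at random (all hypothetical data) and defines A so that EAP has the normal form (1) by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 4, 5, 2

# Hypothetical data: small perturbations of the identity keep E and P nonsingular.
E = np.eye(m) + 0.1 * rng.standard_normal((m, m))
P = np.eye(n) + 0.1 * rng.standard_normal((n, n))
K = rng.standard_normal((r, n - r))
L = rng.standard_normal((n - r, m - r))     # the free block in (2)

# Normal form [[I_r, K], [O, O]] of size m x n.
N = np.zeros((m, n))
N[:r, :r] = np.eye(r)
N[:r, r:] = K

# Define A = E^{-1} N P^{-1}, so that E A P = N holds by construction.
A = np.linalg.solve(E, N) @ np.linalg.inv(P)

# X = P [[I_r, O], [O, L]] E, an n x m matrix.
M = np.zeros((n, m))
M[:r, :r] = np.eye(r)
M[r:, r:] = L
X = P @ M @ E

is_one_inverse = np.allclose(A @ X @ A, A)
```

Different choices of L give different {1}–inverses of the same A, which is why a {1}–inverse is in general far from unique.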


Linear equations

Given $A \in \mathbb{C}^{m\times n}$, $b \in \mathbb{C}^m$, the equations

$$Ax = b \qquad (1)$$

have a solution if and only if for any X ∈ A{1},

$$AXb = b, \qquad (2)$$

in which case the general solution is

$$x = Xb + (I - XA)y, \quad y \in \mathbb{C}^n \text{ arbitrary.} \qquad (3)$$

Proof. $AXA = A \implies AX$ idempotent, $\operatorname{rank} AX = \operatorname{rank} A$.

∴ $AX = P_{R(A),M}$ for some M such that $\mathbb{C}^m = R(A) \oplus M$.

$Ax = b$ consistent $\iff b \in R(A) \iff P_{R(A),M}\,b = b$, ∀ M.

Finally, $A\,(Xb + (I - XA)y) = AXb = b$. ∎
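A small numerical illustration of the consistency test (2) and the general solution (3) follows. It is only a sketch: the matrix and right-hand sides are made up, and the Moore–Penrose inverse from `numpy.linalg.pinv` stands in for one particular {1}–inverse.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])            # rank 1, so Ax = b is not always solvable
X = np.linalg.pinv(A, rcond=1e-10)         # one particular {1}-inverse of A

def is_consistent(A, X, b):
    """Test (2): Ax = b is solvable iff A X b = b."""
    return np.allclose(A @ X @ b, b)

def general_solution(A, X, b, y):
    """Formula (3): x = X b + (I - X A) y for an arbitrary vector y."""
    n = A.shape[1]
    return X @ b + (np.eye(n) - X @ A) @ y

b_good = np.array([1.0, 2.0])              # lies in R(A)
b_bad = np.array([1.0, 0.0])               # does not lie in R(A)

x = general_solution(A, X, b_good, y=np.array([5.0, -1.0, 2.0]))
```

Varying y sweeps out the whole solution set, since (I − XA)y ranges over N(A).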


Linear matrix equations

Theorem. Let $A \in \mathbb{C}^{m\times n}$, $B \in \mathbb{C}^{p\times q}$, $D \in \mathbb{C}^{m\times q}$. Then the matrix equation

$$AXB = D \qquad (1)$$

is consistent if and only if for some $A^{(1)}$, $B^{(1)}$,

$$AA^{(1)}DB^{(1)}B = D, \qquad (2)$$

in which case the general solution is

$$X = A^{(1)}DB^{(1)} + Y - A^{(1)}AYBB^{(1)} \qquad (3)$$

for arbitrary $Y \in \mathbb{C}^{n\times p}$.

Proof. If (1) is consistent then

$$D = AXB = AA^{(1)}AXBB^{(1)}B = AA^{(1)}DB^{(1)}B.$$
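The theorem can be exercised on random data. In this sketch the sizes, the seed and the use of `numpy.linalg.pinv` as the chosen {1}–inverses are all assumptions made for illustration; D is built from a known X so that the equation is consistent.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 2)) @ rng.standard_normal((2, 4))   # 3 x 4, rank 2
B = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 3))   # 5 x 3, rank 2
X_true = rng.standard_normal((4, 5))
D = A @ X_true @ B                       # consistent by construction

A1 = np.linalg.pinv(A, rcond=1e-10)      # a particular {1}-inverse of A
B1 = np.linalg.pinv(B, rcond=1e-10)      # a particular {1}-inverse of B

consistent = np.allclose(A @ A1 @ D @ B1 @ B, D)     # condition (2)

Y = rng.standard_normal((4, 5))                      # arbitrary n x p matrix
X = A1 @ D @ B1 + Y - A1 @ A @ Y @ B @ B1            # general solution (3)
solves = np.allclose(A @ X @ B, D)
```

Substituting (3) into AXB gives D + AYB − AYB = D, which is what `solves` confirms up to rounding.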


Kronecker products and matrix equations

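The Kronecker product $A \otimes B$ of the two matrices $A = (a_{ij}) \in \mathbb{C}^{m\times n}$, $B \in \mathbb{C}^{p\times q}$ is the $mp \times nq$ matrix

$$A \otimes B = \begin{bmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ a_{21}B & a_{22}B & \cdots & a_{2n}B \\ \cdots & \cdots & \cdots & \cdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mn}B \end{bmatrix}$$

For $X = (x_{ij}) \in \mathbb{C}^{m\times n}$, let $\operatorname{vec}(X) = (v_k) \in \mathbb{C}^{mn}$ be the vector obtained by listing the elements of X by rows,

$$v_{n(i-1)+j} = x_{ij} \quad (i = 1, \ldots, m;\ j = 1, \ldots, n).$$

Lemma. For compatible matrices A, X, B,

$$(A \otimes B^T)\operatorname{vec}(X) = \operatorname{vec}(AXB).$$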
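The lemma is easy to confirm numerically, because NumPy stores arrays row by row, which matches the row-wise vec used here. A minimal sketch with arbitrary (assumed) sizes:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((2, 3))
X = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))

def vec(M):
    """List the elements of M by rows (NumPy's default C order)."""
    return M.reshape(-1)

lhs = np.kron(A, B.T) @ vec(X)     # (A ⊗ B^T) vec(X)
rhs = vec(A @ X @ B)               # vec(AXB)
```

With the column-wise vec common in other texts the identity reads (Bᵀ ⊗ A) vec(X) = vec(AXB) instead, so the convention matters when translating formulas.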


Construction of {1, 2}–inverses

Proposition. Let Y, Z ∈ A{1}, and let

X = YAZ .

Then X ∈ A{1, 2}.

Proof. AXA = A(YAZ)A = (AYA)ZA = AZA = A ,

XAX = (YAZ)A(YAZ) = Y(AZA)YAZ = Y(AYA)Z = YAZ = X . ∎

Proposition. Any two of the following statements imply the third:

(a) X ∈ A{1} ,

(b) X ∈ A{2} ,

(c) rank X = rank A .

Proof. X ∈ A{1}, Y ∈ A{2} =⇒ rank Y ≤ rank A ≤ rank X , etc.
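Both propositions can be illustrated together. The sketch below is an assumption-laden example: the hypothetical helper `some_one_inverse` produces {1}–inverses of a random rank-2 matrix from the general solution of AXA = A, and the product YAZ is then tested against equations (1), (2) and the rank condition.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))   # 4 x 3, rank 2
Ap = np.linalg.pinv(A, rcond=1e-10)

def some_one_inverse(W):
    """A {1}-inverse of A built from the pseudoinverse and an arbitrary n x m W."""
    return Ap + W - Ap @ A @ W @ A @ Ap

Y = some_one_inverse(rng.standard_normal((3, 4)))
Z = some_one_inverse(rng.standard_normal((3, 4)))
X = Y @ A @ Z

x_is_1_inverse = np.allclose(A @ X @ A, A)
x_is_2_inverse = np.allclose(X @ A @ X, X)
ranks = (np.linalg.matrix_rank(X, tol=1e-8), np.linalg.matrix_rank(A, tol=1e-8))
```

Here X satisfies (a) and (b), so by the second proposition its rank equals the rank of A.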


Projections

Theorem. For any $A \in \mathbb{C}^{m\times n}$ and $A^{(1)} \in A\{1\}$,

$$R(AA^{(1)}) = R(A), \qquad N(A^{(1)}A) = N(A), \qquad R((A^{(1)}A)^*) = R(A^*).$$

Proof. Always $R(AX) \subset R(A)$, $N(A) \subset N(XA)$.

But $AXA = A \implies \operatorname{rank} AX = \operatorname{rank} XA = \operatorname{rank} A$.

Theorem. Let X be a {1, 2}–inverse of A. Then:

(a) AX is the projector on R(A) along N(X), and

(b) XA is the projector on R(X) along N(A).

Proof. $AX = (AX)^2 \implies AX = P_{R(AX),N(AX)}$

$AXA = A \implies R(AX) = R(A)$

$XAX = X$, $\operatorname{rank} AX = \operatorname{rank} X \implies N(AX) = N(X)$


The set of {1, 3}–inverses

Theorem. The set A{1, 3} consists of all solutions for X of

$$AX = AA^{(1,3)}, \qquad (1)$$

where $A^{(1,3)}$ is an arbitrary element of A{1, 3}.

Proof. If X satisfies (1), then

$$AXA = AA^{(1,3)}A = A, \qquad AX = (AX)^*. \qquad \therefore\ X \in A\{1,3\}.$$

Conversely, if X ∈ A{1, 3}, then

$$AA^{(1,3)} = AXAA^{(1,3)} = (AX)^*AA^{(1,3)} = X^*A^*(A^{(1,3)})^*A^* = X^*A^* = AX.$$

Theorem. The set A{1, 4} consists of all solutions for X of

$$XA = A^{(1,4)}A.$$


Characterizations of {1, 3}, and {1, 4}–inverses

Recall that for $\mathbb{C}^n = L \oplus M$,

$$M = L^\perp \iff P_{L,M} \text{ is Hermitian.}$$

Theorem. For any $A \in \mathbb{C}^{m\times n}$:

(a) $AX = P_{R(A)} \iff X \in A\{1,3\}$

(b) $XA = P_{R(A^*)} \iff X \in A\{1,4\}$

Proof. (a) ⇐:

$AXA = A \implies AX = P_{R(AX),N(AX)}$

$AXA = A \implies R(AX) = R(A)$, ∴ $AX = P_{R(A),N(AX)}$

$AX = (AX)^* \implies N(AX) = R(A)^\perp$, ∴ $AX = P_{R(A)}$

(a) ⇒: $AX = P_{R(A)} = AA^{(1,3)} \implies X \in A\{1,3\}$


{1, 2, 3}, and {1, 2, 4}–inverses

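Theorem (Urquhart). For every $A \in \mathbb{C}^{m\times n}$,

$$(A^*A)^{(1)}A^* \in A\{1,2,3\}, \qquad (a)$$

$$A^*(AA^*)^{(1)} \in A\{1,2,4\}, \qquad (b)$$

$$A^{(1,4)}AA^{(1,3)} \in A\{1,2,3,4\}. \qquad (c)$$

Proof of (a). Let $X := (A^*A)^{(1)}A^*$.

$R(A^*A) = R(A^*)$ (why?) $\implies A^* = A^*AU$ for some U, ∴ $A = U^*A^*A$.

∴ $AXA = U^*A^*A(A^*A)^{(1)}A^*A = U^*A^*A = A$, ∴ X ∈ A{1}.

$\operatorname{rank} X \le \operatorname{rank} A^*$ and X ∈ A{1} $\implies \operatorname{rank} X \ge \operatorname{rank} A$,

∴ $\operatorname{rank} X = \operatorname{rank} A$, ∴ X ∈ A{2}.

Finally, $AX = U^*A^*A(A^*A)^{(1)}A^*AU = U^*A^*AU$, which is Hermitian, ∴ X ∈ A{3}. ∎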
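Part (a) can be checked on a real example (where A* is the transpose). This is a sketch under assumptions: the {1}–inverse G of A*A is deliberately built to differ from the pseudoinverse, using a random matrix W that is not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 3))   # 5 x 3, rank 2

H = A.T @ A                                   # A*A in the real case
Hp = np.linalg.pinv(H, rcond=1e-10)
W = rng.standard_normal((3, 3))               # arbitrary
G = Hp + W - Hp @ H @ W @ H @ Hp              # a {1}-inverse of H, not the pseudoinverse
X = G @ A.T                                   # Urquhart's formula (a)

eq1 = np.allclose(A @ X @ A, A)
eq2 = np.allclose(X @ A @ X, X)
eq3 = np.allclose((A @ X).T, A @ X)
eq4 = np.allclose((X @ A).T, X @ A)           # not guaranteed by (a)
```

For this choice of G equation (4) fails, so X is a {1, 2, 3}–inverse that differs from A†.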


The Moore–Penrose inverse

Theorem (Penrose). Given A ∈ Cm×n, a solution of

AXA = A , (1)

XAX = X , (2)

(AX)∗ = AX , (3)

(XA)∗ = XA , (4)

exists and is unique. The {1, 2, 3, 4}–inverse of A is denoted A†.

Proof. Uniqueness. Let X, Y ∈ A{1, 2, 3, 4}. Then

$$X = X(AX)^* = XX^*A^* = X(AX)^*(AY)^* = XAY = (XA)^*(YA)^*Y = A^*Y^*Y = (YA)^*Y = Y.$$

Existence. $A^\dagger = A^{(1,4)}AA^{(1,3)}$. ∎


Full–rank factorization

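Given $A \in \mathbb{C}^{m\times n}_r$, $r > 0$, a full–rank factorization is

$$A = CR, \qquad C \in \mathbb{C}^{m\times r}_r, \quad R \in \mathbb{C}^{r\times n}_r. \qquad (1)$$

Theorem (MacDuffee). Given $A \in \mathbb{C}^{m\times n}_r$, $r > 0$, and C, R as in (1),

$$A^\dagger = R^*(C^*AR^*)^{-1}C^*. \qquad (2)$$

Proof. $C^*AR^*$ is nonsingular, because

$$C^*AR^* = (C^*C)(RR^*),$$

a product of nonsingular matrices.

Let $X = \text{RHS}(2) = R^*(RR^*)^{-1}(C^*C)^{-1}C^*$, and check that X satisfies the 4 Penrose equations. ∎

$$A^\dagger = R^*(RR^*)^{-1}(C^*C)^{-1}C^* = R^\dagger C^\dagger \qquad (3)$$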

Q: What is a “good” method for full–rank factorization ?
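One possible answer, offered only as a sketch and not as the lecture's recommendation, is to read a full–rank factorization off a truncated SVD. The helper name `full_rank_factorization`, the tolerance and the test matrix are assumptions; formula (2) is then compared with NumPy's pseudoinverse.

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))   # 5 x 4, rank 2

def full_rank_factorization(A, tol=1e-10):
    """Return C (m x r) and R (r x n), both of rank r, with A = C R.

    Assumption: the factors come from a truncated SVD; other choices
    (for example a row-reduced echelon form) would serve equally well.
    Requires A to be nonzero, i.e. r > 0.
    """
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    r = int(np.sum(s > tol * s[0]))
    return U[:, :r] * s[:r], Vh[:r, :]

C, R = full_rank_factorization(A)
Ch, Rh = C.conj().T, R.conj().T

# MacDuffee's formula (2): A† = R* (C* A R*)^{-1} C*.
A_dagger = Rh @ np.linalg.solve(Ch @ A @ Rh, Ch)
```

Only an r × r system is solved, so the cost of formula (2) is dominated by the factorization itself.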


Singular value decomposition

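Let $A \in \mathbb{C}^{m\times n}_r$, $r > 0$, and let

$$AA^*u_i = \sigma_i^2 u_i, \quad i = 1, \ldots, m$$

$$A^*Av_i = \sigma_i^2 v_i, \quad i = 1, \ldots, n$$

$$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0 = \sigma_{r+1} = \sigma_{r+2} = \cdots$$

The singular value decomposition (SVD) of A is

$$A = U\Sigma V^* \qquad \text{(SVD)}$$

$$U = [u_1\ u_2\ \cdots\ u_m] \in \mathbb{C}^{m\times m}, \quad U^*U = I_m,$$

$$V = [v_1\ v_2\ \cdots\ v_n] \in \mathbb{C}^{n\times n}, \quad V^*V = I_n,$$

$$\Sigma = \operatorname{diag}(\sigma_1, \sigma_2, \cdots, \sigma_r) \in \mathbb{R}^{m\times n}.$$

Theorem (Penrose). $A^\dagger = V\Sigma^\dagger U^*$, where

$$\Sigma^\dagger = \operatorname{diag}\left(\frac{1}{\sigma_1}, \frac{1}{\sigma_2}, \cdots, \frac{1}{\sigma_r}\right) \in \mathbb{R}^{n\times m}.$$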
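The theorem translates directly into code. The following sketch assumes NumPy's `numpy.linalg.svd`; the function name `pinv_via_svd` and the relative tolerance used to decide which singular values count as nonzero are illustrative choices.

```python
import numpy as np

def pinv_via_svd(A, tol=1e-10):
    """A† = V Σ† U*, inverting only singular values above a relative tolerance."""
    U, s, Vh = np.linalg.svd(A, full_matrices=True)
    m, n = A.shape
    Sigma_dagger = np.zeros((n, m))
    # Numerical rank r: number of singular values treated as nonzero.
    r = int(np.sum(s > tol * s.max())) if s.size and s.max() > 0 else 0
    Sigma_dagger[:r, :r] = np.diag(1.0 / s[:r])
    return Vh.conj().T @ Sigma_dagger @ U.conj().T

A = np.array([[1.0, 2.0], [2.0, 4.0], [0.0, 0.0]])   # rank 1
A_dagger = pinv_via_svd(A)
```

The tolerance is where rank decisions enter in floating point: a singular value just above it is inverted into a very large entry, one just below it is replaced by zero.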


Properties of the Moore–Penrose inverse

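(a) For any scalar λ,

$$\lambda^\dagger = \begin{cases} 1/\lambda, & \text{if } \lambda \neq 0; \\ 0, & \text{otherwise.} \end{cases}$$

If a, b are column vectors then

(b) $a^\dagger = (a^*a)^\dagger a^*$ (c) $(ab^*)^\dagger = (a^*a)^\dagger (b^*b)^\dagger\, ba^*$

(d) If $D = \operatorname{diag}(\lambda_1, \cdots, \lambda_k) \in \mathbb{C}^{m\times n}$ then

$$D^\dagger = \operatorname{diag}(\lambda_1^\dagger, \cdots, \lambda_k^\dagger) \in \mathbb{C}^{n\times m}.$$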

For any matrix A

(e) (A†)† = A (f) (A∗)† = (A†)∗

(g) (AT )† = (A†)T (h) A† = (A∗A)†A∗ = A∗(AA∗)†

(i) R(A†) = R(A∗) (j) N(A†) = N(A∗)

(k) AA† = PR(A) (l) A†A = PR(A∗)

(m) If U and V are unitary matrices, (UAV )† = V ∗A†U∗

(n) For any matrices A, B: (A ⊗ B)† = A† ⊗ B†


Non–properties of the Moore–Penrose inverse

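(a) In general, for compatible A, B,

$$(AB)^\dagger \neq B^\dagger A^\dagger.$$

(b) If A, B are similar, i.e. $B = S^{-1}AS$ for some nonsingular S, then, in general, $B^\dagger \neq S^{-1}A^\dagger S$.

(c) If $J_k(0)$ is a Jordan block corresponding to the eigenvalue zero, then $(J_k(0))^\dagger = (J_k(0))^T$. For example,

$$\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}^\dagger = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}$$

∴ $A^\dagger$ is not a polynomial in A.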
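Items (a) and (c) can be seen on tiny examples. The matrices below are made up for illustration: a 1 × 2 and a 2 × 1 matrix for which the reverse-order law fails, and the 4 × 4 nilpotent Jordan block.

```python
import numpy as np

pinv = np.linalg.pinv

# (a) Reverse-order law fails: here (AB)† = 1 but B†A† = 1/2.
A = np.array([[1.0, 1.0]])
B = np.array([[1.0], [0.0]])
lhs = pinv(A @ B)
rhs = pinv(B) @ pinv(A)

# (c) The pseudoinverse of the nilpotent Jordan block J_4(0) is its transpose.
J = np.diag(np.ones(3), k=1)
J_dagger = pinv(J, rcond=1e-10)
```

Every polynomial in J is upper triangular, while J† is strictly lower triangular and nonzero, which is the point of the closing remark.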


Continuity of the inverse

Let ‖ · ‖ be a multiplicative matrix norm, i.e.

‖XY ‖ ≤ ‖X‖‖Y ‖ , if XY is defined

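Let $X \in \mathbb{C}^{n\times n}_n$. Then the perturbation $X + E = (I + EX^{-1})X$ is nonsingular for all E such that

$$\|E\| < \frac{1}{\|X^{-1}\|},$$

and its inverse is

$$(X + E)^{-1} = X^{-1}\left(I - EX^{-1} + (EX^{-1})^2 - (EX^{-1})^3 + \cdots\right),$$

which converges if

$$\|EX^{-1}\| < 1, \quad \text{guaranteed by} \quad \|E\| < \frac{1}{\|X^{-1}\|}.$$

The inverse is a continuous function $\mathbb{C}^{n\times n}_n \to \mathbb{C}^{n\times n}_n$, and the nonsingular matrices are an open set in $\mathbb{C}^{n\times n}$.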


The Moore–Penrose inverse is discontinuous

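Ex. Let

$$X(\epsilon) = \begin{bmatrix} 1 & 0 \\ 0 & \epsilon \end{bmatrix} \to X(0) = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad \text{as } \epsilon \to 0.$$

But

$$X(\epsilon)^\dagger = \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{\epsilon} \end{bmatrix} \not\to X(0)^\dagger = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}.$$

For perturbations $E_k \to O$,

$$(X + E_k)^\dagger \to X^\dagger \iff \operatorname{rank}(X + E_k) \to \operatorname{rank} X.$$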
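The example can be reproduced in a few lines. This sketch uses `numpy.linalg.pinv` with its default cutoff, which is well below the values of ε tried here; the function name `X` mirrors the slide and the list of ε values is an arbitrary choice.

```python
import numpy as np

def X(eps):
    """The perturbed matrix X(ε) = diag(1, ε)."""
    return np.array([[1.0, 0.0], [0.0, eps]])

limit = np.linalg.pinv(X(0.0))      # X(0)† = diag(1, 0)

# Distance between X(ε)† and X(0)† for shrinking ε: it grows like 1/ε.
gaps = [np.linalg.norm(np.linalg.pinv(X(eps)) - limit)
        for eps in (1e-2, 1e-4, 1e-6)]
```

The rank jumps from 2 to 1 at ε = 0, which is exactly the situation excluded by the rank condition stated above.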


The Smith normal form

A nonsingular matrix $A \in \mathbb{Z}^{n\times n}$ whose inverse $A^{-1}$ is also in $\mathbb{Z}^{n\times n}$ is called a unit matrix.

Two matrices $A, S \in \mathbb{Z}^{m\times n}$ are said to be equivalent over $\mathbb{Z}$ if there exist two unit matrices $P \in \mathbb{Z}^{m\times m}$ and $Q \in \mathbb{Z}^{n\times n}$ such that

$$PAQ = S. \qquad (1)$$

Theorem. Let $A \in \mathbb{Z}^{m\times n}_r$. Then A is equivalent over $\mathbb{Z}$ to a matrix $S = [s_{ij}] \in \mathbb{Z}^{m\times n}_r$ such that:

(a) $s_{ii} \neq 0$, $i = 1, \ldots, r$,

(b) $s_{ij} = 0$ otherwise, and

(c) $s_{ii}$ divides $s_{i+1,i+1}$ for $i = 1, \ldots, r-1$.

S is called the Smith normal form of A, and its nonzero elements $s_{ii}$ ($i = 1, \ldots, r$) are the invariant factors of A.


Integer solutions

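Let $A \in \mathbb{Z}^{m\times n}$, $b \in \mathbb{Z}^m$, and let the linear equation

$$Ax = b \qquad \text{(P)}$$

be consistent. It is required to determine whether (P) has an integer solution, and if so to determine all of them.

Theorem (Hurt and Waid). Let $A \in \mathbb{Z}^{m\times n}$. Then there is an $n \times m$ matrix X satisfying

$$AXA = A, \qquad (1)$$

$$XAX = X, \qquad (2)$$

$$AX \in \mathbb{Z}^{m\times m}, \quad XA \in \mathbb{Z}^{n\times n}. \qquad (6)$$

Proof. Let PAQ = S be the Smith normal form of A. Then $X = QS^\dagger P$ has the required properties.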


Integer solutions (cont’d)

Let $\widehat{A}$ be the {1, 2}–inverse of A given above.

Theorem (Hurt and Waid). Let A and b be integral, and let the vector equation

$$Ax = b \qquad \text{(P)}$$

be consistent. Then (P) has an integral solution if and only if the vector

$$\widehat{A}\,b$$

is integral, in which case the general integral solution of (P) is

$$x = \widehat{A}\,b + (I - \widehat{A}A)\,y, \quad y \in \mathbb{Z}^n.$$
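A hand-built example may make the two Hurt and Waid theorems concrete. Everything below is hypothetical data: the unit matrices P and Q and the Smith form S were chosen first, and A was computed from them so that PAQ = S. The {1, 2}–inverse is the matrix QS†P constructed from the Smith normal form, and floating point with a rounding test stands in for exact rational arithmetic.

```python
import numpy as np

# Hypothetical example: A is built so that P A Q = S with unit matrices P, Q.
P = np.array([[1, 0], [1, 1]])
Q = np.array([[1, 1, 0], [0, 1, 1], [0, 0, 1]])
S = np.array([[2, 0, 0], [0, 6, 0]])             # Smith normal form, 2 divides 6
A = np.array([[2, -2, 2], [-2, 8, -8]])
assert np.array_equal(P @ A @ Q, S)

S_dagger = np.array([[1 / 2, 0], [0, 1 / 6], [0, 0]])
X = Q @ S_dagger @ P                              # the {1,2}-inverse Q S† P

def is_integral(M):
    """True if every entry of M is an integer up to rounding error."""
    return np.allclose(M, np.round(M))

def integer_solution(b):
    """Return X b if it is integral (then it solves Ax = b), else None."""
    x = X @ b
    return np.round(x).astype(int) if is_integral(x) else None

x_good = integer_solution(np.array([2, 4]))       # an integer solution exists
x_bad = integer_solution(np.array([1, 0]))        # solvable over Q, not over Z
```

For b = (1, 0) the first equation reads 2(x₁ − x₂ + x₃) = 1, which has no integer solution, in agreement with the non-integral vector Xb.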


Application of {2}–inverses to Newton’s method

The Newton method for solving a single equation in 1 variable,

$$f(x) = 0,$$

is

$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}, \quad (k = 0, 1, \ldots).$$

A Newton method for solving m equations in n variables,

$$f_i(x_1, \ldots, x_n) = 0, \quad i = 1, \ldots, m, \qquad \text{or} \qquad f(x) = 0,$$

is similarly given, for the case m = n, by

$$x_{k+1} = x_k - f'(x_k)^{-1} f(x_k), \quad (k = 0, 1, \ldots),$$

where $f'(x_k)$ is the derivative of f at $x_k$, represented by the matrix of partial derivatives (the Jacobian matrix)

$$f'(x_k) = \left( \frac{\partial f_i}{\partial x_j}(x_k) \right).$$


Notation

We denote the derivative of f at c,

$$f'(c) = \left( \frac{\partial f_i}{\partial x_j}(c) \right),$$

by $J_f(c)$ or by $J_c$.

We denote by ‖ · ‖ both a vector norm in $\mathbb{R}^n$ and a matrix norm consistent with it,

$$\|Ax\| \le \|A\|\,\|x\|, \quad \forall\, x.$$

For a given point $x_0 \in \mathbb{R}^n$ and a positive scalar r we denote by

$$B(x_0, r) = \{x \in \mathbb{R}^n : \|x - x_0\| < r\}$$

the open ball with center $x_0$ and radius r. The closed ball with the same center and radius is

$$\overline{B}(x_0, r) = \{x \in \mathbb{R}^n : \|x - x_0\| \le r\}.$$


Newton method using {2}–inverses of f ′

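Theorem. Let $x_0 \in \mathbb{R}^n$, $r > 0$ and let $f : \mathbb{R}^n \to \mathbb{R}^m$ be differentiable in $B(x_0, r)$. Let $M > 0$ be such that

$$\|J_u - J_v\| \le M\,\|u - v\| \qquad (1)$$

for all $u, v \in B(x_0, r)$. Further, assume that for all $x \in B(x_0, r)$, the Jacobian $J_x$ has a {2}–inverse $T_x \in \mathbb{R}^{n\times m}$, $T_xJ_xT_x = T_x$, such that

$$\|T_{x_0}\|\,\|f(x_0)\| < \alpha, \qquad (2)$$

and

$$\|(T_u - T_v)f(v)\| \le N\,\|u - v\|^2, \quad \forall\, u, v \in B(x_0, r) \qquad (3)$$

$$\frac{M}{2}\|T_u\| + N \le K < 1, \quad \forall\, u \in B(x_0, r) \qquad (4)$$

for some positive scalars N, K and α, and

$$h := \alpha K < 1, \qquad \frac{\alpha}{1 - h} < r. \qquad (5)$$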


Theorem (cont’d)

Then:

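(a) Starting at $x_0$, all iterates

$$x_{k+1} = x_k - T_{x_k} f(x_k), \quad k = 0, 1, \ldots \qquad (6)$$

lie in $B(x_0, r)$.

(b) The sequence $\{x_k\}$ converges, as $k \to \infty$, to a point $x_\infty \in B(x_0, r)$, that is a solution of

$$T_{x_\infty} f(x) = 0. \qquad (7)$$

(c) For all $k \ge 0$,

$$\|x_k - x_\infty\| \le \alpha\,\frac{h^{2^k - 1}}{1 - h^{2^k}}. \qquad (8)$$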

Since 0 < h < 1, the method is (at least) quadratically convergent.

The iterates converge not to a solution of f(x) = 0, but of (7). The

degree of approximation depends on the {2}–inverse used.
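Iteration (6) can be tried on a toy problem. The sketch below is an illustration under assumptions, not the setting of the theorem's hypotheses: f is a single made-up equation in two unknowns (the unit circle), and the {2}–inverse is taken to be the Moore–Penrose inverse of the Jacobian, which is one admissible choice since it satisfies TJT = T.

```python
import numpy as np

def f(x):
    """One equation in two unknowns: the unit circle x1^2 + x2^2 = 1."""
    return np.array([x[0] ** 2 + x[1] ** 2 - 1.0])

def jacobian(x):
    """The 1 x 2 Jacobian matrix of f at x."""
    return np.array([[2.0 * x[0], 2.0 * x[1]]])

def newton_2_inverse(x0, steps=20):
    """Iteration (6) with T_x chosen as the Moore-Penrose inverse of J_x."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        T = np.linalg.pinv(jacobian(x))     # satisfies T J T = T
        x = x - T @ f(x)
    return x

x_inf = newton_2_inverse([2.0, 1.0])
```

In this example the Jacobian has full row rank along the iteration, so T f(x) = 0 forces f(x) = 0 and the limit lies on the circle. With a {2}–inverse of lower rank the limit would in general only solve (7).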
