2. MATRICES IN 2 DIMENSIONS

§2.1. Matrices So far we’ve concentrated on 2-dimensional row and column vectors. It’s not difficult to see how the definitions and theorems could be extended to n-dimensional row and column vectors. But we can extend some of these ideas even further. A row vector has 1 row and several columns. A column vector has 1 column and several rows. Why not consider ‘vectors’ with several rows and several columns? Good idea, but we call them ‘matrices’ (the plural of ‘matrix’). Although the word ‘matrix’ is used in geology, the mathematical use comes from the printing industry. In the days of moveable type, typesetters chose their letters from trays of little compartments, with many rows and columns.

An m × n matrix is a rectangular array of numbers, with m rows and n columns. The mn entries in the table are called the components and are normally real numbers. The i-j component of the matrix is the number that’s in the i’th row and j’th column. Here we’ll concentrate on 2 × 2 matrices.

Example 1: $\begin{pmatrix} 2 & 3 \\ -4 & 1 \end{pmatrix}$ is a 2 × 2 matrix.

Matrices are just like vectors except that they are written in rectangular blocks instead of a single row or column. We could write the 4 components of a 2 × 2 matrix as a 4-dimensional row or column vector. We add matrices in the same way as we add vectors and we multiply a matrix by a scalar in the same way as for vectors.

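Example 2: If $A = \begin{pmatrix} 3 & 2 \\ -1 & 5 \end{pmatrix}$ and $B = \begin{pmatrix} 4 & 1 \\ 7 & 3 \end{pmatrix}$ then

$3A + 2B = \begin{pmatrix} 9 & 6 \\ -3 & 15 \end{pmatrix} + \begin{pmatrix} 8 & 2 \\ 14 & 6 \end{pmatrix} = \begin{pmatrix} 17 & 8 \\ 11 & 21 \end{pmatrix}$.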
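The componentwise rules for addition and scalar multiplication can be checked mechanically. The following is a minimal Python sketch, assuming a matrix is stored as a list of rows; the helper names `add` and `scale` are illustrative choices and not part of the text.

```python
def add(A, B):
    """Add two matrices of the same shape, component by component."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def scale(k, A):
    """Multiply every component of the matrix A by the scalar k."""
    return [[k * a for a in row] for row in A]


# The matrices of Example 2.
A = [[3, 2], [-1, 5]]
B = [[4, 1], [7, 3]]
print(add(scale(3, A), scale(2, B)))  # [[17, 8], [11, 21]]
```

Nothing here depends on the matrices being 2 × 2; the same two functions work for any pair of matrices with matching shapes.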

§2.2. Matrix Multiplication When adding two 2 × 2 matrices or multiplying a 2 × 2 matrix by a scalar, it would make no difference if the components were written as a 4-dimensional row or column vector. But what we can do with 2 × 2 matrices, that we can’t do with 4-dimensional vectors, is to multiply them. Well, we can multiply two vectors u, v using the inner product u.v but the answer is not another vector; it’s a scalar.

With matrix multiplication the product of A and B is written AB and, being a 2 × 2 matrix itself, we can multiply it by a third matrix C to get (AB)C. Now you might expect, after noting that adding matrices means just adding components, that matrix multiplication might be similar. However it’s rather more complicated. It takes a little bit of getting used to.

Suppose we have a change of variables:

x1 = a1x + a2y

y1 = a3x + a4y

where a1, a2, a3, a4 are constants. We’ll see later that, for example, such a change of variables occurs whenever a vector $\begin{pmatrix} x \\ y \end{pmatrix}$ is rotated to get a new vector $\begin{pmatrix} x_1 \\ y_1 \end{pmatrix}$.

But suppose that $\begin{pmatrix} x \\ y \end{pmatrix}$ was obtained from $\begin{pmatrix} x_0 \\ y_0 \end{pmatrix}$ by a similar transformation:

x = b1x0 + b2y0

y = b3x0 + b4y0

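It would be nice to be able to write each of these transformations using matrices as follows.

$\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$ and $\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} b_1 & b_2 \\ b_3 & b_4 \end{pmatrix} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}$.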

This would mean defining the product of a matrix by a column vector as follows:

$\begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a_1 x + a_2 y \\ a_3 x + a_4 y \end{pmatrix}$.

After changing the variables twice we’d get x1 and y1 expressed in terms of the original variables as follows:

x1 = a1x + a2y = a1(b1x0 + b2y0) + a2(b3x0 + b4y0) = (a1b1 + a2b3)x0 + (a1b2 + a2b4)y0

and

y1 = a3x + a4y = a3(b1x0 + b2y0) + a4(b3x0 + b4y0) = (a3b1 + a4b3)x0 + (a3b2 + a4b4)y0.

Expressing this using matrices and vectors we’d get:

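$\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} a_1 b_1 + a_2 b_3 & a_1 b_2 + a_2 b_4 \\ a_3 b_1 + a_4 b_3 & a_3 b_2 + a_4 b_4 \end{pmatrix} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}$.

But $\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$ and, since $\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} b_1 & b_2 \\ b_3 & b_4 \end{pmatrix} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}$, we’d get

$\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix} \begin{pmatrix} b_1 & b_2 \\ b_3 & b_4 \end{pmatrix} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}$.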

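We therefore define the product of a matrix by a vector by

$\begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a_1 x + a_2 y \\ a_3 x + a_4 y \end{pmatrix}$

and we define the product of two 2 × 2 matrices by

$\begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix} \begin{pmatrix} b_1 & b_2 \\ b_3 & b_4 \end{pmatrix} = \begin{pmatrix} a_1 b_1 + a_2 b_3 & a_1 b_2 + a_2 b_4 \\ a_3 b_1 + a_4 b_3 & a_3 b_2 + a_4 b_4 \end{pmatrix}$.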

These formulae may be pretty hard to memorise, which is why we don’t. The best approach to learning most formulae in mathematics is to learn the pattern. Look at the definition of the product of two matrices. To get the 1-1 component of the product (top-left corner) you run along the 1st row of the first factor and down the 1st column of the second, multiplying corresponding components and adding these products. To get the 1-2 component (top-right) you run along the 1st row of the first factor and down the 2nd column of the second factor. To get the 2-1 component (bottom-left) you run along the 2nd row of the first factor and simultaneously down the 1st column of the second factor. The 2-2 component of the product follows exactly the same pattern. This is essentially the same as calculating an inner product, but with a row vector from the first matrix and a column vector from the second. Now look at the formula for multiplying a matrix by a column vector. It follows exactly the same pattern! Once you’ve mastered this you will be able to multiply larger matrices. Example 3:

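$\begin{pmatrix} 3 & 2 \\ -1 & 4 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 3a + 2c & 3b + 2d \\ -a + 4c & -b + 4d \end{pmatrix}$

$\begin{pmatrix} 3 & 2 \\ -1 & 4 \end{pmatrix} \begin{pmatrix} 5 & 9 \\ 0 & -7 \end{pmatrix} = \begin{pmatrix} 3 \cdot 5 + 2 \cdot 0 & 3 \cdot 9 + 2 \cdot (-7) \\ (-1) \cdot 5 + 4 \cdot 0 & (-1) \cdot 9 + 4 \cdot (-7) \end{pmatrix} = \begin{pmatrix} 15 & 13 \\ -5 & -37 \end{pmatrix}$.

$\begin{pmatrix} 3 & 2 \\ -1 & 4 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 3x + 2y \\ -x + 4y \end{pmatrix}$

$\begin{pmatrix} 3 & 2 \\ -1 & 4 \end{pmatrix} \begin{pmatrix} 5 \\ 7 \end{pmatrix} = \begin{pmatrix} 15 + 14 \\ -5 + 28 \end{pmatrix} = \begin{pmatrix} 29 \\ 23 \end{pmatrix}$.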
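The “along the row, down the column” pattern translates directly into code. Here is a minimal Python sketch, assuming matrices are stored as lists of rows; the function name `matmul` is an illustrative choice, not something defined in the text.

```python
def matmul(A, B):
    """Multiply matrices stored as lists of rows.

    Each entry of the product is the inner product of a row of A
    with a column of B.
    """
    if len(A[0]) != len(B):
        raise ValueError("columns of the first factor must match rows of the second")
    columns_of_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns_of_B]
            for row in A]


A = [[3, 2], [-1, 4]]
print(matmul(A, [[5, 9], [0, -7]]))  # [[15, 13], [-5, -37]]
print(matmul(A, [[5], [7]]))         # [[29], [23]]
```

A column vector is passed as a matrix with one column, so the same function covers both kinds of product. Calling `matmul([[5], [7]], A)` raises the `ValueError`, because the row of the first factor runs out before the column of the second does.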

Although our emphasis in this chapter is on 2 × 2 matrices, and associated vectors, let’s see how we could multiply larger matrices by the same method.

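Example 4: $\begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \\ 7 & 8 \end{pmatrix} \begin{pmatrix} a & b & c \\ d & e & f \end{pmatrix} = \begin{pmatrix} a + 2d & b + 2e & c + 2f \\ 3a + 4d & 3b + 4e & 3c + 4f \\ 5a + 6d & 5b + 6e & 5c + 6f \\ 7a + 8d & 7b + 8e & 7c + 8f \end{pmatrix}$.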

A column vector can be considered as a matrix with one column so once we state the definition of the product of any two matrices we observe that multiplying a matrix by a column vector is just a special case of matrix multiplication.

Notice that you can’t multiply a column vector by a matrix. The matrix must come first.

While $\begin{pmatrix} 3 & 2 \\ -1 & 4 \end{pmatrix} \begin{pmatrix} 5 \\ 7 \end{pmatrix} = \begin{pmatrix} 15 + 14 \\ -5 + 28 \end{pmatrix} = \begin{pmatrix} 29 \\ 23 \end{pmatrix}$, the product $\begin{pmatrix} 5 \\ 7 \end{pmatrix} \begin{pmatrix} 3 & 2 \\ -1 & 4 \end{pmatrix}$ is not defined. If we run along the first row of $\begin{pmatrix} 5 \\ 7 \end{pmatrix}$ and down the first column of $\begin{pmatrix} 3 & 2 \\ -1 & 4 \end{pmatrix}$ we’ll run out of components in the row before we have finished with the column. We will explore this question further when we consider matrices of all sizes.

Even the inner product of two vectors is a particular case of matrix multiplication. If u1 and u2 are column vectors with the same dimension then u1.u2 is the matrix product $u_1^T u_2$. For example, if $u_1 = \begin{pmatrix} x_1 \\ y_1 \end{pmatrix}$ and $u_2 = \begin{pmatrix} x_2 \\ y_2 \end{pmatrix}$ then $u_1^T u_2 = \begin{pmatrix} x_1 & y_1 \end{pmatrix} \begin{pmatrix} x_2 \\ y_2 \end{pmatrix} = (x_1 x_2 + y_1 y_2)$. Technically, this last matrix is a 1 × 1 matrix while u1.u2 is a scalar. However, we always identify a 1 × 1 matrix (k) with the scalar k itself.

§2.3. The Algebra of Matrices The system of 2 × 2 real matrices (matrices with real components) is like the system of real numbers themselves in many ways. You can add them and multiply them.

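There’s a matrix $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ that behaves like the number zero:

$\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$.

There’s a matrix $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ that behaves like the number 1:

$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$.

We call the matrix $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ the zero matrix and denote it by the symbol 0. (Of course we have to know from the context whether 0 denotes the zero matrix or the zero scalar.) We call the matrix $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ the identity matrix and denote it by the symbol I. There are some important differences between the algebra of 2 × 2 real matrices and the algebra of real numbers, but we first concentrate on the similarities.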

Theorem 1: For all 2 × 2 matrices A, B, C the following identities hold:
(1) (Commutative law for Addition) A + B = B + A;
(2) (Associative law for Addition) (A + B) + C = A + (B + C);
(3) (Associative law for Multiplication) (AB)C = A(BC);
(4) (Distributive laws) A(B + C) = AB + AC and (A + B)C = AC + BC.

Proof: These can all be verified by putting $A = \begin{pmatrix} a_1 & b_1 \\ c_1 & d_1 \end{pmatrix}$, $B = \begin{pmatrix} a_2 & b_2 \\ c_2 & d_2 \end{pmatrix}$ and $C = \begin{pmatrix} a_3 & b_3 \\ c_3 & d_3 \end{pmatrix}$ and calculating both sides of each equation. For example:

$A(B + C) = \begin{pmatrix} a_1 & b_1 \\ c_1 & d_1 \end{pmatrix} \begin{pmatrix} a_2 + a_3 & b_2 + b_3 \\ c_2 + c_3 & d_2 + d_3 \end{pmatrix} = \begin{pmatrix} a_1(a_2 + a_3) + b_1(c_2 + c_3) & a_1(b_2 + b_3) + b_1(d_2 + d_3) \\ c_1(a_2 + a_3) + d_1(c_2 + c_3) & c_1(b_2 + b_3) + d_1(d_2 + d_3) \end{pmatrix}$

while

$AB + AC = \begin{pmatrix} a_1 a_2 + b_1 c_2 + a_1 a_3 + b_1 c_3 & a_1 b_2 + b_1 d_2 + a_1 b_3 + b_1 d_3 \\ c_1 a_2 + d_1 c_2 + c_1 a_3 + d_1 c_3 & c_1 b_2 + d_1 d_2 + c_1 b_3 + d_1 d_3 \end{pmatrix}$.

Careful comparison of the components reveals that these matrices are equal.

So in many ways, the algebra of matrices runs parallel to the algebra of real numbers. But there are two important differences.

MATRIX MULTIPLICATION IS NOT COMMUTATIVE: While AB = BA is possible for 2 × 2 matrices, in most cases AB ≠ BA.

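Example 5: Let $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$. Then $AB = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ while $BA = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$.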

A NON-ZERO MATRIX NEED NOT HAVE AN INVERSE UNDER MULTIPLICATION: While most 2 × 2 matrices have multiplicative inverses many do not.

Example 6: Let $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Then $AB = \begin{pmatrix} c & d \\ 0 & 0 \end{pmatrix}$ which can never equal $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. So the matrix $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, though non-zero, fails to have an inverse.

The consequences of these differences between matrix algebra and ordinary algebra (the algebra of real numbers we learnt at school) are far reaching. For example we learnt at school that a² − b² = (a − b)(a + b). The corresponding equation A² − B² = (A − B)(A + B) doesn’t usually hold for matrices.

If you expand the right-hand-side using the distributive laws you get A² − BA + AB − B². This only simplifies to A² − B² if AB = BA, which is not generally true.

Another well-known identity from school algebra is (a + b)² = a² + 2ab + b². For matrices we expand (A + B)² as A² + AB + BA + B², but that’s as far as it goes. We can only combine the AB and BA terms into a single 2AB term in the rare cases where AB = BA.

Another “fact of algebra” is the cancellation law: if xy = 0 then x = 0 or y = 0. For real numbers we prove this as follows:

Suppose xy = 0 and x ≠ 0. Hence x⁻¹ exists. ∴ x⁻¹(xy) = 0 ∴ (x⁻¹x)y = 0 ∴ 1y = 0 ∴ y = 0.

This argument fails for the algebra of matrices because there are non-zero matrices that fail to have an inverse. In example 5 we saw a product of two non-zero matrices that equalled zero.

If we’re given the equation x² = 1 in ordinary algebra we easily conclude that x = ±1. The corresponding equation for matrices would be X² = I. But unlike the real number case we get not just the two solutions ±I. In fact we get infinitely many! Can this be? Can a quadratic equation have more than two solutions, even infinitely many? Indeed it can, though not in the algebra of real numbers of course. What you learnt at school is still valid, provided you’re just dealing with real numbers. But in the algebra of matrices strange things happen.

This is how we prove, for real number algebra, that x² = 1 implies that x = ±1.

Suppose that x² = 1.
∴ x² − 1 = 0
∴ (x − 1)(x + 1) = 0
∴ x − 1 = 0 or x + 1 = 0
∴ x = ±1.

Writing this in the notation of matrices this becomes:

Suppose that X² = I.
∴ X² − I = 0 [this step is OK for matrices]
∴ (X − I)(X + I) = 0 [X² − A² doesn’t usually factorise but here it’s OK since XI = IX]
∴ X − I = 0 or X + I = 0 [Here’s the real problem. “AB = 0 implies that A = 0 or B = 0” does not necessarily hold for matrices]
∴ X = ±I. [If we could ever get to the previous stage, this would have been OK]

This argument works for matrices until the second last step. Not only is the proof invalid, the statement that ±I are the only solutions to X² = I is false.

Example 7: Solve the equation X² = I for 2 × 2 matrices X.

Solution: Let $X = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Then $X^2 = \begin{pmatrix} a^2 + bc & b(a + d) \\ c(a + d) & d^2 + bc \end{pmatrix}$.

Suppose that $X^2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. Then

a² + bc = 1

b(a + d) = 0

c(a + d) = 0

d² + bc = 1

Case I: Suppose that a + d ≠ 0. Then b = c = 0. (Here we’ve used the cancellation law, but at the level of numbers, where it is true, not at the level of matrices.) The remaining equations reduce to a² = d² = 1, so a = ±1 and d = ±1. Since a + d ≠ 0 we must have a = d = 1 or a = d = −1. Here we get the two solutions we were expecting:

$X = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I$ or $X = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} = -I$.

Case II: Suppose a + d = 0. Then d = −a. The four equations reduce to just two:

a² + bc = 1

d = −a.

Case IIA: Suppose b ≠ 0. Then c = (1 − a²)/b. This gives, for every combination of values of a and b, provided that b ≠ 0, the solution $X = \begin{pmatrix} a & b \\ \frac{1 - a^2}{b} & -a \end{pmatrix}$. Here we have infinitely many solutions.

Case IIB: Suppose that b = 0. Then c can take any value, a = ±1 and d = −a. This gives two other infinite sets of solutions: $X = \begin{pmatrix} 1 & 0 \\ c & -1 \end{pmatrix}$ and $X = \begin{pmatrix} -1 & 0 \\ c & 1 \end{pmatrix}$ for any real number c.
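The families of solutions found in Case II can be spot-checked numerically. The sketch below is one way to do it in Python, using exact fractions so that no rounding creeps in; the helper names `case_iia` and `case_iib` are hypothetical labels for the two families, and the sample values of a, b and c are arbitrary.

```python
from fractions import Fraction


def matmul2(A, B):
    """Product of two 2 x 2 matrices stored as lists of rows."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return [[a * e + b * g, a * f + b * h],
            [c * e + d * g, c * f + d * h]]


def case_iia(a, b):
    """Solution with b != 0: second row is ((1 - a^2)/b, -a)."""
    a, b = Fraction(a), Fraction(b)
    return [[a, b], [(1 - a * a) / b, -a]]


def case_iib(sign, c):
    """Solution with b = 0: a = sign (+1 or -1), d = -a, c arbitrary."""
    return [[sign, 0], [c, -sign]]


I = [[1, 0], [0, 1]]
samples = [case_iia(2, 3), case_iia(Fraction(1, 2), -5),
           case_iib(1, 7), case_iib(-1, -4)]
print(all(matmul2(X, X) == I for X in samples))  # True
```

A check of a few samples is of course not a proof; the algebra in the solution is what shows that every matrix of these forms squares to I.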

§2.4. Fields There are many algebraic systems which have the two operations of addition and multiplication. The algebra you learnt at school is just one of them. The algebra of matrices is somewhat different. But there are other algebraic systems in which high-school algebra applies. We call these systems ‘fields’.

Don’t you just love the homely vocabulary of mathematicians? They mostly use common words and give them technical meaning, words such as ‘set’, ‘group’, ‘ring’ and ‘field’. Chemists use unpronounceable words such as ‘phenolphthalein’ (an acid-base indicator). Zoologists call the Indian parrot ‘psittacula eupatria’. Botanists refer to bamboo as ‘bamboosa aridinarifolia’. But to cap it all, there’s the longest word in the English language, describing a certain lung disease: ‘pneumonoultramicroscopicsilicovolcanoconiosis’! Good on the mathematicians with their homely vocabulary. I remember once, when attending a course on Infinite Abelian Groups, being told that a ‘virgin subgroup’ is defined to be one that remains pure when embedded in a larger group! I suspect that the ‘virgin’ term was an invention of the lecturer, to get a few laughs, but ‘group’, ‘pure’ and ‘embedded’ are definitely in use in mathematics.

A field is a set F on which there are two binary operations of addition and multiplication (the sum of x and y is written x + y and the product is written xy) such that the following eleven axioms hold.

(1) (Closure under +) x + y ∈ F for all x, y ∈ F;
(2) (Associative law for +) x + (y + z) = (x + y) + z for all x, y, z ∈ F;
(3) (Commutative law for +) x + y = y + x for all x, y ∈ F;
(4) (Identity under +) there is an element 0 ∈ F such that x + 0 = x for all x ∈ F;
(5) (Inverses under +) for all x ∈ F there exists −x ∈ F such that x + (−x) = 0;
(6) (Closure under ×) xy ∈ F for all x, y ∈ F;
(7) (Associative law for ×) x(yz) = (xy)z for all x, y, z ∈ F;
(8) (Commutative law for ×) xy = yx for all x, y ∈ F;
(9) (Identity under ×) there is 1 ∈ F, with 1 ≠ 0, where 1x = x for all x ∈ F;
(10) (Inverses under ×) for all non-zero x ∈ F there exists x⁻¹ ∈ F such that xx⁻¹ = 1;
(11) (Distributive law) x(y + z) = xy + xz for all x, y, z ∈ F.

Example 8: The set of real numbers, ℝ, with the usual operations of addition and multiplication, is a field and so is the set of complex numbers, ℂ, and the set of rational numbers, ℚ.

Example 9: The set of integers, ℤ, is not a field because axiom 10 fails. The other ten axioms hold, but not every non-zero integer has an inverse under multiplication which is itself an integer. It’s true that the inverse of the integer 2 does exist in ℚ. But we need the inverse to be in the set of integers and this is not the case.

Example 10: The smallest field has just two elements, the words ‘Odd’ and ‘Even’. The operations of addition and multiplication are given by the following tables:

+    | Even | Odd
Even | Even | Odd
Odd  | Odd  | Even

×    | Even | Odd
Even | Even | Even
Odd  | Even | Odd

The zero element (identity under addition) is Even and the identity element (under multiplication) is Odd. We often use the symbols 0 and 1 to represent ‘Even’ and ‘Odd’ respectively. This gives the system of integers modulo 2, denoted by ℤ₂. The addition and multiplication follow normal arithmetic, except that in this system we have 1 + 1 = 0. An even smaller system would be one with just one element Ω where Ω + Ω = Ω and Ω.Ω = Ω but, in this case we’d have 0 = Ω = 1, contradicting axiom 9. The insistence on 0 ≠ 1 in that axiom may seem artificial, but there are good reasons for it.

§2.5. Linear Transformations If A is a square matrix the function f, defined by f(v) = Av, is called the corresponding linear transformation. (Here v and Av are column vectors.)

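Example 11: Suppose $A = \begin{pmatrix} 3 & 2 \\ 7 & -1 \end{pmatrix}$ and f is the corresponding linear transformation. If $v = \begin{pmatrix} 4 \\ 3 \end{pmatrix}$ then $f(v) = \begin{pmatrix} 3 & 2 \\ 7 & -1 \end{pmatrix} \begin{pmatrix} 4 \\ 3 \end{pmatrix} = \begin{pmatrix} 18 \\ 25 \end{pmatrix}$.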

Linear transformations get their name from the fact that, interpreted geometrically, they take straight lines to straight lines.

A typical point on the line joining the points u and v is w = (1 − t)u + tv. After applying the linear transformation w moves to Aw = A[(1 − t)u + tv] = (1 − t)Au + tAv. But this is a typical point on the line joining Au to Av. Hence straight lines transform to straight lines.

A good example of a linear transformation in 2-dimensional space is a rotation. Clearly this takes straight lines to straight lines. But before we can be sure that it’s a linear transformation we must find its corresponding matrix. Remember that we measure rotation angles as positive if they are anticlockwise and negative if they are clockwise.

Theorem 2: Rotation of vectors in ℝ² through an angle θ, about the origin, is a linear transformation corresponding to the matrix $\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$.

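Proof: Let $\begin{pmatrix} X \\ Y \end{pmatrix}$ be the vector obtained by rotating $\begin{pmatrix} x \\ y \end{pmatrix}$ through the angle θ about the origin. Let r be the length of $\begin{pmatrix} x \\ y \end{pmatrix}$ and let α be the angle from the positive x-axis to this vector, so that x = r cos α and y = r sin α. Then

X = r cos(α + θ) = r(cos α cos θ − sin α sin θ) = (r cos α)cos θ − (r sin α)sin θ = x cos θ − y sin θ

Y = r sin(α + θ) = r(sin α cos θ + cos α sin θ) = (r cos α)sin θ + (r sin α)cos θ = x sin θ + y cos θ

so $\begin{pmatrix} X \\ Y \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$.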

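Example 12: Find the point that results when (3, 4) is rotated about the origin through 30° (in a positive direction).

Solution: Here θ = 30° and so the matrix of the rotation is $R = \begin{pmatrix} \frac{\sqrt{3}}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{\sqrt{3}}{2} \end{pmatrix}$.

Let $v = \begin{pmatrix} 3 \\ 4 \end{pmatrix}$. Then v rotates to $Rv = \begin{pmatrix} \frac{\sqrt{3}}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{\sqrt{3}}{2} \end{pmatrix} \begin{pmatrix} 3 \\ 4 \end{pmatrix} = \begin{pmatrix} \frac{3\sqrt{3}}{2} - 2 \\ \frac{3}{2} + 2\sqrt{3} \end{pmatrix} \approx \begin{pmatrix} 0.598 \\ 4.964 \end{pmatrix}$.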
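The rotation formula of Theorem 2 is easy to try out numerically. The following Python sketch assumes angles are supplied in degrees and uses only the standard `math` module; the function name `rotate` is an illustrative choice.

```python
import math


def rotate(point, degrees):
    """Rotate (x, y) anticlockwise about the origin through the given angle."""
    theta = math.radians(degrees)
    x, y = point
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))


X, Y = rotate((3, 4), 30)
print(round(X, 3), round(Y, 3))  # 0.598 4.964
```

A negative angle gives a clockwise rotation, so `rotate(rotate(p, 30), -30)` returns the original point up to floating-point rounding.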

§2.6. 2 × 2 Determinants Some 2 × 2 matrices have an inverse under multiplication. Others don’t. We’d like a simple criterion for determining which is which. A 2 × 2 matrix is invertible if it has an inverse under multiplication. Otherwise it’s called non-invertible. An old fashioned term for “non-invertible” is “singular”. This word has largely dropped out of everyday language but those who read 19th century literature, especially the Sherlock Holmes stories, will know that it means ‘strange’ or ‘out of the ordinary’. This reflects the fact that singular matrices are much rarer than non-singular (or invertible) ones. Associated with every 2 × 2 matrix is a number, called its determinant. We’ll see that matrices with non-zero determinant are invertible (non-singular) and those with zero determinant are non-invertible (singular). The determinant of a matrix is a single number that can be computed from the components of the matrix.

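If $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ we define the determinant of A to be |A| = ad − bc and we define the adjoint of A to be $\mathrm{adj}(A) = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$. Remember the pattern of the adjoint is “swap the diagonal components and change the sign of the others”. [These definitions are just for 2 × 2 matrices. When we move to larger matrices the definitions of determinant and adjoint become much more complicated.]

Theorem 3: A.adj(A) = adj(A).A = |A|I.

Proof: $A.\mathrm{adj}(A) = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \begin{pmatrix} ad - bc & 0 \\ 0 & ad - bc \end{pmatrix} = |A|I$. The product adj(A).A is calculated in the same way and gives the same result.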

So the determinant determines whether or not the matrix has an inverse. The following theorem is true for square matrices of all sizes, but here we just prove it in the 2 × 2 case.

Theorem 4: If |A| ≠ 0 then A⁻¹ exists and is $\frac{1}{|A|}\,\mathrm{adj}(A)$. If |A| = 0 then A⁻¹ does not exist.

Proof: The first case follows directly from Theorem 3. If |A| = 0 then A.adj(A) = 0. Suppose |A| = 0 and A⁻¹ exists. Then adj(A) = (A⁻¹A)adj(A) = A⁻¹(A.adj(A)) = A⁻¹(|A|I) = A⁻¹0 = 0.

But then $\begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ and so a = b = c = d = 0, in which case A is the zero matrix. But the zero matrix clearly doesn’t have an inverse.

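Example 13: Suppose $A = \begin{pmatrix} 3 & -5 \\ 2 & 7 \end{pmatrix}$. Find |A|, adj(A) and A⁻¹.

Solution: |A| = 21 + 10 = 31.

$\mathrm{adj}(A) = \begin{pmatrix} 7 & 5 \\ -2 & 3 \end{pmatrix}$

$A^{-1} = \frac{1}{31} \begin{pmatrix} 7 & 5 \\ -2 & 3 \end{pmatrix}$.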
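The recipe “determinant, adjoint, divide” can be written out as a short Python sketch. It assumes the matrix has integer components (so that exact fractions can be used) and is stored as a list of rows; the function names are illustrative.

```python
from fractions import Fraction


def det(A):
    """Determinant ad - bc of a 2 x 2 matrix."""
    (a, b), (c, d) = A
    return a * d - b * c


def adj(A):
    """Adjoint: swap the diagonal components, change the sign of the others."""
    (a, b), (c, d) = A
    return [[d, -b], [-c, a]]


def inverse(A):
    """Inverse as (1/|A|) adj(A); raises if the determinant is zero."""
    determinant = det(A)
    if determinant == 0:
        raise ValueError("matrix is singular, so it has no inverse")
    return [[Fraction(x, determinant) for x in row] for row in adj(A)]


A = [[3, -5], [2, 7]]
print(det(A))  # 31
print(adj(A))  # [[7, 5], [-2, 3]]
print(inverse(A) == [[Fraction(7, 31), Fraction(5, 31)],
                     [Fraction(-2, 31), Fraction(3, 31)]])  # True
```

Using `Fraction` rather than floating-point division keeps the entries exact, which matches the hand calculation in Example 13.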

We can get a geometric interpretation for the determinant as follows. Write the 2 × 2 matrix A in terms of its columns as A = (u, v). So if $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ then $u = \begin{pmatrix} a \\ c \end{pmatrix}$ and $v = \begin{pmatrix} b \\ d \end{pmatrix}$.

Theorem 5: The area of the parallelogram with vertices 0, u, v and u + v is the determinant of the matrix A = (u, v).

Proof: Let $u = \begin{pmatrix} a \\ c \end{pmatrix}$ and $v = \begin{pmatrix} b \\ d \end{pmatrix}$. Let P be the area of the parallelogram. The area of the whole rectangle is equal to the sum of the areas of the parts into which it is divided. ∴ (a + b)(c + d) = 2bc + ac + bd + P. ∴ P = ad − bc = |A|.

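[Figure: a rectangle with sides a + b and c + d, divided into the parallelogram with vertices 0, u, v and u + v (area P), two rectangles of area bc, two triangles of area ½ac and two triangles of area ½bd.]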

If the vectors are interchanged, so that it requires a negative rotation through some angle less than 180° for the line through u to become the line through v, the matrix (u, v) becomes $\begin{pmatrix} b & a \\ d & c \end{pmatrix}$ and the determinant becomes −(ad − bc), that is, minus the area of the parallelogram. So the determinant of a matrix (u, v) is the signed area of the parallelogram formed from u, v. The sign is the sign of the rotation required to swing u round to the line through v.

A more useful geometric interpretation of the determinant of a 2 × 2 matrix is that it measures the magnification of areas that results from applying a linear transformation.

The concept of the area of an arbitrary subset of the plane is more complicated than at first it seems. We might expect to divide the subset up into small squares and to take a limit as the size of the square approaches zero. This works well in many cases but if, for example, we took the set of points with rational coordinates then it would be impossible to define its area in this way. In the following theorem the definition of a ‘closed region’ is a subset whose area can be defined. Theorem 6: Suppose A is a 2 × 2 real matrix and S is a closed region of the plane. Suppose T is the image of S under the linear transformation v → Av. Then the area of T is the absolute value of |A| times the area of S.

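Proof: Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Since the area of S can be approximated as closely as we like by squares with sides parallel to the axes it is sufficient to take S to be one such square, with vertices $\begin{pmatrix} x \\ y \end{pmatrix}$, $\begin{pmatrix} x + k \\ y \end{pmatrix}$, $\begin{pmatrix} x \\ y + k \end{pmatrix}$, $\begin{pmatrix} x + k \\ y + k \end{pmatrix}$.

The square has area k². Translating the image to the origin by subtracting $A\begin{pmatrix} x \\ y \end{pmatrix}$ from each vector we see that this image is a parallelogram with vertices:

$\begin{pmatrix} 0 \\ 0 \end{pmatrix}$, $A\begin{pmatrix} x \\ y + k \end{pmatrix} - A\begin{pmatrix} x \\ y \end{pmatrix}$, $A\begin{pmatrix} x + k \\ y + k \end{pmatrix} - A\begin{pmatrix} x \\ y \end{pmatrix}$ and $A\begin{pmatrix} x + k \\ y \end{pmatrix} - A\begin{pmatrix} x \\ y \end{pmatrix}$,

that is, with vertices $\begin{pmatrix} 0 \\ 0 \end{pmatrix}$, $A\begin{pmatrix} 0 \\ k \end{pmatrix}$, $A\begin{pmatrix} k \\ k \end{pmatrix}$ and $A\begin{pmatrix} k \\ 0 \end{pmatrix}$.

[The fact that $A\begin{pmatrix} k \\ k \end{pmatrix} = A\begin{pmatrix} 0 \\ k \end{pmatrix} + A\begin{pmatrix} k \\ 0 \end{pmatrix}$ shows that it is a parallelogram.]

These four vertices are $\begin{pmatrix} 0 \\ 0 \end{pmatrix}$, $k\begin{pmatrix} b \\ d \end{pmatrix}$, $k\begin{pmatrix} a + b \\ c + d \end{pmatrix}$ and $k\begin{pmatrix} a \\ c \end{pmatrix}$.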

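[Figure: the square with vertices (x, y), (x + k, y), (x, y + k) and (x + k, y + k), and its image under A, the parallelogram whose vertices are the images of these four points.]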

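The area of this parallelogram is clearly k² times that of the parallelogram with vertices $\begin{pmatrix} 0 \\ 0 \end{pmatrix}$, $\begin{pmatrix} b \\ d \end{pmatrix}$, $\begin{pmatrix} a + b \\ c + d \end{pmatrix}$ and $\begin{pmatrix} a \\ c \end{pmatrix}$. Hence it is k² times the absolute value of |A|.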

Theorem 7: |AB| = |A|.|B| Proof: For 2 × 2 matrices we could prove this algebraically. However the following geometric proof is more intuitive. Also it can be extended to 3 × 3 matrices where volumes replace areas.

The linear transformation v → Bv multiplies areas by the absolute value of |B|. The linear transformation v → Av multiplies areas by the absolute value of |A|. Hence the linear transformation v → (AB)v = A(Bv) multiplies areas first by the absolute value of |B| and then by the absolute value of |A|, that is it multiplies areas by the absolute value of |A|.|B|. Hence |AB| = ± |A|.|B|. A more careful analysis will show that the sign is +.

We now give a geometric argument for the fact that A is invertible if and only if |A| ≠ 0. If |A| ≠ 0 the linear transformation v → Av multiplies areas by the absolute value of |A|. It seems plausible that it has an inverse that divides areas by the absolute value of |A|.

Certainly, if |A| = 0 the unit square gets squashed down to a parallelogram with zero area, that is, a line segment. Clearly no linear transformation can “unsquash” the line segment. Once the area is zero no amount of multiplication can restore it to something non-zero.

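Example 14: Prove that $A = \begin{pmatrix} 1 & -3 \\ 2 & 4 \end{pmatrix}$ is invertible and find its inverse.

Solution: |A| = 4 + 6 = 10 ≠ 0 so A is invertible.

$A^{-1} = \frac{1}{10} \begin{pmatrix} 4 & 3 \\ -2 & 1 \end{pmatrix} = \begin{pmatrix} \frac{2}{5} & \frac{3}{10} \\ -\frac{1}{5} & \frac{1}{10} \end{pmatrix}$.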

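Theorem 8: If |A| = 0 there exists a non-zero vector v such that Av = 0.

Proof: Suppose |A| = 0. If A = 0 then any non-zero vector v will do. Suppose that A ≠ 0. Then at least one of a, b, c, d is non-zero.

Now $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} -b \\ a \end{pmatrix} = \begin{pmatrix} 0 \\ ad - bc \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} d \\ -c \end{pmatrix} = \begin{pmatrix} ad - bc \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$.

At least one of $\begin{pmatrix} d \\ -c \end{pmatrix}$ and $\begin{pmatrix} -b \\ a \end{pmatrix}$ is a suitable non-zero solution.

Note that these are the columns of adj(A).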

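Example 15: Find a non-zero vector $\begin{pmatrix} x \\ y \end{pmatrix}$ such that $\begin{pmatrix} 24 & 16 \\ 15 & 10 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$.

Solution: $\mathrm{adj}\begin{pmatrix} 24 & 16 \\ 15 & 10 \end{pmatrix} = \begin{pmatrix} 10 & -16 \\ -15 & 24 \end{pmatrix}$ so $\begin{pmatrix} 10 \\ -15 \end{pmatrix}$ is a solution.

Note that any non-zero multiple of $\begin{pmatrix} 10 \\ -15 \end{pmatrix}$, such as $\begin{pmatrix} 2 \\ -3 \end{pmatrix}$, will also be a solution.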
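The proof of Theorem 8 is constructive, so it can be turned into a small procedure: take the first non-zero column of adj(A). The Python sketch below does this under the assumption that A is a singular, non-zero 2 × 2 matrix stored as a list of rows; `null_vector` is a hypothetical name chosen for illustration.

```python
def null_vector(A):
    """Return a non-zero v with Av = 0 for a singular, non-zero 2 x 2 matrix A."""
    (a, b), (c, d) = A
    if a * d - b * c != 0:
        raise ValueError("determinant is non-zero, so only v = 0 satisfies Av = 0")
    for v in ((d, -c), (-b, a)):  # the two columns of adj(A)
        if v != (0, 0):
            return v
    raise ValueError("A is the zero matrix; any non-zero vector works")


A = [[24, 16], [15, 10]]
x, y = null_vector(A)
print((x, y))                                                # (10, -15)
print(A[0][0] * x + A[0][1] * y, A[1][0] * x + A[1][1] * y)  # 0 0
```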


§2.7. Orthogonal Matrices The transpose of a 2 × 2 matrix A is the 2 × 2 matrix obtained from A by writing the rows as columns. It is denoted by Aᵀ.

For the 2 × 2 matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ the transpose of A is $A^T = \begin{pmatrix} a & c \\ b & d \end{pmatrix}$.

Theorem 9:

(1) (Aᵀ)ᵀ = A; (2) (A + B)ᵀ = Aᵀ + Bᵀ; (3) (AB)ᵀ = BᵀAᵀ; (4) |Aᵀ| = |A|.

Proof: Though these four properties hold for matrices of all sizes (provided these sizes are appropriate), at this stage we only verify them for 2 × 2 matrices. The verification in this case is routine and we omit the details.

The matrix A is symmetric if Aᵀ = A. It is orthogonal if Aᵀ = A⁻¹. In the following theorem we write matrices in terms of their rows or columns.

So if $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ we can write A as (c1, c2) where $c_1 = \begin{pmatrix} a \\ c \end{pmatrix}$ and $c_2 = \begin{pmatrix} b \\ d \end{pmatrix}$ are the columns.

Or we can write $A = \begin{pmatrix} r_1^T \\ r_2^T \end{pmatrix}$ where $r_1^T = (a, b)$ and $r_2^T = (c, d)$ are the rows. This means that $r_1 = \begin{pmatrix} a \\ b \end{pmatrix}$ and $r_2 = \begin{pmatrix} c \\ d \end{pmatrix}$.

Remember that we usually consider the natural form for a vector to be the column version, so that we can write Av instead of Avᵀ when we interact with matrices. But this means that row vectors will have to be written in the form vᵀ. Note that u.v can be written as uᵀv.

Theorem 10: The columns of an orthogonal matrix are orthogonal vectors of unit length.

Proof: Let A = (a, b) where the columns of A are the two vectors a, b.

Then $A^T = \begin{pmatrix} a^T \\ b^T \end{pmatrix}$ and $A^T A = \begin{pmatrix} a^T \\ b^T \end{pmatrix} (a, b) = \begin{pmatrix} a^T a & a^T b \\ b^T a & b^T b \end{pmatrix} = \begin{pmatrix} a \cdot a & a \cdot b \\ b \cdot a & b \cdot b \end{pmatrix} = \begin{pmatrix} |a|^2 & a \cdot b \\ b \cdot a & |b|^2 \end{pmatrix}$.

If A is orthogonal then AᵀA = I in which case |a| = |b| = 1 and a.b = 0.

Theorem 11: If A is an orthogonal matrix then |A| = ±1.

Proof: Suppose A is orthogonal. Then AAᵀ = I. Hence |AAᵀ| = 1 and so |A|.|Aᵀ| = 1. But |Aᵀ| = |A| so |A|² = 1. It follows that |A| = ±1.

Theorem 12: Orthogonal linear transformations (linear transformations whose matrices are orthogonal) preserve lengths and areas.

Proof: Let v → Av be an orthogonal linear transformation. So A⁻¹ = Aᵀ. Let u, v be any vectors. Then Au.Av = (Au)ᵀAv = uᵀAᵀAv = uᵀv = u.v. Hence orthogonal linear transformations preserve inner products. In particular, if u = v this shows that |Av|² = |v|² so |Av| = |v|. Hence orthogonal linear transformations preserve lengths. Since angles between vectors can be expressed in terms of inner products and lengths, orthogonal linear transformations preserve angles, and hence areas.

The converse is true. Any linear transformation that preserves lengths and angles must be orthogonal. For if $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ then $A\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} a \\ c \end{pmatrix}$ and $A\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} b \\ d \end{pmatrix}$.

Now $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ are orthogonal unit vectors. If v → Av preserves inner products and lengths then the columns of A must be orthogonal unit vectors and so A must be an orthogonal matrix.

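Theorem 13: A 2 × 2 orthogonal matrix with determinant 1 has the form $\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$ and with determinant −1 it has the form $\begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}$.

Proof: Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ be orthogonal. Then |A| = ±1 and Aᵀ = A⁻¹.

Case I: |A| = 1: Then $A^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = A^T = \begin{pmatrix} a & c \\ b & d \end{pmatrix}$. Hence a = d and b = −c. Thus $A = \begin{pmatrix} a & -c \\ c & a \end{pmatrix}$.

Now $\begin{pmatrix} a \\ c \end{pmatrix}$ is a unit vector so a² + c² = 1. Hence for some θ, a = cos θ and c = sin θ, and so $A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$.

Case II: |A| = −1: Then $A^{-1} = \begin{pmatrix} -d & b \\ c & -a \end{pmatrix} = A^T = \begin{pmatrix} a & c \\ b & d \end{pmatrix}$. Hence a = −d and b = c. Thus $A = \begin{pmatrix} a & c \\ c & -a \end{pmatrix}$.

Now $\begin{pmatrix} a \\ c \end{pmatrix}$ is a unit vector so a² + c² = 1. Hence for some θ, a = cos θ and c = sin θ, and so $A = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}$.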

We know from theorem 2 that an orthogonal matrix with determinant 1 represents a rotation about the origin. In a similar way it can be shown that the orthogonal matrix $\begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}$ represents a reflection in the line y = x tan(θ/2). (This is left as an exercise.)

§2.8. Eigenvalues and Eigenvectors Example 16: If $A = \begin{pmatrix} 2 & 3 \\ 4 & 1 \end{pmatrix}$, find the values of λ for which there exist non-zero solutions to the equation Av = λv.

Solution: We rewrite the equation as λv − Av = 0, that is $\begin{pmatrix} \lambda - 2 & -3 \\ -4 & \lambda - 1 \end{pmatrix} v = 0$.

Such a non-zero solution will exist if and only if $\begin{vmatrix} \lambda - 2 & -3 \\ -4 & \lambda - 1 \end{vmatrix} = 0$.

This gives the quadratic equation (λ − 2)(λ − 1) − 12 = 0, that is, λ² − 3λ − 10 = 0. Factorising we get (λ + 2)(λ − 5) = 0 and so there are two suitable values of λ, namely λ = −2 or 5.

The equation Av = λv is an important one in matrix theory. Of course v = 0 is a solution no matter what λ is. We’re only interested in non-zero vectors v. Suppose A is a 2 × 2 matrix. If for some scalar λ and some v ≠ 0 we have Av = λv we say that λ is an eigenvalue (or ‘latent root’) of A and that v is a corresponding eigenvector (or ‘characteristic vector’).

The prefix ‘eigen’ comes from a German word meaning ‘characteristic’. These scalars and vectors are characteristics of the matrix. The old-fashioned term ‘latent root’ comes from the fact that these scalars are hidden – a certain amount of work is required to reveal them. Note that any non-zero scalar multiple of an eigenvector is also an eigenvector, for the same eigenvalue.

If λ is an eigenvalue for A and v is a corresponding eigenvector then (λI − A)v = 0. Since v is non-zero we must have |λI − A| = 0. Conversely, if |λI − A| = 0 the matrix λI − A is non-invertible. Hence there will be a non-zero vector v such that (λI − A)v = 0.

The expression |λI − A| is a polynomial in λ, and if A is n × n then |λI − A| has degree n. We call this polynomial the characteristic polynomial of A. Here we use the English word rather than the German ‘eigen’.
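For a 2 × 2 matrix with rows (a, b) and (c, d), expanding |λI − A| = (λ − a)(λ − d) − bc gives λ² − (a + d)λ + (ad − bc), so the eigenvalues are the roots of a quadratic. The Python sketch below solves that quadratic with the usual formula; it returns only real eigenvalues, and the function name `eigenvalues_2x2` is an illustrative choice rather than anything defined in the text.

```python
import math


def eigenvalues_2x2(A):
    """Real roots of the characteristic polynomial of a 2 x 2 matrix.

    |λI - A| = λ^2 - (a + d)λ + (ad - bc); returns the roots in increasing
    order, or an empty list when the discriminant is negative.
    """
    (a, b), (c, d) = A
    trace = a + d
    determinant = a * d - b * c
    discriminant = trace * trace - 4 * determinant
    if discriminant < 0:
        return []
    root = math.sqrt(discriminant)
    return sorted({(trace - root) / 2, (trace + root) / 2})


# The matrix of Example 16.
print(eigenvalues_2x2([[2, 3], [4, 1]]))  # [-2.0, 5.0]
```

The rotation matrix for 90°, with rows (0, −1) and (1, 0), has characteristic polynomial λ² + 1, so the function returns an empty list for it: no real eigenvalues.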

Example 17: Find the eigenvalues of $A = \begin{pmatrix} 2 & 3 \\ 5 & 4 \end{pmatrix}$.

Solution: $|\lambda I - A| = \begin{vmatrix} \lambda - 2 & -3 \\ -5 & \lambda - 4 \end{vmatrix}$ = (λ − 2)(λ − 4) − 15 = λ² − 6λ − 7 = (λ + 1)(λ − 7) so the eigenvalues are −1 and 7.

We have seen that matrices are a very useful tool in geometry. But there are many non-geometric applications as well. A very famous application is the following.

§2.9. An Important Field of Matrices The system of all 2 × 2 matrices with real components does not, as we have seen, form a field. However a certain sub-system does.

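Consider the set C of all 2 × 2 real matrices of the form $\begin{pmatrix} x & -y \\ y & x \end{pmatrix}$. Going through the eleven field axioms we can check that all are satisfied. For example, although matrices in general don’t commute (that is, AB is usually not equal to BA) matrices of this special type do commute.

$\begin{pmatrix} a & -b \\ b & a \end{pmatrix} \begin{pmatrix} c & -d \\ d & c \end{pmatrix} = \begin{pmatrix} ac - bd & -(ad + bc) \\ ad + bc & ac - bd \end{pmatrix} = \begin{pmatrix} c & -d \\ d & c \end{pmatrix} \begin{pmatrix} a & -b \\ b & a \end{pmatrix}$.

This verifies closure under multiplication as well as the commutative law for multiplication. Moreover, non-zero 2 × 2 matrices don’t always have inverses under multiplication. These special ones do, except for the zero matrix, since $\begin{vmatrix} a & -b \\ b & a \end{vmatrix} = a^2 + b^2 > 0$. Moreover the inverse of one of these special matrices also has this form.

$\begin{pmatrix} a & -b \\ b & a \end{pmatrix}^{-1} = \frac{1}{a^2 + b^2} \begin{pmatrix} a & b \\ -b & a \end{pmatrix} = \begin{pmatrix} \alpha & -\beta \\ \beta & \alpha \end{pmatrix}$ where $\alpha = \frac{a}{a^2 + b^2}$ and $\beta = \frac{-b}{a^2 + b^2}$.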

Consider the matrices of the form

x 0

0 x . Such a matrix can be written as a scalar multiple of the identity matrix: xI. There’s one of these for every real number x and these add and multiply exactly like the real numbers themselves.

xI + yI = (x + y)I and xI.yI = xyI. So, inside the system C we have a working model of the real numbers. But there are additional matrices in C.

A very special matrix in C is $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$. When you square this matrix you get $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} = -I$.

If you think of −I as representing the real number −1 we’ve found a square root of −1. You know that −1 does not have any real square roots. The square of a real number can never be negative. But if we enlarge our system of numbers we are able to have square roots of −1. We can identify the scalar matrix xI with the real number x itself and if we denote this special matrix $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ by i we then get $i^2 = -1$.
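As a quick numerical check, the sketch below squares this special matrix and also compares matrix multiplication in C with the arithmetic of Python’s built-in `complex` type, which implements numbers of the form x + iy with i² = −1. The function names `as_matrix` and `matmul` are assumptions made for the illustration.

```python
def as_matrix(z):
    """Send the complex number x + iy to the matrix [[x, -y], [y, x]]."""
    return ((z.real, -z.imag), (z.imag, z.real))

def matmul(P, Q):
    """Product of two 2 x 2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

i_matrix = as_matrix(1j)                 # the special matrix [[0, -1], [1, 0]]
minus_identity = ((-1, 0), (0, -1))
print(matmul(i_matrix, i_matrix) == minus_identity)   # squaring gives -I

# Multiplying two matrices of the special form agrees with multiplying
# the corresponding complex numbers.
z, w = complex(2, 3), complex(-1, 4)
print(matmul(as_matrix(z), as_matrix(w)) == as_matrix(z * w))
```

The sample values are small integers, so the floating-point comparisons here are exact; for general real entries a tolerance would be needed.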

Now a typical element of C has the form $\begin{pmatrix} x & -y \\ y & x \end{pmatrix}$. This can be decomposed as

$$\begin{pmatrix} x & -y \\ y & x \end{pmatrix} = \begin{pmatrix} x & 0 \\ 0 & x \end{pmatrix} + \begin{pmatrix} 0 & -y \\ y & 0 \end{pmatrix} = \begin{pmatrix} x & 0 \\ 0 & x \end{pmatrix} + \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} y & 0 \\ 0 & y \end{pmatrix} = xI + i(yI).$$

If we write xI and yI as just ‘x’ and ‘y’ (after all these 2 × 2 scalar matrices behave like the real numbers x and y) then the typical element of C would be x + iy, where x and y are real numbers and $i^2 = -1$. This is one way of constructing the field of complex numbers. The number x + iy is called a complex number because it’s made up of two pieces. Although we can construct the complex numbers as 2 × 2 matrices we certainly don’t think of them as matrices in normal use. Rather we regard them as numbers, just as we regard the familiar real numbers. We’ve extended the system of real numbers to include new numbers, to get a much richer arithmetic system. It’s a system in which there are square roots of −1.

§2.10. Application To Fibonacci Numbers
There’s a famous sequence called the Fibonacci sequence: 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, … It follows a very simple pattern. After the initial two 1’s, each term is the sum of the previous two. If you look up Fibonacci on the web you’ll find that this sequence seems to pervade nature, particularly biology and botany where counting various features of plants gives rise to Fibonacci numbers. One important form of this is the Fibonacci Spiral. This spiral can be found in Nature in many places. It has also been reflected in art and architecture. Here are a few manifestations of it.

[Figure: manifestations of the Fibonacci spiral in nature, art and architecture.]

Leonardo of Pisa, whose father was called Bonaccio, became known as Fibonacci (son of Bonaccio) and in 1202 he published a book in which he presented the sequence as a solution to a population problem involving rabbits. Suppose you have a newly-born pair of rabbits, one male, one female, and put them in a field. Rabbits are able to mate at the age of one month so that at the end of its second month a female can produce another pair of rabbits. Suppose that our rabbits never die and that the female always produces one new pair (one male, one female) every month from the second month on. The puzzle that Fibonacci posed was: how many pairs will there be in one year?

Let $F_1 = F_2 = 1$ and define $F_{n+2} = F_{n+1} + F_n$ for all n ≥ 1. If we want to compute the value of $F_{1000}$ it would appear that we’d have to climb up through all the Fibonacci numbers in between. Is there an explicit formula whereby we can go directly to $F_{1000}$ by substituting n = 1000? There is, but it’s a rather surprising formula, one that you would be very unlikely to guess. The first step is to introduce a sequence of 2-dimensional vectors whose components are a pair of successive Fibonacci numbers.

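For each n ≥ 1 define $v_n = \begin{pmatrix} F_{n+1} \\ F_n \end{pmatrix}$. If we can find an explicit formula for $v_n$ we certainly have one for $F_n$. Define $F_0 = 0$, so that the rule $F_2 = F_1 + F_0$ still holds. Then $v_0 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$.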

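Now notice that

$$\begin{pmatrix} F_{n+2} \\ F_{n+1} \end{pmatrix} = \begin{pmatrix} F_{n+1} + F_n \\ F_{n+1} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} F_{n+1} \\ F_n \end{pmatrix}.$$

Denoting the matrix $\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$ by A we can write this as $v_{n+1} = Av_n$.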
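The recurrence for the vectors can be tried out directly. The sketch below repeatedly multiplies by A, starting from the vector with components 1 and 0, and reads the Fibonacci numbers off the second components. The names `apply` and `fibonacci_vectors` are hypothetical helpers for this illustration.

```python
A = ((1, 1), (1, 0))

def apply(M, v):
    """Multiply the 2 x 2 matrix M by the column vector v = (v[0], v[1])."""
    return (M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1])

def fibonacci_vectors(count):
    """Return the first `count` vectors, starting from (1, 0) and applying A each time."""
    v = (1, 0)
    vectors = []
    for _ in range(count):
        vectors.append(v)
        v = apply(A, v)
    return vectors

vectors = fibonacci_vectors(11)
# Skipping the starting vector, the second components are F_1, F_2, ..., F_10.
print([v[1] for v in vectors[1:]])
```

This is still the step-by-step climb through the sequence; the point of what follows is to replace the repeated multiplication by an explicit formula.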

Since $v_0 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, we have $v_1 = Av_0$, $v_2 = A^2v_0$, … and $v_n = A^nv_0$. All we need is an explicit expression for $A^n$. What about raising each of the components of A to the n’th power? That’s rather a silly suggestion because matrix multiplication doesn’t work like that. But it may not be as silly as it looks. If A is a diagonal matrix, of the form $\begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix}$, then $A^n$ is indeed $\begin{pmatrix} a^n & 0 \\ 0 & d^n \end{pmatrix}$. Of course A isn’t diagonal, so is this any use? Well, yes it is. Because if we can find an invertible matrix S and a diagonal matrix D for which $A = SDS^{-1}$ then

$$A^n = (SDS^{-1})(SDS^{-1}) \cdots (SDS^{-1}) = SD(S^{-1}S)D(S^{-1}S) \cdots (S^{-1}S)DS^{-1} = SD^nS^{-1},$$

and this would give us what we want: a formula for $A^n$. So, we want to find an invertible matrix S and a diagonal matrix D such that $A = SDS^{-1}$ or equivalently AS = SD.

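Now if we write $D = \begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix}$ and if we write S in terms of its columns as S = (v, w) then we need

$$A(v, w) = (v, w)\begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix} = (\lambda v, \mu w).$$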

Equating corresponding columns this gives two similar equations: Av = λv and Aw = µw. We need both v and w to be non-zero vectors, otherwise we’d have a non-invertible S. So v and w need to be eigenvectors. Moreover for S to be invertible it is not enough for v and w to be different: they must not be scalar multiples of one another. They need to be independent in some sense. (Later on we’ll describe this independence precisely in a way that makes sense for square matrices of all sizes.)

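To begin with we need an eigenvalue λ and an eigenvector v for $A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$.

The characteristic polynomial of A is $|\lambda I - A| = \begin{vmatrix} \lambda - 1 & -1 \\ -1 & \lambda \end{vmatrix} = \lambda(\lambda - 1) - 1 = \lambda^2 - \lambda - 1$.

The zeros of this polynomial are $\frac{1 \pm \sqrt{5}}{2}$. So let $\lambda = \frac{1 + \sqrt{5}}{2}$ and $\mu = \frac{1 - \sqrt{5}}{2}$.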

We must now find corresponding eigenvectors v, w.

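Now $\lambda I - A = \begin{pmatrix} \lambda - 1 & -1 \\ -1 & \lambda \end{pmatrix}$. Suppose $v = \begin{pmatrix} a \\ c \end{pmatrix}$.

Then $(\lambda I - A)v = 0$ becomes $\begin{pmatrix} \lambda - 1 & -1 \\ -1 & \lambda \end{pmatrix}\begin{pmatrix} a \\ c \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$.

Hence (λ − 1)a − c = 0 and −a + λc = 0. Let c = 1. (Since any non-zero scalar multiple of an eigenvector is an eigenvector we are entitled to choose any convenient non-zero value for one of its components.) Then a = λc = λ. From the first equation we have (λ − 1)λ − 1 = 0, that is, $\lambda^2 - \lambda - 1 = 0$.

But this is automatically true for the λ we’ve chosen. So $\begin{pmatrix} \lambda \\ 1 \end{pmatrix}$ is an eigenvector for λ.

Similarly $\begin{pmatrix} \mu \\ 1 \end{pmatrix}$ is an eigenvector for µ.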

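This gives $S = \begin{pmatrix} \lambda & \mu \\ 1 & 1 \end{pmatrix}$.

$|S| = \lambda - \mu = \sqrt{5}$ and so $S^{-1} = \frac{1}{\sqrt{5}}\begin{pmatrix} 1 & -\mu \\ -1 & \lambda \end{pmatrix}$.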

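Hence

$$A^n = \frac{1}{\sqrt{5}}\begin{pmatrix} \lambda & \mu \\ 1 & 1 \end{pmatrix}\begin{pmatrix} \lambda^n & 0 \\ 0 & \mu^n \end{pmatrix}\begin{pmatrix} 1 & -\mu \\ -1 & \lambda \end{pmatrix} = \frac{1}{\sqrt{5}}\begin{pmatrix} \lambda & \mu \\ 1 & 1 \end{pmatrix}\begin{pmatrix} \lambda^n & -\mu\lambda^n \\ -\mu^n & \lambda\mu^n \end{pmatrix} = \frac{1}{\sqrt{5}}\begin{pmatrix} \lambda^{n+1} - \mu^{n+1} & -\mu\lambda^{n+1} + \lambda\mu^{n+1} \\ \lambda^n - \mu^n & -\mu\lambda^n + \lambda\mu^n \end{pmatrix}.$$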
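The diagonalization can be sanity-checked numerically by comparing the product of S, the n’th power of D and the inverse of S against plain repeated multiplication. The sketch below uses floating-point arithmetic, so the two results agree only up to rounding. The function names are assumptions made for the illustration.

```python
import math

SQRT5 = math.sqrt(5)
LAM = (1 + SQRT5) / 2      # the eigenvalue called λ in the text
MU = (1 - SQRT5) / 2       # the eigenvalue called µ in the text
A = ((1, 1), (1, 0))

def matmul(P, Q):
    """Product of two 2 x 2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def power_by_repeated_multiplication(n):
    """A^n computed exactly with integers, one multiplication at a time."""
    result = ((1, 0), (0, 1))
    for _ in range(n):
        result = matmul(result, A)
    return result

def power_by_diagonalization(n):
    """A^n computed as S D^n S^{-1} in floating point."""
    S = ((LAM, MU), (1.0, 1.0))
    D_n = ((LAM ** n, 0.0), (0.0, MU ** n))
    S_inv = ((1 / SQRT5, -MU / SQRT5), (-1 / SQRT5, LAM / SQRT5))
    return matmul(matmul(S, D_n), S_inv)

exact = power_by_repeated_multiplication(10)
approx = power_by_diagonalization(10)
print(exact)
print(tuple(tuple(round(x) for x in row) for row in approx))   # rounded to integers
```

The advantage of the second route is that only the two numbers on the diagonal of D have to be raised to the n’th power.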

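Thus

$$v_n = A^n\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{\sqrt{5}}\begin{pmatrix} \lambda^{n+1} - \mu^{n+1} & -\mu\lambda^{n+1} + \lambda\mu^{n+1} \\ \lambda^n - \mu^n & -\mu\lambda^n + \lambda\mu^n \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{\sqrt{5}}\begin{pmatrix} \lambda^{n+1} - \mu^{n+1} \\ \lambda^n - \mu^n \end{pmatrix}.$$

So we have

$$F_n = \frac{1}{\sqrt{5}}\left(\lambda^n - \mu^n\right) = \frac{1}{\sqrt{5}}\left[\left(\frac{1 + \sqrt{5}}{2}\right)^n - \left(\frac{1 - \sqrt{5}}{2}\right)^n\right].$$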
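A short experiment makes the formula easier to believe. The sketch below evaluates it in floating point, rounds to the nearest integer and compares with the defining recurrence. The names `fib_formula` and `fib_recurrence` are hypothetical, and the range of n is kept small because floating-point rounding eventually spoils the formula for large n.

```python
import math

def fib_formula(n):
    """F_n from the explicit formula, evaluated in floating point and rounded."""
    s = math.sqrt(5)
    lam, mu = (1 + s) / 2, (1 - s) / 2
    return round((lam ** n - mu ** n) / s)

def fib_recurrence(n):
    """F_n from F_1 = F_2 = 1 and F_{n+2} = F_{n+1} + F_n (n >= 1)."""
    a, b = 1, 1                      # a = F_1, b = F_2
    for _ in range(n - 1):
        a, b = b, a + b              # slide the pair one step along the sequence
    return a

# Compare the two methods for the first forty Fibonacci numbers.
print(all(fib_formula(n) == fib_recurrence(n) for n in range(1, 41)))
```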

At last we have an explicit formula for the terms of the Fibonacci sequence. But surely this can’t be correct! The Fibonacci numbers are all positive integers. None of them involve √5. Yet the amazing thing is that this formula produces positive integers for every n. The √5’s cancel every time. You’ll agree that, if you examined the Fibonacci sequence you’d never have guessed such a complicated formula. Yet, using matrices and their eigenvalues, we were able to obtain it. The Fibonacci sequence is just one example of what is called a recurrence sequence. The appropriate tool for solving such recurrence sequences is the technique of diagonalization that we’ve illustrated here.
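One way to watch the cancellation happen is to do the arithmetic exactly, never replacing √5 by a decimal. In the sketch below a number a + b√5 with rational a and b is stored as the pair (a, b), an encoding chosen for this illustration rather than anything used in the text. The names `mul`, `power` and `fib_exact` are likewise hypothetical.

```python
from fractions import Fraction

def mul(p, q):
    """Multiply p = (a, b) and q = (c, d), each standing for a + b√5."""
    a, b = p
    c, d = q
    return (a * c + 5 * b * d, a * d + b * c)     # uses (√5)² = 5

def power(p, n):
    """The n'th power of p (n >= 0) by repeated multiplication."""
    result = (Fraction(1), Fraction(0))
    for _ in range(n):
        result = mul(result, p)
    return result

HALF = Fraction(1, 2)
LAM = (HALF, HALF)       # (1 + √5)/2
MU = (HALF, -HALF)       # (1 − √5)/2

def fib_exact(n):
    """F_n = (λ^n − µ^n)/√5, computed with no rounding at all."""
    a1, b1 = power(LAM, n)
    a2, b2 = power(MU, n)
    rational_part, sqrt5_part = a1 - a2, b1 - b2
    assert rational_part == 0             # λ^n − µ^n is a pure multiple of √5
    assert sqrt5_part.denominator == 1    # and that multiple is a whole number
    return int(sqrt5_part)                # dividing by √5 leaves just the multiple

print([fib_exact(n) for n in range(1, 11)])
```

Because nothing is rounded, this version also works for large n such as n = 1000, at the cost of carrying large fractions.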


EXERCISES FOR CHAPTER 2

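Exercise 1: If $A = \begin{pmatrix} 2 & 3 \\ 3 & 5 \end{pmatrix}$, $B = \begin{pmatrix} -1 & 2 \\ 1 & 0 \end{pmatrix}$ and $C = \begin{pmatrix} 5 & -3 \\ -3 & 2 \end{pmatrix}$ evaluate

(i) A + B; (ii) 2A − B; (iii) AB; (iv) BA; (v) AC; (vi) CA; (vii) A(B + C); (viii) AB + AC; (ix) (A + B)(A − B); (x) $A^2 - B^2$.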

Exercise 2: Find the values of k for which the equation $\begin{pmatrix} k - 1 & -2 \\ -2 & k + 2 \end{pmatrix}v = 0$ has a non-zero solution, v. For each of these values find a corresponding non-zero solution.

Exercise 3: If $A = \begin{pmatrix} 6 & -9 \\ 3 & -5 \end{pmatrix}$ find (i) |A|; (ii) $A^{-1}$; (iii) values of λ for which |λI − A| = 0.

Exercise 4: Write the system of simultaneous equations

2x + 5y = 1
x − 4y = −1

in the form Av = b where $v = \begin{pmatrix} x \\ y \end{pmatrix}$. Find $A^{-1}$ and hence solve the system of equations.

Exercise 5: If $A = \begin{pmatrix} 3 & 1 \\ 5 & 2 \end{pmatrix}$ find $A^{-1}$ and use it to solve the following systems of equations:

(i) 3x + y = 7, 5x + 2y = 12;
(ii) 3a + b = 0, 5a + 2b = 0;
(iii) 3p + q = h, 5p + 2q = k.

Exercise 6: If $A = \begin{pmatrix} 3 & 4 \\ -1 & -2 \end{pmatrix}$ find

(i) the eigenvalues of A; (ii) an eigenvector for each eigenvalue of A.

Exercise 7: Find the eigenvalues and eigenvectors of the matrix $\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}$ in terms of a and b.

Exercise 8: Find the areas of:

(i) the parallelogram whose vertices are (0, 0), (3, 1), (5, 5) and (2, 4);
(ii) the parallelogram whose vertices are (1, 1), (4, 2), (3, 5) and (6, 6);
(iii) the triangle whose vertices are (1, 1), (4, 2) and (3, 5);
(iv) the quadrilateral whose vertices are (1, 1), (4, 2), (3, 5) and (5, 5).

Exercise 9: Sketch the polygon whose vertices are A(0, 0), B(3, −3), C(4, 2), D(1, 5) and E(−2, 3) and find its area.


Exercise 10: Find v if it is obtained by rotating (3, −1) about the origin through an angle of 150°.

Exercise 11: Show that the matrix for a reflection in the line y = tan(θ/2)x is $\begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}$.

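Exercise 12: If the quadratic equation $ax^2 + 2bx + c = 0$ has two solutions x, y and if $X = \begin{pmatrix} x & y \\ 1 & 1 \end{pmatrix}$ and $A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$ prove that $X^TAX = \begin{pmatrix} 0 & z \\ z & 0 \end{pmatrix}$ for some z. Find z in terms of a, b, c.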

Exercise 13: Prove that if $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and AX = XA for all 2 × 2 matrices X, then b = c = 0 and a = d.

Exercise 14: $u_0, u_1, \ldots$ is a sequence defined by $u_n = au_{n-1} + b$ for n ≥ 1 where a ≠ 1.

(i) If $S_n = \begin{pmatrix} u_{n+1} & 1 \\ u_n & 1 \end{pmatrix}$, $S = \begin{pmatrix} u_1 & 1 \\ u_0 & 1 \end{pmatrix}$ and $A = \begin{pmatrix} a & 0 \\ b & 1 \end{pmatrix}$ prove by induction that $S_n = SA^n$ for all n.
(ii) Hence show that $u_{n+1} - u_n = a^n(u_1 - u_0)$.
(iii) Use this to find $u_n$ in terms of $u_0$, $u_1$ and n.

Exercise 15: (a) Prove that a 2 × 2 matrix is non-invertible if and only if 0 is an eigenvalue.
(b) Prove that if H, K are 2 × 2 matrices such that H is invertible then there are at most two values of λ for which H + λK is non-invertible.
(c) Show that if H is non-invertible then there can be infinitely many such λ.

Exercise 16: (i) Prove that if A, B are orthogonal matrices such that A + B is also orthogonal then $AB^T + BA^T = -I$.
(ii) Show that if A, B are 2 × 2 orthogonal matrices such that A + B is orthogonal then $(AB^T)^3 = I$.

Exercise 17 (Harder): ABCD is a parallelogram and a point is rotated anticlockwise through 90° four times, firstly about A, then about B, then about C and finally about D. Prove that the point has returned to its starting position if and only if ABCD is a square.

Exercise 18 (Harder): A is a 2 × 2 invertible matrix such that $|A + A^{-1}| = |A| + |A^{-1}|$. Prove that $A^4 = kI$ for some k > 0.

SOLUTIONS FOR CHAPTER 2

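Exercise 1: (i) $\begin{pmatrix} 1 & 5 \\ 4 & 5 \end{pmatrix}$; (ii) $\begin{pmatrix} 5 & 4 \\ 5 & 10 \end{pmatrix}$; (iii) $\begin{pmatrix} 1 & 4 \\ 2 & 6 \end{pmatrix}$; (iv) $\begin{pmatrix} 4 & 7 \\ 2 & 3 \end{pmatrix}$; (v) $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$; (vi) $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$; (vii) $\begin{pmatrix} 2 & 4 \\ 2 & 7 \end{pmatrix}$; (viii) $\begin{pmatrix} 2 & 4 \\ 2 & 7 \end{pmatrix}$; (ix) $\begin{pmatrix} 13 & 26 \\ 22 & 29 \end{pmatrix}$; (x) $\begin{pmatrix} 10 & 23 \\ 22 & 32 \end{pmatrix}$.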
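Answers of this kind are easy to check by machine. The following sketch recomputes several parts of Exercise 1 with small hand-written helpers. The function names `matmul`, `add`, `scale` and `sub` are assumptions made for the illustration.

```python
def matmul(P, Q):
    """Product of two 2 x 2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def add(P, Q):
    """Entry-by-entry sum of two 2 x 2 matrices."""
    return tuple(tuple(P[i][j] + Q[i][j] for j in range(2)) for i in range(2))

def scale(t, P):
    """The scalar multiple tP."""
    return tuple(tuple(t * P[i][j] for j in range(2)) for i in range(2))

def sub(P, Q):
    """The difference P − Q, computed as P + (−1)Q."""
    return add(P, scale(-1, Q))

A = ((2, 3), (3, 5))
B = ((-1, 2), (1, 0))
C = ((5, -3), (-3, 2))

print(matmul(A, B), matmul(B, A))                        # parts (iii) and (iv)
print(matmul(A, C) == matmul(C, A) == ((1, 0), (0, 1)))  # parts (v) and (vi)
print(matmul(add(A, B), sub(A, B)))                      # part (ix)
print(sub(matmul(A, A), matmul(B, B)))                   # part (x)
```

Parts (ix) and (x) give different matrices because A and B do not commute, so (A + B)(A − B) does not simplify to A² − B².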

Exercise 2: This is when $\begin{vmatrix} k - 1 & -2 \\ -2 & k + 2 \end{vmatrix} = (k - 1)(k + 2) - 4 = k^2 + k - 6 = (k - 2)(k + 3) = 0$.

So the values are k = 2, −3.


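Exercise 3: (i) −3; (ii) $\begin{pmatrix} 5/3 & -3 \\ 1 & -2 \end{pmatrix}$; (iii) $|\lambda I - A| = \begin{vmatrix} \lambda - 6 & 9 \\ -3 & \lambda + 5 \end{vmatrix} = (\lambda - 6)(\lambda + 5) + 27 = \lambda^2 - \lambda - 3$

so |λI − A| = 0 when $\lambda = \frac{1 \pm \sqrt{13}}{2}$.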

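Exercise 4: $\begin{pmatrix} 2 & 5 \\ 1 & -4 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$; $A^{-1} = \frac{1}{13}\begin{pmatrix} 4 & 5 \\ 1 & -2 \end{pmatrix}$; $x = -\frac{1}{13}$, $y = \frac{3}{13}$.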

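Exercise 5: In each part $A^{-1} = \begin{pmatrix} 2 & -1 \\ -5 & 3 \end{pmatrix}$.

(i) $\begin{pmatrix} 3 & 1 \\ 5 & 2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 7 \\ 12 \end{pmatrix}$; x = 2, y = 1.

(ii) $\begin{pmatrix} 3 & 1 \\ 5 & 2 \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$; a = 0, b = 0.

(iii) $\begin{pmatrix} 3 & 1 \\ 5 & 2 \end{pmatrix}\begin{pmatrix} p \\ q \end{pmatrix} = \begin{pmatrix} h \\ k \end{pmatrix}$; p = 2h − k, q = −5h + 3k.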

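Exercise 6: $\begin{vmatrix} \lambda - 3 & -4 \\ 1 & \lambda + 2 \end{vmatrix} = (\lambda - 3)(\lambda + 2) + 4 = \lambda^2 - \lambda - 2 = (\lambda - 2)(\lambda + 1) = 0$ so the eigenvalues are λ = −1, 2.

$A - (-1)I = \begin{pmatrix} 4 & 4 \\ -1 & -1 \end{pmatrix}$ and $\begin{pmatrix} 4 & 4 \\ -1 & -1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ implies x + y = 0.

Hence $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$ is an eigenvector for the eigenvalue −1.

$A - 2I = \begin{pmatrix} 1 & 4 \\ -1 & -4 \end{pmatrix}$ and $\begin{pmatrix} 1 & 4 \\ -1 & -4 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ implies x + 4y = 0.

Hence $\begin{pmatrix} 4 \\ -1 \end{pmatrix}$ is an eigenvector for the eigenvalue 2.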

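Exercise 7: $\begin{vmatrix} \lambda - a & -b \\ 0 & \lambda - 1 \end{vmatrix} = (\lambda - a)(\lambda - 1)$ so the eigenvalues are 1, a.

λ = 1: $\begin{pmatrix} a - 1 & b \\ 0 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ → (a − 1)x + by = 0 so $\begin{pmatrix} b \\ 1 - a \end{pmatrix}$ is an eigenvector for λ = 1.

λ = a: $\begin{pmatrix} 0 & b \\ 0 & 1 - a \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ → y = 0 so $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ is an eigenvector for λ = a.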

Exercise 8:

(i) $\begin{vmatrix} 3 & 2 \\ 1 & 4 \end{vmatrix} = 10$;
(ii) translating (1, 1) to the origin we subtract (1, 1) from the other vectors and so we get the same parallelogram as before, so its area is again 10;
(iii) 5;
(iv) the quadrilateral can be divided into two triangles, the first being the one in (iii) and the second with vertices (3, 5), (5, 5) and (4, 2), which has area ½.2.3 = 3. Hence the total area is 5 + 3 = 8.

Exercise 9: We can split this region into the parallelogram ABCD and the triangle ADE. The area is thus

$$\begin{vmatrix} 3 & 1 \\ -3 & 5 \end{vmatrix} + \tfrac{1}{2}\begin{vmatrix} 1 & -2 \\ 5 & 3 \end{vmatrix} = 18 + 13/2 = 49/2.$$
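The area in Exercise 9 can be cross-checked by a different technique from the one used in the solution: the so-called shoelace formula, which adds up one 2 × 2 determinant for each pair of consecutive vertices and halves the total. The sketch below is an independent check only, and the function name `polygon_area` is hypothetical.

```python
from fractions import Fraction

def polygon_area(vertices):
    """Area of a simple polygon whose vertices are listed in order around it."""
    total = 0
    n = len(vertices)
    for k in range(n):
        x1, y1 = vertices[k]
        x2, y2 = vertices[(k + 1) % n]      # wrap round to the first vertex
        total += x1 * y2 - x2 * y1          # determinant of two consecutive vertices
    return Fraction(abs(total), 2)

pentagon = [(0, 0), (3, -3), (4, 2), (1, 5), (-2, 3)]          # A, B, C, D, E
parallelogram = [(0, 0), (3, 1), (5, 5), (2, 4)]               # Exercise 8(i)
print(polygon_area(pentagon), polygon_area(parallelogram))
```

The vertices must be listed in order around the boundary; listing them in a scrambled order would describe a different, possibly self-crossing, figure.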


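Exercise 10: The point moves to (X, Y) where

$$\begin{pmatrix} X \\ Y \end{pmatrix} = \begin{pmatrix} \cos(5\pi/6) & -\sin(5\pi/6) \\ \sin(5\pi/6) & \cos(5\pi/6) \end{pmatrix}\begin{pmatrix} 3 \\ -1 \end{pmatrix} = \begin{pmatrix} -\sqrt{3}/2 & -1/2 \\ 1/2 & -\sqrt{3}/2 \end{pmatrix}\begin{pmatrix} 3 \\ -1 \end{pmatrix} = \begin{pmatrix} \frac{1 - 3\sqrt{3}}{2} \\ \frac{3 + \sqrt{3}}{2} \end{pmatrix}.$$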

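Exercise 11:

If x = r cos α and y = r sin α then

$$M_{\theta/2}\begin{pmatrix} x \\ y \end{pmatrix} = M_{\theta/2}\begin{pmatrix} r\cos\alpha \\ r\sin\alpha \end{pmatrix} = \begin{pmatrix} r\cos(\theta - \alpha) \\ r\sin(\theta - \alpha) \end{pmatrix} = \begin{pmatrix} r\cos\theta\cos\alpha + r\sin\theta\sin\alpha \\ r\sin\theta\cos\alpha - r\cos\theta\sin\alpha \end{pmatrix} = \begin{pmatrix} x\cos\theta + y\sin\theta \\ x\sin\theta - y\cos\theta \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}.$$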

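Exercise 12:

$$X^TAX = \begin{pmatrix} x & 1 \\ y & 1 \end{pmatrix}\begin{pmatrix} a & b \\ b & c \end{pmatrix}\begin{pmatrix} x & y \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} ax + b & bx + c \\ ay + b & by + c \end{pmatrix}\begin{pmatrix} x & y \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} ax^2 + bx + bx + c & axy + by + bx + c \\ axy + bx + by + c & ay^2 + by + by + c \end{pmatrix}$$

$$= \begin{pmatrix} ax^2 + 2bx + c & u \\ u & ay^2 + 2by + c \end{pmatrix} \text{ where } u = axy + b(x + y) + c$$

$$= \begin{pmatrix} 0 & u \\ u & 0 \end{pmatrix} \text{ since x, y are solutions to the quadratic.}$$

Now the solutions of $ax^2 + 2bx + c = 0$ have product xy = c/a and sum x + y = −2b/a, so

$$u = a(xy) + b(x + y) + c = a \cdot \frac{c}{a} - b \cdot \frac{2b}{a} + c = \frac{2(ac - b^2)}{a}.$$

Hence $z = \frac{2(ac - b^2)}{a}$.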

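Exercise 13: Suppose that AX = XA for all 2 × 2 matrices, X.

In particular $\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} a & b \\ c & d \end{pmatrix}$, so $\begin{pmatrix} a & 0 \\ c & 0 \end{pmatrix} = \begin{pmatrix} a & b \\ 0 & 0 \end{pmatrix}$ and hence b = c = 0.

Also $\begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix}$, so $\begin{pmatrix} 0 & a \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & d \\ 0 & 0 \end{pmatrix}$ and hence a = d.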

[Figure for Exercise 11: the vector v = r(cos α, sin α), its image $M_{\theta/2}v$ = r(cos(θ − α), sin(θ − α)) and the line y = tan(θ/2)x, with the angles α and θ/2 marked.]

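Exercise 14: (i) $S_0 = \begin{pmatrix} u_1 & 1 \\ u_0 & 1 \end{pmatrix} = S = SA^0$ so it is true for n = 0. Suppose it holds for n, that is, $S_n = SA^n$.

Then

$$S_{n+1} = \begin{pmatrix} u_{n+2} & 1 \\ u_{n+1} & 1 \end{pmatrix} = \begin{pmatrix} au_{n+1} + b & 1 \\ au_n + b & 1 \end{pmatrix} = \begin{pmatrix} u_{n+1} & 1 \\ u_n & 1 \end{pmatrix}\begin{pmatrix} a & 0 \\ b & 1 \end{pmatrix} = S_nA = (SA^n)A = SA^{n+1},$$

where the step $S_nA = (SA^n)A$ uses the induction hypothesis. Hence it holds for n + 1 and so, by induction, it holds for all n ≥ 0.

(ii) Since $S_n = SA^n$, $|S_n| = |S|.|A|^n$. Hence $u_{n+1} - u_n = (u_1 - u_0)a^n$.

(iii) $u_n = (u_n - u_{n-1}) + (u_{n-1} - u_{n-2}) + \cdots + (u_1 - u_0) + u_0 = (u_1 - u_0)(a^{n-1} + \cdots + a + 1) + u_0 = (u_1 - u_0)\frac{a^n - 1}{a - 1} + u_0$.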
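The closed formula in part (iii) can be compared with the recurrence itself on a sample sequence. In the sketch below the starting value and the constants a and b are arbitrary integers chosen for illustration, and the function names are hypothetical.

```python
from fractions import Fraction

def u_by_recurrence(u0, a, b, n):
    """u_n obtained by applying u -> a*u + b a total of n times, starting from u0."""
    u = u0
    for _ in range(n):
        u = a * u + b
    return u

def u_by_formula(u0, a, b, n):
    """u_n = (u_1 − u_0)(a^n − 1)/(a − 1) + u_0, valid only when a != 1."""
    u1 = a * u0 + b
    return (u1 - u0) * Fraction(a ** n - 1, a - 1) + u0

# Sample values (assumptions for this check): u0 = 2, a = 3, b = -1.
print([u_by_recurrence(2, 3, -1, n) for n in range(6)])
print(all(u_by_formula(2, 3, -1, n) == u_by_recurrence(2, 3, -1, n) for n in range(10)))
```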

Exercise 15: (a) Suppose A is non-invertible. Then |A| = 0. Hence the product of the eigenvalues of A is 0 and so one eigenvalue must be zero.

Suppose now that 0 is an eigenvalue of A. Then Av = 0v = 0 for some non-zero vector v. If $A^{-1}$ exists then $v = Iv = (A^{-1}A)v = A^{-1}(Av) = A^{-1}0 = 0$, a contradiction.

(b) Suppose that H, K are 2 × 2 matrices and that H is invertible. Suppose H + λK is non-invertible. ∴ $|H|.|I + \lambda H^{-1}K| = 0$. ∴ $|I + \lambda H^{-1}K| = 0$.

Let $H^{-1}K = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$.

∴ $\begin{vmatrix} 1 + \lambda a & \lambda b \\ \lambda c & 1 + \lambda d \end{vmatrix} = 0$.

∴ $(1 + \lambda a)(1 + \lambda d) - \lambda^2 bc = 0$. ∴ $\lambda^2(ad - bc) + \lambda(a + d) + 1 = 0$. This has at most two solutions.

(c) Simply take H = K = 0.

Exercise 16: (i) Suppose that A, B and A + B are orthogonal. Then $I = (A + B)(A + B)^T = (A + B)(A^T + B^T)$. ∴ $AA^T + AB^T + BA^T + BB^T = I$. ∴ $I + AB^T + BA^T + I = I$. ∴ $AB^T + BA^T = -I$.

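(ii) Let $C = AB^T = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Then $C + C^T = AB^T + BA^T = -I$.

∴ a = d = −½ and c = −b.

∴ $C = \begin{pmatrix} -\tfrac{1}{2} & b \\ -b & -\tfrac{1}{2} \end{pmatrix}$. Since |C| = ±1, $b^2 + \tfrac{1}{4} = 1$ and so $b = \pm\frac{\sqrt{3}}{2}$.

Hence C is the rotation matrix $R_\theta$ where θ = 2π/3 or 4π/3. In either case $C^3 = I$.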


Exercise 17: Let A, B, C, D be represented by the vectors 0, b, c and d respectively.

Measuring angles in degrees let $R = \begin{pmatrix} \cos 90 & -\sin 90 \\ \sin 90 & \cos 90 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$.

Let v be any vector representing a point in $\mathbb{R}^2$. The vector will move as follows by the sequence of four rotations:

$v \to Rv$
$\to R(Rv - b) + b = R^2v - Rb + b$
$\to R(R^2v - Rb + b - c) + c = R^3v - R^2b + Rb - Rc + c$
$\to R(R^3v - R^2b + Rb - Rc + c - d) + d = R^4v - R^3b + R^2b - R^2c + Rc - Rd + d = v - R^3b + R^2b - R^2c + Rc - Rd + d.$

This vector is always v if and only if $R^3b - R^2b + R^2c - Rc + Rd - d = 0$, that is, if and only if $(R - I)(R^2b + Rc + d) = 0$.

Now $R - I = \begin{pmatrix} -1 & -1 \\ 1 & -1 \end{pmatrix}$ and so is invertible.

Hence the above condition is equivalent to $R^2b + Rc + d = 0$. Now $R^2 = -I$ (this corresponds to a rotation through 180°) and so the condition is equivalent to Rc = b − d. The length of Rc is the same as that of c, so this implies that the diagonals of ABCD are equal. It also implies that the diagonals are perpendicular. The only parallelogram whose diagonals are of equal length and perpendicular is a square.

Exercise 18: Suppose that $|A + A^{-1}| = |A| + |A^{-1}|$. ∴ $|A^2 + I| = |A^2| + 1$. Suppose the eigenvalues of $A^2$ are α, β. Then the eigenvalues of $A^2 + I$ are 1 + α and 1 + β. Since the determinant of a matrix is the product of its eigenvalues, (1 + α)(1 + β) = αβ + 1. ∴ α + β = 0, that is, β = −α. Since A is invertible, α ≠ 0.

Hence, for some invertible matrix S, $A^2 = SDS^{-1}$ where $D = \begin{pmatrix} \alpha & 0 \\ 0 & -\alpha \end{pmatrix}$.

Thus $A^4 = SDS^{-1}SDS^{-1} = SD^2S^{-1} = S\begin{pmatrix} \alpha^2 & 0 \\ 0 & \alpha^2 \end{pmatrix}S^{-1} = S(\alpha^2 I)S^{-1} = \alpha^2 I$.
