MAT3-ALG algebra 2006/7 (tnb) - lecture 0 1
MAT3-ALG algebra 2008-2009 — Toby Bailey, http://student.maths.ed.ac.uk
lecture 0 preamble
• This course follows on from Year 2 Linear Algebra. You are strongly
advised to do the revision problems below to get up to speed on last
year’s material.
• The course consists of the skeleton notes, the lectures and the exercises.
The lectures will not duplicate what is in the notes and more examples
will be done in lectures. Most important: do the exercises — in that way
you keep up with the material and will get more from lectures.
• There will be questions set each week to hand in to and discuss with your
tutor — usually one short example from each lecture. These questions are
only meant to be good examples to discuss in the limited time available
— you should not assume that these questions are particularly likely to
come up in exams. You should be attempting most of the questions for
each lecture.
problems
These are “easy” revision problems from year 2 Linear Algebra. It is essential
that you are on top of this earlier material and so you are strongly recommended
to do these exercises. Aim to do all of them by the end of week 2 at the latest.
Throughout, Pn denotes the vector space of polynomials of degree ≤ n in a
variable x and M denotes the vector space of 2 × 2 real matrices.
problem 0.1 Which of the following are subspaces of the given vector space?
1. {x ∈ R3 | 2x1 − x2 + x3 = 1} ⊆ R3
2. {x ∈ R3 | x1 = 2x2} ⊆ R3
3. {P ∈ P3 | P(1) = 0} ⊆ P3
4. {A ∈ M | AT = −A} ⊆ M (the “T” denotes matrix transpose).
problem 0.2 For each of the examples in the previous question that is a
subspace, give its dimension and write down a basis for the subspace.
problem 0.3 Calculate the coordinate matrix of x3 with respect to the basis
x3 − x2, x2 − x, x − 1, 1 of P3.
problem 0.4 Use the change of basis matrix to find the coordinate matrix of
x in the basis v1, v2 of R2 where (writing column vectors as transposed rows)
x = (1, 1)T, v1 = (1, −1)T, v2 = (2, 1)T.
problem 0.5 What is the span of a set S = {v1, . . . , vk} of vectors? What
does it mean for the set to be linearly independent?
problem 0.6 Let U = {x | x1 = 0} and V = {x | x2 = 0} be subspaces of R3.
What is the sum U + V of these subspaces? State the Dimension Theorem for
sums of subspaces and verify it in this example. Is this an example of a direct
sum?
problem 0.7 Which of the following are linear maps?
1. T : M → P2 where T(A) = the characteristic polynomial of the matrix A.
2. T : M → R where T(A) = Trace A (Here “Trace” denotes the trace of a
matrix — the sum of the elements on the leading diagonal.)
3. T : P3 → P3 where T : p(x) ↦ p′(x)
problem 0.8 Define the kernel and image of a linear map. State the Rank
Theorem (a.k.a. “Rank-Nullity theorem”) for linear maps. For each of the
examples in the previous question that is linear, describe the kernel and image
and verify the theorem.
problem 0.9 Let A be an n × n matrix and let T : Rn → Rn be the linear
map T : x ↦ Ax. Which of the following conditions are equivalent to A having
an inverse?
1. det A ≠ 0
2. ker T = {0}
3. im T = Rn
4. T is a bijection.
5. A is diagonalisable.
problem 0.10 Find the eigenvalues and eigenvectors of
A =
( 1 2 )
( 1 1 ).
Hence diagonalise A.
lecture 1 sets
1.1 sets and subsets
1.1.1 definition A set is a collection of objects. The objects in a set S are
called elements or members of S. If x is a member of S we write x ∈ S and
if not then we write x ∉ S. Two sets S, T are equal if they have the same
elements.
1.1.2 notation We denote a set by including its elements in braces (curly
brackets). Thus we might write S = {1, 2, 3, 4, 5} to define the set whose
members are the first five natural numbers. We also use the “vertical bar” that
means “such that”. So we could define the interval [0,∞) by
[0,∞) := {x ∈ R | x ≥ 0}.
(Note that we often use “:=” when an equality is defining its left-hand side.)
1.1.3 notation If the set S has a finite number of elements, we refer to that
number as the size of S. We write #S for the size of S.
1.1.4 definition The empty set is the set with no elements. We write it as {}
or as ∅.
1.1.5 definition The set A is a subset of the set B if x ∈ A =⇒ x ∈ B. We
write A ⊆ B if A is a subset of B. The subset A ⊆ B is proper if A ≠ B. We
write A ⊂ B to mean that A is a proper subset of B.
1.1.6 warning Mathematicians are not consistent in notation — some use
A ⊂ B for all subsets, proper or not. Our notation is modelled on the distinction
between “<” and “≤”.
1.1.7 theorem For every set B it is true that {} ⊆ B.
Proof. Every element of {} is also in B because there are no such elements. □
1.1.8 note We sometimes say that conditions such as that in the above proof
“hold vacuously”. If you doubt it, look at it this way. For the statement to be
false there would need to be an element of {} that is not in B. That is certainly
not the case. So the statement is certainly not false.
1.1.9 note We allow sets to be elements of sets. For example, {{}} is not the
same as {}. The first set has an element (which happens to be the empty set)
but the second does not.
1.1.10 definition The power set P(S) of a set S is the set of all subsets of S.
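If you like to experiment, the power set of a small finite set is easy to compute. Here is a throwaway Python sketch (not part of the course); it uses frozenset so that sets can themselves be elements of a set:

```python
from itertools import combinations

def power_set(s):
    """Return the set of all subsets of s, each as a frozenset."""
    elems = list(s)
    return {frozenset(c) for r in range(len(elems) + 1)
            for c in combinations(elems, r)}

P = power_set({1, 2, 3})
print(len(P))            # 8 subsets of a 3-element set
print(frozenset() in P)  # the empty set is a subset of every set
```

Note that power_set(set()) comes out as {frozenset()}, matching the fact that {} has exactly one subset, namely {} itself (compare problem 1.3 below).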
1.1.11 definition One often deals with families of sets. For example, we
might define In to be the closed interval [−n,n] as follows.
In := [−n,n], n ∈ N.
We refer to this as a family of sets with index set N.
1.2 complements and differences
1.2.1 definition The complement A′ of a set A is defined by {x | x ∉ A}.
1.2.2 definition Let A and B be sets. Then the set difference is defined by
A \ B := A ∩ B′.
1.3 intersections and unions
1.3.1 definition The union A ∪ B and intersection A ∩ B of two sets are
defined by
A ∪ B := {x | x ∈ A or x ∈ B}, A ∩ B := {x | x ∈ A and x ∈ B}.
In the definition of union note that as always in mathematics “P or Q” is true
if at least one and possibly both of P and Q are true.
1.3.2 definition Let Aλ, λ ∈ Λ be a family of sets. Then we define the union
and intersection of the family by
⋃λ∈Λ Aλ := {x | x ∈ Aλ for some λ ∈ Λ},
⋂λ∈Λ Aλ := {x | x ∈ Aλ for all λ ∈ Λ}.
1.3.3 example For the family in §1.1.11, the union is R and the intersection
is [−1, 1].
1.3.4 theorem For all sets A,B we have
A ⊆ A ∪ B, B ⊆ A ∪ B, A ∩ B ⊆ A, A ∩ B ⊆ B.
The analogous statements hold for unions and intersections of families.
1.4 set algebra
1.4.1 trivial identities
1. A ∪ B = B ∪A and A ∩ B = B ∩A
2. A ∪ (B ∪C) = (A ∪ B) ∪C and so A ∪ B ∪C is unambiguous. Same for
intersections.
3. {} ∩A = {} and {} ∪A = A
4. (A ′) ′ = A, A ∩A ′ = {}
5. A = A ∩A = A ∪A
1.4.2 identities involving families
B ∩ ⋃λ∈Λ Aλ = ⋃λ∈Λ (B ∩ Aλ),
B ∪ ⋂λ∈Λ Aλ = ⋂λ∈Λ (B ∪ Aλ),
(⋃λ∈Λ Aλ)′ = ⋂λ∈Λ Aλ′,
(⋂λ∈Λ Aλ)′ = ⋃λ∈Λ Aλ′.
1.4.3 example proof Here is a proof of the final identity involving families.
We show first that LHS ⊆ RHS and then that RHS ⊆ LHS.
1. Let x be in the left-hand side. Then x is not in ⋂λ∈Λ Aλ. Thus x is
not in every one of the Aλ. Thus there exists a µ ∈ Λ such that x ∈ Aµ′.
Hence x is in the right-hand side.
2. Now suppose x is in the right-hand side. Then there exists a µ ∈ Λ such
that x ∈ Aµ′. Thus x ∉ Aµ and so x ∉ ⋂λ∈Λ Aλ and so x is in the
LHS.
(Note in the above we have used λ for a “general” element of Λ and µ for a
particular element that arises in the proof.)
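Identities like this can also be sanity-checked on a small finite family by computer. A quick Python sketch (not a proof; complements here are taken relative to a finite universe U, an assumption the abstract identity does not need):

```python
# Check (⋂λ Aλ)' = ⋃λ Aλ' for a small family indexed by Λ = {0, 1, 2}.
U = set(range(10))  # finite "universe" so that complements make sense
family = {0: {1, 2, 3}, 1: {2, 3, 4}, 2: {3, 4, 5}}

def complement(a):
    return U - a

lhs = complement(set.intersection(*family.values()))
rhs = set().union(*(complement(a) for a in family.values()))
print(lhs == rhs)  # True
```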
1.4.4 example Use set algebra to show that A \ (B∪C) = (A \B)∩ (A \C).
Solution:
A \ (B ∪ C) = A ∩ (B ∪ C)′ by defn of set difference
= A ∩ (B′ ∩ C′) by §1.4.2 (3rd identity)
= A ∩ A ∩ B′ ∩ C′ by §1.4.1, parts 2 and 5
= (A ∩ B′) ∩ (A ∩ C′) by §1.4.1
= (A \ B) ∩ (A \ C) by defn of set difference
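The same identity can be confirmed numerically with Python's built-in set operations (a quick check on arbitrary sample sets, not a substitute for the set-algebra argument above):

```python
A = {1, 2, 3, 4, 5}
B = {2, 3}
C = {3, 4, 9}

lhs = A - (B | C)        # A \ (B ∪ C)
rhs = (A - B) & (A - C)  # (A \ B) ∩ (A \ C)
print(lhs == rhs)        # True; both are {1, 5}
```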
1.5 russell’s paradox (optional)
Although it is never an issue in everyday mathematics, you have to be very
careful about talking about sets of sets. Let S be the set of all sets. This does
not sound too worrying except that S has the odd property that S ∈ S. So there
seems to be nothing to stop us defining
R := {A ∈ S | A ∉ A}.
(In words, R is the set of all sets which are not elements of themselves.)
Now ask whether R is an element of itself. Show that R ∈ R =⇒ R ∉ R
and also that R ∉ R =⇒ R ∈ R. We have a complete contradiction, known as
“Russell’s Paradox”. The usual way out of this is not to allow things as wildly
general as S to be called a set.
problems
problem 1.1 Is it true that {} ⊂ B for all sets B?
problem 1.2 For the four properties of set algebra in §1.4.2 above, write down
(next to the original, perhaps) what they reduce to for a family of just two sets.
In each case, give a proof. For at least two of those not proved in lectures or
the text, write down a proof too of the version for families.
problem 1.3 Let S be a finite set. Write down a formula for the size of P(S).
Write down P(S) in the case of S being the empty set. Does the formula work
in this case?
problem 1.4 Let L denote the set of (straight) lines through the origin in
R2. Let M denote the set of (straight) lines through (1, 1) in R2. How many
elements does L ∩M have?
problem 1.5 + Hand-in for tutorial Suppose S = {x, y, z}. True or
False:
1. S ∈ P(S);
2. x ∈ P(S);
3. {x, y} ∈ S;
4. {x, y} ∈ P(S);
5. {{x, y}, {}} ∈ P(S);
6. {{x, y}, {}} ⊆ P(S);
7. {{x, y}, {}} ∈ P(P(S));
8. {{}} ∈ P(P(S)).
What is the size of P(P(S))?
problem 1.6 + Hand-in part 1 only for tutorial Use set algebra to
show that
1. A \ (B ∩ C) = (A \ B) ∪ (A \ C)
2. A \ (B \ C) = (A \ B) ∪ (A \ C ′)
problem 1.7 Show that
A ∪ (A ∩ B) = A.
You will need to argue by showing that the LHS is a subset of the RHS and
that the RHS is a subset of the LHS. Use set algebra to deduce that also
A ∩ (A ∪ B) = A.
These two results (which cannot be deduced from the other identities of set
algebra that we have previously stated) are called the “axioms of absorption”.
Use them together with set algebra as before to show that A \ (B \ A) = A.
lecture 2 cartesian products and functions
2.1 cartesian products
2.1.1 definition An ordered n-tuple is a list (x1, . . . , xn) of n objects, where
the order is important and repetitions are allowed. For n = 2 and n = 3 we use
the terms ordered pair and ordered triple respectively.
2.1.2 definition Let A,B be sets. Then the cartesian product of A and B is
the set
A× B := {(a, b) |a ∈ A and b ∈ B}.
(Here (a, b) is an ordered pair.)
2.1.3 definition More generally, let S1, . . . , Sn be sets. Then their cartesian
product is the set
S1 × · · · × Sn := {(x1, . . . , xn) | xj ∈ Sj for j = 1, . . . , n}.
2.1.4 example The usual definition of Rn is
Rn := R × · · · × R (n factors).
2.2 functions (= maps = mappings = transformations)
2.2.1 definition A function f from A to B is a subset Gf ⊆ A × B with
the property that for each a ∈ A there is one and only one b ∈ B such that
(a, b) ∈ Gf. We write b = f(a) if (a, b) ∈ Gf.
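This definition can be modelled directly on finite sets: represent a candidate graph Gf as a set of pairs and check the “one and only one” condition. A Python sketch (the helper name is ours, not standard):

```python
def is_function(G, A, B):
    """Is the set of pairs G the graph of a function from A to B?"""
    firsts = [a for (a, b) in G]
    return (all(a in A and b in B for (a, b) in G)
            and all(firsts.count(a) == 1 for a in A))

A, B = {1, 2, 3}, {"x", "y"}
G_good = {(1, "x"), (2, "x"), (3, "y")}
G_bad = {(1, "x"), (1, "y"), (2, "x")}  # 1 has two values, 3 has none
print(is_function(G_good, A, B))  # True
print(is_function(G_bad, A, B))   # False
```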
2.2.2 notation We sometimes use the equivalent words map, mapping or
transformation instead of function.
2.2.3 relation with more elementary definition We have previously defined a
function from A to B to be a rule that associates an element f(a) ∈ B to
each a ∈ A. The connection of course is that f(a) is the unique b such that
(a, b) ∈ Gf. The previous definition has two drawbacks:
• What exactly is a “rule”? For example, is “ask Henry” a suitable rule to
determine a function?
• Two functions can have different rules but be equal, for example x ↦
(x + 1)2 and x ↦ x2 + 2x + 1. (One really needs to supplement the old
definition by adding that “two functions are equal if they take the same
values for all x”.)
Our definition is cleaner and more precise (at the expense of being more ab-
stract). The relationship is that if we have defined a function f : A → B by a
rule then the subset of A× B is its graph
Gf = {(a, f(a)) |a ∈ A}.
We will usually specify functions as we always have, by giving a rule for com-
puting f(a) from a.
2.2.4 definition and notation We write f : A → B to denote that f is a
function from A to B. If f(a) = b we write f : a ↦ b. The set A is called the
domain of f and the set B is called the codomain.
2.2.5 note Note well the difference between “→” and “↦”. (Some people
use “→” for both.)
2.2.6 definition Let S be a set. Then the identity function I : S → S is
defined by I(x) = x for all x.
2.2.7 example Let S be a set. Define a map f : S × S → S × S by f :
(x, y) ↦ (y, x). The set of all (x, y) such that f : (x, y) ↦ (x, y) (that is, the
fixed point set of f) is a special subset ∆ of S × S called the diagonal.
Alternatively, it can be defined by
∆ := {(x, x) | x ∈ S}.
2.2.8 example The diagonal ∆ is the graph of the identity map I : S→ S.
2.2.9 definition The maps π1 : X× Y → X and π2 : X× Y → Y defined by
π1 : (x, y) ↦ x, π2 : (x, y) ↦ y
are called the projections from X × Y to X and Y respectively. We sometimes
refer to them as the canonical projections. (A canonical object is something
that arises naturally from the given information without making extra choices.)
2.3 action on subsets
2.3.1 definition Let f : A→ B be a function. Let U ⊆ A and V ⊆ B. Then
we define
f(U) := {f(x) | x ∈ U}, f−1(V) := {x ∈ A | f(x) ∈ V}.
2.3.2 note Both f and f−1 act on subsets and produce subsets. In fact, they
define functions
f̃ : P(A) → P(B), f̃−1 : P(B) → P(A).
2.3.3 example Consider f : R → R with f : x ↦ x2. Then
f({−1, 1, 3}) = {1, 9}, f({−3}) = {9}, f([−2, 2]) = [0, 4], f(R) = [0,∞)
and
f−1({1}) = {−1, 1}, f−1({−3, 4}) = {−2, 2}, f−1([−2,−1]) = {}.
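For finite sets these images and preimages can be computed mechanically. A Python sketch (with a finite sample of R standing in for the real line, so it only illustrates the finite-set parts of the example; the helper names are ours):

```python
def image(f, U):
    """f(U) = {f(x) | x in U}."""
    return {f(x) for x in U}

def preimage(f, V, domain):
    """f^{-1}(V): f need not be invertible, we just search the domain."""
    return {x for x in domain if f(x) in V}

f = lambda x: x * x
D = set(range(-10, 11))  # finite stand-in for R

print(image(f, {-1, 1, 3}))        # the set {1, 9}
print(preimage(f, {1}, D))         # the set {-1, 1}
print(preimage(f, {-3, 4}, D))     # the set {-2, 2}: nothing maps to -3
```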
2.3.4 notation and a warning In mathematics there is always a tension be-
tween precision and readability. Confusing related but different things can
be disastrous, but can also be a huge aid to transparency. We will usually
drop the tilde from the above notation and write just f([−1, 1]) = [0, 1] and
f−1({1}) = {−1, 1}. We may also allow ourselves to confuse elements of a set
with one-element subsets and write f−1(1) = {−1, 1}. Do not however fall into
the trap of thinking that when we use f−1 or f̃−1 like this that f or f̃ has
an inverse. The function f in our example does not have an inverse function
f−1 : R → R.
2.3.5 the axiom of choice (optional) More generally, it is reasonable to define
the cartesian product of an arbitrary (perhaps infinite) family Sλ, λ ∈ Λ of sets
— an element of the product is a choice for each λ of an element of Sλ. The
axiom of choice is the statement that if there exists a family of sets as above
such that each Sλ is non-empty then the cartesian product is non-empty.
You might say that the axiom of choice is obvious since all you need to do
to find an element of the cartesian product is make a choice for each λ. It
cannot, however, be proved or disproved from more basic axioms for set theory,
and so you are free to believe it or not as you wish.
The vast majority of working mathematicians take the axiom of choice to
be “obviously” true and use it without thinking. A small minority doubt it since
it is not clear how you can make what may be an infinite number of choices
simultaneously and so disbelieve things that require it for their proof. These
are often abstract, non-constructive results that prove something exists without
actually demonstrating how to find it.
problems
problem 2.1 Let A,B be finite sets. What is the size of A × B?
problem 2.2 Show that A × (B ∩ C) = (A × B) ∩ (A × C). (Hint: to do
this carefully, show that the LHS is a subset of the RHS and that the RHS is a
subset of the LHS.)
problem 2.3 Decide what relationship holds between the following pairs of
sets (one side is a subset of the other, or they are equal, or there is no relation).
Give a proof.
1. A× (B ∪ C) and (A× B) ∪ (A× C);
2. (A× B) ∩ (C×D) and (A ∩ C)× (B ∩D);
3. (A× B) ∪ (C×D) and (A ∪ C)× (B ∪D).
problem 2.4 Consider the function f : R → R2 given by f : t ↦ (cos t, sin t).
The graph of f is a subset of R1 × R2 = R3. Sketch it.
problem 2.5 Consider the map f : R2 → R given by f : (x, y) ↦ √(x2 + y2).
Describe each of the following using some combination of words, equations or
pictures (proofs not required):
1. f−1(2)
2. f−1([1, 2]) (here [1, 2] is the closed interval)
3. f−1(−1)
4. f(Z) where Z = {(x, y) | (x− 2)2 + (y− 2)2 = 2}
5. f(V) where V = {(x, y) |y > 0}
6. f(U) where U = {(x, y) | x2 − y2 = 1}
7. f(R2)
problem 2.6 + Hand-in for tutorial Let f : X → Y be a function and
let A,B be subsets of X. Show that
f(A ∩ B) ⊆ f(A) ∩ f(B).
(Hint: Your proof should begin “Let y ∈ f(A ∩ B)” and should finish with “and
hence y ∈ f(A) ∩ f(B)”.) Give an example to show that “⊆” cannot be replaced
with equality.
problem 2.7 Let X and Y be finite non-empty sets. Write down a formula for
the number of different functions from X to Y. Now consider the case where
one of X and Y is empty. How many functions are there in that case? Are the
results consistent with your previous formula? (Hint: For the second part use
the definition of function in terms of a subset of the cartesian product.)
lecture 3 more on functions
3.1 injection, surjection, bijection
3.1.1 definition The map f : X → Y is an injection (or “one to one”) if
f(x) = f(y) =⇒ x = y.
3.1.2 definition The map f : X → Y is a surjection (or “onto”) if for all y ∈ Y
there exists x ∈ X such that f(x) = y.
3.1.3 definition The map f : X → Y is a bijection (or “a one to one corre-
spondence” or an “isomorphism of sets”) if it is injective and surjective.
3.1.4 definition The image im(f) of the map f : X→ Y is defined by
im(f) := {f(x) | x ∈ X}.
3.1.5 note In terms of f acting on subsets, im(f) = f(X) and so f is surjective
if and only if f(X) = Y.
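For maps between finite sets the three properties can be tested mechanically. A Python sketch, with a map stored as a dict (its keys being the domain; the helper names are ours):

```python
def is_injective(f):
    """No two keys share a value."""
    values = list(f.values())
    return len(values) == len(set(values))

def is_surjective(f, codomain):
    """Every element of the codomain is hit."""
    return set(f.values()) == set(codomain)

def is_bijective(f, codomain):
    return is_injective(f) and is_surjective(f, codomain)

f = {1: "a", 2: "b", 3: "c"}
g = {1: "a", 2: "a", 3: "b"}
print(is_bijective(f, {"a", "b", "c"}))  # True
print(is_injective(g))                   # False: g(1) = g(2)
```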
3.2 composition and inverses
3.2.1 definition Let f : X → Y and g : Y → Z be maps. We define the
composition g ◦ f : X → Z by g ◦ f : x ↦ g(f(x)).
3.2.2 theorem Composition of maps is associative: let f : W → X, g : X → Y,
h : Y → Z be maps. Then h ◦ (g ◦ f) = (h ◦ g) ◦ f.
Proof. Let w ∈W. Then
(h ◦ (g ◦ f))(w) = h((g ◦ f)(w)) = h(g(f(w)))
and
((h ◦ g) ◦ f)(w) = (h ◦ g)(f(w)) = h(g(f(w))).
The two functions thus give the same value for all w and so are equal. □
3.2.3 definition Let f : X→ Y be a map. A map g : Y → X is an inverse for
f if g ◦ f = IX and f ◦ g = IY .
3.2.4 notation We usually write f−1 for the inverse of a map if one exists. Do
not confuse this with f−1 acting on subsets (as in the previous lecture), which
is well-defined even if f has no inverse.
3.2.5 theorem The map f : X→ Y has an inverse if and only if f is a bijection.
Proof. Same proof as for functions R → R. □
3.2.6 theorem If f, g as above are both injective then so is g ◦ f. If f, g as
above are both surjective then so is g ◦ f. If f, g as above are both bijective then
so is g ◦ f and (g ◦ f)−1 = f−1 ◦ g−1.
Proof.
• For injectivity of g◦f: Let (g◦f)(x) = (g◦f)(y). Then g(f(x)) = g(f(y))
and so f(x) = f(y) since g is injective. Hence x = y since f is injective.
• For surjectivity of g◦f: Exercise - you must begin by saying ”let z ∈ Z” and
you should end by deducing the existence of x ∈ X such that (g◦f)(x) = z.
• Firstly, g◦f is injective since both f and g are injective, and it is surjective
since both f and g are surjective. Hence g ◦ f is bijective. Now check:
(f−1 ◦ g−1) ◦ (g ◦ f) = f−1 ◦ ((g−1 ◦ g) ◦ f) = f−1 ◦ I ◦ f = f−1 ◦ f = I,
and similarly for (g ◦ f) ◦ (f−1 ◦ g−1). So (g ◦ f)−1 = f−1 ◦ g−1.
□
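The formula (g ◦ f)−1 = f−1 ◦ g−1 is easy to confirm on small bijections represented as dicts (a throwaway check, not part of the notes):

```python
f = {1: "a", 2: "b", 3: "c"}     # a bijection X -> Y
g = {"a": 10, "b": 20, "c": 30}  # a bijection Y -> Z

def compose(g, f):
    """g ∘ f as a dict: x -> g(f(x))."""
    return {x: g[f[x]] for x in f}

def inverse(f):
    """Swap keys and values (valid because f is a bijection)."""
    return {y: x for x, y in f.items()}

gf = compose(g, f)
print(inverse(gf) == compose(inverse(f), inverse(g)))  # True
```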
3.3 change of domain and codomain
3.3.1 definition Let f : X → Y and suppose U ⊆ X. Then the restriction of
f to U is the function f|U : U→ Y defined by f|U(x) = f(x) for all x ∈ U.
3.3.2 note So we are simply forgetting that we could apply f to elements
outside U. We often just write f for the restricted function unless there is a
danger of confusion.
3.3.3 example Restriction of the domain can have important effects. The
function sin : R → [−1, 1] is not injective and hence has no inverse. If we
restrict the domain to [−π/2, π/2] it is a bijection and has an inverse, called
arcsin.
3.3.4 definition Let f : X→ Y and suppose V is such that im(f) ⊆ V. Then
we can change the codomain and obtain a map X→ V.
3.3.5 example For the squaring function R→ R, we can restrict the domain
to [0,∞) and change the codomain to [0,∞). We then have a bijection whose
inverse is usually written x ↦ √x.
3.4 back to cartesian products
3.4.1 the problem We would obviously like to think of R1 × R2 as being
“the same as” R3. Unfortunately, being pedantic, it is not. An element of
the first set is something like (−3, (2, 4)), which is an ordered pair whose first
component is a number and whose second component is an ordered pair. On
the other hand, an element of R3 is an ordered triple such as (−3, 2, 4). What
we can say is that there is a canonical (meaning, remember, naturally arising
from the situation) bijection f : Rk × Rl → Rk+l given by
f : ((x1, . . . , xk), (y1, . . . , yl)) ↦ (x1, . . . , xk, y1, . . . , yl).
Having seen all this once, we simply use this canonical bijection to identify
Rk × Rl with Rk+l.
3.4.2 associativity of cartesian product A similar problem arises with carte-
sian products in general. A × (B × C) and (A × B) × C are in principle different.
But if we identify both with A×B×C in the obvious way, then we can regard
them as equal.
3.4.3 commutativity of cartesian products There is a canonical bijection f :
A × B → B × A given by f : (a, b) ↦ (b, a). But here it is important to
maintain the distinction between the two sets.
problems
problem 3.1 Give conditions on the sizes of the subsets f−1(y), y ∈ Y, that
characterize f being (a) injective; (b) surjective; (c) bijective.
problem 3.2 + Hand-in for tutorial
1. Let g ◦ f be injective. Show that f is injective. (Hint: You may find it
best to show that if f is not injective then g ◦ f is not injective.)
2. If g ◦ f is injective, does g have to be injective? Give a proof or a
counterexample.
3. What exactly can be deduced if we know that g ◦ f is surjective?
problem 3.3 Let f : A → A be a map and suppose that f ◦ f = f. What extra
condition on f allows us to deduce that f is the identity map?
problem 3.4
1. Let f : A → B be a map. Suppose there exists a map g : B → A such
that g ◦ f = IA : A→ A. Show that f is injective.
2. Suppose f : A→ B is injective. Deduce that there exists a map g : B→ A
such that g ◦ f = IA : A→ A.
3. Under what circumstances is the map g in the previous part unique?
problem 3.5 State and prove results analogous to the previous exercise that
involve a map h : B → A such that f ◦ h = IB.
problem 3.6 Show that there exists an injection A → B if and only if there
exists a surjection B→ A.
problem 3.7 (Harder!) Let S be a set. Show that there cannot exist a
surjection f : S → P(S). You might proceed as follows. Suppose there is such
a surjection f. Now consider the subset A ⊆ S defined by
A = {x ∈ S | x ∉ f(x)}.
Deduce that A itself is not in the image of f for a contradiction.
problem 3.8 (Optional!) The Schröder–Bernstein Theorem states that if there
are injections A → B and B → A then there exists a bijection A → B. It is
not entirely trivial. Find a proof (Halmos’s “Naive Set Theory” or perhaps the
web) and understand it!
lecture 4 relations and quotients
4.1 relations in general
4.1.1 definition A relation between the sets X and Y is a subset R ⊆ X × Y.
If (x, y) ∈ R then we say y is related to x. A relation on X is a relation between
X and itself.
4.1.2 examples
• {} ⊆ X× Y is the relation where nothing in X is related to anything in Y.
• X×Y ⊆ X×Y is the relation where everything in X is related to everything
in Y.
• The definition of function A → B in §2.2.1 defines a function as a special
sort of relation between A and B.
• The subset {(x, y) | x2 + y2 = 1} ⊆ R × R defines a relation on R. It is
not a function R→ R because some x-values have no y value and some
have more than one.
• The subset {(z,w) | |w| ≥ |z|} ⊆ C × C defines a relation on C. In this
case, w is related to z if and only if its modulus is at least as great as that of z.
• The subset {(m,n) |m − n = 3k for some k ∈ Z} ⊆ Z × Z defines a
relation on Z. Here n is related to m if and only if they have the same
remainder on division by 3.
4.2 equivalence relations
4.2.1 definition An equivalence relation on a set S is a relation such that,
writing x ∼ y if y is related to x we have
1. a ∼ b =⇒ b ∼ a
2. For all a ∈ S it is the case that a ∼ a
3. a ∼ b and b ∼ c =⇒ a ∼ c
4.2.2 notation Hereafter in this lecture we assume that S is a set on which
an equivalence relation is defined.
4.2.3 definition Let a ∈ S. The equivalence class of a is the set
[a] := {b ∈ S |a ∼ b}.
4.2.4 theorem If a ∼ b then [a] = [b]. Otherwise [a] ∩ [b] = {}.
4.2.5 corollary The set S is a disjoint union of equivalence classes. (Note: a
union is disjoint if every pair of sets in the union has empty intersection.)
4.2.6 definition A set of representatives is a subset of S with the property
that it contains precisely one element from each equivalence class.
4.2.7 example Let S = R2 and let x ∼ y if |x| = |y| (the usual modulus of a
vector). This is an equivalence relation. The equivalence classes are (a) all the
circles centred on the origin and also (b) the equivalence class {0} consisting of
just the zero vector. A set of representatives is {(a, 0) |a ≥ 0}.
4.2.8 theorem Let f : S → T be a function. Then a ∼ b if and only if
f(a) = f(b) defines an equivalence relation on S.
4.2.9 examples
1. The example in §4.2.7 arises from the modulus function R2 → R.
2. Consider the squaring function f : R → R. This gives rise to the equiva-
lence relation on R where x ∼ y if and only if x2 = y2. The equivalence
classes are all sets of the form {x,−x} together with the single-element
class {0}. A set of representatives is [0,∞).
4.2.10 example (This does not naturally arise from a function as just dis-
cussed.) Let M be the set of real n × n matrices. For A,B ∈ M, let us say
that A ∼ B iff there exists an invertible n× n matrix P with B = P−1AP. This
defines an equivalence relation on M.
Proof.
1. A = I−1AI and so A ∼ A.
2. If A ∼ B then there exists invertible P with B = P−1AP. Set Q = P−1
which is also invertible. Then A = PBP−1 = Q−1BQ. Thus B ∼ A.
3. If A ∼ B and B ∼ C then there exist invertible P,Q with B = P−1AP and
C = Q−1BQ. Thus C = Q−1P−1APQ = (PQ)−1A(PQ) and so A ∼ C.
□
4.3 quotients
4.3.1 definition Let ∼ be an equivalence relation on X. Define the quotient
X/ ∼ to be the set whose elements are the equivalence classes of ∼.
4.3.2 example Consider T, the set of all times, past, present and future. Con-
sider the equivalence relation on T given by s ∼ t iff s and t differ by an exact
integer multiple of 24 hours. Then the quotient D = T/ ∼ can be thought of
as the set of “times of day”. When we make a statement like “I like a drink at
6 o’clock”, one could argue that the “6 o’clock” refers to an element of D - a
single abstract entity that we construct which is the equivalence class of all 6
o’clocks in all possible days.
4.3.3 example Let Z denote the integers and let a ∼ b iff a − b is a multiple
of 3. This is an equivalence relation. There are three equivalence classes
{. . . , −3, 0, 3, 6, . . . }, {. . . , −5, −2, 1, 4, 7, . . . } and {. . . , −4, −1, 2, 5, 8, . . . }.
A set of representatives is {0, 1, 2} although {−19, 27, 7} would be just as good.
We often use the notation [a] for the equivalence class containing a. The set
of equivalence classes Z3 := Z/∼ is thus Z3 = {[0], [1], [2]}.
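The partition in this example can be computed by brute force on a finite chunk of Z. A Python sketch (the function name is ours; note it quietly relies on transitivity when it compares x against a single member of each existing class):

```python
def classes(elements, related):
    """Partition `elements` into equivalence classes of `related`."""
    result = []
    for x in elements:
        for cls in result:
            # by transitivity, comparing with one representative suffices
            if related(x, next(iter(cls))):
                cls.add(x)
                break
        else:
            result.append({x})
    return result

parts = classes(range(-6, 9), lambda a, b: (a - b) % 3 == 0)
print(len(parts))        # 3 classes, as in the example
print(sorted(parts[0]))  # the visible part of [0]: [-6, -3, 0, 3, 6]
```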
problems
problem 4.1 + Hand-in for tutorial Show that u ∼ v if and only if
u−v ∈ Z defines an equivalence relation on R. Describe [x] for this equivalence
relation and give a set of representatives.
problem 4.2 Does a ∼ b ⇐⇒ a + 2b = 3k for some k ∈ Z define an
equivalence relation on Z? (Check carefully and investigate if you are not sure
— don’t just guess!)
problem 4.3 + Hand-in for tutorial Let a be the vector (1, 1) ∈ R2. Show that
x ∼ y ⇐⇒ x− y = λa for some λ ∈ R
defines an equivalence relation on R2. Sketch the equivalence classes and show
that R = {(x, y) | x+ y = 0} is a set of representatives.
problem 4.4
(a) Show that if x ∈ R2 is non-zero then there exists an invertible 2× 2 matrix
A such that Ae1 = x where e1 is the first standard basis vector in R2.
(b) Use the above to show that given two non-zero vectors x, y ∈ R2 there
exists an invertible 2 × 2 matrix P such that y = Px. (Hint: take x to e1 and then e1 to y.)
(c) Let x ∼ y iff there exists an invertible 2 × 2 matrix A such that y = Ax.
Show that this defines an equivalence relation on R2.
(d) What are the equivalence classes for this equivalence relation? Give a set
of representatives. How many elements does R2/ ∼ have?
problem 4.5 Consider the set X = {(a, b) | a, b ∈ Z and b ≠ 0} (so an
element of X is a pair of integers with the second one non-zero). Show from
the definition that
(a, b) ∼ (k, l) ⇐⇒ al = bk
defines an equivalence relation on X.
lecture 5 the first isomorphism theorem (FIT) for sets
5.1 defining operations on quotients
Fix a natural number n > 1. It will be helpful for us in future to discuss the
essentially trivial fact that addition mod n is a well-defined concept.
Let a ∼ b in Z iff a−b is an integer multiple of n. There are n equivalence
classes - a set of representatives is {0, 1, 2, 3, . . . , n − 1}. We write (as always)
[a] for the equivalence class containing a. Thus [a] ∈ Zn := Z/ ∼. Now, we
can regard addition mod n as an operation defined on the equivalence classes
— in other words as an operation defined on elements of the quotient Zn. Let
us define
[a] + [b] := [a+ b].
There is something that needs thinking about here: if n = 5 then [2] and [7] are
(different ways of describing) the same equivalence class, as are [4] and [9]. Our
definition says that [2] + [4] = [6] and [7] + [9] = [16]. But of course [6] = [16]
and so there is no contradiction here.
So much for waffle: here is what one might write to prove that addition is
well-defined in Zn.
Proof. Let [a] = [a′] and [b] = [b′] so that a ∼ a′ and b ∼ b′. Then there
exist k, l ∈ Z such that a′ − a = kn and b′ − b = ln. Then (a′ + b′) − (a + b) =
(a′ − a) + (b′ − b) = (k + l)n and so a + b ∼ a′ + b′ and hence [a + b] = [a′ + b′].
□
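In Python the check is one line, since a % n picks out the canonical representative of [a]. A sketch with n = 5, where [2] = [7] and [4] = [9]:

```python
n = 5

def cls(a):
    # canonical representative of the equivalence class [a] in Zn
    return a % n

# Adding different representatives of the same classes must land in the
# same class: [2] + [4] = [6] and [7] + [9] = [16], and [6] = [16].
print(cls(2 + 4) == cls(7 + 9))  # True
```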
5.1.1 moral The point of well-definedness in general is this. When we define
an operation on, or a map from, a quotient, we often have to define its action
on the equivalence class [x] by giving some formula involving x directly. In that
case we must check that if [x] = [y] then the formula applied to x gives the
same result as the formula applied to y.
5.2 the “first isomorphism theorem (FIT) for sets”
This is important as a paradigm for the first isomorphism theorems for vector
spaces and groups which we will study later. While FIT for vector spaces and
groups are fundamental results that appear in many books, FIT for sets is mainly
important as a prototype for them.
5.2.1 definition Let ∼ be an equivalence relation on S. The map p : S → S/∼
defined by p : x ↦ [x] is called the canonical surjection.
5.2.2 note The fact that p is a surjection might be regarded as a very easy
theorem, but we will take it to be so obvious that it is part of the definition.
5.2.3 theorem Let R ⊆ S be a set of representatives. Then the restriction
p|R : R → S/∼ is a bijection.
Proof. Trivial consequence of the definition of a set of representatives. □
5.2.4 note Finding a set of representatives involves making a choice, which
in most situations is to some degree arbitrary. The quotient is a sort of “gener-
alised, abstract set of representatives” that avoids this arbitrariness.
5.2.5 theorem (“FIT for sets”) Let f : X → Y be surjective. Define the
equivalence relation ∼ on X by a ∼ b if and only if f(a) = f(b). Then there is
a bijection f̄ : X/∼ → Y such that f̄ ◦ p = f, where p is the canonical surjection
X → X/∼.
5.2.6 note We often express the last condition by saying that the diagram

        f
   X −−−−→ Y
   p ↓   ↗ f̄
   X/∼

commutes. When we say that a diagram of objects connected by maps com-
mutes we mean that if there are two different routes (following the arrows) from
one object to another, then both routes give the same answer.
Proof.
• First we define our map. Let f̄([x]) := f(x). Suppose [a] = [b]. Then by
definition of ∼ we have f(a) = f(b) and so f̄([a]) = f̄([b]) and so f̄ is
well-defined.
• Let f̄([a]) = f̄([b]). Then f(a) = f(b) and so a ∼ b and so [a] = [b] and
hence f̄ is injective.
• Let y ∈ Y. Then there exists x ∈ X such that f(x) = y since f is
surjective. Then f̄([x]) = f(x) = y and so f̄ is surjective.
• Since f̄ is surjective and injective it is a bijection.
• Let x ∈ X. Then
(f̄ ◦ p)(x) = f̄(p(x)) = f̄([x]) = f(x).
Thus f̄ ◦ p = f.
□
5.2.7 corollary In the statement of the theorem, the assumption that f is
surjective can be dropped provided the conclusion is changed to claim that f̄ is
a bijection from X/∼ to the image of f.
5.2.8 example Consider the equivalence relation on R3 given by x ∼ y if and
only if |x| = |y|. (Two vectors are equivalent if they have the same modulus.)
This equivalence relation arises from the surjection f : R3 → [0,∞) where
f : x 7→ |x|. Thus we have a bijection R3/∼ → [0,∞). In other words, the
points in [0,∞) label the equivalence classes of ∼.
problems
done?problem 5.1 Prove that the operation of multiplication is well-defined in Zn.
done?problem 5.2 + Hand-in for tutorial Consider the equivalence relation
x ∼ y ⇐⇒ |x| = |y| on Z. Use the absolute value function (i.e. the modulus
function) and FIT for sets to deduce that the quotient Z/ ∼ can be identified
with N ∪ {0}.
done?problem 5.3 Consider the equivalence relation ∼ on the set X = {(a, b) | a, b ∈ Z and b ≠ 0} as in the problem for lecture 4. Show that setting
[(a, b)] + [(c, d)] = [(ad+ bc, bd)], [(a, b)] ∗ [(c, d)] = [(ac, bd)]
is a well-defined “addition” and “multiplication” on X/ ∼.
done?problem 5.4 This continues from the previous problem. Define a map f :
X → Q (where Q is the rational numbers) by f : (a, b) 7→ a/b. Use FIT
for sets to deduce that X/ ∼ can be identified with Q. Note by the way that
the previous exercise can be taken to be a definition of Q and its arithmetic
operations which does not use the idea of fractions or real numbers.
done?problem 5.5 Let S1 denote the unit circle, thought of as the unit-modulus
complex numbers. Consider the map f : R → S1 defined by f : x 7→ e^{2πxi}. Show
that the equivalence relation on R defined by u ∼ v ⇐⇒ f(u) = f(v) is that
u and v are equivalent if and only if they differ by an integer. Use FIT for sets
to show that R/∼ can be identified with S1.
lecture 6 fields and n-dimensional space
6.1 introduction
The theorems of linear algebra use only some basic algebraic properties of the
scalars. In year 2 we considered the case of real and complex vector spaces and
noticed that generally the definitions, theorems and proofs worked in the same way
for both. The idea of a field is that it is a set of “numbers” that obey the
same algebraic rules as R and C and that are therefore usable as “scalars” for
a vector space.
A “field” then is a set of things that you can add, subtract, multiply and
divide and the rules of algebra are just like those for real or complex numbers.
Familiar examples are
Q (the rational numbers), R (the real numbers), C (the complex numbers).
6.2 definitions and properties
6.2.1 definition (for completeness only - does not need to be memorised) A
field is a set F of objects on which two commutative operations are defined.
These are addition (+) and multiplication (usually just denoted by juxtaposi-
tion). They must obey the following axioms.
• Under addition, F is a commutative (sometimes called “abelian”) group.
In particular there is an additive identity (“zero”) such that a+ 0 = a for
all a ∈ F and every element a must have an additive inverse -a with the
property that a+ (−a) = a− a = 0.
• Let F∗ denote the set of all non-zero elements of F. Then F∗ is a com-
mutative group under multiplication. There is a “multiplicative identity”
1 such that 1a = a for all a ∈ F and every a ∈ F∗ has to have a
multiplicative inverse a−1 with the property that aa−1 = a/a = 1.
• The addition and multiplication satisfy the distributive law: a(b + c) =
ab + ac.
6.2.2 note The notions of modulus (of real or complex numbers) and in-
equalities (such as 2 < 3 in the real numbers) have no analogue in fields in
general.
6.2.3 examples
• Q,R,C are fields.
• Z and R[x] (the set of all polynomials in a variable x) are not fields.
(Most of the elements do not have multiplicative inverses — this is the
most common reason why a set of objects that can be commutatively
added and multiplied fail to be a field.)
• Let Zp denote the integers mod p where p is a prime. Then Zp is a field.
(In particular Z2 = {0, 1} is the smallest possible field.)
The reason Zn is not a field if n is not prime is that one does not have multi-
plicative inverses: e.g. there is no k ∈ Z4 such that 2k = 1 mod 4 so 2 has no
multiplicative inverse. On the other hand, suppose a ∈ Zp where p is prime.
Then gcd(a, p) = 1 and so there exist integers k, l such that ka + lp = 1
(Euclidean algorithm!). Then k is a multiplicative inverse for a.
6.2.4 example In Z7 we have 3 × 5 = 15 = 1 mod 7. So 1/3 = 5 and
1/5 = 3.
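The Euclidean-algorithm argument above is easy to turn into code. Here is a Python sketch (ours; the function names are our own) that finds inverses in Zp via the extended Euclidean algorithm:

```python
def ext_gcd(a, b):
    """Return (g, k, l) with g = gcd(a, b) and g = k*a + l*b."""
    if b == 0:
        return a, 1, 0
    g, k, l = ext_gcd(b, a % b)
    return g, l, k - (a // b) * l

def inverse_mod(a, p):
    """Multiplicative inverse of a in Z_p (requires gcd(a, p) = 1)."""
    g, k, _ = ext_gcd(a % p, p)
    assert g == 1, "a and p must be coprime"
    # k*a + l*p = 1, so k*a = 1 mod p and k is the inverse of a
    return k % p

print(inverse_mod(3, 7), inverse_mod(5, 7))  # 5 3, matching 3*5 = 15 = 1 mod 7
```

For a field as small as Z7 plain experimenting (as problem 6.1 below suggests) is quicker, but this is the method that scales to large primes.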
6.3 subfields
6.3.1 definitions
• A subset K ⊆ F is a subfield if K is a field in its own right with the same
operations as F. In other words, it is a subset of elements of F that includes
both zero and one and that is closed under addition, multiplication and
the taking of additive and multiplicative inverses.
• To check whether K ⊆ F is a subfield you must check that if a, b are in K
then so are a+b, ab,−a and (if a is non-zero) a−1. (Strictly, one needs
also to check that K contains a non-zero element, otherwise we might
have the empty set or {0}.)
6.3.2 examples
• Clearly Q is a subfield of R which is in turn a subfield of C.
• Z3 = {0, 1, 2} is not a subfield of R. Although 0, 1 and 2 are real numbers,
the arithmetic operations in Z3 are not those in R. (Strictly, we should
be writing e.g. “[2]” rather than “2” for the element of Z3. The element
[2] ∈ Z3 is an equivalence class of integers and not at all the same thing
as the real number 2.)
• Q[√2] := {a + b√2 | a, b ∈ Q} is a subfield of R. (The proof is one of
the problems.)
6.4 n-dimensional space over a field
6.4.1 definition Let F be a field. Then

Fn := {(x1, . . . , xn)^T | x1, . . . , xn ∈ F},

the set of all column vectors of height n with entries in F.
6.4.2 notes
• This agrees with the usual definition of Rn and Cn. (In this course we
will always think of n-dimensional space as having column vectors as
elements.)
• Z_p^n has just p^n vectors.
6.4.3 observation All the basic ideas of vector spaces apply to Fn. For ex-
ample:
• In Z_3^2 the subset
U := {x | x1 + 2x2 = 0}
forms a 1-dimensional subspace. (It contains precisely 3 vectors.)
• Let a := (1, 1, 0)^T and b := (1, 1, 1)^T be vectors in Z_2^3. Their span
is a 2-dimensional subspace of Z_2^3. This contains precisely 4 vectors.
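Over a finite field a span can simply be enumerated. A small Python check (our illustration, not part of the notes) of the example above, for a = (1,1,0) and b = (1,1,1) in Z_2^3:

```python
from itertools import product

p = 2
a = (1, 1, 0)
b = (1, 1, 1)

# all linear combinations la*a + mu*b with la, mu running over Z_2
span = {tuple((la * ai + mu * bi) % p for ai, bi in zip(a, b))
        for la, mu in product(range(p), repeat=2)}

print(len(span))  # 4: the zero vector, a, b and a + b = (0, 0, 1)
```

The count 4 = 2^2 reflects the general fact noted above: a d-dimensional subspace over Z_p contains exactly p^d vectors.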
problems
done?problem 6.1 + Hand-in for tutorial Find the multiplicative inverses
of the non-zero elements in Z7. (Just experimenting is probably easier than
using the Euclidean algorithm.)
done?problem 6.2 Show that if L ⊆ K is a subfield then 1, 1 + 1, 1 + 1 + 1, . . . are
all elements of L. (It is tempting to call these 1, 2, 3, . . . but note that (e.g. in
Zp) they are not necessarily all distinct.) Deduce that Zp does not have
any subfields (other than itself).
What do you think the smallest subfield of R is?
done?problem 6.3 + Hand-in for tutorial Do the following equations have
solutions in the fields C,R,Q,Z3,Z2?
x2 + 1 = 0, x2 − x− 1 = 0
Note: what this means in each case is this: is there an element of the given
field such that if you substitute it in to this equation and do all the arithmetic
in that field then you get zero? The answers for the first three fields should be
easy from elementary background knowledge. The last two fields have very few
elements and so you can just experiment.
done?problem 6.4 Show that Q[√2] (definition in notes) is a subfield of R.
done?problem 6.5 Find all the vectors in Z_3^2 that are scalar multiples of
a = (1, 2)^T.
done?problem 6.6 Find the vectors in the subspaces in §6.4.3.
lecture 7 vector spaces — revision
7.1 setting
Vector spaces over a general field F.
7.2 generalizing to arbitrary fields
Almost everything in this lecture should be familiar from Year 2 Linear Algebra.
The difference here is that we are allowing the field to be arbitrary. We will not
prove the results again because the same proofs that work for R and C work
for general fields.
7.3 vector spaces and subspaces
7.3.1 the idea of a vector space A vector space V over a field F (also called
an “F-vector space”) is a set of objects (“vectors”) such that if u, v ∈ V then we
can form their sum u+v ∈ V and if λ ∈ F (we call elements of F “scalars”) then
we can form λv ∈ V. Furthermore these operations obey the familiar algebraic
properties of the corresponding operations in Fn.
7.3.2 proper definition (details do not need to be memorised) What we re-
quire in detail is that V is an Abelian (i.e. commutative) group under the op-
eration of addition of vectors, with an identity element 0 ∈ V. The scalar
multiplication should be compatible with the group operation in the following
ways: for all vectors u, v ∈ V and scalars λ, µ ∈ F we have
• λ(µv) = (λµ)v
• 1v = v
• λ(u+ v) = λu+ λv
• (λ+ µ)v = λv+ µv
7.3.3 examples
• For n ∈ N the “standard n-dimensional space over the field F” is Fn, the
space of all column vectors of height n with entries in F.
• More generally, let V be the set of all m × n matrices with entries in
F. Then this is a vector space over F. (When we take this view, we are
forgetting all the other things we might do with matrices and using only
the fact that matrices can be added (if they are of the same size) and
multiplied by scalars.)
• Let F[x] denote the space of all polynomials in a variable x with coefficients
in F. This is a vector space over F.
• Let K ⊆ F be a subfield. Then we can regard F as a vector space over K.
For example:
– R ⊆ C and so we can regard C as a vector space over R. It is
2-dimensional — a fact that is apparent every time we draw the
complex plane.
– Q ⊆ R and so we can regard R as a vector space over Q. This is in
fact infinite-dimensional.
– Q[√2] is a vector space over Q. It is 2-dimensional.
7.3.4 definition The non-empty subset U ⊆ V of the F-vector space V is a
(vector) subspace if for all x, y ∈ U and λ, µ ∈ F we have
λx+ µy ∈ U
7.3.5 examples
• The set {x ∈ Fn | λ1x1 + · · ·+ λnxn = 0} defines a subspace of Fn.
• Consider the set V = {iy |y ∈ R} ⊆ C. If you regard C as a 2-dimensional
vector space over R then this is a subspace. If on the other hand you
regard C as a 1-dimensional complex vector space, it is not. (Note that V
is closed under vector addition. The difference arises because V is closed
under multiplication by real scalars but not complex ones.)
• The set V = {P ∈ F[x] | P(x) = P(−x)} is a subspace of F[x].
7.4 span
7.4.1 definition Let S be a subset of a vector space V. A linear combination
of elements of S is a finite sum
λ1v1 + · · ·+ λnvn where λj ∈ F, vj ∈ S, n ∈ N
7.4.2 idea The span of a set S of vectors in a vector space V is the smallest
subspace of V that contains all the vectors in S. It is easier to work with the
following.
7.4.3 definition Let S be a subset of a vector space V. Then the span Span(S)
of S is the set of all linear combinations of elements of S. That is,
Span(S) = {λ1v1 + · · ·+ λnvn | λj ∈ F, vj ∈ S, n ∈ N}.
We set the span of the empty set {} to be {0} by convention.
If S = {v1, . . . , vk} is a finite set of vectors then the span is just
Span(S) = {λ1v1 + · · ·+ λkvk | λj ∈ F}.
7.4.4 examples
• In R3 the span of the vectors (1, 1, 0)^T and (1, 0, 0)^T is the subspace
defined by x3 = 0.
• In P3(R) (real polynomials in x of degree ≤ 3) the span of {x, x^3} is the
subspace consisting of all the odd polynomials.
• For any vector space, Span(V) = V.
• If U ⊆ V is a subspace then Span(U) = U. (In fact this is an if and only
if and so this condition characterizes subspaces.)
7.4.5 theorem
• If S ⊆ V is a subset then Span(S) is a subspace of V.
• Let U be a subspace of V and let S ⊆ U be a subset. Then Span(S) ⊆ U.
Combined, these results make sense of Span(S) being the smallest subspace
containing S.
7.5 linear independence, bases
7.5.1 linear independence
• A set S of vectors in a vector space V is linearly dependent if there exist
n ∈ N and distinct vectors x1, . . . , xn ∈ S such that
λ1x1 + · · ·+ λnxn = 0
where the scalars λ1, . . . , λn ∈ F are not all zero.
• If S is not linearly dependent then we say it is linearly independent.
• A set S of vectors is linearly dependent if and only if there is a vector in
the set which is in the span of the other elements of S.
7.5.2 bases and dimension
• A basis for a vector space V (which may be a subspace of some larger
vector space) is a set S ⊆ V of vectors which is linearly independent and
which spans V.
• If V has a basis consisting of a finite number n of elements of V then we
say V has dimension n. Otherwise, we say V has infinite dimension.
7.5.3 coordinates Let V be n-dimensional and let u1, . . . , un be a basis for
V and let x ∈ V be given. Then there exist unique scalars λ1, . . . , λn ∈ F (called
the coordinates of x in the basis) such that
x = λ1u1 + · · ·+ λnun.
The coordinate matrix of x with respect to the basis is the column matrix
(λ1, . . . , λn)^T.
7.6 sums and intersections of subspaces
7.6.1 definition Let U,V be subspaces of an F-vector space X. Then
W = U+ V = {u+ v |u ∈ U, v ∈ V}
is a subspace of X. If U ∩ V = {0} then we say that the sum is direct and we
write W = U⊕V.
7.6.2 theorems
• If W = U ⊕ V (i.e. the sum is direct) then every vector w ∈ W can be
written in one and only one way as w = u + v with u ∈ U and v ∈ V.
• The intersection of two subspaces is itself a subspace.
• If U+ V is finite-dimensional then
dim(U+ V) = dimU+ dimV − dim(U ∩ V).
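Over a finite field the dimension formula can be checked by counting: a d-dimensional subspace of a Z_2-vector space has exactly 2^d elements, so the formula is equivalent to |U + V| · |U ∩ V| = |U| · |V|. A Python spot check (our illustration; the particular subspaces are an arbitrary choice):

```python
from itertools import product

p = 2
dim = 3

def span(gens):
    """All Z_2-linear combinations of the generating vectors."""
    out = set()
    for coeffs in product(range(p), repeat=len(gens)):
        v = tuple(sum(c * g[i] for c, g in zip(coeffs, gens)) % p
                  for i in range(dim))
        out.add(v)
    return out

U = span([(1, 0, 0), (0, 1, 0)])  # a 2-dimensional subspace of Z_2^3
V = span([(0, 1, 1)])             # a 1-dimensional subspace
U_plus_V = {tuple((u[i] + v[i]) % p for i in range(dim)) for u in U for v in V}
U_cap_V = U & V

# |U+V| * |U∩V| == |U| * |V|, i.e. dim(U+V) + dim(U∩V) = dim U + dim V
print(len(U_plus_V), len(U_cap_V))  # 8 1
assert len(U_plus_V) * len(U_cap_V) == len(U) * len(V)
```

Here dim(U + V) = 3 and dim(U ∩ V) = 0, in agreement with 2 + 1 − 0 = 3.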
problems
done?problem 7.1 + Hand-in for tutorial In Z_2^3 find all the vectors in
Span(x, y) where x = (1, 1, 0)^T, y = (0, 1, 1)^T.
done?problem 7.2 Find all the vectors in the subspace V ⊆ Z_2^3 given by V = {x ∈ Z_2^3 | x1 + x2 + x3 = 0}.
done?problem 7.3 Show that in Z_5^3 the vectors (1, 1, 3)^T, (2, 0, 2)^T, (4, 3, 0)^T
are linearly dependent.
done?problem 7.4 Give a basis of the subspace of Z_5^3 defined by the equation
x1 + x2 + x3 = 0. What is the dimension of this subspace? How many vectors
are there in this subspace? Find the coordinate matrix of the vector
v = (4, 3, 3)^T
in your chosen basis.
done?problem 7.5 + Hand-in for tutorial Complete the following sentence
without using any terms from linear algebra. “If R were finite-dimensional as a
vector space over Q it would mean that there exist a finite number r1, . . . , rn
of . . . such that every . . . could be written as . . . ”
done?problem 7.6 + Hand-in for tutorial There are seven different non-
zero vectors in Z_2^3 and hence (since the only scalars are {0, 1}) there are seven
different 1-dimensional subspaces. There are also seven different linear equations
of the form
λ1x1 + λ2x2 + λ3x3 = 0, λj ∈ Z2
apart from the trivial one with all the λj being zero. Each
of these defines a different 2-dimensional subspace of Z_2^3.
In the picture there are seven blobs, and seven lines
(six straight together with a circle). Label each blob
with a different 1-dimensional subspace and each line
with a 2-dimensional subspace in such a way that a
blob is on a line iff the 1-dimensional subspace lies in-
side the 2-dimensional one (i.e. iff the vector satisfies
the equation). (Note by the way that this config-
uration has the property that through every pair of
points there is a unique line and every pair of lines
meet in precisely one point. It is an example of a
“finite projective plane”.)
done?problem 7.7 Challenge How many 2-dimensional subspaces does Z_2^4 have? You might want to think along the lines of defining such a subspace
by choosing a non-zero vector and then choosing another vector that is not a
multiple of it — their span determines a subspace. Count how many ways there
are of doing this and then work out how many times each subspace has been
counted.
lecture 8 linear maps — revision
8.1 setting
Linear maps T : U → V where U,V are finite-dimensional vector spaces over
the same field F.
8.2 generalizing to arbitrary fields
Almost everything in this lecture should be familiar from Year 2 Linear Algebra.
The difference here is that we are allowing the field to be arbitrary. We will not
prove the results again because the same proofs that work for R and C work
for general fields.
8.3 ideas
Whenever you have defined a structure (such as “vector space”), you go on to
consider maps that preserve that structure. Vector spaces are sets which have
addition and scalar multiplication defined. Thus the relevant maps are those
that preserve these operations. That is the real content of the definition of
“linear map” below.
An isomorphism (see below) is a linear map T : U → V that is also a
bijection (i.e. 1-1 and onto); thus it just matches up the elements of U and V
in a way that respects the vector space operations.
8.4 definitions
8.4.1 definition Let U,V be vector spaces over the same field F. Then the
map T : U → V is a linear map or a homomorphism of vector spaces if for all
x, y ∈ U and all λ, µ ∈ F
• T(λx+ µy) = λTx+ µTy.
8.4.2 kernel and image
• The kernel of T as above is
ker T = {u ∈ U | Tu = 0}
which is a subspace of U.
• The image of T is
im T = {v | v = Tu for some u ∈ U}
which is a subspace of V.
• The rank of T is the dimension of im T .
• The Rank Theorem (a.k.a. Rank-Nullity Theorem) states that
dim ker T + dim im T = dimU.
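Over a finite field the Rank Theorem can be seen by counting: a d-dimensional subspace of Z_5^m has 5^d vectors, so the theorem says |ker T| · |im T| = 5^{dim U}. A brute-force Python illustration (ours, not from the notes; the matrix is an arbitrary choice):

```python
from itertools import product

p = 5
A = [[1, 2, 0],
     [0, 1, 4]]  # an arbitrary 2 x 3 matrix over Z_5 (our choice)

def apply_map(A, x):
    """The linear map T : Z_5^3 -> Z_5^2, x |-> Ax, computed mod 5."""
    return tuple(sum(a * xi for a, xi in zip(row, x)) % p for row in A)

domain = list(product(range(p), repeat=3))  # all 125 vectors of Z_5^3
kernel = [x for x in domain if apply_map(A, x) == (0, 0)]
image = {apply_map(A, x) for x in domain}

print(len(kernel), len(image))             # 5 25: dim ker = 1, rank = 2
assert len(kernel) * len(image) == p ** 3  # 5^(dim ker + rank) = 5^(dim U)
```

Here dim ker T + dim im T = 1 + 2 = 3 = dim Z_5^3, as the theorem requires.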
8.4.3 inverses
• A linear map T : U→ V is injective (i.e. 1-1) iff ker T = {0}.
• A bijective (equivalently, “invertible”) linear map T : U → V is called an
isomorphism of vector spaces.
8.4.4 composition
• If T : U→ V and S : V →W are linear maps then the composition
S ◦ T : U→W, (S ◦ T)x := S(T(x))
is also a linear map.
• If also S and T are both invertible (i.e. both isomorphisms) then so is S ◦ T and (S ◦ T)−1 = T−1 ◦ S−1.
8.5 linear maps Fn → Fm
8.5.1 matrices
• Linear maps Fn → Fm are given by matrices — in other words given such
a linear map T there exists an m×n matrix A such that the map is given
by T : x 7→ Ax.
• For a linear map T : Fn → Fm given by T : x 7→ Ax where A is a matrix,
the j-th column of A is the image under T of the j-th standard basis vector in Fn.
Hence, the image of T is the span of the columns of A. We define the
rank of a matrix to be the dimension of the span of its columns, and so
the rank of a matrix is equal to the rank of the corresponding linear map.
• Composition of maps corresponds to multiplication of matrices: if T :
Fn → Fm is given by T : x 7→ Ax and S : Fm → Fp is given by S : y 7→ By
then the composition S ◦ T : Fn → Fp is given by S ◦ T : x 7→ BAx.
8.6 bases as isomorphisms — not revision
8.6.1 theorem Let u1, . . . , un be a basis for V. Then
S : V → Fn, S : x 7→ (x1, . . . , xn)^T
(where x1, . . . , xn are the coordinates of x in the basis) is an isomorphism.
Proof. That the map is a bijection is the existence and uniqueness of coordi-
nates with respect to a basis. The fact that S is linear expresses the fact that
adding and scalar multiplying vectors in V corresponds to the same operations
on the coordinate matrices. □
8.6.2 note In fact a basis is exactly the same thing as an isomorphism S :
V → Fn. Given such an isomorphism, the basis consists of the vectors in V that
map to the standard basis vectors in Fn.
8.6.3 idea The above expresses the underlying idea of bases: a basis is just
a choice of an identification of the vector space with the standard vector space
Fn.
8.7 coordinates — not all revision
8.7.1 definition (revision) Let T : U → V be a linear map between finite-
dimensional vector spaces. Let f1, . . . , fn be a basis for U and g1, . . . , gm be a
basis for V. The matrix of T with respect to these bases is the matrix A whose
k-th column is the coordinate matrix of the vector Tfk ∈ V with respect to the
basis g1, . . . , gm.
8.7.2 idea The bases identify U,V with Fn, Fm respectively (as discussed
above) and identify T with a linear map given by matrices. The situation is
encapsulated by the following commutative diagram (i.e. a diagram where if
there are two routes from one place to another following the arrows, the maps
are equal). The vertical maps are the isomorphisms given by the bases.

         T
   U −−−−−→ V
   ↓          ↓
   Fn −−−−−→ Fm
      x 7→ Ax

problems
done?problem 8.1 + Hand-in for tutorial Consider the linear map T :
Z_2^3 → Z_2^3 with matrix
1 1 0
1 0 1
0 1 1
Find all the vectors in ker T and im T.
done?problem 8.2 How many linear maps T : Z_2^2 → Z_2^2 are there? How many of
them are invertible? (Hint: equivalently, how many 2 × 2 matrices are there
with entries in Z2 and how many have inverses? Remember that a 2 × 2 matrix
has an inverse if its first column is non-zero and the second column is not a
multiple of the first.)
Let A be such an invertible matrix and let
e1 = (1, 0)^T, e2 = (0, 1)^T, e3 = (1, 1)^T.
Show that Aej ≠ Aek unless j = k and that Aej ≠ 0. Deduce that multi-
plication by A permutes the three non-zero vectors in Z_2^2. Find explicitly the
permutation given by each of the six invertible matrices.
done?problem 8.3 Let Pj[Z3] denote the vector space of polynomials of degree ≤ j with coefficients in Z3.
1. State the dimension of Pj[Z3].
2. Show that 1 + x, x + x^2, x^2 is a basis for P2[Z3].
3. Show that T : P2[Z3] → P3[Z3] where T : p(x) 7→ (x + 2)p(x) is a linear
map.
4. Calculate the matrix of T with respect to the basis above for P2[Z3] and
the basis 1, x, x^2, x^3 for P3[Z3].
lecture 9 invariant subspaces and block matrices
9.1 setting
Linear maps V → V where V is a (normally finite-dimensional) vector space
over a field F.
9.2 basic ideas
9.2.1 definition Let T : V → V be a linear map. The subspace U ⊆ V is
invariant under T if T(U) ⊆ U.
9.2.2 examples
• V ⊆ V is always an invariant subspace.
• {0} ⊆ V is always an invariant subspace. (Recall that for linear maps we
have T(0) = 0 always.)
• ker T ⊆ V is an invariant subspace of T : V → V.
9.2.3 definition Let T : V → V be a linear map and let λ ∈ F. Define
Vλ := {v ∈ V | Tv = λv}.
If Vλ ≠ {0} then we say λ is an eigenvalue of T and Vλ is the corresponding
eigenspace. Nonzero elements of Vλ are the eigenvectors of T (with the given
eigenvalue).
9.2.4 theorem The eigenspaces Vλ are invariant subspaces of V.
Proof. That they are subspaces is an easy exercise. Now suppose v ∈ Vλ.
Then Tv = λv ∈ Vλ and so Vλ is invariant. □
9.2.5 note If U ⊆ V is an invariant subspace, then we can restrict T to obtain
a linear map T |U : U→ U. (We will usually abuse our notation and write simply
T for this map.) In the case where U = Vλ is an eigenspace, T |U = λI (where,
as always, I denotes the identity linear map I : x 7→ x).
9.3 block matrices
9.3.1 definition Let M be an n × n matrix. Given k with 1 ≤ k < n, we can
divide our matrix M into blocks A, B, C, D so that

M = ( A  B
      C  D ).

Here, A is a k × k matrix, D is an (n − k) × (n − k) matrix and the other two
are the sizes they have to be. We say that M is written in block form. In some
books, block form is indicated by separating the blocks with dotted or dashed
lines.
9.3.2 theorem Suppose M, M′ are two n × n matrices both written in block
form

M = ( A  B        M′ = ( A′  B′
      C  D ),            C′  D′ )

with A and A′ of the same size so that also D and D′ are of the same size.
Then the product in block form is given by

MM′ = ( AA′ + BC′   AB′ + BD′
        CA′ + DC′   CB′ + DD′ ).

(In other words, the blocks multiply as though they were scalars BUT the terms
in products must be kept in the same order.)
Proof. It is easy to convince yourself that this is true; a formal proof is not very
enlightening. □
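One can also convince oneself numerically. A Python sketch (our illustration, not part of the notes) comparing the top-left block AA′ + BC′ with the corresponding block of the directly computed product, using plain nested lists:

```python
def matmul(X, Y):
    """Ordinary matrix product of nested-list matrices."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def blocks(M, k):
    """Split an n x n matrix into A (k x k), B, C, D."""
    A = [row[:k] for row in M[:k]]
    B = [row[k:] for row in M[:k]]
    C = [row[:k] for row in M[k:]]
    D = [row[k:] for row in M[k:]]
    return A, B, C, D

M1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
M2 = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
k = 1
A, B, C, D = blocks(M1, k)
A2, B2, C2, D2 = blocks(M2, k)

# top-left block of the product, computed two ways
top_left = matadd(matmul(A, A2), matmul(B, C2))  # AA' + BC'
direct = [row[:k] for row in matmul(M1, M2)[:k]]
assert top_left == direct
print(top_left)  # [[30]]
```

The other three blocks can be checked the same way; note the code never swaps the order of the factors inside a block product.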
9.3.3 extensions
• One can write non-square matrices in block form and one can consider
cases where the diagonal blocks are not square.
• There is a general rule here that is hard to give precise form to: in a
product of matrices in block form, if the blocks match up precisely so that
all the matrix multiplications of the blocks make sense, then the product
can be computed as above treating the blocks as though they were scalars
(but maintaining the order of products). For example, with M as above
and a column vector x divided into two blocks u, v of height k, n − k
respectively we have

Mx = ( A  B   ( u        ( Au + Bv
       C  D )   v )  =     Cu + Dv ).

• All the above generalizes to cases where matrices are divided into more
than four blocks.
9.4 relation with invariant subspaces
9.4.1 definition Let M be in block form as in §9.3.1.
• If C = 0 we say that M is block upper-triangular.
• If B = 0 we say that M is block lower-triangular.
• If B = C = 0 we say that M is block diagonal.
9.4.2 theorem Let T : V → V be a linear map. Then T has a k-dimensional
invariant subspace if and only if there exists a basis for V such that the matrix
of T is block upper-triangular with the top-left block of size k× k.
Proof. Suppose first there exists a basis v1, . . . , vn such that the matrix
of T is block upper-triangular with the top-left block of size k × k. Then
Span(v1, . . . , vk) is an invariant subspace.
Conversely, suppose U ⊆ V is an invariant k-dimensional subspace. Choose
a basis v1, . . . , vk for U and extend to obtain a basis for V (always possible by
Year 2 Linear Algebra). Then with respect to this basis the matrix of T is block
upper-triangular. □
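The first half of the proof is easy to see in coordinates. A small Python check (ours; the matrix is an arbitrary example) that a block upper-triangular matrix (C = 0) maps U = Span(e1, . . . , ek) into itself:

```python
# A block upper-triangular matrix: the C block (bottom-left) is zero, k = 2.
k = 2
M = [[1, 2, 5],
     [3, 4, 6],
     [0, 0, 7]]

def apply_map(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

# vectors of U = Span(e1, e2) are zero in every coordinate past position k
for x in ([1, 0, 0], [0, 1, 0], [2, -1, 0]):
    y = apply_map(M, x)
    assert all(c == 0 for c in y[k:])  # the image lies back in U
print("Span(e1, e2) is invariant under M")
```

The zero C block is exactly what forces the last n − k coordinates of Mx to vanish whenever those of x do.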
9.4.3 note The block A in our block upper-triangular matrix is the matrix of
T |U : U→ U in the basis v1, . . . , vk of U.
9.4.4 corollary If Vλ is a k-dimensional eigenspace of T then there exists a
basis for V such that the matrix of T is block upper-triangular with the leading
diagonal block being λI.
9.4.5 theorem Let T : V → V be a linear map. Then V is the direct sum of
two invariant subspaces U,U ′ of dimensions k, n− k if and only if there exists
a basis of V for which the matrix of T is block diagonal with A being k×k and
D being (n− k)× (n− k).
Proof. Similar to the previous theorem - the basis is such that v1, . . . , vk is a
basis for U and vk+1, . . . , vn is a basis for U′. □
9.5 flags
9.5.1 definition A flag in an n-dimensional vector space V is a collection of
subspaces
0 = V0 ⊂ V1 ⊂ · · · ⊂ Vn−1 ⊂ Vn = V, such that dimVk = k, k = 1, . . . , n.
9.5.2 definition Let T : V → V be a linear map. A flag in V is invariant if
each Vk, k = 1, . . . , n is an invariant subspace for T .
9.5.3 theorem Let T : V → V be a linear map. Then there exists an invariant
flag in V if and only if there exists a basis for V such that the matrix of T is
upper-triangular.
Proof. Same idea as before - the basis in this case is such that v1, . . . , vk is a
basis for the subspace Vk in the flag. □
problems
done?problem 9.1 Let T : V → V be linear. Prove ker T is an invariant subspace.
done?problem 9.2 Consider rotations (about the origin) and reflections (in lines
through the origin) in the plane. Describe all 1-dimensional invariant subspaces.
done?problem 9.3 Show that every 1-dimensional invariant subspace is the span of
an eigenvector.
done?problem 9.4 Consider matrix multiplication of block upper-triangular matri-
ces. Using the notation of §9.3.1, show that M is invertible if and only if the
blocks A and D are invertible.
done?problem 9.5 + Hand-in for tutorial Show that T : V → V has an
invariant subspace of dimension l if and only if V has a basis with respect to
which the matrix of T is block lower-triangular.
Deduce that if the matrix M is n × n block upper-triangular and P is n × n
such that Pij = 1 when i + j = n + 1 and zero otherwise, then P−1MP is block
lower-triangular.
done?problem 9.6 Let T : V → V be a linear map and suppose there is a flag {Vk}
in V such that T(Vk) ⊆ Vk−1 for k = 1, . . . , n. Show that T^n = 0.
Show that there exists such a flag for T if and only if there exists a basis
for V such that the matrix of T is strictly upper-triangular (meaning i ≥ j =⇒ Tij = 0).
lecture 10 quotients and the 1st isomorphism theorem
10.1 setting
Vector spaces over a field F, which may be assumed to be finite-dimensional
(and needs to be when we make statements about dimension).
10.2 introduction
The Maple command “series( sin(x)/(x-1) , x=0 );” produces the output

−x − x^2 − (5/6)x^3 − (5/6)x^4 − (101/120)x^5 + O(x^6).

Maple is “working to order x^5”, neglecting terms involving higher powers of x.
One way of expressing this is as follows. Let X be the (infinite-dimensional)
vector space of “formal power series” - the set of all expressions a0 + a1x + a2x^2 +
· · · without worrying whether they converge or not. Let V be the subspace of
those formal power series whose first six coefficients are zero. Now define an
equivalence relation on X by p ∼ q iff p − q ∈ V. (In other words, two series
are equivalent if they agree up to and including the term in x^5.) Then when we
“work to order x^5” we are really working in the quotient X/∼. This quotient is
clearly a vector space: there is a well-defined addition and scalar multiplication
of such power series.
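The quotient arithmetic is easy to carry out directly: represent an element of X/∼ by its first six coefficients and multiply, discarding everything of order x^6 and higher. A Python sketch (ours, not from the notes) reproducing the Maple expansion of sin(x)/(x − 1):

```python
from fractions import Fraction as F

N = 6  # work in the quotient: keep coefficients of x^0 .. x^5 only

def mul(p, q):
    """Multiply two truncated series (length-N coefficient lists) in X/~."""
    r = [F(0)] * N
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j < N:  # terms of degree >= N lie in V and are discarded
                r[i + j] += F(a) * F(b)
    return r

sin_x = [F(0), F(1), F(0), F(-1, 6), F(0), F(1, 120)]  # sin x to order x^5
geom = [F(-1)] * N                                      # 1/(x-1) = -(1 + x + x^2 + ...)

print([str(c) for c in mul(sin_x, geom)])
# ['0', '-1', '-1', '-5/6', '-5/6', '-101/120']
```

The well-definedness of `mul` on the quotient is exactly the point of this lecture: changing a representative by an element of V only changes terms of degree ≥ 6, which the truncation discards.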
10.3 basic definitions
10.3.1 theorem Let V be a subspace of X. Then x ∼ y if and only if x − y ∈ V defines an equivalence relation on X.
10.3.2 example Let V be the subspace x1 + x2 + x3 = 0 of X = R3. Then for
each d ∈ R the set of all vectors x with x1 + x2 + x3 = d forms an equivalence
class. Each equivalence class is thus a plane parallel to the subspace V.
10.3.3 theorem Let V be a subspace of a vector space X over F and let ∼
denote the equivalence relation x ∼ y ⇐⇒ x− y ∈ V. Then
[x] + [y] := [x+ y] and λ[x] := [λx]
are well-defined and make X/ ∼ into a vector space over F. The zero vector is
[0] = V. We call this vector space the quotient of X by V and denote it by
X/V.
Proof. One must check that these operations are well-defined, i.e. if [u] = [u′]
and [v] = [v′] then [u + v] = [u′ + v′] and λ[u] = λ[u′].
Secondly, one has to check that the axioms are obeyed. For instance
[u] + [v] = [u+ v] = [v+ u] = [v] + [u]
and so vector addition is commutative. The rest are equally trivial and we leave
them to the interested reader. □
10.3.4 theorem Let V ⊆ X be a subspace. Then P : X → X/V defined by
P : x 7→ [x] is a surjective linear map with kernel V.
Consequently, by the Rank Theorem for linear maps, if V is a subspace of
X then
dim(V) + dim(X/V) = dim(X).
Proof. One must check that P is linear, that it is surjective and that it has
kernel V. These are all trivial. □
10.4 FIT for vector spaces
10.4.1 the first isomorphism theorem Let T : U → V be a surjective linear
map. Then there is a canonical linear map
T : U/ ker T → V, where T : [u] ↦ Tu
which is an isomorphism of vector spaces and T = T ◦ P.
10.4.2 note The final condition is equivalent to the fact that the diagram

U −T→ V
P ↓ ↗ T
U/ ker T

commutes.
Proof. First, note that T(x) = T(x ′) ⇐⇒ x− x ′ ∈ ker T ⇐⇒ x ∼ x ′ where
∼ is the equivalence relation that defines the quotient vector space. Thus FIT
for sets proves everything except the fact that T is a linear map. For that:
T(λ[x] + µ[x ′]) = T([λx+ µx ′])
= T(λx+ µx ′)
= λTx+ µTx ′
= λT([x]) + µT([x ′])
�
10.4.3 corollary The condition that T is surjective can be dropped from the
statement of the theorem. In that case, the conclusion is that the canonical
linear map T is an isomorphism of U/ ker T with im T .
10.5 bases
10.5.1 theorem Let V ⊆ X be a k-dimensional subspace of an n-dimensional
vector space X. Let f1, . . . , fn be a basis for X such that f1, . . . , fk are a basis
for V. Then
[fk+1], . . . , [fn]
form a basis for X/V.
Proof. They span X/V since if x ∈ X and x = ∑_{i=1}^{n} λi fi then

[x] = [ ∑_{i=1}^{n} λi fi ] = [ ∑_{i=k+1}^{n} λi fi ] = ∑_{i=k+1}^{n} λi [fi],

since [fi] = 0 for i ≤ k. But dim X/V = n − k and so they form a basis. �
10.5.2 FIT and bases In the situation of FIT, choose a basis for U such that
f1, . . . , fk form a basis for ker T . Choose any basis for V. Then the matrix of T
has block form

(0 A).
(We have a generalization of block form here - the matrix being blocked is not
square and we are blocking into just two blocks.) The matrix A is the matrix
of T : U/ ker T → V with respect to the basis [fk+1], . . . , [fn] of U/ ker T and
the given basis of V.
10.6 complementary subspaces
10.6.1 complementary subspaces A complementary subspace to V ⊆ X is a
subspace W ⊆ X such that X = V ⊕W.
It is not hard to see that a complementary subspace is a set of representatives
for ∼ and so if X = V ⊕W then we can identify X/V with W.
The subspace V will have many complementary subspaces and there may
be no good reason to choose one rather than another. (Unless X has an inner
product, in which case the perpendicular subspace is a natural complement.) The
quotient X/V is an abstractly constructed object that lets one work without
making an arbitrary choice.
problems
done?problem 10.1 Let V ⊆ X be a subspace. Check that x ∼ y ⇐⇒ x − y ∈ V does define an equivalence relation on X.
done?problem 10.2 Check that the addition and scalar multiplication defined in
§10.3.3 is well-defined.
done?problem 10.3 Check that the distributive law (λ(x + y) = λx + λy for all
vectors x, y and a scalar λ) holds in X/V.
done?problem 10.4 Write out a careful proof of the fact that P : X → X/V where
P : x ↦ [x] is linear, surjective and has kernel V.
done?problem 10.5 + Hand-in for tutorial Suppose T : X→ Y is a linear
map and that V ⊆ X is a subspace such that V ⊆ ker T . Define a linear map
T : X/V → Y (checking that the map you have defined is linear) such that the
following diagram commutes. (The vertical map is the usual one.) Find the
dimension of the kernel of T in terms of the dimensions of V and ker T . (Hint:
apply the rank theorem to T and T .)
X −T→ Y
↓ ↗ T
X/V
done?problem 10.6 (Harder!) Let U ⊆ V ⊆ X be subspaces of X.
(a) Show that there is a canonical linear map S : V/U→ X/U.
(b) Show that S is injective.
(c) Show that there is a canonical linear map T : X/U→ X/V. Show that T is
surjective.
(d) Show that ker T = imS.
So, if we identify V/U with its image in X/U (reasonable, since S is injective)
then we can deduce that there is an isomorphism

(X/U) / (V/U) → X/V.
(Notation suggestion: write x ∼U y if x− y ∈ U and write [x]U for the equiva-
lence class under this relation. Similarly for V.)
lecture 11 quotients and linear maps
11.1 setting
Finite-dimensional vector spaces over a field F. For the result that for every
linear map T : V → V there exists a basis with respect to which the matrix of
T is upper-triangular, the field is assumed to be C.
11.2 linear maps of quotients
11.2.1 theorem Let T : X→ X be a linear map and let V ⊆ X be a subspace
such that T(V) ⊆ V. Then there is a canonical linear map T : X/V → X/V
such that P ◦T = T ◦P where P is the canonical surjection X→ X/V. The final
condition is just that the following diagram commutes.
X −T→ X
P ↓        ↓ P
X/V −T→ X/V
Proof. Define T by
T : [x] ↦ [Tx].
One just has to show that this map is well-defined, linear and that the diagram
commutes. �
11.2.2 theorem In the situation as above, let f1, . . . , fn be a basis for X such
that f1, . . . , fk is a basis for V. Then the matrix A of T with respect to this
basis is block upper-triangular of the form

A = ( S U
      0 Q )

where S is the k × k matrix of T restricted to a map V → V. The (n − k) × (n − k)
matrix Q is the matrix of the induced map T with respect to the basis [fk+1], . . . , [fn] of
X/V.
Proof. Immediate from the definition of T . �
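A small computational illustration of this theorem (pure Python; the matrix and helper names are illustrative, not from the notes): with respect to a basis whose first k vectors span V, invariance of V is exactly the vanishing of the lower-left block.

```python
def invariant(A, k):
    """Span(e1, ..., ek) is A-invariant iff the lower-left (n-k) x k block is 0."""
    n = len(A)
    return all(A[i][j] == 0 for i in range(k, n) for j in range(k))

def apply(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

# block upper-triangular: S = [[1, 2], [0, 3]] acts on V, Q = [[4]] on X/V
A = [[1, 2, 5],
     [0, 3, 6],
     [0, 0, 4]]
assert invariant(A, 2)                     # V = Span(e1, e2) is invariant
assert apply(A, [1, 1, 0])[2] == 0         # a vector of V is mapped back into V
assert not invariant([[0, 1], [1, 0]], 1)  # a swap does not preserve Span(e1)
```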
11.3 a result for maps Cn → Cn
11.3.1 theorem Let V be an n-dimensional complex vector space and let
T : V → V be a linear map. Then there exists a basis for V with respect to
which the matrix of T is upper-triangular.
Proof. Note first that since by the Fundamental Theorem of Algebra every
polynomial with complex coefficients has a root in C we know that every such T
has an eigenvector. Now we proceed by induction on the dimension n. Clearly
the theorem holds for n = 1. Assume it holds for dimension n − 1 and now
consider dimension n. Let f1 be an eigenvector of T and let U = Span(f1).
Then T(U) ⊆ U and so there exists an induced linear map T : V/U → V/U.
By the inductive hypothesis, there exists a basis [f2], . . . , [fn] for V/U such that
the matrix of the induced map is upper-triangular. Choosing representatives
f2, . . . , fn, the vectors f1, f2, . . . , fn form a basis for V, and by §11.2.2 the
matrix of T with respect to this basis is upper-triangular. �
11.3.2 remark The entries that appear on the diagonal of an upper-triangular
matrix are the eigenvalues.
11.3.3 corollary If V is a finite-dimensional complex vector space and T :
V → V is a linear map, then there exists a flag in V invariant under T .
11.3.4 corollary Let B be an n × n complex matrix. Then there exists an
invertible n× n complex matrix P such that P−1BP is upper-triangular.
11.3.5 corollary A linear map T : V → V is called nilpotent if there exists
k ∈ N such that T^k = 0. If V is complex and T is nilpotent, then there exists a
basis for V such that the matrix of T is strictly upper-triangular. Consequently
T^n = 0.
Proof. If T is nilpotent then all its eigenvalues are zero. (Why?) Then apply
the remark following the theorem. Now we have an invariant flag and further
we have T(Vk) ⊆ Vk−1 for all k. Thus
T^n(V) = T^n(Vn) ⊆ T^{n−1}(Vn−1) ⊆ · · · ⊆ V0 = {0}.
�
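The final chain of inclusions can be checked concretely: a strictly upper-triangular matrix pushes everything one step up the flag, so its n-th power vanishes. A pure-Python sketch with an illustrative 3 × 3 matrix (not from the notes):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# an arbitrary strictly upper-triangular 3 x 3 matrix
N = [[0, 2, 7],
     [0, 0, 5],
     [0, 0, 0]]
N2 = matmul(N, N)
N3 = matmul(N2, N)
assert N2 == [[0, 0, 10], [0, 0, 0], [0, 0, 0]]  # entries move up one diagonal
assert N3 == [[0, 0, 0], [0, 0, 0], [0, 0, 0]]   # T^n = 0 with n = 3
```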
problems
done?problem 11.1 + Hand-in for tutorial The 3× 3 real matrix
A = ( 1  0 −1
      0 −2  0
      1  0  0 )
has a single real eigenvalue. Find a real invertible matrix P such that P−1AP is
block upper-triangular.
done?problem 11.2 Consider the differentiation map D : P3 → P3 where as usual
Pn is the vector space of polynomials of degree ≤ n in a variable x. Show
that D gives rise to a linear map D : P3/V → P3/V where V is the subspace
of constant polynomials. What is the matrix of D with respect to the basis
[x], [x^2], [x^3] of P3/V?
done?problem 11.3 + Hand-in for tutorial Let J : R2 → R2 be rotation
anticlockwise by a rightangle. Does there exist a basis for R2 with respect to
which the matrix of J is upper-triangular? If not, explain where the proof we
gave for complex vector spaces breaks down.
done?problem 11.4 The aim of this question is to prove that if T : V → V is
nilpotent (so that T^k = 0 for some k), then there exists a basis for V such that
the matrix of T is strictly upper-triangular. (The proof in the notes applies only
if V is complex.) Let k be the least natural number such that T^k = 0.
1. Show that if T is nilpotent then T has zero as an eigenvalue.
2. Let U ⊆ V be an invariant subspace for nilpotent T . Show that T :
V/U→ V/U is also nilpotent.
3. Now argue analogously to the proof of the main theorem in the notes for
this lecture.
lecture 12 linear maps V → V — eigenspaces
12.1 setting
Linear maps T : V → V where V is finite-dimensional over a field F.
12.2 characteristic equation
12.2.1 definition Let A be an n× n matrix. Then the degree n polynomial
in the variable x given by
cA(x) = det(A− xI)
is the characteristic polynomial of A.
12.2.2 theorem (revision) The scalar λ is an eigenvalue of A if and only if it
is a root of cA(x).
Proof. λ is an eigenvalue if and only if A − λI has nontrivial kernel, which is
the case if and only if the matrix A − λI is singular (i.e. not invertible), i.e. if
and only if cA(λ) = det(A − λI) = 0. �
12.2.3 theorem Let A,B, P be n×n matrices with P invertible and suppose
that B = P−1AP. Then cA(x) = cB(x).
Proof.
det(P^{-1}AP − xI) = det(P^{-1}(A − xI)P) = det(P^{-1}) det(A − xI) det(P) = det(A − xI).
�
12.2.4 definition If T : V → V is a linear map we can define the characteristic
polynomial cT of T to be cA where A is the matrix of T in some basis. (Since the
previous theorem shows that change of basis does not change the characteristic
polynomial, this is well-defined.)
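For 2 × 2 matrices the characteristic polynomial is just x^2 − Trace(A)x + det(A), and the invariance under change of basis (§12.2.3) can be verified numerically; a pure-Python sketch (the sample matrices are my own):

```python
def char_poly_2x2(A):
    """Coefficients (1, -Trace(A), det(A)) of c_A(x) = det(A - xI)."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return (1, -tr, det)

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2, 1], [1, 3]]
P = [[1, 1], [0, 1]]
Pinv = [[1, -1], [0, 1]]            # P^{-1}, checked by hand
B = matmul(Pinv, matmul(A, P))      # B = P^{-1} A P
assert char_poly_2x2(B) == char_poly_2x2(A) == (1, -5, 5)
```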
12.2.5 notation Let λ ∈ F be an eigenvalue of T : V → V. We write
Vλ := {v ∈ V | Tv = λv}
for the corresponding eigenspace.
12.2.6 definition Let λ be an eigenvalue of T : V → V.
• The geometric multiplicity g of λ is g := dimVλ.
• The algebraic multiplicity a of λ is the multiplicity of λ as a root of cT .
12.3 polynomials in matrices and linear maps
12.3.1 notation Let

p(x) = an x^n + · · · + a1 x + a0, aj ∈ F

be a polynomial. Then if A is a square matrix, we define

p(A) = an A^n + · · · + a1 A + a0 I
(so p(A) is itself a square matrix). More abstractly, if T : V → V is a linear
map, we can define p(T) (which is also a linear map V → V). If the matrix of
T with respect to a basis is A, then the matrix of p(T) is p(A).
12.3.2 theorem Suppose that A is diagonal with entries µj. Then p(A) is
diagonal with entries p(µj).
Proof. Obvious. �
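The evaluation p(A) is easy to compute by Horner's rule, and the diagonal case above can be checked directly; a pure-Python sketch (the helper names are mine):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_of_matrix(coeffs, A):
    """p(A) for p(x) = coeffs[0] x^d + ... + coeffs[-1], by Horner's rule."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = matmul(result, A)
        for i in range(n):
            result[i][i] += c        # ... + c I
    return result

D = [[2, 0], [0, 3]]                 # diagonal with entries 2 and 3
# p(x) = x^2 - 1: p(D) is diagonal with entries p(2) = 3 and p(3) = 8
assert poly_of_matrix([1, 0, -1], D) == [[3, 0], [0, 8]]
```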
12.3.3 theorem Suppose Tv = λv and p is a polynomial. Then p(T)(v) =
p(λ)v.
Proof. Obvious. �
12.4 direct sums of subspaces
12.4.1 definition Let U1, . . . , Uk be subspaces of V. The sum of the sub-
spaces is
U1 + · · ·+Uk := {u1 + · · ·+ uk |uj ∈ Uj}.
12.4.2 definition A sum of subspaces as above is direct if
u1 + · · ·+ uk = 0, uj ∈ Uj =⇒ uj = 0 for all j.
If W is the sum of the Uj and the sum is direct we write

W = U1 ⊕ · · · ⊕ Uk = ⊕_{j=1,...,k} Uj.
12.4.3 theorem Let V = ⊕_{j=1,...,k} Uj and let v ∈ V. Then v can be written
in one and only one way as
v = u1 + · · ·+ uk, uj ∈ Uj.
Proof. That v can be so written is immediate from the definition. For uniqueness,
write v as such a sum in two ways; subtracting, directness forces the two
expressions to agree. �
12.4.4 theorem Let λ1, . . . , λm be distinct eigenvalues of T : V → V. Then
the sum of the eigenspaces Vλj , j = 1, . . . ,m is direct.
Proof. Suppose

v1 + · · · + vm = 0, vj ∈ Vλj . (∗)

Let 1 ≤ j ≤ m. Then we can write

cT(x) = (x − λj)^{aj} Qj(x)

where aj is the algebraic multiplicity of λj. Then λj is not a root of Qj but all
other eigenvalues are. Applying Qj(T) to both sides of (∗) kills every term vi
with i ≠ j (by §12.3.3) and leaves Qj(λj)vj; since Qj(λj) ≠ 0 we see that vj = 0.
�
12.5 relation between multiplicities
12.5.1 theorem Let the n× n matrix M be block upper-triangular
M = ( A B
      0 D ).
Then det M = det A det D.
Proof. Omitted. It is not hard if one takes as the definition of determinant

det M = ∑_{σ a permutation of 1,...,n} (−1)^{|σ|} M_{1σ(1)} · · · M_{nσ(n)}

where |σ| denotes the sign of the permutation σ. �
12.5.2 corollary Let the n× n matrix M be block upper-triangular
M = ( A B
      0 D ).
Then cM(x) = cA(x)cD(x).
Proof. Trivial. �
12.5.3 corollary Let T : V → V and suppose U ⊆ V is an invariant subspace.
Let cU denote the characteristic polynomial of T : U → U. Then cT(x) =
cU(x)Q(x) where Q(x) is the characteristic polynomial of the canonical linear
map T : V/U → V/U.
Proof. Trivial. �
12.5.4 theorem Let λ be an eigenvalue of T : V → V. Then
g ≤ a
where g and a are the geometric and algebraic multiplicities of λ respectively.
Proof. The eigenspace Vλ is a g-dimensional invariant subspace. On that
subspace, T = λI and so the characteristic polynomial of T on this subspace is
(λ − x)^g. By the previous corollary, (λ − x)^g divides cT(x) and so a ≥ g. �
problems
done?problem 12.1 Write out the details of the proof of §12.4.3.
done?problem 12.2 + Hand-in for tutorial Let x1, . . . , xk be non-zero
vectors and let Uj = Span(xj). Show that the sum of the subspaces Uj is direct
if and only if the vectors xj are linearly independent.
done?problem 12.3 Let V = ⊕_{j=1,...,k} Uj be a direct sum of subspaces. Suppose
that we are given a basis for each subspace Uj. Show that the totality of all
these basis vectors forms a basis for V.
done?problem 12.4 Write out the details of the proof of §12.4.4. In particular,
provide more detail on the final sentence.
In the following problems, the idea is to calculate and spot the pattern. Don’t
get too hung-up on proofs!
In the following problems we write Jn(α) for the Jordan matrix which is the
n × n matrix whose i, j-th entry is: α if i = j; 1 if j = i + 1 and 0 otherwise.
Thus for example
J3(−5) = ( −5  1  0
            0 −5  1
            0  0 −5 ).

done?problem 12.5 + Hand-in for tutorial What is the characteristic
polynomial of J3(α)? Find its eigenvalues, and their algebraic and geometric
multiplicity. How does this generalize to Jn(α)?
done?problem 12.6 Consider D : P3 → P3 (where D denotes differentiation and
Pn is the vector space of real polynomials in “x” of degree n or less). Find a
basis with respect to which D has matrix J4(0).
done?problem 12.7 For k ≥ 1, the k-th generalized eigenspace of T : V → V with
eigenvalue λ is
Eλ,k := {v ∈ V | (T − λI)^k v = 0}.
So, for k = 1 the generalized eigenspace is just the eigenspace in the usual
sense.
1. Show that if k ≤ l then Eλ,k ⊆ Eλ,l.
2. Let A = J3(α). Show that Eα,k = V for k ≥ 3. Describe Eα,2 and give
its dimension.
3. In general, what is dimEα,k for the Jordan matrix Jn(α)?
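For experimenting with these problems, the Jordan matrix Jn(α) is easy to generate; a pure-Python sketch (the function name `jordan` is mine):

```python
def jordan(n, alpha):
    """The n x n Jordan matrix Jn(alpha): alpha on the diagonal,
    1 on the superdiagonal, 0 elsewhere."""
    return [[alpha if j == i else (1 if j == i + 1 else 0) for j in range(n)]
            for i in range(n)]

# matches the J3(-5) displayed above
assert jordan(3, -5) == [[-5,  1,  0],
                         [ 0, -5,  1],
                         [ 0,  0, -5]]
```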
lecture 13 the Cayley-Hamilton theorem and the minimal polynomial
13.1 setting
T : V → V is a linear map and V is a finite-dimensional vector space over a field
F. Its characteristic polynomial is cT (x) and its minimal polynomial is mT (x).
(We will usually write the characteristic polynomial using x as the variable in
place of the more familiar λ.)
We will write λ1, . . . , λk for the distinct eigenvalues of T or A but µ1, . . . , µl for the eigenvalues listed with multiplicity.
13.2 the Cayley-Hamilton theorem
13.2.1 theorem (Cayley-Hamilton) Let V be a finite-dimensional vector space
and let T : V → V be a linear map with characteristic polynomial cT (x). Then
cT (T) = 0.
Proof. We give a proof for F = C only. Let V be complex and n-dimensional
and let T : V → V be a linear map. Choose a basis f1, . . . , fn as in lecture
11 with respect to which the matrix A of T is upper-triangular with diagonal
entries the eigenvalues (with multiplicity) µ1, . . . , µn. Then
cT (x) = ±(x− µ1)(x− µ2) . . . (x− µn).
Let V0 = {0} and for 1 ≤ k ≤ n let Vk be the subspace of V spanned by
f1, . . . , fk. Then (see lecture 9) we have T(Vk) ⊆ Vk for all k.
Now consider the matrix of Tj := T − µjI. This is upper-triangular with a
zero in the j-th diagonal entry. We deduce that Tj(Vj) ⊆ Vj−1. Thus
(cT (T))(V) = T1T2 . . . Tn(Vn) ⊆ T1T2 . . . Tn−1(Vn−1) ⊆ · · · ⊆ V0 = {0}.
Thus cT (T) sends all vectors to zero and hence is the zero linear map. �
13.2.2 corollary Let A be an n × n complex matrix. Then cA(A) = 0. (In
words, “a matrix satisfies its own characteristic equation”.)
Proof. This is simply the translation of the theorem into matrix terms. �
13.2.3 corollary The Cayley-Hamilton theorem holds for any field F which is
a subfield of C.
Proof. The matrix version is immediately seen to hold because an n×n matrix
with entries in such a field F is also a complex matrix. �
13.2.4 application The characteristic equation of a 2 × 2 matrix A is

x^2 − Trace(A)x + det(A) = 0.
Substitute A for x (OK by C-H) and rearrange to get
(A− Trace(A)I)A = −det(A)I.
So if det A ≠ 0 we can deduce that

A^{-1} = (1/det A)(Trace(A) I − A)

which reduces to the usual formula.
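Both the 2 × 2 Cayley-Hamilton identity and the resulting inverse formula can be verified numerically; a pure-Python sketch using exact fractions (the sample matrix is my own):

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[Fraction(3), Fraction(1)], [Fraction(2), Fraction(5)]]
I = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
tr = A[0][0] + A[1][1]                       # Trace(A) = 8
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]  # det(A) = 13

# Cayley-Hamilton: A^2 - Trace(A) A + det(A) I = 0
A2 = matmul(A, A)
assert all(A2[i][j] - tr * A[i][j] + det * I[i][j] == 0
           for i in range(2) for j in range(2))

# rearranged inverse: A^{-1} = (1/det A)(Trace(A) I - A)
Ainv = [[(tr * I[i][j] - A[i][j]) / det for j in range(2)] for i in range(2)]
assert matmul(A, Ainv) == I
```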
13.2.5 application - getting eigenvectors by cheating Suppose A is 2×2 with
distinct eigenvalues λ1, λ2. Then
(A− λ1I)(A− λ2I) = 0.
Now, ker(A− λ1I) is the λ1 eigenspace and the equation tells us that
im(A− λ2I) ⊆ ker(A− λ1I).
So the columns of A− λ2I are eigenvectors with eigenvalue λ1.
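A numerical check of the trick (the 2 × 2 matrix here is an illustrative one with eigenvalues 1 and 3, not from the notes):

```python
# A has characteristic polynomial (x - 1)(x - 3): eigenvalues 1 and 3
A = [[2, 1], [1, 2]]
l1, l2 = 1, 3

def apply(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

col = [A[0][0] - l2, A[1][0]]                   # first column of A - 3I: (-1, 1)
assert apply(A, col) == [l1 * c for c in col]   # an eigenvector for eigenvalue 1
```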
13.3 the minimal polynomial
13.3.1 definition A polynomial p(x) is monic if the coefficient of its highest
order term is 1.
13.3.2 definition The minimal polynomial mT of the linear map T : V → V
is the monic polynomial of least degree such that mT (T) = 0.
13.3.3 theorem Let T : V → V be a linear map with minimal polynomial
mT (x). Then every polynomial p(x) such that p(T) = 0 is of the form p(x) =
mT (x)Q(x) for some polynomial Q(x).
Proof. Suppose p(T) = 0. Then we can divide p(x) by mT(x) (polynomial
“long division”) to get

p(x) = mT(x)Q(x) + r(x)

where the remainder r(x) is zero or has degree < deg mT(x). Now, p(T) =
mT(T) = 0 and so r(T) = 0, and if r ≠ 0 this contradicts the minimality of the
degree of mT. �
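Polynomial long division by a monic divisor is itself a short algorithm; a pure-Python sketch (coefficient lists, highest degree first; the names are mine):

```python
def poly_divmod(p, m):
    """Divide p by monic m; returns (q, r) with p = m*q + r and deg r < deg m.
    Polynomials are coefficient lists, highest degree first."""
    p, q = list(p), []
    while len(p) >= len(m):
        c = p[0]                 # leading coefficient of current remainder
        q.append(c)
        for i in range(len(m)):
            p[i] -= c * m[i]     # subtract c * x^k * m(x)
        p.pop(0)                 # leading term is now zero
    return q, p

# x^3 + 2x + 5 = (x^2 + 1) * x + (x + 5)
q, r = poly_divmod([1, 0, 2, 5], [1, 0, 1])
assert q == [1, 0] and r == [1, 5]
```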
13.3.4 corollary The minimal polynomial divides the characteristic polyno-
mial.
13.3.5 theorem If λ is an eigenvalue of T then mT (λ) = 0.
Proof. Let v ≠ 0 be a λ-eigenvector. Then 0 = mT(T)v = mT(λ)v by §12.3.3, and since v ≠ 0 it follows that mT(λ) = 0. �
13.3.6 theorem Let V be a complex vector space and T : V → V a linear
map with distinct eigenvalues λ1, . . . , λk. Then
cT(x) = ±(x − λ1)^{a1} . . . (x − λk)^{ak}

where the aj are the algebraic multiplicities. Then

mT(x) = (x − λ1)^{m1} . . . (x − λk)^{mk}

where for each j we have 1 ≤ mj ≤ aj.
Proof. Follows immediately from what we have just done. �
13.3.7 important note The definition of minimal polynomial and the theo-
rems have completely analogous statements in terms of square matrices rather
than linear maps. We will use both forms interchangeably.
problems
In the problems for this lecture the idea is to calculate and spot the pattern.
Don’t get hung-up on proofs!
In the following problems we write Jn(α) for the Jordan matrix which is the
n × n matrix whose i, j-th entry is: α if i = j; 1 if j = i + 1 and 0 otherwise.
Thus for example
J3(−5) = ( −5  1  0
            0 −5  1
            0  0 −5 ).

done?problem 13.1 What is the minimal polynomial of J3(α)? How does this
generalize to Jn(α)?
done?problem 13.2 + Hand-in for tutorial Consider ODEs of the form
x′ = Ax where x(t) = (u(t), v(t))^T and A is a constant 2 × 2 real matrix.
Recall from CVD that one can solve such systems when A is real-diagonalizable
(“nodes”, “saddles” and “stars”) and complex-diagonalizable (“foci” and “centres”).
1. Let A be a real 2 × 2 matrix that is not diagonalizable by real or complex
P. Show that the minimal polynomial of A is equal to the characteristic
polynomial, which is of the form (x − λ)^2, where λ is the eigenvalue.
2. Deduce (Cayley-Hamilton) that if f2 is not an eigenvector then f1 :=
(A− λI)f2 is.
3. Show that the matrix of x ↦ Ax in a basis f1, f2 (as in the previous part)
is J2(λ) and hence that A = PJ2(λ)P−1 where P has f1, f2 as columns.
4. Show that

exp(tJ2(λ)) = ( e^{tλ}  t e^{tλ}
                0        e^{tλ} ).
5. Show that exp(tA) = P exp(tJ2(λ))P−1 where P is the matrix with
columns f1, f2.
6. Solve
u ′ = −3u− v, v ′ = 4u+ v.
(Recall from CVD that the general solution of x′ = Ax is x(t) = exp(tA) (C1, C2)^T.)
It is purely optional for this course, but you might like to figure out what the
phase portraits for these systems look like.
Remark: this approach to the solution obscures the relationship of the
solutions to the basis. Alternatively we can observe that

e^{λt} f1, e^{λt}(t f1 + f2)

are two independent solutions.
done?problem 13.3 A matrix is said to be in Jordan form if it is block-diagonal
with each diagonal block being a Jordan matrix. (There may be more than one
block with a given parameter value α.) So for example the 7 × 7 matrix

( J3(5)  0      0
  0      J1(5)  0
  0      0      J3(−2) )

is in Jordan form. (Note that J1(α) is the 1 × 1 matrix (a.k.a. “number”) α
and so a diagonal matrix is an example of Jordan form where all the blocks are
of size 1.)
1. Consider a matrix A in Jordan form with just two Jordan blocks
Jp(α), Jq(β) where α 6= β. Find the characteristic polynomial, eigenval-
ues, minimal polynomial, and dimensions of the generalized eigenspaces
of A (definition is in a problem for lecture 12).
2. The same, only now assume that α = β.
3. Conjecture how this generalizes to a general matrix in Jordan form.
done?problem 13.4 Find two matrices A,B in Jordan form which have the same
minimal polynomial, characteristic polynomial and dimensions of the (ordinary,
not generalized) eigenspaces and such that A ≠ B (and neither do A,B differ
only by a change in the order of the blocks down the diagonal).
13.3.8 something worth knowing In fact, every complex square matrix is sim-
ilar to one in Jordan form. Some thought shows that the Jordan form is de-
termined (up to choosing the order of the blocks) by the dimensions of all
the generalized eigenspaces and so this provides a solution to the classification
problem for complex square matrices: two complex matrices are similar iff they
have the same eigenvalues and the same dimension for all the corresponding
generalized eigenspaces.
Knowing the characteristic and minimal polynomials is not enough, as problem 13.4 demonstrates.
lecture 14 a diagonalizability theorem
14.1 setting
T : V → V is a linear map and V is a finite-dimensional vector space over a field
F. Its characteristic polynomial is cT (x) and its minimal polynomial is mT (x).
(We usually write the characteristic polynomial using x as the variable in place
of the more familiar λ.)
We will write λ1, . . . , λk for the distinct eigenvalues of T or A but µ1, . . . , µl for the eigenvalues listed with multiplicity.
14.2 diagonalizability
14.2.1 remark When we consider (say) T : Rn → Rn, an eigenvector is a
real vector and it has a real eigenvalue. It may be the case that the matrix of
T has complex “eigenvalues” and “eigenvectors”. These are NOT eigenvalues
and eigenvectors for T but for the map Cn → Cn with the same matrix.
14.2.2 definition We say that T : V → V is diagonalizable if there exists a
basis for V such that the matrix of T is diagonal. (Equivalently, if and only if
there exists a basis for V consisting of eigenvectors of T .)
14.2.3 theorem A linear map T : V → V is diagonalizable if and only if V is
the sum (necessarily direct) of the eigenspaces of T .
Proof. Obvious. �
14.3 interpolating with polynomials
14.3.1 theorem Let λ1, . . . , λk ∈ F be distinct. Define polynomials (of degree
k − 1)

pj(x) = ∏_{i≠j} (x − λi)/(λj − λi), j = 1, . . . , k.
Let q(x) be a polynomial of degree less than k. Then
q(x) = q(λ1)p1(x) + · · · + q(λk)pk(x).
Proof. The polynomials pj(x) satisfy

pj(λi) = 1 if i = j, and 0 otherwise.
The two sides of the claimed equation are thus equal at all the points λj and
are hence equal everywhere, since both sides are polynomials of degree less than
k (otherwise the left-hand side minus the right-hand side would be a non-zero
polynomial of degree < k with k roots). �
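The identity is easy to check numerically; a pure-Python sketch using exact fractions (the sample points and polynomial are my own):

```python
from fractions import Fraction

def p_j(lams, j, x):
    """Evaluate p_j(x) = prod over i != j of (x - lam_i)/(lam_j - lam_i)."""
    val = Fraction(1)
    for i, li in enumerate(lams):
        if i != j:
            val *= Fraction(x - li, lams[j] - li)
    return val

lams = [0, 1, 2]
def q(x):                        # any polynomial of degree < 3
    return 3 * x**2 - x + 4

for x in range(-3, 4):           # equality of polynomials, checked at sample points
    assert sum(q(l) * p_j(lams, j, x) for j, l in enumerate(lams)) == q(x)
```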
14.4 the main theorem
14.4.1 theorem Let V be finite-dimensional and let T : V → V be a linear
map. Then T is diagonalizable if and only if the minimal polynomial mT (x) of
T factorises as a product of distinct linear factors:
mT (x) = (x− λ1)(x− λ2) . . . (x− λk), λ1, . . . , λk ∈ F.
14.4.2 notes
• If V is n-dimensional and the characteristic polynomial has n distinct
roots in F, then T is trivially diagonalizable by taking a basis consisting of
the corresponding eigenvectors. But in this case, we know (by §13.3.6)
that mT = cT so this is consistent.
• The main point in all this is the case where the characteristic polynomial
does factorise into linear factors but has repeated roots:
cT(x) = ±(x − λ1)^{a1}(x − λ2)^{a2} . . . (x − λk)^{ak}.
Then the theorem says that T is diagonalizable if and only if the minimal
polynomial has each factor to the power 1 only.
14.4.3 examples
• Example application. Suppose a square matrix A has characteristic poly-
nomial (x − 1)^2(x + 2). Then A is diagonalizable if and only if its minimal
polynomial is (x − 1)(x + 2); i.e. if and only if (A − I)(A + 2I) = 0.
• Let

A = ( 1 1
      0 1 ),  B = ( 1 0
                    0 1 ).
Both matrices have characteristic polynomial (x − 1)^2. The matrix A
has minimal polynomial (x − 1)^2 (since the only other monic possibilities
are 1 and (x − 1), and by inspection A satisfies neither of the corresponding
equations). Thus A is not diagonalizable (as one can easily check by
explicitly showing that it has only one linearly independent eigenvector).
The matrix B on the other hand has minimal polynomial (x − 1) and so
is diagonalizable (as is glaringly obvious since it is diagonal already).
• The linear map T : R2 → R2 given by the “rotate by a right-angle” matrix
J = ( 0 −1
      1  0 )
has characteristic polynomial x^2 + 1. This does not factorise with real
coefficients and so as a linear map R2 → R2, T is not diagonalizable.
On the other hand x^2 + 1 = (x + i)(x − i) and so the linear map C2 → C2
given by x ↦ Jx is diagonalizable.
In concrete terms, there exist complex invertible matrices P such that
P−1JP is diagonal, but not real ones.
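The theorem gives a purely mechanical diagonalizability test once the distinct eigenvalues are known: multiply out ∏(A − λjI) and see whether it vanishes. A pure-Python sketch applied to the matrices A and B above (helper names are mine; it assumes the list contains all eigenvalues over the field in use):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diagonalizable(A, eigenvalues):
    """Given ALL distinct eigenvalues of A in the field being used,
    A is diagonalizable iff prod_j (A - lambda_j I) is the zero matrix."""
    n = len(A)
    prod = [[int(i == j) for j in range(n)] for i in range(n)]   # identity
    for l in eigenvalues:
        shifted = [[A[i][j] - (l if i == j else 0) for j in range(n)]
                   for i in range(n)]
        prod = matmul(prod, shifted)
    return all(x == 0 for row in prod for x in row)

assert not diagonalizable([[1, 1], [0, 1]], [1])   # the matrix A above
assert diagonalizable([[1, 0], [0, 1]], [1])       # the matrix B above
```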
14.5 proof of main theorem
14.5.1 proof — easy direction Proof that if T is diagonalizable then the
minimal polynomial is as stated.
Proof. Let A be the diagonal matrix representing T in a basis of eigenvectors.
Let
cT(x) = ±(x − λ1)^{a1}(x − λ2)^{a2} . . . (x − λk)^{ak}.
Set
p(x) = (x− λ1)(x− λ2) . . . (x− λk).
Then p(A) is diagonal with diagonal entries p(µj) = 0, so p(T) = 0. By §13.3.3
mT divides p, and since every λj is a root of mT (§13.3.5) we get mT = p. �
14.5.2 proof — hard direction Proof that if the minimal polynomial is as
stated then T is diagonalizable.
Proof. By §14.2.3 it is enough to show that V is the sum of the eigenspaces
of T . Let λ1, . . . , λk be the distinct eigenvalues of T and let the polynomials
pj(x) be defined as in §14.3.1. Applying that theorem to q(x) = 1 we
deduce that
p1(x) + · · ·+ pk(x) = 1
and so
p1(T) + · · ·+ pk(T) = I. (∗∗)
Define Tj := T − λjI. Then we note that

pj(T) = αj ∏_{i=1,...,k, i≠j} Ti where αj ∈ F is a nonzero scalar.

Since T1T2 . . . Tk = mT(T) = 0 we see that

im pj(T) ⊆ ker Tj = the λj-eigenspace of T.
Now let v ∈ V and apply (**) to deduce that
v = p1(T)(v) + · · ·+ pk(T)(v)
and so v is a sum of eigenvectors of T as required. �
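The projections pj(T) in this proof can be computed explicitly for a small example; a pure-Python sketch using an illustrative 2 × 2 matrix with eigenvalues 1 and 3 (not from the notes):

```python
from fractions import Fraction

A = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(2)]]   # eigenvalues 1, 3
l1, l2 = 1, 3
I = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]

# p1(x) = (x - l2)/(l1 - l2),  p2(x) = (x - l1)/(l2 - l1)
P1 = [[(A[i][j] - l2 * I[i][j]) / (l1 - l2) for j in range(2)] for i in range(2)]
P2 = [[(A[i][j] - l1 * I[i][j]) / (l2 - l1) for j in range(2)] for i in range(2)]

# the identity (**): p1(T) + p2(T) = I
assert [[P1[i][j] + P2[i][j] for j in range(2)] for i in range(2)] == I

def apply(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

v = [Fraction(5), Fraction(-1)]
v1, v2 = apply(P1, v), apply(P2, v)
assert [a + b for a, b in zip(v1, v2)] == v        # v = p1(T)v + p2(T)v
assert apply(A, v1) == [l1 * c for c in v1]        # p1(T)v lies in the 1-eigenspace
assert apply(A, v2) == [l2 * c for c in v2]        # p2(T)v lies in the 3-eigenspace
```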
14.6 corollaries
14.6.1 corollary In the above proof, in fact im pj(T) = Vj, the λj-eigenspace.
(Consider the final equation where v ∈ Vj.) Consequently, if A is a diagonalizable
matrix then the eigenspace Vj is the column span of the matrix

(A − λ1I) . . . (A − λkI) where the λj-term is omitted.
14.6.2 corollary Let T : V → V be a diagonalizable linear map and let U ⊆ V
be a subspace such that T(U) ⊆ U. Then T : U → U is diagonalizable.

Proof. Let mT(x) be the minimal polynomial of T : V → V, which we know
to be a product of distinct linear factors. Then (mT(T))(u) = 0 for all u ∈ U,
so the minimal polynomial of T : U → U divides mT(x), hence is itself
a product of distinct linear factors, and hence T : U → U is diagonalizable. �
14.6.3 theorem Let T : V → V and S : V → V be diagonalizable linear maps
and suppose ST = TS. Then there exists a basis for V with respect to which
the matrices of S and T are both diagonal.
Proof. Let Vj be the λj eigenspace of T . Let v ∈ Vj. Then T(Sv) = S(Tv) =
λjSv and so S(Vj) ⊆ Vj. By §14.6.2 S : Vj → Vj is diagonalizable. Choose a
basis for each Vj consisting of eigenvectors of S. Then the union of these is a
basis for V consisting of vectors that are eigenvectors of both S and T . �
problems
done?problem 14.1 Use the process described in §14.3.1 to find quadratic polyno-
mials p1, p2, p3 such that for every quadratic polynomial q we have
q(x) = q(0)p1(x) + q(1)p2(x) + q(2)p3(x).
done?problem 14.2 For what values of k is the matrix
M := ( 1  1  2
       0 −2  k
       0  0  1 )

diagonalizable?
done?problem 14.3 A is a 2× 2 matrix and λ is an eigenvalue of A. Also
A − λI = (  2 −3
           −4  6 ).
Find a basis for R2 consisting of eigenvectors of A.
done?problem 14.4 Let
Rθ = ( cos θ  − sin θ
       sin θ    cos θ ), θ ∈ [0, 2π].
If you calculate you will find that the different Rθ have different (complex)
eigenvalues but the same (complex) eigenvectors. How does this relate to the
theory in this lecture?
done?problem 14.5 + Hand-in for tutorial Use the main theorem to check
that
A = ( −2 −4  2
       3  6 −1
      −6 −4  6 )
is diagonalizable. Find a basis v1, v2, v3 with respect to which the matrix of
u ↦ Au is diagonal and hence write down a matrix P such that P−1AP is
diagonal. You are given that cA(x) = (x − 2)(x − 4)^2. (You can (and should)
do all this without explicitly solving (A − λI)v = 0 for eigenvectors.)
done?problem 14.6 Continuing the previous exercise, let
B = ( −3 −4  3
       5  6 −3
      −5 −4  5 ).
Check that AB = BA. Find a basis that diagonalizes both A and B. A
suggested strategy is as follows. Use your change of basis matrix P from the
previous exercise (that diagonalizes A) to find the matrix of x ↦ Bx in the
basis v1, v2, v3 (Maple, perhaps). Now you should discover that B is block-
diagonal and you just need to change basis again within the 2-dimensional
eigenspace. (There are other ways: you could think about the intersection of
the 2-dimensional eigenspaces of the two matrices, for example.)
lecture 15 bilinear and quadratic forms on R-vector spaces
15.1 setting
Finite-dimensional vector spaces over R. If b is a symmetric bilinear form (SBF)
we write β for the associated quadratic form and B for the matrix of b with
respect to some basis.
15.2 motivation
Consider in R2 the function
α(x) = x1^2 − x2^2, x = (x1, x2)^T ∈ R2
which as we will see is an example of a “quadratic form” on R2 — such things
are essentially just polynomials in the coordinates with each term having total
degree 2.
Consider also the quadratic form
β(y) = y1y2, y = (y1, y2)^T ∈ R2.
This is in fact “essentially the same” quadratic form because under the change
of coordinates
y1 = x1 + x2, y2 = x1 − x2
the forms are equal. On the other hand, no such change of coordinates can ever
make either of these forms equal to the form
γ(z) = z1^2 + z2^2.
This is easy to see in this simple case because γ(z) = 0 only for z = 0 whereas
there are non-zero vectors x ∈ R2 such that α(x) = 0, but it is useful to have
some general theory here on when quadratic forms are equivalent under change
of coordinates. We will address this issue over the next three lectures.
15.3 definition
15.3.1 definition A symmetric bilinear form (hereafter SBF) on a real vector
space V is a function b : V × V → R which satisfies
• b(u, v) = b(v, u) for all u, v ∈ V (i.e. b is “symmetric”)
• b(λu+ µv,w) = λb(u,w) + µb(v,w) for all u, v,w ∈ V and λ, µ ∈ R.
15.3.2 note The second axiom says that an SBF is linear in the first entry
with the second entry held fixed (i.e. fixing y, the map x ↦ b(x, y) is linear
from V to R).
15.3.3 theorem Let b be an SBF on V. Then b is linear in the second entry
(hence the word “bilinear” in the definition):
b(w, λu+ µv) = λb(w,u) + µb(w, v) for all u, v,w ∈ V and λ, µ ∈ R.
Proof. Trivial (see problems). �
15.4 coordinates
15.4.1 SBF’s on Rn On Rn, SBFs are given by symmetric matrices: let B
be a symmetric n× n real matrix. Then
b(x, y) = x^T B y

is a symmetric bilinear form on Rn (where x, y are both column vectors).
15.4.2 definition Let f1, . . . , fn be a basis for V. The matrix of b with respect
to the basis is the symmetric n× n matrix B with Bij = b(fi, fj).
15.4.3 theorem let B be the matrix of the SBF b with respect to a basis.
Suppose that x, y are the coordinate matrices of the vectors u, v ∈ V. Then
b(u, v) = xT By.
Proof. We must establish the claimed property. Write xj, yj for the coordinates
of u, v with respect to the given basis. Then by linearity in both entries
b(∑_{j=1}^n xj fj, ∑_{k=1}^n yk fk) = ∑_{j=1}^n ∑_{k=1}^n xj yk b(fj, fk) = ∑_{j=1}^n ∑_{k=1}^n xj Bjk yk = x^T B y.   □
15.5 the associated quadratic form
15.5.1 definition Let b be a SBF on V. The associated quadratic form β is
the function
β(v) = b(v, v).
15.5.2 note In coordinates, β(v) is a polynomial in the coordinates of v with
each term having degree exactly two.
15.5.3 theorem An SBF determines a quadratic form. The SBF is recoverable
from the quadratic form by the polarization identity
b(x, y) = (1/4)(β(x + y) − β(x − y)).
Proof. Expand the right-hand side. □
(Thus, SBFs and quadratic forms are really just the same thing.)
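A numeric check of the polarization identity (a sketch; the random matrix and vectors are our own):

```python
import numpy as np

rng = np.random.default_rng(2)
S = rng.normal(size=(3, 3))
B = S + S.T                                  # matrix of an SBF b
beta = lambda v: v @ B @ v                   # the associated quadratic form
x, y = rng.normal(size=(2, 3))

# the polarization identity recovers b from beta
assert np.isclose(x @ B @ y, 0.25 * (beta(x + y) - beta(x - y)))
```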
15.6 examples
15.6.1 examples
• On R2 using the standard basis the general SBF is given by a general
symmetric matrix B as
b(x, y) = x^T B y = l x1y1 + m(x1y2 + x2y1) + n x2y2,   B = ( l m ; m n ).
The associated quadratic form is
β(x) = l x1^2 + 2m x1x2 + n x2^2.
Comparing the formulas should make the relationship clear.
• The standard inner product on Rn is an SBF given by the identity matrix
and the associated quadratic form is ||x||^2.
• b(X, Y) = Trace(XY) defines an SBF on the vector space of n × n real
matrices.
15.7 properties
15.7.1 theorem Let b be an SBF on V and let X ⊆ V be a subspace. Then
b restricted to X defines an SBF on X.
Proof. Obvious. □
15.7.2 definition
• An SBF on V is positive-definite if for all v ≠ 0 in V we have b(v, v) > 0.
• An SBF on V is negative-definite if for all v ≠ 0 in V we have b(v, v) < 0.
• An inner product on V is a positive-definite SBF on V.
problems
done?problem 15.1 Prove that an SBF is linear in the second entry (§15.3.3).
done?problem 15.2 Check the claim in §15.4.1 that if B is a symmetric n×n matrix
then b(x, y) = x^T B y defines an SBF on Rn, where x, y are column vectors as
usual. (Hint: for the symmetric part, take the transpose of the 1 × 1 matrix
x^T B y.)
done?problem 15.3 Consider the SBF b on R2 with matrix
B = ( 2 3 ; 3 1 ).
Find a vector v ≠ 0 in R2 such that b(v, v) = 0. Give a sketch showing the
regions in the plane where b(x, x) is positive, negative and zero.
done?problem 15.4 Consider the SBF on R3 with matrix
B = ( 1 0 0 ; 0 1 0 ; 0 0 −1 ).
Sketch the regions in R3 where b(x, x) is positive, negative and zero.
done?problem 15.5 + Hand-in for tutorial Let V be the vector space of
2 × 2 matrices with real entries and trace zero. Consider the SBF (known, by
the way, as the “trace form”) b(X, Y) = Trace(XY) on V. Find the matrix of b
with respect to the basis
( 0 1 ; 0 0 ),   ( 0 0 ; 1 0 ),   ( 1 0 ; 0 −1 )
of V.
lecture 16 diagonalizing SBF’s
16.1 setting
Finite-dimensional vector spaces over R. If b is a symmetric bilinear form (SBF)
we write β for the associated quadratic form and B for the matrix of b with
respect to some basis.
16.2 revision
16.2.1 orthogonal matrices Recall that an n×n matrix P is said to be orthogonal
if P^T = P^{-1} or equivalently if the columns form an orthonormal basis for
Rn (with respect to the standard inner product).
16.2.2 diagonalisation of symmetric matrices Recall that a concrete version
of the finite-dimensional spectral theorem (FDST) is the following. Given a real
symmetric matrix S there exists an orthogonal matrix P such that
P^T S P = P^{-1} S P = D where D = Diag(µ1, . . . , µn)
and the µj are the eigenvalues (necessarily all real, remember) of S taken with
multiplicity (i.e. repeated roots of the characteristic equation appear the cor-
responding number of times). The columns of P are the corresponding unit
eigenvectors of S.
This is easy to achieve in practice if S has n distinct eigenvalues, one just
chooses a unit eigenvector for each. (Recall that eigenvectors with different
eigenvalues are automatically orthogonal.) If µ is a repeated root of the charac-
teristic equation, then the corresponding eigenspace will be of dimension equal
to the multiplicity of the root and one normally has to use Gram-Schmidt to
find an orthonormal basis for it.
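Numerically, NumPy's `eigh` performs exactly this orthogonal diagonalization of a symmetric matrix; a sketch using the matrix from problem 16.3 below:

```python
import numpy as np

S = np.array([[7., -6.],
              [-6., -2.]])                   # the symmetric matrix from problem 16.3
mu, P = np.linalg.eigh(S)                    # real eigenvalues (ascending) and orthonormal eigenvectors
assert np.allclose(P.T @ P, np.eye(2))       # the columns of P form an orthonormal basis
assert np.allclose(P.T @ S @ P, np.diag(mu)) # P^T S P = Diag(mu_1, ..., mu_n)
```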
16.3 type of an SBF
16.3.1 definition Let b be an SBF on V. Define
Nb := {x ∈ V |b(x, v) = 0 for all v ∈ V}.
We say that b is nondegenerate if Nb = {0} and otherwise that b is degenerate.
16.3.2 theorem Nb as defined above is a subspace of V.
Proof. Exercise (see problems). □
16.3.3 theorem Define the rank of b by
rank b = dim V − dim Nb.
Then the rank of b is equal to the rank of its matrix B.
Proof. The vector v ∈ Nb iff Bx = 0 (where x is the coordinate column vector
of v in our chosen basis). Thus the dimension of Nb is the dimension of the
kernel of the linear map x ↦ Bx and the result follows from the Rank theorem
for linear maps. □
16.3.4 notes
• We will not generally use a name, but sometimes Nb is referred to as the
“kernel” of b.
• The point of the above is this. We know that the rank of a matrix (the
dimension of the row and column span) is significant when we are thinking
of a matrix as the coordinate version of a linear map. The theorem gives
a meaning for the rank when instead the (square) matrix is the coordinate
version of an SBF.
16.3.5 corollary An SBF is nondegenerate if and only if its matrix has non-
zero determinant.
16.3.6 definition Let b be an SBF on V. Let p be the largest integer which
is the dimension of a subspace on which b is positive-definite. Similarly let
q be the largest integer which is the dimension of a subspace on which b is
negative-definite. Then we say that b has type (p, q) and has signature p− q.
16.3.7 example If V is n-dimensional then an inner product on V is the same
thing as an SBF of type (n, 0).
16.4 change of basis
16.4.1 theorem Let b be an SBF on V with matrix B with respect to a basis
f1, . . . , fn. Let f′1, . . . , f′n be a new basis for V such that the change of basis
matrix from the original basis to this new basis is P. Then the matrix of b with
respect to the new basis is
B′ = P^T B P.
16.4.2 theorem Let b be an SBF on V. Then there exists a basis for V such
that the matrix of b is of the form
B = ( Ip 0 0 ; 0 −Iq 0 ; 0 0 0_{n−(p+q)} )
where 0_{n−(p+q)} is the (n − (p + q)) × (n − (p + q)) zero matrix.
We shall refer to such matrices as being in the standard form for an SBF.
Proof. Start with any basis for V and let the matrix in that basis be S. The
matrix version of FDST recalled above shows that there is an orthogonal matrix
P representing a change of basis to a new basis f1, . . . , fn for V, such that
P^T S P = P^{-1} S P = D where D = Diag(µ1, . . . , µn).
We can assume without loss of generality that in the list of eigenvalues, the
positive eigenvalues come before the negative ones which come before the zero
ones.
Now rescale the basis vectors corresponding to nonzero eigenvalues accord-
ing to
fj ↦ (1/√|µj|) fj
to obtain a basis with the matrix in the desired form, with p and q being the
number of positive and negative eigenvalues respectively. □
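The proof is constructive, and the rescaling step is easy to carry out numerically; a sketch (reusing the matrix from problem 16.3, and without reordering the eigenvalues positives-first):

```python
import numpy as np

S = np.array([[7., -6.],
              [-6., -2.]])                       # matrix of an SBF in some starting basis
mu, P = np.linalg.eigh(S)                        # FDST: P^T S P = Diag(mu)
# rescale eigenvector j by 1/sqrt(|mu_j|); leave any zero-eigenvalue vectors alone
scale = np.array([1.0 / np.sqrt(abs(m)) if abs(m) > 1e-12 else 1.0 for m in mu])
Q = P * scale                                    # scales column j of P
D = Q.T @ S @ Q                                  # diagonal with entries +1, -1 or 0
assert np.allclose(D, np.diag(np.sign(mu)))
```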
16.4.3 theorem Let b have type (p, q). Then the rank of b is p + q and the
matrix of b in any basis has p positive and q negative eigenvalues (counting
with multiplicity).
Proof. It is easy to check (see exercises) that if the matrix of b is in the
standard form as in §16.4.2 then the largest dimensions of subspaces on which
b is positive or negative definite are p and q. We saw in the proof that p, q
were the number of positive and negative eigenvalues of the matrix of b in the
originally chosen arbitrary basis. □
problems
done?problem 16.1 Show that Nb := {x ∈ V |b(x, v) = 0 for all v ∈ V} is a
subspace of V. (i.e. give the proof of §16.3.2.)
done?problem 16.2 Suppose that the matrix of b in a basis is the standard form
as in §16.4.2. Identify a p-dimensional subspace on which b is positive definite.
Identify also an (n − p)-dimensional subspace on which b is “negative semi-
definite” (meaning that b(v, v) ≤ 0 for all v in the subspace). By considering
intersections, deduce that there is no subspace of dimension larger than p on
which b is positive definite.
done?problem 16.3 + Hand-in for tutorial
1. Find a 2× 2 orthogonal matrix P such that P^T S P is diagonal where
S = ( 7 −6 ; −6 −2 ).
2. Find also a matrix P such that P^T S P is diagonal with diagonal entries ±1 or 0.
3. What is the type of the SBF on R2 given by the matrix S? What is its
rank and what is its signature?
done?problem 16.4 + Hand-in for tutorial Let P2 denote the real vector
space of polynomials of degree ≤ 2 in a variable x.
1. Show that
b(p(x), q(x)) := ∫_0^1 p′(x) q′(x) dx
defines an SBF on P2 (the “dashes” indicate derivatives).
2. Find a nonzero element of Nb and hence deduce that b is degenerate.
3. Find the matrix of b with respect to the basis {x2, x, 1} of P2 and hence
find the rank of b.
done?problem 16.5 Quick questions
1. True or false: An SBF is non-degenerate iff its matrix does not have zero
as an eigenvalue.
2. What are the possible types of an SBF on R3 if there exists a 2-dimensional
subspace on which it is negative-definite?
3. An SBF on Rn has type (p, q). What is the largest possible dimension
for a subspace V such that b(v, v) < 0 for all non-zero v ∈ V?
4. Same as above but now b(v, v) ≤ 0 for all non-zero v ∈ V.
done?problem 16.6 + Hand-in for tutorial Let V denote the vector space
of n× n real matrices.
1. What is the dimension of V and of the subspace of symmetric matrices
and of the subspace of antisymmetric matrices?
2. Show that b(X, Y) = Trace(XY) defines an SBF on V.
3. Show that
b(X, Y) = ∑_{j=1}^n ∑_{k=1}^n Xjk Ykj.
4. Show that b is positive-definite on the subspace of symmetric matrices
and negative-definite on the subspace of antisymmetric matrices.
5. Find the type, rank and signature of b.
done?problem 16.7 Let b be the SBF on R4 given by the matrix B given in block
form as
B = ( I2 0 ; 0 −I2 ).
Let A be a fixed 2× 2 matrix. Define
U := { x ∈ R^4 | x = ( Av ; v ) for some v ∈ R^2 }.
(We are using “block form” notation above.) Show that U is a subspace of R4
and state its dimension.
Show that b is identically zero on U if and only if A is an orthogonal matrix
(i.e. iff A^T A = I).
done?problem 16.8 Suppose b is a non-degenerate SBF on V. Can there exist
a subspace U of V such that b restricted to U is degenerate? What if b is
assumed to be positive-definite?
done?problem 16.9 True or False: If an SBF is positive definite on subspaces
U,U ′ ⊆ V then it is positive definite on their sum U+U ′. Explain your answer.
lecture 17 determining type — applications
17.1 setting
Finite-dimensional vector spaces over R. If b is a symmetric bilinear form (SBF)
we write β for the associated quadratic form and B for the matrix of b with
respect to some basis.
17.2 determining type
The type of an SBF can often be determined without computing eigenvalues.
We develop this method here.
17.2.1 theorem Let b be an SBF of type (p, q) on n-dimensional V and let
B be its matrix with respect to a basis. Then
• detB = 0 if and only if b is degenerate. (That is, if and only if p+q < n.)
• detB > 0 if and only if b is nondegenerate and q is even.
• detB < 0 if and only if b is nondegenerate and q is odd.
Proof. The determinant of a matrix is the product of its eigenvalues. □
17.2.2 theorem Suppose that b is an SBF of type (p, q) on V and that on a
subspace U ⊆ V it has type (p ′, q ′). Then p ′ ≤ p and q ′ ≤ q.
Proof. Immediate from the definition of p, q in terms of subspaces on which
b is positive and negative definite. □
17.2.3 theorem Let {0} = V0 ⊆ V1 ⊆ · · · ⊆ Vn = V be a flag in an n-
dimensional vector space V. Let b be an SBF on V which is nondegenerate
when restricted to each subspace Vk, k = 1, . . . , n. Let dk be the determinant
of the matrix of b restricted to Vk. Then b has type (n− q, q) where q is the
number of sign changes in the sequence
1, d1, d2, . . . , dn.
Proof. If the type on Vk is (p, q) then the type on Vk+1 is either (p, q+ 1) or
(p + 1, q). In the first case there is an extra negative eigenvalue and since the
determinant is the product of the eigenvalues, it changes sign. □
17.2.4 corollary Let B be an n × n symmetric matrix. For 1 ≤ k ≤ n, let
Bk denote the “top-left” k × k sub-matrix of B (i.e. formed from the entries
bij with 1 ≤ i, j ≤ k). Let dk = detBk. (Here B1 is a 1 × 1 matrix and its
determinant is equal to its entry.) Suppose that all the dk are non-zero. Let q
be the number of sign changes in the sequence 1, d1, . . . , dn. Then b has type
(p, q).
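The corollary translates directly into code; a sketch (the function name is ours) assuming, as the corollary requires, that all the leading minors are nonzero:

```python
import numpy as np

def type_by_minors(B):
    """Type (p, q) of the SBF with matrix B from the signs of the
    leading principal minors d_1, ..., d_n (all assumed nonzero)."""
    n = B.shape[0]
    d = [np.linalg.det(B[:k, :k]) for k in range(1, n + 1)]
    signs = np.sign([1.0] + d)
    q = int(np.sum(signs[1:] != signs[:-1]))   # sign changes in 1, d_1, ..., d_n
    return n - q, q

B = np.array([[-1., 6., 3.],
              [6., 1., 1.],
              [3., 1., 2.]])                   # the matrix from the worked example in 17.2.5
assert type_by_minors(B) == (2, 1)
```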
17.2.5 variations We illustrate with examples how one may be able to use
the above ideas in other ways.
• In the above, we do not need to take the chain of subspaces (or “flag”)
to start from the top. For example the theorem does not directly apply
to
B = ( 0 1 1 ; 1 1 1 ; 1 1 2 ).
Taking the chain of determinants starting from the bottom-right however
we get the values 2, 1, −1 and so the type is (2, 1).
You can even start with the middle 1 × 1 matrix. But you must only
consider square matrices with the same leading diagonal as the original
matrix — you should never be considering e.g. the bottom left 2× 2.
• B = ( −1 6 3 ; 6 1 1 ; 3 1 2 ).
The usual chain works fine here, giving values −1, −37, −46, and so the
type is (2, 1).
But you can take a shortcut. B is positive definite on the span of the last
two basis vectors (just working out the 1×1 and 2×2 determinants in your
head). So the type is one of (2, 1), (2, 0), (3, 0). But B is negative definite
on the span of the first basis vector and so only the first is possible. (We
have avoided having to compute the 3× 3 determinant.)
• B = ( −3 2 7 −4 ; 2 −2 6 3 ; 7 6 1 1 ; −4 3 1 2 ).
4×4 determinants are a pain. We might however notice (just computing
2×2 determinants) that B is positive definite on the span of the last two
basis vectors and negative-definite on the span of the first two. Thus B
can only be of type (2, 2).
17.3 classification of critical points
Let f(x) = f(x1, . . . , xn) be a smooth function of n variables and let x = a
be a critical point, meaning that ∂f/∂xk = 0 at x = a for all k. The Hessian
matrix at a is the symmetric matrix H with
Hjk = ∂^2 f / ∂xj ∂xk, evaluated at x = a.
The Taylor expansion of f near x = a is
f(a + z) = f(a) + (1/2) z^T H z + higher order terms
and so near x = a the first non-trivial term is the quadratic form with H as its
matrix. We immediately see the following.
• If H is positive-definite (resp. negative-definite) then f has a strict local
minimum (resp. maximum) at x = a.
• If H has type (p, q) with p > 0, q > 0 then there are directions in which
f increases and directions in which it decreases.
If H is degenerate then one may need to know about the higher order terms in
order to understand the nature of the critical point. For example, in R4 if the
Hessian has type (2, 0) then the point may or may not be a local minimum.
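The resulting test is easy to automate; a sketch with a made-up Hessian (not one from the notes):

```python
import numpy as np

H = np.array([[2., 1., 0.],
              [1., 2., 0.],
              [0., 0., -1.]])                 # hypothetical Hessian at a critical point
eig = np.linalg.eigvalsh(H)
p = int(np.sum(eig > 1e-12))                  # positive eigenvalues
q = int(np.sum(eig < -1e-12))                 # negative eigenvalues
n = H.shape[0]
if (p, q) == (n, 0):
    verdict = "strict local minimum"
elif (p, q) == (0, n):
    verdict = "strict local maximum"
elif p > 0 and q > 0:
    verdict = "f increases in some directions and decreases in others"
else:
    verdict = "degenerate Hessian: higher-order terms needed"
```

Here the type is (2, 1), so the critical point is a saddle.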
problems
done?problem 17.1 Consider the quadratic form
β = 2x^2 + 3y^2 + z^2 − 4xy + 2xz + 2yz.
1. Write down the matrix B of this quadratic form.
2. By evaluating determinants only determine the type of this form.
3. Find the eigenvalues and eigenvectors of B and check that the signs of
the eigenvalues are consistent with derivation of the type in the previ-
ous part. (You might want to use Maple to find the eigenvalues — use
“evalf(LinearAlgebra[Eigenvalues](B))”.)
done?problem 17.2 + Hand-in for tutorial Show that the origin is a critical
point of
f(x, y, z) = 2x2 + y siny+ z2 + 2(y+ z) sin x− 2ky sin z
(where k is a constant). What can you say about the nature of the critical point
for different values of k?
done?problem 17.3 What is the type of the SBF with matrix
( −3 12 −7 ; 12 4 2 ; −7 2 2 )?
You do not need to evaluate a 3×3 determinant, and don't even think of
computing eigenvalues.
done?problem 17.4 What is the type of the SBF in problem 5 of lecture 15?
lecture 18 SBF’s on inner-product spaces
18.1 setting
Finite-dimensional vector spaces over R. If b is a symmetric bilinear form (SBF)
we write B for the matrix of b with respect to some basis.
18.2 inner product spaces (IPS)
18.2.1 definitions An inner product on a real vector space V is a positive-
definite symmetric bilinear form on V. (Note that this is equivalent to the
definition from Year 2.) We will usually write inner products as 〈·, ·〉.
18.2.2 orthonormal bases For an SBF which is an inner product, a basis
for which the matrix of the SBF takes the standard form of 16.4.2 (which
is just the identity matrix) is called orthonormal.
This idea should be familiar from Year 2: a basis in an inner-product space
is orthonormal if 〈ei, ej〉 = 0 when i ≠ j and ||ei|| := √〈ei, ei〉 = 1.
In an inner-product space one usually works with orthonormal bases where
possible.
18.2.3 orthogonal matrices Recall that an n×n real matrix A is orthogonal
if A^T A = I. Orthogonal matrices give distance- and angle-preserving linear
maps from Rn (with its usual, standard inner product) to itself. That is, A is
orthogonal if and only if
〈x, y〉 = 〈Ax, Ay〉 for all x, y ∈ Rn.
Orthogonal matrices arise also as change of basis matrices between orthonor-
mal bases in any real inner-product space. To see this, note that if an SBF has
matrix I with respect to one basis (i.e. it is an inner product and the basis is
orthonormal) then for its matrix also to be I with respect to a new basis we
require
P^T I P = P^T P = I
where P is the change of basis matrix. (We are using the formula for change of
basis for the matrix of an SBF.)
18.3 classification of SBF’s on an IPS
18.3.1 theorem Let b be an SBF on an inner-product space V. Then there
exists an orthonormal basis for V such that the matrix of b is
Diag(µ1, . . . , µn)
where the numbers µ1, . . . , µn are the eigenvalues of the matrix of b with respect
to any orthonormal basis (counted with multiplicity).
Proof. Choose an orthonormal basis for V and let B be the matrix of b in
that basis. Let µ1, . . . , µn be the eigenvalues of B (counted with multiplicity).
Then by the finite-dimensional spectral theorem (see §16.2.2) there exists an
orthogonal matrix P such that P^T B P is as stated. The matrix P is thus the
change of basis matrix to an orthonormal basis with respect to which the matrix
of b is as claimed. □
18.3.2 remark As we know from before, if we forget the inner product on V
then there exists a basis where the matrix of b is diagonal with entries ±1 or
zero. We obtained this basis by taking the one we are using here and rescaling
the basis vectors associated with nonzero eigenvalues.
For an SBF b on a vector space V generally, only the sign (±1 or zero)
of the eigenvalues is significant. If V has an inner product, then the eigenval-
ues themselves (of the matrix of b with respect to an orthonormal basis) are
significant.
A matrix version of the above fact is this. Under the transformation
B ↦ P^T B P of a symmetric matrix B by an arbitrary invertible matrix P, the sign
of the eigenvalues is preserved. If also P is orthogonal then the values of the
eigenvalues are preserved.
18.4 classification of quadrics
18.4.1 quadrics in R3 A non-degenerate central quadric in R3 is a surface Σ
defined by the equation
x^T S x = 1
where S is a non-zero symmetric matrix. We will assume S is not negative-
definite since in that case there are no x satisfying the equation. We will
assume also that S is non-degenerate (i.e. it has rank 3).
18.4.2 theorem Given such a quadric there exists an orthonormal basis such
that with respect to that basis the quadric takes one of the following forms.
• If S is positive definite then Σ is an ellipsoid given by
x^2/a^2 + y^2/b^2 + z^2/c^2 = 1,   a, b, c > 0.
• If S has type (2, 1) then Σ is a hyperboloid of one sheet given by
x^2/a^2 + y^2/b^2 − z^2/c^2 = 1,   a, b, c > 0.
• If S has type (1, 2) then Σ is a hyperboloid of two sheets given by
x^2/a^2 − y^2/b^2 − z^2/c^2 = 1,   a, b, c > 0.
In each case, the coefficients of x^2, y^2, z^2 are the eigenvalues of S.
18.4.3 remark Note: the type of S determines which of the three categories
the quadric is in. One can usually determine the type by computing determi-
nants. If one wants to know the values of a, b, c then one needs to know the
eigenvalues of S and if one wants also to know the orthonormal basis in which
the equation takes the given form then the eigenvectors also must be computed.
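The classification can be automated from the eigenvalue signs; a sketch (the function name and sample matrix are our own), though for hand work the determinant tests of lecture 17 are usually quicker:

```python
import numpy as np

def classify_quadric(S, tol=1e-12):
    """Classify the central quadric x^T S x = 1 in R^3 by the type of S."""
    eig = np.linalg.eigvalsh(S)
    p = int(np.sum(eig > tol))
    q = int(np.sum(eig < -tol))
    return {(3, 0): "ellipsoid",
            (2, 1): "hyperboloid of one sheet",
            (1, 2): "hyperboloid of two sheets"}.get((p, q), "empty or degenerate")

S = np.array([[1., 1., 1.],
              [1., 2., 0.],
              [1., 0., 3.]])                  # hypothetical example, not from the problems
assert classify_quadric(S) == "ellipsoid"
```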
problems
done?problem 18.1 Classify the following quadrics.
1. x^2 + 2y^2 + 3z^2 + 2xy + 2xz = 1
2. 2xy + 2xz + 2yz = 1
3. x^2 + 3y^2 + 2xz + 2yz − 6z^2 = 1
You should be able to do all of these using “determinants” analysis if you think
carefully. You can always check your answer by asking Maple for the eigenvalues.
done?problem 18.2 Let β(x) be a quadratic form on Rn given by a symmetric
matrix S. How are the maximum and minimum values of β(x) on the unit
sphere x^T x = 1 related to the eigenvalues of S? (Hint: orthogonal change of
coordinates to standard form.)
done?problem 18.3 u Challenge Continuing the previous exercise, use Lagrange
multipliers to find the critical points of x^T S x subject to the constraint
x^T x = 1.
lecture 19 simultaneous diagonalization
19.1 setting
Finite-dimensional vector spaces over R. If b is a symmetric bilinear form (SBF)
we write B for the matrix of b with respect to some basis.
19.2 SBFs and self-adjoint linear maps
19.2.1 definition (revision from year 2) Let V be an IPS and let T : V → V
be a linear map. Then T is self-adjoint if
〈u, Tv〉 = 〈Tu, v〉, for all u, v ∈ V.
19.2.2 theorem The map T : V → V is self-adjoint if and only if its matrix
with respect to an orthonormal basis is symmetric.
Proof. Let u, v have coordinate column matrices x, y with respect to an
orthonormal basis and let T have matrix A. Then
〈u, Tv〉 − 〈Tu, v〉 = x^T A y − (Ax)^T y = x^T (A − A^T) y
which is zero for all u, v if and only if A = A^T. □
19.2.3 theorem Let T : V → V be a self-adjoint linear map. Then
b(u, v) := 〈u, Tv〉
defines an SBF on V. The matrices of the linear map T and of the SBF b are
equal when they are taken with respect to an orthonormal basis.
Proof. That b is linear is immediate. It is symmetric because
b(v, u) = 〈v, Tu〉 = 〈Tv, u〉 = 〈u, Tv〉 = b(u, v).
Let T have matrix A and let u, v have coordinate column matrices x, y with
respect to an orthonormal basis. Then
b(u, v) = 〈u, Tv〉 = x^T (Ay) = x^T A y
and so the matrix of b is A, the matrix of T. □
19.2.4 remark So, in an IPS, self-adjoint linear maps and SBF’s are essen-
tially the same thing. This explains why it is reasonable for the eigenvalues
of the matrix of an SBF to be significant — they are the eigenvalues of the
associated self-adjoint linear map.
In matrix terms, under a change of coordinates the matrices of SBF’s and
linear maps have different transformation rules:
B ↦ P^T B P,   A ↦ P^{-1} A P.
On an IPS however we have a preferred set of bases, the orthonormal ones, and
for change of basis between these we have P^T = P^{-1} and those formulae become
the same.
19.3 simultaneous diagonalization
19.3.1 theorem Let b, a be two SBF’s on a vector space V and suppose that
a is positive definite. Then there exists a basis for V such that the matrix of a
is the identity matrix and the matrix of b is diagonal.
Proof. Regard a as an inner product on V so that b is an SBF on an inner-
product space. Then there exists an orthonormal (with respect to 〈x, y〉 =
a(x, y)) basis for V such that the matrix of b is diagonal by the results of the
previous lecture. □
19.3.2 details We will work on Rn (which involves no loss of generality since
we can reduce to that case by choosing an arbitrary basis for V). Let a, b be
SBF’s as above with matrices A,B respectively. If we regard a as an inner
product then we can use that to identify b with a self-adjoint (with respect to
a) linear map x ↦ Mx using the formula
b(x, y) = a(x,My)
or in matrix form x^T B y = x^T A M y. Thus we deduce that B = AM, i.e.
M = A^{-1} B.
Now the diagonal entries in the matrix of B once we have diagonalised are
precisely the eigenvalues of M which we can calculate by solving the character-
istic equation
det(B − λA) = 0
since det(A^{-1}B − λI) = det(A^{-1}) det(B − λA). The solutions are called the
relative eigenvalues of the pair B and A.
Further, the required basis vectors are the eigenvectors of M scaled so that
a(v, v) = 1. These can be obtained by solving for each relative eigenvalue
(B− λA)v = 0
and rescaling so that v^T A v = 1. If λ is a repeated root, then one would have
to use Gram-Schmidt (using the inner product a) to find an orthonormal basis
for the corresponding eigenspace.
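The whole recipe can be checked numerically; a sketch with made-up matrices A (positive definite) and B, whose relative eigenvalues happen to be distinct so no Gram-Schmidt step is needed:

```python
import numpy as np

A = np.array([[2., 0.],
              [0., 1.]])                         # matrix of a, positive definite (our example)
B = np.array([[1., 1.],
              [1., 0.]])                         # matrix of b
lam, V = np.linalg.eig(np.linalg.inv(A) @ B)     # relative eigenvalues: det(B - lam*A) = 0
lam, V = lam.real, np.real(V)                    # real by the theory; drop zero imaginary parts
norms = np.sqrt(np.diag(V.T @ A @ V))            # a(v, v) for each eigenvector column v
P = V / norms                                    # rescale so that a(v, v) = 1
assert np.allclose(P.T @ A @ P, np.eye(2))       # a becomes the identity
assert np.allclose(P.T @ B @ P, np.diag(lam))    # b becomes Diag(relative eigenvalues)
```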
problems
done?problem 19.1 + Hand-in for tutorial Consider the SBF’s on R2 given
(with respect to the standard basis) by the matrices
B = ( 1 3 ; 3 3 ),   A = ( 2 1 ; 1 1 ).
Show that one of these is positive definite and hence find a basis for R2 with
respect to which the matrices of both the SBF’s are diagonal (with the positive-
definite one having the identity as its matrix). Write down the change of basis
matrix that diagonalises both forms.
done?problem 19.2
1. Let
A = ( 1 3 ; 3 2 ).
Check explicitly that 〈Ax, y〉 = 〈x,Ay〉 for all x, y ∈ R2 where the inner
product is the standard one on R2.
2. Let
A = ( 1 3 ; −1 2 ).
Find vectors x, y ∈ R2 such that 〈Ax, y〉 ≠ 〈x, Ay〉 where the inner
product is the standard one on R2.
done?problem 19.3 + Hand-in for tutorial Let n ∈ N and consider
Tn = { a0 + ∑_{k=1}^n (ak cos kx + bk sin kx) | ak, bk ∈ R }
with the inner product
〈p(x), q(x)〉 := ∫_0^{2π} p(x) q(x) dx.
Consider the linear map −D^2 : Tn → Tn where −D^2 : p(x) ↦ −p′′(x).
1. Show that −D2 is self-adjoint. (Hint: integration by parts.)
2. What are the eigenvalues of −D^2 : Tn → Tn and what is the multiplicity
of each eigenvalue? (Think ODEs; no clever theory required.)
3. Show that the SBF associated to −D^2 is b(p, q) = ∫_0^{2π} p′(x) q′(x) dx.
4. What is the type of the SBF just found? Relate that to the eigenvalues
of −D^2.