MAT3-ALG algebra 2006/7 (tnb) - lecture 0 1
MAT3-ALG algebra 2008-2009 — Toby Bailey, http://student.maths.ed.ac.uk
lecture 0 preamble
• This course follows on from Year 2 Linear Algebra. You are strongly
advised to do the revision problems below to get up to speed on last
year’s material.
• The course consists of the skeleton notes, the lectures and the exercises.
The lectures will not duplicate what is in the notes and more examples
will be done in lectures. Most important: do the exercises — in that way
you keep up with the material and will get more from lectures.
• There will be questions set each week to hand in to and discuss with your
tutor — usually one short example from each lecture. These questions are
only meant to be good examples to discuss in the limited time available
— you should not assume that these questions are particularly likely to
come up in exams. You should be attempting most of the questions for
each lecture.
problems
These are “easy” revision problems from year 2 Linear Algebra. It is essential
that you are on top of this earlier material and so you are strongly recommended
to do these exercises. Aim to do all of them by the end of week 2 at the latest.
Throughout, Pn denotes the vector space of polynomials of degree ≤ n in a
variable x and M denotes the vector space of 2 × 2 real matrices.
problem 0.1 Which of the following are subspaces of the given vector space?
1. {x ∈ R3 | 2x1 − x2 + x3 = 1} ⊆ R3
2. {x ∈ R3 | x1 = 2x2} ⊆ R3
3. {P ∈ P3 | P(1) = 0} ⊆ P3
4. {A ∈ M | AT = −A} ⊆ M (the “T” denotes matrix transpose).
problem 0.2 For each of the examples in the previous question that is a
subspace, give its dimension and write down a basis for the subspace.
problem 0.3 Calculate the coordinate matrix of x3 with respect to the basis
x3 − x2, x2 − x, x − 1, 1 of P3.
problem 0.4 Use the change of basis matrix to find the coordinate matrix of
x in the basis v1, v2 of R2 where (writing column vectors as transposed rows)
x = (1, 1)T, v1 = (1, −1)T, v2 = (2, 1)T.
problem 0.5 What is the span of a set S = {v1, . . . , vk} of vectors? What
does it mean for the set to be linearly independent?
problem 0.6 Let U = {x | x1 = 0} and V = {x | x2 = 0} be subspaces of R3.
What is the sum U + V of these subspaces? State the Dimension Theorem for
sums of subspaces and verify it in this example. Is this an example of a direct
sum?
problem 0.7 Which of the following are linear maps?
1. T : M → P2 where T(A) = the characteristic polynomial of the matrix A.
2. T : M → R where T(A) = Trace A (Here “Trace” denotes the trace of a
matrix — the sum of the elements on the leading diagonal.)
3. T : P3 → P3 where T : p(x) ↦ p′(x)
problem 0.8 Define the kernel and image of a linear map. State the Rank
Theorem (a.k.a. “Rank-Nullity theorem”) for linear maps. For each of the
examples in the previous question that is linear, describe the kernel and image
and verify the theorem.
problem 0.9 Let A be an n × n matrix and let T : Rn → Rn be the linear
map T : x ↦ Ax. Which of the following conditions are equivalent to A having
an inverse?
1. det A ≠ 0
2. ker T = {0}
3. im T = Rn
4. T is a bijection.
5. A is diagonalisable.
problem 0.10 Find the eigenvalues and eigenvectors of
A =
( 1 2 )
( 1 1 ).
Hence diagonalise A.
lecture 1 sets
1.1 sets and subsets
1.1.1 definition A set is a collection of objects. The objects in a set S are
called elements or members of S. If x is a member of S we write x ∈ S and
if not then we write x ∉ S. Two sets S, T are equal if they have the same
elements.
1.1.2 notation We denote a set by including its elements in braces (curly
brackets). Thus we might write S = {1, 2, 3, 4, 5} to define the set whose
members are the first five natural numbers. We also use the “vertical bar” that
means “such that”. So we could define the interval [0,∞) by
[0,∞) := {x ∈ R | x ≥ 0}.
(Note that we often use “:=” when an equality is defining its left-hand side.)
1.1.3 notation If the set S has a finite number of elements, we refer to that
number as the size of S. We write #S for the size of S.
1.1.4 definition The empty set is the set with no elements. We write it as {}
or as ∅.
1.1.5 definition The set A is a subset of the set B if x ∈ A =⇒ x ∈ B. We
write A ⊆ B if A is a subset of B. The subset A ⊆ B is proper if A ≠ B. We
write A ⊂ B to mean that A is a proper subset of B.
1.1.6 warning Mathematicians are not consistent in notation — some use
A ⊂ B for all subsets, proper or not. Our notation is modelled on the distinction
between “<” and “≤”.
1.1.7 theorem For every set B it is true that {} ⊆ B.
Proof. Every element of {} is also in B because there are no such elements. □
1.1.8 note We sometimes say that conditions such as that in the above proof
“hold vacuously”. If you doubt it, look at it this way. For the statement to be
false there would need to be an element of {} that is not in B. That is certainly
not the case. So the statement is certainly not false.
1.1.9 note We allow sets to be elements of sets. For example, {{}} is not the
same as {}. The first set has an element (which happens to be the empty set)
but the second does not.
1.1.10 definition The power set P(S) of a set S is the set of all subsets of S.
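If you like to experiment, the power set of a small finite set is easy to compute. Here is a throwaway Python sketch (not part of the course); it uses frozenset so that sets can themselves be elements of a set:

```python
from itertools import combinations

def power_set(s):
    """Return the set of all subsets of s, each as a frozenset."""
    elems = list(s)
    return {frozenset(c) for r in range(len(elems) + 1)
            for c in combinations(elems, r)}

P = power_set({1, 2, 3})
print(len(P))            # 8 subsets of a 3-element set
print(frozenset() in P)  # the empty set is a subset of every set
```

Note that power_set(set()) comes out as {frozenset()}, matching the fact that {} has exactly one subset, namely {} itself (compare problem 1.3 below).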
1.1.11 definition One often deals with families of sets. For example, we
might define In to be the closed interval [−n,n] as follows.
In := [−n,n], n ∈ N.
We refer to this as a family of sets with index set N.
1.2 complements and differences
1.2.1 definition The complement A′ of a set A is defined by {x | x ∉ A}.
1.2.2 definition Let A and B be sets. Then the set difference is defined by
A \ B := A ∩ B′.
1.3 intersections and unions
1.3.1 definition The union A ∪ B and intersection A ∩ B of two sets are
defined by
A ∪ B := {x | x ∈ A or x ∈ B}, A ∩ B := {x | x ∈ A and x ∈ B}.
In the definition of union note that as always in mathematics “P or Q” is true
if at least one and possibly both of P and Q are true.
1.3.2 definition Let Aλ, λ ∈ Λ be a family of sets. Then we define the union
and intersection of the family by
⋃λ∈Λ Aλ := {x | x ∈ Aλ for some λ ∈ Λ},
⋂λ∈Λ Aλ := {x | x ∈ Aλ for all λ ∈ Λ}.
1.3.3 example For the family in §1.1.11, the union is R and the intersection
is [−1, 1].
1.3.4 theorem For all sets A,B we have
A ⊆ A ∪ B, B ⊆ A ∪ B, A ∩ B ⊆ A, A ∩ B ⊆ B.
The analogous statements hold for unions and intersections of families.
1.4 set algebra
1.4.1 trivial identities
1. A ∪ B = B ∪A and A ∩ B = B ∩A
2. A ∪ (B ∪C) = (A ∪ B) ∪C and so A ∪ B ∪C is unambiguous. Same for
intersections.
3. {} ∩A = {} and {} ∪A = A
4. (A ′) ′ = A, A ∩A ′ = {}
5. A = A ∩A = A ∪A
1.4.2 identities involving families
B ∩ ⋃λ∈Λ Aλ = ⋃λ∈Λ (B ∩ Aλ),
B ∪ ⋂λ∈Λ Aλ = ⋂λ∈Λ (B ∪ Aλ),
(⋃λ∈Λ Aλ)′ = ⋂λ∈Λ Aλ′,
(⋂λ∈Λ Aλ)′ = ⋃λ∈Λ Aλ′.
1.4.3 example proof Here is a proof of the final identity involving families.
We show first that LHS ⊆ RHS and then that RHS ⊆ LHS.
1. Let x be in the left-hand side. Then x is not in ⋂λ∈Λ Aλ. Thus x is
not in every one of the Aλ. Thus there exists a µ ∈ Λ such that x ∈ Aµ′.
Hence x is in the right-hand side.
2. Now suppose x is in the right-hand side. Then there exists a µ ∈ Λ such
that x ∈ Aµ′. Thus x ∉ Aµ and so x ∉ ⋂λ∈Λ Aλ and so x is in the
LHS.
(Note in the above we have used λ for a “general” element of Λ and µ for a
particular element that arises in the proof.)
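Identities like this can also be sanity-checked on a small finite family by computer. A quick Python sketch (not a proof; complements here are taken relative to a finite universe U, an assumption the abstract identity does not need):

```python
# Check (⋂λ Aλ)' = ⋃λ Aλ' for a small family indexed by Λ = {0, 1, 2}.
U = set(range(10))  # finite "universe" so that complements make sense
family = {0: {1, 2, 3}, 1: {2, 3, 4}, 2: {3, 4, 5}}

def complement(a):
    return U - a

lhs = complement(set.intersection(*family.values()))
rhs = set().union(*(complement(a) for a in family.values()))
print(lhs == rhs)  # True
```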
1.4.4 example Use set algebra to show that A \ (B∪C) = (A \B)∩ (A \C).
Solution:
A \ (B ∪ C) = A ∩ (B ∪ C)′ by defn of set difference
= A ∩ (B′ ∩ C′) by §1.4.2 (3rd identity)
= A ∩ A ∩ B′ ∩ C′ by §1.4.1, parts 2 and 5
= (A ∩ B′) ∩ (A ∩ C′) by §1.4.1
= (A \ B) ∩ (A \ C) by defn of set difference
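The same identity can be confirmed numerically with Python's built-in set operations (a quick check on arbitrary sample sets, not a substitute for the set-algebra argument above):

```python
A = {1, 2, 3, 4, 5}
B = {2, 3}
C = {3, 4, 9}

lhs = A - (B | C)        # A \ (B ∪ C)
rhs = (A - B) & (A - C)  # (A \ B) ∩ (A \ C)
print(lhs == rhs)        # True; both are {1, 5}
```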
1.5 russell’s paradox (optional)
Although it is never an issue in everyday mathematics, you have to be very
careful about talking about sets of sets. Let S be the set of all sets. This does
not sound too worrying except that S has the odd property that S ∈ S. So there
seems to be nothing to stop us defining
R := {A ∈ S | A ∉ A}.
(In words, R is the set of all sets which are not elements of themselves.)
Now ask whether R is an element of itself. Show that R ∈ R =⇒ R ∉ R
and also that R ∉ R =⇒ R ∈ R. We have a complete contradiction, known as
“Russell’s Paradox”. The usual way out of this is not to allow things as wildly
general as S to be called a set.
problems
problem 1.1 Is it true that {} ⊂ B for all sets B?
problem 1.2 For the four properties of set algebra in §1.4.2 above, write down
(next to the original, perhaps) what they reduce to for a family of just two sets.
In each case, give a proof. For at least two of those not proved in lectures or
the text, write down a proof too of the version for families.
problem 1.3 Let S be a finite set. Write down a formula for the size of P(S).
Write down P(S) in the case of S being the empty set. Does the formula work
in this case?
problem 1.4 Let L denote the set of (straight) lines through the origin in
R2. Let M denote the set of (straight) lines through (1, 1) in R2. How many
elements does L ∩M have?
problem 1.5 + Hand-in for tutorial Suppose S = {x, y, z}. True or
False:
1. S ∈ P(S);
2. x ∈ P(S);
3. {x, y} ∈ S;
4. {x, y} ∈ P(S);
5. {{x, y}, {}} ∈ P(S);
6. {{x, y}, {}} ⊆ P(S);
7. {{x, y}, {}} ∈ P(P(S));
8. {{}} ∈ P(P(S)).
What is the size of P(P(S))?
problem 1.6 + Hand-in part 1 only for tutorial Use set algebra to
show that
1. A \ (B ∩ C) = (A \ B) ∪ (A \ C)
2. A \ (B \ C) = (A \ B) ∪ (A \ C ′)
problem 1.7 Show that
A ∪ (A ∩ B) = A.
You will need to argue by showing that the LHS is a subset of the RHS and
that the RHS is a subset of the LHS. Use set algebra to deduce that also
A ∩ (A ∪ B) = A.
These two results (which cannot be deduced from the other identities of set
algebra that we have previously stated) are called the “axioms of absorption”.
Use them together with set algebra as before to show that A \ (B \ A) = A.
lecture 2 cartesian products and functions
2.1 cartesian products
2.1.1 definition An ordered n-tuple is a list (x1, . . . , xn) of n objects, where
the order is important and repetitions are allowed. For n = 2 and n = 3 we use
the terms ordered pair and ordered triple respectively.
2.1.2 definition Let A,B be sets. Then the cartesian product of A and B is
the set
A× B := {(a, b) |a ∈ A and b ∈ B}.
(Here (a, b) is an ordered pair.)
2.1.3 definition More generally, let S1, . . . , Sn be sets. Then their cartesian
product is the set
S1 × · · · × Sn := {(x1, . . . , xn) | xj ∈ Sj for j = 1, . . . , n}.
2.1.4 example The usual definition of Rn is
Rn := R × · · · × R (n factors).
2.2 functions (= maps = mappings = transformations)
2.2.1 definition A function f from A to B is a subset Gf ⊆ A × B with
the property that for each a ∈ A there is one and only one b ∈ B such that
(a, b) ∈ Gf. We write b = f(a) if (a, b) ∈ Gf.
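This definition can be modelled directly on finite sets: represent a candidate graph Gf as a set of pairs and check the “one and only one” condition. A Python sketch (the helper name is ours, not standard):

```python
def is_function(G, A, B):
    """Is the set of pairs G the graph of a function from A to B?"""
    firsts = [a for (a, b) in G]
    return (all(a in A and b in B for (a, b) in G)
            and all(firsts.count(a) == 1 for a in A))

A, B = {1, 2, 3}, {"x", "y"}
G_good = {(1, "x"), (2, "x"), (3, "y")}
G_bad = {(1, "x"), (1, "y"), (2, "x")}  # 1 has two values, 3 has none
print(is_function(G_good, A, B))  # True
print(is_function(G_bad, A, B))   # False
```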
2.2.2 notation We sometimes use the equivalent words map, mapping or
transformation instead of function.
2.2.3 relation with more elementary definition We have previously defined a
function from A to B to be a rule that associates an element f(a) ∈ B to
each a ∈ A. The connection of course is that f(a) is the unique b such that
(a, b) ∈ Gf. The previous definition has two drawbacks:
• What exactly is a “rule”? For example, is “ask Henry” a suitable rule to
determine a function?
• Two functions can have different rules but be equal, for example x ↦
(x + 1)2 and x ↦ x2 + 2x + 1. (One really needs to supplement the old
definition by adding that “two functions are equal if they take the same
values for all x”.)
Our definition is cleaner and more precise (at the expense of being more ab-
stract). The relationship is that if we have defined a function f : A → B by a
rule then the subset of A× B is its graph
Gf = {(a, f(a)) |a ∈ A}.
We will usually specify functions as we always have, by giving a rule for com-
puting f(a) from a.
2.2.4 definition and notation We write f : A → B to denote that f is a
function from A to B. If f(a) = b we write f : a ↦ b. The set A is called the
domain of f and the set B is called the codomain.
2.2.5 note Note well the difference between “→” and “↦”. (Some people
use “→” for both.)
2.2.6 definition Let S be a set. Then the identity function I : S → S is
defined by I(x) = x for all x.
2.2.7 example Let S be a set. Define a map f : S × S → S × S by f :
(x, y) ↦ (y, x). The set of all (x, y) such that f : (x, y) ↦ (x, y) (that is, the
fixed point set of f) is a special subset ∆ of S × S called the diagonal.
Alternatively, it can be defined by
∆ := {(x, x) | x ∈ S}.
2.2.8 example The diagonal ∆ is the graph of the identity map I : S→ S.
2.2.9 definition The maps π1 : X× Y → X and π2 : X× Y → Y defined by
π1 : (x, y) ↦ x, π2 : (x, y) ↦ y
are called the projections from X × Y to X and Y respectively. We sometimes
refer to them as the canonical projections. (A canonical object is something
that arises naturally from the given information without making extra choices.)
2.3 action on subsets
2.3.1 definition Let f : A→ B be a function. Let U ⊆ A and V ⊆ B. Then
we define
f(U) := {f(x) | x ∈ U}, f−1(V) := {x ∈ A | f(x) ∈ V}.
2.3.2 note Both f and f−1 act on subsets and produce subsets. In fact, they
define functions
f̃ : P(A) → P(B), f̃−1 : P(B) → P(A).
2.3.3 example Consider f : R → R with f : x ↦ x2. Then
f({−1, 1, 3}) = {1, 9}, f({−3}) = {9}, f([−2, 2]) = [0, 4], f(R) = [0,∞)
and
f−1({1}) = {−1, 1}, f−1({−3, 4}) = {−2, 2}, f−1([−2,−1]) = {}.
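For finite sets these images and preimages can be computed mechanically. A Python sketch (with a finite sample of R standing in for the real line, so it only illustrates the finite-set parts of the example; the helper names are ours):

```python
def image(f, U):
    """f(U) = {f(x) | x in U}."""
    return {f(x) for x in U}

def preimage(f, V, domain):
    """f^{-1}(V): f need not be invertible, we just search the domain."""
    return {x for x in domain if f(x) in V}

f = lambda x: x * x
D = set(range(-10, 11))  # finite stand-in for R

print(image(f, {-1, 1, 3}))        # the set {1, 9}
print(preimage(f, {1}, D))         # the set {-1, 1}
print(preimage(f, {-3, 4}, D))     # the set {-2, 2}: nothing maps to -3
```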
2.3.4 notation and a warning In mathematics there is always a tension be-
tween precision and readability. Confusing related but different things can
be disastrous, but can also be a huge aid to transparency. We will usually
drop the tilde from the above notation and write just f([−1, 1]) = [0, 1] and
f−1({1}) = {−1, 1}. We may also allow ourselves to confuse elements of a set
with one-element subsets and write f−1(1) = {−1, 1}. Do not however fall into
the trap of thinking that when we use f−1 or f̃−1 like this that f or f̃ has
an inverse. The function f in our example does not have an inverse function
f−1 : R → R.
2.3.5 the axiom of choice (optional) More generally, it is reasonable to define
the cartesian product of an arbitrary (perhaps infinite) family Sλ, λ ∈ Λ of sets
— an element of the product is a choice for each λ of an element of Sλ. The
axiom of choice is the statement that if there exists a family of sets as above
such that each Sλ is non-empty then the cartesian product is non-empty.
You might say that the axiom of choice is obvious since all you need to do
to find an element of the cartesian product is make a choice for each λ. It
cannot, however, be proved or disproved from more basic axioms for set theory,
and so you are free to believe it or not as you wish.
The vast majority of working mathematicians take the axiom of choice to
be “obviously” true and use it without thinking. A small minority doubt it since
it is not clear how you can make what may be an infinite number of choices
simultaneously and so disbelieve things that require it for their proof. These
are often abstract, non-constructive results that prove something exists without
actually demonstrating how to find it.
problems
problem 2.1 Let A,B be finite sets. What is the size of A × B?
problem 2.2 Show that A × (B ∩ C) = (A × B) ∩ (A × C). (Hint: to do
this carefully, show that the LHS is a subset of the RHS and that the RHS is a
subset of the LHS.)
problem 2.3 Decide what relationship holds between the following pairs of
sets (one side is a subset of the other, or they are equal, or there is no relation).
Give a proof.
1. A× (B ∪ C) and (A× B) ∪ (A× C);
2. (A× B) ∩ (C×D) and (A ∩ C)× (B ∩D);
3. (A× B) ∪ (C×D) and (A ∪ C)× (B ∪D).
problem 2.4 Consider the function f : R → R2 given by f : t ↦ (cos t, sin t).
The graph of f is a subset of R1 × R2 = R3. Sketch it.
problem 2.5 Consider the map f : R2 → R given by f : (x, y) ↦ √(x2 + y2).
Describe each of the following using some combination of words, equations or
pictures (proofs not required):
1. f−1(2)
2. f−1([1, 2]) (here [1, 2] is the closed interval)
3. f−1(−1)
4. f(Z) where Z = {(x, y) | (x− 2)2 + (y− 2)2 = 2}
5. f(V) where V = {(x, y) |y > 0}
6. f(U) where U = {(x, y) | x2 − y2 = 1}
7. f(R2)
problem 2.6 + Hand-in for tutorial Let f : X → Y be a function and
let A,B be subsets of X. Show that
f(A ∩ B) ⊆ f(A) ∩ f(B).
(Hint: Your proof should begin “Let y ∈ f(A ∩ B)” and should finish with “and
hence y ∈ f(A) ∩ f(B)”.) Give an example to show that “⊆” cannot be replaced
with equality.
problem 2.7 Let X and Y be finite non-empty sets. Write down a formula for
the number of different functions from X to Y. Now consider the case where
one of X and Y is empty. How many functions are there in that case? Are the
results consistent with your previous formula? (Hint: For the second part use
the definition of function in terms of a subset of the cartesian product.)
lecture 3 more on functions
3.1 injection, surjection, bijection
3.1.1 definition The map f : X → Y is an injection (or “one to one”) if
f(x) = f(y) =⇒ x = y.
3.1.2 definition The map f : X → Y is a surjection (or “onto”) if for all y ∈ Y
there exists x ∈ X such that f(x) = y.
3.1.3 definition The map f : X → Y is a bijection (or “a one to one corre-
spondence” or an “isomorphism of sets”) if it is injective and surjective.
3.1.4 definition The image im(f) of the map f : X→ Y is defined by
im(f) := {f(x) | x ∈ X}.
3.1.5 note In terms of f acting on subsets, im(f) = f(X) and so f is surjective
if and only if f(X) = Y.
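For maps between finite sets the three properties can be tested mechanically. A Python sketch, with a map stored as a dict (its keys being the domain; the helper names are ours):

```python
def is_injective(f):
    """No two keys share a value."""
    values = list(f.values())
    return len(values) == len(set(values))

def is_surjective(f, codomain):
    """Every element of the codomain is hit."""
    return set(f.values()) == set(codomain)

def is_bijective(f, codomain):
    return is_injective(f) and is_surjective(f, codomain)

f = {1: "a", 2: "b", 3: "c"}
g = {1: "a", 2: "a", 3: "b"}
print(is_bijective(f, {"a", "b", "c"}))  # True
print(is_injective(g))                   # False: g(1) = g(2)
```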
3.2 composition and inverses
3.2.1 definition Let f : X → Y and g : Y → Z be maps. We define the
composition g ◦ f : X → Z by g ◦ f : x ↦ g(f(x)).
3.2.2 theorem Composition of maps is associative: let f : W → X, g : X → Y,
h : Y → Z be maps. Then h ◦ (g ◦ f) = (h ◦ g) ◦ f.
Proof. Let w ∈W. Then
(h ◦ (g ◦ f))(w) = h((g ◦ f)(w)) = h(g(f(w)))
and
((h ◦ g) ◦ f)(w) = (h ◦ g)(f(w)) = h(g(f(w))).
The two functions thus give the same value for all w and so are equal. □
3.2.3 definition Let f : X→ Y be a map. A map g : Y → X is an inverse for
f if g ◦ f = IX and f ◦ g = IY .
3.2.4 notation We usually write f−1 for the inverse of a map if one exists. Do
not confuse this with f−1 acting on subsets (as in the previous lecture), which
is well-defined even if f has no inverse.
3.2.5 theorem The map f : X→ Y has an inverse if and only if f is a bijection.
Proof. Same proof as for functions R → R. □
3.2.6 theorem If f, g as above are both injective then so is g ◦ f. If f, g as
above are both surjective then so is g ◦ f. If f, g as above are both bijective then
so is g ◦ f and (g ◦ f)−1 = f−1 ◦ g−1.
Proof.
• For injectivity of g◦f: Let (g◦f)(x) = (g◦f)(y). Then g(f(x)) = g(f(y))
and so f(x) = f(y) since g is injective. Hence x = y since f is injective.
• For surjectivity of g◦f: Exercise - you must begin by saying ”let z ∈ Z” and
you should end by deducing the existence of x ∈ X such that (g◦f)(x) = z.
• Firstly, g◦f is injective since both f and g are injective, and it is surjective
since both f and g are surjective. Hence g ◦ f is bijective. Now check:
(f−1 ◦ g−1) ◦ (g ◦ f) = f−1 ◦ ((g−1 ◦ g) ◦ f) = f−1 ◦ I ◦ f = f−1 ◦ f = I,
and similarly for (g ◦ f) ◦ (f−1 ◦ g−1). So (g ◦ f)−1 = f−1 ◦ g−1.
□
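The formula (g ◦ f)−1 = f−1 ◦ g−1 is easy to confirm on small bijections represented as dicts (a throwaway check, not part of the notes):

```python
f = {1: "a", 2: "b", 3: "c"}     # a bijection X -> Y
g = {"a": 10, "b": 20, "c": 30}  # a bijection Y -> Z

def compose(g, f):
    """g ∘ f as a dict: x -> g(f(x))."""
    return {x: g[f[x]] for x in f}

def inverse(f):
    """Swap keys and values (valid because f is a bijection)."""
    return {y: x for x, y in f.items()}

gf = compose(g, f)
print(inverse(gf) == compose(inverse(f), inverse(g)))  # True
```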
3.3 change of domain and codomain
3.3.1 definition Let f : X → Y and suppose U ⊆ X. Then the restriction of
f to U is the function f|U : U→ Y defined by f|U(x) = f(x) for all x ∈ U.
3.3.2 note So we are simply forgetting that we could apply f to elements
outside U. We often just write f for the restricted function unless there is a
danger of confusion.
3.3.3 example Restriction of the domain can have important effects. The
function sin : R → [−1, 1] is not injective and hence has no inverse. If we
restrict the domain to [−π/2, π/2] it is a bijection and has an inverse, called
arcsin.
3.3.4 definition Let f : X→ Y and suppose V is such that im(f) ⊆ V. Then
we can change the codomain and obtain a map X→ V.
3.3.5 example For the squaring function R→ R, we can restrict the domain
to [0,∞) and change the codomain to [0,∞). We then have a bijection whose
inverse is usually written x ↦ √x.
3.4 back to cartesian products
3.4.1 the problem We would obviously like to think of R1 × R2 as being
“the same as” R3. Unfortunately, being pedantic, it is not. An element of
the first set is something like (−3, (2, 4)), which is an ordered pair whose first
component is a number and whose second component is an ordered pair. On
the other hand, an element of R3 is an ordered triple such as (−3, 2, 4). What
we can say is that there is a canonical (meaning, remember, naturally arising
from the situation) bijection f : Rk × Rl → Rk+l given by
f : ((x1, . . . , xk), (y1, . . . , yl)) ↦ (x1, . . . , xk, y1, . . . , yl).
Having seen all this once, we simply use this canonical bijection to identify
Rk × Rl with Rk+l.
3.4.2 associativity of cartesian product A similar problem arises with carte-
sian products in general. A × (B × C) and (A × B) × C are in principle different.
But if we identify both with A×B×C in the obvious way, then we can regard
them as equal.
3.4.3 commutativity of cartesian products There is a canonical bijection f :
A × B → B × A given by f : (a, b) ↦ (b, a). But here it is important to
maintain the distinction between the two sets.
problems
problem 3.1 Give conditions on the sizes of the subsets f−1(y), y ∈ Y, that
characterize f being (a) injective; (b) surjective; (c) bijective.
problem 3.2 + Hand-in for tutorial
1. Let g ◦ f be injective. Show that f is injective. (Hint: You may find it
best to show that if f is not injective then g ◦ f is not injective.)
2. If g ◦ f is injective, does g have to be injective? Give a proof or a
counterexample.
3. What exactly can be deduced if we know that g ◦ f is surjective?
problem 3.3 Let f : A → A be a map and suppose that f ◦ f = f. What extra
condition on f allows us to deduce that f is the identity map?
problem 3.4
1. Let f : A → B be a map. Suppose there exists a map g : B → A such
that g ◦ f = IA : A→ A. Show that f is injective.
2. Suppose f : A→ B is injective. Deduce that there exists a map g : B→ A
such that g ◦ f = IA : A→ A.
3. Under what circumstances is the map g in the previous part unique?
problem 3.5 State and prove results analogous to the previous exercise that
involve a map h : B → A such that f ◦ h = IB.
problem 3.6 Show that there exists an injection A → B if and only if there
exists a surjection B→ A.
problem 3.7 (Harder!) Let S be a set. Show that there cannot exist a
surjection f : S → P(S). You might proceed as follows. Suppose there is such
a surjection f. Now consider the subset A ⊆ S defined by
A = {x ∈ S | x ∉ f(x)}.
Deduce that A itself is not in the image of f for a contradiction.
problem 3.8 (Optional!) The Schröder–Bernstein Theorem states that if there
are injections A → B and B → A then there exists a bijection A → B. It is
not entirely trivial. Find a proof (Halmos’s “Naive Set Theory” or perhaps the
web) and understand it!
lecture 4 relations and quotients
4.1 relations in general
4.1.1 definition A relation between the sets X and Y is a subset R ⊆ X × Y.
If (x, y) ∈ R then we say y is related to x. A relation on X is a relation between
X and itself.
4.1.2 examples
• {} ⊆ X× Y is the relation where nothing in X is related to anything in Y.
• X×Y ⊆ X×Y is the relation where everything in X is related to everything
in Y.
• The definition of function A → B in §2.2.1 defines a function as a special
sort of relation between A and B.
• The subset {(x, y) | x2 + y2 = 1} ⊆ R × R defines a relation on R. It is
not a function R→ R because some x-values have no y value and some
have more than one.
• The subset {(z,w) | |w| ≥ |z|} ⊆ C × C defines a relation on C. In this
case, w is related to z if and only if its modulus is at least as great as that of z.
• The subset {(m,n) |m − n = 3k for some k ∈ Z} ⊆ Z × Z defines a
relation on Z. Here n is related to m if and only if they have the same
remainder on division by 3.
4.2 equivalence relations
4.2.1 definition An equivalence relation on a set S is a relation such that,
writing x ∼ y if y is related to x we have
1. a ∼ b =⇒ b ∼ a
2. For all a ∈ S it is the case that a ∼ a
3. a ∼ b and b ∼ c =⇒ a ∼ c
4.2.2 notation Hereafter in this lecture we assume that S is a set on which
an equivalence relation is defined.
4.2.3 definition Let a ∈ S. The equivalence class of a is the set
[a] := {b ∈ S |a ∼ b}.
4.2.4 theorem If a ∼ b then [a] = [b]. Otherwise [a] ∩ [b] = {}.
4.2.5 corollary The set S is a disjoint union of equivalence classes. (Note: a
union is disjoint if every pair of sets in the union has empty intersection.)
4.2.6 definition A set of representatives is a subset of S with the property
that it contains precisely one element from each equivalence class.
4.2.7 example Let S = R2 and let x ∼ y if |x| = |y| (the usual modulus of a
vector). This is an equivalence relation. The equivalence classes are (a) all the
circles centred on the origin and also (b) the equivalence class {0} consisting of
just the zero vector. A set of representatives is {(a, 0) |a ≥ 0}.
4.2.8 theorem Let f : S → T be a function. Then a ∼ b if and only if
f(a) = f(b) defines an equivalence relation on S.
4.2.9 examples
1. The example in §4.2.7 arises from the modulus function R2 → R.
2. Consider the squaring function f : R → R. This gives rise to the equiva-
lence relation on R where x ∼ y if and only if x2 = y2. The equivalence
classes are all sets of the form {x,−x} together with the single-element
class {0}. A set of representatives is [0,∞).
4.2.10 example (This does not naturally arise from a function as just dis-
cussed.) Let M be the set of real n × n matrices. For A,B ∈ M, let us say
that A ∼ B iff there exists an invertible n× n matrix P with B = P−1AP. This
defines an equivalence relation on M.
Proof.
1. A = I−1AI and so A ∼ A.
2. If A ∼ B then there exists invertible P with B = P−1AP. Set Q = P−1
which is also invertible. Then A = PBP−1 = Q−1BQ. Thus B ∼ A.
3. If A ∼ B and B ∼ C then there exist invertible P,Q with B = P−1AP and
C = Q−1BQ. Thus C = Q−1P−1APQ = (PQ)−1A(PQ) and so A ∼ C.
□
4.3 quotients
4.3.1 definition Let ∼ be an equivalence relation on X. Define the quotient
X/ ∼ to be the set whose elements are the equivalence classes of ∼.
4.3.2 example Consider T, the set of all times, past, present and future. Con-
sider the equivalence relation on T given by s ∼ t iff s and t differ by an exact
integer multiple of 24 hours. Then the quotient D = T/ ∼ can be thought of
as the set of “times of day”. When we make a statement like “I like a drink at
6 o’clock”, one could argue that the “6 o’clock” refers to an element of D - a
single abstract entity that we construct which is the equivalence class of all 6
o’clocks in all possible days.
4.3.3 example Let Z denote the integers and let a ∼ b iff a − b is a multiple
of 3. This is an equivalence relation. There are three equivalence classes
{. . . , −3, 0, 3, 6, . . . }, {. . . , −5, −2, 1, 4, 7, . . . } and {. . . , −4, −1, 2, 5, 8, . . . }.
A set of representatives is {0, 1, 2} although {−19, 27, 7} would be just as good.
We often use the notation [a] for the equivalence class containing a. The set
of equivalence classes Z3 := Z/∼ is thus Z3 = {[0], [1], [2]}.
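The partition in this example can be computed by brute force on a finite chunk of Z. A Python sketch (the function name is ours; note it quietly relies on transitivity when it compares x against a single member of each existing class):

```python
def classes(elements, related):
    """Partition `elements` into equivalence classes of `related`."""
    result = []
    for x in elements:
        for cls in result:
            # by transitivity, comparing with one representative suffices
            if related(x, next(iter(cls))):
                cls.add(x)
                break
        else:
            result.append({x})
    return result

parts = classes(range(-6, 9), lambda a, b: (a - b) % 3 == 0)
print(len(parts))        # 3 classes, as in the example
print(sorted(parts[0]))  # the visible part of [0]: [-6, -3, 0, 3, 6]
```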
problems
problem 4.1 + Hand-in for tutorial Show that u ∼ v if and only if
u−v ∈ Z defines an equivalence relation on R. Describe [x] for this equivalence
relation and give a set of representatives.
problem 4.2 Does a ∼ b ⇐⇒ a + 2b = 3k for some k ∈ Z define an
equivalence relation on Z? (Check carefully and investigate if you are not sure
— don’t just guess!)
problem 4.3 + Hand-in for tutorial Let a be the vector (1, 1) ∈ R2. Show that
x ∼ y ⇐⇒ x− y = λa for some λ ∈ R
defines an equivalence relation on R2. Sketch the equivalence classes and show
that R = {(x, y) | x+ y = 0} is a set of representatives.
problem 4.4
(a) Show that if x ∈ R2 is non-zero then there exists an invertible 2× 2 matrix
A such that Ae1 = x where e1 is the first standard basis vector in R2.
(b) Use the above to show that given two non-zero vectors x, y ∈ R2 there
exists an invertible 2 × 2 matrix P such that y = Px. (Hint: take x to e1 and then e1 to y.)
(c) Let x ∼ y iff there exists an invertible 2 × 2 matrix A such that y = Ax.
Show that this defines an equivalence relation on R2.
(d) What are the equivalence classes for this equivalence relation? Give a set
of representatives. How many elements does R2/ ∼ have?
problem 4.5 Consider the set X = {(a, b) | a, b ∈ Z and b ≠ 0} (so an
element of X is a pair of integers with the second one non-zero). Show from
the definition that
(a, b) ∼ (k, l) ⇐⇒ al = bk
defines an equivalence relation on X.
lecture 5 the first isomorphism theorem (FIT) for sets
5.1 defining operations on quotients
Fix a natural number n > 1. It will be helpful for us in future to discuss the
essentially trivial fact that addition mod n is a well-defined concept.
Let a ∼ b in Z iff a−b is an integer multiple of n. There are n equivalence
classes - a set of representatives is {0, 1, 2, 3, . . . , n − 1}. We write (as always)
[a] for the equivalence class containing a. Thus [a] ∈ Zn := Z/ ∼. Now, we
can regard addition mod n as an operation defined on the equivalence classes
— in other words as an operation defined on elements of the quotient Zn. Let
us define
[a] + [b] := [a+ b].
There is something that needs thinking about here: if n = 5 then [2] and [7] are
(different ways of describing) the same equivalence class, as are [4] and [9]. Our
definition says that [2] + [4] = [6] and [7] + [9] = [16]. But of course [6] = [16]
and so there is no contradiction here.
So much for waffle: here is what one might write to prove that addition is
well-defined in Zn.
Proof. Let [a] = [a′] and [b] = [b′] so that a ∼ a′ and b ∼ b′. Then there
exist k, l ∈ Z such that a′ − a = kn and b′ − b = ln. Then (a′ + b′) − (a + b) =
(a′ − a) + (b′ − b) = (k + l)n and so a + b ∼ a′ + b′ and hence [a + b] = [a′ + b′].
□
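In Python the check is one line, since a % n picks out the canonical representative of [a]. A sketch with n = 5, where [2] = [7] and [4] = [9]:

```python
n = 5

def cls(a):
    # canonical representative of the equivalence class [a] in Zn
    return a % n

# Adding different representatives of the same classes must land in the
# same class: [2] + [4] = [6] and [7] + [9] = [16], and [6] = [16].
print(cls(2 + 4) == cls(7 + 9))  # True
```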
5.1.1 moral The point of well-definedness in general is this. When we define
an operation on, or a map from, a quotient, we often have to define its action
on the equivalence class [x] by giving some formula involving x directly. In that
case we must check that if [x] = [y] then the formula applied to x gives the
same result as the formula applied to y.
5.2 the “first isomorphism theorem (FIT) for sets”
This is important as a paradigm for the first isomorphism theorems for vector
spaces and groups which we will study later. While FIT for vector spaces and
groups are fundamental results that appear in many books, FIT for sets is mainly
important as a prototype for them.
5.2.1 definition Let ∼ be an equivalence relation on S. The map p : S → S/∼
defined by p : x ↦ [x] is called the canonical surjection.
5.2.2 note The fact that p is a surjection might be regarded as a very easy
theorem, but we will take it to be so obvious that it is part of the definition.
5.2.3 theorem Let R ⊆ S be a set of representatives. Then the restriction
p|R : R → S/∼ is a bijection.
Proof. Trivial consequence of the definition of a set of representatives. □
5.2.4 note Finding a set of representatives involves making a choice, which
in most situations is to some degree arbitrary. The quotient is a sort of “gener-
alised, abstract set of representatives” that avoids this arbitrariness.
5.2.5 theorem (“FIT for sets”) Let f : X → Y be surjective. Define the
equivalence relation ∼ on X by a ∼ b if and only if f(a) = f(b). Then there is
a bijection f̄ : X/∼ → Y such that f̄ ◦ p = f, where p is the canonical surjection
X → X/∼.
5.2.6 note We often express the last condition by saying that the diagram

        f
   X −−−−→ Y
   p ↓   ↗ f̄
   X/∼

commutes. When we say that a diagram of objects connected by maps com-
mutes we mean that if there are two different routes (following the arrows) from
one object to another, then both routes give the same answer.
Proof.
• First we define our map. Let f̄([x]) := f(x). Suppose [a] = [b]. Then by
definition of ∼ we have f(a) = f(b) and so f̄([a]) = f̄([b]) and so f̄ is
well-defined.
• Let f̄([a]) = f̄([b]). Then f(a) = f(b) and so a ∼ b and so [a] = [b] and
hence f̄ is injective.
• Let y ∈ Y. Then there exists x ∈ X such that f(x) = y since f is
surjective. Then f̄([x]) = f(x) = y and so f̄ is surjective.
• Since f̄ is surjective and injective it is a bijection.
• Let x ∈ X. Then
(f̄ ◦ p)(x) = f̄(p(x)) = f̄([x]) = f(x).
Thus f̄ ◦ p = f.
□
5.2.7 corollary In the statement of the theorem, the assumption that f is
surjective can be dropped provided the conclusion is changed to claim that f̄ is
a bijection from X/∼ to the image of f.
5.2.8 example Consider the equivalence relation on R3 given by x ∼ y if and
only if |x| = |y|. (Two vectors are equivalent if they have the same modulus.)
This equivalence relation arises from the surjection f : R3 → [0,∞) where
f : x 7→ |x|. Thus we have a bijection R3/∼ → [0,∞). In other words, the
points in [0,∞) label the equivalence classes of ∼.
problems
done?problem 5.1 Prove that the operation of multiplication is well-defined in Zn.
done?problem 5.2 + Hand-in for tutorial Consider the equivalence relation
x ∼ y ⇐⇒ |x| = |y| on Z. Use the absolute value function (i.e. the modulus
function) and FIT for sets to deduce that the quotient Z/ ∼ can be identified
with N ∪ {0}.
done?problem 5.3 Consider the equivalence relation ∼ on the set X = {(a, b) | a, b ∈ Z and b ≠ 0} as in the problem for lecture 4. Show that setting
[(a, b)] + [(c, d)] = [(ad+ bc, bd)], [(a, b)] ∗ [(c, d)] = [(ac, bd)]
is a well-defined “addition” and “multiplication” on X/ ∼.
done?problem 5.4 This continues from the previous problem. Define a map f :
X → Q (where Q is the rational numbers) by f : (a, b) 7→ a/b. Use FIT
for sets to deduce that X/ ∼ can be identified with Q. Note by the way that
the previous exercise can be taken to be a definition of Q and its arithmetic
operations which does not use the idea of fractions or real numbers.
done?problem 5.5 Let S1 denote the unit circle, thought of as the unit-modulus
complex numbers. Consider the map f : R → S1 defined by f : x 7→ e^{2πxi}. Show
that the equivalence relation on R defined by u ∼ v ⇐⇒ f(u) = f(v) is that
u and v are equivalent if and only if they differ by an integer. Use FIT for sets
to show that R/∼ can be identified with S1.
lecture 6 fields and n-dimensional space
6.1 introduction
The theorems of linear algebra use only some basic algebraic properties of the
scalars. In year 2 we considered the case of real and complex vector spaces and
noticed that generally the definitions, theorems and proofs worked in the same way
for both. The idea of a field is that it is a set of “numbers” that obey the
same algebraic rules as R and C and that are therefore usable as “scalars” for
a vector space.
A “field” then is a set of things that you can add, subtract, multiply and
divide and the rules of algebra are just like those for real or complex numbers.
Familiar examples are
Q (the rational numbers), R (the real numbers), C (the complex numbers).
6.2 definitions and properties
6.2.1 definition (for completeness only - does not need to be memorised) A
field is a set F of objects on which two commutative operations are defined.
These are addition (+) and multiplication (usually just denoted by juxtaposi-
tion). They must obey the following axioms.
• Under addition, F is a commutative (sometimes called “abelian”) group.
In particular there is an additive identity (“zero”) such that a+ 0 = a for
all a ∈ F and every element a must have an additive inverse -a with the
property that a+ (−a) = a− a = 0.
• Let F∗ denote the set of all non-zero elements of F. Then F∗ is a com-
mutative group under multiplication. There is a “multiplicative identity”
1 such that 1a = a for all a ∈ F and every a ∈ F∗ has to have a
multiplicative inverse a−1 with the property that aa−1 = a/a = 1.
• The addition and multiplication satisfy the distributive law: a(b + c) =
ab + ac.
6.2.2 note The notions of modulus (of real or complex numbers) and in-
equalities (such as 2 < 3 in the real numbers) have no analogue in fields in
general.
6.2.3 examples
• Q,R,C are fields.
• Z and R[x] (the set of all polynomials in a variable x) are not fields.
(Most of the elements do not have multiplicative inverses — this is the
most common reason why a set of objects that can be commutatively
added and multiplied fail to be a field.)
• Let Zp denote the integers mod p where p is a prime. Then Zp is a field.
(In particular Z2 = {0, 1} is the smallest possible field.)
The reason Zn is not a field if n is not prime is that one does not have multi-
plicative inverses: e.g. there is no k ∈ Z4 such that 2k = 1 mod 4 so 2 has no
multiplicative inverse. On the other hand, suppose a ∈ Zp where p is prime.
Then gcd(a, p) = 1 and so there exist integers k, l such that ka + lp = 1
(Euclidean algorithm!). Then k is a multiplicative inverse for a.
6.2.4 example In Z7 we have 3 × 5 = 15 = 1 mod 7. So 1/3 = 5 and
1/5 = 3.
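The Euclidean-algorithm argument above is easy to turn into code. Here is a Python sketch (ours; the function names are our own) that finds inverses in Zp via the extended Euclidean algorithm:

```python
def ext_gcd(a, b):
    """Return (g, k, l) with g = gcd(a, b) and g = k*a + l*b."""
    if b == 0:
        return a, 1, 0
    g, k, l = ext_gcd(b, a % b)
    return g, l, k - (a // b) * l

def inverse_mod(a, p):
    """Multiplicative inverse of a in Z_p (requires gcd(a, p) = 1)."""
    g, k, _ = ext_gcd(a % p, p)
    assert g == 1, "a and p must be coprime"
    # k*a + l*p = 1, so k*a = 1 mod p and k is the inverse of a
    return k % p

print(inverse_mod(3, 7), inverse_mod(5, 7))  # 5 3, matching 3*5 = 15 = 1 mod 7
```

For a field as small as Z7 plain experimenting (as problem 6.1 below suggests) is quicker, but this is the method that scales to large primes.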
6.3 subfields
6.3.1 definitions
• A subset K ⊆ F is a subfield if K is a field in its own right with the same
operations as F. In other words, it is a subset of elements of F that includes
both zero and one and that is closed under addition, multiplication and
the taking of additive and multiplicative inverses.
• To check whether K ⊆ F is a subfield you must check that if a, b are in K
then so are a+b, ab,−a and (if a is non-zero) a−1. (Strictly, one needs
also to check that K contains a non-zero element, otherwise we might
have the empty set or {0}.)
6.3.2 examples
• Clearly Q is a subfield of R which is in turn a subfield of C.
• Z3 = {0, 1, 2} is not a subfield of R. Although 0, 1 and 2 are real numbers,
the arithmetic operations in Z3 are not those in R. (Strictly, we should
be writing e.g. “[2]” rather than “2” for the element of Z3. The element
[2] ∈ Z3 is an equivalence class of integers and not at all the same thing
as the real number 2.)
• Q[√2] := {a + b√2 | a, b ∈ Q} is a subfield of R. (The proof is one of
the problems.)
6.4 n-dimensional space over a field
6.4.1 definition Let F be a field. Then

Fn := {(x1, . . . , xn)^T | x1, . . . , xn ∈ F},

the set of all column vectors of height n with entries in F.
6.4.2 notes
• This agrees with the usual definition of Rn and Cn. (In this course we
will always think of n-dimensional space as having column vectors as
elements.)
• Z_p^n has just p^n vectors.
6.4.3 observation All the basic ideas of vector spaces apply to Fn. For ex-
ample:
• In Z_3^2 the subset
U := {x | x1 + 2x2 = 0}
forms a 1-dimensional subspace. (It contains precisely 3 vectors.)
• Let a := (1, 1, 0)^T and b := (1, 1, 1)^T be vectors in Z_2^3. Their span
is a 2-dimensional subspace of Z_2^3. This contains precisely 4 vectors.
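Over a finite field a span can simply be enumerated. A small Python check (our illustration, not part of the notes) of the example above, for a = (1,1,0) and b = (1,1,1) in Z_2^3:

```python
from itertools import product

p = 2
a = (1, 1, 0)
b = (1, 1, 1)

# all linear combinations la*a + mu*b with la, mu running over Z_2
span = {tuple((la * ai + mu * bi) % p for ai, bi in zip(a, b))
        for la, mu in product(range(p), repeat=2)}

print(len(span))  # 4: the zero vector, a, b and a + b = (0, 0, 1)
```

The count 4 = 2^2 reflects the general fact noted above: a d-dimensional subspace over Z_p contains exactly p^d vectors.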
problems
done?problem 6.1 + Hand-in for tutorial Find the multiplicative inverses
of the non-zero elements in Z7. (Just experimenting is probably easier than
using the Euclidean algorithm.)
done?problem 6.2 Show that if L ⊆ K is a subfield then 1, 1 + 1, 1 + 1 + 1, . . . are
all elements of L. (It is tempting to call these 1, 2, 3, . . . but note that (e.g. in
Zp) they are not necessarily all distinct.) Deduce that Zp does not have
any subfields (other than itself).
What do you think the smallest subfield of R is?
done?problem 6.3 + Hand-in for tutorial Do the following equations have
solutions in the fields C,R,Q,Z3,Z2?
x2 + 1 = 0, x2 − x− 1 = 0
Note: what this means in each case is this: is there an element of the given
field such that if you substitute it in to this equation and do all the arithmetic
in that field then you get zero? The answers for the first three fields should be
easy from elementary background knowledge. The last two fields have very few
elements and so you can just experiment.
done?problem 6.4 Show that Q[√2] (definition in notes) is a subfield of R.
done?problem 6.5 Find all the vectors in Z_3^2 that are scalar multiples of
a = (1, 2)^T.
done?problem 6.6 Find the vectors in the subspaces in §6.4.3.
lecture 7 vector spaces — revision
7.1 setting
Vector spaces over a general field F.
7.2 generalizing to arbitrary fields
Almost everything in this lecture should be familiar from Year 2 Linear Algebra.
The difference here is that we are allowing the field to be arbitrary. We will not
prove the results again because the same proofs that work for R and C work
for general fields.
7.3 vector spaces and subspaces
7.3.1 the idea of a vector space A vector space V over a field F (also called
an “F-vector space”) is a set of objects (“vectors”) such that if u, v ∈ V then we
can form their sum u+v ∈ V and if λ ∈ F (we call elements of F “scalars”) then
we can form λv ∈ V. Furthermore these operations obey the familiar algebraic
properties of the corresponding operations in Fn.
7.3.2 proper definition (details do not need to be memorised) What we re-
quire in detail is that V is an Abelian (i.e. commutative) group under the op-
eration of addition of vectors, with an identity element 0 ∈ V. The scalar
multiplication should be compatible with the group operation in the following
ways: for all vectors u, v ∈ V and scalars λ, µ ∈ F we have
• λ(µv) = (λµ)v
• 1v = v
• λ(u+ v) = λu+ λv
• (λ+ µ)v = λv+ µv
7.3.3 examples
• For n ∈ N the “standard n-dimensional space over the field F” is Fn, the
space of all column vectors of height n with entries in F.
• More generally, let V be the set of all m × n matrices with entries in
F. Then this is a vector space over F. (When we take this view, we are
forgetting all the other things we might do with matrices and using only
the fact that matrices can be added (if they are of the same size) and
multiplied by scalars.)
• Let F[x] denote the space of all polynomials in a variable x with coefficients
in F. This is a vector space over F.
• Let K ⊆ F be a subfield. Then we can regard F as a vector space over K.
For example:
– R ⊆ C and so we can regard C as a vector space over R. It is
2-dimensional — a fact that is apparent every time we draw the
complex plane.
– Q ⊆ R and so we can regard R as a vector space over Q. This is in
fact infinite-dimensional.
– Q[√2] is a vector space over Q. It is 2-dimensional.
7.3.4 definition The non-empty subset U ⊆ V of the F-vector space V is a
(vector) subspace if for all x, y ∈ U and λ, µ ∈ F we have
λx+ µy ∈ U
7.3.5 examples
• The set {x ∈ Fn | λ1x1 + · · ·+ λnxn = 0} defines a subspace of Fn.
• Consider the set V = {iy |y ∈ R} ⊆ C. If you regard C as a 2-dimensional
vector space over R then this is a subspace. If on the other hand you
regard C as a 1-dimensional complex vector space, it is not. (Note that V
is closed under vector addition. The difference arises because V is closed
under multiplication by real scalars but not complex ones.)
• The set V = {P ∈ F[x] | P(x) = P(−x)} is a subspace of F[x].
7.4 span
7.4.1 definition Let S be a subset of a vector space V. A linear combination
of elements of S is a finite sum
λ1v1 + · · ·+ λnvn where λj ∈ F, vj ∈ S, n ∈ N
7.4.2 idea The span of a set S of vectors in a vector space V is the smallest
subspace of V that contains all the vectors in S. It is easier to work with the
following.
7.4.3 definition Let S be a subset of a vector space V. Then the span Span(S)
of S is the set of all linear combinations of elements of S. That is,
Span(S) = {λ1v1 + · · ·+ λnvn | λj ∈ F, vj ∈ S, n ∈ N}.
We set the span of the empty set {} to be {0} by convention.
If S = {v1, . . . , vk} is a finite set of vectors then the span is just
Span(S) = {λ1v1 + · · ·+ λkvk | λj ∈ F}.
7.4.4 examples
• In R3 the span of the vectors (1, 1, 0)^T and (1, 0, 0)^T is the subspace
defined by x3 = 0.
• In P3(R) (real polynomials in x of degree ≤ 3) the span of {x, x^3} is the
subspace consisting of all the odd polynomials.
• For any vector space, Span(V) = V.
• If U ⊆ V is a subspace then Span(U) = U. (In fact this is an if and only
if and so this condition characterizes subspaces.)
7.4.5 theorem
• If S ⊆ V is a subset then Span(S) is a subspace of V.
• Let U be a subspace of V and let S ⊆ U be a subset. Then Span(S) ⊆ U.
Combined, these results make sense of Span(S) being the smallest subspace
containing S.
7.5 linear independence, bases
7.5.1 linear independence
• A set S of vectors in a vector space V is linearly dependent if there exist
n ∈ N and distinct vectors x1, . . . , xn ∈ S such that
λ1x1 + · · ·+ λnxn = 0
where the scalars λ1, . . . , λn ∈ F are not all zero.
• If S is not linearly dependent then we say it is linearly independent.
• A set S of vectors is linearly dependent if and only if there is a vector in
the set which is in the span of the other elements of S.
7.5.2 bases and dimension
• A basis for a vector space V (which may be a subspace of some larger
vector space) is a set S ⊆ V of vectors which is linearly independent and
which spans V.
• If V has a basis consisting of a finite number n of elements of V then we
say V has dimension n. Otherwise, we say V has infinite dimension.
7.5.3 coordinates Let V be n-dimensional and let u1, . . . , un be a basis for
V and let x ∈ V be given. Then there exist unique scalars λ1, . . . , λn ∈ F (called
the coordinates of x in the basis) such that
x = λ1u1 + · · ·+ λnun.
The coordinate matrix of x with respect to the basis is the column matrix
(λ1, . . . , λn)^T.
7.6 sums and intersections of subspaces
7.6.1 definition Let U,V be subspaces of an F-vector space X. Then
W = U+ V = {u+ v |u ∈ U, v ∈ V}
is a subspace of X. If U ∩ V = {0} then we say that the sum is direct and we
write W = U⊕V.
7.6.2 theorems
• If W = U ⊕ V (i.e. the sum is direct) then every vector w ∈ W can be
written in one and only one way as w = u + v with u ∈ U and v ∈ V.
• The intersection of two subspaces is itself a subspace.
• If U+ V is finite-dimensional then
dim(U+ V) = dimU+ dimV − dim(U ∩ V).
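Over a finite field the dimension formula can be checked by counting: a d-dimensional subspace of a Z_2-vector space has exactly 2^d elements, so the formula is equivalent to |U + V| · |U ∩ V| = |U| · |V|. A Python spot check (our illustration; the particular subspaces are an arbitrary choice):

```python
from itertools import product

p = 2
dim = 3

def span(gens):
    """All Z_2-linear combinations of the generating vectors."""
    out = set()
    for coeffs in product(range(p), repeat=len(gens)):
        v = tuple(sum(c * g[i] for c, g in zip(coeffs, gens)) % p
                  for i in range(dim))
        out.add(v)
    return out

U = span([(1, 0, 0), (0, 1, 0)])  # a 2-dimensional subspace of Z_2^3
V = span([(0, 1, 1)])             # a 1-dimensional subspace
U_plus_V = {tuple((u[i] + v[i]) % p for i in range(dim)) for u in U for v in V}
U_cap_V = U & V

# |U+V| * |U∩V| == |U| * |V|, i.e. dim(U+V) + dim(U∩V) = dim U + dim V
print(len(U_plus_V), len(U_cap_V))  # 8 1
assert len(U_plus_V) * len(U_cap_V) == len(U) * len(V)
```

Here dim(U + V) = 3 and dim(U ∩ V) = 0, in agreement with 2 + 1 − 0 = 3.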
problems
done?problem 7.1 + Hand-in for tutorial In Z_2^3 find all the vectors in
Span(x, y) where x = (1, 1, 0)^T, y = (0, 1, 1)^T.
done?problem 7.2 Find all the vectors in the subspace V ⊆ Z_2^3 given by V = {x ∈ Z_2^3 | x1 + x2 + x3 = 0}.
done?problem 7.3 Show that in Z_5^3 the vectors (1, 1, 3)^T, (2, 0, 2)^T, (4, 3, 0)^T
are linearly dependent.
done?problem 7.4 Give a basis of the subspace of Z_5^3 defined by the equation
x1 + x2 + x3 = 0. What is the dimension of this subspace? How many vectors
are there in this subspace? Find the coordinate matrix of the vector
v = (4, 3, 3)^T
in your chosen basis.
done?problem 7.5 + Hand-in for tutorial Complete the following sentence
without using any terms from linear algebra. “If R were finite-dimensional as a
vector space over Q it would mean that there exist a finite number r1, . . . , rn
of . . . such that every . . . could be written as . . . ”
done?problem 7.6 + Hand-in for tutorial There are seven different non-
zero vectors in Z_2^3 and hence (since the only scalars are {0, 1}) there are seven
different 1-dimensional subspaces. There are also seven different linear equations
of the form
λ1x1 + λ2x2 + λ3x3 = 0, λj ∈ Z2
apart from the trivial one with all the λj being zero. Each
of these defines a different 2-dimensional subspace of Z_2^3.
In the picture there are seven blobs, and seven lines
(six straight together with a circle). Label each blob
with a different 1-dimensional subspace and each line
with a 2-dimensional subspace in such a way that a
blob is on a line iff the 1-dimensional subspace lies in-
side the 2-dimensional one (i.e. iff the vector satisfies
the equation). (Note by the way that this config-
uration has the property that through every pair of
points there is a unique line and every pair of lines
meet in precisely one point. It is an example of a
“finite projective plane”.)
done?problem 7.7 Challenge How many 2-dimensional subspaces does Z_2^4 have? You might want to think along the lines of defining such a subspace
by choosing a non-zero vector and then choosing another vector that is not a
multiple of it — their span determines a subspace. Count how many ways there
are of doing this and then work out how many times each subspace has been
counted.
lecture 8 linear maps — revision
8.1 setting
Linear maps T : U → V where U,V are finite-dimensional vector spaces over
the same field F.
8.2 generalizing to arbitrary fields
Almost everything in this lecture should be familiar from Year 2 Linear Algebra.
The difference here is that we are allowing the field to be arbitrary. We will not
prove the results again because the same proofs that work for R and C work
for general fields.
8.3 ideas
Whenever you have defined a structure (such as “vector space”), you go on to
consider maps that preserve that structure. Vector spaces are sets which have
addition and scalar multiplication defined. Thus the relevant maps are those
that preserve these operations. That is the real content of the definition of
“linear map” below.
An isomorphism (see below) is a linear map T : U → V that is also a
bijection (i.e. 1-1 and onto); thus it just matches up the elements of U and V
in a way that respects the vector space operations.
8.4 definitions
8.4.1 definition Let U,V be vector spaces over the same field F. Then the
map T : U → V is a linear map or a homomorphism of vector spaces if for all
x, y ∈ U and all λ, µ ∈ F
• T(λx+ µy) = λTx+ µTy.
8.4.2 kernel and image
• The kernel of T as above is
ker T = {u ∈ U | Tu = 0}
which is a subspace of U.
• The image of T is
im T = {v | v = Tu for some u ∈ U}
which is a subspace of V.
• The rank of T is the dimension of im T .
• The Rank Theorem (a.k.a. Rank-Nullity Theorem) states that
dim ker T + dim im T = dimU.
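Over a finite field the Rank Theorem can be seen by counting: a d-dimensional subspace of Z_5^m has 5^d vectors, so the theorem says |ker T| · |im T| = 5^{dim U}. A brute-force Python illustration (ours, not from the notes; the matrix is an arbitrary choice):

```python
from itertools import product

p = 5
A = [[1, 2, 0],
     [0, 1, 4]]  # an arbitrary 2 x 3 matrix over Z_5 (our choice)

def apply_map(A, x):
    """The linear map T : Z_5^3 -> Z_5^2, x |-> Ax, computed mod 5."""
    return tuple(sum(a * xi for a, xi in zip(row, x)) % p for row in A)

domain = list(product(range(p), repeat=3))  # all 125 vectors of Z_5^3
kernel = [x for x in domain if apply_map(A, x) == (0, 0)]
image = {apply_map(A, x) for x in domain}

print(len(kernel), len(image))             # 5 25: dim ker = 1, rank = 2
assert len(kernel) * len(image) == p ** 3  # 5^(dim ker + rank) = 5^(dim U)
```

Here dim ker T + dim im T = 1 + 2 = 3 = dim Z_5^3, as the theorem requires.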
8.4.3 inverses
• A linear map T : U→ V is injective (i.e. 1-1) iff ker T = {0}.
• A bijective (equivalently, “invertible”) linear map T : U → V is called an
isomorphism of vector spaces.
8.4.4 composition
• If T : U→ V and S : V →W are linear maps then the composition
S ◦ T : U→W, (S ◦ T)x := S(T(x))
is also a linear map.
• If also S and T are both invertible (i.e. both isomorphisms) then so is S ◦ T and (S ◦ T)−1 = T−1 ◦ S−1.
8.5 linear maps Fn → Fm
8.5.1 matrices
• Linear maps Fn → Fm are given by matrices — in other words given such
a linear map T there exists an m×n matrix A such that the map is given
by T : x 7→ Ax.
• For a linear map T : Fn → Fm given by T : x 7→ Ax where A is a matrix,
the j-th column of A is the image under T of the j-th standard basis vector in Fn.
Hence, the image of T is the span of the columns of A. We define the
rank of a matrix to be the dimension of the span of its columns, and so
the rank of a matrix is equal to the rank of the corresponding linear map.
• Composition of maps corresponds to multiplication of matrices: if T :
Fn → Fm is given by T : x 7→ Ax and S : Fm → Fp is given by S : y 7→ By
then the composition S ◦ T : Fn → Fp is given by S ◦ T : x 7→ BAx.
8.6 bases as isomorphisms — not revision
8.6.1 theorem Let u1, . . . , un be a basis for V. Then
S : V → Fn, S : x 7→ (x1, . . . , xn)^T
(where x1, . . . , xn are the coordinates of x in the basis) is an isomorphism.
Proof. That the map is a bijection is the existence and uniqueness of coordi-
nates with respect to a basis. The fact that S is linear expresses the fact that
adding and scalar multiplying vectors in V corresponds to the same operations
on the coordinate matrices. □
8.6.2 note In fact a basis is exactly the same thing as an isomorphism S :
V → Fn. Given such an isomorphism, the basis consists of the vectors in V that
map to the standard basis vectors in Fn.
8.6.3 idea The above expresses the underlying idea of bases: a basis is just
a choice of an identification of the vector space with the standard vector space
Fn.
8.7 coordinates — not all revision
8.7.1 definition (revision) Let T : U → V be a linear map between finite-
dimensional vector spaces. Let f1, . . . , fn be a basis for U and g1, . . . , gm be a
basis for V. The matrix of T with respect to these bases is the matrix A whose
k-th column is the coordinate matrix of the vector Tfk ∈ V with respect to the
basis g1, . . . , gm.
8.7.2 idea The bases identify U,V with Fn, Fm respectively (as discussed
above) and identify T with a linear map given by matrices. The situation is
encapsulated by the following commutative diagram (i.e. a diagram where if
there are two routes from one place to another following the arrows, the maps
are equal). The vertical maps are the isomorphisms given by the bases.

         T
   U −−−−−→ V
   ↓          ↓
   Fn −−−−−→ Fm
      x 7→ Ax

problems
done?problem 8.1 + Hand-in for tutorial Consider the linear map T :
Z_2^3 → Z_2^3 with matrix
1 1 0
1 0 1
0 1 1
Find all the vectors in ker T and im T.
done?problem 8.2 How many linear maps T : Z_2^2 → Z_2^2 are there? How many of
them are invertible? (Hint: equivalently, how many 2 × 2 matrices are there
with entries in Z2 and how many have inverses? Remember that a 2 × 2 matrix
has an inverse if its first column is non-zero and the second column is not a
multiple of the first.)
Let A be such an invertible matrix and let
e1 = (1, 0)^T, e2 = (0, 1)^T, e3 = (1, 1)^T.
Show that Aej ≠ Aek unless j = k and that Aej ≠ 0. Deduce that multi-
plication by A permutes the three non-zero vectors in Z_2^2. Find explicitly the
permutation given by each of the six invertible matrices.
done?problem 8.3 Let Pj[Z3] denote the vector space of polynomials of degree ≤ j with coefficients in Z3.
1. State the dimension of Pj[Z3].
2. Show that 1 + x, x + x^2, x^2 is a basis for P2[Z3].
3. Show that T : P2[Z3] → P3[Z3] where T : p(x) 7→ (x + 2)p(x) is a linear
map.
4. Calculate the matrix of T with respect to the basis above for P2[Z3] and
the basis 1, x, x^2, x^3 for P3[Z3].
lecture 9 invariant subspaces and block matrices
9.1 setting
Linear maps V → V where V is a (normally finite-dimensional) vector space
over a field F.
9.2 basic ideas
9.2.1 definition Let T : V → V be a linear map. The subspace U ⊆ V is
invariant under T if T(U) ⊆ U.
9.2.2 examples
• V ⊆ V is always an invariant subspace.
• {0} ⊆ V is always an invariant subspace. (Recall that for linear maps we
have T(0) = 0 always.)
• ker T ⊆ V is an invariant subspace of T : V → V.
9.2.3 definition Let T : V → V be a linear map and let λ ∈ F. Define
Vλ := {v ∈ V | Tv = λv}.
If Vλ ≠ {0} then we say λ is an eigenvalue of T and Vλ is the corresponding
eigenspace. Nonzero elements of Vλ are the eigenvectors of T (with the given
eigenvalue).
9.2.4 theorem The eigenspaces Vλ are invariant subspaces of V.
Proof. That they are subspaces is an easy exercise. Now suppose v ∈ Vλ.
Then Tv = λv ∈ Vλ and so Vλ is invariant. □
9.2.5 note If U ⊆ V is an invariant subspace, then we can restrict T to obtain
a linear map T |U : U→ U. (We will usually abuse our notation and write simply
T for this map.) In the case where U = Vλ is an eigenspace, T |U = λI (where,
as always, I denotes the identity linear map I : x 7→ x).
9.3 block matrices
9.3.1 definition Let M be an n × n matrix. Given k with 1 ≤ k < n, we can
divide our matrix M into blocks A, B, C, D so that

M = ( A  B
      C  D ).

Here, A is a k × k matrix, D is an (n − k) × (n − k) matrix and the other two
are the sizes they have to be. We say that M is written in block form. In some
books, block form is indicated by separating the blocks with dotted or dashed
lines.
9.3.2 theorem Suppose M, M′ are two n × n matrices both written in block
form

M = ( A  B        M′ = ( A′  B′
      C  D ),            C′  D′ )

with A and A′ of the same size so that also D and D′ are of the same size.
Then the product in block form is given by

MM′ = ( AA′ + BC′   AB′ + BD′
        CA′ + DC′   CB′ + DD′ ).

(In other words, the blocks multiply as though they were scalars BUT the terms
in products must be kept in the same order.)
Proof. It is easy to convince yourself that this is true; a formal proof is not very
enlightening. □
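One can also convince oneself numerically. A Python sketch (our illustration, not part of the notes) comparing the top-left block AA′ + BC′ with the corresponding block of the directly computed product, using plain nested lists:

```python
def matmul(X, Y):
    """Ordinary matrix product of nested-list matrices."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def blocks(M, k):
    """Split an n x n matrix into A (k x k), B, C, D."""
    A = [row[:k] for row in M[:k]]
    B = [row[k:] for row in M[:k]]
    C = [row[:k] for row in M[k:]]
    D = [row[k:] for row in M[k:]]
    return A, B, C, D

M1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
M2 = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
k = 1
A, B, C, D = blocks(M1, k)
A2, B2, C2, D2 = blocks(M2, k)

# top-left block of the product, computed two ways
top_left = matadd(matmul(A, A2), matmul(B, C2))  # AA' + BC'
direct = [row[:k] for row in matmul(M1, M2)[:k]]
assert top_left == direct
print(top_left)  # [[30]]
```

The other three blocks can be checked the same way; note the code never swaps the order of the factors inside a block product.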
9.3.3 extensions
• One can write non-square matrices in block form and one can consider
cases where the diagonal blocks are not square.
• There is a general rule here that is hard to give precise form to: in a
product of matrices in block form, if the blocks match up precisely so that
all the matrix multiplications of the blocks make sense, then the product
can be computed as above treating the blocks as though they were scalars
(but maintaining the order of products). For example, with M as above
and a column vector x divided into two blocks u, v of height k, n − k
respectively we have

Mx = ( A  B   ( u        ( Au + Bv
       C  D )   v )  =     Cu + Dv ).

• All the above generalizes to cases where matrices are divided into more
than four blocks.
9.4 relation with invariant subspaces
9.4.1 definition Let M be in block form as in §9.3.1.
• If C = 0 we say that M is block upper-triangular.
• If B = 0 we say that M is block lower-triangular.
• If B = C = 0 we say that M is block diagonal.
9.4.2 theorem Let T : V → V be a linear map. Then T has a k-dimensional
invariant subspace if and only if there exists a basis for V such that the matrix
of T is block upper-triangular with the top-left block of size k× k.
Proof. Suppose first there exists a basis v1, . . . , vn such that the matrix
of T is block upper-triangular with the top-left block of size k × k. Then
Span(v1, . . . , vk) is an invariant subspace.
Conversely, suppose U ⊆ V is an invariant k-dimensional subspace. Choose
a basis v1, . . . , vk for U and extend to obtain a basis for V (always possible by
Year 2 Linear Algebra). Then with respect to this basis the matrix of T is block
upper-triangular. □
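The first half of the proof is easy to see in coordinates. A small Python check (ours; the matrix is an arbitrary example) that a block upper-triangular matrix (C = 0) maps U = Span(e1, . . . , ek) into itself:

```python
# A block upper-triangular matrix: the C block (bottom-left) is zero, k = 2.
k = 2
M = [[1, 2, 5],
     [3, 4, 6],
     [0, 0, 7]]

def apply_map(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

# vectors of U = Span(e1, e2) are zero in every coordinate past position k
for x in ([1, 0, 0], [0, 1, 0], [2, -1, 0]):
    y = apply_map(M, x)
    assert all(c == 0 for c in y[k:])  # the image lies back in U
print("Span(e1, e2) is invariant under M")
```

The zero C block is exactly what forces the last n − k coordinates of Mx to vanish whenever those of x do.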
9.4.3 note The block A in our block upper-triangular matrix is the matrix of
T |U : U→ U in the basis v1, . . . , vk of U.
9.4.4 corollary If Vλ is a k-dimensional eigenspace of T then there exists a
basis for V such that the matrix of T is block upper-triangular with the leading
diagonal block being λI.
9.4.5 theorem Let T : V → V be a linear map. Then V is the direct sum of
two invariant subspaces U,U ′ of dimensions k, n− k if and only if there exists
a basis of V for which the matrix of T is block diagonal with A being k×k and
D being (n− k)× (n− k).
Proof. Similar to the previous theorem - the basis is such that v1, . . . , vk is a
basis for U and vk+1, . . . , vn is a basis for U′. □
9.5 flags
9.5.1 definition A flag in an n-dimensional vector space V is a collection of
subspaces
0 = V0 ⊂ V1 ⊂ · · · ⊂ Vn−1 ⊂ Vn = V, such that dimVk = k, k = 1, . . . , n.
9.5.2 definition Let T : V → V be a linear map. A flag in V is invariant if
each Vk, k = 1, . . . , n is an invariant subspace for T .
9.5.3 theorem Let T : V → V be a linear map. Then there exists an invariant
flag in V if and only if there exists a basis for V such that the matrix of T is
upper-triangular.
Proof. Same idea as before - the basis in this case is such that v1, . . . , vk is a
basis for the subspace Vk in the flag. □
problems
done?problem 9.1 Let T : V → V be linear. Prove ker T is an invariant subspace.
done?problem 9.2 Consider rotations (about the origin) and reflections (in lines
through the origin) in the plane. Describe all 1-dimensional invariant subspaces.
done?problem 9.3 Show that every 1-dimensional invariant subspace is the span of
an eigenvector.
done?problem 9.4 Consider matrix multiplication of block upper-triangular matri-
ces. Using the notation of §9.3.1, show that M is invertible if and only if the
blocks A and D are invertible.
done?problem 9.5 + Hand-in for tutorial Show that T : V → V has an
invariant subspace of dimension l if and only if V has a basis with respect to
which the matrix of T is block lower-triangular.
Deduce that if the matrix M is n × n block upper-triangular and P is n × n
such that Pij = 1 when i + j = n + 1 and zero otherwise, then P−1MP is block
lower-triangular.
done?problem 9.6 Let T : V → V be a linear map and suppose there is a flag {Vk}
in V such that T(Vk) ⊆ Vk−1 for k = 1, . . . , n. Show that T^n = 0.
Show that there exists such a flag for T if and only if there exists a basis
for V such that the matrix of T is strictly upper-triangular (meaning i ≥ j =⇒ Tij = 0).
lecture 10 quotients and the 1st isomorphism theorem
10.1 setting
Vector spaces over a field F, which may be assumed to be finite-dimensional
(and needs to be when we make statements about dimension).
10.2 introduction
The Maple command “series( sin(x)/(x-1) , x=0 );” produces the output

−x − x^2 − (5/6)x^3 − (5/6)x^4 − (101/120)x^5 + O(x^6).

Maple is “working to order x^5”, neglecting terms involving higher powers of x.
One way of expressing this is as follows. Let X be the (infinite-dimensional)
vector space of “formal power series” - the set of all expressions a0 + a1x + a2x^2 +
· · · without worrying whether they converge or not. Let V be the subspace of
those formal power series whose first six coefficients are zero. Now define an
equivalence relation on X by p ∼ q iff p − q ∈ V. (In other words, two series
are equivalent if they agree up to and including the term in x^5.) Then when we
“work to order x^5” we are really working in the quotient X/∼. This quotient is
clearly a vector space: there is a well-defined addition and scalar multiplication
of such power series.
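The quotient arithmetic is easy to carry out directly: represent an element of X/∼ by its first six coefficients and multiply, discarding everything of order x^6 and higher. A Python sketch (ours, not from the notes) reproducing the Maple expansion of sin(x)/(x − 1):

```python
from fractions import Fraction as F

N = 6  # work in the quotient: keep coefficients of x^0 .. x^5 only

def mul(p, q):
    """Multiply two truncated series (length-N coefficient lists) in X/~."""
    r = [F(0)] * N
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j < N:  # terms of degree >= N lie in V and are discarded
                r[i + j] += F(a) * F(b)
    return r

sin_x = [F(0), F(1), F(0), F(-1, 6), F(0), F(1, 120)]  # sin x to order x^5
geom = [F(-1)] * N                                      # 1/(x-1) = -(1 + x + x^2 + ...)

print([str(c) for c in mul(sin_x, geom)])
# ['0', '-1', '-1', '-5/6', '-5/6', '-101/120']
```

The well-definedness of `mul` on the quotient is exactly the point of this lecture: changing a representative by an element of V only changes terms of degree ≥ 6, which the truncation discards.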
10.3 basic definitions
10.3.1 theorem Let V be a subspace of X. Then x ∼ y if and only if x − y ∈ V defines an equivalence relation on X.
10.3.2 example Let V be the subspace x1 + x2 + x3 = 0 of X = R3. Then for
each d ∈ R the set of all vectors x with x1 + x2 + x3 = d forms an equivalence
class. Each equivalence class is thus a plane parallel to the subspace V.
10.3.3 theorem Let V be a subspace of a vector space X over F and let ∼
denote the equivalence relation x ∼ y ⇐⇒ x− y ∈ V. Then
[x] + [y] := [x+ y] and λ[x] := [λx]
are well-defined and make X/ ∼ into a vector space over F. The zero vector is
[0] = V. We call this vector space the quotient of X by V and denote it by
X/V.
Proof. One must check that these operations are well-defined, i.e. if [u] = [u′]
and [v] = [v′] then [u + v] = [u′ + v′] and λ[u] = λ[u′].
Secondly, one has to check that the axioms are obeyed. For instance
[u] + [v] = [u+ v] = [v+ u] = [v] + [u]
and so vector addition is commutative. The rest are equally trivial and we leave
them to the interested reader. □
10.3.4 theorem Let V ⊆ X be a subspace. Then P : X → X/V defined by
P : x 7→ [x] is a surjective linear map with kernel V.
Consequently, by the Rank Theorem for linear maps, if V is a subspace of
X then
dim(V) + dim(X/V) = dim(X).
Proof. One must check that P is linear, that it is surjective and that it has
kernel V. These are all trivial. □
10.4 FIT for vector spaces
10.4.1 the first isomorphism theorem Let T : U → V be a surjective linear
map. Then there is a canonical linear map
T : U/ ker T → V, where T : [u] ↦ Tu
which is an isomorphism of vector spaces and T = T ◦ P.
10.4.2 note The final condition is equivalent to the fact that the diagram

U −T→ V
P ↓ ↗ T
U/ ker T

commutes.
Proof. First, note that T(x) = T(x ′) ⇐⇒ x− x ′ ∈ ker T ⇐⇒ x ∼ x ′ where
∼ is the equivalence relation that defines the quotient vector space. Thus FIT
for sets proves everything except the fact that T is a linear map. For that:
T(λ[x] + µ[x ′]) = T([λx+ µx ′])
= T(λx+ µx ′)
= λTx+ µTx ′
= λT([x]) + µT([x ′])
�
10.4.3 corollary The condition that T is surjective can be dropped from the
statement of the theorem. In that case, the conclusion is that the canonical
linear map T is an isomorphism of U/ ker T with im T .
10.5 bases
10.5.1 theorem Let V ⊆ X be a k-dimensional subspace of an n-dimensional
vector space X. Let f1, . . . , fn be a basis for X such that f1, . . . , fk are a basis
for V. Then
[fk+1], . . . , [fn]
form a basis for X/V.
Proof. They span X/V since if x ∈ X and x = ∑_{i=1}^{n} λi fi then

[x] = [ ∑_{i=1}^{n} λi fi ] = [ ∑_{i=k+1}^{n} λi fi ] = ∑_{i=k+1}^{n} λi [fi],

since [fi] = 0 for i ≤ k. But dim X/V = n − k and so they form a basis. �
10.5.2 FIT and bases In the situation of FIT, choose a basis for U such that
f1, . . . , fk form a basis for ker T . Choose any basis for V. Then the matrix of T
has block form

(0 A).
(We have a generalization of block form here - the matrix being blocked is not
square and we are blocking into just two blocks.) The matrix A is the matrix
of T : U/ ker T → V with respect to the basis [fk+1], . . . , [fn] of U/ ker T and
the given basis of V.
10.6 complementary subspaces
10.6.1 complementary subspaces A complementary subspace to V ⊆ X is a
subspace W ⊆ X such that X = V ⊕W.
It is not hard to see that a complementary subspace is a set of representatives
for ∼ and so if X = V ⊕W then we can identify X/V with W.
The subspace V will have many complementary subspaces and there may
be no good reason to choose one rather than another. (Unless X has an inner
product, in which case the perpendicular subspace is a natural complement.) The
quotient X/V is an abstractly constructed object that lets one work without
making an arbitrary choice.
problems
done?problem 10.1 Let V ⊆ X be a subspace. Check that x ∼ y ⇐⇒ x − y ∈ V does define an equivalence relation on X.
done?problem 10.2 Check that the addition and scalar multiplication defined in
§10.3.3 is well-defined.
done?problem 10.3 Check that the distributive law (λ(x + y) = λx + λy for all
vectors x, y and a scalar λ) holds in X/V.
done?problem 10.4 Write out a careful proof of the fact that P : X → X/V where
P : x ↦ [x] is linear, surjective and has kernel V.
done?problem 10.5 + Hand-in for tutorial Suppose T : X→ Y is a linear
map and that V ⊆ X is a subspace such that V ⊆ ker T . Define a linear map
T : X/V → Y (checking that the map you have defined is linear) such that the
following diagram commutes. (The vertical map is the usual one.) Find the
dimension of the kernel of T in terms of the dimensions of V and ker T . (Hint:
apply the rank theorem to T and T .)
X −T→ Y
↓ ↗ T
X/V
done?problem 10.6 (Harder!) Let U ⊆ V ⊆ X be subspaces of X.
(a) Show that there is a canonical linear map S : V/U→ X/U.
(b) Show that S is injective.
(c) Show that there is a canonical linear map T : X/U→ X/V. Show that T is
surjective.
(d) Show that ker T = imS.
So, if we identify V/U with its image in X/U (reasonable, since S is injective)
then we can deduce that there is an isomorphism

(X/U) / (V/U) → X/V.
(Notation suggestion: write x ∼U y if x− y ∈ U and write [x]U for the equiva-
lence class under this relation. Similarly for V.)
lecture 11 quotients and linear maps
11.1 setting
Finite-dimensional vector spaces over a field F. For the result that for every
linear map T : V → V there exists a basis with respect to which the matrix of
T is upper-triangular, the field is assumed to be C.
11.2 linear maps of quotients
11.2.1 theorem Let T : X→ X be a linear map and let V ⊆ X be a subspace
such that T(V) ⊆ V. Then there is a canonical linear map T : X/V → X/V
such that P ◦T = T ◦P where P is the canonical surjection X→ X/V. The final
condition is just that the following diagram commutes.
X −T→ X
P ↓        ↓ P
X/V −T→ X/V
Proof. Define T by
T : [x] ↦ [Tx].
One just has to show that this map is well-defined, linear and that the diagram
commutes. �
11.2.2 theorem In the situation as above, let f1, . . . , fn be a basis for X such
that f1, . . . , fk is a basis for V. Then the matrix A of T with respect to this
basis is block upper-triangular of the form

A = ( S U
      0 Q )

where S is the k × k matrix of T restricted to a map V → V. The (n − k) × (n − k)
matrix Q is the matrix of the induced map T with respect to the basis [fk+1], . . . , [fn] of
X/V.
Proof. Immediate from the definition of T . �
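A small computational illustration of this theorem (pure Python; the matrix and helper names are illustrative, not from the notes): with respect to a basis whose first k vectors span V, invariance of V is exactly the vanishing of the lower-left block.

```python
def invariant(A, k):
    """Span(e1, ..., ek) is A-invariant iff the lower-left (n-k) x k block is 0."""
    n = len(A)
    return all(A[i][j] == 0 for i in range(k, n) for j in range(k))

def apply(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

# block upper-triangular: S = [[1, 2], [0, 3]] acts on V, Q = [[4]] on X/V
A = [[1, 2, 5],
     [0, 3, 6],
     [0, 0, 4]]
assert invariant(A, 2)                     # V = Span(e1, e2) is invariant
assert apply(A, [1, 1, 0])[2] == 0         # a vector of V is mapped back into V
assert not invariant([[0, 1], [1, 0]], 1)  # a swap does not preserve Span(e1)
```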
11.3 a result for maps Cn → Cn
11.3.1 theorem Let V be an n-dimensional complex vector space and let
T : V → V be a linear map. Then there exists a basis for V with respect to
which the matrix of T is upper-triangular.
Proof. Note first that since by the Fundamental Theorem of Algebra every
polynomial with complex coefficients has a root in C we know that every such T
has an eigenvector. Now we proceed by induction on the dimension n. Clearly
the theorem holds for n = 1. Assume it holds for dimension n − 1 and now
consider dimension n. Let f1 be an eigenvector of T and let U = Span(f1).
Then T(U) ⊆ U and so there exists an induced linear map T : V/U → V/U.
By the inductive hypothesis, there exists a basis [f2], . . . , [fn] for V/U such that
the matrix of the induced map is upper-triangular. Choosing representatives
f2, . . . , fn, the vectors f1, f2, . . . , fn form a basis for V, and by §11.2.2 the
matrix of T with respect to this basis is upper-triangular. �
11.3.2 remark The entries that appear on the diagonal of an upper-triangular
matrix are the eigenvalues.
11.3.3 corollary If V is a finite-dimensional complex vector space and T :
V → V is a linear map, then there exists a flag in V invariant under T .
11.3.4 corollary Let B be an n × n complex matrix. Then there exists an
invertible n× n complex matrix P such that P−1BP is upper-triangular.
11.3.5 corollary A linear map T : V → V is called nilpotent if there exists
k ∈ N such that T^k = 0. If V is complex and T is nilpotent, then there exists a
basis for V such that the matrix of T is strictly upper-triangular. Consequently
T^n = 0.
Proof. If T is nilpotent then all its eigenvalues are zero. (Why?) Then apply
the remark following the theorem. Now we have an invariant flag and further
we have T(Vk) ⊆ Vk−1 for all k. Thus
T^n(V) = T^n(Vn) ⊆ T^{n−1}(Vn−1) ⊆ · · · ⊆ V0 = {0}.
�
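The final chain of inclusions can be checked concretely: a strictly upper-triangular matrix pushes everything one step up the flag, so its n-th power vanishes. A pure-Python sketch with an illustrative 3 × 3 matrix (not from the notes):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# an arbitrary strictly upper-triangular 3 x 3 matrix
N = [[0, 2, 7],
     [0, 0, 5],
     [0, 0, 0]]
N2 = matmul(N, N)
N3 = matmul(N2, N)
assert N2 == [[0, 0, 10], [0, 0, 0], [0, 0, 0]]  # entries move up one diagonal
assert N3 == [[0, 0, 0], [0, 0, 0], [0, 0, 0]]   # T^n = 0 with n = 3
```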
problems
done?problem 11.1 + Hand-in for tutorial The 3× 3 real matrix
A = ( 1  0 −1
      0 −2  0
      1  0  0 )
has a single real eigenvalue. Find a real invertible matrix P such that P−1AP is
block upper-triangular.
done?problem 11.2 Consider the differentiation map D : P3 → P3 where as usual
Pn is the vector space of polynomials of degree ≤ n in a variable x. Show
that D gives rise to a linear map D : P3/V → P3/V where V is the subspace
of constant polynomials. What is the matrix of D with respect to the basis
[x], [x^2], [x^3] of P3/V?
done?problem 11.3 + Hand-in for tutorial Let J : R2 → R2 be rotation
anticlockwise by a rightangle. Does there exist a basis for R2 with respect to
which the matrix of J is upper-triangular? If not, explain where the proof we
gave for complex vector spaces breaks down.
done?problem 11.4 The aim of this question is to prove that if T : V → V is
nilpotent (so that T^k = 0 for some k), then there exists a basis for V such that
the matrix of T is strictly upper-triangular. (The proof in the notes applies only
if V is complex.) Let k be the least natural number such that T^k = 0.
1. Show that if T is nilpotent then T has zero as an eigenvalue.
2. Let U ⊆ V be an invariant subspace for nilpotent T . Show that T :
V/U→ V/U is also nilpotent.
3. Now argue analogously to the proof of the main theorem in the notes for
this lecture.
lecture 12 linear maps V → V — eigenspaces
12.1 setting
Linear maps T : V → V where V is finite-dimensional over a field F.
12.2 characteristic equation
12.2.1 definition Let A be an n× n matrix. Then the degree n polynomial
in the variable x given by
cA(x) = det(A− xI)
is the characteristic polynomial of A.
12.2.2 theorem (revision) The scalar λ is an eigenvalue of A if and only if it
is a root of cA(x).
Proof. λ is an eigenvalue if and only if A − λI has nontrivial kernel, which is
the case if and only if the matrix A − λI is singular (i.e. not invertible), i.e. if
and only if cA(λ) = det(A − λI) = 0. �
12.2.3 theorem Let A,B, P be n×n matrices with P invertible and suppose
that B = P−1AP. Then cA(x) = cB(x).
Proof.
det(P^{-1}AP − xI) = det(P^{-1}(A − xI)P) = det(P^{-1}) det(A − xI) det(P) = det(A − xI).
�
12.2.4 definition If T : V → V is a linear map we can define the characteristic
polynomial cT of T to be cA where A is the matrix of T in some basis. (Since the
previous theorem shows that change of basis does not change the characteristic
polynomial, this is well-defined.)
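For 2 × 2 matrices the characteristic polynomial is just x^2 − Trace(A)x + det(A), and the invariance under change of basis (§12.2.3) can be verified numerically; a pure-Python sketch (the sample matrices are my own):

```python
def char_poly_2x2(A):
    """Coefficients (1, -Trace(A), det(A)) of c_A(x) = det(A - xI)."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return (1, -tr, det)

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2, 1], [1, 3]]
P = [[1, 1], [0, 1]]
Pinv = [[1, -1], [0, 1]]            # P^{-1}, checked by hand
B = matmul(Pinv, matmul(A, P))      # B = P^{-1} A P
assert char_poly_2x2(B) == char_poly_2x2(A) == (1, -5, 5)
```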
12.2.5 notation Let λ ∈ F be an eigenvalue of T : V → V. We write
Vλ := {v ∈ V | Tv = λv}
for the corresponding eigenspace.
12.2.6 definition Let λ be an eigenvalue of T : V → V.
• The geometric multiplicity g of λ is g := dimVλ.
• The algebraic multiplicity a of λ is the multiplicity of λ as a root of cT .
12.3 polynomials in matrices and linear maps
12.3.1 notation Let

p(x) = an x^n + · · · + a1 x + a0, aj ∈ F

be a polynomial. Then if A is a square matrix, we define

p(A) = an A^n + · · · + a1 A + a0 I
(so p(A) is itself a square matrix). More abstractly, if T : V → V is a linear
map, we can define p(T) (which is also a linear map V → V). If the matrix of
T with respect to a basis is A, then the matrix of p(T) is p(A).
12.3.2 theorem Suppose that A is diagonal with entries µj. Then p(A) is
diagonal with entries p(µj).
Proof. Obvious. �
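The evaluation p(A) is easy to compute by Horner's rule, and the diagonal case above can be checked directly; a pure-Python sketch (the helper names are mine):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_of_matrix(coeffs, A):
    """p(A) for p(x) = coeffs[0] x^d + ... + coeffs[-1], by Horner's rule."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = matmul(result, A)
        for i in range(n):
            result[i][i] += c        # ... + c I
    return result

D = [[2, 0], [0, 3]]                 # diagonal with entries 2 and 3
# p(x) = x^2 - 1: p(D) is diagonal with entries p(2) = 3 and p(3) = 8
assert poly_of_matrix([1, 0, -1], D) == [[3, 0], [0, 8]]
```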
12.3.3 theorem Suppose Tv = λv and p is a polynomial. Then p(T)(v) =
p(λ)v.
Proof. Obvious. �
12.4 direct sums of subspaces
12.4.1 definition Let U1, . . . , Uk be subspaces of V. The sum of the sub-
spaces is
U1 + · · ·+Uk := {u1 + · · ·+ uk |uj ∈ Uj}.
12.4.2 definition A sum of subspaces as above is direct if
u1 + · · ·+ uk = 0, uj ∈ Uj =⇒ uj = 0 for all j.
If W is the sum of the Uj and the sum is direct we write

W = U1 ⊕ · · · ⊕ Uk = ⊕_{j=1,...,k} Uj.
12.4.3 theorem Let V = ⊕_{j=1,...,k} Uj and let v ∈ V. Then v can be written
in one and only one way as
v = u1 + · · ·+ uk, uj ∈ Uj.
Proof. That v can be so written is immediate from the definition. For uniqueness,
write v as such a sum in two ways; subtracting, directness forces the two
expressions to agree. �
12.4.4 theorem Let λ1, . . . , λm be distinct eigenvalues of T : V → V. Then
the sum of the eigenspaces Vλj , j = 1, . . . ,m is direct.
Proof. Suppose

v1 + · · · + vm = 0, vj ∈ Vλj . (∗)

Let 1 ≤ j ≤ m. Then we can write

cT(x) = (x − λj)^{aj} Qj(x)

where aj is the algebraic multiplicity of λj. Then λj is not a root of Qj but all
other eigenvalues are. Applying Qj(T) to both sides of (∗) kills every term vi
with i ≠ j (by §12.3.3) and leaves Qj(λj)vj; since Qj(λj) ≠ 0 we see that vj = 0.
�
12.5 relation between multiplicities
12.5.1 theorem Let the n× n matrix M be block upper-triangular
M = ( A B
      0 D ).
Then det M = det A det D.
Proof. Omitted. It is not hard if one takes as the definition of determinant

det M = ∑_{σ a permutation of 1,...,n} (−1)^{|σ|} M_{1σ(1)} · · · M_{nσ(n)}

where |σ| denotes the sign of the permutation σ. �
12.5.2 corollary Let the n× n matrix M be block upper-triangular
M = ( A B
      0 D ).
Then cM(x) = cA(x)cD(x).
Proof. Trivial. �
12.5.3 corollary Let T : V → V and suppose U ⊆ V is an invariant subspace.
Let cU denote the characteristic polynomial of T : U → U. Then cT(x) =
cU(x)Q(x) where Q(x) is the characteristic polynomial of the canonical linear
map T : V/U → V/U.
Proof. Trivial. �
12.5.4 theorem Let λ be an eigenvalue of T : V → V. Then
g ≤ a
where g and a are the geometric and algebraic multiplicities of λ respectively.
Proof. The eigenspace Vλ is a g-dimensional invariant subspace. On that
subspace, T = λI and so the characteristic polynomial of T on this subspace is
(λ − x)^g. By the previous corollary, (λ − x)^g divides cT(x) and so a ≥ g. �
problems
done?problem 12.1 Write out the details of the proof of §12.4.3.
done?problem 12.2 + Hand-in for tutorial Let x1, . . . , xk be non-zero
vectors and let Uj = Span(xj). Show that the sum of the subspaces Uj is direct
if and only if the vectors xj are linearly independent.
done?problem 12.3 Let V = ⊕_{j=1,...,k} Uj be a direct sum of subspaces. Suppose
that we are given a basis for each subspace Uj. Show that the totality of all
these basis vectors forms a basis for V.
done?problem 12.4 Write out the details of the proof of §12.4.4. In particular,
provide more detail on the final sentence.
In the following problems, the idea is to calculate and spot the pattern. Don’t
get too hung-up on proofs!
In the following problems we write Jn(α) for the Jordan matrix which is the
n × n matrix whose i, j-th entry is: α if i = j; 1 if j = i + 1 and 0 otherwise.
Thus for example
J3(−5) = ( −5  1  0
            0 −5  1
            0  0 −5 ).

done?problem 12.5 + Hand-in for tutorial What is the characteristic
polynomial of J3(α)? Find its eigenvalues, and their algebraic and geometric
multiplicity. How does this generalize to Jn(α)?
done?problem 12.6 Consider D : P3 → P3 (where D denotes differentiation and
Pn is the vector space of real polynomials in “x” of degree n or less). Find a
basis with respect to which D has matrix J4(0).
done?problem 12.7 For k ≥ 1, the k-th generalized eigenspace of T : V → V with
eigenvalue λ is
Eλ,k := {v ∈ V | (T − λI)^k v = 0}.
So, for k = 1 the generalized eigenspace is just the eigenspace in the usual
sense.
1. Show that if k ≤ l then Eλ,k ⊆ Eλ,l.
2. Let A = J3(α). Show that Eα,k = V for k ≥ 3. Describe Eα,2 and give
its dimension.
3. In general, what is dimEα,k for the Jordan matrix Jn(α)?
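For experimenting with these problems, the Jordan matrix Jn(α) is easy to generate; a pure-Python sketch (the function name `jordan` is mine):

```python
def jordan(n, alpha):
    """The n x n Jordan matrix Jn(alpha): alpha on the diagonal,
    1 on the superdiagonal, 0 elsewhere."""
    return [[alpha if j == i else (1 if j == i + 1 else 0) for j in range(n)]
            for i in range(n)]

# matches the J3(-5) displayed above
assert jordan(3, -5) == [[-5,  1,  0],
                         [ 0, -5,  1],
                         [ 0,  0, -5]]
```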
lecture 13 the Cayley-Hamilton theorem and the minimal polynomial
13.1 setting
T : V → V is a linear map and V is a finite-dimensional vector space over a field
F. Its characteristic polynomial is cT (x) and its minimal polynomial is mT (x).
(We will usually write the characteristic polynomial using x as the variable in
place of the more familiar λ.)
We will write λ1, . . . , λk for the distinct eigenvalues of T or A but µ1, . . . , µl for the eigenvalues listed with multiplicity.
13.2 the Cayley-Hamilton theorem
13.2.1 theorem (Cayley-Hamilton) Let V be a finite-dimensional vector space
and let T : V → V be a linear map with characteristic polynomial cT (x). Then
cT (T) = 0.
Proof. We give a proof for F = C only. Let V be complex and n-dimensional
and let T : V → V be a linear map. Choose a basis f1, . . . , fn as in lecture
11 with respect to which the matrix A of T is upper-triangular with diagonal
entries the eigenvalues (with multiplicity) µ1, . . . , µn. Then
cT (x) = ±(x− µ1)(x− µ2) . . . (x− µn).
Let V0 = {0} and for 1 ≤ k ≤ n let Vk be the subspace of V spanned by
f1, . . . , fk. Then (see lecture 9) we have T(Vk) ⊆ Vk for all k.
Now consider the matrix of Tj := T − µjI. This is upper-triangular with a
zero in the j-th diagonal entry. We deduce that Tj(Vj) ⊆ Vj−1. Thus
(cT (T))(V) = T1T2 . . . Tn(Vn) ⊆ T1T2 . . . Tn−1(Vn−1) ⊆ · · · ⊆ V0 = {0}.
Thus cT (T) sends all vectors to zero and hence is the zero linear map. �
13.2.2 corollary Let A be an n × n complex matrix. Then cA(A) = 0. (In
words, “a matrix satisfies its own characteristic equation”.)
Proof. This is simply the translation of the theorem into matrix terms. �
13.2.3 corollary The Cayley-Hamilton theorem holds for any field F which is
a subfield of C.
Proof. The matrix version is immediately seen to hold because an n×n matrix
with entries in such a field F is also a complex matrix. �
13.2.4 application The characteristic equation of a 2 × 2 matrix A is

x^2 − Trace(A)x + det(A) = 0.
Substitute A for x (OK by C-H) and rearrange to get
(A− Trace(A)I)A = −det(A)I.
So if det A ≠ 0 we can deduce that

A^{-1} = (1/det A)(Trace(A) I − A)

which reduces to the usual formula.
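Both the 2 × 2 Cayley-Hamilton identity and the resulting inverse formula can be verified numerically; a pure-Python sketch using exact fractions (the sample matrix is my own):

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[Fraction(3), Fraction(1)], [Fraction(2), Fraction(5)]]
I = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
tr = A[0][0] + A[1][1]                       # Trace(A) = 8
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]  # det(A) = 13

# Cayley-Hamilton: A^2 - Trace(A) A + det(A) I = 0
A2 = matmul(A, A)
assert all(A2[i][j] - tr * A[i][j] + det * I[i][j] == 0
           for i in range(2) for j in range(2))

# rearranged inverse: A^{-1} = (1/det A)(Trace(A) I - A)
Ainv = [[(tr * I[i][j] - A[i][j]) / det for j in range(2)] for i in range(2)]
assert matmul(A, Ainv) == I
```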
13.2.5 application - getting eigenvectors by cheating Suppose A is 2×2 with
distinct eigenvalues λ1, λ2. Then
(A− λ1I)(A− λ2I) = 0.
Now, ker(A− λ1I) is the λ1 eigenspace and the equation tells us that
im(A− λ2I) ⊆ ker(A− λ1I).
So the columns of A− λ2I are eigenvectors with eigenvalue λ1.
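A numerical check of the trick (the 2 × 2 matrix here is an illustrative one with eigenvalues 1 and 3, not from the notes):

```python
# A has characteristic polynomial (x - 1)(x - 3): eigenvalues 1 and 3
A = [[2, 1], [1, 2]]
l1, l2 = 1, 3

def apply(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

col = [A[0][0] - l2, A[1][0]]                   # first column of A - 3I: (-1, 1)
assert apply(A, col) == [l1 * c for c in col]   # an eigenvector for eigenvalue 1
```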
13.3 the minimal polynomial
13.3.1 definition A polynomial p(x) is monic if the coefficient of its highest
order term is 1.
13.3.2 definition The minimal polynomial mT of the linear map T : V → V
is the monic polynomial of least degree such that mT (T) = 0.
13.3.3 theorem Let T : V → V be a linear map with minimal polynomial
mT (x). Then every polynomial p(x) such that p(T) = 0 is of the form p(x) =
mT (x)Q(x) for some polynomial Q(x).
Proof. Suppose p(T) = 0. Then we can divide p(x) by mT(x) (polynomial
“long division”) to get

p(x) = mT(x)Q(x) + r(x)

where the remainder r(x) is zero or has degree < deg mT(x). Now, p(T) =
mT(T) = 0 and so r(T) = 0, and if r ≠ 0 this contradicts the minimality of the
degree of mT. �
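Polynomial long division by a monic divisor is itself a short algorithm; a pure-Python sketch (coefficient lists, highest degree first; the names are mine):

```python
def poly_divmod(p, m):
    """Divide p by monic m; returns (q, r) with p = m*q + r and deg r < deg m.
    Polynomials are coefficient lists, highest degree first."""
    p, q = list(p), []
    while len(p) >= len(m):
        c = p[0]                 # leading coefficient of current remainder
        q.append(c)
        for i in range(len(m)):
            p[i] -= c * m[i]     # subtract c * x^k * m(x)
        p.pop(0)                 # leading term is now zero
    return q, p

# x^3 + 2x + 5 = (x^2 + 1) * x + (x + 5)
q, r = poly_divmod([1, 0, 2, 5], [1, 0, 1])
assert q == [1, 0] and r == [1, 5]
```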
13.3.4 corollary The minimal polynomial divides the characteristic polyno-
mial.
13.3.5 theorem If λ is an eigenvalue of T then mT (λ) = 0.
Proof. Let v ≠ 0 be a λ-eigenvector. Then 0 = mT(T)v = mT(λ)v by §12.3.3, and since v ≠ 0 it follows that mT(λ) = 0. �
13.3.6 theorem Let V be a complex vector space and T : V → V a linear
map with distinct eigenvalues λ1, . . . , λk. Then
cT(x) = ±(x − λ1)^{a1} . . . (x − λk)^{ak}

where the aj are the algebraic multiplicities. Then

mT(x) = (x − λ1)^{m1} . . . (x − λk)^{mk}

where for each j we have 1 ≤ mj ≤ aj.
Proof. Follows immediately from what we have just done. �
13.3.7 important note The definition of minimal polynomial and the theo-
rems have completely analogous statements in terms of square matrices rather
than linear maps. We will use both forms interchangeably.
problems
In the problems for this lecture the idea is to calculate and spot the pattern.
Don’t get hung-up on proofs!
In the following problems we write Jn(α) for the Jordan matrix which is the
n × n matrix whose i, j-th entry is: α if i = j; 1 if j = i + 1 and 0 otherwise.
Thus for example
J3(−5) = ( −5  1  0
            0 −5  1
            0  0 −5 ).

done?problem 13.1 What is the minimal polynomial of J3(α)? How does this
generalize to Jn(α)?
done?problem 13.2 + Hand-in for tutorial Consider ODEs of the form
x′ = Ax where x(t) = (u(t), v(t))^T and A is a constant 2 × 2 real matrix.
Recall from CVD that one can solve such systems when A is real-diagonalizable
(“nodes”, “saddles” and “stars”) and complex-diagonalizable (“foci” and “centres”).
1. Let A be a real 2 × 2 matrix that is not diagonalizable by real or complex
P. Show that the minimal polynomial of A is equal to the characteristic
polynomial, which is of the form (x − λ)^2, where λ is the eigenvalue.
2. Deduce (Cayley-Hamilton) that if f2 is not an eigenvector then f1 :=
(A− λI)f2 is.
3. Show that the matrix of x ↦ Ax in a basis f1, f2 (as in the previous part)
is J2(λ) and hence that A = PJ2(λ)P−1 where P has f1, f2 as columns.
4. Show that

exp(tJ2(λ)) = ( e^{tλ}  t e^{tλ}
                0        e^{tλ} ).
5. Show that exp(tA) = P exp(tJ2(λ))P−1 where P is the matrix with
columns f1, f2.
6. Solve
u ′ = −3u− v, v ′ = 4u+ v.
(Recall from CVD that the general solution of x′ = Ax is x(t) = exp(tA) (C1, C2)^T.)
It is purely optional for this course, but you might like to figure out what the
phase portraits for these systems look like.
Remark: this approach to the solution obscures the relationship of the
solutions to the basis. Alternatively we can observe that

e^{λt} f1, e^{λt}(t f1 + f2)

are two independent solutions.
done?problem 13.3 A matrix is said to be in Jordan form if it is block-diagonal
with each diagonal block being a Jordan matrix. (There may be more than one
block with a given parameter value α.) So for example the 7 × 7 matrix

( J3(5)  0      0
  0      J1(5)  0
  0      0      J3(−2) )

is in Jordan form. (Note that J1(α) is the 1 × 1 matrix (a.k.a. “number”) α
and so a diagonal matrix is an example of Jordan form where all the blocks are
of size 1.)
1. Consider a matrix A in Jordan form with just two Jordan blocks
Jp(α), Jq(β) where α 6= β. Find the characteristic polynomial, eigenval-
ues, minimal polynomial, and dimensions of the generalized eigenspaces
of A (definition is in a problem for lecture 12).
2. The same, only now assume that α = β.
3. Conjecture how this generalizes to a general matrix in Jordan form.
done?problem 13.4 Find two matrices A,B in Jordan form which have the same
minimal polynomial, characteristic polynomial and dimensions of the (ordinary,
not generalized) eigenspaces and such that A ≠ B (and neither do A,B differ
only by a change in the order of the blocks down the diagonal).
13.3.8 something worth knowing In fact, every complex square matrix is sim-
ilar to one in Jordan form. Some thought shows that the Jordan form is de-
termined (up to choosing the order of the blocks) by the dimensions of all
the generalized eigenspaces and so this provides a solution to the classification
problem for complex square matrices: two complex matrices are similar iff they
have the same eigenvalues and the same dimension for all the corresponding
generalized eigenspaces.
Knowing the characteristic and minimal polynomials is not enough, as problem 13.4 demonstrates.
lecture 14 a diagonalizability theorem
14.1 setting
T : V → V is a linear map and V is a finite-dimensional vector space over a field
F. Its characteristic polynomial is cT (x) and its minimal polynomial is mT (x).
(We usually write the characteristic polynomial using x as the variable in place
of the more familiar λ.)
We will write λ1, . . . , λk for the distinct eigenvalues of T or A but µ1, . . . , µl for the eigenvalues listed with multiplicity.
14.2 diagonalizability
14.2.1 remark When we consider (say) T : Rn → Rn, an eigenvector is a
real vector and it has a real eigenvalue. It may be the case that the matrix of
T has complex “eigenvalues” and “eigenvectors”. These are NOT eigenvalues
and eigenvectors for T but for the map Cn → Cn with the same matrix.
14.2.2 definition We say that T : V → V is diagonalizable if there exists a
basis for V such that the matrix of T is diagonal. (Equivalently, if and only if
there exists a basis for V consisting of eigenvectors of T .)
14.2.3 theorem A linear map T : V → V is diagonalizable if and only if V is
the sum (necessarily direct) of the eigenspaces of T .
Proof. Obvious. �
14.3 interpolating with polynomials
14.3.1 theorem Let λ1, . . . , λk ∈ F be distinct. Define polynomials (of degree
k − 1)

pj(x) = ∏_{i≠j} (x − λi)/(λj − λi), j = 1, . . . , k.
Let q(x) be a polynomial of degree less than k. Then
q(x) = q(λ1)p1(x) + · · · + q(λk)pk(x).
Proof. The polynomials pj(x) satisfy

pj(λi) = 1 if i = j, and 0 otherwise.
The two sides of the claimed equation are thus equal at all the points λj and
are hence equal everywhere, since both sides are polynomials of degree less than
k (otherwise the left-hand side minus the right-hand side would be a non-zero
polynomial of degree < k with k roots). �
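The identity is easy to check numerically; a pure-Python sketch using exact fractions (the sample points and polynomial are my own):

```python
from fractions import Fraction

def p_j(lams, j, x):
    """Evaluate p_j(x) = prod over i != j of (x - lam_i)/(lam_j - lam_i)."""
    val = Fraction(1)
    for i, li in enumerate(lams):
        if i != j:
            val *= Fraction(x - li, lams[j] - li)
    return val

lams = [0, 1, 2]
def q(x):                        # any polynomial of degree < 3
    return 3 * x**2 - x + 4

for x in range(-3, 4):           # equality of polynomials, checked at sample points
    assert sum(q(l) * p_j(lams, j, x) for j, l in enumerate(lams)) == q(x)
```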
14.4 the main theorem
14.4.1 theorem Let V be finite-dimensional and let T : V → V be a linear
map. Then T is diagonalizable if and only if the minimal polynomial mT (x) of
T factorises as a product of distinct linear factors:
mT (x) = (x− λ1)(x− λ2) . . . (x− λk), λ1, . . . , λk ∈ F.
14.4.2 notes
• If V is n-dimensional and the characteristic polynomial has n distinct
roots in F, then T is trivially diagonalizable by taking a basis consisting of
the corresponding eigenvectors. But in this case, we know (by §13.3.6)
that mT = cT so this is consistent.
• The main point in all this is the case where the characteristic polynomial
does factorise into linear factors but has repeated roots:
cT(x) = ±(x − λ1)^{a1}(x − λ2)^{a2} . . . (x − λk)^{ak}.
Then the theorem says that T is diagonalizable if and only if the minimal
polynomial has each factor to the power 1 only.
14.4.3 examples
• Example application. Suppose a square matrix A has characteristic poly-
nomial (x − 1)^2(x + 2). Then A is diagonalizable if and only if its minimal
polynomial is (x − 1)(x + 2); i.e. if and only if (A − I)(A + 2I) = 0.
• Let

A = ( 1 1
      0 1 ),  B = ( 1 0
                    0 1 ).
Both matrices have characteristic polynomial (x − 1)^2. The matrix A
has minimal polynomial (x − 1)^2 (since the only other monic possibilities
are 1 and (x − 1), and by inspection A satisfies neither of the corresponding
equations). Thus A is not diagonalizable (as one can easily check by
explicitly showing that it has only one linearly independent eigenvector).
The matrix B on the other hand has minimal polynomial (x − 1) and so
is diagonalizable (as is glaringly obvious since it is diagonal already).
• The linear map T : R2 → R2 given by the “rotate by a right-angle” matrix
J = ( 0 −1
      1  0 )
has characteristic polynomial x^2 + 1. This does not factorise with real
coefficients and so as a linear map R2 → R2, T is not diagonalizable.
On the other hand x^2 + 1 = (x + i)(x − i) and so the linear map C2 → C2
given by x ↦ Jx is diagonalizable.
In concrete terms, there exist complex invertible matrices P such that
P−1JP is diagonal, but not real ones.
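The theorem gives a purely mechanical diagonalizability test once the distinct eigenvalues are known: multiply out ∏(A − λjI) and see whether it vanishes. A pure-Python sketch applied to the matrices A and B above (helper names are mine; it assumes the list contains all eigenvalues over the field in use):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diagonalizable(A, eigenvalues):
    """Given ALL distinct eigenvalues of A in the field being used,
    A is diagonalizable iff prod_j (A - lambda_j I) is the zero matrix."""
    n = len(A)
    prod = [[int(i == j) for j in range(n)] for i in range(n)]   # identity
    for l in eigenvalues:
        shifted = [[A[i][j] - (l if i == j else 0) for j in range(n)]
                   for i in range(n)]
        prod = matmul(prod, shifted)
    return all(x == 0 for row in prod for x in row)

assert not diagonalizable([[1, 1], [0, 1]], [1])   # the matrix A above
assert diagonalizable([[1, 0], [0, 1]], [1])       # the matrix B above
```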
14.5 proof of main theorem
14.5.1 proof — easy direction Proof that if T is diagonalizable then the
minimal polynomial is as stated.
Proof. Let A be the diagonal matrix representing T in a basis of eigenvectors.
Let
cT(x) = ±(x − λ1)^{a1}(x − λ2)^{a2} . . . (x − λk)^{ak}.
Set
p(x) = (x− λ1)(x− λ2) . . . (x− λk).
Then p(A) is diagonal with diagonal entries p(µj) = 0, so p(T) = 0. By §13.3.3
mT divides p, and since every λj is a root of mT (§13.3.5) we get mT = p. �
14.5.2 proof — hard direction Proof that if the minimal polynomial is as
stated then T is diagonalizable.
Proof. By §14.2.3 it is enough to show that V is the sum of the eigenspaces
of T . Let λ1, . . . , λk be the distinct eigenvalues of T and let the polynomials
pj(x) be defined as in §14.3.1. Applying that theorem to q(x) = 1 we
deduce that
p1(x) + · · ·+ pk(x) = 1
and so
p1(T) + · · ·+ pk(T) = I. (∗∗)
Define Tj := T − λjI. Then we note that

pj(T) = αj ∏_{i=1,...,k, i≠j} Ti where αj ∈ F is a nonzero scalar.

Since T1T2 . . . Tk = mT(T) = 0 we see that

im pj(T) ⊆ ker Tj = the λj-eigenspace of T.
Now let v ∈ V and apply (**) to deduce that
v = p1(T)(v) + · · ·+ pk(T)(v)
and so v is a sum of eigenvectors of T as required. �
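The projections pj(T) in this proof can be computed explicitly for a small example; a pure-Python sketch using an illustrative 2 × 2 matrix with eigenvalues 1 and 3 (not from the notes):

```python
from fractions import Fraction

A = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(2)]]   # eigenvalues 1, 3
l1, l2 = 1, 3
I = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]

# p1(x) = (x - l2)/(l1 - l2),  p2(x) = (x - l1)/(l2 - l1)
P1 = [[(A[i][j] - l2 * I[i][j]) / (l1 - l2) for j in range(2)] for i in range(2)]
P2 = [[(A[i][j] - l1 * I[i][j]) / (l2 - l1) for j in range(2)] for i in range(2)]

# the identity (**): p1(T) + p2(T) = I
assert [[P1[i][j] + P2[i][j] for j in range(2)] for i in range(2)] == I

def apply(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

v = [Fraction(5), Fraction(-1)]
v1, v2 = apply(P1, v), apply(P2, v)
assert [a + b for a, b in zip(v1, v2)] == v        # v = p1(T)v + p2(T)v
assert apply(A, v1) == [l1 * c for c in v1]        # p1(T)v lies in the 1-eigenspace
assert apply(A, v2) == [l2 * c for c in v2]        # p2(T)v lies in the 3-eigenspace
```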
14.6 corollaries
14.6.1 corollary In the above proof, in fact im pj(T) = Vj, the λj-eigenspace.
(Consider the final equation where v ∈ Vj.) Consequently, if A is a diagonalizable
matrix then the eigenspace Vj is the column span of the matrix

(A − λ1I) . . . (A − λkI) where the λj-term is omitted.
14.6.2 corollary Let T : V → V be a diagonalizable linear map and let U ⊆ V
be a subspace such that T(U) ⊆ U. Then T : U → U is diagonalizable.

Proof. Let mT(x) be the minimal polynomial of T : V → V, which we know
to be a product of distinct linear factors. Then (mT(T))(u) = 0 for all u ∈ U,
so the minimal polynomial of T : U → U divides mT(x), hence is itself
a product of distinct linear factors, and hence T : U → U is diagonalizable. �
14.6.3 theorem Let T : V → V and S : V → V be diagonalizable linear maps
and suppose ST = TS. Then there exists a basis for V with respect to which
the matrices of S and T are both diagonal.
Proof. Let Vj be the λj eigenspace of T . Let v ∈ Vj. Then T(Sv) = S(Tv) =
λjSv and so S(Vj) ⊆ Vj. By §14.6.2 S : Vj → Vj is diagonalizable. Choose a
basis for each Vj consisting of eigenvectors of S. Then the union of these is a
basis for V consisting of vectors that are eigenvectors of both S and T . �
problems
done?problem 14.1 Use the process described in §14.3.1 to find quadratic polyno-
mials p1, p2, p3 such that for every quadratic polynomial q we have
q(x) = q(0)p1(x) + q(1)p2(x) + q(2)p3(x).
done?problem 14.2 For what values of k is the matrix
M := ( 1  1  2
       0 −2  k
       0  0  1 )

diagonalizable?
done?problem 14.3 A is a 2× 2 matrix and λ is an eigenvalue of A. Also
A − λI = (  2 −3
           −4  6 ).
Find a basis for R2 consisting of eigenvectors of A.
done?problem 14.4 Let
Rθ = ( cos θ  − sin θ
       sin θ    cos θ ), θ ∈ [0, 2π].
If you calculate you will find that the different Rθ have different (complex)
eigenvalues but the same (complex) eigenvectors. How does this relate to the
theory in this lecture?
done?problem 14.5 + Hand-in for tutorial Use the main theorem to check
that
A = ( −2 −4  2
       3  6 −1
      −6 −4  6 )
is diagonalizable. Find a basis v1, v2, v3 with respect to which the matrix of
u ↦ Au is diagonal and hence write down a matrix P such that P−1AP is
diagonal. You are given that cA(x) = (x − 2)(x − 4)^2. (You can (and should)
do all this without explicitly solving (A − λI)v = 0 for eigenvectors.)
done?problem 14.6 Continuing the previous exercise, let
B = ( −3 −4  3
       5  6 −3
      −5 −4  5 ).
Check that AB = BA. Find a basis that diagonalizes both A and B. A
suggested strategy is as follows. Use your change of basis matrix P from the
previous exercise (that diagonalizes A) to find the matrix of x ↦ Bx in the
basis v1, v2, v3 (Maple, perhaps). Now you should discover that B is block-
diagonal and you just need to change basis again within the 2-dimensional
eigenspace. (There are other ways: you could think about the intersection of
the 2-dimensional eigenspaces of the two matrices, for example.)
lecture 15 bilinear and quadratic forms on R-vector spaces
15.1 setting
Finite-dimensional vector spaces over R. If b is a symmetric bilinear form (SBF)
we write β for the associated quadratic form and B for the matrix of b with
respect to some basis.
15.2 motivation
Consider in R2 the function
α(x) = x1^2 − x2^2, x = (x1, x2)^T ∈ R2
which as we will see is an example of a “quadratic form” on R2 — such things
are essentially just polynomials in the coordinates with each term having total
degree 2.
Consider also the quadratic form
β(y) = y1y2, y = (y1, y2)^T ∈ R2.
This is in fact “essentially the same” quadratic form because under the change
of coordinates
y1 = x1 + x2, y2 = x1 − x2
the forms are equal. On the other hand, no such change of coordinates can ever
make either of these forms equal to the form
γ(z) = z1^2 + z2^2.
This is easy to see in this simple case because γ(z) = 0 only for z = 0 whereas
there are non-zero vectors x ∈ R2 such that α(x) = 0, but it is useful to have
some general theory here on when quadratic forms are equivalent under change
of coordinates. We will address this issue over the next three lectures.
15.3 definition
15.3.1 definition A symmetric bilinear form (hereafter SBF) on a real vector
space V is a function b : V × V → R which satisfies
• b(u, v) = b(v, u) for all u, v ∈ V (i.e. b is “symmetric”)
• b(λu+ µv,w) = λb(u,w) + µb(v,w) for all u, v,w ∈ V and λ, µ ∈ R.
15.3.2 note The second axiom says that an SBF is linear in the first entry
with the second entry held fixed (i.e. fixing y, the map x ↦ b(x, y) is linear
from V to R).
15.3.3 theorem Let b be an SBF on V. Then b is linear in the second entry
(hence the word “bilinear” in the definition):
b(w, λu+ µv) = λb(w,u) + µb(w, v) for all u, v,w ∈ V and λ, µ ∈ R.
Proof. Trivial (see problems). �
15.4 coordinates
15.4.1 SBF’s on Rn On Rn, SBFs are given by symmetric matrices: let B
be a symmetric n× n real matrix. Then
b(x, y) = x^T B y

is a symmetric bilinear form on Rn (where x, y are both column vectors).
15.4.2 definition Let f1, . . . , fn be a basis for V. The matrix of b with respect
to the basis is the symmetric n× n matrix B with Bij = b(fi, fj).
15.4.3 theorem let B be the matrix of the SBF b with respect to a basis.
Suppose that x, y are the coordinate matrices of the vectors u, v ∈ V. Then
b(u, v) = xT By.
Proof. We must establish the claimed property. Write xj, yj for the coordinates
of u, v with respect to the given basis. Then by linearity in both entries
b(∑_{j=1}^n xj fj, ∑_{k=1}^n yk fk) = ∑_{j=1}^n ∑_{k=1}^n xj yk b(fj, fk) = ∑_{j=1}^n ∑_{k=1}^n xj Bjk yk = x^T B y.   □
15.5 the associated quadratic form
15.5.1 definition Let b be a SBF on V. The associated quadratic form β is
the function
β(v) = b(v, v).
15.5.2 note In coordinates, β(v) is a polynomial in the coordinates of v with
each term having degree exactly two.
15.5.3 theorem An SBF determines a quadratic form. The SBF is recoverable
from the quadratic form by the polarization identity
b(x, y) = (1/4)(β(x + y) − β(x − y)).
Proof. Expand the right-hand side. □
(Thus, SBFs and quadratic forms are really just the same thing.)
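A numeric check of the polarization identity (a sketch; the random matrix and vectors are our own):

```python
import numpy as np

rng = np.random.default_rng(2)
S = rng.normal(size=(3, 3))
B = S + S.T                                  # matrix of an SBF b
beta = lambda v: v @ B @ v                   # the associated quadratic form
x, y = rng.normal(size=(2, 3))

# the polarization identity recovers b from beta
assert np.isclose(x @ B @ y, 0.25 * (beta(x + y) - beta(x - y)))
```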
15.6 examples
15.6.1 examples
• On R2 using the standard basis the general SBF is given by a general
symmetric matrix B as
b(x, y) = x^T B y = l x1y1 + m(x1y2 + x2y1) + n x2y2,   B = ( l m ; m n ).
The associated quadratic form is
β(x) = l x1^2 + 2m x1x2 + n x2^2.
Comparing the formulas should make the relationship clear.
• The standard inner product on Rn is an SBF given by the identity matrix
and the associated quadratic form is ||x||^2.
• b(X, Y) = Trace(XY) defines an SBF on the vector space of n × n real
matrices.
15.7 properties
15.7.1 theorem Let b be an SBF on V and let X ⊆ V be a subspace. Then
b restricted to X defines an SBF on X.
Proof. Obvious. □
15.7.2 definition
• An SBF on V is positive-definite if for all v ≠ 0 in V we have b(v, v) > 0.
• An SBF on V is negative-definite if for all v ≠ 0 in V we have b(v, v) < 0.
• An inner product on V is a positive-definite SBF on V.
problems
done?problem 15.1 Prove that an SBF is linear in the second entry (§15.3.3).
done?problem 15.2 Check the claim in §15.4.1 that if B is a symmetric n×n matrix
then b(x, y) = x^T B y defines an SBF on Rn, where x, y are column vectors as
usual. (Hint: for the symmetric part, take the transpose of the 1 × 1 matrix
x^T B y.)
done?problem 15.3 Consider the SBF b on R2 with matrix
B = ( 2 3 ; 3 1 ).
Find a vector v ≠ 0 in R2 such that b(v, v) = 0. Give a sketch showing the
regions in the plane where b(x, x) is positive, negative and zero.
done?problem 15.4 Consider the SBF on R3 with matrix
B = ( 1 0 0 ; 0 1 0 ; 0 0 −1 ).
Sketch the regions in R3 where b(x, x) is positive, negative and zero.
done?problem 15.5 + Hand-in for tutorial Let V be the vector space of
2 × 2 matrices with real entries and trace zero. Consider the SBF (known, by
the way, as the “trace form”) b(X, Y) = Trace(XY) on V. Find the matrix of b
with respect to the basis
( 0 1 ; 0 0 ),   ( 0 0 ; 1 0 ),   ( 1 0 ; 0 −1 )
of V.
lecture 16 diagonalizing SBF’s
16.1 setting
Finite-dimensional vector spaces over R. If b is a symmetric bilinear form (SBF)
we write β for the associated quadratic form and B for the matrix of b with
respect to some basis.
16.2 revision
16.2.1 orthogonal matrices Recall that an n×n matrix P is said to be orthogonal
if P^T = P^{-1} or equivalently if the columns form an orthonormal basis for
Rn (with respect to the standard inner product).
16.2.2 diagonalisation of symmetric matrices Recall that a concrete version
of the finite-dimensional spectral theorem (FDST) is the following. Given a real
symmetric matrix S there exists an orthogonal matrix P such that
P^T S P = P^{-1} S P = D where D = Diag(µ1, . . . , µn)
and the µj are the eigenvalues (necessarily all real, remember) of S taken with
multiplicity (i.e. repeated roots of the characteristic equation appear the cor-
responding number of times). The columns of P are the corresponding unit
eigenvectors of S.
This is easy to achieve in practice if S has n distinct eigenvalues, one just
chooses a unit eigenvector for each. (Recall that eigenvectors with different
eigenvalues are automatically orthogonal.) If µ is a repeated root of the charac-
teristic equation, then the corresponding eigenspace will be of dimension equal
to the multiplicity of the root and one normally has to use Gram-Schmidt to
find an orthonormal basis for it.
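Numerically, NumPy's `eigh` performs exactly this orthogonal diagonalization of a symmetric matrix; a sketch using the matrix from problem 16.3 below:

```python
import numpy as np

S = np.array([[7., -6.],
              [-6., -2.]])                   # the symmetric matrix from problem 16.3
mu, P = np.linalg.eigh(S)                    # real eigenvalues (ascending) and orthonormal eigenvectors
assert np.allclose(P.T @ P, np.eye(2))       # the columns of P form an orthonormal basis
assert np.allclose(P.T @ S @ P, np.diag(mu)) # P^T S P = Diag(mu_1, ..., mu_n)
```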
16.3 type of an SBF
16.3.1 definition Let b be an SBF on V. Define
Nb := {x ∈ V |b(x, v) = 0 for all v ∈ V}.
We say that b is nondegenerate if Nb = {0} and otherwise that b is degenerate.
16.3.2 theorem Nb as defined above is a subspace of V.
Proof. Exercise (see problems). □
16.3.3 theorem Define the rank of b by
rank b = dim V − dim Nb.
Then the rank of b is equal to the rank of its matrix B.
Proof. The vector v ∈ Nb iff Bx = 0 (where x is the coordinate column vector
of v in our chosen basis). Thus the dimension of Nb is the dimension of the
kernel of the linear map x ↦ Bx and the result follows from the Rank theorem
for linear maps. □
16.3.4 notes
• We will not generally use a name, but sometimes Nb is referred to as the
“kernel” of b.
• The point of the above is this. We know that the rank of a matrix (the
dimension of the row and column span) is significant when we are thinking
of a matrix as the coordinate version of a linear map. The theorem gives
a meaning for the rank when instead the (square) matrix is the coordinate
version of an SBF.
16.3.5 corollary An SBF is nondegenerate if and only if its matrix has non-
zero determinant.
16.3.6 definition Let b be an SBF on V. Let p be the largest integer which
is the dimension of a subspace on which b is positive-definite. Similarly let
q be the largest integer which is the dimension of a subspace on which b is
negative-definite. Then we say that b has type (p, q) and has signature p− q.
16.3.7 example If V is n-dimensional then an inner product on V is the same
thing as an SBF of type (n, 0).
16.4 change of basis
16.4.1 theorem Let b be an SBF on V with matrix B with respect to a basis
f1, . . . , fn. Let f′1, . . . , f′n be a new basis for V such that the change of basis
matrix from the original basis to this new basis is P. Then the matrix of b with
respect to the new basis is
B′ = P^T B P.
16.4.2 theorem Let b be an SBF on V. Then there exists a basis for V such
that the matrix of b is of the form
B = ( Ip 0 0 ; 0 −Iq 0 ; 0 0 0_{n−(p+q)} )
where 0_{n−(p+q)} is the (n − (p + q)) × (n − (p + q)) zero matrix.
We shall refer to such matrices as being in the standard form for an SBF.
Proof. Start with any basis for V and let the matrix in that basis be S. The
matrix version of FDST recalled above shows that there is an orthogonal matrix
P representing a change of basis to a new basis f1, . . . , fn for V, such that
P^T S P = P^{-1} S P = D where D = Diag(µ1, . . . , µn).
We can assume without loss of generality that in the list of eigenvalues, the
positive eigenvalues come before the negative ones which come before the zero
ones.
Now rescale the basis vectors corresponding to nonzero eigenvalues accord-
ing to
fj ↦ (1/√|µj|) fj
to obtain a basis with the matrix in the desired form, with p and q being the
number of positive and negative eigenvalues respectively. □
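The proof is constructive, and the rescaling step is easy to carry out numerically; a sketch (reusing the matrix from problem 16.3, and without reordering the eigenvalues positives-first):

```python
import numpy as np

S = np.array([[7., -6.],
              [-6., -2.]])                       # matrix of an SBF in some starting basis
mu, P = np.linalg.eigh(S)                        # FDST: P^T S P = Diag(mu)
# rescale eigenvector j by 1/sqrt(|mu_j|); leave any zero-eigenvalue vectors alone
scale = np.array([1.0 / np.sqrt(abs(m)) if abs(m) > 1e-12 else 1.0 for m in mu])
Q = P * scale                                    # scales column j of P
D = Q.T @ S @ Q                                  # diagonal with entries +1, -1 or 0
assert np.allclose(D, np.diag(np.sign(mu)))
```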
16.4.3 theorem Let b have type (p, q). Then the rank of b is p + q and the
matrix of b in any basis has p positive and q negative eigenvalues (counting
with multiplicity).
Proof. It is easy to check (see exercises) that if the matrix of b is in the
standard form as in §16.4.2 then the largest dimensions of subspaces on which
b is positive or negative definite are p and q. We saw in the proof that p, q
were the number of positive and negative eigenvalues of the matrix of b in the
originally chosen arbitrary basis. □
problems
done?problem 16.1 Show that Nb := {x ∈ V |b(x, v) = 0 for all v ∈ V} is a
subspace of V. (i.e. give the proof of §16.3.2.)
done?problem 16.2 Suppose that the matrix of b in a basis is the standard form
as in §16.4.2. Identify a p-dimensional subspace on which b is positive definite.
Identify also an (n − p)-dimensional subspace on which b is “negative semi-
definite” (meaning that b(v, v) ≤ 0 for all v in the subspace). By considering
intersections, deduce that there is no subspace of dimension larger than p on
which b is positive definite.
done?problem 16.3 + Hand-in for tutorial
1. Find a 2× 2 orthogonal matrix P such that P^T S P is diagonal where
S = ( 7 −6 ; −6 −2 ).
2. Find also a matrix P such that P^T S P is diagonal with diagonal entries ±1 or 0.
3. What is the type of the SBF on R2 given by the matrix S? What is its
rank and what is its signature?
done?problem 16.4 + Hand-in for tutorial Let P2 denote the real vector
space of polynomials of degree ≤ 2 in a variable x.
1. Show that
b(p(x), q(x)) := ∫_0^1 p′(x) q′(x) dx
defines an SBF on P2 (the “dashes” indicate derivatives).
2. Find a nonzero element of Nb and hence deduce that b is degenerate.
3. Find the matrix of b with respect to the basis {x2, x, 1} of P2 and hence
find the rank of b.
done?problem 16.5 Quick questions
1. True or false: An SBF is non-degenerate iff its matrix does not have zero
as an eigenvalue.
2. What are the possible types of an SBF on R3 if there exists a 2-dimensional
subspace on which it is negative-definite?
3. An SBF on Rn has type (p, q). What is the largest possible dimension
for a subspace V such that b(v, v) < 0 for all non-zero v ∈ V?
4. Same as above but now b(v, v) ≤ 0 for all non-zero v ∈ V.
done?problem 16.6 + Hand-in for tutorial Let V denote the vector space
of n× n real matrices.
1. What is the dimension of V and of the subspace of symmetric matrices
and of the subspace of antisymmetric matrices?
2. Show that b(X, Y) = Trace(XY) defines an SBF on V.
3. Show that
b(X, Y) = ∑_{j=1}^n ∑_{k=1}^n Xjk Ykj.
4. Show that b is positive-definite on the subspace of symmetric matrices
and negative-definite on the subspace of antisymmetric matrices.
5. Find the type, rank and signature of b.
done?problem 16.7 Let b be the SBF on R4 given by the matrix B given in block
form as
B = ( I2 0 ; 0 −I2 ).
Let A be a fixed 2× 2 matrix. Define
U := { x ∈ R^4 | x = ( Av ; v ) for some v ∈ R^2 }.
(We are using “block form” notation above.) Show that U is a subspace of R4
and state its dimension.
Show that b is identically zero on U if and only if A is an orthogonal matrix
(i.e. iff A^T A = I).
done?problem 16.8 Suppose b is a non-degenerate SBF on V. Can there exist
a subspace U of V such that b restricted to U is degenerate? What if b is
assumed to be positive-definite?
done?problem 16.9 True or False: If an SBF is positive definite on subspaces
U,U ′ ⊆ V then it is positive definite on their sum U+U ′. Explain your answer.
lecture 17 determining type — applications
17.1 setting
Finite-dimensional vector spaces over R. If b is a symmetric bilinear form (SBF)
we write β for the associated quadratic form and B for the matrix of b with
respect to some basis.
17.2 determining type
The type of an SBF can often be determined without computing eigenvalues.
We develop this method here.
17.2.1 theorem Let b be an SBF of type (p, q) on n-dimensional V and let
B be its matrix with respect to a basis. Then
• detB = 0 if and only if b is degenerate. (That is, if and only if p+q < n.)
• detB > 0 if and only if b is nondegenerate and q is even.
• detB < 0 if and only if b is nondegenerate and q is odd.
Proof. The determinant of a matrix is the product of its eigenvalues. □
17.2.2 theorem Suppose that b is an SBF of type (p, q) on V and that on a
subspace U ⊆ V it has type (p ′, q ′). Then p ′ ≤ p and q ′ ≤ q.
Proof. Immediate from the definition of p, q in terms of subspaces on which
b is positive and negative definite. □
17.2.3 theorem Let {0} = V0 ⊆ V1 ⊆ · · · ⊆ Vn = V be a flag in an n-
dimensional vector space V. Let b be an SBF on V which is nondegenerate
when restricted to each subspace Vk, k = 1, . . . , n. Let dk be the determinant
of the matrix of b restricted to Vk. Then b has type (n− q, q) where q is the
number of sign changes in the sequence
1, d1, d2, . . . , dn.
Proof. If the type on Vk is (p, q) then the type on Vk+1 is either (p, q+ 1) or
(p + 1, q). In the first case there is an extra negative eigenvalue and since the
determinant is the product of the eigenvalues, it changes sign. □
17.2.4 corollary Let B be an n × n symmetric matrix. For 1 ≤ k ≤ n, let
Bk denote the “top-left” k × k sub-matrix of B (i.e. formed from the entries
bij with 1 ≤ i, j ≤ k). Let dk = detBk. (Here B1 is a 1 × 1 matrix and its
determinant is equal to its entry.) Suppose that all the dk are non-zero. Let q
be the number of sign changes in the sequence 1, d1, . . . , dn. Then b has type
(p, q).
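The corollary translates directly into code; a sketch (the function name is ours) assuming, as the corollary requires, that all the leading minors are nonzero:

```python
import numpy as np

def type_by_minors(B):
    """Type (p, q) of the SBF with matrix B from the signs of the
    leading principal minors d_1, ..., d_n (all assumed nonzero)."""
    n = B.shape[0]
    d = [np.linalg.det(B[:k, :k]) for k in range(1, n + 1)]
    signs = np.sign([1.0] + d)
    q = int(np.sum(signs[1:] != signs[:-1]))   # sign changes in 1, d_1, ..., d_n
    return n - q, q

B = np.array([[-1., 6., 3.],
              [6., 1., 1.],
              [3., 1., 2.]])                   # the matrix from the worked example in 17.2.5
assert type_by_minors(B) == (2, 1)
```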
17.2.5 variations We illustrate with examples how one may be able to use
the above ideas in other ways.
• In the above, we do not need to take the chain of subspaces (or “flag”)
to start from the top. For example the theorem does not directly apply
to
B = ( 0 1 1 ; 1 1 1 ; 1 1 2 ).
Taking the chain of determinants starting from the bottom-right however
we get the values 2, 1, −1 and so the type is (2, 1).
You can even start with the middle 1 × 1 matrix. But you must only
consider square matrices with the same leading diagonal as the original
matrix — you should never be considering e.g. the bottom left 2× 2.
• B = ( −1 6 3 ; 6 1 1 ; 3 1 2 ).
The usual chain works fine here, giving values −1, −37, −46, and so the
type is (2, 1).
But you can take a shortcut. B is positive definite on the span of the last
two basis vectors (just working out the 1×1 and 2×2 determinants in your
head). So the type is one of (2, 1), (2, 0), (3, 0). But B is negative definite
on the span of the first basis vector and so only the first is possible. (We
have avoided having to compute the 3× 3 determinant.)
• B = ( −3 2 7 −4 ; 2 −2 6 3 ; 7 6 1 1 ; −4 3 1 2 ).
4×4 determinants are a pain. We might however notice (just computing
2×2 determinants) that B is positive definite on the span of the last two
basis vectors and negative-definite on the span of the first two. Thus B
can only be of type (2, 2).
17.3 classification of critical points
Let f(x) = f(x1, . . . , xn) be a smooth function of n variables and let x = a
be a critical point, meaning that ∂f/∂xk = 0 at x = a for all k. The Hessian
matrix at a is the symmetric matrix H with
Hjk = ∂^2 f / ∂xj ∂xk, evaluated at x = a.
The Taylor expansion of f near x = a is
f(a + z) = f(a) + (1/2) z^T H z + higher order terms
and so near x = a the first non-trivial term is the quadratic form with H as its
matrix. We immediately see the following.
• If H is positive-definite (resp. negative-definite) then f has a strict local
minimum (resp. maximum) at x = a.
• If H has type (p, q) with p > 0, q > 0 then there are directions in which
f increases and directions in which it decreases.
If H is degenerate then one may need to know about the higher order terms in
order to understand the nature of the critical point. For example, in R4 if the
Hessian has type (2, 0) then the point may or may not be a local minimum.
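The resulting test is easy to automate; a sketch with a made-up Hessian (not one from the notes):

```python
import numpy as np

H = np.array([[2., 1., 0.],
              [1., 2., 0.],
              [0., 0., -1.]])                 # hypothetical Hessian at a critical point
eig = np.linalg.eigvalsh(H)
p = int(np.sum(eig > 1e-12))                  # positive eigenvalues
q = int(np.sum(eig < -1e-12))                 # negative eigenvalues
n = H.shape[0]
if (p, q) == (n, 0):
    verdict = "strict local minimum"
elif (p, q) == (0, n):
    verdict = "strict local maximum"
elif p > 0 and q > 0:
    verdict = "f increases in some directions and decreases in others"
else:
    verdict = "degenerate Hessian: higher-order terms needed"
```

Here the type is (2, 1), so the critical point is a saddle.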
problems
done?problem 17.1 Consider the quadratic form
β = 2x^2 + 3y^2 + z^2 − 4xy + 2xz + 2yz.
1. Write down the matrix B of this quadratic form.
2. By evaluating determinants only determine the type of this form.
3. Find the eigenvalues and eigenvectors of B and check that the signs of
the eigenvalues are consistent with derivation of the type in the previ-
ous part. (You might want to use Maple to find the eigenvalues — use
“evalf(LinearAlgebra[Eigenvalues](B))”.)
done?problem 17.2 + Hand-in for tutorial Show that the origin is a critical
point of
f(x, y, z) = 2x2 + y siny+ z2 + 2(y+ z) sin x− 2ky sin z
(where k is a constant). What can you say about the nature of the critical point
for different values of k?
done?problem 17.3 What is the type of the SBF with matrix
( −3 12 −7 ; 12 4 2 ; −7 2 2 )?
You do not need to evaluate a 3×3 determinant, and don't even think of
computing eigenvalues.
done?problem 17.4 What is the type of the SBF in problem 5 of lecture 15?
lecture 18 SBF’s on inner-product spaces
18.1 setting
Finite-dimensional vector spaces over R. If b is a symmetric bilinear form (SBF)
we write B for the matrix of b with respect to some basis.
18.2 inner product spaces (IPS)
18.2.1 definitions An inner product on a real vector space V is a positive-
definite symmetric bilinear form on V. (Note that this is equivalent to the
definition from Year 2.) We will usually write inner products as 〈·, ·〉.
18.2.2 orthonormal bases For an SBF which is an inner product, a basis
for which the matrix of the SBF takes the standard form of 16.4.2 (which
is just the identity matrix) is called orthonormal.
This idea should be familiar from Year 2: a basis in an inner-product space
is orthonormal if 〈ei, ej〉 = 0 when i ≠ j and ||ei|| := √〈ei, ei〉 = 1.
In an inner-product space one usually works with orthonormal bases where
possible.
18.2.3 orthogonal matrices Recall that an n×n real matrix A is orthogonal
if A^T A = I. Orthogonal matrices give distance- and angle-preserving linear
maps from Rn (with its usual, standard inner product) to itself. That is, A is
orthogonal if and only if
〈x, y〉 = 〈Ax, Ay〉 for all x, y ∈ Rn.
Orthogonal matrices arise also as change of basis matrices between orthonor-
mal bases in any real inner-product space. To see this, note that if an SBF has
matrix I with respect to one basis (i.e. it is an inner product and the basis is
orthonormal) then for its matrix also to be I with respect to a new basis we
require
P^T I P = P^T P = I
where P is the change of basis matrix. (We are using the formula for change of
basis for the matrix of an SBF.)
18.3 classification of SBF’s on an IPS
18.3.1 theorem Let b be an SBF on an inner-product space V. Then there
exists an orthonormal basis for V such that the matrix of b is
Diag(µ1, . . . , µn)
where the numbers µ1, . . . , µn are the eigenvalues of the matrix of b with respect
to any orthonormal basis (counted with multiplicity).
Proof. Choose an orthonormal basis for V and let B be the matrix of b in
that basis. Let µ1, . . . , µn be the eigenvalues of B (counted with multiplicity).
Then by the finite-dimensional spectral theorem (see §16.2.2) there exists an
orthogonal matrix P such that P^T B P is as stated. The matrix P is thus the
change of basis matrix to an orthonormal basis with respect to which the matrix
of b is as claimed. □
18.3.2 remark As we know from before, if we forget the inner product on V
then there exists a basis where the matrix of b is diagonal with entries ±1 or
zero. We obtained this basis by taking the one we are using here and rescaling
the basis vectors associated with nonzero eigenvalues.
For an SBF b on a vector space V generally, only the sign (±1 or zero)
of the eigenvalues is significant. If V has an inner product, then the eigenval-
ues themselves (of the matrix of b with respect to an orthonormal basis) are
significant.
A matrix version of the above fact is this. Under the transformation
B ↦ P^T B P of a symmetric matrix B by an arbitrary invertible matrix P, the sign
of the eigenvalues is preserved. If also P is orthogonal then the values of the
eigenvalues are preserved.
18.4 classification of quadrics
18.4.1 quadrics in R3 A non-degenerate central quadric in R3 is a surface Σ
defined by the equation
x^T S x = 1
where S is a non-zero symmetric matrix. We will assume S is not negative-
definite since in that case there are no x satisfying the equation. We will
assume also that S is non-degenerate (i.e. it has rank 3).
18.4.2 theorem Given such a quadric there exists an orthonormal basis such
that with respect to that basis the quadric takes one of the following forms.
• If S is positive definite then Σ is an ellipsoid given by
x^2/a^2 + y^2/b^2 + z^2/c^2 = 1,   a, b, c > 0.
• If S has type (2, 1) then Σ is a hyperboloid of one sheet given by
x^2/a^2 + y^2/b^2 − z^2/c^2 = 1,   a, b, c > 0.
• If S has type (1, 2) then Σ is a hyperboloid of two sheets given by
x^2/a^2 − y^2/b^2 − z^2/c^2 = 1,   a, b, c > 0.
In each case, the coefficients of x^2, y^2, z^2 are the eigenvalues of S.
18.4.3 remark Note: the type of S determines which of the three categories
the quadric is in. One can usually determine the type by computing determi-
nants. If one wants to know the values of a, b, c then one needs to know the
eigenvalues of S and if one wants also to know the orthonormal basis in which
the equation takes the given form then the eigenvectors also must be computed.
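The classification can be automated from the eigenvalue signs; a sketch (the function name and sample matrix are our own), though for hand work the determinant tests of lecture 17 are usually quicker:

```python
import numpy as np

def classify_quadric(S, tol=1e-12):
    """Classify the central quadric x^T S x = 1 in R^3 by the type of S."""
    eig = np.linalg.eigvalsh(S)
    p = int(np.sum(eig > tol))
    q = int(np.sum(eig < -tol))
    return {(3, 0): "ellipsoid",
            (2, 1): "hyperboloid of one sheet",
            (1, 2): "hyperboloid of two sheets"}.get((p, q), "empty or degenerate")

S = np.array([[1., 1., 1.],
              [1., 2., 0.],
              [1., 0., 3.]])                  # hypothetical example, not from the problems
assert classify_quadric(S) == "ellipsoid"
```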
problems
done?problem 18.1 Classify the following quadrics.
1. x^2 + 2y^2 + 3z^2 + 2xy + 2xz = 1
2. 2xy + 2xz + 2yz = 1
3. x^2 + 3y^2 + 2xz + 2yz − 6z^2 = 1
You should be able to do all of these using “determinants” analysis if you think
carefully. You can always check your answer by asking Maple for the eigenvalues.
done?problem 18.2 Let β(x) be a quadratic form on Rn given by a symmetric
matrix S. How are the maximum and minimum values of β(x) on the unit
sphere x^T x = 1 related to the eigenvalues of S? (Hint: orthogonal change of
coordinates to standard form.)
done?problem 18.3 u Challenge Continuing the previous exercise, use Lagrange
multipliers to find the critical points of x^T S x subject to the constraint
x^T x = 1.
lecture 19 simultaneous diagonalization
19.1 setting
Finite-dimensional vector spaces over R. If b is a symmetric bilinear form (SBF)
we write B for the matrix of b with respect to some basis.
19.2 SBFs and self-adjoint linear maps
19.2.1 definition (revision from year 2) Let V be an IPS and let T : V → V
be a linear map. Then T is self-adjoint if
〈u, Tv〉 = 〈Tu, v〉, for all u, v ∈ V.
19.2.2 theorem The map T : V → V is self-adjoint if and only if its matrix
with respect to an orthonormal basis is symmetric.
Proof. Let u, v have coordinate column matrices x, y with respect to an
orthonormal basis and let T have matrix A. Then
〈u, Tv〉 − 〈Tu, v〉 = x^T A y − (Ax)^T y = x^T (A − A^T) y
which is zero for all u, v if and only if A = A^T. □
19.2.3 theorem Let T : V → V be a self-adjoint linear map. Then
b(u, v) := 〈u, Tv〉
defines an SBF on V. The matrices of the linear map T and of the SBF b are
equal when they are taken with respect to an orthonormal basis.
Proof. That b is linear is immediate. It is symmetric because
b(v, u) = 〈v, Tu〉 = 〈Tv, u〉 = 〈u, Tv〉 = b(u, v).
Let T have matrix A and let u, v have coordinate column matrices x, y with
respect to an orthonormal basis. Then
b(u, v) = 〈u, Tv〉 = x^T (Ay) = x^T A y
and so the matrix of b is A, the matrix of T. □
19.2.4 remark So, in an IPS, self-adjoint linear maps and SBF’s are essen-
tially the same thing. This explains why it is reasonable for the eigenvalues
of the matrix of an SBF to be significant — they are the eigenvalues of the
associated self-adjoint linear map.
In matrix terms, under a change of coordinates the matrices of SBF’s and
linear maps have different transformation rules:
B ↦ P^T B P,   A ↦ P^{-1} A P.
On an IPS however we have a preferred set of bases, the orthonormal ones, and
for change of basis between these we have P^T = P^{-1} and those formulae become
the same.
19.3 simultaneous diagonalization
19.3.1 theorem Let b, a be two SBF’s on a vector space V and suppose that
a is positive definite. Then there exists a basis for V such that the matrix of a
is the identity matrix and the matrix of b is diagonal.
Proof. Regard a as an inner product on V so that b is an SBF on an inner-
product space. Then there exists an orthonormal (with respect to 〈x, y〉 =
a(x, y)) basis for V such that the matrix of b is diagonal by the results of the
previous lecture. □
19.3.2 details We will work on Rn (which involves no loss of generality since
we can reduce to that case by choosing an arbitrary basis for V). Let a, b be
SBF’s as above with matrices A,B respectively. If we regard a as an inner
product then we can use that to identify b with a self-adjoint (with respect to
a) linear map x ↦ Mx using the formula
b(x, y) = a(x,My)
or in matrix form x^T B y = x^T A M y. Thus we deduce that B = AM, i.e.
M = A^{-1} B.
Now the diagonal entries in the matrix of B once we have diagonalised are
precisely the eigenvalues of M which we can calculate by solving the character-
istic equation
det(B − λA) = 0
since det(A^{-1}B − λI) = det(A^{-1}) det(B − λA). The solutions are called the
relative eigenvalues of the pair B and A.
Further, the required basis vectors are the eigenvectors of M scaled so that
a(v, v) = 1. These can be obtained by solving for each relative eigenvalue
(B− λA)v = 0
and rescaling so that v^T A v = 1. If λ is a repeated root, then one would have
to use Gram-Schmidt (using the inner product a) to find an orthonormal basis
for the corresponding eigenspace.
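The whole recipe can be checked numerically; a sketch with made-up matrices A (positive definite) and B, whose relative eigenvalues happen to be distinct so no Gram-Schmidt step is needed:

```python
import numpy as np

A = np.array([[2., 0.],
              [0., 1.]])                         # matrix of a, positive definite (our example)
B = np.array([[1., 1.],
              [1., 0.]])                         # matrix of b
lam, V = np.linalg.eig(np.linalg.inv(A) @ B)     # relative eigenvalues: det(B - lam*A) = 0
lam, V = lam.real, np.real(V)                    # real by the theory; drop zero imaginary parts
norms = np.sqrt(np.diag(V.T @ A @ V))            # a(v, v) for each eigenvector column v
P = V / norms                                    # rescale so that a(v, v) = 1
assert np.allclose(P.T @ A @ P, np.eye(2))       # a becomes the identity
assert np.allclose(P.T @ B @ P, np.diag(lam))    # b becomes Diag(relative eigenvalues)
```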
problems
done?problem 19.1 + Hand-in for tutorial Consider the SBF’s on R2 given
(with respect to the standard basis) by the matrices
B = ( 1 3 ; 3 3 ),   A = ( 2 1 ; 1 1 ).
Show that one of these is positive definite and hence find a basis for R2 with
respect to which the matrices of both the SBF’s are diagonal (with the positive-
definite one having the identity as its matrix). Write down the change of basis
matrix that diagonalises both forms.
done?problem 19.2
1. Let
A = ( 1 3 ; 3 2 ).
Check explicitly that 〈Ax, y〉 = 〈x,Ay〉 for all x, y ∈ R2 where the inner
product is the standard one on R2.
2. Let
A = ( 1 3 ; −1 2 ).
Find vectors x, y ∈ R2 such that 〈Ax, y〉 ≠ 〈x, Ay〉 where the inner
product is the standard one on R2.
done?problem 19.3 + Hand-in for tutorial Let n ∈ N and consider
Tn = { a0 + ∑_{k=1}^n (ak cos kx + bk sin kx) | ak, bk ∈ R }
with the inner product
〈p(x), q(x)〉 := ∫_0^{2π} p(x) q(x) dx.
Consider the linear map −D^2 : Tn → Tn where −D^2 : p(x) ↦ −p′′(x).
1. Show that −D2 is self-adjoint. (Hint: integration by parts.)
2. What are the eigenvalues of −D^2 : Tn → Tn and what is the multiplicity
of each eigenvalue? (Think ODEs; no clever theory required.)
3. Show that the SBF associated to −D^2 is b(p, q) = ∫_0^{2π} p′(x) q′(x) dx.
4. What is the type of the SBF just found? Relate that to the eigenvalues
of −D^2.