
Math 308, Linear Algebra with Applications

Natalie Naehrig

Winter 2016



Contents

1 Linear Equations
  1.1 Notation
  1.2 Systems of Linear Equations
  1.3 Echelon Systems
  1.4 Matrices and Gaussian Elimination

2 Euclidean Space
  2.1 Vectors
  2.2 Geometry of Rn
  2.3 Span
  2.4 Linear Independence

3 Matrices
  3.1 Linear Transformations
  3.2 Matrix Algebra
  3.3 Inverses

4 Subspaces
  4.1 Introduction to Subspaces
  4.2 Basis and Dimension
  4.3 Row and Column Spaces

5 Determinants
  5.1 The Determinant Function


Chapter 1

Linear Equations

1.1 Notation

Before we start digging into the theory of linear algebra, we need to introduce and fix some notation.

1.1.1 Notation
We let
- N be the set of natural numbers, i.e. the set {1, 2, 3, . . .}. Note that we do not assume that 0 is an element of N.
- Z be the set of integers, i.e. the set {0, 1, −1, 2, −2, 3, −3, . . .}.
- R be the set of real numbers.
- ∅ := {}, the empty set.
- S1 ⊆ S2 means that whenever s ∈ S1, then it is also an element of S2.
- S1 ⊊ S2 means that S1 ⊆ S2, and there is at least one element s ∈ S2 which is not an element of S1.

1.2 Systems of Linear Equations

Let's start with some applications of linear algebra, so that we understand what all the structures that we will learn later are good for. Consider the system

2x1 + 3x2 = 12
x1 − x2 = 1


Solving this system means finding the set of all pairs (x1, x2) which yield a true statement when plugged in. We can find solutions in many different ways. Let us first approach the system algebraically, using elimination. The advantage of elimination is that it can easily be generalized to systems of any size. Moreover, by the end of this chapter, we will be able to precisely describe the way of finding solutions algorithmically. We start with eliminating x1 in the second row. To this end, compare the coefficients of x1 in the first and second row. To eliminate x1 in the second line, we multiply the second line by −2 and add the first line to it:

2x1 + 3x2 = 12
x1 − x2 = 1    | ×(−2)

2x1 + 3x2 = 12
−2x1 + 2x2 = −2

2x1 + 3x2 = 12
5x2 = 10

We can now solve for x2 in the second row (x2 = 2) and plug this result into the first row:

2x1 + 3 × 2 = 12    | −6
x2 = 2

We then get

2x1 = 6    | : 2
x2 = 2

x1 = 3
x2 = 2

What does this result tell us? It gives us the solution: (3, 2) is the only solution to the system! Do the test! We plug (3, 2) into the linear system:

2(3) + 3(2) = 12
3 − 2 = 1

12 = 12
1 = 1
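To double-check such a computation on a machine, a short script is enough. The following is a minimal sketch in Python with numpy (the choice of numpy is an assumption of this sketch, not part of the notes):

    import numpy as np

    # Coefficient matrix and right-hand side of the system
    #   2*x1 + 3*x2 = 12
    #     x1 -   x2 = 1
    A = np.array([[2.0, 3.0],
                  [1.0, -1.0]])
    b = np.array([12.0, 1.0])

    x = np.linalg.solve(A, b)     # unique solution, since A is invertible
    print(x)                      # [3. 2.]

    # "Do the test": plug the solution back into the system
    print(np.allclose(A @ x, b))  # True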

There is also a geometrical interpretation available: focusing on the first row of the original system, we have

2x1 + 3x2 = 12.

Do you remember functions of the form ax1 + bx2 = c? These were lines in a specific form. How do we get the slope and x2-intercept from this form?

Slope: −a/b
x2-intercept: c/b

In our example, we have a line with slope −2/3 and x2-intercept 4. The second row of the system corresponds to a line with slope 1 and x2-intercept −1. Finding a solution to the system can be interpreted as finding a point that lies on both lines, i.e. the intersection of those lines.

[Figure: the two lines plotted in the (x1, x2)-plane; they intersect in the single point (3, 2).]

Let’s do the same for the following system:


2x1 + x2 = 5
4x1 + 2x2 = 10    | ×(−0.5)

2x1 + x2 = 5
0x1 + 0x2 = 0

How do we interpret this result? Obviously, the second row is always true. We just follow the approach from the first example, only a bit more generally. In the first example, we found a unique solution for x2. That is different here. The second row does not provide a unique solution for x2 (remember, we must never, ever divide by 0). So let us go general: let t ∈ R be any real number, a so-called parameter. Set x2 = t and consider the first row:

2x1 + t = 5    | −t, ×(0.5)
x2 = t

x1 = 2.5 − 0.5t
0x1 + 0x2 = 0

The set of solutions is therefore {(2.5 − 0.5t, t) | t ∈ R}. This has infinitely many elements. Let us have a look at the geometrical interpretation. Both rows of the original system are forms of the same line, and the solution, i.e. the "intersection of those two lines", is this line.

[Figure: the single line described by both equations of the system.]


We will now discuss a third example.

2x1 + x2 = 5
2x1 + x2 = 1    | ×(−1)

2x1 + x2 = 5
0x1 + 0x2 = 4

The second row reveals a contradiction, so that we conclude that the system has no solution at all. Geometrically, we see two parallel lines, i.e. two lines that do not intersect at all.

[Figure: two parallel lines that never intersect.]

Question: Which cases do we expect when dealing with a system of three rows with three variables? What kind of geometrical objects are we dealing with?

Let us continue with formally defining linear equations. This way, we will be able, just from having the definition in its most general setting, to draw conclusions about any system of linear equations.


1.2.1 Definition
Let m ∈ N. Then

a1x1 + a2x2 + . . .+ amxm = b (1.1)

with coefficients ai, b ∈ R for 1 ≤ i ≤ m and variables (or unknowns) xi for 1 ≤ i ≤ m is called a linear equation. A solution of this linear equation is an ordered set of m numbers

(s1, s2, . . . , sm)

(an m-tuple, often displayed as a column) such that replacing each xi by si, the equation (1.1) is satisfied. The solution set for the equation (1.1) is the set of all solutions to the equation.

As we have seen above, we sometimes deal with more than one equation and are interested in solutions that satisfy two or more equations simultaneously. So we need to formalize this as well.

1.2.2 Definition
Let m, n ∈ N. Then a system of linear equations (linear system) is a collection of linear equations of the form

a1,1x1 + a1,2x2 + . . . + a1,mxm = b1
a2,1x1 + a2,2x2 + . . . + a2,mxm = b2
...
an,1x1 + an,2x2 + . . . + an,mxm = bn

with ai,j, bi ∈ R for 1 ≤ i ≤ n, 1 ≤ j ≤ m. The solution set of the system is the set of m-tuples which each satisfy every equation of the system.

1.2.3 Remark
Note that any solution of a system with n equations and m variables is a list with m entries (independent of the number of equations n)!!


1.3 Echelon Systems

In this section, we will learn how to algorithmically solve a system of linear equations.

Eliminating Consider the following system:

x1 − 2x2 + 3x3 = −1    | ×2 (add to row 2), ×(−7) (add to row 3)
−2x1 + 5x2 − 10x3 = 4
7x1 − 17x2 + 34x3 = −16

x1 − 2x2 + 3x3 = −1
x2 − 4x3 = 2    | ×3 (add to row 3)
−3x2 + 13x3 = −9

x1 − 2x2 + 3x3 = −1
x2 − 4x3 = 2
x3 = −3

Back Substitution Once we have reached this triangular form of a system, we can use back substitution for finding the set of solutions. To this end, start with the last row of the previous system. We can read off from that line that x3 = −3. So we plug this result into the second-to-last row:

x2 − 4(−3) = 2,

which gives us x2 = −10. The solutions for x3, x2 can now be plugged into the first line.

x1 = −1 + 2(−10)− 3(−3) = −12.

In summary, we get the solution set

S = { (−12, −10, −3) }.
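Back substitution is easy to mechanize. Here is a small sketch in Python (the helper name back_substitute is this sketch's choice; it assumes the system is already triangular with nonzero diagonal coefficients):

    import numpy as np

    def back_substitute(U, b):
        """Solve Ux = b for an upper triangular U with nonzero diagonal."""
        n = len(b)
        x = np.zeros(n)
        for i in range(n - 1, -1, -1):
            # subtract the contributions of the already-known variables,
            # then divide by the leading coefficient
            x[i] = (b[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
        return x

    # The triangular system from this section
    U = np.array([[1.0, -2.0, 3.0],
                  [0.0, 1.0, -4.0],
                  [0.0, 0.0, 1.0]])
    b = np.array([-1.0, 2.0, -3.0])
    print(back_substitute(U, b))  # [-12. -10.  -3.]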


Check the solution Let's go back to the original system, plug in the solution, and verify that we did not make any mistake:

−12 − 2(−10) + 3(−3) = −1
−2(−12) + 5(−10) − 10(−3) = 4
7(−12) − 17(−10) + 34(−3) = −16,

which leads to a correct set of equations.

1.3.1 Definition (Consistent/Inconsistent)
If a system (of linear equations) has at least one solution, the system is said to be consistent. Else it is called inconsistent.

The system above is therefore a consistent system. How did we actually get to the solution? We managed to bring the system into a specific form, so that we could read off the solution. Because this form is such a powerful tool, we attach some terminology to it.

1.3.2 Definition
If, in a system of linear equations, a variable x occurs with a non-zero coefficient as the first term in at least one equation, then x is called a leading variable.

1.3.3 Example
Consider the system:

12x1 − 2x2 + 5x3 − 2x5 = −1
3x4 + 4x5 = 7
2x4 − x5 = 2
7x5 = −14

The variables x1, x4, x5 are leading variables, while x2, x3 are not.

1.3.4 Definition
A system of linear equations is in triangular form if it has the form:

a1,1x1 + a1,2x2 + a1,3x3 + . . . + a1,nxn = b1
a2,2x2 + a2,3x3 + . . . + a2,nxn = b2
a3,3x3 + . . . + a3,nxn = b3
. . .
an,nxn = bn

where the coefficients ai,i for 1 ≤ i ≤ n are nonzero.


The following proposition reveals the meaning of triangular systems in terms of finding the solution of a system.

1.3.5 Proposition
(a) Every variable of a triangular system is the leading variable of exactly one equation.
(b) A triangular system has the same number of equations as variables.
(c) A triangular system has exactly one solution.

Proof:
(a) xi is the leading variable of row number i.
(b) Follows from (a).
(c) Do the back substitution. □

Free variables The notion of a triangular system is very strong. Each variable is the leading variable of exactly one equation. When solving systems, we can relax the triangular notion a bit, but still find solutions. We therefore introduce new terminology, after understanding the following example. Consider the system:

2x1 − 4x2 + 2x3 + x4 = 11
x2 − x3 + 2x4 = 5
3x4 = 9

Performing back substitution gives us x4 = 3. But as x3 is not a leading variable, we do not get a value for it. We therefore set x3 := s for a parameter s ∈ R. We treat s as the solution for x3 and continue. Altogether we get

x2 = −1 + s

and

x1 = 2 + s.

To emphasize it again, s can be any real number, and for each such choice, we get a different solution for the system. What do we conclude? Yes, this system has infinitely many solutions, given by

S = { (2 + s, −1 + s, s, 3) | s ∈ R } = { (2, −1, 0, 3) + s(1, 1, 1, 0) | s ∈ R }.


1.3.6 Definition
A linear system is in echelon form if
(a) Every variable is the leading variable of at most one equation.
(b) The system is organized in a descending stair-step pattern, so that the index of the leading variables increases from top to bottom.
(c) Every equation has a leading variable.
For a system in echelon form, a variable that is not a leading variable is called a free variable.

1.3.7 Proposition
A system in echelon form is consistent. If it has no free variable, it is in triangular form and has exactly one solution. Otherwise, it has infinitely many solutions.

Proof: If a system is in echelon form, determine its free variables by determining its variables that are not leading variables. Set the free variables equal to free parameters and perform back substitution. This gives you a solution, and depending on whether there are free variables or not, you get the assertion about the number of solutions. □

1.3.8 Remark
Two comments that are important to know and understand:
- A system in echelon form is consistent and therefore does not have rows of the form

0x1 + 0x2 + . . . + 0xn = c

for some nonzero c ∈ R. That means, as free variables have only been defined for echelon systems, that we only speak of free variables in consistent systems.
- The second part of the previous proposition reads as follows: if you find free variables, then the system has infinitely many solutions!

Recapitulating what we did to find a solution, we saw that the coefficients of the linear equations changed, but the solution did not (remember, we did the check with the original system!). The reason is that we allowed only certain manipulations, which did not change the solution set.

1.3.9 Definition
If two linear systems lead to the same solution set, we say that these systems are equivalent. The following operations lead to equivalent systems:

(a) Interchange the position of two equations in the system.


(b) Multiply an equation by a nonzero constant.

(c) Add a multiple of one equation to another.

Each of these operations is called an elementary row operation.

1.3.10 Remark (Algorithm for solving a linear system)
- Put the system into echelon form:

A Switch rows until the variable with the least index that has a non-zero coefficient appears in the first row. This is a leading variable.

B Write down the first row AND leave it untouched from now on.

C Eliminate the terms with that variable in all but the first row.

D Repeat step [A] with the system from the 2nd row to the last row.

E Step [B] with the second row.

F Step [C] with the 3rd row until the last.

G Repeat [A]-[C] likewise with the remaining rows.

- If there is a row of the form 0x1 + 0x2 + . . . + 0xn = 0, delete that row.
- If there is a row of the form 0x1 + 0x2 + . . . + 0xn = c for c ≠ 0, stop and state that the system is inconsistent and hence has no solution.
- Once you have the system in echelon form, identify the leading variables.
- Then identify possible free variables. Set those variables equal to parameters ti.
- Do backward substitution. A sketch of this procedure in code is given below.
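As a rough sketch of how this algorithm might look in code (Python with numpy; the function name and the choice to swap in the first nonzero row found are assumptions of this sketch):

    import numpy as np

    def to_echelon_form(M):
        """Bring an augmented matrix M to echelon form by row operations."""
        M = M.astype(float).copy()
        rows, cols = M.shape
        r = 0
        for c in range(cols - 1):        # last column is the right-hand side
            # step A: find a row at or below r with a nonzero entry in column c
            nonzero = np.nonzero(M[r:, c])[0]
            if nonzero.size == 0:
                continue                 # no leading variable in this column
            M[[r, r + nonzero[0]]] = M[[r + nonzero[0], r]]   # swap rows
            # step C: eliminate that variable below the pivot row
            for i in range(r + 1, rows):
                M[i] -= (M[i, c] / M[r, c]) * M[r]
            r += 1                       # steps B/D: freeze this row, repeat
            if r == rows:
                break
        return M

    # The system of the following Example 1.3.11
    M = np.array([[0, 1, 6, -5],
                  [2, -1, 5, -6],
                  [2, -2, -1, -1]])
    print(to_echelon_form(M))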


1.3.11 Example
Let us apply the above algorithm.

x2 + 6x3 = −5
2x1 − x2 + 5x3 = −6
2x1 − 2x2 − x3 = −1

→

2x1 − x2 + 5x3 = −6
2x1 − 2x2 − x3 = −1
x2 + 6x3 = −5

→

2x1 − x2 + 5x3 = −6
−x2 − 6x3 = 5
x2 + 6x3 = −5

→

2x1 − x2 + 5x3 = −6
−x2 − 6x3 = 5
0x2 + 0x3 = 0

→

2x1 − x2 + 5x3 = −6
−x2 − 6x3 = 5

The leading variables are x1, x2, hence the free variable is x3. We set x3 = t for a parameter t ∈ R. Now we perform backward substitution to get

2x1 − x2 = −6 − 5t
−x2 = 5 + 6t

2x1 = −6 − 5t − 5 − 6t
x2 = −5 − 6t

x1 = −11/2 − (11/2)t
x2 = −5 − 6t

In summary, the solution set is

S = { (−11/2 − (11/2)t, −5 − 6t, t) | t ∈ R } = { (−11/2, −5, 0) + t(−11/2, −6, 1) | t ∈ R }.

1.4 Matrices and Gaussian Elimination

From now on we will start to make precise definitions in mathematical style. Doing so, we can always be sure that conclusions we draw from assumptions are true. One might consider the following sections abstract and wonder what they have to do with linear equations. I promise that, in the end, all that follows will play an important role in finding the set of solutions. Let us start with a structure called a matrix, appearing all the time in linear algebra.


1.4.1 Definition
Let m, n be natural numbers. We call a rectangular array A with n rows and m columns of the form

A =
[ a1,1  a1,2  . . .  a1,m ]
[ a2,1  a2,2  . . .  a2,m ]
[  ...               ...  ]
[ an,1  an,2  . . .  an,m ]

with ai,j ∈ R for 1 ≤ i ≤ n and 1 ≤ j ≤ m an (n, m)-matrix or (n × m)-matrix. We also write (ai,j) for A. Furthermore, the prefix (n, m) is called the size or the dimension of A.

1.4.2 Example
Consider the matrix

A :=
[  1   7   0 ]
[  2  −1   2 ]
[ −3   5   2 ]
[ −1   0  −8 ]

Let us get acquainted with the new terminology:
(i) What is the dimension of A? It is a (4, 3)-matrix.
(ii) What is A4,2, the (4, 2)-entry of A? This is 0.

Having this new terminology, we focus again on linear equations. Take a close look at the algorithm we used for finding the solutions. Note that we always only changed the coefficients of the equations, rather than the variables (e.g. we never ended up with a term like x2x3^2 starting from x2, say). So why bother and carry the variables along all the time?? Just focus on the coefficients and proceed as usual. That is the idea behind the next section.

1.4.3 Definition
Let m, n ∈ N. Consider the system of equations:

a1,1x1 + a1,2x2 + . . . + a1,mxm = b1
a2,1x1 + a2,2x2 + . . . + a2,mxm = b2
...
an,1x1 + an,2x2 + . . . + an,mxm = bn

with ai,j, bi ∈ R for 1 ≤ i ≤ n, 1 ≤ j ≤ m.


Then

A =
[ a1,1  a1,2  . . .  a1,m ]
[ a2,1  a2,2  . . .  a2,m ]
[  ...               ...  ]
[ an,1  an,2  . . .  an,m ]

is called the coefficient matrix for the system. Note that it is an (n, m)-matrix. Furthermore,

B =
[ a1,1  a1,2  . . .  a1,m | b1 ]
[ a2,1  a2,2  . . .  a2,m | b2 ]
[  ...                    | .. ]
[ an,1  an,2  . . .  an,m | bn ]

is the augmented matrix for the system. This is an (n, m+1)-matrix. The augmented matrix B is also denoted by [A | b], where A is the coefficient matrix and

b =
[ b1 ]
[ b2 ]
[ .. ]
[ bn ]

Let us look at a specific example.

1.4.4 Example
Consider the system

2x1 + 3x2 = −1
4x1 + 7x3 = 4
3x2 − x3 = 0

The corresponding augmented matrix is

[ 2  3   0 | −1 ]
[ 4  0   7 |  4 ]
[ 0  3  −1 |  0 ]


Just as we did for linear systems, we can define elementary operations on matrices.

1.4.5 Definition
We define the following elementary row operations on matrices.

(a) Interchange two rows.

(b) Multiply a row by a nonzero constant.

(c) Add a multiple of a row to another.

Two matrices A, B are said to be equivalent if one can be obtained from the other through a series of elementary row operations. In this case, we write A ∼ B or A → B (but we do not write A = B!).

Let us apply some elementary row operations to the matrix in Example 1.4.4.

1.4.6 Example
Let the matrix A be as in Example 1.4.4. Interchanging row 2 and row 3, we get

[ 2  3   0 | −1 ]
[ 0  3  −1 |  0 ]
[ 4  0   7 |  4 ]

Adding −2 times the first row to the third gives:

[ 2   3   0 | −1 ]
[ 0   3  −1 |  0 ]
[ 0  −6   7 |  6 ]

Multiply the second row by 2:

[ 2   3   0 | −1 ]
[ 0   6  −2 |  0 ]
[ 0  −6   7 |  6 ]

And finally adding the second to the third gives:

[ 2  3   0 | −1 ]
[ 0  6  −2 |  0 ]
[ 0  0   5 |  6 ]
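The three elementary row operations are one-liners on a numpy array. A minimal sketch reproducing the steps of Example 1.4.6 (numpy assumed):

    import numpy as np

    B = np.array([[2.0, 3.0, 0.0, -1.0],
                  [4.0, 0.0, 7.0, 4.0],
                  [0.0, 3.0, -1.0, 0.0]])

    B[[1, 2]] = B[[2, 1]]   # (a) interchange row 2 and row 3
    B[2] += -2 * B[0]       # (c) add -2 times row 1 to row 3
    B[1] *= 2               # (b) multiply row 2 by 2
    B[2] += B[1]            # (c) add row 2 to row 3
    print(B)
    # [[ 2.  3.  0. -1.]
    #  [ 0.  6. -2.  0.]
    #  [ 0.  0.  5.  6.]]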


Note that all these matrices, including the original one from Example 1.4.4, are equivalent. Let us retrieve the linear system associated to this augmented matrix. It looks like:

2x1 + 3x2 = −1
6x2 − 2x3 = 0
5x3 = 6

This system is now in triangular form and we can perform backward substitution to determine the solution set. In particular, we have x3 = 6/5, x2 = 2(6/5)/6 = 2/5, and x1 = 0.5(−1 − 3(2/5)) = −11/10, hence

S = { (−11/10, 2/5, 6/5) }.

We see that the terminology we introduced for linear systems can be carried over to augmented matrices.

1.4.7 Definition
A leading term in a row is the leftmost nonzero entry of the row. A matrix is in echelon form if

(a) Every leading term is in a column to the left of the leading term of the row below it.

(b) All zero rows are at the bottom of the matrix.

A pivot position in a matrix in echelon form is one that contains a leading term. The columns containing a pivot position are called pivot columns. The algorithm from Remark 1.3.10, which transforms a given matrix into echelon form, is called Gaussian elimination.

1.4.8 Remark
(a) Note that the entries in a column that are below a pivot position will only be zeros.
(b) Also note the difference between systems in echelon form and matrices in echelon form. For matrices we do allow zero rows. The mere stair pattern is a criterion for matrices in echelon form. So it might happen that going back from a matrix in echelon form (which has zero rows) leads to a system that is not in echelon form, unless you delete zero rows or state the system to be inconsistent.


Remember that we can find a solution to the associated linear system from a matrix in echelon form by doing backward substitution. But actually, there is a way to transform a matrix in echelon form even further, so that the solution set can be read off directly from the matrix.

1.4.9 Example
Let us continue the example above. We apply a few more elementary row operations to obtain a very specific echelon form. In the example, we ended with

[ 2  3   0 | −1 ]
[ 0  6  −2 |  0 ]
[ 0  0   5 |  6 ]

Multiply each nonzero row by the reciprocal of its pivot, so that we end up with 1 as the leading term in each nonzero row. In our example, we have to multiply the first row by 1/2, the second by 1/6, and the third by 1/5. This gives

[ 1  3/2     0 | −1/2 ]
[ 0    1  −2/6 |    0 ]
[ 0    0     1 |  6/5 ]

Now use elementary row operations to get zeros in the entries above each pivot position. In our example, we add 2/6 times the last row to the second row. We get

[ 1  3/2  0 | −1/2 ]
[ 0    1  0 |  2/5 ]
[ 0    0  1 |  6/5 ]

Now we add −3/2 times the second row to the first row and get

[ 1  0  0 | −11/10 ]
[ 0  1  0 |    2/5 ]
[ 0  0  1 |    6/5 ]

Let us find the associated linear system.

x1 = −11/10
x2 = 2/5
x3 = 6/5


By finding this particular matrix form, we are able to just read off the solution. So, the solution set is

S = { (−11/10, 2/5, 6/5) }.

This is, of course, worth a definition.

1.4.10 Definition
A matrix is in reduced echelon form if

(a) It is in echelon form.

(b) All pivot positions have a 1.

(c) The only nonzero term in a pivot column is in the pivot position.

The transformation of a matrix to a matrix in reduced echelon form is called Gauss-Jordan elimination. Getting to a matrix in echelon form is called the forward phase; getting from echelon form to reduced echelon form is called the backward phase.
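Computer algebra systems implement Gauss-Jordan elimination directly. A small sketch with sympy (using sympy here is an assumption of this sketch; its Matrix.rref() method returns the reduced echelon form, with exact fractions, together with the pivot columns):

    from sympy import Matrix

    # The augmented matrix from Example 1.4.9
    B = Matrix([[2, 3, 0, -1],
                [0, 6, -2, 0],
                [0, 0, 5, 6]])

    R, pivot_cols = B.rref()
    print(R)           # Matrix([[1, 0, 0, -11/10], [0, 1, 0, 2/5], [0, 0, 1, 6/5]])
    print(pivot_cols)  # (0, 1, 2)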

1.4.11 Example
This example shows how to handle systems with free parameters when transforming to reduced echelon form. We start with a matrix in reduced echelon form. (Check that!)

[ 1  0  −2 |  2 ]
[ 0  1   1 | −2 ]
[ 0  0   0 |  0 ]

The corresponding system is:

x1 − 2x3 = 2
x2 + x3 = −2

We plug in the free parameter s for x3 and get x2 = −2 − s and x1 = 2 + 2s, so that the solution set is

S = { (2 + 2s, −2 − s, s) | s ∈ R } = { (2, −2, 0) + s(2, −1, 1) | s ∈ R }.


We end this chapter by focusing on which kinds of linear systems we encounter.

1.4.12 Definition
(a) A linear equation is homogeneous if it is of the form

a1x1 + a2x2 + . . .+ amxm = 0.

Likewise, a system is called homogeneous if it has only homogeneous equations. Otherwise, the system is called inhomogeneous.
(b) The vector

0 = (0, 0, . . . , 0)

is the so-called trivial solution to a homogeneous system.

1.4.13 Theorem (!!!)
A system of linear equations has no solution, exactly one solution, or infinitely many solutions.

Proof: (Proof strategy: case study.) Let A be the transformed matrix, obtained from the augmented matrix, such that A is in echelon form. We have one of the following three outcomes:
(a) A has a row of the form [0, 0, . . . , 0 | c] for a nonzero constant c. Then the system must be inconsistent, and hence has no solution.
(b) A has triangular form, and thus has no free variable. By backward substitution, we find a unique solution.
(c) A corresponds to a system which is in echelon form but not triangular, and thus has one or more free variables. This means that the system has infinitely many solutions. □

1.4.14 Remark
(a) Note that a homogeneous system is always consistent, because there is always the trivial solution, which is (0, 0, . . . , 0).
(b) Consider an inhomogeneous system. Assume that the corresponding augmented matrix has been reduced to echelon form. If this matrix has a row of the form

[0, 0, 0, . . . , 0 | c]


for some nonzero constant c, then the system is inconsistent.

Note that, by Remark 1.4.14, we also always find at least one solution of a homogeneous system, namely the trivial one. By Theorem 1.4.13, we therefore have either one or infinitely many solutions for homogeneous systems.

Applications We will now discuss different applications where linear systems are used to tackle the problems.

1.4.15 Example
Consider the following intersections of streets:

[Figure: a network of one-way streets meeting at three nodes A, B, C, with unknown traffic volumes x1, x2, x3 on the internal edges and known external flows 0.2, 0.3, 0.4 and 0.6.]

Let us sum up, node by node, which traffic volumes occur. The incoming traffic needs to be equal to the outgoing, so we get the following.
At node A we get x2 = 0.4 + x3.
At node B we get x3 + 0.4 + 0.6 = 0.3 + x1.
At node C we get 0.3 + x1 = x2 + 0.2 + 0.4.
We now need to rearrange those linear equations and build up a linear system.

x2 − x3 = 0.4
x1 − x3 = 0.7
x1 − x2 = 0.3

The associated augmented matrix looks like:


[ 0   1  −1 | 0.4 ]
[ 1   0  −1 | 0.7 ]
[ 1  −1   0 | 0.3 ]

Let us now perform Gauss-Jordan elimination to obtain the reduced echelon form.

[ 0   1  −1 | 0.4 ]
[ 1   0  −1 | 0.7 ]
[ 1  −1   0 | 0.3 ]
→
[ 1  −1   0 | 0.3 ]
[ 0   1  −1 | 0.4 ]
[ 1   0  −1 | 0.7 ]
→
[ 1  −1   0 | 0.3 ]
[ 0   1  −1 | 0.4 ]
[ 0   1  −1 | 0.4 ]
→
[ 1  −1   0 | 0.3 ]
[ 0   1  −1 | 0.4 ]
[ 0   0   0 | 0   ]
→
[ 1   0  −1 | 0.7 ]
[ 0   1  −1 | 0.4 ]
[ 0   0   0 | 0   ]

The corresponding linear system is:

x1 − x3 = 0.7
x2 − x3 = 0.4

We set the free variable x3 equal to the parameter s and thus get, after backward substitution: x3 = s, x2 = 0.4 + s, x1 = 0.7 + s, so the solution set is

S = { (0.7, 0.4, 0) + s(1, 1, 1) | s ∈ R, s ≥ 0 }.

Note that s must be greater than or equal to zero in this situation, because it does not make sense in a real-world application to have a negative traffic volume. The minimum travel rate from C to A is therefore 0.4.

1.4.16 Example
When propane burns in oxygen, it produces carbon dioxide and water:

C3H8 + O2 → CO2 + H2O.


We see that on the left-hand side there are eight H atoms, whereas on the right there are two. We need to balance the equation, i.e. we are looking for coefficients x1, x2, x3, x4 in

x1C3H8 + x2O2 → x3CO2 + x4H2O.

Let us go through each kind of atom. There are 3x1 carbon atoms on the left-hand side, and x3 on the right-hand side, so that we get

3x1 = x3.

We have 8x1 hydrogen atoms on the left, and 2x4, hence

8x1 = 2x4.

Further, there are 2x2 oxygen atoms on the left-hand side, and 2x3 + x4 on the right-hand side, thus

2x2 = 2x3 + x4.

We thus get the following homogeneous system

3x1 − x3 = 0
8x1 − 2x4 = 0
2x2 − 2x3 − x4 = 0

The associated augmented matrix is

[ 3  0  −1   0 | 0 ]
[ 8  0   0  −2 | 0 ]
[ 0  2  −2  −1 | 0 ]
→
[ 3  0  −1   0 | 0 ]
[ 0  0   8  −6 | 0 ]
[ 0  2  −2  −1 | 0 ]
→
[ 3  0  −1   0 | 0 ]
[ 0  2  −2  −1 | 0 ]
[ 0  0   8  −6 | 0 ]
→
[ 1  0  −1/3     0 | 0 ]
[ 0  1    −1  −1/2 | 0 ]
[ 0  0     1  −3/4 | 0 ]
→
[ 1  0  0  −1/4 | 0 ]
[ 0  1  0  −5/4 | 0 ]
[ 0  0  1  −3/4 | 0 ]

The corresponding system is

x1 − (1/4)x4 = 0
x2 − (5/4)x4 = 0
x3 − (3/4)x4 = 0


We set x4 = s and get x3 = (3/4)s, x2 = (5/4)s and x1 = (1/4)s. As we are not looking for the whole solution set, but rather for coefficients that make the balance right, we choose s = 4 (we want natural numbers as coefficients), so that

C3H8 + 5O2 → 3CO2 + 4H2O.
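One can also let a computer find the solution space of the homogeneous system and scale it to integer coefficients. A sketch with sympy (an assumption of this sketch; nullspace() returns a basis of the solution space of Ax = 0):

    from sympy import Matrix, lcm

    # Coefficient matrix of the homogeneous balancing system
    A = Matrix([[3, 0, -1, 0],
                [8, 0, 0, -2],
                [0, 2, -2, -1]])

    v = A.nullspace()[0]                   # [1/4, 5/4, 3/4, 1]
    scale = lcm([entry.q for entry in v])  # common denominator (here 4)
    print(v * scale)                       # [1, 5, 3, 4]: C3H8 + 5 O2 -> 3 CO2 + 4 H2O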

1.4.17 Example
We are looking for the coefficients of a polynomial of degree 2 on whose graph the points (0, −1), (1, −1), and (−1, 3) lie, i.e. we need to find a, b, c in

f(x) = ax^2 + bx + c.

Plugging in the coordinates of the points, we get

c = −1
a + b + c = −1
a − b + c = 3

This gives

[ 0   0  1 | −1 ]
[ 1   1  1 | −1 ]
[ 1  −1  1 |  3 ]
→
[ 1   1  1 | −1 ]
[ 1  −1  1 |  3 ]
[ 0   0  1 | −1 ]
→
[ 1   1  1 | −1 ]
[ 0  −2  0 |  4 ]
[ 0   0  1 | −1 ]
→
[ 1  1  1 | −1 ]
[ 0  1  0 | −2 ]
[ 0  0  1 | −1 ]
→
[ 1  1  0 |  0 ]
[ 0  1  0 | −2 ]
[ 0  0  1 | −1 ]
→
[ 1  0  0 |  2 ]
[ 0  1  0 | −2 ]
[ 0  0  1 | −1 ]

The solution to this system is a = 2, b = −2, c = −1. This gives us

f(x) = 2x^2 − 2x − 1.
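The same computation in code: set up the 3×3 system from the three points and solve it. A minimal sketch with numpy (assumed here; numpy's polyfit would do the same job):

    import numpy as np

    xs = np.array([0.0, 1.0, -1.0])
    ys = np.array([-1.0, -1.0, 3.0])

    # One row a*x^2 + b*x + c = y per point
    A = np.column_stack([xs**2, xs, np.ones_like(xs)])
    a, b, c = np.linalg.solve(A, ys)
    print(a, b, c)   # 2.0 -2.0 -1.0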


Chapter 2

Euclidean Space

2.1 Vectors

Have a look at the ingredients of milk with 2% milkfat. It says that there are 5g fat, 13g carbs, and 8g protein in a cup (240ml). How could we display this information? We arrange the values in a (3, 1)-matrix in the following way:

[  5 ]
[ 13 ]
[  8 ]

where the first row represents fat, the second carbs, and the third protein. This representation is called a vector. If you are interested in the amount of fat, carbs, and protein in 1200ml, say, then all you do is multiply each entry by 5. How can we display that? Remember that the less unnecessary information you use, the clearer the view you keep. We therefore define

    [  5 ]    [ 5 ·  5 ]   [ 25 ]
5 · [ 13 ] := [ 5 · 13 ] = [ 65 ]
    [  8 ]    [ 5 ·  8 ]   [ 40 ]

Drinking that amount of milk means taking in 25g fat, 65g carbs and 40g protein. This is a new kind of product, namely the product of a scalar with a vector. The ingredient vector for whole milk is

[  8 ]
[ 12 ]
[  8 ]


If you drink one cup of 2% milk and one cup of whole milk, you will have ingested 5g + 8g fat, 13g + 12g carbs and 8g + 8g protein. How can we represent that? Just add the vectors:

[  5 ]   [  8 ]   [  5 + 8  ]
[ 13 ] + [ 12 ] = [ 13 + 12 ]
[  8 ]   [  8 ]   [  8 + 8  ]

The idea behind this concept will lead to a whole new terminology, even a new and mathematically very rich structure.

2.1.1 Definition
A vector over R with n components is an ordered list of real numbers u1, u2, . . . , un, displayed as

u =
[ u1 ]
[ u2 ]
[ .. ]
[ un ]

or as

u = [u1, u2, . . . , un].

The set of all vectors with n entries is denoted by Rn. Each entry ui is called a component of the vector. A vector displayed in vertical form is called a column vector; a vector displayed in row form is called a row vector.

Remember what we did with vectors in our examples? We wrap a definition around that, because the concepts are very important and easy to generalize to so-called vector spaces, which we will encounter later.

2.1.2 Definition
Let u, v be vectors in Rn given by

u =
[ u1 ]
[ u2 ]
[ .. ]
[ un ]

and v =
[ v1 ]
[ v2 ]
[ .. ]
[ vn ]

Suppose that c is a real number. In the context of vectors, c is called a scalar. We then have the following operations:


Equality: u = v if and only if u1 = v1, u2 = v2, . . . , un = vn.

Addition:
        [ u1 ]   [ v1 ]   [ u1 + v1 ]
u + v = [ u2 ] + [ v2 ] = [ u2 + v2 ]
        [ .. ]   [ .. ]   [    ..   ]
        [ un ]   [ vn ]   [ un + vn ]

Scalar Multiplication:
       [ u1 ]   [ cu1 ]
cu = c [ u2 ] = [ cu2 ]
       [ .. ]   [  ..  ]
       [ un ]   [ cun ]

The zero vector 0 ∈ Rn is the vector whose entries are all 0. Moreover, we define −u := (−1)u. The set of all vectors of Rn, together with addition and scalar multiplication, is called a Euclidean Space.
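Componentwise addition and scalar multiplication are exactly what numpy arrays provide. A tiny sketch, using the milk vectors from above (numpy assumed):

    import numpy as np

    u = np.array([5.0, 13.0, 8.0])   # one cup of 2% milk: fat, carbs, protein
    v = np.array([8.0, 12.0, 8.0])   # one cup of whole milk

    print(u + v)   # addition:              [13. 25. 16.]
    print(5 * u)   # scalar multiplication: [25. 65. 40.]
    print(-u)      # -u := (-1)u:           [-5. -13. -8.]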

2.1.3 Example
Let

u =
[  2 ]
[ −1 ]
[  0 ]
[  5 ]

and

v =
[ 1 ]
[ 9 ]
[ 2 ]
[ 1 ]

Compute 2u − 3v.

           [  2 ]     [ 1 ]   [  4 ]   [  3 ]   [   1 ]
2u − 3v = 2 [ −1 ] − 3 [ 9 ] = [ −2 ] − [ 27 ] = [ −29 ]
           [  0 ]     [ 2 ]   [  0 ]   [  6 ]   [  −6 ]
           [  5 ]     [ 1 ]   [ 10 ]   [  3 ]   [   7 ]

Let us now analyze the structure we have in a very abstract and mathematical way. What we do is generalize properties of one structure to another. Think about what we do when we add vectors. In words, it is just adding componentwise. But adding componentwise is adding two numbers, just the way we are used to. Each property we are used to from ordinary addition just carries over to vectors! Let's get more concrete:

2.1.4 Theorem (Algebraic Properties of Vectors)
Let a, b be scalars (i.e. real numbers) and u, v, and w be vectors in Rn. Then the following holds:

(a) u + v = v + u.

(b) a(u + v) = au + av.

(c) (a+ b)u = au + bu.

(d) (u + v) + w = u + (v + w).

(e) a(bu) = (ab)u.

(f) u + (−u) = 0.

(g) u + 0 = u.

(h) 1u = u.

Proof: Let

u =
[ u1 ]
[ u2 ]
[ .. ]
[ un ]

and v =
[ v1 ]
[ v2 ]
[ .. ]
[ vn ]

Then

        [ u1 ]   [ v1 ]   [ u1 + v1 ]   [ v1 + u1 ]   [ v1 ]   [ u1 ]
u + v = [ u2 ] + [ v2 ] = [ u2 + v2 ] = [ v2 + u2 ] = [ v2 ] + [ u2 ] = v + u.
        [ .. ]   [ .. ]   [    ..   ]   [    ..   ]   [ .. ]   [ .. ]
        [ un ]   [ vn ]   [ un + vn ]   [ vn + un ]   [ vn ]   [ un ]

This proves (a). For (b) consider

            [ u1 + v1 ]   [ a(u1 + v1) ]   [ au1 + av1 ]     [ u1 ]     [ v1 ]
a(u + v) = a [ u2 + v2 ] = [ a(u2 + v2) ] = [ au2 + av2 ] = a [ u2 ] + a [ v2 ] = au + av.
            [    ..   ]   [     ..     ]   [     ..    ]     [ .. ]     [ .. ]
            [ un + vn ]   [ a(un + vn) ]   [ aun + avn ]     [ un ]     [ vn ]


Try the remaining ones as extra practice! □

Adding multiples of different vectors will be an important concept, so it gets its own name.

2.1.5 Definition
Let u1, u2, . . . , um be vectors, and c1, c2, . . . , cm be scalars. We then call

c1u1 + c2u2 + . . .+ cmum

a linear combination of (the given) vectors. Note that we do not exclude the case that some (or all) scalars are zero.

2.1.6 Example
(a) What can we obtain when we form linear combinations of 'one' vector? Let's plug in the definition. Let u be a vector in Rn. Then

au

with a in R is by Definition 2.1.5 a linear combination of u. What if a = 1? This tells us that u is a linear combination of u! For a = 0, we see that 0 is a linear combination of u!
(b) Let

e1 = [1, 0, 0]^T, e2 = [0, 1, 0]^T, e3 = [0, 0, 1]^T

be three vectors in R3. Then

3e1 − 1e2 + 2e3 = [3, −1, 2]^T

is a linear combination of e1, e2, e3.

2.1.7 Example
Remember our milk example from the beginning of this chapter. We had 2%-fat milk whose ingredients could be displayed as

[  5 ]
[ 13 ]
[  8 ]


and the whole milk ingredients,

[  8 ]
[ 12 ]
[  8 ]

How much of each kind of milk do I have to drink to ingest 3g of fat, 19/3g of carbs and 4g of protein? What are we looking for? We need to find the amount x1 of 2%-fat milk and the amount x2 of whole milk so that the given amounts of fat, carbs and protein are ingested. When finding the combined ingredients, we introduced the addition of vectors, so we do the same here: we want the following to be true:

   [  5 ]      [  8 ]   [    3 ]
x1 [ 13 ] + x2 [ 12 ] = [ 19/3 ]
   [  8 ]      [  8 ]   [    4 ]

This must be true in each row, so let us write it down row-wise, or component-wise.

5x1 + 8x2 = 3
13x1 + 12x2 = 19/3
8x1 + 8x2 = 4

This is a linear system! Let us find the corresponding augmented matrix:

[  5   8 |    3 ]
[ 13  12 | 19/3 ]
[  8   8 |    4 ]
→
[ 5    8 |     3 ]
[ 0  −44 | −22/3 ]
[ 0  −24 |    −4 ]
→
[ 5  8 |   3 ]
[ 0  1 | 1/6 ]
[ 0  1 | 1/6 ]
→
[ 5  0 | 10/6 ]
[ 0  1 |  1/6 ]
[ 0  0 |    0 ]
→
[ 1  0 | 1/3 ]
[ 0  1 | 1/6 ]
[ 0  0 |   0 ]

So the solution is x1 = 1/3 and x2 = 1/6. In context, this means that we need to drink 1/3 cup of 2%-fat milk and 1/6 cup of whole milk to ingest the desired amounts. As solution we therefore have

[ x1 ]       [ 1/3 ]
[ x2 ] = x = [ 1/6 ]


To finish this section, let us define a new operation between a matrix and a vector. This is a new product.

2.1.8 Definition (Matrix-Vector Product)
Let a1, a2, . . . , am be vectors in Rn. If

A = [ a1 a2 . . . am ] and x =
[ x1 ]
[ x2 ]
[ .. ]
[ xm ]

.Then the matrix-vector product between A and x is defined to be

Ax = x1a1 + x2a2 + . . .+ xmam.

2.1.9 Remark
(a) The product is only defined if the number of columns of A equals the number of components of x. In the definition above, this means that x must be an element of Rm.
(b) Have a second look at Definition 2.3.2 of the span. Compare this with the definition of the matrix-vector product in Definition 2.1.8. We notice the IMPORTANT fact that the result of a matrix-vector product is in fact a linear combination of the columns of the matrix!!! In other words, any product A · b for an (m, n)-matrix A and a vector b ∈ Rn lies in the span of the columns of A.
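A sketch that makes remark (b) concrete: compute Ax directly from Definition 2.1.8 as a linear combination of the columns, and compare with numpy's built-in product (the helper name matvec_by_columns is this sketch's choice):

    import numpy as np

    def matvec_by_columns(A, x):
        """Ax as x1*a1 + x2*a2 + ... + xm*am (Definition 2.1.8)."""
        result = np.zeros(A.shape[0])
        for j in range(A.shape[1]):
            result += x[j] * A[:, j]   # add x_j times the j-th column
        return result

    A = np.array([[2.0, 3.0],
                  [1.0, -1.0],
                  [0.0, 4.0]])
    x = np.array([2.0, -1.0])

    print(matvec_by_columns(A, x))                      # [ 1.  3. -4.]
    print(np.allclose(matvec_by_columns(A, x), A @ x))  # True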

2.1.10 Example
Find A, x and b, so that the equation Ax = b corresponds to the system of equations

4x1 − 3x2 + 7x3 − x4 = 13
−x1 + 2x2 + 6x4 = −2
x2 − 3x3 − 5x4 = 29

First we collect the coefficients:

[  4  −3   7  −1 ]
[ −1   2   0   6 ]
[  0   1  −3  −5 ]


Then we collect variables:

x =
[ x1 ]
[ x2 ]
[ x3 ]
[ x4 ]

Finally collect the values on the right hand side of the equality sign.

b =
[ 13 ]
[ −2 ]
[ 29 ]

We now have three equivalent ways of displaying a linear system: as a list of equations, as an augmented matrix, and as a matrix-vector equation Ax = b. We will soon see a proof of a theorem that is very elegant precisely because we are now able to use this new representation of linear systems with the matrix-vector product. Let us explore an example: even though we treated homogeneous and inhomogeneous systems as being very different, we are about to discover that they are very much related.

2.1.11 Example
Consider the following example of a homogeneous and an inhomogeneous system, written in matrix form; the second-to-last column is the right-hand side of the homogeneous system, and the last column is the right-hand side of the inhomogeneous one:

[  2  −6  −1  8 | 0 | 7 ]
[  1  −3  −1  6 | 0 | 6 ]
[ −1   3  −1  2 | 0 | 4 ]
→
[ 2  −6  −1   8 | 0 |  7 ]
[ 0   0   1  −4 | 0 | −5 ]
[ 0   0  −3  12 | 0 | 15 ]
→
[ 2  −6  −1   8 | 0 |  7 ]
[ 0   0   1  −4 | 0 | −5 ]
[ 0   0   0   0 | 0 |  0 ]
→
[ 2  −6  0   4 | 0 |  2 ]
[ 0   0  1  −4 | 0 | −5 ]
[ 0   0  0   0 | 0 |  0 ]
→
[ 1  −3  0   2 | 0 |  1 ]
[ 0   0  1  −4 | 0 | −5 ]
[ 0   0  0   0 | 0 |  0 ]

We set the free variables x2 := s1 and x4 := s2 and obtain as solution sets for the respective systems

Sh = { s1(3, 1, 0, 0) + s2(−2, 0, 4, 1) | s1, s2 ∈ R }


Si = { (1, 0, −5, 0) + s1(3, 1, 0, 0) + s2(−2, 0, 4, 1) | s1, s2 ∈ R }.

Compare those sets; what do we note? The sets differ only by the constant vector (1, 0, −5, 0). This phenomenon can be proven and is very helpful for computations.

Now we are ready to state and prove the relationship between homogeneous and inhomogeneous systems:

2.1.12 Theorem
Let xp be one arbitrary solution to a linear system of the form

Ax = b.

Then all solutions xg to this system are of the form

xg = xp + xh,

where xh is a solution to the associated homogeneous system Ax = 0.

Proof: Let xg be any solution to the system. We must prove that xg can be written as proposed. To do so, we first consider the vector xg − xp. By the distributive property of the matrix-vector product (here used without proof), we have

A(xg − xp) = Axg − Axp = b − b = 0.

From this we conclude that xg − xp is a solution xh to the associated homogeneous system, so that we can write

xg − xp = xh.

Solving the last equation for xg gives us xg = xp + xh. □
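We can check the theorem numerically on Example 2.1.11. A sketch with numpy (the particular values of s1, s2 below are arbitrary choices):

    import numpy as np

    A = np.array([[2.0, -6.0, -1.0, 8.0],
                  [1.0, -3.0, -1.0, 6.0],
                  [-1.0, 3.0, -1.0, 2.0]])
    b = np.array([7.0, 6.0, 4.0])

    xp = np.array([1.0, 0.0, -5.0, 0.0])        # particular solution
    s1, s2 = 2.0, -3.0
    xh = s1 * np.array([3.0, 1.0, 0.0, 0.0]) \
       + s2 * np.array([-2.0, 0.0, 4.0, 1.0])   # homogeneous solution

    print(np.allclose(A @ xp, b))         # True
    print(np.allclose(A @ xh, 0))         # True
    print(np.allclose(A @ (xp + xh), b))  # True: xp + xh also solves Ax = b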


2.2 Geometry of Rn

We are dealing with vectors over Rn. This set is equipped with a huge package of additional properties. One of these properties - at least for n = 1, 2, 3 - is that it allows a geometric interpretation of vectors, addition of vectors, and scalar multiplication. Let us investigate R2.

A vector [x1, x2]^T of R2 can be understood as an arrow from the origin to the point (x1, x2).

[Figure: the vectors [2, 4]^T and [1, −1]^T drawn as arrows from the origin.]

It is now clear that a length can be attached to a vector in Rn. Good old Pythagoras shows that if u = [2, 4]^T, its length |u| in the plane is given by

|u| = √(2^2 + 4^2) = √20.

2.2.1 Definition (Length of a vector)
Let

u =
[ u1 ]
[ u2 ]
[ .. ]
[ un ]


be a vector in Rn. Then the length |u| of u is defined to be

|u| = √(u1^2 + u2^2 + . . . + un^2).
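In code, the length is one line; numpy also provides it directly as np.linalg.norm. A sketch:

    import numpy as np

    u = np.array([2.0, 4.0])
    print(np.sqrt(np.sum(u**2)))   # Pythagoras: sqrt(2^2 + 4^2) = sqrt(20)
    print(np.linalg.norm(u))       # the built-in equivalent, same value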

How is adding vectors reflected in the picture? We move one of the two vectors without changing direction or length, such that the tip of the other vector touches the tail of the moved vector. The result, i.e. the sum of the two vectors, starts at the origin and ends at the tip of the moved vector.

[Figure: the sum [2, 4]^T + [1, −1]^T = [3, 3]^T, illustrated by translating one arrow so that it starts at the tip of the other.]

Let us finish this section by visualizing what scalar multiplication means. Let u be a vector in Rn and let c ∈ R. Then we have:
(a) If c = 0, then cu = 0, just a point at the origin.
(b) If c > 0, then cu points in the same direction as u, and its length is the length of u multiplied by c.
(c) If c < 0, then cu points in the opposite direction of u, and its length is the length of u multiplied by |c|.


[Figure: the scalar multiples 0.5·[2, 4]^T, −[1, −1]^T, and (−2)·[1, −1]^T.]

2.3 Span

2.3.1 Example
Consider the following vectors

u = [2, −1, 1]^T, v = [1, −1, 0]^T and w = [3, 4, 0]^T.

We aim at answering the following question: is w a linear combination of u, v, i.e. is there a solution to

x1u + x2v = w?

Let's do the computation. We would like to find a solution to

[ 2x1 ]   [  x2 ]   [ 3 ]
[ −x1 ] + [ −x2 ] = [ 4 ]
[  x1 ]   [ 0x2 ]   [ 0 ]


The corresponding linear system is

2x1 + x2 = 3
−x1 − x2 = 4
x1 = 0

The associated augmented matrix is

[  2   1 | 3 ]
[ −1  −1 | 4 ]
[  1   0 | 0 ]
→
[  1   0 | 0 ]
[  2   1 | 3 ]
[ −1  −1 | 4 ]
→
[ 1   0 | 0 ]
[ 0   1 | 3 ]
[ 0  −1 | 4 ]
→
[ 1  0 | 0 ]
[ 0  1 | 3 ]
[ 0  0 | 7 ]

The last row of the matrix reveals a contradiction. So the answer must be: no matter which combination of u and v you consider, you will never reach w.

The previous example immediately raises the following questions: how can we describe the set of all linear combinations of a set of given vectors? How can one determine if the set of all linear combinations of a given set of vectors in Rn is the whole of Rn? We will investigate these kinds of questions in this section.

2.3.2 Definition (Span)
Let {u1, u2, . . . , um} be a set of vectors in Rn. The span of this set is denoted by span{u1, u2, . . . , um}, and is defined to be the set of all linear combinations

s1u1 + s2u2 + . . . + smum,

where s1, s2, . . . , sm can be any real numbers. If span{u1, u2, . . . , um} = Rn, we say that {u1, u2, . . . , um} spans Rn.

2.3.3 Example
(a) Let u = 0 ∈ Rn. Then

span{u} = {a0 | a ∈ R} = {0}

is a set with only one element.


(b) Let u ≠ 0 ∈ Rn. Then

span{u} = {au | a ∈ R}

is a set with infinitely many elements. Have a look back at the geometric interpretation of a·u. This shows that span{u} is a line through the origin.

Just by following Example 2.3.1, we can pose the following general theorem:

2.3.4 Theorem
Let u1, u2, . . . , um and w be vectors in Rn. Then w is an element of span{u1, u2, . . . , um} if and only if the linear system represented by the augmented matrix

[ u1 u2 . . . um | w ]

has a solution.

Proof: Let us plug in the definition of w being an element of the set span{u1, u2, . . . , um}: w ∈ span{u1, u2, . . . , um} if and only if there are scalars s1, s2, . . . , sm such that

s1u1 + s2u2 + . . . + smum = w.

But this is true if and only if (s1, s2, . . . , sm) is a solution to the system represented by

[ u1 u2 . . . um | w ].

This is what we wanted to show. □
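Theorem 2.3.4 translates into a quick machine test: the system [u1 u2 . . . um | w] is consistent exactly when appending w does not increase the rank. A sketch with numpy (the helper name in_span is this sketch's choice; matrix_rank works in floating point, so this is a numerical rather than an exact test):

    import numpy as np

    def in_span(vectors, w):
        """Is w in span{vectors}? Rank-based consistency test."""
        A = np.column_stack(vectors)
        augmented = np.column_stack(vectors + [w])
        return np.linalg.matrix_rank(A) == np.linalg.matrix_rank(augmented)

    # The vectors of Example 2.3.1:
    u = np.array([2.0, -1.0, 1.0])
    v = np.array([1.0, -1.0, 0.0])
    w = np.array([3.0, 4.0, 0.0])
    print(in_span([u, v], w))   # False, matching Example 2.3.1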

2.3.5 Example
Is [0, −2, 8]^T an element of

span{ [1, 2, −1]^T, [1, 3, −5]^T, [−1, −1, −3]^T }?


Consider the following augmented matrix representing a linear system:

[  1   1  −1 |  0 ]
[  2   3  −1 | −2 ]
[ −1  −5  −3 |  8 ]
→
[ 1   1  −1 |  0 ]
[ 0   1   1 | −2 ]
[ 0  −4  −4 |  8 ]
→
[ 1  1  −1 |  0 ]
[ 0  1   1 | −2 ]
[ 0  0   0 |  0 ]
→
[ 1  0  −2 |  2 ]
[ 0  1   1 | −2 ]
[ 0  0   0 |  0 ]

Let us translate that back to a linear system.

x1 = 2 + 2s
x2 = −2 − s
x3 = s

In vector form, this solution set can be written as

S = { x = (x1, x2, x3) = (2, −2, 0) + s(2, −1, 1) | s ∈ R }.

In particular, the system is consistent, so the answer to the question is yes.

It seems that sometimes one could take out an element from a set S := {u1, u2, . . . , um} of vectors in Rn and still obtain the same span. Which properties do such elements have? Here is the answer:

2.3.6 Theorem
Let u1, u2, . . . , um and u be vectors in Rn. If u is in span{u1, u2, . . . , um}, then

span{u, u1, u2, . . . , um} = span{u1, u2, . . . , um}.

One could also say that u is redundant in span{u, u1, u2, . . . , um}.

Proof: Whenever we want to show equality of two sets, we do that by showing that one set is contained in the other and vice versa. To that end, let S0 := span{u, u1, u2, . . . , um} and S1 := span{u1, u2, . . . , um}. So let us first show S1 ⊆ S0. Let v be any element in S1. Then there are scalars a1, a2, . . . , am such that v = a1u1 + a2u2 + . . . + amum. This sum can also be written as v = 0u + a1u1 + a2u2 + . . . + amum, so that v lies in S0.
Let us now show that S0 ⊆ S1. Fix w ∈ S0, so there are scalars b0, b1, b2, . . . , bm such that

w = b0u + b1u1 + b2u2 + . . . + bmum.


Remember that we also assumed that u is an element of S1. That means that there are scalars c1, c2, . . . , cm such that

u = c1u1 + c2u2 + . . .+ cmum.

Let us plug this representation of u into the representation of w. We then get

w = b0(c1u1 + c2u2 + . . .+ cmum) + b1u1 + b2u2 + . . .+ bmum.

Apply the distributive property and rearrange that expression, so that you get

w = (b0c1 + b1)u1 + (b0c2 + b2)u2 + . . .+ (b0cm + bm)um.

Now have a look at the previous equation. This is a linear combination of the vectors {u1, u2, . . . , um}, hence it lies in S1! Now we have shown S0 ⊆ S1 and S1 ⊆ S0, hence those two sets are equal.

□

We are still exploring what one can say about span{u1, u2, . . . , um}, in particular what a certain choice of vectors reveals about their span.

2.3.7 Theorem
Let S := {u1, u2, . . . , um} be a set of vectors in Rn. If m < n, then this set does not span Rn. If m ≥ n, then the set might span Rn or it might not; we cannot say more without additional information about the vectors.

Proof: Let a be any vector in Rn. Let us check what we need to generate this vector with elements of S. By Theorem 2.3.4 this means looking at the system whose augmented matrix looks like this:

A = [ u1 u2 . . . um | a ].

What do we usually do when handling an augmented matrix? We aim for echelon form! Imagine we did that for A. We then end up with an equivalent matrix of the form

[ 1  ?  . . .  ? | a1 ]
[ 0  1  ?      ? | a2 ]
[ ..     .  .    | .. ]
[ 0  0  0      0 | an ]

The important thing to understand is that, because m < n, we will get at least one zero row. If you choose a smartly, such that after getting to the echelon form you have an ≠ 0, you will have created an inconsistent system. In our setting, an inconsistent system means finding no solution to generating a with elements of S. But this means that there are vectors in Rn which do not lie in the span of S. If m ≥ n, then the matrix in echelon form may have zero rows or not. If there are zero rows, we can always find vectors a such that the associated system becomes inconsistent. If not, we always find solutions, hence the span is Rn. □

2.3.8 Example
Is

span{ [1, −1, 1]^T, [−2, 2, 0]^T, [0, 0, −3]^T, [3, −3, 2]^T } = R3?

If that were true, any vector [a1, a2, a3]^T of R3 would have to be a linear combination of the given vectors. We make use of Theorem 2.3.4 and solve the following system:

[  1  −2   0   3 | a1 ]
[ −1   2   0  −3 | a2 ]
[  1   0  −3   2 | a3 ]
→
[ 1  −2   0   3 | a1       ]
[ 0   0   0   0 | a1 + a2  ]
[ 0   2  −3  −1 | −a1 + a3 ]
→
[ 1  −2   0   3 | a1       ]
[ 0   2  −3  −1 | −a1 + a3 ]
[ 0   0   0   0 | a1 + a2  ]

The last row of the matrix reveals a potential contradiction. Whenever you choose a vector in R3 such that a1 + a2 ≠ 0, you will not be able to combine the given vectors linearly into that one. In particular, they cannot span the whole of R3.

2.3.9 Remark (!!)
If you follow the previous example, you see that considering the equivalent matrix in echelon form actually provides an understanding of whether there is a solution or not. So whenever you are computing the span, go over to echelon form and look for rows which have only zero entries. If there are any, then you will instantly know that not the whole of Rn can be spanned, because there are choices of vectors which have nonzero entries at precisely those rows.

To summarize the previous results and get a wider perspective, let us collect the following equivalences.


2.3.10 Theorem
Let a1, a2, . . . , am and b be vectors in Rn. Then the following statements are equivalent:
(a) The vector b is in span{a1, a2, . . . , am}.
(b) The vector equation x1a1 + x2a2 + . . . + xmam = b has at least one solution.
(c) The linear system corresponding to [ a1 a2 . . . am | b ] has at least one solution.
(d) The equation Ax = b, with A and x given as in Definition 2.1.8, has at least one solution.

Proof:
(a) ⇐⇒ (c): This is Theorem 2.3.4.
(b) ⇐⇒ (a): This is rewriting a system from augmented matrix form to linear equation form.
(c) ⇐⇒ (d): This is Definition 2.1.8. □

2.3.11 Problem
Here is some more for you to practice:

(a) Do u1 = [3, 2]^T and u2 = [5, 3]^T span R2?

(b) Find a vector in R3 that is not in span{ [2, −1, 4]^T, [0, 3, −2]^T }. How do you know a priori that such a vector must exist?

(c) Find all values of h such that the set of vectors {u1, u2, u3} spans R3, where

u1 = [1, 2, −3]^T, u2 = [0, h, 1]^T, u3 = [−3, 2, 1]^T.

(d) Determine if the equation Ax = b has a solution for any choice of b, where

A =
[ 1  0  −2 ]
[ 2  1   2 ]
[ 2  1   3 ]

Do the same for

A =
[  1   3 ]
[ −2  −6 ]


2.4 Linear Independence

The previous section and this section are central key parts of this course. Before those sections, we learned how to solve linear equations and how to interpret equivalent systems/matrices in terms of the solution set. I therefore understand those sections as providing strong tools within linear algebra. But these current sections actually investigate the essence of linear algebra and deal with fundamental concepts.
Let us assume that we are given three vectors u1, u2, u3 such that

u3 = 2u1 − u2.

We can solve this equation for the other vectors as well:

u1 = (1/2)u2 + (1/2)u3 or u2 = 2u1 − u3.

We see that the representation of one of the vectors depends on the other two. Another way of seeing this is, because the equations are equivalent to

2u1 − u2 − u3 = 0,

that the system

c1u1 + c2u2 + c3u3 = 0

has the non-trivial solution (2, −1, −1). This is exactly the idea behind linear dependence/independence:

2.4.1 Definition (!! Linear Independence)
Let {u1, u2, . . . , um} be a set of vectors in Rn. If the only solution to the vector equation

x1u1 + x2u2 + . . . + xmum = 0

is the trivial solution given by x1 = x2 = . . . = xm = 0, then the set {u1, u2, . . . , um} is linearly independent. If there are non-trivial solutions, then the set is linearly dependent.

2.4.2 Remark
(a) Note that the definition is very abstract: it is based on the absence of non-trivial solutions.
(b) When are two vectors u1, u2 linearly (in)dependent? There are two cases to discuss:


(i) At least one of the vectors is the zero vector. Let u1 = 0. Then 1·0 + 0·u2 = 0. Compare that with the definition of linear independence. It fits the negation of that definition! This is because x1 = 1, x2 = 0 is not the trivial solution. So these two vectors are linearly dependent, and we can write the equation above as u1 = 0u2.
(ii) Neither vector is the zero vector. Let us assume that they are linearly dependent. Then there are scalars c1, c2, both not equal to 0 (assume one is zero and you will see that then at least one of the vectors must be the zero vector), such that

c1u1 + c2u2 = 0.

So we can solve for u1 or u2. Let us solve for u1, so that we get

u1 = −(c2/c1)u2.

How can we interpret this last equation? Two vectors are linearly dependent whenever one vector is a multiple (here the factor is −c2/c1) of the other.
(c) One single vector u ≠ 0 is linearly independent. We see that by plugging in the definition of linear independence for one vector:

cu = 0

has only c = 0 as solution, as we assume that u ≠ 0. In other words, this equation has only the trivial solution c = 0, which makes the set linearly independent.
(d) A different view on linear independence is to ask whether the zero vector is a non-trivial linear combination of the given vectors. So actually, we are dealing with questions about the span. Make use of Theorem 2.3.4 to get the following lemma!

2.4.3 Lemma
Let S := {u1, u2, . . . , um} be a set of vectors in Rn. Consider the system for determining if this set is linearly independent. Set

A = [ u1 u2 . . . um ], x = [x1, x2, . . . , xm]^T.

Then S is linearly independent if and only if the homogeneous linear system Ax = 0 has only the trivial solution.
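Lemma 2.4.3 also yields a quick machine test: the set is linearly independent exactly when the matrix of columns has full column rank, i.e. when Ax = 0 has only the trivial solution. A numpy sketch (again a floating-point rather than exact test; the helper name is this sketch's choice):

    import numpy as np

    def is_linearly_independent(vectors):
        """Full column rank <=> Ax = 0 has only the trivial solution."""
        A = np.column_stack(vectors)
        return np.linalg.matrix_rank(A) == A.shape[1]

    u1 = np.array([1.0, 2.0, 3.0])
    u2 = np.array([0.0, 1.0, 1.0])
    print(is_linearly_independent([u1, u2]))               # True
    print(is_linearly_independent([u1, u2, u1 - 2 * u2]))  # False: dependent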


2.4.4 Example
When you need to determine if a set of vectors is linearly independent or not, you usually just do what the definition of linear independence says: you solve the associated system. Let us do this with a concrete example. Let

u1 = [−1, 4, −2, −3]^T, u2 = [3, −13, 7, 7]^T, u3 = [−2, 1, 9, −5]^T.

So let us solve the system

x1u1 + x2u2 + x3u3 = 0.

The augmented matrix corresponding to that system is

[ −1    3  −2 | 0 ]
[  4  −13   1 | 0 ]
[ −2    7   9 | 0 ]
[ −3    7  −5 | 0 ]
→
[ −1   3  −2 | 0 ]
[  0  −1  −7 | 0 ]
[  0   1  13 | 0 ]
[  0  −2   1 | 0 ]
→
[ −1   3  −2 | 0 ]
[  0  −1  −7 | 0 ]
[  0   0   6 | 0 ]
[  0   0  15 | 0 ]
→
[ −1   3  −2 | 0 ]
[  0  −1  −7 | 0 ]
[  0   0   6 | 0 ]
[  0   0   0 | 0 ]

Back substitution now shows that the unique solution is the trivial solution x1 = x2 = x3 = 0. Therefore, u1, u2, u3 are linearly independent.

2.4.5 Theorem
Let S := {0, u1, u2, . . . , um} be a set of vectors in Rn. Then the set is linearly dependent.

Proof: Let us just plug in the definition of linear independence, that is, determine the solutions of

x0·0 + x1u1 + . . . + xmum = 0.


We are looking for non-trivial solutions. Observe that x0·0 = 0 for any x0. So why not suggest (1, 0, 0, . . . , 0) as a solution? It works and it is non-trivial. So the system has a non-trivial solution, hence the set is linearly dependent. □

2.4.6 Example
Consider

u1 = [16, 2, 8]^T, u2 = [22, 4, 4]^T, u3 = [18, 0, 4]^T, u4 = [18, 2, 6]^T.

Are these vectors linearly independent? We set up the equation for testing linear independence.

[ 16  22  18  18 | 0 ]
[  2   4   0   2 | 0 ]
[  8   4   4   6 | 0 ]
→
[ 8  11  9  9 | 0 ]
[ 0  −5  9  1 | 0 ]
[ 0   0  4  1 | 0 ]

There is at least one free variable, which means for homogeneous systems that there is not only the trivial solution. We therefore conclude that the vectors are linearly dependent.

In this example we encounter one situation in which we actually know that a set is linearly dependent without solving a system:

2.4.7 Theorem
Suppose that {u1, u2, . . . , um} is a set of vectors in Rn. If n < m, then the set is linearly dependent.

Proof: Consider the system x1u1 + x2u2 + . . . + xmum = 0. As this is a homogeneous system, we will not get any inconsistency. Because of the assumption, this system has fewer rows than columns. Hence, there must be free variables, thus infinitely many solutions. In particular, the system does not only have the trivial solution, hence the vectors are linearly dependent.

�

Let us now make a connection between 'span' and 'linear independence'.

2.4.8 Theorem
Let {u1, u2, . . . , um} be a set of vectors in Rn. Then the set is linearly dependent if and only if one of the vectors in the set is in the span of the other vectors.


Proof: Suppose the set is linearly dependent. Then there exist scalars c1, c2, . . . , cm, not all zero, such that

c1u1 + c2u2 + . . .+ cmum = 0.

Note that this is the equation for linear independence. Without loss of generality, let us assume that c1 ≠ 0. Then solve the equation for u1 to get

u1 = −(c2/c1)u2 − . . . − (cm/c1)um.

This is an expression which shows that u1 is a linear combination of the others. Hence, u1 is in the span of {u2, . . . , um}. Therefore, the 'forward' direction of the assertion is true.

Let us now suppose that one of the vectors is in the span of the remaining vectors. Again, without loss of generality we assume that u1 is in the span of {u2, . . . , um}. Hence, there are scalars b2, b3, . . . , bm such that

u1 = b2u2 + . . .+ bmum.

This equation is equivalent to

u1 − b2u2 − . . .− bmum = 0.

Note that u1 has 1 as coefficient, so we see that the zero vector is a non-trivial linear combination of the vectors, which means by definition that they are linearly dependent. This shows the 'backward' direction. �

2.4.9 Remark
Note that the previous theorem does not mean that every vector in a linearly dependent set must be a linear combination of the others. There is just at least one that is a linear combination of the others.

We can collect the previous theorems, reinterpret them, and summarize as follows:

2.4.10 Theorem
Let a1, a2, . . . , am and b be vectors in Rn. Then the following statements are equivalent.
(a) The set {a1, a2, . . . , am} is linearly independent.
(b) The vector equation x1a1 + x2a2 + . . . + xmam = b has at most one solution.


(c) The linear system corresponding to [a1 a2 . . . am | b] has at most one solution.
(d) The equation Ax = b with A = [a1 a2 . . . am] has at most one solution.

Proof: (b) ⇐⇒ (c) ⇐⇒ (d) is obvious, because it is just rewriting linear systems in equation or matrix form.

So we need to prove that (b) follows from (a) and vice versa. Let us assume that (a) is true and show that (b) is true. To that end, we assume the contrary and lead that to a contradiction. So assume that x1a1 + x2a2 + . . . + xmam = b has more than one solution. We pick two different solutions, (r1, r2, . . . , rm) and (s1, s2, . . . , sm), so that

r1a1 + r2a2 + . . .+ rmam = b

s1a1 + s2a2 + . . .+ smam = b

Thus, we have

r1a1 + r2a2 + . . .+ rmam = s1a1 + s2a2 + . . .+ smam,

and hence

(r1 − s1)a1 + (r2 − s2)a2 + . . . + (rm − sm)am = 0.

Without loss of generality, we may assume that r1 ≠ s1, so we see that the zero vector is a non-trivial linear combination of a1, a2, . . . , am. But by assumption, a1, a2, . . . , am are linearly independent, which gives us the contradiction. Therefore, the assumption on the number of solutions must have been wrong.

Let us now assume (b) and show that (a) holds. Choose b = 0. As (b) holds, we know that x1a1 + x2a2 + . . . + xmam = 0 has at most one solution. As the trivial solution is already one, there cannot be another solution, which gives us by definition the linear independence of the vectors. �

2.4.11 Theorem (!THE BIG THEOREM!)
Let A = {a1, a2, . . . , an} be a set of n vectors in Rn. (Note that this time we have the same number n of vectors and entries in a column.) Furthermore, let A = [a1 a2 . . . an]. Then the following are equivalent:
(a) A spans Rn.
(b) A is linearly independent.
(c) Ax = b has a unique solution for all b in Rn.


Proof: First we show that (a) ⇐⇒ (b). If A is linearly dependent, then without loss of generality, a1 is a linear combination of the remaining vectors in A by Theorem 2.4.8. Hence, by Theorem 2.4.8, we have

span{a1, a2, . . . , an} = span{a2, . . . , an}.

By assumption (a), we have

span{a2, . . . , an} = Rn,

which is a contradiction to Theorem 2.3.7. Therefore, the assumption that A is linearly dependent must be wrong, and hence (a) ⇒ (b).

For (b) ⇒ (a), we may assume that A is linearly independent. If A does not span Rn, we find a vector a which is not a linear combination of vectors in A. Therefore, Theorem 2.4.8 applies, and {a, a1, . . . , an} is linearly independent. But these are n + 1 linearly independent vectors in Rn, which cannot be by Theorem 2.4.7; this is a contradiction. Hence (b) ⇒ (a).

We now show that (a) ⇒ (c). Note that, as we have already shown (a) ⇐⇒ (b), we can use both (a) and (b). By Theorem 2.3.10(a), we know that Ax = b has at least one solution for every b in Rn. But Theorem 2.4.10(b) implies that Ax = b has at most one solution. Hence, there is exactly one solution to the equation.

Finally, it remains to show (c) ⇒ (a). So let (c) be true, so that there is a unique solution for every vector b in Rn. But this is just saying that any vector of Rn lies in the span of A, hence (a) is true. �


Chapter 3

Matrices

3.1 Linear Transformations

So far we have always fixed a natural number n and then moved around within Rn. We never passed a bridge from Rn to Rm, say. But with what we are going to explore in this chapter, this will be possible. We will learn that certain functions allow us to go from one vector space to another without losing too much structure.

3.1.1 Definition ((Co)-Domain, Image, Range, Linear Transformation/Homomorphism)
Let T : Rm → Rn be a function that takes a vector of Rm as input and produces a vector in Rn as output. We then say that Rm is the domain of T and Rn is the codomain of T . For u ∈ Rm, we call T (u) the image of u under T . The set of the images of all vectors in the domain is called the range of T . Note that the range is a subset of the codomain.
A function T : Rm → Rn is a linear transformation, or a linear homomorphism, if for all vectors u and v in Rm, and all scalars c, the following holds:
(a) T (u + v) = T (u) + T (v).
(b) T (cu) = cT (u).

3.1.2 Example
Check if the following functions are linear homomorphisms.

(a) T1 : R1 → R2, [u] ↦ [u, −3u]^t.


Let us check the first condition. We need to take two vectors [u1], [u2] ∈ R1 and see if the function respects addition.

T1([u1 + u2]) = [u1 + u2, −3(u1 + u2)]^t = [u1, −3u1]^t + [u2, −3u2]^t = T1([u1]) + T1([u2]).

Let us check the second condition. Let c ∈ R be a scalar and let [u] ∈ R1.

T1(c[u]) = T1([cu]) = [cu, −3(cu)]^t = c[u, −3u]^t = cT1([u]).

Both conditions are satisfied, so T1 is a linear homomorphism.

(b) T2 : Rn → Rm, u ↦ 0.
Let us check the conditions.

T2(u1 + u2) = 0 = 0 + 0 = T2(u1) + T2(u2).

Finally

cT2(u) = c0 = 0 = T2(cu),

hence this is also a linear homomorphism.

(c) T3 : R2 → R2, [u1, u2]^t ↦ [u1^2, u2^2]^t.
We know from the binomial formula that squaring does not preserve addition. So we are already expecting that the first condition will be violated. Check the following vectors:

T3([2, 3]^t + [4, 4]^t) = T3([6, 7]^t) = [36, 49]^t,

but

T3([2, 3]^t) + T3([4, 4]^t) = [4, 9]^t + [16, 16]^t = [20, 25]^t.

We can stop here, because the first condition is not satisfied, and we can already conclude that T3 is not a linear homomorphism.

(d) T4 : R2 → R2, [u1, u2]^t ↦ [u1 + u2, 1]^t.

Let us check the first condition.

T4([2, 2]^t + [2, 2]^t) = T4([4, 4]^t) = [8, 1]^t,


but

T4([2, 2]^t) + T4([2, 2]^t) = [4, 1]^t + [4, 1]^t = [8, 2]^t,

and hence it is not a linear homomorphism. But check that it is in fact a homomorphism if we map the second component to 0 instead of 1.

[Four domain/codomain diagrams illustrating the possible cases: (a) not injective, not surjective; (b) injective, not surjective; (c) not injective, surjective; (d) injective, surjective.]

3.1.3 Definition (Injective/Surjective Homomorphism)
Let T : Rm → Rn be a linear homomorphism.
(a) T is called injective or one-to-one, if for every vector w in Rn there exists at most one vector u in Rm such that T (u) = w.
(b) T is called a surjective homomorphism or onto, if for every vector w in Rn there exists at least one vector u in Rm such that T (u) = w.
(c) T is called an isomorphism or bijective, if it is injective and surjective.

3.1.4 Lemma
Let T : Rm → Rn be a linear homomorphism. Then T is injective if and only if T (u) = T (v) implies that u = v.

Proof: Suppose T is injective, and let T (u) = T (v). Put w := T (u) = T (v). Then u and v are both mapped to the same image w, hence they must be equal by the definition of injectivity.
Assume now that T (u) = T (v) always implies u = v. If two vectors u and v have the same image w under T , that is, T (u) = T (v), then by assumption u = v, so every possible image has at most one preimage, which is the definition of T being injective. �

3.1.5 Definition (Kernel)
Let T : Rm → Rn be a homomorphism. Then the kernel of T is defined to be the set of vectors x in Rm which satisfy

T (x) = 0.

Page 59: Winter 2016 - University of Washingtonnaehrn/teaching/m308.pdfMath 308, Linear Algebra with Applications Natalie Naehrig Winter 2016

56 Matrices

The kernel is denoted by ker(T ).

3.1.6 Theorem
Let T : Rm → Rn be a linear homomorphism. Then T is injective if and only if T (x) = 0 has only the trivial solution, in other words, if and only if ker(T ) = {0}.

Proof: If T is injective, then there is at most one solution to T (x) = 0. But T is linear, so 0 = 0T (x) = T (0x) = T (0), so that 0 is already a solution. That means that T (x) = 0 has only the trivial solution.
Let us now suppose that T (x) = 0 has only the trivial solution. Let u, v be vectors such that T (u) = T (v). Then 0 = T (u) − T (v) = T (u − v). By our assumption, we conclude that u − v = 0, hence u = v. �

3.1.7 Remark (Tool!)
Determining whether a linear homomorphism is injective or not just means finding out if T (x) = 0 has only the trivial solution.

3.1.8 Lemma
Suppose that A = [a1 a2 . . . am] is an (n,m)-matrix, and let x = [x1, x2, . . . , xm]^t and y = [y1, y2, . . . , ym]^t. Then A(x + y) = Ax + Ay.

3.1.9 Theorem (!Matrix Product as Homomorphism)
Let A be an (n,m)-matrix and define T (x) = Ax. Then T : Rm → Rn is a linear homomorphism.

Proof: We look at the definition and see that we need to prove two properties. So let u, v be two arbitrary vectors in Rm.
(a) T (u + v) = A(u + v) = Au + Av = T (u) + T (v).
(b) c(T (u)) = cAu = (cA)u = A(cu) = T (cu).

�

The following theorem gives a tool for deciding whether a vector lies in the range of a linear transformation or not.

Page 60: Winter 2016 - University of Washingtonnaehrn/teaching/m308.pdfMath 308, Linear Algebra with Applications Natalie Naehrig Winter 2016

Linear Transformations 57

3.1.10 Theorem
Let A = [a1 a2 . . . am] be an (n,m)-matrix, and let T : Rm → Rn, x ↦ Ax be a linear transformation. Then
(a) The vector w is in the range of T if and only if Ax = w is a consistent linear system.
(b) range(T ) = span{a1, . . . , am}.

Proof: (a) A vector w is in the range of T if and only if there exists a vector v ∈ Rm such that

T (v) = Av = w.

But this equation is true if and only if v is a solution to the system

Ax = w.

(b) By Theorem 2.3.10(a),(d), we know that Ax = w is consistent if and only if w is in span{a1, a2, . . . , am}, which is just what the assertion states. �

Let us find conditions for when a homomorphism is injective or surjective.

3.1.11 Theorem
Let A be an (n,m)-matrix and define the homomorphism T : Rm → Rn, x ↦ Ax. Then
(a) T is injective if and only if the columns of A are linearly independent.
(b) If n < m, then T is not injective.
(c) T is surjective, if and only if the columns of A span the codomain Rn.
(d) If n > m, then T is not surjective.

Proof: (a) By Theorem 3.1.6, T is injective if and only if T (x) = 0 has only the trivial solution. By Theorem 2.4.3, T (x) = 0 has only the trivial solution if and only if the columns of A are linearly independent.
(b) If A has more columns than rows, we can apply Theorem 2.4.7. The columns of A are therefore linearly dependent. Hence, by (a) of this theorem, T is not injective.
(c) T is surjective if and only if its range is Rn. By Theorem 3.1.10, the range of T equals the span of the columns of A. Hence, the range of T is the whole of Rn if and only if the columns of A span Rn.
(d) By Theorem 2.3.7 and the assumption n > m, the columns do not span Rn, hence T is not onto. �

Let us try an example to see how we tackle questions about injectivity and surjectivity.

Page 61: Winter 2016 - University of Washingtonnaehrn/teaching/m308.pdfMath 308, Linear Algebra with Applications Natalie Naehrig Winter 2016

58 Matrices

3.1.12 Example
(a) Consider the homomorphism T : R2 → R3, x ↦ Ax with

A = [  1  3 ]
    [ −1  0 ]
    [  3  3 ].

We determine whether T is injective or surjective. Let us first consider injectivity. We apply Theorem 3.1.11 and find the echelon form of

[  1  3 | 0 ]     [ 1 0 | 0 ]
[ −1  0 | 0 ] ∼   [ 0 1 | 0 ]
[  3  3 | 0 ]     [ 0 0 | 0 ].

This means that the associated linear system has only the trivial solution, so that the columns of A are linearly independent. Hence, we conclude that T is injective.
The reasoning for T not being surjective is easier: because 3 > 2, T cannot be surjective by Theorem 3.1.11(d).

(b) Is the homomorphism T : R3 → R3, x ↦ Ax with

A = [ 2 1 1 ]
    [ 1 2 0 ]
    [ 1 3 0 ]

surjective? By Theorem 3.1.11 we need to determine whether the columns of the matrix span R3. Let us find the echelon form of

[ 2 1 1 | ? ]     [ 2 1   1  | ? ]
[ 1 2 0 | ? ] ∼   [ 0 1 −1/3 | ? ]
[ 1 3 0 | ? ]     [ 0 0  −2  | ? ].

Whatever vector [?, ?, ?] we start with, the system will always be consistent. We thus conclude that T is surjective.

The following theorem is important. We introduced homomorphisms, and we saw that x ↦ Ax for a matrix A is a homomorphism. But the next theorem proves that any homomorphism can be written as a matrix-vector-product homomorphism.

3.1.13 Theorem
Let T : Rm → Rn be a function (not yet a homomorphism!). There is an (n,m)-matrix A such that T (x) = Ax for any x ∈ Rm, if and only if T is a homomorphism.

Page 62: Winter 2016 - University of Washingtonnaehrn/teaching/m308.pdfMath 308, Linear Algebra with Applications Natalie Naehrig Winter 2016

Linear Transformations 59

Proof: We have already proven one direction in Theorem 3.1.9. So let us assume that T is a homomorphism. We need to find a matrix A that satisfies Ax = T (x) for any x. Consider the standard vectors

e1 := [1, 0, 0, . . . , 0]^t,  e2 := [0, 1, 0, . . . , 0]^t,  e3 := [0, 0, 1, . . . , 0]^t,  . . . ,  em := [0, 0, 0, . . . , 1]^t of Rm.

! Note that any vector in Rm can be written as a linear combination of the standard vectors, as can be seen as follows:

x = [x1, x2, x3, . . . , xm]^t = x1[1, 0, . . . , 0]^t + x2[0, 1, . . . , 0]^t + . . . + xm[0, 0, . . . , 1]^t = x1e1 + x2e2 + . . . + xmem.

Now consider the matrix

A := [T (e1) T (e2) . . . T (em)].

We then have from the linear property of T that

T (x) = T (x1e1 + x2e2 + . . . + xmem)
      = x1T (e1) + x2T (e2) + . . . + xmT (em)
      = Ax.  �

3.1.14 Example
Show that

T : R2 → R3,  [x1, x2]^t ↦ [x2, 2x1 − 3x2, x1 + x2]^t

is a homomorphism.

Page 63: Winter 2016 - University of Washingtonnaehrn/teaching/m308.pdfMath 308, Linear Algebra with Applications Natalie Naehrig Winter 2016

60 Matrices

We could verify the defining properties of homomorphisms (see Definition 3.1.1). Instead, we find a matrix such that T (x) = Ax for all x and apply Theorem 3.1.13.

T ([x1, x2]^t) = [x2, 2x1 − 3x2, x1 + x2]^t = [0x1 + x2, 2x1 − 3x2, x1 + x2]^t,

which is the matrix-vector product

[ 0  1 ]
[ 2 −3 ] [x1, x2]^t.
[ 1  1 ]

So

A = [ 0  1 ]
    [ 2 −3 ]
    [ 1  1 ]

is the desired matrix, and T is indeed a linear transformation.

3.1.15 Theorem (The Big Theorem, Version 2)
Let A = {a1, a2, . . . , an} ⊆ Rn, A = [a1 . . . an], and T : Rn → Rn, x ↦ Ax. Then the following are equivalent:
(a) A spans Rn.
(b) A is linearly independent.
(c) Ax = b has a unique solution for any b in Rn.
(d) T is surjective.
(e) T is injective.

Proof: In Theorem 2.4.11, we have already proven the equivalences (a) ⇐⇒ (b) ⇐⇒ (c). From Theorem 3.1.11(c) we see that (a) and (d) are equivalent. From Theorem 3.1.11(a) we see that (b) and (e) are equivalent. �

3.2 Matrix Algebra

In Definition 1.4.1, we introduced matrices. We used matrices as a means to deal with systems of linear equations. They provided a tool to find solutions to a linear system by performing Gaussian elimination. We also introduced the matrix-vector product in Definition 2.1.8. This is already a first hint that the set of matrices over R provides a very rich algebraic structure. We are going to explore these structures systematically in this section.

Page 64: Winter 2016 - University of Washingtonnaehrn/teaching/m308.pdfMath 308, Linear Algebra with Applications Natalie Naehrig Winter 2016

Matrix Algebra 61

3.2.1 Definition
Let

A = [ a11 a12 . . . a1m ]
    [ a21 a22 . . . a2m ]
    [ ...  ...      ... ]
    [ an1 an2 . . . anm ]

and

B = [ b11 b12 . . . b1m ]
    [ b21 b22 . . . b2m ]
    [ ...  ...      ... ]
    [ bn1 bn2 . . . bnm ]

be two (n,m)-matrices over R, and let c ∈ R. Then addition and scalar multiplication of matrices are defined as follows:

(a) Addition:

A + B = [ (a11 + b11) (a12 + b12) . . . (a1m + b1m) ]
        [ (a21 + b21) (a22 + b22) . . . (a2m + b2m) ]
        [      ...          ...            ...      ]
        [ (an1 + bn1) (an2 + bn2) . . . (anm + bnm) ]

(b) Scalar multiplication:

cA = [ ca11 ca12 . . . ca1m ]
     [ ca21 ca22 . . . ca2m ]
     [  ...   ...      ...  ]
     [ can1 can2 . . . canm ]

Just as we did with vectors, we find some algebraic properties which matrices inherit from the corresponding properties in R. Compare with Theorem 2.1.4.

3.2.2 Theorem
Let s, t be scalars and A, B and C be (n,m)-matrices. Let 0nm be the (n,m)-zero-matrix, which has 0 in every entry. Then
(a) A + B = B + A
(b) s(A + B) = sA + sB
(c) (s + t)A = sA + tA
(d) (st)A = s(tA)
(e) A + 0nm = A

Proof: Mimic the proof of Theorem 2.1.4. �

The operations 'addition' and 'scalar multiplication' did not hit us by surprise. What we already knew about vectors is easily generalized to matrices.

Page 65: Winter 2016 - University of Washingtonnaehrn/teaching/m308.pdfMath 308, Linear Algebra with Applications Natalie Naehrig Winter 2016

62 Matrices

What is really new is the following way to define a new and unexpected matrix product. Instead of a componentwise multiplication, we will introduce a convolution product. This new product reveals a lot of algebraic structure that inhabits the set of (n,m)-matrices.

3.2.3 Definition (Matrix-Matrix-Product)
Let A be an (n, k)-matrix and B = [b1 b2 . . . bm] be a (k,m)-matrix. Then the matrix product AB is the (n,m)-matrix given by

AB = [Ab1 Ab2 . . . Abm].

3.2.4 Remark
Note that the matrix product of two matrices is only defined if the number of columns of the first matrix equals the number of rows of the second matrix.

3.2.5 Example
Let

A = [  2 0 ]   and   B = [ 2 1 0 ]
    [ −1 3 ]             [ 0 4 2 ]

and compute the matrix product AB. We defined AB to be [Ab1 Ab2 Ab3]. So let us compute the three matrix-vector products:

Ab1 = 2[2, −1]^t + 0[0, 3]^t = [4, −2]^t
Ab2 = 1[2, −1]^t + 4[0, 3]^t = [2, 11]^t
Ab3 = 0[2, −1]^t + 2[0, 3]^t = [0, 6]^t,

so that

AB = [  4  2 0 ]
     [ −2 11 6 ].

Even though the definition of the matrix product works fine, there is a formula which makes it much more practical - even if you do not think so when first seeing it. It is all about using both hands in order to get used to the matrix product.

Page 66: Winter 2016 - University of Washingtonnaehrn/teaching/m308.pdfMath 308, Linear Algebra with Applications Natalie Naehrig Winter 2016

Matrix Algebra 63

3.2.6 Lemma
Let A be an (n, k)-matrix and B = [b1 b2 . . . bm] be a (k,m)-matrix. Then the (i, j)-entry of C = AB for 1 ≤ i ≤ n and 1 ≤ j ≤ m is given by

Cij = Ai1B1j + Ai2B2j + . . . + AikBkj = Σ_{t=1}^{k} AitBtj.

Now we are ready to investigate the set of matrices with yet another product and find out which structures are revealed.

3.2.7 Definition (Zero Matrix, Identity Matrix)
(a) The (n,m)-matrix having only zero entries is called the zero (n,m)-matrix and denoted by 0nm.
(b) The identity (n, n)-matrix is of the form

[ 1 0 0 . . . 0 ]
[ 0 1 0 . . . 0 ]
[ 0 0 1 . . . 0 ]
[ ...       ... ]
[ 0 0 0 . . . 1 ]

and denoted by In, or I if the dimensions are obvious.
(c) We introduce the short notation [aij] for

[ a11 a12 . . . a1m ]
[ a21 a22 . . . a2m ]
[ ...  ...      ... ]
[ an1 an2 . . . anm ].

3.2.8 Theorem
Let s be a scalar, and let A, B and C be matrices. Then each of the following holds, whenever the dimensions of the matrices are such that the operation is defined.
(a) A(BC) = (AB)C
(b) A(B + C) = AB + AC
(c) (A + B)C = AC + BC
(d) s(AB) = (sA)B = A(sB)
(e) AI = A
(f) IA = A

Page 67: Winter 2016 - University of Washingtonnaehrn/teaching/m308.pdfMath 308, Linear Algebra with Applications Natalie Naehrig Winter 2016

64 Matrices

Proof: All proofs apply Lemma 3.2.6, so we illustrate only parts (c) and (e).
(c) Let A = [aij] and B = [bij] be (n,m)-matrices and let C = [cij] be an (m, k)-matrix. Then the (i, j)-entry of (A + B)C is

((A + B)C)ij = Σ_{t=1}^{m} (Ait + Bit)Ctj = Σ_{t=1}^{m} (AitCtj + BitCtj) = Σ_{t=1}^{m} AitCtj + Σ_{t=1}^{m} BitCtj.

But the last term is just the (i, j)-entry of AC + BC, which is just what we wanted to prove.
(e) The (i, j)-entry of AIm is by Lemma 3.2.6

(AIm)ij = Σ_{t=1}^{m} AitItj.

But Itj is nonzero only if t = j, so that the sum collapses to

(AIm)ij = Aij,

so the assertion follows. �

We will now see a phenomenon that we have not encountered before. We need to get used to the fact that different products lead to different behaviour.

3.2.9 Theorem
Let A, B and C be matrices with dimensions so that all products are well defined.
(a) It is possible that AB ≠ BA, i.e. the set of matrices is not commutative under the matrix product.
(b) AB = 0 does not imply that A = 0 or B = 0.
(c) AC = BC does not imply that A = B or C = 0.

Proof: (a) We give examples that prove the statements. Consider

A = [  2 −1 ]   and   B = [ 0  1 ]
    [ −1  0 ]             [ 4 −1 ].

Then

AB = [ −4  3 ]   and   BA = [ −1  0 ]
     [  0 −1 ]              [  9 −4 ].

(b) Consider

A = [ 1 2 ]   and   B = [ −4  6 ]
    [ 3 6 ]             [  2 −3 ].

Then AB = 022, but neither A nor B is the zero matrix.
(c) Consider

A = [ −3  3 ]   and   B = [ −1 2 ]   and   C = [ 1 3 ]
    [ 11 −3 ]             [  3 1 ]             [ 2 6 ].

Even though A ≠ B and C is not the zero matrix, we have AC = BC. �
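These counterexamples are quick to verify numerically. A one-off sketch checking part (b):

```python
# AB can be the zero matrix even though neither A nor B is zero.
import numpy as np

A = np.array([[1, 2], [3, 6]])
B = np.array([[-4, 6], [2, -3]])
print(A @ B)   # [[0, 0], [0, 0]]
```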

Let us continue defining operations on the set of matrices.

3.2.10 Definition
The transpose of an (n,m)-matrix A, denoted by At, AT, and also Atr, is the (m, n)-matrix defined by (At)ij = Aji. Practically, it is interchanging rows and columns.

3.2.11 Example
(a) Consider

A = [ 1  2  3  4 ]
    [ 5  6  7  8 ]
    [ 9 10 11 12 ].

Then

At = [ 1 5  9 ]
     [ 2 6 10 ]
     [ 3 7 11 ]
     [ 4 8 12 ].

(b) ([2, 3, −1, 2, 4]^t)^t = [2 3 −1 2 4].

We have the following properties for transposing.


3.2.12 Theorem
Let A, B be (n,m)-matrices, C be an (m, k)-matrix and s be a scalar. Then
(a) (A + B)t = At + Bt

(b) (sA)t = sAt

(c) (AC)t = CtAt

Proof: (a) and (b) are straightforward. (c), though not difficult in its idea, is very technical (apply Lemma 3.2.6) and therefore left out. �

3.2.13 Definition (Symmetric Matrices)
A matrix A that satisfies At = A is called symmetric. Note that it must be an (n, n)-matrix.

3.2.14 Definition (Power of a matrix)
We define the k-th power of a matrix A to be the product

Ak = A · A · · · A,

where we have k factors.

3.2.15 Definition (Diagonal Matrix, Upper/Lower Triangular Matrix)
(a) A diagonal (n, n)-matrix is of the form

[ a11  0   0  . . .  0  ]
[  0  a22  0  . . .  0  ]
[  0   0  a33 . . .  0  ]
[ ...            .  ... ]
[  0   0   0  . . . ann ].

(b) An upper triangular (n, n)-matrix is of the form

[ a11 a12 a13 . . . a1n ]
[  0  a22 a23 . . . a2n ]
[  0   0  a33 . . . a3n ]
[ ...            .  ... ]
[  0   0   0  . . . ann ].


(c) A lower triangular (n, n)-matrix is of the form

[ a11  0   0  . . .  0  ]
[ a21 a22  0  . . .  0  ]
[ a31 a32 a33 . . .  0  ]
[ ...            .  ... ]
[ an1 an2 an3 . . . ann ].

We introduced these matrices because they behave in a particular way under multiplication with matrices of the same shape.

3.2.16 Theorem
(a) If

A = [ a11  0  . . .  0  ]
    [  0  a22 . . .  0  ]
    [ ...         .  ...]
    [  0   0  . . . ann ]

is a diagonal matrix and k ≥ 1 an integer, then

Ak = [ a11^k   0   . . .   0   ]
     [   0   a22^k . . .   0   ]
     [  ...              . ... ]
     [   0     0   . . . ann^k ].

(b) If A, B are upper (lower) triangular (n, n)-matrices, then AB is also an upper (lower) triangular (n, n)-matrix. In particular, if k ≥ 1 is an integer, then Ak is also an upper (lower) triangular (n, n)-matrix.

Proof: (a) We prove this assertion by induction. If k = 1, then A = A1 = Ak is a diagonal matrix, so the assertion is true for k = 1. Let us assume that the assertion is true for all integers up to k − 1. We need to prove that it is also true for k. Let us write Ak = (Ak−1) · A. For Ak−1, the induction


hypothesis holds, and we know that this is a diagonal matrix:

Ak = Ak−1 · A

   = [ a11^{k−1}     0      . . .      0     ] [ a11  0  . . .  0  ]
     [     0      a22^{k−1} . . .      0     ] [  0  a22 . . .  0  ]
     [    ...                    .    ...    ] [ ...         . ... ]
     [     0          0     . . . ann^{k−1} ] [  0   0  . . . ann ]

   = [ a11^k   0   . . .   0   ]
     [   0   a22^k . . .   0   ]
     [  ...              . ... ]
     [   0     0   . . . ann^k ].

The last equality results from just performing the matrix multiplication.
(b) We prove the assertion for upper triangular matrices; the proof for lower triangular matrices works just the same. We need to show that below the diagonal there are only zero entries. The indices of any entry (AB)ij below the diagonal satisfy i > j. So let i > j and determine what (AB)ij is. By Lemma 3.2.6, we have

(AB)ij = Σ_{t=1}^{n} aitbtj.

Now consider the following four cases:
(i) i > t: Then ait = 0, so the product aitbtj = 0.
(ii) t > j: Then btj = 0, and so is the product aitbtj.
(iii) t ≥ i: But then we have t ≥ i > j by assumption, so that btj = 0, hence the product aitbtj = 0.
(iv) j ≥ t: But then i > j ≥ t, so that ait = 0 in this case, hence the product aitbtj = 0.
These cover all possible cases for t, which means that

(AB)ij = Σ_{t=1}^{n} aitbtj = 0

for i > j. �
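Part (a) is easy to sanity-check numerically; a quick sketch:

```python
# Powers of a diagonal matrix are taken entrywise on the diagonal.
import numpy as np

D = np.diag([2.0, 3.0, 5.0])
print(np.allclose(np.linalg.matrix_power(D, 4),
                  np.diag([2.0**4, 3.0**4, 5.0**4])))   # True
```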

3.2.17 Remark
As we have seen, the set of (n, n)-matrices satisfies the following properties for any matrices A, B, C and scalars a, b ∈ R.


(a) A + B = B + A
(b) A + 0 = A
(c) (AB)C = A(BC)
(d) AI = A
(e) A(B + C) = AB + AC
(f) (A + B)C = AC + BC
(g) a(A + B) = aA + aB
(h) (a + b)A = aA + bA
(i) (ab)A = a(bA)
(j) 1A = A
(k) a(AB) = (aA)B = A(aB)
Because of these properties, we call the set of (n, n)-matrices over R an R-algebra.

3.3 Inverses

In practical applications such as encoding data, encrypting messages, or logistics, linear homomorphisms are used to handle data. But the nature of these applications requires that we can get back from an image to the preimage. This is the scope of this section. We will determine the properties a homomorphism needs to have in order to "go back", i.e. in order to have an inverse. Moreover, we will learn how to determine the inverse of a homomorphism.

3.3.1 Definition (Invertible homomorphism, Inverse)
A homomorphism T : Rm → Rn is invertible, if T is an isomorphism, i.e. injective and surjective. When T is invertible, the inverse T−1 : Rn → Rm is defined by

T−1(y) = x if and only if T (x) = y.

What we have is the following picture.


[Diagram: T maps x in the domain to y = T (x) in the codomain; T−1 sends y = T (x) back to x = T−1(y).]

3.3.2 Definition (Identity function)
A function f that satisfies f(x) = x for all x in the domain of f is called the identity function. It is denoted by id.

3.3.3 Remark
If T is an invertible homomorphism, then the inverse function is uniquely determined. Its domain is the range of T , which is equal to the codomain of T , and its codomain is the domain of T . Moreover,

T (T−1(y)) = y and T−1(T (x)) = x,

hence T ∘ T−1 = id is the identity on the range of T and T−1 ∘ T = id is the identity on the domain of T .

3.3.4 Theorem
Let T : Rm → Rn be a homomorphism. Then
(a) If T has an inverse, then m = n.
(b) If T is invertible, then T−1 is also an invertible homomorphism.

Proof: (a) If T has an inverse, then it is surjective and injective. By Theorem 3.1.11(b) and (d) we have n ≥ m and n ≤ m, hence equality.
(b) We need to show that T−1(u + v) = T−1(u) + T−1(v) for all u, v. But T is an isomorphism, so there are unique u′, v′ such that T (u′) = u and T (v′) = v.


Therefore, we have

T−1(u + v) = T−1(T (u′) + T (v′))
           = T−1(T (u′ + v′))
           = u′ + v′
           = T−1(u) + T−1(v).

Moreover, we need to show that T−1 respects scalar multiplication. So, similarly to the above, we have T−1(ru) = T−1(rT (u′)) = T−1(T (ru′)) = ru′ = rT−1(u). �

Remember Theorem 3.1.13? Every homomorphism T can be written in matrix form, meaning there is a matrix A such that T (x) = Ax for any vector x in the domain. What does this mean in the world of inverses?

3.3.5 Lemma
Let idRn be the identity on Rn. Then the associated matrix for this homomorphism is the identity matrix.

Proof: For any vector in Rn we need to find a matrix A such that id(x) = Ax. By plugging in the standard vectors, the assertion follows. �

3.3.6 Theorem
Suppose that T is an invertible homomorphism with associated matrix AT. Then the inverse T−1 has an associated matrix AT−1 which satisfies AT AT−1 = I.

Proof: By Lemma 3.3.5 and Remark 3.3.3, the assertion follows. �

3.3.7 Definition (Invertible Matrix, Singular Matrix)
Let A be an (n, n)-matrix. Then A is called invertible or nonsingular, if there exists an (n, n)-matrix B such that AB = In. If A has no inverse, it is called singular.

3.3.8 Corollary
Let T be an invertible homomorphism. Then the associated matrix A is also invertible. �

3.3.9 Theorem
Let A be an invertible matrix and let B be a matrix with AB = I. Then BA = In. Moreover, B is the unique matrix such that AB = BA = I.


Proof: Let x ∈ Rn. Then

(AB)(Ax) = I(Ax) = Ax ⇒ A(BAx) = Ax ⇒ A(BAx − x) = 0.

But A is invertible, in particular it is injective. So Ay = 0 has only the trivial solution. Therefore, A(BAx − x) = 0 means that BAx − x must be the zero vector, hence BAx = x. Since x ∈ Rn was arbitrary, we have that BA is the identity matrix.
Now suppose that there is a matrix C with AB = AC = I. Then

B(AB) = B(AC) ⇒ (BA)B = (BA)C ⇒ B = C,

as we have just proven that BA = I. Hence B = C, so it is unique. �

3.3.10 Definition (Inverse matrix)
Let A be an invertible matrix. Then the uniquely determined matrix B such that AB = BA = I is called the inverse of A and is denoted by B = A−1.

3.3.11 Theorem
Let A and B be invertible (n, n)-matrices, and C, D be (n,m)-matrices.
(a) A−1 is invertible with (A−1)−1 = A.
(b) AB is invertible with (AB)−1 = B−1A−1.
(c) If AC = AD, then C = D. (Compare that to Theorem 3.2.9.)
(d) If AC = 0nm, then C = 0nm.

Proof: (a) We know that A−1A = I. So by Theorem 3.3.9, A is the unique inverse of A−1, i.e. (A−1)−1 = A.
(b) Note that

(AB)(B−1A−1) = AIA−1 = AA−1 = I.

Therefore, the inverse of AB is (AB)−1 = B−1A−1.
(c) Consider C = IC = A−1AC = A−1AD = ID = D.
(d) As in (c). �

How do we find the inverse of a matrix? As we will see, there is an algorithm that provides a way to find the inverse of a matrix, if it exists. For this, it is convenient to introduce the following notion.


3.3.12 Definition
The vector

ei = [0, . . . , 0, 1, 0, . . . , 0]^t ∈ Rn,

where the nonzero entry is in the i-th position, is called the i-th standard vector of Rn.

Let A be an invertible matrix, and denote for the time being the inverse of A by B = [b1 . . . bn]. Then, by definition, we have AB = I = [e1 . . . en]. Remember the definition of a matrix product, Definition 3.2.3. From that, we can read off that

Ab1 = e1, . . . , Abn = en.

But this reads as "b1 is the solution to Ax = e1, . . ., bn is the solution to Ax = en". Each of these n systems could have been solved one at a time by performing Gauss-Jordan elimination on each of the systems

[a1 . . . an | ei] for all 1 ≤ i ≤ n.

But we can save some time and work by doing that simultaneously with one large augmented matrix:

[a1 . . . an | e1 . . . en].

If we are done with Gauss-Jordan elimination, then we will have

[A | In] transformed to [In | A−1].

3.3.13 Example
Find the inverse of

A = [ 1 0 −2 ]
    [ 2 1  2 ]
    [ 2 1  3 ].


So we need to Gauss-Jordan-eliminate the following large augmented matrix:

[ 1 0 −2 | 1 0 0 ]     [ 1 0 −2 |  1 0 0 ]     [ 1 0 −2 |  1  0 0 ]     [ 1 0 0 |  1 −2  2 ]
[ 2 1  2 | 0 1 0 ]  ∼  [ 0 1  6 | −2 1 0 ]  ∼  [ 0 1  6 | −2  1 0 ]  ∼  [ 0 1 0 | −2  7 −6 ]
[ 2 1  3 | 0 0 1 ]     [ 0 1  7 | −2 0 1 ]     [ 0 0  1 |  0 −1 1 ]     [ 0 0 1 |  0 −1  1 ]

We therefore conclude that the right-hand side of the augmented matrix is the inverse

A−1 = [  1 −2  2 ]
      [ −2  7 −6 ]
      [  0 −1  1 ].

3.3.14 Example
How do we see if a matrix does not have an inverse? Let us have a look at an example. Does

A = [  1 0 0 ]
    [  0 1 1 ]
    [ −2 3 3 ]

have an inverse? Start with the algorithm:

[  1 0 0 | 1 0 0 ]     [ 1 0 0 | 1  0 0 ]     [ 1 0 0 | 1  0 0 ]
[  0 1 1 | 0 1 0 ]  ∼  [ 0 1 1 | 0  1 0 ]  ∼  [ 0 1 1 | 0  1 0 ]
[ −2 3 3 | 0 0 1 ]     [ 0 3 3 | 2  0 1 ]     [ 0 0 0 | 2 −3 1 ]

The third row shows that at least one of the three systems is inconsistent: for instance, for Ax = e1 it reads 0 = 2. It therefore has no solution, hence A has no inverse.

For (2, 2)-matrices there is a quick formula for finding the inverse. Later, with the help of determinants, we will see why the formula is correct.


3.3.15 Remark
Let

A = [ a b ]
    [ c d ]

be a (2, 2)-matrix. If ad − bc ≠ 0, then the inverse of A can be computed by

A−1 = 1/(ad − bc) [  d −b ]
                  [ −c  a ].

Proof: Just compute AA−1 and see that it is the identity matrix. �

We had a lot of statements relating linear systems and (augmented) matrices. Let us see how this relation is refined by inverse matrices, and finish the section with yet a further version of the Big Theorem.

3.3.16 Theorem (The Big Theorem, Version 3)
Let A = {a1, a2, . . . , an} ⊆ Rn, A = [a1 . . . an], and T : Rn → Rn, x ↦ Ax. Then the following are equivalent:
(a) A spans Rn.
(b) A is linearly independent (remember: i.e. Ax = 0 has only one solution).
(c) Ax = b has a unique solution for any b in Rn, given by x = A−1b.
(d) T is surjective.
(e) T is injective.
(f) A is invertible.

Proof: We already have the equivalences of (a)-(e). Therefore we need to prove that T is injective if and only if A is invertible. If T is injective, then by the equivalences (we are in the setting of the Big Theorem, "n = m") T is also surjective, hence an isomorphism, and its matrix A is invertible by Corollary 3.3.8. Conversely, if A is invertible, then Ax = 0 implies x = A−1·0 = 0, so T is injective; moreover, the unique solution of Ax = b is x = A−1b. �


Chapter 4

Subspaces

4.1 Introduction to Subspaces

We have seen homomorphisms that are surjective, and we have seen sets of vectors that span the whole of Rn. But what if a homomorphism is not surjective? What if a set of vectors does not span the whole Euclidean space? Which structural properties do we then find in the range of that homomorphism and in the span of the vectors, respectively? This is the scope of this section: we will study so-called subspaces.

4.1.1 Definition (subspace)
A subset S of Rn is a subspace, if S satisfies the following three conditions:
(i) S contains the zero vector 0.
(ii) If u and v are in S, then so is u + v.
(iii) If u is in S and r is any real number, then ru is also in S.

4.1.2 Example
(a) φ, the empty set, is not a subspace, because it does not satisfy the first property from the definition of a subspace.
(b) {0} ⊂ Rn is a subspace.
(c) Rn is a subspace.
(d) The set of all vectors that consist of integer components is not a subspace, because it violates (iii) of the definition.
We sometimes say that (b) and (c) are the trivial subspaces of Rn.


4.1.3 Example
Let us consider the set S of vectors in R3 that have a zero in the third component, i.e.

S = { [a, b, 0]^t | a, b ∈ R }.

Is S a subspace? Let us check the defining properties:

(i) [0, 0, 0]^t is an element of S, because it has a zero in the last component.

(ii) Consider [a1, b1, 0]^t, [a2, b2, 0]^t ∈ S. Then

[a1, b1, 0]^t + [a2, b2, 0]^t = [a1 + a2, b1 + b2, 0]^t,

which is again an element of S.

(iii) Let r be a scalar and [a, b, 0]^t ∈ S. Then

r[a, b, 0]^t = [ra, rb, 0]^t,

which again is an element of S.
Hence, we showed that S is indeed a subspace of R3. If we had chosen the last component to be 1, then we would easily have seen that no defining property of a subspace would have been satisfied.

Which sets do we easily recognize as subspaces?

4.1.4 Theorem
Let S = span{u1, u2, . . . , um} be a subset of Rn. Then S is a subspace of Rn.

Proof: (a) Since 0u1 + . . . + 0um lies in the span, the zero vector is an element of the span.

Page 82: Winter 2016 - University of Washingtonnaehrn/teaching/m308.pdfMath 308, Linear Algebra with Applications Natalie Naehrig Winter 2016

Introduction 79

(b) Let v, w be elements of S. Then there are scalars r1, . . . , rm and s1, . . . , sm such that

v = r1u1 + . . . + rmum and w = s1u1 + . . . + smum.

The sum of the two vectors is then

v + w = r1u1 + . . . + rmum + s1u1 + . . . + smum = (r1 + s1)u1 + . . . + (rm + sm)um.

But this last expression is easily identified as an element lying in the span of u1, . . . , um.
(c) Let c be a scalar and let v = r1u1 + . . . + rmum be an element in S. Then

cv = c(r1u1 + . . . + rmum) = (cr1)u1 + . . . + (crm)um

is again represented as an element of the span. �

4.1.5 Example
Let

A = [ 1 3 4  0 ]
    [ 0 2 4  4 ]
    [ 1 1 0 −4 ].

Let us determine the solutions of the corresponding homogeneous system:

[ 1 3 4  0 | 0 ]     [ 1  3  4  0 | 0 ]     [ 1 3 4 0 | 0 ]
[ 0 2 4  4 | 0 ]  ∼  [ 0  2  4  4 | 0 ]  ∼  [ 0 2 4 4 | 0 ]
[ 1 1 0 −4 | 0 ]     [ 0 −2 −4 −4 | 0 ]     [ 0 0 0 0 | 0 ]

We deduce that x3, x4 are free variables, which means that we set x3 = s1 and x4 = s2. Backward substitution then gives us x1 = 2s1 + 6s2 and x2 = −2s1 − 2s2. The general solution is therefore

x = s1[2, −2, 1, 0]^t + s2[6, −2, 0, 1]^t.

This can also be written in the form

S = span{ [2, −2, 1, 0]^t, [6, −2, 0, 1]^t }.


Note also that the zero vector corresponds to the trivial solution, which lies in S. By Theorem 4.1.4, we thus conclude that S is a subspace of R4.

The result of the previous example holds in general, as we see in the following theorem.

4.1.6 Theorem
If A is an (n,m)-matrix, then the set of solutions to the homogeneous system Ax = 0 is a subspace of Rm.

Proof: (a) The zero vector represents the trivial solution and is hence an element of the set of solutions.
(b) Suppose u, v are both solutions to the system. Then

A(u + v) = Au + Av = 0 + 0 = 0,

which shows that the sum of two solutions is again a solution to the homogeneous system.
(c) Let c ∈ R and u be a solution. Then

A(cu) = c(Au) = c0 = 0,

which shows that the set of solutions is also closed under scalar multiplication.
Overall, we have that the set of solutions is a subspace. �

4.1.7 Definition (Null Space)
If A is an (n,m)-matrix, then the set of solutions to the homogeneous system Ax = 0 is called the null space of A and denoted by null(A).
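Computer algebra systems compute null spaces directly. A sketch with SymPy's exact arithmetic, applied to the matrix of Example 4.1.5:

```python
# nullspace() returns a spanning set of the solution space of Ax = 0.
from sympy import Matrix

A = Matrix([[1, 3, 4, 0], [0, 2, 4, 4], [1, 1, 0, -4]])
for v in A.nullspace():
    print(v.T)   # [2, -2, 1, 0] and [6, -2, 0, 1], as found above
```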

Now that we know that the solutions to a homogeneous system form a subspace, we can ask which other sets we have encountered so far also form subspaces. That leads us directly to homomorphisms, their range, and their kernel.

4.1.8 Theorem
Let T : Rm → Rn be a homomorphism. Then the kernel of T is a subspace of the domain Rm, and the range of T is a subspace of the codomain Rn.


Proof: Let A = [a1 . . . am] be the (n,m)-matrix associated with T . Then any element of the kernel of T satisfies Ax = 0. Thus, we see that elements of the kernel are solutions to the homogeneous system associated with A, i.e. we have

ker(T ) = null(A).

By Theorem 4.1.6, we thus have that the kernel is in fact a subspace of Rm.
In Theorem 3.1.10, we saw that the range of T and the span of the columns of A are the same, i.e.

range(T ) = span{a1, . . . , am}.

By Theorem 4.1.4, the range is therefore a subspace of Rn. �

4.1.9 Example
Determine the kernel and the range of the homomorphism

T : R2 → R3,  [x1, x2]^t ↦ [2x1 − 10x2, −3x1 + 15x2, x1 − 5x2]^t.

The matrix associated with T is

A = [  2 −10 ]
    [ −3  15 ]
    [  1  −5 ].

Because ker(T ) = null(A), we just need to find the solutions of

[  2 −10 | 0 ]
[ −3  15 | 0 ]
[  1  −5 | 0 ].

But this matrix is equivalent to

[ 1 −5 | 0 ]
[ 0  0 | 0 ]
[ 0  0 | 0 ].

Any solution to this system is of the form

x = s[5, 1]^t,


which means that the kernel of T is

ker(T ) = span{ [5, 1]^t }.

What is the range of T ? By Theorem 3.1.10, the range of T is just the span of the columns of A. Hence we have

range(T ) = span{ [2, −3, 1]^t, [−10, 15, −5]^t }.

Let us close this section with the Big Theorem, extended by the results of this section.

4.1.10 Theorem (Big Theorem, Version 4)
Let A := {a1, a2, . . . , an} be a set of n vectors in Rn, let A = [a1 . . . an], and let T : Rn → Rn be given by T (x) = Ax. Then the following are equivalent:
(a) A spans Rn.
(b) A is linearly independent.
(c) Ax = b has a unique solution for any b ∈ Rn.
(d) T is surjective.
(e) T is injective.
(f) A is invertible.
(g) ker(T ) = {0}.
(h) null(A) = {0}.

Proof: (a) through (f) have already been proven in Theorem 3.3.16. Theorem 3.1.6 gives the equivalence with (g), and ker(T ) = null(A) gives the equivalence with (h). �

4.2 Basis and Dimension

When we want to find a set of vectors in Rn such that their span is the whole of Rn, we need to choose 'enough' vectors; in that sense, 'more is better'. On the other hand, when dealing with linearly independent sets, we rather choose only few such vectors: 'less is better'. As we will see in this section, there is a point that exactly meets both needs, coming from either direction of adding vectors or taking vectors out.


4.2.1 Definition (!!!!Basis!!!!)
A set B = {u1, u2, . . . , um} is called a basis for a subspace S if
(a) B spans S.
(b) B is linearly independent.

Let us first look at some properties of bases.

4.2.2 Theorem
Let B = {u1, . . . , um} be a basis for a subspace S. Then every vector v in S can be written as a linear combination

v = r1u1 + . . .+ rmum

in exactly one way.

Proof: By assumption, B is a basis for S; hence its vectors span S. Assume therefore that there are two ways to represent v ∈ S:

v = r1u1 + . . . + rmum and v = s1u1 + . . . + smum.

Reorganizing this equation gives us

(r1 − s1)u1 + . . . + (rm − sm)um = 0.

But as B is assumed to be a basis, its vectors are linearly independent, so the previous equation has only the trivial solution. That means that r1 = s1, . . . , rm = sm. �

4.2.3 Example
Consider the set A := { [1, 0, 1]^t, [1, 0, 0]^t, [0, 0, 1]^t }. Let S = span(A). We easily see that [2, 0, 2]^t is an element of S. But

[2, 0, 2]^t = 2[1, 0, 0]^t + 2[0, 0, 1]^t = 2[1, 0, 1]^t.

The set A provides two different ways of representing the vector [2, 0, 2]^t, hence A cannot be a basis for S.


4.2.4 Definition (Standard Basis)
Let

ei = [0, . . . , 0, 1, 0, . . . , 0]^t

be the i-th standard vector for 1 ≤ i ≤ n, with the 1 in the i-th position. Then {e1, e2, . . . , en} is called the standard basis of Rn.

Before we proceed we need a technical lemma.

4.2.5 Lemma
Let A and B be equivalent matrices. Then the subspace spanned by the rows of A is the same as the subspace spanned by the rows of B.

Is there always a basis for a subspace? Yes there is, and here is how you find it:

4.2.6 Theorem
Let S = span{u1, . . . , um}. A basis of S can be obtained in the following way:
STEP 1: Use the vectors u1, . . . , um to form the rows of a matrix A.
STEP 2: Transform A to echelon form B.
STEP 3: The non-zero rows of B give a basis for S.

Proof: The new rows still span S, by Lemma 4.2.5. The linear independence follows from the fact that we choose the non-zero rows of a matrix in echelon form, giving us only the trivial solution when considering the associated homogeneous system. Be careful: we are now stating linear independence for rows instead of columns! �

4.2.7 Example
Find a basis for the following subspace:

span{ [1, 0, −2, 3]^t, [−2, 2, 4, 2]^t, [3, −1, −6, 5]^t }.


Let us put them into a matrix:

[  1  0 −2 3 ]     [ 1 0 −2 3 ]
[ −2  2  4 2 ]  ∼  [ 0 2  0 8 ]
[  3 −1 −6 5 ]     [ 0 0  0 0 ].

We therefore have as a basis for the subspace:

B = { [1, 0, −2, 3]^t, [0, 2, 0, 8]^t }.

4.2.8 Lemma
Suppose U = [u1 . . . um] and V = [v1 . . . vm] are two equivalent matrices. Then any linear dependence that exists among the vectors u1, . . . , um also exists among the vectors v1, . . . , vm. This means that if, WLOG, u1 = r2u2 + . . . + rmum, then v1 = r2v2 + . . . + rmvm.

There is another way of determining a basis of a subspace, involving the columns instead of the rows.

4.2.9 Theorem
Let S = span{u1, . . . , um}. Then the following algorithm provides a basis for S.
STEP 1: Use the vectors u1, . . . , um to form the columns of a matrix A.
STEP 2: Transform A to echelon form B. The set of the pivot columns will be linearly independent (and the remaining columns will be linearly dependent on the pivot columns).
STEP 3: The columns of A corresponding to the pivot columns of B form a basis of S.

4.2.10 Example
Find a basis for the following subspace (see the sketch after this example for a machine check of both algorithms):

span{ [1, 0, −2, 3]^t, [−2, 2, 4, 2]^t, [3, −1, −6, 5]^t }.

Following the previous algorithm, we transform the following matrix into echelon form:


[  1 −2  3 ]     [ 1 −2  3 ]     [ 1 −2  3 ]
[  0  2 −1 ]  ∼  [ 0  2 −1 ]  ∼  [ 0  2 −1 ]
[ −2  4 −6 ]     [ 0  0  0 ]     [ 0  0  0 ]
[  3  2  5 ]     [ 0  8 −4 ]     [ 0  0  0 ]

The pivot columns are the first and second column, so

B := { [1, 0, −2, 3]^t, [−2, 2, 4, 2]^t }

is a basis for the subspace.

4.2.11 Remark
We obviously have two options for determining a basis of a given subspace. The first algorithm has the tendency to give back vectors with more zero entries, which is sometimes favorable. The second algorithm gives us a basis which is a subset of the original vectors. Deciding which algorithm to use depends on what you want to do with the basis afterwards.

4.2.12 Theorem
Let S be a subspace of Rn. Then any two bases of S have the same number of vectors.

Proof: We just give the idea of the proof. We assume the contrary, so that there are at least two bases, one of which has fewer elements than the other. Denote this basis by U and the one with more elements by V . By the definition of a basis, each element of V can be expressed as a linear combination of elements of U . Consider then the homogeneous equation

a1v1 + . . . + amvm = 0

in elements of V and substitute their expressions in terms of elements of U . Because of the linear independence of U , this gives a homogeneous system of equations with more variables than equations, hence infinitely many solutions, a contradiction to the linear independence of V . �

4.2.13 Definition (Dimension)
Let S be a subspace of Rn. Then the dimension of S is the uniquely determined number of vectors in any basis of S.


4.2.14 Remark
(a) The zero subspace {0} has no basis, and its dimension is defined to be 0.
(b) One finds the dimension of a space by determining a basis and counting its elements.

4.2.15 Theorem
Let U = {u1, . . . , um} be a set of vectors in a subspace S of Rn.
(a) If U is linearly independent, then either U is a basis for S or additional vectors can be added to U to form a basis of S.
(b) If U spans S, then either U is a basis for S or vectors can be removed from U to form a basis for S.

Proof: (a) If U spans S, it is a basis. If not, then select a vector s1 ∈ S that does not lie in the span of U and add it to U . Denote the new set by U2. As s1 is not in the span of U , we know that U2 is linearly independent. If U2 spans S, then it is a basis. If not, repeat the process until the set spans S, which then gives a basis for S.
(b) Apply the algorithm of Theorem 4.2.9. �

Let us go through an example of how to find a basis.

4.2.16 Example
Let

x1 = [7, 3, −6, 4]^t,   A = [ −3  3 −6  −6 ]
                            [  2  6  0  −8 ]
                            [  0 −8  4  12 ]
                            [ −3 −7 −1   9 ]
                            [  2 10 −2 −14 ].

Note that x1 lies in the null space of A. Find a basis for null(A), and then find a basis of null(A) that includes x1.

After using the Gauss algorithm we see that the null space of A has the following form:

null(A) = span{ [−1, 3, 0, 2]^t, [−3, 1, 2, 0]^t }.

Finding a basis that includes x1 is now possible with the 'column algorithm' described in Theorem 4.2.9. We therefore consider the following matrices:


[  7 −1 −3 ]     [ 3 0 −1 ]
[  3  3  1 ]  ∼  [ 0 3  2 ]
[ −6  0  2 ]     [ 0 0  0 ]
[  4  2  0 ]     [ 0 0  0 ]

The pivot columns of the echelon matrix are the first and the second, hence we take the first and the second column of the original matrix to find that

{ [7, 3, −6, 4]^t, [−1, 3, 0, 2]^t }

is a basis for null(A).

4.2.17 Theorem
Let U be a set of m vectors of a subspace S of dimension m. If U is either linearly independent or spans S, then U is a basis for S.

Proof: If U is linearly independent but does not span S, then by Theorem 4.2.15 we can add an additional vector so that U together with the new vector stays linearly independent. But this gives a linearly independent subset of S with more than m elements, which is a contradiction to the assumption that dim(S) = m.
If U spans S but is not linearly independent, there is a vector in U that lies in the span of the others. Remove this vector from U to obtain the set U1. Repeat this process until you get a linearly independent set U∗ which still spans S. Then U∗ is a basis for S with fewer than m elements, contradicting the assumption that the dimension is m. �

4.2.18 Theorem
Suppose that S1, S2 are both subspaces of Rn and that S1 is a subset of S2. Then dim(S1) ≤ dim(S2). Moreover, dim(S1) = dim(S2) if and only if S1 = S2.

Proof: Assume that dim(S1) > dim(S2). Let B1 be a basis for S1. Then S1 = span(B1) ⊆ S2, and by Theorem 4.2.15(a), additional vectors can be added to B1 to obtain a basis for S2. But this is a contradiction to the assumption dim(S1) > dim(S2).
If dim(S1) = dim(S2), then any basis for S1 is a linearly independent set of dim(S2) vectors in S2, hence already a basis for S2 by Theorem 4.2.17, so S1 = S2. �


4.2.19 Theorem
Let U = {u1, . . . , um} be a set of vectors in a subspace S of dimension k.
(a) If m < k, then U does not span S.
(b) If m > k, then U is not linearly independent.

Proof: (a) Assume that U spans S. By Theorem 4.2.15(b), a basis of S can be obtained by removing vectors from U , so k ≤ m, a contradiction to m < k.
(b) Assume that U is linearly independent. By Theorem 4.2.15(a), we can add vectors to U to obtain a basis, so m ≤ k, a contradiction to m > k. �

4.2.20 Theorem (Big Theorem, Version 5)
Let A := {a1, a2, . . . , an} be a set of n vectors in Rn, let A = [a1 . . . an], and let T : Rn → Rn be given by T (x) = Ax. Then the following are equivalent:
(a) A spans Rn.
(b) A is linearly independent.
(c) Ax = b has a unique solution for any b ∈ Rn.
(d) T is surjective.
(e) T is injective.
(f) A is invertible.
(g) ker(T ) = {0}.
(h) null(A) = {0}.
(i) A is a basis for Rn.

Proof: We only need to prove the equivalence of (a)-(h) with (i). By Definition 4.2.1, (a) plus (b) are equivalent with (i). �

4.3 Row and Column Spaces

In the previous section we introduced two algorithms for obtaining a basis of a subspace. One involved the rows of a matrix, the other the columns. In this section, we will learn how the columns and the rows of a matrix are structurally connected.

4.3.1 Definition (Row Space, Column Space)
Let A be an (n,m)-matrix.
(a) The row space of A is the subspace of Rm spanned by the row vectors of A and is denoted by row(A).


(b) The column space of A is the subspace of Rn spanned by the column vectors of A and is denoted by col(A).

4.3.2 Example
Consider the matrix

[ 1 −1 2 1 ]
[ 0  1 0 1 ].

Then the row space is

span{[1, −1, 2, 1], [0, 1, 0, 1]}

and the column space is

span{ [1, 0]^t, [−1, 1]^t, [2, 0]^t, [1, 1]^t }.

Note that in general the column space and the row space are not equal.

The algorithms for providing a basis can now be reformulated as follows:

4.3.3 Theorem
Let A be a matrix and B an echelon form of A.
(a) The nonzero rows of B form a basis for row(A).
(b) The columns of A corresponding to the pivot columns of B form a basis for col(A). �

The following theorem sheds a slightly different light on Theorems 4.2.6, 4.2.9, and 4.3.3.

4.3.4 Theorem
For any matrix A, the dimension of the row space equals the dimension of the column space.

Proof: WLOG we may assume that A is already in echelon form. By Theorem 4.3.3(a), we know that the dimension of the row space of A is equal to the number of nonzero rows. But remembering the stair-like pattern of matrices in echelon form, we conclude that each nonzero row has exactly one pivot position, hence contributes one pivot column. Thus, the number of pivot columns is equal to the number of nonzero rows. By Theorem 4.3.3(b), the statement follows. �


4.3.5 Definition (Rank)
Let A be a matrix. Then the rank of A is the dimension of the row (or column) space. It is denoted by rank(A).

Let us have a closer look at the numbers we can now attach to any matrix.

4.3.6 Example
Consider

A = [ −1  4  3 0 ]
    [  2 −4 −4 2 ]
    [  2  8  2 8 ].

What is the nullity of A (that is, the dimension of null(A)), and what is the rank of A? For determining both numbers, we first need to find an equivalent matrix in echelon form. It is easy to see that A is equivalent to

[ 1 0 −1 2 ]
[ 0 2  1 1 ]
[ 0 0  0 0 ].

The latter matrix has two nonzero rows, so that rank(A) = 2. On the other hand, if we associate A with a homogeneous system, we see that there are two free variables, hence nullity(A) = 2.

It is not a coincidence that the nullity and the rank turned out to be 2 and 2, adding up to the number of columns of A. Here is why:

4.3.7 Theorem (Rank-Nullity Theorem)
Let A be an (n, m)-matrix. Then

rank(A) + nullity(A) = m.

Proof: Assume WLOG that A is in echelon form. We have already seen that the number of pivot columns is equal to the rank of A. On the other hand, each nonpivot column corresponds to a free variable. Hence the number of nonpivot columns equals the nullity of A. But the number of pivot columns plus the number of nonpivot columns is equal to the number of columns, which is m. This translates to 'the rank plus the nullity equals m'. □


4.3.8 Example
Let A be a (3, 15)-matrix.
- The maximum rank of A is 3, since the row space is spanned by only 3 rows.
- The minimum nullity of A is 12, by the Rank-Nullity Theorem.
In particular, the homomorphism that sends a vector u ∈ R15 to Au (∈ R3) cannot be injective, because the nullity of A is at the same time the dimension of the kernel of that homomorphism.

Let us apply the previous theorem to homomorphisms.

4.3.9 Theorem
Let T : Rm → Rn be a homomorphism. Then

dim(ker(T)) + dim(range(T)) = m.

Proof: Let A be the matrix associated to T. The nullity of A equals the dimension of the kernel of T. Moreover, the range of T equals the span of the columns of A, which is by definition col(A), so dim(range(T)) = rank(A). Hence, the statement follows from the Rank-Nullity Theorem. □
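As a quick illustration, the two dimensions can be computed directly; the sketch below assumes SymPy is available (nullspace() and columnspace() return bases of the kernel and the range of T(x) = Ax, respectively).

    from sympy import Matrix

    A = Matrix([[-1,  4,  3, 0],
                [ 2, -4, -4, 2],
                [ 2,  8,  2, 8]])

    dim_ker = len(A.nullspace())      # dimension of ker(T)
    dim_range = len(A.columnspace())  # dimension of range(T)

    assert dim_ker + dim_range == A.cols   # 2 + 2 == 4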

4.3.10 Theorem
Let A be an (n, m)-matrix and b be a vector in Rn.
(a) The system Ax = b is consistent if and only if b is in col(A).
(b) The system Ax = b has a unique solution if and only if b is in col(A) and the columns of A are linearly independent.

Proof: (a) This is true by Theorem 2.3.10 and the fact that any vector of the form Au is a linear combination of the columns of A, hence in col(A).
(b) By (a) we have at least one solution; by Theorem 2.4.10, we have at most one solution, hence a unique solution. □
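Part (a) also yields a practical consistency test: b lies in col(A) exactly when appending b to A as an extra column does not increase the rank. Here is a minimal sketch, assuming NumPy is available (the helper name is_consistent is our own).

    import numpy as np

    def is_consistent(A, b):
        """Return True if Ax = b has at least one solution, i.e. b is in col(A)."""
        augmented = np.column_stack([A, b])
        return np.linalg.matrix_rank(augmented) == np.linalg.matrix_rank(A)

    A = np.array([[1., -1.],
                  [0.,  1.],
                  [1.,  0.]])

    print(is_consistent(A, np.array([0., 1., 1.])))  # True: b = a1 + a2
    print(is_consistent(A, np.array([1., 0., 0.])))  # False: b is not in col(A)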

4.3.11 Theorem (Big Theorem, Version 6)
Let A := {a1, a2, . . . , an} be a set of n vectors in Rn, let A = [a1 . . . an], and let T : Rn → Rn be given by T(x) = Ax. Then the following are equivalent:
(a) A spans Rn.
(b) A is linearly independent.
(c) Ax = b has a unique solution for any b ∈ Rn.
(d) T is surjective.
(e) T is injective.
(f) A is invertible.
(g) ker(T) = {0}.
(h) null(A) = {0}.
(i) A is a basis for Rn.
(j) col(A) = Rn.
(k) row(A) = Rn.
(l) rank(A) = n.

Proof: With version 5 we already had the equivalences of (a)-(i). By Theorem 4.3.4 and Definition 4.3.5, statements (j), (k), and (l) are equivalent, since a subspace of Rn of dimension n must be all of Rn. Moreover, (a) and (j) are obviously equivalent, as col(A) is by definition the span of the columns of A. □


Chapter 5

Determinants

5.1 The Determinant Function

We have already attached numbers to a matrix, such as the rank and the nullity. In this chapter, we will attach yet another powerful number to a square matrix: the determinant. Before defining this new term, we need to introduce some technicalities.

5.1.1 Definition
Let A be an (n, n)-matrix. Then Mij denotes the (n − 1, n − 1)-matrix that we get from A by deleting the row and the column containing the entry aij.

5.1.2 Example
Determine M2,3 and M2,4 for the following matrix.

\begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \\ 13 & 14 & 15 & 16 \end{bmatrix}

The entry a2,3 is equal to 7. We therefore have

M2,3 = \begin{bmatrix} 1 & 2 & 4 \\ 9 & 10 & 12 \\ 13 & 14 & 16 \end{bmatrix}.


The entry a2,4 is equal to 8, hence

M2,4 = \begin{bmatrix} 1 & 2 & 3 \\ 9 & 10 & 11 \\ 13 & 14 & 15 \end{bmatrix}.
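Deleting a row and a column is easy to automate; the following small helper is a sketch assuming NumPy is available (the name minor_matrix is ours, and the indices i, j are 1-based, as in the text).

    import numpy as np

    def minor_matrix(A, i, j):
        """Delete row i and column j of A (1-based indices, as in Definition 5.1.1)."""
        return np.delete(np.delete(A, i - 1, axis=0), j - 1, axis=1)

    A = np.arange(1, 17).reshape(4, 4)  # the matrix with entries 1..16 from Example 5.1.2
    print(minor_matrix(A, 2, 3))        # [[ 1  2  4] [ 9 10 12] [13 14 16]]
    print(minor_matrix(A, 2, 4))        # [[ 1  2  3] [ 9 10 11] [13 14 15]]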

5.1.3 Definition (Determinant, Cofactor)
(a) Let A = [a11] be a (1, 1)-matrix. Then the determinant of A is given by

det(A) = |a11| := a11.

(b) Let

A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}

be a (2, 2)-matrix. Then the determinant of A is given by

det(A) = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} − a_{12}a_{21}.

(c) Let

A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}

be a (3, 3)-matrix. Then the determinant of A is given by

det(A) = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} − a_{13}a_{22}a_{31} − a_{23}a_{32}a_{11} − a_{33}a_{12}a_{21}.

(d) Let A be an (n, n)-matrix and let aij be its (i, j)th entry. Then the cofactor Cij of aij is given by

C_{ij} = (−1)^{i+j} det(M_{ij}).

The determinant det(Mij) is called the minor of aij.


(e) Let n be an integer and fix either some 1 ≤ i ≤ n or 1 ≤ j ≤ n. Let

A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{bmatrix}

be an (n, n)-matrix. Then the determinant of A is given by either
(i) det(A) = a_{i1}C_{i1} + a_{i2}C_{i2} + . . . + a_{in}C_{in}, where the cofactors belong to a_{i1}, a_{i2}, . . . , a_{in}, respectively ('expand across row i'),
or
(ii) det(A) = a_{1j}C_{1j} + a_{2j}C_{2j} + . . . + a_{nj}C_{nj}, where the cofactors belong to a_{1j}, a_{2j}, . . . , a_{nj}, respectively ('expand down column j').
These formulas are referred to collectively as the cofactor expansions.

5.1.4 Remark
Note that the definition of a general determinant is applied recursively, because the determinant of an (n, n)-matrix is reduced to determinants of (n − 1, n − 1)-matrices.
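The recursion translates almost literally into code. The following is a direct, unoptimized sketch of the cofactor expansion across the first row (Definition 5.1.3(e)(i)) in plain Python; it is fine for the small matrices of this chapter, but its running time grows like n!, so it is not how determinants are computed in practice.

    def det(A):
        """Determinant via cofactor expansion across the first row; A is a list of rows."""
        n = len(A)
        if n == 1:                               # Definition 5.1.3(a)
            return A[0][0]
        total = 0
        for j in range(n):
            # minor M_{1,j+1}: delete row 1 and column j+1 (1-based)
            M = [row[:j] + row[j+1:] for row in A[1:]]
            cofactor = (-1) ** j * det(M)        # (-1)^{1+(j+1)} = (-1)^j
            total += A[0][j] * cofactor
        return total

    print(det([[1, 2], [3, 4]]))                 # -2, matching Definition 5.1.3(b)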

5.1.5 Example
Find det(A) for

A = \begin{bmatrix} 3 & 0 & -2 & -1 \\ -1 & 2 & 5 & 4 \\ 6 & 0 & -5 & -3 \\ -6 & 0 & 4 & 2 \end{bmatrix}.

Let us first find the M1j's.

M_{1,1} = \begin{bmatrix} 2 & 5 & 4 \\ 0 & -5 & -3 \\ 0 & 4 & 2 \end{bmatrix}, M_{1,2} = \begin{bmatrix} -1 & 5 & 4 \\ 6 & -5 & -3 \\ -6 & 4 & 2 \end{bmatrix},

M_{1,3} = \begin{bmatrix} -1 & 2 & 4 \\ 6 & 0 & -3 \\ -6 & 0 & 2 \end{bmatrix}, M_{1,4} = \begin{bmatrix} -1 & 2 & 5 \\ 6 & 0 & -5 \\ -6 & 0 & 4 \end{bmatrix}.

We can now use the formula of Definition 5.1.3(c) to get:


det(M1,1) = 2 · (−5) · 2 + 5 · (−3) · 0 + 4 · 0 · 4 − 4 · (−5) · 0 − (−3) · 4 · 2 − 2 · 5 · 0 = −20 + 24 = 4

det(M1,2) = 10 + 90 + 96 − 120 − 12 − 60 = 4

det(M1,3) = 0 + 36 + 0 − 0 − 0 − 24 = 12

det(M1,4) = 0 + 60 + 0 − 0 − 0 − 48 = 12

Using the cofactor expansion across the first row (Definition 5.1.3(e)(i)), we get

det(A) = 3 · (−1)^{1+1} · 4 + 0 · (−1)^{1+2} · 4 + (−2) · (−1)^{1+3} · 12 + (−1) · (−1)^{1+4} · 12 = 12 − 24 + 12 = 0.
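This result is easy to cross-check numerically; the sketch below assumes NumPy is available (np.linalg.det uses a factorization rather than cofactor expansion, so it returns the value only up to floating-point rounding).

    import numpy as np

    A = np.array([[ 3, 0, -2, -1],
                  [-1, 2,  5,  4],
                  [ 6, 0, -5, -3],
                  [-6, 0,  4,  2]], dtype=float)

    print(np.linalg.det(A))   # 0.0 up to rounding, confirming det(A) = 0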

Now that we have understood the concept of calculating the determinant of a matrix, let us put the generalized version of such a calculation to work. The cofactor expansion comes in very handy if there are rows or columns with many zeros.

5.1.6 Theorem
For any integer n we have det(In) = 1.

Proof: We prove this by induction.
n = 1: We have I1 = [1], so that det(I1) = 1.
Let the statement be true for n − 1. Expanding across the first row, we get

det(In) = 1 · C1,1 + 0 · C1,2 + . . . + 0 · C1,n = C1,1 = (−1)^2 det(M1,1) = det(M1,1).

But M1,1 = In−1, and using the induction hypothesis, we get

det(In) = det(M1,1) = det(In−1) = 1. □

Why bother with introducing a complicated formula involving all the entries of a square matrix? Because the determinant features some really powerful tools and properties attached to matrices. Here are a few:

5.1.7 Theorem
Let A, B be (n, n)-matrices.
(a) A is invertible if and only if det(A) ≠ 0.
(b) det(A · B) = det(A) · det(B).
(c) If A is invertible, then

det(A^{−1}) = 1 / det(A).


(d) det(A^t) = det(A). □
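Properties (b)-(d) are easy to spot-check numerically. The following sketch assumes NumPy is available; a random Gaussian matrix is invertible with probability 1, and allclose() is used because floating-point determinants are only approximate.

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 4))   # almost surely invertible
    B = rng.standard_normal((4, 4))

    assert np.allclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))  # (b)
    assert np.allclose(np.linalg.det(np.linalg.inv(A)), 1 / np.linalg.det(A))      # (c)
    assert np.allclose(np.linalg.det(A.T), np.linalg.det(A))                       # (d)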

5.1.8 Example
Find the determinant of

A = \begin{bmatrix} -2 & 1 & 0 & 13 \\ 1 & 1 & 0 & 7 \\ -7 & 3 & 0 & 2 \\ 2 & 1 & 0 & 9 \end{bmatrix}.

We expand down the third column, which gives us

det(A) = 0 · C1,3 + 0 · C2,3 + 0 · C3,3 + 0 · C4,3 = 0,

hence we see that the matrix is not invertible.

What did we learn from the previous example?

5.1.9 Theorem
Let A be a square matrix.
(a) If A has a zero column or zero row, then det(A) = 0.
(b) If A has two identical rows or columns, then det(A) = 0.

Proof: (a) Expand across the zero row (or down the zero column); every term of the cofactor expansion is then zero.
(b) We use induction. The statement only makes sense for n ≥ 2.
Let A be a (2, 2)-matrix with two identical rows or columns. WLOG let A have two identical rows, so that

A = \begin{bmatrix} a & b \\ a & b \end{bmatrix}.

Hence, det(A) = ab − ab = 0.
Let the assertion be true for n − 1 and assume that A is an (n, n)-matrix with two identical rows (the case of two identical columns is analogous). Take a row that is different from those two identical ones and expand across this row. We will get a sum of determinants of (n − 1, n − 1)-matrices, each of which still has two identical rows. We thus have, by induction, a sum of zeros. □
There is one special case in which calculating the determinant is particularly easy.


5.1.10 Theorem
Let A be a triangular (n, n)-matrix. Then det(A) is the product of the terms along the diagonal.

Proof: We use induction. If n = 1, then the statement is trivial. Let the statement be true for n − 1 and assume WLOG that A is an upper triangular (n, n)-matrix. We expand down the first column, which gives

det(A) = a1,1 · C1,1.

C1,1 is the determinant of an upper triangular (n − 1, n − 1)-matrix whose diagonal entries are a2,2, . . . , an,n. By induction, the statement follows. □
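A quick numerical illustration, assuming NumPy is available: for a triangular matrix, the product of the diagonal entries agrees with the determinant.

    import numpy as np

    U = np.array([[2., 7., 1.],
                  [0., 3., 5.],
                  [0., 0., 4.]])   # upper triangular

    print(np.prod(np.diag(U)))     # 24.0 = 2 * 3 * 4
    print(np.linalg.det(U))        # 24.0 up to floating-point rounding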

5.1.11 Theorem (The Big Theorem, Version 7)
Let A := {a1, a2, . . . , an} be a set of n vectors in Rn, let A = [a1 . . . an], and let T : Rn → Rn be given by T(x) = Ax. Then the following are equivalent:
(a) A spans Rn.
(b) A is linearly independent.
(c) Ax = b has a unique solution for any b ∈ Rn.
(d) T is surjective.
(e) T is injective.
(f) A is invertible.
(g) ker(T) = {0}.
(h) null(A) = {0}.
(i) A is a basis for Rn.
(j) col(A) = Rn.
(k) row(A) = Rn.
(l) rank(A) = n.
(m) det(A) ≠ 0.

Proof: We already have the equivalences of (a)-(l) from version 6. The equivalence with (m) follows from Theorem 5.1.7(a). □