8/2/2019 A Em Theory
1/124
Contents
0 Solving Linear Equation Systems with the Gauss-Algorithm . . . 6

1 Linear Algebra and Vector Spaces . . . 1
1.1 Vector spaces . . . 1
1.1.1 Vector Spaces . . . 1
1.1.2 Linear Independence . . . 2
1.1.3 Dimension and Basis . . . 3
1.1.4 Scalar Product . . . 5
1.1.5 Orthonormal Systems . . . 6
1.1.6 Norms . . . 8
1.2 Matrices and Linear Maps . . . 9
1.2.1 Matrices . . . 9
1.2.2 Linear Maps . . . 12
1.2.3 Linear Equations . . . 14
1.2.4 Inverse Map and Inverse Matrix . . . 15
1.2.5 Changing the Basis . . . 17
1.2.6 Some Special Linear Maps in R2 . . . 18
1.2.7 Examples . . . 19
1.3 Operations with Matrices . . . 19
1.3.1 Matrix Algebra . . . 20
1.3.2 Scalar Product . . . 21
1.3.3 Homogeneous Coordinates . . . 21
1.3.4 Norms . . . 22
1.4 Gauss Algorithm and LU-Decomposition . . . 24
1.4.1 Numerical Stability . . . 24
1.4.2 Special Operations . . . 26
1.4.3 Properties of C(k, l; ·), D(k; ·) and F(k, l) . . . 27
1.4.4 Standard Algorithm . . . 27
1.4.5 LU-Decomposition . . . 28
1.4.6 Example . . . 31
1.4.7 Summary of LU-Decomposition . . . 33
1.4.8 Example of LU-Decomposition . . . 34
1.4.9 Solving a Linear Equation System . . . 36
1.4.10 Short Form . . . 37
1.4.11 Example . . . 37
1.5 Eigenvalues and Eigenvectors . . . 40
1.5.1 Definition and Properties . . . 40
1.5.2 More Properties . . . 41
1.5.3 Lemma . . . 41
1.5.4 Theorem: Schur Form . . . 41
1.5.5 Consequences . . . 42
1.5.6 Jordan Form . . . 42
1.5.7 Example . . . 45
1.6 Special Properties of Symmetric Matrices . . . 51
1.6.1 Properties of Symmetric and Hermitian Matrices . . . 52
1.6.2 Orthogonal Matrices . . . 52
1.7 Singular Value Decomposition (SVD) . . . 53
1.7.1 Preparations . . . 53
1.7.2 Existence and Construction of the SVD . . . 54
1.8 Generalized Inverses . . . 55
1.8.1 Special Case: A Injective . . . 57
1.9 Applications to Linear Equation Systems . . . 58
1.9.1 Errors . . . 58
1.9.2 Numerical Rank Deficiency . . . 59
1.9.3 Application: Best Fit Functions . . . 61
1.10 Symmetric Matrices and Quadratic Forms . . . 64
1.11 QR-Decomposition . . . 68
1.12 Numerics of Eigenvalues . . . 71

2 Ordinary Differential Equations . . . 2
2.1 General Definitions . . . 2
2.2 Linear Differential Equations with Constant Coefficients . . . 4
2.2.1 Inhomogeneous Equations . . . 7
2.3 Linear Differential Equations of Higher Order . . . 8
2.3.1 General Case . . . 8
2.3.2 ODEs with Constant Coefficients . . . 10
2.3.3 Special Inhomogeneities . . . 11

3 Calculus in Several Variables . . . 3
3.1 Differential Calculus in Rn . . . 3
3.1.1 Definitions . . . 3
3.1.2 Examples and Properties of Open and Closed Sets . . . 4
3.1.3 Main Rule for Vector-Valued Functions . . . 4
3.1.4 Definition: Limits and Continuous Functions . . . 5
3.1.5 Definition: Partial Derivatives . . . 5
3.1.6 Theorem of H. A. Schwarz . . . 6
3.1.7 Definition: Derivative of f . . . 6
3.1.8 Higher Derivatives . . . 7
3.1.9 Examples . . . 7
3.1.10 Directional Derivative, Gâteaux Derivative . . . 7
3.1.11 Rules . . . 8
3.2 Inverse and Implicit Functions . . . 8
3.2.1 Inverse Function Theorem . . . 8
3.2.2 Application: Newton's Method . . . 9
3.2.3 Implicit Function Theorem . . . 9
3.3 Taylor Expansions . . . 10
3.3.1 Nabla Operator . . . 10
3.3.2 Construction of Taylor Expansions . . . 10
3.3.3 Taylor's Theorem . . . 11
3.3.4 Calculation in the Two-Dimensional Case . . . 12
3.4 Extreme Values . . . 14
3.4.1 Definition . . . 14
3.4.2 Necessary Criterion . . . 15
3.4.3 Sufficient Criterion . . . 15
3.4.4 Saddle Points . . . 1

4 Integral Transforms . . . 2
4.1 Laplace Transform . . . 2
4.1.1 Method of Calculation . . . 2
4.1.2 Convolution . . . 4
4.1.3 Some Important Examples . . . 4
4.1.4 Solution of Initial Value Problems . . . 5
4.2 Fourier Series . . . 6
4.2.1 Theorem . . . 6
4.2.2 Definition . . . 6
4.2.3 Theorem . . . 7
4.2.4 Properties of the Coefficients . . . 7
4.2.5 Real Form of the Fourier Series . . . 7
4.3 Fourier Transform . . . 8
4.3.1 Definition . . . 8
4.3.2 Inverse Transform . . . 8
4.3.3 Convolution . . . 9
4.3.4 Rules . . . 9
4.3.5 Sine and Cosine Transform . . . 10
4.3.6 More Properties . . . 10
4.3.7 Calculation of the Fourier Transform . . . 11
4.3.8 Gauss Functions . . . 11
4.3.9 Consequences . . . 12
4.3.10 Definition: Dirac Sequence . . . 1
4.3.11 Main Property of Dirac Sequences . . . 1
4.3.12 Delta Distribution . . . 1

5 Stability of Ordinary Differential Equations . . . 2
5.1 Remarks . . . 2
5.2 Definition . . . 3
5.3 Flow-Box Theorem . . . 3
5.4 Remarks . . . 3
5.5 Theorem: Linear Case . . . 4
5.6 Linearisation . . . 4
5.7 Poincaré-Ljapunov Theorem . . . 4
5.8 Example . . . 4
5.9 Ljapunov Functions . . . 1
5.9.1 Definition . . . 1
5.9.2 Theorem . . . 1
0 Solving Linear Equation Systems with the Gauss-Algorithm
A linear equation system with m equations and n unknowns is given by

a_11 x_1 + a_12 x_2 + ... + a_1n x_n = b_1
  ...
a_m1 x_1 + a_m2 x_2 + ... + a_mn x_n = b_m

Omitting the plus signs and the variables, this will be written down in the short form

a_11 a_12 ... a_1n | b_1
 ...  ...      ... | ...
a_m1 a_m2 ... a_mn | b_m
In the case of a homogeneous equation system (all b_j equal to zero) the last column is omitted, too.

Allowed operations are:
- multiply a row by a number unequal to zero
- exchange two rows
- add a multiple of a row to another row

The exchange of columns is only allowed if there is a row 0 added that contains the names of the variables.
Naturally the last column containing the bj-values must not be
exchanged with other columns.
The simplest form of the Gauss algorithm is to perform these steps:

Step 1: Try to get a 1 into the upper left corner. If this is not possible, the algorithm stops.

Step 2: By adding suitable multiples of the first row to the rows below (and above), generate zeroes in the rest of the column.

Step 3: Repeat the process in the subscheme without the first row and column.
In the end (possibly after exchanging rows and columns) one has

x_j1  x_j2 ... x_jk  x_j,k+1 ... x_jn |
  1    0   ...  0      *     ...  *   | c_1
  0    1   ...  0      *     ...  *   | c_2
 ...                                  | ...
  0    0   ...  1      *     ...  *   | c_k
  0    0   ...  0      0     ...  0   | c_k+1
 ...                                  | ...
  0    0   ...  0      0     ...  0   | c_m

(the entries marked * can be arbitrary numbers)
The first row contains the names of the variables.
The number k is called the rank of the equation system. The following cases are possible:
(i) At least one of the values c_k+1, ..., c_m is unequal to zero. Then the system is not solvable.
(ii) If k = n = m, then the system is uniquely solvable with x_j1 = c_1, ..., x_jn = c_n.
(iii) If k < n and c_k+1 = ... = c_m = 0, then we can take the last n - k variables x_j,k+1 to x_jn as parameters in the solution. With this the values of x_j1 to x_jk are uniquely determined for each choice of the parameters.
Example
2x1 +6x2 +2x4 = 10
x1 +3x2 +x3 +2x4 = 7
3x1 +9x2 +4x3 = 16
3x1 +9x2 +x3 +4x4 = 17
or

x1 x2 x3 x4
 2  6  0  2 | 10
 1  3  1  2 |  7
 3  9  4  0 | 16
 3  9  1  4 | 17
Step 1: Exchange rows 1 and 2.

x1 x2 x3 x4
 1  3  1  2 |  7
 2  6  0  2 | 10
 3  9  4  0 | 16
 3  9  1  4 | 17
Step 2: Add row 1 multiplied by (-2) to row 2, multiplied by (-3) to row 3 and multiplied by (-3) to row 4. This results in

x1 x2 x3 x4
 1  3  1  2 |  7
 0  0 -2 -2 | -4
 0  0  1 -6 | -5
 0  0 -2 -2 | -4
Step 3: Now swap columns 2 and 4.

x1 x4 x3 x2
 1  2  1  3 |  7
 0 -2 -2  0 | -4
 0 -6  1  0 | -5
 0 -2 -2  0 | -4
Step 4: Add row 2 to row 1, row 2 multiplied by (-3) to row 3 and multiplied by (-1) to row 4. Then divide row 2 by (-2).
x1 x4 x3 x2
 1  0 -1  3 |  3
 0  1  1  0 |  2
 0  0  7  0 |  7
 0  0  0  0 |  0
Step 5: Leave row 4 away, divide row 3 by 7, add row 3 to row 1 and subtract it from row 2. Then we reach the final form

x1 x4 x3 x2
 1  0  0  3 |  4
 0  1  0  0 |  1
 0  0  1  0 |  1
Step 6: The system is solvable. The variables whose columns do not belong to the identity matrix are parameters; here this applies to x2. With x2 = t one sees x1 = 4 - 3t, x4 = 1 and x3 = 1. So we can write the general solution as follows:

[x1]   [4 - 3t]   [4]       [-3]
[x2] = [  t   ] = [0] + t * [ 1]
[x3]   [  1   ]   [1]       [ 0]
[x4]   [  1   ]   [1]       [ 0]
1 Linear Algebra and Vector Spaces
1.1 Vector spaces
1.1.1 Vector Spaces
1.1.1.1 Definition
A real vector space (short: VS) is a set in which two operations, addition and scalar multiplication, are defined and in which the following rules hold; u, v and w are elements of the vector space, α and β are real numbers:
(i) u + v = v + u, u + (v + w) = (u + v) + w
(ii) There is a zero vector 0 with v + 0 = v.
(iii) For each v there is a vector -v with v + (-v) = 0.
(iv) (α + β)v = αv + βv, (αβ)v = α(βv), α(v + w) = αv + αw, 1·v = v
If one admits complex scalars, one gets a complex vector space instead of a real VS.
The elements of the vector space are called vectors. The elements of the field R or C are called scalars. Often there is no difference whether one has R or C as field; in this case we use K as a symbol.
1.1.1.2 Definition: Subspace
Let V be a VS and U ⊆ V. U is called a subspace of V if U is itself a VS with the operations induced by V. This is fulfilled iff (short for if and only if) U contains for each pair of elements x and y the sum x + y and all vectors of the form αx with α ∈ K. This property is called closedness under sums and scalar multiples.
V always has the trivial subspaces V and {0}.
1.1.2 Linear Independence
1.1.2.1 Definition
Let v1 to vk be vectors and α_1, ..., α_k ∈ K. The expression α_1 v1 + ... + α_k vk is called a linear combination. The numbers α_j are called coefficients. Please notice that a linear combination is always a finite sum, even in infinite-dimensional spaces.
The vectors v1 to vk are linearly dependent (l.d.) if there are coefficients α_1 to α_k with α_1 v1 + ... + α_k vk = 0 and not all of the α_i are zero. If this is not the case, the vectors are called linearly independent (l.i.).
Therefore, if v1 to vk are linearly independent and α_1 v1 + ... + α_k vk = 0, it follows that α_1 = α_2 = ... = α_k = 0.
On the other hand, if v1 to vk are l.d., then it is possible to write α_1 v1 + ... + α_k vk = 0 with at least one of the α_j ≠ 0, say α_1 ≠ 0. Then one has

v1 = -(1/α_1)(α_2 v2 + ... + α_k vk),

so one of the vectors is a linear combination of the others.
1.1.2.2 Criteria for Linear Dependence
- A single vector is linearly dependent iff it is the zero vector.
- Two vectors u and v are linearly dependent iff they lie on a straight line through zero, or iff one of them is a multiple of the other.
- Three vectors u, v and w are linearly dependent iff they lie in a plane through zero, or iff one of them is a linear combination of the others. In R^3 there is a criterion using the volume of the parallelepiped spanned by these vectors:
  v1, v2, v3 l.d.  ⇔  vol(v1, v2, v3) = det(v1, v2, v3) = 0
- k vectors v1 to vk are linearly dependent iff the rank of the matrix with the columns v1 to vk is less than k (rank will be explained later).
- More than n vectors in K^n are always linearly dependent.
- Criterion for n vectors in K^n: v1 to vn are linearly dependent ⇔ det(v1, ..., vn) = 0.
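The determinant criterion is easy to try out in R^3. A small Python sketch (the helper `det3`, expanding the determinant along the first row, is ours):

```python
# Three vectors in R^3 are linearly dependent iff det(v1, v2, v3) = 0.

def det3(u, v, w):
    """Determinant of the 3x3 matrix with columns u, v, w
    (expansion along the first row)."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - v[0] * (u[1] * w[2] - u[2] * w[1])
            + w[0] * (u[1] * v[2] - u[2] * v[1]))

u, v = [1, 0, 2], [0, 1, 1]
w_dep = [2, 3, 7]          # w_dep = 2u + 3v, hence linearly dependent
w_ind = [0, 0, 1]

assert det3(u, v, w_dep) == 0
assert det3(u, v, w_ind) != 0
```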
1.1.3 Dimension and Basis
1.1.3.1 Definition: Span, Dimension and Basis
Let V be a vector space.
(i) Let M ⊆ V be a (finite or infinite) non-empty subset of V. The set of all linear combinations of elements of M is called the span of M,
span M = { Σ_{k=1}^m α_k v_k | α_k ∈ K, v_k ∈ M }.
The span is always a subspace.
(ii) If there is a system M of n vectors in V such that V is the span of M, and there is no such system consisting of less than n vectors, then V has the dimension n.
If there is no finite set M with span M = V, then V is said to be infinite-dimensional.
(iii) A set M = {v1, v2, ..., vn} ⊆ V is called a basis of V iff every vector v ∈ V has a unique representation v = Σ_{k=1}^n α_k v_k.
1.1.3.2 Remarks
(i) If V has dimension n, then every basis consists of n elements.
(ii) If V has dimension n, then every linearly independent set of n
vectors forms a basis.
(iii) The elements of a basis are always linearly independent.
1.1.3.3 Coordinates
Let M = {v1, v2, ..., vn} ⊆ V be a basis of V. For each v ∈ V there is a unique representation v = Σ_{k=1}^n α_k v_k. The numbers (α_1, ..., α_n) are called the coordinates of v with respect to M. The vector (α_1, ..., α_n)^T (always a column!) is called the coordinate vector of v.
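Computing coordinates means solving the linear system Ua = x for the α_k. For R^2 this can be written in closed form; a sketch (the function name `coords_2d` is ours; the formulas are an instance of Cramer's rule, which appears again later):

```python
# Coordinates of x with respect to a basis {v1, v2} of R^2: solve
# a1*v1 + a2*v2 = x by Cramer's rule.

def coords_2d(v1, v2, x):
    """Coordinate vector (a1, a2) of x with respect to the basis {v1, v2}."""
    det = v1[0] * v2[1] - v2[0] * v1[1]     # must be nonzero for a basis
    a1 = (x[0] * v2[1] - v2[0] * x[1]) / det
    a2 = (v1[0] * x[1] - x[0] * v1[1]) / det
    return a1, a2

v1, v2 = [1, 1], [1, -1]
a1, a2 = coords_2d(v1, v2, [3, 1])
assert (a1, a2) == (2.0, 1.0)               # since 3 = 2 + 1 and 1 = 2 - 1
assert [a1 * v1[i] + a2 * v2[i] for i in range(2)] == [3, 1]
```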
1.1.4 Scalar Product
1.1.4.1 Complex scalar product
Let V be a complex vector space. A scalar product is a mapping V × V → C, (v, w) ↦ <v, w>, with the properties
(i) <αu + βv, w> = α<u, w> + β<v, w> for α, β ∈ C and u, v, w ∈ V (linearity in the first argument)
(ii) <u, αv + βw> = conj(α)<u, v> + conj(β)<u, w> for α, β ∈ C and u, v, w ∈ V (anti-linearity in the second argument)
(iii) <u, v> = conj(<v, u>)
(iv) <u, u> ≥ 0 and <u, u> = 0 ⇔ u = 0 (positive definiteness)
Especially the scalar product of a vector with itself is always real and non-negative.
1.1.4.2 Real scalar product
If V is a real vector space, the same properties shall hold with a real-valued scalar product, α, β ∈ R and (naturally) without complex conjugation.
1.1.4.3 Standard scalar product
The standard real resp. complex scalar product of two vectors in K^n is defined by

v · w = <v, w> := Σ_{k=1}^n v_k w_k          for v, w ∈ R^n
v · w = <v, w> := Σ_{k=1}^n v_k conj(w_k)    for v, w ∈ C^n
In this case we define
(i) ||u|| := sqrt(<u, u>) is the length or (euclidean) norm of the vector u (also denoted by |u|).
(ii) The angle φ ∈ [0, π] between two non-zero vectors u, v ∈ R^n is defined by
cos φ = <u, v> / (||u|| ||v||).
1.1.5 Orthonormal Systems
With the Kronecker symbol δ_ij = 1 for i = j and δ_ij = 0 for i ≠ j, we define

1.1.5.1 Definition
(i) Two vectors having scalar product zero are called orthogonal or perpendicular.
(ii) A set of vectors {v_i} with <v_i, v_j> = δ_ij is called an orthonormal system (ONS). A basis that is an ONS is called an orthonormal basis (ONB).
1.1.5.2 Lemma
ONS are linearly independent.
The importance of an ONB lies in the following theorem, which allows an expansion of a given vector in the basis with the aid of scalar products:
1.1.5.3 Expansion Theorem
Let v1, ..., vk be an ONS and V the span of these vectors.
(i) If u ∈ V, then the following holds:
u = <u, v1> v1 + <u, v2> v2 + ... + <u, vk> vk = Σ_{j=1}^k <u, vj> vj
(ii) For V ⊆ U and u ∈ U there exists a decomposition u = u1 + u2 with u1 ∈ V and <u1, u2> = 0. u1 is called the orthogonal projection of u, and the map u ↦ u1 is the orthogonal projection onto V.
1.1.5.4 Gram-Schmidt Orthonormalisation Process
Let u1, ..., uk be a set of vectors in which at least one non-zero vector exists.

Step 1: Choose u1 ≠ 0, let v1 = u1 and set w1 = (1/||v1||) v1.

Step 2: If w1 to w_{j-1} are already constructed, let
v_j = u_j - <u_j, w1> w1 - ... - <u_j, w_{j-1}> w_{j-1} = u_j - Σ_{i=1}^{j-1} <u_j, w_i> w_i.
Then span{u1, ..., u_j} = span{v1, ..., v_j} and
<v_j, v_1> = ... = <v_j, v_{j-1}> = <v_j, u_1> = ... = <v_j, u_{j-1}> = 0.
In manual computations it is often easier to use the v_i instead of the w_i:
v_j = u_j - (<u_j, v_1> / <v_1, v_1>) v_1 - ... - (<u_j, v_{j-1}> / <v_{j-1}, v_{j-1}>) v_{j-1}
    = u_j - Σ_{i=1}^{j-1} (<u_j, v_i> / ||v_i||^2) v_i.
As the v_j will be normed later, it is allowed to substitute v_j by a multiple. With this technique one can sometimes avoid the use of fractions.

Step 3: If v_j ≠ 0, let w_j = (1/||v_j||) v_j and go on with Step 2. If one is calculating with the v_j instead of the w_j, this step can be carried out at the end.
If v_j = 0, then u_j was linearly dependent on u_1 to u_{j-1}. In this case u_j is deleted from the starting set of vectors and the algorithm goes on with the next vector. If the u_i are linearly independent, this case cannot occur.
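The steps above can be sketched in Python; the function below follows the w_i formulation with immediate norming (the function name `gram_schmidt` is ours, and `eps` is a tolerance replacing the exact test v_j = 0 in floating-point arithmetic):

```python
import math

def gram_schmidt(vectors, eps=1e-12):
    """Gram-Schmidt process: returns an ONS spanning the same space;
    vectors that turn out linearly dependent are dropped."""
    ons = []
    for u in vectors:
        # subtract the orthogonal projections onto the already built w_i
        v = list(u)
        for w in ons:
            c = sum(a * b for a, b in zip(u, w))        # <u, w>
            v = [vi - c * wi for vi, wi in zip(v, w)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > eps:                                  # v_j = 0 => u_j dependent
            ons.append([x / norm for x in v])
    return ons

w = gram_schmidt([[1, 1, 0], [1, 0, 1], [2, 1, 1]])     # third = first + second
assert len(w) == 2                                      # dependent vector dropped
assert abs(sum(a * b for a, b in zip(w[0], w[1]))) < 1e-12   # orthogonal
assert abs(sum(a * a for a in w[0]) - 1) < 1e-12             # normed
```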
1.1.6 Norms
1.1.6.1 Definition
A norm on a vector space V is a function ||·|| : V → R, x ↦ ||x||, with the following properties:
(i) ||x|| ≥ 0 and ||x|| = 0 ⇔ x = 0 (definiteness)
(ii) ||λx|| = |λ| ||x|| (homogeneity)
(iii) ||x + y|| ≤ ||x|| + ||y|| (triangle inequality)
1.1.6.2 Examples
(i) The euclidean norm on K^n: ||x||_2 = |x| = sqrt(<x, x>)
(ii) The 1-norm: ||x||_1 = |x_1| + |x_2| + ... + |x_n|
(iii) The ∞-norm: ||x||_∞ = max{ |x_1|, |x_2|, ..., |x_n| }
(iv) On C([a, b]) we define ||f||_2 := ( ∫_a^b |f(x)|^2 dx )^{1/2}

Remark: In (i)-(iii) we have ||e_k|| = 1.
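The three vector norms from (i)-(iii) in a short Python sketch (helper names ours):

```python
import math

def norm_1(x):
    return sum(abs(t) for t in x)

def norm_2(x):
    return math.sqrt(sum(t * t for t in x))

def norm_inf(x):
    return max(abs(t) for t in x)

x = [3, -4, 0]
assert norm_1(x) == 7
assert norm_2(x) == 5.0
assert norm_inf(x) == 4
# for every x the chain ||x||_inf <= ||x||_2 <= ||x||_1 holds
assert norm_inf(x) <= norm_2(x) <= norm_1(x)
```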
1.1.6.3 Lemma: Cauchy-Schwarz and Minkowski inequalities
Let <·, ·> be a real or complex scalar product, i.e. <·, ·> is linear in the first argument and <u, v> = conj(<v, u>) with <u, u> = 0 ⇔ u = 0.
(i) |<u, v>| ≤ <u, u>^{1/2} <v, v>^{1/2}
(ii) ||u|| := <u, u>^{1/2} is a norm; especially ||u + v|| ≤ ||u|| + ||v||.
(i) is called the Cauchy-Schwarz inequality, (ii) is the Minkowski inequality.
1.1.6.4 Comparison of norms
It is easy to see that ||x||_∞ ≤ ||x||_2 ≤ ||x||_1 ≤ n ||x||_∞ holds. Therefore one can define: a sequence x_k approaches zero if the real sequence ||x_k|| has the limit zero, and the choice of the norm doesn't make a difference. Naturally x_k → x ⇔ (x_k - x) → 0 ⇔ ||x_k - x|| → 0.
1.2 Matrices and Linear Maps
1.2.1 Matrices
1.2.1.1 Definition
In most cases it is sufficient to regard a matrix as a rectangular scheme consisting of column-vectors:

              [ a_11 a_12 ... a_1n ]   [ |    |        |  ]
A = (a_ij)  = [ a_21 a_22 ... a_2n ] = [ a_1  a_2 ...  a_n ]
    i=1..m    [ ...                ]   [ |    |        |  ]
    j=1..n    [ a_m1 a_m2 ... a_mn ]
A matrix with an equal number of rows and columns is called a square matrix.
1.2.1.2 Special types of square matrices
[ 1 0 0 ... 0 ]
[ 0 1 0 ... 0 ]
[ 0 0 1 ... 0 ]   identity matrix, written E_n or I_n (or E or I)
[ ...         ]
[ 0 0 0 ... 1 ]

[ d1 0  0  ... 0  ]
[ 0  d2 0  ... 0  ]
[ 0  0  d3 ... 0  ]   diagonal matrix
[ ...             ]
[ 0  0  0  ... dn ]

[ * 0 ... 0 ]
[ * * ... 0 ]   lower triangular matrix
[ ...       ]
[ * * ... * ]

[ * * ... * ]
[ 0 * ... * ]   upper triangular matrix
[ ...       ]
[ 0 0 ... * ]
Two matrices of the same size can be added by adding all entries. A matrix is multiplied by a scalar λ by multiplying each entry by λ.

    [ a_11 a_12 ... ]        [ b_11 b_12 ... ]
A = [ a_21 a_22 ... ],   B = [ b_21 b_22 ... ]
    [ ...           ]        [ ...           ]

     [ λa_11 λa_12 ... ]            [ a_11+b_11 a_12+b_12 ... ]
λA = [ λa_21 λa_22 ... ],   A + B = [ a_21+b_21 a_22+b_22 ... ]
     [ ...             ]            [ ...                     ]
1.2.1.3 Multiplication of Matrices and Vectors
Let A be a matrix with k columns and b an element of K^k. The product of the matrix A and the vector b = (b_1, ..., b_k)^T is the linear combination of the column-vectors of A with the coefficients b_1 to b_k:

[ |    |        |  ] [ b_1 ]
[ a_1  a_2 ...  a_k ] [ ... ] = b_1 a_1 + b_2 a_2 + ... + b_k a_k
[ |    |        |  ] [ b_k ]

The matrix A is multiplied with the matrix B by decomposing B into column-vectors and forming the corresponding matrix-vector products. These products are written down in order:

A [ b_1 ... b_k ] = [ A b_1 ... A b_k ]
So the matrix product C = AB is calculated in concrete situations (Falk scheme: the entry c_ij lies at the crossing of the i-th row of A and the j-th column of B):

c_ij = a_i1 b_1j + a_i2 b_2j + ... + a_in b_nj = Σ_{k=1}^n a_ik b_kj
On the other hand, if you define matrices as an (ordered) collection of
row-vectors, the product bA consists of linear combinations of the rows
of A with coefficients in b. Observe the order of multiplication!
This leads to:
The product AB of the matrices A and B is a matrix in which
- the k-th column is a linear combination of the columns of A with coefficients in the k-th column of B,
- the k-th row is a linear combination of the rows of B with coefficients in the k-th row of A.
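Both descriptions can be turned into code directly. The sketch below builds A b as a linear combination of the columns of A and assembles the product A B column by column (helper names ours):

```python
# Matrix-vector product as a linear combination of columns, and the
# matrix product assembled column by column.

def matvec(A, b):
    """A given as a list of rows; computes b_1*a_1 + ... + b_k*a_k."""
    m, k = len(A), len(A[0])
    result = [0] * m
    for j in range(k):                    # j-th column of A ...
        for i in range(m):
            result[i] += b[j] * A[i][j]   # ... scaled by the coefficient b_j
    return result

def matmul(A, B):
    """The k-th column of A*B is A applied to the k-th column of B."""
    cols = [matvec(A, [row[j] for row in B]) for j in range(len(B[0]))]
    return [[col[i] for col in cols] for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
assert matmul(A, B) == [[19, 22], [43, 50]]
```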
1.2.2 Linear Maps
1.2.2.1 Definition: Linear map
Let U and V be vector spaces. A map L : U → V is called linear if for all x, y ∈ U and α, β ∈ K the following equation holds:

L(αx + βy) = αL(x) + βL(y)
If u1, ..., un is a basis of U, then L is completely determined by its action on the basis:

Lu = L( Σ_{i=1}^n α_i u_i ) = Σ_{i=1}^n α_i L(u_i)
Suppose that V has a basis v1, ..., vm. Then each L u_i has a representation L u_i = Σ_{j=1}^m a_ji v_j. The matrix A = (a_ji), j = 1..m, i = 1..n, is called the matrix associated to the linear map L. Note that this matrix depends not only on L itself, but also on the choice of the bases in U and V.
Summarizing this for the special case U = K^n and V = K^m with the standard bases we have: the matrix of the linear map L : U → V has in the k-th column the image of e_k.
On the other hand every matrix with n rows and m columns defines a
linear map Km Kn through L(x) := Ax.
1.2.2.2 Definition: Rank
The rank of a matrix is the rank of the corresponding homogeneous
equation system defined in chapter 0.
1.2.2.3 Rank theorem
Let A be a matrix. Then the maximum number of linearly independent columns is equal to the maximum number of linearly independent rows.
A matrix whose rank equals the minimum of its numbers of rows and columns is said to have full rank.
1.2.2.4 Definition: Multilinear Maps
(i) Let U1, ..., Un and V be vector spaces. A map
L : U1 × U2 × ... × Un → V, (u1, u2, ..., un) ↦ L(u1, ..., un) ∈ V
is called multilinear if L is linear in each component, i.e. L is linear in each u_j if one fixes all other u_k.
(ii) Most important case: U1 = ... = Un.
For n = 2 we have bilinear maps. They are called symmetric if L(u, v) = L(v, u) and hermitian if L(u, v) = conj(L(v, u)).
A multilinear map with the property
L(..., u_j, ..., u_k, ...) = -L(..., u_k, ..., u_j, ...)
is called alternating.
(iii) Properties of alternating maps:
(1) L(..., u, ..., u, ...) = 0
(2) If one of the vectors is a linear combination of the others, then L(...) = 0.
(3) For U1 = ... = Un = K^n and V = K there is exactly one alternating L with L(e1, ..., en) = 1.
In this case we have L(u1, ..., un) ≠ 0 ⇔ u1, ..., un linearly independent.
This L is called the determinant, L(u1, ..., un) = det(u1, ..., un), and is the well-known determinant with the usual properties.
(iv) Application: Cramer's Rule
Let a1, ..., an ∈ K^n be a basis, A = [a1, ..., an] an n×n matrix and b ∈ K^n. Then the equation system Ax = b is uniquely solvable with
x_j = det A_j / det A,
where A_j is A with the column a_j replaced by b.
1.2.3 Linear Equations
1.2.3.1 Some Definitions
A linear map L : U → K is called a linear functional.
Let L : U → V be linear. Then L is called
- epimorphism, if L is surjective
- isomorphism, if L is bijective (one-to-one and onto)
- endomorphism, if U = V
- automorphism, if U = V and L is bijective.
The rank of L is the dimension of the range of L in V. As the range is spanned by the column-vectors of the matrix representation, the rank of L is the rank of the corresponding matrix.
1.2.3.2 Definition: Linear equation
Let L : U → V be a linear map and b ∈ V. An equation Lx = b is called a linear equation. For b = 0 the equation is called homogeneous,
otherwise inhomogeneous. The set of all solutions of the homogeneous
equation is called the kernel of L, written ker L.
From now on we assume that L is represented by the matrix A.
1.2.3.3 Immediate Properties
(i) The kernel is a subspace of U.
(ii) For the homogeneous equation the dimension formula holds:
dim ker L = dim U - rank L
That means that one can choose freely n - k parameters in the solution of the equation Lx = 0 (with n = dim U and k = rank L).
(iii) The general solution of the inhomogeneous equation is achieved by adding one particular solution to all solutions of the homogeneous equation.
(iv) The inhomogeneous equation is solvable iff the rank of A is equal to the rank of the extended matrix (A|b).
(v) For square n×n matrices A the following statements are equivalent:
- The inhomogeneous equation is solvable for each right side b
- The homogeneous equation is uniquely solvable
- det A ≠ 0
- A has rank n
- ker A = {0}
- A^-1 exists (A^-1 is defined below).
1.2.4 Inverse map and Inverse Matrix
Let Lx = b be a linear equation that is uniquely solvable for all b. Then the map b ↦ x is well defined, and this map is called L^-1, the inverse map of L.
1.2.4.1 Consequences
(i) L^-1 is a linear map from V to U.
(ii) In the finite-dimensional case the matrix associated to L must be square.
Let A be an n×n square matrix with rank n. Then each equation system Ax = b is uniquely solvable. The matrix B = [v1, ..., vn] containing the solutions A v_j = e_j is called the inverse of A, A^-1 = B.
A is called regular or invertible.
1.2.4.2 Properties
A^-1 A = A A^-1 = E
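For a 2×2 matrix the solutions of A v_j = e_j can be written down in closed form, which gives the familiar inverse formula; a sketch checking A A^-1 = A^-1 A = E (helper names ours):

```python
# The inverse of A collects the solutions of A v_j = e_j as columns;
# for 2x2 matrices this yields the closed formula below.

def inverse_2x2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    assert det != 0, "A is not invertible"
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2, 1], [5, 3]]
B = inverse_2x2(A)
assert matmul2(A, B) == [[1.0, 0.0], [0.0, 1.0]]   # A A^-1 = E
assert matmul2(B, A) == [[1.0, 0.0], [0.0, 1.0]]   # A^-1 A = E
```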
From now on we restrict ourselves to the case that the linear map is defined between R^n and R^m or between C^n and C^m.
1.2.4.3 Correspondences between Linear Maps and Matrices
Linear map L                    Matrix A
Application to a vector L(x)    Matrix-vector multiplication Ax
Identity map I(x) = x           Identity matrix E with Ex = x
Zero map O(x) = 0               Zero matrix 0 with 0x = 0
Composition L1 ∘ L2             Matrix multiplication A1 A2
Inverse map L^-1                Inverse matrix A^-1
1.2.5 Changing the Basis
At the beginning of the section it was mentioned that the matrix of a given map L : K^n → K^m contains in its columns the coordinates of the images of the basis of K^n with respect to the basis of K^m. Now we can ask how the matrix changes when we choose other bases in K^n or K^m.
1.2.5.1 Coordinates with Respect to a Basis
Let u1, ..., un be a basis of K^n. Then the matrix U = (u1 ... un) is invertible. To gain the coordinates a of a point x with respect to u1, ..., un we write
x = Ua  ⇔  a = U^-1 x.
If v1, ..., vn is another basis of K^n, we have with V = (v1 ... vn)
x = Ua = Vb  ⇔  b = V^-1 U a  ⇔  a = U^-1 V b
1.2.5.2 Matrix and Change of Coordinates
This uses the same method as in the paragraph above: x ∈ K^n has the representations x = Ua = Vb, and y ∈ K^m has the representations y = Wc = Zd.
Let A be the matrix of L with respect to the bases U and W. Using the last paragraph we have
L(x) = y  ⇔  Aa = c  ⇔  A U^-1 V b = W^-1 Z d  ⇔  Z^-1 W A U^-1 V b = d.
A special case is the change of basis of an endomorphism: with W = U and Z = V the last formula reduces to
Aa = c  ⇔  V^-1 U A U^-1 V b = d  ⇔  (V^-1 U) A (V^-1 U)^-1 b = d
In the even more special case U = W = E we have
Aa = c  ⇔  V^-1 A V b = d.
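A small numerical illustration of the case U = W = E: if A is the matrix of an endomorphism in the standard basis, then V^-1 A V is its matrix with respect to the basis formed by the columns of V. The basis below is chosen so that the result comes out diagonal (its columns are eigenvectors of A; eigenvalues are treated later in 1.5). Helper names are ours:

```python
# Change of basis for an endomorphism: V^-1 A V is the matrix of the
# same map with respect to the basis given by the columns of V.

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 1], [1, 2]]                  # map in the standard basis
V = [[1, 1], [1, -1]]                 # new basis vectors as columns
D = matmul2(inv2(V), matmul2(A, V))
assert D == [[3.0, 0.0], [0.0, 1.0]]  # diagonal in the new basis
```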
1.2.6 Some Special Linear Maps in R2
(i) Identity and zero maps E and 0.
(ii) Homogeneous scaling λE:
[ λ 0 ]
[ 0 λ ]
(iii) Rotation with the angle φ:
[ cos φ  -sin φ ]
[ sin φ   cos φ ]
(iv) Shears such as
[ 1 1 ]
[ 0 1 ]
(v) Reflections. Let ||a|| = 1 and g be the straight line with direction a. The reflection at g has the matrix
S_g = [ 2a_1^2 - 1   2a_1 a_2   ]
      [ 2a_1 a_2     2a_2^2 - 1 ]  = 2 a a^T - E
(vi) The reflection at zero has the matrix
-E = [ -1  0 ]
     [  0 -1 ]
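A quick check of the reflection formula S_g = 2 a a^T - E from (v) (helper name ours):

```python
# Reflection at the line through 0 with unit direction a = (a1, a2).

def reflection(a):
    a1, a2 = a
    return [[2 * a1 * a1 - 1, 2 * a1 * a2],
            [2 * a1 * a2, 2 * a2 * a2 - 1]]

# reflection at the x-axis: direction a = (1, 0)
S = reflection((1, 0))
assert S == [[1, 0], [0, -1]]

# reflecting twice gives the identity: S*S = E
SS = [[sum(S[i][k] * S[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
assert SS == [[1, 0], [0, 1]]
```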
1.2.7 Examples
[Figures 1-8 (not reproduced) showed the action of eight example 2×2 matrices on a test shape, among them the identity [1 0; 0 1], the scalings [1.5 0; 0 0.5] and [0.75 0; 0 1], the shear [1 1; 0 1], several reflections, and a matrix with all entries ±a for a parameter a; the exact signs are lost in this copy.]
1.3 Operations with matrices
The transpose A^T of a matrix A is the matrix with columns and rows exchanged. The transpose of an m×n matrix is an n×m matrix. For square matrices this means that everything is mirrored at the main diagonal.
The adjoint A* of a (complex) matrix A is constructed by replacing all entries of the transpose by their complex conjugates.
A square matrix is called symmetric if it is equal to its transpose. It is called self-adjoint or hermitian if it is equal to its adjoint. For real matrices these terms coincide.
A matrix with A^T = -A is called skew-symmetric; if A* = -A, A is called skew-hermitian.
Often it is useful to regard vectors as matrices with one column and n rows. The numbers in R or C correspond to the 1×1 matrices.
1.3.1 Matrix-algebra
A + B = B + A    λ(A + B) = λA + λB    (A + B) + C = A + (B + C)
(A + B)C = AC + BC    A(B + C) = AB + AC    (AB)C = A(BC)

Attention! In general AB ≠ BA.

Let A and B be invertible n×n matrices. Then AB is invertible and the following rules hold:
(AB)^-1 = B^-1 A^-1    (λA)^-1 = (1/λ) A^-1
AE = EA = A    A0 = 0A = 0
(A^-1)^-1 = A    (A^T)^T = A    (A*)* = A
(A + B)^T = A^T + B^T    (λA)^T = λ A^T    (AB)^T = B^T A^T
(A + B)* = A* + B*    (λA)* = conj(λ) A*    (AB)* = B* A*
(A^-1)^T = (A^T)^-1    (A^-1)* = (A*)^-1
1.3.1.1 Block Matrices
If a matrix is divided into blocks by horizontal or vertical lines, one can calculate with these blocks as if they were entries in a common matrix (exception: determinants!). The blocks have to fit in size. Example:

[ A1 A2 ] [ B1  O  ]   [ A1 B1 + A2   A2 B2 ]
[ A3 A4 ] [ Ek  B2 ] = [ A3 B1 + A4   A4 B2 ]

Here O denotes a matrix consisting only of zeroes and Ek a k×k identity matrix.
1.3.2 Scalar Product
The role of the transpose resp. adjoint matrix becomes clearer if we regard the scalar product as a matrix product:

<u, v> = Σ_{i=1}^n u_i conj(v_i) = v* u    (complex case)
<u, v> = Σ_{i=1}^n u_i v_i = v^T u         (real case).

So we have

<Au, v> = v^T A u = v^T (A^T)^T u = (A^T v)^T u = <u, A^T v>

and analogously in the complex case <Au, v> = <u, A* v>.

This property characterizes the transpose matrix: let <Au, v> = <u, Bv> for all u, v ∈ R^n. If one chooses u = e_i and v = e_j, one has <A e_i, e_j> = a_ji and <e_i, B e_j> = b_ij, so B = A^T.
1.3.3 Homogeneous Coordinates
With matrix multiplication one can describe rotations, stretchings, shearings or reflections (and combinations of these), but as the origin always remains fixed, translations are not possible. This difficulty can be overcome by using homogeneous coordinates. Homogeneous coordinates in R^3 consist of four coordinates, where the fourth coordinate must not be zero. A point (x, y, z) ∈ R^3 is represented by any vector of the form [ax, ay, az, a]^T with a ≠ 0. Especially [x, y, z, 1]^T is a representant of (x, y, z)^T. Then we have the following correspondences:
cartesian coordinates        homogeneous coordinates

x = [x1, x2, x3]^T           y = [x1, x2, x3, 1]^T  or  y = [a x1, a x2, a x3, a]^T

x ↦ Ax                       y ↦ By with B = the 3×3 block A bordered by a zero
                             fourth column and the fourth row (0 0 0 1)

x ↦ x + v                    y ↦ By with
                             B = [ 1 0 0 v1 ]
                                 [ 0 1 0 v2 ]
                                 [ 0 0 1 v3 ]
                                 [ 0 0 0 1  ]
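A sketch of the translation matrix from the last row of the table; note that every representative [ax, ay, az, a]^T of a point is mapped to a representative of the translated point (helper names ours):

```python
# Translation x -> x + v in homogeneous coordinates via a 4x4 matrix.

def translate(v):
    v1, v2, v3 = v
    return [[1, 0, 0, v1],
            [0, 1, 0, v2],
            [0, 0, 1, v3],
            [0, 0, 0, 1]]

def apply(B, y):
    """Matrix-vector product for 4-vectors in homogeneous coordinates."""
    return [sum(B[i][j] * y[j] for j in range(4)) for i in range(4)]

y = [2, 3, 4, 1]                     # homogeneous coordinates of (2, 3, 4)
assert apply(translate((1, -1, 5)), y) == [3, 2, 9, 1]
# any representative [a*x, a*y, a*z, a] maps consistently (here a = 2):
assert apply(translate((1, -1, 5)), [4, 6, 8, 2]) == [6, 4, 18, 2]
```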
1.3.4 Norms
Definition of norms of linear maps. Let U and V be normed vector spaces and let L(U, V) denote the vector space of all linear maps from U to V. A norm on L(U, V) is a real-valued function with the following properties: if A, B ∈ L(U, V) then
(i) ||A|| ≥ 0 and ||A|| = 0 ⇔ A = 0, the zero map (definiteness)
(ii) ||λA|| = |λ| ||A|| (homogeneity)
(iii) ||A + B|| ≤ ||A|| + ||B|| (triangle inequality)
(iv) ||AB|| ≤ ||A|| ||B||
In the finite-dimensional case linear maps are represented by matrices, and the norm is called a matrix norm. Other notation: operator norm.
In general, a vector norm ||·||_a and a matrix norm ||·||_b are compatible if for each vector x and each matrix A the inequality ||Ax||_a ≤ ||A||_b ||x||_a holds. The norm definition below produces compatible matrix norms.

Definition. Let ||·|| be a (vector) norm in K^n and A be an n×n matrix. We define the matrix norm ||A|| generated by ||·|| by
||A|| = max{ ||Ax|| : ||x|| = 1 } = max{ ||Ax|| : ||x|| ≤ 1 }.
Then one has ||A|| = min{ C | for all x one has ||Ax|| ≤ C ||x|| }.
The norms generated by the vector norms ||·||_1 and ||·||_∞ above are denoted by the same symbol.
Lemma
(i) ||A||_1 = max_{1<=j<=n} sum_{i=1}^n |a_{ij}| (largest column sum)
(ii) ||A||_2 is the first (and largest) singular value of A (will be defined later)
(iii) ||A||_infty = max_{1<=i<=n} sum_{j=1}^n |a_{ij}| (largest row sum)
(iv) ||A||_S = ( sum_{i,j=1}^n |a_{ij}|^2 )^{1/2} is compatible with ||.||_2 (Frobenius norm).
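The three directly computable norms of the lemma can be sketched as follows; the matrix is an arbitrary illustrative choice.

```python
# Column-sum norm, row-sum norm and Frobenius norm of a 2x2 matrix.
A = [[1.0, -2.0],
     [3.0,  4.0]]

norm_1   = max(sum(abs(A[i][j]) for i in range(len(A))) for j in range(len(A[0])))
norm_inf = max(sum(abs(a) for a in row) for row in A)
norm_S   = sum(a * a for row in A for a in row) ** 0.5   # Frobenius norm

print(norm_1, norm_inf, norm_S)   # 6.0, 7.0, sqrt(30)
```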
1.4 Gauss Algorithm and LU-Decomposition
1.4.1 Numerical Stability
We will study some small equation systems and the effect of rounding errors on the solutions.
Example system:

10^{-4} x + y = 1
        x + y = 2

Solution with the Gauss algorithm, exact calculation:

( 1/10000  1 | 1 )      ( 1/10000   1    |   1   )      ( 1/10000  1 | 1         )
( 1        1 | 2 )  ->  ( 0       -9999  | -9998 )  ->  ( 0        1 | 9998/9999 )

->  ( 1/10000  0 | 1/9999    )      ( 1  0 | 10000/9999 )
    ( 0        1 | 9998/9999 )  ->  ( 0  1 | 9998/9999  )

and so x ≈ 1 and y ≈ 1.
Now the same calculation with three significant digits, i.e. all numbers are rounded to the nearest number with three digits of the form x = 0.abc * 10^p:

( 0.0001  1 | 1 )      ( 0.0001   1     |   1    )      ( 0.0001  1 | 1 )
( 1       1 | 2 )  ->  ( 0      -10000  | -10000 )  ->  ( 0       1 | 1 )

->  ( 0.0001  0 | 0 )      ( 1  0 | 0 )
    ( 0       1 | 1 )  ->  ( 0  1 | 1 )

x = 0
y = 1

This solution is unusable.
This can be avoided by pivoting: choose the entry in the first column below the diagonal (the diagonal included) with the largest absolute value and put it into the diagonal by exchanging rows. Then go on with the Gauss algorithm. If A is invertible then the pivot elements are unequal to zero.
This results in the following:

( 0.0001  1 | 1 )      ( 1       1 | 2 )      ( 1  1      | 2      )      ( 1  1 | 1  0 | 1 )
( 1       1 | 2 )  ->  ( 0.0001  1 | 1 )  ->  ( 0  0.9999 | 0.9998 )  ->  ( 0  1 | 1 )  ->  ( 0  1 | 1 )
Other problems may arise. Example two is example one after multiplying row 1 by 20000. Again the calculations use three significant digits.

( 2  20000 | 20000 )      ( 2   20000  |  20000 )      ( 2  20000 | 20000 )      ( 2  0 | 0 )
( 1  1     | 2     )  ->  ( 0  -10000  | -10000 )  ->  ( 0  1     | 1     )  ->  ( 0  1 | 1 )

x = 0
y = 1

So this solution is unusable, too.
This effect can be avoided by equilibration. This means that each equation is multiplied with a factor so that the sum of the absolute values of the row, sum_{k=1}^n |a_{ik}|, is equal to one.
Applying this one gets

( 2  20000 | 20000 )      ( 2/20002  20000/20002 | 20000/20002 )      ( 0.0001  1   | 1 )
( 1  1     | 2     )  ->  ( 1/2      1/2         | 1           )  ->  ( 0.5     0.5 | 1 )

Then pivoting gives

( 0.5     0.5 | 1 )      ( 0.5  0.5 | 1 )      x = 1
( 0.0001  1   | 1 )  ->  ( 0    1   | 1 )      y = 1
Conclusion: pivoting and equilibration can help to avoid problems caused
by rounding errors.
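The rounding experiment can be replayed mechanically; a sketch in which every intermediate result is rounded to three significant digits (the helpers `rnd` and `solve2` are ad-hoc names for this illustration).

```python
# Gauss elimination on the 2x2 system with 3-significant-digit arithmetic,
# once without and once with pivoting (rows exchanged beforehand).

def rnd(x):
    # round to three significant digits, i.e. to the form 0.abc * 10^p
    if x == 0.0:
        return 0.0
    return float(f"{x:.2e}")

def solve2(M):
    # eliminate, then back-substitute, rounding each intermediate value
    f = rnd(M[1][0] / M[0][0])
    r2 = [rnd(M[1][j] - f * M[0][j]) for j in range(3)]
    y = rnd(r2[2] / r2[1])
    x = rnd((M[0][2] - rnd(M[0][1] * y)) / M[0][0])
    return x, y

naive = solve2([[1e-4, 1.0, 1.0], [1.0, 1.0, 2.0]])
pivoted = solve2([[1.0, 1.0, 2.0], [1e-4, 1.0, 1.0]])
print(naive, pivoted)   # (0.0, 1.0) without pivoting, (1.0, 1.0) with it
```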
1.4.2 Special Operations
We always assume that the sizes of the matrices fit so that the products can be performed.
Let lambda be a real or complex number, k != l. Now define the following n x n square matrices:
Definition

C(k, l; lambda) = (c_{ij})_{i,j=1..n} with c_{ij} = 1 for i = j; lambda for i = k, j = l; 0 otherwise

D(k; lambda) = (d_{ij})_{i,j=1..n} with d_{ij} = 1 for i = j and i != k; lambda for i = j = k; 0 otherwise

F(k, l) = (f_{ij})_{i,j=1..n} with f_{ij} = 1 for i = j and i != k and i != l; 1 for i = k, j = l; 1 for i = l, j = k; 0 otherwise

Decompose A into column vectors a_i and row vectors b_j.
Multiplication from the left side does operations with rows:

C(k, l; lambda) A : row k is replaced by b_k + lambda b_l
D(k; lambda) A    : row k is replaced by lambda b_k
F(k, l) A         : rows k and l are exchanged

Multiplication from the right side does operations with columns:

A C(k, l; lambda) : column l is replaced by a_l + lambda a_k
A D(k; lambda)    : column k is replaced by lambda a_k
A F(k, l)         : columns k and l are exchanged

Observe that multiplication with C(k, l; lambda) from the right changes column l while multiplication from the left changes row k.
1.4.3 Properties of C(k, l; ), D(k; ) and F(k, l)
(i) C(k, l; lambda)^{-1} = C(k, l; -lambda)
(ii) C(k, l; lambda) C(k, m; mu) = C(k, m; mu) C(k, l; lambda)
(iii) C(k, l; 0) = E
(iv) For lambda != 0 we have D(k; lambda)^{-1} = D(k; lambda^{-1})
(v) F(k, l)^{-1} = F(k, l) = F(k, l)^T
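A sketch of these matrices and of properties (i), (iv) and (v) in code; indices are 0-based here, and the size n = 3 and the tested parameters are illustrative choices.

```python
# Build C(k,l;lam), D(k;lam), F(k,l) and verify their inverses.

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def C(n, k, l, lam):
    M = identity(n)
    M[k][l] = lam
    return M

def D(n, k, lam):
    M = identity(n)
    M[k][k] = lam
    return M

def F(n, k, l):
    M = identity(n)
    M[k][k] = M[l][l] = 0.0
    M[k][l] = M[l][k] = 1.0
    return M

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

E = identity(3)
print(matmul(C(3, 0, 2, 5.0), C(3, 0, 2, -5.0)) == E)  # C(k,l;lam)^-1 = C(k,l;-lam)
print(matmul(D(3, 1, 4.0), D(3, 1, 0.25)) == E)        # D(k;lam)^-1 = D(k;1/lam)
print(matmul(F(3, 0, 2), F(3, 0, 2)) == E)             # F(k,l)^-1 = F(k,l)
```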
1.4.4 Standard Algorithm
Standard operations in the Gauss algorithm are
(i) adding row l multiplied by lambda to row k
(ii) multiplying row k by lambda != 0
(iii) exchanging rows k and l.
These operations can be described with aid of the fundamental matrices C(k, l; lambda), D(k; lambda) and F(k, l). To see this we write the system Ax = b as an augmented matrix S = (A | b).
Then the operations (i) to (iii) from above are
(i) multiply S with C(k, l; lambda) from the left
(ii) multiply S with D(k; lambda) from the left
(iii) multiply S with F(k, l) from the left.
As all appearing matrices are invertible we see that the Gauss algorithm
gives equivalent transformations and so preserves the set of solutions.
If the system Ax = b is uniquely solvable it is sufficient to reach the form

( d_11  *   ...   *   | * )
( 0    d_22 ...   *   | * )
( ...       ...       | * )
( 0    ...   0   d_nn | * )

From the last equation one can read off directly the value of x_n, and substituting the already determined variables the solution is calculated recursively from the bottom to the top.
1.4.5 LU-Decomposition
The LU-decomposition (an effective method for solving many equation systems with the same left side) decomposes a given square matrix A as A = PLU with
(i) P a permutation matrix, i.e. P has exactly one 1 in each column and row, and all other entries are zero,
(ii) L a lower triangular matrix,
(iii) U an upper triangular matrix.
1.4.5.1 Description of the Algorithm - simple case with P=E
The algorithm consists of a series of transformations of the matrix A.
With L_0 = E, U_0 = A we calculate

A = EA = L_0 U_0 = ... = L_k U_k = ... = L_n U_n =: LU.
The matrices L_k and U_k have the following shape: L_k is lower triangular with ones on the diagonal, and only its first k columns contain entries below the diagonal (the remaining columns agree with the identity matrix). U_k is upper triangular in its first k rows, with the pivots u_11, ..., u_kk on the diagonal; below row k the matrix may still be full. In U_{k-1} the entry z at position (k, k) is the next pivot candidate.
m1  We start with A = L_{k-1} U_{k-1}. Let U_{k-1} = (u_{ij}).
In this simple case we assume that z = u_kk != 0.
To each row j from k+1 to the last in U_{k-1} we add the row k multiplied with lambda_j := -u_{jk} / u_{kk}. This results in having zeroes in column k from row k+1 to the bottom.
These actions expressed with matrices: U_{k-1} is multiplied from the left with C(j, k; lambda_j).
Recall the facts that the inverse of C(j, k; lambda_j) is C(j, k; -lambda_j) and that the matrices C(j, k; lambda) and C(i, k; mu) commute. So we have

A = L_{k-1} U_{k-1}
  = L_{k-1} C(k+1, k; -lambda_{k+1}) C(k+1, k; lambda_{k+1}) U_{k-1}
  = ...
  = L_{k-1} C(k+1, k; -lambda_{k+1}) ... C(n, k; -lambda_n) C(n, k; lambda_n) ... C(k+1, k; lambda_{k+1}) U_{k-1}
  = [ L_{k-1} C(k+1, k; -lambda_{k+1}) ... C(n, k; -lambda_n) ] [ C(n, k; lambda_n) ... C(k+1, k; lambda_{k+1}) U_{k-1} ]
  =: L_k U_k.

How is L_k built from L_{k-1}?
The action of the matrices C(j, k; -lambda_j) is adding multiples of the columns k+1 to n to column k. Obviously only column k is changed by this process, and it contains in the places k+1 to n the negatives of the factors used in the transformation of U_{k-1}.
As an example we write down L_1 in the case U_0 = A = (a_{ij}):

L_1 = ( 1          0  ...  0 )
      ( a_21/a_11  1       0 )
      ( ...           ...    )
      ( a_n1/a_11  0  ...  1 )
m2  Recursively now repeat step m1.
When the algorithm ends we have A = LU with

L = ( 1   0   ...  0 )        U = ( u_11  *    ...   *   )
    ( *   1   ...  0 )            ( 0    u_22  ...   *   )
    ( ...     ...    )            ( ...        ...   *   )
    ( *   ...  *   1 )            ( 0    ...    0   u_nn )

The u_ii are non-zero.

m3  Now we have Ax = LUx = L(Ux) = Ly = b with y := Ux.
(i) Ly = b is solved recursively beginning with the first component of y.
(ii) Ux = y is solved recursively beginning with the last component of x.
1.4.5.2 Remark
det A = det L * det U = u_11 * ... * u_nn.
1.4.6 Example
Let A = ( 1 2 4 )   and b = ( 3 )
        ( 2 3 8 )           ( 6 )
        ( 1 3 1 )           ( 0 ).   Solve Ax = b.

m1  Start with the LU-decomposition of A.

[L_0 | U_0] = ( 1 0 0 | 1 2 4 )
              ( 0 1 0 | 2 3 8 )
              ( 0 0 1 | 1 3 1 )

[L_1 | U_1] = ( 1 0 0 | 1  2  4 )
              ( 2 1 0 | 0 -1  0 )
              ( 1 0 1 | 0  1 -3 )

[L_2 | U_2] = ( 1  0 0 | 1  2  4 )
              ( 2  1 0 | 0 -1  0 )
              ( 1 -1 1 | 0  0 -3 )

m2  Solve Ly = b.
Line by line one has y_1 = 3, y_2 = 0 and y_3 = -3.

m3  Solve Ux = y.
Line by line (from the bottom to the top) one has
x_3 = 1, x_2 = 0 and x_1 = -1, so x = [-1 0 1]^T.
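The example can be checked mechanically by multiplying the factors back together and re-running both substitutions.

```python
# Check of the worked example: L*U reproduces A, and the forward/backward
# substitutions reproduce y and x.
A = [[1, 2, 4], [2, 3, 8], [1, 3, 1]]
L = [[1, 0, 0], [2, 1, 0], [1, -1, 1]]
U = [[1, 2, 4], [0, -1, 0], [0, 0, -3]]
b = [3, 6, 0]
n = 3

LU = [[sum(L[i][k] * U[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
assert LU == A

# forward substitution: Ly = b, beginning with the first component
y = b[:]
for i in range(n):
    y[i] = b[i] - sum(L[i][k] * y[k] for k in range(i))
# backward substitution: Ux = y, beginning with the last component
x = [0.0] * n
for i in reversed(range(n)):
    x[i] = (y[i] - sum(U[i][k] * x[k] for k in range(i + 1, n))) / U[i][i]

print(y, x)   # x solves Ax = b
```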
1.4.6.1 LU-decomposition, general case
This general case brings two extensions:
- A may be singular
- pivoting is possible
Now we construct a decomposition A = PLU. We start with P_0 := E, L_0 = E and U_0 = A.
If the element z in U_{k-1} is zero and the rest of the column k below it contains only zeroes, too, then the matrix A is singular. In this case let U_k := U_{k-1} and L_k := L_{k-1}. We will get an LU-decomposition of A with some diagonal elements of U being zero. This can only happen if A is singular.
If in a row l > k of the column k there is an entry with a larger absolute value, then exchange the rows k and l of U_{k-1}.
This is a multiplication of U_{k-1} from the left with F(k, l). Remembering F(k, l) F(k, l) = E we get

A = P_{k-1} L_{k-1} U_{k-1} = [ P_{k-1} L_{k-1} F(k, l) ] [ F(k, l) U_{k-1} ] =: P_{k-1} L~_{k-1} U~_{k-1}.

The matrix U~_{k-1} is U_{k-1} with rows k and l exchanged and therefore has a non-zero element in position z.
The action of right multiplication with F(k, l) on L_{k-1} is interchanging columns k and l. As these columns consist of zeroes with only one 1 in each case, this can be undone by interchanging the rows k and l, i.e. multiplying L_{k-1} with F(k, l) from the left. But doing so interchanges the first k-1 positions of these rows too, so that one has to undo this.
Summarizing, this step of the algorithm is: set P_k := P_{k-1} F(k, l), and L~_{k-1} is L_{k-1} with the entries of the rows k and l in the first k-1 columns interchanged.
(Schematic: passing from P_{k-1} to P_k exchanges the columns k and l; in L_{k-1} the first k-1 entries of the rows k and l are exchanged; in U_{k-1} the rows k and l are exchanged.)
Now construct U_k and L_k as in the simple case from U~_{k-1} and L~_{k-1} and get A = P_k L_k U_k.
In the end we have P^{-1} = P^T. As P is a product of matrices F(k, l) and F(k, l)^{-1} = F(k, l)^T, this is true for P, too, because of:
Let A^T = A^{-1} and B^T = B^{-1}. Then (AB)^T = B^T A^T = B^{-1} A^{-1} = (AB)^{-1}.
1.4.7 Summary of LU-decomposition
Solving a linear equation system Ax = b with LU-decomposition consists
of the following steps:
m1 Start with P0 = L0 = En, U0 = A.
m2 For each k from 1 to n perform
Exchanging rows: U~_{k-1} is U_{k-1} with rows k and l > k exchanged, L~_{k-1} is L_{k-1} where the first k-1 entries in rows k and l are exchanged (only if k > 1), and exchanging columns k and l in P_{k-1} gives P_k.
If you skip this step just put P_k := P_{k-1}, L~_{k-1} := L_{k-1} and U~_{k-1} := U_{k-1}.
Adding multiples of row k to the rows below: adding in U~_{k-1} the lambda_l-fold row k to the rows l with l > k gives U_k, and L_k is L~_{k-1} with entries -lambda_l in row l of column k.
With P := Pn, L := Ln and U := Un this gives the decomposition
A = P LU.
In case of different right sides bj in the equation system, this step
has to be carried out only once.
m3  Solve Pz = b by z = P^T b.
m4  Solve Ly = z recursively starting with y_1.
m5  Solve Ux = y recursively starting with x_n.
At an arbitrary point you can make a crosscheck whether you made mistakes during the calculation: P_k L_k U_k and P_k L~_{k-1} U~_{k-1} must always be equal to A.
1.4.7.1 Remarks
(i) The first step in the LU-decomposition can be used to do pivoting; i.e. you can always put the entry with the largest absolute value into the pivot position on the diagonal. This results in higher numerical stability.
(ii) P arises from the identity-matrix by interchanging rows. Thereforeit is not necessary to write down the complete matrix. One only
has to keep notice what coordinates are interchanged.
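The summarized steps can be sketched as a small routine; the 3x3 test matrix below is an arbitrary illustrative choice, and P is kept as a row-index vector p in the spirit of the short form.

```python
# LU-decomposition with partial pivoting, tracking P as a permutation
# vector p: row i of L*U equals row p[i] of A.

def plu(A):
    n = len(A)
    U = [row[:] for row in A]
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    p = list(range(n))
    for k in range(n):
        # pivoting: bring the largest entry of column k into the diagonal
        m = max(range(k, n), key=lambda i: abs(U[i][k]))
        if m != k:
            U[k], U[m] = U[m], U[k]
            p[k], p[m] = p[m], p[k]
            for j in range(k):          # first k entries of rows k and m
                L[k][j], L[m][j] = L[m][j], L[k][j]
        if U[k][k] == 0.0:              # column exhausted: A is singular
            continue
        for i in range(k + 1, n):       # eliminate below the pivot
            L[i][k] = U[i][k] / U[k][k]
            for j in range(k, n):
                U[i][j] -= L[i][k] * U[k][j]
    return p, L, U

A = [[0.0, 2.0, 1.0],
     [1.0, 1.0, 1.0],
     [4.0, 2.0, 0.0]]
p, L, U = plu(A)
LU = [[sum(L[i][k] * U[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
print(all(LU[i] == A[p[i]] for i in range(3)))   # crosscheck: PLU = A
```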
1.4.8 Example of LU-Decomposition
A = (  6  5  3 -10 )
    (  3  7 -3   5 )
    ( 12  4  4   4 )
    (  0 12  0  -8 )

[P_0 | L_0 | U_0] = [E | E | A]

= ( 1 0 0 0 | 1 0 0 0 |  6  5  3 -10 )
  ( 0 1 0 0 | 0 1 0 0 |  3  7 -3   5 )
  ( 0 0 1 0 | 0 0 1 0 | 12  4  4   4 )
  ( 0 0 0 1 | 0 0 0 1 |  0 12  0  -8 )

[P_1 | L_0 | U_0] = ( 0 0 1 0 | 1 0 0 0 | 12  4  4   4 )
                    ( 0 1 0 0 | 0 1 0 0 |  3  7 -3   5 )
                    ( 1 0 0 0 | 0 0 1 0 |  6  5  3 -10 )
                    ( 0 0 0 1 | 0 0 0 1 |  0 12  0  -8 )

[P_1 | L_1 | U_1] = ( 0 0 1 0 | 1   0 0 0 | 12  4  4   4 )
                    ( 0 1 0 0 | 1/4 1 0 0 |  0  6 -4   4 )
                    ( 1 0 0 0 | 1/2 0 1 0 |  0  3  1 -12 )
                    ( 0 0 0 1 | 0   0 0 1 |  0 12  0  -8 )

[P_2 | L~_1 | U~_1] = ( 0 0 1 0 | 1   0 0 0 | 12  4  4   4 )
                      ( 0 0 0 1 | 0   1 0 0 |  0 12  0  -8 )
                      ( 1 0 0 0 | 1/2 0 1 0 |  0  3  1 -12 )
                      ( 0 1 0 0 | 1/4 0 0 1 |  0  6 -4   4 )

[P_2 | L_2 | U_2] = ( 0 0 1 0 | 1   0   0 0 | 12  4  4   4 )
                    ( 0 0 0 1 | 0   1   0 0 |  0 12  0  -8 )
                    ( 1 0 0 0 | 1/2 1/4 1 0 |  0  0  1 -10 )
                    ( 0 1 0 0 | 1/4 1/2 0 1 |  0  0 -4   8 )

[P_3 | L~_2 | U~_2] = ( 0 0 0 1 | 1   0   0 0 | 12  4  4   4 )
                      ( 0 0 1 0 | 0   1   0 0 |  0 12  0  -8 )
                      ( 1 0 0 0 | 1/4 1/2 1 0 |  0  0 -4   8 )
                      ( 0 1 0 0 | 1/2 1/4 0 1 |  0  0  1 -10 )

[P_3 | L_3 | U_3] = ( 0 0 0 1 | 1   0    0   0 | 12  4  4   4 )
                    ( 0 0 1 0 | 0   1    0   0 |  0 12  0  -8 )
                    ( 1 0 0 0 | 1/4 1/2  1   0 |  0  0 -4   8 )
                    ( 0 1 0 0 | 1/2 1/4 -1/4 1 |  0  0  0  -8 )

A = PLU = P_3 L_3 U_3 with

P = ( 0 0 0 1 )   L = ( 1   0    0   0 )   U = ( 12  4  4   4 )
    ( 0 0 1 0 )       ( 0   1    0   0 )       (  0 12  0  -8 )
    ( 1 0 0 0 )       ( 1/4 1/2  1   0 )       (  0  0 -4   8 )
    ( 0 1 0 0 )       ( 1/2 1/4 -1/4 1 )       (  0  0  0  -8 )
1.4.9 Solving a Linear Equation System
Ax = b with A = (  6  5  3 -10 )   and b = ( -10 )
                (  3  7 -3   5 )           (  14 )
                ( 12  4  4   4 )           (   8 )
                (  0 12  0  -8 )           (  -8 )

m1  Solve Pz = b:

z = P^T b = ( 0 0 1 0 ) ( -10 )   (   8 )
            ( 0 0 0 1 ) (  14 ) = (  -8 )
            ( 0 1 0 0 ) (   8 )   (  14 )
            ( 1 0 0 0 ) (  -8 )   ( -10 ).

m2  Solve Ly = z, i.e.

( 1   0    0   0 ) ( y_1 )   (   8 )
( 0   1    0   0 ) ( y_2 ) = (  -8 )
( 1/4 1/2  1   0 ) ( y_3 )   (  14 )
( 1/2 1/4 -1/4 1 ) ( y_4 )   ( -10 ).

Line by line one has y_1 = 8, y_2 = -8, then 2 - 4 + y_3 = 14 => y_3 = 16 and 4 - 2 - 4 + y_4 = -10 => y_4 = -8.

m3  Solve Ux = y, i.e.

( 12  4  4   4 ) ( x_1 )   (  8 )
(  0 12  0  -8 ) ( x_2 ) = ( -8 )
(  0  0 -4   8 ) ( x_3 )   ( 16 )
(  0  0  0  -8 ) ( x_4 )   ( -8 ).

Line by line one has
-8 x_4 = -8 => x_4 = 1,  -4 x_3 + 8 = 16 => x_3 = -2,  12 x_2 - 8 = -8 => x_2 = 0 and 12 x_1 - 8 + 4 = 8 => x_1 = 1, so
x = [1 0 -2 1]^T.
1.4.10 Short Form
(i) Use the zeroes in the U-matrix to store the elements below the
diagonal of the L-matrix.
Divide these areas of the U-matrix by a line.
(ii) Instead of the P-matrix use a vector (initially p = [1 2 3 4]^T) containing the numbers of the rows of the right-side vector b.
Then a pivoting operation results in exchanging whole rows in U and p.
1.4.11 Example
[P_0 | L_0 | U_0] = [E | E | A]:

(  6  5  3 -10 )   ( 1 )
(  3  7 -3   5 )   ( 2 )
( 12  4  4   4 )   ( 3 )
(  0 12  0  -8 )   ( 4 )

[P_1 | L_0 | U_0]:

( 12  4  4   4 )   ( 3 )
(  3  7 -3   5 )   ( 2 )
(  6  5  3 -10 )   ( 1 )
(  0 12  0  -8 )   ( 4 )

[P_1 | L_1 | U_1]:

( 12    4   4    4 )   ( 3 )
( 1/4 | 6  -4    4 )   ( 2 )
( 1/2 | 3   1  -12 )   ( 1 )
( 0   | 12  0   -8 )   ( 4 )

[P_2 | L~_1 | U~_1]:

( 12    4   4    4 )   ( 3 )
( 0   | 12  0   -8 )   ( 4 )
( 1/2 | 3   1  -12 )   ( 1 )
( 1/4 | 6  -4    4 )   ( 2 )

[P_2 | L_2 | U_2]:

( 12    4      4    4 )   ( 3 )
( 0   | 12     0   -8 )   ( 4 )
( 1/2   1/4 |  1  -10 )   ( 1 )
( 1/4   1/2 | -4    8 )   ( 2 )

[P_3 | L~_2 | U~_2]:

( 12    4      4    4 )   ( 3 )
( 0   | 12     0   -8 )   ( 4 )
( 1/4   1/2 | -4    8 )   ( 2 )
( 1/2   1/4 |  1  -10 )   ( 1 )

[P_3 | L_3 | U_3]:

( 12    4      4      4 )   ( 3 )
( 0   | 12     0     -8 )   ( 4 )
( 1/4   1/2 | -4      8 )   ( 2 )
( 1/2   1/4   -1/4 | -8 )   ( 1 )

Decompose this and put the L- and U-parts into the right form:

L = ( 1    0     0   0 )   and U = ( 12  4  4   4 )
    ( 0    1     0   0 )           (  0 12  0  -8 )
    ( 1/4  1/2   1   0 )           (  0  0 -4   8 )
    ( 1/2  1/4 -1/4  1 )           (  0  0  0  -8 )

In z = P^T b one has

b = ( b_1 )   ( -10 )           ( b_3 )   (   8 )
    ( b_2 ) = (  14 )   so  z = ( b_4 ) = (  -8 )
    ( b_3 )   (   8 )           ( b_2 )   (  14 )
    ( b_4 )   (  -8 )           ( b_1 )   ( -10 )

and the rest is as above.
If one wants P explicitly one has from p: P = [e_3, e_4, e_2, e_1].
1.5 Eigenvalues and Eigenvectors
1.5.1 Definition and properties
Let A be a square matrix.
(i) If lambda in C and v != 0 is a vector with Av = lambda v, then v is called eigenvector of A to the eigenvalue lambda.
(ii) We have: Av = lambda v with v != 0
<=> there is a vector v != 0 with (A - lambda E)v = 0
<=> the kernel of A - lambda E is non-trivial
<=> A - lambda E is not regular
<=> det(A - lambda E) = 0.
As det(A - lambda E) is a polynomial of degree n in lambda, we define: p(lambda) = det(A - lambda E) is called the characteristic polynomial of A.
Therefore a (complex) number lambda is an eigenvalue of A if and only if lambda is a zero of the characteristic polynomial.
(iii) A has at least one eigenvalue and at least one eigenvector to each eigenvalue.
(iv) If lambda is a k-fold zero of p, then o(lambda) = k is called the algebraic multiplicity of lambda.
The geometric multiplicity gamma(lambda) is the dimension of the kernel of A - lambda E, that is the dimension of the eigenspace of A and lambda.
(v) A vector v is called generalized eigenvector of the k-th order to lambda, if the following holds:
(A - lambda E)^k v = 0, but (A - lambda E)^{k-1} v != 0.
(vi) Because of (A - lambda E)^0 v = Ev = v the eigenvectors are just the generalized eigenvectors of first order. If v is a generalized eigenvector of k-th order then (A - lambda E)v is a generalized eigenvector of order (k-1).
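For a 2x2 matrix the whole chain (characteristic polynomial, eigenvalues, an eigenvector) can be carried out by hand; the matrix below is an illustrative choice.

```python
# p(lam) = det(A - lam E) = lam^2 - tr(A) lam + det(A); its zeroes are the
# eigenvalues, and a kernel vector of A - lam E is an eigenvector.
import math

A = [[2.0, 1.0],
     [1.0, 2.0]]
tr = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
disc = math.sqrt(tr * tr - 4.0 * det)
lam1, lam2 = (tr + disc) / 2.0, (tr - disc) / 2.0   # eigenvalues 3 and 1

# eigenvector to lam1: a non-trivial solution of (A - lam1 E)v = 0
v = [A[0][1], lam1 - A[0][0]]                       # here (1, 1)
Av = [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]
print(lam1, lam2, Av == [lam1 * v[0], lam1 * v[1]])   # 3.0 1.0 True
```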
1.5.2 More properties
(i) Let C = P A P^{-1}. Then A and C have the same characteristic polynomial.
(ii) If v is a (generalized) eigenvector of A then Pv is a (generalized) eigenvector of C (of the same order).
(iii) Let A be a square k x k matrix with the property that the diagonal and everything below the diagonal is zero. Then A^k = 0.
(iv) Let A be an (upper or lower) triangular matrix. Then the eigenvalues of A are the diagonal elements.
This shows that eigenvalues are properties of the linear map rather than
of the representing matrix.
1.5.3 Lemma
Let C be an m x m matrix. Then there exists an invertible m x m matrix P so that

S = P^{-1} C P = ( lambda  *  ...  * )
                 ( 0       *  ...  * )
                 ( ...     ...       )
                 ( 0       *  ...  * )

where lambda is an eigenvalue of C.
1.5.4 Theorem: Schur Form
Let A be an n x n matrix. Then there exists an invertible matrix P and an upper triangular matrix U with A = P U P^{-1}.
U has the same characteristic polynomial as A, so the diagonal elements of U are the eigenvalues of A with the same multiplicities.
1.5.5 Consequences
(i) Always 1 <= gamma(lambda) <= o(lambda) <= n holds.
If gamma(lambda) < o(lambda) then for sufficiently large k the dimension of the kernel of (A - lambda E)^k is equal to the algebraic multiplicity o(lambda).
(ii) The generalized eigenspace to lambda is the span of all generalized eigenvectors to lambda. Its dimension is o(lambda), i.e. there are in total as many linearly independent generalized eigenvectors to lambda as the order of lambda as a zero of the characteristic polynomial.
In particular for a simple zero of the characteristic polynomial we have: there is a one-dimensional eigenspace and there are no generalized eigenvectors of higher order.
(iii) (Generalized) eigenvectors to distinct eigenvalues are linearly independent.
(iv) A real matrix is called (real) diagonalisable, if
(1) the characteristic polynomial has only real zeroes,
(2) for each zero the algebraic and the geometric multiplicity are equal.
This means that there is a basis of the R^n consisting of eigenvectors of A resp. that there are no generalized eigenvectors of higher order.
(v) Accordingly a complex matrix is called complex diagonalisable if
for every eigenvalue the algebraic and geometric multiplicity are
the same.
(vi) The spectrum of A is the set of eigenvalues, denoted by sigma(A).
1.5.6 Jordan-Form
If lambda is an eigenvalue of the matrix A and v is a corresponding eigenvector, then Av = lambda v.
If v is a generalized eigenvector of order k+1 then u = (A - lambda E)v is a generalized eigenvector of order k. In this case we have Av = lambda v + u.
Putting these two cases together we get the important theorem on the Jordan-form of a matrix:
1.5.6.1 Jordan-Form
Let L be an endomorphism of C^n. Then there exists a basis of C^n so that in this basis L has a block-matrix representation

J = ( J_1  0   ...  0   )     where  J_r = ( lambda_r  1         0    ...  0 )
    ( 0    J_2 ...  ... )                  ( 0         lambda_r  1    ...  0 )
    ( ...      ...  0   )                  ( ...            ...  ...       1 )
    ( 0    ...  0   J_p )                  ( 0         ...       0  lambda_r )

The numbers lambda_r are (not necessarily distinct) eigenvalues. The blocks J_r are Jordan-blocks.
If J_r has the size k and u_1, ..., u_k are the basis vectors associated to the block J_r then we have

L u_1 = lambda_r u_1 and for 2 <= s <= k we have L u_s = lambda_r u_s + u_{s-1}.   (*)

That means that u_1 is an eigenvector and the u_s are generalized eigenvectors of order s. The (ordered) set u_1, ..., u_k is called a Jordan-chain.
Now let u_{1,1} ... u_{1,k_1}, u_{2,1} ... u_{2,k_2}, ..., u_{p,1} ... u_{p,k_p} be the Jordan chains associated with the Jordan blocks J_1, ..., J_p. The matrix

U = [ u_{1,1} ... u_{1,k_1} ... u_{p,1} ... u_{p,k_p} ]

fulfills

AU = UJ  <=>  A = U J U^{-1}  <=>  J = U^{-1} A U.

This is easily seen by looking at the column vectors in the products, because this is just the equation (*) in each column.
1.5.6.2 Remark
If each J_r has the size 1 then there exists a basis of eigenvectors and there are no generalized eigenvectors of order greater than one. In this case the matrix is diagonalisable.
1.5.7 Example
A := ( 2 0 1 0 0 0 0 0 0 0 )
     ( 1 2 0 0 0 0 0 0 0 0 )
     ( 0 0 2 0 0 0 0 0 0 0 )
     ( 0 0 0 2 0 0 0 1 0 0 )
     ( 0 0 0 0 2 0 0 0 1 0 )
     ( 0 0 0 1 0 2 0 0 0 0 )
     ( 0 0 0 0 0 0 2 0 0 0 )
     ( 0 0 0 0 0 0 0 2 0 0 )
     ( 0 0 0 0 0 0 0 0 2 0 )
     ( 0 0 0 0 0 0 0 0 0 2 )
p(lambda) = (2 - lambda)^{10}, so 2 is a 10-fold eigenvalue of A.

B := A - 2E = ( 0 0 1 0 0 0 0 0 0 0 )
              ( 1 0 0 0 0 0 0 0 0 0 )
              ( 0 0 0 0 0 0 0 0 0 0 )
              ( 0 0 0 0 0 0 0 1 0 0 )
              ( 0 0 0 0 0 0 0 0 1 0 )
              ( 0 0 0 1 0 0 0 0 0 0 )
              ( 0 0 0 0 0 0 0 0 0 0 )
              ( 0 0 0 0 0 0 0 0 0 0 )
              ( 0 0 0 0 0 0 0 0 0 0 )
              ( 0 0 0 0 0 0 0 0 0 0 )
B^2 = ( 0 0 0 0 0 0 0 0 0 0 )
      ( 0 0 1 0 0 0 0 0 0 0 )
      ( 0 0 0 0 0 0 0 0 0 0 )
      ( 0 0 0 0 0 0 0 0 0 0 )
      ( 0 0 0 0 0 0 0 0 0 0 )
      ( 0 0 0 0 0 0 0 1 0 0 )
      ( 0 0 0 0 0 0 0 0 0 0 )
      ( 0 0 0 0 0 0 0 0 0 0 )
      ( 0 0 0 0 0 0 0 0 0 0 )
      ( 0 0 0 0 0 0 0 0 0 0 )

Furthermore B^3 = 0.
(Schematic: ker B^0 ⊂ ker B ⊂ ker B^2 ⊂ ker B^3 with dimensions r_0 = 0, r_1 = 5, r_2 = 8, r_3 = 10. The complementing spaces U_1, U_2, U_3, where U_k completes ker B^{k-1} to ker B^k, have dimensions s_1 = 5, s_2 = 3, s_3 = 2.)
One has

v in ker B^k  <=>  B^k v = 0  <=>  B^{k-1}(Bv) = 0  <=>  Bv in ker B^{k-1}.

So B acts injectively from U_3 towards U_2 and from U_2 towards U_1.
(Schematic of the Jordan chains under B: b_31 -> b_21 -> b_11 and b_32 -> b_22 -> b_12 run from U_3 through U_2 to U_1; b_23 -> b_13 runs from U_2 to U_1; b_14 and b_15 lie in U_1.)
Choose a basis b_31 and b_32 of U_3.
From this define
(i) Bb_31 = b_21, Bb_21 = b_11 and Bb_11 = 0. (Jordan chain of length 3)
(ii) Bb_32 = b_22, Bb_22 = b_12 and Bb_12 = 0. (Jordan chain of length 3)
In the 3-dimensional space U_2 the vectors b_21 and b_22 are completed to a basis by b_23. So one has
(iii) Bb_23 = b_13, Bb_13 = 0. (Jordan chain of length 2)
In the end the vectors in U_1 that are already determined are completed to a basis:
(iv) Bb_14 = 0 (Jordan chain of length 1)
(v) Bb_15 = 0 (Jordan chain of length 1)
With this the map B is uniquely described in the basis b_{ij}.
If one observes
Bv = 0  <=>  (A - 2I)v = 0  <=>  Av = 2v,
Bv = w  <=>  (A - 2I)v = w  <=>  Av = 2v + w,
one has with b_11, b_21, b_31, b_12, b_22, b_32, b_13, b_23, b_14 and b_15 the following matrix representation of A:

J := ( 2 1 0 0 0 0 0 0 0 0 )
     ( 0 2 1 0 0 0 0 0 0 0 )
     ( 0 0 2 0 0 0 0 0 0 0 )
     ( 0 0 0 2 1 0 0 0 0 0 )
     ( 0 0 0 0 2 1 0 0 0 0 )
     ( 0 0 0 0 0 2 0 0 0 0 )
     ( 0 0 0 0 0 0 2 1 0 0 )
     ( 0 0 0 0 0 0 0 2 0 0 )
     ( 0 0 0 0 0 0 0 0 2 0 )
     ( 0 0 0 0 0 0 0 0 0 2 )

J is the (better: a) Jordan form of the map A.
Gather the vectors b11, b21, . . . , b15 in a matrix C. Then it follows
AC = CJ, so A = CJC1.
Calculation with numbers
U_1 is the kernel of B. It consists of all vectors having a zero in the positions 1, 3, 4, 8 and 9. Because in general there is no canonical choice of bases we describe U_1 as

U_1 = [ e_2 - e_5, e_2 + e_5, e_6 - e_2, e_7 - e_2, e_10 - e_2 ].

The kernel of B^2 consists of all vectors having a zero in the positions 3 and 8. So the basis of U_1 is completed by

U_2 = [ e_1 + e_4, e_1 - e_4, e_1 + e_9 ]

to a basis of ker B^2.
ker B^3 consists of all vectors. So we choose

U_3 := [ e_3, e_8 ].

Now construct the Jordan chains:
B e_3 = e_1, B e_1 = e_2, B e_2 = 0; these are b_31, b_21 and b_11.
B e_8 = e_4, B e_4 = e_6, B e_6 = 0; these are b_32, b_22 and b_12.
These are the chains of length 3.
In U_2 we have to complete the images of the vectors of U_3 (e_1 and e_4) to a basis. So we choose b_23 = e_1 + e_9 and build the next Jordan chain:
B(e_1 + e_9) = e_2 + e_5, B(e_2 + e_5) = 0; these are b_23 and b_13.
In U_1 the span of e_2, e_6 and e_2 + e_5 has to be completed to a basis. Therefore we choose

b_14 = e_10 - e_2 and b_15 = e_7 - e_2.
With this we have: in the basis b_11, b_21, b_31, b_12, b_22, b_32, b_13, b_23, b_14 and b_15 the map A has the form J stated above.
Here we have C = ( e_2, e_1, e_3, e_6, e_4, e_8, e_2 + e_5, e_1 + e_9, e_10 - e_2, e_7 - e_2 ) and so

C = ( 0 1 0 0 0 0 0 1  0  0 )
    ( 1 0 0 0 0 0 1 0 -1 -1 )
    ( 0 0 1 0 0 0 0 0  0  0 )
    ( 0 0 0 0 1 0 0 0  0  0 )
    ( 0 0 0 0 0 0 1 0  0  0 )
    ( 0 0 0 1 0 0 0 0  0  0 )
    ( 0 0 0 0 0 0 0 0  0  1 )
    ( 0 0 0 0 0 1 0 0  0  0 )
    ( 0 0 0 0 0 0 0 1  0  0 )
    ( 0 0 0 0 0 0 0 0  1  0 )

and

C^{-1} = ( 0 1 0 0 -1 0 1 0  0 1 )
         ( 1 0 0 0  0 0 0 0 -1 0 )
         ( 0 0 1 0  0 0 0 0  0 0 )
         ( 0 0 0 0  0 1 0 0  0 0 )
         ( 0 0 0 1  0 0 0 0  0 0 )
         ( 0 0 0 0  0 0 0 1  0 0 )
         ( 0 0 0 0  1 0 0 0  0 0 )
         ( 0 0 0 0  0 0 0 0  1 0 )
         ( 0 0 0 0  0 0 0 0  0 1 )
         ( 0 0 0 0  0 0 1 0  0 0 )

The Jordan theorem now tells us that A = C J C^{-1} and J = C^{-1} A C.
1.5.7.1 Algorithm
We look for the Jordan form and transformation matrices of an endomorphism A on R^n (or C^n), so A = C J C^{-1}.
(i) Calculate p(lambda) = det(A - lambda E) and find all zeroes. These are the eigenvalues.
(ii) For each eigenvalue lambda perform the following process:
m1  For lambda construct B := A - lambda E and determine the spaces U_i, until the dimension of the kernel of B^k (this is equal to the sum of the dimensions of the U_i) is equal to the algebraic multiplicity of lambda.
This is done iteratively: first find (with aid of the Gauss algorithm) a basis of the kernel of B. This is U_1.
Then compute B^2 and find a basis of its kernel by completing the basis of U_1 by other vectors. These completing vectors form a basis of U_2.
Now find a basis of U_3 by completing the basis of ker B^2 by some vectors to a basis of ker B^3, and so on.
m2  Now construct the Jordan chains:
The basis of U_3 (in general: U_k with the highest k) is mapped by B into U_2, and the image is completed to a basis of U_2 by vectors that have been computed in m1.
This basis is mapped by B, and the images are completed to a basis of U_1.
Each j-tuple v, Bv, ..., B^{j-1}v of basis vectors with a starting vector v in U_j forms a Jordan chain of length j.
m3  When in total o(lambda) basis vectors are found, the work is done for this eigenvalue.
(iii) Each Jordan chain v, Bv, ..., B^{j-1}v is written down in reverse order (so starting with the eigenvector) B^{j-1}v, B^{j-2}v, ..., v and gathered into the matrix C.
In the Jordan matrix J each chain corresponds to a Jordan block of size j x j having the form

J(j, lambda) = ( lambda  1       0     ...  0      )
               ( 0       lambda  1     ...  0      )
               ( ...          ...      ...  1      )
               ( 0       ...     0     lambda      )

with the eigenvalue lambda.
The Jordan matrix J is then a block diagonal matrix consisting of the single Jordan blocks.
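The chain mechanism is easy to observe for B = A - lambda*E of a single Jordan block; a minimal sketch.

```python
# For a 3x3 Jordan block, B = J(3, lam) - lam*E is nilpotent: B^3 = 0,
# and e3 -> e2 -> e1 -> 0 is a Jordan chain of length 3.

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

B = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]
B2 = matmul(B, B)
B3 = matmul(B2, B)

e3 = [0, 0, 1]
Be3 = [sum(B[i][k] * e3[k] for k in range(3)) for i in range(3)]   # = e2
print(B2, B3, Be3)   # B^2 != 0, B^3 = 0, B e3 = e2
```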
1.6 Special Properties of Symmetric Matrices
A matrix is called orthogonal iff the columns form an orthonormal basis. Equivalently one can say

A^T = A^{-1} or A^T A = A A^T = E_n.

In the complex case a matrix is called unitary if

A^* = A^{-1} or A^* A = A A^* = E_n.

The importance of these notions lies in the fact that for arbitrary vectors v and w and an orthogonal or unitary matrix A the following holds:

||Av|| = ||v|| and < Av, Aw > = < A^T A v, w > = < v, w >.

An orthogonal transformation changes neither angles nor lengths. The proof of these facts is given below.
This subsection contains facts about symmetric or hermitian matrices. Recall that a real matrix is called symmetric if A^T = A and a complex matrix hermitian if A^* = A. For real matrices these definitions coincide.
The following statements are formulated for the complex case, because the (more important) real case is contained in it.
1.6.1 Properties of Symmetric and Hermitian Matrices
Let A be a hermitian n x n matrix.
(i) The eigenvalues of A are real.
(ii) If lambda != mu are eigenvalues and v_1 and v_2 are eigenvectors to lambda resp. mu, then < v_1, v_2 > = 0.
(iii) For each eigenvalue the geometric and the algebraic multiplicity are equal.
(iv) There exists an ON-basis of eigenvectors of A.
(v) There is a unitary matrix U and a real diagonal matrix D with A = U D U^*. (Remember: U unitary <=> U^* = U^{-1}.)
1.6.2 Orthogonal Matrices
A square matrix is called orthogonal (or unitary in the complex case) if A^T A = E resp. A^* A = E. As the real case is more important, we restrict our further results to this case. The complex case can be proved
analogously.
1.6.2.1 Properties of Orthogonal Matrices
The following statements are equivalent:
(i) A is orthogonal.
(ii) AT = A1.
(iii) The columns of A form an orthonormal basis.
(iv) The rows of A form an orthonormal basis.
(v) For v, w in R^n we have < v, w > = < Av, Aw >.
(vi) For each v in R^n we have ||Av|| = ||v||.
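Properties (v) and (vi) checked numerically for a rotation matrix, a standard example of an orthogonal matrix; the angle and the vectors are illustrative choices.

```python
# A plane rotation preserves scalar products and lengths.
import math

t = 0.7
A = [[math.cos(t), -math.sin(t)],
     [math.sin(t),  math.cos(t)]]

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

v, w = [1.0, 2.0], [-3.0, 0.5]
Av, Aw = matvec(A, v), matvec(A, w)

print(abs(dot(Av, Aw) - dot(v, w)) < 1e-12)   # angles preserved
print(abs(dot(Av, Av) - dot(v, v)) < 1e-12)   # lengths preserved
```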
1.6.2.2 Further Properties
Let A be orthogonal.
(i) For v , w one has
(ix) Let v_i and v_j be elements of the ON-basis of the eigenspace of A^T A to lambda != 0. Then we have

lambda delta_{ij} = lambda < v_i, v_j > = < A^T A v_i, v_j > = < A v_i, A v_j >.

This shows that the A v_i form an orthogonal system and hence the dimension of the eigenspace of A A^T to lambda must be greater than or equal to the dimension of the corresponding eigenspace of A^T A.
By symmetry it follows that these two numbers are equal.
1.7.2 Existence and Construction of the SVD
1.7.2.1 Theorem
Let A be an m x n matrix.
Then there exists an orthogonal n x n matrix V, an orthogonal m x m matrix U, and an m x n matrix S = (s_{ij}) with s_{ii} >= 0 so that

A = U S V^T.

The matrix S = (s_{ij}) is a matrix of diagonal type, i.e. for i != j one has s_{ij} = 0.
1.7.2.2 Algorithm
m1  Form B = A^T A. This is an n x n matrix.
m2  Compute the eigenvalues lambda_i of B. These are non-negative and are numbered in the sequence lambda_1 >= lambda_2 >= ... >= lambda_k > lambda_{k+1} = ... = lambda_n = 0. The fact that k is the rank of the matrix A (and the rank of A^T A, too) can be used as a crosscheck.
m3  Find an ON-basis v_1, ..., v_n of R^n where v_i is an eigenvector to the eigenvalue lambda_i. V := [v_1, ..., v_n] becomes an orthogonal matrix (V^T = V^{-1}).
m4  The singular values of A are defined as s_i = sqrt(lambda_i). The matrix S = (s_{ij}) is a matrix of diagonal type, i.e. for i != j one has s_{ij} = 0. S has the same shape as A, i.e. n columns and m rows. The elements in the diagonal are given by the singular values: s_{ii} = s_i.
m5  For i <= k define the vectors u_i = (1/s_i) A v_i. They form an orthonormal system. Complete these vectors to an ON-basis u_1, ..., u_m of R^m and gather them into the matrix U = [u_1, ..., u_m].
m6  The singular value decomposition of A is

A = U S V^T.
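A sketch of the six steps for a 2x2 matrix chosen so that A^T A has integer eigenvalues; all intermediate quantities follow the algorithm literally.

```python
# 2x2 SVD by hand: B = A^T A, eigen-decomposition of B, singular values,
# u_i = (1/s_i) A v_i, and the final check A = U S V^T.
import math

A = [[3.0, 0.0],
     [4.0, 5.0]]

# m1: B = A^T A
B = [[25.0, 20.0], [20.0, 25.0]]
# m2: eigenvalues from lam^2 - tr*lam + det = 0  ->  45 and 5
tr, det = 50.0, 25.0 * 25.0 - 20.0 * 20.0
d = math.sqrt(tr * tr - 4.0 * det)
l1, l2 = (tr + d) / 2.0, (tr - d) / 2.0
# m3: ON-basis of eigenvectors of B -> columns of V
r = 1.0 / math.sqrt(2.0)
V = [[r, -r], [r, r]]        # v1 = (1,1)/sqrt2 to 45, v2 = (-1,1)/sqrt2 to 5
# m4: singular values
s1, s2 = math.sqrt(l1), math.sqrt(l2)
# m5: u_i = (1/s_i) A v_i
u1 = [(A[0][0] + A[0][1]) * r / s1, (A[1][0] + A[1][1]) * r / s1]
u2 = [(-A[0][0] + A[0][1]) * r / s2, (-A[1][0] + A[1][1]) * r / s2]
U = [[u1[0], u2[0]], [u1[1], u2[1]]]
# m6: check A = U S V^T entrywise
S = [[s1, 0.0], [0.0, s2]]
SVt = [[S[i][0] * V[j][0] + S[i][1] * V[j][1] for j in range(2)] for i in range(2)]
USVt = [[U[i][0] * SVt[0][j] + U[i][1] * SVt[1][j] for j in range(2)] for i in range(2)]
ok = all(abs(USVt[i][j] - A[i][j]) < 1e-9 for i in range(2) for j in range(2))
print(s1, s2, ok)   # singular values sqrt(45) and sqrt(5)
```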
1.7.2.3 Remark
In many cases the vectors in V and U belonging to the eigenvalue zero are not needed. In this case the entries are denoted by stars (*) and are not explicitly calculated. This is called the simplified version of the SVD.
1.7.2.4 Further Properties
If A = U S V^T is the SVD of A then A^T has the SVD A^T = V S^T U^T. If A is invertible, then A^{-1} = V S^{-1} U^T.
1.8 Generalized Inverses
The singular value decomposition can be used to construct approximate
solutions of (possibly) non-square linear equation systems.
Given an m x n matrix A and a vector b in R^m we are looking for a vector x in R^n so that the norm

||Ax - b||_2 = min!
Substituting the SVD of A and remembering that for the orthogonal matrix U the matrix U^T = U^{-1} is orthogonal, too, with ||U^T u|| = ||u|| for each u in R^m, we get

||Ax - b||_2 = ||U S V^T x - b||_2 = ||U^T U S V^T x - U^T b||_2 = ||S z - d||_2 with z := V^T x, d := U^T b.   (*)

The solutions of this minimization problem are given by

z_j = (1/s_j) d_j for j = 1, ..., k;   z_j arbitrary for j > k.

As V is orthogonal we get all solutions x as

x = V z = sum_{j=1}^k (1/s_j) d_j v_j + sum_{j=k+1}^n z_j v_j.

Because V is orthogonal, the norm of x is given by ( sum_{j=1}^n z_j^2 )^{1/2}. Therefore the solution with the smallest norm is

x^+ = V z = sum_{j=1}^k (1/s_j) d_j v_j.

This solution is called the pseudo-normal solution. One sees that the mapping b -> x^+ is given by the matrix A^+ := V S~ U^T with the n x m diagonal-type matrix S~ := (sigma_i delta_{ij}) where sigma_i is defined by sigma_i = 1/s_i for i <= k and sigma_i = 0 for i > k.
1.8.0.5 Definition
The so defined matrix A+ is called generalized inverse or
Moore-Penrose-inverse of A.
1.8.0.6 Further Properties
We have (AT)+ = (A+)T.
1.8.1 Special case: A injective
If A is injective then A has the rank n and the pseudo-normal solution of every equation Ax = b is unique. Furthermore in this case A^T A is invertible (because rank A = n and the rank of A^T A is equal to the rank of A).
In this case we can calculate x^+ without explicit construction of the SVD: using A^T A = V S^T U^T U S V^T = V S^T S V^T we get from the equation (*) above:

S V^T x^+ = U^T b  =>  V S^T S V^T x^+ = V S^T U^T b  =>  A^T A x^+ = A^T b  =>  x^+ = (A^T A)^{-1} A^T b.

So in this case

A^+ = (A^T A)^{-1} A^T.
If one wants only x+ it is sufficient to solve ATAx+ = ATb.
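The injective special case in action for an overdetermined 3x2 system; the data are an illustrative choice.

```python
# x+ = (A^T A)^{-1} A^T b via the 2x2 normal equations A^T A x+ = A^T b.
A = [[1.0, 0.0],
     [1.0, 1.0],
     [1.0, 2.0]]
b = [1.0, 2.0, 4.0]

# N = A^T A (2x2) and c = A^T b
N = [[sum(A[i][r] * A[i][s] for i in range(3)) for s in range(2)] for r in range(2)]
c = [sum(A[i][r] * b[i] for i in range(3)) for r in range(2)]

# solve N x+ = c by inverting the 2x2 matrix directly
det = N[0][0] * N[1][1] - N[0][1] * N[1][0]
x_plus = [(N[1][1] * c[0] - N[0][1] * c[1]) / det,
          (N[0][0] * c[1] - N[1][0] * c[0]) / det]
print(x_plus)   # the least-squares solution of Ax = b
```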
1.9 Applications to linear equation systems
1.9.1 Errors
1.9.1.1 Introductory example
Ax = b with

A = ( 2  3  4     )   and b = ( 1 )
    ( 2  3  4.001 )           ( 1 )
    ( 3  4  5     )           ( 1 )

One easily sees that A is invertible. The solution x is uniquely determined:
The exact solution is x = [-1, 1, 0]^T. On the other hand y = [-0.5, 0, 0.5]^T is not far from the solution because of Ay = [1, 1.0005, 1]^T. From this one sees that the given equation system is very unstable with respect to perturbations.
If one calculates the solution of the slightly perturbed system A x_1 = b_1 with b_1 = [1, 0.9, 1]^T one gets x_1 = [-101, 201, -100]^T.
1.9.1.2 Theorem
Let x be the solution of Ax = b. If we compare the solution x + Delta x of the disturbed system A(x + Delta x) = b + Delta b with x, we get the relative error

||Delta x|| / ||x|| <= ||A|| ||A^{-1}|| ||Delta b|| / ||b||.

The number kappa(A) = cond A = ||A|| ||A^{-1}|| is called the condition of A. With a little more effort it is possible to prove:
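A sketch of the condition number in the infinity norm for a nearly singular 2x2 matrix; the matrix is an illustrative choice.

```python
# kappa(A) = ||A||_inf * ||A^{-1}||_inf for a nearly singular matrix.
A = [[1.0, 1.0],
     [1.0, 1.0001]]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
Ainv = [[ A[1][1] / det, -A[0][1] / det],
        [-A[1][0] / det,  A[0][0] / det]]

def norm_inf(M):
    return max(abs(M[i][0]) + abs(M[i][1]) for i in range(2))

kappa = norm_inf(A) * norm_inf(Ainv)
print(kappa)   # on the order of 4*10^4: small input changes are amplified
```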
Theorem   If x is the solution of Ax = b and x + Delta x the solution of (A + Delta A)(x + Delta x) = b + Delta b, then the following estimate for the relative error holds:

||Delta x|| / ||x|| <= kappa(A) / ( 1 - kappa(A) ||Delta A|| / ||A|| ) * ( ||Delta b|| / ||b|| + ||Delta A|| / ||A|| ).

For small values of ||Delta A|| the right side is approximately equal to

kappa(A) ( ||Delta b|| / ||b|| + ||Delta A|| / ||A|| ).

1.9.2 Numerical Rank Deficiency
Numerical rank deficiency appears if a matrix is close to another matrix with smaller rank. This leads to a very large condition number: small variations in the initial data of Ax = b lead to large variations in the result x.
In the introductory example the SVD of A is A = U S V^T with the singular values s_1 ≈ 10, s_2 ≈ 0.4 and s_3 ≈ 1/3000.
To avoid these effects one can proceed as follows:
m1  Decompose A = U S V^T.
m2  The matrix S_1 is built out of S by replacing all entries smaller than a given bound by zero, and A_1 = U S_1 V^T.
This is reasonable: one can prove that entries in S that are smaller than the machine accuracy multiplied by the Frobenius norm of the matrix will have no influence on the result.
m3  Instead of the solutions of Ax = b find the pseudo-normal solutions of A_1 x = b with

x^+ = A_1^+ b = V S_1^+ U^T b
In the example one has A = USVT with
S = 10.3873 0 0
0 0.3338 0
0 0 0.0003 and orthogonal matrices U and V.We change the third singular value to zero an get
S1 =
10.3873 0 00 0.3338 0
0 0 0
and S+1 =
0.0963 0 00 2.9961 0
0 0 0
.
Then
A1⁺ =
   1.1633   1.1674  −1.8314
   0.1669   0.1676  −0.3342
   0.8316   0.8344  −1.1662
and
x⁺ = A1⁺ (1, 1, 1)ᵀ = (0.4992, 0.0002, 0.4997)ᵀ and x1⁺ = A1⁺ (1, 0.9, 1)ᵀ = (0.3825, −0.0165, 0.4163)ᵀ.
In the original problem we have
Ax⁺ = (0.9998, 0.9999, 1.0001)ᵀ and Ax1⁺ ≈ (0.9499, 0.9499, 1.0003)ᵀ.
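The numbers of the example can be checked directly; the matrix entries are copied as printed (rounded to four decimals), so the products match only to about three decimals:

```python
import numpy as np

# pseudo-inverse of the example, entries rounded to four decimals
A1_plus = np.array([[1.1633, 1.1674, -1.8314],
                    [0.1669, 0.1676, -0.3342],
                    [0.8316, 0.8344, -1.1662]])

x_plus  = A1_plus @ np.array([1.0, 1.0, 1.0])   # ≈ ( 0.4992,  0.0002, 0.4997)
x1_plus = A1_plus @ np.array([1.0, 0.9, 1.0])   # ≈ ( 0.3825, -0.0165, 0.4163)
```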
1.9.3 Application: Best Fit Functions
Other name: Gauß's method of least squares
1.9.3.1 Most important case: best fit straight line
The starting point are n > 2 pairs of coordinates (xᵢ, yᵢ) such that at least two different x-values occur.
We search for a line y = ax + b with the property that the quadratic error
Σᵢ₌₁ⁿ ((a xᵢ + b) − yᵢ)²
is as small as possible.
The solution of this problem is the pseudo-normal solution of
b + a x1 = y1, …, b + a xn = yn,
or A (b, a)ᵀ = y with
A =
  1  x1
  ⋮  ⋮
  1  xn
and y = (y1, …, yn)ᵀ.
As the matrix describes an injective map, the solution is obtained with the aid of the transposed matrix:
(b, a)ᵀ = (AᵀA)⁻¹ Aᵀ y.
The coefficient of correlation r measures the quality of the approximation. We always have |r| ≤ 1, and for r = ±1 the line goes through all points.
Algorithm
All sums run from i = 1 to n.
m1 Δ = n Σ xᵢ² − (Σ xᵢ)²
m2 The best fit straight line y = ax + b has the coefficients
a = (1/Δ) (n Σ xᵢyᵢ − Σ xᵢ Σ yᵢ) and b = (1/Δ) (Σ xᵢ² Σ yᵢ − Σ xᵢyᵢ Σ xᵢ).
m3 r = (n Σ xᵢyᵢ − Σ xᵢ Σ yᵢ) / ( √(n Σ xᵢ² − (Σ xᵢ)²) √(n Σ yᵢ² − (Σ yᵢ)²) )
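The algorithm translates directly into code; `best_fit_line` is a name chosen here for illustration:

```python
import numpy as np

def best_fit_line(x, y):
    """Coefficients a, b of y = a*x + b and correlation r (steps m1-m3)."""
    n = len(x)
    sx, sy = x.sum(), y.sum()
    sxx, syy, sxy = (x * x).sum(), (y * y).sum(), (x * y).sum()
    delta = n * sxx - sx**2                                     # m1
    a = (n * sxy - sx * sy) / delta                             # m2
    b = (sxx * sy - sxy * sx) / delta
    r = (n * sxy - sx * sy) / np.sqrt((n * sxx - sx**2) * (n * syy - sy**2))  # m3
    return a, b, r

a, b, r = best_fit_line(np.array([0.0, 1.0, 2.0]), np.array([1.0, 3.0, 5.0]))
# the points lie exactly on y = 2x + 1, so a = 2, b = 1 and r = 1
```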
Second method
Find the mean values x̄ = (1/n) Σₖ₌₁ⁿ xₖ and ȳ = (1/n) Σₖ₌₁ⁿ yₖ. Shift the coordinate system so that (x̄, ȳ) is the new origin by replacing xₖ by xₖ − x̄ resp. yₖ by yₖ − ȳ. Then the best fit straight line is given by
y = v x with v = Σₖ₌₁ⁿ xₖyₖ / Σₖ₌₁ⁿ xₖ²
and
r = Σₖ₌₁ⁿ xₖyₖ / ( √(Σₖ₌₁ⁿ xₖ²) √(Σₖ₌₁ⁿ yₖ²) ) = ⟨x, y⟩ / (‖x‖ ‖y‖).
Here it is easy to see that the coefficient of correlation describes the relative error in the approximation:
Σₖ₌₁ⁿ (v xₖ − yₖ)² / Σₖ₌₁ⁿ yₖ² = 1 − r².
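The identity is easy to verify numerically with arbitrary centered data:

```python
import numpy as np

# arbitrary sample data, centered so that the means are zero
x = np.array([1.0, 2.0, 4.0, 5.0]); x -= x.mean()
y = np.array([2.0, 1.0, 5.0, 4.0]); y -= y.mean()

v = (x * y).sum() / (x * x).sum()               # slope of the line y = v*x
r = (x * y).sum() / np.sqrt((x * x).sum() * (y * y).sum())

lhs = ((v * x - y) ** 2).sum() / (y * y).sum()  # relative quadratic error
# lhs equals 1 - r**2
```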
1.9.3.2 General problem
Let (xᵢ, yᵢ), i = 1, …, n be n pairs of data. Furthermore let f1, …, fk be k < n functions. We look for a linear combination f(x) = Σⱼ₌₁ᵏ αⱼ fⱼ(x) of the fⱼ so that the sum of the squares of the deviations of f(xᵢ) from yᵢ becomes minimal:
F = Σᵢ₌₁ⁿ (f(xᵢ) − yᵢ)² = Σᵢ₌₁ⁿ ( Σⱼ₌₁ᵏ αⱼ fⱼ(xᵢ) − yᵢ )² → min.
Solution: Find the pseudo-normal solution of Aa = y. Here a = (α1, …, αk)ᵀ contains the coefficients we look for and
A =
  f1(x1)  f2(x1)  ⋯  fk(x1)
  f1(x2)  f2(x2)  ⋯  fk(x2)
  ⋮       ⋮       ⋱  ⋮
  f1(xn)  f2(xn)  ⋯  fk(xn)
and y = (y1, y2, …, yn)ᵀ.
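In practice the pseudo-normal solution of Aa = y is computed with a least-squares routine such as `numpy.linalg.lstsq`. A sketch with assumed data and the basis functions f1 = 1, f2 = x, f3 = x²:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])        # assumed sample data
y = np.array([1.0, 2.1, 4.9, 10.2, 16.8])

fs = [np.ones_like, lambda t: t, lambda t: t * t]   # f1, f2, f3
A = np.column_stack([f(x) for f in fs])             # A[i, j] = f_j(x_i)

alpha, *_ = np.linalg.lstsq(A, y, rcond=None)       # minimizes F = ||A a - y||^2
F = ((A @ alpha - y) ** 2).sum()                    # minimal quadratic error
```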
1.10 Symmetric Matrices and Quadratic
Forms
A quadratic form on Rⁿ is a map of the form
x = (x1, …, xn)ᵀ ↦ Q(x) = Σᵢ,ⱼ₌₁ⁿ cᵢⱼ xᵢ xⱼ.
The cᵢⱼ are real numbers with cᵢⱼ = cⱼᵢ. With the symmetric matrix C = (cᵢⱼ)ᵢ,ⱼ₌₁…ₙ this is written as
Q(x) = xᵀCx.
Conversely, Q is called the quadratic form that belongs to C.
Let C = UDUᵀ with a real diagonal matrix D containing the eigenvalues of C and an orthogonal matrix U. Then
Q_C(x) = xᵀCx = xᵀUDUᵀx = (Uᵀx)ᵀD(Uᵀx).
If the columns of U are the (ON-)vectors u1, …, un, then Uᵀx contains the coefficients of x in this basis. If these are denoted by y1, …, yn, then with y = Uᵀx one has
Q_C(x) = yᵀDy = Σₖ₌₁ⁿ λₖ yₖ².
From this one sees immediately, e.g., that Q_C(x) is positive for non-zero vectors iff all eigenvalues of C are positive.
This leads to the definition:
A quadratic form is called
positive definite if Q(x) > 0 for x ≠ 0 ⟺ λ > 0 for all eigenvalues λ of C,
positive semidefinite if Q(x) ≥ 0 for all x ⟺ λ ≥ 0 for all eigenvalues λ of C,
negative definite if Q(x) < 0 for x ≠ 0 ⟺ λ < 0 for all eigenvalues λ of C,
negative semidefinite if Q(x) ≤ 0 for all x ⟺ λ ≤ 0 for all eigenvalues λ of C,
definite if Q is negative or positive definite,
indefinite if there are x and y with Q(x) < 0 < Q(y) ⟺ C has positive and negative eigenvalues.
(Dangerous) notation: C positive definite: C > 0, C positive semidefinite: C ≥ 0, C negative (semi)definite: C < 0 (C ≤ 0).
A symmetric matrix is called positive/negative (semi)definite or indefinite if this is true for the corresponding quadratic form.
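The eigenvalue characterizations can be bundled into a small test routine; this is a sketch, and the tolerance `eps` is an assumption made here to absorb rounding errors:

```python
import numpy as np

def definiteness(C, eps=1e-12):
    """Classify the symmetric matrix C via its eigenvalues."""
    ev = np.linalg.eigvalsh(C)          # real eigenvalues, ascending order
    if ev[0] > eps:
        return "positive definite"
    if ev[-1] < -eps:
        return "negative definite"
    if ev[0] >= -eps:
        return "positive semidefinite"
    if ev[-1] <= eps:
        return "negative semidefinite"
    return "indefinite"

print(definiteness(np.array([[2.0, 1.0], [1.0, 2.0]])))   # eigenvalues 1, 3
```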
Remark
A is positive [semi]definite ⟺ −A is negative [semi]definite.
Hurwitz Criterion
The Hurwitz criterion is useful to determine the definiteness of a matrix without calculating the eigenvalues.
In the symmetric n×n matrix A one forms, starting from the left upper corner, the square submatrices of sizes 1, 2, …, n. The determinants of these submatrices are called D1 to Dn; Dₖ is the determinant of the upper left k×k block of A. We have D1 = a11, D2 = a11a22 − a12a21, and finally Dn is the determinant of A. Then the following holds:
(i) D1 > 0, D2 > 0, D3 > 0, D4 > 0 etc. ⟺ A pos. definite. (Dₖ > 0)
(ii) D1 < 0, D2 > 0, D3 < 0, D4 > 0 etc. ⟺ A neg. definite. ((−1)ᵏ Dₖ > 0)
(iii) A pos. semidefinite ⟹ D1 ≥ 0, D2 ≥ 0, D3 ≥ 0, D4 ≥ 0 etc. (Dₖ ≥ 0)
(iv) A neg. semidefinite ⟹ D1 ≤ 0, D2 ≥ 0, D3 ≤ 0, D4 ≥ 0 etc. ((−1)ᵏ Dₖ ≥ 0)
(v) If neither (iii) nor (iv) holds, A is indefinite.
Especially, A is indefinite if Dₖ < 0 holds for an even number k. Please pay attention to the fact that A may be indefinite even if always Dₖ ≥ 0 or (−1)ᵏ Dₖ ≥ 0 holds. In this case at least one Dₖ has to be zero.
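The minors Dₖ are simple to compute. As an illustration, the matrix C below belongs to the quadratic form x² + 4xy + 2xz + 8y² + 16yz + 9z² that is treated in the next subsection:

```python
import numpy as np

def hurwitz_minors(A):
    """Leading principal minors D1, ..., Dn of the matrix A."""
    return [np.linalg.det(A[:k, :k]) for k in range(1, len(A) + 1)]

# symmetric matrix of the quadratic form x^2 + 4xy + 2xz + 8y^2 + 16yz + 9z^2
C = np.array([[1.0, 2.0, 1.0],
              [2.0, 8.0, 8.0],
              [1.0, 8.0, 9.0]])
D1, D2, D3 = hurwitz_minors(C)
# D1 = 1 > 0, D2 = 4 > 0, D3 = -4 < 0: neither (iii) nor (iv) holds,
# so C is indefinite by (v)
```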
Quadratic Completion
Another possibility to determine the definiteness of a quadratic form is
quadratic completion. The method is explained with the example
Q(x) = x² + 4xy + 2xz + 8y² + 16yz + 9z².
m1 Choose one variable xⱼ with a non-vanishing coefficient of xⱼ². Here we choose x. If such a choice is impossible, the quadratic form is indefinite.
m2 Gather all terms that contain x:
Q(x) = (x² + 4xy + 2xz) + (8y² + 16yz + 9z²)
m3 Use the following to complete the square:
(a + b + c + d + ⋯)² = a² + b² + c² + d² + ⋯ + 2(ab + ac + ad + ⋯ + bc + bd + ⋯ + cd + ⋯)
Q(x) = (x + 2y + z)² + ⋯
m4 Subtract the terms that are not contained in the bracket from step m2:
Q(x) = (x + 2y + z)² + (−4y² − z² − 4yz) + (8y² + 16yz + 9z²)
     = (x + 2y + z)² + (4y² + 12yz + 8z²)
m5 Now the second bracket contains no x. Continue with m1 applied to the second bracket. Choose y.
m6 Q(x) = (x + 2y + z)² + (4y² + 12yz) + 8z²
        = (x + 2y + z)² + (2y + 3z)² − 9z² + 8z²
        = (x + 2y + z)² + (2y + 3z)² − z².
This is a sum of squares with two plus signs and one minus sign. This means that the corresponding matrix has two positive and one negative eigenvalue, and Q is indefinite.
Further examples
Q(x) = x² + 4xy + 2xz + 8y² + 16yz + 10z² is positive semidefinite, and
Q(x) = x² + 4xy + 2xz + 8y² + 16yz + 11z² is positive definite.
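The classification of the three example forms can be confirmed via the eigenvalues of the associated symmetric matrices:

```python
import numpy as np

# symmetric matrices of Q(x) = x^2 + 4xy + 2xz + 8y^2 + 16yz + c*z^2
def C(c):
    return np.array([[1.0, 2.0, 1.0],
                     [2.0, 8.0, 8.0],
                     [1.0, 8.0, c]])

ev9, ev10, ev11 = (np.linalg.eigvalsh(C(c)) for c in (9.0, 10.0, 11.0))
# c = 9:  one negative, two positive eigenvalues -> indefinite
# c = 10: smallest eigenvalue is 0               -> positive semidefinite
# c = 11: all eigenvalues positive               -> positive definite
```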
1.11 QR-Decomposition
Theorem
Let A be a matrix with m rows and n ≤ m columns. Then there exist an orthogonal matrix Q and an upper triangular matrix R with A = QR. Upper triangular means that for R = (rᵢⱼ) one has rᵢⱼ = 0 for j < i.
Proof 1 - Jacobi method, Givens rotations
The case n = 1 or m = 1 is trivial. Now let us first look at the case m = 2.
We are looking for an orthogonal 2×2 matrix Q with A = QR and r21 = 0. With
Q =
  u  −v
  v   u
where u² + v² = 1,
R =
  r11  r12
  0    r22
and A =
  a  b
  c  d
the condition QᵀA = R gives in particular −va + uc = 0 for the entry below the diagonal. This can be fulfilled with
u = a/√(a² + c²) and v = c/√(a² + c²).
In the case c = 0 one simply takes Q = E2.
With Q0 = E and R0 = A, for each element below the diagonal an operation QᵢᵀRᵢ = Rᵢ₊₁ is performed. Here Qᵢ agrees with the unit matrix except for four entries: in the two rows and columns of the diagonal element a and the element c to be eliminated one places the entries u, −v, v, u of the 2×2 rotation.
From this one sees: the same values of u and v as above eliminate the c-element with an orthogonal matrix Qᵢ, and the rest of the column that contains a and c is not changed.
So we have A = Q0R0 = Q0Q1R1 = ⋯ = Q0⋯QₖRₖ =: QR with Q = Q0⋯Qₖ and R = Rₖ.
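A sketch of the whole Givens procedure, with the embedding of the 2×2 rotation into an identity matrix done explicitly (`givens_qr` is a name chosen here for illustration):

```python
import numpy as np

def givens_qr(A):
    """QR decomposition via Givens rotations as constructed above."""
    m, n = A.shape
    Q, R = np.eye(m), A.astype(float).copy()
    for j in range(n):                      # eliminate entries below R[j, j]
        for i in range(j + 1, m):
            a, c = R[j, j], R[i, j]
            d = np.hypot(a, c)
            if d == 0.0:                    # nothing to eliminate
                continue
            u, v = a / d, c / d
            G = np.eye(m)                   # embed the 2x2 rotation [u -v; v u]
            G[[j, i], [j, i]] = u
            G[j, i], G[i, j] = -v, v
            R = G.T @ R                     # -v*a + u*c = 0 clears R[i, j]
            Q = Q @ G                       # accumulate Q = Q0 * Q1 * ...
    return Q, R
```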
Proof 2 - Householder Transformations
The Jacobi method needs n²/2 steps. This method uses only n − 1 steps:
The idea is to use a series of reflections that map the parts of the columns below the diagonal to zeroes.
After some steps we have the matrix Rₖ whose first k − 1 columns are already in upper triangular form; the remaining part of column k is denoted bₖ.
The lower part of column k, bₖ, shall be mapped onto a multiple of eₖ. Let cₖ be a vector equal to bₖ, but with zeroes in the first k − 1 positions. So define
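A single reflection of this kind, mapping a vector onto a multiple of e1, can be sketched as follows; the function name and the sign convention for α are choices made here:

```python
import numpy as np

def householder_reflect(b):
    """Reflection H = E - 2 w w^T / (w^T w) with H b a multiple of e1.
    Assumes b != 0."""
    n = len(b)
    alpha = -np.linalg.norm(b) if b[0] >= 0 else np.linalg.norm(b)
    w = b - alpha * np.eye(n)[0]          # w = b - alpha*e1
    return np.eye(n) - 2.0 * np.outer(w, w) / (w @ w)

b = np.array([3.0, 4.0])
H = householder_reflect(b)
# H @ b = (-5, 0): b is reflected onto a multiple of e1
```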