Tom M. Apostol
CALCULUS, VOLUME II
Multi-Variable Calculus and Linear Algebra, with Applications to Differential Equations and Probability
SECOND EDITION
John Wiley & Sons: New York, London, Sydney, Toronto

CONSULTING EDITOR
George Springer, Indiana University
COPYRIGHT © 1969 BY XEROX CORPORATION. All rights reserved. No part of the material covered by this copyright may be reproduced in any form, or by any means of reproduction. Previous edition copyright © 1962 by Xerox Corporation.
Library of Congress Catalog Card Number: 67-14605. ISBN 0 471 00007 8. Printed in the United States of America.
10 9 8 7 6 5 4 3 2
To
PREFACE
This book is a continuation of the author’s Calculus, Volume I, Second Edition. The present volume has been written with the same underlying philosophy that prevailed in the first. Sound training in technique is combined with a strong theoretical development. Every effort has been made to convey the spirit of modern mathematics without undue emphasis on formalization. As in Volume I, historical remarks are included to give the student a sense of participation in the evolution of ideas.
The second volume is divided into three parts, entitled Linear Analysis, Nonlinear Analysis, and Special Topics. The last two chapters of Volume I have been repeated as the first two chapters of Volume II so that all the material on linear algebra will be complete in one volume.
Part 1 contains an introduction to linear algebra, including linear transformations, matrices, determinants, eigenvalues, and quadratic forms. Applications are given to analysis, in particular to the study of linear differential equations. Systems of differential equations are treated with the help of matrix calculus. Existence and uniqueness theorems are proved by Picard’s method of successive approximations, which is also cast in the language of contraction operators.
Part 2 discusses the calculus of functions of several variables. Differential calculus is unified and simplified with the aid of linear algebra. It includes chain rules for scalar and vector fields, and applications to partial differential equations and extremum problems. Integral calculus includes line integrals, multiple integrals, and surface integrals, with applications to vector analysis. Here the treatment is along more or less classical lines and does not include a formal development of differential forms.
The special topics treated in Part 3 are Probability and Numerical Analysis. The material on probability is divided into two chapters, one dealing with finite or countably infinite sample spaces; the other with uncountable sample spaces, random variables, and distribution functions. The use of the calculus is illustrated in the study of both one- and two-dimensional random variables.
The last chapter contains an introduction to numerical analysis, the chief emphasis being on different kinds of polynomial approximation. Here again the ideas are unified by the notation and terminology of linear algebra. The book concludes with a treatment of approximate integration formulas, such as Simpson’s rule, and a discussion of Euler’s summation formula.
There is ample material in this volume for a full year’s course meeting three or four times per week. It presupposes a knowledge of one-variable calculus as covered in most first-year calculus courses. The author has taught this material in a course with two lectures and two recitation periods per week, allowing about ten weeks for each part and omitting the starred sections.
This second volume has been planned so that many chapters can be omitted for a variety of shorter courses. For example, the last chapter of each part can be skipped without disrupting the continuity of the presentation. Part 1 by itself provides material for a combined course in linear algebra and ordinary differential equations. The individual instructor can choose topics to suit his needs and preferences by consulting the diagram on the next page which shows the logical interdependence of the chapters.
Once again I acknowledge with pleasure the assistance of many friends and colleagues. In preparing the second edition I received valuable help from Professors Herbert S. Zuckerman of the University of Washington, and Basil Gordon of the University of California, Los Angeles, each of whom suggested a number of improvements. Thanks are also due to the staff of Blaisdell Publishing Company for their assistance and cooperation.
As before, it gives me special pleasure to express my gratitude to my wife for the many ways in which she has contributed. In grateful acknowledgement I happily dedicate this book to her.
T. M. A. Pasadena, California September 16, 1968
Logical Interdependence of the Chapters
[Diagram: a chart showing the logical interdependence of the chapters, connecting Linear Spaces, Linear Transformations and Matrices, Determinants, Eigenvalues and Eigenvectors, Differential Equations, Systems of Differential Equations, Calculus of Scalar and Vector Fields, Applications of Differential Calculus, Line Integrals, Set Functions and Elementary Probability, and Calculus of Probabilities.]
CONTENTS
PART 1. LINEAR ANALYSIS
1. LINEAR SPACES
1.3 Examples of linear spaces
1.4 Elementary consequences of the axioms
1.5 Exercises
1.7 Dependent and independent sets in a linear space
1.8 Bases and dimension
1.13 Exercises
1.15 Orthogonal complements. Projections
1.16 Best approximation of elements in a Euclidean space by elements in a finite-dimensional subspace
2. LINEAR TRANSFORMATIONS AND MATRICES
2.1 Linear transformations
2.3 Nullity and rank
2.6 Inverses
2.11 Construction of a matrix representation in diagonal form
2.12 Exercises
2.14 Isomorphism between linear transformations and matrices
2.15 Multiplication of matrices
2.18 Computation techniques
2.20 Exercises
3. DETERMINANTS
3.1 Introduction
3.2 Motivation for the choice of axioms for a determinant function
3.3 A set of axioms for a determinant function
3.4 Computation of determinants
3.5 The uniqueness theorem
3.8 The determinant of the inverse of a nonsingular matrix
3.9 Determinants and independence of vectors
3.10 The determinant of a block-diagonal matrix
3.11 Exercises
3.13 Existence of the determinant function
3.14 The determinant of a transpose
3.15 The cofactor matrix
3.16 Cramer’s rule
4. EIGENVALUES AND EIGENVECTORS
4.1 Linear transformations with diagonal matrix representations
4.2 Eigenvectors and eigenvalues of a linear transformation
4.3 Linear independence of eigenvectors corresponding to distinct eigenvalues
4.4 Exercises
*4.5 The finite-dimensional case. Characteristic polynomials
4.6 Calculation of eigenvalues and eigenvectors in the finite-dimensional case
4.7 Trace of a matrix
4.8 Exercises
4.9 Matrices representing the same linear transformation. Similar matrices
4.10 Exercises
5. EIGENVALUES OF OPERATORS ACTING ON EUCLIDEAN SPACES
5.1 Eigenvalues and inner products
5.2 Hermitian and skew-Hermitian transformations
5.3 Eigenvalues and eigenvectors of Hermitian and skew-Hermitian operators
5.4 Orthogonality of eigenvectors corresponding to distinct eigenvalues
5.5 Exercises
5.6 Existence of an orthonormal set of eigenvectors for Hermitian and skew-Hermitian operators acting on finite-dimensional spaces
5.7 Matrix representations for Hermitian and skew-Hermitian operators
5.8 Hermitian and skew-Hermitian matrices. The adjoint of a matrix
5.9 Diagonalization of a Hermitian or skew-Hermitian matrix
5.10 Unitary matrices. Orthogonal matrices
5.11 Exercises
5.12 Quadratic forms
5.13 Reduction of a real quadratic form to a diagonal form
5.14 Applications to analytic geometry
5.15 Exercises
*5.16 Eigenvalues of a symmetric transformation obtained as values of its quadratic form
*5.17 Extremal properties of eigenvalues of a symmetric transformation
*5.18 The finite-dimensional case
5.19 Unitary transformations
5.20 Exercises
6. LINEAR DIFFERENTIAL EQUATIONS
6.1 Historical introduction
6.2 Review of results concerning linear equations of first and second orders
6.3 Exercises
6.5 The existence-uniqueness theorem
6.6 The dimension of the solution space of a homogeneous linear equation
6.7 The algebra of constant-coefficient operators
6.8 Determination of a basis of solutions for linear equations with constant coefficients by factorization of operators
6.9 Exercises
6.10 The relation between the homogeneous and nonhomogeneous equations
6.11 Determination of a particular solution of the nonhomogeneous equation. The method of variation of parameters
6.12 Nonsingularity of the Wronskian matrix of n independent solutions of a homogeneous linear equation
6.13 Special methods for determining a particular solution of the nonhomogeneous equation. Reduction to a system of first-order linear equations
6.14 The annihilator method for determining a particular solution of the nonhomogeneous equation
6.15 Exercises
6.17 Linear equations of second order with analytic coefficients
6.18 The Legendre equation
6.19 The Legendre polynomials
6.21 Exercises
6.23 The Bessel equation
7. SYSTEMS OF DIFFERENTIAL EQUATIONS
7.2 Calculus of matrix functions
7.3 Infinite series of matrices. Norms of matrices
7.4 Exercises
7.6 The differential equation satisfied by e^{tA}
7.7 Uniqueness theorem for the matrix differential equation F'(t) = AF(t)
7.8 The law of exponents for exponential matrices
7.9 Existence and uniqueness theorems for homogeneous linear systems
7.11 The Cayley-Hamilton theorem
7.12 Exercises
7.14 Alternate methods for calculating e^{tA} in special cases
7.15 Exercises
7.17 Exercises
7.18 The general linear system Y'(t) = P(t)Y(t) + Q(t)
7.19 A power-series method for solving homogeneous linear systems
7.20 Exercises
7.21 Proof of the existence theorem by the method of successive approximations
7.22 The method of successive approximations applied to first-order nonlinear systems
7.23 Proof of an existence-uniqueness theorem for first-order nonlinear systems
7.24 Exercises
*7.26 Normed linear spaces
*7.27 Contraction operators
PART 2. NONLINEAR ANALYSIS
8. DIFFERENTIAL CALCULUS OF SCALAR AND VECTOR FIELDS
8.1 Functions from R^n to R^m. Scalar and vector fields
8.2 Open balls and open sets
8.3 Exercises
8.4 Limits and continuity
8.5 Exercises
8.6 The derivative of a scalar field with respect to a vector
8.7 Directional derivatives and partial derivatives
8.8 Partial derivatives of higher order
8.9 Exercises
8.11 The total derivative
8.13 A sufficient condition for differentiability
8.14 Exercises
8.16 Applications to geometry. Level sets. Tangent planes
8.17 Exercises
8.20 The chain rule for derivatives of vector fields
8.21 Matrix form of the chain rule
8.22 Exercises
*8.23 Sufficient conditions for the equality of mixed partial derivatives
8.24 Miscellaneous exercises
9. APPLICATIONS OF DIFFERENTIAL CALCULUS
9.1 Partial differential equations
9.2 A first-order partial differential equation with constant coefficients
9.3 Exercises
9.5 Exercises
9.7 Worked examples
9.10 Second-order Taylor formula for scalar fields
9.11 The nature of a stationary point determined by the eigenvalues of the Hessian matrix
9.12 Second-derivative test for extrema of functions of two variables
9.13 Exercises
9.15 Exercises
9.17 The small-span theorem for continuous scalar fields (uniform continuity)
10. LINE INTEGRALS
10.1 Introduction
10.2 Paths and line integrals
10.5 Exercises
10.6 The concept of work as a line integral
10.7 Line integrals with respect to arc length
10.8 Further applications of line integrals
10.9 Exercises
10.10 Open connected sets. Independence of the path
10.11 The second fundamental theorem of calculus for line integrals
10.12 Applications to mechanics
10.13 Exercises
10.14 The first fundamental theorem of calculus for line integrals
10.15 Necessary and sufficient conditions for a vector field to be a gradient
10.16 Necessary conditions for a vector field to be a gradient
10.17 Special methods for constructing potential functions
10.18 Exercises
10.19 Applications to exact differential equations of first order
10.20 Exercises
11. MULTIPLE INTEGRALS
11.3 The double integral of a step function
11.4 The definition of the double integral of a function defined and bounded on a rectangle
11.6 Evaluation of a double integral by repeated one-dimensional integration
11.7 Geometric interpretation of the double integral as a volume
11.8 Worked examples
11.11 Integrability of bounded functions with discontinuities
11.12 Double integrals extended over more general regions
11.13 Applications to area and volume
11.14 Worked examples
11.17 Two theorems of Pappus
11.18 Exercises
11.19 Green's theorem in the plane
11.20 Some applications of Green's theorem
11.21 A necessary and sufficient condition for a two-dimensional vector field to be a gradient
*11.24 The winding number
11.27 Special cases of the transformation formula
11.28 Exercises
11.29 Proof of the transformation formula in a special case
11.30 Proof of the transformation formula in the general case
11.31 Extensions to higher dimensions
11.32 Change of variables in an n-fold integral
11.33 Worked examples
12. SURFACE INTEGRALS
12.2 The fundamental vector product
12.3 The fundamental vector product as a normal to the surface
12.4 Exercises
12.6 Exercises
12.9 Other notations for surface integrals
12.10 Exercises
12.12 The curl and divergence of a vector field
12.13 Exercises
12.15 Exercises
*12.17 Exercises
12.19 The divergence theorem (Gauss' theorem)
12.20 Applications of the divergence theorem
12.21 Exercises
PART 3. SPECIAL TOPICS
13. SET FUNCTIONS AND ELEMENTARY PROBABILITY
13.1 Historical introduction
13.3 Finitely additive measures
13.6 Special terminology peculiar to probability theory
13.7 Exercises
13.11 Exercises
13.15 Compound experiments
13.16 Bernoulli trials
13.17 The most probable number of successes in n Bernoulli trials
13.18 Exercises
13.20 Exercises
13.21 The definition of probability for countably infinite sample spaces
13.22 Exercises
14. CALCULUS OF PROBABILITIES
14.2 Countability of the set of points with positive probability
14.3 Random variables
14.7 Discrete distributions. Probability mass functions
14.8 Exercises
14.11 Cauchy’s distribution
14.16 Exercises
14.18 Exercises
14.19 Distributions of two-dimensional random variables
14.20 Two-dimensional discrete distributions
14.22 Exercises
14.24 Exercises
14.27 Exercises
14.30 The central limit theorem of the calculus of probabilities
14.31 Exercises
15. INTRODUCTION TO NUMERICAL ANALYSIS
15.1 Historical introduction
15.4 Fundamental problems in polynomial approximation
15.5 Exercises
15.8 Error analysis in polynomial interpolation
15.9 Exercises
15.11 Equally spaced interpolation points. The forward difference operator
15.12 Factorial polynomials
15.17 Application to the error formula for interpolation
15.18 Exercises
15.19 Approximate integration. The trapezoidal rule
15.20 Simpson’s rule
15.23 Exercises
Suggested References
LINEAR SPACES
1.1 Introduction
Throughout mathematics we encounter many examples of mathematical objects that can be added to each other and multiplied by real numbers. First of all, the real numbers themselves are such objects. Other examples are real-valued functions, the complex numbers, infinite series, vectors in n-space, and vector-valued functions. In this chapter we discuss a general mathematical concept, called a linear space, which includes all these examples and many others as special cases.
Briefly, a linear space is a set of elements of any kind on which certain operations (called addition and multiplication by numbers) can be performed. In defining a linear space, we do not specify the nature of the elements nor do we tell how the operations are to be performed on them. Instead, we require that the operations have certain properties which we take as axioms for a linear space. We turn now to a detailed description of these axioms.
1.2 The definition of a linear space
Let V denote a nonempty set of objects, called elements. The set V is called a linear space if it satisfies the following ten axioms which we list in three groups.
Closure axioms
AXIOM 1. CLOSURE UNDER ADDITION. For every pair of elements x and y in V there corresponds a unique element in V called the sum of x and y, denoted by x + y .
AXIOM 2. CLOSURE UNDER MULTIPLICATION BY REAL NUMBERS. For every x in V and every real number a there corresponds an element in V called the product of a and x, denoted by ax.
Axioms for addition
AXIOM 3. COMMUTATIVE LAW. For all x and y in V, we have x + y = y + x.
AXIOM 4. ASSOCIATIVE LAW. For all x, y, and z in V, we have (x + y) + z = x + (y + z).
AXIOM 5. EXISTENCE OF ZERO ELEMENT. There is an element in V, denoted by 0, such that
x + 0 = x for all x in V.
AXIOM 6. EXISTENCE OF NEGATIVES. For every x in V, the element (-1)x has the property
x + (-1)x = 0.
Axioms for multiplication by numbers
AXIOM 7. ASSOCIATIVE LAW. For every x in V and all real numbers a and b, we have
a(bx) = (ab)x.
AXIOM 8. DISTRIBUTIVE LAW FOR ADDITION IN V. For all x and y in V and all real a, we have
a(x + y) = ax + ay .
AXIOM 9. DISTRIBUTIVE LAW FOR ADDITION OF NUMBERS. For all x in V and all real a and b, we have
(a + b)x = ax + bx.
AXIOM 10. EXISTENCE OF IDENTITY. For every x in V, we have 1x = x.
Linear spaces, as defined above, are sometimes called real linear spaces to emphasize the fact that we are multiplying the elements of V by real numbers. If real number is replaced by complex number in Axioms 2, 7, 8, and 9, the resulting structure is called a complex linear space. Sometimes a linear space is referred to as a linear vector space or simply a vector space; the numbers used as multipliers are also called scalars. A real linear space has real numbers as scalars; a complex linear space has complex numbers as scalars. Although we shall deal primarily with examples of real linear spaces, all the theorems are valid for complex linear spaces as well. When we use the term linear space without further designation, it is to be understood that the space can be real or complex.
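Readers who like to experiment can spot-check these axioms numerically. The following sketch is not part of the text; it is a small Python check (the helper names are our own) that tests Axioms 3 through 10 for V = R^3 with the usual operations. The two closure axioms hold by construction, since the operations always return lists of three real numbers.

import random

def rand_vec(n=3):
    return [random.uniform(-10.0, 10.0) for _ in range(n)]

def add(x, y):
    return [xi + yi for xi, yi in zip(x, y)]

def scale(a, x):
    return [a * xi for xi in x]

def close(x, y, eps=1e-9):
    return all(abs(xi - yi) < eps for xi, yi in zip(x, y))

x, y, z = rand_vec(), rand_vec(), rand_vec()
a, b = random.uniform(-5, 5), random.uniform(-5, 5)
zero = [0.0, 0.0, 0.0]

assert close(add(x, y), add(y, x))                                # Axiom 3
assert close(add(add(x, y), z), add(x, add(y, z)))                # Axiom 4
assert close(add(x, zero), x)                                     # Axiom 5
assert close(add(x, scale(-1, x)), zero)                          # Axiom 6
assert close(scale(a, scale(b, x)), scale(a * b, x))              # Axiom 7
assert close(scale(a, add(x, y)), add(scale(a, x), scale(a, y)))  # Axiom 8
assert close(scale(a + b, x), add(scale(a, x), scale(b, x)))      # Axiom 9
assert close(scale(1, x), x)                                      # Axiom 10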
1.3 Examples of linear spaces
If we specify the set V and tell how to add its elements and how to multiply them by numbers, we get a concrete example of a linear space. The reader can easily verify that each of the following examples satisfies all the axioms for a real linear space.
EXAMPLE 1. Let V = R, the set of all real numbers, and let x + y and ax be ordinary addition and multiplication of real numbers.
EXAMPLE 2. Let V = C, the set of all complex numbers, define x + y to be ordinary addition of complex numbers, and define ax to be multiplication of the complex number x
by the real number a. Even though the elements of V are complex numbers, this is a real linear space because the scalars are real.
EXAMPLE 3. Let V = V_n, the vector space of all n-tuples of real numbers, with addition and multiplication by scalars defined in the usual way in terms of components.
EXAMPLE 4. Let V be the set of all vectors in V_n orthogonal to a given nonzero vector N. If n = 2, this linear space is a line through 0 with N as a normal vector. If n = 3, it is a plane through 0 with N as normal vector.
The following examples are called function spaces. The elements of V are real-valued functions, with addition of two functions f and g defined in the usual way:
(f + g)(x) = f(x) + g(x)
for every real x in the intersection of the domains of f and g. Multiplication of a function f by a real scalar a is defined as follows: af is that function whose value at each x in the domain of f is af(x). The zero element is the function whose values are everywhere zero. The reader can easily verify that each of the following sets is a function space.
EXAMPLE 5. The set of all functions defined on a given interval.
EXAMPLE 6. The set of all polynomials.
EXAMPLE 7. The set of all polynomials of degree ≤ n, where n is fixed. (Whenever we consider this set it is understood that the zero polynomial is also included.) The set of all polynomials of degree equal to n is not a linear space because the closure axioms are not satisfied. For example, the sum of two polynomials of degree n need not have degree n.
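A two-line computation (ours, not the book's; coefficients are listed from the constant term up) makes the closure failure in Example 7 concrete: adding the coefficient lists of two degree-2 polynomials can cancel the leading term.

p = [1, 0, 2]    # 1 + 2t^2, degree 2
q = [3, 1, -2]   # 3 + t - 2t^2, degree 2

s = [a + b for a, b in zip(p, q)]   # pointwise sum of coefficients
print(s)                            # [4, 1, 0] -> 4 + t, degree 1, not 2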
EXAMPLE 8. The set of all functions continuous on a given interval. If the interval is [a, b], we denote this space by C(a, b).
EXAMPLE 9. The set of all functions differentiable at a given point.
EXAMPLE 10. The set of all functions integrable on a given interval.
EXAMPLE 11. The set of all functions f defined at 1 with f(1) = 0. The number 0 is essential in this example. If we replace 0 by a nonzero number c, we violate the closure axioms.
EXAMPLE 12. The set of all solutions of a homogeneous linear differential equation y” + ay’ + by = 0, where a and b are given constants. Here again 0 is essential. The set of solutions of a nonhomogeneous differential equation does not satisfy the closure axioms.
These examples and many others illustrate how the linear space concept permeates algebra, geometry, and analysis. When a theorem is deduced from the axioms of a linear space, we obtain, in one stroke, a result valid for each concrete example. By unifying
diverse examples in this way we gain a deeper insight into each. Sometimes special knowl- edge of one particular example helps to anticipate or interpret results valid for other examples and reveals relationships which might otherwise escape notice.
1.4 Elementary consequences of the axioms
The following theorems are easily deduced from the axioms for a linear space.
THEOREM 1.1. UNIQUENESS OF THE ZERO ELEMENT. In any linear space there is one and only one zero element.
Proof. Axiom 5 tells us that there is at least one zero element. Suppose there were two, say 0_1 and 0_2. Taking x = 0_1 and 0 = 0_2 in Axiom 5, we obtain 0_1 + 0_2 = 0_1. Similarly, taking x = 0_2 and 0 = 0_1, we find 0_2 + 0_1 = 0_2. But 0_1 + 0_2 = 0_2 + 0_1 because of the commutative law, so 0_1 = 0_2.
THEOREM 1.2. UNIQUENESS OF NEGATIVE ELEMENTS. In any linear space every element has exactly one negative. That is, for every x there is one and only one y such that x + y = 0.
Proof. Axiom 6 tells us that each x has at least one negative, namely (-1)x. Suppose x has two negatives, say y_1 and y_2. Then x + y_1 = 0 and x + y_2 = 0. Adding y_2 to both members of the first equation and using Axioms 5, 4, and 3, we find that
y_2 + (x + y_1) = y_2 + 0 = y_2,
and
y_2 + (x + y_1) = (y_2 + x) + y_1 = 0 + y_1 = y_1 + 0 = y_1.
Therefore y_1 = y_2, so x has exactly one negative, the element (-1)x.
Notation. The negative of x is denoted by -x. The difference y - x is defined to be the sum y + (-x).
The next theorem describes a number of properties which govern elementary algebraic manipulations in a linear space.
THEOREM 1.3. In a given linear space, let x and y denote arbitrary elements and let a and b denote arbitrary scalars. Then we have the following properties:
(a) 0x = 0.
(b) a0 = 0.
(c) (-a)x = -(ax) = a(-x).
(d) If ax = 0, then either a = 0 or x = 0.
(e) If ax = ay and a ≠ 0, then x = y.
(f) If ax = bx and x ≠ 0, then a = b.
(g) -(x + y) = (-x) + (-y) = -x - y.
(h) x + x = 2x, x + x + x = 3x, and in general, ∑_{i=1}^{n} x = nx.
We shall prove (a), (b), and (c) and leave the proofs of the other properties as exercises.
Proof of (a). Let z = 0x. We wish to prove that z = 0. Adding z to itself and using Axiom 9, we find that
z + z = 0x + 0x = (0 + 0)x = 0x = z.
Now add -z to both members to get z = 0.
Proof of (b). Let z = a0, add z to itself, and use Axiom 8.
Proof of (c). Let z = (-a)x. Adding z to ax and using Axiom 9, we find that
z + ax = (-a)x + ax = (-a + a)x = 0x = 0,
so z is the negative of ax, z = -(ax). Similarly, if we add a(-x) to ax and use Axiom 8 and property (b), we find that a(-x) = -(ax).
1.5 Exercises
In Exercises 1 through 28, determine whether each of the given sets is a real linear space, if addition and multiplication by real scalars are defined in the usual way. For those that are not, tell which axioms fail to hold. The functions in Exercises 1 through 17 are real-valued. In Exercises 3, 4, and 5, each function has a domain containing 0 and 1. In Exercises 7 through 12, each domain contains all real numbers.
1. All rational functions.
2. All rational functions f/g, with the degree of f ≤ the degree of g (including f = 0).
3. All f with f(0) = f(1).
4. All f with 2f(0) = f(1).
5. All f with f(1) = 1 + f(0).
6. All step functions defined on [0, 1].
7. All f with f(x) → 0 as x → +∞.
8. All even functions.
9. All odd functions.
10. All bounded functions.
11. All increasing functions.
12. All functions with period 2π.
13. All f integrable on [0, 1] with ∫_0^1 f(x) dx = 0.
14. All f integrable on [0, 1] with ∫_0^1 f(x) dx ≥ 0.
15. All f satisfying f(x) = f(1 - x) for all x.
16. All Taylor polynomials of degree ≤ n for a fixed n (including the zero polynomial).
17. All solutions of a linear second-order homogeneous differential equation y'' + P(x)y' + Q(x)y = 0, where P and Q are given functions, continuous everywhere.
18. All bounded real sequences.
19. All convergent real sequences.
20. All convergent real series.
21. All absolutely convergent real series.
22. All vectors (x, y, z) in V_3 with z = 0.
23. All vectors (x, y, z) in V_3 with x = 0 or y = 0.
24. All vectors (x, y, z) in V_3 with y = 5x.
25. All vectors (x, y, z) in V_3 with 3x + 4y = 1, z = 0.
26. All vectors (x, y, z) in V_3 which are scalar multiples of (1, 2, 3).
27. All vectors (x, y, z) in V_3 whose components satisfy a system of three linear equations of the form:
a_{11}x + a_{12}y + a_{13}z = 0, a_{21}x + a_{22}y + a_{23}z = 0, a_{31}x + a_{32}y + a_{33}z = 0.
28. All vectors in V_n that are linear combinations of two given vectors A and B.
29. Let V = R^+, the set of positive real numbers. Define the "sum" of two elements x and y in V to be their product xy (in the usual sense), and define "multiplication" of an element x in V by a scalar c to be x^c. Prove that V is a real linear space with 1 as the zero element.
30. (a) Prove that Axiom 10 can be deduced from the other axioms.
(b) Prove that Axiom 10 cannot be deduced from the other axioms if Axiom 6 is replaced by Axiom 6': For every x in V there is an element y in V such that x + y = 0.
31. Let S be the set of all ordered pairs (x_1, x_2) of real numbers. In each case determine whether or not S is a linear space with the operations of addition and multiplication by scalars defined as indicated. If the set is not a linear space, indicate which axioms are violated.
(a) (x_1, x_2) + (y_1, y_2) = (x_1 + y_1, x_2 + y_2), a(x_1, x_2) = (ax_1, 0).
(b) (x_1, x_2) + (y_1, y_2) = (x_1 + y_1, 0), a(x_1, x_2) = (ax_1, ax_2).
(c) (x_1, x_2) + (y_1, y_2) = (x_1, x_2 + y_2), a(x_1, x_2) = (ax_1, ax_2).
(d) (x_1, x_2) + (y_1, y_2) = (|x_1 + x_2|, |y_1 + y_2|), a(x_1, x_2) = (|ax_1|, |ax_2|).
32. Prove parts (d) through (h) of Theorem 1.3.
1.6 Subspaces of a linear space
Given a linear space V, let S be a nonempty subset of V. If S is also a linear space, with the same operations of addition and multiplication by scalars, then S is called a subspace of V. The next theorem gives a simple criterion for determining whether or not a subset of a linear space is a subspace.
THEOREM 1.4. Let S be a nonempty subset of a linear space V. Then S is a subspace if and only if S satisfies the closure axioms.
Proof. If S is a subspace, it satisfies all the axioms for a linear space, and hence, in particular, it satisfies the closure axioms.
Now we show that if S satisfies the closure axioms it satisfies the others as well. The commutative and associative laws for addition (Axioms 3 and 4) and the axioms for multiplication by scalars (Axioms 7 through 10) are automatically satisfied in S because they hold for all elements of V. It remains to verify Axioms 5 and 6, the existence of a zero element in S, and the existence of a negative for each element in S.
Let x be any element of S. (S has at least one element since S is not empty.) By Axiom 2, ax is in S for every scalar a. Taking a = 0, it follows that 0x is in S. But 0x = 0, by Theorem 1.3(a), so 0 ∈ S, and Axiom 5 is satisfied. Taking a = -1, we see that (-1)x is in S. But x + (-1)x = 0 since both x and (-1)x are in V, so Axiom 6 is satisfied in S. Therefore S is a subspace of V.
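As an illustration of Theorem 1.4 (the sketch and the sample sets are ours, not the book's), one can test the two closure axioms on candidate subsets of R^3. The subset {(x, y, z) : z = 0} passes, while {(x, y, z) : x + y = 1} fails. Note that testing finitely many samples is only a spot check: it can demonstrate failure, but passing does not prove closure.

import random

def in_S1(v):                       # S1 = {(x, y, z) : z = 0}
    return abs(v[2]) < 1e-12

def in_S2(v):                       # S2 = {(x, y, z) : x + y = 1}
    return abs(v[0] + v[1] - 1) < 1e-12

def closed_on_samples(member, samples):
    for u, w in samples:
        s = [ui + wi for ui, wi in zip(u, w)]   # closure under addition
        c = random.uniform(-5, 5)
        m = [c * ui for ui in u]                # closure under scalar multiples
        if not (member(s) and member(m)):
            return False
    return True

s1 = [([1, 2, 0], [-3, 4, 0]), ([0.5, 0, 0], [7, -1, 0])]
s2 = [([1, 0, 2], [0, 1, -1]), ([0.5, 0.5, 3], [2, -1, 0])]
print(closed_on_samples(in_S1, s1))   # True: S1 is a subspace
print(closed_on_samples(in_S2, s2))   # False: S2 violates both closures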
DEFINITION. Let S be a nonempty subset of a linear space V. An element x in V of the form
x = ∑_{i=1}^{k} c_i x_i,
where x_1, ..., x_k are all in S and c_1, ..., c_k are scalars, is called a finite linear combination of elements of S. The set of all finite linear combinations of elements of S satisfies the closure axioms and hence is a subspace of V. We call this the subspace spanned by S, or the linear span of S, and denote it by L(S). If S is empty, we define L(S) to be {0}, the set consisting of the zero element alone.
Different sets may span the same subspace. For example, the space V_2 is spanned by each of the following sets of vectors: {i, j}, {i, j, i + j}, {0, i, -i, j, -j, i + j}. The space of all polynomials p(t) of degree ≤ n is spanned by the set of n + 1 polynomials
{1, t, t^2, ..., t^n}.
It is also spanned by the set {1, t/2, t^2/3, ..., t^n/(n + 1)}, and by {1, (1 + t), (1 + t)^2, ..., (1 + t)^n}. The space of all polynomials is spanned by the infinite set of polynomials {1, t, t^2, ...}.
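In V_n, whether two finite sets span the same subspace can be checked mechanically with matrix ranks; here is a small sketch of that idea (ours, using numpy; it is not a method from the text).

import numpy as np

def same_span(S, T):
    """Two finite sets of vectors span the same subspace exactly when
    stacking either set onto the other leaves the rank unchanged."""
    S, T = np.array(S, dtype=float), np.array(T, dtype=float)
    r = np.linalg.matrix_rank
    return r(S) == r(T) == r(np.vstack([S, T]))

i, j = [1, 0], [0, 1]
print(same_span([i, j], [i, j, [1, 1]]))   # True: both span V_2
print(same_span([i], [i, j]))              # False: the spans differ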
A number of questions arise naturally at this point. For example, which spaces can be spanned by a finite set of elements? If a space can be spanned by a finite set of elements, what is the smallest number of elements required? To discuss these and related questions, we introduce the concepts of dependence, independence, bases, and dimension. These ideas were encountered in Volume I in our study of the vector space V_n. Now we extend them to general linear spaces.
1.7 Dependent and independent sets in a linear space
DEFINITION. A set S of elements in a linear space V is called dependent if there is a finite set of distinct elements in S, say x_1, ..., x_k, and a corresponding set of scalars c_1, ..., c_k, not all zero, such that
∑_{i=1}^{k} c_i x_i = 0.
An equation ∑ c_i x_i = 0 with not all c_i = 0 is said to be a nontrivial representation of 0. The set S is called independent if it is not dependent. In this case, for all choices of distinct elements x_1, ..., x_k in S and scalars c_1, ..., c_k,
∑_{i=1}^{k} c_i x_i = 0 implies c_1 = c_2 = ... = c_k = 0.
Although dependence and independence are properties of sets of elements, we also apply these terms to the elements themselves. For example, the elements in an independent set are called independent elements.
If S is a finite set, the foregoing definition agrees with that given in Volume I for the space V,. However, the present definition is not restricted to finite sets.
EXAMPLE 1. If a subset T of a set S is dependent, then S itself is dependent. This is logically equivalent to the statement that every subset of an independent set is independent.
EXAMPLE 2. If one element in S is a scalar multiple of another, then S is dependent.
EXAMPLE 3. If 0 ∈ S, then S is dependent.
EXAMPLE 4. The empty set is independent.
Many examples of dependent and independent sets of vectors in V_n were discussed in Volume I. The following examples illustrate these concepts in function spaces. In each case the underlying linear space V is the set of all real-valued functions defined on the real line.
EXAMPLE 5. Let u_1(t) = cos^2 t, u_2(t) = sin^2 t, u_3(t) = 1 for all real t. The Pythagorean identity shows that u_1 + u_2 - u_3 = 0, so the three functions u_1, u_2, u_3 are dependent.
EXAMPLE 6. Let u_k(t) = t^k for k = 0, 1, 2, ..., and t real. The set S = {u_0, u_1, u_2, ...} is independent. To prove this, it suffices to show that for each n the n + 1 polynomials u_0, u_1, ..., u_n are independent. A relation of the form ∑ c_k u_k = 0 means that
(1.1) ∑_{k=0}^{n} c_k t^k = 0
for all real t. When t = 0, this gives c_0 = 0. Differentiating (1.1) and setting t = 0, we find that c_1 = 0. Repeating the process, we find that each coefficient c_k is zero.
EXAMPLE 7. If a_1, ..., a_n are distinct real numbers, the n exponential functions
u_1(x) = e^{a_1 x}, ..., u_n(x) = e^{a_n x}
are independent. We can prove this by induction on n. The result holds trivially when n = 1. Therefore, assume it is true for n - 1 exponential functions and consider scalars c_1, ..., c_n such that
(1.2) ∑_{k=1}^{n} c_k e^{a_k x} = 0.
Let a_M be the largest of the n numbers a_1, ..., a_n. Multiplying both members of (1.2) by e^{-a_M x}, we obtain
(1.3) ∑_{k=1}^{n} c_k e^{(a_k - a_M)x} = 0.
If k ≠ M, the number a_k - a_M is negative. Therefore, when x → +∞ in Equation (1.3), each term with k ≠ M tends to zero and we find that c_M = 0. Deleting the Mth term from (1.2) and applying the induction hypothesis, we find that each of the remaining n - 1 coefficients c_k is zero.
THEOREM 1.5. Let S = {x_1, ..., x_k} be an independent set consisting of k elements in a linear space V and let L(S) be the subspace spanned by S. Then every set of k + 1 elements in L(S) is dependent.
Proof. The proof is by induction on k, the number of elements in S. First suppose k = 1. Then, by hypothesis, S consists of one element x_1, where x_1 ≠ 0 since S is independent. Now take any two distinct elements y_1 and y_2 in L(S). Then each is a scalar multiple of x_1, say y_1 = c_1 x_1 and y_2 = c_2 x_1, where c_1 and c_2 are not both 0. Multiplying y_1 by c_2 and y_2 by c_1 and subtracting, we find that
c_2 y_1 - c_1 y_2 = 0.
This is a nontrivial representation of 0, so y_1 and y_2 are dependent. This proves the theorem when k = 1.
Now we assume the theorem is true for k - 1 and prove that it is also true for k. Take any set of k + 1 elements in L(S), say T = {y_1, y_2, ..., y_{k+1}}. We wish to prove that T is dependent. Since each y_i is in L(S) we may write
(1.4) y_i = ∑_{j=1}^{k} a_{ij} x_j
for each i = 1, 2, ..., k + 1. We examine all the scalars a_{i1} that multiply x_1 and split the proof into two cases according to whether all these scalars are 0 or not.
CASE 1. a_{i1} = 0 for every i = 1, 2, ..., k + 1. In this case the sum in (1.4) does not involve x_1, so each y_i in T is in the linear span of the set S' = {x_2, ..., x_k}. But S' is independent and consists of k - 1 elements. By the induction hypothesis, the theorem is true for k - 1, so the set T is dependent. This proves the theorem in Case 1.
CASE 2. Not all the scalars a_{i1} are zero. Let us assume that a_{11} ≠ 0. (If necessary, we can renumber the y's to achieve this.) Taking i = 1 in Equation (1.4) and multiplying both members by c_i, where c_i = a_{i1}/a_{11}, we get
c_i y_1 = a_{i1} x_1 + ∑_{j=2}^{k} c_i a_{1j} x_j.
From this we subtract Equation (1.4) to get
c_i y_1 - y_i = ∑_{j=2}^{k} (c_i a_{1j} - a_{ij}) x_j,
for i = 2, ..., k + 1. This equation expresses each of the k elements c_i y_1 - y_i as a linear combination of the k - 1 independent elements x_2, ..., x_k. By the induction hypothesis, the k elements c_i y_1 - y_i must be dependent. Hence, for some choice of scalars t_2, ..., t_{k+1}, not all zero, we have
∑_{i=2}^{k+1} t_i (c_i y_1 - y_i) = 0,
from which we find
(∑_{i=2}^{k+1} t_i c_i) y_1 - ∑_{i=2}^{k+1} t_i y_i = 0.
But this is a nontrivial linear combination of y_1, ..., y_{k+1} which represents the zero element, so the elements y_1, ..., y_{k+1} must be dependent. This completes the proof.
1.8 Bases and dimension
DEFINITION. A finite set S of elements in a linear space V is called a finite basis for V if S is independent and spans V. The space V is called finite-dimensional if it has a finite basis, or if V consists of 0 alone. Otherwise, V is called infinite-dimensional.
THEOREM 1.6. Let V be a finite-dimensional linear space. Then every finite basis for V has the same number of elements.
Proof. Let S and T be two finite bases for V. Suppose S consists of k elements and T consists of m elements. Since S is independent and spans V, Theorem 1.5 tells us that every set of k + 1 elements in V is dependent. Therefore, every set of more than k elements in V is dependent. Since T is an independent set, we must have m ≤ k. The same argument with S and T interchanged shows that k ≤ m. Therefore k = m.
DEFINITION. If a linear space V has a basis of n elements, the integer n is called the dimension of V. We write n = dim V. If V = {0}, we say V has dimension 0.
EXAMPLE 1. The space V_n has dimension n. One basis is the set of n unit coordinate vectors.
EXAMPLE 2. The space of all polynomials p(t) of degree ≤ n has dimension n + 1. One basis is the set of n + 1 polynomials {1, t, t^2, ..., t^n}. Every polynomial of degree ≤ n is a linear combination of these n + 1 polynomials.
EXAMPLE 3. The space of solutions of the differential equation y'' - 2y' - 3y = 0 has dimension 2. One basis consists of the two functions u_1(x) = e^{-x}, u_2(x) = e^{3x}. Every solution is a linear combination of these two.
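A quick numerical check of Example 3 (the code is ours, not the book's): the residual y'' - 2y' - 3y vanishes for any linear combination of u_1 and u_2, with the derivatives computed in closed form.

import math

def residual(c1, c2, x):
    # y = c1*e^{-x} + c2*e^{3x}, with y' and y'' written out explicitly
    y   =  c1 * math.exp(-x) +     c2 * math.exp(3 * x)
    yp  = -c1 * math.exp(-x) + 3 * c2 * math.exp(3 * x)
    ypp =  c1 * math.exp(-x) + 9 * c2 * math.exp(3 * x)
    return ypp - 2 * yp - 3 * y

for x in (0.0, 0.7, -1.3):
    assert abs(residual(2.0, -5.0, x)) < 1e-9   # residual is zero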
EXAMPLE 4. The space of all polynomials p(t) is infinite-dimensional. Although the infinite set {1, t, t^2, ...} spans this space, no finite set of polynomials spans the space.
THEOREM 1.7. Let V be a jinite-dimensional linear space with dim V = n. Then we have the following:
(a) Any set of independent elements in V is a subset of some basis for V.
(b) Any set of n independent elements is a basis for V.
Proof. To prove (a), let S = {x_1, ..., x_k} be any independent set of elements in V. If L(S) = V, then S is a basis. If not, then there is some element y in V which is not in L(S). Adjoin this element to S and consider the new set S' = {x_1, ..., x_k, y}. If this set were dependent there would be scalars c_1, ..., c_{k+1}, not all zero, such that
∑_{i=1}^{k} c_i x_i + c_{k+1} y = 0.
But c_{k+1} ≠ 0 since x_1, ..., x_k are independent. Hence, we could solve this equation for y and find that y ∈ L(S), contradicting the fact that y is not in L(S). Therefore, the set S'
is independent but contains k + 1 elements. If L(S') = V, then S' is a basis and, since S is a subset of S', part (a) is proved. If S' is not a basis, we can argue with S' as we did with S, getting a new set S'' which contains k + 2 elements and is independent. If S'' is a basis, then part (a) is proved. If not, we repeat the process. We must arrive at a basis in a finite number of steps, otherwise we would eventually obtain an independent set with n + 1 elements, contradicting Theorem 1.5. Therefore part (a) is proved.
To prove (b), let S be any independent set consisting of n elements. By part (a), S is a subset of some basis, say B. But by Theorem 1.6, the basis B has exactly n elements, so S = B.
1.9 Components
Let V be a linear space of dimension n and consider a basis whose elements e_1, ..., e_n are taken in a given order. We denote such an ordered basis as an n-tuple (e_1, ..., e_n). If x ∈ V, we can express x as a linear combination of these basis elements:
(1.5) x = ∑_{i=1}^{n} c_i e_i.
The coefficients in this equation determine an n-tuple of numbers (c_1, ..., c_n) that is uniquely determined by x. In fact, if we have another representation of x as a linear combination of e_1, ..., e_n, say x = ∑_{i=1}^{n} d_i e_i, then by subtraction from (1.5), we find that ∑_{i=1}^{n} (c_i - d_i) e_i = 0. But since the basis elements are independent, this implies c_i = d_i for each i, so we have (c_1, ..., c_n) = (d_1, ..., d_n).
The components of the ordered n-tuple (c_1, ..., c_n) determined by Equation (1.5) are called the components of x relative to the ordered basis (e_1, ..., e_n).
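In V_n the components of x relative to an ordered basis can be computed by solving a linear system whose columns are the basis elements; here is a small sketch (ours, using numpy; the basis chosen is arbitrary). The uniqueness of the solution mirrors the argument just given.

import numpy as np

e1, e2 = np.array([1.0, 1.0]), np.array([1.0, -1.0])  # an ordered basis of R^2
E = np.column_stack([e1, e2])                          # basis vectors as columns

x = np.array([3.0, 1.0])
c = np.linalg.solve(E, x)        # components of x relative to (e1, e2)
print(c)                         # [2. 1.], since x = 2*e1 + 1*e2
assert np.allclose(c[0] * e1 + c[1] * e2, x)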
1.10 Exercises
In each of Exercises 1 through 10, let S denote the set of all vectors (x, y, z) in V_3 whose components satisfy the condition given. Determine whether S is a subspace of V_3. If S is a subspace, compute dim S.
1. x = 0.
2. x + y = 0.
3. x + y + z = 0.
4. x = y.
5. x = y = z.
6. x = y or x = z.
7. x^2 - y^2 = 0.
8. x + y = 1.
9. y = 2x and z = 3x.
10. x + y + z = 0 and x - y - z = 0.
Let P_n denote the linear space of all real polynomials of degree ≤ n, where n is fixed. In each of Exercises 11 through 20, let S denote the set of all polynomials f in P_n satisfying the condition given. Determine whether or not S is a subspace of P_n. If S is a subspace, compute dim S.
11. f(0) = 0.
12. f'(0) = 0.
13. f''(0) = 0.
14. f(0) + f'(0) = 0.
15. f(0) = f(1).
16. f(0) = f(2).
17. f is even.
18. f is odd.
19. f has degree ≤ k, where k < n, or f = 0.
20. f has degree k, where k < n, or f = 0.
21. In the linear space of all real polynomials p(t), describe the subspace spanned by each of the following subsets of polynomials and determine the dimension of this subspace.
(a) {1, t^2, t^4}; (b) {t, t^3, t^5}; (c) {t, t^2}; (d) {1 + t, (1 + t)^2}.
22. In this exercise, L(S) denotes the subspace spanned by a subset S of a linear space V. Prove each of the statements (a) through (f).
(a) S ⊆ L(S).
(b) If S ⊆ T ⊆ V and if T is a subspace of V, then L(S) ⊆ T. This property is described by saying that L(S) is the smallest subspace of V which contains S.
(c) A subset S of V is a subspace of V if and only if L(S) = S.
(d) If S ⊆ T ⊆ V, then L(S) ⊆ L(T).
(e) If S and T are subspaces of V, then so is S ∩ T.
(f) If S and T are subsets of V, then L(S ∩ T) ⊆ L(S) ∩ L(T).
(g) Give an example in which L(S ∩ T) ≠ L(S) ∩ L(T).
23. Let V be the linear space consisting of all real-valued functions defined on the real line. Determine whether each of the following subsets of V is dependent or independent. Compute the dimension of the subspace spanned by each set.
(a) {1, e^{ax}, e^{bx}}, a ≠ b.
(b) {e^{ax}, xe^{ax}}.
(c) {1, e^{ax}, xe^{ax}}.
(d) {e^{ax}, xe^{ax}, x^2 e^{ax}}.
(e) {e^x, e^{-x}, cosh x}.
(f) {cos x, sin x}.
(g) {cos^2 x, sin^2 x}.
(h) {1, cos 2x, sin^2 x}.
(i) {sin x, sin 2x}.
(j) {e^x cos x, e^x sin x}.
24. Let V be a finite-dimensional linear space, and let S be a subspace of V. Prove each of the following statements.
(a) S is finite-dimensional and dim S ≤ dim V.
(b) dim S = dim V if and only if S = V.
(c) Every basis for S is part of a basis for V.
(d) A basis for V need not contain a basis for S.
1.11 Inner products, Euclidean spaces. Norms
In ordinary Euclidean geometry, those properties that rely on the possibility of measuring lengths of line segments and angles between lines are called metric properties. In our study of V_n, we defined lengths and angles in terms of the dot product. Now we wish to extend these ideas to more general linear spaces. We shall introduce first a generalization of the dot product, which we call an inner product, and then define length and angle in terms of the inner product.
The dot product x · y of two vectors x = (x_1, ..., x_n) and y = (y_1, ..., y_n) in V_n was defined in Volume I by the formula
(1.6) x · y = ∑_{i=1}^{n} x_i y_i.
In a general linear space, we write (x, y) instead of x · y for inner products, and we define the product axiomatically rather than by a specific formula. That is, we state a number of properties we wish inner products to satisfy and we regard these properties as axioms.
DEFINITION. A real linear space V is said to have an inner product if for each pair of elements x and y in V there corresponds a unique real number (x, y) satisfying the following axioms for all choices of x, y, z in V and all real scalars c.
(1) (x, y) = (y, x) (commutativity, or symmetry).
(2) (x, y + z) = (x, y) + (x, z) (distributivity, or linearity).
(3) c(x, y) = (cx, y) (associativity, or homogeneity).
(4) (x, x) > 0 if x ≠ 0 (positivity).
A real linear space with an inner product is called a real Euclidean space.
Note: Taking c = 0 in (3), we find that (0, y) = 0 for all y.
In a complex linear space, an inner product (x, y) is a complex number satisfying the same axioms as those for a real inner product, except that the symmetry axiom is replaced by the relation
(1') (x, y) = \overline{(y, x)} (Hermitian† symmetry),
where \overline{(y, x)} denotes the complex conjugate of (y, x). In the homogeneity axiom, the scalar multiplier c can be any complex number. From the homogeneity axiom and (1'), we get the companion relation
(3') (x, cy) = \overline{(cy, x)} = \overline{c} \overline{(y, x)} = \overline{c}(x, y).
A complex linear space with an inner product is called a complex Euclidean space. (Sometimes the term unitary space is also used.) One example is the complex vector space V_n(C) discussed briefly in Section 12.16 of Volume I.
Although we are interested primarily in examples of real Euclidean spaces, the theorems of this chapter are valid for complex Euclidean spaces as well. When we use the term Euclidean space without further designation, it is to be understood that the space can be real or complex.
The reader should verify that each of the following satisfies all the axioms for an inner product.
EXAMPLE 1. In V_n, let (x, y) = x · y, the usual dot product of x and y.
EXAMPLE 2. If x = (x_1, x_2) and y = (y_1, y_2) are any two vectors in V_2, define (x, y) by the formula
(x, y) = 2x_1 y_1 + x_1 y_2 + x_2 y_1 + x_2 y_2.
This example shows that there may be more than one inner product in a given linear space.
EXAMPLE 3. Let C(a, b) denote the linear space of all real-valued functions continuous on an interval [a, b]. Define an inner product of two functions f and g by the formula
(f, g) = ∫_a^b f(t)g(t) dt.
This formula is analogous to Equation (1.6) which defines the dot product of two vectors in V_n. The function values f(t) and g(t) play the role of the components x_i and y_i, and integration takes the place of summation.
† In honor of Charles Hermite (1822-1901), a French mathematician who made many contributions to algebra and analysis.
EXAMPLE 4. In the space C(a, b), define
(f, g) = ∫_a^b w(t)f(t)g(t) dt,
where w is a fixed positive function in C(a, b). The function w is called a weight function. In Example 3 we have w(t) = 1 for all t.
EXAMPLE 5. In the linear space of all real polynomials, define
(f, g) = ∫_0^∞ e^{-t} f(t)g(t) dt.
Because of the exponential factor, this improper integral converges for every choice of polynomials f and g.
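These integral inner products are easy to approximate numerically. The sketch below is ours, not the book's; the truncation point T = 50 and the step count are arbitrary choices. It estimates the inner product of Example 5 for f(t) = t^m and g(t) = t^n, which Exercise 11 of Section 1.13 shows equals (m + n)!.

import math

def inner(f, g, T=50.0, steps=200000):
    # trapezoidal rule on [0, T]; e^{-t} makes the tail beyond T negligible
    h = T / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        w = 0.5 if k in (0, steps) else 1.0
        total += w * math.exp(-t) * f(t) * g(t)
    return total * h

m, n = 2, 3
approx = inner(lambda t: t**m, lambda t: t**n)
print(approx, math.factorial(m + n))   # both close to 120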
THEOREM 1.8. In a Euclidean space V, every inner product satisfies the Cauchy-Schwarz inequality:
|(x, y)|^2 ≤ (x, x)(y, y) for all x and y in V.
Moreover, the equality sign holds if and only if x and y are dependent.
Proof. If either x = 0 or y = 0 the result holds trivially, so we can assume that both x and y are nonzero. Let z = ax + by, where a and b are scalars to be specified later. We have the inequality (z, z) ≥ 0 for all a and b. When we express this inequality in terms of x and y with an appropriate choice of a and b we will obtain the Cauchy-Schwarz inequality.
To express (z, z) in terms of x and y we use properties (1'), (2), and (3') to obtain
(z, z) = (ax + by, ax + by) = (ax, ax) + (ax, by) + (by, ax) + (by, by)
= a\overline{a}(x, x) + a\overline{b}(x, y) + b\overline{a}(y, x) + b\overline{b}(y, y) ≥ 0.
Taking a = (y, y) and cancelling the positive factor (y, y) in the inequality we obtain
(y, y)(x, x) + \overline{b}(x, y) + b(y, x) + b\overline{b} ≥ 0.
Now we take b = -(x, y). Then \overline{b} = -(y, x) and the last inequality simplifies to
(y, y)(x, x) ≥ (x, y)(y, x) = |(x, y)|^2.
This proves the Cauchy-Schwarz inequality. The equality sign holds throughout the proof if and only if z = 0. This holds, in turn, if and only if x and y are dependent.
EXAMPLE. Applying Theorem 1.8 to the space C(a, b) with the inner product (f, g) = ∫_a^b f(t)g(t) dt, we find that the Cauchy-Schwarz inequality becomes
(∫_a^b f(t)g(t) dt)^2 ≤ (∫_a^b f^2(t) dt)(∫_a^b g^2(t) dt).
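A numerical spot check of this inequality in C(0, 1) (the sketch, the midpoint-rule approximation, and the test functions are all ours):

def ip(f, g, steps=100000):
    # midpoint-rule approximation of the integral inner product on [0, 1]
    h = 1.0 / steps
    return sum(f((k + 0.5) * h) * g((k + 0.5) * h) for k in range(steps)) * h

f = lambda t: t * t
g = lambda t: 1.0 + t

lhs = ip(f, g) ** 2
rhs = ip(f, f) * ip(g, g)
print(lhs <= rhs + 1e-12)   # True; equality would require f and g dependent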
The inner product can be used to introduce the metric concept of length in any Euclidean space.
DEFINITION. In a Euclidean space V, the nonnegative number ||x|| defined by the equation
||x|| = (x, x)^{1/2}
is called the norm of the element x.
When the Cauchy-Schwarz inequality is expressed in terms of norms, it becomes
|(x, y)| ≤ ||x|| ||y||.
Since it may be possible to define an inner product in many different ways, the norm of an element will depend on the choice of inner product. This lack of uniqueness is to be expected. It is analogous to the fact that we can assign different numbers to measure the length of a given line segment, depending on the choice of scale or unit of measurement. The next theorem gives fundamental properties of norms that do not depend on the choice of inner product.
THEOREM 1.9. In a Euclidean space, every norm has the following properties for all elements x and y and all scalars c:
(a) ||x|| = 0 if x = 0.
(b) ||x|| > 0 if x ≠ 0 (positivity).
(c) ||cx|| = |c| ||x|| (homogeneity).
(d) ||x + y|| ≤ ||x|| + ||y|| (triangle inequality).
The equality sign holds in (d) if x = 0, if y = 0, or if y = cx for some c > 0.
Proof. Properties (a), (b) and (c) follow at once from the axioms for an inner product. To prove (d), we note that
||x + y||^2 = (x + y, x + y) = (x, x) + (y, y) + (x, y) + (y, x)
= ||x||^2 + ||y||^2 + (x, y) + (y, x).
The sum (x, y) + (y, x) is real. The Cauchy-Schwarz inequality shows that |(x, y)| ≤ ||x|| ||y|| and |(y, x)| ≤ ||x|| ||y||, so we have
||x + y||^2 ≤ ||x||^2 + ||y||^2 + 2||x|| ||y|| = (||x|| + ||y||)^2.
This proves (d). When y = cx, where c > 0, we have
||x + y|| = ||x + cx|| = (1 + c)||x|| = ||x|| + ||cx|| = ||x|| + ||y||.
DEFINITION. In a real Euclidean space V, the angle between two nonzero elements x and y is defined to be that number θ in the interval 0 ≤ θ ≤ π which satisfies the equation
(1.7) cos θ = (x, y) / (||x|| ||y||).
Note: The Cauchy-Schwarz inequality shows that the quotient on the right of (1.7) lies in the interval [-1, 1], so there is exactly one θ in [0, π] whose cosine is equal to this quotient.
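The angle formula (1.7) can be evaluated with any inner product. The sketch below (ours, not from the text) computes it once with the dot product in V_2 and once with the integral inner product on C(-1, 1), recovering the angle π/3 between u_2(t) = t and u_3(t) = 1 + t asserted in Exercise 9 of Section 1.13.

import math

def angle(ip, x, y):
    # formula (1.7): cos(theta) = (x, y) / (||x|| ||y||)
    return math.acos(ip(x, y) / math.sqrt(ip(x, x) * ip(y, y)))

dot = lambda u, v: u[0] * v[0] + u[1] * v[1]
print(angle(dot, (1.0, 0.0), (1.0, 1.0)))         # pi/4 for plane vectors

def ipf(f, g, steps=100000):
    # midpoint-rule approximation of the inner product on [-1, 1]
    h = 2.0 / steps
    return sum(f(-1 + (k + 0.5) * h) * g(-1 + (k + 0.5) * h)
               for k in range(steps)) * h

print(angle(ipf, lambda t: t, lambda t: 1 + t))   # approximately pi/3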
1.12 Orthogonality in a Euclidean space
DEFINITION. In a Euclidean space V, two elements x and y are called orthogonal if their inner product is zero. A subset S of V is called an orthogonal set if (x, y) = 0 for every pair of distinct elements x and y in S. An orthogonal set is called orthonormal if each of its elements has norm 1.
The zero element is orthogonal to every element of V; it is the only element orthogonal to itself. The next theorem shows a relation between orthogonality and independence.
THEOREM 1.10. In a Euclidean space V, every orthogonal set of nonzero elements is independent. In particular, in a finite-dimensional Euclidean space with dim V = n, every orthogonal set consisting of n nonzero elements is a basis for V.
Proof. Let S be an orthogonal set of nonzero elements in V, and suppose some finite linear combination of elements of S is zero, say
∑_{i=1}^{k} c_i x_i = 0,
where each x_i ∈ S. Taking the inner product of each member with x_1 and using the fact that (x_1, x_i) = 0 if i ≠ 1, we find that c_1(x_1, x_1) = 0. But (x_1, x_1) ≠ 0 since x_1 ≠ 0, so c_1 = 0. Repeating the argument with x_1 replaced by x_j, we find that each c_j = 0. This proves that S is independent. If dim V = n and if S consists of n elements, Theorem 1.7(b) shows that S is a basis for V.
EXAMPLE. In the real linear space C(0, 2π) with the inner product (f, g) = ∫_0^{2π} f(x)g(x) dx, let S be the set of trigonometric functions {u_0, u_1, u_2, ...} given by
u_0(x) = 1, u_{2n-1}(x) = cos nx, u_{2n}(x) = sin nx, for n = 1, 2, ....
If m ≠ n, we have the orthogonality relations
∫_0^{2π} u_m(x) u_n(x) dx = 0,
so S is an orthogonal set. Since no member of S is the zero element, S is independent. The norm of each element of S is easily calculated. We have (u_0, u_0) = ∫_0^{2π} dx = 2π and, for n ≥ 1, we have
(u_{2n-1}, u_{2n-1}) = ∫_0^{2π} cos^2 nx dx = π, (u_{2n}, u_{2n}) = ∫_0^{2π} sin^2 nx dx = π.
Therefore, ||u_0|| = √(2π) and ||u_n|| = √π for n ≥ 1. Dividing each u_n by its norm, we obtain an orthonormal set {φ_0, φ_1, φ_2, ...} where φ_n = u_n/||u_n||. Thus, we have
φ_0(x) = 1/√(2π), φ_{2n-1}(x) = (cos nx)/√π, φ_{2n}(x) = (sin nx)/√π, for n ≥ 1.
In Section 1.14 we shall prove that every finite-dimensional Euclidean space has an orthogonal basis. The next theorem shows how to compute the components of an element relative to such a basis.
THEOREM 1.11. Let V be a finite-dimensional Euclidean space with dimension n, and assume that S = {e_1, ..., e_n} is an orthogonal basis for V. If an element x is expressed as a linear combination of the basis elements, say
(1.8) x = ∑_{i=1}^{n} c_i e_i,
then its components relative to the ordered basis (e_1, ..., e_n) are given by the formulas
(1.9) c_j = (x, e_j)/(e_j, e_j) for j = 1, 2, ..., n.
In particular, if S is an orthonormal basis, each c_j is given by
(1.10) c_j = (x, e_j).
Proof. Taking the inner product of each member of (1.8) with e_j, we obtain
(x, e_j) = ∑_{i=1}^{n} c_i(e_i, e_j) = c_j(e_j, e_j),
since (e_i, e_j) = 0 if i ≠ j. This implies (1.9), and when (e_j, e_j) = 1, we obtain (1.10).
If {e_1, ..., e_n} is an orthonormal basis, Equation (1.9) can be written in the form
(1.11) x = ∑_{i=1}^{n} (x, e_i) e_i.
The next theorem shows that in a finite-dimensional Euclidean space with an orthonormal basis the inner product of two elements can be computed in terms of their components.
THEOREM 1.12. Let V be a finite-dimensional Euclidean space of dimension n, and assume that {e_1, ..., e_n} is an orthonormal basis for V. Then for every pair of elements x and y in V, we have
(1.12) (x, y) = ∑_{i=1}^{n} (x, e_i) \overline{(y, e_i)} (Parseval's formula).
In particular, when x = y, we have
(1.13) ||x||^2 = ∑_{i=1}^{n} |(x, e_i)|^2.
Proof. Taking the inner product of both members of Equation (1.11) with y and using the linearity property of the inner product, we obtain (1.12). When x = y, Equation (1.12) reduces to (1.13).
Note: Equation (1.12) is named in honor of M. A. Parseval (circa 1776-1836), who obtained this type of formula in a special function space. Equation (1.13) is a generalization of the theorem of Pythagoras.
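For a real space the conjugation bar in (1.12) can be ignored. The following sketch (ours, not the book's) verifies Parseval's formula in V_2 with a rotated orthonormal basis.

import math

s = 1 / math.sqrt(2)
e1, e2 = (s, s), (s, -s)               # an orthonormal basis of R^2

dot = lambda u, v: u[0] * v[0] + u[1] * v[1]
x, y = (3.0, -1.0), (2.0, 5.0)

lhs = dot(x, y)
rhs = dot(x, e1) * dot(y, e1) + dot(x, e2) * dot(y, e2)   # Parseval sum
assert abs(lhs - rhs) < 1e-12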
1.13 Exercises
1. Let x = (x_1, ..., x_n) and y = (y_1, ..., y_n) be arbitrary vectors in V_n. In each case, determine whether (x, y) is an inner product for V_n if (x, y) is defined by the formula given. In case (x, y) is not an inner product, tell which axioms are not satisfied.
(a) (x, y) = ∑_{i=1}^{n} x_i |y_i|.
(b) (x, y) = (∑_{i=1}^{n} x_i^2 y_i^2)^{1/2}.
(c) (x, y) = ∑_{i=1}^{n} (x_i + y_i)^2 - ∑_{i=1}^{n} x_i^2 - ∑_{i=1}^{n} y_i^2.
2. Suppose we retain the first three axioms for a real inner product (symmetry, linearity, and homogeneity) but replace the fourth axiom by a new axiom (4'): (x, x) = 0 if and only if x = 0. Prove that either (x, x) > 0 for all x ≠ 0 or else (x, x) < 0 for all x ≠ 0.
[Hint: Assume (x, x) > 0 for some x ≠ 0 and (y, y) < 0 for some y ≠ 0. In the space spanned by {x, y}, find an element z ≠ 0 with (z, z) = 0.]
Prove that each of the statements in Exercises 3 through 7 is valid for all elements x and y in a real Euclidean space.
3. (x, y) = 0 if and only if ||x + y|| = ||x - y||.
4. (x, y) = 0 if and only if ||x + y||^2 = ||x||^2 + ||y||^2.
5. (x, y) = 0 if and only if ||x + cy|| ≥ ||x|| for all real c.
6. (x + y, x - y) = 0 if and only if ||x|| = ||y||.
7. If x and y are nonzero elements making an angle θ with each other, then
||x - y||^2 = ||x||^2 + ||y||^2 - 2||x|| ||y|| cos θ.
8. In the real linear space C(1, e), define an inner product by the equation
(f, g) = ∫_1^e (log x) f(x)g(x) dx.
(a) If f(x) = √x, compute ||f||.
(b) Find a linear polynomial g(x) = a + bx that is orthogonal to the constant function f(x) = 1.
9. In the real linear space C(-1, 1), let (f, g) = ∫_{-1}^{1} f(t)g(t) dt. Consider the three functions u_1, u_2, u_3 given by
u_1(t) = 1, u_2(t) = t, u_3(t) = 1 + t.
Prove that two of them are orthogonal, two make an angle π/3 with each other, and two make an angle π/6 with each other.
10. In the linear space P_n of all real polynomials of degree ≤ n, define
(f, g) = ∑_{k=0}^{n} f(k)g(k).
(a) Prove that (f, g) is an inner product for P_n.
(b) Compute (f, g) when f(t) = t and g(t) = at + b.
(c) If f(t) = t, find all linear polynomials g orthogonal to f.
11. In the linear space of all real polynomials, define (f, g) = ∫_0^∞ e^{-t} f(t)g(t) dt.
(a) Prove that this improper integral converges absolutely for all polynomials f and g.
(b) If x_n(t) = t^n for n = 0, 1, 2, ..., prove that (x_m, x_n) = (m + n)!.
(c) Compute (f, g) when f(t) = (t + 1)^2 and g(t) = t^2 + 1.
(d) Find all linear polynomials g(t) = a + bt orthogonal to f(t) = 1 + t.
12. In the linear space of all real polynomials, determine whether or not (f, g) is an inner product if (f, g) is defined by the formula given. In case (f, g) is not an inner product, indicate which axioms are violated. In (c), f' and g' denote derivatives.
(a) (f, g) = f(1)g(1).
(c) (f, g) = ∫_0^1 f'(t)g'(t) dt.
(d) (f, g) = (∫_0^1 f(t) dt)(∫_0^1 g(t) dt).
13. Let V consist of all infinite sequences {x_n} of real numbers for which the series ∑ x_n^2 converges. If x = {x_n} and y = {y_n} are two elements of V, define
(x, y) = ∑_{n=1}^{∞} x_n y_n.
(a) Prove that this series converges absolutely.
[Hint: Use the Cauchy-Schwarz inequality to estimate the sum ∑_{n=1}^{N} |x_n y_n|.]
(b) Prove that V is a linear space with (x, y) as an inner product.
(c) Compute (x, y) if x_n = 1/n and y_n = 1/(n + 1) for n ≥ 1.
(d) Compute (x, y) if x_n = 2^{-n} and y_n = 1/n! for n ≥ 1.
14. Let V be the set of all real functions f continuous on [0, +∞) and such that the integral ∫_0^∞ e^{-t} f^2(t) dt converges. Define (f, g) = ∫_0^∞ e^{-t} f(t)g(t) dt.
(a) Prove that the integral for (f, g) converges absolutely for each pair of functions f and g in V.
[Hint: Use the Cauchy-Schwarz inequality to estimate the integral ∫_0^T e^{-t} |f(t)g(t)| dt.]
(b) Prove that V is a linear space with (f, g) as an inner product.
(c) Compute (f, g) if f(t) = e^{-t} and g(t) = t^n, where n = 0, 1, 2, ....
15. In a complex Euclidean space, prove that the inner product has the following properties for all elements $x$, $y$, and $z$, and all complex $a$ and $b$.
(a) $(ax, by) = a\bar{b}\,(x, y)$.
(b) $(x, ay + bz) = \bar{a}(x, y) + \bar{b}(x, z)$.
16. Prove that the following identities are valid in every Euclidean space.
(a) $\|x + y\|^2 = \|x\|^2 + \|y\|^2 + (x, y) + (y, x)$.
(b) $\|x + y\|^2 - \|x - y\|^2 = 2(x, y) + 2(y, x)$.
(c) $\|x + y\|^2 + \|x - y\|^2 = 2\,\|x\|^2 + 2\,\|y\|^2$.
17. Prove that the space of all complex-valued functions continuous on an interval $[a, b]$ becomes a unitary space if we define an inner product by the formula
$$(f, g) = \int_a^b w(t)\, f(t)\, \overline{g(t)} \, dt,$$
where $w$ is a fixed positive function, continuous on $[a, b]$.
1.14 Construction of orthogonal sets. The Gram-Schmidt process
Every finite-dimensional linear space has a finite basis. If the space is Euclidean, we can always construct an orthogonal basis. This result will be deduced as a consequence of a general theorem whose proof shows how to construct orthogonal sets in any Euclidean space, finite or infinite dimensional. The construction is called the Gram-Schmidt orthogonalization process, in honor of J. P. Gram (1850-1916) and E. Schmidt (1845-1921).
THEOREM 1.13. ORTHOGONALIZATION THEOREM. Let $x_1, x_2, \ldots,$ be a finite or infinite sequence of elements in a Euclidean space $V$, and let $L(x_1, \ldots, x_k)$ denote the subspace spanned by the first $k$ of these elements. Then there is a corresponding sequence of elements $y_1, y_2, \ldots,$ in $V$ which has the following properties for each integer $k$:
(a) The element $y_k$ is orthogonal to every element in the subspace $L(y_1, \ldots, y_{k-1})$.
(b) The subspace spanned by $y_1, \ldots, y_k$ is the same as that spanned by $x_1, \ldots, x_k$:
$$L(y_1, \ldots, y_k) = L(x_1, \ldots, x_k).$$
(c) The sequence $y_1, y_2, \ldots,$ is unique, except for scalar factors. That is, if $y_1', y_2', \ldots,$ is another sequence of elements in $V$ satisfying properties (a) and (b) for all $k$, then for each $k$ there is a scalar $c_k$ such that $y_k' = c_k y_k$.
Proof. We construct the elements $y_1, y_2, \ldots,$ by induction. To start the process, we take $y_1 = x_1$. Now assume we have constructed $y_1, \ldots, y_r$ so that (a) and (b) are satisfied when $k = r$. Then we define $y_{r+1}$ by the equation
$$(1.14) \qquad y_{r+1} = x_{r+1} - \sum_{i=1}^r a_i y_i,$$
where the scalars $a_1, \ldots, a_r$ are to be determined. For $j \le r$, the inner product of $y_{r+1}$ with $y_j$ is given by
$$(y_{r+1}, y_j) = (x_{r+1}, y_j) - \sum_{i=1}^r a_i (y_i, y_j) = (x_{r+1}, y_j) - a_j (y_j, y_j),$$
since $(y_i, y_j) = 0$ if $i \ne j$. If $y_j \ne 0$, we can make $y_{r+1}$ orthogonal to $y_j$ by taking
$$(1.15) \qquad a_j = \frac{(x_{r+1}, y_j)}{(y_j, y_j)}.$$
If $y_j = 0$, then $y_{r+1}$ is orthogonal to $y_j$ for any choice of $a_j$, and in this case we choose $a_j = 0$. Thus, the element $y_{r+1}$ is well defined and is orthogonal to each of the earlier elements $y_1, \ldots, y_r$. Therefore, it is orthogonal to every element in the subspace
$$L(y_1, \ldots, y_r).$$
This proves (a) when $k = r + 1$.
To prove (b) when $k = r + 1$, we must show that $L(y_1, \ldots, y_{r+1}) = L(x_1, \ldots, x_{r+1})$, given that $L(y_1, \ldots, y_r) = L(x_1, \ldots, x_r)$. The first $r$ elements $y_1, \ldots, y_r$ are in
$$L(x_1, \ldots, x_r),$$
and hence they are in the larger subspace $L(x_1, \ldots, x_{r+1})$. The new element $y_{r+1}$ given by (1.14) is a difference of two elements in $L(x_1, \ldots, x_{r+1})$, so it, too, is in $L(x_1, \ldots, x_{r+1})$. This proves that
$$L(y_1, \ldots, y_{r+1}) \subseteq L(x_1, \ldots, x_{r+1}).$$
Equation (1.14) shows that $x_{r+1}$ is the sum of two elements in $L(y_1, \ldots, y_{r+1})$, so a similar argument gives the inclusion in the other direction:
$$L(x_1, \ldots, x_{r+1}) \subseteq L(y_1, \ldots, y_{r+1}).$$
This proves (b) when $k = r + 1$. Therefore both (a) and (b) are proved by induction on $k$.
Finally we prove (c) by induction on $k$. The case $k = 1$ is trivial. Therefore, assume (c) is true for $k = r$ and consider the element $y'_{r+1}$. Because of (b), this element is in
$$L(y_1, \ldots, y_{r+1}),$$
so we can write
$$y'_{r+1} = \sum_{i=1}^{r+1} c_i y_i = z_r + c_{r+1} y_{r+1},$$
where $z_r \in L(y_1, \ldots, y_r)$. We wish to prove that $z_r = 0$. By property (a), both $y'_{r+1}$ and $c_{r+1} y_{r+1}$ are orthogonal to $z_r$. Therefore, their difference, $z_r$, is orthogonal to $z_r$. In other words, $z_r$ is orthogonal to itself, so $z_r = 0$. This completes the proof of the orthogonalization theorem.
In the foregoing construction, suppose we have $y_{r+1} = 0$ for some $r$. Then (1.14) shows that $x_{r+1}$ is a linear combination of $y_1, \ldots, y_r$, and hence of $x_1, \ldots, x_r$, so the elements $x_1, \ldots, x_{r+1}$ are dependent. In other words, if the first $k$ elements $x_1, \ldots, x_k$ are independent, then the corresponding elements $y_1, \ldots, y_k$ are nonzero. In this case the coefficients $a_i$ in (1.14) are given by (1.15), and the formulas defining $y_1, \ldots, y_k$ become
$$(1.16) \qquad y_1 = x_1, \qquad y_{r+1} = x_{r+1} - \sum_{i=1}^r \frac{(x_{r+1}, y_i)}{(y_i, y_i)}\, y_i \qquad \text{for } r = 1, 2, \ldots, k - 1.$$
These formulas describe the Gram-Schmidt process for constructing an orthogonal set of nonzero elements $y_1, \ldots, y_k$ which spans the same subspace as a given independent set $x_1, \ldots, x_k$. In particular, if $x_1, \ldots, x_k$ is a basis for a finite-dimensional Euclidean space, then $y_1, \ldots, y_k$ is an orthogonal basis for the same space. We can also convert this to an orthonormal basis by normalizing each element $y_i$, that is, by dividing it by its norm. Therefore, as a corollary of Theorem 1.13 we have the following.
THEOREM 1.14. Every finite-dimensional Euclidean space has an orthonormal basis.
If $x$ and $y$ are elements in a Euclidean space, with $y \ne 0$, the element
$$\frac{(x, y)}{(y, y)}\, y$$
is called the projection of $x$ along $y$. In the Gram-Schmidt process (1.16), we construct the element $y_{r+1}$ by subtracting from $x_{r+1}$ the projection of $x_{r+1}$ along each of the earlier elements $y_1, \ldots, y_r$. Figure 1.1 illustrates the construction geometrically in the vector space $V_3$.
FIGURE 1.1 The Gram-Schmidt process in $V_3$. An orthogonal set $\{y_1, y_2, y_3\}$ is constructed from a given independent set $\{x_1, x_2, x_3\}$.
EXAMPLE 1. In $V_4$, find an orthonormal basis for the subspace spanned by the three vectors $x_1 = (1, -1, 1, -1)$, $x_2 = (5, 1, 1, 1)$, and $x_3 = (-3, -3, 1, -3)$.
Solution. Applying the Gram-Schmidt process, we find
$$y_1 = x_1 = (1, -1, 1, -1),$$
$$y_2 = x_2 - \frac{(x_2, y_1)}{(y_1, y_1)}\, y_1 = x_2 - y_1 = (4, 2, 0, 2),$$
$$y_3 = x_3 - \frac{(x_3, y_1)}{(y_1, y_1)}\, y_1 - \frac{(x_3, y_2)}{(y_2, y_2)}\, y_2 = x_3 - y_1 + y_2 = (0, 0, 0, 0).$$
Since $y_3 = 0$, the three vectors $x_1, x_2, x_3$ must be dependent. But since $y_1$ and $y_2$ are nonzero, the vectors $x_1$ and $x_2$ are independent. Therefore $L(x_1, x_2, x_3)$ is a subspace of dimension 2. The set $\{y_1, y_2\}$ is an orthogonal basis for this subspace. Dividing each of $y_1$ and $y_2$ by its norm we get an orthonormal basis consisting of the two vectors
$$\frac{y_1}{\|y_1\|} = \tfrac{1}{2}(1, -1, 1, -1) \qquad \text{and} \qquad \frac{y_2}{\|y_2\|} = \frac{1}{\sqrt{6}}\,(2, 1, 0, 1).$$
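The computation above is easy to reproduce mechanically. Here is a minimal sketch of the process defined by (1.16), written in Python and assuming the NumPy library is available; the function name gram_schmidt is ours, and zero elements are discarded exactly as in the example.

```python
import numpy as np

def gram_schmidt(vectors):
    # Orthogonalize by formula (1.16): subtract from each x the
    # projection of x along every earlier nonzero y.
    ys = []
    for x in vectors:
        y = x.astype(float)
        for prev in ys:
            y = y - (x @ prev) / (prev @ prev) * prev
        if not np.allclose(y, 0):   # y = 0 signals a dependent x
            ys.append(y)
    return ys

x1 = np.array([1, -1, 1, -1])
x2 = np.array([5, 1, 1, 1])
x3 = np.array([-3, -3, 1, -3])

basis = gram_schmidt([x1, x2, x3])
print(basis)                                   # (1,-1,1,-1) and (4,2,0,2)
print([y / np.linalg.norm(y) for y in basis])  # the orthonormal basis above
```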
EXAMPLE 2. The Legendre polynomials. In the linear space of all polynomials, with the inner product $(x, y) = \int_{-1}^1 x(t)\, y(t) \, dt$, consider the infinite sequence $x_0, x_1, x_2, \ldots,$ where $x_n(t) = t^n$. When the orthogonalization theorem is applied to this sequence it yields another sequence of polynomials $y_0, y_1, y_2, \ldots,$ first encountered by the French mathematician A. M. Legendre (1752-1833) in his work on potential theory. The first few polynomials are easily calculated by the Gram-Schmidt process. First of all, we have $y_0(t) = x_0(t) = 1$. Since
$$(y_0, y_0) = \int_{-1}^1 dt = 2 \qquad \text{and} \qquad (x_1, y_0) = \int_{-1}^1 t \, dt = 0,$$
we find that
$$y_1(t) = x_1(t) - \frac{(x_1, y_0)}{(y_0, y_0)}\, y_0(t) = x_1(t) = t.$$
Next, we use the relations
$$(x_2, y_0) = \int_{-1}^1 t^2 \, dt = \frac{2}{3}, \qquad (x_2, y_1) = \int_{-1}^1 t^3 \, dt = 0, \qquad (y_1, y_1) = \int_{-1}^1 t^2 \, dt = \frac{2}{3},$$
to obtain
$$y_2(t) = x_2(t) - \frac{(x_2, y_0)}{(y_0, y_0)}\, y_0(t) - \frac{(x_2, y_1)}{(y_1, y_1)}\, y_1(t) = t^2 - \tfrac{1}{3}.$$
Similarly, we find that
$$y_3(t) = t^3 - \tfrac{3}{5}\, t, \qquad y_4(t) = t^4 - \tfrac{6}{7}\, t^2 + \tfrac{3}{35}, \qquad y_5(t) = t^5 - \tfrac{10}{9}\, t^3 + \tfrac{5}{21}\, t.$$
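The same computation can be carried out symbolically. The following sketch, assuming the SymPy library is available (nothing in the text depends on it), reproduces $y_0, \ldots, y_4$ from the monomials by the recursion (1.16):

```python
import sympy as sp

t = sp.symbols('t')

def inner(f, g):
    # the inner product (x, y) = integral of x(t) y(t) over [-1, 1]
    return sp.integrate(f * g, (t, -1, 1))

ys = []
for n in range(5):
    x = t**n
    # formula (1.16): subtract the projection along each earlier y
    y = x - sum(inner(x, yj) / inner(yj, yj) * yj for yj in ys)
    ys.append(sp.expand(y))

print(ys)   # [1, t, t**2 - 1/3, t**3 - 3*t/5, t**4 - 6*t**2/7 + 3/35]
```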
We shall encounter these polynomials again in Chapter 6 in our further study of differential equations, and we shall prove that
$$y_n(t) = \frac{n!}{(2n)!}\, \frac{d^n}{dt^n}\, (t^2 - 1)^n.$$
The polynomials
$$P_n(t) = \frac{(2n)!}{2^n (n!)^2}\, y_n(t) = \frac{1}{2^n n!}\, \frac{d^n}{dt^n}\, (t^2 - 1)^n$$
are known as the Legendre polynomials. The polynomials in the corresponding orthonormal
sequence $\varphi_0, \varphi_1, \varphi_2, \ldots,$ given by $\varphi_n = y_n / \|y_n\|$ are called the normalized Legendre polynomials. From the formulas for $y_0, \ldots, y_5$ given above, we find that
$$\varphi_0(t) = \sqrt{\tfrac{1}{2}}, \qquad \varphi_1(t) = \sqrt{\tfrac{3}{2}}\, t, \qquad \varphi_2(t) = \tfrac{1}{2}\sqrt{\tfrac{5}{2}}\, (3t^2 - 1), \qquad \varphi_3(t) = \tfrac{1}{2}\sqrt{\tfrac{7}{2}}\, (5t^3 - 3t),$$
$$\varphi_4(t) = \tfrac{1}{8}\sqrt{\tfrac{9}{2}}\, (35t^4 - 30t^2 + 3), \qquad \varphi_5(t) = \tfrac{1}{8}\sqrt{\tfrac{11}{2}}\, (63t^5 - 70t^3 + 15t).$$
1.15 Orthogonal complements. Projections
Let $V$ be a Euclidean space and let $S$ be a finite-dimensional subspace. We wish to consider the following type of approximation problem: Given an element $x$ in $V$, to determine an element in $S$ whose distance from $x$ is as small as possible. The distance between two elements $x$ and $y$ is defined to be the norm $\|x - y\|$.
Before discussing this problem in its general form, we consider a special case, illustrated in Figure 1.2. Here $V$ is the vector space $V_3$ and $S$ is a two-dimensional subspace, a plane through the origin. Given $x$ in $V$, the problem is to find, in the plane $S$, that point $s$ nearest to $x$.
If $x \in S$, then clearly $s = x$ is the solution. If $x$ is not in $S$, then the nearest point $s$ is obtained by dropping a perpendicular from $x$ to the plane. This simple example suggests an approach to the general approximation problem and motivates the discussion that follows.
DEFINITION. Let $S$ be a subset of a Euclidean space $V$. An element in $V$ is said to be orthogonal to $S$ if it is orthogonal to every element of $S$. The set of all elements orthogonal to $S$ is denoted by $S^\perp$ and is called "S perpendicular."
It is a simple exercise to verify that $S^\perp$ is a subspace of $V$, whether or not $S$ itself is one. In case $S$ is a subspace, then $S^\perp$ is called the orthogonal complement of $S$.
EXAMPLE. If $S$ is a plane through the origin, as shown in Figure 1.2, then $S^\perp$ is a line through the origin perpendicular to this plane. This example also gives a geometric interpretation for the next theorem.
FIGURE 1.2 Geometric interpretation of the orthogonal decomposition theorem in $V_3$.
THEOREM 1.15. ORTHOGONAL DECOMPOSITION THEOREM. Let $V$ be a Euclidean space and let $S$ be a finite-dimensional subspace of $V$. Then every element $x$ in $V$ can be represented uniquely as a sum of two elements, one in $S$ and one in $S^\perp$. That is, we have
$$(1.17) \qquad x = s + s^\perp, \qquad \text{where } s \in S \text{ and } s^\perp \in S^\perp.$$
Moreover, the norm of $x$ is given by the Pythagorean formula
$$(1.18) \qquad \|x\|^2 = \|s\|^2 + \|s^\perp\|^2.$$
Proof. First we prove that an orthogonal decomposition (1.17) actually exists. Since $S$ is finite-dimensional, it has a finite orthonormal basis, say $\{e_1, \ldots, e_n\}$. Given $x$, define the elements $s$ and $s^\perp$ as follows:
$$(1.19) \qquad s = \sum_{i=1}^n (x, e_i)\, e_i, \qquad s^\perp = x - s.$$
Note that each term $(x, e_i)\, e_i$ is the projection of $x$ along $e_i$. The element $s$ is the sum of the projections of $x$ along each basis element. Since $s$ is a linear combination of the basis elements, $s$ lies in $S$. The definition of $s^\perp$ shows that Equation (1.17) holds. To prove that $s^\perp$ lies in $S^\perp$, we consider the inner product of $s^\perp$ and any basis element $e_j$. We have
$$(s^\perp, e_j) = (x - s, e_j) = (x, e_j) - (s, e_j).$$
But from (1.19), we find that $(s, e_j) = (x, e_j)$, so $s^\perp$ is orthogonal to $e_j$. Therefore $s^\perp$ is orthogonal to every element in $S$, which means that $s^\perp \in S^\perp$.
Next we prove that the orthogonal decomposition (1.17) is unique. Suppose that $x$ has two such representations, say
$$(1.20) \qquad x = s + s^\perp \qquad \text{and} \qquad x = t + t^\perp,$$
where $s$ and $t$ are in $S$, and $s^\perp$ and $t^\perp$ are in $S^\perp$. We wish to prove that $s = t$ and $s^\perp = t^\perp$. From (1.20), we have $s - t = t^\perp - s^\perp$, so we need only prove that $s - t = 0$. But $s - t \in S$ and $t^\perp - s^\perp \in S^\perp$, so $s - t$ is both orthogonal to $t^\perp - s^\perp$ and equal to $t^\perp - s^\perp$. Since the zero element is the only element orthogonal to itself, we must have $s - t = 0$. This shows that the decomposition is unique.
Finally, we prove that the norm of $x$ is given by the Pythagorean formula. We have
$$\|x\|^2 = (x, x) = (s + s^\perp, s + s^\perp) = (s, s) + (s^\perp, s^\perp),$$
the remaining terms being zero since $s$ and $s^\perp$ are orthogonal. This proves (1.18).
DEFINITION. Let $S$ be a finite-dimensional subspace of a Euclidean space $V$, and let $\{e_1, \ldots, e_n\}$ be an orthonormal basis for $S$. If $x \in V$, the element $s$ defined by the equation
$$s = \sum_{i=1}^n (x, e_i)\, e_i$$
is called the projection of $x$ on the subspace $S$.
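A small numerical illustration of this definition, assuming NumPy is available (the helper name project is ours): with an orthonormal basis of $S$ in hand, the projection and the decomposition of Theorem 1.15 each take one line.

```python
import numpy as np

def project(x, orthonormal_basis):
    # s = sum of (x, e_i) e_i over an orthonormal basis of S
    return sum((x @ e) * e for e in orthonormal_basis)

# S = the xy-plane in V3, as in Figure 1.2
e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])

x = np.array([3.0, 4.0, 5.0])
s = project(x, [e1, e2])     # component in S
s_perp = x - s               # component in S-perp

print(s, s_perp)                                    # [3 4 0] [0 0 5]
print(np.isclose(x @ x, s @ s + s_perp @ s_perp))   # Pythagorean formula (1.18)
```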
We prove next that the projection of x on S is the solution to the approximation problem stated at the beginning of this section.
1.16 Best approximation of elements in a Euclidean space by elements in a finite-dimensional subspace
THEOREM 1.16. APPROXIMATION THEOREM. Let $S$ be a finite-dimensional subspace of a Euclidean space $V$, and let $x$ be any element of $V$. Then the projection of $x$ on $S$ is nearer to $x$ than any other element of $S$. That is, if $s$ is the projection of $x$ on $S$, we have
$$\|x - s\| \le \|x - t\|$$
for all $t$ in $S$; the equality sign holds if and only if $t = s$.
Proof. By Theorem 1.15 we can write $x = s + s^\perp$, where $s \in S$ and $s^\perp \in S^\perp$. Then, for any $t$ in $S$, we have
$$x - t = (x - s) + (s - t).$$
Since $s - t \in S$ and $x - s = s^\perp \in S^\perp$, this is an orthogonal decomposition of $x - t$, so its norm is given by the Pythagorean formula
$$\|x - t\|^2 = \|x - s\|^2 + \|s - t\|^2.$$
But $\|s - t\|^2 \ge 0$, so we have $\|x - t\|^2 \ge \|x - s\|^2$, with equality holding if and only if $s = t$. This completes the proof.
EXAMPLE 1. Approximation of continuous functions on $[0, 2\pi]$ by trigonometric polynomials. Let $V = C(0, 2\pi)$, the linear space of all real functions continuous on the interval $[0, 2\pi]$, and define an inner product by the equation $(f, g) = \int_0^{2\pi} f(x)\, g(x) \, dx$. In Section 1.12 we exhibited an orthonormal set of trigonometric functions $\varphi_0, \varphi_1, \varphi_2, \ldots,$ where
$$(1.21) \qquad \varphi_0(x) = \frac{1}{\sqrt{2\pi}}, \qquad \varphi_{2k-1}(x) = \frac{\cos kx}{\sqrt{\pi}}, \qquad \varphi_{2k}(x) = \frac{\sin kx}{\sqrt{\pi}}, \qquad \text{for } k \ge 1.$$
The $2n + 1$ elements $\varphi_0, \varphi_1, \ldots, \varphi_{2n}$ span a subspace $S$ of dimension $2n + 1$. The elements of $S$ are called trigonometric polynomials.
If $f \in C(0, 2\pi)$, let $f_n$ denote the projection of $f$ on the subspace $S$. Then we have
$$(1.22) \qquad f_n = \sum_{k=0}^{2n} (f, \varphi_k)\, \varphi_k, \qquad \text{where } (f, \varphi_k) = \int_0^{2\pi} f(x)\, \varphi_k(x) \, dx.$$
The numbers $(f, \varphi_k)$ are called Fourier coefficients of $f$. Using the formulas in (1.21), we can rewrite (1.22) in the form
$$(1.23) \qquad f_n(x) = \tfrac{1}{2} a_0 + \sum_{k=1}^n (a_k \cos kx + b_k \sin kx),$$
where
$$a_k = \frac{1}{\pi} \int_0^{2\pi} f(x) \cos kx \, dx, \qquad b_k = \frac{1}{\pi} \int_0^{2\pi} f(x) \sin kx \, dx,$$
for $k = 0, 1, 2, \ldots, n$. The approximation theorem tells us that the trigonometric polynomial in (1.23) approximates $f$ better than any other trigonometric polynomial in $S$, in the sense that the norm $\|f - f_n\|$ is as small as possible.
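The coefficients in (1.23) lend themselves to numerical quadrature. A hedged sketch (not in the original text), assuming NumPy and SciPy are available; for $f(x) = x$ the exact values are $a_0 = 2\pi$, $a_k = 0$, and $b_k = -2/k$.

```python
import numpy as np
from scipy.integrate import quad

def fourier_coefficients(f, n):
    # a_k and b_k from (1.23), computed by numerical quadrature
    a = [quad(lambda x: f(x) * np.cos(k * x), 0, 2 * np.pi)[0] / np.pi
         for k in range(n + 1)]
    b = [quad(lambda x: f(x) * np.sin(k * x), 0, 2 * np.pi)[0] / np.pi
         for k in range(1, n + 1)]
    return a, b

a, b = fourier_coefficients(lambda x: x, 3)
print(np.round(a, 6))   # [6.283185  0.  0.  0.]   (a_0 = 2*pi)
print(np.round(b, 6))   # [-2. -1. -0.666667]      (b_k = -2/k)
```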
EXAMPLE 2. Approximation of continuous functions on $[-1, 1]$ by polynomials of degree $\le n$. Let $V = C(-1, 1)$, the space of real continuous functions on $[-1, 1]$, and let $(f, g) = \int_{-1}^1 f(x)\, g(x) \, dx$. The $n + 1$ normalized Legendre polynomials $\varphi_0, \varphi_1, \ldots, \varphi_n$, introduced in Section 1.14, span a subspace $S$ of dimension $n + 1$ consisting of all polynomials of degree $\le n$. If $f \in C(-1, 1)$, let $f_n$ denote the projection of $f$ on $S$. Then we have
$$f_n = \sum_{k=0}^n (f, \varphi_k)\, \varphi_k, \qquad \text{where } (f, \varphi_k) = \int_{-1}^1 f(t)\, \varphi_k(t) \, dt.$$
This is the polynomial of degree $\le n$ for which the norm $\|f - f_n\|$ is smallest. For example, when $f(x) = \sin \pi x$, the coefficients $(f, \varphi_k)$ are given by
$$(f, \varphi_k) = \int_{-1}^1 \sin \pi t\, \varphi_k(t) \, dt.$$
In particular, we have $(f, \varphi_0) = 0$ and
$$(f, \varphi_1) = \sqrt{\tfrac{3}{2}} \int_{-1}^1 t \sin \pi t \, dt = \frac{\sqrt{6}}{\pi}.$$
Therefore the linear polynomial $f_1(t)$ which is nearest to $\sin \pi t$ on $[-1, 1]$ is
$$f_1(t) = (f, \varphi_1)\, \varphi_1(t) = \frac{3t}{\pi}.$$
Since $(f, \varphi_2) = 0$, this is also the nearest quadratic approximation.
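These coefficients are easy to confirm numerically. A brief check, again assuming NumPy and SciPy:

```python
import numpy as np
from scipy.integrate import quad

f = lambda t: np.sin(np.pi * t)
phi1 = lambda t: np.sqrt(1.5) * t                    # normalized Legendre phi_1

c0 = quad(lambda t: f(t) * np.sqrt(0.5), -1, 1)[0]   # (f, phi_0)
c1 = quad(lambda t: f(t) * phi1(t), -1, 1)[0]        # (f, phi_1)

print(np.isclose(c0, 0.0, atol=1e-12))               # True
print(np.isclose(c1, np.sqrt(6) / np.pi))            # True
print(np.isclose(c1 * np.sqrt(1.5), 3 / np.pi))      # f_1(t) = 3t/pi
```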
1.17 Exercises
1. In each case, find an orthonormal basis for the subspace of $V_3$ spanned by the given vectors.
(a) $x_1 = (1, 1, 1)$, $x_2 = (1, 0, 1)$, $x_3 = (3, 2, 3)$.
(b) $x_1 = (1, 1, 1)$, $x_2 = (-1, 1, -1)$, $x_3 = (1, 0, 1)$.
2. In each case, find an orthonormal basis for the subspace of $V_4$ spanned by the given vectors.
(a) $x_1 = (1, 1, 0, 0)$, $x_2 = (0, 1, 1, 0)$, $x_3 = (0, 0, 1, 1)$, $x_4 = (1, 0, 0, 1)$.
(b) $x_1 = (1, 1, 0, 1)$, $x_2 = (1, 0, 2, 1)$, $x_3 = (1, 2, -2, 1)$.
3. In the real linear space $C(0, \pi)$, with inner product $(x, y) = \int_0^\pi x(t)\, y(t) \, dt$, let $x_n(t) = \cos nt$ for $n = 0, 1, 2, \ldots$. Prove that the functions $y_0, y_1, y_2, \ldots,$ given by
$$y_0(t) = \frac{1}{\sqrt{\pi}} \qquad \text{and} \qquad y_n(t) = \sqrt{\frac{2}{\pi}} \cos nt \quad \text{for } n \ge 1,$$
form an orthonormal set spanning the same subspace as $x_0, x_1, x_2, \ldots$.
4. In the linear space of all real polynomials, with inner product $(x, y) = \int_0^1 x(t)\, y(t) \, dt$, let $x_n(t) = t^n$ for $n = 0, 1, 2, \ldots$. Prove that the functions
$$y_0(t) = 1, \qquad y_1(t) = \sqrt{3}\, (2t - 1), \qquad y_2(t) = \sqrt{5}\, (6t^2 - 6t + 1)$$
form an orthonormal set spanning the same subspace as $\{x_0, x_1, x_2\}$.
5. Let $V$ be the linear space of all real functions $f$ continuous on $[0, +\infty)$ and such that the integral $\int_0^\infty e^{-t} f^2(t) \, dt$ converges. Define $(f, g) = \int_0^\infty e^{-t} f(t) g(t) \, dt$, and let $y_0, y_1, y_2, \ldots,$ be the set obtained by applying the Gram-Schmidt process to $x_0, x_1, x_2, \ldots,$ where $x_n(t) = t^n$ for $n \ge 0$. Prove that $y_0(t) = 1$, $y_1(t) = t - 1$, $y_2(t) = t^2 - 4t + 2$, $y_3(t) = t^3 - 9t^2 + 18t - 6$.
6. In the real linear space $C(1, 3)$ with inner product $(f, g) = \int_1^3 f(x) g(x) \, dx$, let $f(x) = 1/x$ and show that the constant polynomial $g$ nearest to $f$ is $g = \tfrac{1}{2} \log 3$. Compute $\|g - f\|^2$ for this $g$.
7. In the real linear space $C(0, 2)$ with inner product $(f, g) = \int_0^2 f(x) g(x) \, dx$, let $f(x) = e^x$ and show that the constant polynomial $g$ nearest to $f$ is $g = \tfrac{1}{2}(e^2 - 1)$. Compute $\|g - f\|^2$ for this $g$.
8. In the real linear space $C(-1, 1)$ with inner product $(f, g) = \int_{-1}^1 f(x) g(x) \, dx$, let $f(x) = e^x$ and find the linear polynomial $g$ nearest to $f$. Compute $\|g - f\|^2$ for this $g$.
9. In the real linear space $C(0, 2\pi)$ with inner product $(f, g) = \int_0^{2\pi} f(x) g(x) \, dx$, let $f(x) = x$. In the subspace spanned by $u_0(x) = 1$, $u_1(x) = \cos x$, $u_2(x) = \sin x$, find the trigonometric polynomial nearest to $f$.
10. In the linear space $V$ of Exercise 5, let $f(x) = e^{-x}$ and find the linear polynomial that is nearest to $f$.
2
LINEAR TRANSFORMATIONS AND MATRICES
2.1 Linear transformations
One of the ultimate goals of analysis is a comprehensive study of functions whose domains and ranges are subsets of linear spaces. Such functions are called transformations, mappings, or operators. This chapter treats the simplest examples, called linear transformations, which occur in all branches of mathematics. Properties of more general transformations are often obtained by approximating them by linear transformations.
First we introduce some notation and terminology concerning arbitrary functions. Let V and W be two sets. The symbol
$$T\colon V \to W$$
will be used to indicate that T is a function whose domain is V and whose values are in W. For each x in V, the element T(x) in W is called the image of x under T, and we say that T maps x onto T(x). If A is any subset of V, the set of all images T(x) for x in A is called the image of A under T and is denoted by T(A). The image of the domain V, T(V), is the range of T.
Now we assume that $V$ and $W$ are linear spaces having the same set of scalars, and we define a linear transformation as follows.
DEFINITION. If $V$ and $W$ are linear spaces, a function $T\colon V \to W$ is called a linear transformation of $V$ into $W$ if it has the following two properties:
(a) $T(x + y) = T(x) + T(y)$ for all $x$ and $y$ in $V$,
(b) $T(cx) = cT(x)$ for all $x$ in $V$ and all scalars $c$.
These properties are verbalized by saying that $T$ preserves addition and multiplication by scalars. The two properties can be combined into one formula which states that
$$T(ax + by) = aT(x) + bT(y)$$
for all $x, y$ in $V$ and all scalars $a$ and $b$. By induction, we also have the more general relation
$$T\!\left( \sum_{i=1}^n a_i x_i \right) = \sum_{i=1}^n a_i T(x_i)$$
for any $n$ elements $x_1, \ldots, x_n$ in $V$ and any $n$ scalars $a_1, \ldots, a_n$.
The reader can easily verify that the following examples are linear transformations.
EXAMPLE 1. The identity transformation. The transformation $T\colon V \to V$, where $T(x) = x$ for each $x$ in $V$, is called the identity transformation and is denoted by $I$ or by $I_V$.
EXAMPLE 2. The zero transformation. The transformation $T\colon V \to V$ which maps each element of $V$ onto $0$ is called the zero transformation and is denoted by $O$.
EXAMPLE 3. Multiplication by a fixed scalar $c$. Here we have $T\colon V \to V$, where $T(x) = cx$ for all $x$ in $V$. When $c = 1$, this is the identity transformation. When $c = 0$, it is the zero transformation.
EXAMPLE 4. Linear equations. Let $V = V_n$ and $W = V_m$. Given $mn$ real numbers $a_{ik}$, where $i = 1, 2, \ldots, m$ and $k = 1, 2, \ldots, n$, define $T\colon V_n \to V_m$ as follows: $T$ maps each vector $x = (x_1, \ldots, x_n)$ in $V_n$ onto the vector $y = (y_1, \ldots, y_m)$ in $V_m$ according to the equations
$$y_i = \sum_{k=1}^n a_{ik} x_k \qquad \text{for } i = 1, 2, \ldots, m.$$
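In modern notation this transformation is just multiplication by the $m \times n$ matrix $A = (a_{ik})$. A hedged NumPy sketch with arbitrary sample entries:

```python
import numpy as np

# Example 4 with m = 2, n = 3: y_i = sum over k of a_ik x_k, i.e. y = A x
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

def T(x):
    return A @ x

x = np.array([1.0, 0.0, -1.0])
y = np.array([2.0, 1.0, 1.0])

# linearity: T(ax + by) = a T(x) + b T(y)
print(np.allclose(T(3 * x + 2 * y), 3 * T(x) + 2 * T(y)))   # True
```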
EXAMPLE 5. Inner product with a fixed element. Let $V$ be a real Euclidean space. For a fixed element $z$ in $V$, define $T\colon V \to \mathbf{R}$ as follows: If $x \in V$, then $T(x) = (x, z)$, the inner product of $x$ with $z$.
EXAMPLE 6. Projection on a subspace. Let V be a Euclidean space and let S be a finite- dimensional subspace of V. Define T: V + S as follows: If x E V, then T(x) is the projection of x on S.
EXAMPLE 7. The differentiation operator. Let $V$ be the linear space of all real functions $f$ differentiable on an open interval $(a, b)$. The linear transformation which maps each function $f$ in $V$ onto its derivative $f'$ is called the differentiation operator and is denoted by $D$. Thus, we have $D\colon V \to W$, where $D(f) = f'$ for each $f$ in $V$. The space $W$ consists of all derivatives $f'$.
EXAMPLE 8. The integration operator. Let $V$ be the linear space of all real functions continuous on an interval $[a, b]$. If $f \in V$, define $g = T(f)$ to be that function in $V$ given by
$$g(x) = \int_a^x f(t) \, dt \qquad \text{if } a \le x \le b.$$
This transformation T is called the integration operator.
2.2 Null space and range
In this section, $T$ denotes a linear transformation of a linear space $V$ into a linear space $W$.
THEOREM 2.1. The set T(V) (the range of T) is a subspace of W. Moreover, T maps the zero element of V onto the zero element of W.
Proof. To prove that $T(V)$ is a subspace of $W$, we need only verify the closure axioms. Take any two elements of $T(V)$, say $T(x)$ and $T(y)$. Then $T(x) + T(y) = T(x + y)$, so $T(x) + T(y)$ is in $T(V)$. Also, for any scalar $c$ we have $cT(x) = T(cx)$, so $cT(x)$ is in $T(V)$. Therefore, $T(V)$ is a subspace of $W$. Taking $c = 0$ in the relation $T(cx) = cT(x)$, we find that $T(0) = 0$.
DEFINITION. The set of all elements in V that T maps onto 0 is called the null space of T and is denoted by N(T). Thus, we have
$$N(T) = \{ x \mid x \in V \text{ and } T(x) = 0 \}.$$
The null space is sometimes called the kernel of T.
THEOREM 2.2. The null space of T is a subspace of V.
Proof. If x and y are in N(T), then so are x + y and cx for all scalars c, since
$$T(x + y) = T(x) + T(y) = 0 \qquad \text{and} \qquad T(cx) = cT(x) = 0.$$
The following examples describe the null spaces of the linear transformations given in Section 2.1.
EXAMPLE 1. Identity transformation. The null space is {0}, the subspace consisting of the zero element alone.
EXAMPLE 2. Zero transformation. Since every element of $V$ is mapped onto zero, the null space is $V$ itself.
EXAMPLE 3. Multiplication by a fixed scalar $c$. If $c \ne 0$, the null space contains only $0$. If $c = 0$, the null space is $V$.
EXAMPLE 4. Linear equations. The null space consists of all vectors $(x_1, \ldots, x_n)$ in $V_n$ for which
$$\sum_{k=1}^n a_{ik} x_k = 0 \qquad \text{for } i = 1, 2, \ldots, m.$$
EXAMPLE 5. Inner product with a fixed element $z$. The null space consists of all elements in $V$ orthogonal to $z$.
EXAMPLE 6. Projection on a subspace $S$. If $x \in V$, we have the unique orthogonal decomposition $x = s + s^\perp$ (by Theorem 1.15). Since $T(x) = s$, we have $T(x) = 0$ if and only if $x = s^\perp$. Therefore, the null space is $S^\perp$, the orthogonal complement of $S$.
EXAMPLE 7. Differentiation operator. The null space consists of all functions that are constant on the given interval.
EXAMPLE 8. Integration operator. The null space contains only the zero function.
2.3 Nullity and rank
Again in this section $T$ denotes a linear transformation of a linear space $V$ into a linear space $W$. We are interested in the relation between the dimensionality of $V$, of the null space $N(T)$, and of the range $T(V)$. If $V$ is finite-dimensional, then the null space is also finite-dimensional since it is a subspace of $V$. The dimension of $N(T)$ is called the nullity of $T$. In the next theorem, we prove that the range is also finite-dimensional; its dimension is called the rank of $T$.
THEOREM 2.3. NULLITY PLUS RANK THEOREM. If V is finite-dimensional, then T(V) is also finite-dimensional, and we have
$$(2.1) \qquad \dim N(T) + \dim T(V) = \dim V.$$
In other words, the nullity plus the rank of a linear transformation is equal to the dimension of its domain.
Proof. Let $n = \dim V$ and let $e_1, \ldots, e_k$ be a basis for $N(T)$, where $k = \dim N(T) \le n$. By Theorem 1.7, these elements are part of some basis for $V$, say the basis
$$(2.2) \qquad e_1, \ldots, e_k, e_{k+1}, \ldots, e_{k+r},$$
where $k + r = n$. We shall prove that the $r$ elements
$$(2.3) \qquad T(e_{k+1}), \ldots, T(e_{k+r})$$
form a basis for $T(V)$, thus proving that $\dim T(V) = r$. Since $k + r = n$, this also proves (2.1).
First we show that the $r$ elements in (2.3) span $T(V)$. If $y \in T(V)$, we have $y = T(x)$ for some $x$ in $V$, and we can write $x = c_1 e_1 + \cdots + c_{k+r} e_{k+r}$. Hence, we have
$$y = T(x) = \sum_{i=1}^{k+r} c_i T(e_i) = \sum_{i=k+1}^{k+r} c_i T(e_i),$$
since $T(e_1) = \cdots = T(e_k) = 0$. This shows that the elements in (2.3) span $T(V)$.
Now we show that these elements are independent. Suppose that there are scalars $c_{k+1}, \ldots, c_{k+r}$ such that
$$\sum_{i=k+1}^{k+r} c_i T(e_i) = 0.$$
This implies that
$$T\!\left( \sum_{i=k+1}^{k+r} c_i e_i \right) = 0,$$
so the element $x = c_{k+1} e_{k+1} + \cdots + c_{k+r} e_{k+r}$ is in the null space $N(T)$. This means there
are scalars $c_1, \ldots, c_k$ such that $x = c_1 e_1 + \cdots + c_k e_k$, so we have
$$x - x = \sum_{i=1}^k c_i e_i - \sum_{i=k+1}^{k+r} c_i e_i = 0.$$
But since the elements in (2.2) are independent, this implies that all the scalars $c_i$ are zero. Therefore, the elements in (2.3) are independent.
Note: If $V$ is infinite-dimensional, then at least one of $N(T)$ or $T(V)$ is infinite-dimensional. A proof of this fact is outlined in Exercise 30 of Section 2.4.
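For the matrix transformations of Example 4 of Section 2.1, Equation (2.1) can be checked directly. A small sketch, assuming NumPy is available (the sample matrix is arbitrary):

```python
import numpy as np

# T(x) = A x maps V_3 into V_3; Theorem 2.3 says nullity + rank = 3
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],    # twice the first row, so rank < 3
              [1.0, 0.0, 1.0]])

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank
print(rank, nullity)                    # 2 1
print(rank + nullity == A.shape[1])     # True
```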
2.4 Exercises
In each of Exercises 1 through 10, a transformation $T\colon V_2 \to V_2$ is defined by the formula given for $T(x, y)$, where $(x, y)$ is an arbitrary point in $V_2$. In each case determine whether $T$ is linear. If $T$ is linear, describe its null space and range, and compute its nullity and rank.
1. $T(x, y) = (y, x)$.
2. $T(x, y) = (x, -y)$.
3. $T(x, y) = (x, 0)$.
4. $T(x, y) = (x, x)$.
5. $T(x, y) = (x^2, y^2)$.
6. $T(x, y) = (e^x, e^y)$.
7. $T(x, y) = (x, 1)$.
8. $T(x, y) = (x + 1, y + 1)$.
9. $T(x, y) = (x - y, x + y)$.
10. $T(x, y) = (2x - y, x + y)$.
Do the same as above for each of Exercises 11 through 15 if the transformation $T\colon V_2 \to V_2$ is described as indicated.
11. $T$ rotates every point through the same angle $\varphi$ about the origin. That is, $T$ maps a point with polar coordinates $(r, \theta)$ onto the point with polar coordinates $(r, \theta + \varphi)$, where $\varphi$ is fixed. Also, $T$ maps $0$ onto itself.
12. $T$ maps each point onto its reflection with respect to a fixed line through the origin.
13. $T$ maps every point onto the point $(1, 1)$.
14. $T$ maps each point with polar coordinates $(r, \theta)$ onto the point with polar coordinates $(2r, \theta)$. Also, $T$ maps $0$ onto itself.
15. $T$ maps each point with polar coordinates $(r, \theta)$ onto the point with polar coordinates $(r, 2\theta)$. Also, $T$ maps $0$ onto itself.
Do the same as above in each of Exercises 16 through 23 if a transformation $T\colon V_3 \to V_3$ is defined by the formula given for $T(x, y, z)$, where $(x, y, z)$ is an arbitrary point of $V_3$.
16. $T(x, y, z) = (z, y, x)$.
17. $T(x, y, z) = (x, y, 0)$.
18. $T(x, y, z) = (x, 2y, 3z)$.
19. $T(x, y, z) = (x, y, 1)$.
20. $T(x, y, z) = (x + 1, y + 1, z - 1)$.
21. $T(x, y, z) = (x + 1, y + 2, z + 3)$.
22. $T(x, y, z) = (x, y^2, z^3)$.
23. $T(x, y, z) = (x + z, 0, x + y)$.
In each of Exercises 24 through 27, a transformation $T\colon V \to V$ is described as indicated. In each case, determine whether $T$ is linear. If $T$ is linear, describe its null space and range, and compute the nullity and rank when they are finite.
24. Let $V$ be the linear space of all real polynomials $p(x)$ of degree $\le n$. If $p \in V$, $q = T(p)$ means that $q(x) = p(x + 1)$ for all real $x$.
25. Let $V$ be the linear space of all real functions differentiable on the open interval $(-1, 1)$. If $f \in V$, $g = T(f)$ means that $g(x) = x f'(x)$ for all $x$ in $(-1, 1)$.
26. Let $V$ be the linear space of all real functions continuous on $[a, b]$. If $f \in V$, $g = T(f)$ means that
$$g(x) = \int_a^b f(t) \sin(x - t) \, dt \qquad \text{for } a \le x \le b.$$
27. Let $V$ be the space of all real functions twice differentiable on an open interval $(a, b)$. If $y \in V$, define $T(y) = y'' + Py' + Qy$, where $P$ and $Q$ are fixed constants.
28. Let $V$ be the linear space of all real convergent sequences $\{x_n\}$. Define a transformation $T\colon V \to V$ as follows: If $x = \{x_n\}$ is a convergent sequence with limit $a$, let $T(x) = \{y_n\}$, where $y_n = a - x_n$ for $n \ge 1$.