Chapter I. Vectors and Tensors
Yuxi Zheng
Abstract.
We assume the students know these materials:
Basic vector algebra: addition, subtraction, and multiplication by a scalar, linear
dependence, linear independence, basis, expansion of a vector with respect to other
vectors, inner product.
We will cover the topics:
Advanced vector algebra: Projection of a vector onto an axis; vector product, product
of three vectors;
Brief Introduction
As human beings learned about the natural world around them, they invented
words for description, and later introduced units of measurement to quantify their
descriptions. They tried various ways, including imagination, observation, and setting
up laboratories, to find out the mechanisms of motion. Mathematics introduces
symbols (numbers, variables (length, area, volume), coordinates, functions,
vectors, tensors, rates of change, equations, inequalities, etc.) to model natural
phenomena. With enough symbols accumulated, a branch of mathematics, called pure math,
is devoted to the study of these symbols. The study of these symbols (rules of opera-
tions) with an aim toward applications to the natural world is called Applied Mathematics.
A clear distinction between pure and applied mathematics is hard to draw. However,
the application of mathematics is easily seen as the use of developed mathematical
tools in the sciences, engineering, and other fields.
Our goal will be learning the basic tools of mathematics that had and will con-
tinue to have applications. These tools will be introduced most often with some
background on their origin. Applications often follow. Our emphasis is on the
math: principles and essential calculations.
1.1. Vectors
Review. Two example vectors are
A = (1, 1, 1), B = (0, −1, 2).
The notation for a vector here is a boldface letter; it can also be written as a letter
with an arrow on top of it.
Scalar multiplication
2A = (2, 2, 2).
Addition
2A + B = (2, 2, 2) + (0,−1, 2) = (2, 1, 4).
The zero vector
0 = (0, 0, 0).
The subtraction
A−B = (1, 2,−1).
Other examples
C = (1, 2), D = (0, 3).
And
2C−D = (2, 4) − (0, 3) = (2, 1).
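The componentwise operations above can be checked with a short script. This is a minimal sketch; the helper names `add` and `scale` are our own, not from the text.

```python
# Componentwise vector operations on tuples (illustrative helpers).
def add(A, B):
    return tuple(a + b for a, b in zip(A, B))

def scale(c, A):
    return tuple(c * a for a in A)

A = (1, 1, 1)
B = (0, -1, 2)
print(add(scale(2, A), B))              # 2A + B = (2, 1, 4)
C = (1, 2)
D = (0, 3)
print(add(scale(2, C), scale(-1, D)))   # 2C - D = (2, 1)
```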
Geometric Representation.
(Figure 1.1.1. Representation of C.)
(Figure 1.1.2. Representation of A) (omitted)
(Figure 1.1.3. Addition and subtraction of C + D and C−D.)
(Figure 1.1.4. Scalar multiplication of C.) (omitted)
Length (magnitude) of a vector B = (x1, x2, x3):
|B| = √(x1² + x2² + x3²).
A unit vector in the direction of B:
B/|B| = (0, −1, 2)/√(0² + (−1)² + 2²) = (0, −1/√5, 2/√5).
(Figure 1.1.5. A unit vector. ) (omitted)
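The magnitude and unit-vector formulas above translate directly into code. A minimal sketch; the helper names are our own.

```python
import math

# Magnitude |B| and the unit vector B/|B| for B = (0, -1, 2).
def norm(B):
    return math.sqrt(sum(b * b for b in B))

def unit(B):
    n = norm(B)
    return tuple(b / n for b in B)

B = (0, -1, 2)
print(norm(B))   # sqrt(5), about 2.236
print(unit(B))   # (0.0, -1/sqrt(5), 2/sqrt(5))
```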
A nonzero vector B yields naturally an axis: the line that passes through B with
the same direction of B.
(Figure 1.1.6. An axis associated with a given nonzero vector B.) (omitted)
1.1.1. Projection of a vector A onto an axis u: |u| = 1.
Definition: ProjuA = (|A| cos φ)u.
Here φ represents the angle between the two vectors A and u.
Figure 1.1.7. Projections with acute and obtuse angles.
If φ is between −π/2 and π/2, then the cosine is positive and the projection is in the
same direction as u. If φ is between π/2 and 3π/2, the projection is in the opposite
direction.
We see that if u1 and u2 are orthogonal axes, the projection can be used to
decompose A:
A = Proj_{u1} A + Proj_{u2} A.
For a general nonzero vector B, the projection onto B is
Proj_B A = (|A| cos φ) B/|B|.
1.1.2. Inner product. (a.k.a. scalar product, dot product)
Example. Let A = (1,−1, 2), B = (2, 3,−5). Then
A ·B = 1 · 2 + (−1) · 3 + 2 · (−5) = −11. (1)
The inner product can be used to express the projection. Formula:
Proj_B A = (A · B / |B|²) B.   (2)
The proof will be given shortly.
Now let us examine an example. Let
i1 = (1, 0, 0), i2 = (0, 1, 0), i3 = (0, 0, 1).
Then,
i1 · i1 = 1, i2 · i2 = 1, i3 · i3 = 1,
i1 · i2 = 0, i1 · i3 = 0, i2 · i3 = 0.   (3)
If a set of vectors satisfies (3), then the set is called orthonormal. Conditions (3) are
called orthonormal conditions.
Although the inner product in (1) is simple and direct, there is another popular
definition, which is equivalent. It is
A ·B = |A||B| cos(A,B) (4)
where (A,B) denotes the angle from A to B.
Figure 1.1.8. Angle and inner product.
Properties: A ·B = B ·A, (2A) ·B = 2(A ·B).
Distributive law: A · (B + C) = A ·B + A ·C.
We give a one-line proof of formula (2):
Proj_B A = (|A| cos φ) B/|B| = (|A||B| cos φ) B/|B|² = (A · B) B/|B|².
Now we see that A and B are orthogonal (defined as φ = ±π/2) if and only if
A · B = 0, and if and only if Proj_B A = 0.
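The dot product of Example (1) and projection formula (2) can be checked numerically. A minimal sketch; `dot` and `proj` are our own helper names.

```python
# Dot product and the projection formula (2): Proj_B A = (A.B / |B|^2) B.
def dot(A, B):
    return sum(a * b for a, b in zip(A, B))

def proj(A, B):
    """Projection of A onto the axis of B, formula (2)."""
    c = dot(A, B) / dot(B, B)
    return tuple(c * b for b in B)

A = (1, -1, 2)
B = (2, 3, -5)
print(dot(A, B))   # -11, matching (1)
residual = tuple(a - p for a, p in zip(A, proj(A, B)))
print(abs(dot(residual, B)) < 1e-12)   # True: A - Proj_B A is orthogonal to B
```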
We are interested in deriving (1) from (4). Let A = a1 i1 + a2 i2 + a3 i3, B =
b1 i1 + b2 i2 + b3 i3. Following the distributive law, we have
A · B = A · (b1 i1 + b2 i2 + b3 i3)
      = b1 A · i1 + b2 A · i2 + b3 A · i3
      = b1(a1 + 0 + 0) + b2(0 + a2 + 0) + b3(0 + 0 + a3)
      = a1 b1 + a2 b2 + a3 b3.   (5)
—End of Lecture 1. —
1.1.3. Vector product (a.k.a cross product)
Given two vectors A and B, we define the vector product of A and B to be
a vector C:
C = A×B
where
1. The length of C is the area of the parallelogram spanned by A and B:
|C| = |A||B|| sin(A,B)|;
2. The direction of C is perpendicular to the plane formed by A and B. And
the three vectors A, B, and C follow the right-hand rule.
The right-hand rule is: when the four fingers of the right hand turn from A to
B, the thumb points to the direction of C.
(Figure 1.1.3.1. Definition of cross product.)
Properties:
A×B = −B×A;
A× (B + C) = A×B + A×C;
A‖B is the same as A×B = 0
where the symbol ‖ means parallel.
Example 1.1.3a: Recall the three vectors i1, i2, i3. They follow the right hand
rule. By definition we can find that
i1 × i1 = 0, i2 × i2 = 0, i3 × i3 = 0;
and
i1 × i2 = i3, i2 × i3 = i1, i3 × i1 = i2.
With Example 1.1.3a and the distributive property, we can find a formula for
the product. Let
A = a1i1 + a2i2 + a3i3, B = b1i1 + b2i2 + b3i3.
Then (please do it on your own. you can do it.)
A × B =
| i1  i2  i3 |
| a1  a2  a3 |
| b1  b2  b3 |.
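The cofactor expansion of this determinant can be written out as a short function. A sketch; the helper name `cross` is ours.

```python
# Cross product from the cofactor expansion of the 3x3 determinant above.
def cross(A, B):
    a1, a2, a3 = A
    b1, b2, b3 = B
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

i1, i2, i3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
print(cross(i1, i2))   # (0, 0, 1) = i3, as in Example 1.1.3a
print(cross(i2, i3))   # (1, 0, 0) = i1
print(cross(i1, i1))   # (0, 0, 0)
```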
Example 1.1.3b. An electric charge e moving with velocity v in a magnetic
field H experiences a force:
F = (e/c)(v × H),
where c is the speed of light.
Example 1.1.3c. The moment M of a force F about a point O is
M = r× F.
(Figure 1.1.3.2. Moment of a force.)
1.1.4. Product of three vectors.
Given three vectors A, B, and C, we list three products with formulas:
(A × B) × C = B(A · C) − A(B · C);
A × (B × C) = B(A · C) − C(A · B);
(A × B) · C =
| a1  a2  a3 |
| b1  b2  b3 |
| c1  c2  c3 |
where the entries are the coordinates of the three vectors. The first two products
are called vector triple products; the third is called the scalar triple product. The proofs
of the formulas for the vector triple products are complicated, but the proof of
the formula for the scalar triple product is straightforward. The reader should be
able to do it alone.
To remember the formulas for the two vector triple products, there is a quick way.
You see that the final product of the first vector triple product will be perpendicular
to A×B, so it will lie in the plane spanned by A and B. It is perpendicular to C, so
there will be no component in the C direction. So the first vector triple product is
a linear combination of A and B, not C. The coefficients are the inner products of
the remaining two vectors, with a minus sign for the second term; while the middle
vector B is the first term.
Recall that the magnitude (length) of A × B is the area of the parallelogram
spanned by A and B, and the inner product with C is this magnitude times |C| cos φ,
which is exactly the height of the parallelepiped with a “slanted height” |C| and a
bottom parallelogram spanned by A and B. Thus the magnitude of the scalar triple
product is the volume of the parallelepiped formed by the three vectors. See Figure
1.1.3.3.
(Figure 1.1.3.3. Volume of the parallelepiped formed by three vectors.)
Figure 1.1.3.3. The volume of the parallelepiped is the magnitude of (A × B) · C.
One can form other triple products, but they can all be reduced quickly to one
of the three mentioned here. One may notice that the second vector triple product
can be reduced to the first one easily. So essentially there is only one
vector triple product and one scalar triple product.
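Both the volume interpretation of the scalar triple product and the vector triple product identity can be checked numerically. A sketch with our own helper names and arbitrarily chosen test vectors.

```python
# Scalar and vector triple products.
def dot(A, B):
    return sum(a * b for a, b in zip(A, B))

def cross(A, B):
    a1, a2, a3 = A
    b1, b2, b3 = B
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def triple(A, B, C):
    """(A x B) . C, the determinant with rows a, b, c."""
    return dot(cross(A, B), C)

# Volume of the 1 x 2 x 3 box spanned by the coordinate axes:
print(triple((1, 0, 0), (0, 2, 0), (0, 0, 3)))   # 6

# Vector triple product identity A x (B x C) = B(A.C) - C(A.B):
A, B, C = (1, 2, 3), (4, 5, 6), (7, 8, 10)
lhs = cross(A, cross(B, C))
rhs = tuple(b * dot(A, C) - c * dot(A, B) for b, c in zip(B, C))
print(lhs == rhs)   # True
```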
1.2. Variable vectors.
1.2.1. Vector functions of a scalar argument.
Example 1.2.1a. A(t) = (cos t, sin t, 0), (−∞ < t < ∞).
The graph (the collection of all the tips of the vector A(t)) is a circle.
Example 1.2.1b. A(t) = (cos t, sin t, t), (−∞ < t < ∞).
The graph is a helix.
Example 1.2.1c. A(t) = (3t, 2t,−t) = (3, 2,−1)t. It represents a straight line.
The general formula for a straight line is
A(t) = tα + β
where α and β are numerical vectors independent of t.
1.2.2. The derivatives of a vector function.
Let
A(t) = (A1(t), A2(t), A3(t)) = A1(t)i1 + A2(t)i2 + A3(t)i3.
Then
dA(t)/dt = (dA1(t)/dt, dA2(t)/dt, dA3(t)/dt) = (dA1(t)/dt) i1 + (dA2(t)/dt) i2 + (dA3(t)/dt) i3.
Formal definition:
A′(t) = lim_{∆t→0} [A(t + ∆t) − A(t)]/∆t.
(Figure 1.2.1. Derivative: pay attention to the direction of the difference quo-
tient.)
Figure 1.2.1. Derivative: tangent direction.
Example 1.2.2a. Let r(t) be the position vector of a moving particle. Then
dr(t)/dt = v(t) is the velocity;
d²r(t)/dt² = dv(t)/dt = a(t) is the acceleration.
Example 1.2.2b. Consider r(t) = (cos t, sin t). Then r′(t) = (− sin t, cos t).
Note that the derivative is tangent to the graph of r(t). Consider R(t) = 2(cos t, sin t).
Then R′(t) = 2(− sin t, cos t). Notice that the magnitude of the derivative is twice
as large. See Figure 1.2.2.
(Figure 1.2.2. Derivative: direction and magnitude.)
Simple rules:
d/dt (A ± B) = dA/dt ± dB/dt,
d/dt (cA) = (dc/dt) A + c dA/dt,
d/dt (A · B) = (dA/dt) · B + A · (dB/dt),
d/dt (A × B) = (dA/dt) × B + A × (dB/dt).   (1)
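The product rule for the dot product can be checked with finite differences. A numerical sketch; the particular functions A, B and the helper names are our own choices.

```python
import math

# Finite-difference check of d/dt (A.B) = A'.B + A.B'.
def dot(A, B):
    return sum(a * b for a, b in zip(A, B))

def A(t):
    return (math.cos(t), math.sin(t), t)

def B(t):
    return (t * t, 1.0, math.exp(t))

def deriv(F, t, h=1e-6):
    """Central-difference derivative of a vector function, componentwise."""
    return tuple((f2 - f1) / (2 * h) for f1, f2 in zip(F(t - h), F(t + h)))

t, h = 0.7, 1e-6
lhs = (dot(A(t + h), B(t + h)) - dot(A(t - h), B(t - h))) / (2 * h)
rhs = dot(deriv(A, t), B(t)) + dot(A(t), deriv(B, t))
print(abs(lhs - rhs) < 1e-5)   # True
```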
1.2.3. The integral of a vector function.
The integral of a vector function is defined also component-wise.
Let
A(t) = (A1(t), A2(t), A3(t)) = A1(t) i1 + A2(t) i2 + A3(t) i3.
Then
B(t) = ∫ A(t) dt = (∫ A1(t) dt, ∫ A2(t) dt, ∫ A3(t) dt).
Combining Lectures 1 and 2, we have covered this week the following sections
of our textbook (Vector and Tensor Analysis with Applications, by Borisenko et al.):
1.2.3, 1.4, 1.5, and 1.7.
End of Lecture 2.
1.3. Vector fields
We have mentioned the magnetic field: a domain in which a vector of magnetism is
given at every point. Another example is the velocity field
in a stream: each water droplet has a velocity. See Figure 1.3.1.
(Figure 1.3.1. Velocity field)
Figure 1.3.1. The velocity field of a stream.
Generally, a vector field is a domain Ω and a vector function A(r) defined in it.
Furthermore, a vector field may be time-dependent: A(r, t).
1.3.1. Line integrals and circulation.
We introduce an integral that gives work done by a force field or circulation of
velocity around a loop.
Let A(r) be a vector field with domain Ω. Let M1M2 be a curve in the domain
directed from M1 to M2. Chop the curve into many small pieces, say n pieces. One
typical piece is denoted by the end points ri and ri+1. See Figure 1.3.2. The work
done in this piece is approximately A(ri) ·∆ri, where ∆ri = ri+1− ri, if we imagine
that the vector field is a force field. This can also be interpreted as the flow of the
vector field in the direction of ∆ri. We sum over all such pieces and take the limit
as all ∆ri → 0 to define the line integral:
lim_{n→∞} Σ_{i=1}^{n} A(ri) · ∆ri = ∫_{M1M2} A(r) · dr = ∫_{M1M2} (A1 dx1 + A2 dx2 + A3 dx3).   (1)
Here the notation is A = A1i1 + A2i2 + A3i3.
Line integrals give either total work done by the vector field, or total flow of the
vector field along the curve M1M2 in the direction specified.
Total circulation around a contour L is defined as
Γ = ∮_L A · dr.
(Figure 1.3.2. Definition of the line integral.)
Example 1.3.1a. Let A = (−x2, x1, 0). Let L be the unit circle x1² + x2² = 1,
x3 = 0, traversed counter-clockwise. Then
Γ = ∮_L A · dr = ∮_L (−x2 dx1 + x1 dx2)
    (L: x1 = cos θ, x2 = sin θ, 0 ≤ θ ≤ 2π)
  = ∫_0^{2π} (− sin θ d cos θ + cos θ d sin θ)
  = ∫_0^{2π} (sin²θ + cos²θ) dθ
  = ∫_0^{2π} dθ = 2π.   (2)
See Figure 1.3.3.
(Figure 1.3.3. Examples of circulation and line integrals.)
Figure 1.3.3. Examples of circulation and line integrals: (a) Γ = 0; (b) Γ = 2π.
Example 1.3.1b. Let A = (x1, x2, 0) and L as before. Then
Γ = ∮_L A · dr = ∮_L (x1 dx1 + x2 dx2)
    (L: x1 = cos θ, x2 = sin θ, 0 ≤ θ ≤ 2π)
  = ∫_0^{2π} (cos θ d cos θ + sin θ d sin θ)
  = ∫_0^{2π} (− cos θ sin θ + sin θ cos θ) dθ
  = ∫_0^{2π} 0 dθ = 0.   (3)
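Examples 1.3.1a and 1.3.1b can be reproduced with a Riemann-sum approximation of the line integral, summing A(ri) · ∆ri over small pieces as in the definition. A numerical sketch; the helper name is ours.

```python
import math

# Riemann-sum approximation of circulation around the unit circle.
def circulation(A, n=20000):
    total = 0.0
    for i in range(n):
        th0 = 2 * math.pi * i / n
        th1 = 2 * math.pi * (i + 1) / n
        r0 = (math.cos(th0), math.sin(th0), 0.0)
        r1 = (math.cos(th1), math.sin(th1), 0.0)
        a = A(r0)
        total += sum(ai * (p1 - p0) for ai, p0, p1 in zip(a, r0, r1))
    return total

print(circulation(lambda r: (-r[1], r[0], 0.0)))   # close to 2*pi, as in (2)
print(circulation(lambda r: (r[0], r[1], 0.0)))    # close to 0, as in (3)
```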
Example 1.3.1c. Let A = (x1, x2, 0) and let L be the line segment 0 ≤ x1 ≤ 1,
x2 = 0, directed in the direction of increasing x1. Then
Γ = ∫_L A · dr = ∫_0^1 x1 dx1 = (1/2) x1² |_0^1 = 1/2.   (4)
1.4. Theorems of Gauss, Green, and Stokes.
Recall the Fundamental Theorem of Calculus:
∫_a^b F′(x) dx = F(b) − F(a).
Its magic is to reduce the domain of integration by one dimension. We want higher
dimensional versions of this theorem.
We want two theorems like
∫∫_S (integrand) dS = ∮_{∂S} (another integrand) dℓ,
∫∫∫_V (integrand) dV = ∫∫_{∂V} (another integrand) dS.   (5)
When S is a flat surface, the formula is called Green’s Theorem. When S is
curved, it is called Stokes’ Theorem. The volume integral is called Gauss’ Theorem.
Gauss’ Theorem. Let P(x1, x2, x3), Q(x1, x2, x3), R(x1, x2, x3) and all their
partial derivatives be continuous in a given domain V with boundary ∂V. Then
∫∫∫_V (∂P/∂x1 + ∂Q/∂x2 + ∂R/∂x3) dV = ∫∫_{∂V} (P cos(n, x1) + Q cos(n, x2) + R cos(n, x3)) dS.
Here n is the unit exterior normal to the surface ∂V. The term (n, x1) represents
the angle between n and the x1-axis, etc.
Note that the domain V can have holes: V can be a shell (a ball with another
concentric, smaller ball removed), in which case the boundary of V consists of two
disjoint parts: an exterior surface with normal pointing outward and an interior surface
with exterior unit normal pointing toward the origin. The boundary ∂V is allowed
to be piecewise smooth. But the functions P(x1, x2, x3), Q(x1, x2, x3), R(x1, x2, x3)
and their derivatives are required to be continuous.
The more common form of Gauss’ Theorem is in vector form. Let
A = (P, Q, R).
Let the divergence of the vector A be
div A = ∂P/∂x1 + ∂Q/∂x2 + ∂R/∂x3.
Recall
n = (n1, n2, n3) = (cos(n, x1), cos(n, x2), cos(n, x3)).
Then Gauss’ Theorem can be written in vector form:
∫∫∫_V div A dV = ∫∫_{∂V} A · n dS.
The proof of Gauss’ Theorem is omitted.
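Though the proof is omitted, the theorem can be checked numerically on a simple field. A sketch under our own choices: A = (x1, x2, x3) on the unit cube [0,1]³, so div A = 3 and the volume integral is 3; the surface flux should match.

```python
# Surface flux of A over the boundary of the unit cube, face by face.
def flux_unit_cube(A, n=100):
    h = 1.0 / n
    total = 0.0
    for axis in range(3):
        for side, nsign in ((0.0, -1.0), (1.0, 1.0)):
            for i in range(n):
                for j in range(n):
                    p = [(i + 0.5) * h, (j + 0.5) * h]
                    p.insert(axis, side)              # a point on this face
                    total += nsign * A(p)[axis] * h * h
    return total

print(flux_unit_cube(lambda p: (p[0], p[1], p[2])))   # close to 3
```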
Green’s Theorem. Given a planar region S bounded by a closed contour L.
Suppose that P(x1, x2) and Q(x1, x2) and all their partial derivatives are continuous
on the union S ∪ L. Then
∫∫_S (∂Q/∂x1 − ∂P/∂x2) dS = ∮_L (P dx1 + Q dx2),
where L is traversed in the direction such that S appears to the left of an observer
moving along L.
End of Lecture 3.
1.4. Theorems of Gauss, Green, and Stokes(Cont’d)
Stokes’ Theorem. Given a surface S bounded by a closed contour L. Suppose
that P, Q, R and all their derivatives are continuous on the union S ∪ L. Then
∫∫_S [(∂R/∂x2 − ∂Q/∂x3) cos(n, x1) + (∂P/∂x3 − ∂R/∂x1) cos(n, x2) + (∂Q/∂x1 − ∂P/∂x2) cos(n, x3)] dS
= ∮_L (P dx1 + Q dx2 + R dx3),   (1)
where n is the unit normal to S. Here L is traversed in the direction such that S
appears to the left of an observer moving along L, with the vector n at points near
L pointing from the observer’s feet to his/her head.
(Figure 1.4.1. Orientations in Stokes Theorem)
Stokes’ Theorem in vector form. If we let
A = (P, Q, R) = P i1 + Q i2 + R i3
and define
curl A = (∂R/∂x2 − ∂Q/∂x3, ∂P/∂x3 − ∂R/∂x1, ∂Q/∂x1 − ∂P/∂x2)
       =
| i1   i2   i3  |
| ∂x1  ∂x2  ∂x3 |
| P    Q    R   |,
then Stokes’ Theorem can be written as
∫∫_S curl A · n dS = ∮_{∂S} A · dr.
We see that the term ∮_{∂S} A · dr is the total circulation of the vector field A along
∂S. The term ∫∫_S curl A · n dS is called the total flux of the vector field curl A
through the surface S. In general the total flux of a vector W through a surface S
is defined as
∫∫_S W · n dS.
See Figure 1.4.2.
(Figure 1.4.2. Definition of flux)
Figure 1.4.2. Flux of a vector field through a surface.
We will come back to the meaning of curl later.
1.4.1. Simply connected domains.
We emphasize that Stokes’ Theorem holds only when the vector field A and its
curl are continuous on the union of the surface with its boundary. In general the
continuity condition is verified in a domain D that contains S. Sometimes one may
make mistakes in the relation of S with D. Let us consider the following question.
Let D be a domain. Suppose a vector A and all its derivatives are continuous
in D. Suppose further that curl A = 0 in D. Can we then use Stokes’ Theorem to
conclude that ∮_L A · dr = 0 for any contour L within D?
The answer is yes if D is a solid ball or even a shell (a shell is the region between
two concentric balls). But the answer is no if D is a torus. See Figure 1.4.3. To see
why, we imagine a contour L that goes along the long circle of the torus. Now it is
clear that we cannot find a surface S whose boundary is L and which lies entirely in the
domain D. It may well be the case that the curl of A is not zero outside
of D. In this case, we do not have a surface S to apply Stokes’ Theorem on.
(Figure 1.4.3. A torus.)
Figure 1.4.3. A torus and a contour that cannot shrink to a point within.
The difference between a torus and a shell or a ball can be characterized as
follows. Any closed curve inside a ball can be shrunk within the ball to a point. But
not every closed curve inside a torus can be shrunk within the torus to a point.
Definition. A domain is called simply connected if any closed curve inside
the domain can be shrunk continuously to a point within the domain. A domain is
called multiply connected if some closed curves cannot be shrunk within the domain
to a point.
A torus is multiply connected. A ball is simply connected, so is a shell.
To see how a shell is simply connected, imagine a closed curve in a slab bounded by
two parallel infinite planes. The curve can be shrunk within the slab to a point. Then
imagine bending the slab so that a portion of it forms a portion of a shell.
Stokes’ Theorem applies to any contour L within a simply connected domain.
In particular, circulation of a vector field along any closed curve within a simply
connected domain is zero if the vector field and its curl are continuous and the curl
vanishes at every point in the domain.
1.5. Scalar fields
Examples of scalar fields are the pressure function p(r) and the temperature
function T (r) in a domain D.
A real function of r in a domain is called a scalar field.
1.5.1. Gradient.
Let h(x1, x2) denote the height of a mountain for (x1, x2) in a planar domain D.
The set
{(x1, x2) : h(x1, x2) = c}
for a given height c is called a level curve.
The gradient of h(x) is defined as
∇h(x1, x2) = (∂x1h, ∂x2h).
It is a vector. Its direction gives the direction for fastest change of h. It is normal
to the level curve. See Figure 1.5.1.
(Figure 1.5.1.)
Figure 1.5.1. Level curves and the gradient.
In three dimensions, the gradient of a function f(x1, x2, x3) is defined similarly:
∇f(x1, x2, x3) = (∂x1f, ∂x2f, ∂x3f).
It is perpendicular to the level surfaces
{(x1, x2, x3) : f(x1, x2, x3) = c},
where c stands for a real number. The direction of ∇f points in the direction of
fastest change of f. See Figure 1.5.2.
(Figure 1.5.2.)
Figure 1.5.2. Level surfaces f = constant and the gradient ∇f.
End of Lecture 4.
1.5. Scalar fields (continued)
1.5.1. Gradient (continued)
Since we would like “to learn the REAL meaning of mathematical terms and
how/when/why to use them”, quoted from the feedback of a student, let us consider
an example.
Example 1.5.1a. Find the tangent plane to the surface
4x1² + x2² + 25x3² = 30
at the point (1, 1, 1).
Solution. Let
φ(x1, x2, x3) = 4x1² + x2² + 25x3².
Then the set {(x1, x2, x3) : φ = constant} is a level surface. Our surface is one of
the level surfaces. The gradient
∇φ = (8x1, 2x2, 50x3)
is a normal to the level surfaces. A normal to our surface at the point (1, 1, 1) is
∇φ(1, 1, 1) = (8, 2, 50).
All points (x1, x2, x3) on the tangent plane satisfy
(x1 − 1, x2 − 1, x3 − 1) · ∇φ(1, 1, 1) = 0.
That is,
8x1 + 2x2 + 50x3 = 60.
See Figure 1.5.3.
(Figure 1.5.3.) Figure 1.5.3. Tangent plane.
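The tangent-plane computation of Example 1.5.1a translates into a short script; the helper name is ours.

```python
# Tangent plane of Example 1.5.1a via the gradient.
def grad_phi(x1, x2, x3):
    # gradient of phi = 4 x1^2 + x2^2 + 25 x3^2
    return (8 * x1, 2 * x2, 50 * x3)

n = grad_phi(1, 1, 1)
print(n)                   # (8, 2, 50), the normal at (1, 1, 1)
d = sum(ni for ni in n)    # n . (1, 1, 1) = 60
print(d)                   # 60, so the plane is 8 x1 + 2 x2 + 50 x3 = 60
```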
Summary: the gradient of a scalar field is normal to the level surfaces, and the scalar
increases most rapidly in the direction of the gradient among all possible directions.
The proof will be given shortly.
1.5.2. Directional derivative.
We sometimes need to find the rate of change of a scalar in a scalar field in a
given direction. For example, a NASA scientist wants to know the rate of change of
air density along a chosen path for the re-entry of a space ship.
Given a point with position vector r in a scalar field φ. Given also a unit vector
l. The rate of change of φ along l at the point r is defined as
dφ/dl = lim_{α→0} [φ(r + αl) − φ(r)]/α.
See Figure 1.5.4.
(Figure 1.5.4.)
Figure 1.5.4. Directional derivative.
It can be shown that
dφ/dl = ∇φ · l.   (1)
Mathematical proof of properties of gradient. Let l = (l1, l2, l3) and r =
(x1, x2, x3). Then
φ(r + αl) − φ(r) = φ(x1 + αl1, x2 + αl2, x3 + αl3) − φ(x1, x2 + αl2, x3 + αl3)
  + φ(x1, x2 + αl2, x3 + αl3) − φ(x1, x2, x3 + αl3)
  + φ(x1, x2, x3 + αl3) − φ(x1, x2, x3).   (2)
We see that
[φ(x1 + αl1, x2 + αl2, x3 + αl3) − φ(x1, x2 + αl2, x3 + αl3)]/α
  = {[φ(x1 + αl1, x2 + αl2, x3 + αl3) − φ(x1, x2 + αl2, x3 + αl3)]/(αl1)} l1   (3)
converges to
[∂φ(x1, x2 + αl2, x3 + αl3)/∂x1] l1 |_{α=0} = [∂φ(x1, x2, x3)/∂x1] l1   (4)
as α → 0. Similarly, we can find the limits for the other two differences in
(2). In summary, we find that
dφ/dl = (∂φ/∂x1) l1 + (∂φ/∂x2) l2 + (∂φ/∂x3) l3,
which is (1).
Now we can derive the two properties of the gradient from (1). The term
∇φ · l achieves its maximum, among all possible directions, when the angle between
∇φ and l is zero, i.e., l = ∇φ/|∇φ|. This is why the gradient gives the direction of most
rapid increase. Furthermore, the least change is zero change, achieved in directions
that are perpendicular to ∇φ, i.e., when the angle between ∇φ and l is 90 degrees. We
know intuitively that there is no change within a level surface. Thus the gradient is a
normal to the level surfaces.
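Formula (1) can be checked with a finite difference. A numerical sketch for a smooth scalar field of our own choosing; the names `phi`, `grad_phi`, and the direction l are illustrative.

```python
# Finite-difference check of dphi/dl = grad(phi) . l.
def phi(x):
    return x[0] ** 2 + 3 * x[1] * x[2]

def grad_phi(x):
    return (2 * x[0], 3 * x[2], 3 * x[1])

r = (1.0, 2.0, -1.0)
l = (2 / 3, 2 / 3, 1 / 3)      # a unit vector: (2, 2, 1)/3

alpha = 1e-6
fd = (phi([ri + alpha * li for ri, li in zip(r, l)]) - phi(r)) / alpha
exact = sum(g * li for g, li in zip(grad_phi(r), l))
print(abs(fd - exact) < 1e-4)   # True
```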
1.5.3. Coordinate-independent representation of gradient.
The representation is
∇φ(x) = lim_{V→0} (1/V) ∫∫_{∂V} n φ(y) dS_y,
where V is a domain that contains the point x and n is the unit exterior normal to
∂V. By abuse of notation, we also use V to denote the volume of the domain V.
Mathematical proof of the coordinate-independent representation.
Consider
A = Cφ,
where C is a constant vector. Let us apply Gauss’ Theorem:
∫∫∫_V div A dV = ∫∫_{∂V} A · n dS.
Notice that div A = C · ∇φ. We find
C · ∫∫∫_V ∇φ dV = C · ∫∫_{∂V} n φ dS.
Since C is arbitrary, we conclude that
∫∫∫_V ∇φ dV = ∫∫_{∂V} n φ dS.
Dividing the equation by the volume V and taking the limit V → 0, we have
∇φ(x) = lim_{V→0} (1/V) ∫∫_{∂V} n φ(y) dS_y.
1.6. Vector fields
1.6.1. Flux of a vector field.
Let S be a two-sided piecewise-smooth surface in a vector field A(r). Let n be
a unit normal to S. The flux of A through an element dS is
A · n dS.
The flux through S is
∫∫_S A · n dS.
See Figure 1.4.2 for an earlier definition of flux.
To figure out the real meaning of a flux, let us consider an example. Imagine a
highway with an observational gate, and you are watching the passing cars. If no car is
moving (a complete stall), you see no car passing through your gate; we say that the
flux is zero. Now suppose that all cars are moving at the same speed of 60 miles per
hour. Then in one minute the first car you saw at the beginning of the minute has
traveled one mile. If the density of cars on the highway is 1 car per mile, then you
have seen 1 car pass through the gate in one minute. If the density is 2 cars per
mile, then you have seen 2 cars pass. So both velocity and density play a role.
Now suppose that there are multiple lanes on the highway and the density is the number
of cars per mile per lane; then the number of cars passing depends on the number of
gates you watch. That is, the cross length of the observational line plays a role. See
Figure 1.6.1.
(Figure 1.6.1.)
Figure 1.6.1. Flux associated with cars (gate, slanted gate, cross section).
However, if the observational line is not perpendicular to the direction of motion,
then the actual length of the observational line does not play a role; rather, the projection
of the observational line onto the line perpendicular to the lines of motion plays a
role. This projection corresponds to the projection of the velocity vector field onto
the normal n of the surface S (the observational line). So in three dimensions, let v be
the velocity of the fluid particles, let S be a surface, and let ρ be the density of the fluid
particles; then the quantity
∫∫_S ρ v · n dS
is the total mass of particles that pass through S in unit time. Without the
density factor, it is called the flux.
1.6.2. The divergence of a vector field.
Recall
div A = ∂A1/∂x1 + ∂A2/∂x2 + ∂A3/∂x3 = (new notation) ∇ · A.
Recall Gauss’ Theorem:
∫∫∫_V div A dV = ∫∫_{∂V} A · n dS.
Dividing the equation by the volume V and taking the limit V → 0 so that V shrinks
to a point x, we find
div A(x) = lim_{V→0} (1/V) ∫∫_{∂V} A · n dS.
So we have found a coordinate-independent representation of the divergence. Using
the definition of flux, we see that the integral over ∂V is the total flux through the
surface ∂V. This total flux divided by the volume V is the flux per unit volume. In the
limit V → 0, the limiting value measures the flux production per unit volume at the
point x. This is the real meaning of the divergence. If it is positive, the point is called
a source. If it is negative, it is called a sink. If the divergence is zero in a domain,
then there is no source or sink, and the field is called divergence free.
Example 1.6.2a. Let
A(r) = q r / r³,
where r denotes the norm of r. It is an exercise to show that
div A = 0
at every point except r = 0. We are interested in finding the flux through the
unit sphere S centered at the origin. We know that the unit exterior normal to the
unit sphere is given by
n = r/r.
So we have
∫∫_S A · n dS = ∫∫_{r=1} q (r/r³) · (r/r) dS = q ∫∫_{r=1} dS = 4πq.   (5)
If q > 0, it is a source (fountain). If q < 0, it is a sink. By Gauss’ Theorem, we can
see that the flux through any surface enclosing the origin is 4πq. The flux
is zero if the surface does not enclose the origin. We also see that the flux is the same
no matter how small the surface is, as long as it encloses the origin. This vector field
is smooth everywhere away from the origin, and the origin is a point source/sink.
See Figure 1.6.2.
(Figure 1.6.2.)
Figure 1.6.2. Flux and source/sink.
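The 4πq flux in (5) can be reproduced by summing A · n dS over a latitude-longitude grid on the unit sphere. A numerical sketch; the grid construction is our own, and on the unit sphere A · n = q.

```python
import math

# Flux of A = q r / r^3 through the unit sphere on a latitude-longitude grid.
def flux_unit_sphere(q, n=200):
    total = 0.0
    dth = math.pi / n          # polar step
    dph = math.pi / n          # azimuthal step (2n points over 2*pi)
    for i in range(n):
        th = (i + 0.5) * dth
        for j in range(2 * n):
            dS = math.sin(th) * dth * dph   # area element on the unit sphere
            total += q * dS                 # A . n = q on the unit sphere
    return total

q = 3.0
print(abs(flux_unit_sphere(q) - 4 * math.pi * q) < 1e-2)   # True
```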
1.6.3. The curl of a vector field.
Recall we have introduced the curl of a vector in association with Stokes’ Theorem:
curl A = (∂A3/∂x2 − ∂A2/∂x3, ∂A1/∂x3 − ∂A3/∂x1, ∂A2/∂x1 − ∂A1/∂x2)
       =
| i1   i2   i3  |
| ∂x1  ∂x2  ∂x3 |
| A1   A2   A3  |.
We state without proof that the curl has a coordinate-independent representation:
curl A = lim_{V→0} (1/V) ∫∫_{∂V} n × A dS   (= ∇ × A).
We see the real meaning of the curl in the next example.
Example 1.6.3a. Consider a rigid body rotating about a fixed point O with
angular velocity ω. See Figure 1.6.3. The velocity of a point with position vector r
is given by
v = ω × r.
Figure 1.6.3. Curl is twice angular velocity.
Let us calculate the curl of v:
curl1 v = ∂x2 v3 − ∂x3 v2 = ∂x2(ω1x2 − ω2x1) − ∂x3(ω3x1 − ω1x3) = 2ω1   (6)
(ω is independent of r), and similarly
curl2 v = 2ω2,  curl3 v = 2ω3.   (7)
It follows that
curl v = 2ω.
That is, the curl of the velocity field of a rotating body equals twice the angular
velocity of the body.
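The result curl v = 2ω can be checked with a finite-difference curl. A numerical sketch; the helper names and the particular ω are our own choices.

```python
# Numerical curl of v = omega x r, which should equal 2*omega everywhere.
def cross(A, B):
    a1, a2, a3 = A
    b1, b2, b3 = B
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def curl_num(F, x, h=1e-6):
    """Central-difference curl of a vector field F at the point x."""
    def d(comp, axis):
        xp = list(x); xp[axis] += h
        xm = list(x); xm[axis] -= h
        return (F(xp)[comp] - F(xm)[comp]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

omega = (0.5, -1.0, 2.0)
v = lambda r: cross(omega, r)
print(curl_num(v, [0.3, 0.7, -0.2]))   # close to (1.0, -2.0, 4.0) = 2*omega
```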
End of Lecture 5.
1.7. Coordinate transformations.
We deal with coordinate transformations between rectangular coordinate sys-
tems, which play an important role in the definition of tensors.
Preliminary remark on tensors. Tensors are physical quantities that exist in-
dependent of coordinate systems. Scalar quantities are called zeroth-order tensors
(e.g., temperature); vectors are called first-order tensors (e.g., velocity); second-order
tensors can all be represented by 3× 3 matrices (e.g., the stress tensor), but not all
3 × 3 matrices are tensors. Tensors of order greater than 2 cannot be represented
by 3 × 3 matrices. An n-th-order tensor requires 3^n real numbers and is invariant
under change of coordinate systems. The requirement of invariance is natural,
since physical observables are invariant under changes of coordinate systems.
Suppose we have two orthonormal bases:
i1, i2, i3, and i′1, i′2, i′3,
and two origins O and O′, forming two rectangular coordinate systems K and K′. Let
a point M in space have the representations
r = x1 i1 + x2 i2 + x3 i3,
r′ = x′1 i′1 + x′2 i′2 + x′3 i′3.   (1)
Note the equations:
r = r′ + r′0,  r′0 = OO′ (the vector from O to O′),
r′ = r + r0,  r0 = O′O,   (2)
where r0 = −r′0. See Figure 1.7.1.
(Figure 1.7.1.)
Figure 1.7.1. Coordinate transformation.
Now we use the summation convention: a repeated index is automatically summed:
x_k i_k = Σ_{k=1}^{3} x_k i_k.
Thus, the equations in (2) can be written
x_k i_k = x′_k i′_k + x′_{0k} i_k,
x′_k i′_k = x_k i_k + x_{0k} i′_k,   (3)
where x′_{0k} are the coordinates of r′0 with respect to the old system K, etc. Take
the inner product with i_l or i′_l in equations (3) and note the Kronecker delta
i_k · i_l = δ_{kl} = { 0, k ≠ l; 1, k = l }.   (4)
We find
x_l = x′_k (i′_k · i_l) + x′_{0l},
x′_l = x_k (i_k · i′_l) + x_{0l}.   (5)
We introduce the new notation
i′_k · i_l = 1 · 1 · cos(i′_k, i_l) = α_{k′l}.   (6)
Thus
i_k · i′_l = i′_l · i_k = α_{l′k}.   (7)
Therefore
x_l = α_{k′l} x′_k + x′_{0l},
x′_l = α_{l′k} x_k + x_{0l}.   (8)
The first equation of (8) is the transformation from K′ to K, while the second
equation of (8) is the inverse transformation, from K to K′. Note that the index summed
in the second equation is the second index of α, while the index summed in the first
equation of (8) is the first index.
Properties of α_{l′k}.
We note that the Kronecker delta in system K′ is
δ′_{kl} = i′_k · i′_l = { 0, k ≠ l; 1, k = l }.   (9)
Note the expansions
i′_k = α_{k′l} i_l  and  i_k = α_{l′k} i′_l;
see Figure 1.7.2.
(Figure 1.7.2.)
Figure 1.7.2. Calculation of the coefficients.
The (α_{l′k}) are often given as a 3 × 3 matrix:
(α_{l′k}) =
| α_{1′1}  α_{1′2}  α_{1′3} |
| α_{2′1}  α_{2′2}  α_{2′3} |
| α_{3′1}  α_{3′2}  α_{3′3} |.
Note also
δ_{km} = i_k · i_m = α_{l′k} i′_l · α_{j′m} i′_j = α_{l′k} α_{l′m},
δ′_{km} = i′_k · i′_m = α_{k′l} i_l · α_{m′j} i_j = α_{k′l} α_{m′l}.
Thus we have the properties
α_{l′k} α_{l′m} = δ_{km},
α_{k′l} α_{m′l} = δ′_{km}.
These properties simply say that the columns of the matrix (α_{l′k}) are orthonormal,
and so are the rows. Thus the matrix (α_{l′k}) is an orthogonal matrix.
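The orthogonality properties can be checked for a concrete direction-cosine matrix. A sketch; the example rotation (by angle t about the x3-axis) is our own choice.

```python
import math

# Direction cosines for a rotation by angle t about the x3-axis.
def alpha(t):
    return [[math.cos(t),  math.sin(t), 0.0],
            [-math.sin(t), math.cos(t), 0.0],
            [0.0,          0.0,         1.0]]

def gram(a):
    """Entries alpha_{l'k} alpha_{l'm} (summed over l); should be delta_{km}."""
    return [[sum(a[l][k] * a[l][m] for l in range(3)) for m in range(3)]
            for k in range(3)]

p = gram(alpha(0.8))
print(all(abs(p[k][m] - (1.0 if k == m else 0.0)) < 1e-12
          for k in range(3) for m in range(3)))   # True: columns orthonormal
```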
1.8. Zeroth-Order Tensors (Scalars).
A scalar is a single function (i.e., one component) which is invariant under
changes of the coordinate systems. We deal with rectangular coordinate systems
only. Thus our tensors are also called cartesian tensors.
Let ϕ be a function of points in a domain in space. Think of ϕ as a physical
or geometrical quantity. This function exists independently of any coordinate system (e.g.,
temperature, density, or pressure). Suppose we have two rectangular coordinate
systems K and K′. In K we have the representation ϕ(x1, x2, x3) of the function, while
in K ′ we have the representation ϕ′(x′1, x′2, x′3) where xi and x′i are the coordinates
of one and the same point in K and K ′. If the function is a scalar, then
ϕ(x1, x2, x3) = ϕ′(x′1, x′2, x′3)
for all points in the domain.
Example 1.8.1a. We show that the distance between two points is a scalar.
Let A and B be two points. Let K and K ′ be two rectangular coordinate systems.
In these systems both A and B have coordinates:
A has coordinates x_i^A in K, and x′_i^A in K′;
B has coordinates x_i^B in K, and x′_i^B in K′.
Let
∆x_i = x_i^B − x_i^A,  ∆x′_i = x′_i^B − x′_i^A.
Let the transformation from K to K ′ be
x′i = αi′kxk + x0i.
Then
∆x′i = x′i^B − x′i^A = αi′k xk^B + x0i − αi′k xk^A − x0i = αi′k(xk^B − xk^A) = αi′k ∆xk.
Thus
∆x′i = αi′k∆xk. (1)
Recall the Pythagorean theorem for distance:
(∆s′)² = Σ_{i=1}^{3} (∆x′i)².
Then
(∆s′)² = αi′k∆xk αi′l∆xl = αi′kαi′l ∆xk∆xl = δkl ∆xk∆xl = Σ_{k=1}^{3} (∆xk)² = (∆s)².
Thus
∆s′ = ∆s.
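A numerical sketch of this example (assuming NumPy is installed; the rotation angle, translation x0, and the two points are arbitrary choices): transform the coordinates of A and B by x′i = αi′k xk + x0i and compare the two distances.

```python
import numpy as np

t = 0.7
alpha = np.array([[np.cos(t),  np.sin(t), 0.0],
                  [-np.sin(t), np.cos(t), 0.0],
                  [0.0,        0.0,       1.0]])
x0 = np.array([1.0, -2.0, 0.5])   # translation of the origin (arbitrary)

xA = np.array([1.0, 1.0, 1.0])
xB = np.array([0.0, -1.0, 2.0])
xA_new, xB_new = alpha @ xA + x0, alpha @ xB + x0

ds = np.linalg.norm(xB - xA)
ds_new = np.linalg.norm(xB_new - xA_new)
print(np.isclose(ds, ds_new))  # True: the distance is a scalar
```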
1.9. First-Order (Cartesian) Tensors (Vectors)
A first-order tensor is given by three components, and satisfies a certain trans-
formation law.
Think of point B as displaced from point A. Then the ∆xi are the components of
the displacement. We have seen that the displacement satisfies the law (1).
Definition. A vector (a.k.a. first-order tensor) A is a quantity uniquely specified
in any coordinate system by three real numbers (called the components of the vector)
which transform under changes of the coordinate system according to the law
A′i = αi′kAk (2)
where Ak, A′i are the components of the vector in the old and new coordinate systems
K and K ′ respectively, and αi′k is the cosine of the angle between the i-th axis of
K ′ and the k-th axis of K.
We remark that it is obvious that the zero vector (0, 0, 0) is represented the same
way in any coordinate system. Furthermore, this definition of vector is equivalent
to the definition of a vector as a directed line segment. Lastly, we can use formula
(2) to calculate the components of the representation of a vector in K ′ from the
components of the representation in K.
Example 1.9a. A moving particle P has position coordinate xi(t) in a coordi-
nate system K. The displacement
xi(t+ ∆t)− xi(t)
satisfies the law
x′i(t+ ∆t)− x′i(t) = αi′k(xk(t+ ∆t)− xk(t)) (3)
by (1). Thus it determines a vector. We divide (3) by ∆t (a scalar) to find
(x′i(t + ∆t) − x′i(t))/∆t = αi′k (xk(t + ∆t) − xk(t))/∆t.
Taking the limit ∆t → 0 and using the definition of velocity
vi(t) = lim_{∆t→0} (xi(t + ∆t) − xi(t))/∆t,
we find that
v′i(t) = αi′kvk(t).
So the velocity is a vector. Similarly the acceleration
ak(t) = dvk(t)/dt
is a vector. Multiplying the acceleration by the scalar mass m, and using Newton’s
second law, we find that the force
F = ma
is a vector.
1.10. Second-Order Tensors
Definition. A second-order tensor is a quantity uniquely specified by 9 real
numbers (called the components of the tensor) which transform under changes of
the coordinate system according to the law
A′ik = αi′lαk′mAlm (4)
where Alm, A′ik are the components of the tensor in the old and new coordinate
systems K and K ′ respectively, and αi′k is the cosine of the angle between the i-th
axis of K ′ and the k-th axis of K.
Remarks. 1. We can use the transformation law to determine the coordinates
of A from one system to another.
2. The zero tensor has zero coordinates in any coordinate system.
3. The components of a second-order tensor are often written as a matrix:
A = (Aik) =
A11 A12 A13
A21 A22 A23
A31 A32 A33
.
It can be regarded as a representation of a tensor with respect to a coordinate
system.
4. Tensors of order higher than 2 cannot be represented by matrices.
Example 1.10a. Given two vectors A and B. There are nine products of a
component of A with a component of B:
AiBk (i, k = 1, 2, 3).
Suppose we transform to a new coordinate system K ′, in which A and B have
components A′i and B′k. Then
A′i = αi′lAl, B′k = αk′mBm
and hence
A′iB′k = αi′lαk′mAlBm.
This shows that AiBk is a second-order tensor. It is often denoted as A⊗B.
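A numerical sketch of this example (assuming NumPy is installed; the rotation and the vectors are arbitrary choices): form the outer product in K and in K′ and check the transformation law (4).

```python
import numpy as np

t = 0.4
alpha = np.array([[np.cos(t),  np.sin(t), 0.0],
                  [-np.sin(t), np.cos(t), 0.0],
                  [0.0,        0.0,       1.0]])

A = np.array([1.0, 1.0, 1.0])
B = np.array([0.0, -1.0, 2.0])

T = np.outer(A, B)                # T_ik = A_i B_k, components in K
A_new, B_new = alpha @ A, alpha @ B
T_new = np.outer(A_new, B_new)    # components in K'

# Transformation law (4): T'_ik = alpha_{i'l} alpha_{k'm} T_lm
law = np.einsum('il,km,lm->ik', alpha, alpha, T)
print(np.allclose(T_new, law))  # True
```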
More examples are in the text book.
1.10.1. The Stress Tensor.
Consider an elastic medium, such as rubber. If we use a surface to separate the
medium into two parts, we will encounter a force acting between them. The total force
divided by the total surface area is called the stress (vector). The stress depends on the location
in the medium and the normal direction of the surface. It is possible that we can
factor out the direction part of the stress vector to form a quantity called the stress
tensor so that the stress vector depends bilinearly on the tensor and the direction
of the surface.
Take a rectangular coordinate system K. Take an arbitrary point M in the
elastic medium. Take a tetrahedron with M being one vertex, so that the three
faces passing through M are parallel to the coordinate surfaces, see Figure 1.10.1.
(Figure 1.10.1. Tetrahedron at M with stress vectors p1, p2, p3, pn, exterior normal n, and coordinate axes x1, x2, x3.)
Let n be the exterior normal to the slant surface, with area dσn. Let pn be the
stress (force/unit area) onto the tetrahedron through the slant surface. Let pi be the
stress onto the exterior of the tetrahedron through the surface that is perpendicular
to the i-th axis. Let a be the acceleration of the tetrahedron and f be the body
force per unit mass. By Newton’s second law, we have
adm = fdm + pndσn − pidσi.
Letting dm go to zero, and noting that the volume goes to zero faster than the
corresponding surface areas, we find
pndσn = pidσi.
Note the area formula
dσi = dσn cos(n, ii) = nidσn.
We find that
pn = pini.
Projecting to ik we find
pnk = pikni.
Definition. The stress tensor is (pik). Normal stresses are pii. Tangential
(shearing) stresses are pij (i ≠ j).
Note that n is arbitrary since the tetrahedron is not necessarily regular. We
have
pn = pnkik = pikniik,
which determines the stress (vector) on all surfaces.
Note now that (pik) depends only on M , not n.
Real Meaning of Stress Tensor. Once n is specified, the stress tensor (pik)
and n give the stress
pn = pikniik ( force/unit area ).
Once the area is specified as dσn, the force is
pndσn.
A second-order (stress) tensor takes a vector (unit normal) to a (stress) vector.
It only remains to verify that pik is indeed a second-order tensor. For mathemat-
ical rigor as well as the whole point of the concept of tensor, we should verify that
the stress satisfies the law of coordinate transformation. However, it is a technical
point, which I choose to skip in class.
Additional readings. When we use the fact that the tetrahedron is rotation free
(See reference book by Young), we can deduce that
pik = pki.
Thus the matrix (pik) is symmetric.
In hydrodynamics, it is customary to write
pik = −p δik + τik
where the scalar p is called the hydrodynamic pressure and τik is the viscous stress
tensor. A Newtonian fluid is one for which the linear relation
τik = ηiklm vlm
holds, where
vlm = (1/2)(∂vl/∂xm + ∂vm/∂xl)
is the rate of deformation tensor. For an isotropic fluid, there holds
τik = 2µ vik + µ′ δik vll
where µ and µ′ are called viscosity coefficients.
Verification of the tensor character of the stress. Since the definition of (pik)
involves no restriction on the normal n, we can take n to be the i-th base vector of
the new coordinate system K′, so that
n = i′i
(K and K′ have orthonormal bases i1, i2, i3 and i′1, i′2, i′3, respectively). Then projecting
n onto the l-th axis of K gives
nl = n · il = i′i · il = αi′l,
where αi′l is the cosine of the angle between the i-th axis of K′ and the l-th axis of
K, and hence
pn ≡ p′i = pl nl = αi′l pl = αi′l plm im.
Finally, projecting p′i onto the k-th axis of K′, we obtain
p′i · i′k = αi′l(im · i′k)plm
or
p′ik = αi′lαk′mplm.
By definition, we find that (pik) transforms like a second-order tensor.
1.10.2. The moment of inertia tensor. Consider a rigid body system of n
particles with coordinates (x1^(j), x2^(j), x3^(j)), j = 1, 2, · · · , n, and masses mj in a coordi-
nate system K with origin O. The quantities
Iik = Σ_{j=1}^{n} mj (δik xl^(j) xl^(j) − xi^(j) xk^(j))
are called the moment of inertia tensor (about the origin O). It is a second-order
tensor. It is used in physics in
ωkIik = Li
where
L = Σ_{j=1}^{n} mj (r^(j) × v^(j))
is the angular momentum and ω is the angular velocity:
v^(j) = ω × r^(j).
1.10.3. The Deformation Tensor.
Let u(r) be the displacement of a point with position vector r. Then the quan-
tities
uik = (1/2)(∂ui/∂xk + ∂uk/∂xi)
form a second-order tensor, called the deformation tensor.
1.10.4. The Rate of Deformation Tensor.
Let v(M) be the velocity at a point M of a moving fluid. Then the quantities
vik = (1/2)(∂vi/∂xk + ∂vk/∂xi)
form a second-order tensor, called the rate of deformation tensor.
1.11. High-Order Tensors.
By a tensor of order n is meant a quantity uniquely specified by 3n real numbers
(the components of the tensor) which transform under changes of coordinate systems
according to the law
A′i1i2···in = αi′1k1 αi′2k2 · · · αi′nkn Ak1k2···kn
where Ak1k2···kn, A′i1i2···in are the components of the tensor in the old and new
coordinate systems K and K ′ respectively, and αi′k is the cosine of the angle between
the i-th axis of K ′ and the k-th axis of K.
Example 1.11a. If A, B, and C are three vectors, then the 3³ = 27 quantities
Dijk = AiBjCk
form a tensor of order 3. The proof is omitted, but see an exercise.
Example 1.11b. Suppose one second-order tensor Aik is a linear function of
another second-order tensor Bik, such that
Aik = λiklmBlm,
then λiklm form a fourth-order tensor. Proof is omitted.
1.12. Tensor Algebra.
1.12.1. Addition. We can add any two tensors of the same order, the sum is
a tensor of the same order, whose components are the sums of the corresponding
components of the two tensors. For example, tensor Aik and tensor Bik can be added
to give a tensor Cik:
Cik = Aik +Bik.
1.12.2. Multiplication. We can multiply any number of tensors of arbitrary
orders. The product of two tensors, for example, is a tensor whose order is the sum of
the orders of the two tensors, and whose components are products of a component of
one tensor with any component of the other tensor. The product of two second-order
tensors Aik with Blm, for example, is a fourth-order tensor Ciklm with components
Ciklm = AikBlm.
Our product of tensors is also called outer product.
1.12.3. Contraction of Tensors.
Summing a tensor of order n (n ≥ 2) over two of its indices is called contraction.
For example, summing over the first and second indices of a third-order tensor
Aiik = A11k +A22k +A33k
gives a vector. This is called contraction in the first and second indices. Contraction
in both indices of a second-order tensor Bij gives a scalar
Bii = B11 +B22 +B33.
Another example: Aiki, the contraction in the first and third indices, gives another vector.
Contraction can be done many times.
Inner product. Multiplying two or more tensors and then contracting the
product with respect to indices belonging to different factors is often called an inner
product of the given tensors. For example, AikBk, AiBi, and λiklmBlm are all inner
products. But AiiBk is not an inner product.
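Contraction and inner products can be sketched numerically (assuming NumPy is installed; the particular components are arbitrary choices):

```python
import numpy as np

A = np.arange(9.0).reshape(3, 3)   # a second-order tensor, components in some K
B = np.array([1.0, -1.0, 2.0])     # a vector

# Contraction of A in both indices gives a scalar: A_ii (the trace).
print(A.trace())    # 0 + 4 + 8 = 12.0

# Inner product A_ik B_k: multiply the tensors, then contract over
# indices belonging to different factors; this gives a vector.
print(np.einsum('ik,k->i', A, B))  # same as A @ B
```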
1.13. Symmetry Properties of Tensors.
A tensor Sikl··· ( of order 2 or higher) is said to be symmetric in the first and
second indices (say) if
Sikl··· = Skil···.
It is antisymmetric in the first and second indices (say) if
Sikl··· = −Skil···.
Antisymmetric tensors are also called skewsymmetric or alternating tensors. The
Kronecker δik is a symmetric second-order tensor since
δik = ii · ik = ik · ii = δki.
The stress tensor pik is symmetric. But the tensor
Cik = AiBk −AkBi
is antisymmetric. It can be shown easily that an antisymmetric second-order tensor
has a matrix of the form
(Cik) =
0 C12 C13
−C12 0 C23
−C13 −C23 0
.
That is, Cik = 0 for i = k for an antisymmetric tensor.
We note that any second-order tensor Tik can be represented as a sum of a
symmetric tensor and an antisymmetric tensor:
Tik = Sik + Aik
where
Sik = (1/2)(Tik + Tki),  Aik = (1/2)(Tik − Tki).
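A numerical sketch of this decomposition (assuming NumPy is installed; the matrix T is an arbitrary choice):

```python
import numpy as np

T = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])

S = 0.5 * (T + T.T)    # symmetric part      S_ik = (T_ik + T_ki)/2
A = 0.5 * (T - T.T)    # antisymmetric part  A_ik = (T_ik - T_ki)/2

print(np.allclose(T, S + A))   # True: T = S + A
print(np.allclose(S, S.T))     # True: S is symmetric
print(np.allclose(A, -A.T))    # True: A is antisymmetric (zero diagonal)
```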
1.14. Pseudotensors.
Given a coordinate system K with the basis vectors ii, (i = 1, 2, 3). Let us
consider the quantities:
εjkl = (ij × ik) · il.
We have been assuming that our coordinate system K is always right-handed, i.e.,
the thumb of the right hand points to the direction of i3 if we position our right
hand so that our four fingers can rotate from i1 to i2. In this case, we can calculate
to find that
εjkl = { 1, if j, k, l is a cyclic permutation of 1, 2, 3;
         −1, if j, k, l is a cyclic permutation of 2, 1, 3;
         0, otherwise. }
More precisely, we have
ε123 = ε231 = ε312 = 1
ε213 = ε132 = ε321 = −1,
and all others with repeated indices ε111 = ε112 = · · · are zero. We verify for example
that
ε123 = (i1 × i2) · i3 = i3 · i3 = 1,
and
ε213 = (i2 × i1) · i3 = −i3 · i3 = −1,
and
ε113 = (i1 × i1) · i3 = 0 · i3 = 0.
Under orthogonal coordinate transformations from this K to another right-
handed system K′, we can show that εjkl transforms like a third-order tensor.
But occasionally we need to use left-handed coordinate systems. In this case the
thumb of the left hand points to the direction of i3 if we position our left hand so
that our four fingers can rotate from i1 to i2. In a left handed coordinate system
the vector product of A × B is defined by the left hand rule; i.e., the direction of
A ×B has the direction so that the three vectors A,B, and A ×B follow the left
hand rule. For either handedness, the rule of the direction of the product A × B
is such that the three vectors A,B, and A × B have the same handedness as the
coordinate system. This way all the formulas for the vector product hold for both kinds
of coordinate systems. In particular, the formula
A×B =
| i1 i2 i3 |
| a1 a2 a3 |
| b1 b2 b3 |
is valid in both kinds of coordinate system.
A coordinate system transformation may change the handedness. We have al-
lowed for these transformations in our definition of tensors of all orders.
However there are tensor-like quantities that change slightly differently from the
laws of tensors. For example, let us calculate the changes in εjkl. Let K ′ be a
coordinate system with the basis vectors i′1 = i1, i′2 = i2, i′3 = −i3 and the same
origin. By definition we have
ε′123 = (i′1 × i′2) · i′3.
Note that K′ is now left-handed. So the way to figure out the vector product i′1 × i′2
is to use the left-hand rule, and we find that
i′1 × i′2 = i′3.
Thus
ε′123 = 1.
Now let us calculate the term
α1′lα2′mα3′nεlmn
which would be equal to ε′123 if εjkl were a third-order tensor. Note that the coordi-
nate transformation coefficients are
(αi′l) =
1 0 0
0 1 0
0 0 −1
.
Thus
α1′lα2′mα3′nεlmn = α1′1α2′2α3′3ε123 = −1.
We can do all the calculations to verify that there actually hold
ε′ijk = −αi′lαj′mαk′nεlmn.
So εjkl is not a third-order tensor. This leads to the concept and definition of
pseudotensors.
Definition of pseudotensors. A pseudotensor of order n has 3n components
Ak1k2···kn that transform under changes of coordinate system according to the law
A′i1i2···in = ∆ αi′1k1 αi′2k2 · · · αi′nkn Ak1k2···kn
where Ak1k2···kn, A′i1i2···in are the components of the pseudotensor in the old and new
coordinate systems K and K ′ respectively, αi′k is the cosine of the angle between the
i-th axis of K ′ and the k-th axis of K, ∆ is 1 if K and K ′ have the same handedness,
and ∆ is −1 if K and K ′ have different handedness.
Note that a change of coordinate system is called a proper transformation if it
preserves the handedness. It is called an improper transformation if it reverses the
handedness. Pseudotensors are also called tensor densities.
We can verify that the permutation tensor εjkl is a third-order pseudotensor. It
is called the unit pseudotensor of order three. Since it appears in many physical and
geometrical situations, it also has the name Levi-Civita tensor density. It sometimes
is denoted as δjkl, a reminder that it is a generalization of the Kronecker δjk.
We note further that the permutation tensor εjkl is antisymmetric in any pair of
indices:
εjkl = −εkjl; εjkl = −εjlk; εjkl = −εlkj.
With two swaps, we have
εjkl = −εkjl = εklj, etc.
Because of this, it is often called the alternating (pseudo-)tensor of third order.
Lastly we note that the vector product A×B, which does not transform as an
ordinary vector, has a pseudotensor representation:
(A×B)i = εijkAjBk.
That is, pseudotensors can be multiplied to yield pseudotensors. Higher-order pseu-
dotensors can be contracted to form pseudotensors. In the current situation, the
outer product of εjkl with ordinary vectors A and B yields a pseudotensor of order
5. When contracted twice, the result is a pseudotensor of order 1.
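The representation (A×B)i = εijk Aj Bk can be sketched numerically (assuming NumPy is installed; zero-based indices 0, 1, 2 stand for 1, 2, 3, and the vectors are arbitrary choices):

```python
import numpy as np

# Build epsilon_{ijk} as a signed permutation symbol.
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0    # cyclic permutations of 1,2,3
    eps[i, k, j] = -1.0   # one swap reverses the sign

A = np.array([1.0, 1.0, 1.0])
B = np.array([0.0, -1.0, 2.0])

# (A x B)_i = eps_{ijk} A_j B_k: outer product, contracted twice.
cross = np.einsum('ijk,j,k->i', eps, A, B)
print(np.allclose(cross, np.cross(A, B)))  # True
```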
An ordinary first-order tensor is called a polar vector. Polar vectors transform
under both types of changes of coordinate systems without the factor ∆. A first-order
pseudotensor is called an axial vector. It is called axial because it has something to
do with the axis of rotation associated in the product v = ω × r.
1.15. Curvilinear Coordinate Systems.
We need curvilinear coordinate systems in applications. The spherical coordinate
system is an example of curvilinear coordinate systems.
Let u1, u2, u3 denote new coordinates and suppose that they are related to the
cartesian coordinates x1, x2, x3 by the equations
ui = φi(x1, x2, x3) (i = 1, 2, 3). (1)
Assume that φi have continuous first-order derivatives in a domain D of the x-space
and there holds the condition
∂(u1, u2, u3)/∂(x1, x2, x3) =
| ∂φ1/∂x1  ∂φ1/∂x2  ∂φ1/∂x3 |
| ∂φ2/∂x1  ∂φ2/∂x2  ∂φ2/∂x3 |
| ∂φ3/∂x1  ∂φ3/∂x2  ∂φ3/∂x3 |  ≠ 0   (2)
in the domain. This determinant is called the Jacobian of the transformation from
x to u.
The nonvanishing condition (2) ensures that it is possible to determine (x1, x2, x3)
in terms of the coordinates (u1, u2, u3); i.e., there exist functions fi(u1, u2, u3) (i =
1, 2, 3) such that
xi = fi(u1, u2, u3) (3)
where fi are defined in D determined from (1). Moreover, fi have continuous first-
order derivatives for which
∂(x1, x2, x3)/∂(u1, u2, u3) ≠ 0
in D. The functions (f1, f2, f3) define the inverse transformation of (1). It is im-
portant to note that the Jacobians satisfy the relation
[∂(u1, u2, u3)/∂(x1, x2, x3)] · [∂(x1, x2, x3)/∂(u1, u2, u3)] = 1.
Now let P be any point in D with coordinate (x1, x2, x3) and let the numbers
u1, u2, u3 be determined by (1). We call the ordered triple of numbers (u1, u2, u3) the
curvilinear coordinates of the point P . The equations in (1) are called the coordinate
transformation, and they are said to define a curvilinear coordinate system in D. It
follows that the Jacobian of a coordinate transformation is the reciprocal of the
Jacobian of its inverse.
Example 1.15a. Consider the transformation from the rectangular cartesian
coordinates (x, y) on a plane to the polar coordinates (r, θ) defined by
r = √(x² + y²),  θ = arccos( x/√(x² + y²) )  ( or = arcsin( y/√(x² + y²) ) )
where arccos is chosen such that a unique θ in 0 ≤ θ < 2π exists so that cos θ =
x/√(x² + y²) and sin θ = y/√(x² + y²). The domain D is all points except the origin.
The Jacobian is
∂(r, θ)/∂(x, y) =
| x/√(x²+y²)   y/√(x²+y²) |
| −y/(x²+y²)   x/(x²+y²)  |
= 1/r.
In the above calculation we find the partial derivative ∂r/∂x as follows: From
r2 = x2 + y2
we find
2rrx = 2x.
Dividing by 2r we find rx = x/r. We can use cos θ = x/r to find the derivative θx,
etc. Thus the inverse exists except at the origin; the inverse is
x = r cos θ, y = r sin θ.
Note that the inverse is defined for all (r, θ). We can calculate
∂(x, y)/∂(r, θ) =
| cos θ   −r sin θ |
| sin θ    r cos θ |
= r.
It is clear that the product of the two Jacobians is 1.
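This example can be sketched symbolically (assuming SymPy is installed): compute the Jacobian of the inverse transformation and check that its reciprocal is 1/r.

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
x = r * sp.cos(th)
y = r * sp.sin(th)

# Jacobian of the inverse transformation, d(x, y)/d(r, theta):
J_inv = sp.Matrix([[sp.diff(x, r), sp.diff(x, th)],
                   [sp.diff(y, r), sp.diff(y, th)]]).det()
print(sp.simplify(J_inv))      # r

# The Jacobian d(r, theta)/d(x, y) is its reciprocal:
print(sp.simplify(1 / J_inv))  # 1/r
```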
1.15.1. Coordinate surfaces, coordinate curves, and local basis.
Let P0 = (x1^0, x2^0, x3^0) be a point in D with coordinates (u1^0, u2^0, u3^0) in the curvi-
linear coordinate system. We call the surface
φi(x1, x2, x3) = ui^0
the i-th coordinate surface passing through P0 (i = 1, 2, 3). The intersection of two
coordinate surfaces, say,
φ1(x1, x2, x3) = u1^0,  φ2(x1, x2, x3) = u2^0
is called the u3-coordinate curve. See Figure 1.15.1.
(Figure 1.15.1. Coordinate surfaces φi = ui^0, coordinate curves, and local basis at the point P.)
We next derive a basis at the point P0. The position vector of an arbitrary point
P is
R(u1, u2, u3) = xiii = fi(u1, u2, u3)ii.
If we set u2 = u2^0, u3 = u3^0, then the resulting vector function R(u1, u2^0, u3^0) represents
the u1-curve. On this curve u1 is the parameter. It follows that the derivative
∂R/∂u1
represents the tangent vector to this curve. Likewise, we have
∂R/∂u2, ∂R/∂u3
representing the tangent vectors to the u2- and u3-curves respectively.
Since the determinant of a matrix is the same as the determinant of its transpose,
it follows from the definition of Jacobian and that of the scalar triple product that
∂(x1, x2, x3)/∂(u1, u2, u3) = ∂R/∂u1 · (∂R/∂u2 × ∂R/∂u3). (4)
(See homework problem 1.) Hence at each point where (4) is not zero, the three
tangent vectors
(∂R/∂u1, ∂R/∂u2, ∂R/∂u3) (5)
are linearly independent and thus form a basis.
Every vector or vector field at each point can then be represented in terms of
this basis (5). Unlike the unit vectors (i1, i2, i3), however, this new basis varies from
point to point in space. For this reason, we call (5) a local basis.
1.15.2. Arclength and orthogonal curvilinear coordinate systems.
We assume that the three vectors
(∂R/∂u1, ∂R/∂u2, ∂R/∂u3)
form a right-handed basis; i.e., the vector product ∂R/∂u1 × ∂R/∂u2 has positive inner
product with ∂R/∂u3. In this case, the Jacobian ∂(x1, x2, x3)/∂(u1, u2, u3) is positive.
We derive arclength formula in curvilinear coordinate systems. Consider the
position vector
R = xiii = fi(u1, u2, u3)ii.
We have
(ds)² = dR · dR = (Σ_{i=1}^{3} ∂R/∂ui dui) · (Σ_{j=1}^{3} ∂R/∂uj duj)
= (∂R/∂ui · ∂R/∂uj) dui duj = gij dui duj   (1)
where we have introduced
gij = ∂R/∂ui · ∂R/∂uj
which is called the metric tensor.
From here one can pursue the study of general metric tensors, which are used
for example in general relativity. For us, we choose to be more specific. We say
that the curvilinear coordinate system is orthogonal curvilinear if the triple vectors
∂R/∂ui(i = 1, 2, 3) are mutually orthogonal. For orthogonal curvilinear coordinate
systems, the directions and magnitudes of ∂R/∂ui(i = 1, 2, 3) can still vary. Let us
define
hi = |∂R/∂ui| (i = 1, 2, 3).
Then we have
gij = { hi², i = j; 0, i ≠ j. }
Example 1.15b. The transformation relating the cylindrical coordinates (r, θ, z)
to the rectangular cartesian coordinates (x, y, z) is defined by the equations
r = √(x² + y²),
θ = arccos( x/√(x²+y²) )  ( or arcsin( y/√(x²+y²) ) ),
z = z.
It is defined for all (x, y, z) except on the z-axis (where x = y = 0).
We find
∂(r, θ, z)/∂(x, y, z) =
| ∂r/∂x  ∂r/∂y  0 |
| ∂θ/∂x  ∂θ/∂y  0 |
|   0      0    1 |
= 1/r.
The inverse is
x = r cos θ, y = r sin θ, z = z
which is valid for all (r, θ, z).
Let us identify the coordinate surfaces and coordinate curves. Refer to Figure
1.15.2.
(Figure 1.15.2. Cylindrical coordinate system: point P0 with coordinates r, θ, z.)
Coordinate surfaces: The coordinate surface r = r0 is the surface of a cylinder
passing through a point P0, and extending to infinity in both the positive and
negative directions of z-axis. The coordinate surface θ = θ0 is a half plane starting
at the z-axis and extending to infinity. The coordinate surface z = z0 is the plane
passing through P0 and perpendicular to the z-axis.
2
Coordinate curves: The r-coordinate curve is a ray starting on the z-axis, passing
through the point P0, and parallel to the xy-plane. The θ-coordinate curve is a circle
passing through the point P0, and parallel to the xy-plane. The z-coordinate curve
is a straight line parallel to the z-axis.
We have the position vector
R(r, θ, z) = r cos θ i1 + r sin θ i2 + z i3.
Tangent vectors to the coordinate curves are
∂R/∂r = cos θ i1 + sin θ i2
∂R/∂θ = −r sin θ i1 + r cos θ i2
∂R/∂z = i3.
Let u1 = r, u2 = θ, u3 = z; then the three tangent vectors form a right-handed
orthogonal curvilinear coordinate system. We find that gij = 0 for i ≠ j, and
h1 = 1, h2 = r, h3 = 1.
Thus the distance formula is
(ds)2 = (dr)2 + (rdθ)2 + (dz)2.
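This example can be sketched symbolically (assuming SymPy is installed): build the position vector, take the tangent vectors, and read off the metric tensor.

```python
import sympy as sp

r, th, z = sp.symbols('r theta z', positive=True)
R = sp.Matrix([r*sp.cos(th), r*sp.sin(th), z])   # position vector

# Metric tensor g_ij = dR/du_i . dR/du_j for (u1, u2, u3) = (r, theta, z):
tangents = [R.diff(v) for v in (r, th, z)]
g = sp.Matrix(3, 3, lambda i, j: sp.simplify(tangents[i].dot(tangents[j])))
print(g)   # diagonal with entries 1, r**2, 1, so h1 = 1, h2 = r, h3 = 1
```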
Example 1.15c. The spherical coordinates u1 = r, u2 = φ, u3 = θ are defined
by
r = √(x² + y² + z²),
φ = arccos( z/√(x² + y² + z²) ),
θ = arccos( x/√(x² + y²) ).
(r ≥ 0, 0 ≤ φ < π, 0 ≤ θ < 2π). Refer to Figure 1.15.3 for the variables.
(Figure 1.15.3. Spherical coordinate system: point P with coordinates r, φ, θ.)
We can calculate the Jacobian
∂(r, φ, θ)/∂(x, y, z) =
| x/r               y/r               z/r           |
| xz/(r²√(x²+y²))   yz/(r²√(x²+y²))   −√(x²+y²)/r²  |
| −y/(x²+y²)        x/(x²+y²)         0             |
= 1/(r² sinφ).
The inverse is
x = r sinφ cos θ,  y = r sinφ sin θ,  z = r cosφ.
The Jacobian is
∂(x, y, z)/∂(r, φ, θ) = r² sinφ.
The position vector is
R(r, φ, θ) = r sinφ cos θ i1 + r sinφ sin θ i2 + r cosφ i3.
The three tangent vectors are
∂R/∂r = sinφ cos θ i1 + sinφ sin θ i2 + cosφ i3
∂R/∂φ = r cosφ cos θ i1 + r cosφ sin θ i2 − r sinφ i3
∂R/∂θ = −r sinφ sin θ i1 + r sinφ cos θ i2.
They are mutually orthogonal. We have
g11 = h1² = 1,  g22 = h2² = r²,  g33 = h3² = r² sin²φ.
The distance formula is
(ds)2 = (dr)2 + (rdφ)2 + (r sinφdθ)2.
For the volume element, we have
dV = dr · rdφ · r sinφdθ = r2 sinφdr dφ dθ.
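This spherical example can also be sketched symbolically (assuming SymPy is installed): compute the squared scale factors and the Jacobian of the inverse map.

```python
import sympy as sp

r, phi, th = sp.symbols('r phi theta', positive=True)
R = sp.Matrix([r*sp.sin(phi)*sp.cos(th),
               r*sp.sin(phi)*sp.sin(th),
               r*sp.cos(phi)])

# Squared scale factors h_i^2 from the tangent vectors dR/du_i:
tangents = [R.diff(v) for v in (r, phi, th)]
hsq = [sp.simplify(t.dot(t)) for t in tangents]
print(hsq)   # h1^2 = 1, h2^2 = r^2, h3^2 = r^2 sin^2(phi)

# Jacobian of the inverse map equals h1*h2*h3 = r^2 sin(phi):
J = sp.simplify(R.jacobian([r, phi, th]).det())
print(J)
```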
In the next lecture we calculate the grad, div, and curl in orthogonal curvilinear
coordinate systems.
1.16. Grad, div, and curl in orthogonal curvilinear coordinate systems.
In this section we derive the expressions of various vector concepts in an orthog-
onal curvilinear coordinate system.
Let (u1, u2, u3) be such a system:
ui = φi(x1, x2, x3). (i = 1, 2, 3).
Let
xi = fi(u1, u2, u3)
be the inverse transformation. We introduce the normalized coordinate tangent
vectors:
ui = (1/hi) ∂R/∂ui  (no summation), i = 1, 2, 3,
where hi = |∂R/∂ui|. Assume that (u1,u2,u3) is right-handed so that the Jacobian
is positive.
1.16.1. Gradient of a scalar field.
Let F (x1, x2, x3) be a scalar field in a rectangular system. We know that ∇F is
a vector, which can be represented as a linear combination of any basis. So let
∇F = F1u1 + F2u2 + F3u3.
We need to find (F1, F2, F3). We recall from Section 1.5.3(of Lecture 5) the coordinate-
independent formula
∇F(P0) = lim_{V→0} (1/V) ∫∫_{∂V} n F(y) dSy   (1)
where V is a domain that contains the point P0 and n is the unit exterior normal
to ∂V . By the way, we have also the formulas
∇ · F(P0) = lim_{V→0} (1/V) ∫∫_{∂V} n · F(y) dSy
for the divergence (∇·) of a vector field F, and
∇× F(P0) = lim_{V→0} (1/V) ∫∫_{∂V} n× F(y) dSy
for the curl (∇×) which we will use for the representations of div and curl. The three
formulas certainly have striking uniformity. Back to our gradient representation, we
take V to be an elementary “curvilinear parallelepiped” of volume
ds1 ds2 ds3 = h1h2h3 du1 du2 du3
with faces perpendicular to the coordinate curves, see Figure 1.16.1.
(Figure 1.16.1. Curvilinear parallelepiped at P0 with exterior normals n along the u1, u2, u3 directions.)
To calculate the surface integral (1), we first note that there are six sides. For
the side that passes through P0 and is perpendicular to the u1-axis, we have the
approximate value
−u1 F(P0) h2du2 h3du3,
where the surface area element is ds2 ds3 = h2du2 h3du3. The integral on the surface
that is parallel to the previous surface is approximately
u1 F (P0 + du1u1)h2du2 h3du3,
where P1 = P0 + du1u1 is the position of P0 with an increment du1 along the u1-
coordinate axis. Combining these two sides and note that the volume of the element
is
V = ds1 ds2 ds3 = h1h2h3 du1du2du3,
the average becomes
(−F(P0) + F(P0 + du1 u1)) h2h3 du2du3 / (h1h2h3 du1du2du3) · u1 −→ (1/h1) (∂F(P0)/∂u1) u1
as V → 0. Similarly we can calculate the other four sides. In summary, we find
∇F = (1/h1) ∂F/∂u1 u1 + (1/h2) ∂F/∂u2 u2 + (1/h3) ∂F/∂u3 u3.
Theorem. The del operator has the formula
∇ = u1 (1/h1) ∂/∂u1 + u2 (1/h2) ∂/∂u2 + u3 (1/h3) ∂/∂u3.
Example 1.16a Find the expression of ∇ in cylindrical coordinates.
Solution. Let u1 = r, u2 = θ, u3 = z. It is right-handed. We have h1 = 1, h2 =
r, h3 = 1. Also
u1 = cos θ i1 + sin θ i2, u2 = − sin θ i1 + cos θ i2, u3 = i3.
Thus
∇ = u1 ∂/∂r + u2 (1/r) ∂/∂θ + u3 ∂/∂z.
Example 1.16b. Find the gradient of f = xyz in the cylindrical coordinates.
Solution. We have f = r2z sin θ cos θ. Thus
∇f = u1 2rz sin θ cos θ + u2 rz(cos² θ − sin² θ) + u3 r² sin θ cos θ.
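We can check this worked solution numerically at one point (assuming NumPy is installed; the sample point is an arbitrary choice): the cylindrical-basis gradient should agree with the cartesian gradient (yz, xz, xy).

```python
import numpy as np

r, th, z = 2.0, 0.5, 3.0
x, y = r*np.cos(th), r*np.sin(th)

u1 = np.array([np.cos(th),  np.sin(th), 0.0])   # u_r
u2 = np.array([-np.sin(th), np.cos(th), 0.0])   # u_theta
u3 = np.array([0.0, 0.0, 1.0])                  # u_z

# Components of grad f from the worked solution (f = r^2 z sin(th) cos(th)):
grad_cyl = (2*r*z*np.sin(th)*np.cos(th)) * u1 \
         + (r*z*(np.cos(th)**2 - np.sin(th)**2)) * u2 \
         + (r**2*np.sin(th)*np.cos(th)) * u3

grad_cart = np.array([y*z, x*z, x*y])   # cartesian gradient of f = xyz
print(np.allclose(grad_cyl, grad_cart))  # True
```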
1.16.2. Divergence. We let
F = F1u1 + F2u2 + F3u3.
Then we can find, similar to the previous section, that
div F = (1/(h1h2h3)) [ ∂/∂u1 (F1h2h3) + ∂/∂u2 (F2h1h3) + ∂/∂u3 (F3h1h2) ].
Example 1.16c. Derive the formula for the Laplacian ∆ defined as ∆ = div ∇.
Solution. Consider an F = ∇f . We have
∆f = div ∇f = (1/(h1h2h3)) [ ∂/∂u1 (h2h3/h1 ∂f/∂u1) + ∂/∂u2 (h1h3/h2 ∂f/∂u2) + ∂/∂u3 (h1h2/h3 ∂f/∂u3) ].   (2)
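As a symbolic sketch of formula (2) in spherical coordinates (assuming SymPy is installed; the test functions r² and 1/r are arbitrary choices): the Laplacian of r² = x²+y²+z² should be 6, and 1/r should be harmonic away from the origin.

```python
import sympy as sp

r, phi, th = sp.symbols('r phi theta', positive=True)
h1, h2, h3 = 1, r, r*sp.sin(phi)   # spherical scale factors

def laplacian(f):
    """Formula (2) with (u1, u2, u3) = (r, phi, theta)."""
    H = h1*h2*h3
    return sp.simplify(
        (sp.diff(h2*h3/h1 * sp.diff(f, r),   r)
       + sp.diff(h1*h3/h2 * sp.diff(f, phi), phi)
       + sp.diff(h1*h2/h3 * sp.diff(f, th),  th)) / H)

print(laplacian(r**2))   # 6, matching the cartesian Laplacian of x^2+y^2+z^2
print(laplacian(1/r))    # 0: 1/r is harmonic for r != 0
```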
1.16.3. The curl.
Similarly, for
F = F1u1 + F2u2 + F3u3,
we can derive the formula
curl F = (1/(h1h2h3)) ·
| h1u1   h2u2   h3u3  |
| ∂/∂u1  ∂/∂u2  ∂/∂u3 |
| F1h1   F2h2   F3h3  |
Appendix: Useful expressions
I. In cylindrical coordinates
u1 = r, u2 = θ, u3 = z
h1 = 1, h2 = r, h3 = 1,
there hold
grad f = ∂f/∂r ur + (1/r) ∂f/∂θ uθ + ∂f/∂z uz,
div A = (1/r) ∂/∂r (rAr) + (1/r) ∂Aθ/∂θ + ∂Az/∂z,
curl A = ( (1/r) ∂Az/∂θ − ∂Aθ/∂z ) ur + ( ∂Ar/∂z − ∂Az/∂r ) uθ + (1/r)( ∂/∂r (rAθ) − ∂Ar/∂θ ) uz,
∆f = (1/r) ∂/∂r ( r ∂f/∂r ) + (1/r²) ∂²f/∂θ² + ∂²f/∂z²,
where
ur = cos θ i1 + sin θ i2, uθ = − sin θ i1 + cos θ i2, uz = i3
is the local orthonormal basis, and A has components Ar, Aθ, Az with respect to
this basis.
II. In spherical coordinates. See text book by Borisenko, p174.
Chapter II. Complex Variables
Dates: September 24, 26, 28.
These three lectures will cover the following sections of the text book by Keener.
§6.1. Complex valued functions and branch cuts;
§6.2.1. Differentiation and analytic functions, Cauchy-Riemann conditions;
§6.2.2. Integration;
§6.2.3. Cauchy integral formula;
§6.2.4. Taylor series expansion.
2.1. Complex valued functions.
1. Complex numbers. We introduce the imaginary number i, whose square is
−1:
i² = −1.
Complex numbers are in the form a+ ib where a and b are real numbers. Complex
numbers can be represented in the Argand diagram by the vector (a, b): (Figure to
be provided later). Addition and subtraction of two complex numbers are simple:
(a+ bi)± (c+ di) = (a± c) + (b± d)i.
Multiplication and division are as follows:
(a + bi)(c + di) = (ac − bd) + i(ad + bc),
(a + bi)/(c + di) = (a + bi)(c − di)/(c² + d²),   (1)
provided that c2 + d2 6= 0 for the division. From these one can calculate the power
(a+ bi)n when n is an integer.
2. Functions. Let z = a + bi. We call z a complex variable when we use z as a
variable. In general we let z = x+iy to be consistent with our habit of real variables.
Consider
f(z) = z².
It is called a complex valued function. Other complex valued functions are z³,
g(z) = (z + 1)/(z − 1);  h(z) = (az + b)/(cz + d)
where a, b, c, d are complex numbers. An important function is
f(z) = z̄ = x − iy
where the bar is called “complex conjugate.”
We introduce the exponential function
e^z = Σ_{n=0}^{∞} z^n/n!
for all complex z. Note that this definition is consistent with the real exponential.
We see this opens a new world. First we see
e^{iθ} = Σ_{n=0}^{∞} (iθ)^n/n! = Σ_{n=0}^{∞} (iθ)^{2n}/(2n)! + Σ_{n=0}^{∞} (iθ)^{2n+1}/(2n+1)!
= Σ_{n=0}^{∞} (−1)^n θ^{2n}/(2n)! + i Σ_{n=0}^{∞} (−1)^n θ^{2n+1}/(2n+1)!
= cos θ + i sin θ.   (2)
Multiplying it with any real number r, we find
reiθ = r cos θ + ir sin θ.
If (r, θ) is the polar representation of the point (a, b), then we find the polar
representation of complex numbers:
a+ bi = reiθ.
From this we have a special case:
eiπ + 1 = 0
which is called Euler’s identity. This identity is fun to watch since it involves the
simplest symbols of mathematics: 0, 1,+,=, i and two transcendental numbers e and
π. Another version is
e−iπ + 1 = 0
which involves additionally the minus − sign. In polar representation, multiplication
of complex numbers is extremely easy:
(a + bi)(c + di) = (r1 e^{iθ1})(r2 e^{iθ2}) = (r1r2) e^{i(θ1+θ2)}.
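A numerical sketch of this rule using the standard library (the moduli and arguments are arbitrary choices): moduli multiply and arguments add.

```python
import cmath, math

z1 = cmath.rect(2.0, 0.5)    # r1 = 2, theta1 = 0.5
z2 = cmath.rect(3.0, 1.1)    # r2 = 3, theta2 = 1.1
prod = z1 * z2

print(math.isclose(abs(prod), 2.0 * 3.0))          # True: |z1 z2| = r1 r2
print(math.isclose(cmath.phase(prod), 0.5 + 1.1))  # True: arguments add (1.6 < pi, no wrap)
```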
We have more examples:
e^z = e^{x+iy} = e^x e^{iy} = e^x (cos y + i sin y),
e^{z1+z2} = e^{z1} e^{z2}.   (3)
Inverse functions of z² and e^z
We know that √5 is a number 2.236... and satisfies the equation x² = 5. Another
solution to this equation is −√5. We can verify that both
z1 = √2/2 + i √2/2,  z2 = −√2/2 − i √2/2
satisfy z² = i. There are multiple solutions to the square root. We calculate a
solution
√z = √(r e^{iθ}) = √r e^{iθ/2}.
We can verify that this satisfies w² = z. When we restrict θ to be in [0, 2π), this
root is called the principal branch. We can see another solution is
√z = √(r e^{i(θ+2π)}) = √r e^{i(θ/2+π)}.
For the exponential function we define the inverse w = ln z as
ln z = w iff z = ew.
We note that if one value ln z works, then ln z + 2nπi works for every integer n:
e^{ln z + 2nπi} = e^{ln z} e^{2nπi} = e^{ln z} = z.
So there are multiple inverses. We can take one branch:
−π < Im (ln z) ≤ π.
This is called a principal branch. The line θ = π is called a branch cut. The origin is
called a branch point. Any continuous interval of 2π length is called a branch. The
principal branch for a function may vary from discipline to discipline.
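A sketch of the branch structure using the standard library (the point z = −1 + i is an arbitrary choice): `cmath.log` returns the principal branch −π < Im(ln z) ≤ π, and shifting by 2nπi gives the other logarithms.

```python
import cmath, math

z = -1 + 1j

# cmath.log returns the principal branch: -pi < Im(log z) <= pi.
w = cmath.log(z)
print(-math.pi < w.imag <= math.pi)   # True

# Every w + 2*pi*i*n is also a logarithm of z:
for n in (-2, -1, 1, 2):
    print(cmath.isclose(cmath.exp(w + 2j*math.pi*n), z))  # True each time
```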
2.2. Calculus of Complex Functions.
2.2.1. Differentiation.
We define the derivative f ′(z) of a complex valued function f(z) like the deriva-
tive of a real function:
f'(z) = \lim_{\xi \to z} \frac{f(\xi) - f(z)}{\xi - z},
where the limit is over all possible ways of approaching z. If the limit exists, the
function f is called differentiable and f ′(z) is the derivative.
Definition. If f ′(z) is continuous, then f is called analytic.
Continuity is like that for real functions of two variables.
Theorem 2.1 (Cauchy-Riemann conditions). The function f(z) = u(x, y) + iv(x, y) for z = x + iy is analytic in some region Ω if and only if \frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}, \frac{\partial v}{\partial x}, \frac{\partial v}{\partial y} exist, are continuous, and satisfy the equations

\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}.
Proof. (Skipped in class due to lecture on “memorizing formula/knowledge”).
Let f be continuously differentiable. Then take the special path along the x-axis:

\frac{f(z + \Delta x) - f(z)}{\Delta x} = \frac{u(x+\Delta x, y) + iv(x+\Delta x, y) - u(x, y) - iv(x, y)}{\Delta x} \longrightarrow \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}.   (4)

Then along the path parallel to the y-axis:

\frac{f(z + i\Delta y) - f(z)}{i\Delta y} \longrightarrow \frac{1}{i}\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y} = -i\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y}.   (5)

The two limits have to be the same by definition; equating real and imaginary parts, we have obtained the Cauchy-Riemann equations

\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}.
Conversely, suppose the Cauchy-Riemann conditions hold; i.e., the existence and
continuity of the partial derivatives and the equations of Cauchy-Riemann all hold.
Let z_0 = x_0 + iy_0. From the theory of real variables we have the expansions

u(x, y) = u(x_0, y_0) + \frac{\partial u}{\partial x}(x_0, y_0)\Delta x + \frac{\partial u}{\partial y}(x_0, y_0)\Delta y + R_1(\Delta x, \Delta y),
v(x, y) = v(x_0, y_0) + \frac{\partial v}{\partial x}(x_0, y_0)\Delta x + \frac{\partial v}{\partial y}(x_0, y_0)\Delta y + R_2(\Delta x, \Delta y),   (6)

where \Delta x = x - x_0, \Delta y = y - y_0, and

\lim_{\Delta x, \Delta y \to 0} \frac{R_i}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} = 0.

Now, using the Cauchy-Riemann equations, we have

f(z_0 + \Delta z) - f(z_0) = \frac{\partial u}{\partial x}(x_0, y_0)\Delta x + \frac{\partial u}{\partial y}(x_0, y_0)\Delta y + R_1 + i\Big[\frac{\partial v}{\partial x}(x_0, y_0)\Delta x + \frac{\partial v}{\partial y}(x_0, y_0)\Delta y + R_2\Big]
= \frac{\partial u}{\partial x}(\Delta x + i\Delta y) + i\frac{\partial v}{\partial x}(\Delta x + i\Delta y) + R_1 + iR_2.   (7)

So

\frac{f(z_0 + \Delta z) - f(z_0)}{\Delta z} = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x} + \frac{R_1 + iR_2}{\Delta z} \longrightarrow \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}.   (8)

This completes the proof.
We list some practical rules of differentiation:

f(z) = z^2 \longrightarrow f'(z) = 2z,
f(z) = z^k \longrightarrow f'(z) = kz^{k-1} \quad (k \text{ an integer}),
(e^z)' = e^z,
(f(z)g(z))' = f'(z)g(z) + f(z)g'(z),
[F(g(z))]' = F'(g(z))\, g'(z),
\Big(\frac{1}{f}\Big)' = -\frac{f'}{f^2}.   (9)
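The strength of complex differentiability is that the limit must agree over all approach directions. A Python sketch comparing real and imaginary approach directions for an analytic and a non-analytic function (step size and sample point arbitrary):

```python
def diff_quotient(f, z, h):
    """Difference quotient (f(z+h) - f(z)) / h for a complex step h."""
    return (f(z + h) - f(z)) / h

f = lambda z: z ** 2           # analytic, with f'(z) = 2z
z = 1.5 + 0.5j
h = 1e-6
along_x = diff_quotient(f, z, h)        # real step
along_y = diff_quotient(f, z, 1j * h)   # imaginary step
assert abs(along_x - 2 * z) < 1e-5
assert abs(along_y - 2 * z) < 1e-5

# By contrast, f(z) = conj(z) is NOT analytic: the two limits differ.
g = lambda z: z.conjugate()
assert abs(diff_quotient(g, z, h) - diff_quotient(g, z, 1j * h)) > 1.0
```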
2.2.2. Integration.
Integration in the complex plane is defined in terms of real line integrals of the complex function f = u + iv. If C is any (geometric) curve in the complex plane, we define the line integral

\int_C f(z)\,dz = \int_C (u + iv)(dx + i\,dy) = \int_C u(x, y)\,dx - v(x, y)\,dy + i\int_C v\,dx + u\,dy.
Example. See homework hints.
Theorem 2.2. If f(z) is analytic in a domain Ω, then

\int_C f(z)\,dz = 0

for any closed curve C whose interior lies entirely in Ω.
Note that “a curve C whose interior lies entirely in Ω” is a stronger requirement
than “a curve C which lies entirely in Ω”. The stronger requirement rules out the
situation that the relevant part of Ω is not simply connected.
Proof. Recall Green's Theorem

\int_{\partial\Omega} \varphi\,dx + \psi\,dy = \int_{\Omega} \Big(\frac{\partial\psi}{\partial x} - \frac{\partial\varphi}{\partial y}\Big)\,dx\,dy

for a simply connected domain Ω. We apply this formula to our complex integral to obtain

\int_C f(z)\,dz = \int_C (u + iv)(dx + i\,dy)
= \int_C u(x, y)\,dx - v(x, y)\,dy + i\int_C v\,dx + u\,dy
= \int_{\mathrm{int}\,C} \Big(\frac{\partial}{\partial x}(-v) - \frac{\partial}{\partial y}u\Big)\,dx\,dy + i\int_{\mathrm{int}\,C} \Big(\frac{\partial}{\partial x}u - \frac{\partial}{\partial y}v\Big)\,dx\,dy
= 0   (10)

by the Cauchy-Riemann equations, where int C denotes the interior of the contour C. This completes the proof.
Examples. 1. We have

\int_C z^n\,dz = 0

for any integer n and any contour C that does not enclose the origin. This follows from Theorem 2.2.

2. We can calculate

\int_{|z|=1} z^{-1}\,dz = \int_0^{2\pi} e^{-i\theta} \cdot i e^{i\theta}\,d\theta = 2\pi i.

3. We leave as an exercise the claim

\int_{|z|=1} z^{-n}\,dz = 0

for all integers n ≠ 1.
We note that the notation |z| = 1 means all points of the unit circle x2 + y2 = 1.
The default direction of the circle is counterclockwise.
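These contour integrals can be approximated by discretizing z = e^{iθ}; a Riemann-sum sketch in Python (the number of sample points is arbitrary):

```python
import cmath
import math

def circle_integral(f, n_points=2000):
    """Approximate the contour integral of f over the unit circle,
    traversed counterclockwise, with z = e^{i theta}."""
    total = 0j
    dtheta = 2 * math.pi / n_points
    for k in range(n_points):
        z = cmath.exp(1j * k * dtheta)
        dz = 1j * z * dtheta          # dz = i e^{i theta} d theta
        total += f(z) * dz
    return total

assert abs(circle_integral(lambda z: 1 / z) - 2j * math.pi) < 1e-3
assert abs(circle_integral(lambda z: z ** 2)) < 1e-3     # analytic: zero
assert abs(circle_integral(lambda z: z ** -2)) < 1e-3    # n = 2 case of Example 3
```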
2.2.3. Cauchy integral formula
We have found that contour integrals of analytic functions are always zero. Only
a few integrands with singularities result in nonzero values. The following Cauchy
integral formula describes contour integrals extremely well.
Theorem 2.3 (Cauchy integral formula). Let C be a simple (non-self-intersecting) closed curve traversed counterclockwise. Suppose f(z) is analytic everywhere inside C. For any point z inside C, there holds

\int_C \frac{f(\xi)}{\xi - z}\,d\xi = 2\pi i\, f(z).   (11)
Proof. For any ε > 0 fixed, we deform the curve C to C′, where C′ is a small circle |ξ - z| = ε chosen such that |f(ξ) - f(z)| < ε for all points ξ inside C′ (possible by continuity). Noting that the integrand in (11), f(ξ)/(ξ - z), is analytic in the region between C and C′, we conclude that the integral in (11) is equal to the same integral over C′. (This can be achieved by the previous Theorem and a double-sided cut (or bridge) connecting C and C′.) Now on C′ we have

\int_C \frac{f(\xi)}{\xi - z}\,d\xi = \int_{C'} \frac{f(\xi)}{\xi - z}\,d\xi
= f(z)\int_{C'} \frac{d\xi}{\xi - z} + \int_{C'} \frac{f(\xi) - f(z)}{\xi - z}\,d\xi
= 2\pi i\, f(z) + i\int_0^{2\pi} \big[f(z + \varepsilon e^{i\theta}) - f(z)\big]\,d\theta
= 2\pi i\, f(z) + iI,   (12)
where the integral I is such that |I| ≤ 2πε. Let ε → 0. We recover the Cauchy
integral formula. This completes the proof of the theorem.
Corollary 2.4. Under the same assumptions as Theorem 2.3, there hold

\int_C \frac{f(\xi)}{(\xi - z)^2}\,d\xi = 2\pi i\, f'(z)   (13)

and

n! \int_C \frac{f(\xi)}{(\xi - z)^{n+1}}\,d\xi = 2\pi i\, f^{(n)}(z)   (14)

for all n-th order derivatives (n a positive integer). Thus analyticity implies that f(z) is infinitely differentiable.
Corollary 2.5 (Poisson formula). A solution to the boundary value problem for the Laplacian,

\Delta u \equiv \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \quad \text{in } x^2 + y^2 \le 1,
u(r, \theta) = u_0(\theta) \quad \text{on the boundary } r = 1,   (15)
where (r, θ) are the polar coordinates and u_0(θ) is a given continuous function, is given by the formula

u(r, \theta) = \frac{1}{2\pi} \int_0^{2\pi} u_0(\varphi)\, \frac{1 - r^2}{1 - 2r\cos(\theta - \varphi) + r^2}\, d\varphi.

Proof. Consider an analytic function f(z) = u(x, y) + iv(x, y) in r < 1. We have the Cauchy-Riemann equations

\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}.

So we have

\frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 v}{\partial y \partial x} = \frac{\partial}{\partial y}\Big(-\frac{\partial u}{\partial y}\Big),

thus

\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0.
That is, the real part of an analytic function is a harmonic function (satisfying the
Laplace equation). Now we use the Cauchy integral formula

f(z) = \frac{1}{2\pi} \int_0^{2\pi} \frac{f(\xi)\,\xi}{\xi - z}\,d\varphi \quad (\text{letting } \xi = e^{i\varphi})

for z inside the unit circle, and the same formula

0 = \frac{1}{2\pi} \int_0^{2\pi} \frac{f(\xi)\,\xi}{\xi - (\bar{z})^{-1}}\,d\varphi,

applied at the point (\bar{z})^{-1}, which is outside of the unit circle (|1/\bar{z}| > 1 if |z| < 1). Noting that \bar{\xi} = \xi^{-1} on the unit circle, we can subtract the second formula from the first:

f(z) = \frac{1}{2\pi} \int_0^{2\pi} f(\xi)\left[\frac{\xi}{\xi - z} - \frac{1/\bar{\xi}}{1/\bar{\xi} - 1/\bar{z}}\right] d\varphi.

Or

f(z) = \frac{1}{2\pi} \int_0^{2\pi} f(\xi)\,\frac{1 - |z|^2}{|\xi - z|^2}\,d\varphi.

Writing z = re^{i\theta} and \xi = e^{i\varphi}, so that |\xi - z|^2 = 1 - 2r\cos(\theta - \varphi) + r^2, and taking the real part of the formula, we obtain the Poisson formula. This completes the proof.
2.2.4. Taylor series.
We would like to expand an analytic function in a power series:

f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n, \quad \text{where} \quad a_n = \frac{f^{(n)}(z_0)}{n!}

and z_0 is a convenient point for an application. Using the Cauchy integral formula for derivatives (Corollary 2.4), we find that

a_n = \frac{1}{2\pi i} \int_C \frac{f(\xi)}{(\xi - z_0)^{n+1}}\,d\xi

for any simple contour C that contains z_0 in its interior.

We can also use other, simpler ways to find Taylor series. For example, we have

\frac{1}{1 - z} = 1 + z + z^2 + z^3 + \cdots,

valid for all |z| < 1.
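The coefficient formula can be tested numerically: discretizing the contour integral for f(z) = e^z at z_0 = 0 should reproduce a_n = 1/n!. A Python sketch (grid size arbitrary):

```python
import cmath
import math

def taylor_coeff(f, z0, n, n_points=2000):
    """a_n = 1/(2 pi i) * integral of f(xi)/(xi - z0)^{n+1} over the unit
    circle centered at z0, approximated by a Riemann sum."""
    total = 0j
    dtheta = 2 * math.pi / n_points
    for k in range(n_points):
        xi = z0 + cmath.exp(1j * k * dtheta)
        dxi = 1j * (xi - z0) * dtheta
        total += f(xi) / (xi - z0) ** (n + 1) * dxi
    return total / (2j * math.pi)

for n in range(6):
    a_n = taylor_coeff(cmath.exp, 0, n)
    assert abs(a_n - 1 / math.factorial(n)) < 1e-6
```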
Using these lecture notes together with the textbook is recommended.
Chapter V. Ordinary Differential Equations
Outline:
5.1 First-order linear scalar equation.
5.2 High-order linear scalar equations with constant coefficients.
5.3 First-order linear systems with constant coefficients.
5.4 Stability of first-order linear systems.
5.5 Hopf Bifurcation.
We cover perturbation methods next semester.
5.1. First-order linear scalar equation
Let us solve the problem

\frac{dy}{dt} + a(t)y = 0, \quad y(0) = C.   (1)

We find

\frac{y'}{y} = -a(t).

Integrate:

\ln y(t) - \ln y(0) = -\int_0^t a(\tau)\,d\tau,
y(t) = e^{\ln y(0) - \int_0^t a(\tau)\,d\tau} = y(0)\, e^{-\int_0^t a(\tau)\,d\tau}.

So the solution to (1) is

y(t) = C e^{-\int_0^t a(\tau)\,d\tau}.   (2)
Now we consider (the first-order linear scalar equation)

\frac{dy}{dt} + a(t)y = f(t).   (3)

We look for a factor m(t) such that

m(t)y' + a(t)m(t)y = [m(t)y]'.

From the product rule, we need m(t) to satisfy

m'(t) = a(t)m(t).

From formula (2), we find an m(t):

m(t) = e^{\int_0^t a(s)\,ds}.

Equation (3) multiplied by m(t) becomes

\frac{d}{dt}[m(t)y(t)] = m(t)f(t).

Thus

m(t)y(t) - m(0)y(0) = \int_0^t m(\tau)f(\tau)\,d\tau,

or

y(t) = \frac{1}{m(t)}\Big[y(0) + \int_0^t m(\tau)f(\tau)\,d\tau\Big],

or

y(t) = e^{-\int_0^t a(\tau)\,d\tau}\Big[y(0) + \int_0^t e^{\int_0^\tau a(s)\,ds} f(\tau)\,d\tau\Big].   (4)
Examples. 1. All solutions to

y' + 5y = 0

are

y = c e^{-5t}.

2. All solutions to

y' + t^2 y = 0

are

y = c e^{-t^3/3}.
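Solution formula (2) can be checked against direct numerical integration. The sketch below solves y′ + t²y = 0 with a hand-rolled RK4 stepper and compares with y = e^{−t³/3} (step size and final time are illustrative choices):

```python
import math

def rk4(f, y0, t_end, dt=1e-3):
    """Integrate y' = f(t, y) from t = 0 to t_end with classical RK4."""
    t, y = 0.0, y0
    while t < t_end - 1e-12:
        h = min(dt, t_end - t)
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

# y' + t^2 y = 0, y(0) = 1  ->  y(t) = exp(-t^3/3)
y_num = rk4(lambda t, y: -t ** 2 * y, 1.0, 2.0)
y_exact = math.exp(-2.0 ** 3 / 3)
assert abs(y_num - y_exact) < 1e-8
```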
5.2. High-order linear scalar equations with constant coefficients
Let us solve the problem

\frac{d^3x}{dt^3} + \frac{d^2x}{dt^2} - 2x = 0,   (5)
x(0) = 0, \quad x'(0) = 1, \quad x''(0) = -1.   (6)

We try solutions of the form

x(t) = e^{\lambda t}.   (7)

Then x' = \lambda x(t), x''(t) = \lambda^2 x(t), x'''(t) = \lambda^3 x(t). Thus λ needs to satisfy

\lambda^3 + \lambda^2 - 2 = 0.

We find \lambda_1 = 1, \lambda_2 = -1 + i, \lambda_3 = -1 - i. So we have solutions e^t, e^{-t+it}, e^{-t-it}. Since the equation is linear, the two complex solutions can be added to or subtracted from each other to produce two real solutions, so we have three real solutions e^t, e^{-t}\cos t, e^{-t}\sin t. Also, any linear combination of the three solutions is a solution. Thus we have

x(t) = c_1 e^{t} + c_2 e^{-t}\cos t + c_3 e^{-t}\sin t   (8)

as the general solution formula for (5). One can use the three initial conditions (6) to determine the three coefficients c_1, c_2, c_3 in (8), which we omit here.
In general, the n-th-order linear scalar equation

a_n x^{(n)}(t) + a_{n-1} x^{(n-1)}(t) + \cdots + a_1 x' + a_0 x = 0

with constant coefficients (a_n, a_{n-1}, \cdots, a_0) and without forcing (right-hand side = 0) can be solved by the guess work (7). More precisely, from the algebraic equation

a_n \lambda^n + a_{n-1} \lambda^{n-1} + \cdots + a_1 \lambda + a_0 = 0,

we can find n roots. Suppose it has n distinct roots \lambda_1, \lambda_2, \cdots, \lambda_n. Then the general solution of the ODE is

x(t) = c_1 e^{\lambda_1 t} + c_2 e^{\lambda_2 t} + \cdots + c_n e^{\lambda_n t}.

If \lambda_1 and \lambda_2 are a pair of conjugate complex roots, say

\lambda_1 = a + bi, \quad \lambda_2 = a - bi,

then we can replace the part c_1 e^{\lambda_1 t} + c_2 e^{\lambda_2 t} by real solutions of the form

\alpha e^{at}\cos(bt) + \beta e^{at}\sin(bt).

If \lambda_1 is repeated, say \lambda_1 = \lambda_2, then c_2 e^{\lambda_2 t} is a multiple of the first solution c_1 e^{\lambda_1 t}. In this case, we replace the guess work e^{\lambda t} by t e^{\lambda t}, and the solution c_1 e^{\lambda_1 t} + c_2 e^{\lambda_2 t} is replaced by c_1 e^{\lambda_1 t} + c_2 t e^{\lambda_1 t}. If \lambda_1 is repeated m times, then the solution part

c_1 e^{\lambda_1 t} + c_2 e^{\lambda_2 t} + \cdots + c_m e^{\lambda_m t}

is replaced by

c_1 e^{\lambda_1 t} + c_2 t e^{\lambda_1 t} + \cdots + c_m t^{m-1} e^{\lambda_1 t}.
Example 3. Solve

\frac{d^2x}{dt^2} - 2\frac{dx}{dt} + x = 0.

Solution. Trying x = e^{\lambda t}, we find

\lambda^2 - 2\lambda + 1 = 0.

So

\lambda_1 = \lambda_2 = 1,

and the solutions are

x(t) = c_1 e^{t} + c_2 t e^{t}.
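One can confirm numerically that te^t really is a second solution, by checking the residual x″ − 2x′ + x with central differences (test points and step size arbitrary):

```python
import math

def residual(x, t, h=1e-4):
    """Central-difference approximation of x'' - 2x' + x at t."""
    xpp = (x(t + h) - 2 * x(t) + x(t - h)) / h ** 2
    xp = (x(t + h) - x(t - h)) / (2 * h)
    return xpp - 2 * xp + x(t)

x = lambda t: t * math.exp(t)   # candidate solution for the repeated root
for t in (0.0, 0.5, 1.0, 2.0):
    assert abs(residual(x, t)) < 1e-4
```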
5.3. First-order linear systems with constant coefficients
Motivation: The planetary motion can be described by a system of equations.
Let us solve the system
\frac{dx_1}{dt} - x_2 - x_3 = 0, \quad \frac{dx_2}{dt} - x_1 - x_3 = 0, \quad \frac{dx_3}{dt} - x_1 - x_2 = 0,   (1)

with initial conditions

(x_1(0), x_2(0), x_3(0)) = (c_1, c_2, c_3).   (2)

Motivated by the guess work for a scalar equation, x = c e^{\lambda t}, we try the form

(x_1(t), x_2(t), x_3(t)) = (a e^{\lambda t}, b e^{\lambda t}, c e^{\lambda t}).   (3)

We see

(x_1', x_2', x_3') = \lambda (a, b, c)\, e^{\lambda t} = \lambda (x_1, x_2, x_3).

Inserting this back into (1), we have

\lambda x_1 - x_2 - x_3 = 0, \quad \lambda x_2 - x_1 - x_3 = 0, \quad \lambda x_3 - x_1 - x_2 = 0.
This is a homogeneous linear system of three algebraic equations. We write it in matrix form:

\begin{pmatrix} \lambda & -1 & -1 \\ -1 & \lambda & -1 \\ -1 & -1 & \lambda \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = 0.

Removing the common factor e^{\lambda t} in (x_1, x_2, x_3)^T in the above equation, we find

\begin{pmatrix} \lambda & -1 & -1 \\ -1 & \lambda & -1 \\ -1 & -1 & \lambda \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = 0.   (4)
To have a nonzero solution (a, b, c)^T, we need the matrix to have zero determinant:

\det \begin{pmatrix} \lambda & -1 & -1 \\ -1 & \lambda & -1 \\ -1 & -1 & \lambda \end{pmatrix} = 0.   (5)

This determinant can be evaluated to be

\lambda^3 - 3\lambda - 2 = (\lambda + 1)^2 (\lambda - 2).   (6)

The factorization is made possible by the inspection that λ = -1 is a root. Equation (5) then has three roots:

\lambda_1 = 2, \quad \lambda_2 = -1, \quad \lambda_3 = -1.   (7)
Using the root \lambda_1 = 2 in (4), we have the equation

\begin{pmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = 0.   (8)

The solutions are

(a, b, c)^T = \alpha (1, 1, 1)^T \quad (\alpha \text{ free}).   (9)

Thus we find the first batch of solutions:

(x_1, x_2, x_3)^T = \alpha (1, 1, 1)^T e^{2t}.   (10)
Using \lambda_2 = -1 in (4), we find the equation

\begin{pmatrix} -1 & -1 & -1 \\ -1 & -1 & -1 \\ -1 & -1 & -1 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = 0.   (11)

The solutions to (11) are

(a, b, c)^T = \beta (1, 0, -1)^T + \gamma (0, 1, -1)^T \quad (\beta, \gamma \text{ free}).   (12)

We find another batch of solutions to (1):

(x_1, x_2, x_3)^T = \beta (1, 0, -1)^T e^{-t} + \gamma (0, 1, -1)^T e^{-t}.   (13)
If \lambda_3 were different from \lambda_2, we could use it to find another batch. But so far we have found plenty of solutions. We combine the solutions (10) and (13) linearly to end up with the general solution formula for (1):

(x_1, x_2, x_3)^T = \alpha (1, 1, 1)^T e^{2t} + \beta (1, 0, -1)^T e^{-t} + \gamma (0, 1, -1)^T e^{-t}.   (14)

We can use the initial condition (2) to determine the three arbitrary constants \alpha, \beta, \gamma (which we skip).
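The eigenpairs behind (14) are easy to verify directly from A~a = λ~a; a pure-Python sketch:

```python
def matvec(A, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]   # coefficient matrix of the system x' = A x

eigenpairs = [(2, [1, 1, 1]),    # lambda_1 = 2
              (-1, [1, 0, -1]),  # lambda_2 = -1
              (-1, [0, 1, -1])]  # lambda_3 = -1

for lam, v in eigenpairs:
    assert matvec(A, v) == [lam * x for x in v]
```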
In general, equation (1) can be written as

\frac{d\vec{x}}{dt} = A\vec{x}   (15)

for an n × n matrix A with constant coefficients (a_{ij}). For our previous example,

A = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}.

The guess work is

\vec{x} = \vec{a}\, e^{\lambda t},   (16)

where \vec{a} is a vector. Then λ needs to satisfy

\det(\lambda I - A) = 0   (17)
and \vec{a} satisfies

A\vec{a} = \lambda\vec{a}.   (18)

If the characteristic equation (17) has n roots \lambda_1, \lambda_2, \cdots, \lambda_n, and the eigenvalue problem (18) has n corresponding linearly independent eigenvectors \vec{a}_1, \vec{a}_2, \cdots, \vec{a}_n, then the general solution of (15) is

\vec{x}(t) = \alpha_1 \vec{a}_1 e^{\lambda_1 t} + \alpha_2 \vec{a}_2 e^{\lambda_2 t} + \cdots + \alpha_n \vec{a}_n e^{\lambda_n t}.   (19)

If, for example, \lambda_1 = \lambda_2 and the corresponding linearly independent eigenvectors are fewer than n, then we use the guess work (in addition to \alpha_1 \vec{a}_1 e^{\lambda_1 t})

\vec{x} = \alpha_2 (\vec{a}_2 e^{\lambda_1 t} + \vec{a}_1 t e^{\lambda_1 t}).

This way a nonzero \vec{a}_2 can be found from the sequence of equations

A\vec{a}_1 = \lambda_1 \vec{a}_1,
A\vec{a}_2 = \lambda_1 \vec{a}_2 + \vec{a}_1.

And the general solution is

\vec{x} = \alpha_1 \vec{a}_1 e^{\lambda_1 t} + \alpha_2 (\vec{a}_2 + \vec{a}_1 t) e^{\lambda_1 t} + \alpha_3 \vec{a}_3 e^{\lambda_3 t} + \cdots + \alpha_n \vec{a}_n e^{\lambda_n t}.

If \lambda_1 is repeated more times, then higher powers of t can be used in the guess work. If, however, all eigenvalues are distinct, a theorem says that the n corresponding eigenvectors are linearly independent and (19) gives the general solution.
5.4. Stability of first-order linear system
Motivation: A solution needs to be stable in order to be useful in practice. The
U.S. missile defense system is not yet stable.
Consider

\frac{d\vec{x}}{dt} = A\vec{x}, \quad \vec{x}(0) = \vec{C},   (1)

where A is an n × n matrix of constants with n distinct eigenvalues. The solution formula is

\vec{x}(t) = \alpha_1 \vec{a}_1 e^{\lambda_1 t} + \alpha_2 \vec{a}_2 e^{\lambda_2 t} + \cdots + \alpha_n \vec{a}_n e^{\lambda_n t}.
Theorem 1. If the real parts of all the eigenvalues of the coefficient matrix A are (strictly) negative, then any solution to (1) goes to zero as t → +∞.

Theorem 2. If one or more eigenvalues of A have positive real parts, then some solutions of (1) go to infinity as t → +∞.

Proofs. They follow from the solution formula if all eigenvalues are distinct. Otherwise, solutions are like t^m e^{\lambda t}, which also go to zero if the real part of λ is negative, or go to infinity if the real part of λ is positive.
Now let us consider a perturbation of (1):

\frac{d\vec{x}}{dt} = A\vec{x} + R(t, \vec{x}).   (2)

Suppose

\|R(t, \vec{x})\| \le \alpha \|\vec{x}\| \quad \text{on } t \ge 0, \ \|\vec{x}\| < H,   (3)

for some constants α and H > 0. Then:

Theorem 3. If the real parts of all the eigenvalues of A are (strictly) negative, and (3) holds for a suitably small α, then the zero solution of (2) is asymptotically stable; i.e., all solutions of (2) with small initial data go to zero as t → +∞.

Theorem 4. If one or more eigenvalues of A have positive real parts, then the zero solution is not stable, provided that (3) holds for a suitably small α.
What if one eigenvalue has zero real part and all others have negative real parts?
This is called the critical case, and is where bifurcation occurs. We will discuss these
issues in the next section. We provide some concrete stability examples below.
Examples. 1. Consider

\frac{dx_1}{dt} = \lambda x_1, \quad \frac{dx_2}{dt} = \mu x_2.   (4)

Suppose that λ < 0, μ < 0. Then all solutions go to zero as t → +∞. Add a perturbation R(t, x_1, x_2) with \|R(t, \vec{x})\| < \varepsilon\|\vec{x}\|:

\frac{dx_1}{dt} = \lambda x_1 + R_1(t, x_1, x_2), \quad \frac{dx_2}{dt} = \mu x_2 + R_2(t, x_1, x_2).

If \varepsilon < \min(|\lambda|, |\mu|), the zero solution is stable: all solutions \vec{x}(t) \to 0 as t \to +\infty.

2. For (4) again, but with λ < 0 < μ. Zero is still a solution, but it is not stable, since the initially nearby solution

(x_1, x_2) = (0, \alpha e^{\mu t}),

where α is small, grows to infinity.
3. Consider now, for β ≠ 0, the system

\frac{dx_1}{dt} = \beta x_2, \quad \frac{dx_2}{dt} = -\beta x_1.

Differentiating the first equation and using the second equation, we find

\frac{d^2 x_1}{dt^2} + \beta^2 x_1 = 0.

We can therefore find the solution formula

x_1 = x_1^0 \cos(\beta t) + x_2^0 \sin(\beta t), \quad x_2 = -x_1^0 \sin(\beta t) + x_2^0 \cos(\beta t).

Introduce \rho(t) = (x_1^2 + x_2^2)^{1/2}; then \rho(t) = ((x_1^0)^2 + (x_2^0)^2)^{1/2}. See Figure 5.1 for the phase portrait of the solutions. This solution is, however, unstable to perturbations of the form

(0, \alpha x_2)^T,

where α > 0, because then the equation has the matrix

\begin{pmatrix} 0 & \beta \\ -\beta & \alpha \end{pmatrix},

one of whose eigenvalues has positive real part.
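The eigenvalues of the perturbed matrix come from its characteristic quadratic λ² − αλ + β² = 0, so both real parts equal α/2 > 0 when α² < 4β². A quick cmath check (values illustrative):

```python
import cmath

def eigs_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] via the characteristic quadratic."""
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr ** 2 - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

beta, alpha = 1.0, 0.1            # small perturbation alpha > 0
l1, l2 = eigs_2x2(0, beta, -beta, alpha)
assert l1.real > 0                # instability
assert abs(l1.real - alpha / 2) < 1e-12
assert abs(l2.real - alpha / 2) < 1e-12
```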
Notes. 1. The stability of a nonzero solution w(t) can be transformed to the stability of the zero solution of the equation for v(t) ≡ u(t) - w(t).

2. A general nonlinear system

\frac{d\vec{x}}{dt} = \vec{F}(t, \vec{x})

may be approximated by (2), just as a curve can be approximated by its tangent lines.
Figure 5.1. Any solution of Example 3 traces a circle: the phase portrait of (x_1(t), x_2(t)) is a family of circles, traversed one way for β < 0 and the other way for β > 0.
5.5. Hopf bifurcations and example.
Motivation: Bifurcation theory is used in many areas: life sciences, ecological systems, weather systems, fluids, chaos, and turbulence.
Consider

\frac{d^2u}{dt^2} + (u^2 - \lambda)\frac{du}{dt} + u = 0.   (5)

It has the solution u = 0. Let us consider the linearized equation

\frac{d^2u}{dt^2} - \lambda\frac{du}{dt} + u = 0.
When we try solutions of the form u = e^{\mu t}, we find

\mu^2 - \lambda\mu + 1 = 0.
For λ < 0, both roots have negative real parts, so the zero solution is stable. For λ > 0, both roots have positive real parts, so the zero solution is unstable. At λ = 0, the roots are purely imaginary, μ = i, -i, and the linearized equation has the periodic solutions u = e^{it} = \cos t + i\sin t and u = e^{-it} = \cos t - i\sin t. Both the real and imaginary parts are real solutions: u(t) = \cos t or \sin t.
We write equation (5) in vector form by introducing u_1 = u, u_2 = u':

u_1' = u_2, \quad u_2' = (\lambda - u_1^2)u_2 - u_1.

Or

\vec{u}\,'(t) = \begin{pmatrix} 0 & 1 \\ -1 & \lambda \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} + \begin{pmatrix} 0 \\ -u_1^2 u_2 \end{pmatrix}.

This nonlinear system has nonzero periodic solutions near λ = 0:

\lambda = \frac{\varepsilon^2}{4} + O(\varepsilon^3), \quad u_1(t) = \varepsilon\cos(\omega t) + O(\varepsilon^3), \quad \omega = 1 + O(\varepsilon^3).
We will derive this expansion in perturbation theory next semester. For now we have
a bifurcation diagram, see Figure 5.2, and we state a general bifurcation theorem
called Hopf bifurcation.
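The expansion predicts a limit-cycle amplitude ε ≈ 2√λ for small λ > 0. This can be checked by integrating the system numerically until it settles onto the limit cycle; the RK4 sketch below does this for one small λ (the step size, time horizon, and tolerance are illustrative choices, not part of the theory):

```python
import math

def step(u1, u2, lam, h):
    """One RK4 step for u1' = u2, u2' = (lam - u1^2) u2 - u1."""
    f = lambda a, b: (b, (lam - a * a) * b - a)
    k1 = f(u1, u2)
    k2 = f(u1 + h * k1[0] / 2, u2 + h * k1[1] / 2)
    k3 = f(u1 + h * k2[0] / 2, u2 + h * k2[1] / 2)
    k4 = f(u1 + h * k3[0], u2 + h * k3[1])
    return (u1 + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            u2 + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

lam, h = 0.04, 0.01
u1, u2 = 0.1, 0.0
for _ in range(40000):            # transient: relax onto the limit cycle
    u1, u2 = step(u1, u2, lam, h)
amplitude = 0.0
for _ in range(20000):            # measure max |u1| over many periods
    u1, u2 = step(u1, u2, lam, h)
    amplitude = max(amplitude, abs(u1))
assert abs(amplitude - 2 * math.sqrt(lam)) < 0.05   # epsilon ~ 2 sqrt(lambda)
```

Rescaling u = √λ w turns (5) into the Van der Pol equation w″ + λ(w² − 1)w′ + w = 0, whose small-parameter limit cycle has amplitude about 2, which is consistent with the measured amplitude 2√λ.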
Figure 5.2. Hopf bifurcation diagram. (a) Bifurcation diagram: the branch of zero solutions along the λ-axis and a branch of periodic solutions emanating from λ = 0; the vertical axis ‖u‖_max is the amplitude of a solution, and each point on the branch indicates a periodic solution. (b) A periodic solution u(t).
Theorem (Hopf Bifurcation). Suppose the n × n matrix A(λ) has eigenvalues \mu_j = \mu_j(\lambda), j = 1, 2, \cdots, n, and that for \lambda = \lambda_0, \mu_1(\lambda_0) = i\beta, \mu_2(\lambda_0) = -i\beta, and \mathrm{Re}\,\mu_j(\lambda_0) \ne 0 for all j > 2. Suppose further that \mathrm{Re}\,(\mu_1'(\lambda_0)) \ne 0. Then the system of differential equations

\frac{du}{dt} = A(\lambda)u + f(u),

with f(0) = 0 and f(u) a smooth function of u, has a branch (continuum) of periodic solutions emanating from u = 0, \lambda = \lambda_0.

(The direction of bifurcation is not determined by the Hopf Bifurcation Theorem, but must be calculated by a local power series expansion (see Keener).)
We plan to do serious perturbation theory next semester, where we can understand how a mathematician's perturbation calculation helped locate the position of a planet of the solar system.
5.6. Another bifurcation example.
See Keener p.478: Nonlinear Eigenvalue problems.
Consider the elastica equation (a.k.a. the Euler column)

y'' + \Big(\lambda - \frac{1}{2}\int_0^1 (y')^2\,ds\Big)y = 0, \quad y(0) = y(1) = 0,   (6)

where λ is a parameter.

We see that the integral \int_0^1 (y')^2(s)\,ds is a number. So let us introduce the number

\mu = \lambda - \frac{1}{2}\int_0^1 (y')^2(s)\,ds.

Then equation (6) becomes

y'' + \mu y = 0, \quad y(0) = y(1) = 0,   (7)

which has the solutions

y(x) = A\sin(n\pi x), \quad \text{for } \mu = n^2\pi^2.   (8)

These solutions produce

\mu = \lambda - \frac{1}{2}\int_0^1 (y')^2(s)\,ds = \lambda - \frac{1}{2}\int_0^1 (An\pi)^2\cos^2(n\pi x)\,dx = \lambda - \frac{1}{4}(An\pi)^2.

To satisfy (6), we need this μ to be the same as in (8); that is,

\frac{A^2}{4} = \frac{\lambda}{n^2\pi^2} - 1.

Thus we find many branches of solutions besides the zero solution; see Figure 5.3.
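The branch relation can be spot-checked by computing μ for y = A sin(nπx) by numerical quadrature and confirming that it equals n²π²; a Python sketch (the choice of λ and n is arbitrary):

```python
import math

def mu_of(lam, A, n, m=20000):
    """mu = lambda - (1/2) * integral_0^1 (y')^2 dx for y = A sin(n pi x),
    with the integral computed by a midpoint rule."""
    integral = 0.0
    for k in range(m):
        x = (k + 0.5) / m
        yp = A * n * math.pi * math.cos(n * math.pi * x)
        integral += yp * yp / m
    return lam - integral / 2

n = 2
lam = 5 * (n * math.pi) ** 2                       # above the bifurcation point
A = 2 * math.sqrt(lam / (n * math.pi) ** 2 - 1)    # branch formula A^2/4 = lam/(n pi)^2 - 1
assert abs(mu_of(lam, A, n) - (n * math.pi) ** 2) < 1e-6
```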
Figure 5.3. A nonlinear eigenvalue bifurcation diagram: bifurcation branches for n = 1, 2, 3 emanate from the bifurcation points λ/π² = 1, 4, 9; the vertical axis is the amplitude A, and each point indicates a solution y = A sin(nπx).
4.2. Laplace transform
Definition. For any f(t) ∈ L^1([0, ∞)), the function

L[f](s) = \int_0^\infty e^{-st} f(t)\,dt

for s ≥ 0, is called the Laplace transform of f(t).
The Laplace transform can be obtained from the Fourier transform through a certain specialization. The Laplace transform is very convenient to use for certain differential equations or with certain boundary conditions. But in general the two linear transforms are basically the same.
Property a. L[\frac{df}{dt}](s) = sL[f] - f(0).

Proof. We have the calculation

L[f'] = \int_0^\infty e^{-st} f'(t)\,dt = e^{-st} f(t)\Big|_0^\infty - \int_0^\infty (-s)e^{-st} f(t)\,dt = -f(0) + s\int_0^\infty e^{-st} f(t)\,dt = sL[f] - f(0).
We list other properties below without proof. We let F (s) denote L[f ](s); i.e., we
use the capital letter to denote the Laplace transform of a lower-case letter function.
Properties:

b. L[1] = \frac{1}{s};
c. L[e^{at}] = \frac{1}{s-a};
d. L[f''] = s^2 F(s) - s f(0) - \frac{df}{dt}(0);
e. L[-t f(t)] = \frac{dF}{ds};
f. L[\int_0^t f(t-\tau)g(\tau)\,d\tau] = F(s)G(s);
g. L[\delta(t-b)] = e^{-bs} \quad (b \ge 0);
h. L[t^n] = \frac{n!}{s^{n+1}} \quad (n > -1);
i. L[t^n e^{at}] = \frac{n!}{(s-a)^{n+1}} \quad (n > -1).
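Several of these properties can be sanity-checked by evaluating the defining integral numerically, truncated at a large T; a Python sketch (truncation point, grid size, and tolerances are illustrative):

```python
import math

def laplace(f, s, T=60.0, m=200000):
    """Approximate L[f](s) = integral_0^T e^{-st} f(t) dt (midpoint rule)."""
    total = 0.0
    dt = T / m
    for k in range(m):
        t = (k + 0.5) * dt
        total += math.exp(-s * t) * f(t) * dt
    return total

s = 2.0
assert abs(laplace(lambda t: 1.0, s) - 1 / s) < 1e-6                 # property b
assert abs(laplace(lambda t: math.exp(-t), s) - 1 / (s + 1)) < 1e-6  # property c, a = -1
assert abs(laplace(lambda t: t ** 3, s) - 6 / s ** 4) < 1e-6         # property h, n = 3
```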
We note that the Laplace transform turns differentiation into multiplication by the independent variable s, and multiplication by t into differentiation with respect to s. It also transforms the convolution

\int_0^t f(t - \tau)\, g(\tau)\,d\tau

into the product of the transforms of f and g. Note that the Laplace transform does not need f(t) to be defined for t < 0. Even if f(t) is defined for t < 0, its value there does not affect the transform. Therefore we adopt the convention that all relevant functions for the Laplace transform are defined to be zero for t < 0. Under this convention we find that

\int_0^t f(t - \tau)\, g(\tau)\,d\tau = \int_{-\infty}^{\infty} f(t - \tau)\, g(\tau)\,d\tau = f * g

is indeed the convolution defined in the previous section.
We also note that Laplace transforms exist for functions or functionals that are not in L^1([0, ∞)), e.g., the functional δ(t - b) and the function t^2.
We mention another example. Consider the Heaviside function

H(t - b) = \begin{cases} 0, & t < b, \\ 1, & t \ge b. \end{cases}

For b > 0, we find:

Property j. L[H(t - b)](s) = \int_0^\infty H(t - b)\, e^{-st}\,dt = \int_b^\infty e^{-st}\,dt = \frac{e^{-bs}}{s}.
Similar to the Fourier transform, there exists an inverse transform for the Laplace transform, but its direct use is inconvenient. The best way has been to use the above list of properties a-j for the inversion. If F(s) is the Laplace transform of f(t), then we call f(t) the inverse of F(s). For example,

L^{-1}\Big[\frac{1}{s}\Big] = 1 \quad (t > 0).
Example 1. Solve the initial value problem
u'' + 2u' + 2u = 0, \quad t > 0,
u(0) = 1, \quad u'(0) = 2.
Physical background. This equation can be regarded as the motion of a particle
with mass m = 1, attached to a spring with Hooke’s spring constant k = 2, and wind
drag force proportional to the velocity u′. Newton’s second law says F = ma (Force
= mass × acceleration). Here ma = u′′ where u(t) represents the displacement of
the particle from the equilibrium. The spring force is −2u, the wind drag force is
−2u′, where u′ is velocity. So u′′ = −2u′ − 2u is Newton’s law.
Solution: Let U(s) = L[u]. Then we use the properties a-j to find

L[u'] = sU - 1, \quad L[u''] = s^2 U - s - 2.

So

s^2 U - s - 2 + 2(sU - 1) + 2U = 0,
(s^2 + 2s + 2)U = s + 4,
U = \frac{s + 4}{s^2 + 2s + 2}.

We notice s^2 + 2s + 2 = [s + (1 - i)][s + (1 + i)]. By partial fractions (see below), we have

\frac{s + 4}{s^2 + 2s + 2} = \frac{\alpha}{s + 1 - i} + \frac{\beta}{s + 1 + i},

where

\alpha = \frac{1}{2}(1 - 3i), \quad \beta = \frac{1}{2}(1 + 3i).

Then by linearity of the transform, we have

u = L^{-1}[U(s)] = \alpha L^{-1}\Big[\frac{1}{s - (-1 + i)}\Big] + \beta L^{-1}\Big[\frac{1}{s - (-1 - i)}\Big].

Using property c for a = -1 + i and then a = -1 - i, we have

u = \alpha e^{(-1+i)t} + \beta e^{(-1-i)t} = \frac{1}{2}(1 - 3i)\,e^{-t}(\cos t + i\sin t) + \frac{1}{2}(1 + 3i)\,e^{-t}(\cos t - i\sin t) = e^{-t}(\cos t + 3\sin t).
Partial fractions. We show here how to express a complicated fraction as a sum of simple fractions for which we can invert the Laplace transform. We make a guess for the sum:

\frac{s + 4}{s^2 + 2s + 2} = \frac{\alpha}{s + 1 - i} + \frac{\beta}{s + 1 + i},

where α and β are numbers to be determined. Then we multiply both sides of the equation by s^2 + 2s + 2 to find

s + 4 = \alpha(s + 1 + i) + \beta(s + 1 - i) = (\alpha + \beta)s + \alpha(1 + i) + \beta(1 - i).

This equation has to be true for all s, so we have

\alpha + \beta = 1, \quad \alpha(1 + i) + \beta(1 - i) = 4.

This system of algebraic equations can be solved easily:

1 + i(\alpha - \beta) = 4, \quad \alpha - \beta = -3i, \quad \alpha = \frac{1}{2}(1 - 3i), \quad \beta = \frac{1}{2}(1 + 3i).

This finishes the partial fractions used in Example 1. For general methods of partial fractions, see our textbook.
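Both the partial-fraction coefficients and the final answer u(t) = e^{−t}(cos t + 3 sin t) can be verified numerically; a Python sketch using finite differences for the ODE residual (sample points and step size arbitrary):

```python
import math

# Partial-fraction identity: (s+4)/(s^2+2s+2) = a/(s+1-i) + b/(s+1+i)
a, b = (1 - 3j) / 2, (1 + 3j) / 2
for s in (0.3, 1.0, 2.5):
    lhs = (s + 4) / (s ** 2 + 2 * s + 2)
    rhs = a / (s + 1 - 1j) + b / (s + 1 + 1j)
    assert abs(lhs - rhs) < 1e-12

# u(t) = e^{-t}(cos t + 3 sin t) solves u'' + 2u' + 2u = 0, u(0)=1, u'(0)=2
u = lambda t: math.exp(-t) * (math.cos(t) + 3 * math.sin(t))
h = 1e-4
assert abs(u(0.0) - 1.0) < 1e-12
assert abs((u(h) - u(-h)) / (2 * h) - 2.0) < 1e-6        # u'(0) = 2
for t in (0.5, 1.0, 2.0):
    upp = (u(t + h) - 2 * u(t) + u(t - h)) / h ** 2
    up = (u(t + h) - u(t - h)) / (2 * h)
    assert abs(upp + 2 * up + 2 * u(t)) < 1e-4           # ODE residual
```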
Example 2. Solve the initial value problem of the ordinary differential equation (ODE)

u'' + tu' + u = 0, \quad t > 0,
u(0) = 1, \quad u'(0) = 0.
Solution: Let U(s) = L[u]. Then

L[u'] = sU - 1, \quad L[u''] = s^2 U - s, \quad L[tu'] = -\frac{d}{ds}[sU - 1] = -sU' - U(s).

Then the ODE in question becomes

-sU' + s^2 U - s = 0,

or

U' - sU = -1.

How do we solve this new ODE? We multiply it by e^{-s^2/2}, so

\big(e^{-s^2/2}\, U\big)' = -e^{-s^2/2}.

We integrate in s from s to ∞ and use the condition U(s) → 0 as s → ∞:

0 - e^{-s^2/2}\, U(s) = -\int_s^\infty e^{-\sigma^2/2}\,d\sigma.
Or we have

U(s) = e^{s^2/2} \int_s^\infty e^{-\sigma^2/2}\,d\sigma.

Instead of evaluating this integral and then inverting it, we use a special trick. We introduce the new variable t = σ - s. The integral becomes

U(s) = \int_s^\infty e^{-(\sigma^2 - s^2)/2}\,d\sigma = \int_0^\infty e^{-st}\, e^{-t^2/2}\,dt.

Hence U(s) is the Laplace transform of e^{-t^2/2}, and so

u(t) = e^{-t^2/2}

is the solution. (Reference book: Weinberger.)
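A direct check: u′ = −tu gives u″ = −u + t²u = (t² − 1)u, so u″ + tu′ + u = (t² − 1 − t² + 1)u = 0. The Python sketch below verifies both this and the integral identity for U(s) (the value of s, the truncation point, and the grid size are illustrative):

```python
import math

u = lambda t: math.exp(-t * t / 2)

# ODE residual u'' + t u' + u, using the exact derivatives of the Gaussian
for t in (0.0, 0.7, 1.5, 3.0):
    up = -t * u(t)
    upp = (t * t - 1) * u(t)
    assert abs(upp + t * up + u(t)) < 1e-12

# U(s) = integral_0^inf e^{-st} e^{-t^2/2} dt equals e^{s^2/2} * tail integral
def midpoint(f, a, b, m=100000):
    dt = (b - a) / m
    return sum(f(a + (k + 0.5) * dt) for k in range(m)) * dt

s = 1.3
lhs = midpoint(lambda t: math.exp(-s * t) * u(t), 0.0, 40.0)
rhs = math.exp(s * s / 2) * midpoint(lambda x: math.exp(-x * x / 2), s, 40.0)
assert abs(lhs - rhs) < 1e-6
```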
For more on Laplace and Fourier transforms, see
1. Richard Haberman: Elementary Applied Partial Differential Equations, 2nd or later editions, Prentice Hall, 1987.
2. H. F. Weinberger: A First Course in Partial Differential Equations, John
Wiley & Sons, 1965.
Chapter VI. Partial Differential Equations
Tentative contents
A. In infinite domains.
6.1. Transport equations, method of characteristics.
6.2. Wave equation in IR1.
6.3. Wave equation in IR3.
6.4. Wave equation in IR2.
6.5. Heat equation in IRn and IR1+.
6.6. Laplace and Poisson equations in IRn.
6.7. Concept of fundamental solutions.
B. On rectangular domains, separation of variables.
6.8. Laplace equation in a rectangle, Fourier series.
6.9. Poisson equation in a rectangle.
6.10. Heat equation in a rectangle.
6.11. Wave equation in a rectangle.
6.12. Eigenvalue problems, Sturm-Liouville operator.
6.13. Explicit eigenfunctions, orthogonal polynomials, special functions, Bessel’s
functions.
6.14. Vibrating circular membrane.
C. Bounded domains general, Green’s function.
6.15. Laplace equation in general bounded domains, Green’s function
6.1 Transport equation, method of characteristics
We consider the simplest partial differential equation

\frac{\partial u}{\partial t} + a\frac{\partial u}{\partial x} = 0, \quad t > 0, \ x \in \mathrm{IR}^1,   (1)

where a is a constant. The general solution formula is

u(t, x) = g(x - at),   (2)

where g(·) is an arbitrary (smooth) function. Letting t = 0 in (2), we see that

u(0, x) = g(x);   (3)

thus g(·) is the initial condition for u and equation (1). One can let g be a Gaussian,

g(x) = e^{-x^2},

and plot the solution at times t = 1, 2, 3, \cdots, 10 for a = -2, -1, 0, 1, 2. We can conclude that the graph of u(t, x) is simply the graph of g(x) shifted by the amount at in the x direction.
Figure 6.1. Transport feature (shown for positive velocity a): the profile u at times t = 0, t = 1, and t > 1 is the initial graph shifted by at.
We consider now the transport equation in n dimensions,

\frac{\partial u}{\partial t} + a_1\frac{\partial u}{\partial x_1} + a_2\frac{\partial u}{\partial x_2} + \cdots + a_n\frac{\partial u}{\partial x_n} = 0, \quad t > 0, \ \vec{x} = (x_1, \cdots, x_n) \in \mathrm{IR}^n,   (4)

with initial condition

u(0, \vec{x}) = g(\vec{x}).   (5)

It can be readily verified that

u(t, \vec{x}) = g(\vec{x} - \vec{a}t).   (6)

Equation (4) is called a passive transport equation. We can add a source term to it and consider

\frac{\partial u}{\partial t} + \vec{a} \cdot \nabla u = f(t, \vec{x}), \quad u(0, \vec{x}) = g(\vec{x}).   (7)

Let us consider the straight lines

\frac{d\vec{x}}{dt} = \vec{a},   (8)

i.e.,

\vec{x} = \vec{x}(t) \equiv \vec{x}_0 + \vec{a}t,   (9)
which cover the whole space \mathrm{IR}^n \times \mathrm{IR} as \vec{x}_0 and t vary freely. These lines are called the characteristic lines of equation (7). See Figure 6.2. Let us fix an \vec{x}_0 and consider the function u(t, \vec{x}(t)). We find

\frac{d}{dt}u(t, \vec{x}(t)) = \frac{\partial u}{\partial t} + \nabla u \cdot \frac{d}{dt}\vec{x}(t) = \frac{\partial u}{\partial t} + \vec{a} \cdot \nabla u = f(t, \vec{x}(t)).   (10)

Thus we can integrate (10) to find

u(t, \vec{x}(t)) = u(0, \vec{x}(0)) + \int_0^t f(s, \vec{x}(s))\,ds = g(\vec{x}_0) + \int_0^t f(s, \vec{x}_0 + \vec{a}s)\,ds.   (11)

Looking at the characteristic lines the other way around, we can first fix a point (t, \vec{x}) \in \mathrm{IR}^1 \times \mathrm{IR}^n, determine an \vec{x}_0 at t = 0 from (9), and then (11) reads as

u(t, \vec{x}) = g(\vec{x} - \vec{a}t) + \int_0^t f(s, \vec{x} - \vec{a}(t - s))\,ds.
Figure 6.2. Characteristic lines: the line x(t) = x_0 + at starting at x_0 and passing through the point (x, t).
Motivation of the equation: Convection or transport is an important part of many partial differential equations, such as the neutron transport equation, the Boltzmann equation, fluid dynamics, etc.

The method used in (8)-(11) is called the method of characteristics. This method can be used to solve equation (7) when \vec{a} is a function of (t, \vec{x}), or even when \vec{a} is a function of u, making (7) a nonlinear first-order equation.
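A one-dimensional instance of formula (11) can be checked numerically. With a = 1, g(x) = e^{−x²}, and the illustrative source f(t, x) = x, the formula gives u(t, x) = g(x − t) + ∫₀ᵗ (x − (t − s)) ds = g(x − t) + xt − t²/2; the sketch below compares a Riemann sum of the characteristic integral with this closed form:

```python
import math

a = 1.0
g = lambda x: math.exp(-x * x)
f = lambda t, x: x                  # illustrative source term

def u_char(t, x, m=20000):
    """u(t,x) = g(x - a t) + integral_0^t f(s, x - a(t-s)) ds (midpoint)."""
    ds = t / m
    integral = sum(f((k + 0.5) * ds, x - a * (t - (k + 0.5) * ds)) * ds
                   for k in range(m))
    return g(x - a * t) + integral

t, x = 1.5, 0.8
exact = g(x - a * t) + x * t - t * t / 2
assert abs(u_char(t, x) - exact) < 1e-8
```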
6.2. Wave equation in IR1
Modeling: Imagine a piece of string stretched tightly (a taut string). We measure the speed of sound c = (T/\rho)^{1/2}, where T is the tension in the string and ρ is the linear mass density, both assumed constant. Given the initial position g(x) and initial velocity h(x),
we use a video camera to record its true motion, and a mathematical model with a computer to make a movie of the motion. We then compare the two videos. They can be made extremely close! I choose to present the mathematical solution formula only, while omitting the derivation of the model, although the derivation is important and very interesting.
We consider the wave equation (vibrating string equation)

\frac{\partial^2 u}{\partial t^2} - c^2\frac{\partial^2 u}{\partial x^2} = f(t, x), \quad t > 0, \ x \in \mathrm{IR}^1,   (1)

with initial conditions

u(0, x) = g(x), \quad \frac{\partial u}{\partial t}(0, x) = h(x).   (2)

Theorem (D'Alembert formula). A solution to (1)-(2) is

u(t, x) = \frac{1}{2}\big[g(x + ct) + g(x - ct)\big] + \frac{1}{2c}\int_{x-ct}^{x+ct} h(y)\,dy + \frac{1}{2c}\int_0^t \int_{x-c(t-s)}^{x+c(t-s)} f(s, y)\,dy\,ds.   (3)
Proof: We introduce the new coordinates

\xi = x + ct, \quad \eta = x - ct.   (4)

Then by the chain rule,

\frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 u}{\partial \xi^2} + 2\frac{\partial^2 u}{\partial \xi \partial \eta} + \frac{\partial^2 u}{\partial \eta^2}, \qquad \frac{\partial^2 u}{\partial t^2} = c^2\Big[\frac{\partial^2 u}{\partial \xi^2} - 2\frac{\partial^2 u}{\partial \xi \partial \eta} + \frac{\partial^2 u}{\partial \eta^2}\Big].

Thus (1) becomes

-4c^2\frac{\partial^2 u}{\partial \xi \partial \eta} = f(t, x), \quad \text{or} \quad \frac{\partial^2 u}{\partial \xi \partial \eta} = -\frac{1}{4c^2} f\Big(\frac{\xi - \eta}{2c}, \frac{\xi + \eta}{2}\Big).
Integrating twice, we find

u = F(\xi) + G(\eta) - \frac{1}{4c^2}\int_0^\xi \int_0^\eta f\Big(\frac{\xi' - \eta'}{2c}, \frac{\xi' + \eta'}{2}\Big)\,d\eta'\,d\xi'.   (5)

Thus

u(t, x) = F(x + ct) + G(x - ct) - \frac{1}{4c^2}\int_0^{x+ct} \int_0^{x-ct} f\Big(\frac{\xi' - \eta'}{2c}, \frac{\xi' + \eta'}{2}\Big)\,d\eta'\,d\xi'.   (6)

By fitting this formula to the initial data (2), we can determine F and G. By a further change of variables, we can manipulate the third term of (6) into the form in formula (3).
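For the case f = 0, h = 0, the d'Alembert solution reduces to u = ½[g(x + ct) + g(x − ct)]. A finite-difference check of the residual u_tt − c²u_xx in Python (wave speed, profile, and test points are illustrative):

```python
import math

c = 2.0
g = lambda x: math.exp(-x * x)
u = lambda t, x: 0.5 * (g(x + c * t) + g(x - c * t))  # d'Alembert, h = 0, f = 0

d = 1e-4                                              # finite-difference step
for (t, x) in [(0.5, 0.3), (1.0, -1.2), (2.0, 0.0)]:
    utt = (u(t + d, x) - 2 * u(t, x) + u(t - d, x)) / d ** 2
    uxx = (u(t, x + d) - 2 * u(t, x) + u(t, x - d)) / d ** 2
    assert abs(utt - c * c * uxx) < 1e-3              # wave equation residual
```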
6.3. Wave equation in IR3.
We consider the initial value problem for the homogeneous three-dimensional
wave equation

\frac{\partial^2 u}{\partial t^2} - c^2\Big[\frac{\partial^2 u}{\partial x_1^2} + \frac{\partial^2 u}{\partial x_2^2} + \frac{\partial^2 u}{\partial x_3^2}\Big] = 0, \quad t > 0,   (7)

u(0, x_1, x_2, x_3) = 0,   (8)

\frac{\partial u}{\partial t}(0, x_1, x_2, x_3) = h(x_1, x_2, x_3).   (9)
We let

\hat{u}(t, \omega_1, \omega_2, \omega_3) = \frac{1}{(2\pi)^{3/2}} \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} u(t, x_1, x_2, x_3)\, e^{i(\omega_1 x_1 + \omega_2 x_2 + \omega_3 x_3)}\,dx_1\,dx_2\,dx_3,

\hat{h}(\omega_1, \omega_2, \omega_3) = \frac{1}{(2\pi)^{3/2}} \int_{\mathrm{IR}^3} h(x_1, x_2, x_3)\, e^{i\omega \cdot x}\,dx.

That is, \hat{u} is the Fourier transform of u in \mathrm{IR}^3. Under this transform, equation (7) and conditions (8)-(9) become

\frac{\partial^2 \hat{u}}{\partial t^2} + c^2(\omega_1^2 + \omega_2^2 + \omega_3^2)\hat{u} = 0, \quad \hat{u}(0, \omega_1, \omega_2, \omega_3) = 0, \quad \frac{\partial \hat{u}}{\partial t}(0, \omega_1, \omega_2, \omega_3) = \hat{h}(\omega_1, \omega_2, \omega_3).   (10)
The problem has the solution

\hat{u}(t, \omega_1, \omega_2, \omega_3) = \hat{h}(\omega_1, \omega_2, \omega_3)\, \frac{\sin\big((\omega_1^2 + \omega_2^2 + \omega_3^2)^{1/2}\,ct\big)}{c\,(\omega_1^2 + \omega_2^2 + \omega_3^2)^{1/2}}.   (11)
By the inversion theorem,

u(t, x_1, x_2, x_3) = \frac{1}{(2\pi)^{3/2}} \int_{\mathrm{IR}^3} \hat{u}(\omega)\, e^{-i\vec{\omega}\cdot\vec{x}}\,d\vec{\omega},   (12)

and a series of hard calculations (see Weinberger, pp. 333-335), we end up with

u(t, x_1, x_2, x_3) = \frac{t}{4\pi(ct)^2} \int\!\!\int_{|y-x|=ct} h(y)\,dS_y
= \frac{t}{4\pi} \int_0^{2\pi}\!\!\int_0^{\pi} h(x_1 + ct\sin\varphi\cos\theta,\ x_2 + ct\sin\varphi\sin\theta,\ x_3 + ct\cos\varphi)\,\sin\varphi\,d\varphi\,d\theta.   (13)

Recall the spherical coordinates

x = r\sin\varphi\cos\theta, \quad y = r\sin\varphi\sin\theta, \quad z = r\cos\varphi,

where θ is the angle in the (x, y) plane and φ is the angle away from the z-axis. The solution (13) is t times the average of h over the sphere centered at (x_1, x_2, x_3) with radius ct.
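Formula (13) can be spot-checked by quadrature over the sphere. For the illustrative choice h(x) = |x|², the spherical average is |x|² + (ct)², so u = t(|x|² + c²t²); one verifies directly that this u satisfies (7)-(9). A Python sketch (grid size and tolerance are illustrative):

```python
import math

def u_spherical_mean(h, x, t, c, m=400):
    """Evaluate (t/4pi) * double integral of h over the sphere of radius c t
    about x, parametrized by spherical angles (midpoint rule)."""
    total = 0.0
    dphi, dth = math.pi / m, 2 * math.pi / m
    for i in range(m):
        phi = (i + 0.5) * dphi
        for j in range(m):
            th = (j + 0.5) * dth
            y = (x[0] + c * t * math.sin(phi) * math.cos(th),
                 x[1] + c * t * math.sin(phi) * math.sin(th),
                 x[2] + c * t * math.cos(phi))
            total += h(y) * math.sin(phi) * dphi * dth
    return t / (4 * math.pi) * total

h = lambda y: y[0] ** 2 + y[1] ** 2 + y[2] ** 2   # h(x) = |x|^2
x, t, c = (1.0, -0.5, 2.0), 0.7, 3.0
exact = t * (sum(v * v for v in x) + c * c * t * t)
assert abs(u_spherical_mean(h, x, t, c) - exact) < 1e-2
```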
It is interesting to note that a solution to

\frac{\partial^2 u}{\partial t^2} - c^2\Big(\frac{\partial^2 u}{\partial x_1^2} + \cdots + \frac{\partial^2 u}{\partial x_3^2}\Big) = 0, \quad u(0, x_1, x_2, x_3) = g(x_1, x_2, x_3), \quad \frac{\partial u}{\partial t}(0, x_1, x_2, x_3) = 0   (14)

is simply

u(t, x_1, x_2, x_3) = \frac{\partial}{\partial t}\Big[\frac{t}{4\pi(ct)^2} \int\!\!\int_{|y-x|=ct} g(y)\,dS_y\Big].   (15)
This can be seen by using the Fourier transform:

\frac{\partial^2 \hat{u}}{\partial t^2} + c^2(\omega_1^2 + \omega_2^2 + \omega_3^2)\hat{u} = 0, \quad \hat{u}(0, \omega_1, \omega_2, \omega_3) = \hat{g}(\omega_1, \omega_2, \omega_3), \quad \frac{\partial \hat{u}}{\partial t}(0, \omega_1, \omega_2, \omega_3) = 0,   (16)

which has the solution

\hat{u}(t, \omega_1, \omega_2, \omega_3) = \hat{g}\,\cos\big((\omega_1^2 + \omega_2^2 + \omega_3^2)^{1/2}\,ct\big).   (17)

Luckily we do not need to do any hard calculation to invert it, since

\hat{u}(t, \omega_1, \omega_2, \omega_3) = \frac{\partial}{\partial t}\Big[\hat{g}\,\frac{\sin\big((\omega_1^2 + \omega_2^2 + \omega_3^2)^{1/2}\,ct\big)}{c\,(\omega_1^2 + \omega_2^2 + \omega_3^2)^{1/2}}\Big],

and thus (similar to the process from (11) to (13))

u(t, x_1, x_2, x_3) = \frac{\partial}{\partial t}\Big[\hat{g}\,\frac{\sin\big((\omega_1^2 + \omega_2^2 + \omega_3^2)^{1/2}\,ct\big)}{c\,(\omega_1^2 + \omega_2^2 + \omega_3^2)^{1/2}}\Big]^{\vee} = \frac{\partial}{\partial t}\Big[\frac{t}{4\pi(ct)^2} \int\!\!\int_{|y-x|=ct} g(y)\,dS_y\Big].
A solution to the full initial value problem

\frac{\partial^2 u}{\partial t^2} - c^2\Big(\frac{\partial^2 u}{\partial x_1^2} + \cdots + \frac{\partial^2 u}{\partial x_3^2}\Big) = 0, \quad u(0, x_1, x_2, x_3) = g(x_1, x_2, x_3), \quad \frac{\partial u}{\partial t}(0, x_1, x_2, x_3) = h(x_1, x_2, x_3)   (18)

is the sum of the previous two solutions, which is called the Poisson formula:

u(t, x_1, x_2, x_3) = \frac{t}{4\pi(ct)^2} \int\!\!\int_{|y-x|=ct} h(y)\,dS_y + \frac{\partial}{\partial t}\Big[\frac{t}{4\pi(ct)^2} \int\!\!\int_{|y-x|=ct} g(y)\,dS_y\Big].   (19)
6.3. (Continued)

For the inhomogeneous problem

\frac{\partial^2 u}{\partial t^2} - c^2\Big(\frac{\partial^2 u}{\partial x_1^2} + \frac{\partial^2 u}{\partial x_2^2} + \frac{\partial^2 u}{\partial x_3^2}\Big) = f(t, x_1, x_2, x_3),   (1)

u(0, \vec{x}) = 0,   (2)

\frac{\partial u}{\partial t}(0, \vec{x}) = 0,   (3)

a solution is given by Duhamel's principle (Fritz John, PDE, p. 135):

u(t, \vec{x}) = \frac{1}{4\pi c^2} \int_0^t \frac{ds}{t - s} \int\!\!\int_{|y-x|=c(t-s)} f(s, \vec{y})\,dS_y.   (4)
Duhamel's principle: Fix a time t > 0. Replace the force f(s, \vec{x}), s \in [0, t], by acquired velocities at the times

0 = s_1 < s_2 < s_3 < \cdots < s_n < s_{n+1} = t,

and consider w_i(s, \vec{x}):

\frac{\partial^2 w_i}{\partial s^2} - c^2\Big(\frac{\partial^2 w_i}{\partial x_1^2} + \frac{\partial^2 w_i}{\partial x_2^2} + \frac{\partial^2 w_i}{\partial x_3^2}\Big) = 0, \quad s > s_i,   (5)

w_i(s_i, \vec{x}) = 0,   (6)

\frac{\partial w_i}{\partial s}(s_i, \vec{x}) = f(s_i, \vec{x})(s_{i+1} - s_i).   (7)

The solution w_i(s, \vec{x}), which we take to be zero for s < s_i, is the part of the displacement u(t, \vec{x}) that results from the pulse force f(s, \vec{x}) during the time interval [s_i, s_{i+1}], which is equivalent to a velocity f(s_i, \vec{x})(s_{i+1} - s_i). The final total displacement u(t, \vec{x}) is, by superposition,

u(t, \vec{x}) = \sum_{i=1}^{n} w_i(t, \vec{x}).   (8)

Letting n → ∞ with all s_{i+1} - s_i → 0, the approximation becomes exact. We can solve (5)-(7) just as before (Poisson formula):

w_i(s, \vec{x}) = \frac{1}{4\pi c^2 (s - s_i)} \int\!\!\int_{|\vec{y}-\vec{x}|=c(s-s_i)} (s_{i+1} - s_i)\, f(s_i, \vec{y})\,dS_y, \quad s > s_i.
Details are in John, PDE, p.135.
Applications: Maxwell's equations of electromagnetism ( ~E, ~B) in vacuum yield

∂²~E/∂t² − c²(∂²~E/∂x1² + ∂²~E/∂x2² + ∂²~E/∂x3²) = 0,
∂²~B/∂t² − c²(∂²~B/∂x1² + ∂²~B/∂x2² + ∂²~B/∂x3²) = 0,

where c = (ε0 μ0)^{−1/2} is the speed of light in vacuum. Light travels more slowly in a material medium such as air or glass.
6.4. Hadamard’s method of descent.
In IR², the wave equation

∂²u/∂t² − c²(∂²u/∂x1² + ∂²u/∂x2²) = 0,
u(0, x1, x2) = g(x1, x2),
∂u/∂t(0, x1, x2) = h(x1, x2),

can be regarded as a problem in IR³ in which u(t, x1, x2, x3) is independent of the third variable x3. In this way, we find that the spherical integrals over the sphere

|~y − ~x| = ((y1 − x1)² + (y2 − x2)² + y3²)^{1/2} = ct

can be changed into top and bottom integrals over the disk

(y1 − x1)² + (y2 − x2)² < (ct)².

Thus

u(t, x1, x2) = (1 / (2πc)) ∫∫_{r<ct} h(y1, y2) / (c²t² − r²)^{1/2} dy1 dy2
             + ∂/∂t [ (1 / (2πc)) ∫∫_{r<ct} g(y1, y2) / (c²t² − r²)^{1/2} dy1 dy2 ],

where r = ((x1 − y1)² + (x2 − y2)²)^{1/2}.
Figure 6.4.1. Integrals over the sphere become integrals over the disk; the sphere |~y − ~x| = ct projects onto the disk via y3 = ±((ct)² − r²)^{1/2}.
Nonlinear wave equations
Large amplitude:

u_tt − c² ( u_x / (1 − u_x²)^{1/2} )_x = 0.

In liquid crystals:

u_tt − c(u)(c(u) u_x)_x = 0,

where c(u) = (α cos²u + β sin²u)^{1/2}, with α > 0, β > 0 physical elastic constants.
These nonlinear equations do not have solution formulas.
6.5. Heat equation in IRⁿ and IR1+.
Modeling heat conduction. (Keener, p. 380.)
We propose to study heat conduction in a material, for example, a gas or a metal. Let u(t, ~x) be the temperature. Then the total thermal energy in a region Ω is

∫_Ω ρ c u(t, ~x) d~x,

where ρ(t, ~x) is the mass density and c is the heat capacity (energy per unit mass). Let ~q be the heat flux (energy per unit area per unit time), and let f(t, ~x) be the heat production (energy per unit volume per unit time). Then "conservation of energy" reads

d/dt ∫_Ω ρcu(t, ~x) d~x = ∫_Ω f(t, ~x) d~x − ∫_{∂Ω} ~q · ~n dS,

where ~n is the unit outward normal to the boundary ∂Ω. Physical experiments show that Fourier's law of heat conduction,

~q = −k∇u,

holds for many common materials. By the Gauss divergence theorem (see Chapter 1.6), we thus have

∫_Ω [ ∂/∂t(ρcu) − f ] d~x = k ∫_{∂Ω} ∇u · ~n dS = k ∫_Ω div(∇u) d~x.

Or

∫_Ω [ ∂/∂t(ρcu) − k div(∇u) − f ] d~x = 0

for all Ω. Thus

∂/∂t(ρcu) = k div(∇u) + f.

Assume ρ and c are constants, and let D = k/(ρc) (the diffusion coefficient). Then

∂u/∂t = D∆u + f/(ρc),

where

∆ = div ∇ = Σ_{i=1}^{n} ∂i²

is called the Laplacian. We can use the scaled time τ = Dt, so that the scaled equation is

∂u/∂τ = ∆u + f/(ρcD).
Solution. Consider

∂u/∂t = ∆u,   u(0, ~x) = g(~x).

Use the Fourier transform in IRⁿ:

û(t, ~ω) = (2π)^{−n/2} ∫_{IRⁿ} u(t, ~x) e^{iω·~x} d~x.

Then

∂û/∂t = −(ω1² + · · · + ωn²) û,   û(0, ~ω) = ĝ.

Thus

û = ĝ e^{−(ω1²+···+ωn²)t} = ĝ [ (2t)^{−n/2} e^{−(x1²+···+xn²)/(4t)} ]^∧.

By the inversion theorem,

u = (û)^∨ = (2π)^{−n/2} g ∗ (2t)^{−n/2} e^{−(x1²+···+xn²)/(4t)}
  = (4πt)^{−n/2} ∫_{−∞}^{∞} · · · ∫_{−∞}^{∞} g(y1, · · · , yn) e^{−((x1−y1)²+···+(xn−yn)²)/(4t)} dy1 dy2 · · · dyn.   (1)
Theorem 6.5.1. A solution to

∂u/∂t = k∆u,   u(0, x) = g(x),   x ∈ IRⁿ

is

u(t, x) = (4πkt)^{−n/2} ∫_{IRⁿ} g(y) e^{−|x−y|²/(4kt)} dy.

Theorem 6.5.2. A solution to

∂u/∂t = k∆u + f(t, x),   u(0, ~x) = 0

is

u(t, x) = ∫₀^t (4πk(t − τ))^{−n/2} ∫_{IRⁿ} f(τ, y) e^{−|x−y|²/(4k(t−τ))} dy dτ

(Duhamel's principle).
The heat equation in IR1+ is a homework problem.
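Two properties of the kernel in Theorem 6.5.1 are easy to verify numerically (a NumPy sketch, illustrative only): it integrates to 1 (heat is conserved), and it satisfies the heat equation itself. Here is a one-dimensional check:

```python
import numpy as np

k, t = 0.7, 0.3

def K(tt, x):
    # Heat kernel of Theorem 6.5.1 in one dimension (n = 1)
    return (4 * np.pi * k * tt) ** -0.5 * np.exp(-x**2 / (4 * k * tt))

# Total heat is conserved: the kernel integrates to 1.
x = np.linspace(-20.0, 20.0, 400001)
mass = np.sum(K(t, x)) * (x[1] - x[0])
assert abs(mass - 1.0) < 1e-6

# The kernel solves u_t = k u_xx; check the residual by finite differences.
h = 1e-4
ut = (K(t + h, 0.5) - K(t - h, 0.5)) / (2 * h)
uxx = (K(t, 0.5 + h) - 2 * K(t, 0.5) + K(t, 0.5 - h)) / h**2
assert abs(ut - k * uxx) < 1e-5
```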
Notes. 1. The multi-dimensional Fourier transform is equivalent to the one-dimensional Fourier transform applied repeatedly. For example, the two-dimensional Fourier transform is

û(ω1, ω2) = (1/2π) ∫∫ u(x1, x2) e^{i x1 ω1 + i x2 ω2} dx1 dx2
         = (2π)^{−1/2} ∫ [ (2π)^{−1/2} ∫ u(x1, x2) e^{i ω2 x2} dx2 ] e^{i ω1 x1} dx1
         = (2π)^{−1/2} ∫ û(x1, ω2) e^{i ω1 x1} dx1 = (û(x1, ω2))^∧(ω1, ω2).   (2)
2. Transform formula: It follows easily from the one-dimensional one:

(e^{−β(x1²+x2²+···+xn²)})^∧ = (e^{−βx1²})^∧ (e^{−βx2²})^∧ · · · (e^{−βxn²})^∧
  = (2β)^{−1/2} e^{−ω1²/(4β)} · (2β)^{−1/2} e^{−ω2²/(4β)} · · · (2β)^{−1/2} e^{−ωn²/(4β)}
  = (2β)^{−n/2} e^{−(ω1²+ω2²+···+ωn²)/(4β)}.

3. Inverse transform: Definition:

u^∨ = (2π)^{−n/2} ∫_{IRⁿ} u(x) e^{−iω·x} dx.

4. Inversion Theorem: It follows easily from the one-dimensional one:

(u^∧)^∨ = u.

5. Convolution formula: Definition:

f ∗ g(~x) = ∫∫ · · · ∫ f(x1 − y1, x2 − y2, · · · , xn − yn) g(y1, · · · , yn) dy1 · · · dyn.

Formula: It follows easily from the one-dimensional one:

(f ∗ g)^∧(~ω) = (2π)^{n/2} f̂ ĝ.

Inverse: It follows easily from the one-dimensional one:

f ∗ g = (2π)^{n/2} (f̂ ĝ)^∨.
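The one-dimensional Gaussian transform formula in Note 2 can be confirmed by direct quadrature in the convention of these notes (a NumPy sketch; the sample values of β and ω are ours):

```python
import numpy as np

beta, omega = 1.5, 2.0
x = np.linspace(-15.0, 15.0, 300001)
dx = x[1] - x[0]
# Forward transform in the convention of these notes:
#   u^(omega) = (2 pi)^(-1/2) * integral of u(x) e^{i omega x} dx
uhat = (2 * np.pi) ** -0.5 * np.sum(np.exp(-beta * x**2) * np.exp(1j * omega * x)) * dx
# Note 2 predicts (e^{-beta x^2})^ = (2 beta)^(-1/2) e^{-omega^2/(4 beta)}.
predicted = (2 * beta) ** -0.5 * np.exp(-omega**2 / (4 * beta))
assert abs(uhat - predicted) < 1e-6
```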
6.6. Laplace and Poisson equations in IR² and IR³.
Laplace equation:

∆u = ∂²u/∂x1² + ∂²u/∂x2² + · · · + ∂²u/∂xn² = 0 in IRⁿ.

It describes the temperature distribution in equilibrium (time-independent). Solutions include any linear function a1 x1 + a2 x2 + · · · + an xn, the quadratic functions x1² − x2² and x1 x2, and many higher-order polynomials. These solutions are called harmonic functions. But the only bounded solutions are the constants.
Poisson equation:

∆u = f(~x) in IRⁿ.

We assume that |f(~x)| → 0 as |x| → ∞, and we look for bounded solutions only. Try the Fourier transform:

û(~ω) = (2π)^{−n/2} ∫_{IRⁿ} u(x) e^{iω·x} dx.

Then

−|ω|² û = f̂,   û = −(1/|ω|²) f̂.

Inversion: In IR³ only (skipped because it needs distribution theory, see next semester):

((4π|x|)^{−1})^∧ = (2π)^{−3/2} (1/|ω|²).

The convolution formula yields

u = −(1/(4π|x|)) ∗ f,

i.e.,

u(x) = −(1/4π) ∫_{IR³} f(y)/|x − y| dy   in IR³ only.
Another method:
Motivation: We know that

div ~E = q,

i.e., the divergence of an electric field is the charge density. For an electric potential A such that

~E = ∇A,

there holds

∆A = q.

Consider the ideal case where q is a Dirac delta:

∆A = δ(x).

We can look for a spherically symmetric solution

A(~x) = A(|~x|).

Recall that in IRⁿ (Chapter 1, Section 1.16.2)

∆A(r) = ∂²A/∂r² + ((n − 1)/r) ∂A/∂r.

At r ≠ 0, we have

∂²A/∂r² + ((n − 1)/r) ∂A/∂r = 0.   (1)

In IR³, we can integrate the above equation in the form

(1/r²) ∂/∂r [ r² ∂A/∂r ] = 0.

Then

r² ∂A/∂r = C,   ∂A/∂r = C/r²,   A = C1 − C/r.

The constant C1 is arbitrary. But the constant C is determined by the strength of the δ(x):

∫_{|x|<1} ∆A dx = ∫ δ(x) dx = 1.

A calculation can be done to find (skipped in class, but see Keener p. 341, or later in these lecture notes):

C = 1/(4π).

Thus

A(r) = −1/(4πr) + C1.

The function

A(r) = −1/(4πr)

is called the fundamental solution to the Laplace equation in IR³.
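Both defining properties of A(r) = −1/(4πr) can be checked symbolically (a SymPy sketch, illustrative only): it is harmonic away from r = 0, and the flux of ∇A through any sphere equals 1, which is the normalization that fixed C:

```python
import sympy as sp

r = sp.symbols('r', positive=True)
A = -1 / (4 * sp.pi * r)                                      # candidate fundamental solution
# Radial Laplacian in IR^3: A'' + (2/r) A'
radial_laplacian = sp.diff(A, r, 2) + (2 / r) * sp.diff(A, r)
assert sp.simplify(radial_laplacian) == 0                     # harmonic for r != 0

# Flux of grad A through the sphere of radius r: 4 pi r^2 * A'(r),
# which must equal int delta(x) dx = 1.
flux = sp.simplify(4 * sp.pi * r**2 * sp.diff(A, r))
assert flux == 1
```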
The constant C. We find the constant C in A(r) here. We use

fε(r) = 1/((4/3)πε³) for r < ε,   fε(r) = 0 for r > ε

to approximate the δ(x). Then equation (1) becomes

(1/r²) ∂/∂r [ r² ∂A/∂r ] = fε(r).

We integrate and use the condition r² ∂A/∂r |_{r=0} = 0 to find

r² ∂A/∂r − 0 = ∫₀^r s² fε(s) ds = (1/((4/3)πε³)) (r³/3) for r < ε,   = 1/(4π) for r ≥ ε.

Letting ε → 0, we conclude that

r² ∂A/∂r = 1/(4π) for r ≥ 0,   ∂A/∂r = 1/(4πr²),   A = −1/(4πr) + C1.
6.6. (Continued)
We have found a solution to

∆u = f(x) in IR³

in the form

u(x) = −(1/4π) ∫∫∫_{IR³} f(y)/|x − y| dy

via the Fourier transform, but we were not able to invert the function 1/|ω|². We then started another approach: finding a radial solution A(x) = A(|x|) to

∆A = δ(x).

We found

A(x) = −1/(4π|x|) + C1.

We call

A(x) = −1/(4π|x|)

the fundamental solution to the Laplacian in IR³. We explain here why it is called the fundamental solution.
This is why: we can see by translation that a solution to

∆U = δ(x − x0)

is

U = −1/(4π|x − x0|).

Further, a solution to

∆W = f(x0) δ(x − x0)

is

W = −f(x0)/(4π|x − x0|).

Integrating the W equation in x0, we find

∆ ∫_{IRⁿ} W dx0 = ∫_{IRⁿ} f(x0) δ(x − x0) dx0 = f(x).   (1)

Thus

u(x) = −∫_{IRⁿ} f(x0)/(4π|x − x0|) dx0

is a solution to

∆u = f(x).

Thus the general solution to ∆u = f(x) is given by the convolution of f(x) with the fundamental solution.
We remark that the integration step (1) is generally known as superposition.
We remark that the integration step (1) is generally known as Superposition.
In IR²: We follow the same idea as in IR³. We consider a radial solution A(|x|) to

∆A = δ(x) in IR².   (2)

We know that

∆A = ∂²A/∂r² + (1/r) ∂A/∂r

in IR². Thus

∆A = (1/r) ∂/∂r [ r ∂A/∂r ].

At r ≠ 0, equation (2) is

(1/r) ∂/∂r [ r ∂A/∂r ] = 0.

So

r ∂A/∂r = C.

Or

A = C ln r + C1.

We need to find the constant C; the other constant C1 is not important and we set it to zero. To do so, we approximate δ(x) in IR² by

Fε(x) = 1/(πε²) for |x| < ε,   Fε(x) = 0 for |x| ≥ ε,

and consider a radial solution to

∆A(x) = Fε(x).
We find

(1/r) ∂/∂r [ r ∂A/∂r ] = Fε(r),

or

∂/∂r [ r ∂A/∂r ] = r Fε(r).   (3)

Since A(|x|) is radial, we expect and assume ∂A/∂r = 0 at r = 0. Then we integrate (3) in r over [0, r] to find

r ∂A/∂r = ∫₀^r s Fε(s) ds = (1/(πε²))(r²/2) for r < ε,   = 1/(2π) for r ≥ ε.

Now let ε → 0; we find

r ∂A/∂r = 1/(2π) for r ≥ 0.

Thus

A = (1/2π) ln r + C1.

As usual, we set C1 = 0. By superposition we find that a solution to

∆u = f

in IR² is

u(x) = A(x) ∗ f(x) = (1/2π) ∫_{IR²} ln|x − x0| f(x0) dx0.
Application: Biot-Savart law (Keener, p. 342)
Consider an incompressible fluid with velocity ~u:

div ~u = 0,

and vorticity

~ω = curl ~u,

in IR³. Introduce a vector potential ~A (whose existence follows from div ~u = 0) such that

~u = curl ~A

and

div ~A = 0.

Then

~ω = curl (curl ~A) = ∇(div ~A) − ∆~A = −∆~A.

Using the solution formula for the Poisson equation, we have

~A = (1/4π) ∫ ~ω(y)/|x − y| dy.

Thus

~u(x) = curl ~A = (1/4π) ∫ curl_x ( ~ω(y)/|x − y| ) dy
      = (1/4π) ∫ ∇_x(1/|x − y|) × ~ω(y) dy
      = (1/4π) ∫ ~ω(~y) × (~x − ~y)/|~x − ~y|³ d~y.

This is the Biot-Savart law in IR³, which relates the vorticity ~ω to the velocity ~u.
This is the Biot-Savart Law in IR3, which relates vorticity ~ω to velocity ~u.
Biot-Savart law in IR².
Consider an incompressible fluid in IR² with velocity ~u:

div ~u = 0.

There exists a stream function (a scalar function) ψ such that

~u = (u1, u2) = (∂x2 ψ, −∂x1 ψ).

The vorticity of ~u is then a scalar ω, which is the third component of the physical vorticity ~ω. We have

ω = curl ~u = ∂x1 u2 − ∂x2 u1 = −∂x1 ∂x1 ψ − ∂x2 ∂x2 ψ = −∆ψ.

Thus

ψ = −(1/2π) ∫_{IR²} ln|x − y| ω(y) dy.

Hence

~u = −(1/2π) ∫_{IR²} (x2 − y2, y1 − x1)/|~x − ~y|² ω(~y) d~y
   = −(1/2π) ∫_{IR²} (y2, −y1)/|~y|² ω(~x − ~y) d~y
   = (1/2π) ∫_{IR²} ~y⊥/|~y|² ω(~x − ~y) d~y,

where ~y⊥ = (−y2, y1). See Figure 6.6.1 for ~y⊥, which is ~y rotated 90° counterclockwise.
Figure 6.6.1. The relation between ~y⊥ = (−y2, y1) and ~y = (y1, y2).
6.7 Concept of Fundamental Solutions.
1. The solution to

∆u = δ(x) in IRⁿ

is called the fundamental solution to the Laplace equation. This δ(x) may be regarded as a point charge.
2. The solution to

∂u/∂t − ∆u = 0,   u(0, x) = δ(x),

which is

F(t, x) = (4πt)^{−n/2} e^{−|x|²/(4t)},

is called the fundamental solution to the heat equation. This δ(x) may be regarded as a point source of heat.
B. PDE on rectangular domains, separation of variables.
6.8. Laplace equation in a rectangle, Fourier series.
We want to solve the Dirichlet boundary value problem for the Laplace equation in a rectangle:

∂²u/∂x² + ∂²u/∂y² = 0,   0 < x < L, 0 < y < H,
u(0, y) = g1(y),
u(L, y) = g2(y),
u(x, 0) = g3(x),
u(x, H) = g4(x).
(1)

By superposition, we can split problem (1) into four similar problems, each of which satisfies one of the four boundary functions and the zero boundary condition on the other three sides. For example, let us find u1:

∂²u1/∂x² + ∂²u1/∂y² = 0,
u1(0, y) = g1(y),
u1(L, y) = 0,
u1(x, 0) = 0,
u1(x, H) = 0.
(2)
We use the method of separation of variables. Let u1(x, y) = X(x)Y(y). Then the Laplace equation can be written as

X''(x)/X(x) = −Y''(y)/Y(y).   (3)

Since the two sides of (3) are functions of different variables, we conclude that they must equal a common constant, which we call λ. Thus (3) becomes

X'' = λX,   (4)
Y'' = −λY.   (5)

We let Y(y) satisfy the boundary conditions

Y(0) = Y(H) = 0.   (6)

Then the Y equation (5) with the boundary conditions (6) has the solutions

Y = c sin(nπy/H)   (7)

for

λ = (nπ/H)²,   (8)

where n = 1, 2, · · · . For the λ in (8), the general solution to (4) is

X(x) = E e^{nπx/H} + F e^{−nπx/H},

or equivalently

X(x) = a1 cosh[(nπ/H)(x − L)] + a2 sinh[(nπ/H)(x − L)].

The shift in x by L is chosen so that the boundary condition at x = L is satisfied conveniently. We impose X(L) = 0, which implies a1 = 0. Thus

X(x) = a2 sinh[(nπ/H)(x − L)].

In summary, solutions u of the product form X(x)Y(y) satisfying the zero condition on the three corresponding sides of (2) are

u(x, y) = a2 sinh[(nπ/H)(x − L)] sin(nπy/H)

for any constant a2 and all n = 1, 2, 3, · · · . By superposition,

u(x, y) = Σ_{n=1}^{∞} an sinh[(nπ/H)(x − L)] sin(nπy/H)   (9)

is also a solution to the Laplace equation with zero value on the three zero-value sides of the rectangle in (2), for any real numbers an. We want to choose an such that u in (9) satisfies the fourth, nonzero boundary condition:

u(0, y) = Σ_{n=1}^{∞} an sinh[−nπL/H] sin(nπy/H) = g1(y).
It is known that any smooth function g1(y) with g1(0) = g1(H) = 0 can be expressed as a Fourier sine series

g1(y) = Σ_{n=1}^{∞} An sin(nπy/H),   0 ≤ y ≤ H,   (10)

where

An = (2/H) ∫₀^H g1(y) sin(nπy/H) dy.   (11)

Thus, we can take

an = An / sinh[−nπL/H]

so that (9) satisfies all four boundary conditions. Hence we find a solution to (2):

u(x, y) = Σ_{n=1}^{∞} ( (2/(H sinh[−nπL/H])) ∫₀^H g1(η) sin(nπη/H) dη ) sinh[(nπ/H)(x − L)] sin(nπy/H).   (12)
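The construction above can be sketched numerically (NumPy; the helper names and the sample data g1(y) = sin(πy/H) are ours). For that g1, only the n = 1 coefficient survives, so the truncated series (12) should reproduce the boundary data at x = 0 and vanish at x = L:

```python
import numpy as np

L, H = 2.0, 1.0

def fourier_sine_coeff(g, n, m=20000):
    # Midpoint-rule approximation of (11)
    y = (np.arange(m) + 0.5) * H / m
    return (2.0 / H) * np.sum(g(y) * np.sin(n * np.pi * y / H)) * (H / m)

def u(x, y, g1, N=30):
    """Truncated series (12): solution with data g1 on the side x = 0."""
    total = 0.0
    for n in range(1, N + 1):
        An = fourier_sine_coeff(g1, n)
        an = An / np.sinh(-n * np.pi * L / H)
        total += an * np.sinh(n * np.pi / H * (x - L)) * np.sin(n * np.pi * y / H)
    return total

g1 = lambda y: np.sin(np.pi * y / H)
assert abs(u(0.0, H / 2, g1) - 1.0) < 1e-6   # matches the boundary data at x = 0
assert abs(u(L, 0.3, g1)) < 1e-12            # zero on the opposite side x = L
```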
Notes.
1. The cosh and sinh functions are cosh x = (eˣ + e⁻ˣ)/2 and sinh x = (eˣ − e⁻ˣ)/2.
2. Problem (5)-(6) is called an eigenvalue problem, where we need to find both a number λ and a nonzero solution Y(y).
3. The Fourier sine series (10)-(11) is a special case of the general Fourier series

f(x) = a0 + Σ_{n=1}^{∞} an cos(nπx/L) + Σ_{n=1}^{∞} bn sin(nπx/L),   −L ≤ x ≤ L,

where

a0 = (1/(2L)) ∫_{−L}^{L} f(x) dx,
an = (1/L) ∫_{−L}^{L} f(x) cos(nπx/L) dx,
bn = (1/L) ∫_{−L}^{L} f(x) sin(nπx/L) dx.

(For a proof, see the Dirichlet-Jordan Convergence Theorem, p. 164, Sect. 4.5.1, Keener.)
4. Other boundary conditions, such as the Neumann boundary condition, can be handled similarly (see homework).
5. The nonhomogeneous equation (Poisson equation) can be solved as well (see next lecture).
6. The Laplace equation in a disk can be solved by separation of variables, in addition to the complex-variables method.
Excerpt: Story of Fourier
Joseph Fourier’s father was a tailor in Auxerre. After the death of his first wife,
with whom he had three children, he remarried and Joseph was the ninth of the
twelve children of this second marriage. Joseph’s mother died when he was nine
years old and his father died the following year.
It was during his time in Grenoble that Fourier did his important mathematical
work on the theory of heat. His work on the topic began around 1804 and by 1807 he
had completed his important memoir On the Propagation of Heat in Solid Bodies.
The memoir was read to the Paris Institute on 21 December 1807 and a committee
consisting of Lagrange, Laplace, and others was set up to report on the work. Now
this memoir is very highly regarded but at the time it caused controversy.
There were two reasons for the committee to feel unhappy with the work. The
first objection, made by Lagrange and Laplace in 1808, was to Fourier’s expansions
of functions as trigonometrical series, what we now call Fourier series. Further
clarification by Fourier still failed to convince them.
The second objection was made by Biot against Fourier’s derivation of the equa-
tions of transfer of heat. Fourier had not made reference to Biot’s 1804 paper on
this topic but Biot’s paper is certainly incorrect. Laplace, and later Poisson, had
similar objections.
The Institute set as a prize competition subject the propagation of heat in solid
bodies for the 1811 mathematics prize. Fourier submitted his 1807 memoir together
with additional work on the cooling of infinite solids and terrestrial and radiant heat.
Only one other entry was received, and the committee set up to decide on the award of the prize (Lagrange, Laplace, Malus, Haüy, and Legendre) awarded Fourier the prize. The report was not, however, completely favourable and states:
... the manner in which the author arrives at these equations is not exempt of
difficulties and that his analysis to integrate them still leaves something to be desired
on the score of generality and even rigour.
With this rather mixed report there was no move in Paris to publish Fourier’s
work. His work was later published in 1822.
6.9. Poisson equation in a rectangle.
We consider

∂²u/∂x² + ∂²u/∂y² = f(x, y),   0 < x < L, 0 < y < H,   (1)
u(0, y) = u(L, y) = 0,   0 < y < H,   (2)
u(x, 0) = u(x, H) = 0,   0 < x < L.   (3)

We propose the eigenvalue problem

∂²u/∂x² + ∂²u/∂y² = −λu,   0 < x < L, 0 < y < H,
u(0, y) = u(L, y) = 0,   0 < y < H,
u(x, 0) = u(x, H) = 0,   0 < x < L.
(4)

Using separation of variables, we find that

u(x, y) = sin(nπx/L) sin(mπy/H),   λ = (nπ/L)² + (mπ/H)²,   (5)

where n = 1, 2, · · · and m = 1, 2, · · ·, are solutions to (4). Using the Fourier sine series theorem in the x-variable first and then in the y-variable, we can expand f(x, y) as

f(x, y) = Σ_{n=1}^{∞} Σ_{m=1}^{∞} Bnm sin(nπx/L) sin(mπy/H),   (6)

where

Bnm = (4/(LH)) ∫₀^L ∫₀^H f(x, y) sin(nπx/L) sin(mπy/H) dx dy.   (7)

Thus a solution to (1)-(3) is

u(x, y) = −Σ_{n=1}^{∞} Σ_{m=1}^{∞} ( Bnm / ((nπ/L)² + (mπ/H)²) ) sin(nπx/L) sin(mπy/H),   (8)

where the Bnm are given in (7). We can verify that (8) is indeed a solution:

∆u = −Σ_{n=1}^{∞} Σ_{m=1}^{∞} ( Bnm / ((nπ/L)² + (mπ/H)²) ) ∆( sin(nπx/L) sin(mπy/H) )
   = Σ_{n=1}^{∞} Σ_{m=1}^{∞} Bnm sin(nπx/L) sin(mπy/H)
   = f(x, y).
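The verification can also be done numerically (a NumPy sketch, illustrative only): for f = sin(πx/L) sin(πy/H), the series (8) has the single term −f/λ11, and a five-point finite-difference Laplacian of that term should reproduce f:

```python
import numpy as np

L, H = 1.0, 1.0
lam = (np.pi / L) ** 2 + (np.pi / H) ** 2   # lambda_{11} from (5)

def u(x, y):
    # For f = sin(pi x/L) sin(pi y/H), B_11 = 1 and all other B_nm vanish,
    # so the series (8) reduces to a single term.
    return -np.sin(np.pi * x / L) * np.sin(np.pi * y / H) / lam

x0, y0, h = 0.37, 0.61, 1e-4
lap = (u(x0 + h, y0) + u(x0 - h, y0) + u(x0, y0 + h) + u(x0, y0 - h) - 4 * u(x0, y0)) / h**2
f = np.sin(np.pi * x0 / L) * np.sin(np.pi * y0 / H)
assert abs(lap - f) < 1e-4
```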
Note: Let Ω be a general domain in IRⁿ. Consider

−∆u = λu in Ω,   u = 0 on ∂Ω.   (9)

In general there is no explicit formula for u or λ. But a general theorem tells us that there exist λ1 < λ2 < · · · and corresponding u1, u2, · · · such that (λj, uj) are solutions to the eigenvalue problem (9), and any f(~x) can be written as

f(~x) = Σ_{n=1}^{∞} Bn un.   (10)

Thus a solution to

∆u = f(x)

is

u = −Σ_{n=1}^{∞} (Bn/λn) un.   (11)

So the study of the eigenvalue problem is very useful. The expansion (10) is called the eigenfunction expansion. The equation

−∆u = λu   (12)

is called the Helmholtz equation.
The eigenfunctions un are orthogonal in the sense that

∫_Ω un(x) um(x) dx = 0 if n ≠ m.   (13)

This is why we can determine the coefficients Bn quickly:

Bn = ∫_Ω f(x) un(x) dx / ∫_Ω (un(x))² dx   (14)

for all n = 1, 2, · · · .
6.10. Heat equation in a rectangle.
6.10.1. Initial Value Problem
Consider

∂u/∂t = k ∂²u/∂x²,   0 < x < L,
u(t, 0) = u(t, L) = 0,
u(0, x) = g(x).
(1)

We try separation of variables:

u(t, x) = φ(x)G(t).   (2)

Then

G'(t)/(kG(t)) = φ''(x)/φ(x).

Thus we set both sides equal to a common constant −λ:

G'(t) = −λkG(t),   (3)
φ''(x) = −λφ(x).   (4)

For φ satisfying

φ(0) = φ(L) = 0,   (5)

we find the solutions to the eigenvalue problem (4)-(5):

φ(x) = c sin(nπx/L),   λ = (nπ/L)²,   n = 1, 2, · · · .   (6)

So we have solutions

u(t, x) = Σ_{n=1}^{∞} Cn e^{−k(nπ/L)²t} sin(nπx/L).   (7)

We need (7) to satisfy the initial condition

u(0, x) = Σ_{n=1}^{∞} Cn sin(nπx/L) = g(x).

By the Fourier sine series, we need only take

Cn = (2/L) ∫₀^L g(x) sin(nπx/L) dx.

So a complete solution to (1) is

u(t, x) = Σ_{n=1}^{∞} ( (2/L) ∫₀^L g(x) sin(nπx/L) dx ) e^{−k(nπ/L)²t} sin(nπx/L).
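The series solution is easy to sketch numerically (NumPy; helper names ours). For g(x) = sin(πx/L), only the n = 1 mode survives, so the series should equal exp(−k(π/L)²t) sin(πx/L):

```python
import numpy as np

k, L = 0.5, 1.0

def heat_solution(t, x, g, N=50, m=20000):
    # Truncated series solution; Cn computed by the midpoint rule.
    xs = (np.arange(m) + 0.5) * L / m
    total = 0.0
    for n in range(1, N + 1):
        Cn = (2.0 / L) * np.sum(g(xs) * np.sin(n * np.pi * xs / L)) * (L / m)
        total += Cn * np.exp(-k * (n * np.pi / L) ** 2 * t) * np.sin(n * np.pi * x / L)
    return total

g = lambda x: np.sin(np.pi * x / L)
t, x = 0.2, 0.3
exact = np.exp(-k * (np.pi / L) ** 2 * t) * np.sin(np.pi * x / L)
assert abs(heat_solution(t, x, g) - exact) < 1e-6
```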
6.10.2. Inhomogeneous Problem
Consider

∂u/∂t = k ∂²u/∂x² + Q(t, x),   0 < x < L,
u(t, 0) = u(t, L) = 0,
u(0, x) = 0.
(8)

Let us use the eigenfunctions and propose a solution in the form

u(t, x) = Σ_{n=1}^{∞} Cn(t) sin(nπx/L),   Cn(0) = 0.

We expand Q(t, x) as

Q(t, x) = Σ_{n=1}^{∞} qn(t) sin(nπx/L).

We note that Q(t, x) may not vanish at the boundaries, but the expansion is still valid in the L²(0, L) sense. Then (8) can be projected onto the component sin(nπx/L):

Cn'(t) = −k(nπ/L)² Cn + qn(t).

We find

Cn(t) = e^{−k(nπ/L)²t} ∫₀^t e^{k(nπ/L)²s} qn(s) ds.

Thus a solution to (8) is

u(t, x) = Σ_{n=1}^{∞} e^{−k(nπ/L)²t} ( ∫₀^t e^{k(nπ/L)²s} qn(s) ds ) sin(nπx/L),

where

qn(t) = (2/L) ∫₀^L Q(t, x) sin(nπx/L) dx,   n = 1, 2, · · · .
6.10.3. Boundary Value Problem
Consider

∂u/∂t = k ∂²u/∂x²,   0 < x < L,
u(0, x) = 0,
u(t, 0) = a(t),
u(t, L) = b(t).
(9)

We use the new variable

V = u − [a(t) + (x/L)(b(t) − a(t))].

Then

∂V/∂t = k ∂²V/∂x² − [a'(t) + (x/L)(b'(t) − a'(t))],   0 < x < L,
V(0, x) = −[a(0) + (x/L)(b(0) − a(0))],
V(t, 0) = 0,
V(t, L) = 0.

This can be solved by the previous two steps.
We shall solve the heat equation in a two-dimensional rectangular domain next time.
6.11. Wave equation in a rectangle.
6.11.1 Vibrating string with fixed ends

PDE: ∂²u/∂t² = c² ∂²u/∂x²,   0 < x < L,   (1)
Boundary condition: u(t, 0) = u(t, L) = 0,   (2)
Initial conditions: u(0, x) = g(x),   (3)
                    ut(0, x) = h(x).   (4)

We propose to study the associated eigenvalue problem

∂²u/∂x² = −λu,   (5)
u(t, 0) = u(t, L) = 0.   (6)

The solutions to (5)-(6) are

u = φn(x) := sin(nπx/L),   λ = λn := (nπ/L)²,   n = 1, 2, · · · .

Now use the eigenfunction expansion:

u(t, x) = Σ_{n=1}^{∞} Cn(t) φn(x),
g(x) = Σ_{n=1}^{∞} gn φn(x),
h(x) = Σ_{n=1}^{∞} hn φn(x).

Use equation (1) to obtain

Σ_{n=1}^{∞} (Cn''(t) + λn c² Cn) φn = 0.

Thus

Cn''(t) + λn c² Cn = 0,   Cn(0) = gn,   Cn'(0) = hn.

The general solution formula for Cn is

Cn = an cos(nπct/L) + bn sin(nπct/L),

so that

Cn'(t) = −an (nπc/L) sin(nπct/L) + bn (nπc/L) cos(nπct/L).

Using the initial conditions, we find

an = gn,   bn = hn L/(nπc).

Thus

Cn(t) = gn cos(nπct/L) + hn (L/(nπc)) sin(nπct/L).

Hence the general solution to (1)-(4) is

u(t, x) = Σ_{n=1}^{∞} [ gn cos(nπct/L) + hn (L/(nπc)) sin(nπct/L) ] sin(nπx/L),

where

gn = (2/L) ∫₀^L g(x) sin(nπx/L) dx,
hn = (2/L) ∫₀^L h(x) sin(nπx/L) dx.
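Each term of this series satisfies both the wave equation and the fixed-end conditions, which can be checked symbolically (a SymPy sketch; the sample coefficients gn = 1, hn = 1/3 and the mode n = 2 are ours):

```python
import sympy as sp

t, x, c, L = sp.symbols('t x c L', positive=True)
n = 2
gn, hn = sp.Integer(1), sp.Rational(1, 3)   # illustrative sample coefficients
un = (gn * sp.cos(n * sp.pi * c * t / L)
      + hn * L / (n * sp.pi * c) * sp.sin(n * sp.pi * c * t / L)) * sp.sin(n * sp.pi * x / L)

# The term satisfies the wave equation (1) ...
wave_residual = sp.simplify(sp.diff(un, t, 2) - c**2 * sp.diff(un, x, 2))
assert wave_residual == 0
# ... and the fixed-end boundary conditions (2).
assert un.subs(x, 0) == 0 and sp.simplify(un.subs(x, L)) == 0
```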
Remark: Traditionally, one uses separation of variables u = G(t)φ(x) in (1)-(4) and ends up with the same eigenvalue problem (5)-(6).

Figure 6.11.1. Normal mode at n = 3: sin(3πx/L) on [0, L], with nodes at x = L/3 and 2L/3.

Property of solutions:
Let us look at the so-called normal modes of vibration:

[ gn cos(nπct/L) + hn (L/(nπc)) sin(nπct/L) ] sin(nπx/L) = An sin( nπct/L + nπαn/L ) sin(nπx/L),

where

An = ( gn² + (hn L/(nπc))² )^{1/2}

and αn is a phase angle. We see that An is the amplitude, the temporal frequency is nπc/L, and the spatial frequency is nπ/L. See Figure 6.11.1.
6.11.2 Vibrating rectangular membrane.
Consider

∂²u/∂t² = c²∆u,   0 < x < L, 0 < y < H,
u = 0 on the boundary,
u(0, x, y) = α(x, y),
ut(0, x, y) = β(x, y).

Eigenvalue problem:

∆u + λu = 0,   u = 0 on the boundary.

We know the eigenfunctions and eigenvalues (from Section 6.9, Poisson equation in a rectangle, Lecture 8 of Chapter VI):

u = φnm(x, y) := sin(nπx/L) sin(mπy/H),
λ = λnm := (nπ/L)² + (mπ/H)².

Solutions are

u(t, x, y) = Σ_{m=1}^{∞} Σ_{n=1}^{∞} [ Anm cos(ct√λnm) + Bnm sin(ct√λnm) ] φnm,

where

Anm = (4/(LH)) ∫₀^L ∫₀^H α(x, y) φnm(x, y) dx dy,
c√λnm Bnm = (4/(LH)) ∫₀^L ∫₀^H β(x, y) φnm(x, y) dx dy.

If you have difficulty with the two-dimensional eigenvalue problem, you can either come to see me for more explanation or use the book "Elementary Applied PDE" by Haberman, or "Advanced Engineering Mathematics" by Erwin Kreyszig.
6.12. Eigenvalue problem.
6.12.1 Motivation.
We have seen in the previous sections that eigenvalue problems are useful, and so far they all have explicit solutions. Let us now look at the heat conduction equation again, but with more complexity:

ρc ∂u/∂t = div(k∇u).   (1)

Suppose the density ρ is now a function of x: ρ = ρ(x), and likewise c = c(x), k = k(x). Through separation of variables,

u = G(t)φ(x),

we end up with

G'/G = div(k(x)∇φ(x)) / (ρcφ(x)) = −λ,   (2)

or

div(k(x)∇φ(x)) + λρcφ(x) = 0.   (3)

In general, there are no explicit solutions for φ. But we still love the idea of eigenfunction expansion. Our elementary functions xⁿ, eˣ, ln x, sin x, arcsin x, etc., are too few. We would like to establish more functions; these are called special functions, and the Bessel functions are examples.
6.12.2. Eigenvalue problem of Sturm-Liouville (p. 153 Keener).
Equation:

d/dx ( p(x) dφ/dx ) + q(x)φ + λσ(x)φ = 0,   a < x < b.   (4)

Boundary conditions:

β1 φ(a) + β2 φ'(a) = 0,
β3 φ(b) + β4 φ'(b) = 0.
(5)

Assumptions: p > 0, σ > 0; p, q, σ are smooth; |β1| + |β2| ≠ 0, |β3| + |β4| ≠ 0.
Conclusions:
1. All eigenvalues are real.
2. There exists an infinite number of eigenvalues:

λ1 < λ2 < · · · < λn < λn+1 < · · · ;

a. there is a smallest eigenvalue λ1,
b. there is no greatest one: λn → +∞ as n → ∞.
3. Corresponding to each λn there is an eigenfunction, denoted by φn(x) (unique up to a constant factor); φn(x) has exactly n − 1 zeros for a < x < b.
4. The eigenfunctions {φn}_{n=1}^{∞} form a "complete" set, meaning that any L²-integrable function f(x) can be represented by a generalized Fourier series of the eigenfunctions,

f(x) = Σ_{n=1}^{∞} an φn(x),   where an = ∫ₐᵇ f(x)φn(x)σ(x) dx / ∫ₐᵇ φn²(x)σ(x) dx,

in L²(a, b). Furthermore, this infinite series converges pointwise to

(f(x+) + f(x−))/2

for a < x < b, provided that f(x) is piecewise smooth (p. 164 Keener).
5. Weighted orthogonality:

∫ₐᵇ φn(x)φm(x)σ(x) dx = 0 if λn ≠ λm.

6. Any eigenvalue can be related to its eigenfunction by the Rayleigh quotient:

λn = ( −pφnφn'|ₐᵇ + ∫ₐᵇ (p(φn')² − qφn²) dx ) / ∫ₐᵇ φn²σ dx.
Example: For

u'' + λu = 0,   0 < x < 1,   u(0) = u(1) = 0,

we know un(x) = sin(nπx), λn = (nπ)². But also

u(x) = λ ∫₀¹ K(x, y)u(y) dy,

where

K(x, y) = y(1 − x) for 0 ≤ y < x ≤ 1,   K(x, y) = x(1 − y) for 0 ≤ x < y ≤ 1.

Let

Tu := ∫₀¹ K(x, y)u(y) dy.

Then the eigenvalue problem becomes the spectral problem for the compact operator T:

Tu = (1/λ)u

(p. 114 Keener).
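The spectral relation Tu = (1/λ)u can be checked numerically at a point (a NumPy sketch; the sample point x0 = 0.3 and mode n = 1 are ours):

```python
import numpy as np

def K(x, y):
    # Kernel of the operator T from the notes
    return np.where(y < x, y * (1 - x), x * (1 - y))

m = 200000
y = (np.arange(m) + 0.5) / m          # midpoint nodes on (0, 1)
x0, n = 0.3, 1
Tu = np.sum(K(x0, y) * np.sin(n * np.pi * y)) / m   # (T u)(x0) by quadrature
# Spectral relation: T u = (1/lambda) u with lambda = (n pi)^2
assert abs(Tu - np.sin(n * np.pi * x0) / (n * np.pi) ** 2) < 1e-7
```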
Announcement: The final exam will be cumulative. I will post a mock exam on the web on the Monday of exam week.
There will be no class on Friday, Dec 7; it is all because you have been so good. We will not cover the originally planned topic, Homogenization.
We will cover more special functions, including Bessel functions, this week, and Green's functions next Monday. Next Wednesday (Dec 5) is a review day.
6.12.3. Example: Heat flow in a nonuniform rod.

PDE: c(x)ρ(x) ∂u/∂t = ∂/∂x ( K0(x) ∂u/∂x ),   (1)
Boundary conditions: u(t, 0) = 0,   (2)
                     ∂u/∂x(t, L) = 0,   (3)
Initial condition: u(0, x) = g(x).   (4)

Before jumping to an eigenvalue problem, let us try separation of variables:

u(t, x) = G(t)φ(x).

Then

G'(t)/G(t) = (K0(x)φ')'/(ρcφ) = −λ.

Thus we have the eigenvalue problem

(K0(x)φ')' + λρcφ = 0,   φ(0) = φ'(L) = 0,

and

G(t) = b e^{−λt}.

By Sturm-Liouville theory, we have eigenvalues and eigenfunctions

λ1 < λ2 < · · · < λn < · · · ,   (5)
φ1, φ2, · · · , φn, · · · .   (6)

So general solutions are

u(t, x) = Σ_{n=1}^{∞} bn e^{−λn t} φn(x).

To find bn, we use the initial condition

g(x) = Σ_{n=1}^{∞} bn φn(x).

Using the orthogonality condition with weight ρc, we find

∫₀^L g(x)φm(x) ρc dx = bm ∫₀^L φm²(x) ρc dx,   m = 1, 2, · · · ,

so

bm = ∫₀^L g(x)φm(x) ρc dx / ∫₀^L φm²(x) ρc dx.   (7)

Thus a solution to (1)-(4) is

u(t, x) = Σ_{n=1}^{∞} bn e^{−λn t} φn(x),

where bn is given in (7), λn in (5), and φn in (6).
6.13. Explicit eigenfunctions: Orthogonal polynomials and special functions.
6.13.1. Legendre polynomials: (p. 167, Keener)

Pn(x) = (1/(2ⁿ n!)) dⁿ/dxⁿ [(x² − 1)ⁿ]

are eigenfunctions of the differential operator problem

Lu = λu,   Lu = −((1 − x²)u')',   −1 < x < 1,   λn = n(n + 1).

The boundary condition is that u is bounded on −1 ≤ x ≤ 1. The function p(x) = 1 − x² vanishes at |x| = 1, so the problem is not regular. (What we covered last time in 6.12 is called a regular Sturm-Liouville eigenvalue problem.) But we claim the polynomials are orthogonal and complete nonetheless.

6.13.2. The Schrodinger equation

u'' + (E − V(x))u = 0,   x ∈ IR¹.

Here E is the eigenvalue (the physicist's notation for λ). Let

V(x) = x² − 1,   u = e^{−x²/2} w.

Then

w'' − 2xw' + λw = 0,
w = Hn(x) = (−1)ⁿ e^{x²} dⁿ/dxⁿ e^{−x²},   λn = 2n.

The functions Hn(x) are called the Hermite polynomials, which are orthogonal polynomials.
See p. 167 ff. of Keener for more special functions.
6.13.3. Special functions, Bessel's differential equations.
Bessel's differential equation of order m (m ≥ 0) is

z² d²u/dz² + z du/dz + (z² − m²)u = 0,   0 < z < ∞.

We state the following without proof. It has two solutions Jm(z) and Ym(z). As z → 0, they satisfy

Jm(z) ∼ 1 for m = 0,   Jm(z) ∼ zᵐ/(2ᵐ m!) for m > 0;
Ym(z) ∼ (2/π) ln z for m = 0,   Ym(z) ∼ −(2ᵐ(m − 1)!/π) z⁻ᵐ for m > 0.

Jm(z) are called Bessel functions of the first kind, while Ym(z) are called Bessel functions of the second kind. The Jm are bounded; the Ym are not. Both are oscillatory and decay to zero as z → ∞. See Figures 6.13.1-2. We let zmn denote the positive zeros of Jm(z), n = 1, 2, · · · .
Figure 6.13.1. Bessel functions of the first kind, J0(z) and J1(z); the first zeros of J0 are z01 = 2.40483... and z02 = 5.52008... .
Figure 6.13.2. Bessel functions of the second kind, Y0(z) and Y1(z).
Another form of Bessel's equation: Let z = √λ r; then

r² d²u/dr² + r du/dr + (λr² − m²)u = 0.

Let

Lu = −(ru')' + (m²/r)u.

Then Bessel's equation becomes

Lu = λru,   0 < r < ∞.

We propose to study the eigenvalue problem

Lu = λru,   0 < r < a,   |u(0)| < ∞,   u(a) = 0,

where a is any given positive number. The solutions of the equation with |u(0)| < ∞ are

u(r) = c Jm(√λ r).

Imposing the other boundary condition u(a) = 0, we have

√λ a = zmn.

Thus

λ = λmn := (zmn/a)²,   n = 1, 2, · · · .

The eigenvalue problem is a singular Sturm-Liouville eigenvalue problem; we claim that orthogonality and completeness both hold:

∫₀^a Jm(√λmn r) Jm(√λmk r) r dr = 0,   n ≠ k,

and any α(r) with ∫₀^a r α²(r) dr < ∞ has the expansion

α(r) = Σ_{n=1}^{∞} bn Jm(√λmn r),

where

bn = ∫₀^a α(r) Jm(√λmn r) r dr / ∫₀^a Jm²(√λmn r) r dr.

This expansion is called the Fourier-Bessel series.
6.14 Vibrating membrane in a circular domain

PDE: ∂²u/∂t² = c²∆u, in the disk r < a in IR²,   (1)
Boundary condition: u(t, a, θ) = 0,   r = a, θ ∈ [−π, π],   (2)
Initial conditions: u(0, r, θ) = α(r, θ),   ∂u/∂t(0, r, θ) = β(r, θ).   (3)

Using separation of variables

u(t, r, θ) = φ(r, θ)G(t),   (4)

we find

∆φ + λφ = 0   (5)

and

d²G/dt² + λc²G = 0.   (6)

We need

φ(a, θ) = 0.   (7)

This time the domain is not rectangular, so we will not get sin(nπx/L) sin(mπy/H). We try separation of variables again:

φ(r, θ) = f(r)g(θ),   0 < r < a, −π < θ < π.   (8)

Recall that in 2-D

∆φ = (1/r) ∂/∂r ( r ∂φ/∂r ) + (1/r²) ∂²φ/∂θ².

Thus (5) becomes

−(1/g) d²g/dθ² = (r/f) d/dr ( r df/dr ) + λr² =: µ.   (9)

(Compare with uxx + uyy + λu = 0 and u = f(x)g(y): fxx/f + gyy/g + λ = 0, so fxx/f + λ = −gyy/g =: µ.)
We see that g needs to be periodic in θ:

g(π) = g(−π),   g'(π) = g'(−π).   (10)

The Sturm-Liouville eigenvalue problem with the periodic conditions (10),

d²g/dθ² + µg = 0 with (10),   (11)

yields

µ = µm := m²,   m = 0, 1, 2, · · · ,   (12)
g = sin(mθ) or cos(mθ).   (13)

Thus for m = 0 there is one eigenfunction, g = 1, while for m > 0 there are two linearly independent eigenfunctions. These eigenfunctions generate a complete and orthogonal basis for L²[−π, π]. This is the full Fourier series: any function Γ(θ) in L²[−π, π] has the expansion

Γ(θ) = Σ_{m=0}^{∞} [ am cos(mθ) + bm sin(mθ) ].   (14)

(Define b0 = 0 for notational convenience.) Now, for each µm, we consider equation (9),

(r/f) d/dr ( r df/dr ) + λr² = m²,   (15)

with the natural conditions |f(0)| < ∞ and f(a) = 0 derived from (7); i.e.,

r(rf')' + (λr² − m²)f = 0,   |f(0)| < ∞,   f(a) = 0.   (16)

The solutions to (16) are given in Section 6.13.3 (Bessel functions); they are

f(r) = fmn(r) := Jm(√λmn r),   λ = λmn := (zmn/a)²,   n = 1, 2, · · · .   (17)

We have found φ for (5):

φ = φmn := Jm(√λmn r)[ am cos(mθ) + bm sin(mθ) ].

For the G function in (6) we find

G(t) = cos(c√λmn t) or sin(c√λmn t).
Combining all the factors, we find a general solution formula

u(t, r, θ) = Σ_{m=0}^{∞} Σ_{n=1}^{∞} [ Amn Jm(√λmn r) cos(mθ) cos(c√λmn t)
           + Bmn Jm(√λmn r) sin(mθ) cos(c√λmn t)
           + Cmn Jm(√λmn r) cos(mθ) sin(c√λmn t)
           + Dmn Jm(√λmn r) sin(mθ) sin(c√λmn t) ].
(18)

Imposing the initial conditions (3) on (18) determines the coefficients. For example, if β = 0, we find Cmn = Dmn = 0. Then

α(r, θ) = Σ_{m=0}^{∞} ( Σ_{n=1}^{∞} Amn Jm(√λmn r) ) cos(mθ) + Σ_{m=1}^{∞} ( Σ_{n=1}^{∞} Bmn Jm(√λmn r) ) sin(mθ),

where

Amn = ∫₀^a ∫₀^{2π} α(r, θ) Jm(√λmn r) cos(mθ) r dr dθ / ∫₀^a ∫₀^{2π} Jm²(√λmn r) cos²(mθ) r dr dθ,

Bmn = ∫₀^a ∫₀^{2π} α(r, θ) Jm(√λmn r) sin(mθ) r dr dθ / ∫₀^a ∫₀^{2π} Jm²(√λmn r) sin²(mθ) r dr dθ.
Notes. Two-dimensional eigenvalue problems.
Here is a summary of all the two-dimensional eigenvalue problems we have encountered. They have appeared in:
1. The Poisson equation in a rectangle Ω (Section 6.9) (and Homework set 14):

∆φ + λφ = 0 in Ω,   φ = 0 on ∂Ω.

2. Or in a disk (Section 6.13, Bessel functions).
3. Heat flow in a rectangle, Section 6.10.4, to be uploaded (also in Homework set 14).
4. The wave equation in a rectangle (Section 6.11) and in a disk (Section 6.13).
Appendix: One-dimensional eigenvalue problem
We provide a complete solution to the eigenvalue problemd2φdx2 + λφ = 0, 0 < x < L
φ(0) = φ(L) = 0.
Solution. The objective of the eigenvalue problem is to find both the parameter
λ and a nonzero solution φ. We use the strategy of shooting. First let λ be zero
λ = 0, and see whether we can find a nonzero solution φ. In this case, the equation
becomes
φ′′ = 0.
Thus
φ = a1 + a2x.
Then the boundary conditions imply that a1 = a2 = 0. Thus we do not have any
nonzero solution for λ = 0. Let us now try to find a negative solution of λ : λ =
−c2, c > 0. Then the equation becomes
φ′′ − c2φ = 0.
We use the guess
φ = e^(αx)
to find that
α² − c² = 0.
So α = ±c and we have the solution
φ = a1 e^(cx) + a2 e^(−cx).
The boundary conditions imply similarly that a1 = a2 = 0. So there is no nonzero
solution for λ = −c². Let us now try λ = c², c > 0, and a solution of the form φ = e^(αx); we find
α = ±ic and the solutions are
φ = a1 cos(cx) + a2 sin(cx).
The boundary condition φ(0) = 0 implies a1 = 0. The boundary condition φ(L) = 0
implies
sin(cL) = 0.
So we choose c such that cL = nπ, n = 1, 2, · · · . Thus λ = (nπ/L)², and the
corresponding solutions are
φ = a2 sin(nπx/L).
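This conclusion is easy to check numerically: with λ = (nπ/L)², the function φ = sin(nπx/L) satisfies both boundary conditions and makes φ'' + λφ vanish up to discretization error. A small pure-Python sketch (the values L = 2 and n = 3, the step h, and the test points are illustrative choices of mine):

```python
import math

L, n = 2.0, 3                        # interval length and mode number (illustrative)
lam = (n * math.pi / L)**2           # eigenvalue lambda = (n*pi/L)^2

def phi(x):
    """Candidate eigenfunction phi(x) = sin(n*pi*x/L)."""
    return math.sin(n * math.pi * x / L)

# A centered second difference approximates phi''(x); the residual
# phi'' + lambda*phi should vanish up to O(h^2) discretization error.
h = 1e-4
def residual(x):
    d2 = (phi(x + h) - 2 * phi(x) + phi(x - h)) / h**2
    return d2 + lam * phi(x)

print(phi(0.0), abs(phi(L)), abs(residual(0.7)))
```

The boundary values are zero (to machine precision at x = L) and the residual is of the order of the finite-difference truncation error.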
C. General bounded domains, Green’s function.
6.15 Poisson equation in a bounded domain, Green’s function.
Given a domain Ω ⊂ ℝ^n, consider the problem
Δu = f(x) in Ω, (1)
u|_∂Ω = 0. (2)
If Ω is not one of the special cases (interval, rectangle, disk, sphere, cylinder, half
disk or quarter disk, etc.), then separation of variables and transform methods do not
work. The eigenvalue problem and eigenfunction expansion is one way; however, there
is an alternative way which reduces the work of finding a solution to (1)-(2) for
arbitrary f(x) to finding a solution for the single function f(x) = δ(x − x0). This
single solution is called Green's function and is defined as the solution to
ΔG = δ(x − x0) in Ω, (3)
G|_∂Ω = 0. (4)
Recall that our definition of the fundamental solution F is
ΔF = δ(x − x0) in ℝ^n
with the condition F(x) → 0 as |x| → ∞. Thus W := G − F satisfies
ΔW = 0 in Ω,
W|_∂Ω = −F.
So Green's function G = F + W is a "fundamental solution" that satisfies the zero
boundary condition.
If we multiply (3)-(4) by f(x0) and integrate in x0, we find
Δ( ∫_Ω G(x, x0) f(x0) dx0 ) = ∫_Ω δ(x − x0) f(x0) dx0 = f(x),
( ∫_Ω G(x, x0) f(x0) dx0 )|_∂Ω = 0.
Thus
u = ∫_Ω G(x, y) f(y) dy
is a solution to
Δu = f(x) in Ω,
u = 0 on ∂Ω.
For the simplest example, we see that the solution to
d²u/dx² = f(x), u(0) = u(1) = 0
is given by
u(x) = ∫_0^1 g(x, y) f(y) dy,
where
g(x, y) = (x − y)H(x − y) − x(1 − y), for 0 ≤ x, y ≤ 1,
and H is the Heaviside function: H(x) = 1 for x > 0, H(x) = 0 for x < 0. See
Fig. 6.15.1. A simple calculation shows g(0, y) = g(1, y) = 0, and
g_x(x, y) = H(x − y) − (1 − y),
g_xx = δ(x − y).
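These properties of g, and the solution formula u(x) = ∫_0^1 g(x, y) f(y) dy, are easy to check numerically. A pure-Python sketch (the test case f = 1, whose exact solution of u'' = 1 with u(0) = u(1) = 0 is u = (x² − x)/2, and the grid size N are my illustrative choices):

```python
def H(x):
    """Heaviside function: 1 for x > 0, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

def g(x, y):
    """Green's function for u'' = f, u(0) = u(1) = 0."""
    return (x - y) * H(x - y) - x * (1 - y)

def solve(f, x, N=2000):
    """u(x) = integral_0^1 g(x, y) f(y) dy, by the midpoint rule."""
    h = 1.0 / N
    return h * sum(g(x, (j + 0.5) * h) * f((j + 0.5) * h) for j in range(N))

# Test case: f = 1; the exact solution is u = (x^2 - x)/2.
x = 0.3
u = solve(lambda y: 1.0, x)
print(u, (x * x - x) / 2)   # both ≈ -0.105
```

Note that g is only piecewise smooth in y (the kink sits at y = x), so a quadrature rule that does not straddle the kink too crudely, like the midpoint rule here, works well.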
See pp. 145-146 of Keener. (Keener's textbook identifies fundamental solutions
with Green's functions.)
Figure 6.15.1. The Green's function g(x, y) for u'' = f(x), u(0) = u(1) = 0.
In general, a Green’s function has no explicit formula. Still, it is helpful even
when only the abstract form of existence is available. Green’s functions appear in
numerous situations.