Chapter I. Vectors and Tensors
Yuxi Zheng
Abstract.
We assume the students know these materials:
Basic vector algebra: addition, subtraction, and multiplication by a scalar, linear
dependence, linear independence, basis, expansion of a vector with respect to other
vectors, inner product.
We will cover the topics:
Advanced vector algebra: Projection of a vector onto an axis; vector product, product
of three vectors;
Brief Introduction
As human beings learned about the natural world around them, they invented
words for description, and later introduced units of measurement to quantify their
descriptions. They tried various ways, including imagination, observation, and setting
up laboratories, to find out the mechanisms of motion. Mathematics introduces
symbols (numbers, variables (length, area, volume), coordinates, functions,
vectors, tensors, rates of change, equations, inequalities, etc.) to model natural
phenomena. With enough symbols accumulated, a branch of mathematics, called pure math,
is devoted to the study of these symbols. The study of these symbols (rules of opera-
tions) with an aim toward applications to the natural world is called Applied Mathematics.
A clear distinction between pure and applied mathematics is hard to draw. However,
the application of mathematics is easily seen as the use of developed mathematical
tools in the sciences, engineering, and other fields.
Our goal will be learning the basic tools of mathematics that had and will con-
tinue to have applications. These tools will be introduced most often with some
background on their origin. Applications often follow. Our emphasis is on the
math: principles and essential calculations.
1.1. Vectors
Review. Two example vectors are
A = (1, 1, 1), B = (0, −1, 2).
The notation for a vector here is a boldface letter; it can also be written as a letter
with an arrow on top of it.
Scalar multiplication
2A = (2, 2, 2).
Addition
2A + B = (2, 2, 2) + (0,−1, 2) = (2, 1, 4).
The zero vector
0 = (0, 0, 0).
The subtraction
A−B = (1, 2,−1).
Other examples
C = (1, 2), D = (0, 3).
And
2C−D = (2, 4) − (0, 3) = (2, 1).
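The componentwise operations above can be checked with a short script. This is a minimal sketch; the helper names `add` and `scale` are our own, not from the text.

```python
# Componentwise vector operations on tuples (illustrative helpers).
def add(A, B):
    return tuple(a + b for a, b in zip(A, B))

def scale(c, A):
    return tuple(c * a for a in A)

A = (1, 1, 1)
B = (0, -1, 2)
print(add(scale(2, A), B))              # 2A + B = (2, 1, 4)
C = (1, 2)
D = (0, 3)
print(add(scale(2, C), scale(-1, D)))   # 2C - D = (2, 1)
```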
Geometric Representation.
(Figure 1.1.1. Representation of C.)
(Figure 1.1.2. Representation of A) (omitted)
(Figure 1.1.3. Addition and subtraction of C + D and C−D.)
(Figure 1.1.4. Scalar multiplication of C.) (omitted)
Length (magnitude) of a vector B = (x1, x2, x3):
|B| = √(x1² + x2² + x3²).
A unit vector in the direction of B:
B/|B| = (0, −1, 2)/√(0² + (−1)² + 2²) = (0, −1/√5, 2/√5).
(Figure 1.1.5. A unit vector. ) (omitted)
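The magnitude and unit-vector formulas above translate directly into code. A minimal sketch; the helper names are our own.

```python
import math

# Magnitude |B| and the unit vector B/|B| for B = (0, -1, 2).
def norm(B):
    return math.sqrt(sum(b * b for b in B))

def unit(B):
    n = norm(B)
    return tuple(b / n for b in B)

B = (0, -1, 2)
print(norm(B))   # sqrt(5), about 2.236
print(unit(B))   # (0.0, -1/sqrt(5), 2/sqrt(5))
```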
A nonzero vector B yields naturally an axis: the line that passes through B with
the same direction of B.
(Figure 1.1.6. An axis associated with a given nonzero vector B.) (omitted)
1.1.1. Projection of a vector A onto an axis u: |u| = 1.
Definition: ProjuA = (|A| cos φ)u.
Here φ represents the angle between the two vectors A and u.
Figure 1.1.7. Projections with acute and obtuse angles.
If φ is between −π/2 and π/2, then the cosine is positive and the projection is in the
same direction as u. If φ is between π/2 and 3π/2, the projection is in the opposite
direction.
We see that if u1 and u2 are orthogonal axes, the projection can be used to
decompose A:
A = Proj_{u1} A + Proj_{u2} A.
For a general nonzero vector B, the projection onto B is
Proj_B A = (|A| cos φ) B/|B|.
1.1.2. Inner product. (a.k.a. scalar product, dot product)
Example. Let A = (1,−1, 2), B = (2, 3,−5). Then
A ·B = 1 · 2 + (−1) · 3 + 2 · (−5) = −11. (1)
The inner product can be used to express the projection. Formula:
Proj_B A = (A · B / |B|²) B.   (2)
The proof will be given shortly.
Now let us examine an example. Let
i1 = (1, 0, 0), i2 = (0, 1, 0), i3 = (0, 0, 1).
Then,
i1 · i1 = 1, i2 · i2 = 1, i3 · i3 = 1,
i1 · i2 = 0, i1 · i3 = 0, i2 · i3 = 0.   (3)
If a set of vectors satisfies (3), then the set is called orthonormal. Conditions (3) are
called orthonormal conditions.
Although the inner product in (1) is simple and direct, there is another popular
definition, which is equivalent. It is
A ·B = |A||B| cos(A,B) (4)
where (A,B) denotes the angle from A to B.
Figure 1.1.8. Angle and inner product.
Properties: A ·B = B ·A, (2A) ·B = 2(A ·B).
Distributive law: A · (B + C) = A ·B + A ·C.
We give a one-line proof of formula (2):
Proj_B A = (|A| cos φ) B/|B| = (|A||B| cos φ) B/|B|² = (A · B) B/|B|².
Now we see that A and B are orthogonal (defined as φ = ±π/2) if and only if
A · B = 0, and if and only if Proj_B A = 0.
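The dot product of Example (1) and projection formula (2) can be checked numerically. A minimal sketch; `dot` and `proj` are our own helper names.

```python
# Dot product and the projection formula (2): Proj_B A = (A.B / |B|^2) B.
def dot(A, B):
    return sum(a * b for a, b in zip(A, B))

def proj(A, B):
    """Projection of A onto the axis of B, formula (2)."""
    c = dot(A, B) / dot(B, B)
    return tuple(c * b for b in B)

A = (1, -1, 2)
B = (2, 3, -5)
print(dot(A, B))   # -11, matching (1)
residual = tuple(a - p for a, p in zip(A, proj(A, B)))
print(abs(dot(residual, B)) < 1e-12)   # True: A - Proj_B A is orthogonal to B
```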
We are interested in deriving (1) from (4). Let A = a1 i1 + a2 i2 + a3 i3, B =
b1 i1 + b2 i2 + b3 i3. Following the distributive law, we have
A · B = A · (b1 i1 + b2 i2 + b3 i3)
      = b1 A · i1 + b2 A · i2 + b3 A · i3
      = b1(a1 + 0 + 0) + b2(0 + a2 + 0) + b3(0 + 0 + a3)
      = a1 b1 + a2 b2 + a3 b3.   (5)
—End of Lecture 1. —
1.1.3. Vector product (a.k.a cross product)
Given two vectors A and B, we define the vector product of A and B to be
a vector C:
C = A×B
where
1. The length of C is the area of the parallelogram spanned by A and B:
|C| = |A||B|| sin(A,B)|;
2. The direction of C is perpendicular to the plane formed by A and B. And
the three vectors A, B, and C follow the right-hand rule.
The right-hand rule is: when the four fingers of the right hand turn from A to
B, the thumb points to the direction of C.
(Figure 1.1.3.1. Definition of cross product.)
Properties:
A×B = −B×A;
A× (B + C) = A×B + A×C;
A‖B is the same as A×B = 0
where the symbol ‖ means parallel.
Example 1.1.3a: Recall the three vectors i1, i2, i3. They follow the right hand
rule. By definition we can find that
i1 × i1 = 0, i2 × i2 = 0, i3 × i3 = 0;
and
i1 × i2 = i3, i2 × i3 = i1, i3 × i1 = i2.
With Example 1.1.3a and the distributive property, we can find a formula for
the product. Let
A = a1i1 + a2i2 + a3i3, B = b1i1 + b2i2 + b3i3.
Then (please do it on your own. you can do it.)
A × B =
| i1  i2  i3 |
| a1  a2  a3 |
| b1  b2  b3 |.
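The cofactor expansion of this determinant can be written out as a short function. A sketch; the helper name `cross` is ours.

```python
# Cross product from the cofactor expansion of the 3x3 determinant above.
def cross(A, B):
    a1, a2, a3 = A
    b1, b2, b3 = B
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

i1, i2, i3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
print(cross(i1, i2))   # (0, 0, 1) = i3, as in Example 1.1.3a
print(cross(i2, i3))   # (1, 0, 0) = i1
print(cross(i1, i1))   # (0, 0, 0)
```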
Example 1.1.3b. An electric charge e moving with velocity v in a magnetic
field H experiences a force:
F = (e/c)(v × H),
where c is the speed of light.
Example 1.1.3c. The moment M of a force F about a point O is
M = r× F.
(Figure 1.1.3.2. Moment of a force.)
1.1.4. Product of three vectors.
Given three vectors A, B, and C, we list three products with formulas:
(A × B) × C = B(A · C) − A(B · C);
A × (B × C) = B(A · C) − C(A · B);
(A × B) · C =
| a1  a2  a3 |
| b1  b2  b3 |
| c1  c2  c3 |
where the entries are the coordinates of the three vectors. The first two products
are called vector triple products; the third is called the scalar triple product. The proofs
of the formulas for the vector triple products are complicated, but the proof of
the formula for the scalar triple product is straightforward. The reader should be
able to do it alone.
To remember the formulas for the two vector triple products, there is a quick way.
You see that the final product of the first vector triple product will be perpendicular
to A×B, so it will lie in the plane spanned by A and B. It is perpendicular to C, so
there will be no component in the C direction. So the first vector triple product is
a linear combination of A and B, not C. The coefficients are the inner products of
the remaining two vectors, with a minus sign for the second term; while the middle
vector B is the first term.
Recall that the magnitude (length) of A × B is the area of the parallelogram
spanned by A and B, and the inner product with C is this magnitude times |C| cos φ,
which is exactly the height of the parallelepiped with a “slanted height” |C| and a
bottom parallelogram spanned by A and B. Thus the magnitude of the scalar triple
product is the volume of the parallelepiped formed by the three vectors. See Figure
1.1.3.3.
(Figure 1.1.3.3. Volume of the parallelepiped formed by three vectors.)
Figure 1.1.3.3. The volume of the parallelepiped is the magnitude of (A × B) · C.
One can form other triple products, but they can all be reduced quickly to one
of the three mentioned here. One may notice that the second vector triple product
can be reduced to the first one easily. So essentially there is only one
vector triple product and one scalar triple product.
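Both the volume interpretation of the scalar triple product and the vector triple product identity can be checked numerically. A sketch with our own helper names and arbitrarily chosen test vectors.

```python
# Scalar and vector triple products.
def dot(A, B):
    return sum(a * b for a, b in zip(A, B))

def cross(A, B):
    a1, a2, a3 = A
    b1, b2, b3 = B
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def triple(A, B, C):
    """(A x B) . C, the determinant with rows a, b, c."""
    return dot(cross(A, B), C)

# Volume of the 1 x 2 x 3 box spanned by the coordinate axes:
print(triple((1, 0, 0), (0, 2, 0), (0, 0, 3)))   # 6

# Vector triple product identity A x (B x C) = B(A.C) - C(A.B):
A, B, C = (1, 2, 3), (4, 5, 6), (7, 8, 10)
lhs = cross(A, cross(B, C))
rhs = tuple(b * dot(A, C) - c * dot(A, B) for b, c in zip(B, C))
print(lhs == rhs)   # True
```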
1.2. Variable vectors.
1.2.1. Vector functions of a scalar argument.
Example 1.2.1a. A(t) = (cos t, sin t, 0), (−∞ < t < ∞).
The graph (the collection of all the tips of the vector A(t)) is a circle.
Example 1.2.1b. A(t) = (cos t, sin t, t), (−∞ < t < ∞).
The graph is a helix.
Example 1.2.1c. A(t) = (3t, 2t,−t) = (3, 2,−1)t. It represents a straight line.
The general formula for a straight line is
A(t) = tα + β
where α and β are numerical vectors independent of t.
1.2.2. The derivatives of a vector function.
Let
A(t) = (A1(t), A2(t), A3(t)) = A1(t)i1 + A2(t)i2 + A3(t)i3.
Then
dA(t)/dt = (dA1(t)/dt, dA2(t)/dt, dA3(t)/dt) = (dA1(t)/dt) i1 + (dA2(t)/dt) i2 + (dA3(t)/dt) i3.
Formal definition:
A′(t) = lim_{∆t→0} [A(t + ∆t) − A(t)]/∆t.
(Figure 1.2.1. Derivative: pay attention to the direction of the difference quo-
tient.)
Figure 1.2.1. Derivative: tangent direction.
Example 1.2.2a. Let r(t) be the position vector of a moving particle. Then
dr(t)/dt = v(t) is the velocity;
d²r(t)/dt² = dv(t)/dt = a(t) is the acceleration.
Example 1.2.2b. Consider r(t) = (cos t, sin t). Then r′(t) = (− sin t, cos t).
Note that the derivative is tangent to the graph of r(t). Consider R(t) = 2(cos t, sin t).
Then R′(t) = 2(− sin t, cos t). Notice that the magnitude of the derivative is twice
as large. See Figure 1.2.2.
(Figure 1.2.2. Derivative: direction and magnitude.)
Simple rules:
d/dt (A ± B) = dA/dt ± dB/dt,
d/dt (cA) = (dc/dt) A + c dA/dt,
d/dt (A · B) = (dA/dt) · B + A · (dB/dt),
d/dt (A × B) = (dA/dt) × B + A × (dB/dt).   (1)
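The product rule for the dot product can be checked with finite differences. A numerical sketch; the particular functions A, B and the helper names are our own choices.

```python
import math

# Finite-difference check of d/dt (A.B) = A'.B + A.B'.
def dot(A, B):
    return sum(a * b for a, b in zip(A, B))

def A(t):
    return (math.cos(t), math.sin(t), t)

def B(t):
    return (t * t, 1.0, math.exp(t))

def deriv(F, t, h=1e-6):
    """Central-difference derivative of a vector function, componentwise."""
    return tuple((f2 - f1) / (2 * h) for f1, f2 in zip(F(t - h), F(t + h)))

t, h = 0.7, 1e-6
lhs = (dot(A(t + h), B(t + h)) - dot(A(t - h), B(t - h))) / (2 * h)
rhs = dot(deriv(A, t), B(t)) + dot(A(t), deriv(B, t))
print(abs(lhs - rhs) < 1e-5)   # True
```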
1.2.3. The integral of a vector function.
The integral of a vector function is defined also component-wise.
Let
A(t) = (A1(t), A2(t), A3(t)) = A1(t) i1 + A2(t) i2 + A3(t) i3.
Then
B(t) = ∫ A(t) dt = (∫ A1(t) dt, ∫ A2(t) dt, ∫ A3(t) dt).
Combining Lectures 1 and 2, we have covered this week the following sections
of our textbook (Vector and Tensor Analysis with Applications, by Borisenko et al.):
1.2.3, 1.4, 1.5, and 1.7.
End of Lecture 2.
1.3. Vector fields
We have mentioned the magnetic field: a domain in which a vector of magnetism is
given at every point. Another example is the velocity field
in a stream: each water droplet has a velocity. See Figure 1.3.1.
(Figure 1.3.1. Velocity field)
Figure 1.3.1. The velocity field of a stream.
Generally, a vector field is a domain Ω and a vector function A(r) defined in it.
Furthermore, a vector field may be time-dependent: A(r, t).
1.3.1. Line integrals and circulation.
We introduce an integral that gives work done by a force field or circulation of
velocity around a loop.
Let A(r) be a vector field with domain Ω. Let M1M2 be a curve in the domain
directed from M1 to M2. Chop the curve into many small pieces, say n pieces. One
typical piece is denoted by the end points ri and ri+1. See Figure 1.3.2. The work
done in this piece is approximately A(ri) ·∆ri, where ∆ri = ri+1− ri, if we imagine
that the vector field is a force field. This can also be interpreted as the flow of the
vector field in the direction of ∆ri. We sum over all such pieces and take the limit
as all ∆ri → 0 to define the line integral:
lim_{n→∞} Σ_{i=1}^{n} A(ri) · ∆ri = ∫_{M1M2} A(r) · dr = ∫_{M1M2} (A1 dx1 + A2 dx2 + A3 dx3).   (1)
Here the notation is A = A1i1 + A2i2 + A3i3.
Line integrals give either total work done by the vector field, or total flow of the
vector field along the curve M1M2 in the direction specified.
Total circulation around a contour L is defined as
Γ = ∮_L A · dr.
(Figure 1.3.2. Definition of the line integral.)
Example 1.3.1a. Let A = (−x2, x1, 0). Let L be the unit circle x1² + x2² = 1,
x3 = 0, traversed counter-clockwise. Then
Γ = ∮_L A · dr = ∮_L (−x2 dx1 + x1 dx2)
    (L: x1 = cos θ, x2 = sin θ, 0 ≤ θ ≤ 2π)
  = ∫_0^{2π} (− sin θ d cos θ + cos θ d sin θ)
  = ∫_0^{2π} (sin²θ + cos²θ) dθ
  = ∫_0^{2π} dθ = 2π.   (2)
See Figure 1.3.3.
(Figure 1.3.3. Examples of circulation and line integrals.)
Figure 1.3.3. Examples of circulation and line integrals: (a) Γ = 0; (b) Γ = 2π.
Example 1.3.1b. Let A = (x1, x2, 0) and L as before. Then
Γ = ∮_L A · dr = ∮_L (x1 dx1 + x2 dx2)
    (L: x1 = cos θ, x2 = sin θ, 0 ≤ θ ≤ 2π)
  = ∫_0^{2π} (cos θ d cos θ + sin θ d sin θ)
  = ∫_0^{2π} (− cos θ sin θ + sin θ cos θ) dθ
  = ∫_0^{2π} 0 dθ = 0.   (3)
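Examples 1.3.1a and 1.3.1b can be reproduced with a Riemann-sum approximation of the line integral, summing A(ri) · ∆ri over small pieces as in the definition. A numerical sketch; the helper name is ours.

```python
import math

# Riemann-sum approximation of circulation around the unit circle.
def circulation(A, n=20000):
    total = 0.0
    for i in range(n):
        th0 = 2 * math.pi * i / n
        th1 = 2 * math.pi * (i + 1) / n
        r0 = (math.cos(th0), math.sin(th0), 0.0)
        r1 = (math.cos(th1), math.sin(th1), 0.0)
        a = A(r0)
        total += sum(ai * (p1 - p0) for ai, p0, p1 in zip(a, r0, r1))
    return total

print(circulation(lambda r: (-r[1], r[0], 0.0)))   # close to 2*pi, as in (2)
print(circulation(lambda r: (r[0], r[1], 0.0)))    # close to 0, as in (3)
```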
Example 1.3.1c. Let A = (x1, x2, 0) and let L be the line segment 0 ≤ x1 ≤ 1,
x2 = 0, directed in the direction of increasing x1. Then
Γ = ∫_L A · dr = ∫_0^1 x1 dx1 = (1/2) x1² |_0^1 = 1/2.   (4)
1.4. Theorems of Gauss, Green, and Stokes.
Recall the Fundamental Theorem of Calculus:
∫_a^b F′(x) dx = F(b) − F(a).
Its magic is to reduce the domain of integration by one dimension. We want higher
dimensional versions of this theorem.
We want two theorems like
∫∫_S (integrand) dS = ∮_{∂S} (another integrand) dℓ,
∫∫∫_V (integrand) dV = ∫∫_{∂V} (another integrand) dS.   (5)
When S is a flat surface, the formula is called Green’s Theorem. When S is
curved, it is called Stokes’ Theorem. The volume integral is called Gauss’ Theorem.
Gauss’ Theorem. Let P(x1, x2, x3), Q(x1, x2, x3), R(x1, x2, x3) and all their
partial derivatives be continuous in a given domain V with boundary ∂V. Then
∫∫∫_V (∂P/∂x1 + ∂Q/∂x2 + ∂R/∂x3) dV = ∫∫_{∂V} (P cos(n, x1) + Q cos(n, x2) + R cos(n, x3)) dS.
Here n is the unit exterior normal to the surface ∂V. The term (n, x1) represents
the angle between n and the x1-axis, etc.
Note that the domain V can have holes: V can be a shell (a ball with another
concentric, smaller ball removed), in which case the boundary of V consists of two
disjoint parts: an exterior surface with normal pointing outward and an interior surface
with exterior unit normal pointing toward the origin. The boundary ∂V is allowed
to be piecewise smooth. But the functions P(x1, x2, x3), Q(x1, x2, x3), R(x1, x2, x3)
and their derivatives are required to be continuous.
The more common form of Gauss’ Theorem is in vector form. Let
A = (P, Q, R).
Let the divergence of the vector A be
div A = ∂P/∂x1 + ∂Q/∂x2 + ∂R/∂x3.
Recall
n = (n1, n2, n3) = (cos(n, x1), cos(n, x2), cos(n, x3)).
Then Gauss’ Theorem can be written in vector form:
∫∫∫_V div A dV = ∫∫_{∂V} A · n dS.
The proof of Gauss’ Theorem is omitted.
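Though the proof is omitted, the theorem can be checked numerically on a simple field. A sketch under our own choices: A = (x1, x2, x3) on the unit cube [0,1]³, so div A = 3 and the volume integral is 3; the surface flux should match.

```python
# Surface flux of A over the boundary of the unit cube, face by face.
def flux_unit_cube(A, n=100):
    h = 1.0 / n
    total = 0.0
    for axis in range(3):
        for side, nsign in ((0.0, -1.0), (1.0, 1.0)):
            for i in range(n):
                for j in range(n):
                    p = [(i + 0.5) * h, (j + 0.5) * h]
                    p.insert(axis, side)              # a point on this face
                    total += nsign * A(p)[axis] * h * h
    return total

print(flux_unit_cube(lambda p: (p[0], p[1], p[2])))   # close to 3
```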
Green’s Theorem. Given a planar region S bounded by a closed contour L.
Suppose that P(x1, x2) and Q(x1, x2) and all their partial derivatives are continuous
on the union S ∪ L. Then
∫∫_S (∂Q/∂x1 − ∂P/∂x2) dS = ∮_L (P dx1 + Q dx2),
where L is traversed in the direction such that S appears to the left of an observer
moving along L.
End of Lecture 3.
1.4. Theorems of Gauss, Green, and Stokes(Cont’d)
Stokes’ Theorem. Given a surface S bounded by a closed contour L. Suppose
that P, Q, R and all their derivatives are continuous on the union S ∪ L. Then
∫∫_S [(∂R/∂x2 − ∂Q/∂x3) cos(n, x1) + (∂P/∂x3 − ∂R/∂x1) cos(n, x2) + (∂Q/∂x1 − ∂P/∂x2) cos(n, x3)] dS
= ∮_L (P dx1 + Q dx2 + R dx3),   (1)
where n is the unit normal to S. Here L is traversed in the direction such that S
appears to the left of an observer moving along L, with the vector n at points near
L pointing from the observer’s feet to his/her head.
(Figure 1.4.1. Orientations in Stokes Theorem)
Stokes’ Theorem in vector form. If we let
A = (P, Q, R) = P i1 + Q i2 + R i3
and define
curl A = (∂R/∂x2 − ∂Q/∂x3, ∂P/∂x3 − ∂R/∂x1, ∂Q/∂x1 − ∂P/∂x2)
       =
| i1   i2   i3  |
| ∂x1  ∂x2  ∂x3 |
| P    Q    R   |,
then Stokes’ Theorem can be written as
∫∫_S curl A · n dS = ∮_{∂S} A · dr.
We see that the term ∮_{∂S} A · dr is the total circulation of the vector field A along
∂S. The term ∫∫_S curl A · n dS is called the total flux of the vector field curl A
through the surface S. In general the total flux of a vector W through a surface S
is defined as
∫∫_S W · n dS.
See Figure 1.4.2.
(Figure 1.4.2. Definition of flux)
Figure 1.4.2. Flux of a vector field through a surface.
We will come back to the meaning of curl later.
1.4.1. Simply connected domains.
We emphasize that Stokes’ Theorem holds only when the vector field A and its
curl are continuous on the union of the surface with its boundary. In general the
continuity condition is verified in a domain D that contains S. Sometimes one may
make mistakes in the relation of S with D. Let us consider the following question.
Let D be a domain. Suppose a vector A and all its derivatives are continuous
in D. Suppose further that curl A = 0 in D. Can we then use Stokes’ Theorem to
conclude that ∮_L A · dr = 0 for any contour L within D?
The answer is yes if D is a solid ball or even a shell (a shell is the region between
two concentric balls). But the answer is no if D is a torus. See Figure 1.4.3. To see
why, we imagine a contour L that goes along the long circle of the torus. Now it is
clear that we cannot find a surface S whose boundary is L and which lies entirely in the
domain D. It may well be the case that the curl of A is not zero outside
of D. In this case, we do not have a surface S to apply Stokes’ Theorem on.
(Figure 1.4.3. A torus.)
Figure 1.4.3. A torus and a contour that cannot shrink to a point within.
The difference between a torus and a shell or a ball can be characterized as
follows. Any closed curve inside a ball can be shrunk within the ball to a point. But
not every closed curve inside a torus can be shrunk within the torus to a point.
Definition. A domain is called simply connected if any closed curve inside
the domain can be shrunk continuously to a point within the domain. A domain is
called multiply connected if some closed curves cannot be shrunk within the domain
to a point.
A torus is multiply connected. A ball is simply connected, so is a shell.
To see how a shell is simply connected, imagine a closed curve in a slab bounded by
two parallel infinite planes. The curve can be shrunk within the slab to a point. Then
imagine bending the slab so that a portion of it forms a portion of a shell.
Stokes’ Theorem applies to any contour L within a simply connected domain.
In particular, circulation of a vector field along any closed curve within a simply
connected domain is zero if the vector field and its curl are continuous and the curl
vanishes at every point in the domain.
1.5. Scalar fields
Examples of scalar fields are the pressure function p(r) and the temperature
function T (r) in a domain D.
A real function of r in a domain is called a scalar field.
1.5.1. Gradient.
Let h(x1, x2) denote the height of a mountain for (x1, x2) in a planar domain D.
The set
{(x1, x2) : h(x1, x2) = c}
for a given height c is called a level curve.
The gradient of h(x) is defined as
∇h(x1, x2) = (∂x1h, ∂x2h).
It is a vector. Its direction gives the direction for fastest change of h. It is normal
to the level curve. See Figure 1.5.1.
(Figure 1.5.1.)
Figure 1.5.1. Level curves and the gradient.
In three dimensions, the gradient of a function f(x1, x2, x3) is defined similarly:
∇f(x1, x2, x3) = (∂x1f, ∂x2f, ∂x3f).
It is perpendicular to the level surfaces
{(x1, x2, x3) : f(x1, x2, x3) = c},
where c stands for a real number. The direction of ∇f points in the direction of
fastest change of f. See Figure 1.5.2.
(Figure 1.5.2.)
Figure 1.5.2. Level surfaces f = constant and the gradient ∇f.
End of Lecture 4.
1.5. Scalar fields (continued)
1.5.1. Gradient (continued)
Since we would like “to learn the REAL meaning of mathematical terms and
how/when/why to use them”, quoted from the feedback of a student, let us consider
an example.
Example 1.5.1a. Find the tangent plane to the surface
4x1² + x2² + 25x3² = 30
at the point (1, 1, 1).
Solution. Let
φ(x1, x2, x3) = 4x1² + x2² + 25x3².
Then the set {(x1, x2, x3) : φ = constant} is a level surface. Our surface is one of
the level surfaces. The gradient
∇φ = (8x1, 2x2, 50x3)
is a normal to the level surfaces. A normal to our surface at the point (1, 1, 1) is
∇φ(1, 1, 1) = (8, 2, 50).
All points (x1, x2, x3) on the tangent plane satisfy
(x1 − 1, x2 − 1, x3 − 1) · ∇φ(1, 1, 1) = 0.
That is,
8x1 + 2x2 + 50x3 = 60.
See Figure 1.5.3.
(Figure 1.5.3.) Figure 1.5.3. Tangent plane.
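The tangent-plane computation of Example 1.5.1a translates into a short script; the helper name is ours.

```python
# Tangent plane of Example 1.5.1a via the gradient.
def grad_phi(x1, x2, x3):
    # gradient of phi = 4 x1^2 + x2^2 + 25 x3^2
    return (8 * x1, 2 * x2, 50 * x3)

n = grad_phi(1, 1, 1)
print(n)                   # (8, 2, 50), the normal at (1, 1, 1)
d = sum(ni for ni in n)    # n . (1, 1, 1) = 60
print(d)                   # 60, so the plane is 8 x1 + 2 x2 + 50 x3 = 60
```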
Summary: the gradient of a scalar field is normal to the level surfaces, and the scalar
increases most rapidly in the direction of the gradient among all possible directions.
The proof will be given shortly.
1.5.2. Directional derivative.
We sometimes need to find the rate of change of a scalar in a scalar field in a
given direction. For example, a NASA scientist wants to know the rate of change of
air density along a chosen path for the re-entry of a space ship.
Given a point with position vector r in a scalar field φ. Given also a unit vector
l. The rate of change of φ along l at the point r is defined as
dφ/dl = lim_{α→0} [φ(r + αl) − φ(r)]/α.
See Figure 1.5.4.
(Figure 1.5.4.)
Figure 1.5.4. Directional derivative.
It can be shown that
dφ/dl = ∇φ · l.   (1)
Mathematical proof of properties of gradient. Let l = (l1, l2, l3) and r =
(x1, x2, x3). Then
φ(r + αl) − φ(r) = φ(x1 + αl1, x2 + αl2, x3 + αl3) − φ(x1, x2 + αl2, x3 + αl3)
  + φ(x1, x2 + αl2, x3 + αl3) − φ(x1, x2, x3 + αl3)
  + φ(x1, x2, x3 + αl3) − φ(x1, x2, x3).   (2)
We see that
[φ(x1 + αl1, x2 + αl2, x3 + αl3) − φ(x1, x2 + αl2, x3 + αl3)]/α
  = {[φ(x1 + αl1, x2 + αl2, x3 + αl3) − φ(x1, x2 + αl2, x3 + αl3)]/(αl1)} l1   (3)
converges to
[∂φ(x1, x2 + αl2, x3 + αl3)/∂x1] l1 |_{α=0} = [∂φ(x1, x2, x3)/∂x1] l1   (4)
as α → 0. Similarly, we can find the limits for the other two differences in
(2). In summary, we find that
dφ/dl = (∂φ/∂x1) l1 + (∂φ/∂x2) l2 + (∂φ/∂x3) l3,
which is (1).
Now we can derive the two properties of the gradient from (1). The term
∇φ · l achieves its maximum, among all possible directions, when the angle between
∇φ and l is zero, i.e., l = ∇φ/|∇φ|. This is why the gradient gives the direction of most
rapid increase. Furthermore, the least change is zero change, achieved in directions
that are perpendicular to ∇φ, i.e., when the angle between ∇φ and l is 90 degrees. We
know intuitively that there is no change within a level surface. Thus the gradient is a
normal to the level surfaces.
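Formula (1) can be checked with a finite difference. A numerical sketch for a smooth scalar field of our own choosing; the names `phi`, `grad_phi`, and the direction l are illustrative.

```python
# Finite-difference check of dphi/dl = grad(phi) . l.
def phi(x):
    return x[0] ** 2 + 3 * x[1] * x[2]

def grad_phi(x):
    return (2 * x[0], 3 * x[2], 3 * x[1])

r = (1.0, 2.0, -1.0)
l = (2 / 3, 2 / 3, 1 / 3)      # a unit vector: (2, 2, 1)/3

alpha = 1e-6
fd = (phi([ri + alpha * li for ri, li in zip(r, l)]) - phi(r)) / alpha
exact = sum(g * li for g, li in zip(grad_phi(r), l))
print(abs(fd - exact) < 1e-4)   # True
```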
1.5.3. Coordinate-independent representation of gradient.
The representation is
∇φ(x) = lim_{V→0} (1/V) ∫∫_{∂V} n φ(y) dS_y,
where V is a domain that contains the point x and n is the unit exterior normal to
∂V. By abuse of notation, we also use V to denote the volume of the domain V.
Mathematical proof of the coordinate-independent representation.
Consider
A = Cφ,
where C is a constant vector. Let us apply Gauss’ Theorem:
∫∫∫_V div A dV = ∫∫_{∂V} A · n dS.
Notice that div A = C · ∇φ. We find
C · ∫∫∫_V ∇φ dV = C · ∫∫_{∂V} n φ dS.
Since C is arbitrary, we conclude that
∫∫∫_V ∇φ dV = ∫∫_{∂V} n φ dS.
Dividing the equation by the volume V and taking the limit V → 0, we have
∇φ(x) = lim_{V→0} (1/V) ∫∫_{∂V} n φ(y) dS_y.
1.6. Vector fields
1.6.1. Flux of a vector field.
Let S be a two-sided piecewise-smooth surface in a vector field A(r). Let n be
a unit normal to S. The flux of A through an element dS is
A · n dS.
The flux through S is
∫∫_S A · n dS.
See Figure 1.4.2 for an earlier definition of flux.
To figure out the real meaning of a flux, let us consider an example. Imagine a
highway with an observational gate, and you are watching the passing cars. If no car is
moving (a complete stall), you see no car passing through your gate; we say that the
flux is zero. Now suppose that all cars are moving at the same speed of 60 miles per
hour. Then in one minute the first car you saw at the beginning of the minute has
traveled one mile. If the density of cars on the highway is 1 car per mile, then you
have seen 1 car pass through the gate in one minute. If the density is 2 cars per
mile, then you have seen 2 cars pass. So both velocity and density play a role.
Now suppose that there are multiple lanes on the highway and the density is the number
of cars per mile per lane; then the number of cars passing depends on the number of
gates you watch. That is, the cross length of the observational line plays a role. See
Figure 1.6.1.
(Figure 1.6.1.)
Figure 1.6.1. Flux associated with cars (gate, slanted gate, cross section).
However, if the observational line is not perpendicular to the direction of motion,
then the actual length of the observational line does not play a role; rather, the projection
of the observational line onto the line perpendicular to the lines of motion plays a
role. This projection corresponds to the projection of the velocity vector field onto
the normal n of the surface S (the observational line). So in three dimensions, let v be
the velocity of the fluid particles, let S be a surface, and let ρ be the density of the fluid
particles; then the quantity
∫∫_S ρ v · n dS
is the total mass of particles that pass through S in unit time. Without the
density factor, it is called the flux.
1.6.2. The divergence of a vector field.
Recall
div A = ∂A1/∂x1 + ∂A2/∂x2 + ∂A3/∂x3 = (new notation) ∇ · A.
Recall Gauss’ Theorem:
∫∫∫_V div A dV = ∫∫_{∂V} A · n dS.
Dividing the equation by the volume V and taking the limit V → 0 so that V shrinks
to a point x, we find
div A(x) = lim_{V→0} (1/V) ∫∫_{∂V} A · n dS.
So we have found a coordinate-independent representation of the divergence. Using
the definition of flux, we see that the integral over ∂V is the total flux through the
surface ∂V. This total flux divided by the volume V is the flux per unit volume. In the
limit V → 0, the limiting value measures the flux production per unit volume at the
point x. This is the real meaning of the divergence. If it is positive, the point is called
a source. If it is negative, it is called a sink. If the divergence is zero in a domain,
then there is no source or sink, and the field is called divergence free.
Example 1.6.2a. Let
A(r) = q r / r³,
where r denotes the norm of r. It is an exercise to show that
div A = 0
at every point except r = 0. We are interested in finding the flux through the
unit sphere S centered at the origin. We know that the unit exterior normal to the
unit sphere is given by
n = r/r.
So we have
∫∫_S A · n dS = ∫∫_{r=1} q (r/r³) · (r/r) dS = q ∫∫_{r=1} dS = 4πq.   (5)
If q > 0, it is a source (fountain). If q < 0, it is a sink. By Gauss’ Theorem, we can
see that the flux through any surface enclosing the origin is 4πq. The flux
is zero if the surface does not enclose the origin. We also see that the flux is the same
no matter how small the surface is, as long as it encloses the origin. This vector field
is smooth everywhere away from the origin, and the origin is a point source/sink.
See Figure 1.6.2.
(Figure 1.6.2.)
Figure 1.6.2. Flux and source/sink.
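The 4πq flux in (5) can be reproduced by summing A · n dS over a latitude-longitude grid on the unit sphere. A numerical sketch; the grid construction is our own, and on the unit sphere A · n = q.

```python
import math

# Flux of A = q r / r^3 through the unit sphere on a latitude-longitude grid.
def flux_unit_sphere(q, n=200):
    total = 0.0
    dth = math.pi / n          # polar step
    dph = math.pi / n          # azimuthal step (2n points over 2*pi)
    for i in range(n):
        th = (i + 0.5) * dth
        for j in range(2 * n):
            dS = math.sin(th) * dth * dph   # area element on the unit sphere
            total += q * dS                 # A . n = q on the unit sphere
    return total

q = 3.0
print(abs(flux_unit_sphere(q) - 4 * math.pi * q) < 1e-2)   # True
```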
1.6.3. The curl of a vector field.
Recall we have introduced the curl of a vector in association with Stokes’ Theorem:
curl A = (∂A3/∂x2 − ∂A2/∂x3, ∂A1/∂x3 − ∂A3/∂x1, ∂A2/∂x1 − ∂A1/∂x2)
       =
| i1   i2   i3  |
| ∂x1  ∂x2  ∂x3 |
| A1   A2   A3  |.
We state without proof that the curl has a coordinate-independent representation:
curl A = lim_{V→0} (1/V) ∫∫_{∂V} n × A dS   (= ∇ × A).
We see the real meaning of the curl in the next example.
Example 1.6.3a. Consider a rigid body rotating about a fixed point O with
angular velocity ω. See Figure 1.6.3. The velocity of a point with position vector r
is given by
v = ω × r.
Figure 1.6.3. Curl is twice angular velocity.
Let us calculate the curl of v:
curl1 v = ∂x2 v3 − ∂x3 v2 = ∂x2(ω1x2 − ω2x1) − ∂x3(ω3x1 − ω1x3) = 2ω1   (6)
(ω is independent of r), and similarly
curl2 v = 2ω2,  curl3 v = 2ω3.   (7)
It follows that
curl v = 2ω.
That is, the curl of the velocity field of a rotating body equals twice the angular
velocity of the body.
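The result curl v = 2ω can be checked with a finite-difference curl. A numerical sketch; the helper names and the particular ω are our own choices.

```python
# Numerical curl of v = omega x r, which should equal 2*omega everywhere.
def cross(A, B):
    a1, a2, a3 = A
    b1, b2, b3 = B
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def curl_num(F, x, h=1e-6):
    """Central-difference curl of a vector field F at the point x."""
    def d(comp, axis):
        xp = list(x); xp[axis] += h
        xm = list(x); xm[axis] -= h
        return (F(xp)[comp] - F(xm)[comp]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

omega = (0.5, -1.0, 2.0)
v = lambda r: cross(omega, r)
print(curl_num(v, [0.3, 0.7, -0.2]))   # close to (1.0, -2.0, 4.0) = 2*omega
```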
End of Lecture 5.
1.7. Coordinate transformations.
We deal with coordinate transformations between rectangular coordinate sys-
tems, which play an important role in the definition of tensors.
Preliminary remark on tensors. Tensors are physical quantities that exist in-
dependent of coordinate systems. Scalar quantities are called zeroth-order tensors
(e.g., temperature); vectors are called first-order tensors (e.g., velocity); second-order
tensors can all be represented by 3× 3 matrices (e.g., the stress tensor), but not all
3 × 3 matrices are tensors. Tensors of order greater than 2 cannot be represented
by 3 × 3 matrices. An n-th-order tensor requires 3^n real numbers and is invariant
under change of coordinate systems. The requirement of invariance is natural,
since physical observables are invariant under changes of coordinate systems.
Suppose we have two orthonormal bases:
i1, i2, i3, and i′1, i′2, i′3,
and two origins O and O′, forming two rectangular coordinate systems K and K′. Let
a point M in space have the representations
r = x1 i1 + x2 i2 + x3 i3,
r′ = x′1 i′1 + x′2 i′2 + x′3 i′3.   (1)
Note the equations:
r = r′ + r′0,  r′0 = OO′ (the vector from O to O′),
r′ = r + r0,  r0 = O′O,   (2)
where r0 = −r′0. See Figure 1.7.1.
(Figure 1.7.1.)
Figure 1.7.1. Coordinate transformation.
Now we use the summation convention: a repeated index is automatically summed:
x_k i_k = Σ_{k=1}^{3} x_k i_k.
Thus, the equations in (2) can be written
x_k i_k = x′_k i′_k + x′_{0k} i_k,
x′_k i′_k = x_k i_k + x_{0k} i′_k,   (3)
where x′_{0k} are the coordinates of r′0 with respect to the old system K, etc. Take
the inner product with i_l or i′_l in equations (3) and note the Kronecker delta
i_k · i_l = δ_{kl} = { 0, k ≠ l; 1, k = l }.   (4)
We find
x_l = x′_k (i′_k · i_l) + x′_{0l},
x′_l = x_k (i_k · i′_l) + x_{0l}.   (5)
We introduce the new notation
i′_k · i_l = 1 · 1 · cos(i′_k, i_l) = α_{k′l}.   (6)
Thus
i_k · i′_l = i′_l · i_k = α_{l′k}.   (7)
Therefore
x_l = α_{k′l} x′_k + x′_{0l},
x′_l = α_{l′k} x_k + x_{0l}.   (8)
The first equation of (8) is the transformation from K′ to K, while the second
equation of (8) is the inverse transformation, from K to K′. Note that the index summed
in the second equation is the second index of α, while the index summed in the first
equation of (8) is the first index.
Properties of α_{l′k}.
We note that the Kronecker delta in system K′ is
δ′_{kl} = i′_k · i′_l = { 0, k ≠ l; 1, k = l }.   (9)
Note the expansions
i′_k = α_{k′l} i_l  and  i_k = α_{l′k} i′_l;
see Figure 1.7.2.
(Figure 1.7.2.)
Figure 1.7.2. Calculation of the coefficients.
The (α_{l′k}) are often given as a 3 × 3 matrix:
(α_{l′k}) =
| α_{1′1}  α_{1′2}  α_{1′3} |
| α_{2′1}  α_{2′2}  α_{2′3} |
| α_{3′1}  α_{3′2}  α_{3′3} |.
Note also
δ_{km} = i_k · i_m = α_{l′k} i′_l · α_{j′m} i′_j = α_{l′k} α_{l′m},
δ′_{km} = i′_k · i′_m = α_{k′l} i_l · α_{m′j} i_j = α_{k′l} α_{m′l}.
Thus we have the properties
α_{l′k} α_{l′m} = δ_{km},
α_{k′l} α_{m′l} = δ′_{km}.
These properties simply say that the columns of the matrix (α_{l′k}) are orthonormal,
and so are the rows. Thus the matrix (α_{l′k}) is an orthogonal matrix.
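The orthogonality properties can be checked for a concrete direction-cosine matrix. A sketch; the example rotation (by angle t about the x3-axis) is our own choice.

```python
import math

# Direction cosines for a rotation by angle t about the x3-axis.
def alpha(t):
    return [[math.cos(t),  math.sin(t), 0.0],
            [-math.sin(t), math.cos(t), 0.0],
            [0.0,          0.0,         1.0]]

def gram(a):
    """Entries alpha_{l'k} alpha_{l'm} (summed over l); should be delta_{km}."""
    return [[sum(a[l][k] * a[l][m] for l in range(3)) for m in range(3)]
            for k in range(3)]

p = gram(alpha(0.8))
print(all(abs(p[k][m] - (1.0 if k == m else 0.0)) < 1e-12
          for k in range(3) for m in range(3)))   # True: columns orthonormal
```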
1.8. Zeroth-Order Tensors (Scalars).
A scalar is a single function (i.e., one component) which is invariant under
changes of the coordinate systems. We deal with rectangular coordinate systems
only. Thus our tensors are also called cartesian tensors.
Let ϕ be a function of points in a domain in space. Think of ϕ as a physical
or geometrical quantity. This function exists independently of any coordinate system (e.g.,
temperature, density, or pressure). Suppose we have two rectangular coordinate
systems K and K′. In K we have the representation ϕ(x1, x2, x3) of the function, while
in K ′ we have the representation ϕ′(x′1, x′2, x′3) where xi and x′i are the coordinates
of one and the same point in K and K ′. If the function is a scalar, then
ϕ(x1, x2, x3) = ϕ′(x′1, x′2, x′3)
for all points in the domain.
Example 1.8.1a. We show that the distance between two points is a scalar.
Let A and B be two points. Let K and K ′ be two rectangular coordinate systems.
In these systems both A and B have coordinates:
A has coordinates x_i^A in K, and x′_i^A in K′;
B has coordinates x_i^B in K, and x′_i^B in K′.
Let
∆x_i = x_i^B − x_i^A,  ∆x′_i = x′_i^B − x′_i^A.
Let the transformation from K to K ′ be
x′i = αi′kxk + x0i.
Then
∆x′i = x′i^B − x′i^A = αi′k xk^B + x0i − αi′k xk^A − x0i = αi′k(xk^B − xk^A) = αi′k ∆xk.
Thus
∆x′i = αi′k∆xk. (1)
Recall the Pythagorean theorem for distance:
(∆s′)² = Σ_{i=1}^{3} (∆x′i)².
Then
(∆s′)² = αi′k∆xk αi′l∆xl = αi′kαi′l ∆xk∆xl = δkl ∆xk∆xl = Σ_{k=1}^{3} (∆xk)² = (∆s)².
Thus
∆s′ = ∆s.
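A numerical sketch of this example (assuming NumPy is installed; the rotation angle, translation x0, and the two points are arbitrary choices): transform the coordinates of A and B by x′i = αi′k xk + x0i and compare the two distances.

```python
import numpy as np

t = 0.7
alpha = np.array([[np.cos(t),  np.sin(t), 0.0],
                  [-np.sin(t), np.cos(t), 0.0],
                  [0.0,        0.0,       1.0]])
x0 = np.array([1.0, -2.0, 0.5])   # translation of the origin (arbitrary)

xA = np.array([1.0, 1.0, 1.0])
xB = np.array([0.0, -1.0, 2.0])
xA_new, xB_new = alpha @ xA + x0, alpha @ xB + x0

ds = np.linalg.norm(xB - xA)
ds_new = np.linalg.norm(xB_new - xA_new)
print(np.isclose(ds, ds_new))  # True: the distance is a scalar
```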
1.9. First-Order (Cartesian) Tensors (Vectors)
A first-order tensor is given by three components, and satisfies a certain trans-
formation law.
Think of point B as displaced from point A. Then the ∆xi are the components of
the displacement. We have seen that the displacement satisfies the law (1).
Definition. A vector (a.k.a. first-order tensor) A is a quantity uniquely specified
in any coordinate system by three real numbers (called the components of the vector)
which transform under changes of the coordinate system according to the law
A′i = αi′kAk (2)
where Ak, A′i are the components of the vector in the old and new coordinate systems
K and K ′ respectively, and αi′k is the cosine of the angle between the i-th axis of
K ′ and the k-th axis of K.
We remark that it is obvious that the zero vector (0, 0, 0) is represented the same
way in any coordinate system. Furthermore, this definition of vector is equivalent
to the definition of a vector as a directed line segment. Lastly, we can use formula
(2) to calculate the components of the representation of a vector in K ′ from the
components of the representation in K.
Example 1.9a. A moving particle P has position coordinate xi(t) in a coordi-
nate system K. The displacement
xi(t+ ∆t)− xi(t)
satisfies the law
x′i(t+ ∆t)− x′i(t) = αi′k(xk(t+ ∆t)− xk(t)) (3)
by (1). Thus it determines a vector. We divide (3) by ∆t (a scalar) to find
(x′i(t + ∆t) − x′i(t))/∆t = αi′k (xk(t + ∆t) − xk(t))/∆t.
Taking the limit ∆t → 0 and using the definition of velocity
vi(t) = lim_{∆t→0} (xi(t + ∆t) − xi(t))/∆t,
we find that
v′i(t) = αi′kvk(t).
So the velocity is a vector. Similarly the acceleration
ak(t) = dvk(t)/dt
is a vector. Multiplying the acceleration by the scalar mass m, and using Newton’s
second law, we find that the force
F = ma
is a vector.
1.10. Second-Order Tensors
Definition. A second-order tensor is a quantity uniquely specified by 9 real
numbers (called the components of the tensor) which transform under changes of
the coordinate system according to the law
A′ik = αi′lαk′mAlm (4)
where Alm, A′ik are the components of the tensor in the old and new coordinate
systems K and K ′ respectively, and αi′k is the cosine of the angle between the i-th
axis of K ′ and the k-th axis of K.
Remarks. 1. We can use the transformation law to determine the coordinates
of A from one system to another.
2. The zero tensor has zero coordinates in any coordinate system.
3. The components of a second-order tensor are often written as a matrix:
A = (Aik) =
A11 A12 A13
A21 A22 A23
A31 A32 A33
.
It can be regarded as a representation of a tensor with respect to a coordinate
system.
4. Tensors of order higher than 2 cannot be represented by matrices.
Example 1.10a. Given two vectors A and B. There are nine products of a
component of A with a component of B:
AiBk (i, k = 1, 2, 3).
Suppose we transform to a new coordinate system K ′, in which A and B have
components A′i and B′k. Then
A′i = αi′lAl, B′k = αk′mBm
and hence
A′iB′k = αi′lαk′mAlBm.
This shows that AiBk is a second-order tensor. It is often denoted as A⊗B.
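A numerical sketch of this example (assuming NumPy is installed; the rotation and the vectors are arbitrary choices): form the outer product in K and in K′ and check the transformation law (4).

```python
import numpy as np

t = 0.4
alpha = np.array([[np.cos(t),  np.sin(t), 0.0],
                  [-np.sin(t), np.cos(t), 0.0],
                  [0.0,        0.0,       1.0]])

A = np.array([1.0, 1.0, 1.0])
B = np.array([0.0, -1.0, 2.0])

T = np.outer(A, B)                # T_ik = A_i B_k, components in K
A_new, B_new = alpha @ A, alpha @ B
T_new = np.outer(A_new, B_new)    # components in K'

# Transformation law (4): T'_ik = alpha_{i'l} alpha_{k'm} T_lm
law = np.einsum('il,km,lm->ik', alpha, alpha, T)
print(np.allclose(T_new, law))  # True
```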
More examples are in the text book.
1.10.1. The Stress Tensor.
Consider an elastic medium, such as rubber. If we use a surface to separate the
medium into two parts, we will encounter a force acting between them. The total force
divided by the total surface area is called the stress (vector). The stress depends on the location
in the medium and the normal direction of the surface. It is possible that we can
factor out the direction part of the stress vector to form a quantity called the stress
tensor so that the stress vector depends bilinearly on the tensor and the direction
of the surface.
Take a rectangular coordinate system K. Take an arbitrary point M in the
elastic medium. Take a tetrahedron with M being one vertex, so that the three
faces passing through M are parallel to the coordinate surfaces, see Figure 1.10.1.
(Figure 1.10.1. Tetrahedron at M with stress vectors p1, p2, p3, pn, exterior normal n, and coordinate axes x1, x2, x3.)
Let n be the exterior normal to the slant surface, with area dσn. Let pn be the
stress (force/unit area) onto the tetrahedron through the slant surface. Let pi be the
stress onto the exterior of the tetrahedron through the surface that is perpendicular
to the i-th axis. Let a be the acceleration of the tetrahedron and f be the body
force per unit mass. By Newton’s second law, we have
adm = fdm + pndσn − pidσi.
Letting dm go to zero, and noting that the volume goes to zero faster than the
corresponding surface areas, we find
pndσn = pidσi.
Note the area formula
dσi = dσn cos(n, ii) = nidσn.
We find that
pn = pini.
Projecting to ik we find
pnk = pikni.
Definition. The stress tensor is (pik). Normal stresses are pii. Tangential
(shearing) stresses are pij (i ≠ j).
Note that n is arbitrary since the tetrahedron is not necessarily regular. We
have
pn = pnkik = pikniik,
which determines the stress (vector) on all surfaces.
Note now that (pik) depends only on M , not n.
Real Meaning of Stress Tensor. Once n is specified, the stress tensor (pik)
and n give the stress
pn = pikniik ( force/unit area ).
Once the area is specified as dσn, the force is
pndσn.
A second-order (stress) tensor takes a vector (unit normal) to a (stress) vector.
It only remains to verify that pik is indeed a second-order tensor. For mathemat-
ical rigor as well as the whole point of the concept of tensor, we should verify that
the stress satisfies the law of coordinate transformation. However, it is a technical
point, which I choose to skip in class.
Additional readings. When we use the fact that the tetrahedron is rotation free
(See reference book by Young), we can deduce that
pik = pki.
Thus the matrix (pik) is symmetric.
In hydrodynamics, it is customary to write
pik = −p δik + τik
where the scalar p is called the hydrodynamic pressure and τik is the viscous stress
tensor. A Newtonian fluid is one for which the linear relation
τik = ηiklm vlm
holds, where
vlm = (1/2)(∂vl/∂xm + ∂vm/∂xl)
is the rate of deformation tensor. For an isotropic fluid, there holds
τik = 2µ vik + µ′ δik vll
where µ and µ′ are called viscosity coefficients.
Verification of the tensor character of the stress. Since the definition of (pik)
involves no restriction on the normal n, we can take n to be the i-th base vector of
the new coordinate system K′, so that
n = i′i
(K and K′ have orthonormal bases i1, i2, i3 and i′1, i′2, i′3, respectively). Then projecting
n onto the l-th axis of K gives
nl = n · il = i′i · il = αi′l,
where αi′l is the cosine of the angle between the i-th axis of K′ and the l-th axis of
K, and hence
pn ≡ p′i = pl nl = αi′l pl = αi′l plm im.
Finally, projecting p′i onto the k-th axis of K′, we obtain
p′i · i′k = αi′l(im · i′k)plm
or
p′ik = αi′lαk′mplm.
By definition, we find that (pik) transforms like a second-order tensor.
1.10.2. The moment of inertia tensor. Consider a rigid body system of n
particles with coordinates (x1^(j), x2^(j), x3^(j)), j = 1, 2, · · · , n, and masses mj in a coordi-
nate system K with origin O. The quantities
Iik = Σ_{j=1}^{n} mj (δik xl^(j) xl^(j) − xi^(j) xk^(j))
are called the moment of inertia tensor (about the origin O). It is a second-order
tensor. It is used in physics in
ωkIik = Li
where
L = Σ_{j=1}^{n} mj (r^(j) × v^(j))
is the angular momentum and ω is the angular velocity:
v^(j) = ω × r^(j).
1.10.3. The Deformation Tensor.
Let u(r) be the displacement of a point with position vector r. Then the quan-
tities
uik = (1/2)(∂ui/∂xk + ∂uk/∂xi)
form a second-order tensor, called the deformation tensor.
1.10.4. The Rate of Deformation Tensor.
Let v(M) be the velocity at a point M of a moving fluid. Then the quantities
vik = (1/2)(∂vi/∂xk + ∂vk/∂xi)
form a second-order tensor, called the rate of deformation tensor.
1.11. High-Order Tensors.
By a tensor of order n is meant a quantity uniquely specified by 3n real numbers
(the components of the tensor) which transform under changes of coordinate systems
according to the law
A′i1i2···in = αi′1k1 αi′2k2 · · · αi′nkn Ak1k2···kn
where Ak1k2···kn, A′i1i2···in are the components of the tensor in the old and new
coordinate systems K and K ′ respectively, and αi′k is the cosine of the angle between
the i-th axis of K ′ and the k-th axis of K.
Example 1.11a. If A, B, and C are three vectors, then the 3³ = 27 quantities
Dijk = AiBjCk
form a tensor of order 3. The proof is omitted, but see an exercise.
Example 1.11b. Suppose one second-order tensor Aik is a linear function of
another second-order tensor Bik, such that
Aik = λiklmBlm,
then λiklm form a fourth-order tensor. Proof is omitted.
1.12. Tensor Algebra.
1.12.1. Addition. We can add any two tensors of the same order, the sum is
a tensor of the same order, whose components are the sums of the corresponding
components of the two tensors. For example, tensor Aik and tensor Bik can be added
to give a tensor Cik:
Cik = Aik +Bik.
1.12.2. Multiplication. We can multiply any number of tensors of arbitrary
orders. The product of two tensors, for example, is a tensor whose order is the sum of
the orders of the two tensors, and whose components are products of a component of
one tensor with any component of the other tensor. The product of two second-order
tensors Aik with Blm, for example, is a fourth-order tensor Ciklm with components
Ciklm = AikBlm.
Our product of tensors is also called outer product.
1.12.3. Contraction of Tensors.
Summing a tensor of order n (n ≥ 2) over two of its indices is called contraction.
For example, summing over the first and second indices of a third-order tensor
Aiik = A11k +A22k +A33k
gives a vector. This is called contraction in the first and second indices. Contraction
in both indices of a second-order tensor Bij gives a scalar
Bii = B11 +B22 +B33.
Another example: Aiki, the contraction in the first and third indices, gives another vector.
Contraction can be done many times.
Inner product. Multiplying two or more tensors and then contracting the
product with respect to indices belonging to different factors is often called an inner
product of the given tensors. For example, AikBk, AiBi, and λiklmBlm are all inner
products. But AiiBk is not an inner product.
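Contraction and inner products can be sketched numerically (assuming NumPy is installed; the particular components are arbitrary choices):

```python
import numpy as np

A = np.arange(9.0).reshape(3, 3)   # a second-order tensor, components in some K
B = np.array([1.0, -1.0, 2.0])     # a vector

# Contraction of A in both indices gives a scalar: A_ii (the trace).
print(A.trace())    # 0 + 4 + 8 = 12.0

# Inner product A_ik B_k: multiply the tensors, then contract over
# indices belonging to different factors; this gives a vector.
print(np.einsum('ik,k->i', A, B))  # same as A @ B
```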
1.13. Symmetry Properties of Tensors.
A tensor Sikl··· ( of order 2 or higher) is said to be symmetric in the first and
second indices (say) if
Sikl··· = Skil···.
It is antisymmetric in the first and second indices (say) if
Sikl··· = −Skil···.
Antisymmetric tensors are also called skewsymmetric or alternating tensors. The
Kronecker δik is a symmetric second-order tensor since
δik = ii · ik = ik · ii = δki.
The stress tensor pik is symmetric. But the tensor
Cik = AiBk −AkBi
is antisymmetric. It can be shown easily that an antisymmetric second-order tensor
has a matrix of the form
(Cik) =
0 C12 C13
−C12 0 C23
−C13 −C23 0
.
That is, Cik = 0 for i = k for an antisymmetric tensor.
We note that any second-order tensor Tik can be represented as a sum of a
symmetric tensor and an antisymmetric tensor:
Tik = Sik + Aik
where
Sik = (1/2)(Tik + Tki),  Aik = (1/2)(Tik − Tki).
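A numerical sketch of this decomposition (assuming NumPy is installed; the matrix T is an arbitrary choice):

```python
import numpy as np

T = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])

S = 0.5 * (T + T.T)    # symmetric part      S_ik = (T_ik + T_ki)/2
A = 0.5 * (T - T.T)    # antisymmetric part  A_ik = (T_ik - T_ki)/2

print(np.allclose(T, S + A))   # True: T = S + A
print(np.allclose(S, S.T))     # True: S is symmetric
print(np.allclose(A, -A.T))    # True: A is antisymmetric (zero diagonal)
```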
1.14. Pseudotensors.
Given a coordinate system K with the basis vectors ii, (i = 1, 2, 3). Let us
consider the quantities:
εjkl = (ij × ik) · il.
We have been assuming that our coordinate system K is always right-handed, i.e.,
the thumb of the right hand points to the direction of i3 if we position our right
hand so that our four fingers can rotate from i1 to i2. In this case, we can calculate
to find that
εjkl = { 1, if j, k, l is a cyclic permutation of 1, 2, 3;
         −1, if j, k, l is a cyclic permutation of 2, 1, 3;
         0, otherwise. }
More precisely, we have
ε123 = ε231 = ε312 = 1
ε213 = ε132 = ε321 = −1,
and all others with repeated indices ε111 = ε112 = · · · are zero. We verify for example
that
ε123 = (i1 × i2) · i3 = i3 · i3 = 1,
and
ε213 = (i2 × i1) · i3 = −i3 · i3 = −1,
and
ε113 = (i1 × i1) · i3 = 0 · i3 = 0.
Under orthogonal coordinate transformations from this K to another right-
handed system K′, we can show that εjkl transforms like a third-order tensor.
But occasionally we need to use left-handed coordinate systems. In this case the
thumb of the left hand points to the direction of i3 if we position our left hand so
that our four fingers can rotate from i1 to i2. In a left handed coordinate system
the vector product of A × B is defined by the left hand rule; i.e., the direction of
A ×B has the direction so that the three vectors A,B, and A ×B follow the left
hand rule. For either handedness, the rule of the direction of the product A × B
is such that the three vectors A,B, and A × B have the same handedness as the
coordinate system. This way all the formulas for the vector product hold for both kinds
of coordinate systems. In particular, the formula
A×B =
| i1 i2 i3 |
| a1 a2 a3 |
| b1 b2 b3 |
is valid in both kinds of coordinate system.
A coordinate system transformation may change the handedness. We have al-
lowed for these transformations in our definition of tensors of all orders.
However there are tensor-like quantities that change slightly differently from the
laws of tensors. For example, let us calculate the changes in εjkl. Let K ′ be a
coordinate system with the basis vectors i′1 = i1, i′2 = i2, i′3 = −i3 and the same
origin. By definition we have
ε′123 = (i′1 × i′2) · i′3.
Note that K′ is now left-handed. So the way to figure out the vector product i′1 × i′2
is to use the left-hand rule, and we find that
i′1 × i′2 = i′3.
Thus
ε′123 = 1.
Now let us calculate the term
α1′lα2′mα3′nεlmn
which would be equal to ε′123 if εjkl were a third-order tensor. Note that the coordi-
nate transformation coefficients are
(αi′l) =
1 0 0
0 1 0
0 0 −1
.
Thus
α1′lα2′mα3′nεlmn = α1′1α2′2α3′3ε123 = −1.
We can do all the calculations to verify that there actually hold
ε′ijk = −αi′lαj′mαk′nεlmn.
So εjkl is not a third-order tensor. This leads to the concept and definition of
pseudotensors.
Definition of pseudotensors. A pseudotensor of order n has 3n components
Ak1k2···kn that transform under changes of coordinate system according to the law
A′i1i2···in = ∆ αi′1k1 αi′2k2 · · · αi′nkn Ak1k2···kn
where Ak1k2···kn, A′i1i2···in are the components of the pseudotensor in the old and new
coordinate systems K and K ′ respectively, αi′k is the cosine of the angle between the
i-th axis of K ′ and the k-th axis of K, ∆ is 1 if K and K ′ have the same handedness,
and ∆ is −1 if K and K ′ have different handedness.
Note that a change of coordinate system is called a proper transformation if it
preserves the handedness. It is called an improper transformation if it reverses the
handedness. Pseudotensors are also called tensor densities.
We can verify that the permutation tensor εjkl is a third-order pseudotensor. It
is called the unit pseudotensor of order three. Since it appears in many physical and
geometrical situations, it also has the name Levi-Civita tensor density. It sometimes
is denoted as δjkl, a reminder that it is a generalization of the Kronecker δjk.
We note further that the permutation tensor εjkl is antisymmetric in any pair of
indices:
εjkl = −εkjl; εjkl = −εjlk; εjkl = −εlkj.
With two swaps, we have
εjkl = −εkjl = εklj, etc.
Because of this, it is often called the alternating (pseudo-)tensor of third order.
Lastly we note that the vector product A×B, which does not transform as an
ordinary vector, has a pseudotensor representation:
(A×B)i = εijkAjBk.
That is, pseudotensors can be multiplied to yield pseudotensors. Higher-order pseu-
dotensors can be contracted to form pseudotensors. In the current situation, the
outer product of εjkl with ordinary vectors A and B yields a pseudotensor of order
5. When contracted twice, the result is a pseudotensor of order 1.
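The representation (A×B)i = εijk Aj Bk can be sketched numerically (assuming NumPy is installed; zero-based indices 0, 1, 2 stand for 1, 2, 3, and the vectors are arbitrary choices):

```python
import numpy as np

# Build epsilon_{ijk} as a signed permutation symbol.
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0    # cyclic permutations of 1,2,3
    eps[i, k, j] = -1.0   # one swap reverses the sign

A = np.array([1.0, 1.0, 1.0])
B = np.array([0.0, -1.0, 2.0])

# (A x B)_i = eps_{ijk} A_j B_k: outer product, contracted twice.
cross = np.einsum('ijk,j,k->i', eps, A, B)
print(np.allclose(cross, np.cross(A, B)))  # True
```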
An ordinary first-order tensor is called a polar vector. Polar vectors transform
under both types of changes of coordinate systems without the factor ∆. A first-order
pseudotensor is called an axial vector. It is called axial because it has something to
do with the axis of rotation associated in the product v = ω × r.
1.15. Curvilinear Coordinate Systems.
We need curvilinear coordinate systems in applications. The spherical coordinate
system is an example of curvilinear coordinate systems.
Let u1, u2, u3 denote new coordinates and suppose that they are related to the
cartesian coordinates x1, x2, x3 by the equations
ui = φi(x1, x2, x3) (i = 1, 2, 3). (1)
Assume that φi have continuous first-order derivatives in a domain D of the x-space
and there holds the condition
∂(u1, u2, u3)/∂(x1, x2, x3) =
| ∂φ1/∂x1  ∂φ1/∂x2  ∂φ1/∂x3 |
| ∂φ2/∂x1  ∂φ2/∂x2  ∂φ2/∂x3 |
| ∂φ3/∂x1  ∂φ3/∂x2  ∂φ3/∂x3 |  ≠ 0   (2)
in the domain. This determinant is called the Jacobian of the transformation from
x to u.
The nonvanishing condition (2) ensures that it is possible to determine (x1, x2, x3)
in terms of the coordinates (u1, u2, u3); i.e., there exist functions fi(u1, u2, u3) (i =
1, 2, 3) such that
xi = fi(u1, u2, u3) (3)
where fi are defined in D determined from (1). Moreover, fi have continuous first-
order derivatives for which
∂(x1, x2, x3)/∂(u1, u2, u3) ≠ 0
in D. The functions (f1, f2, f3) define the inverse transformation of (1). It is im-
portant to note that the Jacobians satisfy the relation
[∂(u1, u2, u3)/∂(x1, x2, x3)] · [∂(x1, x2, x3)/∂(u1, u2, u3)] = 1.
Now let P be any point in D with coordinate (x1, x2, x3) and let the numbers
u1, u2, u3 be determined by (1). We call the ordered triple of numbers (u1, u2, u3) the
curvilinear coordinates of the point P . The equations in (1) are called the coordinate
transformation, and they are said to define a curvilinear coordinate system in D. It
follows that the Jacobian of a coordinate transformation is the reciprocal of the
Jacobian of its inverse.
Example 1.15a. Consider the transformation from the rectangular cartesian
coordinates (x, y) on a plane to the polar coordinates (r, θ) defined by
r = √(x² + y²),  θ = arccos( x/√(x² + y²) )  ( or = arcsin( y/√(x² + y²) ) )
where arccos is chosen such that a unique θ in 0 ≤ θ < 2π exists so that cos θ =
x/√(x² + y²) and sin θ = y/√(x² + y²). The domain D is all points except the origin.
The Jacobian is
∂(r, θ)/∂(x, y) =
| x/√(x²+y²)   y/√(x²+y²) |
| −y/(x²+y²)   x/(x²+y²)  |
= 1/r.
In the above calculation we find the partial derivative ∂r/∂x as follows: From
r2 = x2 + y2
we find
2rrx = 2x.
Dividing by 2r we find rx = x/r. We can use cos θ = x/r to find the derivative θx,
etc. Thus the inverse exists except at the origin; the inverse is
x = r cos θ, y = r sin θ.
Note that the inverse is defined for all (r, θ). We can calculate
∂(x, y)/∂(r, θ) =
| cos θ   −r sin θ |
| sin θ    r cos θ |
= r.
It is clear that the product of the two Jacobians is 1.
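This example can be sketched symbolically (assuming SymPy is installed): compute the Jacobian of the inverse transformation and check that its reciprocal is 1/r.

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
x = r * sp.cos(th)
y = r * sp.sin(th)

# Jacobian of the inverse transformation, d(x, y)/d(r, theta):
J_inv = sp.Matrix([[sp.diff(x, r), sp.diff(x, th)],
                   [sp.diff(y, r), sp.diff(y, th)]]).det()
print(sp.simplify(J_inv))      # r

# The Jacobian d(r, theta)/d(x, y) is its reciprocal:
print(sp.simplify(1 / J_inv))  # 1/r
```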
1.15.1. Coordinate surfaces, coordinate curves, and local basis.
Let P0 = (x1^0, x2^0, x3^0) be a point in D with coordinates (u1^0, u2^0, u3^0) in the curvi-
linear coordinate system. We call the surface
φi(x1, x2, x3) = ui^0
the i-th coordinate surface passing through P0 (i = 1, 2, 3). The intersection of two
coordinate surfaces, say,
φ1(x1, x2, x3) = u1^0,  φ2(x1, x2, x3) = u2^0
is called the u3-coordinate curve. See Figure 1.15.1.
(Figure 1.15.1. Coordinate surfaces φi = ui^0, coordinate curves, and local basis at the point P.)
We next derive a basis at the point P0. The position vector of an arbitrary point
P is
R(u1, u2, u3) = xiii = fi(u1, u2, u3)ii.
If we set u2 = u2^0, u3 = u3^0, then the resulting vector function R(u1, u2^0, u3^0) represents
the u1-curve. On this curve u1 is the parameter. It follows that the derivative
∂R/∂u1
represents the tangent vector to this curve. Likewise, we have
∂R/∂u2, ∂R/∂u3
representing the tangent vectors to the u2- and u3-curves respectively.
Since the determinant of a matrix is the same as the determinant of its transpose,
it follows from the definition of Jacobian and that of the scalar triple product that
∂(x1, x2, x3)/∂(u1, u2, u3) = ∂R/∂u1 · (∂R/∂u2 × ∂R/∂u3). (4)
(See homework problem 1.) Hence at each point where (4) is not zero, the three
tangent vectors
(∂R/∂u1, ∂R/∂u2, ∂R/∂u3) (5)
are linearly independent and thus form a basis.
Every vector or vector field at each point can then be represented in terms of
this basis (5). Unlike the unit vectors (i1, i2, i3), however, this new basis varies from
point to point in space. For this reason, we call (5) a local basis.
1.15.2. Arclength and orthogonal curvilinear coordinate systems.
We assume that the three vectors
(∂R/∂u1, ∂R/∂u2, ∂R/∂u3)
form a right-handed basis; i.e., the vector product ∂R/∂u1 × ∂R/∂u2 has positive inner
product with ∂R/∂u3. In this case, the Jacobian ∂(x1, x2, x3)/∂(u1, u2, u3) is positive.
We derive arclength formula in curvilinear coordinate systems. Consider the
position vector
R = xiii = fi(u1, u2, u3)ii.
We have
(ds)² = dR · dR = (Σ_{i=1}^{3} ∂R/∂ui dui) · (Σ_{j=1}^{3} ∂R/∂uj duj)
= (∂R/∂ui · ∂R/∂uj) dui duj = gij dui duj   (1)
where we have introduced
gij = ∂R/∂ui · ∂R/∂uj
which is called the metric tensor.
From here one can pursue the study of general metric tensors, which are used
for example in general relativity. For us, we choose to be more specific. We say
that the curvilinear coordinate system is orthogonal curvilinear if the triple vectors
∂R/∂ui(i = 1, 2, 3) are mutually orthogonal. For orthogonal curvilinear coordinate
systems, the directions and magnitudes of ∂R/∂ui(i = 1, 2, 3) can still vary. Let us
define
hi = |∂R/∂ui| (i = 1, 2, 3).
Then we have
gij = { hi², i = j; 0, i ≠ j. }
Example 1.15b. The transformation relating the cylindrical coordinates (r, θ, z)
to the rectangular cartesian coordinates (x, y, z) is defined by the equations
r = √(x² + y²),
θ = arccos( x/√(x²+y²) )  ( or arcsin( y/√(x²+y²) ) ),
z = z.
It is defined for all (x, y, z) except on the z-axis (where x = y = 0).
We find
∂(r, θ, z)/∂(x, y, z) =
| ∂r/∂x  ∂r/∂y  0 |
| ∂θ/∂x  ∂θ/∂y  0 |
|   0      0    1 |
= 1/r.
The inverse is
x = r cos θ, y = r sin θ, z = z
which is valid for all (r, θ, z).
Let us identify the coordinate surfaces and coordinate curves. Refer to Figure
1.15.2.
(Figure 1.15.2. Cylindrical coordinate system: point P0 with coordinates r, θ, z.)
Coordinate surfaces: The coordinate surface r = r0 is the surface of a cylinder
passing through a point P0, and extending to infinity in both the positive and
negative directions of z-axis. The coordinate surface θ = θ0 is a half plane starting
at the z-axis and extending to infinity. The coordinate surface z = z0 is the plane
passing through P0 and perpendicular to the z-axis.
2
Coordinate curves: The r-coordinate curve is a ray starting on the z-axis, passing
through the point P0, and parallel to the xy-plane. The θ-coordinate curve is a circle
passing through the point P0, and parallel to the xy-plane. The z-coordinate curve
is a straight line parallel to the z-axis.
We have the position vector
R(r, θ, z) = r cos θ i1 + r sin θ i2 + z i3.
Tangent vectors to the coordinate curves are
∂R/∂r = cos θ i1 + sin θ i2
∂R/∂θ = −r sin θ i1 + r cos θ i2
∂R/∂z = i3.
Let u1 = r, u2 = θ, u3 = z; then the three tangent vectors form a right-handed
orthogonal curvilinear coordinate system. We find that gij = 0 for i ≠ j, and
h1 = 1, h2 = r, h3 = 1.
Thus the distance formula is
(ds)2 = (dr)2 + (rdθ)2 + (dz)2.
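This example can be sketched symbolically (assuming SymPy is installed): build the position vector, take the tangent vectors, and read off the metric tensor.

```python
import sympy as sp

r, th, z = sp.symbols('r theta z', positive=True)
R = sp.Matrix([r*sp.cos(th), r*sp.sin(th), z])   # position vector

# Metric tensor g_ij = dR/du_i . dR/du_j for (u1, u2, u3) = (r, theta, z):
tangents = [R.diff(v) for v in (r, th, z)]
g = sp.Matrix(3, 3, lambda i, j: sp.simplify(tangents[i].dot(tangents[j])))
print(g)   # diagonal with entries 1, r**2, 1, so h1 = 1, h2 = r, h3 = 1
```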
Example 1.15c. The spherical coordinates u1 = r, u2 = φ, u3 = θ are defined
by
r = √(x² + y² + z²),
φ = arccos( z/√(x² + y² + z²) ),
θ = arccos( x/√(x² + y²) ).
(r ≥ 0, 0 ≤ φ < π, 0 ≤ θ < 2π). Refer to Figure 1.15.3 for the variables.
(Figure 1.15.3. Spherical coordinate system: point P with coordinates r, φ, θ.)
We can calculate the Jacobian
∂(r, φ, θ)/∂(x, y, z) =
| x/r               y/r               z/r           |
| xz/(r²√(x²+y²))   yz/(r²√(x²+y²))   −√(x²+y²)/r²  |
| −y/(x²+y²)        x/(x²+y²)         0             |
= 1/(r² sinφ).
The inverse is
x = r sinφ cos θ,  y = r sinφ sin θ,  z = r cosφ.
The Jacobian is
∂(x, y, z)/∂(r, φ, θ) = r² sinφ.
The position vector is
R(r, φ, θ) = r sinφ cos θ i1 + r sinφ sin θ i2 + r cosφ i3.
The three tangent vectors are
∂R/∂r = sinφ cos θ i1 + sinφ sin θ i2 + cosφ i3
∂R/∂φ = r cosφ cos θ i1 + r cosφ sin θ i2 − r sinφ i3
∂R/∂θ = −r sinφ sin θ i1 + r sinφ cos θ i2.
They are mutually orthogonal. We have
g11 = h1² = 1,  g22 = h2² = r²,  g33 = h3² = r² sin²φ.
The distance formula is
(ds)2 = (dr)2 + (rdφ)2 + (r sinφdθ)2.
For the volume element, we have
dV = dr · rdφ · r sinφdθ = r2 sinφdr dφ dθ.
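This spherical example can also be sketched symbolically (assuming SymPy is installed): compute the squared scale factors and the Jacobian of the inverse map.

```python
import sympy as sp

r, phi, th = sp.symbols('r phi theta', positive=True)
R = sp.Matrix([r*sp.sin(phi)*sp.cos(th),
               r*sp.sin(phi)*sp.sin(th),
               r*sp.cos(phi)])

# Squared scale factors h_i^2 from the tangent vectors dR/du_i:
tangents = [R.diff(v) for v in (r, phi, th)]
hsq = [sp.simplify(t.dot(t)) for t in tangents]
print(hsq)   # h1^2 = 1, h2^2 = r^2, h3^2 = r^2 sin^2(phi)

# Jacobian of the inverse map equals h1*h2*h3 = r^2 sin(phi):
J = sp.simplify(R.jacobian([r, phi, th]).det())
print(J)
```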
In the next lecture we calculate the grad, div, and curl in orthogonal curvilinear
coordinate systems.
1.16. Grad, div, and curl in orthogonal curvilinear coordinate systems.
In this section we derive the expressions of various vector concepts in an orthog-
onal curvilinear coordinate system.
Let (u1, u2, u3) be such a system:
ui = φi(x1, x2, x3). (i = 1, 2, 3).
Let
xi = fi(u1, u2, u3)
be the inverse transformation. We introduce the normalized coordinate tangent
vectors:
ui = (1/hi) ∂R/∂ui  (no summation), i = 1, 2, 3,
where hi = |∂R/∂ui|. Assume that (u1,u2,u3) is right-handed so that the Jacobian
is positive.
1.16.1. Gradient of a scalar field.
Let F (x1, x2, x3) be a scalar field in a rectangular system. We know that ∇F is
a vector, which can be represented as a linear combination of any basis. So let
∇F = F1u1 + F2u2 + F3u3.
We need to find (F1, F2, F3). We recall from Section 1.5.3(of Lecture 5) the coordinate-
independent formula
∇F(P0) = lim_{V→0} (1/V) ∫∫_{∂V} n F(y) dSy   (1)
where V is a domain that contains the point P0 and n is the unit exterior normal
to ∂V . By the way, we have also the formulas
∇ · F(P0) = lim_{V→0} (1/V) ∫∫_{∂V} n · F(y) dSy
for the divergence (∇·) of a vector field F, and
∇× F(P0) = lim_{V→0} (1/V) ∫∫_{∂V} n× F(y) dSy
for the curl (∇×) which we will use for the representations of div and curl. The three
formulas certainly have striking uniformity. Back to our gradient representation, we
take V to be an elementary “curvilinear parallelepiped” of volume
ds1 ds2 ds3 = h1h2h3 du1 du2 du3
with faces perpendicular to the coordinate curves, see Figure 1.16.1.
(Figure 1.16.1. Curvilinear parallelepiped at P0 with exterior normals n along the u1, u2, u3 directions.)
To calculate the surface integral (1), we first note that there are six sides. For
the side that passes through P0 and is perpendicular to the u1-axis, we have the
approximate value
−u1 F(P0) h2du2 h3du3,
where the surface area element is ds2 ds3 = h2du2 h3du3. The integral on the surface
that is parallel to the previous surface is approximately
u1 F (P0 + du1u1)h2du2 h3du3,
where P1 = P0 + du1u1 is the position of P0 with an increment du1 along the u1-
coordinate axis. Combining these two sides and note that the volume of the element
is
V = ds1 ds2 ds3 = h1h2h3 du1du2du3,
the average becomes
(−F(P0) + F(P0 + du1 u1)) h2h3 du2du3 / (h1h2h3 du1du2du3) · u1 −→ (1/h1) (∂F(P0)/∂u1) u1
as V → 0. Similarly we can calculate the other four sides. In summary, we find
∇F = (1/h1) ∂F/∂u1 u1 + (1/h2) ∂F/∂u2 u2 + (1/h3) ∂F/∂u3 u3.
Theorem. The del operator has the formula
∇ = u1 (1/h1) ∂/∂u1 + u2 (1/h2) ∂/∂u2 + u3 (1/h3) ∂/∂u3.
Example 1.16a Find the expression of ∇ in cylindrical coordinates.
Solution. Let u1 = r, u2 = θ, u3 = z. It is right-handed. We have h1 = 1, h2 =
r, h3 = 1. Also
u1 = cos θ i1 + sin θ i2, u2 = − sin θ i1 + cos θ i2, u3 = i3.
Thus
∇ = u1 ∂/∂r + u2 (1/r) ∂/∂θ + u3 ∂/∂z.
Example 1.16b. Find the gradient of f = xyz in the cylindrical coordinates.
Solution. We have f = r2z sin θ cos θ. Thus
∇f = u1 2rz sin θ cos θ + u2 rz(cos² θ − sin² θ) + u3 r² sin θ cos θ.
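We can check this worked solution numerically at one point (assuming NumPy is installed; the sample point is an arbitrary choice): the cylindrical-basis gradient should agree with the cartesian gradient (yz, xz, xy).

```python
import numpy as np

r, th, z = 2.0, 0.5, 3.0
x, y = r*np.cos(th), r*np.sin(th)

u1 = np.array([np.cos(th),  np.sin(th), 0.0])   # u_r
u2 = np.array([-np.sin(th), np.cos(th), 0.0])   # u_theta
u3 = np.array([0.0, 0.0, 1.0])                  # u_z

# Components of grad f from the worked solution (f = r^2 z sin(th) cos(th)):
grad_cyl = (2*r*z*np.sin(th)*np.cos(th)) * u1 \
         + (r*z*(np.cos(th)**2 - np.sin(th)**2)) * u2 \
         + (r**2*np.sin(th)*np.cos(th)) * u3

grad_cart = np.array([y*z, x*z, x*y])   # cartesian gradient of f = xyz
print(np.allclose(grad_cyl, grad_cart))  # True
```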
1.16.2. Divergence. We let
F = F1u1 + F2u2 + F3u3.
Then we can find, similar to the previous section, that
div F = (1/(h1h2h3)) [ ∂/∂u1 (F1h2h3) + ∂/∂u2 (F2h1h3) + ∂/∂u3 (F3h1h2) ].
Example 1.16c. Derive the formula for the Laplacian ∆ defined as ∆ = div ∇.
Solution. Consider an F = ∇f . We have
∆f = div ∇f = (1/(h1h2h3)) [ ∂/∂u1 (h2h3/h1 ∂f/∂u1) + ∂/∂u2 (h1h3/h2 ∂f/∂u2) + ∂/∂u3 (h1h2/h3 ∂f/∂u3) ].   (2)
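As a symbolic sketch of formula (2) in spherical coordinates (assuming SymPy is installed; the test functions r² and 1/r are arbitrary choices): the Laplacian of r² = x²+y²+z² should be 6, and 1/r should be harmonic away from the origin.

```python
import sympy as sp

r, phi, th = sp.symbols('r phi theta', positive=True)
h1, h2, h3 = 1, r, r*sp.sin(phi)   # spherical scale factors

def laplacian(f):
    """Formula (2) with (u1, u2, u3) = (r, phi, theta)."""
    H = h1*h2*h3
    return sp.simplify(
        (sp.diff(h2*h3/h1 * sp.diff(f, r),   r)
       + sp.diff(h1*h3/h2 * sp.diff(f, phi), phi)
       + sp.diff(h1*h2/h3 * sp.diff(f, th),  th)) / H)

print(laplacian(r**2))   # 6, matching the cartesian Laplacian of x^2+y^2+z^2
print(laplacian(1/r))    # 0: 1/r is harmonic for r != 0
```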
1.16.3. The curl.
Similarly, for
F = F1u1 + F2u2 + F3u3,
we can derive the formula
curl F = (1/(h1h2h3)) ·
| h1u1   h2u2   h3u3  |
| ∂/∂u1  ∂/∂u2  ∂/∂u3 |
| F1h1   F2h2   F3h3  |
Appendix: Useful expressions
I. In cylindrical coordinates
u1 = r, u2 = θ, u3 = z
h1 = 1, h2 = r, h3 = 1,
there hold
grad f = ∂f/∂r ur + (1/r) ∂f/∂θ uθ + ∂f/∂z uz,
div A = (1/r) ∂/∂r (rAr) + (1/r) ∂Aθ/∂θ + ∂Az/∂z,
curl A = ( (1/r) ∂Az/∂θ − ∂Aθ/∂z ) ur + ( ∂Ar/∂z − ∂Az/∂r ) uθ + (1/r)( ∂/∂r (rAθ) − ∂Ar/∂θ ) uz,
∆f = (1/r) ∂/∂r ( r ∂f/∂r ) + (1/r²) ∂²f/∂θ² + ∂²f/∂z²,
where
ur = cos θ i1 + sin θ i2, uθ = − sin θ i1 + cos θ i2, uz = i3
is the local orthonormal basis, and A has components Ar, Aθ, Az with respect to
this basis.
II. In spherical coordinates. See text book by Borisenko, p174.
Chapter II. Complex Variables
Dates: September 24, 26, 28.
These three lectures will cover the following sections of the text book by Keener.
§6.1. Complex valued functions and branch cuts;
§6.2.1. Differentiation and analytic functions, Cauchy-Riemann conditions;
§6.2.2. Integration;
§6.2.3. Cauchy integral formula;
§6.2.4. Taylor series expansion.
2.1. Complex valued functions.
1. Complex numbers. We introduce the imaginary number i, whose square is
−1:
i² = −1.
Complex numbers are in the form a+ ib where a and b are real numbers. Complex
numbers can be represented in the Argand diagram by the vector (a, b): (Figure to
be provided later). Addition and subtraction of two complex numbers are simple:
(a+ bi)± (c+ di) = (a± c) + (b± d)i.
Multiplication and division are as follows:
(a + bi)(c + di) = (ac − bd) + i(ad + bc),
(a + bi)/(c + di) = (a + bi)(c − di)/(c² + d²),   (1)
provided that c2 + d2 6= 0 for the division. From these one can calculate the power
(a+ bi)n when n is an integer.
2. Functions. Let z = a + bi. We call z a complex variable when we use z as a
variable. In general we let z = x+iy to be consistent with our habit of real variables.
Consider
f(z) = z².
It is called a complex valued function. Other complex valued functions are z³,
g(z) = (z + 1)/(z − 1);  h(z) = (az + b)/(cz + d)
where a, b, c, d are complex numbers. An important function is
f(z) = z̄ = x − iy
where the bar is called “complex conjugate.”
We introduce the exponential function
e^z = Σ_{n=0}^{∞} z^n/n!
for all complex z. Note that this definition is consistent with the real exponential.
We see this opens a new world. First we see
e^{iθ} = Σ_{n=0}^{∞} (iθ)^n/n! = Σ_{n=0}^{∞} (iθ)^{2n}/(2n)! + Σ_{n=0}^{∞} (iθ)^{2n+1}/(2n+1)!
= Σ_{n=0}^{∞} (−1)^n θ^{2n}/(2n)! + i Σ_{n=0}^{∞} (−1)^n θ^{2n+1}/(2n+1)!
= cos θ + i sin θ.   (2)
Multiplying it with any real number r, we find
reiθ = r cos θ + ir sin θ.
If (r, θ) is the polar representation of the point (a, b), then we find the polar
representation of complex numbers:
a+ bi = reiθ.
From this we have a special case:
eiπ + 1 = 0
which is called Euler’s identity. This identity is fun to watch since it involves the
simplest symbols of mathematics: 0, 1,+,=, i and two transcendental numbers e and
π. Another version is
e−iπ + 1 = 0
which involves additionally the minus − sign. In polar representation, multiplication
of complex numbers is extremely easy:
(a + bi)(c + di) = (r1 e^{iθ1})(r2 e^{iθ2}) = (r1r2) e^{i(θ1+θ2)}.
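A numerical sketch of this rule using the standard library (the moduli and arguments are arbitrary choices): moduli multiply and arguments add.

```python
import cmath, math

z1 = cmath.rect(2.0, 0.5)    # r1 = 2, theta1 = 0.5
z2 = cmath.rect(3.0, 1.1)    # r2 = 3, theta2 = 1.1
prod = z1 * z2

print(math.isclose(abs(prod), 2.0 * 3.0))          # True: |z1 z2| = r1 r2
print(math.isclose(cmath.phase(prod), 0.5 + 1.1))  # True: arguments add (1.6 < pi, no wrap)
```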
We have more examples:
e^z = e^{x+iy} = e^x e^{iy} = e^x (cos y + i sin y),
e^{z1+z2} = e^{z1} e^{z2}.   (3)
Inverse functions of z² and e^z
We know that √5 is a number 2.236... and satisfies the equation x² = 5. Another
solution to this equation is −√5. We can verify that both
z1 = √2/2 + i √2/2,  z2 = −√2/2 − i √2/2
satisfy z² = i. There are multiple solutions to the square root. We calculate a
solution
√z = √(r e^{iθ}) = √r e^{iθ/2}.
We can verify that this satisfies w² = z. When we restrict θ to be in [0, 2π), this
root is called the principal branch. We can see another solution is
√z = √(r e^{i(θ+2π)}) = √r e^{i(θ/2+π)}.
For the exponential function we define the inverse w = ln z as
ln z = w iff z = ew.
We note that if one value ln z works, then ln z + 2nπi works for every integer n:
e^{ln z + 2nπi} = e^{ln z} e^{2nπi} = e^{ln z} = z.
So there are multiple inverses. We can take one branch:
−π < Im (ln z) ≤ π.
This is called a principal branch. The line θ = π is called a branch cut. The origin is
called a branch point. Any continuous interval of 2π length is called a branch. The
principal branch for a function may vary from discipline to discipline.
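A sketch of the branch structure using the standard library (the point z = −1 + i is an arbitrary choice): `cmath.log` returns the principal branch −π < Im(ln z) ≤ π, and shifting by 2nπi gives the other logarithms.

```python
import cmath, math

z = -1 + 1j

# cmath.log returns the principal branch: -pi < Im(log z) <= pi.
w = cmath.log(z)
print(-math.pi < w.imag <= math.pi)   # True

# Every w + 2*pi*i*n is also a logarithm of z:
for n in (-2, -1, 1, 2):
    print(cmath.isclose(cmath.exp(w + 2j*math.pi*n), z))  # True each time
```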
2.2. Calculus of Complex Functions.
2.2.1. Differentiation.
We define the derivative f ′(z) of a complex valued function f(z) like the deriva-
tive of a real function:
f'(z) = \lim_{\xi \to z} \frac{f(\xi) - f(z)}{\xi - z},
where the limit is over all possible ways of approaching z. If the limit exists, the
function f is called differentiable and f ′(z) is the derivative.
Definition. If f ′(z) is continuous, then f is called analytic.
Continuity is like that for real functions of two variables.
Theorem 2.1 (Cauchy-Riemann conditions). The function f(z) = u(x, y) + iv(x, y) for z = x + iy is analytic in some region Ω if and only if \frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}, \frac{\partial v}{\partial x}, \frac{\partial v}{\partial y} exist, are continuous, and satisfy the equations

\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}.
Proof. (Skipped in class due to lecture on “memorizing formula/knowledge”).
Let f be continuously differentiable. Then take the special path along the x-axis:

\frac{f(z + \Delta x) - f(z)}{\Delta x} = \frac{u(x+\Delta x, y) + iv(x+\Delta x, y) - u(x, y) - iv(x, y)}{\Delta x} \longrightarrow \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}.   (4)

Then along the path parallel to the y-axis:

\frac{f(z + i\Delta y) - f(z)}{i\Delta y} \longrightarrow \frac{1}{i}\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y} = -i\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y}.   (5)

The two limits have to be the same by definition; equating real and imaginary parts, we have obtained the Cauchy-Riemann equations

\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}.
Conversely, suppose the Cauchy-Riemann conditions hold; i.e., the existence and
continuity of the partial derivatives and the equations of Cauchy-Riemann all hold.
Let z_0 = x_0 + iy_0. From the theory of real variables we have the expansions

u(x, y) = u(x_0, y_0) + \frac{\partial u}{\partial x}(x_0, y_0)\Delta x + \frac{\partial u}{\partial y}(x_0, y_0)\Delta y + R_1(\Delta x, \Delta y),
v(x, y) = v(x_0, y_0) + \frac{\partial v}{\partial x}(x_0, y_0)\Delta x + \frac{\partial v}{\partial y}(x_0, y_0)\Delta y + R_2(\Delta x, \Delta y),   (6)

where \Delta x = x - x_0, \Delta y = y - y_0, and

\lim_{\Delta x, \Delta y \to 0} \frac{R_i}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} = 0.

Now, using the Cauchy-Riemann equations, we have

f(z_0 + \Delta z) - f(z_0) = \frac{\partial u}{\partial x}(x_0, y_0)\Delta x + \frac{\partial u}{\partial y}(x_0, y_0)\Delta y + R_1 + i\Big[\frac{\partial v}{\partial x}(x_0, y_0)\Delta x + \frac{\partial v}{\partial y}(x_0, y_0)\Delta y + R_2\Big]
= \frac{\partial u}{\partial x}(\Delta x + i\Delta y) + i\frac{\partial v}{\partial x}(\Delta x + i\Delta y) + R_1 + iR_2.   (7)

So

\frac{f(z_0 + \Delta z) - f(z_0)}{\Delta z} = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x} + \frac{R_1 + iR_2}{\Delta z} \longrightarrow \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}.   (8)

This completes the proof.
We list some practical rules of differentiation:

f(z) = z^2 \longrightarrow f'(z) = 2z,
f(z) = z^k \longrightarrow f'(z) = kz^{k-1} \quad (k \text{ an integer}),
(e^z)' = e^z,
(f(z)g(z))' = f'(z)g(z) + f(z)g'(z),
[F(g(z))]' = F'(g(z))\, g'(z),
\Big(\frac{1}{f}\Big)' = -\frac{f'}{f^2}.   (9)
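The strength of complex differentiability is that the limit must agree over all approach directions. A Python sketch comparing real and imaginary approach directions for an analytic and a non-analytic function (step size and sample point arbitrary):

```python
def diff_quotient(f, z, h):
    """Difference quotient (f(z+h) - f(z)) / h for a complex step h."""
    return (f(z + h) - f(z)) / h

f = lambda z: z ** 2           # analytic, with f'(z) = 2z
z = 1.5 + 0.5j
h = 1e-6
along_x = diff_quotient(f, z, h)        # real step
along_y = diff_quotient(f, z, 1j * h)   # imaginary step
assert abs(along_x - 2 * z) < 1e-5
assert abs(along_y - 2 * z) < 1e-5

# By contrast, f(z) = conj(z) is NOT analytic: the two limits differ.
g = lambda z: z.conjugate()
assert abs(diff_quotient(g, z, h) - diff_quotient(g, z, 1j * h)) > 1.0
```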
2.2.2. Integration.
Integration in the complex plane is defined in terms of real line integrals of the complex function f = u + iv. If C is any (geometric) curve in the complex plane, we define the line integral

\int_C f(z)\,dz = \int_C (u + iv)(dx + i\,dy) = \int_C u(x, y)\,dx - v(x, y)\,dy + i\int_C v\,dx + u\,dy.
Example. See homework hints.
Theorem 2.2. If f(z) is analytic in a domain Ω, then

\int_C f(z)\,dz = 0

for any closed curve C whose interior lies entirely in Ω.
Note that “a curve C whose interior lies entirely in Ω” is a stronger requirement
than “a curve C which lies entirely in Ω”. The stronger requirement rules out the
situation that the relevant part of Ω is not simply connected.
Proof. Recall Green's Theorem

\int_{\partial\Omega} \varphi\,dx + \psi\,dy = \int_{\Omega} \Big(\frac{\partial\psi}{\partial x} - \frac{\partial\varphi}{\partial y}\Big)\,dx\,dy

for a simply connected domain Ω. We apply this formula to our complex integral to obtain

\int_C f(z)\,dz = \int_C (u + iv)(dx + i\,dy)
= \int_C u(x, y)\,dx - v(x, y)\,dy + i\int_C v\,dx + u\,dy
= \int_{\mathrm{int}\,C} \Big(\frac{\partial}{\partial x}(-v) - \frac{\partial}{\partial y}u\Big)\,dx\,dy + i\int_{\mathrm{int}\,C} \Big(\frac{\partial}{\partial x}u - \frac{\partial}{\partial y}v\Big)\,dx\,dy
= 0   (10)

by the Cauchy-Riemann equations, where int C denotes the interior of the contour C. This completes the proof.
Examples. 1. We have

\int_C z^n\,dz = 0

for any integer n and any contour C that does not enclose the origin. This follows from Theorem 2.2.

2. We can calculate

\int_{|z|=1} z^{-1}\,dz = \int_0^{2\pi} e^{-i\theta} \cdot i e^{i\theta}\,d\theta = 2\pi i.

3. We leave as an exercise the claim

\int_{|z|=1} z^{-n}\,dz = 0

for all integers n ≠ 1.
We note that the notation |z| = 1 means all points of the unit circle x2 + y2 = 1.
The default direction of the circle is counterclockwise.
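These contour integrals can be approximated by discretizing z = e^{iθ}; a Riemann-sum sketch in Python (the number of sample points is arbitrary):

```python
import cmath
import math

def circle_integral(f, n_points=2000):
    """Approximate the contour integral of f over the unit circle,
    traversed counterclockwise, with z = e^{i theta}."""
    total = 0j
    dtheta = 2 * math.pi / n_points
    for k in range(n_points):
        z = cmath.exp(1j * k * dtheta)
        dz = 1j * z * dtheta          # dz = i e^{i theta} d theta
        total += f(z) * dz
    return total

assert abs(circle_integral(lambda z: 1 / z) - 2j * math.pi) < 1e-3
assert abs(circle_integral(lambda z: z ** 2)) < 1e-3     # analytic: zero
assert abs(circle_integral(lambda z: z ** -2)) < 1e-3    # n = 2 case of Example 3
```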
2.2.3. Cauchy integral formula
We have found that contour integrals of analytic functions are always zero. Only
a few integrands with singularities result in nonzero values. The following Cauchy
integral formula describes contour integrals extremely well.
Theorem 2.3 (Cauchy integral formula). Let C be a simple (non-self-intersecting) closed curve traversed counterclockwise. Suppose f(z) is analytic everywhere inside C. For any point z inside C, there holds

\int_C \frac{f(\xi)}{\xi - z}\,d\xi = 2\pi i\, f(z).   (11)
Proof. For any ε > 0 fixed, we deform the curve C to C′, where C′ is a small circle |ξ - z| = ε chosen such that |f(ξ) - f(z)| < ε for all points ξ inside C′ (possible by continuity). Noting that the integrand in (11), f(ξ)/(ξ - z), is analytic in the region between C and C′, we conclude that the integral in (11) is equal to the same integral over C′. (This can be achieved by the previous Theorem and a double-sided cut (or bridge) connecting C and C′.) Now on C′ we have

\int_C \frac{f(\xi)}{\xi - z}\,d\xi = \int_{C'} \frac{f(\xi)}{\xi - z}\,d\xi
= f(z)\int_{C'} \frac{d\xi}{\xi - z} + \int_{C'} \frac{f(\xi) - f(z)}{\xi - z}\,d\xi
= 2\pi i\, f(z) + i\int_0^{2\pi} \big[f(z + \varepsilon e^{i\theta}) - f(z)\big]\,d\theta
= 2\pi i\, f(z) + iI,   (12)
where the integral I is such that |I| ≤ 2πε. Let ε → 0. We recover the Cauchy
integral formula. This completes the proof of the theorem.
Corollary 2.4. Under the same assumptions as Theorem 2.3, there hold

\int_C \frac{f(\xi)}{(\xi - z)^2}\,d\xi = 2\pi i\, f'(z)   (13)

and

n! \int_C \frac{f(\xi)}{(\xi - z)^{n+1}}\,d\xi = 2\pi i\, f^{(n)}(z)   (14)

for all n-th order derivatives (n a positive integer). Thus analyticity implies that f(z) is infinitely differentiable.
Corollary 2.5 (Poisson formula). A solution to the boundary value problem for the Laplacian,

\Delta u \equiv \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \quad \text{in } x^2 + y^2 \le 1,
u(r, \theta) = u_0(\theta) \quad \text{on the boundary } r = 1,   (15)
where (r, θ) are the polar coordinates and u_0(θ) is a given continuous function, is given by the formula

u(r, \theta) = \frac{1}{2\pi} \int_0^{2\pi} u_0(\varphi)\, \frac{1 - r^2}{1 - 2r\cos(\theta - \varphi) + r^2}\, d\varphi.

Proof. Consider an analytic function f(z) = u(x, y) + iv(x, y) in r < 1. We have the Cauchy-Riemann equations

\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}.

So we have

\frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 v}{\partial y \partial x} = \frac{\partial}{\partial y}\Big(-\frac{\partial u}{\partial y}\Big),

thus

\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0.
That is, the real part of an analytic function is a harmonic function (satisfying the
Laplace equation). Now we use the Cauchy integral formula

f(z) = \frac{1}{2\pi} \int_0^{2\pi} \frac{f(\xi)\,\xi}{\xi - z}\,d\varphi \quad (\text{letting } \xi = e^{i\varphi})

for z inside the unit circle, and the same formula

0 = \frac{1}{2\pi} \int_0^{2\pi} \frac{f(\xi)\,\xi}{\xi - (\bar{z})^{-1}}\,d\varphi,

applied at the point (\bar{z})^{-1}, which is outside of the unit circle (|1/\bar{z}| > 1 if |z| < 1). Noting that \bar{\xi} = \xi^{-1} on the unit circle, we can subtract the second formula from the first:

f(z) = \frac{1}{2\pi} \int_0^{2\pi} f(\xi)\left[\frac{\xi}{\xi - z} - \frac{1/\bar{\xi}}{1/\bar{\xi} - 1/\bar{z}}\right] d\varphi.

Or

f(z) = \frac{1}{2\pi} \int_0^{2\pi} f(\xi)\,\frac{1 - |z|^2}{|\xi - z|^2}\,d\varphi.

Writing z = re^{i\theta} and \xi = e^{i\varphi}, so that |\xi - z|^2 = 1 - 2r\cos(\theta - \varphi) + r^2, and taking the real part of the formula, we obtain the Poisson formula. This completes the proof.
2.2.4. Taylor series.
We would like to expand an analytic function in a power series:

f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n, \quad \text{where} \quad a_n = \frac{f^{(n)}(z_0)}{n!}

and z_0 is a convenient point for an application. Using the Cauchy integral formula for derivatives (Corollary 2.4), we find that

a_n = \frac{1}{2\pi i} \int_C \frac{f(\xi)}{(\xi - z_0)^{n+1}}\,d\xi

for any simple contour C that contains z_0 in its interior.

We can also use other, simpler ways to find Taylor series. For example, we have

\frac{1}{1 - z} = 1 + z + z^2 + z^3 + \cdots,

valid for all |z| < 1.
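The coefficient formula can be tested numerically: discretizing the contour integral for f(z) = e^z at z_0 = 0 should reproduce a_n = 1/n!. A Python sketch (grid size arbitrary):

```python
import cmath
import math

def taylor_coeff(f, z0, n, n_points=2000):
    """a_n = 1/(2 pi i) * integral of f(xi)/(xi - z0)^{n+1} over the unit
    circle centered at z0, approximated by a Riemann sum."""
    total = 0j
    dtheta = 2 * math.pi / n_points
    for k in range(n_points):
        xi = z0 + cmath.exp(1j * k * dtheta)
        dxi = 1j * (xi - z0) * dtheta
        total += f(xi) / (xi - z0) ** (n + 1) * dxi
    return total / (2j * math.pi)

for n in range(6):
    a_n = taylor_coeff(cmath.exp, 0, n)
    assert abs(a_n - 1 / math.factorial(n)) < 1e-6
```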
Using these lecture notes together with the textbook is recommended.
Chapter V. Ordinary Differential Equations
Outline:
5.1 First-order linear scalar equation.
5.2 High-order linear scalar equations with constant coefficients.
5.3 First-order linear systems with constant coefficients.
5.4 Stability of first-order linear systems.
5.5 Hopf Bifurcation.
We cover perturbation methods next semester.
5.1. First-order linear scalar equation
Let us solve the problem

\frac{dy}{dt} + a(t)y = 0, \quad y(0) = C.   (1)

We find

\frac{y'}{y} = -a(t).

Integrate:

\ln y(t) - \ln y(0) = -\int_0^t a(\tau)\,d\tau,
y(t) = e^{\ln y(0) - \int_0^t a(\tau)\,d\tau} = y(0)\, e^{-\int_0^t a(\tau)\,d\tau}.

So the solution to (1) is

y(t) = C e^{-\int_0^t a(\tau)\,d\tau}.   (2)
Now we consider (the first-order linear scalar equation)

\frac{dy}{dt} + a(t)y = f(t).   (3)

We look for a factor m(t) such that

m(t)y' + a(t)m(t)y = [m(t)y]'.

From the product rule, we need m(t) to satisfy

m'(t) = a(t)m(t).

From formula (2), we find an m(t):

m(t) = e^{\int_0^t a(s)\,ds}.

Equation (3) multiplied by m(t) becomes

\frac{d}{dt}[m(t)y(t)] = m(t)f(t).

Thus

m(t)y(t) - m(0)y(0) = \int_0^t m(\tau)f(\tau)\,d\tau,

or

y(t) = \frac{1}{m(t)}\Big[y(0) + \int_0^t m(\tau)f(\tau)\,d\tau\Big],

or

y(t) = e^{-\int_0^t a(\tau)\,d\tau}\Big[y(0) + \int_0^t e^{\int_0^\tau a(s)\,ds} f(\tau)\,d\tau\Big].   (4)
Examples. 1. All solutions to

y' + 5y = 0

are

y = c e^{-5t}.

2. All solutions to

y' + t^2 y = 0

are

y = c e^{-t^3/3}.
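Solution formula (2) can be checked against direct numerical integration. The sketch below solves y′ + t²y = 0 with a hand-rolled RK4 stepper and compares with y = e^{−t³/3} (step size and final time are illustrative choices):

```python
import math

def rk4(f, y0, t_end, dt=1e-3):
    """Integrate y' = f(t, y) from t = 0 to t_end with classical RK4."""
    t, y = 0.0, y0
    while t < t_end - 1e-12:
        h = min(dt, t_end - t)
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

# y' + t^2 y = 0, y(0) = 1  ->  y(t) = exp(-t^3/3)
y_num = rk4(lambda t, y: -t ** 2 * y, 1.0, 2.0)
y_exact = math.exp(-2.0 ** 3 / 3)
assert abs(y_num - y_exact) < 1e-8
```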
5.2. High-order linear scalar equations with constant coefficients
Let us solve the problem

\frac{d^3x}{dt^3} + \frac{d^2x}{dt^2} - 2x = 0,   (5)
x(0) = 0, \quad x'(0) = 1, \quad x''(0) = -1.   (6)

We try solutions of the form

x(t) = e^{\lambda t}.   (7)

Then x' = \lambda x(t), x''(t) = \lambda^2 x(t), x'''(t) = \lambda^3 x(t). Thus λ needs to satisfy

\lambda^3 + \lambda^2 - 2 = 0.

We find \lambda_1 = 1, \lambda_2 = -1 + i, \lambda_3 = -1 - i. So we have solutions e^t, e^{-t+it}, e^{-t-it}. Since the equation is linear, the two complex solutions can be added to or subtracted from each other to produce two real solutions, so we have three real solutions e^t, e^{-t}\cos t, e^{-t}\sin t. Also, any linear combination of the three solutions is a solution. Thus we have

x(t) = c_1 e^{t} + c_2 e^{-t}\cos t + c_3 e^{-t}\sin t   (8)

as the general solution formula for (5). One can use the three initial conditions (6) to determine the three coefficients c_1, c_2, c_3 in (8), which we omit here.
In general, the n-th-order linear scalar equation

a_n x^{(n)}(t) + a_{n-1} x^{(n-1)}(t) + \cdots + a_1 x' + a_0 x = 0

with constant coefficients (a_n, a_{n-1}, \cdots, a_0) and without forcing (right-hand side = 0) can be solved by the guess work (7). More precisely, from the algebraic equation

a_n \lambda^n + a_{n-1} \lambda^{n-1} + \cdots + a_1 \lambda + a_0 = 0,

we can find n roots. Suppose it has n distinct roots \lambda_1, \lambda_2, \cdots, \lambda_n. Then the general solution of the ODE is

x(t) = c_1 e^{\lambda_1 t} + c_2 e^{\lambda_2 t} + \cdots + c_n e^{\lambda_n t}.

If \lambda_1 and \lambda_2 are a pair of conjugate complex roots, say

\lambda_1 = a + bi, \quad \lambda_2 = a - bi,

then we can replace the part c_1 e^{\lambda_1 t} + c_2 e^{\lambda_2 t} by real solutions of the form

\alpha e^{at}\cos(bt) + \beta e^{at}\sin(bt).

If \lambda_1 is repeated, say \lambda_1 = \lambda_2, then c_2 e^{\lambda_2 t} is a multiple of the first solution c_1 e^{\lambda_1 t}. In this case, we replace the guess work e^{\lambda t} by t e^{\lambda t}, and the solution c_1 e^{\lambda_1 t} + c_2 e^{\lambda_2 t} is replaced by c_1 e^{\lambda_1 t} + c_2 t e^{\lambda_1 t}. If \lambda_1 is repeated m times, then the solution part

c_1 e^{\lambda_1 t} + c_2 e^{\lambda_2 t} + \cdots + c_m e^{\lambda_m t}

is replaced by

c_1 e^{\lambda_1 t} + c_2 t e^{\lambda_1 t} + \cdots + c_m t^{m-1} e^{\lambda_1 t}.
Example 3. Solve

\frac{d^2x}{dt^2} - 2\frac{dx}{dt} + x = 0.

Solution. Trying x = e^{\lambda t}, we find

\lambda^2 - 2\lambda + 1 = 0.

So

\lambda_1 = \lambda_2 = 1,

and the solutions are

x(t) = c_1 e^{t} + c_2 t e^{t}.
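One can confirm numerically that te^t really is a second solution, by checking the residual x″ − 2x′ + x with central differences (test points and step size arbitrary):

```python
import math

def residual(x, t, h=1e-4):
    """Central-difference approximation of x'' - 2x' + x at t."""
    xpp = (x(t + h) - 2 * x(t) + x(t - h)) / h ** 2
    xp = (x(t + h) - x(t - h)) / (2 * h)
    return xpp - 2 * xp + x(t)

x = lambda t: t * math.exp(t)   # candidate solution for the repeated root
for t in (0.0, 0.5, 1.0, 2.0):
    assert abs(residual(x, t)) < 1e-4
```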
5.3. First-order linear systems with constant coefficients
Motivation: The planetary motion can be described by a system of equations.
Let us solve the system
\frac{dx_1}{dt} - x_2 - x_3 = 0, \quad \frac{dx_2}{dt} - x_1 - x_3 = 0, \quad \frac{dx_3}{dt} - x_1 - x_2 = 0,   (1)

with initial conditions

(x_1(0), x_2(0), x_3(0)) = (c_1, c_2, c_3).   (2)

Motivated by the guess work for a scalar equation, x = c e^{\lambda t}, we try the form

(x_1(t), x_2(t), x_3(t)) = (a e^{\lambda t}, b e^{\lambda t}, c e^{\lambda t}).   (3)

We see

(x_1', x_2', x_3') = \lambda (a, b, c)\, e^{\lambda t} = \lambda (x_1, x_2, x_3).

Inserting this back into (1), we have

\lambda x_1 - x_2 - x_3 = 0, \quad \lambda x_2 - x_1 - x_3 = 0, \quad \lambda x_3 - x_1 - x_2 = 0.
This is a homogeneous linear system of three algebraic equations. We write it in matrix form:

\begin{pmatrix} \lambda & -1 & -1 \\ -1 & \lambda & -1 \\ -1 & -1 & \lambda \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = 0.

Removing the common factor e^{\lambda t} in (x_1, x_2, x_3)^T in the above equation, we find

\begin{pmatrix} \lambda & -1 & -1 \\ -1 & \lambda & -1 \\ -1 & -1 & \lambda \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = 0.   (4)
To have a nonzero solution (a, b, c)^T, we need the matrix to have zero determinant:

\det \begin{pmatrix} \lambda & -1 & -1 \\ -1 & \lambda & -1 \\ -1 & -1 & \lambda \end{pmatrix} = 0.   (5)

This determinant can be evaluated to be

\lambda^3 - 3\lambda - 2 = (\lambda + 1)^2 (\lambda - 2).   (6)

The factorization is made possible by the inspection that λ = -1 is a root. Equation (5) then has three roots:

\lambda_1 = 2, \quad \lambda_2 = -1, \quad \lambda_3 = -1.   (7)
Using the root \lambda_1 = 2 in (4), we have the equation

\begin{pmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = 0.   (8)

The solutions are

(a, b, c)^T = \alpha (1, 1, 1)^T \quad (\alpha \text{ free}).   (9)

Thus we find the first batch of solutions:

(x_1, x_2, x_3)^T = \alpha (1, 1, 1)^T e^{2t}.   (10)
Using \lambda_2 = -1 in (4), we find the equation

\begin{pmatrix} -1 & -1 & -1 \\ -1 & -1 & -1 \\ -1 & -1 & -1 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = 0.   (11)

The solutions to (11) are

(a, b, c)^T = \beta (1, 0, -1)^T + \gamma (0, 1, -1)^T \quad (\beta, \gamma \text{ free}).   (12)

We find another batch of solutions to (1):

(x_1, x_2, x_3)^T = \beta (1, 0, -1)^T e^{-t} + \gamma (0, 1, -1)^T e^{-t}.   (13)
If \lambda_3 were different from \lambda_2, we could use it to find another batch. But so far we have found plenty of solutions. We combine the solutions (10) and (13) linearly to end up with the general solution formula for (1):

(x_1, x_2, x_3)^T = \alpha (1, 1, 1)^T e^{2t} + \beta (1, 0, -1)^T e^{-t} + \gamma (0, 1, -1)^T e^{-t}.   (14)

We can use the initial condition (2) to determine the three arbitrary constants \alpha, \beta, \gamma (which we skip).
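The eigenpairs behind (14) are easy to verify directly from A~a = λ~a; a pure-Python sketch:

```python
def matvec(A, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]   # coefficient matrix of the system x' = A x

eigenpairs = [(2, [1, 1, 1]),    # lambda_1 = 2
              (-1, [1, 0, -1]),  # lambda_2 = -1
              (-1, [0, 1, -1])]  # lambda_3 = -1

for lam, v in eigenpairs:
    assert matvec(A, v) == [lam * x for x in v]
```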
In general, equation (1) can be written as

\frac{d\vec{x}}{dt} = A\vec{x}   (15)

for an n × n matrix A with constant coefficients (a_{ij}). For our previous example,

A = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}.

The guess work is

\vec{x} = \vec{a}\, e^{\lambda t},   (16)

where \vec{a} is a vector. Then λ needs to satisfy

\det(\lambda I - A) = 0   (17)
and \vec{a} satisfies

A\vec{a} = \lambda\vec{a}.   (18)

If the characteristic equation (17) has n roots \lambda_1, \lambda_2, \cdots, \lambda_n, and the eigenvalue problem (18) has n corresponding linearly independent eigenvectors \vec{a}_1, \vec{a}_2, \cdots, \vec{a}_n, then the general solution of (15) is

\vec{x}(t) = \alpha_1 \vec{a}_1 e^{\lambda_1 t} + \alpha_2 \vec{a}_2 e^{\lambda_2 t} + \cdots + \alpha_n \vec{a}_n e^{\lambda_n t}.   (19)

If, for example, \lambda_1 = \lambda_2 and the corresponding linearly independent eigenvectors are fewer than n, then we use the guess work (in addition to \alpha_1 \vec{a}_1 e^{\lambda_1 t})

\vec{x} = \alpha_2 (\vec{a}_2 e^{\lambda_1 t} + \vec{a}_1 t e^{\lambda_1 t}).

This way a nonzero \vec{a}_2 can be found from the sequence of equations

A\vec{a}_1 = \lambda_1 \vec{a}_1,
A\vec{a}_2 = \lambda_1 \vec{a}_2 + \vec{a}_1.

And the general solution is

\vec{x} = \alpha_1 \vec{a}_1 e^{\lambda_1 t} + \alpha_2 (\vec{a}_2 + \vec{a}_1 t) e^{\lambda_1 t} + \alpha_3 \vec{a}_3 e^{\lambda_3 t} + \cdots + \alpha_n \vec{a}_n e^{\lambda_n t}.

If \lambda_1 is repeated more times, then higher powers of t can be used in the guess work. If, however, all eigenvalues are distinct, a theorem says that the n corresponding eigenvectors are linearly independent and (19) gives the general solution.
5.4. Stability of first-order linear system
Motivation: A solution needs to be stable in order to be useful in practice. The
U.S. missile defense system is not yet stable.
Consider

\frac{d\vec{x}}{dt} = A\vec{x}, \quad \vec{x}(0) = \vec{C},   (1)

where A is an n × n matrix of constants with n distinct eigenvalues. The solution formula is

\vec{x}(t) = \alpha_1 \vec{a}_1 e^{\lambda_1 t} + \alpha_2 \vec{a}_2 e^{\lambda_2 t} + \cdots + \alpha_n \vec{a}_n e^{\lambda_n t}.
Theorem 1. If the real parts of all the eigenvalues of the coefficient matrix A are (strictly) negative, then any solution to (1) goes to zero as t → +∞.

Theorem 2. If one or more eigenvalues of A have positive real parts, then some solutions of (1) go to infinity as t → +∞.

Proofs. They follow from the solution formula if all eigenvalues are distinct. Otherwise, solutions are like t^m e^{\lambda t}, which also go to zero if the real part of λ is negative, or go to infinity if the real part of λ is positive.
Now let us consider a perturbation of (1):

\frac{d\vec{x}}{dt} = A\vec{x} + R(t, \vec{x}).   (2)

Suppose

\|R(t, \vec{x})\| \le \alpha \|\vec{x}\| \quad \text{on } t \ge 0, \ \|\vec{x}\| < H,   (3)

for some constants α and H > 0. Then:

Theorem 3. If the real parts of all the eigenvalues of A are (strictly) negative, and (3) holds for a suitably small α, then the zero solution of (2) is asymptotically stable; i.e., all solutions of (2) with small initial data go to zero as t → +∞.

Theorem 4. If one or more eigenvalues of A have positive real parts, then the zero solution is not stable, provided that (3) holds for a suitably small α.
What if one eigenvalue has zero real part and all others have negative real parts?
This is called the critical case, and is where bifurcation occurs. We will discuss these
issues in the next section. We provide some concrete stability examples below.
Examples. 1. Consider

\frac{dx_1}{dt} = \lambda x_1, \quad \frac{dx_2}{dt} = \mu x_2.   (4)

Suppose that λ < 0, μ < 0. Then all solutions go to zero as t → +∞. Add a perturbation R(t, x_1, x_2) with \|R(t, \vec{x})\| < \varepsilon\|\vec{x}\|:

\frac{dx_1}{dt} = \lambda x_1 + R_1(t, x_1, x_2), \quad \frac{dx_2}{dt} = \mu x_2 + R_2(t, x_1, x_2).

If \varepsilon < \min(|\lambda|, |\mu|), the zero solution is stable: all solutions \vec{x}(t) \to 0 as t \to +\infty.

2. For (4) again, but with λ < 0 < μ. Zero is still a solution, but it is not stable, since the initially nearby solution

(x_1, x_2) = (0, \alpha e^{\mu t}),

where α is small, grows to infinity.
3. Consider now, for β ≠ 0, the system

\frac{dx_1}{dt} = \beta x_2, \quad \frac{dx_2}{dt} = -\beta x_1.

Differentiating the first equation and using the second equation, we find

\frac{d^2 x_1}{dt^2} + \beta^2 x_1 = 0.

We can therefore find the solution formula

x_1 = x_1^0 \cos(\beta t) + x_2^0 \sin(\beta t), \quad x_2 = -x_1^0 \sin(\beta t) + x_2^0 \cos(\beta t).

Introduce \rho(t) = (x_1^2 + x_2^2)^{1/2}; then \rho(t) = ((x_1^0)^2 + (x_2^0)^2)^{1/2}. See Figure 5.1 for the phase portrait of the solutions. This solution is, however, unstable to perturbations of the form

(0, \alpha x_2)^T,

where α > 0, because then the equation has the matrix

\begin{pmatrix} 0 & \beta \\ -\beta & \alpha \end{pmatrix},

one of whose eigenvalues has positive real part.
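The eigenvalues of the perturbed matrix come from its characteristic quadratic λ² − αλ + β² = 0, so both real parts equal α/2 > 0 when α² < 4β². A quick cmath check (values illustrative):

```python
import cmath

def eigs_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] via the characteristic quadratic."""
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr ** 2 - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

beta, alpha = 1.0, 0.1            # small perturbation alpha > 0
l1, l2 = eigs_2x2(0, beta, -beta, alpha)
assert l1.real > 0                # instability
assert abs(l1.real - alpha / 2) < 1e-12
assert abs(l2.real - alpha / 2) < 1e-12
```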
Notes. 1. The stability of a nonzero solution w(t) can be transformed to the stability of the zero solution of the equation for v(t) ≡ u(t) - w(t).

2. A general nonlinear system

\frac{d\vec{x}}{dt} = \vec{F}(t, \vec{x})

may be approximated by (2), just as a curve can be approximated by its tangent lines.
Figure 5.1. Any solution of Example 3 traces a circle: the phase portrait of (x_1(t), x_2(t)) is a family of circles, traversed one way for β < 0 and the other way for β > 0.
5.5. Hopf bifurcations and example.
Motivation: Bifurcation theory is used in many areas: life sciences, ecological systems, weather systems, fluids, chaos, and turbulence.
Consider

\frac{d^2u}{dt^2} + (u^2 - \lambda)\frac{du}{dt} + u = 0.   (5)

It has the solution u = 0. Let us consider the linearized equation

\frac{d^2u}{dt^2} - \lambda\frac{du}{dt} + u = 0.
When we try solutions of the form u = e^{\mu t}, we find

\mu^2 - \lambda\mu + 1 = 0.
For λ < 0, both roots have negative real parts, so the zero solution is stable. For λ > 0, both roots have positive real parts, so the zero solution is unstable. At λ = 0, the roots are purely imaginary, μ = i, -i, and the linearized equation has the periodic solutions u = e^{it} = \cos t + i\sin t and u = e^{-it} = \cos t - i\sin t. Both the real and imaginary parts are real solutions: u(t) = \cos t or \sin t.
We write equation (5) in vector form by introducing u_1 = u, u_2 = u':

u_1' = u_2, \quad u_2' = (\lambda - u_1^2)u_2 - u_1.

Or

\vec{u}\,'(t) = \begin{pmatrix} 0 & 1 \\ -1 & \lambda \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} + \begin{pmatrix} 0 \\ -u_1^2 u_2 \end{pmatrix}.

This nonlinear system has nonzero periodic solutions near λ = 0:

\lambda = \frac{\varepsilon^2}{4} + O(\varepsilon^3), \quad u_1(t) = \varepsilon\cos(\omega t) + O(\varepsilon^3), \quad \omega = 1 + O(\varepsilon^3).
We will derive this expansion in perturbation theory next semester. For now we have
a bifurcation diagram, see Figure 5.2, and we state a general bifurcation theorem
called Hopf bifurcation.
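The expansion predicts a limit-cycle amplitude ε ≈ 2√λ for small λ > 0. This can be checked by integrating the system numerically until it settles onto the limit cycle; the RK4 sketch below does this for one small λ (the step size, time horizon, and tolerance are illustrative choices, not part of the theory):

```python
import math

def step(u1, u2, lam, h):
    """One RK4 step for u1' = u2, u2' = (lam - u1^2) u2 - u1."""
    f = lambda a, b: (b, (lam - a * a) * b - a)
    k1 = f(u1, u2)
    k2 = f(u1 + h * k1[0] / 2, u2 + h * k1[1] / 2)
    k3 = f(u1 + h * k2[0] / 2, u2 + h * k2[1] / 2)
    k4 = f(u1 + h * k3[0], u2 + h * k3[1])
    return (u1 + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            u2 + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

lam, h = 0.04, 0.01
u1, u2 = 0.1, 0.0
for _ in range(40000):            # transient: relax onto the limit cycle
    u1, u2 = step(u1, u2, lam, h)
amplitude = 0.0
for _ in range(20000):            # measure max |u1| over many periods
    u1, u2 = step(u1, u2, lam, h)
    amplitude = max(amplitude, abs(u1))
assert abs(amplitude - 2 * math.sqrt(lam)) < 0.05   # epsilon ~ 2 sqrt(lambda)
```

Rescaling u = √λ w turns (5) into the Van der Pol equation w″ + λ(w² − 1)w′ + w = 0, whose small-parameter limit cycle has amplitude about 2, which is consistent with the measured amplitude 2√λ.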
Figure 5.2. Hopf bifurcation diagram. (a) Bifurcation diagram: the branch of zero solutions along the λ-axis and a branch of periodic solutions emanating from λ = 0; the vertical axis ‖u‖_max is the amplitude of a solution, and each point on the branch indicates a periodic solution. (b) A periodic solution u(t).
Theorem (Hopf Bifurcation). Suppose the n × n matrix A(λ) has eigenvalues \mu_j = \mu_j(\lambda), j = 1, 2, \cdots, n, and that for \lambda = \lambda_0, \mu_1(\lambda_0) = i\beta, \mu_2(\lambda_0) = -i\beta, and \mathrm{Re}\,\mu_j(\lambda_0) \ne 0 for all j > 2. Suppose further that \mathrm{Re}\,(\mu_1'(\lambda_0)) \ne 0. Then the system of differential equations

\frac{du}{dt} = A(\lambda)u + f(u),

with f(0) = 0 and f(u) a smooth function of u, has a branch (continuum) of periodic solutions emanating from u = 0, \lambda = \lambda_0.

(The direction of bifurcation is not determined by the Hopf Bifurcation Theorem, but must be calculated by a local power series expansion (see Keener).)
We plan to do serious perturbation theory next semester, where we can understand how a mathematician's perturbation calculation helped locate the position of a planet of the solar system.
5.6. Another bifurcation example.
See Keener p.478: Nonlinear Eigenvalue problems.
Consider the elastica equation (a.k.a. the Euler column)

y'' + \Big(\lambda - \frac{1}{2}\int_0^1 (y')^2\,ds\Big)y = 0, \quad y(0) = y(1) = 0,   (6)

where λ is a parameter.

We see that the integral \int_0^1 (y')^2(s)\,ds is a number. So let us introduce the number

\mu = \lambda - \frac{1}{2}\int_0^1 (y')^2(s)\,ds.

Then equation (6) becomes

y'' + \mu y = 0, \quad y(0) = y(1) = 0,   (7)

which has the solutions

y(x) = A\sin(n\pi x), \quad \text{for } \mu = n^2\pi^2.   (8)

These solutions produce

\mu = \lambda - \frac{1}{2}\int_0^1 (y')^2(s)\,ds = \lambda - \frac{1}{2}\int_0^1 (An\pi)^2\cos^2(n\pi x)\,dx = \lambda - \frac{1}{4}(An\pi)^2.

To satisfy (6), we need this μ to be the same as in (8); that is,

\frac{A^2}{4} = \frac{\lambda}{n^2\pi^2} - 1.

Thus we find many branches of solutions besides the zero solution; see Figure 5.3.
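The branch relation can be spot-checked by computing μ for y = A sin(nπx) by numerical quadrature and confirming that it equals n²π²; a Python sketch (the choice of λ and n is arbitrary):

```python
import math

def mu_of(lam, A, n, m=20000):
    """mu = lambda - (1/2) * integral_0^1 (y')^2 dx for y = A sin(n pi x),
    with the integral computed by a midpoint rule."""
    integral = 0.0
    for k in range(m):
        x = (k + 0.5) / m
        yp = A * n * math.pi * math.cos(n * math.pi * x)
        integral += yp * yp / m
    return lam - integral / 2

n = 2
lam = 5 * (n * math.pi) ** 2                       # above the bifurcation point
A = 2 * math.sqrt(lam / (n * math.pi) ** 2 - 1)    # branch formula A^2/4 = lam/(n pi)^2 - 1
assert abs(mu_of(lam, A, n) - (n * math.pi) ** 2) < 1e-6
```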
Figure 5.3. A nonlinear eigenvalue bifurcation diagram: bifurcation branches for n = 1, 2, 3 emanate from the bifurcation points λ/π² = 1, 4, 9; the vertical axis is the amplitude A, and each point indicates a solution y = A sin(nπx).
4.2. Laplace transform
Definition. For any f(t) ∈ L^1([0, ∞)), the function

L[f](s) = \int_0^\infty e^{-st} f(t)\,dt

for s ≥ 0, is called the Laplace transform of f(t).
The Laplace transform can be obtained from the Fourier transform through a certain specialization. The Laplace transform is very convenient to use for certain differential equations or with certain boundary conditions. But in general the two linear transforms are basically the same.
Property a. L[\frac{df}{dt}](s) = sL[f] - f(0).

Proof. We have the calculation

L[f'] = \int_0^\infty e^{-st} f'(t)\,dt = e^{-st} f(t)\Big|_0^\infty - \int_0^\infty (-s)e^{-st} f(t)\,dt = -f(0) + s\int_0^\infty e^{-st} f(t)\,dt = sL[f] - f(0).
We list other properties below without proof. We let F (s) denote L[f ](s); i.e., we
use the capital letter to denote the Laplace transform of a lower-case letter function.
Properties:

b. L[1] = \frac{1}{s};
c. L[e^{at}] = \frac{1}{s-a};
d. L[f''] = s^2 F(s) - s f(0) - \frac{df}{dt}(0);
e. L[-t f(t)] = \frac{dF}{ds};
f. L[\int_0^t f(t-\tau)g(\tau)\,d\tau] = F(s)G(s);
g. L[\delta(t-b)] = e^{-bs} \quad (b \ge 0);
h. L[t^n] = \frac{n!}{s^{n+1}} \quad (n > -1);
i. L[t^n e^{at}] = \frac{n!}{(s-a)^{n+1}} \quad (n > -1).
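Several of these properties can be sanity-checked by evaluating the defining integral numerically, truncated at a large T; a Python sketch (truncation point, grid size, and tolerances are illustrative):

```python
import math

def laplace(f, s, T=60.0, m=200000):
    """Approximate L[f](s) = integral_0^T e^{-st} f(t) dt (midpoint rule)."""
    total = 0.0
    dt = T / m
    for k in range(m):
        t = (k + 0.5) * dt
        total += math.exp(-s * t) * f(t) * dt
    return total

s = 2.0
assert abs(laplace(lambda t: 1.0, s) - 1 / s) < 1e-6                 # property b
assert abs(laplace(lambda t: math.exp(-t), s) - 1 / (s + 1)) < 1e-6  # property c, a = -1
assert abs(laplace(lambda t: t ** 3, s) - 6 / s ** 4) < 1e-6         # property h, n = 3
```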
We note that the Laplace transform turns differentiation into multiplication by the independent variable s, and multiplication by t into differentiation with respect to s. It also transforms the convolution

\int_0^t f(t - \tau)\, g(\tau)\,d\tau

into the product of the transforms of f and g. Note that the Laplace transform does not need f(t) to be defined for t < 0. Even if f(t) is defined for t < 0, its value there does not affect the transform. Therefore we adopt the convention that all relevant functions for the Laplace transform are defined to be zero for t < 0. Under this convention we find that

\int_0^t f(t - \tau)\, g(\tau)\,d\tau = \int_{-\infty}^{\infty} f(t - \tau)\, g(\tau)\,d\tau = f * g

is indeed the convolution defined in the previous section.
We also note that Laplace transforms exist for functions or functionals that are not in L^1([0, ∞)), e.g., the functional δ(t - b) and the function t^2.
We mention another example. Consider the Heaviside function

H(t - b) = \begin{cases} 0, & t < b, \\ 1, & t \ge b. \end{cases}

For b > 0, we find:

Property j. L[H(t - b)](s) = \int_0^\infty H(t - b)\, e^{-st}\,dt = \int_b^\infty e^{-st}\,dt = \frac{e^{-bs}}{s}.
Similar to the Fourier transform, there exists an inverse transform for the Laplace transform, but its direct use is inconvenient. The best way has been to use the above list of properties a-j for the inversion. If F(s) is the Laplace transform of f(t), then we call f(t) the inverse of F(s). For example,

L^{-1}\Big[\frac{1}{s}\Big] = 1 \quad (t > 0).
Example 1. Solve the initial value problem
u'' + 2u' + 2u = 0, \quad t > 0,
u(0) = 1, \quad u'(0) = 2.
Physical background. This equation can be regarded as the motion of a particle
with mass m = 1, attached to a spring with Hooke’s spring constant k = 2, and wind
drag force proportional to the velocity u′. Newton’s second law says F = ma (Force
= mass × acceleration). Here ma = u′′ where u(t) represents the displacement of
the particle from the equilibrium. The spring force is −2u, the wind drag force is
−2u′, where u′ is velocity. So u′′ = −2u′ − 2u is Newton’s law.
Solution: Let U(s) = L[u]. Then we use the properties a-j to find

L[u'] = sU - 1, \quad L[u''] = s^2 U - s - 2.

So

s^2 U - s - 2 + 2(sU - 1) + 2U = 0,
(s^2 + 2s + 2)U = s + 4,
U = \frac{s + 4}{s^2 + 2s + 2}.

We notice s^2 + 2s + 2 = [s + (1 - i)][s + (1 + i)]. By partial fractions (see below), we have

\frac{s + 4}{s^2 + 2s + 2} = \frac{\alpha}{s + 1 - i} + \frac{\beta}{s + 1 + i},

where

\alpha = \frac{1}{2}(1 - 3i), \quad \beta = \frac{1}{2}(1 + 3i).

Then by linearity of the transform, we have

u = L^{-1}[U(s)] = \alpha L^{-1}\Big[\frac{1}{s - (-1 + i)}\Big] + \beta L^{-1}\Big[\frac{1}{s - (-1 - i)}\Big].

Using property c for a = -1 + i and then a = -1 - i, we have

u = \alpha e^{(-1+i)t} + \beta e^{(-1-i)t} = \frac{1}{2}(1 - 3i)\,e^{-t}(\cos t + i\sin t) + \frac{1}{2}(1 + 3i)\,e^{-t}(\cos t - i\sin t) = e^{-t}(\cos t + 3\sin t).
Partial fractions. We show here how to express a complicated fraction as a sum of simple fractions for which we can invert the Laplace transform. We make a guess for the sum:

\frac{s + 4}{s^2 + 2s + 2} = \frac{\alpha}{s + 1 - i} + \frac{\beta}{s + 1 + i},

where α and β are numbers to be determined. Then we multiply both sides of the equation by s^2 + 2s + 2 to find

s + 4 = \alpha(s + 1 + i) + \beta(s + 1 - i) = (\alpha + \beta)s + \alpha(1 + i) + \beta(1 - i).

This equation has to be true for all s, so we have

\alpha + \beta = 1, \quad \alpha(1 + i) + \beta(1 - i) = 4.

This system of algebraic equations can be solved easily:

1 + i(\alpha - \beta) = 4, \quad \alpha - \beta = -3i, \quad \alpha = \frac{1}{2}(1 - 3i), \quad \beta = \frac{1}{2}(1 + 3i).

This finishes the partial fractions used in Example 1. For general methods of partial fractions, see our textbook.
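Both the partial-fraction coefficients and the final answer u(t) = e^{−t}(cos t + 3 sin t) can be verified numerically; a Python sketch using finite differences for the ODE residual (sample points and step size arbitrary):

```python
import math

# Partial-fraction identity: (s+4)/(s^2+2s+2) = a/(s+1-i) + b/(s+1+i)
a, b = (1 - 3j) / 2, (1 + 3j) / 2
for s in (0.3, 1.0, 2.5):
    lhs = (s + 4) / (s ** 2 + 2 * s + 2)
    rhs = a / (s + 1 - 1j) + b / (s + 1 + 1j)
    assert abs(lhs - rhs) < 1e-12

# u(t) = e^{-t}(cos t + 3 sin t) solves u'' + 2u' + 2u = 0, u(0)=1, u'(0)=2
u = lambda t: math.exp(-t) * (math.cos(t) + 3 * math.sin(t))
h = 1e-4
assert abs(u(0.0) - 1.0) < 1e-12
assert abs((u(h) - u(-h)) / (2 * h) - 2.0) < 1e-6        # u'(0) = 2
for t in (0.5, 1.0, 2.0):
    upp = (u(t + h) - 2 * u(t) + u(t - h)) / h ** 2
    up = (u(t + h) - u(t - h)) / (2 * h)
    assert abs(upp + 2 * up + 2 * u(t)) < 1e-4           # ODE residual
```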
Example 2. Solve the initial value problem of the ordinary differential equation (ODE)

u'' + tu' + u = 0, \quad t > 0,
u(0) = 1, \quad u'(0) = 0.
Solution: Let U(s) = L[u]. Then

L[u'] = sU - 1, \quad L[u''] = s^2 U - s, \quad L[tu'] = -\frac{d}{ds}[sU - 1] = -sU' - U(s).

Then the ODE in question becomes

-sU' + s^2 U - s = 0,

or

U' - sU = -1.

How do we solve this new ODE? We multiply it by e^{-s^2/2}, so

\big(e^{-s^2/2}\, U\big)' = -e^{-s^2/2}.

We integrate in s from s to ∞ and use the condition U(s) → 0 as s → ∞:

0 - e^{-s^2/2}\, U(s) = -\int_s^\infty e^{-\sigma^2/2}\,d\sigma.
Or we have

U(s) = e^{s^2/2} \int_s^\infty e^{-\sigma^2/2}\,d\sigma.

Instead of evaluating this integral and then inverting it, we use a special trick. We introduce the new variable t = σ - s. The integral becomes

U(s) = \int_s^\infty e^{-(\sigma^2 - s^2)/2}\,d\sigma = \int_0^\infty e^{-st}\, e^{-t^2/2}\,dt.

Hence U(s) is the Laplace transform of e^{-t^2/2}, and so

u(t) = e^{-t^2/2}

is the solution. (Reference book: Weinberger.)
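A direct check: u′ = −tu gives u″ = −u + t²u = (t² − 1)u, so u″ + tu′ + u = (t² − 1 − t² + 1)u = 0. The Python sketch below verifies both this and the integral identity for U(s) (the value of s, the truncation point, and the grid size are illustrative):

```python
import math

u = lambda t: math.exp(-t * t / 2)

# ODE residual u'' + t u' + u, using the exact derivatives of the Gaussian
for t in (0.0, 0.7, 1.5, 3.0):
    up = -t * u(t)
    upp = (t * t - 1) * u(t)
    assert abs(upp + t * up + u(t)) < 1e-12

# U(s) = integral_0^inf e^{-st} e^{-t^2/2} dt equals e^{s^2/2} * tail integral
def midpoint(f, a, b, m=100000):
    dt = (b - a) / m
    return sum(f(a + (k + 0.5) * dt) for k in range(m)) * dt

s = 1.3
lhs = midpoint(lambda t: math.exp(-s * t) * u(t), 0.0, 40.0)
rhs = math.exp(s * s / 2) * midpoint(lambda x: math.exp(-x * x / 2), s, 40.0)
assert abs(lhs - rhs) < 1e-6
```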
For more on Laplace and Fourier transforms, see
1. Richard Haberman: Elementary Applied Partial Differential Equations, 2nd or later editions, Prentice Hall, 1987.
2. H. F. Weinberger: A First Course in Partial Differential Equations, John
Wiley & Sons, 1965.
Chapter VI. Partial Differential Equations
Tentative contents
A. In infinite domains.
6.1. Transport equations, method of characteristics.
6.2. Wave equation in IR1.
6.3. Wave equation in IR3.
6.4. Wave equation in IR2.
6.5. Heat equation in IRn and IR1+.
6.6. Laplace and Poisson equations in IRn.
6.7. Concept of fundamental solutions.
B. On rectangular domains, separation of variables.
6.8. Laplace equation in a rectangle, Fourier series.
6.9. Poisson equation in a rectangle.
6.10. Heat equation in a rectangle.
6.11. Wave equation in a rectangle.
6.12. Eigenvalue problems, Sturm-Liouville operator.
6.13. Explicit eigenfunctions, orthogonal polynomials, special functions, Bessel’s
functions.
6.14. Vibrating circular membrane.
C. Bounded domains general, Green’s function.
6.15. Laplace equation in general bounded domains, Green’s function
6.1 Transport equation, method of characteristics
We consider the simplest partial differential equation

\frac{\partial u}{\partial t} + a\frac{\partial u}{\partial x} = 0, \quad t > 0, \ x \in \mathrm{IR}^1,   (1)

where a is a constant. The general solution formula is

u(t, x) = g(x - at),   (2)

where g(·) is an arbitrary (smooth) function. Letting t = 0 in (2), we see that

u(0, x) = g(x);   (3)

thus g(·) is the initial condition for u and equation (1). One can let g be a Gaussian,

g(x) = e^{-x^2},

and plot the solution at times t = 1, 2, 3, \cdots, 10 for a = -2, -1, 0, 1, 2. We can conclude that the graph of u(t, x) is simply the graph of g(x) shifted by the amount at in the x direction.
Figure 6.1. Transport feature (shown for positive velocity a): the profile u at times t = 0, t = 1, and t > 1 is the initial graph shifted by at.
We consider now the transport equation in n dimensions,

\frac{\partial u}{\partial t} + a_1\frac{\partial u}{\partial x_1} + a_2\frac{\partial u}{\partial x_2} + \cdots + a_n\frac{\partial u}{\partial x_n} = 0, \quad t > 0, \ \vec{x} = (x_1, \cdots, x_n) \in \mathrm{IR}^n,   (4)

with initial condition

u(0, \vec{x}) = g(\vec{x}).   (5)

It can be readily verified that

u(t, \vec{x}) = g(\vec{x} - \vec{a}t).   (6)

Equation (4) is called a passive transport equation. We can add a source term to it and consider

\frac{\partial u}{\partial t} + \vec{a} \cdot \nabla u = f(t, \vec{x}), \quad u(0, \vec{x}) = g(\vec{x}).   (7)

Let us consider the straight lines

\frac{d\vec{x}}{dt} = \vec{a},   (8)

i.e.,

\vec{x} = \vec{x}(t) \equiv \vec{x}_0 + \vec{a}t,   (9)
which cover the whole space \mathrm{IR}^n \times \mathrm{IR} as \vec{x}_0 and t vary freely. These lines are called the characteristic lines of equation (7). See Figure 6.2. Let us fix an \vec{x}_0 and consider the function u(t, \vec{x}(t)). We find

\frac{d}{dt}u(t, \vec{x}(t)) = \frac{\partial u}{\partial t} + \nabla u \cdot \frac{d}{dt}\vec{x}(t) = \frac{\partial u}{\partial t} + \vec{a} \cdot \nabla u = f(t, \vec{x}(t)).   (10)

Thus we can integrate (10) to find

u(t, \vec{x}(t)) = u(0, \vec{x}(0)) + \int_0^t f(s, \vec{x}(s))\,ds = g(\vec{x}_0) + \int_0^t f(s, \vec{x}_0 + \vec{a}s)\,ds.   (11)

Looking at the characteristic lines the other way around, we can first fix a point (t, \vec{x}) \in \mathrm{IR}^1 \times \mathrm{IR}^n, determine an \vec{x}_0 at t = 0 from (9), and then (11) reads as

u(t, \vec{x}) = g(\vec{x} - \vec{a}t) + \int_0^t f(s, \vec{x} - \vec{a}(t - s))\,ds.
Figure 6.2. Characteristic lines: the line x(t) = x_0 + at starting at x_0 and passing through the point (x, t).
Motivation of the equation: Convection or transport is an important part of many partial differential equations, such as the neutron transport equation, the Boltzmann equation, fluid dynamics, etc.

The method used in (8)-(11) is called the method of characteristics. This method can be used to solve equation (7) when \vec{a} is a function of (t, \vec{x}), or even when \vec{a} is a function of u, making (7) a nonlinear first-order equation.
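A one-dimensional instance of formula (11) can be checked numerically. With a = 1, g(x) = e^{−x²}, and the illustrative source f(t, x) = x, the formula gives u(t, x) = g(x − t) + ∫₀ᵗ (x − (t − s)) ds = g(x − t) + xt − t²/2; the sketch below compares a Riemann sum of the characteristic integral with this closed form:

```python
import math

a = 1.0
g = lambda x: math.exp(-x * x)
f = lambda t, x: x                  # illustrative source term

def u_char(t, x, m=20000):
    """u(t,x) = g(x - a t) + integral_0^t f(s, x - a(t-s)) ds (midpoint)."""
    ds = t / m
    integral = sum(f((k + 0.5) * ds, x - a * (t - (k + 0.5) * ds)) * ds
                   for k in range(m))
    return g(x - a * t) + integral

t, x = 1.5, 0.8
exact = g(x - a * t) + x * t - t * t / 2
assert abs(u_char(t, x) - exact) < 1e-8
```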
6.2. Wave equation in IR1
Modeling: Imagine a piece of string stretched tightly (a taut string). We measure the speed of sound c = (T/\rho)^{1/2}, where T is the tension in the string and ρ is the linear mass density, both assumed constant. Given the initial position g(x) and initial velocity h(x),
we use a video camera to record its true motion, and a mathematical model with a computer to make a movie of the motion. We then compare the two videos. They can be made extremely close! I choose to present the mathematical solution formula only, while omitting the derivation of the model, although the derivation is important and very interesting.
We consider the wave equation (vibrating string equation)

\frac{\partial^2 u}{\partial t^2} - c^2\frac{\partial^2 u}{\partial x^2} = f(t, x), \quad t > 0, \ x \in \mathrm{IR}^1,   (1)

with initial conditions

u(0, x) = g(x), \quad \frac{\partial u}{\partial t}(0, x) = h(x).   (2)

Theorem (D'Alembert formula). A solution to (1)-(2) is

u(t, x) = \frac{1}{2}\big[g(x + ct) + g(x - ct)\big] + \frac{1}{2c}\int_{x-ct}^{x+ct} h(y)\,dy + \frac{1}{2c}\int_0^t \int_{x-c(t-s)}^{x+c(t-s)} f(s, y)\,dy\,ds.   (3)
Proof: We introduce the new coordinates

\xi = x + ct, \quad \eta = x - ct.   (4)

Then by the chain rule,

\frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 u}{\partial \xi^2} + 2\frac{\partial^2 u}{\partial \xi \partial \eta} + \frac{\partial^2 u}{\partial \eta^2}, \qquad \frac{\partial^2 u}{\partial t^2} = c^2\Big[\frac{\partial^2 u}{\partial \xi^2} - 2\frac{\partial^2 u}{\partial \xi \partial \eta} + \frac{\partial^2 u}{\partial \eta^2}\Big].

Thus (1) becomes

-4c^2\frac{\partial^2 u}{\partial \xi \partial \eta} = f(t, x), \quad \text{or} \quad \frac{\partial^2 u}{\partial \xi \partial \eta} = -\frac{1}{4c^2} f\Big(\frac{\xi - \eta}{2c}, \frac{\xi + \eta}{2}\Big).
Integrating twice, we find

u = F(\xi) + G(\eta) - \frac{1}{4c^2}\int_0^\xi \int_0^\eta f\Big(\frac{\xi' - \eta'}{2c}, \frac{\xi' + \eta'}{2}\Big)\,d\eta'\,d\xi'.   (5)

Thus

u(t, x) = F(x + ct) + G(x - ct) - \frac{1}{4c^2}\int_0^{x+ct} \int_0^{x-ct} f\Big(\frac{\xi' - \eta'}{2c}, \frac{\xi' + \eta'}{2}\Big)\,d\eta'\,d\xi'.   (6)

By fitting this formula to the initial data (2), we can determine F and G. By a further change of variables, we can manipulate the third term of (6) into the form in formula (3).
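For the case f = 0, h = 0, the d'Alembert solution reduces to u = ½[g(x + ct) + g(x − ct)]. A finite-difference check of the residual u_tt − c²u_xx in Python (wave speed, profile, and test points are illustrative):

```python
import math

c = 2.0
g = lambda x: math.exp(-x * x)
u = lambda t, x: 0.5 * (g(x + c * t) + g(x - c * t))  # d'Alembert, h = 0, f = 0

d = 1e-4                                              # finite-difference step
for (t, x) in [(0.5, 0.3), (1.0, -1.2), (2.0, 0.0)]:
    utt = (u(t + d, x) - 2 * u(t, x) + u(t - d, x)) / d ** 2
    uxx = (u(t, x + d) - 2 * u(t, x) + u(t, x - d)) / d ** 2
    assert abs(utt - c * c * uxx) < 1e-3              # wave equation residual
```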
6.3. Wave equation in IR3.
We consider the initial value problem for the homogeneous three-dimensional
wave equation

\frac{\partial^2 u}{\partial t^2} - c^2\Big[\frac{\partial^2 u}{\partial x_1^2} + \frac{\partial^2 u}{\partial x_2^2} + \frac{\partial^2 u}{\partial x_3^2}\Big] = 0, \quad t > 0,   (7)

u(0, x_1, x_2, x_3) = 0,   (8)

\frac{\partial u}{\partial t}(0, x_1, x_2, x_3) = h(x_1, x_2, x_3).   (9)
We let

\hat{u}(t, \omega_1, \omega_2, \omega_3) = \frac{1}{(2\pi)^{3/2}} \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} u(t, x_1, x_2, x_3)\, e^{i(\omega_1 x_1 + \omega_2 x_2 + \omega_3 x_3)}\,dx_1\,dx_2\,dx_3,

\hat{h}(\omega_1, \omega_2, \omega_3) = \frac{1}{(2\pi)^{3/2}} \int_{\mathrm{IR}^3} h(x_1, x_2, x_3)\, e^{i\omega \cdot x}\,dx.

That is, \hat{u} is the Fourier transform of u in \mathrm{IR}^3. Under this transform, equation (7) and conditions (8)-(9) become

\frac{\partial^2 \hat{u}}{\partial t^2} + c^2(\omega_1^2 + \omega_2^2 + \omega_3^2)\hat{u} = 0, \quad \hat{u}(0, \omega_1, \omega_2, \omega_3) = 0, \quad \frac{\partial \hat{u}}{\partial t}(0, \omega_1, \omega_2, \omega_3) = \hat{h}(\omega_1, \omega_2, \omega_3).   (10)
The problem has the solution

\hat{u}(t, \omega_1, \omega_2, \omega_3) = \hat{h}(\omega_1, \omega_2, \omega_3)\, \frac{\sin\big((\omega_1^2 + \omega_2^2 + \omega_3^2)^{1/2}\,ct\big)}{c\,(\omega_1^2 + \omega_2^2 + \omega_3^2)^{1/2}}.   (11)
By the inversion theorem,

u(t, x_1, x_2, x_3) = \frac{1}{(2\pi)^{3/2}} \int_{\mathrm{IR}^3} \hat{u}(\omega)\, e^{-i\vec{\omega}\cdot\vec{x}}\,d\vec{\omega},   (12)

and a series of hard calculations (see Weinberger, pp. 333-335), we end up with

u(t, x_1, x_2, x_3) = \frac{t}{4\pi(ct)^2} \int\!\!\int_{|y-x|=ct} h(y)\,dS_y
= \frac{t}{4\pi} \int_0^{2\pi}\!\!\int_0^{\pi} h(x_1 + ct\sin\varphi\cos\theta,\ x_2 + ct\sin\varphi\sin\theta,\ x_3 + ct\cos\varphi)\,\sin\varphi\,d\varphi\,d\theta.   (13)

Recall the spherical coordinates

x = r\sin\varphi\cos\theta, \quad y = r\sin\varphi\sin\theta, \quad z = r\cos\varphi,

where θ is the angle in the (x, y) plane and φ is the angle away from the z-axis. The solution (13) is t times the average of h over the sphere centered at (x_1, x_2, x_3) with radius ct.
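Formula (13) can be spot-checked by quadrature over the sphere. For the illustrative choice h(x) = |x|², the spherical average is |x|² + (ct)², so u = t(|x|² + c²t²); one verifies directly that this u satisfies (7)-(9). A Python sketch (grid size and tolerance are illustrative):

```python
import math

def u_spherical_mean(h, x, t, c, m=400):
    """Evaluate (t/4pi) * double integral of h over the sphere of radius c t
    about x, parametrized by spherical angles (midpoint rule)."""
    total = 0.0
    dphi, dth = math.pi / m, 2 * math.pi / m
    for i in range(m):
        phi = (i + 0.5) * dphi
        for j in range(m):
            th = (j + 0.5) * dth
            y = (x[0] + c * t * math.sin(phi) * math.cos(th),
                 x[1] + c * t * math.sin(phi) * math.sin(th),
                 x[2] + c * t * math.cos(phi))
            total += h(y) * math.sin(phi) * dphi * dth
    return t / (4 * math.pi) * total

h = lambda y: y[0] ** 2 + y[1] ** 2 + y[2] ** 2   # h(x) = |x|^2
x, t, c = (1.0, -0.5, 2.0), 0.7, 3.0
exact = t * (sum(v * v for v in x) + c * c * t * t)
assert abs(u_spherical_mean(h, x, t, c) - exact) < 1e-2
```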
It is interesting to note that a solution to

\frac{\partial^2 u}{\partial t^2} - c^2\Big(\frac{\partial^2 u}{\partial x_1^2} + \cdots + \frac{\partial^2 u}{\partial x_3^2}\Big) = 0, \quad u(0, x_1, x_2, x_3) = g(x_1, x_2, x_3), \quad \frac{\partial u}{\partial t}(0, x_1, x_2, x_3) = 0   (14)

is simply

u(t, x_1, x_2, x_3) = \frac{\partial}{\partial t}\Big[\frac{t}{4\pi(ct)^2} \int\!\!\int_{|y-x|=ct} g(y)\,dS_y\Big].   (15)
This can be seen by using the Fourier transform:

\frac{\partial^2 \hat{u}}{\partial t^2} + c^2(\omega_1^2 + \omega_2^2 + \omega_3^2)\hat{u} = 0, \quad \hat{u}(0, \omega_1, \omega_2, \omega_3) = \hat{g}(\omega_1, \omega_2, \omega_3), \quad \frac{\partial \hat{u}}{\partial t}(0, \omega_1, \omega_2, \omega_3) = 0,   (16)

which has the solution

\hat{u}(t, \omega_1, \omega_2, \omega_3) = \hat{g}\,\cos\big((\omega_1^2 + \omega_2^2 + \omega_3^2)^{1/2}\,ct\big).   (17)

Luckily we do not need to do any hard calculation to invert it, since

\hat{u}(t, \omega_1, \omega_2, \omega_3) = \frac{\partial}{\partial t}\Big[\hat{g}\,\frac{\sin\big((\omega_1^2 + \omega_2^2 + \omega_3^2)^{1/2}\,ct\big)}{c\,(\omega_1^2 + \omega_2^2 + \omega_3^2)^{1/2}}\Big],

and thus (similar to the process from (11) to (13))

u(t, x_1, x_2, x_3) = \frac{\partial}{\partial t}\Big[\hat{g}\,\frac{\sin\big((\omega_1^2 + \omega_2^2 + \omega_3^2)^{1/2}\,ct\big)}{c\,(\omega_1^2 + \omega_2^2 + \omega_3^2)^{1/2}}\Big]^{\vee} = \frac{\partial}{\partial t}\Big[\frac{t}{4\pi(ct)^2} \int\!\!\int_{|y-x|=ct} g(y)\,dS_y\Big].
A solution to the full initial value problem

\frac{\partial^2 u}{\partial t^2} - c^2\Big(\frac{\partial^2 u}{\partial x_1^2} + \cdots + \frac{\partial^2 u}{\partial x_3^2}\Big) = 0, \quad u(0, x_1, x_2, x_3) = g(x_1, x_2, x_3), \quad \frac{\partial u}{\partial t}(0, x_1, x_2, x_3) = h(x_1, x_2, x_3)   (18)

is the sum of the previous two solutions, which is called the Poisson formula:

u(t, x_1, x_2, x_3) = \frac{t}{4\pi(ct)^2} \int\!\!\int_{|y-x|=ct} h(y)\,dS_y + \frac{\partial}{\partial t}\Big[\frac{t}{4\pi(ct)^2} \int\!\!\int_{|y-x|=ct} g(y)\,dS_y\Big].   (19)
6.3. (Continued)

For the inhomogeneous problem

\frac{\partial^2 u}{\partial t^2} - c^2\Big(\frac{\partial^2 u}{\partial x_1^2} + \frac{\partial^2 u}{\partial x_2^2} + \frac{\partial^2 u}{\partial x_3^2}\Big) = f(t, x_1, x_2, x_3),   (1)

u(0, \vec{x}) = 0,   (2)

\frac{\partial u}{\partial t}(0, \vec{x}) = 0,   (3)

a solution is given by Duhamel's principle (Fritz John, PDE, p. 135):

u(t, \vec{x}) = \frac{1}{4\pi c^2} \int_0^t \frac{ds}{t - s} \int\!\!\int_{|y-x|=c(t-s)} f(s, \vec{y})\,dS_y.   (4)
Duhamel's principle: Fix a time t > 0. Replace the force f(s, \vec{x}), s \in [0, t], by acquired velocities at the times

0 = s_1 < s_2 < s_3 < \cdots < s_n < s_{n+1} = t,

and consider w_i(s, \vec{x}):

\frac{\partial^2 w_i}{\partial s^2} - c^2\Big(\frac{\partial^2 w_i}{\partial x_1^2} + \frac{\partial^2 w_i}{\partial x_2^2} + \frac{\partial^2 w_i}{\partial x_3^2}\Big) = 0, \quad s > s_i,   (5)

w_i(s_i, \vec{x}) = 0,   (6)

\frac{\partial w_i}{\partial s}(s_i, \vec{x}) = f(s_i, \vec{x})(s_{i+1} - s_i).   (7)

The solution w_i(s, \vec{x}), which we take to be zero for s < s_i, is the part of the displacement u(t, \vec{x}) that results from the pulse force f(s, \vec{x}) during the time interval [s_i, s_{i+1}], which is equivalent to a velocity f(s_i, \vec{x})(s_{i+1} - s_i). The final total displacement u(t, \vec{x}) is, by superposition,

u(t, \vec{x}) = \sum_{i=1}^{n} w_i(t, \vec{x}).   (8)

Letting n → ∞ with all s_{i+1} - s_i → 0, the approximation becomes exact. We can solve (5)-(7) just as before (Poisson formula):

w_i(s, \vec{x}) = \frac{1}{4\pi c^2 (s - s_i)} \int\!\!\int_{|\vec{y}-\vec{x}|=c(s-s_i)} (s_{i+1} - s_i)\, f(s_i, \vec{y})\,dS_y, \quad s > s_i.
Details are in John, PDE, p.135.
Applications: Maxwell's equations of electromagnetism ( ~E, ~B) in vacuum yield

∂²~E/∂t² − c²(∂²~E/∂x1² + ∂²~E/∂x2² + ∂²~E/∂x3²) = 0,
∂²~B/∂t² − c²(∂²~B/∂x1² + ∂²~B/∂x2² + ∂²~B/∂x3²) = 0,

where c = (ε0 μ0)^{−1/2} is the speed of light in vacuum. Light travels more slowly in a material medium such as air or glass.
6.4. Hadamard’s method of descent.
In IR², the wave equation

∂²u/∂t² − c²(∂²u/∂x1² + ∂²u/∂x2²) = 0,
u(0, x1, x2) = g(x1, x2),
∂u/∂t(0, x1, x2) = h(x1, x2),

can be regarded as a problem in IR³ in which u(t, x1, x2, x3) is independent of the third variable x3. In this way, we find that the spherical integrals over the sphere

|~y − ~x| = ((y1 − x1)² + (y2 − x2)² + y3²)^{1/2} = ct

can be changed into top and bottom integrals over the disk

(y1 − x1)² + (y2 − x2)² < (ct)².

Thus

u(t, x1, x2) = (1 / (2πc)) ∫∫_{r<ct} h(y1, y2) / (c²t² − r²)^{1/2} dy1 dy2
             + ∂/∂t [ (1 / (2πc)) ∫∫_{r<ct} g(y1, y2) / (c²t² − r²)^{1/2} dy1 dy2 ],

where r = ((x1 − y1)² + (x2 − y2)²)^{1/2}.
Figure 6.4.1. Integrals over the sphere become integrals over the disk; the sphere |~y − ~x| = ct projects onto the disk via y3 = ±((ct)² − r²)^{1/2}.
Nonlinear wave equations
Large amplitude:

u_tt − c² ( u_x / (1 − u_x²)^{1/2} )_x = 0.

In liquid crystals:

u_tt − c(u)(c(u) u_x)_x = 0,

where c(u) = (α cos²u + β sin²u)^{1/2}, with α > 0, β > 0 physical elastic constants.
These nonlinear equations do not have solution formulas.
6.5. Heat equation in IRⁿ and IR1+.
Modeling heat conduction. (Keener, p. 380.)
We propose to study heat conduction in a material, for example, a gas or a metal. Let u(t, ~x) be the temperature. Then the total thermal energy in a region Ω is

∫_Ω ρ c u(t, ~x) d~x,

where ρ(t, ~x) is the mass density and c is the heat capacity (energy per unit mass). Let ~q be the heat flux (energy per unit area per unit time), and let f(t, ~x) be the heat production (energy per unit volume per unit time). Then "conservation of energy" reads

d/dt ∫_Ω ρcu(t, ~x) d~x = ∫_Ω f(t, ~x) d~x − ∫_{∂Ω} ~q · ~n dS,

where ~n is the unit outward normal to the boundary ∂Ω. Physical experiments show that Fourier's law of heat conduction,

~q = −k∇u,

holds for many common materials. By the Gauss divergence theorem (see Chapter 1.6), we thus have

∫_Ω [ ∂/∂t(ρcu) − f ] d~x = k ∫_{∂Ω} ∇u · ~n dS = k ∫_Ω div(∇u) d~x.

Or

∫_Ω [ ∂/∂t(ρcu) − k div(∇u) − f ] d~x = 0

for all Ω. Thus

∂/∂t(ρcu) = k div(∇u) + f.

Assume ρ and c are constants, and let D = k/(ρc) (the diffusion coefficient). Then

∂u/∂t = D∆u + f/(ρc),

where

∆ = div ∇ = Σ_{i=1}^{n} ∂i²

is called the Laplacian. We can use the scaled time τ = Dt, so that the scaled equation is

∂u/∂τ = ∆u + f/(ρcD).
Solution. Consider

∂u/∂t = ∆u,   u(0, ~x) = g(~x).

Use the Fourier transform in IRⁿ:

û(t, ~ω) = (2π)^{−n/2} ∫_{IRⁿ} u(t, ~x) e^{iω·~x} d~x.

Then

∂û/∂t = −(ω1² + · · · + ωn²) û,   û(0, ~ω) = ĝ.

Thus

û = ĝ e^{−(ω1²+···+ωn²)t} = ĝ [ (2t)^{−n/2} e^{−(x1²+···+xn²)/(4t)} ]^∧.

By the inversion theorem,

u = (û)^∨ = (2π)^{−n/2} g ∗ (2t)^{−n/2} e^{−(x1²+···+xn²)/(4t)}
  = (4πt)^{−n/2} ∫_{−∞}^{∞} · · · ∫_{−∞}^{∞} g(y1, · · · , yn) e^{−((x1−y1)²+···+(xn−yn)²)/(4t)} dy1 dy2 · · · dyn.   (1)
Theorem 6.5.1. A solution to

∂u/∂t = k∆u,   u(0, x) = g(x),   x ∈ IRⁿ

is

u(t, x) = (4πkt)^{−n/2} ∫_{IRⁿ} g(y) e^{−|x−y|²/(4kt)} dy.

Theorem 6.5.2. A solution to

∂u/∂t = k∆u + f(t, x),   u(0, ~x) = 0

is

u(t, x) = ∫₀^t (4πk(t − τ))^{−n/2} ∫_{IRⁿ} f(τ, y) e^{−|x−y|²/(4k(t−τ))} dy dτ

(Duhamel's principle).
The heat equation in IR1+ is a homework problem.
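Two properties of the kernel in Theorem 6.5.1 are easy to verify numerically (a NumPy sketch, illustrative only): it integrates to 1 (heat is conserved), and it satisfies the heat equation itself. Here is a one-dimensional check:

```python
import numpy as np

k, t = 0.7, 0.3

def K(tt, x):
    # Heat kernel of Theorem 6.5.1 in one dimension (n = 1)
    return (4 * np.pi * k * tt) ** -0.5 * np.exp(-x**2 / (4 * k * tt))

# Total heat is conserved: the kernel integrates to 1.
x = np.linspace(-20.0, 20.0, 400001)
mass = np.sum(K(t, x)) * (x[1] - x[0])
assert abs(mass - 1.0) < 1e-6

# The kernel solves u_t = k u_xx; check the residual by finite differences.
h = 1e-4
ut = (K(t + h, 0.5) - K(t - h, 0.5)) / (2 * h)
uxx = (K(t, 0.5 + h) - 2 * K(t, 0.5) + K(t, 0.5 - h)) / h**2
assert abs(ut - k * uxx) < 1e-5
```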
Notes. 1. The multi-dimensional Fourier transform is equivalent to the one-dimensional Fourier transform applied repeatedly. For example, the two-dimensional Fourier transform is

û(ω1, ω2) = (1/2π) ∫∫ u(x1, x2) e^{i x1 ω1 + i x2 ω2} dx1 dx2
         = (2π)^{−1/2} ∫ [ (2π)^{−1/2} ∫ u(x1, x2) e^{i ω2 x2} dx2 ] e^{i ω1 x1} dx1
         = (2π)^{−1/2} ∫ û(x1, ω2) e^{i ω1 x1} dx1 = (û(x1, ω2))^∧(ω1, ω2).   (2)
2. Transform formula: It follows easily from the one-dimensional one:

(e^{−β(x1²+x2²+···+xn²)})^∧ = (e^{−βx1²})^∧ (e^{−βx2²})^∧ · · · (e^{−βxn²})^∧
  = (2β)^{−1/2} e^{−ω1²/(4β)} · (2β)^{−1/2} e^{−ω2²/(4β)} · · · (2β)^{−1/2} e^{−ωn²/(4β)}
  = (2β)^{−n/2} e^{−(ω1²+ω2²+···+ωn²)/(4β)}.

3. Inverse transform: Definition:

u^∨ = (2π)^{−n/2} ∫_{IRⁿ} u(x) e^{−iω·x} dx.

4. Inversion Theorem: It follows easily from the one-dimensional one:

(u^∧)^∨ = u.

5. Convolution formula: Definition:

f ∗ g(~x) = ∫∫ · · · ∫ f(x1 − y1, x2 − y2, · · · , xn − yn) g(y1, · · · , yn) dy1 · · · dyn.

Formula: It follows easily from the one-dimensional one:

(f ∗ g)^∧(~ω) = (2π)^{n/2} f̂ ĝ.

Inverse: It follows easily from the one-dimensional one:

f ∗ g = (2π)^{n/2} (f̂ ĝ)^∨.
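The one-dimensional Gaussian transform formula in Note 2 can be confirmed by direct quadrature in the convention of these notes (a NumPy sketch; the sample values of β and ω are ours):

```python
import numpy as np

beta, omega = 1.5, 2.0
x = np.linspace(-15.0, 15.0, 300001)
dx = x[1] - x[0]
# Forward transform in the convention of these notes:
#   u^(omega) = (2 pi)^(-1/2) * integral of u(x) e^{i omega x} dx
uhat = (2 * np.pi) ** -0.5 * np.sum(np.exp(-beta * x**2) * np.exp(1j * omega * x)) * dx
# Note 2 predicts (e^{-beta x^2})^ = (2 beta)^(-1/2) e^{-omega^2/(4 beta)}.
predicted = (2 * beta) ** -0.5 * np.exp(-omega**2 / (4 * beta))
assert abs(uhat - predicted) < 1e-6
```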
6.6. Laplace and Poisson equations in IR² and IR³.
Laplace equation:

∆u = ∂²u/∂x1² + ∂²u/∂x2² + · · · + ∂²u/∂xn² = 0 in IRⁿ.

It describes the temperature distribution in equilibrium (time-independent). Solutions include any linear function a1 x1 + a2 x2 + · · · + an xn, the quadratic functions x1² − x2² and x1 x2, and many higher-order polynomials. These solutions are called harmonic functions. But the only bounded solutions are the constants.
Poisson equation:

∆u = f(~x) in IRⁿ.

We assume that |f(~x)| → 0 as |x| → ∞, and we look for bounded solutions only. Try the Fourier transform:

û(~ω) = (2π)^{−n/2} ∫_{IRⁿ} u(x) e^{iω·x} dx.

Then

−|ω|² û = f̂,   û = −(1/|ω|²) f̂.

Inversion: In IR³ only (skipped because it needs distribution theory, see next semester):

((4π|x|)^{−1})^∧ = (2π)^{−3/2} (1/|ω|²).

The convolution formula yields

u = −(1/(4π|x|)) ∗ f,

i.e.,

u(x) = −(1/4π) ∫_{IR³} f(y)/|x − y| dy   in IR³ only.
Another method:
Motivation: We know that

div ~E = q,

i.e., the divergence of an electric field is the charge density. For an electric potential A such that

~E = ∇A,

there holds

∆A = q.

Consider the ideal case where q is a Dirac delta:

∆A = δ(x).

We can look for a spherically symmetric solution

A(~x) = A(|~x|).

Recall that in IRⁿ (Chapter 1, Section 1.16.2)

∆A(r) = ∂²A/∂r² + ((n − 1)/r) ∂A/∂r.

At r ≠ 0, we have

∂²A/∂r² + ((n − 1)/r) ∂A/∂r = 0.   (1)

In IR³, we can integrate the above equation in the form

(1/r²) ∂/∂r [ r² ∂A/∂r ] = 0.

Then

r² ∂A/∂r = C,   ∂A/∂r = C/r²,   A = C1 − C/r.

The constant C1 is arbitrary. But the constant C is determined by the strength of the δ(x):

∫_{|x|<1} ∆A dx = ∫ δ(x) dx = 1.

A calculation can be done to find (skipped in class, but see Keener p. 341, or later in these lecture notes):

C = 1/(4π).

Thus

A(r) = −1/(4πr) + C1.

The function

A(r) = −1/(4πr)

is called the fundamental solution to the Laplace equation in IR³.
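Both defining properties of A(r) = −1/(4πr) can be checked symbolically (a SymPy sketch, illustrative only): it is harmonic away from r = 0, and the flux of ∇A through any sphere equals 1, which is the normalization that fixed C:

```python
import sympy as sp

r = sp.symbols('r', positive=True)
A = -1 / (4 * sp.pi * r)                                      # candidate fundamental solution
# Radial Laplacian in IR^3: A'' + (2/r) A'
radial_laplacian = sp.diff(A, r, 2) + (2 / r) * sp.diff(A, r)
assert sp.simplify(radial_laplacian) == 0                     # harmonic for r != 0

# Flux of grad A through the sphere of radius r: 4 pi r^2 * A'(r),
# which must equal int delta(x) dx = 1.
flux = sp.simplify(4 * sp.pi * r**2 * sp.diff(A, r))
assert flux == 1
```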
The constant C. We find the constant C in A(r) here. We use

fε(r) = 1/((4/3)πε³) for r < ε,   fε(r) = 0 for r > ε

to approximate the δ(x). Then equation (1) becomes

(1/r²) ∂/∂r [ r² ∂A/∂r ] = fε(r).

We integrate and use the condition r² ∂A/∂r |_{r=0} = 0 to find

r² ∂A/∂r − 0 = ∫₀^r s² fε(s) ds = (1/((4/3)πε³)) (r³/3) for r < ε,   = 1/(4π) for r ≥ ε.

Letting ε → 0, we conclude that

r² ∂A/∂r = 1/(4π) for r ≥ 0,   ∂A/∂r = 1/(4πr²),   A = −1/(4πr) + C1.
6.6. (Continued)
We have found a solution to

∆u = f(x) in IR³

in the form

u(x) = −(1/4π) ∫∫∫_{IR³} f(y)/|x − y| dy

via the Fourier transform, but we were not able to invert the function 1/|ω|². We then started another approach: finding a radial solution A(x) = A(|x|) to

∆A = δ(x).

We found

A(x) = −1/(4π|x|) + C1.

We call

A(x) = −1/(4π|x|)

the fundamental solution to the Laplacian in IR³. We explain here why it is called the fundamental solution.
This is why: we can see by translation that a solution to

∆U = δ(x − x0)

is

U = −1/(4π|x − x0|).

Further, a solution to

∆W = f(x0) δ(x − x0)

is

W = −f(x0)/(4π|x − x0|).

Integrating the W equation in x0, we find

∆ ∫_{IRⁿ} W dx0 = ∫_{IRⁿ} f(x0) δ(x − x0) dx0 = f(x).   (1)

Thus

u(x) = −∫_{IRⁿ} f(x0)/(4π|x − x0|) dx0

is a solution to

∆u = f(x).

Thus the general solution to ∆u = f(x) is given by the convolution of f(x) with the fundamental solution.
We remark that the integration step (1) is generally known as superposition.
We remark that the integration step (1) is generally known as Superposition.
In IR²: We follow the same idea as in IR³. We consider a radial solution A(|x|) to

∆A = δ(x) in IR².   (2)

We know that

∆A = ∂²A/∂r² + (1/r) ∂A/∂r

in IR². Thus

∆A = (1/r) ∂/∂r [ r ∂A/∂r ].

At r ≠ 0, equation (2) is

(1/r) ∂/∂r [ r ∂A/∂r ] = 0.

So

r ∂A/∂r = C.

Or

A = C ln r + C1.

We need to find the constant C; the other constant C1 is not important and we set it to zero. To do so, we approximate δ(x) in IR² by

Fε(x) = 1/(πε²) for |x| < ε,   Fε(x) = 0 for |x| ≥ ε,

and consider a radial solution to

∆A(x) = Fε(x).
We find

(1/r) ∂/∂r [ r ∂A/∂r ] = Fε(r),

or

∂/∂r [ r ∂A/∂r ] = r Fε(r).   (3)

Since A(|x|) is radial, we expect and assume ∂A/∂r = 0 at r = 0. Then we integrate (3) in r over [0, r] to find

r ∂A/∂r = ∫₀^r s Fε(s) ds = (1/(πε²))(r²/2) for r < ε,   = 1/(2π) for r ≥ ε.

Now let ε → 0; we find

r ∂A/∂r = 1/(2π) for r ≥ 0.

Thus

A = (1/2π) ln r + C1.

As usual, we set C1 = 0. By superposition we find that a solution to

∆u = f

in IR² is

u(x) = A(x) ∗ f(x) = (1/2π) ∫_{IR²} ln|x − x0| f(x0) dx0.
Application: Biot-Savart law (Keener, p. 342)
Consider an incompressible fluid with velocity ~u:

div ~u = 0,

and vorticity

~ω = curl ~u,

in IR³. Introduce a vector potential ~A (whose existence follows from div ~u = 0) such that

~u = curl ~A

and

div ~A = 0.

Then

~ω = curl (curl ~A) = ∇(div ~A) − ∆~A = −∆~A.

Using the solution formula for the Poisson equation, we have

~A = (1/4π) ∫ ~ω(y)/|x − y| dy.

Thus

~u(x) = curl ~A = (1/4π) ∫ curl_x ( ~ω(y)/|x − y| ) dy
      = (1/4π) ∫ ∇_x(1/|x − y|) × ~ω(y) dy
      = (1/4π) ∫ ~ω(~y) × (~x − ~y)/|~x − ~y|³ d~y.

This is the Biot-Savart law in IR³, which relates the vorticity ~ω to the velocity ~u.
This is the Biot-Savart Law in IR3, which relates vorticity ~ω to velocity ~u.
Biot-Savart law in IR².
Consider an incompressible fluid in IR² with velocity ~u:

div ~u = 0.

There exists a stream function (a scalar function) ψ such that

~u = (u1, u2) = (∂x2 ψ, −∂x1 ψ).

The vorticity of ~u is then a scalar ω, which is the third component of the physical vorticity ~ω. We have

ω = curl ~u = ∂x1 u2 − ∂x2 u1 = −∂x1 ∂x1 ψ − ∂x2 ∂x2 ψ = −∆ψ.

Thus

ψ = −(1/2π) ∫_{IR²} ln|x − y| ω(y) dy.

Hence

~u = −(1/2π) ∫_{IR²} (x2 − y2, y1 − x1)/|~x − ~y|² ω(~y) d~y
   = −(1/2π) ∫_{IR²} (y2, −y1)/|~y|² ω(~x − ~y) d~y
   = (1/2π) ∫_{IR²} ~y⊥/|~y|² ω(~x − ~y) d~y,

where ~y⊥ = (−y2, y1). See Figure 6.6.1 for ~y⊥, which is ~y rotated 90° counterclockwise.
Figure 6.6.1. The relation between ~y⊥ = (−y2, y1) and ~y = (y1, y2).
6.7 Concept of Fundamental Solutions.
1. The solution to

∆u = δ(x) in IRⁿ

is called the fundamental solution to the Laplace equation. This δ(x) may be regarded as a point charge.
2. The solution to

∂u/∂t − ∆u = 0,   u(0, x) = δ(x),

which is

F(t, x) = (4πt)^{−n/2} e^{−|x|²/(4t)},

is called the fundamental solution to the heat equation. This δ(x) may be regarded as a point source of heat.
B. PDE on rectangular domains, separation of variables.
6.8. Laplace equation in a rectangle, Fourier series.
We want to solve the Dirichlet boundary value problem for the Laplace equation in a rectangle:

∂²u/∂x² + ∂²u/∂y² = 0,   0 < x < L, 0 < y < H,
u(0, y) = g1(y),
u(L, y) = g2(y),
u(x, 0) = g3(x),
u(x, H) = g4(x).
(1)

By superposition, we can split problem (1) into four similar problems, each of which satisfies one of the four boundary functions and the zero boundary condition on the other three sides. For example, let us find u1:

∂²u1/∂x² + ∂²u1/∂y² = 0,
u1(0, y) = g1(y),
u1(L, y) = 0,
u1(x, 0) = 0,
u1(x, H) = 0.
(2)
We use the method of separation of variables. Let u1(x, y) = X(x)Y(y). Then the Laplace equation can be written as

X''(x)/X(x) = −Y''(y)/Y(y).   (3)

Since the two sides of (3) are functions of different variables, we conclude that they must equal a common constant, which we call λ. Thus (3) becomes

X'' = λX,   (4)
Y'' = −λY.   (5)

We let Y(y) satisfy the boundary conditions

Y(0) = Y(H) = 0.   (6)

Then the Y equation (5) with the boundary conditions (6) has the solutions

Y = c sin(nπy/H)   (7)

for

λ = (nπ/H)²,   (8)

where n = 1, 2, · · · . For the λ in (8), the general solution to (4) is

X(x) = E e^{nπx/H} + F e^{−nπx/H},

or equivalently

X(x) = a1 cosh[(nπ/H)(x − L)] + a2 sinh[(nπ/H)(x − L)].

The shift in x by L is chosen so that the boundary condition at x = L is satisfied conveniently. We impose X(L) = 0, which implies a1 = 0. Thus

X(x) = a2 sinh[(nπ/H)(x − L)].

In summary, solutions u of the product form X(x)Y(y) satisfying the zero condition on the three corresponding sides of (2) are

u(x, y) = a2 sinh[(nπ/H)(x − L)] sin(nπy/H)

for any constant a2 and all n = 1, 2, 3, · · · . By superposition,

u(x, y) = Σ_{n=1}^{∞} an sinh[(nπ/H)(x − L)] sin(nπy/H)   (9)

is also a solution to the Laplace equation with zero value on the three zero-value sides of the rectangle in (2), for any real numbers an. We want to choose an such that u in (9) satisfies the fourth, nonzero boundary condition:

u(0, y) = Σ_{n=1}^{∞} an sinh[−nπL/H] sin(nπy/H) = g1(y).
It is known that any smooth function g1(y) with g1(0) = g1(H) = 0 can be expressed as a Fourier sine series

g1(y) = Σ_{n=1}^{∞} An sin(nπy/H),   0 ≤ y ≤ H,   (10)

where

An = (2/H) ∫₀^H g1(y) sin(nπy/H) dy.   (11)

Thus, we can take

an = An / sinh[−nπL/H]

so that (9) satisfies all four boundary conditions. Hence we find a solution to (2):

u(x, y) = Σ_{n=1}^{∞} ( (2/(H sinh[−nπL/H])) ∫₀^H g1(η) sin(nπη/H) dη ) sinh[(nπ/H)(x − L)] sin(nπy/H).   (12)
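The construction above can be sketched numerically (NumPy; the helper names and the sample data g1(y) = sin(πy/H) are ours). For that g1, only the n = 1 coefficient survives, so the truncated series (12) should reproduce the boundary data at x = 0 and vanish at x = L:

```python
import numpy as np

L, H = 2.0, 1.0

def fourier_sine_coeff(g, n, m=20000):
    # Midpoint-rule approximation of (11)
    y = (np.arange(m) + 0.5) * H / m
    return (2.0 / H) * np.sum(g(y) * np.sin(n * np.pi * y / H)) * (H / m)

def u(x, y, g1, N=30):
    """Truncated series (12): solution with data g1 on the side x = 0."""
    total = 0.0
    for n in range(1, N + 1):
        An = fourier_sine_coeff(g1, n)
        an = An / np.sinh(-n * np.pi * L / H)
        total += an * np.sinh(n * np.pi / H * (x - L)) * np.sin(n * np.pi * y / H)
    return total

g1 = lambda y: np.sin(np.pi * y / H)
assert abs(u(0.0, H / 2, g1) - 1.0) < 1e-6   # matches the boundary data at x = 0
assert abs(u(L, 0.3, g1)) < 1e-12            # zero on the opposite side x = L
```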
Notes.
1. The cosh and sinh functions are cosh x = (eˣ + e⁻ˣ)/2 and sinh x = (eˣ − e⁻ˣ)/2.
2. Problem (5)-(6) is called an eigenvalue problem, where we need to find both a number λ and a nonzero solution Y(y).
3. The Fourier sine series (10)-(11) is a special case of the general Fourier series

f(x) = a0 + Σ_{n=1}^{∞} an cos(nπx/L) + Σ_{n=1}^{∞} bn sin(nπx/L),   −L ≤ x ≤ L,

where

a0 = (1/(2L)) ∫_{−L}^{L} f(x) dx,
an = (1/L) ∫_{−L}^{L} f(x) cos(nπx/L) dx,
bn = (1/L) ∫_{−L}^{L} f(x) sin(nπx/L) dx.

(For a proof, see the Dirichlet-Jordan Convergence Theorem, p. 164, Sect. 4.5.1, Keener.)
4. Other boundary conditions, such as the Neumann boundary condition, can be handled similarly (see homework).
5. The nonhomogeneous equation (Poisson equation) can be solved as well (see next lecture).
6. The Laplace equation in a disk can be solved by separation of variables, in addition to the complex-variables method.
Excerpt: Story of Fourier
Joseph Fourier’s father was a tailor in Auxerre. After the death of his first wife,
with whom he had three children, he remarried and Joseph was the ninth of the
twelve children of this second marriage. Joseph’s mother died when he was nine
years old and his father died the following year.
It was during his time in Grenoble that Fourier did his important mathematical
work on the theory of heat. His work on the topic began around 1804 and by 1807 he
had completed his important memoir On the Propagation of Heat in Solid Bodies.
The memoir was read to the Paris Institute on 21 December 1807 and a committee
consisting of Lagrange, Laplace, and others was set up to report on the work. Now
this memoir is very highly regarded but at the time it caused controversy.
There were two reasons for the committee to feel unhappy with the work. The
first objection, made by Lagrange and Laplace in 1808, was to Fourier’s expansions
of functions as trigonometrical series, what we now call Fourier series. Further
clarification by Fourier still failed to convince them.
The second objection was made by Biot against Fourier’s derivation of the equa-
tions of transfer of heat. Fourier had not made reference to Biot’s 1804 paper on
this topic but Biot’s paper is certainly incorrect. Laplace, and later Poisson, had
similar objections.
The Institute set as a prize competition subject the propagation of heat in solid
bodies for the 1811 mathematics prize. Fourier submitted his 1807 memoir together
with additional work on the cooling of infinite solids and terrestrial and radiant heat.
Only one other entry was received, and the committee set up to decide on the award of the prize (Lagrange, Laplace, Malus, Haüy, and Legendre) awarded Fourier the prize. The report was not, however, completely favourable and states:
... the manner in which the author arrives at these equations is not exempt of
difficulties and that his analysis to integrate them still leaves something to be desired
on the score of generality and even rigour.
With this rather mixed report there was no move in Paris to publish Fourier’s
work. His work was later published in 1822.
6.9. Poisson equation in a rectangle.
We consider

∂²u/∂x² + ∂²u/∂y² = f(x, y),   0 < x < L, 0 < y < H,   (1)
u(0, y) = u(L, y) = 0,   0 < y < H,   (2)
u(x, 0) = u(x, H) = 0,   0 < x < L.   (3)

We propose the eigenvalue problem

∂²u/∂x² + ∂²u/∂y² = −λu,   0 < x < L, 0 < y < H,
u(0, y) = u(L, y) = 0,   0 < y < H,
u(x, 0) = u(x, H) = 0,   0 < x < L.
(4)

Using separation of variables, we find that

u(x, y) = sin(nπx/L) sin(mπy/H),   λ = (nπ/L)² + (mπ/H)²,   (5)

where n = 1, 2, · · · and m = 1, 2, · · ·, are solutions to (4). Using the Fourier sine series theorem in the x-variable first and then in the y-variable, we can expand f(x, y) as

f(x, y) = Σ_{n=1}^{∞} Σ_{m=1}^{∞} Bnm sin(nπx/L) sin(mπy/H),   (6)

where

Bnm = (4/(LH)) ∫₀^L ∫₀^H f(x, y) sin(nπx/L) sin(mπy/H) dx dy.   (7)

Thus a solution to (1)-(3) is

u(x, y) = −Σ_{n=1}^{∞} Σ_{m=1}^{∞} ( Bnm / ((nπ/L)² + (mπ/H)²) ) sin(nπx/L) sin(mπy/H),   (8)

where the Bnm are given in (7). We can verify that (8) is indeed a solution:

∆u = −Σ_{n=1}^{∞} Σ_{m=1}^{∞} ( Bnm / ((nπ/L)² + (mπ/H)²) ) ∆( sin(nπx/L) sin(mπy/H) )
   = Σ_{n=1}^{∞} Σ_{m=1}^{∞} Bnm sin(nπx/L) sin(mπy/H)
   = f(x, y).
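The verification can also be done numerically (a NumPy sketch, illustrative only): for f = sin(πx/L) sin(πy/H), the series (8) has the single term −f/λ11, and a five-point finite-difference Laplacian of that term should reproduce f:

```python
import numpy as np

L, H = 1.0, 1.0
lam = (np.pi / L) ** 2 + (np.pi / H) ** 2   # lambda_{11} from (5)

def u(x, y):
    # For f = sin(pi x/L) sin(pi y/H), B_11 = 1 and all other B_nm vanish,
    # so the series (8) reduces to a single term.
    return -np.sin(np.pi * x / L) * np.sin(np.pi * y / H) / lam

x0, y0, h = 0.37, 0.61, 1e-4
lap = (u(x0 + h, y0) + u(x0 - h, y0) + u(x0, y0 + h) + u(x0, y0 - h) - 4 * u(x0, y0)) / h**2
f = np.sin(np.pi * x0 / L) * np.sin(np.pi * y0 / H)
assert abs(lap - f) < 1e-4
```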
Note: Let Ω be a general domain in IRⁿ. Consider

−∆u = λu in Ω,   u = 0 on ∂Ω.   (9)

In general there is no explicit formula for u or λ. But a general theorem tells us that there exist λ1 < λ2 < · · · and corresponding u1, u2, · · · such that (λj, uj) are solutions to the eigenvalue problem (9), and any f(~x) can be written as

f(~x) = Σ_{n=1}^{∞} Bn un.   (10)

Thus a solution to

∆u = f(x)

is

u = −Σ_{n=1}^{∞} (Bn/λn) un.   (11)

So the study of the eigenvalue problem is very useful. The expansion (10) is called the eigenfunction expansion. The equation

−∆u = λu   (12)

is called the Helmholtz equation.
The eigenfunctions un are orthogonal in the sense that

∫_Ω un(x) um(x) dx = 0 if n ≠ m.   (13)

This is why we can determine the coefficients Bn quickly:

Bn = ∫_Ω f(x) un(x) dx / ∫_Ω (un(x))² dx   (14)

for all n = 1, 2, · · · .
6.10. Heat equation in a rectangle.
6.10.1. Initial Value Problem
Consider

∂u/∂t = k ∂²u/∂x²,   0 < x < L,
u(t, 0) = u(t, L) = 0,
u(0, x) = g(x).
(1)

We try separation of variables:

u(t, x) = φ(x)G(t).   (2)

Then

G'(t)/(kG(t)) = φ''(x)/φ(x).

Thus we set both sides equal to a common constant −λ:

G'(t) = −λkG(t),   (3)
φ''(x) = −λφ(x).   (4)

For φ satisfying

φ(0) = φ(L) = 0,   (5)

we find the solutions to the eigenvalue problem (4)-(5):

φ(x) = c sin(nπx/L),   λ = (nπ/L)²,   n = 1, 2, · · · .   (6)

So we have solutions

u(t, x) = Σ_{n=1}^{∞} Cn e^{−k(nπ/L)²t} sin(nπx/L).   (7)

We need (7) to satisfy the initial condition

u(0, x) = Σ_{n=1}^{∞} Cn sin(nπx/L) = g(x).

By the Fourier sine series, we need only take

Cn = (2/L) ∫₀^L g(x) sin(nπx/L) dx.

So a complete solution to (1) is

u(t, x) = Σ_{n=1}^{∞} ( (2/L) ∫₀^L g(x) sin(nπx/L) dx ) e^{−k(nπ/L)²t} sin(nπx/L).
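The series solution is easy to sketch numerically (NumPy; helper names ours). For g(x) = sin(πx/L), only the n = 1 mode survives, so the series should equal exp(−k(π/L)²t) sin(πx/L):

```python
import numpy as np

k, L = 0.5, 1.0

def heat_solution(t, x, g, N=50, m=20000):
    # Truncated series solution; Cn computed by the midpoint rule.
    xs = (np.arange(m) + 0.5) * L / m
    total = 0.0
    for n in range(1, N + 1):
        Cn = (2.0 / L) * np.sum(g(xs) * np.sin(n * np.pi * xs / L)) * (L / m)
        total += Cn * np.exp(-k * (n * np.pi / L) ** 2 * t) * np.sin(n * np.pi * x / L)
    return total

g = lambda x: np.sin(np.pi * x / L)
t, x = 0.2, 0.3
exact = np.exp(-k * (np.pi / L) ** 2 * t) * np.sin(np.pi * x / L)
assert abs(heat_solution(t, x, g) - exact) < 1e-6
```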
6.10.2. Inhomogeneous Problem
Consider

∂u/∂t = k ∂²u/∂x² + Q(t, x),   0 < x < L,
u(t, 0) = u(t, L) = 0,
u(0, x) = 0.
(8)

Let us use the eigenfunctions and propose a solution in the form

u(t, x) = Σ_{n=1}^{∞} Cn(t) sin(nπx/L),   Cn(0) = 0.

We expand Q(t, x) as

Q(t, x) = Σ_{n=1}^{∞} qn(t) sin(nπx/L).

We note that Q(t, x) may not vanish at the boundaries, but the expansion is still valid in the L²(0, L) sense. Then (8) can be projected onto the component sin(nπx/L):

Cn'(t) = −k(nπ/L)² Cn + qn(t).

We find

Cn(t) = e^{−k(nπ/L)²t} ∫₀^t e^{k(nπ/L)²s} qn(s) ds.

Thus a solution to (8) is

u(t, x) = Σ_{n=1}^{∞} e^{−k(nπ/L)²t} ( ∫₀^t e^{k(nπ/L)²s} qn(s) ds ) sin(nπx/L),

where

qn(t) = (2/L) ∫₀^L Q(t, x) sin(nπx/L) dx,   n = 1, 2, · · · .
6.10.3. Boundary Value Problem
Consider

∂u/∂t = k ∂²u/∂x²,   0 < x < L,
u(0, x) = 0,
u(t, 0) = a(t),
u(t, L) = b(t).
(9)

We use the new variable

V = u − [a(t) + (x/L)(b(t) − a(t))].

Then

∂V/∂t = k ∂²V/∂x² − [a'(t) + (x/L)(b'(t) − a'(t))],   0 < x < L,
V(0, x) = −[a(0) + (x/L)(b(0) − a(0))],
V(t, 0) = 0,
V(t, L) = 0.

This can be solved by the previous two steps.
We shall solve the heat equation in a two-dimensional rectangular domain next time.
6.11. Wave equation in a rectangle.
6.11.1 Vibrating string with fixed ends

PDE: ∂²u/∂t² = c² ∂²u/∂x²,   0 < x < L,   (1)
Boundary condition: u(t, 0) = u(t, L) = 0,   (2)
Initial conditions: u(0, x) = g(x),   (3)
                    ut(0, x) = h(x).   (4)

We propose to study the associated eigenvalue problem

∂²u/∂x² = −λu,   (5)
u(t, 0) = u(t, L) = 0.   (6)

The solutions to (5)-(6) are

u = φn(x) := sin(nπx/L),   λ = λn := (nπ/L)²,   n = 1, 2, · · · .

Now use the eigenfunction expansion:

u(t, x) = Σ_{n=1}^{∞} Cn(t) φn(x),
g(x) = Σ_{n=1}^{∞} gn φn(x),
h(x) = Σ_{n=1}^{∞} hn φn(x).

Use equation (1) to obtain

Σ_{n=1}^{∞} (Cn''(t) + λn c² Cn) φn = 0.

Thus

Cn''(t) + λn c² Cn = 0,   Cn(0) = gn,   Cn'(0) = hn.

The general solution formula for Cn is

Cn = an cos(nπct/L) + bn sin(nπct/L),

so that

Cn'(t) = −an (nπc/L) sin(nπct/L) + bn (nπc/L) cos(nπct/L).

Using the initial conditions, we find

an = gn,   bn = hn L/(nπc).

Thus

Cn(t) = gn cos(nπct/L) + hn (L/(nπc)) sin(nπct/L).

Hence the general solution to (1)-(4) is

u(t, x) = Σ_{n=1}^{∞} [ gn cos(nπct/L) + hn (L/(nπc)) sin(nπct/L) ] sin(nπx/L),

where

gn = (2/L) ∫₀^L g(x) sin(nπx/L) dx,
hn = (2/L) ∫₀^L h(x) sin(nπx/L) dx.
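Each term of this series satisfies both the wave equation and the fixed-end conditions, which can be checked symbolically (a SymPy sketch; the sample coefficients gn = 1, hn = 1/3 and the mode n = 2 are ours):

```python
import sympy as sp

t, x, c, L = sp.symbols('t x c L', positive=True)
n = 2
gn, hn = sp.Integer(1), sp.Rational(1, 3)   # illustrative sample coefficients
un = (gn * sp.cos(n * sp.pi * c * t / L)
      + hn * L / (n * sp.pi * c) * sp.sin(n * sp.pi * c * t / L)) * sp.sin(n * sp.pi * x / L)

# The term satisfies the wave equation (1) ...
wave_residual = sp.simplify(sp.diff(un, t, 2) - c**2 * sp.diff(un, x, 2))
assert wave_residual == 0
# ... and the fixed-end boundary conditions (2).
assert un.subs(x, 0) == 0 and sp.simplify(un.subs(x, L)) == 0
```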
Remark: Traditionally, one uses separation of variables u = G(t)φ(x) in (1)-(4) and ends up with the same eigenvalue problem (5)-(6).

Figure 6.11.1. Normal mode at n = 3: sin(3πx/L) on [0, L], with nodes at x = L/3 and 2L/3.

Property of solutions:
Let us look at the so-called normal modes of vibration:

[ gn cos(nπct/L) + hn (L/(nπc)) sin(nπct/L) ] sin(nπx/L) = An sin( nπct/L + nπαn/L ) sin(nπx/L),

where

An = ( gn² + (hn L/(nπc))² )^{1/2}

and αn is a phase angle. We see that An is the amplitude, the temporal frequency is nπc/L, and the spatial frequency is nπ/L. See Figure 6.11.1.
6.11.2 Vibrating rectangular membrane.
Consider

∂²u/∂t² = c²∆u,   0 < x < L, 0 < y < H,
u = 0 on the boundary,
u(0, x, y) = α(x, y),
ut(0, x, y) = β(x, y).

Eigenvalue problem:

∆u + λu = 0,   u = 0 on the boundary.

We know the eigenfunctions and eigenvalues (from Section 6.9, Poisson equation in a rectangle, Lecture 8 of Chapter VI):

u = φnm(x, y) := sin(nπx/L) sin(mπy/H),
λ = λnm := (nπ/L)² + (mπ/H)².

Solutions are

u(t, x, y) = Σ_{m=1}^{∞} Σ_{n=1}^{∞} [ Anm cos(ct√λnm) + Bnm sin(ct√λnm) ] φnm,

where

Anm = (4/(LH)) ∫₀^L ∫₀^H α(x, y) φnm(x, y) dx dy,
c√λnm Bnm = (4/(LH)) ∫₀^L ∫₀^H β(x, y) φnm(x, y) dx dy.

If you have difficulty with the two-dimensional eigenvalue problem, you can either come to see me for more explanation or use the book "Elementary Applied PDE" by Haberman, or "Advanced Engineering Mathematics" by Erwin Kreyszig.
6.12. Eigenvalue problem.
6.12.1 Motivation.
We have seen in the previous sections that eigenvalue problems are useful, and so far they all have explicit solutions. Let us now look at the heat conduction equation again, but with more complexity:

ρc ∂u/∂t = div(k∇u).   (1)

Suppose the density ρ is now a function of x: ρ = ρ(x), and likewise c = c(x), k = k(x). Through separation of variables,

u = G(t)φ(x),

we end up with

G'/G = div(k(x)∇φ(x)) / (ρcφ(x)) = −λ,   (2)

or

div(k(x)∇φ(x)) + λρcφ(x) = 0.   (3)

In general, there are no explicit solutions for φ. But we still love the idea of eigenfunction expansion. Our elementary functions xⁿ, eˣ, ln x, sin x, arcsin x, etc., are too few. We would like to establish more functions; these are called special functions, and the Bessel functions are examples.
6.12.2. Eigenvalue problem of Sturm-Liouville (p. 153 Keener).
Equation:

d/dx ( p(x) dφ/dx ) + q(x)φ + λσ(x)φ = 0,   a < x < b.   (4)

Boundary conditions:

β1 φ(a) + β2 φ'(a) = 0,
β3 φ(b) + β4 φ'(b) = 0.
(5)

Assumptions: p > 0, σ > 0; p, q, σ are smooth; |β1| + |β2| ≠ 0, |β3| + |β4| ≠ 0.
Conclusions:
1. All eigenvalues are real.
2. There exists an infinite number of eigenvalues:

λ1 < λ2 < · · · < λn < λn+1 < · · · ;

a. there is a smallest eigenvalue λ1,
b. there is no greatest one: λn → +∞ as n → ∞.
3. Corresponding to each λn there is an eigenfunction, denoted by φn(x) (unique up to a constant factor); φn(x) has exactly n − 1 zeros for a < x < b.
4. The eigenfunctions {φn}_{n=1}^{∞} form a "complete" set, meaning that any L²-integrable function f(x) can be represented by a generalized Fourier series of the eigenfunctions,

f(x) = Σ_{n=1}^{∞} an φn(x),   where an = ∫ₐᵇ f(x)φn(x)σ(x) dx / ∫ₐᵇ φn²(x)σ(x) dx,

in L²(a, b). Furthermore, this infinite series converges pointwise to

(f(x+) + f(x−))/2

for a < x < b, provided that f(x) is piecewise smooth (p. 164 Keener).
5. Weighted orthogonality:

∫ₐᵇ φn(x)φm(x)σ(x) dx = 0 if λn ≠ λm.

6. Any eigenvalue can be related to its eigenfunction by the Rayleigh quotient:

λn = ( −pφnφn'|ₐᵇ + ∫ₐᵇ (p(φn')² − qφn²) dx ) / ∫ₐᵇ φn²σ dx.
Example: For

u'' + λu = 0,   0 < x < 1,   u(0) = u(1) = 0,

we know un(x) = sin(nπx), λn = (nπ)². But also

u(x) = λ ∫₀¹ K(x, y)u(y) dy,

where

K(x, y) = y(1 − x) for 0 ≤ y < x ≤ 1,   K(x, y) = x(1 − y) for 0 ≤ x < y ≤ 1.

Let

Tu := ∫₀¹ K(x, y)u(y) dy.

Then the eigenvalue problem becomes the spectral problem for the compact operator T:

Tu = (1/λ)u

(p. 114 Keener).
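The spectral relation Tu = (1/λ)u can be checked numerically at a point (a NumPy sketch; the sample point x0 = 0.3 and mode n = 1 are ours):

```python
import numpy as np

def K(x, y):
    # Kernel of the operator T from the notes
    return np.where(y < x, y * (1 - x), x * (1 - y))

m = 200000
y = (np.arange(m) + 0.5) / m          # midpoint nodes on (0, 1)
x0, n = 0.3, 1
Tu = np.sum(K(x0, y) * np.sin(n * np.pi * y)) / m   # (T u)(x0) by quadrature
# Spectral relation: T u = (1/lambda) u with lambda = (n pi)^2
assert abs(Tu - np.sin(n * np.pi * x0) / (n * np.pi) ** 2) < 1e-7
```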
Announcement: The final exam will be cumulative. I will post a mock exam on the web on the Monday of exam week.
There will be no class on Friday, Dec 7; it is all because you have been so good. We will not cover the originally planned topic, Homogenization.
We will cover more special functions, including Bessel functions, this week, and Green's functions next Monday. Next Wednesday (Dec 5) is a review day.
6.12.3. Example: Heat flow in a nonuniform rod.

PDE: c(x)ρ(x) ∂u/∂t = ∂/∂x ( K0(x) ∂u/∂x ),   (1)
Boundary conditions: u(t, 0) = 0,   (2)
                     ∂u/∂x(t, L) = 0,   (3)
Initial condition: u(0, x) = g(x).   (4)

Before jumping to an eigenvalue problem, let us try separation of variables:

u(t, x) = G(t)φ(x).

Then

G'(t)/G(t) = (K0(x)φ')'/(ρcφ) = −λ.

Thus we have the eigenvalue problem

(K0(x)φ')' + λρcφ = 0,   φ(0) = φ'(L) = 0,

and

G(t) = b e^{−λt}.

By Sturm-Liouville theory, we have eigenvalues and eigenfunctions

λ1 < λ2 < · · · < λn < · · · ,   (5)
φ1, φ2, · · · , φn, · · · .   (6)

So general solutions are

u(t, x) = Σ_{n=1}^{∞} bn e^{−λn t} φn(x).

To find bn, we use the initial condition

g(x) = Σ_{n=1}^{∞} bn φn(x).

Using the orthogonality condition with weight ρc, we find

∫₀^L g(x)φm(x) ρc dx = bm ∫₀^L φm²(x) ρc dx,   m = 1, 2, · · · ,

so

bm = ∫₀^L g(x)φm(x) ρc dx / ∫₀^L φm²(x) ρc dx.   (7)

Thus a solution to (1)-(4) is

u(t, x) = Σ_{n=1}^{∞} bn e^{−λn t} φn(x),

where bn is given in (7), λn in (5), and φn in (6).
6.13. Explicit eigenfunctions: Orthogonal polynomials and special functions.
6.13.1. Legendre polynomials: (p. 167, Keener)

Pn(x) = (1/(2ⁿ n!)) dⁿ/dxⁿ [(x² − 1)ⁿ]

are eigenfunctions of the differential operator problem

Lu = λu,   Lu = −((1 − x²)u')',   −1 < x < 1,   λn = n(n + 1).

The boundary condition is that u is bounded on −1 ≤ x ≤ 1. The function p(x) = 1 − x² vanishes at |x| = 1, so the problem is not regular. (What we covered last time in 6.12 is called a regular Sturm-Liouville eigenvalue problem.) But we claim the polynomials are orthogonal and complete nonetheless.

6.13.2. The Schrodinger equation

u'' + (E − V(x))u = 0,   x ∈ IR¹.

Here E is the eigenvalue (the physicist's notation for λ). Let

V(x) = x² − 1,   u = e^{−x²/2} w.

Then

w'' − 2xw' + λw = 0,
w = Hn(x) = (−1)ⁿ e^{x²} dⁿ/dxⁿ e^{−x²},   λn = 2n.

The functions Hn(x) are called the Hermite polynomials, which are orthogonal polynomials.
See p. 167 ff. of Keener for more special functions.
6.13.3. Special functions, Bessel's differential equations.
Bessel's differential equation of order m (m ≥ 0) is

z² d²u/dz² + z du/dz + (z² − m²)u = 0,   0 < z < ∞.

We state the following without proof. It has two solutions Jm(z) and Ym(z). As z → 0, they satisfy

Jm(z) ∼ 1 for m = 0,   Jm(z) ∼ zᵐ/(2ᵐ m!) for m > 0;
Ym(z) ∼ (2/π) ln z for m = 0,   Ym(z) ∼ −(2ᵐ(m − 1)!/π) z⁻ᵐ for m > 0.

Jm(z) are called Bessel functions of the first kind, while Ym(z) are called Bessel functions of the second kind. The Jm are bounded; the Ym are not. Both are oscillatory and decay to zero as z → ∞. See Figures 6.13.1-2. We let zmn denote the positive zeros of Jm(z), n = 1, 2, · · · .
Figure 6.13.1. Bessel functions of the first kind, J0(z) and J1(z); the first zeros of J0 are z01 = 2.40483... and z02 = 5.52008... .
Figure 6.13.2. Bessel functions of the second kind, Y0(z) and Y1(z).
Another form of Bessel's equation: Let z = √λ r; then

r² d²u/dr² + r du/dr + (λr² − m²)u = 0.

Let

Lu = −(ru')' + (m²/r)u.

Then Bessel's equation becomes

Lu = λru,   0 < r < ∞.

We propose to study the eigenvalue problem

Lu = λru,   0 < r < a,   |u(0)| < ∞,   u(a) = 0,

where a is any given positive number. The solutions of the equation with |u(0)| < ∞ are

u(r) = c Jm(√λ r).

Imposing the other boundary condition u(a) = 0, we have

√λ a = zmn.

Thus

λ = λmn := (zmn/a)²,   n = 1, 2, · · · .

The eigenvalue problem is a singular Sturm-Liouville eigenvalue problem; we claim that orthogonality and completeness both hold:

∫₀^a Jm(√λmn r) Jm(√λmk r) r dr = 0,   n ≠ k,

and any α(r) with ∫₀^a r α²(r) dr < ∞ has the expansion

α(r) = Σ_{n=1}^{∞} bn Jm(√λmn r),

where

bn = ∫₀^a α(r) Jm(√λmn r) r dr / ∫₀^a Jm²(√λmn r) r dr.

This expansion is called the Fourier-Bessel series.
6.14 Vibrating membrane in a circular domain

PDE: ∂²u/∂t² = c²∆u, in the disk r < a in IR²,   (1)
Boundary condition: u(t, a, θ) = 0,   r = a, θ ∈ [−π, π],   (2)
Initial conditions: u(0, r, θ) = α(r, θ),   ∂u/∂t(0, r, θ) = β(r, θ).   (3)

Using separation of variables

u(t, r, θ) = φ(r, θ)G(t),   (4)

we find

∆φ + λφ = 0   (5)

and

d²G/dt² + λc²G = 0.   (6)

We need

φ(a, θ) = 0.   (7)

This time the domain is not rectangular, so we will not get sin(nπx/L) sin(mπy/H). We try separation of variables again:

φ(r, θ) = f(r)g(θ),   0 < r < a, −π < θ < π.   (8)

Recall that in 2-D

∆φ = (1/r) ∂/∂r ( r ∂φ/∂r ) + (1/r²) ∂²φ/∂θ².

Thus (5) becomes

−(1/g) d²g/dθ² = (r/f) d/dr ( r df/dr ) + λr² =: µ.   (9)

(Compare with uxx + uyy + λu = 0 and u = f(x)g(y): fxx/f + gyy/g + λ = 0, so fxx/f + λ = −gyy/g =: µ.)
We see that g needs to be periodic in θ:

g(π) = g(−π),   g'(π) = g'(−π).   (10)

The Sturm-Liouville eigenvalue problem with the periodic conditions (10),

d²g/dθ² + µg = 0 with (10),   (11)

yields

µ = µm := m²,   m = 0, 1, 2, · · · ,   (12)
g = sin(mθ) or cos(mθ).   (13)

Thus for m = 0 there is one eigenfunction, g = 1, while for m > 0 there are two linearly independent eigenfunctions. These eigenfunctions generate a complete and orthogonal basis for L²[−π, π]. This is the full Fourier series: any function Γ(θ) in L²[−π, π] has the expansion

Γ(θ) = Σ_{m=0}^{∞} [ am cos(mθ) + bm sin(mθ) ].   (14)

(Define b0 = 0 for notational convenience.) Now, for each µm, we consider equation (9),

(r/f) d/dr ( r df/dr ) + λr² = m²,   (15)

with the natural conditions |f(0)| < ∞ and f(a) = 0 derived from (7); i.e.,

r(rf')' + (λr² − m²)f = 0,   |f(0)| < ∞,   f(a) = 0.   (16)

The solutions to (16) are given in Section 6.13.3 (Bessel functions); they are

f(r) = fmn(r) := Jm(√λmn r),   λ = λmn := (zmn/a)²,   n = 1, 2, · · · .   (17)

We have found φ for (5):

φ = φmn := Jm(√λmn r)[ am cos(mθ) + bm sin(mθ) ].

For the G function in (6) we find

G(t) = cos(c√λmn t) or sin(c√λmn t).
Combining all the factors, we find a general solution formula

u(t, r, θ) = Σ_{m=0}^{∞} Σ_{n=1}^{∞} [ Amn Jm(√λmn r) cos(mθ) cos(c√λmn t)
           + Bmn Jm(√λmn r) sin(mθ) cos(c√λmn t)
           + Cmn Jm(√λmn r) cos(mθ) sin(c√λmn t)
           + Dmn Jm(√λmn r) sin(mθ) sin(c√λmn t) ].
(18)

Imposing the initial conditions (3) on (18) determines the coefficients. For example, if β = 0, we find Cmn = Dmn = 0. Then

α(r, θ) = Σ_{m=0}^{∞} ( Σ_{n=1}^{∞} Amn Jm(√λmn r) ) cos(mθ) + Σ_{m=1}^{∞} ( Σ_{n=1}^{∞} Bmn Jm(√λmn r) ) sin(mθ),

where

Amn = ∫₀^a ∫₀^{2π} α(r, θ) Jm(√λmn r) cos(mθ) r dr dθ / ∫₀^a ∫₀^{2π} Jm²(√λmn r) cos²(mθ) r dr dθ,

Bmn = ∫₀^a ∫₀^{2π} α(r, θ) Jm(√λmn r) sin(mθ) r dr dθ / ∫₀^a ∫₀^{2π} Jm²(√λmn r) sin²(mθ) r dr dθ.
Notes. Two-dimensional eigenvalue problems.
Here is a summary of all the two-dimensional eigenvalue problems we have encountered. They have appeared in:
1. The Poisson equation in a rectangle Ω (Section 6.9) (and Homework set 14):

∆φ + λφ = 0 in Ω,   φ = 0 on ∂Ω.

2. Or in a disk (Section 6.13, Bessel functions).
3. Heat flow in a rectangle, Section 6.10.4, to be uploaded (also in Homework set 14).
4. The wave equation in a rectangle (Section 6.11) and in a disk (Section 6.13).
Appendix: One-dimensional eigenvalue problem
We provide a complete solution to the eigenvalue problemd2φdx2 + λφ = 0, 0 < x < L
φ(0) = φ(L) = 0.
Solution. The objective of the eigenvalue problem is to find both the parameter
λ and a nonzero solution φ. We use the strategy of shooting. First let λ be zero
λ = 0, and see whether we can find a nonzero solution φ. In this case, the equation
becomes
φ′′ = 0.
Thus
φ = a1 + a2x.
Then the boundary conditions imply that a1 = a2 = 0. Thus we do not have any
nonzero solution for λ = 0. Let us now try to find a negative solution of λ : λ =
−c2, c > 0. Then the equation becomes
φ′′ − c2φ = 0.
We use the guess
φ = e^(αx)
to find that
α² − c² = 0.
So α = ±c and we have the solution
φ = a1 e^(cx) + a2 e^(−cx).
The boundary conditions imply similarly that a1 = a2 = 0. So there is no nonzero
solution for λ = −c². Let us now try λ = c², c > 0, and a solution of the form φ = e^(αx); we find
α = ±ic and the solutions are
φ = a1 cos(cx) + a2 sin(cx).
The boundary condition φ(0) = 0 implies a1 = 0. The boundary condition φ(L) = 0
implies
sin(cL) = 0.
So we choose c such that cL = nπ, n = 1, 2, · · · . Thus λ = (nπ/L)², and the
corresponding solutions are
φ = a2 sin(nπx/L).
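This conclusion is easy to check numerically: with λ = (nπ/L)², the function φ = sin(nπx/L) satisfies both boundary conditions and makes φ'' + λφ vanish up to discretization error. A small pure-Python sketch (the values L = 2 and n = 3, the step h, and the test points are illustrative choices of mine):

```python
import math

L, n = 2.0, 3                        # interval length and mode number (illustrative)
lam = (n * math.pi / L)**2           # eigenvalue lambda = (n*pi/L)^2

def phi(x):
    """Candidate eigenfunction phi(x) = sin(n*pi*x/L)."""
    return math.sin(n * math.pi * x / L)

# A centered second difference approximates phi''(x); the residual
# phi'' + lambda*phi should vanish up to O(h^2) discretization error.
h = 1e-4
def residual(x):
    d2 = (phi(x + h) - 2 * phi(x) + phi(x - h)) / h**2
    return d2 + lam * phi(x)

print(phi(0.0), abs(phi(L)), abs(residual(0.7)))
```

The boundary values are zero (to machine precision at x = L) and the residual is of the order of the finite-difference truncation error.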
C. General bounded domains, Green’s function.
6.15 Poisson equation in a bounded domain, Green’s function.
Given a domain Ω ⊂ ℝ^n, consider the problem
Δu = f(x) in Ω, (1)
u|_∂Ω = 0. (2)
If Ω is not one of the special cases (interval, rectangle, disk, sphere, cylinder, half
disk or quarter disk, etc.), then separation of variables and transform methods do not
work. The eigenvalue problem and eigenfunction expansion is one way; however, there
is an alternative way which reduces the work of finding a solution to (1)-(2) for
arbitrary f(x) to finding a solution for the single function f(x) = δ(x − x0). This
single solution is called Green's function and is defined as the solution to
ΔG = δ(x − x0) in Ω, (3)
G|_∂Ω = 0. (4)
Recall that our definition of the fundamental solution F is
ΔF = δ(x − x0) in ℝ^n
with the condition F(x) → 0 as |x| → ∞. Thus W := G − F satisfies
ΔW = 0 in Ω,
W|_∂Ω = −F.
So Green's function G = F + W is a "fundamental solution" that satisfies the zero
boundary condition.
If we multiply (3)-(4) by f(x0) and integrate in x0, we find
Δ( ∫_Ω G(x, x0) f(x0) dx0 ) = ∫_Ω δ(x − x0) f(x0) dx0 = f(x),
( ∫_Ω G(x, x0) f(x0) dx0 )|_∂Ω = 0.
Thus
u = ∫_Ω G(x, y) f(y) dy
is a solution to
Δu = f(x) in Ω,
u = 0 on ∂Ω.
For the simplest example, we see that the solution to
d²u/dx² = f(x), u(0) = u(1) = 0
is given by
u(x) = ∫_0^1 g(x, y) f(y) dy,
where
g(x, y) = (x − y)H(x − y) − x(1 − y), for 0 ≤ x, y ≤ 1,
and H is the Heaviside function: H(x) = 1 for x > 0, H(x) = 0 for x < 0. See
Fig. 6.15.1. A simple calculation shows g(0, y) = g(1, y) = 0, and
g_x(x, y) = H(x − y) − (1 − y),
g_xx = δ(x − y).
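These properties of g, and the solution formula u(x) = ∫_0^1 g(x, y) f(y) dy, are easy to check numerically. A pure-Python sketch (the test case f = 1, whose exact solution of u'' = 1 with u(0) = u(1) = 0 is u = (x² − x)/2, and the grid size N are my illustrative choices):

```python
def H(x):
    """Heaviside function: 1 for x > 0, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

def g(x, y):
    """Green's function for u'' = f, u(0) = u(1) = 0."""
    return (x - y) * H(x - y) - x * (1 - y)

def solve(f, x, N=2000):
    """u(x) = integral_0^1 g(x, y) f(y) dy, by the midpoint rule."""
    h = 1.0 / N
    return h * sum(g(x, (j + 0.5) * h) * f((j + 0.5) * h) for j in range(N))

# Test case: f = 1; the exact solution is u = (x^2 - x)/2.
x = 0.3
u = solve(lambda y: 1.0, x)
print(u, (x * x - x) / 2)   # both ≈ -0.105
```

Note that g is only piecewise smooth in y (the kink sits at y = x), so a quadrature rule that does not straddle the kink too crudely, like the midpoint rule here, works well.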
See pp. 145-146 of Keener. (Keener's textbook identifies fundamental solutions
with Green's functions.)
Figure 6.15.1. The Green's function g(x, y) for u'' = f(x), u(0) = u(1) = 0.
In general, a Green’s function has no explicit formula. Still, it is helpful even
when only the abstract form of existence is available. Green’s functions appear in
numerous situations.