MAA105 – ANALYSIS
Integration
Plane parametric curves
Ordinary differential equations
2019–2020
Jérémie BETTINELLI
version of January 31, 2020
nsup.org/~bettinel
Contents

1 Riemann theory of integration
  1.1 Introduction
    1.1.1 Motivation
    1.1.2 A first example
  1.2 Riemann integral
    1.2.1 Step functions
    1.2.2 Integrable functions
    1.2.3 First properties of the integral
    1.2.4 Integrals and derivatives
    1.2.5 Riemann sums
    1.2.6 Toolbox
  1.3 Integration of rational functions
    1.3.1 Partial fraction decomposition
    1.3.2 Integrating partial fractions
    1.3.3 Rational functions in other functions
  1.4 Improper integrals
    1.4.1 Definition and first properties
    1.4.2 Nonnegative functions
    1.4.3 Oscillating functions
    1.4.4 Comparison of series with integrals

2 Plane parametric curves
  2.1 Introduction
    2.1.1 Motivation
    2.1.2 Preliminaries
  2.2 First definitions
  2.3 Tangents
    2.3.1 Definition
    2.3.2 Link with derivatives
    2.3.3 Local behavior
  2.4 Sketching
    2.4.1 Interval of study
    2.4.2 Asymptotes
    2.4.3 Sketching plan
  2.5 Polar curves
    2.5.1 Polar coordinates
    2.5.2 Polar curves
    2.5.3 What is the difference with a usual graph?
    2.5.4 Tangents
    2.5.5 Extremities of the interval of study
    2.5.6 Sketching

3 Ordinary differential equations
  3.1 Introduction
    3.1.1 Motivation
    3.1.2 Formal definitions
    3.1.3 Separable differential equations
    3.1.4 Linear ODEs
  3.2 First order linear differential equations
    3.2.1 Homogeneous equation
    3.2.2 Finding a particular solution to y′ = a(x)y + b(x)
    3.2.3 Solution to the nonhomogeneous equation
  3.3 Systems of linear ODEs
    3.3.1 Preliminaries: matrix exponential
    3.3.2 Solution to the homogeneous equation
    3.3.3 Solution to the nonhomogeneous equation
    3.3.4 Method for solving a system of linear ODEs in practice
  3.4 Linear differential equations with constant coefficients
    3.4.1 Homogeneous equation
    3.4.2 Nonhomogeneous equation
    3.4.3 Example: second order equation

Solutions to the exercises
Usual notation
Bibliography
Note

A vertical line in the margin like this indicates parts that will not be covered in class; they are accessible and included here for further reading.
1 Riemann theory of integration

In this chapter, we will define the integral of a function in the sense of Riemann and cover the basic tools for the practical computation of integrals.

For further references about this chapter, you may consult [Tao16], [God05], [Har92].
1.1 Introduction
  1.1.1 Motivation
  1.1.2 A first example
1.2 Riemann integral
  1.2.1 Step functions
  1.2.2 Integrable functions
  1.2.3 First properties of the integral
  1.2.4 Integrals and derivatives
  1.2.5 Riemann sums
  1.2.6 Toolbox
1.3 Integration of rational functions
  1.3.1 Partial fraction decomposition
  1.3.2 Integrating partial fractions
  1.3.3 Rational functions in other functions
1.4 Improper integrals
  1.4.1 Definition and first properties
  1.4.2 Nonnegative functions
  1.4.3 Oscillating functions
  1.4.4 Comparison of series with integrals
1.1 Introduction
1.1.1 Motivation
We are interested in the following two seemingly distinct problems.
Chapter 1. Riemann theory of integration
Geometric problem (area below the graph of a nonnegative function)

Given a nonnegative function f defined at least on an interval [a, b] ⊆ R, compute the area "below the graph of f between a and b," that is, the area of the set

A := {(x, y) ∈ R² : a ≤ x ≤ b and 0 ≤ y ≤ f(x)}.

[Figure: the region A between the x-axis and the graph of f, for a ≤ x ≤ b]
Analytic problem (finding primitives)

Given a function f : I → R, find a differentiable function F : I → R such that F′ = f.

Such a function F is called a primitive or antiderivative of f. In fact, recall that the derivative of a function represents the "growth rate" of the function: more precisely, for a given x and a small h, one has F(x + h) ≈ F(x) + hF′(x), the approximation getting better as h gets smaller. If one wants F′ = f, then one should define F(x + h) from F(x) by adding to it hf(x), which is precisely the area of the following rectangle of width h and height f(x) (we assume f(x) ≥ 0 at this point).
[Figure: rectangle of width h and height f(x) below the graph of f, between x and x + h]
Keeping in mind that h is supposed to become "infinitesimal," the link between these two problems becomes clear. Moreover, in the case where f(x) < 0, the quantity of interest hf(x) is negative; it is the opposite of the area of the following rectangle of width h and height −f(x).
[Figure: rectangle of width h and height −f(x) between the graph of f and the x-axis, between x and x + h]
It thus makes sense to generalize the geometric problem to functions taking negative values as follows.
Geometric problem (signed area below the graph of a function)

Given a function f defined at least on an interval [a, b] ⊆ R, compute the area between the graph of f and the x-axis, counted positively when f is above the x-axis and negatively when it is below, between a and b.

[Figure: the regions between the graph of f and the x-axis on [a, b], positive parts in red, negative parts in blue]

On the picture, the desired result is the area of the red part minus that of the blue part.
1.1.2 A first example
Let us compute the area below the graph of the exponential function f : x ↦ eˣ between 0 and 1.

[Figure: graph of x ↦ eˣ between 0 and 1]
In order to do so, we bound the desired area with sums of "thin" rectangles as follows. We fix n ∈ N and subdivide [0, 1] into n regular intervals by considering the points 0, 1/n, 2/n, …, (n−1)/n, 1. For the lower bound, we use the n rectangles with base [(i−1)/n, i/n] and height f((i−1)/n) = e^((i−1)/n), 1 ≤ i ≤ n, whereas, for the upper bound, we use the n rectangles with base [(i−1)/n, i/n] and height f(i/n) = e^(i/n), 1 ≤ i ≤ n. On the picture below, n = 5.
[Figure: lower and upper rectangle approximations of the area below x ↦ eˣ on [0, 1], with n = 5]
For the lower bound, we obtain
n∑
i=1
1
n× e
i−1n =
1
n
n∑
i=1
(e
1n
)i−1=
1
n
1 −(e
1n
)n
1 − e1n
=1n
e1n − 1
(e− 1
)−−−−→n→∞
e− 1.
The computation for the upper bound is similar:

∑_{i=1}^n (1/n) · e^(i/n) = (1/n) ∑_{i=1}^n (e^(1/n))^i = (1/n) · (e^(1/n) − (e^(1/n))^(n+1)) / (1 − e^(1/n)) = e^(1/n) · (1/n) / (e^(1/n) − 1) · (e − 1) ⟶ e − 1 as n → ∞.
As n grows, the rectangles get thinner and thinner and the approximation gets better and better. As the bounds are valid for every n, we may pass to the limit and obtain that the area is bounded from below and above by the same value: e − 1. It is thus equal to this value.
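This squeeze is easy to reproduce numerically. The following sketch (an illustration in Python, not part of the course material) computes the lower and upper rectangle sums for the regular subdivision into n parts; note that their gap is exactly (e − 1)/n, so both sums converge to e − 1.

```python
import math

def riemann_bounds(n):
    """Lower and upper rectangle sums for x -> e^x on [0, 1],
    using the regular subdivision into n parts."""
    lower = sum(math.exp((i - 1) / n) / n for i in range(1, n + 1))
    upper = sum(math.exp(i / n) / n for i in range(1, n + 1))
    return lower, upper

lower, upper = riemann_bounds(1000)
# lower < e - 1 < upper, and upper - lower = (e - 1)/n.
```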
[Figure: rectangle approximation of the area below x ↦ eˣ on [0, 1], with n = 20]
We will write the above result as

∫_0^1 f(x) dx = ∫_0^1 eˣ dx = e − 1,  or  ∫_0^1 f = e − 1.

The real number ∫_0^1 f(x) dx is read: "the integral of f on [0, 1]." The variable "x" can be replaced by any other variable: for instance, ∫_0^1 e^u du = e − 1.
Warning 1.1

Beware that the notation ∫_0^1 f dx is incorrect and should be avoided. On the same topic, beware not to confuse the number f(x) with the function f. One can for instance consider the integral of sin, not of sin(x), which is not a function.
1.2 Riemann integral
Recall that an interval of R is a subset of the form [a, b], [a, b), [a, +∞), (a, b], (−∞, b], (a, b), (a, +∞), (−∞, b), or (−∞, +∞) = R, where a ≤ b are real numbers¹. Note that a = b only makes sense in the first case, in which case the interval is the singleton [a, a] = {a}. An interval is called
• open if it has either of the forms (a, b), (a, +∞), (−∞, b), or (−∞, +∞);
• closed if it has either of the forms [a, b], [a, +∞), (−∞, b], or (−∞, +∞);
• bounded if it has either of the forms [a, b], [a, b), (a, b], or (a, b);
• a segment if it is closed and bounded, that is, has the form [a, b].
The numbers a, b, or −∞, +∞ appearing in the definition of an interval are called its extremities.
1.2.1 Step functions
We now properly define the functions accounting for the rectangles we used in the previous section.
Definition 1.2 (subdivision)

A subdivision of a segment [a, b] is a finite sequence a = x_0 < x_1 < x_2 < … < x_n = b.

[Figure: points a = x_0 < x_1 < … < x_7 = b marked on a horizontal axis]
Example 1.3 (regular subdivision)

For instance, we used above the regular subdivision of [0, 1] into n parts:

0 < 1/n < 2/n < … < (n−1)/n < 1.

More generally, the regular subdivision of [a, b] into n parts is a = x_0 < x_1 < x_2 < … < x_n = b, where

x_i := a + i (b − a)/n,  0 ≤ i ≤ n.
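In code, producing this subdivision is a one-liner following the formula x_i = a + i(b − a)/n (an illustrative sketch, not part of the text):

```python
def regular_subdivision(a, b, n):
    """The regular subdivision a = x_0 < x_1 < ... < x_n = b of [a, b],
    with x_i = a + i*(b - a)/n."""
    return [a + i * (b - a) / n for i in range(n + 1)]

# For example, the regular subdivision of [0, 1] into 4 parts:
# regular_subdivision(0, 1, 4) -> [0.0, 0.25, 0.5, 0.75, 1.0]
```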
Definition 1.4 (step function)

A function f : [a, b] → R is a step function if there exist a subdivision a = x_0 < x_1 < x_2 < … < x_n = b and real numbers c_1, …, c_n such that, for all i ∈ {1, …, n},

∀x ∈ (x_{i−1}, x_i), f(x) = c_i.
¹Formally, an interval is defined as a subset I ⊆ R such that, whenever x < y < z with x, z ∈ I, then y ∈ I.
In other words, f is constant on every open subinterval (x_{i−1}, x_i) defined by the subdivision. Note that, given a step function, the subdivision appearing in the definition is not uniquely defined. Indeed, one might always get another satisfactory subdivision by adding arbitrary points.
Remark 1.5
The value taken by f at a point of the subdivision is arbitrary: it might be equal to the value taken by f on the previous subinterval, to the value taken by f on the subsequent subinterval, or to any other value! This has no effect in our context of areas of rectangles, as it corresponds to rectangles of null width.
Example 1.6 (floor and ceiling functions)

The floor function and the ceiling function are defined for x ∈ R respectively by

⌊x⌋ := max{k ∈ Z : k ≤ x}  and  ⌈x⌉ := min{k ∈ Z : k ≥ x}.

For a < b, the restrictions of the floor and ceiling functions to [a, b] are step functions on [a, b].
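For readers who like to experiment, Python's standard library exposes exactly these two functions; a quick check against the definitions (note the behavior on negative numbers):

```python
import math

# math.floor gives the greatest integer k <= x, and math.ceil the least
# integer k >= x, matching the definitions above.
assert math.floor(2.7) == 2 and math.ceil(2.7) == 3
assert math.floor(-2.7) == -3 and math.ceil(-2.7) == -2
assert math.floor(5) == 5 and math.ceil(5) == 5
```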
Step functions are designed in such a way that the signed area of the geometric problem is straightforward to compute as a sum of signed areas of rectangles: this is the definition of the integral of a step function.
Definition 1.7 (integral of a step function)

Let f : [a, b] → R be a step function as in Definition 1.4. We define its integral as the real number

∫_a^b f(x) dx := ∑_{i=1}^n (x_i − x_{i−1}) c_i.

[Figure: a step function on the subdivision x_0 < x_1 < … < x_7, taking the value c_i on each (x_{i−1}, x_i)]
Recall that the subdivision appearing in the definition of a step function is not uniquely defined. As Definition 1.7 is based upon such a subdivision, we need to check that the definition does not actually depend on this choice. First, observe that adding a point to a subdivision does not change the value of the definition. Indeed, let us consider three points x < y < z such that f is equal to the constant c on (x, z). Then f is also equal to c on (x, y), as well as on (y, z). As (z − x)c = (z − y)c + (y − x)c, the value of the sum appearing in the definition is not modified upon adding a point. Reiterating the argument, it is not changed upon adding a finite number of points either. Now, let us take two subdivisions a = x_0 < x_1 < x_2 < … < x_n = b and a = y_0 < y_1 < y_2 < … < y_m = b compatible with f. Then the subdivision consisting of the numbers x_0, x_1, x_2, …, x_n, y_0, y_1, y_2, …, y_m arranged in increasing order with duplicates removed is yet another subdivision compatible with f. The latter subdivision is obtained from either of the original subdivisions by adding a finite number of points, so that the sums for the three subdivisions must be equal.
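The sum in Definition 1.7 translates directly into code; the sketch below (illustrative, with a hypothetical helper name) also checks the point just made, namely that refining the subdivision does not change the value:

```python
def step_integral(points, values):
    """Integral of a step function: sum over i of (x_i - x_{i-1}) * c_i.
    `points` is the subdivision x_0 < ... < x_n; `values[i-1]` is the
    constant c_i taken on (x_{i-1}, x_i)."""
    return sum((points[i] - points[i - 1]) * values[i - 1]
               for i in range(1, len(points)))

# Adding the point 0.5 to the subdivision leaves the integral unchanged:
coarse = step_integral([0.0, 1.0, 3.0], [2.0, -1.0])
fine = step_integral([0.0, 0.5, 1.0, 3.0], [2.0, 2.0, -1.0])
# coarse == fine == 1*2 + 2*(-1) == 0.0
```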
Remark 1.8 (segment of null length)

Any function f : [a, a] → R is a step function and ∫_a^a f(x) dx = 0.
1.2.2 Integrable functions
For two functions f , g : I → R, we use the classical notation f ≤ g meaning
∀x ∈ I, f(x) ≤ g(x) .
We also use the notation f ≤ M for a function f : I → R and a real number M to mean
∀x ∈ I, f(x) ≤ M .
The other comparison ≥ is similarly defined. Recall that a real-valued function f is bounded if there exists M ≥ 0 such that −M ≤ f ≤ M.
Warning 1.9 (f < g)

Beware that this notation has to be handled with care. First of all, it is easy to find two functions f and g such that neither f ≤ g nor f ≥ g. Moreover, the notation f < g is even more controversial: does it mean

• f(x) < g(x) for all x;
• or f ≤ g and f ≠ g?

In both cases, f(x) ≤ g(x) for all x. The difference is that, in the first case, f(x) ≠ g(x) for all x, whereas, in the second case, there exists at least one x for which f(x) ≠ g(x). Unfortunately, the answer is not consensual among mathematicians, so it is advised not to use this notation, unless very explicitly mentioning what is meant by it.
We consider from now on a bounded function f : [a, b] → R. (It is not assumed to be continuous, although the pictures might suggest otherwise.) We define the two real numbers

I−(f) := sup{ ∫_a^b φ(x) dx : φ is a step function such that φ ≤ f },

I+(f) := inf{ ∫_a^b φ(x) dx : φ is a step function such that φ ≥ f }.
In words, for I−(f), we consider all the step functions that are below f, compute their integrals, and then take the supremum. The idea is to obtain the whole area below the graph of f. For I+(f), we consider the infimum of the integrals of the step functions that are above f.
[Figure: a function f on [a, b], together with a step function φ ≤ f and a step function φ ≥ f]
Warning 1.10 (we only consider segments)

Beware that, for the time being, we only consider functions defined on a segment. The following results no longer hold without this assumption.
Proposition 1.11
Let f : [a, b] → R be a bounded function. Then I−(f) ≤ I+(f).
Proof. Let ψ be a step function such that ψ ≥ f . Then, for any step function φ ≤ f , one has φ ≤ ψ.
Now, this implies that ∫_a^b φ ≤ ∫_a^b ψ. Indeed, let us consider a subdivision compatible with both φ and ψ, obtained for instance as at the end of Section 1.2.1. As φ ≤ ψ, the value taken by φ on a subinterval is smaller than the value taken by ψ on the same subinterval. The claimed inequality ∫_a^b φ ≤ ∫_a^b ψ immediately follows from the definition.

Taking the supremum over the step functions φ below f, we obtain

I−(f) ≤ ∫_a^b ψ.

As this inequality holds for any step function ψ above f, we may take the infimum over such functions and obtain I−(f) ≤ I+(f), as desired.
The converse inequality does not always hold.
Definition 1.12 (integrable function)

• A bounded function f : [a, b] → R is Riemann-integrable, or simply integrable, if I−(f) = I+(f).

• If f : [a, b] → R is integrable, the integral of f on [a, b] is the number

∫_a^b f = ∫_a^b f(x) dx := I−(f) = I+(f).

• If f : [a, b] → R is integrable, the signed area below the graph of f is defined as ∫_a^b f.
Example 1.13

Step functions are integrable: in this case, the infimum and the supremum are reached at the function itself. Plainly, Definition 1.12 coincides with Definition 1.7 in this case.
Fortunately, many functions are integrable. We will soon see that monotonic functions, as well as (piecewise) continuous functions, are always integrable. To see that not all bounded functions are integrable, solve the exercise below.
Exercise 1.14 (solution page 133)

Show that the indicator function of Q on [0, 1], that is, the function 1_Q : [0, 1] → {0, 1} such that 1_Q(x) = 1 whenever x ∈ Q and 1_Q(x) = 0 otherwise, is not integrable.
You can solve the following exercise using the method we used in Section 1.1.2 for the computation of ∫_0^1 eˣ dx = e − 1.
Exercise 1.15 (solution page 133)

Show that the function f : x ∈ [0, 1] ↦ x² is integrable and compute ∫_0^1 f(x) dx.
Although quite intuitive, the definition of the integral is not very handy in practice. Fortunately, we will see in the subsequent sections various tools that allow many fast computations.
Proposition 1.16 (changing a finite number of values)

Let f : [a, b] → R be a bounded function and g : [a, b] → R be equal to f except at a finite number of points. Then f is integrable if and only if g is integrable. If the functions are integrable, then ∫_a^b f = ∫_a^b g.
Proof. Let φ ≤ f be a step function. By changing the value of φ at a finite number of points, we obtain a step function ψ ≤ g. As ∫_a^b φ = ∫_a^b ψ ≤ I−(g), we obtain by taking the supremum that I−(f) ≤ I−(g). The argument being symmetrical, we obtain I−(f) = I−(g) and, similarly, I+(f) = I+(g). The result follows.
Let us now see that two large classes of functions are integrable, namely monotonic functions and (piecewise) continuous functions. Do not forget that we only consider functions defined on segments.
Theorem 1.17 (monotonic functions)
Let f : [a, b] → R be a monotonic function. Then f is integrable.
Proof. We use the same method as in Section 1.1.2 and Exercise 1.15, without explicitly making the computation. Let us first assume that f is nondecreasing on [a, b]. For n ∈ N, we consider the regular subdivision of [a, b] into n parts a = x_0 < x_1 < x_2 < … < x_n = b, where

x_i := a + i (b − a)/n,  0 ≤ i ≤ n.
By monotonicity, for 1 ≤ i ≤ n,
∀x ∈ [xi−1, xi], f(xi−1) ≤ f(x) ≤ f(xi) .
We define the step functions φ_n and ψ_n on [a, b] by φ_n(x) := f(x_{i−1}) and ψ_n(x) := f(x_i) whenever x ∈ [x_{i−1}, x_i), as well as φ_n(b) = ψ_n(b) := f(b). This implies that φ_n ≤ f ≤ ψ_n.

We have

∑_{i=1}^n (b − a)/n · f(x_{i−1}) = ∫_a^b φ_n(x) dx ≤ I−(f) ≤ I+(f) ≤ ∫_a^b ψ_n(x) dx = ∑_{i=1}^n (b − a)/n · f(x_i),

so that

0 ≤ I+(f) − I−(f) ≤ ∑_{i=1}^n (b − a)/n · f(x_i) − ∑_{i=1}^n (b − a)/n · f(x_{i−1}) = (b − a)/n · (f(b) − f(a)).

Letting n → ∞ yields I−(f) = I+(f), so that f is integrable. If f is nonincreasing, a similar argument holds; we leave the details to the reader.
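The telescoping bound at the heart of this proof can be observed numerically: for a nondecreasing f, the gap between the upper and lower sums on the regular subdivision is exactly (b − a)/n · (f(b) − f(a)). A small Python sketch (an illustration, not from the text):

```python
def monotone_gap(f, a, b, n):
    """Gap between the upper and lower step-function sums of a
    nondecreasing f on the regular subdivision of [a, b] into n parts."""
    xs = [a + i * (b - a) / n for i in range(n + 1)]
    lower = sum((b - a) / n * f(xs[i - 1]) for i in range(1, n + 1))
    upper = sum((b - a) / n * f(xs[i]) for i in range(1, n + 1))
    return upper - lower

# For f(x) = x^2 on [0, 2] with n = 100, the sums telescope and the gap
# equals (b - a)/n * (f(b) - f(a)) = (2/100) * 4 = 0.08, up to rounding.
gap = monotone_gap(lambda x: x * x, 0.0, 2.0, 100)
```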
Theorem 1.18 (continuous functions)
Let f : [a, b] → R be a continuous function. Then f is integrable.
Proof. By Heine’s theorem, f is uniformly continuous on [a, b]. Recall that this means that
∀ε > 0, ∃δ > 0, |y − x| < δ =⇒ |f(y) − f(x)| ≤ ε .
Let us thus arbitrarily fix ε > 0 and choose n ∈ N large enough so that |y − x| ≤ 1/n ⟹ |f(y) − f(x)| ≤ ε. The rest is similar to the previous proof. We consider the regular subdivision of [a, b] into n parts a = x_0 < x_1 < x_2 < … < x_n = b, where x_i := a + i (b − a)/n, 0 ≤ i ≤ n. Then,

∀x ∈ [x_{i−1}, x_i],  min_{[x_{i−1}, x_i]} f ≤ f(x) ≤ max_{[x_{i−1}, x_i]} f.
We define the step functions φ_n and ψ_n on [a, b] by φ_n(x) := min_{[x_{i−1}, x_i]} f and ψ_n(x) := max_{[x_{i−1}, x_i]} f whenever x ∈ [x_{i−1}, x_i), as well as φ_n(b) = ψ_n(b) := f(b).

[Figure: on a subinterval [x_{i−1}, x_i], the step functions φ_n and ψ_n take the values min_{[x_{i−1}, x_i]} f and max_{[x_{i−1}, x_i]} f]
This implies that φ_n ≤ f ≤ ψ_n, so that

∑_{i=1}^n (b − a)/n · min_{[x_{i−1}, x_i]} f = ∫_a^b φ_n(x) dx ≤ I−(f) ≤ I+(f) ≤ ∫_a^b ψ_n(x) dx = ∑_{i=1}^n (b − a)/n · max_{[x_{i−1}, x_i]} f.

As a result,

0 ≤ I+(f) − I−(f) ≤ ∑_{i=1}^n (b − a)/n · (max_{[x_{i−1}, x_i]} f − min_{[x_{i−1}, x_i]} f) ≤ (b − a) ε.
As ε was arbitrary, we obtain I−(f) = I+(f), so that f is integrable.
Recall that we denote by f|_J the restriction of a function f defined on some set I to the subset J ⊆ I.
Definition 1.19 (piecewise continuous function)

A function f : [a, b] → R is piecewise continuous if there is a subdivision a = x_0 < x_1 < x_2 < … < x_N = b such that, for each i ∈ {1, …, N}, f|_{(x_{i−1}, x_i)} is continuous and admits finite one-sided limits at x_{i−1} and at x_i.
[Figure: a piecewise continuous function on [a, b], with jumps at the points of the subdivision]
Note that the values at the points of the subdivision are freely set.
Corollary 1.20 (piecewise continuous functions)

Let f : [a, b] → R be a piecewise continuous function. Then f is integrable.
Proof. Let a = x_0 < x_1 < x_2 < … < x_N = b be a subdivision as in Definition 1.19. We fix ε > 0. Then, for each i ∈ {1, …, N}, f|_{(x_{i−1}, x_i)} can be extended into a continuous function f_i : [x_{i−1}, x_i] → R. (Note that the existence of one-sided limits at the extremities of (x_{i−1}, x_i) is used at this stage.) Using the definition of infimum and supremum, we see that there exist two step functions φ_i, ψ_i : [x_{i−1}, x_i] → R such that φ_i ≤ f_i ≤ ψ_i and

I−(f_i) − ε ≤ ∫_{x_{i−1}}^{x_i} φ_i(x) dx ≤ I−(f_i)  and  I+(f_i) ≤ ∫_{x_{i−1}}^{x_i} ψ_i(x) dx ≤ I+(f_i) + ε.

As f_i is continuous, it is integrable by Theorem 1.18, so that I−(f_i) = I+(f_i). As a result,

0 ≤ ∫_{x_{i−1}}^{x_i} ψ_i(x) dx − ∫_{x_{i−1}}^{x_i} φ_i(x) dx ≤ 2ε.  (1.2)
We define the functions φ, ψ : [a, b] → R by φ(x) := φ_i(x) and ψ(x) := ψ_i(x) whenever x ∈ (x_{i−1}, x_i), 1 ≤ i ≤ N, as well as φ(x_i) = ψ(x_i) := f(x_i), 0 ≤ i ≤ N. Clearly, φ and ψ are step functions (a subdivision can be obtained by taking the points of subdivisions compatible with the φ_i's or ψ_i's) satisfying φ ≤ f ≤ ψ. Moreover, it is straightforward from Definition 1.7 that

∫_a^b φ(x) dx = ∑_{i=1}^N ∫_{x_{i−1}}^{x_i} φ_i(x) dx.

Using (1.2), we finally obtain

0 ≤ I+(f) − I−(f) ≤ ∫_a^b ψ(x) dx − ∫_a^b φ(x) dx ≤ 2Nε.
As ε was arbitrary (and N is fixed), we obtain I−(f) = I+(f), so that f is integrable.
1.2.3 First properties of the integral
Let us start with linearity.
Proposition 1.21 (linearity of the integral)

The following holds.

(i) Let f, g : [a, b] → R be integrable functions. Then f + g : x ∈ [a, b] ↦ f(x) + g(x) is integrable and

∫_a^b (f + g)(x) dx = ∫_a^b f(x) dx + ∫_a^b g(x) dx.

(ii) Let f : [a, b] → R be an integrable function and λ ∈ R. Then λf : x ∈ [a, b] ↦ λ × f(x) is integrable and

∫_a^b (λf)(x) dx = λ ∫_a^b f(x) dx.
Remark 1.22

The above proposition can be condensed into one statement: let f, g : [a, b] → R be integrable functions and λ ∈ R. Then λf + g is integrable and

∫_a^b (λf + g)(x) dx = λ ∫_a^b f(x) dx + ∫_a^b g(x) dx.

The previous statements correspond to the particular cases λ = 1 and g : x ∈ [a, b] ↦ 0.
Proof. Let us consider integrable functions f, g : [a, b] → R and λ ∈ R. We furthermore assume that λ > 0 for the time being. We also fix ε > 0. As f and g are integrable, there exist step functions φ ≤ f ≤ ψ and ϕ ≤ g ≤ ξ such that

∫_a^b f − ε ≤ ∫_a^b φ,  ∫_a^b ψ ≤ ∫_a^b f + ε,  ∫_a^b g − ε ≤ ∫_a^b ϕ,  ∫_a^b ξ ≤ ∫_a^b g + ε.  (1.3)

Now, it is easy to see that λφ + ϕ and λψ + ξ are step functions and satisfy λφ + ϕ ≤ λf + g ≤ λψ + ξ. Moreover, let a = x_0 < x_1 < x_2 < … < x_n = b be a subdivision compatible with φ and ϕ (and thus also with λφ + ϕ) and let us denote by c_i and d_i the constant values taken respectively by φ and by ϕ on (x_{i−1}, x_i). Then

∫_a^b (λφ + ϕ)(x) dx = ∑_{i=1}^n (x_i − x_{i−1})(λc_i + d_i) = λ ∑_{i=1}^n (x_i − x_{i−1}) c_i + ∑_{i=1}^n (x_i − x_{i−1}) d_i = λ ∫_a^b φ(x) dx + ∫_a^b ϕ(x) dx.
By the same argument, ∫_a^b (λψ + ξ) = λ ∫_a^b ψ + ∫_a^b ξ. Summing up,

λ ∫_a^b φ + ∫_a^b ϕ = ∫_a^b (λφ + ϕ) ≤ I−(λf + g) ≤ I+(λf + g) ≤ ∫_a^b (λψ + ξ) = λ ∫_a^b ψ + ∫_a^b ξ.

Adding this up with (1.3) yields

λ ∫_a^b f + ∫_a^b g − λε − ε ≤ I−(λf + g) ≤ I+(λf + g) ≤ λ ∫_a^b f + ∫_a^b g + λε + ε.
As ε is arbitrary, we obtain that I−(λf + g) = I+(λf + g) = λ ∫_a^b f + ∫_a^b g. This implies that λf + g is integrable and that its integral is λ ∫_a^b f + ∫_a^b g. The argument if λ < 0 is similar; one simply needs to reverse some inequalities (for instance, one has λψ + ϕ ≤ λf + g ≤ λφ + ξ instead).
Example 1.23

Using the computations we made above,

∫_0^1 (7x² − eˣ) dx = 7 ∫_0^1 x² dx − ∫_0^1 eˣ dx = 7 · 1/3 − (e − 1) = 10/3 − e.
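As a numerical cross-check of this linearity computation (using a midpoint-sum approximation, a device not introduced in the text):

```python
import math

# Midpoint approximation of the integral of 7x^2 - e^x over [0, 1];
# the exact value computed above is 10/3 - e.
n = 100_000
approx = sum((7 * ((i + 0.5) / n) ** 2 - math.exp((i + 0.5) / n)) / n
             for i in range(n))
# approx is close to 10/3 - e, approximately 0.6151.
```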
Exercise 1.24 (solution page 134)

(i) Admitting that ∫_0^1 xⁿ dx = 1/(n + 1), compute ∫_0^1 P(x) dx, where P(X) = ∑_{i=0}^n a_i Xⁱ.

(ii) Find a degree 2 polynomial P such that ∫_0^1 P(x) dx = 0.
Next, let us see how the integral behaves with inequalities.
Proposition 1.25 (positivity of the integral)

• Let f, g : [a, b] → R be integrable functions such that f ≤ g. Then

∫_a^b f(x) dx ≤ ∫_a^b g(x) dx.

• In particular, if f : [a, b] → R is an integrable function such that f ≥ 0, then ∫_a^b f(x) dx ≥ 0.
Proof. By linearity, the function g − f is integrable. Furthermore, the step function x ∈ [a, b] ↦ 0 is below g − f, so that, by definition of the integral as a supremum over step functions that are below,

0 = ∫_a^b 0 dx ≤ ∫_a^b (g − f)(x) dx = ∫_a^b g(x) dx − ∫_a^b f(x) dx

by linearity.
In the case of continuous functions, the previous result can be significantly strengthened.
Proposition 1.26

Let f : [a, b] → R₊ be a continuous function. Then

∫_a^b f(x) dx = 0 ⟹ f ≡ 0  (that is, ∀x ∈ [a, b], f(x) = 0).
Proof. If f ≢ 0, by continuity, there exists x_0 ∈ (a, b) such that f(x_0) > 0. Then, still by continuity, there exists ε > 0 such that a < x_0 − ε < x_0 + ε < b and, for all x ∈ [x_0 − ε, x_0 + ε], f(x) > (1/2) f(x_0). As f ≥ 0, we obtain that f ≥ (1/2) f(x_0) · 1_{[x_0−ε, x_0+ε]}, the latter function being a step function. Thus, by Proposition 1.25,

∫_a^b f(x) dx ≥ ∫_a^b (1/2) f(x_0) · 1_{[x_0−ε, x_0+ε]}(x) dx = 2ε · (1/2) f(x_0) = ε f(x_0) > 0.
Let us now look at some basic operations.
Proposition 1.27 (product, composition, min, max, absolute value)

Let f, g : [a, b] → R be integrable functions.

(i) Let Φ : R → R be a continuous function. Then the composition Φ ∘ f : x ∈ [a, b] ↦ Φ(f(x)) is integrable.

(ii) The product f × g : x ∈ [a, b] ↦ f(x) × g(x) is integrable.

(iii) The functions min(f, g) : x ∈ [a, b] ↦ min(f(x), g(x)) and max(f, g) : x ∈ [a, b] ↦ max(f(x), g(x)) are integrable.

(iv) The function |f| : x ∈ [a, b] ↦ |f(x)| is integrable and

| ∫_a^b f(x) dx | ≤ ∫_a^b |f(x)| dx.
Warning 1.28

Beware that, in contrast with addition or multiplication by a scalar (a real number), the above operations do not "commute" with the integral. For instance, ∫_a^b fg ≠ (∫_a^b f)(∫_a^b g). As a counterexample, consider for instance on [0, 2] the function f equal to 1 on [0, 1] and 0 elsewhere, and g := 1 − f. Clearly, fg ≡ 0 so that ∫_a^b fg = 0, whereas ∫_a^b f = ∫_a^b g = 1. (The computations are easy, as the three functions under consideration are step functions.)

Note: we use the notation ≡ for functional equality: f ≡ g on I means ∀x ∈ I, f(x) = g(x), and f ≢ g on I means ∃x ∈ I, f(x) ≠ g(x).
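This counterexample can be checked in a few lines (a sketch; the helper below hard-codes the subdivision 0 < 1 < 2, which is compatible with all three step functions involved):

```python
def f(x):
    # 1 on [0, 1], 0 elsewhere on [0, 2]
    return 1.0 if x <= 1 else 0.0

def g(x):
    return 1.0 - f(x)

def step_int(h):
    # Integral of a step function on [0, 2] with subdivision 0 < 1 < 2:
    # width of each piece times the constant value on its interior.
    return 1.0 * h(0.5) + 1.0 * h(1.5)

assert step_int(lambda x: f(x) * g(x)) == 0.0     # integral of fg is 0
assert step_int(f) == 1.0 and step_int(g) == 1.0  # product of integrals is 1
```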
Proof. (i). This proof is a bit more complicated than the others, as it somehow has "two layers" of difficulty. As f is integrable, it is bounded by definition: let M be such that |f| ≤ M. Let us fix ε > 0. By Heine's theorem, Φ is uniformly continuous on [−M, M]: there exists δ > 0 such that

|y − x| < δ ⟹ |Φ(y) − Φ(x)| ≤ ε.  (1.4)

Now, let φ ≤ f ≤ ψ be two step functions such that

∫_a^b f − δε ≤ ∫_a^b φ ≤ ∫_a^b f ≤ ∫_a^b ψ ≤ ∫_a^b f + δε,

and let a = x_0 < x_1 < x_2 < … < x_n = b be a subdivision compatible with both φ and ψ. We denote by c_i and d_i the values taken on (x_{i−1}, x_i) respectively by φ and ψ. Then,

∀x ∈ (x_{i−1}, x_i),  c_i ≤ f(x) ≤ d_i,  and thus  min_{[c_i, d_i]} Φ ≤ Φ ∘ f(x) ≤ max_{[c_i, d_i]} Φ.

We define the step functions ϕ and ξ by ϕ(x) = c′_i := min_{[c_i, d_i]} Φ and ξ(x) = d′_i := max_{[c_i, d_i]} Φ whenever x ∈ (x_{i−1}, x_i), 1 ≤ i ≤ n, as well as ϕ(x_i) = ξ(x_i) := Φ(f(x_i)), 0 ≤ i ≤ n. By construction, ϕ ≤ Φ ∘ f ≤ ξ.

On the one hand, by (1.4), when d_i − c_i < δ, then d′_i − c′_i ≤ ε. On the other hand, we always have d′_i − c′_i ≤ C, where we set C := max_{[−M, M]} Φ − min_{[−M, M]} Φ. We have
∫_a^b ξ − ∫_a^b ϕ = ∑_{i=1}^n (x_i − x_{i−1})(d′_i − c′_i)
  = ∑_{1 ≤ i ≤ n, d_i − c_i < δ} (x_i − x_{i−1})(d′_i − c′_i) + ∑_{1 ≤ i ≤ n, d_i − c_i ≥ δ} (x_i − x_{i−1})(d′_i − c′_i)
  ≤ (b − a) ε + C ∑_{1 ≤ i ≤ n, d_i − c_i ≥ δ} (x_i − x_{i−1}).

Finally, as

2δε ≥ ∫_a^b ψ − ∫_a^b φ = ∑_{i=1}^n (x_i − x_{i−1})(d_i − c_i)
  = ∑_{1 ≤ i ≤ n, d_i − c_i < δ} (x_i − x_{i−1})(d_i − c_i) + ∑_{1 ≤ i ≤ n, d_i − c_i ≥ δ} (x_i − x_{i−1})(d_i − c_i)
  ≥ δ ∑_{1 ≤ i ≤ n, d_i − c_i ≥ δ} (x_i − x_{i−1}),

we see that

∑_{1 ≤ i ≤ n, d_i − c_i ≥ δ} (x_i − x_{i−1}) ≤ 2ε,  so that  0 ≤ I+(Φ ∘ f) − I−(Φ ∘ f) ≤ ∫_a^b ξ − ∫_a^b ϕ ≤ (b − a + 2C) ε.
As ε is arbitrary, we obtain that I+(Φ ◦ f) = I−(Φ ◦ f), so that Φ ◦ f is integrable.
(ii). We fix ε > 0. There exist step functions φ ≤ f ≤ ψ and ϕ ≤ g ≤ ξ such that

0 ≤ ∫_a^b ψ(x) dx − ∫_a^b φ(x) dx ≤ ε  and  0 ≤ ∫_a^b ξ(x) dx − ∫_a^b ϕ(x) dx ≤ ε.

Let a = x_0 < x_1 < x_2 < … < x_n = b be a subdivision compatible with φ, ψ, ϕ and ξ. We denote by c_i, d_i, c′_i and d′_i the values taken on (x_{i−1}, x_i) respectively by φ, ψ, ϕ and ξ. Then, for x, y ∈ (x_{i−1}, x_i),

|fg(y) − fg(x)| = |f(y)g(y) − f(y)g(x) + f(y)g(x) − f(x)g(x)|
  ≤ |f(y)| |g(y) − g(x)| + |g(x)| |f(y) − f(x)|
  ≤ M(d_i − c_i + d′_i − c′_i),

where we denoted by M a bound for f and g. We define the step functions α and β by α(x) := inf_{(x_{i−1}, x_i)} fg and β(x) := sup_{(x_{i−1}, x_i)} fg whenever x ∈ (x_{i−1}, x_i), 1 ≤ i ≤ n, as well as α(x_i) = β(x_i) := fg(x_i), 0 ≤ i ≤ n. By construction, α ≤ fg ≤ β, and by the above bound,

β − α ≤ M(ψ − φ + ξ − ϕ),

so that

0 ≤ I+(fg) − I−(fg) ≤ ∫_a^b β − ∫_a^b α ≤ 2Mε.

As ε is arbitrary, we obtain that I+(fg) = I−(fg), so that fg is integrable.
(iii). We fix ε > 0. There exist step functions φ ≤ f ≤ ψ and ϕ ≤ g ≤ ξ such that

0 ≤ ∫_a^b ψ(x) dx − ∫_a^b φ(x) dx ≤ ε  and  0 ≤ ∫_a^b ξ(x) dx − ∫_a^b ϕ(x) dx ≤ ε.

Then min(φ, ϕ) and min(ψ, ξ) are step functions satisfying min(φ, ϕ) ≤ min(f, g) ≤ min(ψ, ξ), and it is not hard to check that 0 ≤ min(ψ, ξ) − min(φ, ϕ) ≤ ψ − φ + ξ − ϕ. We conclude as usual that min(f, g) is integrable. The proof for max(f, g) is the same.
(iv). Since x ∈ R ↦ |x| is a continuous function, (i) yields that the function |f| is integrable. Moreover, as −|f| ≤ f ≤ |f|, by positivity (Proposition 1.25),

− ∫_a^b |f(x)| dx ≤ ∫_a^b f(x) dx ≤ ∫_a^b |f(x)| dx,

which is a simple rewriting of the desired result.
Exercise 1.29 (solution page 134)

Knowing that ∫_1^n x^{−n} dx = (n^{−n+1} − 1)/(−n + 1), show that ∫_1^n sin(nx)/(1 + xⁿ) dx → 0 as n → ∞.
Exercise 1.30 (solution page 134)

Does it hold that

(i) ∫_a^b f(x)² dx = ( ∫_a^b f(x) dx )² ?

(ii) ∫_a^b √(f(x)) dx = √( ∫_a^b f(x) dx ) ?

(iii) ∫_a^b |f(x)| dx = | ∫_a^b f(x) dx | ?

(iv) ∫_a^b |f(x) + g(x)| dx = | ∫_a^b f(x) dx | + | ∫_a^b g(x) dx | ?
We are now interested in splitting the segment on which the function to integrate is defined. First, if a function is integrable on a segment, then it is also integrable on any subsegment.
Proposition 1.31 (restriction of an integrable function)

Let f : [a, b] → R be an integrable function and [a′, b′] ⊆ [a, b]. Then the restriction f|_{[a′, b′]} : [a′, b′] → R is integrable.
Proof. Let us fix ε > 0. There exist step functions φ, ψ : [a, b] → R such that φ ≤ f ≤ ψ and ∫_a^b ψ − ∫_a^b φ < ε. Then, φ|_{[a′, b′]} and ψ|_{[a′, b′]} are step functions such that φ|_{[a′, b′]} ≤ f|_{[a′, b′]} ≤ ψ|_{[a′, b′]}. Moreover,

0 ≤ I+(f|_{[a′, b′]}) − I−(f|_{[a′, b′]}) ≤ ∫_{a′}^{b′} (ψ − φ) ≤ ∫_a^b (ψ − φ) < ε.

We used the fact that ψ − φ ≥ 0, so that the sum defining ∫_a^b (ψ − φ) contains the sum defining ∫_{a′}^{b′} (ψ − φ) plus some other nonnegative terms (see the proof of (1.5) below for step functions). As ε is arbitrary, we obtain that I+(f|_{[a′, b′]}) = I−(f|_{[a′, b′]}), so that f|_{[a′, b′]} is integrable.
Let f : [a, b] → R be an integrable function and c ∈ (a, b). Then f|_{[a, c]} and f|_{[c, b]} are integrable. We claim that

∫_a^b f(x) dx = ∫_a^c f(x) dx + ∫_c^b f(x) dx.  (1.5)

First, it is quite clear that this holds for step functions. Indeed, if φ : [a, b] → R is a step function, let a = x_0 < x_1 < x_2 < … < x_k = c < x_{k+1} < … < x_n = b be a subdivision compatible with φ that contains c (recall that we can always add arbitrary points to subdivisions). Let us denote by c_i the value taken by φ on (x_{i−1}, x_i). Then

∫_a^c φ(x) dx + ∫_c^b φ(x) dx = ∑_{i=1}^k (x_i − x_{i−1}) c_i + ∑_{i=k+1}^n (x_i − x_{i−1}) c_i = ∑_{i=1}^n (x_i − x_{i−1}) c_i = ∫_a^b φ(x) dx.
Let us come back to a general integrable function f and let us fix ε > 0. There exist step functions φ and ψ such that φ ≤ f ≤ ψ and ∫_a^b ψ − ∫_a^b φ < ε. Then, using the result for the step function φ,

∫_a^b f − ∫_a^c f − ∫_c^b f ≤ ∫_a^b ψ − ∫_a^c φ − ∫_c^b φ = ∫_a^b ψ − ∫_a^b φ < ε.
Similarly, ∫_a^b f − ∫_a^c f − ∫_c^b f ≥ ∫_a^b φ − ∫_a^c ψ − ∫_c^b ψ = ∫_a^b φ − ∫_a^b ψ > −ε. In other words,

| ∫_a^b f(x) dx − ∫_a^c f(x) dx − ∫_c^b f(x) dx | < ε,

and (1.5) follows by letting ε → 0.
Reorganizing the terms in (1.5) yields

∫_a^b f(x) dx − ∫_c^b f(x) dx = ∫_a^c f(x) dx,

which suggests adopting the following convention.
Definition 1.32 (integrating "backwards")

Let f : [a, b] → R be an integrable function. Then

∫_b^a f(x) dx := − ∫_a^b f(x) dx.
Proposition 1.33 (Chasles's identity)

Let I be a segment and f : I → R be an integrable function. Then, for any a, b, c ∈ I,

∫_a^c f(x) dx = ∫_a^b f(x) dx + ∫_b^c f(x) dx.
Proof. We consider all 6 possible arrangements of a, b, c. The result then follows from (1.5) and Definition 1.32.
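Chasles's identity, combined with the "backwards" convention of Definition 1.32, can be tested numerically for any ordering of a, b, c. A sketch using midpoint sums (an approximation device, not part of the text's toolbox):

```python
import math

def integral(f, a, b, n=100_000):
    """Midpoint-sum approximation of the integral of f from a to b,
    applying Definition 1.32 when a > b."""
    if a > b:
        return -integral(f, b, a, n)
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Chasles with b = 3 outside [a, c] = [0, 2]: the backwards integral
# from 3 to 2 compensates, and both sides agree.
lhs = integral(math.sin, 0.0, 2.0)
rhs = integral(math.sin, 0.0, 3.0) + integral(math.sin, 3.0, 2.0)
```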
1.2.4 Integrals and derivatives
Definition 1.34 (primitive)

Let f : I → R be a function defined on an arbitrary interval I. A primitive or antiderivative of f is a differentiable function F : I → R such that F′ = f, that is, for all x ∈ I, F′(x) = f(x).
Warning 1.35 (primitives are never unique)

Beware that, if F is a primitive of f, then the function F + λ : x ↦ F(x) + λ is also a primitive of f, for any constant λ ∈ R. For this reason, we cannot speak of the primitive of f without any further specification.
More precisely, primitives of the same function differ by an additive constant.
Proposition 1.36 h all the primitives of a function g
Let $F_1 : I \to \mathbb{R}$ and $F_2 : I \to \mathbb{R}$ be primitives of the same function defined on an interval $I$. Then there exists a constant $c \in \mathbb{R}$ such that $F_2 = F_1 + c$, that is,
\[ \forall x \in I \,, \quad F_2(x) = F_1(x) + c \,. \]
Proof. By definition, the function $F_2 - F_1$ is differentiable and $(F_2 - F_1)' \equiv 0$. It thus amounts to seeing that a function $F$ with null derivative is constant. This can be done using the mean value theorem: for all $a, b \in I$ with $a < b$, there exists $c \in (a, b)$ such that $F(b) - F(a) = F'(c)(b - a) = 0$.
We may thus speak of the primitive whose value is given at some point, for instance, the primitive that cancels at 0.
It is common to denote any primitive of $f$ by $\int f$ or $\int f(x)\,dx$ (same notation as for integrals, without the extremities of the interval). Beware that this notation is problematic for two reasons. First, in contrast with $\int_a^b f$, which is a number, $\int f$ is a function. Moreover, as mentioned earlier, primitives are not uniquely defined: for instance, one could write $\int \sin = -\cos$ and $\int \sin = -\cos + 17$. From this notation, it would be tempting to deduce that $-\cos = \int \sin = -\cos + 17$, which is obviously wrong. I thus advise against using this notation. As you already know the derivatives of the usual functions, you also already know many primitives.
Example 1.37 h primitives of usual functions g
By $\mathbb{R}^\star_\pm$, we mean either $\mathbb{R}^\star_-$ or $\mathbb{R}^\star_+$. This table is read as follows: the function $x \in D \mapsto \cdot$ given in the second column admits as a primitive the function $x \in D \mapsto \cdot$ given in the third column.

$D$ | function $x \in D \mapsto \cdot$ | a primitive $x \in D \mapsto \cdot$
$\mathbb{R}$ | $x^n$, $n \in \mathbb{Z}_{\ge 0}$ | $\dfrac{x^{n+1}}{n+1}$
$\mathbb{R}^\star_\pm$ | $\dfrac{1}{x}$ | $\ln|x|$
$\mathbb{R}^\star_\pm$ | $x^{-n}$, $n \in \{2,3,4,\dots\}$ | $\dfrac{x^{-n+1}}{-n+1}$
$\mathbb{R}^\star_+$ | $x^\alpha$, $\alpha \in \mathbb{R} \setminus \{-1\}$ | $\dfrac{x^{\alpha+1}}{\alpha+1}$
$\mathbb{R}$ | $e^{\alpha x}$, $\alpha \in \mathbb{R}^\star$ | $\dfrac{1}{\alpha}\,e^{\alpha x}$
$\mathbb{R}$ | $\cos(x)$ | $\sin(x)$
$\mathbb{R}$ | $\sin(x)$ | $-\cos(x)$
$\mathbb{R}$ | $\dfrac{1}{\cos^2(x)} = 1 + \tan^2(x)$ | $\tan(x)$
$\mathbb{R}$ | $\dfrac{1}{\sin^2(x)} = \dfrac{1}{\tan^2(x)} + 1$ | $-\dfrac{1}{\tan(x)}$
$\mathbb{R}$ | $\dfrac{1}{1+x^2}$ | $\arctan(x)$
$(-1,1)$ | $\dfrac{1}{\sqrt{1-x^2}}$ | $\arcsin(x)$
$\mathbb{R}$ | $\cosh(x) := \dfrac{e^x + e^{-x}}{2}$ | $\sinh(x)$
$\mathbb{R}$ | $\sinh(x) := \dfrac{e^x - e^{-x}}{2}$ | $\cosh(x)$
$\mathbb{R}$ | $\dfrac{1}{\cosh^2(x)} = 1 - \tanh^2(x)$ | $\tanh(x)$
$\mathbb{R}$ | $\dfrac{1}{\sinh^2(x)} = \dfrac{1}{\tanh^2(x)} - 1$ | $-\dfrac{1}{\tanh(x)}$
$(-1,1)$ | $\dfrac{1}{1-x^2}$ | $\operatorname{artanh}(x) = \dfrac{1}{2}\ln\left(\dfrac{1+x}{1-x}\right)$
$\mathbb{R}$ | $\dfrac{1}{\sqrt{1+x^2}}$ | $\operatorname{arsinh}(x) = \ln\left(x + \sqrt{x^2+1}\right)$
$(1,+\infty)$ | $\dfrac{1}{\sqrt{x^2-1}}$ | $\operatorname{arcosh}(x) = \ln\left(x + \sqrt{x^2-1}\right)$
Moreover, by linearity of differentiation, we readily obtain the following property.
Proposition 1.38
h Let $F : I \to \mathbb{R}$ be a primitive of $f : I \to \mathbb{R}$, $G : I \to \mathbb{R}$ be a primitive of $g : I \to \mathbb{R}$, and $\lambda \in \mathbb{R}$. Then $\lambda F + G : I \to \mathbb{R}$ is a primitive of $\lambda f + g$.
h Let $F : I \to \mathbb{R}$ be a primitive of $f : I \to \mathbb{R}$ and $\alpha \in \mathbb{R}^\star$. Then $x \in I \mapsto \frac{1}{\alpha} F(\alpha x)$ is a primitive of $x \in I \mapsto f(\alpha x)$.
Proof. One has $(\lambda F + G)' = \lambda F' + G' = \lambda f + g$ and $\frac{d}{dx}\!\left( \frac{1}{\alpha} F(\alpha x) \right) = \frac{\alpha}{\alpha} F'(\alpha x) = f(\alpha x)$.
We now formalize the link between the problems we saw in Section 1.1.1.
Theorem 1.39 h first fundamental theorem of calculus g
Let $f \in C([a,b])$. For each $x \in [a,b]$, set
\[ F(x) := \int_a^x f(y)\,dy \,. \]
Then $F$ is a primitive of $f$. More precisely, $F \in C^1([a,b])$ and
\[ \forall x \in (a,b) \,, \quad F'(x) = f(x) \,, \]
together with the right derivative $F'_+(a) = f(a)$ and the left derivative $F'_-(b) = f(b)$.
Note that this function $F$ is the primitive of $f$ that cancels at $a$.
Warning 1.40 B integration variable B
Notice that we wrote $\int_a^x f(y)\,dy$ in the above statement and not $\int_a^x f(x)\,dx$. The latter does not make any sense as $x$ would be both a constant (the upper extremity of the integration segment) and a variable (the integration variable).
Proof. Assume that $x_0 \in (a,b)$ and let $h$ satisfy $0 < |h| < \min(x_0 - a, b - x_0)$. By Chasles’s identity,
\[ \frac{F(x_0+h) - F(x_0)}{h} = \frac{1}{h} \int_{x_0}^{x_0+h} f(x)\,dx \,. \]
Then, by integrating the constant function equal to $f(x_0)$ either on $[x_0, x_0+h]$ or $[x_0+h, x_0]$ (depending on the sign of $h$) and by linearity,
\[ \frac{F(x_0+h) - F(x_0)}{h} - f(x_0) = \frac{1}{h} \int_{x_0}^{x_0+h} \big( f(x) - f(x_0) \big)\,dx \,, \]
so that
\[ \left| \frac{F(x_0+h) - F(x_0)}{h} - f(x_0) \right| \le \frac{1}{|h|} \int_{x_0-|h|}^{x_0+|h|} \big| f(x) - f(x_0) \big|\,dx \le 2 \sup_{|x-x_0| \le |h|} \big| f(x) - f(x_0) \big| \,, \]
and the latter tends to 0 as $h \to 0$ by continuity of $f$ at $x_0$. This yields that $F$ is differentiable at $x_0$ and that $F'(x_0) = f(x_0)$. The cases $x_0 = a$ and $x_0 = b$ are treated similarly.
Warning 1.41 B Do not forget the continuity hypothesis B
Remember that $\int_a^x f(y)\,dy$ is not changed if you arbitrarily modify $f$ at a finite number of points. Theorem 1.39 thus has no chance to hold without the continuity hypothesis.
Warning 1.42 B not all integrable functions admit primitives! B
In fact, not all integrable functions admit primitives. Take for instance $x \in [0,2] \mapsto \mathbf{1}_{[1,2]}(x)$. This step function is integrable. By contradiction, a primitive of this function would be constant on $[0,1]$ and affine on $[1,2]$ with slope 1, and thus not differentiable at 1.
Let $f \in C([a,b])$ and $F : [a,b] \to \mathbb{R}$ be a primitive of $f$. By the previous theorem, $x \in [a,b] \mapsto \int_a^x f(y)\,dy$ is also a primitive of $f$, so that, by Proposition 1.36, there exists a constant $c \in \mathbb{R}$ such that, for all $x \in [a,b]$, $F(x) = c + \int_a^x f(y)\,dy$. Hence $F(b) - F(a) = \int_a^b f(y)\,dy$. We will now see that this property still holds under the weaker assumption that $f$ is integrable (and not necessarily continuous).
Definition 1.43 h notation [·] g
For a function $F : [a,b] \to \mathbb{R}$, we use the piece of notation
\[ [F]_a^b = [F(x)]_{x=a}^b := F(b) - F(a) \,. \]
Although not totally rigorous, we also tolerate the notation $[F(x)]_a^b$ when the integration variable is unequivocal.
Theorem 1.44 h second fundamental theorem of calculus g
Let $f : [a,b] \to \mathbb{R}$ be an integrable function and $F \in C([a,b])$ be differentiable on $(a,b)$ and such that
\[ \forall x \in (a,b) \,, \quad F'(x) = f(x) \,. \]
Then
\[ \int_a^b f(x)\,dx = [F]_a^b = F(b) - F(a) \,. \]
Proof. Let $\varepsilon > 0$. There exist step functions $\varphi, \psi : [a,b] \to \mathbb{R}$ such that $\varphi \le f \le \psi$ and $\int_a^b \psi - \int_a^b \varphi < \varepsilon$. Let $a = x_0 < x_1 < x_2 < \dots < x_n = b$ be a subdivision compatible with $\varphi$ and $\psi$. We denote by $c_i$ and $d_i$ the values taken on $(x_{i-1}, x_i)$ respectively by $\varphi$ and $\psi$. By the mean value theorem, for $1 \le i \le n$,
\[ (x_i - x_{i-1})\,c_i \le (x_i - x_{i-1}) \inf_{(x_{i-1}, x_i)} f \le F(x_i) - F(x_{i-1}) \le (x_i - x_{i-1}) \sup_{(x_{i-1}, x_i)} f \le (x_i - x_{i-1})\,d_i \,, \]
so that
\[ \int_a^b \varphi(x)\,dx = \sum_{i=1}^n (x_i - x_{i-1})\,c_i \le \sum_{i=1}^n \big( F(x_i) - F(x_{i-1}) \big) \le \sum_{i=1}^n (x_i - x_{i-1})\,d_i = \int_a^b \psi(x)\,dx \,. \]
The sum in the middle is a telescopic sum equal to $[F]_a^b$. As, furthermore,
\[ \int_a^b \varphi(x)\,dx \le \int_a^b f(x)\,dx \le \int_a^b \psi(x)\,dx \,, \]
we conclude that
\[ \left| [F]_a^b - \int_a^b f(x)\,dx \right| \le \int_a^b \psi - \int_a^b \varphi < \varepsilon \,. \]
The desired equality follows by letting $\varepsilon \to 0$.
This theorem gives a more efficient way to compute integrals.
Example 1.45
Let us come back to the integrals we computed before.
(i) $\displaystyle \int_0^1 e^x\,dx = \big[ e^x \big]_0^1 = e^1 - e^0 = e - 1$.
(ii) $\displaystyle \int_0^1 x^2\,dx = \left[ \frac{x^3}{3} \right]_0^1 = \frac{1}{3}$.
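The two values above are easy to sanity-check numerically. The following sketch is my addition (not part of the notes); `simpson` is an ad-hoc helper implementing the composite Simpson rule.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

# Example 1.45 (i): the integral of e^x on [0, 1] is e - 1.
print(abs(simpson(math.exp, 0, 1) - (math.e - 1)) < 1e-8)   # True
# Example 1.45 (ii): the integral of x^2 on [0, 1] is 1/3.
print(abs(simpson(lambda x: x * x, 0, 1) - 1 / 3) < 1e-10)  # True
```

Simpson's rule is exact on polynomials of degree at most 3, which is why the second check is essentially exact.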
Exercise 1.46 h What are the odds? g solution page 134
(i) Let $f : [-a,a] \to \mathbb{R}$ be an odd integrable function. Show that $\displaystyle \int_{-a}^a f(x)\,dx = 0$.
(ii) Let $g : [-a,a] \to \mathbb{R}$ be an even integrable function. Show that $\displaystyle \int_{-a}^a g(x)\,dx = 2 \int_0^a g(x)\,dx$.
1.2.5 Riemann sums
There are many links between integrals and discrete sums. In fact, the integral can be thought of as a sum over a continuum of points, whereas a discrete sum is a sum over a countable set of points. In old Latin, there were two versions of the letter “s” and the symbol $\int$ actually represents one of them, the one used in particular at the beginning of the word “summa”, meaning sum.
Theorem 1.47 h Riemann sums g
Let $f \in C([a,b])$. For each $n \in \mathbb{N}$, we consider a subdivision $a = x_{n,0} < x_{n,1} < x_{n,2} < \dots < x_{n,n} = b$ into $n$ parts and $n$ points $t_{n,i} \in [x_{n,i-1}, x_{n,i}]$, $1 \le i \le n$. If the mesh $\max_{1 \le i \le n} (x_{n,i} - x_{n,i-1})$ tends to 0 as $n \to \infty$, then
\[ \sum_{i=1}^n (x_{n,i} - x_{n,i-1})\,f(t_{n,i}) \xrightarrow[n \to \infty]{} \int_a^b f(t)\,dt \,. \]
Remark 1.48
In fact, the theorem still holds for integrable functions. This was the original definition of integrable functions by Riemann.
Proof. By linearity and using Chasles’s identity,
\[ \sum_{i=1}^n (x_{n,i} - x_{n,i-1})\,f(t_{n,i}) - \int_a^b f(t)\,dt = \sum_{i=1}^n \int_{x_{n,i-1}}^{x_{n,i}} \big( f(t_{n,i}) - f(t) \big)\,dt \,. \]
Now, let ε > 0. By Heine’s theorem, f is uniformly continuous on [a, b]: there exists δ > 0 such that
|y − x| < δ =⇒ |f(y) − f(x)| ≤ ε .
As the mesh tends to 0, for $n$ sufficiently large, one has $\max_{1 \le i \le n} (x_{n,i} - x_{n,i-1}) < \delta$. Then
\[ \left| \sum_{i=1}^n (x_{n,i} - x_{n,i-1})\,f(t_{n,i}) - \int_a^b f(t)\,dt \right| \le \sum_{i=1}^n \int_{x_{n,i-1}}^{x_{n,i}} \big| f(t_{n,i}) - f(t) \big|\,dt \le \sum_{i=1}^n \int_{x_{n,i-1}}^{x_{n,i}} \varepsilon\,dt = (b-a)\,\varepsilon \,. \]
The result follows.
The most common use of this theorem is by taking regular subdivisions and $t_{n,i} = x_{n,i}$, and particularly $a = 0$ and $b = 1$.
Corollary 1.49
h Let $f \in C([a,b])$. Then
\[ \frac{b-a}{n} \sum_{i=1}^n f\!\left( a + i\,\frac{b-a}{n} \right) \xrightarrow[n \to \infty]{} \int_a^b f(t)\,dt \,. \]
h Let $f \in C([0,1])$. Then
\[ \frac{1}{n} \sum_{i=1}^n f\!\left( \frac{i}{n} \right) \xrightarrow[n \to \infty]{} \int_0^1 f(t)\,dt \,. \]
Example 1.50
Let us compute the limit of $S_n := \displaystyle \sum_{k=1}^n \frac{1}{n+k}$.
We can rewrite $S_n = \displaystyle \frac{1}{n} \sum_{k=1}^n \frac{1}{1 + \frac{k}{n}}$. Setting $f : x \in [0,1] \mapsto \dfrac{1}{1+x}$,
\[ S_n = \frac{1}{n} \sum_{k=1}^n f\!\left( \frac{k}{n} \right) \xrightarrow[n \to \infty]{} \int_0^1 \frac{1}{1+x}\,dx = \big[ \ln|1+x| \big]_0^1 = \ln(2) - \ln(1) = \ln(2) \,. \]
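This convergence can be watched numerically. A minimal sketch (my addition; `riemann_sum` is an ad-hoc name):

```python
import math

def riemann_sum(f, a, b, n):
    """Right-endpoint Riemann sum over the regular subdivision into n parts."""
    h = (b - a) / n
    return h * sum(f(a + i * h) for i in range(1, n + 1))

# S_n = sum_{k=1}^n 1/(n+k) is exactly the Riemann sum of 1/(1+x) on [0, 1].
for n in (10, 100, 10000):
    print(n, riemann_sum(lambda x: 1 / (1 + x), 0, 1, n))
print(math.log(2))  # the limit, ln(2) ≈ 0.6931
```

The error of such a one-sided Riemann sum decays like $1/n$, which the printed values illustrate.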
Exercise 1.51 solution page 135
Find the limits of $S_n := \displaystyle \sum_{k=1}^n \frac{e^{k/n}}{n}$ and $S'_n := \displaystyle \sum_{k=1}^n \frac{n}{(n+k)^2}$.
1.2.6 Toolbox
We now know that integration is the inverse operation of differentiation. When we want to integrate a function, we may recognize the derivative of a known function (maybe up to some factors to fine-tune afterward). However, most of the time, we are not that lucky. We will now see the two main tools allowing us to compute common integrals.
Theorem 1.52 h integration by parts g
Let $u, v \in C^1([a,b])$. Then
\[ \int_a^b u(x)\,v'(x)\,dx = \big[ u v \big]_a^b - \int_a^b u'(x)\,v(x)\,dx \,. \]
Proof. As $u, v \in C^1([a,b])$, one has $(uv)' = u'v + uv'$ and all these functions are continuous and thus integrable. The result is obtained by integrating this equality.
This theorem is used as follows. When we do not know how to integrate a function, we try to write it as a product of two functions such that, after differentiating the first factor and integrating the second factor, we are left with a function that we can integrate. This takes some practice.
Remark 1.53 h color code g
Throughout these notes, we use one color for the function we differentiate and another for the function we integrate.
Example 1.54
Let us see some examples.
(i) Let us compute $\displaystyle \int_0^1 x e^x\,dx$:
\[ \int_0^1 x\,e^x\,dx = \big[ x\,e^x \big]_0^1 - \int_0^1 1 \cdot e^x\,dx = e - \int_0^1 e^x\,dx = e - \big[ e^x \big]_0^1 = e - (e - 1) = 1 \,. \]
(ii) The product is usually not as straightforward as above to identify. Let us compute $\displaystyle \int_1^x \ln(t)\,dt$:
\[ \int_1^x 1 \cdot \ln(t)\,dt = \big[ t \ln(t) \big]_1^x - \int_1^x t \cdot \frac{1}{t}\,dt = x \ln(x) - \int_1^x 1\,dt = x \ln(x) - \big[ t \big]_1^x = x \ln(x) - x + 1 \,. \]
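Both computations can be cross-checked by numerical quadrature. The sketch below is my addition (with an ad-hoc `simpson` helper); it compares the closed forms obtained by integration by parts with a direct numerical estimate.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

# (i) the integral of x e^x over [0, 1] should be 1.
print(simpson(lambda t: t * math.exp(t), 0, 1))
# (ii) the integral of ln(t) over [1, x] should be x ln(x) - x + 1.
x = 3.0
print(simpson(math.log, 1, x), x * math.log(x) - x + 1)
```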
Exercise 1.55 solution page 135
(i) Compute $\displaystyle \int_1^e x \ln(x)\,dx$.
(ii) Find a primitive of $x \in \mathbb{R} \mapsto x^2 e^x$.
Theorem 1.56 h integration by substitution or change of variable g
Let $f : I \to \mathbb{R}$ be a continuous function on an interval $I$ and $\Phi : [a,b] \to I$ be a differentiable function with integrable derivative. Then
\[ \int_{\Phi(a)}^{\Phi(b)} f(y)\,dy = \int_a^b f\big( \Phi(x) \big)\,\Phi'(x)\,dx \,. \]
Proof. Since $f$ is continuous, it has a primitive $F$. The function $F \circ \Phi$ is differentiable as a composition of differentiable functions: by the chain rule,
\[ \forall x \in (a,b) \,, \quad (F \circ \Phi)'(x) = f\big( \Phi(x) \big)\,\Phi'(x) \,. \]
By the fundamental theorem of calculus (applied twice),
\[ \int_a^b f\big( \Phi(x) \big)\,\Phi'(x)\,dx = \int_a^b (F \circ \Phi)'(x)\,dx = \big[ F \circ \Phi \big]_a^b = F\big( \Phi(b) \big) - F\big( \Phi(a) \big) = \big[ F \big]_{\Phi(a)}^{\Phi(b)} = \int_{\Phi(a)}^{\Phi(b)} f(y)\,dy \,, \]
as claimed.
In practice, we may remember the formula as follows. We want to set $y = \Phi(x)$, so that $\frac{dy}{dx} = \Phi'(x)$. Working heuristically with infinitesimals yields $dy = \Phi'(x)\,dx$, which we barbarously “replace” in the left-hand side integral. We then modify the extremities by noting that, as $x$ goes from $a$ to $b$, then $y = \Phi(x)$ goes from $\Phi(a)$ to $\Phi(b)$. Bear in mind that this is not a proof, simply a means of remembering the formula. Observe that this works because the notation has been well chosen (which is far from being always the case in maths). As with integration by parts, finding a good substitution is sometimes straightforward and sometimes quite involved; practice is the key!
Remark 1.57 h color code g
As much as possible, we will use the following color code for substitution:
\[ \int_{\Phi(a)}^{\Phi(b)} f(y)\,dy = \int_a^b f\big( \Phi(x) \big)\,\Phi'(x)\,dx \,. \]
Example 1.58 h a primitive of tan g
Let us find a primitive of $\tan : (-\frac{\pi}{2}, \frac{\pi}{2}) \to \mathbb{R}$, for instance the one that cancels at 0:
\[ x \in \left( -\frac{\pi}{2}, \frac{\pi}{2} \right) \mapsto \int_0^x \tan(t)\,dt \,. \]
As $\tan = \frac{\sin}{\cos}$, it appears appropriate to use the substitution $u = \cos(t)$, yielding $du = -\sin(t)\,dt$. Then, for $x \in (-\frac{\pi}{2}, \frac{\pi}{2})$,
\[ \int_0^x \tan(t)\,dt = \int_0^x \frac{\sin(t)}{\cos(t)}\,dt = -\int_{\cos(0)}^{\cos(x)} \frac{du}{u} = -\big[ \ln|u| \big]_{\cos(0)}^{\cos(x)} = -\ln\big( \cos(x) \big) \,. \]
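One can check with finite differences that $F(x) = -\ln(\cos(x))$ indeed differentiates to $\tan$ and cancels at 0 — a quick sketch, my addition:

```python
import math

# F should satisfy F'(x) = tan(x) on (-pi/2, pi/2) and F(0) = 0.
F = lambda x: -math.log(math.cos(x))
h = 1e-6
for x in (-1.2, 0.3, 0.7):
    approx = (F(x + h) - F(x - h)) / (2 * h)  # symmetric difference quotient
    print(x, approx, math.tan(x))
print(abs(F(0.0)))  # 0.0
```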
Exercise 1.59 solution page 136
(i) Using the change of variable $x = \sin(t)$, compute $\displaystyle \int_0^{1/2} \frac{1}{(1-x^2)^{3/2}}\,dx$.
(ii) Using the change of variable $x = \tan(t)$, find a primitive of $x \in \mathbb{R} \mapsto \dfrac{1}{(1+x^2)^{3/2}}$.
1.3 Integration of rational functions
We are interested in this section in integrating rational functions, that is, functions of the form
\[ R : x \mapsto \frac{P(x)}{Q(x)} \,, \]
where $P$, $Q \in \mathbb{R}[X]$ are nonzero polynomials. Dividing both the numerator and the denominator by the leading coefficient of $Q$, we may assume without loss of generality that $Q$ is monic, that is, has leading coefficient equal to 1. If $\deg(Q) = 0$, which means that $Q = 1$, then we are dealing with a polynomial, which we already know how to integrate. In the following, we therefore assume
\[ \deg(Q) \ge 1 \,. \]
1.3.1 Partial fraction decomposition
We factor $Q$ in $\mathbb{R}[X]$ as
\[ Q = \prod_{j=1}^p \big( X - r_j \big)^{n_j} \prod_{j=1}^q \big( X^2 + 2b_j X + c_j \big)^{m_j} \]
with distinct real numbers $r_j \in \mathbb{R}$, and distinct pairs $(b_j, c_j) \in \mathbb{R}^2$ such that $b_j^2 - c_j < 0$. The real numbers $r_1, \dots, r_p$ are the real roots of $Q$, with multiplicities $n_1, \dots, n_p$. The factors $X^2 + 2b_j X + c_j$ correspond to the pairs of nonreal conjugate roots, with multiplicities $m_1, \dots, m_q$.
Remark 1.60 h reduced discriminant g
Recall that the discriminant of $ax^2 + bx + c$ is $b^2 - 4ac$ and that, if it is positive, then the roots are $\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$. If one wants to simplify things and get rid of these factors 2, one may use the reduced discriminant; this is what we do in these notes. The reduced discriminant of $ax^2 + 2bx + c$ is $b^2 - ac$; if it is positive, then the roots are $\frac{-b \pm \sqrt{b^2 - ac}}{a}$.
We admit the following theorem, whose proof is beyond the reach of this course.
Theorem 1.61 h partial fraction decomposition g
There exists a unique way to write
\[ \frac{P(X)}{Q(X)} = E(X) + \sum_{j=1}^p \sum_{k=1}^{n_j} \frac{\alpha_{j,k}}{(X - r_j)^k} + \sum_{j=1}^q \sum_{k=1}^{m_j} \frac{\beta_{j,k} X + \gamma_{j,k}}{(X^2 + 2b_j X + c_j)^k} \]
where $E(X) \in \mathbb{R}[X]$ and the quantities $\alpha_{j,k}$, $\beta_{j,k}$, $\gamma_{j,k}$ are real numbers.
Definition 1.62 h integral part, partial fraction g
h The polynomial $E(X)$ is called the integral$^a$ part of the rational fraction.
h Each element of the sums above is called a partial fraction.
$^a$The word integral is here the adjective corresponding to integer. It refers to the fact that the polynomial ring $\mathbb{R}[X]$ behaves like the integer ring $\mathbb{Z}$ in regards to Euclidean division; it does not refer to the Riemann integral.
In fact, similarly to the case of integer division, there is a unique way to write $P(X) = E(X)Q(X) + S(X)$ with $E$, $S \in \mathbb{R}[X]$ and $\deg(S) < \deg(Q)$. To convince yourself, let us see how we determine $E$ in practice through the following algorithm. First, let $m := \deg(Q)$, so that the leading term of $Q$ is $X^m$. Let $a_n X^n$ denote the leading term of $P$. If $n < m$, we stop. Otherwise, we write
\[ P = a_n X^{n-m} Q + \underbrace{P - a_n X^{n-m} Q}_{\text{polynomial of degree} \,<\, n} \]
and we reiterate the process with the polynomial $P - a_n X^{n-m} Q$. As the degree decreases by at least 1 at each step, the algorithm terminates.
Example 1.63 h Euclidean division g
Let us divide $2X^4 - 3X^3 + 2X - 6$ by $X^2 + 1$. We write
\begin{align*}
2X^4 - 3X^3 + 2X - 6 &= 2X^2(X^2 + 1) - 3X^3 - 2X^2 + 2X - 6 \\
&= (2X^2 - 3X)(X^2 + 1) - 2X^2 + 5X - 6 \\
&= (2X^2 - 3X - 2)(X^2 + 1) + 5X - 4 \,.
\end{align*}
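The division algorithm described above is short to code. A sketch (my addition; polynomials are plain coefficient lists, highest degree first, and `polydiv` is an ad-hoc name):

```python
def polydiv(P, Q):
    """Euclidean division of polynomials given as coefficient lists,
    highest degree first: returns (E, S) with P = E*Q + S, deg S < deg Q.
    Assumes Q is normalized so that its leading coefficient is nonzero."""
    R = list(P)
    E = []
    while len(R) >= len(Q):
        c = R[0] / Q[0]          # next coefficient of the quotient
        E.append(c)
        for i in range(len(Q)):  # subtract c * X^(deg R - deg Q) * Q
            R[i] -= c * Q[i]
        R.pop(0)                 # the leading coefficient is now 0
    return E, R

# Example 1.63: divide 2X^4 - 3X^3 + 2X - 6 by X^2 + 1.
E, S = polydiv([2, -3, 0, 2, -6], [1, 0, 1])
print(E, S)  # [2.0, -3.0, -2.0] [5.0, -4.0]
```

The quotient $2X^2 - 3X - 2$ and remainder $5X - 4$ match the hand computation.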
Alternatively, as we know that the Euclidean division is possible, we may use indeterminate coefficients for $E$ and $S$, expand $EQ + S$ and identify the coefficients. The degree of $E$ is $\deg(P) - \deg(Q)$ and that of $S$ is at most $\deg(Q) - 1$.
Example 1.64
Coming back to the previous example, we solve
\begin{align*}
2X^4 - 3X^3 + 2X - 6 &= (aX^2 + bX + c)(X^2 + 1) + dX + e \\
&= aX^4 + bX^3 + (a + c)X^2 + (b + d)X + (c + e) \,,
\end{align*}
which yields $a = 2$, $b = -3$, $c = -2$, $d = 5$, $e = -4$. The integral part of the rational fraction $\frac{2X^4 - 3X^3 + 2X - 6}{X^2 + 1}$ is thus $2X^2 - 3X - 2$. We obtain
\[ \frac{2X^4 - 3X^3 + 2X - 6}{X^2 + 1} = 2X^2 - 3X - 2 + \frac{5X - 4}{X^2 + 1} \,. \]
In practice, we will first do the above Euclidean division, obtaining
\[ \frac{P(X)}{Q(X)} = E(X) + \frac{S(X)}{Q(X)} \,, \]
and then do the partial fraction decomposition
\[ \frac{S(X)}{Q(X)} = \sum_{j=1}^p \sum_{k=1}^{n_j} \frac{\alpha_{j,k}}{(X - r_j)^k} + \sum_{j=1}^q \sum_{k=1}^{m_j} \frac{\beta_{j,k} X + \gamma_{j,k}}{(X^2 + 2b_j X + c_j)^k} \,. \tag{1.6} \]
In order to determine the constants of this partial fraction decomposition, we can always multiply by $Q(X)$ and obtain an equality of polynomials. The left-hand side polynomial is $S$, whereas the coefficients of the right-hand side polynomial are linear expressions in the constants to be determined. Equating the coefficients yields a system of linear equations that always admits a unique solution, which can be found by the usual methods of linear algebra.
Alternatively, there are usually faster methods that can simplify the computations as much as possible. In particular, multiplying both sides of (1.6) by $Q(X)$, one obtains an equality of polynomials, so that equality holds when $X$ is replaced with any real number $x$ (even any complex number). The same is thus true without multiplying by $Q(x)$, provided of course that $Q(x) \neq 0$.
(a) Observe that $\displaystyle \lim_{x \to +\infty} \frac{x\,S(x)}{Q(x)} = \sum_{j=1}^p \alpha_{j,1} + \sum_{j=1}^q \beta_{j,1}$.
(b) We can multiply by $(X - r_j)^{n_j}$ and specify the equation at $r_j$. We obtain
\[ \alpha_{j,n_j} = \frac{S(r_j)}{Q_{/(X-r_j)^{n_j}}(r_j)} \,, \]
where $Q_{/(X-r_j)^{n_j}}(X) := \dfrac{Q(X)}{(X - r_j)^{n_j}}$. Once $\alpha_{j,n_j}$ is determined, we may subtract $\dfrac{\alpha_{j,n_j}}{(X - r_j)^{n_j}}$ and reiterate.
(c) Note also that, as
\[ Q_{/(X-r_j)^{n_j}}(X) = \prod_{\substack{k=1 \\ k \neq j}}^p \big( X - r_k \big)^{n_k} \prod_{k=1}^q \big( X^2 + 2b_k X + c_k \big)^{m_k} \,, \]
we also have $Q_{/(X-r_j)^{n_j}}(r_j) = \frac{1}{n_j!}\,Q^{(n_j)}(r_j)$, where $Q^{(k)}$ denotes the $k$-th derivative of $Q$.
(d) Similarly to (b), we can multiply by $(X^2 + 2b_j X + c_j)^{m_j}$ and specify at a root. We obtain an equality between two complex numbers and thus two equations on real numbers allowing us to determine the corresponding constants $\beta_{j,m_j}$ and $\gamma_{j,m_j}$.
Unless you have a hunch, it is usually a good idea to start with (b) and (d). Remember that, when you are stuck, you can always multiply by $Q$ and equate the coefficients, or specify the equation at values giving simple equations.
Example 1.65
\[ \frac{1}{X^2 - 3X + 2} = \frac{1}{(X-1)(X-2)} = \frac{a}{X-1} + \frac{b}{X-2} \,. \]
Multiplying by $X - 1$ and specifying at 1 gives $\frac{1}{1-2} = a$, and multiplying by $X - 2$ and specifying at 2 gives $\frac{1}{2-1} = b$. Consequently,
\[ \frac{1}{X^2 - 3X + 2} = \frac{-1}{X-1} + \frac{1}{X-2} \,. \]
With some training, we can also directly write
\[ \frac{1}{X^2 - 3X + 2} = \frac{1}{(X-1)(X-2)} = \frac{(X-1) - (X-2)}{(X-1)(X-2)} = \frac{1}{X-2} - \frac{1}{X-1} \,. \]
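The “multiply and specify” trick can even be carried out numerically: near a simple pole $r$, the product $(x - r)R(x)$ approaches the coefficient of $\frac{1}{x-r}$. A quick sketch (my addition):

```python
# Numerical version of "multiply by (X - r) and specify at r":
# for a simple pole r, (x - r) * R(x) tends to the coefficient of 1/(x - r).
R = lambda x: 1 / (x**2 - 3 * x + 2)
eps = 1e-8
a = eps * R(1 + eps)  # coefficient of 1/(x - 1)
b = eps * R(2 + eps)  # coefficient of 1/(x - 2)
print(round(a, 4), round(b, 4))  # -1.0 1.0
```

This recovers the coefficients $-1$ and $1$ found above, up to an error of order $\varepsilon$.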
Example 1.66
\[ \frac{4X^3}{(X-1)^2 (X^2+1)} = \frac{a}{X-1} + \frac{b}{(X-1)^2} + \frac{cX + d}{X^2 + 1} \,. \]
Multiplying by $(X-1)^2$ and specifying at 1 gives $b = 2$. Multiplying by $X^2 + 1$ and specifying at $\mathrm{i}$ gives
\[ c\,\mathrm{i} + d = \frac{4\mathrm{i}^3}{(\mathrm{i}-1)^2} = \frac{-4\mathrm{i}\,(\mathrm{i}+1)^2}{(-2)^2} = 2 \,, \]
so that $c = 0$ and $d = 2$. Finally, (a) yields $4 = a + c$, so that $a = 4$. One may also get $0 = -a + b + d$ by specifying at 0, for instance. All in all,
\[ \frac{4X^3}{(X-1)^2 (X^2+1)} = \frac{4}{X-1} + \frac{2}{(X-1)^2} + \frac{2}{X^2+1} \,. \]
1.3.2 Integrating partial fractions
Let us now see how to integrate our rational function $R : x \mapsto \frac{P(x)}{Q(x)}$.
Warning 1.67 B domain of definition B
Beware that $R$ is defined on $\mathbb{R} \setminus \{r_1, \dots, r_p\}$, which is a finite union of open intervals. One thus has to integrate separately on each of these intervals.
By linearity, we may integrate separately the integral part, as well as each partial fraction. The integral part is a polynomial, which we know how to integrate. It remains to deal with the partial fractions.
h Integrating $x \mapsto (x-r)^{-k}$, $r \in \mathbb{R}$, $k \in \mathbb{N}$. This is, up to translation, a power function. One can do the change of variable $y = x - r$.

$D$ | function $x \in D \mapsto \cdot$ | a primitive $x \in D \mapsto \cdot$
$(-\infty, r)$ or $(r, +\infty)$ | $\dfrac{1}{x-r}$ | $\ln|x-r|$
$(-\infty, r)$ or $(r, +\infty)$ | $\dfrac{1}{(x-r)^k}$, $k \in \{2,3,4,\dots\}$ | $\dfrac{1}{-k+1}\,(x-r)^{-k+1}$
Example 1.68
Coming back to Example 1.65, let us find the primitives of $R : x \mapsto \frac{1}{x^2 - 3x + 2}$. We start by observing that this function is defined on $\mathbb{R} \setminus \{1, 2\}$. As, for all $x \notin \{1, 2\}$,
\[ \frac{1}{x^2 - 3x + 2} = \frac{1}{x-2} - \frac{1}{x-1} \,, \]
the primitives of $R$ are the functions
\[ x \in \mathbb{R} \setminus \{1, 2\} \mapsto \ln\left| \frac{x-2}{x-1} \right| + \begin{cases} c_1 & \text{if } x < 1 \\ c_2 & \text{if } 1 < x < 2 \\ c_3 & \text{if } x > 2 \end{cases} \,, \]
where $c_1, c_2, c_3 \in \mathbb{R}$ are arbitrary constants. Do not forget that, although the general expression is always the same, there is one integration constant per interval of the domain of definition.
h Integrating $x \mapsto (\beta x + \gamma)(x^2 + 2bx + c)^{-k}$, $\beta, \gamma, b, c \in \mathbb{R}$, $b^2 - c < 0$, $k \in \mathbb{N}$. This one is harder; here is the way to proceed. Do not learn the results by heart, remember the method! We first deal with the $x$ factor in the numerator. The idea is to transform the expression in such a way that we obtain $\frac{u'(x)}{u(x)^k}$, which we can integrate. Here $u(x) = x^2 + 2bx + c$, so that $u'(x) = 2x + 2b$:
\[ \frac{\beta x + \gamma}{(x^2 + 2bx + c)^k} = \frac{\beta}{2}\,\frac{2x + 2b}{(x^2 + 2bx + c)^k} + (\gamma - \beta b)\,\frac{1}{(x^2 + 2bx + c)^k} \,. \]
Still by linearity, we have two terms to treat. The first one is treated thanks to the change of variable $y = x^2 + 2bx + c$, so that $dy = (2x + 2b)\,dx$ (which was the point of making this factor appear in the numerator).

$D$ | function $x \in D \mapsto \cdot$ | a primitive $x \in D \mapsto \cdot$
$\mathbb{R}$ | $\dfrac{2x + 2b}{x^2 + 2bx + c}$ | $\ln(x^2 + 2bx + c)$
$\mathbb{R}$ | $\dfrac{2x + 2b}{(x^2 + 2bx + c)^k}$, $k \in \{2,3,4,\dots\}$ | $\dfrac{1}{-k+1}\,(x^2 + 2bx + c)^{-k+1}$
The remaining term $x \mapsto (x^2 + 2bx + c)^{-k}$ is the most complicated to integrate. The first step is to transform the expression so that it looks like $(y^2 + 1)^{-k}$. To this end, we see $x^2 + 2bx$ as the beginning of the expansion of $(x + b)^2$:
\[ x^2 + 2bx + c = (x + b)^2 + c - b^2 \,. \]
Remark 1.69
By the way, recall that this is the method we use in order to solve second-degree equations ($a \neq 0$):
\[ ax^2 + 2bx + c = 0 \iff a\left( x + \frac{b}{a} \right)^2 - \frac{b^2 - ac}{a} = 0 \iff \left( x + \frac{b}{a} \right)^2 = \frac{b^2 - ac}{a^2} \,. \]
Setting $y = \dfrac{x + b}{\sqrt{c - b^2}}$ does the trick (recall that $b^2 - c < 0$): we obtain
\[ \frac{dx}{(x^2 + 2bx + c)^k} = \frac{\sqrt{c - b^2}}{(c - b^2)^k}\,\frac{dy}{(y^2 + 1)^k} \,. \]
It finally remains to integrate $y \mapsto (y^2 + 1)^{-k}$. Set $f_k : y \in \mathbb{R} \mapsto \displaystyle \int_0^y \frac{dt}{(t^2 + 1)^k}$.
h If k = 1, then we know that f1(y) = arctan(y), for y ∈ R.
h If $k \ge 2$, we obtain a relation between $f_k$ and $f_{k-1}$ thanks to an integration by parts:
\begin{align*}
f_{k-1}(y) &= \int_0^y 1 \cdot (t^2 + 1)^{-k+1}\,dt \\
&= \big[ t\,(t^2 + 1)^{-k+1} \big]_0^y - 2(-k+1) \int_0^y t \cdot t\,(t^2 + 1)^{-k}\,dt \\
&= y\,(y^2 + 1)^{-k+1} - 2(-k+1) \int_0^y (t^2 + 1 - 1)(t^2 + 1)^{-k}\,dt \\
&= y\,(y^2 + 1)^{-k+1} - 2(-k+1)\big( f_{k-1}(y) - f_k(y) \big) \,,
\end{align*}
so that
\[ f_k(y) = \frac{2k - 3}{2k - 2}\,f_{k-1}(y) + \frac{1}{2k - 2}\,\frac{y}{(y^2 + 1)^{k-1}} \,. \]
Of course, you are not expected to know this result by heart! You are, however, expected to be able to recover it.
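The reduction formula is easy to implement and to cross-check against a direct Riemann sum — a sketch, my addition:

```python
import math

def f(k, y):
    """f_k(y) = integral of 1/(t^2+1)^k on [0, y], via the reduction formula."""
    if k == 1:
        return math.atan(y)
    return ((2 * k - 3) / (2 * k - 2) * f(k - 1, y)
            + y / ((2 * k - 2) * (y * y + 1) ** (k - 1)))

# Cross-check f_3(2) against a midpoint Riemann sum.
n = 200000
h = 2.0 / n
approx = sum(h / (((i + 0.5) * h) ** 2 + 1) ** 3 for i in range(n))
print(f(3, 2.0), approx)  # both ≈ 0.5852
```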
Example 1.70
Coming back to Example 1.66, we compute
\begin{align*}
\int_{-2}^0 \frac{4x^3}{(x-1)^2 (x^2+1)}\,dx &= 4 \int_{-2}^0 \frac{dx}{x-1} + 2 \int_{-2}^0 \frac{dx}{(x-1)^2} + 2 \int_{-2}^0 \frac{dx}{x^2+1} \\
&= 4 \big[ \ln|x-1| \big]_{-2}^0 + 2 \left[ \frac{-1}{x-1} \right]_{-2}^0 + 2 \big[ \arctan(x) \big]_{-2}^0 \\
&= -4\ln(3) + \frac{4}{3} - 2\arctan(-2) \,.
\end{align*}
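A numerical comparison of the original integral with the closed form just obtained (my addition, with an ad-hoc `simpson` helper):

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return s * h / 3

g = lambda x: 4 * x**3 / ((x - 1) ** 2 * (x**2 + 1))
closed = -4 * math.log(3) + 4 / 3 - 2 * math.atan(-2)
print(simpson(g, -2, 0), closed)  # both ≈ -0.8468
```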
Exercise 1.71 solution page 136
Find primitives of the functions
(i) $x \mapsto \dfrac{4x + 5}{x^2 + x - 2}$  (ii) $x \mapsto \dfrac{6 - x}{x^2 - 4x + 4}$  (iii) $x \mapsto \dfrac{2x - 3}{x^2 - 4x + 5}$
Exercise 1.72 solution page 137
Compute the following:
(i) $\displaystyle \int_0^1 \frac{dx}{x^2 + x + 1}$  (ii) $\displaystyle \int_0^1 \frac{x\,dx}{x^2 + x + 1}$  (iii) $\displaystyle \int_0^1 \frac{dx}{(x^2 + x + 1)^2}$  (iv) $\displaystyle \int_0^1 \frac{x\,dx}{(x^2 + x + 1)^2}$
1.3.3 Rational functions in other functions
In this section, we consider rational functions in other well-behaved functions. The key to this section is the following fact: if $R_1$ and $R_2$ are rational functions, then so is $R_1 \circ R_2$. In words, if you have a rational function in $y$ and substitute $y$ with a rational expression in $x$, then you obtain a rational function in $x$.
h Rational functions in exp. The first and simplest example is that of functions that are rational in the exponential function, that is, functions of the form
\[ x \mapsto R(e^x) \,, \]
where $R$ is a rational function. These are dealt with by using the change of variable $y = e^x$, for which $dy = e^x\,dx = y\,dx$. For any $a, t \in \mathbb{R}$ such that $R$ is defined on $[e^a, e^t]$ or $[e^t, e^a]$ (depending on whether $a < t$ or $a \ge t$), we have
\[ \int_a^t R(e^x)\,dx = \int_{e^a}^{e^t} \frac{R(y)}{y}\,dy \,. \]
As $y \mapsto \frac{R(y)}{y}$ is a rational function, we can compute this integral by the method of the previous section.
Exercise 1.73 solution page 138
Compute $\displaystyle \int_1^2 \frac{2e^{2x} - 3e^x + 2}{e^{2x} - e^x}\,dx$.
h Rational functions in cos and sin. We now consider functions of the form
\[ x \mapsto R\big( \cos(x), \sin(x) \big) \,, \]
where $R$ is a rational function in two variables, that is, $R(X,Y) = \frac{P(X,Y)}{Q(X,Y)}$ for some nonzero polynomials in two variables $P$, $Q \in \mathbb{R}[X,Y]$. Here again, the point is to find a substitution that brings us back to integrating a rational function (of one variable). There are essentially four possibilities, depending on the form of the rational function under consideration:
h $u = \cos(x)$, $du = -\sin(x)\,dx$;
h $u = \sin(x)$, $du = \cos(x)\,dx$;
h $u = \tan(x)$, $du = (1 + u^2)\,dx$;
h $t = \tan\left( \frac{x}{2} \right)$.
The last one plays a special role; we will come back to it shortly. In order to decide which substitution to make, remember the following:
h $\cos^2 + \sin^2 = 1$, so that we can always “transform” $\cos^2$ into $\sin^2$ and vice versa.
h $\frac{1}{\cos^2} = 1 + \tan^2$, so that we can always “transform” $\cos^2$ into $\tan^2$.
h Never forget $du$.
By transform, we mean that the change yields another rational function. Now, the idea is to get in the end only $\sin(x)$, or only $\cos(x)$, or only $\tan(x)$ in your expression, not forgetting the $du$ part. For instance, if your rational function takes the form $R(\cos(x), \sin^2(x))\sin(x)$ where $R$ is a rational function, then the substitution $u = \cos(x)$ will work, as the last $\sin(x)$ will be swallowed by $du$ and the $\sin^2(x)$ can be replaced by $1 - \cos^2(x)$. Similarly, if your rational function takes the form $R(\cos^2(x), \sin(x))\cos(x)$, then the substitution $u = \sin(x)$ will work, and, if your rational function takes the form $R(\cos^2(x), \tan(x))$, then the substitution $u = \tan(x)$ will work.
Example 1.74
\begin{align*}
\int_0^{\pi/6} \frac{dx}{\cos(x)} &= \int_0^{\pi/6} \frac{\cos(x)\,dx}{\cos^2(x)} = \int_0^{\pi/6} \frac{\cos(x)\,dx}{1 - \sin^2(x)} \\
&= \int_0^{1/2} \frac{du}{1 - u^2} \qquad (u = \sin(x)) \\
&= \frac{1}{2} \int_0^{1/2} \frac{du}{1 - u} + \frac{1}{2} \int_0^{1/2} \frac{du}{1 + u} \\
&= \frac{1}{2} \big[ -\ln(1 - u) \big]_0^{1/2} + \frac{1}{2} \big[ \ln(1 + u) \big]_0^{1/2} = \frac{\ln(3)}{2} \,.
\end{align*}
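A numerical check of this value (my addition, with an ad-hoc `simpson` helper):

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return s * h / 3

val = simpson(lambda x: 1 / math.cos(x), 0, math.pi / 6)
print(val, math.log(3) / 2)  # both ≈ 0.5493
```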
It may happen that none of the above substitutions works. In this case, we can use the half-angle tangent substitution $t = \tan\left( \frac{x}{2} \right)$, which always works but gives more complicated expressions.
Proposition 1.75 h half-angle tangent formulas g
For $x \not\equiv \pi \pmod{2\pi}$ and $t = \tan\left( \frac{x}{2} \right)$, one has
\[ \cos(x) = \frac{1 - t^2}{1 + t^2} \,, \quad \sin(x) = \frac{2t}{1 + t^2} \,, \quad \tan(x) = \frac{2t}{1 - t^2} \,, \quad dx = \frac{2\,dt}{1 + t^2} \,. \]
Proof. From the usual trigonometric identities,
\[ \cos(x) = \cos^2\!\left( \frac{x}{2} \right) - \sin^2\!\left( \frac{x}{2} \right) = \cos^2\!\left( \frac{x}{2} \right)(1 - t^2) \quad \text{and} \quad \sin(x) = 2\sin\!\left( \frac{x}{2} \right)\cos\!\left( \frac{x}{2} \right) = 2t\,\cos^2\!\left( \frac{x}{2} \right) \,. \]
Furthermore,
\[ \cos^2\!\left( \frac{x}{2} \right) = \frac{\cos^2\!\left( \frac{x}{2} \right)}{\cos^2\!\left( \frac{x}{2} \right) + \sin^2\!\left( \frac{x}{2} \right)} = \frac{1}{1 + t^2} \,, \]
and, finally, $dt = \frac{1}{2} \tan'\!\left( \frac{x}{2} \right) dx = \frac{1}{2}(1 + t^2)\,dx$.
Using the above proposition, we see that we can replace every occurrence of $\sin(x)$ and of $\cos(x)$ by a rational expression in $t$, and that $dx$ is also replaced by a rational expression in $t$ multiplied by $dt$. As a result, we always obtain a rational function in $t$, which we can integrate.
Example 1.76
Using the half-angle tangent substitution, we compute
\begin{align*}
\int_{-\pi/2}^0 \frac{dx}{1 - \sin(x)} &= \int_{-1}^0 \frac{1}{1 - \frac{2t}{1+t^2}}\,\frac{2\,dt}{1 + t^2} = 2 \int_{-1}^0 \frac{dt}{1 + t^2 - 2t} \\
&= 2 \int_{-1}^0 \frac{dt}{(1 - t)^2} = 2 \left[ \frac{1}{1 - t} \right]_{-1}^0 = 2 \left( 1 - \frac{1}{2} \right) = 1 \,.
\end{align*}
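Both sides of the substitution can be checked numerically — a sketch, my addition, using a midpoint Riemann sum:

```python
import math

def midpoint(g, a, b, n=100000):
    """Midpoint Riemann sum on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# The integral in x and its image under t = tan(x/2) should both equal 1.
lhs = midpoint(lambda x: 1 / (1 - math.sin(x)), -math.pi / 2, 0)
rhs = midpoint(lambda t: 2 / (1 - t) ** 2, -1, 0)
print(lhs, rhs)  # both ≈ 1.0
```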
Exercise 1.77 solution page 138
Compute the following:
(i) $\displaystyle \int_{-\pi/2}^{\pi/2} \sin^2(x)\cos^3(x)\,dx$, (ii) $\displaystyle \int_0^{\pi/2} \cos^4(x)\,dx$, (iii) $\displaystyle \int_0^{2\pi} \frac{dx}{2 + \sin(x)}$.
h Rational functions with radicals. We hereby consider functions of the form
\[ x \mapsto R\left( x, \sqrt[n]{\frac{ax + b}{cx + d}} \right) \,, \]
where $R$ is a rational function in two variables, $a, b, c, d \in \mathbb{R}$ satisfy $ad - bc \neq 0$, and $n \ge 2$ is an integer. In this case, the substitution
\[ y = \sqrt[n]{\frac{ax + b}{cx + d}} \]
is your friend. Indeed, observe that $y^n = \dfrac{ax + b}{cx + d}$, and thus $n y^{n-1}\,dy = \dfrac{ad - bc}{(cx + d)^2}\,dx$ and $x = \dfrac{d y^n - b}{a - c y^n}$. We can thus express both arguments of $R$, as well as $\frac{dx}{dy}$, as rational functions in $y$. We are then left with integrating a rational function in $y$, as desired.
Example 1.78
Using the substitution $y = \sqrt{\dfrac{x}{1 + x}}$, we obtain $y^2 = \dfrac{x}{1 + x}$, $2y\,dy = \dfrac{dx}{(1 + x)^2}$, $x = \dfrac{y^2}{1 - y^2}$, and then
\[ \int_0^1 \sqrt{\frac{x}{1 + x}}\,dx = \int_0^{\frac{\sqrt{2}}{2}} y \cdot 2y \left( 1 + \frac{y^2}{1 - y^2} \right)^2 dy = 2 \int_0^{\frac{\sqrt{2}}{2}} \frac{y^2}{(1 - y^2)^2}\,dy \,. \]
Let us do the partial fraction decomposition of the integrand:
\[ \frac{y^2}{(1 - y^2)^2} = \frac{y^2}{(1 - y)^2 (1 + y)^2} = \frac{a}{1 - y} + \frac{b}{(1 - y)^2} + \frac{c}{1 + y} + \frac{d}{(1 + y)^2} \,. \]
Using (b), we multiply by $(1 \pm y)^2$ and cancel the term, which gives $b = d = \frac{1}{4}$. From (a), we get $0 = -a + c$ and specifying at $y = 0$ yields $0 = a + c + \frac{1}{2}$, so that $a = c = -\frac{1}{4}$ and
\begin{align*}
2 \int_0^{\frac{\sqrt{2}}{2}} \frac{y^2}{(1 - y^2)^2}\,dy &= \frac{1}{2} \int_0^{\frac{\sqrt{2}}{2}} \left( -\frac{1}{1 - y} + \frac{1}{(1 - y)^2} - \frac{1}{1 + y} + \frac{1}{(1 + y)^2} \right) dy \\
&= \frac{1}{2} \left[ \ln|1 - y| + \frac{1}{1 - y} - \ln|1 + y| - \frac{1}{1 + y} \right]_0^{\frac{\sqrt{2}}{2}} \\
&= \sqrt{2} - \frac{1}{2} \ln\big( 3 + 2\sqrt{2} \big) \,.
\end{align*}
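Since the integrand is bounded by $\frac{\sqrt{2}}{2} \approx 0.707$ on $[0,1]$, the value of the integral must lie below $0.707$. A numerical check (my addition) confirms the value $\approx 0.5328$, i.e. $\sqrt{2} - \frac{1}{2}\ln(3 + 2\sqrt{2})$ — note in particular the minus sign in front of the logarithm.

```python
import math

def midpoint(g, a, b, n=200000):
    """Midpoint Riemann sum on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

val = midpoint(lambda x: math.sqrt(x / (1 + x)), 0, 1)
closed = math.sqrt(2) - 0.5 * math.log(3 + 2 * math.sqrt(2))
print(val, closed)  # both ≈ 0.5328
```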
Exercise 1.79 solution page 139
Compute $\displaystyle \int_0^1 \frac{x^2 + 1}{\sqrt{x + 1}}\,dx$.
1.4 Improper integrals
The integral we have defined so far is not completely satisfactory for two reasons: we might want to integrate unbounded functions and we might want to integrate on intervals that are not segments (bounded or unbounded). For instance, for the well-known fundamental formula
\[ \int_{-\infty}^{+\infty} e^{-\frac{x^2}{2}}\,dx = \sqrt{2\pi} \]
to make sense, one needs a broader notion of integral. This is what we are going to do in this section.
1.4.1 Definition and first properties
Recall from the begining of Section 1.2 the different forms an interval can have. We denote by R :=R ∪ {−∞,+∞} the set of real numbers to which we add −∞ and +∞. The extremities of an arbitraryinterval are then elements of R. We write x −−→
x∈Ia to mean that x tends to awhile staying in I . In these
notes, it can be x −−−→x>a
a, which we shorten as x >→ a, if a is the left extremity of I or x −−−→x<a
a, which
we shorten as x <→ a, if a is the right extremity of I . The notation x → a and x → a for x decreasing
39
Chapter 1. Riemann theory of integration
or increasing to a are also acceptable. The notation x → a+ and x → a− is quite widespread becausepractical but less rigorous as, obviously, a+ and a− do not represent anything.
Definition 1.80 h integrable function g
Let $I$ be an interval and $f : I \to \mathbb{R}$ be a function that is integrable on every segment included in $I$.
h Let $a \in \overline{\mathbb{R}}$ be an extremity of $I$ and $c \in I$. We say that $f$ is integrable at $a$ if
\[ \int_c^x f(t)\,dt \ \text{admits a finite limit as} \ x \xrightarrow[x \in I]{} a \,. \]
h We say that $f$ is integrable on $I$ if it is integrable at both extremities of $I$. In this case, denoting by $a$ and $b$ respectively the left and right extremities of $I$, and fixing $c \in I$, we define
\[ \int_I f = \int_I f(t)\,dt = \int_a^b f = \int_a^b f(t)\,dt := \lim_{x \overset{>}{\to} a} \int_x^c f(t)\,dt + \lim_{y \overset{<}{\to} b} \int_c^y f(t)\,dt \]
and $\displaystyle \int_b^a f = \int_b^a f(t)\,dt := -\int_a^b f$.
A few remarks are in order.
h In practice, we will mainly work with continuous functions, which are automatically integrable on every segment included in their interval of definition, by Theorem 1.18.
h The definition does not depend on the choice of $c$. Indeed, let $c, d \in I$. By Chasles’s identity,
\[ \int_c^x f(t)\,dt = \int_c^d f(t)\,dt + \int_d^x f(t)\,dt \,, \]
so that $\int_c^x f(t)\,dt$ converges as $x \to a$ if and only if $\int_d^x f(t)\,dt$ converges. Furthermore,
\[ \int_x^c f(t)\,dt + \int_c^y f(t)\,dt = \int_x^d f(t)\,dt + \underbrace{\int_d^c f(t)\,dt + \int_c^d f(t)\,dt}_{= \, 0} + \int_d^y f(t)\,dt \,. \]
h If the extremity $a$ belongs to $I$, then $f$ is automatically integrable at $a$. Indeed, one can take $c = a$ in the definition and fix $d \in I$. As $f$ is integrable on $[a, d]$, it is bounded and, for $x \in [a, d]$,
\[ \left| \int_a^x f(t)\,dt \right| \le |a - x| \sup_{[a,d]} |f| \to 0 \ \text{as} \ x \to a \,. \]
As a result, if $f$ is integrable on $I$ and $b$ denotes the other extremity of $I$, then
\[ \int_a^b f = \lim_{y \to b} \int_a^y f(t)\,dt \,. \]
From this, we also conclude that, if $I$ is a segment, then the integral defined here is equal to the integral defined earlier.
h If $I$ is not bounded or if $f$ is not integrable on $[a, b]$, where $a$ and $b$ respectively denote the left and right extremities of $I$, we speak of an improper integral (we say that $\int_I f$ is an improper integral, even if $f$ is not integrable on $I$). Furthermore, if $f$ is integrable on $I$, we say that the integral $\int_I f$ converges, whereas, if it is not integrable, we say that the integral $\int_I f$ diverges.
h In the case where both extremities of $I$ do not belong to $I$, we cannot treat them at once with a single limit. For instance, $\int_{-x}^x t\,dt = 0 \to 0$ as $x \to \infty$, but $t \in \mathbb{R} \mapsto t$ is not integrable. Indeed, $\int_0^x t\,dt = \frac{x^2}{2} \to +\infty$, so that $t \in \mathbb{R} \mapsto t$ is not integrable at $+\infty$.
When we can compute a primitive $F$ of $f$, the problem boils down to knowing whether $F$ admits limits or not at one extremity or both extremities of $I$.
Example 1.81
(i) Let us see whether \(t \in \mathbb{R}_+ \mapsto \frac{1}{1+t^2}\) is integrable:
\[
\int_0^x \frac{dt}{1+t^2} = \big[\arctan(t)\big]_0^x = \arctan x \xrightarrow[x\to\infty]{} \frac{\pi}{2} .
\]
It is thus integrable and \(\int_0^{+\infty} \frac{dt}{1+t^2} = \frac{\pi}{2}\). In terms of the geometric problem, it means that, although not bounded, the domain below the graph has finite area.

[Figure: graph of t ↦ 1/(1 + t²) on ℝ₊; the unbounded region below it has area π/2.]
(ii) The function \(t \mapsto \frac{1}{1+t}\) is not integrable on ℝ₊ as
\[
\int_0^x \frac{dt}{1+t} = \big[\ln(1+t)\big]_0^x = \ln(1+x) \xrightarrow[x\to\infty]{} +\infty .
\]
(iii) The integral \(\int_0^1 \ln(t)\,dt\) converges because
\[
\int_x^1 \ln(t)\,dt = \big[t\ln(t)\big]_x^1 - \int_x^1 \frac{t}{t}\,dt = -x\ln(x) - (1-x) \to -1 \quad \text{as } x \to 0^+ .
\]
(iv) The integral \(\int_0^1 \frac{dt}{t}\) diverges since \(\int_x^1 \frac{dt}{t} = \big[\ln(t)\big]_x^1 = -\ln(x) \to +\infty\) as \(x \to 0^+\).
(v) The integral \(\int_{-\infty}^{+\infty} \frac{2t\,dt}{(1+t^2)^2}\) converges and is equal to 0. Indeed,
\[
\int_x^0 \frac{2t\,dt}{(1+t^2)^2} = \Big[\frac{-1}{1+t^2}\Big]_x^0 = \frac{1}{1+x^2} - 1 \xrightarrow[x\to-\infty]{} -1 ,
\qquad
\int_0^y \frac{2t\,dt}{(1+t^2)^2} = \Big[\frac{-1}{1+t^2}\Big]_0^y = 1 - \frac{1}{1+y^2} \xrightarrow[y\to+\infty]{} 1 ,
\]
and the two limits sum to 0. The choice c = 0 made the computation easier, but an arbitrary value of c would simply add and subtract \(\frac{-1}{1+c^2}\).
Chapter 1. Riemann theory of integration
Exercise 1.82 (Mean of an exponential random variable) solution page 139
Let λ > 0. Compute \(\int_0^{+\infty} \lambda t e^{-\lambda t}\,dt\).
Exercise 1.83 solution page 139
For n ≥ 0, compute \(\int_0^{+\infty} t^n e^{-t}\,dt\).
Definition 1.84 (notation [·])
We extend the notation \([F]_a^b\) for a function F : (a, b) → ℝ and a, b ∈ \(\overline{\mathbb{R}}\) by setting
\[
[F]_a^b := \lim_{y\to b} F(y) - \lim_{x\to a} F(x)
\]
whenever both limits exist, at least in \(\overline{\mathbb{R}}\), and the difference does not result in an indeterminate form.
With this piece of notation, we may for instance directly write
\[
\int_0^{+\infty} \frac{dt}{1+t^2} = \big[\arctan(t)\big]_0^{+\infty} = \frac{\pi}{2} .
\]
We will, however, refrain from writing \(\big[x^2\big]_{-\infty}^{+\infty}\).
This generalized notion of integral mostly obeys the same rules as the integral on a segment.
Proposition 1.85 (Chasles's identity)
Let I be an interval and f : I → ℝ be a function that is integrable on every segment included in I. We denote by a, b ∈ \(\overline{\mathbb{R}}\) respectively the left and right extremities of I, and let c ∈ I. Then
\[
\int_a^b f(x)\,dx \text{ converges} \iff \int_c^b f(x)\,dx \text{ and } \int_a^c f(x)\,dx \text{ converge},
\]
in which case
\[
\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx .
\]
Proof. From the third remark after Definition 1.80, \(\int_c^b f(x)\,dx\) converges if and only if \(\int_c^x f(t)\,dt\) admits a limit as \(x \to b^-\), and similarly for \(\int_a^c f(x)\,dx\). The result then follows from the second remark after Definition 1.80.
Proposition 1.86 (linearity of the integral)
Let f, g : I → ℝ be two functions that are integrable on an interval I and let λ ∈ ℝ. Then the function λf + g : I → ℝ is integrable and
\[
\int_I (\lambda f + g)(x)\,dx = \lambda \int_I f(x)\,dx + \int_I g(x)\,dx .
\]
Warning 1.87
The converse is false: it is possible to find f, g not integrable such that f + g is integrable. Take for instance any nonintegrable function and its opposite.
Proof. For any x < c < y in I, one has, by linearity of the integral on the segments [x, c] and [c, y],
\[
\int_x^c (\lambda f + g) = \lambda \int_x^c f + \int_x^c g
\quad\text{and}\quad
\int_c^y (\lambda f + g) = \lambda \int_c^y f + \int_c^y g ,
\]
and the result follows by summing after taking the limits as x tends to the left extremity of I and y tends to the right extremity of I.
Proposition 1.88 (positivity of the integral)
• Let I be an interval and f, g : I → ℝ be integrable functions such that f ≤ g. Then
\[
\int_I f(x)\,dx \le \int_I g(x)\,dx .
\]
• In particular, if f : I → ℝ is an integrable function such that f ≥ 0, then \(\int_I f(x)\,dx \ge 0\).
Proof. For any x < c < y in I, one has, by positivity of the integral on the segments [x, c] and [c, y],
\[
\int_x^c f \le \int_x^c g \quad\text{and}\quad \int_c^y f \le \int_c^y g ,
\]
and the result follows by summing after taking the limits as x tends to the left extremity of I and y tends to the right extremity of I.
The notion of improper integrals is based on the convergence of functions. As the Cauchy criterion is very useful in this case, especially when the limit is not known, it is natural to translate it into the context of improper integrals. Recall that a function g : I → ℝ admits a finite limit at the left extremity a ∈ \(\overline{\mathbb{R}}\) of I if and only if
\[
\forall \varepsilon > 0,\ \exists d \in I : a < u, v < d \implies \big| g(u) - g(v) \big| < \varepsilon
\]
and at the right extremity b ∈ \(\overline{\mathbb{R}}\) of I if and only if
\[
\forall \varepsilon > 0,\ \exists d \in I : d < u, v < b \implies \big| g(u) - g(v) \big| < \varepsilon .
\]
Proposition 1.89 (Cauchy's convergence criterion)
Let I be an interval with left and right extremities a, b ∈ \(\overline{\mathbb{R}}\), and let f : I → ℝ be a function that is integrable on every segment included in I.
• Then f is integrable at a if and only if
\[
\forall \varepsilon > 0,\ \exists d \in I : a < u, v < d \implies \left| \int_u^v f(x)\,dx \right| < \varepsilon .
\]
• Similarly, f is integrable at b if and only if
\[
\forall \varepsilon > 0,\ \exists d \in I : d < u, v < b \implies \left| \int_u^v f(x)\,dx \right| < \varepsilon .
\]
Proof. We apply the Cauchy convergence criterion to the function \(g : x \in I \mapsto \int_c^x f(t)\,dt\), where c ∈ I is arbitrary, observing that \(|g(u) - g(v)| = \big| \int_u^v f(t)\,dt \big|\).
1.4.2 Nonnegative functions
When we are not able to explicitly compute primitives in order to determine the nature (convergent or divergent) and the value of an integral, we need alternative criteria to conclude about the nature of integrals. Near an extremity of its interval of definition, a function
• either has constant sign;
• or changes sign infinitely many times.
In the latter case, we speak of oscillating functions, which we will study in the next section. In the present section, we concentrate on functions whose sign becomes constant near the extremity under consideration. Up to shortening the interval of definition or taking the opposite of the function, we may without loss of generality assume that we are dealing with a nonnegative function.
[Figure: a function of constant sign near an extremity, on intervals (a, +∞) and (a, b).]
In this case, under the usual assumption that f : I → ℝ₊ is integrable on every segment included in I, the question is to know whether
\[
x \mapsto \int_c^x f(t)\,dt
\]
converges to a finite limit as x tends to the extremities of I. By positivity (using that \(f\mathbf{1}_{[c,x]} \le f\mathbf{1}_{[c,y]}\) for c ≤ x ≤ y, or Chasles's identity), we see that the displayed function is nondecreasing. As a result, it always admits limits (finite or infinite) at the extremities of I. More precisely, denoting as usual by a, b ∈ \(\overline{\mathbb{R}}\) the left and right extremities of I,
\[
f \text{ is not integrable at } a \iff \int_x^c f(t)\,dt \xrightarrow[x\to a^+]{} +\infty ;
\qquad
f \text{ is not integrable at } b \iff \int_c^y f(t)\,dt \xrightarrow[y\to b^-]{} +\infty .
\]
Definition 1.90 (integral of nonintegrable nonnegative functions)
It makes sense to extend the piece of notation \(\int_I f\) in this context of nonnegative functions by setting
\[
\int_I f := +\infty
\]
if f is not integrable.
In this setting, it always holds that
\[
\int_a^b f(t)\,dt = \lim_{x\to a^+} \int_x^c f(t)\,dt + \lim_{y\to b^-} \int_c^y f(t)\,dt \;\in\; \mathbb{R}_+ \cup \{+\infty\} .
\]
We similarly extend the notation to nonpositive functions, replacing +∞ with −∞. Let us mention here that linearity still holds for nonintegrable functions, provided the multiplying scalar is nonnegative. This is only used for computational purposes.
Proposition 1.91 (linearity for nonnegative functions)
Let f, g : I → ℝ₊ be two functions that are integrable on every segment of an interval I and let α ∈ ℝ₊. Then the following equality holds in ℝ₊ ∪ {+∞}:
\[
\int_I (\alpha f + g)(x)\,dx = \alpha \int_I f(x)\,dx + \int_I g(x)\,dx .
\]
Proof. Observe first that αf + g is a nonnegative function. By linearity, for any x ≤ c ≤ y, one has \(\int_x^c (\alpha f + g) = \alpha \int_x^c f + \int_x^c g\) and \(\int_c^y (\alpha f + g) = \alpha \int_c^y f + \int_c^y g\). The result follows by summing after taking the limits as x tends to the left extremity of I and y tends to the right extremity of I.
Proposition 1.92 (sequential criterion)
Let I be an interval with left and right extremities a, b ∈ \(\overline{\mathbb{R}}\), and let f : I → ℝ₊ be a nonnegative function that is integrable on every segment included in I.
• For any sequences \((a_n), (b_n) \in I^{\mathbb{N}}\) such that \(a_n \to a\) and \(b_n \to b\) as n → ∞,
\[
\int_{a_n}^{b_n} f(x)\,dx \to \int_a^b f(x)\,dx \;\in\; \mathbb{R}_+ \cup \{+\infty\} .
\]
• In particular, if there exist two sequences \((a_n), (b_n) \in I^{\mathbb{N}}\) such that \(a_n \to a\) and \(b_n \to b\) as n → ∞ and
\[
\int_{a_n}^{b_n} f(x)\,dx
\]
admits a finite limit as n → ∞, then f is integrable.
Proof. This comes from the above remark that \(y \mapsto \int_c^y f(t)\,dt\) is nondecreasing and thus converges, as \(y \to b^-\), toward \(\int_c^b f(t)\,dt\). The result follows from the sequential criterion for the limit of a function.
Remark 1.93
This no longer holds if one drops the positivity assumption: for instance,
\[
\int_0^{n\pi} \cos(x)\,dx = \big[\sin(x)\big]_0^{n\pi} = 0 \to 0
\]
as n → ∞, whereas cos is clearly not integrable on ℝ₊ (\(\int_0^x \cos(t)\,dt = \sin(x)\) does not converge).
We use the following proposition in order to decide whether a nonnegative function is integrable or not.
Proposition 1.94 (comparison of nonnegative functions)
Let f, g : I → ℝ₊ be two nonnegative functions that are integrable on every segment of an interval I and are such that f ≤ g.
• Then the following inequality holds in ℝ₊ ∪ {+∞}:
\[
\int_I f(x)\,dx \le \int_I g(x)\,dx .
\]
• In particular, if g is integrable, then f is also integrable. If f is not integrable, then neither is g.
Proof. By positivity, for any x ≤ c ≤ y, one has \(\int_x^c f \le \int_x^c g\) and \(\int_c^y f \le \int_c^y g\). The result follows by summing after taking the limits as x tends to the left extremity of I and y tends to the right extremity of I.
This proposition is mainly used through its second item (which is better remembered as an application of the first item). Namely, we bound the function under study either from below with a nonintegrable nonnegative function or from above with an integrable nonnegative function. Recall that we are merely interested in what happens near an extremity of the interval of study, so that this comparison only needs to hold in this vicinity.
Example 1.95
Let us show that
\[
\int_1^{+\infty} t^\alpha e^{-t}\,dt
\]
converges, for any value of α ∈ ℝ. The integrand is a positive function, so that we can use the previous proposition. The idea is thus to bound the integrand from above with a function that we know is integrable.
The point is that \(t \mapsto e^{-t}\) thwarts the behavior of \(t \mapsto t^\alpha\), no matter the value of α. More precisely, we write \(t^\alpha e^{-t} = t^\alpha e^{-t/2}\, e^{-t/2}\) and observe that \(t \in [1,+\infty) \mapsto t^\alpha e^{-t/2}\) is bounded from above by some constant \(M_\alpha\), since \(t^\alpha e^{-t/2} \to 0\) as t → ∞. As a result, for all t ≥ 1,
\[
t^\alpha e^{-t} \le M_\alpha\, e^{-t/2} .
\]
We conclude by noticing that \(t \in [1,+\infty) \mapsto e^{-t/2}\) is integrable:
\[
\int_1^x e^{-t/2}\,dt = \big[ -2e^{-t/2} \big]_1^x = 2e^{-1/2} - 2e^{-x/2} \xrightarrow[x\to+\infty]{} 2e^{-1/2} .
\]
The exact value of the latter integral does not matter: we could simply note that it is less than \(2e^{-1/2}\), so that the integral is bounded.
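The bound above can be spot-checked numerically. In the sketch below (Python; the grid and the sample values of α are arbitrary choices of ours), the constant \(M_\alpha\) is taken as the maximum of \(t^\alpha e^{-t/2}\) on [1, +∞), attained at t = max(1, 2α):

```python
import math

def M(alpha):
    # sup over [1, +inf) of t^alpha * exp(-t/2); the critical point is t = 2*alpha.
    t_star = max(1.0, 2 * alpha)
    return t_star ** alpha * math.exp(-t_star / 2)

# Check t^alpha * e^{-t} <= M(alpha) * e^{-t/2} on a grid of t >= 1.
ok = all(
    t ** a * math.exp(-t) <= M(a) * math.exp(-t / 2) + 1e-12
    for a in (0.5, 1.0, 3.0, 7.5)
    for t in (1 + 0.1 * k for k in range(1000))
)
print(ok)
```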
We now make the previous proposition a bit more precise. Recall the following notation for real functions f, g defined in a neighborhood of b ∈ \(\overline{\mathbb{R}}\):
• \(f(x) = O(g(x))\) when x → b if there exists M > 0 such that |f(x)| ≤ M |g(x)| for every x close to b;
• \(f(x) \sim g(x)\) when x → b if, for all ε > 0, we have |f(x) − g(x)| < ε |g(x)| for x close to b;
• \(f(x) = o(g(x))\) when x → b if, for all ε > 0, we have |f(x)| < ε |g(x)| for x close to b.
Theorem 1.96 (refined comparison of nonnegative functions)
Let f, g : I → ℝ₊ be two nonnegative functions that are integrable on every segment of an interval I and let b ∈ \(\overline{\mathbb{R}}\) denote the right extremity of I.
(i) Assume \(f(x) = O(g(x))\) when \(x \to b^-\). Then, if g is integrable at b, f is also integrable at b. Equivalently, if f is not integrable at b, then g is not integrable at b either.
(ii) Assume \(f(x) \sim g(x)\) when \(x \to b^-\). Then f is integrable at b if and only if g is integrable at b. Moreover,
(a) if g is integrable at b, then \(\int_x^b f(y)\,dy \sim \int_x^b g(y)\,dy\) when \(x \to b^-\);
(b) if g is not integrable at b, then, for c ∈ I fixed, \(\int_c^x f(y)\,dy \sim \int_c^x g(y)\,dy\) when \(x \to b^-\).
Remark 1.97
If f : I → ℝ is not assumed nonnegative and g : I → ℝ₊ is such that f(x) ∼ g(x) when \(x \to b^-\), then f is actually nonnegative in the vicinity of b. Indeed, taking ε = 1/2 in the definition yields that \(0 \le \frac{1}{2} g(x) \le f(x)\) for x close enough to b.
Proof. (i) By definition, there exist M > 0 and c ∈ I such that f(x) ≤ M g(x) for all x ∈ [c, b). We conclude by applying Proposition 1.94 to the nonnegative functions \(f|_{[c,b)}\) and \(Mg|_{[c,b)}\).
(ii) Taking for instance ε = 1/2 in the definition of f(x) ∼ g(x), there exists d ∈ I such that, for all x ∈ [d, b),
\[
\tfrac{1}{2}\, g(x) \le f(x) \le \tfrac{3}{2}\, g(x) ,
\]
so that both \(f(x) = O(g(x))\) and \(g(x) = O(f(x))\) when \(x \to b^-\). By (i), f is integrable at b if and only if g is integrable at b.
Now, for any fixed ε > 0, there exists \(d_\varepsilon \in I\) such that, for all \(y \in [d_\varepsilon, b)\),
\[
(1-\varepsilon)\, g(y) \le f(y) \le (1+\varepsilon)\, g(y) .
\]
(a) If g is integrable at b, then both f and g are integrable on \([d_\varepsilon, b)\), so that, by Proposition 1.88, for every \(x \in [d_\varepsilon, b)\),
\[
(1-\varepsilon) \int_x^b g(y)\,dy \le \int_x^b f(y)\,dy \le (1+\varepsilon) \int_x^b g(y)\,dy .
\]
This exactly means that \(\int_x^b f(y)\,dy \sim \int_x^b g(y)\,dy\) when \(x \to b^-\).
(b) If g is not integrable at b, by positivity on \([d_\varepsilon, x]\) for every \(x \in [d_\varepsilon, b)\),
\[
(1-\varepsilon) \int_{d_\varepsilon}^x g(y)\,dy \le \int_{d_\varepsilon}^x f(y)\,dy \le (1+\varepsilon) \int_{d_\varepsilon}^x g(y)\,dy .
\]
By Chasles's identity, the latter inequality becomes
\[
\int_c^x f(y)\,dy - \int_c^{d_\varepsilon} f(y)\,dy \le (1+\varepsilon)\Big( \int_c^x g(y)\,dy - \int_c^{d_\varepsilon} g(y)\,dy \Big)
\]
and then
\[
\int_c^x f(y)\,dy \le (1+\varepsilon) \int_c^x g(y)\,dy + \int_c^{d_\varepsilon} \big( f(y) - (1+\varepsilon) g(y) \big)\,dy .
\]
As g is not integrable at b, the quantity \(\int_c^x g(y)\,dy \to \infty\) as x → b, so that, for x close enough to b, the constant \(\int_c^{d_\varepsilon} \big( f(y) - (1+\varepsilon) g(y) \big)\,dy\) is smaller than \(\varepsilon \int_c^x g(y)\,dy\) and thus
\[
\int_c^x f(y)\,dy \le (1+2\varepsilon) \int_c^x g(y)\,dy .
\]
The same argument shows that
\[
\int_c^x f(y)\,dy \ge (1-\varepsilon) \int_c^x g(y)\,dy + \int_c^{d_\varepsilon} \big( f(y) - (1-\varepsilon) g(y) \big)\,dy ,
\]
and, for x close enough to b,
\[
\int_c^x f(y)\,dy \ge (1-2\varepsilon) \int_c^x g(y)\,dy .
\]
The result follows.
Example 1.98
The integral \(\int_1^{+\infty} \frac{x^5 + 2x^4 - 5}{x^3 + 5x}\, e^{-x}\,dx\) converges because
\[
\frac{x^5 + 2x^4 - 5}{x^3 + 5x}\, e^{-x} \sim x^2 e^{-x}
\]
when x → ∞ and \(x \in [1,+\infty) \mapsto x^2 e^{-x}\) is integrable, as we saw in Example 1.95.
We end this section with two families of integrals that are quite natural candidates for comparison arguments. The easiest ones are power functions.
Proposition 1.99 (Riemann integrals)
Let α ∈ ℝ. The following holds.
(i) \(\int_1^{+\infty} \frac{1}{x^\alpha}\,dx < \infty\) if and only if α > 1.
(ii) \(\int_0^1 \frac{1}{x^\alpha}\,dx < \infty\) if and only if α < 1.
Proof. For α ≠ 1, we have
\[
F_\alpha(x) := \int_1^x \frac{1}{t^\alpha}\,dt = \frac{x^{-\alpha+1} - 1}{-\alpha+1} .
\]
Then, as x → +∞, \(F_\alpha(x)\) admits a finite limit if and only if −α + 1 < 0, that is, α > 1. And, as x → 0, \(F_\alpha(x)\) admits a finite limit if and only if −α + 1 > 0, that is, α < 1. Finally, if α = 1,
\[
F_1(x) := \int_1^x \frac{1}{t}\,dt = \ln(x) ,
\]
so that \(F_1(x) \to +\infty\) as x → +∞ and \(-F_1(x) \to +\infty\) as x → 0. (The minus sign comes from the fact that \(-F_1(x) = \int_x^1 t^{-1}\,dt\).)
It is sometimes useful to be a bit more precise. At +∞, the previous proposition says that \(x \mapsto \frac{1}{x}\) is not integrable but \(x \mapsto \frac{1}{x^{1+\varepsilon}}\) is integrable for any ε > 0. In order to decide for a function f such that
\[
\frac{1}{x^{1+\varepsilon}} \le f(x) \le \frac{1}{x} \quad \text{for all } \varepsilon > 0,
\]
one needs more sophisticated comparison functions.
Proposition 1.100 (Bertrand integrals)
The following holds:
\[
\int_e^{+\infty} \frac{dx}{x (\ln(x))^\beta} < \infty \quad\text{if and only if}\quad \beta > 1 .
\]
Proof. Integrating by substitution with y = ln(x), so that dy = dx/x, we see that
\[
\int_e^z \frac{dx}{x(\ln(x))^\beta} = \int_1^{\ln(z)} \frac{dy}{y^\beta}
\]
admits a finite limit as z → ∞ if and only if β > 1, by the previous proposition.
If one needs further refinement, the method can be iterated at will, substituting y = ln(x), then x = ln(t), then t = ln(u), and so on:
\[
\int_1^z \frac{dy}{y^\beta}
= \int_e^{e^z} \frac{dx}{x(\ln(x))^\beta}
= \int_{e^e}^{e^{e^z}} \frac{dt}{t \ln(t) (\ln(\ln(t)))^\beta}
= \int_{e^{e^e}}^{e^{e^{e^z}}} \frac{du}{u \ln(u) \ln(\ln(u)) (\ln(\ln(\ln(u))))^\beta}
= \dots ,
\]
always concluding from the case of Riemann integrals that it converges to a finite limit if and only if β > 1. It is a good idea to learn Proposition 1.99 by heart; for Bertrand integrals and the generalization, it is better to remember the method, that is, to use the substitution y = ln(x).
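The substitution can also be verified numerically. The sketch below (Python, with a hand-rolled trapezoidal rule; the choices β = 2 and cutoff z = 10⁴ are arbitrary) computes both sides of the change of variables:

```python
import math

def trapz(f, a, b, n=200_000):
    # Composite trapezoidal rule on [a, b].
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

beta, z = 2.0, 1e4
lhs = trapz(lambda x: 1 / (x * math.log(x) ** beta), math.e, z)
rhs = trapz(lambda y: y ** -beta, 1.0, math.log(z))
print(lhs, rhs)  # both approximate 1 - 1/ln(z)
```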
Example 1.101
Let us see whether
\[
\int_2^{+\infty} \sqrt{x^2 + 3x}\; \ln\Big( \cos\Big(\frac{1}{x}\Big) \Big)\, \sin^2\Big( \frac{1}{\ln(x)} \Big)\,dx
\]
converges. The problem is at +∞. When x → ∞,
\[
\sqrt{x^2 + 3x} = x\sqrt{1 + \frac{3}{x}} \sim x , \qquad
\ln\Big( \cos\Big(\frac{1}{x}\Big) \Big) = \ln\Big( 1 - \frac{1}{2x^2} + o\Big(\frac{1}{x^2}\Big) \Big) \sim -\frac{1}{2x^2} , \qquad
\sin^2\Big( \frac{1}{\ln(x)} \Big) \sim \Big( \frac{1}{\ln(x)} \Big)^2 ,
\]
so that the integrand is equivalent to
\[
-\frac{1}{2 x (\ln(x))^2} .
\]
We are thus dealing with functions that are negative in the vicinity of +∞. By Theorem 1.96 and Proposition 1.100, the integral under study converges.
Exercise 1.102 solution page 140
Let α > 0. Do the following integrals converge?
(i) \(\int_{-\infty}^{\pi} \alpha^t\,dt\)  (ii) \(\int_{\pi^2}^{+\infty} \big( t^{-\alpha} - \sin(t^{-\alpha}) \big)\,dt\)  (iii) \(\int_1^{+\infty} \big( 1 - \sqrt[3]{1 + t^{-\alpha}}\, \big)\,dt\)
1.4.3 Oscillating functions
We now concentrate on functions that change sign infinitely many times near the extremity under investigation.
[Figure: an oscillating function near an extremity, on intervals (a, +∞) and (a, b).]
In contrast with nonnegative functions, for such a function f : I → ℝ that is integrable on every segment included in I, the function
\[
x \mapsto \int_c^x f(t)\,dt
\]
may have no limit (finite or infinite) as x tends to the extremities of I. The most favorable case is when the absolute value of the function is integrable.
Definition 1.103 (absolutely integrable function)
• Let I be an interval and f : I → ℝ be a function that is integrable on every segment included in I. The function f is called absolutely integrable if the function |f| : x ∈ I ↦ |f(x)| is integrable.
• If f : I → ℝ is absolutely integrable, then the integral \(\int_I f\) is said to be absolutely convergent.
Theorem 1.104 (absolute integrability)
Let f : I → ℝ be an absolutely integrable function. Then f is integrable and
\[
\left| \int_I f(x)\,dx \right| \le \int_I |f(x)|\,dx .
\]
Proof. This is a consequence of Cauchy's convergence criterion (Proposition 1.89) applied to |f| then to f. Let a, b ∈ \(\overline{\mathbb{R}}\) denote the left and right extremities of I. As |f| is integrable at b, for any fixed ε > 0, there exists d ∈ I such that
\[
d < u, v < b \implies \left| \int_u^v |f(x)|\,dx \right| < \varepsilon .
\]
For d < u, v < b, by Proposition 1.27.(iv),
\[
\left| \int_u^v f(x)\,dx \right| \le \left| \int_u^v |f(x)|\,dx \right| < \varepsilon ,
\]
so that f is also integrable at b. Furthermore, for a fixed c ∈ I,
\[
\left| \int_c^b f(x)\,dx \right| = \lim_{y\to b^-} \left| \int_c^y f(x)\,dx \right| \le \lim_{y\to b^-} \int_c^y |f(x)|\,dx = \int_c^b |f(x)|\,dx .
\]
One proves similarly that f is integrable at a and satisfies \(\big| \int_a^c f(x)\,dx \big| \le \int_a^c |f(x)|\,dx\), so that f is integrable and
\[
\left| \int_I f(x)\,dx \right|
\le \left| \int_a^c f(x)\,dx \right| + \left| \int_c^b f(x)\,dx \right|
\le \int_a^c |f(x)|\,dx + \int_c^b |f(x)|\,dx
= \int_I |f(x)|\,dx .
\]
Exercise 1.105 solution page 140
Show that \(t \in [1,+\infty) \mapsto \frac{\sin(t)}{t^2}\) is absolutely integrable.
Warning 1.106
In contrast with integrals on segments (Proposition 1.27.(iv)), the converse of Theorem 1.104 is false.
Example 1.107 (Dirichlet integral)
The integral \(\int_1^{+\infty} \frac{\sin(t)}{t}\,dt\) provides a counter-example.
• It is convergent. Indeed, integrating by parts,
\[
\int_1^x \frac{\sin(t)}{t}\,dt
= \Big[ \frac{-\cos(t)}{t} \Big]_1^x - \int_1^x \frac{\cos(t)}{t^2}\,dt
= \cos(1) - \frac{\cos(x)}{x} - \int_1^x \frac{\cos(t)}{t^2}\,dt .
\]
As x → ∞, the second term tends to 0 because \(\big| \frac{\cos(x)}{x} \big| \le \frac{1}{x} \to 0\), and the third term converges because \(t \in [1,+\infty) \mapsto \frac{\cos(t)}{t^2}\) is absolutely integrable (see Exercise 1.105), hence integrable.
• It is not absolutely convergent. This comes from the fact that t ↦ |sin(t)| stays bounded from below on a constant fraction of the time and t ↦ t⁻¹ is not integrable. More precisely, for any fixed 0 < θ < π/2,
\[
\int_\pi^{N\pi} \Big| \frac{\sin(t)}{t} \Big|\,dt
= \sum_{k=2}^N \int_{(k-1)\pi}^{k\pi} \frac{|\sin(t)|}{t}\,dt
\ge \sum_{k=2}^N \int_{(k-1)\pi+\theta}^{k\pi-\theta} \frac{|\sin(t)|}{t}\,dt
\ge \sum_{k=2}^N \int_{(k-1)\pi+\theta}^{k\pi-\theta} \frac{\sin(\theta)}{k\pi}\,dt
= \frac{\sin(\theta)(\pi - 2\theta)}{\pi} \sum_{k=2}^N \frac{1}{k} \xrightarrow[N\to\infty]{} \infty .
\]
Instead of using θ, one might also directly bound from below:
\[
\int_{(k-1)\pi}^{k\pi} \frac{|\sin(t)|}{t}\,dt \ge \frac{1}{k\pi} \int_{(k-1)\pi}^{k\pi} |\sin(t)|\,dt = \frac{2}{k\pi} .
\]
Another classical yet more obscure proof consists in noticing that
\[
\Big| \frac{\sin(t)}{t} \Big| \ge \frac{\sin^2(t)}{t} = \frac{1 - \cos(2t)}{2t} ,
\]
and concluding, similarly as above by an integration by parts, that \(t \in [1,+\infty) \mapsto \frac{\cos(2t)}{2t}\) is integrable. The remaining term \(t \in [1,+\infty) \mapsto \frac{1}{2t}\) being not integrable, the sum cannot be integrable.
The example above illustrates the typical situation where the lack of decay at infinity (the fact that t ↦ t⁻¹ is not integrable) is compensated by the oscillations of the function (the t ↦ sin(t) part). The following theorem gives a general criterion allowing one to handle this phenomenon.
Theorem 1.108 (Abel's criterion)
Let I be an interval with right extremity b ∈ \(\overline{\mathbb{R}}\), and let f ∈ C(I), g ∈ C¹(I). We suppose that there exists c ∈ I such that
• g is nonincreasing on [c, b) and g(x) → 0 as \(x \to b^-\);
• the function \(x \in [c, b) \mapsto \int_c^x f(t)\,dt\) is bounded.
Then \(\int_c^b f(x)\,g(x)\,dx\) converges.
Example 1.109 (Dirichlet integral)
Applying the theorem with f : x ∈ [1,+∞) ↦ sin(x) and g : x ∈ [1,+∞) ↦ 1/x, we recover that \(\int_1^{+\infty} \frac{\sin(x)}{x}\,dx\) converges.
Proof. It is as in Example 1.107. Let \(F : x \in [c, b) \mapsto \int_c^x f(t)\,dt\) and M > 0 be such that |F| ≤ M. Integrating by parts,
\[
\int_c^x f(t)\,g(t)\,dt
= \big[ F(t)\,g(t) \big]_c^x - \int_c^x F(t)\,g'(t)\,dt
= \underbrace{F(x)g(x)}_{\to 0} - F(c)g(c) - \int_c^x F(t)g'(t)\,dt .
\]
As F is bounded and g(x) → 0 as x → b, we have F(x)g(x) → 0. It remains to see that \(t \in [c, b) \mapsto F(t)g'(t)\) is integrable. In fact, it is absolutely integrable: as g′ is nonpositive on [c, b),
\[
\int_c^x \big| F(t)\,g'(t) \big|\,dt \le M \int_c^x \big( -g'(t) \big)\,dt = M \big( g(c) - g(x) \big) \xrightarrow[x\to b^-]{} M\,g(c) .
\]
This proves that the left-hand side integral is bounded and thus admits a finite limit as \(x \to b^-\).
Exercise 1.110 solution page 140
Let α > 0 be a real number and n ∈ ℕ. Study the convergence and absolute convergence of
\[
\int_1^{+\infty} \frac{\sin^n(t)}{t^\alpha}\,dt .
\]
Exercise 1.111 solution page 141
Study the convergence of \(\int_0^1 \frac{\sin(1/t)}{t}\,dt\).
Exercise 1.112 solution page 141
Compute \(\int_0^{\pi/2} \ln\big(\sin(t)\big)\,dt\) and \(\int_0^{\pi/2} \ln\big(\cos(t)\big)\,dt\).
Hint: find relations between these integrals and use their sum.
1.4.4 Comparison of series with integrals
We come back to the link between sums and integrals. The point is to compare sums of the form \(\sum_{n=k}^{\infty} f(n)\) with integrals of the form \(\int_a^{+\infty} f(x)\,dx\). The idea is to use Chasles's identity and write, for any integer n ≥ ⌈a⌉ + 1,
\[
\int_a^n f(x)\,dx = \int_a^{\lceil a \rceil} f(x)\,dx + \sum_{k=\lceil a \rceil}^{n-1} \int_k^{k+1} f(x)\,dx
\]
and then use monotonicity in order to bound the integrand f(x) on [a, ⌈a⌉] and on each interval [k, k+1]. For instance, a convenient setup is to consider nonincreasing functions.
Proposition 1.113 (comparison of series with integrals)
Let p be an integer and f : [p,+∞) → ℝ be a nonincreasing function that is integrable on every segment included in [p,+∞). Then, for all n ≥ p + 1,
\[
\sum_{k=p+1}^n f(k) \le \int_p^n f(x)\,dx \le \sum_{k=p}^{n-1} f(k) .
\]
Consequently, the following inequalities hold in \(\overline{\mathbb{R}}\):
\[
\sum_{k=p+1}^{+\infty} f(k) \le \int_p^{+\infty} f(x)\,dx \le \sum_{k=p}^{+\infty} f(k) .
\]
In particular, the series with terms (f(k))ₖ converges if and only if f is integrable.
Remark 1.114
The second statement is only useful under the assumption that f(x) → 0 as x → ∞. Indeed, as f is nonincreasing, it converges to some ℓ ∈ \(\overline{\mathbb{R}}\). If ℓ > 0, the three quantities appearing in the second display of inequalities are clearly equal to +∞, and, if ℓ < 0, they are clearly equal to −∞.
Proof. By Chasles's identity,
\[
\int_p^n f(x)\,dx = \sum_{k=p}^{n-1} \int_k^{k+1} f(x)\,dx .
\]
It then remains to bound the integrand of the integrals in the sum using the monotonicity of f. We obtain
\[
f(k+1) = \int_k^{k+1} f(k+1)\,dx \le \int_k^{k+1} f(x)\,dx \le \int_k^{k+1} f(k)\,dx = f(k)
\]
and the result follows by summation. As f is nonincreasing, it is either nonnegative or eventually negative. Consequently, the three quantities in the first display of inequalities are either nondecreasing in n or eventually decreasing in n, and in any case admit limits in \(\overline{\mathbb{R}}\). The second display is obtained by taking the limit as n → ∞. Finally, by the sequential criterion, f is integrable if and only if \(\int_p^{+\infty} f(x)\,dx < \infty\).
Example 1.115 (Riemann zeta function)
Let α ∈ ℝ. The series \(\sum_{k\ge 1} \frac{1}{k^\alpha}\) converges if and only if α > 1.
The case α ≤ 0 is obvious. For α > 0, the function \(x \in \mathbb{R}_+^\star \mapsto x^{-\alpha}\) is continuous, thus integrable on every segment, nonnegative and nonincreasing. We know from Proposition 1.99 that it is integrable if and only if α > 1.
In spite of the analogy between series and integrals explained above, beware that there are important differences between these two notions. For instance, recall that
\[
\sum_{k=1}^{\infty} u_k < \infty \implies \lim_{k\to\infty} u_k = 0 .
\]
(Recall also that the converse is not true: take for instance \(u_k = k^{-\alpha}\) with 0 < α < 1.) In contrast,
\[
\int_1^{\infty} f(x)\,dx < \infty \;\not\Longrightarrow\; \lim_{x\to\infty} f(x) = 0 ,
\]
even under the extra assumption that f is nonnegative. In fact, the integrability of f at +∞ does not even imply that f is bounded. Let us construct an example of a nonnegative continuous unbounded function f that is integrable at +∞. We let f be equal to 0 except around integer values, where it has a triangular shape: for each k ∈ ℕ, its graph follows the sides of the isosceles triangle with basis \([k - 2^{-2k}, k + 2^{-2k}]\) and height \(2^k\).
[Figure: graph of f, vanishing everywhere except for triangular spikes of height 2^k around each integer k.]
The area of the k-th triangle is
\[
\int_{k-2^{-2k}}^{k+2^{-2k}} f(x)\,dx = 2^{-k} ,
\]
so that
\[
\int_0^{+\infty} f(x)\,dx = \sum_{k=1}^{+\infty} 2^{-k} = 1 < \infty .
\]
Exercise 1.116 solution page 142
Tweak the above example into a positive continuous integrable function \(\mathbb{R} \to \mathbb{R}_+^\star\), unbounded both at −∞ and +∞.
The above example is ad hoc and may look "unnatural" in the sense that you could expect never to encounter such an example again. The exercise below gives a more natural-looking example.
Exercise 1.117 (Fresnel integral) solution page 142
Show that the function \(x \in \mathbb{R}_+ \mapsto \sin(x^2)\) is integrable.
2 Plane parametric curves
This chapter is devoted to the study of plane parametric curves. We will see their fundamental properties and how to sketch a plane curve. We will cover parametric curves in Cartesian coordinates and in polar coordinates.
Here is an incentive to read this chapter:
If you want to learn more about parametric curves, you may consult
• [Ste16, 10];
• [LM07, 30 (I–III)], in French.
2.1 Introduction
2.1.1 Motivation
2.1.2 Preliminaries
2.2 First definitions
2.3 Tangents
2.3.1 Definition
2.3.2 Link with derivatives
2.3.3 Local behavior
2.4 Sketching
2.4.1 Interval of study
2.4.2 Asymptotes
2.4.3 Sketching plan
2.5 Polar curves
2.5.1 Polar coordinates
2.5.2 Polar curves
2.5.3 What is the difference with a usual graph?
2.5.4 Tangents
2.5.5 Extremities of the interval of study
2.5.6 Sketching
2.1 Introduction
2.1.1 Motivation
So far, you've learned a lot about curves associated with equations of the form y = f(x) for some function f, and you can also easily deal with equations of the form x = f(y) simply by exchanging the
roles of x and y (even if it can sometimes be confusing in practice). But not all curves can easily be described in such a way. . . Take for instance the radius-1 circle centered at the origin:
\[
\mathscr{C} = \big\{ (x, y) : x^2 + y^2 = 1 \big\} .
\]

[Figure: the unit circle C.]
Recall that the graph of a function f : I → ℝ is the set \(\{ (x, f(x)) : x \in I \}\). In particular, it cannot contain two different points with the same abscissa (x, y₁) ≠ (x, y₂), as this would imply that f(x) = y₁ and f(x) = y₂. In other words, each vertical line can only intersect the graph at most once. From this, we see that it is not possible to find a function whose graph is the circle \(\mathscr{C}\) (even if we rotate the axes). Sure, we can write \(\mathscr{C}\) as the union of the two disjoint graphs of the functions
\[
x \in [-1, 1) \mapsto \sqrt{1 - x^2} \quad\text{and}\quad x \in (-1, 1] \mapsto -\sqrt{1 - x^2} \tag{2.1}
\]
but this is not really satisfactory. In particular, it singles out the two points (−1, 0) and (1, 0), which are in no way different from any other point of the circle.
Instead, it can prove convenient to introduce an extra variable t, called a parameter, and express x and y as functions of t. The motion of a point evolving with time is sometimes naturally given in such a form, in particular if this motion is driven by physics equations and laws.
Remark 2.1
The parameter is often denoted by t as it can be thought of as time. Of course, it can sometimes be denoted otherwise: θ for an angle, r, etc. In order to make the presentation as clear as possible, we will reserve the terminology point for points of the plane and use the terminology time when referring to a value the parameter can take.
In the case of the circle \(\mathscr{C}\), taking for the parameter the angle to the x-axis gives
\[
\begin{cases} x = \cos(t) \\ y = \sin(t) \end{cases} \tag{2.2}
\]
and we need to specify the values of t we consider. In this case, we can choose for instance t ∈ ℝ (which means that the circle is actually parameterized an infinite number of times). The description (2.2) of a curve encapsulates more information than (2.1), as the position of the particle is known at every time t. Note that (2.2) is not the only way to parameterize the circle: for instance, we could also use
\[
\begin{cases} x = \cos(-3t) \\ y = \sin(-3t) \end{cases} \qquad t \in \mathbb{R} ,
\]
which gives the same circle \(\mathscr{C}\).
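Both parameterizations can be checked to trace the same circle (Python; the sample grid of 1000 times is an arbitrary choice):

```python
import math

def on_circle(x, y, tol=1e-12):
    # A point (x, y) lies on C exactly when x^2 + y^2 = 1.
    return abs(x * x + y * y - 1) < tol

ts = [2 * math.pi * i / 1000 for i in range(1000)]
ok_slow = all(on_circle(math.cos(t), math.sin(t)) for t in ts)
ok_fast = all(on_circle(math.cos(-3 * t), math.sin(-3 * t)) for t in ts)
print(ok_slow, ok_fast)
```

The second parameterization runs through the same point set three times as fast and in the opposite direction, which is exactly the extra information a parameterization carries over the bare curve.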
2.1.2 Preliminaries
Let E2 denote the Euclidean plane and (O;~ı,~) denote an orthonormal basis of E2. We identify E2
with R2, that is, identify the vector x~ı + y~ ∈ E2 with the point (x, y) ∈ R2. In some practical cases,we can also identify R2 with the complex plane C, that is, identify the point (x, y) with the complexnumber x+ iy. We denote by ‖ · ‖ the usual Euclidean norm defined by
∀(x, y) ∈ R2,∥∥(x, y)
∥∥ :=
√
x2 + y2 .
In this context, the usual Euclidean distance between the two vectors ~a and ~b ∈ E2 is equal to ‖~a −~b‖.From now on, we consider a vector function
~v : I → E2
t 7→ ~v(t)
defined on an interval I ⊆ R. Furthermore, for each t ∈ I , we denote by x(t) and y(t) the coordinatesof the vector ~v(t). In other words, we write
~v(t) = x(t)~ı + y(t)~ .
Definition 2.2
Let t₀ be an adherent timeᵃ of I. We say that the vector function \(\vec{v}\) has limit \(\vec{a} \in E_2\) when t → t₀ if \(\|\vec{v}(t) - \vec{a}\| \to 0\) as t → t₀. If so, we write \(\vec{v}(t) \to \vec{a}\) as t → t₀ and also say that \(\vec{v}(t)\) tends to \(\vec{a}\) as t → t₀.
ᵃ An adherent time of an interval is either a time in the interval or an extremity of the interval. For instance, a is an adherent time of (a, b).
Proposition 2.3
Let \(\vec{a} = x_0\vec{\imath} + y_0\vec{\jmath}\). Then \(\vec{v}(t) \to \vec{a}\) as t → t₀ if and only if x(t) → x₀ and y(t) → y₀ as t → t₀.
Proof. As \(\|\vec{v}(t) - \vec{a}\| = \sqrt{(x(t)-x_0)^2 + (y(t)-y_0)^2}\), we see that if x(t) → x₀ and y(t) → y₀ as t → t₀, then \(\|\vec{v}(t) - \vec{a}\| \to 0\) as t → t₀. Conversely, \(|x(t) - x_0| \le \|\vec{v}(t) - \vec{a}\|\) and \(|y(t) - y_0| \le \|\vec{v}(t) - \vec{a}\|\), so that, if \(\vec{v}(t) \to \vec{a}\) as t → t₀, then both x(t) → x₀ and y(t) → y₀ as t → t₀.
Definition 2.4
Let \(\vec{v} : I \to E_2\) be a vector function and t₀ ∈ I.
• The vector function \(\vec{v}\) is continuous at t₀ if \(\vec{v}(t) \to \vec{v}(t_0)\) as t → t₀.
• The vector function \(\vec{v}\) is differentiable at t₀ if
\[
\frac{\vec{v}(t) - \vec{v}(t_0)}{t - t_0}
\]
has a limit as t → t₀. If so, we denote this limit by \(\vec{v}\,'(t_0)\) or \(\frac{d\vec{v}}{dt}(t_0)\).
Proposition 2.5
The following holds.
(i) The vector function \(\vec{v}\) is continuous if and only if its coordinates are continuous.
(ii) The vector function \(\vec{v}\) is differentiable if and only if its coordinates are differentiable. In this case, the coordinates of \(\vec{v}\,'\) are the derivatives of the coordinates of \(\vec{v}\).
Proof. The first statement (i) is an immediate consequence of Proposition 2.3. Let t₀ ∈ I. We have
\[
\frac{\vec{v}(t) - \vec{v}(t_0)}{t - t_0} = \frac{x(t) - x(t_0)}{t - t_0}\,\vec{\imath} + \frac{y(t) - y(t_0)}{t - t_0}\,\vec{\jmath} ,
\]
so that the second statement follows directly from Proposition 2.3.
We may subsequently define, when they exist, \(\vec{v}\,''\), \(\vec{v}^{(3)}\), . . . Furthermore, \(\vec{v}\) is k times differentiable if and only if all its coordinates are; \(\vec{v}\) is of class¹ \(C^k\) (for any given k ≥ 0 or for k = ∞) if and only if all its coordinates are.
Proposition 2.6
Let \(\vec{v}_1 : I \to E_2\) and \(\vec{v}_2 : I \to E_2\) be two differentiable vector functions on I.
(i) For any λ, μ ∈ ℝ, the vector function \(\lambda\vec{v}_1 + \mu\vec{v}_2\) is differentiable and \((\lambda\vec{v}_1 + \mu\vec{v}_2)' = \lambda\vec{v}_1' + \mu\vec{v}_2'\).
(ii) The real-valued functionᵃ \(\vec{v}_1 \cdot \vec{v}_2\) is differentiable and \((\vec{v}_1 \cdot \vec{v}_2)' = \vec{v}_1' \cdot \vec{v}_2 + \vec{v}_1 \cdot \vec{v}_2'\).
(iii) The real-valued functionᵇ \(\det(\vec{v}_1, \vec{v}_2)\) is differentiable and \((\det(\vec{v}_1, \vec{v}_2))' = \det(\vec{v}_1', \vec{v}_2) + \det(\vec{v}_1, \vec{v}_2')\).
(iv) If \(\vec{v}_1\) never vanishes, then the real-valued function \(\|\vec{v}_1\|\) is differentiable and \((\|\vec{v}_1\|)' = \frac{\vec{v}_1' \cdot \vec{v}_1}{\|\vec{v}_1\|}\).
ᵃ We denote by · the scalar product.  ᵇ We denote by det the determinant.
Proof. Let us write \(\vec{v}_1 = x_1\vec{\imath} + y_1\vec{\jmath}\) and \(\vec{v}_2 = x_2\vec{\imath} + y_2\vec{\jmath}\). Then we have the following.
(i) \(\lambda\vec{v}_1 + \mu\vec{v}_2 = (\lambda x_1 + \mu x_2)\vec{\imath} + (\lambda y_1 + \mu y_2)\vec{\jmath}\). The result easily follows.
¹ Recall that, for k ≥ 0, a function is of class \(C^k\) if it is k times differentiable and its k-th derivative is continuous. A function is smooth, or of class \(C^\infty\), if it has derivatives of all orders.
(ii) ~v1 · ~v2 = x1x2 + y1y2, so that
(~v1 · ~v2
)′= x′
1x2 + x1x′2 + y′
1y2 + y1y′2
= x′1x2 + y′
1y2 + x1x′2 + y1y
′2
= ~v′1 · ~v2 + ~v1 · ~v′
2 .
(iii) det(~v1, ~v2) = x1y2 − y1x2, so that

    (det(~v1, ~v2))′ = x′1y2 + x1y′2 − y′1x2 − y1x′2
                    = (x′1y2 − y′1x2) + (x1y′2 − y1x′2)
                    = det(~v′1, ~v2) + det(~v1, ~v′2) .
(iv) ‖~v1‖ = (x1² + y1²)^(1/2), so that

    (‖~v1‖)′ = (1/2)(2x′1x1 + 2y′1y1)(x1² + y1²)^(−1/2) = (~v′1 · ~v1) / ‖~v1‖ .
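The differentiation rules of Proposition 2.6 can be checked symbolically on a concrete pair of vector functions. Below is a small sketch using the sympy library (not part of the original notes; the two vector functions are chosen arbitrarily for illustration, and ~v1 is picked so that it never vanishes):

```python
import sympy as sp

t = sp.symbols('t', real=True)
# hypothetical vector functions: ~v1 = (2 + cos t, sin t), ~v2 = (t^2, e^t)
x1, y1 = 2 + sp.cos(t), sp.sin(t)
x2, y2 = t**2, sp.exp(t)
x1d, y1d, x2d, y2d = (sp.diff(f, t) for f in (x1, y1, x2, y2))

# (ii): (~v1 . ~v2)' - (~v1' . ~v2 + ~v1 . ~v2') should vanish
dot_rule = sp.simplify(sp.diff(x1*x2 + y1*y2, t)
                       - (x1d*x2 + y1d*y2 + x1*x2d + y1*y2d))

# (iii): (det(~v1, ~v2))' - (det(~v1', ~v2) + det(~v1, ~v2')) should vanish
det_rule = sp.simplify(sp.diff(x1*y2 - y1*x2, t)
                       - ((x1d*y2 - y1d*x2) + (x1*y2d - y1*x2d)))

# (iv): (||~v1||)' - (~v1' . ~v1)/||~v1|| should vanish
# (here ||~v1||^2 = 5 + 4 cos t > 0, so ~v1 never vanishes)
norm = sp.sqrt(x1**2 + y1**2)
norm_rule = sp.simplify(sp.diff(norm, t) - (x1d*x1 + y1d*y1)/norm)
```

Each of the three residuals simplifies to 0, as the proposition predicts.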
Warning 2.7
In (iv), do not forget to check that ~v1 does not vanish. For instance, t 7→ ‖t~ı + t~‖ = √2 |t| is not differentiable at 0.
Truncated expansion
The local behavior of a function is often dictated by its truncated expansion. If the vector function ~v : I → E2 is of class Ck, then all its coordinates are of class Ck and may thus be expanded. As we deal with vector functions, we extend the notation o as follows. For a vector function ~f and a real function g defined in a neighborhood of t0, we write ~f(t) = o(g(t)) when t → t0 if ‖~f(t)‖ = o(g(t)) when t → t0. Equivalently, ~f(t) = o(g(t)) if and only if each coordinate of ~f is negligible with respect to g (same proof as Proposition 2.3).
Proposition 2.8 h truncated expansion g
Let us suppose that ~v : I → E2 is k times differentiable at a time t0 ∈ I. Then ~v admits the following expansion:

    ~v(t) = ~v(t0) + (t − t0) ~v′(t0) + ((t − t0)²/2) ~v′′(t0) + · · · + ((t − t0)^k / k!) ~v^(k)(t0) + o((t − t0)^k) .    (2.3)
Proof. By Proposition 2.5, both functions x and y are k times differentiable at t0; they thus admit the truncated expansions

    x(t) = x(t0) + (t − t0) x′(t0) + ((t − t0)²/2) x′′(t0) + · · · + ((t − t0)^k / k!) x^(k)(t0) + o((t − t0)^k) ;
    y(t) = y(t0) + (t − t0) y′(t0) + ((t − t0)²/2) y′′(t0) + · · · + ((t − t0)^k / k!) y^(k)(t0) + o((t − t0)^k) .

As, for 0 ≤ i ≤ k, ~v^(i)(t) = x^(i)(t)~ı + y^(i)(t)~, the result follows.
2.2 First definitions
We will use the following terminology. Beware that there is no clear consensus, so other references may use a slightly different terminology.
Definition 2.9 h plane parametric curve g
h A (plane) parametric curve is a pair (I, ~v) where I ⊆ R is an interval and ~v : I → E2 is a continuous vector function. Equivalently, it is a system of 2 continuous functions of another variable – which is called the parameter – defined on a common interval I ⊆ R:

    x(t)
    y(t) ,    t ∈ I .

h The image of a parametric curve (I, ~v) is the set

    {~v(t) : t ∈ I} ⊆ E2 ,

or, equivalently, the set

    {M ∈ R2 : ∃t ∈ I, −−→OM = ~v(t)} ⊆ R2 .
h A set of points Γ ⊆ R2 is a (plane) curve if there exists a parametric curve whose image is Γ.
h A parametric curve whose image is a curve Γ is called a parametric representation or parameterization (alternatively spelled parametrization) of the curve Γ.
In particular, the graph of a continuous function f : I → R is always a curve. It indeed admits the parametric representation

    x(t) = t
    y(t) = f(t) ,    t ∈ I .

The reflection of this graph across the first bisector is also a curve: it admits, for instance, the parameterization

    x(t) = f(t)
    y(t) = t ,    t ∈ I .
In this sense, parametric curves are richer than usual graphs of continuous functions.
Conversely, given a parametric curve, we may sometimes find a function describing the same curve (not always: remember that we have seen in the introduction that the unit circle is not the graph of a single function). For instance, the image of the parametric curve

    x(t) = t² − t
    y(t) = t + 2 ,    t ∈ R+

is the same as that of (using the change of variable u = t + 2)

    x(u) = u² − 5u + 6
    y(u) = u ,    u ∈ [2, +∞) ,

which is the graph of the function y ∈ [2, +∞) 7→ y² − 5y + 6 (part of a parabola with horizontal axis).
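The change of variable above is easy to verify symbolically. The following sketch (using sympy; not part of the original notes) substitutes t = u − 2 into the first parameterization:

```python
import sympy as sp

t, u = sp.symbols('t u', real=True)
x, y = t**2 - t, t + 2
# the new parameter is u = y = t + 2, i.e. t = u - 2
x_u = sp.expand(x.subs(t, u - 2))
y_u = sp.expand(y.subs(t, u - 2))
# x_u is u**2 - 5*u + 6 and y_u is u, as in the text
```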
[Figure: the image of the previous parametric curve, part of the parabola x = y² − 5y + 6.]
Example 2.10
Let us draw the image of the parametric curve t ∈ R 7→ cos(t)~ı + cos(2t)~. For any t ∈ R, we have cos(2t) = 2cos²(t) − 1, so that we can rewrite our system as

    x(t) = cos(t)
    y(t) = 2x(t)² − 1 ,    t ∈ R .

As cos(R) = [−1, 1], the image of our parametric curve is the graph of x ∈ [−1, 1] 7→ 2x² − 1.
[Figure: the graph of x ∈ [−1, 1] 7→ 2x² − 1.]
Definition 2.11
h Let (I, ~v) be a parametric curve with image Γ. The multiplicity of a point M ∈ Γ is the number of times t ∈ I such that −−→OM = ~v(t). A point with multiplicity 1 is called simple. A point with multiplicity 2 or more is called multiple: it is double if its multiplicity is 2, triple if its multiplicity is 3, quadruple if its multiplicity is 4, etc.
h A parametric curve is simple if all the points of its image are simple.
h A parametric curve is closed if there exists a ≤ b such that I = [a, b] and ~v(a) = ~v(b).
h The parametric curve (I, ~v) is differentiable if the function ~v is differentiable on I. The same definition goes for the following: n times differentiable, of class Cn, smooth.
Remark 2.12
Note that these notions strongly depend on the parametric curve, not only on the curve.
Example 2.13
1. The parametric curve t ∈ R 7→ cos(t)~ı + sin(t)~ is smooth and has image the circle C. Every point of C is multiple (of infinite multiplicity).
2. The parametric curve t ∈ [0, 2π) 7→ cos(t)~ı + sin(t)~ is smooth, simple, and has image C .
3. The parametric curve t ∈ [0, 2π] 7→ cos(t)~ı + sin(t)~ is smooth, closed, and has image C .
Be careful that, in contrast with graphs of functions, the regularity of a parametric curve cannot be read off of its image. Look for instance at the two following smooth parametric curves:

    x(t) = cos(t) − cos³(20t)
    y(t) = sin(t) − sin³(10t) ,    t ∈ R ;

    x(t) = 9 cos(10t) + 10 cos(9t)
    y(t) = 9 sin(10t) − 10 sin(9t) ,    t ∈ R .
And now look at this parametric curve, which is not differentiable at 0:

    x(t) = cos(|t|)
    y(t) = sin(|t|) ,    t ∈ R .
Proposition 2.14 h reparameterization g
Let (I, ~v) be a parametric curve and ϕ : J → I be a homeomorphism^a from some interval J ⊆ R to I. Then the parametric curve (J, ~v ◦ ϕ)

(i) has the same image as (I, ~v);

(ii) is closed if and only if (I, ~v) is closed;

(iii) moreover, the points of the common image have the same multiplicity for both parametric curves.

^a A homeomorphism is a bijective continuous function whose inverse is continuous. In the case of real intervals I and J, the function ϕ : J → I is a homeomorphism if and only if it is continuous, strictly monotonic, and surjective (that is, ϕ(J) = I).
Proof. By definition,

    ~v ◦ ϕ : J → E2 ,    u 7→ ~v(ϕ(u)) .

As ϕ is a bijection, for any vector ~a ∈ E2, the mapping

    {u ∈ J : ~v(ϕ(u)) = ~a} → {t ∈ I : ~v(t) = ~a} ,    u 7→ ϕ(u)

is also a bijection. The sets J~a := {u ∈ J : ~v(ϕ(u)) = ~a} and I~a := {t ∈ I : ~v(t) = ~a} thus have the same cardinality. The image of (J, ~v ◦ ϕ) is precisely the set of points ~a such that J~a ≠ ∅ and the image of (I, ~v) is the set of points ~a such that I~a ≠ ∅; these sets are the same one. Moreover, for a point ~a in the image, its multiplicity for (J, ~v ◦ ϕ) is Card(J~a), while its multiplicity for (I, ~v) is Card(I~a), which is equal.

Finally, if (I, ~v) is closed, then there exist a ≤ b such that I = [a, b] and ~v(a) = ~v(b). In this case, depending on whether ϕ is increasing or decreasing, J = [ϕ−1(a), ϕ−1(b)] or J = [ϕ−1(b), ϕ−1(a)] and, in both cases, ~v(a) = ~v ◦ ϕ(ϕ−1(a)) = ~v ◦ ϕ(ϕ−1(b)) = ~v(b), so that (J, ~v ◦ ϕ) is closed. Conversely, we can write (I, ~v) as (I, (~v ◦ ϕ) ◦ ϕ−1) where ϕ−1 : I → J is a homeomorphism. We can thus apply what we just did in order to conclude that if (J, ~v ◦ ϕ) is closed then (I, (~v ◦ ϕ) ◦ ϕ−1) is closed.
2.3 Tangents
From now on, we consider a parametric curve (I, ~v) and write, as above,

    ~v(t) = x(t)~ı + y(t)~ .

Moreover, for each t ∈ I, we define the point M(t) ∈ R2 (called the location at time t) as the only point such that

    −−→OM(t) = ~v(t) .
2.3.1 Definition
For a point A ∈ R2 and a nonzero vector ~u ∈ E2, we denote by L(A, ~u) the set

    L(A, ~u) := {A + λ~u : λ ∈ R} .
Definition 2.15 h line g
h A set L ⊆ R2 is called a line if there exist a point A ∈ R2 and a nonzero vector ~u ∈ E2 such that L = L(A, ~u).

h A direction vector of a line L is a nonzero vector ~u ∈ E2 such that L = L(A, ~u).

h We say that the line L passes through the point A if there exists a nonzero vector ~u ∈ E2 such that L = L(A, ~u).
It is easy to see that the direction vectors of the line L(A, ~u) are the vectors α~u for any α ∈ R⋆. Furthermore, for any two distinct points A and B, there exists a unique line passing through both A and B: it is the line L(A, −−→AB), which we alternatively simply denote by AB.
Definition 2.16 h limiting line g
h Let J ⊆ R be an interval and t0 ∈ R. We suppose that, for each t ∈ J \ {t0}, we have a line L(t). We say that the line L(t) admits a limiting position if, for each t ∈ J \ {t0}, there exist a point A(t) through which L(t) passes and a direction vector ~u(t) of L(t) such that A(t) admits a limit as t → t0 and ~u(t) admits a nonzero limit as t → t0.

h If so, the line L(limt→t0 A(t), limt→t0 ~u(t)) is called the limiting line of L(t).
As is, it is not clear a priori that a limiting line is well defined. One needs to check that its definition does not depend on the choice of A(t) and ~u(t). Let us suppose that L(t) = L(A1(t), ~u1(t)) = L(A2(t), ~u2(t)) with Ai(t) → Ai and ~ui(t) → ~ui ≠ ~0 for i ∈ {1, 2}.

1. As ~u1(t) and ~u2(t) are direction vectors of the same line, there exists α(t) ∈ R⋆ such that ~u2(t) = α(t)~u1(t). Writing ~ui(t) = xi(t)~ı + yi(t)~ and ~ui = ai~ı + bi~, one obtains x2(t) = α(t)x1(t) and y2(t) = α(t)y1(t). As ~u1 ≠ ~0, we cannot have both a1 = b1 = 0. Without loss of generality, we can assume that a1 ≠ 0. Then, for t close to t0, α(t) = x2(t)/x1(t) → a2/a1. On the one hand, ~u2(t) → ~u2; on the other hand, ~u2(t) = α(t)~u1(t) → (a2/a1)~u1; as a consequence, ~u2 = (a2/a1)~u1 and a2/a1 ≠ 0.

2. Now A2(t) ∈ L(A1(t), ~u1(t)), so there exists β(t) ∈ R such that A2(t) = A1(t) + β(t)~u1(t). As A2(t) − A1(t) → A2 − A1 and ~u1(t) → ~u1 ≠ ~0, the same reasoning as above shows that β(t) admits a limit β ∈ R. As a result, A2 = A1 + β~u1.

3. As a conclusion,

    L(A2, ~u2) = {A2 + λ~u2 : λ ∈ R} = {A1 + (β + λ a2/a1) ~u1 : λ ∈ R} = L(A1, ~u1) .
Here are a few direct consequences of Definition 2.16:
h In order to show that a line admits a limiting position, one needs to find a point and a direction vector both having a limit.
h If all the lines L (t) pass through the same point A, one can always choose A(t) = A.
h If the direction vector ~u(t) of L(t) admits a nonzero limit ~u, then ~u(t)/‖~u(t)‖ is also a direction vector of L(t) and it satisfies ~u(t)/‖~u(t)‖ → ~u/‖~u‖. One can thus restrict oneself to unit direction vectors.

h In order to show that a line does not admit a limiting position, it is sufficient to show that none of its unit direction vectors admits a limit.
Exercise 2.17 solution page 142
Let f : I → R be a class C1 function. Show that the tangent to the graph of f at the time t admits a limiting position at any t0 ∈ I.
Exercise 2.18 solution page 142
Show that the tangent to the graph of t ∈ R⋆+ 7→ sin(1/t) at the time t does not admit a limiting position as t → 0.
Let t0 ∈ I and let us set M0 := M(t0).
Definition 2.19 h tangent g
If, for t ≠ t0 and t close to t0, the line M0M(t) is well defined (that is, M(t) ≠ M0) and admits a limiting position as t → t0, then the limiting line is called the tangent to the parametric curve at t0.
[Figure: the chords M0M(t) approaching the tangent at M0.]
On sketches, it is customary to symbolize a tangent by a small double arrow, or a simple arrow if the curve does not keep going afterward.

Be careful that we speak of the tangent to a parametric curve at the real time t0 ∈ I, not at the point of the plane M0. In fact, when M0 is a simple point, there is no ambiguity and one may speak of the tangent to the curve at M0. On the other hand, if M0 is a multiple point, there might be multiple tangents to the curve at M0.
Whenever the line M0M(t) is defined, the vector ~v(t) − ~v(t0) is one of its direction vectors. As a result, any direction vector of M0M(t) has the form α(t − t0)(~v(t) − ~v(t0)) for some nonzero real number α(t − t0) ∈ R⋆. We obtain that the line M0M(t) admits a limiting position if and only if there exists a real function α defined on some (−ε, ε) \ {0} such that the vector α(t − t0)(~v(t) − ~v(t0)) admits a nonzero limit. As ~v(t) − ~v(t0) always tends to ~0 as t → t0 (because ~v is continuous), we must have α(u) → ±∞ as u → 0. It is thus natural to look at derivatives of ~v.
2.3.2 Link with derivatives
Definition 2.20 h regular, singular, biregular g
h The time t0 ∈ I is called regular (for (I, ~v)) if ~v is differentiable at t0 and ~v′(t0) ≠ ~0.

h The time t0 ∈ I is called singular (for (I, ~v)) if it is not regular, that is, ~v is not differentiable at t0 or ~v′(t0) = ~0.

h The time t0 ∈ I is called biregular (for (I, ~v)) if ~v is twice differentiable at t0 and the vectors ~v′(t0) and ~v′′(t0) are linearly independent^a (hence nonzero).

h The parametric curve (I, ~v) is called regular (resp. biregular) if all the times of I are regular (resp. biregular).

^a Recall that two vectors ~a and ~b are linearly independent if λ~a + µ~b = ~0 =⇒ λ = µ = 0. Equivalently, ~a and ~b are linearly independent if and only if they are not collinear.
Proposition 2.21 h tangent at a regular time g
If the time t0 is regular, then the parametric curve admits a tangent at t0 given by the line L(M(t0), ~v′(t0)).

Proof. As ~v is differentiable at t0, the direction vector

    (~v(t) − ~v(t0)) / (t − t0)

of the line M0M(t) tends to ~v′(t0) as t → t0. By definition, this means that the line M0M(t) admits as limiting position the line L(M(t0), ~v′(t0)) as t → t0.
We just learned that there always is a tangent at a regular time. But what about a singular time? A parametric curve may very well admit a tangent at a singular time. For instance, 0 is a singular time of t ∈ (−1, 1) 7→ t⁵~ı + t⁵~ although this parametric curve clearly admits a tangent at 0:

[Figure: the image of t ∈ (−1, 1) 7→ t⁵~ı + t⁵~, a segment of the first bisector, with its tangent at 0.]
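For this example, the integer p of Proposition 2.22 below can be found mechanically. A small sympy sketch (not part of the original notes) computes the smallest p with ~v^(p)(0) ≠ ~0:

```python
import sympy as sp

t = sp.symbols('t', real=True)
x, y = t**5, t**5   # the singular example from the text

# smallest p >= 1 such that ~v^(p)(0) != ~0
p = next(k for k in range(1, 10)
         if (sp.diff(x, t, k).subs(t, 0), sp.diff(y, t, k).subs(t, 0)) != (0, 0))
# ~v^(5)(0) = (120, 120) is collinear with ~i + ~j: the tangent at 0 is the
# first bisector, even though the time 0 is singular
```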
Proposition 2.22
We suppose that there exists a smallest p ≥ 1 such that ~v is p times differentiable at t0 ∈ I and ~v^(p)(t0) ≠ ~0. Then the parametric curve admits a tangent at t0 given by the line L(M(t0), ~v^(p)(t0)).
Proof. By Proposition 2.8,

    ~v(t) − ~v(t0) = ((t − t0)^p / p!) ~v^(p)(t0) + o((t − t0)^p) ,

so that

    (p! / (t − t0)^p) (~v(t) − ~v(t0)) → ~v^(p)(t0) ≠ ~0 ,

and this is a direction vector of M0M(t).
Proposition 2.22 gives a sufficient condition for the existence of tangents. But if there exists no smallest p as in the statement of Proposition 2.22, one can always study the limiting slope of M0M(t), that is, the limit as t → t0 of

    (y(t) − y(t0)) / (x(t) − x(t0)) .

If this quantity has a limit in R ∪ {−∞, +∞}, then there is a tangent at t0. More precisely, if it tends to some a ∈ R, then the tangent is L(M0, ~ı + a~); if it tends to ±∞, the tangent is L(M0, ~).
Example 2.23
Let us consider ~v : t ∈ R 7→ |t|~ı + |t|t~. At t = 0, this vector function is not differentiable. The slope of M0M(t) is equal to

    |t|t / |t| = t → 0 ,

so that the parametric curve admits a horizontal tangent at 0.
[Figure: the curve of Example 2.23 with its horizontal tangent at 0.]
Case of the graph of a function
Let us consider a parametric curve

    x(t) = t
    y(t) = f(t) ,    t ∈ I ,

with a C1 function f : I → R. It is differentiable and has derivative

    x′(t) = 1
    y′(t) = f′(t) ,    t ∈ I ,

which cannot be ~0, so that every time of I is regular. At each t ∈ I, the parametric curve thus admits a tangent with direction vector ~ı + f′(t)~, that is, with slope f′(t). This is coherent with what you've learned so far.
2.3.3 Local behavior
We consider an interior time t0 ∈ I and suppose the existence of two integers 1 ≤ p < q such that ~v is q times differentiable at t0 ∈ I and

(i) p is the smallest k ≥ 1 such that ~v^(k)(t0) ≠ ~0;

(ii) q is the smallest k ≥ p + 1 such that the vectors ~v^(p)(t0) and ~v^(k)(t0) are linearly independent (which means that {~v^(p)(t0), ~v^(q)(t0)} forms a basis of the whole plane R2).

Under these assumptions, Proposition 2.22 ensures the existence of a tangent at t0. Moreover, (2.3) becomes

    ~v(t) = ~v(t0) + ((t − t0)^p / p!) ~v^(p)(t0) + ((t − t0)^(p+1) / (p+1)!) ~v^(p+1)(t0) + · · · + ((t − t0)^q / q!) ~v^(q)(t0) + o((t − t0)^q) .
As, for all p + 1 ≤ i ≤ q − 1, the vector ~v^(i)(t0) is collinear with ~v^(p)(t0), there exists αi ∈ R such that ~v^(i)(t0) = αi ~v^(p)(t0). We thus get

    ~v(t) = ~v(t0) + ((t − t0)^p / p!) (1 + ε(t)) ~v^(p)(t0) + ((t − t0)^q / q!) ~v^(q)(t0) + ~η(t) ,

where ~η(t) = o((t − t0)^q) and

    ε(t) := αp+1 (t − t0)/(p + 1) + · · · + αq−1 (t − t0)^(q−p−1)/((p + 1) · · · (q − 1)) → 0 as t → t0.
As {~v^(p)(t0), ~v^(q)(t0)} forms a basis of R2, we can write ~η(t) = η1(t) ~v^(p)(t0) + η2(t) ~v^(q)(t0) with η1(t) = o((t − t0)^q) and η2(t) = o((t − t0)^q). All in all, we obtain

    ~v(t) − ~v(t0) = ((t − t0)^p / p!) (1 + ε1(t)) ~v^(p)(t0) + ((t − t0)^q / q!) (1 + ε2(t)) ~v^(q)(t0) ,    (2.4)

where ε1(t) → 0 and ε2(t) → 0 as t → t0.
In order to study the position of the parametric curve with respect to its tangent in a neighborhood of t0, we work in the frame

    (M0 ; (1/p!) ~v^(p)(t0), (1/q!) ~v^(q)(t0))

of the plane. In this frame, the tangent is the first axis and Equation (2.4) stipulates that the point M(t) has coordinates

    ( (t − t0)^p (1 + ε1(t)) , (t − t0)^q (1 + ε2(t)) ) .

As 1 + εi(t) → 1 as t → t0, we have 1 + εi(t) > 0 in a neighborhood of t0, so that the signs of the coordinates are those of (t − t0)^p and (t − t0)^q: positive for t > t0, and positive or negative for t < t0, depending on the parity of p and q. We obtain the following classification into four classes:
[Figures: the four local behaviors at M0, drawn in the frame (M0 ; (1/p!) ~v^(p)(t0), (1/q!) ~v^(q)(t0)).]

h p odd, q even: standard time (the curve touches its tangent and stays locally on the same side of it);

h p odd, q odd: inflection time (the curve crosses its tangent);

h p even, q odd: cusp of the first kind (the two branches lie on opposite sides of the tangent);

h p even, q even: cusp of the second kind (the two branches lie on the same side of the tangent).
Example 2.24
Let us look, in a neighborhood of 0, at ~v : t ∈ R 7→ t^p ~ı + t^q ~ for some values of 1 ≤ p < q. We have ~v^(p)(0) = p! ~ı, ~v^(q)(0) = q! ~, and ~v^(k)(0) = ~0 whenever k ∉ {p, q}, so that p and q correspond to the integers defined by (i) and (ii).

[Figures: the images near 0 for (p, q) = (1, 2), (1, 3), (2, 3), and (2, 4), with basis vectors ~ı and ~.]
In practice, almost all times are biregular. Recall that this means that the previous integers are well defined with p = 1 and q = 2. Such times are standard: the curve touches its tangent without crossing it (the first case above).
h Looking for cusps. At a regular time, we have p = 1, so that we cannot have a cusp. Cusps thus necessarily occur at singular times! Beware that the converse is false! As a result, if one wants to look for cusps, it is enough to study singular times.

h Looking for inflection times. At an inflection time t0, p and q are odd. As a result, the vectors ~v′(t0) and ~v′′(t0) are collinear (otherwise p = 1 and q = 2). Beware that the converse is false! If one wants to look for inflection times, it is enough to study the times t0 such that ~v′(t0) and ~v′′(t0) are collinear. This means that

    det(~v′(t0), ~v′′(t0)) = 0  ⇐⇒  (x′y′′ − x′′y′)(t0) = 0 .
If x′(t0) = 0, this is equivalent to x′(t0) = y′(t0) = 0 (t0 is singular) or x′(t0) = x′′(t0) = 0. If x′ does not vanish at t0, this is equivalent to

    (d/dt)(y′/x′)(t0) = 0 .

One thus starts by looking at the vanishing times of x′ and of (d/dt)(y′/x′). Alternatively, one may also look at the vanishing times of y′ and of (d/dt)(x′/y′) (the computations may be easier for one than for the other).

Then, for each of these times, compute p (p = 1 if and only if t0 is not singular). If p is even, stop there and conclude that it is not an inflection time; if p is odd, compute q. If q is even, conclude that it is not an inflection time; if q is odd, conclude that it is an inflection time.
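The search for inflection candidates described above amounts to solving (x′y′′ − x′′y′)(t) = 0. Here is a small sympy sketch (not part of the original notes; the curve x(t) = t, y(t) = t³ is a hypothetical example chosen for illustration):

```python
import sympy as sp

t = sp.symbols('t', real=True)
# hypothetical example curve: x(t) = t, y(t) = t^3
x, y = t, t**3

# candidate inflection times: det(~v'(t), ~v''(t)) = (x'y'' - x''y')(t) = 0
d = sp.diff(x, t)*sp.diff(y, t, 2) - sp.diff(x, t, 2)*sp.diff(y, t)
candidates = sp.solve(sp.Eq(d, 0), t)
# here d = 6t, so the only candidate is t0 = 0; since x'(0) = 1 the time is
# regular (p = 1), and ~v'''(0) = (0, 6) is independent of ~v'(0) = (1, 0),
# so q = 3 is odd: t0 = 0 is indeed an inflection time
```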
2.4 Sketching
2.4.1 Interval of study
As mathematicians don't like needless work, the first step when studying a parametric curve is to try to study it as little as possible. More precisely, when studying a parametric curve (I, ~v), it can be enough to study (J, ~v) for some interval J ⊊ I as small as possible.
Periodicity
The first step is to look at periodicity. If ~v is periodic with period T, it is enough to study it on some interval of length T. If x is Tx-periodic and y is Ty-periodic (with minimal periods), note that ~v is periodic if and only if Tx/Ty ∈ Q.
Isometries
We try to split I = J ∪ J1 ∪ · · · ∪ Jk into a finite number of subintervals in such a way that, for each 1 ≤ i ≤ k, the image ~v(Ji) corresponds to ~v(J) through a simple plane isometry (reflection, translation, rotation). In other words, for each 1 ≤ i ≤ k, there exists a bijection ϕi : Ji → J such that ~v(J) = ~v ◦ ϕi(Ji) corresponds to ~v(Ji) through a simple plane isometry. In this case, we study (J, ~v) and deduce the image of (I, ~v) thanks to the isometries.
Remark 2.25
The subintervals don’t have to be disjoint, but, in practice, they are or share an extremity.
In general, the bijections have the simple form t 7→ α ± t for some α ∈ R, but any can do (as, for instance, t ∈ (0, 1] 7→ 1/t ∈ [1, +∞)).
The isometries are usually to look for among the following:
h (x, y) 7→ (x, y) : identity;
h (x, y) 7→ (−x, y) : reflection across the vertical axis;
h (x, y) 7→ (x,−y) : reflection across the horizontal axis;
h (x, y) 7→ (−x,−y) : rotation about O of angle π ;
h (x, y) 7→ (−y, x) : rotation about O of angle π/2 ;
h (x, y) 7→ (y,−x) : rotation about O of angle −π/2 ;
h (x, y) 7→ (y, x) : reflection across the first bisector (line y = x);
74
2.4.1. Interval of study
h (x, y) 7→ (−y,−x) : reflection across the second bisector (line y = −x);
h (x, y) 7→ (x+ a, y + b) : translation of vector a~ı + b~.
Remark 2.26
Reducing the interval of study by periodicity almost falls into this framework, with bijections of the form t 7→ iT + t for i ∈ Z and the identity as isometry, but with a countable number of subintervals. . .
Sometimes, there exist more complicated isometries (they will be indicated in exercises).
Example 2.27
Let us find a good interval of study for the parametric curve ~v : t ∈ R 7→ (2 cos(t) + cos(2t))~ı + (2 sin(t) − sin(2t))~.
First, we clearly have ~v(t + 2π) = ~v(t) for all t ∈ R, so that the parametric curve is 2π-periodic: we will study it on an interval of length 2π (to be determined later).

Second, we observe that, for any t ∈ R, we have x(−t) = x(t) and y(−t) = −y(t), so that we will study the curve on [0, π] and complete its sketch by applying the isometry (x, y) 7→ (x, −y), that is, the reflection across the horizontal axis.
Finally, let z(t) := x(t) + iy(t) = 2e^{it} + e^{−2it} ∈ C. We can check (in exercises, you will get a hint) that, for any t ∈ R,

    z(t + 2π/3) = 2e^{it} e^{2iπ/3} + e^{−2it} e^{−4iπ/3} = e^{2iπ/3} (2e^{it} + e^{−2it}) = e^{2iπ/3} z(t) .

The point M(t + 2π/3) is thus the image of M(t) through the rotation about O of angle 2π/3. The curve is thus invariant under this rotation. As a result, we study it on the interval [0, 2π/3].
[Figure: the curve, studied on [0, 2π/3] (thick), with the parameter values 0, 2π/3, π, 4π/3 marked.]
Remark 2.28
On the picture, we indicated in green the value of the parameter at strategic times. The curve corresponding to the interval of study is drawn with a thicker paintbrush.
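The rotational symmetry z(t + 2π/3) = e^{2iπ/3} z(t) computed in Example 2.27 can also be checked numerically at a few sample times:

```python
import cmath

def z(t):
    # z(t) = x(t) + i y(t) = 2 e^{it} + e^{-2it}
    return 2 * cmath.exp(1j * t) + cmath.exp(-2j * t)

w = cmath.exp(2j * cmath.pi / 3)   # rotation about O of angle 2*pi/3
# z(t + 2*pi/3) should equal w * z(t) for every t
dev = max(abs(z(t + 2 * cmath.pi / 3) - w * z(t))
          for t in (0.0, 0.7, 2.1, 5.3))
```

The maximal deviation `dev` is zero up to floating-point rounding.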
2.4.2 Asymptotes
Throughout this section, we suppose that

    ‖~v(t)‖ → ∞ as t approaches an extremity t0 of I

(either t0 = a if I = (a, b] or (a, b), or t0 = b if I = [a, b) or (a, b), where a, b ∈ R ∪ {±∞}). We are interested in the behavior of the parametric curve near such a time. First, notice that the point M(t) lies on the line L(O, ~v(t)/‖~v(t)‖), where the direction vector ~v(t)/‖~v(t)‖ has norm 1. It is natural to say that M(t) tends to infinity in a given direction if this vector admits a limit; let us give a precise definition.
Definition 2.29 h asymptotic direction, asymptote g
h We say that the ray^a R(O, ~u) is the asymptotic direction of the parametric curve at t0 if

    ~v(t)/‖~v(t)‖ → ~u    as t → t0.

h A line L such that the distance from M(t) to L tends to 0 as t → t0 is called an asymptote.

^a We denote by R(A, ~u) := {A + λ~u : λ ∈ R+} the ray starting at the point A with direction vector ~u.
Here is an illustration of three asymptotes: from left to right, a vertical asymptote, a horizontal asymptote, and a slant asymptote.

[Figures: three curves approaching, respectively, a vertical, a horizontal, and a slant asymptote.]
Proposition 2.30
The line with equation ax + by + c = 0 is an asymptote if and only if ax(t) + by(t) + c → 0 as t → t0.

Proof. The distance from M(t) to the line is

    |ax(t) + by(t) + c| / √(a² + b²) .

This quantity tends to 0 if and only if ax(t) + by(t) + c → 0.
Proposition 2.31
If a parametric curve admits an asymptote as t → t0, then it also has an asymptotic direction at t0. More precisely, if the line L(A, ~u) is an asymptote as t → t0, then either R(O, ~u) or R(O, −~u) is the asymptotic direction at t0.
Proof. Without loss of generality, we may assume that ‖~u‖ = 1 (replace ~u by ~u/‖~u‖). Let a~ı + b~ be a vector orthogonal to ~u, so that the Cartesian equation of L(A, ~u) is ax + by + c = 0 for some c ∈ R. By Proposition 2.30, ax(t) + by(t) + c → 0, so that

    (a~ı + b~) · ~v(t)/‖~v(t)‖ = (ax(t) + by(t)) / √(x²(t) + y²(t)) → 0 .

Whenever ~v(t) ≠ ~0, which is the case when t is close to t0, the function t 7→ ~v(t)/‖~v(t)‖ is a continuous function taking its values on the circle C. As there are only two vectors on C that are orthogonal to a~ı + b~, namely ~u and −~u, we conclude that

    ~v(t)/‖~v(t)‖ → ±~u ,

as desired.
Note that a parametric curve may have an asymptotic direction without an asymptote.
Exercise 2.32 solution page 142
Show that t ∈ R+ 7→ t~ı + sin(t)~ admits an asymptotic direction but no asymptote at +∞.

[Figure: the graph of sin on R+.]
Moreover, the fact that ‖~v(t)‖ → ∞ does not ensure the existence of an asymptotic direction.
Exercise 2.33 solution page 142
Show that t ∈ R+ 7→ t sin(t)~ı + t cos(t)~ admits no asymptotic direction at +∞.

[Figure: the spiral traced by this curve.]
In practice, the following theorem gives criteria that very often allow one to conclude.
Theorem 2.34 h asymptotic directions and asymptotes g
The following holds.
(i) If |y(t)| → ∞ and x(t) → x0, then the line with equation x = x0 is an asymptote.

(ii) If |x(t)| → ∞ and y(t) → y0, then the line with equation y = y0 is an asymptote.

(iii) If |x(t)| → ∞ and |y(t)| → ∞, then we have the following:

a) if |y(t)/x(t)| → ∞, then we have a vertical asymptotic ray and no asymptote;

b) if |y(t)/x(t)| → 0, then we have a horizontal asymptotic ray and no asymptote;

c) if y(t)/x(t) → a ≠ 0, then

1) if |y(t) − ax(t)| → ∞, we have an asymptotic ray with direction ε~ı + aε~, where ε ∈ {−1, +1} is the sign of x(t) near t0, and no asymptote;

2) if y(t) − ax(t) → b ∈ R, the line with equation y = ax + b is an asymptote.
Proof. Cases (i), (ii) and 2) are immediate consequences of Proposition 2.30.

If |x(t)| → ∞ and |y(t)| → ∞, let us see under which condition we can have an asymptote: we suppose that the line with equation αx + βy + c = 0 (with (α, β) ≠ (0, 0)) is an asymptote. By Proposition 2.30, we must have αx(t) + βy(t) + c → 0. This implies that α ≠ 0 and β ≠ 0, as otherwise |αx(t) + βy(t) + c| → ∞. The equation of the asymptote can thus be rewritten as y = ax + b (with a ≠ 0) and we must have y(t) − ax(t) → b and, after dividing by x(t), which tends to infinity, y(t)/x(t) → a. This proves that there is no asymptote in cases a), b) and 1).

In case 1), we have y(t) ∼ ax(t), so that ‖~v(t)‖ = √(x²(t) + y²(t)) ∼ |x(t)| √(1 + a²) and

    ~v(t)/‖~v(t)‖ = (x(t)/‖~v(t)‖) ~ı + (y(t)/‖~v(t)‖) ~ → (ε~ı + aε~) / √(1 + a²) .

In case a), we have ‖~v(t)‖ ∼ |y(t)|, so that ~v(t)/‖~v(t)‖ → ±~, and, in case b), we have ‖~v(t)‖ ∼ |x(t)|, so that ~v(t)/‖~v(t)‖ → ±~ı.
Whenever there is an asymptote of equation ax + by + c = 0, one gets the position of M(t) with respect to the asymptote by studying the sign of ax(t) + by(t) + c. For instance, for b = 1, we have the following:

h if ax(t) + y(t) + c > 0, the point M(t) lies above the asymptote;

h if ax(t) + y(t) + c = 0, the point M(t) lies on the asymptote;

h if ax(t) + y(t) + c < 0, the point M(t) lies below the asymptote.
In particular, it is interesting to see whether the parametric curve crosses the asymptote an infinite number of times or a finite number of times. In the latter case, after the last crossing time, the parametric curve stays in one of the half-planes delimited by the asymptote.
2.4.3 Sketching plan
Here is the road map for sketching a parametric curve (I, ~v).
h Interval of definition. If ~v is given without its interval of definition I, you should find it. If ~v is given with a definition domain that is not an interval, split it into separate intervals.
h Interval of study. Reduce as much as possible the interval of study. See Section 2.4.1.
h Variations of x and y. Study and draw the table of variations for x and y.
h Extremities of the interval of study. Look at what happens at the extremities. Is there a finite limit? Are there asymptotic directions? Are there asymptotes? If so, what is the position of the curve with respect to the asymptote? See Section 2.4.2.
h Singular times. Study singular times.
h Particular times. Study the times that have a particular interest (multiple points, local extrema, etc.).
h Precise sketch. Don't forget to draw the tangents at the times of interest. It is also a good practice to indicate the values of the parameter at these times. The golden rule for sketching a parametric curve is the following:

d plot the points at the times of interest with their tangents;

d link these points together: if x increases, move right; if x decreases, move left; if y increases, move up; if y decreases, move down;

d if it is ugly, it is wrong!
Remark 2.35
In the table of variations, it is convenient to arrange the lines in the order x′, x, y, y′ so that x and y are adjacent. This gives a better view of the variations. Moreover, we put as many values as possible; these give the coordinates and tangents at strategic times. In particular, the times where x′ or y′ vanishes are of interest (if regular, these times have respectively a vertical or a horizontal tangent).
Let us now see a couple of examples.
Example 2.36 h astroid g
Let us start with the study of

    x(t) = cos³(t)
    y(t) = sin³(t) ,    t ∈ R .
h Periodicity. The function ~v is 2π-periodic; we can study this parametric curve on an interval of length 2π.
h Isometries. We observe that, for any t ∈ R,
x(−t) = x(t) and y(−t) = −y(t) ,
so that the curve is symmetric across the horizontal axis. We thus choose an interval of length 2π that is symmetric with respect to 0 (so that t 7→ −t makes sense), that is, [−π, π], and split it in two. As a result, we study the parametric curve on [0, π] and we will make a reflection across the horizontal axis in order to complete the picture. Next, for any t ∈ R,

    x(π − t) = −x(t) and y(π − t) = y(t) ,

so that the curve is symmetric across the vertical axis; we can study it on [0, π/2]. Finally, for any t ∈ R,

    x(π/2 − t) = y(t) and y(π/2 − t) = x(t) ,

so that the curve is symmetric across the first bisector; we can study it on I := [0, π/4].
h Derivatives and variations. We have

    x′(t) = −3 sin(t) cos²(t)
    y′(t) = 3 cos(t) sin²(t) .

On I, both functions x′ and y′ vanish only at 0; the time 0 is thus the unique singular time on I.
Table of variations on I = [0, π/4]:

    t       :  0              π/4
    x′(t)   :  0      −      −3√2/4
    x(t)    :  1      ↘      √2/4
    y(t)    :  0      ↗      √2/4
    y′(t)   :  0      +      3√2/4
h Singular time. At 0, we have the following Taylor expansions:

    x(t) = (1 − t²/2 + o(t³))³ = 1 − 3t²/2 + o(t³)    and    y(t) = (t + o(t))³ = t³ + o(t³) ,

so that

    ~v(t) = ~v(0) − (3t²/2) ~ı + t³ ~ + o(t³) .

We thus have a cusp of the first kind at 0 (p = 2, q = 3). The half-tangent at this time is directed by −~ı.
h Sketch.
[Figure: the astroid, with the parameter values 0, π/4, π/2, 3π/4, π, 5π/4, 3π/2, 7π/4 marked.]
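The truncated expansions used at the singular time can be recovered with a computer algebra system. This sympy sketch (not part of the original notes) confirms the leading terms of cos³ t and sin³ t at 0:

```python
import sympy as sp

t = sp.symbols('t')
# truncated expansions of the astroid coordinates at 0, up to o(t^3)
x_ser = sp.series(sp.cos(t)**3, t, 0, 4).removeO()   # expect 1 - 3*t**2/2
y_ser = sp.series(sp.sin(t)**3, t, 0, 4).removeO()   # expect t**3
# reading off the lowest-order terms gives p = 2 and q = 3: a cusp of
# the first kind, with half-tangent directed by -~i
```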
Example 2.37
Let us study

    x(t) = t³ / (t² − 1)
    y(t) = t(3t − 2) / (3(t − 1)) .
h Domain of study. Here, the domain of definition is not explicitly given. As ~v is defined on R \ {−1, 1}, we have three parametric curves to study, ((−∞, −1), ~v), ((−1, 1), ~v), and ((1, +∞), ~v). There is no periodicity and no immediate isometry to see.
h Derivatives and variations. On R \ {−1, 1}, we have

    x′(t) = (3t²(t² − 1) − t³(2t)) / (t² − 1)² = t²(t² − 3) / (t² − 1)²

and

    y′(t) = ((6t − 2)(t − 1) − (3t² − 2t)) / (3(t − 1)²) = (3t² − 6t + 2) / (3(t − 1)²) .

The function x′ vanishes at −√3, 0, and √3. The function y′ vanishes at 1 ± √3/3. As they do not vanish at a common time, all three parametric curves are regular.
81
Chapter 2. Plane parametric curves
    t     −∞       −√3       −1         0      1 − √3/3     1      1 + √3/3     √3      +∞
    x′         +    0     −   ‖    −    0    −        −   ‖    −         −    0    +
    x     −∞  ↗  −3√3/2  ↘  −∞ ‖ +∞  ↘   0   ↘   x2   ↘  −∞ ‖ +∞  ↘   x3   ↘  3√3/2 ↗ +∞
    y     −∞  ↗    y1    ↗    −5/6    ↗  0 ↗    y2   ↘  −∞ ‖ +∞  ↘   y3   ↗   y4   ↗ +∞
    y′         +          +        +        0     −       ‖    −     0         +

(The double bars ‖ mark t = ±1, where x is not defined; note that y is defined at t = −1, with y(−1) = −5/6.)
We used the following notation:

    y1 := y(−√3) = (3 − 7√3)/6 ≈ −1.52
    x2 := x(1 − 1/√3) = (42 − 26√3)/33 ≈ −0.09
    y2 := y(1 − 1/√3) = (4 − 2√3)/3 ≈ 0.17
    x3 := x(1 + 1/√3) = (42 + 26√3)/33 ≈ 2.63
    y3 := y(1 + 1/√3) = (4 + 2√3)/3 ≈ 2.48
    y4 := y(√3) = (3 + 7√3)/6 ≈ 2.52
• Extremities. There are quite a few extremities to study. First, as t → ±∞, both x and y tend to infinity. Theorem 2.34 tells us to study the quotient y(t)/x(t), which is well defined for t close to ±∞ (more precisely, for any t ∈ R \ {−1, 0, 1}):

    y(t)/x(t) = (3t − 2)(t + 1)/(3t²) ∼ 1  as t → ±∞.

One should then study

    y(t) − x(t) = t(3t − 2)/(3(t − 1)) − t³/(t² − 1) = (t² − 2t)/(3(t − 1)(t + 1)) ∼ 1/3  as t → ±∞.

The line with equation y = x + 1/3 is thus an asymptote. In order to know the relative position between the curve and this asymptote, one studies the sign of y(t) − (x(t) + 1/3) = (−2t + 1)/(3(t − 1)(t + 1)):
    t                      −∞          −1           1/2          1          +∞
    y(t) − (x(t) + 1/3)          +            −     0     +            −
    position               curve above  curve below  curve above  curve below

The curve crosses its asymptote at the point M(1/2) = (−1/6, 1/6).
As t → −1, y(t) → −5/6, so that we have a horizontal asymptote of equation y = −5/6. As y is increasing around −1, the curve approaches its asymptote from below at −1⁻ and from above at −1⁺.
As t → 1, both x and y tend to infinity. We have

    y(t)/x(t) = (3t − 2)(t + 1)/(3t²) → 2/3  and  y(t) − (2/3) x(t) = t(t + 2)/(3(t + 1)) → 1/2 .

The line of equation y = (2/3) x + 1/2 is thus an asymptote. The relative position is given by the sign of

    y(t) − ((2/3) x(t) + 1/2) = (t − 1)(2t + 3)/(6(t + 1)) ,

which is negative as t → 1⁻ and positive as t → 1⁺.
• Sketch.

[Sketch of the three branches with the asymptotes y = −5/6, y = (2/3)x + 1/2 and y = x + 1/3, and the parameter values −∞, −1⁻, −1⁺, 1⁻, 1⁺, +∞, −√3, 0, 1 − √3/3, 1 + √3/3, √3 marked on the curve.]
Exercise 2.38 solution page 143
Check that the point (−8/3, −4/3) is the point where the previous parametric curves intersect.
Example 2.39 ⟨a fishy example⟩
Let us study the parametric curve ~v given by

    x(t) = cos(t) − (√2/2) cos²(t)
    y(t) = sin(t) cos(t).

Here, the domain of definition is not explicit. We take R, as both functions are well defined on R.
• Periodicity. The function y is π-periodic; the function cos is 2π-periodic and cos² is π-periodic. As a result, x is 2π-periodic, and so is y (a π-periodic function is in particular 2π-periodic). So far, we can study this parametric curve on an interval of length 2π.
• Isometries. We observe that

    x(−t) = cos(t) − (√2/2) cos²(t) = x(t)  and  y(−t) = − sin(t) cos(t) = −y(t) ,

so that the curve is symmetric across the horizontal axis. We thus choose an interval of length 2π that is symmetric with respect to 0 (so that t ↦ −t makes sense), that is, [−π, π], and split it in two. As a result, we study the parametric curve on I := [0, π] and we will make a reflection across the horizontal axis in order to complete the picture.
• Derivatives and variations. We have

    x′(t) = sin(t)(√2 cos(t) − 1)
    y′(t) = cos(2t).

On I, the function x′ cancels at 0, π/4 and π; the function y′ cancels at π/4 and 3π/4. There is thus a unique singular time on I: the time π/4.
    t        0            π/4        3π/4           π
    x′       0      +      0     −    −√2     −      0
    x    (2−√2)/2   ↗     √2/4   ↘   −3√2/4   ↘   −(2+√2)/2
    y        0      ↗     1/2    ↘   −1/2     ↗      0
    y′       1      +      0     −     0      +      1
• Extremities. We have ~v(0) = ((2 − √2)/2) ~ı, ~v(π) = −((2 + √2)/2) ~ı, and ~v′(0) = ~v′(π) = ~. The curve thus starts and ends on the horizontal axis with vertical tangents.
• Singular time. We compute

    x″(t) = cos(t)(√2 cos(t) − 1) − √2 sin²(t) = √2 cos(2t) − cos(t)
    y″(t) = −2 sin(2t)

and

    x‴(t) = −2√2 sin(2t) + sin(t)
    y‴(t) = −4 cos(2t),

so that

    ~v″(π/4) = −(√2/2) ~ı − 2 ~  and  ~v‴(π/4) = −(3√2/2) ~ı .

These vectors are not collinear, so that we have a cusp of the first kind at π/4. The tangent at this time is directed by −(√2/2) ~ı − 2 ~.
• Double points. We see that the origin is a double point, obtained for t = π/2. The tangent at this time has direction ~v′(π/2) = −~ı − ~.
• Sketch.

[Sketch of the curve, with the parameter values 0, π/4, π/2, 3π/4, π marked.]
2.5 Polar curves
2.5.1 Polar coordinates
For any point M = x ~ı + y ~ ∈ E2, there exists at least one pair (ρ, θ) ∈ R+ × R such that

    x = ρ cos(θ)
    y = ρ sin(θ).     (2.5)
Definition 2.40 ⟨polar coordinates⟩
• Such numbers ρ ∈ R+ and θ ∈ R are called polar coordinates of M.
• In this context, the origin O is called the pole.
• The number ρ ∈ R+ is called the radial coordinate or the radius of M.
• The number θ ∈ R is called an angular coordinate, a polar angle, or an azimuth of M.
Note that, in contrast with Cartesian coordinates, polar coordinates are not unique. More precisely, the radius ρ = √(x² + y²) is uniquely defined, but the polar angle is only defined modulo 2π for M ≠ O and is completely arbitrary for the pole O. When one wants a unique system of coordinates, it is customary to set the polar angle of the pole to be 0, and one specifies a half-open interval of length 2π for the polar angle (often (−π, π] or [0, 2π)).
It is really simple to express the Cartesian coordinates from polar coordinates by (2.5). Conversely, it is not as easy: the radius is always ρ = √(x² + y²), but the angle always needs to be defined with care. For instance, if we choose the system where

    (ρ, θ) ∈ (R⋆+ × (−π, π]) ∪ {(0, 0)} ,

then the polar angle is given by

    θ = arctan(y/x)          if x > 0,
        π/2                  if x = 0 and y > 0,
        0                    if x = 0 and y = 0,
        −π/2                 if x = 0 and y < 0,
        arctan(y/x) + π      if x < 0 and y ≥ 0,
        arctan(y/x) − π      if x < 0 and y < 0.
Exercise 2.41 solution page 143
Express the polar angle for the system where (ρ, θ) ∈ (R⋆+ × [0, 2π)) ∪ {(0, 0)}.
2.5.2 Polar curves
For θ ∈ R, we define the unit vectors

    ~uθ := cos(θ) ~ı + sin(θ) ~  and  ~vθ := ~u_(θ+π/2) = − sin(θ) ~ı + cos(θ) ~.
[Figure: the unit vectors ~uθ and ~vθ on the unit circle; ~vθ is obtained from ~uθ by a rotation of angle π/2.]
The following straightforward properties hold.

Proposition 2.42
• We have (d/dθ) ~uθ = ~vθ and (d/dθ) ~vθ = −~uθ.
• We have ~u_(θ±π) = −~uθ and ~v_(θ±π) = −~vθ.
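These identities are easy to confirm numerically; the sketch below (our own check, not part of the notes) approximates d/dθ ~uθ and d/dθ ~vθ by central finite differences at an arbitrary angle.

```python
import math

def u(t):  # u_theta = (cos theta, sin theta)
    return (math.cos(t), math.sin(t))

def v(t):  # v_theta = u_{theta + pi/2} = (-sin theta, cos theta)
    return (-math.sin(t), math.cos(t))

h, t0 = 1e-6, 0.9  # step and sample angle (arbitrary values)
du = [(a - b) / (2 * h) for a, b in zip(u(t0 + h), u(t0 - h))]
dv = [(a - b) / (2 * h) for a, b in zip(v(t0 + h), v(t0 - h))]

assert all(abs(a - b) < 1e-8 for a, b in zip(du, v(t0)))   # (d/dtheta) u_theta = v_theta
assert all(abs(a + b) < 1e-8 for a, b in zip(dv, u(t0)))   # (d/dtheta) v_theta = -u_theta
assert all(abs(a + b) < 1e-12 for a, b in zip(u(t0 + math.pi), u(t0)))  # u_{theta+pi} = -u_theta
```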
Definition 2.43 ⟨polar curve⟩
A parametric curve in polar coordinates, or polar curve, is a parametric curve given in the form θ ∈ I ↦ ρ(θ) ~uθ, where ρ : I → R is a continuous function on an interval I ⊆ R. We will use the shorthand notation (I, ρ).
From now on, we consider a polar curve (I, ρ), set ~v(θ) := ρ(θ) ~uθ and, as before, define M(θ) such that −→OM(θ) = ~v(θ).
Warning 2.44 ⚠ ρ may take negative values ⚠
The function ρ is allowed to take negative values. The radius of M(θ) is nevertheless always |ρ(θ)|.
• If ρ(θ) ≥ 0, then the pair (ρ(θ), θ) is a pair of polar coordinates of M(θ).
• If ρ(θ) < 0, then the pair (−ρ(θ), θ + π) is a pair of polar coordinates of M(θ).
Remark 2.45
We can always see the polar curve (I, ρ) as a parametric curve in Cartesian coordinates:

    x(t) = ρ(t) cos(t)
    y(t) = ρ(t) sin(t),    t ∈ I .

In theory, we thus already know how to study such a parametric curve. So what is the difference with what we have done so far? In fact, if we have a parametric curve t ∈ I ↦ x(t) ~ı + y(t) ~, we can always write the vector x(t) ~ı + y(t) ~ in polar coordinates as some ρ(t) ~u_θ(t). We can even do so in such a way that t ∈ I ↦ ρ(t) and t ∈ I ↦ θ(t) are continuous functions. But, in general, this is not a polar curve! Polar curves are more restrictive as the parameter has to be a polar angle (modulo π). In other words, we must have θ(t) = t mod π.
Polar curves may thus have special properties that general parametric curves do not necessarilypossess. Let us now see some of their particularities.
2.5.3 What is the difference with a usual graph?
Let us investigate the difference between sketching the graph of ρ as usual and sketching the polar curve (I, ρ).
• Graph. When we plot the graph of ρ, we plot for each θ ∈ I the point with Cartesian coordinates (θ, ρ(θ)). In other words, on the vertical line x = θ, we select the point with ordinate ρ(θ). As θ ranges over I, the vertical line x = θ moves at constant speed from left to right.

[Figure: the graph of a function ρ over an interval I of length π.]
• Polar curve. Now, when we plot the polar curve (I, ρ), for each θ ∈ I, we select the point with coordinate ρ(θ) on the line L(O, ~uθ). As θ ranges over I, the line L(O, ~uθ) turns around the pole at constant speed in the direct sense.

[Figure: the same function ρ plotted as a polar curve.]
2.5.4 Tangents
Let us suppose that ρ is differentiable at θ0 ∈ I. Then ~v(θ) = ρ(θ) ~uθ is differentiable at θ0 and

    ~v′(θ0) = ρ′(θ0) ~u_θ0 + ρ(θ0) ~v_θ0 .     (2.6)
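Formula (2.6) can be checked numerically for a concrete ρ; below we take the arbitrary, illustrative choice ρ(θ) = 1 + θ² and compare a central finite difference of ~v with ρ′(θ0) ~u_θ0 + ρ(θ0) ~v_θ0 expressed in the (~ı, ~) basis.

```python
import math

rho = lambda t: 1 + t * t      # an arbitrary differentiable rho (illustrative choice)
drho = lambda t: 2 * t         # its derivative

def v(t):  # v(theta) = rho(theta) * u_theta, in Cartesian coordinates
    return (rho(t) * math.cos(t), rho(t) * math.sin(t))

h, t0 = 1e-6, 0.6
dv = [(a - b) / (2 * h) for a, b in zip(v(t0 + h), v(t0 - h))]
# rho'(t0) u_{t0} + rho(t0) v_{t0}, written out in the (i, j) basis
rhs = (drho(t0) * math.cos(t0) - rho(t0) * math.sin(t0),
       drho(t0) * math.sin(t0) + rho(t0) * math.cos(t0))
assert all(abs(a - b) < 1e-7 for a, b in zip(dv, rhs))
```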
Proposition 2.46 ⟨tangents⟩
Let θ0 ∈ I.
(i) If M(θ0) = O and there exists ε > 0 such that M(θ) ≠ O for θ ∈ (θ0 − ε, θ0) ∪ (θ0, θ0 + ε), then the polar curve admits as tangent at θ0 the line L(O, ~u_θ0).
(ii) If M(θ0) ≠ O and ρ is differentiable at θ0, then θ0 is regular and the polar curve admits as tangent at θ0 the line L(M(θ0), ρ′(θ0) ~u_θ0 + ρ(θ0) ~v_θ0).

Proof. (i) The line M(θ0)M(θ) is well defined for θ ∈ (θ0 − ε, θ0) ∪ (θ0, θ0 + ε) and admits ~uθ as direction vector. The result follows from the fact that ~uθ → ~u_θ0 as θ → θ0.
(ii) Saying that M(θ0) ≠ O is equivalent to saying that ρ(θ0) ≠ 0. If ρ is differentiable at θ0, then (2.6) gives that ~v′(θ0) ≠ ~0, so that θ0 is regular and Proposition 2.21 entails the result.
[Figure: a polar curve passing through the pole at time θ0, with tangent directed by ~u_θ0.]
Exercise 2.47 solution page 143
Find the tangents at times π/3 and π/2 of the polar curve ρ(θ) = 1 − 2 cos(θ).
Let us see in more detail what this proposition entails. If ρ is differentiable on I, then (ii) ensures that all the times are regular except possibly the times θ such that M(θ) = O.
Let θ0 be such that ρ cancels only at θ0 in a neighborhood of θ0. As the line M(θ0)M(θ) admits ~uθ as direction vector, we see that, if ρ changes sign at θ0, then we have a standard time; if ρ does not change sign at θ0, then we have a cusp of the first kind. (Note that, if ρ does not change sign at θ0, then θ0 is a singular time (indeed, either ρ is not differentiable at θ0, or it is differentiable and ρ′(θ0) = 0).)
In particular, a differentiable polar curve can never have a cusp of the second kind!
Example 2.48
Let us study at time π/2 the polar curves

    ρ(θ) = (θ + 1) cos(θ)  and  ρ(θ) = 2 cos²(θ) .

In both cases, ρ(π/2) = 0, so that M(π/2) = O. We thus know that the tangent is the line L(O, ~u_(π/2)), that is, the vertical axis.
• For the first polar curve, ρ takes positive values shortly before time π/2 and negative values shortly after. As a result, the point M(θ) passes through the origin from the top to the bottom (M(θ) ∈ R(O, ~uθ) for θ < π/2 and M(θ) ∈ R(O, −~uθ) for θ > π/2): we have a standard time.
• In the second case, ρ stays positive in a neighborhood of time π/2. As a result, the point M(θ) stays on top of the horizontal axis (M(θ) ∈ R(O, ~uθ) for θ close to π/2): we have a cusp of the first kind.
[Figure: the two polar curves near the pole; the first passes through O at θ = π/2 (standard time), the second has a cusp of the first kind there.]
The tangent is naturally expressed in the orthonormal basis (M(θ0); ~u_θ0, ~v_θ0) of E2, rather than in (O; ~ı, ~). Instead of trying to transform the expression, it is often convenient to work directly in this basis.
Definition 2.49 ⟨mobile basis⟩
The orthonormal basis (M(θ0); ~u_θ0, ~v_θ0) of E2 is called the mobile basis at the time θ0.
[Figure: the mobile basis (M(θ0); ~u_θ0, ~v_θ0), the tangent vector ~v′(θ0) with component ρ′(θ0) along ~u_θ0 and ρ(θ0) along ~v_θ0, and a dotted circle of center O through M(θ0).]
In fact, under the hypothesis that ρ is differentiable at θ0, the tangent at time θ0 is very constrained. In the mobile basis, the coordinates of its direction vector are (ρ′(θ0), ρ(θ0)), so that the coordinate along ~v_θ0 is equal, up to sign, to the distance between O and M(θ0). On the above picture, this fact is symbolized by the dotted circle.
• Looking for inflection times. Let us suppose that ρ is twice differentiable. We have

    ~v′(θ) = ρ′(θ) ~uθ + ρ(θ) ~vθ  and  ~v″(θ) = (ρ″(θ) − ρ(θ)) ~uθ + 2ρ′(θ) ~vθ .

Remember from the end of Section 2.3.3 that, at an inflection time, these vectors are collinear. Remember also that the converse is false! It is enough to study the times θ such that

    det(~v′(θ), ~v″(θ)) = 0  ⟺  2ρ′(θ)² + ρ(θ)² − ρ(θ)ρ″(θ) = 0 .
We already know from above that, if ρ(θ) = 0, then θ cannot be an inflection time. Observing that

    (1/ρ)′ = −ρ′/ρ²  and  (1/ρ)″ = −ρ″/ρ² + 2ρ′²/ρ³ = (2ρ′² − ρρ″)/ρ³ ,

it is enough to study the times at which

    1/ρ + (1/ρ)″ = 0 .
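For a nonvanishing ρ, the criterion above amounts to the identity 1/ρ + (1/ρ)″ = (2ρ′² + ρ² − ρρ″)/ρ³; the following sketch checks it numerically for the illustrative choice ρ(θ) = 2 + sin(θ), approximating (1/ρ)″ by a second difference.

```python
import math

rho = lambda t: 2 + math.sin(t)   # an arbitrary nonvanishing rho (illustrative choice)
t0, h = 0.8, 1e-4
r, r1, r2 = rho(t0), math.cos(t0), -math.sin(t0)   # rho, rho', rho'' at t0 (exact)

inv = lambda t: 1 / rho(t)
inv2 = (inv(t0 + h) - 2 * inv(t0) + inv(t0 - h)) / h**2   # second difference for (1/rho)''

lhs = inv(t0) + inv2
rhs = (2 * r1**2 + r**2 - r * r2) / r**3
assert abs(lhs - rhs) < 1e-6
```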
2.5.5 Extremities of the interval of study
• Finite extremities. Let us first suppose that ‖~v(θ)‖ → ∞ as θ approaches a finite extremity θ0 ∈ R of I. This means that ρ(θ) → ±∞ as θ → θ0. We have that²

    ~v(θ)/‖~v(θ)‖ = Sign(ρ(θ)) ~uθ → ε ~u_θ0 ,

where ε ∈ {−1, +1} is the limit of the sign of ρ. We thus always have an asymptotic direction R(O, ε ~u_θ0).
In order to see whether there exists an asymptote or not, we work in the basis (O, ~u_θ0, ~v_θ0). In this basis, the coordinates of M(θ) are

    (~v(θ) · ~u_θ0, ~v(θ) · ~v_θ0) = (ρ(θ) cos(θ − θ0), ρ(θ) sin(θ − θ0)) .

The abscissa tends to ε∞. If the ordinate has a limit ρ(θ) sin(θ − θ0) → α ∈ R, then the line of equation Y = α in this basis is an asymptote. In other words, this is the line L(α ~v_θ0, ~u_θ0). The side is given by ε and the relative position is obtained by studying the sign of ρ(θ) sin(θ − θ0) − α.
• Infinite extremities. Let us now see what can happen as θ → ±∞. There are three cases of interest:
• if ρ(θ) → 0, then the pole O is an asymptotic point;
• if ρ(θ) → a ∈ R⋆, then the circle of center O and radius |a| is an asymptotic circle (the relative position is obtained by studying the sign of ρ(θ) − a);
• if ρ(θ) → ±∞, then the polar curve behaves as a spiral.

²We denote by Sign the sign function defined by Sign(x) = −1 if x < 0, Sign(0) = 0, and Sign(x) = +1 if x > 0.
[Figures: the polar curves ρ : θ ∈ R ↦ sin(θ)/θ and ρ : θ ∈ R ↦ sin(2θ)/(2θ) ⟨asymptotic point⟩; ρ : θ ∈ [1, +∞) ↦ 1 − 1/θ ⟨asymptotic circle⟩; ρ : θ ∈ [1, +∞) ↦ θ ⟨spiral behavior⟩.]
Remark 2.50
The first two examples ρ1 : θ ∈ R ↦ sin(θ)/θ and ρ2 : θ ∈ R ↦ sin(2θ)/(2θ) are also there to illustrate the fact that, although ρ2(θ) = ρ1(2θ), the curves do not really look alike. This is simply due to the fact that the second one is ρ2(θ) ~uθ = ρ1(2θ) ~uθ ≠ ρ1(2θ) ~u_2θ. Reparameterization θ ↦ ϕ(θ) only works for polar curves if ~u_ϕ(θ) is a real function of ~uθ; this is quite restrictive!
2.5.6 Sketching
The sketching plan for a polar curve is the same as the one for general parametric curves. Let us recall it and review the particularities of polar curves.
• Interval of definition. If ρ is given without its interval of definition I, you should find it. If ρ is given with a definition domain that is not an interval, split it into separate intervals.
• Interval of study. If ρ is T-periodic, we study it on some interval of length T.

Warning 2.51 ⚠ The period of ρ is not the period of ~v! ⚠
Beware that ρ(θ + T) = ρ(θ) implies that ~v(θ + T) = ρ(θ) ~u_(θ+T). If T ∉ 2πZ, then ~u_(θ+T) ≠ ~uθ.
However, ρ(θ) ~u_(θ+T) is the image of ρ(θ) ~uθ through a rotation about O of angle T. We thus study the polar curve on an interval of length T and complete the sketch through subsequent rotations of angle T, 2T, 3T, etc. If T/(2π) ∈ Q, then after a finite number of such rotations, the curve will close itself. Otherwise, we should do an infinite number of such rotations. . . In practice, this will hopefully not happen.
Here is a list of useful isometries in polar coordinates.
• ρ ~uθ ↦ ρ ~u_(2ϕ−θ): reflection across the line L(O, ~uϕ); in particular,
  – ρ ~uθ ↦ ρ ~u_(−θ) = −ρ ~u_(π−θ): reflection across the horizontal axis,
  – ρ ~uθ ↦ ρ ~u_(π−θ) = −ρ ~u_(−θ): reflection across the vertical axis,
  – ρ ~uθ ↦ ρ ~u_(π/2−θ) = −ρ ~u_(−π/2−θ): reflection across the first bisector (line y = x),
  – ρ ~uθ ↦ ρ ~u_(−π/2−θ) = −ρ ~u_(π/2−θ): reflection across the second bisector (line y = −x).
• ρ ~uθ ↦ ρ ~u_(θ+ϕ): rotation about O of angle ϕ; in particular,
  – ρ ~uθ ↦ ρ ~u_(θ±π) = −ρ ~uθ: rotation about O of angle π.
Summing up, we look for real numbers a ∈ R such that ρ(θ + a) = ±ρ(θ) or ρ(a − θ) = ±ρ(θ).
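Each identity in the list can be verified on coordinates; for instance, the sketch below checks at one arbitrary sample point that ρ ~u_(−θ) is the reflection of ρ ~uθ across the horizontal axis, and that ρ ~u_(−θ) = −ρ ~u_(π−θ).

```python
import math

def point(rho, theta):  # Cartesian coordinates of rho * u_theta
    return (rho * math.cos(theta), rho * math.sin(theta))

rho, theta = 1.7, 0.83  # arbitrary sample values
x, y = point(rho, theta)

# rho * u_{-theta} is the reflection of rho * u_theta across the horizontal axis
xr, yr = point(rho, -theta)
assert abs(xr - x) < 1e-12 and abs(yr + y) < 1e-12

# rho * u_{-theta} = -rho * u_{pi - theta}
xs, ys = point(-rho, math.pi - theta)
assert abs(xs - xr) < 1e-12 and abs(ys - yr) < 1e-12
```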
[Figures: the reflection ρ ~uθ ↦ ρ ~u_(2ϕ−θ) across the line L(O, ~uϕ), and the rotation ρ ~uθ ↦ ρ ~u_(θ+ϕ) of angle ϕ about O.]
Example 2.52
Let us find an interval of study for the polar curve

    ρ(θ) = 1 + 2 cos²(θ).

The function ρ is defined on R and is 2π-periodic. We thus have, for θ ∈ R,

    ~v(θ + 2π) = ρ(θ + 2π) ~u_(θ+2π) = ρ(θ) ~uθ = ~v(θ) .

(Here, as the period of ρ is 2π, ~v is also 2π-periodic; remember Warning 2.51.) As a result, we have the entire curve by studying an interval of length 2π.
The function ρ is even. For θ ∈ R,

    ~v(−θ) = ρ(−θ) ~u_(−θ) = ρ(θ) ~u_(−θ) ,

so that M(−θ) is obtained from M(θ) by reflection across the horizontal axis. We study ([0, π], ρ) and complete the picture by a reflection across the horizontal axis.
For θ ∈ [0, π],

    ~v(π − θ) = ρ(π − θ) ~u_(π−θ) = ρ(θ) ~u_(π−θ) ,

so that M(π − θ) is obtained from M(θ) by reflection across the vertical axis. We study ([0, π/2], ρ) and complete the picture by a reflection across the vertical axis and a reflection across the horizontal axis. Here is the result:
[Figure: the curve ρ(θ) = 1 + 2 cos²(θ), with the arc θ ∈ [0, π/2] highlighted.]
• Sign and variations of ρ. Draw the table of variations of ρ in such a way that the sign of ρ can be directly obtained (in other words, put the canceling times of ρ in the table). In particular, solve ρ(θ) = 0 (on the interval of study) in order to find the times at which the curve passes through the pole. The sign of ρ around such times allows one to decide between standard times and cusps of the first kind.
• Extremities of the interval of study. Look at what happens at the extremities. See Section 2.5.5.
• Particular times. Study the times that have a particular interest (multiple points, intersections with the axes, etc.). Note that a multiple point is such that there exist times θ1, θ2 such that ~v(θ1) = ~v(θ2). This means that ρ(θ1) ~u_θ1 = ρ(θ2) ~u_θ2, which implies that ~u_θ1 = ±~u_θ2, so that θ2 − θ1 ∈ πZ.
Beware that, although one of the times (say, θ1) can be chosen in the interval of study, the other one does not necessarily belong to this interval (but can nonetheless be chosen in an interval of length the period of ~v, if it exists).
• Precise sketch. Don't forget to draw the tangents at the times of interest. It is also good practice to indicate the values of the parameter at these times. The golden rule for sketching a polar curve is the following:
• always turn in the direct sense (counterclockwise sense) at constant angular speed;
• if |ρ| increases, move away from the pole; if |ρ| decreases, move toward the pole.
This is simply due to the fact that θ ↦ ~uθ turns around the pole at constant angular speed in the direct sense.
Example 2.53 ⟨cardioid⟩
Let us study the polar curve

    ρ(θ) = 1 − cos(θ).

The domain of definition is not explicit; we take R, on which ρ is well defined.
• Interval of study. The function ρ is 2π-periodic and, as θ ↦ ~uθ is also 2π-periodic, the polar curve itself is 2π-periodic. Moreover, ρ is even, so that we study ([0, π], ρ) and complete the picture by reflection across the horizontal axis.
• Sign and variations. We have ρ′(θ) = sin(θ).

    θ     0         π
    ρ′    0    +    0
    ρ     0    ↗    2
• Particular times. The time 0 is the only time where M = O. The tangent is the horizontal axis (L(O, ~u0)).
In Cartesian coordinates, the parametric curve is

    x(θ) = ρ(θ) cos(θ) = cos(θ) − cos²(θ)
    y(θ) = ρ(θ) sin(θ) = sin(θ) − sin(θ) cos(θ),    θ ∈ [0, π] .

We have a horizontal tangent when y′ cancels (provided that x′ does not) and a vertical tangent when x′ cancels (provided that y′ does not). We have x′(θ) = sin(θ)(2 cos(θ) − 1), so that the canceling times of x′ on [0, π] are 0, π/3 and π. Next, y′(θ) = cos(θ) − cos²(θ) + sin²(θ) = −2 cos²(θ) + cos(θ) + 1 = (2 cos(θ) + 1)(1 − cos(θ)), so that the canceling times of y′ on [0, π] are 0 and 2π/3.
As a result, we have a horizontal tangent at time 2π/3 and vertical tangents at times π/3 and π. At time 0, we cannot conclude from this analysis but we already know that there is a horizontal tangent.
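The canceling times found above can be confirmed numerically; the following sketch (ours, for checking only) evaluates x′ and y′ of the cardioid at the claimed times.

```python
import math

def xprime(t):  # derivative of x(t) = cos t - cos^2 t
    return math.sin(t) * (2 * math.cos(t) - 1)

def yprime(t):  # derivative of y(t) = sin t - sin t cos t
    return math.cos(t) - math.cos(t)**2 + math.sin(t)**2

for t in (0.0, math.pi / 3, math.pi):   # canceling times of x' on [0, pi]
    assert abs(xprime(t)) < 1e-12
for t in (0.0, 2 * math.pi / 3):        # canceling times of y' on [0, pi]
    assert abs(yprime(t)) < 1e-12
```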
• Sketch.

[Sketch of the cardioid, with the parameter values 0, π/3, 2π/3, π marked and the unit scale shown.]
Example 2.54
Let us study the polar curve

    ρ(θ) = 1 + tan(θ/2) .

The function ρ is not defined on π + 2πZ. We must, a priori, study each polar curve ((−π + 2kπ, π + 2kπ), ρ) for k ∈ Z.
• Interval of study. The function ρ is 2π-periodic and, as θ ↦ ~uθ is also 2π-periodic, each polar curve ((−π + 2kπ, π + 2kπ), ρ) for k ∈ Z has the same image. We cannot further reduce the interval of study: we choose for instance to study ((−π, π), ρ).
• Sign and variations. On (−π, π), we have ρ′(θ) = 1/(2 cos²(θ/2)).

    θ     −π        −π/2        0         π/2        π
    ρ′          +           +  1/2   +     1     +
    ρ     −∞    ↗     0     ↗    1    ↗     2    ↗   +∞
• Particular times. At time −π/2, the tangent is L(O, ~u_(−π/2)), that is, the vertical axis.
We added to the table of variations the values ρ(0) = 1, ρ′(0) = 1/2, ρ(π/2) = 2 and ρ′(π/2) = 1, as they are easily computable. The tangents at these points have direction

    ρ′(0) ~u0 + ρ(0) ~v0 = (1/2) ~ı + ~  and  ρ′(π/2) ~u_(π/2) + ρ(π/2) ~v_(π/2) = ~ − 2 ~ı .
• Extremities. As θ ↗ π, we have ρ(θ) → +∞: the polar curve admits as asymptotic direction the ray R(O, ~uπ) = R(O, −~ı). As θ ↘ −π, we have ρ(θ) → −∞: the polar curve admits as asymptotic direction the ray R(O, −~u_(−π)) = R(O, ~ı). In fact, it is enough to study ρ in a neighborhood of π by periodicity: the behavior on (−π, −π + ε) is the same as that on (π, π + ε) (we momentarily change the interval of study).
In order to see whether there are asymptotes, we work in the basis (O, ~uπ, ~vπ). In this basis, the ordinate of M(θ) is ρ(θ) sin(θ − π) = −ρ(θ) sin(θ). Recalling that

    sin(θ) = 2 sin(θ/2) cos(θ/2) = 2 sin(θ/2) cos(θ/2) / (cos²(θ/2) + sin²(θ/2)) = 2 tan(θ/2) / (1 + tan²(θ/2)) ,

we obtain that the ordinate of M(θ) is

    −ρ(θ) sin(θ) = −2t(1 + t)/(1 + t²) ,

where we set t = tan(θ/2). As θ → π, t → ±∞, so that the ordinate of M(θ) tends to −2. We thus have an asymptote with equation Y = −2 in the basis (O, ~uπ, ~vπ). Moreover, we obtain the relative position by studying the sign of

    −ρ(θ) sin(θ) − (−2) = 2(1 − t)/(1 + t²) ,

which is negative as θ ↗ π (as t → +∞) and positive as θ ↘ π (as t → −∞).
Summing up, in the basis (O, ~uπ, ~vπ), as θ ↗ π, the point M(θ) goes to the right from below the line Y = −2. As θ ↘ π, the point M(θ) goes to the left from above the line Y = −2. Now, be careful that working in the basis (O, ~uπ, ~vπ) means making a rotation about the pole of angle π. In (O, ~ı, ~), as θ ↗ π, the point M(θ) goes to the left from above the line y = 2. As θ ↘ π, the point M(θ) goes to the right from below the line y = 2.
• Double point. From the previous study (and a rough draft of the plot), we see that there is a double point obtained for θ1 ∈ (0, π/2) and θ2 ∈ (−π, −π/2). Let us try to find these values. We have

    ~v(θ1) = ~v(θ2)  ⟹  ρ(θ1) ~u_θ1 = ρ(θ2) ~u_θ2  ⟹  θ2 = θ1 − π and ρ(θ1) = −ρ(θ2) .

We thus need to solve

    ρ(θ1) = −ρ(θ1 − π)  ⟹  1 + tan(θ1/2) = −1 − tan(θ1/2 − π/2)
                        ⟹  1 + tan(θ1/2) = −1 + 1/tan(θ1/2)
                        ⟹  1 + t = −1 + 1/t ,  setting t := tan(θ1/2) > 0
                        ⟹  t² + 2t − 1 = 0
                        ⟹  t = −1 ± √2
                        ⟹  t = −1 + √2 ,  as t > 0
                        ⟹  θ1 = π/4 .
Note: you are not expected to know such results. In exercises and tests, you will get help. You are,however, expected to know the method.
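As a quick numerical confirmation (ours, not part of the notes), the times θ1 = π/4 and θ2 = θ1 − π = −3π/4 do give the same point, which turns out to be (1, 1):

```python
import math

def v(theta):  # Cartesian coordinates of rho(theta) * u_theta for rho = 1 + tan(theta/2)
    rho = 1 + math.tan(theta / 2)
    return (rho * math.cos(theta), rho * math.sin(theta))

p1 = v(math.pi / 4)
p2 = v(math.pi / 4 - math.pi)   # theta_2 = theta_1 - pi
assert abs(p1[0] - p2[0]) < 1e-12 and abs(p1[1] - p2[1]) < 1e-12
assert abs(p1[0] - 1) < 1e-12 and abs(p1[1] - 1) < 1e-12   # the double point is (1, 1)
```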
• Sketch.

[Sketch of the curve, with the asymptote, the parameter values −π/2, 0, π/2 marked, the double point reached at π/4 and −3π/4, and the branches as θ → π⁻ and θ → −π⁺.]
3 Ordinary differential equations

In this chapter, we will see how to solve some ordinary differential equations, mainly first order linear differential equations and linear systems of ODEs, with a special focus on linear differential equations with constant coefficients.
For further references about this chapter, you may consult
h [Cod61], [CC97, 1–3], [BDH11, 1–3];
h [LM07, 31], [MTW07, 16], in French.
3.1 Introduction
3.1.1 Motivation
3.1.2 Formal definitions
3.1.3 Separable differential equations
3.1.4 Linear ODEs
3.2 First order linear differential equations
3.2.1 Homogeneous equation
3.2.2 Finding a particular solution to y′ = a(x)y + b(x)
3.2.3 Solution to the nonhomogeneous equation
3.3 Systems of linear ODEs
3.3.1 Preliminaries: matrix exponential
3.3.2 Solution to the homogeneous equation
3.3.3 Solution to the nonhomogeneous equation
3.3.4 Method for solving a system of linear ODEs in practice
3.4 Linear differential equations with constant coefficients
3.4.1 Homogeneous equation
3.4.2 Nonhomogeneous equation
3.4.3 Example: second order equation
3.1 Introduction
3.1.1 Motivation
Fundamental quantities of physics are sometimes linked through equations, and some quantities are obtained from others by differentiation (for instance, the acceleration is the derivative of the speed, which is the derivative of the position). This may give rise to equations linking a quantity with several of its derivatives. Solving such equations is key to a better understanding of the physical world.
A very basic example is that of a falling object. If we neglect friction forces, the object is only subject to gravity (this is the context of so-called free fall). Applying Newton's Fundamental Principle of Dynamics, we get

    m a(t) = m g ,

where m denotes the mass of the object, g the gravitational acceleration, and a the vertical acceleration of the object. As the acceleration is the second derivative of the position, which we denote by y, we simply obtain y″(t) = g. This equation may directly be integrated as

    y′(t) = v0 + g t  and finally  y(t) = y0 + v0 t + g t²/2 ,

where y0 and v0 denote the initial position and speed.
[Figure: the falling object subject to its weight m g ~ey; ~ey is a unitary vector along the y axis.]
Now, if we want to take friction forces into account, we may use the classical model where the friction is the product of a friction coefficient f with the velocity v(t) = y′(t), and is opposite to the motion of the object. We thus obtain

    m a(t) = m g − f v(t),

which can be rewritten as

    v′(t) + (f/m) v(t) = g.     (3.1)

[Figure: the falling object subject to its weight m g ~ey and to the friction force −f v(t) ~ey.]
Solving Equation (3.1) is not as easy as solving the one we obtained earlier. This is an example of an ordinary differential equation. The aim of this chapter is to see how to integrate such equations.
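Even before solving (3.1) in closed form, one can approximate its solution numerically, for instance with the forward Euler scheme. The sketch below (the values of f, m and the step size are illustrative, not from the notes) compares the scheme against the closed-form solution v(t) = (mg/f)(1 − e^(−ft/m)) for v(0) = 0, which the methods of this chapter produce.

```python
import math

# forward Euler for v'(t) = g - (f/m) v(t), v(0) = 0
g, f, m = 9.81, 0.5, 2.0      # illustrative physical values
dt, T = 1e-4, 3.0             # step size and final time (illustrative)
v, t = 0.0, 0.0
while t < T - 1e-12:
    v += dt * (g - (f / m) * v)   # v(t + dt) ~ v(t) + dt * v'(t)
    t += dt

exact = (m * g / f) * (1 - math.exp(-f * t / m))  # closed-form solution with v(0) = 0
assert abs(v - exact) < 1e-2
```

The global error of forward Euler is proportional to the step size, so shrinking dt brings the two values closer.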
Remark 3.1
In the physics literature, it is common to denote the first and second derivatives with respect to time of the function y respectively by ẏ and ÿ. For instance, (3.1) would read ÿ + (f/m) ẏ = g with this notation.
3.1.2 Formal definitions
Definition 3.2 ⟨ODE⟩
• An ordinary differential equation (ODE in short) of order n is an equation of the form

    F(x, y(x), y′(x), y″(x), . . . , y⁽ⁿ⁾(x)) = 0     (ODE)

where F : R^(n+2) → R is a function that is not constant with respect to its last variable.
• A solution to (ODE) on an interval I ⊆ R is an n times differentiable function y : I → R such that (ODE) holds for all x ∈ I.
Remark 3.3
Depending on the context, the variable is usually x or t (often when it represents time) and the function is often y or x (only when x is not used as the variable).

Remark 3.4
In order to lighten the notation, the argument of the function is often omitted; we thus write for instance y′ = y + cos(x), which should be understood as y′(x) = y(x) + cos(x).
Remark 3.5
The interval on which an ODE is to be solved is usually not mentioned. Your first task is to determine where the ODE makes sense! For instance, the ODE √(1 − x) y″ + y/x = eˣ only makes sense when x ↦ √(1 − x), x ↦ 1/x and x ↦ eˣ are defined. We can thus only try to solve this ODE on the intervals (−∞, 0) and (0, 1) (which does not ensure that there will be solutions defined on these intervals).
Some ODEs can easily be solved by guesswork:

Exercise 3.6 solution page 143
Find at least one solution to each of the following ODEs:
(i) y′ = x + sin(x)   (ii) y′ = y   (iii) y′ = 7y   (iv) y″ = 2y
Checking that a given function is a solution to an ODE is also often quite easy:
Exercise 3.7 solution page 144
Check that x ∈ (−c, +∞) ↦ 1/(x + c) is a solution to y′ = −y² for any fixed constant c ∈ R.
But solving (or integrating) an ODE consists in finding all its solutions, and there is no way to do so in general. Even showing that a solution exists can sometimes prove very difficult! In real-life situations, it often happens that the general solution to an ODE cannot be found; instead, one can use approximate solutions: this is the focus of the branch of mathematics called numerical analysis. In what follows, we will restrict our attention to particular ODEs that can be solved.
Definition 3.8 ⟨maximal solution⟩
A maximal solution to an ODE is a solution y on an interval I such that there exists no solution z on an interval J with I ⊊ J and z|I = y.

Beware that the interval on which a solution to an ODE is defined really matters. For instance, the ODE y′ = 1/x admits as maximal solutions the functions x ∈ R⋆− ↦ ln(−x) + c for c ∈ R, and x ∈ R⋆+ ↦ ln(x) + c for c ∈ R. When not specified, we will implicitly consider that the interval is R.
3.1.3 Separable differential equations
Definition 3.9 ⟨separable differential equation⟩
A separable differential equation is an ODE that can be written in the form

    y′ f(y) = g(x).

Solving such an ODE is done by finding primitives F and G of f and g and noticing that

    (F ∘ y)′ = y′ F′(y) = y′ f(y) = g = G′ ,

which is equivalent to F ∘ y = G + c for some constant c ∈ R.
In practice, it is useful to write f(y) dy = g(x) dx and integrate both sides as

    ∫ from y0 to y of f(u) du = ∫ from x0 to x of g(u) du ,

which gives F(y) − F(y0) = G(x) − G(x0). This is the same as above, the value of the constant being chosen such that y(x0) = y0.
Example 3.10
Let us solve x² y′ = e^(−y). We first "separate" the variables. Note that this equation cannot be satisfied at x = 0 for any function y, so that it is safe to assume x ≠ 0. We rewrite the equation as

    y′ e^y = 1/x² ,  which gives  e^y = −1/x + c  (c ∈ R) .

This can only make sense when −1/x + c > 0, in which case

    y(x) = ln(−1/x + c) .

This is a well-defined, differentiable, maximal solution on
• (1/c, 0) if c < 0;
• R⋆− if c ≥ 0;
• (1/c, +∞) if c > 0.
3.1.4 Linear ODEs
From now on, we will only focus on a special type of ODEs, called linear.
Definition 3.11 ⟨linear differential equation⟩
• A linear differential equation of order n is an ODE of the form

    a0(x) y + a1(x) y′ + · · · + an(x) y⁽ⁿ⁾ = b(x)

where each ai and b are real-valued continuous functions on a common interval I ⊆ R, and an ≢ 0 on I.¹
• It is called homogeneous if b ≡ 0, nonhomogeneous if b ≢ 0.
• We speak of a linear differential equation with constant coefficients when all the ai's are constant (note that b is not necessarily constant).

¹Recall that we use the notation ≡ for functional equality: f ≡ g on I means ∀x ∈ I, f(x) = g(x), and f ≢ g on I means ∃x ∈ I, f(x) ≠ g(x).
Example 3.12
(i) y′ + 5x y = eˣ is a first order linear differential equation.
(ii) y′ + 5x y = 0 is the homogeneous linear differential equation associated with the previous one.
(iii) 2y″ − 3y′ + 5y = 0 is a second order homogeneous linear differential equation with constant coefficients.
(iv) y′² − y = x and y″ y′ − y = 0 are not linear differential equations.
The term linear comes from the following proposition.

Proposition 3.13
The set of solutions on some fixed interval J ⊆ I to a homogeneous linear differential equation is a real vector space.

Proof. Clearly, x ∈ J ↦ 0 is a valid solution. If y1 and y2 are solutions to a given homogeneous linear differential equation, then it is plain to see that, for any real number λ, λy1 + y2 is also a solution.
In order to solve a linear differential equation, one proceeds in two steps:
(i) solve the associated homogeneous equation;
(ii) find a particular solution to the original equation.
The second step is only needed when the original equation is not homogeneous. One then concludes as follows.
Proposition 3.14
Let yE be a solution on some interval J ⊆ I to

    a0(x) y + a1(x) y′ + · · · + an(x) y⁽ⁿ⁾ = b(x)     (E)

and SH be the set of solutions on J to the associated homogeneous equation

    a0(x) y + a1(x) y′ + · · · + an(x) y⁽ⁿ⁾ = 0 .     (H)

Then the set of solutions on J to (E) is SE := { yE + y : y ∈ SH }.

Proof. Let y be a solution to (H). By adding, for each x ∈ J,

    a0(x) yE + a1(x) yE′ + · · · + an(x) yE⁽ⁿ⁾ = b(x)  and  a0(x) y + a1(x) y′ + · · · + an(x) y⁽ⁿ⁾ = 0 ,

one sees that y + yE is a solution to (E). Conversely, let z be a solution to (E). Then, for each x ∈ J,

    a0(x) z + a1(x) z′ + · · · + an(x) z⁽ⁿ⁾ = b(x)  and  a0(x) yE + a1(x) yE′ + · · · + an(x) yE⁽ⁿ⁾ = b(x) ,

so that, by subtraction, z − yE ∈ SH. There thus exists y = z − yE ∈ SH such that z = yE + y.
Proposition 3.15 (superposition of solutions)
If y1, y2, . . . , yk are respectively solutions on some interval J ⊆ I to
a0(x)y + a1(x)y′ + · · · + an(x)y(n) = b1(x)
a0(x)y + a1(x)y′ + · · · + an(x)y(n) = b2(x)
⋮
a0(x)y + a1(x)y′ + · · · + an(x)y(n) = bk(x),
then y1 + y2 + · · · + yk is a solution on J to
a0(x)y + a1(x)y′ + · · · + an(x)y(n) = b1(x) + b2(x) + · · · + bk(x).
Proof. It suffices to add up, for each x ∈ J, the equations
a0(x)y1 + a1(x)y1′ + · · · + an(x)y1(n) = b1(x)
a0(x)y2 + a1(x)y2′ + · · · + an(x)y2(n) = b2(x)
⋮
a0(x)yk + a1(x)yk′ + · · · + an(x)yk(n) = bk(x)
in order to obtain the result.
3.2 First order linear differential equations
The aim of this section is to solve linear differential equations of order 1, which take the form a0(x)y + a1(x)y′ = b(x) with a0, a1 and b real-valued continuous functions on a common interval I ⊆ R, and a1 ≢ 0 on I. Such an equation will first be solved on each interval where a1 does not vanish; one should then see whether the solutions can be extended over the points where a1 vanishes and patched over the different intervals.
Example 3.16
We want to solve xy′ = y. This simple example could be solved directly, but the general method that we will see consists in solving y′ = y/x. We will find that all the solutions are x ∈ R⋆− ↦ c−x and x ∈ R⋆+ ↦ c+x for any constants c−, c+ ∈ R. Two such functions can be extended at 0 and reunited in a differentiable way only if c− = c+. In the end, we check that each x ∈ R ↦ cx, for a constant c ∈ R, is a solution to the initial equation xy′ = y.
Dividing by a1, our equation takes the form
y′ = a(x)y + b(x) (E1)
with a and b continuous on some interval I ⊆ R. As explained in Proposition 3.14, we will first solve the homogeneous equation y′ = a(x)y and then find a particular solution to (E1).
3.2.1 Homogeneous equation
Theorem 3.17 (solutions to a homogeneous first order linear equation)
Let a : I → R be a continuous function and A : I → R one of its primitives. The solutions on I to
y′ = a(x)y (H1)
are the functions x ∈ I ↦ c eA(x), for any constant c ∈ R.
Proof. We have
(H1) ⇐⇒ ∀x ∈ I, y′(x) − a(x)y(x) = 0
⇐⇒ ∀x ∈ I, e−A(x)(y′(x) − a(x)y(x)) = 0
⇐⇒ ∀x ∈ I, y′(x)e−A(x) − y(x)A′(x)e−A(x) = 0
⇐⇒ ∀x ∈ I, (y(x)e−A(x))′ = 0
⇐⇒ ∃c ∈ R, ∀x ∈ I, y(x)e−A(x) = c
⇐⇒ ∃c ∈ R, ∀x ∈ I, y(x) = c eA(x).
The previous proof is quite ad hoc. A fast way to recover Theorem 3.17 is to write (assuming that y does not vanish)
y′/y = a(x) ⇐⇒ ∃k ∈ R, ln |y(x)| = A(x) + k
⇐⇒ ∃k ∈ R, |y(x)| = eA(x)+k
⇐⇒ ∃k ∈ R, y(x) = ±ek eA(x)
⇐⇒ ∃c ∈ R⋆, y(x) = c eA(x) (with c = ±ek).
We thus recover all the solutions except x ∈ I ↦ 0.
Remark 3.18
Theorem 3.17 is coherent with Proposition 3.13. In fact, the set of solutions is the one-dimensional vector space Span(x ∈ I ↦ eA(x)).
An important fact to notice about the solutions is that, for any fixed x0 ∈ I and y0 ∈ R, there exists exactly one solution y to (H1) such that y(x0) = y0. Indeed, by Theorem 3.17, there exists c ∈ R such that y : x ∈ I ↦ c eA(x), and one should have c eA(x0) = y0 ⇐⇒ c = y0 e−A(x0). Conversely, the function y : x ∈ I ↦ y0 eA(x)−A(x0) is as desired. The condition y(x0) = y0 is called an initial condition to the ODE. Note that x ↦ A(x) − A(x0) is the primitive of a that vanishes at x0. Let us sum up this discussion in the following theorem.
Theorem 3.19 (Cauchy–Lipschitz: solution with initial condition)
Let a : I → R be a continuous function, x0 ∈ I and y0 ∈ R. There exists a unique solution on I to
y′ = a(x)y
that satisfies the initial condition y(x0) = y0. Moreover, this solution is x ∈ I ↦ y0 exp(∫_{x0}^{x} a(u) du).
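For readers who want to experiment, the closed form of Theorem 3.19 can be checked numerically against a crude Euler scheme. The data below (a(x) = cos(x), x0 = 0, y0 = 2) is an arbitrary illustration, not part of the course text.

```python
import numpy as np

# Illustrative check of Theorem 3.19 with a(x) = cos(x), x0 = 0, y0 = 2:
# the predicted solution is y(x) = y0 * exp(sin(x) - sin(x0)).
x0, y0 = 0.0, 2.0
a = np.cos

def y_exact(x):
    return y0 * np.exp(np.sin(x) - np.sin(x0))

# Independent check: integrate y' = a(x) y with Euler's method on [0, 2].
xs = np.linspace(x0, 2.0, 20_001)
h = xs[1] - xs[0]
y = y0
for x in xs[:-1]:
    y += h * a(x) * y

print(abs(y - y_exact(2.0)))  # small: Euler agrees with the closed form
```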
The case where a is a constant function takes a particularly simple form. Let us summarize Theorems 3.17 and 3.19 in this case.
Corollary 3.20 (homogeneous first order linear equation with constant coefficients)
Let α, x0, y0 ∈ R. The set of solutions to y′ = αy is the vector line {x ∈ R ↦ c eαx : c ∈ R}. Moreover, there exists a unique solution that satisfies the initial condition y(x0) = y0: it is x ∈ R ↦ y0 eα(x−x0).
Depending on the cases, the solutions look like this:
[Figure: graphs of x ↦ y0 eαx, for α > 0 and for α < 0, with curves for y0 > 0, y0 = 0 and y0 < 0.]
Exercise 3.21 solution page 144
Solve 2y′ − 5y = 0 and ex y′ + 2y = 0.
Recall that the set of solutions to a homogeneous linear differential equation is always a vector space (Proposition 3.13). It would be interesting to know its dimension. In light of Remark 3.18, we know that in the case of equation (H1), this vector space is a vector line (a vector space of dimension 1).
Warning 3.22 (Beware of patched solutions!)
It is wrong to say that the set of solutions to a homogeneous first order linear differential equation is always a vector line! Indeed, recall that a homogeneous first order linear differential equation has the general form
a0(x)y + a1(x)y′ = 0,
where a1 is allowed to vanish.
Let us look at a concrete example.
Example 3.23
Let us solve
xy′ = 2y.  (3.2)
We need to solve this equation on the two intervals R⋆− and R⋆+. On each of these two intervals, it is equivalent to
y′ = (2/x) y,
so, by Theorem 3.17, the solutions are x ∈ R⋆− ↦ c− x² and x ∈ R⋆+ ↦ c+ x² for any constants c−, c+ ∈ R.
Let us fix arbitrary c−, c+ ∈ R and set y− : x ∈ R⋆− ↦ c− x² and y+ : x ∈ R⋆+ ↦ c+ x². We see that y−(x) → 0 as x ↑ 0 and y+(x) → 0 as x ↓ 0, so that the function defined, for x ∈ R, by
y(x) := y−(x) if x < 0,   0 if x = 0,   y+(x) if x > 0
is continuous. Moreover, for h ≠ 0,
|(y(h) − y(0))/h| ≤ max(|c−|, |c+|) |h| → 0 as h → 0,
so that y is differentiable on R, with y′(0) = 0. As a result, y is a solution to (3.2). We conclude that the set of solutions to (3.2) is the vector plane Span(x ↦ x² 1R⋆−(x), x ↦ x² 1R⋆+(x)).ᵃ
ᵃThe function 1A is the indicator function of the set A: it is defined by 1A(x) = 1 if x ∈ A and 1A(x) = 0 if x ∉ A. For instance, our function y can be written y(x) = c− x² 1R⋆−(x) + c+ x² 1R⋆+(x).
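The differentiability at 0 of the patched function can also be observed numerically; here is a minimal sketch with the arbitrary illustrative constants c− = 3 and c+ = −7.

```python
# Patched solution of Example 3.23: y(x) = c_minus * x^2 for x < 0 and
# c_plus * x^2 for x >= 0.  The difference quotient at 0 tends to 0 for
# ANY choice of the two constants, so y'(0) = 0.
c_minus, c_plus = 3.0, -7.0  # arbitrary illustrative values

def y(x):
    return (c_minus if x < 0 else c_plus) * x * x

for h in (1e-2, 1e-4, 1e-6):
    print(abs(y(h) / h), abs(y(-h) / (-h)))  # both columns shrink to 0
```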
3.2.2 Finding a particular solution to y′ = a(x)y + b(x)
As explained above, it remains to find a particular solution to (E1). In some lucky cases, this can be done directly by guesswork:
Example 3.24
Let us solve, on (0, π),
y′ sin(x) + y cos(x) = sin(x) + x cos(x).  (3.3)
Clearly, x ∈ (0, π) ↦ x is a solution to this equation. By Theorem 3.17, the solutions to the associated homogeneous equation are, for c ∈ R,
x ∈ (0, π) ↦ c e−ln(sin(x)) = c / sin(x),
since −ln(sin(x)) is a primitive of −cos(x)/sin(x) on (0, π). By Proposition 3.14, the solutions to (3.3) are thus x ∈ (0, π) ↦ c/sin(x) + x, for c ∈ R.
Recall also Proposition 3.15: it might be useful to split up b and find separate particular solutions.
Exercise 3.25 solution page 144
Solve y′ − 2xy = 4x − 1/x² − 2.
Variation of constants
In most cases, however, we use a more robust method: the so-called variation of constants. Although the name is quite contradictory, it makes some sense. As in Theorem 3.17, let A : I → R be a primitive of a. We know that the solutions on I to (H1) have the form x ∈ I ↦ c eA(x). The idea of the method is to look for a solution to (E1) of the form yE : x ∈ I ↦ c(x) eA(x) where, now, c : I → R is a differentiable function!
As A′ = a, we have
y′E(x) = a(x)c(x)eA(x) + c′(x)eA(x) = a(x)yE(x) + c′(x)eA(x) .
As a result, yE is a solution to (E1) if and only if
y′E(x) = a(x)yE(x) + b(x) ⇐⇒ c′(x)eA(x) = b(x)
⇐⇒ c′(x) = b(x)e−A(x)
⇐⇒ c(x) = c(x0) + ∫_{x0}^{x} b(u)e−A(u) du, where x0 ∈ I is arbitrary.
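The formula for c can be checked numerically. The sketch below (with the arbitrary illustrative choice a ≡ −1, so A(x) = −x, and b(x) = ex + 1 on [0, 2]) approximates c by the trapezoidal rule and verifies that yE = c(x)eA(x) solves (E1).

```python
import numpy as np

# Variation of constants, numerically: c(x) = integral of b(u) e^{-A(u)}
# from x0 = 0 to x, then y_E(x) = c(x) e^{A(x)}.  Illustrative data only.
b = lambda x: np.exp(x) + 1.0
A = lambda x: -x

xs = np.linspace(0.0, 2.0, 4001)
h = xs[1] - xs[0]
integrand = b(xs) * np.exp(-A(xs))
c = np.concatenate(([0.0], np.cumsum((integrand[1:] + integrand[:-1]) * h / 2)))
yE = c * np.exp(A(xs))

# Check y_E' = a y_E + b, i.e. y_E' = -y_E + b, by central differences.
dy = (yE[2:] - yE[:-2]) / (2 * h)
resid = np.max(np.abs(dy - (-yE[1:-1] + b(xs[1:-1]))))
print(resid)  # small
```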
Warning 3.26 (The solution is not x ∈ I ↦ c(x) but x ∈ I ↦ c(x) eA(x))
Once you have found c(x), don't forget to multiply it by eA(x) in order to obtain the particular solution. This is a very common mistake.
Example 3.27
Let us solve
y′ + y = ex + 1  (3.4)
on R. The associated homogeneous equation is y′ = −y, whose solutions are x ↦ c e−x, c ∈ R. Let us look for a particular solution yE : x ↦ c(x)e−x. This function is a solution to (3.4) if and only if
y′E + yE = ex + 1 ⇐⇒ c′(x)e−x − c(x)e−x + c(x)e−x = ex + 1
⇐⇒ c′(x)e−x = ex + 1
⇐⇒ c′(x) = e2x + ex
⇐⇒ c(x) = (1/2)e2x + ex + k for some k ∈ R.
We choose, for instance, k = 0, which gives yE(x) = c(x)e−x = (1/2)ex + 1. The solutions to (3.4) are thus
x ∈ R ↦ (1/2)ex + 1 + c e−x, c ∈ R.
Case where a is constant
Let us suppose here that a ≡ α ≠ 0. In this case, there may be faster ways to find a particular solution to
y′ = αy + b(x)
than to use the variation of constants when b is a simple function (or a sum of simple functions, recall Proposition 3.15). In what follows, we always work on R.
• b is a polynomial function. If b is a degree k polynomial function, looking for a particular solution in the form of a degree k polynomial amounts to solving a simple linear system of equations. Indeed,
x ↦ a0 + a1x + a2x² + · · · + ak x^k
is a solution to
y′ − αy = b0 + b1x + b2x² + · · · + bk x^k
if and only if
a1 − αa0 = b0
2a2 − αa1 = b1
3a3 − αa2 = b2
⋮
kak − αak−1 = bk−1
−αak = bk.
This system can easily be solved from bottom to top.
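The bottom-to-top solve can be sketched in a few lines of code. The data below (α = 2 and b(x) = x², i.e. b = [0, 0, 1]) is an arbitrary illustration; the coefficient names follow the system above.

```python
# Solve a1 - alpha*a0 = b0, ..., k*a_k - alpha*a_{k-1} = b_{k-1},
# -alpha*a_k = b_k, from bottom to top.  Illustrative data: alpha = 2, b(x) = x^2.
alpha = 2.0
b = [0.0, 0.0, 1.0]
k = len(b) - 1

a = [0.0] * (k + 1)
a[k] = -b[k] / alpha                              # last equation
for i in range(k - 1, -1, -1):
    a[i] = ((i + 1) * a[i + 1] - b[i]) / alpha    # (i+1) a_{i+1} - alpha a_i = b_i

# Check the system coefficientwise.
lhs = [((i + 1) * a[i + 1] if i < k else 0.0) - alpha * a[i] for i in range(k + 1)]
print(a)    # the polynomial coefficients of the particular solution
print(lhs)  # equals b
```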
• b : x ↦ eβx Pk(x) where β ∈ R⋆ and Pk is a degree k polynomial. In this case, we look for a solution of the form eβx z(x) for an unknown function z. This function is a solution to
y′ − αy = eβx Pk(x)
if and only if, for all x ∈ R,
eβx(βz(x) + z′(x) − αz(x)) = eβx Pk(x) ⇐⇒ (β − α)z(x) + z′(x) = Pk(x).
If β = α, take for z any primitive of Pk; otherwise, we are back to the previous case.
Exercise 3.28 solution page 144
Solve y′ = −2y + x2e−x.
• b : x ↦ A cos(βx) + B sin(βx) where A, B ∈ R, β ∈ R⋆. In this case, we look for a solution of the form C cos(βx) + D sin(βx). This function is a solution to
y′ − αy = A cos(βx) + B sin(βx)
if and only if, for all x ∈ R,
(βD − αC) cos(βx) + (−βC − αD) sin(βx) = A cos(βx) + B sin(βx),
that is,
{ βD − αC = A
{ −βC − αD = B
⇐⇒
{ C = −(αA + βB)/(α² + β²)
{ D = (βA − αB)/(α² + β²).
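These formulas can be verified directly. The sketch below uses the illustrative values α = −1, β = 2, A = 0, B = 1, which correspond to the equation of Exercise 3.29.

```python
import math

# Particular solution C cos(bx) + D sin(bx) of y' - a y = A cos(bx) + B sin(bx).
alpha, beta, A, B = -1.0, 2.0, 0.0, 1.0   # i.e. y' + y = sin(2x)
denom = alpha ** 2 + beta ** 2
C = -(alpha * A + beta * B) / denom
D = (beta * A - alpha * B) / denom

# The residual y' - alpha*y - (A cos + B sin) should vanish identically.
def residual(x):
    y = C * math.cos(beta * x) + D * math.sin(beta * x)
    dy = -C * beta * math.sin(beta * x) + D * beta * math.cos(beta * x)
    return dy - alpha * y - (A * math.cos(beta * x) + B * math.sin(beta * x))

print([residual(x) for x in (0.0, 0.5, 1.3)])  # all ≈ 0
```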
Exercise 3.29 solution page 145
Solve y′ + y = sin(2x).
3.2.3 Solution to the nonhomogeneous equation
Summing up, we obtain the following crucial theorem.
Theorem 3.30 (Cauchy–Lipschitz for first order linear equations)
Let a : I → R and b : I → R be two continuous functions, x0 ∈ I, and A : x ∈ I ↦ ∫_{x0}^{x} a(u) du the primitive of a that vanishes at x0. The solutions on I to
y′ = a(x)y + b(x)  (E1)
are the functions
x ∈ I ↦ (c + ∫_{x0}^{x} b(u)e−A(u) du) eA(x), for any constant c ∈ R.
Moreover, for each y0 ∈ R, there is a unique solution with initial condition y(x0) = y0: this solution is
x ∈ I ↦ (y0 + ∫_{x0}^{x} b(u)e−A(u) du) eA(x).
Example 3.31
Let us find the solution of y′ + y = ex + 1 satisfying y(1) = 2. We saw in Example 3.27 that there exists c ∈ R such that
y : x ∈ R ↦ (1/2)ex + 1 + c e−x.
We need to find the (unique) value of c for which the initial condition y(1) = 2 is satisfied:
y(1) = 2 ⇐⇒ (1/2)e¹ + 1 + c e−1 = 2 ⇐⇒ c = e − e²/2.
The desired solution is thus y : x ∈ R ↦ (1/2)ex + 1 + (e − e²/2) e−x.
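The value of c can be double-checked with two lines of arithmetic:

```python
import math

# Example 3.31: y(x) = (1/2)e^x + 1 + c e^{-x} with y(1) = 2 forces
# c = (2 - (e/2 + 1)) * e, which simplifies to e - e^2/2.
c = (2 - (math.e / 2 + 1)) * math.e
print(c, math.e - math.e ** 2 / 2)  # the two expressions agree

y1 = math.e / 2 + 1 + c * math.exp(-1)
print(y1)  # ≈ 2, the initial condition is satisfied
```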
Integral curves
Recall that the graph of a function f : I → R is the set of points of the plane {(x, f(x)) : x ∈ I}. An integral curve of an ODE is the graph of a solution to this ODE. Theorem 3.30 ensures the following.
Corollary 3.32 (integral curves)
Let us consider (E1) on an interval I. For each point (x0, y0) ∈ I × R, there is a unique integral curve of (E1) that contains (x0, y0). This furthermore implies that two different integral curves never intersect!
Example 3.33
The solutions to y′ + y = x are
y : x ∈ R ↦ x − 1 + c e−x, c ∈ R.
For each point (x0, y0) ∈ R², there is a unique integral curve containing (x0, y0).
[Figure: the integral curves x ↦ x − 1 + c e−x, with a marked point (x0, y0) on one of them.]
Exercise 3.34 solution page 145
Solve y′ + y ln 2 = 0. Plot the integral curves and highlight the one corresponding to y(1) = 1/2.
Warning 3.35 (Beware of patched solutions!)
Beware that Theorem 3.30 and Corollary 3.32 only hold for first order linear equations in the form (E1). For an equation
a0(x)y + a1(x)y′ = b(x),
these only hold on each interval where a1 does not vanish! It is wrong otherwise! For instance, the integral curves on R of Example 3.23 all intersect at (0, 0). It is also wrong to say that for each (x0, y0) ∈ R⋆+ × R, there is a unique solution y : R → R to (3.2) such that y(x0) = y0. What is right to say is that, for each (x0, y0) ∈ R⋆+ × R, there is a unique solution y : R⋆+ → R to (3.2) such that y(x0) = y0 as, on the interval R⋆+, the theorem holds! But such a solution is not maximal. . .
3.3 Systems of linear ODEs
In this section, we are interested in solving systems of linear ODEs with constant coefficients, as for instance
{ y′1 = y1 + 3y2 + b1(t)
{ y′2 = −4y1 + 7y2 + b2(t).
(In the context of this section, it is more usual to use t as the variable.) Here, we have two unknown functions y1 and y2 that evolve together. In general, we cannot solve these equations separately; we need to solve them at the same time. Setting
X := [ y1 ],   A := [  1  3 ]   and   B := [ b1 ],
     [ y2 ]         [ −4  7 ]              [ b2 ]
we can rewrite our system as X′ = AX + B(t). In this notation, A is a fixed square matrix, B is a vector-valued function and X is an unknown vector-valued differentiable function.
Definition 3.36
• We say that a vector-valued or a matrix-valued function is continuous (resp. differentiable) if all its coordinates are continuous (resp. differentiable).
• The limit (resp. derivative) of a continuous (resp. differentiable) vector-valued or matrix-valued function is the vector or matrix whose coordinates are the limits (resp. derivatives) of its coordinates.
Note that we did not define continuity and differentiability as in the previous chapter (Definition 2.4). That earlier definition is actually more accurate¹ but is equivalent to Definition 3.36 in this context (recall Proposition 2.5).
In this section, we will see how to solve such an equation: more precisely, we consider the equation
X ′ = AX +B(t) , (ES)
where A ∈ Mn(R) is an n × n real matrix, B : I → Rn is a continuous n-dimensional vector-valued function defined on some interval I ⊆ R, and X : I → Rn is an unknown differentiable n-dimensional vector-valued function.
The propositions of Section 3.1.4 (Propositions 3.13, 3.14, and 3.15) have the following straightforward analog in the present setting.
¹In mathematics, the notion of continuity is crucial and can be defined in a much broader context. This is way beyond the scope of this course.
Proposition 3.37
The following holds.
(i) The set SHS of solutions to the homogeneous system of linear ODEs
X ′ = AX (HS)
is a real vector space.
(ii) Let XES be a solution to (ES). The set of solutions to (ES) is {XES + X : X ∈ SHS}.
(iii) If, for 1 ≤ i ≤ k, we have X′i = AXi + Bi(t), then (X1 + · · · + Xk)′ = A(X1 + · · · + Xk) + B1(t) + · · · + Bk(t) (superposition of solutions).
3.3.1 Preliminaries: matrix exponential
In light of Corollary 3.20, it is tempting to say that the solutions to the homogeneous equation
X ′ = AX (HS)
are of the form etA, up to constants. Of course, this involves taking the exponential of a matrix, which has not been defined yet. . . We will now see that this can indeed be properly done!
First, let us introduce some terminology. For 1 ≤ i ≤ m, 1 ≤ j ≤ n, we denote by [A]ij the entry of index (i, j) of a matrix A ∈ Mmn(C). Furthermore, for complex numbers aij, 1 ≤ i ≤ m, 1 ≤ j ≤ n, we denote by [aij]1≤i≤m,1≤j≤n the m × n matrix whose entry of index (i, j) is aij. Finally, for a matrix denoted by an upper case letter, we will often use the same letter in lower case to denote its entries: for instance, we will set aij := [A]ij.
Recall that we may define the usual exponential as the sum of the following convergent series:
exp(x) := 1 + x + x²/2! + x³/3! + · · · = ∑_{k=0}^{∞} x^k / k!.
Proposition 3.38
For any matrix A ∈ Mn(C), each entry of the matrix
In + A + (1/2!)A² + (1/3!)A³ + · · · + (1/r!)A^r = ∑_{k=0}^{r} (1/k!) Ak
converges absolutely as the integer r → ∞. (In denotes the identity matrix of Mn(C) and we use the usual convention that A⁰ = In.)
Proof. Let A = [apq]1≤p,q≤n ∈ Mn(C) and 1 ≤ i, j ≤ n. First notice that, for each r ∈ N,
[∑_{k=0}^{r} (1/k!) Ak]ij = ∑_{k=0}^{r} (1/k!) [Ak]ij = ∑_{k=0}^{r} (1/k!) ∑_{1≤p1,...,pk−1≤n} aip1 ap1p2 · · · apk−2pk−1 apk−1j
is indeed the partial sum of a series. This series is absolutely convergent as claimed because, denoting max(A) := max_{1≤p,q≤n} |apq|, we have
∑_{k=0}^{r} |(1/k!) ∑_{1≤p1,...,pk−1≤n} aip1 ap1p2 · · · apk−1j| ≤ ∑_{k=0}^{r} n^{k−1} max(A)^k / k! ≤ e^{n max(A)} / n < ∞.
This proposition legitimizes the following definition.
Definition 3.39 (matrix exponential)
For any matrix A ∈ Mn(C), the exponential of A is defined as
exp(A) := In + A + (1/2!)A² + (1/3!)A³ + · · · = ∑_{k=0}^{∞} (1/k!) Ak.
We also use the shorthand notation eA.
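The defining series is easy to evaluate numerically. Below is a minimal truncated-series sketch (in practice one would use a library routine such as scipy.linalg.expm), checked here on a diagonal matrix.

```python
import numpy as np

def expm_series(A, terms=30):
    """Truncated series In + A + A^2/2! + ... (a sketch, not production code)."""
    S = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k      # now term == A^k / k!
        S = S + term
    return S

# On diag(1, -2), the exponential should be diag(e, e^{-2}).
D = np.diag([1.0, -2.0])
print(expm_series(D))
```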
Clearly, if A is a 1×1 matrix [δ], then eA = [eδ]. More generally, the exponential of a diagonal matrix can be simply expressed:
Proposition 3.40 (exponential of a diagonal matrix)
For any δ1, . . . , δn ∈ C,
exp(diag(δ1, . . . , δn)) = diag(eδ1, . . . , eδn),
where diag(δ1, . . . , δn) denotes the diagonal matrix with diagonal entries δ1, . . . , δn.
In particular, denoting by 0n ∈ Mn(C) the zero matrix, we have e0n = In.
Proof. As, for each k ≥ 0, diag(δ1, . . . , δn)^k = diag(δ1^k, . . . , δn^k), this directly comes from the definition.
The situation is way more complicated in general! There is however a simple way to express the exponential of a diagonalizable matrix, thanks to the following proposition.
Proposition 3.41 (exponential of similar matrices)
Let A and B be two similar matrices of Mn(C): recall that this means that there exists an invertible matrix P of Mn(C) such that B = PAP−1. We have
eB = e^{PAP−1} = PeAP−1.
Proof. This comes from the fact that, for any k ≥ 0,
Bk = PAP−1 × PAP−1 × · · · × PAP−1 (k times) = PAkP−1,
which yields
∑_{k=0}^{r} (1/k!) Bk = P (∑_{k=0}^{r} (1/k!) Ak) P−1,
and then eB = PeAP−1 by taking r → ∞. (We admit that matrix multiplication is “continuous” in some sense: roughly speaking, the entries of a matrix product are simply linear combinations of the entries of the matrices.)
Corollary 3.42 (exponential of a diagonalizable matrix)
Let A ∈ Mn(C) be diagonalizable: we may write it as A = P∆P−1 for some invertible matrix P and some diagonal matrix ∆. Then
eA = Pe∆P−1.
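Numerically, Corollary 3.42 can be checked on the matrix of Exercise 3.43 (used here only as an illustration): compute eA as Pe∆P−1 via numpy's eigendecomposition and compare with the truncated defining series.

```python
import numpy as np

A = np.array([[1.0, 1.0], [-2.0, 4.0]])   # the matrix of Exercise 3.43

# e^A = P e^D P^{-1}, where D is the diagonal of eigenvalues (here 2 and 3).
delta, P = np.linalg.eig(A)
expA = P @ np.diag(np.exp(delta)) @ np.linalg.inv(P)

# Compare with the truncated series of Definition 3.39.
S, term = np.eye(2), np.eye(2)
for k in range(1, 40):
    term = term @ A / k
    S = S + term

print(np.max(np.abs(expA - S)))  # ≈ 0: both computations agree
```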
Exercise 3.43 solution page 145
Let
A = [  1  1 ]
    [ −2  4 ].
Check that
A = [ 1 1 ] [ 2 0 ] [  2 −1 ]
    [ 1 2 ] [ 0 3 ] [ −1  1 ]
and deduce etA.
Warning 3.44 (eA+B ≠ eAeB in general)
Beware that, in contrast to the fundamental identity ex+y = exey when x and y are real (or complex)numbers, we do not have, in general, eA+B = eAeB when A and B are matrices!
Exercise 3.45 solution page 146
Let
A := [ 1 0 ]   and   B := [ 0 1 ].
     [ 0 0 ]              [ 0 0 ]
Show that eA+B ≠ eAeB.
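A quick numerical illustration of Warning 3.44, with the matrices of Exercise 3.45 (this does not replace the symbolic computation asked for in the exercise):

```python
import numpy as np

def expm_series(M, terms=30):
    # Truncated defining series of the matrix exponential.
    S, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        S = S + term
    return S

A = np.array([[1.0, 0.0], [0.0, 0.0]])
B = np.array([[0.0, 1.0], [0.0, 0.0]])

lhs = expm_series(A + B)               # upper-right entry is e - 1
rhs = expm_series(A) @ expm_series(B)  # upper-right entry is e
print(lhs[0, 1], rhs[0, 1])            # different!
```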
This identity however holds when A and B commute.
Proposition 3.46
Let A, B ∈ Mn(C) be such that AB = BA. We have
eA+B = eAeB .
Proof. Recall that, if a and b are two elements of any ring,
(a + b)k = ∑_{(u1,...,uk)∈{0,1}^k} a^{u1} b^{1−u1} a^{u2} b^{1−u2} · · · a^{uk} b^{1−uk}
(in the i-th factor (a + b), one selects a if ui = 1 or b if ui = 0). If a and b commute, the factor corresponding to (u1, . . . , uk) can be simplified as a^{Σui} b^{k−Σui} and we obtain the so-called Newton's binomial expansion
(a + b)k = ∑_{i+j=k} (k!/(i!j!)) ai bj.
As Mn(C) is not commutative, this expression does not hold in general for matrices! Here, however, we can use it for A and B as we assumed that they commute. By definition,
eA+B = ∑_{k=0}^{∞} (A + B)k / k!
= ∑_{k=0}^{∞} (1/k!) ∑_{i+j=k} (k!/(i!j!)) Ai Bj
= ∑_{k=0}^{∞} ∑_{i+j=k} (Ai/i!)(Bj/j!)
= (∑_{i=0}^{∞} Ai/i!)(∑_{j=0}^{∞} Bj/j!)
= eA eB.
The second equality is Newton's binomial expansion and the fourth equality is obtained by observing that summing over i + j = k for k ∈ {0, 1, 2, . . .} amounts to summing (along descending diagonal lines) over (i, j) ∈ {0, 1, 2, . . .}².
As a byproduct, we obtain the following fundamental property.
Corollary 3.47 (inverse of an exponential)
Let A ∈ Mn(C). Then eA is invertible and its inverse is e−A.
Proof. By Propositions 3.46 (A and −A commute) and 3.40, we have
eAe−A = eA−A = e0n = In.
3.3.2 Solution to the homogeneous equation
We may now turn to the central theorem of this section, which will allow us to solve (HS).
Theorem 3.48 (derivative of t ↦ etA)
Let A ∈ Mn(C). The function t ∈ R ↦ etA is differentiable (thus continuous) and its derivative is t ∈ R ↦ AetA.
Proof. Let t ∈ R and h ≠ 0. As tA commutes with hA, by Proposition 3.46,
(e(t+h)A − etA)/h = (etA ehA − etA)/h = ((ehA − In)/h) etA.
By Definition 3.39,
ehA − In − hA = ∑_{k=2}^{∞} (1/k!) (hA)k,
so that, using the same technique as in the proof of Proposition 3.38, for any 1 ≤ i, j ≤ n, we have
|[(ehA − In)/h − A]ij| ≤ (1/|h|) ∑_{k=2}^{∞} (1/k!) |[(hA)k]ij|
≤ (1/|h|) ∑_{k=2}^{∞} n^{k−1} max(hA)^k / k!
= (1/(n|h|)) ∑_{k=2}^{∞} (n|h| max(A))^k / k!
= (1/(n|h|)) (e^{n|h| max(A)} − 1 − n|h| max(A))
= (1/(n|h|)) (1 + n|h| max(A) + O((n|h| max(A))²) − 1 − n|h| max(A))
= O(|h|) → 0
as h → 0. (In the second equality, we used the expansion of the exponential of the real number n|h| max(A).) Summing up,
∀i, j, [(ehA − In)/h − A]ij → 0, so that (ehA − In)/h → A, and finally (e(t+h)A − etA)/h → AetA.
Theorem 3.49 (solutions to (HS))
Let A ∈ Mn(R). The solutions to
X′ = AX  (HS)
are the functions
t ∈ R ↦ etA C
for any constant vector C = (c1, c2, . . . , cn) ∈ Rn. Moreover, for any vector X0 ∈ Rn, there is a unique solution to (HS) with initial condition X(t0) = X0: it is t ∈ R ↦ e(t−t0)A X0.
Proof. We already know from Theorem 3.48 that, for any vector C ∈ Rn, the function t ∈ R ↦ etAC is a solution to (HS). Conversely, let X be a solution to (HS) and set Y : t ∈ R ↦ e−tAX ∈ Rn. We claim that, if U and V are matrix-valued functions, the product rule for the derivative of UV works as for real-valued functions. Indeed, with transparent notation,
[(UV)′]ij = ([UV]ij)′ = (∑k uik vkj)′ = ∑k (uik vkj)′ = ∑k (u′ik vkj + uik v′kj) = ∑k u′ik vkj + ∑k uik v′kj = [U′V]ij + [UV′]ij.
Using it, together with Theorem 3.48, we obtain that
Y′(t) = −Ae−tAX + e−tAX′ = −Ae−tAX + e−tAAX = 0n1,
where 0n1 denotes the zero n-dimensional vector. In the last equality, we used the fact that A and etA commute, which is easy to see from Definition 3.39. As a result, Y is equal to some constant vector C ∈ Rn. By Corollary 3.47, the inverse of e−tA is etA, so that
X = etAe−tAX = etAY = etAC.
The second statement comes from the fact that X(t0) = et0AC, so that C = e−t0AX(t0).
Corollary 3.50
The set of solutions to (HS) is the n-dimensional vector space with basis {t ↦ etA ei : 1 ≤ i ≤ n}, where {ei : 1 ≤ i ≤ n} is any basis of Rn.
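As a numerical sanity check of Theorem 3.49 (with an arbitrary illustrative matrix, not one from the text), one can compare X(t) = etA X0 with a crude Euler integration of X′ = AX:

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, 0.0]])   # illustrative example matrix
X0 = np.array([1.0, 0.0])
t = 1.0

# e^{tA} by the truncated series of Definition 3.39.
S, term = np.eye(2), np.eye(2)
for k in range(1, 40):
    term = term @ (t * A) / k
    S = S + term
X_exact = S @ X0          # the solution at time t

# Independent check: Euler integration of X' = AX from X(0) = X0.
X, n = X0.copy(), 20_000
h = t / n
for _ in range(n):
    X = X + h * (A @ X)
print(np.max(np.abs(X - X_exact)))  # small
```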
Exercise 3.51 solution page 146
Recalling Exercise 3.43, solve
{ y′1 = y1 + y2
{ y′2 = −2y1 + 4y2.
3.3.3 Solution to the nonhomogeneous equation
As in Section 3.2, we solve (ES) by using variation of constants.
Theorem 3.52 (solutions to (ES))
Let A ∈ Mn(R), B : I → Rn be a continuous function defined on some interval I ⊆ R, and t0 ∈ I. The solutions to
X′ = AX + B(t)  (ES)
are the functions
t ∈ I ↦ etA (C0 + ∫_{t0}^{t} e−uA B(u) du)
for any constant vector C0 ∈ Rn. Moreover, for any vector X0 ∈ Rn, there is a unique solution to (ES) with initial condition X(t0) = X0: it is
t ∈ I ↦ e(t−t0)A X0 + etA ∫_{t0}^{t} e−uA B(u) du.
Proof. We look for solutions of (ES) in the form X = etAC(t) with C : I → Rn differentiable. By the product rule,
C′ = (e−tAX)′ = −Ae−tAX + e−tAX′ = −Ae−tAX + e−tA(AX + B(t)) = e−tAB(t),
so that
C(t) = C(t0) + ∫_{t0}^{t} e−uA B(u) du.
The first statement follows. The second statement is obtained by noticing that X(t0) = et0AC(t0).
3.3.4 Method for solving a system of linear ODEs in practice
In theory, Theorem 3.49, Corollary 3.50 and Theorem 3.52 tell us a lot of interesting things. But, in practice (on a concrete example), the formulas are not always helpful (mostly because computing a matrix exponential is not an easy task in general). Instead, we use the following method. The first step is to put A in the nicest form possible.
Diagonalizable matrix
Let us first recall some matrix algebra. The matrix A ∈ Mn(R) is called diagonalizable if there exist an invertible matrix P ∈ Mn(R) and a diagonal matrix² ∆ ∈ Mn(R) such that A = P∆P−1. Beware that not all matrices are diagonalizable. When diagonalizing a matrix A ∈ Mn(R), one searches for n real numbers δ1, . . . , δn ∈ R (not necessarily distinct) and n linearly independent vectors v1, . . . , vn ∈ Rn such that Avi = δivi for 1 ≤ i ≤ n. Such a number δi is called an eigenvalue and such a vector vi is called an eigenvector. We refer you to your algebra course for methods of finding these eigenvalues and eigenvectors.
Once we have found the eigenvalues δ1, . . . , δn and eigenvectors v1, . . . , vn, we have the following. As v1, . . . , vn are n linearly independent vectors in an n-dimensional vector space, they form a basis of Rn. The n × n matrix P whose columns are v1, . . . , vn allows to pass from the canonical basis (e1, . . . , en) to the eigenbasis (v1, . . . , vn): Pei = vi. Conversely, its inverse allows to go the other way: P−1vi = ei. The point of the eigenbasis is that, in this basis, the mapping v ↦ Av is represented by the diagonal matrix
∆ := diag(δ1, . . . , δn)
since, precisely, Avi = δivi. We thus have A = P∆P−1.
Back to our system of ODEs, we set Y = P−1X . Equation (HS) is equivalent to
X ′ = AX ⇐⇒ X ′ = P∆P−1X ⇐⇒ P−1X ′ = ∆P−1X ⇐⇒ Y ′ = ∆Y .
From Theorem 3.49, we know that the solutions to this system are given by
Y(t) = et∆ C = (c1 eδ1t, . . . , cn eδnt), C = (c1, . . . , cn) ∈ Rn.
The solutions to (HS) are thus given by
X(t) = PY(t) = c1 eδ1t v1 + · · · + cn eδnt vn.  (3.5)
²A matrix M is called diagonal if [M]ij = 0 whenever i ≠ j.
In order to solve (ES), we need one particular solution (Proposition 3.37). To this aim, we use the variation of constants and look for a solution Z : t ∈ R ↦ c1(t)eδ1t v1 + · · · + cn(t)eδnt vn, where c1, . . . , cn are n differentiable functions. In other words, Z(t) = Pet∆C(t) where C : R → Rn is a differentiable vector-valued function. We have
Z′ = AZ + B(t) ⇐⇒ Pet∆C′(t) + P∆et∆C(t) = P∆P−1Pet∆C(t) + B(t)
⇐⇒ Pet∆C′(t) = B(t)
⇐⇒ c′1(t)eδ1t v1 + · · · + c′n(t)eδnt vn = B(t).
We obtain c′1, . . . , c′n by solving this system of n equations and find c1, . . . , cn by integration.
Remark 3.53
Solving the previous system amounts to finding C′(t) = e−t∆P−1B(t). But, when doing so, we do not necessarily need to explicitly compute P−1 (for instance, if B ≡ 0, we have nothing to do).
What if we had computed etA instead? Alternatively, we could have computed etA directly thanks to Corollary 3.42. If we had done so, we would have found for the solutions to (HS) the functions
t ∈ R ↦ etAC = e^{Pt∆P−1}C = Pet∆P−1C
for any vector C ∈ Rn. In comparison with (3.5), where we had t ∈ R ↦ Pet∆C, the difference is that the constants are not chosen in the same way. Of course, the result is exactly the same, as P−1 is a bijection from Rn to Rn, so this is just renaming the constants. The variation of constants can then be used: we find, as in Theorem 3.52, C′(t) = e−tAB(t). We can directly integrate this equation as we have already computed e−tA.
This is a perfectly valid method but, in general, the previous one is faster: the computations are easier and P−1 may not need to be fully computed.
Example 3.54
Let us solve the following system of ODEs:
{ x′ = 4x + 16t
{ y′ = y + 2z
{ z′ = 2y + z + 2e4t.
• Diagonalizing the matrix. The matrix associated with the system is
A = [ 4 0 0 ]
    [ 0 1 2 ]
    [ 0 2 1 ].
Let us see if it happens to be diagonalizable. To this aim, we look for eigenvalues δ ∈ R and eigenvectors v = (a, b, c) ≠ (0, 0, 0). We have
Av = δv ⇐⇒ { 4a = δa
            { b + 2c = δb
            { 2b + c = δc.
We immediately see the eigenvalue δ1 := 4 corresponding to the eigenvector v1 := (1, 0, 0). Otherwise,
{ b + 2c = δb   ⇐⇒   { 3(b + c) = δ(b + c)
{ 2b + c = δc        { b − c = δ(c − b).
We find δ2 := 3 and δ3 := −1 respectively corresponding, for instance, to v2 := (0, 1, 1) and v3 := (0, 1, −1). All in all, A = P∆P−1, where
P := [ 1 0  0 ]         ∆ := [ 4 0  0 ]
     [ 0 1  1 ]   and        [ 0 3  0 ]
     [ 0 1 −1 ]              [ 0 0 −1 ].
• Solving the homogeneous system. The homogeneous system is X′ = AX = P∆P−1X, which is equivalent to P−1X′ = ∆P−1X, that is, Y′ = ∆Y, where we set Y := P−1X. The solutions to this system are the functions
Y : t ∈ R ↦ et∆ (c1, c2, c3) = (c1 e4t, c2 e3t, c3 e−t), c1, c2, c3 ∈ R,
so that
X : t ∈ R ↦ PY(t) = c1 (e4t, 0, 0) + c2 (0, e3t, e3t) + c3 (0, e−t, −e−t), c1, c2, c3 ∈ R.
• Finding a particular solution. In order to solve the original system of ODEs, we look for a particular solution thanks to the variation of constants:
Z : t ∈ R ↦ c1(t) (e4t, 0, 0) + c2(t) (0, e3t, e3t) + c3(t) (0, e−t, −e−t)
is a solution if and only if
Z′ = AZ + (16t, 0, 2e4t) ⇐⇒ c′1(t)(e4t, 0, 0) + c′2(t)(0, e3t, e3t) + c′3(t)(0, e−t, −e−t) = (16t, 0, 2e4t)
⇐⇒ { c′1(t)e4t = 16t
    { c′2(t)e3t + c′3(t)e−t = 0
    { c′2(t)e3t − c′3(t)e−t = 2e4t
⇐⇒ { c′1(t)e4t = 16t
    { 2c′2(t)e3t = 2e4t
    { 2c′3(t)e−t = −2e4t
⇐⇒ { c′1(t) = 16t e−4t
    { c′2(t) = et
    { c′3(t) = −e5t.
We can take for instance c1 : t ∈ R ↦ −(1 + 4t)e−4t, c2 : t ∈ R ↦ et, and c3 : t ∈ R ↦ −(1/5)e5t. This gives the particular solution
Z : t ∈ R ↦ −(1 + 4t)e−4t (e4t, 0, 0) + et (0, e3t, e3t) − (1/5)e5t (0, e−t, −e−t) = (−1 − 4t, (4/5)e4t, (6/5)e4t).
• Solving the nonhomogeneous system. The solutions to our system are thus
X : t ∈ R ↦ (−1 − 4t, (4/5)e4t, (6/5)e4t) + c1 (e4t, 0, 0) + c2 (0, e3t, e3t) + c3 (0, e−t, −e−t), c1, c2, c3 ∈ R.
In other words, the solutions are the functions
x : t ∈ R ↦ −1 − 4t + c1e4t
y : t ∈ R ↦ (4/5)e4t + c2e3t + c3e−t
z : t ∈ R ↦ (6/5)e4t + c2e3t − c3e−t,
for any c1, c2, c3 ∈ R.
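The solution found in Example 3.54 can be checked numerically; the sketch below takes the constants equal to 0 and compares finite-difference derivatives with the right-hand side of the system.

```python
import numpy as np

# Particular solution of Example 3.54 (constants c1 = c2 = c3 = 0):
# x(t) = -1 - 4t, y(t) = (4/5)e^{4t}, z(t) = (6/5)e^{4t}.
def sol(t):
    return np.array([-1.0 - 4.0 * t, 0.8 * np.exp(4 * t), 1.2 * np.exp(4 * t)])

def rhs(t, v):
    x, y, z = v
    return np.array([4 * x + 16 * t, y + 2 * z, 2 * y + z + 2 * np.exp(4 * t)])

# Compare central-difference derivatives with the right-hand side.
h = 1e-6
for t in (0.0, 0.3, 1.0):
    d = (sol(t + h) - sol(t - h)) / (2 * h)
    print(np.max(np.abs(d - rhs(t, sol(t)))))  # ≈ 0
```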
Trigonalizable matrix
A matrix A ∈ Mn(R) is called trigonalizable if there exist an invertible matrix P ∈ Mn(R) and a triangular matrix³ T ∈ Mn(R) such that A = PTP−1. Beware that not all matrices are trigonalizable. Let us suppose now that A is trigonalizable: we write it A = PTP−1 with a triangular matrix T.
We cannot use exactly the same method as eT is not easily computable. . . Although it is tempting to write T as the sum of a diagonal matrix and a nilpotent one, this does not prove conclusive as, in general, these two matrices do not commute. For instance, if we write
[ 1 1 ] = [ 1 0 ] + [ 0 1 ],
[ 0 2 ]   [ 0 2 ]   [ 0 0 ]
notice that
[ 1 0 ] [ 0 1 ] = [ 0 1 ]   whereas   [ 0 1 ] [ 1 0 ] = [ 0 2 ].
[ 0 2 ] [ 0 0 ]   [ 0 0 ]             [ 0 0 ] [ 0 2 ]   [ 0 0 ]
Instead, we proceed as follows. As above, we set Y = P−1X. Equation (ES) is equivalent to
X′ = AX + B(t) ⇐⇒ X′ = PTP−1X + B(t)
⇐⇒ P−1X′ = TP−1X + P−1B(t)
⇐⇒ Y′ = TY + P−1B(t).
Using the notation (we assume here that T is lower triangular)
Y = [ y1 ],   T = [ t11   0   . . .   0  ]   and   P−1B = [ f1 ],
    [ ⋮  ]        [  ⋮    ⋱     ⋱    ⋮  ]                 [ ⋮  ]
    [ yn ]        [ tn1  . . . . .  tnn ]                 [ fn ]
our system becomes
y′1 = t11 y1 + f1(t)
y′2 = t22 y2 + (t21 y1(t) + f2(t))
⋮
y′n = tnn yn + (tn1 y1(t) + · · · + tn(n−1) yn−1(t) + fn(t)).
³A matrix M is called upper triangular if [M]ij = 0 whenever i > j; lower triangular if [M]ij = 0 whenever i < j; and triangular if it is upper triangular or lower triangular.
This system can be solved line by line, from top to bottom. Indeed, the first line is a first order linear ODE. Once it has been solved, the second line is also a first order linear ODE, the term (t21 y1(t) + f2(t)) now being a known function. And so on and so forth.
Remark 3.55
We proceed similarly with an upper triangular matrix instead of a lower triangular one; the system is then solved from bottom to top.
Exercise 3.56 solution page 146
Let
A = [  6   3 ],   T = [ 1 0 ]   and   P = [  2   1 ].
    [ −5  −2 ]        [ 1 3 ]             [ −3  −1 ]
(i) Show that P−1 = [ −1 −1 ]
                    [  3  2 ].
(ii) Show that A = PTP−1.
(iii) Solve X′ = AX + B(t), where B(t) = (0, t²).
3.4 Linear differential equations with constant coefficients
In this section, we consider differential equations of order higher than one but only with constant coefficients, that is, equations of the form
\[
y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1 y' + a_0 y = b(t), \tag{E_n}
\]
where a_0, a_1, …, a_{n−1} ∈ R and b : I → R is a continuous function. This is a linear system of ODEs, as it can be written as
\[
\begin{bmatrix} y \\ y' \\ \vdots \\ y^{(n-1)} \end{bmatrix}'
= \begin{bmatrix}
0 & 1 & 0 & \cdots & 0\\
\vdots & \ddots & \ddots & \ddots & \vdots\\
\vdots & & \ddots & \ddots & 0\\
0 & \cdots & \cdots & 0 & 1\\
-a_0 & -a_1 & -a_2 & \cdots & -a_{n-1}
\end{bmatrix}
\begin{bmatrix} y \\ y' \\ \vdots \\ y^{(n-1)} \end{bmatrix}
+ \begin{bmatrix} 0 \\ \vdots \\ 0 \\ b(t) \end{bmatrix}. \tag{SE_n}
\]
3.4.1 Homogeneous equation
From what we have done before, we know the following:
Proposition 3.57 (structure of the set of solutions)
The solutions to (En) are obtained by adding a particular solution to (En) and a solution to the associated homogeneous equation
\[
y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1 y' + a_0 y = 0. \tag{H_n}
\]
Moreover, the set of solutions to (Hn) is an n-dimensional real vector space (Corollary 3.50).
In theory, we know the solutions to (Hn) from Theorem 3.49 but, in practice, this theorem is hard to use as it requires computing a matrix exponential. In fact, all we need is to find n linearly independent solutions!
The intuition we gathered in the previous sections tells us to look for solutions to (Hn) of the form t ∈ R ↦ e^{rt} for some r ∈ C to be determined. Such a function is a solution if and only if
\begin{align*}
r^n e^{rt} + a_{n-1}r^{n-1}e^{rt} + \cdots + a_1 r e^{rt} + a_0 e^{rt} = 0
&\iff r^n + a_{n-1}r^{n-1} + \cdots + a_1 r + a_0 = 0\\
&\iff P(r) = 0,
\end{align*}
where P is defined as follows.
Definition 3.58 (characteristic polynomial)
The polynomial
\[
P := X^n + a_{n-1}X^{n-1} + \cdots + a_1 X + a_0 \in \mathbb{R}[X]
\]
is called the characteristic polynomial associated with (Hn) (or with (En)).
Let us write P in its factored form \(P = \prod_{j=1}^{d}(X - r_j)^{n_j}\) with distinct complex r_j's and \(\sum_{j=1}^{d} n_j = n\).
If all the roots of P are simple (that is, n_j = 1 for all j), we have found the n solutions t ∈ R ↦ e^{r_j t}, 1 ≤ j ≤ n. If P possesses multiple roots, however, we need to find more solutions.
We denote by D the differential operator, that is, the function D : f 7→ f ′ mapping a differen-tiable function to its derivative. If Q =
∑qi=0 aiX
i is a polynomial, we denote by Q(D) the functionQ(D) : f 7→ ∑q
i=0 aif(i). We have the following crucial property: if Q and R are polynomials, then
Q(D) ◦ R(D) = QR(D). In other words, for any function f differentiable enough, Q(D)(R(D)(f)
)=
(QR)(D)(f). Indeed, if Q =∑q
i=0 aiXi and R =
∑rj=0 bjX
j , we have
Q(D)(R(D)(f)
)= Q(D)
( r∑
j=0
bjf(j)
)
=
q∑
i=0
ai
( r∑
j=0
bjf(j)
)(i)
=
q∑
i=0
r∑
j=0
aibjf(i+j) = (QR)(D)(f) .
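This identity can be sanity-checked on polynomial test functions, for which all derivatives are exact; a small sketch (the particular Q, R and f below are arbitrary choices of ours):

```python
import numpy as np
from numpy.polynomial import Polynomial

def apply_op(Q, f):
    # Q(D)(f) = sum_i a_i f^{(i)}, where D is differentiation.
    total = 0.0 * f
    for i, a in enumerate(Q.coef):
        total = total + a * (f.deriv(i) if i > 0 else f)
    return total

Q = Polynomial([1.0, 2.0, 3.0])        # 1 + 2X + 3X^2
R = Polynomial([0.0, 1.0, 0.0, 4.0])   # X + 4X^3
f = Polynomial([2.0, 0.0, 1.0, 5.0, -3.0])

lhs = apply_op(Q, apply_op(R, f))      # Q(D)(R(D)(f))
rhs = apply_op(Q * R, f)               # (QR)(D)(f)
print(np.allclose((lhs - rhs).coef, 0.0))
```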
With this notation, (Hn) can be rewritten as
\begin{align*}
y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1 y' + a_0 y = 0
&\iff P(D)(y) = 0\\
&\iff \Bigl(\prod_{j'=1}^{d} (D - r_{j'})^{n_{j'}}\Bigr)(y) = 0\\
&\iff \Bigl(\prod_{j' \ne j} (D - r_{j'})^{n_{j'}}\Bigr)\Bigl((D - r_j)^{n_j}(y)\Bigr) = 0\,.
\end{align*}
As, for any k ≥ 1,
\begin{align*}
(D - r_j)\bigl(t^k e^{r_j t}\bigr)
&= \bigl(t^k e^{r_j t}\bigr)' - r_j\, t^k e^{r_j t}\\
&= k t^{k-1} e^{r_j t} + t^k r_j e^{r_j t} - r_j t^k e^{r_j t}\\
&= k t^{k-1} e^{r_j t},
\end{align*}
and \((D - r_j)\bigl(e^{r_j t}\bigr) = 0\), we see that, for 0 ≤ k ≤ n_j − 1,
\[
(D - r_j)^{n_j}\bigl(t^k e^{r_j t}\bigr) = 0\,.
\]
As a result, t ∈ R ↦ t^k e^{r_j t} is a solution to (Hn) as long as 0 ≤ k ≤ n_j − 1. For each 1 ≤ j ≤ d, we thus have n_j solutions. All in all, this adds up to \(\sum_{j=1}^{d} n_j = n\) solutions, as desired.
There is still a caveat, though. Recall that, although P ∈ R[X] has real coefficients, its roots can be nonreal (C is an algebraically closed field but R isn't!). If r_j ∈ C \ R, the function t ↦ e^{r_j t} takes nonreal values and we want real-valued functions. Fortunately, the nonreal roots of a real polynomial come in conjugate pairs, with the same order of multiplicity. Indeed, denoting by \(\bar{\cdot}\) the complex conjugation, for any z ∈ C and k ≥ 0, one has \(\overline{z^k} = \bar{z}^{\,k}\), so that, for any Q ∈ R[X], \(\overline{Q(z)} = Q(\bar z)\), and thus Q(z) = 0 ⟺ Q(\(\bar z\)) = 0. As r is a root of multiplicity d of Q if and only if Q^{(j)}(r) = 0 for 0 ≤ j ≤ d − 1 and Q^{(d)}(r) ≠ 0, one sees that r is a root of multiplicity d of Q if and only if \(\bar r\) is a root of multiplicity d of Q.
Rearranging the roots of P, we denote by r_1, …, r_p its real roots, with multiplicities n_1, …, n_p, and by α_1 ± iβ_1, …, α_q ± iβ_q its nonreal roots, with multiplicities m_1, …, m_q (so that \(\sum_{j=1}^{p} n_j + 2\sum_{j=1}^{q} m_j = n\)).
Let us look at the functions we have for nonreal conjugate roots:
\[
e^{(\alpha_j + i\beta_j)t} = e^{\alpha_j t}\bigl(\cos(\beta_j t) + i\sin(\beta_j t)\bigr)
\quad\text{and}\quad
e^{(\alpha_j - i\beta_j)t} = e^{\alpha_j t}\bigl(\cos(\beta_j t) - i\sin(\beta_j t)\bigr).
\]
We can come back to the real world by considering
\[
\frac{e^{(\alpha_j + i\beta_j)t} + e^{(\alpha_j - i\beta_j)t}}{2} = e^{\alpha_j t}\cos(\beta_j t)
\quad\text{and}\quad
\frac{e^{(\alpha_j + i\beta_j)t} - e^{(\alpha_j - i\beta_j)t}}{2i} = e^{\alpha_j t}\sin(\beta_j t)\,.
\]
Summing up, we have found the following \(\sum_{j=1}^{p} n_j + 2\sum_{j=1}^{q} m_j = n\) real-valued solutions:
• t ∈ R ↦ t^k e^{r_j t}, for 1 ≤ j ≤ p and 0 ≤ k ≤ n_j − 1;
• t ∈ R ↦ t^k e^{α_j t} cos(β_j t), for 1 ≤ j ≤ q and 0 ≤ k ≤ m_j − 1;
• t ∈ R ↦ t^k e^{α_j t} sin(β_j t), for 1 ≤ j ≤ q and 0 ≤ k ≤ m_j − 1.
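In practice, this basis can be recovered mechanically from the roots of P. A short sketch (the cubic used is our own illustrative example, chosen with a double real root; the grouping tolerance absorbs the numerical splitting of multiple roots):

```python
import numpy as np

# Characteristic polynomial of the (illustrative) equation
#   y''' - 3y' + 2y = 0,  i.e.  P = X^3 - 3X + 2 = (X - 1)^2 (X + 2).
roots = np.roots([1, 0, -3, 2])

# Group numerically close roots to recover multiplicities.
groups = []  # list of [root value, multiplicity]
for r in roots:
    for g in groups:
        if abs(r - g[0]) < 1e-4:
            g[1] += 1
            break
    else:
        groups.append([r, 1])

# Name the basis functions t^k e^{rt} (all roots are real in this example).
basis = []
for r, mult in groups:
    for k in range(mult):
        power = "" if k == 0 else ("t " if k == 1 else f"t^{k} ")
        basis.append(f"{power}e^({r.real:g}t)")
print(sorted(basis))
```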
Theorem 3.59
Let P = X^n + a_{n−1}X^{n−1} + ⋯ + a_1X + a_0 ∈ R[X] be the characteristic polynomial associated with
\[
y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1 y' + a_0 y = 0. \tag{H_n}
\]
We denote by r_1, …, r_p its real roots, with multiplicities n_1, …, n_p, and by α_1 ± iβ_1, …, α_q ± iβ_q its nonreal roots, with multiplicities m_1, …, m_q. In other words, we factor P in R[X] as
\[
P = \prod_{j=1}^{p} (X - r_j)^{n_j} \prod_{j=1}^{q} \bigl(X^2 - 2\alpha_j X + (\alpha_j^2 + \beta_j^2)\bigr)^{m_j}
\]
with distinct real numbers r_j ∈ R, and distinct pairs (α_j, β_j) ∈ R × R⋆. The solutions to (Hn) are the functions
\[
t \in \mathbb{R} \mapsto \sum_{j=1}^{p} e^{r_j t} P_j(t) + \sum_{j=1}^{q} e^{\alpha_j t}\bigl(\cos(\beta_j t)\,Q_j(t) + \sin(\beta_j t)\,R_j(t)\bigr),
\]
where P_1, …, P_p, Q_1, …, Q_q, R_1, …, R_q ∈ R[X] are real polynomials whose degrees satisfy
\[
\forall j, \quad \deg(P_j) \le n_j - 1, \quad \deg(Q_j) \le m_j - 1, \quad \deg(R_j) \le m_j - 1\,.
\]
Proof. This is just rewriting the fact that the n functions we found above form a basis of the set of solutions. From the analysis we did, it only remains to see that these solutions are linearly independent.
This can be established by observing that all these functions have different behavior around infinity. More precisely, let us argue by contradiction and suppose that there exist real numbers a_{jk}, b_{jk}, c_{jk}, not all zero, such that
\[
\forall t \in \mathbb{R}, \quad
\sum_{j=1}^{p}\sum_{k=0}^{n_j-1} a_{jk}\, t^k e^{r_j t}
+ \sum_{j=1}^{q}\sum_{k=0}^{m_j-1} b_{jk}\, t^k e^{\alpha_j t}\cos(\beta_j t)
+ \sum_{j=1}^{q}\sum_{k=0}^{m_j-1} c_{jk}\, t^k e^{\alpha_j t}\sin(\beta_j t) = 0\,.
\]
We first consider the largest number among all the r_j's and α_j's that appear in this sum (that is, with at least one a_{jk} ≠ 0 for r_j, or at least one nonzero b_{jk} or c_{jk} for α_j): let us denote it by r. We then consider the highest power function in front of e^{rt}: let us denote it by t^ℓ. We factor t^ℓ e^{rt} out of the previous sum and get, for all t ∈ R⋆,
\[
\sum_{j=1}^{p}\sum_{k=0}^{n_j-1} a_{jk}\, t^{k-\ell} e^{(r_j - r)t}
+ \sum_{j=1}^{q}\sum_{k=0}^{m_j-1} b_{jk}\, t^{k-\ell} e^{(\alpha_j - r)t}\cos(\beta_j t)
+ \sum_{j=1}^{q}\sum_{k=0}^{m_j-1} c_{jk}\, t^{k-\ell} e^{(\alpha_j - r)t}\sin(\beta_j t) = 0\,.
\]
Now when t → ∞, most terms in this sum tend to 0. In fact, if there exists j_1 such that r_{j_1} = r, the first double sum tends to a_{j_1 ℓ}; otherwise, it tends to 0. Similarly, the second double sum behaves like b_{j_2 ℓ} cos(β_{j_2} t) or tends to 0, and the third double sum behaves like c_{j_3 ℓ} sin(β_{j_3} t) or tends to 0. All in all, the whole sum behaves, as t → ∞, like
\[
a + b\cos(\beta t) + c\sin(\beta' t) \tag{3.6}
\]
for some β, β′ ∈ R⋆ and three numbers a, b, c ∈ R that are not all equal to 0 (as, by definition, t^ℓ e^{rt} was present in the original sum). But (3.6) should then tend to 0 as t → ∞, which is clearly impossible for such a nonzero combination. □
Exercise 3.60 solution page 147
Solve y′′′ + y′′ − y′ − y = 0.
3.4.2 Nonhomogeneous equation
From Theorem 3.52, we know that (SEn) admits a unique solution for any given initial condition. This implies that (En) admits a unique solution satisfying any initial condition of the form
\[
\begin{cases}
y(t_0) = x_0\\
y'(t_0) = x_1\\
\quad\vdots\\
y^{(n-1)}(t_0) = x_{n-1},
\end{cases}
\]
where t_0 ∈ I and x_0, x_1, …, x_{n−1} ∈ R.
As above, we will solve (SEn) using variation of constants. Let us denote by {y_1, …, y_n} a basis of solutions to (Hn) (as given for instance by Theorem 3.59). In fact,
\[
\begin{bmatrix} y_1 \\ y_1' \\ \vdots \\ y_1^{(n-1)} \end{bmatrix}, \;\ldots,\;
\begin{bmatrix} y_n \\ y_n' \\ \vdots \\ y_n^{(n-1)} \end{bmatrix}
\]
forms a basis of solutions to
\[
\begin{bmatrix} y \\ y' \\ \vdots \\ y^{(n-1)} \end{bmatrix}'
= \begin{bmatrix}
0 & 1 & 0 & \cdots & 0\\
\vdots & \ddots & \ddots & \ddots & \vdots\\
\vdots & & \ddots & \ddots & 0\\
0 & \cdots & \cdots & 0 & 1\\
-a_0 & -a_1 & -a_2 & \cdots & -a_{n-1}
\end{bmatrix}
\begin{bmatrix} y \\ y' \\ \vdots \\ y^{(n-1)} \end{bmatrix}. \tag{SH_n}
\]
All the solutions to (SHn) are thus of the form
\[
\sum_{j=1}^{n} \lambda_j \begin{bmatrix} y_j \\ y_j' \\ \vdots \\ y_j^{(n-1)} \end{bmatrix},
\qquad \lambda_1, \ldots, \lambda_n \in \mathbb{R}\,.
\]
Let us replace the constants λ_1, …, λ_n by functions λ_1 : I → R, …, λ_n : I → R. We look for a solution to (SEn) in the form
\[
\begin{bmatrix} y_{SE} \\ y_{SE}' \\ \vdots \\ y_{SE}^{(n-1)} \end{bmatrix}
= \sum_{j=1}^{n} \lambda_j(t) \begin{bmatrix} y_j \\ y_j' \\ \vdots \\ y_j^{(n-1)} \end{bmatrix}
\iff
\begin{cases}
\displaystyle y_{SE} = \sum_{j=1}^{n} \lambda_j(t)\, y_j\\[2mm]
\displaystyle y_{SE}' = \sum_{j=1}^{n} \lambda_j(t)\, y_j'\\[1mm]
\quad\vdots\\[1mm]
\displaystyle y_{SE}^{(n-1)} = \sum_{j=1}^{n} \lambda_j(t)\, y_j^{(n-1)}
\end{cases} \tag{3.7}
\]
Differentiating the first line of (3.7) and using the second line gives
\[
y_{SE}' = \sum_{j=1}^{n} \lambda_j'(t)\, y_j + \sum_{j=1}^{n} \lambda_j(t)\, y_j'
= \sum_{j=1}^{n} \lambda_j(t)\, y_j'
\implies \sum_{j=1}^{n} \lambda_j'(t)\, y_j = 0\,.
\]
Conducting the same reasoning with the subsequent lines yields
\[
\forall\, 0 \le k \le n-2, \qquad \sum_{j=1}^{n} \lambda_j'(t)\, y_j^{(k)} = 0\,.
\]
Now, y_{SE} is a solution to (En) if and only if \(y_{SE}^{(n)} + a_{n-1}y_{SE}^{(n-1)} + \cdots + a_1 y_{SE}' + a_0 y_{SE} = b(t)\). Differentiating the last line of (3.7) and using all the lines, we obtain
\begin{align*}
\sum_{j=1}^{n} \lambda_j'(t)\, y_j^{(n-1)} + \sum_{j=1}^{n} \lambda_j(t)\, y_j^{(n)}
+ a_{n-1}\sum_{j=1}^{n} \lambda_j(t)\, y_j^{(n-1)} + \cdots + a_0\sum_{j=1}^{n} \lambda_j(t)\, y_j &= b(t)\\
\sum_{j=1}^{n} \lambda_j'(t)\, y_j^{(n-1)}
+ \sum_{j=1}^{n} \lambda_j(t)\underbrace{\bigl(y_j^{(n)} + a_{n-1}y_j^{(n-1)} + \cdots + a_0 y_j\bigr)}_{0} &= b(t)\\
\sum_{j=1}^{n} \lambda_j'(t)\, y_j^{(n-1)} &= b(t)\,.
\end{align*}
All in all, this boils down to solving, for each t ∈ I, the linear system
\[
\begin{bmatrix}
y_1(t) & \cdots & y_n(t)\\
y_1'(t) & \cdots & y_n'(t)\\
\vdots & & \vdots\\
y_1^{(n-1)}(t) & \cdots & y_n^{(n-1)}(t)
\end{bmatrix}
\begin{bmatrix} \lambda_1'(t) \\ \lambda_2'(t) \\ \vdots \\ \lambda_n'(t) \end{bmatrix}
= \begin{bmatrix} 0 \\ \vdots \\ 0 \\ b(t) \end{bmatrix}
\]
and then integrating the solutions.
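As a sketch of this last step, here is a small numerical check, taking the second order case n = 2 with basis {cos, sin} and an arbitrary right-hand side b(t) = t (both choices are ours, for illustration only):

```python
import numpy as np

# For y'' + y = b(t), a homogeneous basis is {cos, sin}; the system above
# reads W(t) [l1', l2']^T = [0, b(t)]^T with Wronskian matrix
#   W(t) = [[cos t, sin t], [-sin t, cos t]]   (determinant 1).
# Its exact solution is l1' = -b(t) sin t, l2' = b(t) cos t; we verify this
# by solving the linear system numerically at a few sample times.
def b(t):
    return t  # an arbitrary continuous right-hand side

ok = True
for t in np.linspace(0.1, 3.0, 7):
    W = np.array([[np.cos(t), np.sin(t)],
                  [-np.sin(t), np.cos(t)]])
    lam = np.linalg.solve(W, np.array([0.0, b(t)]))
    ok = ok and np.allclose(lam, [-b(t) * np.sin(t), b(t) * np.cos(t)])
print(ok)
```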
3.4.3 Example: second order equation
Theorem 3.59 takes a particularly simple form for a second order homogeneous equation
\[
y'' + 2by' + cy = 0 \tag{H_2}
\]
as one knows explicitly how to factor polynomials of degree 2. The characteristic polynomial is X² + 2bX + c and its roots are found using its reduced discriminant⁴ Δ = b² − c.
Corollary 3.61
Let Δ = b² − c. The solutions to
\[
y'' + 2by' + cy = 0 \tag{H_2}
\]
are as follows:
(i) if Δ > 0, the characteristic polynomial has two distinct real roots r_1 = −b + √Δ and r_2 = −b − √Δ; the solutions are
\[
t \in \mathbb{R} \mapsto \lambda e^{r_1 t} + \mu e^{r_2 t}, \qquad \lambda, \mu \in \mathbb{R}\,;
\]
(ii) if Δ = 0, the characteristic polynomial has one double real root r = −b; the solutions are
\[
t \in \mathbb{R} \mapsto (\lambda t + \mu)e^{rt}, \qquad \lambda, \mu \in \mathbb{R}\,;
\]
(iii) if Δ < 0, the characteristic polynomial has two conjugate complex roots α ± iβ = −b ± i√(−Δ); the solutions are
\[
t \in \mathbb{R} \mapsto e^{\alpha t}\bigl(\lambda\cos(\beta t) + \mu\sin(\beta t)\bigr), \qquad \lambda, \mu \in \mathbb{R}\,.
\]
Example 3.62
1. Solve y'' − y' − 2y = 0. The characteristic polynomial is X² − X − 2 = (X + 1)(X − 2). The solutions are thus t ∈ R ↦ λe^{−t} + μe^{2t}, with λ, μ ∈ R.
2. Solve y'' − 4y' + 4y = 0. The characteristic polynomial is X² − 4X + 4 = (X − 2)². The solutions are thus t ∈ R ↦ (λt + μ)e^{2t}, with λ, μ ∈ R.
3. Solve y'' − 2y' + 5y = 0. The characteristic polynomial is X² − 2X + 5; its roots are 1 ± 2i. The solutions are thus t ∈ R ↦ e^{t}(λcos(2t) + μsin(2t)), with λ, μ ∈ R.
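The third solution can be verified numerically; the derivatives of e^t cos(2t) below are entered in closed form, computed by hand:

```python
import numpy as np

t = np.linspace(-2.0, 2.0, 201)

# y = e^t cos(2t) from case 3, with hand-computed derivatives:
y = np.exp(t) * np.cos(2 * t)
yp = np.exp(t) * (np.cos(2 * t) - 2 * np.sin(2 * t))
ypp = np.exp(t) * (-3 * np.cos(2 * t) - 4 * np.sin(2 * t))

residual = ypp - 2 * yp + 5 * y
print(np.abs(residual).max())
```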
To solve the nonhomogeneous equation
\[
y'' + 2by' + cy = b(t), \tag{E_2}
\]
⁴Recall Remark 1.60.
the variation of constants goes as follows. Let {y_1, y_2} be the basis of solutions of Corollary 3.61 (depending on the case, y_1 : t ↦ e^{r_1 t} and y_2 : t ↦ e^{r_2 t}, or y_1 : t ↦ e^{rt} and y_2 : t ↦ te^{rt}, or y_1 : t ↦ e^{αt}cos(βt) and y_2 : t ↦ e^{αt}sin(βt)). We search for a solution y_{SE} to (E_2) in the form
\[
\begin{cases}
y_{SE} = \lambda(t)y_1 + \mu(t)y_2\\
y_{SE}' = \lambda(t)y_1' + \mu(t)y_2'
\end{cases} \tag{3.8}
\]
where λ : I → R and μ : I → R are functions to be determined. Differentiating the first line of (3.8) and comparing with its second line yields
\[
y_{SE}' = \lambda'(t)y_1 + \mu'(t)y_2 + \lambda(t)y_1' + \mu(t)y_2' = \lambda(t)y_1' + \mu(t)y_2'
\implies \lambda'(t)y_1 + \mu'(t)y_2 = 0,
\]
and \(y_{SE}'' + 2by_{SE}' + cy_{SE} = b(t)\) implies
\[
\lambda'(t)y_1' + \mu'(t)y_2' + \lambda(t)y_1'' + \mu(t)y_2'' + 2b\bigl(\lambda(t)y_1' + \mu(t)y_2'\bigr) + c\bigl(\lambda(t)y_1 + \mu(t)y_2\bigr) = b(t),
\]
so that
\[
\lambda'(t)y_1' + \mu'(t)y_2' = b(t)\,.
\]
Example 3.63
Let us solve, on (−π/2, +π/2),
\[
y'' + y = \frac{1}{\cos(t)}\,.
\]
The solutions to the homogeneous equation y'' + y = 0 are t ∈ (−π/2, +π/2) ↦ λcos(t) + μsin(t), with λ, μ ∈ R. We thus look for a solution y_{SE} : t ∈ (−π/2, +π/2) ↦ λ(t)cos(t) + μ(t)sin(t) satisfying
\[
y_{SE}' = \lambda(t)\cos'(t) + \mu(t)\sin'(t)
\qquad\text{and}\qquad
y_{SE}'' + y_{SE} = \frac{1}{\cos(t)},
\]
so that
\[
\lambda'(t)\cos(t) + \mu'(t)\sin(t) = 0
\qquad\text{and}\qquad
\lambda'(t)\cos'(t) + \mu'(t)\sin'(t) = -\lambda'(t)\sin(t) + \mu'(t)\cos(t) = \frac{1}{\cos(t)}\,.
\]
This yields
\[
\begin{cases}
\lambda'(t)\sin(t)\cos(t) + \mu'(t)\sin^2(t) = 0\\
-\lambda'(t)\sin(t)\cos(t) + \mu'(t)\cos^2(t) = 1
\end{cases}
\implies
\begin{cases}
\mu'(t) = 1\\[1mm]
\lambda'(t) = -\dfrac{\sin(t)}{\cos(t)}
\end{cases},
\]
so, for instance, μ(t) = t and λ(t) = ln(cos(t)) (recall that t ∈ (−π/2, +π/2), so cos(t) > 0). In the end, the solutions are
\[
t \in \Bigl(-\frac{\pi}{2}, +\frac{\pi}{2}\Bigr) \mapsto \bigl(\lambda + \ln(\cos(t))\bigr)\cos(t) + (\mu + t)\sin(t), \qquad \lambda, \mu \in \mathbb{R}\,.
\]
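A quick numerical verification of the particular solution ln(cos t)cos t + t sin t, with its first two derivatives entered in closed form (computed by hand):

```python
import numpy as np

t = np.linspace(-1.4, 1.4, 301)   # inside (-pi/2, pi/2)

y = np.log(np.cos(t)) * np.cos(t) + t * np.sin(t)
yp = -np.sin(t) * np.log(np.cos(t)) + t * np.cos(t)
ypp = (-np.cos(t) * np.log(np.cos(t)) + np.sin(t) ** 2 / np.cos(t)
       + np.cos(t) - t * np.sin(t))

residual = ypp + y - 1.0 / np.cos(t)
print(np.abs(residual).max())
```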
Particular cases
In some particular cases, there is a faster way to find a particular solution to (E_2).
• b is a polynomial function. If b is a degree k polynomial function, look for a polynomial solution of degree k (if c ≠ 0; otherwise (E_2) is a first order linear ODE in y').
• b : t ↦ e^{αt}P_k(t) where α ∈ R⋆ and P_k ∈ R[X] is a degree k polynomial. In this case, look for a solution of the form y_{SE} : t ∈ R ↦ e^{αt}Q(t) where Q ∈ R[X]. We have
\begin{align*}
y_{SE}(t) &= e^{\alpha t}Q(t)\\
y_{SE}'(t) &= e^{\alpha t}\bigl(Q'(t) + \alpha Q(t)\bigr)\\
y_{SE}''(t) &= e^{\alpha t}\bigl(Q''(t) + 2\alpha Q'(t) + \alpha^2 Q(t)\bigr).
\end{align*}
The function y_{SE} is a solution to (E_2) if and only if
\begin{align*}
y_{SE}'' + 2by_{SE}' + cy_{SE} = e^{\alpha t}P_k(t)
&\iff \bigl(Q''(t) + 2\alpha Q'(t) + \alpha^2 Q(t)\bigr) + 2b\bigl(Q'(t) + \alpha Q(t)\bigr) + cQ(t) = P_k(t)\\
&\iff Q''(t) + 2(\alpha + b)Q'(t) + (\alpha^2 + 2b\alpha + c)Q(t) = P_k(t)\,.
\end{align*}
Thus, look for a polynomial of the form X^m Q_k where Q_k is of degree k and
(i) m = 0 if α² + 2bα + c ≠ 0 (that is, α is not a root of the characteristic polynomial);
(ii) m = 1 if α² + 2bα + c = 0 and α + b ≠ 0 (that is, α is a simple root of the characteristic polynomial);
(iii) m = 2 if α² + 2bα + c = 0 and α + b = 0 (that is, α is a double root of the characteristic polynomial).
Remark 3.64
If α is a simple root of the characteristic polynomial, then, for any a ∈ R, the function t ↦ ae^{αt} is already a solution to the homogeneous equation, so one needs to go up one degree. Similarly, if α is a double root of the characteristic polynomial, then, for any a, b ∈ R, the function t ↦ (a + bt)e^{αt} is already a solution to the homogeneous equation, so one needs to go up two degrees.
• b : t ↦ e^{αt}cos(βt)P_k(t) + e^{αt}sin(βt)Q_k(t) where α ∈ R, β ∈ R⋆ and P_k, Q_k ∈ R[X] are polynomials, one of degree k, one of degree k or less. In fact, the previous reasoning works perfectly well for nonreal complex values of α. As a result, look for a solution of the form t ↦ t^m e^{αt}(R_k(t)cos(βt) + S_k(t)sin(βt)), where R_k and S_k are both polynomials of degree k, and m is the multiplicity of α ± iβ as roots of the characteristic polynomial, that is,
(i) m = 0 if α ± iβ are not roots of the characteristic polynomial;
(ii) m = 1 if α ± iβ are simple roots of the characteristic polynomial.
Example 3.65
Let us solve y'' + y = sin(t). The roots of the characteristic polynomial are ±i, so that the solutions to the homogeneous equation are
\[
t \in \mathbb{R} \mapsto \lambda\sin(t) + \mu\cos(t), \qquad \lambda, \mu \in \mathbb{R}\,.
\]
As i is a simple root of the characteristic polynomial, the analysis above tells us to look for a solution of the form z : t ↦ t(a cos(t) + b sin(t)) for some a, b ∈ R. We have
\begin{align*}
z' &= a\cos(t) + b\sin(t) + t\bigl(-a\sin(t) + b\cos(t)\bigr),\\
z'' &= 2\bigl(-a\sin(t) + b\cos(t)\bigr) + t\bigl(-a\cos(t) - b\sin(t)\bigr),
\end{align*}
so that z is a solution if and only if −2a = 1 and b = 0. All the solutions are thus
\[
t \in \mathbb{R} \mapsto \lambda\sin(t) + \mu\cos(t) - \frac{t}{2}\cos(t), \qquad \lambda, \mu \in \mathbb{R}\,.
\]
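A quick numerical verification that −(t/2)cos t is indeed a particular solution, with hand-computed derivatives:

```python
import numpy as np

t = np.linspace(0.0, 10.0, 501)
y = -(t / 2) * np.cos(t)
yp = -0.5 * np.cos(t) + (t / 2) * np.sin(t)
ypp = np.sin(t) + (t / 2) * np.cos(t)

residual = ypp + y - np.sin(t)
print(np.abs(residual).max())
```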
Solutions to the exercises
Solution to Exercise 1.14 page 13
As, whenever x < y, there exist at least one rational number and one irrational number in (x, y), we see that, for any step functions φ ≤ 1_Q ≤ ψ, one has φ ≤ 0 and ψ ≥ 1. As a result, I⁻(1_Q) ≤ 0 and I⁺(1_Q) ≥ 1. In fact, it is easy to see that I⁻(1_Q) = 0 and I⁺(1_Q) = 1 by considering the constant step functions equal to 0 for the lower bound and 1 for the upper bound.
Solution to Exercise 1.15 page 13
We consider the regular subdivision 0 < 1/n < 2/n < … < (n−1)/n < 1 and use the monotonicity of the square function on [0, 1] to see that, for 1 ≤ i ≤ n,
\[
\forall x \in \Bigl[\frac{i-1}{n}, \frac{i}{n}\Bigr], \qquad \Bigl(\frac{i-1}{n}\Bigr)^2 \le x^2 \le \Bigl(\frac{i}{n}\Bigr)^2. \tag{3.9}
\]
We define the step functions φ_n and ψ_n on [0, 1] by φ_n(x) := ((i−1)/n)² and ψ_n(x) := (i/n)² whenever x ∈ [(i−1)/n, i/n), as well as φ_n(1) = ψ_n(1) := 1. From (3.9), we obtain that φ_n ≤ f ≤ ψ_n.
[Figure: the graph of f : x ↦ x² on [0, 1], squeezed between the step functions φ_n (below) and ψ_n (above), drawn for n = 5.]
We have
\begin{align*}
\int_0^1 \varphi_n(x)\,dx &= \sum_{i=1}^{n} \frac{1}{n}\Bigl(\frac{i-1}{n}\Bigr)^2 = \frac{1}{n^3}\sum_{i=0}^{n-1} i^2 = \frac{(n-1)(2n-1)}{6n^2},\\
\int_0^1 \psi_n(x)\,dx &= \sum_{i=1}^{n} \frac{1}{n}\Bigl(\frac{i}{n}\Bigr)^2 = \frac{1}{n^3}\sum_{i=1}^{n} i^2 = \frac{(n+1)(2n+1)}{6n^2}.
\end{align*}
We thus obtain
\[
\frac{(n-1)(2n-1)}{6n^2} = \int_0^1 \varphi_n \le I^-(f) \le I^+(f) \le \int_0^1 \psi_n = \frac{(n+1)(2n+1)}{6n^2}.
\]
Letting n → ∞ yields I⁻(f) = I⁺(f) = 1/3. Thus f is integrable and \(\int_0^1 x^2\,dx = \frac{1}{3}\).
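The squeeze between the two step-function integrals can be observed numerically:

```python
import numpy as np

def bounds(n):
    # Integrals of the step functions phi_n and psi_n defined above.
    i = np.arange(1, n + 1)
    lower = np.sum(((i - 1) / n) ** 2) / n
    upper = np.sum((i / n) ** 2) / n
    return lower, upper

for n in (10, 100, 1000):
    lo, hi = bounds(n)
    print(n, lo, hi)  # both bounds squeeze towards 1/3
```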
Solution to Exercise 1.24 page 17
(i) By linearity,
\[
\int_0^1 P(x)\,dx = \sum_{i=0}^{n} a_i \int_0^1 x^i\,dx = \sum_{i=0}^{n} \frac{a_i}{i+1}\,.
\]
(ii) If P(X) = a_0 + a_1X + a_2X², then \(\int_0^1 P(x)\,dx = a_0 + \frac{a_1}{2} + \frac{a_2}{3}\). For instance, P(X) = 3X² − 1 works.
Solution to Exercise 1.29 page 20
We have
\[
\biggl|\int_1^n \frac{\sin(nx)}{1 + x^n}\,dx\biggr|
\le \int_1^n \biggl|\frac{\sin(nx)}{1 + x^n}\biggr|\,dx
\le \int_1^n \frac{dx}{x^n}
= \frac{n^{-n+1} - 1}{-n+1}
\xrightarrow[n \to \infty]{} 0\,.
\]
Solution to Exercise 1.30 page 20
No. Take for instance the following.
(i) a = 0, b = 2, f = 1_{[0,1]} + 2·1_{[1,2]}. We would get 1 + 2² = (1 + 2)².
(ii) a = 0, b = 2, f ≡ 1. We would get 2 = √2.
(iii) a = −1, b = 1, f = −1_{[−1,0)} + 1_{[0,1]}. We would get 2 = 0.
(iv) a = 0, b = 1, f ≡ 1, g ≡ −1. We would get 0 = 1 + 1.
Solution to Exercise 1.46 page 26
(i) Let F : [−a, a] → R be a primitive of f. We consider φ : x ∈ [−a, a] ↦ F(x) − F(−x). It is differentiable and φ'(x) = f(x) + f(−x) = 0. As a consequence, φ is constantly equal to φ(0) = F(0) − F(0) = 0, so that F is even. Finally,
\[
\int_{-a}^{a} f(x)\,dx = [F]_{-a}^{a} = F(a) - F(-a) = 0\,.
\]
(ii) Let G : [−a, a] → R be a primitive of g. We consider ψ : x ∈ [−a, a] ↦ G(x) + G(−x). It is differentiable and ψ'(x) = g(x) − g(−x) = 0. As a consequence, ψ is constantly equal
to ψ(0) = 2G(0). Finally,
\begin{align*}
\int_{-a}^{a} g(x)\,dx = [G]_{-a}^{a} &= G(a) - G(-a) = G(a) - \bigl(\psi(a) - G(a)\bigr)\\
&= 2G(a) - 2G(0) = 2[G]_0^a = 2\int_0^a g(x)\,dx\,.
\end{align*}
Solution to Exercise 1.51 page 27
We readily recognize a Riemann sum:
\[
S_n = \frac{1}{n}\sum_{k=1}^{n} e^{\frac{k}{n}} \xrightarrow[n \to \infty]{} \int_0^1 e^x\,dx = \bigl[e^x\bigr]_0^1 = e - 1\,.
\]
For S'_n, a slight rewriting is necessary:
\[
S_n' = \frac{1}{n}\sum_{k=1}^{n} \frac{1}{1 + \bigl(\frac{k}{n}\bigr)^2} \xrightarrow[n \to \infty]{} \int_0^1 \frac{1}{1 + x^2}\,dx = \bigl[\arctan(x)\bigr]_0^1 = \arctan(1) - \arctan(0) = \frac{\pi}{4}\,.
\]
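Numerically, both Riemann sums are indeed close to their limits for large n:

```python
import numpy as np

n = 1_000_000
k = np.arange(1, n + 1)

S = np.sum(np.exp(k / n)) / n                 # Riemann sum for e^x on [0, 1]
Sp = np.sum(1.0 / (1.0 + (k / n) ** 2)) / n   # Riemann sum for 1/(1+x^2)
print(S - (np.e - 1), Sp - np.pi / 4)
```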
Solution to Exercise 1.55 page 28
(i) We have
\begin{align*}
\int_1^e x\ln(x)\,dx &= \Bigl[\frac{x^2}{2}\ln(x)\Bigr]_1^e - \int_1^e \frac{x^2}{2}\,\frac{1}{x}\,dx\\
&= \frac{e^2}{2} - \frac{1}{2}\int_1^e x\,dx\\
&= \frac{e^2}{2} - \frac{1}{2}\Bigl[\frac{x^2}{2}\Bigr]_1^e = \frac{e^2}{2} - \frac{e^2}{4} + \frac{1}{4} = \frac{e^2 + 1}{4}\,.
\end{align*}
(ii) For any k ∈ N,
\[
\int_0^x t^k e^t\,dt = \bigl[t^k e^t\bigr]_0^x - k\int_0^x t^{k-1}e^t\,dt = x^k e^x - k\int_0^x t^{k-1}e^t\,dt\,.
\]
Using this for k = 2 then k = 1, we get
\[
\int_0^x t^2 e^t\,dt = x^2 e^x - 2\Bigl(xe^x - \int_0^x e^t\,dt\Bigr) = (x^2 - 2x + 2)\,e^x - 2\,.
\]
In particular, x ↦ (x² − 2x + 2)eˣ is a primitive. The other ones are obtained by adding an arbitrary constant.
Solution to Exercise 1.59 page 30
(i) As indicated, we use the substitution x = sin(t), yielding dx = cos(t) dt, and mapping [0, π/6] onto [0, 1/2]. Thus,
\begin{align*}
\int_0^{\frac{1}{2}} \frac{dx}{(1 - x^2)^{3/2}} &= \int_0^{\frac{\pi}{6}} \frac{\cos(t)\,dt}{(1 - \sin^2(t))^{3/2}} = \int_0^{\frac{\pi}{6}} \frac{\cos(t)\,dt}{(\cos^2(t))^{3/2}}\\
&= \int_0^{\frac{\pi}{6}} \frac{1}{\cos^2(t)}\,dt = \bigl[\tan(t)\bigr]_0^{\frac{\pi}{6}} = \frac{\sqrt{3}}{3}\,.
\end{align*}
(ii) We have x = tan(t) and dx = \(\frac{dt}{\cos^2(t)}\). Hence, recalling that \(\frac{1}{\cos^2(t)} = 1 + \tan^2(t)\),
\begin{align*}
\int_0^{y} \frac{dx}{(1 + x^2)^{3/2}} &= \int_0^{\arctan(y)} \frac{1}{(1 + \tan^2(t))^{3/2}}\,\frac{dt}{\cos^2(t)}\\
&= \int_0^{\arctan(y)} \cos(t)\,dt\\
&= \bigl[\sin(t)\bigr]_0^{\arctan(y)} = \sin(\arctan(y))\,.
\end{align*}
This can be rewritten as
\[
\int_0^{y} \frac{dx}{(1 + x^2)^{3/2}} = \sin(\arctan(y)) = \tan(\arctan(y))\cos(\arctan(y)) = \frac{y}{\sqrt{1 + y^2}}\,.
\]
We used the fact that \(\cos^2(\arctan(y)) = \frac{1}{1 + \tan^2(\arctan(y))} = \frac{1}{1 + y^2}\) and that cos(arctan(y)) ≥ 0 as −π/2 ≤ arctan(y) ≤ π/2.
Solution to Exercise 1.71 page 35
The partial fraction decomposition gives
\[
\text{(i)}\quad \frac{4x+5}{x^2+x-2} = \frac{3}{x-1} + \frac{1}{x+2}
\qquad\qquad
\text{(ii)}\quad \frac{6-x}{x^2-4x+4} = \frac{-1}{x-2} + \frac{4}{(x-2)^2}
\]
so that we obtain for instance the primitives
\[
\text{(i)}\quad x \in \mathbb{R}\setminus\{-2, 1\} \mapsto 3\ln|x-1| + \ln|x+2|
\qquad
\text{(ii)}\quad x \in \mathbb{R}\setminus\{2\} \mapsto -\ln|x-2| - 4(x-2)^{-1}
\]
We chose all 5 constants equal to 0. If one wants all the primitives, one needs to add one constant per interval of the domain of definition (so 3 for the first function and 2 for the second one).
The third one is a bit more involved. It is already in partial fraction decomposition form. We have
\begin{align*}
\int_0^y \frac{2x-3}{x^2-4x+5}\,dx &= \int_0^y \frac{2x-4}{x^2-4x+5}\,dx + \int_0^y \frac{dx}{(x-2)^2+1}\\
&= \bigl[\ln(x^2-4x+5)\bigr]_0^y + \int_{-2}^{y-2} \frac{du}{u^2+1}\\
&= \bigl[\ln(x^2-4x+5)\bigr]_0^y + \bigl[\arctan(u)\bigr]_{-2}^{y-2}\\
&= \ln(y^2-4y+5) + \arctan(y-2) + c
\end{align*}
where c is a constant whose value does not matter (there is no need to compute it).
Solution to Exercise 1.72 page 35
We have
\[
x^2 + x + 1 = \Bigl(x + \frac{1}{2}\Bigr)^2 + \frac{3}{4} = \frac{3}{4}\biggl(\Bigl(\frac{2}{\sqrt{3}}\Bigl(x + \frac{1}{2}\Bigr)\Bigr)^2 + 1\biggr)
\]
so that
\begin{align*}
\text{(i)}\quad \int_0^1 \frac{dx}{x^2+x+1} &= \frac{4}{3}\int_0^1 \frac{dx}{\bigl(\frac{2}{\sqrt 3}(x+\frac12)\bigr)^2 + 1} = \frac{4}{3}\,\frac{\sqrt 3}{2}\int_{\frac{1}{\sqrt 3}}^{\sqrt 3} \frac{dy}{y^2+1}\\
&= \frac{2\sqrt 3}{3}\bigl[\arctan(y)\bigr]_{\frac{1}{\sqrt 3}}^{\sqrt 3} = \frac{2\sqrt 3}{3}\Bigl(\arctan\bigl(\sqrt 3\bigr) - \arctan\bigl(\sqrt 3/3\bigr)\Bigr)
\end{align*}
\begin{align*}
\text{(ii)}\quad \int_0^1 \frac{x\,dx}{x^2+x+1} &= \frac{1}{2}\int_0^1 \frac{2x+1}{x^2+x+1}\,dx - \frac{1}{2}\int_0^1 \frac{dx}{x^2+x+1}\\
&= \frac{1}{2}\bigl[\ln(x^2+x+1)\bigr]_0^1 - \frac{1}{2}\int_0^1 \frac{dx}{x^2+x+1} = \frac{\ln(3)}{2} - \frac{1}{2}\int_0^1 \frac{dx}{x^2+x+1}
\end{align*}
\[
\text{(iii)}\quad \int_0^1 \frac{dx}{(x^2+x+1)^2} = \frac{16}{9}\int_0^1 \frac{dx}{\Bigl(\bigl(\frac{2}{\sqrt 3}(x+\frac12)\bigr)^2 + 1\Bigr)^2} = \frac{16}{9}\,\frac{\sqrt 3}{2}\int_{\frac{1}{\sqrt 3}}^{\sqrt 3} \frac{dy}{(y^2+1)^2}
\]
Then, integrating by parts,
\[
\int_{\frac{1}{\sqrt 3}}^{\sqrt 3} \frac{dy}{y^2+1} = \Bigl[\frac{y}{y^2+1}\Bigr]_{\frac{1}{\sqrt 3}}^{\sqrt 3} - \int_{\frac{1}{\sqrt 3}}^{\sqrt 3} \frac{-2y\cdot y}{(y^2+1)^2}\,dy = 2\int_{\frac{1}{\sqrt 3}}^{\sqrt 3} \frac{y^2+1-1}{(y^2+1)^2}\,dy
\]
(the bracket vanishes, as y/(y²+1) takes the same value √3/4 at both endpoints), so that
\[
\int_{\frac{1}{\sqrt 3}}^{\sqrt 3} \frac{dy}{(y^2+1)^2} = \frac{1}{2}\int_{\frac{1}{\sqrt 3}}^{\sqrt 3} \frac{dy}{y^2+1}
\qquad\text{and}\qquad
\int_0^1 \frac{dx}{(x^2+x+1)^2} = \frac{4\sqrt 3}{9}\Bigl(\arctan\bigl(\sqrt 3\bigr) - \arctan\bigl(\sqrt 3/3\bigr)\Bigr).
\]
\begin{align*}
\text{(iv)}\quad \int_0^1 \frac{x\,dx}{(x^2+x+1)^2} &= \frac{1}{2}\int_0^1 \frac{2x+1}{(x^2+x+1)^2}\,dx - \frac{1}{2}\int_0^1 \frac{dx}{(x^2+x+1)^2}\\
&= \frac{1}{2}\Bigl[\frac{-1}{x^2+x+1}\Bigr]_0^1 - \frac{1}{2}\int_0^1 \frac{dx}{(x^2+x+1)^2} = \frac{1}{3} - \frac{1}{2}\int_0^1 \frac{dx}{(x^2+x+1)^2}
\end{align*}
Solution to Exercise 1.73 page 36
The substitution y = eˣ, dy = eˣ dx, yields
\begin{align*}
\int_1^2 \frac{2e^{2x} - 3e^x + 2}{e^{2x} - e^x}\,dx &= \int_e^{e^2} \frac{2y^2 - 3y + 2}{y\,(y^2 - y)}\,dy\\
&= \int_e^{e^2} \Bigl(\frac{1}{y-1} - \frac{2}{y^2} + \frac{1}{y}\Bigr)dy\\
&= \Bigl[\ln(y-1) + \frac{2}{y} + \ln(y)\Bigr]_e^{e^2}\\
&= \ln\Bigl(\frac{e^2-1}{e-1}\Bigr) + \frac{2}{e^2} - \frac{2}{e} + \ln\Bigl(\frac{e^2}{e}\Bigr)\\
&= \ln(e+1) + \frac{2}{e^2} - \frac{2}{e} + 1\,.
\end{align*}
Solution to Exercise 1.77 page 38
(i)
\begin{align*}
\int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} \sin^2(x)\cos^3(x)\,dx &= \int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} \sin^2(x)\bigl(1 - \sin^2(x)\bigr)\cos(x)\,dx\\
&= \int_{-1}^{1} y^2(1 - y^2)\,dy = \Bigl[\frac{y^3}{3} - \frac{y^5}{5}\Bigr]_{-1}^{1} = \frac{4}{15}
\end{align*}
(ii) Ok, this was vicious. If you use the half-angle tangent, you get to integrate \(\frac{(1-t^2)^4}{(1+t^2)^5}\); good luck with that! Rather, use power-reduction for cos⁴, that is,
\[
\cos^4(x) = \Bigl(\frac{e^{ix} + e^{-ix}}{2}\Bigr)^4 = \frac{e^{4ix} + 4e^{2ix} + 6 + 4e^{-2ix} + e^{-4ix}}{2^4} = \frac{\cos(4x) + 4\cos(2x) + 3}{8}
\]
and obtain
\[
\int_0^{\frac{\pi}{2}} \cos^4(x)\,dx = \frac{1}{8}\Bigl[\frac{\sin(4x)}{4} + 2\sin(2x) + 3x\Bigr]_0^{\frac{\pi}{2}} = \frac{3\pi}{16}\,.
\]
(iii) From the material of Section 1.3.3, we infer that the half-angle tangent substitution is appropriate. There is however some care needed, as π ∈ [0, 2π]. (Overlooking this should bother you, as it would result in integrating from 0 to 0; this may happen with some substitutions, but not with the half-angle tangent substitution because it is a bijective substitution: the image of an interval of positive length cannot be a single point under a bijection.) Even though the integral is a priori not an improper integral (these are addressed in the next section), because the integrand is well defined and continuous on [0, 2π], the half-angle tangent substitution will make it an improper integral. A first possibility is to "avoid" the point π by use of Chasles's identity and continuity of primitives:
\[
\int_0^{2\pi} \frac{dx}{2 + \sin(x)} = \lim_{x \to \pi^-} \int_0^x \frac{du}{2 + \sin(u)} + \lim_{y \to \pi^+} \int_y^{2\pi} \frac{du}{2 + \sin(u)}\,.
\]
We may now use the half-angle tangent substitution on [0, x] with 0 < x < π fixed:
\begin{align*}
\int_0^x \frac{du}{2 + \sin(u)} &= \int_0^{\tan(\frac{x}{2})} \frac{1}{2 + \frac{2t}{1+t^2}}\,\frac{2\,dt}{1+t^2} = \int_0^{\tan(\frac{x}{2})} \frac{dt}{t^2 + t + 1}\\
&= \Bigl[\frac{2}{\sqrt 3}\arctan\Bigl(\frac{2t+1}{\sqrt 3}\Bigr)\Bigr]_0^{\tan(\frac{x}{2})}
\xrightarrow[x \to \pi^-]{} \frac{2}{\sqrt 3}\Bigl(\frac{\pi}{2} - \frac{\pi}{6}\Bigr) = \frac{2\pi}{3\sqrt 3}\,.
\end{align*}
Similarly, for π < y < 2π,
\[
\int_y^{2\pi} \frac{du}{2 + \sin(u)} = \Bigl[\frac{2}{\sqrt 3}\arctan\Bigl(\frac{2t+1}{\sqrt 3}\Bigr)\Bigr]_{\tan(\frac{y}{2})}^{0}
\xrightarrow[y \to \pi^+]{} \frac{2}{\sqrt 3}\Bigl(\frac{\pi}{6} + \frac{\pi}{2}\Bigr) = \frac{4\pi}{3\sqrt 3}\,.
\]
All in all,
\[
\int_0^{2\pi} \frac{dx}{2 + \sin(x)} = \frac{2\pi}{3\sqrt 3} + \frac{4\pi}{3\sqrt 3} = \frac{2\pi}{\sqrt 3}\,.
\]
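This integral can be cross-checked against the classical closed form ∫₀^{2π} dx/(a + b sin x) = 2π/√(a² − b²), valid for a > |b|, which gives 2π/√3 here; a quick numerical check:

```python
import numpy as np
from scipy.integrate import quad

val, _ = quad(lambda x: 1.0 / (2.0 + np.sin(x)), 0.0, 2.0 * np.pi)
print(val, 2.0 * np.pi / np.sqrt(3.0))
```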
Solution to Exercise 1.79 page 39
Using the substitution y = √(x+1), which yields y² = x + 1 and 2y dy = dx, we get
\begin{align*}
\int_0^1 \frac{x^2 + 1}{\sqrt{x+1}}\,dx &= 2\int_1^{\sqrt 2} \bigl((y^2-1)^2 + 1\bigr)dy = 2\int_1^{\sqrt 2} \bigl(y^4 - 2y^2 + 2\bigr)dy\\
&= 2\Bigl[\frac{y^5}{5} - 2\frac{y^3}{3} + 2y\Bigr]_1^{\sqrt 2} = \frac{44}{15}\sqrt 2 - \frac{46}{15}\,.
\end{align*}
Solution to Exercise 1.82 page 42
We integrate by parts:
\begin{align*}
\int_0^x \lambda t\, e^{-\lambda t}\,dt &= \bigl[-t\,e^{-\lambda t}\bigr]_0^x + \int_0^x e^{-\lambda t}\,dt = -xe^{-\lambda x} + \int_0^x e^{-\lambda t}\,dt\\
&= -xe^{-\lambda x} - \frac{1}{\lambda}\bigl[e^{-\lambda t}\bigr]_0^x = -xe^{-\lambda x} - \frac{1}{\lambda}\bigl(e^{-\lambda x} - 1\bigr) \xrightarrow[x \to \infty]{} \frac{1}{\lambda}\,.
\end{align*}
The integral is thus convergent and \(\int_0^{+\infty} \lambda t\, e^{-\lambda t}\,dt = \frac{1}{\lambda}\).
Solution to Exercise 1.83 page 42
For n = 0, \(\int_0^x e^{-t}\,dt = \bigl[-e^{-t}\bigr]_0^x = 1 - e^{-x} \to 1\) as x → ∞.
Next, we integrate by parts as in the previous exercise. Let n ≥ 1.
\[
\int_0^x t^n e^{-t}\,dt = \bigl[-t^n e^{-t}\bigr]_0^x + n\int_0^x t^{n-1} e^{-t}\,dt = -x^n e^{-x} + n\int_0^x t^{n-1}e^{-t}\,dt\,.
\]
The convergence of the integral for n − 1 implies the convergence for n, so that, by induction, the integral converges and taking the limit as x → ∞ yields
\[
\int_0^{+\infty} t^n e^{-t}\,dt = n\int_0^{+\infty} t^{n-1}e^{-t}\,dt\,.
\]
By induction, we readily obtain \(\int_0^{+\infty} t^n e^{-t}\,dt = n!\).
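A numerical sanity check of the identity ∫₀^{+∞} tⁿ e^{−t} dt = n! for the first few values of n:

```python
import math
import numpy as np
from scipy.integrate import quad

for n in range(6):
    val, _ = quad(lambda t, n=n: t**n * np.exp(-t), 0.0, np.inf)
    print(n, val, math.factorial(n))
```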
Solution to Exercise 1.102 page 50
(i) Clearly, if α = 1, the integral diverges; let us now suppose that α ≠ 1. The problem is at −∞. We move it to +∞ thanks to the substitution x = −t, with dx = −dt. We have
\[
\int_{-u}^{\pi} \alpha^t\,dt = \int_{-\pi}^{u} \alpha^{-x}\,dx = \int_{-\pi}^{u} e^{-x\ln(\alpha)}\,dx = \Bigl[\frac{e^{-x\ln(\alpha)}}{-\ln(\alpha)}\Bigr]_{-\pi}^{u} = \frac{e^{-u\ln(\alpha)} - e^{\pi\ln(\alpha)}}{-\ln(\alpha)},
\]
which converges to a finite limit as u → ∞ if and only if ln(α) > 0, that is, α > 1.
(ii) As t → ∞, \(t^{-\alpha} - \sin(t^{-\alpha}) \sim \frac{1}{6}t^{-3\alpha}\), which is integrable at +∞ if and only if 3α > 1, that is, α > 1/3.
(iii) As t → ∞, \(1 - \sqrt[3]{1 + t^{-\alpha}} \sim -\frac{1}{3}t^{-\alpha}\), which is integrable at +∞ if and only if α > 1.
Solution to Exercise 1.105 page 52
For t ≥ 1, \(\bigl|\frac{\sin(t)}{t^2}\bigr| \le \frac{1}{t^2}\), which is integrable at +∞. By comparison of nonnegative functions, we conclude that \(t \in [1, +\infty) \mapsto \bigl|\frac{\sin(t)}{t^2}\bigr|\) is integrable.
Solution to Exercise 1.110 page 53
As \(\bigl|\frac{\sin^n(t)}{t^\alpha}\bigr| \le \frac{1}{t^\alpha}\), we see that the integral is absolutely convergent for α > 1. Next,
\begin{align*}
\int_\pi^{N\pi} \Bigl|\frac{\sin^n(t)}{t^\alpha}\Bigr|\,dt &= \sum_{k=2}^{N} \int_{(k-1)\pi}^{k\pi} \frac{|\sin^n(t)|}{t^\alpha}\,dt\\
&\ge \frac{1}{\pi^\alpha}\sum_{k=2}^{N} \frac{1}{k^\alpha}\int_{(k-1)\pi}^{k\pi} |\sin^n(t)|\,dt = \frac{c_n}{\pi^\alpha}\sum_{k=2}^{N} \frac{1}{k^\alpha},
\end{align*}
where \(c_n := \int_0^\pi |\sin^n(t)|\,dt\) is a positive constant depending only on n. For 0 < α ≤ 1, the latter sum tends to +∞ as N → ∞ (see 1.115), so that the integral under study is not absolutely convergent. Moreover, for even n, the integrand is nonnegative, so that the above argument holds without taking the absolute value.
Finally, it remains to see whether the integral is convergent for odd n and 0 < α ≤ 1. Using Abel's criterion with f : t ∈ [1, +∞) ↦ sinⁿ(t) and g : t ∈ [1, +∞) ↦ t^{−α}, we see that it is always convergent. The only point to check is that
\[
x \in [1, +\infty) \mapsto \int_1^x f(t)\,dt
\]
is bounded. This is easily obtained from the power-reduction formula
\[
\sin^{2p+1}(t) = \frac{1}{4^p}\sum_{\ell=0}^{p} (-1)^\ell \binom{2p+1}{p-\ell}\sin\bigl((2\ell+1)t\bigr),
\]
observing that the primitives of the sine functions involved are cosine functions and thus bounded. Of course, the argument does not hold for even values of n: the reason is that the power-reduction formula then contains a constant term, whose primitives are not bounded. Summing up, we obtained the following.
• α > 1: absolutely convergent.
• 0 < α ≤ 1 and n even: not convergent.
• 0 < α ≤ 1 and n odd: convergent but not absolutely convergent.
Solution to Exercise 1.111 page 54
We use the change of variable u = 1/t, i.e. t = 1/u and dt = −du/u²:
\[
\int_x^1 \frac{\sin\bigl(\frac{1}{t}\bigr)}{t}\,dt = \int_{1/x}^{1} u\sin(u)\,\frac{-du}{u^2} = \int_1^{1/x} \frac{\sin(u)}{u}\,du\,.
\]
As x → 0⁺, 1/x → +∞. We recover a known integral, which is convergent but not absolutely convergent.
Solution to Exercise 1.112 page 54
Let us first check that the first integral converges. The problem is at 0. As √t ln(t) → 0 as t → 0⁺, we obtain that, for small t, |ln(t)| ≤ t^{−1/2}. By comparison with Riemann integrals, ln is integrable at 0. As sin(t) ∼ t when t → 0, the same is true for the integrand. Let us denote by A this integral.
Next, observe that
\[
\int_0^x \ln\bigl(\cos(t)\bigr)dt = -\int_{\pi/2}^{\pi/2 - x} \ln\bigl(\cos(\tfrac{\pi}{2} - u)\bigr)du = \int_{\pi/2 - x}^{\pi/2} \ln\bigl(\sin(u)\bigr)du \xrightarrow[x \to \frac{\pi}{2}]{} A,
\]
so that the second integral also converges and is equal to the first one.
As suggested, let us consider the sum:
\begin{align*}
2A &= \int_0^{\frac{\pi}{2}} \ln\bigl(\sin(t)\bigr)dt + \int_0^{\frac{\pi}{2}} \ln\bigl(\cos(t)\bigr)dt = \int_0^{\frac{\pi}{2}} \ln\bigl(\sin(t)\cos(t)\bigr)dt\\
&= \int_0^{\frac{\pi}{2}} \ln\Bigl(\frac{1}{2}\sin(2t)\Bigr)dt = -\frac{\pi}{2}\ln(2) + \int_0^{\frac{\pi}{2}} \ln\bigl(\sin(2t)\bigr)dt\\
&\overset{u = 2t}{=} -\frac{\pi}{2}\ln(2) + \frac{1}{2}\int_0^{\pi} \ln\bigl(\sin(u)\bigr)du = -\frac{\pi}{2}\ln(2) + \frac{1}{2}\Bigl(A + \int_{\frac{\pi}{2}}^{\pi} \ln\bigl(\sin(u)\bigr)du\Bigr)\\
&\overset{v = \pi - u}{=} -\frac{\pi}{2}\ln(2) + \frac{1}{2}\Bigl(A - \int_{\frac{\pi}{2}}^{0} \ln\bigl(\sin(\pi - v)\bigr)dv\Bigr) = -\frac{\pi}{2}\ln(2) + A\,.
\end{align*}
We finally obtain \(A = -\frac{\pi}{2}\ln(2)\).
Solution to Exercise 1.116 page 56
One may for instance proceed as follows. Let f denote the above example. We make it unbounded at −∞ by adding x ↦ f(−x). Next, we make it positive by adding a continuous positive integrable function, for instance x ∈ R ↦ e^{−x²}. In the end, the function x ∈ R ↦ f(x) + f(−x) + e^{−x²} answers the question.
Solution to Exercise 1.117 page 56
The change of variable u = t² yields t = √u, \(dt = \frac{du}{2\sqrt u}\), and
\[
\int_1^x \sin(t^2)\,dt = \int_1^{x^2} \sin(u)\,\frac{du}{2\sqrt u}\,.
\]
We conclude thanks to Abel's criterion (see Exercise 1.110) that the integral converges.
Solution to Exercise 2.17 page 68
The tangent is the line \(L\bigl((t, f(t)),\, \vec\imath + f'(t)\vec\jmath\bigr)\). As t → t₀, we have ‖(t, f(t)) − (t₀, f(t₀))‖ → 0 and \(\|(\vec\imath + f'(t)\vec\jmath) - (\vec\imath + f'(t_0)\vec\jmath)\| = |f'(t) - f'(t_0)| \to 0\), so that the tangent admits the limiting position \(L\bigl((t_0, f(t_0)),\, \vec\imath + f'(t_0)\vec\jmath\bigr)\).
Solution to Exercise 2.18 page 68
The tangent has direction vector \(\vec\imath - \frac{\cos(1/t)}{t^2}\,\vec\jmath\). At times \(a_n := \frac{2}{\pi + 4\pi n}\), this direction vector is \(\vec\imath\), which tends to \(\vec\imath\) as n → ∞. At times \(b_n := \frac{1}{2\pi n}\), the corresponding unit vector is \(\frac{\vec\imath - (2\pi n)^2\vec\jmath}{\sqrt{1 + (2\pi n)^4}} \to -\vec\jmath\) as n → ∞. We have thus found two sequences of times tending to 0 along which the unit direction vectors cannot converge to the same limit.
Solution to Exercise 2.32 page 77
The vector
\[
\frac{\vec v(t)}{\|\vec v(t)\|} = \frac{t}{\sqrt{t^2 + \sin^2(t)}}\,\vec\imath + \frac{\sin(t)}{\sqrt{t^2 + \sin^2(t)}}\,\vec\jmath
\]
tends to \(\vec\imath\) as t → ∞, so that the parametric curve admits the ray \(R(O, \vec\imath)\) as asymptotic direction. By Proposition 2.31, if it admits an asymptote, then it has to be horizontal. But the distance to the line of equation y = a is |a − sin(t)|, which does not tend to 0 for any a ∈ R.
Solution to Exercise 2.33 page 77
The vector
\[
\frac{\vec v(t)}{\|\vec v(t)\|} = \sin(t)\,\vec\imath + \cos(t)\,\vec\jmath
\]
clearly does not admit a limit as t → ∞ (you can for instance find a subsequence along which it is equal to \(\vec\imath\) and one along which it is equal to \(\vec\jmath\)).
Solution to Exercise 2.38 page 84
Let us find the times at which the location is (−8/3, −4/3). According to the sketch, we should find two times, one in (−∞, −1) and one in (−1, 1).
The sketch shows that the abscissa will be reached three times, whereas the ordinate will be reached only twice, at the desired times. It is thus a priori a better idea to find the times at which the ordinate −4/3 is reached and then check that at these times the abscissa is −8/3 as desired.
Let us thus solve
\[
\frac{t(3t-2)}{3(t-1)} = -\frac{4}{3} \iff 3t^2 + 2t - 4 = 0 \iff t = \frac{-1 \pm \sqrt{13}}{3}\,.
\]
Let us next check the abscissa at these times:
\[
x\Bigl(\frac{-1 \pm \sqrt{13}}{3}\Bigr) = -\frac{8}{3}\,.
\]
As a result, \(M\bigl(\frac{-1 \pm \sqrt{13}}{3}\bigr) = \bigl(-\frac{8}{3}, -\frac{4}{3}\bigr)\) is the point of intersection of the two parametric curves \(((-\infty, -1), \vec v)\) and \(((-1, 1), \vec v)\).
Note that solving \(\frac{t^3}{t^2 - 1} = -\frac{8}{3}\) directly is quite involved, as it is a third degree equation, which you are not supposed to know how to solve…
Solution to Exercise 2.41 page 86
The polar angle is given by
\[
\theta = \begin{cases}
\frac{\pi}{2} & \text{if } x = 0 \text{ and } y > 0\\
0 & \text{if } x = 0 \text{ and } y = 0\\
\frac{3\pi}{2} & \text{if } x = 0 \text{ and } y < 0\\
\arctan\bigl(\frac{y}{x}\bigr) & \text{if } x > 0 \text{ and } y \ge 0\\
\arctan\bigl(\frac{y}{x}\bigr) + 2\pi & \text{if } x > 0 \text{ and } y < 0\\
\arctan\bigl(\frac{y}{x}\bigr) + \pi & \text{if } x < 0\,.
\end{cases}
\]
Solution to Exercise 2.47 page 89
At time π/3, ρ(π/3) = 0, so that the location is the pole. As ρ only vanishes at time π/3 around that time, the polar curve admits as tangent at π/3 the line \(L(O, \vec u_{\pi/3})\).
We have ρ(π/2) = 1 and ρ'(π/2) = 2, so that, at time π/2, the polar curve admits as tangent the line \(L\bigl((0, 1),\, 2\vec u_{\pi/2} + \vec v_{\pi/2}\bigr)\).
Solution to Exercise 3.6 page 101
First observe that all these ODEs are well defined on R. The first ODE is very particular, as it only involves one derivative of y, namely y'. From your knowledge of usual functions and their derivatives, you may think of the following solutions, defined on R.
\[
\text{(i)}\; y = \frac{x^2}{2} - \cos(x) \qquad \text{(ii)}\; y = e^x \qquad \text{(iii)}\; y = e^{7x} \qquad \text{(iv)}\; y = e^{\sqrt 2\, x}
\]
143
Solutions to the exercises
Solution to Exercise 3.7 page 101
Let \(y : x \in (-c, +\infty) \mapsto \frac{1}{x + c}\). Then, for all x ∈ (−c, +∞),
\[
y'(x) = -\frac{1}{(x+c)^2} = -y(x)^2,
\]
so that y' = −y² on (−c, +∞), as desired.
Solution to Exercise 3.21 page 107
Both these ODEs are defined on R.
The ODE 2y' − 5y = 0 can be rewritten as \(y' = \frac{5}{2}y\), so that its solutions are the functions \(x \in \mathbb{R} \mapsto c\,e^{\frac{5}{2}x}\), for any c ∈ R.
The ODE eˣy' + 2y = 0 can be rewritten as y' = −2e^{−x}y (observe that x ∈ R ↦ eˣ does not vanish, so that dividing by eˣ does not cause problems). A primitive of x ∈ R ↦ −2e^{−x} is for instance x ∈ R ↦ 2e^{−x}, so that the solutions to the ODE are the functions \(x \in \mathbb{R} \mapsto c\,e^{2e^{-x}}\), for any c ∈ R.
Solution to Exercise 3.25 page 108
The solutions to the associated homogeneous equation y' = 2xy are the functions \(x \in \mathbb{R} \mapsto c\,e^{x^2}\) for any c ∈ R.
We now need to find a particular solution; we use superposition of solutions in order to do so. Easily enough, we observe that the function y_1 : x ↦ −2 is a solution to y' − 2xy = 4x and that \(y_2 : x \mapsto \frac{1}{x}\) is a solution to \(y' - 2xy = -\frac{1}{x^2} - 2\) on each interval of R⋆. Summing up, the solutions to the ODE are the functions
\[
x \mapsto c\,e^{x^2} - 2 + \frac{1}{x},
\]
for any c ∈ R, on each of the intervals (−∞, 0) and (0, +∞).
Solution to Exercise 3.28 page 110
The solutions to the homogeneous equation \(y' = -2y\) are the functions \(x \in \mathbb{R} \mapsto \lambda\,e^{-2x}\), for any \(\lambda \in \mathbb{R}\). We recognize that the ODE has a particular form (\(y' = \alpha y + P(x)e^{\beta x}\)) for which we know a faster way than to use variation of constants. We know that there exists a particular solution of the form \(z : x \in \mathbb{R} \mapsto (ax^2 + bx + c)\,e^{-x}\), where \(a, b, c \in \mathbb{R}\). We have
\[
\begin{aligned}
z' + 2z - x^2 e^{-x} = 0
&\iff \bigl(2ax + b - (ax^2 + bx + c)\bigr)e^{-x} + 2(ax^2 + bx + c)\,e^{-x} - x^2 e^{-x} = 0\\
&\iff (a-1)x^2 + (2a+b)x + b + c = 0\\
&\iff a = 1,\ b = -2 \text{ and } c = 2.
\end{aligned}
\]
The solutions to \(y' = -2y + x^2 e^{-x}\) are thus the functions
\[
x \in \mathbb{R} \mapsto (x^2 - 2x + 2)\,e^{-x} + \lambda\,e^{-2x}, \quad \text{for any } \lambda \in \mathbb{R}.
\]
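One can double-check the particular solution by computing \(z'\) explicitly and verifying \(z' + 2z = x^2 e^{-x}\) at a few points; this sketch (ours, not part of the original solution) does exactly that.

```python
import math

# Check the particular solution z(x) = (x^2 - 2x + 2) e^{-x} of y' + 2y = x^2 e^{-x}.
# z'(x) = (2x - 2) e^{-x} - (x^2 - 2x + 2) e^{-x}  (product rule).
for x in [-2.0, 0.0, 1.0, 3.0]:
    z = (x**2 - 2*x + 2) * math.exp(-x)
    dz = (2*x - 2) * math.exp(-x) - (x**2 - 2*x + 2) * math.exp(-x)
    assert abs(dz + 2*z - x**2 * math.exp(-x)) < 1e-12
```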
Solution to Exercise 3.29 page 110
The solutions to the associated homogeneous equation \(y' = -y\) are the functions \(x \in \mathbb{R} \mapsto c\,e^{-x}\), for any \(c \in \mathbb{R}\). The ODE has a particular form, for which we know that there exists a particular solution of the form \(z : x \in \mathbb{R} \mapsto a\cos(2x) + b\sin(2x)\), where \(a, b \in \mathbb{R}\). We have
\[
\begin{aligned}
z' + z = \sin(2x)
&\iff -2a\sin(2x) + 2b\cos(2x) + a\cos(2x) + b\sin(2x) = \sin(2x)\\
&\iff -2a + b = 1 \text{ and } 2b + a = 0\\
&\iff a = -\frac{2}{5} \text{ and } b = \frac{1}{5}.
\end{aligned}
\]
The solutions to the ODE are thus
\[
x \in \mathbb{R} \mapsto -\frac{2}{5}\cos(2x) + \frac{1}{5}\sin(2x) + c\,e^{-x}, \quad \text{for any } c \in \mathbb{R}.
\]
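Since the coefficients \(a = -\frac{2}{5}\), \(b = \frac{1}{5}\) come from a small linear system, they are worth a quick check: the sketch below (ours) evaluates \(z' + z\) explicitly and compares it with \(\sin(2x)\).

```python
import math

# Check z(x) = -(2/5) cos(2x) + (1/5) sin(2x) against z' + z = sin(2x).
a, b = -2/5, 1/5
for x in [0.0, 0.7, 2.0]:
    z = a * math.cos(2*x) + b * math.sin(2*x)
    dz = -2*a * math.sin(2*x) + 2*b * math.cos(2*x)  # exact derivative
    assert abs(dz + z - math.sin(2*x)) < 1e-12
```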
Solution to Exercise 3.34 page 111
The solutions are \(y : x \in \mathbb{R} \mapsto c\,2^{-x}\), for any \(c \in \mathbb{R}\). The solution satisfying \(y(1) = \frac{1}{2}\) is the one where \(c = 1\).

[Figure: graphs of the solutions in the \((x, y)\)-plane; the solution with \(c = 1\) passes through the point \((1, \frac{1}{2})\).]
Solution to Exercise 3.43 page 115
We check that
\[
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
=
\begin{bmatrix} 2 & 3 \\ 2 & 6 \end{bmatrix}
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
=
\begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix}.
\]
As
\[
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
=
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}^{-1},
\]
we can use Corollary 3.42 and Proposition 3.40 in order to deduce
\[
e^{tA} =
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} e^{2t} & 0 \\ 0 & e^{3t} \end{bmatrix}
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
=
\begin{bmatrix} e^{2t} & e^{3t} \\ e^{2t} & 2e^{3t} \end{bmatrix}
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
=
\begin{bmatrix} 2e^{2t} - e^{3t} & -e^{2t} + e^{3t} \\ 2e^{2t} - 2e^{3t} & -e^{2t} + 2e^{3t} \end{bmatrix}.
\]
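The closed form of \(e^{tA}\) can be confronted with the defining power series \(\sum_n \frac{(tA)^n}{n!}\) truncated at a high order. The sketch below (ours; helper names are not from the course) does this for \(A = \begin{bsmallmatrix} 1 & 1 \\ -2 & 4 \end{bsmallmatrix}\) at \(t = 0.5\).

```python
import math

def matmul(M, N):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm_series(M, terms=40):
    """Truncated power series for the matrix exponential of a 2x2 matrix."""
    result = [[1.0, 0.0], [0.0, 1.0]]  # identity, the n = 0 term
    power = [[1.0, 0.0], [0.0, 1.0]]
    fact = 1.0
    for n in range(1, terms):
        power = matmul(power, M)
        fact *= n
        result = [[result[i][j] + power[i][j] / fact for j in range(2)]
                  for i in range(2)]
    return result

t = 0.5
A = [[1.0, 1.0], [-2.0, 4.0]]
tA = [[t * a for a in row] for row in A]
series = expm_series(tA)
e2, e3 = math.exp(2 * t), math.exp(3 * t)
closed = [[2*e2 - e3, -e2 + e3], [2*e2 - 2*e3, -e2 + 2*e3]]
for i in range(2):
    for j in range(2):
        assert abs(series[i][j] - closed[i][j]) < 1e-9
```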
Solution to Exercise 3.45 page 115
We have
\[
A + B = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix},
\]
and, by a direct computation, \((A+B)^n = A+B\) for each \(n \geq 1\). Using the definition, we get
\[
e^{A+B} = \begin{bmatrix} e & e-1 \\ 0 & 1 \end{bmatrix}.
\]
On the other hand,
\[
e^{A} = \begin{bmatrix} e^1 & 0 \\ 0 & e^0 \end{bmatrix}
= \begin{bmatrix} e & 0 \\ 0 & 1 \end{bmatrix}
\quad\text{and}\quad
e^{B} = I + B + 0_2 + 0_2 + \cdots = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix},
\]
so that
\[
e^{A} e^{B} = \begin{bmatrix} e & e \\ 0 & 1 \end{bmatrix} \neq e^{A+B}.
\]
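The same computation can be redone numerically from the series definition; this sketch (ours) assumes, as the displayed values suggest, that \(A = \begin{bsmallmatrix} 1 & 0 \\ 0 & 0 \end{bsmallmatrix}\) and \(B = \begin{bsmallmatrix} 0 & 1 \\ 0 & 0 \end{bsmallmatrix}\), which indeed do not commute.

```python
import math

def matmul(M, N):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm_series(M, terms=40):
    """Truncated power series for the matrix exponential of a 2x2 matrix."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]
    fact = 1.0
    for n in range(1, terms):
        power = matmul(power, M)
        fact *= n
        result = [[result[i][j] + power[i][j] / fact for j in range(2)]
                  for i in range(2)]
    return result

A = [[1.0, 0.0], [0.0, 0.0]]
B = [[0.0, 1.0], [0.0, 0.0]]
AB = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]
eA_eB = matmul(expm_series(A), expm_series(B))
eAB = expm_series(AB)
# e^A e^B = [[e, e], [0, 1]] while e^{A+B} = [[e, e-1], [0, 1]]:
assert abs(eA_eB[0][1] - math.e) < 1e-9
assert abs(eAB[0][1] - (math.e - 1)) < 1e-9
```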
Solution to Exercise 3.51 page 118
The matrix associated with the system is the matrix \(A\) from Exercise 3.43. The solutions are thus given by
\[
\left\{\, t \mapsto e^{tA} \begin{bmatrix} \alpha \\ \beta \end{bmatrix}
= \begin{bmatrix} 2e^{2t} - e^{3t} & -e^{2t} + e^{3t} \\ 2e^{2t} - 2e^{3t} & -e^{2t} + 2e^{3t} \end{bmatrix}
\begin{bmatrix} \alpha \\ \beta \end{bmatrix},\ \alpha, \beta \in \mathbb{R} \,\right\},
\]
that is,
\[
\begin{cases}
y_1 : t \in \mathbb{R} \mapsto (2e^{2t} - e^{3t})\,\alpha + (-e^{2t} + e^{3t})\,\beta\\
y_2 : t \in \mathbb{R} \mapsto (2e^{2t} - 2e^{3t})\,\alpha + (-e^{2t} + 2e^{3t})\,\beta
\end{cases}, \quad \alpha, \beta \in \mathbb{R}.
\]
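A finite-difference check confirms that these functions solve the system \(y_1' = y_1 + y_2\), \(y_2' = -2y_1 + 4y_2\) corresponding to \(A = \begin{bsmallmatrix} 1 & 1 \\ -2 & 4 \end{bsmallmatrix}\) from Exercise 3.43. This is a sketch of ours; \(\alpha = 1\), \(\beta = -2\) are arbitrary choices.

```python
import math

# Finite-difference check that (y1, y2) solves X' = AX with A = [[1, 1], [-2, 4]].
alpha, beta = 1.0, -2.0
y1 = lambda t: (2*math.exp(2*t) - math.exp(3*t)) * alpha \
             + (-math.exp(2*t) + math.exp(3*t)) * beta
y2 = lambda t: (2*math.exp(2*t) - 2*math.exp(3*t)) * alpha \
             + (-math.exp(2*t) + 2*math.exp(3*t)) * beta
h = 1e-6
for t in [0.0, 0.3, 1.0]:
    d1 = (y1(t + h) - y1(t - h)) / (2 * h)  # central differences
    d2 = (y2(t + h) - y2(t - h)) / (2 * h)
    assert abs(d1 - (y1(t) + y2(t))) < 1e-3
    assert abs(d2 - (-2*y1(t) + 4*y2(t))) < 1e-3
```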
Solution to Exercise 3.56 page 123
(i) The fastest way to invert a \(2 \times 2\) matrix is to use the formula \(P^{-1} = \frac{\operatorname{adj}(P)}{\det(P)}\). In our case, \(\det(P) = 1\), so that
\[
P^{-1} = \begin{bmatrix} -1 & -1 \\ 3 & 2 \end{bmatrix}.
\]
(ii) We have
\[
P T P^{-1} =
\begin{bmatrix} 2 & 1 \\ -3 & -1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 1 & 3 \end{bmatrix}
\begin{bmatrix} -1 & -1 \\ 3 & 2 \end{bmatrix}
=
\begin{bmatrix} 3 & 3 \\ -4 & -3 \end{bmatrix}
\begin{bmatrix} -1 & -1 \\ 3 & 2 \end{bmatrix}
=
\begin{bmatrix} 6 & 3 \\ -5 & -2 \end{bmatrix}
= A.
\]
(iii) Setting \(Y := \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = P^{-1}X\), we want to solve
\[
X' = AX + \begin{bmatrix} 0 \\ t^2 \end{bmatrix}
\iff
\begin{bmatrix} y_1' \\ y_2' \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ 1 & 3 \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}
+
\begin{bmatrix} -1 & -1 \\ 3 & 2 \end{bmatrix}
\begin{bmatrix} 0 \\ t^2 \end{bmatrix},
\]
that is,
\[
\begin{cases}
y_1' = y_1 - t^2\\
y_2' = y_1 + 3y_2 + 2t^2
\end{cases}.
\]
We thus obtain first \(y_1 : t \in \mathbb{R} \mapsto \lambda\,e^{t} + t^2 + 2t + 2\), for any \(\lambda \in \mathbb{R}\). Next, \(y_2\) satisfies
\[
y_2' = 3y_2 + \lambda\,e^{t} + 3t^2 + 2t + 2,
\]
so that \(y_2 : t \in \mathbb{R} \mapsto \mu\,e^{3t} - \frac{\lambda}{2}\,e^{t} - t^2 - \frac{4}{3}t - \frac{10}{9}\), for any \(\mu \in \mathbb{R}\).

Finally, \(X = PY = \begin{bmatrix} 2y_1 + y_2 \\ -3y_1 - y_2 \end{bmatrix}\), that is,
\[
X : t \in \mathbb{R} \mapsto
\begin{bmatrix}
\frac{3}{2}\lambda\,e^{t} + \mu\,e^{3t} + t^2 + \frac{8}{3}t + \frac{26}{9}\\[4pt]
-\frac{5}{2}\lambda\,e^{t} - \mu\,e^{3t} - 2t^2 - \frac{14}{3}t - \frac{44}{9}
\end{bmatrix},
\quad \text{for any } \lambda, \mu \in \mathbb{R}.
\]
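With this many fractions, a direct check of the final answer against the original system \(X' = AX + \begin{bsmallmatrix} 0 \\ t^2 \end{bsmallmatrix}\), \(A = \begin{bsmallmatrix} 6 & 3 \\ -5 & -2 \end{bsmallmatrix}\), is reassuring. This is a sketch of ours; \(\lambda = \mu = 1\) is an arbitrary choice.

```python
import math

# Finite-difference check of the final solution against X' = AX + (0, t^2).
lam, mu = 1.0, 1.0
X1 = lambda t: 1.5*lam*math.exp(t) + mu*math.exp(3*t) + t**2 + (8/3)*t + 26/9
X2 = lambda t: -2.5*lam*math.exp(t) - mu*math.exp(3*t) - 2*t**2 - (14/3)*t - 44/9
h = 1e-6
for t in [0.0, 0.4, 1.0]:
    d1 = (X1(t + h) - X1(t - h)) / (2 * h)  # central differences
    d2 = (X2(t + h) - X2(t - h)) / (2 * h)
    assert abs(d1 - (6*X1(t) + 3*X2(t))) < 1e-3
    assert abs(d2 - (-5*X1(t) - 2*X2(t) + t**2)) < 1e-3
```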
Solution to Exercise 3.60 page 126
The characteristic polynomial is
\[
X^3 - X^2 - X + 1 = (X+1)(X-1)^2.
\]
(The factorization is obtained by noticing that \(1\) and \(-1\) are obvious roots.) The roots are \(-1\), of multiplicity 1, and \(1\), of multiplicity 2. The solutions are thus
\[
x \in \mathbb{R} \mapsto a\,e^{-x} + (b + cx)\,e^{x}, \quad \text{for any } a, b, c \in \mathbb{R}.
\]
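As a quick sanity check (an aside, not part of the original solution), one can expand \((X+1)(X-1)^2\) by multiplying coefficient lists; the expansion gives \(X^3 - X^2 - X + 1\), consistent with roots \(-1\) (simple) and \(1\) (double).

```python
# Expand (X + 1)(X - 1)^2 by multiplying coefficient lists (lowest degree first).

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

x_plus_1 = [1, 1]    # 1 + X
x_minus_1 = [-1, 1]  # -1 + X
product = poly_mul(x_plus_1, poly_mul(x_minus_1, x_minus_1))
assert product == [1, -1, -1, 1]  # 1 - X - X^2 + X^3
```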
Usual notation
N: {1, 2, 3, …}, set of positive integers (p. 7)
Q: set of rational numbers (p. 13)
R: set of real numbers (p. 6)
R⋆: {x ∈ R : x ≠ 0}, set of nonzero real numbers (p. 22)
R+: {x ∈ R : x ≥ 0}, set of nonnegative real numbers (p. 17)
R⋆+: {x ∈ R : x > 0}, set of positive real numbers (p. 22)
R−: {x ∈ R : x ≤ 0}, set of nonpositive real numbers (p. 17)
R⋆−: {x ∈ R : x < 0}, set of negative real numbers (p. 22)
R̄: R ∪ {−∞, +∞} (p. 39)
≡, ≢: functional equality, functional inequality (p. 18)
⌊·⌋, ⌈·⌉: floor function, ceiling function (p. 10)
Sign: sign function, equal to −1 on R⋆−, 0 at 0, and +1 on R⋆+ (p. 91)
ⁿ√x: n-th root of x, that is, x^(1/n) (p. 38)
1_A: indicator function of the set A, equal to 1 on A and 0 outside A (p. 13)
f|_J: restriction of the function f to the set J (p. 14)
C(I): set of continuous functions on I (p. 24)
C^∞: set of smooth functions, that is, admitting derivatives of any order (p. 61)
C^k: set of functions of class C^k, that is, k times differentiable with a continuous k-th derivative (p. 24)
x >→ a, x <→ a: x tends to a from above, from below (p. 39)
I_n: identity matrix of size n (p. 113)
0_n: zero matrix of size n × n (p. 114)
M_n(C): set of n × n complex matrices (p. 113)
M_n(R): set of n × n real matrices (p. 112)
·: scalar product (p. 61)
det: determinant (p. 61)
Card: cardinality of a set (p. 66)
L(A, u⃗): line passing through the point A with direction vector u⃗ (p. 67)
R(A, u⃗): ray starting at the point A with direction vector u⃗ (p. 76)