MAA105 – ANALYSIS
Integration
Plane parametric curves
Ordinary differential equations
2019–2020
Jérémie BETTINELLI
version of January 31, 2020
nsup.org/~bettinel
Contents

1 Riemann theory of integration
  1.1 Introduction
    1.1.1 Motivation
    1.1.2 A first example
  1.2 Riemann integral
    1.2.1 Step functions
    1.2.2 Integrable functions
    1.2.3 First properties of the integral
    1.2.4 Integrals and derivatives
    1.2.5 Riemann sums
    1.2.6 Toolbox
  1.3 Integration of rational functions
    1.3.1 Partial fraction decomposition
    1.3.2 Integrating partial fractions
    1.3.3 Rational functions in other functions
  1.4 Improper integrals
    1.4.1 Definition and first properties
    1.4.2 Nonnegative functions
    1.4.3 Oscillating functions
    1.4.4 Comparison of series with integrals

2 Plane parametric curves
  2.1 Introduction
    2.1.1 Motivation
    2.1.2 Preliminaries
  2.2 First definitions
  2.3 Tangents
    2.3.1 Definition
    2.3.2 Link with derivatives
    2.3.3 Local behavior
  2.4 Sketching
    2.4.1 Interval of study
    2.4.2 Asymptotes
    2.4.3 Sketching plan
  2.5 Polar curves
    2.5.1 Polar coordinates
    2.5.2 Polar curves
    2.5.3 What is the difference with a usual graph?
    2.5.4 Tangents
    2.5.5 Extremities of the interval of study
    2.5.6 Sketching

3 Ordinary differential equations
  3.1 Introduction
    3.1.1 Motivation
    3.1.2 Formal definitions
    3.1.3 Separable differential equations
    3.1.4 Linear ODEs
  3.2 First order linear differential equations
    3.2.1 Homogeneous equation
    3.2.2 Finding a particular solution to y′ = a(x)y + b(x)
    3.2.3 Solution to the nonhomogeneous equation
  3.3 Systems of linear ODEs
    3.3.1 Preliminaries: matrix exponential
    3.3.2 Solution to the homogeneous equation
    3.3.3 Solution to the nonhomogeneous equation
    3.3.4 Method for solving a system of linear ODEs in practice
  3.4 Linear differential equations with constant coefficients
    3.4.1 Homogeneous equation
    3.4.2 Nonhomogeneous equation
    3.4.3 Example: second order equation

Solutions to the exercises
Usual notation
Bibliography
Note

A vertical line in the margin like this indicates parts that will not be covered in class; they are accessible and included here for further reading.
1 Riemann theory of integration

In this chapter, we will define the integral of a function in the sense of Riemann and cover the basic tools for the practical computation of integrals.

For further references about this chapter, you may consult [Tao16], [God05], [Har92].
1.1 Introduction
  1.1.1 Motivation
  1.1.2 A first example
1.2 Riemann integral
  1.2.1 Step functions
  1.2.2 Integrable functions
  1.2.3 First properties of the integral
  1.2.4 Integrals and derivatives
  1.2.5 Riemann sums
  1.2.6 Toolbox
1.3 Integration of rational functions
  1.3.1 Partial fraction decomposition
  1.3.2 Integrating partial fractions
  1.3.3 Rational functions in other functions
1.4 Improper integrals
  1.4.1 Definition and first properties
  1.4.2 Nonnegative functions
  1.4.3 Oscillating functions
  1.4.4 Comparison of series with integrals
1.1 Introduction
1.1.1 Motivation
We are interested in the following two seemingly distinct problems.
Chapter 1. Riemann theory of integration
Geometric problem (area below the graph of a nonnegative function)

Given a nonnegative function f defined at least on an interval [a, b] ⊆ R, compute the area "below the graph of f between a and b," that is, the area of the set

A := {(x, y) ∈ R² : a ≤ x ≤ b and 0 ≤ y ≤ f(x)}.

[Figure: the region A between the x-axis and the graph of f, for a ≤ x ≤ b]
Analytic problem (finding primitives)

Given a function f : I → R, find a differentiable function F : I → R such that F′ = f.

Such a function F is called a primitive or antiderivative of f. In fact, recall that the derivative of a function represents the "growth rate" of the function: more precisely, for a given x and a small h, one has F(x + h) ≈ F(x) + hF′(x), the approximation getting better as h gets smaller. If one wants F′ = f, then one should define F(x + h) from F(x) by adding to it hf(x), which is precisely the area of the following rectangle of width h and height f(x) (we assume f(x) ≥ 0 at this point).
[Figure: rectangle of width h and height f(x) below the graph of f, between x and x + h]
Keeping in mind that h is supposed to become "infinitesimal," the link between these two problems becomes clear. Moreover, in the case where f(x) < 0, the quantity of interest hf(x) is negative; it is the opposite of the area of the following rectangle of width h and height −f(x).
[Figure: rectangle of width h and height −f(x) between the graph of f and the x-axis, between x and x + h]
It thus makes sense to generalize the geometric problem to functions taking negative values as follows.
Geometric problem (signed area below the graph of a function)

Given a function f defined at least on an interval [a, b] ⊆ R, compute the area between the graph of f and the x-axis, counted positively when f is above the x-axis and negatively when it is below, between a and b.

[Figure: the regions between the graph of f and the x-axis on [a, b], positive parts in red, negative parts in blue]

On the picture, the desired result is the area of the red part minus that of the blue part.
1.1.2 A first example
Let us compute the area below the graph of the exponential function f : x ↦ eˣ between 0 and 1.

[Figure: graph of x ↦ eˣ between 0 and 1]
In order to do so, we bound the desired area with sums of "thin" rectangles as follows. We fix n ∈ N and subdivide [0, 1] into n regular intervals by considering the points 0, 1/n, 2/n, …, (n−1)/n, 1. For the lower bound, we use the n rectangles with base [(i−1)/n, i/n] and height f((i−1)/n) = e^((i−1)/n), 1 ≤ i ≤ n, whereas, for the upper bound, we use the n rectangles with base [(i−1)/n, i/n] and height f(i/n) = e^(i/n), 1 ≤ i ≤ n. On the picture below, n = 5.
[Figure: lower and upper rectangle approximations of the area below x ↦ eˣ on [0, 1], with n = 5]
For the lower bound, we obtain
n∑
i=1
1
n× e
i−1n =
1
n
n∑
i=1
(e
1n
)i−1=
1
n
1 −(e
1n
)n
1 − e1n
=1n
e1n − 1
(e− 1
)−−−−→n→∞
e− 1.
The computation for the upper bound is similar:

∑_{i=1}^n (1/n) · e^(i/n) = (1/n) ∑_{i=1}^n (e^(1/n))^i = (1/n) · (e^(1/n) − (e^(1/n))^(n+1)) / (1 − e^(1/n)) = e^(1/n) · (1/n) / (e^(1/n) − 1) · (e − 1) ⟶ e − 1 as n → ∞.
As n grows, the rectangles get thinner and thinner and the approximation gets better and better. As the bounds are valid for every n, we may pass to the limit and obtain that the area is bounded from below and above by the same value: e − 1. It is thus equal to this value.
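This squeeze is easy to reproduce numerically. The following sketch (an illustration in Python, not part of the course material) computes the lower and upper rectangle sums for the regular subdivision into n parts; note that their gap is exactly (e − 1)/n, so both sums converge to e − 1.

```python
import math

def riemann_bounds(n):
    """Lower and upper rectangle sums for x -> e^x on [0, 1],
    using the regular subdivision into n parts."""
    lower = sum(math.exp((i - 1) / n) / n for i in range(1, n + 1))
    upper = sum(math.exp(i / n) / n for i in range(1, n + 1))
    return lower, upper

lower, upper = riemann_bounds(1000)
# lower < e - 1 < upper, and upper - lower = (e - 1)/n.
```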
[Figure: rectangle approximation of the area below x ↦ eˣ on [0, 1], with n = 20]
We will write the above result as

∫_0^1 f(x) dx = ∫_0^1 eˣ dx = e − 1,  or  ∫_0^1 f = e − 1.

The real number ∫_0^1 f(x) dx is read: "the integral of f on [0, 1]." The variable "x" can be replaced by any other variable: for instance, ∫_0^1 e^u du = e − 1.
Warning 1.1

Beware that the notation ∫_0^1 f dx is incorrect and should be avoided. On the same topic, beware not to confuse the number f(x) with the function f. One can for instance consider the integral of sin, not of sin(x), which is not a function.
1.2 Riemann integral
Recall that an interval of R is a subset of the form [a, b], [a, b), [a, +∞), (a, b], (−∞, b], (a, b), (a, +∞), (−∞, b), or (−∞, +∞) = R, where a ≤ b are real numbers¹. Note that a = b only makes sense in the first case, in which case the interval is the singleton [a, a] = {a}. An interval is called
• open if it has either of the forms (a, b), (a, +∞), (−∞, b), or (−∞, +∞);
• closed if it has either of the forms [a, b], [a, +∞), (−∞, b], or (−∞, +∞);
• bounded if it has either of the forms [a, b], [a, b), (a, b], or (a, b);
• a segment if it is closed and bounded, that is, has the form [a, b].
The numbers a, b, or −∞, +∞ appearing in the definition of an interval are called its extremities.
1.2.1 Step functions
We now properly define the functions accounting for the rectangles we used in the previous section.
Definition 1.2 (subdivision)

A subdivision of a segment [a, b] is a finite sequence a = x_0 < x_1 < x_2 < … < x_n = b.

[Figure: points a = x_0 < x_1 < … < x_7 = b marked on a horizontal axis]
Example 1.3 (regular subdivision)

For instance, we used above the regular subdivision of [0, 1] into n parts:

0 < 1/n < 2/n < … < (n−1)/n < 1.

More generally, the regular subdivision of [a, b] into n parts is a = x_0 < x_1 < x_2 < … < x_n = b, where

x_i := a + i (b − a)/n,  0 ≤ i ≤ n.
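In code, producing this subdivision is a one-liner following the formula x_i = a + i(b − a)/n (an illustrative sketch, not part of the text):

```python
def regular_subdivision(a, b, n):
    """The regular subdivision a = x_0 < x_1 < ... < x_n = b of [a, b],
    with x_i = a + i*(b - a)/n."""
    return [a + i * (b - a) / n for i in range(n + 1)]

# For example, the regular subdivision of [0, 1] into 4 parts:
# regular_subdivision(0, 1, 4) -> [0.0, 0.25, 0.5, 0.75, 1.0]
```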
Definition 1.4 (step function)

A function f : [a, b] → R is a step function if there exist a subdivision a = x_0 < x_1 < x_2 < … < x_n = b and real numbers c_1, …, c_n such that, for all i ∈ {1, …, n},

∀x ∈ (x_{i−1}, x_i), f(x) = c_i.
¹Formally, an interval is defined as a subset I ⊆ R such that, whenever x < y < z with x, z ∈ I, then y ∈ I.
In other words, f is constant on every open subinterval (x_{i−1}, x_i) defined by the subdivision. Note that, given a step function, the subdivision appearing in the definition is not uniquely defined. Indeed, one might always get another satisfactory subdivision by adding arbitrary points.
Remark 1.5
The value taken by f at a point of the subdivision is arbitrary: it might be equal to the value taken by f on the previous subinterval, to the value taken by f on the subsequent subinterval, or to any other value! This has no effect in our context of areas of rectangles, as it corresponds to rectangles of null width.
Example 1.6 (floor and ceiling functions)

The floor function and the ceiling function are defined for x ∈ R respectively by

⌊x⌋ := max{k ∈ Z : k ≤ x}  and  ⌈x⌉ := min{k ∈ Z : k ≥ x}.

For a < b, the restrictions of the floor and ceiling functions to [a, b] are step functions on [a, b].
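For readers who like to experiment, Python's standard library exposes exactly these two functions; a quick check against the definitions (note the behavior on negative numbers):

```python
import math

# math.floor gives the greatest integer k <= x, and math.ceil the least
# integer k >= x, matching the definitions above.
assert math.floor(2.7) == 2 and math.ceil(2.7) == 3
assert math.floor(-2.7) == -3 and math.ceil(-2.7) == -2
assert math.floor(5) == 5 and math.ceil(5) == 5
```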
Step functions are designed in such a way that the signed area of the geometric problem is straightforward to compute as a sum of signed areas of rectangles: this is the definition of the integral of a step function.
Definition 1.7 (integral of a step function)

Let f : [a, b] → R be a step function as in Definition 1.4. We define its integral as the real number

∫_a^b f(x) dx := ∑_{i=1}^n (x_i − x_{i−1}) c_i.

[Figure: a step function on the subdivision x_0 < x_1 < … < x_7, taking the value c_i on each (x_{i−1}, x_i)]
Recall that the subdivision appearing in the definition of a step function is not uniquely defined. As Definition 1.7 is based upon such a subdivision, we need to check that the definition does not actually depend on this choice. First, observe that adding a point to a subdivision does not change the value of the definition. Indeed, let us consider three points x < y < z such that f is equal to the constant c on (x, z). Then f is also equal to c on (x, y), as well as on (y, z). As (z − x)c = (z − y)c + (y − x)c, the value of the sum appearing in the definition is not modified upon adding a point. Reiterating the argument, it is not changed upon adding a finite number of points either. Now, let us take two subdivisions a = x_0 < x_1 < x_2 < … < x_n = b and a = y_0 < y_1 < y_2 < … < y_m = b compatible with f. Then the subdivision consisting of the numbers x_0, x_1, x_2, …, x_n, y_0, y_1, y_2, …, y_m arranged in increasing order with duplicates removed is yet another subdivision compatible with f. The latter subdivision is obtained from either of the original subdivisions by adding a finite number of points, so that the sums for the three subdivisions must be equal.
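The sum in Definition 1.7 translates directly into code; the sketch below (illustrative, with a hypothetical helper name) also checks the point just made, namely that refining the subdivision does not change the value:

```python
def step_integral(points, values):
    """Integral of a step function: sum over i of (x_i - x_{i-1}) * c_i.
    `points` is the subdivision x_0 < ... < x_n; `values[i-1]` is the
    constant c_i taken on (x_{i-1}, x_i)."""
    return sum((points[i] - points[i - 1]) * values[i - 1]
               for i in range(1, len(points)))

# Adding the point 0.5 to the subdivision leaves the integral unchanged:
coarse = step_integral([0.0, 1.0, 3.0], [2.0, -1.0])
fine = step_integral([0.0, 0.5, 1.0, 3.0], [2.0, 2.0, -1.0])
# coarse == fine == 1*2 + 2*(-1) == 0.0
```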
Remark 1.8 (segment of null length)

Any function f : [a, a] → R is a step function and ∫_a^a f(x) dx = 0.
1.2.2 Integrable functions
For two functions f , g : I → R, we use the classical notation f ≤ g meaning
∀x ∈ I, f(x) ≤ g(x) .
We also use the notation f ≤ M for a function f : I → R and a real number M to mean
∀x ∈ I, f(x) ≤ M .
The other comparison ≥ is similarly defined. Recall that a real-valued function f is bounded if there exists M ≥ 0 such that −M ≤ f ≤ M.
Warning 1.9 (f < g)

Beware that this notation has to be handled with care. First of all, it is easy to find two functions f and g such that neither f ≤ g nor f ≥ g. Moreover, the notation f < g is even more controversial: does it mean

• f(x) < g(x) for all x;
• or f ≤ g and f ≠ g?

In both cases, f(x) ≤ g(x) for all x. The difference is that, in the first case, f(x) ≠ g(x) for all x, whereas, in the second case, there exists at least one x for which f(x) ≠ g(x). Unfortunately, the answer is not consensual among mathematicians, so it is advised not to use this notation, unless very explicitly mentioning what is meant by it.
We consider from now on a bounded function f : [a, b] → R. (It is not assumed to be continuous, although the pictures might suggest otherwise.) We define the two real numbers

I−(f) := sup{ ∫_a^b φ(x) dx : φ is a step function such that φ ≤ f },

I+(f) := inf{ ∫_a^b φ(x) dx : φ is a step function such that φ ≥ f }.
In words, for I−(f), we consider all the step functions that are below f, compute their integrals, and then take the supremum. The idea is to obtain the whole area below the graph of f. For I+(f), we consider the infimum of the integrals of the step functions that are above f.
[Figure: a function f on [a, b], together with a step function φ ≤ f and a step function φ ≥ f]
Warning 1.10 (we only consider segments)

Beware that, for the time being, we only consider functions defined on a segment. The following results no longer hold without this assumption.
Proposition 1.11
Let f : [a, b] → R be a bounded function. Then I−(f) ≤ I+(f).
Proof. Let ψ be a step function such that ψ ≥ f . Then, for any step function φ ≤ f , one has φ ≤ ψ.
Now, this implies that ∫_a^b φ ≤ ∫_a^b ψ. Indeed, let us consider a subdivision compatible with both φ and ψ, obtained for instance as at the end of Section 1.2.1. As φ ≤ ψ, the value taken by φ on a subinterval is smaller than the value taken by ψ on the same subinterval. The claimed inequality ∫_a^b φ ≤ ∫_a^b ψ immediately follows from the definition.

Taking the supremum over the step functions φ below f, we obtain

I−(f) ≤ ∫_a^b ψ.

As this inequality holds for any step function ψ above f, we may take the infimum over such functions and obtain I−(f) ≤ I+(f), as desired.
The converse inequality does not always hold.
Definition 1.12 (integrable function)

• A bounded function f : [a, b] → R is Riemann-integrable, or simply integrable, if I−(f) = I+(f).

• If f : [a, b] → R is integrable, the integral of f on [a, b] is the number

∫_a^b f = ∫_a^b f(x) dx := I−(f) = I+(f).

• If f : [a, b] → R is integrable, the signed area below the graph of f is defined as ∫_a^b f.
Example 1.13

Step functions are integrable: in this case, the infimum and the supremum are reached at the function itself. Plainly, Definition 1.12 coincides with Definition 1.7 in this case.
Fortunately, many functions are integrable. We will soon see that monotonic functions, as well as (piecewise) continuous functions, are always integrable. To see that not all bounded functions are integrable, solve the exercise below.
Exercise 1.14 (solution page 133)

Show that the indicator function of Q on [0, 1], that is, the function 1_Q : [0, 1] → {0, 1} such that 1_Q(x) = 1 whenever x ∈ Q and 1_Q(x) = 0 otherwise, is not integrable.
You can solve the following exercise using the method we used in Section 1.1.2 for the computation of ∫_0^1 eˣ dx = e − 1.
Exercise 1.15 (solution page 133)

Show that the function f : x ∈ [0, 1] ↦ x² is integrable and compute ∫_0^1 f(x) dx.
Although quite intuitive, the definition of the integral is not very handy in practice. Fortunately, we will see in the subsequent sections various tools that allow many fast computations.
Proposition 1.16 (changing a finite number of values)

Let f : [a, b] → R be a bounded function and g : [a, b] → R be equal to f except at a finite number of points. Then f is integrable if and only if g is integrable. If the functions are integrable, then ∫_a^b f = ∫_a^b g.
Proof. Let φ ≤ f be a step function. By changing the value of φ at a finite number of points, we obtain a step function ψ ≤ g. As ∫_a^b φ = ∫_a^b ψ ≤ I−(g), we obtain by taking the supremum that I−(f) ≤ I−(g). The argument being symmetrical, we obtain I−(f) = I−(g) and, similarly, I+(f) = I+(g). The result follows.
Let us now see that two large classes of functions are integrable, namely monotonic functions and (piecewise) continuous functions. Do not forget that we only consider functions defined on segments.
Theorem 1.17 (monotonic functions)
Let f : [a, b] → R be a monotonic function. Then f is integrable.
Proof. We use the same method as in Section 1.1.2 and Exercise 1.15, without explicitly making the computation. Let us first assume that f is nondecreasing on [a, b]. For n ∈ N, we consider the regular subdivision of [a, b] into n parts a = x_0 < x_1 < x_2 < … < x_n = b, where

x_i := a + i (b − a)/n,  0 ≤ i ≤ n.
By monotonicity, for 1 ≤ i ≤ n,
∀x ∈ [xi−1, xi], f(xi−1) ≤ f(x) ≤ f(xi) .
We define the step functions φ_n and ψ_n on [a, b] by φ_n(x) := f(x_{i−1}) and ψ_n(x) := f(x_i) whenever x ∈ [x_{i−1}, x_i), as well as φ_n(b) = ψ_n(b) := f(b). This implies that φ_n ≤ f ≤ ψ_n.

We have

∑_{i=1}^n (b − a)/n · f(x_{i−1}) = ∫_a^b φ_n(x) dx ≤ I−(f) ≤ I+(f) ≤ ∫_a^b ψ_n(x) dx = ∑_{i=1}^n (b − a)/n · f(x_i),

so that

0 ≤ I+(f) − I−(f) ≤ ∑_{i=1}^n (b − a)/n · f(x_i) − ∑_{i=1}^n (b − a)/n · f(x_{i−1}) = (b − a)/n · (f(b) − f(a)).

Letting n → ∞ yields I−(f) = I+(f), so that f is integrable. If f is nonincreasing, a similar argument holds; we leave the details to the reader.
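The telescoping bound at the heart of this proof can be observed numerically: for a nondecreasing f, the gap between the upper and lower sums on the regular subdivision is exactly (b − a)/n · (f(b) − f(a)). A small Python sketch (an illustration, not from the text):

```python
def monotone_gap(f, a, b, n):
    """Gap between the upper and lower step-function sums of a
    nondecreasing f on the regular subdivision of [a, b] into n parts."""
    xs = [a + i * (b - a) / n for i in range(n + 1)]
    lower = sum((b - a) / n * f(xs[i - 1]) for i in range(1, n + 1))
    upper = sum((b - a) / n * f(xs[i]) for i in range(1, n + 1))
    return upper - lower

# For f(x) = x^2 on [0, 2] with n = 100, the sums telescope and the gap
# equals (b - a)/n * (f(b) - f(a)) = (2/100) * 4 = 0.08, up to rounding.
gap = monotone_gap(lambda x: x * x, 0.0, 2.0, 100)
```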
Theorem 1.18 (continuous functions)
Let f : [a, b] → R be a continuous function. Then f is integrable.
Proof. By Heine’s theorem, f is uniformly continuous on [a, b]. Recall that this means that
∀ε > 0, ∃δ > 0, |y − x| < δ =⇒ |f(y) − f(x)| ≤ ε .
Let us thus arbitrarily fix ε > 0 and choose n ∈ N large enough so that |y − x| ≤ 1/n ⟹ |f(y) − f(x)| ≤ ε. The rest is similar to the previous proof. We consider the regular subdivision of [a, b] into n parts a = x_0 < x_1 < x_2 < … < x_n = b, where x_i := a + i (b − a)/n, 0 ≤ i ≤ n. Then,

∀x ∈ [x_{i−1}, x_i],  min_{[x_{i−1}, x_i]} f ≤ f(x) ≤ max_{[x_{i−1}, x_i]} f.
We define the step functions φ_n and ψ_n on [a, b] by φ_n(x) := min_{[x_{i−1}, x_i]} f and ψ_n(x) := max_{[x_{i−1}, x_i]} f whenever x ∈ [x_{i−1}, x_i), as well as φ_n(b) = ψ_n(b) := f(b).

[Figure: on a subinterval [x_{i−1}, x_i], the step functions φ_n and ψ_n take the values min_{[x_{i−1}, x_i]} f and max_{[x_{i−1}, x_i]} f]
This implies that φ_n ≤ f ≤ ψ_n, so that

∑_{i=1}^n (b − a)/n · min_{[x_{i−1}, x_i]} f = ∫_a^b φ_n(x) dx ≤ I−(f) ≤ I+(f) ≤ ∫_a^b ψ_n(x) dx = ∑_{i=1}^n (b − a)/n · max_{[x_{i−1}, x_i]} f.

As a result,

0 ≤ I+(f) − I−(f) ≤ ∑_{i=1}^n (b − a)/n · (max_{[x_{i−1}, x_i]} f − min_{[x_{i−1}, x_i]} f) ≤ (b − a) ε.
As ε was arbitrary, we obtain I−(f) = I+(f), so that f is integrable.
Recall that we denote by f|_J the restriction of a function f defined on some set I to the subset J ⊆ I.
Definition 1.19 (piecewise continuous function)

A function f : [a, b] → R is piecewise continuous if there is a subdivision a = x_0 < x_1 < x_2 < … < x_N = b such that, for each i ∈ {1, …, N}, f|_{(x_{i−1}, x_i)} is continuous and admits finite one-sided limits at x_{i−1} and at x_i.
[Figure: a piecewise continuous function on [a, b], with jumps at the points of the subdivision]
Note that the values at the points of the subdivision are freely set.
Corollary 1.20 (piecewise continuous functions)

Let f : [a, b] → R be a piecewise continuous function. Then f is integrable.
Proof. Let a = x_0 < x_1 < x_2 < … < x_N = b be a subdivision as in Definition 1.19. We fix ε > 0. Then, for each i ∈ {1, …, N}, f|_{(x_{i−1}, x_i)} can be extended into a continuous function f_i : [x_{i−1}, x_i] → R. (Note that the existence of one-sided limits at the extremities of (x_{i−1}, x_i) is used at this stage.) Using the definition of infimum and supremum, we see that there exist two step functions φ_i, ψ_i : [x_{i−1}, x_i] → R such that φ_i ≤ f_i ≤ ψ_i and

I−(f_i) − ε ≤ ∫_{x_{i−1}}^{x_i} φ_i(x) dx ≤ I−(f_i)  and  I+(f_i) ≤ ∫_{x_{i−1}}^{x_i} ψ_i(x) dx ≤ I+(f_i) + ε.

As f_i is continuous, it is integrable by Theorem 1.18, so that I−(f_i) = I+(f_i). As a result,

0 ≤ ∫_{x_{i−1}}^{x_i} ψ_i(x) dx − ∫_{x_{i−1}}^{x_i} φ_i(x) dx ≤ 2ε.  (1.2)
We define the functions φ, ψ : [a, b] → R by φ(x) := φ_i(x) and ψ(x) := ψ_i(x) whenever x ∈ (x_{i−1}, x_i), 1 ≤ i ≤ N, as well as φ(x_i) = ψ(x_i) := f(x_i), 0 ≤ i ≤ N. Clearly, φ and ψ are step functions (a subdivision can be obtained by taking the points of subdivisions compatible with the φ_i's or ψ_i's) satisfying φ ≤ f ≤ ψ. Moreover, it is straightforward from Definition 1.7 that

∫_a^b φ(x) dx = ∑_{i=1}^N ∫_{x_{i−1}}^{x_i} φ_i(x) dx.

Using (1.2), we finally obtain

0 ≤ I+(f) − I−(f) ≤ ∫_a^b ψ(x) dx − ∫_a^b φ(x) dx ≤ 2Nε.
As ε was arbitrary (and N is fixed), we obtain I−(f) = I+(f), so that f is integrable.
1.2.3 First properties of the integral
Let us start with linearity.
Proposition 1.21 (linearity of the integral)

The following holds.

(i) Let f, g : [a, b] → R be integrable functions. Then f + g : x ∈ [a, b] ↦ f(x) + g(x) is integrable and

∫_a^b (f + g)(x) dx = ∫_a^b f(x) dx + ∫_a^b g(x) dx.

(ii) Let f : [a, b] → R be an integrable function and λ ∈ R. Then λf : x ∈ [a, b] ↦ λ × f(x) is integrable and

∫_a^b (λf)(x) dx = λ ∫_a^b f(x) dx.
Remark 1.22

The above proposition can be condensed into one statement: let f, g : [a, b] → R be integrable functions and λ ∈ R. Then λf + g is integrable and

∫_a^b (λf + g)(x) dx = λ ∫_a^b f(x) dx + ∫_a^b g(x) dx.

The previous statements correspond to the particular cases λ = 1 and g : x ∈ [a, b] ↦ 0.
Proof. Let us consider integrable functions f, g : [a, b] → R and λ ∈ R. We furthermore assume that λ > 0 for the time being. We also fix ε > 0. As f and g are integrable, there exist step functions φ ≤ f ≤ ψ and ϕ ≤ g ≤ ξ such that

∫_a^b f − ε ≤ ∫_a^b φ,  ∫_a^b ψ ≤ ∫_a^b f + ε,  ∫_a^b g − ε ≤ ∫_a^b ϕ,  ∫_a^b ξ ≤ ∫_a^b g + ε.  (1.3)

Now, it is easy to see that λφ + ϕ and λψ + ξ are step functions and satisfy λφ + ϕ ≤ λf + g ≤ λψ + ξ. Moreover, let a = x_0 < x_1 < x_2 < … < x_n = b be a subdivision compatible with φ and ϕ (and thus also with λφ + ϕ) and let us denote by c_i and d_i the constant values taken respectively by φ and by ϕ on (x_{i−1}, x_i). Then

∫_a^b (λφ + ϕ)(x) dx = ∑_{i=1}^n (x_i − x_{i−1})(λc_i + d_i) = λ ∑_{i=1}^n (x_i − x_{i−1}) c_i + ∑_{i=1}^n (x_i − x_{i−1}) d_i = λ ∫_a^b φ(x) dx + ∫_a^b ϕ(x) dx.
By the same argument, ∫_a^b (λψ + ξ) = λ ∫_a^b ψ + ∫_a^b ξ. Summing up,

λ ∫_a^b φ + ∫_a^b ϕ = ∫_a^b (λφ + ϕ) ≤ I−(λf + g) ≤ I+(λf + g) ≤ ∫_a^b (λψ + ξ) = λ ∫_a^b ψ + ∫_a^b ξ.

Adding this up with (1.3) yields

λ ∫_a^b f + ∫_a^b g − λε − ε ≤ I−(λf + g) ≤ I+(λf + g) ≤ λ ∫_a^b f + ∫_a^b g + λε + ε.
As ε is arbitrary, we obtain that I−(λf + g) = I+(λf + g) = λ ∫_a^b f + ∫_a^b g. This implies that λf + g is integrable and that its integral is λ ∫_a^b f + ∫_a^b g. The argument if λ < 0 is similar; one simply needs to reverse some inequalities (for instance, one has λψ + ϕ ≤ λf + g ≤ λφ + ξ instead).
Example 1.23

Using the computations we made above,

∫_0^1 (7x² − eˣ) dx = 7 ∫_0^1 x² dx − ∫_0^1 eˣ dx = 7 · 1/3 − (e − 1) = 10/3 − e.
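As a numerical cross-check of this linearity computation (using a midpoint-sum approximation, a device not introduced in the text):

```python
import math

# Midpoint approximation of the integral of 7x^2 - e^x over [0, 1];
# the exact value computed above is 10/3 - e.
n = 100_000
approx = sum((7 * ((i + 0.5) / n) ** 2 - math.exp((i + 0.5) / n)) / n
             for i in range(n))
# approx is close to 10/3 - e, approximately 0.6151.
```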
Exercise 1.24 (solution page 134)

(i) Admitting that ∫_0^1 xⁿ dx = 1/(n + 1), compute ∫_0^1 P(x) dx, where P(X) = ∑_{i=0}^n a_i Xⁱ.

(ii) Find a degree 2 polynomial P such that ∫_0^1 P(x) dx = 0.
Next, let us see how the integral behaves with inequalities.
Proposition 1.25 (positivity of the integral)

• Let f, g : [a, b] → R be integrable functions such that f ≤ g. Then

∫_a^b f(x) dx ≤ ∫_a^b g(x) dx.

• In particular, if f : [a, b] → R is an integrable function such that f ≥ 0, then ∫_a^b f(x) dx ≥ 0.
Proof. By linearity, the function g − f is integrable. Furthermore, the step function x ∈ [a, b] ↦ 0 is below g − f, so that, by definition of the integral as a supremum over step functions that are below,

0 = ∫_a^b 0 dx ≤ ∫_a^b (g − f)(x) dx = ∫_a^b g(x) dx − ∫_a^b f(x) dx

by linearity.
In the case of continuous functions, the previous result can be significantly strengthened.
Proposition 1.26

Let f : [a, b] → R₊ be a continuous function. Then

∫_a^b f(x) dx = 0 ⟹ f ≡ 0  (that is, ∀x ∈ [a, b], f(x) = 0).
Proof. If f ≢ 0, by continuity, there exists x_0 ∈ (a, b) such that f(x_0) > 0. Then, still by continuity, there exists ε > 0 such that a < x_0 − ε < x_0 + ε < b and, for all x ∈ [x_0 − ε, x_0 + ε], f(x) > (1/2) f(x_0). As f ≥ 0, we obtain that f ≥ (1/2) f(x_0) · 1_{[x_0−ε, x_0+ε]}, the latter function being a step function. Thus, by Proposition 1.25,

∫_a^b f(x) dx ≥ ∫_a^b (1/2) f(x_0) · 1_{[x_0−ε, x_0+ε]}(x) dx = 2ε · (1/2) f(x_0) = ε f(x_0) > 0.
Let us now look at some basic operations.
Proposition 1.27 (product, composition, min, max, absolute value)

Let f, g : [a, b] → R be integrable functions.

(i) Let Φ : R → R be a continuous function. Then the composition Φ ∘ f : x ∈ [a, b] ↦ Φ(f(x)) is integrable.

(ii) The product f × g : x ∈ [a, b] ↦ f(x) × g(x) is integrable.

(iii) The functions min(f, g) : x ∈ [a, b] ↦ min(f(x), g(x)) and max(f, g) : x ∈ [a, b] ↦ max(f(x), g(x)) are integrable.

(iv) The function |f| : x ∈ [a, b] ↦ |f(x)| is integrable and

| ∫_a^b f(x) dx | ≤ ∫_a^b |f(x)| dx.
Warning 1.28

Beware that, in contrast with addition or multiplication by a scalar (a real number), the above operations do not "commute" with the integral. For instance, ∫_a^b fg ≠ (∫_a^b f)(∫_a^b g). As a counterexample, consider for instance on [0, 2] the function f equal to 1 on [0, 1] and 0 elsewhere, and g := 1 − f. Clearly, fg ≡ 0 so that ∫_a^b fg = 0, whereas ∫_a^b f = ∫_a^b g = 1. (The computations are easy, as the three functions under consideration are step functions.)

Note: we use the notation ≡ for functional equality: f ≡ g on I means ∀x ∈ I, f(x) = g(x), and f ≢ g on I means ∃x ∈ I, f(x) ≠ g(x).
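This counterexample can be checked in a few lines (a sketch; the helper below hard-codes the subdivision 0 < 1 < 2, which is compatible with all three step functions involved):

```python
def f(x):
    # 1 on [0, 1], 0 elsewhere on [0, 2]
    return 1.0 if x <= 1 else 0.0

def g(x):
    return 1.0 - f(x)

def step_int(h):
    # Integral of a step function on [0, 2] with subdivision 0 < 1 < 2:
    # width of each piece times the constant value on its interior.
    return 1.0 * h(0.5) + 1.0 * h(1.5)

assert step_int(lambda x: f(x) * g(x)) == 0.0     # integral of fg is 0
assert step_int(f) == 1.0 and step_int(g) == 1.0  # product of integrals is 1
```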
Proof. (i). This proof is a bit more complicated than the others, as it somehow has "two layers" of difficulty. As f is integrable, it is bounded by definition: let M be such that |f| ≤ M. Let us fix ε > 0. By Heine's theorem, Φ is uniformly continuous on [−M, M]: there exists δ > 0 such that

|y − x| < δ ⟹ |Φ(y) − Φ(x)| ≤ ε.  (1.4)

Now, let φ ≤ f ≤ ψ be two step functions such that

∫_a^b f − δε ≤ ∫_a^b φ ≤ ∫_a^b f ≤ ∫_a^b ψ ≤ ∫_a^b f + δε,

and let a = x_0 < x_1 < x_2 < … < x_n = b be a subdivision compatible with both φ and ψ. We denote by c_i and d_i the values taken on (x_{i−1}, x_i) respectively by φ and ψ. Then,

∀x ∈ (x_{i−1}, x_i),  c_i ≤ f(x) ≤ d_i,  and thus  min_{[c_i, d_i]} Φ ≤ Φ ∘ f(x) ≤ max_{[c_i, d_i]} Φ.

We define the step functions ϕ and ξ by ϕ(x) = c′_i := min_{[c_i, d_i]} Φ and ξ(x) = d′_i := max_{[c_i, d_i]} Φ whenever x ∈ (x_{i−1}, x_i), 1 ≤ i ≤ n, as well as ϕ(x_i) = ξ(x_i) := Φ(f(x_i)), 0 ≤ i ≤ n. By construction, ϕ ≤ Φ ∘ f ≤ ξ.

On the one hand, by (1.4), when d_i − c_i < δ, then d′_i − c′_i ≤ ε. On the other hand, we always have d′_i − c′_i ≤ C, where we set C := max_{[−M, M]} Φ − min_{[−M, M]} Φ. We have
∫_a^b ξ − ∫_a^b ϕ = ∑_{i=1}^n (x_i − x_{i−1})(d′_i − c′_i)
  = ∑_{1 ≤ i ≤ n, d_i − c_i < δ} (x_i − x_{i−1})(d′_i − c′_i) + ∑_{1 ≤ i ≤ n, d_i − c_i ≥ δ} (x_i − x_{i−1})(d′_i − c′_i)
  ≤ (b − a) ε + C ∑_{1 ≤ i ≤ n, d_i − c_i ≥ δ} (x_i − x_{i−1}).

Finally, as

2δε ≥ ∫_a^b ψ − ∫_a^b φ = ∑_{i=1}^n (x_i − x_{i−1})(d_i − c_i)
  = ∑_{1 ≤ i ≤ n, d_i − c_i < δ} (x_i − x_{i−1})(d_i − c_i) + ∑_{1 ≤ i ≤ n, d_i − c_i ≥ δ} (x_i − x_{i−1})(d_i − c_i)
  ≥ δ ∑_{1 ≤ i ≤ n, d_i − c_i ≥ δ} (x_i − x_{i−1}),

we see that

∑_{1 ≤ i ≤ n, d_i − c_i ≥ δ} (x_i − x_{i−1}) ≤ 2ε,  so that  0 ≤ I+(Φ ∘ f) − I−(Φ ∘ f) ≤ ∫_a^b ξ − ∫_a^b ϕ ≤ (b − a + 2C) ε.
As ε is arbitrary, we obtain that I+(Φ ◦ f) = I−(Φ ◦ f), so that Φ ◦ f is integrable.
(ii). We fix ε > 0. There exist step functions φ ≤ f ≤ ψ and ϕ ≤ g ≤ ξ such that

0 ≤ ∫_a^b ψ(x) dx − ∫_a^b φ(x) dx ≤ ε  and  0 ≤ ∫_a^b ξ(x) dx − ∫_a^b ϕ(x) dx ≤ ε.

Let a = x_0 < x_1 < x_2 < … < x_n = b be a subdivision compatible with φ, ψ, ϕ and ξ. We denote by c_i, d_i, c′_i and d′_i the values taken on (x_{i−1}, x_i) respectively by φ, ψ, ϕ and ξ. Then, for x, y ∈ (x_{i−1}, x_i),

|fg(y) − fg(x)| = |f(y)g(y) − f(y)g(x) + f(y)g(x) − f(x)g(x)|
  ≤ |f(y)| |g(y) − g(x)| + |g(x)| |f(y) − f(x)|
  ≤ M(d_i − c_i + d′_i − c′_i),

where we denoted by M a bound for f and g. We define the step functions α and β by α(x) := inf_{(x_{i−1}, x_i)} fg and β(x) := sup_{(x_{i−1}, x_i)} fg whenever x ∈ (x_{i−1}, x_i), 1 ≤ i ≤ n, as well as α(x_i) = β(x_i) := fg(x_i), 0 ≤ i ≤ n. By construction, α ≤ fg ≤ β, and by the above bound,

β − α ≤ M(ψ − φ + ξ − ϕ),

so that

0 ≤ I+(fg) − I−(fg) ≤ ∫_a^b β − ∫_a^b α ≤ 2Mε.

As ε is arbitrary, we obtain that I+(fg) = I−(fg), so that fg is integrable.
(iii). We fix ε > 0. There exist step functions φ ≤ f ≤ ψ and ϕ ≤ g ≤ ξ such that

0 ≤ ∫_a^b ψ(x) dx − ∫_a^b φ(x) dx ≤ ε  and  0 ≤ ∫_a^b ξ(x) dx − ∫_a^b ϕ(x) dx ≤ ε.

Then min(φ, ϕ) and min(ψ, ξ) are step functions satisfying min(φ, ϕ) ≤ min(f, g) ≤ min(ψ, ξ), and it is not hard to check that 0 ≤ min(ψ, ξ) − min(φ, ϕ) ≤ ψ − φ + ξ − ϕ. We conclude as usual that min(f, g) is integrable. The proof for max(f, g) is the same.
(iv). Since x ∈ R ↦ |x| is a continuous function, (i) yields that the function |f| is integrable. Moreover, as −|f| ≤ f ≤ |f|, by positivity (Proposition 1.25),

− ∫_a^b |f(x)| dx ≤ ∫_a^b f(x) dx ≤ ∫_a^b |f(x)| dx,

which is a simple rewriting of the desired result.
Exercise 1.29 (solution page 134)

Knowing that ∫_1^n x^{−n} dx = (n^{−n+1} − 1)/(−n + 1), show that ∫_1^n sin(nx)/(1 + xⁿ) dx → 0 as n → ∞.
Exercise 1.30 (solution page 134)

Does it hold that

(i) ∫_a^b f(x)² dx = ( ∫_a^b f(x) dx )² ?

(ii) ∫_a^b √(f(x)) dx = √( ∫_a^b f(x) dx ) ?

(iii) ∫_a^b |f(x)| dx = | ∫_a^b f(x) dx | ?

(iv) ∫_a^b |f(x) + g(x)| dx = | ∫_a^b f(x) dx | + | ∫_a^b g(x) dx | ?
We are now interested in splitting the segment on which the function to integrate is defined. First, if a function is integrable on a segment, then it is also integrable on any subsegment.
Proposition 1.31 (restriction of an integrable function)

Let f : [a, b] → R be an integrable function and [a′, b′] ⊆ [a, b]. Then the restriction f|_{[a′, b′]} : [a′, b′] → R is integrable.
Proof. Let us fix ε > 0. There exist step functions φ, ψ : [a, b] → R such that φ ≤ f ≤ ψ and ∫_a^b ψ − ∫_a^b φ < ε. Then, φ|_{[a′, b′]} and ψ|_{[a′, b′]} are step functions such that φ|_{[a′, b′]} ≤ f|_{[a′, b′]} ≤ ψ|_{[a′, b′]}. Moreover,

0 ≤ I+(f|_{[a′, b′]}) − I−(f|_{[a′, b′]}) ≤ ∫_{a′}^{b′} (ψ − φ) ≤ ∫_a^b (ψ − φ) < ε.

We used the fact that ψ − φ ≥ 0, so that the sum defining ∫_a^b (ψ − φ) contains the sum defining ∫_{a′}^{b′} (ψ − φ) plus some other nonnegative terms (see the proof of (1.5) below for step functions). As ε is arbitrary, we obtain that I+(f|_{[a′, b′]}) = I−(f|_{[a′, b′]}), so that f|_{[a′, b′]} is integrable.
Let f : [a, b] → R be an integrable function and c ∈ (a, b). Then f|_{[a, c]} and f|_{[c, b]} are integrable. We claim that

∫_a^b f(x) dx = ∫_a^c f(x) dx + ∫_c^b f(x) dx.  (1.5)

First, it is quite clear that this holds for step functions. Indeed, if φ : [a, b] → R is a step function, let a = x_0 < x_1 < x_2 < … < x_k = c < x_{k+1} < … < x_n = b be a subdivision compatible with φ that contains c (recall that we can always add arbitrary points to subdivisions). Let us denote by c_i the value taken by φ on (x_{i−1}, x_i). Then

∫_a^c φ(x) dx + ∫_c^b φ(x) dx = ∑_{i=1}^k (x_i − x_{i−1}) c_i + ∑_{i=k+1}^n (x_i − x_{i−1}) c_i = ∑_{i=1}^n (x_i − x_{i−1}) c_i = ∫_a^b φ(x) dx.
Let us come back to a general integrable function f and let us fix ε > 0. There exist step functions φ and ψ such that φ ≤ f ≤ ψ and ∫_a^b ψ − ∫_a^b φ < ε. Then, using the result for the step function φ,

∫_a^b f − ∫_a^c f − ∫_c^b f ≤ ∫_a^b ψ − ∫_a^c φ − ∫_c^b φ = ∫_a^b ψ − ∫_a^b φ < ε.
Similarly, ∫_a^b f − ∫_a^c f − ∫_c^b f ≥ ∫_a^b φ − ∫_a^c ψ − ∫_c^b ψ = ∫_a^b φ − ∫_a^b ψ > −ε. In other words,

| ∫_a^b f(x) dx − ∫_a^c f(x) dx − ∫_c^b f(x) dx | < ε,

and (1.5) follows by letting ε → 0.
Reorganizing the terms in (1.5) yields

∫_a^b f(x) dx − ∫_c^b f(x) dx = ∫_a^c f(x) dx,

which suggests adopting the following convention.
Definition 1.32 (integrating "backwards")

Let f : [a, b] → R be an integrable function. Then

∫_b^a f(x) dx := − ∫_a^b f(x) dx.
Proposition 1.33 (Chasles's identity)

Let I be a segment and f : I → R be an integrable function. Then, for any a, b, c ∈ I,

∫_a^c f(x) dx = ∫_a^b f(x) dx + ∫_b^c f(x) dx.
Proof. We consider all 6 possible arrangements of a, b, c. The result then follows from (1.5) and Definition 1.32.
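Chasles's identity, combined with the "backwards" convention of Definition 1.32, can be tested numerically for any ordering of a, b, c. A sketch using midpoint sums (an approximation device, not part of the text's toolbox):

```python
import math

def integral(f, a, b, n=100_000):
    """Midpoint-sum approximation of the integral of f from a to b,
    applying Definition 1.32 when a > b."""
    if a > b:
        return -integral(f, b, a, n)
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Chasles with b = 3 outside [a, c] = [0, 2]: the backwards integral
# from 3 to 2 compensates, and both sides agree.
lhs = integral(math.sin, 0.0, 2.0)
rhs = integral(math.sin, 0.0, 3.0) + integral(math.sin, 3.0, 2.0)
```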
1.2.4 Integrals and derivatives
Definition 1.34 (primitive)

Let f : I → R be a function defined on an arbitrary interval I. A primitive or antiderivative of f is a differentiable function F : I → R such that F′ = f, that is, for all x ∈ I, F′(x) = f(x).
Warning 1.35 (primitives are never unique)

Beware that, if F is a primitive of f, then the function F + λ : x ↦ F(x) + λ is also a primitive of f, for any constant λ ∈ R. For this reason, we cannot speak of the primitive of f without any further specification.
More precisely, primitives of the same function differ by an additive constant.
Proposition 1.36 h all the primitives of a function g
Let $F_1 : I \to \mathbb{R}$ and $F_2 : I \to \mathbb{R}$ be primitives of the same function defined on an interval $I$. Then there exists a constant $c \in \mathbb{R}$ such that $F_2 = F_1 + c$, that is,
\[ \forall x \in I \,, \quad F_2(x) = F_1(x) + c \,. \]
Proof. By definition, the function $F_2 - F_1$ is differentiable and $(F_2 - F_1)' \equiv 0$. It thus amounts to seeing that a function $F$ with null derivative is constant. This can be done using the mean value theorem: for all $a, b \in I$ with $a < b$, there exists $c \in (a, b)$ such that $F(b) - F(a) = F'(c)(b - a) = 0$.
We may thus speak of the primitive whose value is given at some point, for instance, the primitive that cancels at 0.
It is common to denote any primitive of $f$ by $\int f$ or $\int f(x)\,dx$ (same notation as for integrals, without the extremities of the interval). Beware that this notation is problematic for two reasons. First, in contrast with $\int_a^b f$, which is a number, $\int f$ is a function. Moreover, as mentioned earlier, primitives are not uniquely defined: for instance, one could write $\int \sin = -\cos$ and $\int \sin = -\cos + 17$. From this notation, it would be tempting to deduce that $-\cos = \int \sin = -\cos + 17$, which is obviously wrong. I thus advise against using this notation. As you already know the derivatives of the usual functions, you also already know many primitives.
Example 1.37 h primitives of usual functions g
By $\mathbb{R}^\star_\pm$, we mean either $\mathbb{R}^\star_-$ or $\mathbb{R}^\star_+$. This table is read as follows: the function $x \in D \mapsto \cdot$ given in the second column admits as a primitive the function $x \in D \mapsto \cdot$ given in the third column.

$D$ | function $x \in D \mapsto \cdot$ | a primitive $x \in D \mapsto \cdot$
$\mathbb{R}$ | $x^n$, $n \in \mathbb{Z}_{\ge 0}$ | $\dfrac{x^{n+1}}{n+1}$
$\mathbb{R}^\star_\pm$ | $\dfrac{1}{x}$ | $\ln|x|$
$\mathbb{R}^\star_\pm$ | $x^{-n}$, $n \in \{2,3,4,\dots\}$ | $\dfrac{x^{-n+1}}{-n+1}$
$\mathbb{R}^\star_+$ | $x^\alpha$, $\alpha \in \mathbb{R} \setminus \{-1\}$ | $\dfrac{x^{\alpha+1}}{\alpha+1}$
$\mathbb{R}$ | $e^{\alpha x}$, $\alpha \in \mathbb{R}^\star$ | $\dfrac{1}{\alpha}\,e^{\alpha x}$
$\mathbb{R}$ | $\cos(x)$ | $\sin(x)$
$\mathbb{R}$ | $\sin(x)$ | $-\cos(x)$
$\mathbb{R}$ | $\dfrac{1}{\cos^2(x)} = 1 + \tan^2(x)$ | $\tan(x)$
$\mathbb{R}$ | $\dfrac{1}{\sin^2(x)} = \dfrac{1}{\tan^2(x)} + 1$ | $-\dfrac{1}{\tan(x)}$
$\mathbb{R}$ | $\dfrac{1}{1+x^2}$ | $\arctan(x)$
$(-1,1)$ | $\dfrac{1}{\sqrt{1-x^2}}$ | $\arcsin(x)$
$\mathbb{R}$ | $\cosh(x) := \dfrac{e^x + e^{-x}}{2}$ | $\sinh(x)$
$\mathbb{R}$ | $\sinh(x) := \dfrac{e^x - e^{-x}}{2}$ | $\cosh(x)$
$\mathbb{R}$ | $\dfrac{1}{\cosh^2(x)} = 1 - \tanh^2(x)$ | $\tanh(x)$
$\mathbb{R}$ | $\dfrac{1}{\sinh^2(x)} = \dfrac{1}{\tanh^2(x)} - 1$ | $-\dfrac{1}{\tanh(x)}$
$(-1,1)$ | $\dfrac{1}{1-x^2}$ | $\operatorname{artanh}(x) = \dfrac{1}{2}\ln\left(\dfrac{1+x}{1-x}\right)$
$\mathbb{R}$ | $\dfrac{1}{\sqrt{1+x^2}}$ | $\operatorname{arsinh}(x) = \ln\left(x + \sqrt{x^2+1}\right)$
$(1,+\infty)$ | $\dfrac{1}{\sqrt{x^2-1}}$ | $\operatorname{arcosh}(x) = \ln\left(x + \sqrt{x^2-1}\right)$
Moreover, by linearity of differentiation, we readily obtain the following property.
Proposition 1.38
h Let $F : I \to \mathbb{R}$ be a primitive of $f : I \to \mathbb{R}$, $G : I \to \mathbb{R}$ be a primitive of $g : I \to \mathbb{R}$, and $\lambda \in \mathbb{R}$. Then $\lambda F + G : I \to \mathbb{R}$ is a primitive of $\lambda f + g$.
h Let $F : I \to \mathbb{R}$ be a primitive of $f : I \to \mathbb{R}$ and $\alpha \in \mathbb{R}^\star$. Then $x \in I \mapsto \frac{1}{\alpha} F(\alpha x)$ is a primitive of $x \in I \mapsto f(\alpha x)$.
Proof. One has $(\lambda F + G)' = \lambda F' + G' = \lambda f + g$ and $\frac{d}{dx}\!\left( \frac{1}{\alpha} F(\alpha x) \right) = \frac{\alpha}{\alpha} F'(\alpha x) = f(\alpha x)$.
We now formalize the link between the problems we saw in Section 1.1.1.
Theorem 1.39 h first fundamental theorem of calculus g
Let $f \in C([a,b])$. For each $x \in [a,b]$, set
\[ F(x) := \int_a^x f(y)\,dy \,. \]
Then $F$ is a primitive of $f$. More precisely, $F \in C^1([a,b])$ and
\[ \forall x \in (a,b) \,, \quad F'(x) = f(x) \,, \]
together with the right derivative $F'_+(a) = f(a)$ and the left derivative $F'_-(b) = f(b)$.
Note that this function $F$ is the primitive of $f$ that cancels at $a$.
Warning 1.40 B integration variable B
Notice that we wrote $\int_a^x f(y)\,dy$ in the above statement and not $\int_a^x f(x)\,dx$. The latter does not make any sense as $x$ would be both a constant (the upper extremity of the integration segment) and a variable (the integration variable).
Proof. Assume that $x_0 \in (a,b)$ and let $h$ satisfy $0 < |h| < \min(x_0 - a, b - x_0)$. By Chasles’s identity,
\[ \frac{F(x_0+h) - F(x_0)}{h} = \frac{1}{h} \int_{x_0}^{x_0+h} f(x)\,dx \,. \]
Then, by integrating the constant function equal to $f(x_0)$ either on $[x_0, x_0+h]$ or $[x_0+h, x_0]$ (depending on the sign of $h$) and by linearity,
\[ \frac{F(x_0+h) - F(x_0)}{h} - f(x_0) = \frac{1}{h} \int_{x_0}^{x_0+h} \big( f(x) - f(x_0) \big)\,dx \,, \]
so that
\[ \left| \frac{F(x_0+h) - F(x_0)}{h} - f(x_0) \right| \le \frac{1}{|h|} \int_{x_0-|h|}^{x_0+|h|} \big| f(x) - f(x_0) \big|\,dx \le 2 \sup_{|x-x_0| \le |h|} \big| f(x) - f(x_0) \big| \,, \]
and the latter tends to 0 as $h \to 0$ by continuity of $f$ at $x_0$. This yields that $F$ is differentiable at $x_0$ and that $F'(x_0) = f(x_0)$. The cases $x_0 = a$ and $x_0 = b$ are treated similarly.
Warning 1.41 B Do not forget the continuity hypothesis B
Remember that $\int_a^x f(y)\,dy$ is not changed if you arbitrarily modify $f$ at a finite number of points. Theorem 1.39 thus has no chance to hold without the continuity hypothesis.
Warning 1.42 B not all integrable functions admit primitives! B
In fact, not all integrable functions admit primitives. Take for instance $x \in [0,2] \mapsto \mathbf{1}_{[1,2]}(x)$. This step function is integrable. By contradiction, a primitive of this function would be constant on $[0,1]$ and affine on $[1,2]$ with slope 1, and thus not differentiable at 1.
Let $f \in C([a,b])$ and $F : [a,b] \to \mathbb{R}$ be a primitive of $f$. By the previous theorem, $x \in [a,b] \mapsto \int_a^x f(y)\,dy$ is also a primitive of $f$, so that, by Proposition 1.36, there exists a constant $c \in \mathbb{R}$ such that, for all $x \in [a,b]$, $F(x) = c + \int_a^x f(y)\,dy$. Hence $F(b) - F(a) = \int_a^b f(y)\,dy$. We will now see that this property still holds under the weaker assumption that $f$ is integrable (and not necessarily continuous).
Definition 1.43 h notation [·] g
For a function $F : [a,b] \to \mathbb{R}$, we use the piece of notation
\[ [F]_a^b = [F(x)]_{x=a}^b := F(b) - F(a) \,. \]
Although not totally rigorous, we also tolerate the notation $[F(x)]_a^b$ when the integration variable is unequivocal.
Theorem 1.44 h second fundamental theorem of calculus g
Let $f : [a,b] \to \mathbb{R}$ be an integrable function and $F \in C([a,b])$ be differentiable on $(a,b)$ and such that
\[ \forall x \in (a,b) \,, \quad F'(x) = f(x) \,. \]
Then
\[ \int_a^b f(x)\,dx = [F]_a^b = F(b) - F(a) \,. \]
Proof. Let $\varepsilon > 0$. There exist step functions $\varphi, \psi : [a,b] \to \mathbb{R}$ such that $\varphi \le f \le \psi$ and $\int_a^b \psi - \int_a^b \varphi < \varepsilon$. Let $a = x_0 < x_1 < x_2 < \dots < x_n = b$ be a subdivision compatible with $\varphi$ and $\psi$. We denote by $c_i$ and $d_i$ the values taken on $(x_{i-1}, x_i)$ respectively by $\varphi$ and $\psi$. By the mean value theorem, for $1 \le i \le n$,
\[ (x_i - x_{i-1})\,c_i \le (x_i - x_{i-1}) \inf_{(x_{i-1}, x_i)} f \le F(x_i) - F(x_{i-1}) \le (x_i - x_{i-1}) \sup_{(x_{i-1}, x_i)} f \le (x_i - x_{i-1})\,d_i \,, \]
so that
\[ \int_a^b \varphi(x)\,dx = \sum_{i=1}^n (x_i - x_{i-1})\,c_i \le \sum_{i=1}^n \big( F(x_i) - F(x_{i-1}) \big) \le \sum_{i=1}^n (x_i - x_{i-1})\,d_i = \int_a^b \psi(x)\,dx \,. \]
The sum in the middle is a telescopic sum equal to $[F]_a^b$. As, furthermore,
\[ \int_a^b \varphi(x)\,dx \le \int_a^b f(x)\,dx \le \int_a^b \psi(x)\,dx \,, \]
we conclude that
\[ \left| [F]_a^b - \int_a^b f(x)\,dx \right| \le \int_a^b \psi - \int_a^b \varphi < \varepsilon \,. \]
The desired equality follows by letting $\varepsilon \to 0$.
This theorem gives a more efficient way to compute integrals.
Example 1.45
Let us come back to the integrals we computed before.
(i) $\displaystyle \int_0^1 e^x\,dx = \big[ e^x \big]_0^1 = e^1 - e^0 = e - 1$.
(ii) $\displaystyle \int_0^1 x^2\,dx = \left[ \frac{x^3}{3} \right]_0^1 = \frac{1}{3}$.
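The two values above are easy to sanity-check numerically. The following sketch is my addition (not part of the notes); `simpson` is an ad-hoc helper implementing the composite Simpson rule.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

# Example 1.45 (i): the integral of e^x on [0, 1] is e - 1.
print(abs(simpson(math.exp, 0, 1) - (math.e - 1)) < 1e-8)   # True
# Example 1.45 (ii): the integral of x^2 on [0, 1] is 1/3.
print(abs(simpson(lambda x: x * x, 0, 1) - 1 / 3) < 1e-10)  # True
```

Simpson's rule is exact on polynomials of degree at most 3, which is why the second check is essentially exact.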
Exercise 1.46 h What are the odds? g solution page 134
(i) Let $f : [-a,a] \to \mathbb{R}$ be an odd integrable function. Show that $\displaystyle \int_{-a}^a f(x)\,dx = 0$.
(ii) Let $g : [-a,a] \to \mathbb{R}$ be an even integrable function. Show that $\displaystyle \int_{-a}^a g(x)\,dx = 2 \int_0^a g(x)\,dx$.
1.2.5 Riemann sums
There are many links between integrals and discrete sums. In fact, the integral can be thought of as a sum over a continuum of points, whereas a discrete sum is a sum over a countable set of points. In old Latin, there were two versions of the letter “s” and the symbol $\int$ actually represents one of them, the one used in particular at the beginning of the word “summa”, meaning sum.
Theorem 1.47 h Riemann sums g
Let $f \in C([a,b])$. For each $n \in \mathbb{N}$, we consider a subdivision $a = x_{n,0} < x_{n,1} < x_{n,2} < \dots < x_{n,n} = b$ into $n$ parts and $n$ points $t_{n,i} \in [x_{n,i-1}, x_{n,i}]$, $1 \le i \le n$. If the mesh $\max_{1 \le i \le n} (x_{n,i} - x_{n,i-1})$ tends to 0 as $n \to \infty$, then
\[ \sum_{i=1}^n (x_{n,i} - x_{n,i-1})\,f(t_{n,i}) \xrightarrow[n \to \infty]{} \int_a^b f(t)\,dt \,. \]
Remark 1.48
In fact, the theorem still holds for integrable functions. This was the original definition of integrable functions by Riemann.
Proof. By linearity and using Chasles’s identity,
\[ \sum_{i=1}^n (x_{n,i} - x_{n,i-1})\,f(t_{n,i}) - \int_a^b f(t)\,dt = \sum_{i=1}^n \int_{x_{n,i-1}}^{x_{n,i}} \big( f(t_{n,i}) - f(t) \big)\,dt \,. \]
Now, let ε > 0. By Heine’s theorem, f is uniformly continuous on [a, b]: there exists δ > 0 such that
|y − x| < δ =⇒ |f(y) − f(x)| ≤ ε .
As the mesh tends to 0, for $n$ sufficiently large, one has $\max_{1 \le i \le n} (x_{n,i} - x_{n,i-1}) < \delta$. Then
\[ \left| \sum_{i=1}^n (x_{n,i} - x_{n,i-1})\,f(t_{n,i}) - \int_a^b f(t)\,dt \right| \le \sum_{i=1}^n \int_{x_{n,i-1}}^{x_{n,i}} \big| f(t_{n,i}) - f(t) \big|\,dt \le \sum_{i=1}^n \int_{x_{n,i-1}}^{x_{n,i}} \varepsilon\,dt = (b-a)\,\varepsilon \,. \]
The result follows.
The most common use of this theorem is by taking regular subdivisions and $t_{n,i} = x_{n,i}$, and particularly $a = 0$ and $b = 1$.
Corollary 1.49
h Let $f \in C([a,b])$. Then
\[ \frac{b-a}{n} \sum_{i=1}^n f\!\left( a + i\,\frac{b-a}{n} \right) \xrightarrow[n \to \infty]{} \int_a^b f(t)\,dt \,. \]
h Let $f \in C([0,1])$. Then
\[ \frac{1}{n} \sum_{i=1}^n f\!\left( \frac{i}{n} \right) \xrightarrow[n \to \infty]{} \int_0^1 f(t)\,dt \,. \]
Example 1.50
Let us compute the limit of $S_n := \displaystyle \sum_{k=1}^n \frac{1}{n+k}$.
We can rewrite $S_n = \displaystyle \frac{1}{n} \sum_{k=1}^n \frac{1}{1 + \frac{k}{n}}$. Setting $f : x \in [0,1] \mapsto \dfrac{1}{1+x}$,
\[ S_n = \frac{1}{n} \sum_{k=1}^n f\!\left( \frac{k}{n} \right) \xrightarrow[n \to \infty]{} \int_0^1 \frac{1}{1+x}\,dx = \big[ \ln|1+x| \big]_0^1 = \ln(2) - \ln(1) = \ln(2) \,. \]
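This convergence can be watched numerically. A minimal sketch (my addition; `riemann_sum` is an ad-hoc name):

```python
import math

def riemann_sum(f, a, b, n):
    """Right-endpoint Riemann sum over the regular subdivision into n parts."""
    h = (b - a) / n
    return h * sum(f(a + i * h) for i in range(1, n + 1))

# S_n = sum_{k=1}^n 1/(n+k) is exactly the Riemann sum of 1/(1+x) on [0, 1].
for n in (10, 100, 10000):
    print(n, riemann_sum(lambda x: 1 / (1 + x), 0, 1, n))
print(math.log(2))  # the limit, ln(2) ≈ 0.6931
```

The error of such a one-sided Riemann sum decays like $1/n$, which the printed values illustrate.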
Exercise 1.51 solution page 135
Find the limits of $S_n := \displaystyle \sum_{k=1}^n \frac{e^{k/n}}{n}$ and $S'_n := \displaystyle \sum_{k=1}^n \frac{n}{(n+k)^2}$.
1.2.6 Toolbox
We now know that integration is the inverse operation of differentiation. When we want to integrate a function, we may recognize the derivative of a known function (maybe up to some factors to fine-tune afterward). However, most of the time, we are not that lucky. We will now see the two main tools allowing us to compute common integrals.
Theorem 1.52 h integration by parts g
Let $u, v \in C^1([a,b])$. Then
\[ \int_a^b u(x)\,v'(x)\,dx = \big[ u v \big]_a^b - \int_a^b u'(x)\,v(x)\,dx \,. \]
Proof. As $u, v \in C^1([a,b])$, one has $(uv)' = u'v + uv'$ and all these functions are continuous and thus integrable. The result is obtained by integrating this equality.
This theorem is used as follows. When we do not know how to integrate a function, we try to write it as a product of two functions such that, after differentiating the first factor and integrating the second factor, we are left with a function that we can integrate. This takes some practice.
Remark 1.53 h color code g
Throughout these notes, we use one color for the function we differentiate and another for the function we integrate.
Example 1.54
Let us see some examples.
(i) Let us compute $\displaystyle \int_0^1 x e^x\,dx$:
\[ \int_0^1 x\,e^x\,dx = \big[ x\,e^x \big]_0^1 - \int_0^1 1 \cdot e^x\,dx = e - \int_0^1 e^x\,dx = e - \big[ e^x \big]_0^1 = e - (e - 1) = 1 \,. \]
(ii) The product is usually not as straightforward as above to identify. Let us compute $\displaystyle \int_1^x \ln(t)\,dt$:
\[ \int_1^x 1 \cdot \ln(t)\,dt = \big[ t \ln(t) \big]_1^x - \int_1^x t \cdot \frac{1}{t}\,dt = x \ln(x) - \int_1^x 1\,dt = x \ln(x) - \big[ t \big]_1^x = x \ln(x) - x + 1 \,. \]
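Both computations can be cross-checked by numerical quadrature. The sketch below is my addition (with an ad-hoc `simpson` helper); it compares the closed forms obtained by integration by parts with a direct numerical estimate.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

# (i) the integral of x e^x over [0, 1] should be 1.
print(simpson(lambda t: t * math.exp(t), 0, 1))
# (ii) the integral of ln(t) over [1, x] should be x ln(x) - x + 1.
x = 3.0
print(simpson(math.log, 1, x), x * math.log(x) - x + 1)
```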
Exercise 1.55 solution page 135
(i) Compute $\displaystyle \int_1^e x \ln(x)\,dx$.
(ii) Find a primitive of $x \in \mathbb{R} \mapsto x^2 e^x$.
Theorem 1.56 h integration by substitution or change of variable g
Let $f : I \to \mathbb{R}$ be a continuous function on an interval $I$ and $\Phi : [a,b] \to I$ be a differentiable function with integrable derivative. Then
\[ \int_{\Phi(a)}^{\Phi(b)} f(y)\,dy = \int_a^b f\big( \Phi(x) \big)\,\Phi'(x)\,dx \,. \]
Proof. Since $f$ is continuous, it has a primitive $F$. The function $F \circ \Phi$ is differentiable as a composition of differentiable functions: by the chain rule,
\[ \forall x \in (a,b) \,, \quad (F \circ \Phi)'(x) = f\big( \Phi(x) \big)\,\Phi'(x) \,. \]
By the fundamental theorem of calculus (applied twice),
\[ \int_a^b f\big( \Phi(x) \big)\,\Phi'(x)\,dx = \int_a^b (F \circ \Phi)'(x)\,dx = \big[ F \circ \Phi \big]_a^b = F\big( \Phi(b) \big) - F\big( \Phi(a) \big) = \big[ F \big]_{\Phi(a)}^{\Phi(b)} = \int_{\Phi(a)}^{\Phi(b)} f(y)\,dy \,, \]
as claimed.
In practice, we may remember the formula as follows. We want to set $y = \Phi(x)$, so that $\frac{dy}{dx} = \Phi'(x)$. Working heuristically with infinitesimals yields $dy = \Phi'(x)\,dx$, which we barbarously “replace” in the left-hand side integral. We then modify the extremities by noting that, as $x$ goes from $a$ to $b$, then $y = \Phi(x)$ goes from $\Phi(a)$ to $\Phi(b)$. Bear in mind that this is not a proof, simply a means of remembering the formula. Observe that this works because the notation has been well chosen (which is far from being always the case in maths). As with integration by parts, finding a good substitution is sometimes straightforward and sometimes quite involved; practice is the key!
Remark 1.57 h color code g
As much as possible, we will use the following color code for substitution:
\[ \int_{\Phi(a)}^{\Phi(b)} f(y)\,dy = \int_a^b f\big( \Phi(x) \big)\,\Phi'(x)\,dx \,. \]
Example 1.58 h a primitive of tan g
Let us find a primitive of $\tan : (-\frac{\pi}{2}, \frac{\pi}{2}) \to \mathbb{R}$, for instance the one that cancels at 0:
\[ x \in \left( -\frac{\pi}{2}, \frac{\pi}{2} \right) \mapsto \int_0^x \tan(t)\,dt \,. \]
As $\tan = \frac{\sin}{\cos}$, it appears appropriate to use the substitution $u = \cos(t)$, yielding $du = -\sin(t)\,dt$. Then, for $x \in (-\frac{\pi}{2}, \frac{\pi}{2})$,
\[ \int_0^x \tan(t)\,dt = \int_0^x \frac{\sin(t)}{\cos(t)}\,dt = -\int_{\cos(0)}^{\cos(x)} \frac{du}{u} = -\big[ \ln|u| \big]_{\cos(0)}^{\cos(x)} = -\ln\big( \cos(x) \big) \,. \]
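One can check with finite differences that $F(x) = -\ln(\cos(x))$ indeed differentiates to $\tan$ and cancels at 0 — a quick sketch, my addition:

```python
import math

# F should satisfy F'(x) = tan(x) on (-pi/2, pi/2) and F(0) = 0.
F = lambda x: -math.log(math.cos(x))
h = 1e-6
for x in (-1.2, 0.3, 0.7):
    approx = (F(x + h) - F(x - h)) / (2 * h)  # symmetric difference quotient
    print(x, approx, math.tan(x))
print(abs(F(0.0)))  # 0.0
```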
Exercise 1.59 solution page 136
(i) Using the change of variable $x = \sin(t)$, compute $\displaystyle \int_0^{1/2} \frac{1}{(1-x^2)^{3/2}}\,dx$.
(ii) Using the change of variable $x = \tan(t)$, find a primitive of $x \in \mathbb{R} \mapsto \dfrac{1}{(1+x^2)^{3/2}}$.
1.3 Integration of rational functions
We are interested in this section in integrating rational functions, that is, functions of the form
\[ R : x \mapsto \frac{P(x)}{Q(x)} \,, \]
where $P$, $Q \in \mathbb{R}[X]$ are nonzero polynomials. Dividing both the numerator and the denominator by the leading coefficient of $Q$, we may assume without loss of generality that $Q$ is monic, that is, has leading coefficient equal to 1. If $\deg(Q) = 0$, which means that $Q = 1$, then we are dealing with a polynomial, which we already know how to integrate. In the following, we therefore assume
\[ \deg(Q) \ge 1 \,. \]
1.3.1 Partial fraction decomposition
We factor $Q$ in $\mathbb{R}[X]$ as
\[ Q = \prod_{j=1}^p \big( X - r_j \big)^{n_j} \prod_{j=1}^q \big( X^2 + 2b_j X + c_j \big)^{m_j} \]
with distinct real numbers $r_j \in \mathbb{R}$, and distinct pairs $(b_j, c_j) \in \mathbb{R}^2$ such that $b_j^2 - c_j < 0$. The real numbers $r_1, \dots, r_p$ are the real roots of $Q$, with multiplicities $n_1, \dots, n_p$. The factors $X^2 + 2b_j X + c_j$ correspond to the pairs of nonreal conjugate roots, with multiplicities $m_1, \dots, m_q$.
Remark 1.60 h reduced discriminant g
Recall that the discriminant of $ax^2 + bx + c$ is $b^2 - 4ac$ and that, if it is positive, then the roots are $\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$. If one wants to simplify things and get rid of these factors 2, one may use the reduced discriminant; this is what we do in these notes. The reduced discriminant of $ax^2 + 2bx + c$ is $b^2 - ac$; if it is positive, then the roots are $\frac{-b \pm \sqrt{b^2 - ac}}{a}$.
We admit the following theorem, whose proof is beyond the reach of this course.
Theorem 1.61 h partial fraction decomposition g
There exists a unique way to write
\[ \frac{P(X)}{Q(X)} = E(X) + \sum_{j=1}^p \sum_{k=1}^{n_j} \frac{\alpha_{j,k}}{(X - r_j)^k} + \sum_{j=1}^q \sum_{k=1}^{m_j} \frac{\beta_{j,k} X + \gamma_{j,k}}{(X^2 + 2b_j X + c_j)^k} \]
where $E(X) \in \mathbb{R}[X]$ and the quantities $\alpha_{j,k}$, $\beta_{j,k}$, $\gamma_{j,k}$ are real numbers.
Definition 1.62 h integral part, partial fraction g
h The polynomial $E(X)$ is called the integral$^a$ part of the rational fraction.
h Each element of the sums above is called a partial fraction.
$^a$The word integral is here the adjective corresponding to integer. It refers to the fact that the polynomial ring $\mathbb{R}[X]$ behaves like the integer ring $\mathbb{Z}$ in regards to Euclidean division; it does not refer to the Riemann integral.
In fact, similarly to the case of integer division, there is a unique way to write $P(X) = E(X)Q(X) + S(X)$ with $E$, $S \in \mathbb{R}[X]$ and $\deg(S) < \deg(Q)$. To convince yourself, let us see how we determine $E$ in practice through the following algorithm. First, let $m := \deg(Q)$, so that the leading term of $Q$ is $X^m$. Let $a_n X^n$ denote the leading term of $P$. If $n < m$, we stop. Otherwise, we write
\[ P = a_n X^{n-m} Q + \underbrace{P - a_n X^{n-m} Q}_{\text{polynomial of degree} \,<\, n} \]
and we reiterate the process with the polynomial $P - a_n X^{n-m} Q$. As the degree decreases by at least 1 at each step, the algorithm terminates.
Example 1.63 h Euclidean division g
Let us divide $2X^4 - 3X^3 + 2X - 6$ by $X^2 + 1$. We write
\begin{align*}
2X^4 - 3X^3 + 2X - 6 &= 2X^2(X^2 + 1) - 3X^3 - 2X^2 + 2X - 6 \\
&= (2X^2 - 3X)(X^2 + 1) - 2X^2 + 5X - 6 \\
&= (2X^2 - 3X - 2)(X^2 + 1) + 5X - 4 \,.
\end{align*}
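The division algorithm described above is short to code. A sketch (my addition; polynomials are plain coefficient lists, highest degree first, and `polydiv` is an ad-hoc name):

```python
def polydiv(P, Q):
    """Euclidean division of polynomials given as coefficient lists,
    highest degree first: returns (E, S) with P = E*Q + S, deg S < deg Q.
    Assumes Q is normalized so that its leading coefficient is nonzero."""
    R = list(P)
    E = []
    while len(R) >= len(Q):
        c = R[0] / Q[0]          # next coefficient of the quotient
        E.append(c)
        for i in range(len(Q)):  # subtract c * X^(deg R - deg Q) * Q
            R[i] -= c * Q[i]
        R.pop(0)                 # the leading coefficient is now 0
    return E, R

# Example 1.63: divide 2X^4 - 3X^3 + 2X - 6 by X^2 + 1.
E, S = polydiv([2, -3, 0, 2, -6], [1, 0, 1])
print(E, S)  # [2.0, -3.0, -2.0] [5.0, -4.0]
```

The quotient $2X^2 - 3X - 2$ and remainder $5X - 4$ match the hand computation.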
Alternatively, as we know that the Euclidean division is possible, we may use indeterminate coefficients for $E$ and $S$, expand $EQ + S$ and identify the coefficients. The degree of $E$ is $\deg(P) - \deg(Q)$ and that of $S$ is at most $\deg(Q) - 1$.
Example 1.64
Coming back to the previous example, we solve
\begin{align*}
2X^4 - 3X^3 + 2X - 6 &= (aX^2 + bX + c)(X^2 + 1) + dX + e \\
&= aX^4 + bX^3 + (a + c)X^2 + (b + d)X + (c + e) \,,
\end{align*}
which yields $a = 2$, $b = -3$, $c = -2$, $d = 5$, $e = -4$. The integral part of the rational fraction $\frac{2X^4 - 3X^3 + 2X - 6}{X^2 + 1}$ is thus $2X^2 - 3X - 2$. We obtain
\[ \frac{2X^4 - 3X^3 + 2X - 6}{X^2 + 1} = 2X^2 - 3X - 2 + \frac{5X - 4}{X^2 + 1} \,. \]
In practice, we will first do the above Euclidean division, obtaining
\[ \frac{P(X)}{Q(X)} = E(X) + \frac{S(X)}{Q(X)} \,, \]
and then do the partial fraction decomposition
\[ \frac{S(X)}{Q(X)} = \sum_{j=1}^p \sum_{k=1}^{n_j} \frac{\alpha_{j,k}}{(X - r_j)^k} + \sum_{j=1}^q \sum_{k=1}^{m_j} \frac{\beta_{j,k} X + \gamma_{j,k}}{(X^2 + 2b_j X + c_j)^k} \,. \tag{1.6} \]
In order to determine the constants of this partial fraction decomposition, we can always multiply by $Q(X)$ and obtain an equality of polynomials. The left-hand side polynomial is $S$, whereas the coefficients of the right-hand side polynomial are linear expressions in the constants to be determined. Equating the coefficients yields a system of linear equations that always admits a unique solution, which can be found by the usual methods of linear algebra.
Alternatively, there are usually faster methods that can simplify the computations as much as possible. In particular, multiplying both sides of (1.6) by $Q(X)$, one obtains an equality of polynomials, so that equality holds when $X$ is replaced with any real number $x$ (even any complex number). The same is thus true without multiplying by $Q(x)$, provided of course that $Q(x) \neq 0$.
(a) Observe that $\displaystyle \lim_{x \to +\infty} \frac{x\,S(x)}{Q(x)} = \sum_{j=1}^p \alpha_{j,1} + \sum_{j=1}^q \beta_{j,1}$.
(b) We can multiply by $(X - r_j)^{n_j}$ and specify the equation at $r_j$. We obtain
\[ \alpha_{j,n_j} = \frac{S(r_j)}{Q_{/(X-r_j)^{n_j}}(r_j)} \,, \]
where $Q_{/(X-r_j)^{n_j}}(X) := \dfrac{Q(X)}{(X - r_j)^{n_j}}$. Once $\alpha_{j,n_j}$ is determined, we may subtract $\dfrac{\alpha_{j,n_j}}{(X - r_j)^{n_j}}$ and reiterate.
(c) Note also that, as
\[ Q_{/(X-r_j)^{n_j}}(X) = \prod_{\substack{k=1 \\ k \neq j}}^p \big( X - r_k \big)^{n_k} \prod_{k=1}^q \big( X^2 + 2b_k X + c_k \big)^{m_k} \,, \]
we also have $Q_{/(X-r_j)^{n_j}}(r_j) = \frac{1}{n_j!}\,Q^{(n_j)}(r_j)$, where $Q^{(k)}$ denotes the $k$-th derivative of $Q$.
(d) Similarly to (b), we can multiply by $(X^2 + 2b_j X + c_j)^{m_j}$ and specify at a root. We obtain an equality between two complex numbers and thus two equations on real numbers allowing us to determine the corresponding constants $\beta_{j,m_j}$ and $\gamma_{j,m_j}$.
Unless you have a hunch, it is usually a good idea to start with (b) and (d). Remember that, when you are stuck, you can always multiply by $Q$ and equate the coefficients, or specify the equation at values giving simple equations.
Example 1.65
\[ \frac{1}{X^2 - 3X + 2} = \frac{1}{(X-1)(X-2)} = \frac{a}{X-1} + \frac{b}{X-2} \,. \]
Multiplying by $X - 1$ and specifying at 1 gives $\frac{1}{1-2} = a$, and multiplying by $X - 2$ and specifying at 2 gives $\frac{1}{2-1} = b$. Consequently,
\[ \frac{1}{X^2 - 3X + 2} = \frac{-1}{X-1} + \frac{1}{X-2} \,. \]
With some training, we can also directly write
\[ \frac{1}{X^2 - 3X + 2} = \frac{1}{(X-1)(X-2)} = \frac{(X-1) - (X-2)}{(X-1)(X-2)} = \frac{1}{X-2} - \frac{1}{X-1} \,. \]
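The “multiply and specify” trick can even be carried out numerically: near a simple pole $r$, the product $(x - r)R(x)$ approaches the coefficient of $\frac{1}{x-r}$. A quick sketch (my addition):

```python
# Numerical version of "multiply by (X - r) and specify at r":
# for a simple pole r, (x - r) * R(x) tends to the coefficient of 1/(x - r).
R = lambda x: 1 / (x**2 - 3 * x + 2)
eps = 1e-8
a = eps * R(1 + eps)  # coefficient of 1/(x - 1)
b = eps * R(2 + eps)  # coefficient of 1/(x - 2)
print(round(a, 4), round(b, 4))  # -1.0 1.0
```

This recovers the coefficients $-1$ and $1$ found above, up to an error of order $\varepsilon$.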
Example 1.66
\[ \frac{4X^3}{(X-1)^2 (X^2+1)} = \frac{a}{X-1} + \frac{b}{(X-1)^2} + \frac{cX + d}{X^2 + 1} \,. \]
Multiplying by $(X-1)^2$ and specifying at 1 gives $b = 2$. Multiplying by $X^2 + 1$ and specifying at $\mathrm{i}$ gives
\[ c\,\mathrm{i} + d = \frac{4\mathrm{i}^3}{(\mathrm{i}-1)^2} = \frac{-4\mathrm{i}\,(\mathrm{i}+1)^2}{(-2)^2} = 2 \,, \]
so that $c = 0$ and $d = 2$. Finally, (a) yields $4 = a + c$, so that $a = 4$. One may also get $0 = -a + b + d$ by specifying at 0, for instance. All in all,
\[ \frac{4X^3}{(X-1)^2 (X^2+1)} = \frac{4}{X-1} + \frac{2}{(X-1)^2} + \frac{2}{X^2+1} \,. \]
1.3.2 Integrating partial fractions
Let us now see how to integrate our rational function $R : x \mapsto \frac{P(x)}{Q(x)}$.
Warning 1.67 B domain of definition B
Beware that $R$ is defined on $\mathbb{R} \setminus \{r_1, \dots, r_p\}$, which is a finite union of open intervals. One thus has to integrate separately on each of these intervals.
By linearity, we may integrate separately the integral part, as well as each partial fraction. The integral part is a polynomial, which we know how to integrate. It remains to deal with the partial fractions.
h Integrating $x \mapsto (x-r)^{-k}$, $r \in \mathbb{R}$, $k \in \mathbb{N}$. This is, up to translation, a power function. One can do the change of variable $y = x - r$.

$D$ | function $x \in D \mapsto \cdot$ | a primitive $x \in D \mapsto \cdot$
$(-\infty, r)$ or $(r, +\infty)$ | $\dfrac{1}{x-r}$ | $\ln|x-r|$
$(-\infty, r)$ or $(r, +\infty)$ | $\dfrac{1}{(x-r)^k}$, $k \in \{2,3,4,\dots\}$ | $\dfrac{1}{-k+1}\,(x-r)^{-k+1}$
Example 1.68
Coming back to Example 1.65, let us find the primitives of $R : x \mapsto \frac{1}{x^2 - 3x + 2}$. We start by observing that this function is defined on $\mathbb{R} \setminus \{1, 2\}$. As, for all $x \notin \{1, 2\}$,
\[ \frac{1}{x^2 - 3x + 2} = \frac{1}{x-2} - \frac{1}{x-1} \,, \]
the primitives of $R$ are the functions
\[ x \in \mathbb{R} \setminus \{1, 2\} \mapsto \ln\left| \frac{x-2}{x-1} \right| + \begin{cases} c_1 & \text{if } x < 1 \\ c_2 & \text{if } 1 < x < 2 \\ c_3 & \text{if } x > 2 \end{cases} \,, \]
where $c_1, c_2, c_3 \in \mathbb{R}$ are arbitrary constants. Do not forget that, although the general expression is always the same, there is one integration constant per interval of the domain of definition.
h Integrating $x \mapsto (\beta x + \gamma)(x^2 + 2bx + c)^{-k}$, $\beta, \gamma, b, c \in \mathbb{R}$, $b^2 - c < 0$, $k \in \mathbb{N}$. This one is harder; here is the way to proceed. Do not learn the results by heart, remember the method! We first deal with the $x$ factor in the numerator. The idea is to transform the expression in such a way that we obtain $\frac{u'(x)}{u(x)^k}$, which we can integrate. Here $u(x) = x^2 + 2bx + c$, so that $u'(x) = 2x + 2b$:
\[ \frac{\beta x + \gamma}{(x^2 + 2bx + c)^k} = \frac{\beta}{2}\,\frac{2x + 2b}{(x^2 + 2bx + c)^k} + (\gamma - \beta b)\,\frac{1}{(x^2 + 2bx + c)^k} \,. \]
Still by linearity, we have two terms to treat. The first one is treated thanks to the change of variable $y = x^2 + 2bx + c$, so that $dy = (2x + 2b)\,dx$ (which was the point of making this factor appear in the numerator).

$D$ | function $x \in D \mapsto \cdot$ | a primitive $x \in D \mapsto \cdot$
$\mathbb{R}$ | $\dfrac{2x + 2b}{x^2 + 2bx + c}$ | $\ln(x^2 + 2bx + c)$
$\mathbb{R}$ | $\dfrac{2x + 2b}{(x^2 + 2bx + c)^k}$, $k \in \{2,3,4,\dots\}$ | $\dfrac{1}{-k+1}\,(x^2 + 2bx + c)^{-k+1}$
The remaining term $x \mapsto (x^2 + 2bx + c)^{-k}$ is the most complicated to integrate. The first step is to transform the expression so that it looks like $(y^2 + 1)^{-k}$. To this end, we see $x^2 + 2bx$ as the beginning of the expansion of $(x + b)^2$:
\[ x^2 + 2bx + c = (x + b)^2 + c - b^2 \,. \]
Remark 1.69
By the way, recall that this is the method we use in order to solve second-degree equations ($a \neq 0$):
\[ ax^2 + 2bx + c = 0 \iff a\left( x + \frac{b}{a} \right)^2 - \frac{b^2 - ac}{a} = 0 \iff \left( x + \frac{b}{a} \right)^2 = \frac{b^2 - ac}{a^2} \,. \]
Setting $y = \dfrac{x + b}{\sqrt{c - b^2}}$ does the trick (recall that $b^2 - c < 0$): we obtain
\[ \frac{dx}{(x^2 + 2bx + c)^k} = \frac{\sqrt{c - b^2}}{(c - b^2)^k}\,\frac{dy}{(y^2 + 1)^k} \,. \]
It finally remains to integrate $y \mapsto (y^2 + 1)^{-k}$. Set $f_k : y \in \mathbb{R} \mapsto \displaystyle \int_0^y \frac{dt}{(t^2 + 1)^k}$.
h If k = 1, then we know that f1(y) = arctan(y), for y ∈ R.
h If $k \ge 2$, we obtain a relation between $f_k$ and $f_{k-1}$ thanks to an integration by parts:
\begin{align*}
f_{k-1}(y) &= \int_0^y 1 \cdot (t^2 + 1)^{-k+1}\,dt \\
&= \big[ t\,(t^2 + 1)^{-k+1} \big]_0^y - 2(-k+1) \int_0^y t \cdot t\,(t^2 + 1)^{-k}\,dt \\
&= y\,(y^2 + 1)^{-k+1} - 2(-k+1) \int_0^y (t^2 + 1 - 1)(t^2 + 1)^{-k}\,dt \\
&= y\,(y^2 + 1)^{-k+1} - 2(-k+1)\big( f_{k-1}(y) - f_k(y) \big) \,,
\end{align*}
so that
\[ f_k(y) = \frac{2k - 3}{2k - 2}\,f_{k-1}(y) + \frac{1}{2k - 2}\,\frac{y}{(y^2 + 1)^{k-1}} \,. \]
Of course, you are not expected to know this result by heart! You are, however, expected to be able to recover it.
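The reduction formula is easy to implement and to cross-check against a direct Riemann sum — a sketch, my addition:

```python
import math

def f(k, y):
    """f_k(y) = integral of 1/(t^2+1)^k on [0, y], via the reduction formula."""
    if k == 1:
        return math.atan(y)
    return ((2 * k - 3) / (2 * k - 2) * f(k - 1, y)
            + y / ((2 * k - 2) * (y * y + 1) ** (k - 1)))

# Cross-check f_3(2) against a midpoint Riemann sum.
n = 200000
h = 2.0 / n
approx = sum(h / (((i + 0.5) * h) ** 2 + 1) ** 3 for i in range(n))
print(f(3, 2.0), approx)  # both ≈ 0.5852
```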
Example 1.70
Coming back to Example 1.66, we compute
\begin{align*}
\int_{-2}^0 \frac{4x^3}{(x-1)^2 (x^2+1)}\,dx &= 4 \int_{-2}^0 \frac{dx}{x-1} + 2 \int_{-2}^0 \frac{dx}{(x-1)^2} + 2 \int_{-2}^0 \frac{dx}{x^2+1} \\
&= 4 \big[ \ln|x-1| \big]_{-2}^0 + 2 \left[ \frac{-1}{x-1} \right]_{-2}^0 + 2 \big[ \arctan(x) \big]_{-2}^0 \\
&= -4\ln(3) + \frac{4}{3} - 2\arctan(-2) \,.
\end{align*}
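A numerical comparison of the original integral with the closed form just obtained (my addition, with an ad-hoc `simpson` helper):

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return s * h / 3

g = lambda x: 4 * x**3 / ((x - 1) ** 2 * (x**2 + 1))
closed = -4 * math.log(3) + 4 / 3 - 2 * math.atan(-2)
print(simpson(g, -2, 0), closed)  # both ≈ -0.8468
```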
Exercise 1.71 solution page 136
Find primitives of the functions
(i) $x \mapsto \dfrac{4x + 5}{x^2 + x - 2}$  (ii) $x \mapsto \dfrac{6 - x}{x^2 - 4x + 4}$  (iii) $x \mapsto \dfrac{2x - 3}{x^2 - 4x + 5}$
Exercise 1.72 solution page 137
Compute the following:
(i) $\displaystyle \int_0^1 \frac{dx}{x^2 + x + 1}$  (ii) $\displaystyle \int_0^1 \frac{x\,dx}{x^2 + x + 1}$  (iii) $\displaystyle \int_0^1 \frac{dx}{(x^2 + x + 1)^2}$  (iv) $\displaystyle \int_0^1 \frac{x\,dx}{(x^2 + x + 1)^2}$
1.3.3 Rational functions in other functions
In this section, we consider rational functions in other well-behaved functions. The key to this section is the following fact: if $R_1$ and $R_2$ are rational functions, then so is $R_1 \circ R_2$. In words, if you have a rational function in $y$ and substitute $y$ with a rational expression in $x$, then you obtain a rational function in $x$.
h Rational functions in exp. The first and simplest example is that of functions that are rational in the exponential function, that is, functions of the form
\[ x \mapsto R(e^x) \,, \]
where $R$ is a rational function. These are dealt with by using the change of variable $y = e^x$, for which $dy = e^x\,dx = y\,dx$. For any $a, t \in \mathbb{R}$ such that $R$ is defined on $[e^a, e^t]$ or $[e^t, e^a]$ (depending on whether $a < t$ or $a \ge t$), we have
\[ \int_a^t R(e^x)\,dx = \int_{e^a}^{e^t} \frac{R(y)}{y}\,dy \,. \]
As $y \mapsto \frac{R(y)}{y}$ is a rational function, we can compute this integral by the method of the previous section.
Exercise 1.73 solution page 138
Compute $\displaystyle \int_1^2 \frac{2e^{2x} - 3e^x + 2}{e^{2x} - e^x}\,dx$.
h Rational functions in cos and sin. We now consider functions of the form
\[ x \mapsto R\big( \cos(x), \sin(x) \big) \,, \]
where $R$ is a rational function in two variables, that is, $R(X,Y) = \frac{P(X,Y)}{Q(X,Y)}$ for some nonzero polynomials in two variables $P$, $Q \in \mathbb{R}[X,Y]$. Here again, the point is to find a substitution that brings us back to integrating a rational function (of one variable). There are essentially four possibilities, depending on the form of the rational function under consideration:
h $u = \cos(x)$, $du = -\sin(x)\,dx$;
h $u = \sin(x)$, $du = \cos(x)\,dx$;
h $u = \tan(x)$, $du = (1 + u^2)\,dx$;
h $t = \tan\left( \frac{x}{2} \right)$.
The last one plays a special role; we will come back to it shortly. In order to decide which substitution to make, remember the following:
h $\cos^2 + \sin^2 = 1$, so that we can always “transform” $\cos^2$ into $\sin^2$ and vice versa.
h $\frac{1}{\cos^2} = 1 + \tan^2$, so that we can always “transform” $\cos^2$ into $\tan^2$.
h Never forget $du$.
By transform, we mean that the change yields another rational function. Now, the idea is to get in the end only $\sin(x)$, or only $\cos(x)$, or only $\tan(x)$ in your expression, not forgetting the $du$ part. For instance, if your rational function takes the form $R(\cos(x), \sin^2(x))\sin(x)$ where $R$ is a rational function, then the substitution $u = \cos(x)$ will work, as the last $\sin(x)$ will be swallowed by $du$ and the $\sin^2(x)$ can be replaced by $1 - \cos^2(x)$. Similarly, if your rational function takes the form $R(\cos^2(x), \sin(x))\cos(x)$, then the substitution $u = \sin(x)$ will work, and, if your rational function takes the form $R(\cos^2(x), \tan(x))$, then the substitution $u = \tan(x)$ will work.
Example 1.74
\begin{align*}
\int_0^{\pi/6} \frac{dx}{\cos(x)} &= \int_0^{\pi/6} \frac{\cos(x)\,dx}{\cos^2(x)} = \int_0^{\pi/6} \frac{\cos(x)\,dx}{1 - \sin^2(x)} \\
&= \int_0^{1/2} \frac{du}{1 - u^2} \qquad (u = \sin(x)) \\
&= \frac{1}{2} \int_0^{1/2} \frac{du}{1 - u} + \frac{1}{2} \int_0^{1/2} \frac{du}{1 + u} \\
&= \frac{1}{2} \big[ -\ln(1 - u) \big]_0^{1/2} + \frac{1}{2} \big[ \ln(1 + u) \big]_0^{1/2} = \frac{\ln(3)}{2} \,.
\end{align*}
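A numerical check of this value (my addition, with an ad-hoc `simpson` helper):

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return s * h / 3

val = simpson(lambda x: 1 / math.cos(x), 0, math.pi / 6)
print(val, math.log(3) / 2)  # both ≈ 0.5493
```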
It may happen that none of the above substitutions works. In this case, we can use the half-angle tangent substitution $t = \tan\left( \frac{x}{2} \right)$, which always works but gives more complicated expressions.
Proposition 1.75 h half-angle tangent formulas g
For $x \not\equiv \pi \pmod{2\pi}$ and $t = \tan\left( \frac{x}{2} \right)$, one has
\[ \cos(x) = \frac{1 - t^2}{1 + t^2} \,, \quad \sin(x) = \frac{2t}{1 + t^2} \,, \quad \tan(x) = \frac{2t}{1 - t^2} \,, \quad dx = \frac{2\,dt}{1 + t^2} \,. \]
Proof. From the usual trigonometric identities,
\[ \cos(x) = \cos^2\!\left( \frac{x}{2} \right) - \sin^2\!\left( \frac{x}{2} \right) = \cos^2\!\left( \frac{x}{2} \right)(1 - t^2) \quad \text{and} \quad \sin(x) = 2\sin\!\left( \frac{x}{2} \right)\cos\!\left( \frac{x}{2} \right) = 2t\,\cos^2\!\left( \frac{x}{2} \right) \,. \]
Furthermore,
\[ \cos^2\!\left( \frac{x}{2} \right) = \frac{\cos^2\!\left( \frac{x}{2} \right)}{\cos^2\!\left( \frac{x}{2} \right) + \sin^2\!\left( \frac{x}{2} \right)} = \frac{1}{1 + t^2} \,, \]
and, finally, $dt = \frac{1}{2} \tan'\!\left( \frac{x}{2} \right) dx = \frac{1}{2}(1 + t^2)\,dx$.
Using the above proposition, we see that we can replace every occurrence of $\sin(x)$ and of $\cos(x)$ by a rational expression in $t$, and that $dx$ is also replaced by a rational expression in $t$ multiplied by $dt$. As a result, we always obtain a rational function in $t$, which we can integrate.
Example 1.76
Using the half-angle tangent substitution, we compute
\begin{align*}
\int_{-\pi/2}^0 \frac{dx}{1 - \sin(x)} &= \int_{-1}^0 \frac{1}{1 - \frac{2t}{1+t^2}}\,\frac{2\,dt}{1 + t^2} = 2 \int_{-1}^0 \frac{dt}{1 + t^2 - 2t} \\
&= 2 \int_{-1}^0 \frac{dt}{(1 - t)^2} = 2 \left[ \frac{1}{1 - t} \right]_{-1}^0 = 2 \left( 1 - \frac{1}{2} \right) = 1 \,.
\end{align*}
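Both sides of the substitution can be checked numerically — a sketch, my addition, using a midpoint Riemann sum:

```python
import math

def midpoint(g, a, b, n=100000):
    """Midpoint Riemann sum on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# The integral in x and its image under t = tan(x/2) should both equal 1.
lhs = midpoint(lambda x: 1 / (1 - math.sin(x)), -math.pi / 2, 0)
rhs = midpoint(lambda t: 2 / (1 - t) ** 2, -1, 0)
print(lhs, rhs)  # both ≈ 1.0
```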
Exercise 1.77 solution page 138
Compute the following:
(i) $\displaystyle \int_{-\pi/2}^{\pi/2} \sin^2(x)\cos^3(x)\,dx$, (ii) $\displaystyle \int_0^{\pi/2} \cos^4(x)\,dx$, (iii) $\displaystyle \int_0^{2\pi} \frac{dx}{2 + \sin(x)}$.
h Rational functions with radicals. We hereby consider functions of the form
\[ x \mapsto R\left( x, \sqrt[n]{\frac{ax + b}{cx + d}} \right) \,, \]
where $R$ is a rational function in two variables, $a, b, c, d \in \mathbb{R}$ satisfy $ad - bc \neq 0$, and $n \ge 2$ is an integer. In this case, the substitution
\[ y = \sqrt[n]{\frac{ax + b}{cx + d}} \]
is your friend. Indeed, observe that $y^n = \dfrac{ax + b}{cx + d}$, and thus $n y^{n-1}\,dy = \dfrac{ad - bc}{(cx + d)^2}\,dx$ and $x = \dfrac{d y^n - b}{a - c y^n}$. We can thus express both arguments of $R$, as well as $\frac{dx}{dy}$, as rational functions in $y$. We are then left with integrating a rational function in $y$, as desired.
Example 1.78
Using the substitution $y = \sqrt{\dfrac{x}{1 + x}}$, we obtain $y^2 = \dfrac{x}{1 + x}$, $2y\,dy = \dfrac{dx}{(1 + x)^2}$, $x = \dfrac{y^2}{1 - y^2}$, and then
\[ \int_0^1 \sqrt{\frac{x}{1 + x}}\,dx = \int_0^{\frac{\sqrt{2}}{2}} y \cdot 2y \left( 1 + \frac{y^2}{1 - y^2} \right)^2 dy = 2 \int_0^{\frac{\sqrt{2}}{2}} \frac{y^2}{(1 - y^2)^2}\,dy \,. \]
Let us do the partial fraction decomposition of the integrand:
\[ \frac{y^2}{(1 - y^2)^2} = \frac{y^2}{(1 - y)^2 (1 + y)^2} = \frac{a}{1 - y} + \frac{b}{(1 - y)^2} + \frac{c}{1 + y} + \frac{d}{(1 + y)^2} \,. \]
Using (b), we multiply by $(1 \pm y)^2$ and cancel the term, which gives $b = d = \frac{1}{4}$. From (a), we get $0 = -a + c$ and specifying at $y = 0$ yields $0 = a + c + \frac{1}{2}$, so that $a = c = -\frac{1}{4}$ and
\begin{align*}
2 \int_0^{\frac{\sqrt{2}}{2}} \frac{y^2}{(1 - y^2)^2}\,dy &= \frac{1}{2} \int_0^{\frac{\sqrt{2}}{2}} \left( -\frac{1}{1 - y} + \frac{1}{(1 - y)^2} - \frac{1}{1 + y} + \frac{1}{(1 + y)^2} \right) dy \\
&= \frac{1}{2} \left[ \ln|1 - y| + \frac{1}{1 - y} - \ln|1 + y| - \frac{1}{1 + y} \right]_0^{\frac{\sqrt{2}}{2}} \\
&= \sqrt{2} - \frac{1}{2} \ln\big( 3 + 2\sqrt{2} \big) \,.
\end{align*}
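Since the integrand is bounded by $\frac{\sqrt{2}}{2} \approx 0.707$ on $[0,1]$, the value of the integral must lie below $0.707$. A numerical check (my addition) confirms the value $\approx 0.5328$, i.e. $\sqrt{2} - \frac{1}{2}\ln(3 + 2\sqrt{2})$ — note in particular the minus sign in front of the logarithm.

```python
import math

def midpoint(g, a, b, n=200000):
    """Midpoint Riemann sum on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

val = midpoint(lambda x: math.sqrt(x / (1 + x)), 0, 1)
closed = math.sqrt(2) - 0.5 * math.log(3 + 2 * math.sqrt(2))
print(val, closed)  # both ≈ 0.5328
```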
Exercise 1.79 solution page 139
Compute $\displaystyle \int_0^1 \frac{x^2 + 1}{\sqrt{x + 1}}\,dx$.
1.4 Improper integrals
The integral we have defined so far is not completely satisfactory for two reasons: we might want to integrate unbounded functions and we might want to integrate on intervals that are not segments (bounded or unbounded). For instance, for the well-known fundamental formula
\[ \int_{-\infty}^{+\infty} e^{-\frac{x^2}{2}}\,dx = \sqrt{2\pi} \]
to make sense, one needs a broader notion of integral. This is what we are going to do in this section.
1.4.1 Definition and first properties
Recall from the begining of Section 1.2 the different forms an interval can have. We denote by R :=R ∪ {−∞,+∞} the set of real numbers to which we add −∞ and +∞. The extremities of an arbitraryinterval are then elements of R. We write x −−→
x∈Ia to mean that x tends to awhile staying in I . In these
notes, it can be x −−−→x>a
a, which we shorten as x >→ a, if a is the left extremity of I or x −−−→x<a
a, which
we shorten as x <→ a, if a is the right extremity of I . The notation x → a and x → a for x decreasing
39
Chapter 1. Riemann theory of integration
or increasing to a are also acceptable. The notation x → a+ and x → a− is quite widespread becausepractical but less rigorous as, obviously, a+ and a− do not represent anything.
Definition 1.80 h integrable function g
Let $I$ be an interval and $f : I \to \mathbb{R}$ be a function that is integrable on every segment included in $I$.
h Let $a \in \overline{\mathbb{R}}$ be an extremity of $I$ and $c \in I$. We say that $f$ is integrable at $a$ if
\[ \int_c^x f(t)\,dt \ \text{admits a finite limit as} \ x \xrightarrow[x \in I]{} a \,. \]
h We say that $f$ is integrable on $I$ if it is integrable at both extremities of $I$. In this case, denoting by $a$ and $b$ respectively the left and right extremities of $I$, and fixing $c \in I$, we define
\[ \int_I f = \int_I f(t)\,dt = \int_a^b f = \int_a^b f(t)\,dt := \lim_{x \overset{>}{\to} a} \int_x^c f(t)\,dt + \lim_{y \overset{<}{\to} b} \int_c^y f(t)\,dt \]
and $\displaystyle \int_b^a f = \int_b^a f(t)\,dt := -\int_a^b f$.
A few remarks are in order.
h In practice, we will mainly work with continuous functions, which are automatically integrable on every segment included in their interval of definition, by Theorem 1.18.
h The definition does not depend on the choice of $c$. Indeed, let $c, d \in I$. By Chasles’s identity,
\[ \int_c^x f(t)\,dt = \int_c^d f(t)\,dt + \int_d^x f(t)\,dt \,, \]
so that $\int_c^x f(t)\,dt$ converges as $x \to a$ if and only if $\int_d^x f(t)\,dt$ converges. Furthermore,
\[ \int_x^c f(t)\,dt + \int_c^y f(t)\,dt = \int_x^d f(t)\,dt + \underbrace{\int_d^c f(t)\,dt + \int_c^d f(t)\,dt}_{= \, 0} + \int_d^y f(t)\,dt \,. \]
h If the extremity $a$ belongs to $I$, then $f$ is automatically integrable at $a$. Indeed, one can take $c = a$ in the definition and fix $d \in I$. As $f$ is integrable on $[a, d]$, it is bounded and, for $x \in [a, d]$,
\[ \left| \int_a^x f(t)\,dt \right| \le |a - x| \sup_{[a,d]} |f| \to 0 \ \text{as} \ x \to a \,. \]
As a result, if $f$ is integrable on $I$ and $b$ denotes the other extremity of $I$, then
\[ \int_a^b f = \lim_{y \to b} \int_a^y f(t)\,dt \,. \]
From this, we also conclude that, if $I$ is a segment, then the integral defined here is equal to the integral defined earlier.
h If $I$ is not bounded or if $f$ is not integrable on $[a, b]$, where $a$ and $b$ respectively denote the left and right extremities of $I$, we speak of an improper integral (we say that $\int_I f$ is an improper integral, even if $f$ is not integrable on $I$). Furthermore, if $f$ is integrable on $I$, we say that the integral $\int_I f$ converges, whereas, if it is not integrable, we say that the integral $\int_I f$ diverges.
h In the case where both extremities of $I$ do not belong to $I$, we cannot treat them at once with a single limit. For instance, $\int_{-x}^x t\,dt = 0 \to 0$ as $x \to \infty$, but $t \in \mathbb{R} \mapsto t$ is not integrable. Indeed, $\int_0^x t\,dt = \frac{x^2}{2} \to +\infty$, so that $t \in \mathbb{R} \mapsto t$ is not integrable at $+\infty$.
When we can compute a primitive $F$ of $f$, the problem boils down to knowing whether $F$ admits limits or not at one extremity or both extremities of $I$.
Example 1.81
(i) Let us see whether \(t \in \mathbb{R}_+ \mapsto \frac{1}{1+t^2}\) is integrable:
\[
\int_0^x \frac{dt}{1+t^2} = \big[\arctan(t)\big]_0^x = \arctan x \xrightarrow[x\to\infty]{} \frac{\pi}{2} .
\]
It is thus integrable and \(\int_0^{+\infty} \frac{dt}{1+t^2} = \frac{\pi}{2}\). In terms of the geometric problem, it means that, although not bounded, the domain below the graph has finite area.

[Figure: graph of t ↦ 1/(1 + t²) on ℝ₊; the unbounded region below it has area π/2.]
(ii) The function \(t \mapsto \frac{1}{1+t}\) is not integrable on ℝ₊ as
\[
\int_0^x \frac{dt}{1+t} = \big[\ln(1+t)\big]_0^x = \ln(1+x) \xrightarrow[x\to\infty]{} +\infty .
\]
(iii) The integral \(\int_0^1 \ln(t)\,dt\) converges because
\[
\int_x^1 \ln(t)\,dt = \big[t\ln(t)\big]_x^1 - \int_x^1 \frac{t}{t}\,dt = -x\ln(x) - (1-x) \to -1 \quad \text{as } x \to 0^+ .
\]
(iv) The integral \(\int_0^1 \frac{dt}{t}\) diverges since \(\int_x^1 \frac{dt}{t} = \big[\ln(t)\big]_x^1 = -\ln(x) \to +\infty\) as \(x \to 0^+\).
(v) The integral \(\int_{-\infty}^{+\infty} \frac{2t\,dt}{(1+t^2)^2}\) converges and is equal to 0. Indeed,
\[
\int_x^0 \frac{2t\,dt}{(1+t^2)^2} = \Big[\frac{-1}{1+t^2}\Big]_x^0 = \frac{1}{1+x^2} - 1 \xrightarrow[x\to-\infty]{} -1 ,
\qquad
\int_0^y \frac{2t\,dt}{(1+t^2)^2} = \Big[\frac{-1}{1+t^2}\Big]_0^y = 1 - \frac{1}{1+y^2} \xrightarrow[y\to+\infty]{} 1 ,
\]
and the two limits sum to 0. The choice c = 0 made the computation easier, but an arbitrary value of c would simply add and subtract \(\frac{-1}{1+c^2}\).
Chapter 1. Riemann theory of integration
Exercise 1.82 (Mean of an exponential random variable) solution page 139
Let λ > 0. Compute \(\int_0^{+\infty} \lambda t e^{-\lambda t}\,dt\).
Exercise 1.83 solution page 139
For n ≥ 0, compute \(\int_0^{+\infty} t^n e^{-t}\,dt\).
Definition 1.84 (notation [·])
We extend the notation \([F]_a^b\) for a function F : (a, b) → ℝ and a, b ∈ \(\overline{\mathbb{R}}\) by setting
\[
[F]_a^b := \lim_{y\to b} F(y) - \lim_{x\to a} F(x)
\]
whenever both limits exist, at least in \(\overline{\mathbb{R}}\), and the difference does not result in an indeterminate form.
With this piece of notation, we may for instance directly write
\[
\int_0^{+\infty} \frac{dt}{1+t^2} = \big[\arctan(t)\big]_0^{+\infty} = \frac{\pi}{2} .
\]
We will, however, refrain from writing \(\big[x^2\big]_{-\infty}^{+\infty}\).
This generalized notion of integral mostly obeys the same rules as the integral on a segment.
Proposition 1.85 (Chasles's identity)
Let I be an interval and f : I → ℝ be a function that is integrable on every segment included in I. We denote by a, b ∈ \(\overline{\mathbb{R}}\) respectively the left and right extremities of I, and let c ∈ I. Then
\[
\int_a^b f(x)\,dx \text{ converges} \iff \int_c^b f(x)\,dx \text{ and } \int_a^c f(x)\,dx \text{ converge},
\]
in which case
\[
\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx .
\]
Proof. From the third remark after Definition 1.80, \(\int_c^b f(x)\,dx\) converges if and only if \(\int_c^x f(t)\,dt\) admits a limit as \(x \to b^-\), and similarly for \(\int_a^c f(x)\,dx\). The result then follows from the second remark after Definition 1.80.
Proposition 1.86 (linearity of the integral)
Let f, g : I → ℝ be two functions that are integrable on an interval I and let λ ∈ ℝ. Then the function λf + g : I → ℝ is integrable and
\[
\int_I (\lambda f + g)(x)\,dx = \lambda \int_I f(x)\,dx + \int_I g(x)\,dx .
\]
Warning 1.87
The converse is false: it is possible to find f, g not integrable such that f + g is integrable. Take for instance any nonintegrable function and its opposite.
Proof. For any x < c < y in I, one has, by linearity of the integral on the segments [x, c] and [c, y],
\[
\int_x^c (\lambda f + g) = \lambda \int_x^c f + \int_x^c g
\quad\text{and}\quad
\int_c^y (\lambda f + g) = \lambda \int_c^y f + \int_c^y g ,
\]
and the result follows by summing after taking the limits as x tends to the left extremity of I and y tends to the right extremity of I.
Proposition 1.88 (positivity of the integral)
• Let I be an interval and f, g : I → ℝ be integrable functions such that f ≤ g. Then
\[
\int_I f(x)\,dx \le \int_I g(x)\,dx .
\]
• In particular, if f : I → ℝ is an integrable function such that f ≥ 0, then \(\int_I f(x)\,dx \ge 0\).
Proof. For any x < c < y in I, one has, by positivity of the integral on the segments [x, c] and [c, y],
\[
\int_x^c f \le \int_x^c g \quad\text{and}\quad \int_c^y f \le \int_c^y g ,
\]
and the result follows by summing after taking the limits as x tends to the left extremity of I and y tends to the right extremity of I.
The notion of improper integrals is based on the convergence of functions. As the Cauchy criterion is very useful in this case, especially when the limit is not known, it is natural to translate it into the context of improper integrals. Recall that a function g : I → ℝ admits a finite limit at the left extremity a ∈ \(\overline{\mathbb{R}}\) of I if and only if
\[
\forall \varepsilon > 0,\ \exists d \in I : a < u, v < d \implies \big| g(u) - g(v) \big| < \varepsilon
\]
and at the right extremity b ∈ \(\overline{\mathbb{R}}\) of I if and only if
\[
\forall \varepsilon > 0,\ \exists d \in I : d < u, v < b \implies \big| g(u) - g(v) \big| < \varepsilon .
\]
Proposition 1.89 (Cauchy's convergence criterion)
Let I be an interval with left and right extremities a, b ∈ \(\overline{\mathbb{R}}\), and let f : I → ℝ be a function that is integrable on every segment included in I.
• Then f is integrable at a if and only if
\[
\forall \varepsilon > 0,\ \exists d \in I : a < u, v < d \implies \left| \int_u^v f(x)\,dx \right| < \varepsilon .
\]
• Similarly, f is integrable at b if and only if
\[
\forall \varepsilon > 0,\ \exists d \in I : d < u, v < b \implies \left| \int_u^v f(x)\,dx \right| < \varepsilon .
\]
Proof. We apply the Cauchy convergence criterion to the function \(g : x \in I \mapsto \int_c^x f(t)\,dt\), where c ∈ I is arbitrary, observing that \(|g(u) - g(v)| = \big| \int_u^v f(t)\,dt \big|\).
1.4.2 Nonnegative functions
When we are not able to explicitly compute primitives in order to determine the nature (convergent or divergent) and the value of an integral, we need alternative criteria to conclude about the nature of integrals. Near an extremity of its interval of definition, a function
• either has constant sign;
• or changes sign infinitely many times.
In the latter case, we speak of oscillating functions, which we will study in the next section. In the present section, we concentrate on functions whose sign becomes constant near the extremity under consideration. Up to shortening the interval of definition or taking the opposite of the function, we may without loss of generality assume that we are dealing with a nonnegative function.
[Figure: a function of constant sign near an extremity, on intervals (a, +∞) and (a, b).]
In this case, under the usual assumption that f : I → ℝ₊ is integrable on every segment included in I, the question is to know whether
\[
x \mapsto \int_c^x f(t)\,dt
\]
converges to a finite limit as x tends to the extremities of I. By positivity (using that \(f\mathbf{1}_{[c,x]} \le f\mathbf{1}_{[c,y]}\) for c ≤ x ≤ y, or Chasles's identity), we see that the displayed function is nondecreasing. As a result, it always admits limits (finite or infinite) at the extremities of I. More precisely, denoting as usual by a, b ∈ \(\overline{\mathbb{R}}\) the left and right extremities of I,
\[
f \text{ is not integrable at } a \iff \int_x^c f(t)\,dt \xrightarrow[x\to a^+]{} +\infty ;
\qquad
f \text{ is not integrable at } b \iff \int_c^y f(t)\,dt \xrightarrow[y\to b^-]{} +\infty .
\]
Definition 1.90 (integral of nonintegrable nonnegative functions)
It makes sense to extend the piece of notation \(\int_I f\) in this context of nonnegative functions by setting
\[
\int_I f := +\infty
\]
if f is not integrable.
In this setting, it always holds that
\[
\int_a^b f(t)\,dt = \lim_{x\to a^+} \int_x^c f(t)\,dt + \lim_{y\to b^-} \int_c^y f(t)\,dt \;\in\; \mathbb{R}_+ \cup \{+\infty\} .
\]
We similarly extend the notation to nonpositive functions, replacing +∞ with −∞. Let us mention here that linearity still holds for nonintegrable functions, provided the multiplying scalar is nonnegative. This is only used for computational purposes.
Proposition 1.91 (linearity for nonnegative functions)
Let f, g : I → ℝ₊ be two functions that are integrable on every segment of an interval I and let α ∈ ℝ₊. Then the following equality holds in ℝ₊ ∪ {+∞}:
\[
\int_I (\alpha f + g)(x)\,dx = \alpha \int_I f(x)\,dx + \int_I g(x)\,dx .
\]
Proof. Observe first that αf + g is a nonnegative function. By linearity, for any x ≤ c ≤ y, one has \(\int_x^c (\alpha f + g) = \alpha \int_x^c f + \int_x^c g\) and \(\int_c^y (\alpha f + g) = \alpha \int_c^y f + \int_c^y g\). The result follows by summing after taking the limits as x tends to the left extremity of I and y tends to the right extremity of I.
Proposition 1.92 (sequential criterion)
Let I be an interval with left and right extremities a, b ∈ \(\overline{\mathbb{R}}\), and let f : I → ℝ₊ be a nonnegative function that is integrable on every segment included in I.
• For any sequences \((a_n), (b_n) \in I^{\mathbb{N}}\) such that \(a_n \to a\) and \(b_n \to b\) as n → ∞,
\[
\int_{a_n}^{b_n} f(x)\,dx \to \int_a^b f(x)\,dx \;\in\; \mathbb{R}_+ \cup \{+\infty\} .
\]
• In particular, if there exist two sequences \((a_n), (b_n) \in I^{\mathbb{N}}\) such that \(a_n \to a\) and \(b_n \to b\) as n → ∞ and
\[
\int_{a_n}^{b_n} f(x)\,dx
\]
admits a finite limit as n → ∞, then f is integrable.
Proof. This comes from the above remark that \(y \mapsto \int_c^y f(t)\,dt\) is nondecreasing and thus converges, as \(y \to b^-\), toward \(\int_c^b f(t)\,dt\). The result follows from the sequential criterion for the limit of a function.
Remark 1.93
This no longer holds if one drops the positivity assumption: for instance,
\[
\int_0^{n\pi} \cos(x)\,dx = \big[\sin(x)\big]_0^{n\pi} = 0 \to 0
\]
as n → ∞, whereas cos is clearly not integrable on ℝ₊ (\(\int_0^x \cos(t)\,dt = \sin(x)\) does not converge).
We use the following proposition in order to decide whether a nonnegative function is integrable or not.
Proposition 1.94 (comparison of nonnegative functions)
Let f, g : I → ℝ₊ be two nonnegative functions that are integrable on every segment of an interval I and are such that f ≤ g.
• Then the following inequality holds in ℝ₊ ∪ {+∞}:
\[
\int_I f(x)\,dx \le \int_I g(x)\,dx .
\]
• In particular, if g is integrable, then f is also integrable. If f is not integrable, then neither is g.
Proof. By positivity, for any x ≤ c ≤ y, one has \(\int_x^c f \le \int_x^c g\) and \(\int_c^y f \le \int_c^y g\). The result follows by summing after taking the limits as x tends to the left extremity of I and y tends to the right extremity of I.
This proposition is mainly used through its second item (which is better remembered as an application of the first item). Namely, we bound the function under study either from below with a nonintegrable nonnegative function or from above with an integrable nonnegative function. Recall that we are merely interested in what happens near an extremity of the interval of study, so that this comparison only needs to hold in this vicinity.
Example 1.95
Let us show that
\[
\int_1^{+\infty} t^\alpha e^{-t}\,dt
\]
converges, for any value of α ∈ ℝ. The integrand is a positive function, so that we can use the previous proposition. The idea is thus to bound the integrand from above with a function that we know is integrable.
The point is that \(t \mapsto e^{-t}\) thwarts the behavior of \(t \mapsto t^\alpha\), no matter the value of α. More precisely, we write \(t^\alpha e^{-t} = t^\alpha e^{-t/2}\, e^{-t/2}\) and observe that \(t \in [1,+\infty) \mapsto t^\alpha e^{-t/2}\) is bounded from above by some constant \(M_\alpha\), since \(t^\alpha e^{-t/2} \to 0\) as t → ∞. As a result, for all t ≥ 1,
\[
t^\alpha e^{-t} \le M_\alpha\, e^{-t/2} .
\]
We conclude by noticing that \(t \in [1,+\infty) \mapsto e^{-t/2}\) is integrable:
\[
\int_1^x e^{-t/2}\,dt = \big[ -2e^{-t/2} \big]_1^x = 2e^{-1/2} - 2e^{-x/2} \xrightarrow[x\to+\infty]{} 2e^{-1/2} .
\]
The exact value of the latter integral does not matter: we could simply note that it is less than \(2e^{-1/2}\), so that the integral is bounded.
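The bound above can be spot-checked numerically. In the sketch below (Python; the grid and the sample values of α are arbitrary choices of ours), the constant \(M_\alpha\) is taken as the maximum of \(t^\alpha e^{-t/2}\) on [1, +∞), attained at t = max(1, 2α):

```python
import math

def M(alpha):
    # sup over [1, +inf) of t^alpha * exp(-t/2); the critical point is t = 2*alpha.
    t_star = max(1.0, 2 * alpha)
    return t_star ** alpha * math.exp(-t_star / 2)

# Check t^alpha * e^{-t} <= M(alpha) * e^{-t/2} on a grid of t >= 1.
ok = all(
    t ** a * math.exp(-t) <= M(a) * math.exp(-t / 2) + 1e-12
    for a in (0.5, 1.0, 3.0, 7.5)
    for t in (1 + 0.1 * k for k in range(1000))
)
print(ok)
```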
We now make the previous proposition a bit more precise. Recall the following notation for real functions f, g defined in a neighborhood of b ∈ \(\overline{\mathbb{R}}\):
• \(f(x) = O(g(x))\) when x → b if there exists M > 0 such that |f(x)| ≤ M |g(x)| for every x close to b;
• \(f(x) \sim g(x)\) when x → b if, for all ε > 0, we have |f(x) − g(x)| < ε |g(x)| for x close to b;
• \(f(x) = o(g(x))\) when x → b if, for all ε > 0, we have |f(x)| < ε |g(x)| for x close to b.
Theorem 1.96 (refined comparison of nonnegative functions)
Let f, g : I → ℝ₊ be two nonnegative functions that are integrable on every segment of an interval I and let b ∈ \(\overline{\mathbb{R}}\) denote the right extremity of I.
(i) Assume \(f(x) = O(g(x))\) when \(x \to b^-\). Then, if g is integrable at b, f is also integrable at b. Equivalently, if f is not integrable at b, then g is not integrable at b either.
(ii) Assume \(f(x) \sim g(x)\) when \(x \to b^-\). Then f is integrable at b if and only if g is integrable at b. Moreover,
(a) if g is integrable at b, then \(\int_x^b f(y)\,dy \sim \int_x^b g(y)\,dy\) when \(x \to b^-\);
(b) if g is not integrable at b, then, for c ∈ I fixed, \(\int_c^x f(y)\,dy \sim \int_c^x g(y)\,dy\) when \(x \to b^-\).
Remark 1.97
If f : I → ℝ is not assumed nonnegative and g : I → ℝ₊ is such that f(x) ∼ g(x) when \(x \to b^-\), then f is actually nonnegative in the vicinity of b. Indeed, taking ε = 1/2 in the definition yields that \(0 \le \frac{1}{2} g(x) \le f(x)\) for x close enough to b.
Proof. (i) By definition, there exist M > 0 and c ∈ I such that f(x) ≤ M g(x) for all x ∈ [c, b). We conclude by applying Proposition 1.94 to the nonnegative functions \(f|_{[c,b)}\) and \(Mg|_{[c,b)}\).
(ii) Taking for instance ε = 1/2 in the definition of f(x) ∼ g(x), there exists d ∈ I such that, for all x ∈ [d, b),
\[
\tfrac{1}{2}\, g(x) \le f(x) \le \tfrac{3}{2}\, g(x) ,
\]
so that both \(f(x) = O(g(x))\) and \(g(x) = O(f(x))\) when \(x \to b^-\). By (i), f is integrable at b if and only if g is integrable at b.
Now, for any fixed ε > 0, there exists \(d_\varepsilon \in I\) such that, for all \(y \in [d_\varepsilon, b)\),
\[
(1-\varepsilon)\, g(y) \le f(y) \le (1+\varepsilon)\, g(y) .
\]
(a) If g is integrable at b, then both f and g are integrable on \([d_\varepsilon, b)\), so that, by Proposition 1.88, for every \(x \in [d_\varepsilon, b)\),
\[
(1-\varepsilon) \int_x^b g(y)\,dy \le \int_x^b f(y)\,dy \le (1+\varepsilon) \int_x^b g(y)\,dy .
\]
This exactly means that \(\int_x^b f(y)\,dy \sim \int_x^b g(y)\,dy\) when \(x \to b^-\).
(b) If g is not integrable at b, by positivity on \([d_\varepsilon, x]\) for every \(x \in [d_\varepsilon, b)\),
\[
(1-\varepsilon) \int_{d_\varepsilon}^x g(y)\,dy \le \int_{d_\varepsilon}^x f(y)\,dy \le (1+\varepsilon) \int_{d_\varepsilon}^x g(y)\,dy .
\]
By Chasles's identity, the latter inequality becomes
\[
\int_c^x f(y)\,dy - \int_c^{d_\varepsilon} f(y)\,dy \le (1+\varepsilon)\Big( \int_c^x g(y)\,dy - \int_c^{d_\varepsilon} g(y)\,dy \Big)
\]
and then
\[
\int_c^x f(y)\,dy \le (1+\varepsilon) \int_c^x g(y)\,dy + \int_c^{d_\varepsilon} \big( f(y) - (1+\varepsilon) g(y) \big)\,dy .
\]
As g is not integrable at b, the quantity \(\int_c^x g(y)\,dy \to \infty\) as x → b, so that, for x close enough to b, the constant \(\int_c^{d_\varepsilon} \big( f(y) - (1+\varepsilon) g(y) \big)\,dy\) is smaller than \(\varepsilon \int_c^x g(y)\,dy\) and thus
\[
\int_c^x f(y)\,dy \le (1+2\varepsilon) \int_c^x g(y)\,dy .
\]
The same argument shows that
\[
\int_c^x f(y)\,dy \ge (1-\varepsilon) \int_c^x g(y)\,dy + \int_c^{d_\varepsilon} \big( f(y) - (1-\varepsilon) g(y) \big)\,dy ,
\]
and, for x close enough to b,
\[
\int_c^x f(y)\,dy \ge (1-2\varepsilon) \int_c^x g(y)\,dy .
\]
The result follows.
Example 1.98
The integral \(\int_1^{+\infty} \frac{x^5 + 2x^4 - 5}{x^3 + 5x}\, e^{-x}\,dx\) converges because
\[
\frac{x^5 + 2x^4 - 5}{x^3 + 5x}\, e^{-x} \sim x^2 e^{-x}
\]
when x → ∞ and \(x \in [1,+\infty) \mapsto x^2 e^{-x}\) is integrable, as we saw in Example 1.95.
We end this section with two families of integrals that are quite natural candidates for comparison arguments. The easiest ones are power functions.
Proposition 1.99 (Riemann integrals)
Let α ∈ ℝ. The following holds.
(i) \(\int_1^{+\infty} \frac{1}{x^\alpha}\,dx < \infty\) if and only if α > 1.
(ii) \(\int_0^1 \frac{1}{x^\alpha}\,dx < \infty\) if and only if α < 1.
Proof. For α ≠ 1, we have
\[
F_\alpha(x) := \int_1^x \frac{1}{t^\alpha}\,dt = \frac{x^{-\alpha+1} - 1}{-\alpha+1} .
\]
Then, as x → +∞, \(F_\alpha(x)\) admits a finite limit if and only if −α + 1 < 0, that is, α > 1. And, as x → 0, \(F_\alpha(x)\) admits a finite limit if and only if −α + 1 > 0, that is, α < 1. Finally, if α = 1,
\[
F_1(x) := \int_1^x \frac{1}{t}\,dt = \ln(x) ,
\]
so that \(F_1(x) \to +\infty\) as x → +∞ and \(-F_1(x) \to +\infty\) as x → 0. (The minus sign comes from the fact that \(-F_1(x) = \int_x^1 t^{-1}\,dt\).)
It is sometimes useful to be a bit more precise. At +∞, the previous proposition says that \(x \mapsto \frac{1}{x}\) is not integrable but \(x \mapsto \frac{1}{x^{1+\varepsilon}}\) is integrable for any ε > 0. In order to decide for a function f such that
\[
\frac{1}{x^{1+\varepsilon}} \le f(x) \le \frac{1}{x} \quad \text{for all } \varepsilon > 0,
\]
one needs more sophisticated comparison functions.
Proposition 1.100 (Bertrand integrals)
The following holds:
\[
\int_e^{+\infty} \frac{dx}{x (\ln(x))^\beta} < \infty \quad\text{if and only if}\quad \beta > 1 .
\]
Proof. Integrating by substitution with y = ln(x), so that dy = dx/x, we see that
\[
\int_e^z \frac{dx}{x(\ln(x))^\beta} = \int_1^{\ln(z)} \frac{dy}{y^\beta}
\]
admits a finite limit as z → ∞ if and only if β > 1, by the previous proposition.
If one needs further refinement, the method can be iterated at will, substituting y = ln(x), then x = ln(t), then t = ln(u), and so on:
\[
\int_1^z \frac{dy}{y^\beta}
= \int_e^{e^z} \frac{dx}{x(\ln(x))^\beta}
= \int_{e^e}^{e^{e^z}} \frac{dt}{t \ln(t) (\ln(\ln(t)))^\beta}
= \int_{e^{e^e}}^{e^{e^{e^z}}} \frac{du}{u \ln(u) \ln(\ln(u)) (\ln(\ln(\ln(u))))^\beta}
= \dots ,
\]
always concluding from the case of Riemann integrals that it converges to a finite limit if and only if β > 1. It is a good idea to learn Proposition 1.99 by heart; for Bertrand integrals and the generalization, it is better to remember the method, that is, to use the substitution y = ln(x).
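The substitution can also be verified numerically. The sketch below (Python, with a hand-rolled trapezoidal rule; the choices β = 2 and cutoff z = 10⁴ are arbitrary) computes both sides of the change of variables:

```python
import math

def trapz(f, a, b, n=200_000):
    # Composite trapezoidal rule on [a, b].
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

beta, z = 2.0, 1e4
lhs = trapz(lambda x: 1 / (x * math.log(x) ** beta), math.e, z)
rhs = trapz(lambda y: y ** -beta, 1.0, math.log(z))
print(lhs, rhs)  # both approximate 1 - 1/ln(z)
```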
Example 1.101
Let us see whether
\[
\int_2^{+\infty} \sqrt{x^2 + 3x}\; \ln\Big( \cos\Big(\frac{1}{x}\Big) \Big)\, \sin^2\Big( \frac{1}{\ln(x)} \Big)\,dx
\]
converges. The problem is at +∞. When x → ∞,
\[
\sqrt{x^2 + 3x} = x\sqrt{1 + \frac{3}{x}} \sim x , \qquad
\ln\Big( \cos\Big(\frac{1}{x}\Big) \Big) = \ln\Big( 1 - \frac{1}{2x^2} + o\Big(\frac{1}{x^2}\Big) \Big) \sim -\frac{1}{2x^2} , \qquad
\sin^2\Big( \frac{1}{\ln(x)} \Big) \sim \Big( \frac{1}{\ln(x)} \Big)^2 ,
\]
so that the integrand is equivalent to
\[
-\frac{1}{2 x (\ln(x))^2} .
\]
We are thus dealing with functions that are negative in the vicinity of +∞. By Theorem 1.96 and Proposition 1.100, the integral under study converges.
Exercise 1.102 solution page 140
Let α > 0. Do the following integrals converge?
(i) \(\int_{-\infty}^{\pi} \alpha^t\,dt\)  (ii) \(\int_{\pi^2}^{+\infty} \big( t^{-\alpha} - \sin(t^{-\alpha}) \big)\,dt\)  (iii) \(\int_1^{+\infty} \big( 1 - \sqrt[3]{1 + t^{-\alpha}}\, \big)\,dt\)
1.4.3 Oscillating functions
We now concentrate on functions that change sign infinitely many times near the extremity under investigation.
[Figure: an oscillating function near an extremity, on intervals (a, +∞) and (a, b).]
In contrast with nonnegative functions, for such a function f : I → ℝ that is integrable on every segment included in I, the function
\[
x \mapsto \int_c^x f(t)\,dt
\]
may have no limit (finite or infinite) as x tends to the extremities of I. The most favorable case is when the absolute value of the function is integrable.
Definition 1.103 (absolutely integrable function)
• Let I be an interval and f : I → ℝ be a function that is integrable on every segment included in I. The function f is called absolutely integrable if the function |f| : x ∈ I ↦ |f(x)| is integrable.
• If f : I → ℝ is absolutely integrable, then the integral \(\int_I f\) is said to be absolutely convergent.
Theorem 1.104 (absolute integrability)
Let f : I → ℝ be an absolutely integrable function. Then f is integrable and
\[
\left| \int_I f(x)\,dx \right| \le \int_I |f(x)|\,dx .
\]
Proof. This is a consequence of Cauchy's convergence criterion (Proposition 1.89) applied to |f| then to f. Let a, b ∈ \(\overline{\mathbb{R}}\) denote the left and right extremities of I. As |f| is integrable at b, for any fixed ε > 0, there exists d ∈ I such that
\[
d < u, v < b \implies \left| \int_u^v |f(x)|\,dx \right| < \varepsilon .
\]
For d < u, v < b, by Proposition 1.27.(iv),
\[
\left| \int_u^v f(x)\,dx \right| \le \left| \int_u^v |f(x)|\,dx \right| < \varepsilon ,
\]
so that f is also integrable at b. Furthermore, for a fixed c ∈ I,
\[
\left| \int_c^b f(x)\,dx \right| = \lim_{y\to b^-} \left| \int_c^y f(x)\,dx \right| \le \lim_{y\to b^-} \int_c^y |f(x)|\,dx = \int_c^b |f(x)|\,dx .
\]
One proves similarly that f is integrable at a and satisfies \(\big| \int_a^c f(x)\,dx \big| \le \int_a^c |f(x)|\,dx\), so that f is integrable and
\[
\left| \int_I f(x)\,dx \right|
\le \left| \int_a^c f(x)\,dx \right| + \left| \int_c^b f(x)\,dx \right|
\le \int_a^c |f(x)|\,dx + \int_c^b |f(x)|\,dx
= \int_I |f(x)|\,dx .
\]
Exercise 1.105 solution page 140
Show that \(t \in [1,+\infty) \mapsto \frac{\sin(t)}{t^2}\) is absolutely integrable.
Warning 1.106
In contrast with integrals on segments (Proposition 1.27.(iv)), the converse of Theorem 1.104 is false.
Example 1.107 (Dirichlet integral)
The integral \(\int_1^{+\infty} \frac{\sin(t)}{t}\,dt\) provides a counter-example.
• It is convergent. Indeed, integrating by parts,
\[
\int_1^x \frac{\sin(t)}{t}\,dt
= \Big[ \frac{-\cos(t)}{t} \Big]_1^x - \int_1^x \frac{\cos(t)}{t^2}\,dt
= \cos(1) - \frac{\cos(x)}{x} - \int_1^x \frac{\cos(t)}{t^2}\,dt .
\]
As x → ∞, the second term tends to 0 because \(\big| \frac{\cos(x)}{x} \big| \le \frac{1}{x} \to 0\), and the third term converges because \(t \in [1,+\infty) \mapsto \frac{\cos(t)}{t^2}\) is absolutely integrable (see Exercise 1.105), hence integrable.
• It is not absolutely convergent. This comes from the fact that t ↦ |sin(t)| stays bounded from below on a constant fraction of the time and t ↦ t⁻¹ is not integrable. More precisely, for any fixed 0 < θ < π/2,
\[
\int_\pi^{N\pi} \Big| \frac{\sin(t)}{t} \Big|\,dt
= \sum_{k=2}^N \int_{(k-1)\pi}^{k\pi} \frac{|\sin(t)|}{t}\,dt
\ge \sum_{k=2}^N \int_{(k-1)\pi+\theta}^{k\pi-\theta} \frac{|\sin(t)|}{t}\,dt
\ge \sum_{k=2}^N \int_{(k-1)\pi+\theta}^{k\pi-\theta} \frac{\sin(\theta)}{k\pi}\,dt
= \frac{\sin(\theta)(\pi - 2\theta)}{\pi} \sum_{k=2}^N \frac{1}{k} \xrightarrow[N\to\infty]{} \infty .
\]
Instead of using θ, one might also directly bound from below:
\[
\int_{(k-1)\pi}^{k\pi} \frac{|\sin(t)|}{t}\,dt \ge \frac{1}{k\pi} \int_{(k-1)\pi}^{k\pi} |\sin(t)|\,dt = \frac{2}{k\pi} .
\]
Another classical yet more obscure proof consists in noticing that
\[
\Big| \frac{\sin(t)}{t} \Big| \ge \frac{\sin^2(t)}{t} = \frac{1 - \cos(2t)}{2t} ,
\]
and concluding, similarly as above by an integration by parts, that \(t \in [1,+\infty) \mapsto \frac{\cos(2t)}{2t}\) is integrable. The remaining term \(t \in [1,+\infty) \mapsto \frac{1}{2t}\) being not integrable, the sum cannot be integrable.
The example above illustrates the typical situation where the lack of decay at infinity (the fact that t ↦ t⁻¹ is not integrable) is compensated by the oscillations of the function (the t ↦ sin(t) part). The following theorem gives a general criterion allowing one to handle this phenomenon.
Theorem 1.108 (Abel's criterion)
Let I be an interval with right extremity b ∈ \(\overline{\mathbb{R}}\), and let f ∈ C(I), g ∈ C¹(I). We suppose that there exists c ∈ I such that
• g is nonincreasing on [c, b) and g(x) → 0 as \(x \to b^-\);
• the function \(x \in [c, b) \mapsto \int_c^x f(t)\,dt\) is bounded.
Then \(\int_c^b f(x)\,g(x)\,dx\) converges.
Example 1.109 (Dirichlet integral)
Applying the theorem with f : x ∈ [1,+∞) ↦ sin(x) and g : x ∈ [1,+∞) ↦ 1/x, we recover that \(\int_1^{+\infty} \frac{\sin(x)}{x}\,dx\) converges.
Proof. It is as in Example 1.107. Let \(F : x \in [c, b) \mapsto \int_c^x f(t)\,dt\) and M > 0 be such that |F| ≤ M. Integrating by parts,
\[
\int_c^x f(t)\,g(t)\,dt
= \big[ F(t)\,g(t) \big]_c^x - \int_c^x F(t)\,g'(t)\,dt
= \underbrace{F(x)g(x)}_{\to 0} - F(c)g(c) - \int_c^x F(t)g'(t)\,dt .
\]
As F is bounded and g(x) → 0 as x → b, we have F(x)g(x) → 0. It remains to see that \(t \in [c, b) \mapsto F(t)g'(t)\) is integrable. In fact, it is absolutely integrable: as g′ is nonpositive on [c, b),
\[
\int_c^x \big| F(t)\,g'(t) \big|\,dt \le M \int_c^x \big( -g'(t) \big)\,dt = M \big( g(c) - g(x) \big) \xrightarrow[x\to b^-]{} M\,g(c) .
\]
This proves that the left-hand side integral is bounded and thus admits a finite limit as \(x \to b^-\).
Exercise 1.110 solution page 140
Let α > 0 be a real number and n ∈ ℕ. Study the convergence and absolute convergence of
\[
\int_1^{+\infty} \frac{\sin^n(t)}{t^\alpha}\,dt .
\]
Exercise 1.111 solution page 141
Study the convergence of \(\int_0^1 \frac{\sin(1/t)}{t}\,dt\).
Exercise 1.112 solution page 141
Compute \(\int_0^{\pi/2} \ln\big(\sin(t)\big)\,dt\) and \(\int_0^{\pi/2} \ln\big(\cos(t)\big)\,dt\).
Hint: find relations between these integrals and use their sum.
1.4.4 Comparison of series with integrals
We come back to the link between sums and integrals. The point is to compare sums of the form \(\sum_{n=k}^{\infty} f(n)\) with integrals of the form \(\int_a^{+\infty} f(x)\,dx\). The idea is to use Chasles's identity and write, for any integer n ≥ ⌈a⌉ + 1,
\[
\int_a^n f(x)\,dx = \int_a^{\lceil a \rceil} f(x)\,dx + \sum_{k=\lceil a \rceil}^{n-1} \int_k^{k+1} f(x)\,dx
\]
and then use monotonicity in order to bound the integrand f(x) on [a, ⌈a⌉] and on each interval [k, k+1]. For instance, a convenient setup is to consider nonincreasing functions.
Proposition 1.113 (comparison of series with integrals)
Let p be an integer and f : [p,+∞) → ℝ be a nonincreasing function that is integrable on every segment included in [p,+∞). Then, for all n ≥ p + 1,
\[
\sum_{k=p+1}^n f(k) \le \int_p^n f(x)\,dx \le \sum_{k=p}^{n-1} f(k) .
\]
Consequently, the following inequalities hold in \(\overline{\mathbb{R}}\):
\[
\sum_{k=p+1}^{+\infty} f(k) \le \int_p^{+\infty} f(x)\,dx \le \sum_{k=p}^{+\infty} f(k) .
\]
In particular, the series with terms (f(k))ₖ converges if and only if f is integrable.
Remark 1.114
The second statement is only useful under the assumption that f(x) → 0 as x → ∞. Indeed, as f is nonincreasing, it converges to some ℓ ∈ \(\overline{\mathbb{R}}\). If ℓ > 0, the three quantities appearing in the second display of inequalities are clearly equal to +∞, and, if ℓ < 0, they are clearly equal to −∞.
Proof. By Chasles's identity,
\[
\int_p^n f(x)\,dx = \sum_{k=p}^{n-1} \int_k^{k+1} f(x)\,dx .
\]
It then remains to bound the integrand of the integrals in the sum using the monotonicity of f. We obtain
\[
f(k+1) = \int_k^{k+1} f(k+1)\,dx \le \int_k^{k+1} f(x)\,dx \le \int_k^{k+1} f(k)\,dx = f(k)
\]
and the result follows by summation. As f is nonincreasing, it is either nonnegative or eventually negative. Consequently, the three quantities in the first display of inequalities are either nondecreasing in n or eventually decreasing in n, and in any case admit limits in \(\overline{\mathbb{R}}\). The second display is obtained by taking the limit as n → ∞. Finally, by the sequential criterion, f is integrable if and only if \(\int_p^{+\infty} f(x)\,dx < \infty\).
Example 1.115 (Riemann zeta function)
Let α ∈ ℝ. The series \(\sum_{k\ge 1} \frac{1}{k^\alpha}\) converges if and only if α > 1.
The case α ≤ 0 is obvious. For α > 0, the function \(x \in \mathbb{R}_+^\star \mapsto x^{-\alpha}\) is continuous, thus integrable on every segment, nonnegative and nonincreasing. We know from Proposition 1.99 that it is integrable if and only if α > 1.
In spite of the analogy between series and integrals explained above, beware that there are important differences between these two notions. For instance, recall that
\[
\sum_{k=1}^{\infty} u_k < \infty \implies \lim_{k\to\infty} u_k = 0 .
\]
(Recall also that the converse is not true: take for instance \(u_k = k^{-\alpha}\) with 0 < α < 1.) In contrast,
\[
\int_1^{\infty} f(x)\,dx < \infty \;\not\Longrightarrow\; \lim_{x\to\infty} f(x) = 0 ,
\]
even under the extra assumption that f is nonnegative. In fact, the integrability of f at +∞ does not even imply that f is bounded. Let us construct an example of a nonnegative continuous unbounded function f that is integrable at +∞. We let f be equal to 0 except around integer values, where it has a triangular shape: for each k ∈ ℕ, its graph follows the sides of the isosceles triangle with basis \([k - 2^{-2k}, k + 2^{-2k}]\) and height \(2^k\).
[Figure: graph of f, vanishing everywhere except for triangular spikes of height 2^k around each integer k.]
The area of the k-th triangle is
\[
\int_{k-2^{-2k}}^{k+2^{-2k}} f(x)\,dx = 2^{-k} ,
\]
so that
\[
\int_0^{+\infty} f(x)\,dx = \sum_{k=1}^{+\infty} 2^{-k} = 1 < \infty .
\]
Exercise 1.116 solution page 142
Tweak the above example into a positive continuous integrable function \(\mathbb{R} \to \mathbb{R}_+^\star\), unbounded both at −∞ and +∞.
The above example is ad hoc and may look "unnatural" in the sense that you could expect never to encounter such an example again. The exercise below gives a more natural-looking example.
Exercise 1.117 (Fresnel integral) solution page 142
Show that the function \(x \in \mathbb{R}_+ \mapsto \sin(x^2)\) is integrable.
2 Plane parametric curves
This chapter is devoted to the study of plane parametric curves. We will see their fundamental properties and how to sketch a plane curve. We will cover parametric curves in Cartesian coordinates and in polar coordinates.
Here is an incentive to read this chapter:
If you want to learn more about parametric curves, you may consult
• [Ste16, 10];
• [LM07, 30 (I–III)], in French.
2.1 Introduction
2.1.1 Motivation
2.1.2 Preliminaries
2.2 First definitions
2.3 Tangents
2.3.1 Definition
2.3.2 Link with derivatives
2.3.3 Local behavior
2.4 Sketching
2.4.1 Interval of study
2.4.2 Asymptotes
2.4.3 Sketching plan
2.5 Polar curves
2.5.1 Polar coordinates
2.5.2 Polar curves
2.5.3 What is the difference with a usual graph?
2.5.4 Tangents
2.5.5 Extremities of the interval of study
2.5.6 Sketching
2.1 Introduction
2.1.1 Motivation
So far, you've learned a lot about curves associated with equations of the form y = f(x) for some function f, and you can also easily deal with equations of the form x = f(y) simply by exchanging the
roles of x and y (even if it can sometimes be confusing in practice). But not all curves can easily be described in such a way. . . Take for instance the radius-1 circle centered at the origin:
\[
\mathscr{C} = \big\{ (x, y) : x^2 + y^2 = 1 \big\} .
\]

[Figure: the unit circle C.]
Recall that the graph of a function f : I → ℝ is the set \(\{ (x, f(x)) : x \in I \}\). In particular, it cannot contain two different points with the same abscissa (x, y₁) ≠ (x, y₂), as this would imply that f(x) = y₁ and f(x) = y₂. In other words, each vertical line can only intersect the graph at most once. From this, we see that it is not possible to find a function whose graph is the circle \(\mathscr{C}\) (even if we rotate the axes). Sure, we can write \(\mathscr{C}\) as the union of the two disjoint graphs of the functions
\[
x \in [-1, 1) \mapsto \sqrt{1 - x^2} \quad\text{and}\quad x \in (-1, 1] \mapsto -\sqrt{1 - x^2} \tag{2.1}
\]
but this is not really satisfactory. In particular, it singles out the two points (−1, 0) and (1, 0), which are in no way different from any other point of the circle.
Instead, it can prove convenient to introduce an extra variable t, called a parameter, and express x and y as functions of t. The motion of a point evolving with time is sometimes naturally given in such a form, in particular if this motion is driven by physics equations and laws.
Remark 2.1
The parameter is often denoted by t as it can be thought of as time. Of course, it can sometimes be denoted otherwise: θ for an angle, r, etc. In order to make the presentation as clear as possible, we will reserve the terminology point for points of the plane and use the terminology time when referring to a value the parameter can take.
In the case of the circle \(\mathscr{C}\), taking for the parameter the angle to the x-axis gives
\[
\begin{cases} x = \cos(t) \\ y = \sin(t) \end{cases} \tag{2.2}
\]
and we need to specify the values of t we consider. In this case, we can choose for instance t ∈ ℝ (which means that the circle is actually parameterized an infinite number of times). The description (2.2) of a curve encapsulates more information than (2.1), as the position of the particle is known at every time t. Note that (2.2) is not the only way to parameterize the circle: for instance, we could also use
\[
\begin{cases} x = \cos(-3t) \\ y = \sin(-3t) \end{cases} \qquad t \in \mathbb{R} ,
\]
which gives the same circle \(\mathscr{C}\).
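Both parameterizations can be checked to trace the same circle (Python; the sample grid of 1000 times is an arbitrary choice):

```python
import math

def on_circle(x, y, tol=1e-12):
    # A point (x, y) lies on C exactly when x^2 + y^2 = 1.
    return abs(x * x + y * y - 1) < tol

ts = [2 * math.pi * i / 1000 for i in range(1000)]
ok_slow = all(on_circle(math.cos(t), math.sin(t)) for t in ts)
ok_fast = all(on_circle(math.cos(-3 * t), math.sin(-3 * t)) for t in ts)
print(ok_slow, ok_fast)
```

The second parameterization runs through the same point set three times as fast and in the opposite direction, which is exactly the extra information a parameterization carries over the bare curve.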
2.1.2 Preliminaries
Let E2 denote the Euclidean plane and (O;~ı,~) denote an orthonormal basis of E2. We identify E2
with R2, that is, identify the vector x~ı + y~ ∈ E2 with the point (x, y) ∈ R2. In some practical cases,we can also identify R2 with the complex plane C, that is, identify the point (x, y) with the complexnumber x+ iy. We denote by ‖ · ‖ the usual Euclidean norm defined by
∀(x, y) ∈ R2,∥∥(x, y)
∥∥ :=
√
x2 + y2 .
In this context, the usual Euclidean distance between the two vectors ~a and ~b ∈ E2 is equal to ‖~a −~b‖.From now on, we consider a vector function
~v : I → E2
t 7→ ~v(t)
defined on an interval I ⊆ R. Furthermore, for each t ∈ I , we denote by x(t) and y(t) the coordinatesof the vector ~v(t). In other words, we write
~v(t) = x(t)~ı + y(t)~ .
Definition 2.2
Let t₀ be an adherent timeᵃ of I. We say that the vector function \(\vec{v}\) has limit \(\vec{a} \in E_2\) when t → t₀ if \(\|\vec{v}(t) - \vec{a}\| \to 0\) as t → t₀. If so, we write \(\vec{v}(t) \to \vec{a}\) as t → t₀ and also say that \(\vec{v}(t)\) tends to \(\vec{a}\) as t → t₀.
ᵃ An adherent time of an interval is either a time in the interval or an extremity of the interval. For instance, a is an adherent time of (a, b).
Proposition 2.3
Let \(\vec{a} = x_0\vec{\imath} + y_0\vec{\jmath}\). Then \(\vec{v}(t) \to \vec{a}\) as t → t₀ if and only if x(t) → x₀ and y(t) → y₀ as t → t₀.
Proof. As \(\|\vec{v}(t) - \vec{a}\| = \sqrt{(x(t)-x_0)^2 + (y(t)-y_0)^2}\), we see that if x(t) → x₀ and y(t) → y₀ as t → t₀, then \(\|\vec{v}(t) - \vec{a}\| \to 0\) as t → t₀. Conversely, \(|x(t) - x_0| \le \|\vec{v}(t) - \vec{a}\|\) and \(|y(t) - y_0| \le \|\vec{v}(t) - \vec{a}\|\), so that, if \(\vec{v}(t) \to \vec{a}\) as t → t₀, then both x(t) → x₀ and y(t) → y₀ as t → t₀.
Definition 2.4
Let \(\vec{v} : I \to E_2\) be a vector function and t₀ ∈ I.
• The vector function \(\vec{v}\) is continuous at t₀ if \(\vec{v}(t) \to \vec{v}(t_0)\) as t → t₀.
• The vector function \(\vec{v}\) is differentiable at t₀ if
\[
\frac{\vec{v}(t) - \vec{v}(t_0)}{t - t_0}
\]
has a limit as t → t₀. If so, we denote this limit by \(\vec{v}\,'(t_0)\) or \(\frac{d\vec{v}}{dt}(t_0)\).
Proposition 2.5
The following holds.
(i) The vector function \(\vec{v}\) is continuous if and only if its coordinates are continuous.
(ii) The vector function \(\vec{v}\) is differentiable if and only if its coordinates are differentiable. In this case, the coordinates of \(\vec{v}\,'\) are the derivatives of the coordinates of \(\vec{v}\).
Proof. The first statement (i) is an immediate consequence of Proposition 2.3. Let t₀ ∈ I. We have
\[
\frac{\vec{v}(t) - \vec{v}(t_0)}{t - t_0} = \frac{x(t) - x(t_0)}{t - t_0}\,\vec{\imath} + \frac{y(t) - y(t_0)}{t - t_0}\,\vec{\jmath} ,
\]
so that the second statement follows directly from Proposition 2.3.
We may subsequently define, when they exist, \(\vec{v}\,''\), \(\vec{v}^{(3)}\), . . . Furthermore, \(\vec{v}\) is k times differentiable if and only if all its coordinates are; \(\vec{v}\) is of class¹ \(C^k\) (for any given k ≥ 0 or for k = ∞) if and only if all its coordinates are.
Proposition 2.6
Let \(\vec{v}_1 : I \to E_2\) and \(\vec{v}_2 : I \to E_2\) be two differentiable vector functions on I.
(i) For any λ, μ ∈ ℝ, the vector function \(\lambda\vec{v}_1 + \mu\vec{v}_2\) is differentiable and \((\lambda\vec{v}_1 + \mu\vec{v}_2)' = \lambda\vec{v}_1' + \mu\vec{v}_2'\).
(ii) The real-valued functionᵃ \(\vec{v}_1 \cdot \vec{v}_2\) is differentiable and \((\vec{v}_1 \cdot \vec{v}_2)' = \vec{v}_1' \cdot \vec{v}_2 + \vec{v}_1 \cdot \vec{v}_2'\).
(iii) The real-valued functionᵇ \(\det(\vec{v}_1, \vec{v}_2)\) is differentiable and \((\det(\vec{v}_1, \vec{v}_2))' = \det(\vec{v}_1', \vec{v}_2) + \det(\vec{v}_1, \vec{v}_2')\).
(iv) If \(\vec{v}_1\) never vanishes, then the real-valued function \(\|\vec{v}_1\|\) is differentiable and \((\|\vec{v}_1\|)' = \frac{\vec{v}_1' \cdot \vec{v}_1}{\|\vec{v}_1\|}\).
ᵃ We denote by · the scalar product.  ᵇ We denote by det the determinant.
Proof. Let us write \(\vec{v}_1 = x_1\vec{\imath} + y_1\vec{\jmath}\) and \(\vec{v}_2 = x_2\vec{\imath} + y_2\vec{\jmath}\). Then we have the following.
(i) \(\lambda\vec{v}_1 + \mu\vec{v}_2 = (\lambda x_1 + \mu x_2)\vec{\imath} + (\lambda y_1 + \mu y_2)\vec{\jmath}\). The result easily follows.
¹ Recall that, for k ≥ 0, a function is of class \(C^k\) if it is k times differentiable and its k-th derivative is continuous. A function is smooth, or of class \(C^\infty\), if it has derivatives of all orders.
(ii) ~v1 · ~v2 = x1x2 + y1y2, so that
(~v1 · ~v2
)′= x′
1x2 + x1x′2 + y′
1y2 + y1y′2
= x′1x2 + y′
1y2 + x1x′2 + y1y
′2
= ~v′1 · ~v2 + ~v1 · ~v′
2 .
(iii) det(~v1, ~v2) = x1y2 − y1x2, so that

    (det(~v1, ~v2))′ = x′1y2 + x1y′2 − y′1x2 − y1x′2
                    = (x′1y2 − y′1x2) + (x1y′2 − y1x′2)
                    = det(~v′1, ~v2) + det(~v1, ~v′2) .
(iv) ‖~v1‖ = (x1² + y1²)^(1/2), so that

    (‖~v1‖)′ = (1/2)(2x′1x1 + 2y′1y1)(x1² + y1²)^(−1/2) = (~v′1 · ~v1) / ‖~v1‖ .
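The differentiation rules of Proposition 2.6 can be checked symbolically on a concrete pair of vector functions. Below is a small sketch using the sympy library (not part of the original notes; the two vector functions are chosen arbitrarily for illustration, and ~v1 is picked so that it never vanishes):

```python
import sympy as sp

t = sp.symbols('t', real=True)
# hypothetical vector functions: ~v1 = (2 + cos t, sin t), ~v2 = (t^2, e^t)
x1, y1 = 2 + sp.cos(t), sp.sin(t)
x2, y2 = t**2, sp.exp(t)
x1d, y1d, x2d, y2d = (sp.diff(f, t) for f in (x1, y1, x2, y2))

# (ii): (~v1 . ~v2)' - (~v1' . ~v2 + ~v1 . ~v2') should vanish
dot_rule = sp.simplify(sp.diff(x1*x2 + y1*y2, t)
                       - (x1d*x2 + y1d*y2 + x1*x2d + y1*y2d))

# (iii): (det(~v1, ~v2))' - (det(~v1', ~v2) + det(~v1, ~v2')) should vanish
det_rule = sp.simplify(sp.diff(x1*y2 - y1*x2, t)
                       - ((x1d*y2 - y1d*x2) + (x1*y2d - y1*x2d)))

# (iv): (||~v1||)' - (~v1' . ~v1)/||~v1|| should vanish
# (here ||~v1||^2 = 5 + 4 cos t > 0, so ~v1 never vanishes)
norm = sp.sqrt(x1**2 + y1**2)
norm_rule = sp.simplify(sp.diff(norm, t) - (x1d*x1 + y1d*y1)/norm)
```

Each of the three residuals simplifies to 0, as the proposition predicts.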
Warning 2.7
In (iv), do not forget to check that ~v1 does not vanish. For instance, t 7→ ‖t~ı + t~‖ = √2 |t| is not differentiable at 0.
Truncated expansion
The local behavior of a function is often dictated by its truncated expansion. If the vector function ~v : I → E2 is of class Ck, then all its coordinates are of class Ck and may thus be expanded. As we deal with vector functions, we extend the notation o as follows. For a vector function ~f and a real function g defined in a neighborhood of t0, we write ~f(t) = o(g(t)) when t → t0 if ‖~f(t)‖ = o(g(t)) when t → t0. Equivalently, ~f(t) = o(g(t)) if and only if each coordinate of ~f is negligible with respect to g (same proof as Proposition 2.3).
Proposition 2.8 h truncated expansion g
Let us suppose that ~v : I → E2 is k times differentiable at a time t0 ∈ I. Then ~v admits the following expansion:

    ~v(t) = ~v(t0) + (t − t0) ~v′(t0) + ((t − t0)²/2) ~v′′(t0) + · · · + ((t − t0)^k / k!) ~v^(k)(t0) + o((t − t0)^k) .    (2.3)
Proof. By Proposition 2.5, both functions x and y are k times differentiable at t0; they thus admit the truncated expansions

    x(t) = x(t0) + (t − t0) x′(t0) + ((t − t0)²/2) x′′(t0) + · · · + ((t − t0)^k / k!) x^(k)(t0) + o((t − t0)^k) ;
    y(t) = y(t0) + (t − t0) y′(t0) + ((t − t0)²/2) y′′(t0) + · · · + ((t − t0)^k / k!) y^(k)(t0) + o((t − t0)^k) .

As, for 0 ≤ i ≤ k, ~v^(i)(t) = x^(i)(t)~ı + y^(i)(t)~, the result follows.
2.2 First definitions
We will use the following terminology. Beware that there is no clear consensus, so other references may use a slightly different terminology.
Definition 2.9 h plane parametric curve g
h A (plane) parametric curve is a pair (I, ~v) where I ⊆ R is an interval and ~v : I → E2 is a continuous vector function. Equivalently, it is a system of 2 continuous functions of another variable – which is called the parameter – defined on a common interval I ⊆ R:

    x(t)
    y(t) ,    t ∈ I .

h The image of a parametric curve (I, ~v) is the set

    {~v(t) : t ∈ I} ⊆ E2 ,

or, equivalently, the set

    {M ∈ R2 : ∃t ∈ I, −−→OM = ~v(t)} ⊆ R2 .
h A set of points Γ ⊆ R2 is a (plane) curve if there exists a parametric curve whose image is Γ.
h A parametric curve whose image is a curve Γ is called a parametric representation or parameterization (alternatively spelled parametrization) of the curve Γ.
In particular, the graph of a continuous function f : I → R is always a curve. It indeed admits the parametric representation

    x(t) = t
    y(t) = f(t) ,    t ∈ I .

The reflection of this graph across the first bisector is also a curve: it admits, for instance, the parameterization

    x(t) = f(t)
    y(t) = t ,    t ∈ I .
In this sense, parametric curves are richer than usual graphs of continuous functions.
Conversely, given a parametric curve, we may sometimes find a function describing the same curve (not always: remember that we have seen in the introduction that the unit circle is not the graph of a single function). For instance, the image of the parametric curve

    x(t) = t² − t
    y(t) = t + 2 ,    t ∈ R+

is the same as that of (using the change of variable u = t + 2)

    x(u) = u² − 5u + 6
    y(u) = u ,    u ∈ [2, +∞) ,

which is the graph of the function y ∈ [2, +∞) 7→ y² − 5y + 6 (part of a parabola with horizontal axis).
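The change of variable above is easy to verify symbolically. The following sketch (using sympy; not part of the original notes) substitutes t = u − 2 into the first parameterization:

```python
import sympy as sp

t, u = sp.symbols('t u', real=True)
x, y = t**2 - t, t + 2
# the new parameter is u = y = t + 2, i.e. t = u - 2
x_u = sp.expand(x.subs(t, u - 2))
y_u = sp.expand(y.subs(t, u - 2))
# x_u is u**2 - 5*u + 6 and y_u is u, as in the text
```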
[Figure: the image of the previous parametric curve, part of the parabola x = y² − 5y + 6.]
Example 2.10
Let us draw the image of the parametric curve t ∈ R 7→ cos(t)~ı + cos(2t)~. For any t ∈ R, we have cos(2t) = 2cos²(t) − 1, so that we can rewrite our system as

    x(t) = cos(t)
    y(t) = 2x(t)² − 1 ,    t ∈ R .

As cos(R) = [−1, 1], the image of our parametric curve is the graph of x ∈ [−1, 1] 7→ 2x² − 1.
[Figure: the graph of x ∈ [−1, 1] 7→ 2x² − 1.]
Definition 2.11
h Let (I, ~v) be a parametric curve with image Γ. The multiplicity of a point M ∈ Γ is the number of times t ∈ I such that −−→OM = ~v(t). A point with multiplicity 1 is called simple. A point with multiplicity 2 or more is called multiple: it is double if its multiplicity is 2, triple if its multiplicity is 3, quadruple if its multiplicity is 4, etc.
h A parametric curve is simple if all the points of its image are simple.
h A parametric curve is closed if there exists a ≤ b such that I = [a, b] and ~v(a) = ~v(b).
h The parametric curve (I, ~v) is differentiable if the function ~v is differentiable on I. The same definition goes for the following: n times differentiable, of class Cn, smooth.
Remark 2.12
Note that these notions strongly depend on the parametric curve, not only on the curve.
Example 2.13
1. The parametric curve t ∈ R 7→ cos(t)~ı + sin(t)~ is smooth and has image the circle C. Every point of C is multiple (of infinite multiplicity).
2. The parametric curve t ∈ [0, 2π) 7→ cos(t)~ı + sin(t)~ is smooth, simple, and has image C .
3. The parametric curve t ∈ [0, 2π] 7→ cos(t)~ı + sin(t)~ is smooth, closed, and has image C .
Be careful that, in contrast with graphs of functions, the regularity of a parametric curve cannot be read off of its image. Look for instance at the two following smooth parametric curves:

    x(t) = cos(t) − cos³(20t)
    y(t) = sin(t) − sin³(10t) ,    t ∈ R ;

    x(t) = 9 cos(10t) + 10 cos(9t)
    y(t) = 9 sin(10t) − 10 sin(9t) ,    t ∈ R .
And now look at this parametric curve, which is not differentiable at 0:

    x(t) = cos(|t|)
    y(t) = sin(|t|) ,    t ∈ R .
Proposition 2.14 h reparameterization g
Let (I, ~v) be a parametric curve and ϕ : J → I be a homeomorphism^a from some interval J ⊆ R to I. Then the parametric curve (J, ~v ◦ ϕ)

(i) has the same image as (I, ~v);

(ii) is closed if and only if (I, ~v) is closed;

(iii) moreover, the points of the common image have the same multiplicity for both parametric curves.

^a A homeomorphism is a bijective continuous function whose inverse is continuous. In the case of real intervals I and J, the function ϕ : J → I is a homeomorphism if and only if it is continuous, strictly monotonic, and surjective (that is, ϕ(J) = I).
Proof. By definition,

    ~v ◦ ϕ : J → E2 ,    u 7→ ~v(ϕ(u)) .

As ϕ is a bijection, for any vector ~a ∈ E2, the mapping

    {u ∈ J : ~v(ϕ(u)) = ~a} → {t ∈ I : ~v(t) = ~a} ,    u 7→ ϕ(u)

is also a bijection. The sets J~a := {u ∈ J : ~v(ϕ(u)) = ~a} and I~a := {t ∈ I : ~v(t) = ~a} thus have the same cardinality. The image of (J, ~v ◦ ϕ) is precisely the set of points ~a such that J~a ≠ ∅ and the image of (I, ~v) is the set of points ~a such that I~a ≠ ∅; these sets are the same one. Moreover, for a point ~a in the image, its multiplicity for (J, ~v ◦ ϕ) is Card(J~a), while its multiplicity for (I, ~v) is Card(I~a), which is equal.

Finally, if (I, ~v) is closed, then there exist a ≤ b such that I = [a, b] and ~v(a) = ~v(b). In this case, depending on whether ϕ is increasing or decreasing, J = [ϕ−1(a), ϕ−1(b)] or J = [ϕ−1(b), ϕ−1(a)] and, in both cases, ~v(a) = ~v ◦ ϕ(ϕ−1(a)) = ~v ◦ ϕ(ϕ−1(b)) = ~v(b), so that (J, ~v ◦ ϕ) is closed. Conversely, we can write (I, ~v) as (I, (~v ◦ ϕ) ◦ ϕ−1) where ϕ−1 : I → J is a homeomorphism. We can thus apply what we just did in order to conclude that if (J, ~v ◦ ϕ) is closed then (I, (~v ◦ ϕ) ◦ ϕ−1) is closed.
2.3 Tangents
From now on, we consider a parametric curve (I, ~v) and write, as above,

    ~v(t) = x(t)~ı + y(t)~ .

Moreover, for each t ∈ I, we define the point M(t) ∈ R2 (called the location at time t) as the only point such that

    −−→OM(t) = ~v(t) .
2.3.1 Definition
For a point A ∈ R2 and a nonzero vector ~u ∈ E2, we denote by L(A, ~u) the set

    L(A, ~u) := {A + λ~u : λ ∈ R} .
Definition 2.15 h line g
h A set L ⊆ R2 is called a line if there exist a point A ∈ R2 and a nonzero vector ~u ∈ E2 such that L = L(A, ~u).

h A direction vector of a line L is a nonzero vector ~u ∈ E2 such that L = L(A, ~u).

h We say that the line L passes through the point A if there exists a nonzero vector ~u ∈ E2 such that L = L(A, ~u).
It is easy to see that the direction vectors of the line L(A, ~u) are the vectors α~u for any α ∈ R⋆. Furthermore, for any two distinct points A and B, there exists a unique line passing through both A and B: it is the line L(A, −−→AB), which we alternatively simply denote by AB.
Definition 2.16 h limiting line g
h Let J ⊆ R be an interval and t0 ∈ R. We suppose that, for each t ∈ J \ {t0}, we have a line L(t). We say that the line L(t) admits a limiting position if, for each t ∈ J \ {t0}, there exist a point A(t) through which L(t) passes and a direction vector ~u(t) of L(t) such that A(t) admits a limit as t → t0 and ~u(t) admits a nonzero limit as t → t0.

h If so, the line L(limt→t0 A(t), limt→t0 ~u(t)) is called the limiting line of L(t).
As is, it is not clear a priori that a limiting line is well defined. One needs to check that its definition does not depend on the choice of A(t) and ~u(t). Let us suppose that L(t) = L(A1(t), ~u1(t)) = L(A2(t), ~u2(t)) with Ai(t) → Ai and ~ui(t) → ~ui ≠ ~0 for i ∈ {1, 2}.

1. As ~u1(t) and ~u2(t) are direction vectors of the same line, there exists α(t) ∈ R⋆ such that ~u2(t) = α(t)~u1(t). Writing ~ui(t) = xi(t)~ı + yi(t)~ and ~ui = ai~ı + bi~, one obtains x2(t) = α(t)x1(t) and y2(t) = α(t)y1(t). As ~u1 ≠ ~0, we cannot have both a1 = b1 = 0. Without loss of generality, we can assume that a1 ≠ 0. Then, for t close to t0, α(t) = x2(t)/x1(t) → a2/a1. On the one hand, ~u2(t) → ~u2; on the other hand, ~u2(t) = α(t)~u1(t) → (a2/a1)~u1; as a consequence, ~u2 = (a2/a1)~u1 and a2/a1 ≠ 0.

2. Now A2(t) ∈ L(A1(t), ~u1(t)), so there exists β(t) ∈ R such that A2(t) = A1(t) + β(t)~u1(t). As A2(t) − A1(t) → A2 − A1 and ~u1(t) → ~u1 ≠ ~0, the same reasoning as above shows that β(t) admits a limit β ∈ R. As a result, A2 = A1 + β~u1.

3. As a conclusion,

    L(A2, ~u2) = {A2 + λ~u2 : λ ∈ R} = {A1 + (β + λ a2/a1) ~u1 : λ ∈ R} = L(A1, ~u1) .
Here are a few direct consequences of Definition 2.16:
h In order to show that a line admits a limiting position, one needs to find a point and a direction vector both having a limit.
h If all the lines L (t) pass through the same point A, one can always choose A(t) = A.
h If the direction vector ~u(t) of L(t) admits a nonzero limit ~u, then ~u(t)/‖~u(t)‖ is also a direction vector of L(t) and it satisfies ~u(t)/‖~u(t)‖ → ~u/‖~u‖. One can thus restrict oneself to unit direction vectors.

h In order to show that a line does not admit a limiting position, it is sufficient to show that none of its unit direction vectors admits a limit.
Exercise 2.17 solution page 142
Let f : I → R be a class C1 function. Show that the tangent to the graph of f at the time t admits a limiting position at any t0 ∈ I.
Exercise 2.18 solution page 142
Show that the tangent to the graph of t ∈ R⋆+ 7→ sin(1/t) at the time t does not admit a limiting position as t → 0.
Let t0 ∈ I and let us set M0 := M(t0).
Definition 2.19 h tangent g
If, for t ≠ t0 and t close to t0, the line M0M(t) is well defined (that is, M(t) ≠ M0) and admits a limiting position as t → t0, then the limiting line is called the tangent to the parametric curve at t0.
[Figure: the chords M0M(t) approaching the tangent at M0.]
On sketches, it is customary to symbolize a tangent by a small double arrow, or a simple arrow if the curve does not keep going afterward.

Be careful that we speak of the tangent to a parametric curve at the real time t0 ∈ I, not at the point of the plane M0. In fact, when M0 is a simple point, there is no ambiguity and one may speak of the tangent to the curve at M0. On the other hand, if M0 is a multiple point, there might be multiple tangents to the curve at M0.
Whenever the line M0M(t) is defined, the vector ~v(t) − ~v(t0) is one of its direction vectors. As a result, any direction vector of M0M(t) has the form α(t − t0)(~v(t) − ~v(t0)) for some nonzero real number α(t − t0) ∈ R⋆. We obtain that the line M0M(t) admits a limiting position if and only if there exists a real function α defined on some (−ε, ε) \ {0} such that the vector α(t − t0)(~v(t) − ~v(t0)) admits a nonzero limit. As ~v(t) − ~v(t0) always tends to ~0 as t → t0 (because ~v is continuous), we must have α(u) → ±∞ as u → 0. It is thus natural to look at derivatives of ~v.
2.3.2 Link with derivatives
Definition 2.20 h regular, singular, biregular g
h The time t0 ∈ I is called regular (for (I, ~v)) if ~v is differentiable at t0 and ~v′(t0) ≠ ~0.

h The time t0 ∈ I is called singular (for (I, ~v)) if it is not regular, that is, ~v is not differentiable at t0 or ~v′(t0) = ~0.

h The time t0 ∈ I is called biregular (for (I, ~v)) if ~v is twice differentiable at t0 and the vectors ~v′(t0) and ~v′′(t0) are linearly independent^a (hence nonzero).

h The parametric curve (I, ~v) is called regular (resp. biregular) if all the times of I are regular (resp. biregular).

^a Recall that two vectors ~a and ~b are linearly independent if λ~a + µ~b = ~0 =⇒ λ = µ = 0. Equivalently, ~a and ~b are linearly independent if and only if they are not collinear.
Proposition 2.21 h tangent at a regular time g
If the time t0 is regular, then the parametric curve admits a tangent at t0 given by the line L(M(t0), ~v′(t0)).

Proof. As ~v is differentiable at t0, the direction vector

    (~v(t) − ~v(t0)) / (t − t0)

of the line M0M(t) tends to ~v′(t0) as t → t0. By definition, this means that the line M0M(t) admits as limiting position the line L(M(t0), ~v′(t0)) as t → t0.
We just learned that there always is a tangent at a regular time. But what about a singular time? A parametric curve may very well admit a tangent at a singular time. For instance, 0 is a singular time of t ∈ (−1, 1) 7→ t⁵~ı + t⁵~ although this parametric curve clearly admits a tangent at 0:

[Figure: the image of t ∈ (−1, 1) 7→ t⁵~ı + t⁵~, a segment of the first bisector, with its tangent at 0.]
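For this example, the integer p of Proposition 2.22 below can be found mechanically. A small sympy sketch (not part of the original notes) computes the smallest p with ~v^(p)(0) ≠ ~0:

```python
import sympy as sp

t = sp.symbols('t', real=True)
x, y = t**5, t**5   # the singular example from the text

# smallest p >= 1 such that ~v^(p)(0) != ~0
p = next(k for k in range(1, 10)
         if (sp.diff(x, t, k).subs(t, 0), sp.diff(y, t, k).subs(t, 0)) != (0, 0))
# ~v^(5)(0) = (120, 120) is collinear with ~i + ~j: the tangent at 0 is the
# first bisector, even though the time 0 is singular
```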
Proposition 2.22
We suppose that there exists a smallest p ≥ 1 such that ~v is p times differentiable at t0 ∈ I and ~v^(p)(t0) ≠ ~0. Then the parametric curve admits a tangent at t0 given by the line L(M(t0), ~v^(p)(t0)).
Proof. By Proposition 2.8,

    ~v(t) − ~v(t0) = ((t − t0)^p / p!) ~v^(p)(t0) + o((t − t0)^p) ,

so that

    (p! / (t − t0)^p) (~v(t) − ~v(t0)) → ~v^(p)(t0) ≠ ~0 ,

and this is a direction vector of M0M(t).
Proposition 2.22 gives a sufficient condition for the existence of tangents. But if there exists no smallest p as in the statement of Proposition 2.22, one can always study the limiting slope of M0M(t), that is, the limit as t → t0 of

    (y(t) − y(t0)) / (x(t) − x(t0)) .

If this quantity has a limit in R ∪ {−∞, +∞}, then there is a tangent at t0. More precisely, if it tends to some a ∈ R, then the tangent is L(M0, ~ı + a~); if it tends to ±∞, the tangent is L(M0, ~).
Example 2.23
Let us consider ~v : t ∈ R 7→ |t|~ı + |t|t~. At t = 0, this vector function is not differentiable. The slope of M0M(t) is equal to

    |t|t / |t| = t → 0 ,

so that the parametric curve admits a horizontal tangent at 0.
[Figure: the curve of Example 2.23 with its horizontal tangent at 0.]
Case of the graph of a function
Let us consider a parametric curve

    x(t) = t
    y(t) = f(t) ,    t ∈ I ,

with a C1 function f : I → R. It is differentiable and has derivative

    x′(t) = 1
    y′(t) = f′(t) ,    t ∈ I ,

which cannot be ~0, so that every time of I is regular. At each t ∈ I, the parametric curve thus admits a tangent with direction vector ~ı + f′(t)~, that is, with slope f′(t). This is coherent with what you've learned so far.
2.3.3 Local behavior
We consider an interior time t0 ∈ I and suppose the existence of two integers 1 ≤ p < q such that ~v is q times differentiable at t0 ∈ I and

(i) p is the smallest k ≥ 1 such that ~v^(k)(t0) ≠ ~0;

(ii) q is the smallest k ≥ p + 1 such that the vectors ~v^(p)(t0) and ~v^(k)(t0) are linearly independent (which means that {~v^(p)(t0), ~v^(q)(t0)} forms a basis of the whole plane R2).

Under these assumptions, Proposition 2.22 ensures the existence of a tangent at t0. Moreover, (2.3) becomes

    ~v(t) = ~v(t0) + ((t − t0)^p / p!) ~v^(p)(t0) + ((t − t0)^(p+1) / (p+1)!) ~v^(p+1)(t0) + · · · + ((t − t0)^q / q!) ~v^(q)(t0) + o((t − t0)^q) .
As, for all p + 1 ≤ i ≤ q − 1, the vector ~v^(i)(t0) is collinear with ~v^(p)(t0), there exists αi ∈ R such that ~v^(i)(t0) = αi ~v^(p)(t0). We thus get

    ~v(t) = ~v(t0) + ((t − t0)^p / p!) (1 + ε(t)) ~v^(p)(t0) + ((t − t0)^q / q!) ~v^(q)(t0) + ~η(t) ,

where ~η(t) = o((t − t0)^q) and

    ε(t) := αp+1 (t − t0)/(p + 1) + · · · + αq−1 (t − t0)^(q−p−1)/((p + 1) · · · (q − 1)) → 0 as t → t0.
As {~v^(p)(t0), ~v^(q)(t0)} forms a basis of R2, we can write ~η(t) = η1(t) ~v^(p)(t0) + η2(t) ~v^(q)(t0) with η1(t) = o((t − t0)^q) and η2(t) = o((t − t0)^q). All in all, we obtain

    ~v(t) − ~v(t0) = ((t − t0)^p / p!) (1 + ε1(t)) ~v^(p)(t0) + ((t − t0)^q / q!) (1 + ε2(t)) ~v^(q)(t0) ,    (2.4)

where ε1(t) → 0 and ε2(t) → 0 as t → t0.
In order to study the position of the parametric curve with respect to its tangent in a neighborhood of t0, we work in the frame

    (M0 ; (1/p!) ~v^(p)(t0), (1/q!) ~v^(q)(t0))

of the plane. In this frame, the tangent is the first axis and Equation (2.4) stipulates that the point M(t) has coordinates

    ( (t − t0)^p (1 + ε1(t)) , (t − t0)^q (1 + ε2(t)) ) .

As 1 + εi(t) → 1 as t → t0, we have 1 + εi(t) > 0 in a neighborhood of t0, so that the signs of the coordinates are those of (t − t0)^p and (t − t0)^q: positive for t > t0, and positive or negative for t < t0, depending on the parity of p and q. We obtain the following classification into four classes:
[Figures: the four local behaviors at M0, drawn in the frame (M0 ; (1/p!) ~v^(p)(t0), (1/q!) ~v^(q)(t0)).]

h p odd, q even: standard time (the curve touches its tangent and stays locally on the same side of it);

h p odd, q odd: inflection time (the curve crosses its tangent);

h p even, q odd: cusp of the first kind (the two branches lie on opposite sides of the tangent);

h p even, q even: cusp of the second kind (the two branches lie on the same side of the tangent).
Example 2.24
Let us look, in a neighborhood of 0, at ~v : t ∈ R 7→ t^p ~ı + t^q ~ for some values of 1 ≤ p < q. We have ~v^(p)(0) = p! ~ı, ~v^(q)(0) = q! ~, and ~v^(k)(0) = ~0 whenever k ∉ {p, q}, so that p and q correspond to the integers defined by (i) and (ii).

[Figures: the images near 0 for (p, q) = (1, 2), (1, 3), (2, 3), and (2, 4), with basis vectors ~ı and ~.]
In practice, almost all times are biregular. Recall that this means that the previous integers are well defined with p = 1 and q = 2. Such times are standard: the curve touches its tangent without crossing it (the first case above).
h Looking for cusps. At a regular time, we have p = 1, so that we cannot have a cusp. Cusps thus necessarily occur at singular times! Beware that the converse is false! As a result, if one wants to look for cusps, it is enough to study singular times.

h Looking for inflection times. At an inflection time t0, p and q are odd. As a result, the vectors ~v′(t0) and ~v′′(t0) are collinear (otherwise p = 1 and q = 2). Beware that the converse is false! If one wants to look for inflection times, it is enough to study the times t0 such that ~v′(t0) and ~v′′(t0) are collinear. This means that

    det(~v′(t0), ~v′′(t0)) = 0  ⇐⇒  (x′y′′ − x′′y′)(t0) = 0 .
If x′(t0) = 0, this is equivalent to x′(t0) = y′(t0) = 0 (t0 is singular) or x′(t0) = x′′(t0) = 0. If x′ does not vanish at t0, this is equivalent to

    (d/dt)(y′/x′)(t0) = 0 .

One thus starts by looking at the vanishing times of x′ and of (d/dt)(y′/x′). Alternatively, one may also look at the vanishing times of y′ and of (d/dt)(x′/y′) (the computations may be easier for one than for the other).

Then, for each of these times, compute p (p = 1 if and only if t0 is not singular). If p is even, stop there and conclude that it is not an inflection time; if p is odd, compute q. If q is even, conclude that it is not an inflection time; if q is odd, conclude that it is an inflection time.
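The search for inflection candidates described above amounts to solving (x′y′′ − x′′y′)(t) = 0. Here is a small sympy sketch (not part of the original notes; the curve x(t) = t, y(t) = t³ is a hypothetical example chosen for illustration):

```python
import sympy as sp

t = sp.symbols('t', real=True)
# hypothetical example curve: x(t) = t, y(t) = t^3
x, y = t, t**3

# candidate inflection times: det(~v'(t), ~v''(t)) = (x'y'' - x''y')(t) = 0
d = sp.diff(x, t)*sp.diff(y, t, 2) - sp.diff(x, t, 2)*sp.diff(y, t)
candidates = sp.solve(sp.Eq(d, 0), t)
# here d = 6t, so the only candidate is t0 = 0; since x'(0) = 1 the time is
# regular (p = 1), and ~v'''(0) = (0, 6) is independent of ~v'(0) = (1, 0),
# so q = 3 is odd: t0 = 0 is indeed an inflection time
```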
2.4 Sketching
2.4.1 Interval of study
As mathematicians don't like needless work, the first step when studying a parametric curve is to try to study it as little as possible. More precisely, when studying a parametric curve (I, ~v), it can be enough to study (J, ~v) for some interval J ⊊ I as small as possible.
Periodicity
The first step is to look at periodicity. If ~v is periodic with period T, it is enough to study it on some interval of length T. If x is Tx-periodic and y is Ty-periodic (with minimal periods), note that ~v is periodic if and only if Tx/Ty ∈ Q.
Isometries
We try to split I = J ∪ J1 ∪ · · · ∪ Jk into a finite number of subintervals in such a way that, for each 1 ≤ i ≤ k, the image ~v(Ji) corresponds to ~v(J) through a simple plane isometry (reflection, translation, rotation). In other words, for each 1 ≤ i ≤ k, there exists a bijection ϕi : Ji → J such that ~v(J) = ~v ◦ ϕi(Ji) corresponds to ~v(Ji) through a simple plane isometry. In this case, we study (J, ~v) and deduce the image of (I, ~v) thanks to the isometries.
Remark 2.25
The subintervals don’t have to be disjoint, but, in practice, they are or share an extremity.
In general, the bijections have the simple form t 7→ α ± t for some α ∈ R, but any can do (as, for instance, t ∈ (0, 1] 7→ 1/t ∈ [1, +∞)).
The isometries are usually to look for among the following:
h (x, y) 7→ (x, y) : identity;
h (x, y) 7→ (−x, y) : reflection across the vertical axis;
h (x, y) 7→ (x,−y) : reflection across the horizontal axis;
h (x, y) 7→ (−x,−y) : rotation about O of angle π ;
h (x, y) 7→ (−y, x) : rotation about O of angle π/2 ;
h (x, y) 7→ (y,−x) : rotation about O of angle −π/2 ;
h (x, y) 7→ (y, x) : reflection across the first bisector (line y = x);
74
2.4.1. Interval of study
h (x, y) 7→ (−y,−x) : reflection across the second bisector (line y = −x);
h (x, y) 7→ (x+ a, y + b) : translation of vector a~ı + b~.
Remark 2.26
Reducing the interval of study by periodicity almost falls into this framework, with bijections of the form t 7→ iT + t for i ∈ Z and the identity as isometry, but with a countable number of subintervals. . .
Sometimes, there exist more complicated isometries (they will be indicated in exercises).
Example 2.27
Let us find a good interval of study for the parametric curve ~v : t ∈ R 7→ (2 cos(t) + cos(2t))~ı + (2 sin(t) − sin(2t))~.
First, we clearly have ~v(t + 2π) = ~v(t) for all t ∈ R, so that the parametric curve is 2π-periodic: we will study it on an interval of length 2π (to be determined later).

Second, we observe that, for any t ∈ R, we have x(−t) = x(t) and y(−t) = −y(t), so that we will study the curve on [0, π] and complete its sketch by applying the isometry (x, y) 7→ (x, −y), that is, the reflection across the horizontal axis.
Finally, let z(t) := x(t) + iy(t) = 2e^{it} + e^{−2it} ∈ C. We can check (in exercises, you will get a hint) that, for any t ∈ R,

    z(t + 2π/3) = 2e^{it} e^{2iπ/3} + e^{−2it} e^{−4iπ/3} = e^{2iπ/3} (2e^{it} + e^{−2it}) = e^{2iπ/3} z(t) .

The point M(t + 2π/3) is thus the image of M(t) through the rotation about O of angle 2π/3. The curve is thus invariant under this rotation. As a result, we study it on the interval [0, 2π/3].
[Figure: the curve, studied on [0, 2π/3] (thick), with the parameter values 0, 2π/3, π, 4π/3 marked.]
Remark 2.28
On the picture, we indicated in green the value of the parameter at strategic times. The curve corresponding to the interval of study is drawn with a thicker paintbrush.
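The rotational symmetry z(t + 2π/3) = e^{2iπ/3} z(t) computed in Example 2.27 can also be checked numerically at a few sample times:

```python
import cmath

def z(t):
    # z(t) = x(t) + i y(t) = 2 e^{it} + e^{-2it}
    return 2 * cmath.exp(1j * t) + cmath.exp(-2j * t)

w = cmath.exp(2j * cmath.pi / 3)   # rotation about O of angle 2*pi/3
# z(t + 2*pi/3) should equal w * z(t) for every t
dev = max(abs(z(t + 2 * cmath.pi / 3) - w * z(t))
          for t in (0.0, 0.7, 2.1, 5.3))
```

The maximal deviation `dev` is zero up to floating-point rounding.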
2.4.2 Asymptotes
Throughout this section, we suppose that

    ‖~v(t)‖ → ∞ as t approaches an extremity t0 of I

(either t0 = a if I = (a, b] or (a, b), or t0 = b if I = [a, b) or (a, b), where a, b ∈ R ∪ {±∞}). We are interested in the behavior of the parametric curve near such a time. First, notice that the point M(t) lies on the line L(O, ~v(t)/‖~v(t)‖), where the direction vector ~v(t)/‖~v(t)‖ has norm 1. It is natural to say that M(t) tends to infinity in a given direction if this vector admits a limit; let us give a precise definition.
Definition 2.29 h asymptotic direction, asymptote g
h We say that the ray^a R(O, ~u) is the asymptotic direction of the parametric curve at t0 if

    ~v(t)/‖~v(t)‖ → ~u    as t → t0.

h A line L such that the distance from M(t) to L tends to 0 as t → t0 is called an asymptote.

^a We denote by R(A, ~u) := {A + λ~u : λ ∈ R+} the ray starting at the point A with direction vector ~u.
Here is an illustration of three asymptotes: from left to right, a vertical asymptote, a horizontal asymptote, and a slant asymptote.

[Figures: three curves approaching, respectively, a vertical, a horizontal, and a slant asymptote.]
Proposition 2.30
The line with equation ax + by + c = 0 is an asymptote if and only if ax(t) + by(t) + c → 0 as t → t0.

Proof. The distance from M(t) to the line is

    |ax(t) + by(t) + c| / √(a² + b²) .

This quantity tends to 0 if and only if ax(t) + by(t) + c → 0.
Proposition 2.31
If a parametric curve admits an asymptote as t → t0, then it also has an asymptotic direction at t0. More precisely, if the line L(A, ~u) is an asymptote as t → t0, then either R(O, ~u) or R(O, −~u) is the asymptotic direction at t0.
Proof. Without loss of generality, we may assume that ‖~u‖ = 1 (replace ~u by ~u/‖~u‖). Let a~ı + b~ be a vector orthogonal to ~u, so that the Cartesian equation of L(A, ~u) is ax + by + c = 0 for some c ∈ R. By Proposition 2.30, ax(t) + by(t) + c → 0, so that

    (a~ı + b~) · ~v(t)/‖~v(t)‖ = (ax(t) + by(t)) / √(x²(t) + y²(t)) → 0 .

Whenever ~v(t) ≠ ~0, which is the case when t is close to t0, the function t 7→ ~v(t)/‖~v(t)‖ is a continuous function taking its values on the circle C. As there are only two vectors on C that are orthogonal to a~ı + b~, namely ~u and −~u, we conclude that

    ~v(t)/‖~v(t)‖ → ±~u ,

as desired.
Note that a parametric curve may have an asymptotic direction without an asymptote.
Exercise 2.32 solution page 142
Show that t ∈ R+ 7→ t~ı + sin(t)~ admits an asymptotic direction but no asymptote at +∞.

[Figure: the graph of sin on R+.]
Moreover, the fact that ‖~v(t)‖ → ∞ does not ensure the existence of an asymptotic direction.
Exercise 2.33 solution page 142
Show that t ∈ R+ 7→ t sin(t)~ı + t cos(t)~ admits no asymptotic direction at +∞.

[Figure: the spiral traced by this curve.]
In practice, the following theorem gives criteria that very often allow one to conclude.
Theorem 2.34 h asymptotic directions and asymptotes g
The following holds.
(i) If |y(t)| → ∞ and x(t) → x0, then the line with equation x = x0 is an asymptote.

(ii) If |x(t)| → ∞ and y(t) → y0, then the line with equation y = y0 is an asymptote.

(iii) If |x(t)| → ∞ and |y(t)| → ∞, then we have the following:

a) if |y(t)/x(t)| → ∞, then we have a vertical asymptotic ray and no asymptote;

b) if |y(t)/x(t)| → 0, then we have a horizontal asymptotic ray and no asymptote;

c) if y(t)/x(t) → a ≠ 0, then

1) if |y(t) − ax(t)| → ∞, we have an asymptotic ray with direction ε~ı + aε~, where ε ∈ {−1, +1} is the sign of x(t) near t0, and no asymptote;

2) if y(t) − ax(t) → b ∈ R, the line with equation y = ax + b is an asymptote.
Proof. Cases (i), (ii) and 2) are immediate consequences of Proposition 2.30.

If |x(t)| → ∞ and |y(t)| → ∞, let us see under which condition we can have an asymptote: we suppose that the line with equation αx + βy + c = 0 (with (α, β) ≠ (0, 0)) is an asymptote. By Proposition 2.30, we must have αx(t) + βy(t) + c → 0. This implies that α ≠ 0 and β ≠ 0, as otherwise |αx(t) + βy(t) + c| → ∞. The equation of the asymptote can thus be rewritten as y = ax + b (with a ≠ 0) and we must have y(t) − ax(t) → b and, after dividing by x(t), which tends to infinity, y(t)/x(t) → a. This proves that there is no asymptote in cases a), b) and 1).

In case 1), we have y(t) ∼ ax(t), so that ‖~v(t)‖ = √(x²(t) + y²(t)) ∼ |x(t)| √(1 + a²) and

    ~v(t)/‖~v(t)‖ = (x(t)/‖~v(t)‖) ~ı + (y(t)/‖~v(t)‖) ~ → (ε~ı + aε~) / √(1 + a²) .

In case a), we have ‖~v(t)‖ ∼ |y(t)|, so that ~v(t)/‖~v(t)‖ → ±~, and, in case b), we have ‖~v(t)‖ ∼ |x(t)|, so that ~v(t)/‖~v(t)‖ → ±~ı.
Whenever there is an asymptote of equation ax + by + c = 0, one gets the position of M(t) with respect to the asymptote by studying the sign of ax(t) + by(t) + c. For instance, for b = 1, we have the following:

h if ax(t) + y(t) + c > 0, the point M(t) lies above the asymptote;

h if ax(t) + y(t) + c = 0, the point M(t) lies on the asymptote;

h if ax(t) + y(t) + c < 0, the point M(t) lies below the asymptote.
In particular, it is interesting to see whether the parametric curve crosses the asymptote an infinite number of times or a finite number of times. In the latter case, after the last crossing time, the parametric curve stays in one of the half-planes delimited by the asymptote.
2.4.3 Sketching plan
Here is the road map for sketching a parametric curve (I, ~v).
h Interval of definition. If ~v is given without its interval of definition I, you should find it. If ~v is given with a definition domain that is not an interval, split it into separate intervals.
h Interval of study. Reduce as much as possible the interval of study. See Section 2.4.1.
h Variations of x and y. Study and draw the table of variations for x and y.
h Extremities of the interval of study. Look at what happens at the extremities. Is there a finite limit? Are there asymptotic directions? Are there asymptotes? If so, what is the position of the curve with respect to the asymptote? See Section 2.4.2.
h Singular times. Study singular times.
h Particular times. Study the times that have a particular interest (multiple points, local extrema, etc.).
h Precise sketch. Don't forget to draw the tangents at the times of interest. It is also a good practice to indicate the values of the parameter at these times. The golden rule for sketching a parametric curve is the following:

d plot the points at the times of interest with their tangents;

d link these points together: if x increases, move right; if x decreases, move left; if y increases, move up; if y decreases, move down;

d if it is ugly, it is wrong!
Remark 2.35
In the table of variations, it is convenient to arrange the lines in the order x′, x, y, y′ so that x and y are adjacent. This gives a better view of the variations. Moreover, we put as many values as possible; these give the coordinates and tangents at strategic times. In particular, the times where x′ or y′ vanishes are of interest (if regular, these times have respectively a vertical or a horizontal tangent).
Let us now see a couple of examples.
Example 2.36 h astroid g
Let us start with the study of

    x(t) = cos³(t)
    y(t) = sin³(t) ,    t ∈ R .
h Periodicity. The function ~v is 2π-periodic; we can study this parametric curve on an interval of length 2π.
h Isometries. We observe that, for any t ∈ R,
x(−t) = x(t) and y(−t) = −y(t) ,
so that the curve is symmetric across the horizontal axis. We thus choose an interval of length 2π that is symmetric with respect to 0 (so that t 7→ −t makes sense), that is, [−π, π], and split it in two. As a result, we study the parametric curve on [0, π] and we will make a reflection across the horizontal axis in order to complete the picture. Next, for any t ∈ R,

    x(π − t) = −x(t) and y(π − t) = y(t) ,

so that the curve is symmetric across the vertical axis; we can study it on [0, π/2]. Finally, for any t ∈ R,

    x(π/2 − t) = y(t) and y(π/2 − t) = x(t) ,

so that the curve is symmetric across the first bisector; we can study it on I := [0, π/4].
h Derivatives and variations. We have

    x′(t) = −3 sin(t) cos²(t)
    y′(t) = 3 cos(t) sin²(t) .

On I, both functions x′ and y′ vanish only at 0; the time 0 is thus the unique singular time on I.
Table of variations on I = [0, π/4]:

    t       :  0              π/4
    x′(t)   :  0      −      −3√2/4
    x(t)    :  1      ↘      √2/4
    y(t)    :  0      ↗      √2/4
    y′(t)   :  0      +      3√2/4
h Singular time. At 0, we have the following Taylor expansions:

    x(t) = (1 − t²/2 + o(t³))³ = 1 − 3t²/2 + o(t³)    and    y(t) = (t + o(t))³ = t³ + o(t³) ,

so that

    ~v(t) = ~v(0) − (3t²/2) ~ı + t³ ~ + o(t³) .

We thus have a cusp of the first kind at 0 (p = 2, q = 3). The half-tangent at this time is directed by −~ı.
h Sketch.
[Figure: the astroid, with the parameter values 0, π/4, π/2, 3π/4, π, 5π/4, 3π/2, 7π/4 marked.]
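The truncated expansions used at the singular time can be recovered with a computer algebra system. This sympy sketch (not part of the original notes) confirms the leading terms of cos³ t and sin³ t at 0:

```python
import sympy as sp

t = sp.symbols('t')
# truncated expansions of the astroid coordinates at 0, up to o(t^3)
x_ser = sp.series(sp.cos(t)**3, t, 0, 4).removeO()   # expect 1 - 3*t**2/2
y_ser = sp.series(sp.sin(t)**3, t, 0, 4).removeO()   # expect t**3
# reading off the lowest-order terms gives p = 2 and q = 3: a cusp of
# the first kind, with half-tangent directed by -~i
```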
Example 2.37
Let us study

    x(t) = t³ / (t² − 1)
    y(t) = t(3t − 2) / (3(t − 1)) .
h Domain of study. Here, the domain of definition is not explicitly given. As ~v is defined on R \ {−1, 1}, we have three parametric curves to study, ((−∞, −1), ~v), ((−1, 1), ~v), and ((1, +∞), ~v). There is no periodicity and no immediate isometry to see.
h Derivatives and variations. On R \ {−1, 1}, we have

    x′(t) = (3t²(t² − 1) − t³(2t)) / (t² − 1)² = t²(t² − 3) / (t² − 1)²

and

    y′(t) = ((6t − 2)(t − 1) − (3t² − 2t)) / (3(t − 1)²) = (3t² − 6t + 2) / (3(t − 1)²) .

The function x′ vanishes at −√3, 0, and √3. The function y′ vanishes at 1 ± √3/3. As they do not vanish at a common time, all three parametric curves are regular.
81
Chapter 2. Plane parametric curves
    t     −∞       −√3       −1         0      1 − √3/3     1      1 + √3/3     √3      +∞
    x′         +    0     −   ‖    −    0    −        −   ‖    −         −    0    +
    x     −∞  ↗  −3√3/2  ↘  −∞ ‖ +∞  ↘   0   ↘   x2   ↘  −∞ ‖ +∞  ↘   x3   ↘  3√3/2 ↗ +∞
    y     −∞  ↗    y1    ↗    −5/6    ↗  0 ↗    y2   ↘  −∞ ‖ +∞  ↘   y3   ↗   y4   ↗ +∞
    y′         +          +        +        0     −       ‖    −     0         +

(The double bars ‖ mark t = ±1, where x is not defined; note that y is defined at t = −1, with y(−1) = −5/6.)
We used the following notation:

    y1 := y(−√3) = (3 − 7√3)/6 ≈ −1.52
    x2 := x(1 − 1/√3) = (42 − 26√3)/33 ≈ −0.09
    y2 := y(1 − 1/√3) = (4 − 2√3)/3 ≈ 0.17
    x3 := x(1 + 1/√3) = (42 + 26√3)/33 ≈ 2.63
    y3 := y(1 + 1/√3) = (4 + 2√3)/3 ≈ 2.48
    y4 := y(√3) = (3 + 7√3)/6 ≈ 2.52
• Extremities. There are quite a few extremities to study. First, as t → ±∞, both x and y tend to infinity. Theorem 2.34 tells us to study the quotient y(t)/x(t), which is well defined for t close to ±∞ (more precisely, for any t ∈ R \ {−1, 0, 1}):

    y(t)/x(t) = (3t − 2)(t + 1)/(3t²) ∼ 1  as t → ±∞.

One should then study

    y(t) − x(t) = t(3t − 2)/(3(t − 1)) − t³/(t² − 1) = (t² − 2t)/(3(t − 1)(t + 1)) ∼ 1/3  as t → ±∞.

The line with equation y = x + 1/3 is thus an asymptote. In order to know the relative position between the curve and this asymptote, one studies the sign of y(t) − (x(t) + 1/3) = (−2t + 1)/(3(t − 1)(t + 1)):
    t                      −∞          −1           1/2          1          +∞
    y(t) − (x(t) + 1/3)          +            −     0     +            −
    position               curve above  curve below  curve above  curve below

The curve crosses its asymptote at the point M(1/2) = (−1/6, 1/6).
As t → −1, y(t) → −5/6, so that we have a horizontal asymptote of equation y = −5/6. As y is increasing around −1, the curve approaches its asymptote from below at −1⁻ and from above at −1⁺.
As t → 1, both x and y tend to infinity. We have

    y(t)/x(t) = (3t − 2)(t + 1)/(3t²) → 2/3  and  y(t) − (2/3) x(t) = t(t + 2)/(3(t + 1)) → 1/2 .

The line of equation y = (2/3) x + 1/2 is thus an asymptote. The relative position is given by the sign of

    y(t) − ((2/3) x(t) + 1/2) = (t − 1)(2t + 3)/(6(t + 1)) ,

which is negative as t → 1⁻ and positive as t → 1⁺.
• Sketch.

[Sketch of the three branches with the asymptotes y = −5/6, y = (2/3)x + 1/2 and y = x + 1/3, and the parameter values −∞, −1⁻, −1⁺, 1⁻, 1⁺, +∞, −√3, 0, 1 − √3/3, 1 + √3/3, √3 marked on the curve.]
Exercise 2.38 solution page 143
Check that the point (−8/3, −4/3) is the point where the previous parametric curves intersect.
Example 2.39 ⟨a fishy example⟩
Let us study the parametric curve ~v given by

    x(t) = cos(t) − (√2/2) cos²(t)
    y(t) = sin(t) cos(t).

Here, the domain of definition is not explicit. We take R, as both functions are well defined on R.
• Periodicity. The function y is π-periodic; the function cos is 2π-periodic and cos² is π-periodic. As a result, x is 2π-periodic, and so is y (a π-periodic function is in particular 2π-periodic). So far, we can study this parametric curve on an interval of length 2π.
• Isometries. We observe that

    x(−t) = cos(t) − (√2/2) cos²(t) = x(t)  and  y(−t) = − sin(t) cos(t) = −y(t) ,

so that the curve is symmetric across the horizontal axis. We thus choose an interval of length 2π that is symmetric with respect to 0 (so that t ↦ −t makes sense), that is, [−π, π], and split it in two. As a result, we study the parametric curve on I := [0, π] and we will make a reflection across the horizontal axis in order to complete the picture.
• Derivatives and variations. We have

    x′(t) = sin(t)(√2 cos(t) − 1)
    y′(t) = cos(2t).

On I, the function x′ cancels at 0, π/4 and π; the function y′ cancels at π/4 and 3π/4. There is thus a unique singular time on I: the time π/4.
    t        0            π/4        3π/4           π
    x′       0      +      0     −    −√2     −      0
    x    (2−√2)/2   ↗     √2/4   ↘   −3√2/4   ↘   −(2+√2)/2
    y        0      ↗     1/2    ↘   −1/2     ↗      0
    y′       1      +      0     −     0      +      1
• Extremities. We have ~v(0) = ((2 − √2)/2) ~ı, ~v(π) = −((2 + √2)/2) ~ı, and ~v′(0) = ~v′(π) = ~. The curve thus starts and ends on the horizontal axis with vertical tangents.
• Singular time. We compute

    x″(t) = cos(t)(√2 cos(t) − 1) − √2 sin²(t) = √2 cos(2t) − cos(t)
    y″(t) = −2 sin(2t)

and

    x‴(t) = −2√2 sin(2t) + sin(t)
    y‴(t) = −4 cos(2t),

so that

    ~v″(π/4) = −(√2/2) ~ı − 2 ~  and  ~v‴(π/4) = −(3√2/2) ~ı .

These vectors are not collinear, so that we have a cusp of the first kind at π/4. The tangent at this time is directed by −(√2/2) ~ı − 2 ~.
• Double points. We see that the origin is a double point, obtained for t = π/2. The tangent at this time has direction ~v′(π/2) = −~ı − ~.
• Sketch.

[Sketch of the curve, with the parameter values 0, π/4, π/2, 3π/4, π marked.]
2.5 Polar curves
2.5.1 Polar coordinates
For any point M = x ~ı + y ~ ∈ E2, there exists at least one pair (ρ, θ) ∈ R+ × R such that

    x = ρ cos(θ)
    y = ρ sin(θ).     (2.5)
Definition 2.40 ⟨polar coordinates⟩
• Such numbers ρ ∈ R+ and θ ∈ R are called polar coordinates of M.
• In this context, the origin O is called the pole.
• The number ρ ∈ R+ is called the radial coordinate or the radius of M.
• The number θ ∈ R is called an angular coordinate, a polar angle, or an azimuth of M.
Note that, in contrast with Cartesian coordinates, polar coordinates are not unique. More precisely, the radius ρ = √(x² + y²) is uniquely defined, but the polar angle is only defined modulo 2π for M ≠ O and is completely arbitrary for the pole O. When one wants a unique system of coordinates, it is customary to set the polar angle of the pole to be 0, and one specifies a half-open interval of length 2π for the polar angle (often (−π, π] or [0, 2π)).
It is really simple to express the Cartesian coordinates from polar coordinates by (2.5). Conversely, it is not as easy: the radius is always ρ = √(x² + y²), but the angle always needs to be defined with care. For instance, if we choose the system where

    (ρ, θ) ∈ (R⋆+ × (−π, π]) ∪ {(0, 0)} ,

then the polar angle is given by

    θ = arctan(y/x)          if x > 0,
        π/2                  if x = 0 and y > 0,
        0                    if x = 0 and y = 0,
        −π/2                 if x = 0 and y < 0,
        arctan(y/x) + π      if x < 0 and y ≥ 0,
        arctan(y/x) − π      if x < 0 and y < 0.
Exercise 2.41 solution page 143
Express the polar angle for the system where (ρ, θ) ∈ (R⋆+ × [0, 2π)) ∪ {(0, 0)}.
2.5.2 Polar curves
For θ ∈ R, we define the unit vectors

    ~uθ := cos(θ) ~ı + sin(θ) ~  and  ~vθ := ~u_(θ+π/2) = − sin(θ) ~ı + cos(θ) ~.
[Figure: the unit vectors ~uθ and ~vθ on the unit circle; ~vθ is obtained from ~uθ by a rotation of angle π/2.]
The following straightforward properties hold.

Proposition 2.42
• We have (d/dθ) ~uθ = ~vθ and (d/dθ) ~vθ = −~uθ.
• We have ~u_(θ±π) = −~uθ and ~v_(θ±π) = −~vθ.
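These identities are easy to confirm numerically; the sketch below (our own check, not part of the notes) approximates d/dθ ~uθ and d/dθ ~vθ by central finite differences at an arbitrary angle.

```python
import math

def u(t):  # u_theta = (cos theta, sin theta)
    return (math.cos(t), math.sin(t))

def v(t):  # v_theta = u_{theta + pi/2} = (-sin theta, cos theta)
    return (-math.sin(t), math.cos(t))

h, t0 = 1e-6, 0.9  # step and sample angle (arbitrary values)
du = [(a - b) / (2 * h) for a, b in zip(u(t0 + h), u(t0 - h))]
dv = [(a - b) / (2 * h) for a, b in zip(v(t0 + h), v(t0 - h))]

assert all(abs(a - b) < 1e-8 for a, b in zip(du, v(t0)))   # (d/dtheta) u_theta = v_theta
assert all(abs(a + b) < 1e-8 for a, b in zip(dv, u(t0)))   # (d/dtheta) v_theta = -u_theta
assert all(abs(a + b) < 1e-12 for a, b in zip(u(t0 + math.pi), u(t0)))  # u_{theta+pi} = -u_theta
```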
Definition 2.43 ⟨polar curve⟩
A parametric curve in polar coordinates, or polar curve, is a parametric curve given in the form θ ∈ I ↦ ρ(θ) ~uθ, where ρ : I → R is a continuous function on an interval I ⊆ R. We will use the shorthand notation (I, ρ).
From now on, we consider a polar curve (I, ρ), set ~v(θ) := ρ(θ) ~uθ and, as before, define M(θ) such that −→OM(θ) = ~v(θ).
Warning 2.44 ⚠ ρ may take negative values ⚠
The function ρ is allowed to take negative values. The radius of M(θ) is nevertheless always |ρ(θ)|.
• If ρ(θ) ≥ 0, then the pair (ρ(θ), θ) is a pair of polar coordinates of M(θ).
• If ρ(θ) < 0, then the pair (−ρ(θ), θ + π) is a pair of polar coordinates of M(θ).
Remark 2.45
We can always see the polar curve (I, ρ) as a parametric curve in Cartesian coordinates:

    x(t) = ρ(t) cos(t)
    y(t) = ρ(t) sin(t),    t ∈ I .

In theory, we thus already know how to study such a parametric curve. So what is the difference with what we have done so far? In fact, if we have a parametric curve t ∈ I ↦ x(t) ~ı + y(t) ~, we can always write the vector x(t) ~ı + y(t) ~ in polar coordinates as some ρ(t) ~u_θ(t). We can even do so in such a way that t ∈ I ↦ ρ(t) and t ∈ I ↦ θ(t) are continuous functions. But, in general, this is not a polar curve! Polar curves are more restrictive as the parameter has to be a polar angle (modulo π). In other words, we must have θ(t) = t mod π.
Polar curves may thus have special properties that general parametric curves do not necessarilypossess. Let us now see some of their particularities.
2.5.3 What is the difference with a usual graph?
Let us investigate the difference between sketching the graph of ρ as usual and sketching the polar curve (I, ρ).
• Graph. When we plot the graph of ρ, we plot for each θ ∈ I the point with Cartesian coordinates (θ, ρ(θ)). In other words, on the vertical line x = θ, we select the point with ordinate ρ(θ). As θ ranges over I, the vertical line x = θ moves at constant speed from left to right.

[Figure: the graph of a function ρ over an interval I of length π.]
• Polar curve. Now, when we plot the polar curve (I, ρ), for each θ ∈ I, we select the point with coordinate ρ(θ) on the line L(O, ~uθ). As θ ranges over I, the line L(O, ~uθ) turns around the pole at constant speed in the direct sense.

[Figure: the same function ρ plotted as a polar curve.]
2.5.4 Tangents
Let us suppose that ρ is differentiable at θ0 ∈ I. Then ~v(θ) = ρ(θ) ~uθ is differentiable at θ0 and

    ~v′(θ0) = ρ′(θ0) ~u_θ0 + ρ(θ0) ~v_θ0 .     (2.6)
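Formula (2.6) can be checked numerically for a concrete ρ; below we take the arbitrary, illustrative choice ρ(θ) = 1 + θ² and compare a central finite difference of ~v with ρ′(θ0) ~u_θ0 + ρ(θ0) ~v_θ0 expressed in the (~ı, ~) basis.

```python
import math

rho = lambda t: 1 + t * t      # an arbitrary differentiable rho (illustrative choice)
drho = lambda t: 2 * t         # its derivative

def v(t):  # v(theta) = rho(theta) * u_theta, in Cartesian coordinates
    return (rho(t) * math.cos(t), rho(t) * math.sin(t))

h, t0 = 1e-6, 0.6
dv = [(a - b) / (2 * h) for a, b in zip(v(t0 + h), v(t0 - h))]
# rho'(t0) u_{t0} + rho(t0) v_{t0}, written out in the (i, j) basis
rhs = (drho(t0) * math.cos(t0) - rho(t0) * math.sin(t0),
       drho(t0) * math.sin(t0) + rho(t0) * math.cos(t0))
assert all(abs(a - b) < 1e-7 for a, b in zip(dv, rhs))
```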
Proposition 2.46 ⟨tangents⟩
Let θ0 ∈ I.
(i) If M(θ0) = O and there exists ε > 0 such that M(θ) ≠ O for θ ∈ (θ0 − ε, θ0) ∪ (θ0, θ0 + ε), then the polar curve admits as tangent at θ0 the line L(O, ~u_θ0).
(ii) If M(θ0) ≠ O and ρ is differentiable at θ0, then θ0 is regular and the polar curve admits as tangent at θ0 the line L(M(θ0), ρ′(θ0) ~u_θ0 + ρ(θ0) ~v_θ0).

Proof. (i) The line M(θ0)M(θ) is well defined for θ ∈ (θ0 − ε, θ0) ∪ (θ0, θ0 + ε) and admits ~uθ as direction vector. The result follows from the fact that ~uθ → ~u_θ0 as θ → θ0.
(ii) Saying that M(θ0) ≠ O is equivalent to saying that ρ(θ0) ≠ 0. If ρ is differentiable at θ0, then (2.6) gives that ~v′(θ0) ≠ ~0, so that θ0 is regular and Proposition 2.21 entails the result.
[Figure: a polar curve passing through the pole at time θ0, with tangent directed by ~u_θ0.]
Exercise 2.47 solution page 143
Find the tangents at times π/3 and π/2 of the polar curve ρ(θ) = 1 − 2 cos(θ).
Let us see in more detail what this proposition entails. If ρ is differentiable on I, then (ii) ensures that all the times are regular except possibly the times θ such that M(θ) = O.
Let θ0 be such that ρ cancels only at θ0 in a neighborhood of θ0. As the line M(θ0)M(θ) admits ~uθ as direction vector, we see that, if ρ changes sign at θ0, then we have a standard time; if ρ does not change sign at θ0, then we have a cusp of the first kind. (Note that, if ρ does not change sign at θ0, then θ0 is a singular time (indeed, either ρ is not differentiable at θ0, or it is differentiable and ρ′(θ0) = 0).)
In particular, a differentiable polar curve can never have a cusp of the second kind!
Example 2.48
Let us study at time π/2 the polar curves

    ρ(θ) = (θ + 1) cos(θ)  and  ρ(θ) = 2 cos²(θ) .

In both cases, ρ(π/2) = 0, so that M(π/2) = O. We thus know that the tangent is the line L(O, ~u_(π/2)), that is, the vertical axis.
• For the first polar curve, ρ takes positive values shortly before time π/2 and negative values shortly after. As a result, the point M(θ) passes through the origin from the top to the bottom (M(θ) ∈ R(O, ~uθ) for θ < π/2 and M(θ) ∈ R(O, −~uθ) for θ > π/2): we have a standard time.
• In the second case, ρ stays positive in a neighborhood of time π/2. As a result, the point M(θ) stays on top of the horizontal axis (M(θ) ∈ R(O, ~uθ) for θ close to π/2): we have a cusp of the first kind.
[Figure: the two polar curves near the pole; the first passes through O at θ = π/2 (standard time), the second has a cusp of the first kind there.]
The tangent is naturally expressed in the orthonormal basis (M(θ0); ~u_θ0, ~v_θ0) of E2, rather than in (O; ~ı, ~). Instead of trying to transform the expression, it is often convenient to work directly in this basis.
Definition 2.49 ⟨mobile basis⟩
The orthonormal basis (M(θ0); ~u_θ0, ~v_θ0) of E2 is called the mobile basis at the time θ0.
[Figure: the mobile basis (M(θ0); ~u_θ0, ~v_θ0), the tangent vector ~v′(θ0) with component ρ′(θ0) along ~u_θ0 and ρ(θ0) along ~v_θ0, and a dotted circle of center O through M(θ0).]
In fact, under the hypothesis that ρ is differentiable at θ0, the tangent at time θ0 is very constrained. In the mobile basis, the coordinates of its direction vector are (ρ′(θ0), ρ(θ0)), so that the coordinate along ~v_θ0 is equal, up to sign, to the distance between O and M(θ0). On the above picture, this fact is symbolized by the dotted circle.
• Looking for inflection times. Let us suppose that ρ is twice differentiable. We have

    ~v′(θ) = ρ′(θ) ~uθ + ρ(θ) ~vθ  and  ~v″(θ) = (ρ″(θ) − ρ(θ)) ~uθ + 2ρ′(θ) ~vθ .

Remember from the end of Section 2.3.3 that, at an inflection time, these vectors are collinear. Remember also that the converse is false! It is enough to study the times θ such that

    det(~v′(θ), ~v″(θ)) = 0  ⟺  2ρ′(θ)² + ρ(θ)² − ρ(θ)ρ″(θ) = 0 .
We already know from above that, if ρ(θ) = 0, then θ cannot be an inflection time. Observing that

    (1/ρ)′ = −ρ′/ρ²  and  (1/ρ)″ = −ρ″/ρ² + 2ρ′²/ρ³ = (2ρ′² − ρρ″)/ρ³ ,

it is enough to study the times at which

    1/ρ + (1/ρ)″ = 0 .
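For a nonvanishing ρ, the criterion above amounts to the identity 1/ρ + (1/ρ)″ = (2ρ′² + ρ² − ρρ″)/ρ³; the following sketch checks it numerically for the illustrative choice ρ(θ) = 2 + sin(θ), approximating (1/ρ)″ by a second difference.

```python
import math

rho = lambda t: 2 + math.sin(t)   # an arbitrary nonvanishing rho (illustrative choice)
t0, h = 0.8, 1e-4
r, r1, r2 = rho(t0), math.cos(t0), -math.sin(t0)   # rho, rho', rho'' at t0 (exact)

inv = lambda t: 1 / rho(t)
inv2 = (inv(t0 + h) - 2 * inv(t0) + inv(t0 - h)) / h**2   # second difference for (1/rho)''

lhs = inv(t0) + inv2
rhs = (2 * r1**2 + r**2 - r * r2) / r**3
assert abs(lhs - rhs) < 1e-6
```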
2.5.5 Extremities of the interval of study
• Finite extremities. Let us first suppose that ‖~v(θ)‖ → ∞ as θ approaches a finite extremity θ0 ∈ R of I. This means that ρ(θ) → ±∞ as θ → θ0. We have that²

    ~v(θ)/‖~v(θ)‖ = Sign(ρ(θ)) ~uθ → ε ~u_θ0 ,

where ε ∈ {−1, +1} is the limit of the sign of ρ. We thus always have an asymptotic direction R(O, ε ~u_θ0).
In order to see whether there exists an asymptote or not, we work in the basis (O, ~u_θ0, ~v_θ0). In this basis, the coordinates of M(θ) are

    (~v(θ) · ~u_θ0, ~v(θ) · ~v_θ0) = (ρ(θ) cos(θ − θ0), ρ(θ) sin(θ − θ0)) .

The abscissa tends to ε∞. If the ordinate has a limit ρ(θ) sin(θ − θ0) → α ∈ R, then the line of equation Y = α in this basis is an asymptote. In other words, this is the line L(α ~v_θ0, ~u_θ0). The side is given by ε and the relative position is obtained by studying the sign of ρ(θ) sin(θ − θ0) − α.
• Infinite extremities. Let us now see what can happen as θ → ±∞. There are three cases of interest:
• if ρ(θ) → 0, then the pole O is an asymptotic point;
• if ρ(θ) → a ∈ R⋆, then the circle of center O and radius |a| is an asymptotic circle (the relative position is obtained by studying the sign of ρ(θ) − a);
• if ρ(θ) → ±∞, then the polar curve behaves as a spiral.

²We denote by Sign the sign function defined by Sign(x) = −1 if x < 0, Sign(0) = 0, and Sign(x) = +1 if x > 0.
[Figures: the polar curves ρ : θ ∈ R ↦ sin(θ)/θ and ρ : θ ∈ R ↦ sin(2θ)/(2θ) ⟨asymptotic point⟩; ρ : θ ∈ [1, +∞) ↦ 1 − 1/θ ⟨asymptotic circle⟩; ρ : θ ∈ [1, +∞) ↦ θ ⟨spiral behavior⟩.]
Remark 2.50
The first two examples ρ1 : θ ∈ R ↦ sin(θ)/θ and ρ2 : θ ∈ R ↦ sin(2θ)/(2θ) are also there to illustrate the fact that, although ρ2(θ) = ρ1(2θ), the curves do not really look alike. This is simply due to the fact that the second one is ρ2(θ) ~uθ = ρ1(2θ) ~uθ ≠ ρ1(2θ) ~u_2θ. Reparameterization θ ↦ ϕ(θ) only works for polar curves if ~u_ϕ(θ) is a real function of ~uθ; this is quite restrictive!
2.5.6 Sketching
The sketching plan for a polar curve is the same as the one for general parametric curves. Let us recall it and review the particularities of polar curves.
• Interval of definition. If ρ is given without its interval of definition I, you should find it. If ρ is given with a definition domain that is not an interval, split it into separate intervals.
• Interval of study. If ρ is T-periodic, we study it on some interval of length T.

Warning 2.51 ⚠ The period of ρ is not the period of ~v! ⚠
Beware that ρ(θ + T) = ρ(θ) implies that ~v(θ + T) = ρ(θ) ~u_(θ+T). If T ∉ 2πZ, then ~u_(θ+T) ≠ ~uθ.
However, ρ(θ) ~u_(θ+T) is the image of ρ(θ) ~uθ through a rotation about O of angle T. We thus study the polar curve on an interval of length T and complete the sketch through subsequent rotations of angle T, 2T, 3T, etc. If T/(2π) ∈ Q, then after a finite number of such rotations, the curve will close itself. Otherwise, we should do an infinite number of such rotations. . . In practice, this will hopefully not happen.
Here is a list of useful isometries in polar coordinates.
• ρ ~uθ ↦ ρ ~u_(2ϕ−θ): reflection across the line L(O, ~uϕ); in particular,
  – ρ ~uθ ↦ ρ ~u_(−θ) = −ρ ~u_(π−θ): reflection across the horizontal axis,
  – ρ ~uθ ↦ ρ ~u_(π−θ) = −ρ ~u_(−θ): reflection across the vertical axis,
  – ρ ~uθ ↦ ρ ~u_(π/2−θ) = −ρ ~u_(−π/2−θ): reflection across the first bisector (line y = x),
  – ρ ~uθ ↦ ρ ~u_(−π/2−θ) = −ρ ~u_(π/2−θ): reflection across the second bisector (line y = −x).
• ρ ~uθ ↦ ρ ~u_(θ+ϕ): rotation about O of angle ϕ; in particular,
  – ρ ~uθ ↦ ρ ~u_(θ±π) = −ρ ~uθ: rotation about O of angle π.
Summing up, we look for real numbers a ∈ R such that ρ(θ + a) = ±ρ(θ) or ρ(a − θ) = ±ρ(θ).
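Each identity in the list can be verified on coordinates; for instance, the sketch below checks at one arbitrary sample point that ρ ~u_(−θ) is the reflection of ρ ~uθ across the horizontal axis, and that ρ ~u_(−θ) = −ρ ~u_(π−θ).

```python
import math

def point(rho, theta):  # Cartesian coordinates of rho * u_theta
    return (rho * math.cos(theta), rho * math.sin(theta))

rho, theta = 1.7, 0.83  # arbitrary sample values
x, y = point(rho, theta)

# rho * u_{-theta} is the reflection of rho * u_theta across the horizontal axis
xr, yr = point(rho, -theta)
assert abs(xr - x) < 1e-12 and abs(yr + y) < 1e-12

# rho * u_{-theta} = -rho * u_{pi - theta}
xs, ys = point(-rho, math.pi - theta)
assert abs(xs - xr) < 1e-12 and abs(ys - yr) < 1e-12
```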
[Figures: the reflection ρ ~uθ ↦ ρ ~u_(2ϕ−θ) across the line L(O, ~uϕ), and the rotation ρ ~uθ ↦ ρ ~u_(θ+ϕ) of angle ϕ about O.]
Example 2.52
Let us find an interval of study for the polar curve

    ρ(θ) = 1 + 2 cos²(θ).

The function ρ is defined on R and is 2π-periodic. We thus have, for θ ∈ R,

    ~v(θ + 2π) = ρ(θ + 2π) ~u_(θ+2π) = ρ(θ) ~uθ = ~v(θ) .

(Here, as the period of ρ is 2π, ~v is also 2π-periodic; remember Warning 2.51.) As a result, we have the entire curve by studying an interval of length 2π.
The function ρ is even. For θ ∈ R,

    ~v(−θ) = ρ(−θ) ~u_(−θ) = ρ(θ) ~u_(−θ) ,

so that M(−θ) is obtained from M(θ) by reflection across the horizontal axis. We study ([0, π], ρ) and complete the picture by a reflection across the horizontal axis.
For θ ∈ [0, π],

    ~v(π − θ) = ρ(π − θ) ~u_(π−θ) = ρ(θ) ~u_(π−θ) ,

so that M(π − θ) is obtained from M(θ) by reflection across the vertical axis. We study ([0, π/2], ρ) and complete the picture by a reflection across the vertical axis and a reflection across the horizontal axis. Here is the result:
[Figure: the curve ρ(θ) = 1 + 2 cos²(θ), with the arc θ ∈ [0, π/2] highlighted.]
• Sign and variations of ρ. Draw the table of variations of ρ in such a way that the sign of ρ can be directly obtained (in other words, put the canceling times of ρ in the table). In particular, solve ρ(θ) = 0 (on the interval of study) in order to find the times at which the curve passes through the pole. The sign of ρ around such times allows one to decide between standard times and cusps of the first kind.
• Extremities of the interval of study. Look at what happens at the extremities. See Section 2.5.5.
• Particular times. Study the times that have a particular interest (multiple points, intersections with the axes, etc.). Note that a multiple point is such that there exist times θ1, θ2 such that ~v(θ1) = ~v(θ2). This means that ρ(θ1) ~u_θ1 = ρ(θ2) ~u_θ2, which implies that ~u_θ1 = ±~u_θ2, so that θ2 − θ1 ∈ πZ.
Beware that, although one of the times (say, θ1) can be chosen in the interval of study, the other one does not necessarily belong to this interval (but can nonetheless be chosen in an interval of length the period of ~v, if it exists).
• Precise sketch. Don't forget to draw the tangents at the times of interest. It is also good practice to indicate the values of the parameter at these times. The golden rule for sketching a polar curve is the following:
• always turn in the direct sense (counterclockwise sense) at constant angular speed;
• if |ρ| increases, move away from the pole; if |ρ| decreases, move toward the pole.
This is simply due to the fact that θ ↦ ~uθ turns around the pole at constant angular speed in the direct sense.
Example 2.53 ⟨cardioid⟩
Let us study the polar curve

    ρ(θ) = 1 − cos(θ).

The domain of definition is not explicit; we take R, on which ρ is well defined.
• Interval of study. The function ρ is 2π-periodic and, as θ ↦ ~uθ is also 2π-periodic, the polar curve itself is 2π-periodic. Moreover, ρ is even, so that we study ([0, π], ρ) and complete the picture by reflection across the horizontal axis.
• Sign and variations. We have ρ′(θ) = sin(θ).

    θ     0         π
    ρ′    0    +    0
    ρ     0    ↗    2
• Particular times. The time 0 is the only time where M = O. The tangent is the horizontal axis (L(O, ~u0)).
In Cartesian coordinates, the parametric curve is

    x(θ) = ρ(θ) cos(θ) = cos(θ) − cos²(θ)
    y(θ) = ρ(θ) sin(θ) = sin(θ) − sin(θ) cos(θ),    θ ∈ [0, π] .

We have a horizontal tangent when y′ cancels (provided that x′ does not) and a vertical tangent when x′ cancels (provided that y′ does not). We have x′(θ) = sin(θ)(2 cos(θ) − 1), so that the canceling times of x′ on [0, π] are 0, π/3 and π. Next, y′(θ) = cos(θ) − cos²(θ) + sin²(θ) = −2 cos²(θ) + cos(θ) + 1 = (2 cos(θ) + 1)(1 − cos(θ)), so that the canceling times of y′ on [0, π] are 0 and 2π/3.
As a result, we have a horizontal tangent at time 2π/3 and vertical tangents at times π/3 and π. At time 0, we cannot conclude from this analysis but we already know that there is a horizontal tangent.
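The canceling times found above can be confirmed numerically; the following sketch (ours, for checking only) evaluates x′ and y′ of the cardioid at the claimed times.

```python
import math

def xprime(t):  # derivative of x(t) = cos t - cos^2 t
    return math.sin(t) * (2 * math.cos(t) - 1)

def yprime(t):  # derivative of y(t) = sin t - sin t cos t
    return math.cos(t) - math.cos(t)**2 + math.sin(t)**2

for t in (0.0, math.pi / 3, math.pi):   # canceling times of x' on [0, pi]
    assert abs(xprime(t)) < 1e-12
for t in (0.0, 2 * math.pi / 3):        # canceling times of y' on [0, pi]
    assert abs(yprime(t)) < 1e-12
```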
• Sketch.

[Sketch of the cardioid, with the parameter values 0, π/3, 2π/3, π marked and the unit scale shown.]
Example 2.54
Let us study the polar curve

    ρ(θ) = 1 + tan(θ/2) .

The function ρ is not defined on π + 2πZ. We must, a priori, study each polar curve ((−π + 2kπ, π + 2kπ), ρ) for k ∈ Z.
• Interval of study. The function ρ is 2π-periodic and, as θ ↦ ~uθ is also 2π-periodic, each polar curve ((−π + 2kπ, π + 2kπ), ρ) for k ∈ Z has the same image. We cannot further reduce the interval of study: we choose for instance to study ((−π, π), ρ).
• Sign and variations. On (−π, π), we have ρ′(θ) = 1/(2 cos²(θ/2)).

    θ     −π        −π/2        0         π/2        π
    ρ′          +           +  1/2   +     1     +
    ρ     −∞    ↗     0     ↗    1    ↗     2    ↗   +∞
• Particular times. At time −π/2, the tangent is L(O, ~u_(−π/2)), that is, the vertical axis.
We added to the table of variations the values ρ(0) = 1, ρ′(0) = 1/2, ρ(π/2) = 2 and ρ′(π/2) = 1, as they are easily computable. The tangents at these points have direction

    ρ′(0) ~u0 + ρ(0) ~v0 = (1/2) ~ı + ~  and  ρ′(π/2) ~u_(π/2) + ρ(π/2) ~v_(π/2) = ~ − 2 ~ı .
• Extremities. As θ ↗ π, we have ρ(θ) → +∞: the polar curve admits as asymptotic direction the ray R(O, ~uπ) = R(O, −~ı). As θ ↘ −π, we have ρ(θ) → −∞: the polar curve admits as asymptotic direction the ray R(O, −~u_(−π)) = R(O, ~ı). In fact, it is enough to study ρ in a neighborhood of π by periodicity: the behavior on (−π, −π + ε) is the same as that on (π, π + ε) (we momentarily change the interval of study).
In order to see whether there are asymptotes, we work in the basis (O, ~uπ, ~vπ). In this basis, the ordinate of M(θ) is ρ(θ) sin(θ − π) = −ρ(θ) sin(θ). Recalling that

    sin(θ) = 2 sin(θ/2) cos(θ/2) = 2 sin(θ/2) cos(θ/2) / (cos²(θ/2) + sin²(θ/2)) = 2 tan(θ/2) / (1 + tan²(θ/2)) ,

we obtain that the ordinate of M(θ) is

    −ρ(θ) sin(θ) = −2t(1 + t)/(1 + t²) ,

where we set t = tan(θ/2). As θ → π, t → ±∞, so that the ordinate of M(θ) tends to −2. We thus have an asymptote with equation Y = −2 in the basis (O, ~uπ, ~vπ). Moreover, we obtain the relative position by studying the sign of

    −ρ(θ) sin(θ) − (−2) = 2(1 − t)/(1 + t²) ,

which is negative as θ ↗ π (as t → +∞) and positive as θ ↘ π (as t → −∞).
Summing up, in the basis (O, ~uπ, ~vπ), as θ ↗ π, the point M(θ) goes to the right from below the line Y = −2. As θ ↘ π, the point M(θ) goes to the left from above the line Y = −2. Now, be careful that working in the basis (O, ~uπ, ~vπ) means making a rotation about the pole of angle π. In (O, ~ı, ~), as θ ↗ π, the point M(θ) goes to the left from above the line y = 2. As θ ↘ π, the point M(θ) goes to the right from below the line y = 2.
• Double point. From the previous study (and a rough draft of the plot), we see that there is a double point obtained for θ1 ∈ (0, π/2) and θ2 ∈ (−π, −π/2). Let us try to find these values. We have

    ~v(θ1) = ~v(θ2)  ⟹  ρ(θ1) ~u_θ1 = ρ(θ2) ~u_θ2  ⟹  θ2 = θ1 − π and ρ(θ1) = −ρ(θ2) .

We thus need to solve

    ρ(θ1) = −ρ(θ1 − π)  ⟹  1 + tan(θ1/2) = −1 − tan(θ1/2 − π/2)
                        ⟹  1 + tan(θ1/2) = −1 + 1/tan(θ1/2)
                        ⟹  1 + t = −1 + 1/t ,  setting t := tan(θ1/2) > 0
                        ⟹  t² + 2t − 1 = 0
                        ⟹  t = −1 ± √2
                        ⟹  t = −1 + √2 ,  as t > 0
                        ⟹  θ1 = π/4 .
Note: you are not expected to know such results. In exercises and tests, you will get help. You are,however, expected to know the method.
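As a quick numerical confirmation (ours, not part of the notes), the times θ1 = π/4 and θ2 = θ1 − π = −3π/4 do give the same point, which turns out to be (1, 1):

```python
import math

def v(theta):  # Cartesian coordinates of rho(theta) * u_theta for rho = 1 + tan(theta/2)
    rho = 1 + math.tan(theta / 2)
    return (rho * math.cos(theta), rho * math.sin(theta))

p1 = v(math.pi / 4)
p2 = v(math.pi / 4 - math.pi)   # theta_2 = theta_1 - pi
assert abs(p1[0] - p2[0]) < 1e-12 and abs(p1[1] - p2[1]) < 1e-12
assert abs(p1[0] - 1) < 1e-12 and abs(p1[1] - 1) < 1e-12   # the double point is (1, 1)
```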
• Sketch.

[Sketch of the curve, with the asymptote, the parameter values −π/2, 0, π/2 marked, the double point reached at π/4 and −3π/4, and the branches as θ → π⁻ and θ → −π⁺.]
3 Ordinary differential equations

In this chapter, we will see how to solve some ordinary differential equations, mainly first order linear differential equations and linear systems of ODEs, with a special focus on linear differential equations with constant coefficients.
For further references about this chapter, you may consult
h [Cod61], [CC97, 1–3], [BDH11, 1–3];
h [LM07, 31], [MTW07, 16], in French.
3.1 Introduction
3.1.1 Motivation
3.1.2 Formal definitions
3.1.3 Separable differential equations
3.1.4 Linear ODEs
3.2 First order linear differential equations
3.2.1 Homogeneous equation
3.2.2 Finding a particular solution to y′ = a(x)y + b(x)
3.2.3 Solution to the nonhomogeneous equation
3.3 Systems of linear ODEs
3.3.1 Preliminaries: matrix exponential
3.3.2 Solution to the homogeneous equation
3.3.3 Solution to the nonhomogeneous equation
3.3.4 Method for solving a system of linear ODEs in practice
3.4 Linear differential equations with constant coefficients
3.4.1 Homogeneous equation
3.4.2 Nonhomogeneous equation
3.4.3 Example: second order equation
3.1 Introduction
3.1.1 Motivation
Fundamental quantities of physics are sometimes linked through equations, and some quantities are obtained from others by differentiation (for instance, the acceleration is the derivative of the speed, which is the derivative of the position). This may give rise to equations linking a quantity with several of its derivatives. Solving such equations is key to a better understanding of the physical world.
A very basic example is that of a falling object. If we neglect friction forces, the object is only subject to gravity (this is the context of so-called free fall). Applying Newton's Fundamental Principle of Dynamics, we get

    m a(t) = m g ,

where m denotes the mass of the object, g the gravitational acceleration, and a the vertical acceleration of the object. As the acceleration is the second derivative of the position, which we denote by y, we simply obtain y″(t) = g. This equation may directly be integrated as

    y′(t) = v0 + g t  and finally  y(t) = y0 + v0 t + g t²/2 ,

where y0 and v0 denote the initial position and speed.
[Figure: the falling object subject to its weight m g ~ey; ~ey is a unitary vector along the y axis.]
Now, if we want to take friction forces into account, we may use the classical model where the friction is the product of a friction coefficient f with the velocity v(t) = y′(t), and is opposite to the motion of the object. We thus obtain

    m a(t) = m g − f v(t),

which can be rewritten as

    v′(t) + (f/m) v(t) = g.     (3.1)

[Figure: the falling object subject to its weight m g ~ey and to the friction force −f v(t) ~ey.]
Solving Equation (3.1) is not as easy as solving the one we obtained earlier. This is an example of an ordinary differential equation. The aim of this chapter is to see how to integrate such equations.
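Even before solving (3.1) in closed form, one can approximate its solution numerically, for instance with the forward Euler scheme. The sketch below (the values of f, m and the step size are illustrative, not from the notes) compares the scheme against the closed-form solution v(t) = (mg/f)(1 − e^(−ft/m)) for v(0) = 0, which the methods of this chapter produce.

```python
import math

# forward Euler for v'(t) = g - (f/m) v(t), v(0) = 0
g, f, m = 9.81, 0.5, 2.0      # illustrative physical values
dt, T = 1e-4, 3.0             # step size and final time (illustrative)
v, t = 0.0, 0.0
while t < T - 1e-12:
    v += dt * (g - (f / m) * v)   # v(t + dt) ~ v(t) + dt * v'(t)
    t += dt

exact = (m * g / f) * (1 - math.exp(-f * t / m))  # closed-form solution with v(0) = 0
assert abs(v - exact) < 1e-2
```

The global error of forward Euler is proportional to the step size, so shrinking dt brings the two values closer.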
Remark 3.1
In the physics literature, it is common to denote the first and second derivatives with respect to time of the function y respectively by ẏ and ÿ. For instance, (3.1) would read ÿ + (f/m) ẏ = g with this notation.
3.1.2 Formal definitions
Definition 3.2 ⟨ODE⟩
• An ordinary differential equation (ODE in short) of order n is an equation of the form

    F(x, y(x), y′(x), y″(x), . . . , y⁽ⁿ⁾(x)) = 0     (ODE)

where F : R^(n+2) → R is a function that is not constant with respect to its last variable.
• A solution to (ODE) on an interval I ⊆ R is an n times differentiable function y : I → R such that (ODE) holds for all x ∈ I.
Remark 3.3
Depending on the context, the variable is usually x or t (often when it represents time) and the function is often y or x (only when x is not used as the variable).

Remark 3.4
In order to lighten the notation, the argument of the function is often omitted; we thus write for instance y′ = y + cos(x), which should be understood as y′(x) = y(x) + cos(x).
Remark 3.5
The interval on which an ODE is to be solved is usually not mentioned. Your first task is to determine where the ODE makes sense! For instance, the ODE √(1 − x) y″ + y/x = eˣ only makes sense when x ↦ √(1 − x), x ↦ 1/x and x ↦ eˣ are defined. We can thus only try to solve this ODE on the intervals (−∞, 0) and (0, 1) (which does not ensure that there will be solutions defined on these intervals).
Some ODEs can easily be solved by guesswork:

Exercise 3.6 solution page 143
Find at least one solution to each of the following ODEs:
(i) y′ = x + sin(x)   (ii) y′ = y   (iii) y′ = 7y   (iv) y″ = 2y
Checking that a given function is a solution to an ODE is also often quite easy:
Exercise 3.7 solution page 144
Check that x ∈ (−c, +∞) ↦ 1/(x + c) is a solution to y′ = −y² for any fixed constant c ∈ R.
But solving (or integrating) an ODE consists in finding all its solutions, and there is no way to do so in general. Even showing that a solution exists can sometimes prove very difficult! In real-life situations, it often happens that the general solution to an ODE cannot be found; instead, one can use approximate solutions: this is the focus of the branch of mathematics called numerical analysis. In what follows, we will restrict our attention to particular ODEs that can be solved.
Definition 3.8 ⟨maximal solution⟩
A maximal solution to an ODE is a solution y on an interval I such that there exists no solution z on an interval J with I ⊊ J and z|I = y.

Beware that the interval on which a solution to an ODE is defined really matters. For instance, the ODE y′ = 1/x admits as maximal solutions the functions x ∈ R⋆− ↦ ln(−x) + c for c ∈ R, and x ∈ R⋆+ ↦ ln(x) + c for c ∈ R. When not specified, we will implicitly consider that the interval is R.
3.1.3 Separable differential equations
Definition 3.9 ⟨separable differential equation⟩
A separable differential equation is an ODE that can be written in the form

    y′ f(y) = g(x).

Solving such an ODE is done by finding primitives F and G of f and g and noticing that

    (F ∘ y)′ = y′ F′(y) = y′ f(y) = g = G′ ,

which is equivalent to F ∘ y = G + c for some constant c ∈ R.
In practice, it is useful to write f(y) dy = g(x) dx and integrate both sides as

    ∫ from y0 to y of f(u) du = ∫ from x0 to x of g(u) du ,

which gives F(y) − F(y0) = G(x) − G(x0). This is the same as above, the value of the constant being chosen such that y(x0) = y0.
Example 3.10
Let us solve x² y′ = e^(−y). We first "separate" the variables. Note that this equation cannot be satisfied at x = 0 for any function y, so that it is safe to assume x ≠ 0. We rewrite the equation as

    y′ e^y = 1/x² ,  which gives  e^y = −1/x + c  (c ∈ R) .

This can only make sense when −1/x + c > 0, in which case

    y(x) = ln(−1/x + c) .

This is a well-defined, differentiable, maximal solution on
• (1/c, 0) if c < 0;
• R⋆− if c ≥ 0;
• (1/c, +∞) if c > 0.
3.1.4 Linear ODEs
From now on, we will only focus on a special type of ODEs, called linear.
Definition 3.11 ⟨linear differential equation⟩
• A linear differential equation of order n is an ODE of the form

    a0(x) y + a1(x) y′ + · · · + an(x) y⁽ⁿ⁾ = b(x)

where each ai and b are real-valued continuous functions on a common interval I ⊆ R, and an ≢ 0 on I.¹
• It is called homogeneous if b ≡ 0, nonhomogeneous if b ≢ 0.
• We speak of a linear differential equation with constant coefficients when all the ai's are constant (note that b is not necessarily constant).

¹Recall that we use the notation ≡ for functional equality: f ≡ g on I means ∀x ∈ I, f(x) = g(x), and f ≢ g on I means ∃x ∈ I, f(x) ≠ g(x).
Example 3.12
(i) y′ + 5x y = eˣ is a first order linear differential equation.
(ii) y′ + 5x y = 0 is the homogeneous linear differential equation associated with the previous one.
(iii) 2y″ − 3y′ + 5y = 0 is a second order homogeneous linear differential equation with constant coefficients.
(iv) y′² − y = x and y″ y′ − y = 0 are not linear differential equations.
The term linear comes from the following proposition.

Proposition 3.13
The set of solutions on some fixed interval J ⊆ I to a homogeneous linear differential equation is a real vector space.

Proof. Clearly, x ∈ J ↦ 0 is a valid solution. If y1 and y2 are solutions to a given homogeneous linear differential equation, then it is plain to see that, for any real number λ, λy1 + y2 is also a solution.
In order to solve a linear differential equation, one proceeds in two steps:
(i) solve the associated homogeneous equation;
(ii) find a particular solution to the original equation.
The second step is only needed when the original equation is not homogeneous. One then concludes as follows.
Proposition 3.14
Let yE be a solution on some interval J ⊆ I to

    a0(x) y + a1(x) y′ + · · · + an(x) y⁽ⁿ⁾ = b(x)     (E)

and SH be the set of solutions on J to the associated homogeneous equation

    a0(x) y + a1(x) y′ + · · · + an(x) y⁽ⁿ⁾ = 0 .     (H)

Then the set of solutions on J to (E) is SE := { yE + y : y ∈ SH }.

Proof. Let y be a solution to (H). By adding, for each x ∈ J,

    a0(x) yE + a1(x) yE′ + · · · + an(x) yE⁽ⁿ⁾ = b(x)  and  a0(x) y + a1(x) y′ + · · · + an(x) y⁽ⁿ⁾ = 0 ,

one sees that y + yE is a solution to (E). Conversely, let z be a solution to (E). Then, for each x ∈ J,

    a0(x) z + a1(x) z′ + · · · + an(x) z⁽ⁿ⁾ = b(x)  and  a0(x) yE + a1(x) yE′ + · · · + an(x) yE⁽ⁿ⁾ = b(x) ,

so that, by subtraction, z − yE ∈ SH. There thus exists y = z − yE ∈ SH such that z = yE + y.
Proposition 3.15 (superposition of solutions)
If y1, y2, . . . , yk are respectively solutions on some interval J ⊆ I to
a0(x)y + a1(x)y′ + · · · + an(x)y(n) = b1(x)
a0(x)y + a1(x)y′ + · · · + an(x)y(n) = b2(x)
⋮
a0(x)y + a1(x)y′ + · · · + an(x)y(n) = bk(x),
then y1 + y2 + · · · + yk is a solution on J to
a0(x)y + a1(x)y′ + · · · + an(x)y(n) = b1(x) + b2(x) + · · · + bk(x).
Proof. It suffices to add up, for each x ∈ J, the equations
a0(x)y1 + a1(x)y1′ + · · · + an(x)y1(n) = b1(x)
a0(x)y2 + a1(x)y2′ + · · · + an(x)y2(n) = b2(x)
⋮
a0(x)yk + a1(x)yk′ + · · · + an(x)yk(n) = bk(x)
in order to obtain the result.
3.2 First order linear differential equations
The aim of this section is to solve linear differential equations of order 1, which take the form a0(x)y + a1(x)y′ = b(x) with a0, a1 and b real-valued continuous functions on a common interval I ⊆ R, and a1 ≢ 0 on I. Such an equation will first be solved on each interval where a1 does not vanish; one should then see whether the solutions can be extended over the points where a1 vanishes and patched over the different intervals.
Example 3.16
We want to solve xy′ = y. This simple example could be solved directly, but the general method that we will see consists in solving y′ = y/x. We will find that all the solutions are x ∈ R⋆− ↦ c−x and x ∈ R⋆+ ↦ c+x for any constants c−, c+ ∈ R. Two such functions can be extended at 0 and reunited in a differentiable way only if c− = c+. In the end, we check that each x ∈ R ↦ cx, for a constant c ∈ R, is a solution to the initial equation xy′ = y.
Dividing by a1, our equation takes the form
y′ = a(x)y + b(x) (E1)
with a and b continuous on some interval I ⊆ R. As explained in Proposition 3.14, we will first solve the homogeneous equation y′ = a(x)y and then find a particular solution to (E1).
3.2.1 Homogeneous equation
Theorem 3.17 (solutions to a homogeneous first order linear equation)
Let a : I → R be a continuous function and A : I → R one of its primitives. The solutions on I to
y′ = a(x)y (H1)
are the functions x ∈ I ↦ c eA(x), for any constant c ∈ R.
Proof. We have
(H1) ⇐⇒ ∀x ∈ I, y′(x) − a(x)y(x) = 0
⇐⇒ ∀x ∈ I, e−A(x)(y′(x) − a(x)y(x)) = 0
⇐⇒ ∀x ∈ I, y′(x)e−A(x) − y(x)A′(x)e−A(x) = 0
⇐⇒ ∀x ∈ I, (y(x)e−A(x))′ = 0
⇐⇒ ∃c ∈ R, ∀x ∈ I, y(x)e−A(x) = c
⇐⇒ ∃c ∈ R, ∀x ∈ I, y(x) = c eA(x).
The previous proof is quite ad hoc. A fast way to recover Theorem 3.17 is to write (assuming that y does not vanish)
y′/y = a(x) ⇐⇒ ∃k ∈ R, ln |y(x)| = A(x) + k
⇐⇒ ∃k ∈ R, |y(x)| = eA(x)+k
⇐⇒ ∃k ∈ R, y(x) = ±ek eA(x)
⇐⇒ ∃c ∈ R⋆, y(x) = c eA(x) (with c = ±ek).
We thus recover all the solutions except x ∈ I ↦ 0.
Remark 3.18
Theorem 3.17 is coherent with Proposition 3.13. In fact, the set of solutions is the one-dimensional vector space Span(x ∈ I ↦ eA(x)).
An important fact to notice about the solutions is that, for any fixed x0 ∈ I and y0 ∈ R, there exists exactly one solution y to (H1) such that y(x0) = y0. Indeed, by Theorem 3.17, there exists c ∈ R such that y : x ∈ I ↦ c eA(x), and one should have c eA(x0) = y0 ⇐⇒ c = y0 e−A(x0). Conversely, the function y : x ∈ I ↦ y0 eA(x)−A(x0) is as desired. The condition y(x0) = y0 is called an initial condition to the ODE. Note that x ↦ A(x) − A(x0) is the primitive of a that vanishes at x0. Let us sum up this discussion in the following theorem.
Theorem 3.19 (Cauchy–Lipschitz: solution with initial condition)
Let a : I → R be a continuous function, x0 ∈ I and y0 ∈ R. There exists a unique solution on I to
y′ = a(x)y
that satisfies the initial condition y(x0) = y0. Moreover, this solution is x ∈ I ↦ y0 exp(∫_{x0}^{x} a(u) du).
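For readers who want to experiment, the closed form of Theorem 3.19 can be checked numerically against a crude Euler scheme. The data below (a(x) = cos(x), x0 = 0, y0 = 2) is an arbitrary illustration, not part of the course text.

```python
import numpy as np

# Illustrative check of Theorem 3.19 with a(x) = cos(x), x0 = 0, y0 = 2:
# the predicted solution is y(x) = y0 * exp(sin(x) - sin(x0)).
x0, y0 = 0.0, 2.0
a = np.cos

def y_exact(x):
    return y0 * np.exp(np.sin(x) - np.sin(x0))

# Independent check: integrate y' = a(x) y with Euler's method on [0, 2].
xs = np.linspace(x0, 2.0, 20_001)
h = xs[1] - xs[0]
y = y0
for x in xs[:-1]:
    y += h * a(x) * y

print(abs(y - y_exact(2.0)))  # small: Euler agrees with the closed form
```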
The case where a is a constant function takes a particularly simple form. Let us summarize Theorems 3.17 and 3.19 in this case.
Corollary 3.20 (homogeneous first order linear equation with constant coefficients)
Let α, x0, y0 ∈ R. The set of solutions to y′ = αy is the vector line {x ∈ R ↦ c eαx : c ∈ R}. Moreover, there exists a unique solution that satisfies the initial condition y(x0) = y0: it is x ∈ R ↦ y0 eα(x−x0).
Depending on the cases, the solutions look like this:
[Figure: graphs of x ↦ y0 eαx, for α > 0 and for α < 0, with curves for y0 > 0, y0 = 0 and y0 < 0.]
Exercise 3.21 solution page 144
Solve 2y′ − 5y = 0 and ex y′ + 2y = 0.
Recall that the set of solutions to a homogeneous linear differential equation is always a vector space (Proposition 3.13). It would be interesting to know its dimension. In light of Remark 3.18, we know that in the case of equation (H1), this vector space is a vector line (a vector space of dimension 1).
Warning 3.22 (Beware of patched solutions!)
It is wrong to say that the set of solutions to a homogeneous first order linear differential equation is always a vector line! Indeed, recall that a homogeneous first order linear differential equation has the general form
a0(x)y + a1(x)y′ = 0,
where a1 is allowed to vanish.
Let us look at a concrete example.
Example 3.23
Let us solve
xy′ = 2y.  (3.2)
We need to solve this equation on the two intervals R⋆− and R⋆+. On each of these two intervals, it is equivalent to
y′ = (2/x) y,
so, by Theorem 3.17, the solutions are x ∈ R⋆− ↦ c− x² and x ∈ R⋆+ ↦ c+ x² for any constants c−, c+ ∈ R.
Let us fix arbitrary c−, c+ ∈ R and set y− : x ∈ R⋆− ↦ c− x² and y+ : x ∈ R⋆+ ↦ c+ x². We see that y−(x) → 0 as x ↑ 0 and y+(x) → 0 as x ↓ 0, so that the function defined, for x ∈ R, by
y(x) := y−(x) if x < 0,   0 if x = 0,   y+(x) if x > 0
is continuous. Moreover, for h ≠ 0,
|(y(h) − y(0))/h| ≤ max(|c−|, |c+|) |h| → 0 as h → 0,
so that y is differentiable on R, with y′(0) = 0. As a result, y is a solution to (3.2). We conclude that the set of solutions to (3.2) is the vector plane Span(x ↦ x² 1R⋆−(x), x ↦ x² 1R⋆+(x)).ᵃ
ᵃThe function 1A is the indicator function of the set A: it is defined by 1A(x) = 1 if x ∈ A and 1A(x) = 0 if x ∉ A. For instance, our function y can be written y(x) = c− x² 1R⋆−(x) + c+ x² 1R⋆+(x).
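The differentiability at 0 of the patched function can also be observed numerically; here is a minimal sketch with the arbitrary illustrative constants c− = 3 and c+ = −7.

```python
# Patched solution of Example 3.23: y(x) = c_minus * x^2 for x < 0 and
# c_plus * x^2 for x >= 0.  The difference quotient at 0 tends to 0 for
# ANY choice of the two constants, so y'(0) = 0.
c_minus, c_plus = 3.0, -7.0  # arbitrary illustrative values

def y(x):
    return (c_minus if x < 0 else c_plus) * x * x

for h in (1e-2, 1e-4, 1e-6):
    print(abs(y(h) / h), abs(y(-h) / (-h)))  # both columns shrink to 0
```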
3.2.2 Finding a particular solution to y′ = a(x)y + b(x)
As explained above, it remains to find a particular solution to (E1). In some lucky cases, this can be done directly by guesswork:
Example 3.24
Let us solve, on (0, π),
y′ sin(x) + y cos(x) = sin(x) + x cos(x).  (3.3)
Clearly, x ∈ (0, π) ↦ x is a solution to this equation. By Theorem 3.17, the solutions to the associated homogeneous equation are, for c ∈ R,
x ∈ (0, π) ↦ c e−ln(sin(x)) = c / sin(x),
since −ln(sin(x)) is a primitive of −cos(x)/sin(x) on (0, π). By Proposition 3.14, the solutions to (3.3) are thus x ∈ (0, π) ↦ c/sin(x) + x, for c ∈ R.
Recall also Proposition 3.15: it might be useful to split up b and find separate particular solutions.
Exercise 3.25 solution page 144
Solve y′ − 2xy = 4x − 1/x² − 2.
Variation of constants
In most cases, however, we use a more robust method: the so-called variation of constants. Although the name is quite contradictory, it makes some sense. As in Theorem 3.17, let A : I → R be a primitive of a. We know that the solutions on I to (H1) have the form x ∈ I ↦ c eA(x). The idea of the method is to look for a solution to (E1) of the form yE : x ∈ I ↦ c(x) eA(x) where, now, c : I → R is a differentiable function!
As A′ = a, we have
y′E(x) = a(x)c(x)eA(x) + c′(x)eA(x) = a(x)yE(x) + c′(x)eA(x) .
As a result, yE is a solution to (E1) if and only if
y′E(x) = a(x)yE(x) + b(x) ⇐⇒ c′(x)eA(x) = b(x)
⇐⇒ c′(x) = b(x)e−A(x)
⇐⇒ c(x) = c(x0) + ∫_{x0}^{x} b(u)e−A(u) du, where x0 ∈ I is arbitrary.
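The formula for c can be checked numerically. The sketch below (with the arbitrary illustrative choice a ≡ −1, so A(x) = −x, and b(x) = ex + 1 on [0, 2]) approximates c by the trapezoidal rule and verifies that yE = c(x)eA(x) solves (E1).

```python
import numpy as np

# Variation of constants, numerically: c(x) = integral of b(u) e^{-A(u)}
# from x0 = 0 to x, then y_E(x) = c(x) e^{A(x)}.  Illustrative data only.
b = lambda x: np.exp(x) + 1.0
A = lambda x: -x

xs = np.linspace(0.0, 2.0, 4001)
h = xs[1] - xs[0]
integrand = b(xs) * np.exp(-A(xs))
c = np.concatenate(([0.0], np.cumsum((integrand[1:] + integrand[:-1]) * h / 2)))
yE = c * np.exp(A(xs))

# Check y_E' = a y_E + b, i.e. y_E' = -y_E + b, by central differences.
dy = (yE[2:] - yE[:-2]) / (2 * h)
resid = np.max(np.abs(dy - (-yE[1:-1] + b(xs[1:-1]))))
print(resid)  # small
```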
Warning 3.26 (The solution is not x ∈ I ↦ c(x) but x ∈ I ↦ c(x) eA(x))
Once you have found c(x), don't forget to multiply it by eA(x) in order to obtain the particular solution. This is a very common mistake.
Example 3.27
Let us solve
y′ + y = ex + 1  (3.4)
on R. The associated homogeneous equation is y′ = −y, whose solutions are x ↦ c e−x, c ∈ R. Let us look for a particular solution yE : x ↦ c(x)e−x. This function is a solution to (3.4) if and only if
y′E + yE = ex + 1 ⇐⇒ c′(x)e−x − c(x)e−x + c(x)e−x = ex + 1
⇐⇒ c′(x)e−x = ex + 1
⇐⇒ c′(x) = e2x + ex
⇐⇒ c(x) = (1/2)e2x + ex + k for some k ∈ R.
We choose, for instance, k = 0, which gives yE(x) = c(x)e−x = (1/2)ex + 1. The solutions to (3.4) are thus
x ∈ R ↦ (1/2)ex + 1 + c e−x, c ∈ R.
Case where a is constant
Let us suppose here that a ≡ α ≠ 0. In this case, there may be faster ways to find a particular solution to
y′ = αy + b(x)
than to use the variation of constants when b is a simple function (or a sum of simple functions, recall Proposition 3.15). In what follows, we always work on R.
• b is a polynomial function. If b is a degree k polynomial function, looking for a particular solution in the form of a degree k polynomial amounts to solving a simple linear system of equations. Indeed,
x ↦ a0 + a1x + a2x² + · · · + ak x^k
is a solution to
y′ − αy = b0 + b1x + b2x² + · · · + bk x^k
if and only if
a1 − αa0 = b0
2a2 − αa1 = b1
3a3 − αa2 = b2
⋮
kak − αak−1 = bk−1
−αak = bk.
This system can easily be solved from bottom to top.
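The bottom-to-top solve can be sketched in a few lines of code. The data below (α = 2 and b(x) = x², i.e. b = [0, 0, 1]) is an arbitrary illustration; the coefficient names follow the system above.

```python
# Solve a1 - alpha*a0 = b0, ..., k*a_k - alpha*a_{k-1} = b_{k-1},
# -alpha*a_k = b_k, from bottom to top.  Illustrative data: alpha = 2, b(x) = x^2.
alpha = 2.0
b = [0.0, 0.0, 1.0]
k = len(b) - 1

a = [0.0] * (k + 1)
a[k] = -b[k] / alpha                              # last equation
for i in range(k - 1, -1, -1):
    a[i] = ((i + 1) * a[i + 1] - b[i]) / alpha    # (i+1) a_{i+1} - alpha a_i = b_i

# Check the system coefficientwise.
lhs = [((i + 1) * a[i + 1] if i < k else 0.0) - alpha * a[i] for i in range(k + 1)]
print(a)    # the polynomial coefficients of the particular solution
print(lhs)  # equals b
```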
• b : x ↦ eβx Pk(x) where β ∈ R⋆ and Pk is a degree k polynomial. In this case, we look for a solution of the form eβx z(x) for an unknown function z. This function is a solution to
y′ − αy = eβx Pk(x)
if and only if, for all x ∈ R,
eβx(βz(x) + z′(x) − αz(x)) = eβx Pk(x) ⇐⇒ (β − α)z(x) + z′(x) = Pk(x).
If β = α, take for z any primitive of Pk; otherwise, we are back to the previous case.
Exercise 3.28 solution page 144
Solve y′ = −2y + x2e−x.
• b : x ↦ A cos(βx) + B sin(βx) where A, B ∈ R, β ∈ R⋆. In this case, we look for a solution of the form C cos(βx) + D sin(βx). This function is a solution to
y′ − αy = A cos(βx) + B sin(βx)
if and only if, for all x ∈ R,
(βD − αC) cos(βx) + (−βC − αD) sin(βx) = A cos(βx) + B sin(βx),
that is,
{ βD − αC = A
{ −βC − αD = B
⇐⇒
{ C = −(αA + βB)/(α² + β²)
{ D = (βA − αB)/(α² + β²).
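These formulas can be verified directly. The sketch below uses the illustrative values α = −1, β = 2, A = 0, B = 1, which correspond to the equation of Exercise 3.29.

```python
import math

# Particular solution C cos(bx) + D sin(bx) of y' - a y = A cos(bx) + B sin(bx).
alpha, beta, A, B = -1.0, 2.0, 0.0, 1.0   # i.e. y' + y = sin(2x)
denom = alpha ** 2 + beta ** 2
C = -(alpha * A + beta * B) / denom
D = (beta * A - alpha * B) / denom

# The residual y' - alpha*y - (A cos + B sin) should vanish identically.
def residual(x):
    y = C * math.cos(beta * x) + D * math.sin(beta * x)
    dy = -C * beta * math.sin(beta * x) + D * beta * math.cos(beta * x)
    return dy - alpha * y - (A * math.cos(beta * x) + B * math.sin(beta * x))

print([residual(x) for x in (0.0, 0.5, 1.3)])  # all ≈ 0
```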
Exercise 3.29 solution page 145
Solve y′ + y = sin(2x).
3.2.3 Solution to the nonhomogeneous equation
Summing up, we obtain the following crucial theorem.
Theorem 3.30 (Cauchy–Lipschitz for first order linear equations)
Let a : I → R and b : I → R be two continuous functions, x0 ∈ I, and A : x ∈ I ↦ ∫_{x0}^{x} a(u) du the primitive of a that vanishes at x0. The solutions on I to
y′ = a(x)y + b(x)  (E1)
are the functions
x ∈ I ↦ (c + ∫_{x0}^{x} b(u)e−A(u) du) eA(x), for any constant c ∈ R.
Moreover, for each y0 ∈ R, there is a unique solution with initial condition y(x0) = y0: this solution is
x ∈ I ↦ (y0 + ∫_{x0}^{x} b(u)e−A(u) du) eA(x).
Example 3.31
Let us find the solution of y′ + y = ex + 1 satisfying y(1) = 2. We saw in Example 3.27 that there exists c ∈ R such that
y : x ∈ R ↦ (1/2)ex + 1 + c e−x.
We need to find the (unique) value of c for which the initial condition y(1) = 2 is satisfied:
y(1) = 2 ⇐⇒ (1/2)e¹ + 1 + c e−1 = 2 ⇐⇒ c = e − e²/2.
The desired solution is thus y : x ∈ R ↦ (1/2)ex + 1 + (e − e²/2) e−x.
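The value of c can be double-checked with two lines of arithmetic:

```python
import math

# Example 3.31: y(x) = (1/2)e^x + 1 + c e^{-x} with y(1) = 2 forces
# c = (2 - (e/2 + 1)) * e, which simplifies to e - e^2/2.
c = (2 - (math.e / 2 + 1)) * math.e
print(c, math.e - math.e ** 2 / 2)  # the two expressions agree

y1 = math.e / 2 + 1 + c * math.exp(-1)
print(y1)  # ≈ 2, the initial condition is satisfied
```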
Integral curves
Recall that the graph of a function f : I → R is the set of points of the plane {(x, f(x)) : x ∈ I}. An integral curve of an ODE is the graph of a solution to this ODE. Theorem 3.30 ensures the following.
Corollary 3.32 (integral curves)
Let us consider (E1) on an interval I. For each point (x0, y0) ∈ I × R, there is a unique integral curve of (E1) that contains (x0, y0). This furthermore implies that two different integral curves never intersect!
Example 3.33
The solutions to y′ + y = x are
y : x ∈ R ↦ x − 1 + c e−x, c ∈ R.
For each point (x0, y0) ∈ R², there is a unique integral curve containing (x0, y0).
[Figure: the integral curves x ↦ x − 1 + c e−x, with a marked point (x0, y0) on one of them.]
Exercise 3.34 solution page 145
Solve y′ + y ln 2 = 0. Plot the integral curves and highlight the one corresponding to y(1) = 1/2.
Warning 3.35 (Beware of patched solutions!)
Beware that Theorem 3.30 and Corollary 3.32 only hold for first order linear equations in the form (E1). For an equation
a0(x)y + a1(x)y′ = b(x),
these only hold on each interval where a1 does not vanish! It is wrong otherwise! For instance, the integral curves on R of Example 3.23 all intersect at (0, 0). It is also wrong to say that for each (x0, y0) ∈ R⋆+ × R, there is a unique solution y : R → R to (3.2) such that y(x0) = y0. What is right to say is that, for each (x0, y0) ∈ R⋆+ × R, there is a unique solution y : R⋆+ → R to (3.2) such that y(x0) = y0 as, on the interval R⋆+, the theorem holds! But such a solution is not maximal. . .
3.3 Systems of linear ODEs
In this section, we are interested in solving systems of linear ODEs with constant coefficients, as for instance
{ y′1 = y1 + 3y2 + b1(t)
{ y′2 = −4y1 + 7y2 + b2(t).
(In the context of this section, it is more usual to use t as the variable.) Here, we have two unknown functions y1 and y2 that evolve together. In general, we cannot solve these equations separately; we need to solve them at the same time. Setting
X := [ y1 ],   A := [  1  3 ]   and   B := [ b1 ],
     [ y2 ]         [ −4  7 ]              [ b2 ]
we can rewrite our system as X′ = AX + B(t). In this notation, A is a fixed square matrix, B is a vector-valued function and X is an unknown vector-valued differentiable function.
Definition 3.36
• We say that a vector-valued or a matrix-valued function is continuous (resp. differentiable) if all its coordinates are continuous (resp. differentiable).
• The limit (resp. derivative) of a continuous (resp. differentiable) vector-valued or matrix-valued function is the vector or matrix whose coordinates are the limits (resp. derivatives) of its coordinates.
Note that we did not define continuity and differentiability as in the previous chapter (Definition 2.4). That earlier definition is actually more accurate¹ but is equivalent to Definition 3.36 in this context (recall Proposition 2.5).
In this section, we will see how to solve such an equation: more precisely, we consider the equation
X ′ = AX +B(t) , (ES)
where A ∈ Mn(R) is an n × n real matrix, B : I → Rn is a continuous n-dimensional vector-valued function defined on some interval I ⊆ R, and X : I → Rn is an unknown differentiable n-dimensional vector-valued function.
The propositions of Section 3.1.4 (Propositions 3.13, 3.14, and 3.15) have the following straightforward analog in the present setting.
¹In mathematics, the notion of continuity is crucial and can be defined in a much broader context. This is way beyond the scope of this course.
Proposition 3.37
The following holds.
(i) The set SHS of solutions to the homogeneous system of linear ODEs
X ′ = AX (HS)
is a real vector space.
(ii) Let XES be a solution to (ES). The set of solutions to (ES) is {XES + X : X ∈ SHS}.
(iii) If, for 1 ≤ i ≤ k, we have X′i = AXi + Bi(t), then (X1 + · · · + Xk)′ = A(X1 + · · · + Xk) + B1(t) + · · · + Bk(t) (superposition of solutions).
3.3.1 Preliminaries: matrix exponential
In light of Corollary 3.20, it is tempting to say that the solutions to the homogeneous equation
X ′ = AX (HS)
are of the form etA, up to constants. Of course, this involves taking the exponential of a matrix, which has not been defined yet. . . We will now see that this can indeed be properly done!
First, let us introduce some terminology. For 1 ≤ i ≤ m, 1 ≤ j ≤ n, we denote by [A]ij the entry of index (i, j) of a matrix A ∈ Mmn(C). Furthermore, for complex numbers aij, 1 ≤ i ≤ m, 1 ≤ j ≤ n, we denote by [aij]1≤i≤m,1≤j≤n the m × n matrix whose entry of index (i, j) is aij. Finally, for a matrix denoted by an upper case letter, we will often use the same letter in lower case to denote its entries: for instance, we will set aij := [A]ij.
Recall that we may define the usual exponential as the sum of the following convergent series:
exp(x) := 1 + x + x²/2! + x³/3! + · · · = ∑_{k=0}^{∞} x^k / k!.
Proposition 3.38
For any matrix A ∈ Mn(C), each entry of the matrix
In + A + (1/2!)A² + (1/3!)A³ + · · · + (1/r!)A^r = ∑_{k=0}^{r} (1/k!) Ak
converges absolutely as the integer r → ∞. (In denotes the identity matrix of Mn(C) and we use the usual convention that A⁰ = In.)
Proof. Let A = [apq]1≤p,q≤n ∈ Mn(C) and 1 ≤ i, j ≤ n. First notice that, for each r ∈ N,
[∑_{k=0}^{r} (1/k!) Ak]ij = ∑_{k=0}^{r} (1/k!) [Ak]ij = ∑_{k=0}^{r} (1/k!) ∑_{1≤p1,...,pk−1≤n} aip1 ap1p2 · · · apk−2pk−1 apk−1j
is indeed the partial sum of a series. This series is absolutely convergent as claimed because, denoting max(A) := max_{1≤p,q≤n} |apq|, we have
∑_{k=0}^{r} |(1/k!) ∑_{1≤p1,...,pk−1≤n} aip1 ap1p2 · · · apk−1j| ≤ ∑_{k=0}^{r} n^{k−1} max(A)^k / k! ≤ e^{n max(A)} / n < ∞.
This proposition legitimizes the following definition.
Definition 3.39 (matrix exponential)
For any matrix A ∈ Mn(C), the exponential of A is defined as
exp(A) := In + A + (1/2!)A² + (1/3!)A³ + · · · = ∑_{k=0}^{∞} (1/k!) Ak.
We also use the shorthand notation eA.
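The defining series is easy to evaluate numerically. Below is a minimal truncated-series sketch (in practice one would use a library routine such as scipy.linalg.expm), checked here on a diagonal matrix.

```python
import numpy as np

def expm_series(A, terms=30):
    """Truncated series In + A + A^2/2! + ... (a sketch, not production code)."""
    S = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k      # now term == A^k / k!
        S = S + term
    return S

# On diag(1, -2), the exponential should be diag(e, e^{-2}).
D = np.diag([1.0, -2.0])
print(expm_series(D))
```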
Clearly, if A is a 1×1 matrix [δ], then eA = [eδ]. More generally, the exponential of a diagonal matrix can be simply expressed:
Proposition 3.40 (exponential of a diagonal matrix)
For any δ1, . . . , δn ∈ C,
exp(diag(δ1, . . . , δn)) = diag(eδ1, . . . , eδn),
where diag(δ1, . . . , δn) denotes the diagonal matrix with diagonal entries δ1, . . . , δn.
In particular, denoting by 0n ∈ Mn(C) the zero matrix, we have e0n = In.
Proof. As, for each k ≥ 0, diag(δ1, . . . , δn)^k = diag(δ1^k, . . . , δn^k), this directly comes from the definition.
The situation is way more complicated in general! There is however a simple way to express the exponential of a diagonalizable matrix, thanks to the following proposition.
Proposition 3.41 (exponential of similar matrices)
Let A and B be two similar matrices of Mn(C): recall that this means that there exists an invertible matrix P of Mn(C) such that B = PAP−1. We have
eB = e^{PAP−1} = PeAP−1.
Proof. This comes from the fact that, for any k ≥ 0,
Bk = PAP−1 × PAP−1 × · · · × PAP−1 (k times) = PAkP−1,
which yields
∑_{k=0}^{r} (1/k!) Bk = P (∑_{k=0}^{r} (1/k!) Ak) P−1,
and then eB = PeAP−1 by taking r → ∞. (We admit that matrix multiplication is “continuous” in some sense: roughly speaking, the entries of a matrix product are simply linear combinations of the entries of the matrices.)
Corollary 3.42 (exponential of a diagonalizable matrix)
Let A ∈ Mn(C) be diagonalizable: we may write it as A = P∆P−1 for some invertible matrix P and some diagonal matrix ∆. Then
eA = Pe∆P−1.
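Numerically, Corollary 3.42 can be checked on the matrix of Exercise 3.43 (used here only as an illustration): compute eA as Pe∆P−1 via numpy's eigendecomposition and compare with the truncated defining series.

```python
import numpy as np

A = np.array([[1.0, 1.0], [-2.0, 4.0]])   # the matrix of Exercise 3.43

# e^A = P e^D P^{-1}, where D is the diagonal of eigenvalues (here 2 and 3).
delta, P = np.linalg.eig(A)
expA = P @ np.diag(np.exp(delta)) @ np.linalg.inv(P)

# Compare with the truncated series of Definition 3.39.
S, term = np.eye(2), np.eye(2)
for k in range(1, 40):
    term = term @ A / k
    S = S + term

print(np.max(np.abs(expA - S)))  # ≈ 0: both computations agree
```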
Exercise 3.43 solution page 145
Let
A = [  1  1 ]
    [ −2  4 ].
Check that
A = [ 1 1 ] [ 2 0 ] [  2 −1 ]
    [ 1 2 ] [ 0 3 ] [ −1  1 ]
and deduce etA.
Warning 3.44 (eA+B ≠ eAeB in general)
Beware that, in contrast to the fundamental identity ex+y = exey when x and y are real (or complex)numbers, we do not have, in general, eA+B = eAeB when A and B are matrices!
Exercise 3.45 solution page 146
Let
A := [ 1 0 ]   and   B := [ 0 1 ].
     [ 0 0 ]              [ 0 0 ]
Show that eA+B ≠ eAeB.
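A quick numerical illustration of Warning 3.44, with the matrices of Exercise 3.45 (this does not replace the symbolic computation asked for in the exercise):

```python
import numpy as np

def expm_series(M, terms=30):
    # Truncated defining series of the matrix exponential.
    S, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        S = S + term
    return S

A = np.array([[1.0, 0.0], [0.0, 0.0]])
B = np.array([[0.0, 1.0], [0.0, 0.0]])

lhs = expm_series(A + B)               # upper-right entry is e - 1
rhs = expm_series(A) @ expm_series(B)  # upper-right entry is e
print(lhs[0, 1], rhs[0, 1])            # different!
```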
This identity however holds when A and B commute.
Proposition 3.46
Let A, B ∈ Mn(C) be such that AB = BA. We have
eA+B = eAeB .
Proof. Recall that, if a and b are two elements of any ring,
(a + b)k = ∑_{(u1,...,uk)∈{0,1}^k} a^{u1} b^{1−u1} a^{u2} b^{1−u2} · · · a^{uk} b^{1−uk}
(in the i-th factor (a + b), one selects a if ui = 1 or b if ui = 0). If a and b commute, the factor corresponding to (u1, . . . , uk) can be simplified as a^{Σui} b^{k−Σui} and we obtain the so-called Newton's binomial expansion
(a + b)k = ∑_{i+j=k} (k!/(i!j!)) ai bj.
As Mn(C) is not commutative, this expression does not hold in general for matrices! Here, however, we can use it for A and B as we assumed that they commute. By definition,
eA+B = ∑_{k=0}^{∞} (A + B)k / k!
= ∑_{k=0}^{∞} (1/k!) ∑_{i+j=k} (k!/(i!j!)) Ai Bj
= ∑_{k=0}^{∞} ∑_{i+j=k} (Ai/i!)(Bj/j!)
= (∑_{i=0}^{∞} Ai/i!)(∑_{j=0}^{∞} Bj/j!)
= eA eB.
The second equality is Newton's binomial expansion and the fourth equality is obtained by observing that summing over i + j = k for k ∈ {0, 1, 2, . . .} amounts to summing (along descending diagonal lines) over (i, j) ∈ {0, 1, 2, . . .}².
As a byproduct, we obtain the following fundamental property.
Corollary 3.47 (inverse of an exponential)
Let A ∈ Mn(C). Then eA is invertible and its inverse is e−A.
Proof. By Propositions 3.46 (A and −A commute) and 3.40, we have
eAe−A = eA−A = e0n = In.
3.3.2 Solution to the homogeneous equation
We may now turn to the central theorem of this section, which will allow us to solve (HS).
Theorem 3.48 (derivative of t ↦ etA)
Let A ∈ Mn(C). The function t ∈ R ↦ etA is differentiable (thus continuous) and its derivative is t ∈ R ↦ AetA.
Proof. Let t ∈ R and h ≠ 0. As tA commutes with hA, by Proposition 3.46,
(e(t+h)A − etA)/h = (etA ehA − etA)/h = ((ehA − In)/h) etA.
By Definition 3.39,
ehA − In − hA = ∑_{k=2}^{∞} (1/k!) (hA)k,
so that, using the same technique as in the proof of Proposition 3.38, for any 1 ≤ i, j ≤ n, we have
|[(ehA − In)/h − A]ij| ≤ (1/|h|) ∑_{k=2}^{∞} (1/k!) |[(hA)k]ij|
≤ (1/|h|) ∑_{k=2}^{∞} n^{k−1} max(hA)^k / k!
= (1/(n|h|)) ∑_{k=2}^{∞} (n|h| max(A))^k / k!
= (1/(n|h|)) (e^{n|h| max(A)} − 1 − n|h| max(A))
= (1/(n|h|)) (1 + n|h| max(A) + O((n|h| max(A))²) − 1 − n|h| max(A))
= O(|h|) → 0
as h → 0. (In the second equality, we used the expansion of the exponential of the real number n|h| max(A).) Summing up,
∀i, j, [(ehA − In)/h − A]ij → 0, so that (ehA − In)/h → A, and finally (e(t+h)A − etA)/h → AetA.
Theorem 3.49 (solutions to (HS))
Let A ∈ Mn(R). The solutions to
X′ = AX  (HS)
are the functions
t ∈ R ↦ etA C
for any constant vector C = (c1, c2, . . . , cn) ∈ Rn. Moreover, for any vector X0 ∈ Rn, there is a unique solution to (HS) with initial condition X(t0) = X0: it is t ∈ R ↦ e(t−t0)A X0.
Proof. We already know from Theorem 3.48 that, for any vector C ∈ Rn, the function t ∈ R ↦ etAC is a solution to (HS). Conversely, let X be a solution to (HS) and set Y : t ∈ R ↦ e−tAX ∈ Rn. We claim that, if U and V are matrix-valued functions, the product rule for the derivative of UV works as for real-valued functions. Indeed, with transparent notation,
[(UV)′]ij = ([UV]ij)′ = (∑k uik vkj)′ = ∑k (uik vkj)′ = ∑k (u′ik vkj + uik v′kj) = ∑k u′ik vkj + ∑k uik v′kj = [U′V]ij + [UV′]ij.
Using it, together with Theorem 3.48, we obtain that
Y′(t) = −Ae−tAX + e−tAX′ = −Ae−tAX + e−tAAX = 0n1,
where 0n1 denotes the zero n-dimensional vector. In the last equality, we used the fact that A and etA commute, which is easy to see from Definition 3.39. As a result, Y is equal to some constant vector C ∈ Rn. By Corollary 3.47, the inverse of e−tA is etA, so that
X = etAe−tAX = etAY = etAC.
The second statement comes from the fact that X(t0) = et0AC, so that C = e−t0AX(t0).
Corollary 3.50
The set of solutions to (HS) is the n-dimensional vector space with basis {t ↦ etA ei : 1 ≤ i ≤ n}, where {ei : 1 ≤ i ≤ n} is any basis of Rn.
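As a numerical sanity check of Theorem 3.49 (with an arbitrary illustrative matrix, not one from the text), one can compare X(t) = etA X0 with a crude Euler integration of X′ = AX:

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, 0.0]])   # illustrative example matrix
X0 = np.array([1.0, 0.0])
t = 1.0

# e^{tA} by the truncated series of Definition 3.39.
S, term = np.eye(2), np.eye(2)
for k in range(1, 40):
    term = term @ (t * A) / k
    S = S + term
X_exact = S @ X0          # the solution at time t

# Independent check: Euler integration of X' = AX from X(0) = X0.
X, n = X0.copy(), 20_000
h = t / n
for _ in range(n):
    X = X + h * (A @ X)
print(np.max(np.abs(X - X_exact)))  # small
```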
Exercise 3.51 solution page 146
Recalling Exercise 3.43, solve
{ y′1 = y1 + y2
{ y′2 = −2y1 + 4y2.
3.3.3 Solution to the nonhomogeneous equation
As in Section 3.2, we solve (ES) by using variation of constants.
Theorem 3.52 (solutions to (ES))
Let A ∈ Mn(R), B : I → Rn be a continuous function defined on some interval I ⊆ R, and t0 ∈ I. The solutions to
X′ = AX + B(t)  (ES)
are the functions
t ∈ I ↦ etA (C0 + ∫_{t0}^{t} e−uA B(u) du)
for any constant vector C0 ∈ Rn. Moreover, for any vector X0 ∈ Rn, there is a unique solution to (ES) with initial condition X(t0) = X0: it is
t ∈ I ↦ e(t−t0)A X0 + etA ∫_{t0}^{t} e−uA B(u) du.
Proof. We look for solutions of (ES) in the form X = etAC(t) with C : I → Rn differentiable. By the product rule,
C′ = (e−tAX)′ = −Ae−tAX + e−tAX′ = −Ae−tAX + e−tA(AX + B(t)) = e−tAB(t),
so that
C(t) = C(t0) + ∫_{t0}^{t} e−uA B(u) du.
The first statement follows. The second statement is obtained by noticing that X(t0) = et0AC(t0).
3.3.4 Method for solving a system of linear ODEs in practice
In theory, Theorem 3.49, Corollary 3.50 and Theorem 3.52 tell us a lot of interesting things. But, in practice (on a concrete example), the formulas are not always helpful (mostly because computing a matrix exponential is not an easy task in general). Instead, we use the following method. The first step is to put A in the nicest form possible.
Diagonalizable matrix
Let us first recall some matrix algebra. The matrix A ∈ Mn(R) is called diagonalizable if there exist an invertible matrix P ∈ Mn(R) and a diagonal matrix² ∆ ∈ Mn(R) such that A = P∆P−1. Beware that not all matrices are diagonalizable. When diagonalizing a matrix A ∈ Mn(R), one searches for n real numbers δ1, . . . , δn ∈ R (not necessarily distinct) and n linearly independent vectors v1, . . . , vn ∈ Rn such that Avi = δivi for 1 ≤ i ≤ n. Such a number δi is called an eigenvalue and such a vector vi is called an eigenvector. We refer you to your algebra course for methods of finding these eigenvalues and eigenvectors.
Once we have found the eigenvalues δ1, . . . , δn and eigenvectors v1, . . . , vn, we have the following. As v1, . . . , vn are n linearly independent vectors in an n-dimensional vector space, they form a basis of Rn. The n × n matrix P whose columns are v1, . . . , vn allows to pass from the canonical basis (e1, . . . , en) to the eigenbasis (v1, . . . , vn): Pei = vi. Conversely, its inverse allows to go the other way: P−1vi = ei. The point of the eigenbasis is that, in this basis, the mapping v ↦ Av is represented by the diagonal matrix
∆ := diag(δ1, . . . , δn)
since, precisely, Avi = δivi. We thus have A = P∆P−1.
Back to our system of ODEs, we set Y = P−1X . Equation (HS) is equivalent to
X ′ = AX ⇐⇒ X ′ = P∆P−1X ⇐⇒ P−1X ′ = ∆P−1X ⇐⇒ Y ′ = ∆Y .
From Theorem 3.49, we know that the solutions to this system are given by
Y(t) = et∆ C = (c1 eδ1t, . . . , cn eδnt), C = (c1, . . . , cn) ∈ Rn.
The solutions to (HS) are thus given by
X(t) = PY(t) = c1 eδ1t v1 + · · · + cn eδnt vn.  (3.5)
²A matrix M is called diagonal if [M]ij = 0 whenever i ≠ j.
In order to solve (ES), we need one particular solution (Proposition 3.37). To this aim, we use the variation of constants and look for a solution Z : t ∈ R ↦ c1(t)eδ1t v1 + · · · + cn(t)eδnt vn, where c1, . . . , cn are n differentiable functions. In other words, Z(t) = Pet∆C(t) where C : R → Rn is a differentiable vector-valued function. We have
Z′ = AZ + B(t) ⇐⇒ Pet∆C′(t) + P∆et∆C(t) = P∆P−1Pet∆C(t) + B(t)
⇐⇒ Pet∆C′(t) = B(t)
⇐⇒ c′1(t)eδ1t v1 + · · · + c′n(t)eδnt vn = B(t).
We obtain c′1, . . . , c′n by solving this system of n equations and find c1, . . . , cn by integration.
Remark 3.53
Solving the previous system amounts to finding C′(t) = e−t∆P−1B(t). But, when doing so, we do not necessarily need to explicitly compute P−1 (for instance, if B ≡ 0, we have nothing to do).
What if we had computed etA instead? Alternatively, we could have computed etA directly thanks to Corollary 3.42. If we had done so, we would have found for the solutions to (HS) the functions
t ∈ R ↦ etAC = e^{Pt∆P−1}C = Pet∆P−1C
for any vector C ∈ Rn. In comparison with (3.5), where we had t ∈ R ↦ Pet∆C, the difference is that the constants are not chosen in the same way. Of course, the result is exactly the same, as P−1 is a bijection from Rn to Rn, so this is just renaming the constants. The variation of constants can then be used: we find, as in Theorem 3.52, C′(t) = e−tAB(t). We can directly integrate this equation as we have already computed e−tA.
This is a perfectly valid method but, in general, the previous one is faster: the computations are easier and P−1 may not need to be fully computed.
Example 3.54
Let us solve the following system of ODEs:
{ x′ = 4x + 16t
{ y′ = y + 2z
{ z′ = 2y + z + 2e4t.
• Diagonalizing the matrix. The matrix associated with the system is
A = [ 4 0 0 ]
    [ 0 1 2 ]
    [ 0 2 1 ].
Let us see if it happens to be diagonalizable. To this aim, we look for eigenvalues δ ∈ R and eigenvectors v = (a, b, c) ≠ (0, 0, 0). We have
Av = δv ⇐⇒ { 4a = δa
            { b + 2c = δb
            { 2b + c = δc.
We immediately see the eigenvalue δ1 := 4 corresponding to the eigenvector v1 := (1, 0, 0). Otherwise,
{ b + 2c = δb   ⇐⇒   { 3(b + c) = δ(b + c)
{ 2b + c = δc        { b − c = δ(c − b).
We find δ2 := 3 and δ3 := −1 respectively corresponding, for instance, to v2 := (0, 1, 1) and v3 := (0, 1, −1). All in all, A = P∆P−1, where
P := [ 1 0  0 ]         ∆ := [ 4 0  0 ]
     [ 0 1  1 ]   and        [ 0 3  0 ]
     [ 0 1 −1 ]              [ 0 0 −1 ].
• Solving the homogeneous system. The homogeneous system is X′ = AX = P∆P−1X, which is equivalent to P−1X′ = ∆P−1X, that is, Y′ = ∆Y, where we set Y := P−1X. The solutions to this system are the functions
Y : t ∈ R ↦ et∆ (c1, c2, c3) = (c1 e4t, c2 e3t, c3 e−t), c1, c2, c3 ∈ R,
so that
X : t ∈ R ↦ PY(t) = c1 (e4t, 0, 0) + c2 (0, e3t, e3t) + c3 (0, e−t, −e−t), c1, c2, c3 ∈ R.
• Finding a particular solution. In order to solve the original system of ODEs, we look for a particular solution thanks to the variation of constants:
Z : t ∈ R ↦ c1(t) (e4t, 0, 0) + c2(t) (0, e3t, e3t) + c3(t) (0, e−t, −e−t)
is a solution if and only if
Z′ = AZ + (16t, 0, 2e4t) ⇐⇒ c′1(t)(e4t, 0, 0) + c′2(t)(0, e3t, e3t) + c′3(t)(0, e−t, −e−t) = (16t, 0, 2e4t)
⇐⇒ { c′1(t)e4t = 16t
    { c′2(t)e3t + c′3(t)e−t = 0
    { c′2(t)e3t − c′3(t)e−t = 2e4t
⇐⇒ { c′1(t)e4t = 16t
    { 2c′2(t)e3t = 2e4t
    { 2c′3(t)e−t = −2e4t
⇐⇒ { c′1(t) = 16t e−4t
    { c′2(t) = et
    { c′3(t) = −e5t.
We can take for instance c1 : t ∈ R ↦ −(1 + 4t)e−4t, c2 : t ∈ R ↦ et, and c3 : t ∈ R ↦ −(1/5)e5t. This gives the particular solution
Z : t ∈ R ↦ −(1 + 4t)e−4t (e4t, 0, 0) + et (0, e3t, e3t) − (1/5)e5t (0, e−t, −e−t) = (−1 − 4t, (4/5)e4t, (6/5)e4t).
• Solving the nonhomogeneous system. The solutions to our system are thus
X : t ∈ R ↦ (−1 − 4t, (4/5)e4t, (6/5)e4t) + c1 (e4t, 0, 0) + c2 (0, e3t, e3t) + c3 (0, e−t, −e−t), c1, c2, c3 ∈ R.
In other words, the solutions are the functions
x : t ∈ R ↦ −1 − 4t + c1e4t
y : t ∈ R ↦ (4/5)e4t + c2e3t + c3e−t
z : t ∈ R ↦ (6/5)e4t + c2e3t − c3e−t,
for any c1, c2, c3 ∈ R.
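The solution found in Example 3.54 can be checked numerically; the sketch below takes the constants equal to 0 and compares finite-difference derivatives with the right-hand side of the system.

```python
import numpy as np

# Particular solution of Example 3.54 (constants c1 = c2 = c3 = 0):
# x(t) = -1 - 4t, y(t) = (4/5)e^{4t}, z(t) = (6/5)e^{4t}.
def sol(t):
    return np.array([-1.0 - 4.0 * t, 0.8 * np.exp(4 * t), 1.2 * np.exp(4 * t)])

def rhs(t, v):
    x, y, z = v
    return np.array([4 * x + 16 * t, y + 2 * z, 2 * y + z + 2 * np.exp(4 * t)])

# Compare central-difference derivatives with the right-hand side.
h = 1e-6
for t in (0.0, 0.3, 1.0):
    d = (sol(t + h) - sol(t - h)) / (2 * h)
    print(np.max(np.abs(d - rhs(t, sol(t)))))  # ≈ 0
```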
Trigonalizable matrix
A matrix A ∈ Mn(R) is called trigonalizable if there exist an invertible matrix P ∈ Mn(R) and a triangular matrix³ T ∈ Mn(R) such that A = PTP−1. Beware that not all matrices are trigonalizable. Let us suppose now that A is trigonalizable: we write it A = PTP−1 with a triangular matrix T.
We cannot use exactly the same method as eT is not easily computable. . . Although it is tempting to write T as the sum of a diagonal matrix and a nilpotent one, this does not prove conclusive as, in general, these two matrices do not commute. For instance, if we write
[ 1 1 ] = [ 1 0 ] + [ 0 1 ],
[ 0 2 ]   [ 0 2 ]   [ 0 0 ]
notice that
[ 1 0 ] [ 0 1 ] = [ 0 1 ]   whereas   [ 0 1 ] [ 1 0 ] = [ 0 2 ].
[ 0 2 ] [ 0 0 ]   [ 0 0 ]             [ 0 0 ] [ 0 2 ]   [ 0 0 ]
Instead, we proceed as follows. As above, we set Y = P−1X. Equation (ES) is equivalent to
X′ = AX + B(t) ⇐⇒ X′ = PTP−1X + B(t)
⇐⇒ P−1X′ = TP−1X + P−1B(t)
⇐⇒ Y′ = TY + P−1B(t).
Using the notation (we assume here that T is lower triangular)
Y = [ y1 ],   T = [ t11   0   . . .   0  ]   and   P−1B = [ f1 ],
    [ ⋮  ]        [  ⋮    ⋱     ⋱    ⋮  ]                 [ ⋮  ]
    [ yn ]        [ tn1  . . . . .  tnn ]                 [ fn ]
our system becomes
y′1 = t11 y1 + f1(t)
y′2 = t22 y2 + (t21 y1(t) + f2(t))
⋮
y′n = tnn yn + (tn1 y1(t) + · · · + tn(n−1) yn−1(t) + fn(t)).
³A matrix M is called upper triangular if [M]ij = 0 whenever i > j; lower triangular if [M]ij = 0 whenever i < j; and triangular if it is upper triangular or lower triangular.
This system can be solved line by line, from top to bottom. Indeed, the first line is a first order linear ODE. Once it has been solved, the second line is also a first order linear ODE, the term (t21 y1(t) + f2(t)) now being a known function. And so on and so forth.
Remark 3.55
We proceed similarly with an upper triangular matrix instead of a lower triangular one; the system is then solved from bottom to top.
Exercise 3.56 solution page 146
Let
A = [  6   3 ],   T = [ 1 0 ]   and   P = [  2   1 ].
    [ −5  −2 ]        [ 1 3 ]             [ −3  −1 ]
(i) Show that P−1 = [ −1 −1 ]
                    [  3  2 ].
(ii) Show that A = PTP−1.
(iii) Solve X′ = AX + B(t), where B(t) = (0, t²).
3.4 Linear differential equations with constant coefficients
In this section, we consider differential equations of order higher than one but only with constant coefficients, that is, equations of the form
\[
y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1 y' + a_0 y = b(t), \tag{E_n}
\]
where a_0, a_1, …, a_{n−1} ∈ R and b : I → R is a continuous function. This is a linear system of ODEs, as it can be written as
\[
\begin{bmatrix} y \\ y' \\ \vdots \\ y^{(n-1)} \end{bmatrix}'
= \begin{bmatrix}
0 & 1 & 0 & \cdots & 0\\
\vdots & \ddots & \ddots & \ddots & \vdots\\
\vdots & & \ddots & \ddots & 0\\
0 & \cdots & \cdots & 0 & 1\\
-a_0 & -a_1 & -a_2 & \cdots & -a_{n-1}
\end{bmatrix}
\begin{bmatrix} y \\ y' \\ \vdots \\ y^{(n-1)} \end{bmatrix}
+ \begin{bmatrix} 0 \\ \vdots \\ 0 \\ b(t) \end{bmatrix}. \tag{SE_n}
\]
3.4.1 Homogeneous equation
From what we have done before, we know the following:
Proposition 3.57 (structure of the set of solutions)
The solutions to (En) are obtained by adding a particular solution to (En) and a solution to the associated homogeneous equation
\[
y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1 y' + a_0 y = 0. \tag{H_n}
\]
Moreover, the set of solutions to (Hn) is an n-dimensional real vector space (Corollary 3.50).
In theory, we know the solutions to (Hn) from Theorem 3.49 but, in practice, this theorem is hard to use as it requires computing a matrix exponential. In fact, all we need is to find n linearly independent solutions!
The intuition we gathered in the previous sections tells us to look for solutions to (Hn) of the form t ∈ R ↦ e^{rt} for some r ∈ C to be determined. Such a function is a solution if and only if
\begin{align*}
r^n e^{rt} + a_{n-1}r^{n-1}e^{rt} + \cdots + a_1 r e^{rt} + a_0 e^{rt} = 0
&\iff r^n + a_{n-1}r^{n-1} + \cdots + a_1 r + a_0 = 0\\
&\iff P(r) = 0,
\end{align*}
where P is defined as follows.
Definition 3.58 (characteristic polynomial)
The polynomial
\[
P := X^n + a_{n-1}X^{n-1} + \cdots + a_1 X + a_0 \in \mathbb{R}[X]
\]
is called the characteristic polynomial associated with (Hn) (or with (En)).
Let us write P in its factored form \(P = \prod_{j=1}^{d}(X - r_j)^{n_j}\) with distinct complex r_j's and \(\sum_{j=1}^{d} n_j = n\).
If all the roots of P are simple (that is, n_j = 1 for all j), we have found the n solutions t ∈ R ↦ e^{r_j t}, 1 ≤ j ≤ n. If P possesses multiple roots, however, we need to find more solutions.
We denote by D the differential operator, that is, the function D : f 7→ f ′ mapping a differen-tiable function to its derivative. If Q =
∑qi=0 aiX
i is a polynomial, we denote by Q(D) the functionQ(D) : f 7→ ∑q
i=0 aif(i). We have the following crucial property: if Q and R are polynomials, then
Q(D) ◦ R(D) = QR(D). In other words, for any function f differentiable enough, Q(D)(R(D)(f)
)=
(QR)(D)(f). Indeed, if Q =∑q
i=0 aiXi and R =
∑rj=0 bjX
j , we have
Q(D)(R(D)(f)
)= Q(D)
( r∑
j=0
bjf(j)
)
=
q∑
i=0
ai
( r∑
j=0
bjf(j)
)(i)
=
q∑
i=0
r∑
j=0
aibjf(i+j) = (QR)(D)(f) .
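This identity can be sanity-checked on polynomial test functions, for which all derivatives are exact; a small sketch (the particular Q, R and f below are arbitrary choices of ours):

```python
import numpy as np
from numpy.polynomial import Polynomial

def apply_op(Q, f):
    # Q(D)(f) = sum_i a_i f^{(i)}, where D is differentiation.
    total = 0.0 * f
    for i, a in enumerate(Q.coef):
        total = total + a * (f.deriv(i) if i > 0 else f)
    return total

Q = Polynomial([1.0, 2.0, 3.0])        # 1 + 2X + 3X^2
R = Polynomial([0.0, 1.0, 0.0, 4.0])   # X + 4X^3
f = Polynomial([2.0, 0.0, 1.0, 5.0, -3.0])

lhs = apply_op(Q, apply_op(R, f))      # Q(D)(R(D)(f))
rhs = apply_op(Q * R, f)               # (QR)(D)(f)
print(np.allclose((lhs - rhs).coef, 0.0))
```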
With this notation, (Hn) can be rewritten as
\begin{align*}
y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1 y' + a_0 y = 0
&\iff P(D)(y) = 0\\
&\iff \Bigl(\prod_{j'=1}^{d} (D - r_{j'})^{n_{j'}}\Bigr)(y) = 0\\
&\iff \Bigl(\prod_{j' \ne j} (D - r_{j'})^{n_{j'}}\Bigr)\Bigl((D - r_j)^{n_j}(y)\Bigr) = 0\,.
\end{align*}
As, for any k ≥ 1,
\begin{align*}
(D - r_j)\bigl(t^k e^{r_j t}\bigr)
&= \bigl(t^k e^{r_j t}\bigr)' - r_j\, t^k e^{r_j t}\\
&= k t^{k-1} e^{r_j t} + t^k r_j e^{r_j t} - r_j t^k e^{r_j t}\\
&= k t^{k-1} e^{r_j t},
\end{align*}
and \((D - r_j)\bigl(e^{r_j t}\bigr) = 0\), we see that, for 0 ≤ k ≤ n_j − 1,
\[
(D - r_j)^{n_j}\bigl(t^k e^{r_j t}\bigr) = 0\,.
\]
As a result, t ∈ R ↦ t^k e^{r_j t} is a solution to (Hn) as long as 0 ≤ k ≤ n_j − 1. For each 1 ≤ j ≤ d, we thus have n_j solutions. All in all, this adds up to \(\sum_{j=1}^{d} n_j = n\) solutions, as desired.
There is still a caveat, though. Recall that, although P ∈ R[X] has real coefficients, its roots can be nonreal (C is an algebraically closed field but R isn't!). If r_j ∈ C \ R, the function t ↦ e^{r_j t} takes nonreal values and we want real-valued functions. Fortunately, the nonreal roots of a real polynomial come in conjugate pairs, with the same order of multiplicity. Indeed, denoting by \(\bar{\cdot}\) the complex conjugation, for any z ∈ C and k ≥ 0, one has \(\overline{z^k} = \bar{z}^{\,k}\), so that, for any Q ∈ R[X], \(\overline{Q(z)} = Q(\bar z)\), and thus Q(z) = 0 ⟺ Q(\(\bar z\)) = 0. As r is a root of multiplicity d of Q if and only if Q^{(j)}(r) = 0 for 0 ≤ j ≤ d − 1 and Q^{(d)}(r) ≠ 0, one sees that r is a root of multiplicity d of Q if and only if \(\bar r\) is a root of multiplicity d of Q.
Rearranging the roots of P, we denote by r_1, …, r_p its real roots, with multiplicities n_1, …, n_p, and by α_1 ± iβ_1, …, α_q ± iβ_q its nonreal roots, with multiplicities m_1, …, m_q (so that \(\sum_{j=1}^{p} n_j + 2\sum_{j=1}^{q} m_j = n\)).
Let us look at the functions we have for nonreal conjugate roots:
\[
e^{(\alpha_j + i\beta_j)t} = e^{\alpha_j t}\bigl(\cos(\beta_j t) + i\sin(\beta_j t)\bigr)
\quad\text{and}\quad
e^{(\alpha_j - i\beta_j)t} = e^{\alpha_j t}\bigl(\cos(\beta_j t) - i\sin(\beta_j t)\bigr).
\]
We can come back to the real world by considering
\[
\frac{e^{(\alpha_j + i\beta_j)t} + e^{(\alpha_j - i\beta_j)t}}{2} = e^{\alpha_j t}\cos(\beta_j t)
\quad\text{and}\quad
\frac{e^{(\alpha_j + i\beta_j)t} - e^{(\alpha_j - i\beta_j)t}}{2i} = e^{\alpha_j t}\sin(\beta_j t)\,.
\]
Summing up, we have found the following \(\sum_{j=1}^{p} n_j + 2\sum_{j=1}^{q} m_j = n\) real-valued solutions:
• t ∈ R ↦ t^k e^{r_j t}, for 1 ≤ j ≤ p and 0 ≤ k ≤ n_j − 1;
• t ∈ R ↦ t^k e^{α_j t} cos(β_j t), for 1 ≤ j ≤ q and 0 ≤ k ≤ m_j − 1;
• t ∈ R ↦ t^k e^{α_j t} sin(β_j t), for 1 ≤ j ≤ q and 0 ≤ k ≤ m_j − 1.
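In practice, this basis can be recovered mechanically from the roots of P. A short sketch (the cubic used is our own illustrative example, chosen with a double real root; the grouping tolerance absorbs the numerical splitting of multiple roots):

```python
import numpy as np

# Characteristic polynomial of the (illustrative) equation
#   y''' - 3y' + 2y = 0,  i.e.  P = X^3 - 3X + 2 = (X - 1)^2 (X + 2).
roots = np.roots([1, 0, -3, 2])

# Group numerically close roots to recover multiplicities.
groups = []  # list of [root value, multiplicity]
for r in roots:
    for g in groups:
        if abs(r - g[0]) < 1e-4:
            g[1] += 1
            break
    else:
        groups.append([r, 1])

# Name the basis functions t^k e^{rt} (all roots are real in this example).
basis = []
for r, mult in groups:
    for k in range(mult):
        power = "" if k == 0 else ("t " if k == 1 else f"t^{k} ")
        basis.append(f"{power}e^({r.real:g}t)")
print(sorted(basis))
```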
Theorem 3.59
Let P = X^n + a_{n−1}X^{n−1} + ⋯ + a_1X + a_0 ∈ R[X] be the characteristic polynomial associated with
\[
y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1 y' + a_0 y = 0. \tag{H_n}
\]
We denote by r_1, …, r_p its real roots, with multiplicities n_1, …, n_p, and by α_1 ± iβ_1, …, α_q ± iβ_q its nonreal roots, with multiplicities m_1, …, m_q. In other words, we factor P in R[X] as
\[
P = \prod_{j=1}^{p} (X - r_j)^{n_j} \prod_{j=1}^{q} \bigl(X^2 - 2\alpha_j X + (\alpha_j^2 + \beta_j^2)\bigr)^{m_j}
\]
with distinct real numbers r_j ∈ R, and distinct pairs (α_j, β_j) ∈ R × R⋆. The solutions to (Hn) are the functions
\[
t \in \mathbb{R} \mapsto \sum_{j=1}^{p} e^{r_j t} P_j(t) + \sum_{j=1}^{q} e^{\alpha_j t}\bigl(\cos(\beta_j t)\,Q_j(t) + \sin(\beta_j t)\,R_j(t)\bigr),
\]
where P_1, …, P_p, Q_1, …, Q_q, R_1, …, R_q ∈ R[X] are real polynomials whose degrees satisfy
\[
\forall j, \quad \deg(P_j) \le n_j - 1, \quad \deg(Q_j) \le m_j - 1, \quad \deg(R_j) \le m_j - 1\,.
\]
Proof. This is just rewriting the fact that the n functions we found above form a basis of the set of solutions. From the analysis we did, it only remains to see that these solutions are linearly independent.
This can be established by observing that all these functions have different behavior around infinity. More precisely, let us argue by contradiction and suppose that there exist real numbers a_{jk}, b_{jk}, c_{jk}, not all zero, such that
\[
\forall t \in \mathbb{R}, \quad
\sum_{j=1}^{p}\sum_{k=0}^{n_j-1} a_{jk}\, t^k e^{r_j t}
+ \sum_{j=1}^{q}\sum_{k=0}^{m_j-1} b_{jk}\, t^k e^{\alpha_j t}\cos(\beta_j t)
+ \sum_{j=1}^{q}\sum_{k=0}^{m_j-1} c_{jk}\, t^k e^{\alpha_j t}\sin(\beta_j t) = 0\,.
\]
We first consider the largest number among all the r_j's and α_j's that appear in this sum (that is, with at least one a_{jk} ≠ 0 for r_j, or at least one nonzero b_{jk} or c_{jk} for α_j): let us denote it by r. We then consider the highest power function in front of e^{rt}: let us denote it by t^ℓ. We factor t^ℓ e^{rt} out of the previous sum and get, for all t ∈ R⋆,
\[
\sum_{j=1}^{p}\sum_{k=0}^{n_j-1} a_{jk}\, t^{k-\ell} e^{(r_j - r)t}
+ \sum_{j=1}^{q}\sum_{k=0}^{m_j-1} b_{jk}\, t^{k-\ell} e^{(\alpha_j - r)t}\cos(\beta_j t)
+ \sum_{j=1}^{q}\sum_{k=0}^{m_j-1} c_{jk}\, t^{k-\ell} e^{(\alpha_j - r)t}\sin(\beta_j t) = 0\,.
\]
Now when t → ∞, most terms in this sum tend to 0. In fact, if there exists j_1 such that r_{j_1} = r, the first double sum tends to a_{j_1 ℓ}; otherwise, it tends to 0. Similarly, the second double sum behaves like b_{j_2 ℓ} cos(β_{j_2} t) or tends to 0, and the third double sum behaves like c_{j_3 ℓ} sin(β_{j_3} t) or tends to 0. All in all, the whole sum behaves, as t → ∞, like
\[
a + b\cos(\beta t) + c\sin(\beta' t) \tag{3.6}
\]
for some β, β′ ∈ R⋆ and three numbers a, b, c ∈ R that are not all equal to 0 (as, by definition, t^ℓ e^{rt} was present in the original sum). But (3.6) should then tend to 0 as t → ∞, which is clearly impossible for such a nonzero combination. □
Exercise 3.60 solution page 147
Solve y′′′ + y′′ − y′ − y = 0.
3.4.2 Nonhomogeneous equation
From Theorem 3.52, we know that (SEn) admits a unique solution for any given initial condition. This implies that (En) admits a unique solution satisfying any initial condition of the form
\[
\begin{cases}
y(t_0) = x_0\\
y'(t_0) = x_1\\
\quad\vdots\\
y^{(n-1)}(t_0) = x_{n-1},
\end{cases}
\]
where t_0 ∈ I and x_0, x_1, …, x_{n−1} ∈ R.
As above, we will solve (SEn) using variation of constants. Let us denote by {y_1, …, y_n} a basis of solutions to (Hn) (as given for instance by Theorem 3.59). In fact,
\[
\begin{bmatrix} y_1 \\ y_1' \\ \vdots \\ y_1^{(n-1)} \end{bmatrix}, \;\ldots,\;
\begin{bmatrix} y_n \\ y_n' \\ \vdots \\ y_n^{(n-1)} \end{bmatrix}
\]
forms a basis of solutions to
\[
\begin{bmatrix} y \\ y' \\ \vdots \\ y^{(n-1)} \end{bmatrix}'
= \begin{bmatrix}
0 & 1 & 0 & \cdots & 0\\
\vdots & \ddots & \ddots & \ddots & \vdots\\
\vdots & & \ddots & \ddots & 0\\
0 & \cdots & \cdots & 0 & 1\\
-a_0 & -a_1 & -a_2 & \cdots & -a_{n-1}
\end{bmatrix}
\begin{bmatrix} y \\ y' \\ \vdots \\ y^{(n-1)} \end{bmatrix}. \tag{SH_n}
\]
All the solutions to (SHn) are thus of the form
\[
\sum_{j=1}^{n} \lambda_j \begin{bmatrix} y_j \\ y_j' \\ \vdots \\ y_j^{(n-1)} \end{bmatrix},
\qquad \lambda_1, \ldots, \lambda_n \in \mathbb{R}\,.
\]
Let us replace the constants λ_1, …, λ_n by functions λ_1 : I → R, …, λ_n : I → R. We look for a solution to (SEn) in the form
\[
\begin{bmatrix} y_{SE} \\ y_{SE}' \\ \vdots \\ y_{SE}^{(n-1)} \end{bmatrix}
= \sum_{j=1}^{n} \lambda_j(t) \begin{bmatrix} y_j \\ y_j' \\ \vdots \\ y_j^{(n-1)} \end{bmatrix}
\iff
\begin{cases}
\displaystyle y_{SE} = \sum_{j=1}^{n} \lambda_j(t)\, y_j\\[2mm]
\displaystyle y_{SE}' = \sum_{j=1}^{n} \lambda_j(t)\, y_j'\\[1mm]
\quad\vdots\\[1mm]
\displaystyle y_{SE}^{(n-1)} = \sum_{j=1}^{n} \lambda_j(t)\, y_j^{(n-1)}
\end{cases} \tag{3.7}
\]
Differentiating the first line of (3.7) and using the second line gives
\[
y_{SE}' = \sum_{j=1}^{n} \lambda_j'(t)\, y_j + \sum_{j=1}^{n} \lambda_j(t)\, y_j'
= \sum_{j=1}^{n} \lambda_j(t)\, y_j'
\implies \sum_{j=1}^{n} \lambda_j'(t)\, y_j = 0\,.
\]
Conducting the same reasoning with the subsequent lines yields
\[
\forall\, 0 \le k \le n-2, \qquad \sum_{j=1}^{n} \lambda_j'(t)\, y_j^{(k)} = 0\,.
\]
Now, y_{SE} is a solution to (En) if and only if \(y_{SE}^{(n)} + a_{n-1}y_{SE}^{(n-1)} + \cdots + a_1 y_{SE}' + a_0 y_{SE} = b(t)\). Differentiating the last line of (3.7) and using all the lines, we obtain
\begin{align*}
\sum_{j=1}^{n} \lambda_j'(t)\, y_j^{(n-1)} + \sum_{j=1}^{n} \lambda_j(t)\, y_j^{(n)}
+ a_{n-1}\sum_{j=1}^{n} \lambda_j(t)\, y_j^{(n-1)} + \cdots + a_0\sum_{j=1}^{n} \lambda_j(t)\, y_j &= b(t)\\
\sum_{j=1}^{n} \lambda_j'(t)\, y_j^{(n-1)}
+ \sum_{j=1}^{n} \lambda_j(t)\underbrace{\bigl(y_j^{(n)} + a_{n-1}y_j^{(n-1)} + \cdots + a_0 y_j\bigr)}_{0} &= b(t)\\
\sum_{j=1}^{n} \lambda_j'(t)\, y_j^{(n-1)} &= b(t)\,.
\end{align*}
All in all, this boils down to solving, for each t ∈ I, the linear system
\[
\begin{bmatrix}
y_1(t) & \cdots & y_n(t)\\
y_1'(t) & \cdots & y_n'(t)\\
\vdots & & \vdots\\
y_1^{(n-1)}(t) & \cdots & y_n^{(n-1)}(t)
\end{bmatrix}
\begin{bmatrix} \lambda_1'(t) \\ \lambda_2'(t) \\ \vdots \\ \lambda_n'(t) \end{bmatrix}
= \begin{bmatrix} 0 \\ \vdots \\ 0 \\ b(t) \end{bmatrix}
\]
and then integrating the solutions.
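As a sketch of this last step, here is a small numerical check, taking the second order case n = 2 with basis {cos, sin} and an arbitrary right-hand side b(t) = t (both choices are ours, for illustration only):

```python
import numpy as np

# For y'' + y = b(t), a homogeneous basis is {cos, sin}; the system above
# reads W(t) [l1', l2']^T = [0, b(t)]^T with Wronskian matrix
#   W(t) = [[cos t, sin t], [-sin t, cos t]]   (determinant 1).
# Its exact solution is l1' = -b(t) sin t, l2' = b(t) cos t; we verify this
# by solving the linear system numerically at a few sample times.
def b(t):
    return t  # an arbitrary continuous right-hand side

ok = True
for t in np.linspace(0.1, 3.0, 7):
    W = np.array([[np.cos(t), np.sin(t)],
                  [-np.sin(t), np.cos(t)]])
    lam = np.linalg.solve(W, np.array([0.0, b(t)]))
    ok = ok and np.allclose(lam, [-b(t) * np.sin(t), b(t) * np.cos(t)])
print(ok)
```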
3.4.3 Example: second order equation
Theorem 3.59 takes a particularly simple form for a second order homogeneous equation
\[
y'' + 2by' + cy = 0 \tag{H_2}
\]
as one knows explicitly how to factor polynomials of degree 2. The characteristic polynomial is X² + 2bX + c and its roots are found using its reduced discriminant⁴ Δ = b² − c.
Corollary 3.61
Let Δ = b² − c. The solutions to
\[
y'' + 2by' + cy = 0 \tag{H_2}
\]
are as follows:
(i) if Δ > 0, the characteristic polynomial has two distinct real roots r_1 = −b + √Δ and r_2 = −b − √Δ; the solutions are
\[
t \in \mathbb{R} \mapsto \lambda e^{r_1 t} + \mu e^{r_2 t}, \qquad \lambda, \mu \in \mathbb{R}\,;
\]
(ii) if Δ = 0, the characteristic polynomial has one double real root r = −b; the solutions are
\[
t \in \mathbb{R} \mapsto (\lambda t + \mu)e^{rt}, \qquad \lambda, \mu \in \mathbb{R}\,;
\]
(iii) if Δ < 0, the characteristic polynomial has two conjugate complex roots α ± iβ = −b ± i√(−Δ); the solutions are
\[
t \in \mathbb{R} \mapsto e^{\alpha t}\bigl(\lambda\cos(\beta t) + \mu\sin(\beta t)\bigr), \qquad \lambda, \mu \in \mathbb{R}\,.
\]
Example 3.62
1. Solve y'' − y' − 2y = 0. The characteristic polynomial is X² − X − 2 = (X + 1)(X − 2). The solutions are thus t ∈ R ↦ λe^{−t} + μe^{2t}, with λ, μ ∈ R.
2. Solve y'' − 4y' + 4y = 0. The characteristic polynomial is X² − 4X + 4 = (X − 2)². The solutions are thus t ∈ R ↦ (λt + μ)e^{2t}, with λ, μ ∈ R.
3. Solve y'' − 2y' + 5y = 0. The characteristic polynomial is X² − 2X + 5; its roots are 1 ± 2i. The solutions are thus t ∈ R ↦ e^{t}(λcos(2t) + μsin(2t)), with λ, μ ∈ R.
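The third solution can be verified numerically; the derivatives of e^t cos(2t) below are entered in closed form, computed by hand:

```python
import numpy as np

t = np.linspace(-2.0, 2.0, 201)

# y = e^t cos(2t) from case 3, with hand-computed derivatives:
y = np.exp(t) * np.cos(2 * t)
yp = np.exp(t) * (np.cos(2 * t) - 2 * np.sin(2 * t))
ypp = np.exp(t) * (-3 * np.cos(2 * t) - 4 * np.sin(2 * t))

residual = ypp - 2 * yp + 5 * y
print(np.abs(residual).max())
```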
To solve the nonhomogeneous equation
\[
y'' + 2by' + cy = b(t), \tag{E_2}
\]
⁴Recall Remark 1.60.
the variation of constants goes as follows. Let {y_1, y_2} be the basis of solutions of Corollary 3.61 (depending on the case, y_1 : t ↦ e^{r_1 t} and y_2 : t ↦ e^{r_2 t}, or y_1 : t ↦ e^{rt} and y_2 : t ↦ te^{rt}, or y_1 : t ↦ e^{αt}cos(βt) and y_2 : t ↦ e^{αt}sin(βt)). We search for a solution y_{SE} to (E_2) in the form
\[
\begin{cases}
y_{SE} = \lambda(t)y_1 + \mu(t)y_2\\
y_{SE}' = \lambda(t)y_1' + \mu(t)y_2'
\end{cases} \tag{3.8}
\]
where λ : I → R and μ : I → R are functions to be determined. Differentiating the first line of (3.8) and comparing with its second line yields
\[
y_{SE}' = \lambda'(t)y_1 + \mu'(t)y_2 + \lambda(t)y_1' + \mu(t)y_2' = \lambda(t)y_1' + \mu(t)y_2'
\implies \lambda'(t)y_1 + \mu'(t)y_2 = 0,
\]
and \(y_{SE}'' + 2by_{SE}' + cy_{SE} = b(t)\) implies
\[
\lambda'(t)y_1' + \mu'(t)y_2' + \lambda(t)y_1'' + \mu(t)y_2'' + 2b\bigl(\lambda(t)y_1' + \mu(t)y_2'\bigr) + c\bigl(\lambda(t)y_1 + \mu(t)y_2\bigr) = b(t),
\]
so that
\[
\lambda'(t)y_1' + \mu'(t)y_2' = b(t)\,.
\]
Example 3.63
Let us solve, on (−π/2, +π/2),
\[
y'' + y = \frac{1}{\cos(t)}\,.
\]
The solutions to the homogeneous equation y'' + y = 0 are t ∈ (−π/2, +π/2) ↦ λcos(t) + μsin(t), with λ, μ ∈ R. We thus look for a solution y_{SE} : t ∈ (−π/2, +π/2) ↦ λ(t)cos(t) + μ(t)sin(t) satisfying
\[
y_{SE}' = \lambda(t)\cos'(t) + \mu(t)\sin'(t)
\qquad\text{and}\qquad
y_{SE}'' + y_{SE} = \frac{1}{\cos(t)},
\]
so that
\[
\lambda'(t)\cos(t) + \mu'(t)\sin(t) = 0
\qquad\text{and}\qquad
\lambda'(t)\cos'(t) + \mu'(t)\sin'(t) = -\lambda'(t)\sin(t) + \mu'(t)\cos(t) = \frac{1}{\cos(t)}\,.
\]
This yields
\[
\begin{cases}
\lambda'(t)\sin(t)\cos(t) + \mu'(t)\sin^2(t) = 0\\
-\lambda'(t)\sin(t)\cos(t) + \mu'(t)\cos^2(t) = 1
\end{cases}
\implies
\begin{cases}
\mu'(t) = 1\\[1mm]
\lambda'(t) = -\dfrac{\sin(t)}{\cos(t)}
\end{cases},
\]
so, for instance, μ(t) = t and λ(t) = ln(cos(t)) (recall that t ∈ (−π/2, +π/2), so cos(t) > 0). In the end, the solutions are
\[
t \in \Bigl(-\frac{\pi}{2}, +\frac{\pi}{2}\Bigr) \mapsto \bigl(\lambda + \ln(\cos(t))\bigr)\cos(t) + (\mu + t)\sin(t), \qquad \lambda, \mu \in \mathbb{R}\,.
\]
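A quick numerical verification of the particular solution ln(cos t)cos t + t sin t, with its first two derivatives entered in closed form (computed by hand):

```python
import numpy as np

t = np.linspace(-1.4, 1.4, 301)   # inside (-pi/2, pi/2)

y = np.log(np.cos(t)) * np.cos(t) + t * np.sin(t)
yp = -np.sin(t) * np.log(np.cos(t)) + t * np.cos(t)
ypp = (-np.cos(t) * np.log(np.cos(t)) + np.sin(t) ** 2 / np.cos(t)
       + np.cos(t) - t * np.sin(t))

residual = ypp + y - 1.0 / np.cos(t)
print(np.abs(residual).max())
```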
Particular cases
In some particular cases, there is a faster way to find a particular solution to (E_2).
• b is a polynomial function. If b is a degree k polynomial function, look for a polynomial solution of degree k (if c ≠ 0; otherwise (E_2) is a first order linear ODE in y').
• b : t ↦ e^{αt}P_k(t) where α ∈ R⋆ and P_k ∈ R[X] is a degree k polynomial. In this case, look for a solution of the form y_{SE} : t ∈ R ↦ e^{αt}Q(t) where Q ∈ R[X]. We have
\begin{align*}
y_{SE}(t) &= e^{\alpha t}Q(t)\\
y_{SE}'(t) &= e^{\alpha t}\bigl(Q'(t) + \alpha Q(t)\bigr)\\
y_{SE}''(t) &= e^{\alpha t}\bigl(Q''(t) + 2\alpha Q'(t) + \alpha^2 Q(t)\bigr).
\end{align*}
The function y_{SE} is a solution to (E_2) if and only if
\begin{align*}
y_{SE}'' + 2by_{SE}' + cy_{SE} = e^{\alpha t}P_k(t)
&\iff \bigl(Q''(t) + 2\alpha Q'(t) + \alpha^2 Q(t)\bigr) + 2b\bigl(Q'(t) + \alpha Q(t)\bigr) + cQ(t) = P_k(t)\\
&\iff Q''(t) + 2(\alpha + b)Q'(t) + (\alpha^2 + 2b\alpha + c)Q(t) = P_k(t)\,.
\end{align*}
Thus, look for a polynomial of the form X^m Q_k where Q_k is of degree k and
(i) m = 0 if α² + 2bα + c ≠ 0 (that is, α is not a root of the characteristic polynomial);
(ii) m = 1 if α² + 2bα + c = 0 and α + b ≠ 0 (that is, α is a simple root of the characteristic polynomial);
(iii) m = 2 if α² + 2bα + c = 0 and α + b = 0 (that is, α is a double root of the characteristic polynomial).
Remark 3.64
If α is a simple root of the characteristic polynomial, then, for any a ∈ R, the function t ↦ ae^{αt} is already a solution to the homogeneous equation, so one needs to go up one degree. Similarly, if α is a double root of the characteristic polynomial, then, for any a, b ∈ R, the function t ↦ (a + bt)e^{αt} is already a solution to the homogeneous equation, so one needs to go up two degrees.
• b : t ↦ e^{αt}cos(βt)P_k(t) + e^{αt}sin(βt)Q_k(t) where α ∈ R, β ∈ R⋆ and P_k, Q_k ∈ R[X] are polynomials, one of degree k, one of degree k or less. In fact, the previous reasoning works perfectly well for nonreal complex values of α. As a result, look for a solution of the form t ↦ t^m e^{αt}(R_k(t)cos(βt) + S_k(t)sin(βt)), where R_k and S_k are both polynomials of degree k, and m is the multiplicity of α ± iβ as roots of the characteristic polynomial, that is,
(i) m = 0 if α ± iβ are not roots of the characteristic polynomial;
(ii) m = 1 if α ± iβ are simple roots of the characteristic polynomial.
Example 3.65
Let us solve y'' + y = sin(t). The roots of the characteristic polynomial are ±i, so that the solutions to the homogeneous equation are
\[
t \in \mathbb{R} \mapsto \lambda\sin(t) + \mu\cos(t), \qquad \lambda, \mu \in \mathbb{R}\,.
\]
As i is a simple root of the characteristic polynomial, the analysis above tells us to look for a solution of the form z : t ↦ t(a cos(t) + b sin(t)) for some a, b ∈ R. We have
\begin{align*}
z' &= a\cos(t) + b\sin(t) + t\bigl(-a\sin(t) + b\cos(t)\bigr),\\
z'' &= 2\bigl(-a\sin(t) + b\cos(t)\bigr) + t\bigl(-a\cos(t) - b\sin(t)\bigr),
\end{align*}
so that z is a solution if and only if −2a = 1 and b = 0. All the solutions are thus
\[
t \in \mathbb{R} \mapsto \lambda\sin(t) + \mu\cos(t) - \frac{t}{2}\cos(t), \qquad \lambda, \mu \in \mathbb{R}\,.
\]
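A quick numerical verification that −(t/2)cos t is indeed a particular solution, with hand-computed derivatives:

```python
import numpy as np

t = np.linspace(0.0, 10.0, 501)
y = -(t / 2) * np.cos(t)
yp = -0.5 * np.cos(t) + (t / 2) * np.sin(t)
ypp = np.sin(t) + (t / 2) * np.cos(t)

residual = ypp + y - np.sin(t)
print(np.abs(residual).max())
```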
Solutions to the exercises
Solution to Exercise 1.14 page 13
As, whenever x < y, there exist at least one rational number and one irrational number in (x, y), we see that, for any step functions φ ≤ 1_Q ≤ ψ, one has φ ≤ 0 and ψ ≥ 1. As a result, I⁻(1_Q) ≤ 0 and I⁺(1_Q) ≥ 1. In fact, it is easy to see that I⁻(1_Q) = 0 and I⁺(1_Q) = 1 by considering the constant step functions equal to 0 for the lower bound and 1 for the upper bound.
Solution to Exercise 1.15 page 13
We consider the regular subdivision 0 < 1/n < 2/n < … < (n−1)/n < 1 and use the monotonicity of the square function on [0, 1] to see that, for 1 ≤ i ≤ n,
\[
\forall x \in \Bigl[\frac{i-1}{n}, \frac{i}{n}\Bigr], \qquad \Bigl(\frac{i-1}{n}\Bigr)^2 \le x^2 \le \Bigl(\frac{i}{n}\Bigr)^2. \tag{3.9}
\]
We define the step functions φ_n and ψ_n on [0, 1] by φ_n(x) := ((i−1)/n)² and ψ_n(x) := (i/n)² whenever x ∈ [(i−1)/n, i/n), as well as φ_n(1) = ψ_n(1) := 1. From (3.9), we obtain that φ_n ≤ f ≤ ψ_n.
[Figure: the graph of f : x ↦ x² on [0, 1], squeezed between the step functions φ_n (below) and ψ_n (above), drawn for n = 5.]
We have
\begin{align*}
\int_0^1 \varphi_n(x)\,dx &= \sum_{i=1}^{n} \frac{1}{n}\Bigl(\frac{i-1}{n}\Bigr)^2 = \frac{1}{n^3}\sum_{i=0}^{n-1} i^2 = \frac{(n-1)(2n-1)}{6n^2},\\
\int_0^1 \psi_n(x)\,dx &= \sum_{i=1}^{n} \frac{1}{n}\Bigl(\frac{i}{n}\Bigr)^2 = \frac{1}{n^3}\sum_{i=1}^{n} i^2 = \frac{(n+1)(2n+1)}{6n^2}.
\end{align*}
We thus obtain
\[
\frac{(n-1)(2n-1)}{6n^2} = \int_0^1 \varphi_n \le I^-(f) \le I^+(f) \le \int_0^1 \psi_n = \frac{(n+1)(2n+1)}{6n^2}.
\]
Letting n → ∞ yields I⁻(f) = I⁺(f) = 1/3. Thus f is integrable and \(\int_0^1 x^2\,dx = \frac{1}{3}\).
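The squeeze between the two step-function integrals can be observed numerically:

```python
import numpy as np

def bounds(n):
    # Integrals of the step functions phi_n and psi_n defined above.
    i = np.arange(1, n + 1)
    lower = np.sum(((i - 1) / n) ** 2) / n
    upper = np.sum((i / n) ** 2) / n
    return lower, upper

for n in (10, 100, 1000):
    lo, hi = bounds(n)
    print(n, lo, hi)  # both bounds squeeze towards 1/3
```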
Solution to Exercise 1.24 page 17
(i) By linearity,
\[
\int_0^1 P(x)\,dx = \sum_{i=0}^{n} a_i \int_0^1 x^i\,dx = \sum_{i=0}^{n} \frac{a_i}{i+1}\,.
\]
(ii) If P(X) = a_0 + a_1X + a_2X², then \(\int_0^1 P(x)\,dx = a_0 + \frac{a_1}{2} + \frac{a_2}{3}\). For instance, P(X) = 3X² − 1 works.
Solution to Exercise 1.29 page 20
We have
\[
\biggl|\int_1^n \frac{\sin(nx)}{1 + x^n}\,dx\biggr|
\le \int_1^n \biggl|\frac{\sin(nx)}{1 + x^n}\biggr|\,dx
\le \int_1^n \frac{dx}{x^n}
= \frac{n^{-n+1} - 1}{-n+1}
\xrightarrow[n \to \infty]{} 0\,.
\]
Solution to Exercise 1.30 page 20
No. Take for instance the following.
(i) a = 0, b = 2, f = 1_{[0,1]} + 2·1_{[1,2]}. We would get 1 + 2² = (1 + 2)².
(ii) a = 0, b = 2, f ≡ 1. We would get 2 = √2.
(iii) a = −1, b = 1, f = −1_{[−1,0)} + 1_{[0,1]}. We would get 2 = 0.
(iv) a = 0, b = 1, f ≡ 1, g ≡ −1. We would get 0 = 1 + 1.
Solution to Exercise 1.46 page 26
(i) Let F : [−a, a] → R be a primitive of f. We consider φ : x ∈ [−a, a] ↦ F(x) − F(−x). It is differentiable and φ'(x) = f(x) + f(−x) = 0. As a consequence, φ is constantly equal to φ(0) = F(0) − F(0) = 0, so that F is even. Finally,
\[
\int_{-a}^{a} f(x)\,dx = [F]_{-a}^{a} = F(a) - F(-a) = 0\,.
\]
(ii) Let G : [−a, a] → R be a primitive of g. We consider ψ : x ∈ [−a, a] ↦ G(x) + G(−x). It is differentiable and ψ'(x) = g(x) − g(−x) = 0. As a consequence, ψ is constantly equal
to ψ(0) = 2G(0). Finally,
\begin{align*}
\int_{-a}^{a} g(x)\,dx = [G]_{-a}^{a} &= G(a) - G(-a) = G(a) - \bigl(\psi(a) - G(a)\bigr)\\
&= 2G(a) - 2G(0) = 2[G]_0^a = 2\int_0^a g(x)\,dx\,.
\end{align*}
Solution to Exercise 1.51 page 27
We readily recognize a Riemann sum:
\[
S_n = \frac{1}{n}\sum_{k=1}^{n} e^{\frac{k}{n}} \xrightarrow[n \to \infty]{} \int_0^1 e^x\,dx = \bigl[e^x\bigr]_0^1 = e - 1\,.
\]
For S'_n, a slight rewriting is necessary:
\[
S_n' = \frac{1}{n}\sum_{k=1}^{n} \frac{1}{1 + \bigl(\frac{k}{n}\bigr)^2} \xrightarrow[n \to \infty]{} \int_0^1 \frac{1}{1 + x^2}\,dx = \bigl[\arctan(x)\bigr]_0^1 = \arctan(1) - \arctan(0) = \frac{\pi}{4}\,.
\]
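Numerically, both Riemann sums are indeed close to their limits for large n:

```python
import numpy as np

n = 1_000_000
k = np.arange(1, n + 1)

S = np.sum(np.exp(k / n)) / n                 # Riemann sum for e^x on [0, 1]
Sp = np.sum(1.0 / (1.0 + (k / n) ** 2)) / n   # Riemann sum for 1/(1+x^2)
print(S - (np.e - 1), Sp - np.pi / 4)
```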
Solution to Exercise 1.55 page 28
(i) We have
\begin{align*}
\int_1^e x\ln(x)\,dx &= \Bigl[\frac{x^2}{2}\ln(x)\Bigr]_1^e - \int_1^e \frac{x^2}{2}\,\frac{1}{x}\,dx\\
&= \frac{e^2}{2} - \frac{1}{2}\int_1^e x\,dx\\
&= \frac{e^2}{2} - \frac{1}{2}\Bigl[\frac{x^2}{2}\Bigr]_1^e = \frac{e^2}{2} - \frac{e^2}{4} + \frac{1}{4} = \frac{e^2 + 1}{4}\,.
\end{align*}
(ii) For any k ∈ N,
\[
\int_0^x t^k e^t\,dt = \bigl[t^k e^t\bigr]_0^x - k\int_0^x t^{k-1}e^t\,dt = x^k e^x - k\int_0^x t^{k-1}e^t\,dt\,.
\]
Using this for k = 2 then k = 1, we get
\[
\int_0^x t^2 e^t\,dt = x^2 e^x - 2\Bigl(xe^x - \int_0^x e^t\,dt\Bigr) = (x^2 - 2x + 2)\,e^x - 2\,.
\]
In particular, x ↦ (x² − 2x + 2)eˣ is a primitive. The other ones are obtained by adding an arbitrary constant.
Solution to Exercise 1.59 page 30
(i) As indicated, we use the substitution x = sin(t), yielding dx = cos(t) dt, and mapping [0, π/6] onto [0, 1/2]. Thus,
\begin{align*}
\int_0^{\frac{1}{2}} \frac{dx}{(1 - x^2)^{3/2}} &= \int_0^{\frac{\pi}{6}} \frac{\cos(t)\,dt}{(1 - \sin^2(t))^{3/2}} = \int_0^{\frac{\pi}{6}} \frac{\cos(t)\,dt}{(\cos^2(t))^{3/2}}\\
&= \int_0^{\frac{\pi}{6}} \frac{1}{\cos^2(t)}\,dt = \bigl[\tan(t)\bigr]_0^{\frac{\pi}{6}} = \frac{\sqrt{3}}{3}\,.
\end{align*}
(ii) We have x = tan(t) and dx = \(\frac{dt}{\cos^2(t)}\). Hence, recalling that \(\frac{1}{\cos^2(t)} = 1 + \tan^2(t)\),
\begin{align*}
\int_0^{y} \frac{dx}{(1 + x^2)^{3/2}} &= \int_0^{\arctan(y)} \frac{1}{(1 + \tan^2(t))^{3/2}}\,\frac{dt}{\cos^2(t)}\\
&= \int_0^{\arctan(y)} \cos(t)\,dt\\
&= \bigl[\sin(t)\bigr]_0^{\arctan(y)} = \sin(\arctan(y))\,.
\end{align*}
This can be rewritten as
\[
\int_0^{y} \frac{dx}{(1 + x^2)^{3/2}} = \sin(\arctan(y)) = \tan(\arctan(y))\cos(\arctan(y)) = \frac{y}{\sqrt{1 + y^2}}\,.
\]
We used the fact that \(\cos^2(\arctan(y)) = \frac{1}{1 + \tan^2(\arctan(y))} = \frac{1}{1 + y^2}\) and that cos(arctan(y)) ≥ 0 as −π/2 ≤ arctan(y) ≤ π/2.
Solution to Exercise 1.71 page 35
The partial fraction decomposition gives
\[
\text{(i)}\quad \frac{4x+5}{x^2+x-2} = \frac{3}{x-1} + \frac{1}{x+2}
\qquad\qquad
\text{(ii)}\quad \frac{6-x}{x^2-4x+4} = \frac{-1}{x-2} + \frac{4}{(x-2)^2}
\]
so that we obtain for instance the primitives
\[
\text{(i)}\quad x \in \mathbb{R}\setminus\{-2, 1\} \mapsto 3\ln|x-1| + \ln|x+2|
\qquad
\text{(ii)}\quad x \in \mathbb{R}\setminus\{2\} \mapsto -\ln|x-2| - 4(x-2)^{-1}
\]
We chose all 5 constants equal to 0. If one wants all the primitives, one needs to add one constant per interval of the domain of definition (so 3 for the first function and 2 for the second one).
The third one is a bit more involved. It is already in partial fraction decomposition form. We have
\begin{align*}
\int_0^y \frac{2x-3}{x^2-4x+5}\,dx &= \int_0^y \frac{2x-4}{x^2-4x+5}\,dx + \int_0^y \frac{dx}{(x-2)^2+1}\\
&= \bigl[\ln(x^2-4x+5)\bigr]_0^y + \int_{-2}^{y-2} \frac{du}{u^2+1}\\
&= \bigl[\ln(x^2-4x+5)\bigr]_0^y + \bigl[\arctan(u)\bigr]_{-2}^{y-2}\\
&= \ln(y^2-4y+5) + \arctan(y-2) + c
\end{align*}
where c is a constant whose value does not matter (there is no need to compute it).
Solution to Exercise 1.72 page 35
We have
\[
x^2 + x + 1 = \Bigl(x + \frac{1}{2}\Bigr)^2 + \frac{3}{4} = \frac{3}{4}\biggl(\Bigl(\frac{2}{\sqrt{3}}\Bigl(x + \frac{1}{2}\Bigr)\Bigr)^2 + 1\biggr)
\]
so that
\begin{align*}
\text{(i)}\quad \int_0^1 \frac{dx}{x^2+x+1} &= \frac{4}{3}\int_0^1 \frac{dx}{\bigl(\frac{2}{\sqrt 3}(x+\frac12)\bigr)^2 + 1} = \frac{4}{3}\,\frac{\sqrt 3}{2}\int_{\frac{1}{\sqrt 3}}^{\sqrt 3} \frac{dy}{y^2+1}\\
&= \frac{2\sqrt 3}{3}\bigl[\arctan(y)\bigr]_{\frac{1}{\sqrt 3}}^{\sqrt 3} = \frac{2\sqrt 3}{3}\Bigl(\arctan\bigl(\sqrt 3\bigr) - \arctan\bigl(\sqrt 3/3\bigr)\Bigr)
\end{align*}
\begin{align*}
\text{(ii)}\quad \int_0^1 \frac{x\,dx}{x^2+x+1} &= \frac{1}{2}\int_0^1 \frac{2x+1}{x^2+x+1}\,dx - \frac{1}{2}\int_0^1 \frac{dx}{x^2+x+1}\\
&= \frac{1}{2}\bigl[\ln(x^2+x+1)\bigr]_0^1 - \frac{1}{2}\int_0^1 \frac{dx}{x^2+x+1} = \frac{\ln(3)}{2} - \frac{1}{2}\int_0^1 \frac{dx}{x^2+x+1}
\end{align*}
\[
\text{(iii)}\quad \int_0^1 \frac{dx}{(x^2+x+1)^2} = \frac{16}{9}\int_0^1 \frac{dx}{\Bigl(\bigl(\frac{2}{\sqrt 3}(x+\frac12)\bigr)^2 + 1\Bigr)^2} = \frac{16}{9}\,\frac{\sqrt 3}{2}\int_{\frac{1}{\sqrt 3}}^{\sqrt 3} \frac{dy}{(y^2+1)^2}
\]
Then, integrating by parts,
\[
\int_{\frac{1}{\sqrt 3}}^{\sqrt 3} \frac{dy}{y^2+1} = \Bigl[\frac{y}{y^2+1}\Bigr]_{\frac{1}{\sqrt 3}}^{\sqrt 3} - \int_{\frac{1}{\sqrt 3}}^{\sqrt 3} \frac{-2y\cdot y}{(y^2+1)^2}\,dy = 2\int_{\frac{1}{\sqrt 3}}^{\sqrt 3} \frac{y^2+1-1}{(y^2+1)^2}\,dy
\]
(the bracket vanishes, as y/(y²+1) takes the same value √3/4 at both endpoints), so that
\[
\int_{\frac{1}{\sqrt 3}}^{\sqrt 3} \frac{dy}{(y^2+1)^2} = \frac{1}{2}\int_{\frac{1}{\sqrt 3}}^{\sqrt 3} \frac{dy}{y^2+1}
\qquad\text{and}\qquad
\int_0^1 \frac{dx}{(x^2+x+1)^2} = \frac{4\sqrt 3}{9}\Bigl(\arctan\bigl(\sqrt 3\bigr) - \arctan\bigl(\sqrt 3/3\bigr)\Bigr).
\]
\begin{align*}
\text{(iv)}\quad \int_0^1 \frac{x\,dx}{(x^2+x+1)^2} &= \frac{1}{2}\int_0^1 \frac{2x+1}{(x^2+x+1)^2}\,dx - \frac{1}{2}\int_0^1 \frac{dx}{(x^2+x+1)^2}\\
&= \frac{1}{2}\Bigl[\frac{-1}{x^2+x+1}\Bigr]_0^1 - \frac{1}{2}\int_0^1 \frac{dx}{(x^2+x+1)^2} = \frac{1}{3} - \frac{1}{2}\int_0^1 \frac{dx}{(x^2+x+1)^2}
\end{align*}
Solution to Exercise 1.73 page 36
The substitution y = eˣ, dy = eˣ dx, yields
\begin{align*}
\int_1^2 \frac{2e^{2x} - 3e^x + 2}{e^{2x} - e^x}\,dx &= \int_e^{e^2} \frac{2y^2 - 3y + 2}{y\,(y^2 - y)}\,dy\\
&= \int_e^{e^2} \Bigl(\frac{1}{y-1} - \frac{2}{y^2} + \frac{1}{y}\Bigr)dy\\
&= \Bigl[\ln(y-1) + \frac{2}{y} + \ln(y)\Bigr]_e^{e^2}\\
&= \ln\Bigl(\frac{e^2-1}{e-1}\Bigr) + \frac{2}{e^2} - \frac{2}{e} + \ln\Bigl(\frac{e^2}{e}\Bigr)\\
&= \ln(e+1) + \frac{2}{e^2} - \frac{2}{e} + 1\,.
\end{align*}
Solution to Exercise 1.77 page 38
(i)
\begin{align*}
\int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} \sin^2(x)\cos^3(x)\,dx &= \int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} \sin^2(x)\bigl(1 - \sin^2(x)\bigr)\cos(x)\,dx\\
&= \int_{-1}^{1} y^2(1 - y^2)\,dy = \Bigl[\frac{y^3}{3} - \frac{y^5}{5}\Bigr]_{-1}^{1} = \frac{4}{15}
\end{align*}
(ii) Ok, this was vicious. If you use the half-angle tangent, you get to integrate \(\frac{(1-t^2)^4}{(1+t^2)^5}\); good luck with that! Rather, use power-reduction for cos⁴, that is,
\[
\cos^4(x) = \Bigl(\frac{e^{ix} + e^{-ix}}{2}\Bigr)^4 = \frac{e^{4ix} + 4e^{2ix} + 6 + 4e^{-2ix} + e^{-4ix}}{2^4} = \frac{\cos(4x) + 4\cos(2x) + 3}{8}
\]
and obtain
\[
\int_0^{\frac{\pi}{2}} \cos^4(x)\,dx = \frac{1}{8}\Bigl[\frac{\sin(4x)}{4} + 2\sin(2x) + 3x\Bigr]_0^{\frac{\pi}{2}} = \frac{3\pi}{16}\,.
\]
(iii) From the material of Section 1.3.3, we infer that the half-angle tangent substitution is appropriate. There is however some care needed, as π ∈ [0, 2π]. (Overlooking this should bother you, as it would result in integrating from 0 to 0; this may happen with some substitutions, but not with the half-angle tangent substitution because it is a bijective substitution: the image of an interval of positive length cannot be a single point under a bijection.) Even though the integral is a priori not an improper integral (these are addressed in the next section), because the integrand is well defined and continuous on [0, 2π], the half-angle tangent substitution will make it an improper integral. A first possibility is to "avoid" the point π by use of Chasles's identity and continuity of primitives:
\[
\int_0^{2\pi} \frac{dx}{2 + \sin(x)} = \lim_{x \to \pi^-} \int_0^x \frac{du}{2 + \sin(u)} + \lim_{y \to \pi^+} \int_y^{2\pi} \frac{du}{2 + \sin(u)}\,.
\]
We may now use the half-angle tangent substitution on [0, x] with 0 < x < π fixed:
\begin{align*}
\int_0^x \frac{du}{2 + \sin(u)} &= \int_0^{\tan(\frac{x}{2})} \frac{1}{2 + \frac{2t}{1+t^2}}\,\frac{2\,dt}{1+t^2} = \int_0^{\tan(\frac{x}{2})} \frac{dt}{t^2 + t + 1}\\
&= \Bigl[\frac{2}{\sqrt 3}\arctan\Bigl(\frac{2t+1}{\sqrt 3}\Bigr)\Bigr]_0^{\tan(\frac{x}{2})}
\xrightarrow[x \to \pi^-]{} \frac{2}{\sqrt 3}\Bigl(\frac{\pi}{2} - \frac{\pi}{6}\Bigr) = \frac{2\pi}{3\sqrt 3}\,.
\end{align*}
Similarly, for π < y < 2π,
\[
\int_y^{2\pi} \frac{du}{2 + \sin(u)} = \Bigl[\frac{2}{\sqrt 3}\arctan\Bigl(\frac{2t+1}{\sqrt 3}\Bigr)\Bigr]_{\tan(\frac{y}{2})}^{0}
\xrightarrow[y \to \pi^+]{} \frac{2}{\sqrt 3}\Bigl(\frac{\pi}{6} + \frac{\pi}{2}\Bigr) = \frac{4\pi}{3\sqrt 3}\,.
\]
All in all,
\[
\int_0^{2\pi} \frac{dx}{2 + \sin(x)} = \frac{2\pi}{3\sqrt 3} + \frac{4\pi}{3\sqrt 3} = \frac{2\pi}{\sqrt 3}\,.
\]
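This integral can be cross-checked against the classical closed form ∫₀^{2π} dx/(a + b sin x) = 2π/√(a² − b²), valid for a > |b|, which gives 2π/√3 here; a quick numerical check:

```python
import numpy as np
from scipy.integrate import quad

val, _ = quad(lambda x: 1.0 / (2.0 + np.sin(x)), 0.0, 2.0 * np.pi)
print(val, 2.0 * np.pi / np.sqrt(3.0))
```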
Solution to Exercise 1.79 page 39
Using the substitution y = √(x+1), which yields y² = x + 1 and 2y dy = dx, we get
\begin{align*}
\int_0^1 \frac{x^2 + 1}{\sqrt{x+1}}\,dx &= 2\int_1^{\sqrt 2} \bigl((y^2-1)^2 + 1\bigr)dy = 2\int_1^{\sqrt 2} \bigl(y^4 - 2y^2 + 2\bigr)dy\\
&= 2\Bigl[\frac{y^5}{5} - 2\frac{y^3}{3} + 2y\Bigr]_1^{\sqrt 2} = \frac{44}{15}\sqrt 2 - \frac{46}{15}\,.
\end{align*}
Solution to Exercise 1.82 page 42
We integrate by parts:
\begin{align*}
\int_0^x \lambda t\, e^{-\lambda t}\,dt &= \bigl[-t\,e^{-\lambda t}\bigr]_0^x + \int_0^x e^{-\lambda t}\,dt = -xe^{-\lambda x} + \int_0^x e^{-\lambda t}\,dt\\
&= -xe^{-\lambda x} - \frac{1}{\lambda}\bigl[e^{-\lambda t}\bigr]_0^x = -xe^{-\lambda x} - \frac{1}{\lambda}\bigl(e^{-\lambda x} - 1\bigr) \xrightarrow[x \to \infty]{} \frac{1}{\lambda}\,.
\end{align*}
The integral is thus convergent and \(\int_0^{+\infty} \lambda t\, e^{-\lambda t}\,dt = \frac{1}{\lambda}\).
Solution to Exercise 1.83 page 42
For n = 0, \(\int_0^x e^{-t}\,dt = \bigl[-e^{-t}\bigr]_0^x = 1 - e^{-x} \to 1\) as x → ∞.
Next, we integrate by parts as in the previous exercise. Let n ≥ 1.
\[
\int_0^x t^n e^{-t}\,dt = \bigl[-t^n e^{-t}\bigr]_0^x + n\int_0^x t^{n-1} e^{-t}\,dt = -x^n e^{-x} + n\int_0^x t^{n-1}e^{-t}\,dt\,.
\]
The convergence of the integral for n − 1 implies the convergence for n, so that, by induction, the integral converges and taking the limit as x → ∞ yields
\[
\int_0^{+\infty} t^n e^{-t}\,dt = n\int_0^{+\infty} t^{n-1}e^{-t}\,dt\,.
\]
By induction, we readily obtain \(\int_0^{+\infty} t^n e^{-t}\,dt = n!\).
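A numerical sanity check of the identity ∫₀^{+∞} tⁿ e^{−t} dt = n! for the first few values of n:

```python
import math
import numpy as np
from scipy.integrate import quad

for n in range(6):
    val, _ = quad(lambda t, n=n: t**n * np.exp(-t), 0.0, np.inf)
    print(n, val, math.factorial(n))
```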
Solution to Exercise 1.102 page 50
(i) Clearly, if α = 1, the integral diverges; let us now suppose that α ≠ 1. The problem is at −∞. We move it to +∞ thanks to the substitution x = −t, with dx = −dt. We have
\[
\int_{-u}^{\pi} \alpha^t\,dt = \int_{-\pi}^{u} \alpha^{-x}\,dx = \int_{-\pi}^{u} e^{-x\ln(\alpha)}\,dx = \Bigl[\frac{e^{-x\ln(\alpha)}}{-\ln(\alpha)}\Bigr]_{-\pi}^{u} = \frac{e^{-u\ln(\alpha)} - e^{\pi\ln(\alpha)}}{-\ln(\alpha)},
\]
which converges to a finite limit as u → ∞ if and only if ln(α) > 0, that is, α > 1.
(ii) As t → ∞, \(t^{-\alpha} - \sin(t^{-\alpha}) \sim \frac{1}{6}t^{-3\alpha}\), which is integrable at +∞ if and only if 3α > 1, that is, α > 1/3.
(iii) As t → ∞, \(1 - \sqrt[3]{1 + t^{-\alpha}} \sim -\frac{1}{3}t^{-\alpha}\), which is integrable at +∞ if and only if α > 1.
Solution to Exercise 1.105 page 52
For t ≥ 1, \(\bigl|\frac{\sin(t)}{t^2}\bigr| \le \frac{1}{t^2}\), which is integrable at +∞. By comparison of nonnegative functions, we conclude that \(t \in [1, +\infty) \mapsto \bigl|\frac{\sin(t)}{t^2}\bigr|\) is integrable.
Solution to Exercise 1.110 page 53
As \(\bigl|\frac{\sin^n(t)}{t^\alpha}\bigr| \le \frac{1}{t^\alpha}\), we see that the integral is absolutely convergent for α > 1. Next,
\begin{align*}
\int_\pi^{N\pi} \Bigl|\frac{\sin^n(t)}{t^\alpha}\Bigr|\,dt &= \sum_{k=2}^{N} \int_{(k-1)\pi}^{k\pi} \frac{|\sin^n(t)|}{t^\alpha}\,dt\\
&\ge \frac{1}{\pi^\alpha}\sum_{k=2}^{N} \frac{1}{k^\alpha}\int_{(k-1)\pi}^{k\pi} |\sin^n(t)|\,dt = \frac{c_n}{\pi^\alpha}\sum_{k=2}^{N} \frac{1}{k^\alpha},
\end{align*}
where \(c_n := \int_0^\pi |\sin^n(t)|\,dt\) is a positive constant depending only on n. For 0 < α ≤ 1, the latter sum tends to +∞ as N → ∞ (see 1.115), so that the integral under study is not absolutely convergent. Moreover, for even n, the integrand is nonnegative, so that the above argument holds without taking the absolute value.
Finally, it remains to see whether the integral is convergent for odd n and 0 < α ≤ 1. Using Abel's criterion with f : t ∈ [1, +∞) ↦ sinⁿ(t) and g : t ∈ [1, +∞) ↦ t^{−α}, we see that it is always convergent. The only point to check is that
\[
x \in [1, +\infty) \mapsto \int_1^x f(t)\,dt
\]
is bounded. This is easily obtained from the power-reduction formula
\[
\sin^{2p+1}(t) = \frac{1}{4^p}\sum_{\ell=0}^{p} (-1)^\ell \binom{2p+1}{p-\ell}\sin\bigl((2\ell+1)t\bigr),
\]
observing that the primitives of the sine functions involved are cosine functions and thus bounded. Of course, the argument does not hold for even values of n: the reason is that the power-reduction formula then contains a constant term, whose primitives are not bounded. Summing up, we obtained the following.
• α > 1: absolutely convergent.
• 0 < α ≤ 1 and n even: not convergent.
• 0 < α ≤ 1 and n odd: convergent but not absolutely convergent.
Solution to Exercise 1.111 page 54
We use the change of variable u = 1/t, i.e. t = 1/u and dt = −du/u²:
\[
\int_x^1 \frac{\sin\bigl(\frac{1}{t}\bigr)}{t}\,dt = \int_{1/x}^{1} u\sin(u)\,\frac{-du}{u^2} = \int_1^{1/x} \frac{\sin(u)}{u}\,du\,.
\]
As x → 0⁺, 1/x → +∞. We recover a known integral, which is convergent but not absolutely convergent.
Solution to Exercise 1.112 page 54
Let us first check that the first integral converges. The problem is at 0. As √t ln(t) → 0 as t → 0⁺, we obtain that, for small t, |ln(t)| ≤ t^{−1/2}. By comparison with Riemann integrals, ln is integrable at 0. As sin(t) ∼ t when t → 0, the same is true for the integrand. Let us denote by A this integral.
Next, observe that
\[
\int_0^x \ln\bigl(\cos(t)\bigr)dt = -\int_{\pi/2}^{\pi/2 - x} \ln\bigl(\cos(\tfrac{\pi}{2} - u)\bigr)du = \int_{\pi/2 - x}^{\pi/2} \ln\bigl(\sin(u)\bigr)du \xrightarrow[x \to \frac{\pi}{2}]{} A,
\]
so that the second integral also converges and is equal to the first one.
As suggested, let us consider the sum:
\begin{align*}
2A &= \int_0^{\frac{\pi}{2}} \ln\bigl(\sin(t)\bigr)dt + \int_0^{\frac{\pi}{2}} \ln\bigl(\cos(t)\bigr)dt = \int_0^{\frac{\pi}{2}} \ln\bigl(\sin(t)\cos(t)\bigr)dt\\
&= \int_0^{\frac{\pi}{2}} \ln\Bigl(\frac{1}{2}\sin(2t)\Bigr)dt = -\frac{\pi}{2}\ln(2) + \int_0^{\frac{\pi}{2}} \ln\bigl(\sin(2t)\bigr)dt\\
&\overset{u = 2t}{=} -\frac{\pi}{2}\ln(2) + \frac{1}{2}\int_0^{\pi} \ln\bigl(\sin(u)\bigr)du = -\frac{\pi}{2}\ln(2) + \frac{1}{2}\Bigl(A + \int_{\frac{\pi}{2}}^{\pi} \ln\bigl(\sin(u)\bigr)du\Bigr)\\
&\overset{v = \pi - u}{=} -\frac{\pi}{2}\ln(2) + \frac{1}{2}\Bigl(A - \int_{\frac{\pi}{2}}^{0} \ln\bigl(\sin(\pi - v)\bigr)dv\Bigr) = -\frac{\pi}{2}\ln(2) + A\,.
\end{align*}
We finally obtain \(A = -\frac{\pi}{2}\ln(2)\).
Solution to Exercise 1.116 page 56
One may for instance proceed as follows. Let f denote the above example. We make it unbounded at −∞ by adding x ↦ f(−x). Next, we make it positive by adding a continuous positive integrable function, for instance x ∈ R ↦ e^{−x²}. In the end, the function x ∈ R ↦ f(x) + f(−x) + e^{−x²} answers the question.
Solution to Exercise 1.117 page 56
The change of variable u = t² yields t = √u, \(dt = \frac{du}{2\sqrt u}\), and
\[
\int_1^x \sin(t^2)\,dt = \int_1^{x^2} \sin(u)\,\frac{du}{2\sqrt u}\,.
\]
We conclude thanks to Abel's criterion (see Exercise 1.110) that the integral converges.
Solution to Exercise 2.17 page 68
The tangent is the line \(L\bigl((t, f(t)),\, \vec\imath + f'(t)\vec\jmath\bigr)\). As t → t₀, we have ‖(t, f(t)) − (t₀, f(t₀))‖ → 0 and \(\|(\vec\imath + f'(t)\vec\jmath) - (\vec\imath + f'(t_0)\vec\jmath)\| = |f'(t) - f'(t_0)| \to 0\), so that the tangent admits the limiting position \(L\bigl((t_0, f(t_0)),\, \vec\imath + f'(t_0)\vec\jmath\bigr)\).
Solution to Exercise 2.18 page 68
The tangent has direction vector \(\vec\imath - \frac{\cos(1/t)}{t^2}\,\vec\jmath\). At times \(a_n := \frac{2}{\pi + 4\pi n}\), this direction vector is \(\vec\imath\), which tends to \(\vec\imath\) as n → ∞. At times \(b_n := \frac{1}{2\pi n}\), the corresponding unit vector is \(\frac{\vec\imath - (2\pi n)^2\vec\jmath}{\sqrt{1 + (2\pi n)^4}} \to -\vec\jmath\) as n → ∞. We have thus found two sequences of times tending to 0 along which the unit direction vectors cannot converge to the same limit.
Solution to Exercise 2.32 page 77
The vector
\[
\frac{\vec v(t)}{\|\vec v(t)\|} = \frac{t}{\sqrt{t^2 + \sin^2(t)}}\,\vec\imath + \frac{\sin(t)}{\sqrt{t^2 + \sin^2(t)}}\,\vec\jmath
\]
tends to \(\vec\imath\) as t → ∞, so that the parametric curve admits the ray \(R(O, \vec\imath)\) as asymptotic direction. By Proposition 2.31, if it admits an asymptote, then it has to be horizontal. But the distance to the line of equation y = a is |a − sin(t)|, which does not tend to 0 for any a ∈ R.
Solution to Exercise 2.33 page 77
The vector
\[
\frac{\vec v(t)}{\|\vec v(t)\|} = \sin(t)\,\vec\imath + \cos(t)\,\vec\jmath
\]
clearly does not admit a limit as t → ∞ (you can for instance find a subsequence along which it is equal to \(\vec\imath\) and one along which it is equal to \(\vec\jmath\)).
Solution to Exercise 2.38 page 84
Let us find the times at which the location is (−8/3, −4/3). According to the sketch, we should find two times, one in (−∞, −1) and one in (−1, 1).
The sketch shows that the abscissa will be reached three times, whereas the ordinate will be reached only twice, at the desired times. It is thus a priori a better idea to find the times at which the ordinate −4/3 is reached and then check that at these times the abscissa is −8/3 as desired.
Let us thus solve
\[
\frac{t(3t-2)}{3(t-1)} = -\frac{4}{3} \iff 3t^2 + 2t - 4 = 0 \iff t = \frac{-1 \pm \sqrt{13}}{3}\,.
\]
Let us next check the abscissa at these times:
\[
x\Bigl(\frac{-1 \pm \sqrt{13}}{3}\Bigr) = -\frac{8}{3}\,.
\]
As a result, \(M\bigl(\frac{-1 \pm \sqrt{13}}{3}\bigr) = \bigl(-\frac{8}{3}, -\frac{4}{3}\bigr)\) is the point of intersection of the two parametric curves \(((-\infty, -1), \vec v)\) and \(((-1, 1), \vec v)\).
Note that solving \(\frac{t^3}{t^2 - 1} = -\frac{8}{3}\) directly is quite involved, as it is a third degree equation, which you are not supposed to know how to solve…
Solution to Exercise 2.41 page 86
The polar angle is given by
\[
\theta = \begin{cases}
\frac{\pi}{2} & \text{if } x = 0 \text{ and } y > 0\\
0 & \text{if } x = 0 \text{ and } y = 0\\
\frac{3\pi}{2} & \text{if } x = 0 \text{ and } y < 0\\
\arctan\bigl(\frac{y}{x}\bigr) & \text{if } x > 0 \text{ and } y \ge 0\\
\arctan\bigl(\frac{y}{x}\bigr) + 2\pi & \text{if } x > 0 \text{ and } y < 0\\
\arctan\bigl(\frac{y}{x}\bigr) + \pi & \text{if } x < 0\,.
\end{cases}
\]
Solution to Exercise 2.47 page 89
At time π/3, ρ(π/3) = 0, so that the location is the pole. As ρ only vanishes at time π/3 around that time, the polar curve admits as tangent at π/3 the line \(L(O, \vec u_{\pi/3})\).
We have ρ(π/2) = 1 and ρ'(π/2) = 2, so that, at time π/2, the polar curve admits as tangent the line \(L\bigl((0, 1),\, 2\vec u_{\pi/2} + \vec v_{\pi/2}\bigr)\).
Solution to Exercise 3.6 page 101
First observe that all these ODEs are well defined on R. The first ODE is very particular, as it only involves one derivative of y, namely y'. From your knowledge of usual functions and their derivatives, you may think of the following solutions, defined on R.
\[
\text{(i)}\; y = \frac{x^2}{2} - \cos(x) \qquad \text{(ii)}\; y = e^x \qquad \text{(iii)}\; y = e^{7x} \qquad \text{(iv)}\; y = e^{\sqrt 2\, x}
\]
143
Solutions to the exercises
Solution to Exercise 3.7 page 101
Let \(y : x \in (-c, +\infty) \mapsto \frac{1}{x + c}\). Then, for all x ∈ (−c, +∞),
\[
y'(x) = -\frac{1}{(x+c)^2} = -y(x)^2,
\]
so that y' = −y² on (−c, +∞), as desired.
Solution to Exercise 3.21 page 107
Both these ODEs are defined on R.
The ODE 2y' − 5y = 0 can be rewritten as \(y' = \frac{5}{2}y\), so that its solutions are the functions \(x \in \mathbb{R} \mapsto c\,e^{\frac{5}{2}x}\), for any c ∈ R.
The ODE eˣy' + 2y = 0 can be rewritten as y' = −2e^{−x}y (observe that x ∈ R ↦ eˣ does not vanish, so that dividing by eˣ does not cause problems). A primitive of x ∈ R ↦ −2e^{−x} is for instance x ∈ R ↦ 2e^{−x}, so that the solutions to the ODE are the functions \(x \in \mathbb{R} \mapsto c\,e^{2e^{-x}}\), for any c ∈ R.
Solution to Exercise 3.25 page 108
The solutions to the associated homogeneous equation y' = 2xy are the functions \(x \in \mathbb{R} \mapsto c\,e^{x^2}\) for any c ∈ R.
We now need to find a particular solution; we use superposition of solutions in order to do so. Easily enough, we observe that the function y_1 : x ↦ −2 is a solution to y' − 2xy = 4x and that \(y_2 : x \mapsto \frac{1}{x}\) is a solution to \(y' - 2xy = -\frac{1}{x^2} - 2\) on each interval of R⋆. Summing up, the solutions to the ODE are the functions
\[
x \mapsto c\,e^{x^2} - 2 + \frac{1}{x},
\]
for any c ∈ R, on each of the intervals (−∞, 0) and (0, +∞).
Solution to Exercise 3.28 page 110
The solutions to the homogeneous equation \(y' = -2y\) are the functions \(x \in \mathbb{R} \mapsto \lambda\,e^{-2x}\), for any \(\lambda \in \mathbb{R}\). We recognize that the ODE has a particular form (\(y' = \alpha y + P(x)e^{\beta x}\)) for which we know a faster way than to use variation of constants. We know that there exists a particular solution of the form \(z : x \in \mathbb{R} \mapsto (ax^2 + bx + c)\,e^{-x}\), where \(a, b, c \in \mathbb{R}\). We have
\[
\begin{aligned}
z' + 2z - x^2 e^{-x} = 0
&\iff \bigl(2ax + b - (ax^2 + bx + c)\bigr)e^{-x} + 2(ax^2 + bx + c)\,e^{-x} - x^2 e^{-x} = 0\\
&\iff (a-1)x^2 + (2a+b)x + b + c = 0\\
&\iff a = 1,\ b = -2 \text{ and } c = 2.
\end{aligned}
\]
The solutions to \(y' = -2y + x^2 e^{-x}\) are thus the functions
\[
x \in \mathbb{R} \mapsto (x^2 - 2x + 2)\,e^{-x} + \lambda\,e^{-2x}, \quad \text{for any } \lambda \in \mathbb{R}.
\]
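One can double-check the particular solution by computing \(z'\) explicitly and verifying \(z' + 2z = x^2 e^{-x}\) at a few points; this sketch (ours, not part of the original solution) does exactly that.

```python
import math

# Check the particular solution z(x) = (x^2 - 2x + 2) e^{-x} of y' + 2y = x^2 e^{-x}.
# z'(x) = (2x - 2) e^{-x} - (x^2 - 2x + 2) e^{-x}  (product rule).
for x in [-2.0, 0.0, 1.0, 3.0]:
    z = (x**2 - 2*x + 2) * math.exp(-x)
    dz = (2*x - 2) * math.exp(-x) - (x**2 - 2*x + 2) * math.exp(-x)
    assert abs(dz + 2*z - x**2 * math.exp(-x)) < 1e-12
```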
Solution to Exercise 3.29 page 110
The solutions to the associated homogeneous equation \(y' = -y\) are the functions \(x \in \mathbb{R} \mapsto c\,e^{-x}\), for any \(c \in \mathbb{R}\). The ODE has a particular form, for which we know that there exists a particular solution of the form \(z : x \in \mathbb{R} \mapsto a\cos(2x) + b\sin(2x)\), where \(a, b \in \mathbb{R}\). We have
\[
\begin{aligned}
z' + z = \sin(2x)
&\iff -2a\sin(2x) + 2b\cos(2x) + a\cos(2x) + b\sin(2x) = \sin(2x)\\
&\iff -2a + b = 1 \text{ and } 2b + a = 0\\
&\iff a = -\frac{2}{5} \text{ and } b = \frac{1}{5}.
\end{aligned}
\]
The solutions to the ODE are thus
\[
x \in \mathbb{R} \mapsto -\frac{2}{5}\cos(2x) + \frac{1}{5}\sin(2x) + c\,e^{-x}, \quad \text{for any } c \in \mathbb{R}.
\]
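Since the coefficients \(a = -\frac{2}{5}\), \(b = \frac{1}{5}\) come from a small linear system, they are worth a quick check: the sketch below (ours) evaluates \(z' + z\) explicitly and compares it with \(\sin(2x)\).

```python
import math

# Check z(x) = -(2/5) cos(2x) + (1/5) sin(2x) against z' + z = sin(2x).
a, b = -2/5, 1/5
for x in [0.0, 0.7, 2.0]:
    z = a * math.cos(2*x) + b * math.sin(2*x)
    dz = -2*a * math.sin(2*x) + 2*b * math.cos(2*x)  # exact derivative
    assert abs(dz + z - math.sin(2*x)) < 1e-12
```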
Solution to Exercise 3.34 page 111
The solutions are \(y : x \in \mathbb{R} \mapsto c\,2^{-x}\), for any \(c \in \mathbb{R}\). The solution satisfying \(y(1) = \frac{1}{2}\) is the one where \(c = 1\).

[Figure: graphs of the solutions in the \((x, y)\)-plane; the solution with \(c = 1\) passes through the point \((1, \frac{1}{2})\).]
Solution to Exercise 3.43 page 115
We check that
\[
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
=
\begin{bmatrix} 2 & 3 \\ 2 & 6 \end{bmatrix}
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
=
\begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix}.
\]
As
\[
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
=
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}^{-1},
\]
we can use Corollary 3.42 and Proposition 3.40 in order to deduce
\[
e^{tA} =
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} e^{2t} & 0 \\ 0 & e^{3t} \end{bmatrix}
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
=
\begin{bmatrix} e^{2t} & e^{3t} \\ e^{2t} & 2e^{3t} \end{bmatrix}
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
=
\begin{bmatrix} 2e^{2t} - e^{3t} & -e^{2t} + e^{3t} \\ 2e^{2t} - 2e^{3t} & -e^{2t} + 2e^{3t} \end{bmatrix}.
\]
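The closed form of \(e^{tA}\) can be confronted with the defining power series \(\sum_n \frac{(tA)^n}{n!}\) truncated at a high order. The sketch below (ours; helper names are not from the course) does this for \(A = \begin{bsmallmatrix} 1 & 1 \\ -2 & 4 \end{bsmallmatrix}\) at \(t = 0.5\).

```python
import math

def matmul(M, N):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm_series(M, terms=40):
    """Truncated power series for the matrix exponential of a 2x2 matrix."""
    result = [[1.0, 0.0], [0.0, 1.0]]  # identity, the n = 0 term
    power = [[1.0, 0.0], [0.0, 1.0]]
    fact = 1.0
    for n in range(1, terms):
        power = matmul(power, M)
        fact *= n
        result = [[result[i][j] + power[i][j] / fact for j in range(2)]
                  for i in range(2)]
    return result

t = 0.5
A = [[1.0, 1.0], [-2.0, 4.0]]
tA = [[t * a for a in row] for row in A]
series = expm_series(tA)
e2, e3 = math.exp(2 * t), math.exp(3 * t)
closed = [[2*e2 - e3, -e2 + e3], [2*e2 - 2*e3, -e2 + 2*e3]]
for i in range(2):
    for j in range(2):
        assert abs(series[i][j] - closed[i][j]) < 1e-9
```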
Solution to Exercise 3.45 page 115
We have
\[
A + B = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix},
\]
and, by a direct computation, \((A+B)^n = A+B\) for each \(n \geq 1\). Using the definition, we get
\[
e^{A+B} = \begin{bmatrix} e & e-1 \\ 0 & 1 \end{bmatrix}.
\]
On the other hand,
\[
e^{A} = \begin{bmatrix} e^1 & 0 \\ 0 & e^0 \end{bmatrix}
= \begin{bmatrix} e & 0 \\ 0 & 1 \end{bmatrix}
\quad\text{and}\quad
e^{B} = I + B + 0_2 + 0_2 + \cdots = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix},
\]
so that
\[
e^{A} e^{B} = \begin{bmatrix} e & e \\ 0 & 1 \end{bmatrix} \neq e^{A+B}.
\]
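The same computation can be redone numerically from the series definition; this sketch (ours) assumes, as the displayed values suggest, that \(A = \begin{bsmallmatrix} 1 & 0 \\ 0 & 0 \end{bsmallmatrix}\) and \(B = \begin{bsmallmatrix} 0 & 1 \\ 0 & 0 \end{bsmallmatrix}\), which indeed do not commute.

```python
import math

def matmul(M, N):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm_series(M, terms=40):
    """Truncated power series for the matrix exponential of a 2x2 matrix."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]
    fact = 1.0
    for n in range(1, terms):
        power = matmul(power, M)
        fact *= n
        result = [[result[i][j] + power[i][j] / fact for j in range(2)]
                  for i in range(2)]
    return result

A = [[1.0, 0.0], [0.0, 0.0]]
B = [[0.0, 1.0], [0.0, 0.0]]
AB = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]
eA_eB = matmul(expm_series(A), expm_series(B))
eAB = expm_series(AB)
# e^A e^B = [[e, e], [0, 1]] while e^{A+B} = [[e, e-1], [0, 1]]:
assert abs(eA_eB[0][1] - math.e) < 1e-9
assert abs(eAB[0][1] - (math.e - 1)) < 1e-9
```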
Solution to Exercise 3.51 page 118
The matrix associated with the system is the matrix \(A\) from Exercise 3.43. The solutions are thus given by
\[
\left\{\, t \mapsto e^{tA} \begin{bmatrix} \alpha \\ \beta \end{bmatrix}
= \begin{bmatrix} 2e^{2t} - e^{3t} & -e^{2t} + e^{3t} \\ 2e^{2t} - 2e^{3t} & -e^{2t} + 2e^{3t} \end{bmatrix}
\begin{bmatrix} \alpha \\ \beta \end{bmatrix},\ \alpha, \beta \in \mathbb{R} \,\right\},
\]
that is,
\[
\begin{cases}
y_1 : t \in \mathbb{R} \mapsto (2e^{2t} - e^{3t})\,\alpha + (-e^{2t} + e^{3t})\,\beta\\
y_2 : t \in \mathbb{R} \mapsto (2e^{2t} - 2e^{3t})\,\alpha + (-e^{2t} + 2e^{3t})\,\beta
\end{cases}, \quad \alpha, \beta \in \mathbb{R}.
\]
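A finite-difference check confirms that these functions solve the system \(y_1' = y_1 + y_2\), \(y_2' = -2y_1 + 4y_2\) corresponding to \(A = \begin{bsmallmatrix} 1 & 1 \\ -2 & 4 \end{bsmallmatrix}\) from Exercise 3.43. This is a sketch of ours; \(\alpha = 1\), \(\beta = -2\) are arbitrary choices.

```python
import math

# Finite-difference check that (y1, y2) solves X' = AX with A = [[1, 1], [-2, 4]].
alpha, beta = 1.0, -2.0
y1 = lambda t: (2*math.exp(2*t) - math.exp(3*t)) * alpha \
             + (-math.exp(2*t) + math.exp(3*t)) * beta
y2 = lambda t: (2*math.exp(2*t) - 2*math.exp(3*t)) * alpha \
             + (-math.exp(2*t) + 2*math.exp(3*t)) * beta
h = 1e-6
for t in [0.0, 0.3, 1.0]:
    d1 = (y1(t + h) - y1(t - h)) / (2 * h)  # central differences
    d2 = (y2(t + h) - y2(t - h)) / (2 * h)
    assert abs(d1 - (y1(t) + y2(t))) < 1e-3
    assert abs(d2 - (-2*y1(t) + 4*y2(t))) < 1e-3
```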
Solution to Exercise 3.56 page 123
(i) The fastest way to invert a \(2 \times 2\) matrix is to use the formula \(P^{-1} = \frac{\operatorname{adj}(P)}{\det(P)}\). In our case, \(\det(P) = 1\), so that
\[
P^{-1} = \begin{bmatrix} -1 & -1 \\ 3 & 2 \end{bmatrix}.
\]
(ii) We have
\[
P T P^{-1} =
\begin{bmatrix} 2 & 1 \\ -3 & -1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 1 & 3 \end{bmatrix}
\begin{bmatrix} -1 & -1 \\ 3 & 2 \end{bmatrix}
=
\begin{bmatrix} 3 & 3 \\ -4 & -3 \end{bmatrix}
\begin{bmatrix} -1 & -1 \\ 3 & 2 \end{bmatrix}
=
\begin{bmatrix} 6 & 3 \\ -5 & -2 \end{bmatrix}
= A.
\]
(iii) Setting \(Y := \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = P^{-1}X\), we want to solve
\[
X' = AX + \begin{bmatrix} 0 \\ t^2 \end{bmatrix}
\iff
\begin{bmatrix} y_1' \\ y_2' \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ 1 & 3 \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}
+
\begin{bmatrix} -1 & -1 \\ 3 & 2 \end{bmatrix}
\begin{bmatrix} 0 \\ t^2 \end{bmatrix},
\]
that is,
\[
\begin{cases}
y_1' = y_1 - t^2\\
y_2' = y_1 + 3y_2 + 2t^2
\end{cases}.
\]
We thus obtain first \(y_1 : t \in \mathbb{R} \mapsto \lambda\,e^{t} + t^2 + 2t + 2\), for any \(\lambda \in \mathbb{R}\). Next, \(y_2\) satisfies
\[
y_2' = 3y_2 + \lambda\,e^{t} + 3t^2 + 2t + 2,
\]
so that \(y_2 : t \in \mathbb{R} \mapsto \mu\,e^{3t} - \frac{\lambda}{2}\,e^{t} - t^2 - \frac{4}{3}t - \frac{10}{9}\), for any \(\mu \in \mathbb{R}\).

Finally, \(X = PY = \begin{bmatrix} 2y_1 + y_2 \\ -3y_1 - y_2 \end{bmatrix}\), that is,
\[
X : t \in \mathbb{R} \mapsto
\begin{bmatrix}
\frac{3}{2}\lambda\,e^{t} + \mu\,e^{3t} + t^2 + \frac{8}{3}t + \frac{26}{9}\\[4pt]
-\frac{5}{2}\lambda\,e^{t} - \mu\,e^{3t} - 2t^2 - \frac{14}{3}t - \frac{44}{9}
\end{bmatrix},
\quad \text{for any } \lambda, \mu \in \mathbb{R}.
\]
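With this many fractions, a direct check of the final answer against the original system \(X' = AX + \begin{bsmallmatrix} 0 \\ t^2 \end{bsmallmatrix}\), \(A = \begin{bsmallmatrix} 6 & 3 \\ -5 & -2 \end{bsmallmatrix}\), is reassuring. This is a sketch of ours; \(\lambda = \mu = 1\) is an arbitrary choice.

```python
import math

# Finite-difference check of the final solution against X' = AX + (0, t^2).
lam, mu = 1.0, 1.0
X1 = lambda t: 1.5*lam*math.exp(t) + mu*math.exp(3*t) + t**2 + (8/3)*t + 26/9
X2 = lambda t: -2.5*lam*math.exp(t) - mu*math.exp(3*t) - 2*t**2 - (14/3)*t - 44/9
h = 1e-6
for t in [0.0, 0.4, 1.0]:
    d1 = (X1(t + h) - X1(t - h)) / (2 * h)  # central differences
    d2 = (X2(t + h) - X2(t - h)) / (2 * h)
    assert abs(d1 - (6*X1(t) + 3*X2(t))) < 1e-3
    assert abs(d2 - (-5*X1(t) - 2*X2(t) + t**2)) < 1e-3
```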
Solution to Exercise 3.60 page 126
The characteristic polynomial is
\[
X^3 - X^2 - X + 1 = (X+1)(X-1)^2.
\]
(The factorization is obtained by noticing that \(1\) and \(-1\) are obvious roots.) The roots are \(-1\), of multiplicity 1, and \(1\), of multiplicity 2. The solutions are thus
\[
x \in \mathbb{R} \mapsto a\,e^{-x} + (b + cx)\,e^{x}, \quad \text{for any } a, b, c \in \mathbb{R}.
\]
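As a quick sanity check (an aside, not part of the original solution), one can expand \((X+1)(X-1)^2\) by multiplying coefficient lists; the expansion gives \(X^3 - X^2 - X + 1\), consistent with roots \(-1\) (simple) and \(1\) (double).

```python
# Expand (X + 1)(X - 1)^2 by multiplying coefficient lists (lowest degree first).

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

x_plus_1 = [1, 1]    # 1 + X
x_minus_1 = [-1, 1]  # -1 + X
product = poly_mul(x_plus_1, poly_mul(x_minus_1, x_minus_1))
assert product == [1, -1, -1, 1]  # 1 - X - X^2 + X^3
```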
Usual notation
N: {1, 2, 3, …}, set of positive integers (p. 7)
Q: set of rational numbers (p. 13)
R: set of real numbers (p. 6)
R⋆: {x ∈ R : x ≠ 0}, set of nonzero real numbers (p. 22)
R+: {x ∈ R : x ≥ 0}, set of nonnegative real numbers (p. 17)
R⋆+: {x ∈ R : x > 0}, set of positive real numbers (p. 22)
R−: {x ∈ R : x ≤ 0}, set of nonpositive real numbers (p. 17)
R⋆−: {x ∈ R : x < 0}, set of negative real numbers (p. 22)
R̄: R ∪ {−∞, +∞} (p. 39)
≡, ≢: functional equality, functional inequality (p. 18)
⌊·⌋, ⌈·⌉: floor function, ceiling function (p. 10)
Sign: sign function, equal to −1 on R⋆−, 0 at 0, and +1 on R⋆+ (p. 91)
ⁿ√x: n-th root of x, that is, x^(1/n) (p. 38)
1_A: indicator function of the set A, equal to 1 on A and 0 outside A (p. 13)
f|_J: restriction of the function f to the set J (p. 14)
C(I): set of continuous functions on I (p. 24)
C^∞: set of smooth functions, that is, admitting derivatives of any order (p. 61)
C^k: set of functions of class C^k, that is, k times differentiable with a continuous k-th derivative (p. 24)
x >→ a, x <→ a: x tends to a from above, from below (p. 39)
I_n: identity matrix of size n (p. 113)
0_n: zero matrix of size n × n (p. 114)
M_n(C): set of n × n complex matrices (p. 113)
M_n(R): set of n × n real matrices (p. 112)
·: scalar product (p. 61)
det: determinant (p. 61)
Card: cardinality of a set (p. 66)
L(A, u⃗): line passing through the point A with direction vector u⃗ (p. 67)
R(A, u⃗): ray starting at the point A with direction vector u⃗ (p. 76)