Notes


Transcript of Notes

Page 1: Notes

Calculus II

Chapter 15 Lecture Notes

Professor Blanchard

1 Functions of Several Variables

Warm-up. If $x = 2t$, $y = t + 1$, and $z = 2t^2$, try to write $z$ in terms of $x$ and $y$ by eliminating the parameter $t$.

We have had vector functions, which take one variable and give a vector. Now we want functions that take many variables (a vector) and return a scalar.

Definition. A function of several variables is a rule that assigns to each ordered pair of real numbers $(x, y)$ in a set $D$ a unique real number denoted $f(x, y)$. The set $D$ is the domain of $f$, and its range is the set of output values $\{f(x, y) : (x, y) \in D\} \subseteq \mathbb{R}$.

The input variables $(x, y)$ are also called the independent variables, while the output variable $z = f(x, y)$ is also called the dependent variable.

Example (67). The function $z = f(x, y) = x + y$ defines a plane, while $z = f(x, y) = \sqrt{x^2 + y^2}$ defines the upper half of a cone. Other examples of functions of several variables: $g(x, y) = x^2 + y^2 + xy$, $h(x, y) = \cos x + \sin y + xy^2 - 4$.

Consider $z = f(x, y) = x(y - 1)$.

Some more examples of two-variable functions:

• Body Mass Index: $B(h, w) = w/h^2$

• Area of a rectangle: $A(l, w) = lw$

• Volume of a right cylinder: $V(r, h) = \pi r^2 h$


Page 2: Notes

Figure 1: A function of two variables

• Wind chill: $W(T, s) = 35.74 + 0.6215T + (0.4275T - 35.75)s^{0.16}$, where $s \ge 3$ is the wind speed in miles per hour and $T < 50$ is the temperature in degrees Fahrenheit.

Definition. If $f$ is a function of two variables with domain $D$, the graph of $f$ is the set of all points $(x, y, z) = (x, y, f(x, y))$ in $\mathbb{R}^3$ with $(x, y) \in D$.

The graph of a function of two variables is called a surface.

Definition. The level curves of a two-variable function $f(x, y)$ are the curves $f(x, y) = k$, where $k$ is a constant in the range of $f$. Level curves are often plotted in the $(x, y)$-plane or on the surface $(x, y, f(x, y))$.

Example (68). Let $f(x, y) = 100 - x^2 - y^2$. Determine several level curves of $f$ and plot them in the plane. Use the level curves to sketch the graph of $f$ over $D$. What are $D$ and $f(D)$?

$D = \mathbb{R}^2$, $f(D) = \{z \in \mathbb{R} : z \le 100\}$. We can compute the following:

$$\begin{aligned}
f(x, y) = 100 &\iff (x, y) = (0, 0) \\
f(x, y) = 84 &\iff x^2 + y^2 = 16 \\
f(x, y) = 75 &\iff x^2 + y^2 = 25 \\
f(x, y) = 64 &\iff x^2 + y^2 = 36 \\
f(x, y) = 36 &\iff x^2 + y^2 = 64 \\
f(x, y) = 0 &\iff x^2 + y^2 = 100 \\
f(x, y) = -44 &\iff x^2 + y^2 = 144
\end{aligned}$$

See Figures 2 and 3 on pages 3 and 4 for a visualization of the level curves of f .
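The level curves in Example (68) are circles, so each one is described by a single radius. The following is a minimal sketch of that computation; the helper name `level_radius` is ours, not part of the notes.

```python
import math

def f(x, y):
    """The function from Example (68)."""
    return 100 - x**2 - y**2

def level_radius(k):
    """Radius of the level curve f(x, y) = k, i.e. the circle x^2 + y^2 = 100 - k.

    Returns None when k > 100, because k is then outside the range of f.
    """
    if k > 100:
        return None
    return math.sqrt(100 - k)

for k in (100, 84, 75, 64, 36, 0, -44):
    r = level_radius(k)
    # Spot-check: the point (r, 0) lies on the level curve f = k.
    assert math.isclose(f(r, 0), k, abs_tol=1e-9)
    print(f"k = {k:4d}  radius = {r:g}")
```

The radius grows as $k$ decreases, which matches the picture of a surface that peaks at the origin and falls away in every direction.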


Page 3: Notes

Figure 2: Level curves of $f(x, y) = 100 - x^2 - y^2$

Functions of three or more variables

Suppose $D \subset \mathbb{R}^n$ is a set of $n$-tuples of real numbers $(x_1, x_2, \dots, x_n)$. A real-valued function $f : D \to \mathbb{R}$ is a rule that assigns to each $n$-tuple $(x_1, x_2, \dots, x_n) \in D$ a unique real number $w = f(x_1, x_2, \dots, x_n)$. The domain of $f$ is $D$, and the range of $f$ is $\{w \in \mathbb{R} : w = f(x_1, x_2, \dots, x_n) \text{ for some } (x_1, \dots, x_n) \in D\}$. The input (independent) variables are $x_1, x_2, \dots, x_n$, and the output (dependent) variable is $w$.

For functions of three or more variables, graphs are harder to visualize: the graph is the set of points $(x_1, x_2, \dots, x_n, w)$ with $w = f(x_1, \dots, x_n)$. In the case of three variables, $w = f(x, y, z)$, one can look at the level surfaces $f(x, y, z) = k$ for $k$ a constant in the range of $f$.


Page 4: Notes

Figure 3: Level curves of $f(x, y) = 100 - x^2 - y^2$

Limits

Informally, $L$ is the limit of $f(x, y)$ as $(x, y)$ approaches $(a, b)$ if we can make the value $f(x, y)$ as close to $L$ as we want by making $(x, y)$ close enough to $(a, b)$.

Figure 4: Informal illustration of a limit


Page 5: Notes

Formally, we can define the limit in two variables as follows:

Definition. Let $f(x, y)$ be a function with domain $D$, where $D$ contains a neighborhood of $(a, b)$. Then the limit of $f(x, y)$ as $(x, y)$ approaches $(a, b)$ is $L$, written

$$\lim_{(x,y)\to(a,b)} f(x, y) = L,$$

if for every number $\varepsilon > 0$ there is a corresponding $\delta > 0$ such that if $(x, y) \in D$ and $0 < \sqrt{(x - a)^2 + (y - b)^2} < \delta$, then

$$|f(x, y) - L| < \varepsilon.$$

All of the familiar limit laws carry over. If $\lim_{(x,y)\to(a,b)} f(x, y) = L_1$ and $\lim_{(x,y)\to(a,b)} g(x, y) = L_2$, then:

$$\lim_{(x,y)\to(a,b)} [f(x, y) + g(x, y)] = L_1 + L_2$$

$$\lim_{(x,y)\to(a,b)} \frac{f(x, y)}{g(x, y)} = \frac{L_1}{L_2} \quad \text{if } L_2 \ne 0, \text{ etc.}$$

Definition. A function $f(x, y)$ is continuous at the point $(a, b)$ if $\lim_{(x,y)\to(a,b)} f(x, y) = f(a, b)$. Further, $f$ is continuous on $D$ if $f$ is continuous at every point in $D$.

Example (69). BMI, wind chill, and $f(x, y) = 100 - x^2 - y^2$ are continuous functions on their domains. $f(x, y) = x/y$ is not continuous at points where $y = 0$, since it is undefined there. Polynomials like $f(x, y) = x^2 + xy + y^2 + 2$ are continuous, as are rational functions like

$$f(x, y) = \frac{x^2 + y}{y^3 + x^2 y + 26y - 2}$$

wherever the denominator is nonzero.

Our analog of two-sided limits for functions of two variables is the Two Path Test:

Proposition (Two Path Test). If a function $f(x, y)$ has different limits along two different paths as $(x, y)$ approaches $(a, b)$, then $\lim_{(x,y)\to(a,b)} f(x, y)$ does not exist.

Example (70). Let $f(x, y) = x/y$ and let $(x, y) \to (0, 0)$. We could approach along any line through $(0, 0)$ of the form $y = mx$ with $m \ne 0$. Then $f(x, mx) = x/(mx) = 1/m$, so $f = 1/m$ at every point of the line $y = mx$ other than the origin. Thus lines with different slopes produce different limits at the origin, and the limit does not exist.
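To see the Two Path Test numerically, one can evaluate $f(x, y) = x/y$ at points approaching the origin along several lines $y = mx$. This is only an illustrative sketch; the helper `value_along_line` is a name introduced here.

```python
def f(x, y):
    return x / y

def value_along_line(m, x):
    """Value of f at the point (x, m*x) on the line y = m*x (m != 0, x != 0)."""
    return f(x, m * x)

for m in (1, 2, -4):
    # Points (x, m*x) with x = 1e-1, 1e-3, 1e-6 move toward the origin.
    values = [value_along_line(m, 10.0**-n) for n in (1, 3, 6)]
    print(m, values)
```

Along each line the values stay at $1/m$ no matter how close the point is to the origin, so different slopes give different limiting values.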

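In the case of many variables, we have the following: if $\vec{x} = (x_1, x_2, \dots, x_n)$ and $\vec{a} = (a_1, a_2, \dots, a_n)$, then $\lim_{\vec{x}\to\vec{a}} f(\vec{x}) = L$ if for any $\varepsilon > 0$ there is a $\delta > 0$ so that if $0 < |\vec{x} - \vec{a}| < \delta$ then $|f(\vec{x}) - L| < \varepsilon$.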

2 Partial Derivatives

Warm-up. If $f(x, y) = x^2 + xy + y^3$, let $g(x) = f(x, 2)$. Find $g'(x)$.


Page 6: Notes

We care about rates of change. We have these new multivariate functions, and we know limits. How should we define a derivative?

If we fix $y = y_0$, we can define $g(x) = f(x, y_0)$. Then

$$g'(x) = \lim_{\Delta x\to 0} \frac{g(x + \Delta x) - g(x)}{\Delta x} = \lim_{\Delta x\to 0} \frac{f(x + \Delta x, y_0) - f(x, y_0)}{\Delta x}$$

So with respect to a change in the variable $x$, we can find a derivative by holding the other variable $y$ constant.

If we fix $x = x_0$ and define $h(y) = f(x_0, y)$, we have

$$h'(y) = \lim_{\Delta y\to 0} \frac{h(y + \Delta y) - h(y)}{\Delta y} = \lim_{\Delta y\to 0} \frac{f(x_0, y + \Delta y) - f(x_0, y)}{\Delta y}$$

What happens if we vary x0? We get partial derivatives:

Definition. If $z = f(x, y)$, then the (first) partial derivatives of $f$ with respect to $x$ and with respect to $y$ are the functions $f_x(x, y)$ and $f_y(x, y)$, respectively. They are defined as follows:

$$f_x(x, y) = \lim_{\Delta x\to 0} \frac{f(x + \Delta x, y) - f(x, y)}{\Delta x}$$

$$f_y(x, y) = \lim_{\Delta y\to 0} \frac{f(x, y + \Delta y) - f(x, y)}{\Delta y}$$

provided these limits exist.

To find $f_x$, consider $y$ a constant and differentiate with respect to $x$; to find $f_y$, consider $x$ a constant and differentiate with respect to $y$.

Notation: The following all denote partial derivatives with respect to $x$ and $y$, respectively: $f_x, \frac{\partial f}{\partial x}, z_x, \frac{\partial z}{\partial x}$ and $f_y, \frac{\partial f}{\partial y}, z_y, \frac{\partial z}{\partial y}$.

Example (71). If $f(x, y) = x^3 y + xy^2$, find (a) $f_x$ and (b) $f_y$.

(a) Consider $y$ a constant and differentiate with respect to $x$:

$$f_x(x, y) = 3x^2 y + y^2$$

(b) Consider $x$ a constant and differentiate with respect to $y$:

$$f_y(x, y) = x^3 + 2xy$$
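A quick way to sanity-check partial derivatives such as those in Example (71) is to compare them with difference quotients in which only one variable is moved. The sketch below uses central differences; the helper names, the step size `h`, and the test point $(2, 3)$ are our own choices.

```python
def f(x, y):
    return x**3 * y + x * y**2

def partial_x(g, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative in x (y held fixed)."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def partial_y(g, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative in y (x held fixed)."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 2.0, 3.0
print(partial_x(f, x0, y0), 3 * x0**2 * y0 + y0**2)  # both close to 45
print(partial_y(f, x0, y0), x0**3 + 2 * x0 * y0)      # both close to 20
```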

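The values of the partial derivatives at a point $(a, b)$ are

$$\left.\frac{\partial f}{\partial x}\right|_{(a,b)} = f_x(a, b), \qquad \left.\frac{\partial f}{\partial y}\right|_{(a,b)} = f_y(a, b)$$

Example (72). If $z = x^2 \sin(3x + y^3)$, find $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ at the point $(\frac{\pi}{3}, 0)$.

$$\frac{\partial z}{\partial x} = 3x^2 \cos(3x + y^3) + 2x \sin(3x + y^3)$$

$$\frac{\partial z}{\partial y} = [x^2 \cos(3x + y^3)]\,3y^2 = 3x^2 y^2 \cos(3x + y^3)$$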


Page 7: Notes

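So,

$$\left.\frac{\partial z}{\partial x}\right|_{(\frac{\pi}{3},0)} = 3\left(\frac{\pi^2}{9}\right)\cos(\pi + 0) + 2\left(\frac{\pi}{3}\right)\sin\pi = -\frac{\pi^2}{3}$$

$$\left.\frac{\partial z}{\partial y}\right|_{(\frac{\pi}{3},0)} = 3\left(\frac{\pi^2}{9}\right)(0)\cos(\pi + 0) = 0$$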

Definition. If $w = f(x_1, x_2, \dots, x_n)$ is a function of $n$ variables, then

$$\frac{\partial f}{\partial x_i} = f_{x_i}(x_1, x_2, \dots, x_n) = f_i(x_1, x_2, \dots, x_n) = \lim_{\Delta x_i\to 0} \frac{f(x_1, \dots, x_i + \Delta x_i, \dots, x_n) - f(x_1, \dots, x_n)}{\Delta x_i}$$

is the partial derivative of $f$ with respect to $x_i$.

Example (73). $f(x, y, z) = x^2 + 2xy^2 + yz^3$. Find $f_x, f_y, f_z$.
$f_x(x, y, z) = 2x + 2y^2$, $f_y(x, y, z) = 4xy + z^3$, $f_z(x, y, z) = 3yz^2$.

Implicit Partial Differentiation

If z = f(x, y) is defined implicitly, consider z as a function of both x and y.

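Example (74). Find $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ if $x^2 + 5xy^2 + z^3 = 1$.

$$0 = D_x[1] = D_x[x^2 + 5xy^2 + z^3] = 2x + 5y^2 + 3z^2 \frac{\partial z}{\partial x}$$

So,

$$\frac{\partial z}{\partial x} = \frac{-2x - 5y^2}{3z^2} = \frac{-2x - 5y^2}{3(1 - x^2 - 5xy^2)^{2/3}}$$

$$0 = D_y[1] = D_y[x^2 + 5xy^2 + z^3] = 0 + 10xy + 3z^2 \frac{\partial z}{\partial y}$$

So,

$$\frac{\partial z}{\partial y} = \frac{-10xy}{3z^2} = \frac{-10xy}{3(1 - x^2 - 5xy^2)^{2/3}}$$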

Geometric Interpretation

If $z = f(x, y)$, consider the surface $S = \{(x, y, f(x, y))\}$. The plane $y = b$ intersects $S$ in a curve $C_x$, along which only $x$ varies. The plane $x = a$ intersects $S$ in a curve $C_y$, along which only $y$ varies. So $\frac{\partial f}{\partial x}(a, b)$ is the slope of the line tangent to $C_x$ at $(a, b)$, and $\frac{\partial f}{\partial y}(a, b)$ is the slope of the line tangent to $C_y$ at $(a, b)$.

In other words, the line parallel to the $xz$-plane and tangent to the surface $z = f(x, y)$ at the point $P(a, b, c)$ has slope $f_x(a, b)$. Likewise, the tangent line to the surface $z = f(x, y)$ at $P(a, b, c)$ that is parallel to the $yz$-plane has slope $f_y(a, b)$.

At the point $(a, b)$, the function $f(x, y)$ changes at a rate given by $f_x(a, b)$ in the direction $\vec{i}$ and at a rate given by $f_y(a, b)$ in the direction $\vec{j}$.

Example (75). Find the slope of the line parallel to the $xz$-plane and tangent to the surface $z = x\sqrt{x + y}$ at the point $P(1, 3, 2)$.
$f(x, y) = x\sqrt{x + y} = x(x + y)^{1/2}$, so

$$f_x(x, y) = x\left[\tfrac{1}{2}(x + y)^{-1/2}\right] + (1)(x + y)^{1/2} = \frac{x}{2\sqrt{x + y}} + \sqrt{x + y} = \frac{3x + 2y}{2\sqrt{x + y}}$$


Page 8: Notes

So,

$$f_x(1, 3) = \frac{3 + 6}{2\sqrt{4}} = \frac{9}{2 \cdot 2} = \frac{9}{4}$$

Example (76). In an electrical circuit with electromotive force (EMF) of $E$ volts and resistance $R$ ohms, the current is $I = E/R$ amps. Find the partial derivatives $\frac{\partial I}{\partial E}$ and $\frac{\partial I}{\partial R}$ at the instant when $E = 120$ and $R = 15$, and interpret these as rates.
$I = ER^{-1}$, so $\frac{\partial I}{\partial E} = R^{-1} = \frac{1}{R}$ and $\frac{\partial I}{\partial R} = -ER^{-2} = \frac{-E}{R^2}$.

When $E = 120$, $R = 15$, we have $\left.\frac{\partial I}{\partial E}\right|_{(120,15)} = \frac{1}{15} \approx 0.0667$ and $\left.\frac{\partial I}{\partial R}\right|_{(120,15)} = \frac{-120}{(15)^2} \approx -0.5333$.

If we hold the EMF constant at 120 volts, the current is decreasing at a rate of about 0.5333 amps per ohm when the resistance is 15 ohms, since $\frac{\partial I}{\partial R} \approx -0.5333$.
If we instead fix the resistance at 15 ohms, the current is increasing when we have an EMF of 120 volts, and it is increasing at a rate of $\frac{1}{15}$ amp per volt.
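The two rates in Example (76) can be reproduced directly from the formulas $\partial I/\partial E = 1/R$ and $\partial I/\partial R = -E/R^2$. The function names below are ours, chosen for illustration.

```python
def current(E, R):
    """Current I = E / R in amps."""
    return E / R

def dI_dE(E, R):
    return 1 / R

def dI_dR(E, R):
    return -E / R**2

E, R = 120.0, 15.0
print(round(dI_dE(E, R), 4))   # 0.0667
print(round(dI_dR(E, R), 4))   # -0.5333
# One extra ohm of resistance changes the current by roughly dI/dR amps.
print(current(E, R + 1) - current(E, R))
```

The last line compares the rate with an actual one-ohm change; the two agree only approximately because $I$ is not linear in $R$.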

3 Higher Partial Derivatives

Warm-up. Let $f(x, y) = x^2 y^3 + 2y - 1$ and $g(x, y) = 2xy^3$. Find $f_x(x, y)$, as well as $g_x(x, y)$ and $g_y(x, y)$.

A partial derivative of a multivariate function is another multivariate function. So, we can find its partial derivatives.

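Definition. Let $z = f(x, y)$. The second partial derivatives of $z$ are as follows:

$$\frac{\partial^2 z}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial x}\right) = \frac{\partial}{\partial x}(f_x) = (f_x)_x = f_{xx}$$

$$\frac{\partial^2 z}{\partial y^2} = \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial y}\right) = \frac{\partial}{\partial y}(f_y) = (f_y)_y = f_{yy}$$

$$\frac{\partial^2 z}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial y}\right) = \frac{\partial}{\partial x}(f_y) = (f_y)_x = f_{yx}$$

$$\frac{\partial^2 z}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\right) = \frac{\partial}{\partial y}(f_x) = (f_x)_y = f_{xy}$$

The last two derivatives, $f_{yx}$ and $f_{xy}$, are called the mixed partial derivatives.

Example (77). Let $z = f(x, y) = 5x^2 - 2xy + 3y^3$. Find (a) $f_{yx}(x, y)$; (b) $f_{xy}(x, y)$; (c) $f_{xx}(x, y)$; (d) $f_{yy}(x, y)$.

The first partial derivatives are $f_x(x, y) = 10x - 2y$ and $f_y(x, y) = -2x + 9y^2$. So,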


Page 9: Notes

(a) $f_{yx} = (f_y)_x = -2$

(b) $f_{xy} = (f_x)_y = -2$

(c) $f_{xx} = (f_x)_x = 10$

(d) $f_{yy} = (f_y)_y = 18y$, so $f_{yy}(3, 2) = 36$.

Theorem (Clairaut's Theorem). Suppose $f$ is defined on a disk $D$ that contains the point $(a, b)$. If the functions $f_{xy}$ and $f_{yx}$ are both continuous on $D$, then $f_{xy}(a, b) = f_{yx}(a, b)$.

Example (78). Find $f_{xxx}, f_{xyx}, f_{xxy}$ if $f(x, y) = x^2 y e^y$. We have that $(f_x)_{xy} = (f_x)_{yx}$, so $f_{xxy} = f_{xyx}$. Therefore, there are only two things to find. Start with $f_x(x, y) = 2xye^y$, so $f_{xx}(x, y) = 2ye^y$. Then $f_{xxx} = 0$ and $f_{xxy} = f_{xyx} = 2ye^y + 2e^y = (2y + 2)e^y$.

Here are some Important Partial Differential Equations:

• Laplace Equation: $u_{xx} + u_{yy} = 0$ (harmonic functions: fluid flow, electrical potential)

• Wave Equation: $u_{tt} = a^2 u_{xx}$ (oscillations such as vibrating strings, sound waves, light waves)

• Heat Equation: $u_t = c^2 u_{xx}$ (heat in an insulated wire)

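Example (79). Verify that $T(x, t) = e^{-t}\cos(\frac{x}{c})$ satisfies the heat equation.

$$\frac{\partial T}{\partial t} = -\cos\left(\frac{x}{c}\right)e^{-t} = -e^{-t}\cos\left(\frac{x}{c}\right)$$

$$\frac{\partial T}{\partial x} = -e^{-t}\sin\left(\frac{x}{c}\right)\left(\frac{1}{c}\right) = -\frac{e^{-t}}{c}\sin\left(\frac{x}{c}\right)$$

$$\frac{\partial^2 T}{\partial x^2} = -\frac{e^{-t}}{c}\cos\left(\frac{x}{c}\right)\left(\frac{1}{c}\right) = -\frac{e^{-t}}{c^2}\cos\left(\frac{x}{c}\right)$$

So, $\frac{\partial T}{\partial t} = -\cos(\frac{x}{c})e^{-t} = (c^2)\left(\frac{-e^{-t}}{c^2}\right)\cos(\frac{x}{c}) = c^2\frac{\partial^2 T}{\partial x^2}$. Hence $T(x, t)$ satisfies the heat equation.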
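The verification in Example (79) can also be checked numerically by estimating $T_t - c^2 T_{xx}$ with finite differences. This is a rough sketch, not a proof; the step size `h` and the sample point are arbitrary choices made here.

```python
import math

def T(x, t, c):
    return math.exp(-t) * math.cos(x / c)

def heat_residual(x, t, c, h=1e-4):
    """Finite-difference estimate of T_t - c^2 * T_xx at (x, t)."""
    T_t = (T(x, t + h, c) - T(x, t - h, c)) / (2 * h)
    T_xx = (T(x + h, t, c) - 2 * T(x, t, c) + T(x - h, t, c)) / h**2
    return T_t - c**2 * T_xx

print(heat_residual(0.7, 0.3, 2.0))  # close to 0, up to discretization error
```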

Example (80). By direct calculation, show $f_{xyz} = f_{yzx} = f_{zyx}$ for $f(x, y, z) = xyz + x^2 y^3 z^4$.

$f_x(x, y, z) = yz + 2xy^3 z^4$

$f_y(x, y, z) = xz + 3x^2 y^2 z^4$

$f_z(x, y, z) = xy + 4x^2 y^3 z^3$

$f_{xy}(x, y, z) = z + 6xy^2 z^4$

$f_{yz}(x, y, z) = x + 12x^2 y^2 z^3$

$f_{zy}(x, y, z) = x + 12x^2 y^2 z^3$

So, $f_{xyz} = 1 + 24xy^2 z^3$, $f_{yzx} = 1 + 24xy^2 z^3$, $f_{zyx} = 1 + 24xy^2 z^3$.

4 Tangent Planes

Let $S$ be the surface $z = f(x, y)$ and suppose $f_x, f_y$ are continuous at $(x_0, y_0)$. Let $C_1$ be the intersection of $S$ with the plane $y = y_0$ and $C_2$ be the intersection of $S$ with the plane $x = x_0$.


Page 10: Notes

Then $C_1$ has a tangent line at $(x_0, y_0)$ with slope $f_x(x_0, y_0)$, and $C_2$ has a tangent line at $(x_0, y_0)$ with slope $f_y(x_0, y_0)$. Letting $z_0 = f(x_0, y_0)$, these two lines define a plane:

$$A(x - x_0) + B(y - y_0) + C(z - z_0) = 0$$

Let $a = -\frac{A}{C}$, $b = -\frac{B}{C}$. Then $z - z_0 = a(x - x_0) + b(y - y_0)$.
If $x = x_0$, we have $z - z_0 = b(y - y_0)$, but this is the tangent line to $C_2$, so $b = f_y(x_0, y_0)$.
If $y = y_0$, we have $z - z_0 = a(x - x_0)$, but this is the tangent line to $C_1$, so $a = f_x(x_0, y_0)$.

Definition. Let $z = f(x, y)$ with $f_x, f_y$ continuous at $(x_0, y_0)$, and $z_0 = f(x_0, y_0)$. The equation for the tangent plane to the surface $z = f(x, y)$ at the point $(x_0, y_0, z_0)$ is

$$z - z_0 = f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)$$

Equivalently,

$$z = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)$$

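Example (81). Find an equation for the tangent plane to the surface $z = \tan^{-1}(\frac{y}{x})$ at the point $(1, \sqrt{3}, \frac{\pi}{3})$.

$$f_x(x, y) = \frac{1}{1 + (\frac{y}{x})^2}\left(\frac{-y}{x^2}\right) = \frac{-y}{x^2 + y^2}$$

So, $f_x(1, \sqrt{3}) = -\frac{\sqrt{3}}{4}$.

$$f_y(x, y) = \frac{1}{1 + (\frac{y}{x})^2}\left(\frac{1}{x}\right) = \frac{1}{x + \frac{y^2}{x}} = \frac{x}{x^2 + y^2}$$

So, $f_y(1, \sqrt{3}) = \frac{1}{4}$.

Hence, the tangent plane is

$$z - \frac{\pi}{3} = -\frac{\sqrt{3}}{4}(x - 1) + \frac{1}{4}(y - \sqrt{3})$$

$$12z - 4\pi = -3\sqrt{3}x + 3\sqrt{3} + 3y - 3\sqrt{3}$$

or

$$3\sqrt{3}x - 3y + 12z = 4\pi$$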

Linearization/Tangent Plane Approximation

Just as the tangent line at a point is a good approximation to a curve near that point, so too the tangent plane at $P$ is a good approximation to $f(x, y)$ when we are near $P$.

Suppose $f(x, y)$ is a function of two variables and $f_x, f_y$ exist at $(a, b)$. The linear function

$$L(x, y) = f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b)$$

defines the tangent plane to the graph of $f$ at the point $(a, b, f(a, b))$.

When $(x, y)$ is near $(a, b)$, this linearization is a good approximation:

$$f(x, y) \approx f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b)$$

Example (82). Say $f(x, y) = \tan^{-1}(y/x)$. Use what we already know to approximate $f(1.05, 1.77)$.

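Note that $(1, \sqrt{3}) \approx (1, 1.73205)$. So, $(1.05, 1.77)$ may be a good candidate for a linear approximation.

$$L(x, y) = \frac{\pi}{3} - \frac{\sqrt{3}}{4}(x - 1) + \frac{1}{4}(y - \sqrt{3}) \approx f(x, y) \text{ near } (1, \sqrt{3})$$

$$L(1.05, 1.77) = \frac{\pi}{3} - \frac{\sqrt{3}}{4}(0.05) + \frac{1.77 - \sqrt{3}}{4} \approx 1.03503$$

Actual value (Matlab): $\tan^{-1}(\frac{1.77}{1.05}) = 1.03538$ (six-digit rounding).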
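The comparison in Example (82) between the tangent-plane approximation and the true value can be reproduced with a few lines of code. This is a sketch using the partial derivatives from Example (81); the function name `linearization` and its default arguments are our own.

```python
import math

def f(x, y):
    return math.atan(y / x)

def linearization(x, y, a=1.0, b=math.sqrt(3)):
    """Tangent-plane approximation of f at (a, b), using f_x and f_y from Example (81)."""
    fx = -b / (a**2 + b**2)
    fy = a / (a**2 + b**2)
    return f(a, b) + fx * (x - a) + fy * (y - b)

approx = linearization(1.05, 1.77)
exact = f(1.05, 1.77)
# Print the approximation, the true value, and the gap between them.
print(f"{approx:.5f} {exact:.5f} {abs(exact - approx):.5f}")
```

The gap is small because $(1.05, 1.77)$ is close to the base point $(1, \sqrt{3})$; moving farther away makes the approximation worse.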


Page 11: Notes

5 Differentiability and the Chain Rule

Warm-up. Find the tangent plane to the surface $z = x^2 - xy + \frac{1}{2}y^2 + 3$ at the point $(3, 2)$.

We found a tangent plane when $f_x, f_y$ both exist and are continuous. Say $z = f(x, y)$, and let $\Delta z = f(x + \Delta x, y + \Delta y) - f(x, y)$. When this change is well approximated by a linear function of $\Delta x$ and $\Delta y$, we call the function differentiable.

Definition. If $z = f(x, y)$, then $f$ is differentiable at $(a, b)$ if $\Delta z = f(a + \Delta x, b + \Delta y) - f(a, b)$ can be expressed in the form

$$\Delta z = f_x(a, b)\Delta x + f_y(a, b)\Delta y + \varepsilon_1 \Delta x + \varepsilon_2 \Delta y$$

where $\varepsilon_1 \to 0$ and $\varepsilon_2 \to 0$ as $(\Delta x, \Delta y) \to (0, 0)$.

Here is the formal way of saying that the linearization is a good approximation near the point:

Theorem. If the partial derivatives $f_x$ and $f_y$ exist near $(a, b)$ and are continuous at $(a, b)$, then $f$ is differentiable at $(a, b)$.

Example (83). The function $f(x, y) = x^2 - xy + \frac{1}{2}y^2 + 3$ has continuous first partial derivatives, hence is differentiable everywhere.

Chain Rule: Suppose $z = f(x, y)$. Then $x, y$ might be functions of some variable $t$, and then $z$ is a function of $t$. Or $x = g(s, t)$, $y = h(s, t)$ might be functions of two variables, and then $z$ is a function of $s, t$.

Theorem (Chain Rule (Case 1)). Suppose $z = f(x, y)$ is a differentiable function of $x$ and $y$, where $x = g(t)$, $y = h(t)$ are both differentiable functions of $t$. Then $z$ is a differentiable function of $t$, with

$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}$$

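Proof. $\Delta z = \frac{\partial z}{\partial x}\Delta x + \frac{\partial z}{\partial y}\Delta y + \varepsilon_1 \Delta x + \varepsilon_2 \Delta y$ with $\varepsilon_1 \to 0$, $\varepsilon_2 \to 0$ as $(\Delta x, \Delta y) \to (0, 0)$. Then,

$$\frac{\Delta z}{\Delta t} = \frac{\partial z}{\partial x}\frac{\Delta x}{\Delta t} + \frac{\partial z}{\partial y}\frac{\Delta y}{\Delta t} + \varepsilon_1 \frac{\Delta x}{\Delta t} + \varepsilon_2 \frac{\Delta y}{\Delta t}$$

Since $x = g(t)$, $y = h(t)$ are both differentiable (hence continuous) functions of $t$, as $\Delta t \to 0$, $(\Delta x, \Delta y) \to (0, 0)$. So as $\Delta t \to 0$, $\varepsilon_1 \to 0$, $\varepsilon_2 \to 0$.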


Page 12: Notes

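Then,

$$\begin{aligned}
\frac{dz}{dt} &= \lim_{\Delta t\to 0} \frac{\Delta z}{\Delta t} \\
&= \frac{\partial z}{\partial x}\left(\lim_{\Delta t\to 0}\frac{\Delta x}{\Delta t}\right) + \frac{\partial z}{\partial y}\left(\lim_{\Delta t\to 0}\frac{\Delta y}{\Delta t}\right) + \left(\lim_{\Delta t\to 0}\varepsilon_1\right)\left(\lim_{\Delta t\to 0}\frac{\Delta x}{\Delta t}\right) + \left(\lim_{\Delta t\to 0}\varepsilon_2\right)\left(\lim_{\Delta t\to 0}\frac{\Delta y}{\Delta t}\right) \\
&= \frac{\partial z}{\partial x}\cdot\frac{dx}{dt} + \frac{\partial z}{\partial y}\cdot\frac{dy}{dt} + 0\cdot\frac{dx}{dt} + 0\cdot\frac{dy}{dt} \\
&= \frac{\partial z}{\partial x}\cdot\frac{dx}{dt} + \frac{\partial z}{\partial y}\cdot\frac{dy}{dt}
\end{aligned}$$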

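Example (84). We use the chain rule to find the derivative of $w = xy$ with respect to $t$ along the path $x = \cos t$, $y = \sin t$. What is the value of the derivative at $t = \pi/2$?

$$\frac{\partial w}{\partial x} = y, \quad \frac{\partial w}{\partial y} = x, \quad \frac{dx}{dt} = -\sin t, \quad \frac{dy}{dt} = \cos t$$

Then,

$$\begin{aligned}
\frac{dw}{dt} &= \frac{\partial w}{\partial x}\cdot\frac{dx}{dt} + \frac{\partial w}{\partial y}\cdot\frac{dy}{dt} \\
&= y(-\sin t) + x(\cos t) \\
&= (\sin t)(-\sin t) + \cos t(\cos t) \\
&= -\sin^2 t + \cos^2 t \\
&= \cos 2t
\end{aligned}$$

At $t = \frac{\pi}{2}$ we have $\cos\pi = -1$, so $\left.\frac{dw}{dt}\right|_{t=\frac{\pi}{2}} = -1$.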

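We could also evaluate the derivative at a point as follows: when $t = \frac{\pi}{2}$, $x = \cos\frac{\pi}{2} = 0$ and $y = \sin\frac{\pi}{2} = 1$. So,

$$\left.\frac{dw}{dt}\right|_{t=\frac{\pi}{2}} = \left.\frac{\partial w}{\partial x}\right|_{t=\frac{\pi}{2}}\cdot\left.\frac{dx}{dt}\right|_{t=\frac{\pi}{2}} + \left.\frac{\partial w}{\partial y}\right|_{t=\frac{\pi}{2}}\cdot\left.\frac{dy}{dt}\right|_{t=\frac{\pi}{2}} = 1\cdot\left(-\sin\frac{\pi}{2}\right) + 0\cdot\left(\cos\frac{\pi}{2}\right) = -1$$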
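As a check on Example (84), the chain-rule value of $dw/dt$ can be compared with a difference quotient of $w(t) = \cos t \sin t$ computed directly. The helper names and the step size `h` are assumptions of this sketch.

```python
import math

def w_of_t(t):
    """w = x*y along the path x = cos t, y = sin t."""
    x, y = math.cos(t), math.sin(t)
    return x * y

def dw_dt_chain(t):
    """dw/dt = (dw/dx)(dx/dt) + (dw/dy)(dy/dt) with dw/dx = y and dw/dy = x."""
    x, y = math.cos(t), math.sin(t)
    return y * (-math.sin(t)) + x * math.cos(t)

def dw_dt_numeric(t, h=1e-6):
    """Central-difference estimate of dw/dt."""
    return (w_of_t(t + h) - w_of_t(t - h)) / (2 * h)

t = math.pi / 2
# Chain rule, closed form cos(2t), and numerical estimate should agree.
print(dw_dt_chain(t), math.cos(2 * t), dw_dt_numeric(t))
```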

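Theorem (Chain Rule (Case 2)). Suppose $z = f(x, y)$ is a differentiable function of $x$ and $y$, and $x = g(s, t)$, $y = h(s, t)$ where $g, h$ are differentiable functions of $s, t$. Then $z$ is a function of $(s, t)$ and

$$\frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\cdot\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\cdot\frac{\partial y}{\partial s}$$

$$\frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\cdot\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\cdot\frac{\partial y}{\partial t}$$

Example (85). Express $\frac{\partial w}{\partial r}$ and $\frac{\partial w}{\partial s}$ in terms of $r, s$ if $w = x^2 + y^2$, $x = r - s$ and $y = r + s$.

$$\frac{\partial w}{\partial x} = 2x, \quad \frac{\partial w}{\partial y} = 2y, \quad \frac{\partial x}{\partial r} = 1, \quad \frac{\partial x}{\partial s} = -1, \quad \frac{\partial y}{\partial r} = 1, \quad \frac{\partial y}{\partial s} = 1$$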


Page 13: Notes

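Now,

$$\frac{\partial w}{\partial r} = \frac{\partial w}{\partial x}\cdot\frac{\partial x}{\partial r} + \frac{\partial w}{\partial y}\cdot\frac{\partial y}{\partial r} = 2x(1) + 2y(1) = 2(r - s) + 2(r + s) = 4r$$

$$\frac{\partial w}{\partial s} = 2x(-1) + 2y(1) = -2(r - s) + 2(r + s) = 4s$$

What if $w = f(x)$ and $x = g(s, t)$? Then $w$ is a function of $s, t$:

$$\frac{\partial w}{\partial s} = \frac{dw}{dx}\cdot\frac{\partial x}{\partial s}, \qquad \frac{\partial w}{\partial t} = \frac{dw}{dx}\cdot\frac{\partial x}{\partial t}$$

What if $w = f(x, y, z)$, $x = g(s, t)$, $y = h(s, t)$, $z = k(s, t)$? What if $w = f(x_1, \dots, x_n)$ and $x_i = g_i(t_1, \dots, t_m)$?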

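Theorem (Chain Rule (General)). Suppose $w$ is a differentiable function of $n$ variables $x_1, x_2, \dots, x_n$ and each $x_j$ is a differentiable function of $m$ variables $t_1, t_2, \dots, t_m$. Then $w$ is a differentiable function of the $m$ variables $t_1, \dots, t_m$ and

$$\frac{\partial w}{\partial t_i} = \frac{\partial w}{\partial x_1}\cdot\frac{\partial x_1}{\partial t_i} + \frac{\partial w}{\partial x_2}\cdot\frac{\partial x_2}{\partial t_i} + \cdots + \frac{\partial w}{\partial x_n}\cdot\frac{\partial x_n}{\partial t_i}$$

for each $i = 1, 2, \dots, m$.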

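Example (86). Find $\frac{\partial w}{\partial s}$ and $\frac{\partial w}{\partial r}$ if $w = x + 2y + z^2$ with $x = r/s$, $y = r^2 + \ln s$, $z = 2r$.

$$\frac{\partial w}{\partial x} = 1, \quad \frac{\partial x}{\partial s} = \frac{-r}{s^2}, \quad \frac{\partial x}{\partial r} = \frac{1}{s}$$

$$\frac{\partial w}{\partial y} = 2, \quad \frac{\partial y}{\partial s} = \frac{1}{s}, \quad \frac{\partial y}{\partial r} = 2r$$

$$\frac{\partial w}{\partial z} = 2z, \quad \frac{\partial z}{\partial s} = 0, \quad \frac{\partial z}{\partial r} = 2$$

So,

$$\frac{\partial w}{\partial s} = 1\left(\frac{-r}{s^2}\right) + 2\left(\frac{1}{s}\right) + 2z(0) = \frac{-r}{s^2} + \frac{2}{s} = \frac{2s - r}{s^2}$$

$$\frac{\partial w}{\partial r} = 1\left(\frac{1}{s}\right) + 2(2r) + 2z(2) = \frac{1}{s} + 4r + 4z = \frac{1}{s} + 4r + 4(2r) = \frac{1}{s} + 12r$$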


Page 14: Notes

Tree diagrams

Say $s = f(u, v)$ where $u = g(x, y)$ and $v = h(x, y)$.

Figure 5: Multivariable chain rule: tree diagram ($s$ branches to $u$ and $v$; each of $u$ and $v$ branches to $x$ and $y$)

Implicit Differentiation (with the chain rule)

An advanced theorem:

Theorem (Implicit Function Theorem). If $F$ is defined on a disk around $(a, b)$ with $F(a, b) = 0$, $F_y(a, b) \ne 0$, and both $F_x$ and $F_y$ are continuous on the disk, then the equation $F(x, y) = 0$ defines $y$ as a function of $x$ near $(a, b)$.

This generalizes to more variables as well.
So, let $F(x, y) = 0$ and apply the chain rule to get $\frac{\partial F}{\partial x}\cdot\frac{dx}{dx} + \frac{\partial F}{\partial y}\cdot\frac{dy}{dx} = 0$. Since $\frac{dx}{dx} = 1$, solving for $\frac{dy}{dx}$ we get

$$\frac{dy}{dx} = -\frac{\partial F/\partial x}{\partial F/\partial y} = -\frac{F_x}{F_y}$$

When faced with implicit differentiation, move everything to one side so that the equation reads $F = 0$, and treat $F$ as a function of $n + 1$ variables.

Example (87). Find $\frac{dy}{dx}$ if $xy^2 + 2yx = 7 - \frac{y}{x}$. Immediately, $xy^2 + 2yx + \frac{y}{x} - 7 = 0$. Define $F(x, y) = xy^2 + 2yx + \frac{y}{x} - 7$, so

$$F_x(x, y) = y^2 + 2y - \frac{y}{x^2}$$

$$F_y(x, y) = 2xy + 2x + \frac{1}{x}$$

so

$$\frac{dy}{dx} = -\frac{F_x}{F_y} = -\frac{y^2 + 2y - \frac{y}{x^2}}{2xy + 2x + \frac{1}{x}}$$

Example (88). Say $x^3 z^3 + y^2 xz = 1 - z$. Let $F(x, y, z) = x^3 z^3 + y^2 xz + z - 1 = 0$.

$$F_x(x, y, z) = 3x^2 z^3 + y^2 z$$

$$F_y(x, y, z) = 2yxz$$

$$F_z(x, y, z) = 3x^3 z^2 + xy^2 + 1$$


Page 15: Notes

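Treating $z$ as a function of $x$ and $y$ and differentiating $F = 0$ with respect to $x$:

$$\frac{\partial F}{\partial x}\cdot\frac{\partial x}{\partial x} + \frac{\partial F}{\partial y}\cdot\frac{\partial y}{\partial x} + \frac{\partial F}{\partial z}\cdot\frac{\partial z}{\partial x} = 0$$

$$\frac{\partial F}{\partial x}\cdot(1) + 0 + \frac{\partial F}{\partial z}\cdot\frac{\partial z}{\partial x} = 0$$

So

$$\frac{\partial z}{\partial x} = -\frac{\partial F/\partial x}{\partial F/\partial z} = -\frac{F_x}{F_z}$$

Similarly, $\frac{\partial z}{\partial y} = -\frac{F_y}{F_z}$.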

Applications

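Example (89). Water is circulating around a square region $\{(x, y) : 0 \le x \le 1 \text{ and } 0 \le y \le 1\}$. The velocity of a small parcel of water has an $x$-component $u(x, y) = 2\sin\pi x\cos\pi y$ and a $y$-component $v(x, y) = -2\cos\pi x\sin\pi y$.

Then, the speed of the water parcel is

$$S(x, y) = \sqrt{[u(x, y)]^2 + [v(x, y)]^2}$$

which is a function of $u$ and $v$, hence implicitly a function of $x$ and $y$. What are the rates of change of the water speed in the $x$ and $y$ directions?

We have $s = \sqrt{u^2 + v^2}$, $u = 2\sin\pi x\cos\pi y$, $v = -2\cos\pi x\sin\pi y$.

$$\frac{\partial s}{\partial u} = \frac{1}{2}(u^2 + v^2)^{-1/2}\cdot 2u = \frac{u}{\sqrt{u^2 + v^2}} = \frac{u}{s}$$

$$\frac{\partial s}{\partial v} = \frac{v}{\sqrt{u^2 + v^2}} = \frac{v}{s}$$

$$\frac{\partial u}{\partial x} = 2\pi\cos\pi x\cos\pi y, \qquad \frac{\partial u}{\partial y} = -2\pi\sin\pi x\sin\pi y$$

$$\frac{\partial v}{\partial x} = 2\pi\sin\pi x\sin\pi y, \qquad \frac{\partial v}{\partial y} = -2\pi\cos\pi x\cos\pi y$$

So,

$$\frac{\partial s}{\partial x} = \frac{\partial s}{\partial u}\cdot\frac{\partial u}{\partial x} + \frac{\partial s}{\partial v}\cdot\frac{\partial v}{\partial x} = \frac{u}{s}(2\pi\cos\pi x\cos\pi y) + \frac{v}{s}(2\pi\sin\pi x\sin\pi y)$$

$$\frac{\partial s}{\partial y} = \frac{\partial s}{\partial u}\cdot\frac{\partial u}{\partial y} + \frac{\partial s}{\partial v}\cdot\frac{\partial v}{\partial y} = \frac{u}{s}(-2\pi\sin\pi x\sin\pi y) + \frac{v}{s}(-2\pi\cos\pi x\cos\pi y)$$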


Page 16: Notes

6 Directional Derivatives and the Gradient

Warm-up. Find $f_x(3, 2)$ and $f_y(3, 2)$ when $z = f(x, y) = \frac{1}{4}(x^2 + 2y^2) + 2$.

We have $f_x, f_y$ and a tangent plane. $f_x(x_0, y_0)$ is the rate of change in $f$ if we move from $(x_0, y_0)$ in the direction $(1, 0)$. Similarly, $f_y(x_0, y_0)$ is the rate of change in $f$ if we move from $(x_0, y_0)$ in the direction $(0, 1)$. What if we want to go in a different direction? How will $z$ change if both $x$ and $y$ change?

Say $\vec{u} = (u_1, u_2)$ is a unit vector (direction vector). We want to move from $\vec{x}_0 = (x_0, y_0)$ in the direction $\vec{u}$ by a step of length $h$. Then $\Delta z = f(x_0 + hu_1, y_0 + hu_2) - f(x_0, y_0)$. So, the average rate of change is

$$\frac{\Delta z}{h} = \frac{f(x_0 + hu_1, y_0 + hu_2) - f(x_0, y_0)}{h} = \frac{f(\vec{x}_0 + h\vec{u}) - f(\vec{x}_0)}{h}$$

Definition. The directional derivative of $f$ at $\vec{x}_0 = (x_0, y_0)$ in the direction $\vec{u} = (u_1, u_2)$ is

$$D_{\vec{u}} f(x_0, y_0) = \lim_{h\to 0} \frac{f(x_0 + hu_1, y_0 + hu_2) - f(x_0, y_0)}{h} = \lim_{h\to 0} \frac{f(\vec{x}_0 + h\vec{u}) - f(\vec{x}_0)}{h}$$

if this limit exists.

Consider $\vec{u} = \vec{i}, \vec{j}$: we have $D_{\vec{i}} f(x, y) = f_x(x, y)$ and $D_{\vec{j}} f(x, y) = f_y(x, y)$.

Example (90). What is the directional derivative of $f(x, y)$ in the direction that is $\theta$ radians counterclockwise from the $x$-axis?

$\vec{u} = (\cos\theta, \sin\theta)$, so

$$D_{\vec{u}} f(x, y) = \lim_{h\to 0} \frac{f(x + h\cos\theta, y + h\sin\theta) - f(x, y)}{h}$$

Every unit vector can be expressed in this form for some $\theta$.

Theorem (15.6.3). If $f$ is a differentiable function of $x$ and $y$, then $f$ has a directional derivative in the direction of any unit vector $\vec{u} = (u_1, u_2)$ and $D_{\vec{u}} f(x, y) = f_x(x, y)u_1 + f_y(x, y)u_2 = (f_x(x, y), f_y(x, y)) \cdot \vec{u}$.

Proof. Define $g(h) = f(x_0 + hu_1, y_0 + hu_2)$. Then $g(h) = f(x, y)$ where $x = x_0 + hu_1$ and $y = y_0 + hu_2$. First,

$$g'(0) = \lim_{h\to 0} \frac{g(0 + h) - g(0)}{h} = \lim_{h\to 0} \frac{f(x_0 + hu_1, y_0 + hu_2) - f(x_0, y_0)}{h} = D_{\vec{u}} f(x_0, y_0)$$

Also, since $g(h) = f(x, y)$, we can apply the chain rule (case 1):

$$g'(h) = f_x(x, y)\frac{dx}{dh} + f_y(x, y)\frac{dy}{dh} = f_x(x, y)u_1 + f_y(x, y)u_2$$


Page 17: Notes

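If $h = 0$, then $x = x_0$, $y = y_0$, and

$$D_{\vec{u}} f(x_0, y_0) = g'(0) = f_x(x_0, y_0)u_1 + f_y(x_0, y_0)u_2$$

Example (91). Consider the paraboloid $z = f(x, y) = \frac{1}{4}(x^2 + 2y^2) + 2$. Let $P_0$ be the point $(3, 2)$ and consider the unit vectors $\vec{u} = (\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})$ and $\vec{v} = (\frac{1}{2}, -\frac{\sqrt{3}}{2})$. Find the directional derivative of $f$ at $P_0$ in the directions $\vec{u}$ and $\vec{v}$.
From the warm-up: $f_x(3, 2) = \frac{3}{2}$, $f_y(3, 2) = 2$. So,

$$D_{\vec{u}} f(3, 2) = \frac{3}{2}\cdot\frac{1}{\sqrt{2}} + 2\left(\frac{1}{\sqrt{2}}\right) = \frac{3\sqrt{2}}{4} + \frac{2\sqrt{2}}{2} = \frac{7\sqrt{2}}{4}$$

$$D_{\vec{v}} f(3, 2) = \frac{3}{2}\cdot\frac{1}{2} + 2\left(-\frac{\sqrt{3}}{2}\right) = \frac{3}{4} - \sqrt{3}$$

So, $f$ is increasing if we move in the direction $(1, 1)$ but decreasing when we move in the direction $(1, -\sqrt{3})$.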

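In the theorem above, we wrote $D_{\vec{u}} f(x, y) = (f_x(x, y), f_y(x, y)) \cdot \vec{u}$. This vector of partial derivatives is important.

Definition. If $f$ is a function of two variables $x$ and $y$, then the gradient of $f$ is the vector function $\nabla f(x, y)$ defined by

$$\nabla f(x, y) = (f_x(x, y), f_y(x, y)) = f_x(x, y)\vec{i} + f_y(x, y)\vec{j}$$

or $\nabla f = (\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}) = \frac{\partial f}{\partial x}\vec{i} + \frac{\partial f}{\partial y}\vec{j}$.

So if $\vec{u}$ is a unit vector, $D_{\vec{u}} f(x, y) = \nabla f(x, y) \cdot \vec{u}$.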

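Example (92). Find $\nabla f$, $\nabla f(3, 2)$ and $D_{\vec{u}} f(3, 2)$ for $\vec{u} = (\frac{3}{5}, \frac{4}{5})$ when $f(x, y) = x^2 + 2xy - y^3$.

We have that $f_x(x, y) = 2x + 2y$ and $f_y(x, y) = 2x - 3y^2$, so

$$\nabla f(x, y) = (2(x + y), 2x - 3y^2)$$

$$\nabla f(3, 2) = (2(5), 6 - 12) = (10, -6)$$

$$D_{\vec{u}} f(3, 2) = (10, -6) \cdot \left(\frac{3}{5}, \frac{4}{5}\right) = 6 - \frac{24}{5} = \frac{6}{5}.$$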
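The computation in Example (92) is a gradient followed by a dot product, which is easy to mirror in code. The sketch below hard-codes the gradient found by hand; `directional_derivative` is a helper name introduced here and assumes `u` is already a unit vector.

```python
def grad_f(x, y):
    """Gradient of f(x, y) = x^2 + 2xy - y^3, computed by hand in Example (92)."""
    return (2 * x + 2 * y, 2 * x - 3 * y**2)

def directional_derivative(grad, u):
    """Dot product of the gradient with a unit vector u."""
    return sum(g * ui for g, ui in zip(grad, u))

g = grad_f(3, 2)
u = (3 / 5, 4 / 5)
print(g)                              # (10, -6)
print(directional_derivative(g, u))   # approximately 1.2, i.e. 6/5
```

If the direction is not given as a unit vector, it has to be divided by its length first, as in the next example.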

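Example (93). Is $f$ increasing or decreasing at $(3, -1)$ in the direction of $(1, -1)$ if $f(x, y) = 3 - \frac{x^2}{10} + \frac{xy^2}{10}$?
We first compute the first partial derivatives: $f_x(x, y) = -\frac{x}{5} + \frac{y^2}{10}$, so $f_x(3, -1) = -\frac{3}{5} + \frac{1}{10} = -\frac{5}{10} = -\frac{1}{2}$. Similarly, $f_y(x, y) = \frac{xy}{5}$, so $f_y(3, -1) = -\frac{3}{5}$. Thus, $\nabla f(3, -1) = (-\frac{1}{2}, -\frac{3}{5})$.

Now, $\vec{v} = (1, -1)$ is not a unit vector, so we must use $\vec{u} = \frac{\vec{v}}{|\vec{v}|} = (\frac{1}{\sqrt{2}}, \frac{-1}{\sqrt{2}})$. Then,

$$\begin{aligned}
D_{\vec{u}} f(3, -1) &= \nabla f(3, -1) \cdot \vec{u} = \left(-\tfrac{1}{2}, -\tfrac{3}{5}\right) \cdot \left(\tfrac{1}{\sqrt{2}}, \tfrac{-1}{\sqrt{2}}\right) \\
&= -\frac{1}{2\sqrt{2}} + \frac{3}{5\sqrt{2}} = -\frac{\sqrt{2}}{4} + \frac{3\sqrt{2}}{10} \\
&= \frac{\sqrt{2}}{20}
\end{aligned}$$

Since $\frac{\sqrt{2}}{20} > 0$, $f$ is increasing at $(3, -1)$ in the direction $\vec{u} = (\frac{1}{\sqrt{2}}, \frac{-1}{\sqrt{2}})$.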


Page 18: Notes

7 Gradients

Warm-up. If $f(x, y, z) = x^2 + 2y^2 + 4z^2 - 1$, find $f_x, f_y, f_z$.

If $\nabla f(x, y) = (f_x, f_y)$ when $f$ is a function of two variables, what is $\nabla f(x, y, z)$ when $f$ is a function of three variables (or possibly more)?

Definition. If $w = f(x, y, z)$, then the gradient of $f$ is the vector function $\nabla f(x, y, z) = (f_x(x, y, z), f_y(x, y, z), f_z(x, y, z))$. More generally, if $\vec{x}$ is an $n$-dimensional vector and $f$ is a function of $n$ variables, the gradient of $f$ is the vector function

$$\nabla f(\vec{x}) = (f_{x_1}(\vec{x}), \dots, f_{x_n}(\vec{x}))$$

Example (94). Say $f(x, y, z) = x^2 + 2y^2 + 4z^2 - 1$. By the warm-up, $\nabla f = (2x, 4y, 8z)$.

Example (95). Say $f(u, v, w, x, y, z) = uv + w\cos(xyz)$. Then

$$\nabla f = (v, u, \cos(xyz), -wyz\sin(xyz), -wxz\sin(xyz), -wxy\sin(xyz))$$

Definition. If $w = f(x, y, z)$, the directional derivative of $f$ in the direction of a unit vector $\vec{u} = (u_1, u_2, u_3)$ is

$$D_{\vec{u}} f(x, y, z) = \nabla f(x, y, z) \cdot \vec{u} = f_x(x, y, z)u_1 + f_y(x, y, z)u_2 + f_z(x, y, z)u_3$$

More generally, if $\vec{x}$ is an $n$-dimensional vector, $f$ is a function of $n$ variables and $\vec{u}$ is an $n$-dimensional unit vector, then the directional derivative of $f$ in the direction $\vec{u}$ is

$$D_{\vec{u}} f(\vec{x}) = \nabla f(\vec{x}) \cdot \vec{u} = \sum_{i=1}^{n} f_{x_i}(\vec{x})u_i$$

Example (96). Find the directional derivative of $f(u, v, w, x, y, z) = uv + w\cos(xyz)$ in the direction $\vec{v} = (1, 2, 1, 1, 1, 1)$ at the point $(1, 4, 2, \pi, 1, 1)$.

$$\nabla f(1, 4, 2, \pi, 1, 1) = (4, 1, \cos\pi, -2\sin\pi, -2\pi\sin\pi, -2\pi\sin\pi) = (4, 1, -1, 0, 0, 0)$$

So $\vec{u} = \frac{\vec{v}}{|\vec{v}|} = \frac{1}{3}(1, 2, 1, 1, 1, 1)$, and then


Page 19: Notes

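$$D_{\vec{u}} f(1, 4, 2, \pi, 1, 1) = (4, 1, -1, 0, 0, 0) \cdot \tfrac{1}{3}(1, 2, 1, 1, 1, 1) = \tfrac{1}{3}(4 + 2 - 1) = \tfrac{5}{3}$$

7.1 What is the Gradient?

$$D_{\vec{u}} f(\vec{x}) = \nabla f(\vec{x}) \cdot \vec{u} = |\nabla f(\vec{x})|\,|\vec{u}|\cos\theta = |\nabla f(\vec{x})|\cos\theta$$

where $\theta$ is the angle between $\nabla f(\vec{x})$ and the unit vector $\vec{u}$.

Therefore, $-|\nabla f(\vec{x})| \le D_{\vec{u}} f(\vec{x}) \le |\nabla f(\vec{x})|$.
So $D_{\vec{u}} f(\vec{x}) = |\nabla f(\vec{x})|$ if $\vec{u}$ is the direction of $\nabla f$, i.e. $\vec{u}$ and $\nabla f$ are parallel.
Also, if $\vec{u}$ is the direction of $\nabla f(\vec{x}) \ne \vec{0}$, then $D_{\vec{u}} f(\vec{x}) = |\nabla f(\vec{x})| > 0$, so $f$ is increasing in this direction.
If $\vec{u}$ is the direction opposite $\nabla f$, then $\theta = \pi$, so we have $D_{\vec{u}} f(\vec{x}) = |\nabla f(\vec{x})|\cos\pi = -|\nabla f(\vec{x})| < 0$.
If $\theta = \pi/2$, then $\nabla f$ and $\vec{u}$ are orthogonal and $D_{\vec{u}} f = 0$.
Note that $\nabla f$ is called the direction of steepest ascent and $-\nabla f$ is called the direction of steepest descent.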

Theorem (Directions of Change). Let $f$ be differentiable at $(a, b)$ with $\nabla f(a, b) \ne \vec{0}$.

(1) $f$ has its maximum rate of increase at $(a, b)$ in the direction of the gradient $\nabla f(a, b)$. The rate of increase in this direction is $|\nabla f(a, b)|$.

(2) $f$ has its maximum rate of decrease at $(a, b)$ in the direction $-\nabla f(a, b)$. The rate of change in this direction is $-|\nabla f(a, b)|$.

(3) The directional derivative is zero in any direction orthogonal to $\nabla f(a, b)$.

Example (97). Consider the bowl-shaped paraboloid $z = 4 + x^2 + 3y^2$. Let $P = (2, -\frac{1}{2}, \frac{35}{4})$ and $Q = (3, 1, 16)$.

1. If you are at $P$, which direction should you move to ascend the surface at a maximum rate? What is the rate of change in $z$?

2. From $P$, which direction lets you descend the surface at the maximum rate? What is that rate?

3. From $Q$, which directions provide no change in $z$?

1. Let $z = f(x, y) = 4 + x^2 + 3y^2$. Then $\nabla f(x, y) = (2x, 6y)$, so $\nabla f(2, -\frac{1}{2}) = (4, -3)$. Then $|\nabla f(2, -\frac{1}{2})| = \sqrt{16 + 9} = 5$. The maximum rate of ascent is in the direction $(4, -3)$, producing a rate of change of 5.

2. The direction $-(4, -3) = (-4, 3)$ is the direction of steepest descent. It has a rate of change of $-5$.

3. At $Q$, $\nabla f(3, 1) = (6, 6)$. The directions with no change in $z$ are those $\vec{u} = (u_1, u_2)$ so that

$$0 = \nabla f(3, 1) \cdot \vec{u} = (6, 6) \cdot (u_1, u_2) = 6(u_1 + u_2)$$

so $u_2 = -u_1$. Thus $\vec{u}_1 = (-6, 6)$ and $\vec{u}_2 = (6, -6)$, or the unit vectors $(-\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})$ and $(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}})$ parallel to them, are the directions with no change in $z$.
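The three answers in Example (97) all come from the gradient: its length, its direction, and a vector perpendicular to it. A small sketch of those computations follows; the helper names `norm` and `unit` are ours, and the perpendicular direction is obtained by rotating the gradient a quarter turn.

```python
import math

def grad_f(x, y):
    """Gradient of f(x, y) = 4 + x^2 + 3y^2."""
    return (2 * x, 6 * y)

def norm(v):
    return math.hypot(v[0], v[1])

def unit(v):
    n = norm(v)
    return (v[0] / n, v[1] / n)

gP = grad_f(2, -0.5)
print(gP, norm(gP))   # gradient at P and the maximum rate of increase
print(unit(gP))       # unit vector of steepest ascent at P

gQ = grad_f(3, 1)
no_change = unit((-gQ[1], gQ[0]))   # rotate the gradient by 90 degrees
# The directional derivative along no_change is the dot product with gQ.
print(gQ[0] * no_change[0] + gQ[1] * no_change[1])
```

The opposite vector `(-no_change[0], -no_change[1])` is the second direction with no change in $z$.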

Example (98). $x, y$ are doses of two drugs. $f(x, y)$ measures their anaesthetic effect, while $g(x, y)$ describes the blood pressure of a patient. If currently $f(x, y) = k$, and we need to keep $f(x, y) = k$ but decrease $g$, how might we do this?

We could find $\vec{n}$ so that $\vec{n} \cdot \nabla f = 0$ and then determine whether $\nabla g \cdot \vec{n} < 0$. If so, move in the direction $\vec{n}$; otherwise move in the direction $-\vec{n}$. If $\nabla g \cdot \vec{n} = 0$, then (to first order) we can't maintain the current level of anaesthesia and decrease the blood pressure.


Page 20: Notes

Theorem. The gradient is always orthogonal to a level set. In particular, the gradient is orthogonal to the tangent line of a level curve, and the gradient is normal to the tangent plane of a level surface.

Example (99). Consider the upper sheet $z = f(x, y) = \sqrt{1 + 2x^2 + y^2}$ of a hyperboloid of two sheets. Verify that the gradient is orthogonal to the level curve at the point $(1, 1)$.

$f(1, 1) = 2$, so $(1, 1, 2)$ is the point on the hyperboloid. So, we want the level curve $f(x, y) = 2$, which satisfies the equation $4 = 1 + 2x^2 + y^2$, or $2x^2 + y^2 - 3 = 0$. To find the slope of the tangent, we can differentiate implicitly. $F(x, y) = 2x^2 + y^2 - 3$, $F_x = 4x$, $F_y = 2y$, so

$$y'(x) = \frac{-F_x}{F_y} = \frac{-4x}{2y} = \frac{-2x}{y}$$

So at $(1, 1)$, $y'(x) = -2$. A line with slope $-2$ is parallel to $(1, -2)$.

$$\nabla f(x, y) = \left(\frac{4x}{2\sqrt{1 + 2x^2 + y^2}}, \frac{2y}{2\sqrt{1 + 2x^2 + y^2}}\right) = \frac{1}{\sqrt{1 + 2x^2 + y^2}}(2x, y)$$

Then $\nabla f(1, 1) = \frac{1}{2}(2, 1) = (1, \frac{1}{2})$.
So, $\nabla f(1, 1) \cdot (1, -2) = (1, \frac{1}{2}) \cdot (1, -2) = 1 - 1 = 0$. The equation of the tangent line is easily found by intersecting the tangent plane with the plane $z = 2$.

8 Extreme Values

Definition (15.7.1). A function $f$ of two variables has a local maximum at $(a, b)$ if there is an open disk centered at $(a, b)$ with $f(x, y) \le f(a, b)$ for all $(x, y)$ in the disk. The value $f(a, b)$ is then called a local maximum value.

Similarly, $f$ has a local minimum at $(a, b)$ if there is an open disk centered at $(a, b)$ with $f(x, y) \ge f(a, b)$ for all $(x, y)$ in the disk. Then, $f(a, b)$ is called a local minimum value.

When can a function of one variable have a local extreme value?


Page 21: Notes

Theorem (15.7.2). If f has a local maximum or minimum at (a, b) and the first-order partial derivativesexist, then fx(a, b) = 0 and fy(a, b) = 0, i.e. ∇f(a, b) = ~0.

Proof. Say f has a local maximum at (a, b). Fix b and define g(x) = f(x, b). Then g′(x) = fx(x, b). Now ghas a local maximum at a since g(x) = f(x, b) ≤ f(a, b) = g(a) for all x near a. Therefore g′(a) = 0. So,0 = g′(a) = fx(a, b). Now let h(y) = f(a, y). Then fy(a, b) = h′(b) = 0 by a similar argument.

Definition. A point (a, b) is a critical point of f if fx(a, b) = 0 and fy(a, b) = 0 or if either fx or fy do notexist.

Corollary. If f hsa a local maximum at (a, b), then (a, b) is a critical point of f .

Caution: If (a, b) is a critical point of f , f might not have a local maximum or local maximum at (a, b).

Definition. A differentiable function $f(x, y)$ has a saddle point at a critical point $(a, b)$ if every open disk centered at $(a, b)$ has points $(x, y)$ with $f(x, y) < f(a, b)$ and points $(w, z)$ with $f(w, z) > f(a, b)$.

Example (100). Find the local extreme values of $f(x, y) = x^2 + y^2$.
$f_x(x, y) = 2x$ and $f_y(x, y) = 2y$ exist everywhere. So the only critical point is at $(0, 0)$. If $x \ne 0$ or $y \ne 0$, then $f(x, y) > 0 = f(0, 0)$, so $f$ must have a local minimum at $(0, 0)$.

Example (101). Find the local extreme values of $f(x, y) = x^2 - y^2$.
$f_x(x, y) = 2x$ and $f_y(x, y) = -2y$ exist everywhere, so the only critical point is $(0, 0)$. If $x = 0$, $f(x, y) = -y^2 < 0$ for all $y \ne 0$. If $y = 0$, $f(x, y) = x^2 > 0$ for all $x \ne 0$. Therefore $(0, 0)$ is a saddle point.

Definition. The discriminant of $f(x, y)$ is defined as

\[ D = D(x, y) = \begin{vmatrix} f_{xx}(x, y) & f_{xy}(x, y) \\ f_{yx}(x, y) & f_{yy}(x, y) \end{vmatrix} = f_{xx}(x, y) f_{yy}(x, y) - (f_{xy}(x, y))^2 \]

Theorem (Second Derivative Test). Suppose the second partial derivatives of $f$ are continuous on an open disk with center $(a, b)$ and that $(a, b)$ is a critical point of $f$. Let $D = D(a, b)$ be the discriminant of $f$ at $(a, b)$.

1. If $D > 0$ and $f_{xx}(a, b) > 0$, then $f(a, b)$ is a local minimum.

2. If $D > 0$ and $f_{xx}(a, b) < 0$, then $f(a, b)$ is a local maximum.

3. If $D < 0$, $(a, b)$ is a saddle point of $f$.

4. If $D = 0$, this test fails to provide any information.
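The four cases of the test translate directly into a small decision function. The following is only a sketch: the name `classify` and its return strings are made up for illustration, and the caller is assumed to supply the second partials already evaluated at a critical point.

```python
def classify(fxx, fyy, fxy):
    """Apply the Second Derivative Test to second partials at a critical point."""
    D = fxx * fyy - fxy**2  # discriminant
    if D > 0 and fxx > 0:
        return "local minimum"
    if D > 0 and fxx < 0:
        return "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"  # D == 0: the test gives no information

print(classify(2, 2, 0))   # x^2 + y^2 at (0, 0)
print(classify(2, -2, 0))  # x^2 - y^2 at (0, 0)
print(classify(0, 0, 0))   # e.g. x^4 + y^4 at (0, 0)
```

Note that $D > 0$ forces $f_{xx} \ne 0$, because $f_{xx} = 0$ would give $D = -(f_{xy})^2 \le 0$, so the branches above cover every case.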

Example (102). Find the local extreme values of $f(x, y) = xy$.
$f_x = y$ and $f_y = x$ are both zero only at $(0, 0)$. So, the only critical point is $(0, 0)$. Since $f_{xx} = 0$, $f_{yy} = 0$, $f_{xy} = 1$, we have that $D = 0 \cdot 0 - 1^2 = -1$, therefore $(0, 0)$ is a saddle point of $f$. Hence, $f$ has no local extreme values.

Example (103). Let $f(x, y) = 9x^3 - 4xy + \frac{y^3}{3}$. Find and classify all critical points of $f$.
$f_x(x, y) = 27x^2 - 4y = 0$ if $y = \frac{27}{4}x^2$. Here we see $y \ge 0$. $f_y(x, y) = y^2 - 4x = 0$ if $y^2 = 4x$, or $y = 2\sqrt{x}$ since $y \ge 0$. Substituting into $f_x = 0$:

\[ 27x^2 - 4(2\sqrt{x}) = 0 \Rightarrow \sqrt{x}\,(27x^{3/2} - 8) = 0, \]

so $x = 0$ or $x^{3/2} = \frac{8}{27}$, so $x = 4/9$.
If $x = 0$, $y = 0$, so $(0, 0)$ is a critical point. If $x = 4/9$, $y = 2 \cdot 2/3 = 4/3$, so $(4/9, 4/3)$ is a critical point.
$f_{xx}(x, y) = 54x$, $f_{yy}(x, y) = 2y$, $f_{xy}(x, y) = -4$. So $D(x, y) = 108xy - 16$. $D(0, 0) = -16 < 0$, so $(0, 0)$ is a saddle point. $D(\frac{4}{9}, \frac{4}{3}) = 108 \cdot \frac{16}{27} - 16 = 48 > 0$ and $f_{xx}(\frac{4}{9}, \frac{4}{3}) = 54 \cdot \frac{4}{9} = 24 > 0$, so $(4/9, 4/3)$ is a local minimum.
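The arithmetic in this example is easy to get wrong by hand, so here is a small exact check using rational arithmetic. It is only a sketch: the function names `fx`, `fy`, and `disc` are illustrative. Note that $(f_{xy})^2 = (-4)^2 = 16$ is what gets subtracted in the discriminant.

```python
from fractions import Fraction as Fr

def fx(x, y):
    # f_x for f = 9x^3 - 4xy + y^3/3
    return 27 * x**2 - 4 * y

def fy(x, y):
    # f_y for the same f
    return y**2 - 4 * x

def disc(x, y):
    # f_xx * f_yy - f_xy^2 with f_xx = 54x, f_yy = 2y, f_xy = -4
    return (54 * x) * (2 * y) - (-4) ** 2

points = [(Fr(0), Fr(0)), (Fr(4, 9), Fr(4, 3))]
for x, y in points:
    # Both partials should vanish; the sign of disc and of f_xx = 54x classifies the point
    print((x, y), fx(x, y), fy(x, y), disc(x, y), 54 * x)
```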


8.1 Global Extreme Values

Definition. A function $f$ has an absolute (global) maximum at $(a, b)$ if $f(a, b) \ge f(x, y)$ for every $(x, y)$ in the domain of $f$. Likewise, $f$ has an absolute (global) minimum at $(a, b)$ if $f(a, b) \le f(x, y)$ for every $(x, y)$ in the domain of $f$.

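Example (104). Find the global extreme values of the “bowl” $z = f(x, y) = 4 + x^2 + 3y^2$.
$f_x(x, y) = 2x = 0$ if $x = 0$ and $f_y(x, y) = 6y = 0$ if $y = 0$. $f(0, 0) = 4$, and when $x \ne 0$, $x^2 > 0$, and if $y \ne 0$, $3y^2 > 0$, so $f(x, y) \ge 4 = f(0, 0)$ for all $(x, y)$ in $\mathbb{R}^2$. Hence $f$ has a global minimum value of 4 at $(0, 0)$; it has no global maximum, since $f$ is unbounded above.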

Recall the EVT for a one variable function: If $f$ is continuous on the closed interval $[a, b]$, then $f$ has an absolute maximum and an absolute minimum on $[a, b]$.

Definition. A set $D$ in $\mathbb{R}^2$ is closed if $D$ contains all of its boundary points. A set $D$ in $\mathbb{R}^2$ is bounded if it is completely contained in some disk.

Figure 6: Closed sets

Figure 7: Bounded sets

Theorem (15.7.8, Extreme Value Theorem on $\mathbb{R}^2$). If $f$ is continuous on a closed, bounded set $D$ in $\mathbb{R}^2$, then $f$ attains an absolute maximum value $f(x_0, y_0)$ and an absolute minimum value $f(x_1, y_1)$ at some points $(x_0, y_0)$ and $(x_1, y_1)$ in $D$.

Note: $f$ attains these absolute extreme values at a point in $D$ that is either a critical point of $f$ or a boundary point of $D$.

Example (105). Find the absolute maximum and absolute minimum of $f(x, y) = 2 + 2x + 2y - x^2 - y^2$ on a triangular plate in the first quadrant bounded by the lines $x = 0$, $y = 0$ and $y = 9 - x$.

1. Find the values of f at the critical points of f in D.

$f_x(x, y) = 2 - 2x = 2(1 - x) \Rightarrow f_x = 0$ if $x = 1$

$f_y(x, y) = 2 - 2y = 2(1 - y) \Rightarrow f_y = 0$ if $y = 1$

So the only critical point of $f$ in $D$ is the point $(1, 1)$. $f(1, 1) = 2 + 2 + 2 - 1 - 1 = 4$.


2. Find the extreme values of $f$ on the boundary of $D$. If $x = 0$, $g(y) = f(0, y) = 2 + 2y - y^2$. Then $g'(y) = 2 - 2y = 2(1 - y)$. So $g'(y) = 0$ only if $y = 1$. Then, $g(1) = f(0, 1) = 2 + 2 - 1 = 3$. Likewise, at the endpoints, $g(0) = f(0, 0) = 2$ and $g(9) = f(0, 9) = -61$.

If $y = 0$, $h(x) = f(x, 0) = 2 + 2x - x^2$. Then $h'(x) = 0$ if $2(1 - x) = 0$, or $x = 1$. We compute $h(1) = f(1, 0) = 2 + 2 - 1 = 3$, and the endpoints $h(0) = f(0, 0) = 2$ and $h(9) = f(9, 0) = -61$.

If $y = 9 - x$, then

\[ \begin{aligned} l(x) &= f(x, 9 - x) \\ &= 2 + 2x + 2(9 - x) - x^2 - (9 - x)^2 \\ &= 2 + 2x + 18 - 2x - x^2 - (81 - 18x + x^2) \\ &= -61 + 18x - 2x^2 \end{aligned} \]

So $l'(x) = 18 - 4x$ and $l'(x) = 0$ if $x = \frac{18}{4} = 9/2$. Then $y = 9 - 9/2 = 9/2$. Thus

\[ f\left(\tfrac{9}{2}, \tfrac{9}{2}\right) = 2 + 9 + 9 - \frac{81}{4} - \frac{81}{4} = 20 - \frac{81}{2} = \frac{40}{2} - \frac{81}{2} = -\frac{41}{2} \]

We already did the endpoints of this line segment: $(9, 0)$ and $(0, 9)$.

3. The largest of the values of $f$ from steps 1 and 2 is the absolute maximum value of $f$ on $D$. Similarly, the smallest value of $f$ is the absolute minimum value of $f$ on $D$. The candidates are

\[ f(1, 1) = 4; \quad f(0, 1) = 3 = f(1, 0); \quad f(0, 0) = 2; \quad f(0, 9) = -61 = f(9, 0); \quad f\left(\tfrac{9}{2}, \tfrac{9}{2}\right) = -\frac{41}{2} \]

So $f$ has an absolute maximum value of 4 at $(1, 1)$ and attains its absolute minimum value of $-61$ at $(9, 0)$ and $(0, 9)$.
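The three steps can be mirrored in a few lines of code. The sketch below evaluates $f$ at the candidate points and then, as an independent check, brute-forces a grid over the triangle. The grid resolution `n` and the variable names are arbitrary illustrative choices.

```python
def f(x, y):
    return 2 + 2 * x + 2 * y - x**2 - y**2

# Candidate points from steps 1 and 2
candidates = [(1, 1), (0, 1), (1, 0), (0, 0), (0, 9), (9, 0), (4.5, 4.5)]
values = {p: f(*p) for p in candidates}
best_max = max(values, key=values.get)   # candidate with the largest value
best_min_value = min(values.values())

# Brute-force check on a grid over the triangle x >= 0, y >= 0, x + y <= 9
n = 90  # grid resolution (an arbitrary choice)
grid = [f(9 * i / n, 9 * j / n) for i in range(n + 1) for j in range(n + 1 - i)]
print(best_max, values[best_max], best_min_value)
print(max(grid), min(grid))
```

A grid search like this cannot prove the result, but agreement with the candidate list is a useful safeguard against arithmetic slips on the boundary segments.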

8.2 Optimizations with Constraints

Find the maximum value of $f(x, y, z) = x^2 + y^2 + z^2$.
Find the minimum value of $f(x, y, z)$: It is 0, since $x^2, y^2, z^2 \ge 0$. $f$ is the square of the distance function.
What if we find the point on a surface that is closest to the origin?

Example (106). Find the point $P = (x, y, z)$ on the plane $2x + y - z = 5$ that lies closest to the origin.
The distance of a point $P$ to the origin is $|\vec{OP}| = \sqrt{x^2 + y^2 + z^2}$. This will be minimized when $f(x, y, z) = x^2 + y^2 + z^2$ is minimized.
So we want to solve an optimization problem: Minimize $f(x, y, z) = x^2 + y^2 + z^2$ (the objective function) subject to $2x + y - z = 5$ (the constraint).
The constraint can be written as $z = 2x + y - 5$. In this case we can define

\[ h(x, y) = f(x, y, 2x + y - 5) = x^2 + y^2 + (2x + y - 5)^2 \]

Now we solve our constrained optimization problem by minimizing h(x, y).


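\[ \begin{aligned} h_x(x, y) &= 2x + 2(2x + y - 5) \cdot 2 = 2x + 8x + 4y - 20 = 10x + 4y - 20 \\ h_y(x, y) &= 2y + 2(2x + y - 5) = 2y + 4x + 2y - 10 = 4x + 4y - 10 \end{aligned} \]

$0 = h_x(x, y)$ when $10x + 4y = 20$, and $0 = h_y(x, y)$ when $4x + 4y = 10$. Subtracting the second equation from the first, both are zero if $6x + 0 = 10$, therefore $x = \frac{5}{3}$.
So, $4(\frac{5}{3}) + 4y = 10$, hence $4y = \frac{30}{3} - \frac{20}{3} = \frac{10}{3}$. Therefore, $y = \frac{10}{12} = \frac{5}{6}$.
So $h(x, y)$ is minimized at $(\frac{5}{3}, \frac{5}{6})$. Then

\[ z = 2\left(\frac{5}{3}\right) + \frac{5}{6} - 5 = \frac{10}{3} + \frac{5}{6} - 5 = \frac{20}{6} + \frac{5}{6} - \frac{30}{6} = -\frac{5}{6} \]

So $P = (\frac{5}{3}, \frac{5}{6}, -\frac{5}{6})$.

The minimum value of $f$ is $\frac{100}{36} + \frac{25}{36} + \frac{25}{36} = \frac{150}{36}$.

Then the minimum distance from the plane $2x + y - z = 5$ to $\vec{0}$ is $|\vec{OP}| = \sqrt{\frac{150}{36}} = \frac{5\sqrt{6}}{6} \approx 2.04$.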
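The two critical-point equations form a $2 \times 2$ linear system, so the hand computation can be checked exactly. The sketch below solves it with Cramer's rule in rational arithmetic and cross-checks the distance against the standard point-to-plane formula $|k| / |\vec{n}|$. Cramer's rule is just one convenient way to solve the system here, and the variable names are illustrative.

```python
from fractions import Fraction as Fr
import math

# Critical-point equations of h: 10x + 4y = 20 and 4x + 4y = 10
a11, a12, b1 = Fr(10), Fr(4), Fr(20)
a21, a22, b2 = Fr(4), Fr(4), Fr(10)

det = a11 * a22 - a12 * a21      # 40 - 16 = 24, nonzero so the solution is unique
x = (b1 * a22 - a12 * b2) / det  # Cramer's rule
y = (a11 * b2 - b1 * a21) / det
z = 2 * x + y - 5                # back-substitute into the constraint

dist = math.sqrt(x**2 + y**2 + z**2)
print(x, y, z, dist)

# Cross-check: distance from the origin to the plane n . p = k is |k| / |n|
plane_dist = 5 / math.sqrt(2**2 + 1**2 + (-1) ** 2)
print(plane_dist)
```

The determinant 24 is also the discriminant $h_{xx} h_{yy} - (h_{xy})^2 = 10 \cdot 4 - 4^2$, and $h_{xx} = 10 > 0$, so the Second Derivative Test confirms that the critical point is a local minimum.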

9 Lagrange Multipliers

Warm-up. Find the critical points of $h(x, y) = x^2 + y^2 + (2x + y - 5)^2$.

Alternative solution to the warm-up: Look at the plane in space. Take a ball of radius 1; all points on the surface of this ball are a distance 1 from the origin. Let this ball expand until it touches the plane, which will happen at exactly one point $(x_0, y_0, z_0)$. This plane is the tangent plane to the sphere at this point of intersection.

$f(x, y, z) = x^2 + y^2 + z^2$

$g(x, y, z) = 2x + y - z$ with $g(x, y, z) = 5$

So $\nabla f(x_0, y_0, z_0)$ and $\nabla g(x_0, y_0, z_0)$ will be parallel. We can therefore write $\nabla f = \lambda \nabla g$ for some $\lambda$ in $\mathbb{R}$.


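$\nabla f(x, y, z) = (2x, 2y, 2z)$

$\nabla g(x, y, z) = (2, 1, -1)$

If $\nabla f(x, y, z) = \lambda \nabla g(x, y, z)$, then:

\[ (2x, 2y, 2z) = \lambda (2, 1, -1) \]

$2x = 2\lambda$, so $x = \lambda$

$2y = \lambda$, so $y = \frac{1}{2}\lambda$

$2z = -\lambda$, so $z = -\frac{1}{2}\lambda$

$2x + y - z = 5$, since $g(x, y, z) = 5$

Then $2\lambda + \frac{1}{2}\lambda + \frac{1}{2}\lambda = 5$, so $\lambda = \frac{5}{3}$. Then $(x_0, y_0, z_0) = (\frac{5}{3}, \frac{5}{6}, -\frac{5}{6})$.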
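The same steps work for any plane $\vec{n} \cdot \vec{p} = k$: from $2\vec{p} = \lambda \vec{n}$ we get $\vec{p} = \frac{\lambda}{2}\vec{n}$, and substituting into the constraint gives $\lambda = 2k / |\vec{n}|^2$. The sketch below carries this out in exact arithmetic. The function name `closest_point_on_plane` is hypothetical and the generalization is an illustration, not something stated in the notes.

```python
from fractions import Fraction as Fr

def closest_point_on_plane(n, k):
    """Solve 2p = lam * n together with n . p = k (integer n and k assumed)."""
    nn = sum(c * c for c in n)         # |n|^2
    lam = Fr(2 * k, nn)                # from n . (lam * n / 2) = k
    p = tuple(lam * c / 2 for c in n)  # p = lam * n / 2
    return lam, p

lam, p = closest_point_on_plane((2, 1, -1), 5)
print(lam, p)
```

For the plane $2x + y - z = 5$, this gives the same multiplier and the same point as the substitution method in Example 106.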

9.1 Method of Lagrange Multipliers for Constrained Optimization

To find the maximum and minimum values of $f(x, y, z)$ subject to the constraint $g(x, y, z) = k$, if $\nabla g \ne \vec{0}$ on the surface $g(x, y, z) = k$, then

(1) Find all values of $x, y, z$ and $\lambda$ such that $\nabla f(x, y, z) = \lambda \nabla g(x, y, z)$ and $g(x, y, z) = k$.

(2) Evaluate $f$ at all points found in step 1. Then the largest (smallest) value is the maximum (minimum) value of $f$ under the constraint.

Comment: This method is valid regardless of the number of variables. In particular, in two dimensions: $\nabla f(x, y) = \lambda \nabla g(x, y)$ and $g(x, y) = k$.

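Example (107). Find the extreme values of $f(x, y) = xy$ on the ellipse $\frac{x^2}{8} + \frac{y^2}{2} = 1$.
$f(x, y) = xy$, so $\nabla f(x, y) = (y, x)$, and $g(x, y) = \frac{x^2}{8} + \frac{y^2}{2}$, so $\nabla g(x, y) = (\frac{1}{4}x, y)$.
We need to find all $\lambda$ so that $(y, x) = \lambda(\frac{1}{4}x, y)$. It follows that $y = \frac{1}{4}\lambda x$ and $x = \lambda y$, so $x = \lambda(\frac{1}{4}\lambda)x = \frac{1}{4}\lambda^2 x$.
If $x = 0$, then $y = \frac{1}{4}\lambda \cdot 0 = 0$. But $g(0, 0) = 0 \ne 1$. So $x \ne 0$. If $x \ne 0$, we have $\frac{1}{4}\lambda^2 = 1$, or $\lambda = \pm 2$. Then $x = \pm 2y$.

\[ 1 = g(\pm 2y, y) = \frac{1}{8}(\pm 2y)^2 + \frac{1}{2}y^2 = \frac{1}{2}y^2 + \frac{1}{2}y^2 = y^2 \]

so $y^2 = 1$, or $y = \pm 1$.
If $y = 1$, $x = \pm 2$; if $y = -1$, $x = \mp 2$. Thus the possible points which optimize this constrained problem are $(\pm 2, \pm 1)$.
$f(2, 1) = 2 = f(-2, -1)$ and $f(-2, 1) = -2 = f(2, -1)$. The extreme values of $f$ on this ellipse are the maximum 2, attained at $(2, 1)$ and $(-2, -1)$, and the minimum $-2$, attained at $(-2, 1)$ and $(2, -1)$.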
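One way to double-check this answer without Lagrange multipliers is to parametrize the ellipse as $x = \sqrt{8}\cos t$, $y = \sqrt{2}\sin t$ and sample $f$ along it. The sketch below does this; the parametrization and the sample count `n` are choices made for illustration.

```python
import math

def f(x, y):
    return x * y

# Parametrize the ellipse x^2/8 + y^2/2 = 1 by x = sqrt(8) cos t, y = sqrt(2) sin t
n = 3600  # number of sample angles (an arbitrary choice)
samples = []
for i in range(n):
    t = 2 * math.pi * i / n
    x, y = math.sqrt(8) * math.cos(t), math.sqrt(2) * math.sin(t)
    samples.append((f(x, y), x, y))

fmax, xmax, ymax = max(samples)  # tuples compare by f-value first
fmin, xmin, ymin = min(samples)
print(fmax, (xmax, ymax))
print(fmin, (xmin, ymin))
```

Along this parametrization $f = 4\cos t \sin t = 2\sin 2t$, which makes the extreme values $\pm 2$ visible directly.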

Example (108). Find the largest product that three positive numbers $x, y, z$ can have if $x + y + z^2 = 16$.
We can turn this into an optimization problem: Maximize $f(x, y, z) = xyz$ subject to the constraint $g(x, y, z) = x + y + z^2 = 16$.

$\nabla f(x, y, z) = (yz, xz, xy)$, $\nabla g(x, y, z) = (1, 1, 2z)$


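Setting $\nabla f = \lambda \nabla g$ gives

$yz = \lambda$ and $xz = \lambda$; if $z \ne 0$, $yz = xz \Rightarrow y = x$

$xy = 2\lambda z$, so $(xz)(yz) = 2\lambda z^3 \Rightarrow \lambda^2 = 2\lambda z^3 \iff \frac{\lambda}{2} = z^3$ (since $\lambda = yz \ne 0$)

Multiplying the constraint $x + y + z^2 = 16$ by $z$:

\[ \begin{aligned} xz + yz + z^3 &= 16z \\ \lambda + \lambda + \frac{\lambda}{2} &= 16z \\ \frac{5}{2}\lambda &= 16z \\ \lambda &= \frac{32}{5}z \end{aligned} \]

Then $yz = \frac{32}{5}z$, or $y = \frac{32}{5}$, so $x = \frac{32}{5}$.

Then $z^2 = 16 - \frac{64}{5} = \frac{80}{5} - \frac{64}{5} = \frac{16}{5}$ and $z = \frac{4}{\sqrt{5}}$.

Thus,

\[ f\left(\frac{32}{5}, \frac{32}{5}, \frac{4}{\sqrt{5}}\right) = \left(\frac{32}{5}\right)\left(\frac{32}{5}\right)\left(\frac{4\sqrt{5}}{5}\right) = \frac{4096\sqrt{5}}{125} \approx 73.27 \]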
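As a rough numerical check of this result, one can eliminate $z$ with the constraint and search a grid of $(x, y)$ values. The sketch below does so; the step size 0.1 and the function name `product` are arbitrary choices, and a grid search only suggests, rather than proves, that the candidate is the maximum.

```python
import math

def product(x, y):
    # Eliminate z with the constraint x + y + z^2 = 16, taking z > 0
    return x * y * math.sqrt(16 - x - y)

# Value at the Lagrange-multiplier candidate x = y = 32/5
candidate = product(32 / 5, 32 / 5)
exact = 4096 * math.sqrt(5) / 125

# Coarse grid search over x, y > 0 with x + y < 16 (step 0.1 is an arbitrary choice)
best = max(
    (product(i / 10, j / 10), i / 10, j / 10)
    for i in range(1, 160)
    for j in range(1, 160 - i)
)
print(candidate, exact)
print(best)  # (largest grid value, x, y)
```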
