Equality Constraints-print Version


  • Equality Constraints and the Theorem of Lagrange. Constrained Optimization Problems.

    It is rare that optimization problems have unconstrained solutions. Usually some or all of the constraints matter.

    Before we begin our study of the solution of constrained optimization problems, we first put some additional structure on our constraint set D and make a few definitions.

    Definition. Let U be an open subset of R^n.

    An equality constrained optimization problem is an optimization problem in which the constraint set D can be represented as

    D = U ∩ {x ∈ R^n | g(x) = 0},

    where g : R^n → R^k. We refer to the functions g = (g1, ..., gk) as equality constraints.

    An inequality constrained optimization problem is an optimization problem in which the constraint set D can be represented as

    D = U ∩ {x ∈ R^n | h(x) ≥ 0},

    where h : R^n → R^l. We refer to the functions h = (h1, ..., hl) as inequality constraints.

    An optimization problem with mixed constraints is an optimization problem in which the constraint set D can be represented as

    D = U ∩ {x ∈ R^n | g(x) = 0 and h(x) ≥ 0},

    where there are both equality and inequality constraints.

    The given specifications of the constraint set D are very general. For instance, nonnegativity constraints can be easily handled. If a problem requires that x ∈ R^n_+, we can simply define functions hj : R^n → R by

    hj(x) = xj, j = 1, ..., n,

    and use the n inequality constraints

    hj(x) ≥ 0, j = 1, ..., n.

  • Similarly, we can rewrite the constraints on the left as the equivalent ones on the right:

    φ(x) ≥ a  ⟺  φ(x) − a ≥ 0,
    φ(x) ≤ b  ⟺  b − φ(x) ≥ 0,
    φ(x) = c  ⟺  c − φ(x) = 0.

    Example 1. Consider the budget set

    B(p, I) = {x ∈ R^n_+ | p · x ≤ I}

    of the utility maximization problem.

    This can be represented using n + 1 inequality constraints. Define h : R^n → R^{n+1} by

    hj(x) = xj, j = 1, ..., n,
    hk(x) = I − p · x, k = n + 1.

    Then

    B(p, I) = {x ∈ R^n | hj(x) ≥ 0, j = 1, ..., n + 1}

    is the budget set written in the specified form.
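    The n + 1 constraint functions of Example 1 are easy to code directly. The sketch below is illustrative only; the helper names and the numbers p = (2, 3), I = 12 are my own choices, not from the text.

    ```python
    # Budget set B(p, I) in R^n_+ expressed through n + 1 inequality
    # constraints h_j(x) >= 0, as in Example 1.
    def budget_constraints(p, I):
        n = len(p)
        # h_j(x) = x_j for j = 1, ..., n  (nonnegativity)
        hs = [lambda x, j=j: x[j] for j in range(n)]
        # h_{n+1}(x) = I - p.x  (affordability)
        hs.append(lambda x: I - sum(pi * xi for pi, xi in zip(p, x)))
        return hs

    def feasible(x, hs):
        """x is in B(p, I) iff every h_j(x) >= 0."""
        return all(h(x) >= 0 for h in hs)

    hs = budget_constraints(p=[2.0, 3.0], I=12.0)
    print(feasible([3.0, 1.0], hs))   # True: 2*3 + 3*1 = 9 <= 12
    print(feasible([3.0, 3.0], hs))   # False: 2*3 + 3*3 = 15 > 12
    print(feasible([-1.0, 2.0], hs))  # False: violates x1 >= 0
    ```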

    Equality Constraints and the Theorem of Lagrange.

    We will study equality constrained problems and begin with a graphical proof of the theorem of Lagrange with two choice variables and one constraint.

    Consider the problem

    Maximize f(x1, x2) subject to g(x1, x2) = 0.

    Here the constraint set is D = {(x1, x2) ∈ R^2 | g(x1, x2) = 0}. We will look at this problem graphically.

    First we draw the constraint set D in the x1x2-plane. It is represented by the red line.

    Then we draw, in blue, level curves of the objective function f. Our goal is to find the highest level curve of f which meets the constraint set.

    The highest level curve of f cannot cross the constraint curve D. If it did, as at point b, nearby higher level sets would cross too.

    Thus the highest level curve of f to touch the constraint set D must be tangent at the constrained maximizer, x*.


  • Figure 1: At the constrained maximizer x*, the highest level curve of f is tangent to the constraint set D.

    We can use our knowledge of the implicit function theorem to represent this condition mathematically.

    Since the level curve of f is tangent to the constraint set D at the constrained maximizer x*, the slopes of the level set of f and of the constraint curve must be equal at x*.

    Since the level set at x* is given by the equation f(x1, x2) = f(x1*, x2*), we can use the implicit function theorem to calculate its slope as

    −(∂f/∂x1)(x*) / (∂f/∂x2)(x*).

    Similarly, the constraint set is given by the implicit function g(x1, x2) = 0 and so its slope at x* is

    −(∂g/∂x1)(x*) / (∂g/∂x2)(x*).

    Since these two slopes are equal at x*, we have

    (∂f/∂x1)(x*) / (∂g/∂x1)(x*) = (∂f/∂x2)(x*) / (∂g/∂x2)(x*) = −λ*. (1)

    We can rewrite this as two equations:

    (∂f/∂x1)(x*) + λ* (∂g/∂x1)(x*) = 0,
    (∂f/∂x2)(x*) + λ* (∂g/∂x2)(x*) = 0.

    Since we have to solve for three unknowns, (x1, x2, λ), we need three equations. The third is the constraint equation g(x1, x2) = 0.


  • So we have a system of three equations in three unknowns:

    (∂f/∂x1)(x*) + λ* (∂g/∂x1)(x*) = 0,
    (∂f/∂x2)(x*) + λ* (∂g/∂x2)(x*) = 0,
    g(x1, x2) = 0.

    A convenient way of writing this is to form the Lagrangean (function)

    L(x1, x2, λ) = f(x1, x2) + λ g(x1, x2).

    The critical points of the Lagrangean are found by computing ∂L/∂x1, ∂L/∂x2 and ∂L/∂λ and setting them equal to zero. But this gives the system of three equations above.

    We have transformed a two variable constrained problem into an unconstrained problem of three variables.
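    To make the three-equation system concrete, here is a minimal sketch for a toy problem of my own choosing (not from the text): maximize f(x1, x2) = x1x2 subject to g(x1, x2) = x1 + x2 − 2 = 0, with L = f + λg.

    ```python
    # Gradient of L(x1, x2, lam) = x1*x2 + lam*(x1 + x2 - 2):
    # exactly the three equations of the system above.
    def grad_L(x1, x2, lam):
        dL_dx1 = x2 + lam          # f_x1 + lam * g_x1
        dL_dx2 = x1 + lam          # f_x2 + lam * g_x2
        dL_dlam = x1 + x2 - 2.0    # the constraint g(x1, x2) = 0
        return (dL_dx1, dL_dx2, dL_dlam)

    # (x1*, x2*, lam*) = (1, 1, -1) solves all three equations at once:
    print(grad_L(1.0, 1.0, -1.0))  # (0.0, 0.0, 0.0)
    ```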

    Note, in equation (1), we need ∂g/∂x1 and/or ∂g/∂x2 to be nonzero at the constrained maximizer x*. This restriction is called the constraint qualification.

    Theorem 1. Let f, g : R^2 → R be C^1 functions. Suppose that x* = (x1*, x2*) is a local maximizer or minimizer of f subject to g(x1, x2) = 0. Suppose also that Dg(x1*, x2*) ≠ 0. Then, there exists a scalar λ* ∈ R such that (x1*, x2*, λ*) is a critical point of the Lagrangean

    L(x1, x2, λ) = f(x1, x2) + λ g(x1, x2).

    In other words, at (x1*, x2*, λ*),

    ∂L/∂x1 = 0,  ∂L/∂x2 = 0  and  ∂L/∂λ = 0.

    We now present the theorem of Lagrange in the general case of optimization of a function in n variables subject to k equality constraints.

    Theorem 2 (The Theorem of Lagrange). Let f : R^n → R and gi : R^n → R for i = 1, ..., k be C^1 functions. Suppose x* is a local maximizer or minimizer of f on the constraint set

    D = U ∩ {x ∈ R^n | g(x) = 0},

    where U ⊆ R^n is open. Suppose also that rank Dg(x*) = k. Then, there exists a vector λ* = (λ1*, ..., λk*) ∈ R^k such that (x*, λ*) is a critical point of the Lagrangean

    L(x, λ) = f(x) + Σ_{i=1}^{k} λi gi(x).


  • That is,

    Df(x*) + Σ_{i=1}^{k} λi* Dgi(x*) = 0,

    or, writing out the system explicitly,

    ∂L/∂xj (x*, λ*) = 0, j = 1, ..., n,
    ∂L/∂λi (x*, λ*) = 0, i = 1, ..., k.

    The theorem of Lagrange only provides necessary conditions for local optima x*. Furthermore, these conditions only apply to those local optima x* which also meet the constraint qualification rank Dg(x*) = k.

    These conditions are not sufficient. That is, the theorem does not say that if there exists (x*, λ*) such that g(x*) = 0 and Df(x*) + Σ_{i=1}^{k} λi* Dgi(x*) = 0, then x* must be a local maximum or a local minimum, even if it also meets the constraint qualification.

    The following example shows that the conditions of the theorem cannot be sufficient.

    Example 2. Let f and g be functions defined by f(x, y) = x^3 + y^3 and g(x, y) = x − y, and consider the equality constrained problem of maximizing or minimizing f(x, y) over the constraint set D = {(x, y) ∈ R^2 | g(x, y) = 0}. Let (x*, y*) = (0, 0) and let λ* = 0. Then g(x*, y*) = 0, so that (x*, y*) is a feasible point.

    It also meets the constraint qualification, since Dg(x, y) = (1, −1) ≠ 0 for any (x, y).

    Finally, since Df(x, y) = (3x^2, 3y^2), we have

    Df(x*, y*) + λ* Dg(x*, y*) = (0, 0) + 0 · (1, −1) = (0, 0).

    Hence, if the conditions of the theorem of Lagrange were also sufficient, then (x*, y*) would be either a local minimizer or maximizer of f on D. However, we have neither.

    We have f(x*, y*) = 0. But for every ε > 0, it is the case that (−ε, −ε) ∈ D and (ε, ε) ∈ D. Furthermore,

    f(−ε, −ε) = −2ε^3 < f(x*, y*)

    and

    f(ε, ε) = 2ε^3 > f(x*, y*),

    so that (x*, y*) is not a local maximizer or minimizer.
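    A quick numeric check of Example 2 (the value ε = 0.1 is an arbitrary choice of mine):

    ```python
    # f(x, y) = x^3 + y^3 and g(x, y) = x - y from Example 2.
    def f(x, y):
        return x**3 + y**3

    def g(x, y):
        return x - y

    # (0, 0) with lam = 0 is feasible and satisfies the first-order conditions:
    # Df(0,0) + 0 * Dg(0,0) = (0, 0) + 0 * (1, -1) = (0, 0).
    assert g(0.0, 0.0) == 0.0

    # Yet it is neither a local max nor a local min on D = {x = y}:
    eps = 0.1
    print(f(-eps, -eps))  # about -0.002, below f(0, 0) = 0
    print(f(eps, eps))    # about  0.002, above f(0, 0) = 0
    ```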


  • The Lagrangean Multipliers.

    The vector λ* = (λ1*, ..., λk*) in the theorem of Lagrange is called the vector of Lagrangean multipliers corresponding to the local optimum x*.

    The ith multiplier λi* measures the sensitivity of the value of the objective function at x* to a small relaxation of the ith constraint gi.

    For simplicity, we will look at the optimization of a function f in two variables subject to one constraint, which we will write as c − g(x, y) = 0. Then a relaxation of the constraint corresponds to an increase in c.

    The Lagrangean for this problem is

    L(x, y, λ, c) = f(x, y) + λ [c − g(x, y)],

    with c entering as a parameter.

    By Lagrange's theorem, a local optimizer (x(c), y(c), λ(c)) satisfies

    (∂f/∂x)(x(c), y(c)) − λ(c) (∂g/∂x)(x(c), y(c)) = 0,
    (∂f/∂y)(x(c), y(c)) − λ(c) (∂g/∂y)(x(c), y(c)) = 0.

    Also, since g(x(c), y(c)) = c for all c, we have

    (∂g/∂x)(x(c), y(c)) (dx/dc)(c) + (∂g/∂y)(x(c), y(c)) (dy/dc)(c) = 1.

    Now by the chain rule, and the equations above, we have

    (df/dc)(x(c), y(c)) = (∂f/∂x)(x(c), y(c)) (dx/dc) + (∂f/∂y)(x(c), y(c)) (dy/dc)
                        = λ(c) [(∂g/∂x)(x(c), y(c)) (dx/dc) + (∂g/∂y)(x(c), y(c)) (dy/dc)]
                        = λ(c)
                        = (∂L/∂c)(x(c), y(c), λ(c)).  (Envelope Theorem)

    Example 3. Consider the consumer's utility maximization problem in which the budget constraint holds with equality:

    Maximize u(x1, x2) subject to I − p1x1 − p2x2 = 0.

    Here we can look at relaxing the budget constraint by increasing income by one unit.

    By the preceding calculations, at the optimum, an increase in income by one unit will raise utility by λ* units, where λ* is the Lagrange multiplier at the optimum.

    Thus λ* represents the consumer's marginal utility of income.
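    The envelope result can be checked numerically. This sketch uses the closed-form Cobb–Douglas solution derived in Example 4 below (xi* = I/2pi, λ* = I/2p1p2); the values p1 = 2, p2 = 4, I = 16 are made up for illustration.

    ```python
    # Value function V(I) = u(x1*(I), x2*(I)) = (I/2p1)*(I/2p2) = I^2/(4*p1*p2)
    # and multiplier lam* = I/(2*p1*p2); the claim is V'(I) = lam*.
    p1, p2 = 2.0, 4.0

    def V(I):
        return I**2 / (4.0 * p1 * p2)

    I = 16.0
    lam_star = I / (2.0 * p1 * p2)   # = 1.0

    h = 1e-5
    dV_dI = (V(I + h) - V(I - h)) / (2.0 * h)  # central-difference derivative
    print(abs(dV_dI - lam_star) < 1e-8)        # True: lam* is the marginal utility of income
    ```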


  • Second Order Conditions.

    Now we present the second order conditions for two variable optimization subject to a single equality constraint.

    Theorem 3. Let f, g : R^2 → R be C^2 functions. Let D = {(x, y) ∈ R^2 | g(x, y) = 0}. Suppose that at (x*, y*, λ*),

    ∂L/∂x = 0,  ∂L/∂y = 0  and  ∂L/∂λ = 0.

    Let H = D²_{(λ,x,y)}L denote the Hessian of the Lagrangean at (x*, y*, λ*):

    H = | 0   gx   gy  |
        | gx  Lxx  Lxy |
        | gy  Lyx  Lyy |

    1. If |H| > 0, then (x*, y*) is a strict local maximizer of f on D.

    2. If |H| < 0, then (x*, y*) is a strict local minimizer of f on D.
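    For instance, for the toy problem used earlier (my own example, not from the text): f(x, y) = xy and g(x, y) = x + y − 2 have the critical point (1, 1) with λ* = −1, and the bordered-Hessian test classifies it:

    ```python
    # Bordered Hessian |H| for f(x,y) = x*y, g(x,y) = x + y - 2 at (1, 1):
    # g_x = g_y = 1, L_xx = L_yy = 0, L_xy = L_yx = 1.
    def det3(m):
        a, b, c = m[0]
        d, e, f = m[1]
        g, h, i = m[2]
        return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)

    H = [[0.0, 1.0, 1.0],
         [1.0, 0.0, 1.0],
         [1.0, 1.0, 0.0]]

    print(det3(H))  # 2.0 > 0, so (1, 1) is a strict local maximizer
    ```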

    Next we consider the more general problem of maximizing or minimizing f : R^n → R over the set

    D = U ∩ {x | g(x) = 0},

    where g : R^n → R^k, and U is an open subset of R^n. We will assume that f and g are both C^2 functions.

    We form the Lagrangean L(x, λ) = f(x) + Σ_{i=1}^{k} λi gi(x). The second derivative D²_x L(x, λ) of L(·, λ) with respect to the x variables is the n × n matrix defined by

    D²_x L(x, λ) = D²f(x) + Σ_{i=1}^{k} λi D²gi(x).

    Since f and g are C^2 functions, so is L(·, λ) for any given λ ∈ R^k. Thus, D²_x L(x, λ) is a symmetric matrix and defines a quadratic form on R^n.

    Theorem 4. Suppose there exist points x* ∈ D and λ* ∈ R^k such that rank Dg(x*) = k and Df(x*) + Σ_{i=1}^{k} λi* Dgi(x*) = 0. Define

    Z(x*) = {z ∈ R^n | Dg(x*) z = 0}

    and let D²_x L* denote D²_x L(x*, λ*) = D²f(x*) + Σ_{i=1}^{k} λi* D²gi(x*).

    1. If f has a local maximum at x*, then zᵀ (D²_x L*) z ≤ 0 for all z ∈ Z(x*).

    2. If f has a local minimum at x*, then zᵀ (D²_x L*) z ≥ 0 for all z ∈ Z(x*).


  • 3. If zᵀ (D²_x L*) z < 0 for all z ∈ Z(x*) with z ≠ 0, then x* is a strict local maximizer of f on D.

    4. If zᵀ (D²_x L*) z > 0 for all z ∈ Z(x*) with z ≠ 0, then x* is a strict local minimizer of f on D.

    Note the similarity between the conditions of the theorem and the corresponding theorem for unconstrained maximization problems. There are two important differences.

    Here the second order conditions are stated in terms of the second derivatives of the Lagrangean instead of the function f.

    The properties of the quadratic form D²_x L(x*, λ*) are only required to hold on a subset of R^n defined by Z(x*).

    How do we verify the definiteness of D²_x L(x*, λ*) on the constraint set Z? We form a (k + n) × (k + n) matrix consisting of the Hessian of the Lagrangean with respect to x bordered by the Jacobian of the constraint functions g. This is sometimes called a bordered Hessian.

    It is the Hessian of the Lagrangean with respect to both λ and x. We will denote this matrix by H:

    H = D²_{(λ,x)}L(x*, λ*) = ( 0         Dg(x*)         )
                              ( Dg(x*)ᵀ   D²_x L(x*, λ*) )

    where the upper-left block is the k × k zero matrix, Dg(x*) is the k × n Jacobian of the constraints, with rows (∂gi/∂x1, ..., ∂gi/∂xn), and D²_x L(x*, λ*) is the n × n matrix of second partials ∂²L/∂xi∂xj.

    If the determinant of H has the same sign as (−1)^n, and the last n − k leading principal minors of H alternate in sign, then the condition in part (3) of the theorem holds and x* is a local maximizer.

    If the last n − k leading principal minors of H have the same sign as (−1)^k, then the condition in part (4) holds and x* is a local minimizer.

    If both of the above conditions on H are violated by nonzero leading principal minors, then the conditions in parts (1) and (2) do not hold and x* is neither a local maximizer nor a local minimizer.


  • We summarize these results in the following theorem.

    Theorem 5. Suppose there exist points x* ∈ D and λ* ∈ R^k such that rank Dg(x*) = k and Df(x*) + Σ_{i=1}^{k} λi* Dgi(x*) = 0. Consider the bordered Hessian H given above, and let Hr denote the rth order leading principal submatrix of H.

    1. If (−1)^{r−k} |Hr| > 0 for all r = 2k + 1, ..., n + k, then x* is a strict local maximizer of f on D.

    2. If (−1)^k |Hr| > 0 for all r = 2k + 1, ..., n + k, then x* is a strict local minimizer of f on D.

    3. If either of the above conditions is violated by nonzero leading principal minors, then x* is neither a local maximizer nor a local minimizer.
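    The minor test of Theorem 5 can be sketched in code. The helper names below (`det`, `classify`) are my own, and the example reuses the toy problem f = xy, g = x + y − 2; for n = 2, k = 1 the only minor checked is r = 3, the full bordered Hessian.

    ```python
    # Sketch of the leading-principal-minor test in Theorem 5 for the
    # bordered Hessian H of an n-variable, k-constraint problem.
    def det(m):
        # cofactor expansion along the first row
        if len(m) == 1:
            return m[0][0]
        return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
                   for j in range(len(m)))

    def classify(H, n, k):
        # Leading principal minors |H_r| for r = 2k + 1, ..., n + k.
        minors = {r: det([row[:r] for row in H[:r]]) for r in range(2 * k + 1, n + k + 1)}
        if all((-1) ** (r - k) * m > 0 for r, m in minors.items()):
            return "strict local maximizer"
        if all((-1) ** k * m > 0 for m in minors.values()):
            return "strict local minimizer"
        return "test inconclusive or violated"

    H = [[0, 1, 1],   # bordered Hessian for f = x*y, g = x + y - 2 at (1, 1)
         [1, 0, 1],
         [1, 1, 0]]
    print(classify(H, n=2, k=1))  # strict local maximizer
    ```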

    Using the Theorem of Lagrange.

    We now describe a cookbook procedure for using the theorem of Lagrange to solve a maximization problem.

    Consider an equality constrained optimization problem of the form

    Maximize f(x) subject to x ∈ D = U ∩ {x | g(x) = 0},

    where f : R^n → R and g : R^n → R^k are C^1 functions and U is an open subset of R^n.

    1. Set up a function L : D × R^k → R, called the Lagrangean, defined by

    L(x, λ) = f(x) + Σ_{i=1}^{k} λi gi(x).

    The vector λ = (λ1, ..., λk) ∈ R^k is called the vector of Lagrange multipliers.

    2. Find the set of all critical points of L for which x ∈ U, i.e. all points (x, λ) at which DL(x, λ) = 0 and x ∈ U. Since x ∈ R^n and λ ∈ R^k, this condition results in a system of n + k equations in n + k unknowns:

    ∂L/∂xj (x, λ) = 0, j = 1, ..., n,
    ∂L/∂λi (x, λ) = 0, i = 1, ..., k.

    Let M be the set of solutions to these equations for which x ∈ U:

    M = {(x, λ) | x ∈ U and DL(x, λ) = 0}.


  • 3. Finally, evaluate f at each point x in the set

    {x ∈ R^n | there is some λ such that (x, λ) ∈ M}.

    The values of x which maximize f over this set are also usually solutions to the equality constrained maximization problem.
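    The three steps can be sketched end to end on a toy problem of my own (not from the text): maximize f(x, y) = x + y subject to g(x, y) = 1 − x² − y² = 0. Step 2's system DL = 0 is solved here with Newton's method from two starting points, and step 3 compares f at the resulting critical points.

    ```python
    # Step 1: L(x, y, lam) = x + y + lam * (1 - x^2 - y^2).
    # Step 2: solve DL = 0, i.e. F(x, y, lam) = 0, by Newton's method.
    def F(x, y, lam):
        return [1.0 - 2.0 * lam * x,     # dL/dx
                1.0 - 2.0 * lam * y,     # dL/dy
                1.0 - x * x - y * y]     # dL/dlam

    def J(x, y, lam):                    # Jacobian of F
        return [[-2.0 * lam, 0.0, -2.0 * x],
                [0.0, -2.0 * lam, -2.0 * y],
                [-2.0 * x, -2.0 * y, 0.0]]

    def solve3(A, b):                    # Cramer's rule for a 3x3 linear system
        def det3(m):
            return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                    - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                    + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
        d = det3(A)
        sol = []
        for j in range(3):
            Aj = [row[:] for row in A]
            for i in range(3):
                Aj[i][j] = b[i]
            sol.append(det3(Aj) / d)
        return sol

    def newton(x, y, lam, steps=50):
        for _ in range(steps):
            fx = F(x, y, lam)
            dx, dy, dlam = solve3(J(x, y, lam), [-v for v in fx])
            x, y, lam = x + dx, y + dy, lam + dlam
        return x, y, lam

    # Two critical points of L (the set M of step 2), from different starts:
    crit = [newton(0.5, 0.5, 1.0), newton(-0.5, -0.5, -1.0)]
    # Step 3: evaluate f = x + y at each critical point and keep the largest.
    best = max(crit, key=lambda p: p[0] + p[1])
    print(round(best[0], 6), round(best[1], 6))  # the maximizer (1/sqrt(2), 1/sqrt(2))
    ```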

    Theorem 6. Suppose the following two conditions hold.

    1. A global optimum x* exists to the given problem.

    2. The constraint qualification is met at x*.

    Then there exists a λ* such that (x*, λ*) is a critical point of L.

    Under the two conditions above, the Lagrangean method will be successful in finding the optimum x*.

    This result also explains why the Lagrangean method usually works in practice. The existence of a solution is usually not a problem (check using the Weierstrass theorem) and neither is the constraint qualification.

    Although, in general, it is not possible to verify that the constraint qualification holds beforehand, it is often the case that the constraint qualification holds everywhere on the feasible set D.

    In particular, if there is a single linear constraint and two choice variables, the constraint qualification will hold at all x ∈ D.

    Unfortunately, if the conditions of theorem (6) fail to hold, the procedure can also fail to identify global optima.

    First, if a global optimum exists but the constraint qualification is not met at the optimum, then the optimum will not be found among the set of critical points.

    Second, even if the constraint qualification holds everywhere on D, the procedure can fail simply because no global optimum exists.

    Most problems in economic theory involve inequality rather than equality constraints.

    However, under suitable conditions, it is possible to reduce inequality constrained problems to equivalent equality constrained problems. Then the theorem of Lagrange can be applied.

    Example 4. Consider a utility maximization problem in which a consumer consumes two goods.

    The consumer's utility from consuming amount xi of commodity i = 1, 2, is given by u(x1, x2) = x1x2.


  • The consumer has an income I > 0, and the price of commodity i is pi > 0. Thus, the problem is to solve

    max {x1x2 | I − p1x1 − p2x2 ≥ 0, x1 ≥ 0, x2 ≥ 0}.

    We will proceed in three steps.

    1. We begin by reducing the utility maximization problem to an equality constrained problem.

    First note that the budget set

    B(p, I) = {(x1, x2) | I − p1x1 − p2x2 ≥ 0, x1 ≥ 0, x2 ≥ 0}

    is a compact set and the utility function is continuous on this set.

    Thus, by the Weierstrass theorem, a solution (x1*, x2*) does exist to the given maximization problem.

    Now, if either x1 = 0 or x2 = 0, then u(x1, x2) = 0. However, the consumption point (x̄1, x̄2) = (I/2p1, I/2p2), which divides income equally between the two commodities, is feasible and satisfies u(x̄1, x̄2) = x̄1x̄2 > 0.

    Since any solution (x1*, x2*) must satisfy u(x1*, x2*) ≥ u(x̄1, x̄2) > 0, it follows that any solution must satisfy xi* > 0, i = 1, 2.

    Furthermore, any solution must satisfy the budget constraint with equality, or total utility could be increased.

    Thus, we can see that (x1*, x2*) is a solution to the original problem if and only if it is a solution to the problem

    max {x1x2 | I − p1x1 − p2x2 = 0, x1 > 0, x2 > 0}.

    The constraint set of this reduced problem, which we will denote by B(p, I), can be written as

    B(p, I) = R^2_++ ∩ {(x1, x2) | I − p1x1 − p2x2 = 0},

    and by setting U = R^2_++ and g(x1, x2) = I − p1x1 − p2x2 we can use the theorem of Lagrange.

    2. Next we obtain the critical points of the Lagrangean.

    We first set up the Lagrangean

    L(x1, x2, λ) = x1x2 + λ(I − p1x1 − p2x2).


  • The critical points of L are the solutions (x1, x2, λ) ∈ R^2_++ × R to

    ∂L/∂x1 = x2 − λp1 = 0,
    ∂L/∂x2 = x1 − λp2 = 0,
    ∂L/∂λ = I − p1x1 − p2x2 = 0.

    If λ = 0, this system of equations has no solution, since then the first two equations force x1 = x2 = 0, which contradicts the third equation because I > 0.

    So suppose λ ≠ 0. From the first two equations, we then find λ = x2/p1 = x1/p2, so that x1 = p2x2/p1. Using this in the third equation, we obtain the unique solution to the set of equations: x1* = I/2p1, x2* = I/2p2 and λ* = I/2p1p2.

    3. Now we classify the critical points of the Lagrangean.

    To classify the single critical point of L we will apply the second order conditions to check that (x1*, x2*) is a strict local maximum of u on B(p, I).

    First, note that Dg(x1, x2) = (−p1, −p2), so we have

    Z(x*) = {z ∈ R^2 | Dg(x*) z = 0} = {z ∈ R^2 | z1 = −p2z2/p1}.

    Define D²_x L* = D²u(x*) + λ* D²g(x*). Then we have

    D²_x L* = ( 0  1 )  +  λ* ( 0  0 )  =  ( 0  1 )
              ( 1  0 )        ( 0  0 )     ( 1  0 ).

    So that for any z ∈ R^2, we have zᵀ (D²_x L*) z = 2z1z2. Thus, for any z ∈ Z(x*) with z ≠ 0, we have zᵀ (D²_x L*) z = −2p2z2²/p1 < 0.

    Alternatively, we can check the determinant of D²_{(λ,x)}L(x*, λ*):

    |D²_{(λ,x)}L(x*, λ*)| = | 0    −p1  −p2 |
                            | −p1   0    1  | = 2p1p2 > 0.
                            | −p2   1    0  |

    Both methods show that (x1*, x2*) satisfies the second order conditions for a strict local maximizer of u on B(p, I). We can actually show the stronger result that (x1*, x2*) is a global maximizer on B(p, I), by showing that the conditions of theorem (6) hold.

    First, note that a global maximum exists by the argument in step 1.


    Next, note that the single constraint g(x1, x2) satisfies Dg(x1, x2) = (−p1, −p2) ≠ 0 everywhere on B(p, I). Hence rank Dg(x1, x2) = 1 at all (x1, x2) ∈ B(p, I) and the constraint qualification holds at the global maximum.

    Therefore, by theorem (6), the global maximum must be a critical point of the Lagrangean. Since there is only one critical point, (x1*, x2*), this must be the problem's global maximum.
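    The solution of Example 4 is easy to verify numerically; the values p1 = 2, p2 = 4, I = 16 below are arbitrary choices of mine.

    ```python
    # Check the critical point x1* = I/2p1, x2* = I/2p2, lam* = I/2p1p2
    # of L(x1, x2, lam) = x1*x2 + lam*(I - p1*x1 - p2*x2).
    p1, p2, I = 2.0, 4.0, 16.0

    x1 = I / (2.0 * p1)            # 4.0
    x2 = I / (2.0 * p2)            # 2.0
    lam = I / (2.0 * p1 * p2)      # 1.0

    assert x2 - lam * p1 == 0.0           # dL/dx1 = 0
    assert x1 - lam * p2 == 0.0           # dL/dx2 = 0
    assert I - p1 * x1 - p2 * x2 == 0.0   # dL/dlam = 0

    # Second order condition: |D^2 L| = 2*p1*p2 > 0, a strict local maximizer.
    print(2.0 * p1 * p2)           # 16.0
    ```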
