Copyright ©1991-2009 by K. Pattipati
Lecture 7: Constrained Optimization:
Necessary and Sufficient Conditions
Prof. Krishna R. Pattipati
Dept. of Electrical and Computer Engineering
University of Connecticut Contact: [email protected] (860) 486-2890
Fall 2009
October 13, 2009
ECE 6437: Computational Methods for Optimization
Outline of Lecture 7
Necessary and Sufficient Conditions
Methods of Specifying constraint set ($\Omega$)
Basic Result: Necessary Conditions of Optimality
Examples
Equality Constraints
Economical Interpretation of Lagrange Multipliers

Constrained Minimization Problem
"Minimize $f(x)$ subject to $x \in \Omega$", where $x \in R^n$ and $\Omega$ is the constraint set
• $x^*$ is a local minimum of $f$ over $\Omega$ if $\exists$ a scalar $\epsilon > 0$ such that $f(x^*) \le f(x) \ \forall x \in \Omega$ with $\|x - x^*\| < \epsilon$
• $x^*$ is a global minimum of $f$ over $\Omega$ if $f(x^*) \le f(x) \ \forall x \in \Omega$
• When $f$ and $\Omega$ are convex, every local minimum is a global minimum
Methods of specifying constraint set $\Omega$
• $\Omega$ is a non-empty, convex, and closed subset of $R^n$
• $f(x)$ is continuously differentiable over $\Omega$
• Difficulty with an open set: $\min x^2$, $\Omega = \{x \mid 0 < x < 1\}$; $x^*$ is undefined; $x^* \to 0$
• Difficulty with a non-convex set: necessary conditions of optimality fail!!
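
Methods of Specifying $\Omega$ - 1
Set constraints:
• Non-negative orthant constraints: $\Omega = \{x \mid x_i \ge 0;\ i = 1,2,\dots,n\}$
• Simple bounds: $\Omega = \{x \mid \alpha_i \le x_i \le \beta_i;\ i = 1,2,\dots,n\}$
Equality constraints: $\Omega = \{x \mid h_i(x) = 0;\ i = 1,2,\dots,m\}$
• Simplex constraints: $\Omega = \{x \mid \sum_{i=1}^{n} x_i = r;\ x_i \ge 0\}$ ($\Omega$ convex)
• Some applications: routing, allocation problems
[Figure: the simplex $x_1 + x_2 = 3$, $x \ge 0$, joining (3,0) and (0,3); a five-node routing network in which the input $r_w$ between the origin and the destination of an OD pair is split among path flows $x_1, \dots, x_4$.]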
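
Methods of Specifying $\Omega$ - 2
Routing problem, with $P_w$ the set of paths of OD pair $w \in W$:
$$\min D(x) = \sum_{(i,j)} D_{ij}\Big(\sum_{\text{all paths } p \text{ traversing } (i,j)} x_p\Big)$$
$$\text{s.t. } \sum_{p \in P_w} x_p = r_w \ \ \forall w \in W; \qquad x_p \ge 0 \ \ \forall p \in P_w,\ w \in W$$
$$F_{ij} = \sum_{\text{all paths } p \text{ traversing } (i,j)} x_p$$
• Linear equality constraints: $\Omega = \{x \mid x_i \ge 0;\ Ax = b\}$
• Nonlinear equality constraints: $\Omega = \{x \mid x_1^2 + x_2^2 = 3;\ x_1, x_2 \ge 0\}$
For $\Omega$ to be convex, equality constraints must be linear
Inequality constraints: $\Omega = \{x \mid g_j(x) \le 0;\ j = 1,2,\dots,r\}$
• Linear inequality constraints: $\Omega = \{x \mid x \ge 0;\ Ax \le b\}$
• Nonlinear inequality constraints: $\Omega = \{x \mid x_1^2 + x_2^2 \le 3;\ x_i \ge 0\}$
[Figure: three panels in the $(x_1, x_2)$ plane: linear inequality constraints (convex); non-linear equality constraints (not convex); non-linear inequality constraints.]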
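
First-order Conditions of Optimality - 1
Recall unconstrained case: $\nabla f(x^*) = 0$
What happens when there are constraints?
Let us take a simple case where $\Omega = \{x \mid \alpha \le x \le \beta\}$ and $x$ is scalar
• Case 1: $x^* = \alpha$
$$\frac{df}{dx}\Big|_{x^*} \ge 0 \ \text{ and } \ x - x^* \ge 0 \ \ \forall x \in \Omega \ \Rightarrow\ \nabla f(x^*)^T (x - x^*) \ge 0 \ \ \forall x \in \Omega$$
• Case 2: If $\alpha < x^* < \beta$
$$\frac{df}{dx}\Big|_{x^*} = 0 \ \text{ and } \ x - x^* \text{ can be positive or negative} \ \Rightarrow\ \nabla f(x^*)^T (x - x^*) = 0 \ \ \forall x \in \Omega$$
Necessary: $\nabla^2 f(x^*) \ge 0$; Sufficiency: $\nabla^2 f(x^*) > 0$
[Figure: three plots of $f(x)$ versus $x$ showing the location of $x^*$.]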
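
First-order Conditions of Optimality - 2
• Case 3: $x^* = \beta$
$$\frac{df}{dx}\Big|_{x^*} \le 0 \ \text{ and } \ x - x^* \le 0 \ \ \forall x \in \Omega \ \Rightarrow\ \nabla f(x^*)^T (x - x^*) \ge 0 \ \ \forall x \in \Omega$$
Basic Result
• 1. If $x^*$ is a local minimum of $f$ over a convex set $\Omega \subset R^n$, then $\nabla f(x^*)^T (x - x^*) \ge 0 \ \ \forall x \in \Omega$
• 2. If $f(x)$ is convex, then the condition is also sufficient
$$\nabla f(x^*)^T (x - x^*) \ge 0 \ \Rightarrow\ \angle[\nabla f(x^*), (x - x^*)] \le 90^\circ$$
Convexity of $\Omega$ is critical
[Figure: $\nabla f(x^*)$ and $x - x^*$ at $x^*$ for a convex $\Omega$.]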
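
The basic result can be checked numerically on a small case. The sketch below is only an illustration under stated assumptions: the cost $f(x) = \tfrac{1}{2}\|x - c\|^2$, the vector `c`, and the box bounds `alpha`, `beta` are hypothetical choices, not from the lecture. For this particular cost the minimizer over a box is the clipped point, and the code samples feasible points to confirm $\nabla f(x^*)^T (x - x^*) \ge 0$.

```python
import numpy as np

# Hypothetical data: f(x) = 0.5 * ||x - c||^2 over the box alpha <= x <= beta.
c = np.array([2.0, -1.0, 0.3])
alpha = np.zeros(3)
beta = np.ones(3)

def grad_f(x):
    """Gradient of f(x) = 0.5 * ||x - c||^2."""
    return x - c

# For this separable quadratic, the box-constrained minimizer is c clipped to the box.
x_star = np.clip(c, alpha, beta)

# Sample feasible points and evaluate grad f(x*)^T (x - x*) for each one.
rng = np.random.default_rng(0)
samples = rng.uniform(alpha, beta, size=(1000, 3))
inner = (samples - x_star) @ grad_f(x_star)
min_inner = inner.min()

print("x* =", x_star)
print("smallest sampled value of grad f(x*)^T (x - x*):", min_inner)
```

Coordinate 1 sits at its upper bound (Case 3), coordinate 2 at its lower bound (Case 1), and coordinate 3 is interior (Case 2), so all three scalar cases appear in one example.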
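
Proof of Basic Result
Proof
• 1. Suppose $\exists\, x \in \Omega$ such that $\nabla f(x^*)^T (x - x^*) < 0$. Then, from the mean value theorem, for every $\epsilon > 0$ $\exists$ an $\alpha \in [0,1]$ s.t.
$$f(x^* + \epsilon (x - x^*)) = f(x^*) + \epsilon\, \nabla f(x^* + \alpha \epsilon (x - x^*))^T (x - x^*)$$
For sufficiently small $\epsilon$, $\nabla f(x^* + \alpha \epsilon (x - x^*))^T (x - x^*) < 0$ by continuity of $\nabla f$, and $x^* + \epsilon (x - x^*) \in \Omega$ because $\Omega$ is convex
$$\Rightarrow\ f(x^* + \epsilon (x - x^*)) < f(x^*), \text{ a contradiction}$$
• 2. Since $f(x)$ is convex
$$f(x) \ge f(x^*) + \nabla f(x^*)^T (x - x^*) \quad \forall x \in \Omega$$
Now, since $\nabla f(x^*)^T (x - x^*) \ge 0$,
$$f(x) \ge f(x^*) \ \ \forall x \in \Omega \ \Rightarrow\ x^* \text{ is a minimum}$$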
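
Specialization of First-order Conditions - 1
Examples:
• Example 1: $\Omega = \{x \mid x_i \ge 0;\ i = 1,2,\dots,n\}$, the non-negative orthant
Know at optimum
$$\nabla f(x^*)^T (x - x^*) \ge 0 \ \ \forall x \in \Omega \ \Rightarrow\ \sum_{i=1}^{n} \frac{\partial f(x^*)}{\partial x_i} (x_i - x_i^*) \ge 0$$
$$x_i^* = 0 \ \Rightarrow\ \frac{\partial f}{\partial x_i} \ge 0; \quad \text{let } x_k = x_k^* \ \ \forall k \ne i$$
$$x_i^* > 0 \ \Rightarrow\ \frac{\partial f}{\partial x_i} = 0; \quad \text{to show this, let } x_k = x_k^* \ \ \forall k \ne i \text{ and } x_i = \tfrac{1}{2} x_i^*,\ x_i = \tfrac{3}{2} x_i^*$$
[Figure: $\nabla f(x^*)$ at optimal points of the non-negative orthant in the $(x_1, x_2)$ plane.]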
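
The coordinate-wise conditions for the non-negative orthant can be observed on a small quadratic. This is a minimal sketch, assuming a hypothetical cost $f(x) = \tfrac{1}{2} x^T Q x - c^T x$ with the `Q` and `c` below, and using a plain projected-gradient iteration (a solver chosen here for illustration, not one presented in the lecture) to locate $x^*$.

```python
import numpy as np

# Hypothetical data (not from the lecture): f(x) = 0.5 x^T Q x - c^T x, Q positive definite.
Q = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
c = np.array([1.0, -2.0, 1.0])

def grad_f(x):
    return Q @ x - c

# Projected gradient: take a gradient step, then project onto x >= 0.
step = 1.0 / np.linalg.eigvalsh(Q).max()
x = np.zeros(3)
for _ in range(2000):
    x = np.maximum(x - step * grad_f(x), 0.0)

x_star = x
g = grad_f(x_star)
active = x_star <= 1e-9          # coordinates with x_i* = 0

print("x* =", x_star)
print("partials where x_i* = 0 (should be >= 0):", g[active])
print("partials where x_i* > 0 (should be ~ 0): ", g[~active])
```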
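
Specialization of First-order Conditions - 2
• Example 2: $\Omega = \{x \mid \alpha_i \le x_i \le \beta_i\}$
$$\frac{\partial f(x^*)}{\partial x_i} \ \begin{cases} \ge 0, & x_i^* = \alpha_i \\ \le 0, & x_i^* = \beta_i \\ = 0, & \alpha_i < x_i^* < \beta_i \end{cases}$$
• Example 3: $\Omega = \{x \mid x_i \ge 0,\ \sum_{i=1}^{n} x_i = r\}$, optimization over a simplex
All coordinates with positive allocation at the optimum must have equal and minimal partial cost derivatives
$$\min\ (x_1 - 4)^2 + x_2^2 \quad \text{s.t. } x_1 + x_2 = 3 \text{ and } x_i \ge 0$$
$$(x_1^*, x_2^*) = (3, 0); \qquad \nabla f(x^*) = \begin{bmatrix} 2(x_1^* - 4) \\ 2 x_2^* \end{bmatrix} = \begin{bmatrix} -2 \\ 0 \end{bmatrix}; \qquad \frac{\partial f}{\partial x_1} < \frac{\partial f}{\partial x_2}$$
$$\text{In general, } x_i^* > 0 \ \Rightarrow\ \frac{\partial f(x^*)}{\partial x_i} \le \frac{\partial f(x^*)}{\partial x_j} \ \ \forall j$$
[Figure: the simplex $x_1 + x_2 = 3$ in the $(x_1, x_2)$ plane, with the points 3 and 4 marked on the $x_1$ axis.]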
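
The simplex example $\min (x_1 - 4)^2 + x_2^2$ s.t. $x_1 + x_2 = 3$, $x_i \ge 0$ is small enough to verify by brute force. The sketch below is an illustrative check, not the lecture's method: it scans the feasible segment on a grid (grid size chosen arbitrarily) and then tests the rule that every coordinate with positive allocation has a minimal partial derivative.

```python
import numpy as np

r = 3.0

def grad_f(x):
    """Gradient of f(x) = (x1 - 4)^2 + x2^2."""
    return np.array([2.0 * (x[0] - 4.0), 2.0 * x[1]])

# Brute force over the feasible segment: x1 = t, x2 = r - t, 0 <= t <= r.
t = np.linspace(0.0, r, 3001)
values = (t - 4.0) ** 2 + (r - t) ** 2
t_best = t[np.argmin(values)]
x_star = np.array([t_best, r - t_best])

# Optimality rule over a simplex: x_i* > 0  =>  df/dx_i <= df/dx_j for all j.
g = grad_f(x_star)
positive = np.flatnonzero(x_star > 0)
rule_holds = all(g[i] <= g.min() + 1e-12 for i in positive)

print("x* =", x_star, " grad f(x*) =", g, " rule holds:", rule_holds)
```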
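
Specialization of First-order Conditions - 3
– Example 3 (continued):
$$\nabla f(x^*)^T (x - x^*) \ge 0 \ \Rightarrow\ \sum_{i=1}^{n} \frac{\partial f(x^*)}{\partial x_i} (x_i - x_i^*) \ge 0 \quad \text{with } \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i^* = r$$
$$\text{Suppose } x_i^* > 0. \text{ Pick } x_i = 0,\ x_j = x_j^* + x_i^*,\ x_k = x_k^* \text{ otherwise, s.t. } \sum_{i=1}^{n} x_i = r$$
$$\Rightarrow\ \Big[\frac{\partial f(x^*)}{\partial x_j} - \frac{\partial f(x^*)}{\partial x_i}\Big] x_i^* \ge 0 \ \Rightarrow\ \frac{\partial f(x^*)}{\partial x_i} \le \frac{\partial f(x^*)}{\partial x_j} \ \ \forall j$$
• Example 4: Projection on a convex set
– Let $\Omega$ be a closed convex set and let $z$ be a fixed vector in $R^n$
– "Projection of $z$ onto $\Omega$ involves finding a point $x^*$ in $\Omega$ nearest to $z$".
– Mathematically,
$$\min_x f(x) = \tfrac{1}{2} \|z - x\|^2 \quad \text{s.t. } x \in \Omega$$
– 4.1 Projection of $z$ onto $\Omega = \{x \mid \sum_{i=1}^{n} x_i = r\}$
[Figure: the projection $x^*$ on the line $x_1 + x_2 = r$, drawn for $z = 0$.]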
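
Projection on a Convex Set - 1
– 4.1. $\min \tfrac{1}{2}\|z - x\|^2$ s.t. $\sum_{i=1}^{n} x_i = r$
For $n = 2$, with $x_2 = r - x_1$:
$$\min_{x_1}\ \tfrac{1}{2}(z_1 - x_1)^2 + \tfrac{1}{2}(z_2 - r + x_1)^2 \ \Rightarrow\ -(z_1 - x_1) + (z_2 - r + x_1) = 0$$
$$x_1^* = \frac{r}{2} + \frac{z_1 - z_2}{2} = z_1 + \frac{r}{2} - \frac{z_1 + z_2}{2}; \qquad x_2^* = \frac{r}{2} - \frac{z_1 - z_2}{2} = z_2 + \frac{r}{2} - \frac{z_1 + z_2}{2}$$
In general, with $e = (1, 1, \dots, 1)^T$:
$$\min\ \tfrac{1}{2}\|z - x\|^2 + \lambda (x^T e - r) \ \Rightarrow\ -(z - x^*) + \lambda e = 0$$
$$e^T (z - x^*) = \lambda\, e^T e \ \Rightarrow\ e^T z - r = n \lambda \ \Rightarrow\ \lambda = \frac{e^T z - r}{n}$$
$$x^* = z - \lambda e = z - \frac{e^T z - r}{n}\, e = \Big(I - \frac{e e^T}{n}\Big) z + \frac{r}{n}\, e = (I - P) z + \frac{r}{n}\, e, \qquad P = \frac{e e^T}{n}$$
Equivalently, from $\nabla f(x^*)^T (x - x^*) \ge 0 \ \ \forall x \in \Omega$:
$$(z - x^*) = \lambda e; \qquad (z - x^*)^T x^* = \lambda\, e^T x^*; \qquad e^T x^* = e^T z - \lambda\, e^T e; \qquad \lambda = \frac{e^T z - r}{n}$$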
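
Projection on a Convex Set - 2
Definition: $A = e^T$, $P = \dfrac{e e^T}{n}$
$$x^* = z + \frac{1}{n}(r - e^T z)\, e, \quad \text{i.e., } x_i^* = z_i + \frac{r}{n} - \frac{1}{n} \sum_{j=1}^{n} z_j$$
$$x^* = \Big(I - \frac{e e^T}{n}\Big) z + \frac{r}{n}\, e = (I - P) z + \frac{r}{n}\, e$$
$$r = 0 \ \Rightarrow\ x^* = (I - P) z; \quad z - x^* = P z \ \Rightarrow\ z = P z + (I - P) z$$
For a general closed convex $\Omega$, with $f(x) = \tfrac{1}{2}\|z - x\|^2$:
$$x^* = \arg\min_x \|z - x\|^2 \quad \text{s.t. } x \in \Omega$$
From necessary conditions
$$\nabla f(x^*)^T (x - x^*) \ge 0 \ \ \forall x \in \Omega$$
$$\Rightarrow\ -(z - x^*)^T (x - x^*) \ge 0 \quad \text{or} \quad (z - x^*)^T (x - x^*) \le 0 \ \ \forall x \in \Omega$$
[Figure: decomposition of $z$ into $P z \in R(A^T)$ and $(I - P) z \in N(A)$; for a convex $\Omega$, the vectors $z - x^*$ and $x - x^*$ at the projection $x^*$.]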
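
Projection on a Convex Set - 3
– 4.2. $\Omega = \{x \mid Ax = b\}$, where $A$ is an $m \times n$ matrix, $m < n$, Rank$(A) = m$
$$\min\ \tfrac{1}{2}\|z - x\|^2 \quad \text{s.t. } Ax = b$$
$$L(x, \lambda) = \tfrac{1}{2}(z - x)^T (z - x) + \lambda^T (Ax - b)$$
$$\frac{\partial L}{\partial x}\Big|_{x^*, \lambda^*} = 0 \ \Rightarrow\ -(z - x^*) + A^T \lambda^* = 0 \ \Rightarrow\ x^* = z - A^T \lambda^*$$
$$\text{since } A x^* = b \ \Rightarrow\ \lambda^* = (A A^T)^{-1} (A z - b)$$
$$x^* = [I - A^T (A A^T)^{-1} A]\, z + A^T (A A^T)^{-1} b = (I - P) z + A^T (A A^T)^{-1} b$$
$$P = A^T (A A^T)^{-1} A \text{ is a projection matrix}; \quad P^2 = P; \quad P^T = P$$
Recall $\{x \mid Ax = 0\} = N(A)$. For $b = 0$:
$$x^* = (I - P) z = \text{projection of } z \text{ onto } N(A)$$
$$P z = z - x^* = \text{projection of } z \text{ onto } R(A^T)$$
Recall $N(A) \perp R(A^T)$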
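
The closed-form projection onto $\{x \mid Ax = b\}$, $x^* = (I - P) z + A^T (A A^T)^{-1} b$ with $P = A^T (A A^T)^{-1} A$, is easy to exercise numerically. The following is a minimal sketch on randomly generated data (the sizes `m`, `n` and the random `A`, `b`, `z` are arbitrary assumptions); it checks feasibility, the multiplier relation $z - x^* = A^T \lambda^*$, the projector identities, and that another feasible point is no closer to $z$.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 2, 5                                   # hypothetical sizes with m < n
A = rng.standard_normal((m, n))               # full row rank with probability 1
b = rng.standard_normal(m)
z = rng.standard_normal(n)

AAt = A @ A.T
P = A.T @ np.linalg.solve(AAt, A)             # P = A^T (A A^T)^{-1} A
I = np.eye(n)

x_star = (I - P) @ z + A.T @ np.linalg.solve(AAt, b)
lam = np.linalg.solve(AAt, A @ z - b)         # lambda* = (A A^T)^{-1} (A z - b)

# Any x* + (I - P) w is also feasible, because A (I - P) = 0.
x_other = x_star + (I - P) @ rng.standard_normal(n)

print("A x* - b        =", A @ x_star - b)
print("||z - x*||      =", np.linalg.norm(z - x_star))
print("||z - x_other|| =", np.linalg.norm(z - x_other))
```

Using `np.linalg.solve` on $A A^T$ avoids forming the inverse explicitly; for ill-conditioned $A$ a QR or SVD based approach would be a more careful choice.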
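
Projection on a Convex Set - 4
• Suppose $m = 1$: $a^T x = 0$ defines an $(n-1)$-dimensional subspace $M$ perpendicular to $a$
$$x^* = \Big(I - \frac{a a^T}{a^T a}\Big) z = (I - P) z; \qquad P = \frac{a a^T}{a^T a}$$
$$z - x^* = P z = \frac{a a^T}{a^T a}\, z \ \Rightarrow\ (z - x^*) \parallel a$$
• Suppose $a^T = (1, 1)$
$$x^* = \begin{bmatrix} \tfrac{1}{2} & -\tfrac{1}{2} \\ -\tfrac{1}{2} & \tfrac{1}{2} \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \begin{bmatrix} \tfrac{z_1 - z_2}{2} \\ \tfrac{z_2 - z_1}{2} \end{bmatrix}; \qquad P z = \begin{bmatrix} \tfrac{1}{2} & \tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{1}{2} \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \begin{bmatrix} \tfrac{z_1 + z_2}{2} \\ \tfrac{z_1 + z_2}{2} \end{bmatrix}$$
• Recall that a hyperplane $H$ in $R^n$ is defined by $H = \{x \mid a^T x = b\}$
[Figure: $z$, its projection $x^*$ onto $M$, and $P z$ along $a$.]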
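
Supporting and Separating Hyperplanes
– Every hyperplane $H$ can be written as $H = \hat{x} + M$, where $\hat{x}$ is any vector in $H$, i.e., $a^T \hat{x} = b$
– Example:
$$H = \{x \mid x_1 + x_2 = 1\} = \begin{bmatrix} \tfrac{1}{2} \\ \tfrac{1}{2} \end{bmatrix} + \{x \mid x_1 + x_2 = 0\}, \qquad \hat{x} = \begin{bmatrix} \tfrac{1}{2} \\ \tfrac{1}{2} \end{bmatrix}$$
• Supporting hyperplane: For every convex set $\Omega \subset R^n$ and every boundary point $\bar{x}$, $\exists$ a hyperplane that supports $\Omega$ at $\bar{x}$, i.e.,
$$a^T x \ge a^T \bar{x} \quad \forall x \in \Omega$$
[Figure: hyperplane $H$ with normal $a$ and offset $b$, parallel to the subspace $M$ and passing through $\hat{x}$; a supporting hyperplane of $\Omega$ at $\bar{x}$; a hyperplane separating $\Omega_1$ and $\Omega_2$.]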

Equality Constraints
• Separating hyperplane: If $\Omega_1$ and $\Omega_2$ are two disjoint convex sets, then $\exists$ a hyperplane that separates them
$$a^T x \le a^T y \quad \forall x \in \Omega_1,\ y \in \Omega_2$$
Equality constraints
$$\min f(x) \quad \text{s.t. } h_i(x) = 0,\ i = 1,2,\dots,m$$
$f(x)$ and $h_i(x)$ are differentiable in $R^n$
Equivalently,
$$\min f(x) \quad \text{s.t. } h(x) = 0; \quad h = (h_1, h_2, \dots, h_m)^T$$
We will provide intuitive proofs first and then provide geometric interpretations later
• Special case:
$$\min f(x) \quad \text{s.t. } Ax = b, \quad A \text{ is an } m \times n \text{ matrix}$$

Linear Equality Constraints - 1
– Without loss of generality, assume that the first $m$ columns of $A$ are independent
$$A = [\,B \mid N\,],\ B \text{ is } m \times m,\ N \text{ is } m \times (n-m); \qquad x^T = [\,x_B^T \ \ x_N^T\,]; \quad x_B \in R^m \ \&\ x_N \in R^{n-m}$$
The constraints imply
$$B x_B + N x_N = b \ \Rightarrow\ x_B = B^{-1} [b - N x_N]$$
Let us look at $f(x) = f(x_B, x_N) = f(B^{-1}(b - N x_N), x_N)$
Constrained problem: $\min f(x)$ s.t. $Ax = b$ $\ \Leftrightarrow\ $ Unconstrained problem: $\min \tilde{f}(x_N) = f(B^{-1}(b - N x_N), x_N)$ s.t. $x_N \in R^{n-m}$
– Necessary condition for optimality: $\nabla \tilde{f}(x_N^*) = 0$
$$\nabla \tilde{f}(x_N^*) = \nabla_{x_N} f(x^*) + \nabla_{x_N} x_B\, \nabla_{x_B} f(x^*)$$
$$\Rightarrow\ \nabla_{x_N} f(x^*) - N^T (B^{-1})^T \nabla_{x_B} f(x^*) = 0$$
$$\text{We let } \lambda^* = -(B^{-1})^T \nabla_{x_B} f(x^*)$$

Linear Equality Constraints - 2
We have a set of simultaneous equations:
$$\nabla_{x_N} f(x^*) + N^T \lambda^* = 0$$
$$\nabla_{x_B} f(x^*) + B^T \lambda^* = 0$$
$$\Rightarrow\ \nabla f(x^*) + A^T \lambda^* = 0$$
Alternatively, if we write $h(x) = Ax - b$, then
$$\nabla f(x^*) + \nabla h(x^*) \lambda^* = 0$$
$$\nabla h(x) = [\nabla h_1, \nabla h_2, \dots, \nabla h_m] = [\text{row}_1^T, \text{row}_2^T, \dots, \text{row}_m^T] = A^T \quad (n \times m)$$
$$\nabla f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla h_i(x^*) = 0$$
- At a local minimum, the gradient of the cost function is a linear combination of the gradients of the constraints; the weights are (the negatives of) the Lagrange multipliers
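
The elimination argument ($x_B = B^{-1}(b - N x_N)$, then an unconstrained problem in $x_N$) can be traced on a small quadratic. This is a sketch under stated assumptions: the cost $f(x) = \tfrac{1}{2} x^T Q x - c^T x$ and the particular `A`, `b`, `Q`, `c` are hypothetical, and the helper names `x0` and `Z` are introduced only for this illustration. It recovers $\lambda^* = -(B^{-1})^T \nabla_{x_B} f(x^*)$ and confirms $\nabla f(x^*) + A^T \lambda^* = 0$.

```python
import numpy as np

# Hypothetical problem data: min 0.5 x^T Q x - c^T x  s.t.  A x = b.
A = np.array([[2.0, 1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0, 1.0]])
b = np.array([1.0, 2.0])
Q = np.diag([1.0, 2.0, 3.0, 4.0])
c = np.ones(4)

m, n = A.shape
B, N = A[:, :m], A[:, m:]                     # first m columns assumed independent

def grad_f(x):
    return Q @ x - c

# Every feasible x is x0 + Z x_N, with x_B = B^{-1}(b - N x_N).
x0 = np.concatenate([np.linalg.solve(B, b), np.zeros(n - m)])
Z = np.vstack([-np.linalg.solve(B, N), np.eye(n - m)])

# Reduced (unconstrained) problem: Z^T grad_f(x0 + Z x_N) = 0.
x_N = np.linalg.solve(Z.T @ Q @ Z, Z.T @ (c - Q @ x0))
x_star = x0 + Z @ x_N

g = grad_f(x_star)
lam = -np.linalg.solve(B.T, g[:m])            # lambda* = -(B^{-1})^T grad_{x_B} f(x*)
residual = g + A.T @ lam                      # should vanish

print("x* =", x_star)
print("lambda* =", lam)
print("grad f(x*) + A^T lambda* =", residual)
```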

Lagrangian Approach
• Second approach: Form Lagrangian function (convert an equality constrained problem into an unconstrained problem)
$$L(x, \lambda) = f(x) + \lambda^T h(x), \quad \lambda \in R^m$$
$$\frac{\partial L}{\partial x} = 0 \ \Rightarrow\ \nabla f(x^*) + \nabla h(x^*) \lambda^* = 0$$
$$\frac{\partial L}{\partial \lambda} = 0 \ \Rightarrow\ h(x^*) = 0$$
$$\text{when } h(x) = Ax - b: \quad \nabla f(x^*) + A^T \lambda^* = 0; \quad A x^* = b$$
– Finding stationary $x^*$ and $\lambda^*$ involves solving $(n+m)$ equations
$$\nabla f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla h_i(x^*) = 0$$
$$h(x^*) = 0$$
• What about second order conditions: Consider a point $x^*$ satisfying the 1st order necessary conditions. Suppose we go from $x^*$ to $x^* + \Delta x$, another feasible point. To 2nd order
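
Second Order Conditions - 1
$$f(x^* + \Delta x) = f(x^*) + \nabla f(x^*)^T \Delta x + \tfrac{1}{2} \Delta x^T \nabla^2 f(x^*) \Delta x + \text{higher order}$$
$$h_i(x^* + \Delta x) = h_i(x^*) + \nabla h_i(x^*)^T \Delta x + \tfrac{1}{2} \Delta x^T \nabla^2 h_i(x^*) \Delta x + \text{higher order}$$
$$f(x^* + \Delta x) + \sum_{i=1}^{m} \lambda_i^* h_i(x^* + \Delta x) = f(x^*) + \sum_{i=1}^{m} \lambda_i^* h_i(x^*) + \Big[\nabla f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla h_i(x^*)\Big]^T \Delta x$$
$$\qquad + \tfrac{1}{2} \Delta x^T \Big[\nabla^2 f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla^2 h_i(x^*)\Big] \Delta x + \text{higher order}$$
– For $x^* + \Delta x$ to be feasible, need $h_i(x^* + \Delta x) = 0$, $i = 1,2,\dots,m$
$$\text{so, } f(x^* + \Delta x) = f(x^*) + \tfrac{1}{2} \Delta x^T \Big[\nabla^2 f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla^2 h_i(x^*)\Big] \Delta x \quad \text{(to 2nd order)}$$
– For $x^*$ to be a local minimum, need $f(x) - f(x^*) \ge 0$ for all feasible $x$ around $x^*$, i.e., those satisfying $h_i(x) = 0$, $i = 1,2,\dots,m$
$$\text{Need } \Delta x^T \underbrace{\Big[\nabla^2 f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla^2 h_i(x^*)\Big]}_{\nabla^2_{xx} L(x^*)} \Delta x \ge 0$$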
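
Second Order Conditions - 2
$$\text{for all } \Delta x \text{ satisfying } \nabla h_i(x^*)^T \Delta x = 0,\ i = 1,2,\dots,m, \quad \text{since } h_i(x^* + \Delta x) \approx h_i(x^*) + \nabla h_i(x^*)^T \Delta x = 0$$
– Necessary conditions:
First order:
$$\nabla f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla h_i(x^*) = 0; \qquad h_i(x^*) = 0;\ i = 1,2,\dots,m$$
Second order:
$$y^T \nabla^2_{xx} L\, y = y^T \Big[\nabla^2 f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla^2 h_i(x^*)\Big] y \ge 0 \quad \forall y \ \ni\ \nabla h_i(x^*)^T y = 0;\ i = 1,2,\dots,m$$
• Special case: $h_i(x) = a_i^T x - b_i = 0$
– Necessary conditions:
First order:
$$\nabla f(x^*) + A^T \lambda^* = 0; \qquad A x^* = b$$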
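
Second Order Conditions - 3
$$\nabla h_i(x) = a_i; \quad \nabla^2 h_i(x) = 0; \quad A = \begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_m^T \end{bmatrix}; \quad \nabla f(x^*) + \sum_{i=1}^{m} \lambda_i^* a_i = 0$$
Second order:
$$y^T \nabla^2_{xx} L\, y = y^T \nabla^2 f(x^*)\, y \ge 0 \quad \forall y \ \ni\ A y = 0$$
Recall that for any $y \in R^n$, $Ay = 0$ defines the null space of $A$, $N(A)$
Using SVD representation
$$A = U \Sigma V^T = [u_1\ u_2\ \cdots\ u_m]\ [\,\mathrm{diag}(\sigma_1, \sigma_2, \dots, \sigma_m) \ \ 0\,] \begin{bmatrix} v_1^T \\ v_2^T \\ \vdots \\ v_n^T \end{bmatrix} = \sum_{i=1}^{m} \sigma_i u_i v_i^T$$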
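
Second Order Conditions - 4
$$A y = 0 \ \Rightarrow\ y = \sum_{i=m+1}^{n} \alpha_i v_i = V_2 \alpha, \quad \text{a linear combination of the last } (n-m) \text{ rows of } V^T$$
$$y^T \nabla^2_{xx} L\, y \ge 0 \ \Rightarrow\ V_2^T \nabla^2_{xx} L\, V_2 = V_2^T \nabla^2 f(x^*)\, V_2 \ge 0$$
$$\text{Alternately, since } A y = 0 \ \Rightarrow\ (I - P) y = y, \text{ where } P = A^T (A A^T)^{-1} A$$
$$\text{Rank}[(I - P)^T \nabla^2_{xx} L\, (I - P)] = n - m$$
In fact, any orthonormal basis of $N(A)$ will do to define $V_2$
– Sufficient conditions:
$$y^T \nabla^2_{xx} L\, y > 0 \quad \forall y \ne 0 \ \ni\ \nabla h_i(x^*)^T y = 0;\ i = 1,2,\dots,m$$
– If $\nabla h_i(x^*)$, $i = 1,2,\dots,m$, are linearly independent, then the directions $y$ above form the subspace $M(x^*)$
Special case: Linear constraints
$$M(x^*) = \{y \mid A y = 0\} = N(A)$$
Linear independence $\Rightarrow$ $R(A)$ spans an $m$-dimensional subspace $= R^m$
– When $\nabla h_i(x^*)$ are independent, the constraints are said to be regular or qualified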
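
Illustrative Examples - 1
• Example 1:
$$\min\ \tfrac{1}{2}(x_1^2 + x_2^2 + x_3^2) \quad \text{s.t. } x_1 + x_2 + x_3 = 3$$
First order conditions:
$$x_1 + \lambda = 0; \quad x_2 + \lambda = 0; \quad x_3 + \lambda = 0; \quad x_1 + x_2 + x_3 = 3 \ \Rightarrow\ x_1^* = x_2^* = x_3^* = 1,\ \lambda^* = -1$$
Second order conditions:
$$\nabla^2_{xx} L = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
$$y^T \nabla^2_{xx} L\, y = y_1^2 + y_2^2 + y_3^2 \quad \text{with } y_1 + y_2 + y_3 = 0$$
$$= y_1^2 + y_2^2 + (y_1 + y_2)^2 > 0 \ \ \forall (y_1, y_2) \ne 0 \ \Rightarrow\ \text{strict local minimum}$$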
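
Illustrative Examples - 2
Alternately, $a^T = (1, 1, 1)$ or $A = [1\ \ 1\ \ 1]$
Orthonormal basis for $N(A)$:
$$v_2^T = \tfrac{1}{\sqrt{2}} [\,1\ \ 0\ \ -1\,]; \qquad v_3^T = \tfrac{1}{\sqrt{6}} [\,1\ \ -2\ \ 1\,]$$
$$V_2^T \nabla^2_{xx} L\, V_2 = \begin{bmatrix} \tfrac{1}{\sqrt{2}} & 0 & -\tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{6}} & -\tfrac{2}{\sqrt{6}} & \tfrac{1}{\sqrt{6}} \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{6}} \\ 0 & -\tfrac{2}{\sqrt{6}} \\ -\tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{6}} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} > 0$$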
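
The second-order test on the null space of $A$ can be reproduced numerically for Example 1 ($A = [1\ 1\ 1]$, $\nabla^2_{xx} L = I$). The sketch below is an illustration: it takes $V_2$ from the last $n - m$ right singular vectors returned by `numpy.linalg.svd`, which may differ from the hand-picked basis by a rotation or sign but spans the same subspace, and it also forms the projector $I - P$ with $P = A^T (A A^T)^{-1} A$.

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0]])      # constraint x1 + x2 + x3 = 3
H = np.eye(3)                        # Hessian of the Lagrangian for Example 1
m, n = A.shape

# Full SVD: the last n - m rows of V^T span the null space N(A).
U, s, Vt = np.linalg.svd(A)
V2 = Vt[m:].T                        # n x (n - m), orthonormal columns

reduced_hessian = V2.T @ H @ V2
eigenvalues = np.linalg.eigvalsh(reduced_hessian)

# Alternative: orthogonal projector onto N(A).
P = A.T @ np.linalg.solve(A @ A.T, A)
proj = np.eye(n) - P
rank = np.linalg.matrix_rank(proj @ H @ proj)

print("A V2 =", A @ V2)
print("eigenvalues of V2^T H V2:", eigenvalues)
print("rank of (I - P) H (I - P):", rank)
```

All eigenvalues of the reduced Hessian are positive, which matches the conclusion that $x^* = (1, 1, 1)$ is a strict local minimum.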
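
Illustrative Examples - 3
Orthogonal projection:
$$I - P = \begin{bmatrix} \tfrac{2}{3} & -\tfrac{1}{3} & -\tfrac{1}{3} \\ -\tfrac{1}{3} & \tfrac{2}{3} & -\tfrac{1}{3} \\ -\tfrac{1}{3} & -\tfrac{1}{3} & \tfrac{2}{3} \end{bmatrix}; \qquad (I - P) \nabla^2_{xx} L\, (I - P) = (I - P)(I - P) = (I - P)$$
Rank = 2 $\Rightarrow$ spans a 2-dimensional subspace
• If $x^*$ is a local minimum but not a regular point, there may be no Lagrange multipliers or an infinite number of them
Example 2:
$$\min f(x) = (x_1 + 1)^2 + (x_2 - 1)^2 \quad \text{s.t. } x_1^2 = 0$$
min at $x_1 = 0$ & $x_2 = 1$, unique
$$\nabla f(x^*) + \lambda^* \nabla h(x^*) = \begin{bmatrix} 2 \\ 0 \end{bmatrix} + \lambda^* \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \quad \text{never satisfied}$$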

Illustrative Examples - 4
• Example 3:
$$\min f(x) = x_1^2 + (x_2 - 1)^2 \quad \text{s.t. } x_1^2 = 0$$
$$\nabla f(x^*) + \lambda^* \nabla h(x^*) = \begin{bmatrix} 0 \\ 0 \end{bmatrix} + \lambda^* \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
Solution: $x_1^* = 0$, $x_2^* = 1$, $\lambda^*$ = anything
• For linear constraints, Lagrange multipliers exist even if the constraints are not regular. We will discuss this later in the context of convex programming problems
• Example 4: A continuous random variable $x$. Don't know the density $p(x)$
Given
$$E(x) = m = \int x\, p(x)\, dx \quad (1)$$
$$E(x^2) = m^2 + \sigma^2 = \int x^2 p(x)\, dx \quad (2)$$
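
Illustrative Examples - 5
Need
$$\int p(x)\, dx = 1 \quad (3); \qquad p(x) \ge 0$$
One way of finding $p(x)$ is to maximize entropy $H = -E\{\log_e p(x)\}$
$$\max_{p(x) \ge 0} -E\{\log_e p(x)\} = \max_{p(x) \ge 0} -\int p(x) \ln p(x)\, dx \quad \text{such that } p(x) \text{ satisfies (1), (2), (3)}$$
With multipliers $c$, $\lambda_1$, $\lambda_2$ for (3), (1), (2):
$$L(p, c, \lambda_1, \lambda_2) = \int \{-p(x) \ln p(x) + c\, p(x) + \lambda_1 x\, p(x) + \lambda_2 x^2 p(x)\}\, dx - c - \lambda_1 m - \lambda_2 (m^2 + \sigma^2)$$
$$\frac{\partial L}{\partial p(x)} = 0 \ \Rightarrow\ -\ln p(x) - 1 + c + \lambda_1 x + \lambda_2 x^2 = 0$$
$$p(x) = \exp(c - 1 + \lambda_1 x + \lambda_2 x^2)$$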
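
Illustrative Examples - 6
$$p(x) = c' \exp\Big(\lambda_2 \big(x + \tfrac{\lambda_1}{2 \lambda_2}\big)^2\Big); \qquad c' = \exp\Big(c - \lambda_2 \big(\tfrac{\lambda_1}{2 \lambda_2}\big)^2 - 1\Big)$$
Matching (1), (2), (3): $-\dfrac{\lambda_1}{2 \lambda_2} = m$, $\lambda_2 = -\dfrac{1}{2 \sigma^2}$, $c' = \dfrac{1}{\sqrt{2 \pi}\, \sigma}$
$$\text{Normal density: } p(x) = \frac{1}{\sqrt{2 \pi}\, \sigma}\, e^{-\frac{(x - m)^2}{2 \sigma^2}}$$
For only 1st moment (with $x \ge 0$):
$$p(x) = \exp(c - 1 + \lambda_1 x)$$
$$\int_0^\infty p(x)\, dx = 1 \ \Rightarrow\ -\frac{e^{c-1}}{\lambda_1} = 1; \qquad \int_0^\infty x\, p(x)\, dx = m \ \Rightarrow\ \frac{e^{c-1}}{\lambda_1^2} = m$$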
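
The maximum-entropy property of the normal density for a given mean and variance can be checked by direct numerical integration. The sketch below is an illustration under stated assumptions: the values of `m` and `sigma`, the integration grid, and the two comparison densities (Laplace and uniform, scaled to the same mean and variance) are choices made here, not part of the lecture. It estimates the constraints (1) to (3) and the entropy $-\int p \ln p\, dx$ with a Riemann sum.

```python
import numpy as np

m, sigma = 0.5, 1.3                      # hypothetical mean and standard deviation
x = np.linspace(m - 30 * sigma, m + 30 * sigma, 600001)
dx = x[1] - x[0]

def normal_pdf(x):
    return np.exp(-(x - m) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)

def laplace_pdf(x):
    scale = sigma / np.sqrt(2.0)         # Laplace variance is 2 * scale^2
    return np.exp(-np.abs(x - m) / scale) / (2 * scale)

def uniform_pdf(x):
    half = np.sqrt(3.0) * sigma          # uniform variance is (2 * half)^2 / 12
    return np.where(np.abs(x - m) <= half, 1.0 / (2 * half), 0.0)

def summary(p):
    """Riemann-sum estimates of total mass, mean, variance and entropy."""
    mass = np.sum(p) * dx
    mean = np.sum(x * p) * dx
    var = np.sum((x - mean) ** 2 * p) * dx
    safe = np.where(p > 0, p, 1.0)       # treat 0 * ln 0 as 0
    entropy = -np.sum(p * np.log(safe)) * dx
    return mass, mean, var, entropy

results = {name: summary(pdf(x)) for name, pdf in
           [("normal", normal_pdf), ("laplace", laplace_pdf), ("uniform", uniform_pdf)]}
for name, (mass, mean, var, entropy) in results.items():
    print(f"{name:8s} mass={mass:.4f} mean={mean:.4f} var={var:.4f} H={entropy:.4f}")
```

All three densities satisfy the same constraints to within discretization error, and the normal density attains the largest entropy of the three.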
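
Illustrative Examples - 7
Example 4 (continued), only 1st moment given: $\lambda_1 = -\dfrac{1}{m}$, $e^{c-1} = \dfrac{1}{m}$
$$p(x) = e^{c-1} e^{\lambda_1 x} = \frac{1}{m}\, e^{-x/m},\ x \ge 0 \quad \text{(exponential density)}$$
$$\frac{\partial^2 L}{\partial p(x)^2} = -\frac{1}{p(x)} < 0$$
• Example 5:
$$\max\ xyz \quad \text{s.t. } (xy + yz + zx) = \frac{c}{2}$$
Solution: $x = y = z = \sqrt{\dfrac{c}{6}}$, a cube; volume $= \dfrac{c}{6} \sqrt{\dfrac{c}{6}}$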
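
Example 5 (largest box volume $xyz$ for a given surface area $c$) can be checked without a solver. The sketch below is an illustration with an assumed value `c = 24` and a hypothetical multiplier name `mu`: it verifies that the cube satisfies a stationarity condition of the form $\nabla(xyz) = \mu \nabla(xy + yz + zx)$, and compares the cube's volume against randomly sampled feasible boxes.

```python
import numpy as np

c = 24.0                                  # hypothetical surface area
s = np.sqrt(c / 6.0)                      # cube side from the stated solution
x_star = np.array([s, s, s])

def volume(v):
    return v[0] * v[1] * v[2]

def grad_volume(v):
    x, y, z = v
    return np.array([y * z, x * z, x * y])

def grad_h(v):
    """Gradient of h(x, y, z) = xy + yz + zx."""
    x, y, z = v
    return np.array([y + z, x + z, x + y])

# Stationarity at the cube: grad(xyz) = mu * grad(h) with mu = s / 2.
mu = s / 2.0
stationarity_residual = grad_volume(x_star) - mu * grad_h(x_star)

# Random feasible boxes: choose x, y and solve xy + z (x + y) = c / 2 for z.
rng = np.random.default_rng(0)
xy = rng.uniform(0.1, 3.0, size=(5000, 2))
x, y = xy[:, 0], xy[:, 1]
z = (c / 2.0 - x * y) / (x + y)
feasible = z > 0
best_sampled = np.max(x[feasible] * y[feasible] * z[feasible])

print("cube volume:", volume(x_star), " (c/6)*sqrt(c/6) =", (c / 6) * np.sqrt(c / 6))
print("best sampled feasible volume:", best_sampled)
```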
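
Interpretation of Lagrange Multipliers - 1
Sensitivity interpretation
$$\min f(x_1, x_2) \quad \text{s.t. } h(x_1, x_2) = b_1$$
$$L(x, \lambda_1) = f(x) + \lambda_1 (h(x) - b_1)$$
Suppose $x_1^*$, $x_2^*$, $\lambda_1^*$ is optimal
$$\frac{\partial L}{\partial x_1} = 0 \ \Rightarrow\ \frac{\partial f}{\partial x_1} + \lambda_1 \frac{\partial h}{\partial x_1} = 0 \quad \&\quad \frac{\partial f}{\partial x_2} + \lambda_1 \frac{\partial h}{\partial x_2} = 0$$
$$\frac{\partial f(x^*)}{\partial b_1} = \frac{\partial f}{\partial x_1} \frac{\partial x_1}{\partial b_1} + \frac{\partial f}{\partial x_2} \frac{\partial x_2}{\partial b_1} \quad (1)$$
$$h(x) - b_1 = 0 \ \Rightarrow\ \Big[\frac{\partial h}{\partial x_1} \frac{\partial x_1}{\partial b_1} + \frac{\partial h}{\partial x_2} \frac{\partial x_2}{\partial b_1} - 1\Big] = 0 \quad (2)$$
(1) + $\lambda_1$ × (2) gives
$$\frac{\partial f}{\partial b_1} = \sum_{i=1}^{2} \underbrace{\Big(\frac{\partial f}{\partial x_i} + \lambda_1 \frac{\partial h}{\partial x_i}\Big)}_{0} \frac{\partial x_i}{\partial b_1} - \lambda_1 = -\lambda_1$$
$$\lambda_1 = -\frac{\partial f}{\partial b_1}$$
Rate of change of $f$ with respect to the level of the constraint (or) impact on the cost function if additional resources are added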
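
Interpretation of Lagrange Multipliers - 2
$$\text{Consider } \min f(x) \quad \text{s.t. } h(x) = b$$
Necessary condition:
$$\nabla f(x^*) + \nabla h(x^*) \lambda^* = 0; \qquad h(x^*) = b$$
$$\text{In general, } \nabla_b f(x^*(b)) = -\lambda^*(b)$$
Jacobian at $x^*$, $\lambda^*$:
$$J = \begin{bmatrix} \underbrace{\nabla^2 f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla^2 h_i(x^*)}_{\nabla^2_{xx} L} & \nabla h(x^*) \\ \nabla h(x^*)^T & 0 \end{bmatrix}$$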
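
Interpretation of Lagrange Multipliers - 3
$J$ is non-singular (why?)
$$\text{If not, } \exists \begin{bmatrix} y \\ z \end{bmatrix} \ne 0 \ \ni\ J \begin{bmatrix} y \\ z \end{bmatrix} = 0$$
$$\nabla^2_{xx} L\, y + \nabla h(x^*)\, z = 0$$
$$\nabla h(x^*)^T y = 0$$
Premultiply 1st equation by $y^T$ (the second term vanishes because $\nabla h(x^*)^T y = 0$)
$$y^T \nabla^2_{xx} L\, y = 0 \ \Rightarrow\ \text{for } y \ne 0,\ x^* \text{ is not a strict local minimum. Contradiction}$$
$$\Rightarrow\ y = 0 \ \Rightarrow\ z = 0 \text{ since } \nabla h(x^*) \text{ has rank } m$$
For nearby $b$, we have
$$\nabla f(x(b)) + \nabla h(x(b))\, \lambda(b) = 0$$
$$h(x(b)) = b$$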

Interpretation of Lagrange Multipliers - 4
Taking gradient of Lagrangian wrt $b$
$$\nabla_b x(b)\, \nabla f(x(b)) + \underbrace{\nabla_b x(b)\, \nabla h(x(b))}_{I}\, \lambda(b) = 0$$
$$\nabla_b f(x(b)) + \lambda(b) = 0$$
$$\lambda(b) = -\nabla_b f(x(b))$$
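
The sensitivity relation $\lambda(b) = -\nabla_b f(x(b))$ can be confirmed with finite differences. This is a minimal sketch, assuming a hypothetical quadratic cost $f(x) = \tfrac{1}{2} x^T Q x - c^T x$ with linear constraints $Ax = b$ (all data below are made up for illustration). The helper `solve` is introduced here; it solves the $(n+m)$ first-order equations $Qx + A^T \lambda = c$, $Ax = b$ as one linear system.

```python
import numpy as np

# Hypothetical data: min 0.5 x^T Q x - c^T x  s.t.  A x = b.
Q = np.diag([1.0, 2.0, 4.0])
c = np.array([1.0, 0.0, -1.0])
A = np.array([[1.0, 1.0, 1.0],
              [1.0, -1.0, 0.0]])

def solve(b):
    """Solve the first-order conditions; return x*(b), lambda*(b), f(x*(b))."""
    m, n = A.shape
    K = np.block([[Q, A.T], [A, np.zeros((m, m))]])
    sol = np.linalg.solve(K, np.concatenate([c, b]))
    x, lam = sol[:n], sol[n:]
    return x, lam, 0.5 * x @ Q @ x - c @ x

b0 = np.array([3.0, 0.5])
x0, lam0, f0 = solve(b0)

# Central differences of the optimal cost with respect to each component of b.
eps = 1e-6
grad_b = np.array([(solve(b0 + eps * e)[2] - solve(b0 - eps * e)[2]) / (2 * eps)
                   for e in np.eye(len(b0))])

print("lambda*(b)          =", lam0)
print("-grad_b f(x*(b))    =", -grad_b)
```

The two printed vectors agree to within finite-difference error, which is the statement that each multiplier measures the rate of change of the optimal cost with respect to its constraint level.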