INTRODUCTION
This dissertation is based on a reading of references [1], [2] and [3]. Farkas’
lemma is one of the theorems of the alternative. These theorems characterize
the optimality conditions of several minimization problems, and it is known
that they are all equivalent in the sense that each can easily be derived from
the others. Usually the systems involved consist of linear inequalities and/or
equalities, called a primal system and a dual system. A theorem of the
alternative asserts that either the primal system has a solution or the dual
system has a solution, but never both. Optimality conditions are also given by
duality theorems, and a standard technique for proving a duality theorem
linking two constrained optimization problems is to apply an appropriate
theorem of the alternative.
Chapter 1 contains results from [1]. We show that, using simple
logical arguments, a duality theorem is equivalent to a theorem of the
alternative. These arguments need no linear-space structure.
The abstract theorem of the alternative is as follows:
Let X and Y be arbitrary non-empty sets, and let f and g be arbitrary
extended-real-valued functions defined on X and Y, respectively. For each
α ∈ (−∞, +∞], consider the statements :
I_α : there exists x ∈ X such that f(x) < α,
II_α : there exists y ∈ Y such that g(y) ≥ α.
The following statement is an abstract theorem of the alternative involving the
pairs (f, X) and (g, Y) :
(a) for all α ∈ (−∞, +∞], exactly one of I_α, II_α holds.
The statement (a) is introduced purely as a logical statement. Next, the following
logical statement is an abstract duality theorem:
(d) inf_{x ∈ X} f(x) = max_{y ∈ Y} g(y).
This statement means that inf_X f = sup_Y g and sup_Y g = g(ȳ) for some ȳ ∈ Y, i.e.,
sup_Y g is attained in Y.
In Chapter 2, we elaborate an elementary proof of Farkas’ lemma given
in [2]*. The proof is based on elementary arguments.
FARKAS’ LEMMA : Let A be a real m × n matrix and let c be a real non-zero
n-vector. Then either the primal system
(1) Ax ≥ 0 and cᵀx < 0
has a solution x ∈ ℝ^n, or the dual system
(2) Aᵀy = c and y ≥ 0
has a solution y ∈ ℝ^m, but never both.
As Farkas’ lemma is one of the theorems of the alternative, it is not possible
that both systems are solvable. The question of which of the two systems is
solvable is answered by considering the bounded least squares problem
minimize ‖Aᵀy − c‖² subject to y ≥ 0.
* We are thankful to Prof. Achiya Dax, Hydrological Services, Israel, for encouraging discussions and his helpful comments.
Let y* ∈ ℝ^m solve this problem and let r* = Aᵀy* − c denote the corresponding
residual vector. Then we prove that if r* = 0 then y* solves the second system;
otherwise r* solves the first system. The existence of a point y* that solves the
above BLS is established by a simple iterative algorithm.
The last chapter, Chapter 3, gives various theorems of the alternative, namely
those due to Gale, Gordan, Stiemke, Motzkin and Dax. First, we prove what is
called the alternative form of Farkas’ lemma (AFFL) using Tucker’s theorem.
Then we show that it is equivalent to Farkas’ lemma. The statement of AFFL
is :
Let M ∈ ℝ^{m×n} and c ∈ ℝ^n be arbitrary. Then either
(A) there exists v ≥ 0 such that Mᵀv ≥ c, or
(B) there exists w ≥ 0 such that Mw ≤ 0 and cᵀw > 0,
and the statement of Tucker’s theorem is :
Let A be an arbitrary skew-symmetric matrix. There exists u ≥ 0 such
that Au ≥ 0 and u + Au > 0.
Note that as A is skew-symmetric, uᵀAu = 0, so each term u_i (Au)_i of this
sum of non-negative terms vanishes. Hence u_i > 0 implies (Au)_i = 0, and
(Au)_i > 0 implies u_i = 0. This is strict complementarity between
u and Au.
The above-mentioned theorems of the alternative are merely special
cases of this theorem and may be proved simply by substituting the
appropriate matrices for M. ∎
CHAPTER 1
DUALITY THEOREMS AND THEOREMS OF ALTERNATIVE
Let X and Y be arbitrary non-empty sets and let f and g be arbitrary extended-real-valued functions defined on X and Y respectively. For each α ∈ (−∞, +∞], consider the statements
I_α : there exists x ∈ X such that f(x) < α,
II_α : there exists y ∈ Y such that g(y) ≥ α.
The following logical statement is an abstract theorem of the alternative involving the pairs (f, X) and (g, Y) :
for all α ∈ (−∞, +∞], exactly one of I_α, II_α holds. (1)
Consider two abstract optimization problems.
The primal problem is
inf_{x ∈ X} f(x) (2)
and the dual problem is
sup_{y ∈ Y} g(y). (3)
For these problems, the abstract duality theorem is the following logical statement:
inf_{x ∈ X} f(x) = max_{y ∈ Y} g(y). (4)
This statement means that inf_X f = sup_Y g and sup_Y g = g(ȳ) for some ȳ ∈ Y, i.e., sup_Y g is attained in Y.
Neither (1) nor (4) has any real content until the pairs (f, X) and (g, Y) are assigned specific interpretations, or structure, and hypotheses are given under which (1) or (4) is true. The theory of dual optimization problems involves representing f and g in terms of some other function
L : X × Y → [−∞, +∞] such that
f(x) = sup_{y ∈ Y} L(x, y) for all x ∈ X (5)
and
g(y) = inf_{x ∈ X} L(x, y) for all y ∈ Y. (6)
Then the primal is
inf_{x ∈ X} f(x) = inf_{x ∈ X} sup_{y ∈ Y} L(x, y) (7)
and the dual is
sup_{y ∈ Y} g(y) = sup_{y ∈ Y} inf_{x ∈ X} L(x, y). (8)
The abstract duality theorem is
inf_{x ∈ X} sup_{y ∈ Y} L(x, y) = max_{y ∈ Y} inf_{x ∈ X} L(x, y). (9)
This is called the abstract minimax theorem.
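As a concrete illustration (a standard specialization, not taken from [1]): choosing L to be the ordinary Lagrangian of a linear program makes the scheme (5)–(9) reproduce LP duality. A sketch, under the assumption that both programs below are feasible:

```latex
% Illustrative specialization (standard LP duality, not from [1]):
% X = R^n, Y = { y in R^m : y >= 0 }, and L the Lagrangian of
%   minimize c^T x  subject to  Ax >= b.
\[
  L(x,y) \;=\; c^{T}x + y^{T}(b - Ax),
\]
\[
  f(x) = \sup_{y \ge 0} L(x,y) =
  \begin{cases} c^{T}x, & Ax \ge b,\\ +\infty, & \text{otherwise,} \end{cases}
  \qquad
  g(y) = \inf_{x \in \mathbb{R}^{n}} L(x,y) =
  \begin{cases} b^{T}y, & A^{T}y = c,\\ -\infty, & \text{otherwise.} \end{cases}
\]
% With these f and g, statement (9) is the LP duality theorem:
% the two optimal values coincide and the dual supremum is attained.
```

With this choice, statement (1) specializes to a linear theorem of the alternative, in line with the recipe of Proposition 1.3.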
Note : The symbol ∼ denotes negation.
Lemma 1.1 : The following two statements are equivalent.
(a1) for all α ∈ (−∞, +∞], II_α ⇒ ∼I_α
[equivalently, I_α ⇒ ∼II_α : the statements I_α and II_α never hold together]
(d1) [Weak duality theorem] inf_{x ∈ X} f(x) ≥ sup_{y ∈ Y} g(y)
Proof : [(a1) ⇒ (d1)]
Let y ∈ Y and put α = g(y). If α = −∞ then obviously inf_X f ≥ α. Now if
α ∈ (−∞, +∞], then II_α holds, and by (a1), for all x ∈ X, f(x) ≥ α. This gives
inf_X f ≥ α = g(y). Since y ∈ Y is arbitrary and the left-hand side is independent
of y, inf_X f ≥ sup_Y g.
[(d1) ⇒ (a1)]
Let α ∈ (−∞, +∞] be such that II_α holds. There is y_0 ∈ Y such that
g(y_0) ≥ α. Hence inf_X f ≥ sup_Y g ≥ g(y_0) ≥ α, and ∼I_α follows. ∎
Lemma 1.2 : The following two statements are equivalent.
(a2) [the non-trivial half of the theorem of the alternative]
for all α ∈ (−∞, +∞], ∼I_α ⇒ II_α
(d2) there exists ȳ ∈ Y such that g(ȳ) ≥ inf_{x ∈ X} f(x)
Proof : [(a2) ⇒ (d2)]
If inf_X f = −∞ then clearly g(y) ≥ inf_X f for each y ∈ Y. Now
suppose inf_X f > −∞. Then ∼I_α holds for α = inf_X f. Therefore (a2)
implies that II_α holds. Thus there exists ȳ ∈ Y such that g(ȳ) ≥ α, i.e., there
exists ȳ ∈ Y such that g(ȳ) ≥ inf_X f. This is (d2).
[(d2) ⇒ (a2)]
Let (d2) hold, i.e., there exists ȳ ∈ Y such that g(ȳ) ≥ inf_X f. Suppose
α ∈ (−∞, +∞] is such that ∼I_α holds, i.e., for all x ∈ X, f(x) ≥ α. This
gives inf_X f ≥ α. Then by (d2),
g(ȳ) ≥ inf_X f ≥ α. This means II_α holds. ∎
The following proposition gives the basic logical principle relating the
duality theorem and the theorem of the alternative.
Proposition 1.3: (1) holds if and only if (4) holds.
Proof: The condition (1) is equivalent to (a1) and (a2) together, and (4) is equivalent
to (d1) and (d2) together. Hence by the previous two lemmas, (1) holds if and only if (4) holds. ∎
Remark : The “if” half of the proposition gives a general recipe for theorems
of the alternative.
CHAPTER – 2
AN ELEMENTARY PROOF OF FARKAS’ LEMMA
Farkas’ Lemma 2.1 : Let A be a real m × n matrix and let c be a real non-zero n-vector. Then either the primal system
Ax ≥ 0 and cᵀx < 0 (1)
has a solution x ∈ ℝ^n, or the dual system
Aᵀy = c and y ≥ 0 (2)
has a solution y ∈ ℝ^m, but never both.
Remark 2.2 : Note that cᵀx < 0 implies x ≠ 0, while Aᵀy = c implies y ≠ 0 since c ≠ 0. Now if both (1) and (2) hold, then
0 > cᵀx = (Aᵀy)ᵀx = yᵀ(Ax) ≥ 0,
a contradiction (the last inequality holds since y ≥ 0 and Ax ≥ 0). Hence the two systems cannot both have a solution.
Now the question of which of the two systems is solvable is answered
by considering the bounded least squares (BLS) problem
minimize ‖Aᵀy − c‖² subject to y ≥ 0, (3)
where ‖·‖ is the Euclidean norm. Let y* ∈ ℝ^m and let
r* = Aᵀy* − c. (4)
r* is called the residual vector for the BLS.
Lemma 2.3 : Let y* ∈ ℝ^m and let r* be the residual vector for the BLS (3).
Then y* solves (3) if and only if y* and r* satisfy the conditions
y* ≥ 0, Ar* ≥ 0 and y*ᵀ(Ar*) = 0. (5)
Proof : Let y* solve (3). Consider the one-parameter quadratic functions
f_i(θ) = ‖Aᵀ(y* + θe_i) − c‖² = ‖r* + θa_i‖², i = 1, …, m, (6)
where a_iᵀ denotes the i-th row of A, θ is a real variable and e_i denotes the i-th
column of the m × m unit matrix. As y* solves (3), f_i(θ) ≥ f_i(0) for all real θ
such that y_i* + θ ≥ 0. Hence θ = 0 solves the problem
minimize f_i(θ)
subject to y_i* + θ ≥ 0.
Note that
f_i(θ) = ‖Aᵀy* + θa_i − c‖²
= ‖Aᵀy* − c‖² + 2θ a_iᵀ(Aᵀy* − c) + θ² a_iᵀa_i
= ‖r*‖² + 2θ a_iᵀr* + θ² ‖a_i‖²,
so that
f_i′(θ) = 2‖a_i‖² θ + 2a_iᵀr* and f_i′(0) = 2a_iᵀr*.
As the derivative vanishes at the minimum point, y_i* > 0 (i.e., θ may take either
sign) implies a_iᵀr* = 0.
Next, if y_i* = 0 (i.e., θ ≥ 0) and a_iᵀr* < 0, then for sufficiently small θ > 0
we can write
f_i(θ) = f_i(0) + θ f_i′(0) + θ ε(θ)
(where ε(θ) → 0 as θ → 0) to get f_i(θ) < f_i(0), contradicting the minimality
of θ = 0 [[6], Theorem 4.1.2]. Thus y_i* = 0 implies a_iᵀr* ≥ 0. Therefore (5)
holds.
Conversely, assume that (5) holds and let z be an arbitrary point in ℝ^m such that z ≥ 0. Put u = z − y*. Then clearly y_i* = 0 implies u_i ≥ 0. From (5), as
Ar* ≥ 0, y*ᵀ(Ar*) = 0 and z ≥ 0,
uᵀ(Ar*) = zᵀ(Ar*) − y*ᵀ(Ar*) = zᵀ(Ar*) ≥ 0.
Now,
‖Aᵀz − c‖² = ‖Aᵀ(u + y*) − c‖²
= ‖(Aᵀy* − c) + Aᵀu‖²
= ‖Aᵀy* − c‖² + 2uᵀ(Ar*) + ‖Aᵀu‖²
≥ ‖Aᵀy* − c‖²
shows that y* solves (3). ∎
Note : Now, combining (4) and (5), we have
cᵀr* = (Aᵀy* − r*)ᵀr*
= y*ᵀ(Ar*) − r*ᵀr*
= −‖r*‖²,
which gives the following theorem.
Theorem 2.4 : Let y* solve (3) and let r* = Aᵀy* − c denote the
corresponding residual vector. If r* = 0 then y* solves (2). Otherwise, r*
solves (1) and cᵀr* = −‖r*‖².
Remark 2.5 : The theorem tells us which of the two systems in Farkas’ Lemma is solvable, by considering the BLS. What remains is the existence of a point y* that solves (3).
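The decision procedure of Theorem 2.4 can be sketched numerically. The following is a minimal illustration; as an assumption of this sketch, SciPy’s off-the-shelf non-negative least squares solver nnls stands in for the iterative algorithm described below, and the data are hypothetical:

```python
import numpy as np
from scipy.optimize import nnls

def farkas_alternative(A, c, tol=1e-10):
    """Decide which Farkas system is solvable, via the BLS problem
    minimize ||A^T y - c||^2 subject to y >= 0 (Theorem 2.4)."""
    # nnls solves min ||Mz - b|| subject to z >= 0; take M = A^T, b = c.
    y, _ = nnls(A.T, c)
    r = A.T @ y - c                      # residual r* = A^T y* - c
    if np.linalg.norm(r) <= tol:
        return "dual", y                 # y* solves A^T y = c, y >= 0
    return "primal", r                   # r* satisfies A r >= 0, c^T r < 0

# Example: c = (1, -1) is not in the cone {A^T y : y >= 0} for A = I,
# so the primal system Ax >= 0, c^T x < 0 must be solvable.
which, x = farkas_alternative(np.eye(2), np.array([1.0, -1.0]))
```

The returned certificate can be checked directly against the inequalities of (1) or (2).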
Existence of y* :
The existence of a point y* that solves (3) is established by introducing a
simple iterative algorithm whose k-th iteration, k = 1, 2, …, consists of the following two steps.
Step 1 : Solving an unconstrained least squares problem.
Let y_k = (y_1, …, y_m)ᵀ ≥ 0 denote the current estimate of the solution at the
beginning of the k-th iteration. Define
r_k = Aᵀy_k − c,
v_k = { i : y_i = 0 } and w_k = { i : y_i > 0 }.
The number of indices in w_k is denoted by t. Now let A_k be the t × n
matrix whose rows are a_iᵀ, i ∈ w_k. For simplicity assume that A_k = (a_1, …, a_t)ᵀ,
w_k = {1, …, t} and v_k = {t + 1, …, m}.
Case (i) : If t = 0 (i.e., y_k = 0), skip to Step 2 (to check optimality and to move away from a dead point).
Case (ii) : If r_k = 0 (i.e., Aᵀy_k = c), skip to Step 2 (to check optimality and
to move away from a dead point).
Otherwise, let the t-vector w = (w_1, …, w_t)ᵀ solve the unconstrained least
squares problem
minimize ‖A_kᵀw + r_k‖².
Claim : w = 0 solves this problem if and only if A_k r_k = 0.
Let f(w) = ‖A_kᵀw + r_k‖² (w unrestricted)
= ‖r_k‖² + 2wᵀ(A_k r_k) + ‖A_kᵀw‖².
If w = 0 solves the problem, then ‖A_kᵀw + r_k‖² ≥ ‖r_k‖², i.e.,
0 ≤ 2wᵀ(A_k r_k) + ‖A_kᵀw‖² for every w;
replacing w here by εw and letting ε → 0+ gives wᵀ(A_k r_k) ≥ 0 for every w,
hence A_k r_k = 0 (since w is unrestricted).
Conversely, if A_k r_k = 0, then
‖A_kᵀw + r_k‖² = ‖r_k‖² + ‖A_kᵀw‖² ≥ ‖r_k‖²,
i.e., w = 0 solves the problem. In this case skip to Step 2.
Otherwise, define a non-zero search direction u_k ∈ ℝ^m
by u_i = w_i for i = 1, …, t and u_i = 0 for i = t + 1, …, m, and the next point is given by
y_{k+1} = y_k + θ_k u_k,
where θ_k is the largest number in the interval (0, 1] that keeps the point
y_k + θ_k u_k feasible. This implies the following:
for i = 1, …, t, to have y_i + θ w_i ≥ 0, we must have
θ ≥ max { −y_i / w_i : w_i > 0 } (7)
and
θ ≤ min { −y_i / w_i : w_i < 0 }. (8)
Further note that when w_i ≥ 0 any θ ≥ 0 will work, and when w_i < 0 we have
−y_i / w_i > 0, so the minimum in (8) is positive. Clearly, this
minimum also satisfies (7). Hence we can write
θ_k = min { 1, −y_i / w_i : w_i < 0 }.
Step 2 : Testing optimality and moving away from a dead point.
Here t = 0, or r_k = 0, or A_k r_k = 0. When t = 0 we have y_k = 0, so that A_k does not exist and the condition is vacuous. Therefore, in all cases we may assume that A_k r_k = 0. This implies the following claim.
Claim : If A_k r_k = 0 then y_k solves the problem
minimize ‖Aᵀy − c‖²
subject to y_i = 0 for i ∈ v_k
and y_i ≥ 0 for i ∈ w_k (9)
[[7], Theorem 3L, page 156].
In this case, y_k is called a “dead point” (t does not change). To test whether
y_k is optimal, compute an index j such that
a_jᵀr_k = min { a_iᵀr_k : i ∈ v_k }.
If a_jᵀr_k ≥ 0, then a_iᵀr_k ≥ 0 for all i ∈ v_k. We have a_iᵀr_k = 0 for all i ∈ w_k (since
A_k r_k = 0). Hence Ar_k ≥ 0. Further, by the definitions of v_k and w_k, y_kᵀ(Ar_k) = 0.
Thus y_k and r_k satisfy (5). Therefore, by Lemma 2.3, y_k solves (3) and the algorithm terminates.
Otherwise (if a_jᵀr_k < 0), the next point is defined as
y_{k+1} = y_k − (a_jᵀr_k / a_jᵀa_j) e_j.
Note that −a_jᵀr_k / a_jᵀa_j > 0.
Claim : y_{k+1} minimizes f(θ) = ‖Aᵀ(y_k + θe_j) − c‖².
Note that
f(θ) = ‖r_k + θa_j‖² = ‖r_k‖² + 2θ a_jᵀr_k + θ² ‖a_j‖²,
f′(θ) = 2‖a_j‖² θ + 2a_jᵀr_k,
which vanishes at θ = −a_jᵀr_k / ‖a_j‖², and
f″(θ) = 2‖a_j‖² > 0. ∎
Theorem 2.6 : The above algorithm terminates in a finite number of steps.
Proof: It is clear from the following properties.
(a) The objective function is strictly decreasing at each iteration:
since a_jᵀr_k < 0, we have y_{k+1} ≠ y_k, and y_{k+1} minimizes f. Hence,
‖Aᵀy_{k+1} − c‖² < ‖Aᵀy_k − c‖².
(b) If θ_k = 1 then
y_{k+1, i} = y_i + u_i ≥ 0, i = 1, …, t,
so t does not change; hence y_{k+1} is a dead point.
Otherwise, θ_k = −y_l / u_l for some index l, and clearly
y_{k+1, l} = 0,
i.e., t decreases.
Hence, it is not possible to perform more than m iterations without reaching a dead point.
(c) Each time we reach a dead point, the current point solves (9).
(d) There are only finitely many problems of the form (9), and because of (a), it is not possible to encounter the same problem twice. ∎
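For completeness, the two-step iteration above can be sketched in code. This is an illustrative reimplementation under stated assumptions (floating-point tolerances in place of exact zero tests, and numpy’s lstsq for the unconstrained subproblem); it is not claimed to reproduce [2] line by line:

```python
import numpy as np

def bls_solve(A, c, tol=1e-12, max_iter=1000):
    """Sketch of the two-step iteration for the BLS problem (3):
    minimize ||A^T y - c||^2 subject to y >= 0."""
    m, _ = A.shape
    y = np.zeros(m)
    for _ in range(max_iter):
        r = A.T @ y - c                      # r_k = A^T y_k - c
        Ar = A @ r                           # i-th entry is a_i^T r_k
        free = np.flatnonzero(y > tol)       # w_k = { i : y_i > 0 }
        # Step 1: unconstrained least squares over the free variables.
        if free.size > 0 and np.max(np.abs(Ar[free])) > tol:
            Ak = A[free]                     # t x n matrix with rows a_i^T
            # w minimizes ||Ak^T w + r_k||^2
            w, *_ = np.linalg.lstsq(Ak.T, -r, rcond=None)
            u = np.zeros(m)
            u[free] = w                      # search direction u_k
            neg = free[u[free] < -tol]
            theta = min(1.0, (-y[neg] / u[neg]).min()) if neg.size else 1.0
            y = np.maximum(y + theta * u, 0.0)   # step, clipped for round-off
            continue
        # Step 2: test optimality at a dead point, or move away from it.
        bound = np.flatnonzero(y <= tol)     # v_k = { i : y_i = 0 }
        if bound.size == 0 or Ar[bound].min() >= -tol:
            return y                         # conditions (5) hold
        j = bound[np.argmin(Ar[bound])]      # most negative a_j^T r_k
        y[j] -= Ar[j] / (A[j] @ A[j])        # exact minimization along e_j
    return y
```

On termination the returned point satisfies the optimality conditions (5) up to the chosen tolerance.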
Corollary 2.7 : Let y* and r* be as in Theorem 2.4 and assume that r* ≠ 0.
Then the vector r*/‖r*‖ solves the steepest descent problem
minimize cᵀx (10)
subject to Ax ≥ 0 and ‖x‖ ≤ 1. (11)
Proof: Let x satisfy the constraints (11). Then, as y* ≥ 0,
y*ᵀ(Ax) ≥ 0.
The Cauchy–Schwarz inequality gives
|r*ᵀx| ≤ ‖r*‖ ‖x‖ ≤ ‖r*‖.
This gives
cᵀx = (Aᵀy* − r*)ᵀx = y*ᵀ(Ax) − r*ᵀx
≥ −r*ᵀx
≥ −‖r*‖.
Since cᵀ(r*/‖r*‖) = −‖r*‖² / ‖r*‖ = −‖r*‖, the vector r*/‖r*‖ solves the steepest descent problem. ∎
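Corollary 2.7 can likewise be checked numerically on a small instance (again using SciPy’s nnls as a stand-in solver for the BLS; the data here are hypothetical):

```python
import numpy as np
from scipy.optimize import nnls

# With A = I and c = (1, -1), the BLS solution is y* = (1, 0), so the
# residual is r* = A^T y* - c = (0, 1), and x = r*/||r*|| should attain
# the minimum value -||r*|| of c^T x over Ax >= 0, ||x|| <= 1.
A = np.eye(2)
c = np.array([1.0, -1.0])
y, _ = nnls(A.T, c)
r = A.T @ y - c
x = r / np.linalg.norm(r)
feasible = bool(np.all(A @ x >= -1e-12) and np.linalg.norm(x) <= 1 + 1e-12)
value = c @ x
```

The computed value agrees with the bound −‖r*‖ established in the proof.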
CHAPTER - 3
ON THEOREMS OF THE ALTERNATIVE
In this chapter we give various theorems of the alternative, namely
those due to Gale, Gordan, Stiemke, Motzkin and Dax. First, the alternative form of
Farkas’ lemma (AFFL) is proved using Tucker’s theorem. The above-mentioned
theorems of the alternative are merely special cases of this theorem.
Tucker’s Theorem 3.1 : If B is a real skew-symmetric matrix, there exists a
non-negative vector u such that Bu is non-negative and u + Bu is strictly
positive.
Proof : [[4], Theorem 3.4]. ∎
Theorem 3.2 ( AFFL: Alternative Form of Farkas’ Lemma ) :
Let M ∈ ℝ^{m×n} and c ∈ ℝ^n be arbitrary. Then either
(A) there exists v ≥ 0 such that Mᵀv ≥ c (i.e., Mᵀv − c ≥ 0)
or
(B) there exists w ≥ 0 such that Mw ≤ 0 and cᵀw > 0,
but not both (A) and (B) hold.
Proof : Consider the following skew-symmetric matrix (semicolons separate block rows):
B = [ 0 0 −M ; 0 0 cᵀ ; Mᵀ −c 0 ]. (1)
Now, by Tucker’s Theorem 3.1, there exists u = ( vᵀ t wᵀ )ᵀ ≥ 0, with v ∈ ℝ^m,
t ∈ ℝ and w ∈ ℝ^n, such that Bu ≥ 0 and u + Bu > 0. Now,
Bu = ( −Mw ; cᵀw ; Mᵀv − tc ) ≥ 0,
i.e.,
Mw ≤ 0, cᵀw ≥ 0 and Mᵀv − tc ≥ 0. (2)
Then u + Bu > 0 gives
v − Mw > 0, t + cᵀw > 0 and w + Mᵀv − tc > 0. (3)
Case (i) : t = 0. Then (2) gives Mw ≤ 0, and (3) gives cᵀw > 0. Hence (B) holds.
Case (ii) : t > 0. Then (2) gives Mᵀ(v/t) ≥ c with v/t ≥ 0, so (A) holds (with v/t in place of v).
Further, if (A) and (B) hold together, then by (A), Mᵀv ≥ c, and by (B)
there exists w ≥ 0 such that Mw ≤ 0 and cᵀw > 0. Therefore
vᵀ(Mw) = (Mᵀv)ᵀw ≥ cᵀw > 0.
But Mw ≤ 0 and v ≥ 0 give vᵀ(Mw) ≤ 0. Hence both (A) and (B) cannot
hold together. ∎
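The two branches of AFFL, and their mutual exclusivity, can be checked on small hypothetical examples (the matrices below are illustrative, not from [3]):

```python
import numpy as np

# Branch (A): M = I, c = (1, -1); v = (1, 0) >= 0 satisfies M^T v >= c.
M = np.eye(2)
c = np.array([1.0, -1.0])
v = np.array([1.0, 0.0])
branch_A = bool(np.all(v >= 0) and np.all(M.T @ v >= c))

# For this M, branch (B) must fail: w >= 0 and Mw = w <= 0 force w = 0,
# and then c^T w = 0 is not positive.

# Branch (B): M2 = -I, c2 = (1, 1); no v >= 0 gives M2^T v = -v >= c2,
# but w = (1, 0) >= 0 satisfies M2 w <= 0 and c2^T w > 0.
M2 = -np.eye(2)
c2 = np.array([1.0, 1.0])
w = np.array([1.0, 0.0])
branch_B = bool(np.all(w >= 0) and np.all(M2 @ w <= 0) and c2 @ w > 0)
```

Each example exhibits exactly one of the two certificates, as the theorem requires.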
Theorem 3.3 : Farkas’ Lemma and AFFL are equivalent.
Proof : [⇒] According to Farkas’ Lemma, for A ∈ ℝ^{m×n} and b ∈ ℝ^n, exactly one of the
following systems has a solution:
(C) Ax ≥ 0 and bᵀx < 0,
(D) Aᵀy = b and y ≥ 0.
In (C), replace x by −x to get Ax ≤ 0 and bᵀx > 0. This is the (B) of AFFL
(with M = A and c = b). Clearly, (D) implies (A) of the AFFL.
[⇐] Apply AFFL to M = ( A −A ) ∈ ℝ^{m×2n} and c = ( bᵀ −bᵀ )ᵀ. Then (A) of AFFL
reads Aᵀv ≥ b and −Aᵀv ≥ −b, i.e., Aᵀv = b with v ≥ 0; taking y = v gives (D). And
(B) of AFFL reads: there exists w = ( w_1ᵀ w_2ᵀ )ᵀ ≥ 0 such that A(w_1 − w_2) ≤ 0
and bᵀ(w_1 − w_2) > 0; replacing w_1 − w_2 by −x gives (C). ∎
Notation : We denote by e the vector whose every element is 1.
Gale’s Theorem 3.4 : Either
(A1) there exists x ∈ ℝ^n such that Ax ≤ b
or
(B1) there exists y ≥ 0 such that Aᵀy = 0 and bᵀy = −1.
Proof : In AFFL, let
M = [ Aᵀ ; −Aᵀ ], v = ( x_1ᵀ x_2ᵀ )ᵀ, c = −b and w = y.
Substituting these values in (A) of AFFL, we get
( A −A ) ( x_1ᵀ x_2ᵀ )ᵀ ≥ −b,
i.e., Ax_1 − Ax_2 ≥ −b,
i.e., Ax ≤ b for x = x_2 − x_1.
This is (A1).
Now, the same substitutions in (B) of AFFL give
[ Aᵀ ; −Aᵀ ] y ≤ 0 and −bᵀy > 0,
i.e., Aᵀy = 0 and bᵀy < 0.
Now, replace y by y/(−bᵀy) to get bᵀy = −1. ∎
Gordan’s Theorem 3.5 : Either
(A2) there exists x ∈ ℝ^n such that Ax < 0
or
(B2) there exists y ≥ 0 such that Aᵀy = 0 and eᵀy = 1.
Proof : Put b = −e in Gale’s Theorem 3.4 (note that Ax ≤ −e is solvable if and only if
Ax < 0 is solvable, since any solution of the latter may be scaled by a positive factor). ∎
Note : Gordan’s theorem ensures that at least one component of y is strictly
positive.
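A hand-checked illustration of Gordan’s theorem (hypothetical data):

```python
import numpy as np

# A has rows (1) and (-1): Ax < 0 would need x < 0 and -x < 0 at once,
# so (A2) fails and (B2) must hold.
A = np.array([[1.0], [-1.0]])
y = np.array([0.5, 0.5])                      # y >= 0
B2_holds = bool(np.allclose(A.T @ y, 0.0)     # A^T y = 0
                and np.isclose(np.ones(2) @ y, 1.0))  # e^T y = 1

# Dropping the second row leaves Ax < 0 solvable, so (A2) holds instead.
A2 = np.array([[1.0]])
x = np.array([-1.0])
A2_holds = bool(np.all(A2 @ x < 0))
```

In the first case the positive component of y is guaranteed by eᵀy = 1, as the note above observes.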
Farkas’ Lemma 3.6 : Let A ∈ ℝ^{m×n} and b ∈ ℝ^n be arbitrary. Then either
(A3) there exists y ≥ 0 such that Aᵀy = b
or
(B3) there exists x ∈ ℝ^n such that Ax ≤ 0 and bᵀx > 0.
Proof: Theorem 3.3. ∎
Stiemke’s Theorem 3.7 : Either
(A4) there exists z > 0 such that Aᵀz = 0
or
(B4) there exists x ∈ ℝ^n such that Ax ≤ 0 and Ax ≠ 0.
Proof : Put b = −Aᵀe in Farkas’ Lemma 3.6. Then equation (A3) becomes
Aᵀ(y + e) = 0. Put z = y + e > 0. This gives (A4).
Now, equation (B3) gives Ax ≤ 0 and, from bᵀx > 0, eᵀ(Ax) < 0. Now, note that
eᵀ(Ax) = Σ_{i=1}^{m} Σ_j a_{ij} x_j < 0.
As Σ_i ( Σ_j a_{ij} x_j ) < 0, there is at least one i such that Σ_j a_{ij} x_j < 0. Therefore Ax ≠ 0.
This gives (B4). ∎
Motzkin’s Theorem 3.8 : Let Aᵀ = ( A_1ᵀ A_2ᵀ A_3ᵀ ), where A_i ∈ ℝ^{m_i×n} and
m_1 + m_2 + m_3 = m. Then exactly one of the following holds.
(A5) there exists x ∈ ℝ^n such that A_1 x < 0, A_2 x ≤ 0 and A_3 x = 0;
(B5) there exist y_1 ≥ 0, y_2 ≥ 0 and y_3 ∈ ℝ^{m_3} such that
A_1ᵀy_1 + A_2ᵀy_2 + A_3ᵀy_3 = 0 and eᵀy_1 > 0.
Proof : In AFFL, let
M = [ A_1ᵀ A_2ᵀ A_3ᵀ −A_3ᵀ ; −A_1ᵀ −A_2ᵀ −A_3ᵀ A_3ᵀ ] and cᵀ = ( eᵀ 0ᵀ 0ᵀ 0ᵀ ).
By AFFL, either there exists v = ( x_1ᵀ x_2ᵀ )ᵀ ≥ 0 such that Mᵀv ≥ c, or there is
w = ( y_1ᵀ y_2ᵀ z_1ᵀ z_2ᵀ )ᵀ ≥ 0 such that Mw ≤ 0 and cᵀw > 0. Thus, Mᵀv ≥ c
gives
[ A_1 −A_1 ; A_2 −A_2 ; A_3 −A_3 ; −A_3 A_3 ] ( x_1ᵀ x_2ᵀ )ᵀ ≥ ( eᵀ 0ᵀ 0ᵀ 0ᵀ )ᵀ,
i.e., A_1 x + e ≤ 0, A_2 x ≤ 0 and A_3 x = 0, where x = x_2 − x_1.
This gives (A5). Now Mw ≤ 0 implies
A_1ᵀy_1 + A_2ᵀy_2 + A_3ᵀz_1 − A_3ᵀz_2 ≤ 0 and −( A_1ᵀy_1 + A_2ᵀy_2 + A_3ᵀz_1 − A_3ᵀz_2 ) ≤ 0,
i.e.,
A_1ᵀy_1 + A_2ᵀy_2 + A_3ᵀy_3 = 0, where y_3 = z_1 − z_2.
Next, cᵀw > 0 gives
( eᵀ 0ᵀ 0ᵀ 0ᵀ ) ( y_1ᵀ y_2ᵀ z_1ᵀ z_2ᵀ )ᵀ > 0,
i.e., eᵀy_1 > 0.
This is (B5). ∎
Notation : By |x| we mean the vector whose elements are |x_i|.
Dax’s Theorem 3.9 : Let d ∈ ℝ^m be any strictly positive vector. Then either
(A6) there exists y ∈ ℝ^m satisfying −d ≤ y ≤ d such that Aᵀy = b
or
(B6) there exists x ∈ ℝ^n such that bᵀx > dᵀ|Ax|.
Proof : In AFFL, let
M = [ A −A I −I ; −A A −I I ], v = ( y_1ᵀ y_2ᵀ )ᵀ,
cᵀ = ( bᵀ −bᵀ −dᵀ −dᵀ ) and w = ( x_1ᵀ x_2ᵀ z_1ᵀ z_2ᵀ )ᵀ.
The AFFL condition Mᵀv ≥ c gives
Aᵀy = b and −d ≤ y ≤ d,
where y = y_1 − y_2. This gives (A6).
Now, Mw ≤ 0 gives
A(x_1 − x_2) + (z_1 − z_2) ≤ 0 and −A(x_1 − x_2) − (z_1 − z_2) ≤ 0,
i.e., Ax = z_2 − z_1, where x = x_1 − x_2.
Further, |Ax| = |z_1 − z_2| ≤ z_1 + z_2, since z_1 ≥ 0 and z_2 ≥ 0.
Next, cᵀw > 0 gives
bᵀ(x_1 − x_2) − dᵀ(z_1 + z_2) > 0,
i.e., bᵀx > dᵀ(z_1 + z_2).
Since d > 0 and |Ax| ≤ z_1 + z_2, we have
dᵀ|Ax| ≤ dᵀ(z_1 + z_2) < bᵀx.
This gives (B6).
Next, suppose (A6) and (B6) hold together. Then (A6) gives
bᵀx = (Aᵀy)ᵀx = yᵀ(Ax) ≤ |yᵀ(Ax)| ≤ |y|ᵀ|Ax|.
But since −d ≤ y ≤ d, |y| ≤ d, so that
bᵀx ≤ dᵀ|Ax|,
which contradicts (B6). Hence both (A6) and (B6) do not hold together. ∎
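Finally, the inequality chain used in the last step of the proof can be exercised numerically (hypothetical data):

```python
import numpy as np

# Branch (A6): A = I, b = (0.5, -0.5), d = (1, 1); y = b itself works.
A = np.eye(2)
d = np.array([1.0, 1.0])
b = np.array([0.5, -0.5])
y = np.array([0.5, -0.5])
A6_holds = bool(np.all(np.abs(y) <= d) and np.allclose(A.T @ y, b))

# Then for every x, b^T x = y^T (Ax) <= |y|^T |Ax| <= d^T |Ax|,
# so branch (B6) cannot also hold; spot-check on random vectors.
rng = np.random.default_rng(0)
B6_excluded = all(b @ x <= d @ np.abs(A @ x) + 1e-12
                  for x in rng.standard_normal((100, 2)))
```

The spot check mirrors the contradiction argument: whenever (A6) holds, bᵀx never exceeds dᵀ|Ax|.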
REFERENCES
[1] L. McLinden, Duality Theorems and Theorems of the Alternative, Proceedings of the American Mathematical Society, Vol. 53, pp. 172-175, 1975.
[2] Achiya Dax, An Elementary Proof of Farkas’ Lemma, SIAM Review, Vol. 39, No. 3, pp. 503-507, 1997.
[3] C. G. Broyden, On Theorems of the Alternative, Optimization Methods and Software, Vol. 16, No. 1, pp. 101-111.
[4] C. G. Broyden, A Simple Algebraic Proof of Farkas’s Lemma and Related Theorems, Optimization Methods and Software, Vol. 8, No. 3, pp. 185-199, 1998.
[5] D. G. Luenberger, Optimization by Vector Space Methods, Wiley, New York, 1969.
[6] M. S. Bazaraa, Hanif D. Sherali, C. M. Shetty, Nonlinear Programming: Theory and Algorithms, John Wiley & Sons.
[7] Gilbert Strang, Linear Algebra and Its Applications, Thomson Learning Inc., 1998.