INTRODUCTION
This dissertation is based on a reading of references [1], [2] and [3]. Farkas’
lemma is one of the theorems of the alternative. These theorems characterize
the optimality conditions of several minimization problems, and it is known
that they are all equivalent in the sense that each can easily be derived from
the others. Usually the systems involved consist of linear inequalities and/or
equalities, called a primal system and a dual system. A theorem of the
alternative asserts that either the primal system has a solution or the dual
system has a solution, but never both. Optimality conditions are also given by
duality theorems, and a standard technique for proving a duality theorem
linking two constrained optimization problems is to apply an appropriate
theorem of the alternative.
Chapter 1 contains results from [1]. We show that, using simple
logical arguments, a duality theorem is equivalent to a theorem of the
alternative. These arguments need no linear-space structure.
The abstract theorem of the alternative is as follows:
Let X and Y be arbitrary non-empty sets, and let f and g be arbitrary
extended-real-valued functions defined on X and Y, respectively. For each
α ∈ (−∞, +∞], consider the statements :
I_α : there exists x ∈ X such that f(x) < α,
II_α : there exists y ∈ Y such that g(y) ≥ α.
The following statement is an abstract theorem of the alternative involving the
pairs (f, X) and (g, Y) :
(a) for all α ∈ (−∞, +∞], exactly one of I_α, II_α holds.
The statement (a) is introduced purely as a logical statement. Next, the following
logical statement is an abstract duality theorem:
(d) inf_{x ∈ X} f(x) = max_{y ∈ Y} g(y).
This statement means that inf_X f = sup_Y g and sup_Y g = g(ȳ) for some ȳ ∈ Y, i.e.,
sup_Y g is attained in Y.
In Chapter 2, we elaborate an elementary proof of Farkas’ lemma given
in [2]*. The proof is based on elementary arguments.
FARKAS’ LEMMA : Let A be a real m × n matrix and let c be a real non-zero
n-vector. Then either the primal system
(1) Ax ≥ 0 and cᵀx < 0
has a solution x ∈ ℝ^n, or the dual system
(2) Aᵀy = c and y ≥ 0
has a solution y ∈ ℝ^m, but never both.
As Farkas’ lemma is one of the theorems of the alternative, it is not possible
that both systems are solvable. The question of which of the two systems is
solvable is answered by considering the bounded least squares problem
minimize ‖Aᵀy − c‖² subject to y ≥ 0.
* We are thankful to Prof. Achiya Dax, Hydrological Services, Israel, for encouraging discussions and his helpful comments.
Let y* ∈ ℝ^m solve this problem and let r* = Aᵀy* − c denote the corresponding
residual vector. Then we prove that if r* = 0 then y* solves the second system;
otherwise r* solves the first system. The existence of a point y* that solves the
above BLS is established by a simple iterative algorithm.
The last chapter, Chapter 3, gives various theorems of the alternative, namely
those due to Gale, Gordan, Stiemke, Motzkin and Dax. First, we prove what is
called the alternative form of Farkas’ lemma (AFFL) using Tucker’s theorem.
Then we show that it is equivalent to Farkas’ lemma. The statement of AFFL
is :
Let M ∈ ℝ^{m×n} and c ∈ ℝ^n be arbitrary. Then either
(A) there exists v ≥ 0 such that Mᵀv ≥ c, or
(B) there exists w ≥ 0 such that Mw ≤ 0 and cᵀw > 0,
and the statement of Tucker’s theorem is :
Let A be an arbitrary skew-symmetric matrix. There exists u ≥ 0 such
that Au ≥ 0 and u + Au > 0.
Note that as A is skew-symmetric, uᵀAu = 0, so each term u_i (Au)_i of this
sum of non-negative terms vanishes. Hence u_i > 0 implies (Au)_i = 0, and
(Au)_i > 0 implies u_i = 0. This is strict complementarity between
u and Au.
The above-mentioned theorems of the alternative are merely special
cases of this theorem and may be proved simply by substituting the
appropriate matrices for M. ∎
CHAPTER 1
DUALITY THEOREMS AND THEOREMS OF ALTERNATIVE
Let X and Y be arbitrary non-empty sets and let f and g be arbitrary extended-real-valued functions defined on X and Y respectively. For each α ∈ (−∞, +∞], consider the statements
I_α : there exists x ∈ X such that f(x) < α,
II_α : there exists y ∈ Y such that g(y) ≥ α.
The following logical statement is an abstract theorem of the alternative involving the pairs (f, X) and (g, Y) :
for all α ∈ (−∞, +∞], exactly one of I_α, II_α holds. (1)
Consider two abstract optimization problems.
The primal problem is
inf_{x ∈ X} f(x) (2)
and the dual problem is
sup_{y ∈ Y} g(y). (3)
For these problems, the abstract duality theorem is the following logical statement:
inf_{x ∈ X} f(x) = max_{y ∈ Y} g(y). (4)
This statement means that inf_X f = sup_Y g and sup_Y g = g(ȳ) for some ȳ ∈ Y, i.e., sup_Y g is attained in Y.
Neither (1) nor (4) has any real content until the pairs (f, X) and (g, Y) are assigned specific interpretations, or structure, and hypotheses are given under which (1) or (4) is true. The theory of dual optimization problems involves representing f and g in terms of some other function
L : X × Y → [−∞, +∞] such that
f(x) = sup_{y ∈ Y} L(x, y) for all x ∈ X (5)
and
g(y) = inf_{x ∈ X} L(x, y) for all y ∈ Y. (6)
Then the primal is
inf_{x ∈ X} f(x) = inf_{x ∈ X} sup_{y ∈ Y} L(x, y) (7)
and the dual is
sup_{y ∈ Y} g(y) = sup_{y ∈ Y} inf_{x ∈ X} L(x, y). (8)
The abstract duality theorem is
inf_{x ∈ X} sup_{y ∈ Y} L(x, y) = max_{y ∈ Y} inf_{x ∈ X} L(x, y). (9)
This is called the abstract minimax theorem.
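As a concrete illustration (a standard specialization, not taken from [1]): choosing L to be the ordinary Lagrangian of a linear program makes the scheme (5)–(9) reproduce LP duality. A sketch, under the assumption that both programs below are feasible:

```latex
% Illustrative specialization (standard LP duality, not from [1]):
% X = R^n, Y = { y in R^m : y >= 0 }, and L the Lagrangian of
%   minimize c^T x  subject to  Ax >= b.
\[
  L(x,y) \;=\; c^{T}x + y^{T}(b - Ax),
\]
\[
  f(x) = \sup_{y \ge 0} L(x,y) =
  \begin{cases} c^{T}x, & Ax \ge b,\\ +\infty, & \text{otherwise,} \end{cases}
  \qquad
  g(y) = \inf_{x \in \mathbb{R}^{n}} L(x,y) =
  \begin{cases} b^{T}y, & A^{T}y = c,\\ -\infty, & \text{otherwise.} \end{cases}
\]
% With these f and g, statement (9) is the LP duality theorem:
% the two optimal values coincide and the dual supremum is attained.
```

With this choice, statement (1) specializes to a linear theorem of the alternative, in line with the recipe of Proposition 1.3.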
Note : The symbol ∼ denotes negation.
Lemma 1.1 : The following two statements are equivalent.
(a1) for all α ∈ (−∞, +∞], II_α ⇒ ∼I_α
[equivalently, I_α ⇒ ∼II_α : the statements I_α and II_α never hold together]
(d1) [Weak duality theorem] inf_{x ∈ X} f(x) ≥ sup_{y ∈ Y} g(y)
Proof : [(a1) ⇒ (d1)]
Let y ∈ Y and put α = g(y). If α = −∞ then obviously inf_X f ≥ α. Now if
α ∈ (−∞, +∞], then II_α holds, and by (a1), for all x ∈ X, f(x) ≥ α. This gives
inf_X f ≥ α = g(y). Since y ∈ Y is arbitrary and the left-hand side is independent
of y, inf_X f ≥ sup_Y g.
[(d1) ⇒ (a1)]
Let α ∈ (−∞, +∞] be such that II_α holds. There is y_0 ∈ Y such that
g(y_0) ≥ α. Hence inf_X f ≥ sup_Y g ≥ g(y_0) ≥ α, and ∼I_α follows. ∎
Lemma 1.2 : The following two statements are equivalent.
(a2) [the non-trivial half of the theorem of the alternative]
for all α ∈ (−∞, +∞], ∼I_α ⇒ II_α
(d2) there exists ȳ ∈ Y such that g(ȳ) ≥ inf_{x ∈ X} f(x)
Proof : [(a2) ⇒ (d2)]
If inf_X f = −∞ then clearly g(y) ≥ inf_X f for each y ∈ Y. Now
suppose inf_X f > −∞. Then ∼I_α holds for α = inf_X f. Therefore (a2)
implies that II_α holds. Thus there exists ȳ ∈ Y such that g(ȳ) ≥ α, i.e., there
exists ȳ ∈ Y such that g(ȳ) ≥ inf_X f. This is (d2).
[(d2) ⇒ (a2)]
Let (d2) hold, i.e., there exists ȳ ∈ Y such that g(ȳ) ≥ inf_X f. Suppose
α ∈ (−∞, +∞] is such that ∼I_α holds, i.e., for all x ∈ X, f(x) ≥ α. This
gives inf_X f ≥ α. Then by (d2),
g(ȳ) ≥ inf_X f ≥ α. This means II_α holds. ∎
The following proposition gives the basic logical principle relating the
duality theorem and the theorem of the alternative.
Proposition 1.3: (1) holds if and only if (4) holds.
Proof: The condition (1) is equivalent to (a1) and (a2) together, and (4) is equivalent
to (d1) and (d2) together. Hence by the previous two lemmas, (1) holds if and only if (4) holds. ∎
Remark : The “if” half of the proposition gives a general recipe for theorems
of the alternative.
CHAPTER – 2
AN ELEMENTARY PROOF OF FARKAS’ LEMMA
Farkas’ Lemma 2.1 : Let A be a real m × n matrix and let c be a real non-zero n-vector. Then either the primal system
Ax ≥ 0 and cᵀx < 0 (1)
has a solution x ∈ ℝ^n, or the dual system
Aᵀy = c and y ≥ 0 (2)
has a solution y ∈ ℝ^m, but never both.
Remark 2.2 : Note that cᵀx < 0 implies x ≠ 0, while Aᵀy = c implies y ≠ 0 since c ≠ 0. Now if both (1) and (2) hold, then
0 > cᵀx = (Aᵀy)ᵀx = yᵀ(Ax) ≥ 0,
a contradiction (the last inequality holds since y ≥ 0 and Ax ≥ 0). Hence the two systems cannot both have a solution.
Now the question of which of the two systems is solvable is answered
by considering the bounded least squares (BLS) problem
minimize ‖Aᵀy − c‖² subject to y ≥ 0, (3)
where ‖·‖ is the Euclidean norm. Let y* ∈ ℝ^m and let
r* = Aᵀy* − c. (4)
r* is called the residual vector for the BLS.
Lemma 2.3 : Let y* ∈ ℝ^m and let r* be the residual vector for the BLS (3).
Then y* solves (3) if and only if y* and r* satisfy the conditions
y* ≥ 0, Ar* ≥ 0 and y*ᵀ(Ar*) = 0. (5)
Proof : Let y* solve (3). Consider the one-parameter quadratic functions
f_i(θ) = ‖Aᵀ(y* + θe_i) − c‖² = ‖r* + θa_i‖², i = 1, …, m, (6)
where a_iᵀ denotes the i-th row of A, θ is a real variable and e_i denotes the i-th
column of the m × m unit matrix. As y* solves (3), f_i(θ) ≥ f_i(0) for all real θ
such that y_i* + θ ≥ 0. Hence θ = 0 solves the problem
minimize f_i(θ)
subject to y_i* + θ ≥ 0.
Note that
f_i(θ) = ‖Aᵀy* + θa_i − c‖²
= ‖Aᵀy* − c‖² + 2θ a_iᵀ(Aᵀy* − c) + θ² a_iᵀa_i
= ‖r*‖² + 2θ a_iᵀr* + θ² ‖a_i‖²,
so that
f_i′(θ) = 2‖a_i‖² θ + 2a_iᵀr* and f_i′(0) = 2a_iᵀr*.
As the derivative vanishes at the minimum point, y_i* > 0 (i.e., θ may take either
sign) implies a_iᵀr* = 0.
Next, if y_i* = 0 (i.e., θ ≥ 0) and a_iᵀr* < 0, then for sufficiently small θ > 0
we can write
f_i(θ) = f_i(0) + θ f_i′(0) + θ ε(θ)
(where ε(θ) → 0 as θ → 0) to get f_i(θ) < f_i(0), contradicting the minimality
of θ = 0 [[6], Theorem 4.1.2]. Thus y_i* = 0 implies a_iᵀr* ≥ 0. Therefore (5)
holds.
Conversely, assume that (5) holds and let z be an arbitrary point in ℝ^m such that z ≥ 0. Put u = z − y*. Then clearly y_i* = 0 implies u_i ≥ 0. From (5), as
Ar* ≥ 0, y*ᵀ(Ar*) = 0 and z ≥ 0,
uᵀ(Ar*) = zᵀ(Ar*) − y*ᵀ(Ar*) = zᵀ(Ar*) ≥ 0.
Now,
‖Aᵀz − c‖² = ‖Aᵀ(u + y*) − c‖²
= ‖(Aᵀy* − c) + Aᵀu‖²
= ‖Aᵀy* − c‖² + 2uᵀ(Ar*) + ‖Aᵀu‖²
≥ ‖Aᵀy* − c‖²
shows that y* solves (3). ∎
Note : Now, combining (4) and (5), we have
cᵀr* = (Aᵀy* − r*)ᵀr*
= y*ᵀ(Ar*) − r*ᵀr*
= −‖r*‖²,
which gives the following theorem.
Theorem 2.4 : Let y* solve (3) and let r* = Aᵀy* − c denote the
corresponding residual vector. If r* = 0 then y* solves (2). Otherwise, r*
solves (1) and cᵀr* = −‖r*‖².
Remark 2.5 : The theorem tells us which of the two systems in Farkas’ Lemma is solvable, by considering the BLS. What remains is the existence of a point y* that solves (3).
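The decision procedure of Theorem 2.4 can be sketched numerically. The following is a minimal illustration; as an assumption of this sketch, SciPy’s off-the-shelf non-negative least squares solver nnls stands in for the iterative algorithm described below, and the data are hypothetical:

```python
import numpy as np
from scipy.optimize import nnls

def farkas_alternative(A, c, tol=1e-10):
    """Decide which Farkas system is solvable, via the BLS problem
    minimize ||A^T y - c||^2 subject to y >= 0 (Theorem 2.4)."""
    # nnls solves min ||Mz - b|| subject to z >= 0; take M = A^T, b = c.
    y, _ = nnls(A.T, c)
    r = A.T @ y - c                      # residual r* = A^T y* - c
    if np.linalg.norm(r) <= tol:
        return "dual", y                 # y* solves A^T y = c, y >= 0
    return "primal", r                   # r* satisfies A r >= 0, c^T r < 0

# Example: c = (1, -1) is not in the cone {A^T y : y >= 0} for A = I,
# so the primal system Ax >= 0, c^T x < 0 must be solvable.
which, x = farkas_alternative(np.eye(2), np.array([1.0, -1.0]))
```

The returned certificate can be checked directly against the inequalities of (1) or (2).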
Existence of y* :
The existence of a point y* that solves (3) is established by introducing a
simple iterative algorithm whose k-th iteration, k = 1, 2, …, consists of the following two steps.
Step 1 : Solving an unconstrained least squares problem.
Let y_k = (y_1, …, y_m)ᵀ ≥ 0 denote the current estimate of the solution at the
beginning of the k-th iteration. Define
r_k = Aᵀy_k − c,
v_k = { i : y_i = 0 } and w_k = { i : y_i > 0 }.
The number of indices in w_k is denoted by t. Now let A_k be the t × n
matrix whose rows are a_iᵀ, i ∈ w_k. For simplicity assume that A_k = (a_1, …, a_t)ᵀ,
w_k = {1, …, t} and v_k = {t + 1, …, m}.
Case (i) : If t = 0 (i.e., y_k = 0), skip to Step 2 (to check optimality and to move away from a dead point).
Case (ii) : If r_k = 0 (i.e., Aᵀy_k = c), skip to Step 2 (to check optimality and
to move away from a dead point).
Otherwise, let the t-vector w = (w_1, …, w_t)ᵀ solve the unconstrained least
squares problem
minimize ‖A_kᵀw + r_k‖².
Claim : w = 0 solves this problem if and only if A_k r_k = 0.
Let f(w) = ‖A_kᵀw + r_k‖² (w unrestricted)
= ‖r_k‖² + 2wᵀ(A_k r_k) + ‖A_kᵀw‖².
If w = 0 solves the problem, then ‖A_kᵀw + r_k‖² ≥ ‖r_k‖², i.e.,
0 ≤ 2wᵀ(A_k r_k) + ‖A_kᵀw‖² for every w;
replacing w here by εw and letting ε → 0+ gives wᵀ(A_k r_k) ≥ 0 for every w,
hence A_k r_k = 0 (since w is unrestricted).
Conversely, if A_k r_k = 0, then
‖A_kᵀw + r_k‖² = ‖r_k‖² + ‖A_kᵀw‖² ≥ ‖r_k‖²,
i.e., w = 0 solves the problem. In this case skip to Step 2.
Otherwise, define a non-zero search direction u_k ∈ ℝ^m
by u_i = w_i for i = 1, …, t and u_i = 0 for i = t + 1, …, m, and the next point is given by
y_{k+1} = y_k + θ_k u_k,
where θ_k is the largest number in the interval (0, 1] that keeps the point
y_k + θ_k u_k feasible. This implies the following:
for i = 1, …, t, to have y_i + θ w_i ≥ 0, we must have
θ ≥ max { −y_i / w_i : w_i > 0 } (7)
and
θ ≤ min { −y_i / w_i : w_i < 0 }. (8)
Further note that when w_i ≥ 0 any θ ≥ 0 will work, and when w_i < 0 we have
−y_i / w_i > 0, so the minimum in (8) is positive. Clearly, this
minimum also satisfies (7). Hence we can write
θ_k = min { 1, −y_i / w_i : w_i < 0 }.
Step 2 : Testing optimality and moving away from a dead point.
Here t = 0, or r_k = 0, or A_k r_k = 0. When t = 0 we have y_k = 0, so that A_k does not exist and the condition is vacuous. Therefore, in all cases we may assume that A_k r_k = 0. This implies the following claim.
Claim : If A_k r_k = 0 then y_k solves the problem
minimize ‖Aᵀy − c‖²
subject to y_i = 0 for i ∈ v_k
and y_i ≥ 0 for i ∈ w_k (9)
[[7], Theorem 3L, page 156].
In this case, y_k is called a “dead point” (t does not change). To test whether
y_k is optimal, compute an index j such that
a_jᵀr_k = min { a_iᵀr_k : i ∈ v_k }.
If a_jᵀr_k ≥ 0, then a_iᵀr_k ≥ 0 for all i ∈ v_k. We have a_iᵀr_k = 0 for all i ∈ w_k (since
A_k r_k = 0). Hence Ar_k ≥ 0. Further, by the definitions of v_k and w_k, y_kᵀ(Ar_k) = 0.
Thus y_k and r_k satisfy (5). Therefore, by Lemma 2.3, y_k solves (3) and the algorithm terminates.
Otherwise (if a_jᵀr_k < 0), the next point is defined as
y_{k+1} = y_k − (a_jᵀr_k / a_jᵀa_j) e_j.
Note that −a_jᵀr_k / a_jᵀa_j > 0.
Claim : y_{k+1} minimizes f(θ) = ‖Aᵀ(y_k + θe_j) − c‖².
Note that
f(θ) = ‖r_k + θa_j‖² = ‖r_k‖² + 2θ a_jᵀr_k + θ² ‖a_j‖²,
f′(θ) = 2‖a_j‖² θ + 2a_jᵀr_k,
which vanishes at θ = −a_jᵀr_k / ‖a_j‖², and
f″(θ) = 2‖a_j‖² > 0. ∎
Theorem 2.6 : The above algorithm terminates in a finite number of steps.
Proof: It is clear from the following properties.
(a) The objective function is strictly decreasing at each iteration:
since a_jᵀr_k < 0, we have y_{k+1} ≠ y_k, and y_{k+1} minimizes f. Hence,
‖Aᵀy_{k+1} − c‖² < ‖Aᵀy_k − c‖².
(b) If θ_k = 1 then
y_{k+1, i} = y_i + u_i ≥ 0, i = 1, …, t,
so t does not change; hence y_{k+1} is a dead point.
Otherwise, θ_k = −y_l / u_l for some index l, and clearly
y_{k+1, l} = 0,
i.e., t decreases.
Hence, it is not possible to perform more than m iterations without reaching a dead point.
(c) Each time we reach a dead point, the current point solves (9).
(d) There are only finitely many problems of the form (9), and because of (a), it is not possible to encounter the same problem twice. ∎
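For completeness, the two-step iteration above can be sketched in code. This is an illustrative reimplementation under stated assumptions (floating-point tolerances in place of exact zero tests, and numpy’s lstsq for the unconstrained subproblem); it is not claimed to reproduce [2] line by line:

```python
import numpy as np

def bls_solve(A, c, tol=1e-12, max_iter=1000):
    """Sketch of the two-step iteration for the BLS problem (3):
    minimize ||A^T y - c||^2 subject to y >= 0."""
    m, _ = A.shape
    y = np.zeros(m)
    for _ in range(max_iter):
        r = A.T @ y - c                      # r_k = A^T y_k - c
        Ar = A @ r                           # i-th entry is a_i^T r_k
        free = np.flatnonzero(y > tol)       # w_k = { i : y_i > 0 }
        # Step 1: unconstrained least squares over the free variables.
        if free.size > 0 and np.max(np.abs(Ar[free])) > tol:
            Ak = A[free]                     # t x n matrix with rows a_i^T
            # w minimizes ||Ak^T w + r_k||^2
            w, *_ = np.linalg.lstsq(Ak.T, -r, rcond=None)
            u = np.zeros(m)
            u[free] = w                      # search direction u_k
            neg = free[u[free] < -tol]
            theta = min(1.0, (-y[neg] / u[neg]).min()) if neg.size else 1.0
            y = np.maximum(y + theta * u, 0.0)   # step, clipped for round-off
            continue
        # Step 2: test optimality at a dead point, or move away from it.
        bound = np.flatnonzero(y <= tol)     # v_k = { i : y_i = 0 }
        if bound.size == 0 or Ar[bound].min() >= -tol:
            return y                         # conditions (5) hold
        j = bound[np.argmin(Ar[bound])]      # most negative a_j^T r_k
        y[j] -= Ar[j] / (A[j] @ A[j])        # exact minimization along e_j
    return y
```

On termination the returned point satisfies the optimality conditions (5) up to the chosen tolerance.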
Corollary 2.7 : Let y* and r* be as in Theorem 2.4 and assume that r* ≠ 0.
Then the vector r*/‖r*‖ solves the steepest descent problem
minimize cᵀx (10)
subject to Ax ≥ 0 and ‖x‖ ≤ 1. (11)
Proof: Let x satisfy the constraints (11). Then, as y* ≥ 0,
y*ᵀ(Ax) ≥ 0.
The Cauchy–Schwarz inequality gives
|r*ᵀx| ≤ ‖r*‖ ‖x‖ ≤ ‖r*‖.
This gives
cᵀx = (Aᵀy* − r*)ᵀx = y*ᵀ(Ax) − r*ᵀx
≥ −r*ᵀx
≥ −‖r*‖.
Since cᵀ(r*/‖r*‖) = −‖r*‖² / ‖r*‖ = −‖r*‖, the vector r*/‖r*‖ solves the steepest descent problem. ∎
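Corollary 2.7 can likewise be checked numerically on a small instance (again using SciPy’s nnls as a stand-in solver for the BLS; the data here are hypothetical):

```python
import numpy as np
from scipy.optimize import nnls

# With A = I and c = (1, -1), the BLS solution is y* = (1, 0), so the
# residual is r* = A^T y* - c = (0, 1), and x = r*/||r*|| should attain
# the minimum value -||r*|| of c^T x over Ax >= 0, ||x|| <= 1.
A = np.eye(2)
c = np.array([1.0, -1.0])
y, _ = nnls(A.T, c)
r = A.T @ y - c
x = r / np.linalg.norm(r)
feasible = bool(np.all(A @ x >= -1e-12) and np.linalg.norm(x) <= 1 + 1e-12)
value = c @ x
```

The computed value agrees with the bound −‖r*‖ established in the proof.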
CHAPTER - 3
ON THEOREMS OF THE ALTERNATIVE
In this chapter we give various theorems of the alternative, namely
those due to Gale, Gordan, Stiemke, Motzkin and Dax. First, the alternative form of
Farkas’ lemma (AFFL) is proved using Tucker’s theorem. The above-mentioned
theorems of the alternative are merely special cases of this theorem.
Tucker’s Theorem 3.1 : If B is a real skew-symmetric matrix, there exists a
non-negative vector u such that Bu is non-negative and u + Bu is strictly
positive.
Proof : [[4], Theorem 3.4]. ∎
Theorem 3.2 ( AFFL: Alternative Form of Farkas’ Lemma ) :
Let M ∈ ℝ^{m×n} and c ∈ ℝ^n be arbitrary. Then either
(A) there exists v ≥ 0 such that Mᵀv ≥ c (i.e., Mᵀv − c ≥ 0)
or
(B) there exists w ≥ 0 such that Mw ≤ 0 and cᵀw > 0,
but not both (A) and (B) hold.
Proof : Consider the following skew-symmetric matrix (semicolons separate block rows):
B = [ 0 0 −M ; 0 0 cᵀ ; Mᵀ −c 0 ]. (1)
Now, by Tucker’s Theorem 3.1, there exists u = ( vᵀ t wᵀ )ᵀ ≥ 0, with v ∈ ℝ^m,
t ∈ ℝ and w ∈ ℝ^n, such that Bu ≥ 0 and u + Bu > 0. Now,
Bu = ( −Mw ; cᵀw ; Mᵀv − tc ) ≥ 0,
i.e.,
Mw ≤ 0, cᵀw ≥ 0 and Mᵀv − tc ≥ 0. (2)
Then u + Bu > 0 gives
v − Mw > 0, t + cᵀw > 0 and w + Mᵀv − tc > 0. (3)
Case (i) : t = 0. Then (2) gives Mw ≤ 0, and (3) gives cᵀw > 0. Hence (B) holds.
Case (ii) : t > 0. Then (2) gives Mᵀ(v/t) ≥ c with v/t ≥ 0, so (A) holds (with v/t in place of v).
Further, if (A) and (B) hold together, then by (A), Mᵀv ≥ c, and by (B)
there exists w ≥ 0 such that Mw ≤ 0 and cᵀw > 0. Therefore
vᵀ(Mw) = (Mᵀv)ᵀw ≥ cᵀw > 0.
But Mw ≤ 0 and v ≥ 0 give vᵀ(Mw) ≤ 0. Hence both (A) and (B) cannot
hold together. ∎
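The two branches of AFFL, and their mutual exclusivity, can be checked on small hypothetical examples (the matrices below are illustrative, not from [3]):

```python
import numpy as np

# Branch (A): M = I, c = (1, -1); v = (1, 0) >= 0 satisfies M^T v >= c.
M = np.eye(2)
c = np.array([1.0, -1.0])
v = np.array([1.0, 0.0])
branch_A = bool(np.all(v >= 0) and np.all(M.T @ v >= c))

# For this M, branch (B) must fail: w >= 0 and Mw = w <= 0 force w = 0,
# and then c^T w = 0 is not positive.

# Branch (B): M2 = -I, c2 = (1, 1); no v >= 0 gives M2^T v = -v >= c2,
# but w = (1, 0) >= 0 satisfies M2 w <= 0 and c2^T w > 0.
M2 = -np.eye(2)
c2 = np.array([1.0, 1.0])
w = np.array([1.0, 0.0])
branch_B = bool(np.all(w >= 0) and np.all(M2 @ w <= 0) and c2 @ w > 0)
```

Each example exhibits exactly one of the two certificates, as the theorem requires.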
Theorem 3.3 : Farkas’ Lemma and AFFL are equivalent.
Proof : [⇒] According to Farkas’ Lemma, for A ∈ ℝ^{m×n} and b ∈ ℝ^n, exactly one of the
following systems has a solution:
(C) Ax ≥ 0 and bᵀx < 0,
(D) Aᵀy = b and y ≥ 0.
In (C), replace x by −x to get Ax ≤ 0 and bᵀx > 0. This is the (B) of AFFL
(with M = A and c = b). Clearly, (D) implies (A) of the AFFL.
[⇐] Apply AFFL to M = ( A −A ) ∈ ℝ^{m×2n} and c = ( bᵀ −bᵀ )ᵀ. Then (A) of AFFL
reads Aᵀv ≥ b and −Aᵀv ≥ −b, i.e., Aᵀv = b with v ≥ 0; taking y = v gives (D). And
(B) of AFFL reads: there exists w = ( w_1ᵀ w_2ᵀ )ᵀ ≥ 0 such that A(w_1 − w_2) ≤ 0
and bᵀ(w_1 − w_2) > 0; replacing w_1 − w_2 by −x gives (C). ∎
Notation : We denote by e the vector whose every element is 1.
Gale’s Theorem 3.4 : Either
(A1) there exists x ∈ ℝ^n such that Ax ≤ b
or
(B1) there exists y ≥ 0 such that Aᵀy = 0 and bᵀy = −1.
Proof : In AFFL, let
M = [ Aᵀ ; −Aᵀ ], v = ( x_1ᵀ x_2ᵀ )ᵀ, c = −b and w = y.
Substituting these values in (A) of AFFL, we get
( A −A ) ( x_1ᵀ x_2ᵀ )ᵀ ≥ −b,
i.e., Ax_1 − Ax_2 ≥ −b,
i.e., Ax ≤ b for x = x_2 − x_1.
This is (A1).
Now, the same substitutions in (B) of AFFL give
[ Aᵀ ; −Aᵀ ] y ≤ 0 and −bᵀy > 0,
i.e., Aᵀy = 0 and bᵀy < 0.
Now, replace y by y/(−bᵀy) to get bᵀy = −1. ∎
Gordan’s Theorem 3.5 : Either
(A2) there exists x ∈ ℝ^n such that Ax < 0
or
(B2) there exists y ≥ 0 such that Aᵀy = 0 and eᵀy = 1.
Proof : Put b = −e in Gale’s Theorem 3.4 (note that Ax ≤ −e is solvable if and only if
Ax < 0 is solvable, since any solution of the latter may be scaled by a positive factor). ∎
Note : Gordan’s theorem ensures that at least one component of y is strictly
positive.
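A hand-checked illustration of Gordan’s theorem (hypothetical data):

```python
import numpy as np

# A has rows (1) and (-1): Ax < 0 would need x < 0 and -x < 0 at once,
# so (A2) fails and (B2) must hold.
A = np.array([[1.0], [-1.0]])
y = np.array([0.5, 0.5])                      # y >= 0
B2_holds = bool(np.allclose(A.T @ y, 0.0)     # A^T y = 0
                and np.isclose(np.ones(2) @ y, 1.0))  # e^T y = 1

# Dropping the second row leaves Ax < 0 solvable, so (A2) holds instead.
A2 = np.array([[1.0]])
x = np.array([-1.0])
A2_holds = bool(np.all(A2 @ x < 0))
```

In the first case the positive component of y is guaranteed by eᵀy = 1, as the note above observes.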
Farkas’ Lemma 3.6 : Let A ∈ ℝ^{m×n} and b ∈ ℝ^n be arbitrary. Then either
(A3) there exists y ≥ 0 such that Aᵀy = b
or
(B3) there exists x ∈ ℝ^n such that Ax ≤ 0 and bᵀx > 0.
Proof: Theorem 3.3. ∎
Stiemke’s Theorem 3.7 : Either
(A4) there exists z > 0 such that Aᵀz = 0
or
(B4) there exists x ∈ ℝ^n such that Ax ≤ 0 and Ax ≠ 0.
Proof : Put b = −Aᵀe in Farkas’ Lemma 3.6. Then equation (A3) becomes
Aᵀ(y + e) = 0. Put z = y + e > 0. This gives (A4).
Now, equation (B3) gives Ax ≤ 0 and, from bᵀx > 0, eᵀ(Ax) < 0. Now, note that
eᵀ(Ax) = Σ_{i=1}^{m} Σ_j a_{ij} x_j < 0.
As Σ_i ( Σ_j a_{ij} x_j ) < 0, there is at least one i such that Σ_j a_{ij} x_j < 0. Therefore Ax ≠ 0.
This gives (B4). ∎
Motzkin’s Theorem 3.8 : Let Aᵀ = ( A_1ᵀ A_2ᵀ A_3ᵀ ), where A_i ∈ ℝ^{m_i×n} and
m_1 + m_2 + m_3 = m. Then exactly one of the following holds.
(A5) there exists x ∈ ℝ^n such that A_1 x < 0, A_2 x ≤ 0 and A_3 x = 0;
(B5) there exist y_1 ≥ 0, y_2 ≥ 0 and y_3 ∈ ℝ^{m_3} such that
A_1ᵀy_1 + A_2ᵀy_2 + A_3ᵀy_3 = 0 and eᵀy_1 > 0.
Proof : In AFFL, let
M = [ A_1ᵀ A_2ᵀ A_3ᵀ −A_3ᵀ ; −A_1ᵀ −A_2ᵀ −A_3ᵀ A_3ᵀ ] and cᵀ = ( eᵀ 0ᵀ 0ᵀ 0ᵀ ).
By AFFL, either there exists v = ( x_1ᵀ x_2ᵀ )ᵀ ≥ 0 such that Mᵀv ≥ c, or there is
w = ( y_1ᵀ y_2ᵀ z_1ᵀ z_2ᵀ )ᵀ ≥ 0 such that Mw ≤ 0 and cᵀw > 0. Thus, Mᵀv ≥ c
gives
[ A_1 −A_1 ; A_2 −A_2 ; A_3 −A_3 ; −A_3 A_3 ] ( x_1ᵀ x_2ᵀ )ᵀ ≥ ( eᵀ 0ᵀ 0ᵀ 0ᵀ )ᵀ,
i.e., A_1 x + e ≤ 0, A_2 x ≤ 0 and A_3 x = 0, where x = x_2 − x_1.
This gives (A5). Now Mw ≤ 0 implies
A_1ᵀy_1 + A_2ᵀy_2 + A_3ᵀz_1 − A_3ᵀz_2 ≤ 0 and −( A_1ᵀy_1 + A_2ᵀy_2 + A_3ᵀz_1 − A_3ᵀz_2 ) ≤ 0,
i.e.,
A_1ᵀy_1 + A_2ᵀy_2 + A_3ᵀy_3 = 0, where y_3 = z_1 − z_2.
Next, cᵀw > 0 gives
( eᵀ 0ᵀ 0ᵀ 0ᵀ ) ( y_1ᵀ y_2ᵀ z_1ᵀ z_2ᵀ )ᵀ > 0,
i.e., eᵀy_1 > 0.
This is (B5). ∎
Notation : By |x| we mean the vector whose elements are |x_i|.
Dax’s Theorem 3.9 : Let d ∈ ℝ^m be any strictly positive vector. Then either
(A6) there exists y ∈ ℝ^m satisfying −d ≤ y ≤ d such that Aᵀy = b
or
(B6) there exists x ∈ ℝ^n such that bᵀx > dᵀ|Ax|.
Proof : In AFFL, let
M = [ A −A I −I ; −A A −I I ], v = ( y_1ᵀ y_2ᵀ )ᵀ,
cᵀ = ( bᵀ −bᵀ −dᵀ −dᵀ ) and w = ( x_1ᵀ x_2ᵀ z_1ᵀ z_2ᵀ )ᵀ.
The AFFL condition Mᵀv ≥ c gives
Aᵀy = b and −d ≤ y ≤ d,
where y = y_1 − y_2. This gives (A6).
Now, Mw ≤ 0 gives
A(x_1 − x_2) + (z_1 − z_2) ≤ 0 and −A(x_1 − x_2) − (z_1 − z_2) ≤ 0,
i.e., Ax = z_2 − z_1, where x = x_1 − x_2.
Further, |Ax| = |z_1 − z_2| ≤ z_1 + z_2, since z_1 ≥ 0 and z_2 ≥ 0.
Next, cᵀw > 0 gives
bᵀ(x_1 − x_2) − dᵀ(z_1 + z_2) > 0,
i.e., bᵀx > dᵀ(z_1 + z_2).
Since d > 0 and |Ax| ≤ z_1 + z_2, we have
dᵀ|Ax| ≤ dᵀ(z_1 + z_2) < bᵀx.
This gives (B6).
Next, suppose (A6) and (B6) hold together. Then (A6) gives
bᵀx = (Aᵀy)ᵀx = yᵀ(Ax) ≤ |yᵀ(Ax)| ≤ |y|ᵀ|Ax|.
But since −d ≤ y ≤ d, |y| ≤ d, so that
bᵀx ≤ dᵀ|Ax|,
which contradicts (B6). Hence both (A6) and (B6) do not hold together. ∎
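Finally, the inequality chain used in the last step of the proof can be exercised numerically (hypothetical data):

```python
import numpy as np

# Branch (A6): A = I, b = (0.5, -0.5), d = (1, 1); y = b itself works.
A = np.eye(2)
d = np.array([1.0, 1.0])
b = np.array([0.5, -0.5])
y = np.array([0.5, -0.5])
A6_holds = bool(np.all(np.abs(y) <= d) and np.allclose(A.T @ y, b))

# Then for every x, b^T x = y^T (Ax) <= |y|^T |Ax| <= d^T |Ax|,
# so branch (B6) cannot also hold; spot-check on random vectors.
rng = np.random.default_rng(0)
B6_excluded = all(b @ x <= d @ np.abs(A @ x) + 1e-12
                  for x in rng.standard_normal((100, 2)))
```

The spot check mirrors the contradiction argument: whenever (A6) holds, bᵀx never exceeds dᵀ|Ax|.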
REFERENCES
[1] L. McLinden, Duality Theorems and Theorems of the Alternative, Proceedings of the American Mathematical Society, Vol. 53, pp. 172-175, 1975.
[2] Achiya Dax, An Elementary Proof of Farkas’ Lemma, SIAM Review, Vol. 39, No. 3, pp. 503-507, 1997.
[3] C. G. Broyden, On Theorems of the Alternative, Optimization Methods and Software, Vol. 16, No. 1, pp. 101-111.
[4] C. G. Broyden, A Simple Algebraic Proof of Farkas’s Lemma and Related Theorems, Optimization Methods and Software, Vol. 8, No. 3, pp. 185-199, 1998.
[5] D. G. Luenberger, Optimization by Vector Space Methods, Wiley, New York, 1969.
[6] M. S. Bazaraa, Hanif D. Sherali, C. M. Shetty, Nonlinear Programming: Theory and Algorithms, John Wiley & Sons.
[7] Gilbert Strang, Linear Algebra and Its Applications, Thomson Learning Inc., 1998.