Chapter 6: UNCONSTRAINED MULTIVARIABLE OPTIMIZATION



6.1 Function Values Only

6.2 First Derivatives of f (gradient and conjugate direction methods)

6.3 Second Derivatives of f (e.g., Newton’s method)

6.4 Quasi-Newton methods


General Strategy for Gradient Methods

(1) Calculate a search direction.
(2) Select a step length in that direction to reduce f(x):

$$x^{k+1} = x^k + \Delta x^k = x^k + \alpha^k s^k$$

Steepest Descent Search Direction

$$s^k = -\nabla f(x^k)$$

(There is no need to normalize $s^k$.) The method terminates at any stationary point. Why? Because $\nabla f(x) = 0$ gives $s^k = 0$.
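As a minimal sketch (mine, not from the slides), the general gradient-method loop in Python, with the search direction taken as the negative gradient and a fixed placeholder step length:

```python
import numpy as np

def gradient_method(grad, x0, alpha=0.05, tol=1e-6, max_iter=1000):
    """Generic gradient method: s^k = -grad f(x^k); x^{k+1} = x^k + alpha * s^k.
    A fixed step length is used here only as a placeholder; the slides
    discuss analytical and numerical ways to choose alpha."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        s = -grad(x)                 # steepest-descent direction
        if np.linalg.norm(s) < tol:  # stationary point: gradient is ~0
            break
        x = x + alpha * s
    return x

# Example function used later in the chapter: f = (x1-3)^2 + 9(x2-5)^2
grad = lambda x: np.array([2 * (x[0] - 3), 18 * (x[1] - 5)])
print(gradient_method(grad, [1.0, 1.0]))  # converges toward [3, 5]
```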

So the procedure can stop at a saddle point. To confirm a minimum, you need to show that $H(x^*)$ is positive definite.

Step Length

How to pick $\alpha$:
• analytically
• numerically


Analytical Method

How does one minimize a function in a search direction using an analytical method? It means $s$ is fixed and you want to pick $\alpha$, the step length, to minimize $f(x)$. Note that

$$f(x^{k+1}) = f(x^k + \alpha s^k) \approx f(x^k) + \nabla^T f(x^k)\,(\alpha s^k) + \frac{1}{2}(\alpha s^k)^T H(x^k)\,(\alpha s^k)$$

Setting the derivative with respect to $\alpha$ to zero,

$$\frac{d f(x^k + \alpha s^k)}{d\alpha} = 0 = \nabla^T f(x^k)\, s^k + \alpha\,(s^k)^T H(x^k)\, s^k$$

Solve for $\alpha$:

$$\alpha^k = -\frac{\nabla^T f(x^k)\, s^k}{(s^k)^T H(x^k)\, s^k} \qquad (6.9)$$

This yields a minimum of the approximating function.
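A small numeric check of Eq. (6.9), as a sketch assuming the Hessian is available (it uses the example function that appears later in this chapter):

```python
import numpy as np

def analytical_step(grad_k, H_k, s_k):
    """Step length from Eq. (6.9): alpha = -(grad^T s) / (s^T H s).
    Exact for a quadratic f; otherwise it minimizes the local quadratic model."""
    return -float(grad_k @ s_k) / float(s_k @ H_k @ s_k)

# f = (x1-3)^2 + 9(x2-5)^2 has constant Hessian H = diag(2, 18)
H = np.diag([2.0, 18.0])
g = np.array([-4.0, -72.0])      # gradient at x = [1, 1]
s = -g                           # steepest-descent direction
print(analytical_step(g, H, s))  # ~0.0557, as in the worked example below
```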

Numerical Method

Use a coarse search first: either a fixed step ($\alpha = 1$) or a variable one ($\alpha$ = 1, 2, ½, etc.).

Options for optimizing $\alpha$:
(1) Interpolation (e.g., quadratic, cubic)
(2) Region elimination (golden section search)
(3) Newton, secant, quasi-Newton
(4) Random search
(5) Analytical optimization

(1), (3), and (5) are preferred. However, it may not be desirable to optimize $\alpha$ exactly (it is often better to generate new search directions instead).
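As one concrete numerical option, a sketch of the variable coarse search (the halving schedule and simple-decrease test are illustrative choices, not prescribed by the slides):

```python
import numpy as np

def coarse_line_search(f, x, s, alpha=1.0, shrink=0.5, max_halvings=20):
    """Variable coarse search: try alpha = 1, 1/2, 1/4, ... until f decreases
    along direction s (a simple-decrease test, not an exact minimization)."""
    f0 = f(x)
    for _ in range(max_halvings):
        if f(x + alpha * s) < f0:
            return alpha
        alpha *= shrink
    return alpha

f = lambda x: (x[0] - 3)**2 + 9 * (x[1] - 5)**2
print(coarse_line_search(f, np.array([1.0, 1.0]), np.array([4.0, 72.0])))  # 0.0625
```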


Suppose we calculate the gradient at the point $x^T = [2 \;\; 2]$ (the accompanying slide figure is not reproduced here).


Termination Criteria

For minimization you can use up to three criteria for termination:

(1) $\left|\dfrac{f(x^k) - f(x^{k+1})}{f(x^k)}\right| < \varepsilon_1$, except when $f(x^k) \rightarrow 0$; then use $\left|f(x^k) - f(x^{k+1})\right| < \varepsilon_2$

(2) $\left|\dfrac{x_i^k - x_i^{k+1}}{x_i^k}\right| < \varepsilon_3$, except when $x_i^k \rightarrow 0$; then use $\left|x_i^k - x_i^{k+1}\right| < \varepsilon_4$

(3) $\left\|\nabla f(x^k)\right\| < \varepsilon_5$ or $\left\|s^k\right\| < \varepsilon_6$

Two cautions (each illustrated on the slide by a sketch of f(x) versus x):
• Big change in f(x) but little change in x: the code will stop prematurely if Δx is the sole criterion.
• Big change in x but little change in f(x): the code will stop prematurely if Δf is the sole criterion.
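A sketch of these three tests in code (the tolerance handling near zero and the `tiny` cutoff are my assumptions):

```python
import numpy as np

def converged(f_old, f_new, x_old, x_new, grad_new, eps, tiny=1e-12):
    """Criteria (1)-(3); eps = (e1, e2, e3, e4, e5). Requiring all three
    guards against the premature stops described above."""
    e1, e2, e3, e4, e5 = eps
    # (1) relative change in f, or absolute change when f(x^k) ~ 0
    ok_f = (abs(f_old - f_new) < e2 if abs(f_old) < tiny
            else abs((f_old - f_new) / f_old) < e1)
    # (2) relative change in each x_i, or absolute change when x_i^k ~ 0
    ok_x = all(abs(xo - xn) < e4 if abs(xo) < tiny
               else abs((xo - xn) / xo) < e3
               for xo, xn in zip(x_old, x_new))
    # (3) gradient norm (the test ||s^k|| < e6 is the analogue on the step)
    ok_g = np.linalg.norm(grad_new) < e5
    return ok_f and ok_x and ok_g
```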

Conjugate Search Directions

• An improvement over the gradient method for general quadratic functions
• The basis for many NLP techniques
• Two search directions are conjugate relative to $Q$ if

$$(s^i)^T\, Q\, s^j = 0$$

• To minimize $f(x)$, where $x$ is $n \times 1$, when $H$ is a constant matrix ($= Q$), you are guaranteed to reach the optimum in $n$ conjugate-direction stages if you minimize exactly at each stage (one-dimensional search). A numeric illustration follows this list.
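The illustration, using values from the worked example later in the chapter:

```python
import numpy as np

Q = np.diag([2.0, 18.0])          # Hessian of the chapter's example function
s0 = np.array([4.0, 72.0])        # first search direction (steepest descent)
s1 = np.array([3.564, -0.022])    # second direction from the worked example
print(float(s0 @ Q @ s1))         # ~0, so s0 and s1 are conjugate w.r.t. Q
```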


Conjugate Gradient Method

Step 1. At $x^0$ calculate $f(x^0)$. Let

$$s^0 = -\nabla f(x^0)$$

Step 2. Save $\nabla f(x^0)$ and compute

$$x^1 = x^0 + \alpha^0 s^0$$

by minimizing $f(x)$ with respect to $\alpha$ in the $s^0$ direction (i.e., carry out a unidimensional search for $\alpha^0$).

Step 3. Calculate $f(x^1)$ and $\nabla f(x^1)$. The new search direction is a linear combination of $s^0$ and $\nabla f(x^1)$:

$$s^1 = -\nabla f(x^1) + s^0\,\frac{\nabla^T f(x^1)\,\nabla f(x^1)}{\nabla^T f(x^0)\,\nabla f(x^0)}$$

For the $k$th iteration the relation is

$$s^{k+1} = -\nabla f(x^{k+1}) + s^k\,\frac{\nabla^T f(x^{k+1})\,\nabla f(x^{k+1})}{\nabla^T f(x^k)\,\nabla f(x^k)} \qquad (6.6)$$

For a quadratic function it can be shown that these successive search directions are conjugate. After $n$ iterations ($k = n$), the quadratic function is minimized. For a nonquadratic function, the procedure cycles again with $x^{n+1}$ becoming $x^0$.

Step 4. Test for convergence to the minimum of $f(x)$. If convergence is not attained, return to step 3.

Step n. Terminate the algorithm when $\|\nabla f(x^k)\|$ is less than some prescribed tolerance.
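A compact sketch of these steps in Python, assuming a constant Hessian so that the exact line search can use Eq. (6.9) (valid for the quadratic examples in this chapter; function names are mine):

```python
import numpy as np

def conjugate_gradient(grad, H, x0, tol=1e-10):
    """Fletcher-Reeves conjugate gradient for f with constant Hessian H,
    using the exact line search of Eq. (6.9) at each stage."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    s = -g                                        # Step 1
    for _ in range(len(x)):                       # n stages minimize a quadratic
        alpha = -float(g @ s) / float(s @ H @ s)  # exact 1-D search, Eq. (6.9)
        x = x + alpha * s                         # Step 2
        g_new = grad(x)
        if np.linalg.norm(g_new) < tol:           # Step n: gradient tolerance
            break
        beta = float(g_new @ g_new) / float(g @ g)  # weight from Eq. (6.6)
        s = -g_new + beta * s                     # Step 3: new conjugate direction
        g = g_new
    return x

grad = lambda x: np.array([2 * (x[0] - 3), 18 * (x[1] - 5)])
H = np.diag([2.0, 18.0])
print(conjugate_gradient(grad, H, [1.0, 1.0]))  # [3., 5.] in two stages
```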


Example: Conjugate Gradient Method

Minimize $f = (x_1 - 3)^2 + 9(x_2 - 5)^2$ using the method of conjugate gradients with $x_1^0 = 1$ and $x_2^0 = 1$, i.e., in vector notation,

$$x^0 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

as the initial point. For steepest descent,

$$\nabla f(x^0) = \begin{bmatrix} -4 \\ -72 \end{bmatrix} \qquad \text{so} \qquad s^0 = -\nabla f(x^0) = \begin{bmatrix} 4 \\ 72 \end{bmatrix}$$

Steepest Descent Step (1-D Search)

$$x^1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix} + \alpha^0 \begin{bmatrix} 4 \\ 72 \end{bmatrix}$$

The objective function can be expressed as a function of $\alpha^0$ as follows:

$$f(\alpha^0) = (4\alpha^0 - 2)^2 + 9(72\alpha^0 - 4)^2$$

Minimizing $f(\alpha^0)$, we obtain $f = 3.1594$ at $\alpha^0 = 0.0557$. Hence

$$x^1 = \begin{bmatrix} 1.223 \\ 5.011 \end{bmatrix}$$

Calculate Weighting of Previous Step

The new gradient can now be determined as

$$\nabla f(x^1) = \begin{bmatrix} -3.554 \\ 0.197 \end{bmatrix}$$

and $\beta^0$ can be computed as

$$\beta^0 = \frac{(3.554)^2 + (0.197)^2}{(4)^2 + (72)^2} = 0.00244$$

Generate New (Conjugate) Search Direction

$$s^1 = \begin{bmatrix} 3.554 \\ -0.197 \end{bmatrix} + 0.00244 \begin{bmatrix} 4 \\ 72 \end{bmatrix} = \begin{bmatrix} 3.564 \\ -0.022 \end{bmatrix}$$

and

$$x^2 = \begin{bmatrix} 1.223 \\ 5.011 \end{bmatrix} + \alpha^1 \begin{bmatrix} 3.564 \\ -0.022 \end{bmatrix}$$

One-Dimensional Search

Solving for $\alpha^1$ as before [i.e., expressing $f(x^2)$ as a function of $\alpha^1$ and minimizing with respect to $\alpha^1$] yields $f = 5.91 \times 10^{-10}$ at $\alpha^1 = 0.4986$. Hence

$$x^2 = \begin{bmatrix} 3.0000 \\ 5.0000 \end{bmatrix}$$

which is the optimum (reached in two steps, in agreement with the theory).
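As a quick numeric check, a short script (mine) that reproduces both steps of this example:

```python
import numpy as np

# Reproduce the two conjugate-gradient steps of this example from scratch.
grad = lambda x: np.array([2 * (x[0] - 3), 18 * (x[1] - 5)])
H = np.diag([2.0, 18.0])
x = np.array([1.0, 1.0])
g = grad(x)
s = -g
for k in range(2):
    alpha = -float(g @ s) / float(s @ H @ s)    # Eq. (6.9)
    x = x + alpha * s
    g_new = grad(x)
    beta = float(g_new @ g_new) / float(g @ g)  # Eq. (6.6)
    print(f"k={k}: alpha={alpha:.4f}, x={np.round(x, 3)}")
    s = -g_new + beta * s
    g = g_new
# k=0: alpha=0.0557, x=[1.223, 5.011]; k=1: alpha=0.4986, x=[3., 5.]
```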


Fletcher-Reeves Conjugate Gradient Method

$$\begin{aligned} s^0 &= -\nabla f(x^0) \\ s^1 &= -\nabla f(x^1) + \beta^1 s^0 \\ s^2 &= -\nabla f(x^2) + \beta^2 s^1 \end{aligned}$$

The $\beta^k$ are chosen to make $(s^{k+1})^T H^k s^k = 0$ (conjugate directions).

Derivation (let $H^k = H$):

$$\nabla f(x^{k+1}) = \nabla f(x^k) + \nabla^2 f(x^k)\,(x^{k+1} - x^k)$$

so that

$$\nabla f(x^{k+1}) - \nabla f(x^k) = H\,\Delta x^k = \alpha^k H s^k$$

Hence

$$H s^k = \left[\nabla f(x^{k+1}) - \nabla f(x^k)\right] / \alpha^k$$

Using the definition of conjugate directions, $(s^{k+1})^T H s^k = 0$:

$$\left[-\nabla f(x^{k+1}) + \beta^{k+1} s^k\right]^T \left[\nabla f(x^{k+1}) - \nabla f(x^k)\right] = 0$$

Because the line searches are exact, $\nabla^T f(x^{k+1})\, s^k = 0$ and $\nabla^T f(x^{k+1})\,\nabla f(x^k) = 0$ (and hence $(s^k)^T \nabla f(x^k) = -\nabla^T f(x^k)\,\nabla f(x^k)$), so expanding and solving for the weighting factor:

$$\beta^{k+1} = \frac{\nabla^T f(x^{k+1})\,\nabla f(x^{k+1})}{\nabla^T f(x^k)\,\nabla f(x^k)}, \qquad s^{k+1} = -\nabla f(x^{k+1}) + \beta^{k+1} s^k$$


Linear vs. Quadratic Approximation of f(x)

$$f(x) \approx f(x^k) + \nabla^T f(x^k)\,(x - x^k) + \frac{1}{2}(x - x^k)^T H(x^k)(x - x^k)$$

(1) Using a linear approximation of $f(x)$:

$$\frac{df(x)}{dx} = 0 = \nabla f(x^k)$$

so you cannot solve for $\Delta x^k = x - x^k = \alpha^k s^k$!

(2) Using a quadratic approximation for $f(x)$:

$$\frac{df(x)}{dx} = 0 = \nabla f(x^k) + H(x^k)(x - x^k)$$

Newton's method solves this set of linear equations (simultaneous equation-solving), with $x = x^{k+1}$:

$$0 = \nabla f(x^k) + H(x^k)(x^{k+1} - x^k) \quad \text{or} \quad x^{k+1} = x^k - H^{-1}(x^k)\,\nabla f(x^k)$$
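A minimal sketch of the Newton step; solving the linear system instead of forming $H^{-1}$ is a standard numerical choice, not something the slide prescribes:

```python
import numpy as np

def newton_step(grad, hess, x):
    """One Newton iteration: solve H(x) dx = -grad f(x) and step.
    Solving the linear system is preferred to inverting H explicitly."""
    return x + np.linalg.solve(hess(x), -grad(x))

# For a quadratic, a single Newton step lands on the optimum:
grad = lambda x: np.array([2 * (x[0] - 3), 18 * (x[1] - 5)])
hess = lambda x: np.diag([2.0, 18.0])
print(newton_step(grad, hess, np.array([1.0, 1.0])))  # [3., 5.]
```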

Note: Both the direction and the step length are determined.

- Requires second derivatives (the Hessian)
- $H$ ($H^{-1}$) must be positive definite (for a minimum) to guarantee convergence
- Iterate if $f(x)$ is not quadratic

Modified Newton's Procedure:

$$x^{k+1} = x^k - \alpha^k H^{-1}(x^k)\,\nabla f(x^k)$$

$\alpha^k = 1$ gives Newton's method. (If $H^{-1} = I$, you have steepest descent.)

Example: Minimize $f(x) = x_1^2 + 20 x_2^2$, starting at $x^T = [1 \;\; 1]$.
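A sketch of this example as a single modified-Newton step (the gradient and Hessian below follow from the stated f; with $\alpha = 1$ it is the pure Newton step):

```python
import numpy as np

# f(x) = x1^2 + 20 x2^2: gradient and (constant) Hessian
grad = lambda x: np.array([2 * x[0], 40 * x[1]])
H = np.diag([2.0, 40.0])

x = np.array([1.0, 1.0])
alpha = 1.0  # alpha = 1 recovers the pure Newton step
x = x - alpha * np.linalg.solve(H, grad(x))
print(x)     # [0., 0.]: one step reaches the minimum of this quadratic
```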


Marquardt's Method

If $H(x)$ or $H^{-1}(x)$ is not always positive definite, make it positive definite. Let

$$\tilde{H}(x) = H(x) + \beta I$$

(similarly for $H^{-1}(x)$), where $\beta$ is a positive constant large enough to shift all the negative eigenvalues of $H(x)$.

Example: At the start of the search, $H(x)$ is evaluated at $x^0$ and found to be

$$H(x^0) = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$$

which is not positive definite, as the eigenvalues are $e_1 = 3$, $e_2 = -1$. Modify $H(x^0)$ to

$$\tilde{H} = H(x^0) + 2I = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}$$

which is positive definite, as the eigenvalues are $e_1 = 5$, $e_2 = 1$. $\beta$ is adjusted as the search proceeds.
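A numeric check of the eigenvalue shift, as a sketch in numpy:

```python
import numpy as np

H = np.array([[1.0, 2.0],
              [2.0, 1.0]])
print(np.linalg.eigvalsh(H))        # [-1.  3.]: not positive definite

beta = 2.0                          # just large enough to shift -1 above zero
print(np.linalg.eigvalsh(H + beta * np.eye(2)))  # [1.  5.]: positive definite
```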

The algorithm:

Step 1. Pick $x^0$, the starting point. Let $\varepsilon$ = convergence criterion.

Step 2. Set $k = 0$. Let $\beta^0 = 10^3$.

Step 3. Calculate $\nabla f(x^k)$.

Step 4. Is $\|\nabla f(x^k)\| < \varepsilon$? If yes, terminate. If no, continue.

Step 5. Calculate $s(x^k) = -\left[H^k + \beta^k I\right]^{-1} \nabla f(x^k)$.

Step 6. Calculate $x^{k+1} = x^k + s(x^k)$.

Step 7. Is $f(x^{k+1}) < f(x^k)$? If yes, go to step 8. If no, go to step 9.

Step 8. Set $\beta^{k+1} = \frac{1}{4}\beta^k$ and $k = k + 1$. Go to step 3.

Step 9. Set $\beta^k = 2\beta^k$. Go to step 5.
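These nine steps translate almost line for line into code; the following is a sketch (function names are mine, and the constants follow the steps above):

```python
import numpy as np

def marquardt(f, grad, hess, x0, eps=1e-6, max_iter=200):
    """Marquardt's method, following Steps 1-9 above."""
    x = np.asarray(x0, dtype=float)                 # Step 1
    beta = 1e3                                      # Step 2
    for _ in range(max_iter):
        g = grad(x)                                 # Step 3
        if np.linalg.norm(g) < eps:                 # Step 4
            break
        while True:
            s = np.linalg.solve(hess(x) + beta * np.eye(len(x)), -g)  # Step 5
            x_new = x + s                           # Step 6
            if f(x_new) < f(x):                     # Step 7
                x, beta = x_new, beta / 4           # Step 8
                break
            beta *= 2                               # Step 9
    return x

f = lambda x: (x[0] - 3)**2 + 9 * (x[1] - 5)**2
grad = lambda x: np.array([2 * (x[0] - 3), 18 * (x[1] - 5)])
hess = lambda x: np.diag([2.0, 18.0])
print(marquardt(f, grad, hess, [1.0, 1.0]))  # ~[3., 5.]
```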


Secant Methods

Recall that for a one-dimensional search the secant method uses only values of $f(x)$ and $f'(x)$:

$$x^{k+1} = x^k - f'(x^k)\,\frac{x^k - x^p}{f'(x^k) - f'(x^p)}$$

It approximates $f''(x)$ by a straight line (the secant); hence it is called a "quasi-Newton" method.

The basic idea (for a quadratic function):

$$\nabla f(x) = 0 = \nabla f(x^k) + H(x^k)(x - x^k) \quad \text{or} \quad x = x^k - H^{-1}(x^k)\,\nabla f(x^k)$$

Pick two points to start ($x^k$ = reference point). For a quadratic,

$$\nabla f(x^1) = \nabla f(x^k) + H\,(x^1 - x^k)$$
$$\nabla f(x^2) = \nabla f(x^k) + H\,(x^2 - x^k)$$

Subtracting,

$$\nabla f(x^2) - \nabla f(x^1) = y = H\,(x^2 - x^1)$$

For a nonquadratic function, $\tilde{H}$ would be calculated, after taking a step from $x^k$ to $x^{k+1}$, by solving the secant equations

$$\tilde{H}\,\Delta x^k = y^k \quad \text{or} \quad \Delta x^k = \tilde{H}^{-1} y^k$$

- An infinite number of candidates exist for $\tilde{H}$ when $n > 1$.
- We want to choose $\tilde{H}$ (or $\tilde{H}^{-1}$) close to $H$ (or $H^{-1}$) in some sense. Several methods can be used to update $\tilde{H}$.

• Probably the best update formula is the BFGS update (Broyden-Fletcher-Goldfarb-Shanno), ca. 1970.
• BFGS is the basis for the unconstrained optimizer in the Excel Solver.
• BFGS does not require inverting the Hessian matrix; it approximates the inverse using values of $\nabla f$.
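For concreteness, a sketch of the standard BFGS update of the inverse-Hessian approximation (the slides do not give the formula; variable names are mine):

```python
import numpy as np

def bfgs_update_inverse(H_inv, dx, y):
    """BFGS update of the inverse-Hessian approximation.
    dx = x^{k+1} - x^k and y = grad f(x^{k+1}) - grad f(x^k);
    the updated matrix satisfies the secant condition H_inv_new @ y = dx."""
    rho = 1.0 / float(y @ dx)
    I = np.eye(len(dx))
    V = I - rho * np.outer(dx, y)
    return V @ H_inv @ V.T + rho * np.outer(dx, dx)

# Secant-condition check on a small example
H_inv = np.eye(2)
dx = np.array([1.0, 0.5])
y = np.array([2.0, 9.0])
print(np.allclose(bfgs_update_inverse(H_inv, dx, y) @ y, dx))  # True
```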
