Lecture 10: descent methods

• Generic descent algorithm

• Generalization to multiple dimensions

• Problems of descent methods, possible improvements

• Fixes

• Local minima


Gradient descent (reminder)

[Figure: plot of a function f(x), its minimum f(m) at x = m, and an initial guess marked on the curve.]

The minimum of a function is found by following the slope of the function.


Gradient descent (illustration)

[Figure sequence: from the guess, a step is taken downhill to the next point; the gradient is recomputed there ("new gradient") and another step is taken, until the iteration stops near the minimum m.]


Gradient descent: algorithm

Start with a point (guess)                  guess = x
Repeat
    Determine a descent direction           direction = -f'(x)
    Choose a step                           step = h > 0
    Update                                  x := x - h f'(x)
Until stopping criterion is satisfied       f'(x) ≈ 0

The descent direction points downhill from the current point, and the iteration stops when the current point is "close" to the minimum.
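A minimal sketch of this one-dimensional algorithm in Python (the lecture's demos are in MATLAB; the test function, step h, and tolerance below are assumptions for illustration):

    def gradient_descent_1d(df, x, h=0.1, tol=1e-6, max_iter=1000):
        """Minimize a 1-D function, given its derivative df."""
        for _ in range(max_iter):
            g = df(x)               # current slope; descent direction is -g
            if abs(g) < tol:        # stopping criterion: f'(x) ~ 0
                break
            x = x - h * g           # update: x := x - h f'(x)
        return x

    # Example: f(x) = (x - 2)^2, so f'(x) = 2(x - 2); the minimum is at x = 2.
    print(gradient_descent_1d(lambda x: 2 * (x - 2), x=0.0))  # ~2.0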


Example of 2D gradient: pic of the MATLAB demo

[Figure: illustration of the gradient in 2D.]

Definition of the gradient in 2D:

    ∇f(x, y) = ( ∂f/∂x , ∂f/∂y )

This is just a generalization of the derivative to two dimensions, and it extends to any dimension.
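When no closed-form gradient is available, it can be approximated numerically. A minimal sketch using central finite differences (the test function and the step eps are assumptions for illustration):

    def numerical_gradient(f, x, y, eps=1e-6):
        """Approximate the 2-D gradient (df/dx, df/dy) by central differences."""
        dfdx = (f(x + eps, y) - f(x - eps, y)) / (2 * eps)
        dfdy = (f(x, y + eps) - f(x, y - eps)) / (2 * eps)
        return dfdx, dfdy

    # Example: f(x, y) = x^2 + 3y^2 has gradient (2x, 6y).
    print(numerical_gradient(lambda x, y: x**2 + 3 * y**2, 1.0, 2.0))
    # approximately (2.0, 12.0)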

Example of 2D gradient: pic of the MATLAB demo

[Figure: gradient descent iterations on a 2-D surface.]

Gradient descent works in 2D.

Generalization to multiple dimensions

Start with a point (guess)                  guess = x
Repeat
    Determine a descent direction           direction = -∇f(x)
    Choose a step                           step = h > 0
    Update                                  x := x - h ∇f(x)
Until stopping criterion is satisfied       ∇f(x) ≈ 0

As in one dimension, the direction points downhill and the iteration stops when the current point is "close" to the minimum.
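A minimal N-dimensional sketch in Python with NumPy (the quadratic test function, step h, and tolerance are assumptions for illustration):

    import numpy as np

    def gradient_descent(grad, x0, h=0.1, tol=1e-6, max_iter=1000):
        """Minimize a function of several variables, given its gradient."""
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iter):
            g = grad(x)                   # descent direction is -g
            if np.linalg.norm(g) < tol:   # stopping criterion: grad f(x) ~ 0
                break
            x = x - h * g                 # update: x := x - h grad f(x)
        return x

    # Example: f(x) = x1^2 + 3 x2^2 has gradient (2 x1, 6 x2); minimum at the origin.
    print(gradient_descent(lambda x: np.array([2 * x[0], 6 * x[1]]), [4.0, -2.0]))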


Multiple dimensions

Everything that you have seen with derivatives can be generalized with the gradient. For the descent method, f'(x) can be replaced by

    ∇f(x) = ( ∂f/∂x₁ , ∂f/∂x₂ )

in two dimensions, and by

    ∇f(x) = ( ∂f/∂x₁ , … , ∂f/∂x_N )

in N dimensions.

Example of 2D gradient: MATLAB demo

The cost to buy a portfolio is a function of the amount held in each stock. If you want to minimize the price to buy your portfolio, you need to compute the gradient of that price with respect to those amounts.

[Figure: portfolio composed of stock 1, stock 2, …, stock i, …, stock N.]
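The slide's cost formulas are images that did not survive the transcript. Purely as an illustration, assume a hypothetical quadratic cost c(x) = ½ xᵀQx + pᵀx over the amounts x, whose gradient is Qx + p:

    import numpy as np

    # Hypothetical cost c(x) = 0.5 x^T Q x + p^T x -- an assumption for
    # illustration, not the lecture's actual formula. Gradient: Q x + p.
    Q = np.array([[2.0, 0.5],
                  [0.5, 1.0]])            # made-up coupling between two stocks
    p = np.array([1.0, -1.0])             # made-up per-stock prices

    def cost_gradient(x):
        return Q @ x + p

    print(cost_gradient(np.array([1.0, 2.0])))  # gradient at x = (1, 2)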


Problem 1: choice of the step

When updating the current computation:
- small steps: inefficient
- large steps: potentially bad results

[Figure, left: too many small steps, so the algorithm takes too long to converge. Right: from the current point, one large step overshoots the minimum (the next point went too far).]
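A small sketch of both failure modes on f(x) = x², where f'(x) = 2x (the step sizes are chosen for illustration):

    def steps_to_converge(h, x=1.0, tol=1e-6, max_iter=100_000):
        """Run gradient descent on f(x) = x^2 and count the iterations."""
        for i in range(max_iter):
            if abs(2 * x) < tol:    # converged: f'(x) ~ 0
                return i
            if abs(x) > 1e12:       # diverged
                return None
            x = x - h * 2 * x       # update with f'(x) = 2x
        return None

    print(steps_to_converge(0.001))  # small step: thousands of iterations
    print(steps_to_converge(0.4))    # moderate step: converges in a few steps
    print(steps_to_converge(1.1))    # too large: diverges, returns None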

Problem 2: the "ping-pong effect"

[S. Boyd and L. Vandenberghe, Convex Optimization lecture notes, Stanford University, 2004]

Problem 2 (continued): other norm-dependent issues

[S. Boyd and L. Vandenberghe, Convex Optimization lecture notes, Stanford University, 2004]

Problem 3: stopping criterion

Intuitive criterion: stop when f'(x) ≈ 0, i.e. |f'(x)| ≤ ε for some small tolerance ε.

In multiple dimensions: ∇f(x) ≈ 0, or equivalently ‖∇f(x)‖ ≤ ε.

Rarely used in practice.

More about this in EE227A (convex optimization, Prof. L. El Ghaoui).
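In code, the multidimensional criterion is a norm test (the tolerance eps is an assumption):

    import numpy as np

    def should_stop(grad_x, eps=1e-6):
        """Stopping criterion: the gradient is (nearly) zero."""
        return np.linalg.norm(grad_x) <= eps

    print(should_stop(np.array([1e-8, -1e-9])))  # True: near a stationary point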


Fixes

Several methods exist to address these problems (a sketch of backtracking line search follows the list):

- Line search methods, in particular
  - Backtracking line search
  - Exact line search
- Normalized steepest descent
- Newton steps
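A minimal sketch of backtracking line search in Python. The sufficient-decrease condition and the parameters alpha and beta follow common conventions (e.g. in Boyd and Vandenberghe); the test function is an assumption:

    import numpy as np

    def backtracking_step(f, grad_x, x, alpha=0.3, beta=0.8):
        """Shrink the step h until f decreases enough along -grad f(x)."""
        h = 1.0
        fx, g2 = f(x), np.dot(grad_x, grad_x)
        # Sufficient decrease: f(x - h g) <= f(x) - alpha * h * ||g||^2
        while f(x - h * grad_x) > fx - alpha * h * g2:
            h *= beta
        return h

    # Example on f(x) = x1^2 + 3 x2^2 with gradient (2 x1, 6 x2).
    f = lambda x: x[0]**2 + 3 * x[1]**2
    x = np.array([4.0, -2.0])
    g = np.array([2 * x[0], 6 * x[1]])
    print(backtracking_step(f, g, x))  # a step size guaranteeing descent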

Fundamental problem of the method: local minima

Local minima: pic of the MATLAB demo

[Figure: the iterations converge to a local, not global, minimum.]

The iterations of the algorithm converge to a local minimum. The view of the algorithm is "myopic": it only sees the local slope, not the global shape of the function.