L05 Non-convex Optimization: GD + noise converges to ... · Analysis of Perturbed Gradient Descent...

Posted on 08-Oct-2020



50.579 Optimization for Machine Learning

Ioannis Panageas

ISTD, SUTD

L05 Non-convex Optimization: GD + noise converges to second order stationarity

Recap

Optimization for Machine Learning

• This is only true in the unconstrained case!
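Since the slide only states the caveat, a tiny numeric sketch may help (the example f(x) = x² constrained to x ≥ 1 is an illustrative choice, not from the slides): at an unconstrained minimizer the gradient vanishes, but at a constrained minimizer it generally does not.

```python
def f(x):
    return x ** 2          # objective

def grad_f(x):
    return 2 * x           # its gradient

# Unconstrained: the minimizer of f over all of R is x = 0,
# and the first-order condition grad_f(x*) = 0 holds there.
print(grad_f(0.0))         # 0.0

# Constrained to x >= 1: the minimizer is the boundary point x = 1,
# where the gradient is 2, not 0, so the condition fails.
print(grad_f(1.0))         # 2.0
```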

Vanishing step-sizes

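As a toy illustration of a vanishing step-size schedule (the quadratic f(x) = x² and the particular schedule η_t = 1/(2(t+2)) are illustrative choices, not taken from the slides): the update telescopes to x_T = x_0/(T+1), so the iterates still reach the minimizer even though the steps shrink, because the step sizes sum to infinity.

```python
def grad(x):
    return 2.0 * x                 # gradient of f(x) = x**2

x = 1.0
for t in range(10000):
    eta = 0.5 / (t + 2)            # vanishing step size; sum of eta_t diverges
    x = x - eta * grad(x)          # update equals x * (1 - 1/(t + 2))

# Telescoping product: x_T = 1/(T + 1), so x is about 1/10001 here.
print(x)
```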

Definitions


Convergence to first order stationarity

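A small experiment matching the standard first-order guarantee: for an L-smooth f, gradient descent with a suitable constant step size drives some ‖∇f(x_t)‖ below ε within O(1/ε²) iterations. The non-convex test function f(x) = (x² − 1)² and the constants below are illustrative choices, not from the slides.

```python
def grad(x):
    return 4.0 * x * (x * x - 1.0)   # gradient of f(x) = (x**2 - 1)**2

x, eta, eps = 0.5, 0.05, 1e-6
steps = 0
while abs(grad(x)) > eps:            # stop at an eps-first-order stationary point
    x -= eta * grad(x)
    steps += 1

# GD reaches a point with tiny gradient; here it is near the minimizer x = 1.
print(x, steps)
```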

Perturbed Gradient Descent


Analysis of Perturbed Gradient Descent


• High level proof strategy:

1) When the current iterate is not an 𝜖-second-order stationary point, it must either (a) have a large gradient or (b) have a strictly negative eigenvalue of the Hessian.

2) In both cases we can show a significant decrease in function value within a controlled number of iterations.

3) Since the total decrease cannot exceed 𝑓(𝑥0) − 𝑓(𝑥∗) (the global minimum is bounded), we reach a contradiction.
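The algorithm being analyzed can be sketched as follows, a simplified version of perturbed gradient descent in the spirit of the analysis above: run plain GD, and whenever the gradient is small (we may be stuck near a saddle point), inject a small random perturbation so that GD can escape strict saddles. For simplicity this sketch uses uniform box noise, while the actual algorithm samples from a ball with radii and thresholds dictated by the analysis; the test function and all constants here are illustrative choices.

```python
import math
import random

def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))

def perturbed_gd(grad, x, eta=0.1, g_thresh=1e-3, radius=1e-2,
                 t_noise=10, iters=2000):
    """GD that adds uniform noise whenever the gradient is small
    and no perturbation was added in the last t_noise iterations."""
    last_noise = -t_noise - 1
    for t in range(iters):
        g = grad(x)
        if norm(g) <= g_thresh and t - last_noise > t_noise:
            # small gradient: possibly near a saddle, so perturb
            x = [xi + random.uniform(-radius, radius) for xi in x]
            last_noise = t
        else:
            x = [xi - eta * gi for xi, gi in zip(x, g)]
    return x

# f(x, y) = (x**2 - 1)**2 + y**2 has a strict saddle at (0, 0)
# and minima at (+1, 0) and (-1, 0).
def grad_f(v):
    x, y = v
    return [4.0 * x * (x * x - 1.0), 2.0 * y]

random.seed(0)
x = perturbed_gd(grad_f, [0.0, 0.0])  # plain GD would stay at (0, 0) forever
print(x)                              # lands near one of the minima
```

Started exactly at the strict saddle (0, 0), plain GD never moves, while the perturbation pushes the iterate into a descent direction, after which GD converges to one of the two minima.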


Conclusion

• Introduction to Non-convex Optimization.

– Perturbed Gradient Descent avoids strict saddles!

– Same is true for Perturbed SGD.

• Next lecture we will talk more about accelerated methods.

• In Week 8 we are going to talk about min-max optimization (GANs).
