Post on 17-Dec-2020
Multivariable Problems Gradient Descent Newton’s Method Quasi-Newton Missing Details
Optimization II: Unconstrained Multivariable
CS 205A: Mathematical Methods for Robotics, Vision, and Graphics
Justin Solomon
CS 205A: Mathematical Methods Optimization II: Unconstrained Multivariable 1 / 20
Unconstrained Multivariable Problems
minimize $f : \mathbb{R}^n \to \mathbb{R}$
Recall
$\nabla f(\vec{x})$
“Direction of steepest ascent”
Recall
$-\nabla f(\vec{x})$
“Direction of steepest descent”
Observation
If $\nabla f(\vec{x}) \neq \vec{0}$, then for sufficiently small $\alpha > 0$,
$f(\vec{x} - \alpha \nabla f(\vec{x})) \leq f(\vec{x})$
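One way to see why this holds, assuming $f$ is differentiable at $\vec{x}$ (an assumption the slide leaves implicit), is a first-order Taylor expansion along the negative gradient:

```latex
f(\vec{x} - \alpha \nabla f(\vec{x}))
  = f(\vec{x}) - \alpha \, \|\nabla f(\vec{x})\|_2^2 + o(\alpha)
```

Because $\|\nabla f(\vec{x})\|_2^2 > 0$ whenever $\nabla f(\vec{x}) \neq \vec{0}$, the linear term dominates the remainder once $\alpha$ is small enough, so the function value goes down.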
Gradient Descent Algorithm
Iterate until convergence:
1. $g_k(t) \equiv f(\vec{x}_k - t \nabla f(\vec{x}_k))$
2. Find $t^* \geq 0$ minimizing (or decreasing) $g_k$
3. $\vec{x}_{k+1} \equiv \vec{x}_k - t^* \nabla f(\vec{x}_k)$
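The three steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the step "find $t^*$ decreasing $g_k$" is done with a simple backtracking rule, and the constants `1e-4` and `0.5`, the tolerance, and the test function are all hypothetical choices, not taken from the slides.

```python
import numpy as np

def gradient_descent(f, grad, x0, tol=1e-8, max_iter=10_000):
    """Gradient descent with a backtracking line search (illustrative sketch)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:          # gradient is approximately zero
            break
        t = 1.0
        # Halve t until g_k(t) = f(x - t*g) decreases by a sufficient amount.
        while t > 1e-12 and f(x - t * g) > f(x) - 1e-4 * t * (g @ g):
            t *= 0.5
        x = x - t * g                        # x_{k+1} = x_k - t* grad f(x_k)
    return x

# Hypothetical test problem: f(x, y) = (x - 1)^2 + 10 (y + 2)^2
f = lambda x: (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2
grad = lambda x: np.array([2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)])
x_min = gradient_descent(f, grad, [0.0, 0.0])
```

On this convex quadratic the iterates approach the minimizer $(1, -2)$; on harder problems the number of iterations can grow quickly with the conditioning of the Hessian.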
Stopping Condition
$\nabla f(\vec{x}_k) \approx \vec{0}$
Don’t forget: Check optimality!
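One reading of "check optimality" is that a vanishing gradient only identifies a stationary point, which may be a minimum, a maximum, or a saddle. Assuming the Hessian is available at that point (which is not always the case), a second-order test on its eigenvalues might look like the sketch below; the function name and tolerance are hypothetical.

```python
import numpy as np

def classify_stationary_point(hessian, tol=1e-10):
    """Classify a zero-gradient point from the eigenvalues of its Hessian."""
    eigvals = np.linalg.eigvalsh(hessian)    # eigenvalues of a symmetric matrix
    if np.all(eigvals > tol):
        return "local minimum"
    if np.all(eigvals < -tol):
        return "local maximum"
    if eigvals.min() < -tol and eigvals.max() > tol:
        return "saddle point"
    return "inconclusive"                    # a near-zero eigenvalue: test fails

# f(x, y) = x^2 - y^2 has zero gradient at the origin, yet it is a saddle.
saddle_kind = classify_stationary_point(np.array([[2.0, 0.0], [0.0, -2.0]]))
# f(x, y) = x^2 + y^2 has a genuine minimum at the origin.
bowl_kind = classify_stationary_point(np.array([[2.0, 0.0], [0.0, 2.0]]))
```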
Line Search
$g_k(t) \equiv f(\vec{x}_k - t \nabla f(\vec{x}_k))$
• One-dimensional optimization
• Don’t have to minimize completely: Wolfe conditions
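The slide names the Wolfe conditions without stating them. In their usual (weak) form, for a search direction $\vec{p}$ and constants $0 < c_1 < c_2 < 1$, a step $t$ is accepted when $f(\vec{x} + t\vec{p}) \leq f(\vec{x}) + c_1 t \nabla f(\vec{x})^\top \vec{p}$ (sufficient decrease) and $\nabla f(\vec{x} + t\vec{p})^\top \vec{p} \geq c_2 \nabla f(\vec{x})^\top \vec{p}$ (curvature). A minimal check, with `c1`, `c2`, and the test function chosen purely for illustration:

```python
import numpy as np

def satisfies_wolfe(f, grad, x, p, t, c1=1e-4, c2=0.9):
    """Check the weak Wolfe conditions for step size t along direction p."""
    slope0 = grad(x) @ p                     # derivative of t -> f(x + t p) at 0
    sufficient_decrease = f(x + t * p) <= f(x) + c1 * t * slope0
    curvature = grad(x + t * p) @ p >= c2 * slope0
    return bool(sufficient_decrease and curvature)

f = lambda x: float(x @ x)                   # illustrative objective ||x||^2
grad = lambda x: 2.0 * x
x = np.array([1.0, 0.0])
p = -grad(x)                                 # steepest-descent direction

ok_step = satisfies_wolfe(f, grad, x, p, t=0.5)     # lands on the minimizer
too_long = satisfies_wolfe(f, grad, x, p, t=1.0)    # overshoots: no decrease
too_short = satisfies_wolfe(f, grad, x, p, t=1e-3)  # slope has barely changed
```

The first condition rejects steps that are too long, and the second rejects steps that are too short, so together they bracket an acceptable range of $t$ without requiring the exact minimizer of $g_k$.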
Newton’s Method (again!)
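$$f(\vec{x}) \approx f(\vec{x}_k) + \nabla f(\vec{x}_k)^\top (\vec{x} - \vec{x}_k) + \frac{1}{2} (\vec{x} - \vec{x}_k)^\top H_f(\vec{x}_k) (\vec{x} - \vec{x}_k)$$
$$\implies \vec{x}_{k+1} = \vec{x}_k - [H_f(\vec{x}_k)]^{-1} \nabla f(\vec{x}_k)$$
Consideration: What if $H_f$ is not positive (semi-)definite?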
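A small sketch of the Newton update, assuming both the gradient and the Hessian are available as callables (the helper name and the two test functions are hypothetical). It solves the linear system rather than forming the inverse, and the second example illustrates the consideration above: with an indefinite Hessian the step jumps to a stationary point of the quadratic model that is not a minimum.

```python
import numpy as np

def newton_step(grad, hess, x):
    """One Newton iteration: solve H_f(x) d = grad f(x), then return x - d."""
    return x - np.linalg.solve(hess(x), grad(x))

# Illustrative quadratic f(x) = 1/2 x^T A x - b^T x with A positive definite.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
x1 = newton_step(lambda x: A @ x - b, lambda x: A, np.array([10.0, -7.0]))

# f(x, y) = x^2 - y^2 has an indefinite Hessian; Newton lands on its saddle.
S = np.array([[2.0, 0.0], [0.0, -2.0]])
x_saddle = newton_step(lambda x: S @ x, lambda x: S, np.array([1.0, 1.0]))
```

For the quadratic, a single step reaches the exact minimizer (the solution of $A\vec{x} = \vec{b}$) because the second-order model is the function itself.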
Motivation
• $\nabla f$ might be hard to compute, but $H_f$ is harder
• $H_f$ might be dense: $n^2$ entries
Quasi-Newton Methods
Approximate derivatives to avoid expensive calculations
e.g. secant, Broyden, ...
Common Optimization Assumption
• $\nabla f$ known
• $H_f$ unknown or hard to compute
Quasi-Newton Optimization
$\vec{x}_{k+1} = \vec{x}_k - \alpha_k B_k^{-1} \nabla f(\vec{x}_k)$
$B_k \approx H_f(\vec{x}_k)$
Warning
<advanced material>
See Nocedal & Wright
Broyden-Style Update
$B_{k+1}(\vec{x}_{k+1} - \vec{x}_k) = \nabla f(\vec{x}_{k+1}) - \nabla f(\vec{x}_k)$
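This secant condition alone does not pin down $B_{k+1}$. As one illustration, the classical Broyden rank-one update (a standard formula, not stated on the slide) changes $B_k$ as little as possible along the step while enforcing the condition. The vectors `s` and `y` below are made-up numbers standing in for $\vec{x}_{k+1} - \vec{x}_k$ and $\nabla f(\vec{x}_{k+1}) - \nabla f(\vec{x}_k)$.

```python
import numpy as np

def broyden_update(B, s, y):
    """Rank-one update so that the returned matrix maps s to y."""
    return B + np.outer(y - B @ s, s) / (s @ s)

B0 = np.eye(2)
s = np.array([1.0, 2.0])       # stands in for x_{k+1} - x_k
y = np.array([3.0, 1.0])       # stands in for grad f(x_{k+1}) - grad f(x_k)
B1 = broyden_update(B0, s, y)

secant_holds = np.allclose(B1 @ s, y)
is_symmetric = np.allclose(B1, B1.T)
```

Here the secant condition holds but the updated matrix is not symmetric, even though a true Hessian always is.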
Additional Considerations
• $B_k$ should be symmetric
• $B_k$ should be positive (semi-)definite
Davidon-Fletcher-Powell (DFP)
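$$\begin{aligned}
\min_{B_{k+1}} \quad & \|B_{k+1} - B_k\| \\
\text{s.t.} \quad & B_{k+1}^\top = B_{k+1} \\
& B_{k+1}(\vec{x}_{k+1} - \vec{x}_k) = \nabla f(\vec{x}_{k+1}) - \nabla f(\vec{x}_k)
\end{aligned}$$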
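For a particular weighted Frobenius norm (see Nocedal & Wright for the exact weighting), this problem has the closed-form solution $B_{k+1} = (I - \rho \vec{y}\vec{s}^\top) B_k (I - \rho \vec{s}\vec{y}^\top) + \rho \vec{y}\vec{y}^\top$ with $\vec{s} = \vec{x}_{k+1} - \vec{x}_k$, $\vec{y} = \nabla f(\vec{x}_{k+1}) - \nabla f(\vec{x}_k)$, and $\rho = 1/(\vec{y}^\top \vec{s})$. The sketch below applies it to made-up vectors and assumes $\vec{y}^\top \vec{s} > 0$.

```python
import numpy as np

def dfp_update(B, s, y):
    """DFP update of the Hessian approximation B; assumes y @ s > 0."""
    rho = 1.0 / (y @ s)
    I = np.eye(len(s))
    left = I - rho * np.outer(y, s)
    right = I - rho * np.outer(s, y)
    return left @ B @ right + rho * np.outer(y, y)

B0 = np.eye(2)
s = np.array([1.0, 2.0])       # stands in for x_{k+1} - x_k
y = np.array([3.0, 1.0])       # stands in for the gradient difference; y @ s = 5
B1 = dfp_update(B0, s, y)

secant_holds = np.allclose(B1 @ s, y)
is_symmetric = np.allclose(B1, B1.T)
is_positive_definite = bool(np.all(np.linalg.eigvalsh(B1) > 0))
```

Unlike a plain rank-one correction, this update keeps the matrix symmetric, and it preserves positive definiteness as long as $\vec{y}^\top \vec{s} > 0$.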
Observation
$\|B_{k+1} - B_k\|$ small does not mean $\|B_{k+1}^{-1} - B_k^{-1}\|$ is small
Idea: Try to approximate $B_k^{-1}$ directly
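A two-by-two example makes the observation concrete. The matrices are made up for illustration: they differ by only `eps` in one diagonal entry, but that entry is itself close to zero, so the inverses differ by a large amount.

```python
import numpy as np

eps = 1e-3
B_old = np.diag([1.0, eps])
B_new = np.diag([1.0, 2.0 * eps])

# Frobenius norm of the change in B: equal to eps.
change_in_B = np.linalg.norm(B_new - B_old)

# Frobenius norm of the change in the inverse: equal to 1 / (2 * eps).
change_in_inverse = np.linalg.norm(np.linalg.inv(B_new) - np.linalg.inv(B_old))
```

Since the search direction uses $B_k^{-1}$, it is the second quantity that matters for how much the step changes between iterations.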
BFGS Update
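$$\begin{aligned}
\min_{H_{k+1}} \quad & \|H_{k+1} - H_k\| \\
\text{s.t.} \quad & H_{k+1}^\top = H_{k+1} \\
& \vec{x}_{k+1} - \vec{x}_k = H_{k+1}(\nabla f(\vec{x}_{k+1}) - \nabla f(\vec{x}_k))
\end{aligned}$$
Here $H_k \approx B_k^{-1}$ approximates the inverse Hessian; it is not the Hessian $H_f$.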
State of the art!
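As with DFP, a suitable weighted norm gives a closed form (see Nocedal & Wright): $H_{k+1} = (I - \rho \vec{s}\vec{y}^\top) H_k (I - \rho \vec{y}\vec{s}^\top) + \rho \vec{s}\vec{s}^\top$ with $\rho = 1/(\vec{y}^\top \vec{s})$. The loop below is a minimal sketch rather than a production implementation: the identity initial guess, the backtracking constants, the rule for skipping an update, and the quadratic test problem are all assumptions made for illustration.

```python
import numpy as np

def bfgs(f, grad, x0, tol=1e-8, max_iter=200):
    """BFGS with backtracking; H approximates the inverse Hessian."""
    x = np.asarray(x0, dtype=float)
    n = len(x)
    H = np.eye(n)                       # initial inverse-Hessian guess (assumed)
    g = grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        p = -H @ g                      # quasi-Newton search direction
        t = 1.0
        while t > 1e-12 and f(x + t * p) > f(x) + 1e-4 * t * (g @ p):
            t *= 0.5
        s = t * p                       # x_{k+1} - x_k
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g                   # grad f(x_{k+1}) - grad f(x_k)
        if y @ s > 1e-12:               # skip update if curvature is not positive
            rho = 1.0 / (y @ s)
            V = np.eye(n) - rho * np.outer(s, y)
            H = V @ H @ V.T + rho * np.outer(s, s)
        x, g = x_new, g_new
    return x, H

# Illustrative strictly convex quadratic f(x) = 1/2 x^T A x - b^T x.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b
x_star, H_final = bfgs(f, grad, [0.0, 0.0])
```

Only gradient evaluations and matrix-vector products appear in the loop; no linear system is solved, which is the practical payoff of approximating the inverse directly.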
Lots of Missing Details
• Choice of $\|\cdot\|$
• Limited-memory alternative