Gradient Methods Yaron Lipman May 2003. Preview Background Steepest Descent Conjugate Gradient.
Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest...
Transcript of Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest...
![Page 1: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/1.jpg)
Multidisciplinary SystemMultidisciplinary System Design Optimization (MSDO)Design Optimization (MSDO)
Optimization Method Selection
Recitation 5
Andrew March
© Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics
1
![Page 2: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/2.jpg)
TodayToday’’s Topicss Topics
• Review optimization algorithms
• Algorithm Selection • Questions
© Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics
2
![Page 3: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/3.jpg)
Analytical Methods
• Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton
• Direct Search: – Compass search – Nelder-Mead Simplex
• Note: The gradient methods have a constrained equivalent. – Steepest Descent/CG: Use projection – Newton/Quasi-Newton: SQP – Direct search typically uses barrier or penalty methods
![Page 4: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/4.jpg)
Gradient Methods
• Compute descent direction, dk
• Compute step length αk
• Take step: xk +1 = xk +αk dk
• Repeat until αkdk≤ε
![Page 5: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/5.jpg)
Steepest Descent
• Compute descent direction, dk = −∇f (xk )
• Compute step length, αk – Exactly: αk = arg min f (xk +αdk )α
– Inexactly: any αk such that for a c1,c2 in (0<c1<c2<1) f (xk +αk dk ) ≤ f (xk )+ c1αk∇f (xk )T dk
∇f (xk +αk dk )T dk ≥ c2∇f (xk )T dk
• Take step: x = x +α dk +1 k k k
• Repeat until αkdk≤ε
![Page 6: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/6.jpg)
Conjugate Gradient
• Compute descent direction, dk = −∇f (xk )+ βk dk −1
( )T ∇f (x )=
f x T (∇ (x ) ∇f∇f x ∇ ( ) f − (x ))βk = kT
k or βk k
Tk k −1
∇ ( ) ( )∇ x f ( ) ( )xf xk −1 f k −1 ∇ k −1 ∇f xk −1
• Compute step length, αk – Exactly: αk = arg min f (xk +αdk )α – Inexactly: any αk such that for a c1,c2 in (0<c1<c2<1)
f (xk +αk dk ) ≤ f (xk )+ c1αk∇f (xk )T dk
( ) dk∇f (xk +αk dk )T dk ≥ c2∇f xkT
• Take step: xk +1 = xk +αk dk
• Repeat until αkdk≤ε
![Page 7: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/7.jpg)
Newton’s Method
−1• Compute descent direction, dk = −H (xk )∇f (xk ) • Compute step length, αk
– Try: αk=1, decrease? If not: • Exactly: αk = arg min f (xk +αdk )α • Inexactly: any αk such that for a c1,c2 in (0<c1<c2<1)
f (xk +αk dk ) ≤ f (xk )+ c1αk ∇f (xk )T dk
∇f (xk +αk dk )T dk ≥ c2∇f (xk )T dk
• Trust-region: αk dk ≤ Δk
• Take step: x = x +α dk +1 k k k
• Repeat until αkdk≤ε
![Page 8: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/8.jpg)
Quasi-Newton
−1• Compute descent direction, dk = −B (xk )∇f (xk )( ) ≈ H (x ); B(x )f 0B xk k k
• Compute step length, αk – Try: αk=1, decrease? If not:
• Exactly: αk = arg min f (xk +αdk )α • Inexactly: any αk such that for a c1,c2 in (0<c1<c2<1)
f (xk +αk dk ) ≤ f (xk )+ c1αk ∇f (xk )T dk
( ) dk∇f (xk +αk dk )T dk ≥ c2∇f xkT
• Take step: xk +1 = xk +αk dk
• Repeat until αkdk≤ε
![Page 9: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/9.jpg)
Direct Search, Compass Search xk+Δkej
Δk
xk xk+Δkeixk-Δkei
xk-Δkej
• Evaluate f (xk ± Δk ei ), ∀i
• If f (xk ± Δk ei ) < f (xk ) – Move to minimum of: f (xk ± Δk ei ), ∀i
• Else 1
– Δk +1 = 2 Δk
![Page 10: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/10.jpg)
Direct Search, Nelder-Mead
Generate n+1 points in ℜn, {x1,…,xn+1} Iterate:
x = arg min f (x)• l xi x = arg max f (x)• h xi
• x = centroid{x1,K,xn+1} • Reflect (α>0): xr = (1+α)x −αxh
• if ( f (xl ) < f (xr ) and f(xr ) < f (xh )), xh = xr , return • if ( f (xr ) < f (xl )), Expand (γ>1): xe = γxr + (1 −γ )x • if ( f (xe ) < f (xl )), xh = xe , return • else, xh = xr , return • if ( f (xr ) > f (xh )), Contract (0<β<1): xc = βxh + (1− β )x
• if ( f (xc ) ≤ min{f (xh ), f (xr )}), xh = xc , return • else, xi = (xi + xl )/ 2, ∀i J. A. Nelder and R. A. Mead, A simplex method for function minimization, Computer Journal, Vol. 7, pp 308-313, 1965.
![Page 11: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/11.jpg)
Heuristic Methods
• Simulated Annealing • Genetic Algorithms • Particle Swarm Optimization (next lecture)• Tabu Search (next lecture) • Efficient Global Optimization
![Page 12: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/12.jpg)
Simulated Annealing • Terminology:
– X (or R or Γ) = Design Vector (i.e. Design, Architecture, Configuration) – E = System Energy (i.e. Objective Function Value) – T = System Temperature – Δ = Difference in System Energy Between Two Design Vectors
• The Simulated Annealing Algorithm 1) Choose a random Xi, select the initial system temperature, and specify the cooling (i.e. annealing) schedule 2) Evaluate E(Xi) using a simulation model 3) Perturb Xi to obtain a neighboring Design Vector (Xi+1) 4) Evaluate E(Xi+1) using a simulation model 5) If E(Xi+1)< E(Xi), Xi+1 is the new current solution 6) If E(Xi+1)> E(Xi), then accept Xi+1 as the new current solution with a probability e(- Δ/T) where Δ = E(Xi+1) - E(Xi).7) Reduce the system temperature according to the cooling schedule.8) Terminate the algorithm.
![Page 13: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/13.jpg)
Genetic AlgorithmGenetic AlgorithmInitialize Population (initialization)
Select individual for mating (selection)
Mate individuals and produce children (crossover)
Mutate children (mutation)
Insert children into population (insertion)
Are stopping criteria satisfied ?n y
Finish Ref: Goldberg (1989)
next
gen
erat
ion
© Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics
13
![Page 14: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/14.jpg)
Particle Swarm
• Birds go in a somewhat random direction, but also somewhat follow a swarm
• Keep checking for “better” locations
– Generally continuous parameters only, but there are discrete formulations.
![Page 15: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/15.jpg)
Tabu Search
• Keep a list of places you’ve visited • Don’t return, keep finding new places
![Page 16: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/16.jpg)
Efficient Global Optimization • Started by Jones 1998 • Based on probability theory
– Assumes: T 2f (x) ≈ β x + N (μ(x),σ (x))
• β T x , true behavior, regression
2 • N (μ(x),σ (x)), error from
true behavior is normally distributed, with mean μ(x), andvariance σ2(x)
• Estimate function values with a Kriging model (radial basis functions) – Predicts mean and variance – Probabilistic way to find optima
• Evaluate function at “maximum expected improvementlocation(s)” and update model
![Page 17: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/17.jpg)
What’s good algorithm?
Objective Contours:• Steepest descent • Conjugate gradient • Newton’s Method • Quasi-Newton • SQP • Compass-Search • Nelder-Mead Simplex• SA • GA • Tabu • PSO • EGO
![Page 18: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/18.jpg)
What’s good algorithm?
2 2f (x) = 100(x2 − x1 ) + (1− x1 )2
• Steepest descent • Conjugate gradient • Newton’s Method • Quasi-Newton • SQP • Compass-Search • Nelder-Mead Simplex• SA • GA • Tabu • PSO • EGO
![Page 19: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/19.jpg)
What’s good algorithm?
Objective Function:
a) Find quick improvement?
b) Find global optima?
• Steepest descent • Conjugate gradient • Newton’s Method • Quasi-Newton • SQP • Compass-Search • Nelder-Mead Simplex• SA • GA • Tabu • PSO • EGO
![Page 20: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/20.jpg)
What’s good algorithm?
• x1={1,2,3,4}• x2∈ℜ • min f(x1,x2)
• Steepest descent • Conjugate gradient • Newton’s Method • Quasi-Newton • SQP • Compass-Search • Nelder-Mead Simplex• SA • GA • Tabu • PSO • EGO
![Page 21: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/21.jpg)
What’s good algorithm?
• Airfoil design with CFD – Run-time~3 hours
a) Without an adjointsolution?
b) With an adjoint solution?
• Steepest descent • Conjugate gradient • Newton’s Method • Quasi-Newton • SQP • Compass-Search • Nelder-Mead Simplex• SA • GA • Tabu • PSO • EGO
![Page 22: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/22.jpg)
What’s good algorithm?
• Minimize weight – s.t. stress<σmax
• Natran output – Stress=3.500x104
– (finite precision)
• Steepest descent • Conjugate gradient • Newton’s Method • Quasi-Newton • SQP • Compass-Search • Nelder-Mead Simplex• SA • GA • Tabu • PSO • EGO
![Page 23: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/23.jpg)
What’s good algorithm?
• Flat section:
• Steepest descent • Conjugate gradient • Newton’s Method • Quasi-Newton • SQP • Compass-Search • Nelder-Mead Simplex• SA • GA • Tabu • PSO • EGO
![Page 24: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/24.jpg)
min cTxs.t. Ax=b
What’s good algorithm? • Steepest descent • Conjugate gradient • Newton’s Method • Quasi-Newton • SQP • Compass-Search • Nelder-Mead Simplex • SA • GA • Tabu • PSO • EGO
![Page 25: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/25.jpg)
What’s good algorithm?
Nonsmooth objective:
• Steepest descent • Conjugate gradient • Newton’s Method • Quasi-Newton • SQP • Compass-Search • Nelder-Mead Simplex• SA • GA • Tabu • PSO • EGO
![Page 26: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/26.jpg)
What’s good algorithm?
Islands of feasibility: x2
x1
=feasible
• Steepest descent • Conjugate gradient • Newton’s Method • Quasi-Newton • SQP • Compass-Search • Nelder-Mead Simplex• SA • GA • Tabu • PSO • EGO
![Page 27: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/27.jpg)
What’s good algorithm?
• Problem aspects: – Islands of feasibility – Many local minima – Mixed
discrete/continuous variables
– Many design variable scales (10-1Æ104)
– Long function evaluation time (~2 minutes)
• Steepest descent • Conjugate gradient • Newton’s Method • Quasi-Newton • SQP • Compass-Search • Nelder-Mead Simplex
• SA • GA • Tabu • PSO • EGO
![Page 28: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/28.jpg)
Example: Operational Design Space • Objectives
– Time to Climb, Fuel Burn, Noise, Operating Cost
• Parameters – Flap setting – Throttle setting– Velocity – Transition Altitude – Climb gradient* – 18 Total
• Constraints: – Regulations
• No pilot input below 684 ft • Initial climb at V2+15kts
– Flap settings– Velocity
• Min: stall • Max: max q
– Throttle • Min: engine idle or positive rate
of climb• Max: full power
Velocity known
FlFlap setting known
![Page 29: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/29.jpg)
Example:Design Space Exploration Methods
• Exploration Challenges – Islands of feasibility – Many local minima – Mixed discrete/continuous variables – Many design variable scales (10-1Æ104) – Long function evaluation time (~2 minutes with noise)
• Sequential Quadratic Programming [Climb time: 312 s] – Stuck at local minima – Can’t handle discrete integers
• Direct Search (Nelder-Mead) [Climb time: 319 s] – Similar problems as SQP, but worse results
• Particle Swarming Optimization [Climb time: 319 s] – Slow running (8-12 hours), optimum not as good as Genetic Algorithm
• Genetic Algorithm [Climb time: 308 s] – No issues with any of the challenges of this problem. – No convergence guarantee and SLOW! Run-time ~24 hours. – But, best result.
![Page 30: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/30.jpg)
Summary
• You have a large algorithm toolbox.• You can often tell by inspection what
algorithm might work well. • Always take advantage of aspects of
your problem that will speed convergence.
![Page 31: Multidisciplinary System Multidisciplinary System … Methods • Gradient Based: – Steepest descent – Conjugate Gradient – Newton’s Method – Quasi-Newton • Direct Search:](https://reader031.fdocuments.us/reader031/viewer/2022022013/5b2a49b17f8b9ae1288ba2b3/html5/thumbnails/31.jpg)
MIT OpenCourseWarehttp://ocw.mit.edu
ESD.77 / 16.888 Multidisciplinary System Design Optimization Spring 2010
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.