Optimization in Python
Kevin Carlberg (Sandia National Laboratories)
August 13, 2019
Optimization tools in Python
We will go over and use two tools:
1. scipy.optimize
2. CVXPY
See quadratic_minimization.ipynb
- User inputs are defined in the second cell
- Enables exploration of how problem attributes affect optimization-solver performance
scipy.optimize
Outline
scipy.optimize
CVXPY
Example: quadratic_minimization.ipynb
scipy.optimize
scipy.optimize: sub-package of SciPy, an open-source Python library for scientific computing
- Analogous to Matlab's optimization toolbox
- Capabilities:
  - Optimization
    - Local optimization
    - Equation minimizers
    - Global optimization
  - Fitting (nonlinear least squares)
  - Root finding
  - Linear programming
  - Utilities (e.g., check_grad for verifying analytic gradients)
scipy.optimize interface
Requires the user to define a function in Python
- Can be a black box: no closed-form mathematical expression needed!
- Only the function value f(x) is required
- Can optionally provide the gradient ∇f(x) and Hessian ∇²f(x)
- Example: evaluating f constitutes a run of a complicated simulation code
- Drawback: cannot exploit special structure underlying f
[Diagram: scipy.optimize passes a candidate point x to the black-box function, which returns f(x), ∇f(x), ∇²f(x)]
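A minimal sketch of this interface (the quadratic f below is a hypothetical stand-in for an expensive black-box simulation):

```python
import numpy as np
from scipy import optimize

# Stand-in for a black-box function: the solver only ever calls f(x)
def f(x):
    return (x[0] - 1.0)**2 + 10.0 * (x[1] + 2.0)**2

# No gradient supplied: BFGS approximates one with finite differences
res = optimize.minimize(f, x0=np.zeros(2), method='BFGS')
print(res.x)  # close to the true minimizer (1, -2)
```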
scipy.optimize: local optimization algorithms
Unconstrained minimization
- Derivative-free: no gradient or Hessian
  - Nelder-Mead: simplex method
  - Powell: sequential minimization along each vector in a direction set
- Gradient-based: gradient only (no Hessian)
  - CG: nonlinear conjugate gradient
  - BFGS: quasi-Newton BFGS method
- Gradient-based: gradient and Hessian can be specified
  - Newton-CG: approximately solves the Newton system using CG (truncated Newton method)
  - dogleg: dog-leg trust-region algorithm; Hessian must be SPD
  - trust-ncg: Newton conjugate gradient trust-region method
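As an illustrative sketch (toy quadratic, not from the slides), these methods are selected via the method argument of scipy.optimize.minimize; Newton-CG additionally requires a gradient:

```python
import numpy as np
from scipy import optimize

def f(x):
    return (x[0] - 1.0)**2 + 4.0 * (x[1] - 2.0)**2

def grad(x):
    return np.array([2.0 * (x[0] - 1.0), 8.0 * (x[1] - 2.0)])

def hess(x):
    return np.diag([2.0, 8.0])

x0 = np.zeros(2)
results = {}
for method, kwargs in [('Nelder-Mead', {}),                          # derivative free
                       ('CG', {'jac': grad}),                        # gradient only
                       ('BFGS', {'jac': grad}),                      # gradient only
                       ('Newton-CG', {'jac': grad, 'hess': hess})]:  # gradient + Hessian
    results[method] = optimize.minimize(f, x0, method=method, **kwargs)

for name, res in results.items():
    print(name, res.x, 'nfev =', res.nfev)
```

All four converge to (1, 2) here; the function-evaluation counts hint at the efficiency differences explored later.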
scipy.optimize: local optimization algorithms
Constrained minimization (all are gradient-based)
- Only bound constraints
  - L-BFGS-B: limited-memory BFGS for bound-constrained optimization
  - TNC: truncated Newton; allows upper and lower bounds
- General constraints
  - COBYLA: Constrained Optimization BY Linear Approximation
  - SLSQP: Sequential Least SQuares Programming
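A sketch of both constraint styles (toy objective, not from the slides):

```python
import numpy as np
from scipy import optimize

def f(x):
    return (x[0] - 2.0)**2 + (x[1] - 2.0)**2

# Bound constraints only: L-BFGS-B takes a list of (lower, upper) pairs
res_bound = optimize.minimize(f, [0.0, 0.0], method='L-BFGS-B',
                              bounds=[(0.0, 1.0), (0.0, 1.0)])
print(res_bound.x)  # approximately (1, 1): the box corner closest to (2, 2)

# General inequality x0 + x1 <= 1: SLSQP expects dictionaries with g(x) >= 0
cons = [{'type': 'ineq', 'fun': lambda x: 1.0 - x[0] - x[1]}]
res_gen = optimize.minimize(f, [0.0, 0.0], method='SLSQP', constraints=cons)
print(res_gen.x)  # approximately (0.5, 0.5): projection of (2, 2) onto the half-plane
```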
scipy.optimize: global optimization algorithms
Global optimization (all are derivative-free)
- basinhopping: stochastic algorithm by Wales and Doye
  - Useful when the function has many minima separated by large barriers
- brute: brute-force minimization over a specified range
- differential_evolution: an evolutionary algorithm
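For instance, differential_evolution needs only bounds on the variables; the multimodal test function below is illustrative:

```python
import numpy as np
from scipy.optimize import differential_evolution

# Multimodal test function: quadratic bowl plus sinusoidal ripples,
# global minimum at x = (1, 1) surrounded by many local minima
def f(x):
    x = np.asarray(x)
    return np.sum((x - 1.0)**2) + 2.0 * np.sum(1.0 - np.cos(2.0 * np.pi * (x - 1.0)))

# No gradient and no initial guess -- only a search range per variable
res = differential_evolution(f, bounds=[(-4.0, 4.0)] * 2, seed=0)
print(res.x, 'nfev =', res.nfev)  # near (1, 1); note the large evaluation count
```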
CVXPY
Outline
scipy.optimize
CVXPY
Example: quadratic_minimization.ipynb
Modeling languages for convex optimization
- High-level language support for convex optimization has been developed recently:
  1. Describe the problem in a high-level language
  2. The description is automatically transformed to standard form
  3. The problem is solved by a standard solver and transformed back
- Implementations:
  - YALMIP, CVX (Matlab)
  - CVXPY (Python)
  - Convex.jl (Julia)
- Benefits:
  - Easy to perform rapid prototyping
  - Can exploit special structure, because the full mathematical description is available
  - Lets users focus on what their model should be instead of how to solve it
  - No algorithm tuning or babysitting
- Drawbacks:
  - Won't work if your problem isn't convex
  - Needs explicit mathematical formulas for the objective and constraints
  - Thus, it cannot handle black-box functions
CVXPY
- CVXPY: "a Python-embedded modeling language for convex optimization problems. It allows you to express your problem in a natural way that follows the math, rather than in the restrictive standard form required by solvers."

    import cvxpy as cvx
    x = cvx.Variable(n)
    cost = cvx.sum_squares(A @ x - b) + gamma * cvx.norm(x, 1)  # explicit formula!
    prob = cvx.Problem(cvx.Minimize(cost), [cvx.norm(x, "inf") <= 1])
    opt_val = prob.solve()
    solution = x.value

- The solve method converts the problem to standard form, solves it, returns the optimal value, and populates the solution attributes (e.g., x.value)
CVXPY usage
- cvxpy.Problem: optimization problem
- cvxpy.Variable: optimization variable
- cvxpy.Minimize: minimization objective
- cvxpy.Parameter: symbolic representation of a constant
  - Can change the value of a constant without reconstructing the entire problem
  - Can be enforced to be positive or negative on construction
- Constraints are simply Python lists
- Many functions implemented: see the cvxpy.org website for a list
Complete CVXPY example

    import cvxpy as cvx
    # Create two scalar optimization variables (CVXPY Variable)
    x = cvx.Variable()
    y = cvx.Variable()
    # Create two constraints (Python list)
    constraints = [x + y == 1, x - y >= 1]
    # Form objective
    obj = cvx.Minimize(cvx.square(x - y))
    # Form and solve problem
    prob = cvx.Problem(obj, constraints)
    prob.solve()  # Returns the optimal value.
    print("status:", prob.status)
    print("optimal value", prob.value)
    print("optimal var", x.value, y.value)
Ensuring convexity
- CVXPY must somehow ensure the written optimization problem is convex. How?
- Disciplined convex programming (DCP)
  - Defines conventions that ensure an optimization problem is convex
  - Example: the positive sum of two convex functions is convex
  - These rules are sufficient (but not necessary) for convexity
- Usage in CVXPY
  - DCP assesses the sign and curvature of CVXPY expressions (built from cvxpy.Variable and cvxpy.Parameter):
    - x.sign: returns the sign of x
    - x.curvature: returns the curvature of x
Example: quadratic_minimization.ipynb
Outline
scipy.optimize
CVXPY
Example: quadratic_minimization.ipynb
Exploring minimization methods
- Consider minimizing the quadratic function

  f(x) = ∑_{i=1}^{n} a_i (x_i − 1)²

- Properties: convex, smooth, minimum at x⋆ = (1, …, 1)
- Let's compare method performance for:
  1. Well-conditioned (narrow distribution of a_i) vs. ill-conditioned (wide distribution of a_i)
  2. Low-dimensional (n small) vs. high-dimensional (n large)
scipy.opt function implementation
- Must define the function, and optionally the gradient and Hessian:

    import numpy as np
    import scipy.optimize as opt

    def fun(x):
        return 0.5*sum(np.multiply(quadratic_coeff,
            np.square(np.array(x) - np.ones(np.array(x).size))))
    def fun_grad(x):
        return np.array(np.multiply(quadratic_coeff,
            np.array(x) - np.ones(np.array(x).size)))
    def fun_hess(x):
        return np.diag(quadratic_coeff)

- To solve, define an initial guess x0 and invoke a solver with the functions as arguments:

    res = opt.minimize(fun, x0, method='newton-cg', jac=fun_grad, hess=fun_hess)
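An analytic gradient like the one above can be verified against finite differences with scipy.optimize.check_grad (the coefficients below are illustrative):

```python
import numpy as np
from scipy.optimize import check_grad

quadratic_coeff = np.array([1.0, 3.0, 5.0])  # example a_i values

def fun(x):
    x = np.asarray(x)
    return 0.5 * np.sum(quadratic_coeff * (x - 1.0)**2)

def fun_grad(x):
    x = np.asarray(x)
    return quadratic_coeff * (x - 1.0)

# check_grad returns the 2-norm of (analytic - finite-difference) gradient
err = check_grad(fun, fun_grad, np.array([0.3, -0.7, 2.0]))
print(err)  # small: on the order of the finite-difference truncation error
```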
CVXPY setup
Assume we have already specified:
- dimension (int): number of optimization variables n
- quadratic_coeff (numpy.ndarray): array of a_i

    import cvxpy as cvx
    x = cvx.Variable(dimension)
    quadratic_coeff_cvx = cvx.Parameter(dimension, sign='positive')
    quadratic_coeff_cvx.value = quadratic_coeff
    obj = cvx.Minimize(0.5 * quadratic_coeff_cvx.T * cvx.square(x - 1))
    prob = cvx.Problem(obj)
    prob.solve()

(In CVXPY ≥ 1.0 the sign is declared with nonneg=True instead of sign='positive'.)
- Note that the objective has to be explicitly coded in the CVXPY objective
- Cannot use black-box functions!
Method comparison
We will compare:
- Global, no gradients
  - differential_evolution
  - Best performance: non-convex, low-dimensional. Noise okay!
- Local, no gradients:
  - Nelder-Mead
  - CG with finite-difference Jacobian approximations (CGfd)
  - Best performance: well-conditioned, noise-free, low-dimensional
- Local, gradients:
  - CG
  - Best performance: well-conditioned, noise-free. High dimensions okay!
- Local, gradients and Hessians:
  - newton-cg
  - CVXPY (requires convexity)
  - Best performance: noise-free. Ill-conditioning, high dimensions okay!
Low-dimensional, well-conditioned
- Low dimension: n = 2 optimization variables
- Well-conditioned: a_i = 1, i = 1, …, n
[Figure: surface plot of the convex quadratic objective over (x1, x2) ∈ [−2, 4]²]
- This is the easiest case of all!
Low-dimensional, well-conditioned
[Figure: contour plots of the objective showing the iterate path from the initial guess to the minimum for each of Nelder-Mead, CG, CGfd, and newton-cg]
Low-dimensional, well-conditioned
[Figure: bar charts comparing CG, CGfd, newton-cg, Nelder-Mead, diff. evol., and CVXPY: relative distance to the solution x⋆ (roughly 10⁻¹⁴ to 10⁻⁶) and number of function evaluations (roughly 10¹ to 10³)]
- All methods find the minimum (computed solution close to x⋆ = (1, 1))
- Derivative-free methods (Nelder-Mead and differential evolution) are very inefficient!
- CG is more expensive when finite-difference gradient approximations are used
Low-dimensional, poorly conditioned
- Low dimension: n = 2 optimization variables
- Poorly conditioned: the a_i have a large spread (a1 = 1.2 × 10⁴, a2 = 1)
[Figure: surface plot of the poorly conditioned quadratic objective over (x1, x2)]
- The slope is much larger in one direction than in the other
- Hard to minimize in direction x1 using only the gradient
- The Hessian can help in this case!
Low-dimensional, poorly conditioned
[Figure: contour plots of the poorly conditioned objective showing the iterate path for each of Nelder-Mead, CG, CGfd, and newton-cg]
Low-dimensional, poorly conditioned
[Figure: bar charts comparing CG, CGfd, newton-cg, Nelder-Mead, diff. evol., and CVXPY: relative distance to the solution x⋆ (roughly 10⁻¹³ to 10⁻¹) and number of function evaluations (roughly 10¹ to 10³)]
- All methods do a fairly good job of finding the minimum
- newton-cg and CVXPY do the best by far (both use Hessian information)
  - Hessian information helps "cure" ill-conditioning!
- Derivative-free methods (Nelder-Mead and differential evolution) are very inefficient
High-dimensional, poorly conditioned
- High(er) dimension: n = 100 optimization variables (not truly high-dimensional)
- Poorly conditioned: the a_i have a large spread (max_i a_i / min_i a_i = 3.6 × 10⁸)
[Figure: surface plot of the objective over (x1, x2); function values are very small in these two coordinates]
- Higher dimensions pose significant challenges to gradient-free methods
High-dimensional, poorly conditioned
[Figure: contour plots in the (x1, x2) plane showing the iterate path for each of Nelder-Mead, CG, CGfd, and newton-cg]
High-dimensional, poorly conditioned
[Figure: bar charts comparing CG, CGfd, newton-cg, Nelder-Mead, diff. evol., and CVXPY: relative distance to the solution x⋆ (roughly 10⁻³ to 10¹) and number of function evaluations (roughly 10¹ to 10⁶)]
- Nelder-Mead fails to find the minimum in 10,000 function evaluations
- Differential evolution finds the minimum, but incurs > 10⁶ function calls!
- CG with finite-difference gradients is very expensive (n + 1 function calls per gradient)
- newton-cg and CVXPY do extremely well (both use Hessian information)
Lessons
- Gradient information helps "cure" high dimensionality
  - Gradients enable a good direction to be found in a high-dimensional space
  - Without gradients, many function evaluations are needed to explore the space
  - Finite-difference approximations of the Jacobian become expensive in high dimensions (require n + 1 function evaluations)
- Hessian information helps "cure" ill-conditioning!
  - Hessians inform the optimizer of curvature; thus the optimizer deals with ill-conditioning directly
  - Ill-conditioned Hessians can still pose numerical problems
Let’s add noise
- Let's add sinusoidal noise to the function:

  f(x) = ∑_{i=1}^{n} a_i (x_i − 1)² + b · [n − ∑_{i=1}^{n} cos(2π(x_i − 1))]

- b controls the amount of additional noise
- For b > 0, the function is no longer convex!
  - Many local minima
  - Local methods may not find the global minimum!
  - CVXPY is not applicable
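A minimal numpy sketch of this noisy objective (the a and b values below are illustrative):

```python
import numpy as np

a = np.ones(2)  # well-conditioned quadratic coefficients
b = 2.0         # noise amplitude

def f_noisy(x):
    x = np.asarray(x, dtype=float)
    n = x.size
    return np.sum(a * (x - 1.0)**2) + b * (n - np.sum(np.cos(2.0*np.pi*(x - 1.0))))

# The global minimum stays at x = (1, 1) with value 0 ...
print(f_noisy([1.0, 1.0]))  # 0.0
# ... but points near integer shifts become local minima that can trap local methods
print(f_noisy([2.0, 1.0]))  # 1.0 (cosine term vanishes, quadratic term remains)
```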
Low-dimensional, well-conditioned, noisy
- Low dimension: n = 2 optimization variables
- Well-conditioned: a_i = 1, i = 1, …, n
[Figure: surface plot of the noisy objective over (x1, x2) ∈ [−2, 4]², showing sinusoidal ripples on the quadratic bowl]
- Many local minima in which to get "trapped"
Low-dimensional, well-conditioned, noisy
[Figure: contour plots of the noisy objective showing the iterate path for each of Nelder-Mead, CG, CGfd, and newton-cg]
Low-dimensional, well-conditioned, noisy
[Figure: bar charts comparing CG, CGfd, newton-cg, Nelder-Mead, and diff. evol.: relative distance to the solution x⋆ (roughly 10⁻⁸ to 10⁰) and number of function evaluations (roughly 10¹ to 10³)]
- All local methods get trapped in a local minimum
- CVXPY cannot be used
- Differential evolution finds the closest solution
  - However, it requires over a thousand function evaluations!
High-dimensional (n = 100), well-conditioned, noisy
[Figure: bar charts comparing CG, CGfd, newton-cg, Nelder-Mead, and diff. evol.: relative distance to the solution x⋆ (roughly 5 × 10⁻¹ to 10⁰) and number of function evaluations (roughly 10¹ to 10⁶)]
- All local methods get trapped in a local minimum (again)
- CVXPY cannot be used (again)
- Differential evolution comes closest to finding the solution
  - However, it requires over one million function evaluations!
Lessons
Noise can make optimization very difficult!
- Makes the problem non-convex, with many local minima
- Local methods get trapped in a local minimum
- Global methods are needed, but these perform poorly in high dimensions
- Tools like CVXPY cannot be used
- Lesson: avoid noisy functions by any means possible (e.g., smoothing, convexification)
Recap
- Global, no gradients
  - differential_evolution
  - Best performance: non-convex, low-dimensional. Noise okay!
- Local, no gradients:
  - Nelder-Mead
  - CG with finite-difference Jacobian approximations (CGfd)
  - Best performance: well-conditioned, noise-free, low-dimensional
- Local, gradients:
  - CG
  - Best performance: well-conditioned, noise-free. High dimensions okay!
- Local, gradients and Hessians:
  - newton-cg
  - CVXPY (requires convexity)
  - Best performance: noise-free. Ill-conditioning, high dimensions okay!