
Nonlinear Optimization Theory and Practice

by

Asim Karim Computer Science Dept.

Lahore University of Management Sciences

2nd International Bhurban Conference on Applied Sciences and Technology

Control and Simulation (June 19 – 21, 2003)


Optimization

What is optimization? Finding solution(s) from a set of admissible or feasible solutions that minimizes (or maximizes) a performance measure or objective

Examples
Engineering design: find the cross-sectional dimensions of a beam that result in the least-weight structure

Resource management: find the optimal distribution of resources to accomplish a task in least time

Machine control: find the policy for injecting fuel that leads to the least fuel consumption

Traveling salesperson problem: find the path through a given set of locations that has the shortest distance

Optimization is a very powerful concept. Many problems in different fields can be posed as optimization problems.


Types of Optimizations

Two basic classes of optimization problems
Static: decision variables do not vary over time
Dynamic: decision variables vary over time, and optimal solutions are time-paths or trajectories rather than single values

Other classifications
Linear and nonlinear: if any nonlinearity exists in the problem, then it is a nonlinear optimization problem; otherwise, it is a linear optimization problem

Unconstrained and constrained: if the variables are unrestricted, then it is an unconstrained optimization problem; otherwise, it is a constrained optimization problem


Nonlinear Optimization

Nonlinear optimization theory includes as a special case the linear optimization problem. The basic concepts of static optimization are similar to those of dynamic optimization.

Solution methods
Mathematical: these methods are based on calculus and geometry. Collectively, these methods are known as nonlinear programming techniques.

Heuristic: these methods are based on search heuristics. Examples include genetic algorithms and simulated annealing.

We will be focusing on nonlinear programming, that is, mathematical methods for solving static nonlinear optimization problems.


Optimization Theory

Two questions
Existence: do local/global minima exist?
Optimality conditions: what are the properties or characteristics of local/global minima?

Does f(x) = x have a local minimum? What about f(x) = exp(x)? We won't be focusing on the existence of optimal solutions. Optimality conditions are used in algorithms for solving optimization problems.


Criteria for Characterizing Solution Methods

Rate of convergence
Stability of convergence
Search for minima (local or global?)
Computational efficiency and scalability
Memory usage and scalability
Other requirements (continuous differentiability, twice continuous differentiability, etc.)


Unconstrained Optimization

Definition
Minimize f(x) subject to x ∈ R^n, where R^n is the n-dimensional space of real numbers (Euclidean space).

Local and Global Minimum
A vector x* is a local minimum of f if there exists ε > 0 such that f(x*) ≤ f(x) for all x with ‖x − x*‖ < ε.
A vector x* is a global minimum if f(x*) ≤ f(x) for all x ∈ R^n.


Local and Global Minima


Optimality Conditions

Assuming f is continuously differentiable (twice continuously differentiable for the second-order conditions):

Necessary conditions
∇f(x*) = 0 (first-order necessary condition)
∇²f(x*) ≥ 0, i.e. positive semi-definite (second-order necessary condition)

Sufficient condition
∇f(x*) = 0 and ∇²f(x*) > 0, i.e. positive definite

Special cases
Convex or quadratic function: the first-order necessary condition is also sufficient. Moreover, the stationary point is a global minimum.


Optimality Conditions - Example


Gradient Methods (1)

Gradient methods
Method of steepest descent (and its variations)
Newton's method (and its variations)
Quasi-Newton methods

Basic strategy and equations
These methods involve iterative descent such that f(x^{k+1}) < f(x^k), k = 1, 2, …
Update rule: x^{k+1} = x^k + α^k d^k, where the direction d^k satisfies ∇f(x^k)^T d^k < 0


Gradient Methods (2)

The different methods vary in their choice of d^k. The stepsize α^k is determined by a line search technique.


Method of Steepest Descent (1)

Direction vector: d^k = −∇f(x^k)

This method is often slow to converge.
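
A minimal sketch of the steepest-descent iteration follows; the example objective, starting point, constant stepsize, and tolerance are illustrative assumptions, not from the slides.

% Minimal steepest-descent sketch; the objective, starting point, constant
% stepsize and tolerance are illustrative assumptions.
f     = @(x) x(1)^2 + 10*x(2)^2;
gradf = @(x) [2*x(1); 20*x(2)];

x     = [5; 1];              % starting point x^0
alpha = 0.05;                % constant stepsize (a line search would be used in practice)
for k = 1:1000
    d = -gradf(x);           % steepest-descent direction d^k = -grad f(x^k)
    if norm(d) < 1e-6, break; end
    x = x + alpha*d;         % update x^{k+1} = x^k + alpha^k d^k
end
% The slow decay of x(1) toward zero illustrates the slow convergence noted above.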


Method of Steepest Descent (2)

Scaled steepest descent: d^k = −D^k ∇f(x^k)

where D^k is a diagonal matrix used to scale the gradient vector. Usually, diagonal element i of D^k is computed as the inverse of the second-order partial derivative of f with respect to x_i (an approximation to Newton's method). This method converges faster than the method of steepest descent.


Newton’s Method (1)

Direction vector: d^k = −[∇²f(x^k)]^{−1} ∇f(x^k)

assuming the Hessian is positive definite.

When α^k = 1, this is known as the pure Newton's method. However, the pure method has some major drawbacks (can you identify some?)

Faster convergence (see figure on next slide), but computationally expensive (Hessian computation)

A variation to reduce computational complexity: to reduce the computational expense of the Hessian, the modified Newton's method computes the Hessian every p > 1 iterations, instead of every iteration.
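
A minimal sketch of the pure Newton iteration follows; the strictly convex example objective and starting point are illustrative assumptions, not from the slides.

% Minimal sketch of the pure Newton iteration (alpha^k = 1); the strictly convex
% example objective and starting point are illustrative assumptions.
f     = @(x) exp(x(1) + x(2)) + x(1)^2 + 2*x(2)^2;
gradf = @(x) [exp(x(1)+x(2)) + 2*x(1); exp(x(1)+x(2)) + 4*x(2)];
hessf = @(x) [exp(x(1)+x(2)) + 2, exp(x(1)+x(2)); exp(x(1)+x(2)), exp(x(1)+x(2)) + 4];

x = [1; 1];
for k = 1:20
    g = gradf(x);
    if norm(g) < 1e-10, break; end
    d = -hessf(x) \ g;       % Newton direction: solve [hess f(x)] d = -grad f(x)
    x = x + d;               % pure Newton step (alpha = 1)
end
% Forming and solving with the Hessian each iteration is the expensive part that
% the modified Newton variant recomputes only every p iterations.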


Newton’s Method (2)

To ensure global convergence (a drawback of the pure method), use the steepest descent direction vector whenever the Hessian is not positive definite or is undefined.


Quasi-Newton Methods

Direction vector: d^k = −D^k ∇f(x^k)

where D^k is a positive definite matrix selected such that it approximates the Newton direction. A popular way to compute D^k is

D^{k+1} = D^k + (p^k (p^k)^T)/((p^k)^T q^k) − (D^k q^k (q^k)^T D^k)/((q^k)^T D^k q^k) + ξ^k τ^k v^k (v^k)^T

where

v^k = p^k/((p^k)^T q^k) − D^k q^k/τ^k;  τ^k = (q^k)^T D^k q^k;  0 ≤ ξ^k ≤ 1;
p^k = x^{k+1} − x^k;  q^k = ∇f(x^{k+1}) − ∇f(x^k)

When ξ^k = 0 (for all k), this is known as the DFP method.
When ξ^k = 1 (for all k), this is known as the BFGS method (popular).
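
A minimal sketch of this update on an assumed quadratic example follows; the matrix A, vector b, starting point, and the use of an exact line search (possible only because the objective is quadratic) are illustrative assumptions.

% Minimal quasi-Newton sketch using the update formula above.
% The quadratic example f = 0.5 x'Ax - b'x, starting point, and exact line
% search (possible because f is quadratic) are illustrative assumptions.
A = [3 1; 1 2];  b = [1; 1];
gradf = @(x) A*x - b;

x  = [0; 0];
D  = eye(2);                               % initial inverse-Hessian estimate D^0
xi = 1;                                    % 1 = BFGS update, 0 = DFP update
for k = 1:20
    g = gradf(x);
    if norm(g) < 1e-10, break; end
    d     = -D*g;                          % quasi-Newton direction d^k = -D^k grad f(x^k)
    alpha = -(g.'*d)/(d.'*A*d);            % exact line minimization along d (quadratic f)
    xnew  = x + alpha*d;
    p = xnew - x;                          % p^k = x^{k+1} - x^k
    q = gradf(xnew) - gradf(x);            % q^k = grad f(x^{k+1}) - grad f(x^k)
    tau = q.'*D*q;
    v   = p/(p.'*q) - D*q/tau;
    D   = D + (p*p.')/(p.'*q) - (D*q)*(q.'*D)/tau + xi*tau*(v*v.');
    x   = xnew;
end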


Conjugate Gradient Method

Iterative improvement: x^{k+1} = x^k + α^k d^k

where the direction vectors d^k (k = 0, 1, …) are Q-conjugate (Q is a positive definite matrix).

The directions d^k are computed by the Gram-Schmidt procedure:

d^0 = −∇f(x^0)
d^k = −∇f(x^k) + [∇f(x^k)^T ∇f(x^k) / (∇f(x^{k−1})^T ∇f(x^{k−1}))] d^{k−1}

The conjugate gradient method and its variations are popular approaches for unconstrained optimization and for solving linear systems of equations.
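
A minimal sketch of the conjugate gradient iteration for the quadratic/linear-system case follows (minimizing f(x) = 0.5 x'Ax − b'x, i.e. solving Ax = b); the SPD matrix, right-hand side, and tolerance are illustrative assumptions.

% Minimal conjugate gradient sketch for solving A x = b with A symmetric
% positive definite; A, b and the tolerance are illustrative assumptions.
A = [4 1; 1 3];  b = [1; 2];

x = [0; 0];
r = b - A*x;                 % residual, equal to -grad f(x) for the quadratic
d = r;                       % first direction: steepest descent
for k = 1:numel(b)           % at most n iterations for an n-by-n SPD system
    alpha = (r.'*r)/(d.'*A*d);
    x     = x + alpha*d;
    rnew  = r - alpha*A*d;
    if norm(rnew) < 1e-12, break; end
    beta = (rnew.'*rnew)/(r.'*r);   % coefficient from the direction formula above
    d    = rnew + beta*d;           % new direction, Q-conjugate to the previous ones
    r    = rnew;
end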


Stepsize Selection Methods

Importance
In practice, the choice of the stepsize α^k significantly affects the rate of convergence, stability and computational efficiency of iterative direction methods.

If α^k is too small, convergence may be very slow.
If α^k is too large, convergence may not be smooth (divergence).
Exact computation can be expensive.

Common methods
Constant stepsize
Line minimization
Armijo rule
Goldstein rule


Line Minimization

The stepsize α^k is such that it minimizes f along d^k, that is,

f(x^k + α^k d^k) = min_{α} f(x^k + α d^k)

Usually α ∈ [0, s], where s > 0, to reduce computation (a method known as limited line minimization). The bisection or Newton-Raphson methods are used for this minimization (these are line or 1-D optimization algorithms).

Disadvantage
It is computationally expensive, requiring the solution of a sub-optimization problem in each iteration.


Armijo Rule

The stepsize α^k = β^{m_k} s is determined by a successive reduction process, where m_k is the first non-negative integer m for which

f(x^k) − f(x^k + β^m s d^k) ≥ −σ β^m s ∇f(x^k)^T d^k

Procedure
Select a value for s, with β ∈ (0, 1) and σ ∈ (0, 1)
Set m = 0
Evaluate the inequality. If it is satisfied, α^k = β^m s; otherwise, increment m and repeat the evaluation

Usually σ is chosen close to zero and β is between 1/2 and 1/10. If d^k is scaled, then s = 1 is an appropriate choice.
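
A minimal sketch of this successive-reduction loop follows; the objective (the banana function used later in the slides), the current point, and the parameter choices are illustrative assumptions.

% Minimal Armijo-rule sketch; the objective, current point and parameter
% choices are illustrative assumptions.
f     = @(x) 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
gradf = @(x) [2*(x(1)-1) - 400*x(1)*(x(2)-x(1)^2); 200*(x(2)-x(1)^2)];

x = [-1.2; 1];
g = gradf(x);
d = -g;                                    % steepest-descent direction
s = 1;  beta = 0.5;  sigma = 1e-4;         % sigma near zero; beta in [1/10, 1/2]
m = 0;
while m < 60 && f(x) - f(x + beta^m*s*d) < -sigma*beta^m*s*(g.'*d)
    m = m + 1;                             % reduce the trial stepsize
end
alpha = beta^m * s;                        % accepted stepsize alpha^k = beta^(m_k) s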


Comparison of Methods

Steepest descent: slow convergence; computationally less expensive; needs once-differentiability; suitable for less complex problems; small-scale problems; hard to parallelize

Newton's: fastest convergence; high cost; needs twice-differentiability; suitable for well-defined problems; small to medium scale; hardest to parallelize

Quasi-Newton: fast convergence; high/moderate cost; needs once-differentiability; suitable for complex problems; medium scale; easier to parallelize

CG: fast convergence; moderate cost; needs once-differentiability; suitable for complex problems; medium and large scale; easier to parallelize


Constrained Optimization

Definition
Minimize f(x) subject to x ∈ C ⊂ R^n
C is the constraint set, a subset of the n-dimensional space of real numbers, defined by
h_i(x) = 0, i = 1, …, I (equality constraints)
g_j(x) ≤ 0, j = 1, …, J (inequality constraints)

Local and Global Minimum
A vector x* ∈ C is a local minimum of f over C if there exists ε > 0 such that f(x*) ≤ f(x) for all x ∈ C with ‖x − x*‖ < ε.
A vector x* ∈ C is a global minimum if f(x*) ≤ f(x) for all x ∈ C.


Optimality Conditions (1)

Assuming f is continuously differentiable:

Necessary condition
∇f(x*)^T (x − x*) ≥ 0 for all x ∈ C

If f is convex over C, then the above is also sufficient for optimality.

If f and constraint set C are convex then local minimum x* is also a global minimum.

The solution methods based on these optimality conditions are similar to those for unconstrained problems (feasible directions methods).


Geometric Interpretation


Optimality Conditions (2)

Karush-Kuhn-Tucker Necessary Condition
If x* ∈ C is a local minimum of f over C, then there exist Lagrange multiplier vectors λ* = (λ_1*, …, λ_I*) and μ* = (μ_1*, …, μ_J*) such that

∇_x L(x*, λ*, μ*) = 0

where μ_j* ≥ 0, j = 1, …, J, and μ_j* = 0 for every j whose constraint is inactive at x*.

Lagrangian function
L(x, λ, μ) = f(x) + Σ_{i=1}^{I} λ_i h_i(x) + Σ_{j=1}^{J} μ_j g_j(x)


Example

Minimize f(x) = x_1 + x_2
Subject to: x_1^2 + x_2^2 = 2

Lagrangian
L(x, λ) = f(x) + λ h(x)

At the local optimum x*, the KKT condition must be satisfied:
∇f(x*) + λ ∇h(x*) = 0
What is the value of λ?
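
One way to work out the answer (not shown on the slide): here ∇f(x) = (1, 1) and ∇h(x) = (2x_1, 2x_2), so the condition gives 1 + 2λx_1* = 0 and 1 + 2λx_2* = 0, hence x_1* = x_2* = −1/(2λ). Substituting into the constraint x_1^2 + x_2^2 = 2 gives 1/(2λ^2) = 2, so λ = ±1/2; the minimizer is x* = (−1, −1) with λ = 1/2, while λ = −1/2 corresponds to the maximizer (1, 1).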


Barrier and Interior Point Methods

Constrained problem is converted to a sequence of unconstrained problems which involve an added high cost for approaching the boundary of the feasible region.

Barrier and interior point methods are used for inequality constrained problems.

Minimize F_B(x) = f(x) + B(x)

Barrier function
B(x) = −Σ_{j=1}^{J} ln{−g_j(x)} (logarithmic)
B(x) = −Σ_{j=1}^{J} 1/g_j(x) (inverse)


Barrier Method – Geometrical Interpretation


Penalty Method

Constrained problem is converted to a sequence of unconstrained problems which involve an added high cost for infeasibility.

A penalty parameter or function is used to penalize violation of constraints.

F_P(x) = f(x) + (r_n/2) [ Σ_{i=1}^{I} h_i(x)^2 + Σ_{j=1}^{J} (g_j^+(x))^2 ]

where g_j^+(x) = max{0, g_j(x)} and r_n is a penalty parameter

These approaches are also known as SUMT, sequential unconstrained minimization technique (any unconstrained algorithm may be used)

Often the penalized function is an augmented Lagrangian function to improve convergence.


Optimal Control (1)

Optimal control problems are dynamic optimization problems

Definition (discrete-time optimal control)

Minimize J(u) = g_N(x_N) + Σ_{i=0}^{N−1} g_i(x_i, u_i)

Subject to
x_{i+1} = f_i(x_i, u_i), i = 0, …, N−1 (system equation)
x_i ∈ X_i ⊂ R^n, i = 1, …, N (state vector forming a trajectory)
u_i ∈ U_i ⊂ R^m, i = 0, …, N−1 (control vector forming a trajectory)
x_0: given

The system equation uniquely specifies the state trajectory that corresponds to a given control trajectory.


Optimal Control (2)

Given a control trajectory u = (u_0, u_1, …, u_{N−1}), the state trajectory is uniquely determined by the system equations f_i. Equivalently, we can write

x_i = φ_i(u), i = 1, 2, …, N, where φ_i is determined from the f_i.

Simplified definition

Minimize J(u) = g_N(φ_N(u)) + Σ_{i=0}^{N−1} g_i(φ_i(u), u_i)

Subject to
x_i = φ_i(u) ∈ X_i ⊂ R^n, i = 1, …, N
u_i ∈ U_i ⊂ R^m, i = 0, …, N−1
x_0: given

The optimal solution u* = (u_0*, u_1*, …, u_{N−1}*) can be found by any of the nonlinear optimization methods.
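
A minimal sketch (not from the slides) of how a control trajectory can be evaluated and then optimized with a general-purpose unconstrained method; the scalar system, the quadratic stage and terminal costs, the horizon N, and the initial state are illustrative assumptions.

% Minimal sketch: evaluate J(u) by rolling the controls through the system
% equation, then minimize over u with fminunc. The scalar system x_{i+1} = x_i + u_i,
% the quadratic costs, horizon N and initial state x0 are assumptions.
N  = 10;
x0 = 5;
J  = @(u) rollout_cost(u, x0, N);
u  = fminunc(J, zeros(N,1), optimset('Display', 'off'));   % optimal control trajectory

where ROLLOUT_COST is, e.g.:

function J = rollout_cost(u, x0, N)
J = 0;  x = x0;
for i = 1:N
    J = J + x^2 + u(i)^2;    % stage cost g_i(x_i, u_i)
    x = x + u(i);            % system equation x_{i+1} = f_i(x_i, u_i)
end
J = J + x^2;                 % terminal cost g_N(x_N)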


Nonlinear Optimization Algorithm (1)

1. Select x^0 (i.e. set the decision/control variables); set k = 0
2. For constrained problems, set the initial penalty r^0 to a suitably small value. Choose a penalty update rule (e.g. r^k = 1.75 r^{k−1})
3. For constrained problems, formulate the equivalent unconstrained objective function (using the penalty method)
4. Find the new vector x^{k+1}
   a. Compute the direction vector d^k
   b. Compute the stepsize α^k
   c. Update x^{k+1} = x^k + α^k d^k
   d. For constrained problems, repeat steps a to c until convergence is achieved within a reasonable (large) tolerance
5. If the stopping criterion is satisfied, stop. x^k is the optimum solution and f(x^k) the optimum objective value
6. If the stopping criterion is not satisfied, update k = k + 1, update r^{k+1}, and go to step 3.
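
As an illustration of this loop, here is a minimal sketch (not from the slides) of the penalty-based (SUMT) outer iteration, using MATLAB's fminunc as the inner unconstrained solver; the example constrained problem (minimize x_1 + x_2 subject to x_1^2 + x_2^2 = 2), the starting point, the penalty update factor, and the tolerance are all illustrative assumptions.

% Minimal penalty (SUMT) loop sketch; the example problem, starting point,
% penalty update factor and tolerance are illustrative assumptions.
f = @(x) x(1) + x(2);                      % objective
h = @(x) x(1)^2 + x(2)^2 - 2;              % equality constraint h(x) = 0

x = [1; 0];                                % step 1: starting point
r = 1;                                     % step 2: initial penalty parameter
for k = 1:20
    FP = @(x) f(x) + (r/2)*h(x)^2;         % step 3: penalized unconstrained objective
    x  = fminunc(FP, x, optimset('Display', 'off'));   % step 4: inner minimization
    if abs(h(x)) < 1e-4, break; end        % step 5: stop when (nearly) feasible
    r = 4*r;                               % step 6: increase the penalty and repeat
end
% x should approach the constrained minimizer (-1, -1) as the penalty grows.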


Nonlinear Optimization Algorithm (2)

Stopping criteria

|f(x^k) − f(x^{k−1})| / |f(x^k)| < ε

‖x^k − x^{k−1}‖ / ‖x^k‖ < ε

where ε is a small positive number.

Calculating gradients The gradients are typically computed by the finite difference method in practice. This procedure is general and does not require explicit expressions for the gradient functions (which are often not available in practice – implicit equations)
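
A minimal sketch of such a forward-difference gradient approximation follows; the objective, evaluation point, and perturbation size are illustrative assumptions.

% Minimal forward-difference gradient sketch; the objective, point and
% perturbation size are illustrative assumptions.
f = @(x) 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
x = [-1.2; 1];
h = 1e-6;                                  % perturbation size
g  = zeros(size(x));
fx = f(x);
for i = 1:numel(x)
    xp = x;  xp(i) = xp(i) + h;            % perturb the i-th variable only
    g(i) = (f(xp) - fx)/h;                 % forward-difference estimate of df/dx_i
end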


Practical Guidelines (1)

General
Nonlinear programming is computationally challenging. Theoretical results often do not translate into practical behavior. This is because of the discrete nature of digital computations.

Each problem should be considered from modeling to implementation independently from others.

Experimentation can yield insights that can be used to tune the methods for improved efficiency and performance.

Large scale problems require additional care in design and implementation.

Real world problems often do not possess many properties assumed during theoretical analyses (e.g. continuously differentiable functions, etc)


Practical Guidelines (2)

Categories
Mathematical modeling/problem formulation
Scaling
Validation
Method selection
Large scale problems
High performance and parallel implementation


Mathematical Modeling/Problem Formulation (1)

Key questions
What are the objectives of the optimization?
Is an accurate mathematical model of the problem available?
Is reliable data available?

Goal: the simplest mathematical model consistent with the objectives and the accuracy of available data and models

Specific decisions
Objective function?
Number and type of variables?
Number and type of constraints?


Mathematical Modeling/Problem Formulation (2)

Some guidelines
If there is more than one objective function, embody all but one in the constraint set (otherwise, a multi-objective optimization technique has to be used)

If unsure of design and implementation decisions, start with a simple formulation and study the results of the optimization before modifying it

Two rules of thumb: (1) convex objective functions and constraint sets are preferable to non-convex ones; (2) linear and simple nonlinear functions are preferable

Converting nonlinear functions to piecewise linear ones is generally NOT preferable, as this increases the number of variables and distorts the physical understanding of the problem

Converting integer variables to continuous ones may be done
Making objective and constraint functions differentiable may be done


Scaling

Variables should be scaled so that their values are neither too large nor too small relative to one another.

Benefits
Controls round-off errors
Improves the conditioning of the problem
Often improves convergence

Example: suppose x ranges between a and b. It is scaled to [−1, 1] by

[x − (a + b)/2] / [(b − a)/2]
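
A one-line sketch of this scaling; the bounds a and b are illustrative assumptions.

% Sketch of the scaling above; the bounds a and b are illustrative assumptions.
a = 10;  b = 5000;
scale   = @(x) (x - (a + b)/2) / ((b - a)/2);   % maps [a, b] to [-1, 1]
unscale = @(z) z*(b - a)/2 + (a + b)/2;         % maps [-1, 1] back to [a, b]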


Method Selection

A comparison of several methods was presented on a previous slide

Some further considerations
Is a one-time solution sought, or does the problem have to be solved many times (for varying parameters)? In the latter case, efficiency and accuracy are important.

For complex nonlinear problems with non-convex sets, simpler methods like the method of steepest descent are preferable.

If parallel implementation is desired then the conjugate-gradient method is preferable.

The choice of the stepsize rule has a significant impact on efficiency and accuracy.


Validation

It is essential that the solution obtained is validated as correct. There are no fixed procedures for this; each problem has to be considered separately.

Some useful strategies If an optimal solution is known, then a close solution from a method indicates correctness

Run the algorithm with several different starting values to see if it converges to the same optimum solution

Vary parameters of the problem and correlate the behavior of the solution to the physical understanding of the problem.

Solve the problem using different methods


Example – Min. Weight Design of Cold-Formed Steel Beam

Objective of the optimization problem
To develop parameterized minimum-weight design curves for cold-formed steel hat-shape beams


Example (2)

Problem definition
Min. f = µ t L (2b + 2d)
Subject to: the constraints of the building code (AISI)
Variables of the problem are t, b, and d only; others are parameters. The code-specified constraints are complex, nonlinear, and implicit. Some equations are not continuously differentiable.

Scaling
No scaling is needed since the code equations are based on the ratios b/t and d/t

Method Selection
The method of steepest descent (with scaling) is most appropriate because it can be followed and understood, and hence tuned to give good solutions.


Example (3)

[Figure: minimum-weight design curves - thickness t (mm) versus span length (m) for uniform loads q = 2.5 to 20 kN/m on a hat-shape section with top width b, depth d, thickness t, and bottom flanges b/2; Fy = 345 N/mm^2, unbraced]

Validation
The solution is validated by comparing with optimal solutions found by other algorithms and by the parametric behavior of the solution.


Large Scale Problems (1)

What is large-scale? There are no hard and fast rules. One criterion is

Hundreds or thousands of variables
Run-time in the tens of minutes

Considerations in design and implementation
Scalability of the method (both computational efficiency and memory usage)
Memory requirement
Run-time
Utilizing structure in the problem to enhance performance (e.g. large-scale problems are often sparse)
Numerical conditioning
Robustness
Parallel implementability


Large Scale Problems (2)

Recommendations
Non-Newton-like methods such as the method of steepest descent and the conjugate gradient method

The conjugate gradient method, especially when implemented in parallel, is usually the best

Scaling of variables is essential
Scaling of the gradient direction (when using the steepest descent method)


Parallel Implementation

Motivation
Significant speedups can be achieved by implementing the method on a high-performance parallel computer.

Parallel computing is affordable with a cluster of computers running Linux and freely available parallel libraries.

Recommendation
The conjugate gradient method is readily parallelizable on both distributed-memory and shared-memory architectures. Significant speedups and efficiencies are obtained in practice.

Other advantages of parallel implementation
Improved search for a global minimum. This is a consequence of the non-deterministic execution order of parallel programs.

Faster and more stable convergence. This has been observed in practice for the solution of complex and large problems.


MATLAB Optimization Toolbox

The MATLAB toolbox implements several methods. This makes experimentation straightforward and the selection of the best method easier. However, MATLAB code is not as efficient as compiled C or Fortran code. Hence, it is appropriate for small to medium scale problems only.

Two key functions
fminunc - multidimensional unconstrained nonlinear minimization
fmincon - multidimensional constrained nonlinear minimization

These m-files are the primary interface for constrained and unconstrained optimization in MATLAB. Type help optim to list all toolbox functions


Unconstrained Optimization

Syntax X=FMINUNC(FUN,X0,OPTIONS)

where FUN = objective function to be minimized; X0 = starting vector; OPTIONS = structure specifying optimization options

Example FUN can be specified using @:

X = fminunc(@myfun,2) where MYFUN is a MATLAB function such as: function F = myfun(x)

F = sin(x) + 3;


Constrained Optimization

Syntax
X=FMINCON(FUN,X0,A,B,Aeq,Beq,LB,UB,NONLCON,OPTIONS)

This function solves the optimization problem:
Minimize F(X) subject to:
A*X <= B; Aeq*X = Beq (linear constraints)
C(X) <= 0; Ceq(X) = 0 (nonlinear constraints)
LB <= X <= UB

The function NONLCON accepts X and returns the vectors C and Ceq, representing the nonlinear inequalities and equalities, respectively. Like FUN, NONLCON can be specified with a function or with INLINE.
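
A minimal sketch of such a call (not from the slides), reusing the earlier constrained example (minimize x1 + x2 subject to x1^2 + x2^2 = 2); the function name CIRCCON and the starting point are illustrative assumptions.

% Minimal fmincon sketch for the earlier example: minimize x1 + x2 subject to
% x1^2 + x2^2 = 2. There are no linear constraints or bounds, so [] is passed.
x0 = [1; 0];
[X, FVAL] = fmincon(@(x) x(1) + x(2), x0, [], [], [], [], [], [], @circcon);

where CIRCCON is, e.g.:

function [C, Ceq] = circcon(x)
C   = [];                        % no nonlinear inequality constraints
Ceq = x(1)^2 + x(2)^2 - 2;       % nonlinear equality constraint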


Setting Optimization Parameters

The OPTIMSET function is used to modify the OPTIONS structure that specifies optimization parameters such as optimization method and line search method.

Syntax OPTIONS = OPTIMSET('PARAM1',VALUE1,'PARAM2',VALUE2,...)

For medium scale problems, MATLAB provides steepest descent, Newton and Quasi-Newton (BFGS and DFP) methods

For large scale problems, MATLAB provides CG and sequential quadratic programming methods

HessUpdate - [ {bfgs} | dfp | steepdesc ]
Use the help command for more details.
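
A minimal sketch tying these pieces together (not from the slides): selecting the quasi-Newton update via OPTIMSET and minimizing the banana function of the next slide with FMINUNC. The starting point is an illustrative assumption, and option names may differ in newer MATLAB releases.

% Choose the DFP quasi-Newton update and show iteration output, then minimize
% the banana function; the starting point is an illustrative assumption.
OPTIONS = optimset('HessUpdate', 'dfp', 'Display', 'iter');
[X, FVAL] = fminunc(@banana, [-1.9; 2], OPTIONS);

where BANANA is, e.g.:

function F = banana(x)
F = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;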


Example (1)

Min. f(x)= 100*(x(2)-x(1)^2)^2+(1-x(1))^2 (banana function)

[Figure: 3-D plot of the banana function]


Example (2)

Optimal solution is x* = [1, 1] and f(x*) = 0

BFGS Quasi-Newton method
Value of the function at the solution: 8.98565e-009
Number of function evaluations: 105

DFP Quasi-Newton method
Value of the function at the solution: 2.26078e-008
Number of function evaluations: 109

Steepest descent method
Value of the function at the solution: 4.84404
Number of function evaluations: 302
Steepest descent did not converge in 302 iterations.


References

Dimitri P. Bertsekas, Nonlinear Programming, Athena Scientific, MA, 1995.

Edwin K. P. Chong et al., An Introduction to Optimization, Wiley, 2001.

Hojjat Adeli and Asim Karim, Construction Scheduling, Cost Optimization and Management, Spon Press, 2001.

Ananth Grama et al., An Introduction to Parallel Computing: Design and Analysis of Algorithms, Addison-Wesley, 2003.

MATLAB Optimization Toolbox, http://www.mathworks.com/access/helpdesk/help/toolbox/optim/optim.shtml