Improving Optimization Performance with Parallel Computing

MATLAB Digest, www.mathworks.com

Engineers, scientists, and financial analysts frequently use optimization methods to solve computationally expensive problems such as smoothing the large computational meshes used in fluid dynamic simulations, performing image registration, or analyzing high-dimensional financial portfolios. However, computing a solution can take anywhere from hours to days.

Parallel computing techniques can help reduce the time it takes to reach a solution. To derive the full benefits of parallelization, it is important to choose an approach that is appropriate for the optimization problem.

This article describes two ways to use the inherent parallelism in optimization problems to reduce the time to solution. The first example solves a mathematical problem using the parallel computing option in Optimization Toolbox, and requires no code modification. The second, a practical engineering optimization problem, requires a single-line change in code. Both examples use Parallel Computing Toolbox and MATLAB Distributed Computing Server to automate and manage the parallel computing tasks.

Products Used
MATLAB
Simulink
Optimization Toolbox
Statistics Toolbox
Parallel Computing Toolbox
MATLAB Distributed Computing Server

Parallel Optimization with Optimization Toolbox

In a typical optimization, an iterative search procedure is used to find a minimum value of a given function, for example, using a gradient-based algorithm to find a minimum value of the peaks function in MATLAB (Figure 1).

Figure 1. Using a gradient-based optimization solver to search for a minimum value.

Gradient estimation is often the most time-consuming computation during optimization. In a single iteration, the optimization solver estimates the local gradient of the function and uses that information to determine the direction of the search and the magnitude of the step to the next point in the search. This process is repeated until the solver finds a minimum value or reaches a predefined time or iteration limit.

The gradient estimation step, which is often performed using an approximation method such as finite differences, requires several function evaluations near the current point. Typically, N function evaluations are required, where N is the number of variables, or the dimensionality of the problem. As N increases, so do the number of function evaluations per iteration and the time to solution.

Figure 2 shows how you can perform computations in parallel to accelerate gradient estimation. The N function evaluations used to estimate the gradient would be executed serially when performed on a single MATLAB worker.

Whereas in the serial approach the N function evaluations occur one after the other, in the parallel approach the computations are distributed across MATLAB workers, and several function evaluations can occur simultaneously. We can speed up gradient estimation as long as the cost of distributing the function evaluations across multiple workers is less than the execution time of the N objective (and constraint) function evaluations. The actual solution time depends on the objective/constraint function execution speed, the computer processing speed, available memory, current load, and network speed.
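To make the idea concrete, here is a minimal sketch (not code from the article) of how a forward-difference gradient estimate could be distributed with a parfor loop from Parallel Computing Toolbox; the function handle fun, the point x, and the step size h are assumptions made for illustration.

function g = parallelGradient(fun, x, h)
% Estimate the gradient of fun at x by forward differences. The N
% perturbed-point evaluations are independent, so parfor can spread
% them across the available MATLAB workers.
    f0 = fun(x);               % baseline function value
    N  = numel(x);             % problem dimensionality
    g  = zeros(N, 1);
    parfor k = 1:N
        xk    = x;
        xk(k) = xk(k) + h;                 % perturb one coordinate
        g(k)  = (fun(xk) - f0) / h;        % one of the N independent evaluations
    end
end

With no worker pool open, the parfor loop simply runs serially, which mirrors the serial approach described above.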

An Electrostatics Example¹

This example illustrates how to formulate and solve an optimization problem in MATLAB. Consider N electrons in a conducting body (Figure 3). The electrons arrange themselves to minimize their potential energy, subject to the constraint of lying inside the conducting body. At the minimum total potential energy, all the electrons lie on the boundary of the body.

[Figure 2 shows the two approaches as flowcharts: in the serial approach, a single worker evaluates the model in a loop for k = 1:N; in the parallel approach, the model evaluations are distributed across several MATLAB workers and run simultaneously.]

    Figure 2. Serial and parallel approaches to gradient estimation.

    Figure 3. Conducting body and electrons.

¹ Inspired by Dolan, Moré, and Munson, Benchmarking Optimization Software with COPS 3.0, Argonne National Laboratory Technical Report ANL/MCS-TM-273, February 2004.


Because the electrons are indistinguishable, there is no unique minimum for this problem (permuting the electrons in one solution gives another valid solution).

The optimization goal is to minimize the total potential energy of the electrons, subject to the constraint that the electrons remain within the conducting body. The objective function, the potential energy, is the sum of the inverses of the distances between each electron pair (i, j = 1, 2, 3, ..., N):

$$\mathrm{energy} = \sum_{i < j} \frac{1}{\left\| x_i - x_j \right\|}$$

The constraints that define the boundary of the conducting body are

$$|x| + |y| + z \le 0$$
$$x^2 + y^2 + (z+1)^2 \le 1$$

As written, the first inequality is a nonsmooth nonlinear constraint because of the absolute values on x and y. Absolute values can be linearized to simplify the optimization problem. This constraint can be written in linear form as a set of four constraints for each electron i:

$$x_{i,3} + x_{i,1} + x_{i,2} \le 0$$
$$x_{i,3} + x_{i,1} - x_{i,2} \le 0$$
$$x_{i,3} - x_{i,1} + x_{i,2} \le 0$$
$$x_{i,3} - x_{i,1} - x_{i,2} \le 0$$

The indices 1, 2, and 3 refer to the x, y, and z coordinates, respectively.
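As an illustration only (the article's Figure 4 listing is not reproduced here), the four linear constraints per electron could be assembled into the A and b matrices that fmincon accepts for linear inequality constraints A*x <= b. The decision-vector ordering [x1; y1; z1; x2; y2; z2; ...] assumed below is a guess, not taken from the article.

N = 10;                                 % number of electrons (example value)
signs = [ 1  1;  1 -1; -1  1; -1 -1];   % sign patterns for the x and y terms
A = zeros(4*N, 3*N);
b = zeros(4*N, 1);
for i = 1:N
    cols = 3*(i-1) + (1:3);             % columns for electron i: [x y z]
    for k = 1:4
        A(4*(i-1) + k, cols) = [signs(k,1), signs(k,2), 1];   % ±x ± y + z <= 0
    end
end

The remaining sphere constraint, x^2 + y^2 + (z+1)^2 <= 1, stays in the nonlinear constraint function passed to fmincon.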

This problem can be solved with the nonlinear constrained solver fmincon in Optimization Toolbox. Figure 4 shows the problem formulation in MATLAB. Note that in defining the objective function, sumInvDist, the statement pause(t) was added. This changes the time taken to execute sumInvDist, letting us determine how the execution time affects the solution time.
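Figure 4 itself is not transcribed, so the following is only a rough sketch of what an objective function such as sumInvDist might look like, including the pause(t) statement used to lengthen its execution time; the exact signature and variable layout in the article may differ.

function energy = sumInvDist(x, t)
% Total potential energy: sum of inverse pairwise distances for electrons
% whose coordinates are the rows of the N-by-3 matrix x. The pause call
% artificially inflates the evaluation time by t seconds.
    pause(t);
    N = size(x, 1);
    energy = 0;
    for i = 1:N-1
        for j = i+1:N
            energy = energy + 1/norm(x(i,:) - x(j,:));
        end
    end
end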

We execute the optimization problem on a single MATLAB worker. For convenience, we define speed-up as the ratio of the time it takes to solve the problem on one worker to the time it takes to solve the same problem on N workers. Thus, for any problem running on a single worker, the speed-up is defined as 1. When the same problem is run on more than one worker, a speed-up greater than one denotes a reduction in solution time, and a speed-up less than one denotes an increase.

To use the parallel computing capability in Optimization Toolbox, we change the UseParallel option from its default value of 'never' to 'always', enable the desired number of compute nodes with the matlabpool command, and run the optimization solver as before (Figure 5).

This example runs on two workers and is 20% slower than the single-worker case. A single evaluation of the objective function and the constraint function takes about 0.5 milliseconds. Because the evaluations take so little time, the overheads associated with farming out data and computations outweigh any gains realized by running the code in parallel. As a result, distributing the computations to run in parallel is slower than running the problem serially.

Figure 4. Problem formulation in MATLAB.

Figure 5. Running the optimization solver using the parallel computing capability in Optimization Toolbox.

problem.options = optimset(problem.options, 'UseParallel', 'always');

    matlabpool open 2

[x,fval,exitflag,output] = fmincon(problem);



This example shows that for optimization problems to benefit from running the computations in parallel, the cost associated with the function evaluations used for gradient estimation must be greater than the overhead costs associated with data and code transfer.

To further understand the trade-offs associated with the objective function execution time, the number of MATLAB workers, and the number of variables (electrons), we ran this problem with a pause time (t) ranging from 0 to 0.4 seconds, the number of workers ranging from 1 to 32, and the number of electrons ranging from 8 to 20. Figures 6a and 6b show the results plotted for t = 0 and 0.1 seconds, respectively.
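The benchmarking script itself is not shown in the article; a minimal sketch of how the pause-time and worker-count part of such a sweep might be driven is given below. It assumes the fmincon problem structure from Figure 4 (here called problem, with UseParallel already set as in Figure 5), the sumInvDist sketch above, and the R2009-era matlabpool syntax; all of these details are assumptions.

pauseTimes = 0:0.1:0.4;                 % artificial objective delays, in seconds
poolSizes  = [1 2 4 8 16 32];           % numbers of MATLAB workers to try
solveTime  = zeros(numel(poolSizes), numel(pauseTimes));
for w = 1:numel(poolSizes)
    if poolSizes(w) > 1, matlabpool('open', poolSizes(w)); end
    for p = 1:numel(pauseTimes)
        % inject the delay t into the objective used by the problem structure
        problem.objective = @(v) sumInvDist(reshape(v, 3, []).', pauseTimes(p));
        t0 = tic;
        [x, fval] = fmincon(problem);
        solveTime(w, p) = toc(t0);
    end
    if poolSizes(w) > 1, matlabpool('close'); end
end
% speed-up relative to the single-worker timings, as defined above
speedUp = repmat(solveTime(1, :), numel(poolSizes), 1) ./ solveTime;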

Figure 6a shows how the number of electrons changes the speed-up. For 8 and 10 electrons, the overhead cost of adding more workers reduced the speed-up. For 16 electrons, the overhead cost fell below the function evaluation cost, and we saw a positive effect on the speed-up for most worker counts. The average time to evaluate one objective function was on the order of 1 millisecond on a single worker. The maximum speed-up occurred at around 8 workers; increasing the number of workers beyond that increased the communication overhead and eventually eliminated the benefit of using parallel computing.

Figure 6b shows how the results change when the objective function evaluation time significantly exceeds the overhead costs. A rapid increase in speed-up is again observed at around 8 workers. Notice that the maximum reductions in solution time for the 8-electron case and the 16-electron case occur at 8 workers and 16 workers, respectively. This is a result of balancing the number of parallel computations that can be performed against the number of available workers: for the 8-electron case, 8 workers produced the greatest improvement, and a similar trend is seen for the 16-electron case.

User-Defined Parallel Optimization Problem

As we have seen, the time required to evaluate the objective strongly influences the solution time. Therefore, one approach we can take is to look for parallelism within the objective function and the constraint function; that is, to parallelize the objective or the constraint function itself.

We will take this approach by distributing computations within the objective function for the problem shown in Figure 7.

Figure 6a. Results for pause time t = 0 seconds.
Figure 6b. Results for pause time t = 0.1 seconds.

Figure 7. Vehicle suspension model and simulation results for an empty vehicle.


The goal of this problem is to design a suspension system that minimizes the discomfort the driver would experience when traveling over a bump in the road. At the same time, we must account for uncertainty in the mass of the driver, passengers, and trunk loadings. We can modify the four parameters that define the front and rear suspension system stiffness and damping rates: kf, kr, cf, cr. The masses of the driver, passengers, and trunk loadings are uncertain and have a normal distribution assigned to them.

A Monte-Carlo simulation is performed to capture the different vehicle loadings. The model outputs are the angular acceleration about the center of gravity (thetadotdot, $\ddot{\theta}$) and the vertical acceleration (zdotdot, $\ddot{z}$).

The objective function, myCostFcnRR, contains a Monte-Carlo simulation used to evaluate the mean and standard deviation of the acceleration that the passengers would experience (Figure 8). The goal is to minimize the mean and standard deviations.

In a Monte-Carlo simulation, each run is independent and can therefore benefit from parallel computation. To convert the problem from serial to parallel, we simply replace the for loop construct with the parfor (parallel for loop) construct (Figure 9). The objective statements inside the parfor loop can then run in parallel, speeding up the objective function evaluation time.
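Figures 8 and 9 are not reproduced in this transcript. The sketch below shows what an objective function like myCostFcnRR could look like after the for-to-parfor change; the simulateSuspension helper, the loading model, and the way the mean and standard deviation are combined are all assumptions made for illustration, not the article's actual listing.

function cost = myCostFcnRR(x, nRuns)
% x = [kf kr cf cr]: front/rear suspension stiffness and damping rates.
% Runs a Monte-Carlo simulation over uncertain vehicle loadings and
% combines the mean and standard deviation of the resulting accelerations.
    zAccel     = zeros(nRuns, 1);
    thetaAccel = zeros(nRuns, 1);
    parfor k = 1:nRuns                      % each Monte-Carlo run is independent
        loading  = 980 + 100*randn;         % assumed normally distributed loading
        [za, ta] = simulateSuspension(x, loading);   % hypothetical model call
        zAccel(k)     = za;
        thetaAccel(k) = ta;
    end
    cost = mean(zAccel) + std(zAccel) + mean(thetaAccel) + std(thetaAccel);
end

Only the loop keyword changes from for to parfor; the loop body is untouched, which is the single-line change referred to in the introduction.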

The optimization problem was run for three different values of nRuns, the number of points to evaluate in the Monte-Carlo simulation. The results show that parallelizing the objective function yielded substantial performance gains (Figure 10).

Figure 8. Objective function used to run a Monte-Carlo simulation within the optimization process.

Figure 9. Objective function modified to execute in parallel.

Figure 10. Results for the suspension system design problem.

[Diagram: the optimizer repeatedly calls the objective function, which runs a Monte-Carlo simulation that evaluates the model in a loop for k = 1:N.]


Selecting a Parallelization Method

As the results show, it is best to select a parallelization method based upon where the computational expense is encountered in the optimization problem. The first example showed that it is possible to see a reduction in performance even after distributing computations if the objective/constraint function execution time does not exceed the communication overhead. For the problem and hardware configuration tested, we would need an execution time of at least one millisecond to see any benefit from parallelizing the problem.

The second example showed that the benefits obtained by performing parallel computations can depend on the problem being solved. Using the parallelism within the objective function (that is, parallelizing the objective function itself) resulted in better performance than would have been possible using the parallel computing option in Optimization Toolbox. In this example, the execution time of the objective function was the slowest part of the optimization problem, and speeding up the objective function resulted in the greatest reduction in solution time.

In summary, when selecting a parallelization approach it is important to consider the number of available workers and the execution time of the objective/constraint function relative to the overheads associated with distributing computations across multiple workers. The built-in support in Optimization Toolbox is beneficial for problems that have objective/constraint functions with execution times greater than the network overhead. However, parallelizing the objective/constraint function itself can be a better approach if it is the most expensive step in the optimization problem and can itself be accelerated.

For More Information

Designing for Reliability and Robustness, MATLAB Digest, January 2008. www.mathworks.com/company/url

Cleve's Corner: Parallel MATLAB: Multiple Processors and Multiple Cores, The MathWorks News & Notes, June 2007. www.mathworks.com/company/url

MATLAB Distributed Computing Server Computer Setup
8 networked PCs configured with Linux 2.6, two dual-core AMD Opteron 2.59 GHz processors, and 3 GB RAM.

91710v00 03/09

© 2009 The MathWorks, Inc. MATLAB and Simulink are registered trademarks of The MathWorks, Inc. See www.mathworks.com/trademarks for a list of additional trademarks. Other product or brand names may be trademarks or registered trademarks of their respective holders.