Benders Decomposition for Dummies


Benders Decomposition for Dummies
How I learned it
4/14/2011
Gro Klæboe

Contents
For whom is this note intended?
What is Benders decomposition?
What are the benefits of using Benders decomposition?
Presentation of the case study
Dressing up the problem in the standard notation
  The first stage
  The second stage
  The optimality cuts: the link between the first and the second stage
  Taking care of stochasticity
  Example of an optimality cut
The Benders procedure from beginning to end
The steps in Benders decomposition
  Step 1: Initialization
  Step 2: Sub problems
  Step 3: Convergence test
  Step 4: Master problem
But how do I know how many iterations to run?
List of notation
References

For whom is this note intended?
The primary target group for this memo is very narrow: me. I need to write down what I have just understood, so that I don't forget it. Other people who might benefit from this memo are probably quite like me in the following respects:
- You have a basic idea about what stochastic programming is about
- You are familiar with the notion of two-stage recourse problems
- You are used to the notation in the textbook of Birge & Louveaux [1]
- You have tried to follow a textbook on the implementation of Benders, but have stumbled in the details
- You need a practical rather than mathematical approach to concepts to understand them fully
- You are looking for a basic, step-by-step explanation of how to implement Benders, and are willing to suffer a tedious presentation to get all the details
Since I don't intend to publish these notes anywhere other than on my home page, I don't bother to pretend that Google isn't my main source of information. Therefore, the list of references includes other lecture notes, memos and other useful stuff found on the net.

What is Benders decomposition?
Benders decomposition is a way to split a complicated mathematical programming problem into two, thereby simplifying the solution by solving one master problem and one subproblem. It is commonly used for stochastic two-stage programs with recourse, where the problem can be split into the first-stage and second-stage problems, but Benders decomposition can be used for deterministic problems too - see [2] for an example. Originally, Benders decomposition was developed to solve integer, non-stochastic programs. This is what Mr. Benders was struggling with.
Within stochastic programming, one often refers to the L-shaped method [1], which was developed by Van Slyke and Wets. I think they were the first to borrow from Benders' decomposition technique and apply it to stochastic programs. (There is also someone named Kelley who touched upon the same things.)
This method is basically the same as Benders decomposition, with the addition of feasibility cuts (see [1] ch. 5.1 for details). Since we will work with a problem with relatively complete recourse in this article, we can disregard the feasibility cuts.

What are the benefits of using Benders decomposition?
This is a good question. There exist two families of decomposition techniques: decomposition along the time stages, and decomposition along scenarios [3]. Benders decomposition decomposes along time stages. There exist other techniques that decompose along time stages and that are faster than Benders. I generally have the impression that Benders is the learn-to-walk-before-you-can-run way to approach these problems.

Presentation of the case study
I will illustrate the implementation of Benders with a simple example from Higle's excellent tutorial on stochastic programming [4]. The problem is basically as follows: the Dakota furniture company makes desks, tables and chairs from three types of input: lumber and two types of labour, finishing and carpentry. The company faces a two-stage recourse problem in the following way: it needs to purchase its input factors before demand for the different products is known. However, as demand is revealed, it is free to divide the input factors between the different products in a way that maximizes income, given the availability of inputs.
The first-stage problem is thus to determine how much lumber to buy, and how much finishing- and carpentry-skilled labour to hire. The random variable is the demand quantities, and the second-stage problem is to figure out how many desks, tables and chairs to produce, recognizing that production opportunities are limited by the inputs bought in stage 1. The parameters of the problem are as follows:

Table 1: Cost of input factors and input factor requirements

  Resource          Cost ($)   Input requirements
                               Desk   Table   Chair
  Lumber (bd ft)    2          8      6       1
  Finishing (hrs)   4          4      2       1.5
  Carpentry (hrs)   5.2        2      1.5     0.5

Table 2: Demand scenarios and sell prices

                 Demand scenarios              Sell price ($)
                 Low    Most likely   High
  Desks          50     150           250     60
  Tables         20     110           250     40
  Chairs         200    225           500     10
  Probability    0.3    0.4           0.3
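Since we will refer to these numbers repeatedly, it may help to have them in machine-readable form. The following sketch encodes the problem data from the two tables above as plain Python structures; the dictionary names are my own, not from the note.

```python
# Problem data for the Dakota furniture example, taken from the two
# tables above. Structure and names are my own choice.

cost = {"lumber": 2.0, "finishing": 4.0, "carpentry": 5.2}  # $ per unit input

requirement = {  # input use per unit of each product
    "lumber":    {"desk": 8.0, "table": 6.0, "chair": 1.0},
    "finishing": {"desk": 4.0, "table": 2.0, "chair": 1.5},
    "carpentry": {"desk": 2.0, "table": 1.5, "chair": 0.5},
}

sell_price = {"desk": 60.0, "table": 40.0, "chair": 10.0}

demand = {  # scenario -> probability and demanded quantities
    "low":         {"p": 0.3, "desk": 50,  "table": 20,  "chair": 200},
    "most_likely": {"p": 0.4, "desk": 150, "table": 110, "chair": 225},
    "high":        {"p": 0.3, "desk": 250, "table": 250, "chair": 500},
}

# Sanity check: the scenario probabilities sum to one.
assert abs(sum(s["p"] for s in demand.values()) - 1.0) < 1e-9
```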

Dressing up the problem in the standard notation

The first stage
First, one word of caution: in every introductory text on Benders, the first-stage problem is a minimization problem, and the second stage is also a minimization problem. Do yourself a favor and formulate your problem so that it is a minimization problem in both stages, or you'll be making sign errors constantly. Remember that a maximization problem can always be turned into a minimization problem by minimizing the negative of the max problem. Believe me, it's easier to interpret the negative of your problem than to get all the other signs and inequalities in Benders right.
Ok. Let's turn to the standard formulation. For the first stage, the notation is as follows:

  min  c'x + θ
  s.t. Ax ≤ b
       x ≥ 0

θ is an approximation of the second-stage costs as a function of your chosen x. We will elaborate on how we approximate it later. The first-stage problem basically says that you should choose x so as to minimize the costs in the first and the second stage, subject to the restrictions on x that are identifiable in the first stage. However, since we haven't dealt with the second stage yet, we have no clue about the costs in the second stage (the value of θ), and therefore we optimistically set its initial value to -∞.
If you are like me, it is often useful to see the vectors with some real values to get a feel for the problem, so here is the objective, with the input costs from Table 1:

  min  2x1 + 4x2 + 5.2x3 + θ

But what do the first-stage restrictions look like? In Higle's original problem, there are none specified (probably because Higle does not show how to solve the problem using Benders). There are no restrictions on how much lumber, finishing and carpentry labour we can buy. But when solving the case, we'll find out that to get started, we really need some upper limits on the x-vector; otherwise the first-stage problem will be unbounded in some of the starting iterations. This will often be the case with Benders. Luckily, it is usually not hard to think of some limits on x that would be relevant to include. There might be budget limits on how much money the producer can spend on buying inputs, or, in the extreme case, the availability of resources will ultimately be limited by global availability. (There cannot be more labour hours available for carpentry in one specific hour than there are people in the world!) So, I'll give you some upper limits on x, which I know are sufficient for the levels of demand: lumber: max 3500 bd ft; finishing labour: max 1500 hours; carpentry labour: max 875 hours. With this information at hand, the A-matrix is simply the identity matrix, and the restriction Ax ≤ b becomes:

  x1 ≤ 3500
  x2 ≤ 1500
  x3 ≤ 875

Solving this problem for the first iteration of the first stage (where θ is still at its initial value and does not yet constrain anything), it is obvious that the solution is to buy nothing:

  x = (0, 0, 0)

The second stage
Given that we have chosen some values of the x's in the first stage, how can we now make the best of the situation in the second stage? Let us denote the vector of x's from the first stage x̄, to emphasize that the values are fixed. We now want to solve the second-stage problems. For all the discrete realizations, s = 1, …, S, of the random variable ξ, we solve the following problem:

  min  q(s)'y
  s.t. Wy = h(s) - T x̄
       y ≥ 0

Some comments on the notation and stochasticity here. T is often named the technology matrix, W the recourse matrix, h the resource vector[1] and q the cost vector. All of these might be functions of the stochastic variable ξ, but to use Benders, W must be independent of ξ. That is known as fixed recourse. Also, life is easier when q is independent of ξ, as this interferes with feasibility cuts in some ways that I have not really thought very hard about (the interested reader is referred to chapter 3 of [5]). However, in our case only demand is stochastic, which means (as I will show) that only the resource vector is stochastic. [1: Note that if you are reading Higle, she denotes the resource vector r rather than h.]

The objective in the second stage is really to maximize income given the available inputs and demand constraints. However, as we want to work with a minimization problem, we formulate the problem so that we minimize the negative of the income from selling furniture, with the sell prices from Table 2:

  min  -60y1 - 40y2 - 10y3
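Written as a plain function, this objective is easy to sanity-check numerically. The sketch below is my own; producing exactly the low-demand quantities gives -5800, which also appears later as the scenario-1 second-stage objective in iteration 4.

```python
# The second-stage objective: minimize the negative of the income,
# -60*y_desk - 40*y_table - 10*y_chair (sell prices from Table 2).

def second_stage_objective(y_desk, y_table, y_chair):
    return -(60 * y_desk + 40 * y_table + 10 * y_chair)

# Producing exactly the low-demand quantities (50, 20, 200) yields -5800.
assert second_stage_objective(50, 20, 200) == -5800
```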

Then, let's move on to the restriction matrices for the second stage. There are two sets of equations in this particular problem: the input restrictions, saying that you cannot make more furniture than the stock of inputs allows you (the first three rows of the W, h and T vectors/matrices), and the demand restrictions, saying that you cannot sell more to the market than the demand scenario allows you (the last three rows of the W, h and T vectors/matrices). Written out as inequalities, with y1 = desks, y2 = tables, y3 = chairs and d(s) the scenario demands:

  8y1 + 6y2 + 1y3     ≤ x̄1     (lumber)
  4y1 + 2y2 + 1.5y3   ≤ x̄2     (finishing)
  2y1 + 1.5y2 + 0.5y3 ≤ x̄3     (carpentry)
  y1 ≤ d1(s), y2 ≤ d2(s), y3 ≤ d3(s)   (demand)
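The two families of constraints can be checked directly in code. This is my own sketch (the function name and layout are not from the note); the coefficients come from Table 1.

```python
# A direct check of the second-stage constraint structure: input rows
# (which involve the first-stage x) and demand rows (which do not).

REQ = {  # input use per unit of (desk, table, chair), from Table 1
    "lumber":    (8.0, 6.0, 1.0),
    "finishing": (4.0, 2.0, 1.5),
    "carpentry": (2.0, 1.5, 0.5),
}

def feasible(y, x, demand):
    """y and demand are (desk, table, chair) tuples; x maps input -> stock."""
    inputs_ok = all(
        sum(a * q for a, q in zip(REQ[r], y)) <= x[r] for r in REQ
    )
    demand_ok = all(q <= d for q, d in zip(y, demand))
    return inputs_ok and demand_ok

x = {"lumber": 3500, "finishing": 1500, "carpentry": 875}
assert feasible((50, 20, 200), x, (50, 20, 200))      # low demand can be met
assert not feasible((251, 0, 0), x, (250, 250, 500))  # exceeds desk demand
```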

Note that many of the rows in the technology matrix consist of all zeros. This took a bit of time for me to realize (but maybe you are smarter than me and grasp this at once): all restrictions that limit the second-stage problem but are not a function of your first-stage variables will have all-zero rows in the T-matrix. However, they are still important, because they influence the objective function in the second-stage problem, and as we will see later on, the objective function will enter the approximated second-stage cost (θ). Note also that the last three rows of the h-vector have stochastic elements. Thus, we will actually not have only one second-stage problem - we will have S (in our case 3) problems, one for each scenario.
One thing that I see is imprecise in this exposition is that the second-stage restrictions are represented by equality constraints, whereas they really should be less-than-or-equal-to constraints. I guess the condensed form Wy = h - Tx involves some slack variables to make the equation hold. However, I will ignore this for now, since I am going to solve the problem through the high-level OR program GAMS rather than by matrix calculations. If you are trying to solve the problem in MATLAB, Maple or equivalent, you should probably think about this a bit.

The optimality cuts: the link between the first and the second stage
Since you are reading this text, you have probably looked at the Benders decomposition method and know that the first-stage and second-stage problems are tied together through the optimality cuts. Thus, after the first initialization, the first-stage problem gets a set of extra equations (one for each iteration) that limit θ:

  min  c'x + θ
  s.t. Ax ≤ b
       E_l x + θ ≥ e_l,   l = 1, …, v
       x ≥ 0, θ free

Note that θ is unbounded below, so negative values are allowed. Note also that θ has no subscript l, so for each iteration there will be more and more restrictions that limit θ, ultimately pushing θ to its optimal value. But what do these restrictions represent? Let us rewrite the optimality cut from the Birge & Louveaux [1] notation:

  θ ≥ e_l - E_l x

This equation says that θ must be greater than the right-hand side. Thus, we limit the objective (a minimization problem) by saying that the θ-part must be at least as great as something. But what is this something? To put it short, the something is the expected objective value of the second-stage problem, e_l (given some fixed values of x), less the contribution to reducing the second-stage objective value by changing the first-stage variables x. The next question is then: how do we know how much changing x is going to reduce the second-stage objective? The answer is that we use the marginal values on the restrictions that x enters. If the marginal value is negative, increasing x would reduce the second-stage objective (thus making it better, since this is a minimization problem), whereas a positive marginal value would increase it. The coefficient E_l tells us how fast the change is.
[A word of caution: it is really easy to get sign-confused when working with optimality cuts. Birge & Louveaux's formulation is more intuitively appealing, whereas Higle's formulation is more mathematically correct. When reading the two texts side by side it might seem as if they disagree on the sign of the x-term, but they don't. In Birge & Louveaux's notation, E_l collects the simplex multipliers on the Tx-term, whereas in Higle the multipliers appear explicitly in the term π'Tx. Therefore, the two restrictions, E_l x + θ ≥ e_l (Birge & Louveaux) and θ ≥ π'(h - Tx) (Higle), are really equivalent.]
Where does this whole idea of optimality cuts come from, and why does it work? The answer is duality theory. Let us go back to the formulation of the second-stage problem.
If you know your linear programming basics, you know that you can formulate the dual of the second-stage problem as follows:

  max  π'(h(s) - T x̄)
  s.t. W'π ≤ q
       (π ≤ 0)

where π is the vector of dual variables, and all other matrices should be known from the definition of the primal second-stage problem. [The last restriction, π ≤ 0, really only holds if Wy ≤ h(s) - Tx, but since I guess that slack variables are included in the B&L formulation, I don't worry too much about this.]
For the optimal x-vector, the objective of the dual second-stage problem will be equal to the objective of the primal second-stage problem. For all non-optimal x-vectors, the objective value of the dual problem will be less than the second-stage objective. This means that the objective of the dual second-stage problem acts as a lower bound on the true second-stage cost:

  π'(h(s) - T x) ≤ w(x, s)    (3)

Does this look familiar? It should. Remember that the value θ is an approximation of the second-stage objective w. And the approximation takes place using the dual variables from the optimal solution of the second stage, given that we have fixed some first-stage variables. Using the notation from Birge and Louveaux, we define

  e_l = Σ_s p_s (π_l^s)' h(s)
  E_l = Σ_s p_s (π_l^s)' T

With these definitions at hand, it is easy to see that the weak duality property (3) is equivalent to the optimality cut.

Taking care of stochasticity
Forgive me, Father, for I have sinned... The last paragraph is imprecise, because I have partly ignored the stochasticity of the problem. θ is not really an approximation of the second-stage cost, but of the second-stage costs that might arise given different outcomes of the random variable. What θ really should approximate is the expected value of the second-stage problem.

Example of an optimality cut
To illustrate what an optimality cut may look like, let us have a look at one particular iteration of the Dakota furniture problem. I have chosen to show iteration 4. In the first-stage problem, the objective is still to minimize the cost of purchasing inputs while balancing it against the income that the purchase of inputs could lead to in the second stage:

  min  2x1 + 4x2 + 5.2x3 + θ

We still have the first-stage constraints limiting how much input it is possible to buy:

  x1 ≤ 3500, x2 ≤ 1500, x3 ≤ 875

From the 3 earlier iterations, we also have three optimality cuts restricting θ, one per iteration; their values were generated by the runs described below.

If you reflect a bit upon the values of these three optimality cuts, you'll notice that without at least one unit of each input, the Dakota Furniture Company will be unable to produce a single table, desk or chair, and the objective value of the second stage (e_l) is therefore 0 in all three cuts. Also, the feedback from the second stage tells us that it is the lack of lumber that is the most binding constraint, leading to a negative marginal value[2] on lumber in the first iteration. Since the cost of lumber (2) is less than the expected income from buying lumber (a marginal value of -6.25), this leads us to buy lumber in the second iteration, but then it is the lack of labour for finishing that is most restrictive and yields a negative marginal value. Again, we decide to buy finishing as well, only to discover that it would be beneficial also to have carpentry-skilled labour. [2: Remember: with a minimization objective, a negative marginal value indicates a better objective if we increase the level.]

With this information we solve the first-stage problem and get a new decision on how much input to buy. [I wonder what information it is that makes the program buy less than the maximum of finishing and carpentry?] With these numbers at hand, we move to the second-stage problem. We maximize income for each of the three demand scenarios given this stock of inputs. The objective values of the three scenarios and the marginal values of increased inputs (the first-stage variables) are listed in the table below:

  Scenario          W        Probability   Marginal values
                                           Lumber   Finishing   Carpentry
  1 (low demand)    -5800    0.3           0        0           0
  2 (most likely)   -15650   0.4           0        0           0
  3 (high demand)   -21250   0.3           0        -15         0

Also, let us have a look at the marginal values of increased demand in the three scenarios in this iteration:

  Scenario          Probability   Marginal values
                                  Desks   Tables   Chairs
  1 (low demand)    0.3           -60     -40      -10
  2 (most likely)   0.4           -60     -40      -10
  3 (high demand)   0.3           0       -10      0

The marginal values of increased demand should not be so surprising. Since we have quite a lot of input, the marginal value of increased demand in the low and most likely scenarios is just equal to the (negative of the) selling price, since we are left with additional inputs even when all demand is met. In the high-demand scenario only the demand for tables can be fully met, so an increase in demand for tables would increase profit. However, due to the lack of inputs, increased production of tables would happen at the expense of desk production, so the contribution to profit is less than the selling price for tables.
Given the structure of this problem[3], where the h-vector has zeros in all rows where the T-matrix is non-zero, and vice versa, the constant term of the cuts can be calculated in two ways: either directly, through the marginal values on the h-vector, or indirectly, by subtracting the marginal values of the T-matrix multiplied with the first-stage decisions from the second-stage objective. [3: I have not really thought through whether this structure is a necessary condition for being able to calculate e_l in two ways, or whether it can also be done with other types of problems.]

Let us start out with the conventional way of calculating e_l that [1] prescribes:

  e_l = Σ_s p_s (π_l^s)' h(s)

Thus, for our problem, this becomes (only the demand rows of h are non-zero, so each term is the demand duals times the scenario demands; for scenario 3 this gives (0, -10, 0)·(250, 250, 500) = -2500, which differs from the scenario objective -21250 because the input duals are non-zero there):

  e_4 = 0.3·(-5800) + 0.4·(-15650) + 0.3·(-2500) = -8750

However, we can choose to do this in another way. Remember that the objective of the second-stage problem is equal to the dual variables times the right-hand side, π'(h - Tx) = w. We may exploit this in the following way:

  e_4 = Σ_s p_s w_s - Σ_s p_s (marginal on x)·x̄ = -14375 - (-4.5)·1250 = -8750

where 1250 is the finishing-hours purchase from the first stage in this iteration, the only input with a non-zero dual. (Note that the marginal values refer to Tx, so we need to replace the + with a - sign to get the calculations correct.)

Which method is better depends on the structure of the problem. With a large second-stage problem (an h-vector with many non-zero elements), I think the second approach is faster.
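The two calculations can be checked against each other numerically. In the sketch below, the scenario data come from the two tables above; the iteration-4 finishing purchase of 1250 hours is not stated explicitly in the note but is implied by the reported figures, so treat it as an inferred value.

```python
# Reproducing the two ways of computing the cut constant e_4.

p = [0.3, 0.4, 0.3]                       # scenario probabilities
w = [-5800.0, -15650.0, -21250.0]         # second-stage objective values
demand_duals = [(-60, -40, -10), (-60, -40, -10), (0, -10, 0)]
demands = [(50, 20, 200), (150, 110, 225), (250, 250, 500)]
finishing_duals = [0.0, 0.0, -15.0]       # duals on the finishing input row
x_finishing = 1250.0                      # implied iteration-4 decision

# Direct: e_4 = sum_s p_s * (dual_s . h_s); only the demand rows of h
# are non-zero, so each term is demand duals times scenario demands.
e_direct = sum(
    ps * sum(d * q for d, q in zip(duals, dem))
    for ps, duals, dem in zip(p, demand_duals, demands)
)

# Indirect: start from the expected second-stage objective and subtract
# the probability-weighted input dual times the first-stage decision.
expected_w = sum(ps * ws for ps, ws in zip(p, w))
weighted_marginal = sum(ps * ds for ps, ds in zip(p, finishing_duals))
e_indirect = expected_w - weighted_marginal * x_finishing

assert abs(e_direct - e_indirect) < 1e-6   # both routes give -8750
```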

Let us now calculate E_l. That is just a question of calculating the probability-weighted sums of the marginals from all scenarios. For lumber and carpentry the value of increased inputs is 0 in all scenarios, but for finishing the marginal value is -15 in the high-demand scenario. This leaves us with a coefficient of 0.3·0 + 0.4·0 + 0.3·(-15) = -4.5 for finishing. The fourth cut will then look as follows:

  θ ≥ -8750 - 4.5·x2

(equivalently, in the Birge & Louveaux form, 4.5·x2 + θ ≥ -8750, where x2 is the finishing hours).
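As a small check on this cut (my own sketch, using the constant and coefficient derived above and the inferred purchase of 1250 finishing hours), evaluating it at the iteration-4 decision should reproduce the expected second-stage objective:

```python
# The fourth optimality cut as a lower bound on theta, as a function of
# the finishing hours x2 (the only input with a non-zero weighted dual).

def cut4_lower_bound(x2):
    return -8750.0 - 4.5 * x2

# At the implied iteration-4 purchase of 1250 finishing hours the cut is
# tight: it reproduces the expected second-stage objective -14375.
assert cut4_lower_bound(1250) == -14375.0
```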

The cut tells us that, from the second-stage point of view, the expected income of the input vector bought in iteration 4 is $-8750, and that this income could be increased by $4.5 for each extra unit of finishing that was available. Is this interpretation really correct?

The Benders procedure from beginning to end
The procedure so far:
- Each iteration contains one run of the first-stage problem and S runs of the second-stage problem, where S is the number of scenarios.
- The goal of the first-stage problem is to decide upon the first-stage decision variables, x. These are then transferred to the second-stage problem of the same iteration and kept constant for all scenarios.
- The goal of the second-stage problem is to find the expected value of the second-stage problem, and also the gradient of the second-stage objective with respect to the first-stage variables.
- Each iteration of the second-stage problem adds one cut to the first-stage problem, whereas each run of the first-stage problem only replaces the old set of first-stage variables.
When you solve the first-stage problem, the cost of the second-stage problem is bounded by the optimality cuts. θ can be reduced by increasing the first-stage decision variables (the x's), a decision that has to be weighed against the increased first-stage costs. But the gradient describing how increased x reduces the second-stage cost is based on an optimal second-stage decision given x, so the estimate of θ from the first stage will act as a lower bound on the second-stage cost. The second-stage problem will calculate the cost of the second stage given optimal second-stage decisions for fixed first-stage variables.

The steps in Benders decomposition
This section follows [6].

Step 1: Initialization
  v := 1        {iteration number}
  UB := +∞      {upper bound}
  LB := -∞      {lower bound}
  Solve the initial master problem

  x^v, θ^v      {optimal values}

Step 2: Sub problems
  For s = 1, …, S do
    Solve the sub problem for scenario s with x fixed at x^v
  End for
  UB := min( UB, c'x^v + Σ_s p_s w_s^v )

Step 3: Convergence test
  If (UB - LB)/(1 + |LB|) ≤ TOL then
    Stop: required accuracy achieved
    Return
  End if

Step 4: Master problem
  Solve the master problem (with the optimality cut from iteration v added)
  x^v, θ^v      {optimal values}

  LB := c'x^v + θ^v
  v := v + 1
  Go to step 2

The naming of iterations in Benders
This is confusing stuff.

But how do I know how many iterations to run?
The question of the number of iterations was my main headache when I first tried to implement the Benders algorithm. Birge & Louveaux [1] have the following stopping criterion: let

  w^v = Σ_s p_s (π^s)'(h(s) - T x^v)

If θ^v ≥ w^v, then stop; x^v is an optimal solution. Otherwise, add the optimality cut to the master problem and run another iteration. However, this knowledge is not very useful in predicting the progress of your algorithm, since the gap w^v - θ^v does not necessarily decrease steadily. This is illustrated for our case study in Figure 1.
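Steps 1-4 can be sketched end-to-end in code. Solving the Dakota LPs needs an LP solver, so the toy below substitutes a one-dimensional problem with a closed-form subproblem: buy x units of capacity at $1/unit, then pay $3 per unit of unmet demand D, where D is 2 or 6 with probability 0.5 each. The master problem is "solved" by scanning a grid purely to stay dependency-free; all numbers are my own toy example, not Dakota data.

```python
# A complete, if toy, Benders/L-shaped loop following steps 1-4.

scenarios = [(0.5, 2.0), (0.5, 6.0)]     # (probability, demand)
C, Q = 1.0, 3.0                          # first/second-stage unit costs
GRID = [i * 0.01 for i in range(1001)]   # x in [0, 10]; the upper limit
                                         # keeps the first master bounded

def subproblem(x, d):
    """Second stage: cost Q*max(0, d-x) and its subgradient w.r.t. x."""
    shortfall = max(0.0, d - x)
    return Q * shortfall, (-Q if shortfall > 0 else 0.0)

cuts = []                                # each cut: theta >= const + slope*x
lb, ub = float("-inf"), float("inf")
x = 0.0                                  # Step 1: initial master solution
for _ in range(50):
    # Step 2: solve all subproblems, update the upper bound, add a cut
    vals = [(p,) + subproblem(x, d) for p, d in scenarios]
    exp_cost = sum(p * c for p, c, _ in vals)
    slope = sum(p * g for p, _, g in vals)
    cuts.append((exp_cost - slope * x, slope))
    ub = min(ub, C * x + exp_cost)
    # Step 3: convergence test
    if ub - lb <= 1e-9:
        break
    # Step 4: re-solve the master (grid scan; theta = max over the cuts)
    def master_obj(z):
        return C * z + max(c0 + s0 * z for c0, s0 in cuts)
    x = min(GRID, key=master_obj)
    lb = master_obj(x)
```

In this toy problem the loop stops after a handful of iterations at x = 6 with total cost 6, which is where the marginal purchase cost first exceeds the expected shortfall saving.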

Figure 1: Gap between the probability-weighted sum of the second-stage objectives and the approximation (θ) in the Dakota furniture problem.

With large problems at hand, it is quite frustrating not to know how much closer you are to the optimal solution, but just to sit and hope for an optimal solution in the next iteration. If you have looked at implementations of the Benders algorithm, or at applications in papers, you have almost certainly stumbled over the notions of the upper and lower bound. The notion of gradually narrowing the range of the master objective through a non-decreasing lower bound and a non-increasing upper bound, as illustrated in Figure 2, is rather appealing. In addition, these measures allow us not to calculate the problem to full optimality, but to stop the algorithm when the upper and lower bounds are close enough.

Figure 2: Example of bounding of the master objective in the Dakota furniture problem

However, finding out how these upper and lower bounds are calculated is rather tedious. Let me present some intuition behind it here. Since we have a minimization problem, the lower bound of the objective is the best possible objective, right? And the upper bound represents the worst case. So, after the first iteration we know that the expected profit of the Dakota furniture company is between $14875 and $0 (not a very precise measure, huh?). Ok, now I reveal how the upper and lower bounds are calculated. The lower bound is simply the optimal value of the master objective:

  LB = c'x^v + θ^v

The upper bound is the first-stage cost plus the probability-weighted sum of the optimal responses in the second stage:

  UB = c'x^v + Σ_s p_s w_s^v

Aren't these two measures really the same? No, there is one big difference. When calculating the lower bound, you are allowed to vary the first-stage decisions while simultaneously taking into account the impact of your choice of x on the first and second stage, although the effect on the second stage can only be approximated through the restrictions on θ. In the Dakota furniture problem, this means that we choose the amount of inputs to buy while accounting for the fact that lumber, finishing and carpentry represent both a direct cost and an income opportunity in the second stage, and we trade these two objectives off against each other.
When calculating the upper bound, however, the choice of inputs is fixed, and the only thing we can do is to make the best out of it when demand is revealed. However, if the choice of inputs happens to be optimal, we will also have an optimal solution. Birge & Louveaux [1] describe the theory behind this in chapter 9. They state that "The L-shaped method (...) is based on iteratively providing a lower bound on the recourse objective", θ. For details on the use of bounding, I refer the interested reader to chapter 9, and section 9.3 in particular. The lower bound improves continuously for each iteration, whereas the upper bound may stay the same for several iterations.
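The bound-based stopping idea can be written as a tiny function. The relative normalization below is one common choice, not necessarily the exact formula used in [6]:

```python
# Stop when the gap between the non-increasing upper bound and the
# non-decreasing lower bound is small relative to the bound magnitude.

def converged(ub, lb, tol=1e-6):
    return (ub - lb) / (1.0 + abs(lb)) <= tol

assert not converged(0.0, -14875.0)   # the wide first-iteration bounds above
assert converged(-100.0, -100.0)      # equal bounds: optimal
```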

List of notation
  l - iteration number in the L-shaped method
  h - resource vector
  p - probability
  s - scenario
  T - technology matrix
  π - dual variables

References
[1] J. R. Birge and F. Louveaux, Introduction to Stochastic Programming. New York: Springer, 1997.
[2] U. G. Christensen and A. B. Pedersen, "Lecture Note on Benders' Decomposition," 2008.
[3] C. C. Carøe and R. Schultz, "Dual Decomposition in Stochastic Integer Programming," Konrad-Zuse-Zentrum für Informationstechnik Berlin, 1996.
[4] J. L. Higle, "Stochastic Programming: Optimization When Uncertainty Matters," in Tutorials in Operations Research, INFORMS, 2005.
[5] P. Kall and S. W. Wallace, Stochastic Programming, 1st ed. John Wiley & Sons, 1994.
[6] E. Kalvelagen, "Benders Decomposition for Stochastic Programming with GAMS," 2003.