Common random numbers (CRN)

Simulation is often used to compare similar systems, e.g., for the purpose of optimization.

Suppose we want to estimate $\mu_2 - \mu_1$ by $\Delta = X_2 - X_1$, where $\mu_1 = E[X_1]$ and $\mu_2 = E[X_2]$. We have

$$\mathrm{Var}[\Delta] = \mathrm{Var}[X_1] + \mathrm{Var}[X_2] - 2\,\mathrm{Cov}[X_1, X_2].$$

If each $X_k$ has a (fixed) cdf $F_k$ for $k = 1, 2$, then taking $X_k = F_k^{-1}(U)$ for a single common r.v. $U \sim U(0, 1)$ maximizes the covariance (Fréchet 1951).

For typical simulations, $F_k^{-1}(U)$ is much too complicated to compute.
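The effect of the covariance term can be seen on a toy case where inversion is easy. The sketch below assumes two exponential outputs with hypothetical means 1.0 and 1.2, and uses `java.util.Random` as a stand-in generator; the class and method names are illustrative and do not come from the slides.

```java
import java.util.Random;

public class CrnDemo {
    // Inversion for an exponential with the given mean: F^{-1}(u) = -mean * ln(1 - u).
    static double expInv(double mean, double u) {
        return -mean * Math.log(1.0 - u);
    }

    // Empirical variance of Delta = X2 - X1 over n replications.
    // crn = true reuses the same uniform for X1 and X2; crn = false draws a second, independent one.
    static double varianceOfDifference(boolean crn, int n, long seed) {
        Random rng = new Random(seed);
        double sum = 0.0, sumSq = 0.0;
        for (int i = 0; i < n; i++) {
            double u1 = rng.nextDouble();
            double u2 = crn ? u1 : rng.nextDouble();
            double delta = expInv(1.2, u2) - expInv(1.0, u1);
            sum += delta;
            sumSq += delta * delta;
        }
        double mean = sum / n;
        return (sumSq - n * mean * mean) / (n - 1);
    }

    public static void main(String[] args) {
        int n = 100000;
        System.out.println("Var[Delta] with IRN: " + varianceOfDifference(false, n, 12345L));
        System.out.println("Var[Delta] with CRN: " + varianceOfDifference(true, n, 12345L));
    }
}
```

In this toy case, $\Delta = 0.2 \cdot (-\ln(1 - U))$ under CRN, so its exact variance is 0.04, against $1 + 1.44 = 2.44$ with independent uniforms.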

Common random numbers (CRNs)

What we can do is simulate the two systems with exactly the same streams of uniform random numbers. Important: make sure that the common random numbers (CRNs) are used for the same purpose in both systems (synchronization), and generate all r.v.'s by inversion.

Proposition. If $X_1$ and $X_2$ are monotone functions of each uniform, in the same direction, then $\mathrm{Cov}[X_1, X_2] > 0$.

In general (non-monotone functions), we can still have $\mathrm{Cov}[X_1, X_2] > 0$.

With independent random numbers (IRNs), $\mathrm{Cov}[X_1, X_2] = 0$.

Multiple comparisons: all of this applies if we want to compare several similar systems.

Example: The stochastic activity network

[Figure: stochastic activity network with nodes 0 (source) to 8 (sink) and arcs of random lengths Y0, ..., Y12.]

Suppose we increase $\mu_2$ from 7.0 to 10.0, and $\mu_4$ from 16.5 to 18.5. What is the impact on the project duration $T$?

$X_1$ = project duration $T$ under the original laws.
$X_2$ = project duration $T$ under the modified distributions.
We want to study the distribution of $\Delta = X_2 - X_1$ and estimate $E[\Delta]$.

Suppose that $Y_j = F_j^{-1}(U_j)$ for $X_1$ and $\tilde{Y}_j = \tilde{F}_j^{-1}(\tilde{U}_j)$ for $X_2$.

IRN: the $\tilde{U}_j$ are independent of the $U_j$. CRN: $\tilde{U}_j = U_j$ for each $j$.

We make $n = 100{,}000$ runs for each estimator.

With IRN, the realizations of $\Delta = X_2 - X_1$ range from −223.22 to 280.92. Mean = 1.326, variance = 967.9, 95% CI on $E[\Delta]$: (1.133, 1.519).

With CRN, $\Delta$ ranges from 0 to 49.88. Mean = 1.528, variance = 9.1, 95% CI on $E[\Delta]$: (1.510, 1.547).

CRNs reduce the variance by a factor of approximately 106.

With CRN, 67,880 realizations of $\Delta$ are 0 (because the modified $Y_j$ are not on the longest path) and the other 32,120 realizations of $\Delta$ are > 0. Explanation: when we increase its mean $\mu_j$, $Y_j = -\mu_j \ln(1 - U_j)$ (an exponential r.v.) cannot decrease for $U_j$ fixed, because $-\ln(1 - U_j) > 0$. Then $T$ cannot decrease.

With IRN, $\Delta$ can take both positive and negative values.
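As a check on the reported figures, the worked arithmetic below reads the IRN variance as 967.9 and assumes the intervals were formed with the normal quantile 1.96 (an assumption; the slides do not say how the intervals were computed).

```latex
\frac{967.9}{9.1} \approx 106,
\qquad
1.96 \sqrt{\frac{967.9}{10^5}} \approx 0.193,
\qquad
1.96 \sqrt{\frac{9.1}{10^5}} \approx 0.019 .
```

These half-widths agree with the reported intervals: $(1.519 - 1.133)/2 = 0.193$ for IRN and $(1.547 - 1.510)/2 \approx 0.019$ for CRN.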

[Figure: histograms of the realizations of ∆. Top panel: frequency with IRN, for ∆ between −150 and 150. Bottom panel: frequency with CRN, for ∆ between 0 and 20.]

Example: a telephone call center

Open 13 hours a day.
$n_j$ = number of agents available during hour $j$.
Arrivals: Poisson at rate $B\lambda_j$ per hour during hour $j$, where $B$ = busyness factor for the day; $B \sim \mathrm{gamma}(10, 10)$; $E[B] = 1$, $\mathrm{Var}[B] = 0.1$.
Expected number of arrivals: $a = E[A] = E[B] \sum_{j=0}^{12} \lambda_j$.

Service times: i.i.d. exponential with mean $\theta = 100$ seconds.
FIFO queue.
Patience time: 0 with prob. $p = 0.1$, exponential with mean 1000 with prob. $1 - p$. If wait > patience: abandonment.

Let $G$ = number of calls answered within 20 seconds on a given day.
Performance measure of interest: $\mu$ = fraction of calls answered within 20 seconds, in the long run.

Unbiased estimator of $\mu$: $X = G/a$.

j      0    1    2    3    4    5    6    7    8    9   10   11   12
n_j    4    6    8    8    8    7    8    8    6    6    4    4    4
λ_j  100  150  150  180  200  150  150  150  120  100   80   70   60

(Arrival rates are per hour and service times are in seconds.)
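With the rates in the table, the expected number of arrivals per day can be worked out directly. The calculation reads gamma(10, 10) as shape $\alpha = 10$ and rate $r = 10$; this is an assumption, though it is the reading that matches the stated mean and variance.

```latex
E[B] = \frac{\alpha}{r} = \frac{10}{10} = 1,
\qquad
\mathrm{Var}[B] = \frac{\alpha}{r^2} = \frac{10}{100} = 0.1,
\qquad
a = E[B] \sum_{j=0}^{12} \lambda_j = 1 \times 1660 = 1660 .
```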

Let $X_1$ = value of $G$ with this configuration, and $X_2$ = value of $G$ with one more agent for periods 5 and 6.

We want to estimate $\mu_2 - \mu_1 = E[X_2 - X_1] = E[\Delta]$.

Here, $\mathrm{Var}[\Delta]$ is about 225 times smaller with CRNs than with IRNs.

In an optimization algorithm, we may have to compare thousands of configurations (different staffings, routings of calls, etc.), and the efficiency gain can make a huge difference.

import java.io.IOException;
import umontreal.ssj.stat.Tally;
import umontreal.ssj.util.PrintfFormat;
// Package names assume SSJ 3; older SSJ releases use umontreal.iro.lecuyer.* instead.

// Compares two staffing configurations of the call center, with CRNs and with IRNs.
public class CallCenterCRN extends CallCenter {
   Tally statQoS1 = new Tally ("stats on QoS for config.1");
   Tally statDiffIndep = new Tally ("stats on difference with IRNs");
   Tally statDiffCRN = new Tally ("stats on difference with CRNs");
   int[] numAgents1, numAgents2;

   public CallCenterCRN (String fileName) throws IOException {
      super (fileName);
      numAgents1 = new int[numPeriods];
      numAgents2 = new int[numPeriods];
      // Both configurations start from the staffing read from the data file.
      for (int j = 0; j < numPeriods; j++)
         numAgents1[j] = numAgents2[j] = numAgents[j];
   }

   // Set the number of agents in each period j to the values in num.
   public void setNumAgents (int[] num) {
      for (int j = 0; j < numPeriods; j++) numAgents[j] = num[j];
   }

   // Simulate n days for each configuration and collect the differences.
   public void simulateDiffCRN (int n) {
      double value1, value2;
      statQoS1.init(); statDiffIndep.init(); statDiffCRN.init();
      for (int i = 0; i < n; i++) {
         setNumAgents (numAgents1);
         // Move every stream to a fresh substream for this replication.
         streamB.resetNextSubstream();
         streamArr.resetNextSubstream();
         streamPatience.resetNextSubstream();
         (genServ.getStream()).resetNextSubstream();
         simulateOneDay();                     // Simulate config 1
         value1 = (double)nGoodQoS / nCallsExpected;

         setNumAgents (numAgents2);
         // Rewind every stream to the start of the same substream (CRNs).
         streamB.resetStartSubstream();
         streamArr.resetStartSubstream();
         streamPatience.resetStartSubstream();
         (genServ.getStream()).resetStartSubstream();
         simulateOneDay();                     // Simulate config 2 with CRN
         value2 = (double)nGoodQoS / nCallsExpected;
         statQoS1.add (value1);
         statDiffCRN.add (value2 - value1);    // Stat. for CRN

         // No reset here: the streams continue past the numbers already used.
         simulateOneDay();                     // Simulate config 2 indep.
         value2 = (double)nGoodQoS / nCallsExpected;
         statDiffIndep.add (value2 - value1);  // Stat. for IRN
      }
   }

   static public void main (String[] args) throws IOException {
      int n = 1000;   // Number of replications.
      CallCenterCRN cc = new CallCenterCRN ("CallCenter.dat");
      cc.numAgents2[5]++; cc.numAgents2[6]++;   // Config 2
      cc.simulateDiffCRN (n);
      System.out.println (
         cc.statQoS1.reportAndCIStudent (0.9) +
         cc.statDiffIndep.reportAndCIStudent (0.9) +
         cc.statDiffCRN.reportAndCIStudent (0.9));
      double varianceIndep = cc.statDiffIndep.variance();
      double varianceCRN = cc.statDiffCRN.variance();
      // Print variance reduction factor.
      System.out.println ("Variance ratio: " +
         PrintfFormat.format (10, 2, 3, varianceIndep / varianceCRN));
   }
}

REPORT on Tally stat. collector ==> stats on QoS for config.1
      min        max      average   standard dev.   num. obs
     0.293      1.131      0.861       0.163          1000
  90.0% confidence interval for mean: ( 0.853, 0.870 )

REPORT on Tally stat. collector ==> stats on difference with IRNs
      min        max      average   standard dev.   num. obs
    -0.763      0.775     9.9E-3       0.234          1000
  90.0% confidence interval for mean: ( -0.002, 0.022 )

REPORT on Tally stat. collector ==> stats on difference with CRNs
      min        max      average   standard dev.   num. obs
    -0.013      0.101     9.9E-3       0.016          1000
  90.0% confidence interval for mean: ( 0.009, 0.011 )

Variance ratio: 223.69
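A rough consistency check on the report, using only the printed (rounded) standard deviations:

```latex
\left( \frac{0.234}{0.016} \right)^2 \approx 214 .
```

The printed ratio 223.69 is computed from the unrounded variances. Rounding the CRN standard deviation to 0.016 alone allows ratios from about 201 to 228, so the two numbers are compatible.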

Derivative estimation for call center

Service times are exponential with mean $\theta = 100$ seconds.

We would like to estimate the derivative of $\mu = E[G]$ w.r.t. $\theta$.

For that, we simulate the system at $\theta = \theta_1 = 100$ to get $X_1$, then at $\theta = \theta_2 = 100 + \delta$ to get $X_2$, and estimate the derivative by $D(\theta, \delta) = (X_2 - X_1)/\delta$.

We can simulate $X_1$ and $X_2$ either with CRNs or with IRNs.

We replicate this $n$ times, independently, and compute the empirical mean and variance.

How to implement CRNs?

Four types of random variates in this model, all generated by inversion:

(a) the busyness factor $B$ for the day;
(b) the times between successive arrivals of calls;
(c) the call durations;
(d) the patience times.

Synchronization problem: when service times change, waiting times and abandonment decisions can change.
For a given call, we may need to generate a patience time in one case and not in the other (if the call does not wait), or a service time in one case and not in the other (if the call abandons).

Possible strategies:

(a) generate a service time for all calls, or

(b) only for those who do not abandon.

Similarly, we can

(c) generate a patience time for all calls, or

(d) only for those who wait.
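A minimal sketch of strategy (a) + (c): every call consumes exactly one uniform from a dedicated service stream and one from a dedicated patience stream, whether or not the value is used later. Here `java.util.Random` stands in for the SSJ streams, and the class, method, and seed names are hypothetical. The patience time is generated by inverting the mixture cdf from the model description (0 with probability 0.1, exponential with mean 1000 otherwise).

```java
import java.util.Random;

public class SyncDemo {
    // Returns {service time, patience time} for each call.
    // One uniform per call is drawn from each stream, so call i always
    // uses the i-th number of each stream in every configuration.
    static double[][] callVariates(int nCalls, double meanService,
                                   long seedService, long seedPatience) {
        Random streamService = new Random(seedService);    // used only for service times
        Random streamPatience = new Random(seedPatience);  // used only for patience times
        double[][] out = new double[nCalls][2];
        for (int i = 0; i < nCalls; i++) {
            // Exponential service time by inversion.
            out[i][0] = -meanService * Math.log(1.0 - streamService.nextDouble());
            // Patience by inversion of F(x) = 0.1 + 0.9 * (1 - exp(-x / 1000)).
            double u = streamPatience.nextDouble();
            out[i][1] = (u < 0.1) ? 0.0 : -1000.0 * Math.log((1.0 - u) / 0.9);
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] base = callVariates(5, 100.0, 1L, 2L);
        double[][] perturbed = callVariates(5, 101.0, 1L, 2L);
        for (int i = 0; i < 5; i++) {
            System.out.println("call " + i + ": service " + base[i][0] + " -> " + perturbed[i][0]
                    + ", patience " + base[i][1] + " -> " + perturbed[i][1]);
        }
    }
}
```

With this design, changing the mean service time rescales each call's service time by the same factor and leaves its patience time untouched, which is the kind of call-by-call correspondence that synchronization aims for. The price is that some generated values are never used.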

Experimental results, with $n = 10^4$. $D_n(\theta, \delta)$ is the empirical mean and $S_n^2$ the empirical variance of $D(\theta, \delta)$.

                    δ = 10                 δ = 1                  δ = 0.1
Method              D_n(θ,δ)   δ²S_n²      D_n(θ,δ)   δ²S_n²      D_n(θ,δ)   δ²S_n²
IRN (a + c)           5.52     56913         4.98     45164          6.6     44046
IRN (a + d)           5.22     54696         7.22     45192        -18.2     45022
IRN (b + c)           5.03     56919         9.98     44241         15.0     45383
IRN (b + d)           5.37     55222         5.82     44659         13.6     44493
CRN, no sync.         5.60      3187         5.90      1204          1.9       726
CRN (a + c)           5.64      2154         6.29        37          6.2       1.8
CRN (a + d)           5.59      2161         6.08       158          7.4      53.8
CRN (b + c)           5.58      2333         6.25       104          6.3       7.9
CRN (b + d)           5.55      2323         6.44       143          5.9      35.3

Derivative estimation: theory

Suppose $\mu = \mu(\theta) = E[X(\theta, U)]$ depends on a continuous parameter $\theta$ and we want to estimate

$$\mu'(\theta_1) = \left.\frac{\partial \mu(\theta)}{\partial \theta}\right|_{\theta=\theta_1} = \left.\frac{\partial E_\theta[X(\theta)]}{\partial \theta}\right|_{\theta=\theta_1} = \lim_{\delta \to 0} \frac{E_{\theta_1+\delta}[X(\theta_1 + \delta)] - E_{\theta_1}[X(\theta_1)]}{\delta}.$$

For a vector of parameters: gradient (vector).

Why estimate derivatives?

(a) To evaluate the relative importance of different model parameters (sensitivity analysis).
(b) Confidence interval that accounts for estimation error in model parameters.
(c) Evaluate the effect of a change (sensitivity) in a decision parameter.
(d) A gradient estimator is often required in optimization algorithms.
(e) In finance, the derivatives of a contract price (the "Greeks") w.r.t. certain parameters are required to implement hedging strategies.

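We want to estimate

$$\mu'(\theta_1) = \lim_{\delta \to 0} \frac{E_{\theta_1+\delta}[X(\theta_1 + \delta)] - E_{\theta_1}[X(\theta_1)]}{\delta}.$$

Finite-difference estimator:

$$\frac{\Delta}{\delta} = D(\theta_1, \delta) = \frac{X(\theta_1 + \delta, U_2) - X(\theta_1, U_1)}{\delta} = \frac{X_2 - X_1}{\delta}$$

for some $\delta > 0$, where $U_1$ and $U_2$ are sequences of uniform r.v.'s.

This estimator is biased, but the bias $\beta \to 0$ when $\delta \to 0$. Moreover:

$$\mathrm{Var}[\Delta/\delta] = \frac{\mathrm{Var}[X_2 - X_1]}{\delta^2} = \frac{\mathrm{Var}[X_1] + \mathrm{Var}[X_2] - 2\,\mathrm{Cov}[X_1, X_2]}{\delta^2}.$$

Requires at least $d + 1$ simulations to estimate a $d$-dimensional gradient.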

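Proposition.

(i) If $U_1$ and $U_2$ are independent, then

$$\lim_{\delta \to 0} \delta^2\,\mathrm{Var}[D(\theta, \delta)] = 2\,\mathrm{Var}[X(\theta)].$$

That is, $\mathrm{Var}[D(\theta, \delta)]$ blows up at rate $1/\delta^2$.

(ii) Suppose $U_1 = U_2 = U$ (CRNs), $X(\theta, U)$ is continuous in $\theta$ and differentiable almost everywhere, and $D(\theta, \delta)$ is uniformly integrable (uniformly in $\theta$). Then $\mathrm{Var}[D(\theta, \delta)]$ remains bounded when $\delta \to 0$.

(iii) Suppose $U_1 = U_2 = U$ and $X(\theta, U)$ is discontinuous in $\theta$, but the probability that $X(\cdot, U)$ is discontinuous in $(\theta, \theta + \delta)$ converges to 0 as $O(\delta^\beta)$ when $\delta \to 0$, and $X^{2+\epsilon}(\theta)$ is uniformly integrable for some $\epsilon > 0$. Then $\mathrm{Var}[D(\theta, \delta)] = O(1 + \delta^{\beta - 2 - \epsilon})$, for any $\epsilon > 0$, when $\delta \to 0$.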
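Parts (i) and (ii) can be observed on a toy model. The sketch below assumes $X(\theta, U) = -\theta \ln(1 - U)$ (an exponential with mean $\theta$, so $\mu'(\theta) = 1$); this model, the class name, and the seeds are illustrative choices, not taken from the slides.

```java
import java.util.Random;

public class FiniteDiffDemo {
    // Toy model (assumption): X(theta, U) = -theta * ln(1 - U), continuous in theta.
    static double x(double theta, double u) {
        return -theta * Math.log(1.0 - u);
    }

    // Empirical variance of D(theta, delta) = (X2 - X1) / delta over n replications.
    // crn = true uses U1 = U2; crn = false draws U2 independently of U1.
    static double varianceOfD(boolean crn, double theta, double delta, int n, long seed) {
        Random rng = new Random(seed);
        double sum = 0.0, sumSq = 0.0;
        for (int i = 0; i < n; i++) {
            double u1 = rng.nextDouble();
            double u2 = crn ? u1 : rng.nextDouble();
            double d = (x(theta + delta, u2) - x(theta, u1)) / delta;
            sum += d;
            sumSq += d * d;
        }
        double mean = sum / n;
        return (sumSq - n * mean * mean) / (n - 1);
    }

    public static void main(String[] args) {
        double theta = 1.0;
        for (double delta : new double[] {1.0, 0.1, 0.01}) {
            double vIrn = varianceOfD(false, theta, delta, 100000, 42L);
            double vCrn = varianceOfD(true, theta, delta, 100000, 42L);
            System.out.println("delta=" + delta
                    + "  delta^2*Var (IRN)=" + delta * delta * vIrn
                    + "  Var (CRN)=" + vCrn);
        }
    }
}
```

For this model, $D = -\ln(1 - U)$ exactly under CRN, so its variance is 1 for every $\delta$, while under IRN $\delta^2 \mathrm{Var}[D] = \theta^2 + (\theta + \delta)^2 \to 2\theta^2 = 2\,\mathrm{Var}[X(\theta)]$.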

We can improve the efficiency by an arbitrarily large factor when $\delta \to 0$. For example, if $\delta = 10^{-4}$ (and we assume that the hidden constants are 1), then $\mathrm{Var}[D(\theta, \delta)]$ is 200 million times larger with (i) than with (ii). So (i) needs 200 million times more runs for the same accuracy.
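The figure of 200 million follows from part (i) of the proposition, with $\mathrm{Var}[X(\theta)]$ and the bound in part (ii) both set to 1 (the "hidden constants are 1" assumption):

```latex
\frac{\mathrm{Var}_{(i)}[D(\theta,\delta)]}{\mathrm{Var}_{(ii)}[D(\theta,\delta)]}
\approx \frac{2/\delta^{2}}{1}
= \frac{2}{(10^{-4})^{2}}
= 2 \times 10^{8} .
```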

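When (ii) holds, we may take the stochastic derivative

$$X'(\theta, U) = \partial X(\theta, U)/\partial\theta = \lim_{\delta \to 0} D(\theta, \delta)$$

as an estimator of $\mu'(\theta)$, if it is not too hard to compute. This is infinitesimal perturbation analysis (IPA).

This estimator is unbiased iff

$$E[X'(\theta, U)] \overset{\mathrm{def}}{=} E\left[\frac{\partial X(\theta, U)}{\partial\theta}\right] \overset{?}{=} \frac{\partial E[X(\theta, U)]}{\partial\theta} \overset{\mathrm{def}}{=} \mu'(\theta). \qquad (1)$$

Sufficient condition: Lebesgue's dominated convergence theorem. If there is a $\delta_1 > 0$ and a random variable $Y$ such that

$$\sup_{\delta \in (0, \delta_1]} \frac{|X(\theta_1 + \delta, U) - X(\theta_1, U)|}{\delta} \le Y$$

and $E[Y] < \infty$, then the interchange in (1) is valid.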

We may change the definition of $X(\theta)$ to make it continuous and benefit from case (ii), e.g., by replacing some r.v.'s by conditional expectations (conditional Monte Carlo).

For example, if $X(\theta)$ counts the customer abandonments, we may replace each indicator of abandonment (0 or 1) by the probability of abandonment given the waiting time.

Case (iii) shows that CRNs may provide substantial benefits even if $X(\theta)$ is discontinuous. In the call center example, we can prove that (iii) holds with $\beta = 1$.

Example: stochastic activity network, $E[T]$.

[Figure: stochastic activity network with nodes 0 (source) to 8 (sink) and arcs of random lengths Y0, ..., Y12.]

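$Y_j = F_{j,\theta_j}^{-1}(U_j)$.

Some $Y_j$ are exponential with mean $\theta_j = \mu_j$: $Y_j = Y_j(\theta_j) = -\theta_j \ln(1 - U_j)$.

Some $Y_j$ are normal with mean $\theta_j = \mu_j$: $Y_j = Y_j(\theta_j) = \theta_j + (\theta_j/4) Z_j = \theta_j + (\theta_j/4)\,\Phi^{-1}(U_j)$.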

We want to estimate the derivative of $E[T]$ w.r.t. each $\theta_j$. We consider a single $\theta_j$ at a time. We write $T = X(\theta, U)$ where $U = (U_0, \dots, U_{12})$ and $\partial T/\partial\theta_j = X'_j(\theta, U)$.

We see that $X'_j(\theta, U) = Y'_j(\theta_j)$ if arc $j$ is on the longest path, and $X'_j(\theta, U) = 0$ otherwise.

If $Y_j$ is exponential, then $Y_j = Y_j(\theta_j) = -\theta_j \ln(1 - U_j)$, $Y'_j(\theta_j) = -\ln(1 - U_j)$, and

$$0 \le \frac{X(\theta + \delta e_j, U) - X(\theta, U)}{\delta} \le \frac{-\delta \ln(1 - U_j)}{\delta} = -\ln(1 - U_j) = E_j,$$

where $E_j \sim \mathrm{Exponential}(1)$. The dominated convergence theorem applies: $X'_j(\theta, U)$ is unbiased and has finite variance.

If $Y_j$ is normal, then $Y_j = Y_j(\theta_j) = \theta_j + (\theta_j/4)\,\Phi^{-1}(U_j)$ and $Y'_j(\theta_j) = 1 + \Phi^{-1}(U_j)/4$. Also unbiased.
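The IPA estimator for an exponential arc can be sketched on a much smaller network than the one in the figure. The code below assumes a hypothetical 3-arc network with $T = \max(Y_1 + Y_2, Y_3)$ and compares the IPA estimate of $\partial E[T]/\partial\theta_1$ with a CRN finite difference; the network, names, and parameter values are illustrative only.

```java
import java.util.Random;

public class IpaDemo {
    // Hypothetical 3-arc network: T = max(Y1 + Y2, Y3), with exponential arc lengths
    // Yj = -theta_j * ln(1 - Uj) generated by inversion.
    static double duration(double[] theta, double[] u) {
        double y1 = -theta[0] * Math.log(1.0 - u[0]);
        double y2 = -theta[1] * Math.log(1.0 - u[1]);
        double y3 = -theta[2] * Math.log(1.0 - u[2]);
        return Math.max(y1 + y2, y3);
    }

    // Returns {IPA estimate, CRN finite-difference estimate} of dE[T]/dtheta_1.
    static double[] estimate(double[] theta, double delta, int n, long seed) {
        Random rng = new Random(seed);
        double[] thetaPlus = theta.clone();
        thetaPlus[0] += delta;
        double sumIpa = 0.0, sumFd = 0.0;
        for (int i = 0; i < n; i++) {
            double[] u = {rng.nextDouble(), rng.nextDouble(), rng.nextDouble()};
            double e1 = -Math.log(1.0 - u[0]);   // Y1'(theta_1) = -ln(1 - U1)
            double pathA = theta[0] * e1 - theta[1] * Math.log(1.0 - u[1]);
            double pathB = -theta[2] * Math.log(1.0 - u[2]);
            // IPA: Y1'(theta_1) if arc 1 is on the longest path, 0 otherwise.
            sumIpa += (pathA > pathB) ? e1 : 0.0;
            // Finite difference with the same uniforms for both runs (CRN).
            sumFd += (duration(thetaPlus, u) - duration(theta, u)) / delta;
        }
        return new double[] {sumIpa / n, sumFd / n};
    }

    public static void main(String[] args) {
        double[] est = estimate(new double[] {1.0, 1.0, 2.0}, 1e-3, 200000, 7L);
        System.out.println("IPA estimate:               " + est[0]);
        System.out.println("CRN finite-difference est.: " + est[1]);
    }
}
```

The two estimates differ only on replications where the longest path switches between $\theta_1$ and $\theta_1 + \delta$, which happens with probability of order $\delta$.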

Example: stochastic activity network, $P[T > x]$.

Now we want to estimate the derivative of $P[T > x]$ w.r.t. $\theta_j$. The standard estimator of $P[T > x]$ is $X(\theta, U) = I[T > x]$. Always 0 or 1.

Therefore the derivative $X'_j(\theta, U)$ is always 0 or undefined (the latter occurs with probability 0). Thus, $P[X'_j(\theta, U) = 0] = 1$. This is a biased estimator of $\mu'(\theta_j) = \partial P[T > x]/\partial\theta_j$.

The dominated convergence theorem does not apply here, because $\sup_{\delta > 0} [X(\theta + \delta e_j, U) - X(\theta, U)]/\delta$ is not integrable. For any $\delta > 0$, this ratio equals $1/\delta$ with positive probability w.r.t. $U$.

This problem can be solved (in this example) by replacing the estimator $I[T > x]$ by a conditional expectation $P[T > x \mid Y_0, Y_1, Y_2, Y_3, Y_6, Y_7, Y_{10}, Y_{12}]$, which is continuous in $\theta$.
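The conditioning idea can be sketched on a network small enough to have a closed form. The code below assumes a hypothetical 2-arc network with $T = \max(Y_1, Y_2)$ and exponential arcs, conditions on $Y_1$ only, and differentiates the conditional probability w.r.t. $\theta_2$. It is not the conditioning set used on the slide, and all names and values are illustrative.

```java
import java.util.Random;

public class CmcDerivativeDemo {
    // Hypothetical 2-arc network: T = max(Y1, Y2), Yj exponential with mean theta_j.
    // Given Y1: P[T > x | Y1] = 1 if Y1 > x, else exp(-x / theta2),
    // which is continuous and differentiable in theta2.
    static double cmcDerivative(double theta1, double theta2, double x, int n, long seed) {
        Random rng = new Random(seed);
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            double y1 = -theta1 * Math.log(1.0 - rng.nextDouble());
            // Derivative w.r.t. theta2 of the conditional probability, given Y1.
            sum += (y1 > x) ? 0.0 : (x / (theta2 * theta2)) * Math.exp(-x / theta2);
        }
        return sum / n;
    }

    // Closed form for this toy case, used only as a check:
    // P[T > x] = 1 - (1 - exp(-x/theta1)) * (1 - exp(-x/theta2)).
    static double exactDerivative(double theta1, double theta2, double x) {
        return (1.0 - Math.exp(-x / theta1)) * (x / (theta2 * theta2)) * Math.exp(-x / theta2);
    }

    public static void main(String[] args) {
        System.out.println("CMC derivative estimate: " + cmcDerivative(1.0, 2.0, 1.5, 100000, 3L));
        System.out.println("Exact derivative:        " + exactDerivative(1.0, 2.0, 1.5));
    }
}
```

By contrast, differentiating the indicator $I[T > x]$ sample by sample would return 0 on every replication.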

[Figure: stochastic activity network with a source, a sink, numeric node labels 1 to 9, and arc lengths V1, ..., V13.]
CMC estimator: $X_e = P[T > x \mid \{V_j,\ j \notin L\}]$.

Computed as follows.

For each $l \in L$, say going from $a_l$ to $b_l$, we compute the length $\alpha_l$ of the longest path from the source to $a_l$, then the length $\beta_l$ of the longest path from $b_l$ to the sink.

No path going through $l$ is longer than $x$ iff $\alpha_l + V_l + \beta_l \le x$.

Conditionally on $\{V_j,\ j \in B\}$, this holds with probability $P[V_l \le x - \alpha_l - \beta_l] = F_l[x - \alpha_l - \beta_l]$. Since the $V_l$ are independent, we obtain

$$X_e = 1 - P[V_l \le x - \alpha_l - \beta_l \text{ for all } l] = 1 - \prod_{l \in L} F_l[x - \alpha_l - \beta_l].$$
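The computation above can be sketched as follows. Everything specific here is an assumption made for the example: the 7-arc network `ARCS`, the cut `CUT` (chosen so that every source-to-sink path contains exactly one arc of the cut), exponential arc lengths with mean 1, and all function names. It is not the network of the slides.

```python
import math
import random

# Hypothetical network: arcs (tail, head), nodes numbered in topological
# order, node 0 = source, node 4 = sink.
ARCS = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
N_NODES = 5
CUT = frozenset({1, 2, 3})   # the set L: arcs from nodes {0, 1} to nodes {2, 3, 4}


def longest_from_source(v, skip=frozenset()):
    """Longest path length from the source to each node, ignoring arcs in skip."""
    dist = [-math.inf] * N_NODES
    dist[0] = 0.0
    for i in sorted(range(len(ARCS)), key=lambda i: ARCS[i][0]):
        if i not in skip:
            a, b = ARCS[i]
            dist[b] = max(dist[b], dist[a] + v[i])
    return dist


def longest_to_sink(v, skip=frozenset()):
    """Longest path length from each node to the sink, ignoring arcs in skip."""
    dist = [-math.inf] * N_NODES
    dist[N_NODES - 1] = 0.0
    for i in sorted(range(len(ARCS)), key=lambda i: -ARCS[i][0]):
        if i not in skip:
            a, b = ARCS[i]
            dist[a] = max(dist[a], v[i] + dist[b])
    return dist


def exp_cdf(t, mean=1.0):
    """cdf F_l of an exponential arc length (an assumption of this sketch)."""
    return 1.0 - math.exp(-t / mean) if t > 0.0 else 0.0


def cmc_sample(v, x):
    """X_e = 1 - prod over l in L of F_l[x - alpha_l - beta_l]."""
    alpha = longest_from_source(v, CUT)
    beta = longest_to_sink(v, CUT)
    prod = 1.0
    for l in CUT:
        a, b = ARCS[l]
        prod *= exp_cdf(x - alpha[a] - beta[b])
    return 1.0 - prod


def compare(x, n, seed=42):
    """Return (mean, variance) of the crude indicator and of the CMC estimator."""
    rng = random.Random(seed)
    crude, cmc = [], []
    for _ in range(n):
        v = [-math.log(1.0 - rng.random()) for _ in ARCS]
        t = longest_from_source(v)[N_NODES - 1]
        crude.append(1.0 if t > x else 0.0)
        cmc.append(cmc_sample(v, x))

    def stats(z):
        m = sum(z) / len(z)
        return m, sum((zi - m) ** 2 for zi in z) / (len(z) - 1)

    return stats(crude), stats(cmc)
```

Calling `compare(5.0, 20000)` returns two (mean, variance) pairs. The two means estimate the same probability, and the CMC variance should be the smaller of the two.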

Discrete-event models

The stochastic derivative can be complicated to compute, because an infinitesimal change in θ may have complicated impacts on the sequence of events.

Propagation technique: infinitesimal perturbation analysis (IPA).
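As an illustration of how a perturbation propagates through the events, here is a minimal sketch for a single-server FIFO queue, which is not an example taken from the slides. It assumes Exponential(1) interarrival times, service times $S_i = \theta E_i$ with $E_i \sim \mathrm{Exponential}(1)$, and the Lindley recursion $W_{i+1} = \max(0, W_i + S_i - A_{i+1})$. The derivative of the waiting time accumulates $\partial S_i/\partial\theta = E_i$ within a busy period and is reset to 0 when the server becomes idle. The function name and seed are hypothetical.

```python
import math
import random


def simulate_queue(theta, n_customers, seed=2024):
    """Return (average waiting time, IPA estimate of its derivative w.r.t. theta)."""
    rng = random.Random(seed)
    w = 0.0        # waiting time of the current customer
    dw = 0.0       # dW/dtheta for the current customer
    sum_w = 0.0
    sum_dw = 0.0
    for _ in range(n_customers):
        sum_w += w
        sum_dw += dw
        e = -math.log(1.0 - rng.random())    # Exponential(1) service factor
        s = theta * e                        # service time, dS/dtheta = e
        a = -math.log(1.0 - rng.random())    # next interarrival time
        nxt = w + s - a
        if nxt > 0.0:
            w = nxt
            dw += e        # the perturbation propagates to the next customer
        else:
            w = 0.0
            dw = 0.0       # idle period: the perturbation is lost
    return sum_w / n_customers, sum_dw / n_customers
```

With $\theta = 0.5$ the steady-state mean wait of this M/M/1 queue is $\theta^2/(1 - \theta)$, whose derivative at 0.5 is 3. A long run of `simulate_queue(0.5, 50000)` should give a derivative estimate near that value, and it can be cross-checked against a finite difference computed with the same seed.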

Sample average optimization

Suppose we have an optimization problem of the form

$$\begin{array}{ll} \min & E[H(y, U)] \\ \text{subject to} & E[G_k(y, U)] \ge b_k \text{ for all } k, \\ & y \in S \text{ (some set).} \end{array}$$

Simulate $n$ copies of functions $H$ and $G_k$, with CRNs across $y$, and take averages. Sample average problem (deterministic in $y$):

$$\begin{array}{ll} \min & H_n(y) \\ \text{subject to} & G_{k,n}(y) \ge b_k \text{ for all } k, \\ & y \in S. \end{array}$$

Can be solved by a deterministic optimization method, but for each solution $y$, the objective and constraints are evaluated by simulation. Convergence: well-developed theory, CLTs, large deviations, etc.

Well-synchronized CRNs are essential.

Example: staffing and scheduling in a multiskill call center.

More general: Objective and constraints may contain functions of several expectations, or quantiles, etc.
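A minimal sketch of the sample average approach, on a toy capacity problem that is an assumption of this example and not the call center model: choose an integer capacity $y$ to minimize $H(y, U) = c\,y + p \max(D - y, 0)$ subject to $E[G(y, U)] \ge b$ with $G(y, U) = I[D \le y]$, where $D = -m \ln(1 - U)$ is an exponential demand with mean $m$. All names and parameter values are hypothetical. The uniforms are drawn once and reused for every $y$, so $H_n$ and $G_n$ are deterministic functions of $y$.

```python
import math
import random


def sample_average_solve(candidates, n, b, seed=7, mean=5.0, c=1.0, p=5.0):
    """Solve the sample average problem by enumeration over candidates.

    Returns (y, H_n(y), G_n(y)) for the best feasible y, or None if no
    candidate satisfies the sample constraint.
    """
    rng = random.Random(seed)
    # Common random numbers: one demand sample, shared by all candidate y.
    demand = [-mean * math.log(1.0 - rng.random()) for _ in range(n)]
    best = None
    for y in candidates:
        h_n = sum(c * y + p * max(d - y, 0.0) for d in demand) / n
        g_n = sum(1.0 for d in demand if d <= y) / n
        if g_n >= b and (best is None or h_n < best[1]):
            best = (y, h_n, g_n)
    return best
```

For example, `sample_average_solve(range(0, 31), n=20000, b=0.9)` enumerates capacities 0 to 30. With these parameter values the constraint is binding: the unconstrained minimizer of the expected cost is near $5 \ln 5 \approx 8$, whereas the service constraint requires $y \ge 5 \ln 10 \approx 11.5$.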
