Simulation

IE 519 1 Simulation Modelling


Transcript of Simulation

Page 1: Simulation


Simulation Modelling

Page 2: Simulation


Contents

Input Modelling 3
Random Number Generation 41
Generating Random Variates 80
Output Analysis 134
Resampling Methods 205
Comparing Multiple Systems 219
Simulation Optimization 248
Metamodels 278
Variance Reduction 292
Case Study 350

Page 3: Simulation


Input Modelling

Page 4: Simulation


Input Modelling

You make custom Widgets. How do you model the input process? Is it deterministic? Is it random?

Look at some data

Page 5: Simulation


Orders

1/5/2004, 1/12/2004, 1/20/2004, 1/29/2004

2/3/2004, 2/15/2004, 2/19/2004, 2/25/2004, 2/28/2004

3/6/2004, 3/15/2004, 3/27/2004, 3/31/2004

4/10/2004, 4/14/2004, 4/17/2004, 4/21/2004, 4/22/2004, 4/28/2004

5/2/2004, 5/3/2004, 5/24/2004, 5/26/2004

6/4/20046/15/2004

Now what?

Page 6: Simulation

Histogram

(Bar chart of order counts per interval; bins [0,2], [2,4], [4,6], [6,8], [8,10], [10,12], [12,14], [14,16], [16,18], [18,20], 20+; vertical axis 0-8.)

Page 7: Simulation

Other Observations

Trend? Stationary or non-stationary process

Seasonality? May require multiple processes

Page 8: Simulation

IE 519 8

Choices for Modelling

Use the data directly (trace-driven simulation)Use the data to fit an empirical distributionUse the data to fit a theoretical distribution

Page 9: Simulation

Assumptions

To fit a distribution, the data should be drawn from IID observations

Could it be from more than one distribution? Statistical test

Is it independent? Statistical test

Page 10: Simulation

Activity I

Hypothesize families of distributions. Look at the data and determine what is a reasonable process:

Summary statistics

Histograms

Quantile summaries and box plots

Page 11: Simulation

Activity II

Estimate the parameters. Maximum likelihood estimator (MLE): sometimes a very simple statistic, sometimes requires numerical calculations.

Page 12: Simulation

Activity III

Determine quality of fit. Compare the theoretical distribution with the observations graphically. Goodness-of-fit tests:

Chi-square test

Kolmogorov-Smirnov test

Software

Page 13: Simulation

Chi-Square Test

Formal comparison of a histogram and the probability density/mass function

Divide the range of the fitted distribution into intervals
$$[a_0, a_1), [a_1, a_2), \ldots, [a_{k-1}, a_k)$$

Count the number of observations in each interval
$$N_j = \text{number of } X_i\text{'s in } [a_{j-1}, a_j)$$

Page 14: Simulation

Chi-Square Test

Compute the expected proportion
$$p_j = \int_{a_{j-1}}^{a_j} \hat{f}(x)\,dx \quad \text{for continuous data}, \qquad p_j = \sum_{a_{j-1} \le x_i < a_j} \hat{p}(x_i) \quad \text{for discrete data}$$

Test statistic is
$$\chi^2 = \sum_{j=1}^{k} \frac{(N_j - n p_j)^2}{n p_j}$$

Reject if too large
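The statistic above can be computed directly. A minimal sketch, assuming equal-probability intervals of the fitted CDF (so every expected count is n/k) and a hypothetical exponential fit with mean 5:

```python
import math
import random

def chi_square_stat(data, cdf, k):
    """Chi-square GoF statistic with k equal-probability intervals of the
    fitted CDF; each expected count is then n * p_j = n / k."""
    n = len(data)
    counts = [0] * k
    for x in data:
        j = min(int(cdf(x) * k), k - 1)   # which interval x falls into
        counts[j] += 1
    expected = n / k
    return sum((c - expected) ** 2 / expected for c in counts)

random.seed(42)
beta = 5.0  # hypothetical fitted exponential mean
sample = [random.expovariate(1.0 / beta) for _ in range(500)]
stat = chi_square_stat(sample, lambda x: 1.0 - math.exp(-x / beta), k=10)
# compare stat against a chi-square critical value with k - 1 = 9 d.f.
```

Since the data really does come from the fitted distribution here, the statistic should be a typical draw from a chi-square with 9 degrees of freedom, not an extreme one.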

Page 15: Simulation

How good is the data?

Assumption of IID observations. Sometimes time-dependent (non-stationary). Assessment:

Correlation plot

Scatter diagram

Nonparametric tests

Page 16: Simulation

Correlation Plot

Calculate and plot the sample correlation
$$\hat{\rho}_j = \text{correlation of observations } j \text{ apart}, \qquad H_0: \rho_j = 0$$
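A correlation plot is just the sample autocorrelation at lags 1, 2, …; a short sketch of the estimator:

```python
def sample_autocorrelation(x, max_lag):
    """Sample correlation of observations j apart, for j = 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    rhos = []
    for j in range(1, max_lag + 1):
        cov = sum((x[i] - mean) * (x[i + j] - mean) for i in range(n - j)) / n
        rhos.append(cov / var)
    return rhos

# an alternating series is strongly negatively correlated at lag 1
rhos = sample_autocorrelation([1.0, -1.0] * 50, max_lag=2)
```

For IID data all the estimated correlations should be near zero; values far from zero at some lag indicate dependence.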

Page 17: Simulation

Scatter Diagram

Plot pairs $(X_i, X_{i+1})$

Should be scattered randomly through the plane. If there is a pattern, then this indicates correlation.

Page 18: Simulation

Multiple Data Sets

Often you have multiple data sets (e.g., different days, weeks, operators)

Is the data drawn from the same process (homogeneous), and can it thus be combined? Kruskal-Wallis test
$$\begin{array}{l} X_{11}, X_{12}, \ldots, X_{1n_1} \\ X_{21}, X_{22}, \ldots, X_{2n_2} \\ \vdots \\ X_{k1}, X_{k2}, \ldots, X_{kn_k} \end{array}$$

Page 19: Simulation

Kruskal-Wallis (K-W) Statistic

Assign rank 1 to the smallest observation, rank 2 to the second smallest, etc. Calculate
$$R(X_{ij}) = \text{rank of } X_{ij}, \qquad \bar{R}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} R(X_{ij}), \qquad n = \sum_{i=1}^{k} n_i$$
$$T = \frac{12}{n(n+1)} \sum_{i=1}^{k} n_i \bar{R}_i^{\,2} - 3(n+1)$$
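The statistic T can be computed in a few lines. A sketch, using midranks for tied values (a common convention, not spelled out on the slide):

```python
def kruskal_wallis_T(groups):
    """K-W statistic T = 12/(n(n+1)) * sum_i n_i * Rbar_i^2 - 3(n+1)."""
    pooled = sorted((v, gi) for gi, g in enumerate(groups) for v in g)
    n = len(pooled)
    ranks = [0.0] * n
    i = 0
    while i < n:                     # assign midranks to tied values
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for t in range(i, j + 1):
            ranks[t] = (i + j) / 2 + 1
        i = j + 1
    rank_sum = [0.0] * len(groups)
    for pos, (_, gi) in enumerate(pooled):
        rank_sum[gi] += ranks[pos]
    # n_i * Rbar_i^2 equals (rank sum)^2 / n_i
    return (12.0 / (n * (n + 1))
            * sum(s * s / len(g) for s, g in zip(rank_sum, groups))
            - 3 * (n + 1))
```

Identical groups give T = 0, while well-separated groups push T toward the chi-square rejection region.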

Page 20: Simulation

K-W Test

The null hypothesis is

H0: All the population distributions are identical

H1: At least one is larger than at least one other

We reject H0 at level $\alpha$ if $T > \chi^2_{k-1, 1-\alpha}$

In other words, under H0 the test statistic approximately follows a chi-square distribution with k−1 degrees of freedom

Page 21: Simulation

Absence of Data

We have assumed that we had data to fit a distribution. Sometimes no data is available. Try to obtain the minimum, maximum, and mode and/or mean of the distribution from:

Documentation

SMEs (subject matter experts)

Page 22: Simulation


Triangular Distribution

Page 23: Simulation

Symmetric Beta Distributions

(Density plots for $\alpha_1 = \alpha_2 = 2$, $\alpha_1 = \alpha_2 = 3$, $\alpha_1 = \alpha_2 = 5$, $\alpha_1 = \alpha_2 = 10$.)

Page 24: Simulation

Skewed Beta Distributions

(Density plot for $\alpha_1 = 2$, $\alpha_2 = 4$.)

Page 25: Simulation

Beta Parameters

$$\text{Mean: } \mu = a + (b - a)\frac{\alpha_1}{\alpha_1 + \alpha_2} \qquad \text{Mode: } c = a + (b - a)\frac{\alpha_1 - 1}{\alpha_1 + \alpha_2 - 2}$$

Estimates:
$$\hat{\alpha}_1 = \frac{(\hat{\mu} - a)(2c - a - b)}{(c - \hat{\mu})(b - a)}, \qquad \hat{\alpha}_2 = \hat{\alpha}_1 \frac{b - \hat{\mu}}{\hat{\mu} - a}$$

Page 26: Simulation

Benefits of Fitting a Parametric Distribution

We have focused mainly on the approach where we fit a distribution to data. Benefits:

Fill in gaps and smooth data

Make sure tail behavior is represented (extreme events are very important to the simulation but may not be represented in the data)

Can easily incorporate changes in the input process (change mean, variability, etc.)

Reflect dependencies in the inputs

Page 27: Simulation

What About Dependencies?

Assumed so far an IID process. Many processes are not:

A customer places a monthly order. Since the customer keeps inventory of the product, a large order is often followed by a small order

A distributor with several warehouses places monthly orders, and these warehouses can supply the same customers

The behavior of customers logging on to a web site depends on age, gender, income, and where they live

Do not ignore it!

Page 28: Simulation

Solutions

A customer places a monthly order: should use a time-series model that captures the autocorrelation

A distributor with several warehouses: need a vector time-series model

Customers logging on to a web site: need a random vector model where each component may have a different distribution

Page 29: Simulation

Taxonomy of Input Models

(Tree diagram.) Input models divide into time-independent models and stochastic processes:

Time-independent: univariate or multivariate, each discrete, continuous, or mixed. Example models: binomial; normal, gamma, beta; empirical/trace-driven; independent binomial; multivariate normal; bivariate exponential.

Stochastic processes: discrete-time or continuous-time, each discrete-state or continuous-state. Example models: Markov chains (stationary?); time-series models; Poisson process (stationary?); Markov processes.

Page 30: Simulation

What if it Changes over Time?

Do not ignore it! Non-stationary input process. Examples:

Arrivals of customers to a restaurant

Arrivals of email to a server

Arrivals of bug discovery in software

Could model as nonhomogeneous Poisson process

Page 31: Simulation

Goodness-of-Fit Test

The distribution fitted is tested using goodness-of-fit (GoF) tests. How good are those tests? The null hypothesis is that the data is drawn from the chosen distribution with the estimated parameters. Is it true?

Page 32: Simulation

Power of GoF Tests

The null hypothesis is always false! If the GoF test is powerful enough, then it will always be rejected. What we see in practice:

Few data points: no distribution is rejected

A great deal of data: all distributions are rejected

At best, GoF tests should be used as a guide

Page 33: Simulation

Input Modeling Software

Many software packages exist for input modeling (fitting distributions). Each has at least 20-30 distributions. You input IID data, and the software gives you a list of distributions ranked according to GoF tests. Pitfalls?

Page 34: Simulation

Why Fit a Distribution at All?

There is a growing sentiment that we should never fit distributions (not consensus, just growing). A couple of issues:

You don't always benefit from data

Fitting a distribution can be misleading

Page 35: Simulation

Is Data Reality?

Data is often:

Distorted: poorly communicated, mistranslated, or misrecorded

Dated: data is always old by definition

Deleted: some of the data is often missing

Dependent: often only summaries, or collected at certain times

Deceptive: this may all be on purpose!

Page 36: Simulation

Problems with Fitting

Fitting an input distribution can be misleading for numerous reasons:

There is rarely a theoretical justification for the distribution. Simulation is often sensitive to the tails, and this is where the problem is!

Selecting the correct model is futile

The model gives the simulation practitioner a false sense of the model being well-defined

Page 37: Simulation

Alternative

Use empirical/trace-driven simulation when there is sufficient data

Treat other cases as if there is no data, and use a beta distribution

Page 38: Simulation

Empirical Distribution

Observations $X_1, X_2, \ldots, X_n$

Empirical distribution function (CDF):
$$\hat{F}(x) = \frac{\text{number of } X_i \le x}{n}$$

Or we can order the observations $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$ and define
$$\hat{F}(x) = \begin{cases} 0 & x < X_{(1)} \\ \dfrac{i-1}{n-1} + \dfrac{x - X_{(i)}}{(n-1)\,(X_{(i+1)} - X_{(i)})} & X_{(i)} \le x < X_{(i+1)} \\ 1 & x \ge X_{(n)} \end{cases}$$

Page 39: Simulation


Beta Distribution Shapes

Page 40: Simulation

What to Do?

Old rule of thumb based on the number of data points available:

<20: Not enough data to fit

21-50: Fit, rule out poor choices

50-200: Fit a distribution

>200: Use empirical distribution

Page 41: Simulation


Random Number Generation

Page 42: Simulation

Random-Number Generation

Any simulation with random components requires generating a sequence of random numbers. E.g., we have talked about arrival times and service times being drawn from a particular distribution. We do this by first generating a random number (uniform on [0,1]) and then transforming it appropriately.

Page 43: Simulation

Three Alternatives

True random numbers: throw a die; not possible to do with a computer

Pseudo-random numbers: a deterministic sequence that is statistically indistinguishable from a random sequence

Quasi-random numbers: a regular distribution of numbers over the desired interval

Page 44: Simulation

Why is this Important?

Validity: the simulation model may not be valid due to cycles and dependencies in the model

Precision: you can improve the output analysis by carefully choosing the random numbers

Page 45: Simulation

Pseudo-Random Numbers

Want an iterative algorithm that outputs numbers on a fixed interval. When we subject this sequence to a number of statistical tests, we cannot distinguish it from a random sequence. In reality, it is completely deterministic.

Page 46: Simulation

Linear Congruential Generators (LCG)

Introduced in the early 50s and still in very wide use today. Recursive formula:
$$Z_i = (a Z_{i-1} + c) \bmod m$$
where $a$ is the multiplier, $c$ the increment, $m$ the modulus, and $Z_0$ the seed. Every number is determined by these four values.
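The recursion is a one-liner in code. A sketch with small, hypothetical parameters (m = 16, a = 5, c = 3, chosen so the full-period conditions discussed below hold):

```python
from itertools import islice

def lcg(m, a, c, seed):
    """Linear congruential generator Z_i = (a*Z_{i-1} + c) mod m.
    Yields U_i = Z_i / m."""
    z = seed
    while True:
        z = (a * z + c) % m
        yield z / m

# hypothetical mixed LCG: c odd and a-1 divisible by 4, so full period 16
us = list(islice(lcg(16, 5, 3, 7), 16))
```

Because the generator has full period, the 16 successive values visit every residue 0, 1, …, 15 exactly once before the cycle repeats.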

Page 47: Simulation

Transform to Unit Uniform

Simply divide by m:
$$U_i = \frac{Z_i}{m}$$

What values can we take?

Page 48: Simulation

Examples

$$Z_i = (12 Z_{i-1}) \bmod 16, \quad Z_0 = 1$$
$$Z_i = (3 Z_{i-1}) \bmod 13, \quad Z_0 = 1$$
$$Z_i = (11 Z_{i-1}) \bmod 16, \quad Z_0 = 1$$

Page 49: Simulation

Characteristics

All LCGs loop. The length of the cycle is the period. LCGs with period m have full period. This happens if and only if:

The only positive integer that divides both m and c is 1

If q is a prime that divides m, then q divides a-1

If 4 divides m then 4 divides a-1

Page 50: Simulation

Types of LCGs

If c = 0 then it is called a multiplicative LCG, otherwise a mixed LCG

Mixed and multiplicative LCG behave rather differently

Page 51: Simulation

Comments on Parameters

Mixed generators:

Want m to be large; a good choice is m = 2^b, where b is the number of bits

Obtain full period if c is odd and a−1 is divisible by 4

Multiplicative LCGs:

Simpler

Cannot have full period (the first condition cannot be satisfied)

Still an attractive option

Page 52: Simulation

Performance Tests

Empirical tests: use the RNG to generate some numbers and then test the null hypothesis

H0: The sequence is IID U(0,1)

Page 53: Simulation

Test 1: Chi-Square Test

Similar to before:

Generate $U_1, U_2, \ldots, U_n$

Split [0,1] into k subintervals ($k \ge 100$)

$f_j$ = number of $U_i$'s in the jth subinterval

Test statistic is
$$\chi^2 = \frac{k}{n} \sum_{j=1}^{k} \left( f_j - \frac{n}{k} \right)^2$$

With k−1 degrees of freedom

Page 54: Simulation

Test 2: Serial Test

Consider the vectors
$$\mathbf{U}_1 = (U_1, \ldots, U_d), \quad \mathbf{U}_2 = (U_{d+1}, \ldots, U_{2d}), \quad \ldots$$

Similar to before: divide $[0,1]^d$ into $k^d$ subcubes and let $f_{j_1 \cdots j_d}$ be the number of $\mathbf{U}_i$'s in subcube $(j_1, \ldots, j_d)$. Test statistic:
$$\chi^2(d) = \frac{k^d}{n} \sum_{j_1=1}^{k} \cdots \sum_{j_d=1}^{k} \left( f_{j_1 \cdots j_d} - \frac{n}{k^d} \right)^2$$

Page 55: Simulation

Test 3: Runs Test

Calculate for $U_1, U_2, \ldots, U_n$:
$$r_i = \text{number of runs up of length } i, \quad i = 1, 2, \ldots, 5; \qquad r_6 = \text{number of runs up of length} \ge 6$$

Test statistic (chi-square w/ 6 d.f.):
$$R = \frac{1}{n} \sum_{i=1}^{6} \sum_{j=1}^{6} a_{ij} (r_i - n b_i)(r_j - n b_j)$$

Where the a and b values are given empirically

Page 56: Simulation

Test 4: Correlation Test

For uniform random variables:
$$E[U_i] = \frac{1}{2}, \qquad \operatorname{Var}(U_i) = \frac{1}{12}$$
$$C_{ij} = \operatorname{Cov}(U_i, U_j) = E[U_i U_j] - E[U_i]E[U_j] = E[U_i U_j] - \frac{1}{4}$$

If $U_i$ and $U_j$ are independent ($i \ne j$), then $E[U_i U_j] = \frac{1}{4}$, so $C_{ij} = 0$; for $i = j$, $E[U_i^2] = \frac{1}{3}$, so $C_{ii} = \frac{1}{12}$.

Page 57: Simulation

Test 4: Correlation Test

Empirical estimate is
$$\hat{\rho}_j = \frac{12}{h+1} \sum_{k=0}^{h} U_{1+kj} U_{1+(k+1)j} - 3, \qquad h = \left\lfloor \frac{n-1}{j} \right\rfloor - 1$$

Test statistic
$$A_j = \frac{\hat{\rho}_j}{\sqrt{\operatorname{Var}(\hat{\rho}_j)}}, \qquad \operatorname{Var}(\hat{\rho}_j) = \frac{13h + 7}{(h+1)^2}$$

Approximately standard normal

Page 58: Simulation

Passing the Test

A RNG with a long period that passes a fixed set of statistical tests is no guarantee of this being a good RNG

Many commonly used generators are not good at all, even though they pass all of the most basic tests

Page 59: Simulation

Classic LCG16807

Multiplicative LCGs cannot have full period, but they can get very close:
$$Z_i = 16807 Z_{i-1} \bmod (2^{31} - 1), \qquad U_i = \frac{Z_i}{2^{31} - 1}$$

Has period $2^{31} - 2$, that is, the best possible. Dates back to 1969. Suggested in many simulation texts and was (is) the standard for simulation software. Still in use in many software packages.

Page 60: Simulation

Java RNG

Mixed LCG with full period:
$$Z_i = (25214903917\, Z_{i-1} + 11) \bmod 2^{48}$$

Variant of the old drand48() Unix LCG. Two successive values are combined into each output:
$$U_i = \frac{2^{27} \lfloor Z_{2i-1}/2^{22} \rfloor + \lfloor Z_{2i}/2^{21} \rfloor}{2^{53}}$$

Page 61: Simulation

Two more LCGs

VB:
$$Z_i = (1140671485\, Z_{i-1} + 12820163) \bmod 2^{24}, \qquad U_i = \frac{Z_i}{2^{24}}$$

Excel:
$$U_i = (9821.0\, U_{i-1} + 0.211327) \bmod 1$$

Page 62: Simulation

Simple Simulation Tests

Collision Test:

Divide [0,1) into d equal intervals, giving $d^t$ boxes in $[0,1)^t$

Generate n points in $[0,1)^t$

C = number of times a point falls in a box that already has a point (collision)

Birthday Spacing Test:

Have k boxes, labeled with $I_{(1)} \le I_{(2)} \le \cdots \le I_{(n)}$

Define the spacings $S_j = I_{(j+1)} - I_{(j)}$

Consider $Y = \#\{ j : S_{(j)} = S_{(j+1)}, \; j = 1, \ldots, n-2 \}$

Page 63: Simulation

Performance: Collision

After $2^{15}$ numbers, VB starts failing. After $2^{17}$ numbers, Excel starts failing. After $2^{19}$ numbers, LCG16807 starts failing. The Java RNG does OK up to at least $2^{20}$ numbers.

Note that this means that a clear pattern is observed from the VB RNG with less than 100,000 numbers generated!

Page 64: Simulation

Performance: B-day Spacing

After $2^{10}$ numbers, VB starts failing. After $2^{14}$ numbers, Excel starts failing. After $2^{14}$ numbers, LCG16807 starts failing. After $2^{18}$ numbers, Java starts failing.

For this test, the VB RNG is only good for about 1000 numbers! The performance gets even worse if we look at less significant digits.

Page 65: Simulation

Combined LCG

A better RNG is obtained as follows:
$$Z_{1,i} = (a_{1,1} Z_{1,i-1} + \cdots + a_{1,k} Z_{1,i-k}) \bmod m_1$$
$$Z_{2,i} = (a_{2,1} Z_{2,i-1} + \cdots + a_{2,k} Z_{2,i-k}) \bmod m_2$$
$$U_i = \left( \frac{Z_{1,i}}{m_1} - \frac{Z_{2,i}}{m_2} \right) \bmod 1$$

Recommended parameters (k=3):
$$m_1 = 2^{32} - 209, \qquad (a_{1,1}, a_{1,2}, a_{1,3}) = (0, 1403580, -810728)$$
$$m_2 = 2^{32} - 22853, \qquad (a_{2,1}, a_{2,2}, a_{2,3}) = (527612, 0, -1370589)$$

Cycle length of $2^{191}$ with good structure

Page 66: Simulation

Why do RNGs Fail?

We have seen that many commonly used RNGs fail simulation tests, even though they pass the standard empirical tests

Why do these RNGs fail? Need to analyze the structure of the RNG

Page 67: Simulation

Lattice Structure

For all LCGs, the numbers generated fall in a fixed number of planes. We want this to be as many planes as possible, to 'fill up' the space. This should be true in many dimensions.

Page 68: Simulation


Example: Two Full-Period LCGs

Page 69: Simulation


LCG RANDU in 3 Dimensions

Page 70: Simulation

Theoretical Tests

Based on analyzing the structure of the numbers that can be generated:

Lattice test

Spectral test

Page 71: Simulation

Selecting the Seed

$$Z_i = (13 Z_{i-1} + 13) \bmod 16$$

Cycle: 1, 10, 15, 0, 13, 6, 11, 12, 9, 2, 7, 8, 5, 14, 3, 4, 1, …

Say we need two independent sequences of 8 numbers. Select seed values 1 and 15 (Seed = 1, Seed = 15).

Good RNGs will have precomputed seed values

Page 72: Simulation

Streams and Substreams

A segment corresponding to a seed is usually called a stream. Also want to be able to get independent substreams of each stream. Example: assign each stream to generating one type of numbers, and use each substream for independent replications. Requires very long period generators, and precomputed streams.

Page 73: Simulation

Analysis of RNG

$$Z_i = (13 Z_{i-1} + 13) \bmod 16$$

Z: 1, 10, 15, 0, 13, 6, 11, 12, 9, 2
U: 0.06, 0.63, 0.94, 0.00, 0.81, 0.38, 0.69, 0.75, 0.56, 0.13

(Histogram of these U's over the bins [0,0.25), [0.25,0.5), [0.5,0.75), [0.75,1).)

Page 74: Simulation

Do We Need Randomness?

For certain applications, definitely. For simulation, maybe not always. Quasi-random numbers. Say we want to estimate an expected value
$$\mu = \int_{[0,1)^s} f(\mathbf{u})\, d\mathbf{u}$$

Page 75: Simulation

Monte Carlo Estimate

Using n independent simulation runs:
$$\hat{\mu}_n = \frac{1}{n} \sum_{i=1}^{n} f(\mathbf{u}_i)$$
$$\sqrt{n}\, (\hat{\mu}_n - \mu)/\sigma \Rightarrow N(0,1)$$

Error converges at rate $O(1/\sqrt{n})$

Page 76: Simulation

Quasi-Monte Carlo

Replace the random points with a set of points that cover $[0,1)^s$ more uniformly

Page 77: Simulation

Discussion

By using quasi-random numbers, we are able to achieve a faster convergence rate. When estimating an integral, real randomness is not really an issue. What about discrete event simulation?

Page 78: Simulation

Discussion

Generating random numbers is important to every simulation project:

Validity of the simulation

Precision of the output analysis

Not all RNGs are very good

Page 79: Simulation

Discussion

Problems:

Too short a period (a period of $2^{31}$ is not sufficient)

Unfavorable lattice structure (numbers generated by RANDU fall on 15 planes in $\mathbb{R}^3$)

Inability to get truly independent subsequences; need streams (segments) and substreams

Should choose a RNG that passes both empirical and theoretical tests, has a very long period, and allows us to get good streams

Page 80: Simulation


Generating Random Variates

Page 81: Simulation

Generating Random Variates

Say we have fitted an exponential distribution to the interarrival times of customers. Every time we anticipate a new customer arrival (place an arrival event on the event list), we need to generate a realization of the arrival time. We know how to generate a unit uniform. Can we use this to generate an exponential? (And other distributions?)

Page 82: Simulation

Two Types of Approaches

Direct: obtain an analytical expression

Inverse transform: requires the inverse of the distribution function

Composition & convolution: for special forms of distribution functions

Indirect:

Acceptance-rejection

Page 83: Simulation


Inverse-Transform Method

Page 84: Simulation

Formulation

Algorithm:

1. Generate $U \sim U(0,1)$

2. Return $X = F^{-1}(U)$

Proof:
$$P(X \le x) = P(F^{-1}(U) \le x) = P(U \le F(x)) = F(x)$$

Page 85: Simulation


Example: Weibull

Page 86: Simulation

Example: Exponential

$$F(x) = \begin{cases} 1 - e^{-x/\beta} & x \ge 0 \\ 0 & x < 0 \end{cases}$$

Setting $F(X) = U$ and solving gives $X = -\beta \ln(1 - U)$.
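The exponential inverse transform is short enough to sketch directly (since 1−U is also uniform, $-\beta \ln U$ works equally well):

```python
import math
import random

def exp_variate(beta, u=None):
    """Inverse transform for F(x) = 1 - exp(-x/beta): X = -beta*ln(1-U)."""
    if u is None:
        u = random.random()
    return -beta * math.log(1.0 - u)

random.seed(1)
mean = sum(exp_variate(5.0) for _ in range(20000)) / 20000
```

Feeding in u = F(x) returns x, and the sample mean of many variates approaches the parameter β.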

Page 87: Simulation


Discrete Distributions

Page 88: Simulation

Formulation

X can take values $x_1, x_2, \ldots$ with $P(X = x_i) = p(x_i)$

Algorithm:

1. Generate $U \sim U(0,1)$

2. Return $X = \min\{ x_i : F(x_i) \ge U \}$

Proof: need to show $P(X = x_i) = p(x_i)$ for all i

Page 89: Simulation

Continuous, Discrete, Mixed

Algorithm:

1. Generate $U \sim U(0,1)$

2. Return $X = \min\{ x : F(x) \ge U \}$

)1,0(~ Generate .1

Page 90: Simulation

Discussion: Disadvantages

Must evaluate the inverse of the distribution function:

May not exist in closed form

Could still use numerical methods

May not be the fastest way

Page 91: Simulation

Discussion: Advantages

Facilitates variance reduction:
$$X_1 = F_1^{-1}(U_1), \qquad X_2 = F_2^{-1}(U_2)$$
Can select $U_1, U_2$ independent, or $U_2 = U_1$, or $U_2 = 1 - U_1$

Ease of generating truncated distributions

Page 92: Simulation

Composition

Assume that
$$F(x) = \sum_{j=1}^{\infty} p_j F_j(x), \qquad \sum_{j=1}^{\infty} p_j = 1$$

Algorithm:

1. Generate a positive random integer J such that P(J = j) = p_j

2. Return X with distribution F_J

Page 93: Simulation

Convolution

Assume that $X = Y_1 + Y_2 + \cdots + Y_m$ (where the Y's are IID with CDF G)

Algorithm:

1. Generate $Y_1, Y_2, \ldots, Y_m$ IID, each with CDF G

2. Return $X = Y_1 + Y_2 + \cdots + Y_m$

Page 94: Simulation

Acceptance-Rejection Method

Specify a function that majorizes the density:
$$t(x) \ge f(x) \text{ for all } x$$

New density function:
$$r(x) = \frac{t(x)}{\int t(y)\, dy}$$

Algorithm:

1. Generate Y with density r

2. Generate $U \sim U(0,1)$, independent of Y

3. If $U \le f(Y)/t(Y)$, return X = Y. Otherwise, go back to step 1.
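A minimal sketch of the three steps, using a hypothetical target density f(x) = 3x² on [0,1] and the simplest possible majorizer, the constant t(x) = 3 (so r is uniform on the support):

```python
import random

def accept_reject(f, t_const, support, rng=random):
    """Acceptance-rejection with a constant majorizer t(x) = t_const."""
    lo, hi = support
    while True:
        y = lo + (hi - lo) * rng.random()   # step 1: Y ~ r (uniform here)
        u = rng.random()                    # step 2: U independent of Y
        if u <= f(y) / t_const:             # step 3: accept with prob f/t
            return y

random.seed(7)
# hypothetical target density f(x) = 3x^2 on [0,1], majorized by t(x) = 3
xs = [accept_reject(lambda x: 3.0 * x * x, 3.0, (0.0, 1.0))
      for _ in range(5000)]
mean = sum(xs) / len(xs)
```

The accepted draws follow f; here their mean should be near E[X] = 3/4.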

Page 95: Simulation


Example:

Page 96: Simulation


Example: More Efficient

Page 97: Simulation

Simple Distributions

Uniform: $X = a + (b - a)U$

Exponential: $X = -\beta \ln U$

m-Erlang: $X = -\dfrac{\beta}{m} \ln \prod_{i=1}^{m} U_i$

Page 98: Simulation

Gamma

Distribution function (for integer $\alpha$):
$$F(x) = \begin{cases} 1 - e^{-x/\beta} \sum_{j=0}^{\alpha-1} \dfrac{(x/\beta)^j}{j!} & x > 0 \\ 0 & \text{otherwise} \end{cases}$$

No closed-form inverse. Note that if $X \sim \text{gamma}(\alpha, 1)$ then $\beta X \sim \text{gamma}(\alpha, \beta)$.

Page 99: Simulation

Gamma(α,1) Density

Page 100: Simulation

Gamma(α,1)

Gamma(1,1) is exponential(1)

0<α<1: acceptance-rejection with
$$t(x) = \begin{cases} 0 & x < 0 \\ \dfrac{x^{\alpha-1}}{\Gamma(\alpha)} & 0 \le x \le 1 \\ \dfrac{e^{-x}}{\Gamma(\alpha)} & x > 1 \end{cases}$$

This majorizes the Gamma(α,1) density, but can we generate random variates?

Page 101: Simulation

Gamma(α,1), 0<α<1

The integral of the majorizing function:
$$\int_0^{\infty} t(x)\,dx = \frac{1}{\Gamma(\alpha)} \left( \int_0^1 x^{\alpha-1}\,dx + \int_1^{\infty} e^{-x}\,dx \right) = \frac{b}{\alpha \Gamma(\alpha)}, \qquad b = \frac{e + \alpha}{e}$$

New density:
$$r(x) = \begin{cases} 0 & x < 0 \\ \dfrac{\alpha x^{\alpha-1}}{b} & 0 \le x \le 1 \\ \dfrac{\alpha e^{-x}}{b} & x > 1 \end{cases}$$

Page 102: Simulation

Gamma(α,1), 0<α<1

The distribution function is
$$R(x) = \int_0^x r(y)\,dy = \begin{cases} \dfrac{x^{\alpha}}{b} & 0 \le x \le 1 \\ 1 - \dfrac{\alpha e^{-x}}{b} & x > 1 \end{cases}$$

Invert:
$$R^{-1}(u) = \begin{cases} (bu)^{1/\alpha} & u \le 1/b \\ -\ln \dfrac{b(1-u)}{\alpha} & \text{otherwise} \end{cases}$$

Page 103: Simulation

Gamma(α,1), 0<α<1

1. Generate $U_1 \sim U(0,1)$ and let $P = bU_1$. If P > 1, go to step 3. Otherwise go to step 2.

2. Let $Y = P^{1/\alpha}$ and generate $U_2 \sim U(0,1)$. If $U_2 \le e^{-Y}$, return X = Y. Otherwise, go to step 1.

3. Let $Y = -\ln[(b - P)/\alpha]$ and generate $U_2 \sim U(0,1)$. If $U_2 \le Y^{\alpha-1}$, return X = Y. Otherwise, go to step 1.

Page 104: Simulation

Gamma(α,1), 1<α

Acceptance-rejection with a log-logistic majorizing function:
$$t(x) = c\, \frac{\lambda \alpha^{\lambda} x^{\lambda-1}}{(\alpha^{\lambda} + x^{\lambda})^2}, \qquad \lambda = \sqrt{2\alpha - 1}, \qquad c = \frac{4 \alpha^{\alpha} e^{-\alpha}}{\lambda\, \Gamma(\alpha)}$$

Page 105: Simulation

Gamma(α,1), 1<α

Distribution function (log-logistic):
$$R(x) = \frac{x^{\lambda}}{\alpha^{\lambda} + x^{\lambda}}, \qquad x \ge 0$$

Inverse:
$$R^{-1}(u) = \alpha \left( \frac{u}{1-u} \right)^{1/\lambda}$$

Page 106: Simulation

Normal

Distribution function does not have a closed form (so neither does the inverse). Can use numerical methods for inverse-transform. Note that
$$X \sim N(0,1) \;\Longrightarrow\; \mu + \sigma X \sim N(\mu, \sigma^2)$$

If we can generate a unit normal, then we can generate any normal

Page 107: Simulation

Normal: Box-Muller

Algorithm:

1. Generate independent $U_1, U_2 \sim U(0,1)$

2. Set $X_1 = \sqrt{-2 \ln U_1} \cos(2\pi U_2)$ and $X_2 = \sqrt{-2 \ln U_1} \sin(2\pi U_2)$

3. Return $X_1, X_2$

Technically independent N(0,1), but serious problems if used with LCGs
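The two-step transform above can be sketched as follows (1−U is used so the logarithm is always defined):

```python
import math
import random

def box_muller(rng=random):
    """One Box-Muller step: two independent N(0,1) variates from two uniforms."""
    u1 = 1.0 - rng.random()           # in (0, 1], so log(u1) is defined
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2.0 * math.pi * u2), r * math.sin(2.0 * math.pi * u2)

random.seed(0)
zs = [z for _ in range(5000) for z in box_muller()]
mean = sum(zs) / len(zs)
var = sum((z - mean) ** 2 for z in zs) / len(zs)
```

With a decent underlying uniform generator, the sample mean and variance should be close to 0 and 1.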

Page 108: Simulation

Polar Method

Algorithm:

1. Generate independent $U_1, U_2 \sim U(0,1)$. Let $V_i = 2U_i - 1$ and $W = V_1^2 + V_2^2$.

2. If W > 1, go to step 1. Otherwise, let
$$Y = \sqrt{\frac{-2 \ln W}{W}}, \qquad X_1 = V_1 Y, \qquad X_2 = V_2 Y$$

Page 109: Simulation


Derived Distributions

Several distributions are derived from the gamma and normalCan take advantage of knowing how to generate those two distributions

Page 110: Simulation

Beta

Density:
$$f(x) = \begin{cases} \dfrac{x^{\alpha_1-1} (1-x)^{\alpha_2-1}}{B(\alpha_1, \alpha_2)} & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}, \qquad B(\alpha_1, \alpha_2) = \int_0^1 t^{\alpha_1-1} (1-t)^{\alpha_2-1}\,dt$$

No closed-form CDF and no closed-form inverse. Must use numerical methods for the inverse-transform method.

Page 111: Simulation


Beta Distribution Shapes

Page 112: Simulation

Beta Properties

Sufficient to consider beta on [0,1]

If $X \sim \text{beta}(\alpha_1, \alpha_2)$ then $1 - X \sim \text{beta}(\alpha_2, \alpha_1)$

If $\alpha_2 = 1$ then
$$f(x) = \frac{x^{\alpha_1-1}}{B(\alpha_1, 1)} = \alpha_1 x^{\alpha_1-1}, \qquad F(x) = x^{\alpha_1}, \qquad X = U^{1/\alpha_1}$$

If $\alpha_1 = 1$ and $\alpha_2 = 1$ then $X \sim U(0,1)$

Page 113: Simulation

Beta: General Approach

If $Y_1 \sim \text{gamma}(\alpha_1, 1)$ and $Y_2 \sim \text{gamma}(\alpha_2, 1)$, and $Y_1$ and $Y_2$ are independent, then
$$\frac{Y_1}{Y_1 + Y_2} \sim \text{beta}(\alpha_1, \alpha_2)$$

Thus, if we can generate two gamma random variates, we can generate a beta with arbitrary parameters

Page 114: Simulation

Pearson Type V and Type VI

Pearson Type V: $X \sim \text{PT5}(\alpha, \beta)$ iff $1/X \sim \text{gamma}(\alpha, 1/\beta)$

Pearson Type VI: if $Y_1 \sim \text{gamma}(\alpha_1, \beta)$ and $Y_2 \sim \text{gamma}(\alpha_2, 1)$, and $Y_1$ and $Y_2$ are independent, then
$$\frac{Y_1}{Y_2} \sim \text{PT6}(\alpha_1, \alpha_2, \beta)$$

Page 115: Simulation


Pearson Type V

Page 116: Simulation


Pearson Type VI

Page 117: Simulation

Normal Derived Distributions

Lognormal: $Y \sim N(\mu, \sigma^2) \;\Longrightarrow\; e^Y \sim \text{LN}(\mu, \sigma^2)$

Test distributions (not often used for modeling):

Chi-squared

Student's t distribution

F distribution

Page 118: Simulation


Log-Normal

Page 119: Simulation

Empirical

Use the inverse-transform method. Do not need to search through the observations, because changes occur precisely at 0, 1/(n−1), 2/(n−1), …

Algorithm:

1. Generate $U \sim U(0,1)$. Let $P = (n-1)U$ and $I = \lfloor P \rfloor + 1$.

2. Return $X = X_{(I)} + (P - I + 1)(X_{(I+1)} - X_{(I)})$.
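The algorithm above, sketched with the observations already sorted (the 1-based index I maps to 0-based list positions):

```python
def empirical_variate(sorted_obs, u):
    """Inverse transform for the continuous empirical CDF:
    P = (n-1)U, I = floor(P)+1, X = X_(I) + (P - I + 1)(X_(I+1) - X_(I))."""
    n = len(sorted_obs)
    p = (n - 1) * u
    i = int(p) + 1                  # 1-based index I
    if i >= n:                      # u == 1 edge case
        return sorted_obs[-1]
    x_i, x_next = sorted_obs[i - 1], sorted_obs[i]
    return x_i + (p - i + 1) * (x_next - x_i)
```

Between order statistics the variate is linearly interpolated, exactly matching the piecewise-linear empirical CDF defined earlier.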

Page 120: Simulation


Empirical Distribution Function

Page 121: Simulation

Discrete Distributions

Can always use the inverse-transform method. May not be most efficient.

Algorithm:

1. Generate $U \sim U(0,1)$.

2. Return the nonnegative integer $X = I$ that satisfies
$$\sum_{j=0}^{I-1} p(j) < U \le \sum_{j=0}^{I} p(j)$$

Page 122: Simulation


Alias Method

Another general method is the alias method, which works for every finite range discrete distribution

Page 123: Simulation

Alias Method: Example

$$p(x) = \begin{cases} 0.1 & x = 0 \\ 0.4 & x = 1 \\ 0.2 & x = 2 \\ 0.3 & x = 3 \end{cases}$$

(Diagram of the boxes 0, 1, 2, 3 with cutoff values $F_i$ and aliases, e.g. $L_0 = 1$, $L_2 = 3$.)

1. Generate $I \sim DU(0, n)$ and $U \sim U(0,1)$.

2. If $U \le F_I$, return X = I. Otherwise return $X = L_I$.
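A sketch of table construction and sampling (Walker's alias method; the example probabilities follow the slide's mass function as reconstructed above, which is an assumption):

```python
import random

def build_alias(probs):
    """Build the cutoff table F and alias table L for the alias method."""
    n = len(probs)
    F = [p * n for p in probs]
    L = list(range(n))
    small = [i for i, f in enumerate(F) if f < 1.0]
    large = [i for i, f in enumerate(F) if f >= 1.0]
    while small and large:
        s, g = small.pop(), large.pop()
        L[s] = g                       # cell s's leftover mass aliases to g
        F[g] -= 1.0 - F[s]
        (small if F[g] < 1.0 else large).append(g)
    return F, L

def alias_draw(F, L, rng=random):
    """Pick a box uniformly; return it if U <= F_I, else its alias L_I."""
    i = rng.randrange(len(F))
    return i if rng.random() <= F[i] else L[i]

probs = [0.1, 0.4, 0.2, 0.3]
F, L = build_alias(probs)
# recover each value's probability from the tables as a sanity check
recovered = [0.0] * len(F)
for i in range(len(F)):
    recovered[i] += F[i] / len(F)
    recovered[L[i]] += (1.0 - F[i]) / len(F)
```

Setup is O(n), and every draw costs one uniform index plus one comparison, regardless of n.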

Page 124: Simulation

Bernoulli

Mass function:
$$p(x) = \begin{cases} 1 - p & x = 0 \\ p & x = 1 \\ 0 & \text{otherwise} \end{cases}$$

Algorithm:

1. Generate $U \sim U(0,1)$.

2. If $U \le p$, return X = 1. Otherwise return X = 0.

Page 125: Simulation

Binomial

Mass function:
$$p(x) = \begin{cases} \dbinom{t}{x} p^x (1-p)^{t-x} & x \in \{0, 1, \ldots, t\} \\ 0 & \text{otherwise} \end{cases}$$

Use the fact that if $X \sim \text{bin}(t, p)$ then
$$X = Y_1 + Y_2 + \cdots + Y_t, \qquad Y_i \sim \text{Bernoulli}(p) \text{ IID}$$

Page 126: Simulation

Geometric

Mass function:
$$p(x) = \begin{cases} p(1-p)^x & x \in \{0, 1, \ldots\} \\ 0 & \text{otherwise} \end{cases}$$

Use inverse-transform:

1. Generate $U \sim U(0,1)$.

2. Return $X = \left\lfloor \dfrac{\ln(1-U)}{\ln(1-p)} \right\rfloor$.

Page 127: Simulation

Negative Binomial

Mass function:
$$p(x) = \begin{cases} \dbinom{s + x - 1}{x} p^s (1-p)^x & x \in \{0, 1, \ldots\} \\ 0 & \text{otherwise} \end{cases}$$

Note that $X \sim \text{negbin}(s, p)$ iff
$$X = Y_1 + Y_2 + \cdots + Y_s, \qquad Y_i \sim \text{Geometric}(p) \text{ IID}$$

Page 128: Simulation

Poisson

Mass function:
$$p(x) = \begin{cases} \dfrac{e^{-\lambda} \lambda^x}{x!} & x \in \{0, 1, \ldots\} \\ 0 & \text{otherwise} \end{cases}$$

Algorithm:

1. Let $a = e^{-\lambda}$, $b = 1$, and $i = 0$.

2. Generate $U_{i+1} \sim U(0,1)$ and replace b by $b U_{i+1}$. If $b < a$, return X = i. Otherwise, go to step 3.

3. Let $i = i + 1$ and go back to step 2.

Rather slow; there is no very good algorithm for the Poisson distribution.
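The multiplication algorithm above in code form (a sketch; counts how many uniforms can be multiplied before the product falls below $e^{-\lambda}$):

```python
import math
import random

def poisson_variate(lam, rng=random):
    """Multiply U(0,1)'s until the running product drops below exp(-lam);
    the number of factors used before that point is the Poisson variate."""
    a = math.exp(-lam)
    b, i = 1.0, 0
    while True:
        b *= rng.random()
        if b < a:
            return i
        i += 1

random.seed(3)
xs = [poisson_variate(4.0) for _ in range(20000)]
mean = sum(xs) / len(xs)
```

On average the loop runs λ+1 times, which is why this method becomes slow for large λ.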

Page 129: Simulation

Poisson Process

A stochastic process $\{N(t), t \ge 0\}$ that counts the number of events up until time t is a Poisson process if:

Events occur one at a time

$N(t+s) - N(t)$ is independent of $\{N(u), u \le t\}$

A Poisson process is determined by its rate $\lambda$: $E[N(t)] = \lambda t$

Page 130: Simulation

Generating a Poisson Process

Stationary with rate λ > 0. The times between events $A_i = t_i - t_{i-1}$ are IID exponential.

Algorithm:

1. Generate $U \sim U(0,1)$.

2. Return $t_i = t_{i-1} - \frac{1}{\lambda} \ln U$.

Page 131: Simulation

Nonstationary Case

Can we simply generalize?

(Plot of a time-varying rate function $\lambda(t)$ between $t_{i-1}$ and $t_i$.)

Page 132: Simulation

Thinning Algorithm

1. Set $t = t_{i-1}$.

2. Generate $U_1, U_2$ IID U(0,1).

3. Replace t by $t - \frac{1}{\lambda^*} \ln U_1$, where $\lambda^* = \max_t \lambda(t)$.

4. If $U_2 \le \lambda(t)/\lambda^*$, return $t_i = t$. Otherwise, go back to step 2.
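The thinning steps can be sketched as a generator of a whole arrival sequence; the sinusoidal rate function here is a hypothetical example, not from the slides:

```python
import math
import random

def nhpp_arrivals(rate, rate_max, horizon, rng=random):
    """Thinning: propose candidate arrivals at the constant rate rate_max,
    accept each candidate at time t with probability rate(t)/rate_max."""
    t, times = 0.0, []
    while True:
        t -= math.log(1.0 - rng.random()) / rate_max   # exponential gap
        if t > horizon:
            return times
        if rng.random() <= rate(t) / rate_max:          # thinning step
            times.append(t)

random.seed(5)
# hypothetical time-varying rate lambda(t) = 2 + sin(t), so lambda* = 3
times = nhpp_arrivals(lambda t: 2.0 + math.sin(t), 3.0, 100.0)
```

The closer λ* is to the typical value of λ(t), the fewer candidates are wasted on rejections.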

Page 133: Simulation

Summary

For any stochastic simulation it is necessary to generate random variates from either a theoretical distribution or an empirical distribution. General methods we covered:

Inverse-transform

Acceptance-rejection

Alias method

Page 134: Simulation


Output Analysis

Page 135: Simulation

Output Analysis

Analyzing the output of the simulation is a part that is often done incorrectly (by analysts and commercial software). We consider several issues:

Obtaining statistical estimates of performance measures of interest

Improving precision of those estimates through variance reduction

Comparing estimates from different models

Finding the optimal performance value

Page 136: Simulation

Simulation Output

The output from a single simulation run is a stochastic process $Y_1, Y_2, \ldots$

Observations (n replications of length m):
$$\begin{array}{cccc} y_{11} & y_{12} & \cdots & y_{1m} \\ y_{21} & y_{22} & \cdots & y_{2m} \\ \vdots & & & \vdots \\ y_{n1} & y_{n2} & \cdots & y_{nm} \end{array}$$

Page 137: Simulation

Parameter Estimation

Want to estimate some parameter θ based on these observations

Unbiased? $E[\hat{\theta}] = \theta$?

Consistent? $\lim_{t \to \infty} \hat{\theta}_t = \theta$?

Page 138: Simulation


Transient vs Steady State

Page 139: Simulation


Initial Values: M/M/1 Queue

Page 140: Simulation


Types of Simulation

Terminating simulation

Non-terminating simulation

Steady-state parameters

Steady-state cycle parameters

Other parameters

Page 141: Simulation

Terminating Simulation

Examples:

A retail establishment that is open for fixed hours per day

A contract to produce x number of a high-cost product

Launching of a spacecraft

Never reaches steady-state. Initial conditions are included.

Page 142: Simulation

Non-Terminating Simulation

Any system in continuous operation (could have a 'break'). Interested in steady-state parameters. Initial conditions should be discarded.

Sometimes there is no steady-state because the system is cyclic. Then we are interested in steady-state cycle parameters.

Page 143: Simulation

Terminating Simulation

Let $X_j$ be a random variable defined on the jth replication. Want to estimate the mean $\mu = E(X_j)$.

Fixed-sample-size procedure; the CI assumes the $X_j$'s are normally distributed:
$$\bar{X}(n) \pm t_{n-1, 1-\alpha/2} \sqrt{\frac{S^2(n)}{n}}$$

Page 144: Simulation

Quality of Confidence Interval

(CI coverage plots: number of failures; average delay (25 customers); average delay (500 customers).)

Depends on both the underlying distribution and the number of replications

Page 145: Simulation

Specifying the Precision

Absolute error: $\beta = |\bar{X} - \mu|$

To obtain this:
$$1 - \alpha \approx P(|\bar{X} - \mu| \le \text{half-length})$$
so the CI half-length bounds the absolute error with probability approximately $1 - \alpha$.

Page 146: Simulation

Replications Needed

To obtain an absolute error of β, the number of replications needed is approximately
$$n_a^*(\beta) = \min \left\{ i \ge n : t_{i-1, 1-\alpha/2} \sqrt{\frac{S^2(n)}{i}} \le \beta \right\}$$

Page 147: Simulation

Relative Error

Also interested in the relative error $\gamma = |\bar{X} - \mu| / |\mu|$

Now we have
$$1 - \alpha \approx P\left( \frac{|\bar{X} - \mu|}{|\bar{X}|} \le \frac{\text{half-length}}{|\bar{X}|} \right)$$
and if the CI's relative half-length is at most $\gamma/(1+\gamma)$, then (approximately)
$$P\left( \frac{|\bar{X} - \mu|}{|\mu|} \le \gamma \right) \ge 1 - \alpha$$

Page 148: Simulation

Replications Needed

To obtain a relative error of γ, the number of replications needed is approximately
$$n_r^*(\gamma) = \min \left\{ i \ge n : \frac{t_{i-1, 1-\alpha/2} \sqrt{S^2(n)/i}}{|\bar{X}(n)|} \le \frac{\gamma}{1+\gamma} \right\}$$

Page 149: Simulation

Sequential Procedure

Define
$$\delta(n, \alpha) = t_{n-1, 1-\alpha/2} \sqrt{\frac{S^2(n)}{n}}, \qquad \gamma' = \frac{\gamma}{1+\gamma}$$

Algorithm:

0. Make $n_0$ replications and set $n = n_0$.

1. Compute $\bar{X}(n)$ and $\delta(n, \alpha)$ from $X_1, X_2, \ldots, X_n$.

2. If $\delta(n, \alpha)/|\bar{X}(n)| \le \gamma'$, use $\bar{X}(n)$ as the estimate of μ and stop. Otherwise, let $n = n + 1$, make an additional replication, and go to step 1.
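The sequential procedure above can be sketched as a loop; as a simplifying assumption a fixed normal critical value z replaces $t_{n-1,1-\alpha/2}$, and the replication function and its exponential example are hypothetical:

```python
import math
import random
import statistics

def sequential_mean(make_rep, gamma, n0=10, z=1.645):
    """Add replications until the CI half-length delta(n) is at most
    gamma' * |Xbar(n)|, with gamma' = gamma / (1 + gamma)."""
    gamma_p = gamma / (1.0 + gamma)     # adjusted relative error gamma'
    xs = [make_rep() for _ in range(n0)]
    while True:
        xbar = statistics.fmean(xs)
        delta = z * math.sqrt(statistics.variance(xs) / len(xs))
        if delta <= gamma_p * abs(xbar):
            return xbar, len(xs)        # precision reached: stop
        xs.append(make_rep())           # one more replication

random.seed(11)
# hypothetical replication output: exponential with mean 5
est, n = sequential_mean(lambda: random.expovariate(0.2), gamma=0.05)
```

For an output with coefficient of variation near 1 and γ = 0.05, the procedure typically needs on the order of a thousand replications before stopping.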

Page 150: Simulation

Other Measures

If we only use averages, the results can often be misleading or wrong. What about the variance? Alternative/additional measures:

Proportions

Probabilities

Quantiles

Page 151: Simulation

Example

Suppose we are interested in customer delay X. We can estimate:

Average delay E[X]

Proportion of customers with X ≤ a

Probabilities, e.g., P[X ≤ a]

The q-quantile $x_q$

Page 152: Simulation

Estimating Proportions

Define an indicator function:
$$I_i = \begin{cases} 1 & \text{if } X_i \le a \\ 0 & \text{otherwise} \end{cases}$$

Obtain a point estimate of the proportion:
$$\hat{r} = \frac{1}{n} \sum_{i=1}^{n} I_i$$

Page 153: Simulation

Estimating Probabilities

Want to estimate $p = P(X \in B)$

Have n replications $X_1, X_2, \ldots, X_n$

Define S = number of observations that fall in set B; then $S \sim \text{binomial}(n, p)$. An unbiased estimate is
$$\hat{p} = \frac{S}{n}$$

Page 154: Simulation

Estimating Quantiles

Let $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ be the order statistics corresponding to n simulation runs. A point estimator is then
$$\hat{x}_q = \begin{cases} X_{(nq)} & \text{if } nq \text{ is an integer} \\ X_{(\lceil nq \rceil)} & \text{otherwise} \end{cases}$$

Page 155: Simulation

Initial Conditions

In terminating simulation there is no steady-state. Hence, the initial conditions are included in the performance measure estimates. How should they be selected?

Use an artificial ‘warm-up’ period just to get reasonable start-up state

Collect data and model the initial conditions explicitly

Page 156: Simulation

IE 519 156

Discussion

For terminating simulation we must use replications (cannot increase the length of a simulation run). Point estimates of performance measures:

An unbiased estimate and an approximate CI are easily constructed for the mean performance

Also obtained point estimates for proportions, probabilities, and quantiles (the mean is not always enough)

It is important to be able to control the precision – determine how many replications are needed. Initial conditions are always included in the estimates for terminating simulations – they must be selected carefully

Page 157: Simulation

IE 519 157

Steady-State Behavior

Now we’re interested in parameters related to the limit distribution

    F_i(y) = P(Y_i ≤ y)
    F(y) = P(Y ≤ y)
    F_i(y) → F(y) as i → ∞

Problem is that we cannot wait until infinity!

Page 158: Simulation

IE 519 158

Estimating Mean

Suppose we want to estimate the steady-state mean

    ν = lim_{i→∞} E[Y_i]

Problem:

    E[Ȳ(m)] ≠ ν for finite m

One solution is to add a warm-up period l and get a less biased estimator

    Ȳ(m, l) = Σ_{i=l+1}^{m} Y_i / (m − l)

Page 159: Simulation

IE 519 159

Approaches for Estimating

There are numerous approaches for estimating the mean:

 Replication/deletion (start with this)
 One long replication:
   Batch-means
   Autoregressive method
   Spectrum analysis
   Regenerative method
   Standardized time series method

Page 160: Simulation

IE 519 160

Choosing the Warm-Up Period

In the replication/deletion method the main issue is to choose the warm-up period. Would like

    E[Ȳ(m, l)] ≈ ν for m ≥ l

Tradeoff:

 If l is too small then we still have a large bias
 If l is too large then the estimate will have a large variance

Very difficult to determine from a single replication

Page 161: Simulation

IE 519 161

Welch’s Procedure

Make n replications, each of length m, with Y_ji the ith observation of the jth replication:

    Y_11, Y_12, Y_13, …, Y_1m
    Y_21, Y_22, Y_23, …, Y_2m
    ⋮
    Y_n1, Y_n2, Y_n3, …, Y_nm

Average across replications:

    Ȳ_i = (1/n) Σ_{j=1}^{n} Y_ji,   i = 1, 2, …, m

Page 162: Simulation

IE 519 162

Welch’s Procedure

Key is to smooth out high-frequency oscillations in the averages Ȳ_i

Then plot the moving average

    Ȳ_i(w) = (1/(2w+1)) Σ_{s=−w}^{w} Ȳ_{i+s},        i = w+1, …, m−w

    Ȳ_i(w) = (1/(2i−1)) Σ_{s=−(i−1)}^{i−1} Ȳ_{i+s},   i = 1, …, w
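The two-case moving average above can be sketched as follows (a plain illustration; the function name is mine):

```python
def welch_moving_average(ybar, w):
    """Welch's moving average of the averaged process Ybar_1..Ybar_m:
    a full window of 2w+1 values for i > w, and a shrinking symmetric
    window of 2i-1 values near the start; defined for i = 1..m-w."""
    m = len(ybar)
    out = []
    for i in range(1, m - w + 1):              # 1-indexed, as on the slide
        if i <= w:
            window = ybar[0:2 * i - 1]         # Ybar_1 .. Ybar_{2i-1}
        else:
            window = ybar[i - 1 - w:i + w]     # Ybar_{i-w} .. Ybar_{i+w}
        out.append(sum(window) / len(window))
    return out

# a series oscillating around 5: smoothing damps the oscillation
series = [5.0 + (0.5 if i % 2 == 0 else -0.5) for i in range(20)]
smoothed = welch_moving_average(series, 2)
```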

Page 163: Simulation

IE 519 163

Example: Hourly Throughput

When is it warmed up?

Page 164: Simulation

IE 519 164

Welch’s Procedure

Much smoother and easier to tell where it has converged

Want to err on the side of selecting it too large

Page 165: Simulation

IE 519 165

Replication/Deletion

Similar to terminating simulation

Need n pilot runs to determine the warm-up period l, and then throw away the first l observations from the new n’ runs

    X_j = Σ_{i=l+1}^{m} Y_{ji} / (m − l),   j = 1, …, n'

    X̄(n') ± t_{n'−1, 1−α/2} √(S²(n')/n')

Page 166: Simulation

IE 519 166

Discussion

Replication/deletion approach
 Easiest to understand and implement
 Has good statistical performance if done correctly
 Applies to all output parameters and can be used to estimate several different parameters for the same model
 Can be used to compare different systems

Nonetheless, some other methods have clear advantages

Page 167: Simulation

IE 519 167

Covariance Stationary Process

Classic statistical inference assumes independent and identically distributed (IID) observations. Even after eliminating the initial transient this is not true for most simulations, because most simulation output is autocorrelated. However, it is reasonable to assume that after the initial transient the output will be covariance stationary, that is,

    γ_k = cov(Y_i, Y_{i+k})

is independent of i

Page 168: Simulation

IE 519 168

Notation:

 Simulation output:  Y₁, Y₂, …, Yₙ
 Mean:               ν = E[Y_j]
 Variance:           σ² = Var(Y_j) = γ₀
 Covariance:         γ_k = cov(Y_i, Y_{i+k})
 Correlation:        ρ_k = γ_k / γ₀

Assume covariance stationary

Page 169: Simulation

IE 519 169

Implications of Autocorrelation

If the process is covariance stationary the average is still an unbiased estimator, that is,

    E[Ȳ(n)] = E[(1/n) Σ_{j=1}^{n} Y_j] = ν

However, the same cannot be said about the standard estimate of the variance

    S²(n) = (1/(n−1)) Σ_{j=1}^{n} (Y_j − Ȳ(n))²

In fact,

    E[S²(n)] = γ₀ [1 − 2 Σ_{k=1}^{n−1} (1 − k/n) ρ_k / (n − 1)]

Page 170: Simulation

IE 519 170

Expression for Variance

Assuming a covariance stationary process it can be shown that:

    Var(Ȳ(n)) = (γ₀/n) [1 + 2 Σ_{k=1}^{n−1} (1 − k/n) ρ_k]

We hope the estimate of the variance is unbiased, that is,

    E[S²(n)/n] = Var(Ȳ(n))

By combining the top equation above with the last equation on the previous slide, we can check this for an independent and an autocorrelated output process

Page 171: Simulation

IE 519 171

Independent Process

If the output process is independent then γ_k = cov(Y_i, Y_{i+k}) = 0 for k ≥ 1, so ρ_k = 0 and

    Var(Ȳ(n)) = γ₀/n

    E[S²(n)/n] = (γ₀/n) [1 − 2 Σ_{k=1}^{n−1} (1 − k/n) ρ_k / (n − 1)] = γ₀/n = Var(Ȳ(n))

so the usual variance estimator is unbiased in this case

Page 172: Simulation

IE 519 172

Autocorrelation in Process

If the process is positively correlated (the usual case), then ρ_k > 0 for all k, so

    Σ_{k=1}^{n−1} (1 − k/n) ρ_k > 0

and therefore

    E[S²(n)/n] = (γ₀/n) [1 − 2 Σ_{k=1}^{n−1} (1 − k/n) ρ_k / (n − 1)] < γ₀/n < Var(Ȳ(n))

Hence, the estimator has less precision than predicted and the CI is misleading
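The two formulas can be compared numerically. The sketch below evaluates both expressions for a process with AR(1)-style autocorrelations ρ_k = φ^k (the function names and the example values are mine):

```python
def var_of_mean(n, gamma0, rho):
    """Var(Ybar(n)) for a covariance-stationary process with
    autocorrelation function rho(k)."""
    s = sum((1 - k / n) * rho(k) for k in range(1, n))
    return gamma0 / n * (1 + 2 * s)

def expected_s2_over_n(n, gamma0, rho):
    """E[S^2(n)/n] under the same process (the bias formula above)."""
    s = sum((1 - k / n) * rho(k) for k in range(1, n))
    return gamma0 / n * (1 - 2 * s / (n - 1))

phi = 0.8                                    # positive correlation: rho_k = phi^k
true_var = var_of_mean(100, 1.0, lambda k: phi ** k)
naive = expected_s2_over_n(100, 1.0, lambda k: phi ** k)
# naive < gamma0/n < true_var: the usual CI is too narrow
```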

Page 173: Simulation

IE 519 173

Batch-Means Estimators

Batch-means estimators are the most popular alternative to replication/deletion. The idea here is to do one very long simulation run and estimate the parameters from this run. The advantage is that the simulation only has to go through the initial transient once. Assuming covariance-stationary output:

 No problem estimating the mean
 Estimating the variance is difficult because the data is likely to be autocorrelated, that is, Y_i and Y_{i+1} are correlated

Page 174: Simulation

IE 519 174

Classical Approach

Partition the run of n observations into k equal-size contiguous batches ("macro replications"), each composed of m = n/k observations ("micro replications")

Point estimator

    ν̂ = (1/k) Σ_{j=1}^{k} ν̂_j
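A minimal non-overlapping batch-means sketch (names are mine; the normal quantile 1.96 stands in for the t quantile t_{k−1,1−α/2}):

```python
import math
import random

def batch_means_ci(ys, k=10, z=1.96):
    """Split one long run into k contiguous batches of m = n//k
    observations, and treat the k batch means as approximately IID
    normal to form the usual confidence interval."""
    m = len(ys) // k
    means = [sum(ys[j * m:(j + 1) * m]) / m for j in range(k)]
    grand = sum(means) / k
    var = sum((b - grand) ** 2 for b in means) / (k - 1)
    half = z * math.sqrt(var / k)
    return grand, half

random.seed(5)
run = [random.gauss(10.0, 1.0) for _ in range(10000)]   # stand-in for a long run
grand, half = batch_means_ci(run, k=20)
```

On real (autocorrelated) output the batch size m must be large enough for the batch means to be nearly uncorrelated, per the CI analysis that follows.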

Page 175: Simulation

IE 519 175

CI Analysis

Assuming as before that Y₁, Y₂, … is covariance-stationary with E[Y_i] = ν

If the batch size is large enough, then the batch estimates ν̂_j will be approximately uncorrelated. Suppose we can also choose the batch size m large enough so that they are approximately normal. By stationarity the batch estimates have the same mean and variance. Hence we can treat them as approximately IID normal and get the usual confidence interval

Page 176: Simulation

IE 519 176

Variants of Batch-Means

 Non-overlapping batches:  (Y₁, …, Y_m), (Y_{m+1}, …, Y_{2m}), …
 Spaced batches (l observations skipped between batches):  (Y₁, …, Y_m), (Y_{m+l+1}, …, Y_{2m+l}), …
 Overlapping batches:  (Y₁, …, Y_m), (Y₂, …, Y_{m+1}), …

Page 177: Simulation

IE 519 177

Steady-State Batching

General variance estimator

    V̂ar(ν̂) = S²_B / |B|,   S²_B = (1/(|B|−1)) Σ_{j∈B} (ν̂_j − ν̂)²

where B is the set of batch starting points:

 Non-overlapping batches:  B = {1, m+1, 2m+1, …, (k−1)m+1}
 Overlapping batches:      B = {1, 2, …, n−m+1}

Page 178: Simulation

IE 519 178

Determining the Batch Size

Tradeoff
 Large batch sizes have the needed asymptotic properties
 Small batch sizes yield more batches
 That is, a choice between bias due to poor asymptotics and variance due to few batches

Rule of thumb (empirical):
 Little benefit to more than 30 batches
 Should not have fewer than 10 batches

Page 179: Simulation

IE 519 179

Mean Squared Error

The mean squared error (MSE) of an estimator is

    MSE(ν̂) = E[(ν̂ − ν)²] = Bias²(ν̂) + Var(ν̂)

This is the classic measure of quality. Can use it to select the optimal batch size

Page 180: Simulation

IE 519 180

Optimal Batch Size

The asymptotically MSE-optimal batch size has the form

    m* = c · n^{1/3}

where the constant c combines
 a bias constant c_b
 a variance constant c_v
 the "center of gravity" of the autocovariances, γ₁ = Σ_k k γ_k

Page 181: Simulation

IE 519 181

Regenerative Method

Similar to batch-means, the regenerative method also tries to construct independent replications from a single run. Assume that Y₁, Y₂, … has a sequence of random points 1 ≤ B₁ < B₂ < … called regeneration points, such that the process from B_j onward is independent of the process prior to B_j

The process between two successive regeneration points is called a regeneration cycle

Page 182: Simulation

IE 519 182

Estimating the Mean

For cycle j define

    Z_j = Σ_{i=B_j}^{B_{j+1}−1} Y_i     (cycle sum)
    N_j = B_{j+1} − B_j                 (cycle length)

Then

    ν = E[Z] / E[N]

and with n' = number of regeneration cycles,

    ν̂(n') = Z̄(n') / N̄(n')

Page 183: Simulation

IE 519 183

Analysis

The estimator is not unbiased. However, it is strongly consistent:

    ν̂(n') → ν as n' → ∞ (w.p. 1)

Let Σ = [σ_ij] be the covariance matrix of U_j = (Z_j, N_j)ᵀ

Let V_j = Z_j − ν N_j. These are IID with mean 0 and variance

    σ²_V = σ₁₁ − 2ν σ₁₂ + ν² σ₂₂

Page 184: Simulation

IE 519 184

Analysis

From the CLT

    √n' · V̄(n') / σ_V →_D N(0, 1)

Have estimates

    σ̂₁₁(n'), σ̂₁₂(n'), σ̂₂₂(n')

    σ̂²_V(n') = σ̂₁₁(n') − 2 ν̂(n') σ̂₁₂(n') + ν̂²(n') σ̂₂₂(n')

Page 185: Simulation

IE 519 185

Analysis

Can be shown that

    σ̂²_V(n') → σ²_V as n' → ∞ (w.p. 1)

Hence

    (ν̂(n') − ν) / √( σ̂²_V(n') / (n' N̄(n')²) ) →_D N(0, 1)

We get a CI

    ν̂(n') ± z_{1−α/2} √(σ̂²_V(n')/n') / N̄(n')

Page 186: Simulation

IE 519 186

Non-Independence

Non-overlapping batch-means and regeneration methods try to create independence between batches/cyclesAn alternative is to use estimates of the autocorrelation structure to estimate the variance of the sample mean(Again, estimating the mean is no problem, just the variance)Spectrum analysis and autoregressive methods attempt to do this

Page 187: Simulation

IE 519 187

Spectral Variance Estimator

Assume the process is covariance stationary:

    E[Y_j] = ν,   γ_l = E[(Y_j − ν)(Y_{j+l} − ν)],   γ_{−l} = γ_l

The variance can be expressed as

    Var(Ȳ(n)) ≈ (1/n) Σ_{l=−∞}^{∞} γ_l

The spectral density function of the process:

    f(λ) = (1/2π) Σ_{l=−∞}^{∞} γ_l cos(λ l)

Since 2π f(0) = Σ_l γ_l, an estimate of the spectral density function at frequency 0 is an estimate of the variance

Page 188: Simulation

IE 519 188

Spectral Variance Estimator

Using standard results:

    V̂ar(Ȳ(n)) = (1/n) Σ_{l=−(m−1)}^{m−1} w_n(l) γ̂(l)

with the sample autocovariances

    γ̂(l) = (1/n) Σ_{r=1}^{n−|l|} (Y_r − Ȳ(n))(Y_{r+|l|} − Ȳ(n))

where m is the batch size (truncation point) and w_n(l) are weights with w_n(0) = 1 and |w_n(l)| ≤ 1

Page 189: Simulation

IE 519 189

Parameters

For the batch size (truncation point) we need

    m_n → ∞ and m_n/n → 0 as n → ∞

Examples of weight functions:

    w_n(l) = 1 − |l|/m    if |l| ≤ m − 1, 0 otherwise

    w_n(l) = 1 − (l/m)²   if |l| ≤ m − 1, 0 otherwise

Page 190: Simulation

IE 519 190

Autoregressive Method

Again assume a covariance-stationary output process, and also a pth-order autoregressive model

    Σ_{j=0}^{p} b_j (Y_{i−j} − ν) = ε_i,   b₀ = 1

where {ε_i} are uncorrelated random variables with mean 0 and variance σ²_ε

Page 191: Simulation

IE 519 191

Convergence Result

Can be shown that

    lim_{m→∞} m · Var(Ȳ(m)) = σ²_ε / (Σ_{j=0}^{p} b_j)²

Can estimate these quantities and get

    V̂ar(Ȳ(m)) = σ̂²_ε / ( m (Σ_{j=0}^{p} b̂_j)² )

A CI can be constructed using the t-distribution

Page 192: Simulation

IE 519 192

What is the Coverage?

Empirical results for 90% CI for two simulation models

Page 193: Simulation

IE 519 193

Discussion

Replication/deletion is certainly the most popular in practice (easy to understand). Batch-means is very effective; there are practical algorithms and still a lot of research. Spectral methods are still a subject of active research but probably not used much in practice (very complicated). Autoregressive methods appear not to be used/investigated much. Regeneration methods are theoretically impeccable but practically useless!

Page 194: Simulation

IE 519 194

Comments on Variance Estimates

We have spent considerable time looking at alternative estimates of the variance. Why does it matter? Simulation output is usually (always) autocorrelated, which makes it difficult to estimate the variance, and hence the CI may be incorrect. Most seriously, the precision of the estimate may be less than predicted, and hence inference drawn from the model may not be valid

Page 195: Simulation

IE 519 195

Implications of Autocorrelation

Because simulation output is usually autocorrelated we cannot simply treat all of the observations as independent when estimating precision. We need some way of obtaining uncorrelated observations:

 Replication/deletion gets this through independent replications

 Batch-means gets this (almost) through non-overlapping batches

 The regenerative method gets this through independent regenerative cycles

Page 196: Simulation

IE 519 196

Sequential Procedures

None of the single-run methods we have discussed can assure any given precision (which we need to make a decision). Several sequential procedures exist that allow us to do this

 More complicated than for replication/deletion

 May require very long simulation runs

More complicated than for replication/deletion

May require very long simulation runs

Page 197: Simulation

IE 519 197

Good Sequential Procedures

 Batch-means and relative error stopping rule
   Law and Carson procedure (1979)
   Automated Simulation Analysis Procedure (ASAP) and extension ASAP3 (2002, 2005)
 Spectral method and relative stopping rule
   WASSP (2005)

All of these methods obtain much better coverage. However, they are rarely if ever used!

Page 198: Simulation

IE 519 198

Estimating Probabilities

Know how to estimate means. How about probabilities p = P[Y ∈ B]?

Note that for the indicator

    Z = 1 if Y ∈ B, 0 otherwise

we have

    E[Z] = 1 · P[Z = 1] + 0 · P[Z = 0] = P[Y ∈ B] = p

We therefore already know how to do this!

Page 199: Simulation

IE 519 199

Estimating Quantiles

Suppose we want to estimate the q-quantile y_q, that is, P[Y ≤ y_q] = q

This is more complicated. Most estimates are based on order statistics:
 Biased estimates
 Computationally expensive
 Coverage low if sample size is too low

Page 200: Simulation

IE 519 200

Cyclic Parameters

No steady-state distribution. With some cycle definition C, consider the cycle process Y₁^C, Y₂^C, … with

    F_i^C(y) = P(Y_i^C ≤ y) → F^C(y) as i → ∞

All of the techniques we have discussed before for steady-state parameters still apply to this new process

Page 201: Simulation

IE 519 201

Multiple Measures

In practice we are usually interested in multiple measures simultaneously, so we have several CIs, say I_s with

    P(μ_s ∈ I_s) = 1 − α_s,   s = 1, …, k

How does this affect our overall confidence

    P(μ_s ∈ I_s, s = 1, …, k) = ?

Page 202: Simulation

IE 519 202

Bonferroni Inequality

No problem if the measures are independent:

    P(μ_s ∈ I_s, s = 1, …, k) = Π_{s=1}^{k} (1 − α_s)

In practice performance measures are very unlikely to be independent. If they are not independent, we can use the Bonferroni inequality

    P(μ_s ∈ I_s, s = 1, …, k) ≥ 1 − Σ_{s=1}^{k} α_s
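The usual equal split of the Bonferroni inequality is a one-liner (the function name is mine):

```python
def bonferroni_individual_level(overall_alpha, k):
    """Equal Bonferroni split: k intervals, each at level 1 - alpha/k,
    give overall confidence at least 1 - alpha."""
    return 1.0 - overall_alpha / k

# five measures at 90% overall confidence -> each individual CI at 98%
level = bonferroni_individual_level(0.10, 5)
```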

Page 203: Simulation

IE 519 203

Computational Implications

Say we have 5 performance measures and we want 90% overall confidence. Two alternatives:

 We can get five 98% CIs, one for each performance measure, which gives us 90% overall confidence. This is computationally expensive

 We can get five 90% CIs and live with the fact that one or more of them is likely to not cover the true value of the parameter

We will revisit this topic when we talk about multiple comparison procedures

We will revisit this topic when we talk about multiple comparison procedures

Page 204: Simulation

IE 519 204

Output Analysis: Discussion

Terminating simulation
 • Replications defined by terminating event
 • Can determine precision
 • Initial conditions

Non-terminating simulation
 Multiple runs
 • Replication/deletion
 • Issue with bias
 • Elimination of initial transient
 Single long run
 • Batch-means, regenerative etc.
 • Autocorrelation problem with estimating the variance

Page 205: Simulation

IE 519 205

Resampling Methods

Page 206: Simulation

IE 519 206

Sources of Variance

We have learned how to estimate variance and construct CIs, predict the number of simulation runs needed, etc. Where does the variance come from?

 Random number generator (RNG)
 Generation of random variates
 Computer only approximates real values (made worse by long runs!)
 Initial transient/stopping rules (made better by long runs!)
 Inherently biased estimators
 Modelling error?

Page 207: Simulation

IE 519 207

Input Modelling

We have discussed input modelling and output analysis separately. Recall the main approaches for input modelling:

 Fit a parametric distribution
 Fit an empirical distribution
 Use a trace
 Use a beta distribution

In practice fitting a parametric distribution is the most common approach

Page 208: Simulation

IE 519 208

Numerical Example

The underlying system is an M/M/1/10 queue. The simulation model is 1 station, capacity of 10, and empirical distributions for interarrival and service times fitted from 100 observations. Want to estimate the expected time in system E[W]. Typical simulation experiment:

 10 replications
 Very long run of 5000 customers
 Very long warm-up period of 1000 customers
 CI constructed using the t-distribution

We would expect a very good estimate for the performance of the model

Page 209: Simulation

IE 519 209

Effect of Estimating Distribution Parameters

True model

No resampling

Direct resampling

True model assumes that the true interarrival and service distributions are known

No resampling is the traditional approach: fit an empirical distribution once, then construct a sample mean based on 10 replications

Direct resampling obtains a new sample of 100 data points for each of the 10 replications

Page 210: Simulation

IE 519 210

Why Poor Coverage?

The uncertainty due to replacing the true distribution with an estimate is neglected

This is the case for all commercial simulation software

Remedies Direct resampling Bootstrap resampling Uniformly randomized resampling

Page 211: Simulation

IE 519 211

Direct Resampling

For each replication (simulation run) use a new sample to create an empirical distribution function. This requires a lot more data. Alternatively, whatever data is available can be split among the replications. Can confidence intervals be constructed?

Page 212: Simulation

IE 519 212

Bootstrap Resampling

Use the bootstrap to create a 'new' sample, and hence a new empirical distribution function, for each replication. Bootstrap: sampling with replacement. No need for additional data, and we may even be able to use less data

Page 213: Simulation

IE 519 213

Bootstrap Resampling Algorithm

 For each input quantity q modelled, sample n_q values v_q^(1), v_q^(2), …, v_q^(n_q) from the observed data with replacement
 Construct an empirical distribution for each q based on these samples
 Do a simulation run based on these input distributions (ith output)
 Repeat
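The resampling step of the algorithm above is simply sampling with replacement (a minimal sketch; names and the toy data are mine):

```python
import random

def bootstrap_resample(data, rng):
    """One bootstrap resample: n draws with replacement from the observed
    data; a fresh empirical distribution would then be built from it for
    each simulation replication."""
    n = len(data)
    return [data[rng.randrange(n)] for _ in range(n)]

rng = random.Random(7)
observed = [1.2, 0.7, 3.1, 0.4, 2.2]    # hypothetical interarrival-time data
resample = bootstrap_resample(observed, rng)
```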

Page 214: Simulation

IE 519 214

Uniformly Randomized

Note that if F is the cdf of X then F(X) is uniform on [0,1]

For the order statistics X_[1] ≤ X_[2] ≤ … ≤ X_[n],

    F(X_[1]) ≤ F(X_[2]) ≤ … ≤ F(X_[n])

    F(X_[k]) ~ beta(k, n − k + 1)

Page 215: Simulation

IE 519 215

Uniform Randomized Bootstrap

 For each input quantity q modelled, order the observed data: x_q^(1) ≤ x_q^(2) ≤ … ≤ x_q^(n_q)
 Generate a sample of n_q ordered values u_q^(1) ≤ u_q^(2) ≤ … ≤ u_q^(n_q) from a uniform distribution
 Set F̂_q(x_q^(j)) = u_q^(j) and construct an empirical distribution for each q based on these samples
 Do a simulation run based on these input distributions (ith output)
 Repeat

Page 216: Simulation

IE 519 216

Numerical Results

90% CI for an M/M/1/10 queue with varying traffic intensity ρ ∈ {0.7, 0.9}

Number of observations of interarrival and service times ∈ {50, 100, 500}

Page 217: Simulation

IE 519 217

Numerical Results

90% CI for an M/U/1/10 queue with varying traffic intensity ρ ∈ {0.7, 0.9}

Number of observations of interarrival and service times ∈ {50, 100, 500}

Page 218: Simulation

IE 519 218

Discussion

Uncertainty in the input modelling can affect the precision of the output. For a given application you can estimate this effect by selecting 3-5 random subsets of the data and performing the analysis on each. Bootstrap resampling can help fix the problem

Page 219: Simulation

IE 519 219

Discussion

Bootstrap resampling is much more general, and provides an answer to the question:

 Given a random sample and a statistic T calculated on this sample, what is the distribution of T?

Assumptions:
 The empirical distribution converges to the true distribution as the number of samples increases
 T is sufficiently 'smooth'

Problems with extreme point estimates

Other simulation applications:
 Model validation
 Ranking-and-selection, etc.

Page 220: Simulation

IE 519 220

Comparing Multiple Systems

Page 221: Simulation

IE 519 221

Multiple Systems

We know something about how to evaluate the output of a single system. However, simulation is rarely used to simply evaluate one system. Comparison:
 Two alternative systems can be built
 Proposed versus existing system
 What-if analysis for the current system

Page 222: Simulation

IE 519 222

Types of ComparisonsComparison of two systems Comparison of multiple systems

Comparison with a standard All pair-wise comparison Multiple comparison with the best (MCB)

Ranking-and-selection Selecting the best system of k systems Selecting a subset of m systems containing

the best Selecting the m best of k systems

Combinatorial optimization

Page 223: Simulation

IE 519 223

Overview of Various Approaches

Comparison of systems
 Construct (simultaneous) confidence intervals

Ranking-and-selection
 Indifference zone
   The system that is selected has performance that is within an indifference zone of the best performance with a fixed probability
   This is the most common method
 Bayesian approach
 Optimal simulation budget allocation

Optimization
 Design of experiments/response surfaces
 Search procedures

Page 224: Simulation

IE 519 224

Example: One or Two Servers?

Page 225: Simulation

IE 519 225

Comparing Two Systems

Have IID observations from two output processes

    X₁₁, X₁₂, …, X₁ₙ₁ with μ₁ = E[X₁ᵢ]
    X₂₁, X₂₂, …, X₂ₙ₂ with μ₂ = E[X₂ᵢ]

and want to construct a CI for the expected difference ζ = μ₁ − μ₂

Page 226: Simulation

IE 519 226

A Paired-t CI

If n₁ = n₂ = n we can construct a paired CI. Let

    Z_i = X₁ᵢ − X₂ᵢ,   E[Z_i] = ζ

    Z̄(n) = (1/n) Σ_{i=1}^{n} Z_i

    V̂ar(Z̄(n)) = Σ_{i=1}^{n} (Z_i − Z̄(n))² / (n(n−1))

and the CI is

    Z̄(n) ± t_{n−1, 1−α/2} √(V̂ar(Z̄(n)))
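A minimal paired-t sketch (names and the paired data are mine; the caller supplies the t quantile, e.g. from a table):

```python
import math

def paired_t_ci(x1, x2, t_quantile):
    """Paired CI for E[X1]-E[X2]: form the differences Z_i = X1_i - X2_i
    and apply the usual one-sample CI to them."""
    n = len(x1)
    z = [a - b for a, b in zip(x1, x2)]
    zbar = sum(z) / n
    var_zbar = sum((zi - zbar) ** 2 for zi in z) / (n * (n - 1))
    half = t_quantile * math.sqrt(var_zbar)
    return zbar - half, zbar + half

# hypothetical delays from two systems over 10 paired runs; t_{9,0.975} ~ 2.262
lo, hi = paired_t_ci([5.1, 4.8, 5.6, 5.0, 4.9, 5.3, 5.2, 4.7, 5.4, 5.0],
                     [4.2, 4.0, 4.9, 4.1, 4.3, 4.5, 4.4, 3.9, 4.6, 4.2],
                     2.262)
```

Here the interval lies entirely above 0, so system 2 would be judged significantly faster.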

Page 227: Simulation

IE 519 227

Welch CI

Now we do not require equal sample sizes, but assume that the two processes are independent. With

    X̄_i(n_i) = (1/n_i) Σ_{j=1}^{n_i} X_ij

    S_i²(n_i) = (1/(n_i − 1)) Σ_{j=1}^{n_i} (X_ij − X̄_i(n_i))²

the estimated degrees of freedom are

    f̂ = [S₁²(n₁)/n₁ + S₂²(n₂)/n₂]² / { [S₁²(n₁)/n₁]²/(n₁−1) + [S₂²(n₂)/n₂]²/(n₂−1) }

and the CI is

    X̄₁(n₁) − X̄₂(n₂) ± t_{f̂, 1−α/2} √(S₁²(n₁)/n₁ + S₂²(n₂)/n₂)
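The degrees-of-freedom formula is the fiddly part; as a sketch (function name is mine):

```python
def welch_df(s1_sq, n1, s2_sq, n2):
    """Estimated degrees of freedom f_hat for the Welch CI
    (unequal sample sizes, independent processes)."""
    a = s1_sq / n1
    b = s2_sq / n2
    return (a + b) ** 2 / (a * a / (n1 - 1) + b * b / (n2 - 1))

# equal variances and sizes reduce to f_hat = 2(n-1) = 18 for n = 10
f_hat = welch_df(1.0, 10, 1.0, 10)
```

When one sample is small and has high variance, f̂ is dominated by it, so the resulting t quantile is appropriately conservative.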

Page 228: Simulation

IE 519 228

Obtaining IID Observations

Need the observations from each system to be IID

Terminating simulation
 Each run is IID, so no problem

Non-terminating simulation
 Replication/deletion approach
 Non-overlapping batch-means

Page 229: Simulation

IE 519 229

Comparing Multiple Systems

Comparison with a standard

All pair-wise comparison

Multiple comparison with the best (MCB)

Page 230: Simulation

IE 519 230

Comparison with a Standard

Now assume that one of the systems is the 'standard', e.g. an existing system. Construct CIs with overall confidence level 1 − α for μ₂ − μ₁, μ₃ − μ₁, …, μ_k − μ₁.

Using the Bonferroni inequality:
 Construct k − 1 confidence intervals, each at level 1 − α/(k−1)
 The individual CIs can be constructed using any method, as Bonferroni will always hold

Page 231: Simulation

IE 519 231

All Pair-Wise Comparison

Now we want to construct CIs to compare every system with every other. This is quite difficult because there are k(k−1)/2 pairs, so each individual CI needs level 1 − α/(k(k−1)/2) to guarantee an overall level of 1 − α. Only feasible for a relatively small number of systems k

Page 232: Simulation

IE 519 232

Multiple Comparison with the Best (MCB)

We are really interested in whichever is the best system, and hence construct CIs for μ_i − max_{l≠i} μ_l to see if it is significantly better than each of the others:

    [ min{0, X̄_i(n) − max_{l≠i} X̄_l(n) − h}, max{0, X̄_i(n) − max_{l≠i} X̄_l(n) + h} ]

Here h is a critical parameter and x⁺ = max{0, x}. MCB procedures are related to ranking-and-selection

Page 233: Simulation

IE 519 233

Ranking-and-Selection

Have some k systems, and IID observations from each system, with

    μ_i = E[X_ij],   ordered as μ_{i₁} ≤ μ_{i₂} ≤ … ≤ μ_{i_k}

Want to select the best system, that is, the system with the largest mean. We call this the correct selection (CS)

Can we guarantee CS?

Page 234: Simulation

IE 519 234

Indifference Zone Approach

We say that the selected system i* is the correct selection (CS) if

    μ_{i*} ≥ μ_{i_k} − δ*

Here δ* is called the indifference zone. Our goal is

    P(CS) ≥ P* whenever μ_{i_k} − μ_{i_{k−1}} ≥ δ*

Here P* is a user-selected probability (Bechhofer's approach)

Page 235: Simulation

IE 519 235

Two-Stage Approach: Stage I

Obtain n₀ samples from each system and calculate

    X̄_i(n₀) = (1/n₀) Σ_{j=1}^{n₀} X_ij

    S_i²(n₀) = (1/(n₀−1)) Σ_{j=1}^{n₀} (X_ij − X̄_i(n₀))²

Calculate the total number of samples needed from system i:

    N_i = max{ n₀ + 1, ⌈h² S_i²(n₀)/(δ*)²⌉ }
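The first-stage sample-size rule is easy to compute (function name is mine; h comes from a table, as discussed later):

```python
import math

def stage2_sample_size(s2, n0, h, delta):
    """Total sample size N_i = max{n0 + 1, ceil(h^2 S_i^2 / delta^2)}
    from the first-stage variance estimate S_i^2."""
    return max(n0 + 1, math.ceil(h * h * s2 / (delta * delta)))

# high-variance systems get more second-stage effort
n_big = stage2_sample_size(4.0, 10, 3.0, 1.0)     # ceil(9*4) = 36
n_small = stage2_sample_size(0.01, 10, 3.0, 1.0)  # floored at n0 + 1 = 11
```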

Page 236: Simulation

IE 519 236

Two-Stage Approach: Stage II

Obtain N_i − n₀ more observations, and calculate the second-stage and weighted overall means

    X̄_i^(2)(N_i − n₀) = (1/(N_i − n₀)) Σ_{j=n₀+1}^{N_i} X_ij

    X̃_i(N_i) = w_{i1} X̄_i^(1)(n₀) + w_{i2} X̄_i^(2)(N_i − n₀)

with weights

    w_{i1} = (n₀/N_i) [ 1 + √( 1 − (N_i/n₀)(1 − (N_i − n₀)(δ*)²/(h² S_i²(n₀))) ) ],   w_{i2} = 1 − w_{i1}

Page 237: Simulation

IE 519 237

Comments: Assumptions

As usual: normality assumption. Do not need equal or known variances (many statistical selection procedures do). The two-stage approach requires an estimate of the variance (remember controlling the precision). The above approach assumes the least favorable configuration

Page 238: Simulation

IE 519 238

Subset Selection

In most applications, many of the systems are clearly inferior and can be eliminated quite easily. Subset selection: find a subset I ⊆ {1, 2, …, k} of the systems such that

    P(i_k ∈ I) ≥ P*

Gupta's approach:

    I = { i : X̄_i(n) ≥ max_{l} X̄_l(n) − h √(2/n) }

Page 239: Simulation

IE 519 239

Proof

    P(i_k ∈ I) = P( X̄_{i_k}(n) ≥ X̄_i(n) − h√(2/n) for all i ≠ i_k )
               ≥ P( (X̄_i(n) − X̄_{i_k}(n) − (μ_i − μ_{i_k})) / √(2/n) ≤ h, i = 1, …, k−1 )     (since μ_i ≤ μ_{i_k})
               ≈ P( Z_i ≤ h, i = 1, 2, …, k−1 )

where Z₁, …, Z_{k−1} are approximately standard normal, and h is chosen so that the last probability equals P*

Page 240: Simulation

IE 519 240

Two-Stage Bonferroni: Stage I

Specify P*, δ*, and n₀, and let

    t = t_{1−(1−P*)/(k−1), n₀−1}

Make n₀ replications of each system and calculate the sample variance of each pairwise difference

    S_ij² = (1/(n₀−1)) Σ_{l=1}^{n₀} ( X_il − X_jl − (X̄_i(n₀) − X̄_j(n₀)) )²

Calculate the second-stage sample size

    N = max{ n₀, ⌈t² max_{i≠j} S_ij²/(δ*)²⌉ }

Page 241: Simulation

IE 519 241

Two-Stage Bonferroni: Stage II

Obtain the additional sample, calculate the overall sample means and select the best system, with the following CI:

jij

i

jij

ijij

i

XX

XX

max,0max

maxmax,0min

Page 242: Simulation

IE 519 242

Combined Procedure

Initialization: make n₀ replications of each system and calculate the first-stage means X̄_i^(1)(n₀) and the variances of the differences

    S_il² = (1/(n₀−1)) Σ_{j=1}^{n₀} ( X_ij − X_lj − (X̄_i(n₀) − X̄_l(n₀)) )²

Subset selection: calculate W_il = t (S_il²/n₀)^{1/2} and

    I = { i : X̄_i(n₀) ≥ X̄_l(n₀) − W_il for all l ≠ i }

If |I| = 1, stop. Otherwise, calculate the second-stage sample sizes

    N_i = max{ n₀, ⌈h² S_i²/(δ*)²⌉ }

Obtain N_i − n₀ more samples from each system i ∈ I

Compute the overall sample means and select the best system

Page 243: Simulation

IE 519 243

Sequential Procedure

Set

    h² = 2η(n₀ − 1),   η = ½ [ (2α/(k−1))^{−2/(n₀−1)} − 1 ]

Compute from n₀ initial replications

    S_il² = (1/(n₀−1)) Σ_{j=1}^{n₀} ( X_ij − X_lj − (X̄_i(n₀) − X̄_l(n₀)) )²

Screen: with r observations so far,

    W_il(r) = max{ 0, (δ/2r)(h² S_il²/δ² − r) }

    I_new = { i ∈ I_old : X̄_i(r) ≥ X̄_l(r) − W_il(r) for all l ∈ I_old, l ≠ i }

If |I| = 1, stop. Otherwise, take one more observation from each system in I and go back to screening

Page 244: Simulation

IE 519 244

Where does the h come from?

Solved numerically from (Rinott):

    P* = ∫₀^∞ [ ∫₀^∞ Φ( h / √( (n₀−1)(1/x + 1/y) ) ) f(x) dx ]^{k−1} f(y) dy

where Φ is the standard normal cdf and f is the chi-square density with n₀ − 1 degrees of freedom

More commonly, you look it up in a table (some in the book)

Page 245: Simulation

IE 519 245

Large Number of Alternatives

Two-stage ranking-and-selection procedures are usually only efficient for up to about 20 alternatives
 They always focus on the least-favorable configuration (LFC):

    μ_i = μ_k − δ*,   i = 1, 2, …, k−1

 For a large number of systems the LFC would be very unlikely

Use screening followed by two-stage R&S, or use a sequential procedure
 The sequential procedure given earlier can be used for up to 500 alternatives or so

Page 246: Simulation

IE 519 246

Other Approaches

Focused on comparing expected values of performance to identify the best

Alternatives: Select the system most likely to be best Select the largest probability of success Bayesian procedures

Page 247: Simulation

IE 519 247

Bayesian Procedures

Posterior and prior distributions over the unknown means. Take action to maximize/minimize the posterior. R&S: given a fixed computing budget, find the allocation of simulation runs to systems that minimizes some loss function, e.g.

    0-1 loss:            L(i, μ) = 0 if μ_i = max_j μ_j, 1 otherwise

    opportunity cost:    L(i, μ) = max_j μ_j − μ_i

Page 248: Simulation

IE 519 248

Discussion: Selecting the Best

Three major lines of research:

 Indifference-zone procedures
   Most popular, easy to understand, use the LFC assumption

 Screening or subset selection based on constructing a confidence interval
   Can be applied to more alternatives, but does not give you a final selection. Can be combined with indifference-zone selection

 Allocating your simulation budget to minimize a posterior loss function
   More efficient use of simulation effort, but does not give you the same guarantee as indifference-zone methods

Page 249: Simulation

IE 519 249

Simulation Optimization

Page 250: Simulation

IE 519 250

Larger Problems

Even with the best methods, R&S can only be extended to perhaps 500 alternatives. We are often faced with more when we can set certain parameters of the problem. Need simulation optimization

Page 251: Simulation

IE 519 251

What is Simulation Optimization?

Optimization where the objective function is evaluated using simulation

 Complex systems
 Often large-scale systems
 No analytical expression available

Page 252: Simulation

IE 519 252

Problem Setting

Components of any optimization problem:
 Decision variables θ
 Objective function f: Rⁿ → R
 Constraints θ ∈ Θ ⊆ Rⁿ

Page 253: Simulation

IE 519 253

Simulation Evaluation

No closed-form expression for the function f: Rⁿ → R. It is estimated using the output X(θ) of a stochastic discrete-event simulation. Typically, we may have

    f(θ) = E[X(θ)]

Page 254: Simulation

IE 519 254

Types of Techniques

Decision variables:
 Continuous → gradient-based methods
 Discrete, small Θ → ranking & selection
 Discrete, large Θ → random search

Note: these are direct optimization methods. Metamodels approximate the objective function and then optimize (later).

Page 255: Simulation

IE 519 255

Continuous Decision Variables

Most methods are gradient-based:

    θ^(k+1) = θ^(k) − a_k ∇̂f(θ^(k))

Issues:
 All the same issues as in non-linear programming
 How to estimate the gradient ∇̂f(θ^(k))

Page 256: Simulation

IE 519 256

Stochastic Approximation

Fundamental work by Robbins and Monro (1951) & Kiefer and Wolfowitz (1952). Asymptotic convergence can be assured if the gain sequence satisfies

    lim_{k→∞} a_k = 0,   Σ_k a_k = ∞

Generally slow convergence

Page 257: Simulation

IE 519 257

Estimating the Gradient

Challenge to estimate the gradient:

    ∇̂f = ( ∂̂f/∂θ₁, ∂̂f/∂θ₂, …, ∂̂f/∂θₙ )

Finite differences are simple:

    ∂̂f/∂θ_i = [ X(θ + c_i e_i) − X(θ) ] / c_i

(could also be two-sided)
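The one-sided finite-difference estimator can be sketched as below (names are mine; with a noisy simulation X(θ) in place of f, every call would be one simulation run):

```python
def fd_gradient(f, theta, c=1e-2):
    """One-sided finite-difference gradient estimate: one extra
    evaluation per coordinate, as in the formula above."""
    base = f(theta)
    grad = []
    for i in range(len(theta)):
        bumped = list(theta)
        bumped[i] += c          # perturb the i-th coordinate by c
        grad.append((f(bumped) - base) / c)
    return grad

# deterministic sanity check: f(t) = t0^2 + 2*t1^2 has gradient (2, 4) at (1, 1)
g = fd_gradient(lambda t: t[0] ** 2 + 2 * t[1] ** 2, [1.0, 1.0])
```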

Page 258: Simulation

IE 519 258

Improving Gradient Estimation

Finite differences require two simulation runs for each estimate and may be numerically unstable. Better: estimate the gradient during the same simulation run as X(θ)
 Perturbation analysis
 Likelihood ratio or score function method

Page 259: Simulation

IE 519 259

Other Methods

Stochastic approximation variants have received the most attention by researchers. Other methods for continuous domains include:
 Sample path methods
 Response surface methods (later)

Page 260: Simulation

IE 519 260

Discrete Decision Variables

Two types of feasible regions:

 Feasible region small (have seen this)
   Trivial for the deterministic case but must still account for the simulation noise

 Feasible region large
   E.g., the stochastic counterparts of combinatorial optimization problems

Page 261: Simulation

IE 519 261

Statistical Selection

Selecting between a few alternatives θ₁, θ₂, …, θ_m

Can evaluate every point and compare, but must still account for simulation noise. We now know several methods:

 Subset selection
 Indifference zone ranking & selection
 Multiple comparison procedures (MCP)
 Decision theoretic methods

Page 262: Simulation

IE 519 262

Large Feasible Region

When the feasible region is large it is impossible to enumerate and evaluate each alternative. Use random search methods:
 Academic research has focused on methods for which asymptotic convergence is assured
 In practice, metaheuristics are used

Page 263: Simulation

IE 519 263

Random Search (generic)

Step 0: Select an initial solution θ(0) and simulate its performance X(θ(0)). Set k = 0

Step 1: Select a candidate solution θ(c) from the neighborhood N(θ(k)) of the current solution and simulate its performance X(θ(c))

Step 2: If the candidate satisfies the acceptance criterion, let θ(k+1) = θ(c); otherwise let θ(k+1) = θ(k)

Step 3: If the stopping criterion is satisfied, terminate the search; otherwise let k = k+1 and go to Step 1
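Steps 0-3 can be sketched generically. This is a toy instance with my own names, a greedy acceptance criterion, and a hypothetical noisy objective whose minimum is at θ = 7; it is not any particular published procedure:

```python
import random

def random_search(theta0, simulate, neighbors, steps, rng):
    """Generic random search: keep an incumbent and its (noisy)
    performance estimate; move to a candidate from the neighborhood
    if the candidate's single estimate is better."""
    theta, est = theta0, simulate(theta0)           # Step 0
    for _ in range(steps):
        cand = rng.choice(neighbors(theta))         # Step 1
        cand_est = simulate(cand)
        if cand_est < est:                          # Step 2: acceptance criterion
            theta, est = cand, cand_est
    return theta                                    # Step 3: fixed-budget stop

rng = random.Random(11)
simulate = lambda t: (t - 7) ** 2 + rng.gauss(0.0, 0.5)   # noisy evaluation
best = random_search(0, simulate, lambda t: [t - 1, t + 1], 300, rng)
```

Because each acceptance decision compares single noisy estimates, the search only settles near, not exactly at, the optimum.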

Page 264: Simulation

IE 519 264

Random Search Variants

Specify a neighborhood structure

Specify a procedure for selecting candidates

Specify an acceptance criterion

Specify a stop criterion

Page 265: Simulation

IE 519 265

Metaheuristics

Random search methods that have been found effective for combinatorial optimization. For simulation optimization:

 Simulated annealing
 Tabu search
 Genetic algorithms
 Nested partitions method

Page 266: Simulation

IE 519 266

Simulated Annealing

Falls within the random search framework, with a novel acceptance criterion:

    P(Accept θ(c)) = 1                                       if X(θ(c)) ≤ X(θ(k))
                   = exp( −(X(θ(c)) − X(θ(k))) / T_k )       otherwise

The key parameter is T_k, which is called the temperature
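The acceptance criterion above in code (a minimal sketch for minimization; the function name is mine):

```python
import math

def sa_accept_prob(cand_est, cur_est, temp):
    """Simulated-annealing acceptance probability: always accept
    improvements; accept a worse candidate with probability
    exp(-(worsening) / T_k)."""
    if cand_est <= cur_est:
        return 1.0
    return math.exp(-(cand_est - cur_est) / temp)

p_improve = sa_accept_prob(1.0, 2.0, 1.0)   # improvement: probability 1
p_worse = sa_accept_prob(2.0, 1.0, 1.0)     # worsening of 1 at T=1: e^{-1}
```

Lower temperatures make uphill moves exponentially less likely, which is why the cooling schedule matters.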

Page 267: Simulation

IE 519 267

Temperature Parameter

Usually the temperature is decreased as the search evolves. If it decreases sufficiently slowly then asymptotic convergence is assured. For simulation optimization there are indications that a constant temperature works as well or better

Page 268: Simulation

IE 519 268

Tabu Search

Can be fit into the random search framework

A unique feature is the restriction of the neighborhood:

Solution requiring the reverse of recent moves not allowed in the neighborhood

Maintain a tabu list of moves

Other features include long term memory that restart with a different tabu list at good solutions

Has been applied successfully to simulation optimization

Page 269: Simulation

IE 519 269

Genetic Algorithms

Works with sets of solutions (populations) rather than single solutions
Operates on the population simultaneously:

 Survival
 Cross-over
 Mutation

Novel construction of a neighborhood
Has been used successfully for simulation optimization

Page 270: Simulation

IE 519 270

Nested Partitions Method

Originally designed for simulation optimization
Uses:
 Partitioning
 Random sampling
 Local search improvements

Has asymptotic convergence

Page 271: Simulation

IE 519 271

NP Method

[Diagram: the most promising region σ(k) is partitioned into subregions σ_1(k) and σ_2(k); the surrounding region σ_3(k) = Θ \ σ(k) is their complement, and the superregion s(σ(k)) contains σ(k). In the k-th iteration, j = 2 subregions]

Partition of the feasible region
In each iteration there is the most promising region σ(k)
Use sampling to determine where to go next

Page 272: Simulation

IE 519 272

Sampling

Sources of randomness:
 Performance of a subset is based on a random sample of solutions from that subset
 Performance of each individual sample is estimated using simulation

Difficulty of estimating performance depends on how much variability there is in the region
Intuitively appealing to have more sampling from regions that have high variance

Page 273: Simulation

IE 519 273

Two-Stage Sampling

Use two-stage statistical selection methods to determine the number of samples

Phase I: Obtain initial samples from each region

Calculate estimated mean and variance

Calculate how many additional samples are needed

Phase II: Obtain remaining samples

Estimate performance of region

Page 274: Simulation

IE 519 274

Convergence

Single-stage NP converges asymptotically (useless?)
Two-stage NP converges to a solution that is within an indifference zone of the optimum with a given probability
 Reasonable goal softening
 Statement familiar to simulation users

Page 275: Simulation

IE 519 275

Theory versus Practice

Asymptotically converging methods
 Good theoretical properties
 May not converge fast or be easy to use/understand

Practical methods
 Often based on heuristic search
 Do not necessarily account for randomness
 Do not guarantee convergence

Page 276: Simulation

IE 519 276

Commercial Software

SimRunner (ProModel): genetic algorithms
AutoStat (AutoMod): evolutionary & genetic algorithms
OPTIMIZ (Simul8): neural networks
OptQuest (Arena, Crystal Ball, etc.): scatter search, tabu search, neural nets

Page 277: Simulation

IE 519 277

Optimization in Practice

In academic work we have very specific definitions:
 Optimization = find the best solution
 Approximation = find a solution that is within a given distance of optimal performance
 Heuristic = seek the optimal solution

In practice, people do not always think about the theoretical ideal optimum that is the basis for all of the above:
 Optimization = improvement

Page 278: Simulation

IE 519 278

Combining Theory & Practice

Best of both worlds:
 Robustness and computational power of heuristics
 Guarantee performance somehow

Some examples:
 Combine genetic algorithms with statistical selection
 Two-stage NP method guarantees convergence within an indifference zone with a prespecified probability

Page 279: Simulation

IE 519 279

Metamodels

Page 280: Simulation

IE 519 280

Response Surfaces

Obtaining a precise simulation estimate is computationally expensive
We often want to do this for many different parameter values (and even find some optimal parameter values)
An alternative is to construct a response surface of the output as a function of these input parameters
This response surface is a model of the simulation model, that is, a metamodel

Page 281: Simulation

IE 519 281

Metamodels

Simulation can be (simply) represented as

  y = g(x)

For a single output and additive randomness, we can write this as

  y = g(x) + ε

The metamodel f models g, and ε̃ models ε

Page 282: Simulation

IE 519 282

Example

Instead of simulating an exact contour, construct a metamodel using a few values

Page 283: Simulation

IE 519 283

Regression

Most commonly, regression models have been used for metamodels:

  f(x) = Σ_k β_k p_k(x),   e.g.  p_1(x) = x_1,  p_2(x) = x_2,  p_3(x) = x_1 x_2

The issues are determining how many terms to include and estimating the coefficients

Page 284: Simulation

IE 519 284

DOE for RS Models

The coefficients are given by

  β̂ = (X^T X)^(-1) X^T y

Key issues:
 Would like to minimize the variance of β̂
 Can be done by controlling the random number stream
 Would like to estimate with fewer simulation runs
 Designs to reduce bias
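A minimal sketch of fitting such a regression metamodel via the normal equations above; the toy response surface, noise level, and design points are invented for illustration.

```python
import numpy as np

# Fit a quadratic metamodel f(x) = b0 + b1*x + b2*x^2 to noisy simulation
# output observed at a handful of design points (toy response, assumed here)
rng = np.random.default_rng(42)
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])                  # design points
y = 2.0 + 1.0 * x - 0.5 * x**2 + rng.normal(0, 0.05, x.size)

X = np.column_stack([np.ones_like(x), x, x**2])          # design matrix
beta = np.linalg.solve(X.T @ X, X.T @ y)                 # (X'X)^(-1) X'y

def predict(xnew):
    """Cheap surrogate for the expensive simulation."""
    return beta[0] + beta[1] * xnew + beta[2] * xnew**2
```

Once fitted, `predict` can be evaluated at any parameter value without further simulation runs, which is the whole point of the metamodel.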

Page 285: Simulation

IE 519 285

Why Response Surfaces?

Box, Hunter, and Hunter (1978). Statistics for Experimenters, Wiley.

Page 286: Simulation

IE 519 286

Compare with True Optimum

Why did this fail?

Page 287: Simulation

IE 519 287

Response Surface Optimization

Page 288: Simulation

IE 519 288

Second Order Model

Page 289: Simulation

IE 519 289

Experimental Process

State your hypothesisPlan an experiment

Design of Experiments (DOE)

Conduct the experiment Run a simulation

Analyze the data Output analysis

Repeat steps as needed

Page 290: Simulation

IE 519 290

DOE

Define the goals of the experiment
Identify and classify independent and dependent variables (see example)
Choose a probability model
 Linear, second order, other (see later)
Choose an experimental design
 Factorial design, fractional factorial, Latin hypercubes, etc.

Validate the properties of the design

Page 291: Simulation

IE 519 291

Example of Variables

Dependent

Independent

Throughput

Job release policy, lot size, previous maintenance, speed

Cycle Time Job release policy, lot size, previous maintenance, speed

Operating Cost

Previous maintenance, speed

Page 292: Simulation

IE 519 292

Other Metamodels

Many other approaches can be taken to metamodeling:

 Splines
  Have been used widely for deterministic simulation responses
 Radial basis functions
 Neural networks
 Kriging
  Has also been used widely in deterministic simulation and is gaining a lot of ground in stochastic simulation

Page 293: Simulation

IE 519 293

Variance Reduction

Page 294: Simulation

IE 519 294

Variance Reduction

As opposed to physical experiments, in simulation we can control the source of randomness
May be able to take advantage of this to improve precision
 Output analysis
 Ranking & selection
 Experimental designs, etc.

Several methods:
 Common random numbers
  Comparing two or more systems
 Antithetic variates
  Improving precision of a single system
 Control variates, indirect estimation, conditioning

Page 295: Simulation

IE 519 295

Common Random Numbers

Most useful technique
Use the same stream of random numbers for each system when comparing
Motivation:

  Z_j = X_1j - X_2j,    Z̄(n) = (1/n) Σ_{j=1}^n Z_j

  Var[Z̄(n)] = Var[Z_j] / n

  Var[Z_j] = Var[X_1j] + Var[X_2j] - 2 Cov(X_1j, X_2j)

If CRN induces positive covariance between X_1j and X_2j, the variance of the difference is reduced
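A toy numerical check of this motivation (nothing here is from the slides: the two "systems" are just exponential service times with different means, generated by inverse transform so the output is monotone in u):

```python
import math
import random

def exp_variate(mean, u):
    """Exponential variate by inverse transform (monotone in u)."""
    return -mean * math.log(1.0 - u)

# Compare mean service times of two systems (5 vs 4): with CRN both systems
# see the same uniforms; with independent sampling they do not
random.seed(7)
n = 2000
crn_diffs, ind_diffs = [], []
for _ in range(n):
    u1, u2 = random.random(), random.random()
    crn_diffs.append(exp_variate(5, u1) - exp_variate(4, u1))  # same stream
    ind_diffs.append(exp_variate(5, u1) - exp_variate(4, u2))  # independent

mean = lambda xs: sum(xs) / len(xs)
var = lambda xs: sum((x - mean(xs)) ** 2 for x in xs) / (len(xs) - 1)
# CRN induces positive covariance, so Var[X1 - X2] shrinks sharply
```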

Page 296: Simulation

IE 519 296

Applicability

Page 297: Simulation

IE 519 297

Synchronization

We must match up random numbers from the different systems
Careful synchronization of the random number stream:
 Assign one substream to each process
 Divide that substream up to get replications

Does not happen automatically

Page 298: Simulation

IE 519 298

Example: Failed Sync.

Page 299: Simulation

IE 519 299

Example: M/M/1 vs M/M/2

Independent sampling

CRN

Page 300: Simulation

IE 519 300

Example: Correlation Induced

Page 301: Simulation

IE 519 301

Example: System Difference

Page 302: Simulation

IE 519 302

Methods for CRN Use

Many methods assume independence between systems
 Ranking & selection, etc.
 Some methods designed to use CRN, while it violates the assumptions of others

Experimental design
 Can design the experiments specifically to take advantage of CRNs

Page 303: Simulation

IE 519 303

Discussion

Dramatic improvements can be achieved with CRN (but it can also be harmful)
Recommendations:
 Make sure CRN is applicable (pilot study)
 Use methods that take advantage of CRN
 Synchronize the RNG
 Use a one-to-one random variate generator

Page 304: Simulation

IE 519 304

Antithetic Variates

We now turn to improving the precision of a simulation of a single system
Basic idea:
 Pairs of runs
 Large observations offset by small observations
 Use the average, which will have smaller variance

Need to induce negative correlation

Page 305: Simulation

IE 519 305

Mathematical Motivation

Recall for a covariance stationary process Y_1, Y_2, ..., Y_n we have

  Var[Ȳ(n)] = (1/n) [ Var(Y_i) + 2 Σ_{l=1}^{n-1} (1 - l/n) Cov(Y_i, Y_{i+l}) ]

If the covariance terms are negative the variance will be reduced
 Difficult to get all of them negative

Page 306: Simulation

IE 519 306

Complementary Random Numbers

The simplest approach is to use complementary random numbers
Suppose service times are exponentially distributed with mean = 5 (variates generated by inverse transform)

  U      X      1-U    X       Avg. X
  0.37   4.98   0.63   2.30    3.64
  0.55   3.02   0.45   3.96    3.49
  0.98   0.09   0.02   20.17   10.13
  0.24   7.19   0.76   1.36    4.27
  0.71   1.70   0.29   6.22    3.96
  Avg.   3.40          6.80    5.10
  S.Dev. 2.78          7.70    2.83

Page 307: Simulation

IE 519 307

Example (cont.)

  U      X      1-U    X      Avg. X
  0.07   13.39  0.93   0.36   6.87
  0.35   5.26   0.65   2.15   3.70
  0.21   7.86   0.79   1.16   4.51
  0.57   2.81   0.43   4.23   3.52
  0.66   2.08   0.34   5.39   3.74
  Avg.   6.28          2.66   4.47
  S.Dev. 4.58          2.11   1.40

Does this prove that antithetic variates work for this example?
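A sketch of the same complementary-draw idea on a larger sample (the sample size and seed are arbitrary choices, not from the slides):

```python
import math
import random

def exp_variate(mean, u):
    # matches the slides' table: U = 0.37 gives X = -5 ln(0.37) ~ 4.98
    return -mean * math.log(u)

# Antithetic pairs for exponential service times with mean 5: pair each
# draw U with its complement 1-U and average the two outputs
random.seed(3)
n = 1000
pair_avgs, singles = [], []
for _ in range(n):
    u = random.random()
    x1, x2 = exp_variate(5, u), exp_variate(5, 1 - u)
    pair_avgs.append((x1 + x2) / 2)   # antithetic pair average
    singles.append(x1)                 # plain sampling for comparison

mean = lambda xs: sum(xs) / len(xs)
var = lambda xs: sum((x - mean(xs)) ** 2 for x in xs) / (len(xs) - 1)
# Negative correlation within each pair pushes the pair-average variance
# well below Var(X)/2, which is what independent pairs would give
```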

Page 308: Simulation

IE 519 308

What You Need

As for CRN, we need a monotone relationship from the (many) unit uniform random numbers to the (single) output X that we are interested in
(When there are multiple outputs, monotonicity needs to hold for each output)
As before:

 Synchronization
 Inverse-transform

Page 309: Simulation

IE 519 309

Formulation

Simulation inputs for a pair of antithetic runs are generated from complementary uniforms:

  X_{2j-1} = F^(-1)(U_j),    X_{2j} = F^(-1)(1 - U_j)

The paired outputs Y_{2j-1} and Y_{2j} are then averaged:

  Ȳ_j = ( Y_{2j-1} + Y_{2j} ) / 2

Page 310: Simulation

IE 519 310

Example: M/M/1 Queue

Independent sampling

Antithetic sampling

Page 311: Simulation

IE 519 311

Complementary Processes

Imagine a queueing simulation with arrivals and services
Large interarrival times will in general have the same effect on performance measures as small service times
Idea: use the random numbers used for generating interarrival times in the first run of a pair to generate service times in the second run, and vice versa
This could be extended to any situation where you can argue similar complementary behavior

Page 312: Simulation

IE 519 312

Combining CRN with AV

Both CRN and AV are using very similar ideas, so why not combine them?System 1: Run 1.1 and Run 1.2System 2: Run 2.1 and Run 2.2If we have all the right correlations:

Run 1.1 and Run 2.2 are positively correlated Run 1.2 and Run 2.1 are positively correlated

Thus, the overall performance may be worse

Page 313: Simulation

IE 519 313

Discussion

Basic idea is to induce negative correlation to reduce variance
Success is model dependent
Must show that it works:
 Based on model structure
 Pilot experiments

Since we need a monotone relationship between the RNG and output: synchronization and inverse-transform

Page 314: Simulation

IE 519 314

Control Variates

We are again interested in improving the precision of some output Y
X is the control variate (any correlated r.v. with known mean E[X]):

  Y_C = Y - a ( X - E[X] )

  Y          observed value of the output
  X - E[X]   deviation from the known mean
  a          is > 0 if X and Y are positively correlated,
             < 0 if X and Y are negatively correlated

Page 315: Simulation

IE 519 315

Estimator Properties

The controlled estimator is unbiased:

  E[Y_C] = E[ Y - a(X - E[X]) ] = E[Y] - a ( E[X] - E[X] ) = E[Y]

The variance is

  Var[Y_C] = Var[ Y - a(X - E[X]) ] = Var[Y] + a^2 Var[X] - 2a Cov(X, Y)

So

  Var[Y_C] < Var[Y]   iff   2a Cov(X, Y) > a^2 Var[X]

Page 316: Simulation

IE 519 316

Optimal a Given Y

Setting the derivative of the variance to zero:

  0 = d/da Var[Y_C] = d/da ( Var[Y] + a^2 Var[X] - 2a Cov(X, Y) )
    = 2a Var[X] - 2 Cov(X, Y)

  =>  a* = Cov(X, Y) / Var[X]

Since d^2/da^2 Var[Y_C] = 2 Var[X] > 0, a* minimizes the variance

Page 317: Simulation

IE 519 317

Optimal Variance

With the optimal value a*:

  Var[Y_C*] = Var[Y] + (a*)^2 Var[X] - 2 a* Cov(X, Y)
            = Var[Y] + Cov^2(X, Y) / Var[X] - 2 Cov^2(X, Y) / Var[X]
            = Var[Y] - Cov^2(X, Y) / Var[X]
            = ( 1 - ρ_XY^2 ) Var[Y]

Page 318: Simulation

IE 519 318

Observations

By using the optimal value a*:

  Var[Y_C*] = ( 1 - ρ_XY^2 ) Var[Y] ≤ Var[Y],   since 0 ≤ ρ_XY^2 ≤ 1

 The controlled estimator is never more variable than the uncontrolled estimator
 If there is any correlation, the controlled estimator is more precise
 Perfect correlation means a perfect estimate

Where's the catch?

Page 319: Simulation

IE 519 319

Estimating a*

We never know Cov[X, Y] and hence not a*
Need to estimate:

  â*(n) = Ĉ_XY(n) / S_X^2(n)

  Ĉ_XY(n) = Σ_{j=1}^n ( X_j - X̄(n) )( Y_j - Ȳ(n) ) / ( n - 1 )

  Ȳ_C(n) = Ȳ(n) - â*(n) ( X̄(n) - E[X] )

This will in general be a biased estimator
Jackknifing can be used to reduce bias
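A sketch of the estimated-a* recipe above on synthetic data (the linear relation between X and Y and all parameters are invented; E[X] = 0.9 echoes the service-time example that follows):

```python
import random

# Control-variate sketch: Y is a noisy output correlated with X, whose mean
# E[X] is known. Estimate a* = Cov(X,Y)/Var(X) from the data and correct Y
random.seed(11)
n = 500
xs, ys = [], []
for _ in range(n):
    x = random.expovariate(1 / 0.9)      # "service time", known E[X] = 0.9
    y = 3.0 * x + random.gauss(0, 0.3)   # toy output, positively correlated
    xs.append(x)
    ys.append(y)

mean = lambda v: sum(v) / len(v)
xbar, ybar = mean(xs), mean(ys)
cov_xy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)
var_x = sum((x - xbar) ** 2 for x in xs) / (n - 1)

a_hat = cov_xy / var_x                       # estimated a*
y_controlled = ybar - a_hat * (xbar - 0.9)   # controlled estimator of E[Y]
# y_controlled is far less variable than ybar as an estimator of E[Y] = 2.7
```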

Page 320: Simulation

IE 519 320

Example: M/M/1 Queue

Want to estimate the expected customer delay in the system
Possible control variates:
 Service times (positive correlation)
 Interarrival times (negative correlation)

Page 321: Simulation

IE 519 321

Example: Service Time CVs

Rep Y X

1 13.84 0.92

2 3.18 0.95

3 2.26 0.88

4 2.76 0.89

5 4.33 0.93

6 1.35 0.81

7 1.82 0.84

8 3.01 0.92

9 1.68 0.85

10 3.60 0.88

  Ȳ(10) = 3.78,   X̄(10) = 0.89,   E[X] = 0.9

  S_X^2(10) = 0.002,   Ĉ_XY(10) = 0.07

  â*(10) = Ĉ_XY(10) / S_X^2(10) = 35.00   (a* shouldn't be known, so it is estimated)

  Ȳ_C(10) = Ȳ(10) - â*(10) ( X̄(10) - E[X] ) = 3.78 - 35(-0.01) = 4.13

Page 322: Simulation

IE 519 322

Multiple Control Variates

Perhaps we have two correlated random variables (e.g., both service times and interarrival times), combined into a single control:

  X = X^(1) + X^(2)

  Y_C = Y - a ( X - E[X] ) = Y - a ( X^(1) - E[X^(1)] ) - a ( X^(2) - E[X^(2)] )

Problems?

Page 323: Simulation

IE 519 323

Multiple Control Variates

To take best advantage of each control variate we need different weights:

  Y_C = Y - a_1 ( X^(1) - E[X^(1)] ) - a_2 ( X^(2) - E[X^(2)] )

Find the partial derivatives with respect to both and solve for the optimal values as before

Page 324: Simulation

IE 519 324

In General

  Y_C = Y - Σ_{i=1}^m a_i ( X^(i) - E[X^(i)] )

  Var[Y_C] = Var[Y] + Σ_{i=1}^m Σ_{j=1}^m a_i a_j Cov( X^(i), X^(j) )
             - 2 Σ_{i=1}^m a_i Cov( Y, X^(i) )

Page 325: Simulation

IE 519 325

Types of Control Variates

Internal
 Input random variables, or functions of those random variables
 Known expectation
 Must generate anyway

External
 We cannot know E[Y]
 However, with some simplifying assumptions we may have an analytical model that we can solve, and hence know the corresponding output E[Y']
 Requires a simulation of the simplified system

Page 326: Simulation

IE 519 326

Indirect Estimation

Primarily been used for queueing simulations

  D_i = delay of i-th customer,                     d = E[D_i]
  W_i = total wait of i-th customer,                w = E[W_i]
  Q(t) = number of customers in queue at time t
  L(t) = number of customers in system at time t

Page 327: Simulation

IE 519 327

Direct Estimators

  d̂(n) = (1/n) Σ_{i=1}^n D_i

  ŵ(n) = (1/n) Σ_{i=1}^n W_i

  Q̂(n) = ( 1/T(n) ) ∫_0^T(n) Q(t) dt

  L̂(n) = ( 1/T(n) ) ∫_0^T(n) L(t) dt

Page 328: Simulation

IE 519 328

Known Relationships

  W_i = D_i + S_i,   where S_i = service time of the i-th customer

  ŵ(n) = d̂(n) + S̄(n),   S̄(n) = (1/n) Σ_{i=1}^n S_i,   E[ S̄(n) ] = E[S]

Can we take advantage of this?

Page 329: Simulation

IE 519 329

Indirect Estimator

Replace the average with the known expectation:

  w̃(n) = d̂(n) + E[S]

Avoids the variation in S̄(n)
For any G/G/1 queue it can be shown that

  Var[ w̃(n) ] ≤ Var[ ŵ(n) ]

Is this trivial?

Page 330: Simulation

IE 519 330

Little’s Law

A key result from queueing is Little's Law:

  Q = λ d,   L = λ w,   λ = arrival rate to the system

Indirect estimators of the average number of customers in the queue/system:

  Q̃(n) = λ d̂(n)

  L̃(n) = λ w̃(n) = λ ( d̂(n) + E[S] )

Page 331: Simulation

IE 519 331

Numerical Example

M/G/1 queue

  Service Dist.       ρ = .5   ρ = .7   ρ = .9
  Exponential           15       11       4
  4-Erlang              22       17       7
  Hyperexponential       4        3       2

Page 332: Simulation

IE 519 332

Conditioning

Again replace an estimate with its exact analytical value, hence removing a source of variability:

  E[X | Z = z]   analytically known

  E[X] = E[ E[X|Z] ]

  Var[X] = E[ Var[X|Z] ] + Var[ E[X|Z] ]   =>   Var[ E[X|Z] ] ≤ Var[X]
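A numerical check of this decomposition on a made-up example (Z a discrete state, X normal given Z; none of this is from the slides):

```python
import random

# Check Var[X] = E[Var[X|Z]] + Var[E[X|Z]], so Var[E[X|Z]] <= Var[X]
random.seed(2)
n = 20000
xs, conds = [], []
for _ in range(n):
    z = random.randint(0, 4)             # discrete "state" Z
    x = random.gauss(2.0 * z, 1.0)       # X | Z=z ~ Normal(2z, 1)
    xs.append(x)
    conds.append(2.0 * z)                # E[X | Z=z] = 2z, known exactly

mean = lambda v: sum(v) / len(v)
var = lambda v: sum((x - mean(v)) ** 2 for x in v) / (len(v) - 1)
# var(conds) is close to Var[X] - E[Var[X|Z]] = Var[X] - 1
```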

Page 333: Simulation

IE 519 333

Discussion

We need:
 Z can be easily generated
 E[X | Z = z] can be easily calculated
 E[ Var[X|Z] ] is large

This is going to be very model dependent

Page 334: Simulation

IE 519 334

Example: Time-Shared Computer Model

Want to estimate the expected delay in queue for CPU (dC), disk (dD) and tape (dT)

Page 335: Simulation

IE 519 335

Conditioning

Estimating d_T may be hard due to lack of data
Observe the number N_T in the tape queue every time a job leaves the CPU
If this job were to go to the tape queue, its expected delay would be

  E[ D_T | N_T ] = E[S_T] N_T = 12.50 N_T

  ( E[ D_T | N_T = z ] = 12.50 z is known analytically )

Variance reduction of 56% was observed

Page 336: Simulation

IE 519 336

Discussion

Both indirect estimation and conditioning are extremely application dependent
They require both good knowledge of the system and some background from the analyst
Can achieve good variance reduction when used properly

Page 337: Simulation

IE 519 337

Variance Reduction Discussion

Have discussed several methods:
 Common random numbers

Antithetic random variates

Control variates

Indirect estimation

Conditioning

More application specific

Page 338: Simulation

IE 519 338

Applicability & Connections

Can we use VRT with any technique for output analysis (e.g., batch-means)?
Can we use VRT (especially CRN) with ranking-and-selection and multiple comparison methods?
Can we design our simulation experiments (DOE) to take advantage of VRT (especially when building a metamodel)?
Can we use VRT with simulation optimization techniques?

Page 339: Simulation

IE 519 339

VRT & Batch-Means

Batch-means is a very important method for output analysis (non-overlapping & overlapping)
The problem is that there may be correlation between batches
Generally no problem with the use of common random numbers or antithetic variates
Use of control variates requires some additional consideration but can be done

Page 340: Simulation

IE 519 340

VRT & Ranking & Selection

In R&S we need to make statements about the differences X̄_i(n) - X̄_l(n)

If we use CRNs then the averages will be dependent, which complicates analysis
Two general methods:
 Look at pair-wise differences (with the Bonferroni inequality)
 Assume some structure for the dependence induced by the CRNs

Page 341: Simulation

IE 519 341

Pair-Wise Differences

We can replace the observations

  X_ij(n),   i = 1, 2, ..., k;  j = 1, 2, ..., n

with the pair-wise differences

  X_ij(n) - X_lj(n),   j = 1, 2, ..., n

This will then include the effect of the CRN-induced covariance

Page 342: Simulation

IE 519 342

Bonferroni Approach

We can break up the joint statement using the Bonferroni inequality:

  P( ∩_{i=1}^{k-1} A_i ) ≥ 1 - Σ_{i=1}^{k-1} P( A_i^c )

  e.g., A_i is a statement about the difference X̄_i(n) - X̄_k(n)

Very conservative approach

Page 343: Simulation

IE 519 343

Assumed Structure

E.g., the Nelson-Matejcik modification of two-stage ranking and selection assumes sphericity:

  Cov( X_ij, X_lj ) = β_i + β_l + τ^2 δ_il,   δ_il = 1 if i = l, else 0

(so the variance of every pair-wise difference is the same)

Turns out to be a robust assumption

Page 344: Simulation

IE 519 344

VRT & DOE/Metamodeling

An experimental design X is used in many simulation studies to construct a metamodel (usually a regression model) of the response:

  y = Xβ + ε

Can we take advantage of variance reduction to improve the design?

Page 345: Simulation

IE 519 345

2^3 Factorial Design

How would you use VRT for this design?

Page 346: Simulation

IE 519 346

Assignment Rule

In an m-point experiment that admits orthogonal blocking into two blocks of size m1 and m2, use a common stream of random numbers for the first block and the antithetic random numbers for the second block

  Block 1:  U_i = ( u_i1, u_i2, ... ),          i = 1, 2, ..., m_1
  Block 2:  U_i = ( 1 - u_i1, 1 - u_i2, ... ),  i = m_1 + 1, ..., m

Page 347: Simulation

IE 519 347

Blocking

Page 348: Simulation

IE 519 348

2^3 Factorial Design in 2 Blocks

In physical experiments we block because we have to (we lose the three-way interaction effect)
In simulation we do it because it is better

Page 349: Simulation

IE 519 349

VRT and Optimization

Most of the optimization techniques used with simulation do not make any assumptions (e.g., just heuristics from deterministic optimization applied to simulation)
 No problem with using variance reduction

Nested partitions method
 Needs independence between iterations
 Key is to make a correct selection of a region in each iteration
 Can use CRN within each iteration to make that selection

Page 350: Simulation

IE 519 350

Discussion

Variance reduction techniques can be very effective in improving the precision of simulation experiments
Of course variance is only part of the equation, and you should also consider bias and efficiency:

  Bias(X) = E[X] - θ,   where θ is the quantity being estimated

  MSE(X) = E[ (X - θ)^2 ] = Var[X] + Bias^2(X)

  Eff(X) = 1 / ( MSE(X) · C(X) ),   C(X) = cost of computing X

Page 351: Simulation

IE 519 351

Case Study

Page 352: Simulation

IE 519 352

Manufacturing Simulations

Objectives:
 Increased throughput
 Reduced in-process inventory
 Increased utilization
 Improved on-time delivery
 Validate a proposed design
 Improved understanding of the system

Page 353: Simulation

IE 519 353

Evaluate Need for Resources

How many machines are needed?
Where to put inventory buffers?
Effect of change/increase in production mix/volume?
Evaluation of capital investments (e.g., new machine)

Page 354: Simulation

IE 519 354

Performance Evaluation

Throughput
Response time
Bottlenecks

Page 355: Simulation

IE 519 355

Evaluate Operational Procedures

Scheduling
Control strategies
Reliability
Quality control

Page 356: Simulation

IE 519 356

Sources of Randomness

Interarrival times between orders, parts, or raw material
Processing, assembly, or inspection times
Times to failure
Times to repair
Loading/unloading times
Setup times
Rework
Product yield

Page 357: Simulation

IE 519 357

Example: Assembly Line

Increase in demand expected
Questions about the ability of an assembly line to meet demand
Requested to simulate the line to evaluate different options to improve throughput

Page 358: Simulation

IE 519 358

Project Objective

Improve throughput on the line
Simulate the following options:
 Optimize logic of the central conveyor loop
 Reconfigure the functional test stations to allow parallel flow of pallets
 Eliminate the conveyor and move to manual material handling

Page 359: Simulation

IE 519 359

Assembly Line

[Line layout:]
 Manual Station 1: assemble heatsinks and fans; soldering
 Manual Station 2: install power module onto power PCBA
 Hi-Pot Test
 Strapping
 Flashing
 Functional Tests
 HIM Assembly
 Verification Test
 Packaging

Page 360: Simulation

IE 519 360

Simulation Project

Define a conceptual model of the line
Gather data on processes
Validate the model
Implement the model in Arena
Test the model on known scenarios
Evaluate options
Recommend solutions

Page 361: Simulation

IE 519 361

How Can Throughput be Improved?

Change the queuing logic
 This determines how pallets move from one station to the next
Flash station
 Two stations in a single loop
Functional test station
 Three loops with two stations each

Page 362: Simulation

IE 519 362

Logic of the Flash Stations

1. Frame goes to the 2nd station
2. Frame goes to the 2nd station and waits in the queue
3. Frame goes to the 1st station
4. Frame goes to the 1st station and waits in the queue

Page 363: Simulation

IE 519 363

Logic of the Functional Test

[Diagram: functional test stations 1-2, 3-4, and 5-6 arranged in three loops]

Page 364: Simulation

IE 519 364

How Can Throughput be Improved?

Reconfigure functional test stations
 Parallel tests would be more efficient with respect to flow of pallets
 Take up more space on the floor (longer distances)

Is it worthwhile to reconfigure?

Page 365: Simulation

IE 519 365

Circulate Pallets in System

Page 366: Simulation

IE 519 366

How can Throughput be Improved?

Eliminate the conveyor
 Manual material handling

Increase number of pallets
 Currently 48 pallets
 Sometimes run out

Page 367: Simulation

IE 519 367

Arena Simulation Model

The conceptual model was implemented using the Arena software
The current configuration was simulated and the output compared to what has been observed
Performance of several alternative configurations was compared

Page 368: Simulation

IE 519 368

Options Considered

Current configuration
Pallets re-circulate rather than queue
Various queue logics at functional tests
Flash stations in series, functional tests in parallel
Both flash and functional test stations in parallel
Increased number of pallets in system
Eliminate conveyor

Page 369: Simulation

IE 519 369

Queue Logic Options

Option 1: Queue one drive at the second station in each loop, starting with the furthest away loop
Option 2: Queue one drive at both stations in each loop, starting with the furthest away loop
Option 3: No queuing in loops
Option 4: Queue at the second station in the first loop only, starting with the furthest away loop

Page 370: Simulation

IE 519 370

Throughput Comparison

Configuration                  Throughput (drives/day)
Current                        265

Recirculation of pallets 275 (4% increase)

Queue logic: Option 1 274 (3% increase)

Queue logic: Option 2 279 (5% increase)

Queue logic: Option 3 280 (6% increase)

Queue logic: Option 4 295 (11% increase)

Mixed series/parallel 282 (6% increase)

All tests in parallel 296 (12% increase)

Increase to 60 pallets (25%) 291 (10% increase)

No Conveyor 256 (3% decrease)

Page 371: Simulation

IE 519 371

Why Does Throughput Improve?

Consider the utilization of the test stations:
 First loop utilization = 0.81 (Station 1: 0.67, Station 2: 0.94; difference of 0.27)
 Second loop utilization = 0.63 (Station 3: 0.45, Station 4: 0.81; difference of 0.36)
 Third loop utilization = 0.41 (Station 5: 0.30, Station 6: 0.52; difference of 0.22)

Page 372: Simulation

IE 519 372

Improving Utilization

Backfilling will improve the balance between different loops (Option 2)
 Loop utilization: 0.53, 0.68, 0.76
 Does not solve the whole issue

Not queuing at test stations will balance the load between stations within a loop (Option 3)
 Station utilization: 0.64, 0.65, 0.75, 0.71, 0.80, 0.76
 No queuing may leave a station empty too easily

Page 373: Simulation

IE 519 373

Intermediate Options

Option 1: Queue one drive at the second station in each loop, starting with the furthest away loop
 Backfilling
 Balance between no queuing and the current method of queuing one drive at each station

Option 4: Queue at the second station in the first loop only, starting with the furthest away loop
 Uses the backfilling idea
 Balance between no queuing and queuing at the second station

Page 374: Simulation

IE 519 374

Utilization Comparison

Configuration                  Functional Test Utilization
Current                        0.67, 0.94, 0.45, 0.81, 0.30, 0.52

Recirculation of pallets 0.71, 0.94, 0.51, 0.86, 0.26, 0.54

Queue logic: Option 1 0.42, 0.67, 0.56, 0.83, 0.67, 0.91

Queue logic: Option 2 0.39, 0.67, 0.52, 0.84, 0.61, 0.91

Queue logic: Option 3 0.64, 0.65, 0.75, 0.71, 0.80, 0.76

Queue logic: Option 4 0.53, 0.82, 0.65, 0.65, 0.73, 0.70

Mixed series/parallel 0.69

All tests in parallel 0.71

Increase to 60 pallets 0.70, 0.95, 0.49, 0.83, 0.43, 0.61

Page 375: Simulation

IE 519 375

Comments on Utilization

Utilization of functional test stations is currently uneven and can be improved
Key ideas:
 Backfilling
 Correct amount of queuing allowed

Page 376: Simulation

IE 519 376

Bottleneck Analysis

Utilization of various stations:
 Manual station 1: 80% (bottleneck*)
 Manual station 2: 65%
 Soldering: 79% (bottleneck*)
 Hi-Pot: 15%
 Strapping: 37%
 Flashing: 53% average
 Functional test: 71% average (third highest utilization)
  Functional test station 2: 94% (bottleneck)
 HIM: 31%
 Verification: 47%
 Packing: 57%

*Statistically equivalent

Page 377: Simulation

IE 519 377

Bottleneck Identification

Functional Test Station 2 is the most heavily loaded station on the line
On average, the functional test stations are slightly less loaded than Manual Station 1 and the Soldering Station, which should hence also be considered bottlenecks

Page 378: Simulation

IE 519 378

Functional Test Bottleneck

Configuration                  Queue Length
Current                        0.73 ± 0.34, MAX=10 (21%)

Recirculation of pallets 0.17 ± 0.04 MAX=1 (2%)

Queue logic: Option 1 0.72 ± 0.32 MAX=10 (21%)

Queue logic: Option 2 0.20 ± 0.10 MAX=8 (17%)

Queue logic: Option 3 1.28 ± 0.50 MAX=17 (35%)

Queue logic: Option 4 0.79 ± 0.27 MAX=11 (23%)

Mixed series/parallel 0.96 ± 0.26 MAX=19 (40%)

All tests in parallel 1.14 ± 0.33 MAX=14 (29%)

Increase to 60 pallets 1.34 ± 0.55 MAX=16 (33%)

Page 379: Simulation

IE 519 379

Comments on Queue Length

Functional test queue:
 Average queue length relatively short
 Occasionally very long queues

Similar results for other stations, e.g., the HIM assembly station
Not a cause for concern

Page 380: Simulation

IE 519 380

Recommendations

Throughput can be improved by:
 Queuing logic at test stations
  Requires reprogramming of the conveyor
 Configuring test stations in parallel
  Requires significant reorganization
  ROI must be evaluated carefully
 Increasing the number of pallets
  Currently close to the point of rapidly diminishing returns
  Will not combine well with other improvements

Page 381: Simulation

IE 519 381

Further Improvements

Optimal logic of the functional tests depends on the mix of drives, daily load, etc.
Possibility of dynamically changing the logic?
Determine a relationship between product mix parameters and the best logic

Page 382: Simulation

IE 519 382

Other Areas of Improvement

Scheduling of drives
 Mix of frames made on each day
 Order in which different frames are made

Suggestion: grouping and spacing
 Group similar drives together for efficiency
 Space time-consuming drives apart
 Account for deadlines and resource availability

Page 383: Simulation

IE 519 383

Will Scheduling Improvements Help?

Simulation results
 Assume batch sizes within a certain range and a certain most common batch size

  Type      Min  Max  Most common  Throughput
  Batching   1    27      14          279
  Batching   1    22      10          265
  No Batch   1     1       1          274

Clearly improvements can be made

Page 384: Simulation

IE 519 384

Discussion

Significant improvement can be obtained through inexpensive changes
 Recommend changing the queuing logic as an inexpensive but high-return alternative

Worthwhile to consider issues of scheduling
Simulation model can be reused to consider other potential improvements
Company followed the recommendations and increased throughput as predicted