Simulation

Post on 21-Dec-2015


Transcript of Simulation

IE 519 1

Simulation Modelling

IE 519 2

Contents

Input Modelling 3
Random Number Generation 41
Generating Random Variates 80
Output Analysis 134
Resampling Methods 205
Comparing Multiple Systems 219
Simulation Optimization 248
Metamodels 278
Variance Reduction 292
Case Study 350

IE 519 3

Input Modelling

IE 519 4

Input Modelling

You make custom Widgets
How do you model the input process?
Is it deterministic? Is it random?

Look at some data

IE 519 5

Orders

1/5/2004, 1/12/2004, 1/20/2004, 1/29/2004

2/3/2004, 2/15/2004, 2/19/2004, 2/25/2004, 2/28/2004

3/6/2004, 3/15/2004, 3/27/2004, 3/31/2004

4/10/2004, 4/14/2004, 4/17/2004, 4/21/2004, 4/22/2004, 4/28/2004

5/2/2004, 5/3/2004

5/24/2004, 5/26/2004

6/4/2004, 6/15/2004

Now what?

IE 519 6

Histogram

(Figure: histogram with bins [0,2], [2,4], [4,6], [6,8], [8,10], [10,12], [12,14], [14,16], [16,18], [18,20], and 20+, with counts on a 0 to 8 scale)

IE 519 7

Other Observations

Trend? Stationary or non-stationary process

Seasonality? May require multiple processes

IE 519 8

Choices for Modelling

Use the data directly (trace-driven simulation)
Use the data to fit an empirical distribution
Use the data to fit a theoretical distribution

IE 519 9

Assumptions

To fit a distribution, the data should be drawn from IID observations
Could it be from more than one distribution? Statistical test
Is it independent? Statistical test

IE 519 10

Activity I

Hypothesize families of distributions
Look at the data
Determine what is a reasonable process
Summary statistics
Histograms
Quantile summaries and box plots

IE 519 11

Activity II

Estimate the parameters
Maximum likelihood estimator (MLE)
Sometimes a very simple statistic
Sometimes requires numerical calculations

IE 519 12

Activity III

Determine quality of fit
Compare theoretical distribution with observations graphically
Goodness-of-fit tests:

Chi-square tests
Kolmogorov-Smirnov test

Software

IE 519 13

Chi-Square Test

Formal comparison of a histogram and the probability density/mass function

Divide the range of the fitted distribution into k intervals

[a_0, a_1), [a_1, a_2), \ldots, [a_{k-1}, a_k)

Count the number of observations in each interval:

N_j = number of X_i's in [a_{j-1}, a_j)

IE 519 14

Chi-Square Test

Compute the expected proportion

p_j = \int_{a_{j-1}}^{a_j} \hat{f}(x)\,dx for continuous data

p_j = \sum_{a_{j-1} \le x_i < a_j} \hat{p}(x_i) for discrete data

Test statistic is

\chi^2 = \sum_{j=1}^{k} \frac{(N_j - n p_j)^2}{n p_j}

Reject if too large

IE 519 15

How good is the data?

Assumption of IID observations
Sometimes time-dependent (non-stationary)
Assessment:
Correlation plot
Scatter diagram
Nonparametric tests

IE 519 16

Correlation Plot

Calculate and plot the sample correlation

\hat{\rho}_j = correlation of observations j apart

H_0: \rho_j = 0

IE 519 17

Scatter Diagram

Plot pairs (X_i, X_{i+1})

Should be scattered randomly through the plane
If there is a pattern then this indicates correlation

IE 519 18

Multiple Data Sets

Often you have multiple data sets (e.g., different days, weeks, operators)

Is the data drawn from the same process (homogeneous) and can thus be combined? Kruskal-Wallis test

X_{11}, X_{12}, \ldots, X_{1 n_1}
X_{21}, X_{22}, \ldots, X_{2 n_2}
\vdots
X_{k1}, X_{k2}, \ldots, X_{k n_k}

IE 519 19

Kruskal-Wallis (K-W) Statistic

Assign rank 1 to the smallest observation, rank 2 to the second smallest, etc. Calculate

R(X_{ij}) = rank of X_{ij}

R_i = \sum_{j=1}^{n_i} R(X_{ij}), \quad n = \sum_{i=1}^{k} n_i

T = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(n+1)

IE 519 20

K-W Test

The null hypothesis is

H0: All the population distributions are identical

H1: At least one population tends to yield larger values than at least one other

We reject H0 at level \alpha if

T > \chi^2_{k-1, 1-\alpha}

In other words, under H0 the test statistic approximately follows a chi-square distribution with k-1 degrees of freedom

IE 519 21

Absence of Data

We have assumed that we had data to fit a distribution
Sometimes no data is available
Try to obtain minimum, maximum, and mode and/or mean of the distribution
Documentation
SMEs

IE 519 22

Triangular Distribution

IE 519 23

Symmetric Beta Distributions

==2 ==3

==5 ==10

IE 519 24

Skewed Beta Distributions

\alpha_1 = 2, \alpha_2 = 4

IE 519 25

Beta Parameters

Mean: \mu = a + (b-a)\frac{\alpha_1}{\alpha_1 + \alpha_2}

Mode: c = a + (b-a)\frac{\alpha_1 - 1}{\alpha_1 + \alpha_2 - 2}

Estimates:

\hat{\alpha}_1 = \frac{(\mu - a)(2c - a - b)}{(c - \mu)(b - a)}, \quad \hat{\alpha}_2 = \hat{\alpha}_1 \frac{b - \mu}{\mu - a}

IE 519 26

Benefits of Fitting a Parametric Distribution

We have focused mainly on the approach where we fit a distribution to data
Benefits:

Fill in gaps and smooth data
Make sure tail behavior is represented
Extreme events are very important to the simulation but may not be represented
Can easily incorporate changes in the input process
Change mean, variability, etc.
Reflect dependencies in the inputs

IE 519 27

What About Dependencies?

Assumed so far an IID process
Many processes are not:

A customer places a monthly order. Since the customer keeps inventory of the product, a large order is often followed by a small order

A distributor with several warehouses places monthly orders, and these warehouses can supply the same customers

The behavior of customers logging on to a web site depends on age, gender, income, and where they live

Do not ignore it!

IE 519 28

Solutions

A customer places a monthly order
Should use a time-series model that captures the autocorrelation

A distributor with several warehouses
Need a vector time-series model

Customers logging on to a web site
Need a random vector model where each component may have a different distribution

IE 519 29

Taxonomy of Input Models

Time-independent
  Univariate: discrete (binomial, etc.), continuous (normal, gamma, beta, etc.), mixed; empirical/trace-driven
  Multivariate: discrete (independent binomial), continuous (multivariate normal, bivariate exponential), mixed

Stochastic processes
  Discrete-time: discrete-state (Markov chains; stationary?), continuous-state (time-series models)
  Continuous-time: discrete-state (Poisson process; stationary?), continuous-state (Markov process)

(Examples of models in parentheses)

IE 519 30

What if it Changes over Time?

Do not ignore it!
Non-stationary input process
Examples:
Arrivals of customers to a restaurant
Arrivals of email to a server
Arrivals of bug discoveries in software

Could model as a nonhomogeneous Poisson process

IE 519 31

Goodness-of-Fit Test

The fitted distribution is tested using goodness-of-fit (GoF) tests
How good are those tests?
The null hypothesis is that the data is drawn from the chosen distribution with the estimated parameters
Is it true?

IE 519 32

Power of GoF Tests

The null hypothesis is always false!
If the GoF test is powerful enough then it will always be rejected
What we see in practice:

Few data points: no distribution is rejected

A great deal of data: all distributions are rejected

At best, GoF tests should be used as a guide

IE 519 33

Input Modeling Software

Many software packages exist for input modeling (fitting distributions)
Each has at least 20-30 distributions
You input IID data, the software gives you a ranked list of distributions (according to GoF tests)
Pitfalls?

IE 519 34

Why Fit a Distribution at All?

There is a growing sentiment that we should never fit distributions (not consensus, just growing)
A couple of issues:
You don't always benefit from data
Fitting distributions is misleading

IE 519 35

Is Data Reality?

Data is often:

Distorted: poorly communicated, mistranslated, or misrecorded
Dated: data is always old by definition
Deleted: some of the data is often missing
Dependent: often only summaries, or collected at certain times
Deceptive: this may all be on purpose!

IE 519 36

Problems with Fitting

Fitting an input distribution can be misleading for numerous reasons

There is rarely a theoretical justification for the distribution. Simulation is often sensitive to the tails and this is where the problem is!

Selecting the correct model is futile
The model gives the simulation practitioner a false sense of the model being well-defined

IE 519 37

Alternative

Use empirical/trace-driven simulation when there is sufficient data

Treat other cases as if there is no data, and use a beta distribution

IE 519 38

Empirical Distribution

Observations X_1, X_2, \ldots, X_n

Empirical distribution function (CDF):

\hat{F}(x) = (number of X_i's \le x) / n

Or we can order the observations X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)} and define

\hat{F}(x) = 0 for x < X_{(1)}

\hat{F}(x) = \frac{i-1}{n-1} + \frac{x - X_{(i)}}{(n-1)(X_{(i+1)} - X_{(i)})} for X_{(i)} \le x < X_{(i+1)}

\hat{F}(x) = 1 for x \ge X_{(n)}

IE 519 39

Beta Distribution Shapes

IE 519 40

What to Do?

Old rule of thumb based on number of data points available:
<20: Not enough data to fit
21-50: Fit, rule out poor choices
50-200: Fit a distribution
>200: Use empirical distribution

IE 519 41

Random Number Generation

IE 519 42

Random-Number Generation

Any simulation with random components requires generating a sequence of random numbers
E.g., we have talked about arrival times and service times being drawn from a particular distribution
We do this by first generating a random number (uniform on [0,1]) and then transforming it appropriately

IE 519 43

Three Alternatives

True random numbers
Throw a die
Not possible to do with a computer

Pseudo-random numbers
Deterministic sequence that is statistically indistinguishable from a random sequence

Quasi-random numbers
A regular distribution of numbers over the desired interval

IE 519 44

Why is this Important?

Validity
The simulation model may not be valid due to cycles and dependencies in the model

Precision
You can improve the output analysis by carefully choosing the random numbers

IE 519 45

Pseudo-Random Numbers

Want an iterative algorithm that outputs numbers on a fixed interval
When we subject this sequence to a number of statistical tests, we cannot distinguish it from a random sequence
In reality, it is completely deterministic

IE 519 46

Linear Congruential Generators (LCG)

Introduced in the early 50s and still in very wide use today
Recursive formula:

Z_i = (a Z_{i-1} + c) \bmod m

a = multiplier, c = increment, m = modulus, Z_0 = seed

Every number is determined by these four values

IE 519 47

Transform to Unit Uniform

Simply divide by m:

U_i = Z_i / m

What values can we take?
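A minimal LCG sketch (the function name and the toy parameters below are illustrative, not from the slides) makes the recursion and the division by m concrete:

```python
def lcg(a, c, m, seed):
    """Linear congruential generator: Z_i = (a*Z_{i-1} + c) mod m.

    Yields U_i = Z_i / m in [0, 1).
    """
    z = seed
    while True:
        z = (a * z + c) % m
        yield z / m

# Toy full-period mixed LCG: m = 2^4, c odd, a - 1 divisible by 4
gen = lcg(a=5, c=3, m=16, seed=7)
us = [next(gen) for _ in range(16)]
```

Because a=5, c=3, m=16 satisfy the full-period conditions given later in the slides, the 16 outputs visit all 16 possible values before cycling.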

IE 519 48

Examples

Z_i = (12 Z_{i-1}) \bmod 16, \quad Z_0 = 1

Z_i = (3 Z_{i-1}) \bmod 13, \quad Z_0 = 1

Z_i = (11 Z_{i-1}) \bmod 16, \quad Z_0 = 1

IE 519 49

Characteristics

All LCGs loop
The length of the cycle is the period
LCGs with period m have full period
This happens if and only if:

1. The only positive integer that divides both m and c is 1
2. If q is a prime that divides m, then q divides a-1
3. If 4 divides m, then 4 divides a-1

IE 519 50

Types of LCGs

If c=0 then it is called multiplicative LCG, otherwise mixed LCG

Mixed and multiplicative LCG behave rather differently

IE 519 51

Comments on Parameters

Mixed generators
Want m to be large
A good choice is m = 2^b, where b is the number of bits
Obtain full period if c is odd and a-1 is divisible by 4

Multiplicative LCGs
Simpler
Cannot have full period (first condition cannot be satisfied)
Still an attractive option

IE 519 52

Performance Tests

Empirical tests
Use the RNG to generate some numbers and then test the null hypothesis

H0: The sequence is IID U(0,1)

IE 519 53

Test 1: Chi-Square Test

Similar to before:
Generate U_1, U_2, \ldots, U_n
Split [0,1] into k subintervals (k \ge 100)
Test statistic is

\chi^2 = \frac{k}{n} \sum_{j=1}^{k} \left( f_j - \frac{n}{k} \right)^2, \quad f_j = number of U_i's in the jth subinterval

With k-1 degrees of freedom

IE 519 54

Test 2: Serial Test

Consider the non-overlapping d-vectors

U_1 = (U_1, \ldots, U_d), \quad U_2 = (U_{d+1}, \ldots, U_{2d}), \ldots

Similar to before: divide [0,1]^d into k^d subcubes and count

f_{j_1 j_2 \cdots j_d} = number of U_i's in subcube (j_1, j_2, \ldots, j_d)

\chi^2(d) = \frac{k^d}{n} \sum_{j_1=1}^{k} \cdots \sum_{j_d=1}^{k} \left( f_{j_1 \cdots j_d} - \frac{n}{k^d} \right)^2

IE 519 55

Test 3: Runs Test

Calculate, for U_1, U_2, \ldots, U_n:

r_i = number of runs up of length i, for i = 1, 2, \ldots, 5
r_6 = number of runs up of length \ge 6

Test statistic (chi-square w/ 6 d.f.)

R = \frac{1}{n} \sum_{i=1}^{6} \sum_{j=1}^{6} a_{ij} (r_i - n b_i)(r_j - n b_j)

Where the a and b values are given empirically

IE 519 56

Test 4: Correlation Test

For uniform random variables:

E[U_i] = \frac{1}{2}, \quad Var(U_i) = \frac{1}{12}

C_{ij} = Cov(U_i, U_j) = E[U_i U_j] - E[U_i] E[U_j] = E[U_i U_j] - \frac{1}{4}

If U_i and U_j are independent (i \ne j), then E[U_i U_j] = \frac{1}{4} and C_{ij} = 0

IE 519 57

Test 4: Correlation Test

Empirical estimate (lag j) is

\hat{\rho}_j = \frac{12}{h+1} \sum_{k=0}^{h} U_{1+kj} U_{1+(k+1)j} - 3, \quad h = \lfloor (n-1)/j \rfloor - 1

Test statistic

A_j = \frac{\hat{\rho}_j}{\sqrt{Var(\hat{\rho}_j)}}, \quad Var(\hat{\rho}_j) = \frac{13h + 7}{(h+1)^2}

Approximately standard normal

IE 519 58

Passing the Test

A RNG with a long period that passes a fixed set of statistical tests is no guarantee of this being a good RNG

Many commonly used generators are not good at all, even though they pass all of the most basic tests

IE 519 59

Classic LCG16807

Multiplicative LCGs cannot have full period, but they can get very close

Z_i = 16807 Z_{i-1} \bmod (2^{31} - 1), \quad U_i = \frac{Z_i}{2^{31} - 1}

Has period of 2^{31} - 2, that is, best possible
Dates back to 1969
Suggested in many simulation texts and was (is) the standard for simulation software
Still in use in many software packages

IE 519 60

Java RNG

Mixed LCG with full period

Z_i = (25214903917 Z_{i-1} + 11) \bmod 2^{48}

U_i = \frac{2^{27} \lfloor Z_{2i-1}/2^{22} \rfloor + \lfloor Z_{2i}/2^{21} \rfloor}{2^{53}}

Variant of the old drand48() Unix LCG

IE 519 61

Two more LCGs

VB:

Z_i = (1140671485 Z_{i-1} + 12820163) \bmod 2^{24}, \quad U_i = Z_i / 2^{24}

Excel:

U_i = (9821.0 U_{i-1} + 0.211327) \bmod 1

IE 519 62

Simple Simulation Tests

Collision Test
Divide [0,1) into d equal intervals (giving d^t boxes in [0,1)^t)
Generate n points in [0,1)^t
C = number of times a point falls in a box that already has a point (collision)

Birthday Spacing Test
Have k boxes; order the occupied box labels I_{(1)} \le I_{(2)} \le \cdots \le I_{(n)}
Define the spacings S_{(j)} = I_{(j+1)} - I_{(j)}
Consider Y = \#\{ j : S_{(j+1)} = S_{(j)}, \; j = 1, \ldots, n-2 \}

IE 519 63

Performance: Collision

After 2^15 numbers, VB starts failing
After 2^17 numbers, Excel starts failing
After 2^19 numbers, LCG16807 starts failing
The Java RNG does OK up to at least 2^20 numbers
Note that this means that a clear pattern is observed from the VB RNG with less than 100,000 numbers generated!

IE 519 64

Performance: B-day Spacing

After 2^10 numbers, VB starts failing
After 2^14 numbers, Excel starts failing
After 2^14 numbers, LCG16807 starts failing
After 2^18 numbers, Java starts failing

For this test, the VB RNG is only good for about 1000 numbers!
The performance gets even worse if we look at less significant digits.

IE 519 65

Combined LCG

A better RNG is obtained as follows:

Z_{1,i} = (a_{1,1} Z_{1,i-1} + \cdots + a_{1,k} Z_{1,i-k}) \bmod m_1

Z_{2,i} = (a_{2,1} Z_{2,i-1} + \cdots + a_{2,k} Z_{2,i-k}) \bmod m_2

U_i = \left( \frac{Z_{1,i}}{m_1} - \frac{Z_{2,i}}{m_2} \right) \bmod 1

Recommended parameters (k=3):

m_1 = 2^{32} - 209, \quad (a_{1,1}, a_{1,2}, a_{1,3}) = (0, 1403580, -810728)

m_2 = 2^{32} - 22853, \quad (a_{2,1}, a_{2,2}, a_{2,3}) = (527612, 0, -1370589)

Cycle length of 2^{191} with good structure

IE 519 66

Why do RNGs Fail?

We have seen that many commonly used RNGs fail simulation tests, even though they pass the standard empirical tests

Why do these RNGs fail?Need to analyze the structure of the RNG

IE 519 67

Lattice Structure

For all LCGs, the numbers generated fall in a fixed number of planes
We want this to be as many planes as possible, 'filling up' the space
This should be true in many dimensions

IE 519 68

Example: Two Full-Period LCGs

IE 519 69

LCG RANDU in 3 Dimensions

IE 519 70

Theoretical Tests

Based on analyzing the structure of the numbers that can be generated
Lattice test
Spectral test

IE 519 71

Selecting the Seed

Say we need two independent sequences of 8 numbers from the full-period generator whose cycle is

1 10 15 0 13 6 11 12 9 2 7 8 5 14 3 4 1

Z_i = (13 Z_{i-1} + 13) \bmod 16

Select seed values 1 and 15

Good RNGs will have precomputed seed values

IE 519 72

Streams and Substreams

A segment corresponding to a seed is usually called a stream
Also want to be able to get independent substreams of each stream
Example: Assign each stream to generating one type of numbers and use each substream for independent replications
Requires very long period generators and precomputed streams

IE 519 73

Analysis of RNG Z_i = (13 Z_{i-1} + 13) \bmod 16

Z: 1 10 15 0 13 6 11 12 9 2
U: 0.06 0.63 0.94 0.00 0.81 0.38 0.69 0.75 0.56 0.13

(Figure: histogram of the U's over bins [0,0.25), [0.25,0.5), [0.5,0.75), [0.75,1), counts on a 0 to 3.5 scale)

IE 519 74

Do We Need Randomness?

For certain applications, definitely
For simulation, maybe not always
Quasi-random numbers
Say we want to estimate an expected value

\mu = \int_{[0,1)^s} f(u)\,du

IE 519 75

Monte Carlo Estimate

Using n independent simulation runs:

\hat{\mu}_n = \frac{1}{n} \sum_{i=1}^{n} f(u_i)

\sqrt{n} \, (\hat{\mu}_n - \mu)/\sigma \to N(0,1), \quad \sigma^2 = Var(f(u))

Error converges at rate O(n^{-1/2})

IE 519 76

Quasi-Monte Carlo

Replace the random points with a set of points that cover [0,1)^s more uniformly

IE 519 77

Discussion

By using quasi-random numbers, we are able to achieve a faster convergence rate
When estimating an integral, real randomness is not really an issue
What about discrete event simulation?

IE 519 78

Discussion

Generating random numbers is important to every simulation project
Validity of the simulation
Precision of the output analysis

Not all RNGs are very good

IE 519 79

Discussion

Problems
Too short a period (a period of 2^{31} is not sufficient)
Unfavorable lattice structure

Numbers generated by RANDU fall on 15 planes in R^3

Inability to get truly independent subsequences
Need streams (segments), and substreams

Should choose a RNG that passes both empirical and theoretical tests, has a very long period, and allows us to get good streams

IE 519 80

Generating Random Variates

IE 519 81

Generating Random Variates

Say we have fitted an exponential distribution to interarrival times of customers
Every time we anticipate a new customer arrival (place an arrival event on the events list), we need to generate a realization of the arrival time
We know how to generate unit uniforms
Can we use this to generate exponential? (And other distributions)

IE 519 82

Two Types of Approaches

Direct
Obtain an analytical expression
Inverse transform: requires the inverse of the distribution function
Composition & convolution: for special forms of distribution functions

Indirect
Acceptance-rejection

IE 519 83

Inverse-Transform Method

IE 519 84

Formulation

Algorithm

1. Generate U ~ U(0,1)
2. Return X = F^{-1}(U)

Proof

P(X \le x) = P(F^{-1}(U) \le x) = P(U \le F(x)) = F(x)

IE 519 85

Example: Weibull

IE 519 86

Example: Exponential

F(x) = 1 - e^{-x/\beta} for x \ge 0, and 0 for x < 0

so X = F^{-1}(U) = -\beta \ln(1 - U)
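The exponential inverse transform above takes only a couple of lines (function and parameter names are illustrative):

```python
import math
import random

def exponential_inverse_transform(beta, u=None):
    """Generate X ~ exponential(mean beta) by inverse transform.

    F(x) = 1 - exp(-x/beta), so X = F^{-1}(U) = -beta * ln(1 - U);
    since 1 - U is also U(0,1), -beta * ln(U) works equally well.
    """
    if u is None:
        u = random.random()
    return -beta * math.log(1.0 - u)

x = exponential_inverse_transform(beta=2.0, u=0.5)  # = -2*ln(0.5) = 2 ln 2
```

Feeding in the median uniform u = 0.5 returns the distribution's median, 2 ln 2, as expected.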

IE 519 87

Discrete Distributions

IE 519 88

Formulation

X can take values x_1, x_2, \ldots with P(X = x_i) = p(x_i)

Algorithm

1. Generate U ~ U(0,1)
2. Return X = \min\{ x_i : F(x_i) \ge U \}

Proof: need to show P(X = x_i) = p(x_i)

IE 519 89

Continuous, Discrete, Mixed

Algorithm

1. Generate U ~ U(0,1)
2. Return X = \min\{ x : F(x) \ge U \}

IE 519 90

Discussion: Disadvantages

Must evaluate the inverse of the distribution function
May not exist in closed form
Could still use numerical methods

May not be the fastest way

IE 519 91

Discussion: Advantages

Facilitates variance reduction

X_1 = F_1^{-1}(U_1), \quad X_2 = F_2^{-1}(U_2)

Can select U_2 = U_1, or U_2 = 1 - U_1, or U_1, U_2 independent

Ease of generating truncated distributions

IE 519 92

Composition

Assume that

F(x) = \sum_{j=1}^{\infty} p_j F_j(x), \quad \sum_{j=1}^{\infty} p_j = 1

Algorithm
1. Generate a positive random integer J such that P(J = j) = p_j
2. Return X with distribution F_J

IE 519 93

Convolution

Assume that X = Y_1 + Y_2 + \cdots + Y_m (where the Y's are IID with CDF G)

Algorithm
1. Generate Y_1, Y_2, \ldots, Y_m IID, each with CDF G
2. Return X = Y_1 + Y_2 + \cdots + Y_m

IE 519 94

Acceptance-Rejection Method

Specify a function t that majorizes the density: t(x) \ge f(x) for all x

New density function: r(x) = \frac{t(x)}{\int t(y)\,dy}

Algorithm
1. Generate Y with density r
2. Generate U ~ U(0,1), independent of Y
3. If U \le f(Y)/t(Y), return X = Y. Otherwise, go back to step 1.

IE 519 95

Example:

IE 519 96

Example: More Efficient

IE 519 97

Simple Distributions

Uniform: X = a + (b - a)U

Exponential: X = -\beta \ln U

m-Erlang: X = -\frac{\beta}{m} \ln \prod_{i=1}^{m} U_i

IE 519 98

Gamma

Distribution function (for integer \alpha):

F(x) = 1 - e^{-x/\beta} \sum_{j=0}^{\alpha-1} \frac{(x/\beta)^j}{j!} for x > 0, and 0 otherwise

No closed-form inverse
Note that if X ~ gamma(\alpha, 1) then \beta X ~ gamma(\alpha, \beta)

IE 519 99

Gamma(\alpha,1) Density

IE 519 100

Gamma(\alpha,1)

Gamma(1,1) is exponential(1)

0 < \alpha < 1: acceptance-rejection with

t(x) = 0 for x \le 0; \quad \frac{x^{\alpha-1}}{\Gamma(\alpha)} for 0 < x \le 1; \quad \frac{e^{-x}}{\Gamma(\alpha)} for x > 1

This majorizes the Gamma(\alpha,1) density, but can we generate random variates?

IE 519 101

Gamma(\alpha,1), 0 < \alpha < 1

The integral of the majorizing function:

\int_{-\infty}^{\infty} t(x)\,dx = \int_0^1 \frac{x^{\alpha-1}}{\Gamma(\alpha)}\,dx + \int_1^{\infty} \frac{e^{-x}}{\Gamma(\alpha)}\,dx = \frac{b}{\alpha \Gamma(\alpha)}, \quad b = \frac{e + \alpha}{e}

New density:

r(x) = 0 for x \le 0; \quad \frac{\alpha x^{\alpha-1}}{b} for 0 < x \le 1; \quad \frac{\alpha e^{-x}}{b} for x > 1

IE 519 102

Gamma(\alpha,1), 0 < \alpha < 1

The distribution function is

R(x) = \int_0^x r(y)\,dy = \frac{x^{\alpha}}{b} for 0 < x \le 1; \quad \frac{1}{b} + \frac{\alpha}{b}\left(e^{-1} - e^{-x}\right) for x > 1

Invert:

R^{-1}(u) = (bu)^{1/\alpha} if u \le 1/b; \quad -\ln \frac{b(1-u)}{\alpha} otherwise

IE 519 103

Gamma(\alpha,1), 0 < \alpha < 1

1. Generate U_1 ~ U(0,1) and let P = bU_1. If P > 1, go to step 3. Otherwise, go to step 2.

2. Let Y = P^{1/\alpha} and generate U_2 ~ U(0,1). If U_2 \le e^{-Y}, return X = Y. Otherwise, go to step 1.

3. Let Y = -\ln[(b - P)/\alpha] and generate U_2 ~ U(0,1). If U_2 \le Y^{\alpha - 1}, return X = Y. Otherwise, go to step 1.
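The three-step scheme above translates almost line for line into Python (function name illustrative; checked here against the gamma(0.5, 1) mean):

```python
import math
import random

def gamma_alpha_lt_1(alpha):
    """Gamma(alpha, 1) variate for 0 < alpha < 1, following the
    acceptance-rejection steps on this slide, with b = (e + alpha)/e."""
    b = (math.e + alpha) / math.e
    while True:
        p = b * random.random()              # step 1: P = b*U1
        u2 = random.random()
        if p <= 1.0:
            y = p ** (1.0 / alpha)           # step 2
            if u2 <= math.exp(-y):
                return y
        else:
            y = -math.log((b - p) / alpha)   # step 3
            if u2 <= y ** (alpha - 1.0):
                return y

random.seed(1)
xs = [gamma_alpha_lt_1(0.5) for _ in range(4000)]
mean = sum(xs) / len(xs)  # E[X] = alpha = 0.5
```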

IE 519 104

Gamma(,1), 1<

Acceptance-rejection with majorizing function t(x) = c \, r(x), where r is the log-logistic density on the next slide and

\lambda = \sqrt{2\alpha - 1}, \quad c = \frac{4 \alpha^{\alpha} e^{-\alpha}}{\lambda \Gamma(\alpha)}

IE 519 105

Gamma(\alpha,1), \alpha > 1

Distribution function (log-logistic):

R(x) = \frac{x^{\lambda}}{\alpha^{\lambda} + x^{\lambda}}, \quad x > 0

Inverse:

R^{-1}(u) = \alpha \left( \frac{u}{1 - u} \right)^{1/\lambda}

IE 519 106

Normal

Distribution function does not have a closed form (so neither does the inverse)
Can use numerical methods for inverse-transform
Note that

X ~ N(0,1) \;\Rightarrow\; \mu + \sigma X ~ N(\mu, \sigma^2)

If we can generate a unit normal, then we can generate any normal

IE 519 107

Normal: Box-Muller

Algorithm

1. Generate independent U_1, U_2 ~ U(0,1)
2. Set X_1 = \sqrt{-2 \ln U_1} \cos(2\pi U_2), \quad X_2 = \sqrt{-2 \ln U_1} \sin(2\pi U_2)
3. Return X_1, X_2

Technically, independent N(0,1), but serious problem if used with LCGs

IE 519 108

Polar Method

Algorithm

1. Generate independent U_1, U_2 ~ U(0,1). Let V_i = 2U_i - 1, W = V_1^2 + V_2^2
2. If W > 1, go to step 1. Otherwise, let

Y = \sqrt{-2 \ln W / W}, \quad X_1 = V_1 Y, \quad X_2 = V_2 Y

IE 519 109

Derived Distributions

Several distributions are derived from the gamma and normal
Can take advantage of knowing how to generate those two distributions

IE 519 110

Beta

Density

f(x) = \frac{x^{\alpha_1 - 1}(1 - x)^{\alpha_2 - 1}}{B(\alpha_1, \alpha_2)} for 0 < x < 1, and 0 otherwise

B(\alpha_1, \alpha_2) = \int_0^1 t^{\alpha_1 - 1}(1 - t)^{\alpha_2 - 1}\,dt

No closed-form CDF. No closed-form inverse.
Must use numerical methods for the inverse-transform method

IE 519 111

Beta Distribution Shapes

IE 519 112

Beta Properties

Sufficient to consider beta on [0,1]
If X ~ beta(\alpha_1, \alpha_2) then 1 - X ~ beta(\alpha_2, \alpha_1)

If \alpha_2 = 1 then

f(x) = \alpha_1 x^{\alpha_1 - 1}, \quad F(x) = x^{\alpha_1}, \quad F^{-1}(u) = u^{1/\alpha_1}

If \alpha_1 = 1, \alpha_2 = 1 then X ~ U(0,1)

IE 519 113

Beta: General Approach

If Y_1 ~ gamma(\alpha_1, 1) and Y_2 ~ gamma(\alpha_2, 1), and Y_1 and Y_2 are independent, then

\frac{Y_1}{Y_1 + Y_2} ~ beta(\alpha_1, \alpha_2)

Thus, if we can generate two gamma random variates, we can generate a beta with arbitrary parameters

IE 519 114

Pearson Type V and Type VI

Pearson Type V
X ~ PT5(\alpha, \beta) iff 1/X ~ gamma(\alpha, 1/\beta)

Pearson Type VI
If Y_1 ~ gamma(\alpha_1, \beta) and Y_2 ~ gamma(\alpha_2, 1), and Y_1 and Y_2 are independent, then

\frac{Y_1}{Y_2} ~ PT6(\alpha_1, \alpha_2, \beta)

IE 519 115

Pearson Type V

IE 519 116

Pearson Type VI

IE 519 117

Normal Derived Distributions

Lognormal

Y ~ N(\mu, \sigma^2) \;\Rightarrow\; e^Y ~ LN(\mu, \sigma^2)

Test distributions (not often used for modeling):
Chi-squared
Student's t distribution
F distribution

IE 519 118

Log-Normal

IE 519 119

Empirical

Use inverse-transform method
Do not need to search through observations because changes occur precisely at 0, 1/(n-1), 2/(n-1), ...
Algorithm

1. Generate U ~ U(0,1). Let P = (n-1)U and I = \lfloor P \rfloor + 1
2. Return X = X_{(I)} + (P - I + 1)(X_{(I+1)} - X_{(I)})

IE 519 120

Empirical Distribution Function

IE 519 121

Discrete Distributions

Can always use the inverse-transform method
May not be most efficient
Algorithm

1. Generate U ~ U(0,1)
2. Return the nonnegative integer X = I that satisfies

\sum_{j=0}^{I-1} p(j) < U \le \sum_{j=0}^{I} p(j)

IE 519 122

Alias Method

Another general method is the alias method, which works for every finite range discrete distribution

IE 519 123

Alias Method: Example

p(0) = 0.1, \quad p(1) = 0.4, \quad p(2) = 0.2, \quad p(3) = 0.3

Aliases: L_0 = 1, L_2 = 3

1. Generate I ~ DU(0, n) and U ~ U(0,1)
2. If U \le F_I, return X = I. Otherwise, return X = L_I

IE 519 124

Bernoulli

Mass function

p(0) = 1 - p, \quad p(1) = p, and 0 otherwise

Algorithm

1. Generate U ~ U(0,1)
2. If U \le p, return X = 1. Otherwise, return X = 0

IE 519 125

Binomial

Mass function

p(x) = \binom{t}{x} p^x (1-p)^{t-x} for x \in \{0, 1, \ldots, t\}, and 0 otherwise

Use the fact that if X ~ bin(t, p) then

X = Y_1 + Y_2 + \cdots + Y_t, \quad Y_i ~ Bernoulli(p)

IE 519 126

Geometric

Mass function

p(x) = p(1-p)^x for x \in \{0, 1, \ldots\}, and 0 otherwise

Use inverse-transform

1. Generate U ~ U(0,1)
2. Return X = \left\lfloor \frac{\ln(1-U)}{\ln(1-p)} \right\rfloor

IE 519 127

Negative Binomial

Mass function

p(x) = \binom{s + x - 1}{x} p^s (1-p)^x for x \in \{0, 1, \ldots\}, and 0 otherwise

Note that X ~ negbin(s, p) iff

X = Y_1 + Y_2 + \cdots + Y_s, \quad Y_i ~ Geometric(p)

IE 519 128

Poisson

Mass function

p(x) = \frac{e^{-\lambda} \lambda^x}{x!} for x \in \{0, 1, \ldots\}, and 0 otherwise

Algorithm

1. Let a = e^{-\lambda}, b = 1, i = 0
2. Generate U_{i+1} ~ U(0,1) and replace b by b U_{i+1}. If b < a, return X = i. Otherwise, go to step 3.
3. Let i = i + 1 and go back to step 2.

Rather slow. No very good algorithm for the Poisson distribution.
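The uniform-product algorithm above is short to implement (function name illustrative; a sample check against the mean lambda = 4 is included):

```python
import math
import random

def poisson_variate(lam):
    """Poisson(lam) variate: multiply uniforms until the running product b
    drops below a = exp(-lam); the count of multiplications minus one is X."""
    a = math.exp(-lam)
    b, i = 1.0, 0
    while True:
        b *= random.random()
        if b < a:
            return i
        i += 1

random.seed(3)
xs = [poisson_variate(4.0) for _ in range(4000)]
mean = sum(xs) / len(xs)  # E[X] = 4
```

The expected number of uniforms per variate is lambda + 1, which is why the slide calls the method rather slow for large lambda.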

IE 519 129

Poisson Process

A stochastic process \{N(t), t \ge 0\} that counts the number of events up until time t is a Poisson process if:
Events occur one at a time
N(t+s) - N(t) is independent of \{N(u), u \le t\}

A Poisson process is determined by its rate function

\lambda(t) = \frac{d}{dt} E[N(t)]

IE 519 130

Generating a Poisson Process

Stationary with rate \lambda > 0
Times between events A_i = t_i - t_{i-1} are IID exponential
Algorithm

1. Generate U ~ U(0,1)
2. Return t_i = t_{i-1} - \frac{1}{\lambda} \ln U

IE 519 131

Nonstationary Case

Can we simply generalize?

(Figure: intensity function \lambda(t) over the interval from t_{i-1} to t_i)

IE 519 132

Thinning Algorithm

1. Set t = t_{i-1}

2. Generate U_1, U_2 IID U(0,1)

3. Replace t by t - \frac{1}{\lambda^*} \ln U_1, where \lambda^* = \max_t \lambda(t)

4. If U_2 \le \lambda(t)/\lambda^*, return t_i = t. Otherwise, go back to step 2.
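The thinning steps above can be sketched as follows (function names and the sinusoidal example intensity are illustrative, not from the slides):

```python
import math
import random

def next_arrival(t_prev, rate, rate_max):
    """Next event time of a nonhomogeneous Poisson process by thinning.

    rate(t) is the intensity function and rate_max >= max_t rate(t).
    """
    t = t_prev
    while True:
        u1, u2 = random.random(), random.random()
        t -= math.log(u1) / rate_max          # candidate from the rate_max process
        if u2 <= rate(t) / rate_max:          # accept with prob rate(t)/rate_max
            return t

random.seed(11)
rate = lambda t: 2.0 + math.sin(t)            # intensity oscillating between 1 and 3
times, t = [], 0.0
for _ in range(100):
    t = next_arrival(t, rate, rate_max=3.0)
    times.append(t)
```

Candidate events are generated from the stationary rate_max process and kept with probability rate(t)/rate_max, which "thins" them down to the desired intensity.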

IE 519 133

Summary

For any stochastic simulation it is necessary to generate random variates from either a theoretical distribution or an empirical distribution
General methods we covered:
Inverse-transform
Acceptance-rejection
Alias method

IE 519 134

Output Analysis

IE 519 135

Output Analysis

Analyzing the output of the simulation is a part that is often done incorrectly (by analysts and commercial software)
We consider several issues:

Obtaining statistical estimates of performance measures of interest
Improving precision of those estimates through variance reduction
Comparing estimates from different models
Finding the optimal performance value

IE 519 136

Simulation Output

The output from a single simulation run is a stochastic process Y1, Y2, …

Observations (n replications of length m):

y_{11}, y_{12}, \ldots, y_{1m}
y_{21}, y_{22}, \ldots, y_{2m}
\vdots
y_{n1}, y_{n2}, \ldots, y_{nm}

IE 519 137

Parameter Estimation

Want to estimate some parameter \theta based on these observations

Unbiased? E[\hat{\theta}] = \theta?

Consistent? \lim_{t \to \infty} \hat{\theta}_t = \theta?

IE 519 138

Transient vs Steady State

IE 519 139

Initial Values: M/M/1 Queue

IE 519 140

Types of Simulation

Terminating simulation

Non-terminating simulation

Steady-state parameters

Steady-state cycle parameters

Other parameters

IE 519 141

Terminating Simulation

Examples:
A retail establishment that is open for fixed hours per day
A contract to produce x number of a high-cost product
Launching of a spacecraft

Never reaches steady-state
Initial conditions are included

IE 519 142

Non-Terminating Simulation

Any system in continuous operation (could have a 'break')
Interested in steady-state parameters
Initial conditions should be discarded

Sometimes no steady-state because the system is cyclic
Then we are interested in steady-state cycle parameters

IE 519 143

Terminating Simulation

Let X_j be a random variable defined on the jth replication
Want to estimate the mean \mu = E(X_j)

Fixed-sample-size procedure
CI assumes the X_j's are normally distributed:

\bar{X}(n) \pm t_{n-1, 1-\alpha/2} \sqrt{S^2(n)/n}

IE 519 144

Quality of Confidence Interval

(Figure: CI coverage for number of failures, average delay over 25 customers, and average delay over 500 customers)

Depends on both the underlying distribution and the number of replications

IE 519 145

Specifying the Precision

Absolute error \beta: the estimator \bar{X} has absolute error \beta if |\bar{X} - \mu| \le \beta

To obtain this:

1 - \alpha \approx P(\bar{X} - \text{half-length} \le \mu \le \bar{X} + \text{half-length}) = P(|\bar{X} - \mu| \le \text{half-length})

IE 519 146

Replications Needed

To obtain an absolute error of \beta, the number of replications needed is approximately

n_a^*(\beta) = \min\left\{ i \ge n : t_{i-1, 1-\alpha/2} \sqrt{S^2(n)/i} \le \beta \right\}

IE 519 147

Relative Error

Also interested in the relative error |\bar{X} - \mu| / |\mu|
Now we have: if

P(|\bar{X} - \mu| \le \gamma |\bar{X}|) = 1 - \alpha

then

P\left( \frac{|\bar{X} - \mu|}{|\mu|} \le \frac{\gamma}{1 - \gamma} \right) \ge 1 - \alpha

so the actual relative error is at most \gamma/(1-\gamma)

IE 519 148

Replications Needed

To obtain a relative error of \gamma, the number of replications needed is approximately

n_r^*(\gamma) = \min\left\{ i \ge n : \frac{t_{i-1, 1-\alpha/2} \sqrt{S^2(n)/i}}{|\bar{X}(n)|} \le \frac{\gamma}{1 + \gamma} \right\}

IE 519 149

Sequential Procedure

Define

\delta(n, \alpha) = t_{n-1, 1-\alpha/2} \sqrt{S^2(n)/n}, \quad \gamma' = \frac{\gamma}{1 + \gamma}

Algorithm

0. Make n_0 replications and set n = n_0
1. Compute \bar{X}(n) and \delta(n, \alpha) from X_1, X_2, \ldots, X_n
2. If \delta(n, \alpha) / |\bar{X}(n)| \le \gamma', use \bar{X}(n) as the estimate of \mu and stop. Otherwise, let n = n + 1, make an additional replication, and go to step 1.
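A sketch of this sequential stopping rule (function names illustrative; as a simplifying assumption, the normal critical value z replaces the t quantile, which is reasonable once n is moderately large):

```python
import math
import random
import statistics

def sequential_mean_estimate(simulate, gamma=0.05, alpha=0.05, n0=10):
    """Add replications until the CI half-length, relative to |X-bar|,
    drops below gamma' = gamma/(1+gamma)."""
    z = statistics.NormalDist().inv_cdf(1.0 - alpha / 2.0)  # approx. t quantile
    gamma_prime = gamma / (1.0 + gamma)
    xs = [simulate() for _ in range(n0)]
    while True:
        xbar = statistics.fmean(xs)
        half = z * math.sqrt(statistics.variance(xs) / len(xs))
        if half / abs(xbar) <= gamma_prime:
            return xbar, len(xs)
        xs.append(simulate())                # one more replication, then recheck

# Toy "simulation": exponential replications with true mean 10
random.seed(5)
est, n = sequential_mean_estimate(lambda: random.expovariate(1.0 / 10.0))
```

With a coefficient of variation near 1, the rule keeps adding replications until roughly z^2/gamma'^2 of them have been made.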

IE 519 150

Other Measures

If we only use averages, the results can often be misleading or wrong
What about the variance?
Alternative/additional measures:
Proportions
Probabilities
Quantiles

IE 519 151

Example

Suppose we are interested in customer delay X. We can estimate:

Average delay E[X]

Proportion of customers with X \le a

Probabilities, e.g., P(X \le a)

The q-quantile x_q

IE 519 152

Estimating Proportions

Define an indicator function

I_i = 1 if X_i \le a, and 0 otherwise

Obtain a point estimate of the proportion

\hat{r} = \frac{1}{n} \sum_{i=1}^{n} I_i

IE 519 153

Estimating Probabilities

Want to estimate p = P(X \in B)

Have n replications X_1, X_2, \ldots, X_n

Define S = number of observations that fall in set B
S ~ binomial(n, p)
Unbiased estimate is

\hat{p} = \frac{S}{n}

IE 519 154

Estimating Quantiles

Let X_{(1)}, X_{(2)}, \ldots, X_{(n)} be the order statistics corresponding to n simulation runs
A point estimator is then

\hat{x}_q = X_{(nq)} if nq is an integer, and X_{(\lfloor nq \rfloor + 1)} otherwise

IE 519 155

Initial Conditions

In terminating simulation there is no steady-state
Hence, the initial conditions are included in the performance measure estimates
How should they be selected?

Use an artificial 'warm-up' period just to get a reasonable start-up state

Collect data and model the initial conditions explicitly

IE 519 156

Discussion

For terminating simulation we must use replications (cannot increase the length of the simulation run)
Point estimates of performance measures:

An unbiased estimate and an approximate CI are easily constructed for the mean performance
Also obtained point estimates for proportions, probabilities, and quantiles (mean not always enough)

It is important to be able to control the precision and determine how many replications are needed
Initial conditions are always included in the estimates for terminating simulations and must be selected carefully

IE 519 157

Steady-State Behavior

Now we’re interested in parameters related to the limit distribution

Problem is that we cannot wait until infinity!

yYPyF

yYPyF

yFyF

ii

ii

)(

)(

)()(

IE 519 158

Estimating Mean

Suppose we want to estimate the steady-state mean

\nu = \lim_{i \to \infty} E[Y_i]

Problem: E[\bar{Y}(m)] \ne \nu for finite m

One solution is to add a warm-up period l and get a less biased estimator

\bar{Y}(m, l) = \frac{1}{m - l} \sum_{i=l+1}^{m} Y_i

IE 519 159

Approaches for Estimating

There are numerous approaches for estimating the mean:
Replication/deletion (start with this)
One long replication:
Batch-means
Autoregressive method
Spectrum analysis
Regenerative method
Standardized time series method

IE 519 160

Choosing the Warm-Up Period

In the replication/deletion method the main issue is to choose the warm-up period
Would like E[\bar{Y}(m, l)] \approx \nu for m > l
Tradeoff:

If l is too small then we still have a large bias

If l is too large then the estimate will have a large variance

Very difficult to determine from a single replication

IE 519 161

Welch’s Procedure

Make n replications of length m and average across replications:

Y_{11}, Y_{12}, Y_{13}, \ldots, Y_{1m}
Y_{21}, Y_{22}, Y_{23}, \ldots, Y_{2m}
\vdots
Y_{n1}, Y_{n2}, Y_{n3}, \ldots, Y_{nm}

\bar{Y}_1, \bar{Y}_2, \bar{Y}_3, \ldots, \bar{Y}_m, \quad \bar{Y}_i = \frac{1}{n} \sum_{j=1}^{n} Y_{ji}

IE 519 162

Welch’s Procedure

Key is to smooth out high-frequency oscillations in the averages

Then plot the moving average

\bar{Y}_i(w) = \frac{1}{2w+1} \sum_{s=-w}^{w} \bar{Y}_{i+s} for i = w+1, \ldots, m-w

\bar{Y}_i(w) = \frac{1}{2i-1} \sum_{s=-(i-1)}^{i-1} \bar{Y}_{i+s} for i = 1, \ldots, w
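The two-case moving average above can be sketched as (function name illustrative; the six example averages below are made up):

```python
def welch_moving_average(ybar, w):
    """Welch's smoothed averages for the across-replication means ybar[0..m-1].

    Window of size 2w+1 in the interior; a shrinking symmetric window of
    size 2i-1 for the first w points (i is 1-based as on the slide).
    """
    m = len(ybar)
    out = []
    for i in range(1, m - w + 1):
        if i <= w:
            window = ybar[0:2 * i - 1]        # symmetric window of size 2i-1
        else:
            window = ybar[i - 1 - w:i + w]    # full window of size 2w+1
        out.append(sum(window) / len(window))
    return out

smoothed = welch_moving_average([4.0, 2.0, 6.0, 4.0, 5.0, 3.0], w=1)
```

Plotting `smoothed` against i, and picking l where the curve flattens out, is the visual step of Welch's procedure.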

IE 519 163

Example: Hourly Throughput

When is it warmed up?

IE 519 164

Welch’s Procedure

Much smoother and easier to tell where it has converged

Want to err on the side of selecting it too large

IE 519 165

Replication/Deletion

Similar to terminating simulation

Need n pilot runs to determine the warm-up period l, and then throw away the first l observations from the new n’ runs

X_j = \frac{1}{m - l} \sum_{i=l+1}^{m} Y_{ji}

\bar{X}(n') \pm t_{n'-1, 1-\alpha/2} \sqrt{S^2(n')/n'}

IE 519 166

Discussion

Replication/deletion approach
Easiest to understand and implement
Has good statistical performance if done correctly
Applies to all output parameters and can be used to estimate several different parameters for the same model
Can be used to compare different systems

Nonetheless, some other methods have clear advantages

IE 519 167

Covariance Stationary Process

Classic statistical inference assumes independent and identically distributed (IID) observations
Even after eliminating the initial transient this is not true for most simulations because most simulation output is auto-correlated
However, it is reasonable to assume that after the initial transient the output will be covariance stationary, that is,

\gamma_k = Cov(Y_i, Y_{i+k}) is independent of i

IE 519 168

Notation (assume covariance stationary):

Simulation output: $Y_1, Y_2, \ldots, Y_n$

Mean: $\nu = E[Y_i]$

Variance: $\gamma_0 = \sigma^2 = \mathrm{Var}(Y_i)$

Covariance: $\gamma_k = \mathrm{Cov}(Y_i, Y_{i+k})$

Correlation: $\rho_k = \gamma_k / \gamma_0$

IE 519 169

Implications of Autocorrelation

If the process is covariance stationary the average is still an unbiased estimator, that is,

$$E[\bar{Y}(n)] = \nu$$

However, the same cannot be said about the standard estimate of the variance, $S^2(n)/n$. In fact,

$$E[S^2(n)] = \gamma_0\left(1 - \frac{2}{n-1}\sum_{k=1}^{n-1}\left(1-\frac{k}{n}\right)\rho_k\right)$$

IE 519 170

Expression for Variance

Assuming a covariance stationary process it can be shown that:

$$\mathrm{Var}(\bar{Y}(n)) = \frac{\gamma_0}{n}\left(1 + 2\sum_{k=1}^{n-1}\left(1-\frac{k}{n}\right)\rho_k\right)$$

We hope the estimate of the variance is unbiased, that is,

$$E\left[\frac{S^2(n)}{n}\right] = \mathrm{Var}(\bar{Y}(n))$$

By combining the top equation above with the last equation on the previous slide, we can check this for an independent and for an autocorrelated output process.

IE 519 171

Independent Process

If the output process is independent then

$$\gamma_k = \mathrm{Cov}(Y_i, Y_{i+k}) = 0, \qquad k \ge 1,$$

so $\rho_k = 0$ for all $k \ge 1$, and hence

$$\mathrm{Var}(\bar{Y}(n)) = \frac{\gamma_0}{n}, \qquad E\left[\frac{S^2(n)}{n}\right] = \frac{\gamma_0}{n} = \mathrm{Var}(\bar{Y}(n))$$

That is, the usual variance estimator is unbiased.

IE 519 172

Autocorrelation in Process

If the process is positively correlated ($\rho_k > 0$, the usual case):

$$\mathrm{Var}(\bar{Y}(n)) = \frac{\gamma_0}{n}\left(1 + 2\sum_{k=1}^{n-1}\left(1-\frac{k}{n}\right)\rho_k\right) > \frac{\gamma_0}{n}$$

$$E\left[\frac{S^2(n)}{n}\right] = \frac{\gamma_0}{n}\left(1 - \frac{2}{n-1}\sum_{k=1}^{n-1}\left(1-\frac{k}{n}\right)\rho_k\right) < \mathrm{Var}(\bar{Y}(n))$$

Hence, the estimator has less precision than predicted and the CI is misleading.

IE 519 173

Batch-Means Estimators

Batch-means estimators are the most popular alternative to replication/deletionThe idea here is to do one very long simulation run and estimate the parameters from this runAdvantage is that the simulation only has to go through the initial transient onceAssuming covariance-stationary output

No problem estimating the mean Estimating the variance is difficult because the data

is likely to be autocorrelated, that is, Yi and Yi+1 are correlated

IE 519 174

Classical Approach

Partition the run of n observations into k equal-size contiguous macro replications (batches), each composed of m = n/k micro replications. Point estimator:

$$\hat{\nu}_j = \frac{1}{m}\sum_{i=(j-1)m+1}^{jm} Y_i, \qquad \hat{\nu} = \frac{1}{k}\sum_{j=1}^{k}\hat{\nu}_j$$
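The batching computation is simple; this is a minimal sketch assuming the run is a Python list whose length is divisible by the number of batches:

```python
import statistics

def batch_means(y, k):
    """Split one long run into k contiguous batches of m = n//k observations.
    Return the point estimate and the batch-means variance estimate
    Var_hat(nu_hat) = S^2_batch / k."""
    m = len(y) // k
    nus = [statistics.fmean(y[j * m:(j + 1) * m]) for j in range(k)]
    nu_hat = statistics.fmean(nus)
    var_hat = statistics.variance(nus) / k
    return nu_hat, var_hat
```

With k around 10-30 batches (see the rule of thumb later in this section) the batch means are treated as approximately IID normal and the usual t interval applies.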

IE 519 175

CI Analysis

Assuming as before that Y1, Y2, … is covariance stationary with $E[Y_i] = \nu$:

If the batch size is large enough, then the batch estimates $\hat{\nu}_j$ will be approximately uncorrelated
Suppose we can also choose k large enough so that they are approximately normal
It follows that the batch estimates have the same mean and variance
Hence we can treat them as approximately IID normal and get the usual confidence interval

IE 519 176

Variants of Batch-Means

Non-overlapping batches:
Batch 1: $Y_1, \ldots, Y_m$; Batch 2: $Y_{m+1}, \ldots, Y_{2m}$; and so on through $Y_n$

Spaced batches (skip l observations between batches):
Batch 1: $Y_1, \ldots, Y_m$; skip $Y_{m+1}, \ldots, Y_{m+l}$; Batch 2: $Y_{m+l+1}, \ldots$

Overlapping batches:
Batch 1: $Y_1, \ldots, Y_m$; Batch 2: $Y_2, \ldots, Y_{m+1}$; and so on

IE 519 177

Steady-State Batching

General variance estimator

$$S^2(B) = \frac{1}{|B|-1}\sum_{j\in B}\left(\hat{\nu}_j - \hat{\nu}\right)^2, \qquad \widehat{\mathrm{Var}}(\hat{\nu}) = \frac{S^2(B)}{|B|}$$

where $\hat{\nu}_j$ is the mean of the m observations starting at index j and B is the set of batch starting indices:

$$B = \{1, m+1, 2m+1, \ldots, (k-1)m+1\} \quad \text{(non-overlapping batches)}$$
$$B = \{1, 2, \ldots, n-m+1\} \quad \text{(overlapping batches)}$$

IE 519 178

Determining the Batch Size

Tradeoff:
Large batch sizes have the needed asymptotic properties
Small batch sizes yield more batches
That is, a choice between bias due to poor asymptotics and variance due to few batches

Rule of thumb (empirical):
Little benefit to more than 30 batches
Should not have fewer than 10 batches

IE 519 179

Mean Squared Error

The mean squared error (MSE) of an estimator is

$$MSE(\hat{\theta}, \theta) = E\left[(\hat{\theta} - \theta)^2\right] = \mathrm{Bias}^2(\hat{\theta}, \theta) + \mathrm{Var}(\hat{\theta})$$

This is the classic measure of quality. Can be used to select the optimal batch size.

IE 519 180

Optimal Batch Size

The asymptotically MSE-optimal batch size is

$$m^* = c\, n^{1/3}$$

where the constant c depends on:
a bias constant $c_b$
a variance constant $c_v$
the "center of gravity" of the autocovariance function

IE 519 181

Regenerative Method

Similar to batch-means, the regenerative method also tries to construct independent replications from a single run. Assume that Y1, Y2, … has a sequence of random points 1 ≤ B1 < B2 < … called regeneration points, such that the process from Bj on is independent of the process prior to Bj

The process between two successive regeneration points is called a regeneration cycle

IE 519 182

Estimating the Mean

$$Z_j = \sum_{i=B_j}^{B_{j+1}-1} Y_i, \qquad N_j = B_{j+1} - B_j$$

$$\nu = \frac{E[Z_j]}{E[N_j]}, \qquad \hat{\nu}(n') = \frac{\bar{Z}(n')}{\bar{N}(n')}$$

where n' = number of regeneration cycles.
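The ratio estimator above can be sketched directly; this is a minimal illustration assuming the regeneration points are given as 0-based indices into the output list, with the incomplete final cycle discarded:

```python
import statistics

def regenerative_estimate(y, points):
    """nu_hat = Zbar / Nbar from the cycles delimited by the given
    regeneration points (0-based indices into y)."""
    zs = [sum(y[points[j]:points[j + 1]]) for j in range(len(points) - 1)]  # Z_j
    ns = [points[j + 1] - points[j] for j in range(len(points) - 1)]        # N_j
    return statistics.fmean(zs) / statistics.fmean(ns)
```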

IE 519 183

Analysis

The estimator is not unbiased. However, it is strongly consistent:

$$\hat{\nu}(n') \to \nu \text{ as } n' \to \infty \quad \text{(w.p. 1)}$$

Let $\Sigma = [\sigma_{st}]$ be the covariance matrix of $\mathbf{U}_j = (Z_j, N_j)^T$. Let $V_j = Z_j - \nu N_j$. These are IID with mean 0 and variance

$$\sigma_V^2 = \sigma_{11} - 2\nu\sigma_{12} + \nu^2\sigma_{22}$$

IE 519 184

Analysis

From the CLT:

$$\frac{\bar{V}(n')}{\sqrt{\sigma_V^2/n'}} \xrightarrow{D} N(0,1)$$

Have estimates $\hat{\sigma}_{11}(n')$, $\hat{\sigma}_{12}(n')$, $\hat{\sigma}_{22}(n')$, giving

$$\hat{\sigma}_V^2(n') = \hat{\sigma}_{11}(n') - 2\hat{\nu}(n')\hat{\sigma}_{12}(n') + \hat{\nu}(n')^2\hat{\sigma}_{22}(n')$$

IE 519 185

Analysis

Can be shown that $\hat{\sigma}_V^2(n') \to \sigma_V^2$ (w.p. 1). Hence

$$\frac{\left(\hat{\nu}(n') - \nu\right)\bar{N}(n')}{\sqrt{\hat{\sigma}_V^2(n')/n'}} \xrightarrow{D} N(0,1)$$

We get a CI:

$$\hat{\nu}(n') \pm \frac{z_{1-\alpha/2}\sqrt{\hat{\sigma}_V^2(n')/n'}}{\bar{N}(n')}$$

IE 519 186

Non-Independence

Non-overlapping batch-means and regeneration methods try to create independence between batches/cyclesAn alternative is to use estimates of the autocorrelation structure to estimate the variance of the sample mean(Again, estimating the mean is no problem, just the variance)Spectrum analysis and autoregressive methods attempt to do this

IE 519 187

Spectral Variance Estimator

Assume the process is covariance stationary:

$$E[Y_j] = \nu, \qquad \gamma(l) = \mathrm{Cov}(Y_j, Y_{j+l})$$

The variance of the sample mean can be expressed in terms of the $\gamma(l)$. The spectral density function of the process:

$$f(\lambda) = \sum_{l=-\infty}^{\infty} \gamma(l)\cos(\lambda l)$$

Since $f(0) = \sum_l \gamma(l) \approx n\,\mathrm{Var}(\bar{Y}(n))$ for large n, an estimate of the spectral density function at frequency 0 is an estimate of the variance.

IE 519 188

Spectral Variance Estimator

Using standard results:

$$\widehat{\mathrm{Var}}(\bar{Y}(n)) = \frac{1}{n}\sum_{l=-(m-1)}^{m-1} w_n(l)\,\hat{\gamma}(l)$$

$$\hat{\gamma}(l) = \frac{1}{n}\sum_{r=1}^{n-|l|}\left(Y_r - \bar{Y}(n)\right)\left(Y_{r+|l|} - \bar{Y}(n)\right)$$

where m is the batch size and $w_n(l)$ are weights with $w_n(0) = 1$.

IE 519 189

Parameters

For the batch size: choose m so that $m \to \infty$ but $m/n \to 0$ as $n \to \infty$.

Examples of weight functions:

$$w_n(l) = \begin{cases}1 - |l|/m & |l| \le m-1\\ 0 & \text{otherwise}\end{cases} \qquad w_n(l) = \begin{cases}1 - (l/m)^2 & |l| \le m-1\\ 0 & \text{otherwise}\end{cases}$$

IE 519 190

Autoregressive Method

Again assume a covariance-stationary output process, and also a pth-order autoregressive model:

$$\sum_{j=0}^{p} b_j \left(Y_{i-j} - \nu\right) = \varepsilon_i, \qquad b_0 = 1$$

where $\{\varepsilon_i\}$ are uncorrelated random variables with mean 0 and variance $\sigma_\varepsilon^2$.

IE 519 191

Convergence Result

Can be shown that

$$\lim_{m\to\infty} m\,\mathrm{Var}(\bar{Y}(m)) = \frac{\sigma_\varepsilon^2}{\left(\sum_{j=0}^{p} b_j\right)^2}$$

Can estimate these quantities and get

$$\widehat{\mathrm{Var}}(\bar{Y}(m)) = \frac{\hat{\sigma}_\varepsilon^2}{m\left(\sum_{j=0}^{p} \hat{b}_j\right)^2}$$

A CI can be constructed using the t-distribution.

IE 519 192

What is the Coverage?Empirical results for 90% CI for two simulation models

IE 519 193

Discussion

Replication/deletion is certainly the most popular in practice (easy to understand). Batch-means is very effective; there are practical algorithms and still a lot of research. Spectral methods are still a subject of active research but probably not used much in practice (very complicated). Autoregressive methods appear not to be used/investigated much. Regeneration methods are theoretically impeccable but practically useless!

IE 519 194

Comments on Variance Estimates

We have spent considerable time looking at alternative estimates of the variance. Why does it matter? Simulation output is usually (always) autocorrelated, which makes it difficult to estimate the variance, and hence the CI may be incorrect. Most seriously, the precision of the estimate may be less than predicted, and hence inference drawn from the model may not be valid.

IE 519 195

Implications of Autocorrelation

Because simulation output is usually autocorrelated we cannot simply use all of the observations as if they were independent. We need some way of obtaining uncorrelated observations:

Replication/deletion gets this through independent replications

Batch-means gets this (almost) through non-overlapping batches

The regenerative method gets this through independent regeneration cycles

IE 519 196

Sequential Procedures

None of the single-run methods we have discussed can assure any given precision (which we need to make a decision). Several sequential procedures exist that allow us to do this

More complicated than for replication/deletion

May require very long simulation runs

IE 519 197

Good Sequential Procedures

Batch-means and relative-error stopping rule
Law and Carson procedure (1979)
Automated Simulation Analysis Procedure (ASAP) and extension ASAP3 (2002, 2005)

Spectral method and relative stopping rule
WASSP (2005)

All of these methods obtain much better coverage. However, they are rarely if ever used!

IE 519 198

Estimating Probabilities

Know how to estimate means. How about probabilities p = P[Y ∈ B]? Note that with

$$Z = \begin{cases}1 & \text{if } Y \in B\\ 0 & \text{otherwise}\end{cases}$$

$$p = P[Y \in B] = P[Z = 1] = 1\cdot P[Z=1] + 0\cdot P[Z=0] = E[Z]$$

We therefore already know how to do this!
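The indicator trick above means a probability estimate is just a sample mean of 0/1 values; a minimal sketch, with the event B supplied as a predicate:

```python
def estimate_probability(ys, in_b):
    """Estimate p = P[Y in B] as the mean of the indicator Z = 1{Y in B}."""
    zs = [1.0 if in_b(y) else 0.0 for y in ys]
    return sum(zs) / len(zs)
```

All of the variance-estimation machinery for means (replication/deletion, batch means, etc.) then applies to the indicator process unchanged.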

IE 519 199

Estimating Quantiles

Suppose we want to estimate the q-quantile $y_q$, that is, $P[Y \le y_q] = q$.

This is more complicated. Most estimates are based on order statistics:
Biased estimates
Computationally expensive
Coverage low if the sample size is too low

IE 519 200

Cyclic Parameters

No steady-state distribution. With some cycle definition C, define the cycle process $Y_i^C$ and

$$F^C(y) = P[Y_i^C \le y]$$

All of the techniques we have discussed before for steady-state parameters still apply to this new process.

IE 519 201

Multiple Measures

In practice we are usually interested in multiple measures simultaneously, so we have several CIs:

$$P[\nu_s \in I_s] = 1 - \alpha_s, \qquad s = 1, \ldots, k$$

How does this affect the overall confidence level $P[\nu_s \in I_s,\ s = 1, \ldots, k]$?

IE 519 202

Bonferroni Inequality

No problem if independent:

$$P[\nu_s \in I_s,\ s = 1, \ldots, k] = \prod_{s=1}^{k}(1-\alpha_s)$$

In practice performance measures are very unlikely to be independent. If they are not independent, we can use the Bonferroni inequality:

$$P[\nu_s \in I_s,\ s = 1, \ldots, k] \ge 1 - \sum_{s=1}^{k}\alpha_s$$

IE 519 203

Computational Implications

Say we have 5 performance measures and we want a 90% CITwo alternatives:

We can get five 98% CI for each of the performance measures, which gives us a 90% overall CI. This is computationally expensive

We can get five 90% CI and live with the fact that one or more of them is likely to not cover the true value of the parameter

We will revisit this topic when we talk about multiple comparison procedures

IE 519 204

Output Analysis: Discussion

Terminating simulation
• Replications defined by terminating event
• Can determine precision
• Initial conditions

Non-terminating simulation
Multiple runs
• Replication/deletion
• Issue with bias
• Elimination of initial transient
Single long run
• Batch-means, regenerative method, etc.
• Autocorrelation problem with estimating the variance

IE 519 205

Resampling Methods

IE 519 206

Sources of Variance

We have learned how to estimate variance, construct CIs, predict the number of simulation runs needed, etc. Where does the variance come from?

Random number generator (RNG) Generation of random variates Computer only approximates real values Initial transient/stopping rules Inherently biased estimators Modelling error ?

Made worse by long runs!

Made better by long runs!

IE 519 207

Input Modelling

We have discussed input modeling and output analysis separately. Recall the main approaches for input modeling:

Fit a parametric distribution Fit an empirical distribution Use a trace Use beta distribution

In practice fitting a parametric distribution is the most common approach

IE 519 208

Numerical ExampleThe underlying system is an M/M/1/10 queueThe simulation model is 1 station, capacity of 10, and empirical distribution for interarrival and service times from 100 observationsWant to estimate the expected time in system E[W]Typical simulation experiment:

10 replications Very long run of 5000 customers Very long warm-up period of 1000 customers CI constructed using t-distribution

We would expect a very good estimate for the performance of the model

IE 519 209

Effect of Estimating Distribution Parameters

True model

No resampling

Direct resampling

True model assumes that the true models for interarrival and service distribution is known

No resampling is the traditional approach of empirical distribution and then construct a sample mean based on 10 replications

Direct resampling obtains a new sample of 100 data points for each of the 10 replications

IE 519 210

Why Poor Coverage?

The uncertainty due to replacing the true distribution with an estimate is neglected

This is the case for all commercial simulation software

Remedies Direct resampling Bootstrap resampling Uniformly randomized resampling

IE 519 211

Direct Resampling

For each replication (simulation run) use a new sample to create an empirical distribution function. Requires a lot more data. Alternatively, what data is available can be split among the replications. Can confidence intervals be constructed?

IE 519 212

Bootstrap Resampling

Use the bootstrap to create a 'new' sample for a new empirical distribution function for each replication. Bootstrap: sampling with replacement. No need for additional data, and may even be able to use less data.

IE 519 213

Bootstrap Resampling Algorithm

For each input quantity q modeled, sample n values $v_q^{(1)}, v_q^{(2)}, \ldots, v_q^{(n)}$ from the observed data with replacement. Construct an empirical distribution for each q based on these samples. Do a simulation run based on these input distributions (ith output). Repeat.
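The resampling step above can be sketched in a few lines; this is a minimal illustration (the data values are made up) in which the returned sampler plays the role of the empirical input distribution fed to one simulation run:

```python
import random

def bootstrap_empirical(data, rng):
    """Resample n values with replacement, then return a sampler that
    draws variates from the empirical distribution of the resample."""
    resample = [rng.choice(data) for _ in data]   # n draws with replacement
    return lambda: rng.choice(resample)

rng = random.Random(42)
observed = [0.8, 1.1, 1.3, 2.0, 0.5]             # hypothetical service times
draw = bootstrap_empirical(observed, rng)
variates = [draw() for _ in range(3)]            # inputs for one simulation run
```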

IE 519 214

Uniformly Randomized

Note that if F is the cdf of X then F(X) is uniform on [0,1]. For the order statistics

$$X_{[1]} \le X_{[2]} \le \ldots \le X_{[n]}$$

we have

$$F(X_{[1]}) \le F(X_{[2]}) \le \ldots \le F(X_{[n]})$$

and

$$F(X_{[k]}) \sim \mathrm{beta}(k,\ n-k+1)$$

IE 519 215

Uniform Randomized Bootstrap

For each input quantity q modeled, order the observed data $x_q^{(1)} \le x_q^{(2)} \le \ldots \le x_q^{(n)}$. Generate a sample of n ordered values $u_q^{(1)} \le \ldots \le u_q^{(n)}$ from a uniform distribution. Set

$$\hat{F}_q\left(x_q^{(j)}\right) = u_q^{(j)}$$

and construct an empirical distribution for each q based on these samples. Do a simulation run based on these input distributions (ith output). Repeat.

IE 519 216

Numerical Results90% CI for a M/M/1/10 queue and varying traffic intensity {0.7,0.9}

Observations of interarrival and service times {50,100,500}

IE 519 217

Numerical Results90% CI for a M/U/1/10 queue and varying traffic intensity {0.7,0.9}

Observations of interarrival and service times {50,100,500}

IE 519 218

Discussion

Uncertainty in the input modeling can affect the precision of the output. For a given application you can estimate this effect by selecting 3-5 random subsets of the data and performing the analysis on each. Bootstrap resampling can help fix the problem.

IE 519 219

Discussion

Bootstrap resampling is much more general, and provides an answer to the question:

Given a random sample and a statistic T calculated on this sample, what is the distribution of T?

Assumptions:
The empirical distribution converges to the true distribution as the number of samples increases
T is sufficiently 'smooth'
Problems with extreme point estimates

Other simulation applications: model validation, ranking-and-selection, etc.

IE 519 220

Comparing Multiple Systems

IE 519 221

Multiple Systems

We know something about how to evaluate the output of a single system. Simulation is rarely used to simply evaluate one system. Comparison:

IE 519 222

Types of Comparisons

Comparison of two systems

Comparison of multiple systems:
Comparison with a standard
All pair-wise comparison
Multiple comparison with the best (MCB)

Ranking-and-selection:
Selecting the best of k systems
Selecting a subset of m systems containing the best
Selecting the m best of k systems

Combinatorial optimization

IE 519 223

Overview of Various Approaches

Comparison of systems
Construct (simultaneous) confidence intervals

Ranking-and-selection
Indifference zone: the system that is selected has performance that is within an indifference zone of the best performance with a fixed probability; this is the most common method
Bayesian approach: optimal simulation budget allocation

Optimization
Design of experiments/response surfaces
Search procedures

IE 519 224

Example: One or Two Servers?

IE 519 225

Comparing Two Systems

Have IID observations from two output processes:

$$X_{11}, X_{12}, \ldots, X_{1n_1}, \qquad \mu_1 = E[X_{1i}]$$
$$X_{21}, X_{22}, \ldots, X_{2n_2}, \qquad \mu_2 = E[X_{2i}]$$

and want to construct a CI for the expected difference $\zeta = \mu_1 - \mu_2$.

IE 519 226

A Paired-t CI

If $n_1 = n_2 = n$ we can construct a paired-t CI:

$$Z_i = X_{1i} - X_{2i}, \qquad E[Z_i] = \zeta$$

$$\bar{Z}(n) = \frac{1}{n}\sum_{i=1}^{n} Z_i, \qquad \widehat{\mathrm{Var}}(\bar{Z}(n)) = \frac{1}{n(n-1)}\sum_{i=1}^{n}\left(Z_i - \bar{Z}(n)\right)^2$$

$$\bar{Z}(n) \pm t_{n-1,\,1-\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\bar{Z}(n))}$$
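The paired-t construction can be sketched directly; a minimal illustration assuming equal-length samples and an externally supplied critical value t_{n-1, 1-α/2}:

```python
import statistics

def paired_t_ci(x1, x2, t_crit):
    """Paired-t CI for zeta = mu_1 - mu_2. The pairs may be correlated
    (e.g. via common random numbers); only the differences are used."""
    zs = [a - b for a, b in zip(x1, x2)]
    n = len(zs)
    zbar = statistics.fmean(zs)
    half = t_crit * (statistics.variance(zs) / n) ** 0.5
    return zbar - half, zbar + half
```

Because only the differences enter the variance, positive correlation between the two samples (as induced by common random numbers, discussed later) shrinks the interval.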

IE 519 227

Welch CI

Now we do not require equal sample sizes, but assume that the two processes are independent:

$$\bar{X}_i(n_i) = \frac{1}{n_i}\sum_{j=1}^{n_i} X_{ij}, \qquad S_i^2(n_i) = \frac{1}{n_i-1}\sum_{j=1}^{n_i}\left(X_{ij} - \bar{X}_i(n_i)\right)^2$$

Estimated degrees of freedom:

$$\hat{f} = \frac{\left(S_1^2(n_1)/n_1 + S_2^2(n_2)/n_2\right)^2}{\left(S_1^2(n_1)/n_1\right)^2/(n_1-1) + \left(S_2^2(n_2)/n_2\right)^2/(n_2-1)}$$

CI:

$$\bar{X}_1(n_1) - \bar{X}_2(n_2) \pm t_{\hat{f},\,1-\alpha/2}\sqrt{\frac{S_1^2(n_1)}{n_1} + \frac{S_2^2(n_2)}{n_2}}$$
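The Welch construction follows the same pattern; a minimal sketch in which the t critical value is supplied as a lookup function, since the degrees of freedom are generally fractional:

```python
import statistics

def welch_ci(x1, x2, t_lookup):
    """Welch CI for mu_1 - mu_2 with unequal sample sizes and variances.
    t_lookup(df) must return t_{df, 1-alpha/2} for (fractional) df."""
    n1, n2 = len(x1), len(x2)
    v1 = statistics.variance(x1) / n1          # S_1^2 / n_1
    v2 = statistics.variance(x2) / n2          # S_2^2 / n_2
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    diff = statistics.fmean(x1) - statistics.fmean(x2)
    half = t_lookup(df) * (v1 + v2) ** 0.5
    return diff - half, diff + half
```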

IE 519 228

Obtaining IID Observations

Need the observations from each system to be IIDTerminating simulation Each run is IID, so no problem

Non-terminating simulation Replication/deletion approach Non-overlapping batch-means

IE 519 229

Comparing Multiple Systems

Comparison with a standard

All pair-wise comparison

Multiple comparison with the best

(MCB)

IE 519 230

Comparison with a Standard

Now assume that one of the systems is the 'standard,' e.g. an existing system. Construct CIs with overall confidence level 1-α for μ2-μ1, μ3-μ1, …, μk-μ1.

Using the Bonferroni inequality: construct k-1 confidence intervals, each at level 1-α/(k-1). The individual CIs can be constructed using any method, as Bonferroni will always hold.

IE 519 231

All Pair-Wise Comparison

Now we want to construct CIs to compare all systems with each other. Quite difficult, because by Bonferroni over all k(k-1)/2 pairs each individual CI needs level 1-2α/(k(k-1)) to guarantee an overall level of 1-α. Only feasible for a relatively small number of systems k.

IE 519 232

Multiple Comparison with the Best (MCB)

We are really interested in whatever is the best system, and hence construct CIs to see if it is significantly better than each of the others:

$$\mu_i - \max_{l\ne i}\mu_l \in \left[-\left(\bar{X}_i(n) - \max_{l\ne i}\bar{X}_l(n) - h\sqrt{2/n}\right)^{-},\ \left(\bar{X}_i(n) - \max_{l\ne i}\bar{X}_l(n) + h\sqrt{2/n}\right)^{+}\right]$$

Here h is a critical parameter, $x^{+} = \max\{0, x\}$ and $x^{-} = \max\{0, -x\}$. MCB procedures are related to ranking-and-selection.

IE 519 233

Ranking-and-Selection

Have some k systems, and IID observations from each system: $X_{i1}, X_{i2}, \ldots$ with $\mu_i = E[X_{ij}]$. Order the (unknown) means:

$$\mu_{i_1} \le \mu_{i_2} \le \ldots \le \mu_{i_k}$$

Want to select the best system, that is, the system with the largest mean. We call this the correct selection (CS). Can we guarantee CS?

IE 519 234

Indifference Zone Approach

We say that the selected system i* is the correct selection (CS) if

$$\mu_{i^*} \ge \mu_{i_k} - \delta^*$$

Here $\delta^*$ is called the indifference zone. Our goal is

$$P(CS) \ge P^* \quad \text{whenever } \mu_{i_k} - \mu_{i_{k-1}} \ge \delta^*$$

Here P* is a user-selected probability. (Bechhofer's approach)

IE 519 235

Two-Stage Approach: Stage I

Obtain $n_0$ samples from each system and calculate

$$\bar{X}_i^{(1)}(n_0) = \frac{1}{n_0}\sum_{j=1}^{n_0} X_{ij}, \qquad S_i^2(n_0) = \frac{1}{n_0-1}\sum_{j=1}^{n_0}\left(X_{ij} - \bar{X}_i^{(1)}(n_0)\right)^2$$

Calculate the total sample size needed for each system:

$$N_i = \max\left\{n_0 + 1,\ \left\lceil \frac{h^2 S_i^2(n_0)}{(\delta^*)^2}\right\rceil\right\}$$
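The first-stage sample-size rule is easy to sketch; a minimal illustration assuming pilot observations are given per system and h comes from a table (as discussed later in this section):

```python
import math
import statistics

def stage_one_sample_sizes(first_stage, h, delta):
    """Total sample size N_i = max(n0 + 1, ceil(h^2 S_i^2(n0) / delta^2))
    for each system, from its n0 first-stage observations."""
    sizes = []
    for xs in first_stage:
        n0 = len(xs)
        s2 = statistics.variance(xs)          # S_i^2(n0)
        sizes.append(max(n0 + 1, math.ceil(h * h * s2 / delta ** 2)))
    return sizes
```

Note how noisier systems automatically get more second-stage budget, while the indifference-zone width delta sets the overall scale.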

IE 519 236

Two-Stage Approach: Stage II

Obtain $N_i - n_0$ more observations, and calculate the second-stage and weighted overall means:

$$\bar{X}_i^{(2)}(N_i - n_0) = \frac{1}{N_i - n_0}\sum_{j=n_0+1}^{N_i} X_{ij}$$

$$\tilde{X}_i(N_i) = W_{i1}\bar{X}_i^{(1)}(n_0) + W_{i2}\bar{X}_i^{(2)}(N_i - n_0)$$

with

$$W_{i1} = \frac{n_0}{N_i}\left[1 + \sqrt{1 - \frac{N_i}{n_0}\left(1 - \frac{(N_i - n_0)(\delta^*)^2}{h^2 S_i^2(n_0)}\right)}\right], \qquad W_{i2} = 1 - W_{i1}$$

Select the system with the largest $\tilde{X}_i(N_i)$.

IE 519 237

Comments: Assumptions

As usual: normal assumptionDo not need equal or known variances (many statistical selection procedures do)Two-stage approach requires an estimate of the variance (remember controlling the precision)The above approach assumes the least favorable configuration

IE 519 238

Subset Selection

In most applications, many of the systems are clearly inferior and can be eliminated quite easily. Subset selection: find a subset $I \subseteq \{1, 2, \ldots, k\}$ such that

$$P[\text{best system} \in I] \ge P^*$$

Gupta's approach:

$$I = \left\{l : \bar{X}_l(n) \ge \max_{i\ne l}\bar{X}_i(n) - h\sqrt{2/n}\right\}$$

IE 519 239

Proof

$$P[i_k \in I] \ge P\left[\bar{X}_{i_k}(n) \ge \bar{X}_i(n) - h\sqrt{2/n} \text{ for all } i \ne i_k\right]$$
$$= P\left[\left(\bar{X}_i(n) - \mu_i\right) - \left(\bar{X}_{i_k}(n) - \mu_{i_k}\right) \le h\sqrt{2/n} + \left(\mu_{i_k} - \mu_i\right) \text{ for all } i \ne i_k\right]$$
$$\ge P\left[Z_i \le h,\ i = 1, \ldots, k-1\right] = P^*$$

since $\mu_{i_k} \ge \mu_i$, where the $Z_i$ are approximately standard normal and h is chosen so that the last probability equals P*.

IE 519 240

Two-Stage Bonferroni: Stage I

Specify $\delta^*$ and P*, and let

$$t = t_{1-(1-P^*)/(k-1),\ n_0-1}$$

Make $n_0$ replications and calculate the sample variance of each pairwise difference:

$$S_{il}^2 = \frac{1}{n_0-1}\sum_{j=1}^{n_0}\left(X_{ij} - X_{lj} - \left(\bar{X}_i - \bar{X}_l\right)\right)^2$$

Calculate the second-stage sample size:

$$N = \max\left\{n_0,\ \left\lceil \frac{t^2 \max_{i\ne l} S_{il}^2}{(\delta^*)^2}\right\rceil\right\}$$

IE 519 241

Two-Stage Bonferroni: Stage II

Obtain the additional sample, calculate the overall sample means $\bar{X}_i$, and select the system with the largest sample mean, with the following CI:

$$\mu_i - \max_{j\ne i}\mu_j \in \left[\min\left\{0,\ \bar{X}_i - \max_{j\ne i}\bar{X}_j - \delta^*\right\},\ \max\left\{0,\ \bar{X}_i - \max_{j\ne i}\bar{X}_j + \delta^*\right\}\right]$$

IE 519 242

Combined Procedure

Initialization: obtain $n_0$ samples from each system and calculate $\bar{X}_i^{(1)}(n_0)$ and

$$S_i^2 = \frac{1}{n_0-1}\sum_{j=1}^{n_0}\left(X_{ij} - \bar{X}_i^{(1)}(n_0)\right)^2$$

Subset selection: calculate $W_{il} = t\left(\dfrac{S_i^2 + S_l^2}{n_0}\right)^{1/2}$ and

$$I = \left\{i : \bar{X}_i(n_0) \ge \bar{X}_l(n_0) - W_{il},\ l \ne i\right\}$$

If |I| = 1, stop. Otherwise, calculate the second-stage sample sizes

$$N_i = \max\left\{n_0,\ \left\lceil h^2 S_i^2/(\delta^*)^2\right\rceil\right\}$$

Obtain $N_i - n_0$ more samples from each system $i \in I$. Compute the overall sample means and select the best system.

IE 519 243

Sequential Procedure

Set $n_0$ and

$$h^2 = (n_0-1)\left[\left(\frac{2\alpha}{k-1}\right)^{-2/(n_0-1)} - 1\right]$$

Compute

$$S_{il}^2 = \frac{1}{n_0-1}\sum_{j=1}^{n_0}\left(X_{ij} - X_{lj} - \left(\bar{X}_i(n_0) - \bar{X}_l(n_0)\right)\right)^2$$

Screen: with r observations per surviving system,

$$W_{il}(r) = \max\left\{0,\ \frac{\delta^*}{2r}\left(\frac{h^2 S_{il}^2}{(\delta^*)^2} - r\right)\right\}$$

$$I_{new} = \left\{i \in I_{old} : \bar{X}_i(r) \ge \bar{X}_l(r) - W_{il}(r),\ l \in I_{old},\ l \ne i\right\}$$

If |I| = 1, stop. Otherwise, take one more observation from each system in I and go back to screening.

IE 519 244

Where does the h come from?

Solved numerically from (Rinott):

$$P^* = \int_0^\infty \left[\int_0^\infty \Phi\left(\frac{h}{\sqrt{(n_0-1)\left(\frac{1}{x} + \frac{1}{y}\right)}}\right) f(x)\,dx\right]^{k-1} f(y)\,dy$$

where $\Phi$ is the standard normal cdf and f is the $\chi^2_{n_0-1}$ density. More commonly, you look it up in a table (some in the book).

IE 519 245

Large Number of Alternatives

Two-stage ranking-and-selection procedures usually only efficient for up to about 20 alternatives

Always focus on least-favorable-configuration (LFC)

For a large number of systems the LFC,

$$\mu_i = \mu_{i_k} - \delta^*, \qquad i = 1, 2, \ldots, k-1,$$

would be very unlikely. Use screening followed by two-stage R&S, or use a sequential procedure. The procedure given earlier can be used for up to 500 alternatives or so.

IE 519 246

Other Approaches

Focused on comparing expected values of performance to identify the best

Alternatives: Select the system most likely to be best Select the largest probability of success Bayesian procedures

IE 519 247

Bayesian Procedures

Posterior and prior distributions. Take action to maximize/minimize the posterior. R&S: given a fixed computing budget, find the allocation of simulation runs to systems that minimizes some loss function, e.g. the 0-1 loss or the opportunity cost:

$$L_{0\text{-}1}(i, \mu) = \begin{cases}0 & \mu_i = \max_j \mu_j\\ 1 & \text{otherwise}\end{cases} \qquad L_{oc}(i, \mu) = \max_j \mu_j - \mu_i$$

IE 519 248

Discussion: Selecting the Best

Three major lines of research: Indifference-zone procedures

Most popular, easy to understand, use the LFC assumption

Screening or subset selection based on constructing a confidence interval

Can be applied for more alternatives, do not give you a final selection. Can be combined with indifference zone selection

Allocating your simulation budget to minimize a posterior loss function

More efficient use of simulation effort, but does not give you the same guarantee as indifference-zone methods

IE 519 249

Simulation Optimization

IE 519 250

Larger Problems

Even with the best methods, R&S can only be extended to perhaps 500 alternatives. We are often faced with more when we can set certain parameters of the problem. Need simulation optimization.

IE 519 251

What is Simulation Optimization?

Optimization where the objective

function is evaluated using

simulation

Complex systems

Often large scale systems

No analytical expression available

IE 519 252

Problem Setting

Components of any optimization problem:

Decision variables $\theta \in \Theta$

Objective function $f: \mathbb{R}^n \to \mathbb{R}$

Constraints $\Theta \subseteq \mathbb{R}^n$

IE 519 253

Simulation Evaluation

No closed-form expression for the function $f: \mathbb{R}^n \to \mathbb{R}$. It is estimated using the output $X(\theta)$ of a stochastic discrete-event simulation. Typically, we may have

$$f(\theta) = E[X(\theta)]$$

IE 519 254

Types of Techniques

Decision variables:
Continuous → gradient-based methods
Discrete, size of $\Theta$ small → ranking & selection
Discrete, size of $\Theta$ large → random search

Note: These are direct optimization methods. Metamodels approximate the objective function and then optimize (later).

IE 519 255

Continuous Decision Variables

Most methods are gradient based:

$$\theta^{(k+1)} = \theta^{(k)} - a_k \widehat{\nabla f}\left(\theta^{(k)}\right)$$

Issues:
All the same issues as in non-linear programming
How to estimate the gradient $\widehat{\nabla f}\left(\theta^{(k)}\right)$

IE 519 256

Stochastic Approximation

Fundamental work by Robbins and Monro (1951) & Kiefer and Wolfowitz (1952)Asymptotic convergence can be assured

Generally slow convergence

lim 0kk

kk

IE 519 257

Estimating the Gradient

Challenge to estimate the gradient:

$$\widehat{\nabla f} = \left(\widehat{\frac{\partial f}{\partial \theta_1}}, \widehat{\frac{\partial f}{\partial \theta_2}}, \ldots, \widehat{\frac{\partial f}{\partial \theta_n}}\right)$$

Finite differences are simple:

$$\widehat{\frac{\partial f}{\partial \theta_i}} = \frac{X(\theta + \Delta_i e_i) - X(\theta)}{\Delta_i}$$

where $e_i$ is the ith unit vector (could also be two-sided).
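The one-sided scheme can be sketched generically; a minimal illustration where the "simulation" is a made-up deterministic function, just to exercise the estimator (a real X(θ) would be noisy and rng-driven):

```python
import random

def fd_gradient(sim, theta, deltas, rng):
    """One-sided finite-difference gradient estimate of f(theta) = E[X(theta)].
    One extra simulation run per component, plus one at the base point."""
    base = sim(theta, rng)
    grad = []
    for i, d in enumerate(deltas):
        pert = list(theta)
        pert[i] += d                      # theta + Delta_i * e_i
        grad.append((sim(pert, rng) - base) / d)
    return grad

# Noiseless stand-in for a simulation: f(theta) = theta_1^2 + theta_2
g = fd_gradient(lambda th, rng: th[0] ** 2 + th[1],
                [3.0, 1.0], [1e-6, 1e-6], random.Random(1))
```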

IE 519 258

Improving Gradient Estimation

Finite differences requires two simulation runs for each estimateMay be numerically instableBetter: estimate gradient during the same simulation run as Perturbation analysis Likelihood ration or score method

( )X

IE 519 259

Other Methods

Stochastic approximation variants

have received most attention by

researchers

Other methods for continuous domains

include Sample path methods

Response surface methods (later)

IE 519 260

Discrete Decision Variables

Two types of feasible regions:

Feasible region small (have seen this): trivial for the deterministic case, but must still account for the simulation noise

Feasible region large: e.g., the stochastic counterparts of combinatorial optimization problems

IE 519 261

Statistical Selection

Selecting between a few alternatives $\theta_1, \theta_2, \ldots, \theta_m$: can evaluate every point and compare. Must still account for simulation noise. We now know several methods:

Subset selection
Indifference-zone ranking & selection
Multiple comparison procedures (MCP)
Decision-theoretic methods

IE 519 262

Large Feasible Region

When the feasible region is large it is impossible to enumerate and evaluate each alternativeUse random search methods Academic research focused on

methods for which asymptotic convergence is assured

In practice, use of metaheuristics

IE 519 263

Random Search (generic)

Step 0: Select an initial solution θ(0) and simulate its performance X(θ(0)). Set k=0

Step 1: Select a candidate solution θ(c) from the neighborhood N(θ(k)) of the current solution and simulate its performance X(θ(c))

Step 2: If the candidate satisfies the acceptance criterion, let θ(k+1) = θ(c); otherwise let θ(k+1) = θ(k)

Step 3: If the stopping criterion is satisfied, terminate the search; otherwise let k=k+1 and go to Step 1

IE 519 264

Random Search Variants

Specify a neighborhood structure

Specify a procedure for selecting candidates

Specify an acceptance criterion

Specify a stop criterion

IE 519 265

Metaheuristics

Random search methods that have been found effective for combinatorial optimizationFor simulation optimization

Simulated annealing Tabu search Genetic algorithms Nested partitions method

IE 519 266

Simulated Annealing

Falls within the random search framework. Novel acceptance criterion (for minimization):

$$P\left[\text{Accept } \theta^{(c)}\right] = \begin{cases}1, & X(\theta^{(c)}) \le X(\theta^{(k)})\\[4pt] e^{-\left(X(\theta^{(c)}) - X(\theta^{(k)})\right)/T_k}, & \text{otherwise}\end{cases}$$

The key parameter is $T_k$, which is called the temperature.
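The acceptance rule above is one line of logic; a minimal sketch assuming minimization and simulated performance values already in hand:

```python
import math
import random

def accept(x_cand, x_curr, temp, rng):
    """Simulated-annealing acceptance: always accept an improvement,
    otherwise accept with probability exp(-(increase)/temp)."""
    if x_cand <= x_curr:
        return True
    return rng.random() < math.exp(-(x_cand - x_curr) / temp)
```

Plugging `accept` into Step 2 of the generic random search gives simulated annealing; a high temperature makes uphill moves likely, a temperature near zero reduces it to greedy descent.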

IE 519 267

Temperature Parameter

Usually the temperature is decreased as the search evolvesIf it decreases sufficiently slowly then asymptotic convergence is assuredFor simulation optimization there are indications that constant temperature works as well or better

IE 519 268

Tabu Search

Can be fit into the random search framework

A unique feature is the restriction of the neighborhood:

Solution requiring the reverse of recent moves not allowed in the neighborhood

Maintain a tabu list of moves

Other features include long term memory that restart with a different tabu list at good solutions

Has been applied successfully to simulation optimization

IE 519 269

Genetic Algorithms

Works with sets of solutions (populations) rather than single solutionsOperates on the population simultaneously:

Survival Cross-over Mutation

Novel construction of a neighborhoodHas been used successfully for simulation optimization

IE 519 270

Nested Partitions Method

Originally designed for simulation optimizationUses Partitioning Random sampling Local search improvements

Has asymptotic convergence

IE 519 271

NP Method

Partition of the feasible region: in the k-th iteration there is a most promising region $\sigma(k)$, which is partitioned into j subregions (here j = 2):

$$\sigma(k) = \sigma_1(k) \cup \sigma_2(k)$$

The rest of the feasible region, $\Theta \setminus \sigma(k)$, is the surrounding (super)region. Use sampling from all three regions to determine where to go next.

IE 519 272

Sampling

Sources of randomness:

Performance of a subset is based on a random sample of solutions from that subset

Performance of each individual sample is estimated using simulation

Difficulty of estimating performance depends on how much variability there is in the region. Intuitively appealing to sample more from regions that have high variance.

IE 519 273

Two-Stage Sampling

Use two-stage statistical selection methods to determine the number of samples

Phase I: Obtain initial samples from each region

Calculate estimated mean and variance

Calculate how many additional samples are needed

Phase II: Obtain remaining samples

Estimate performance of region

IE 519 274

Convergence

Single-stage NP converges asymptotically (useless?). Two-stage NP converges to a solution that is within an indifference zone of the optimum with a given probability:
Reasonable goal softening
A statement familiar to simulation users

IE 519 275

Theory versus Practice

Asymptotically converging methods:
Good theoretical properties
May not converge fast or be easy to use/understand

Practical methods:
Often based on heuristic search
Do not necessarily account for randomness
Do not guarantee convergence

IE 519 276

Commercial Software

SimRunner (Promodel) Genetic algorithms

AutoStat (AutoMod) Evolutionary & genetic algorithms

OPTIMIZ (Simul8) Neural networks

OptQuest (Arena, Crystal Ball, etc) Scatter search, tabu search, neural nets

IE 519 277

Optimization in Practice

In academic work we have very specific definitions:

Optimization = find the best solution
Approximation = find a solution that is within a given distance of optimal performance
Heuristic = seek the optimal solution

In practice, people do not always think about the theoretical ideal optimum that is the basis for all of the above:

Optimization = improvement

IE 519 278

Combining Theory & Practice

Best of both worlds:
Robustness and computational power of heuristics
Guarantee performance somehow

Some examples:
Combine genetic algorithms with statistical selection
The two-stage NP method guarantees convergence within an indifference zone with a prespecified probability

IE 519 279

Metamodels

IE 519 280

Response Surfaces

Obtaining a precise simulation estimate is computationally expensive. We often want to do this for many different parameter values (and even find some optimal parameter values). An alternative is to construct a response surface of the output as a function of these input parameters. This response surface is a model of the simulation model, that is, a metamodel.

IE 519 281

Metamodels

Simulation can be (simply) represented as

$$y = g(\theta; \varepsilon)$$

For a single output and additive randomness, we can write this as

$$y = g(\theta) + \varepsilon$$

The metamodel $\tilde{f}$ models g, and the residual models $\varepsilon$.

IE 519 282

Example

Instead of simulating an exact contour, construct a metamodel using a few values

IE 519 283

Regression

Most commonly, regression models have been used for metamodels:

$$f(\mathbf{x}) = \sum_{k} \beta_k\, p_k(\mathbf{x}), \qquad \text{e.g. } p_1(\mathbf{x}) = x_1,\ p_2(\mathbf{x}) = x_2,\ p_3(\mathbf{x}) = x_1 x_2$$

The issues are determining how many terms to include and estimating the coefficients.

IE 519 284

DOE for RS Models

The coefficients are given by

$$\boldsymbol{\beta} = \left(\mathbf{X}^t\mathbf{X}\right)^{-1}\mathbf{X}^t\mathbf{y}$$

Key issues:
Would like to minimize the variance of the estimates — can be done by controlling the random number stream
Would like to estimate with fewer simulation runs
Designs to reduce bias
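The normal-equations formula can be shown concretely; a minimal sketch for the two-parameter metamodel y = β0 + β1·x, with (X'X)^{-1}X'y written out by hand rather than via a linear-algebra library:

```python
def fit_line(xs, ys):
    """Least squares for y = b0 + b1*x via the normal equations
    beta = (X'X)^{-1} X'y, expanded for the 2-parameter case."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx                 # determinant of X'X
    b1 = (n * sxy - sx * sy) / det
    b0 = (sy - b1 * sx) / n
    return b0, b1
```

In a simulation study the ys would be (noisy) simulation outputs at designed xs; the fitted line is then the metamodel evaluated in place of further runs.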

IE 519 285

Why Response Surfaces?

Box, Hunter, and Hunter (1978). Statistics for Experimenters, Wiley.

IE 519 286

Compare with True Optimum

Why did this fail?

IE 519 287

Response Surface Optimization

IE 519 288

Second Order Model

IE 519 289

Experimental Process

State your hypothesis
Plan an experiment

Design of Experiments (DOE)

Conduct the experiment Run a simulation

Analyze the data Output analysis

Repeat steps as needed

IE 519 290

DOE

Define the goals of the experiment
Identify and classify independent and dependent variables (see example)
Choose a probability model

Linear, second order, other (see later)

Choose an experimental design
Factorial design, fractional factorial, Latin hypercubes, etc.

Validate the properties of the design

IE 519 291

Example of Variables

Dependent | Independent
Throughput | Job release policy, lot size, previous maintenance, speed
Cycle time | Job release policy, lot size, previous maintenance, speed
Operating cost | Previous maintenance, speed

IE 519 292

Other Metamodels

Many other approaches can be taken to metamodeling

Splines Have been used widely in deterministic simulation

responses Radial basis functions Neural networks Krieging

Have also been used widely in deterministic simulation and gaining a lot of ground in stochastic simulation

IE 519 293

Variance Reduction

IE 519 294

Variance Reduction

As opposed to physical experiments, in simulation we can control the source of randomness
May be able to take advantage of this to improve precision
Output analysis, ranking & selection, experimental designs, etc.

Several methods:
Common random numbers
Comparing two or more systems
Antithetic variates
Improving precision of a single system
Control variates, indirect estimation, conditioning

IE 519 295

Common Random Numbers

Most useful technique
Use the same stream of random numbers for each system when comparing
Motivation: to compare systems 1 and 2, look at the differences

Z_j = X_1j - X_2j,   Zbar(n) = (1/n) sum_{j=1}^{n} Z_j

Var(Zbar(n)) = Var(Z_j) / n

Var(Z_j) = Var(X_1j) + Var(X_2j) - 2 Cov(X_1j, X_2j)

Positive covariance between X_1j and X_2j reduces the variance of the difference
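The variance decomposition above can be seen in a toy comparison. Here each "system" is just an exponential transform of a uniform (an illustration with assumed means 5 and 4, not a full queueing model):

```python
# Sketch: common random numbers (CRN) vs independent sampling when
# estimating the difference E[X1 - X2] between two systems.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

def system1(u):  # inverse-transform exponential, mean 5
    return -5.0 * np.log(1.0 - u)

def system2(u):  # inverse-transform exponential, mean 4
    return -4.0 * np.log(1.0 - u)

# independent sampling: separate random number streams
u1, u2 = rng.random(n), rng.random(n)
z_indep = system1(u1) - system2(u2)

# CRN: the same synchronized stream drives both systems
u = rng.random(n)
z_crn = system1(u) - system2(u)

print(z_indep.var(), z_crn.var())  # CRN variance is far smaller
print(z_crn.mean())                # both estimate E[X1 - X2] = 1
```

Because the inverse transform is monotone in U, the two outputs are positively correlated and the covariance term cuts the variance of the difference.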

IE 519 296

Applicability

IE 519 297

Synchronization

We must match up random numbers from the different systems
Careful synchronization of the random number stream
Assign one substream to each process
Divide that substream up to get replications
Does not happen automatically

IE 519 298

Example: Failed Sync.

IE 519 299

Example: M/M/1 vs M/M/2

Independent sampling

CRN

IE 519 300

Example: Correlation Induced

IE 519 301

Example: System Difference

IE 519 302

Methods for CRN Use

Many methods assume independence between systems
Ranking & selection, etc.
Some methods are designed to use CRN, while it violates the assumptions of others
Experimental design
Can design the experiments specifically to take advantage of CRN

IE 519 303

Discussion

Dramatic improvements can be achieved with CRN (but it can also be harmful)
Recommendations:
Make sure CRN is applicable (pilot study)
Use methods that take advantage of CRN
Synchronize the RNG
Use a one-to-one random variate generator

IE 519 304

Antithetic Variates

We now turn to improving the precision of a simulation of a single system
Basic idea:
Pairs of runs
Large observations offset by small observations
Use the average, which will have smaller variance
Need to induce negative correlation

IE 519 305

Mathematical Motivation

Recall for a covariance-stationary process Y_1, Y_2, ..., Y_n we have

Var(Ybar(n)) = (σ²/n) [1 + 2 sum_{l=1}^{n-1} (1 - l/n) ρ_l]

where ρ_l = Cov(Y_i, Y_{i+l}) / σ²
If the covariance terms are negative the variance will be reduced
Difficult to get all of those negative

IE 519 306

Complementary Random Numbers

The simplest approach is to use complementary random numbers
Suppose service times are exponentially distributed with mean = 5

U       X      1-U       X    Pair avg.
0.37   4.98   0.63    2.30    3.64
0.55   3.02   0.45    3.96    3.49
0.98   0.09   0.02   20.17   10.13
0.24   7.19   0.76    1.36    4.27
0.71   1.70   0.29    6.22    3.96
Avg.   3.40           6.80    5.10
S.Dev. 2.78           7.70    2.83

IE 519 307

Example (cont.)

U       X      1-U      X    Pair avg.
0.07  13.39   0.93    0.36    6.87
0.35   5.26   0.65    2.15    3.70
0.21   7.86   0.79    1.16    4.51
0.57   2.81   0.43    4.23    3.52
0.66   2.08   0.34    5.39    3.74
Avg.   6.28           2.66    4.47
S.Dev. 4.58           2.11    1.40

Does this prove that antithetic variates work for this example?

IE 519 308

What You Need

As for CRN, we need a monotone relationship between the (many) unit uniform random numbers and the (single) output that we are interested in
(When there are multiple outputs, this needs to hold for each output)
As before:
Synchronization
Inverse-transform

IE 519 309

Formulation

Simulation: for each pair j of runs, generate

X_{2j-1} = F^(-1)(U_j)
X_{2j}  = F^(-1)(1 - U_j)

Y_j = (X_{2j-1} + X_{2j}) / 2
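The pairing above can be sketched in a few lines. The exponential mean of 5 matches the earlier service-time example; everything else is an assumption for illustration:

```python
# Sketch: antithetic variates via the inverse transform, pairing U with 1-U.
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

u = rng.random(n)
x1 = -5.0 * np.log(1.0 - u)  # F^{-1}(U), exponential with mean 5
x2 = -5.0 * np.log(u)        # F^{-1}(1-U), the antithetic partner
pair_avg = 0.5 * (x1 + x2)   # one antithetic observation per pair

# benchmark: the same budget (2n draws) with independent sampling
y = (-5.0 * np.log(1.0 - rng.random((n, 2)))).mean(axis=1)

print(pair_avg.mean(), y.mean())  # both estimate the mean, 5
print(pair_avg.var(), y.var())    # antithetic pairs are less variable
```

The monotone inverse transform makes X1 and X2 negatively correlated, so the pair averages have smaller variance than averages of independent pairs.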

IE 519 310

Example: M/M/1 Queue

Independent sampling

Antithetic sampling

IE 519 311

Complementary Processes

Imagine a queueing simulation with arrivals and services
Large interarrival times will in general have the opposite effect on performance measures from large service times
Idea: Use the random numbers used for generating interarrival times in the first run of a pair to generate service times in the second run, and vice versa
This could be extended to any situation where you can argue a similar complementary relationship

Combining CRN with AV

Both CRN and AV use very similar ideas, so why not combine them?
System 1: Run 1.1 and Run 1.2
System 2: Run 2.1 and Run 2.2
Even if we have all the right correlations (CRN positive across systems, AV negative within each system), the induced cross-correlations work against us:
Run 1.1 and Run 2.2 are negatively correlated
Run 1.2 and Run 2.1 are negatively correlated
Thus, the overall performance may be worse

IE 519 313

Discussion

Basic idea is to induce negative correlation to reduce variance
Success is model dependent
Must show that it works
Based on model structure
Pilot experiments
Since we need a monotone relationship between the RNG and output: synchronization and inverse-transform

IE 519 314

Control Variates

We are again interested in improving the precision of some output Y

Y_C = Y - a(X - E[X])

Y = observed value of the output
X - E[X] = deviation from the known mean
X is the control variate (any correlated r.v.)
a > 0 if X and Y are positively correlated
a < 0 if X and Y are negatively correlated

IE 519 315

Estimator Properties

The controlled estimator is unbiased:

E[Y_C] = E[Y - a(X - E[X])] = E[Y] - a(E[X] - E[X]) = E[Y]

The variance is

Var[Y_C] = Var[Y - a(X - E[X])] = Var[Y] + a² Var[X] - 2a Cov(X, Y)

So Var[Y_C] < Var[Y] whenever 2a Cov(X, Y) > a² Var[X]

IE 519 316

Optimal a Given Y

Set the derivative of the variance with respect to a to zero:

0 = d/da Var[Y_C] = d/da (Var[Y] + a² Var[X] - 2a Cov(X, Y)) = 2a Var[X] - 2 Cov(X, Y)

a* = Cov(X, Y) / Var[X]

IE 519 317

Optimal Variance

With the optimal value a*:

Var[Y_C*] = Var[Y] + (a*)² Var[X] - 2a* Cov(X, Y)
          = Var[Y] + Cov²(X, Y)/Var[X] - 2 Cov²(X, Y)/Var[X]
          = Var[Y] - Cov²(X, Y)/Var[X]
          = (1 - ρ²_XY) Var[Y]

IE 519 318

Observations

By using the optimal value a*:

Var[Y_C*] = (1 - ρ²_XY) Var[Y],   0 ≤ ρ²_XY ≤ 1

The controlled estimator is never more variable than the uncontrolled estimator
If there is any correlation, the controlled estimator is more precise
Perfect correlation means a perfect estimate
Where's the catch?

IE 519 319

Estimating a*

We never know Cov[X, Y] and hence not a*
Need to estimate it from the data:

a_hat*(n) = C_hat_XY(n) / S_X²(n)

C_hat_XY(n) = sum_{j=1}^{n} (X_j - Xbar(n))(Y_j - Ybar(n)) / (n - 1)

Ybar_C(n) = Ybar(n) - a_hat*(n) (Xbar(n) - E[X])

This will in general be a biased estimator
Jackknifing can be used to reduce bias
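A minimal sketch of the controlled estimator with a* estimated from the data. The joint model for (X, Y) is an assumption chosen so they are strongly correlated; in a real study X would be, e.g., an average service time with known mean:

```python
# Sketch: control variate with estimated a* = Cov(X,Y)/Var(X).
import numpy as np

rng = np.random.default_rng(3)
m = 5_000
mu_x = 0.9  # known mean of the control variate

# hypothetical replication outputs: Y is correlated with X
x = rng.normal(mu_x, 0.05, m)
y = 4.0 * x + rng.normal(0.0, 0.1, m)

# estimate a* from the same data (biased in general)
a_hat = np.cov(x, y, ddof=1)[0, 1] / x.var(ddof=1)

# controlled observations
y_c = y - a_hat * (x - mu_x)

print(y.mean(), y_c.mean())            # both estimate E[Y] = 3.6
print(y.var(ddof=1), y_c.var(ddof=1))  # controlled variance is smaller
```

Since X explains most of the variation in Y here, ρ²_XY is close to 1 and the variance drops accordingly.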

IE 519 320

Example: M/M/1 Queue

Want to estimate the expected customer delay in the system
Possible control variates:
Service times
Positive correlation
Interarrival times
Negative correlation

IE 519 321

Example: Service Time CVs

Rep    Y      X
1    13.84   0.92
2     3.18   0.95
3     2.26   0.88
4     2.76   0.89
5     4.33   0.93
6     1.35   0.81
7     1.82   0.84
8     3.01   0.92
9     1.68   0.85
10    3.60   0.88

Ybar(10) = 3.78,  Xbar(10) = 0.89,  E[X] = 0.9  (shouldn't be known!)
S_X²(10) = 0.002,  C_hat_XY(10) = 0.07
a_hat*(10) = C_hat_XY(10) / S_X²(10) = 35.00
Ybar_C(10) = Ybar(10) - a_hat*(10)(Xbar(10) - E[X]) = 3.78 - 35(-0.01) = 4.13

IE 519 322

Multiple Control Variates

Perhaps we have two correlated random variables (e.g., both service times and interarrival times)

X = X^(1) + X^(2)

Y_C = Y - a(X - E[X]) = Y - a(X^(1) - E[X^(1)]) - a(X^(2) - E[X^(2)])

Problems?

IE 519 323

Multiple Control Variates

To take best advantage of each control variate we need different weights

Y_C = Y - a_1(X^(1) - E[X^(1)]) - a_2(X^(2) - E[X^(2)])

Find the partial derivatives with respect to both and solve for optimal values as before

IE 519 324

In General

Y_C = Y - sum_{i=1}^{m} a_i (X^(i) - E[X^(i)])

Var[Y_C] = Var[Y] - 2 sum_{i=1}^{m} a_i Cov(Y, X^(i)) + sum_{i=1}^{m} sum_{j=1}^{m} a_i a_j Cov(X^(i), X^(j))
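Minimizing the variance expression above gives a linear system for the weights: Sigma a = c, where Sigma is the covariance matrix of the controls and c the vector of covariances with Y. A sketch with two hypothetical controls (all distributions assumed for illustration):

```python
# Sketch: optimal weights for multiple control variates via
# solving Sigma a = c with sample covariances.
import numpy as np

rng = np.random.default_rng(6)
m = 5_000
mu = np.array([1.0, 2.0])  # known means of the two controls

x = rng.normal(mu, [0.2, 0.3], (m, 2))  # two correlated inputs (controls)
y = 3.0 + x @ np.array([1.0, -0.5]) + rng.normal(0.0, 0.1, m)

sigma = np.cov(x, rowvar=False)                              # 2x2 Cov(X)
c = np.array([np.cov(x[:, i], y)[0, 1] for i in range(2)])   # Cov(X^(i), Y)
a = np.linalg.solve(sigma, c)                                # weights

y_c = y - (x - mu) @ a
print(a)                   # near the true coefficients [1.0, -0.5]
print(y.var(), y_c.var())  # controlled variance is smaller
```

With one control this reduces to the scalar a* = Cov(X, Y)/Var(X) derived earlier.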

IE 519 325

Types of Control Variates

Internal
Input random variables, or functions of those random variables
Know the expectation E[X]
Must generate them anyway
External
We cannot know E[Y]
However, with some simplifying assumptions we may have an analytical model that we can solve, and hence know the same output E[Y'] for the simplified system
Requires a simulation of the simplified system

IE 519 326

Indirect Estimation

Primarily been used for queueing simulations

D_i = delay of ith customer,  d = E[D]
W_i = total wait of ith customer,  w = E[W]
Q(t) = number of customers in queue at time t
L(t) = number of customers in system at time t

IE 519 327

Direct Estimators

d_hat(n) = (1/n) sum_{i=1}^{n} D_i

w_hat(n) = (1/n) sum_{i=1}^{n} W_i

Q_hat(n) = (1/T(n)) ∫_0^{T(n)} Q(t) dt

L_hat(n) = (1/T(n)) ∫_0^{T(n)} L(t) dt

IE 519 328

Known Relationships

w_hat(n) = d_hat(n) + Sbar(n)

S_i = service time of customer i
Sbar(n) = (1/n) sum_{i=1}^{n} S_i,  E[Sbar(n)] = E[S]

Can we take advantage of this?

IE 519 329

Indirect Estimator

Replace the average with the known expectation

w_tilde(n) = d_hat(n) + E[S]

Avoid variation
For any G/G/1 queue it can be shown that

Var[w_tilde(n)] ≤ Var[w_hat(n)]

Is this trivial?
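The indirect estimator can be sketched for an M/M/1 queue using Lindley's recursion for the delays. The parameters (arrival rate 1, mean service 0.5, so ρ = 0.5) are assumptions for the example:

```python
# Sketch: direct vs indirect estimation of the mean wait w = E[W]
# in an M/M/1 queue, using Lindley's recursion for delays.
import numpy as np

rng = np.random.default_rng(4)
n = 20_000
lam, es = 1.0, 0.5  # arrival rate and the *known* E[S]

a = rng.exponential(1.0 / lam, n)  # interarrival times
s = rng.exponential(es, n)         # service times

d = np.empty(n)  # delays in queue: D_i = max(0, D_{i-1} + S_{i-1} - A_i)
d[0] = 0.0
for i in range(1, n):
    d[i] = max(0.0, d[i - 1] + s[i - 1] - a[i])

d_hat = d.mean()
w_hat = d_hat + s.mean()  # direct estimator, uses the noisy Sbar(n)
w_tilde = d_hat + es      # indirect estimator, uses the known E[S]

print(d_hat, w_hat, w_tilde)  # theory: E[D] = 0.5, E[W] = 1.0 at rho = 0.5
```

The two estimators differ only in replacing Sbar(n) by E[S], which removes one source of variability.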

IE 519 330

Little’s Law

A key result from queueing is Little's Law:

L = λw,  Q = λd,  λ = arrival rate to system

Indirect estimators of the average number of customers in the queue/system:

Q_tilde(n) = λ d_hat(n)
L_tilde(n) = λ w_tilde(n) = λ (d_hat(n) + E[S])

IE 519 331

Numerical Example

M/G/1 queue

Service Dist.      ρ=.5   ρ=.7   ρ=.9
Exponential         15     11     4
4-Erlang            22     17     7
Hyperexponential     4      3     2

IE 519 332

Conditioning

Again replace an estimate with its exact analytical value, hence removing a source of variability

E[X | Z = z] is analytically known

E[X] = E[ E[X | Z] ]

Var[X] = E[Var[X | Z]] + Var[E[X | Z]] ≥ Var[E[X | Z]]
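The variance decomposition above is what conditional Monte Carlo exploits. A toy illustration (not the slides' time-shared computer model, which comes later): X is a sum of N service times, and E[X | N = n] = n·E[S] is known analytically:

```python
# Sketch: conditional Monte Carlo. Replacing X by E[X | Z] removes
# the E[Var[X | Z]] part of the variance.
import numpy as np

rng = np.random.default_rng(5)
m = 10_000
es = 2.0  # known mean service time

n_jobs = rng.poisson(3.0, m)  # the conditioning variable Z = N

# direct estimator: actually generate all the service times
x = np.array([rng.exponential(es, k).sum() for k in n_jobs])

# conditional estimator: replace each X by E[X | N] = N * E[S]
x_cond = n_jobs * es

print(x.mean(), x_cond.mean())            # both estimate E[X] = 6
print(x.var(ddof=1), x_cond.var(ddof=1))  # conditioning reduces variance
```

Here Var[X] = E[N]·Var[S] + E[S]²·Var[N], and conditioning removes the first term entirely.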

IE 519 333

Discussion

We need:
Z can be easily generated
E[X | Z = z] can be easily calculated
E[Var[X | Z]] is large

This is going to be very model dependent

IE 519 334

Example: Time-Shared Computer Model

Want to estimate the expected delay in queue for CPU (dC), disk (dD) and tape (dT)

IE 519 335

Conditioning

Estimating d_T may be hard due to lack of data
Observe the number N_T in the tape queue every time a job leaves the CPU
If this job were to go to the tape queue, its expected delay would be

E[D_T | N_T = z] = 12.50 z  (known analytically)

so E[D_T] = E[ E[D_T | N_T] ] can be estimated by averaging 12.50 N_T
Variance reduction of 56% was observed

IE 519 336

Discussion

Both indirect estimation and conditioning are extremely application dependentRequire both good knowledge of the system as well as some background from the analystCan achieve good variance reduction when used properly

IE 519 337

Variance Reduction Discussion

Have discussed several methods Common random numbers

Antithetic random variates

Control variates

Indirect estimation

Conditioning

More application specific

IE 519 338

Applicability & Connections

Can we use VRT with any technique for output analysis (e.g., batch means)?
Can we use VRT (especially CRN) with ranking-and-selection and multiple comparison methods?
Can we design our simulation experiments (DOE) to take advantage of VRT (especially when building a metamodel)?
Can we use VRT with simulation optimization techniques?

IE 519 339

VRT & Batch-Means

Batch-means is a very important method for output analysis (non-overlapping & overlapping)
Problem is that there may be correlation between batches
Generally no problem with the use of common random numbers or antithetic variates
Use of control variates requires some additional consideration but can be done

IE 519 340

VRT & Ranking & Selection

In R&S we need to make statements about the differences

Xbar_i(n) - Xbar_l(n)

If we use CRN then the averages will be dependent, which complicates analysis
Two general methods:
Look at pair-wise differences (Bonferroni inequality)
Assume some structure for the dependence induced by the CRN

IE 519 341

Pair-Wise Differences

We can replace the raw observations

X_ij,  i = 1, 2, ..., k;  j = 1, 2, ..., n

with the pair-wise differences

X_ij - X_lj,  j = 1, 2, ..., n

This will then include the effect of the CRN-induced covariance

IE 519 342

Bonferroni Approach

We can break up the joint statement using the Bonferroni inequality:

P( ∩_{i=1}^{k-1} A_i ) ≥ 1 - sum_{i=1}^{k-1} P(A_i^c)

where, e.g., A_i is a statement about Xbar_i(n) - Xbar_k(n)

Very conservative approach

IE 519 343

Assumed Structure

E.g., the Nelson-Matejcik modification of two-stage ranking and selection assumes sphericity:

Cov(X_ij, X_lj) = β_i + β_l + σ² 1{i = l}

so that Var(X_ij - X_lj) = 2σ² for all i ≠ l

Turns out to be a robust assumption

IE 519 344

VRT & DOE/Metamodeling

Experimental design X is used in many simulation studies to construct a metamodel (usually a regression model) of the response

y = Xβ + ε

Can we take advantage of variance reduction to improve the design?

IE 519 345

2³ Factorial Design

How would you used VRT for this design?

IE 519 346

Assignment Rule

In an m-point experiment that admits orthogonal blocking into two blocks of size m1 and m2, use a common stream of random numbers for the first block and the antithetic random numbers for the second block:

Block 1 (runs 1, ..., m1):        U = (u_1, u_2, u_3, ...)
Block 2 (runs m1 + 1, ..., m):    U = (1 - u_1, 1 - u_2, 1 - u_3, ...)

IE 519 347

Blocking

IE 519 348

2³ Factorial Design in 2 Blocks

In physical experiments we block because we have to (we lose the three-way interaction effect)
In simulation we do it because it is better

IE 519 349

VRT and Optimization

Most of the optimization techniques used with simulation do not make any assumptions (e.g., just heuristics from deterministic optimization applied to simulation)
No problem with using variance reduction
Nested partitions method
Needs independence between iterations
Key is to make a correct selection of a region in each iteration
Can use CRN within each iteration to make that selection

IE 519 350

Discussion

Variance reduction techniques can be very effective in improving the precision of simulation experiments
Of course variance is only part of the equation, and you should also consider bias and efficiency

bias(X) = E[X] - μ
MSE(X) = Var[X] + bias²(X) = E[(X - μ)²]
Eff(X) = 1 / (MSE(X) · C(X))
C(X) = cost of computing X
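The efficiency definitions above are easy to apply numerically. The numbers here are made up purely to show that a slightly biased but cheaper/less variable estimator can still win:

```python
# Sketch: comparing two estimators by Eff = 1 / (MSE * cost),
# with MSE = Var + bias^2, using hypothetical numbers.
def eff(var, bias, cost):
    mse = var + bias**2
    return 1.0 / (mse * cost)

# estimator A: unbiased, high variance, cheap
# estimator B: slightly biased, low variance, costlier
eff_a = eff(var=4.0, bias=0.0, cost=1.0)  # 1/4 = 0.25
eff_b = eff(var=1.0, bias=0.5, cost=3.0)  # 1/3.75 ~ 0.267

print(eff_a, eff_b)  # B is (slightly) more efficient despite its bias
```

This is the trade-off the slide points at: a variance reduction technique that adds a little bias or cost may still improve overall efficiency.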

IE 519 351

Case Study

IE 519 352

Manufacturing Simulations

Objectives:
Increased throughput
Reduce in-process inventory
Increase utilization
Improved on-time delivery
Validate a proposed design
Improved understanding of system

IE 519 353

Evaluate Need for Resources

How many machines are needed?
Where to put inventory buffers?
Effect of change/increase in production mix/volume?
Evaluation of capital investments (e.g., new machine)

IE 519 354

Performance Evaluation

Throughput
Response time
Bottlenecks

IE 519 355

Evaluate Operational Procedures

Scheduling
Control strategies
Reliability
Quality control

IE 519 356

Sources of Randomness

Interarrival times between orders, parts, or raw material
Processing, assembly, or inspection times
Times to failure
Times to repair
Loading/unloading times
Setup times
Rework
Product yield

IE 519 357

Example: Assembly Line

Increase in demand expected
Questions about the ability of an assembly line to meet demand
Requested to simulate the line to evaluate different options to improve throughput

IE 519 358

Project Objective

Improve throughput in the line
Simulate the following options:
Optimize logic of central conveyor loop
Reconfigure the functional test stations to allow parallel flow of pallets
Eliminate the conveyor and move to manual material handling

IE 519 359

Assembly Line

(Line layout)
Manual Station 1: assemble heatsinks and fans; soldering
Manual Station 2: install power module onto power PCBA
Hi-Pot Test
Strapping
Flashing
Functional Tests
HIM Assembly
Verification Test
Packaging

IE 519 360

Simulation Project

Define a conceptual model of the line
Gather data on processes
Validate the model
Implement model in Arena
Test model on known scenarios
Evaluate options
Recommend solutions

IE 519 361

How Can Throughput be Improved?

Change the queuing logic
This determines how pallets move from one station to the next
Flash station: two stations in a single loop
Functional test station: three loops with two stations each

IE 519 362

Logic of the Flash Stations

1. Frame goes to the 2nd station
2. Frame goes to the 2nd station and waits in the queue
3. Frame goes to the 1st station
4. Frame goes to the 1st station and waits in the queue

IE 519 363

Logic of the Functional Test

(Diagram: three loops of functional test stations 1-2, 3-4, 5-6, fed from point F)

IE 519 364

How Can Throughput be Improved?

Reconfigure functional test stations
Parallel tests would be more efficient with respect to flow of pallets
Take up more space on floor (longer distances)

Is it worthwhile to reconfigure?

IE 519 365

Circulate Pallets in System

IE 519 366

How can Throughput be Improved?

Eliminate the conveyor
Manual material handling

Increase number of pallets
Currently 48 pallets
Sometimes run out

IE 519 367

Arena Simulation Model

The conceptual model was implemented using the Arena software
The current configuration was simulated and the output compared to what we have observed
Performance of several alternative configurations compared

IE 519 368

Options Considered

Current configuration
Pallets re-circulate rather than queue
Various queue logics at functional tests
Flash stations in series, functional test in parallel
Both flash and functional test stations in parallel
Increased number of pallets in system
Eliminate conveyor

IE 519 369

Queue Logic Options

Option 1: Queue one drive at the second station in each loop, starting with the furthest away loop
Option 2: Queue one drive at both stations in each loop, starting with the furthest away loop
Option 3: No queuing in loops
Option 4: Queue at the second station in the first loop only, starting with the furthest away loop

IE 519 370

Throughput Comparison

Configuration                  Throughput (drives/day)
Current                        265
Recirculation of pallets       275 (4% increase)
Queue logic: Option 1          274 (3% increase)
Queue logic: Option 2          279 (5% increase)
Queue logic: Option 3          280 (6% increase)
Queue logic: Option 4          295 (11% increase)
Mixed series/parallel          282 (6% increase)
All tests in parallel          296 (12% increase)
Increase to 60 pallets (25%)   291 (10% increase)
No conveyor                    256 (3% decrease)

IE 519 371

Why Does Throughput Improve?

Consider the utilization of the test stations
First loop utilization = 0.81 (Station 1: 0.67, Station 2: 0.94; difference of 0.27)
Second loop utilization = 0.63 (Station 3: 0.45, Station 4: 0.81; difference of 0.36)
Third loop utilization = 0.41 (Station 5: 0.30, Station 6: 0.52; difference of 0.22)

IE 519 372

Improving Utilization

Backfilling will improve balance between the different loops (Option 2)
Loop utilizations: 0.53, 0.68, 0.76
Does not solve the whole issue
Not queuing at test stations will balance load between stations within a loop (Option 3)
Station utilizations: 0.64, 0.65, 0.75, 0.71, 0.80, 0.76
No queuing may leave a station empty too easily

IE 519 373

Intermediate Options

Option 1: Queue one drive at the second station in each loop, starting with the furthest away loop
Backfilling
Balance between no queuing and the current method of queuing one drive at each station

Option 4: Queue at the second station in the first loop only, starting with the furthest away loop
Uses the backfilling idea
Balance between no queuing and queuing at the second station

IE 519 374

Utilization Comparison

Configuration              Functional Test Utilization
Current                    0.67, 0.94, 0.45, 0.81, 0.30, 0.52
Recirculation of pallets   0.71, 0.94, 0.51, 0.86, 0.26, 0.54
Queue logic: Option 1      0.42, 0.67, 0.56, 0.83, 0.67, 0.91
Queue logic: Option 2      0.39, 0.67, 0.52, 0.84, 0.61, 0.91
Queue logic: Option 3      0.64, 0.65, 0.75, 0.71, 0.80, 0.76
Queue logic: Option 4      0.53, 0.82, 0.65, 0.65, 0.73, 0.70
Mixed series/parallel      0.69
All tests in parallel      0.71
Increase to 60 pallets     0.70, 0.95, 0.49, 0.83, 0.43, 0.61

IE 519 375

Comments on Utilization

Utilization of functional test stations is currently uneven and can be improved
Key ideas:
Backfilling
Correct amount of queuing allowed

IE 519 376

Bottleneck Analysis

Utilization of various stations:
Manual station 1            80%  (bottleneck*)
Manual station 2            65%
Soldering                   79%  (bottleneck*)
Hi-Pot                      15%
Strapping                   37%
Flashing                    53% average
Functional test             71% average (third highest utilization)
Functional test station 2   94%  (bottleneck)
HIM                         31%
Verification                47%
Packing                     57%

*Statistically equivalent

IE 519 377

Bottleneck Identification

Functional Test Station 2 is the most heavily loaded station on the line
On average, the functional test stations are slightly less loaded than Manual Station 1 and the Soldering Station, which should hence also be considered bottlenecks

IE 519 378

Functional Test Bottleneck

Configuration              Queue Length
Current                    0.73 ± 0.34, MAX = 10 (21%)
Recirculation of pallets   0.17 ± 0.04, MAX = 1 (2%)
Queue logic: Option 1      0.72 ± 0.32, MAX = 10 (21%)
Queue logic: Option 2      0.20 ± 0.10, MAX = 8 (17%)
Queue logic: Option 3      1.28 ± 0.50, MAX = 17 (35%)
Queue logic: Option 4      0.79 ± 0.27, MAX = 11 (23%)
Mixed series/parallel      0.96 ± 0.26, MAX = 19 (40%)
All tests in parallel      1.14 ± 0.33, MAX = 14 (29%)
Increase to 60 pallets     1.34 ± 0.55, MAX = 16 (33%)

IE 519 379

Comments on Queue Length

Functional test queue:
Average queue length relatively short
Occasionally very long queues
Similar results for other stations, e.g., HIM assembly station
Not a cause for concern

IE 519 380

Recommendations

Throughput can be improved:
Queuing logic at test stations
Requires reprogramming of conveyor
Configuring test stations in parallel
Requires significant reorganization
ROI must be evaluated carefully
Increase number of pallets
Currently close to the point of rapidly diminishing returns
Will not combine well with other improvements

IE 519 381

Further Improvements

Optimal logic of functional tests depends on mix of drives, daily load, etc.
Possibility of dynamically changing the logic?
Determine a relationship between product mix parameters and best logic

IE 519 382

Other Areas of Improvement

Scheduling of drives
Mix of frames made each day
Order in which different frames are made

Suggestion: grouping and spacing
Group similar drives together for efficiency
Space time-consuming drives apart
Account for deadlines and resource availability

IE 519 383

Will Scheduling Improvements Help?

Simulation results
Assume batch sizes within a certain range, with a certain most common batch size
Clearly improvements can be made

Type      Min  Max  Most common  Throughput
Batching   1   27   14           279
Batching   1   22   10           265
No batch   1    1    1           274

IE 519 384

Discussion

Significant improvement can be obtained through inexpensive changes
Recommend changing the queuing logic as an inexpensive but high-return alternative
Worthwhile to consider issues of scheduling
Simulation model can be reused to consider other potential improvements
Company followed recommendations and increased throughput as predicted