Transcript of Simulation
IE 519 1
Simulation Modelling
IE 519 2
Contents
Input Modelling ................... 3
Random Number Generation .......... 41
Generating Random Variates ........ 80
Output Analysis ................... 134
Resampling Methods ................ 205
Comparing Multiple Systems ........ 219
Simulation Optimization ........... 248
Metamodels ........................ 278
Variance Reduction ................ 292
Case Study ........................ 350
IE 519 3
Input Modelling
IE 519 4
Input Modelling
You make custom Widgets
How do you model the input process?
Is it deterministic? Is it random?
Look at some data
IE 519 5
Orders
1/5/2004  1/12/2004  1/20/2004  1/29/2004
2/3/2004  2/15/2004  2/19/2004  2/25/2004  2/28/2004
3/6/2004  3/15/2004  3/27/2004  3/31/2004
4/10/2004 4/14/2004  4/17/2004  4/21/2004  4/22/2004  4/28/2004
5/2/2004  5/3/2004   5/24/2004  5/26/2004
6/4/2004  6/15/2004
Now what?
IE 519 6
Histogram
(Histogram of days between orders, bins [0,2], (2,4], ..., (18,20], 20+; counts range from 0 to 8)
IE 519 7
Other Observations
Trend? Stationary or non-stationary process?
Seasonality? May require multiple processes
IE 519 8
Choices for Modelling
Use the data directly (trace-driven simulation)
Use the data to fit an empirical distribution
Use the data to fit a theoretical distribution
IE 519 9
Assumptions
To fit a distribution, the data should be drawn from IID observations
Could it be from more than one distribution? Statistical test
Is it independent? Statistical test
IE 519 10
Activity I
Hypothesize families of distributions
Look at the data to determine what is a reasonable process:
Summary statistics
Histograms
Quantile summaries and box plots
IE 519 11
Activity II
Estimate the parameters
Maximum likelihood estimator (MLE)
Sometimes a very simple statistic
Sometimes requires numerical calculations
IE 519 12
Activity III
Determine quality of fit
Compare theoretical distribution with observations graphically
Goodness-of-fit tests:
Chi-square test
Kolmogorov-Smirnov test
Software
Software
IE 519 13
Chi-Square Test
Formal comparison of a histogram and the probability density/mass function
Divide the range of the fitted distribution into intervals
Count the number of observations in each interval
[a_0, a_1), [a_1, a_2), ..., [a_{k-1}, a_k)
N_j = number of X_i's in [a_{j-1}, a_j)
IE 519 14
Chi-Square Test
Compute the expected proportion for each interval:
p_j = integral from a_{j-1} to a_j of f(x) dx      (continuous data)
p_j = sum of p(x_i) over a_{j-1} <= x_i < a_j      (discrete data)
Test statistic is
chi^2 = sum_{j=1}^{k} (N_j - n p_j)^2 / (n p_j)
Reject if too large
IE 519 15
How good is the data?
Assumption of IID observations
Sometimes time-dependent (non-stationary)
Assessment:
Correlation plot
Scatter diagram
Nonparametric tests
IE 519 16
Correlation Plot
Calculate and plot the sample correlation
rho_hat_j = correlation of observations j apart
H_0: rho_j = 0
IE 519 17
Scatter Diagram
Plot the pairs (X_i, X_{i+1})
Should be scattered randomly through the plane
If there is a pattern, then this indicates correlation
IE 519 18
Multiple Data Sets
Often you have multiple data sets (e.g., different days, weeks, operators)
Is the data drawn from the same process (homogeneous), and can it thus be combined? Kruskal-Wallis test
Data sets (set i has n_i observations):
X_11, X_12, ..., X_{1 n_1}
X_21, X_22, ..., X_{2 n_2}
...
X_k1, X_k2, ..., X_{k n_k}
IE 519 19
Kruskal-Wallis (K-W) Statistic
Assign rank 1 to the smallest observation, rank 2 to the second smallest, etc.
R(X_ij) = rank of X_ij among all n = n_1 + ... + n_k observations
R_i = sum_{j=1}^{n_i} R(X_ij)
Calculate
T = [12 / (n(n+1))] sum_{i=1}^{k} R_i^2 / n_i - 3(n+1)
IE 519 20
K-W Test
The null hypothesis is
H0: All the population distribution are identical
H1: At least one is larger than at least one
other
We reject H0 at level alpha if
T > chi^2_{k-1, 1-alpha}
In other words, the test statistic approximately follows a chi-square distribution with k-1 degrees of freedom
IE 519 21
Absence of Data
We have assumed that we had data to fit a distribution
Sometimes no data is available
Try to obtain minimum, maximum, and mode and/or mean of the distribution
Documentation
SMEs (subject matter experts)
IE 519 22
Triangular Distribution
IE 519 23
Symmetric Beta Distributions
alpha_1 = alpha_2 = 2    alpha_1 = alpha_2 = 3
alpha_1 = alpha_2 = 5    alpha_1 = alpha_2 = 10
IE 519 24
Skewed Beta Distributions
alpha_1 = 2, alpha_2 = 4
IE 519 25
Beta Parameters
Mode: c = a + (b - a)(alpha_1 - 1) / (alpha_1 + alpha_2 - 2)
Mean: mu = a + (b - a) alpha_1 / (alpha_1 + alpha_2)
Estimates (given a, b, mode c, and mean mu):
alpha_1_hat = (mu - a)(2c - a - b) / [(c - mu)(b - a)]
alpha_2_hat = alpha_1_hat (b - mu) / (mu - a)
IE 519 26
Benefits of Fitting a Parametric Distribution
We have focused mainly on the approach where we fit a distribution to data
Benefits:
Fill in gaps and smooth data
Make sure tail behavior is represented
Extreme events are very important to the simulation but may not be represented in the data
Can easily incorporate changes in the input process
Change mean, variability, etc.
Reflect dependencies in the inputs
IE 519 27
What About Dependencies?
Assumed so far an IID process
Many processes are not:
A customer places a monthly order. Since the customer keeps inventory of the product, a large order is often followed by a small order
A distributor with several warehouses places monthly orders, and these warehouses can supply the same customers
The behavior of customers logging on to a web site depends on age, gender, income, and where they live
Do not ignore it!
IE 519 28
Solutions
A customer places a monthly order
Should use a time-series model that captures the autocorrelation
A distributor with several warehouses
Need a vector time-series model
Customers logging on to a web site
Need a random vector model where each component may have a different distribution
IE 519 29
Taxonomy of Input Models
Time-independent
  Univariate
    Discrete (e.g., binomial)
    Continuous (e.g., normal, gamma, beta)
    Mixed (e.g., empirical/trace-driven)
  Multivariate
    Discrete (e.g., independent binomial)
    Continuous (e.g., multivariate normal, bivariate exponential)
    Mixed
Stochastic processes
  Discrete-time
    Discrete-state (e.g., Markov chains -- stationary?)
    Continuous-state (e.g., time-series models)
  Continuous-time
    Discrete-state (e.g., Poisson process -- stationary?)
    Continuous-state (e.g., Markov process)
IE 519 30
What if it Changes over Time?
Do not ignore it!
Non-stationary input process
Examples:
Arrivals of customers to a restaurant
Arrivals of email to a server
Arrivals of bug discovery in software
Could model as nonhomogeneous Poisson process
IE 519 31
Goodness-of-Fit Test
The distribution fitted is tested using goodness-of-fit (GoF) tests
How good are those tests?
The null hypothesis is that the data is drawn from the chosen distribution with the estimated parameters
Is it true?
IE 519 32
Power of GoF Tests
The null hypothesis is always false!
If the GoF test is powerful enough, then it will always be rejected
What we see in practice:
Few data points: no distribution is rejected
A great deal of data: all distributions are rejected
At best, GoF tests should be used as a guide
IE 519 33
Input Modeling Software
Many software packages exist for input modeling (fitting distributions)
Each has at least 20-30 distributions
You input IID data, and the software gives you a ranked list of distributions (according to GoF tests)
Pitfalls?
IE 519 34
Why Fit a Distribution at All?
There is a growing sentiment that we should never fit distributions (not consensus, just growing)
A couple of issues:
You don't always benefit from data
Fitting a distribution can be misleading
IE 519 35
Is Data Reality?
Data is often:
Distorted: poorly communicated, mistranslated, or misrecorded
Dated: data is always old by definition
Deleted: some of the data is often missing
Dependent: often only summaries, or collected at certain times
Deceptive: this may all be on purpose!
IE 519 36
Problems with Fitting
Fitting an input distribution can be misleading for numerous reasons:
There is rarely a theoretical justification for the distribution
Simulation is often sensitive to the tails, and this is where the problem is!
Selecting the correct model is futile
The model gives the simulation practitioner a false sense of the model being well-defined
IE 519 37
Alternative
Use empirical/trace-driven simulation when there is sufficient data
Treat other cases as if there is no data, and use a beta distribution
IE 519 38
Empirical Distribution
Observations X_1, X_2, ..., X_n
Empirical distribution function (CDF):
F_hat(x) = (number of X_i's <= x) / n
Or we can order the observations X_(1) <= X_(2) <= ... <= X_(n) and define the piecewise-linear version
F_hat(x) = 0                                                        for x < X_(1)
F_hat(x) = (i-1)/(n-1) + (x - X_(i)) / [(n-1)(X_(i+1) - X_(i))]     for X_(i) <= x < X_(i+1)
F_hat(x) = 1                                                        for x >= X_(n)
IE 519 39
Beta Distribution Shapes
IE 519 40
What to Do?
Old rule of thumb based on the number of data points available:
< 20: Not enough data to fit
21-50: Fit, rule out poor choices
50-200: Fit a distribution
> 200: Use empirical distribution
IE 519 41
Random Number Generation
IE 519 42
Random-Number Generation
Any simulation with random components requires generating a sequence of random numbers
E.g., we have talked about arrival times and service times being drawn from a particular distribution
We do this by first generating a random number (uniform on [0,1]) and then transforming it appropriately
IE 519 43
Three Alternatives
True random numbers
Throw a die
Not possible to do with a computer
Pseudo-random numbers
Deterministic sequence that is statistically indistinguishable from a random sequence
Quasi-random numbers
A regular distribution of numbers over the desired interval
IE 519 44
Why is this Important?
Validity
The simulation model may not be valid due to cycles and dependencies in the model
Precision
You can improve the output analysis by carefully choosing the random numbers
IE 519 45
Pseudo-Random Numbers
Want an iterative algorithm that outputs numbers on a fixed interval
When we subject this sequence to a number of statistical tests, we cannot distinguish it from a random sequence
In reality, it is completely deterministic
IE 519 46
Linear Congruential Generators (LCG)
Introduced in the early 50s and still in very wide use today
Recursive formula:
Z_i = (a Z_{i-1} + c) mod m
where
a = multiplier
c = increment
m = modulus
Z_0 = seed
Every number is determined by these four values
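The recursion Z_i = (a Z_{i-1} + c) mod m is a few lines of code. A minimal sketch in Python; the parameters a = 13, c = 13, m = 16 are toy values for illustration only (they satisfy the full-period conditions discussed later), never for real use.

```python
from itertools import islice

def lcg(a, c, m, seed):
    """Generator yielding the LCG sequence Z_i = (a*Z_{i-1} + c) mod m."""
    z = seed
    while True:
        z = (a * z + c) % m
        yield z

# Toy parameters: full period m = 16, so every value 0..15 appears once per cycle
zs = list(islice(lcg(a=13, c=13, m=16, seed=1), 16))
us = [z / 16 for z in zs]   # transform to (approximately) unit uniform
```

With these parameters the generator visits all 16 residues before returning to the seed, which is exactly the full-period behavior described on the next slides.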
IE 519 47
Transform to Unit Uniform
Simply divide by m:
U_i = Z_i / m
What values can we take?
IE 519 48
Examples
Z_i = (12 Z_{i-1} + 1) mod 16,  Z_0 = 1
Z_i = 3 Z_{i-1} mod 13,  Z_0 = 1
Z_i = 11 Z_{i-1} mod 16,  Z_0 = 1
IE 519 49
Characteristics
All LCGs loop
The length of the cycle is the period
LCGs with period m have full period
This happens if and only if:
The only positive integer that divides both m and c is 1
If q is a prime that divides m, then q divides a-1
If 4 divides m then 4 divides a-1
IE 519 50
Types of LCGs
If c=0 then it is called multiplicative LCG, otherwise mixed LCG
Mixed and multiplicative LCG behave rather differently
IE 519 51
Comments on Parameters
Mixed generators
Want m to be large
A good choice is m = 2^b, where b is the number of bits
Obtain full period if c is odd and a-1 is divisible by 4
Multiplicative LCGs
Simpler
Cannot have full period (first condition cannot be satisfied)
Still an attractive option
IE 519 52
Performance Tests
Empirical tests
Use the RNG to generate some numbers and then test the null hypothesis
H0: The sequence is IID U(0,1)
IE 519 53
Test 1: Chi-Square Test
Similar to before:
Generate U_1, U_2, ..., U_n
Split [0,1] into k subintervals (k >= 100)
f_j = number of U_i's in the jth subinterval
Test statistic is
chi^2 = (k/n) sum_{j=1}^{k} (f_j - n/k)^2
with k-1 degrees of freedom
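A sketch of this uniformity test in Python. The choices n = 10000 and k = 100 are arbitrary illustrations; under H0 the statistic is approximately chi-square with k-1 = 99 degrees of freedom (mean 99), and Python's built-in Mersenne Twister passes comfortably.

```python
import random

def chi_square_uniform(us, k):
    """Chi-square uniformity statistic: (k/n) * sum_j (f_j - n/k)^2."""
    n = len(us)
    f = [0] * k
    for u in us:
        f[min(int(u * k), k - 1)] += 1   # bin index; guard against u == 1.0
    return (k / n) * sum((fj - n / k) ** 2 for fj in f)

random.seed(5)
stat = chi_square_uniform([random.random() for _ in range(10000)], k=100)
```

Compare `stat` against a chi-square quantile with 99 degrees of freedom; values far above roughly 135 (the 0.99 quantile) would reject H0.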
IE 519 54
Test 2: Serial Test
Consider non-overlapping d-tuples
U_1 = (U_1, ..., U_d), U_2 = (U_{d+1}, ..., U_{2d}), ...
Similar to before:
Divide [0,1]^d into k^d cubes
f_{j1 j2 ... jd} = number of U_i's in the subinterval indexed (j1, j2, ..., jd)
chi^2(d) = (k^d / n) sum_{j1=1}^{k} ... sum_{jd=1}^{k} (f_{j1 j2 ... jd} - n/k^d)^2
IE 519 55
Test 3: Runs Test
Calculate, for U_1, U_2, ..., U_n,
r_i = number of runs up of length i, for i = 1, 2, ..., 5
r_6 = number of runs up of length >= 6
Test statistic (chi-square with 6 d.f.):
R = (1/n) sum_{i=1}^{6} sum_{j=1}^{6} a_ij (r_i - n b_i)(r_j - n b_j)
where the a and b values are given empirically
IE 519 56
Test 4: Correlation Test
For uniform random variables,
E[U_i] = 1/2,  Var(U_i) = 1/12
C_j = Cov(U_i, U_{i+j}) = E[U_i U_{i+j}] - E[U_i] E[U_{i+j}] = E[U_i U_{i+j}] - 1/4
rho_j = C_j / Var(U_i) = 12 E[U_i U_{i+j}] - 3
IE 519 57
Test 4: Correlation Test
Empirical estimate is
rho_hat_j = [12/(h+1)] sum_{k=0}^{h} U_{1+kj} U_{1+(k+1)j} - 3,  where h = floor((n-1)/j) - 1
Var(rho_hat_j) = (13h + 7) / (h+1)^2
Test statistic
A_j = rho_hat_j / sqrt(Var(rho_hat_j))
Approximately standard normal
IE 519 58
Passing the Test
An RNG with long period that passes a fixed set of statistical tests is no guarantee of it being a good RNG
Many commonly used generators are not good at all, even though they pass all of the most basic tests
IE 519 59
Classic LCG16807
Multiplicative LCGs cannot have full period, but they can get very close
Z_i = 16807 Z_{i-1} mod (2^31 - 1)
U_i = Z_i / (2^31 - 1)
Has period of 2^31 - 2, that is, the best possible
Dates back to 1969
Suggested in many simulation texts and was (is) the standard for simulation software
Still in use in many software packages
IE 519 60
Java RNG
Mixed LCG with full period:
Z_i = (25214903917 Z_{i-1} + 11) mod 2^48
U_i = Z_i / 2^48 (or combine two draws for a 53-bit double: U_i = (2^27 floor(Z_{2i-1}/2^22) + floor(Z_{2i}/2^21)) / 2^53)
Variant of the old rand48() Unix LCG
IE 519 61
Two more LCGs
VB:
Z_i = (1140671485 Z_{i-1} + 12820163) mod 2^24
U_i = Z_i / 2^24
Excel:
U_i = (9821.0 U_{i-1} + 0.211327) mod 1
IE 519 62
Simple Simulation Tests
Collision Test
Divide [0,1) into d equal intervals (giving d^t boxes in [0,1)^t)
Generate n points in [0,1)^t
C = number of times a point falls in a box that already has a point (a collision)
Birthday Spacing Test
Have k boxes; sort the occupied box labels I(1) <= I(2) <= ... <= I(n)
Define the spacings S_j = I(j+1) - I(j)
Consider Y = |{ j : S_{j+1} = S_j, j = 1, ..., n-2 }|
IE 519 63
Performance: Collision
After 2^15 numbers, VB starts failing
After 2^17 numbers, Excel starts failing
After 2^19 numbers, LCG16807 starts failing
The Java RNG does OK up to at least 2^20 numbers
Note that this means that a clear pattern is observed from the VB RNG with fewer than 100,000 numbers generated!
IE 519 64
Performance: B-day Spacing
After 2^10 numbers, VB starts failing
After 2^14 numbers, Excel starts failing
After 2^14 numbers, LCG16807 starts failing
After 2^18 numbers, Java starts failing
For this test, the VB RNG is only good for about 1000 numbers!
The performance gets even worse if we look at less significant digits.
IE 519 65
Combined LCG
A better RNG is obtained as follows:
Z_{1,i} = (a_{1,1} Z_{1,i-1} + a_{1,2} Z_{1,i-2} + ... + a_{1,k} Z_{1,i-k}) mod m_1
Z_{2,i} = (a_{2,1} Z_{2,i-1} + a_{2,2} Z_{2,i-2} + ... + a_{2,k} Z_{2,i-k}) mod m_2
U_i = (Z_{1,i}/m_1 + Z_{2,i}/m_2) mod 1
Recommended parameters (k = 3):
(a_{1,1}, a_{1,2}, a_{1,3}) = (0, 1403580, -810728),  m_1 = 2^32 - 209
(a_{2,1}, a_{2,2}, a_{2,3}) = (527612, 0, -1370589),  m_2 = 2^32 - 22853
Cycle length of about 2^191 with good structure
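A sketch of this combined generator (the k = 3 parameters are L'Ecuyer's MRG32k3a coefficients). The seeding below is an arbitrary choice, and the uniform combination follows the slide's U_i = Z_1/m_1 + Z_2/m_2 mod 1 form, which differs slightly from L'Ecuyer's published output step; treat it as an illustration, not a drop-in implementation.

```python
M1 = 2**32 - 209
M2 = 2**32 - 22853

def combined_lcg(seed1=(12345, 12345, 12345), seed2=(12345, 12345, 12345)):
    """Two order-3 multiplicative recursions combined into one uniform stream."""
    s1, s2 = list(seed1), list(seed2)   # s[0] = Z_{i-1}, s[1] = Z_{i-2}, s[2] = Z_{i-3}
    while True:
        z1 = (1403580 * s1[1] - 810728 * s1[2]) % M1   # a_{1,1}=0
        s1 = [z1, s1[0], s1[1]]
        z2 = (527612 * s2[0] - 1370589 * s2[2]) % M2   # a_{2,2}=0
        s2 = [z2, s2[0], s2[1]]
        yield (z1 / M1 + z2 / M2) % 1.0

gen = combined_lcg()
us = [next(gen) for _ in range(1000)]
```

Python's `%` already returns a nonnegative residue, so the negative coefficients need no special handling.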
IE 519 66
Why do RNGs Fail?
We have seen that many commonly used RNGs fail simulation tests, even though they pass the standard empirical tests
Why do these RNGs fail?Need to analyze the structure of the RNG
IE 519 67
Lattice Structure
For all LCGs, the numbers generated fall in a fixed number of planes
We want this to be as many planes as possible, 'filling up' the space
This should be true in many dimensions
IE 519 68
Example: Two Full-Period LCGs
IE 519 69
LCG RANDU in 3 Dimensions
IE 519 70
Theoretical Tests
Based on analyzing the structure of the numbers that can be generated
Lattice test
Spectral test
IE 519 71
Selecting the Seed
Say we need two independent sequences of 8 numbers
Select seed values 1 and 15
Example generator: Z_i = (13 Z_{i-1} + 13) mod 16, which cycles through
1 10 15 0 13 6 11 12 9 2 7 8 5 14 3 4 1 ...
Good RNGs will have precomputed seed values
IE 519 72
Streams and Substreams
A segment corresponding to a seed is usually called a stream
Also want to be able to get independent substreams of each stream
Example: assign each stream to generating one type of numbers, and use each substream for independent replications
Requires very long period generators and precomputed streams
IE 519 73
Analysis of RNG Z_i = (13 Z_{i-1} + 13) mod 16
Z: 1    10   15   0    13   6    11   12   9    2
U: 0.06 0.63 0.94 0.00 0.81 0.38 0.69 0.75 0.56 0.13
(Histogram of the U's over bins [0,0.25), [0.25,0.5), [0.5,0.75), [0.75,1), with counts 3, 1, 3, 3)
IE 519 74
Do We Need Randomness?
For certain applications, definitely
For simulation, maybe not always
Quasi-random numbers
Say we want to estimate an expected value
s = integral over [0,1)^d of f(u) du
IE 519 75
Monte Carlo Estimate
Using n independent simulation runs
Error converges at rate
)1,0(/)ˆ(
ˆ
1ˆ
2
1
0
Nn
nVar
fn
n
ii
u
n
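The estimator s_hat_n is just an average of function evaluations at random points. A minimal sketch; the integrand f(u) = u^2 on [0,1) (true value 1/3) is my toy choice, not from the slides.

```python
import random

def mc_estimate(f, n):
    """Monte Carlo estimate of the integral of f over [0,1)."""
    return sum(f(random.random()) for _ in range(n)) / n

random.seed(42)
est = mc_estimate(lambda u: u * u, 100_000)
```

With n = 100,000 runs the standard error is about 0.001, so the estimate lands very close to 1/3; halving the error requires quadrupling n, which is the O(1/sqrt(n)) rate.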
IE 519 76
Quasi-Monte Carlo
Replace the random points with a set of points that cover [0,1)^d more uniformly
IE 519 77
Discussion
By using quasi-random numbers, we are able to achieve a faster convergence rate
When estimating an integral, real randomness is not really an issue
What about discrete event simulation?
IE 519 78
Discussion
Generating random numbers is important to every simulation project:
Validity of the simulation
Precision of the output analysis
Not all RNGs are very good
IE 519 79
Discussion
Problems:
Too short a period (a period of 2^31 is not sufficient)
Unfavorable lattice structure
Numbers generated by RANDU fall on 15 planes in R^3
Inability to get truly independent subsequences
Need streams (segments) and substreams
Should choose an RNG that passes both empirical and theoretical tests, has a very long period, and allows us to get good streams
IE 519 80
Generating Random Variates
IE 519 81
Generating Random Variates
Say we have fitted an exponential distribution to interarrival times of customers
Every time we anticipate a new customer arrival (place an arrival event on the event list), we need to generate a realization of the arrival time
We know how to generate a unit uniform
Can we use this to generate an exponential? (And other distributions)
IE 519 82
Two Types of Approaches
Direct
Obtain an analytical expression
Inverse transform: requires inverse of the distribution function
Composition and convolution: for special forms of distribution functions
Indirect
Acceptance-rejection
IE 519 83
Inverse-Transform Method
IE 519 84
Formulation
Algorithm:
1. Generate U ~ U(0,1)
2. Return X = F^{-1}(U)
Proof:
P(X <= x) = P(F^{-1}(U) <= x) = P(U <= F(x)) = F(x)
IE 519 85
Example: Weibull
IE 519 86
Example: Exponential
F(x) = 1 - e^{-x/beta} for x >= 0
F(x) = 0 for x < 0
Inverting, X = -beta ln(1 - U); since 1 - U is also U(0,1), we can use X = -beta ln U
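The inversion above is one line of code. A minimal sketch (the function name and the check with beta = 3 are illustrative choices):

```python
import math
import random

def exp_variate(beta, u=None):
    """Inverse transform for exponential(beta): X = -beta * ln(U)."""
    if u is None:
        u = random.random()
    return -beta * math.log(u)

random.seed(1)
sample_mean = sum(exp_variate(3.0) for _ in range(100_000)) / 100_000
# sample_mean should be close to the mean beta = 3
```

Passing an explicit u makes the transform deterministic, which is exactly what the variance-reduction tricks later (common and antithetic random numbers) rely on.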
x
IE 519 87
Discrete Distributions
IE 519 88
Formulation
Algorithm
Proof: Need to show
IxFUIX
UU
xxX
:minReturn .2
)1,0(~ Generate 1.
,..., valuescan take 21
ixpxXP ii )(
IE 519 89
Continuous, Discrete, Mixed
Algorithm
UxFxX
UU
)(:minReturn 2.
)1,0(~ Generate .1
IE 519 90
Discussion: Disadvantages
Must evaluate the inverse of the distribution function
May not exist in closed form
Could still use numerical methods
May not be the fastest way
IE 519 91
Discussion: Advantages
Facilitates variance reduction:
X_1 = F_1^{-1}(U_1)
X_2 = F_2^{-1}(U_2)
Can select
U_2 = U_1
U_2 = 1 - U_1
U_1, U_2 independent
Ease of generating truncated distributions
IE 519 92
Composition
Assume that
F(x) = sum_{j=1}^{infinity} p_j F_j(x),  p_j >= 0,  sum_{j=1}^{infinity} p_j = 1
Algorithm:
1. Generate a positive random integer J such that P(J = j) = p_j
2. Return X with distribution F_J
IE 519 93
Convolution
Assume that
X = Y_1 + Y_2 + ... + Y_m
(where the Y's are IID with CDF G)
Algorithm:
1. Generate Y_1, Y_2, ..., Y_m IID, each with CDF G
2. Return X = Y_1 + Y_2 + ... + Y_m
IE 519 94
Acceptance-Rejection Method
Specify a function t that majorizes the density:
t(x) >= f(x) for all x
New density function:
r(x) = t(x) / c, where c = integral of t(x) dx
Algorithm:
1. Generate Y with density r
2. Generate U ~ U(0,1), independent of Y
3. If U <= f(Y)/t(Y), return X = Y. Otherwise, go back to step 1.
IE 519 95
Example:
IE 519 96
Example: More Efficient
IE 519 97
Simple Distributions
Uniform(a,b):
X = a + (b - a) U
Exponential(beta):
X = -beta ln U
m-Erlang (mean beta):
X = -(beta/m) ln( prod_{i=1}^{m} U_i )
IE 519 98
Gamma
Distribution function (for integer alpha, i.e., the Erlang case):
F(x) = 1 - e^{-x/beta} sum_{j=0}^{alpha-1} (x/beta)^j / j!   for x > 0
F(x) = 0 otherwise
No closed-form inverse
Note that if X ~ gamma(alpha, 1), then beta X ~ gamma(alpha, beta)
IE 519 99
Gamma(alpha,1) Density
IE 519 100
Gamma(alpha,1)
Gamma(1,1) is exponential(1)
0 < alpha < 1: acceptance-rejection with
t(x) = 0             for x < 0
t(x) = x^{alpha-1}   for 0 <= x <= 1
t(x) = e^{-x}        for x > 1
This majorizes the gamma(alpha,1) density, but can we generate random variates?
IE 519 101
Gamma(alpha,1), 0 < alpha < 1
The integral of the majorizing function:
c = integral t(x) dx = integral_0^1 x^{alpha-1} dx + integral_1^inf e^{-x} dx = 1/alpha + 1/e = b/alpha,  where b = (e + alpha)/e
New density:
r(x) = 0                        for x < 0
r(x) = alpha x^{alpha-1} / b    for 0 <= x <= 1
r(x) = alpha e^{-x} / b         for x > 1
IE 519 102
Gamma(alpha,1), 0 < alpha < 1
The distribution function is
R(x) = integral_0^x r(y) dy = x^alpha / b   for 0 <= x <= 1
R(x) = 1 - alpha e^{-x} / b                 for x > 1
Invert:
R^{-1}(u) = (b u)^{1/alpha}           for u <= 1/b
R^{-1}(u) = -ln( b(1 - u) / alpha )   otherwise
IE 519 103
Gamma(alpha,1), 0 < alpha < 1
1. Generate U_1 ~ U(0,1) and let P = b U_1. If P > 1, go to step 3. Otherwise, go to step 2.
2. Let Y = P^{1/alpha}, and generate U_2 ~ U(0,1). If U_2 <= e^{-Y}, return X = Y. Otherwise, go to step 1.
3. Let Y = -ln[(b - P)/alpha], and generate U_2 ~ U(0,1). If U_2 <= Y^{alpha-1}, return X = Y. Otherwise, go to step 1.
IE 519 104
Gamma(alpha,1), alpha > 1
Acceptance-rejection with a log-logistic majorizing function:
t(x) = c r(x), where r is the log-logistic density with lambda = sqrt(2 alpha - 1), mu = alpha^lambda, and c = 4 alpha^alpha e^{-alpha} / (lambda Gamma(alpha))
IE 519 105
Gamma(alpha,1), alpha > 1
Distribution function (log-logistic):
R(x) = x^lambda / (mu + x^lambda),  x >= 0
Inverse:
R^{-1}(u) = ( mu u / (1 - u) )^{1/lambda}
IE 519 106
Normal
Distribution function does not have a closed form (so neither does the inverse)
Can use numerical methods for inverse transform
Note that
X ~ N(0,1)  =>  mu + sigma X ~ N(mu, sigma^2)
If we can generate a unit normal, then we can generate any normal
IE 519 107
Normal: Box-Muller
Algorithm:
1. Generate independent U_1, U_2 ~ U(0,1)
2. Set X_1 = sqrt(-2 ln U_1) cos(2 pi U_2),  X_2 = sqrt(-2 ln U_1) sin(2 pi U_2)
3. Return X_1, X_2
Technically independent N(0,1), but a serious problem if used with LCGs
IE 519 108
Polar Method
Algorithm:
1. Generate independent U_1, U_2 ~ U(0,1). Let V_i = 2 U_i - 1 and W = V_1^2 + V_2^2.
2. If W > 1, go to step 1. Otherwise, let
Y = sqrt( -2 ln W / W )
X_1 = V_1 Y,  X_2 = V_2 Y
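The polar method above maps directly to code. A minimal sketch; the sample-size and seed choices are illustrative.

```python
import math
import random

def polar_normals():
    """Polar (Marsaglia) method: returns two independent N(0,1) variates."""
    while True:
        v1 = 2.0 * random.random() - 1.0
        v2 = 2.0 * random.random() - 1.0
        w = v1 * v1 + v2 * v2
        if 0.0 < w <= 1.0:                       # accept points inside the unit circle
            y = math.sqrt(-2.0 * math.log(w) / w)
            return v1 * y, v2 * y

random.seed(123)
sample = [x for _ in range(20_000) for x in polar_normals()]
# 40,000 draws; sample mean ~ 0 and variance ~ 1
```

About 1 - pi/4, or roughly 21%, of candidate points are rejected, but the method avoids the cos/sin calls of Box-Muller and, unlike Box-Muller, does not trace a curve when fed consecutive LCG pairs.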
IE 519 109
Derived Distributions
Several distributions are derived from the gamma and normalCan take advantage of knowing how to generate those two distributions
IE 519 110
Beta
Density:
f(x) = x^{alpha_1 - 1} (1 - x)^{alpha_2 - 1} / B(alpha_1, alpha_2)   for 0 <= x <= 1
f(x) = 0 otherwise
where B(alpha_1, alpha_2) = integral_0^1 t^{alpha_1 - 1} (1 - t)^{alpha_2 - 1} dt
No closed-form CDF, no closed-form inverse
Must use numerical methods for the inverse-transform method
Beta Distribution Shapes
IE 519 112
Beta Properties
Sufficient to consider beta on [0,1]
If X ~ beta(alpha_1, alpha_2) then 1 - X ~ beta(alpha_2, alpha_1)
If alpha_2 = 1 then
f(x) = alpha_1 x^{alpha_1 - 1},  F(x) = x^{alpha_1},  F^{-1}(u) = u^{1/alpha_1}
If alpha_1 = 1 and alpha_2 = 1 then X ~ U(0,1)
IE 519 113
Beta: General Approach
If Y_1 ~ gamma(alpha_1, 1) and Y_2 ~ gamma(alpha_2, 1), and Y_1 and Y_2 are independent, then
Y_1 / (Y_1 + Y_2) ~ beta(alpha_1, alpha_2)
Thus, if we can generate two gamma random variates, we can generate a beta with arbitrary parameters
IE 519 114
Pearson Type V and Type VI
Pearson Type V
X ~ PT5(alpha, beta) iff 1/X ~ gamma(alpha, 1/beta)
Pearson Type VI
If Y_1 ~ gamma(alpha_1, beta) and Y_2 ~ gamma(alpha_2, 1), and Y_1 and Y_2 are independent, then
Y_1 / Y_2 ~ PT6(alpha_1, alpha_2, beta)
IE 519 115
Pearson Type V
IE 519 116
Pearson Type VI
IE 519 117
Normal Derived Distributions
Lognormal:
Y ~ N(mu, sigma^2)  =>  e^Y ~ LN(mu, sigma^2)
Test distributions (not often used for modeling):
Chi-squared
Student's t distribution
F distribution
IE 519 118
Log-Normal
IE 519 119
Empirical
Use the inverse-transform method
Do not need to search through observations because changes occur precisely at 0, 1/(n-1), 2/(n-1), ...
Algorithm:
1. Generate U ~ U(0,1). Let P = (n-1)U and I = floor(P) + 1.
2. Return X = X_(I) + (P - I + 1)(X_(I+1) - X_(I))
IE 519 120
Empirical Distribution Function
IE 519 121
Discrete Distributions
Can always use the inverse-transform method
May not be most efficient
Algorithm:
1. Generate U ~ U(0,1).
2. Return the nonnegative integer X = I that satisfies
sum_{j=0}^{I-1} p(j) < U <= sum_{j=0}^{I} p(j)
IE 519 122
Alias Method
Another general method is the alias method, which works for every finite range discrete distribution
IE 519 123
Alias Method: Example
p(x):
p(0) = 0.1,  p(1) = 0.4,  p(2) = 0.2,  p(3) = 0.3
Aliases (e.g.): L_0 = 1, L_2 = 3
Algorithm:
1. Generate I ~ DU(0, n-1) and U ~ U(0,1)
2. If U <= F_I, return X = I. Otherwise, return X = L_I
(F_I is the cutoff value for box I)
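A sketch of one standard way (Vose's method) to build the cutoff and alias tables, assuming the example mass function p(0)=0.1, p(1)=0.4, p(2)=0.2, p(3)=0.3 reconstructed from the slide. The alias assignments the construction produces can differ from the slide's (alias tables are not unique) while yielding the same distribution.

```python
import random

def build_alias(probs):
    """Vose's alias method: returns (cutoff, alias) tables for the given pmf."""
    n = len(probs)
    cutoff, alias = [0.0] * n, [0] * n
    scaled = [p * n for p in probs]                     # average box mass becomes 1
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        cutoff[s], alias[s] = scaled[s], l              # box s keeps scaled[s], donates rest to l
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    for i in small + large:
        cutoff[i] = 1.0
    return cutoff, alias

def alias_sample(cutoff, alias):
    """O(1) sampling: pick a box uniformly, then keep it or take its alias."""
    i = random.randrange(len(cutoff))                   # I ~ DU(0, n-1)
    return i if random.random() <= cutoff[i] else alias[i]

cutoff, alias = build_alias([0.1, 0.4, 0.2, 0.3])
random.seed(3)
counts = [0] * 4
for _ in range(100_000):
    counts[alias_sample(cutoff, alias)] += 1
```

Setup is O(n), but each sample then costs one uniform integer, one uniform, and one comparison, regardless of n.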
IE 519 124
Bernoulli
Mass function:
p(0) = 1 - p,  p(1) = p,  p(x) = 0 otherwise
Algorithm:
1. Generate U ~ U(0,1)
2. If U <= p, return X = 1. Otherwise, return X = 0.
IE 519 125
Binomial
Mass function:
p(x) = C(t, x) p^x (1 - p)^{t-x}   for x in {0, 1, ..., t}
p(x) = 0 otherwise
Use the fact that if X ~ bin(t, p), then
X = Y_1 + Y_2 + ... + Y_t,  with Y_i ~ Bernoulli(p)
IE 519 126
Geometric
Mass function:
p(x) = p (1 - p)^x   for x in {0, 1, ...}
p(x) = 0 otherwise
Use inverse transform:
1. Generate U ~ U(0,1)
2. Return X = floor( ln(1 - U) / ln(1 - p) )
IE 519 127
Negative Binomial
Mass function
Note that X~negbin(n,p) iff
otherwise0
},...,1,0{)1(1
)(txpp
x
xsxp
xs
)(Geometric~
...,21
pY
YYYX
i
n
IE 519 128
Poisson
Mass function:
p(x) = e^{-lambda} lambda^x / x!   for x in {0, 1, ...}
Algorithm:
1. Let a = e^{-lambda}, b = 1, i = 0
2. Generate U_{i+1} ~ U(0,1) and replace b by b U_{i+1}. If b < a, return X = i. Otherwise, go to step 3.
3. Let i = i + 1 and go back to step 2.
Rather slow. No very good algorithm for the Poisson distribution.
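The multiplication algorithm above in a few lines of Python (function name and the lambda = 3 check are illustrative choices):

```python
import math
import random

def poisson_variate(lam):
    """Multiply uniforms until the product drops below e^{-lam}; X = count of full steps."""
    a = math.exp(-lam)
    b, i = 1.0, 0
    while True:
        b *= random.random()
        if b < a:
            return i
        i += 1

random.seed(7)
xs = [poisson_variate(3.0) for _ in range(50_000)]
# sample mean should be near lambda = 3
```

The expected number of uniforms per variate is lambda + 1, which is why this is "rather slow" for large lambda.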
IE 519 129
Poisson Process
A stochastic process {N(t), t >= 0} that counts the number of events up until time t is a Poisson process if:
Events occur one at a time
N(t+s) - N(t) is independent of {N(u), u <= t}
A Poisson process is determined by its rate lambda: E[N(t)] = lambda t
IE 519 130
Generating a Poisson Process
Stationary with rate lambda > 0
Times between events A_i = t_i - t_{i-1} are IID exponential
Algorithm:
1. Generate U ~ U(0,1)
2. Return t_i = t_{i-1} - (1/lambda) ln U
IE 519 131
Nonstationary Case
Can we simply generalize, using the rate function lambda(t) between t_{i-1} and t_i?
IE 519 132
Thinning Algorithm
1. Set t = t_{i-1}
2. Generate U_1, U_2 IID U(0,1)
3. Replace t by t - (1/lambda*) ln U_1, where lambda* = max_t lambda(t)
4. If U_2 <= lambda(t)/lambda*, return t_i = t. Otherwise, go back to step 2.
IE 519 133
Summary
For any stochastic simulation it is necessary to generate random variates from either a theoretical distribution or an empirical distribution
General methods we covered:
Inverse transform
Acceptance-rejection
Alias method
IE 519 134
Output Analysis
IE 519 135
Output Analysis
Analyzing the output of the simulation is a part that is often done incorrectly (by analysts and commercial software)
We consider several issues:
Obtaining statistical estimates of performance measures of interest
Improving precision of those estimates through variance reduction
Comparing estimates from different models
Finding the optimal performance value
IE 519 136
Simulation Output
The output from a single simulation run is a stochastic process Y1, Y2, …
Observations (n replications of length m):
y_11, y_12, ..., y_1m
y_21, y_22, ..., y_2m
...
y_n1, y_n2, ..., y_nm
IE 519 137
Parameter Estimation
Want to estimate some parameter theta based on these observations
Unbiased? E[theta_hat] = theta?
Consistent? lim_{t -> inf} theta_hat_t = theta?
IE 519 138
Transient vs Steady State
IE 519 139
Initial Values: M/M/1 Queue
IE 519 140
Types of Simulation
Terminating simulation
Non-terminating simulation
Steady-state parameters
Steady-state cycle parameters
Other parameters
IE 519 141
Terminating Simulation
Examples:
A retail establishment that is open for fixed hours per day
A contract to produce x number of a high-cost product
Launching of a spacecraft
Never reaches steady state
Initial conditions are included
IE 519 142
Non-Terminating Simulation
Any system in continuous operation (could have a 'break')
Interested in steady-state parameters
Initial conditions should be discarded
Sometimes there is no steady state because the system is cyclic
Then we are interested in steady-state cycle parameters
IE 519 143
Terminating Simulation
Let X_j be a random variable defined on the jth replication
Want to estimate the mean mu = E[X_j]
Fixed-sample-size procedure:
X_bar(n) +/- t_{n-1, 1-alpha/2} sqrt( S^2(n) / n )
The CI assumes the X_j's are normally distributed
IE 519 144
Quality of Confidence Interval
The coverage depends on both the underlying distribution and the number of replications
(Plots: number of failures; average delay over 25 customers; average delay over 500 customers)
IE 519 145
Specifying the Precision
Absolute error: beta = |X_bar - mu|
To obtain this:
1 - alpha ≈ P( |X_bar - mu| <= half-length )
IE 519 146
Replications Needed
To obtain absolute error of , the number of replications needed is approximately
i
nStnin ia
)(:min
2
2/1,1*
IE 519 147
Relative ErrorAlso interested in the relative errorNow we have
X
)1(
)1(
length half1
XP
XP
XXP
XXP
XXP
XX
XP
IE 519 148
Replications Needed
To obtain relative error of , the number of replications needed is approximately
1)(
)(
:min
2
2/1,1*
nXinS
tnin
i
r
IE 519 149
Sequential Procedure
Define
delta(n, alpha) = t_{n-1, 1-alpha/2} sqrt( S^2(n) / n ),  gamma' = gamma / (1 + gamma)
Algorithm:
0. Make n_0 replications and set n = n_0
1. Compute X_bar(n) and delta(n, alpha) from X_1, X_2, ..., X_n
2. If delta(n, alpha) / |X_bar(n)| <= gamma', use X_bar(n) as the estimate of mu and stop. Otherwise, let n = n + 1, make an additional replication, and go to step 1.
IE 519 150
Other Measures
If we only use averages, the results can often be misleading or wrong
What about the variance?
Alternative/additional measures:
Proportions
Probabilities
Quantiles
IE 519 151
Example
Suppose we are interested in customer delay X. We can estimate:
Average delay E[X]
Proportion of customers with X <= a
Probabilities, e.g., P[X <= a]
The q-quantile x_q
IE 519 152
Estimating Proportions
Define an indicator function
I_i = 1 if X_i <= a, 0 otherwise
Obtain a point estimate of the proportion:
r_hat = (1/n) sum_{i=1}^{n} I_i
IE 519 153
Estimating Probabilities
Want to estimate p = P(X in B)
Have n replications X_1, X_2, ..., X_n
Define S = number of observations that fall in set B
S ~ binomial(n, p)
Unbiased estimate is
p_hat = S / n
IE 519 154
Estimating Quantiles
Let X_(1), X_(2), ..., X_(n) be the order statistics corresponding to n simulation runs
A point estimator is then
x_hat_q = X_(nq)               if nq is an integer
x_hat_q = X_(floor(nq) + 1)    otherwise
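The order-statistic estimator above, sketched in Python (function name and data are illustrative; note that n*q may not be exactly integral in floating point, so stick to q values that are exact binary fractions or rationalize first):

```python
import math

def quantile_estimate(xs, q):
    """Point estimator: X_(nq) if nq is an integer, else X_(floor(nq)+1)."""
    ys = sorted(xs)                       # order statistics X_(1) <= ... <= X_(n)
    n = len(ys)
    nq = n * q
    idx = int(nq) if float(nq).is_integer() else math.floor(nq) + 1
    return ys[idx - 1]                    # order statistics are 1-indexed

data = [0.8, 0.1, 0.5, 0.9, 0.3, 0.7, 0.2, 0.6, 0.4, 1.0]   # hypothetical outputs
median_est = quantile_estimate(data, 0.5)
```

With n = 10 and q = 0.5, nq = 5 is an integer, so the estimator returns the 5th order statistic.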
IE 519 155
Initial Conditions
In terminating simulation there is no steady state
Hence, the initial conditions are included in the performance measure estimates
How should they be selected?
Use an artificial ‘warm-up’ period just to get reasonable start-up state
Collect data and model the initial conditions explicitly
IE 519 156
Discussion
For terminating simulations we must use replications (cannot increase the length of a simulation run)
Point estimates of performance measures:
An unbiased estimate and an approximate CI are easily constructed for the mean performance
Also obtained point estimates for proportions, probabilities, and quantiles (the mean is not always enough)
It is important to be able to control the precision and determine how many replications are needed
Initial conditions are always included in the estimates for terminating simulations and must be selected carefully
IE 519 157
Steady-State Behavior
Now we’re interested in parameters related to the limit distribution
Problem is that we cannot wait until infinity!
F_i(y) = P(Y_i <= y)
F_i(y) -> F(y) = P(Y <= y) as i -> infinity
IE 519 158
Estimating Mean
Suppose we want to estimate the steady-state mean
nu = lim_{i -> inf} E[Y_i]
Problem:
E[ Y_bar(m) ] != nu for finite m
One solution is to add a warm-up period of length l and get a less biased estimator:
Y_bar(m, l) = [1/(m - l)] sum_{i=l+1}^{m} Y_i
IE 519 159
Approaches for Estimating
There are numerous approaches for estimating the mean:
Replication/deletion (start with this)
One long replication:
Batch means
Autoregressive method
Spectrum analysis
Regenerative method
Standardized time series method
IE 519 160
Choosing the Warm-Up Period
In the replication/deletion method the main issue is to choose the warm-up period
Would like E[ Y_bar(m, l) ] ≈ nu for the chosen l and m
Tradeoff:
If l is too small then we still have a large bias
If l is too large then the estimate will have a large variance
Very difficult to determine from a single replication
IE 519 161
Welch’s Procedure
Make n replications of length m and average across replications:
Y_11, Y_12, Y_13, ..., Y_1m
Y_21, Y_22, Y_23, ..., Y_2m
...
Y_n1, Y_n2, Y_n3, ..., Y_nm
Y_bar_1, Y_bar_2, Y_bar_3, ..., Y_bar_m   (column averages Y_bar_i = (1/n) sum_{j=1}^{n} Y_ji)
IE 519 162
Welch’s Procedure
Key is to smooth out high-frequency oscillations in the averages
Then plot the moving average
Y_bar_i(w) = [1/(2w+1)] sum_{s=-w}^{w} Y_bar_{i+s}           for i = w+1, ..., m-w
Y_bar_i(w) = [1/(2i-1)] sum_{s=-(i-1)}^{i-1} Y_bar_{i+s}     for i = 1, ..., w
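The moving-average smoothing is easy to code. A sketch; the column averages `ybar` are hypothetical values from a process settling toward 3.

```python
def welch_moving_average(ybar, w):
    """Welch's smoothed averages Y_bar_i(w) of per-time-step replication means."""
    m = len(ybar)
    out = []
    for i in range(1, m - w + 1):              # 1-indexed i = 1, ..., m - w
        if i <= w:
            window = ybar[0 : 2 * i - 1]       # s = -(i-1), ..., i-1: shrunken window
            out.append(sum(window) / (2 * i - 1))
        else:
            window = ybar[i - 1 - w : i + w]   # s = -w, ..., w: full window
            out.append(sum(window) / (2 * w + 1))
    return out

ybar = [5.0, 4.0, 3.5, 3.2, 3.1, 3.0, 3.0, 3.0]   # hypothetical column averages
smoothed = welch_moving_average(ybar, w=2)
```

One would plot `smoothed` against i and pick the warm-up period l where the curve flattens out, erring on the large side.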
IE 519 163
Example: Hourly Throughput
When is it warmed up?
IE 519 164
Welch’s Procedure
Much smoother and easier to tell where it has converged
Want to err on the side of selecting it too large
IE 519 165
Replication/Deletion
Similar to terminating simulation
Need n pilot runs to determine the warm-up period l, and then throw away the first l observations from the new n' runs:
X_j = [1/(m - l)] sum_{i=l+1}^{m} Y_ji
X_bar(n') +/- t_{n'-1, 1-alpha/2} sqrt( S^2(n') / n' )
IE 519 166
Discussion
Replication/deletion approach:
Easiest to understand and implement
Has good statistical performance if done correctly
Applies to all output parameters and can be used to estimate several different parameters for the same model
Can be used to compare different systems
Nonetheless, some other methods have clear advantages
IE 519 167
Covariance Stationary Process
Classic statistical inference assumes independent and identically distributed (IID) observations
Even after eliminating the initial transient this is not true for most simulations because most simulation output is autocorrelated
However, it is reasonable to assume that after the initial transient the output will be covariance stationary, that is,
Cov(Y_i, Y_{i+k}) is independent of i
IE 519 168
Notation (assume covariance stationary):
Simulation output: Y_1, Y_2, ..., Y_n
Mean: nu = E[Y_i]
Variance: sigma^2 = Var(Y_i)
Covariance: gamma_k = Cov(Y_i, Y_{i+k})   (so gamma_0 = sigma^2)
Correlation: rho_k = gamma_k / gamma_0
IE 519 169
Implications of Autocorrelation
If the process is covariance stationary, the average is still an unbiased estimator, that is,
E[ Y_bar ] = E[ (1/n) sum_{j=1}^{n} Y_j ] = nu
However, the same cannot be said about the standard estimate of the variance
S^2(n) = [1/(n-1)] sum_{j=1}^{n} (Y_j - Y_bar)^2
In fact,
E[ S^2(n) ] = sigma^2 [ 1 - 2 sum_{k=1}^{n-1} (1 - k/n) rho_k / (n - 1) ]
IE 519 170
Expression for Variance
Assuming a covariance stationary process it can be shown that:
Var( Y_bar ) = (sigma^2 / n) [ 1 + 2 sum_{k=1}^{n-1} (1 - k/n) rho_k ]
We hope the estimate of the variance is unbiased, that is,
E[ S^2(n) / n ] = Var( Y_bar )
By combining the top equation above with the last equation on the previous slide, we can check this for an independent and an autocorrelated output process
IE 519 171
Independent Process
If the output process is independent then
gamma_k = Cov(Y_i, Y_{i+k}) = 0 for k >= 1, so rho_k = 0
Var( Y_bar ) = sigma^2 / n
E[ S^2(n) ] = sigma^2, so E[ S^2(n)/n ] = Var( Y_bar )
IE 519 172
Autocorrelation in Process
If the process is positively correlated (the usual case), rho_k > 0, then
Var( Y_bar ) = (sigma^2/n) [ 1 + 2 sum_{k=1}^{n-1} (1 - k/n) rho_k ] > sigma^2 / n
E[ S^2(n)/n ] < Var( Y_bar )
Hence, the estimator has less precision than predicted and the CI is misleading
IE 519 173
Batch-Means Estimators
Batch-means estimators are the most popular alternative to replication/deletion
The idea here is to do one very long simulation run and estimate the parameters from this run
Advantage is that the simulation only has to go through the initial transient once
Assuming covariance-stationary output:
No problem estimating the mean
Estimating the variance is difficult because the data is likely to be autocorrelated, that is, Y_i and Y_{i+1} are correlated
IE 519 174
Classical Approach
Partition the run of n observations into k equal-size contiguous batches (macro replications), each composed of m = n/k observations (micro replications). Point estimator:

ν̂ = (1/k) Σ_{j=1}^{k} ν̂_j

where ν̂_j is the sample mean of the jth batch.
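The classical batch-means computation can be sketched in a few lines (a minimal illustration; the function name and the AR(1) test process are ours, not from the course):

```python
import numpy as np

def batch_means_ci(y, k=30, z=1.96):
    """Batch-means point estimate and CI half-width from one long run.

    y : post-warm-up output from a single run
    k : number of batches (m = n // k observations each)
    z : normal quantile; use a t_{k-1} quantile for small k
    """
    n = (len(y) // k) * k                 # drop leftover observations
    means = np.asarray(y[:n]).reshape(k, -1).mean(axis=1)  # batch means
    return means.mean(), z * means.std(ddof=1) / np.sqrt(k)

# autocorrelated AR(1)-style output with true mean 10
rng = np.random.default_rng(1)
x, y = 0.0, []
for _ in range(100_000):
    x = 0.8 * x + rng.normal()
    y.append(10 + x)
est, half = batch_means_ci(y)
```

Treating the 30 batch means as approximately IID normal gives a CI that accounts for the autocorrelation the raw observations would hide.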
IE 519 175
CI Analysis
Assuming as before that Y_1, Y_2, ... is covariance-stationary with E[Y_i] = ν. If the batch size is large enough, then the batch means ν̂_j will be approximately uncorrelated. Suppose we can also choose the batch size large enough so that they are approximately normal. The batch estimates have the same mean and variance; hence we can treat them as approximately IID normal and get the usual confidence interval.
IE 519 176
Variants of Batch-Means
[Diagram: three batching schemes for Y_1, Y_2, ..., Y_n — non-overlapping batches (Batch 1 = Y_1, ..., Y_m; Batch 2 = Y_{m+1}, ..., Y_{2m}; ...), batches separated by l discarded observations, and overlapping batches that share observations]
IE 519 177
Steady-State Batching
General variance estimator:

V̂ar(ν̂) = S²_B / |B|,   S²_B = (1/(|B| − 1)) Σ_{j∈B} (ν̂_j − ν̂)²

where ν̂_j is the mean of the batch of m observations starting at index j, and the index set B defines the batching scheme:

B = {1, m+1, 2m+1, ..., (k−1)m+1}   (non-overlapping batches)
B = {1, 2, ..., n−m+1}              (overlapping batches)
IE 519 178
Determining the Batch Size
Tradeoff:
 Large batch sizes have the needed asymptotic properties
 Small batch sizes yield more batches
 That is, a choice between bias due to poor asymptotics and variance due to few batches
Rule of thumb (empirical): Little benefit to more than 30 batches Should not have fewer than 10 batches
IE 519 179
Mean Squared Error
The mean squared error (MSE) of an estimator θ̂ of θ is

MSE(θ̂, θ) = E[(θ̂ − θ)²] = Bias²(θ̂, θ) + Var(θ̂)

This is the classic measure of quality. It can be used to select the optimal batch size.
IE 519 180
Optimal Batch Size
The asymptotically MSE-optimal batch size grows like the cube root of the run length:

m* = c · n^{1/3}

where the constant c depends on
 c_b — bias constant
 c_v — variance constant
 γ_1/γ_0 — "center of gravity" of the autocorrelation function
IE 519 181
Regenerative Method
Similar to batch-means, the regenerative method also tries to construct independent replications from a single run. Assume that Y_1, Y_2, ... has a sequence of random points 1 ≤ B_1 < B_2 < ... called regeneration points, such that the process from B_j onwards is independent of the process prior to B_j. The process between two successive regeneration points is called a regeneration cycle.
IE 519 182
Estimating the Mean
Z_j = Σ_{i=B_j}^{B_{j+1}−1} Y_i   (total output over cycle j)
N_j = B_{j+1} − B_j              (length of cycle j)
n' = number of regeneration cycles

ν = E[Z]/E[N],   ν̂(n') = Z̄(n')/N̄(n')
IE 519 183
Analysis
The estimator is not unbiased; however, it is strongly consistent:

ν̂(n') → ν   (w.p. 1)

Let Σ = (σ_ij) be the covariance matrix of U_j = (Z_j, N_j)^T. Let V_j = Z_j − ν N_j. These are IID with mean 0 and variance

σ²_V = σ_11 − 2ν σ_12 + ν² σ_22
IE 519 184
Analysis
From the CLT:

√n' V̄(n') / σ_V → N(0, 1)   in distribution

Have estimates σ̂_11(n'), σ̂_12(n'), σ̂_22(n') (the sample (co)variances of the (Z_j, N_j)), giving

σ̂²_V(n') = σ̂_11(n') − 2 ν̂(n') σ̂_12(n') + ν̂²(n') σ̂_22(n')
IE 519 185
Analysis
It can be shown that σ̂²_V(n') → σ²_V (w.p. 1). Hence

(ν̂(n') − ν) N̄(n') √n' / σ̂_V(n') → N(0, 1)   in distribution

We get a CI:

ν̂(n') ± z_{1−α/2} √(σ̂²_V(n')/n') / N̄(n')
IE 519 186
Non-Independence
Non-overlapping batch-means and regeneration methods try to create independence between batches/cycles. An alternative is to use estimates of the autocorrelation structure to estimate the variance of the sample mean. (Again, estimating the mean is no problem, just the variance.) Spectrum analysis and autoregressive methods attempt to do this.
IE 519 187
Spectral Variance Estimator
Assume the process is covariance stationary:

E[Y_j] = ν,   E[(Y_j − ν)(Y_{j+l} − ν)] = γ(l)

The variance of the sample mean can be expressed as Var(Ȳ(n)) ≈ (1/n) Σ_l γ(l) for large n. The spectral density function of the process is

f(λ) = (1/2π) Σ_{l=−∞}^{∞} γ(l) cos(λl)

Since Σ_l γ(l) = 2π f(0), an estimate of the spectral density function at frequency 0 is an estimate of the variance.
IE 519 188
Spectral Variance Estimator
Using standard results:

V̂ar(Ȳ(n)) = (1/n) Σ_{l=−(m−1)}^{m−1} w_n(l) γ̂(l)

γ̂(l) = (1/n) Σ_{r=1}^{n−l} (Y_r − Ȳ(n))(Y_{r+l} − Ȳ(n))

with weights satisfying w_n(0) = 1 and |w_n(l)| ≤ 1; here m plays the role of a batch size (truncation point).
IE 519 189
Parameters
For the batch size, m → ∞ and m/n → 0 as n → ∞. Examples of weight functions:

w_n(l) = 1 − |l|/m    if |l| ≤ m − 1, 0 otherwise
w_n(l) = 1 − (l/m)²   if |l| ≤ m − 1, 0 otherwise
IE 519 190
Autoregressive Method
Again assume a covariance-stationary output process, and also a pth-order autoregressive model:

Σ_{j=0}^{p} b_j (Y_{i−j} − ν) = ε_i,   b_0 = 1

where {ε_i} are uncorrelated random variables with mean 0 and variance σ²_ε.
IE 519 191
Convergence Result
It can be shown that

lim_{m→∞} m Var(Ȳ(m)) = σ²_ε / (Σ_{j=0}^{p} b_j)²

We can estimate these quantities and get

V̂ar(Ȳ(m)) = σ̂²_ε / (m (Σ_{j=0}^{p} b̂_j)²)

A CI can be constructed using the t-distribution.
IE 519 192
What is the Coverage?

Empirical results for 90% CI for two simulation models
IE 519 193
Discussion
Replication/deletion is certainly the most popular in practice (easy to understand). Batch-means is very effective; there are practical algorithms and still a lot of research. Spectral methods are still a subject of active research but probably not used much in practice (very complicated). Autoregressive methods appear not to be used/investigated much. Regeneration methods are theoretically impeccable but practically useless!
IE 519 194
Comments on Variance Estimates
We have spent considerable time looking at alternative estimates of the variance. Why does it matter? Simulation output is usually (always) autocorrelated, which makes it difficult to estimate the variance, and hence the CI may be incorrect. Most seriously, the precision of the estimate may be less than predicted, and hence inference drawn from the model may not be valid.
IE 519 195
Implications of Autocorrelation
Because simulation output is usually autocorrelated we cannot simply use all of the observations directly to estimate the variance of the sample mean. We need some way of obtaining uncorrelated observations:
 Replication/deletion gets this through independent replications
 Batch-means gets this (almost) through non-overlapping batches
 The regenerative method gets this through independent regeneration cycles
IE 519 196
Sequential Procedures
None of the single-run methods we have discussed can assure any given precision (which we need to make a decision). Several sequential procedures exist that allow us to do this:
 More complicated than for replication/deletion
 May require very long simulation runs
IE 519 197
Good Sequential Procedures
Batch-means and relative error stopping rule
Law and Carson procedure (1979) Automated Simulation Analysis Procedure
(ASAP) and extension ASAP3 (2002, 2005)
Spectral method and relative stopping rule WASSP (2005)
All of these methods obtain much better coverage. However, they are rarely if ever used!
IE 519 198
Estimating Probabilities
We know how to estimate means. How about probabilities p = P[Y ∈ B]? Define the indicator

Z = 1 if Y ∈ B, 0 otherwise

Then

p = P[Y ∈ B] = P[Z = 1] = 1·P[Z = 1] + 0·P[Z = 0] = E[Z]

We therefore already know how to do this!
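Since p = E[Z], any mean-estimation machinery applies directly. A small illustration (the distribution and threshold are our own toy choices):

```python
import math
import random

random.seed(42)
# Y ~ exponential with mean 5; estimate p = P[Y > 10] as the mean of an indicator
n = 100_000
z_sum = sum(1 for _ in range(n) if random.expovariate(1 / 5) > 10)
p_hat = z_sum / n            # sample mean of the indicator Z
p_true = math.exp(-10 / 5)   # exact value for comparison, exp(-2)
```

The same batching or replication machinery used for means then gives a CI for p.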
IE 519 199
Estimating Quantiles
Suppose we want to estimate the q-quantile y_q, that is, P[Y ≤ y_q] = q. This is more complicated. Most estimates are based on order statistics:
 Biased estimates
 Computationally expensive
 Coverage low if the sample size is too low
IE 519 200
Cyclic Parameters
No steady-state distribution. With some cycle definition, let Y_i^C denote the output over the ith cycle. Then even if F_i(y) = P[Y_i ≤ y] has no limit, the cycle process may satisfy

F_i^C(y) = P[Y_i^C ≤ y] → F^C(y) = P[Y^C ≤ y]

All of the techniques we have discussed before for steady-state parameters still apply to this new process.
IE 519 201
Multiple Measures
In practice we are usually interested in multiple measures simultaneously, so we have several CIs:

P[μ_s ∈ I_s] = 1 − α_s,   s = 1, ..., k

How does this affect our overall confidence

P[μ_s ∈ I_s, s = 1, ..., k] ?
IE 519 202
Bonferroni Inequality
No problem if the measures are independent:

P[μ_s ∈ I_s, s = 1, ..., k] = Π_{s=1}^{k} (1 − α_s)

In practice performance measures are very unlikely to be independent. If they are not independent, we can use the Bonferroni inequality:

P[μ_s ∈ I_s, s = 1, ..., k] ≥ 1 − Σ_{s=1}^{k} α_s
IE 519 203
Computational Implications
Say we have 5 performance measures and we want 90% overall confidence. Two alternatives:
 We can construct five 98% CIs, which by Bonferroni gives at least 90% overall confidence. This is computationally expensive
 We can construct five 90% CIs and live with the fact that one or more of them is likely to not cover the true value of the parameter
We will revisit this topic when we talk about multiple comparison procedures
IE 519 204
Output Analysis: Discussion
Terminating simulation
• Replications defined by terminating event
• Can determine precision
• Initial conditions

Non-terminating simulation
Multiple runs
• Replication/deletion
• Issue with bias
• Elimination of initial transient
Single long run
• Batch-means, regenerative etc.
• Autocorrelation problem with estimating the variance
IE 519 205
Resampling Methods
IE 519 206
Sources of Variance
We have learned how to estimate variance and construct CI, predict number of simulation runs needed, etc.Where does the variance come from?
Random number generator (RNG) Generation of random variates Computer only approximates real values Initial transient/stopping rules Inherently biased estimators Modelling error ?
Made worse by long runs!
Made better by long runs!
IE 519 207
Input Modelling
We have discussed input modeling and output analysis separately. Recall the main approaches for input modeling:
Fit a parametric distribution Fit an empirical distribution Use a trace Use beta distribution
In practice fitting a parametric distribution is the most common approach
IE 519 208
Numerical Example

The underlying system is an M/M/1/10 queue. The simulation model has 1 station, a capacity of 10, and empirical distributions for interarrival and service times fitted from 100 observations. We want to estimate the expected time in system E[W]. Typical simulation experiment:
10 replications Very long run of 5000 customers Very long warm-up period of 1000 customers CI constructed using t-distribution
We would expect a very good estimate for the performance of the model
IE 519 209
Effect of Estimating Distribution Parameters
True model
No resampling
Direct resampling
True model assumes that the true interarrival and service distributions are known
No resampling is the traditional approach of empirical distribution and then construct a sample mean based on 10 replications
Direct resampling obtains a new sample of 100 data points for each of the 10 replications
IE 519 210
Why Poor Coverage?
The uncertainty due to replacing the true distribution with an estimate is neglected
This is the case for all commercial simulation software
Remedies Direct resampling Bootstrap resampling Uniformly randomized resampling
IE 519 211
Direct Resampling
For each replication (simulation run) use a new sample to create an empirical distribution function. This requires a lot more data. Alternatively, what data is available can be split among the replications. Can confidence intervals be constructed?
IE 519 212
Bootstrap Resampling
Use the bootstrap to create a 'new' sample for a new empirical distribution function for each replication. Bootstrap: sampling with replacement. No need for additional data, and we may even be able to use less data.
IE 519 213
Bootstrap Resampling Algorithm
For each input quantity q modeled, sample n_q values

v_q^(1), v_q^(2), ..., v_q^(n_q)

from the observed data with replacement. Construct an empirical distribution for each q based on these samples. Do a simulation run based on these input distributions (giving the ith output). Repeat.
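The resampling step above is just sampling with replacement; a minimal sketch (function names and stand-in data are illustrative):

```python
import random

def bootstrap_input(data, rng):
    """Resample len(data) values with replacement and return a sampler
    from the resulting empirical distribution (one per replication)."""
    resample = [rng.choice(data) for _ in data]
    return lambda: rng.choice(resample)

rng = random.Random(7)
observed = [rng.expovariate(1.0) for _ in range(100)]  # stand-in for real data
draw = bootstrap_input(observed, rng)                  # use inside one sim run
values = [draw() for _ in range(1000)]
```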
IE 519 214
Uniformly Randomized
Note that if F is the cdf of X then F(X) is uniform on [0,1]. For the order statistics X_[1] ≤ X_[2] ≤ ... ≤ X_[n]:

F(X_[1]) ≤ F(X_[2]) ≤ ... ≤ F(X_[n])
F(X_[k]) ~ beta(k, n − k + 1)
IE 519 215
Uniform Randomized Bootstrap

For each input quantity q modeled, order the observed data x_q^(1) ≤ x_q^(2) ≤ ... ≤ x_q^(n). Generate a sample of n ordered values u_q^(1) ≤ u_q^(2) ≤ ... ≤ u_q^(n) from a uniform distribution. Set

F̂_q(x_q^(j)) = u_q^(j)

and construct an empirical distribution for each q based on these samples. Do a simulation run based on these input distributions (ith output). Repeat.
IE 519 216
Numerical Results

90% CI for an M/M/1/10 queue with varying traffic intensity ρ ∈ {0.7, 0.9} and number of observations of interarrival and service times in {50, 100, 500}
IE 519 217
Numerical Results

90% CI for an M/U/1/10 queue with varying traffic intensity ρ ∈ {0.7, 0.9} and number of observations of interarrival and service times in {50, 100, 500}
IE 519 218
Discussion
Uncertainty in the input modeling can affect the precision of the output. For a given application you can estimate this effect by selecting 3-5 random subsets of the data and performing the analysis on each. Bootstrap resampling can help fix the problem.
IE 519 219
DiscussionBootstrap resampling is much more general, and provides an answer to the question:
Given a random sample & a statistic T calculated on this sample, what is the distribution of T ?
Assumptions: The empirical distribution converges to the true
distribution as the number of samples increases T is sufficiently ‘smooth’
Problems with extreme point estimates
Other simulation applications: Model validation Ranking-and-selection, etc.
IE 519 220
Comparing Multiple Systems
IE 519 221
Multiple Systems
We know something about how to evaluate the output of a single system. However, simulation is rarely used to simply evaluate one system. Comparison:
IE 519 222
Types of ComparisonsComparison of two systems Comparison of multiple systems
Comparison with a standard All pair-wise comparison Multiple comparison with the best (MCB)
Ranking-and-selection Selecting the best system of k systems Selecting a subset of m systems containing
the best Selecting the m best of k systems
Combinatorial optimization
IE 519 223
Overview of Various Approaches
Comparison of Systems Construct (simultaneous) confidence intervals
Ranking-and-selection Indifference zone
The system that is selected has performance within an indifference zone of the best performance with a fixed probability
This is the most common method Bayesian approach Optimal simulation budget allocation
Optimization Design of experiments/Response surfaces Search procedures
IE 519 224
Example: One or Two Servers?
IE 519 225
Comparing Two Systems
Have IID observations from two output processes:

X_11, X_12, ..., X_{1n_1},   μ_1 = E[X_{1i}]
X_21, X_22, ..., X_{2n_2},   μ_2 = E[X_{2i}]

and want to construct a CI for the expected difference ζ = μ_1 − μ_2.
IE 519 226
A Paired-t CI
If n_1 = n_2 = n we can construct a paired-t CI. Let

Z_i = X_{1i} − X_{2i},   E[Z_i] = ζ

Z̄(n) = (1/n) Σ_{i=1}^{n} Z_i

V̂ar(Z̄(n)) = (1/(n(n − 1))) Σ_{i=1}^{n} (Z_i − Z̄(n))²

CI:  Z̄(n) ± t_{n−1, 1−α/2} √(V̂ar(Z̄(n)))
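The paired-t CI transcribes directly to code (toy data; the caller supplies the t quantile):

```python
import math
from statistics import mean, stdev

def paired_t_ci(x1, x2, t):
    """Paired-t CI for E[X1] - E[X2]; observations are matched pairs."""
    z = [a - b for a, b in zip(x1, x2)]
    half = t * stdev(z) / math.sqrt(len(z))
    return mean(z) - half, mean(z) + half

# system 1 is slower than system 2 by about 1.0 on matched runs
x1 = [5.1, 6.0, 4.8, 5.6, 5.2, 5.9, 5.4, 5.0, 5.7, 5.3]
x2 = [4.2, 5.1, 3.7, 4.5, 4.3, 4.8, 4.6, 4.0, 4.7, 4.1]
lo, hi = paired_t_ci(x1, x2, t=2.262)  # t_{9, 0.975}
```

Because the interval excludes 0, the difference between the systems is significant at the 5% level.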
IE 519 227
Welch CI

Now we do not require equal sample sizes, but assume that the two processes are independent. For i = 1, 2:

X̄_i(n_i) = (1/n_i) Σ_{j=1}^{n_i} X_{ij}
S_i²(n_i) = (1/(n_i − 1)) Σ_{j=1}^{n_i} (X_{ij} − X̄_i(n_i))²

Estimated degrees of freedom:

f̂ = [S_1²(n_1)/n_1 + S_2²(n_2)/n_2]² / { [S_1²(n_1)/n_1]²/(n_1 − 1) + [S_2²(n_2)/n_2]²/(n_2 − 1) }

CI:  X̄_1(n_1) − X̄_2(n_2) ± t_{f̂, 1−α/2} √(S_1²(n_1)/n_1 + S_2²(n_2)/n_2)
IE 519 228
Obtaining IID Observations
Need the observations from each system to be IIDTerminating simulation Each run is IID, so no problem
Non-terminating simulation Replication/deletion approach Non-overlapping batch-means
IE 519 229
Comparing Multiple Systems
Comparison with a standard
All pair-wise comparison
Multiple comparison with the best
(MCB)
IE 519 230
Comparison with a Standard
Now assume that one of the systems is the 'standard,' e.g. an existing system. Construct CIs with overall confidence level 1 − α for μ_2 − μ_1, μ_3 − μ_1, ..., μ_k − μ_1. Using the Bonferroni inequality: construct k − 1 confidence intervals, each at level 1 − α/(k − 1). The individual CIs can be constructed using any method, as Bonferroni will always hold.
IE 519 231
All Pair-Wise Comparison
Now want to construct CIs to compare all systems with all otherQuite difficult because we need each individual CI to have level 1-kk1 to guarantee an overall level of 1-Only feasible for a relatively small number of k
IE 519 232
Multiple Comparison with the Best (MCB)
We are really interested in whatever is the best system, and hence construct CIs to see if it is significantly better than each of the others. Simultaneous CIs for μ_i − max_{l≠i} μ_l, i = 1, ..., k:

[ −(X̄_i(n) − max_{l≠i} X̄_l(n) − h)⁻,  (X̄_i(n) − max_{l≠i} X̄_l(n) + h)⁺ ]

where x⁺ = max{0, x} and x⁻ = max{0, −x}. Here h is a critical parameter. MCB procedures are related to ranking-and-selection.
IE 519 233
Ranking-and-Selection
Have some k systems, and IID observations from each system, with μ_i = E[X_{ij}] and ordered means

μ_{i_1} ≤ μ_{i_2} ≤ ... ≤ μ_{i_k}

Want to select the best system, that is, the system with the largest mean. We call this the correct selection (CS). Can we guarantee CS?
IE 519 234
Indifference Zone Approach
We say that the selected system i* is the correct selection (CS) if μ_{i*} ≥ μ_{i_k} − δ*. Here δ* is called the indifference zone. Our goal is

P(CS) ≥ P*   whenever μ_{i_k} − μ_{i_{k−1}} ≥ δ*

Here P* is a user-selected probability. (Bechhofer's approach)
IE 519 235
Two-Stage Approach: Stage I
Obtain n_0 samples from each system and calculate the first-stage sample means and variances:

X̄_i(n_0) = (1/n_0) Σ_{j=1}^{n_0} X_{ij}
S_i²(n_0) = (1/(n_0 − 1)) Σ_{j=1}^{n_0} (X_{ij} − X̄_i(n_0))²

Calculate the total number of samples needed:

N_i = max{ n_0 + 1, ⌈h² S_i²(n_0)/(δ*)²⌉ }

where h is a tabulated constant.
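Stage I reduces to a small computation (illustrative data; the h value stands in for a Rinott-table constant):

```python
import math
from statistics import variance

def stage1_sizes(first_stage, h, delta):
    """N_i = max(n0 + 1, ceil(h^2 S_i^2(n0) / delta^2)) for each system."""
    sizes = []
    for obs in first_stage:
        n0, s2 = len(obs), variance(obs)   # sample variance S_i^2(n0)
        sizes.append(max(n0 + 1, math.ceil(h * h * s2 / delta ** 2)))
    return sizes

# two systems, n0 = 5 first-stage observations each (made-up numbers)
data = [[10.2, 9.8, 10.5, 9.9, 10.1],
        [11.0, 12.4, 10.2, 11.8, 11.6]]
n_total = stage1_sizes(data, h=2.9, delta=0.5)  # noisier system needs more runs
```

Note how the sample size scales with the estimated variance: the second, noisier system is allocated far more observations.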
IE 519 236
Two-Stage Approach: Stage II
Obtain N_i − n_0 more observations from each system, and calculate the second-stage means X̄_i^(2)(N_i − n_0) and the weighted overall means

X̃_i(N_i) = w_{i1} X̄_i^(1)(n_0) + w_{i2} X̄_i^(2)(N_i − n_0)

with weights

w_{i1} = (n_0/N_i) [ 1 + √( 1 − (N_i/n_0)(1 − (N_i − n_0)(δ*)²/(h² S_i²(n_0))) ) ],   w_{i2} = 1 − w_{i1}

Select the system with the largest X̃_i(N_i).
IE 519 237
Comments: Assumptions
As usual: normal assumptionDo not need equal or known variances (many statistical selection procedures do)Two-stage approach requires an estimate of the variance (remember controlling the precision)The above approach assumes the least favorable configuration
IE 519 238
Subset Selection
In most applications, many of the systems are clearly inferior and can be eliminated quite easily. Subset selection: find a subset I ⊆ {1, 2, ..., k} such that

P[best system ∈ I] ≥ P*

Gupta's approach (common known variance σ²):

I = { l : X̄_l(n) ≥ max_i X̄_i(n) − h σ √(2/n) }
IE 519 239
Proof
P[i_k ∈ I] = P[ X̄_{i_k}(n) ≥ max_i X̄_i(n) − h σ √(2/n) ]
= P[ X̄_{i_k}(n) − X̄_{i_i}(n) ≥ −h σ √(2/n), i = 1, ..., k−1 ]
≥ P[ (X̄_{i_k}(n) − X̄_{i_i}(n) − (μ_{i_k} − μ_{i_i})) / (σ √(2/n)) ≥ −h, i = 1, ..., k−1 ]   (since μ_{i_k} − μ_{i_i} ≥ 0)
= P[ Z_i ≤ h, i = 1, ..., k−1 ] ≥ P*

where the Z_i are approximately standard normal and h is chosen so that the last inequality holds.
IE 519 240
Two-Stage Bonferroni: Stage I
Specify δ*, P*, and n_0; let

t = t_{1 − (1 − P*)/(k − 1), n_0 − 1}

Make n_0 replications and calculate the sample variance of each pairwise difference:

S_ij² = (1/(n_0 − 1)) Σ_{l=1}^{n_0} ( X_{il} − X_{jl} − (X̄_i − X̄_j) )²

Calculate the second-stage sample size:

N = max{ n_0, max_{i≠j} ⌈ t² S_ij² / (δ*)² ⌉ }
IE 519 241
Two-Stage Bonferroni: Stage II
Obtain the additional sample, calculate the overall sample means and select the best system, with the following CI:
μ_i − max_{j≠i} μ_j ∈ [ min{0, X̄_i − max_{j≠i} X̄_j − δ*},  max{0, X̄_i − max_{j≠i} X̄_j + δ*} ]
IE 519 242
Combined Procedure

Initialization: Calculate first-stage means X̄_i(n_0) and variances

S_i² = (1/(n_0 − 1)) Σ_{j=1}^{n_0} (X_{ij} − X̄_i(n_0))²

Subset selection: Calculate W_il = t √((S_i² + S_l²)/n_0) and

I = { i : X̄_i(n_0) ≥ X̄_l(n_0) − W_il for all l ≠ i }

If |I| = 1, stop. Otherwise, calculate the second-stage sample sizes

N_i = max{ n_0, ⌈(h S_i/δ*)²⌉ }

Obtain N_i − n_0 more samples from each system i ∈ I. Compute the overall sample means and select the best system.
IE 519 243
Sequential Procedure

Set I = {1, 2, ..., k} and

h² = 2η(n_0 − 1),   η = ½ [ (2(1 − P*)/(k − 1))^{−2/(n_0 − 1)} − 1 ]

Compute the variances of the pairwise differences:

S_il² = (1/(n_0 − 1)) Σ_{j=1}^{n_0} ( X_{ij} − X_{lj} − (X̄_i(n_0) − X̄_l(n_0)) )²

Screen: with r observations so far,

I_new = { i ∈ I_old : X̄_i(r) ≥ X̄_l(r) − W_il(r) for all l ∈ I_old, l ≠ i }
W_il(r) = max{ 0, (δ*/(2r)) ( h² S_il²/(δ*)² − r ) }

If |I| = 1, stop. Otherwise, take one more observation from each system in I and go back to screening.
IE 519 244
Where does the h come from?
Solved numerically from (Rinott):

P* = ∫_0^∞ [ ∫_0^∞ Φ( h / √( (n_0 − 1)(1/x + 1/y) ) ) f(x) dx ]^{k−1} f(y) dy

where Φ is the standard normal cdf and f is the χ² density with n_0 − 1 degrees of freedom. More commonly, you look h up in a table (some in the book).
IE 519 245
Large Number of Alternatives
Two-stage ranking-and-selection procedures are usually only efficient for up to about 20 alternatives:
 They always protect against the least-favorable configuration (LFC)

μ_i = μ_{i_k} − δ*,   i = 1, 2, ..., k − 1

 For a large number of systems the LFC is very unlikely
Use screening followed by two-stage R&S, or use a sequential procedure. The sequential procedure given earlier can be used for up to 500 alternatives or so.
IE 519 246
Other Approaches
Focused on comparing expected values of performance to identify the best
Alternatives: Select the system most likely to be best Select the largest probability of success Bayesian procedures
IE 519 247
Bayesian Procedures
Posterior and prior: take the action that minimizes the posterior expected loss. R&S: given a fixed computing budget, find the allocation of simulation runs to systems that minimizes some loss function, e.g.

L_{0-1}(i, ω) = 0 if μ_i(ω) = max_j μ_j(ω), 1 otherwise   (0-1 loss)
L_{oc}(i, ω) = max_j μ_j(ω) − μ_i(ω)                      (opportunity cost)
IE 519 248
Discussion: Selecting the Best
Three major lines of research: Indifference-zone procedures
Most popular, easy to understand, use the LFC assumption
Screening or subset selection based on constructing a confidence interval
Can be applied for more alternatives, do not give you a final selection. Can be combined with indifference zone selection
Allocating your simulation budget to minimize a posterior loss function
More efficient use of simulation effort, but does not give you the same guarantee as indifference-zone methods
IE 519 249
Simulation Optimization
IE 519 250
Larger Problems

Even with the best methods, R&S can only be extended to perhaps 500 alternatives. We are often faced with more alternatives when we can set certain parameters of the problem. We need simulation optimization.
IE 519 251
What is Simulation Optimization?
Optimization where the objective
function is evaluated using
simulation
Complex systems
Often large scale systems
No analytical expression available
IE 519 252
Problem Setting
Components of any optimization problem:
 Decision variables θ
 Objective function f : R^n → R
 Constraints θ ∈ Θ ⊆ R^n
IE 519 253
Simulation Evaluation
No closed-form expression for the function f : R^n → R. It is estimated using the output X(θ) of a stochastic discrete-event simulation. Typically, we may have

f(θ) = E[X(θ)]
IE 519 254
Types of Techniques

 Continuous decision variables → gradient-based methods
 Discrete decision variables, small feasible region Θ → ranking & selection
 Discrete decision variables, large feasible region Θ → random search

Note: These are direct optimization methods. Metamodels approximate the objective function and then optimize (later).
IE 519 255
Continuous Decision Variables
Most methods are gradient based:

θ^(k+1) = θ^(k) − a_k ∇̂f(θ^(k))

Issues:
 All the same issues as in non-linear programming
 How to estimate the gradient ∇̂f(θ^(k))
IE 519 256
Stochastic Approximation
Fundamental work by Robbins and Monro (1951) and Kiefer and Wolfowitz (1952). Asymptotic convergence can be assured when the step sizes satisfy

lim_{k→∞} a_k = 0,   Σ_{k=1}^{∞} a_k = ∞

Generally slow convergence.
IE 519 257
Estimating the Gradient
Challenge to estimate the gradient

∇̂f = ( ∂̂f/∂θ_1, ∂̂f/∂θ_2, ..., ∂̂f/∂θ_n )

Finite differences are simple:

∂̂f/∂θ_i = [ X(θ + c_i e_i) − X(θ) ] / c_i

(could also be two-sided)
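A one-sided finite-difference estimate is easy to code, and it also shows why such estimates are noisy — the simulation noise is divided by c_i (toy objective; all names are ours):

```python
import random

def fd_gradient(sim, theta, c, rng):
    """One-sided finite-difference gradient estimate from noisy evaluations.
    Note: the noise in X is amplified by the factor 1/c."""
    base = sim(theta, rng)
    grad = []
    for i in range(len(theta)):
        bumped = list(theta)
        bumped[i] += c
        grad.append((sim(bumped, rng) - base) / c)
    return grad

rng = random.Random(2)
# noisy evaluation of f(theta) = (t0 - 1)^2 + (t1 + 2)^2; true gradient at 0 is (-2, 4)
sim = lambda th, r: (th[0] - 1) ** 2 + (th[1] + 2) ** 2 + r.gauss(0, 0.001)
g = fd_gradient(sim, [0.0, 0.0], 0.01, rng)
```

Shrinking c reduces bias but inflates the variance of the estimate, which is the trade-off the slide alludes to.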
IE 519 258
Improving Gradient Estimation
Finite differences require two simulation runs for each estimate and may be numerically unstable. Better: estimate the gradient during the same simulation run as X(θ):
 Perturbation analysis
 Likelihood ratio or score function method
IE 519 259
Other Methods
Stochastic approximation variants
have received most attention by
researchers
Other methods for continuous domains
include Sample path methods
Response surface methods (later)
IE 519 260
Discrete Decision Variables
Two types of feasible regions:Feasible region small (have seen this) Trivial for deterministic case but must
still account for the simulation noise
Feasible region large E.g., the stochastic counterparts of
combinatorial optimization problems
IE 519 261
Statistical Selection
Selecting between a few alternatives θ_1, θ_2, ..., θ_m. Can evaluate every point and compare, but must still account for simulation noise. We now know several methods:
 Subset selection
 Indifference zone ranking & selection
 Multiple comparison procedures (MCP)
 Decision theoretic methods
IE 519 262
Large Feasible Region
When the feasible region is large it is impossible to enumerate and evaluate each alternativeUse random search methods Academic research focused on
methods for which asymptotic convergence is assured
In practice, use of metaheuristics
IE 519 263
Random Search (generic)
Step 0: Select an initial solution θ^(0) and simulate its performance X(θ^(0)). Set k = 0
Step 1: Select a candidate solution θ^(c) from the neighborhood N(θ^(k)) of the current solution and simulate its performance X(θ^(c))
Step 2: If the candidate satisfies the acceptance criterion, let θ^(k+1) = θ^(c); otherwise let θ^(k+1) = θ^(k)
Step 3: If the stopping criterion is satisfied, terminate the search; otherwise let k = k + 1 and go to Step 1
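Steps 0-3 above map directly to code (minimization over the integers; the noisy objective and neighborhood are our own toy choices):

```python
import random

def random_search(x0, noisy_f, neighbors, iters, rng):
    """Generic random search: accept a candidate if its simulated
    performance is better than the incumbent's (Steps 0-3 above)."""
    x, fx = x0, noisy_f(x0)                  # Step 0
    for _ in range(iters):
        c = rng.choice(neighbors(x))         # Step 1: candidate from N(x)
        fc = noisy_f(c)
        if fc < fx:                          # Step 2: acceptance criterion
            x, fx = c, fc
    return x                                 # Step 3: fixed iteration budget

rng = random.Random(0)
noisy_f = lambda x: (x - 7) ** 2 + rng.gauss(0, 0.5)  # simulated objective
neighbors = lambda x: [x - 1, x + 1]                  # integer neighborhood
best = random_search(0, noisy_f, neighbors, 300, rng)
```

Swapping in a different acceptance criterion (e.g., the simulated-annealing rule below on the slides) changes the variant without touching the framework.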
IE 519 264
Random Search Variants
Specify a neighborhood structure
Specify a procedure for selecting candidates
Specify an acceptance criterion
Specify a stop criterion
IE 519 265
Metaheuristics
Random search methods that have been found effective for combinatorial optimizationFor simulation optimization
Simulated annealing Tabu search Genetic algorithms Nested partitions method
IE 519 266
Simulated Annealing
Falls within the random search framework, with a novel acceptance criterion (for minimization):

P(Accept θ^(c)) = 1 if X(θ^(c)) ≤ X(θ^(k)),
                  exp( −(X(θ^(c)) − X(θ^(k)))/T_k ) otherwise

The key parameter is T_k, which is called the temperature.
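The acceptance rule can be checked empirically — a worse move with Δ = 1 at temperature T = 1 should be accepted about exp(−1) ≈ 37% of the time (a minimal sketch):

```python
import math
import random

def sa_accept(x_cand, x_cur, temp, rng):
    """Simulated-annealing acceptance for minimization: always accept an
    improvement, accept a worse move with probability exp(-delta / T_k)."""
    if x_cand <= x_cur:
        return True
    return rng.random() < math.exp(-(x_cand - x_cur) / temp)

rng = random.Random(3)
trials = 10_000
# candidate is worse by delta = 1.0 at temperature T = 1.0
rate = sum(sa_accept(2.0, 1.0, 1.0, rng) for _ in range(trials)) / trials
```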
IE 519 267
Temperature Parameter
Usually the temperature is decreased as the search evolves. If it decreases sufficiently slowly then asymptotic convergence is assured. For simulation optimization there are indications that a constant temperature works as well or better.
IE 519 268
Tabu Search
Can be fit into the random search framework
A unique feature is the restriction of the neighborhood: solutions requiring the reversal of recent moves are not allowed in the neighborhood
Maintain a tabu list of moves
Other features include long term memory that restart with a different tabu list at good solutions
Has been applied successfully to simulation optimization
IE 519 269
Genetic Algorithms
Works with sets of solutions (populations) rather than single solutionsOperates on the population simultaneously:
Survival Cross-over Mutation
Novel construction of a neighborhoodHas been used successfully for simulation optimization
IE 519 270
Nested Partitions Method
Originally designed for simulation optimizationUses Partitioning Random sampling Local search improvements
Has asymptotic convergence
IE 519 271
NP Method
[Diagram: in the kth iteration the most promising region σ(k) is partitioned into j = 2 subregions σ_1(k) and σ_2(k), with σ(k) = σ_1(k) ∪ σ_2(k); the surrounding region Θ \ σ(k) forms the superregion σ_3(k)]

 Partition of the feasible region: in each iteration there is a most promising region σ(k)
 Use sampling to determine where to go next
IE 519 272
Sampling
Sources of randomness: Performance of a subset is based on a random
sample of solutions from that subset
Performance of each individual samples estimated using simulation
Difficulty of estimating performance depends on how much variability in the region
Intuitively appealing to have more sampling from regions that have high variance
IE 519 273
Two-Stage Sampling
Use two-stage statistical selection methods to determine the number of samples
Phase I: Obtain initial samples from each region
Calculate estimated mean and variance
Calculate how many additional samples are needed
Phase II: Obtain the remaining samples
Estimate performance of region
IE 519 274
Convergence
Single-stage NP converges asymptotically (useless?). Two-stage NP converges to a solution that is within an indifference zone of the optimum with a given probability:
 Reasonable goal softening
 A statement familiar to simulation users
IE 519 275
Theory versus Practice
Asymptotically converging methods
 Good theoretical properties
 May not converge fast or be easy to use/understand

Practical methods
 Often based on heuristic search
 Do not necessarily account for randomness
 Do not guarantee convergence
IE 519 276
Commercial Software
SimRunner (Promodel) Genetic algorithms
AutoStat (AutoMod)
OPTIMIZ (Simul8) Neural networks
OptQuest (Arena, Crystal Ball, etc) Scatter search, tabu search, neural nets
IE 519 277
Optimization in Practice
In academic work we have very specific definitions:
Optimization = find the best solution
 Approximation = find a solution that is within a given distance of optimal performance
 Heuristic = seek the optimal solution, without a guarantee
In practice, people do not always think about the theoretical ideal optimum that is the basis for all of the above
Optimization = improvement
IE 519 278
Combining Theory & Practice
Best of both worlds Robustness and computational power of
heuristics Guarantee performance somehow
Some examples: Combine genetic algorithms with statistical
selection Two-stage NP-Method guarantees
convergence within an indifference zone with a prespecified probability
IE 519 279
Metamodels
IE 519 280
Response Surfaces
Obtaining a precise simulation estimate is computationally expensive. We often want to do this for many different parameter values (and even find some optimal parameter values). An alternative is to construct a response surface of the output as a function of these input parameters. This response surface is a model of the simulation model, that is, a metamodel.
IE 519 281
Metamodels
Simulation can be (simply) represented as

y = g(θ, ω)

For a single output and additive randomness, we can write this as

y = g(θ) + ε

The metamodel f̃ models g, and we also model ε.
IE 519 282
ExampleInstead of simulating an exact contour – construct a metamodel using a few values
IE 519 283
Regression
Most commonly, regression models have been used for metamodels:

f(x) = Σ_k β_k p_k(x),   e.g. p_1(x) = x_1, p_2(x) = x_2, p_3(x) = x_1 x_2

The issues are determining how many terms to include and estimating the coefficients.
IE 519 284
DOE for RS Models
The coefficients are given by least squares:

β̂ = (X^T X)^{−1} X^T y

Key issues:
 Would like to minimize the variance of β̂ — can be done by controlling the random number stream
 Would like to estimate with fewer simulation runs
 Designs to reduce bias
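Fitting β̂ = (X^T X)^{-1} X^T y for a first-order metamodel is a one-line linear solve (synthetic "simulation output" with known coefficients 3, 2, −1):

```python
import numpy as np

rng = np.random.default_rng(4)
theta = rng.uniform(0, 1, size=(40, 2))        # 40 design points, 2 factors
X = np.column_stack([np.ones(40), theta])      # regressors [1, x1, x2]
y = 3 + 2 * theta[:, 0] - theta[:, 1] + rng.normal(0, 0.1, 40)  # noisy output
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # least-squares coefficients
```

With low simulation noise the fitted coefficients land close to the true values, and the cheap fitted surface can then be optimized in place of the simulation.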
IE 519 285
Why Response Surfaces?
Box, Hunter, and Hunter (1978). Statistics for Experimenters, Wiley.
IE 519 286
Compare with True Optimum
Why did this fail?
IE 519 287
Response Surface Optimization
IE 519 288
Second Order Model
IE 519 289
Experimental Process
State your hypothesisPlan an experiment
Design of Experiments (DOE)
Conduct the experiment Run a simulation
Analyze the data Output analysis
Repeat steps as needed
IE 519 290
DOE
Define the goals of the experimentIdentify and classify independent and dependent variables (see example)Choose a probability model
Linear, second order, other (see later)
Choose an experimental design
 Factorial design, fractional factorial, Latin hypercubes, etc.
Validate the properties of the design
IE 519 291
Example of Variables
Dependent
Independent
Throughput
Job release policy, lot size, previous maintenance, speed
Cycle Time Job release policy, lot size, previous maintenance, speed
Operating Cost
Previous maintenance, speed
IE 519 292
Other Metamodels
Many other approaches can be taken to metamodeling
 Splines
 Have been used widely in deterministic simulation responses
 Radial basis functions
 Neural networks
 Kriging
 Have also been used widely in deterministic simulation and are gaining a lot of ground in stochastic simulation
IE 519 293
Variance Reduction
IE 519 294
Variance Reduction

As opposed to physical experiments, in simulation we can control the source of randomness. We may be able to take advantage of this to improve precision in
Output analysis Ranking & selection Experimental designs, etc.
Several methods: Common random numbers
Comparing two or more systems Antithetic variates
Improving precision of a single system Control variates, indirect estimation, conditioning
IE 519 295
Common Random Numbers
The most useful technique: use the same stream of random numbers for each system when comparing them. Motivation: with Z_j = X_{1j} − X_{2j},

ζ̂ = Z̄(n) = (1/n) Σ_{j=1}^{n} Z_j,   Var(Z̄(n)) = Var(Z_j)/n

Var(Z_j) = Var(X_{1j}) + Var(X_{2j}) − 2 Cov(X_{1j}, X_{2j})

If CRN induces positive covariance between X_{1j} and X_{2j}, the variance of the difference is reduced.
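The covariance argument can be seen in a toy comparison of two "systems" driven by the same uniforms (everything here is illustrative):

```python
import math
import random
from statistics import variance

def output(u, rate):
    """Toy system: one performance value driven by a single uniform."""
    return -math.log(1 - u) / rate

rng = random.Random(5)
crn, indep = [], []
for _ in range(5_000):
    u1, u2 = rng.random(), rng.random()
    crn.append(output(u1, 1.0) - output(u1, 1.25))    # same random number
    indep.append(output(u1, 1.0) - output(u2, 1.25))  # independent streams
# Var(Z) = Var(X1) + Var(X2) - 2 Cov(X1, X2): CRN makes Cov(X1, X2) > 0
```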
IE 519 296
Applicability
IE 519 297
Synchronization
We must match up random numbers from the different systemsCareful synchronization of the random number stream Assign one substream to each process Divide that substream up to get
replications
Does not happen automatically
IE 519 298
Example: Failed Sync.
IE 519 299
Example: M/M/1 vs M/M/2
Independent sampling
CRN
IE 519 300
Example: Correlation Induced
IE 519 301
Example: System Difference
IE 519 302
Methods for CRN Use
Many methods assume independence between systems (Ranking & Selection, etc.). Some methods are designed to use CRN, while CRN violates the assumptions of others. Experimental design: we can design the experiments specifically to take advantage of CRN.
IE 519 303
Discussion
Dramatic improvements can be achieved with CRN (but can also be harmful)Recommendations:
Make sure CRN are applicable (pilot study) Use methods that take advantage of CRN Synchronize the RNG Use one-to-one random variate generator
IE 519 304
Antithetic Variates
We now turn to improving precision of a simulation of a single systemBasic idea:
Pairs of runs Large observations offset by small
observations Use the average, which will have smaller
variance
Need to induce negative correlation
IE 519 305
Mathematical Motivation
Recall that for a covariance stationary process Y_1, Y_2, ..., Y_n we have

Var(Ȳ(n)) = (1/n²) [ n γ_0 + 2 Σ_{l=1}^{n−1} (n − l) Cov(Y_i, Y_{i+l}) ]

If the covariance terms are negative the variance will be reduced. It is difficult to get all of them negative.
IE 519 306
Complementary Random Numbers
The simplest approach is to use complementary random numbersSuppose service times are exponentially distributed with mean = 5
U X 1-U X X0.37 4.98 0.63 2.30 3.640.55 3.02 0.45 3.96 3.490.98 0.09 0.02 20.17 10.130.24 7.19 0.76 1.36 4.270.71 1.70 0.29 6.22 3.96Avg. 3.40 6.80 5.10
S.Dev. 2.78 7.70 2.83
IE 519 307
Example (cont.)

U      X      1−U    X'     (X+X')/2
0.07  13.39   0.93   0.36    6.87
0.35   5.26   0.65   2.15    3.70
0.21   7.86   0.79   1.16    4.51
0.57   2.81   0.43   4.23    3.52
0.66   2.08   0.34   5.39    3.74
Avg.   6.28          2.66    4.47
S.Dev. 4.58          2.11    1.40

Does this prove that antithetic variates work for this example?
IE 519 308
What You Need

As for CRN, we need a monotone relationship between the (many) unit uniform random numbers and the (single) output that we are interested in. (When there are multiple outputs, the monotone relationship needs to hold for each output.) As before:
Synchronization
Inverse-transform generation
IE 519 309
Formulation
X_j^{(1)} = F^{-1}(U_j), \qquad X_j^{(2)} = F^{-1}(1 - U_j)

Y_j = \frac{X_j^{(1)} + X_j^{(2)}}{2}

\mathrm{Var}[Y_j] = \frac{1}{2}\left(\mathrm{Var}\big[X_j^{(1)}\big] + \mathrm{Cov}\big[X_j^{(1)}, X_j^{(2)}\big]\right)

A negative covariance between the two halves of each pair makes Y_j less variable than the average of two independent replications.
IE 519 310
Example: M/M/1 Queue
Independent sampling
Antithetic sampling
IE 519 311
Complementary Processes

Imagine a queueing simulation with arrivals and services. Large interarrival times tend to improve performance measures, while large service times tend to degrade them. Idea: use the random numbers used for generating interarrival times in the first run of a pair to generate service times in the second run, and vice versa. This could be extended to any situation where you can argue a similar complementary structure.
IE 519 312
Combining CRN with AV
Both CRN and AV use very similar ideas, so why not combine them?
System 1: Run 1.1 and Run 1.2 (antithetic pair)
System 2: Run 2.1 and Run 2.2 (antithetic pair)
Even if we have all the right within-system and common-run correlations:
Run 1.1 and Run 2.2 are negatively correlated
Run 1.2 and Run 2.1 are negatively correlated
These cross correlations work against the CRN effect, so the overall performance may be worse.
IE 519 313
Discussion
The basic idea is to induce negative correlation to reduce variance. Success is model dependent, so you must show that it works:
Based on model structure
Pilot experiments
Since we need a monotone relationship between the random numbers and the output: synchronization and inverse-transform generation.
IE 519 314
Control Variates

We are again interested in improving the precision of some output Y. Let X be the control variate (any correlated random variable with known mean E[X]):

Y_C = Y - a\left(X - E[X]\right)

where Y is the observed value of the output, X - E[X] is the deviation from the known mean, and

a > 0 \text{ if } X \text{ and } Y \text{ are positively correlated}
a < 0 \text{ if } X \text{ and } Y \text{ are negatively correlated}
IE 519 315
Estimator Properties

The controlled estimator is unbiased:

E[Y_C] = E\left[Y - a(X - E[X])\right] = E[Y] - a\left(E[X] - E[X]\right) = E[Y]

The variance is

\mathrm{Var}[Y_C] = \mathrm{Var}\left[Y - a(X - E[X])\right] = \mathrm{Var}[Y] + a^2\,\mathrm{Var}[X] - 2a\,\mathrm{Cov}[X, Y]

So \mathrm{Var}[Y_C] < \mathrm{Var}[Y] if and only if 2a\,\mathrm{Cov}[X, Y] > a^2\,\mathrm{Var}[X].
IE 519 316
Optimal a

Minimize \mathrm{Var}[Y_C] over a:

0 = \frac{\partial}{\partial a}\mathrm{Var}[Y_C] = \frac{\partial}{\partial a}\left(\mathrm{Var}[Y] + a^2\,\mathrm{Var}[X] - 2a\,\mathrm{Cov}[X, Y]\right) = 2a\,\mathrm{Var}[X] - 2\,\mathrm{Cov}[X, Y]

a^* = \frac{\mathrm{Cov}[X, Y]}{\mathrm{Var}[X]}

The second derivative is 2\,\mathrm{Var}[X] > 0, so a^* is a minimum.
IE 519 317
Optimal Variance

With the optimal value a^*:

\mathrm{Var}[Y_C^*] = \mathrm{Var}[Y] + (a^*)^2\,\mathrm{Var}[X] - 2a^*\,\mathrm{Cov}[X, Y]
= \mathrm{Var}[Y] + \frac{\mathrm{Cov}[X, Y]^2}{\mathrm{Var}[X]^2}\,\mathrm{Var}[X] - 2\,\frac{\mathrm{Cov}[X, Y]}{\mathrm{Var}[X]}\,\mathrm{Cov}[X, Y]
= \mathrm{Var}[Y] - \frac{\mathrm{Cov}[X, Y]^2}{\mathrm{Var}[X]}
= \left(1 - \rho_{XY}^2\right)\mathrm{Var}[Y]
IE 519 318
Observations

By using the optimal value a^*:

\mathrm{Var}[Y_C^*] = \left(1 - \rho_{XY}^2\right)\mathrm{Var}[Y] \le \mathrm{Var}[Y]

The controlled estimator is never more variable than the uncontrolled estimator
If there is any correlation, the controlled estimator is more precise
Perfect correlation means a perfect estimate
Where's the catch?
IE 519 319
Estimating a*

We never know Cov[X, Y], and hence not a*. We need to estimate it:

\hat{C}_{XY}(n) = \frac{1}{n-1}\sum_{j=1}^{n}\left(X_j - \bar{X}(n)\right)\left(Y_j - \bar{Y}(n)\right)

\hat{a}^*(n) = \frac{\hat{C}_{XY}(n)}{S_X^2(n)}

\bar{Y}_C(n) = \bar{Y}(n) - \hat{a}^*(n)\left(\bar{X}(n) - E[X]\right)

This will in general be a biased estimator; jackknifing can be used to reduce the bias.
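These formulas can be tried out on a textbook toy problem. The sketch below (my own illustration, not the course's queueing example) estimates E[e^U] for U uniform on (0, 1), using X = U as an internal control variate with known mean E[X] = 0.5:

```python
import math
import random

# Estimate E[exp(U)] (true value e - 1) with a control variate.
rng = random.Random(42)
n = 10_000
ys, xs = [], []
for _ in range(n):
    u = rng.random()
    ys.append(math.exp(u))  # output Y
    xs.append(u)            # control X, with known E[X] = 0.5

ybar = sum(ys) / n
xbar = sum(xs) / n

# Estimate a* = Cov(X, Y) / Var(X) from the same data (slightly biased).
cov = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)
varx = sum((x - xbar) ** 2 for x in xs) / (n - 1)
a_star = cov / varx

# Controlled estimator: Ybar_C = Ybar - a*(Xbar - E[X]).
y_c = ybar - a_star * (xbar - 0.5)
print(abs(y_c - (math.e - 1)) < 0.01)  # expect True
```

Because U and e^U are almost perfectly correlated, the controlled estimator lands very close to the true value e - 1.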
IE 519 320
Example: M/M/1 Queue
We want to estimate the expected customer delay in the system. Possible control variates:
Service times (positive correlation)
Interarrival times (negative correlation)
IE 519 321
Example: Service Time CVs
Rep Y X
1 13.84 0.92
2 3.18 0.95
3 2.26 0.88
4 2.76 0.89
5 4.33 0.93
6 1.35 0.81
7 1.82 0.84
8 3.01 0.92
9 1.68 0.85
10 3.60 0.88
From the data, with the mean service time E[X] = 0.9 known analytically:

\bar{Y}(10) = 3.78, \qquad \bar{X}(10) = 0.89

S_X^2(10) = 0.002, \qquad \hat{C}_{XY}(10) = 0.07

\hat{a}^*(10) = \frac{\hat{C}_{XY}(10)}{S_X^2(10)} = 35.00

\bar{Y}_C(10) = \bar{Y}(10) - \hat{a}^*(10)\left(\bar{X}(10) - E[X]\right) = 3.78 - 35.00(-0.01) = 4.13
IE 519 322
Multiple Control Variates
Perhaps we have two correlated random variables (e.g., both service times and interarrival times). A naive approach combines them with a single weight:

Y_C = Y - a\left(X - E[X]\right), \qquad X = X^{(1)} + X^{(2)}

i.e., the same weight a is applied to both:

Y_C = Y - a\left(X^{(1)} - E[X^{(1)}]\right) - a\left(X^{(2)} - E[X^{(2)}]\right)

Problems?
IE 519 323
Multiple Control Variates
To take best advantage of each control variate we need different weights:

Y_C = Y - a_1\left(X^{(1)} - E[X^{(1)}]\right) - a_2\left(X^{(2)} - E[X^{(2)}]\right)

Find the partial derivatives with respect to both a_1 and a_2 and solve for the optimal values as before.
IE 519 324
In General

With m control variates:

Y_C = Y - \sum_{i=1}^{m} a_i\left(X^{(i)} - E[X^{(i)}]\right)

\mathrm{Var}[Y_C] = \mathrm{Var}[Y] - 2\sum_{i=1}^{m} a_i\,\mathrm{Cov}\left[Y, X^{(i)}\right] + \sum_{i=1}^{m}\sum_{j=1}^{m} a_i a_j\,\mathrm{Cov}\left[X^{(i)}, X^{(j)}\right]
IE 519 325
Types of Control Variates
Internal
Input random variables, or functions of those random variables
We know their expectation
Must generate them anyway
External
We cannot know E[Y]
However, with some simplifying assumptions we may have an analytical model that we can solve, and hence know the corresponding output E[Y'] of the simplified system
Requires a simulation of the simplified system
IE 519 326
Indirect Estimation
Indirect estimation has primarily been used for queueing simulations. Notation:

D_i = delay in queue of the ith customer, d = E[D]
W_i = total wait in system of the ith customer, w = E[W]
Q(t) = number of customers in queue at time t
L(t) = number of customers in system at time t
IE 519 327
Direct Estimators

\hat{d}(n) = \frac{1}{n}\sum_{i=1}^{n} D_i, \qquad \hat{w}(n) = \frac{1}{n}\sum_{i=1}^{n} W_i

\hat{Q}(n) = \frac{1}{T(n)}\int_0^{T(n)} Q(t)\,dt, \qquad \hat{L}(n) = \frac{1}{T(n)}\int_0^{T(n)} L(t)\,dt
IE 519 328
Known Relationships

Let S_i be the service time of customer i. Then

\hat{w}(n) = \hat{d}(n) + \bar{S}(n), \qquad \bar{S}(n) = \frac{1}{n}\sum_{i=1}^{n} S_i, \qquad E[\bar{S}(n)] = E[S]

Can we take advantage of this?
IE 519 329
Indirect Estimator
Replace the average service time with its known expectation, avoiding that source of variation:

\tilde{w}(n) = \hat{d}(n) + E[S]

For any G/G/1 queue it can be shown that

\mathrm{Var}[\tilde{w}(n)] \le \mathrm{Var}[\hat{w}(n)]

Is this trivial?
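The indirect estimator is easy to try on a concrete queue. The sketch below is my own illustration, assuming an M/M/1 queue with arrival rate 0.5 and service rate 1.0 (so in steady state d = 1.0 and w = 2.0), with delays generated by Lindley's recursion:

```python
import math
import random

# Lindley's recursion for delay in queue: D_{i+1} = max(0, D_i + S_i - A_{i+1}).
rng = random.Random(2024)
lam, mu, n = 0.5, 1.0, 50_000

d, total_delay, total_wait = 0.0, 0.0, 0.0
for _ in range(n):
    s = -math.log(1.0 - rng.random()) / mu   # service time of this customer
    a = -math.log(1.0 - rng.random()) / lam  # next interarrival time
    total_delay += d
    total_wait += d + s                      # direct estimator uses observed S
    d = max(0.0, d + s - a)

d_hat = total_delay / n           # estimate of d = E[D]
w_hat = total_wait / n            # direct estimator of w = E[W]
w_tilde = d_hat + 1.0 / mu        # indirect: replace Sbar(n) by E[S]

# Steady-state M/M/1 values: d = rho/(mu - lam) = 1.0, w = 1/(mu - lam) = 2.0.
print(abs(w_tilde - 2.0) < 0.2)
```

Both estimators target w = 2.0 here; the indirect one simply removes the sampling noise contributed by the service-time average.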
IE 519 330
Little’s Law
A key result from queueing theory is Little's Law (with \lambda the arrival rate to the system):

Q = \lambda d, \qquad L = \lambda w

This gives indirect estimators of the average number of customers in the queue/system:

\tilde{Q}(n) = \lambda\,\hat{d}(n), \qquad \tilde{L}(n) = \lambda\,\tilde{w}(n) = \lambda\left(\hat{d}(n) + E[S]\right)
IE 519 331
Numerical Example
M/G/1 queue, by traffic intensity ρ:

Service Dist.       ρ = .5   ρ = .7   ρ = .9
Exponential            15       11        4
4-Erlang               22       17        7
Hyperexponential        4        3        2
IE 519 332
Conditioning
Again replace an estimate with its exact analytical value, hence removing a source of variability. Suppose E[X | Z = z] is analytically known. Then

E[X] = E\big[E[X \mid Z]\big]

\mathrm{Var}\big[E[X \mid Z]\big] = \mathrm{Var}[X] - E\big[\mathrm{Var}[X \mid Z]\big] \le \mathrm{Var}[X]

so averaging E[X | Z] instead of X itself cannot increase the variance.
IE 519 333
Discussion
We need:
Z can be easily generated
E[X|Z=z] can be easily calculated
E[var[X|Z]] is large (so that much of the variability is removed)
This is going to be very model dependent
IE 519 334
Example: Time-Shared Computer Model
Want to estimate the expected delay in queue for CPU (dC), disk (dD) and tape (dT)
IE 519 335
Conditioning

Estimating d_T may be hard due to lack of data. Instead, observe the number N_T in the tape queue every time a job leaves the CPU. If this job were to go to the tape queue, its expected delay would be

E[D_T \mid N_T] = E[S_T]\,N_T = 12.50\,N_T

since E[D_T | N_T = z] = 12.50 z is known analytically. A variance reduction of 56% was observed.
IE 519 336
Discussion
Both indirect estimation and conditioning are extremely application dependent. They require good knowledge of the system as well as some background from the analyst, but can achieve good variance reduction when used properly.
IE 519 337
Variance Reduction Discussion
Have discussed several methods Common random numbers
Antithetic random variates
Control variates
Indirect estimation
Conditioning
More application specific
IE 519 338
Applicability & Connections
Can we use VRT with any technique for output analysis (e.g., batch means)?
Can we use VRT (especially CRN) with ranking-and-selection and multiple comparison methods?
Can we design our simulation experiments (DOE) to take advantage of VRT (especially when building a metamodel)?
Can we use VRT with simulation optimization techniques?
IE 519 339
VRT & Batch-Means
Batch means is a very important method for output analysis (non-overlapping and overlapping). The problem is that there may be correlation between batches. There is generally no problem with the use of common random numbers or antithetic variates; use of control variates requires some additional consideration but can be done.
IE 519 340
VRT & Ranking & Selection
In R&S we need to make statements about differences of sample means such as

\bar{X}_i(n) - \bar{X}_l(n)

If we use CRN then the two averages will be dependent, which complicates the analysis. Two general methods:
Look at pair-wise differences with the Bonferroni inequality
Assume some structure for the dependence induced by the CRNs
IE 519 341
Pair-Wise Differences
We can replace the individual replication means

\bar{X}_{ij}(n), \qquad i = 1, 2, \ldots, k; \; j = 1, 2, \ldots, n

with the pair-wise differences

\bar{X}_{ij}(n) - \bar{X}_{lj}(n), \qquad i \ne l; \; j = 1, 2, \ldots, n

This will then include the effect of the CRN-induced covariance.
IE 519 342
Bonferroni Approach
We can break up the joint statement using the Bonferroni inequality:

P\left(\bigcap_{i=1}^{k-1} A_i\right) \ge 1 - \sum_{i=1}^{k-1} P\left(\bar{A}_i\right)

where, e.g., A_i is a statement involving \bar{X}_i(n) - \bar{X}_k(n).

This is a very conservative approach.
IE 519 343
Assumed Structure
E.g., the Nelson-Matejcik modification of two-stage ranking and selection assumes sphericity:

\mathrm{Cov}\left[X_{ij}, X_{lj}\right] = \begin{cases} 2\beta_i + \tau^2, & i = l \\ \beta_i + \beta_l, & i \ne l \end{cases}

This turns out to be a robust assumption.
IE 519 344
VRT & DOE/Metamodeling
An experimental design X is used in many simulation studies to construct a metamodel (usually a regression model) of the response:

y = X\beta + \varepsilon

Can we take advantage of variance reduction to improve the design?
IE 519 345
2^3 Factorial Design
How would you use VRT for this design?
IE 519 346
Assignment Rule
In an m-point experiment that admits orthogonal blocking into two blocks of size m1 and m2, use a common stream of random numbers for the first block and the antithetic random numbers for the second block:

Block 1 (design points u_1, \ldots, u_{m_1}): common streams U_j = (u_{j1}, u_{j2}, \ldots), \; j = 1, 2, \ldots

Block 2 (design points v_1, \ldots, v_{m_2}): antithetic streams 1 - U_j = (1 - u_{j1}, 1 - u_{j2}, \ldots)
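The assignment rule can be sketched for the 2^3 design. In this illustration (my own, assuming the standard orthogonal blocking on the three-way interaction x1*x2*x3), one common stream goes to the first block and its antithetic counterpart to the second:

```python
import random
from itertools import product

rng = random.Random(9)
stream = [rng.random() for _ in range(5)]   # one common substream (length arbitrary)

design = list(product([-1, 1], repeat=3))   # the 8 points of a 2^3 design
blocks = {}
for point in design:
    sign = point[0] * point[1] * point[2]   # orthogonal blocking variable
    if sign == 1:
        blocks[point] = stream                          # common random numbers
    else:
        blocks[point] = [1.0 - u for u in stream]       # antithetic numbers

# Each block contains 4 design points; points within a block share a
# stream, and the two blocks use antithetic streams.
print(sum(1 for p in design if p[0] * p[1] * p[2] == 1))  # prints 4
```

Any point in block 1 and any point in block 2 then see uniforms that sum to one component-wise, which is exactly the common/antithetic pairing the rule prescribes.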
IE 519 347
Blocking
IE 519 348
2^3 Factorial Design in 2 Blocks
In physical experiments we block because we have to (and we lose the three-way interaction effect). In simulation we do it because it is better.
IE 519 349
VRT and Optimization
Most of the optimization techniques used with simulation do not make any distributional assumptions (e.g., heuristics from deterministic optimization applied to simulation), so there is no problem with using variance reduction.
Nested partitions method:
Needs independence between iterations
The key is to make a correct selection of a region in each iteration
Can use CRN within each iteration to make that selection
IE 519 350
Discussion

Variance reduction techniques can be very effective in improving the precision of simulation experiments. Of course variance is only part of the equation; you should also consider bias and efficiency:

\beta(X) = E[X] - \theta \quad \text{(bias)}

\mathrm{MSE}(X) = \mathrm{Var}[X] + \beta(X)^2

\mathrm{Eff}(X) = \frac{1}{\mathrm{MSE}(X)\,C(X)}, \qquad C(X) = \text{cost of computing } X
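A small numeric illustration of these formulas (the variance, bias, and cost numbers below are made up purely for illustration) shows how a slightly biased but cheap estimator can be the more efficient choice:

```python
def mse(variance, bias):
    # Mean squared error: MSE(X) = Var[X] + bias^2.
    return variance + bias ** 2

def eff(variance, bias, cost):
    # Efficiency: Eff(X) = 1 / (MSE(X) * C(X)).
    return 1.0 / (mse(variance, bias) * cost)

# Assumed numbers, for illustration only.
unbiased = eff(variance=4.0, bias=0.0, cost=10.0)  # MSE = 4.00, cost 10
biased = eff(variance=1.0, bias=0.5, cost=2.0)     # MSE = 1.25, cost 2

print(biased > unbiased)  # prints True: 1/2.5 = 0.4 vs 1/40 = 0.025
```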
IE 519 351
Case Study
IE 519 352
Manufacturing Simulations
Objectives:
Increase throughput
Reduce in-process inventory
Increase utilization
Improve on-time delivery
Validate a proposed design
Improve understanding of the system
IE 519 353
Evaluate Need for Resources
How many machines are needed?
Where to put inventory buffers?
Effect of a change/increase in production mix/volume?
Evaluation of capital investments (e.g., a new machine)
IE 519 354
Performance Evaluation
Throughput
Response time
Bottlenecks
IE 519 355
Evaluate Operational Procedures
Scheduling
Control strategies
Reliability
Quality control
IE 519 356
Sources of Randomness
Interarrival times between orders, parts, or raw material
Processing, assembly, or inspection times
Times to failure
Times to repair
Loading/unloading times
Setup times
Rework
Product yield
IE 519 357
Example: Assembly Line
An increase in demand is expected, raising questions about the ability of an assembly line to meet that demand. We were requested to simulate the line to evaluate different options to improve throughput.
IE 519 358
Project Objective
Improve throughput in the line. Simulate the following options:
Optimize the logic of the central conveyor loop
Reconfigure the functional test stations to allow parallel flow of pallets
Eliminate the conveyor and move to manual material handling
IE 519 359
Assembly Line
Manual Station 1 (assemble heatsinks and fans; soldering)
Manual Station 2 (install power module onto power PCBA)
Hi-Pot Test
Strapping
Flashing
Functional Tests
HIM Assembly
Verification Test
Packaging
IE 519 360
Simulation Project
Define a conceptual model of the line
Gather data on processes
Validate the model
Implement the model in Arena
Test the model on known scenarios
Evaluate options
Recommend solutions
IE 519 361
How Can Throughput be Improved?
Change the queuing logic, which determines how pallets move from one station to the next:
Flash station: two stations in a single loop
Functional test station: three loops with two stations each
IE 519 362
Logic of the Flash Stations
1. Frame goes to the 2nd station
2. Frame goes to the 2nd station and waits in the queue
3. Frame goes to the 1st station
4. Frame goes to the 1st station and waits in the queue
IE 519 363
Logic of the Functional Test
[Diagram: functional test stations 1-6 arranged in three loops of two stations each]
IE 519 364
How Can Throughput be Improved?
Reconfigure the functional test stations:
Parallel tests would be more efficient with respect to the flow of pallets
Would take up more space on the floor (longer distances)
Is it worthwhile to reconfigure?
IE 519 365
Circulate Pallets in System
IE 519 366
How can Throughput be Improved?
Eliminate the conveyor:
Manual material handling
Increase the number of pallets:
Currently 48 pallets
Sometimes run out
IE 519 367
Arena Simulation Model
The conceptual model was implemented using the Arena software. The current configuration was simulated and its output compared to what we have observed on the real line. The performance of several alternative configurations was then compared.
IE 519 368
Options Considered
Current configuration
Pallets re-circulate rather than queue
Various queue logics at functional tests
Flash station in series, functional test in parallel
Both flash and functional test stations in parallel
Increased number of pallets in system
Eliminate conveyor
IE 519 369
Queue Logic Options
Option 1: Queue one drive at the second station in each loop, starting with the furthest away loop
Option 2: Queue one drive at both stations in each loop, starting with the furthest away loop
Option 3: No queuing in loops
Option 4: Queue at the second station in the first loop only, starting with the furthest away loop
IE 519 370
Throughput Comparison

Configuration                   Throughput (drives/day)
Current                         265
Recirculation of pallets        275 (4% increase)
Queue logic: Option 1           274 (3% increase)
Queue logic: Option 2           279 (5% increase)
Queue logic: Option 3           280 (6% increase)
Queue logic: Option 4           295 (11% increase)
Mixed series/parallel           282 (6% increase)
All tests in parallel           296 (12% increase)
Increase to 60 pallets (25%)    291 (10% increase)
No conveyor                     256 (3% decrease)
IE 519 371
Why Does Throughput Improve?
Consider the utilization of the test stations:
First loop utilization = 0.81 (Station 1: 0.67, Station 2: 0.94; difference of 0.27)
Second loop utilization = 0.63 (Station 3: 0.45, Station 4: 0.81; difference of 0.36)
Third loop utilization = 0.41 (Station 5: 0.30, Station 6: 0.52; difference of 0.22)
IE 519 372
Improving Utilization
Backfilling will improve the balance between different loops (Option 2):
Loop utilization: 0.53, 0.68, 0.76
Does not solve the whole issue
Not queuing at test stations will balance the load between stations within a loop (Option 3):
Station utilization: 0.64, 0.65, 0.75, 0.71, 0.80, 0.76
No queuing may leave a station empty too easily
IE 519 373
Intermediate Options
Option 1: Queue one drive at the second station in each loop, starting with the furthest away loop
Uses backfilling
Balance between no queuing and the current method of queuing one drive at each station
Option 4: Queue at the second station in the first loop only, starting with the furthest away loop
Uses the backfilling idea
Balance between no queuing and queuing at the second station
IE 519 374
Utilization Comparison

Configuration               Functional Test Utilization
Current                     0.67, 0.94, 0.45, 0.81, 0.30, 0.52
Recirculation of pallets    0.71, 0.94, 0.51, 0.86, 0.26, 0.54
Queue logic: Option 1       0.42, 0.67, 0.56, 0.83, 0.67, 0.91
Queue logic: Option 2       0.39, 0.67, 0.52, 0.84, 0.61, 0.91
Queue logic: Option 3       0.64, 0.65, 0.75, 0.71, 0.80, 0.76
Queue logic: Option 4       0.53, 0.82, 0.65, 0.65, 0.73, 0.70
Mixed series/parallel       0.69
All tests in parallel       0.71
Increase to 60 pallets      0.70, 0.95, 0.49, 0.83, 0.43, 0.61
IE 519 375
Comments on Utilization
Utilization of the functional test stations is currently uneven and can be improved. Key ideas:
Backfilling
Allowing the correct amount of queuing
IE 519 376
Bottleneck Analysis
Utilization of various stations:
Manual station 1: 80% (bottleneck*)
Manual station 2: 65%
Soldering: 79% (bottleneck*)
Hi-Pot: 15%
Strapping: 37%
Flashing: 53% average
Functional test: 71% average (third highest utilization)
Functional test station 2: 94% (bottleneck)
HIM: 31%
Verification: 47%
Packing: 57%
*Statistically equivalent
IE 519 377
Bottleneck Identification
Functional Test Station 2 is the most heavily loaded station on the line. On average, the functional test stations are slightly less loaded than Manual Station 1 and the Soldering Station, which should hence also be considered bottlenecks.
IE 519 378
Functional Test Bottleneck

Configuration               Queue Length
Current                     0.73 ± 0.34, MAX = 10 (21%)
Recirculation of pallets    0.17 ± 0.04, MAX = 1 (2%)
Queue logic: Option 1       0.72 ± 0.32, MAX = 10 (21%)
Queue logic: Option 2       0.20 ± 0.10, MAX = 8 (17%)
Queue logic: Option 3       1.28 ± 0.50, MAX = 17 (35%)
Queue logic: Option 4       0.79 ± 0.27, MAX = 11 (23%)
Mixed series/parallel       0.96 ± 0.26, MAX = 19 (40%)
All tests in parallel       1.14 ± 0.33, MAX = 14 (29%)
Increase to 60 pallets      1.34 ± 0.55, MAX = 16 (33%)
IE 519 379
Comments on Queue Length
Functional test queue:
Average queue length relatively short
Occasionally very long queues
Similar results for other stations (e.g., the HIM assembly station); not a cause for concern.
IE 519 380
Recommendations
Throughput can be improved:
Queuing logic at test stations (requires reprogramming of the conveyor)
Configuring test stations in parallel (requires significant reorganization; ROI must be evaluated carefully)
Increasing the number of pallets (currently close to the point of rapidly diminishing returns; will not combine well with the other improvements)
IE 519 381
Further Improvements
The optimal logic of the functional tests depends on the mix of drives, daily load, etc. Is there a possibility of dynamically changing the logic? Determine a relationship between product mix parameters and the best logic.
IE 519 382
Other Areas of Improvement
Scheduling of drives:
Mix of frames made on each day
Order in which different frames are made
Suggestion: grouping and spacing
Group similar drives together for efficiency
Space time-consuming drives apart
Account for deadlines and resource availability
IE 519 383
Will Scheduling Improvements Help?
Simulation results, assuming batch sizes within a certain range and a certain most common batch size:

Type       Min   Max   Most common   Throughput
Batching    1    27        14           279
Batching    1    22        10           265
No Batch    1     1         1           274

Clearly improvements can be made.
IE 519 384
Discussion
Significant improvement can be obtained through inexpensive changes; we recommend changing the queuing logic as an inexpensive but high-return alternative. It is also worthwhile to consider issues of scheduling. The simulation model can be reused to consider other potential improvements. The company followed the recommendations and increased throughput as predicted.