Simulation Output Analysis


Page 1: Simulation Output Analysis

Simulation Output Analysis

Page 2: Simulation Output Analysis

Summary

Examples
Parameter Estimation
Sample Mean and Variance
Point and Interval Estimation
Terminating and Non-Terminating Simulation
Mean Square Errors

Page 3: Simulation Output Analysis

Example: Single Server Queueing System

Average System Time

Let $S_k$ be the time that customer k spends in the queueing system. Then

$$\hat{S}_N = \frac{1}{N}\sum_{k=1}^{N} S_k = \frac{A_N}{N},$$

where $A_N$ is the area under x(t).

[Figure: sample path of x(t) versus t (t = 1, …, 14), with the system times S1, …, S7 of the individual customers marked.]

$\hat{S}_N$ is the estimate of the average system time over the first N customers.

IMPORTANT: This is a Random Variable

Page 4: Simulation Output Analysis

Example: Single Server Queueing System

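Probability that x(t) = i

Let T(i) be the total observed time during which x(t) = i. The probability estimate is

$$\hat{p}_N(i) = \frac{T(i)}{T_N}, \qquad T_N = \sum_{i \ge 0} T(i),$$

where $T_N$ is the total observation interval.

[Figure: sample path of x(t) versus t (t = 1, …, 14); the area under the path is split into strips labelled T1, 2T2 and 3T3, that is, i·T(i) for each level i.]

Average queue length

$$\hat{Q}_N = \sum_{i \ge 0} i\,\hat{p}_N(i) = \frac{1}{T_N}\sum_{i \ge 0} i\,T(i) = \frac{A_N}{T_N},$$

where $A_N$ is the area under x(t).

Utilization

$$\hat{\rho}_N = \sum_{i \ge 1} \hat{p}_N(i) = \frac{1}{T_N}\sum_{i \ge 1} T(i) = 1 - \frac{T(0)}{T_N}$$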
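The time-average estimates above can be computed directly from a recorded sample path. The following is a minimal sketch, assuming the path is stored as a list of jump instants and the state entered at each jump; the function name `time_average_stats` and the sample data are hypothetical and only illustrate the bookkeeping of T(i).

```python
from collections import defaultdict

def time_average_stats(event_times, states, t_end):
    """Estimate p(i), the average queue length and the utilization.

    event_times[k] is the instant at which x(t) jumps to states[k];
    the path is observed on the interval [event_times[0], t_end].
    """
    time_in_state = defaultdict(float)            # T(i)
    for k, state in enumerate(states):
        next_t = event_times[k + 1] if k + 1 < len(event_times) else t_end
        time_in_state[state] += next_t - event_times[k]

    total = t_end - event_times[0]                # total observation interval
    p_hat = {i: t / total for i, t in time_in_state.items()}
    q_hat = sum(i * p for i, p in p_hat.items())  # area under x(t) / total
    utilization = 1.0 - p_hat.get(0, 0.0)         # 1 - T(0) / total
    return p_hat, q_hat, utilization

# Hypothetical sample path observed on [0, 10]
times = [0.0, 1.0, 3.0, 4.0, 7.0, 9.0]
states = [0, 1, 2, 1, 0, 1]
p_hat, q_hat, rho_hat = time_average_stats(times, states, t_end=10.0)
print(p_hat, q_hat, rho_hat)
```

Because every quantity is a ratio of observed times, rerunning the simulation with different random numbers gives different values; the estimates are themselves random variables.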

Page 5: Simulation Output Analysis

Parameter Estimation

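Let X1,…,Xn be independent identically distributed random variables with mean θ and variance σ².

In general, θ and σ² are unknown deterministic quantities which we would like to estimate.

Sample Mean (a Random Variable!):

$$\hat{\theta}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$$

The sample mean can be used as an estimate of the unknown parameter θ. It has the same mean as each Xi but a smaller variance.

$$E[\hat{\theta}_n] = E\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{n\theta}{n} = \theta$$

$$\mathrm{Var}[\hat{\theta}_n] = E\left[(\hat{\theta}_n - \theta)^2\right] = E\left[\left(\frac{1}{n}\sum_{i=1}^{n} (X_i - \theta)\right)^2\right] = \frac{1}{n^2}\sum_{i=1}^{n} E\left[(X_i - \theta)^2\right] = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$$

The cross terms in the variance vanish because the Xi are independent.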

Page 6: Simulation Output Analysis

Estimator Properties

Unbiasedness: An estimator $\hat{\theta}_n$ is said to be an unbiased estimator of the parameter θ if it satisfies

$$E[\hat{\theta}_n] = \theta$$

Bias: In general, an estimator is biased, in which case the following holds

$$E[\hat{\theta}_n] = \theta + b_n$$

where $b_n$ is the bias of the estimator.

If X1,…,Xn are iid with mean θ, then the sample mean is an unbiased estimator of θ.

Page 7: Simulation Output Analysis

Estimator Properties

Asymptotic Unbiasedness: An estimator is said to be asymptotically unbiased if it satisfies

$$E[\hat{\theta}_n] = \theta + b_n \quad\text{and}\quad \lim_{n\to\infty} E[\hat{\theta}_n] = \theta$$

Strong Consistency: An estimator is strongly consistent if, with probability 1,

$$\lim_{n\to\infty} \hat{\theta}_n = \theta$$

If X1,…,Xn are iid with mean θ, then the sample mean is also strongly consistent.

Page 8: Simulation Output Analysis

Consistency of the Sample Mean

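The variance of the sample mean is

$$\mathrm{Var}[\hat{\theta}_n] = \frac{\sigma^2}{n}$$

[Figure: density f(x) of the sample mean, which becomes narrower as n increases.]

But σ is unknown, therefore we use the sample variance

$$S_n^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \hat{\theta}_n\right)^2$$

Also a Random Variable!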

Page 9: Simulation Output Analysis

Recursive Form of Sample Mean and Variance

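Let Mj and Sj be the sample mean and sample variance after the j-th sample is observed. Also, let M0=S0=0.

$$M_j = \frac{1}{j}\sum_{i=1}^{j} X_i, \qquad S_j = \frac{1}{j-1}\sum_{i=1}^{j}\left(X_i - M_j\right)^2$$

The recursive form for generating Mj+1 and Sj+1 is

$$M_{j+1} = \frac{1}{j+1}\sum_{i=1}^{j+1} X_i = \frac{j M_j + X_{j+1}}{j+1} = M_j + \frac{X_{j+1} - M_j}{j+1}$$

$$S_{j+1} = \left(1 - \frac{1}{j}\right) S_j + (j+1)\left(M_{j+1} - M_j\right)^2$$

The variance recursion applies for j ≥ 1, starting from S1 = 0.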

Example: Let Xi be a sequence of iid exponentially distributed random variables with rate λ= 0.5 (sample.m).
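The slide refers to a MATLAB script (sample.m) that is not reproduced in the transcript. As a stand-in, here is a minimal Python sketch of the recursive update of the sample mean and sample variance for exponential samples with rate 0.5; the function name `running_mean_variance`, the seed and the sample size are assumptions made for illustration.

```python
import random
import statistics

def running_mean_variance(samples):
    """Return (M_j, S_j) after the last sample, using the recursive update."""
    mean, var = 0.0, 0.0                  # M_0 = S_0 = 0
    for j, x in enumerate(samples):       # j samples were seen before x
        new_mean = mean + (x - mean) / (j + 1)
        if j >= 1:                        # variance recursion needs j >= 1
            var = (1.0 - 1.0 / j) * var + (j + 1) * (new_mean - mean) ** 2
        mean = new_mean
    return mean, var

rng = random.Random(1)                    # arbitrary fixed seed
xs = [rng.expovariate(0.5) for _ in range(10_000)]   # rate 0.5, true mean 2
m, s = running_mean_variance(xs)
print(m, s)
print(statistics.mean(xs), statistics.variance(xs))  # batch formulas, for comparison
```

The recursive form needs only the previous mean and variance, so the samples never have to be stored during a simulation run.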

Page 10: Simulation Output Analysis

Interval Estimation and Confidence Intervals

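Suppose that the estimator takes the value $\hat{\theta}_n = \theta_1$. The natural question then is: how confident are we that the true parameter θ lies within the interval (θ1-ε, θ1+ε)?

Recall the central limit theorem and define a new random variable

$$Z_n = \frac{\hat{\theta}_n - E[\hat{\theta}_n]}{\sqrt{\mathrm{Var}[\hat{\theta}_n]}}$$

For the sample mean case

$$Z_n = \frac{\hat{\theta}_n - \theta}{\sqrt{\sigma^2 / n}}$$

Then, as n grows, the cdf of Zn approaches that of the standard normal distribution N(0,1), given by

$$\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-r^2/2}\,dr$$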

Page 11: Simulation Output Analysis

Interval Estimation and Confidence Intervals

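Let Z be a standard normal random variable, then

$$\Pr\left[-Z_{a/2} \le Z \le Z_{a/2}\right] = \Pr\left[|Z| \le Z_{a/2}\right] = 1 - a$$

[Figure: standard normal density fZ(x), centred at 0; the area between -Za/2 and Za/2 is 1-a.]

As n increases, the density of Zn approaches the standard normal density function, thus

$$\Pr\left[-Z_{a/2} \le Z_n \le Z_{a/2}\right] \approx 1 - a$$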

Page 12: Simulation Output Analysis

Interval Estimation and Confidence Intervals

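Substituting for Zn,

$$\Pr\left[-Z_{a/2} \le \frac{\hat{\theta}_n - \theta}{\sqrt{\sigma^2/n}} \le Z_{a/2}\right] \approx 1 - a$$

[Figure: standard normal density fZ(x) with the points -Za/2 and Za/2 marked.]

$$\Pr\left[\hat{\theta}_n - Z_{a/2}\sqrt{\sigma^2/n} \le \theta \le \hat{\theta}_n + Z_{a/2}\sqrt{\sigma^2/n}\right] \approx 1 - a$$

Thus, for n large, this defines the interval where θ lies with probability 1-a. The following quantities are needed:
The sample mean.
The value of Za/2, which can be obtained from tables given a.
The variance of the sample mean, which is unknown, so the sample variance is used.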

Page 13: Simulation Output Analysis

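Example

Suppose that X1, …, Xn are iid exponentially distributed random variables with rate λ=2. Estimate their mean using the sample mean, and find the 95% confidence interval.

SOLUTION

The sample mean is given by

$$\hat{\theta}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$$

From the standard normal tables, a = 0.05 implies Za/2 ≈ 2 (more precisely, 1.96).

Finally, the sample variance is given by

$$S_n^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \hat{\theta}_n\right)^2$$

Therefore, for n large,

$$\Pr\left[\hat{\theta}_n - 2\sqrt{S_n^2/n} \le \theta \le \hat{\theta}_n + 2\sqrt{S_n^2/n}\right] \approx 0.95$$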

SampleInterval.m
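The referenced MATLAB script is not included in the transcript. A rough Python equivalent of the example, under the assumption that the normal approximation with the quantile 1.96 is acceptable, might look as follows; the function name `normal_confidence_interval`, the seed and the sample size are hypothetical.

```python
import math
import random
import statistics

def normal_confidence_interval(samples, z=1.96):
    """Interval: sample mean +/- z * sqrt(sample variance / n)."""
    n = len(samples)
    mean = statistics.mean(samples)
    half_width = z * math.sqrt(statistics.variance(samples) / n)
    return mean - half_width, mean + half_width

rng = random.Random(7)                                # arbitrary fixed seed
xs = [rng.expovariate(2.0) for _ in range(5000)]      # rate 2, true mean 0.5
low, high = normal_confidence_interval(xs)
print(f"95% confidence interval: ({low:.4f}, {high:.4f})")
```

Over many independent repetitions, roughly 95% of the intervals built this way should contain the true mean 1/λ = 0.5; any single interval may miss it.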

Page 14: Simulation Output Analysis

How Good is the Approximation

The standard normal N(0,1) approximation is valid as long as n is large enough, but how large is good enough?

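Alternatively, the confidence interval can be evaluated based on the Student's t distribution with n-1 degrees of freedom.

A Student's t random variable is obtained from n iid Gaussian random variables (Yi), each with mean μ and variance σ², by standardizing their average with the sample variance $S_n^2$ in place of σ²:

$$T_n = \frac{\frac{1}{n}\sum_{i=1}^{n} Y_i - \mu}{\sqrt{S_n^2 / n}}$$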

Page 15: Simulation Output Analysis

Terminating and Non-Terminating Simulation

Terminating Simulation
There is a specific event that determines when the simulation will terminate, e.g., processing M packets, observing M events, or simulating t time units.
Initial conditions are important!

Non-Terminating Simulation
Interested in long-term (steady-state) averages

$$\theta = \lim_{k\to\infty} E[X_k]$$

Page 16: Simulation Output Analysis

Terminating Simulation

Let X1,…,XM be data collected from a terminating simulation, e.g., the system time in a queue. X1,…,XM are NOT independent since

Xk=max{0, Xk-1-Yk}+Zk

where Yk and Zk are the kth interarrival and service times, respectively.

Define a performance measure, say

$$L = \frac{1}{M}\sum_{i=1}^{M} X_i$$

Run N simulations to obtain L1,…,LN. Assuming independent simulations, L1,…,LN are independent random variables, thus we can use the sample mean estimate

$$\hat{L} = \frac{1}{N}\sum_{j=1}^{N} L_j = \frac{1}{N}\sum_{j=1}^{N}\frac{1}{M}\sum_{i=1}^{M} X_{ij}$$

where Xij is the i-th observation of the j-th simulation.
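The independent-replications idea can be sketched in a few lines. The example below assumes exponential interarrival and service times and an initially empty system; these distributions, the rates and the name `replication_average` are illustrative assumptions, not part of the slides.

```python
import random
import statistics

def replication_average(rng, num_customers, arrival_rate, service_rate):
    """One terminating run: average system time of the first M customers."""
    total, prev = 0.0, 0.0                     # system starts empty
    for _ in range(num_customers):
        y = rng.expovariate(arrival_rate)      # interarrival time Y_k
        z = rng.expovariate(service_rate)      # service time Z_k
        prev = max(0.0, prev - y) + z          # X_k = max{0, X_{k-1} - Y_k} + Z_k
        total += prev
    return total / num_customers               # L = (1/M) * sum of X_i

rng = random.Random(42)                        # arbitrary fixed seed
# N = 50 replications of M = 100 customers each (assumed values)
L = [replication_average(rng, 100, 0.8, 1.0) for _ in range(50)]
estimate = statistics.mean(L)                  # sample mean of L_1, ..., L_N
spread = statistics.variance(L)                # usable for a confidence interval
print(estimate, spread)
```

The observations inside one run are correlated, but each run starts afresh from the same initial condition, so the per-run averages can be treated as iid samples and the earlier confidence-interval formulas applied to them.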

Page 17: Simulation Output Analysis

Examples: Terminating Simulation

Suppose that we are interested in the average time it will take to process the first 100 parts (given some initial condition).

Let T100,j, j=1,…,M, denote the time at which the 100th part is finished during the j-th replication; then the mean time required is estimated by

$$\hat{L} = \frac{1}{M}\sum_{j=1}^{M} T_{100,j}$$

Suppose we are interested in the fraction of customers that get delayed more than 1 minute between 9 and 10 am at a certain ATM.

Let Dij be the delay of the ith customer during the jth replication and define 1[Dij]=1 if Dij>1, 0 otherwise. With Mj customers observed in the jth of N replications,

$$L_j = \frac{1}{M_j}\sum_{i=1}^{M_j} 1[D_{ij}]$$

$$\hat{L} = \frac{1}{N}\sum_{j=1}^{N}\frac{1}{M_j}\sum_{i=1}^{M_j} 1[D_{ij}]$$

Page 18: Simulation Output Analysis

Non-Terminating Simulation

Any simulation will terminate at some point m < ∞, thus the initial transient (because we start from a specific initial state) may cause some bias in the simulation output.

Replication with Deletions
The suggestion here is to start the simulation and let it run for a while without collecting any statistics.
The reasoning behind this approach is that the simulation will come closer to its steady state and, as a result, the collected data will be more representative.

[Figure: time axis from 0 to m; the warm-up period is [0, r] and the data collection period is [r, m].]
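A small sketch of replication with deletions, assuming each replication has already been recorded as a list of observations and that the warm-up is expressed as a number of observations rather than a time r; the function name and the data are made up for illustration.

```python
import statistics

def replication_deletion_estimate(replications, warmup):
    """Drop the first `warmup` observations of every run, then average."""
    per_run = [statistics.mean(run[warmup:]) for run in replications]
    return statistics.mean(per_run), per_run

# Hypothetical runs that start from an empty system (low initial values)
runs = [
    [0, 0, 5, 5, 5, 5],
    [0, 1, 4, 6, 5, 5],
]
with_deletion, _ = replication_deletion_estimate(runs, warmup=2)
without_deletion, _ = replication_deletion_estimate(runs, warmup=0)
print(with_deletion, without_deletion)
```

In this toy data the estimate without deletion is pulled down by the initial transient, which is the bias the warm-up period is meant to remove. Choosing the warm-up length is a judgment call: too short leaves bias, too long wastes data.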

Page 19: Simulation Output Analysis

Non-Terminating Simulation

Batch Means
Group the collected data into n batches with m samples each.
Form the batch average

$$B_j = \frac{1}{m}\sum_{i=m(j-1)+1}^{jm} X_i$$

Take the average of all batches

$$\bar{B} = \frac{1}{n}\sum_{j=1}^{n} B_j = \frac{1}{nm}\sum_{j=1}^{n}\ \sum_{i=m(j-1)+1}^{jm} X_i$$

For each batch, we can also use the warm-up periods as before.
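The batching step can be sketched as follows, assuming the observations of one long run are held in a list; `batch_means` is a hypothetical helper, and any leftover samples that do not fill a complete batch are simply ignored in this sketch.

```python
import statistics

def batch_means(data, num_batches):
    """Split data into n batches of m samples; return (grand mean, batch means)."""
    m = len(data) // num_batches               # samples per batch
    batches = [
        statistics.mean(data[j * m:(j + 1) * m])   # B_j, with j counted from 0 here
        for j in range(num_batches)
    ]
    return statistics.mean(batches), batches

# Tiny deterministic example: 12 observations, n = 3 batches, m = 4
grand_mean, batches = batch_means(list(range(1, 13)), num_batches=3)
print(batches, grand_mean)
```

If the batches are long enough for the batch averages to be approximately uncorrelated, their sample variance can be plugged into the confidence-interval formula with n equal to the number of batches.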

Page 20: Simulation Output Analysis

Non-Terminating Simulation

Regenerative Simulation
Regenerative process: a process that is characterized by random points in time where the future of the process becomes independent of its past (“regenerates”).

[Figure: time axis starting at 0 with the regeneration points marked.]

Regeneration points divide the sample path into intervals.
Data from the same interval are grouped together.
We form the average over all such intervals.
Example: Busy periods in a single server queue identify regeneration intervals (why?).
In general, it is difficult to find such points!

Page 21: Simulation Output Analysis

Empirical Distributions and Bootstrapping

Suppose we are given a set of measurements X1,…,Xn, which are realizations of iid random variables distributed according to some unknown FX(x;θ), where θ is a parameter we would like to estimate.

We can approximate FX(x; θ) using the data with a pmf where all measurements have equal probability 1/n.

The approximation becomes better as n grows larger.

Page 22: Simulation Output Analysis

Example

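Suppose we have the measurements x1,…,xn that came from a distribution FX(x) with unknown mean θ and variance σ². We would like to estimate θ using the sample mean μ. Find the Mean Square Error (MSE) of the estimator based on the empirical data.

$$\mu = \sum_{i=1}^{n} p_i x_i = \frac{1}{n}\sum_{i=1}^{n} x_i$$

$$MSE_e = E_e\left[\left(g(\mathbf{X}) - \mu\right)^2\right]$$

Here Ee denotes expectation based on the empirical distribution, X is the vector of RVs drawn from the empirical distribution, and g(X) is the sample mean of that vector.

[Figure: empirical distribution, a staircase over the points x1, x2, …, xn with levels 1/n, 2/n, …, 1.]

The empirical mean is an unbiased estimator of θ.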

Page 23: Simulation Output Analysis

Example

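$$\mu = \sum_{i=1}^{n} p_i x_i = \frac{1}{n}\sum_{i=1}^{n} x_i$$

$$MSE_e = E_e\left[\left(g(\mathbf{X}) - \mu\right)^2\right] = E_e\left[\left(\frac{1}{n}\sum_{i=1}^{n} X_i - \mu\right)^2\right] = \frac{1}{n^2}E_e\left[\left(\sum_{i=1}^{n} (X_i - \mu)\right)^2\right] = \frac{1}{n^2}\sum_{i=1}^{n} E_e\left[(X_i - \mu)^2\right] = \frac{1}{n}Var_e(X_i)$$

where Xi is a RV from the empirical distribution, with

$$Var_e(X_i) = E_e\left[(X_i - \mu)^2\right] = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2$$

Therefore

$$MSE_e = \frac{1}{n}Var_e(X_i) = \frac{1}{n^2}\sum_{i=1}^{n} (x_i - \mu)^2$$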

Compare this with the sample variance!
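To make the comparison concrete, the sketch below evaluates the empirical MSE of the sample mean next to the usual estimate, sample variance divided by n, for a small made-up data set; the helper name `empirical_mse` and the numbers are illustrative only.

```python
import statistics

def empirical_mse(xs):
    """MSE of the sample mean under the empirical distribution:
    (1/n^2) * sum of (x_i - mu)^2."""
    n = len(xs)
    mu = sum(xs) / n
    return sum((x - mu) ** 2 for x in xs) / n ** 2

xs = [2.0, 4.0, 4.0, 6.0]                      # hypothetical measurements
mse_e = empirical_mse(xs)
classic = statistics.variance(xs) / len(xs)    # S^2 / n, with divisor n - 1 in S^2
print(mse_e, classic)
```

The two values differ only by the factor (n-1)/n, because the empirical variance divides by n while the sample variance divides by n-1; the difference vanishes as n grows.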