Post on 16-May-2020
2/08/2007 ENGN8101 Modelling and Optimization
STATISTICAL MODELS IN SIMULATION
How are probability and variation related to modelling of system performance?
Random variable – any system variable, X, that can take different values (continuous or discrete)
i.e. X has a range of values
How are the values within this range distributed?
- Described using Probability Functions
Probability functions – used to define probabilities of events associated with a random variable
Often mathematical functions, or graphical in nature
Discrete variables
X – described by a probability mass function P(x), not P(X)
Why? X is the random variable and x ⊂ X is one particular value in its range, so P(x) = P(X = x)
Cumulative distribution function: F(x) = P(X ≤ x)
F(x) = step function bounded by 0 and 1
Problem to be solved:
QA Properties is considering making an offer to purchase an apartment building. Management has subjectively assessed a probability distribution for x, the purchase price:
x          P(x)
$148000    0.20
$150000    0.40
$152000    0.40
a) Find the cumulative distribution function F(x). What is F(152000)?
b) What is the probability that the apartment house can be purchased for $150000 or less?
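As an illustration of how a discrete cumulative distribution function is built from a probability mass function, here is a minimal Python sketch using the purchase-price table from the problem. The names `prices`, `probs` and `F` are chosen for this example only.

```python
from itertools import accumulate

# Subjective purchase-price distribution taken from the problem statement.
prices = [148000, 150000, 152000]
probs = [0.20, 0.40, 0.40]

# F(x) at each listed price is the running sum of P(x).
cdf_at_prices = dict(zip(prices, accumulate(probs)))


def F(x):
    """Step-function CDF: total probability of all prices <= x."""
    return sum(p for price, p in zip(prices, probs) if price <= x)


print(cdf_at_prices)
print(F(150000), F(152000))
```

F(x) stays flat between the listed prices and jumps by P(x) at each one, which is why it is a step function bounded by 0 and 1.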
Continuous variables
Here – need a probability density function (pdf)
pdf: f(x) = P(x ≤ X ≤ x + dx)/dx
More generally:
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(x)\,dx$$
$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$
Probability Density Function
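Descriptive parameters
Mean value = Expected value
Discrete:
$$\mu = E(X) = \sum_{\text{all } i} x_i P(x_i)$$
Continuous:
$$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$
Variance:
Discrete:
$$\sigma^2 = E[(X-\mu)^2] = \sum_{\text{all } x_i} (x_i-\mu)^2 P(x_i)$$
Continuous:
$$\sigma^2 = E[(X-\mu)^2] = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx$$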
Median (x_m): P(X > x_m) = 0.50
Mode: occurs where the density function has its peak
Skewness: θ = E[(X-μ)^3]/σ^3
What would the 4th moment describe?
CASE STUDY 8
The number of signals arriving at a satellite monitoring system in any hour is defined by a random variable X. The probability mass function of X is believed to be
$$p(x) = \frac{c}{x^2+1} \quad \text{for } x = 0, 1, 2, 3 \text{ and } 4$$
$$p(x) = 0 \quad \text{elsewhere}$$
a) compute the constant c
b) compute the mean and standard deviation of X
c) compute the probability that the number of signals detected by the system in any hour is less than or equal to 2
a) Summed over all x, the probabilities P(x) must total 1
So
$$c\left[\frac{1}{0+1} + \frac{1}{1+1} + \frac{1}{4+1} + \frac{1}{9+1} + \frac{1}{16+1}\right] = 1$$
Giving c = 0.538
b) Mean =
$$\sum_{\text{all } x_i} x_i P(x_i) = c\left[\frac{x_1}{x_1^2+1} + \frac{x_2}{x_2^2+1} + \frac{x_3}{x_3^2+1} + \frac{x_4}{x_4^2+1} + \frac{x_5}{x_5^2+1}\right]$$
= 0.772
Similarly – variance = 1.094 giving σ = 1.05
c) P(X≤2) desired = P(X=0) + P(X=1) + P(X=2) = 0.538(1 + 0.5 + 0.2) = 0.915
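The three parts of Case Study 8 can be reproduced numerically. The sketch below is one possible way to do it in plain Python; the variable names are illustrative and not from the slides.

```python
import math

# Support of X and the unnormalised weights 1/(x^2 + 1) from the case study.
xs = [0, 1, 2, 3, 4]
weights = [1 / (x**2 + 1) for x in xs]

# a) c is fixed by requiring the probabilities to sum to 1.
c = 1 / sum(weights)
pmf = {x: c * w for x, w in zip(xs, weights)}

# b) Mean, variance and standard deviation from the discrete definitions.
mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
std_dev = math.sqrt(variance)

# c) P(X <= 2) is the sum of the first three probabilities.
p_le_2 = sum(p for x, p in pmf.items() if x <= 2)

print(round(c, 3), round(mean, 3), round(variance, 3), round(std_dev, 2), round(p_le_2, 3))
```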
Glazer’s Winton Woods apartment complex has 80 two-bedroom apartments. The number of apartment air conditioner units that must be replaced during the summer season has the probability distribution shown below:
Air conditioners replaced    Probability
0                            0.30
1                            0.35
2                            0.20
3                            0.10
4                            0.05
1) What is the expected number of air conditioner units that will be replaced during a summer season?
2) What is the variance in the number of air conditioner replacements?
3) What is the standard deviation?
Common probabilistic functions
A random variable – often takes values that follow a probability ‘trend’
If they follow a numerical pattern – can be modelled easily using distribution functions
DISCRETE
Hypergeometric
Binomial
Poisson
CONTINUOUS
Gaussian
Exponential
Weibull
HYPERGEOMETRIC
Applicable to sampling a population without subsequent replacement of the sample
For D non-conforming items in a population of size N,
the probability of getting x non-conforming items in a sample of size n is
$$P(x) = \frac{\binom{D}{x}\binom{N-D}{n-x}}{\binom{N}{n}} \quad \text{where } \binom{D}{x} \text{ means } \frac{D!}{x!\,(D-x)!}$$
$$\text{Mean} = \frac{nD}{N} \qquad \text{Variance} = \frac{nD}{N}\left(1-\frac{D}{N}\right)\left(\frac{N-n}{N-1}\right)$$
CASE STUDY 9
A batch of 20 transistors is known to contain 5 non-conforming ones. If an inspector randomly samples 4 items, find the probability of picking out 3 non-conforming ones.
Here, N = 20, D = 5, n = 4, x = 3
Probability of 3 non-conformers:
$$P(3) = \frac{\binom{5}{3}\binom{15}{1}}{\binom{20}{4}} = 0.031$$
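The hypergeometric probability in Case Study 9 can be checked with Python's standard `math.comb` (available from Python 3.8). The function name `hypergeom_pmf` is just a label for this sketch.

```python
from math import comb


def hypergeom_pmf(x, N, D, n):
    """P(x non-conforming items in a sample of n), drawn without replacement
    from a population of N that contains D non-conforming items."""
    return comb(D, x) * comb(N - D, n - x) / comb(N, n)


# Case Study 9: N = 20 transistors, D = 5 non-conforming, sample of n = 4.
N, D, n = 20, 5, 4
p3 = hypergeom_pmf(3, N, D, n)

# Mean and variance from the closed-form expressions.
mean = n * D / N
variance = (n * D / N) * (1 - D / N) * ((N - n) / (N - 1))

print(round(p3, 3), mean, round(variance, 3))
```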
BINOMIAL
Series of independent trials – each trial gives ‘yes’ or ‘no’
Probability of success = p = constant for any trial
Probability of x successes in n trials:
$$P(x) = \binom{n}{x} p^x (1-p)^{n-x}$$
Mean = np, variance = np(1-p)
Uses:
Sampling without replacement from large populations, or
Sampling with replacement from small populations
As N → ∞, Hypergeometric → Binomial
CASE STUDY 10
A signal filtering device is known to be 95% successful. If a random sample of 5 filtered signals is chosen, find the probability that 2 of them have not been correctly processed.
Here, n = 5, x = 2 and p = 0.05
(if success is defined as finding a dud filtered signal)
$$P(X=2) = \binom{5}{2}(0.05)^2(0.95)^3 = 0.021$$
Additionally, mean and variance of the distribution:
$$\mu = np = 5(0.05) = 0.25 \quad \text{and} \quad \sigma^2 = np(1-p) = 5(0.05)(0.95) = 0.2375$$
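A short Python sketch of the binomial calculation for Case Study 10 follows. It uses only the standard library; `binom_pmf` is a name introduced here for illustration.

```python
from math import comb


def binom_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Case Study 10: "success" means finding an incorrectly processed signal.
n, p = 5, 0.05
p2 = binom_pmf(2, n, p)
mean = n * p
variance = n * p * (1 - p)

print(round(p2, 3), mean, variance)
```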
Problem to be solved:
At a particular university it has been found that 20% of the students withdraw without completing the introductory statistics course. Assume that 20 students have registered for the course this semester.
1) What is the probability that two or fewer will withdraw?
2) What is the probability that exactly four will withdraw?
3) What is the expected number of withdrawals?
POISSON RANDOM VARIABLE
Named after Simeon D. Poisson (1781-1840)
Originated as an approximation to binomial
Used extensively in stochastic modeling
Examples include:
Number of phone calls received, number of messages arriving at a sending node, number of radioactive disintegrations, number of misprints found on a printed page, number of defects found on a sheet of processed metal, number of blood cell counts, etc.
POISSON
Models the number of occurrences of an event over time, space, or volume…
Events = random and independent
Uses: number of non-conformities in a product, number of machine breakdowns per month
$$p(x) = \frac{e^{-\lambda}\lambda^x}{x!}$$
here λ = average no. of events over specified time period
mean = variance = λ
If n → ∞ and p → 0 (with np = λ held constant), then binomial → Poisson
CASE STUDY 11
It is estimated that the average number of surface defects in 20 m² of paper produced by a process is 3. What is the probability of finding no more than 2 defects in 40 m² of paper through random selection?
Here, 1 unit is now 40 m²
so λ is now 6
We need: P(X≤2) = P(X=0) + P(X=1) + P(X=2)
$$= \frac{e^{-6}6^0}{0!} + \frac{e^{-6}6^1}{1!} + \frac{e^{-6}6^2}{2!} = 0.062$$
The mean and variance of this distribution are both 6
The Poisson approximation to the binomial will be good as long as p ≤ 0.05 and n > 20.
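The Poisson sum in Case Study 11 can be evaluated directly. This is a minimal standard-library sketch; `poisson_pmf` and `poisson_cdf` are names chosen for the example.

```python
import math


def poisson_pmf(x, lam):
    """Probability of exactly x events when the average count is lam."""
    return math.exp(-lam) * lam**x / math.factorial(x)


def poisson_cdf(x, lam):
    """P(X <= x), summed term by term."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))


# Case Study 11: 3 defects per 20 m^2 scales to lam = 6 per 40 m^2.
lam = 3 * (40 / 20)
p_at_most_2 = poisson_cdf(2, lam)

print(round(p_at_most_2, 3))
```

Rescaling λ to the new unit of area before evaluating the formula is the key step; the formula itself is unchanged.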
Problem to be solved:
A certain restaurant has a reputation for good food. Restaurant management boasts that on a Saturday night groups of customers arrive at the rate of 15 groups every half hour.
1) What is the probability that 5 minutes will pass with no customers arriving?
2) What is the probability that 8 groups of customers will arrive in 10 minutes?
3) What is the probability that more than 5 groups of customers will arrive in a 10-minute period of time?
GAUSSIAN (normal)
The most important continuous distribution in probability and statistics
The story of the normal distribution is really the story of the development of statistics as a science.
Gauss discovered this while incorporating the method of least squares for reducing the errors in fitting curves for astronomical observations.
GAUSSIAN (normal)
Most widely used distribution for continuous random variables
- “Nature’s Distribution” –
For a population mean = μ and variance = σ²
Probability density function for x =
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right], \quad -\infty < x < \infty$$
i.e. open-ended bell curve
Graphs of various normal PDFs
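Often standardized such that σ² = 1 and μ = 0
Here – Z = standardized random variable and
$$f(z) = \frac{1}{\sqrt{2\pi}}\exp\left[-\frac{z^2}{2}\right], \quad -\infty < z < \infty$$
Alternatively
$$\Phi(z) = P(Z \le z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\exp\left[-\frac{z^2}{2}\right]dz$$
Note
$$\Phi(0) = 0.5$$
and due to symmetry
$$\Phi(-z) = 1 - \Phi(z)$$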
Impossible to document every σ/μ combination
∴ standardization required for easy tabulation and reference
$$z = \frac{X-\mu}{\sigma}$$
Tabulated values –
give areas under the curve – hence – probabilities
CASE STUDY 12
The length of a machined part is known to have a normal distribution with a mean of 100 mm and a standard deviation of 2 mm
a) What proportion of the parts can be expected to be over 103.3 mm in length?
b) What proportion will be between 98.5 mm and 102.0 mm?
a)
$$z_1 = \frac{X_1-\mu}{\sigma} = \frac{103.3-100}{2} = 1.65$$
P(X>103.3) = P(Z>1.65)
From tables: P(Z≤1.65) = 0.9505
So P(Z>1.65) = 1 - 0.9505 = 0.0495
4.95% of the parts will be above 103.3 mm
b)
We need P(98.5≤X≤102.0)
now
$$z_1 = \frac{102.0-100}{2} = 1.00 \qquad z_2 = \frac{98.5-100}{2} = -0.75$$
From tables: 1.00 → 0.8413 and -0.75 → 0.2266
∴ answer = 0.8413 – 0.2266 = 0.6147
61.47% of the output lies in the specified range
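Instead of reading tables, the standard normal CDF can be computed from the error function, using the identity Φ(z) = ½[1 + erf(z/√2)]. The sketch below applies it to both parts of Case Study 12; `phi` and `standardize` are illustrative names.

```python
import math


def phi(z):
    """Standard normal CDF, P(Z <= z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def standardize(x, mu, sigma):
    """Convert a value of X to the standardized variable z = (x - mu) / sigma."""
    return (x - mu) / sigma


mu, sigma = 100.0, 2.0  # mean and standard deviation in mm

# a) Proportion of parts longer than 103.3 mm.
p_over = 1.0 - phi(standardize(103.3, mu, sigma))

# b) Proportion of parts between 98.5 mm and 102.0 mm.
p_between = phi(standardize(102.0, mu, sigma)) - phi(standardize(98.5, mu, sigma))

print(round(p_over, 4), round(p_between, 4))
```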
EXPONENTIAL
Main use – reliability analysis
e.g. time to failure of a system entity
Here λ = failure rate
pdf:
$$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0$$
i.e. failure most likely at t=0 (switching on)
Mean = 1/λ, variance = 1/λ²
Most important facet – memoryless distribution
i.e. no reliance on what has occurred before
e.g. Markov chains in simulation and modelling
- also memoryless
- define probability of state change
CASE STUDY 13
It is known that the battery for a video game has an average life of 500 hours. The failures of batteries are known to be random, independent and exponentially distributed. What is the probability of a battery failing within 200 hours?
Solution – very simple: failure rate = 1/500
P(X≤200) = 1 – e^{-(1/500)200} = 1 – e^{-0.4} = 0.330
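The Case Study 13 figure and the memoryless property can both be checked with a few lines of Python. This is a sketch with illustrative function names; the 100-hour and 300-hour values in the memoryless check are arbitrary choices made for the example.

```python
import math


def expon_cdf(x, lam):
    """P(X <= x) for an exponential distribution with failure rate lam."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0


def survival(x, lam):
    """P(X > x): probability the item is still working at time x."""
    return 1.0 - expon_cdf(x, lam)


lam = 1 / 500  # failure rate for a mean life of 500 hours
p_fail_200 = expon_cdf(200, lam)

# Memoryless check: surviving 200 more hours, given 100 hours already
# survived, is as likely as surviving the first 200 hours.
conditional = survival(300, lam) / survival(100, lam)

print(round(p_fail_200, 3), round(conditional, 4), round(survival(200, lam), 4))
```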
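WEIBULL
Main use – reliability and failure analysis
$$f(x) = \frac{\beta}{\alpha}\left(\frac{x-\gamma}{\alpha}\right)^{\beta-1}\exp\left[-\left(\frac{x-\gamma}{\alpha}\right)^{\beta}\right], \quad x \ge \gamma$$
γ = location parameter
α = scale parameter
β = shape parameter
Very generic!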
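To show how generic the three-parameter Weibull form is, the sketch below evaluates its density and confirms that β = 1 with γ = 0 reduces to the exponential density with λ = 1/α. The function name and the parameter values are assumptions made for this example.

```python
import math


def weibull_pdf(x, alpha, beta, gamma=0.0):
    """Three-parameter Weibull density: alpha = scale, beta = shape,
    gamma = location. Returns 0 below the location parameter.
    Note: x == gamma with beta < 1 is not handled (the density is unbounded there)."""
    if x < gamma:
        return 0.0
    u = (x - gamma) / alpha
    return (beta / alpha) * u ** (beta - 1) * math.exp(-(u**beta))


# With beta = 1 and gamma = 0 the Weibull matches the exponential pdf
# lam * exp(-lam * x), where lam = 1 / alpha.
alpha = 500.0
x = 200.0
weibull_value = weibull_pdf(x, alpha, beta=1.0)
exponential_value = (1 / alpha) * math.exp(-x / alpha)

print(weibull_value, exponential_value)
```

Shape values β > 1 give a failure rate that increases with time, and β < 1 a decreasing one, which is why the distribution is popular in reliability work.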
Joint PDF
So far we saw one random variable at a time. However, in practice, we often see situations where more than one variable at a time needs to be studied.
For example, tensile strength (X) and diameter (Y) of a beam are of interest.
Diameter (X) and thickness (Y) of an injection-molded disk are of interest.
Joint PDF (Cont’d)
X and Y are continuous
f(x,y) dx dy = P(x < X < x+dx, y < Y < y+dy) is the probability that the random variable X will take values in (x, x+dx) and Y will take values in (y, y+dy).
f(x,y) ≥ 0 for all x and y and
$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y)\,dx\,dy = 1$$
$$P(a < X < b,\; c < Y < d) = \int_{c}^{d}\int_{a}^{b} f(x,y)\,dx\,dy$$
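A rectangle probability for a joint density can be approximated numerically. The sketch below uses a midpoint rule on a hypothetical density f(x, y) = x + y on the unit square, which is not from the slides and is chosen only because its integrals are easy to verify by hand.

```python
def joint_pdf(x, y):
    """Hypothetical joint density: x + y on the unit square, 0 elsewhere."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return x + y
    return 0.0


def rect_probability(f, a, b, c, d, steps=200):
    """Approximate P(a < X < b, c < Y < d) with a midpoint rule on a grid."""
    hx = (b - a) / steps
    hy = (d - c) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * hx
        for j in range(steps):
            y = c + (j + 0.5) * hy
            total += f(x, y)
    return total * hx * hy


# The density should integrate to 1 over its whole support.
total_mass = rect_probability(joint_pdf, 0.0, 1.0, 0.0, 1.0)

# Probability that both variables fall in the lower-left quarter.
p_quarter = rect_probability(joint_pdf, 0.0, 0.5, 0.0, 0.5)

print(round(total_mass, 6), round(p_quarter, 6))
```

For this particular density the exact answers are 1 and 0.125, so the output gives a quick check on the grid resolution.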
Measures of Joint PDF