Random Numbers and Simulation
Generating truly random numbers is not possible
• Programs have been developed to generate pseudo-random numbers
• Values are generated from deterministic algorithms
© Fall 2011 John Grego and the University of South Carolina
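
To make "deterministic algorithm" concrete, here is a minimal Python sketch of a linear congruential generator (the slides themselves refer to R). The function name `lcg` and the constants `a`, `c`, `m` are illustrative assumptions, not the generator R uses.

```python
def lcg(seed, n, a=1103515245, c=12345, m=2**31):
    """Return n pseudo-random values in [0, 1) from a linear congruential generator.

    The constants a, c, m are illustrative choices, not the generator used by R.
    """
    values = []
    state = seed
    for _ in range(n):
        state = (a * state + c) % m   # deterministic update rule
        values.append(state / m)      # scale the integer state to [0, 1)
    return values


# The same seed always reproduces the same "random" sequence.
first = lcg(seed=42, n=5)
second = lcg(seed=42, n=5)
print(first == second)  # True
```

Because the sequence is fully determined by the seed, fixing the seed makes a simulation reproducible.
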
Random Numbers
Deviates from a good pseudo-random generator can pass standard statistical tests for randomness
They appear to be independent and identically distributed
Random number generators for common distributions are available in R
Special techniques (STAT 740) may be needed as well

Monte Carlo Simulation
Some common uses of simulation
• Modeling stochastic behavior
• Calculating definite integrals
• Approximating the sampling distribution of a statistic (e.g., maximum of a random sample)

Modeling Stochastic Behavior
Buffon’s needle
Random Walk
Observe X_1, X_2, …, where p = P(X_i = 1) = P(X_i = -1) = .5, and study S_1, S_2, …, where
\[ S_i = \sum_{j=1}^{i} X_j \]
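
A minimal Python sketch of the random walk (the slides themselves refer to R): generate the ±1 steps X_j and accumulate them into the partial sums S_i. The function name `random_walk` and the seed are hypothetical choices for illustration.

```python
import random
from itertools import accumulate


def random_walk(n_steps, p=0.5, rng=random):
    """Return the partial sums S_1, ..., S_n of +1/-1 steps with P(step = +1) = p."""
    steps = [1 if rng.random() < p else -1 for _ in range(n_steps)]
    return list(accumulate(steps))


rng = random.Random(2011)  # arbitrary seed so the run is reproducible
path = random_walk(20, rng=rng)
print(path)
```
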

Modeling Stochastic Behavior
This is also called Gambler’s ruin; each X_i represents a $1 bet with a return of $2 for a win and $0 for a loss.

Gambler’s Ruin
The properties of a fair game (p = .5) are a lot more interesting than the properties of an unfair game (p ≠ .5)
Some properties of this process are easy to anticipate (e.g., E(S_i) = 0 in a fair game)

Gambler’s Ruin
Some properties are difficult to anticipate, and simulation can help in studying them:
• Expected number of returns to 0
• Expected length of a winning streak
• Probability of going broke given an initial bank
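
Two of these quantities can be estimated with a short Python sketch (the slides themselves refer to R). The slides do not give a stopping rule for "going broke", so the sketch assumes play stops when the bank reaches 0 or a target `goal`; the function names, the horizon of 100 bets, and the number of replications are likewise illustrative assumptions.

```python
import random


def count_returns_to_zero(n_steps, p=0.5, rng=random):
    """Count how often the walk S_i revisits 0 during n_steps bets."""
    position, returns = 0, 0
    for _ in range(n_steps):
        position += 1 if rng.random() < p else -1
        if position == 0:
            returns += 1
    return returns


def goes_broke(bank, goal, p=0.5, rng=random):
    """Play $1 bets until the bank hits 0 (ruin) or reaches goal; True means ruin.

    Stopping at `goal` is an assumption made here so that every game ends.
    """
    while 0 < bank < goal:
        bank += 1 if rng.random() < p else -1
    return bank == 0


def estimate(sim, reps):
    """Average the result of sim() over reps independent replications."""
    return sum(sim() for _ in range(reps)) / reps


rng = random.Random(540)  # arbitrary seed
print(estimate(lambda: count_returns_to_zero(100, rng=rng), 2000))
print(estimate(lambda: goes_broke(5, 10, rng=rng), 2000))
```

For a fair game the ruin probability under this stopping rule is known to be 1 - bank/goal, which gives a check on the simulated value.
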

Calculating Definite Integrals
In statistics, we often have to calculate difficult definite integrals (posterior distributions, expected values)
\[ I = \int_a^b h(x)\,dx \]
(here, x could be multidimensional)

Calculating Definite Integrals
Example 1
\[ I_1 = \int_0^1 \frac{4}{1+x^2}\,dx \]
Example 2
\[ I_2 = \int_0^1 \int_0^1 (4 - x_1^2 - 2x_2^2)\,dx_2\,dx_1 \]

Hit-or-Miss Monte Carlo
Example 1
\[ h(x) = \frac{4}{1+x^2} \]
\[ \int_0^1 \frac{4}{1+x^2}\,dx = 4(\arctan(1) - \arctan(0)) = 4 \cdot \frac{\pi}{4} = \pi \]
Determine c such that c ≥ h(x) across entire region of interest (here, c = 4)

Hit-or-Miss Monte Carlo
Generate n random uniform (X_i, Y_i) pairs, X_i’s from U[a,b] (here, U[0,1]) and Y_i’s from U[0,c] (here, U[0,4])
Count the number of times (call this m) that Y_i is less than h(X_i)
Then I_1 ≈ c(b-a)m/n
• I.e., (height)(width)(proportion under curve)
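
The three steps above can be sketched in Python as follows (the slides themselves refer to R). The function name `hit_or_miss`, the sample size, and the seed are illustrative assumptions; the sketch also assumes h is nonnegative on [a, b].

```python
import math
import random


def hit_or_miss(h, a, b, c, n, rng=random):
    """Estimate the integral of h over [a, b], assuming 0 <= h(x) <= c there."""
    hits = 0
    for _ in range(n):
        x = rng.uniform(a, b)   # X_i ~ U[a, b]
        y = rng.uniform(0, c)   # Y_i ~ U[0, c]
        if y < h(x):
            hits += 1           # the point fell under the curve
    return c * (b - a) * hits / n   # (height)(width)(proportion under curve)


def h(x):
    return 4 / (1 + x**2)


approx = hit_or_miss(h, a=0, b=1, c=4, n=100_000, rng=random.Random(1))
print(approx, math.pi)  # the estimate should be close to pi
```
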

Classical Monte Carlo Integration
Take n random uniform values U_1, …, U_n over [a,b] and estimate
\[ I = \int_a^b h(x)\,dx \]
using
\[ \hat{I} = \frac{b-a}{n} \sum_{i=1}^{n} h(U_i) \]
This method seems straightforward, but is actually more efficient than Hit-or-Miss Monte Carlo
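
A minimal Python sketch of this estimator, applied to Example 1 (the slides themselves refer to R). The function name `classical_mc`, the sample size, and the seed are illustrative assumptions.

```python
import math
import random


def classical_mc(h, a, b, n, rng=random):
    """Estimate the integral of h over [a, b] as (b - a) times the mean of h(U_i)."""
    total = sum(h(rng.uniform(a, b)) for _ in range(n))   # U_i ~ U[a, b]
    return (b - a) * total / n


def h(x):
    return 4 / (1 + x**2)


approx = classical_mc(h, a=0, b=1, n=100_000, rng=random.Random(1))
print(approx, math.pi)  # the estimate should be close to pi
```

Unlike hit-or-miss, this version needs no bounding constant c and uses one uniform draw per evaluation instead of two.
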

Expected Value of a Function of a Random Variable
Suppose X is a random variable with density f. Find E[h(X)] for some function h, e.g.,
\[ E[X^2], \quad E[X], \quad E[\sin X] \]

Expected Value of a Function of a Random Variable
\[ E[h(X)] = \int h(x)\,f(x)\,dx \]
For n random values X_1, X_2, …, X_n from the distribution of X (i.e., with density f),
\[ E[h(X)] \approx \frac{1}{n} \sum_{i=1}^{n} h(X_i) \]

Examples
Example 3: If X is a random variable with a N(10,1) distribution, find E(X^2)
Example 4: If Y is a random variable with a Beta(5,1) distribution, find E(-ln Y)
There are more advanced methods of integration using simulation (Importance Sampling)
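
Examples 3 and 4 can be approximated by averaging h over simulated draws. The following is a minimal Python sketch (the slides themselves refer to R); the helper name `mc_expectation`, the sample size, and the seed are illustrative assumptions.

```python
import math
import random


def mc_expectation(h, draw, n):
    """Approximate E[h(X)] by averaging h over n values produced by draw()."""
    return sum(h(draw()) for _ in range(n)) / n


rng = random.Random(15)  # arbitrary seed

# Example 3: X ~ N(10, 1), h(x) = x^2
ex3 = mc_expectation(lambda x: x**2, lambda: rng.gauss(10, 1), 100_000)

# Example 4: Y ~ Beta(5, 1), h(y) = -ln(y)
ex4 = mc_expectation(lambda y: -math.log(y), lambda: rng.betavariate(5, 1), 100_000)

print(ex3)  # exact value for comparison: Var(X) + E(X)^2 = 1 + 100 = 101
print(ex4)  # exact value for comparison: 1/5, since -ln(Y) is exponential with rate 5
```
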

Integration
integrate() performs numerical integration for functions of a single variable (not using simulation techniques)
adapt() in the adapt package performs multivariate numerical integration

Approximating the Sampling Distribution of a Statistic
To perform inference (CI’s, hypothesis tests) based on sample statistics, we need to know the sampling distribution of the statistics, at least up to an approximation
Example: X_1, X_2, …, X_n ~ iid N(μ, σ²).
\[ T = \frac{\bar{X} - \mu}{s/\sqrt{n}} \quad \text{has a } t(n-1) \text{ distribution} \]

Approximating the Sampling Distribution of a Statistic
What if the data’s distribution is not known?
• Large sample: Central Limit Theorem
• Small sample: Normal theory or nonparametric procedures based on permutation distributions

Approximating the Sampling Distribution of a Statistic
If the population distribution is known, we can approximate the sampling distribution with simulation.
• Repeatedly (m times) generate random samples of size n from the population distribution
• Calculate a statistic (say, S) each time
• The empirical (observed) distribution of S-values approximates the true distribution of S

Example
X_1, X_2, X_3, X_4 ~ Expon(1)
What is the sampling distribution of:
\[ \bar{X} \quad \text{(the mean)} \]
\[ \frac{\max(X) + \min(X)}{2} \quad \text{(the midrange)} \]
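
A minimal Python sketch of this simulation (the slides themselves refer to R): draw m samples of size 4 from Expon(1), compute each statistic, and summarize the resulting empirical distributions. The names `simulate_statistic` and `midrange`, the value m = 10,000, and the seed are illustrative assumptions.

```python
import random
import statistics


def simulate_statistic(stat, draw_sample, m):
    """Return m simulated values of stat, one per freshly drawn sample."""
    return [stat(draw_sample()) for _ in range(m)]


def midrange(xs):
    """Average of the largest and smallest observation."""
    return (max(xs) + min(xs)) / 2


rng = random.Random(20)  # arbitrary seed


def draw_sample():
    """One random sample of size n = 4 from Expon(1)."""
    return [rng.expovariate(1.0) for _ in range(4)]


means = simulate_statistic(statistics.mean, draw_sample, 10_000)
midranges = simulate_statistic(midrange, draw_sample, 10_000)

# Summarize each empirical sampling distribution by its mean, SD, and quartiles.
for name, values in [("mean", means), ("midrange", midranges)]:
    quartiles = statistics.quantiles(values, n=4)
    print(name, statistics.mean(values), statistics.stdev(values), quartiles)
```

A histogram of `means` or `midranges` would show the shape of each sampling distribution; the numeric summaries above are a plain-text stand-in.
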