Transcript of Lecture 20: Randomness and Monte Carlo
J. Chaudhry
Department of Mathematics and Statistics, University of New Mexico
J. Chaudhry (UNM) CS 357 1 / 41
What we’ll do:
- Random number generators
- Monte Carlo integration
- Monte Carlo simulation

Our testing and our methods often rely on “randomness” and the impact of finite precision:
- Random input
- Random perturbations
- Random sequences
Randomness
- Randomness ≈ unpredictability
- One view: a sequence is random if it has no shorter description
- Physical processes, such as flipping a coin or tossing dice, are deterministic with enough information about the governing equations and initial conditions.
- But even for deterministic systems, sensitivity to the initial conditions can render the behavior practically unpredictable.
- We need random simulation methods.
Quasi-Random Sequences
- For some applications, reasonable uniform coverage of the sample space is more important than “randomness”
- True random samples often exhibit clumping
- Perfectly uniform sampling uses a uniform grid, but does not scale well in high dimensions
- Quasi-random sequences attempt randomness while maintaining coverage
Quasi-Random Sequences
- Quasi-random sequences are not random, but give a random appearance
- By design, the points avoid each other, resulting in no clumping
http://dilbert.com/strips/comic/2001-10-25/
http://www.xkcd.com/221/
Randomness is easy, right?
In May 2008, Debian announced a vulnerability in OpenSSL: the OpenSSL pseudo-random number generator
- the seeding process was compromised (for 2 years)
- only 32,767 possible keys
- seeding was based on the process ID (this is not entropy!)
- all SSL and SSH keys from 9/2006 to 5/2008 had to be regenerated
- all certificates had to be recertified
- Cryptographically secure pseudorandom number generators (CSPRNGs) are necessary for some applications
- Other applications rely less on true randomness
Repeatability
- With unpredictability, true randomness is not repeatable
- ...but lack of repeatability makes testing/debugging difficult
- So we want repeatability, but also independence of the trials
>> rand('seed',1234)
>> rand(10,1)
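The same idea can be sketched in Python (the slides use Matlab; Python's random module is used here for a self-contained example). Seeding fixes the generator's state, so the "random" sequence becomes repeatable:

```python
import random

# Seeding the generator makes the sequence repeatable.
random.seed(1234)
first_run = [random.random() for _ in range(5)]

random.seed(1234)  # reset to the same state
second_run = [random.random() for _ in range(5)]

print(first_run == second_run)  # True: identical sequences
```

Re-seeding with the same value restores the exact state, which is what makes debugging stochastic code tractable.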
Pseudorandom Numbers
Computer algorithms for random number generation are deterministic
- ...but may have long periodicity (a long time until an apparent pattern emerges)
- These sequences are labeled pseudorandom
- Pseudorandom sequences are predictable and reproducible (this is mostly good)
Random Number Generators
Properties of a good random number generator:
- Random pattern: passes statistical tests of randomness
- Long period: long time before repeating
- Efficiency: executes rapidly and with low storage
- Repeatability: the same sequence is generated when using the same initial state
- Portability: the same sequences are generated on different architectures
Random Number Generators
- Early attempts relied on complexity to ensure randomness
- “midsquare” method: square each member of a sequence and take the middle portion of the result as the next member of the sequence
- ...simple methods with a statistical basis are preferable
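The midsquare method can be sketched in a few lines of Python (the digit count and the seed below are illustrative choices, not from the slides):

```python
def midsquare(x, digits=4):
    """Square x and return the middle `digits` digits as the next value."""
    s = str(x * x).zfill(2 * digits)     # pad the square to 2*digits digits
    mid = len(s) // 2
    return int(s[mid - digits // 2 : mid + digits // 2])

# From seed 1234: 1234^2 = 01522756, middle four digits -> 5227
seq = [1234]
for _ in range(5):
    seq.append(midsquare(seq[-1]))
print(seq)
```

Sequences like this quickly fall into short cycles or collapse to zero, which is one reason simple methods with a statistical basis are preferred.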
Linear Congruential Generators (LCGs)
Congruential random number generators are of the form:

x_k = (a x_{k-1} + c) mod m

where a and c are integers given as input.
- x_0 is called the seed
- Integer m − 1 is the largest representable integer
- Quality depends on a and c. The period will be at most m.
Example
Let a = 13, c = 0, m = 31, and x_0 = 1:

1, 13, 14, 27, 10, 6, ...

This is a permutation of the integers 1, ..., 30, so the period is m − 1.
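The example is easy to check with a minimal LCG in Python:

```python
def lcg(a, c, m, seed, n):
    """Generate n values of the sequence x_k = (a*x_{k-1} + c) mod m."""
    xs, x = [], seed
    for _ in range(n):
        x = (a * x + c) % m
        xs.append(x)
    return xs

# a = 13, c = 0, m = 31, x0 = 1 reproduces the sequence from the example
print(lcg(13, 0, 31, 1, 6))  # [13, 14, 27, 10, 6, 16]
```

Running it for 30 steps visits every integer in 1, ..., 30 exactly once before repeating, confirming the period m − 1.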
History (from C. Moler, NCM)
- IBM used the Scientific Subroutine Package (SSP) in the 1960s on its mainframes.
- Its random generator, rnd, used a = 65539, c = 0, and m = 2^31.
- Arithmetic mod 2^31 is done quickly with 32-bit words.
- Multiplication by a = 2^16 + 3 can be done quickly with a shift and a short add.
- Notice that (mod m):

x_{k+2} = 6 x_{k+1} − 9 x_k

...a strong correlation among three successive integers
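The correlation is easy to verify numerically; a quick check in Python with the slide's parameters a = 65539, m = 2^31:

```python
a, m = 65539, 2**31   # the SSP generator's parameters

x = [1]
for _ in range(20):
    x.append(a * x[-1] % m)

# Every three successive values satisfy x_{k+2} = 6 x_{k+1} - 9 x_k (mod m)
print(all((x[k + 2] - 6 * x[k + 1] + 9 * x[k]) % m == 0 for k in range(18)))  # True
```

The identity holds because a^2 = (2^16 + 3)^2 ≡ 6a − 9 (mod 2^31), so triples of outputs fall on a small number of planes.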
History (from C. Moler, NCM)
- Matlab used a = 7^5, c = 0, and m = 2^31 − 1 for a while
- the period is m − 1
- this is no longer sufficient
Linear Congruential Generators (LCGs) – Summary
- sensitive to the choice of a and c
- be careful with the supplied random functions on your system
- the period is at most m
- standard division is necessary if generating floating-point values in [0, 1)
What’s used?
Two popular methods:
1. rand() in Unix uses an LCG with a = 1103515245, c = 12345, m = 2^31.
2. Method of Marsaglia (period ≈ 2^1430).
Initialize x_0, ..., x_3 and c to random values given a seed.

Let s = 2111111111 x_{n−4} + 1492 x_{n−3} + 1776 x_{n−2} + 5115 x_{n−1} + c
Compute x_n = s mod 2^32
c = floor(s / 2^32)
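A sketch of this recurrence in Python (the slides do not specify how the seed fills x_0, ..., x_3 and c; the simple LCG used for that initialization below is an illustrative assumption):

```python
def marsaglia(seed, n):
    """Sketch of the Marsaglia-style recurrence from the lecture."""
    state, v = [], seed
    for _ in range(5):                 # fill x0..x3 and the carry c (assumed scheme)
        v = (1103515245 * v + 12345) % 2**31
        state.append(v)
    x, c = state[:4], state[4]
    out = []
    for _ in range(n):
        s = 2111111111 * x[-4] + 1492 * x[-3] + 1776 * x[-2] + 5115 * x[-1] + c
        xn = s % 2**32                 # new value
        c = s // 2**32                 # new carry
        x = x[1:] + [xn]               # slide the lag window
        out.append(xn)
    return out

sample = marsaglia(42, 5)
print(all(0 <= v < 2**32 for v in sample))  # True
```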
In general, the digits in random numbers are not themselves random... some patterns recur much more often than others.
Fibonacci
- Produce floating-point random numbers directly using differences, sums, or products.
- A typical subtractive generator:

x_k = x_{k−17} − x_{k−5}

with “lags” of 17 and 5.
- Lags must be chosen very carefully
- negative results need fixing
- more storage is needed than for congruential generators
- no division is needed
- very good statistical properties
- long periods, since repetition of a value does not imply a period
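A sketch of the subtractive generator in Python. The slides do not say how the initial 17 values are produced; seeding them with a small LCG here is an illustrative assumption, and negative differences are "fixed" by adding 1:

```python
def lagged_fib(seed, n):
    """Subtractive lagged-Fibonacci sketch: x_k = x_{k-17} - x_{k-5}."""
    state, v = [], seed
    for _ in range(17):                     # assumed seeding of the 17 initial values
        v = (1103515245 * v + 12345) % 2**31
        state.append(v / 2**31)             # floats in [0, 1)
    out = []
    for _ in range(n):
        x = state[-17] - state[-5]
        if x < 0:                           # "negative results need fixing"
            x += 1.0
        state.append(x)
        state.pop(0)                        # keep only the last 17 values
        out.append(x)
    return out

vals = lagged_fib(123, 1000)
print(all(0.0 <= v < 1.0 for v in vals))  # True
```

Note that no division (beyond seeding) and no multiplication appear in the update itself, which is part of the method's appeal.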
Discrete random variables
- random variable x
- values: x_1, ..., x_n (the set of values is called the sample space)
- probabilities p_1, ..., p_n with ∑_{i=1}^{n} p_i = 1

Example: throwing a die (1-based index)
- values: x_1 = 1, x_2 = 2, ..., x_6 = 6
- probabilities p_i = 1/6
Expected value and variance
- expected value: the average value of the variable

E[x] = ∑_{j=1}^{n} x_j p_j

- variance: variation from the average

σ^2[x] = E[(x − E[x])^2] = E[x^2] − E[x]^2

- standard deviation: the square root of the variance, denoted σ[x]

Example: throwing a die
- expected value: E[x] = (1 + 2 + · · · + 6)/6 = 3.5
- variance: σ^2[x] = 35/12 ≈ 2.9167
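The die values can be checked directly in Python:

```python
# Exact expected value and variance for a fair die
values = range(1, 7)
p = 1 / 6

E = sum(v * p for v in values)                  # E[x] = 3.5
var = sum(v * v * p for v in values) - E ** 2   # E[x^2] - E[x]^2 = 35/12

print(E, round(var, 4))  # 3.5 2.9167
```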
Estimating E[x]
To estimate the expected value, choose a set of random values based on the probability distribution and average the results:

E[x] ≈ (1/N) ∑_{j=1}^{N} x_j

A bigger N gives better estimates.

Example: throwing a die
- 3 rolls: 3, 1, 6 → E[x] ≈ (3 + 1 + 6)/3 = 3.33
- 9 rolls: 3, 1, 6, 2, 5, 3, 4, 6, 2 → E[x] ≈ (3 + 1 + 6 + 2 + 5 + 3 + 4 + 6 + 2)/9 ≈ 3.56
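Simulating this in Python shows the estimate tightening as N grows (the sample sizes are illustrative):

```python
import random

# Estimate E[x] for a fair die by averaging N simulated rolls
random.seed(0)
for N in (10, 1000, 100000):
    est = sum(random.randint(1, 6) for _ in range(N)) / N
    print(N, est)   # estimates approach 3.5 as N grows
```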
Law of large numbers
By taking N to ∞, the error between the estimate and the expected value is statistically zero. That is, the estimate will converge to the correct value:

P( E[x] = lim_{N→∞} (1/N) ∑_{i=1}^{N} x_i ) = 1
This is the basic idea behind Monte Carlo estimation
Continuous random variable and density function
- random variable: x
- values: x ∈ [a, b] (the set of values is called the sample space, which in this case is [a, b])
- probability: density function ρ(x) ≥ 0 with ∫_a^b ρ(x) dx = 1
- probability that the variable has a value in (c, d) ⊂ [a, b]: ∫_c^d ρ(x) dx
Uniformly distributed random variable
- ρ(x) is constant
- ∫_a^b ρ(x) dx = 1 means ρ(x) = 1/(b − a)
Continuous extensions to expectation and variance
Expected value:

E[x] = ∫_a^b x ρ(x) dx
E[g(x)] = ∫_a^b g(x) ρ(x) dx

Variance of a random variable and of a function of a random variable:

σ^2[x] = ∫_a^b (x − E[x])^2 ρ(x) dx = E[x^2] − (E[x])^2
σ^2[g(x)] = ∫_a^b (g(x) − E[g(x)])^2 ρ(x) dx = E[g(x)^2] − (E[g(x)])^2

Estimating the expected value of some g(x):

E[g(x)] ≈ (1/N) ∑_{i=1}^{N} g(x_i)
Law of large numbers valid for continuous case too
Sampling over intervals
- Almost all systems provide a uniform random number generator on (0, 1)
- Matlab: rand
- If we need a uniform distribution over (a, b), then we modify x_k on (0, 1) by

(b − a) x_k + a
Matlab: generate 100 uniformly distributed values in (a, b):

r = a + (b-a).*rand(100,1)
- Uniformly distributed integers: randi
- Random permutations: randperm
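The same scaling in Python, where random.random() plays the role of Matlab's rand (the interval (2, 5) is an illustrative choice):

```python
import random

random.seed(0)
a, b = 2.0, 5.0

# Scale uniform (0,1) samples onto (a, b): (b - a)*x + a
r = [(b - a) * random.random() + a for _ in range(100)]
print(all(a <= v < b for v in r))  # True
```

Python also provides random.uniform(a, b) and random.randint for integers, analogous to the Matlab helpers above.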
Non-uniform distributions (Not on Exam)
- Sampling nonuniform distributions is much more difficult
- If the cumulative distribution function is invertible, then we can generate the nonuniform sample from the uniform. Consider the probability density function

ρ(t) = λ e^{−λt},  t > 0

giving the cumulative distribution function

F(x) = ∫_0^x λ e^{−λt} dt = −e^{−λt} |_0^x = 1 − e^{−λx}

Inverting the CDF:

x_k = −log(1 − y_k)/λ

where y_k is uniform in (0, 1).
...not so easy in general
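Inverse-transform sampling of the exponential distribution in Python (λ = 2 is an illustrative choice; the exponential mean is 1/λ, so the sample mean should approach 0.5):

```python
import math
import random

# x_k = -log(1 - y_k)/lambda with y_k uniform in (0, 1)
random.seed(42)
lam = 2.0
N = 200000
samples = [-math.log(1.0 - random.random()) / lam for _ in range(N)]

mean = sum(samples) / N
print(round(mean, 2))   # close to 1/lambda = 0.5
```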
Multidimensional extensions
- difficult domains (complex geometries)
- expected value:

E[g(x)] = ∫_{x∈Ω} g(x) ρ(x) dx,  Ω ⊂ R^d

Density function for the uniform distribution:

∫_Ω ρ dx = 1  ⟹  ρ = 1/V

where V = ∫_Ω 1 dx is the volume of Ω.
Monte Carlo
- Central to PageRank (and many other applications in finance, science, informatics, etc.) is that we randomly process something
- what we want to know is, “on average,” what is likely to happen
- what would happen if we had an infinite number of samples?
- let’s take a look at an integral (a discrete limit, in a sense)
(deterministic) numerical integration
- split the domain into a set of fixed segments
- sum the function values weighted by the sizes of the segments (Riemann!)
Monte Carlo Integration
We have, for a random sequence x_1, ..., x_n uniformly distributed in (0, 1):

∫_0^1 f(x) dx ≈ (1/n) ∑_{i=1}^{n} f(x_i)

n=100;
x=rand(n,1);
a=f(x);
s=sum(a)/n;
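A Python equivalent with a concrete integrand (f(x) = x^2 is an illustrative choice; its integral over [0, 1] is 1/3):

```python
import random

random.seed(1)
n = 100000
# Average f at n uniform points in (0, 1)
s = sum(random.random() ** 2 for _ in range(n)) / n
print(round(s, 2))   # close to 1/3 ≈ 0.33
```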
Monte Carlo over random domains
We have, for a uniform random sequence x_1, ..., x_n in Ω:

I = ∫_Ω f(x) dx ≈ Q_n = V (1/n) ∑_{i=1}^{n} f(x_i)

where V is the “volume,” given by

V = ∫_Ω dx

The law of large numbers guarantees that Q_n → I as n → ∞.

Why? Recall that the density for the uniform distribution is ρ = 1/V. By the law of large numbers,

(1/n) ∑_{i=1}^{n} f(x_i) ≈ E[f(x)] = ∫_Ω ρ f(x) dx = (1/V) ∫_Ω f(x) dx
Monte Carlo over random domains
Works in any dimension!

One dimension:

∫_a^b f(x) dx ≈ (b − a) (1/n) ∑_{i=1}^{n} f(x_i)

where x_1, ..., x_n are uniformly distributed in (a, b)
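For example, integrating sin(x) over (0, π), whose exact value is 2 (the integrand is an illustrative choice):

```python
import math
import random

random.seed(3)
a, b = 0.0, math.pi
n = 200000
# (b - a) times the average of f at uniform points in (a, b)
est = (b - a) * sum(math.sin(a + (b - a) * random.random()) for _ in range(n)) / n
print(round(est, 2))  # close to 2
```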
2d example: computing π
Use the unit square [0, 1]^2 with a quarter circle:

f(x, y) = 1 if (x, y) is in the circle, 0 otherwise

A_{quarter-circle} = ∫_0^1 ∫_0^1 f(x, y) dx dy = π/4
2d example: computing π
Estimate the area of the quarter circle by randomly evaluating f(x, y):

A_{quarter-circle} ≈ (1/N) ∑_{i=1}^{N} f(x_i, y_i)
2d example: computing π
By definition,

A_{quarter-circle} = π/4

so

π ≈ (4/N) ∑_{i=1}^{N} f(x_i, y_i)
2d example: computing π algorithm
input N
call rand in 2d
for i = 1:N
    sum = sum + f(x_i, y_i)
end
sum = 4*sum/N
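The algorithm above in runnable Python (N is an illustrative choice):

```python
import math
import random

random.seed(7)
N = 400000
inside = 0
for _ in range(N):
    x, y = random.random(), random.random()
    if x * x + y * y <= 1.0:       # f(x, y) = 1 inside the quarter circle
        inside += 1

pi_est = 4.0 * inside / N
print(round(pi_est, 2), round(math.pi, 2))  # estimate vs 3.14
```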
2d example: computing π algorithm
- The expected value of the error is O(σ/√N)
- convergence does not depend on dimension
- deterministic integration is very hard in higher dimensions
- deterministic integration is very hard for complicated domains
- Trapezoid rule in dimension d: O(1/N^{2/d})
Error in Monte Carlo Estimation
The distribution of an average is close to being normal, even when thedistribution from which the average is computed is not normal.
What?
- Let x_1, ..., x_n be independent random variables from any PDF
- Each x_i has expected value μ and standard deviation σ
- Consider the average A_n = (x_1 + · · · + x_n)/n
- The expected value of A_n is μ and its standard deviation is σ/√n
- This follows from the Central Limit Theorem

What? The sample mean has an error of σ/√n.
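A quick numerical check of the σ/√n behavior, using uniform(0, 1) samples (σ = √(1/12); the values of n and the number of averages are illustrative):

```python
import random
import statistics

# Std dev of the average of n uniform(0,1) samples should be sigma/sqrt(n)
random.seed(5)
n, m = 100, 5000          # n samples per average, m averages
avgs = [sum(random.random() for _ in range(n)) / n for _ in range(m)]

observed = statistics.pstdev(avgs)
predicted = (1 / 12) ** 0.5 / n ** 0.5
print(round(observed, 3), round(predicted, 3))  # the two should nearly match
```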
Stochastic Simulation
- Stochastic simulation mimics physical behavior through random simulation
- ...also known as the Monte Carlo method (there is no single MC method!)

Usefulness:
- nondeterministic processes
- deterministic models that are too complicated to model directly
- deterministic models with high dimensionality
Stochastic Simulation
Two requirements for MC:
- knowing which probability distributions are needed
- generating a sufficient quantity of random numbers

- The probability distribution depends on the problem (theoretical or empirical evidence)
- The probability distribution can be approximated well by simulating a large number of trials
Stochastic Simulation: The Monty Hall Problem
Question
You are on a game show. There are three doors that you have the option of opening. There is a prize behind one of the doors, and there are goats behind the other two. You select one of the doors. One of the remaining two doors is opened to reveal a goat. You are then given the option of opening your door or the other remaining unopened door. Should you switch? Determine the probability that you will win the prize if you switch.

Answer: Yes, always switch. The probability of success is 2/3. This can be shown using conditional probability.
Stochastic Simulation: The Monty Hall Problem
Solution via random simulation:
input N
wins = 0
for i = 1:N
    generate a random permutation of goats and prize
    you randomly choose a door
    if your door has a goat, the host opens the other door with a goat
    else if your door has the prize, the host randomly opens one of the doors with a goat
    end
    you switch your choice
    if the switched door has the prize
        wins = wins + 1
    end
end
probability of success = wins/N
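The pseudocode above translates directly into Python (N is an illustrative choice):

```python
import random

# Monte Carlo estimate of the Monty Hall switching strategy
random.seed(11)
N = 100000
wins = 0
for _ in range(N):
    doors = [1, 0, 0]                 # 1 = prize, 0 = goat
    random.shuffle(doors)             # random permutation of goats and prize
    choice = random.randrange(3)      # player picks a door at random
    # host opens a goat door that is not the player's choice
    goats = [i for i in range(3) if i != choice and doors[i] == 0]
    opened = random.choice(goats)
    # player switches to the remaining unopened door
    switched = ({0, 1, 2} - {choice, opened}).pop()
    wins += doors[switched]

print(round(wins / N, 2))   # close to 2/3
```

The estimate settles near 2/3, matching the conditional-probability argument.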