Random Number Generation Using Low Discrepancy Points
Donald Mango, FCAS, MAAA
Centre Solutions
June 7, 1999
1999 CAS/CARe Reinsurance Seminar
Baltimore, Maryland
What is Discrepancy?
• Large # of points inside a unit hypercube: n-dimensional hypercube of length 1 on each side
• For any “sub-volume” of the hypercube,
Discrepancy = the difference between
the proportion of points inside the volume and
the volume itself
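A minimal sketch of this definition, assuming the sub-volume is a box anchored at the origin (the function name `local_discrepancy` and the sample points are illustrative, not taken from the paper):

```python
import random


def local_discrepancy(points, corner):
    """Absolute difference between the share of points falling inside the
    box [0, corner) and the volume of that box."""
    inside = sum(all(x < c for x, c in zip(p, corner)) for p in points)
    volume = 1.0
    for c in corner:
        volume *= c
    return abs(inside / len(points) - volume)


# Pseudo-random points in the unit square, for illustration only.
rng = random.Random(1999)
points = [(rng.random(), rng.random()) for _ in range(1000)]
print(local_discrepancy(points, (0.5, 0.5)))
```

A point set whose value stays small for every choice of box is what the slides call low discrepancy.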
Low Discrepancy Point Generator:
• Method to generate a set of points which fills out a given n-dimensional unit hypercube, with as little discrepancy as possible
• Attempt to be systematic and efficient in filling a space, given the number of points
• My paper discusses “Faure” Points, just one of many alternatives
• Faure method relies on prime numbers
Other Low Discrepancy Point Generators:
• Named after number theorists: Sobol’, Niederreiter, Halton, Hammersley, ...
• More advanced methods use “irreducible polynomials” -- polynomial equivalents of prime numbers (cannot be factored)
• More complex algorithms
• Less flexible than Faure
Linear Congruential Generator:
• X_{n+1} = (a X_n + c) mod m
• Used in spreadsheets -- RAND() in Excel, @RAND in Lotus
• Sequential
• Cyclical, with a long cycle length or “period”
• “Randomized” in spreadsheets by using a random seed value ( X0 ) = the system clock
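The recurrence can be sketched as below. The default constants are a widely published illustrative choice, not the ones used by Excel or Lotus, and the small `a=5, c=3, m=16` case is a toy example chosen only to make the cycle visible.

```python
from itertools import islice


def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Yield uniform values in [0, 1) from X_{n+1} = (a * X_n + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x / m


# Toy parameters: the integer state cycles through all 16 residues
# before repeating, which is the "period" mentioned above.
toy = list(islice(lcg(seed=7, a=5, c=3, m=16), 16))
print(len(set(toy)))

# Same seed, same stream: the generator is fully sequential and deterministic.
print(list(islice(lcg(seed=42), 3)) == list(islice(lcg(seed=42), 3)))
```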
LDPMAKER Excel 97 Workbook:
• Available in the 1999 Spring Forum section of the CAS Website:
www.casact.org/pubs/forum/99spforum/99spftoc.htm
• Includes both:
• A spreadsheet-only calculation (recalc-driven), and
• A Visual Basic for Applications (VBA) macro-driven generator (run with a button)
LDPMAKER Excel 97 Workbook:
• “Example” sheet is spreadsheet-only calculation
• Demonstrates formulas
• Not very flexible
Example: 4 Dimensions, 24 Iterations
• Dimension #1:
• First, convert each iteration number N to base Prime (= 5)
• Iteration 1 = 01 (base 5)
Iteration 10 = 20 (base 5)
• F(N, 1) = Faure point (Iteration N, Dimension 1)
F(1,1) = 0/5^2 + 1/5 = 0.20
F(10,1) = 2/5^2 + 0/5 = 0.08
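The Dimension #1 calculation can be sketched as follows. The helper names are hypothetical, and digits are stored least significant first so that the last base-5 digit is divided by 5, the next by 5^2, and so on.

```python
def to_base_digits(n, base):
    """Digits of n in the given base, least significant digit first."""
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)
        digits.append(remainder)
    return digits


def radical_inverse(digits, base):
    """Mirror the digits around the radix point: digit j is divided by base^(j+1)."""
    return sum(d / base ** (j + 1) for j, d in enumerate(digits))


for n in (1, 10):
    digits = to_base_digits(n, 5)
    print(n, digits, radical_inverse(digits, 5))
```

Iterations 1 and 10 reproduce the 0.20 and 0.08 shown on the slide.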
Example: 4 Dimensions, 24 Iterations
• Dimension #2:
• Start with the base Prime digits from Dimension #1 and “shuffle” them
• Using combinations, sum of digits and MOD operator
• First digit in Dimension #2 = [ Sum (first digit, second digit) from Dimension #1 ] MOD Prime
• Dimension #1, Iteration 10 = 20 (base 5)
Dimension #2, Iteration 10 = 22 (base 5)
• Formula for F(N,2) is the same
Example: 4 Dimensions, 24 Iterations
• Dimensions #3 and higher:
• Start with the base Prime digits from the previous dimension and “shuffle” them
• Formula for F(N,3) ... is the same
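A sketch of the digit shuffle across dimensions, assuming the standard Faure construction in which each new digit is a binomial-coefficient-weighted sum of the previous dimension's digits, taken MOD Prime. The slides only say "combinations, sum of digits and MOD operator", so whether LDPMAKER is coded exactly this way is an assumption; all function names are hypothetical.

```python
from math import comb


def to_base_digits(n, base):
    """Digits of n in the given base, least significant digit first."""
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)
        digits.append(remainder)
    return digits


def radical_inverse(digits, base):
    """Digit j (least significant first) is divided by base^(j+1)."""
    return sum(d / base ** (j + 1) for j, d in enumerate(digits))


def shuffle_digits(digits, prime):
    """Digits for the next dimension: b_j = sum over i >= j of C(i, j) * a_i, mod prime."""
    r = len(digits)
    return [
        sum(comb(i, j) * digits[i] for i in range(j, r)) % prime
        for j in range(r)
    ]


def faure_point(n, dims, prime):
    """Coordinates of iteration n in each of `dims` dimensions."""
    digits = to_base_digits(n, prime)
    point = []
    for _ in range(dims):
        point.append(radical_inverse(digits, prime))
        digits = shuffle_digits(digits, prime)
    return point


print(shuffle_digits(to_base_digits(10, 5), 5))
print(faure_point(10, dims=4, prime=5))
```

For iteration 10 the shuffled digits are 2 and 2, which matches the 22 (base 5) shown for Dimension #2.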
Loops in the Faure Algorithm:
• Fills out the space in ever-larger loops of ever-smaller spacing
• Fills out the space sequentially
• There MAY be an issue with ending the iterations in the middle of one of these loops
• Examples later in the test results...
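The iteration counts used in the tests (24 for Prime 5; 728 and 2,186 for Prime 3; 342 and 2,400 for Prime 7) are each one less than a power of the prime. The sketch below lists such counts, on the assumption that this is where the loops end; the paper's own discussion of loop boundaries is the authority, and `loop_boundaries` is a hypothetical helper.

```python
def loop_boundaries(prime, max_iterations):
    """Iteration counts of the form prime**k - 1 that do not exceed max_iterations."""
    bounds = []
    power = prime
    while power - 1 <= max_iterations:
        bounds.append(power - 1)
        power *= prime
    return bounds


for prime in (3, 5, 7):
    print(prime, loop_boundaries(prime, 2500))
```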
Visual Basic for Applications (VBA) Version:
• VBA = real programming language
• Recursive algorithm using “dynamic arrays” - arrays which are dimensioned (sized) at run-time
• Generalization of spreadsheet-only calculations
• FAST
Performance Test #1: Sum of Limited Paretos
Test # / Pareto #   B        Q      Policy Limit   Limited Expected Value
1 / 1               10,000   1.10   100,000        21,321
1 / 2               15,000   1.30   250,000        28,874
Test #1 Theoretical Result                         50,194
2 / 1               10,000   1.10   50,000         16,404
2 / 2               15,000   1.30   25,000         12,745
2 / 3               25,000   1.20   40,000         21,744
2 / 4               12,500   1.40   50,000         14,834
2 / 5               30,000   2.00   25,000         13,636
Test #2 Theoretical Result                         79,364
Table 2 (from Paper) - Pareto Parameters
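The limited expected values in Table 2 are consistent with a two-parameter Pareto whose survival function is (B / (B + x))^Q. That parameterization is inferred from the figures rather than stated on the slides, so treat the sketch below as an assumption; `pareto_lev` is a hypothetical helper and requires Q different from 1.

```python
def pareto_lev(b, q, limit):
    """Limited expected value E[min(X, limit)] for survival (b / (b + x)) ** q, q != 1."""
    return b / (q - 1) * (1 - (b / (b + limit)) ** (q - 1))


# (B, Q, policy limit) rows from Table 2.
test_1 = [(10_000, 1.10, 100_000), (15_000, 1.30, 250_000)]
test_2 = [
    (10_000, 1.10, 50_000),
    (15_000, 1.30, 25_000),
    (25_000, 1.20, 40_000),
    (12_500, 1.40, 50_000),
    (30_000, 2.00, 25_000),
]

for label, rows in (("Test #1", test_1), ("Test #2", test_2)):
    total = sum(pareto_lev(*row) for row in rows)
    print(label, round(total))  # compare with the theoretical results in Table 2
```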
Performance Test #1: Sum of Limited Paretos
# of Iterations   LDP Value   LDP % Error   RAND() Value   RAND() % Error
250               49,170      -2.04%        47,573         -5.22%
728               50,022      -0.34%        50,267         0.15%
1,000             49,769      -0.85%        49,640         -1.10%
1,500             49,903      -0.58%        51,307         2.22%
2,186             50,137      -0.11%        50,737         1.08%
Table 3: Sum of 2 Limited Paretos
Performance Test #1: Sum of Limited Paretos
Table 4: Sum of 5 Limited Paretos
# of Iterations   LDP Value   LDP % Error   RAND() Value   RAND() % Error
342               79,319      -0.06%        80,179         1.03%
1,000             79,201      -0.21%        78,837         -0.66%
1,500             79,206      -0.20%        79,088         -0.35%
2,000             79,280      -0.11%        79,049         -0.40%
2,400             79,358      -0.01%        79,154         -0.27%
Performance Test #2: Sum of Poissons
Table 5: Sum of 2 Poissons (λ = 8)
# of Iterations   LDP % Error   RAND() % Error
250               -0.42%        1.30%
728               -0.03%        0.64%
1,000             -0.22%        0.23%
2,000             -0.09%        -0.08%
2,186             -0.01%        0.17%
Performance Test #2: Sum of Poissons
Table 6: Sum of 5 Poissons (λ = 8)
# of Iterations   LDP % Error   RAND() % Error
342               -0.24%        0.78%
1,000             -0.20%        0.59%
2,000             -0.11%        -0.22%
2,400             -0.04%        -0.23%
Performance Test #3: Low Frequency Events
Pareto #   B        Q      Policy Limit   Limited Expected Value
1          10,000   1.30   50,000         13,860
Test #1 Theoretical Result                693
2          25,000   1.60   50,000         20,113
3          5,000    1.10   50,000         10,660
Test #2 Theoretical Result                2,232
Table 7 - Pareto Parameters used for Severity
Performance Test #3: Low Frequency Events
Table 8: One Event, 5% Prob of Occurrence
# of Iterations   LDP Value   LDP % Error   RAND() Value   RAND() % Error
250               563         -18.82%       1,009          45.60%
728               615         -11.19%       657            -5.23%
1,000             670         -3.27%        569            -17.93%
1,500             667         -3.81%        613            -11.58%
2,186             690         -0.50%        662            -4.45%
Performance Test #3: Low Frequency Events
Table 9: Two Events, each with 5% Prob of Occurrence
# of Iterations   LDP Value   LDP % Error   RAND() Value   RAND() % Error
342               2,199       -1.46%        3,175          42.26%
1,000             2,251       0.86%         2,456          10.04%
1,500             2,221       -0.49%        2,295          2.83%
2,400             2,204       -1.22%        2,348          5.20%
Performance Test #4: 99th Percentile of Sum of Normals
Table 10 - Normal Parameters
Test # / Normal #   Mean    StdDev   99th Percentile
1 / 1               2,000   750      -
1 / 2               1,000   500      -
1 Combined          3,000   901.4    5,097
2 / 1               1,000   300      -
2 / 2               1,000   800      -
2 / 3               500     300      -
2 / 4               750     600      -
2 / 5               2,000   100      -
2 Combined          5,250   1090.9   7,788
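The "Combined" rows can be checked with a short sketch, assuming the normals are independent so that means and variances add (`sum_of_normals` is a hypothetical helper):

```python
from math import sqrt
from statistics import NormalDist


def sum_of_normals(params):
    """Distribution of a sum of independent normals given (mean, std dev) pairs."""
    mean = sum(m for m, _ in params)
    std_dev = sqrt(sum(s * s for _, s in params))
    return NormalDist(mean, std_dev)


test_1 = sum_of_normals([(2000, 750), (1000, 500)])
test_2 = sum_of_normals(
    [(1000, 300), (1000, 800), (500, 300), (750, 600), (2000, 100)]
)

for label, dist in (("Test 1", test_1), ("Test 2", test_2)):
    # Mean, standard deviation and 99th percentile; compare with Table 10.
    print(label, dist.mean, round(dist.stdev, 1), round(dist.inv_cdf(0.99)))
```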
Performance Test #4: 99th Percentile of Sum of Normals
Table 11 - 99th Pctle of Sum of 2 Normals
# of Iterations   LDP Value   LDP % Error   RAND() Value   RAND() % Error
250               5,084       -0.25%        4,800          -5.82%
728               5,036       -1.19%        4,898          -3.91%
1,000             4,995       -2.00%        4,934          -3.19%
1,500             5,047       -0.98%        4,989          -2.12%
2,186             5,070       -0.52%        4,967          -2.55%
Performance Test #4: 99th Percentile of Sum of Normals
Table 12 - 99th Pctle of Sum of 5 Normals
# of Iterations   LDP Value   LDP % Error   RAND() Value   RAND() % Error
342               7,661       -1.63%        7,524          -3.38%
1,000             7,808       0.26%         7,653          -1.73%
1,500             7,808       0.26%         7,650          -1.76%
2,400             7,804       0.21%         7,703          -1.09%
Performance Test #5: Mixed Bag
• Sum of 5 each from:
• LogNormal
• Pareto
• Uniform
• Normal
• Testing variability of estimates over 10 runs
Performance Test #5: Mixed Bag
# of Iterations   LDP Average % Error   LDP Std Dev of % Error   Rand Average % Error   Rand Std Dev of % Error
250               -10.39%               0.33%                    -0.36%                 5.51%
500               -2.28%                0.71%                    -3.03%                 7.79%
1,000             -0.47%                1.36%                    -0.76%                 4.39%
1,500             -0.41%                0.69%                    -0.67%                 4.62%
2,000             -0.41%                0.62%                    -1.40%                 4.01%
3,000             -0.72%                0.47%                    -1.17%                 2.79%
Table 14 - Avg % Error and Std Dev of % Error over 10 runs
Possible Concerns in Using LDPs
• Unused Dimensions:
• Example: modeling Excess Claims
• # of Excess claims between 0 and 30
• requires 30 dimensions
• If # claims < 30, are the “used” dimensions still filled out with low discrepancy?
• Dr. Tom?
Possible Concerns in Using LDPs
• Time Series:
• Example: Probability of 2 consecutive years of loss ratio exceeding 75%
• How many dimensions is this problem?
• Can’t use a single dimension of LDPs, because they are sequentially dependent
• Need to know “over how many years”, then set dimensions
Possible Concerns in Using LDPs
• Correlation:
• If two variables are
• 100% correlated ==> 1 dimension
• 0% correlated ==> 2 dimensions
• x% correlated ==> ? dimensions
• Is promise of “low discrepancy” still fulfilled?
• How to implement?
Possible Concerns in Using LDPs
• Loop Boundaries:
• Faure algorithm fills out space sequentially in ever-expanding loops of ever-finer granularity
• If iteration count does not finish on a loop boundary (depends on Prime), there may be potential bias...
• See Appendix B of paper