Random Number Generation Using Low Discrepancy Points Donald Mango, FCAS, MAAA Centre Solutions June...

Random Number Generation Using Low Discrepancy Points

Donald Mango, FCAS, MAAA

Centre Solutions

June 7, 1999

1999 CAS/CARe Reinsurance Seminar

Baltimore, Maryland

What is Discrepancy?

• Large # of points inside a unit hypercube: an n-dimensional hypercube of length 1 on each side

• For any “sub-volume” of the hypercube,

Discrepancy = the difference between

the proportion of points inside the volume and

the volume itself
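A minimal sketch of the definition above, under the simplifying assumption that the “sub-volume” is a box anchored at the origin (the slide allows any sub-volume). The function name `local_discrepancy` and the sample grid are illustrative only.

```python
def local_discrepancy(points, corner):
    """Proportion of points inside the box [0, corner) minus the box's volume."""
    inside = sum(all(x < c for x, c in zip(p, corner)) for p in points)
    volume = 1.0
    for c in corner:
        volume *= c
    return inside / len(points) - volume


# Four points on a regular 2 x 2 grid in the unit square (illustrative data).
grid = [(0.25, 0.25), (0.25, 0.75), (0.75, 0.25), (0.75, 0.75)]

print(local_discrepancy(grid, (0.5, 0.5)))  # 1 of 4 points vs. volume 0.25
print(local_discrepancy(grid, (0.3, 0.3)))  # 1 of 4 points vs. volume 0.09
```

The second box holds a quarter of the points but only 9% of the volume, so the grid has visible discrepancy there even though it looks evenly spread.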

Low Discrepancy Point Generator:

• Method to generate a set of points which fills out a given n-dimensional unit hypercube, with as little discrepancy as possible

• Attempt to be systematic and efficient in filling a space, given the number of points

• My paper discusses “Faure” Points, just one of many alternatives

• Faure method relies on prime numbers

Other Low Discrepancy Point Generators:

• Named after number theorists: Sobol’, Niederreiter, Halton, Hammersley, ...

• More advanced methods use “irreducible polynomials” -- polynomial equivalents of prime numbers (cannot be factored)

• More complex algorithms

• Less flexible than Faure

Linear Congruential Generator:

• X_{n+1} = (a X_n + c) mod m

• Used in spreadsheets -- RAND() in Excel, @RAND in Lotus

• Sequential

• Cyclical, with a long cycle length or “period”

• “Randomized” in spreadsheets by taking the seed value (X_0) from the system clock
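The recurrence above can be sketched in a few lines. The default constants below are a commonly published set (a = 1664525, c = 1013904223, m = 2^32) chosen for illustration; they are an assumption and are not the constants used by Excel or Lotus. The tiny generator (a = 5, c = 3, m = 16) is only there to make the cycle visible.

```python
import time
from itertools import islice


def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Yield successive integer states of X_{n+1} = (a * X_n + c) mod m."""
    x = seed % m
    while True:
        x = (a * x + c) % m
        yield x


# A tiny generator makes the cyclical behaviour easy to see:
# with m = 16 every state is visited once before the sequence repeats.
tiny = list(islice(lcg(0, a=5, c=3, m=16), 16))
print(tiny)

# "Randomizing" by seeding from the system clock, then scaling to [0, 1).
clock_seed = time.time_ns()
uniforms = [x / 2**32 for x in islice(lcg(clock_seed), 5)]
print(uniforms)
```

Because each value is computed from the previous one, the same seed always reproduces the same sequence, which is what “sequential” and “cyclical” mean here.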

LDPMAKER Excel 97 Workbook:

• Available in the 1999 Spring Forum section of the CAS Website:

www.casact.org/pubs/forum/99spforum/99spftoc.htm

• Includes both:

• A spreadsheet-only calculation (recalc-driven), and

• A Visual Basic for Applications (VBA) macro-driven generator (run with a button)

LDPMAKER Excel 97 Workbook:

• “Example” sheet is spreadsheet-only calculation

• Demonstrates formulas

• Not very flexible

Example: 4 Dimensions, 24 Iterations

• Dimension #1:

• First, convert each iteration number N to base Prime (= 5)

• Iteration 1 = 01 (base 5)

• Iteration 10 = 20 (base 5)

• F(N, 1) = Faure point (Iteration N, Dimension 1)

F(1,1) = 0/5^2 + 1/5 = 0.20

F(10,1) = 2/5^2 + 0/5 = 0.08
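The Dimension #1 calculation can be sketched as follows: write N in base Prime, then read the digits back as a fraction with the lowest-order digit in the 1/Prime place. The function names are hypothetical and are not taken from the LDPMAKER workbook.

```python
def to_base_digits(n, base):
    """Digits of n in the given base, lowest-order digit first."""
    digits = []
    while n > 0:
        n, r = divmod(n, base)
        digits.append(r)
    return digits


def faure_dim1(n, prime):
    """Dimension #1 point: lowest-order digit / prime + next digit / prime^2 + ..."""
    return sum(d / prime ** (i + 1) for i, d in enumerate(to_base_digits(n, prime)))


for n in (1, 10):
    print(n, to_base_digits(n, 5), round(faure_dim1(n, 5), 4))
```

For N = 10 the digits come back as [0, 2], that is 20 in base 5 written low-order first, which gives 0/5 + 2/25 as on the slide.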

Example: 4 Dimensions, 24 Iterations

• Dimension #2:

• Start with the base Prime digits from Dimension #1 and “shuffle” them

• Using combinations, sum of digits and MOD operator

• First digit in Dimension #2 = [ Sum (first digit, second digit) from Dimension #1 ] MOD Prime

• Dimension #1, Iteration 10 = 20 (base 5)

• Dimension #2, Iteration 10 = 22 (base 5)

• Formula for F(N,2) is the same

Example: 4 Dimensions, 24 Iterations

• Dimensions #3 and higher:

• Start with the base Prime digits from the previous dimension and “shuffle” them

• Formula for F(N,3) ... is the same
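A sketch of the digit “shuffle” for all dimensions, assuming the standard Faure construction in which each new digit is a combination-weighted sum of the previous dimension's digits, taken MOD Prime. This reproduces the 20 -> 22 (base 5) example for Iteration 10, but the workbook's own formulas may be organized differently; all function names here are hypothetical.

```python
from math import comb


def to_base_digits(n, base):
    """Digits of n in the given base, lowest-order digit first."""
    digits = []
    while n > 0:
        n, r = divmod(n, base)
        digits.append(r)
    return digits


def shuffle(digits, prime):
    """new[i] = sum over j >= i of C(j, i) * old[j], all MOD prime."""
    k = len(digits)
    return [sum(comb(j, i) * digits[j] for j in range(i, k)) % prime
            for i in range(k)]


def faure_point(n, dims, prime):
    """One point: Dimension #1 from the raw digits, each later dimension from a reshuffle."""
    digits = to_base_digits(n, prime)
    point = []
    for _ in range(dims):
        point.append(sum(d / prime ** (i + 1) for i, d in enumerate(digits)))
        digits = shuffle(digits, prime)
    return point


print(shuffle([0, 2], 5))  # digits of 20 (base 5), low-order first
print([round(x, 4) for x in faure_point(10, 4, 5)])
```

With two digits, the first new digit is (first digit + second digit) MOD Prime and the second digit is carried over unchanged, matching the slide's description for Dimension #2.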

Loops in the Faure Algorithm:

• Fills out the space in ever-larger loops of ever-smaller spacing

• Fills out the space sequentially

• There MAY be an issue with ending the iterations in the middle of one of these loops

• Examples later in the test results...
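One possible reading of a “loop”, stated here as an assumption rather than the paper's definition: a loop completes each time the iteration number uses up every k-digit value in base Prime, that is at iteration Prime^k - 1. The example above (24 iterations with Prime = 5) fits this reading, since 5^2 - 1 = 24. The helper name `loop_boundaries` is hypothetical.

```python
def loop_boundaries(prime, max_iterations):
    """Iteration counts of the form prime**k - 1 that do not exceed max_iterations."""
    boundaries = []
    power = prime
    while power - 1 <= max_iterations:
        boundaries.append(power - 1)
        power *= prime
    return boundaries


print(loop_boundaries(5, 1000))
print(loop_boundaries(3, 3000))
print(loop_boundaries(7, 3000))
```

Several iteration counts in the test tables (728, 2,186, 342 and 2,400) are also of the form p^k - 1 for p = 3 or p = 7, which is at least consistent with this reading.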

Visual Basic for Applications (VBA) Version:

• VBA = real programming language

• Recursive algorithm using “dynamic arrays” - arrays which are dimensioned (sized) at run-time

• Generalization of spreadsheet-only calculations

• FAST

Performance Test #1: Sum of Limited Paretos

Table 2 (from Paper) - Pareto Parameters

Test # / Pareto #             B        Q      Policy Limit   Limited Expected Value
1 / 1                         10,000   1.10   100,000        21,321
1 / 2                         15,000   1.30   250,000        28,874
Test #1 Theoretical Result                                   50,194
2 / 1                         10,000   1.10    50,000        16,404
2 / 2                         15,000   1.30    25,000        12,745
2 / 3                         25,000   1.20    40,000        21,744
2 / 4                         12,500   1.40    50,000        14,834
2 / 5                         30,000   2.00    25,000        13,636
Test #2 Theoretical Result                                   79,364
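The Limited Expected Value column in Table 2 appears consistent with a two-parameter Pareto whose survival function is (B / (B + x))^Q. That parameterization is an assumption inferred from the table values, not something stated on the slide. Under it, the limited expected value at policy limit L is B / (Q - 1) * (1 - (B / (B + L))^(Q - 1)), sketched below with a hypothetical function name.

```python
def pareto_lev(b, q, limit):
    """Limited expected value E[min(X, limit)] for survival function (b / (b + x))**q."""
    return b / (q - 1) * (1 - (b / (b + limit)) ** (q - 1))


# (B, Q, Policy Limit) rows from Table 2.
test_1 = [(10_000, 1.10, 100_000), (15_000, 1.30, 250_000)]
test_2 = [(10_000, 1.10, 50_000), (15_000, 1.30, 25_000), (25_000, 1.20, 40_000),
          (12_500, 1.40, 50_000), (30_000, 2.00, 25_000)]

for name, rows in (("Test #1", test_1), ("Test #2", test_2)):
    levs = [pareto_lev(*row) for row in rows]
    print(name, [round(v) for v in levs], round(sum(levs)))
```

The theoretical result for each test is the sum of the individual limited expected values, since the expected value of a sum is the sum of the expected values.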


Table 3: Sum of 2 Limited Paretos

# of Iterations   LDP Value   LDP % Error   RAND() Value   RAND() % Error
  250             49,170      -2.04%        47,573         -5.22%
  728             50,022      -0.34%        50,267          0.15%
1,000             49,769      -0.85%        49,640         -1.10%
1,500             49,903      -0.58%        51,307          2.22%
2,186             50,137      -0.11%        50,737          1.08%

Table 4: Sum of 5 Limited Paretos

# of Iterations   LDP Value   LDP % Error   RAND() Value   RAND() % Error
  342             79,319      -0.06%        80,179          1.03%
1,000             79,201      -0.21%        78,837         -0.66%
1,500             79,206      -0.20%        79,088         -0.35%
2,000             79,280      -0.11%        79,049         -0.40%
2,400             79,358      -0.01%        79,154         -0.27%

Performance Test #2: Sum of Poissons

Table 5: Sum of 2 Poissons (λ = 8)

# of Iterations   LDP % Error   RAND() % Error
  250             -0.42%         1.30%
  728             -0.03%         0.64%
1,000             -0.22%         0.23%
2,000             -0.09%        -0.08%
2,186             -0.01%         0.17%

Performance Test #2: Sum of Poissons

Table 6: Sum of 5 Poissons (λ = 8)

# of Iterations   LDP % Error   RAND() % Error
  342             -0.24%         0.78%
1,000             -0.20%         0.59%
2,000             -0.11%        -0.22%
2,400             -0.04%        -0.23%

Performance Test #3: Low Frequency Events

Table 7 - Pareto Parameters used for Severity

Pareto #                      B        Q      Policy Limit   Limited Expected Value
1                             10,000   1.30   50,000         13,860
Test #1 Theoretical Result                                      693
2                             25,000   1.60   50,000         20,113
3                              5,000   1.10   50,000         10,660
Test #2 Theoretical Result                                    2,232


Table 8: One Event, 5% Prob of Occurrence

# of Iterations   LDP Value   LDP % Error   RAND() Value   RAND() % Error
  250             563         -18.82%       1,009           45.60%
  728             615         -11.19%         657           -5.23%
1,000             670          -3.27%         569          -17.93%
1,500             667          -3.81%         613          -11.58%
2,186             690          -0.50%         662           -4.45%

Table 9: Two Events, each with 5% Prob of Occurrence

# of Iterations   LDP Value   LDP % Error   RAND() Value   RAND() % Error
  342             2,199       -1.46%        3,175          42.26%
1,000             2,251        0.86%        2,456          10.04%
1,500             2,221       -0.49%        2,295           2.83%
2,400             2,204       -1.22%        2,348           5.20%

Performance Test #4: 99th Percentile of Sum of Normals

Table 10 - Normal Parameters

Test # / Normal #   Mean    StdDev    99th Percentile
1 / 1               2,000     750     -
1 / 2               1,000     500     -
1 Combined          3,000     901.4   5,097
2 / 1               1,000     300     -
2 / 2               1,000     800     -
2 / 3                 500     300     -
2 / 4                 750     600     -
2 / 5               2,000     100     -
2 Combined          5,250   1,090.9   7,788
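The “Combined” rows of Table 10 match what is obtained by treating the normals as independent: means add, variances add, and the 99th percentile follows from the combined normal. Independence is an assumption inferred from the numbers rather than stated on the slide. A short check using Python's standard `statistics.NormalDist`:

```python
from math import sqrt
from statistics import NormalDist


def sum_of_normals(params):
    """Distribution of a sum of independent normals given (mean, std dev) pairs."""
    mean = sum(m for m, _ in params)
    std_dev = sqrt(sum(s * s for _, s in params))
    return NormalDist(mean, std_dev)


test_1 = sum_of_normals([(2000, 750), (1000, 500)])
test_2 = sum_of_normals([(1000, 300), (1000, 800), (500, 300), (750, 600), (2000, 100)])

for name, dist in (("Test 1", test_1), ("Test 2", test_2)):
    print(name, dist.mean, round(dist.stdev, 1), round(dist.inv_cdf(0.99)))
```

These closed-form percentiles are the targets against which the simulated LDP and RAND() percentiles are compared.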

Performance Test #4: 99th Percentile of Sum of Normals

Table 11 - 99th Pctle of Sum of 2 Normals

# of Iterations   LDP Value   LDP % Error   RAND() Value   RAND() % Error
  250             5,084       -0.25%        4,800          -5.82%
  728             5,036       -1.19%        4,898          -3.91%
1,000             4,995       -2.00%        4,934          -3.19%
1,500             5,047       -0.98%        4,989          -2.12%
2,186             5,070       -0.52%        4,967          -2.55%

Performance Test #4: 99th Percentile of Sum of Normals

Table 12 - 99th Pctle of Sum of 5 Normals

# of Iterations   LDP Value   LDP % Error   RAND() Value   RAND() % Error
  342             7,661       -1.63%        7,524          -3.38%
1,000             7,808        0.26%        7,653          -1.73%
1,500             7,808        0.26%        7,650          -1.76%
2,400             7,804        0.21%        7,703          -1.09%

Performance Test #5: Mixed Bag

• Sum of 5 each from:

• LogNormal

• Pareto

• Uniform

• Normal

• Testing variability of estimates over 10 runs

Performance Test #5: Mixed Bag

Table 14 - Avg % Error and Std Dev of % Error over 10 runs

# of Iterations   LDP Average % Error   LDP Std Dev of % Error   Rand Average % Error   Rand Std Dev of % Error
  250             -10.39%               0.33%                    -0.36%                 5.51%
  500              -2.28%               0.71%                    -3.03%                 7.79%
1,000              -0.47%               1.36%                    -0.76%                 4.39%
1,500              -0.41%               0.69%                    -0.67%                 4.62%
2,000              -0.41%               0.62%                    -1.40%                 4.01%
3,000              -0.72%               0.47%                    -1.17%                 2.79%

Possible Concerns in Using LDPs

• Unused Dimensions:

• Example: modeling Excess Claims

• # of Excess claims between 0 and 30

• Requires 30 dimensions

• If # claims < 30, are the “used” dimensions still filled out with low discrepancy?

• Dr. Tom?


• Time Series:

• Example: Probability of 2 consecutive years of loss ratio exceeding 75%

• How many dimensions is this problem?

• Can’t use a single dimension of LDPs, because they are sequentially dependent

• Need to know “over how many years”, then set dimensions


• Correlation:

• If two variables are

• 100% correlated ==> 1 dimension

• 0% correlated ==> 2 dimensions

• x% correlated ==> ? dimensions

• Is promise of “low discrepancy” still fulfilled?

• How to implement?


• Loop Boundaries:

• Faure algorithm fills out space sequentially in ever-expanding loops of ever-finer granularity

• If iteration count does not finish on a loop boundary (depends on Prime), there may be potential bias...

• See Appendix B of paper
