

EXTENSION OF LATIN HYPERCUBE SAMPLES WITH CORRELATED VARIABLES

C. J. SALLABERRY (a), J. C. HELTON (b), S. C. HORA (c)

(a) Sandia National Laboratories, New Mexico, PO Box 5800, Albuquerque, NM 87185-0776, USA

(b) Department of Mathematics and Statistics, Arizona State University, Tempe, AZ 85287-1804, USA

(c) University of Hawaii at Hilo, HI 96720-4091, USA

DEFINITION OF LATIN HYPERCUBE SAMPLING [1], [2]

EXTENSION ALGORITHM

ILLUSTRATION OF EXTENSION ALGORITHM

DISCUSSION

CORRELATION

REFERENCES

[1] McKay M. D., Beckman R. J. and Conover W. J., "A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code," Technometrics, 21, pp. 239-245 (1979).

[2] Helton J. C. and Davis F. J., "Latin hypercube sampling and the propagation of uncertainty in analyses of complex systems," Reliability Engineering and System Safety, 81, pp. 23-69 (2003).

[3] Iman R. L. and Conover W. J., "A distribution-free approach to inducing rank correlation among input variables," Commun. Statist.-Simula. Computa., 11, n. 3, pp. 311-334 (1982).

[4] Tong C., "Refinement Strategies for Stratified Sampling Methods," Reliability Engineering and System Safety, 91, n. 10-11, pp. 1257-1265 (2006).

Acknowledgement:

Work performed for Sandia National Laboratories (SNL), which is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL-85000.

[Figure: two scatter plots of X2 versus X1, one for Random Sampling and one for Latin Hypercube Sampling]

• Stratification into intervals of equal probability for each variable

• Random selection of value in each interval

• Random pairing of values without replacement across variables

• The Iman/Conover restricted pairing procedure [3] can be applied for correlation control
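The first three bullets above can be sketched in a few lines of Python. This is a minimal illustration on the unit hypercube, assuming NumPy is available; the function name `latin_hypercube` is hypothetical and not taken from the poster, and the Iman/Conover restricted pairing is not included.

```python
import numpy as np

def latin_hypercube(n, k, rng):
    """Basic LHS of size n for k independent variables on [0, 1).

    Hypothetical helper for illustration only (no correlation control).
    """
    # Stratification: n intervals of equal probability per variable,
    # with one value selected at random inside each interval.
    u = (np.arange(n)[:, None] + rng.random((n, k))) / n
    # Random pairing without replacement: shuffle each column independently.
    for j in range(k):
        u[:, j] = rng.permutation(u[:, j])
    return u

rng = np.random.default_rng(0)
sample = latin_hypercube(10, 2, rng)
print(sample)
```

Values on [0, 1) can then be mapped to any target distribution through its inverse CDF, which preserves the equal-probability stratification.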

[Figure annotation: area not covered by the random sample]

Advantages:

• Less variability in the replicated estimation of the CDFs

• Less variability in the replicated estimations of the mean
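The reduced variability of the mean estimate can be illustrated with a small replicated experiment. The sketch below is only an illustration under stated assumptions: the test function `model`, the sample size and the number of replicates are hypothetical choices, not values from the poster.

```python
import numpy as np

def lhs(n, k, rng):
    """Basic LHS on [0, 1)^k (hypothetical helper, no correlation control)."""
    u = (np.arange(n)[:, None] + rng.random((n, k))) / n
    for j in range(k):
        u[:, j] = rng.permutation(u[:, j])
    return u

def model(x):
    """Hypothetical test function standing in for an expensive computer code."""
    return x[:, 0] + x[:, 1] ** 2

rng = np.random.default_rng(1)
n, reps = 20, 500  # assumed sample size and number of replicates

# Replicated estimates of the mean with each sampling scheme.
means_rs = [model(rng.random((n, 2))).mean() for _ in range(reps)]
means_lhs = [model(lhs(n, 2, rng)).mean() for _ in range(reps)]

print("std of mean estimate, random sampling:", np.std(means_rs))
print("std of mean estimate, LHS:            ", np.std(means_lhs))
```

For a function like this one, which is a sum of one-variable terms, the spread of the LHS estimates should come out markedly smaller than that of random sampling.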

Drawback:

• Difficult to increase the size of an already generated sample

An extension of the sample size has already been proposed [4], but it does not allow correlation control.

IDEA: Apply the extension to the RANKS of the values so that the correlation structure is respected.


STEP 0

Original sample obtained using LHS with the Iman/Conover procedure

STEP 1

Generation of an LHS on the RANK values

STEP 2

Separation of each rectangle into 4 equal-probability rectangles

STEPS 3 and 4

Selection of the unique rectangle not covered by the first LHS, and random selection of a value within that rectangle
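The steps above can be sketched as follows for doubling a sample of size n to size 2n on the unit hypercube. This is a hedged illustration, not the authors' implementation: the names `correlated_ranks` and `extend_lhs` are hypothetical, and the rank matrices are produced by ranking correlated normal draws as a simple stand-in for the Iman/Conover procedure [3]. Splitting each variable's interval into two halves is what yields the 4 rectangles per cell in two dimensions.

```python
import numpy as np

def correlated_ranks(n, corr, rng):
    """Stand-in for Iman/Conover (assumption): ranks of correlated normals.

    Returns an (n, k) integer matrix; each column is a permutation of 0..n-1.
    """
    z = rng.multivariate_normal(np.zeros(len(corr)), corr, size=n)
    return np.argsort(np.argsort(z, axis=0), axis=0)

def extend_lhs(u_old, new_ranks, rng):
    """Extend an LHS of size n on [0, 1)^k to an LHS of size 2n.

    u_old     : (n, k) original LHS (STEP 0)
    new_ranks : (n, k) rank matrix of a second LHS (STEP 1)
    """
    n, k = u_old.shape
    # STEP 2: split every original interval into two equal-probability halves.
    half_cell = np.floor(u_old * 2 * n).astype(int)  # index among 2n half-intervals
    interval, half = half_cell // 2, half_cell % 2
    # Record which half of each original interval is already occupied.
    occupied = np.empty((n, k), dtype=int)
    for j in range(k):
        occupied[interval[:, j], j] = half[:, j]
    # STEP 3: a new point with rank r in variable j goes to interval r,
    # in the half not covered by the first LHS.
    free_half = 1 - occupied[new_ranks, np.arange(k)]
    # STEP 4: random selection of a value within the selected half-interval.
    u_new = (2 * new_ranks + free_half + rng.random((n, k))) / (2 * n)
    return np.vstack([u_old, u_new])

rng = np.random.default_rng(0)
target = np.array([[1.0, 0.5], [0.5, 1.0]])  # assumed target rank correlation
n = 50
r0 = correlated_ranks(n, target, rng)
u0 = (r0 + rng.random(r0.shape)) / n         # original LHS built from its ranks
r1 = correlated_ranks(n, target, rng)
u_ext = extend_lhs(u0, r1, rng)

# Rank (Spearman) correlation matrix of the extended sample.
ranks_ext = np.argsort(np.argsort(u_ext, axis=0), axis=0)
print(np.corrcoef(ranks_ext, rowvar=False))
```

The original n points are kept unchanged, so model evaluations already performed on them can be reused; only the n new points need to be run.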

The extension procedure described:

• Provides a way to address the sample size problem sequentially in computationally demanding analyses

• Allows re-use of the information provided by an original sample

• Can be used in the generation of very large LHSs with a specific correlation structure

The rank correlation matrix of the extended sample is close to the half sum (element-wise average) of the rank correlation matrices of the two generated samples.

[Figure: Monte Carlo estimate of the variation of rank correlation; deviation from the expected correlation]

Desired Rank Correlation Matrix

     1          0.5       -0.9
     0.5        1         -0.7
    -0.9       -0.7        1

Rank Correlation Matrix for sample 1

     1          0.755245  -0.888112
     0.755245   1         -0.832168
    -0.888112  -0.832168   1

Rank Correlation Matrix for sample 2

     1          0.643357  -0.895105
     0.643357   1         -0.804196
    -0.895105  -0.804196   1

Half sum

     1          0.699301  -0.891608
     0.699301   1         -0.818182
    -0.891608  -0.818182   1

Rank Correlation Matrix for extended sample

     1          0.70087   -0.881739
     0.70087    1         -0.808696
    -0.881739  -0.808696   1

A theoretical demonstration is given in SAND Report SAND2006-6135.
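The half-sum relation can be checked directly against the matrices reported above. The short sketch below, assuming NumPy, recomputes the half sum of the two sample matrices and measures how far the extended-sample matrix is from it; the variable names are illustrative only.

```python
import numpy as np

# Rank correlation matrices as reported on the poster.
sample1 = np.array([[1.0, 0.755245, -0.888112],
                    [0.755245, 1.0, -0.832168],
                    [-0.888112, -0.832168, 1.0]])
sample2 = np.array([[1.0, 0.643357, -0.895105],
                    [0.643357, 1.0, -0.804196],
                    [-0.895105, -0.804196, 1.0]])
half_sum_reported = np.array([[1.0, 0.699301, -0.891608],
                              [0.699301, 1.0, -0.818182],
                              [-0.891608, -0.818182, 1.0]])
extended = np.array([[1.0, 0.70087, -0.881739],
                     [0.70087, 1.0, -0.808696],
                     [-0.881739, -0.808696, 1.0]])

# Element-wise average of the two sample matrices.
half_sum = (sample1 + sample2) / 2

# Agreement with the reported half sum (up to rounding) and
# largest deviation of the extended-sample matrix from it.
print(np.max(np.abs(half_sum - half_sum_reported)))
max_dev = np.max(np.abs(extended - half_sum))
print(max_dev)
```

With these values the largest element-wise deviation between the extended-sample matrix and the half sum stays below 0.01.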