Sensitivity Analysis of Monte Carlo Simulation Results Using the Kolmogorov-Smirnov d Statistic

Hitoshi Makino, Tokai Works, Japan Nuclear Cycle Development Institute, Ibaraki, Japan (Currently Visiting Researcher in Geohydrology Department, Sandia National Laboratories, Albuquerque, New Mexico, USA)

Sean A. McKenna, Geohydrology Department, Sandia National Laboratories, Albuquerque, New Mexico, USA

Keiichiro Wakasugi, Tokai Works, Japan Nuclear Cycle Development Institute, Ibaraki, Japan

Abstract

Evaluation of the future performance of a geological disposal system for high-level radioactive waste is complicated by uncertainties that arise due to incomplete understanding of physical processes and limited information. The uncertainties of the model inputs result in the outputs of the model being uncertain as well. Therefore, quantitative examination of the resulting uncertainty with respect to the effect of input parameter uncertainty is important and has been a topic of discussion in this field. A large-scale Monte Carlo simulation is used to assess the future performance of the disposal system by translating uncertainties in model inputs to corresponding uncertainties in model outputs. Sensitivity analysis on the results of this Monte Carlo simulation can then be used to guide future R&D plans including effective site characterization and site selection activities. In this report, Generalized Sensitivity Analysis (GSA) using the Kolmogorov-Smirnov (K-S) d statistic was examined as a sensitivity analysis method. GSA separates the input parameter values into two distributions: those that created results that exceeded a specific threshold and those that created results that were below the threshold. The K-S d statistic is used to determine the probability that the two distributions came from the same underlying population. In an example application, GSA is applied to recent results of a Monte Carlo simulation with the nuclide transport model done for a hypothetical disposal system, and is used to determine the sensitivity of the total dose at a given point in time to each input parameter for a range of thresholds. The results from GSA are compared to results obtained through multiple linear regression analysis and also to results obtained through correlation analysis. From this study, it is shown that GSA assigns high importance to input parameters having a strong monotonic relationship with the output, as do the other methods. Also, GSA can detect the localized existence of high output values for specific ranges of specific input parameters better than the other methods. Finally, it is suggested that the careful cross checking of results from more than one sensitivity analysis method is important to provide comprehensive insights with respect to the effect of parameter uncertainties on the performance of a geological disposal system.

International Association of Mathematical Geology 2001 Annual Meeting, Cancun, Mexico, September 6-12, 2001


I. Introduction

Importance of uncertainty study for geological disposal of high-level radioactive waste

Evaluation of the future performance of geological disposal systems for high-level radioactive waste is complicated by uncertainties that arise due to incomplete understanding of physical processes and limited information. The uncertainties of the model inputs result in the outputs of the model being uncertain as well. Therefore, quantitative examination of the resulting uncertainty with respect to the effect of input parameter uncertainty is important and has been a topic of discussion in this field (e.g., [1][2]).

Current status of Japanese program for geological disposal

According to the national program in Japan [3], high-level radioactive waste (HLW) separated during reprocessing of spent fuel must be vitrified, stored for a period of 30 to 50 years for cooling and finally disposed of in a stable geological environment deep underground (see Figure 1).

In the year 2000, the geological disposal program in Japan moved from the phase of generic research and development (R&D) [4] into the phase of implementation. According to the present time schedule, repository operation will begin as early as the 2030s with the site selection process proceeding in the following stepwise manner: selection of potential candidate sites and then selection of candidate site(s), followed by site characterization at the candidate site(s) [5]. This schedule means that the Japanese program does not yet have any candidate sites for a potential repository and that regulatory criteria have not yet been formulated. Therefore, especially from the viewpoint of effective progress of R&D and site selection, the development and application of methods to quantitatively examine the effect of various uncertainties on the performance of a geological disposal system are important and topical issues in the Japanese HLW disposal program.

Purpose of this study

Large-scale Monte Carlo simulations are used to assess the future performance of a geological disposal system by translating uncertainties in model inputs to corresponding uncertainties in model outputs. The resulting performance, expressed in terms of a performance measure (e.g., dose), can be analyzed to 1) compare it to a threshold (e.g., a regulatory threshold) to determine the probability of the future performance of the disposal system meeting the threshold, and 2) determine the sensitivity of the results to the input parameter uncertainties. In this study, we focus on the latter analysis because determination of the important factors controlling the performance of a disposal system has a high priority in the current Japanese program to guide future R&D plans including effective site characterization and site selection activities. Sensitivity analysis provides a useful and structured framework for quantifying the strength of input-output relationships in assessed models (e.g., [6][7]).

In this study, Generalized Sensitivity Analysis (GSA) [8] using the Kolmogorov-Smirnov (K-S) d statistic was examined as a method to determine the sensitivity of the outputs of Monte Carlo simulations with a nuclide transport model to uncertainties of inputs in the model. The results of this sensitivity analysis are compared to results obtained from other widely used methods: multiple linear regression analysis and correlation analysis. These sensitivity analyses were performed using an ensemble of Monte Carlo simulations done for a hypothetical disposal system.

II. Overview of Monte Carlo Simulation

Overview of models used in the simulation

In this study, only the Base Scenario [4], which represents the relatively likely future behavior of the disposal system, was considered. Perturbation Scenarios that may be caused by external events and processes such as natural geological events and future human activities are also an important source of uncertainty in the overall assessment (scenario uncertainty) but are not the subject of the investigation in this study.

Figure 2 [4] shows the conceptual model for nuclide transport through the engineered barrier system (EBS) and geosphere in a disposal system considered for the example Monte Carlo simulation in this study. It should be noted that the disposal system shown in Figure 2 is hypothetical and was defined to examine the performance of a disposal system in Japan in the generic sense [4].

The conceptual model for nuclide transport through the EBS takes into account the following processes: (a) leaching of nuclides from the vitrified waste after overpack failure, (b) diffusion of dissolved nuclides through the buffer materials with solubility constraints and with retardation by sorption onto the buffer materials, (c) inflow of nuclides released from the EBS into fractures in the surrounding host rock under complete mixing with groundwater flowing through the excavation disturbed zone of the disposal tunnels. These processes were modeled in cylindrical coordinates using a one-dimensional model. Based on the above conceptual model, the following input parameters have been sampled from uniform or log uniform distributions to perform the Monte Carlo simulation (Table 1): glass dissolution rate, elemental solubility limit, diffusion coefficient through the buffer materials and distribution coefficient as a measure of sorption on the buffer materials.

In the conceptual model of the geosphere, nuclide transport through a fractured host rock surrounding the repository and a major water-conducting fault (MWCF) providing a transport path to the surface was assumed. The following processes have been taken into account for transport of nuclides through both the host rock and the MWCF: (i) transport of nuclides as solute by advection and dispersion, (ii) nuclide diffusion into the adjacent rock matrix and sorption onto the matrix pore surface (matrix diffusion). The nuclides transported through the MWCF were assumed to finally enter a hypothetical aquifer near the ground surface.

To consider the distribution of transmissivity as a typical example of an observed heterogeneous characteristic of fractures, transport through fractures in the host rock is represented using a set of model pathways. Each model pathway (modeled as a one-dimensional parallel plate) represents a set of fractures of similar transmissivity [4]. In this study, the distribution of transmissivity assigned to fractures in the host rock was divided into twelve classes and therefore represented by twelve model pathways. The MWCF was modeled by a single model pathway. Based on the above conceptual model, the following input parameters have been sampled from uniform or log uniform distributions to perform the Monte Carlo simulation (Table 1): average transmissivity with constant standard deviation, matrix diffusion depth, porosity in the matrix, proportion of fracture surface from which nuclides can diffuse into the matrix, distribution coefficient onto the matrix pore surface, transport length through the host rock and transport length through the MWCF.

In order to evaluate the impact of nuclides originating from the repository on human beings, a list of “flux to dose conversion factors” previously reported for each nuclide [4] was applied to convert model outputs in flux units (Bq/y) obtained from the Monte Carlo simulation to dose units (mSv/y).

The conceptual models described above were formulated mathematically and implemented using the GoldSim probabilistic simulation platform [9][10]. Distributions of fifty-seven input parameters were defined as uniform or log uniform (Table 1) and then were used in the Monte Carlo simulation. It should also be noted that these distributions have been set only as a reasonable generic example under the constraint of available information.

Sampling methods

Random sampling of input parameter values using a Latin Hypercube Sampling (LHS) scheme was adopted to carry out a 500-realization Monte Carlo simulation run in this study. In an LHS scheme, the range of each input parameter, presented as a probability density function, is divided into N intervals of equal probability, where N is the number of samples (realizations), and one sample is drawn randomly from each interval without replacement. The sampled values of the different input parameters are then combined at random and without replacement, so that each combination makes a single set of inputs and the N combinations make the N sets of inputs for an N-realization Monte Carlo simulation run. Sampling was carried out under the assumption that all input parameters were independent.
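The LHS procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch and not the sampler used in GoldSim; the function names `latin_hypercube` and `to_log_uniform`, the seed, and the choice of three parameters are assumptions made for the example.

```python
import random


def latin_hypercube(n_samples, n_params, seed=None):
    """Return n_samples input sets of n_params values on the unit interval."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_params):
        # One random draw inside each of the N equal-probability intervals.
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so that intervals are paired at random across parameters
        # (parameters are treated as independent).
        rng.shuffle(column)
        columns.append(column)
    # Row k collects the k-th value of every parameter: one set of inputs.
    return [list(row) for row in zip(*columns)]


def to_log_uniform(u, low, high):
    """Map a unit-interval sample onto a log-uniform range from low to high."""
    return low * (high / low) ** u


# Hypothetical example: 500 realizations of 3 parameters.
samples = latin_hypercube(n_samples=500, n_params=3, seed=1)
```

Each column of `samples` contains exactly one value in every interval of width 1/500, which is the stratification property that distinguishes LHS from simple random sampling. A uniform parameter is obtained by rescaling a column linearly, and a log-uniform one by passing it through `to_log_uniform`.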

Results of a Monte Carlo simulation run

Figure 3 shows the Monte Carlo simulation results presented in terms of the time history of the Total Dose per vitrified waste package (TD: mSv/y/waste). The TD is defined as the sum of the doses caused by all nuclides and is obtained for each realization at each time. Figure 3 shows that the variation of TD values is wider in the early time period after disposal than at later times. The 500 TD values at each time are dominated mainly by Se-79, Cs-135, Th-229 and Np-237, and the number of realizations dominated by each of these nuclides changes with time, as shown in Figure 4(a). Figure 4(b) implies a potential difficulty in detecting the sensitivity of TD to input parameters, especially nuclide (element) specific parameters, because the TD values dominated by different nuclides overlap.

III. Sensitivity Analysis methods

Generalized Sensitivity Analysis with Kolmogorov-Smirnov d statistic

Generalized Sensitivity Analysis (GSA) [8] is a technique that considers the sensitivity of model outputs to model inputs from the viewpoint of whether or not uncertainty of a certain input parameter leads the output to exceed a specific threshold (e.g., a value of interest within the model output range, or a regulatory limit). The essence of GSA is the placement of each vector of model inputs into one of two sample sets: those that created model outputs that were over a specific threshold (“fail”) and those that created outputs that were below the threshold (“pass”). This behavioral classification of the input vectors into two sample sets allows the evaluation of the two sample sets as a function of any input parameter contained within the input vectors. This process allows for determination of whether or not the difference between a “fail” sample set of inputs and a “pass” sample set is significant for every input parameter and also allows for the relative ranking of all input parameters in terms of the sensitivity of the outputs to the given input. For input parameters where the distribution of values in the “fail” sample set is statistically indistinguishable from that in the “pass” sample set, the model outputs are insensitive to those input parameters [11][12].

GSA can provide a quantitative measure of the difference between the two sample sets using a non-parametric statistical test of the difference in cumulative probability distribution function (CDF) between the two sample sets (Figure 5). In this study, the Kolmogorov-Smirnov (K-S) test is used to determine the maximum vertical difference, d, between the two CDFs. This maximum can occur at any value of the selected input parameter. The K-S test also provides the probability, P, that the maximum distance d between the two CDFs could have occurred if the two sample sets were in fact obtained from a single population. In this study, GSA with the K-S test was used to determine the sensitivity of the TD at a given point in time to each input parameter for several different thresholds. Here, the K-S d statistic was used as a measure of input parameter importance.
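The pass/fail split and the K-S comparison described above can be illustrated with a minimal Python sketch. It is not the implementation used in this study. The names `ks_two_sample` and `gsa_ks`, the toy output function and the parameter names `x1` and `x2` are assumptions made for the example, and the probability P is computed with a commonly used asymptotic series approximation rather than an exact distribution.

```python
import math
import random


def ks_two_sample(sample_a, sample_b):
    """Return (d, p) for the two-sample K-S test.

    d is the maximum vertical distance between the two empirical CDFs.
    p is an asymptotic approximation of the probability of a distance at
    least this large if both samples came from a single population.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    na, nb = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < na and j < nb:
        x = min(a[i], b[j])
        # Advance both CDFs past every value equal to x (handles ties).
        while i < na and a[i] == x:
            i += 1
        while j < nb and b[j] == x:
            j += 1
        d = max(d, abs(i / na - j / nb))
    n_eff = na * nb / (na + nb)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.3:
        return d, 1.0  # the series converges slowly here; p is close to 1
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


def gsa_ks(inputs, outputs, threshold):
    """Split each input parameter into pass/fail sets and compare them.

    inputs maps a parameter name to its sampled values; outputs holds the
    model output of each realization in the same order.
    """
    results = {}
    for name, values in inputs.items():
        fail = [v for v, y in zip(values, outputs) if y > threshold]
        passed = [v for v, y in zip(values, outputs) if y <= threshold]
        if fail and passed:  # skip thresholds that do not split the runs
            results[name] = ks_two_sample(passed, fail)
    return results


# Toy stand-in for the transport model: the output depends on x1 only.
rng = random.Random(0)
inputs = {"x1": [rng.random() for _ in range(500)],
          "x2": [rng.random() for _ in range(500)]}
outputs = [10.0 ** (-12.0 + 4.0 * u) for u in inputs["x1"]]
result = gsa_ks(inputs, outputs, threshold=1e-10)
```

In this toy case the d statistic for `x1` equals 1 because the threshold separates its values completely, while `x2`, which has no influence on the output, yields a small d. Repeating the call for several thresholds gives the threshold-dependent sensitivities discussed in the text.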

Overview of other sensitivity analysis methods used in this report

Linear regression provides a simple means of assessing the sensitivity of the outputs from a Monte Carlo simulation run to the input parameters and has been used widely (e.g., [7][10]). Partial correlation coefficients also characterize the strength of the linear relationship between any two variables after a correction has been made for the linear effects of the other variables in the analysis. The standardized multiple linear regression analysis and partial correlation analysis have been used as alternative sensitivity analysis methods in this study. In these alternative methods, the rank transformation of the data has been applied to improve fits to nonlinear data, and then standardized rank regression coefficients (SRRC) and partial rank correlation coefficients (PRCC) were used as measures of input parameter importance, respectively.
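As an illustration of the rank-based regression measure mentioned above, the following Python sketch computes SRRC by rank-transforming and standardizing the data and then solving the least-squares normal equations. It is a sketch under stated assumptions, not the software used in this study: the function names are hypothetical, ties receive average ranks, inputs are assumed not to be constant or collinear, and PRCC is not shown.

```python
import math


def ranks(values):
    """Return ranks 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        for k in order[start:end + 1]:
            result[k] = (start + end) / 2 + 1
        start = end + 1
    return result


def standardize(values):
    """Shift to zero mean and scale to unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def srrc(inputs, output):
    """Standardized rank regression coefficient of each input parameter."""
    names = list(inputs)
    z = [standardize(ranks(inputs[name])) for name in names]
    zy = standardize(ranks(output))
    # Normal equations (Z^T Z) beta = Z^T y; no intercept is needed
    # because every standardized column has zero mean.
    gram = [[sum(p * q for p, q in zip(zi, zj)) for zj in z] for zi in z]
    rhs = [sum(p * q for p, q in zip(zi, zy)) for zi in z]
    return dict(zip(names, solve(gram, rhs)))


# Hypothetical data: the output is a nonlinear but monotonic function of x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
coefficients = srrc({"x1": x1, "x2": x2}, [v ** 3 for v in x1])
```

Because the rank transformation turns any monotonic relationship into a linear one, `x1` receives a coefficient of 1 and `x2` a coefficient of 0 in this example, even though the output is a cubic function of `x1`.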

Additional conditions applied in this report

In GSA, threshold values of 10^-9, 10^-10, 10^-11, 10^-12 and 10^-20 [mSv/y/waste] were applied in this study. The first four thresholds correspond to the high value range of TD through all times and the last threshold value is introduced to cover the wider distribution of TD in the early time period observed in Figure 3.

Sensitivity analyses using GSA, multiple linear regression analysis and partial correlation analysis provide the K-S d statistic, SRRC and PRCC, respectively, for each input parameter at a given point in time. The significance levels of these coefficients were checked, and then the parameters whose coefficients have a significance level under 0.05 were ranked relative to one another. The obtained ranks show the relative importance of each parameter from the viewpoint of the contribution from that parameter to model outputs. The rank is called “importance rank” in this paper.
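The ranking step can be sketched as follows. The function name `importance_ranks`, the parameter names and all numerical values are hypothetical and are not results from this study. Ranking by the absolute value of the coefficient is also an assumption, made so that the same function can serve for signed coefficients such as SRRC and PRCC.

```python
def importance_ranks(scores, p_values, alpha=0.05):
    """Rank parameters whose coefficient is significant at level alpha.

    scores maps a parameter name to its K-S d, SRRC or PRCC value and
    p_values maps the same names to the corresponding significance levels.
    Rank 1 is the most important parameter; parameters that are not
    significant are left out of the ranking.
    """
    significant = [name for name in scores if p_values[name] < alpha]
    ordered = sorted(significant, key=lambda name: abs(scores[name]),
                     reverse=True)
    return {name: rank for rank, name in enumerate(ordered, start=1)}


# Made-up coefficients and significance levels for four parameters.
scores = {"param_a": 0.62, "param_b": -0.35, "param_c": 0.41, "param_d": 0.08}
p_values = {"param_a": 1e-12, "param_b": 1e-4, "param_c": 1e-6, "param_d": 0.40}
example_ranks = importance_ranks(scores, p_values)
```

With these made-up numbers, `param_d` is excluded because its significance level exceeds 0.05, and the remaining three parameters are ranked by the magnitude of their coefficients.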

IV. Results and Discussion

Figure 6 shows the importance ranks from GSA for a threshold of 10^-10 mSv/y/waste and from SRRC. Importance ranks from SRRC and PRCC show almost complete agreement for the top ten parameters. Generally, high importance ranks of the average transmissivity parameter (Ave_T), the Cs distribution coefficient on rock (Kd_G_Cs) and geometric parameters relating to matrix diffusion (MD_depth, FR) were indicated by all these different sensitivity analysis methods. On the other hand, Figure 6(a) shows that GSA gives higher importance ranks to some element specific parameters and a sharper evolution of those importance ranks with time relative to SRRC (Figure 6(b)). In this report, we discuss the features of GSA through the comparison of importance ranks for the average transmissivity parameter (Ave_T) and the Se solubility parameter (S_Se) (Figure 7), the former as an example showing similar highest importance ranks from the different methods and the latter as an example showing significantly different importance ranks.

GSA detects the sensitivity of outputs to inputs based on the K-S d statistic, which quantifies the difference between the CDFs of a “pass” input sample set and a “fail” input sample set. The determination of “pass” and “fail” is made by comparison of the model output to a specific threshold value. This approach means that the frequency distribution of inputs in each sample set is examined in GSA, and therefore the K-S d statistic does not depend on the TD values except through the threshold TD value. This is a notable difference between GSA and linear regression analysis and correlation analysis, in which the sensitivities of TD to the input parameters are detected based on a monotonic relationship between them. This methodological feature of GSA is expected to provide the power to detect the following two types of relationships between TD and input parameters: i) a monotonic relationship through the whole parameter range, ii) a localized existence of high TD values for a specific range of a specific input parameter.

If there is a strong monotonic relationship, it causes a distribution of “pass” inputs with high frequency in a specific parameter range and also causes a distribution of “fail” inputs with high frequency in a different range. GSA will detect this type of effect by examination of the differences in the CDFs with many thresholds within the distribution of TD values. However, GSA may give a weaker resolution for a weak monotonic relationship than do the other two methods used in this report because GSA only examines one input parameter at a time and, unlike the other two methods, introduces no correction for the effect of other input parameters. On the other hand, the latter type of relationship includes a situation in which a parameter can cause high TD values for a specific parameter range but its monotonic relationship through the whole parameter range is relatively weaker than seen with other parameters. This situation implies that high TD values are caused by a combination of multiple specific parameters, each within a specific parameter range. Parameters having this type of relationship will be detected by GSA with threshold values defined in decreasing order from near the highest TD value. In GSA, both types of relationship are detected in a common manner, namely as a difference between two CDFs expressed by the K-S d statistic.
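The two types of relationship can be illustrated with a small synthetic example in Python. The data below are invented for illustration and are unrelated to the simulation results of this study; the helper names, the band from 0.4 to 0.6 and the output values are assumptions. A simple rank correlation stands in for the regression and correlation measures.

```python
import math


def ks_d(sample_a, sample_b):
    """Maximum vertical distance between two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def average_ranks(values):
    """Ranks 1..n with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        for k in order[start:end + 1]:
            result[k] = (start + end) / 2 + 1
        start = end + 1
    return result


def rank_correlation(x, y):
    """Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((p - mx) * (q - my) for p, q in zip(rx, ry))
    var_x = sum((p - mx) ** 2 for p in rx)
    var_y = sum((q - my) ** 2 for q in ry)
    return cov / math.sqrt(var_x * var_y)


def gsa_d(x, y, threshold):
    """K-S d between the pass and fail sets of one input parameter."""
    fail = [u for u, v in zip(x, y) if v > threshold]
    passed = [u for u, v in zip(x, y) if v <= threshold]
    return ks_d(passed, fail)


n = 200
x = [(k + 0.5) / n for k in range(n)]  # evenly spread input values
threshold = 1e-10
# Type i): output increases monotonically over the whole input range.
y_monotonic = [10.0 ** (-12.0 + 4.0 * u) for u in x]
# Type ii): high output only inside a narrow band of the input range.
y_localized = [1e-9 if 0.4 <= u < 0.6 else 1e-12 for u in x]

rho_mono, d_mono = rank_correlation(x, y_monotonic), gsa_d(x, y_monotonic, threshold)
rho_loc, d_loc = rank_correlation(x, y_localized), gsa_d(x, y_localized, threshold)
```

For the monotonic case both measures are at their maximum (a rank correlation of 1 and a d of 1). For the localized case the rank correlation is 0 because the band is centered in the input range, whereas d is 0.5, so only the pass/fail comparison registers the effect.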

As an example, the similarity and the difference of the importance ranks of the two parameters at 2x10^4 y, as determined from both GSA with a threshold of 10^-10 mSv/y/waste and multiple linear regression analysis (Figure 7), are interpreted based on the features of GSA discussed above:

- The Ave_T parameter has a strong and positive monotonic relationship with TD through the whole parameter range (Figure 8(a)). Multiple linear regression analysis results in the highest importance rank of this parameter based on the monotonic relationship. Also, the two input sample sets in GSA show a significant difference in the frequency distributions of the inputs (Figures 8(b) and 8(c)). This difference results in a high K-S d statistic. Therefore, GSA also provides the highest importance rank to the Ave_T parameter. Here, the highest importance rank of this parameter is also indicated from GSA with a threshold of 10^-20 mSv/y/waste at 2x10^4 y (Figure 7(a)). This demonstrates that GSA detects the strong monotonic relationship, as does multiple linear regression analysis.

- The S_Se parameter does not have a clear monotonic relationship with TD through the whole parameter range (Figure 9(a)). This results in a relatively low importance rank of this parameter from multiple linear regression analysis. However, the two input sample sets in GSA with a threshold of 10^-10 mSv/y/waste show a difference in the frequency distributions of the inputs (Figure 9(b)) and cause a high K-S d statistic based on the CDFs shown in Figure 9(c). Here, a GSA with a threshold of 10^-20 mSv/y/waste does not yield a significant K-S d statistic at 2x10^4 y (Figure 7(b)). This demonstrates that GSA detects potential factors like the S_Se parameter that cause high TD values for a specific parameter range even if these factors do not have a strong monotonic relationship that results in a high importance rank from multiple linear regression analysis. The importance of the S_Se parameter declines sharply at around 10^5 y (Figure 7(b)). This trend is consistent with the decrease with time of both the number of realizations dominated by Se (Figure 4(a)) and the range of TD values dominated by Se (Figure 4(b)).

- GSA also indicates relatively high importance ranks of the FR parameter and the Kd_G_Cs parameter at 2x10^4 y, in addition to the above two parameters. Therefore, in this example, these four parameters are interpreted to be potential factors that, through the combination of specific ranges of those parameters, cause TD values over 10^-10 mSv/y/waste at 2x10^4 y. In fact, for these four parameters, the t-test between the mean of input parameter values in the “fail” sample set and the mean of all sampled values also shows that the former mean is significantly different from the latter one. It is recognized that the t-test only examines differences between means and is a more conservative test than the K-S test, which examines the difference in distribution at any point in the range of values.

Based on the above discussion, it can be concluded that GSA assigns high importance ranks to input parameters having a strong monotonic relationship with the output, as do multiple linear regression analysis and partial correlation analysis. Furthermore, GSA can detect the localized existence of high TD values for specific ranges of specific input parameters better than the other methods. In the sensitivity analysis for the performance of a geological disposal system, understanding of factors that may cause high doses is an issue of interest as well as understanding the general input-output relationship. GSA provides a method that is compatible with both objectives.

However, these results do not mean that GSA is the best sensitivity analysis method or that it alone can be applied for the sensitivity analysis. It should be noted that GSA may have a low resolution for a weak monotonic relationship between an input and output, and may also give a relatively high importance rank to clearly unimportant parameters. For example, the Nb distribution coefficient on rock has a relatively high importance rank in this example, although Nb is not a dominant nuclide for TD and Nb-related parameters are clearly not important. Therefore, only the few parameters with the highest importance ranks have been used in this example.

V. Summary

From this study, it can be concluded that GSA with the K-S d statistic can be used as a method to detect the importance of input parameters on TD in the sensitivity analysis regarding the performance of a geological disposal system for high-level radioactive waste. It is suggested that the careful cross checking of results from more than one sensitivity analysis method is important to provide comprehensive insights with respect to the effect of parameter uncertainties on the performance. These robust insights can provide a basis for guiding future R&D plans including effective site characterization and site selection activities.

Future work will examine the difference in importance ranks for non-element-specific parameters between GSA and other sensitivity analysis methods. More detailed investigations of the mechanisms and conditions under which a certain nuclide dominates each TD will be important. Knowledge from these investigations will help the understanding of Monte Carlo simulation results and also help the interpretation of sensitivity analysis results.

References

[1] OECD/NEA (1997): The Probabilistic System Assessment Group: History and Achievements 1985-1994.

[2] AEC (2000): Review of the H12 Project to Establish Technical Basis for HLW Disposal in Japan, Atomic Energy Commission of Japan (in Japanese).

[3] AEC (1987): Long-term Program for Research, Development and Utilization of Nuclear Energy, Atomic Energy Commission of Japan (in Japanese).

[4] JNC (2000): H12 Project to Establish the Scientific and Technical Basis for HLW Disposal in Japan, Project Overview Report, JNC TN1410 2000-001.

[5] Masuda, S. (2001): Evolution of Geological Repository Program in Japan, In Proceedings of the 9th International High-Level Radioactive Waste Management Conference, Las Vegas, NV, USA, April 29 - May 3, 2001.

[6] Morgan, M.G. and M. Henrion (1990): Uncertainty: A Guide to Dealing with Uncertainty in Quantitative Risk and Policy Analysis, Cambridge University Press, New York, USA.

[7] Helton, J.C. and F.J. Davis (2000): Sampling-Based Methods for Uncertainty and Sensitivity Analysis, Sandia Report SAND99-2240, Sandia National Laboratories, Albuquerque, New Mexico, USA.

[8] Spear, R.C. and G.M. Hornberger (1980): Eutrophication in Peel Inlet. II. Identification of Critical Uncertainties via Generalized Sensitivity Analysis, Water Research, Vol. 14 (1), pp. 43-49.

[9] Golder Associates (2000): GoldSim Contaminant Transport Module, User’s Guide, Golder Associates Inc., Seattle, WA, USA, 2001.

[10] Wakasugi, K., Webb, E.K., Makino, H., Ishihara, Y., Ijiri, Y., Sawada, A., Baba, T., Ishiguro, K. and Umeki, H. (2000): A Trial of Probabilistic Simulation for Reference Case in the Second Progress Report on Research and Development for the Geological Disposal of HLW in Japan, In Proceedings of the 5th International Conference on Probabilistic Safety Assessment and Management, Osaka, Japan, November 27-December 1, 2000.

[11] McKenna, S.A. and B.W. Arnold (1998): Quantitative Sensitivity Analysis of Parameters in Monte Carlo Models (extended abstract), In Proceedings of the 1998 International High-Level Radioactive Waste Management Conference, May 11-14, Las Vegas, Nevada, American Nuclear Society, La Grange Park, Illinois.

[12] James, B.R., J.-P. Gwo and L. Toran (1996): Risk-Cost Decision Framework for Aquifer Remediation Design, Journal of Water Resources Planning and Management, Vol. 122 (6), pp. 414-420.

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94-AL-85000.


Figure 1 Schematic view of the geological disposal system in Japan [4]

Figure 2 Conceptual model for nuclide transport [4]

[Figure labels: HLW glass, failed overpack, bentonite buffer, EDZ, host rock, water-conducting features, repository, rock mass, sedimentary layer, aquifer, river. The major water-conducting fault is assumed to be located 100 m away from the repository and to provide a transport path. Legend: groundwater flow and nuclide transport; groundwater flow.]

Figure 3 The Monte Carlo simulation results

(a) Time history of Total Dose for 500 realizations (shown in two panels: realizations 1-250 and realizations 251-500)

(b) Percentiles (90th, 50th and 10th) and mean of the time history of Total Dose (“mean” is the average of log-transformed Total Dose)

[Axes in all panels: Total Dose [mSv/y/waste], 10^-35 to 10^-5, vs. Time after Disposal [y], 10^3 to 10^7; both logarithmic.]
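The summary curves in panel (b) could be computed at each output time along the following lines. This is a minimal sketch, not the authors' procedure: it assumes the Total Dose values of all realizations at one time are available as a list of positive numbers, the function names are hypothetical, and the linear-interpolation percentile rule is an assumption because the paper does not state which rule was used. Realizations with zero dose cannot be log-transformed and would need separate handling.

```python
import math


def log_mean(doses):
    """Average of log10-transformed doses, returned on the original scale.

    This equals the geometric mean; all doses must be positive.
    """
    logs = [math.log10(d) for d in doses]
    return 10 ** (sum(logs) / len(logs))


def percentile(values, p):
    """p-th percentile (0-100), interpolating linearly between order statistics.

    The interpolation rule is an assumption made for this sketch.
    """
    ordered = sorted(values)
    k = (len(ordered) - 1) * p / 100.0
    lo = math.floor(k)
    hi = math.ceil(k)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (k - lo)
```

Averaging in log space keeps a few very large doses from dominating the mean when the values span many orders of magnitude, as they do in Figure 3.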


Figure 4 Contribution of multiple nuclides to a set of Total Dose at each time

(a) The number of realizations dominated by different nuclides

(b) The distribution of Total Dose values dominated by different nuclides. In this figure, “Outliers” stands for values lying more than 1.5 times the box length above or below the box edge. “Extremes” stands for values lying more than 3.0 times the box length above or below the box edge.

[Panel (a): number of realizations dominated by different nuclides (0 to 500) vs. Time after Disposal [y] (10^3 to 10^7); legend: Se-79, Cs-135, Th-229, Np-237, Other Nuclide, Zero Value Case. Panel (b): box plots of log10 Total Dose [mSv/y/waste] (-36 to -6) for Se-79, Cs-135, Th-229 and Np-237 at 5x10^3, 1x10^4, 5x10^4, 1x10^5, 5x10^5, 1x10^6, 5x10^6 and 1x10^7 y; legend: Non-Outlier Max, Non-Outlier Min, 75%, 25%, Median, Outliers, Extremes.]
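The outlier and extreme definitions in the caption of panel (b) can be written as a small classification rule. The sketch below assumes the box edges are the 25% and 75% quartiles (as the legend suggests) and that they are supplied by the caller. The function name and the label strings are hypothetical.

```python
def classify_box_points(values, q1, q3):
    """Label each value as 'regular', 'outlier' or 'extreme'.

    The box length is q3 - q1. A value is an outlier if it lies more than
    1.5 box lengths beyond the nearer box edge, and an extreme if it lies
    more than 3.0 box lengths beyond it.
    """
    box_length = q3 - q1
    labels = []
    for v in values:
        # Distance outside the box; zero for values inside the box.
        distance = max(q1 - v, v - q3, 0.0)
        if distance > 3.0 * box_length:
            labels.append("extreme")
        elif distance > 1.5 * box_length:
            labels.append("outlier")
        else:
            labels.append("regular")
    return labels
```

For example, with log10 doses and a box from -12 to -10, a value of -6.5 lies 3.5 units above the box edge and is labelled an outlier, while -3 lies 7 units above it and is labelled an extreme.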


Figure 5 An illustrative example of CDFs for a “pass” sample set of inputs and a “fail” sample set

(a) Significant K-S d statistic case (b) Insignificant K-S d statistic case

[Axes in both panels: CDF (0 to 1) vs. Parameter (0-1); legend: PASS, FAIL.]
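The quantity illustrated in Figure 5 is the largest vertical distance between the two empirical CDFs. The sketch below computes it for two lists of input values in plain Python. The function names are hypothetical, and the critical value uses the standard large-sample approximation for the two-sample K-S test, which is an assumption here because this section does not state how significance was assessed.

```python
import math
from bisect import bisect_right


def ks_d_statistic(sample_a, sample_b):
    """Two-sample K-S d: maximum |F_a(x) - F_b(x)| over all observed x."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # Both empirical CDFs are step functions, so the maximum distance
    # is attained at one of the observed data points.
    for x in a + b:
        f_a = bisect_right(a, x) / len(a)
        f_b = bisect_right(b, x) / len(b)
        d = max(d, abs(f_a - f_b))
    return d


def ks_critical_value(n, m, alpha=0.05):
    """Approximate critical d for sample sizes n and m (large-sample formula)."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c_alpha * math.sqrt((n + m) / (n * m))
```

A d value above the critical value corresponds to the situation in panel (a), where the “pass” and “fail” CDFs separate clearly. A d value below it corresponds to panel (b).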

Figure 6 Importance ranks from GSA and multiple linear regression analysis. This figure shows importance ranks for parameters that rank within the top five in any analysis using different sensitivity analysis methods, different threshold values or different times. The importance ranks from GSA first appear at different times from those from SRRC, because before those times the number of inputs in the “fail” sample set does not exceed the number needed for the K-S test in this study (>25).

(a) GSA with threshold value of 10^-10 mSv/y/waste

(b) Multiple linear regression analysis

[Axes in both panels: Importance Rank [-] (1 to 10, then >10) vs. Time after Disposal [y] (10^3 to 10^7); legend: S_Se, S_Th, S_Np, Kd_E_Th, Kd_G_Cs, Kd_G_Th, Kd_G_Np, Ave_T, PeMD_depthFRLENG_F.]
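The bookkeeping behind panel (a) can be sketched as follows: split the sampled values of one input parameter by the dose threshold, check that the “fail” set is large enough for the K-S test, and rank parameters by their d statistic. This sketch rests on several assumptions. It takes “fail” to mean a Total Dose above the threshold and assumes that rank 1 goes to the largest d. All function names are hypothetical, and the d values are expected to come from a separate K-S computation.

```python
MIN_FAIL_SET_SIZE = 25  # Figure 6 caption: the "fail" set must exceed this number


def split_by_threshold(inputs, doses, threshold):
    """Split input values into ("pass", "fail") sets by the paired dose.

    Assumption: a realization fails when its dose exceeds the threshold.
    """
    passed, failed = [], []
    for x, dose in zip(inputs, doses):
        (failed if dose > threshold else passed).append(x)
    return passed, failed


def ks_test_allowed(fail_set, minimum=MIN_FAIL_SET_SIZE):
    """True if the "fail" set has more members than the required minimum."""
    return len(fail_set) > minimum


def importance_ranks(d_by_parameter):
    """Map each parameter name to its rank; rank 1 has the largest d."""
    ordered = sorted(d_by_parameter, key=d_by_parameter.get, reverse=True)
    return {name: rank for rank, name in enumerate(ordered, start=1)}
```

At early times few realizations exceed the threshold, so `ks_test_allowed` returns False and no GSA rank is reported. This is consistent with the caption's remark that GSA ranks first appear only once the “fail” set is large enough.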


[Figure 8 panels: (a) scatter plot of Total Dose [mSv/y/waste] (10^-35 to 10^-5) vs. Average Transmissivity (0.0 to 1.0), with points keyed by dominant radionuclide (RN: Cs-135, Se-79, Other RN, NONE); (b) histograms of Frequency of Inputs (0 to 60) over Average Transmissivity bins of width 0.1 for the "pass" and "fail" sample sets determined by the threshold of 10^-10 mSv/y/waste; (c) cumulative probability distribution function (CDF, 0.0 to 1.0) vs. Average Transmissivity (0.0 to 1.0) for the "pass" and "fail" sample sets.]

Figure 7 Comparison of importance ranks for the Ave_T parameter and the S_Se parameter. In the case of GSA with a threshold of 10^-20 mSv/y/waste, passing the threshold value means that those Total Doses (TDs) are below the 10th percentile after 2x10^5 y in Figure 3(b), and therefore the K-S d statistic obtained under these conditions is questionable.

(a) Average Transmissivity (Ave_T) (b) Se Solubility (S_Se)

Figure 8 Factors influencing the importance rank of the Average Transmissivity parameter
(a) A scatter plot of transformed Average Transmissivity (Ave_T) vs. Total Dose at 2x10^4 y after disposal
(b) Two frequency distributions of inputs for the “pass” sample set and the “fail” sample set determined by the threshold of 10^-10 mSv/y/waste
(c) Two CDFs from the frequency distributions shown in (b)

NOTE: In this figure, parameter values (x) shown in Table 1 are transformed to values (x’) in the range 0-1 using the following equation: x’ = (x - x_min)/(x_max - x_min). If the distribution of x in Table 1 is log-uniform, x is log-transformed before this transformation.
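The transformation in the NOTE can be expressed as a short function. This is a minimal sketch with a hypothetical function name and a hypothetical `log_uniform` flag. It assumes that x_min and x_max are log-transformed together with x for log-uniform parameters. The base of the logarithm does not affect x’, because it cancels in the ratio.

```python
import math


def to_unit_range(x, x_min, x_max, log_uniform=False):
    """Rescale x to the range 0-1: x' = (x - x_min) / (x_max - x_min).

    For log-uniform parameters, x and its bounds are log-transformed first
    (all three must then be positive).
    """
    if log_uniform:
        x, x_min, x_max = math.log10(x), math.log10(x_min), math.log10(x_max)
    return (x - x_min) / (x_max - x_min)
```

For a log-uniform parameter sampled between 1e-5 and 1e-1, the value 1e-3 maps to 0.5, the midpoint of the range in log space.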

[Figure 7 panels (a) and (b): Importance Rank [-] (1 to 10, then >10) vs. Time after Disposal [y] (10^3 to 10^7); legend: SRRC; GSA with threshold of 1E-9, 1E-10, 1E-11, 1E-12 and 1E-20.]


[Figure 9 panels: (a) scatter plot of Total Dose [mSv/y/waste] (10^-35 to 10^-5) vs. Se Solubility (0.0 to 1.0), with points keyed by dominant radionuclide (RN: Cs-135, Se-79, Other RN, NONE); (b) histograms of Frequency of Inputs (0 to 60) over Se Solubility bins of width 0.1 for the "pass" and "fail" sample sets determined by the threshold of 10^-10 mSv/y/waste; (c) cumulative probability distribution function (CDF, 0.0 to 1.0) vs. Se Solubility (0.0 to 1.0) for the "pass" and "fail" sample sets.]

Figure 9 Factors influencing the importance rank of the Se Solubility parameter
(a) A scatter plot of transformed Se Solubility (S_Se) vs. Total Dose at 2x10^4 y after disposal
(b) Two frequency distributions of inputs for the “pass” sample set and the “fail” sample set determined by the threshold of 10^-10 mSv/y/waste
(c) Two CDFs from the frequency distributions shown in (b)

NOTE: In this figure, parameter values (x) shown in Table 1 are transformed to values (x’) in the range 0-1 using the following equation: x’ = (x - x_min)/(x_max - x_min). If the distribution of x in Table 1 is log-uniform, x is log-transformed before this transformation.


Table 1 Distribution of input parameters sampled in the Monte Carlo simulations