USER GUIDE Part 2:
Sanitas Statistical Analysis Procedures
Version 8.7
Copyright
Information in this document is subject to change without notice and does not represent a
commitment on the part of Sanitas Technologies. The software described in this
document is furnished under a license agreement and may be used only in accordance
with the terms of the agreement. No part of this manual may be reproduced or
transmitted in any form or by any means, electronic or mechanical, including
photocopying, recording, or information storage or retrieval systems, for any purpose
other than the purchaser’s personal use without the permission of Sanitas Technologies.
© 1992-2007 SANITAS TECHNOLOGIES. All rights reserved.
Windows™, Windows® 95, 98, 2000 and Windows® NT are registered trademarks of Microsoft
Corporation. DUMPStat is a registered trademark of Discerning Systems Inc. No investigation has been
made of common-law trademark rights in any word. Sanitas Technologies makes no warranties, either
express or implied, regarding the enclosed computer software package or its fitness for any particular
purpose.
User Guide Version 8.7 designed by Sanitas Technologies.
SANITAS TECHNOLOGIES
22052 W 66th Street
Suite 133
Shawnee, KS 66226
(719) 742-3661
www.sanitastech.com
TABLE OF CONTENTS
SANITAS STATISTICAL ANALYSIS PROCEDURES
INTRODUCTION
DESCRIPTIVE STATISTICS
  Time Series Plot
  Box and Whiskers Plot
  Histogram
  Probability Plot
  Seasonality Plot
  Statistical Outlier Tests
  Rank Von Neumann
  Normality Report
  Stiff Diagram
  Piper Diagram
DETECTION MONITORING STATISTICS
  Shewhart-CUSUM Control Chart
  Intrawell Rank Sum
  Mann-Whitney / Wilcoxon Rank Sum
  Welch's t-test
  One-Way Analysis of Variance (ANOVA)
  Parametric ANOVA
  Nonparametric ANOVA
  Tolerance Limits
  Alert Levels (Arizona Standards Only)
  Prediction Limits (or Intervals): EPA Standards
  Prediction Limits (or Intervals): EPA Draft Unified Guidance (UG) Standards
  California Non-statistical Analysis of VOCs
  Poisson Composite VOC Prediction Limit
  Verification Retest Procedure – California
  Intrawell ASTM Approach (ASTM Standards Only)
  Interwell ASTM Approach (ASTM Standards Only)
EVALUATION MONITORING STATISTICS
  Trend Analysis
  Sen's Slope Estimator
  Seasonal Kendall Test
COMPLIANCE OR CORRECTIVE ACTION MONITORING STATISTICS
  Confidence Intervals
  Tolerance Intervals
  Proportion Estimate
APPENDIX I: GLOSSARY OF SELECTED STATISTICAL TERMS
BIBLIOGRAPHY
INDEX
SANITAS STATISTICAL ANALYSIS PROCEDURES
Introduction
This section describes the statistical methods incorporated into the Sanitas for Ground
Water and Environmental Media software developed and used by SANITAS
TECHNOLOGIES to evaluate environmental data. These methods are proposed for use
in the monitoring and response programs of Subtitle C & D facilities and incorporate the
ground water statistical analysis requirements of:
• 40 CFR Part 264;
• 40 CFR Parts 257 and 258;
• the EPA "Statistical Analysis of Ground Water Monitoring Data at RCRA Facilities -
Interim Final Guidance";
• the EPA "Addendum to the Interim Final Guidance";
• Articles 5 and 10, Chapter 15, Title 23 of the California Code of Regulations;
• the ASTM "Standard Guide for Developing Appropriate Statistical Approaches for
Ground-Water Detection Monitoring Programs" D 6312-98; and
• the EPA DRAFT Unified Guidance, September 2004.
Specifically, the descriptive statistics described in this document include:
• Time Series Plot;
• Box and Whiskers Plot (including annual and seasonal);
• Histogram;
• Skewness;
• Kurtosis;
• Probability Plot;
• Seasonality Plot;
• Statistical Outlier Tests;
• Rank Von Neumann;
• Normality Report;
• Stiff Diagram; and
• Piper Diagram.
The distributional statistics described include:
• Shapiro-Wilk Test;
• Coefficient-of-Variation Test;
• Shapiro-Francia Test;
• Chi-Squared Test; and
• Levene's Test.
The censored data substitution functions described include:
• Detection Limit Substitution;
• Cohen's Adjustment; and
• Aitchison's Adjustment.
The detection monitoring statistical tests described include:
• Combined Shewhart-CUSUM Control Charts;
• Intrawell Rank Sum:
  − Exact Test;
  − Large Sample Approximation Test;
• Mann-Whitney;
• Welch's t-test;
• Parametric Analysis of Variance;
• Bonferroni t-statistics (multiple comparisons procedure);
• Nonparametric Analysis of Variance:
  − Kruskal-Wallis;
• Tolerance Limits:
  − Parametric;
  − Nonparametric;
• Prediction Limits:
  − Parametric;
  − Nonparametric;
  − DMT-NP Method;
• California Non-Statistical Analysis of VOCs;
• Poisson Prediction Limits;
• Intrawell ASTM Method; and
• Interwell ASTM Method.
The evaluation/assessment monitoring statistical tests described include:
• Mann-Kendall:
  − Exact Test;
  − Normal Approximation Test;
• Sen's Slope Estimator and Plot; and
• Seasonal Kendall Slope Estimator and Plot.
The compliance and corrective action statistical tests described include:
• Confidence Intervals:
  − Parametric;
  − Nonparametric;
• Tolerance Intervals:
  − Parametric;
  − Nonparametric; and
• Proportion Estimate.
Moreover, this document describes the analysis decision logic and which pre- and post-
analysis tests are required to ensure that the data do not violate any size, distribution, or
seasonality assumptions of the relevant statistical tests.
Descriptive Statistics
Time Series Plot
Description:
Time Series plots provide a graphical method to view changes in data at a particular well
(monitoring point) or wells over time. Time Series plots display the variability in
concentration levels over time and can be used to indicate possible outliers. More than
one well can be compared on the same plot to look for differences between wells. They
can also be used to examine the data for trends.
Procedures:
Order the well measurements by sampling date. Number the sampling dates starting with
"0" (zero) for the initial date of collection; all subsequent dates are numbered as the days
elapsed relative to this initial date. Plot the analyte measurement on the y-axis against
sampling date on the x-axis. On Sanitas time series plots, the x-axis is labeled with
intermittent month/year values.
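The day-numbering step can be sketched in a few lines of Python (a minimal illustration with hypothetical sampling dates; the actual plotting is handled internally by Sanitas):

```python
from datetime import date

def elapsed_days(dates):
    """Number sampling dates as days elapsed since the earliest date (day 0)."""
    start = min(dates)
    return [(d - start).days for d in dates]

# Hypothetical sampling events
samples = [date(1992, 1, 5), date(1992, 4, 8), date(1992, 7, 1)]
days = elapsed_days(samples)  # [0, 94, 178]
```

These day numbers then serve as x-coordinates, with the analyte measurements on the y-axis.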
Box and Whiskers Plot
Description:
A quick way to visualize the distribution of a given data set is to construct a Box and
Whiskers plot. The basic box plot graphically locates the median and the 25th and 75th
percentiles of the data set; the "whiskers" extend to the minimum and maximum values of
the data set. The range between the ends of the box represents the Interquartile Range,
which can be used as a quick estimate of spread or variability. The mean is denoted by
a "+".
When comparing multiple wells or well groups, box plots for each well can be lined up
on the same axes to roughly compare the variability in each well. This may be used as a
quick exploratory screening for the test of homogeneity of variance across multiple wells.
If two or more boxes are very different in length, the variances in those well groups may
be significantly different.
Note that depending on the length of the well names and similar considerations, only
about 10 or 12 wells can fit on a Sanitas Box & Whiskers report without overcrowding.
For standard box plots, Sanitas will prompt the user for a maximum per page, but for
Grouped/Seasonal etc. box plots the user may have to divide the wells manually. To
keep the scale consistent among multiple subsets of a given View, deselect wells in the
Examine Observations sub-window. The deselected values will still be used in
calculating the scale.
Procedures:
The data are first ordered from lowest to highest. The 25th (lower quartile), 50th
(median), and 75th (upper quartile) percentile values from the data set are then computed.
To compute the pth percentile, find the data point with rank position equal to:

p (n + 1) / 100

Where:
n = number of samples;
p = the percentile of interest.
In the case of sparse data, the following logic is applied:

When n = 1, minimum value = 25th percentile value = median = 75th percentile
value = maximum value;

When n = 2, minimum value = 25th percentile value, maximum value = 75th
percentile value, and median = ½ (minimum + maximum values);

When n = 3, minimum value = 25th percentile value, maximum value = 75th
percentile value, and median = middle value.
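The rank-position rule and the sparse-data cases can be sketched as follows. Linear interpolation between adjacent order statistics for fractional rank positions is an assumption of this sketch; the guide does not specify the interpolation rule Sanitas uses.

```python
def percentile(data, p):
    """pth percentile via rank position p*(n+1)/100 (1-based), with the
    sparse-data rules for n <= 3 described above. For n <= 3, p must be
    25, 50, or 75."""
    xs = sorted(data)
    n = len(xs)
    if n == 1:
        return xs[0]
    if n == 2 and p == 50:
        return 0.5 * (xs[0] + xs[1])   # median = 1/2 (minimum + maximum)
    if n <= 3:
        return {25: xs[0], 50: xs[n // 2], 75: xs[-1]}[p]
    r = p * (n + 1) / 100.0            # rank position
    r = min(max(r, 1.0), float(n))     # clamp to the observed range
    lo = int(r)
    if lo >= n:
        return xs[-1]
    # interpolate between adjacent order statistics (assumption)
    return xs[lo - 1] + (r - lo) * (xs[lo] - xs[lo - 1])
```

For example, `percentile([1, 2, 3, 4, 5, 6, 7], 25)` gives rank position 25(8)/100 = 2, i.e. the second-smallest value.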
Histogram
Description:
A frequency distribution may be visually displayed in the form of a histogram.
Procedure:
The analyte measurements are plotted on the x-axis and the frequencies of these
measurements are plotted on the y-axis. Values are collapsed within class intervals, each
represented by a rectangular bar on the plot. The height of each bar corresponds with the
respective frequencies. Coefficients of skewness and kurtosis are computed from the data
to give an indication of normality.
Skewness:
Skewness is a measure of the symmetry of the frequency distribution. The coefficient of
skewness, γ, is computed as follows:

γ = √n × Σ (Xi − X̄)³ / [ (n − 1)^(3/2) × S³ ]   (sum over i = 1, …, n)

Where:
Xi = the value for the ith observation;
X̄ = the mean of the n observations;
S = the standard deviation; and
n = the number of observations.
The mean, X̄, and the standard deviation, S, are computed as follows:

X̄ = Σ fi mi / n   (sum over i = 1, …, k)

S = √[ Σ (Xi − X̄)² / (n − 1) ]   (sum over i = 1, …, n)

Where:
fi = the frequency of the ith observation;
mi = the value of the ith observation; and
k = the number of distinct values.
A right skewed distribution has a positive skewness value, and a left skewed distribution
has a negative skewness value. A large absolute skewness value can be an indication of
the presence of outliers. A normally distributed frequency distribution would have a
skewness absolute value of less than 1.
Kurtosis:
Kurtosis is a measure of flatness or peakedness of the frequency distribution. The
coefficient of kurtosis, K, is computed as follows:

K = [ n (n + 1) / ( (n − 1)(n − 2)(n − 3) ) ] × Σ [ (Xi − X̄) / S ]⁴ − 3 (n − 1)² / [ (n − 2)(n − 3) ]
(sum over i = 1, …, n)

Where:
Xi = the value for the ith observation;
X̄ = the mean of the n observations;
S = the standard deviation; and
n = the number of observations.
A normal distribution has a kurtosis absolute value of less than 1. A negative kurtosis
value indicates a flatter curve than the normal distribution. A positive kurtosis value
indicates a curve that is more peaked than the normal distribution.
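Both coefficients can be computed directly from the formulas above. The sketch below is an illustration (not the Sanitas implementation itself) and reproduces the Example 1 results of 1.132 and 1.844:

```python
import math

def skewness(xs):
    """Coefficient of skewness: sqrt(n) * sum((x - mean)**3)
    divided by (n - 1)**1.5 * S**3, with S the sample standard deviation."""
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))
    return math.sqrt(n) * sum((x - m) ** 3 for x in xs) / ((n - 1) ** 1.5 * s ** 3)

def kurtosis(xs):
    """Coefficient of kurtosis with the small-sample correction given above
    (a normal distribution gives a value near 0)."""
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))
    quads = sum(((x - m) / s) ** 4 for x in xs)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))) * quads \
        - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))

# Example 1 data (Table 8.1)
data = [15, 17.5, 13.2, 14.9, 27, 22.6, 18.7, 17.4, 19, 15, 16.9]
g = skewness(data)  # ≈ 1.132
k = kurtosis(data)  # ≈ 1.844
```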
EXAMPLE 1:
Date Xi (concentration) (Xi − X̄)³ [(Xi − X̄)/S]⁴
1/5/1992 15 -25.08 0.30
4/8/1992 17.5 -0.08 0.00
7/1/1992 13.2 -105.64 2.05
10/15/1992 14.9 -27.74 0.34
1/20/1993 27 746.82 27.82
4/14/1993 22.6 102.03 1.96
7/12/1993 18.7 0.46 0.00
10/22/1993 17.4 -0.15 0.00
1/15/1994 19 1.23 0.01
4/2/1994 15 -25.08 0.30
7/3/1994 16.9 -1.08 0.00
Table 8.1: Example Data for Skewness and Kurtosis
X̄ = 17.93 S = 3.95 n = 11
Skewness

Σ (Xi − X̄)³ = 665.68

γ = √11 × 665.68 / [ (11 − 1)^(3/2) × 3.95³ ] = 1.132
Kurtosis

Σ [ (Xi − X̄) / S ]⁴ = 32.79

K = [ 11 (11 + 1) / ( (11 − 1)(11 − 2)(11 − 3) ) ] × 32.79 − 3 (11 − 1)² / [ (11 − 2)(11 − 3) ] = 1.844
Probability Plot
Description:
Probability plots are a graphical test for normality. These plots may be used to
investigate whether a set of data or the residuals of the data follow a normal or
transformed-normal distribution.
Procedure:
The data are first ordered from lowest to highest. The analyte measurements are plotted
in increasing order on the x-axis and the z-scores from a standard normal distribution
corresponding to the proportion of observations less than or equal to that measurement
are plotted on the y-axis. The corresponding z-score from a standard normal distribution
is computed by the following formula:

yi = Φ⁻¹( i / (n + 1) )
Where:
Φ⁻¹ = the inverse of the cumulative standard Normal distribution;
n = the sample size; and
i = the rank position of the ith ordered concentration.
If the data are normal, the points when plotted will lie in a straight line. Visual curves or
bends indicate that the data do not follow a normal distribution.
EXAMPLE 2
Concentration (x-axis) Order (i) i/(n+1) z-score (y-axis)
39 1 0.077 −1.425
56 2 0.154 −1.02
58.8 3 0.231 −0.735
64.4 4 0.308 −0.504
81.5 5 0.385 −0.294
85.6 6 0.462 −0.095
151 7 0.538 0.095
262 8 0.615 0.294
331 9 0.692 0.504
578 10 0.769 0.735
637 11 0.846 1.02
942 12 0.923 1.425
Table 8.2: Example Data for Probability Plot
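The z-score computation can be sketched with the standard library's inverse normal CDF; this is a minimal illustration using the Example 2 concentrations, with the plotting itself omitted:

```python
from statistics import NormalDist

def probability_plot_points(data):
    """Return (concentration, z-score) pairs, where the z-score is the
    inverse standard normal CDF evaluated at i/(n+1) for rank i."""
    xs = sorted(data)
    n = len(xs)
    return [(x, NormalDist().inv_cdf(i / (n + 1)))
            for i, x in enumerate(xs, start=1)]

# Example 2 concentrations
conc = [39, 56, 58.8, 64.4, 81.5, 85.6, 151, 262, 331, 578, 637, 942]
points = probability_plot_points(conc)  # first z-score ≈ -1.425
```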
Seasonality Plot
Description:
Seasonality plots are constructed as Time Series plots for both observed values and
values deseasonalized according to the method described by the EPA (U. S. EPA, April
1989). In addition to the Time Series plots, box plots are presented for the original and
deseasonalized data. The presence of seasonality is tested with the Kruskal-Wallis H
statistic with correction for ties (see Control Charts for method description).
Statistical Outlier Tests
Description:
A statistical outlier is a value that is extremely different from the other values in the data
set. Outlier tests identify data points that do not appear to fit the distribution of the rest of
the data set and determine if they differ significantly from the rest of the data.
A value is considered to be suspect if it is an order of magnitude larger or smaller than
the rest of the data. Once a value is identified as a statistical outlier, it should be checked
thoroughly for possible lab instrument failure, field collection problems, or data entry
errors. Outliers may exist naturally in the data if there is an extremely wide inherent or
temporal variability in the data, or if there is an on-site problem such as leakage or a
new impact source. An outlier should not be removed from the data set unless the value
has been documented to be erroneous. Outliers that cannot be explained by error may
call for further investigation (EPA, April 1989).
Auto-Checking for Outliers
The auto-checking for outliers option does not check for normality of the data; it only
identifies possible outliers, using the "EPA 1989" method. Therefore, when a possible
outlier is found using auto-check, a separate outlier test should be run for that particular
well (refer to “Auto-Checking for Outliers” in the “Analysis Options” section of the
“User-Selectable Options” chapter).
"EPA 1989" OUTLIER TEST
Assumptions:
The "EPA 1989" outlier test assumes that all data values, except for the suspect
observation, are normally or log normally distributed. A minimum of three observations
is required; however, a minimum of eight observations is recommended.
Procedure:
First, the data are log-transformed, then ordered from lowest to highest. The mean and
standard deviation are then calculated. Next, calculate the outlier test statistic, Tn, as:
Tn = ( Xn − X̄ ) / S

Where:
Xn = the suspect observation;
X̄ = the sample mean; and
S = the sample standard deviation.
Then compare the absolute value of the outlier test statistic (Tn) with the critical value,
(Tn (0.05)), for the given sample size, n, at a five percent significance level (Table 8,
Appendix B, EPA, April 1989). If abs(Tn) exceeds the tabulated value, there is statistical
evidence that Xn is a statistical outlier. If so, this value is removed and the remaining
dataset is retested using the same method, until all such outliers have been accounted for.
EXAMPLE 3:
Total Organic Carbon (mg/l) Log-Transformed Data
1700 7.4
1900 7.5
1500 7.3
1300 7.2
11000 9.3
1250 7.1
1000 6.9
1300 7.2
1200 7.1
1450 7.3
1000 6.9
1300 7.2
1000 6.9
2200 7.7
4900 8.5
3700 8.2
1600 7.4
2500 7.8
1900 7.5
Table 8.3: Example Data for Outlier Test
The mean and standard deviation are computed for all log-transformed data, including
the outlier:

X̄ = 7.5 s = 0.61

T19 = ( 9.3 − 7.5 ) / 0.61 = 2.95

From Table 8, Appendix B of the US EPA Guidance, the critical value T19(0.05) is 2.532.
Since T19 exceeds the tabulated value, there is statistical evidence that this observation is
an outlier.
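The Tn calculation can be sketched as below, using the Example 3 data; the log transformation and (n − 1)-denominator standard deviation follow the procedure above, and the critical-value table is not reproduced. With unrounded intermediates the statistic comes out slightly higher than the hand calculation (about 2.96 versus 2.95):

```python
import math

def epa_outlier_statistic(values):
    """Tn = (suspect largest observation - mean) / standard deviation,
    computed on the natural-log-transformed data."""
    xs = [math.log(v) for v in values]
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return (max(xs) - mean) / s

# Example 3 Total Organic Carbon data (mg/l)
toc = [1700, 1900, 1500, 1300, 11000, 1250, 1000, 1300, 1200, 1450,
       1000, 1300, 1000, 2200, 4900, 3700, 1600, 2500, 1900]
tn = epa_outlier_statistic(toc)  # ≈ 2.96; compare against T19(0.05) = 2.532
```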
DIXON'S OUTLIER TEST
Requirements and Assumptions:
Dixon’s test is only recommended for sample sizes n ≤ 25. It assumes that the data set
(not including the suspected outlier) is normally-distributed.
Procedure:
Step 1. Sort the data set and label the ordered values, x(i).
Step 2. To test for a low outlier, compute the test statistic C using the appropriate
equation below, based on the sample size:
C = ( x(2) − x(1) ) / ( x(n) − x(1) )      for 3 ≤ n ≤ 7
C = ( x(2) − x(1) ) / ( x(n−1) − x(1) )    for 8 ≤ n ≤ 10
C = ( x(3) − x(1) ) / ( x(n−1) − x(1) )    for 11 ≤ n ≤ 13
C = ( x(3) − x(1) ) / ( x(n−2) − x(1) )    for 14 ≤ n ≤ 20

Or, to test for a high outlier, compute the test statistic C using the appropriate equation
below, based on the sample size:

C = ( x(n) − x(n−1) ) / ( x(n) − x(1) )    for 3 ≤ n ≤ 7
C = ( x(n) − x(n−1) ) / ( x(n) − x(2) )    for 8 ≤ n ≤ 10
C = ( x(n) − x(n−2) ) / ( x(n) − x(2) )    for 11 ≤ n ≤ 13
C = ( x(n) − x(n−2) ) / ( x(n) − x(3) )    for 14 ≤ n ≤ 20
Step 3. Find the critical point for the specified alpha level in Table 8-1, US EPA DRAFT
Unified Guidance 2004*. If C exceeds the tabulated value, the suspected outlier should
be declared a statistical outlier and investigated further.
Dixon's test can be modified to test for more than one outlier as follows. If the least
extreme suspected outlier is tested, having removed any more extreme values, and proves
to be a statistical outlier, then it may be concluded that the more extreme suspected
values are also statistical outliers. If not, then the least extreme of the removed values
can be tested in a similar manner. Importantly, though, this method can only test multiple
suspected outliers if they are both on the same tail, i.e. both high outliers or both low
outliers. So if both a high and a low outlier are suspected in a single data set, this test is
not recommended. If the sample size is at least 20, Rosner's should be substituted;
otherwise contact a professional statistician.
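The piecewise definition of C can be sketched as a single function; the offsets below encode the sample-size bands listed above, and the Table 8-1 critical values are not reproduced:

```python
def dixon_statistic(data, tail="high"):
    """Dixon's C for a suspected high or low outlier, using the
    sample-size bands listed above (defined here for 3 <= n <= 20)."""
    x = sorted(data)
    n = len(x)
    if not 3 <= n <= 20:
        raise ValueError("this sketch covers 3 <= n <= 20 only")
    # (numerator offset j, denominator offset k) for each sample-size band
    if n <= 7:
        j, k = 1, 0
    elif n <= 10:
        j, k = 1, 1
    elif n <= 13:
        j, k = 2, 1
    else:
        j, k = 2, 2
    if tail == "low":
        return (x[j] - x[0]) / (x[n - 1 - k] - x[0])
    return (x[n - 1] - x[n - 1 - j]) / (x[n - 1] - x[k])
```

For example, testing the high value in the hypothetical set [1, 2, 3, 10] (n = 4) gives C = (10 − 3)/(10 − 1) ≈ 0.78.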
ROSNER'S OUTLIER TEST
Requirements and Assumptions:
Rosner’s test is recommended when the sample size is 20 or larger. The critical points
can be used to identify from 2 to 5 outliers. Rosner’s method again assumes the
underlying data set (less any outliers) is normally distributed, or can be transformed to
normal.
Procedure:
Step 1. Sort the data set and label the ordered values x(i). Then identify the maximum
number of suspected outliers, r0.
Step 2. Compute the mean and standard deviation of all the data; call these values x(0) and
s(0). Then determine the measurement farthest from x(0) and label it y(0).
Step 3. Remove y(0) from the data set and compute the mean and standard deviation of the
remaining observations. Call these new values x(1) and s(1). Again find the value in this
data subset furthest from x(1) and label it y(1).
Step 4. Remove y(1), again calculate the mean and standard deviation, and continue this
process until r0 potential outliers have been removed.
Step 5. We now have the values necessary to test for r outliers (r ≤ r0) by computing the
test statistic:

Rr = | y(r−1) − x(r−1) | / s(r−1)
First test for r0 outliers. If the test statistic exceeds the first critical point from Table 8-2,
US EPA Draft Unified Guidance 2004*, based on the sample size and the alpha level,
this may be taken as evidence that there are r0 outliers. If not, test for r0–1 outliers in the
same manner using the next critical point, continuing until a certain number of outliers
have been identified or until no outliers are found.
Note that Sanitas will accept one as the number of suspected outliers. In this case, it uses
the second tabled value from k=2 (as if two outliers were suspected but not found) to test
for one outlier.
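Steps 1 through 5 can be sketched as follows. The use of the (n − 1)-denominator sample standard deviation at each stage is an assumption of this sketch, and the comparison against the Table 8-2 critical points is left out; the data set shown is hypothetical:

```python
import math

def rosner_statistics(data, r0):
    """Rosner statistics R1..R(r0): at each stage remove the value farthest
    from the current mean and record |y - mean| / s for that stage."""
    xs = list(data)
    stats = []
    for _ in range(r0):
        n = len(xs)
        mean = sum(xs) / n
        s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
        y = max(xs, key=lambda x: abs(x - mean))  # farthest observation
        stats.append(abs(y - mean) / s)
        xs.remove(y)
    return stats

# Hypothetical data with one obvious extreme value, testing up to r0 = 2
stats = rosner_statistics([1, 2, 3, 100], r0=2)
```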
Rank Von Neumann
Description:
This statistical procedure is a test for serial correlation at a given well (monitoring point).
The test will also reflect the presence of trends or cycles, such as seasonality. Therefore,
to test for serial correlation only, one must first remove any seasonality or trends that are
present.
Rank Von Neumann Procedure:
The null hypothesis to be tested is:
H0: There is no serial correlation present in the data.
The alternative hypothesis is:
HA: There is serial correlation present in the data.
The data are first ordered from lowest to highest, assigning the rank of 1 to the smallest
observation, the rank of 2 to the next smallest,…, and the rank of n to the largest. Let R1
be the rank of x1, R2 be the rank of x2, and Rn the rank of xn.
Compute the Rank Von Neumann statistic as:

Rv = [ 12 / ( n (n² − 1) ) ] × Σ ( Ri − Ri+1 )²   (sum over i = 1, …, n−1)
Where:
Ri = the rank of the ith observation in the sequence; and
Ri+1 = the rank of the (i+1)st observation in the sequence (the following
observation).
If the sample size n is between 10 and 100 (inclusive), the calculated value Rv is
compared to the tabulated Rv(α) (Table A5, Gilbert). The null hypothesis is rejected if
the computed value Rv is less than the tabulated critical value.
If the sample size, n, is greater than 100, compute:
If the sample size, n, is greater than 100, compute:

ZR = ( Rv − 2 ) × √n / 2
Reject the null hypothesis if ZR is negative and the absolute value of ZR is greater than
the tabulated Z(1−α) value (Table A1, Gilbert).
EXAMPLE 4:
Date Concentration Rank (Ri − Ri+1)²
3/3/1995 2.2 10 9
6/3/1995 2.74 13 81
9/3/1995 0.42 4 4
12/3/1995 0.63 6 1
3/3/1996 0.82 7 1
6/3/1996 0.86 8 36
9/3/1996 0.31 2 100
12/3/1996 2.33 12 49
3/3/1997 0.5 5 36
6/3/1997 2.22 11 4
9/3/1997 1.1 9 36
12/3/1997 0.32 3 4
2/3/1998 0.01 1
Table 8.4: Rank Von Neumann Example Data
Σ ( Ri − Ri+1 )² = 361   (sum over i = 1, …, n−1)

Rv = [ 12 / ( 13 (13² − 1) ) ] × 361 = 1.984
−=
The tabled critical value at an alpha of .05 is 1.14. Since Rv is greater than the tabled
critical value, we cannot reject H0.
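Example 4's computation can be sketched as below; ranks are assigned without tie handling, which matches the example, where no ties occur:

```python
def rank_von_neumann(values):
    """Rv = 12 * sum((R_i - R_{i+1})**2) / (n * (n**2 - 1)), with ranks
    assigned 1..n from smallest to largest (no tie handling in this sketch)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    ssq = sum((ranks[i] - ranks[i + 1]) ** 2 for i in range(n - 1))
    return 12 * ssq / (n * (n ** 2 - 1))

# Example 4 data, in sampling order
conc = [2.2, 2.74, 0.42, 0.63, 0.82, 0.86, 0.31,
        2.33, 0.5, 2.22, 1.1, 0.32, 0.01]
rv = rank_von_neumann(conc)  # ≈ 1.984; compare to the tabled critical value
```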
Normality Report
Description:
The Normality Test report is a textual report of normality test results for each well
(monitoring point) selected in the current data set. Either the Shapiro-Wilk/Shapiro-Francia
method or the Chi-Squared method (see descriptions elsewhere in this statistical
write-up) may be used, and optionally the normality results after each transformation in
the Ladder of Powers (see Chapter 5 of the User Guide) may be detailed.
Stiff Diagram
Description:
Stiff Diagrams are a graphical method devised to portray water compositions and
facilitate the interpretation and presentation of chemical analyses. They may be used to
visually compare the chemical composition of water quality across wells, and aid in
determining whether the aquifer is heterogeneous or homogenous. Stiff Diagrams are
calculated in terms of equivalents per million, more commonly referred to as
milliequivalents; and they take into account the ionic charge and the formula weight for
selected constituents, specifically (sodium+potassium), magnesium, calcium, chloride,
sulfate, and bicarbonate.
Procedure:
Milliequivalents per liter for each of the above constituents are calculated by dividing the
concentration by the equivalent weight (formula weight divided by ionic charge).
The resulting values determine the relative distances from the center line for the
respective vertices of the diagram.
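As an illustration of the milliequivalent conversion (the formula weights and charges below are standard chemistry values, not taken from the Sanitas guide):

```python
# Formula weights (g/mol) and ionic charges: standard chemistry values,
# not taken from the Sanitas guide
IONS = {
    "Ca": (40.08, 2), "Mg": (24.31, 2), "Na": (22.99, 1), "K": (39.10, 1),
    "Cl": (35.45, 1), "SO4": (96.06, 2), "HCO3": (61.02, 1),
}

def meq_per_liter(ion, mg_per_liter):
    """meq/L = concentration (mg/L) / (formula weight / charge)."""
    weight, charge = IONS[ion]
    return mg_per_liter / (weight / charge)
```

For example, 100 mg/L of calcium (formula weight 40.08, charge 2) is 100 / 20.04 ≈ 4.99 meq/L.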
To run a Stiff Diagram report, choose a sampling date from the drop down list, and
optionally extend this to a range if the sampling event occupied multiple days. Select the
wells to analyze, and click Run.
The following options are available:
Label Axes: Adds a scale (in milliequivalents) to the x-axis of each Stiff Diagram drawn.
Label Constituents: Adds abbreviated constituent names on the vertices.
Compare Dates: Replaces the Date ComboBox (single date selection) with a scrolling list
of dates (multiple date selection). Allows the comparison of data not only by well, but
also by date.
Piper Diagram
Description:
Piper diagrams are a form of tri-linear diagram that provides a visual representation of
the ion concentration of groundwater. A Piper diagram has two triangular plots on the
right and left side of a four-sided center field. The three major cations are plotted in the left
triangle and anions in the right. Each of the three cation/anion variables, in
milliequivalents, is divided by the sum of the three values, to produce a percent of total
cation/anions. These percentages determine the location of the associated symbol. The
data points in the center field are located by extending the points in the lower triangles to
the point of intersection.
In order for a Piper diagram to be produced, the selected data file must contain the
following constituents: Sodium (or Na), Potassium (or K), Calcium (or Ca), Magnesium
(or Mg), Chloride (or Cl), Bicarbonate (or HCO3), Carbonate (or CO3) and Sulfate (or
SO4). The units should be mg/l, ppm, ug/l or ppb, and must be consistent.
To run a Piper Diagram report, choose a sampling date from the drop down list, and
optionally extend this to a range if the sampling event occupied multiple days. Select the
wells to analyze, and click Run.
The following options are available:
Label Axes: Adds percent values on the axes.
Label Constituents: Adds constituent names on the axes.
Compare Dates: Replaces the Date drop-down list (single date selection) with a scrolling
list of dates (multiple date selection). Allows the comparison of data not only by well,
but also by date.
Note Cation-Anion Balance: Shows on the report the Cation-Anion Balance, which is the
absolute value of the difference between the total cations and the total anions, both
expressed in milliequivalents, divided by their sum.
Detection Monitoring Statistics
Shewhart-CUSUM Control Chart
Description:
The combined Shewhart-Cumulative Sum (CUSUM) Control Charts are useful graphical
tools for evaluating detection-monitoring data because they monitor the inherent
statistical variation of data collected within a single well (monitoring point), and flag
anomalous results.
Control Charts are a form of time-series graph, on which a parametric statistical
representation of the concentrations of a given constituent is plotted at intervals over time.
The statistics are computed and plotted together with upper and/or lower control limits on
a chart where the x-axis represents time. If a result falls outside the predetermined control
limits, then the process is considered “out of control” and may indicate potentially
impacted ground water. Otherwise, the process is considered “in control.”
Assumptions:
The standard assumptions in the use of Control Charts are that the data are independent
and normally distributed with a constant mean, X̄, and constant variance, s², and that the
background data haven’t been previously impacted by the facility. In addition, it is
assumed that seasonality in the data is sufficiently accounted for to minimize the chance
of mistaking seasonal effects for evidence of water quality degradation due to release
from a nearby waste management unit (WMU). Another assumption is that a sufficient
number of background data points exists to provide reliable estimates of the mean and
standard deviation of the constituent’s concentration values for a given well.
Independence:
Prior to construction of the Control Charts, the assumption of data independence should
be considered. The monitoring data should be collected to ensure physical independence
of the samples, and a specified rigorous field sampling protocol should be followed.
Distribution:
The distribution of the data is evaluated by applying the Shapiro-Wilk or Shapiro-Francia
test for normality to the raw data or, when applicable, to the transformed data. The null
hypothesis, H0, to be tested is:
H0: The population has a normal (or transformed-normal) distribution.
The alternative hypothesis, HA, is:
HA: The population does not have a normal (or transformed-normal)
distribution.
Shapiro-Wilk Test Procedure:
Calculation of the Shapiro-Wilk W-statistic to test the null hypothesis is presented in
detail on page 158 of Statistical Methods for Environmental Pollution Monitoring
(Gilbert, 1987). This test will be used when there are 50 or fewer observations to test.
Beyond 50 observations, the Shapiro-Francia test will be used.
The denominator, d, of the W test statistic, using n data values, is computed as follows:

d = Σ ( Xi − X̄ )² = Σ Xi² − (1/n) ( Σ Xi )²   (sums over i = 1, …, n)

Where:
Xi = the value for the ith observation;
X̄ = the mean of the n observations; and
n = the number of observations.
Order the n data from smallest to largest (e.g. X[1] < X[2] < ... < X[n]). Then compute k
where:
k = n/2 if n is even

k = (n − 1)/2 if n is odd
The coefficients a1, a2, ..., ak for the observed n data can be found in Table A6 (Gilbert,
1987).
The W test statistic is then computed as follows:

W = (1/d) × [ Σ ai ( X[n−i+1] − X[i] ) ]²   (sum over i = 1, …, k)
The data are tested at the α = 0.05 significance level. The significance level represents the probability of rejecting the null hypothesis when it is true (i.e., the rate of false positives), also known as the Type I error rate. It is customary to set α at 0.05 (corresponding to a 95 percent confidence level) or at 0.01 (corresponding to a 99 percent confidence level). Reject H0 at the α significance level if W is less than the quantile given in Table A7 (Gilbert, 1987).
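For illustration only, the decision rule above can be sketched in Python using SciPy's built-in Shapiro-Wilk implementation, which returns a p-value in place of the Table A7 quantile lookup (the helper name passes_shapiro is hypothetical, not part of Sanitas):

```python
import numpy as np
from scipy import stats

def passes_shapiro(x, alpha=0.05):
    """Return True when H0 (normality) is NOT rejected at level alpha."""
    w_stat, p_value = stats.shapiro(np.asarray(x, dtype=float))
    return p_value >= alpha

# a perfectly bell-shaped sample (normal quantiles) is retained as normal
normal_like = stats.norm.ppf((np.arange(1, 31) - 0.5) / 30)
# a near-constant sample with one extreme value is rejected
skewed = np.array([1.0] * 9 + [100.0])
print(passes_shapiro(normal_like), passes_shapiro(skewed))  # True False
```

When a log (or other) transformation is in use, the same check is applied to the transformed values.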
EXAMPLE 5:

Rank  xi (smallest to largest)  yi = ln xi  (yi − ȳ)²
1     0.13    -2.0402   3.49126
2     0.45    -0.7985   0.39285
3     0.60    -0.5108   0.11499
4     0.76    -0.2744   0.01055
5     1.05    0.0488    0.04863
6     1.12    0.1133    0.08126
7     1.20    0.1823    0.12535
8     1.37    0.3148    0.23672
9     1.69    0.5247    0.48505
10    2.06    0.7227    0.80002
Table 8.5: Example Data for Shapiro-Wilk Test

n = 10

d = Σ(yi − ȳ)² = 5.7865

k = n/2 = 10/2 = 5

W = (1/5.7865) [ .5739(.7227 − (−2.0402)) + .3291(.5247 − (−.7985)) + .2141(.3148 − (−.5108)) + .1224(.1823 − (−.2744)) + .0399(.1133 − .0488) ]² = 0.87

The calculated W of 0.87 is greater than the W of 0.842 found in Table A7 (Gilbert, 1987) for α = .05. Therefore, it is concluded that the data are lognormally distributed.

The Shapiro-Wilk test of normality can be used for sample sizes up to 50. When the sample size is larger than 50, the Shapiro-Francia test can be used instead. A less accurate normality test for smaller sample sizes is the coefficient-of-variation test.

Coefficient-of-Variation Test Procedure:
Calculate the sample mean, X̄, of the n observations Xi, where i = 1, ..., n. Then calculate the sample standard deviation, s. The coefficient of variation, CV, is calculated as:

CV = s / X̄
If CV exceeds 1.00 then reject H0 that the data are normally distributed.
EXAMPLE 6:
Date Concentration
1/5/1993 0.04
10/3/1993 0.18
2/1/1994 0.18
4/7/1994 0.25
7/2/1994 0.29
10/9/1994 0.38
1/15/1995 0.5
4/17/1995 0.5
7/1/1995 0.6
11/2/1995 0.93
1/15/1996 0.97
4/17/1996 1.1
7/1/1996 1.16
11/2/1996 1.29
1/15/1997 1.37
2/28/1997 1.38
5/1/1997 1.45
8/2/1997 1.46
11/4/1997 2.58
1/7/1998 2.69
3/6/1998 2.8
8/29/1998 3.33
11/2/1998 4.5
1/6/1999 6.6
Table 8.6: Example Data for Coefficient of Variation
X̄ = 1.52    s = 1.56

CV = 1.56/1.52 = 1.03
Since CV is greater than 1.00, the data were not found to be normally distributed.
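As a quick check, the coefficient-of-variation screen can be reproduced for the Table 8.6 data in Python (a sketch, not part of Sanitas):

```python
import numpy as np

def coefficient_of_variation(x):
    """CV = s / X-bar, using the sample standard deviation (ddof=1)."""
    x = np.asarray(x, dtype=float)
    return x.std(ddof=1) / x.mean()

conc = [0.04, 0.18, 0.18, 0.25, 0.29, 0.38, 0.5, 0.5, 0.6, 0.93, 0.97, 1.1,
        1.16, 1.29, 1.37, 1.38, 1.45, 1.46, 2.58, 2.69, 2.8, 3.33, 4.5, 6.6]
cv = coefficient_of_variation(conc)
print(round(cv, 2))  # 1.03 -> CV > 1.00, so reject H0 of normality
```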
Shapiro-Francia Test Procedure:
Calculation of the Shapiro-Francia W′ -statistic to test the null hypothesis is presented in
detail by EPA (U.S. EPA, 1992). The test statistic, W′ , is computed as follows:
W′ = (Σ mi xi)² / [ (n − 1) S² Σ mi² ], with the sums taken over i = 1, …, n
Where:
xi = the ith ordered value of the sample;
mi = the approximate expected value of the ith ordered normal quantile;
n = the number of observations; and
S = the standard deviation of the sample.
The values for mi can be approximately computed as:
mi = Φ⁻¹( i / (n + 1) )

Where:
Φ⁻¹ = the inverse of the standard normal distribution with zero mean and unit variance.
Reject H0 at the α = 0.05 significance level if W′ is less than the critical value provided
in Table A-3 (Appendix A; U.S. EPA, 1992). When the sample size is larger than 100,
the Chi-Squared Goodness-of-Fit test can be used instead.
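SciPy has no built-in Shapiro-Francia routine, but the W′ statistic above is straightforward to compute directly. The sketch below (the function name is illustrative, not part of Sanitas) returns W′, which behaves as a squared correlation between the ordered data and the approximate normal quantiles mi:

```python
import numpy as np
from scipy import stats

def shapiro_francia(x):
    """Compute W' as defined above for a 1-D sample."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    m = stats.norm.ppf(np.arange(1, n + 1) / (n + 1))  # mi = inverse normal of i/(n+1)
    # numerator (sum mi*xi)^2; denominator (n-1)*S^2 * sum(mi^2)
    return (m @ x) ** 2 / ((n - 1) * np.var(x, ddof=1) * np.sum(m ** 2))

probs = (np.arange(1, 101) - 0.5) / 100
w_normal = shapiro_francia(stats.norm.ppf(probs))               # normal-shaped sample
w_skewed = shapiro_francia(np.exp(1.5 * stats.norm.ppf(probs))) # lognormal-shaped sample
print(round(w_normal, 3), round(w_skewed, 3))
```

Values of W′ near 1 are consistent with normality; the computed W′ is still compared against the tabulated critical value in Table A-3 (U.S. EPA, 1992).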
Chi-Squared Goodness-of-Fit Normality Test Procedure:
First divide the N observations by four to compute K, where K will be the number of
subgroups or ‘cells’ for the data set (maximum 10). Second, standardize each
observation, Xi, by subtracting the group mean and dividing by the group standard
deviation as follows:
Zi = (Xi − X̄) / s

Where:
Zi = the standardized value;
X̄ = the group mean; and
s = the group standard deviation.
Once the standardized values and K have been calculated, the third step is to subgroup
the Zi according to the cell boundaries designated for K cells in Table 4-3 (EPA, April
1989). The Chi-Squared statistic, Χ², may be calculated as follows:

Χ² = Σ (Ni − Ei)² / Ei, with the sum taken over i = 1, …, K
Where:
Ni = the number of observations in the ith cell; and
Ei = N/K, The expected number of observations in the ith cell.
Last, compare the calculated Χ² to a table of the chi-squared distribution (Table 1, Appendix B; U.S. EPA, 1989) with α = 0.05 and K − 3 degrees of freedom. If the calculated value exceeds the tabulated value, then reject H0 that the data are normally distributed.
The following example data represent the residuals from an analysis of variance on
dioxin concentrations. The standardization process has been applied to the residuals,
resulting in the data in the third column, the standardized residuals or Zi.
EXAMPLE 7:
Observation Residuals Standardized Residuals
1 -0.45 -1.9
2 -0.35 -1.48
3 -0.35 -1.48
4 -0.22 -0.93
5 -0.16 -0.67
6 -0.13 -0.55
7 -0.11 -0.46
8 -0.1 -0.42
9 -0.1 -0.42
10 -0.06 -0.25
11 -0.05 -0.21
12 0.04 0.17
13 0.11 0.47
14 0.13 0.55
15 0.16 0.68
16 0.17 0.72
17 0.2 0.85
18 0.21 0.89
19 0.3 1.27
20 0.34 1.44
21 0.41 1.73
Table 8.7: Example Data for Chi-Squared Normality Test
N = 21

K = 21/4 ≈ 5
The standardized residuals are then grouped according to the cell boundaries designated
for 5 cells in Table 4-3 (EPA, April 1989). The cell boundaries for K=5 are -0.84, -0.25,
0.25 and 0.84. Applying these boundaries to the above Zi, there are 4 observations in the
first cell, 6 in the second cell, 2 in the third, 4 in the fourth, and 5 in the fifth. These counts represent the Ni in the above equation that is used to calculate the Χ² statistic. The expected number in each cell, Ei, is N/K or 4.2. The Χ² statistic for these data is calculated as:
Χ² = (4 − 4.2)²/4.2 + (6 − 4.2)²/4.2 + (2 − 4.2)²/4.2 + (4 − 4.2)²/4.2 + (5 − 4.2)²/4.2 = 2.10
The critical value at α = 0.05 for a chi-squared test with 2 (K - 3 = 5-3 = 2) degrees of
freedom is 5.99 (Table 1, Appendix B; U.S. EPA, 1989). Since the calculated chi-
squared value is less than the tabulated value, we fail to reject H0 that the data are
normally distributed.
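The Example 7 computation can be verified with a short Python sketch (illustrative only; the critical value comes from scipy.stats.chi2 rather than a printed table):

```python
import numpy as np
from scipy import stats

counts = np.array([4.0, 6.0, 2.0, 4.0, 5.0])  # Ni per cell, from Example 7
k = counts.size
expected = counts.sum() / k                    # Ei = N/K = 4.2
chi_sq = np.sum((counts - expected) ** 2 / expected)
critical = stats.chi2.ppf(0.95, df=k - 3)      # alpha = 0.05, K - 3 df
print(round(chi_sq, 2), round(critical, 2), chi_sq > critical)  # 2.1 5.99 False
```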
Seasonality:
Prior to constructing the Control Charts, the significance of data seasonality is evaluated
using the nonparametric Kruskal- Wallis test (U.S. EPA, April 1989) at the α = 0.05
significance level. The null hypothesis to be tested is:
H0: The populations from which the quarterly data sets have been drawn have
the same median.
The alternative hypothesis is:
HA: At least one population has a median larger or smaller than at least one
other population’s median.
Where there are no ties, the Kruskal-Wallis statistic, H, is calculated:
H = [12 / (N(N + 1))] Σ (Ri²/Ni) − 3(N + 1), with the sum taken over i = 1, …, k
Where:
Ri = the sum of the ranks of the ith group;
Ni = the number of observations in the ith group (station);
N = the total number of observations; and
k = the number of groups (seasons).
If there are tied values (more than one data point having the same value) present in the
data, the Kruskal-Wallis H′ statistic is calculated:

H′ = H / [1 − (ΣTi) / (N³ − N)], with the sum taken over i = 1, …, g
Where:
g = the number of groups of distinct tied observations; and
N = the total number of observations
Ti is computed as:
Ti = ti³ − ti
Where:
ti = the number of observations in tie group i.
The calculated value H (or Η′ if ties are present) is compared to the tabulated chi-
squared value with (K-1) degrees of freedom, (Table A-1, Appendix B; U.S. EPA, April
1989) where K is the number of seasons. The null hypothesis is rejected if the computed
value exceeds the tabulated critical value.
EXAMPLE 8:

Well 1         Well 2        Well 3
1.45 (7)       1.52 (8.5)    1.74 (13)
1.27 (6)       2.46 (22)     2.00 (17.5)
1.17 (4)       1.23 (5)      1.79 (14)
1.01 (3)       2.20 (20)     1.81 (15)
2.30 (21)      2.68 (23)     1.91 (16)
1.54 (10)      1.52 (8.5)    2.11 (19)
1.71 (11.5)    ND (1.5)      2.00 (17.5)
1.71 (11.5)
ND (1.5)
Table 8.8: Example Data for Seasonality (ranks in parentheses)

R1 = 75.5, N1 = 9;    R2 = 88.5, N2 = 7;    R3 = 112, N3 = 7

g = 4;    t1 = t2 = t3 = t4 = 2
T1 = T2 = T3 = T4 = 2³ − 2 = 6

ΣTi = 6 + 6 + 6 + 6 = 24

H = [12 / (23 · 24)] (75.5²/9 + 88.5²/7 + 112²/7) − 3(24) = 5.05

H′ = 5.05 / [1 − 24/(23³ − 23)] = 5.06
From Table A19 (Gilbert, 1987), Χ².95,2 = 5.99. Since H′ < 5.99, we cannot reject H0 at the α = .05 level.
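For illustration, scipy.stats.kruskal reproduces the tie-corrected H′ for Example 8. The substitute value used for the two nondetects below is a hypothetical stand-in, chosen only so that both remain tied below every detected value (which reproduces the ranks shown in Table 8.8):

```python
from scipy import stats

ND = 0.5  # hypothetical stand-in: any common value below all detects works
well1 = [1.45, 1.27, 1.17, 1.01, 2.30, 1.54, 1.71, 1.71, ND]
well2 = [1.52, 2.46, 1.23, 2.20, 2.68, 1.52, ND]
well3 = [1.74, 2.00, 1.79, 1.81, 1.91, 2.11, 2.00]

# scipy applies the tie correction automatically, giving H'
h_prime, p_value = stats.kruskal(well1, well2, well3)
print(round(h_prime, 2), p_value > 0.05)  # 5.06 True -> cannot reject H0
```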
Application of the Kruskal-Wallis test for seasonality requires a minimum sample size of
four data points in each season. A minimum of four years of quarterly data is thus
required in order to appropriately evaluate data for seasonality. Sanitas currently tests
seasonality for up to twelve seasons. The default seasonal start dates are February 1, May
1, August 1, and November 1. Please see the “Options” section for instructions on how
to change the default seasonal cutpoints.
Correcting for Seasonality:
When seasonality is known to exist in a Time Series of concentrations, then the data
should be deseasonalized prior to constructing Control Charts in order to take into
account seasonal variation rather than mistaking seasonal effects for evidence of
contamination. This correction is performed following transformation of the data (if a
data transformation is required) and prior to an adjustment for non-detects, described
below.
Using the method described by the EPA (U.S. EPA, April 1989), the average
concentration for season i over the sampling period, X̄i, is calculated as follows:

X̄i = (Xi1 + Xi2 + ⋯ + XiN) / N
Where:
Xij = the unadjusted observation for the ith season during the jth year; and
N = the number of years of sampling.
The grand mean, X̄, of all the observations is then calculated as:

X̄ = (1/n) Σ X̄i = [1/(nN)] Σ Σ Xij, with the sums taken over i = 1, …, n and j = 1, …, N
Where:
n = the number of seasons per year.
The adjusted concentrations, Zij, are then computed as:
Zij = Xij − X̄i + X̄
EXAMPLE 9:
1983 data 1984 data 1985 data
January 1.99 2.01 2.15
February 2.10 2.10 2.17
March 2.12 2.17 2.27
April 2.12 2.13 2.23
May 2.11 2.13 2.24
June 2.15 2.18 2.26
July 2.19 2.25 2.31
August 2.18 2.24 2.32
September 2.16 2.22 2.28
October 2.08 2.13 2.22
November 2.05 2.08 2.19
December 2.08 2.16 2.22
Table 8.9: Example Data for Deseasonalizing
EXAMPLE 10:
Month       3-year monthly average    1983 adjusted    1984 adjusted    1985 adjusted
January 2.05 2.11 2.13 2.27
February 2.12 2.15 2.15 2.21
March 2.19 2.10 2.15 2.25
April 2.16 2.13 2.14 2.24
May 2.16 2.12 2.13 2.25
June 2.20 2.12 2.15 2.23
July 2.25 2.11 2.16 2.23
August 2.25 2.10 2.16 2.24
September 2.22 2.11 2.17 2.22
October 2.14 2.10 2.16 2.24
November 2.11 2.11 2.14 2.25
December 2.16 2.09 2.17 2.23
Table 8.10: Deseasonalized Data
X̄ = 2.17
January 1983 Adjusted Concentration:
1.99 – 2.05 + 2.17 = 2.11
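The adjustment can be sketched in Python for the Table 8.9 data (illustrative only, not Sanitas code):

```python
import numpy as np

# Table 8.9: rows are years 1983-1985, columns are months January-December
x = np.array([
    [1.99, 2.10, 2.12, 2.12, 2.11, 2.15, 2.19, 2.18, 2.16, 2.08, 2.05, 2.08],
    [2.01, 2.10, 2.17, 2.13, 2.13, 2.18, 2.25, 2.24, 2.22, 2.13, 2.08, 2.16],
    [2.15, 2.17, 2.27, 2.23, 2.24, 2.26, 2.31, 2.32, 2.28, 2.22, 2.19, 2.22],
])
season_mean = x.mean(axis=0)        # X-bar_i: average for month i over N years
grand_mean = x.mean()               # X-bar
z = x - season_mean + grand_mean    # Zij = Xij - X-bar_i + X-bar
print(round(grand_mean, 2), round(z[0, 0], 2))  # 2.17 2.11 (January 1983)
```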
Censored Data:
Censored data include data that are less than the detection limit. If a small proportion
(less than 15 percent) of the observations are nondetects, these will be replaced with one-
half of the method detection limit prior to running the analysis (Gilbert, 1987, and U.S.
EPA, April 1989).
If more than 15 percent but less than 50 percent of the data are less than the detection
limit, the data’s sample mean and sample standard deviation are adjusted according to the
method of Cohen (1959) or Aitchison as described by EPA (U.S. EPA, April 1989).
Assumptions for use of this technique are that the data are normally distributed and that
the detection limit is always the same. If multiple detection limits exist, then they are all
replaced with the highest detection limit.
Cohen’s Adjustment Procedure:
Using Cohen’s method, the sample mean, X̄d, is calculated for data above the detection limit:

X̄d = (1/m) Σ Xi, with the sum taken over i = 1, …, m
Where:
m = the number of data points above the detection limit; and
xi = the value of the ith constituent value above the detection limit.
The sample variance, Sd2 , is then calculated for data above the detection limit:
Sd² = [Σ(Xi − X̄d)²] / (m − 1) = [ΣXi² − (1/m)(ΣXi)²] / (m − 1), with the sums taken over i = 1, …, m
The two parameters, h and γ , are then calculated as follows:
h = (n − m) / n

and

γ = Sd² / (X̄d − DL)²
Where:
n = the total number of observations (i.e., above and below the detection
limit); and
DL = the detection limit.
These values are then used to determine the tabulated value of the parameter λ (Table A-
5, Appendix A; U.S. EPA, 1992).
The corrected sample mean, xc , which accounts for the data below detection limit, is
calculated as follows:
X̄c = X̄d − λ(X̄d − DL)
The corrected sample standard deviation, Sc, which accounts for the data below detection
limit, is calculated as follows:
Sc = [ Sd² + λ(X̄d − DL)² ]^(1/2)
The adjusted sample mean, xc , and sample standard deviation, Sc, are then used for
construction of the Shewhart-CUSUM Control Chart.
EXAMPLE 11:
1984 1985 1986 1987
1850 1780 <1450 1760
1760 1790 1800 1800
<1450 1780 1840 1900
1710 <1450 1820 1770
1575 1790 1860 1790
<1450 1800 1780 1780
Table 8.11: Example Data for Cohen’s Adjustment
< Indicates that the value was not detected
X̄d = 1786.75

Sd² = 4174.4

h = (24 − 20)/24 = .16667

γ = 4174.4 / (1786.75 − 1450)² = .0368
From Table 7, Appendix B, US EPA Guidance:

γ       h = .15    h = .20
.00     .17342     .24268
.05     .17925     .25033
Table 8.12: EPA Guidance
The value for λ is found through double linear interpolation:
.24268 - .17342 = .06926 .06926 * .3334 = .02309
.17342 + .02309 = .19651
.25033 - .17925 = .07108 .07108 * .3334 = .02370
.17925 + .02370 = .20295
.20295 - .19651 = .00644 .00644 * .736 = .004740
.19651 + .004740 = .20125
λ = .20125
X̄c = 1786.75 − .20125(1786.75 − 1450) = 1718.98

Sc = [ 4174.4 + .20125(1786.75 − 1450)² ]^(1/2) = 164.31
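A sketch of Cohen's adjustment for Example 11, taking λ = .20125 from the interpolation above (illustrative only, not Sanitas code):

```python
import numpy as np

# the 20 detected values from Table 8.11; 4 results were below DL = 1450
detects = [1850, 1760, 1710, 1575, 1780, 1790, 1780, 1790, 1800, 1800,
           1840, 1820, 1860, 1780, 1760, 1800, 1900, 1770, 1790, 1780]
dl = 1450.0
lam = 0.20125  # lambda from the EPA table for h = .16667, gamma = .0368

x = np.asarray(detects, dtype=float)
xd, s2d = x.mean(), x.var(ddof=1)          # X-bar_d = 1786.75, Sd^2 = 4174.4
xc = xd - lam * (xd - dl)                  # corrected mean
sc = np.sqrt(s2d + lam * (xd - dl) ** 2)   # corrected standard deviation
print(round(xc, 2), round(sc, 2))          # 1718.98 164.31
```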
Aitchison’s Adjustment Procedure:
Using Aitchison’s method, the corrected sample mean, X̄a, is calculated:

X̄a = (1 − n0/n) X̄′
Where:
X̄′ = the average of the n1 detected values;
n0 = the number of samples in which the compound is not detected; and
n = the sample size.
The corrected standard deviation, sa, is calculated:
sa = [ ((n1 − 1)/(n − 1)) s′² + (n0·n1 / (n(n − 1))) X̄′² ]^(1/2)
Where:
s′ = the standard deviation of the n1 detected measurements.
EXAMPLE 12:
Date Concentration
2/15/1997 <10
5/5/1997 <10
7/8/1997 <10
10/12/1997 15
2/5/1998 17
4/20/1998 13
6/2/1998 <10
10/4/1998 15
12/9/1998 12
2/10/1999 17
Table 8.13: Example Data for Aitchison’s Adjustment

X̄′ = 14.83    s′ = 2.04

n = 10    n0 = 4

X̄a = 8.9    sa = 7.8
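A sketch of Aitchison's adjustment for Example 12 (illustrative only, not Sanitas code):

```python
import numpy as np

detects = np.array([15, 17, 13, 15, 12, 17], dtype=float)  # detected values
n, n0 = 10, 4                 # total sample size and number of nondetects
n1 = n - n0
xp = detects.mean()           # x' = 14.83
sp = detects.std(ddof=1)      # s' = 2.04

xa = (1 - n0 / n) * xp        # corrected mean
sa = np.sqrt((n1 - 1) / (n - 1) * sp**2
             + (n0 * n1) / (n * (n - 1)) * xp**2)  # corrected s.d.
print(round(xa, 1), round(sa, 1))  # 8.9 7.8
```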
Control Chart Procedure:
This procedure for construction of the Shewhart-CUSUM Control Chart follows the EPA
recommendations (U.S. EPA, April 1989). A version customized for California is also available in Sanitas, and some minor adjustments have been made for other protocol standards. The Shewhart-CUSUM Control Chart procedure recommends a minimum of six to eight historical data points in order to reliably determine the mean and standard deviation for each constituent’s concentration in a given well.
Three parameters are selected prior to plotting:
h = the control limit to which the cumulative sum values (CUSUM) are
compared. The EPA recommended value is h = 5 units of standard deviation.
California does not require this limit to be met for detection monitoring. The
ASTM recommended value is h = 4.5 units of standard deviation for a
background n < 12 and h = 4.0 units of standard deviation for a background n
>= 12.
K = a reference value that establishes the upper limit for the acceptable
displacement of the standardized mean. The EPA and California
recommended value is K = 1. The ASTM recommended value is K=1 for
background n < 12 and K = .75 for background n >= 12.
SCL = the upper Shewhart control limit to which the standardized mean will be
compared. For California sites, a value of SCL = 2.327 units of standard
deviation is used per Article 5. For other sites a value of SCL = 4.5 is used per
EPA recommendation. The ASTM recommended value is SCL = 4.5 for a
background n < 12 and SCL = 4.0 for a background n >= 12.
Assume that at time period Ti, ni concentration measurements X1,…,Xni, are available.
Their average, X , is computed.
The Shewhart Control Chart showing the standardized mean is the equivalent to an X
chart for n=1 (within a single sampling period). The standardized mean, Zi, is then
computed:
Zi = √ni (X̄i − X̄) / S
Where:
X = the mean obtained from prior monitoring data from the same
station (at least four data points); and
S = the standard deviation obtained from prior monitoring data from
the same station (at least four data points).
When applicable, for each time period, Ti, the cumulative sum, Si (CUSUM), is
calculated:
Si = max{0, (Zi − K) + Si−1}

Where max {A, B} is the maximum of A and B, starting with S0 = 0.
The values of Si versus Ti are then plotted. An “out of control” situation occurs under
EPA standards at the time period Ti if, Si > h or Zi > SCL, and under California standards
only if Zi > SCL.
Under Unified Guidance and ASTM Standards a refinement has been added. If a single
value exceeds and is followed immediately by a value that is itself within the control
limits, then the second value serves as a non-validating retest of the first. That is, an out-
of-control situation requires either the most recent point to exceed the control limits, or
two such points in a row.
The results may be plotted in standardized units or may be converted back to their
original metric units.
EXAMPLE 13:
Date Data (mg/l) Zi (s.d.) Si (s.d.) Si (mg/l)
1/5/1991 *3.235
4/6/1991 *4.234
8/9/1991 *5.473
2/15/1992 *9.945
6/1/1992 *11.902
10/4/1992 *4.341
1/3/1993 *3.235
4/2/1993 *4.234
9/5/1993 5.473 -0.108 0 5.825
2/6/1994 9.945 1.261 0.261 6.678
5/12/1994 11.9 1.86 1.121 9.486
8/4/1994 4.341 -0.454 -0.333 4.735
12/22/1994 3.235 -0.793 0 5.825
3/4/1995 4.234 -0.487 0 5.825
7/8/1995 5.473 -0.108 0 5.825
11/5/1995 9.945 1.261 0.261 6.678
Table 8.14: Example Data for Shewhart-CUSUM Control Charts
* = Background data
X̄ = 5.825    S = 3.267    K = 1

SCL = 4.5 s.d. = 20.526 mg/l

h = 5 s.d. = 22.159 mg/l
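The CUSUM recursion for Example 13 can be sketched in Python (illustrative only; only the compliance points are scored, and the max{0, ·} in the recursion clamps negative sums to zero, so the fourth value below is 0 rather than the unclamped −0.333 printed in the table):

```python
# Background statistics and EPA parameters from Example 13 (mg/l).
xbar, s = 5.825, 3.267
K, h, scl = 1.0, 5.0, 4.5

data = [5.473, 9.945, 11.9, 4.341, 3.235, 4.234, 5.473, 9.945]

cusums, flags = [], []
si = 0.0                                  # S0 = 0
for x in data:
    zi = (x - xbar) / s                   # standardized mean Zi
    si = max(0.0, (zi - K) + si)          # Si = max{0, (Zi - K) + Si-1}
    cusums.append(round(si, 3))
    flags.append(si > h or zi > scl)      # EPA out-of-control rule
print(cusums)
print(any(flags))  # False: no out-of-control result for these data
```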
Intrawell Rank Sum
Description:
When the historical data are neither normal nor transformed-normal, there is an option to
perform a nonparametric comparison between the historical data and subsequent data
points in lieu of constructing a Control Chart. The Kruskal-Wallis Rank Sum test is a
nonparametric procedure where the sums of ranked data sets are compared. Subsequent
sample data are compared with sampling data from the initial monitoring period of the
same well. It is assumed that during the initial monitoring period the well has shown no
evidence of contamination or of an increasing trend. This test does not require a normal
distribution of the data.
The null hypothesis to be tested is:
H0: The historical (background) data and the compliance data have the same
median constituent concentration.
The alternative hypothesis is:
HA: The compliance data have a greater median constituent concentration than
the historical data.
Procedure:
The Kruskal-Wallis test procedure is used to evaluate whether the historical (background
data) and the compliance data have the same median constituent concentration (see
Control-Chart Seasonality test for method description and example).
Mann-Whitney / Wilcoxon Rank Sum
Description:
The Mann-Whitney test, also known as the Wilcoxon Rank Sum test, may be used to test whether the measurements from one population are significantly higher or lower than those from another population. This test is available for both interwell and intrawell analyses.
The null hypothesis that is being tested is:
HO: The populations from which the two data sets have been drawn have the
same mean.
The alternative hypothesis is:
HA: The populations have different means.
Procedure:
If n1 < 10 and n2 < 10, then:
N = n1 + n2
Where:
n1 = the number of observations in sample one; and
n2 = the number of observations in sample two.
Order the measurements for group 1 and group 2 from the lowest value to the highest
value.
Calculate the Mann-Whitney statistic as:
U = n1n2 + [n1(n1 + 1)] / 2 − R1
Where:
R1 = The sum of the ranks of the observations in sample one
For a one-tailed test, the calculated U is compared with the tabled values (Table B.11,
Zar, 1996). If U is greater than the critical value, then Group 2 (compliance) is greater
than Group 1 (background).
For a two-tailed test, you must compute both U and U′ , where:
U′ = n1n2 + [n2(n2 + 1)] / 2 − R2
The larger of U and U′ is compared to the critical value in Table B.11 (Zar, 1996). If the
calculated U or U′ is as great or greater than the critical value of U there is a statistically
significant difference between the two populations.
If either n1 or n2 is greater than 10, the normal approximation of the Mann-Whitney test will be used. With no ties present:

Z = (U − n1n2/2) / [ n1n2(N + 1)/12 ]^(1/2)

If ties are present:

Z = (U − n1n2/2) / [ (n1n2 / (N(N − 1))) · ((N³ − N − Σt)/12) ]^(1/2)

Where:

Σt = Σ(ti³ − ti), summed over the groups of tied values.

A statistically significant finding is declared if the absolute value of Z is greater than the tabled value Z1−α/2. Significance is tested at the following alpha levels: .10, .05, .025, and .01.
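For illustration, both the exact and normal-approximation forms are available through scipy.stats.mannwhitneyu; the data below are hypothetical:

```python
from scipy import stats

background = [1.2, 1.5, 1.1, 1.4, 1.3, 1.6]   # hypothetical example values
compliance = [2.1, 2.4, 1.9, 2.6, 2.2, 2.5]

# one-tailed test: is the compliance group higher than background?
res = stats.mannwhitneyu(compliance, background, alternative='greater')
print(res.statistic, res.pvalue < 0.05)  # 36.0 True
```

Here every compliance value exceeds every background value, so U reaches its maximum of n1·n2 = 36 and the difference is significant.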
Welch's t-test
Assumptions:
All t-tests assume independence of the individual sample values. It is left to the user to
ensure that the time span between subsequent samples allows for independence of the
data. This assumption can be further tested by means of the Rank Von Neumann test,
described elsewhere in this document, if desired.
The hypothesis tests with Welch's t-test assume that errors (residuals) are normally
distributed. The normal distribution can be checked using the multiple group Shapiro-
Wilk test, described below. Two groups (1 background and 1 compliance well in the
case of Interwell; time ranges in the case of Intrawell) are to be compared, and the
minimum sample size requirement is 4 samples per group. If the data normality
assumption is not met after attempted transformation(s) (depending on user settings), then
the Wilcoxon Rank Sum, described elsewhere in this document, is substituted.
In addition, the Wilcoxon Rank Sum will be substituted in cases in which > 20% of the
data are censored values.
Multiple Group Shapiro-Wilk test:
1) Given K groups to be tested, denote the sample size of the ith group as ni.
2) Compute the Shapiro-Wilk statistic (SWi) for each of the K groups, as discussed
elsewhere in this document.
3) Transform each Shapiro-Wilk statistic to the intermediate quantity (Gi). For sample size >= 7, Gi = γ + δ·ln[(SWi − ε)/(1 − SWi)], where γ, δ, and ε are from tables in Technometrics Vol. 10, number 4, and other sources. For sample size < 7, find a tabled Gi based on ui = ln[(SWi − ε)/(1 − SWi)].
4) Sum the Gi's, and multiply by the reciprocal of the square root of K to get the
Shapiro-Wilk multiple group statistic G.
5) Given the desired significance level (α), determine an α-level tabulated critical
point as the upper αth normal quantile (zα). If the absolute value of G > zα take
this as significant evidence of non-normality at the α level.
PROCEDURE
Using group means and standard deviations, Welch’s t-statistic is computed as

t = (X̄C − X̄B) / [ sC²/nC + sB²/nB ]^(1/2)

where B indicates background and C indicates compliance groups.
The approximate degrees of freedom are computed as

df = ( sC²/nC + sB²/nB )² / [ (sC²/nC)² / (nC − 1) + (sB²/nB)² / (nB − 1) ]

This quantity is rounded to the nearest integer to become df.
t is compared to the (1-α)*100th percentage point of the Student’s t-distribution with df
degrees of freedom. If t > the critical value, it can be concluded that the compliance
mean is significantly greater than the background mean at the α significance level.
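For illustration, SciPy performs Welch's t-test (unequal variances) when equal_var=False; the data below are hypothetical:

```python
from scipy import stats

background = [10.1, 9.8, 10.3, 10.0, 9.9]   # hypothetical example values
compliance = [12.4, 12.9, 12.1, 13.0, 12.6]

# Welch's t-test, one-sided: compliance mean greater than background mean?
res = stats.ttest_ind(compliance, background, equal_var=False,
                      alternative='greater')
print(res.statistic > 0, res.pvalue < 0.05)  # True True
```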
One-Way Analysis of Variance (ANOVA)
Description:
Analysis of variance (ANOVA) is the name given to a variety of similar statistical
procedures. These similar procedures all compare the means or median values of
different groups of observations to determine if a statistical difference exists among
groups. The procedure is an interwell procedure that can be used to compare compliance
well data to background well data. Two types of analysis of variance are presented:
parametric and nonparametric one-way analysis of variance. Both methods are
appropriate when the only factor of concern is the spatial variability of constituent
measurements in a given sampling period. For statistically meaningful results, at least
three observations should be present in each well. Prior to statistical analysis, the
assumption of data independence should be considered. A specified rigorous field
sampling protocol should be followed.
Parametric ANOVA
Assumptions:
The hypothesis tests with parametric ANOVA assume that errors (residuals) are normally
distributed with equal variances across all wells and a single detection limit is used for
the analyte of interest. The normal distribution can be checked by testing the distribution
of the residuals (the difference between the observations and the values predicted by the
ANOVA model). At least p > 2 groups (wells) are to be compared, and the total sample
size, N, should be large enough so that N - p > 5. Under CA standards, the minimum
sample size requirement is 4 samples per well. If the data normality assumption is not
met, then nonparametric ANOVA is performed.
Normality of Residuals:
The residuals are the differences between each observation and its predicted value. In the
case of one-way analysis of variance, the predicted value for each observation is the
group (well) mean. Thus the residuals, Rij, are given by:
Rij = Xij − X̄i
Where:
Xij = the jth observation in the ith well; and
Xi = the mean of the observations in the ith well.
Once the residuals have been computed, the Shapiro-Wilk test for normality (previously
described) is performed on the absolute values of the residuals. If the residuals are not
found to be normally distributed, the data are transformed and the normality test of the
residuals is repeated. If the residuals are not found to be transformed-normal,
nonparametric ANOVA is performed (subsequently described).
Equality of Variance Test:
Levene’s test for homogeneity of variance is performed as follows:
Compute the absolute values of the residuals from the ANOVA, treating each compliance
point well and the combined set of background wells as separate groups.
Compute the F-statistic for the ANOVA on the absolute residuals.
F-statistic = MSBetweenGroups / MSWithinGroups

Where:

MS = Mean Squares

MSBetweenGroups = SSGroups / (p − 1)

and

MSWithinGroups = SSError / (N − p)
Where:
p = the number of groups;
N = the total sample size; and
SS = the Sum of Squares.
Sums of Squares are computed as follows:

SStotal = Σ Σ (Xij − X̄..)² = Σ Σ Xij² − (X..)²/N, with the sums taken over i = 1, …, p and j = 1, …, ni

SSStations = Σ ni (X̄i. − X̄..)² = Σ (Xi.)²/ni − (X..)²/N, with the sums taken over i = 1, …, p

and

SSError = SStotal − SSStations
Where:
X.. = the sum of the total observations;
X.. = the mean of the total observations;
Xi. = the sum of all ni observations in group i;
.X i = the mean of the observations at group i; and
ni = the number of observations in group i.
If the calculated F-statistic exceeds the tabulated F-statistic (α = 0.05) for (p - 1) and (N -
p) degrees of freedom found in Table 2, (Appendix B; U.S. EPA, April 1989), conclude
that the variances among the groups are not equal. In this case, transform the original data
and perform the equality of variance test again. If the calculated F-statistic does not
exceed the tabulated F-statistic, conclude that the variances are equal and perform
ANOVA on the original observations. If the calculated F-statistic still exceeds the
tabulated F-statistic, conclude that the variances among the groups are not equal and
perform a nonparametric analysis of variances. If the calculated F-statistic is less than the
tabulated F-statistic, conclude that the variances among the groups are equal and perform
ANOVA on the transformed data.
EXAMPLE 14:
Date Well 1 Well 2 Well 3
1/3/1995 22.9 2.0 2.0
2/5/1995 3.09 1.25 109.4
4/5/1995 35.7 7.8 4.5
6/10/1995 4.18 52 2.5
Group mean 16.47 15.76 29.6
Table 8.15: Example Data for Levene’s Equality of Variance Test
EXAMPLE 15:
Date
Well 1
(residuals)
Well 2
(residuals)
Well 3
(residuals)
1/3/1995 6.43 13.76 27.6
2/5/1995 13.38 14.51 79.8
4/5/1995 19.23 7.96 25.1
6/10/1995 12.29 36.23 27.1
Group mean 12.83 18.12 39.9
Overall Mean 23.62
Table 8.16: Residuals of Data

SSwells = 4(12.83)² + 4(18.12)² + 4(39.9)² − 12(23.62)² = 1646.7

SStotal = (6.43² + 13.38² + ⋯ + 27.1²) − 12(23.62)² = 4318.8

SSerror = 4318.8 − 1646.7 = 2672.1

F-statistic = (1646.7/2) / (2672.1/9) = 823.3/296.9 = 2.77
The critical value at the .05 α level is F.95, 2, 9 = 4.26. Since the F-statistic of 2.77 is less
than the critical point, the assumption of equal variance can be accepted.
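For illustration, scipy.stats.levene with center='mean' reproduces the F-statistic computed above from the raw Table 8.15 data (with a single background well, treating each well as its own group matches the procedure described):

```python
from scipy import stats

well1 = [22.9, 3.09, 35.7, 4.18]
well2 = [2.0, 1.25, 7.8, 52.0]
well3 = [2.0, 109.4, 4.5, 2.5]

# Levene's test: ANOVA on absolute deviations from each group mean
f_stat, p_value = stats.levene(well1, well2, well3, center='mean')
print(round(f_stat, 2), p_value > 0.05)  # 2.77 True -> equal variances accepted
```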
Censored Data:
Censored data include data that are less than the detection limit. If a small proportion
(less than 15 percent) of the observations are less than the detection limit, these will be
replaced with one half of the method detection limit prior to running the analysis (Gilbert,
1987 and U.S. EPA, April 1989). If more than 15 percent of the data are less than the
detection limit, a nonparametric ANOVA is performed.
Parametric ANOVA Procedure:
When there is more than one compliance well but fewer than eleven, and all the
previously mentioned assumptions are met, parametric ANOVA will be performed as
follows (in the case of more than 10 compliance wells, interval analysis is recommended
in lieu of ANOVA):
An F-statistic is computed (as previously described in Levene’s test for homogeneity of
variance) on the well observations (instead of the absolute residuals). When the F-statistic
is found to be significant at the α = 0.05 level, a contrast test will be performed to
determine if any compliance well constituent concentration is significantly higher than
the background well constituent concentration. The ANOVA table is presented as
follows:
EXAMPLE 16:
Source of Variation     Sum of Squares   Degrees of Freedom   Mean Squares                    F
Between Groups          SSGroups         p − 1                MSGroups = SSGroups / (p − 1)   F = MSGroups / MSerror
Error (within Groups)   SSerror          N − p                MSerror = SSerror / (N − p)
Total                   SStotal          N − 1
Table 8.17: ANOVA Table
Bonferroni t-statistic (used with 5 or fewer comparisons):
When the F-statistic is found to be statistically significant, a contrast test is recommended
to determine if the significant F-statistic is due to differences between background and
compliance wells. The Bonferroni t-statistic contrast test is recommended when five or
fewer comparisons are to be made (U.S. EPA, April 1989).
The mean(s), Xb , from the background well(s) is (are) computed as follows:
X̄b = (1/nb) Σ Xi, with the sum taken over i = 1, …, u
Where:
nb = the total sample size from all u background groups;
Xi = the mean of the concentrations from the ith background group; and
u = the total number of background groups.
Compute the m differences between the average concentration from each compliance
group Xi , and the average of the background, Xb .
X̄i. − X̄b,    i = 1, …, m
Where:
m = the number of compliance groups.
Compute the standard error, SEi, of each difference as:
SEi = [ MSerror (1/nb + 1/ni) ]^(1/2)
Where:
MSerror = determined from the ANOVA table (see above); and
ni = the number of observations at group i.
The t-statistic is obtained from the Bonferroni t-table (Table 3, Appendix B; U.S. EPA,
April 1989)
Where:
α = 0.05;
(N - p) = the degrees of freedom;
N = the total number of observations;
p = the total number of groups; and
m = the number of comparisons to be made.
Compute the critical values, Di, for each compliance group i.
Di = t · SEi
If the difference X̄i. − X̄b exceeds the critical value, Di, then conclude that the ith
compliance group has significantly higher constituent concentrations than the average
background group(s). Otherwise, conclude that there is no statistically significant finding.
This computation should be performed for each of the m compliance groups individually.
The test is designed so that the overall experimentwise error is 5%.
When more than five group comparisons are to be made, the t-statistic used is t(N−p), .99, obtained from the Bonferroni t-table (Table 3, Appendix B; U.S. EPA, April 1989).
The above is based on one-sided comparisons. When a two-tailed comparison is
indicated, Sanitas will use the t-statistic:
t = t(N−p), (1 − α/2m)

A significant difference is indicated between background and compliance groups when the absolute value of the difference X̄i − X̄b exceeds the critical value, Di.
When California Standards are selected, the t-statistic used will be t(n-1),(0.99). If a modified
alpha, α*, is computed, the t-statistic used will be t(n-1),(1-α*).
EXAMPLE 17:
Date Well 1 (up) Well 2 (down) Well 3 (down)
1/3/1995 22.9 70 2.0
2/5/1995 3.09 82 20
4/5/1995 35.7 65 4.5
6/10/1995 4.18 52 2.5
Group mean 16.47 67.25 7.25
Group Sample Size 4 4 4
Table 8.18: Example Data for Parametric ANOVA
EXAMPLE 18:
Source of Variation Sum of Squares
Degrees of Freedom
Mean Squares
F-Statistic
Between Wells 8351.8 2 4175.9 26.39
Error (within wells) 1424.2 9 158.2
Total 9776.0 11
Table 8.19: ANOVA Table
X̄b = 16.47

X̄1 − X̄b = 67.25 − 16.47 = 50.78
X̄2 − X̄b = 7.25 − 16.47 = −9.22

SE1 = SE2 = [ 158.2 × (1/4 + 1/4) ]^(1/2) = 8.89

t = t(9, .975) = 2.262

D1 = D2 = 8.89 × 2.262 = 20.12
For compliance Well 2, the difference 50.78 exceeds the critical value 20.12. Therefore,
we can conclude that Well 2 has significantly higher constituent concentrations than
background. For compliance Well 3, the difference –9.22 does not exceed the critical
value of 20.12. Therefore, we can conclude that Well 3 does not have significantly
higher constituent concentrations than background.
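A minimal sketch of this multiple-comparison step in Python, using the Example 17/18 values; the function name is mine, and the Bonferroni t-value (2.262 = t at 9 degrees of freedom, .975 level) is supplied from the table lookup rather than computed:

```python
import math

def bonferroni_compare(mean_bg, mean_comp, ms_error, n_bg, n_comp, t_stat):
    """Compare one compliance group mean against the background mean.

    SE = sqrt(MSerror * (1/nb + 1/ni)); D = t * SE; significant when the
    difference exceeds D, per the procedure above.
    """
    diff = mean_comp - mean_bg
    se = math.sqrt(ms_error * (1.0 / n_bg + 1.0 / n_comp))
    d = t_stat * se
    return diff, d, diff > d

# Example 17/18: MSerror = 158.2 from the ANOVA table, t(9, .975) = 2.262
# from the Bonferroni t-table lookup, 4 observations per group.
for mean_comp in (67.25, 7.25):
    diff, d, sig = bonferroni_compare(16.47, mean_comp, 158.2, 4, 4, 2.262)
    print(round(diff, 2), round(d, 2), sig)
# -> 50.78 20.12 True
#    -9.22 20.12 False
```

As in the example, only Well 2 exceeds its critical value of 20.12.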
Nonparametric ANOVA
Description:
This statistical procedure is an interwell test that compares the median values of
background wells to the median values of compliance wells and determines if a
significant difference exists among the groups.
Assumptions:
The standard assumption in one-way nonparametric ANOVA is that the data from each
well come from the same continuous distribution, and therefore have the same median
concentrations of chemical constituents. For statistically valid results, at least four
observations for each well should be used, and the total sample size minus the number of
groups (wells) should be greater than four. Under California options, a minimum of nine
observations per well is required. In addition, this ANOVA test does not require a
normal distribution.
Independence:
Prior to statistical analysis, the assumption of data independence should be considered. A
specified rigorous field sampling protocol should be followed.
Procedure:
The Kruskal-Wallis test procedure (see Control Chart-Seasonality test for method
description) is used to evaluate the data sets at the α = 0.05 significance level when there
are two or more wells being compared. This test is performed on the ranked values, and
the null hypothesis to be tested is:
H0: The populations from which the quarterly data sets have been drawn have
the same median concentrations.
The alternative hypothesis to be tested is:
HA: At least one population has a median larger or smaller than the
background population.
The calculated value, H (or H′ , if ties are present) is compared to the tabulated chi-
squared value with (k-1) degrees of freedom (U.S. EPA, April 1989) where k is the
number of groups. The null hypothesis is rejected if the calculated value exceeds the
tabulated critical value. Application of the Kruskal-Wallis test requires a minimum
sample size of four data points for each well.
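As a rough illustration (not Sanitas's implementation), the tie-corrected Kruskal-Wallis H statistic can be computed in pure Python; applied to the Example 17 data it gives H = 8.0, which exceeds the tabulated chi-squared value of 5.99 at 2 degrees of freedom:

```python
from itertools import chain

def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of sample groups."""
    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)
    # Assign midranks: tied values share the average of their rank positions.
    rank = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        i = j
    h = 12.0 / (n * (n + 1)) * sum(
        sum(rank[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    ties = sum(c ** 3 - c for c in (pooled.count(v) for v in set(pooled)))
    return h / (1 - ties / float(n ** 3 - n)) if ties else h

# Example 17 data: H = 8.0 exceeds the chi-squared critical value of 5.99
# (2 degrees of freedom, alpha = 0.05), so equal medians is rejected.
wells = [[22.9, 3.09, 35.7, 4.18], [70, 82, 65, 52], [2.0, 20, 4.5, 2.5]]
print(round(kruskal_wallis_h(wells), 6))  # 8.0
```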
Censored Data:
Censored data include data that are less than the detection limit. These data will be
replaced with one half of the method detection limit prior to running the analysis (U.S.
EPA, 1992).
Tolerance Limits
Description:
An alternative approach to analysis of variance (to determine whether there is statistically
significant evidence of an impact) is to use Tolerance Limits. A tolerance interval is
constructed from the data on unimpacted background wells. The concentrations from
compliance wells are then compared to the upper limit of the tolerance interval. With the
exception of pH, if the compliance concentrations fall above the upper limit of the
tolerance interval (Tolerance Limit), this provides statistically significant evidence of a
difference. For pH and other constituents in which low values as well as high values may
be indicative of a facility impact, the lower limit of the tolerance interval is also used.
Compliance concentrations that fall outside the bounds of the tolerance interval provide
evidence of a statistical difference.
Assumptions:
Tolerance Limits are most appropriate for use at facilities that do not exhibit high degrees
of spatial variation between background wells and compliance wells. In addition, for a
Parametric Tolerance Limit, the background data must be normally or transformed
normally distributed, with at least three observations, but preferably eight or more
observations.
Distribution:
The distribution of data is evaluated using the Shapiro-Wilk test for normality (see
Control Chart-Distribution for method description) for samples with 50 or fewer
observations. The Shapiro-Francia test is used for sample sizes greater than 50 (see
Control Chart-Distribution for method description). Parametric intervals with background
sample sizes over 50 are only applicable for interwell tests.
Parametric Tolerance Limit Procedure:
To construct the upper tolerance limit, the mean, X̄, and the standard deviation, S, are
calculated from the background data. The one-sided upper tolerance limit, TL, is
constructed as follows:
TL = X̄ + KS
Where:
X̄ = the mean of the background observations;
K = the one-sided normal tolerance factor found in Table 5 (Appendix B;
U.S. EPA, April 1989); and
S = the standard deviation of the background observations.
Each observation from the compliance wells is compared to the upper tolerance limit. If
any observation exceeds the tolerance limit, that is statistically significant evidence of an
impact. In the case of transformed-normal background data, the tolerance interval is
constructed on the transformed background data, and the transformed compliance well
observations are compared to this tolerance limit.
In the case of a two-tailed test, both an upper and a lower tolerance limit are constructed.
The upper tolerance limit, UTL, is constructed as follows:
UTL = X̄ + KS
Where:
K = the two-tailed normal tolerance factors (Eisenhart, C., Hastay, M.W.,
and Wallis, W.A., 1947) for 95% (default for interwell) or 99% (default
for intrawell) confidence and 95% coverage.
The lower tolerance limit, LTL, is constructed as follows:
LTL = X̄ − KS
Where:
K = the two-tailed normal tolerance factors (Eisenhart, C., Hastay, M.W.,
and Wallis, W.A., 1947) for the confidence level in use and 95%
coverage.
EXAMPLE 19:
Well 1 (up) Well 2 (up) Well 3 (down)
4.2 7 7.6
3.5 3.4 9
5.6 6.7 6
5.6 4.6 7.2
6 5 4.3
4.3 5 5.4
2.5 4.2 6.3
5 6.3 5.2
Table 8.20: Example Data for Parametric Tolerance Limit
X̄ = 4.931    s = 1.244    K = 2.52

TL = X̄ + Ks = 4.931 + (2.52 × 1.244) = 8.07
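The Example 19 computation can be reproduced with a short sketch (the tolerance factor K = 2.52 comes from the table lookup and is supplied, not derived here; the function name is illustrative):

```python
import math

def upper_tolerance_limit(background, k_factor):
    """One-sided upper tolerance limit: TL = mean + K * s (sample std. dev.)."""
    n = len(background)
    mean = sum(background) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in background) / (n - 1))
    return mean + k_factor * s

# Example 19 background (Wells 1 and 2); K = 2.52 from the EPA table lookup.
bg = [4.2, 3.5, 5.6, 5.6, 6, 4.3, 2.5, 5,
      7, 3.4, 6.7, 4.6, 5, 5, 4.2, 6.3]
print(round(upper_tolerance_limit(bg, 2.52), 2))  # 8.07
```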
Censored data:
If less than 15 percent of the background well observations are nondetects, these will be
replaced with one half of the method detection limit prior to running the analysis (U.S.
EPA, April 1989).
If more than 15 percent but less than 50 percent of the background data are less than the
detection limit, the data’s sample mean and sample standard deviation are adjusted
according to the method of Cohen or Aitchison (see Control Chart-Censored Data for
method description).
If more than 50 percent but less than 90 percent of the background data are below the
detection limit, or when the background data are not transformed-normal, a
Nonparametric Tolerance Limit will be constructed.
Nonparametric Tolerance Limit Procedure:
When there is at least one detectable observation, the highest value for the background
data is used to set the upper limit of the tolerance interval. When all the data are censored
(i.e., nondetects or trace values) the decision logic outlined in figures 1 - 4 is used.
Assumption:
A minimum of 19 background samples is required for a 5% false positive rate (p.58, US
EPA, 1992). Fewer than the required minimum background sample size will raise the
false positive rate and/or lower the tolerance level.
Figure 8.1: Decision Logic for Nonparametric Interwell Tolerance Limit Development in Batch Processing Mode
Figure 8.2: Decision Logic for Nonparametric Intrawell Tolerance Limit Development in Batch Processing Mode
Figure 8.3: Decision Logic for Nonparametric Interwell Tolerance Limit Development in Interactive Mode
Figure 8.4: Decision Logic for Nonparametric Intrawell Tolerance Limit Development in Interactive Mode
Alert Levels (Arizona Standards Only)
Description:
Alert Levels are intrawell tolerance limits that are customized for the State of Arizona
1993 Guidance, sections II.D and II.E. The formula used to compute Alert Levels is
identical to the formula used to compute parametric tolerance limits. Three key factors
distinguish this test from the EPA’s tolerance limit:
1) the table lookup for the tolerance factor K;
2) the decision logic regarding proportion of nondetects; and
3) the outlier removal method.
The tolerance factor K is based upon the total number of sampling rounds for the site in
lieu of the background sample size available for a given constituent (as is done for EPA
tolerance limits). Figures 5 and 6 illustrate the overall decision logic and the handling of
nondetects, respectively.
The concentrations from compliance data are then compared to the alert levels. If the
compliance concentrations fall above the alert level, this provides statistically significant
evidence of an impact.
Figure 8.5: Decision Logic for Alert Levels (Arizona Standards Only)
Figure 8.6: Handling of Nondetects Under Arizona Guidance Standards
Prediction Limits (or Intervals): EPA Standards
Description:
A prediction limit is used to determine whether a single observation is statistically
representative of a group of observations. It is a statistical interval calculated to include
one or more observations from the same population with a specified confidence. In
ground water monitoring, a prediction limit approach may be used to make comparisons
between background and compliance data. The interval is constructed from a
background set of observations such that it will contain K future compliance observations
with stated confidence. If any observation exceeds the bounds of the prediction limit, this
is statistically significant evidence that that observation is not representative of the
background group.
Assumptions:
The parametric prediction limit is constructed if the background data all follow a normal
or transformed-normal distribution. A minimum of four background values should be
used in constructing the interval. The estimate of the standard deviation (S) that is used
should be an unbiased estimator. The usual estimate assumes that there is only one source
of variation. If there are other sources of variation, such as time effects, or spatial
variation in the data used for the background, then the parametric Prediction Limit is
inappropriate. In these situations, a multivariate statistical procedure is suggested.
Distribution:
In order to determine whether a parametric or nonparametric prediction limit should be
used, the distribution of the data is evaluated by applying the Shapiro-Wilk or Shapiro-
Francia tests for normality to the raw data or, when applicable, to the ladder of powers
(Helsel & Hirsch, 1992) transformed data. The null hypothesis, Ho, to be tested is:
H0: The population has a normal (or transformed-normal) distribution.
The alternative hypothesis, HA, is:
HA: The population does not have a normal (or transformed-normal)
distribution.
Parametric Prediction Limits Procedure:
The mean, X̄, and the standard deviation, S, are calculated for the raw or transformed
background data. The number of comparison observations, K, is specified to be included
in the interval. If K will be different from the default in Sanitas™, which assumes K=1
for each well, the number of observations, K, to be compared to the interval must be
specified in advance (see Prediction Limit Setup…).
Then the interval is given by:

PL = X̄ + S × (1/m + 1/n)^(1/2) × t(n−1, K, (1−α))
Where:
m = 1 for K single observations;
n = the number of observations in the background data; and
t(n−1, K, (1−α)) is found in Table 3 (Appendix B; U.S. EPA, April 1989) with n−1
degrees of freedom, K comparison observations, and 1−α significance level.
K for intrawell tests is 1. The prediction limit is constructed to have a (1-(α /K)) percent
probability of containing each of the next K sampling observations if no change has
occurred from background conditions (or equivalently a probability of 1-α of containing
all K future observations when no change has occurred). If any of the K comparison
observations fall outside the bounds of the Prediction Limit, this is statistically significant
evidence that the comparison data are not representative of the background group of
observations.
In the case of interwell tests when K is less than 5, the t-value used in the above equation
differs under EPA and CA standards for interwell analyses but not for intrawell analyses.
For interwell tests under CA standards and intrawell tests under both EPA and CA
standards, the t-value used is consistent with a 1 percent α-level per individual
comparison observation. For interwell tests under EPA options, the α-level used to derive
the t-value is 5 percent divided by the number of comparison observations. This results in
different limits under EPA versus CA standards for interwell analyses when K is less
than 5.
EXAMPLE 20:
Well 1 (up) Well 2 (up) Well 3 (down)
104 94 112
124 102 95
109 86 87
116 105 114
Table 8.21: Example Data for Parametric Prediction Limit

X̄ = 105    s = 11.89    t = t(7, .95) = 1.895

PL = X̄ + s × (1/m + 1/n)^(1/2) × t = 105 + 11.89 × (1/1 + 1/8)^(1/2) × 1.895 = 128.9

For a two-tailed test, t(n−1, K, (1−(α/2))) is substituted for t(n−1, K, (1−α)) in the above formula.
Statistically significant evidence of an impact is noted when compliance observations fall
outside the bounds of the upper and lower prediction limits.
When a modified alpha, α*, is computed, t(n−1, K, 1−α*) will be substituted for t(n−1, K, (1−α)) in
the above formula.
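Assuming a table-supplied t-value (here 1.895, the t-statistic with 7 degrees of freedom at the .95 level), the Example 20 limit can be reproduced as follows; the function name is illustrative:

```python
import math

def upper_prediction_limit(background, t_stat, m=1):
    """Upper PL = mean + s * sqrt(1/m + 1/n) * t, per the formula above."""
    n = len(background)
    mean = sum(background) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in background) / (n - 1))
    return mean + s * math.sqrt(1.0 / m + 1.0 / n) * t_stat

# Example 20 background (Wells 1 and 2, n = 8); t(7, .95) = 1.895.
bg = [104, 124, 109, 116, 94, 102, 86, 105]
print(round(upper_prediction_limit(bg, 1.895), 1))  # 128.9
```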
Censored data:
If less than 15 percent of the background observations are nondetects, these will be
replaced with one half of the method detection limit prior to running the analysis (U.S.
EPA, April 1989).
If more than 15 percent but less than 50 percent of the background data are less than the
detection limit, the data’s sample mean and sample standard deviation are adjusted
according to the method of Cohen or Aitchison (see Control Charts for method
description).
If more than 50 percent of the background data are less than the detection limit, a
nonparametric prediction limit will be computed.
If more than 90 percent of the background data are less than the detection limit, Sanitas
provides an option to construct a Poisson-based prediction limit.
Nonparametric Prediction Limits:
Distribution:
When the background data are not transformed-normal, or greater than 50 percent of the
background data are less than the detection limit, there is an option to construct a
nonparametric prediction limit. The highest value from the background data is used as
the upper limit of the prediction limit. Minimums of 19 background samples are required
for a 5% false positive rate when comparing a single compliance observation (k=1) to the
prediction limit. Fewer than the required minimum background sample size will result in
an inflated false positive rate that can be computed as (1-(n/(n+k))). Since the highest
background value is always used as the upper prediction limit, the actual significance
level decreases with increasing background sample size. Under CA standards, the false
positive rate is based upon the background sample size and the number of compliance
points being compared to the limit. This test presumes that two retests will be performed
when there is a statistically significant finding in detection monitoring. The site-wide
false positive rate, γ, is derived from a correction (Willits, N., 1994) of Gibbon’s Table 2
(Gibbons, R.D., 1991). In the case of a two-tailed test, the lowest value from the
background data is used to set the lower limit of the prediction limit.
Under EPA Standards, the false positive rate is based upon the formula:
1 − (n/(n+k))
Where:
n = the background sample size; and
k = the number of future values being compared to the limit.
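The false positive rate formula above is easy to check numerically; with the minimum of 19 background samples and a single future comparison it yields the cited 5% rate (the function name is mine):

```python
def np_pl_false_positive_rate(n, k):
    """False positive rate 1 - (n / (n + k)) for the nonparametric limit."""
    return 1.0 - n / float(n + k)

# 19 background samples and one future comparison give the 5% rate cited above.
print(round(np_pl_false_positive_rate(19, 1), 4))  # 0.05
```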
Davis McNichols Test-Nonparametric (DMT-NP) Prediction Limit Procedure:
When the user explicitly selects the DMT-NP method for Prediction Limits (Davis &
McNichols, 1994) in the options window, the verification-retesting plan will be
incorporated into the estimated site-wide false positive rate. The original sampling event
plus the potential number of retest samples is designated as m.
In ‘1 of m’ plans, only a single verification resample needs to pass the test for the original
statistical finding to be considered anomalous. The per-constituent false positive rate is
given for m = 1, 2, 3 and 4. In addition, the desired per-constituent false positive rate is
given for comparison purposes. This information may be used for planning purposes
when designing a site-specific statistical analysis plan.
In contrast, all verification resamples need to pass the test in California plans for the
original statistical finding to be considered anomalous. The per-constituent false positive
rate is given for m = 1, 2, and 3.
This test has been shown to have equivalent power to the EPA reference standard in
general (Davis & McNichols, 1994). However, two critical assumptions need to be met
for an accurate depiction of the per-constituent false positive rate. First, the test presumes
independence among the original samples and resample. Second, the test, when applied
to multiple wells (i.e., an interwell basis), presumes that the data are identically
distributed (ID) across all wells. An estimation of the ID assumption is automatically
performed by Sanitas. The Kruskal-Wallis test for equal medians (see ANOVA for test
description) is used to test this assumption. The user will be warned when the data fail the
ID test. When the ID assumption is not met, an intrawell analysis is recommended. In the
case of a two-tailed test, the lowest value from the background data is used to set the
lower limit of the prediction limit.
Under the interwell procedure, the prediction limit is the largest (or second largest)
observation in the background data. The background data consist of all the historical data
from all of the wells with the exception of the most recent observation from each of the
downgradient wells. These recent downgradient observations will be tested against the
prediction limit.
Under the intrawell procedure, the prediction limit is the largest (or second largest)
observation in the historical background data for that well. The background data consist
of all the historical data from the individual well except the most recent observation. The
most recent observation will be tested against the prediction limit. In most intrawell
cases, there will be insufficient background data to approximate the desired per-test false
positive rate.
Poisson-Based Prediction Limit Procedure:
When the background data contain greater than 90 percent observations below the
detection level, Sanitas gives you the option to construct a Prediction Limit based upon
the Poisson distribution. However, when DMT-NP is selected, a nonparametric
prediction limit will be derived versus a Poisson prediction limit.
Distribution:
The Poisson distribution is a probability distribution that models rare events; the
probability of a detectable observation is small unless there is an impact.
The sum of the Poisson counts across background samples, Tn, is computed by adding
the number of parts per billion (ppb) across all observations for the background well(s).
Prior to any calculations, nondetects are set to one-half of the method detection limit
(MDL) and all trace values are evaluated as the average of the MDL and the practical
quantitation limit (PQL).
The 99% upper Poisson prediction limit is calculated as:

Tk = c·Tn + (c·z²)/2 + c·z × [ Tn·(1 + 1/c) + z²/4 ]^(1/2)
Where:
c = k/n;
k = the number of future observations being compared to limit;
n = the background sample size;
Tn = the sum of the Poisson count of background samples; and
z = the upper 99% of the normal distribution.
The value k need not represent multiple samples from a single well. It could also denote a
collection of single samples from k distinct wells, all of which are assumed to follow the
same Poisson distribution in the absence of contamination.
To test the upper prediction limit, the Poisson count of the sum of the next k observations
from the downgradient well or the sum of the single observations from k distinct wells is
compared to the upper prediction limit. If this sum exceeds the prediction limit, there is
significant evidence of a downgradient impact. Should the exceedance occur for a sum of
observations from multiple wells, further investigation will be necessary to determine the
impacted well or wells.
67
EXAMPLE 21:
MW-1 (up) MW-2 (up) MW-3 (down)
<4 12 <4
<4 <4 6
<4 <4 <4
<4 <4 <4
Table 8.22: Example Data for Poisson Prediction Limits
k = 1    n = 8    c = k/n = 1/8 = .125

Tn = 2 + 2 + 2 + 2 + 12 + 2 + 2 + 2 = 26

z.99 = 2.327

Tk = (.125)(26) + (.125)(2.327)²/2 + (.125)(2.327) × [ 26 × (1 + 1/.125) + (2.327)²/4 ]^(1/2) = 8.05
Note: This test cannot be used for decimal values. When a Poisson analysis is attempted
on decimal data, Sanitas will advise you to change the units and to convert the
observations from parts-per-million to parts-per-billion (ppb) or ppb to parts-per-trillion.
Please note that units for all observations need to be consistent within a constituent.
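A sketch of the Poisson prediction limit formula, reproducing Example 21 (the function name is mine; z = 2.327 is supplied as in the example, and nondetects enter the count at half the MDL):

```python
import math

def poisson_prediction_limit(t_n, n, k, z):
    """Tk = c*Tn + c*z**2/2 + c*z*sqrt(Tn*(1 + 1/c) + z**2/4), with c = k/n."""
    c = k / float(n)
    return (c * t_n + c * z ** 2 / 2
            + c * z * math.sqrt(t_n * (1 + 1 / c) + z ** 2 / 4))

# Example 21: nondetects (<4) enter the Poisson count at half the MDL (2 ppb).
background = [2, 2, 2, 2, 12, 2, 2, 2]          # MW-1 and MW-2 (up)
tk = poisson_prediction_limit(sum(background), n=8, k=1, z=2.327)
print(round(tk, 2))  # 8.05
```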
Prediction Limits (or Intervals): EPA Draft Unified Guidance (UG)
Standards
Description:
UG Prediction limits are statistical intervals which include retesting strategies in order to
achieve a low facility-wide false positive rate while maintaining adequate statistical
power to detect contamination. The intervals are designed to contain K future sample(s)
or sample statistics (mean or median), with a specified probability, from a statistical
population. If any observation exceeds the prediction limit, this is statistically significant
evidence that the observation is not representative of the background group. While an
overview of these plans is provided in this section, the Draft Unified Guidance provides
detailed explanations and recommendations for prediction limits with retesting.
Requirements:
Prior to constructing UG prediction limits, the user must select “Unified Guidance
Standards” under the Options menu. To specify the site configuration and resampling
plan, select Prediction Limit Set Up on the Analysis tab of the Configure Sanitas window.
Enter the number of statistical evaluation periods per year (nE), number of constituents
(c), and number of monitoring wells (w). The annual target facility-wide false positive
rate should be no greater than 10% (cumulative throughout the year). If a facility
samples semi-annually, for instance, the overall target rate is distributed evenly among
each sampling event for a 5% target rate (α = .10/2 = .05 = 5%). The individual test
alpha (α*) then equals the targeted per-event false positive rate divided by the total
number of statistical tests (r).
For example, a site which samples semi-annually for 15 constituents at 7 wells would
have the following per-test alpha levels:
Semi-annual target rate: α = .10/2 = .05 = 5%
Total # of tests: r = c ● w = 15 x 7 = 105
Per-test alpha level: α* = α/r = .05/105 = .0004
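The per-test alpha arithmetic for this example can be checked directly (note the guide truncates the final value to .0004):

```python
# Per-test alpha for the example site: semi-annual sampling,
# 15 constituents, 7 wells, 10% annual facility-wide target rate.
annual_target = 0.10
n_events = 2                       # statistical evaluations per year (nE)
alpha = annual_target / n_events   # per-event target rate: 0.05
r = 15 * 7                         # total tests per event: c * w
alpha_star = alpha / r
print(r, round(alpha_star, 5))     # 105 0.00048
```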
Resample Plans:
Complete the site configuration by specifying whether prediction limits will be
constructed based on future observations, means of order 2, or means of order 3. If
prediction limits will be constructed for future observations, a resample program must be
selected (1 of 2, 1 of 3, 1 of 4, or 2 of 4 Modified CA Plan). The first number in each of
the plans indicates how many resamples must pass the predicted limit in order to declare
an initial exceedance a false finding. The second number indicates the “total” number of
samples required (i.e. the initial sample plus all resamples). When the resample is within
its predicted limit, it should replace the exceeded value in any future statistical analyses.
For instance, the 1 of 3 plan means that when an initial exceedance is noted, two
resamples are collected and one of them must pass the limit in order to declare the initial
exceedance a false finding. The exceedance would then be retained in the data file, but
assigned a user-specified flag so that it may be easily deselected in future statistical
analyses.
The “means of order 2 and 3” resample programs require 4 or 6 independent
measurements from each well. For instance, the “means of order 2” requires collection of
two samples so that the mean may be calculated and compared to a background limit. If
the mean exceeds the prediction limit, two additional samples are averaged and compared
to the limit.
Assumptions:
The parametric prediction limit is constructed if the background data follow a normal or
transformed-normal distribution. A minimum of four background values are required to
construct the interval, however, generally eight or more background samples are
recommended. The estimate of the standard deviation (S) that is used should be an
unbiased estimator. The usual estimate assumes that there is only one source of variation.
If there are other sources of variation, such as time effects, or spatial variation in the data
used for the background, then the parametric prediction limit is inappropriate. In these
situations, a multivariate statistical procedure is suggested. For more information see the
Unified Guidance and/or consult with a professional statistician.
Distribution:
In order to determine whether a parametric or nonparametric prediction limit should be
used, the distribution of the data is evaluated by applying the Shapiro-Wilk or Shapiro-
Francia tests for normality to the raw data or, when applicable, to the ladder of powers
(Helsel & Hirsch, 1992) transformed data. The null hypothesis, Ho, to be tested is:
H0: The population has a normal (or transformed-normal) distribution.
The alternative hypothesis, HA, is:
HA: The population does not have a normal (or transformed-normal)
distribution.
UG Parametric Prediction Limits Procedure:
The mean, X , and the standard deviation, S, are calculated for the raw or transformed
background data. The per-evaluation facility-wide false positive rate is determined as
described above based on an annual target rate of .10 (αE = α/nE). The number of
statistical comparisons (r) for each evaluation period (r = the number of wells (w) times
the number of constituents (c) to be sampled at each well) is computed based on user
input. By default, the number of future samples to be compared against the prediction
limit equals one for each well.
Compute the upper prediction limit using kappa multiplier values (depending on the type
of prediction limit, resample program, and per-evaluation alpha level).
The interval is given by:
PL = X̄ + (κ × S)

Where:
X̄ = the average of the background;
κ = the multiplier from Tables 13-1 through 13-18 (Appendix C; Draft EPA Unified
Guidance, September 2004); and
S = the standard deviation of the background.
EXAMPLE 21.5:
Background Values
240
220
240
220
210
200
220
220
240
230
240
230
Compliance Value
230
Table 8.23: Example Data for Intrawell Parametric Prediction Limit
X̄ = 225.8    s = 13.1    κ = 2.52*

PL = X̄ + (κ × s) = 225.8 + (2.52 × 13.1) = 258.8
*The kappa multiplier value was based on the Intrawell Parametric Prediction Limit and
the 1 of 2 Plan at the .05 alpha level. The site configuration included 10 constituents (c)
and 5 wells (w) for a total of 50 statistical tests (r = c ● w).
Censored data:
If less than 15 percent of the background observations are nondetects, these will be
replaced with one half of the method detection limit prior to running the analysis.
If more than 15 percent but less than 50 percent of the background data are less than the
detection limit, the data’s sample mean and sample standard deviation are adjusted
according to the method of Cohen or Aitchison (see Control Charts for method
description).
If more than 50 percent of the background data are less than the detection limit, a
nonparametric prediction limit will be computed.
Nonparametric Prediction Limits:
Distribution:
When the background data are not transformed-normal, or greater than 50 percent of the
background data are less than the detection limit, there is an option to construct a
nonparametric prediction limit. The highest or second highest value from the background
data may be specified in the prediction limit set-up window and used as the upper limit of
the prediction limit. The alpha level for each test is based on the background number (n)
and the number of wells (w), and may be obtained from Tables 13-19 through 13-30 of
the Unified Guidance.
California Non-statistical Analysis of VOCs
Description:
Note 1: this window may also be used to run an "Intrawell" screening when not in CA
Standards, in which detected values are reported for selected constituents and wells on the
selected dates. The remainder of this section will deal with the CA method.
Note 2: constituents will be automatically selected/deselected in this window based on the file
<sanitas>\util\not_VOC.txt. This file is editable, and contains instructions for its use.
The California Non-Statistical Analysis method is an interwell or intrawell test that may
be used to analyze constituents that have less than ten percent detectable observations. A
separate variant of this test is used for qualifying constituents of concern (COCs).
Regardless of the test variant used, the method involves evaluating whether downgradient
constituent values meet either of the test’s two possible triggering conditions.
Assumption:
The background samples have less than ten percent detectable values for the given
parameters. This assumption is automatically enforced in the case of interwell analysis.
The intrawell case is more flexible, but requires the user to specify which constituent/well
pairs will be analyzed. For CA intrawell use, it is recommended that a Constituent/Well
Group be created for this purpose in Sanitas. The Group can be populated with those
Constituent/Well pairs containing <10% detects (for example, Selections->Uncheck All,
and then Selections->Check Where->Constituent/Well Pair->Is Detect->Less than 10%)
and then can be further restricted by removing cases that will be analyzed statistically or
via the interwell non-statistical approach. This Group is then used to control the data
included in subsequent intrawell VOC analyses.
Procedure:
In the interwell case, the background well observations are checked to determine which
VOCs have less than ten percent detectable values, i.e. are eligible for the Non-Statistical
test. VOCs that have greater than or equal to ten percent detectable values must be
analyzed with a statistical analysis and are referred to as “orphans”.
Of the VOCs that are eligible for a non-statistical analysis (or for all selected constituents
and wells in the intrawell case) the compliance data are checked for the presence of either
three VOCs exceeding their method detection limit or one VOC exceeding its practical
quantitation limit.
When either of the two possible triggering conditions has been met, VOC contamination
is suspected and a verification retest is indicated (see Verification Retest Procedure
section).
Poisson Composite VOC Prediction Limit
Description:
A Poisson composite VOC prediction limit is an interwell statistical test used in detection
monitoring when nondetects exceed 90%. The Poisson test allows analysis of an entire
suite of constituents in one statistical test. The use of a multiple constituent analysis
significantly reduces the site-wide false positive rate as compared to the rate associated
with analyzing constituents separately. One drawback of analyzing multiple
constituents is that, when there is an exceedance, it is not clear which constituent(s)
caused it. This is not a concern when using the test for detection monitoring; however,
it is problematic for assessment monitoring.
The Poisson test estimates an upper prediction limit from the background well data for
the compliance wells by determining a limit that will contain all future measurements of
k compliance well(s) with a (1-α)% confidence level. If any of the constituent
concentration sums from the compliance wells exceed the predicted background limit,
there is statistically significant evidence of an impact.
Assumptions:
The use of interwell tests assumes the only source of variation between the upgradient
and downgradient wells is the effect of the facility. If there are other sources of variation
such as naturally occurring hydrogeologic differences or time effects, an intrawell or
multivariate testing procedure is more appropriate.
Distribution:
The Poisson distribution is a probability distribution for rare events. In the case
of VOC presence, the Poisson probability of a VOC “hit” is very small unless there is an
impact. The use of a Poisson distribution to estimate the Upper Poisson Prediction limit
requires that detects comprise no more than 10% of the background data and no more
than 20% for any single constituent. If the detection rate for the background wells is
greater than 10%, then a Poisson test is inappropriate. If a single constituent within the
background well has greater than 20% detects, then that constituent should be removed
from the suite and individually analyzed using a more appropriate statistical method such
as a parametric or nonparametric prediction limit analysis.
Procedure:
The sum of the Poisson counts across VOC background samples, Tn, is computed by
adding the number of parts per billion (ppb) across all constituents for the background
well(s). Prior to any calculations, nondetects are set to one half of the MDL and all trace
values are evaluated as the average of the method detection limit and the PQL.
The 99% upper Poisson Prediction limit is calculated as:
Tk = c*Tn + (c*z^2)/2 + z*c*sqrt( Tn(1 + 1/c) + z^2/4 )
Where:
c = k/n;
k = the number of VOCs;
n = the background sample size;
Tn = the sum of the Poisson count of background samples; and
z = the upper 99th percentage point of the standard normal distribution.
The sum of the Poisson counts across all VOCs within each compliance well is compared
to Tk. If any compliance well sum exceeds Tk, this is considered evidence of an impact.
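The limit above can be computed directly. A minimal sketch under the stated definitions; the function name and the default z for the upper 99% normal point are assumptions, not part of the Sanitas interface.

```python
import math

def poisson_composite_limit(T_n, n, k, z=2.326):
    """Upper Poisson prediction limit T_k for the summed VOC counts of k
    compliance wells, given background sum T_n over n samples; z is the
    upper 99% point of the standard normal distribution."""
    c = k / n
    return c * T_n + c * z ** 2 / 2 + z * c * math.sqrt(T_n * (1 + 1 / c) + z ** 2 / 4)

# Background sum of 100 ppb over 20 samples, limit for one compliance well:
print(poisson_composite_limit(100, 20, 1))
```

Each compliance well's summed Poisson count is then compared against this limit.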
Verification Retest Procedure – California
The following verification procedure is intended to meet the special performance
standards under Subsection 2550.7(e)(8)(E) in addition to the statistical performance
standards under Subsection 2550.7(e)(9) for detection monitoring.
The proposed verification procedure consists of discrete retests, in which rejection of the
null hypothesis for any one of the retests will be considered confirmation of significant
evidence of an impact. The discrete retest consists of collecting two new suites of
samples for the constituent(s) exceeding the concentration limit from the indicating
monitoring points.
The statistical test method used to evaluate the retest results will be the same as the
method used in the initial statistical comparison. For the original indication to be ignored,
both new analyses must contradict the original indication.
In the case of a Non-Statistical VOC analysis retest, two discrete samples are taken from
the suspected well(s) and a VOC suite chemical analysis is performed to identify
detectable constituents. The same triggering conditions hold for the retest as for the
original test; however, the parameters triggering a significant finding may be different
than those triggering the original indication.
Intrawell ASTM Approach (ASTM Standards Only)
This intrawell approach to detection monitoring is described in the Standard Guide for
Developing Appropriate Statistical Approaches for Ground-Water Detection Monitoring
Programs D 6312-98.
Censored Data:
If less than 75 percent of the observations are nondetects, an Intrawell Shewhart-CUSUM
Control Chart will be used. All nondetects will be replaced with the quantification limit
prior to running the analysis. If there are multiple detection limits, the median
quantification limit will be used.
If more than 75 percent but less than 100 percent of the data are less than the detection
limit, an Intrawell Poisson Prediction limit will be computed unless a sufficient number
of data points are available to compute an Intrawell Nonparametric Prediction limit that
will provide 99% confidence.
If 100 percent of the data are less than the detection limit, a Nonparametric Prediction
Limit or a Poisson Prediction Limit will be computed, depending on user selection.
Distribution:
If less than 75 percent of the observations are nondetects, the distribution of the data is
evaluated by applying the Shapiro-Wilk or Shapiro-Francia test for normality to the raw
data or, when applicable, to the transformed data. For a description of both the Shapiro-
Wilk and Shapiro-Francia tests please see the Distribution subsection of the Control
Chart Section.
If the distribution of the data is not found to be Normal, you can continue to run a
Shewhart-CUSUM Control Chart in ASTM Standards.
Seasonality:
Prior to constructing the Control Charts, the significance of data seasonality is evaluated
using the nonparametric Kruskal-Wallis test (U.S. EPA, April 1989). For a description,
please see earlier subsection on Seasonality under the Control Chart section.
When seasonality is known to exist, the data are deseasonalized prior to constructing
Control Charts in order to take into account seasonal variation rather than mistaking
seasonal effects for evidence of contamination. The data are deseasonalized using the
method described by EPA (U.S. EPA, April 1989). For a description, please see earlier
subsection on “Correcting for Seasonality” under the Control Chart Section.
Outliers:
To remove the possibility of either a high or low outlier in the historical data set, the
historical data are screened for the existence of outliers. See subsection “Outlier
Procedure” under the Descriptive Statistics Section for a method description. Note that if
the user has manually flagged values with an "O" (or "o") then the outlier test will not be
run, and the manually flagged outliers will instead be treated as confirmed outliers.
Existing Trends:
Prior to constructing a control chart, the background data are tested for the existence of
trends. If any trend exists (positive or negative) Sanitas will not run a control chart. The
ASTM Provisional Standards restrict trend testing to increasing trends. Sanitas tests for
both increasing and decreasing trends to prevent the possibility of a significant trend
confusing the statistical results. Both increasing and decreasing trends may lead to
inflated control limits. The provisional ASTM standards state that when significant
trends in background are present and these trends are not due to an impact, that an
alternative indicator constituent may be required for that well or all wells at the facility.
The Mann-Kendall test is used to test for significant trends in the background data. For a
method description please see the “Trend Analysis” subsection of the Evaluation
Monitoring Section.
Control Chart Procedure:
This procedure for construction of the Shewhart-CUSUM Control Chart follows the
ASTM recommendations (1996). The Shewhart-CUSUM Control Chart requires a
minimum of eight historical data points in order to reliably determine the mean and
standard deviation for each constituent’s concentration in a given well.
Three parameters are selected by the system prior to plotting:
h = the control limit to which the cumulative sum values (CUSUM) are
compared. ASTM (1996) recommends the value h = 4.5 units of
standard deviation for a background n < 12. When the background
n ≥ 12, h is adjusted to 4.0.
SCL = the upper Shewhart Control Limit to which the standardized mean
will be compared. ASTM (1996) recommends a value of SCL = 4.5
when background n < 12. When the background n ≥ 12, ASTM
recommends SCL = 4.0.
c = a parameter related to the displacement that should be quickly
detected. ASTM (1996) recommends c = 1 for background n < 12. For
background n ≥ 12, ASTM recommends c = 0.75.
The Shewhart-CUSUM Control Chart is constructed as described in the “Control
Chart Procedure” section.
The results are plotted in their original metric units rather than standard deviation units.
For background sample sizes less than 12:

h = SCL = X̄ + 4.5s

For background sample sizes greater than or equal to 12:

h = SCL = X̄ + 4.0s

and the standardized CUSUM values Si are converted to the original concentration
metric by the transformation:

Si * s + X̄
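The combined limit in original units can be sketched as follows, assuming simple background mean and standard deviation summaries (this is only the limit computation, not the full CUSUM plotting logic, and the function name is an assumption):

```python
import statistics

def combined_control_limit(background):
    """h = SCL = mean + 4.5*s for background n < 12, mean + 4.0*s for
    n >= 12, expressed in the original concentration units."""
    xbar = statistics.mean(background)
    s = statistics.stdev(background)
    mult = 4.5 if len(background) < 12 else 4.0
    return xbar + mult * s

print(combined_control_limit([10, 12, 11, 13, 9, 10, 12, 11]))
```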
Censored Data:
If less than 75 percent of the background data are less than the quantification limit, the
data’s sample mean and standard deviation are adjusted according to the method of
Cohen or Aitchison. Please see previous section for a description of Cohen’s and
Aitchison’s adjustment.
If more than 75 percent of the background data are less than the quantification limit, a
nonparametric prediction limit will be computed. As an option to the nonparametric
prediction limit, a Poisson-based prediction limit may be computed.
Figure 8.7: Intrawell ASTM Standards
Figure 8.8: Intrawell ASTM Standards (Cont’d)
Figure 8.9: Intrawell ASTM Standards (Cont’d)
Interwell ASTM Approach (ASTM Standards Only)
This Interwell approach to detection monitoring is described in the Standard Guide for
Developing Appropriate Statistical Approaches for Ground-Water Detection Monitoring
Programs D 6312-98.
Distribution:
The distribution of the data is evaluated by applying the multiple group version of the
Shapiro-Wilk test for normality to the raw data or, when applicable, to the log
transformed data.
The null hypothesis, H0, to be tested is:
H0: The population has a normal (or transformed-normal) distribution.
The alternative hypothesis, HA, is:
HA: The population does not have a normal (or transformed-normal)
distribution.
Multiple Group Version Shapiro-Wilk Procedure:
The multiple group version of the Shapiro-Wilk test takes into consideration that
upgradient measurements are nested within different upgradient monitoring wells.
First, calculate the Shapiro-Wilk W-statistic (see prior section for method description) for
each compliance well and denote as Wi. Calculation of the multiple group version of the
Shapiro Wilk G-statistic to test the null hypothesis is presented in detail in
Technometrics, 10 (Wilk, Shapiro, 1968).
For sample sizes Ni greater than or equal to seven, calculate Gi for each well. Gi is the
percentage point of the standard normal distribution corresponding to αi. Under the null
hypothesis, the quantities G1, ..., GK may be considered to be a random sample from a
standard normal distribution:

Gi = γ + δ ln( (Wi - ε) / (1 - Wi) )

Where the values γ, δ, and ε are given in the Shapiro-Wilk (1968) table.
For sample sizes between three and six, the value of Gi is obtained from Table 2 of
Shapiro-Wilk (1968) by linear interpolation on the tabulated quantity:

ui = ln( (Wi - ε) / (1 - Wi) )
Then compute G, the normalized value of the Gi:

G = (G1 + G2 + ... + GK) / sqrt(K)

Where:
K = number of wells.
Refer the normalized mean, G, to a standard table of the normal integral. If the
probability of G is greater than .01, accept the null hypothesis that the population has a
normal (or transformed normal) distribution.
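The pooling step can be sketched as below. The per-well constants γ, δ, ε must come from the Shapiro-Wilk (1968) tables for each well's sample size; the values passed by a caller here are placeholders, and the function name is an assumption.

```python
import math

def multiple_group_G(W, gamma, delta, eps):
    """Normalized multiple-group Shapiro-Wilk statistic: per-well
    G_i = gamma + delta*ln((W_i - eps)/(1 - W_i)), pooled as
    G = sum(G_i)/sqrt(K). gamma/delta/eps are the tabled Shapiro-Wilk
    (1968) constants for each well's sample size."""
    G_i = [g + d * math.log((w - e) / (1 - w))
           for w, g, d, e in zip(W, gamma, delta, eps)]
    return sum(G_i) / math.sqrt(len(G_i))
```

The resulting G is referred to the standard normal table; a probability above 0.01 accepts normality.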
Outliers:
To remove the possibility of either a high or low outlier in the historical data set, the
historical data are screened for the existence of outliers. See subsection “Outlier
Procedure” under the Descriptive Statistics Section for a method description. Note that if
the user has manually flagged values with an "O" (or "o") then the outlier test will not be
run, and the manually flagged outliers will instead be treated as confirmed outliers.
Censored Data:
If less than 50 percent of the background data are below the detection limit, the data’s
sample mean and sample standard deviation are adjusted according to the method of
Aitchison or Cohen. The choice between these two adjustments for nondetects is a
user-selected option; the U.S. EPA (1992) provides a useful approach to help select
which method to use.
If more than 50 percent of the background data are less than the detection limit, a
nonparametric prediction limit will be computed. As an option to the nonparametric
prediction limit, a Poisson-based prediction limit may be computed.
Parametric Prediction Limit Procedure:
The mean, X , and the standard deviation, S, are calculated for the raw or transformed
background data. Then the interval is given by:
X̄ + t(n-1, α) * S * sqrt(1 + 1/n)
if the data are normal, and the interval is given by:
exp( ȳ + t(n-1, α) * sy * sqrt(1 + 1/n) )
if the data are found to be lognormal.
Where:
α = the false positive rate for each individual test;
n = the number of observations in the background data; and
t(n-1, α) = the one-sided (1-α) upper percentage point of Student’s t distribution on
n-1 degrees of freedom.
Select α as the minimum of 0.01 or one of the following:

1) Pass the first or one of one verification resamples:

α = ( 1 - 0.95^(1/k) )^(1/2)

2) Pass the first or one of two verification resamples:

α = ( 1 - 0.95^(1/k) )^(1/3)

3) Pass the first or one of three verification resamples:

α = ( 1 - 0.95^(1/k) )^(1/4)
Where:
k = number of comparisons (monitoring wells times constituents).
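The α selection can be sketched as below. This is an illustrative reading of the three resampling rules (taking the one-, two-, and three-resample cases as 1/2, 1/3, and 1/4 powers of 1 - 0.95^(1/k), an assumption where the original formulas are ambiguous); the function name is also an assumption.

```python
def per_test_alpha(k, resamples):
    """Per-comparison false positive rate aimed at a 5% site-wide rate,
    capped at 0.01 per the text; k = wells x constituents, resamples is
    the number of verification resamples (1, 2 or 3)."""
    base = 1 - 0.95 ** (1 / k)
    alpha = base ** {1: 1 / 2, 2: 1 / 3, 3: 1 / 4}[resamples]
    return min(0.01, alpha)
```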
For a two-tailed test, t(n-1, α/2) is substituted for t(n-1, α) in the above formula. Statistically
significant evidence of an impact is noted when compliance observations fall outside the
bounds of the upper or the lower prediction limits.
When a modified alpha, α*, is computed, t(n-1, K, 1-α*) is substituted for
t(n-1, K, 1-α) in the above formula.
Nonparametric Prediction Limit Procedure:
When the background data are not transformed-normal or contain greater than 50 percent
of the observations below the detection limit, Sanitas will automatically construct a
nonparametric prediction limit. The highest value from the background data is used to set
the upper limit of the prediction limit. In the case of a two-tailed test, the lowest value
from the background data is used to set the lower limit of the prediction limit. If the
background data contain 100 percent non-detects, the prediction limit is equal to the
median quantification limit. The false positive rate is based upon the background sample
size and the number of compliance points being compared to the limit. The site-wide
false positive rate, γ, is given in Table 2 (Gibbons, R.D., 1991). The minimum sample
size for a false positive rate equal to 1 percent for a single well and one resample is 13.
Poisson-Based Prediction Limit Procedure:
When the background data contain greater than 50 percent observations below the
detection level, you may choose to construct a prediction limit based upon the Poisson
distribution. Poisson prediction limits will be utilized for those cases in which there are
too few background measurements to achieve an adequate site wide false positive rate
using the nonparametric approach.
Distribution:
The Poisson distribution is a probability distribution for rare events. The
Poisson probability of a detectable observation is small unless there is an impact.
Procedure:
The sum of the Poisson counts across background samples, y, is computed by adding the
number of parts per billion (ppb) across all observations for the background well(s). Prior
to any calculations, nondetects are set to the median method detection limit (MDL) and
all trace values are evaluated as the median practical quantitation limit (PQL).
The 99% upper Poisson prediction limit is calculated as:
y/n + z^2/(2n) + (z/n) * sqrt( y(1 + 1/n) + z^2/4 )
Where:
y = the sum of the detected measurements or the quantification limit for
those samples in which the constituent was not detected;
n = the background sample size; and
z = the (1-α)·100 upper percentage point of the normal distribution
(where α is computed as in the section on parametric prediction limits).
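The single-well limit can be sketched directly from the formula above (the function name and the default z for the upper 99% normal point are assumptions):

```python
import math

def poisson_prediction_limit(y, n, z=2.326):
    """99% upper Poisson prediction limit for a single future sum, from
    background Poisson count y over n samples; z is the upper 99% point
    of the standard normal distribution."""
    return y / n + z ** 2 / (2 * n) + (z / n) * math.sqrt(y * (1 + 1 / n) + z ** 2 / 4)

# Background count of 100 ppb over 20 samples:
print(poisson_prediction_limit(100, 20))
```

Note this is the k = 1 case of the earlier composite VOC limit, with c = 1/n.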
Note: This test cannot be used for decimal values. When a Poisson analysis is attempted
on decimal data, Sanitas will advise you to change the units and to convert the
observations from parts-per-million to parts-per-billion (ppb) or ppb to parts-per-trillion
by multiplying them by 1000. For example, 0.001 ppm should be converted to 1 ppb in
the data spreadsheet, or by using Alternate Values in the View. If you are editing the
data file, please note that units for all observations need to be consistent within a
constituent.
Transform Data
Once you have opened the Examine Observations window after creating a view, you can
choose to power-transform the data by choosing Data/Transformed Original Values
into Alt Values. You will be asked to select a power of 10 by which to multiply your
original values; for example, to convert 0.001 ppm to 1 ppb you would multiply by
1000. The transformed data will be displayed in the Alternate Value column, and may
be used in the analysis by selecting “Use Alternative Values”. This provides
transformed data in the View, but does not directly affect the original data file.
Figure 8.10: Interwell ASTM Standards
Figure 8.11: Interwell ASTM Standards (Cont’d)
Evaluation Monitoring Statistics
Trend Analysis
Description and Procedure:
A trend is the general increase or decrease in observed values of some random variable
over time. A trend analysis can be used to determine the significance of an apparent trend
and to estimate the magnitude of that trend. The Mann-Kendall test for temporal trend
(Hollander & Wolfe, 1973) and Sen’s slope estimate (Gilbert, 1987) were chosen for the
site evaluation (or assessment) monitoring program to evaluate the correlation of selected
constituent concentrations with time.
The Mann-Kendall test is nonparametric, meaning that it does not depend on an
assumption of a particular underlying distribution. The test uses only the relative
magnitude of data rather than actual values. Therefore, missing values are allowed, and
values that are recorded as non-detects by the laboratory can still be used in the statistical
analysis by assigning values equal to half their detection limits (Gilbert, 1987).
The null hypothesis, H0, to be tested is:
H0: No significant trend of a constituent exists over time.
The alternative hypothesis, HA, is:
HA: A significant upward (or downward) trend of a constituent concentration
exists over time.
For groups having fewer than 41 data points, an exact test is performed. If 41 or more
data points are available, the normal approximation test is used (Gilbert, 1987).
- Exact Test (n <= 40):
The Mann-Kendall method assigns a positive or negative score based on the differences
between the data points. The first step is to list the data in the order in which they were
collected over time, and then determine the sign of all possible differences xj - xk, where
j > k:
sgn(xj - xk) =  1  if xj - xk > 0
             =  0  if xj - xk = 0
             = -1  if xj - xk < 0
Where:
xj = the value of the jth observation; and
xk = the value of the kth observation.
The Mann-Kendall statistic, S, is then computed, which is the number of positive
differences minus the number of negative differences.
S = Σ(k=1 to n-1) Σ(j=k+1 to n) sgn(xj - xk)
Where:
n = the total number of observations.
If S is a large positive number, measurements taken later in time tend to be larger than
those taken earlier, i.e., an upward trend. Similarly, if S is a large negative number,
measurements taken later in time tend to be smaller, i.e., a downward trend.
For a two-tailed test to detect either an upward or downward trend, the tabulated
probability level corresponding to the absolute value of S (Gilbert, 1987) is doubled and
H0 is rejected if that doubled value is less than the a priori α significance level of the test.
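The computation of S can be sketched in a few lines (the function name is an assumption):

```python
def mann_kendall_S(x):
    """Exact Mann-Kendall statistic S: the number of positive pairwise
    differences x[j] - x[k] (j > k) minus the number of negative ones."""
    n = len(x)
    return sum((x[j] > x[k]) - (x[j] < x[k])
               for k in range(n - 1) for j in range(k + 1, n))

print(mann_kendall_S([1, 2, 3, 4, 5]))   # 10: all C(5,2) differences positive
print(mann_kendall_S([5, 4, 3, 2, 1]))   # -10: all differences negative
```

The tabulated exact probability for |S| would then be looked up in Gilbert (1987) and doubled for a two-tailed test.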
- Normal Approximation Test (n > 40):
The Mann-Kendall test statistic, S, is calculated using the same method of the exact test.
When there are no tied values, the variance of VAR(S) is computed:
VAR(S) = n(n-1)(2n+5) / 18
S and VAR(S) are then used to compute the test statistic, Z, as follows:
Z = (S - 1) / sqrt(VAR(S))   if S > 0
Z = 0                        if S = 0
Z = (S + 1) / sqrt(VAR(S))   if S < 0
When tied values (data points having equal values) are present, the variance of S is
computed:
VAR(S) = (1/18) [ n(n-1)(2n+5) - Σ(p=1 to g) tp(tp-1)(2tp+5) ]
Where:
g = the number of tied groups; and
tp = the number of observations in the pth group.
To test for an upward or a downward trend (a two-tailed test), a level of significance, α,
must first be chosen. The level of significance is the probability of rejecting the null
hypothesis, H0 (no trend), when no trend actually exists (a Type I error). In general, α is
chosen to be 0.05. The split Type I error probability, α/2, for a two-tailed test is
then 0.025.
The Z-value associated with the 0.025 significance level is 1.96, from Table A-1
(Hollander and Wolfe, 1973). Corresponding to an α-level of 0.05, 95 percent (1-α) of
the area under the normal curve lies between -1.96 and 1.96.
A positive or negative value of Z can indicate an upward or downward trend,
respectively. With an α -value of 0.05, any Z-value above 1.96 indicates a statistically
significant upward trend, and any value below -1.96 indicates a statistically significant
downward trend. In such cases, the H0 of no trend would be rejected. For Z-values that
fall between -1.96 and 1.96, the null hypothesis cannot be rejected.
To reject H0, the probability corresponding to the Z-value must be less than the specified
α -value. The smaller the probability value, the greater the likelihood that a trend is
occurring and the greater the likelihood the constituent concentration (the dependent
variable) is an increasing or decreasing function of time.
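The normal-approximation steps above can be sketched together, with the tie correction, as follows (the function name is an assumption):

```python
import math
from collections import Counter

def mann_kendall_Z(x):
    """Normal-approximation Mann-Kendall test: S with the tie-corrected
    variance, then the continuity-corrected Z as defined above."""
    n = len(x)
    S = sum((x[j] > x[k]) - (x[j] < x[k])
            for k in range(n - 1) for j in range(k + 1, n))
    var = n * (n - 1) * (2 * n + 5)
    for t in Counter(x).values():          # tied groups (t = 1 contributes 0)
        var -= t * (t - 1) * (2 * t + 5)
    var /= 18
    if S == 0:
        return 0.0
    return (S - 1) / math.sqrt(var) if S > 0 else (S + 1) / math.sqrt(var)
```

A Z above 1.96 (or below -1.96) rejects H0 at the two-tailed α = 0.05 level.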
Sen’s Slope Estimator
Description:
This simple nonparametric procedure was developed by Sen (1968) and presented in
Gilbert (1987) to estimate the true slope. The advantage of this method over linear
regression is that it is not greatly affected by gross data errors or outliers, and can be
computed when data are missing.
The N′ individual slope estimates, Q, are computed for each time period:
Q = (Xi′ - Xi) / (i′ - i)
Where:
Xi′ and Xi = the data values at times i′ and i (in days), respectively, with i′ > i; and
N′ = the number of data pairs for which i′ > i.
A value of one half of the detection limit will be substituted for Xi values below the
detection limit.
Sen’s Slope estimator is the median slope, obtained by ranking the N′ values of Q from
smallest to largest, and choosing the middle-ranked slope as follows.
Sen’s slope = Q[(N′+1)/2]                          if N′ is odd
Sen’s slope = (1/2) ( Q[N′/2] + Q[(N′+2)/2] )      if N′ is even

Where Q[1] ≤ Q[2] ≤ ... ≤ Q[N′] are the ranked individual slope estimates.
This value is multiplied by 365 to give the yearly slope value.
EXAMPLE 22:

Time Period:   1    1    1    2    3    3    4    5
Data:         10   22   21   30   22   30   40   40

Individual slope estimates, Q (NC = not computed, because the pair shares a
time period):

  NC    NC   +20    +6   +10   +10  +7.5
  NC    +8     0    +4    +6  +4.5
  +9   +.5  +4.5 +6.33 +4.75
  -8     0    +5 +3.33
  NC   +18    +9
 +10    +5
   0

Table 8.24: Example Data for Sen’s Slope

N′ = 24

Q (slope) values ranked from smallest to largest:
-8, 0, 0, 0, 0.5, 3.33, 4, 4.5, 4.5, 4.75, 5, 5, 6, 6, 6.33, 7.5, 8, 9, 9, 10, 10, 10,
18, 20

The median of these Q values is the average of the 12th and 13th largest
values, 5 and 6.

The Sen estimate of the true slope is 5.5.
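The Example 22 computation can be reproduced directly; pairs sharing a time period are skipped, matching the NC entries (the function name is an assumption):

```python
import statistics

def sens_slope(times, values):
    """Sen's slope estimator: the median of the pairwise slopes Q over
    all pairs with distinct times."""
    Q = [(values[j] - values[i]) / (times[j] - times[i])
         for i in range(len(values))
         for j in range(i + 1, len(values))
         if times[j] != times[i]]
    return statistics.median(Q)

# Example 22 data
t = [1, 1, 1, 2, 3, 3, 4, 5]
x = [10, 22, 21, 30, 22, 30, 40, 40]
print(sens_slope(t, x))  # 5.5, the average of the 12th and 13th ranked slopes
```

With times in days, multiplying the result by 365 gives the yearly slope.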
Seasonal Kendall Test
Description:
The Seasonal Kendall Test is an extension of the Mann-Kendall test that removes
seasonal cycles and tests for trend.
Seasonal Kendall Procedure:
Compute the Mann-Kendall statistic, S, for each season. Let Si denote this statistic for
the ith season, that is:
Si = Σ(k=1 to ni-1) Σ(l=k+1 to ni) sgn(xil - xik)
Where l > k, ni is the number of data for season i, and:
sgn(xil - xik) =  1  if xil - xik > 0
               =  0  if xil - xik = 0
               = -1  if xil - xik < 0
VAR(Si) is computed as follows:

VAR(Si) = (1/18) [ ni(ni-1)(2ni+5) - Σ(p=1 to gi) tip(tip-1)(2tip+5)
                 - Σ(q=1 to hi) uiq(uiq-1)(2uiq+5) ]
          + [ Σ(p=1 to gi) tip(tip-1)(tip-2) ] [ Σ(q=1 to hi) uiq(uiq-1)(uiq-2) ]
            / [ 9 ni(ni-1)(ni-2) ]
          + [ Σ(p=1 to gi) tip(tip-1) ] [ Σ(q=1 to hi) uiq(uiq-1) ]
            / [ 2 ni(ni-1) ]
Where:
gi = the number of groups of tied data in season i;
tip = the number of tied data in the pth group for season i;
hi = the number of sampling times (or time periods) in season i that
contain multiple data; and
uiq = the number of multiple data in the qth time period in season i.
After Si and VAR(Si) are computed, we pool across the K seasons:
S′ = Σ(i=1 to K) Si

and

VAR(S′) = Σ(i=1 to K) VAR(Si)
Next compute:
Z = (S′ - 1) / sqrt(VAR(S′))   if S′ > 0
Z = 0                          if S′ = 0
Z = (S′ + 1) / sqrt(VAR(S′))   if S′ < 0
For a two-tailed test, we reject H0 of no trend if the absolute value of Z is greater than
Z(1-α/2). Sanitas tests at the 80%, 90% and 95% confidence levels.
Seasonal Kendall Slope Estimator Procedure:
First compute the N′i individual slope estimates for the ith season:

Qi = (xil - xik) / (l - k)
Where:
xil = The datum for the ith season of the lth year; and
xik = The datum for the ith season of the kth year, where l > k.
Do this for each of the K seasons. Then rank the N′1 + N′2 + ... + N′K = N′ individual
slope estimates and find their median. This median is the seasonal Kendall slope
estimator.
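The pooling of S and VAR(S) across seasons can be sketched as below. For brevity this sketch omits the tie and multiple-data corrections in VAR(Si) (an assumption: it uses only the leading ni(ni-1)(2ni+5)/18 term), and the function name is illustrative.

```python
import math

def seasonal_kendall_Z(seasons):
    """Seasonal Kendall test statistic: per-season S_i and VAR(S_i)
    (uncorrected leading term only) pooled across seasons, then the
    continuity-corrected Z."""
    S, var = 0, 0.0
    for x in seasons:
        n = len(x)
        S += sum((x[l] > x[k]) - (x[l] < x[k])
                 for k in range(n - 1) for l in range(k + 1, n))
        var += n * (n - 1) * (2 * n + 5) / 18
    if S == 0:
        return 0.0
    return (S - 1) / math.sqrt(var) if S > 0 else (S + 1) / math.sqrt(var)

# Four seasons, each with a rising ten-year record:
print(seasonal_kendall_Z([list(range(10))] * 4))
```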
Compliance or Corrective Action Monitoring Statistics
Confidence Intervals
Description:
A Confidence Interval is constructed from sample data and is designed to contain the
mean concentration of a well analyte in ground water monitoring, with a designated level
of confidence. A Confidence Interval generally should be used when specified by permit
or when downgradient samples are being compared to the maximum concentration limit
(MCL) or alternate concentration limit (ACL). In this situation, the MCL or ACL is a
specified concentration limit or determined by the background concentrations.
Assumptions:
The sample data used to construct the intervals must be normally or transformed-
normally distributed. In the case of a transformed-normal distribution, the Confidence
Interval must be constructed on the transformed sample concentration values. In addition
to the interval construction, the comparison must be made to the transformed MCL or
ACL value. When neither the normal nor the transformed models can be justified, a
nonparametric version of each interval may be utilized. If the entire Confidence Interval exceeds the
compliance limit, there is statistically significant evidence that the mean concentration
exceeds the compliance limit.
Distribution:
The distribution of the data is evaluated by applying the Shapiro-Wilk or Shapiro-Francia
test for normality to the raw data or, when applicable, to the Ladder of Powers (Helsel &
Hirsch, 1992) transformed data.
The null hypothesis, H0, to be tested is:
H0: The population has a normal (or transformed-normal) distribution.
The alternative hypothesis, HA, is:
HA: The population does not have a normal (or transformed-normal)
distribution.
Censored Data:
If less than 15 percent of the observations are nondetects, these will be replaced with one
half the method detection limit prior to running the normality test and constructing the
Confidence Interval.
If more than 15 percent but less than 50 percent of the data are less than the detection
limit, the data’s sample mean and standard deviation are adjusted according to the
method of Cohen or Aitchison (U.S. EPA, April 1989). This adjustment is made prior to
construction of the Confidence Interval.
If more than 50 percent of the data are less than the detection limit, these values are
replaced with one half the method detection limit and a nonparametric Confidence
Interval is constructed.
Parametric Confidence Interval Procedures:
A minimum of four sample values is required for the construction of the parametric
Confidence Interval. The mean, X , and standard deviation, S, of the sample
concentration values are calculated separately for each compliance well (monitoring
point). For each well, the Confidence Interval is calculated as:
X̄ ± t(1-α, n-1) * S / sqrt(n)
Where:
S = the compliance point’s standard deviation;
n = the number of observations for the compliance point; and
t(1-α, n-1) = obtained from the Student’s t-Distribution found in Table 6
(Appendix B; U.S. EPA, April 1989) with (n-1) degrees of freedom.
The use of the 99th percentile of the t-Distribution is consistent with the 1 percent
α-level of individual well comparisons. If the lower end of the interval is above the
compliance limit, then the mean concentration is significantly greater than the
compliance limit, indicating noncompliance.
For a two-tailed test, t(0.995, n-1) will be substituted for t(0.99, n-1) in determining the
confidence interval. When the lower limit exceeds the upper compliance limit or the
upper limit falls below the lower compliance limit, there is statistically significant
evidence of noncompliance.
EXAMPLE 23:
Date         Well#3
1/1/1988     10
4/1/1988     2.5
10/1/1988    16
4/1/1989     15
7/1/1989     8
10/1/1989    15
1/1/1990     21

Table 8.25: Example Data for Parametric Confidence Interval

X̄ = 12.5    s = 6.103    n = 7    t(.99, 6) = 3.143

Upper Limit = 12.5 + 3.143 * 6.103 / sqrt(7) = 19.75
Lower Limit = 12.5 - 3.143 * 6.103 / sqrt(7) = 5.25
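Example 23 can be reproduced directly. The tabled t value is passed in rather than computed (3.143 = t(.99, 6) from the example); the function name is an assumption.

```python
import math
import statistics

def confidence_interval(x, t_crit):
    """Two-sided confidence interval for the mean: xbar +/- t * S / sqrt(n),
    where t_crit is the tabled Student's t point on n-1 degrees of freedom."""
    n = len(x)
    half = t_crit * statistics.stdev(x) / math.sqrt(n)
    xbar = statistics.mean(x)
    return xbar - half, xbar + half

lo, hi = confidence_interval([10, 2.5, 16, 15, 8, 15, 21], 3.143)
print(round(lo, 2), round(hi, 2))  # 5.25 19.75
```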
Nonparametric Confidence Interval Procedure:
The Nonparametric Confidence Interval procedure requires at least seven observations in
order to obtain a one-sided significance level of 1 percent. The observations are ordered
from smallest to largest and ranks are assigned separately within each well (monitoring
point). Average ranks are assigned to tied values. The critical values of the order statistics
are determined as follows.
If the minimum seven observations are used, the critical values are the first and seventh
values.
Otherwise, the smallest integer, M, is found such that the cumulative binomial
distribution with parameters n (the sample size) and probability of success p = 0.5 is at
least 0.99.
The exact confidence coefficient for sample sizes from 4 to 11 are given by the EPA
(Table 6-3; U.S. EPA, April 1989). For larger samples, take as an approximation the
nearest integer value to:
M = n/2 + 1 + Z(1-α) * sqrt(n/4)
Where:
Z(1-α) = the (1-α) percentile from the normal distribution found in Table 4
(Appendix B; U.S. EPA, April 1989); and
n = the number of observations in the sample.
Once M has been determined, (n+1-M) is computed and the confidence limits are taken
as the order statistics X(M) and X(n+1-M). These confidence limits are compared to the
compliance limit. If the lower limit, X(n+1-M), exceeds the compliance limit, there is
statistically significant evidence of noncompliance. Otherwise, the well remains in
compliance.
EXAMPLE 24:
Date Well#1
12/1/1987 .5325
4/13/1988 .825
5/11/1988 .26
6/2/1988 .32
10/1/1988 .39
1/01/1989 .515
5/01/1989 .08
9/01/1989 .025
3/01/1990 .022
Table 8.26: Example Data for Nonparametric Confidence Interval
n = 9    Z.99 = 2.327

M = 9/2 + 1 + 2.327 * sqrt(9/4) = 8.99, rounded to 9

Upper Limit = X(9) = .825
Lower Limit = X(9+1-9) = X(1) = .022
For a two-tailed test, Z0.995 will be substituted for Z0.99 in deriving M. If the upper limit,
X(M), falls below the lower compliance limit, or the lower limit, X(n+1-M), exceeds the
upper compliance limit, there is statistically significant evidence of noncompliance.
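The large-sample approximation and Example 24 can be sketched together. The normal percentile is computed rather than read from a table, so it differs from the tabled 2.327 in the fourth decimal (M is unaffected here); the function name is an assumption.

```python
import math
from statistics import NormalDist

def nonparametric_limits(x, alpha=0.01):
    """Order-statistic confidence limits via the large-sample
    approximation M = n/2 + 1 + Z(1-alpha)*sqrt(n/4); returns
    (X(n+1-M), X(M)) using 1-based order statistics."""
    n = len(x)
    z = NormalDist().inv_cdf(1 - alpha)
    M = round(n / 2 + 1 + z * math.sqrt(n / 4))
    xs = sorted(x)
    return xs[n - M], xs[M - 1]   # lower X(n+1-M), upper X(M)

# Example 24 data:
lo, hi = nonparametric_limits(
    [.5325, .825, .26, .32, .39, .515, .08, .025, .022])
print(lo, hi)  # 0.022 0.825
```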
Tolerance Intervals
Description:
In compliance monitoring, the Tolerance Interval is calculated on the compliance point
data, so that the upper one-sided tolerance limit may be compared to the appropriate
ground water protection standard (i.e., MCL or ACL). If the upper tolerance limit
exceeds the fixed standard, and the tolerance limit has been constructed to have an
average coverage of 95 percent, there is significant evidence that 5 percent or more of
all the compliance well measurements will exceed the limit.
Assumptions:
The sample data used to construct the intervals are assumed to be normally or
transformed-normally distributed. In the case of a transformed-normal distribution, the
Tolerance Interval must be constructed on the transformed sample concentration values.
In addition to the interval construction, the comparison must be made to the transformed
MCL or ACL value. When neither the normal nor transformed models can be justified, a
nonparametric version of each interval may be utilized.
Censored Data:
If less than 15 percent of the observations are nondetects, these will be replaced with one-
half of the method detection limit prior to running the normality test and constructing the
Tolerance Interval.
If more than 15 percent but less than 50 percent of the data are less than the detection
limit, the data’s sample mean and standard deviation are adjusted according to the
method of Cohen or Aitchison (U.S. EPA, April 1989). This adjustment is made prior to
construction of the Tolerance Interval.
If more than 50 percent of the data are less than the detection limit, these values will be
replaced with one half the method detection limit and a nonparametric Tolerance Interval
may be constructed.
Parametric Tolerance Intervals Procedure:
A minimum of four sample values is recommended for the construction of Tolerance
Intervals. The Shapiro-Wilk or Shapiro-Francia test for normality (see Control Chart for
method description) is used to determine if the sample values are normally or
transformed-normally distributed. The mean, X, and the standard deviation, S, are
computed separately for each compliance well’s data. The factor, K, is determined for the
sample size, n, from Table 5 (Appendix B; U.S. EPA, April 1989). The Tolerance
Interval is computed as:
[ 0, X + KS ]
Where:
X = the mean for the compliance observations;
K = the factor obtained for sample size, n, from Table 5 (Appendix B; U.S.
EPA, April 1989); and
S = the standard deviation of the compliance observations.
The factor K is chosen so that the Tolerance Interval has 95% coverage with a 95%
confidence factor for each well.
The upper limit of the Tolerance Interval is compared to the compliance limit. If the
upper limit of the Tolerance Interval exceeds that limit, there is statistically significant
evidence of an impact.
EXAMPLE 25:
Date Well#3
1/1/1988 10
4/1/1988 2.5
10/1/1988 16
4/1/1989 15
7/1/1989 8
10/1/1989 15
1/1/1990 21
Table 8.27: Example Data for Parametric Tolerance Interval
X = 12.5     S = 6.103     K = 3.399

Tolerance Interval = 12.5 + (6.103 × 3.399) = 33.25
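Example 25 can be reproduced with a short sketch. The factor K = 3.399 is simply taken from the example; in practice it is read from Table 5 (Appendix B; U.S. EPA, April 1989) for the given sample size, coverage, and confidence.

```python
import statistics

# Example 25 data for Well#3
data = [10, 2.5, 16, 15, 8, 15, 21]

mean = statistics.mean(data)            # 12.5
sd = statistics.stdev(data)             # sample standard deviation, ~6.11
K = 3.399                               # tabulated factor from the example

upper_tolerance_limit = mean + K * sd   # ~33.26 (33.25 above, after rounding S)
```

The small difference from the 33.25 shown above comes only from rounding S to three decimals in the worked example.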
Nonparametric Tolerance Interval Procedure:
A minimum of 19 sample values is recommended for the construction of a 95%
Confidence/95% Coverage Tolerance Interval. The highest compliance observation is
used to set the upper limit of the Tolerance Interval. This upper limit is compared to the
compliance limit. If the upper limit of the Tolerance Interval exceeds that limit, there is
statistically significant evidence of an impact.
Proportion Estimate
Description:
The proportion estimate test computes the proportion of observations in the record
exceeding a stated excursion limit and computes a confidence limit for this proportion.
Proportion Estimate Procedure:
For n < 20,
For the lower confidence limit, use the following distribution function:
F(x) = P(X ≤ x) = Σ(i=0 to x) [ n! / ( i! (n−i)! ) ] p^i q^(n−i)
Where:
i = 0,1,…x;
n = The total # of observations;
p = The proportion (0 < p < 1);
q = 1-p;
u = The total # of observations that exceed xC;
xC = The stated excursion limit;
x = u – 1; and
P1 = 1 – α/2.
Determine, through iteration, the value of p for which F(x) = P1. This p value is the
lower limit of the interval.
For the upper confidence limit, use the same distribution function:
F(x) = P(X ≤ x) = Σ(i=0 to x) [ n! / ( i! (n−i)! ) ] p^i q^(n−i)
Where:
i = 0,1,…x;
n = The total # of observations;
p = The proportion (0 < p < 1);
q = 1-p;
u = The total # of observations that exceed xC;
x = u; and
P2 = α/2
Determine, through iteration, the value of p for which F(x) = P2 = α/2. This p value is
the upper limit of the interval.
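The iteration for n < 20 can be sketched by bisection on p, since F(x) decreases as p increases from 0 to 1. The counts used below (12 observations, 3 exceedances) are hypothetical, chosen only to exercise the procedure.

```python
import math

def binom_cdf(x, n, p):
    """F(x) = P(X <= x) for a Binomial(n, p) variable."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(x + 1))

def solve_p(x, n, target, tol=1e-8):
    """Find p with F(x) = target by bisection; F is decreasing in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binom_cdf(x, n, mid) > target:
            lo = mid    # F still above target: p must grow
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical record: n = 12 observations, u = 3 exceed the excursion limit,
# alpha = 0.05 for a two-sided 95% interval on the exceedance proportion.
n, u, alpha = 12, 3, 0.05
lower = solve_p(u - 1, n, 1 - alpha / 2)   # F(u-1) = P1 = 1 - alpha/2
upper = solve_p(u, n, alpha / 2)           # F(u)   = P2 = alpha/2
```

The observed proportion u/n always falls between the two solved limits; this is the same construction as the exact (Clopper-Pearson) binomial interval.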
For n > 20, calculate the upper and the lower limit of the interval as:
pxc ± Z(1−α/2) × [ pxc (1 − pxc) / n ]^(1/2)
Where:
n = The total # of observations;
u = The total # of observations that exceed xC;
xC = The excursion limit that you set; and
pxc = u/n = The proportion of the population exceeding xC.
This equation gives an approximate two-sided 100(1−α)% confidence interval for pxc.
The confidence levels that Sanitas uses for this test are 95% and 99%. In Sanitas,
pxc and the upper and lower confidence limits for 95% and 99% are calculated for the
overall data set and for each season.
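For n > 20 the interval is a direct formula. This sketch uses the usual two-sided Z values (1.96 for 95%, 2.576 for 99% confidence) and hypothetical counts; the function name is ours.

```python
import math

def proportion_interval(u, n, z):
    """Normal-approximation confidence interval for the exceedance proportion."""
    pxc = u / n                                   # proportion exceeding xC
    half_width = z * math.sqrt(pxc * (1 - pxc) / n)
    return pxc - half_width, pxc + half_width

# Hypothetical record: 40 observations, 6 above the excursion limit.
lo95, hi95 = proportion_interval(6, 40, 1.96)     # approx. (0.039, 0.261)
lo99, hi99 = proportion_interval(6, 40, 2.576)    # wider 99% interval
```

As expected, the 99% interval contains the 95% interval, since the only change is the larger Z multiplier.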
APPENDIX I: GLOSSARY OF SELECTED STATISTICAL
TERMS
2-tailed Mode - The option used when there is a concern that compliance values can be
both too low as well as too high relative to background values.
95% Confidence Interval - Each time a test is performed, there is a 5% chance that it
will result in a false positive conclusion.
95% Coverage - 95% of the population is intended to be contained within the tolerance
interval.
99% Confidence Level - Each time a test is performed, there is a 1% chance that it will
result in a false positive conclusion.
Alpha Level - The false positive rate, or fraction of the results that will show an
exceedance when in fact none exists.
Analysis of Variance (ANOVA) - An interwell analysis that compares either well means
or average ranks among wells.
Box and Whiskers Plots - A concentration plot depicting the mean, median, minimum,
maximum, and 25th and 75th percentiles of a data set.
California Non-statistical Analysis of VOCs - An interwell analysis for a suite of
VOCs when nondetects comprise 90% or more of the background data.
Central Tendency - A statistical indicator of the average or middle value of a data set.
Confidence Interval (CI) - A concentration range that is designed to contain the mean
concentration level with a designated level of confidence (e.g., 99%).
Lower Confidence Limit (LCL) - Lower limit to a confidence interval.
Log Transformation – In Sanitas, as is typical in the Guidance documents referenced
below, the term log transformation is synonymous with natural log transformation.
Mann-Kendall Statistical Evaluation - A nonparametric statistical analysis of the
increase or decrease in concentration levels over time; calculation of a significance level
for the relationship between concentration levels and time.
Non-normal Data - The distribution of the population of data from which the sample has
been drawn is unknown; therefore no assumptions about or estimations of the population
parameters (e.g., mean) can be made.
Normally Distributed Data - Data (constituent concentration values) follow a normal
(Gaussian) or bell-shaped curve; the majority of values (95%) are within two standard
deviations from the mean of the concentration values.
Outlier - An observation that is at least an order of magnitude different from the rest of
the group of observations.
Power - The power of a statistical test is the probability that the test will reject a false
null hypothesis, or in other words that it will not make a Type II error. The higher the
power, the greater the chance of obtaining a statistically significant result when the null
hypothesis is false.
Precision - The extent to which a given set of sample measurements of the same
population of values agree with a measure of their central tendency.
Prediction Limit Analysis - An interwell or intrawell analysis that compares one or
more future observations to a limit set by background data.
Proportion Estimate - Computes the proportion of observations in the record exceeding
a stated excursion limit and a confidence limit for this proportion.
Poisson Distributed Data - Data (constituent concentration values) follow a model of
rare events, where the probability of detection is low but stays constant from sampling
period to sampling period (U.S. EPA, 1992).
Sen’s Slope Trend Analysis - A nonparametric statistical analysis of the increase or
decrease in concentration levels over time; calculation of the slope of the linear
relationship of concentration level and time.
Site-Wide False Positive Rate - The probability that at least one parameter for at least
one well will result in a statistically significant finding for each sampling event at a
facility.
Skewness - A measure of the degree of asymmetry of a data distribution.
Testwise Alpha – The overall alpha level for a given test.
Time Series Plot - A graphic plot of time (e.g., days, months, years) versus
concentration levels.
Tolerance Interval (TI) - A concentration range that is constructed to contain a specified
proportion (e.g., 95%) of the population of observations with a specified confidence (i.e.,
confidence level).
Tolerance Limit - An interwell or intrawell analysis that compares compliance
observations to a limit set by background data that is constructed to contain a specified
proportion (e.g., 95%) of the population of observations.
Transformed-normally Distributed Data - The raw data are not normally distributed;
however the natural logarithms (or some other transformation in the Ladder of Powers
[Helsel & Hirsch]) of the data are normally distributed and parametric procedures may be
used.
Upper Confidence Limit (UCL) - Upper limit to a confidence interval.
Variability - A measure of divergence from the mean of a data set.
BIBLIOGRAPHY
ASTM, December 1998. Standard Guide for Developing Appropriate Statistical Approaches
for Ground-Water Detection Monitoring Programs. American Society For Testing and
Materials, West Conshohocken, PA.
Cameron, Kirk, September, 2004. DRAFT Unified Guidance.*
Cohen, A.C., Jr., 1959. Simplified Estimators for the Normal Distribution When Samples Are
Singly Censored or Truncated, Technometrics, 1: 217-237.
Davis, C. B. and McNichols, R. J., 1994. Ground Water Monitoring Statistics Update: Part II:
Nonparametric Prediction Limits, Ground Water Monitoring Review, Fall: 159.
Eisenhart, C., Hastay, M.W., and Wallis, W.A., 1947. Techniques of Statistical Analysis.
McGraw-Hill Book Company, Inc.
Gibbons, R.D., 1991. Some Additional Prediction Limits for Groundwater Detection
Monitoring at Waste Disposal Facilities, Groundwater, 29:5.
Gilbert, R.O., 1987. Statistical Methods for Environmental Pollution Monitoring. Van
Nostrand Reinhold.
Helsel, D.R. and Hirsch, R.M., 1992. Statistical Methods in Water Resources. Elsevier.
Hollander, M. and Wolfe, D.A., 1973. Nonparametric Statistical Methods. John Wiley &
Sons.
Sen, P.K., 1968. Estimates of the Regression Coefficient based on Kendall’s Tau, Journal of
the American Statistical Association, 63 : 1379-1389.
U.S. EPA, April 1989. Statistical Analysis of Ground-Water Monitoring Data at RCRA
Facilities, Interim Final Guidance. Office of Solid Waste Management Division, U.S.
Environmental Protection Agency, Washington, DC.
U.S. EPA, July 1992. Statistical Analysis of Ground-Water Monitoring Data at RCRA
Facilities, Addendum to Interim Final Guidance. Office of Solid Waste Management
Division, U.S. Environmental Protection Agency, Washington, DC.
Wilk, M.B., and Shapiro, S.S., 1968. Technometrics, 10(4): 825-839.
Willits, N., 1994. Personal Communication between Henry R. Horsey and Neil Willits,
statistical consultant to the California State Water Resources Control Board, Use of
nonparametric prediction limits including retests.
Zar, Jerrold H., 1996. Biostatistical Analysis, 3rd edition (p. 112). Prentice Hall.
*As of this writing, the Unified Guidance was undergoing peer review, and any changes
made after September 2004 may not be reflected in this version. Please contact the
USEPA for the current status of this document, and/or consult a professional statistician.
INDEX
A
Aitchison’s Adjustment.................. 35, 36
Alert Levels......................................... 59
Alternate Value ................................... 85
Analysis of Variance ......................... 42
ANOVA .... 41, 42, 43, 45, 46, 48, 49, 50
Arizona............................................... 59
ASTM ............................. 38, 74, 81, 106
Auto-Checking for Outliers................. 13
B
Bonferroni t-statistic ........................... 47
Box and Whiskers Plot ....................... 6
C
California .......................... 49, 65, 71, 73
California standards ............................ 38
Censored Data .................................... 32
Chi-Squared ........................................ 25
Coefficient of-Variation ...................... 23
Cohen’s Adjustment ...................... 33, 34
Compliance or Corrective Action.... 96
Composite VOC ................................ 72
Confidence Intervals ......................... 96
Control Chart .................................... 20
Control Chart Procedure.............. 36, 75
D
Data/Transformed Original Values into Alt Values ............................. 85
Davis McNichols ................................. 65
Deseasonalizing .................................. 31
Detection Monitoring........................ 20
Dixon's OutLier Test ........................ 15
DMT-NP.............................................. 65
E
EPA................................................... 106
EPA 1989 Outlier Test...................... 13
Equality of Variance Test.............. 43, 45
Evaluation Monitoring ..................... 89
H
Histogram ............................................ 7
K
Kruskal-Wallis test............ 30, 39, 51, 75
Kurtosis ........................................... 7, 10
L
Ladder of Powers .......................... 62, 69
Levene’s test ....................................... 43
Log (vs. ln) ....................................... 104
M
Mann-Kendall ......................... 75, 89, 90
Mann-Whitney ............................ 39, 40
Multiple Group Shapiro-Wilk ... 41, 81
N
Nonparametric ............................ 65, 106
Nonparametric ANOVA ..................... 50
Non-Statistical Analysis...................... 71
Normality Report ................................ 19
O
Outlier ................................................ 12
P
Parametric ............................... 42, 43, 85
Parametric ANOVA.................... 43, 46
Piper Diagram ..................................... 20
Poisson .................. 64, 65, 72, 74, 77, 84
Prediction Limit ............................ 62, 83
Prediction Limits
EPA ................................................. 61
UG Standards .................................. 67
Probability Plot ................................. 11
Proportion Estimate........................ 102
R
Rank Sum .......................................... 39
Rank Von Neumann ......................... 17
ROSNER's OutLier Test .................. 16
S
Seasonal Kendall Test......................... 93
Seasonality ........................ 28, 29, 39, 75
Seasonality Adjustment ....................... 30
Seasonality Plot ................................. 12
Sen’s Slope Estimator ....................... 91
Shapiro-Francia.................................. 25
Shapiro-Wilk ....................................... 21
Shapiro-Wilk, Multiple Group .. 41, 81
Shewhart-CUSUM ....... 20, 34, 36, 38, 74, 75
Skewness............................................. 10
standard deviation ................................. 7
Statistical Outlier ................................ 12
Stiff Diagram ...................................... 19
T
Time Series .................................... 5, 30
Tolerance Intervals............................ 100
Tolerance Limit ................................... 52
Tolerance Limits ............................... 51
Trend Analysis .................................. 89
Two-tailed ..................................... 40, 91
U
Unified Guidance .......................... 38, 68
V
Verification Retest Procedure.......... 73
W
Welch's t-test ..................................... 41
Wilcoxon Rank Sum ......................... 39
W-statistic ........................................... 21