Mann KendallTest (Reparado)


  • 7/22/2019 Mann KendallTest (Reparado)


    Kendall, M. G. (1970). Rank Correlation Methods, 4th ed. London: Griffin.

    Gilbert, Richard O. (1987). Statistical Methods for Environmental Pollution Monitoring.
    http://www.swrcb.ca.gov/water_issues/programs/tmdl/docs/303d_policydocs/205.pdf
    (Good intro, but lacks a look-up table; the PDF image cuts off the last sentence on each page.)

    Hollander, Myles, and Douglas A. Wolfe (1999). Nonparametric Statistical Methods, 2nd Edition. Wiley-Interscience. ISBN-10: 0471190454. ISBN-13: 978-0471190455.

    A User-Written SAS Program for Estimating Temporal Trends and Their Magnitude.
    http://www.sjrwmd.com/technicalreports/pdfs/TP/SJ2004-4.pdf

    Helsel, D.R., and R.M. Hirsch. Statistical Methods in Water Resources. Techniques of Water-Resources Investigations of the United States Geological Survey, Book 4, Hydrologic Analysis and Interpretation, Chapter A3.
    http://pubs.usgs.gov/twri/twri4a3/pdf/twri4a3-new.pdf

    Detecting Trends of Annual Values of Atmospheric Pollutants by the Mann-Kendall Test and Sen's Slope Estimates - The Excel Template Application MAKESENS.
    http://www.fmi.fi/kuvat/MAKESENS_MANUAL.pdf

    Johnson, Douglas H. (1995). Statistical Sirens: The Allure of Nonparametrics. Ecology 76(6), pp. 1998-2000.
    http://www.jstor.org/pss/1940733

    Why Kendall tau?
    http://rsscse.org.uk/ts/bts/noether/text.html

    Kendall's Tau and Spearman's Rho.
    http://www.statisticssolutions.com/methods-chapter/statistical-tests/kendall-spearman-rank-correlation-coefficient/

    Non-parametric Measures of Bivariate Relationships.
    http://www.unesco.org/webworld/idams/advguide/Chapt4_2.htm

    Kendall's rank correlation.
    http://www.statsdirect.com/help/nonparametric_methods/kend.htm
    (clearer description of how to handle ties)

    Powerpoint on nonparametric time series.
    http://www.webs.uidaho.edu/envs541/Module_08/8_2.pdf

    This Excel file has been designed to calculate a Mann-Kendall trend statistic for ten data points (i.e., ten years).

    Instructions

    Enter your data values into the green-highlighted cells C5:C14 of the sheet labeled "MannKendall".

    Change the slide title (B1), Y-axis title (C4) and the year labels (if necessary).

    If you have fewer than ten years of data, you must also:

    Enter the number of time periods (e.g., years) into cell C18.

    Clear the contents of any irrelevant cells from D26 to L34.

    That's it. The worksheet will calculate the Mann-Kendall S statistic (FYI, some authors refer to it as the K statistic).

    Here is what the worksheet is doing:

    For every pair of values in the n*n matrix, subtract the value in yearK from the value in yearJ in all n(n-1)/2 cases/cells where yearJ > yearK.

    (Subtract the value on the left from the value on the top for all cells above the diagonal: top value minus left value for each cell.

    Above the diagonal will be values for which the column value is from a later year than the row value, or yearJ > yearK.)

    Count the number of the n(n-1)/2 cells that yielded a positive value (result > 0) and put the count value in the first column to the right ("# +").

    Count the number of the n(n-1)/2 cells that yielded a negative value (result < 0) and put the count value in the column to the right labeled "# -".

    Count the number of the n(n-1)/2 cells that yielded a zero value (tied values) and put the count value in the row at the bottom.

    Sum all the pluses and all the minuses and subtract the total of minuses from the total of pluses.

    S = number of cells with positive values minus the number of cells with negative values.

    The sign of S indicates the slope of the trend (positive=upward, negative=downward).

    If n>=10, then calculate the variance and use the formula for the normal approximation of the probability of S.

    There are two formulae, one if there are no tied values and another if there are tied values.

    If n<10, use the lookup table; n>=5 is required to reach p < .05.

    Evaluation (values from the "MannKendall" sheet, n=11)

    # Positive diffs    10.00
    # Negative diffs    45.00
    S                  -35.00
    Variance(S) *      165      =n(n-1)(2n+5)/18 (This formula may be conservative in the presence of tied values.)
    ZS **               -2.65
    Zcrit,.05            1.96   1.96 (positive or negative) is the critical value for Z, two-tailed, at p < .05. The one-tailed value is 1.65.
    Interpretation     Sig. Decreasing

    /~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~/

    * Note: This variance formula assumes there are no tied values (i.e., no differences=0). Tied values may reduce the validity of the normal approximation when the number of data values is close to 10.

    If there are tied values, then the following formula with the correction factor for the tied values should be used:

    Var(S) = [n(n-1)(2n+5) - SUM(p=1..q) tp(tp-1)(2tp+5)] / 18

    where q is the number of tied groups and tp is the number of data values in the pth group.

    I'm not ENTIRELY sure, but I think in this infant mortality example there is actually ONE tied "group." It looks like Gilbert defines a "group" as a group of years that share the same value. I think (according to how I am interpreting Gilbert's example) that our 6.1 value constitutes one "group."

    The value 6.1 occurs in three different years.

    Because we only had 10 years to begin with, and three are tied, we should use the lookup table to gauge the significance of S.

    According to the table, |S| must be at least 30 for significance at the p < .05 level, and 30 is what we have.

    But I will also recalculate the variance with the correction factor.

    So, based on Gilbert's example (see last sheet in this file), q=1 and t1=3.

    Var(S) correction term for tied values: 3(3-1)((2*3)+5) = 66

    Our variance after correction: (10*9*25 - 66)/18 = 2184/18 = 121.33 (versus 125 without the correction)

    Evaluation of Tied Data (ten-year sheet, 2000-2009)

    # Positive diffs     6.00
    # Negative diffs    36.00
    S                  -30.00
    Variance(S)        121.33   using correction factor for tied data
    ZS **               -2.63
    Zcrit,.05            1.96   1.96 (positive or negative) is the critical value for Z, two-tailed, at p < .05. The one-tailed value is 1.65.
    Interpretation     Sig. Decreasing

    In any case, the correction factor decreases the variance, which increases the magnitude of the Z-score, and the likelihood of significance.

    So if there are tied values, and we do not use the variance formula with the correction factor, our test is conservative.

    /~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~/

    **The direction of Z indicates the direction of the trend. A positive (negative) value of Z indicates an upward (downward) trend.

    Formula for ZS:

    if S > 0 then Z = (S-1)/SQRT(Var(S))

    if S = 0 then Z = 0

    if S < 0 then Z = (S+1)/SQRT(Var(S))

    Some sources said the calculation for the normal approximation of the probability of S should only be used if n>=10, but others said only when n>=40. But there was some ambiguity about the definition of n (#years versus #values in the matrix). If nyears=10, then the number of values inside the matrix is n(n-1)/2, or 10*9/2 = 45.

    So I'm thinking nyears>=10 is okay.

    /~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~/

    Lookup Table for Significance of S:

    n     critical value of S for alpha=.05
    5      11
    10     30
    15     40
    20     62
    25     85
    30    111
    35    139
    40    169

    /~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~/

    Information on the Power of the Mann-Kendall S Test:

    For Mann-Kendall S to yield a significance level of:

    p < .10 it requires 4 or more data points (e.g., years).

    p < .05 it requires 5 or more data points (e.g., years).

    p < .01 it requires 6 or more data points (e.g., years).

    p < .001 it requires 7 or more data points (e.g., years).

    [Chart: Critical Values of Mann-Kendall S Statistic for alpha=.05 and Varying Values of N. X-axis: Number of Years (data points), 5 to 40. Y-axis: Critical Value of S, 0 to 180.]
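The counting steps described on the instruction sheet can be sketched in a few lines of Python. This is only an illustration of the pairwise-difference procedure, not the workbook's own formulas; the function name `mann_kendall_counts` is hypothetical, and the data are the ten-year series (2000-2009) from the tied-data sheet in this file.

```python
from itertools import combinations

def mann_kendall_counts(values):
    """Count positive, negative and tied pairs (later value versus earlier value)."""
    pos = neg = ties = 0
    # combinations() keeps the original order, so each pair is (earlier, later).
    for earlier, later in combinations(values, 2):
        if later > earlier:
            pos += 1
        elif later < earlier:
            neg += 1
        else:
            ties += 1
    return pos, neg, ties

# Ten-year infant mortality series from the workbook (2000-2009).
rates = [6.6, 6.4, 6.1, 5.4, 6.3, 6.1, 5.7, 6.1, 5.1, 5.0]
pos, neg, ties = mann_kendall_counts(rates)
s_stat = pos - neg  # S = (# positive differences) - (# negative differences)
print(pos, neg, ties, s_stat)  # 6 36 3 -30
```

The three counts always add up to n(n-1)/2 (45 for ten years), which is a quick check that no cell of the matrix was missed.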


    Graph Title: Infant Mortality. New Mexico, 1999-2009

    y-axis title: Deaths per 1000 Live Births

    Year   Deaths per 1000 Live Births
    1999   6.62
    2000   6.42
    2001   6.06
    2002   5.40
    2003   6.28
    2004   6.07
    2005   5.68
    2006   6.14
    2007   5.03
    2008   4.74
    2009   5.00

    n=number of time periods

    n= 11

    Subtract each earlier year from each later year

    year J:     2000   2001   2002   2003   2004   2005   2006   2007   2008   2009
    year K:      6.4    6.1    5.4    6.3    6.1    5.7    6.1    5.0    4.7    5.0    # +    # -
    1999  6.6  -0.20  -0.55  -1.22  -0.34  -0.54  -0.93  -0.47  -1.58  -1.88  -1.62   0.00  10.00
    2000  6.4         -0.36  -1.02  -0.14  -0.35  -0.74  -0.28  -1.39  -1.68  -1.42   0.00   9.00
    2001  6.1                -0.67   0.21   0.01  -0.38   0.08  -1.03  -1.33  -1.06   3.00   5.00
    2002  5.4                        0.88   0.68   0.29   0.75  -0.36  -0.66  -0.40   4.00   3.00
    2003  6.3                              -0.21  -0.60  -0.13  -1.25  -1.54  -1.28   0.00   6.00
    2004  6.1                                     -0.39   0.07  -1.04  -1.33  -1.07   1.00   4.00
    2005  5.7                                             0.46  -0.65  -0.94  -0.68   1.00   3.00
    2006  6.1                                                   -1.11  -1.40  -1.14   0.00   3.00
    2007  5.0                                                          -0.29  -0.03   0.00   2.00
    2008  4.7                                                                  0.26   1.00   0.00
    2009  5.0

    # ties (diff=0) by column, followed by the totals of # + and # -:
                   0      0      0      0      0      0      0      0      0      0  10.00  45.00

    S= 10.00 minus 45.00

    S= -35.00

    If n>=10, then use the variance calculation to estimate probability. If n<10, use the lookup table; n>=5 is required to reach p < .05.

    Evaluation (Normal Approximation, N>=10)

    Variance(S)      165    =n(n-1)(2n+5)/18 (This formula may be conservative in the presence of tied values.)
    ZS               -2.65
    Zcrit,.05        1.96   (two-tailed. For one-tailed test use 1.65)
    Interpretation   Sig. Decreasing

    Evaluation (Lookup Table for Fewer Than 10 Years)

    If |S|>=S-crit, then reject H0

    S-crit (p < .05)

    [Chart: Infant Mortality. New Mexico, 1999-2009. X-axis: years 1999 to 2009. Source: NM Death Certificate and Birth Certificate Data. NMDOH Bureau of Vital Records and Statistics.]
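The sheet compares ZS against the critical value 1.96. As a hedged aside, the same Z can be turned into a two-sided p-value with the standard normal distribution; the sketch below uses only Python's `math.erfc`, and the helper name `two_sided_p` is hypothetical, not something the workbook defines.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard normal Z score: 2 * (1 - Phi(|z|))."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Values from the 1999-2009 sheet: S = -35 and Var(S) = 165, so Z = (S + 1) / sqrt(Var(S)).
z = (-35 + 1) / math.sqrt(165)
p = two_sided_p(z)
print(round(z, 2), p < 0.05)  # -2.65 True
```

Checking `p < 0.05` is equivalent to checking |Z| > 1.96, so this gives the same "Sig. Decreasing" verdict as the sheet.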


    Graph Title: New Mexico Infant Mortality Rate from 1999-2009

    y-axis title: Inf Deaths per 1000 Live Births

    Year   Inf Deaths per 1000 Live Births
    2000   6.6
    2001   6.4
    2002   6.1
    2003   5.4
    2004   6.3
    2005   6.1
    2006   5.7
    2007   6.1
    2008   5.1
    2009   5.0

    n=number of time periods

    n= 10

    Subtract each earlier year from each later year

    year J:     2001   2002   2003   2004   2005   2006   2007   2008   2009
    year K:      6.4    6.1    5.4    6.3    6.1    5.7    6.1    5.1    5.0    # +    # -
    2000  6.6  -0.20  -0.50  -1.20  -0.30  -0.50  -0.90  -0.50  -1.50  -1.60   0.00   9.00
    2001  6.4         -0.30  -1.00  -0.10  -0.30  -0.70  -0.30  -1.30  -1.40   0.00   8.00
    2002  6.1                -0.70   0.20   0.00  -0.40   0.00  -1.00  -1.10   1.00   4.00
    2003  5.4                        0.90   0.70   0.30   0.70  -0.30  -0.40   4.00   2.00
    2004  6.3                              -0.20  -0.60  -0.20  -1.20  -1.30   0.00   5.00
    2005  6.1                                     -0.40   0.00  -1.00  -1.10   0.00   3.00
    2006  5.7                                             0.40  -0.60  -0.70   1.00   2.00
    2007  6.1                                                   -1.00  -1.10   0.00   2.00
    2008  5.1                                                          -0.10   0.00   1.00
    2009  5.0

    # ties (diff=0) by column, followed by the totals of # + and # -:
                   0      0      0      0      1      0      2      0      0   6.00  36.00

    S= 6.00 minus 36.00

    S= -30.00

    If n>=10, then use the variance calculation to estimate probability.

    If n<10, use the lookup table; n>=5 is required to reach p < .05.

    Evaluation (Normal Approximation, N>=10)

    Variance(S)      125    =n(n-1)(2n+5)/18 (This formula may be conservative in the presence of tied values.)
    ZS               -2.59
    Zcrit,.05        1.96
    Interpretation   Sig. Decreasing

    Evaluation (Lookup Table for Fewer Than 10 Years)

    If |S|>=S-crit, then reject H0

    S-crit (p < .05)

    [Chart: New Mexico Infant Mortality Rate from 1999-2009. X-axis: Year, 2000 to 2009. Y-axis: Inf Deaths per 1000 Live Births, 0 to 7.]
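The sheet above reports Var(S) = 125 and ZS = -2.59 using the no-ties formula, although the value 6.1 occurs three times. A minimal sketch of the tie-corrected variance (Gilbert's formula, in which each tied group of size t contributes t(t-1)(2t+5) to the subtracted sum) and of the continuity-corrected Z follows. The function names `mk_variance` and `mk_z` are hypothetical, and the single tied group of size 3 is the workbook author's reading of this example.

```python
import math

def mk_variance(n, tie_group_sizes=()):
    """Var(S) = [n(n-1)(2n+5) - sum of t(t-1)(2t+5) over tied groups] / 18."""
    total = n * (n - 1) * (2 * n + 5)
    correction = sum(t * (t - 1) * (2 * t + 5) for t in tie_group_sizes)
    return (total - correction) / 18.0

def mk_z(s, variance):
    """Normal-approximation Z with the +/-1 continuity correction."""
    if s > 0:
        return (s - 1) / math.sqrt(variance)
    if s < 0:
        return (s + 1) / math.sqrt(variance)
    return 0.0

var_plain = mk_variance(10)                      # no tie correction
var_tied = mk_variance(10, tie_group_sizes=[3])  # one group: 6.1 occurs three times
print(round(var_plain, 2), round(mk_z(-30, var_plain), 2))  # 125.0 -2.59
print(round(var_tied, 2), round(mk_z(-30, var_tied), 2))    # 121.33 -2.63
```

For one tied group of three values the subtracted term is 3(3-1)(2*3+5) = 66 before dividing by 18, so the variance shrinks only slightly and |Z| grows from 2.59 to 2.63.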


    Table A30, Upper-tail Probabilities for the Null Distribution of the Kendall K Statistic.

    For N>10 use the approximation given in section 8.12 (of Hollander and Wolfe)

    One-sided p = Prob [S >= x] = Prob [S <= -x]

    N = Number of time periods

    Legend:
    Red values: These do not appear on the table in the textbook because they are impossible values, but they ARE possible if there are tied cells. We still need to figure out how to handle ties.
    Highlighted values: Significant at p < .05

       x      N=3      N=4      N=5      N=6      N=7      N=8      N=9     N=10
       1      0.5                        0.5      0.5                        0.5
       2   0.3335    0.375    0.408     0.43    0.443    0.452     0.46   0.4655
       3    0.167    0.271    0.325     0.36    0.386    0.406   0.4205    0.431
       4             0.167    0.242   0.2975   0.3335     0.36    0.381   0.3975
       5            0.1045   0.1795    0.235    0.281    0.317   0.3435    0.364
       6             0.042    0.117   0.1855    0.236    0.274    0.306    0.332
       7                     0.0795    0.136    0.191   0.2365    0.272      0.3
       8                      0.042    0.102    0.155    0.199    0.238    0.271
       9                    0.02515    0.068    0.119   0.1685   0.2085    0.242
      10                     0.0083    0.048   0.0935    0.138    0.179    0.216
      11                               0.028    0.068   0.1135   0.1545     0.19
      12                             0.01815   0.0515    0.089     0.13    0.168
      13                              0.0083    0.035   0.0715     0.11    0.146
      14                             0.00485    0.025    0.054     0.09    0.127
      15                              0.0014    0.015   0.0425    0.075    0.108
      16                                       0.0102    0.031     0.06    0.093
      17                                       0.0054   0.0233    0.049    0.078
      18                                       0.0034   0.0156    0.038    0.066
      19                                       0.0014  0.01135     0.03    0.054
      20                                       0.0008   0.0071    0.022    0.045
      21                                       0.0002  0.00495   0.0172    0.036
      22                                                0.0028   0.0124   0.0295
      23                                               0.00185  0.00935    0.023
      24                                                0.0009   0.0063  0.01865
      25                                               0.00055   0.0046   0.0143
      26                                                0.0002   0.0029   0.0113
      27                                                0.0002  0.00205   0.0083
      28

    The table was adapted from D. Helsel and R. M. Hirsch, Statistical Methods in Water Resources.

    Helsel and Hirsch cited Table A30 in Myles Hollander and Douglas A. Wolfe (1999).

    Original Table from Helsel & Hirsch:

    Table B8 -- Quantiles (p-values) for Kendall's S statistic and tau correlation coefficient

    For N>10 use the approximation given in section 8.2.2

    One-sided p = Prob [S >= x] = Prob [S <= -x]

    N = Number of time periods

       x      N=4      N=5      N=8      N=9
       0    0.625    0.592    0.548     0.54
       2    0.375    0.408    0.452     0.46
       4    0.167    0.242     0.36    0.381
       6    0.042    0.117    0.274    0.306
       8             0.042    0.199    0.238
      10            0.0083    0.138    0.179
      12                      0.089     0.13
      14                      0.054     0.09
      16                      0.031     0.06
      18                     0.0156    0.038
      20                     0.0071    0.022
      22                     0.0028   0.0124
      24                     0.0009   0.0063
      26                     0.0002   0.0029
      28

       x      N=3      N=6      N=7     N=10
       1      0.5      0.5      0.5      0.5
       3    0.167     0.36    0.386     0.43
       5             0.235    0.281     0.36
       7             0.136    0.191      0.3
       9             0.068    0.119     0.24
      11             0.028    0.068     0.19
      13            0.0083    0.035     0.15
      15            0.0014    0.015     0.11
      17                     0.0054     0.08
      19                     0.0014     0.05
      21                     0.0002     0.04
      23                                0.02
      25                                0.01
      27                                0.01

    S-crit lookup columns: one is from Helsel & Hirsch; the other includes interpolated values (red text).

    If |S|>=S-crit, then reject H0

    [Chart legend: series N=3 through N=10, with reference lines at p=0.05 and p=0.025.]
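The non-interpolated entries in these tables can be spot-checked by brute force. The sketch below assumes no ties and treats every ordering of the n values as equally likely under the no-trend null; it enumerates all permutations, so it is only practical for small n. The function name `upper_tail_prob` is hypothetical.

```python
from itertools import permutations

def upper_tail_prob(n, x):
    """Exact Prob[S >= x] under the no-trend null for n untied values."""
    hits = total = 0
    for perm in permutations(range(n)):
        s = 0
        # S = (# later > earlier pairs) - (# later < earlier pairs)
        for i in range(n - 1):
            for j in range(i + 1, n):
                s += 1 if perm[j] > perm[i] else -1
        total += 1
        if s >= x:
            hits += 1
    return hits / total

print(round(upper_tail_prob(4, 6), 3))   # 0.042
print(round(upper_tail_prob(5, 10), 4))  # 0.0083
```

Because S has the same parity as n(n-1)/2 when there are no ties, only even x (N=4, 5, 8, 9) or odd x (N=3, 6, 7, 10) can occur, which is why the original table is split into two halves.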


    Gilbert, 1987, on Tied Values in Mann-Kendall Test

    Data values in time order: 23, 24, 0.1, 6, 0.1, 24, 24, 0.1, 23

                  24    0.1      6    0.1     24     24    0.1     23    # +    # -
      23           1  -22.9    -17  -22.9      1      1  -22.9      0   3.00   4.00
      24              -23.9    -18  -23.9      0      0  -23.9     -1   0.00   5.00
     0.1                       5.9      0   23.9   23.9      0   22.9   4.00   0.00
       6                             -5.9     18     18   -5.9     17   3.00   2.00
     0.1                                    23.9   23.9      0   22.9   3.00   0.00
      24                                              0  -23.9     -1   0.00   2.00
      24                                                 -23.9     -1   0.00   2.00
     0.1                                                         22.9   1.00   0.00
      23

    # ties (diff=0) by column, followed by the totals of # + and # -:
                   0      0      0      1      1      2      2      1  14.00  15.00

    S= 14.00 minus 15.00

    S= -1.00

    This is from Gilbert (1987). Gilbert says:

    the number of tied groups=3 (!?)*

    t1=2 for the tied value 23

    t2=3 for the tied value 24

    t3=3 for the tied value .1

    *Does he mean the number of different/unique values with a tie?

    There are five columns, above, with tied values. But the number of unique values that happen to have ties in the matrix = 3. How on Earth am I supposed to ask SAS to do that!

    Sorted values:

    0.1, 0.1, 0.1    The value 0.1 occurs three times
    6
    23, 23           The value 23 occurs twice
    24, 24, 24       The value 24 occurs three times
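One way to read Gilbert's "tied groups" is simply: each distinct data value that occurs more than once is one group, and its size is how many times it occurs. Under that reading (an interpretation, matching the t1=2, t2=3, t3=3 quoted above), the groups can be found by counting the data values directly rather than inspecting the difference matrix. The sketch below is in Python rather than SAS, and `tie_group_sizes` is a hypothetical helper name.

```python
from collections import Counter

def tie_group_sizes(values):
    """Sizes of tied groups: one entry per distinct value that occurs more than once."""
    return sorted(count for count in Counter(values).values() if count > 1)

# Gilbert's example as laid out in this sheet, in time order.
gilbert = [23, 24, 0.1, 6, 0.1, 24, 24, 0.1, 23]
sizes = tie_group_sizes(gilbert)
n = len(gilbert)
# Tie correction term: sum of t(t-1)(2t+5) over the tied groups.
correction = sum(t * (t - 1) * (2 * t + 5) for t in sizes)
variance = (n * (n - 1) * (2 * n + 5) - correction) / 18
print(sizes, correction, round(variance, 2))  # [2, 3, 3] 150 83.67
```

The same counting applied to the ten-year infant mortality series gives a single group of size 3 (the value 6.1), which agrees with q=1 and t1=3.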