Anova Biometry


  • 8/9/2019 Anova Biometry


    One-way ANOVA

    • Motivating Example
    • Analysis of Variance
    • Model & Assumptions
    • Data Estimates of the Model
    • Analysis of Variance
    • Multiple Comparisons
    • Checking Assumptions
    • One-way ANOVA Transformations


    Motivating Example: Treating Anorexia Nervosa


    Analysis of Variance

    • Analysis of Variance is a widely used statistical technique that partitions the total variability in our data into components of variability that are used to test hypotheses.

    • In One-way ANOVA, we wish to test the hypothesis:
      $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$
      against:
      $H_a$: Not all population means are the same


    Model & Assumptions

    • The model for the observed response is given by:

      $$x_{ij} = \mu + \alpha_i + \varepsilon_{ij}$$

    • We assume that the errors are normally distributed with constant variance.

    • This implies that the populations being sampled are also normally distributed with equal variances.


    Analysis of Variance

    • In ANOVA, we compare the between-group variation with the within-group variation to assess whether there is a difference in the population means.

    • Thus by comparing these two measures of variance (spread) with one another, we are able to detect if there are true differences among the underlying group population means.


    Analysis of Variance

    • If the variation between the sample means is large, relative to the variation within the samples, then we would be likely to detect significant differences among the sample means.


    Between Group Variation is Large Compared to Within Group Variation

    Here we would almost certainly reject the null hypothesis.


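    Analysis of Variance

    [Figure: population distributions with means $\mu_1$, $\mu_2$, $\mu_3$ around the overall mean $\mu$. Marked on the figure are the treatment effects $\alpha_1$, $\alpha_2 = \mu_2 - \mu$ and $\alpha_3$, and an error term $\varepsilon_{3j} = y_{3j} - \mu_3$.]

    Between-group variation is large compared to the within-group variation.

    If we sampled from these populations, we would expect to reject $H_0$.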


    Analysis of Variance

    • If the variation between the sample means is small, relative to the variation within the samples, then there would be considerable overlap of observations in the different samples, and we would be unlikely to detect any differences among the population means.


    Between Group Variation is Small Compared to Within Group Variation

    Here we would fail to reject the null hypothesis.


    Analysis of Variance

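    [Figure: population distributions sharing a common mean, $\mu = \mu_1 = \mu_2 = \mu_3$, with an error term $\varepsilon_{2j} = y_{2j} - \mu_2$ marked.]

    All $\alpha_i = 0$.

    If we sampled from these populations, we would not expect to reject $H_0$.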


    Analysis of Variance

    • If we consider all of the data together, regardless of which sample the observation belongs to, we can measure the overall total variability in the data by:

      $$\sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_{\bullet\bullet})^2$$

    • This is the Total Sum of Squares ($SS_{Total}$).

    • If we divide this sum of squares by its degrees of freedom ($N - 1$), we will have a measure of variance.


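    Analysis of Variance

    • Now, the deviation of every observation from the overall (grand) mean can be partitioned as:

      $$(x_{ij} - \bar{x}_{\bullet\bullet}) = (\bar{x}_{i\bullet} - \bar{x}_{\bullet\bullet}) + (x_{ij} - \bar{x}_{i\bullet})$$

    • Squaring and summing across all observations, we get:

      $$\sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_{\bullet\bullet})^2 = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (\bar{x}_{i\bullet} - \bar{x}_{\bullet\bullet})^2 + \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_{i\bullet})^2$$

    • The first term on the right measures variation due to the fact that different treatments are used.

    • The second term measures error variation, the variation in response when the same treatment is applied.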


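    Analysis of Variance

    • Now, the deviation of every observation from the overall (grand) mean can be partitioned as:

      $$(x_{ij} - \bar{x}_{\bullet\bullet}) = (\bar{x}_{i\bullet} - \bar{x}_{\bullet\bullet}) + (x_{ij} - \bar{x}_{i\bullet})$$

    • Squaring and summing across all observations, we get:

      $$\sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_{\bullet\bullet})^2 = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (\bar{x}_{i\bullet} - \bar{x}_{\bullet\bullet})^2 + \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_{i\bullet})^2$$

    • The first term on the right is the Treatment Sum of Squares ($SS_{Treat}$) or Between Group Sum of Squares.

    • The second term is the Error Sum of Squares ($SS_{Error}$) or Within Group Sum of Squares.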
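    The partition of the total sum of squares can be checked numerically. The sketch below is only an illustration: the three groups and their values are made up for the purpose, and the variable names are not taken from the slides.

```python
# Hypothetical responses for three treatment groups (illustrative values only).
groups = {
    "A": [1.0, 3.0, 5.0],
    "B": [4.0, 6.0, 8.0],
    "C": [9.0, 10.0, 14.0],
}


def mean(values):
    return sum(values) / len(values)


all_obs = [x for g in groups.values() for x in g]
grand_mean = mean(all_obs)

# SS_Total: every observation measured against the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_obs)

# SS_Treat: each group mean against the grand mean, counted once per observation.
ss_treat = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())

# SS_Error: every observation measured against its own group mean.
ss_error = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

print(ss_total, ss_treat, ss_error)
```

    For these made-up numbers the partition works out to 128 = 98 + 30, up to floating-point rounding.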


    Analysis of Variance

    • To convert Sums of Squares (SS) into comparable measures of variance, we need to divide the SS by their respective degrees of freedom.

    • This gives us mean squares (MS), which are measures of variance:

      $$MS_{Treat} = SS_{Treat} / df_{Treat} = SS_{Treat} / (k - 1)$$
      $$MS_{Error} = SS_{Error} / df_{Error} = SS_{Error} / (N - k)$$


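    Analysis of Variance

    • The expected values of the mean squares for repeated sampling are:

      $$E(MS_{Treat}) = \sigma^2 + \frac{\sum_{i=1}^{k} n_i \alpha_i^2}{k - 1}$$
      $$E(MS_{Error}) = \sigma^2$$

    • Thus $MS_{Error}$ is an estimate of $\sigma^2$, the within-group variance:

      $$\hat{\sigma}^2 = MS_{Error} \quad \text{and} \quad \hat{\sigma} = \sqrt{MS_{Error}}$$

    • If all the $\alpha_i$ are 0, both mean squares estimate $\sigma^2$, so we expect the F-ratio to be close to $\sigma^2 / \sigma^2 = 1$, while if some of the $\alpha_i$ are not 0, $E(MS_{Treat}) > E(MS_{Error})$ and we expect the F-ratio to exceed 1.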


    Analysis of Variance

    • Our test statistic is the F-ratio (or F-statistic), which compares these two mean squares:

      $$F_0 = \frac{MS_{Treat}}{MS_{Error}}$$

    Note that the greater the natural variability within the groups, the larger the effects ($\alpha_i$) will need to be (as estimated by $MS_{Treat}$) for us to detect any significant differences.
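    As a rough illustration of how the mean squares and the F-ratio follow from the sums of squares, here is a minimal sketch. The function name `one_way_anova` and the sample values are assumptions made for this example, not part of the original slides.

```python
def one_way_anova(groups):
    """Return (df_treat, df_error, ms_treat, ms_error, f0) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group and within-group sums of squares.
    ss_treat = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_error = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    # Divide each SS by its degrees of freedom to get a mean square.
    df_treat, df_error = k - 1, n_total - k
    ms_treat = ss_treat / df_treat
    ms_error = ss_error / df_error
    return df_treat, df_error, ms_treat, ms_error, ms_treat / ms_error


# Hypothetical data: three groups of three observations each.
samples = [[1.0, 3.0, 5.0], [4.0, 6.0, 8.0], [9.0, 10.0, 14.0]]
df_treat, df_error, ms_treat, ms_error, f0 = one_way_anova(samples)
print(df_treat, df_error, ms_treat, ms_error, f0)
```

    With these made-up samples the mean squares are 49 and 5 on 2 and 6 degrees of freedom, giving an F-ratio of 9.8 (up to floating-point rounding).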

    Analysis of Variance

    • Traditionally the Analysis of Variance calculations have been presented in an ANOVA Table.

    • The format of the table is:

      Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square   F-Ratio               P-value
      Treatment             k – 1                SS_Treat         MS_Treat      MS_Treat / MS_Error   Tail Area
      Error                 N – k                SS_Error         MS_Error
      Total                 N – 1                SS_Total

      The Degrees of Freedom and Sum of Squares columns add up to their totals.


    Motivating Example

    $F_0 = MS_{Treat} / MS_{Error} = 5.42$


    Analysis of Variance

    • A large F-statistic provides evidence against $H_0$, while a small F-statistic indicates that the data and $H_0$ are compatible.

    • To calculate a P-value to test $H_0$, we compare the F-statistic we obtained from our data to the distribution it would have under a true $H_0$, i.e. an F-distribution with $(k - 1)$ and $(N - k)$ degrees of freedom.

    • Note that $F_0$ is never negative and only large values count against $H_0$, so this is always a one-tailed test.


    Analysis of Variance

    For example, consider the F-distribution with 4 and 30 df.

    When $H_0$ is true, $F_0 \sim F_{df_1, df_2}$.

    Let's say our observed value for F was $F_0 = 2.5$. Then the P-value = 0.06.

    [Figure: density curve of the F-distribution, with the P-value shown as the area in the upper tail beyond the observed $F_0$.]
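    The P-value is the upper-tail area of the F-distribution beyond the observed statistic. The sketch below approximates that area with the standard library only, by integrating the F density numerically. The function names, the step count, and the illustrative inputs (4 and 30 df, an observed value of 2.5) are assumptions for this example; in practice a statistics library would normally be used instead of hand-rolled integration.

```python
import math


def f_density(x, df1, df2):
    """Density of the F-distribution with df1 and df2 degrees of freedom, for x > 0."""
    log_beta = math.lgamma(df1 / 2) + math.lgamma(df2 / 2) - math.lgamma((df1 + df2) / 2)
    log_density = (
        (df1 / 2) * math.log(df1 / df2)
        + (df1 / 2 - 1) * math.log(x)
        - ((df1 + df2) / 2) * math.log1p(df1 * x / df2)
        - log_beta
    )
    return math.exp(log_density)


def f_p_value(f0, df1, df2, steps=20000):
    """Upper-tail area P(F >= f0), using the midpoint rule on [0, f0].

    Assumes df1 >= 2 so that the density stays bounded near zero.
    """
    width = f0 / steps
    lower_area = sum(f_density((i + 0.5) * width, df1, df2) for i in range(steps)) * width
    return 1.0 - lower_area


p = f_p_value(2.5, 4, 30)
print(round(p, 3))
```

    The midpoint rule is used because it never evaluates the density at exactly zero, where the logarithm is undefined.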


    Multiple Comparisons

    • A significant F-test tells us that at least two of the underlying population means are different, but it does not tell us which ones differ from the others.

    • We need extra tests to compare all the means, which we call Multiple Comparisons.

    • We look at the difference between every pair of group population means, as well as the confidence interval for each difference.

    • When we have k groups, there are:

      $$\binom{k}{2} = \frac{k!}{2!\,(k-2)!} = \frac{k(k-1)}{2}$$

      possible pair-wise comparisons.
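    The count of pair-wise comparisons can be confirmed by listing the pairs directly. In this small sketch the group labels are hypothetical placeholders chosen for illustration.

```python
from itertools import combinations
from math import comb

# Hypothetical treatment labels; any k distinct labels would do.
treatments = ["Behavioral", "Family", "Standard"]
k = len(treatments)

# Enumerate every unordered pair of groups.
pairs = list(combinations(treatments, 2))
for first, second in pairs:
    print(first, "vs", second)

# The enumeration agrees with both forms of the formula.
n_pairs = comb(k, 2)
print(len(pairs), n_pairs, k * (k - 1) // 2)
```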


    Multiple Comparisons

    • If we estimate each comparison separately with 95% confidence, the overall error rate will be greater than 5%.

    • So, using ordinary pair-wise comparisons (i.e. lots of individual t-tests), we tend to find too many significant differences between our sample means.

    • We need to modify our intervals so that they simultaneously contain the true differences with 95% confidence across the entire set of comparisons.

    • The modified intervals are known as simultaneous confidence intervals OR multiple comparison procedures.
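    To get a feel for how quickly the overall error rate grows, the sketch below computes the chance of at least one false positive across all pair-wise comparisons. It assumes, purely for illustration, that the comparisons are independent, which real pair-wise comparisons are not, so the numbers are a rough guide rather than exact rates. The function name is hypothetical.

```python
from math import comb


def familywise_error_rate(k, alpha=0.05):
    """P(at least one false positive) over all pairs of k groups, if tests were independent."""
    n_comparisons = comb(k, 2)
    return 1 - (1 - alpha) ** n_comparisons


for k in (3, 5, 10):
    print(k, comb(k, 2), round(familywise_error_rate(k), 3))
```

    Under this simplifying assumption, three groups already give an overall rate of about 14%, and five groups about 40%.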


    Multiple Comparisons

    • First, the Bonferroni correction.

    • Instead of using $t_{df,\,\alpha/2}$ as our multiplier for the confidence interval, we use $t_{df,\,\alpha/(2L)}$, where $L$ is the total number of possible pair-wise comparisons (i.e. $L = k(k-1)/2$).

    • That is, we divide $\alpha/2$ by the number of tests to be done ($\alpha/(2L)$).

    • The correction makes no use of how the comparisons are related to one another (they share data, so they are not independent), which makes the adjustment too conservative (intervals will be too wide; i.e. it finds too few significant differences).
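    The Bonferroni adjustment itself is simple arithmetic. The sketch below, with a hypothetical helper name and an illustrative choice of four groups, computes the per-tail area used for each interval and shows that the per-comparison error rates sum back to the overall level.

```python
from math import comb


def bonferroni_tail_area(k, alpha=0.05):
    """Per-tail area alpha / (2L) for each of the L = k(k-1)/2 pair-wise intervals."""
    n_comparisons = comb(k, 2)
    return alpha / (2 * n_comparisons)


k = 4
L = comb(k, 2)
tail = bonferroni_tail_area(k)

# The overall error rate cannot exceed the sum of the two-sided per-comparison rates.
overall_bound = L * 2 * tail
print(L, tail, overall_bound)
```

    The t multiplier for each interval would then be looked up at this smaller tail area, which is why the adjusted intervals are wider than the unadjusted ones.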


    Multiple Comparisons

    • Second, we have Tukey Intervals.

    • The calculation of Tukey Intervals is quite complicated, but overcomes the problems of the unadjusted pair-wise comparisons finding too many significant differences (i.e. confidence intervals that are too narrow), and the Bonferroni correction finding too few significant differences (i.e. confidence intervals that are too wide).

    We will use Tukey Intervals.


    Tukey Pair-wise Comparisons

    Select Compare Means > All Pairs, Tukey HSD

    Here we see that only Behavioral and Standard therapies differ in terms of mean weight gain. We estimate those in behavioral therapy will gain between 2 lbs. and 13 lbs. more on average.


    Checking Assumptions: Independence

    • The observations within each sample must be independent of one another.

    • The samples must be taken from independent populations.


    Checking Assumptions: Equality of Variance

    • Equality of variance is very important in One-way ANOVA.

    • We check equality of variance using Levene's, Bartlett's, Brown-Forsythe, or O'Brien's Tests.

    • If the assumption of equal population variances is not satisfied (small P-value from these tests), we can try transforming the data or use Welch's ANOVA, which allows the variances to be unequal.
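    Levene's test and the Brown-Forsythe variant can both be viewed as a one-way ANOVA carried out on absolute deviations from each group's centre (the mean for Levene, the median for Brown-Forsythe). The sketch below computes only the test statistic under that view; the function name and the data are hypothetical, and the P-value would come from comparing the statistic with an F-distribution on (k − 1) and (N − k) degrees of freedom.

```python
from statistics import mean, median


def levene_statistic(groups, center=mean):
    """One-way ANOVA F-ratio computed on absolute deviations from each group's center."""
    deviations = [[abs(x - center(g)) for x in g] for g in groups]
    k = len(deviations)
    n_total = sum(len(z) for z in deviations)
    grand = sum(sum(z) for z in deviations) / n_total

    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(z) * (mean(z) - grand) ** 2 for z in deviations)
    within = sum((v - mean(z)) ** 2 for z in deviations for v in z)
    return (between / (k - 1)) / (within / (n_total - k))


# Hypothetical samples: the first pair has equal spread, the second does not.
similar_spread = [[1.0, 2.0, 3.0], [11.0, 12.0, 13.0]]
different_spread = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]

w_similar = levene_statistic(similar_spread)                    # Levene (mean-centred)
w_different = levene_statistic(different_spread, center=median)  # Brown-Forsythe
print(w_similar, w_different)
```

    A statistic near zero, as for the first pair, is what equal spreads look like; a larger value points towards unequal variances.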


    Checking Assumptions: Equality of Variance

    • For many data sets, we often find there is a relationship between the centre of the data and the spread of the data:

    • In particular, samples with low means (or medians) often have small spread while samples with large means (or medians) often have large spread (or vice versa).

    • The positive relationship between the mean and variance (or between the median and midspread) in different samples is often true for data that have right-skewed distributions.


    Checking Assumptions: Equality of Variance

    • If the variance of the samples is increasing as the sample means increase, a log or square root transformation is often used.
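    The effect of a log transformation on unequal spreads can be seen with a deliberately extreme example. The samples below are hypothetical and constructed so that the spread scales exactly with the level of the data; real data would only follow this pattern approximately.

```python
import math
from statistics import variance

# Hypothetical samples in which the spread grows with the level of the data.
samples = {
    "low": [1.0, 2.0, 4.0],
    "medium": [10.0, 20.0, 40.0],
    "high": [100.0, 200.0, 400.0],
}

raw_variances = {name: variance(xs) for name, xs in samples.items()}
log_variances = {name: variance([math.log10(x) for x in xs]) for name, xs in samples.items()}

# Ratio of largest to smallest sample variance, before and after the transformation.
raw_ratio = max(raw_variances.values()) / min(raw_variances.values())
log_ratio = max(log_variances.values()) / min(log_variances.values())
print(raw_ratio, log_ratio)
```

    On the raw scale the largest variance is ten thousand times the smallest, while on the log scale the three variances are essentially equal.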


    One-way ANOVA Transformations

    • We can transform our response variable if we detect problems with the equality of variance or normality assumptions.

    • However, as in the two-sample situation, we can only use a log transformation if we wish to be able to back-transform and interpret our confidence intervals meaningfully.
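    Back-transforming works for logs because a difference of logs is the log of a ratio. The sketch below uses a hypothetical interval on the natural-log scale to show the step; the endpoints are invented for illustration.

```python
import math

# Hypothetical 95% interval for a difference in mean log(response) between two groups.
log_lower, log_upper = 0.10, 0.50

# exp() turns a difference of logs into a ratio on the original scale.
ratio_lower, ratio_upper = math.exp(log_lower), math.exp(log_upper)
print(round(ratio_lower, 2), round(ratio_upper, 2))
```

    The back-transformed interval is read multiplicatively: the typical (geometric mean) response in the first group is estimated to be between about 1.11 and 1.65 times that in the second. A square root transformation has no comparably simple interpretation for differences.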