Statistical Methods in Computer Science
Hypothesis Testing II: Single-Factor Experiments
Ido Dagan
Empirical Methods in Computer Science © 2006-now Gal Kaminka
Single-Factor Experiments

A generalization of treatment experiments: determine the effect of the independent variable's (nominal) values on the dependent variable.

treatment1: Ind1 & Ex1 & Ex2 & ... & Exn ==> Dep1
treatment2: Ind2 & Ex1 & Ex2 & ... & Exn ==> Dep2
control: Ex1 & Ex2 & ... & Exn ==> Dep3

For example, compare the performance of algorithm A to B to C, and so on. The control condition is optional (e.g., to establish a baseline).

Here Ind1, Ind2, ... are values of the independent variable; Dep1, Dep2, ... are values of the dependent variable.
Single-Factor Experiments: Definitions
The independent variable is called the factor; its values (the ones being tested) are called levels.

Our goal: determine whether the levels have an effect.
Null hypothesis: there is no effect.
Alternative hypothesis: at least one level causes an effect.

Tool: one-way ANOVA, a simple special case of the general Analysis of Variance.
The case for single-factor ANOVA (one-way ANOVA)

We have k samples (k levels of the factor), each with its own sample mean and sample standard deviation for the dependent variable. We want to determine whether at least one is different.

treatment1: Ind1 & Ex1 & Ex2 & ... & Exn ==> Dep1
...
treatmentk: Indk & Ex1 & Ex2 & ... & Exn ==> Depk
control: Ex1 & Ex2 & ... & Exn ==> Dep (control)

The values of the independent variable are the levels of the factor; Dep1, ..., Depk are the resulting values of the dependent variable.

We cannot use the tests we learned so far. Why?
The case for single-factor ANOVA (one-way ANOVA), continued

H0: M1 = M2 = M3 = M4
H1: there exist i, j such that Mi ≠ Mj

| Level | Mi (sample mean) | Si (sample stdev) | N (sample size) |
|-------|------------------|-------------------|-----------------|
| 1     | 4.24             | 0.91              | 29              |
| 2     | 3.75             | 1.38              | 120             |
| 3     | 2.85             | 1.38              | 59              |
| 4     | 2.63             | 1.41              | 59              |

Why not use a t-test to compare every pair Mi, Mj?
Multiple paired comparisons
Let αc be the probability of an error in a single comparison (α = the probability of incorrectly rejecting the null hypothesis).

1 − αc: the probability of making no error in a single comparison.
(1 − αc)^m: the probability of no error in m comparisons (the whole experiment).
αe = 1 − (1 − αc)^m: the probability of at least one error in the experiment, under the assumption of independent comparisons.

αe quickly becomes large as m increases.
Example
Suppose we want to contrast 15 levels of the factor: 15 groups, k = 15.
The total number of pairwise comparisons is m = 15 × (15 − 1) / 2 = 105.
Suppose αc = 0.05. Then αe = 1 − (1 − αc)^m = 1 − (1 − 0.05)^105 ≈ 0.9954.
We are very likely to make a type I error!
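The calculation above is easy to reproduce (a minimal sketch; the helper name `experiment_error_rate` is ours, not from the slides):

```python
# Family-wise error rate for m independent comparisons,
# each with per-comparison error probability alpha_c.
def experiment_error_rate(alpha_c: float, m: int) -> float:
    return 1 - (1 - alpha_c) ** m

k = 15
m = k * (k - 1) // 2          # 105 pairwise comparisons
alpha_e = experiment_error_rate(0.05, m)
print(m, round(alpha_e, 4))   # 105 0.9954
```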
Possible solutions?
Reduce αc until the overall αe level is 0.05 (or as needed). Risk: the per-comparison alpha target may become unattainable.

Ignore the experiment null hypothesis and focus on the comparisons: carry out the m comparisons. The expected number of errors in m comparisons is m × αc; e.g., with m = 105 and αc = 0.05, we expect 5.25 errors. But which ones are the errors?
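The first solution can be sketched numerically: inverting αe = 1 − (1 − αc)^m for αc gives the per-comparison level needed for a desired experiment-wide level (this inversion is the Šidák-style correction; the slides do not name it, so treat the function below as an illustration):

```python
# Per-comparison alpha needed so that the family-wise error rate
# over m independent comparisons stays at alpha_e
# (inverts alpha_e = 1 - (1 - alpha_c)**m).
def per_comparison_alpha(alpha_e: float, m: int) -> float:
    return 1 - (1 - alpha_e) ** (1 / m)

alpha_c = per_comparison_alpha(0.05, 105)
print(round(alpha_c, 6))   # 0.000488 -- a very strict per-comparison target
```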
One-way ANOVA
A method for testing the experiment null hypothesis H0: all levels' means are equal to each other.

Key idea:
Estimate a variance B under the assumption that H0 is true.
Estimate the "real" variance W (regardless of H0).
Use the F-test to test the hypothesis that B = W.

Assumes the variance of all groups is the same.
Some preliminaries
Let xi,j be the jth element in sample i, Mi the sample mean of sample i, and Vi the sample variance of sample i.

For example:

|    | Class 1 | Class 2 | Class 3 |
|----|---------|---------|---------|
|    | 14.9    | 11.1    | 5.7     |
|    | 15.2    | 9.5     | 6.6     |
|    | 17.9    | 10.9    | 6.7     |
|    | 15.6    | 11.7    | 6.8     |
|    | 10.7    | 11.8    | 6.9     |
| Mi | 14.86   | 11      | 6.54    |
| Vi | 6.8     | 0.85    | 0.23    |

Here x1,2 = 15.2 (the 2nd element of sample 1) and x3,4 = 6.8 (the 4th element of sample 3).
Some preliminaries (continued)

Let M be the grand sample mean (over all elements in all samples), and let V be the grand sample variance.
The variance contributing to a value
Every element xi,j can be re-written as:
xi,j = M + ei,j
where ei,j is some error component
We can focus on the error component
ei,j = xi,j – M
which we will rewrite as:
ei,j = (xi,j - Mi ) + (Mi - M)
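The decomposition is easy to check numerically (a minimal sketch using the Class 1 / Class 2 / Class 3 data from the table above):

```python
# Decompose each element's deviation from the grand mean M into a
# within-group part (x - Mi) and a between-group part (Mi - M).
samples = {
    1: [14.9, 15.2, 17.9, 15.6, 10.7],
    2: [11.1, 9.5, 10.9, 11.7, 11.8],
    3: [5.7, 6.6, 6.7, 6.8, 6.9],
}
all_values = [x for xs in samples.values() for x in xs]
M = sum(all_values) / len(all_values)          # grand mean, 10.8

for i, xs in samples.items():
    Mi = sum(xs) / len(xs)                     # group mean
    for x in xs:
        e = (x - Mi) + (Mi - M)                # within + between
        assert abs(e - (x - M)) < 1e-9         # equals x - M exactly
```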
Within-group and between-group

The rewritten form of the error component has two parts:

ei,j = (xi,j − Mi) + (Mi − M)

Within-group component: variance w.r.t. the group mean. Between-group component: variance w.r.t. the grand mean.

For example, in the table: x1,1 = 14.9, M1 = 14.86, M = 10.8, so
e1,1 = (14.9 − 14.86) + (14.86 − 10.8) = 0.04 + 4.06 = 4.1

Note the within-group and between-group components: most of the error (variance) is due to the between-group part! Can we use this in a more general fashion?
No within-group variance
|    | Class 1 | Class 2 | Class 3 |
|----|---------|---------|---------|
|    | 15      | 11      | 6       |
|    | 15      | 11      | 6       |
|    | 15      | 11      | 6       |
|    | 15      | 11      | 6       |
|    | 15      | 11      | 6       |
| Mi | 15      | 11      | 6       |
| Vi | 0       | 0       | 0       |

Grand mean M = 10.67, grand variance V = 14.52. There is no variance within any group, for any element.
No between-group variance
|    | Class 1 | Class 2 | Class 3 |
|----|---------|---------|---------|
|    | 17      | 11      | 22      |
|    | 26      | 13      | 14      |
|    | 9       | 18      | 12      |
|    | 11      | 18      | 8       |
|    | 12      | 15      | 19      |
| Mi | 15      | 15      | 15      |
| Vi | 46.5    | 9.5     | 31      |

Grand mean M = 15, grand variance V = 24.86. There is no variance between the groups: all group means are equal.
Comparing within-group and between-group components
The error component of a single element is

ei,j = (xi,j − M) = (xi,j − Mi) + (Mi − M)

Let us relate this to the sample and grand sums of squares. It can be shown that

$$\sum_i \sum_j (x_{i,j} - M)^2 = \sum_i \sum_j (x_{i,j} - M_i)^2 + \sum_i N_i (M_i - M)^2$$

which we rewrite as

SStotal = SSwithin + SSbetween
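The partition of the sum of squares can be verified on the three-class data (a minimal sketch; the slides truncate SSbetween to 173.3):

```python
# Verify SS_total = SS_within + SS_between on the three-class data.
samples = [
    [14.9, 15.2, 17.9, 15.6, 10.7],
    [11.1, 9.5, 10.9, 11.7, 11.8],
    [5.7, 6.6, 6.7, 6.8, 6.9],
]
all_values = [x for xs in samples for x in xs]
M = sum(all_values) / len(all_values)   # grand mean

ss_total = sum((x - M) ** 2 for x in all_values)
ss_within = sum((x - sum(xs) / len(xs)) ** 2 for xs in samples for x in xs)
ss_between = sum(len(xs) * (sum(xs) / len(xs) - M) ** 2 for xs in samples)

assert abs(ss_total - (ss_within + ss_between)) < 1e-9
print(round(ss_between, 1), round(ss_within, 1))  # 173.4 31.5
```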
From sums of squares (SS) to variances

We know

SStotal = SSwithin + SSbetween

... and convert each sum of squares to a mean square (a variance estimate) by dividing by its degrees of freedom:

$$MS_{within} = \frac{SS_{within}}{df_{within}} = \frac{\sum_i \sum_j (x_{i,j} - M_i)^2}{N - I}$$

$$MS_{between} = \frac{SS_{between}}{df_{between}} = \frac{\sum_i N_i (M_i - M)^2}{I - 1}$$

Here N is the total number of elements, Ni is the size of sample i, and I is the number of levels (samples), so dfwithin = N − I and dfbetween = I − 1.
Determining final alpha level
MSwithin is an estimate of the (inherent) population variance, which does not depend on the null hypothesis (M1 = M2 = ... = MI). Intuition: it is an "average" of the variances in the individual groups.

MSbetween estimates the population variance plus the treatment effect; it does depend on the null hypothesis. Intuition: it is similar to an estimate of the variance of the sample means, where each component is multiplied by Ni. Recall: N × (variance of the sample mean) = population variance.

If the null hypothesis is true, the two values estimate the same inherent variance and should be equal up to sampling variation.

So now we have two variance estimates for testing. Use the F-test:

F = MSbetween / MSwithin

Compare to the F-distribution with (dfbetween, dfwithin) degrees of freedom, and determine the alpha level (significance).
Example

Using the three-class data from the table above (Ni = 5 for each class; Mi = 14.86, 11, 6.54; grand mean M = 10.8, grand variance V = 14.64):

SSbetween = 5(14.86 − 10.8)² + 5(11 − 10.8)² + 5(6.54 − 10.8)² = 173.3

MSbetween = SSbetween / dfbetween = 173.3 / (3 − 1) = 86.7
Example (continued)

SSwithin = (14.9 − 14.86)² + ... + (10.7 − 14.86)² + (11.1 − 11)² + ... + (11.8 − 11)² + (5.7 − 6.54)² + ... + (6.9 − 6.54)² = 31.5

MSwithin = SSwithin / dfwithin = 31.5 / (15 − 3) = 2.6
Example (continued)

F = MSbetween / MSwithin = 32.97

Check against the F distribution with (2, 12) degrees of freedom: significant!
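The whole computation can be reproduced in a few lines (a sketch, not a library-grade implementation; the data are the three classes from the example):

```python
# One-way ANOVA by hand: compute F = MS_between / MS_within.
def one_way_anova_F(samples):
    all_values = [x for xs in samples for x in xs]
    N, I = len(all_values), len(samples)
    M = sum(all_values) / N                        # grand mean
    means = [sum(xs) / len(xs) for xs in samples]  # group means
    ss_between = sum(len(xs) * (mi - M) ** 2 for xs, mi in zip(samples, means))
    ss_within = sum((x - mi) ** 2 for xs, mi in zip(samples, means) for x in xs)
    ms_between = ss_between / (I - 1)              # df_between = I - 1
    ms_within = ss_within / (N - I)                # df_within  = N - I
    return ms_between / ms_within

classes = [
    [14.9, 15.2, 17.9, 15.6, 10.7],
    [11.1, 9.5, 10.9, 11.7, 11.8],
    [5.7, 6.6, 6.7, 6.8, 6.9],
]
print(round(one_way_anova_F(classes), 2))  # 32.97
```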
Reading the results from statistics software
You can use statistics software to run a one-way ANOVA. It will output something like this:

| Source  | df | SS    | MS   | F     | p       |
|---------|----|-------|------|-------|---------|
| between | 2  | 173.3 | 86.7 | 32.97 | < 0.001 |
| within  | 12 | 31.5  | 2.6  |       |         |
| total   | 14 | 204.9 |      |       |         |

You should have no problem reading this now.
Analogy to linear regression
As in linear regression, the variance of the observations is composed of the variance of the predictions plus the variance of the deviations from the corresponding predictions. That is: explained variance (according to the prediction) vs. unexplained variance (due to deviations from the prediction).
Summary
Treatment and single-factor experiments: the independent variable is categorical (nominal); the dependent variable is "numerical" (ratio/interval).

Multiple comparisons are a problem for experiment-level hypotheses, so run one-way ANOVA instead. It assumes:
the populations are normal
the populations have equal variances
independent random samples (with replacement)

Moderate deviation from normality is still fine, particularly with large samples; somewhat different variances are fine for roughly equal sample sizes.

If the result is significant, run additional tests for details: Tukey's procedure (the T method), LSD, Scheffé, ...