1 Chi-squared Test (1) H 0 : population follows distribution Divide the observations into k bins O i...

6
1 Chi-squared Test (1) H 0 : population follows distribution Divide the observations into k bins O i = observed frequency in i-th bin E i = expected frequency in i-th bin Test statistic: 2 O i E i 2 E i i 1 k

Transcript of 1 Chi-squared Test (1) H 0 : population follows distribution Divide the observations into k bins O i...

Page 1: 1 Chi-squared Test (1) H 0 : population follows distribution Divide the observations into k bins O i = observed frequency in i-th bin E i = expected frequency.

1

Chi-squared Test (1)

H0: population follows distribution Divide the observations into k bins Oi = observed frequency in i-th bin

Ei = expected frequency in i-th bin Test statistic:

2 Oi E i 2

E ii1

k

Page 2: 1 Chi-squared Test (1) H 0 : population follows distribution Divide the observations into k bins O i = observed frequency in i-th bin E i = expected frequency.

2

Chi-squared Test (2)

2 follows (approx) a chi-square distribution with k-p-1 degrees of freedom where p is the number of estimated parameters

Reject H0 if 2 > 2,k-p-1 .

Page 3: 1 Chi-squared Test (1) H 0 : population follows distribution Divide the observations into k bins O i = observed frequency in i-th bin E i = expected frequency.

3

How many categories?

Expected frequency in each bin should be at least 3. Can be as small as 1 or 2 if other bins are at least 5. If number is too small, lump with adjacent bin.

Page 4: 1 Chi-squared Test (1) H 0 : population follows distribution Divide the observations into k bins O i = observed frequency in i-th bin E i = expected frequency.

4

Example: Uniform Birthdays

H0: birthdays follow a uniform distribution

Construct a chi-squared test for this hypothesis

Page 5: 1 Chi-squared Test (1) H 0 : population follows distribution Divide the observations into k bins O i = observed frequency in i-th bin E i = expected frequency.

5

Contingency Table Test

Goal: to find out if two methods of classification are statistically indept

ˆ p i 1

nOij

i1

r

; ˆ q j 1

nOij

j1

c

E ij nˆ p i ˆ q j

2 Oij E ij 2

E ijj1

c

i1

r

• 2 follows a chi-squared

distribution with (r-1)(c-1)

degrees of freedom.

• Reject the null hypothesis

if 2 exceeds 2,(r-1)(c-1)

Page 6: 1 Chi-squared Test (1) H 0 : population follows distribution Divide the observations into k bins O i = observed frequency in i-th bin E i = expected frequency.

6

Example: Smoking & Gender

Test the hypothesis (at the 1% level) that whether a person smokes is independent of the person’s gender.