Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 9 Inferences Based on...

Chapter 9

Inferences Based on Two Samples

z Tests and Confidence Intervals for a Difference Between Two Population Means

The Difference Between Two Population Means

Assumptions:

1. $X_1, \ldots, X_m$ is a random sample from a population with mean $\mu_1$ and variance $\sigma_1^2$.

2. $Y_1, \ldots, Y_n$ is a random sample from a population with mean $\mu_2$ and variance $\sigma_2^2$.

3. The X and Y samples are independent of one another.

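Expected Value and Standard Deviation of $\bar{X} - \bar{Y}$

The expected value is $E(\bar{X} - \bar{Y}) = \mu_1 - \mu_2$, so $\bar{X} - \bar{Y}$ is an unbiased estimator of $\mu_1 - \mu_2$.

The standard deviation is

$$\sigma_{\bar{X} - \bar{Y}} = \sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}$$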

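Test Procedures for Normal Populations With Known Variances

Null hypothesis: $H_0: \mu_1 - \mu_2 = \Delta_0$

Test statistic value:

$$z = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}}$$

Alt. Hypothesis and rejection region for a level $\alpha$ test:

$H_a: \mu_1 - \mu_2 > \Delta_0$: $z \ge z_\alpha$

$H_a: \mu_1 - \mu_2 < \Delta_0$: $z \le -z_\alpha$

$H_a: \mu_1 - \mu_2 \ne \Delta_0$: $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$

Alt. Hypothesis and $\beta(\Delta') = P(\text{Type II Error when } \mu_1 - \mu_2 = \Delta')$, with $\sigma = \sigma_{\bar{X} - \bar{Y}}$:

$H_a: \mu_1 - \mu_2 > \Delta_0$: $\Phi\left(z_\alpha - \dfrac{\Delta' - \Delta_0}{\sigma}\right)$

$H_a: \mu_1 - \mu_2 < \Delta_0$: $1 - \Phi\left(-z_\alpha - \dfrac{\Delta' - \Delta_0}{\sigma}\right)$

$H_a: \mu_1 - \mu_2 \ne \Delta_0$: $\Phi\left(z_{\alpha/2} - \dfrac{\Delta' - \Delta_0}{\sigma}\right) - \Phi\left(-z_{\alpha/2} - \dfrac{\Delta' - \Delta_0}{\sigma}\right)$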
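As a rough illustration of the known-variance z test and its Type II error probability, the sketch below computes the z statistic and the upper-tailed $\beta(\Delta')$. The function names and all numbers are hypothetical and chosen only for illustration; they do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar, ybar, var1, var2, m, n, delta0=0.0):
    """z statistic for H0: mu1 - mu2 = delta0 with known population variances."""
    sd = sqrt(var1 / m + var2 / n)  # standard deviation of Xbar - Ybar
    return (xbar - ybar - delta0) / sd


def beta_upper_tailed(delta_prime, var1, var2, m, n, delta0=0.0, alpha=0.05):
    """P(Type II error) for Ha: mu1 - mu2 > delta0 when the true difference is delta_prime."""
    std_normal = NormalDist()
    sd = sqrt(var1 / m + var2 / n)
    z_alpha = std_normal.inv_cdf(1 - alpha)  # upper-tail critical value
    return std_normal.cdf(z_alpha - (delta_prime - delta0) / sd)


# Illustrative (made-up) summary values
z = two_sample_z(xbar=29.8, ybar=34.7, var1=16.0, var2=25.0, m=20, n=25)
beta_at_2 = beta_upper_tailed(delta_prime=2.0, var1=16.0, var2=25.0, m=20, n=25)
print(round(z, 3), round(beta_at_2, 3))
```

When the true difference equals $\Delta_0$, the upper-tailed expression reduces to $1 - \alpha$, which is a convenient sanity check.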

Large-Sample Tests

The assumptions of normal population distributions and known values of $\sigma_1$ and $\sigma_2$ are unnecessary. The Central Limit Theorem guarantees that $\bar{X} - \bar{Y}$ has approximately a normal distribution.

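Use of the test statistic value

$$z = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$$

along with the previously stated rejection regions based on z critical values gives large-sample tests whose significance levels are approximately $\alpha$. These tests are appropriate when $m > 40$ and $n > 40$.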
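The large-sample test can be sketched in a few lines. This is a minimal example assuming only summary statistics are available; the function name, the `alternative` argument, and the numbers are illustrative assumptions rather than part of the slides.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_z_test(xbar, ybar, s1, s2, m, n,
                        delta0=0.0, alpha=0.05, alternative="two-sided"):
    """Return (z, reject) for H0: mu1 - mu2 = delta0 using sample standard deviations."""
    z = (xbar - ybar - delta0) / sqrt(s1**2 / m + s2**2 / n)
    std_normal = NormalDist()
    if alternative == "greater":      # Ha: mu1 - mu2 > delta0
        reject = z >= std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":       # Ha: mu1 - mu2 < delta0
        reject = z <= -std_normal.inv_cdf(1 - alpha)
    else:                             # Ha: mu1 - mu2 != delta0
        reject = abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
    return z, reject


# Illustrative (made-up) summary values with m, n > 40
z_stat, reject_h0 = large_sample_z_test(5.0, 4.0, 2.0, 2.0, 50, 50)
print(round(z_stat, 3), reject_h0)
```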

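Confidence Interval for $\mu_1 - \mu_2$

Provided m and n are large, a CI for $\mu_1 - \mu_2$ with a confidence level of $100(1 - \alpha)\%$ is

$$\bar{x} - \bar{y} \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}$$

Confidence bounds can be found by replacing $z_{\alpha/2}$ by $z_\alpha$.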
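A short sketch of the large-sample interval, assuming summary statistics as inputs. The function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_ci(xbar, ybar, s1, s2, m, n, conf=0.95):
    """Large-sample CI for mu1 - mu2 at the given confidence level."""
    z_crit = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    half_width = z_crit * sqrt(s1**2 / m + s2**2 / n)
    diff = xbar - ybar
    return diff - half_width, diff + half_width


# Illustrative (made-up) summary values
lower, upper = large_sample_ci(5.0, 4.0, 2.0, 2.0, 50, 50)
print(round(lower, 3), round(upper, 3))
```

For a one-sided bound, the same half-width formula applies with `1 - (1 - conf)` in place of `1 - (1 - conf) / 2`, keeping only the lower or upper endpoint.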

The Two-Sample t Test and Confidence Interval

Assumptions

Both populations are normal, so that X1,…,Xm is a random sample from a normal distribution and so is Y1,…,Yn. The plausibility of these assumptions can be judged by constructing a normal probability plot of the xi’s and another of the yi’s.

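t Distribution

When the population distributions are both normal, the standardized variable

$$T = \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{m} + \dfrac{S_2^2}{n}}}$$

has approximately a t distribution with df $\nu$, where $\nu$ can be estimated from the data by

$$\nu = \frac{\left(\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}\right)^2}{\dfrac{(s_1^2/m)^2}{m - 1} + \dfrac{(s_2^2/n)^2}{n - 1}}$$

(round down to the nearest integer)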
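The df estimate is easy to get wrong by hand, so here is a minimal sketch of the computation. The function name and the sample values are illustrative assumptions.

```python
from math import floor


def estimated_df(s1, s2, m, n):
    """Estimated df for the two-sample t variable, rounded down to an integer."""
    a = s1**2 / m  # estimated variance of Xbar
    b = s2**2 / n  # estimated variance of Ybar
    nu = (a + b) ** 2 / (a**2 / (m - 1) + b**2 / (n - 1))
    return floor(nu)


# Illustrative (made-up) values: s1 = 3, s2 = 5, m = 10, n = 15
df = estimated_df(3.0, 5.0, 10, 15)
print(df)
```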

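Two-Sample CI for $\mu_1 - \mu_2$

The two-sample CI for $\mu_1 - \mu_2$ with a confidence level of $100(1 - \alpha)\%$ is

$$\bar{x} - \bar{y} \pm t_{\alpha/2,\nu}\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}$$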

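The Two-Sample t Test

Null hypothesis: $H_0: \mu_1 - \mu_2 = \Delta_0$

Test statistic value:

$$t = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$$

Alternative Hypothesis and Rejection Region for Approx. Level $\alpha$ Test:

$H_a: \mu_1 - \mu_2 > \Delta_0$: $t \ge t_{\alpha,\nu}$

$H_a: \mu_1 - \mu_2 < \Delta_0$: $t \le -t_{\alpha,\nu}$

$H_a: \mu_1 - \mu_2 \ne \Delta_0$: $t \ge t_{\alpha/2,\nu}$ or $t \le -t_{\alpha/2,\nu}$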
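A minimal sketch of the two-sided version of this test follows. Python's standard library has no t quantile function, so the critical value is passed in as a number read from a t table; the function names and data values are hypothetical.

```python
from math import sqrt


def two_sample_t(xbar, ybar, s1, s2, m, n, delta0=0.0):
    """t statistic for H0: mu1 - mu2 = delta0 (variances not assumed equal)."""
    return (xbar - ybar - delta0) / sqrt(s1**2 / m + s2**2 / n)


def reject_two_sided(t, t_crit):
    """Two-sided rejection region: t >= t_crit or t <= -t_crit."""
    return t >= t_crit or t <= -t_crit


# Illustrative (made-up) summary values; the estimated df for these inputs is 22
t_stat = two_sample_t(20.0, 17.0, 3.0, 5.0, 10, 15)
T_CRIT = 2.074  # t_{0.025, 22} taken from a t table
print(round(t_stat, 3), reject_two_sided(t_stat, T_CRIT))
```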

Pooled t Procedures

Assume the two populations are normal and have equal variances. If $\sigma^2$ denotes the common variance, it can be estimated by combining information from the two samples. Standardizing $\bar{X} - \bar{Y}$ using the pooled estimator gives a t variable based on $m + n - 2$ df.
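The slide does not write out the pooled estimator. The sketch below assumes the usual weighted average $s_p^2 = \frac{(m-1)s_1^2 + (n-1)s_2^2}{m+n-2}$; the function name and numbers are illustrative.

```python
from math import sqrt


def pooled_t(xbar, ybar, s1, s2, m, n, delta0=0.0):
    """Return (t, df) for the pooled t procedure, assuming equal population variances."""
    # Pooled variance: each sample variance weighted by its degrees of freedom
    sp2 = ((m - 1) * s1**2 + (n - 1) * s2**2) / (m + n - 2)
    t = (xbar - ybar - delta0) / sqrt(sp2 * (1 / m + 1 / n))
    return t, m + n - 2


# Illustrative (made-up) summary values
t_pooled, df_pooled = pooled_t(5.0, 3.0, 2.0, 2.0, 10, 10)
print(round(t_pooled, 3), df_pooled)
```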

Analysis of Paired Data

Paired Data (Assumptions)

The data consists of n independently selected pairs $(X_1, Y_1), \ldots, (X_n, Y_n)$, with $E(X_i) = \mu_1$ and $E(Y_i) = \mu_2$.

Let $D_1 = X_1 - Y_1, \ldots, D_n = X_n - Y_n$. The $D_i$'s are assumed to be normally distributed with mean value $\mu_D$ and variance $\sigma_D^2$.

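The Paired t Test

Null hypothesis: $H_0: \mu_D = \Delta_0$

Test statistic value:

$$t = \frac{\bar{d} - \Delta_0}{s_D / \sqrt{n}}$$

where $\bar{d}$ and $s_D$ are the sample mean and standard deviation of the $d_i$'s.

Alternative Hypothesis and Rejection Region for Level $\alpha$ Test:

$H_a: \mu_D > \Delta_0$: $t \ge t_{\alpha,n-1}$

$H_a: \mu_D < \Delta_0$: $t \le -t_{\alpha,n-1}$

$H_a: \mu_D \ne \Delta_0$: $t \ge t_{\alpha/2,n-1}$ or $t \le -t_{\alpha/2,n-1}$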
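The paired t statistic works on the differences only, which the following sketch makes explicit. The function name and the small data set are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y, delta0=0.0):
    """Return (t, df) for the paired t test of H0: mu_D = delta0."""
    d = [xi - yi for xi, yi in zip(x, y)]  # within-pair differences
    n = len(d)
    t = (mean(d) - delta0) / (stdev(d) / sqrt(n))  # stdev uses the n - 1 divisor
    return t, n - 1


# Illustrative (made-up) paired observations
x_obs = [10, 12, 9, 11, 13]
y_obs = [8, 11, 9, 9, 10]
t_paired, df_paired = paired_t(x_obs, y_obs)
print(round(t_paired, 3), df_paired)
```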

Confidence Interval for $\mu_D$

The paired t CI for $\mu_D$ is

$$\bar{d} \pm t_{\alpha/2,n-1} \cdot \frac{s_D}{\sqrt{n}}$$

Confidence bounds can be found by replacing $t_{\alpha/2}$ by $t_\alpha$.

Paired Data and Two-Sample t

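$$V(\bar{X} - \bar{Y}) = V(\bar{D}) = V\left(\frac{1}{n}\sum D_i\right) = \frac{V(D_i)}{n}$$

$$V(D_i) = \sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2$$

where $\rho$ is the correlation between X and Y within a pair.

Independence between X and Y: $\rho = 0$, so $V(\bar{X} - \bar{Y}) = (\sigma_1^2 + \sigma_2^2)/n$.

Positive dependence: $\rho > 0$, so $V(\bar{X} - \bar{Y})$ is smaller than it would be under independence.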
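A small numerical sketch shows how positive within-pair correlation shrinks the variance of the mean difference. The function name and parameter values are hypothetical.

```python
def var_mean_difference(sigma1, sigma2, rho, n):
    """Variance of Xbar - Ybar for n pairs with within-pair correlation rho."""
    return (sigma1**2 + sigma2**2 - 2 * rho * sigma1 * sigma2) / n


# Illustrative (made-up) values: sigma1 = 2, sigma2 = 3, n = 10
v_independent = var_mean_difference(2.0, 3.0, 0.0, 10)  # rho = 0
v_positive = var_mean_difference(2.0, 3.0, 0.8, 10)     # rho = 0.8
print(round(v_independent, 3), round(v_positive, 3))
```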

Pros and Cons of Pairing

1. For great heterogeneity and large correlation within experimental units, the loss in degrees of freedom will be compensated for by an increased precision associated with pairing (use pairing).

2. If the units are relatively homogeneous and the correlation within pairs is not large, the gain in precision due to pairing will be outweighed by the decrease in degrees of freedom (use independent samples).

Inferences Concerning a Difference Between Population Proportions

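Let $X \sim \text{Bin}(m, p_1)$ and $Y \sim \text{Bin}(n, p_2)$ with X and Y independent variables. Then

$$E(\hat{p}_1 - \hat{p}_2) = p_1 - p_2$$

so $\hat{p}_1 - \hat{p}_2$ is an unbiased estimator of $p_1 - p_2$, and

$$V(\hat{p}_1 - \hat{p}_2) = \frac{p_1 q_1}{m} + \frac{p_2 q_2}{n} \qquad (q_i = 1 - p_i)$$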

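Large-Samples

Null hypothesis: $H_0: p_1 - p_2 = 0$

Test statistic value:

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{m} + \dfrac{1}{n}\right)}}$$

where $\hat{p} = (X + Y)/(m + n)$ is the pooled sample proportion and $\hat{q} = 1 - \hat{p}$.

Alt. Hypothesis and Rejection Region:

$H_a: p_1 - p_2 > 0$: $z \ge z_\alpha$

$H_a: p_1 - p_2 < 0$: $z \le -z_\alpha$

$H_a: p_1 - p_2 \ne 0$: $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$

Valid provided $m\hat{p}_1$, $m\hat{q}_1$, $n\hat{p}_2$, and $n\hat{q}_2$ are all at least 10.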
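The pooled-proportion test statistic can be sketched as follows, assuming the inputs are success counts `x` and `y` out of `m` and `n` trials. The function name and counts are made up.

```python
from math import sqrt


def two_proportion_z(x, m, y, n):
    """z statistic for H0: p1 - p2 = 0 using the pooled sample proportion."""
    p1_hat = x / m
    p2_hat = y / n
    p_hat = (x + y) / (m + n)  # pooled estimate of the common p under H0
    se = sqrt(p_hat * (1 - p_hat) * (1 / m + 1 / n))
    return (p1_hat - p2_hat) / se


# Illustrative (made-up) counts: 60 of 100 versus 45 of 100
z_prop = two_proportion_z(60, 100, 45, 100)
print(round(z_prop, 3))
```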

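General Expressions for $\beta(p_1, p_2)$

Alt. Hypothesis and $\beta(p_1, p_2)$:

$H_a: p_1 - p_2 > 0$:

$$\Phi\left(\frac{z_\alpha\sqrt{\bar{p}\bar{q}(1/m + 1/n)} - (p_1 - p_2)}{\sigma}\right)$$

$H_a: p_1 - p_2 < 0$:

$$1 - \Phi\left(\frac{-z_\alpha\sqrt{\bar{p}\bar{q}(1/m + 1/n)} - (p_1 - p_2)}{\sigma}\right)$$

$H_a: p_1 - p_2 \ne 0$:

$$\Phi\left(\frac{z_{\alpha/2}\sqrt{\bar{p}\bar{q}(1/m + 1/n)} - (p_1 - p_2)}{\sigma}\right) - \Phi\left(\frac{-z_{\alpha/2}\sqrt{\bar{p}\bar{q}(1/m + 1/n)} - (p_1 - p_2)}{\sigma}\right)$$

where

$$\bar{p} = \frac{m p_1 + n p_2}{m + n}, \qquad \bar{q} = \frac{m q_1 + n q_2}{m + n}, \qquad \sigma = \sqrt{\frac{p_1 q_1}{m} + \frac{p_2 q_2}{n}}$$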
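To make the two-sided $\beta(p_1, p_2)$ expression concrete, here is a minimal sketch. It assumes $\sigma = \sqrt{p_1 q_1/m + p_2 q_2/n}$ as the standard deviation of $\hat{p}_1 - \hat{p}_2$; the function name and inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def beta_two_sided(p1, p2, m, n, alpha=0.05):
    """Approximate P(Type II error) for Ha: p1 - p2 != 0 at the values p1, p2."""
    std_normal = NormalDist()
    q1, q2 = 1 - p1, 1 - p2
    p_bar = (m * p1 + n * p2) / (m + n)
    q_bar = (m * q1 + n * q2) / (m + n)
    sigma = sqrt(p1 * q1 / m + p2 * q2 / n)  # sd of p1_hat - p2_hat
    cutoff = std_normal.inv_cdf(1 - alpha / 2) * sqrt(p_bar * q_bar * (1 / m + 1 / n))
    diff = p1 - p2
    return (std_normal.cdf((cutoff - diff) / sigma)
            - std_normal.cdf((-cutoff - diff) / sigma))


# Illustrative (made-up) alternatives with m = n = 100
beta_small_gap = beta_two_sided(0.4, 0.3, 100, 100)
beta_large_gap = beta_two_sided(0.5, 0.3, 100, 100)
print(round(beta_small_gap, 3), round(beta_large_gap, 3))
```

When $p_1 = p_2$ the expression collapses to $1 - \alpha$, and it decreases as the true difference grows.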

Sample Size

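For the case m = n, the level $\alpha$ test has type II error probability $\beta$ at the alternative values $p_1$, $p_2$ with $p_1 - p_2 = d$ when

$$n = \frac{\left[z_\alpha\sqrt{(p_1 + p_2)(q_1 + q_2)/2} + z_\beta\sqrt{p_1 q_1 + p_2 q_2}\right]^2}{d^2}$$

for an upper- or lower-tailed test, with $\alpha/2$ replacing $\alpha$ for a two-tailed test.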
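The sample-size formula for a one-tailed test can be evaluated with a short function. This is a sketch under the assumption that the result is rounded up to the next integer; the function name and the chosen $\alpha$, $\beta$, $p_1$, $p_2$ are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_equal_n(p1, p2, alpha=0.05, beta=0.10):
    """Common sample size n (m = n) for a one-tailed level-alpha test with type II error beta."""
    std_normal = NormalDist()
    q1, q2 = 1 - p1, 1 - p2
    d = p1 - p2
    z_alpha = std_normal.inv_cdf(1 - alpha)
    z_beta = std_normal.inv_cdf(1 - beta)
    numerator = (z_alpha * sqrt((p1 + p2) * (q1 + q2) / 2)
                 + z_beta * sqrt(p1 * q1 + p2 * q2)) ** 2
    return ceil(numerator / d**2)  # round up so the error targets are met


# Illustrative (made-up) alternative values
n_needed = sample_size_equal_n(0.5, 0.4)
print(n_needed)
```

For a two-tailed test, call the function with `alpha / 2` in place of `alpha`.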

Confidence Interval for p1 – p2

$$\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{m} + \frac{\hat{p}_2\hat{q}_2}{n}}$$
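A minimal sketch of this interval, assuming success counts as inputs. Unlike the test statistic, the interval uses the unpooled standard error. The function name and counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_diff_ci(x, m, y, n, conf=0.95):
    """Large-sample CI for p1 - p2 based on x successes in m trials and y in n."""
    p1_hat, p2_hat = x / m, y / n
    z_crit = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    half_width = z_crit * sqrt(p1_hat * (1 - p1_hat) / m + p2_hat * (1 - p2_hat) / n)
    diff = p1_hat - p2_hat
    return diff - half_width, diff + half_width


# Illustrative (made-up) counts: 60 of 100 versus 45 of 100
ci_low, ci_high = proportion_diff_ci(60, 100, 45, 100)
print(round(ci_low, 4), round(ci_high, 4))
```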

Inferences Concerning Two Population Variances

The F Distribution

The F probability distribution has parameters $\nu_1$ (number of numerator df) and $\nu_2$ (number of denominator df). If $X_1$ and $X_2$ are independent chi-squared rv's with $\nu_1$ and $\nu_2$ df, then

$$F = \frac{X_1/\nu_1}{X_2/\nu_2}$$

has an F distribution.

The F Distribution Density Curve Property

$$F_{1-\alpha,\nu_1,\nu_2} = \frac{1}{F_{\alpha,\nu_2,\nu_1}}$$

[Figure: F density curve; the shaded area to the right of $F_{\alpha,\nu_1,\nu_2}$ equals $\alpha$]

Inferential Methods

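Let $X_1, \ldots, X_m$ and $Y_1, \ldots, Y_n$ be random (independent) samples from normal distributions with variances $\sigma_1^2$ and $\sigma_2^2$, respectively. Let $S_1^2$ and $S_2^2$ denote the two sample variances. Then

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$

has an F distribution with $\nu_1 = m - 1$ and $\nu_2 = n - 1$.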

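F Test for Equality of Variances

Null hypothesis: $H_0: \sigma_1^2 = \sigma_2^2$

Test statistic value: $f = s_1^2 / s_2^2$

Alt. Hypothesis and Rejection Region:

$H_a: \sigma_1^2 > \sigma_2^2$: $f \ge F_{\alpha,m-1,n-1}$

$H_a: \sigma_1^2 < \sigma_2^2$: $f \le F_{1-\alpha,m-1,n-1}$

$H_a: \sigma_1^2 \ne \sigma_2^2$: $f \ge F_{\alpha/2,m-1,n-1}$ or $f \le F_{1-\alpha/2,m-1,n-1}$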
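The F statistic and a lower-tail critical value can be sketched as below. Python's standard library has no F quantile function, so the upper-tail critical value is a number read from an F table and the lower-tail value is obtained from the reciprocal property $F_{1-\alpha,\nu_1,\nu_2} = 1/F_{\alpha,\nu_2,\nu_1}$. Function names and data are made up.

```python
from statistics import variance


def f_statistic(x, y):
    """Return (f, numerator df, denominator df) for H0: sigma1^2 = sigma2^2."""
    f = variance(x) / variance(y)  # ratio of sample variances (n - 1 divisors)
    return f, len(x) - 1, len(y) - 1


def lower_critical_value(upper_crit_swapped_df):
    """F_{1-alpha, v1, v2} computed as 1 / F_{alpha, v2, v1}."""
    return 1 / upper_crit_swapped_df


# Illustrative (made-up) samples
f_val, df_num, df_den = f_statistic([2, 4, 6, 8], [1, 2, 3, 4])
F_UPPER = 9.28  # F_{0.05, 3, 3} taken from an F table
f_lower = lower_critical_value(F_UPPER)  # F_{0.95, 3, 3}
print(round(f_val, 3), df_num, df_den, round(f_lower, 4))
```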

P-Values for F Tests

The P-value for an upper-tailed F test is the area under the F curve with appropriate numerator and denominator df to the right of the calculated f.
