Chapter 8: Introduction to Statistical Inference


Transcript of Chapter 8: Introduction to Statistical Inference

Page 1: Chapter 8:  Introduction to Statistical Inference


Chapter 8: Introduction to Statistical Inference

Page 2: Chapter 8:  Introduction to Statistical Inference

In Chapter 8:

8.1 Concepts

8.2 Sampling Behavior of a Mean

8.3 Sampling Behavior of a Count and Proportion

Page 3: Chapter 8:  Introduction to Statistical Inference

§8.1: Concepts

Statistical inference is the act of generalizing from a sample to a population with a calculated degree of certainty.

We calculate statistics in the sample

We are curious about parameters in the population

Page 4: Chapter 8:  Introduction to Statistical Inference

Parameters and Statistics

It is essential to draw the distinction between parameters and statistics.

                     Parameters    Statistics
Source               Population    Sample
Calculated?          No            Yes
Constant?            Yes           No
Notation (examples)  μ, σ, p       x̄, s, p̂

Page 5: Chapter 8:  Introduction to Statistical Inference

§8.2 Sampling Behavior of a Mean

• How precisely does a given sample mean reflect the underlying population mean?

• To answer this question, we must establish the sampling distribution of x-bar

• The sampling distribution of x-bar is the hypothetical distribution of means from all possible samples of size n taken from the same population

Page 6: Chapter 8:  Introduction to Statistical Inference

Simulation Experiment

• Population: N = 10,000 with lognormal distribution (positive skew), μ = 173, and σ = 30 (Figure A, next slide)

• Take repeated SRSs, each of n = 10, from this population

• Calculate x-bar in each sample

• Plot the x-bars (Figure B, next slide); a short simulation sketch in code follows below
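The experiment can be reproduced in a few lines of code. The sketch below is an illustration under stated assumptions, not the original simulation: the log-scale parameters are derived so the lognormal population has mean ≈ 173 and SD ≈ 30 as on the slide, then repeated SRSs of n = 10 are drawn and their means collected.

```python
import numpy as np

# Minimal sketch of the simulation experiment (not the original code).
rng = np.random.default_rng(0)

mu, sigma = 173, 30
s2 = np.log(1 + (sigma / mu) ** 2)          # variance on the log scale
m = np.log(mu) - s2 / 2                     # mean on the log scale
population = rng.lognormal(mean=m, sigma=np.sqrt(s2), size=10_000)

# Repeated SRSs of n = 10; record each sample mean (x-bar)
n, reps = 10, 5_000
xbars = np.array([rng.choice(population, size=n, replace=False).mean()
                  for _ in range(reps)])

print(population.mean(), population.std())  # close to 173 and 30 (Figure A)
print(xbars.mean(), xbars.std())            # close to 173 and 30/sqrt(10) ≈ 9.5 (Figure B)
```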

Page 7: Chapter 8:  Introduction to Statistical Inference

A. Population (individual values)

B. Sampling distribution of x-bars

Page 8: Chapter 8:  Introduction to Statistical Inference

Findings

1. Distribution B is Normal even though Distribution A is not (Central Limit Theorem)

2. Both distributions are centered on µ (“unbiasedness”)

3. The standard deviation of Distribution B is much less than the standard deviation of Distribution A (square root law)

Page 9: Chapter 8:  Introduction to Statistical Inference

Results from Simulation Experiment

• Finding 1 (central limit theorem) says the sampling distribution of x-bar tends toward Normality even when the population distribution is not Normal. This effect is strong in large samples.

• Finding 2 (unbiasedness) means that the expected value of x-bar is μ

• Finding 3 is related to the square root law which says:

σ_x̄ = σ / √n

Page 10: Chapter 8:  Introduction to Statistical Inference

Standard Deviation (Error) of the Mean

• The standard deviation of the sampling distribution of the mean has a special name: it is called the “standard error of the mean” (SE)

• The square root law says the SE is inversely proportional to the square root of the sample size:

SE_x̄ = σ / √n

Page 11: Chapter 8:  Introduction to Statistical Inference

Example: the Wechsler Adult Intelligence Scale has σ = 15

For n = 1: SE_x̄ = 15 / √1 = 15

For n = 4: SE_x̄ = 15 / √4 = 7.5

For n = 16: SE_x̄ = 15 / √16 = 3.75

Quadrupling the sample size cuts the SE in half (the square root law!)
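A quick numeric check of the square root law (a sketch in Python, using the σ = 15 value from the slide):

```python
import math

sigma = 15                      # Wechsler Adult Intelligence Scale SD (from the slide)
for n in (1, 4, 16):
    se = sigma / math.sqrt(n)   # square root law: SE = sigma / sqrt(n)
    print(f"n = {n:2d}: SE = {se}")
# n =  1: SE = 15.0
# n =  4: SE = 7.5
# n = 16: SE = 3.75
```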

Page 12: Chapter 8:  Introduction to Statistical Inference

Putting it together:

• The sampling distribution of x-bar is Normal with mean µ and standard deviation (SE) = σ / √n (when population Normal or n is large)

• These facts make inferences about µ possible: x̄ ~ N(µ, SE)

• Example: Let X represent Wechsler adult intelligence scores: X ~ N(100, 15). Take an SRS of n = 10. Then SE = σ / √n = 15 / √10 = 4.7, so x̄ ~ N(100, 4.7).

68% of sample means will be in the range µ ± SE = 100 ± 4.7 = 95.3 to 104.7
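The same arithmetic can be checked with the standard library's NormalDist. This is a sketch of the Wechsler example above, not part of the original slides:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 100, 15, 10          # X ~ N(100, 15), SRS of n = 10
se = sigma / sqrt(n)                # standard error of the mean ≈ 4.7
lo, hi = mu - se, mu + se
print(round(se, 1), round(lo, 1), round(hi, 1))          # 4.7, 95.3, 104.7

# Proportion of sample means within mu ± 1 SE (≈ 68%)
sampling_dist = NormalDist(mu, se)
print(round(sampling_dist.cdf(hi) - sampling_dist.cdf(lo), 4))   # ≈ 0.6827
```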

Page 13: Chapter 8:  Introduction to Statistical Inference
Page 14: Chapter 8:  Introduction to Statistical Inference

Law of Large Numbers

• As a sample gets larger and larger, the sample mean tends to get closer and closer to μ

• This tendency is known as the Law of Large Numbers

This figure shows results from a sampling experiment in a population with μ = 173.3

As n increased, the sample mean became a better reflection of μ = 173.3
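The law of large numbers is easy to see in a running mean. The sketch below reuses a skewed (lognormal) population; the log-scale parameters are an assumption chosen so the population mean is about 173.3, matching the figure:

```python
import numpy as np

rng = np.random.default_rng(1)
# Lognormal draws whose mean is ≈ 173.3 (log-scale parameters are an assumption)
draws = rng.lognormal(mean=np.log(173.3) - 0.015, sigma=np.sqrt(0.03), size=10_000)

# Cumulative (running) sample mean after the first n draws
running_mean = np.cumsum(draws) / np.arange(1, draws.size + 1)
for n in (10, 100, 1_000, 10_000):
    print(n, round(running_mean[n - 1], 1))   # drifts toward ≈ 173.3 as n grows
```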

Page 15: Chapter 8:  Introduction to Statistical Inference

8.3 Sampling Behavior of Counts and Proportions

• Recall (from Ch 6) that a binomial random variable represents the random number of successes (X) in n independent "success/failure" trials; the probability of success for each trial is p

• Notation: X ~ b(n, p)

• The sampling distribution X~b(10,0.2) is shown on the next slide: μ = 2 when the outcome is expressed as a count and μ = 0.2 when the outcome is expressed as a proportion.
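As a small check of the count and proportion scales, the sketch below builds the b(10, 0.2) distribution from first principles (standard-library Python only):

```python
from math import comb

n, p = 10, 0.2
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]   # X ~ b(10, 0.2)

mean_count = sum(k * pk for k, pk in enumerate(pmf))
print(round(mean_count, 3))        # 2.0  -> mu = np when expressed as a count
print(round(mean_count / n, 3))    # 0.2  -> mu = p  when expressed as a proportion
```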

Page 16: Chapter 8:  Introduction to Statistical Inference
Page 17: Chapter 8:  Introduction to Statistical Inference

Normal Approximation to the Binomial

• When n is large, the binomial distribution takes on Normal properties

• How large does the sample have to be to apply the Normal approximation?

• One rule says that the Normal approximation applies when npq ≥ 5

Page 18: Chapter 8:  Introduction to Statistical Inference

Top figure: X ~ b(10, 0.2); npq = 10 ∙ 0.2 ∙ (1 – 0.2) = 1.6 (less than 5) → Normal approximation does not apply

Bottom figure: X ~ b(100, 0.2); npq = 100 ∙ 0.2 ∙ (1 − 0.2) = 16 (greater than 5) → Normal approximation applies

Page 19: Chapter 8:  Introduction to Statistical Inference

Normal Approximation for a Binomial Count

When the Normal approximation applies:

μ = np and σ = √(npq), so X ~ N(np, √(npq))

Page 20: Chapter 8:  Introduction to Statistical Inference

Normal Approximation for a Binomial Proportion

When the Normal approximation applies:

μ = p and σ = √(pq / n), so p̂ ~ N(p, √(pq / n))
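Both approximations can be packaged as small helper functions. This is a sketch (the helper names are my own), shown with the n = 100, p = 0.2 values used in the illustrative example that follows:

```python
from math import sqrt

def normal_approx_count(n: int, p: float) -> tuple[float, float]:
    """Mean and SD of the Normal approximation to a binomial count X."""
    q = 1 - p
    return n * p, sqrt(n * p * q)

def normal_approx_proportion(n: int, p: float) -> tuple[float, float]:
    """Mean and SD of the Normal approximation to a sample proportion p-hat."""
    q = 1 - p
    return p, sqrt(p * q / n)

print(normal_approx_count(100, 0.2))        # ≈ (20.0, 4.0)   -> X ~ N(20, 4)
print(normal_approx_proportion(100, 0.2))   # ≈ (0.2, 0.04)   -> p-hat ~ N(0.2, 0.04)
```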

Page 21: Chapter 8:  Introduction to Statistical Inference

“p-hat” is the symbol for the sample proportion

Page 22: Chapter 8:  Introduction to Statistical Inference

Illustrative Example: Normal Approximation to the Binomial

• Suppose the prevalence of a risk factor in a population is p = 0.2

• Take an SRS of n = 100 from this population

• The number of cases in the sample, X, will vary from sample to sample and follows a binomial distribution with n = 100 and p = 0.2

Page 23: Chapter 8:  Introduction to Statistical Inference

Illustrative Example, cont.

The Normal approximation for the binomial count is:

μ = np = 100 × 0.2 = 20 and σ = √(npq) = √(100 × 0.2 × 0.8) = 4, so X ~ N(20, 4)

The Normal approximation for the binomial proportion is:

μ = p = 0.2 and σ = √(pq / n) = √((0.2 × 0.8) / 100) = 0.04, so p̂ ~ N(0.2, 0.04)

Page 24: Chapter 8:  Introduction to Statistical Inference

Illustrative Example, cont.

1. Statement of the problem: Suppose we see a sample with 30 cases. What is the probability of seeing at least 30 cases under these circumstances, i.e., Pr(X ≥ 30) = ?, assuming X ~ N(20, 4)

2. Standardize: z = (30 – 20) / 4 = 2.5

3. Sketch: next slide

4. Table B: Pr(Z ≥ 2.5) = 0.0062


Page 25: Chapter 8:  Introduction to Statistical Inference

Illustrative Example, cont.

Binomial and superimposed Normal sampling distributions for the problem.

Our Normal approximation suggests that only 0.0062 (about 0.6%) of samples will contain at least this many cases.
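To close the loop, here is a short sketch (standard-library Python, not from the original slides) that reproduces the Normal-approximation tail probability and compares it with the exact binomial tail, which is a bit larger because b(100, 0.2) is discrete and slightly right-skewed:

```python
from math import comb, erf, sqrt

n, p, k = 100, 0.2, 30
mu = n * p                            # 20
sd = sqrt(n * p * (1 - p))            # 4.0

# Normal approximation, as on the slide: z = (30 - 20) / 4 = 2.5
z = (k - mu) / sd
p_normal = 0.5 * (1 - erf(z / sqrt(2)))      # Pr(Z >= 2.5)
print(round(z, 2), round(p_normal, 4))       # 2.5, 0.0062

# Exact binomial tail Pr(X >= 30) for comparison
p_exact = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))
print(round(p_exact, 4))
```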