Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data
Lesson 6: Sampling Methods and the Central Limit Theorem

Date posted: 22-Dec-2015

Page 1: Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson6-1 Lesson 6: Sampling Methods and the Central Limit Theorem.

Lesson6-1 Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data

Lesson 6: Sampling Methods and the Central Limit Theorem

Page 2:

Outline

Point estimate

Why sample the population?

Probability sampling

Choice of sampling method: Sampling straws

Sampling distribution of the sample means

Probability histograms and empirical histograms

Central Limit Theorem

Normal approximation to Binomial

Page 3:

Point Estimates

A point estimate is a single value (a single point) that is used to estimate a population parameter.

Examples of point estimates are the sample mean, the sample standard deviation, the sample variance, and the sample proportion.

Page 4:

Estimating the percentage of Earth covered by water

Experiment:

Paint a dot on your thumb.

Catch the globe and tell me whether the dot on your thumb lands on water.

Estimate the percentage of Earth covered by water by the average of all trials.

Idea: If we draw many observations with replacement, the sample average will approach the population proportion. Code water as 1 and land as 0; the sample average is then an estimate of the proportion of Earth covered by water.

Truth: Water covers 71% of the Earth's surface; see, e.g.,

http://pao.cnmoc.navy.mil/educate/neptune/trivia/earth.htm
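The globe experiment can be mimicked on a computer. The following is a minimal sketch, assuming each catch is an independent draw that lands on water with probability 0.71 (the figure quoted on the slide). The function name `estimate_water_share` and the seed are illustrative choices, not part of the lesson.

```python
import random

def estimate_water_share(n_trials, p_water=0.71, seed=0):
    """Simulate n_trials globe catches; return the share that land on water."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Code water as 1 and land as 0, as on the slide.
    outcomes = [1 if rng.random() < p_water else 0 for _ in range(n_trials)]
    # The sample average of the 0/1 outcomes is the estimated proportion.
    return sum(outcomes) / n_trials

for n in (10, 100, 10_000):
    print(n, estimate_water_share(n))
```

With only a few catches the estimate can be far from 0.71; with many catches it tends to settle close to it.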

Page 5:

Why Sample the Population?

The physical impossibility of checking all items in the population.

The cost of studying all the items in a population.

The sample results are usually adequate.

Contacting the whole population would often be time-consuming.

The destructive nature of certain tests.

Page 6:

Probability Sampling

A probability sample is a sample selected such that each item or person in the population being studied has a known likelihood of being included in the sample.

Page 7:

Methods of Probability Sampling

Simple Random Sample: A sample formulated so that each item or person in the population has the same chance of being included.

Systematic Random Sampling: The items or individuals of the population are arranged in some order. A random starting point is selected and then every k-th member of the population is selected for the sample.

Page 8:

Methods of Probability Sampling

Stratified Random Sampling: A population is first divided into subgroups, called strata, and a sample is selected from each stratum.

Cluster Sampling: A population is first divided into primary units then samples are selected from the primary units.
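The four probability sampling methods can be sketched in a few lines. This is only an illustration under stated assumptions: the population of IDs 1 to 100, the two strata, the five clusters of 20, and all sample sizes are hypothetical choices made for the example.

```python
import random

rng = random.Random(1)                      # illustrative seed
population = list(range(1, 101))            # hypothetical population: IDs 1..100

# Simple random sample: every item has the same chance of inclusion.
simple = rng.sample(population, 10)

# Systematic sample: random starting point, then every k-th item.
k = 10
start = rng.randrange(k)
systematic = population[start::k]

# Stratified sample: divide into strata, then sample within each stratum.
strata = {"low": population[:50], "high": population[50:]}
stratified = [x for group in strata.values() for x in rng.sample(group, 5)]

# Cluster sample: divide into primary units, pick some units at random,
# then sample within the chosen units.
clusters = [population[i:i + 20] for i in range(0, 100, 20)]
chosen = rng.sample(clusters, 2)
cluster_sample = [x for unit in chosen for x in rng.sample(unit, 5)]

print(simple, systematic, stratified, cluster_sample, sep="\n")
```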

Page 9:

Independent identically distributed (iid)

“Random draws from any population, with replacement” are independent identically distributed (iid).

Independent: the probability of drawing the current observation does not depend on what has been drawn previously.

Identically distributed: the probability of drawing any given value is the same on the current draw as on every previous and future draw.

Most of the things covered in this lesson hold even when we do not have iid observations.

Page 10:

Choice of sampling method -- “Sampling Straws”

Choice of sampling method is important.

A “Sampling Straws” experiment illustrates that some sampling methods can produce a biased estimate of a population parameter.

The bag contains a total of 12 straws: 4 are 4 inches long, 4 are 2 inches long, and 4 are 1 inch long.

The population mean length is 2.33 inches (= 4*(1+2+4)/12).

Randomly draw 4 straws one by one with replacement.

Compute the sample mean.

The average of the sample means across experiments is generally larger than 2.33.

Page 11:

Choice of sampling method -- “Sampling Straws”

The sampling scheme is biased because the longer straws have a higher chance of being drawn, even when the draw is haphazard (say, you draw the first straw you touch).

The draw may also fail to be random because we can feel the length of a straw before we pull it out.
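One way to see the bias is to simulate it. The sketch below assumes, purely for illustration, that the chance of touching a straw first is proportional to its length; the lesson does not state the exact mechanism. Under that assumption the long-run average of the sample means is about 3 inches rather than 2.33.

```python
import random
from statistics import mean

straws = [4] * 4 + [2] * 4 + [1] * 4        # lengths in inches, as on the slide
rng = random.Random(2)                      # illustrative seed

def average_sample_mean(weights, n_experiments=20_000, sample_size=4):
    """Average of the sample means when straws are drawn with the given weights."""
    means = [
        mean(rng.choices(straws, weights=weights, k=sample_size))
        for _ in range(n_experiments)
    ]
    return mean(means)

equal_chance = average_sample_mean([1] * 12)   # every straw equally likely
length_biased = average_sample_mean(straws)    # assumed: chance proportional to length

print(round(mean(straws), 2), round(equal_chance, 2), round(length_biased, 2))
```

The equal-chance scheme stays near the population mean, while the length-weighted scheme overshoots it.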

Page 12:

Choice of sampling method -- “Sampling Straws”

Alternative sampling scheme:

Label the straws 1 to 12.

Label 12 identical balls 1 to 12.

Draw four balls with replacement.

Measure the corresponding straws and compute the sample mean.

[Figure: twelve balls numbered 1 to 12]

Page 13:

Choice of sampling method -- “Telephone interview”

Suppose we are interested in estimating the unemployment rate by a phone survey.

1. Interview a group selected based on a random sample of mobile phone numbers.

2. Interview a group selected based on a random sample of residential phone numbers.

3. Interview a group selected based on a random sample of mobile and residential phone numbers.

Will we obtain a good estimate of the population unemployment rate?

Page 14:

Non-Probability Sampling

In a nonprobability sample, whether an observation is included in the sample is based on the judgment of the person selecting the sample.

The sampling error is the difference between a sample statistic and its corresponding population parameter. Sampling error is almost always nonzero.

Page 15:

Sampling Distribution of the Sample Means

The sampling distribution of the sample mean is a probability distribution consisting of all possible sample means of a given sample size selected from a population.

Page 16:

EXAMPLE 1

The law firm of Hoya and Associates has five partners. At their weekly partners meeting each reported the number of hours they billed clients for their services last week.

Partner Hours

1. Dunn 22

2. Hardy 26

3. Kiers 30

4. Malinowski 26

5. Tillman 22

The population mean is 25.2 hours:

$\mu = \frac{22 + 26 + 30 + 26 + 22}{5} = 25.2$

Page 17:

Example 1

If two partners are selected randomly, how many different samples are possible?

This is the combination of 5 objects taken 2 at a time. That is:

${}_5C_2 = \frac{5!}{2!\,(5-2)!} = 10$

There are a total of 10 different samples.

Page 18:

Example 1 continued

Partners Total Mean

1,2 48 24

1,3 52 26

1,4 48 24

1,5 44 22

2,3 56 28

2,4 52 26

2,5 48 24

3,4 56 28

3,5 52 26

4,5 48 24

Page 19:

EXAMPLE 1 continued

Organize the sample means into a frequency distribution.

Sample Mean Frequency Relative Frequency (probability)

22 1 1/10

24 4 4/10

26 3 3/10

28 2 2/10

The mean of the sample means is 25.2 hours:

$\mu_{\bar{X}} = \frac{22(1) + 24(4) + 26(3) + 28(2)}{10} = 25.2$

The mean of the sample means is exactly equal to the population mean.

Page 20:

Example 1

Population variance: $\sigma^2 = [(22-25.2)^2 + (26-25.2)^2 + \dots + (22-25.2)^2]/5 = 8.96$

Variance of the sample means: $[(1)(22-25.2)^2 + (4)(24-25.2)^2 + (3)(26-25.2)^2 + (2)(28-25.2)^2]/(1+4+3+2) = 3.36$

The variance of the sample means is smaller than the population variance: 3.36/8.96 = 0.375 < 1.

Note that this is like sampling without replacement.
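The numbers in Example 1 can be checked by listing all 10 samples. The sketch below uses only the hours from the slide. The last step compares the result with the finite population correction formula, Var = (sigma^2 / n)(N - n)/(N - 1), which is a standard result for sampling without replacement but is not stated in the lesson itself.

```python
from itertools import combinations
from statistics import mean, pvariance

hours = {"Dunn": 22, "Hardy": 26, "Kiers": 30, "Malinowski": 26, "Tillman": 22}
values = list(hours.values())

# All samples of 2 partners drawn without replacement (order ignored).
sample_means = [mean(pair) for pair in combinations(values, 2)]

pop_mean = mean(values)
pop_var = pvariance(values)              # population variance (divide by N)
mean_of_means = mean(sample_means)
var_of_means = pvariance(sample_means)   # variance over the 10 sample means

# Finite population correction for sampling without replacement.
N, n = len(values), 2
fpc_var = pop_var / n * (N - n) / (N - 1)

print(len(sample_means), pop_mean, mean_of_means)
print(pop_var, var_of_means, fpc_var)
```

The ratio 3.36/8.96 = 0.375 equals (1/n)(N - n)/(N - 1) = (1/2)(3/4), which is why it is smaller than the 1/2 that sampling with replacement would give.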

Page 21:

Example

Suppose we had a uniformly distributed population containing equal proportions (hence equally probable instances) of (0,1,2,3,4). If you were to draw a very large number of random samples from this population, each of size n=2, the possible combinations of drawn values and the sums are

Sums Combinations

0 0,0

1 0,1 1,0

2 1,1 2,0 0,2

3 1,2 2,1 3,0 0,3

4 1,3 3,1 2,2 4,0 0,4

5 1,4 4,1 3,2 2,3

6 3,3 4,2 2,4

7 3,4 4,3

8 4,4

Note that this is sampling with replacement.

Page 22:

Example

Population mean = mean of sample means

Population mean: $\mu = (0+1+2+3+4)/5 = 2$

Mean of sample means: $[(1)(0) + (2)(0.5) + \dots + (1)(4)]/25 = 2$

Variance of sample means = population variance / sample size

Population variance: $\sigma^2 = [(0-2)^2 + \dots + (4-2)^2]/5 = 2$

Variance of sample means: $[(1)(0-2)^2 + \dots + (1)(4-2)^2]/25 = 1$

Means Combinations

0.0 0,0

0.5 0,1 1,0

1.0 1,1 2,0 0,2

1.5 1,2 2,1 3,0 0,3

2.0 1,3 3,1 2,2 4,0 0,4

2.5 1,4 4,1 3,2 2,3

3.0 3,3 4,2 2,4

3.5 3,4 4,3

4.0 4,4
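The table of means can be reproduced by enumerating all 25 ordered pairs. This is a small sketch using only the population (0, 1, 2, 3, 4) and the sample size n = 2 from the slide.

```python
from itertools import product
from statistics import mean, pvariance

population = [0, 1, 2, 3, 4]
n = 2

# All ordered samples of size n drawn with replacement (5 * 5 = 25 of them).
sample_means = [mean(s) for s in product(population, repeat=n)]

print(len(sample_means))
print(mean(population), mean(sample_means))            # population mean vs mean of sample means
print(pvariance(population), pvariance(sample_means))  # population variance vs variance of sample means
```

Because the draws are with replacement, the variance of the sample means is exactly the population variance divided by n.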

Page 23:

Probability Histograms

In a probability histogram, the area of each bar represents the chance of a value happening as a result of the random (chance) process.

Empirical histograms (from observed data) for a process converge to the probability histogram.

Page 24:

Examples of empirical histogram

Roll a fair die 50 times and 200 times.

[Figure: two empirical histograms of the die outcomes 1 to 6 (vertical axis: percent, 0 to 30), one for 50 rolls and one for 200 rolls]

The empirical histogram will approach the probability histogram as the number of draws increases.
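The convergence can be illustrated with a short simulation. In this sketch the function name `empirical_histogram`, the seed, and the extra run of 100,000 rolls are illustrative additions; the slide itself shows only 50 and 200 rolls.

```python
import random
from collections import Counter

def empirical_histogram(n_rolls, seed=0):
    """Relative frequency of each face in n_rolls rolls of a fair die."""
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, 7)}

for n_rolls in (50, 200, 100_000):
    hist = empirical_histogram(n_rolls)
    # Largest gap between an observed frequency and the true probability 1/6.
    worst_gap = max(abs(freq - 1 / 6) for freq in hist.values())
    print(n_rolls, round(worst_gap, 4))
```

The largest gap typically shrinks as the number of rolls grows, although it need not shrink at every step.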

Page 25:

Empirical histogram #1

Two balls in the bag: [Figure: the balls in the bag]

Draw 1 ball 1000 times with replacement. Plot a relative frequency histogram (empirical probability histogram).

[Figure: relative frequency histogram; bar height labelled 0.5]

The empirical histogram looks like the population distribution!

What is the probability of getting a red ball in any single draw? 0.5

Page 26:

Empirical histogram #2

5 balls in the bag: [Figure: the balls in the bag]

Draw 1 ball 1000 times with replacement. Plot a relative frequency histogram (empirical probability histogram).

[Figure: relative frequency histogram; bar heights labelled 0.6 and 0.4]

The empirical histogram looks like the population distribution!

What is the probability of getting a red ball in any single draw? 0.6

Page 27:

Empirical histogram #3

5 balls in the bag: 0 1 2 3 4

Draw 1 ball 1000 times with replacement. Plot a relative frequency histogram (empirical probability histogram).

[Figure: relative frequency histogram over the values 0 to 4; vertical axis 0 to 0.2]

The empirical histogram looks like the population distribution!

What is the probability of getting a “three” in any single draw? 0.2

What is the expected value (i.e., population mean) of a single draw?

$0.2(0) + 0.2(1) + \dots + 0.2(4) = 2$

Variance $= 0.2(-2)^2 + 0.2(-1)^2 + \dots + 0.2(2)^2 = 2$

Page 28:

Empirical histogram #3 continued

Means Combinations

0.0 0,0

0.5 0,1 1,0

1.0 1,1 2,0 0,2

1.5 1,2 2,1 3,0 0,3

2.0 1,3 3,1 2,2 4,0 0,4

2.5 1,4 4,1 3,2 2,3

3.0 3,3 4,2 2,4

3.5 3,4 4,3

4.0 4,4

5 balls in the bag: 0 1 2 3 4

Draw 2 balls 1000 times with replacement. Compute the sample mean. Plot a relative frequency histogram (empirical probability histogram) of the 1000 sample means.

All combinations are equally likely.

[Figure: relative frequency histogram of the sample means; horizontal axis 0 to 4 in steps of 0.5, vertical axis 0 to 0.2]

Page 29:

Empirical histogram #3 continued

5 balls in the bag: 0 1 2 3 4

Draw 2 balls 1000 times with replacement. Compute the sample mean. Plot a relative frequency histogram (empirical probability histogram) of the 1000 sample means.

[Figure: relative frequency histogram of the sample means; horizontal axis 0 to 4 in steps of 0.5, vertical axis 0 to 0.2]

What is the probability of getting a sample mean of 2.5 in any single draw? 0.16

What is the expected sample mean of a single draw?

$0.04(0) + 0.08(0.5) + \dots + 0.04(4) = 2$

Variance of sample mean $= 0.04(-2)^2 + 0.08(-1.5)^2 + \dots + 0.04(2)^2 = 1$

Page 30:

Distribution of sample means for different sample sizes and different population distributions

http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/index.html

http://www.kuleuven.ac.be/ucs/java/index.htm and choose basic and distribution of mean.

http://faculty.vassar.edu/lowry/central.html

Page 31:

Central Limit Theorem #1

5 balls in the bag: 0 1 2 3 4

Draw n (n > 30) balls with replacement and compute the sample mean. Repeat 1000 times. Plot a relative frequency histogram (empirical probability histogram) of the 1000 sample means.

The Central Limit Theorem says:

1. The empirical histogram looks like a normal density.

2. Expected value (mean of the normal distribution) = mean of the original population = 2.

3. Variance of the sample means = variance of the original population / n = 2/n.
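The experiment on this slide can be run as a simulation. The sketch below assumes n = 36 (any n > 30 would fit the slide) and uses an arbitrary seed; the check of how many means fall within one standard error is an added rough test of normality, not part of the original slide.

```python
import random
from statistics import mean, pvariance

balls = [0, 1, 2, 3, 4]
rng = random.Random(3)           # illustrative seed

def simulate_sample_means(n, repetitions=1000):
    """Draw n balls with replacement, `repetitions` times; return the sample means."""
    return [mean(rng.choices(balls, k=n)) for _ in range(repetitions)]

n = 36                           # assumed sample size; the slide only requires n > 30
means = simulate_sample_means(n)

print(round(mean(means), 3))                 # compare with the population mean 2
print(round(pvariance(means), 4), 2 / n)     # compare with 2/n

# Share of sample means within one standard error of 2. A normal curve puts
# roughly 68% there; the match is approximate because the means are discrete.
se = (2 / n) ** 0.5
within = sum(abs(m - 2) <= se for m in means) / len(means)
print(round(within, 3))
```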

Page 32:

Central Limit Theorem #2

Some unknown number of numbered balls in the bag: [Figure: balls marked “?”]

We know only that the population mean is $\mu$ and the variance is $\sigma^2$. Draw n (n > 30) balls with replacement and compute the sample mean. Repeat 1000 times. Plot a relative frequency histogram (empirical probability histogram) of the 1000 sample means.

The Central Limit Theorem says:

1. The empirical histogram looks like a normal density.

2. Expected value (mean of the normal distribution) = $\mu$.

3. Variance of the sample means = $\sigma^2/n$.

Page 33:

Confidence interval #1

Some unknown number of numbered balls in the bag: [Figure: balls marked “?”]

We know only that the population mean is $\mu$ and the variance is $\sigma^2$.

The Central Limit Theorem says:

1. The empirical histogram looks like a normal density.

2. Expected value (mean of the normal distribution) = $\mu$.

3. Variance of the sample means = $\sigma^2/n$.

What is the probability that the sample mean of a randomly drawn sample lies within $\mu \pm \sigma/\sqrt{n}$? 0.6826
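The figure 0.6826 can be traced back to the standard normal table. The short derivation below assumes the interval on this slide is one standard error, $\sigma/\sqrt{n}$, either side of $\mu$ (the symbols are partly lost in the transcript), and writes $Z$ for a standard normal variable.

```latex
\begin{align*}
P\!\left(\mu - \frac{\sigma}{\sqrt{n}} \le \bar{X} \le \mu + \frac{\sigma}{\sqrt{n}}\right)
  &= P\!\left(-1 \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le 1\right) \\
  &\approx P(-1 \le Z \le 1) \\
  &= 2(0.3413) = 0.6826
\end{align*}
```

The approximation step is where the Central Limit Theorem is used: the standardized sample mean is treated as standard normal.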

Page 34:

Central Limit Theorem

For a population with a mean $\mu$ and a variance $\sigma^2$, the sampling distribution of the means of all possible samples of size n generated from the population will be approximately normally distributed.

The mean of the sampling distribution is equal to $\mu$ and the variance is equal to $\sigma^2/n$.

The population distribution: $X \sim N(\mu, \sigma^2)$

The sample mean of n observations: $\bar{X}_n \sim N(\mu, \sigma^2/n)$

Page 35:

Central Limit Theorem: Sums

For a large number of random draws, with replacement, the distribution of the sum approximately follows the normal distribution.

The mean of the normal distribution is n * (expected value of one random draw).

The SD for the sum (SE) is $\sqrt{n}\,\sigma$, where $\sigma$ is the SD of one random draw.

This holds even if the underlying population is not normally distributed.

Page 36:

Central Limit Theorem: Averages

For a large number of random draws, with replacement, the distribution of the average = (sum)/n approximately follows the normal distribution.

The mean for this normal distribution is the expected value of one random draw.

The SD for the average (SE) is $\sigma/\sqrt{n}$, where $\sigma$ is the SD of one random draw.

This holds even if the underlying population is not normally distributed.

Page 37:

Law of large numbers

The sample mean converges to the population mean as n gets large.

For a large number of random draws from any population, with replacement, the distribution of the average = (sum)/n approximately follows the normal distribution.

The mean for this normal distribution is the expected value of one random draw.

The SD for the average (SE) is $\sigma/\sqrt{n}$, where $\sigma$ is the SD of one random draw.

The SD for the average tends to zero as n increases.

This holds even if the underlying population is not normally distributed.
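The shrinking standard error can be seen numerically. This sketch reuses the five-ball population (0 to 4, mean 2, variance 2) from the earlier slides as an assumed example; the seed and the sample sizes are arbitrary.

```python
import random

rng = random.Random(4)           # illustrative seed
balls = [0, 1, 2, 3, 4]          # population mean 2, variance 2

def sample_mean(n):
    """Mean of n draws with replacement from the bag."""
    return sum(rng.choices(balls, k=n)) / n

for n in (10, 100, 10_000, 100_000):
    se = (2 / n) ** 0.5          # SD of the average: sigma / sqrt(n)
    print(n, round(sample_mean(n), 4), round(se, 4))
```

As n grows, the printed standard error falls toward zero and the sample mean tends to sit closer to 2.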

Page 38:

Central Limit Theorem Simulation

Page 39:

Effect of Sample Size

Regardless of the underlying population, the larger the sample size, the more nearly normally distributed is the population of all possible sample means.

Page 40:

Central Limit Theorem

If a population follows the normal distribution, the sampling distribution of the sample mean will also follow the normal distribution.

To determine the probability a sample mean falls within a particular region, use:

$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$

Page 41:

Central Limit Theorem

If the population does not follow the normal distribution, but the sample has at least 30 observations, the sample means will approximately follow the normal distribution.

To determine the probability a sample mean falls within a particular region, use:

$z = \frac{\bar{X} - \mu}{s/\sqrt{n}}$

Page 42:

Example 2

Suppose the mean selling price of a gallon of gasoline in the United States is $1.30. Further, assume the distribution is positively skewed, with a standard deviation of $0.28. What is the probability of selecting a sample of 35 gasoline stations and finding the sample mean within $0.08 of the population mean ($1.30)?

Page 43:

Example 2 continued

The first step is to find the z-values corresponding to $1.22 (= 1.30 - 0.08) and $1.38 (= 1.30 + 0.08). These are the two points within $0.08 of the population mean.

$z = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{\$1.38 - \$1.30}{\$0.28/\sqrt{35}} = 1.69$

$z = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{\$1.22 - \$1.30}{\$0.28/\sqrt{35}} = -1.69$

Page 44:

Example 2 continued

Next we determine the probability of a z-value between -1.69 and 1.69. It is:

$P(-1.69 \le z \le 1.69) = 2(0.4545) = 0.9090$

We would expect about 91 percent of the sample means to be within $0.08 of the population mean.
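The calculation in Example 2 can be reproduced without a z-table. The sketch below uses the values from the example and builds the standard normal CDF from `math.erf`; the helper name `normal_cdf` is an illustrative choice. Because it does not round z to two decimals, the result may differ from the table value in the fourth decimal place.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, s, n = 1.30, 0.28, 35        # values from Example 2
se = s / sqrt(n)                 # standard error of the sample mean

z_low = (1.22 - mu) / se
z_high = (1.38 - mu) / se
prob = normal_cdf(z_high) - normal_cdf(z_low)

print(round(z_low, 2), round(z_high, 2), round(prob, 4))
```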

Page 45:

Sampling Distribution of Sample Proportion

If a random sample of size n is taken from a population, then the sampling distribution of the sample proportion $\hat{p}$:

Is approximately normal, if n is large.

Has mean $\mu_{\hat{p}} = p$.

Has standard deviation $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$.

It is approximately normal because the sample proportion is a simple average of zeros and ones from different trials.

Page 46:

The Normal Approximation to the Binomial revisited

The normal distribution (a continuous distribution) yields a good approximation of the binomial distribution (a discrete distribution) for large values of n.

The normal probability distribution is generally a good approximation to the binomial probability distribution when $n\pi$ and $n(1-\pi)$ are both greater than 5, where $\pi$ is the probability of success on each trial.

Recall for the binomial experiment:

There are only two mutually exclusive outcomes (success or failure) on each trial.

A binomial distribution results from counting the number of successes.

Each trial is independent.

The probability is fixed from trial to trial, and the number of trials n is also fixed.

Independent trials with a fixed probability of success are iid.

Page 47:

The Normal Approximation to the Binomial revisited

Recoding: Failure as 0 and success as 1.

x/n is simply the proportion of successes and hence the simple average of the outcomes from the n trials.

x/n will be approximately normal according to CLT.

Hence x (=n*x/n) will also be approximately normal according to CLT.
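The quality of the approximation can be checked against the exact binomial probability. In the sketch below the values n = 50, p = 0.3 and k = 18 are illustrative, chosen so that n*p and n*(1-p) both exceed 5. The continuity correction (adding 0.5) is a common refinement and is an addition here, not something stated on these slides.

```python
from math import comb, erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k), with a continuity correction."""
    mean = n * p                      # mean of the number of successes
    sd = sqrt(n * p * (1 - p))        # SD of the number of successes
    return normal_cdf((k + 0.5 - mean) / sd)

n, p, k = 50, 0.3, 18                 # illustrative values: n*p = 15, n*(1-p) = 35
exact = binomial_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print(round(exact, 4), round(approx, 4))
```

For these values the two probabilities agree to about two decimal places; the agreement generally worsens when n*p or n*(1-p) is small.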

Page 48:

- END -