STAT 110 - Section 5 Lecture 7 Professor Hao Wang University of South Carolina Spring 2012


STAT 110 - Section 5 Lecture 7

Professor Hao Wang, University of South Carolina

Spring 2012

Last time: Picturing Bias and Variability

Last time: Margin of Error

The CNN Poll interviewed 1000 people. The approval rating was 57%. What is the margin of error for 95% confidence (using the quick formula)?

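Answer: Recall that for 95% confidence the quick formula is MOE = 1/√n, where n is the sample size. Here MOE = 1/√1000 ≈ 0.032, or about 3%.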

Margin of Error (continued)

Use MOE to calculate an interval that we think includes the parameter

Form for most confidence intervals: estimate ± margin of error

Approximate (because we’re using the quick MOE) 95% confidence interval for p: p̂ ± 1/√n, where p̂ is the sample proportion and n is the sample size

Confidence Interval

Confidence Statements

A confidence statement interprets a confidence interval and has two parts: a margin of error and a level of confidence.

Margin of error says how close the statistic lies to the parameter.

Level of confidence says what percentage of all possible samples result in a confidence interval which contains the true parameter.

Example: President Bush

Pre 9/11: 57% with MOE 3%

Post 9/11: 90% with MOE 3%

Interpretations

– We are 95% confident that the percent of all Americans who approved of the job President Bush was doing was between 54% and 60% before 9/11.

– We are 95% confident that the percent of all Americans who approved of the job President Bush was doing was between 87% and 93% after 9/11.

Example: College Education

This May 2011 survey finds that 57% of the 2142 adult Americans polled think that “the higher education system in the United States fails to provide students good value for the money they and their families spend”. Using the quick formula for MOE, compute a 95% confidence interval for p.
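The arithmetic for this example can be sketched in a few lines of Python, assuming the quick formula MOE = 1/√n from earlier in the lecture. The function name quick_ci is made up for illustration.

```python
import math

def quick_ci(p_hat, n):
    """Approximate 95% confidence interval using the quick MOE, 1/sqrt(n)."""
    moe = 1 / math.sqrt(n)
    return p_hat - moe, p_hat + moe, moe

# College education survey: 57% of 2142 adults
low, high, moe = quick_ci(0.57, 2142)
print(f"MOE = {moe:.1%}, interval = {low:.1%} to {high:.1%}")
```

With these numbers the margin of error is about 2.2%, so the interval runs from roughly 54.8% to 59.2%.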

Example: Coke or Pepsi

Suppose you take a sample of 1231 people and ask them if they prefer Coke over Pepsi. You find that 696 say they do.

What is p̂, the observed percent from the sample?

A .725 = 72.5%

B .565 = 56.5%

C .029 = 2.9%

D .038 = 3.8%

Example: Coke or Pepsi (continued)

Suppose you take a sample of 1231 people and ask them if they prefer Coke over Pepsi. You find that 696 say they do.

What is the margin of error for 95% confidence?

A square root of 1231 = 35.09 = 35.09%

B square root of 696 = 26.38 = 26.38%

C 1/square root of 1231 = 0.0285 = 2.85%

D 1/square root of 696 = 0.0379 = 3.79%
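Both Coke or Pepsi questions can be checked with a short calculation. This is a sketch that assumes the quick formula 1/√n for the 95% margin of error; the variable names are chosen here for illustration.

```python
import math

n = 1231            # sample size
prefer_coke = 696   # respondents who said they prefer Coke

p_hat = prefer_coke / n    # sample proportion
moe = 1 / math.sqrt(n)     # quick 95% margin of error

print(f"p-hat = {p_hat:.3f}, MOE = {moe:.4f}")
```

Note that the margin of error uses the full sample size (1231), not the number who answered yes (696).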

Hints for Interpretation

The conclusion of a confidence statement always applies to the population, not to the sample.

Our conclusion about the population is never completely certain.

If you want a smaller margin of error with the same confidence, take a larger sample.

Hints for Interpretation

It is very common to report the margin of error for 95% confidence.

– If the level of confidence is not mentioned, assume 95% confidence.

You can choose to use a confidence level other than 95%.

– Other popular levels: 80%, 90%, 99%

– For a fixed sample size, if you increase the level of confidence, your interval will become wider.

– For a fixed confidence level, if you increase sample size, your interval will become narrower.
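Both effects can be illustrated numerically. The sketch below uses the quick formula 1/√n for the sample-size effect. For the confidence-level effect it uses the standard normal multiplier z*, which goes beyond the quick formula in this lecture and is brought in here as an assumption; the function names are made up for illustration.

```python
import math
from statistics import NormalDist

def quick_moe(n):
    """Quick 95% margin of error for a sample of size n."""
    return 1 / math.sqrt(n)

def z_star(level):
    """Two-sided standard normal multiplier for a given confidence level."""
    return NormalDist().inv_cdf(0.5 + level / 2)

# Larger samples give a smaller margin of error (narrower interval).
for n in (100, 400, 1600):
    print(f"n = {n:5d}  quick MOE = {quick_moe(n):.3f}")

# Higher confidence needs a larger multiplier (wider interval).
for level in (0.80, 0.90, 0.95, 0.99):
    print(f"level = {level:.0%}  z* = {z_star(level):.3f}")
```

Quadrupling the sample size halves the quick margin of error, which is why large gains in precision are expensive.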

Population Size Doesn’t Matter

The variability of a statistic from an SRS does not depend on the size of the population as long as the population is at least 100 times larger than the sample.

Example: Population Size Doesn’t Matter

Suppose we take a sample of size 1000 from a population of 4,000,000 (e.g., South Carolina). Then we take a sample of 1000 from a population of 300,000,000 (e.g., the whole US). Which sample statistic would have more variability (i.e., MOE)?

A. The one from 4,000,000

B. The one from 300,000,000

C. They are the same.
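One way to see why population size barely matters here is the finite population correction, a standard adjustment factor that is not on the slide and is brought in as an assumption. It multiplies the variability of a statistic from an SRS; a value near 1 means the population size has almost no effect. The function name fpc is made up for illustration.

```python
import math

def fpc(population_size, sample_size):
    """Finite population correction factor for an SRS."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))

# Same sample size, very different population sizes
for N in (4_000_000, 300_000_000):
    print(f"N = {N:>11,}  correction factor = {fpc(N, 1000):.6f}")
```

For both populations the factor is within a small fraction of a percent of 1, so the two samples have essentially the same variability.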

Chapter 4 – Sample Surveys in the Real World

Types of errors:

1. Sampling Errors

a. Random Sampling Error

b. Bad Sampling Methods

2. Non-sampling Errors

a. Processing errors

b. Poorly worded questions

c. Response error

d. Non-Response

Chapter 4 – Sample Surveys in the Real World

sampling errors – errors caused by the act of taking a sample

They cause sample results to be different from the results of a census.

sampling frame – a list of individuals from which we will draw our sample; it should list every individual in the population

Errors in Sampling

random sampling error – results from chance selection in the simple random sample

• MOE lets us calculate how serious the error is.

• The error is due to chance – always present. A large sample helps control this.

• MOE includes only random sampling error.

• Most sample surveys are afflicted with errors other than random sampling errors.

Errors in Sampling

bad sampling method – using a convenience sample or a voluntary response sample; this is also a form of sampling error

Voluntary sample

Convenience sample

undercoverage – occurs when some groups in the population are left out of the process of choosing the sample

nonsampling errors – errors not related to the act of selecting a sample from the population

can even be present in a census

• nonresponse (missing data)

• response errors

• processing errors

• effects of data collection procedure

Example

The subject lies about past drug use.

A. Sampling Error: Bad Sampling Method

B. Non Sampling Error: Response Error

C. Non Sampling Error: Non Response Error

D. Non Sampling Error: Processing Error

The subject cannot be contacted after five calls.

A. Sampling Error: Bad Sampling Method

B. Non Sampling Error: Response Error

C. Non Sampling Error: Non Response Error

D. Non Sampling Error: Processing Error

Example

Interviewers choose people on the street to interview.

A. Sampling Error: Bad Sampling Method

B. Non Sampling Error: Response Error

C. Non Sampling Error: Non Response Error

D. Non Sampling Error: Processing Error

Example

Consider Wording

Be aware that the wording of a question influences the answers.

Examples:

Is our government providing too much money for welfare programs?

– 44% said “yes”

Is our government providing too much money for assistance to the poor?

– 13% said “yes”

More Complex Sample Designs

• Sometimes a strict simple random sample is difficult to obtain.

- Multistage Sampling Design

- Cluster Sampling

- Systematic Sampling

- Stratified Random Sampling

Stratified Random Sample

• Step 1: Divide the sampling frame into distinct groups of individuals, called strata.

– Choose strata because you have an interest in the groups or because the individuals within each group are similar.

– Example: graduate/undergraduate students

• Step 2: Take a separate SRS in each stratum and combine these to make up the complete sample.

Stratified Random Sample. A club has 25 student members and 10 faculty members. The club can send 4 students and 2 faculty members to a convention.

Students:
01 Barrett   06 Frazier     11 Hu            16 Liu        21 Ren
02 Brady     07 Gibellato   12 Jimenez       17 Marin      22 Santos
03 Chen      08 Gulati      13 Katsaounis    18 Nemeth     23 Sroka
04 Draper    09 Han         14 Kim           19 O’Rourke   24 Tordoff
05 Duncan    10 Hostetler   15 Kohlschmidt   20 Paul       25 Wang

Faculty:
0 Berliner    2 Dean      4 Goel   6 Moore   8 Stasney
1 Craigmile   3 Fligner   5 Lee    7 Pearl   9 Wolfe

Line 116: 14459 26056 31424 80371 65103 62253 50490 61181

Choose a stratified random sample of 4 students, then of 2 faculty.
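The selection can be traced step by step with a short script. This is a sketch under stated assumptions: students are read as two-digit labels (01 to 25) from the start of line 116, labels outside the range and repeats are skipped, and the faculty are then read as one-digit labels (0 to 9) continuing from where the student selection stopped. The helper name pick is made up for illustration.

```python
STUDENTS = ["Barrett", "Brady", "Chen", "Draper", "Duncan", "Frazier",
            "Gibellato", "Gulati", "Han", "Hostetler", "Hu", "Jimenez",
            "Katsaounis", "Kim", "Kohlschmidt", "Liu", "Marin", "Nemeth",
            "O'Rourke", "Paul", "Ren", "Santos", "Sroka", "Tordoff", "Wang"]
FACULTY = ["Berliner", "Craigmile", "Dean", "Fligner", "Goel",
           "Lee", "Moore", "Pearl", "Stasney", "Wolfe"]

# Line 116 of the random digit table, with the spaces removed
digits = "14459 26056 31424 80371 65103 62253 50490 61181".replace(" ", "")

def pick(digits, start, width, valid, count):
    """Read fixed-width labels from `start`; keep valid labels not yet chosen."""
    chosen, pos = [], start
    while len(chosen) < count and pos + width <= len(digits):
        label = int(digits[pos:pos + width])
        pos += width
        if label in valid and label not in chosen:
            chosen.append(label)
    return chosen, pos

# Stratum 1: students use labels 01-25 (list index is label - 1)
student_labels, pos = pick(digits, 0, 2, range(1, 26), 4)
# Stratum 2: faculty use labels 0-9, continuing where we left off (assumption)
faculty_labels, _ = pick(digits, pos, 1, range(0, 10), 2)

student_names = [STUDENTS[label - 1] for label in student_labels]
faculty_names = [FACULTY[label] for label in faculty_labels]
print(student_names, faculty_names)
```

Each stratum gets its own SRS, and the two are combined to form the delegation of six.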

Cluster Sampling

• In order to reduce costs in sampling, researchers focus on efficiency by sampling from clusters

• Clusters are often formed by geographic location, resulting in decreased travel costs for the research company.

• Randomly sample clusters, then survey everyone in each cluster.

Cluster Sample - Divide population into clusters.

Select one or more clusters and include everyone in those clusters in the sample.

• Example: SC has 46 counties. Select 5 counties at random, and use all households in each selected county as the sample.

• Example: USC has 5000 dorms. Select 100 dorms at random, and use all students in each selected dorm as the sample.
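The county example can be sketched as follows. The household lists are hypothetical placeholders (three per county) made up so the code runs, and the seed is fixed only to make the run repeatable.

```python
import random

rng = random.Random(110)  # fixed seed, chosen arbitrarily for repeatability

# Clusters: the 46 counties, labeled 1 to 46
counties = list(range(1, 47))

# Hypothetical data: three placeholder households per county
households = {c: [f"county{c}-household{h}" for h in range(1, 4)]
              for c in counties}

# Step 1: choose 5 clusters at random
chosen = rng.sample(counties, 5)

# Step 2: include every household in each chosen cluster
sample = [h for c in chosen for h in households[c]]
print(chosen, len(sample))
```

The randomness is applied to the clusters, not to the individuals; once a county is chosen, everyone in it is included.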

Want to find the opinions of US adults, but want to save on time and money by randomly selecting residences. All adults residing in a sampled residence will be interviewed.

A. Stratified B. Cluster C. Both

• Want to find the opinions of US adults and need to make sure that 3 specific religious groups are represented. You sample 100 Christians, 100 Jewish, and 100 Muslims.

A. Stratified B. Cluster C. Both

• Want to find the opinions of city dwelling US adults and need to make sure that the east and west coasts are represented. You send 5 interviewers to the east coast and 5 to the west coast. 5 City blocks are chosen at random. Everyone living in a chosen city block is interviewed. (similarly for the east coast)

A. Stratified B. Cluster C. Both

Questions to Ask Before You Believe a Poll

• Who carried out the survey?

• What was the population?

• How was the sample selected?

• How large was the sample?

• What was the margin of error?

• What was the response rate?

• How were the subjects contacted?

• When was the survey conducted?

• What questions were asked?

USC has 20,065 undergraduates and 7,423 graduate students. In an effort to gauge the opinions of all students on campus parking issues, a simple random sample consisting of 201 undergraduates and a simple random sample of 74 graduate students are taken. This is an example of:

A – a cluster sample

B – a systematic sample

C – a stratified random sample

D – undercoverage

USC has 20,065 undergraduates and 7,423 graduate students. In an effort to gauge the opinions of all students on campus parking issues, a simple random sample consisting of 201 undergraduates is taken. This is an example of:

A – a cluster sample

B – a systematic sample

C – a stratified random sample

D – undercoverage
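A quick arithmetic check on the two USC scenarios above may help. The sketch below computes what fraction of each group is sampled when both groups are included, and what share of the student body is left out when only undergraduates are sampled. The variable names are chosen here for illustration.

```python
undergrads, grads = 20_065, 7_423
sampled_undergrads, sampled_grads = 201, 74

# Scenario 1: a separate SRS from each group
undergrad_fraction = sampled_undergrads / undergrads
grad_fraction = sampled_grads / grads

# Scenario 2: only undergraduates are sampled, so graduate students are left out
uncovered_share = grads / (undergrads + grads)

print(f"undergraduates sampled: {undergrad_fraction:.2%}")
print(f"graduate students sampled: {grad_fraction:.2%}")
print(f"share of students left out in scenario 2: {uncovered_share:.1%}")
```

In the first scenario each group is sampled at about 1%, so the sample sizes are proportional to the group sizes. In the second, about 27% of students have no chance of being selected.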