Chapter 12 – Sample Surveys


Vocabulary

Population - the entire group of individuals or instances about which we hope to learn

Sample - a representative subset of a population

Sample Survey - a descriptive study that asks questions of a sample in hope of learning something about the whole population

Bias - any systematic failure of a sampling method to accurately represent the population

Randomization - process by which each individual is given a fair and equal chance of selection for the sample (Best Defense Against Bias)

Sample size - the number of individuals in a sample


More Vocabulary

Census - a sample that consists of the entire population

Population parameter - a numerically valued attribute of a model of a population (What? - you’ll see later)

Sample statistic - a term for statistics that parallel a parameter (better definition later)

Representative - a sample is said to be representative of a population if it accurately reflects the population

Sampling frame - a list of individuals from which the sample is drawn

Sampling variability - the natural tendency of randomly drawn samples to differ from one another


A Quick Note

Not all the vocabulary for this chapter was listed on the previous slides

Some of the vocabulary is better understood when it is taken in the context of the chapter

And now the chapter…


Understanding Samples

Samples are used to “stretch” beyond the data at hand to the entire world or group at large

There are three necessary ideas in order to make this “stretch” or draw this conclusion (ERS)

• Examine a part of the whole
• Randomize
• Sample size


STEP 1. Examining a part of the whole

Researchers often want to know about an entire population but surveying an entire population is often impractical or impossible, therefore …

Researchers often settle for creating a representative sub-group or “sample” from the population


Sample Surveys

Sample Surveys - Designed to ask questions of a small group of people in order to learn something about the entire population

Sample surveys are everywhere

• National polls
• Newspaper polls
• Internet electronic polls


Sample Surveys

How can sample surveys truly represent a population?

In order to understand this, let’s look first at a failed sample survey

In 1936, the Literary Digest magazine held a mock election poll with its readers.

The magazine used telephone numbers in order to select a sample of the population.

According to the survey, Alf Landon received 57% of the votes, beating F.D. Roosevelt (43%) in a landslide.

When the real election was held, FDR won with about 61% of the popular vote to Landon’s roughly 37%.


What Went Wrong?

The magazine used a “biased” sample.

In 1936, the telephone was a luxury afforded only by the affluent, so the sample inadvertently was composed mostly of wealthy individuals.

Roosevelt was extremely popular among the less affluent; therefore, the sample used under-represented FDR’s support.

How can researchers eliminate biased representation in samples?

The best strategy is ….


STEP 2. Randomize

Randomization is the best statistical weapon against sampling bias. Why?

It protects researchers by making sure that, on average, the sample looks like the rest of the population.

Populations have various features that may influence the validity of the findings, sometimes even features that researchers haven’t thought about. Randomization accounts for this by giving everyone an equal chance of selection and representation in the sample.

It also allows researchers to make inferences from their sample to the population from which it was drawn, because on average the sample represents the population accurately.
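As a rough illustration of why randomizing helps, the Python sketch below builds a made-up population in which phone owners support FDR less often than everyone else, then compares a random sample with a sample drawn only from phone owners. The population size, the 25% phone ownership rate, and both support rates are invented for illustration; they are not 1936 data.

```python
import random

def build_population(size=100_000, seed=1):
    """Return a list of (owns_phone, supports_fdr) pairs for a made-up population."""
    rng = random.Random(seed)
    population = []
    for _ in range(size):
        owns_phone = rng.random() < 0.25           # assumed: 25% own a phone
        # assumed: phone owners support FDR less often than non-owners
        support_rate = 0.40 if owns_phone else 0.68
        population.append((owns_phone, rng.random() < support_rate))
    return population

def support_share(people):
    """Proportion of the given people who support FDR."""
    return sum(supports for _, supports in people) / len(people)

population = build_population()
rng = random.Random(2)

# Random sample: every individual has the same chance of selection.
random_sample = rng.sample(population, 1000)

# Biased sample: drawn only from phone owners.
phone_owners = [person for person in population if person[0]]
biased_sample = rng.sample(phone_owners, 1000)

true_share = support_share(population)
random_share = support_share(random_sample)
biased_share = support_share(biased_sample)

print(f"whole population: {true_share:.3f}")
print(f"random sample:    {random_share:.3f}")
print(f"phone owners:     {biased_share:.3f}")
```

Under these assumptions the random sample should land close to the population value, while the phone-owner sample stays well below it no matter how many people are sampled.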


STEP 3. Sample Size

The fraction of the population that you’ve sampled does NOT matter (as long as the population is much larger than the sample)! The only thing that is important is the sample size itself!

Samples need to be representative

In order to estimate the proportion of a population that falls into a category, it is necessary to see enough respondents in each category to say anything precise enough to be useful (usually several hundred respondents).
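One way to check the claim about sample size is with the standard formula for the spread of a sample proportion under simple random sampling without replacement, sqrt(p(1-p)/n) multiplied by the finite population correction sqrt((N-n)/(N-1)). The sketch below is only an illustration: the function name and the value p = 0.5 are assumptions chosen for the example.

```python
import math

def sd_of_sample_proportion(p, n, population_size):
    """Standard deviation of a sample proportion for an SRS drawn without replacement.

    Combines sqrt(p(1-p)/n) with the finite population correction (N-n)/(N-1).
    """
    fpc = (population_size - n) / (population_size - 1)
    return math.sqrt(p * (1 - p) / n * fpc)

p = 0.5  # assumed true proportion, chosen only for illustration

# Same sample size, very different population sizes.
for population_size in (100_000, 10_000_000, 300_000_000):
    sd = sd_of_sample_proportion(p, 1000, population_size)
    print(f"N={population_size:>11,}  n= 1,000  sd={sd:.4f}")

# Same population, different sample sizes.
for n in (100, 1000, 10_000):
    sd = sd_of_sample_proportion(p, n, 10_000_000)
    print(f"N= 10,000,000  n={n:>6,}  sd={sd:.4f}")
```

The first loop varies the population size by a factor of 3,000 and the spread barely moves; the second loop varies only the sample size and the spread changes substantially.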


Census

Hey! Wouldn’t it be easier to just survey the whole population? Then there is no need to worry about any of the sampling stuff. Right??

NO! A census may appear to really represent the population, but it may actually not, for three main reasons.


A Census Doesn’t Make Sense

1) It can be difficult to complete a census

• There are always some individuals who are hard to locate (e.g., homeless) or hard to survey (e.g., people with limited ability to communicate)

2) Populations are always changing

• Deaths and births are constantly happening and constantly changing the population
• By the time the census is completed, an event could have changed everyone’s opinion regarding the questions in the census.

3) A census is more complicated than a survey

• Censuses often require a team effort and the help of the population being surveyed.


Populations and Parameters

Population parameter - a numerical value that is part of a model of the population (for example, the population mean)

Sample Statistics

• Statistics - computations from the data that describe the sample
• Summary statistics - computations from the data that estimate or refer to the population parameters

Let’s meet some new and old parameters.


The Parameters

• Mean - ( μ )
• Standard Deviation - ( σ )
• Correlation - ( ρ )
• Regression coefficient - ( β )
• Proportion - ( p )
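Each of these parameters has a matching statistic computed from a sample, conventionally written ȳ, s, r, b, and p̂. The sketch below computes all five for a tiny invented data set (hours studied and test scores for six students); the data and variable names are assumptions made up for illustration only.

```python
import statistics

# Hypothetical sample: hours studied (x) and test score (y) for 6 students.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 60, 61, 70, 74, 85]
n = len(scores)

mean_y = statistics.mean(scores)    # sample mean, estimates the population mean
sd_y = statistics.stdev(scores)     # sample standard deviation (divides by n - 1)
mean_x = statistics.mean(hours)
sd_x = statistics.stdev(hours)

# Sample correlation r: sum of products of z-scores, divided by n - 1.
r = sum(
    ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
    for x, y in zip(hours, scores)
) / (n - 1)

# Sample regression slope b for predicting score from hours.
b = r * sd_y / sd_x

# Sample proportion p-hat: share of students scoring 70 or higher.
p_hat = sum(score >= 70 for score in scores) / n

print(f"mean={mean_y}, sd={sd_y:.2f}, r={r:.3f}, b={b:.3f}, p_hat={p_hat}")
```

A different sample of six students would give different values for each statistic, while the population parameters they estimate stay fixed.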


Simple Random Samples

A Simple Random Sample (SRS) gives each combination of people within the population an equal chance to be selected for the survey.

How is this done?


Simple Random Sampling

To select a sample at random we must first define a sampling frame.

A sampling frame is a list of individuals from which the sample is drawn. (must be precise)

• Sampling frames allow us to draw random samples from large groups.
• Within the sampling frame we are able to select random members that will represent the entire sampling frame accurately.

However, when we draw a sample at random, each sample will be different. We call these sample-to-sample differences sampling variability.
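A minimal sketch of drawing an SRS from a sampling frame, and of the sampling variability that results, might look like the following. The frame of 500 ID numbers and the helper name `draw_srs` are hypothetical, and the mean of the ID numbers simply stands in for any numeric survey response.

```python
import random

# Hypothetical sampling frame: a list of 500 student ID numbers.
sampling_frame = list(range(1, 501))

def draw_srs(frame, n, rng):
    """Draw a simple random sample of size n without replacement."""
    return rng.sample(frame, n)

rng = random.Random(12)  # fixed seed so the run is repeatable

# Draw several samples of the same size and record each sample's mean.
sample_means = []
for _ in range(5):
    sample = draw_srs(sampling_frame, 25, rng)
    sample_means.append(sum(sample) / len(sample))

frame_mean = sum(sampling_frame) / len(sampling_frame)
print("frame mean:  ", frame_mean)
print("sample means:", sample_means)
```

Because `random.Random.sample` chooses without replacement and treats every set of 25 IDs as equally likely, each draw is an SRS; the five sample means differ from one another, which is sampling variability in action.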


Other Sampling Designs

All statistical sampling designs have the common idea that chance, not human choice, is used to select the sample.

Besides SRS, there are three other main sampling designs

• Stratified Random Sampling
• Cluster Sampling
• Multistage Sampling


Other Sampling Designs

Stratified Random Sampling

• Used when a population is already broken up into strata, or homogeneous groups.
• Within each stratum, SRS is used.

Cluster Sampling

• Used when a population is already broken into groups (clusters) that each resemble the population as a whole, BUT only one or a few groups are going to be surveyed.
• The clusters to survey are chosen at random, and everyone in a chosen cluster is surveyed.
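To make the contrast concrete, here is a small sketch of both designs in their usual textbook form: stratified sampling takes an SRS inside every stratum, while cluster sampling picks whole groups at random and surveys everyone in them. The school, grade levels, homerooms, and function names are all hypothetical.

```python
import random

def stratified_sample(strata, n_per_stratum, rng):
    """Take an SRS of n_per_stratum individuals from every stratum."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Pick n_clusters clusters at random and include everyone in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        sample.extend(clusters[name])
    return sample

# Hypothetical school: strata are grade levels, clusters are homerooms.
strata = {
    grade: [f"grade{grade}-student{i}" for i in range(100)]
    for grade in (9, 10, 11, 12)
}
clusters = {
    f"room{r}": [f"room{r}-student{i}" for i in range(20)]
    for r in range(20)
}

rng = random.Random(7)
stratified = stratified_sample(strata, 10, rng)   # 10 students from each grade
clustered = cluster_sample(clusters, 2, rng)      # every student in 2 homerooms
print(len(stratified), len(clustered))
```

Stratifying guarantees that every grade appears in the sample, whereas the cluster sample depends entirely on which homerooms happen to be chosen, so it works best when each homeroom looks like a miniature version of the school.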


Other Sampling Designs

The final and most common design is... Multistage Sampling

Multistage sampling utilizes more than one method of sampling. It refers to complex sampling schemes that combine several sampling methods.

E.g. - randomly choosing a few schools (cluster sampling), then taking an SRS of students within each chosen school.


Systematic Samples

Systematic Sampling

• A list of population members is prepared and every nth name is selected, beginning from a randomly selected point, until the sample size is reached
• Can be used when there is no reason to believe that the order of the list in the sampling frame is related to the answers sought

• E.g., if the list is alphabetical and you’re asking a question about a political subject, a systematic sampling method could be choosing every tenth name until your sample size is reached.
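The procedure can be sketched in a few lines of Python. This is one possible reading of the slide, assuming the interval is the frame length divided by the sample size (rounded down) and the random starting point falls within the first interval; the function name and the frame of 200 placeholder names are hypothetical.

```python
import random

def systematic_sample(frame, sample_size, rng):
    """Select every k-th entry of frame, starting from a random position in the first k."""
    step = len(frame) // sample_size       # sampling interval k
    if step < 1:
        raise ValueError("sample_size cannot exceed the length of the frame")
    start = rng.randrange(step)            # random starting index in [0, k)
    return [frame[start + i * step] for i in range(sample_size)]

# Hypothetical alphabetical list of 200 names.
frame = [f"name{i:03d}" for i in range(200)]

sample = systematic_sample(frame, 20, random.Random(3))
print(sample[:3])
```

With 200 names and a sample size of 20, the interval is 10, so this picks every tenth name, matching the example on the slide.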


Sampling Badly

Many of the most convenient forms of sampling can be extremely biased.

There are four main problems or sample types that can cause bad samples

• Voluntary response sample
• Convenience sample
• Bad sampling frame
• Undercoverage


Voluntary response sample

Voluntary response sample - In this approach, a large group of individuals is eligible to participate, but only those who choose to respond to the survey are counted.

Why is this bad?

Leads to bias because only those who care strongly enough about the survey will respond; therefore, the results from the sample are not representative of the entire population
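A small simulation can show how far voluntary response can pull a result. In this sketch the population, the 60% true support, and the response rates (opponents reply 30% of the time, supporters 5%) are all invented assumptions, chosen only to make the effect visible.

```python
import random

rng = random.Random(5)

# Made-up population: about 60% support a proposal (True), the rest oppose it.
population = [rng.random() < 0.60 for _ in range(50_000)]

def responds(supports, rng):
    """Assumed behavior: opponents feel strongly and reply far more often."""
    return rng.random() < (0.05 if supports else 0.30)

# Only people who choose to reply end up in the sample.
respondents = [supports for supports in population if responds(supports, rng)]

true_support = sum(population) / len(population)
reported_support = sum(respondents) / len(respondents)

print(f"true support:     {true_support:.3f}")
print(f"reported support: {reported_support:.3f}")
print(f"responses:        {len(respondents)}")
```

Even with thousands of responses, the reported support sits far below the true value, because who responds is tied to what they think. Collecting more voluntary responses does not fix this.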


Convenience Sample

Convenience sample - Only those individuals who are at hand are included

Why is this bad?

Leads to bias in response because the people at hand often have a common tie and are not representative of the whole population


Bad Sampling Frame

Bad Sampling Frame - Even in an SRS survey, people can often be excluded from the sampling frame

Why is this bad?

Gives us an incomplete picture or representation of the population

• Remember the Roosevelt poll on an earlier slide?
– The results were biased because most of the poor people in America were not included in the sampling frame (e.g., they didn’t have a telephone, so they couldn’t be selected, yet they could and did vote).


Undercoverage

Undercoverage - Refers to the scenario in which a portion of the population is not represented or has a smaller representation in the sample than it has in the population

Why is this bad?

It doesn’t allow for an accurate representation of the population; therefore, no accurate predictions or inferences can be made from the data.


What can go wrong (BIAS)

Nonresponse bias - Nonresponse to surveys can be a source of bias because those who do not respond to a survey could differ from those who do.

To prevent this bias:

• Don’t bore people with long surveys
• Don’t send out a lot of surveys; send out fewer random surveys in settings where you can ensure a high response level


What can go wrong (BIAS)

Response Bias - Refers to a bias brought about by survey questions that influence responses

One common source of this influence is the “leading question”

In leading questions, the surveyor uses influential words to “lead” a person to a certain answer. E.g.

• Do you think that the evil companies who destroy animals’ habitats should be allowed to continue destroying the rain forest when they harvest trees? - biased

• Do you think companies should be allowed to harvest trees from the rain forest? - not biased


Rules for Eliminating Bias in Your Surveys

Look for bias in any survey you encounter

There is no way to recover from a biased sample or a survey that asks biased questions. All of your data becomes useless when you have a biased question included in your survey!

Spend your time and resources reducing biases

If possible, test your survey before you use it

Always report your sampling methods in detail


Practice Problems

Let’s try a problem! (#3 pg. 243)

Identify the following items from each passage if possible:

a) The population
b) The population parameter of interest
c) The sampling frame
d) The sample
e) The sampling method; was randomization used?
f) Potential sources of bias or generalization problems


Practice Problems

#3 - Consumers Union asked all subscribers whether they had used alternative medical treatments and, if so, whether they had benefited from them. For almost all of the treatments, approx. 20% of those responding reported cures or substantial improvement.

a) Population - All US adults
b) Parameter - Proportion who have used and benefited from alternative medical treatments
c) Sampling Frame - All Consumers Union subscribers
d) Sample - Those who responded
e) Method - A nonrandom questionnaire
f) Bias - Voluntary response sample causes the bias. Only those who cared strongly enough about the question responded. This sample cannot represent the whole population because those who did not respond could have different opinions or answers than those who did respond


Practice Problems

Let’s try one more! (pg. 243 #13)

Question 1: Should elementary school-aged children have to pass high-stakes tests in order to remain with their classmates?

Question 2: Should schools and students be held accountable for meeting yearly learning goals by testing students before they advance to the next grade?

A) Do you think responses to these questions might differ? What kind of bias is this?

B) Propose a question with more neutral wording that might better assess parental opinion.


Practice Problems

Solution

a) Answers to the questions will definitely differ. Question 1 is worded to scare the respondents into “no” answers by using extreme descriptions (high-stakes tests). Question 2 is worded to receive more “yes” answers by changing the subject of the question from the actual passing of the test to accountability for learning. This is a type of wording or response bias.

b) Do you think that students should have to pass a standardized test in order to be promoted to the next grade level? - This is better because it doesn’t use any extreme words or subject changes.