Introduction to Statistics (Chapter 0 & 5)


Page 1: Introduction to Statistics (Chapter 0 & 5)

Introduction to Statistics (Chapter 0 & 5)

Page 2: Introduction to Statistics (Chapter 0 & 5)

Slide 2

Definitions

Data: facts used to draw conclusions.

Example: observations (such as measurements, genders, survey responses) that have been collected.

Page 3: Introduction to Statistics (Chapter 0 & 5)

Slide 3

Definitions

Statistics: the science of collecting, organizing, summarizing, and analyzing information to draw conclusions (answers to questions).

Page 4: Introduction to Statistics (Chapter 0 & 5)

Slide 4

Definitions

Population: the complete collection of ALL elements (scores, people, measurements, etc.) to be studied.

Page 5: Introduction to Statistics (Chapter 0 & 5)

Slide 5

Definitions

Census: the collection of data from EVERY member of the population.

Sample: a sub-collection of elements drawn from a population.

Page 6: Introduction to Statistics (Chapter 0 & 5)

Slide 6

Key Concepts

Sample data must be collected in an appropriate way, such as through a process of random selection.

If sample data are not collected in an appropriate way, the data may be so completely useless that no amount of statistical torturing can salvage them.

Page 7: Introduction to Statistics (Chapter 0 & 5)

Slide 7

Definitions

Parameter: a numerical measurement describing some characteristic of a population.

population → parameter

Page 8: Introduction to Statistics (Chapter 0 & 5)

Slide 8

Definitions

Statistic: a numerical measurement describing some characteristic of a sample.

sample → statistic
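The parameter/statistic distinction can be illustrated with a short Python sketch. The population below (1,000 simulated heights) is invented purely for illustration; in practice the parameter is usually unknown, which is why we sample.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights (cm) of 1,000 people, invented for illustration.
population = [random.gauss(170, 10) for _ in range(1000)]

# Parameter: a number describing the whole population.
population_mean = statistics.mean(population)

# Statistic: a number describing a sample drawn from that population.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.1f}")
print(f"statistic (sample mean):     {sample_mean:.1f}")
```

The two numbers will usually be close but not identical; that gap is the sampling error defined later in these slides.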

Page 9: Introduction to Statistics (Chapter 0 & 5)

Slide 9

Definitions

Quantitative data: numbers representing counts or measurements.

Example: weights, ages, heights

Page 10: Introduction to Statistics (Chapter 0 & 5)

Slide 10

Working with Quantitative Data

Quantitative data can be further divided into discrete and continuous types.

Page 11: Introduction to Statistics (Chapter 0 & 5)

Slide 11

Definitions

Discrete data: data that can be counted using whole numbers (not decimals or fractions): 0, 1, 2, 3, . . .

Example: number of teeth, number of eggs.

Page 12: Introduction to Statistics (Chapter 0 & 5)

Slide 12

Definitions

Continuous (numerical) data: data that can be measured and that cover a range of values without gaps (including decimals and fractions).

Example: height, weight, age

Page 13: Introduction to Statistics (Chapter 0 & 5)

Slide 13

Definitions

Qualitative (also called categorical or attribute) data: data that can be separated into different categories distinguished by some nonnumeric characteristic.

Example: genders (male/female), favorite color (red, blue, green), favorite foods

Page 14: Introduction to Statistics (Chapter 0 & 5)

Slide 14

Recap

• Basic definitions and terms describing data
• Parameters versus statistics
• Types of data (quantitative and qualitative)

Page 15: Introduction to Statistics (Chapter 0 & 5)

Slide 15

Major Points

• We collect sample data in order to make a prediction about an entire population.
• We would not bother to collect sample data if we had the ability, time, and money to gather information from an entire population.
• If sample data are not collected in an appropriate way, the data may be so completely useless that no amount of statistical torturing can salvage them.

Page 16: Introduction to Statistics (Chapter 0 & 5)

Slide 16

Important Points

Statisticians make decisions based on data.

Data production helps us answer specific questions with an experiment or an observational study.

Do you know the DIFFERENCE between an observational study and an experiment?

Page 17: Introduction to Statistics (Chapter 0 & 5)

Slide 17

Definitions

Observational Study: observing and measuring specific characteristics without attempting to modify (change) the subjects being studied.

Example: Survey a group of people.

Page 18: Introduction to Statistics (Chapter 0 & 5)

Slide 18

Definitions

Experiment: a study in which we apply some treatment to (do something to) the subjects and then observe its effects on them.

Example: Give plants different fertilizers to see which fertilizer works best.

Page 19: Introduction to Statistics (Chapter 0 & 5)

Slide 19

Definitions

Confounding: occurs in an experiment when the experimenter is not able to distinguish between the effects of different factors (treatments).

Example: Did the plants grow larger due to the fertilizer, or did they simply receive more water and sunshine than the other plants?

A statistician’s job is to try to design an experiment so that confounding does not occur!

Page 20: Introduction to Statistics (Chapter 0 & 5)

Slide 20

Definitions

Data Set: contains information on a number of individuals.

Individuals may be people, animals, or things.

Variables describe some characteristic of an individual, such as a person’s height, gender, or salary.

Some variables are categorical and some are quantitative.

Page 21: Introduction to Statistics (Chapter 0 & 5)

Slide 21

Definitions

Distribution: In statistics, we often talk about (or describe) the variables, or the distribution, of our data.

o We do this using graphs (bar graphs, histograms, box plots, etc.).

o Graphs help us describe features of the data distribution such as the mean, median, mode, and range.

Page 22: Introduction to Statistics (Chapter 0 & 5)

Slide 22

Definitions

Probability: the chance (likelihood) of a particular outcome.

Statistical inference: produces answers to specific questions, along with a statement about how confident we can be that the answer is correct.

Page 23: Introduction to Statistics (Chapter 0 & 5)

Slide 23

Types of Studies

Cross-Sectional Study: Data are observed, measured, and collected at one point in time.

Retrospective (or Case-Control) Study: Data are collected from the past by going back in time (through records, interviews, and so on).

Prospective (or Longitudinal or Cohort) Study: Data are collected going forward in time from groups (called cohorts) sharing common factors.

Page 24: Introduction to Statistics (Chapter 0 & 5)

Slide 24

Sampling Methods

When conducting studies (observational or experimental), sample data must be collected.

There are good and bad sampling methods, depending on the type of data you need to collect.

Page 25: Introduction to Statistics (Chapter 0 & 5)

Slide 25

Definitions

Random Sample: members of the population are selected in such a way that each individual member has an equal chance of being selected.

Simple Random Sample (of size n): subjects are selected in such a way that every possible sample of the same size n has the same chance of being chosen.
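A minimal Python sketch of drawing a simple random sample, assuming the population is available as a list. The roster and the helper name `simple_random_sample` are made up for illustration; `random.sample` draws without replacement, so every subset of size n is equally likely.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return n distinct members; every size-n subset is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical class roster of 30 students.
roster = [f"student_{i:02d}" for i in range(1, 31)]
chosen = simple_random_sample(roster, 5, seed=42)
print(chosen)
```

Fixing the seed plays the same role as a printed random number table: everyone who uses the same seed gets the same sample.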

Page 26: Introduction to Statistics (Chapter 0 & 5)

Slide 26

Random Sampling: selection so that each member has an equal chance of being selected.

Examples: Drawing names out of a hat, the GA lottery “numbered-ball air machine”, the random number (SRS) table in the textbook, the random integer function on a calculator.

Page 27: Introduction to Statistics (Chapter 0 & 5)

Slide 27

SRS and Random Number Table

• So that all students get the same results when drawing an SRS, questions ask students to use a random number table.

Page 28: Introduction to Statistics (Chapter 0 & 5)

Slide 28

Systematic Sampling: select some starting point and then select every kth element in the population.

Example: Selecting every 3rd person in line, or selecting those sitting in every other row.
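Systematic sampling can be sketched in a few lines of Python. The random starting point among the first k elements is an assumption added here (the slide only says "some starting point"), and the line of 30 people is hypothetical.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k elements, then take every k-th element."""
    rng = random.Random(seed)
    start = rng.randrange(k)
    return population[start::k]

# Hypothetical: 30 people standing in line, numbered 1 to 30.
line = list(range(1, 31))
picked = systematic_sample(line, 3, seed=0)
print(picked)
```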

Page 29: Introduction to Statistics (Chapter 0 & 5)

Slide 29

Stratified Sampling: subdivide the population into at least two different subgroups whose members share the same characteristics, then draw a random sample from each subgroup (or stratum).

Examples:

- Divide the population into 2 groups (male, female) and select the same number from each group.

- Divide the population into 4 groups (9th, 10th, 11th, 12th grade) and select 10% (the same proportion) from each of the 4 groups.
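The grade-level example can be sketched in Python as follows. The enrollment counts and the helper name `stratified_sample` are invented for illustration; the idea shown is simply "take a simple random sample of the same proportion from every stratum".

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw a simple random sample of the given fraction from each stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        n = round(len(members) * fraction)
        sample[name] = rng.sample(members, n)
    return sample

# Hypothetical enrollment counts per grade, invented for illustration.
sizes = {"9th": 120, "10th": 110, "11th": 100, "12th": 90}
strata = {g: [f"{g}_{i}" for i in range(n)] for g, n in sizes.items()}

by_grade = stratified_sample(strata, 0.10, seed=1)
for grade, members in by_grade.items():
    print(grade, len(members))
```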

Page 30: Introduction to Statistics (Chapter 0 & 5)

Slide 30

Cluster Sampling: divide the population into sections (or clusters); randomly select some of those clusters; choose all members from the selected clusters.
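A small Python sketch of cluster sampling, using hypothetical homerooms as the clusters. Note the contrast with stratified sampling: here only some groups are chosen, and everyone in a chosen group is included.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every member of each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical clusters: 6 homerooms of 4 students each.
homerooms = {f"room_{r}": [f"room_{r}_s{i}" for i in range(4)] for r in range(1, 7)}

selected = cluster_sample(homerooms, 2, seed=3)
print(selected)
```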

Page 31: Introduction to Statistics (Chapter 0 & 5)

Slide 31

Convenience Sampling: use results that are easy to get.

Examples: Sampling the first 10 people who enter a room; mall surveys.

Page 32: Introduction to Statistics (Chapter 0 & 5)

Slide 32

Voluntary (Self-Selected) Sampling

Example: Volunteer to answer a survey online, in a magazine, over the phone, etc.

Page 33: Introduction to Statistics (Chapter 0 & 5)

Slide 33

Methods of Sampling

• Random (SRS)
• Systematic
• Stratified
• Cluster
• Convenience
• Voluntary

The first four sampling methods are PROBABILITY SAMPLES, meaning the samples are chosen by chance.

An SRS gives each member of the population an equal chance of being selected; this may not be true of more elaborate sampling methods.

Convenience and voluntary sampling are BAD sampling methods because they are generally BIASED (meaning they systematically favor certain outcomes).

Page 34: Introduction to Statistics (Chapter 0 & 5)

Slide 34

Methods of Sampling

Another sampling method with which you need to be familiar is multi-stage sampling.

Just as its name indicates, with multi-stage sampling you select successively smaller groups within a population in stages.

Each stage may employ a different sampling method.

Page 35: Introduction to Statistics (Chapter 0 & 5)

Slide 35

Definitions

Sampling Error (OK): the difference between a sample result and the true population result; such an error results from chance sample fluctuations. It can’t be helped: you can never predict an outcome with 100% certainty using a sample.

Nonsampling Error (BAD): an error that occurs when sample data are incorrectly collected, recorded, or analyzed (such as by selecting a biased sample, using a defective instrument, or copying the data incorrectly).

Page 36: Introduction to Statistics (Chapter 0 & 5)

Slide 36

CAUTION: Sample Surveys

Random selection eliminates bias in the choice of a sample. However, you still have to WATCH OUT FOR…

Undercoverage: when some group(s) of the population are ‘left out’.

Example: In a phone survey, those without phones are left out, which may mean the economically disadvantaged are underrepresented in the outcome.

Nonresponse: when some individual(s) selected for the sample can’t be contacted or refuse to respond to the survey.

Example: In a phone survey, people don’t answer the phone, or hang up without responding to the survey.

Page 37: Introduction to Statistics (Chapter 0 & 5)

Slide 37

CAUTION: Sample Surveys

Response Bias: respondents may lie, especially if asked about illegal or unpopular behavior.

Examples:
* Do you smoke marijuana?
* Have you ever cheated on a test?

Page 38: Introduction to Statistics (Chapter 0 & 5)

Slide 38

Selecting a Method of Sampling

Regardless of the sampling method chosen, the GOAL should always be to select a sample and conduct a study in such a way as to NOT get BIASED (unfair, untrue) results.

Page 39: Introduction to Statistics (Chapter 0 & 5)

Slide 39

Example 1

Describe how a university can conduct a survey regarding its campus safety.

The registrar of the university has determined that the community of the university consists of 6,204 students in residence, 13,304 nonresident students, and 2,401 staff for a total of 21,909 individuals.

The president has funds for only 1000 surveys to be given and then analyzed. How should she conduct the survey?
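One possible approach (not the only acceptable answer) is a stratified sample with proportional allocation: split the 1000 surveys across the three groups in proportion to their sizes, then draw a simple random sample within each group. The arithmetic can be sketched in Python; only the group counts and the budget come from the problem, the rest is an illustrative assumption.

```python
# Group sizes as reported by the registrar in the problem statement.
group_sizes = {
    "resident students": 6204,
    "nonresident students": 13304,
    "staff": 2401,
}
total = sum(group_sizes.values())   # 21,909 individuals
budget = 1000                       # surveys the president can fund

# Proportional allocation: each group's share of the budget matches its share of the population.
allocation = {g: round(budget * n / total) for g, n in group_sizes.items()}
print(total, allocation)
```

With these numbers the rounded allocations happen to add up to exactly 1000; in general, rounding can leave the total one or two off, and a manual adjustment is needed.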

Page 40: Introduction to Statistics (Chapter 0 & 5)

Slide 40

Example 2

Sociologists want to gather data regarding the household income within Smyth County. They have come to the high schools for assistance.

Describe a method which would disrupt the fewest classes and still gather the data needed.

Page 41: Introduction to Statistics (Chapter 0 & 5)

Slide 41

Example 3

The manager of Ingles wants to measure the satisfaction of the store’s customers.

Design a sampling technique that can be used to obtain a sample of 40 customers.

Page 42: Introduction to Statistics (Chapter 0 & 5)

Slide 42

Example 4

The Independent Organization of Political Activity, IOPA, wants to conduct a survey focusing on the dissatisfaction with the current political parties.

Several state-wide businesses have agreed to help. IOPA has come to you for advice.

Describe a multi-stage survey strategy that will help them.

Page 43: Introduction to Statistics (Chapter 0 & 5)

Slide 43

Summary and Homework

• Summary
– Experiments: can detect cause and effect
– Observational Studies: suggest further work
– Sampling Methods (Probabilistic)
» Simple Random Sample
» Cluster Sample
» Stratified Random Sample
» Multi-stage Sample

• Homework
– Pages 333-4 & 341-3
– problems 5.1-5, 5.7, 5.8, 5.10, 5.13, 5.14

Page 44: Introduction to Statistics (Chapter 0 & 5)

Slide 44

Success in Statistics

Success in an introductory statistics course typically requires more common sense than mathematical expertise.

This section is designed to illustrate how common sense is used when we think critically about data and statistics.

Page 45: Introduction to Statistics (Chapter 0 & 5)

Slide 45

Misuses of Statistics

• Bad Samples

Voluntary Response or Convenience Sampling

These sampling methods are almost guaranteed NOT to represent the entire population.

For instance, most people who volunteer to respond do so because they have a strong opinion about the research topic.

Page 46: Introduction to Statistics (Chapter 0 & 5)

Slide 46

Misuses of Statistics

• Bad Samples
• Small Samples: the sample is too small (larger samples, when properly collected, generally give more accurate results).
• Misleading Graphs

Page 47: Introduction to Statistics (Chapter 0 & 5)

Slide 47

Figure 1-1 (Same data scaled differently)

Page 48: Introduction to Statistics (Chapter 0 & 5)

Slide 48

To correctly interpret a graph, we should analyze the numerical information given in the graph instead of being misled by its general shape.

Page 49: Introduction to Statistics (Chapter 0 & 5)

Slide 49

Misuses of Statistics

• Bad Samples
• Small Samples
• Misleading Graphs
• Pictographs

Page 50: Introduction to Statistics (Chapter 0 & 5)

Slide 50

Figure 1-2

Double the length, width, and height of a cube, and the volume increases by a factor of eight
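The factor of eight follows directly from the volume formula. Writing s for the side length of the original cube (a symbol introduced here for illustration):

```latex
V_{\text{new}} = (2s)(2s)(2s) = 8s^{3} = 8\,V_{\text{old}}
```

So a pictograph that doubles every dimension of a three-dimensional object to show a quantity that merely doubled makes the increase look eight times as large.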

Page 51: Introduction to Statistics (Chapter 0 & 5)

Slide 51

Misuses of Statistics

• Bad Samples
• Small Samples
• Misleading Graphs
• Pictographs
• Distorted Percentages
• Wording of Questions, or Loaded Questions

Page 52: Introduction to Statistics (Chapter 0 & 5)

Slide 52

Example: Loaded Question

97% yes: “Should the President have line item veto power to eliminate waste?”

Results of the same question worded differently:

57% yes: “Should the President have line item veto power?”

Page 53: Introduction to Statistics (Chapter 0 & 5)

Slide 53

Misuses of Statistics

• Bad Samples
• Small Samples
• Misleading Graphs
• Pictographs
• Distorted Percentages
• Loaded Questions
• Order of Questions
• Refusals (Nonresponse)
• Correlation & Causality
• Self-Interest Study (Example: a tobacco company conducts a survey on whether tobacco use causes cancer)
• Deliberate Distortions

Page 54: Introduction to Statistics (Chapter 0 & 5)

Slide 54

Recap

In this section we have:

• Reviewed 11 misuses of statistics.
• Illustrated how common sense can play a big role in interpreting data and statistics.

Page 55: Introduction to Statistics (Chapter 0 & 5)

Slide 55

Example Problems

a) Determine if the survey design is flawed.
b) If flawed, is it due to the sampling method or the survey itself?
c) For flawed surveys, identify the cause of the error.
d) Suggest a remedy to the problem.

Page 56: Introduction to Statistics (Chapter 0 & 5)

Slide 56

Example 1

MSHS wants to conduct a study regarding the achievement of its students. The principal selects the first 50 students who enter the building on a given day and administers the survey.

Flawed sampling method

Page 57: Introduction to Statistics (Chapter 0 & 5)

Slide 57

Example 2

The Marion town council wishes to conduct a study regarding the income level of households in Marion. The town manager selects 10 homes in one neighborhood and sends an interviewer to the homes to determine household incomes.

Flawed sampling method

Page 58: Introduction to Statistics (Chapter 0 & 5)

Slide 58

Example 3

An anti-gun advocacy group wants to estimate the percentage of people who favor stricter gun laws. They conduct a nationwide survey of 1,203 randomly selected adults 18 years old and older. The interviewer asks the respondents, “Do you favor harsher penalties for individuals who sell guns illegally?”

Poorly worded question

Page 59: Introduction to Statistics (Chapter 0 & 5)

Slide 59

Example 4

Cold Stone Creamery is considering opening a new store in Marion. Before opening the store, the company would like to know the percentage of households in Marion that regularly visit an ice cream shop. The market researcher obtains a list of households in Marion and randomly selects 150 of them. He mails a questionnaire to the households that asks about their ice cream eating habits and flavor preferences. Of the 150 questionnaires mailed, 14 are returned.

Nonresponse

Page 60: Introduction to Statistics (Chapter 0 & 5)

Slide 60

Example 5

The owner of a shopping mall wishes to expand the number of shops available in the food court. She has a market researcher survey mall customers during weekday mornings to determine what types of food the shoppers would like to see added to the food court.

Flawed sampling method

Page 61: Introduction to Statistics (Chapter 0 & 5)

Slide 61

Example 6

The owner of a radio station wants to know what the station’s listeners think of the new format. He has the announcers invite the listeners to call in and voice their opinions.

Flawed sampling method

Page 62: Introduction to Statistics (Chapter 0 & 5)

Slide 62

Summary and Homework

• Summary
– Sources of Bias
» Voluntary and convenience samples
» Undercoverage, nonresponse, response bias, and poorly worded questions

• Homework
– Pages 347-51
– problems 5.15, 16, 17, 20, 24, 28, 29, 30

Page 63: Introduction to Statistics (Chapter 0 & 5)

Slide 63

Designing Experiments

When it comes to experimental design, there are several terms you will need to understand.

Remember, a study is an experiment when we actually DO SOMETHING to individuals in order to observe the response.

Page 64: Introduction to Statistics (Chapter 0 & 5)

Slide 64

Definitions

Experimental Units – the individuals on which the experiment is done.

Subjects – when the experimental units are humans, they are called subjects.

Treatment – a specific experimental condition applied to the units/subjects.

Factors – the explanatory variables in an experiment; the specific values (categories) a factor can take are called its levels.

Page 65: Introduction to Statistics (Chapter 0 & 5)

Slide 65

Experimental Design

Example: You want to conduct an experiment to compare 3 different diet drugs in men & women.

What are the Experimental Units/Subjects? Men & women.

The treatments (conditions applied) are the diet drugs.

What are the factor(s) and their levels? You have 2 factors to consider: drug treatment & gender.

There are 5 levels (3 drug levels & 2 gender levels).

This experiment has a total of 6 treatment groups.

FACTORS (drug treatment, gender) | Diet Drug A (Alli) | Diet Drug B (Hydroxycut) | Diet Drug C (Dexatrim)
Men | Treatment Group 1 | Treatment Group 2 | Treatment Group 3
Women | Treatment Group 4 | Treatment Group 5 | Treatment Group 6
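The counting in this example (levels add, treatment groups multiply) can be checked with a short Python sketch. The dictionary layout is just one convenient way to write the factors down; the level names come from the slide.

```python
from itertools import product

# Each factor maps to its list of levels.
factors = {
    "gender": ["Men", "Women"],
    "drug": ["Diet Drug A", "Diet Drug B", "Diet Drug C"],
}

# Levels are counted by adding across factors: 2 + 3.
total_levels = sum(len(levels) for levels in factors.values())

# Treatment groups are every combination of one level per factor: 2 x 3.
treatment_groups = list(product(*factors.values()))

print("levels:", total_levels)
for number, (gender, drug) in enumerate(treatment_groups, start=1):
    print(f"Treatment Group {number}: {gender}, {drug}")
```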

Page 66: Introduction to Statistics (Chapter 0 & 5)

Slide 66

Example:

• You want to see which of 3 training methods for runners produces the greatest results.

• You randomly select 3 different groups of runners, and each group is subjected to a different training method.

• The runners are the experimental subjects.
• The training methods are the treatments (or conditions being applied) in this experiment.
• You have 1 factor to consider: method of training.
• There are 3 levels: the 3 different ‘types of training’.
• This experiment has a total of 3 treatment groups.

FACTOR (Type of Training) | Training Method A | Training Method B | Training Method C
 | Treatment Group 1 | Treatment Group 2 | Treatment Group 3

Page 67: Introduction to Statistics (Chapter 0 & 5)

Slide 67

We would like to compare the GPAs (averages) of children who receive 1 hour of instruction per week with those of children who receive 4 hours per week. We would also like to vary the groups’ learning environments (setting), with some groups getting the instruction in class (pulled off to a corner of the classroom) and other groups being pulled out of the classroom for instruction in another room.

Experimental Subjects: *students

Treatment: *learning style

Factors (2): *Instruction Time *Setting

Levels (4): *1 hour *4 hours *In-class *Pull-out

Number of treatment groups: 4

Page 68: Introduction to Statistics (Chapter 0 & 5)

Slide 68

More Important Definitions

Response Variable – is the OUTCOME variable. It is the variable you measure to determine the outcome (output) of an experiment. It is the dependent variable (y-axis).

Explanatory Variable(s) – is the CONTROL or PREDICTOR variable.

It is the ‘input’ variable used to predict the ‘output’, or response variable.

If you think about it as the variable that controls the change (outcome), then you will realize the explanatory variable is simply the treatment factor.

It is the independent variable (x-axis).

Page 69: Introduction to Statistics (Chapter 0 & 5)

Slide 69

More Important Definitions

Response Variable – is the OUTCOME variable. It is the variable you measure to determine the outcome (output, result) of an experiment.

Explanatory Variable(s) – is the CONTROL or PREDICTOR variable.

It is the variable (factor) that controls the change.

Example: Remember the Diet Drug experiment….

Explanatory variable – type of diet drug (remember there were 3 diet drugs, and the type of drug controls the different outcomes)

Response variable – weight loss in pounds (you would weigh your subjects to see how well the diet drug performed, that is, how well the subjects RESPONDED to the treatment factor)

Page 70: Introduction to Statistics (Chapter 0 & 5)

Slide 70

More Important Definitions

Placebo: a medication with no active ingredients; a “sugar pill” or fake pill.

Placebo Effect: a term doctors use to describe the phenomenon where patients get better because they expect the treatment to work, even though they have taken a fake pill.

Page 71: Introduction to Statistics (Chapter 0 & 5)

Slide 71

Steps in Experimental Design

• Identify the problem to be solved
• Determine the Factors that Affect the Response Variable
• Determine the Number of Experimental Units
– Time
– Money
• Determine the Level of Each Factor
– Control – fix level at one predetermined value
– Manipulation – set them at predetermined levels
– Randomization – tries to control the effects of factors whose levels cannot be controlled
– Replication – tries to control the effects of factors inherent to the experimental unit
• Conduct the Experiment
• Test the claim (inferential statistics)

Page 72: Introduction to Statistics (Chapter 0 & 5)

Slide 72

Statistically Significant

A result is statistically significant when it would be unlikely to occur by chance alone.

Example: Rolling a Yahtzee (the same number on all 5 dice) 3 times in a row.

Remember our definition of unusual results (less than a 5% chance of occurrence).
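To see just how unusual the Yahtzee example is, the probability can be worked out exactly in Python. This sketch assumes each Yahtzee must come from a single throw of five fair dice with no re-rolls, which is a simplification of the actual game.

```python
from fractions import Fraction

# Whatever the first die shows, each of the other four dice must match it.
p_yahtzee = Fraction(1, 6) ** 4

# Three independent throws in a row: multiply the probabilities.
p_three_in_a_row = p_yahtzee ** 3

print(p_yahtzee, float(p_yahtzee))
print(p_three_in_a_row, float(p_three_in_a_row))
print("unusual (below 5%)?", p_three_in_a_row < Fraction(5, 100))
```

Even a single Yahtzee (1 chance in 1,296) already falls far below the 5% threshold; three in a row would strongly suggest something other than chance, such as loaded dice.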

Page 73: Introduction to Statistics (Chapter 0 & 5)

Slide 73

Analyzing Experiments Template

Topic | Answers
Research Question | What is the question the researchers are trying to answer?
Subjects / Experimental Units | What are the experimental units?
Explanatory Variable(s) / Factor(s) | Type of variable: Quantitative or Categorical
Treatment(s) | What are the Factor(s) and their Levels?
Response Variable(s) | Type of variable: Quantitative or Categorical
Experimental Design Description | Using words or diagrams, describe the experimental design (in enough detail that it can be duplicated)!
Experimental Design Principles | Explain how these design principles apply in this study:
 | Control: Eliminate confounding effects of extraneous variables
 | Randomization: No systematic difference between the groups
 | Replication: Reducing role of chance in results
Blocking | If blocking is used, describe the blocking and why it was used.
Blinding | If blinding is used, describe it in context.
Concerns | What concerns are there about the experimental design?
Statistical Analysis Technique(s) | What statistical analysis techniques are appropriate?
Conclusions | What conclusions can be drawn from the study?

Page 74: Introduction to Statistics (Chapter 0 & 5)

Slide 74

Example 1

Draw a picture detailing the following experiment:

A statistics class wants to know the effect of a certain fertilizer on tomato plants. They get 60 plants of the same type. They will have two levels of treatments, 2 and 4 teaspoons of fertilizer. Someone suggests that they should use a control group.

The picture should include enough detail for someone unfamiliar with the problem to understand the problem and be able to duplicate the experiment.

Page 75: Introduction to Statistics (Chapter 0 & 5)

Slide 75

Example 1 cont.

Random assignment of plants to treatments: Lay the plants out in a line. For each plant, draw one colored chip out of a bag (20 chips of each of three colors). All plants with the same color are assigned to one of the groups below.

Group 1 (red) receives 20 plants → Treatment A: no fertilizer (control group)
Group 2 (blue) receives 20 plants → Treatment B: 2 teaspoons
Group 3 (white) receives 20 plants → Treatment C: 4 teaspoons

Compare yield (total ounces) across the three groups.

Response Variable: total ounces produced

Explanatory Variable: amount of fertilizer

Experimental Units: tomato plants
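Drawing colored chips can be mimicked in software. The Python sketch below shuffles plant numbers 1 to 60 and deals them into three groups of 20; the plant numbering and the seed are arbitrary choices made here for illustration.

```python
import random

rng = random.Random(2024)   # arbitrary seed so the assignment can be reproduced

# Number the 60 tomato plants as they sit in the line, then shuffle the numbers.
plants = list(range(1, 61))
rng.shuffle(plants)

# Deal the shuffled plants into three equal groups of 20.
groups = {
    "Treatment A (no fertilizer, control)": sorted(plants[0:20]),
    "Treatment B (2 teaspoons)": sorted(plants[20:40]),
    "Treatment C (4 teaspoons)": sorted(plants[40:60]),
}

for name, members in groups.items():
    print(name, members)
```

Shuffling and splitting is equivalent to the chip draw: every plant is equally likely to land in any group, and each group ends up with exactly 20 plants.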

Page 76: Introduction to Statistics (Chapter 0 & 5)

Slide 76

Example 2

A baby-food producer claims that her product is superior to that of her leading competitor, in that babies gain weight faster with her product. As an experiment, 30 healthy babies are randomly selected. For two months, 15 are fed her product and 15 are fed the competitor’s product. Each baby’s weight gain (in ounces) is recorded.

A) How will subjects be assigned to treatments?

*No details are given; this is a poor description of the random selection. However, there are two random selections taking place: choosing the 30 babies, and then assigning them to the two products.

B) What is the response variable?

Baby’s weight gain in ounces

C) What is the explanatory variable?

Baby food brands

Page 77: Introduction to Statistics (Chapter 0 & 5)

Slide 77

Example 3

Two toothpastes are being studied for effectiveness in reducing the number of cavities in children. There are 100 children available for the study.

A) How do you assign the subjects?

Randomly divide the children into two groups (for example, pull names out of a hat).

B) What baseline data should you know about?

Number of cavities each child had before the study.

C) What do you measure?

Number of cavities before and after using the specific toothpaste.

D) What factors might confound this experiment?

Dietary habits, economic status

E) What would be the purpose of randomization in this problem?

To try to “balance out” the variables that could affect the number of cavities.

Page 78: Introduction to Statistics (Chapter 0 & 5)

Slide 78

Example 4

We wish to determine whether or not a new type of fertilizer is more effective than the type currently in use. Researchers have subdivided a 20-acre farm into twenty 1-acre plots. Wheat will be planted on the farm, and at the end of the growing season the number of bushels harvested will be measured.

A) How do you assign the plots of land?

Blocking? Consider blocking before randomly assigning plots.

B) What is the explanatory variable?

Types of fertilizer

C) What is the response variable?

Number of bushels of wheat harvested

D) How many treatments are there?

Two (the new fertilizer and the old), or possibly 3 treatments if no fertilizer is used as a control group.

E) Are there any possible lurking variables that would confound the results?

Soil composition, rainfall, animal destruction effects (pestilence)

Page 79: Introduction to Statistics (Chapter 0 & 5)

Slide 79

Summary and Homework

• Summary
– Parts of an Experiment:
» Experimental units
» Treatment
• Factors
• Levels
» Variables (Explanatory, Response, Confounding or Lurking)

• Homework
– pages 357-8 and 364-5
– problems 5.33-40, 42

Page 80: Introduction to Statistics (Chapter 0 & 5)

Slide 80

Statistical “Blindness”

In some studies we don’t want the person giving or getting the treatment to influence the results of the experiment.

● To avoid the effects of subject behavior: subjects not given any medication are often given a placebo.

● To avoid the effects of administrator behavior: the administrators are not told which drug (real or fake) they are administering.

● When both the subjects and the researchers do not know who is getting the treatment, this is called double-blind.

Page 81: Introduction to Statistics (Chapter 0 & 5)

Slide 81

Completely Randomized Design

● A completely randomized design is when each experimental unit is assigned to a treatment completely at random.

● Examples:

Randomly assign 10 people to get the new drug and 10 people to get the old drug; compare results.

A farmer wants to test the effects of a fertilizer: choose a set of plants for the experiment, then randomly assign the plants to receive different levels of fertilizer.

Page 82: Introduction to Statistics (Chapter 0 & 5)

Slide 82

Randomized Design Example

● We control as many factors as we can:
– Amount of watering
– Method of tilling
– Amount of acid in the soil

● Randomization decreases the effects of uncontrolled factors:
– Rainfall
– Sunlight
– Temperature

Page 83: Introduction to Statistics (Chapter 0 & 5)

Slide 83

Matched-Pair Design

● A matched-pair design is when the experimental units are paired up and each member of a pair is assigned to a different treatment.

● A matched-pair design requires:
– Units that are paired (Example: using the same person before and after a treatment, using twins, using 2 similar groups)
– Only two levels of treatment (one for each member of the pair)

● Examples:
– New sock on the right foot and old sock on the left foot; the wear-time until a hole develops is recorded.
– A subject before receiving the medication and then the same subject after receiving the medication.

Page 84: Introduction to Statistics (Chapter 0 & 5)

Slide 84

Matched-Pair Design Example

• Test whether students learn better while listening to music or not
– Match pairs of students by IQ and gender (to control those factors)
– Randomly choose one of each pair (to decrease the effects of other uncontrolled factors)
– Assign that one to a quiet room and the other to a room with music (the treatment)
– Administer the test and analyze the test scores
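The random choice within each matched pair amounts to one coin flip per pair. A minimal Python sketch, with made-up student names standing in for pairs that have already been matched by IQ and gender:

```python
import random

rng = random.Random(7)   # arbitrary seed for reproducibility

# Hypothetical pairs, already matched by IQ and gender.
pairs = [("Ana", "Bea"), ("Carl", "Dev"), ("Eli", "Finn")]

assignment = []
for first, second in pairs:
    # A coin flip decides which member of the pair gets the quiet room.
    if rng.random() < 0.5:
        quiet, music = first, second
    else:
        quiet, music = second, first
    assignment.append({"quiet": quiet, "music": music})

print(assignment)
```

Because each pair contributes one student to each room, the two rooms are balanced on the matched characteristics by construction.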

Page 85: Introduction to Statistics (Chapter 0 & 5)

Slide 85

Problem in a Random Design

Example

• We are testing the effects of treatments A & B on different types of soybean plants.

• Assume that group 1 is given treatment A and group 2 is given treatment B.

• Assume that group 1 has more Chemgro plants than group 2.

– (This can happen by chance under randomization, that is, when plants are randomly assigned to treatments in order to prevent bias.)

• Assume that Chemgro plants also have higher yields (number of soybeans per plant) than Pioneer plants.

Page 86: Introduction to Statistics (Chapter 0 & 5)

Slide 86

Confounding

● If Group 1 (treatment A) has higher yields than Group 2 (treatment B):
– Is this because treatment A is more effective than B?
– OR, is this because there are more Chemgro plants in group 1?

● It is not possible to distinguish between:
– The effects of Treatment A versus B
– The effects of Chemgro versus Pioneer

● When two effects cannot be distinguished, this is called confounding.

Page 87: Introduction to Statistics (Chapter 0 & 5)

Slide 87 Ways to control variables

Blinding: the subject does not know whether he/she is receiving the treatment or a placebo

Blocking: subjects are grouped together (into blocks) for an experiment because they are known to be similar in some way prior to the experiment

Completely Randomized Experimental Design: subjects are assigned to treatment groups through a process of random selection

Rigorously Controlled Design: subjects are very carefully chosen, so that the treatment groups are similar in the ways that matter to the experiment
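Random assignment of subjects to treatment groups, with no blocking, can be sketched as follows. This is an illustrative Python sketch: the subject IDs, the group labels, and the function name `completely_randomized` are assumptions, not something the slides specify.

```python
import random

def completely_randomized(subjects, treatments, seed=None):
    """Shuffle the subjects, then deal them out to the treatments in turn,
    so the group sizes differ by at most one."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {treatment: [] for treatment in treatments}
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

subjects = [f"S{i}" for i in range(1, 11)]  # hypothetical subject IDs
groups = completely_randomized(subjects, ["treatment", "placebo"], seed=42)
for name, members in groups.items():
    print(name, members)
```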


Slide 88 Randomized Block Design

• A randomized block design is when the experimental units are grouped (because they are known to be similar in some way) and then, within each group, the units are assigned to treatments at random

• The groups are called blocks

• This design will reduce confounding

• This has similarities to stratified sampling


Slide 89 Randomized Block Design

• In our soybean experiment

– We apply treatment A to one third of the Chemgro plants, chosen at random

– We apply treatment B to another third of the Chemgro plants, chosen at random

– We apply treatment C to the remaining third of the Chemgro plants

• We apply the same method to the Pioneer plants

• This randomized block design

– Ensures a balance of the treatments across the types of soybean plants

– Keeps plant type from distorting the comparison of treatments on our response variable

– Means the effect of treatment A versus B and the effect of Chemgro versus Pioneer are no longer confounded

• This has similarities to stratified sampling
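The within-block randomization for the soybean experiment might look like the Python sketch below. The plant IDs, the block sizes, and the function name `randomized_block` are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter

def randomized_block(units_by_block, treatments, seed=None):
    """Within each block, shuffle the units and deal them out to the
    treatments in turn. Each treatment gets an equal share of a block
    when the block size is a multiple of the number of treatments."""
    rng = random.Random(seed)
    assignment = {}
    for block, units in units_by_block.items():
        shuffled = list(units)
        rng.shuffle(shuffled)
        for i, unit in enumerate(shuffled):
            assignment[unit] = (block, treatments[i % len(treatments)])
    return assignment

# Hypothetical plants: 6 Chemgro and 6 Pioneer.
plants = {
    "Chemgro": [f"C{i}" for i in range(1, 7)],
    "Pioneer": [f"P{i}" for i in range(1, 7)],
}
assignment = randomized_block(plants, ["A", "B", "C"], seed=7)

# Count how many plants of each type received each treatment.
counts = Counter(assignment.values())
for (block, treatment), n in sorted(counts.items()):
    print(block, treatment, n)
```

Each treatment is applied to the same number of Chemgro plants and the same number of Pioneer plants, so a difference between treatments can no longer be explained by an imbalance in plant type.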


Slide 90 Randomized Block Design

Blocks should be homogeneous: each block is made up of units that share the same attribute


Slide 91 Example 1

You are participating in the design of a medical experiment to investigate whether a calcium supplement in the diet will reduce the blood pressure of middle-aged men. Preliminary work suggests that calcium may be effective and that the effect may be greater for African-American men than for white or Hispanic men. Forty randomly selected men from each ethnic category are available for the study.

1) Outline the design of an appropriate experiment.

2) What kind of design is this?

3) Can this experiment be blinded?


Slide 92 Example 1 - Answer

• 3 blocks of middle-aged men (40 subjects per block)

– African-American

– Hispanic

– White

• Matched pair design (measure blood pressure of the same person before & after the study)

• This experiment could be blinded: half the men in each group (block) could be given the calcium supplement and the other half could be given a placebo.


Slide 93 Example 2

An educational psychologist wants to test two different memorization methods to compare their effectiveness at increasing memorization skills. There are 120 subjects available, ranging in age from 18 to 71. The psychologist is concerned that differences in memorization capacity due to age will mask (confound) the differences between the two methods.

1) What would the design look like?

2) State the number of treatments and levels.

3) What are the explanatory and response variables?


Slide 94 Example 2 - Answer

• 1) You could block on age ranges (teens, 20s, 30s, 40s, 50s, 60s, 70s) to help prevent confounded results.

• Matched pair design (measure memorization capacity of each subject before and after the study to see if there has been any change)

• 2) There is 1 treatment (memorization method) and 2 levels (the 2 different memorization methods).

• 3) Explanatory Variable – method of memorization

Response Variable – memorization capacity
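Forming the age blocks named in the answer can be sketched in Python. The function name `age_block` and the sample ages are hypothetical; the sketch assumes ages between 18 and 71, as stated in the example, and covers only the grouping step (within each block, subjects would then be split at random between the two methods).

```python
from collections import defaultdict

def age_block(age):
    """Map an age in years to a decade block: teens, 20s, 30s, ..., 70s.
    Assumes ages from 18 to 71, as in the example."""
    if age < 20:
        return "teens"
    return f"{(age // 10) * 10}s"

# Hypothetical ages for a few of the 120 subjects.
ages = [18, 19, 23, 35, 35, 47, 52, 68, 71]

# Group the subjects by block.
blocks = defaultdict(list)
for age in ages:
    blocks[age_block(age)].append(age)

for name, members in blocks.items():
    print(name, members)
```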


Slide 95 Basic Principles of Experimental Design

• Control

– Comparing several treatments in the same environment (blocking) in order to minimize confounding effects of lurking variables. Lurking variables are the ones not being measured or controlled in the experiment.

• Randomization

– Uses impersonal chance (SRS table, drawing a number out of a hat) to assign experimental units to treatments.

– Increases chances that there are no systematic differences (bias) between treatment groups.

• Replication

– Repeating an experiment many times in order to reduce the chance variation.
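The effect of replication on chance variation can be illustrated with a small simulation. This is a sketch under assumed conditions, not something from the slides: responses are drawn from a normal distribution with a made-up mean of 10 and standard deviation of 2, and the function name `spread_of_means` is hypothetical.

```python
import random
import statistics

def spread_of_means(n_units, n_experiments=2000, seed=0):
    """Simulate many experiments, each averaging n_units noisy responses,
    and return the standard deviation of those averages."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_experiments):
        responses = [rng.gauss(10.0, 2.0) for _ in range(n_units)]
        means.append(statistics.mean(responses))
    return statistics.stdev(means)

# More replication (larger n_units) should give a smaller spread.
for n in (1, 4, 25):
    print(n, round(spread_of_means(n), 2))
```

In theory the spread of the average shrinks in proportion to 1/sqrt(n), so averaging over 25 units should leave roughly one fifth of the chance variation seen with a single unit.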


Slide 96 Recap

In this section we have looked at: Principles of Experimental Design.


Slide 97 Summary and Homework

• Summary

– The planning for designed experiments is crucial to the success of the experiment

– A double-blind implementation of experiments reduces changes in behavior (by subjects and experimenters) that could bias the results

– There are different good methods for assigning treatments to experimental units

» Completely random

» Matched-pairs

» Randomized blocks

• Homework

– pages 371-377

– problems 5.45, 48, 50, 52, 54, 56-58