Statistics : Role in Research
Statistics: A collection of procedures and processes that support researchers in the unbiased pursuit of Knowledge
Statistics is an important part of the Scientific Method
1. State a Hypothesis
2. Design a Study and Collect Data
3. Analyze the Data
4. Interpret the Results: Draw Conclusions
State a Hypothesis: The OBJECTIVE or OBJECTIVES of the Study
A HYPOTHESIS OR SET OF HYPOTHESES should state exactly what you want to DO or LEARN or STUDY
IT SHOULD ANSWER:
What are the factors to be studied? What relationships are to be investigated? What is the experimental material?
The area of STATISTICS would not be needed if each time you measured an experimental unit you obtained the same response or value
BUT THE RESPONSES ARE NOT THE SAME, SINCE THERE IS VARIABILITY or NOISE IN THE SYSTEM
STATISTICAL METHODS EXTRACT THE SIGNAL FROM THE NOISE TO PROVIDE INFORMATION
One of the Statistician’s JOBS is to make sense of DATA in the presence of VARIABILITY or noise by using DATA ANALYSIS TOOLS
DESIGN VS. ANALYSIS
The PURPOSE OF DATA COLLECTION is to GAIN INFORMATION OR KNOWLEDGE!!
Collecting Data does not guarantee that information is obtained.
INFORMATION ≠ DATA
At best: INFORMATION = DATA + ANALYSIS
If data are collected such that they contain NO information in the first place, then the analysis phase cannot find it!!!
The best way to ensure that appropriate information is contained in the collected data is to DESIGN (plan) and Carefully Control the DATA COLLECTION PROCESS
The measured variables must relate to the stated OBJECTIVES of the study
If you have a good design and process for data collection, it is quite often straightforward to construct an analysis that extracts all of the available information from the data
The ROLE of a STATISTICIAN is to work with the RESEARCH TEAM (or researcher) from the START of the study
A STATISTICIAN CAN HELP OBTAIN THE MAXIMUM AMOUNT OF INFORMATION FROM AVAILABLE RESOURCES
The MOST IMPORTANT TIME for the statistician to become involved with a research study is in the very BEGINNING
HOW???
HELP WITH THE DESIGN OF THE EXPERIMENT
DETERMINE SAMPLE SIZE NEEDED
DEVELOP PROCESS OF COLLECTING DATA
DISCUSS VARIABLES TO BE MEASURED AND HOW THEY RELATE TO THE OBJECTIVES OF THE STUDY
PROVIDE METHODS OF ANALYZING THE DATA
HELP TRANSLATE STATISTICAL CONCLUSIONS INTO SUBJECT MATTER CONCLUSIONS
THE CORE HELP FROM THE STATISTICIAN IS IN THE DESIGN OF THE EXPERIMENT
Help with selecting conditions that relate to the objectives of the study
Selecting the Experimental Units
Deciding when REPLICATIONS exist
Determining the ORDER in which the experiment is to be carried out
THE DESIGN OF THE EXPERIMENT IS CRITICAL
COMPONENTS OF DESIGNED EXPERIMENTS
TREATMENT STRUCTURE:
Factors or Populations or Treatments related to the objectives of the experiment:
Brands of Product, Types of Uses of Product
DESIGN STRUCTURE OR EXPERIMENTAL UNITS:
Factors used in blocking the experimental units, as well as characteristics of the experimental units:
Washing Machine, Person Using Machine, Products evaluated in Session by Taste Panelist
Complete Designed Experiment
Treatment Structure
Design Structure
RANDOMIZE: a randomization plan assigns the Treatments of the Treatment Structure (TS) to the Experimental Units in the Design Structure (DS)
RANDOMIZATION IS THE INSURANCE POLICY AGAINST INTRODUCING BIAS INTO THE STUDY
Selecting an appropriate Treatment Structure, necessary Design Structure, and required Randomization Process gives the Statistician the information needed to construct an appropriate model
APPROPRIATE MODEL = BEST ANALYSIS
Key to the Design of the Experiment is the Concept of REPLICATION
REPLICATION: The independent observation of a treatment
An Experimental Unit Provides a Replication of the level of a Factor if the level is randomly assigned to the Experimental Unit and observed independently of the other Experimental Units
Must make sure that Sub-samples are not considered to be Replications
The Variability among Experimental Units treated independently alike provides the estimate of the variance (or Standard Error) to be used as the measuring stick for comparing the levels of treatments randomly assigned to those Experimental Units
Between Sub-sample variance is generally much less than between Replication variance
It is critical that the Replications are appropriately Identified
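The difference between Sub-samples and Replications can be illustrated with a small calculation. The sketch below uses made-up readings (three Experimental Units treated alike, four Sub-samples each; none of these numbers come from the slides) and compares the standard error obtained from the variability among units with the one obtained by wrongly treating every Sub-sample as a Replication.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical readings (made up for illustration): 3 experimental units that
# received the same treatment, each measured 4 times (sub-samples).
units = {
    "unit_1": [10.1, 10.3, 9.9, 10.1],
    "unit_2": [12.0, 11.8, 12.2, 12.0],
    "unit_3": [8.0, 8.2, 7.9, 7.9],
}

unit_means = [mean(readings) for readings in units.values()]

# Appropriate measuring stick: variability among the unit means,
# because the 3 units are the replications.
se_replications = sqrt(variance(unit_means) / len(unit_means))

# Inappropriate measuring stick: pooled variability among sub-samples
# within a unit, used as if all 12 readings were replications.
within_variance = mean(variance(readings) for readings in units.values())
n_readings = sum(len(readings) for readings in units.values())
se_subsamples = sqrt(within_variance / n_readings)

print(f"SE with units as replications:       {se_replications:.3f}")
print(f"SE with sub-samples as replications: {se_subsamples:.3f}")
```

With these made-up readings, the Sub-sample standard error is far smaller than the Replication standard error, so comparisons based on it would look more precise than they are.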
Treatment Structure, Design Structure (with experimental units and replication) and Randomization describe the total Design
The Statistician will provide the STATISTICAL interpretation of the results from the analyses: the STATISTICAL ANALYSIS CONCLUSIONS
The Statistician will help the Researcher TRANSLATE the statistical analyses conclusions into subject matter conclusions
Discuss how the statistical analyses provide results that relate to the STATED OBJECTIVES of the study. The expected results should be written along with the objectives. Results that are not expected should be looked at carefully
Washing Machine Example:
4 brands or models -- one machine each
3 types of laundry – Whites, Wash/wear, Denim
3 persons to operate the Machines
For each person:
Randomly assign the order of Brands
For each Brand, randomly assign the order of Types
[Figure: Random Order of Brands for Person 1: Brand D, Brand B, Brand A, Brand C. Within each Machine, the three Types (White, W/W, Denim) are run in a Random Order.]
Re-Randomize for each Person
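The randomization described above can be sketched in a few lines. This is only one possible way to generate such a plan: the function name `washing_machine_plan` and the seed are hypothetical and chosen for illustration, and the plan it prints will not match the order shown for Person 1.

```python
import random

def washing_machine_plan(persons, brands, types, seed=None):
    """Build a randomization plan: for each person, a random order of brands,
    and within each brand (machine) a random order of laundry types."""
    rng = random.Random(seed)
    plan = {}
    for person in persons:
        # Re-randomize the order of brands for every person.
        brand_order = rng.sample(brands, k=len(brands))
        # Within each machine, randomize the order of the three loads.
        plan[person] = [
            (brand, rng.sample(types, k=len(types))) for brand in brand_order
        ]
    return plan

plan = washing_machine_plan(
    persons=["Person 1", "Person 2", "Person 3"],
    brands=["A", "B", "C", "D"],
    types=["White", "W/W", "Denim"],
    seed=2024,  # hypothetical seed, used only so the plan can be reproduced
)

for person, runs in plan.items():
    print(person)
    for brand, type_order in runs:
        print(f"  Brand {brand}: {', '.join(type_order)}")
```

Recording the seed alongside the plan makes it possible to show later exactly how the assignment was produced.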
Machines are the Experimental Units for Brands, and the variance for comparing Brands is estimated from Person*Brand
Persons are Blocks of Machines
Compare BRANDS by using the variability among Machines Treated Alike
The Machines within a Person are Blocks for Types – Three Loads per Machine
The Loads within a machine are the Experimental Units for Type and Brand*Type
Variability among Loads treated alike provides the measuring stick for comparing the levels of Type and Brand*Type
This Design Involves Persons as Blocks and Two Sizes of Experimental Unit: Machine and Load
If you ignore that this design involves TWO sizes of Experimental Units and there are Two Error Terms, the resulting error term is a combination of these two error terms
The combined error term is Too Large for making comparisons involving Type and Brand*Type – won’t find things that are there
The Combined error term is Too Small for Making comparisons involving Brand – will declare things to be different when they are not
Statistical Conclusions can be very misleading
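A degrees-of-freedom count makes the two error terms concrete. The sketch below assumes the standard split-plot layout in randomized complete blocks (3 Persons as blocks, 4 Brands on Machines, 3 Types on Loads); the function name `split_plot_df` and the row labels are hypothetical.

```python
def split_plot_df(n_blocks, n_whole, n_sub):
    """Degrees of freedom for a split-plot design in randomized complete blocks.

    n_blocks: number of blocks (persons)
    n_whole:  levels of the whole-plot factor (brands, applied to machines)
    n_sub:    levels of the sub-plot factor (types, applied to loads)
    """
    return {
        "Person": n_blocks - 1,
        "Brand": n_whole - 1,
        "Machine error (Person*Brand)": (n_blocks - 1) * (n_whole - 1),
        "Type": n_sub - 1,
        "Brand*Type": (n_whole - 1) * (n_sub - 1),
        "Load error": n_whole * (n_blocks - 1) * (n_sub - 1),
    }

df = split_plot_df(n_blocks=3, n_whole=4, n_sub=3)
total_df = 3 * 4 * 3 - 1  # 36 loads in total

for source, value in df.items():
    print(f"{source:30s} {value:3d}")
print(f"{'Total':30s} {total_df:3d}")

# Ignoring the two sizes of experimental unit pools both error lines into one.
pooled_error_df = df["Machine error (Person*Brand)"] + df["Load error"]
print(f"{'Pooled (incorrect) error':30s} {pooled_error_df:3d}")
```

The Machine error line is the measuring stick for Brand, and the Load error line is the measuring stick for Type and Brand*Type; the pooled line mixes the two.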
STATISTICIAN’S JOB – to figure out how the study is being run and help identify the type of design that is being used, which includes determining if more than one size of experimental unit is involved
This is accomplished BEST when the Statistician is involved at the Beginning of the Study
SALSA TASTING EXPERIMENT
NINE TYPES OR BRANDS OF SALSA
A PERSON CAN TASTE ONLY THREE SALSAS DURING THE SESSION
TWELVE PERSONS WILL BE USED IN THE STUDY
ASSIGNMENT OF PRODUCTS TO PERSONS – with order
Person   Order 1   Order 2   Order 3
   1        C         A         B
   2        H         I         G
   3        E         B         H
   4        G         B         F
   5        I         E         A
   6        C         G         E
   7        F         D         E
   8        A         G         D
   9        C         I         F
  10        D         H         C
  11        F         A         H
  12        B         D         I
Each Product is Tasted 4 times – there are Four Replications of each product
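The replication count can be checked directly from the assignment table. The sketch below re-enters the assignment as transcribed here, then counts how often each product is tasted and how often each pair of products is tasted by the same person; the variable names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Assignment of products to persons, in tasting order, as listed above.
design = {
    1: "CAB", 2: "HIG", 3: "EBH", 4: "GBF", 5: "IEA", 6: "CGE",
    7: "FDE", 8: "AGD", 9: "CIF", 10: "DHC", 11: "FAH", 12: "BDI",
}

# How many persons taste each product (the replications of that product).
replications = Counter(
    product for products in design.values() for product in products
)

# How many persons taste each pair of products together.
pair_counts = Counter(
    frozenset(pair)
    for products in design.values()
    for pair in combinations(products, 2)
)

print("Replications per product:", dict(sorted(replications.items())))
print("Distinct pairs tasted together:", len(pair_counts))
print("Times each pair is tasted together:", set(pair_counts.values()))
```

For this assignment every product is tasted four times and every pair of products is tasted together by exactly one person, which is what allows all products to be compared on an equal footing even though no person tastes them all.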
Since each person tastes only Three of the products, how do we compare the products?
The Analysis obtains predicted values for each Product for each Person
Want to compare the Products as if each Person had tasted all of the Products
The Product Means of these Predicted Values are the “ADJUSTED MEANS” for each Product
Called LEAST SQUARES MEANS by SAS®
The LSMEANS are the Predicted Means as if Each of the Persons has Tasted and evaluated all of the products
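The adjustment can be sketched for this particular assignment. The example below assumes Persons are treated as fixed block effects and uses the classical intra-block formula for a balanced incomplete block design rather than SAS; the scores are made up and noise-free (score = product effect + person effect) so that the adjustment is easy to see.

```python
from statistics import mean

design = {
    1: "CAB", 2: "HIG", 3: "EBH", 4: "GBF", 5: "IEA", 6: "CGE",
    7: "FDE", 8: "AGD", 9: "CIF", 10: "DHC", 11: "FAH", 12: "BDI",
}
t, k, r, lam = 9, 3, 4, 1  # products, products per person, replications, pair count

# Hypothetical effects (made up): products differ, and some persons score higher.
product_effect = dict(zip("ABCDEFGHI", [-4, -3, -2, -1, 0, 1, 2, 3, 4]))
person_effect = {person: 50 + 2 * person for person in design}

scores = {
    (person, product): product_effect[product] + person_effect[person]
    for person, products in design.items()
    for product in products
}

grand_mean = mean(scores.values())
person_total = {
    person: sum(scores[person, product] for product in products)
    for person, products in design.items()
}

raw_mean, adjusted_mean = {}, {}
for product in sorted(product_effect):
    tasters = [person for person, products in design.items() if product in products]
    total = sum(scores[person, product] for person in tasters)
    raw_mean[product] = total / r
    # Product total adjusted for the totals of the persons who tasted it.
    q = total - sum(person_total[person] for person in tasters) / k
    adjusted_mean[product] = grand_mean + k * q / (lam * t)

for product in sorted(product_effect):
    print(f"{product}: raw {raw_mean[product]:6.2f}   adjusted {adjusted_mean[product]:6.2f}")
```

In this noise-free example the raw means depend on which Persons happened to taste each product, while the adjusted means are spaced exactly like the product effects that were built in.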
Sometimes characteristics of experimental units are measured – to be used as possible covariates
Study the effect of three types of Drugs on a person's heart rate
Randomly Assign 12 persons to each of the Drugs -- person is experimental unit
Dose the person with the assigned drug and measure the heart rate after 15 minutes
Persons do not have identical heart rates before being given the respective drug
Measure the initial heart rate – heart rate before giving the drug
We want to compare the Drugs as if all experimental units (persons) had the same initial heart rate
Analysis of Covariance uses a regression model to obtain predicted after drug heart rate values as if all persons had initial heart rates of, say, 74 beats per minute
The Drug Means of these predicted heart rates are used to compare the Drugs – These means of Predicted Values are called LSMEANS
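The adjustment to a common initial heart rate can be sketched as follows. This assumes the simplest Analysis of Covariance model, with one common slope for all Drugs; the heart rates are made up, and only four persons per Drug are shown instead of twelve to keep the example short. The function name `adjusted_means` is hypothetical.

```python
from statistics import mean

# Hypothetical data (made up): (initial heart rate, heart rate 15 minutes after dosing).
data = {
    "Drug 1": [(68, 72), (74, 77), (80, 84), (70, 73)],
    "Drug 2": [(72, 80), (78, 85), (84, 92), (76, 84)],
    "Drug 3": [(66, 66), (71, 70), (75, 75), (69, 68)],
}

def adjusted_means(data, reference_x):
    """Common-slope analysis of covariance: return the pooled within-group
    slope and each group's mean response adjusted to covariate = reference_x."""
    sum_xy = sum_xx = 0.0
    group_means = {}
    for group, pairs in data.items():
        x_bar = mean(x for x, _ in pairs)
        y_bar = mean(y for _, y in pairs)
        group_means[group] = (x_bar, y_bar)
        # Pool the within-group sums of squares and cross-products.
        sum_xy += sum((x - x_bar) * (y - y_bar) for x, y in pairs)
        sum_xx += sum((x - x_bar) ** 2 for x, _ in pairs)
    slope = sum_xy / sum_xx
    adjusted = {
        group: y_bar + slope * (reference_x - x_bar)
        for group, (x_bar, y_bar) in group_means.items()
    }
    return slope, adjusted

slope, lsmeans = adjusted_means(data, reference_x=74)
print(f"Common slope: {slope:.3f}")
for drug, value in lsmeans.items():
    print(f"{drug}: adjusted mean at initial rate 74 = {value:.2f}")
```

Each adjusted mean is the Drug's predicted after-drug heart rate for a person whose initial heart rate is 74 beats per minute, so the Drugs are compared at the same starting point.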
LSMEANS are adjusted means and arise in several settings
1. Obtain treatments’ means when not all treatments are observed the same number of times by each person
2. Obtain treatments’ means when the experimental units do not have identical values of the covariates
Another Role of the Statistician is to provide appropriate models for the analysis of the data from a given study. These models take the Design Structure and covariates into account and provide estimates of the treatment effects as if all experimental units had observed all treatments, or as if all experimental units had the same value of the covariate -- that is, they provide appropriate LSMEANS
Involving the Statistician in the Beginning of the Study will
1. Improve the chance of conducting a successful experiment
2. Speed up the turnaround of the analyses, since the Statistician was involved with the design
3. Reduce the costs associated with the experiment -- making sure the sample size is adequate to provide the needed detectable differences
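As a rough illustration of how an adequate sample size relates to a detectable difference, the sketch below uses the common normal-approximation formula for comparing two treatment means with equal group sizes. The slides do not give a formula, so this is one textbook approach, and the function name, the standard deviation, and the difference are all assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(sigma, delta, alpha=0.05, power=0.80):
    """Approximate number of experimental units per treatment needed to detect
    a difference `delta` between two means with a two-sided test, assuming a
    known standard deviation `sigma` among units treated alike."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z(power)           # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

# Hypothetical planning values: standard deviation 10, detectable difference 5.
print(n_per_group(sigma=10, delta=5))
# Halving the detectable difference roughly quadruples the required sample size.
print(n_per_group(sigma=10, delta=2.5))
```

The standard deviation that goes into such a calculation must be the variability among Experimental Units treated alike, not among Sub-samples.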