A Lecture on Sample Size and Statistical Inference for Health Researchers
Uploaded by dr-arindam-basu. Category: Education.
![Page 1: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/1.jpg)
Statistical Inference and Sample Size
Arindam Basu
2015-03-18
![Page 2: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/2.jpg)
What Shall We Learn?
Revise concepts on probability
Statistical Inference - Estimation
Concept of Hypothesis Testing
Concepts of Sampling and Sample Size
![Page 3: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/3.jpg)
Approaches to Population Parameters
We'd like to know about population parameters
Parameters are unknown
As a result, we calculate statistics from samples
We estimate population parameters from sample statistics
We also test hypotheses about parameters using our samples
![Page 4: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/4.jpg)
Concepts of Probability
![Page 5: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/5.jpg)
Theory of Probability and Inference
A trial/experiment has a set of specified outcomes
The outcome of one trial does not influence the outcome of another trial
The trials are identical
Probabilities provide a link between a population and samples
![Page 6: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/6.jpg)
Independence Law of Probability
Two outcomes are statistically independent if the probability of their joint occurrence is the product of the probabilities of occurrence of each outcome
![Page 7: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/7.jpg)
C and D Are Independent
$P(C \cap D) = P(C) \cdot P(D)$, where $P(C \cap D)$ is the joint probability of event C and event D
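A minimal sketch of the product rule, assuming two tosses of a fair coin as a hypothetical example (the coin and the event definitions are illustrative, not from the lecture). Enumerating the four equally likely outcomes lets us check that the joint probability equals the product of the two marginal probabilities.

```python
from fractions import Fraction
from itertools import product

# Hypothetical example: two tosses of a fair coin, all four outcomes equally likely.
outcomes = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event as the share of equally likely outcomes that satisfy it."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))

p_c = prob(lambda o: o[0] == "H")        # C: first toss is a head
p_d = prob(lambda o: o[1] == "H")        # D: second toss is a head
p_cd = prob(lambda o: o == ("H", "H"))   # C and D occur together

# Independence: the joint probability equals the product of the marginals.
is_independent = p_cd == p_c * p_d
print(p_c, p_d, p_cd, is_independent)
```

Exact fractions are used here so that the equality check is not affected by floating-point rounding.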
![Page 8: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/8.jpg)
Examples of Independent Events
If repetitions are independent, then they are from a random sample
A random sample is defined by the method that produces the sample
![Page 9: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/9.jpg)
Law of Mutually Exclusive Outcomes
Two outcomes are mutually exclusive if at most one of them can occur at a time; that is, the outcomes do not overlap
![Page 10: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/10.jpg)
C and D are Mutually Exclusive
P(C OR D) = P(C) + P(D)
![Page 11: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/11.jpg)
Examples of Mutually Exclusive Events
Dead or Alive Outcomes
Head or Tail in a Toss
Vaginal OR Caesarian Section as Modes of Delivery
NZ European OR Asian OR Maori OR Pacific Islander
Others??
![Page 12: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/12.jpg)
Not All Outcomes are Mutually Exclusive
Figure: Not all outcomes mutually exclusive
![Page 13: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/13.jpg)
What is the Probability of Being Overweight OR Having High Blood Pressure?
P(Overwt only) + P(HTN only) + P(Overwt and HTN) = 0.2 + 0.1 + 0.1 = 0.4
![Page 14: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/14.jpg)
Question: What is the sum of two marginal probabilities?
P(Overwt) + P(HTN) = 0.3 + 0.2 = 0.5
![Page 15: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/15.jpg)
What Happens When We Remove the Joint Occurrences?
P(Overwt OR HTN) = P(Overwt) + P(HTN) - P(Overwt and HTN) = 0.3 + 0.2 - 0.1 = 0.4
Thus in this case overweight (O) and hypertension (H) are NOT mutually exclusive
![Page 16: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/16.jpg)
Law of Addition
By the addition rule, for any two outcomes, the probability of occurrence of either outcome or both is the sum of the probabilities of each occurring minus the probability of their joint occurrence
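The addition rule can be sketched with the overweight and hypertension probabilities quoted on the preceding slides. The function name `p_either` is a hypothetical helper introduced only for this illustration.

```python
# Marginal and joint probabilities quoted on the slides above.
p_overweight = 0.3
p_hypertension = 0.2
p_both = 0.1

def p_either(p_c, p_d, p_joint):
    """Addition rule: P(C or D) = P(C) + P(D) - P(C and D)."""
    return p_c + p_d - p_joint

# Overlapping outcomes: the joint probability is subtracted once.
p_union = p_either(p_overweight, p_hypertension, p_both)

# Mutually exclusive outcomes are the special case with joint probability 0.
p_head_or_tail = p_either(0.5, 0.5, 0.0)

print(round(p_union, 2), p_head_or_tail)
```

Summing the two marginals alone would give 0.5, which counts the people who are both overweight and hypertensive twice.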
![Page 17: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/17.jpg)
Law of Conditional Probability
For any two outcomes C and D, the conditional probability of the occurrence of C given the occurrence of D, written P(C | D), is given by
![Page 18: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/18.jpg)
C is Conditional on D
$P(C \mid D) = P(C \cap D) / P(D)$, provided $P(D) > 0$
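As an illustration, the conditional probability formula can be applied to the overweight and hypertension figures used earlier in the lecture (P(Overwt) = 0.3, P(HTN) = 0.2, joint = 0.1). Applying the formula to those numbers is an assumption made here for the example; the helper name `conditional` is hypothetical.

```python
def conditional(p_joint, p_given):
    """P(C | D) = P(C and D) / P(D); undefined when P(D) is 0."""
    if p_given == 0:
        raise ValueError("P(D) must be positive")
    return p_joint / p_given

# Probability of hypertension given overweight: 0.1 / 0.3
p_htn_given_overweight = conditional(0.1, 0.3)

# Probability of overweight given hypertension: 0.1 / 0.2
p_overweight_given_htn = conditional(0.1, 0.2)

print(round(p_htn_given_overweight, 3), round(p_overweight_given_htn, 3))
```

The two results differ, which is a reminder that P(C | D) and P(D | C) are not interchangeable.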
![Page 19: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/19.jpg)
Concepts of Randomness
![Page 20: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/20.jpg)
What is a Random Variable?
A variable associated with a random sample
The process that generates that variable must be random
The likelihood of Person 1 being selected will have nothing to do with the likelihood of Person 2 being selected
Empirical relative frequency of occurrence of a value of the variable becomes an estimate of the probability of occurrence of that value
![Page 21: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/21.jpg)
Consider this: Number of Boys in Families of Eight
Figure: Number of Boys in Families of 8 Children
![Page 22: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/22.jpg)
Calculate: What is the probability of
Finding exactly two boys in that family?
P(Number of Boys = 2) = 0.0993
Finding none, one, or two boys in the family?
P(Number of Boys = 0) + P(Number of Boys = 1) + P(Number of Boys = 2) = 0.1310
![Page 23: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/23.jpg)
Probability Distribution Function of this Data
Figure: Probability Distribution of Boys in Families of 8 Children
![Page 24: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/24.jpg)
Types of Variables
Discrete: Nominal, Ordinal
Continuous: Interval, Ratio
![Page 25: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/25.jpg)
Probability with Continuous Random Variable
What is the probability of finding someone with weight exactly 50 kg?
Answer = 0! (i.e., exactly 50.000 and not 50.001 kg)
We can find someone in the interval between 49.5 and 50.5 kg
Convert continuous variables into intervals -> treat midpoints like discrete values -> list the associated probabilities
![Page 27: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/27.jpg)
Start with the Barplot of Relative Frequencies
Figure: Barplot of Relative Frequencies
![Page 28: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/28.jpg)
The curve would take a smooth shape
Figure: Line Plot of Relative Frequencies
![Page 29: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/29.jpg)
Probability Density Function
A probability density function is a curve that specifies, by means of the area under the curve over an interval, the probability that a continuous random variable falls within the interval. The total area under the curve is 1
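A small sketch of "probability as area under the curve", using Python's standard-library `statistics.NormalDist`. The weight distribution (mean 60 kg, SD 10 kg) is a made-up assumption for illustration only; the lecture does not give one.

```python
from statistics import NormalDist

# Assumption for illustration only: weight ~ Normal(mean 60 kg, SD 10 kg).
weight = NormalDist(mu=60, sigma=10)

def prob_between(dist, low, high):
    """Area under the density curve between low and high."""
    return dist.cdf(high) - dist.cdf(low)

p_interval = prob_between(weight, 49.5, 50.5)   # "about 50 kg"
p_exact = prob_between(weight, 50.0, 50.0)      # a single point has zero area
p_total = prob_between(weight, -40.0, 160.0)    # mean +/- 10 SD, essentially everything

print(round(p_interval, 4), p_exact, round(p_total, 6))
```

The interval from 49.5 to 50.5 kg has a small but positive probability, while the single point 50 kg has probability 0, as the earlier slide on continuous variables states.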
![Page 30: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/30.jpg)
How to Calculate the Average of a Discrete Random Variable?
$E(Y) = \sum p \cdot y$, where E(Y) is the expected value of Y, p is the proportion (probability) of each value, and y is the individual value
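The expected-value formula can be sketched as follows. The probabilities used here come from an assumed Binomial(8, 0.5) model for the number of boys in a family of eight; this is a stand-in chosen for the example, since the lecture's observed frequencies are in a figure that is not reproduced in the transcript.

```python
from math import comb

def expected_value(pmf):
    """E(Y) = sum of p * y over every value y with probability p."""
    return sum(p * y for y, p in pmf.items())

# Assumption: number of boys in a family of 8 follows Binomial(8, 0.5).
# These are model probabilities, not the observed data from the lecture.
pmf_boys = {k: comb(8, k) * 0.5 ** 8 for k in range(9)}

total_probability = sum(pmf_boys.values())
mean_boys = expected_value(pmf_boys)

print(total_probability, mean_boys)
```

Under this assumed model the expected number of boys is 4, the midpoint of the range 0 to 8.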
![Page 31: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/31.jpg)
What is Normal Distribution?
Population = set of all possible values of a variable
Random selection of objects makes the variable a random variable
Challenge: find a model that has few parameters and can be applied to real data
The Normal or Gaussian distribution is such a statistical model
![Page 32: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/32.jpg)
Why is Normal Distribution Popular?
It works!
Central Limit Theorem
Practical
![Page 33: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/33.jpg)
Central Limit Theorem
If a random variable Y has population mean $\mu$ and population variance $\sigma^2$, the sample mean $\bar{y}$, based on n observations, is approximately normally distributed with mean $\mu$ and variance $\sigma^2/n$ (that is, standard error $\sigma/\sqrt{n}$), for sufficiently large n
![Page 34: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/34.jpg)
Central Limit Theorem in Simple Terms
Means of sufficiently large random samples from any distribution with finite variance will be approximately normally distributed
Reassuring even when we do not know the nature of the original distribution
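A rough simulation sketch of the Central Limit Theorem, assuming a skewed exponential distribution with mean 1 and standard deviation 1 as the hypothetical parent population. The sample size (50), the number of replicates (2000) and the seed are arbitrary choices for the illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

def sample_means(n, replicates, seed=0):
    """Draw `replicates` samples of size n from Exponential(1) and return their means."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(replicates)]

n = 50
means = sample_means(n, replicates=2000)

centre = mean(means)            # should be close to the population mean, 1
spread = stdev(means)           # should be close to sigma / sqrt(n)
theoretical_se = 1.0 / sqrt(n)

print(round(centre, 3), round(spread, 3), round(theoretical_se, 3))
```

Even though the parent distribution is strongly skewed, the sample means cluster around the population mean with a spread close to σ/√n; a histogram of `means` would look roughly bell shaped.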
![Page 35: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/35.jpg)
CLT Helps Us to Calculate the Confidence Intervals
Figure: 95 pct confidence interval
![Page 36: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/36.jpg)
A Table that Helps You to Calculate the 95% CI
Figure: z value table
![Page 37: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/37.jpg)
Example of a Normal Distribution
Figure: Density Plot
![Page 38: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/38.jpg)
Statistics Are Random
A statistic associated with a random sample is a random variable
![Page 39: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/39.jpg)
Illustration with an Example of IQ distribution
Figure: IQ Distribution
![Page 40: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/40.jpg)
Points to Note
Halving the variability (standard error) of the sample mean requires a 4-fold increase in sample size, because the standard error is $\sigma/\sqrt{n}$
![Page 41: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/41.jpg)
Note: if you have 100 participants and can add another 10, don't bother
Figure: An extra 10 percent is not worth it
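The two points about sample size can be checked numerically from the standard error formula σ/√n. In this sketch σ is set to 1 as a placeholder assumption; the ratios do not depend on its value.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of a sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

sigma = 1.0  # placeholder value; the ratios below are the same for any sigma

# Quadrupling the sample size (100 -> 400) halves the standard error.
ratio_4x = standard_error(sigma, 400) / standard_error(sigma, 100)

# Adding 10 participants to 100 shrinks the standard error by under 5%.
ratio_10pct = standard_error(sigma, 110) / standard_error(sigma, 100)

print(round(ratio_4x, 3), round(ratio_10pct, 3))
```

The second ratio is about 0.953, so ten extra participants buy less than a 5% gain in precision.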
![Page 42: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/42.jpg)
Example: Birth Weight of Babies with SIDS (Sudden Infant Death Syndrome)
In one city, 78 babies died with a diagnosis of SIDS. Their birth certificates were obtained, and the mean birthweight of these 78 babies was found to be 2994 grams. It is also known that in this population the standard deviation of birthweight is about 800 grams.
What is the 95% confidence interval for the mean birthweight of these SIDS infants?
![Page 43: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/43.jpg)
Answer to the Birth Weight Question
At the lower limit: $2994 - 1.96 \times 800/\sqrt{78} = 2816$
At the higher limit: $2994 + 1.96 \times 800/\sqrt{78} = 3172$
What if we wanted to be MORE confident? Say 99% confident?
![Page 44: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/44.jpg)
Answer to the 99% Confidence Interval
Lower limit = $2994 - 2.58 \times 800/\sqrt{78} = 2760$
Upper limit = $2994 + 2.58 \times 800/\sqrt{78} = 3228$
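The 95% and 99% intervals for the SIDS birthweight example can be reproduced with a few lines of Python. This sketch uses the tabulated z values 1.96 and 2.58 from the slides; the function name `z_interval` is a hypothetical helper.

```python
from math import sqrt

def z_interval(sample_mean, sigma, n, z):
    """Confidence interval for a mean when the population SD is known."""
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# SIDS example: n = 78, mean = 2994 g, known population SD = 800 g.
ci95 = z_interval(2994, 800, 78, z=1.96)
ci99 = z_interval(2994, 800, 78, z=2.58)

print([round(x) for x in ci95])
print([round(x) for x in ci99])
```

The 99% interval is wider than the 95% interval because a larger multiple of the same standard error is added and subtracted.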
![Page 45: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/45.jpg)
Interpretations of Confidence Intervals
As the confidence level increases, the interval gets wider.
Why can this be?
This is the price we pay for making sure we have straddled the population mean
As we decrease α, we increase the level of confidence
If we want to decrease the width, then we either decrease confidence or increase sample size
![Page 47: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/47.jpg)
Steps of Estimation
Start with the sample statistic
Make a statement about the population parameter
We use a confidence interval to indicate that our interval straddles the parameter
Sort of flip it over, and get hypothesis testing
![Page 48: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/48.jpg)
Steps of Hypothesis Testing
Start by assuming a parameter value
Make a probability statement about the value of the statistic
Measure "how far" an observed statistic is from the hypothesised parameter
If the distance is GREAT, we argue that the hypothesised parameter is INCONSISTENT with the data -> reject the hypothesis
![Page 49: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/49.jpg)
Concepts of Distance in Hypothesis Testing
Take the basic variability of the observations (variance, $\sigma^2$)
Take the sample size (N)
If the observed value of the statistic is 2 or more standard errors from the hypothesised value of the parameter, question the truth of the hypothesis
This is because the data do not match the hypothesis
![Page 50: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/50.jpg)
Example: Is the SIDS babies' birthweight different from that of the normal population?
Mean birthweight of our sample (N = 78 babies) = 2994 g
We know the standard deviation of the population = 800 g
Therefore standard error = $800/\sqrt{78}$ = 90.6 g
For the general population, average birthweight = 3300 g.
Is our sample birthweight consistent with this?
![Page 51: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/51.jpg)
How far is the SIDS birthweight from the average birthweight?
Figure: SIDS Birth Weight
![Page 52: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/52.jpg)
Conclusions from SIDS Study
The observed difference = 3300 - 2994 = 306 g
This is 306/90.6 = 3.38 standard errors away from the hypothesised mean
It is a GREAT distance away by our rule
Hence the SIDS babies sample is inconsistent with what is expected!
The SIDS babies come from a DIFFERENT population, one with a lower mean birthweight
What are other challenges to this?
![Page 53: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/53.jpg)
Where in the Normal Distribution Does this Number of Standard Errors Fall?
Figure: Area of Observed Value
![Page 54: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/54.jpg)
Can We Associate a Probability Value with this Tail Estimate?
The area in the tail beyond the observed statistic (here 3.38 standard errors from the hypothesised mean) is the p-value
We know that for z = 1.96, the one-tailed p-value = 0.025
We know that for z = 2.58, the one-tailed p-value = 0.005
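A sketch of the z statistic and its tail area for the SIDS example, using the standard-library `statistics.NormalDist` for the standard normal curve. The helper name `z_test` is hypothetical, and reporting both one-tailed and two-tailed areas is a choice made for the illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n):
    """Return z, the one-tailed p-value and the two-tailed p-value."""
    se = sigma / sqrt(n)                      # standard error of the mean
    z = (sample_mean - mu0) / se              # distance in standard errors
    p_one_tail = NormalDist().cdf(-abs(z))    # area beyond |z| in one tail
    return z, p_one_tail, 2 * p_one_tail

# SIDS example: sample mean 2994 g, hypothesised mean 3300 g, SD 800 g, n = 78.
z, p_one, p_two = z_test(2994, 3300, 800, 78)

# Tail areas for the two reference z values quoted on the slide.
tail_196 = NormalDist().cdf(-1.96)
tail_258 = NormalDist().cdf(-2.58)

print(round(z, 2), p_one, p_two)
print(round(tail_196, 3), round(tail_258, 3))
```

The z value is negative because the SIDS sample mean lies below the hypothesised mean; its magnitude, about 3.38, is well beyond both reference values, so the tail area is far smaller than 0.005.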
![Page 55: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/55.jpg)
What if our statistic fell within the 2 standard errors?
We set it up before the data gathering as follows:
![Page 56: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/56.jpg)
Figure: sample space
![Page 57: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/57.jpg)
Concepts Related to Hypothesis Testing
Null Hypothesis - specifies the hypothesised real value for the parameter
Alternative Hypothesis - real value or range of values for the parameter when the null hypothesis is rejected
Rejection Region - values of the statistic for which the null hypothesis is rejected
![Page 58: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/58.jpg)
Key Table of Hypothesis Testing
Figure: Table of Hypothesis Testing
![Page 59: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/59.jpg)
Applying this to our SIDS study
Figure: SIDS sample space
![Page 60: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/60.jpg)
Rejection Regions - One tailed versus Two tailed
Figure: one tailed
![Page 61: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/61.jpg)
For One-tailed Tests with the Same Alpha, Widen the Rejection Region
Figure:
![Page 62: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/62.jpg)
Summary of Statistical Inference
Define population
Specify parameters
Take random sample from the population
Estimate the parameter from the sample statistic
Test hypotheses about the parameter using the sample statistic
![Page 63: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/63.jpg)
Review: Assumptions of Hypothesis Testing
We knew the population variance and formalised the sample mean to estimate the population mean
What Happens when:
We do not know either the population mean or the variance?
How do we compare two normal populations?
How do we estimate sample sizes?
![Page 64: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/64.jpg)
Need for a Pivotal Variable
Think of a randomly selected sample whose mean is calculated
That mean follows a normal distribution and estimates the population mean
The variance (or standard deviation of that mean) estimates the variance of the population as well
Pivotal Variable is the link
![Page 65: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/65.jpg)
Pivotal Variable
$Z = \dfrac{\bar{y} - \mu}{\sigma / \sqrt{n}}$
$\chi^2 = \dfrac{(N - 1)\, s^2}{\sigma^2}$, where s is the sample standard deviation
![Page 66: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/66.jpg)
Requirements of a Pivotal Variable
At least a statistic,
And a parameter
The distribution of Z or chi-square is fixed
Confidence intervals need Z or chi-square
These quantities are known as Pivotal Variables
![Page 67: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/67.jpg)
From Z to T
A random sample is picked from a normal distribution and we know the variance ($\sigma^2$)
Then, Z is our pivotal quantity, which has a Normal(0, 1) distribution.
What happens when we do not know the population variance but need to estimate the population mean from the sample?
The corresponding pivotal variable is 't', after Student (William Gosset)
$T = \dfrac{\bar{y} - \mu}{s / \sqrt{N}}$
What is the distribution of 't'?
![Page 68: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/68.jpg)
Properties of 'T' Distribution
Similar to Normal Distribution
Depends on N
Indexed by N - 1 degrees of freedom, as the chi-square distribution is
Bell shaped, symmetrical about 0
As N approaches infinity, t becomes similar to Normal
![Page 69: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/69.jpg)
Concept: SIDS problem now in terms of t-statistic
This time we do not know the population variance and would like to estimate the population mean
Sample size N = 15
Sample mean birthweight $\bar{y}$ = 3199.8 g
Sample standard deviation s = 663 g
![Page 70: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/70.jpg)
Challenge
Without assuming population variance, can we
Obtain an interval estimate of the population mean?
Test the null hypothesis that the average birthweight of SIDS cases is 3300 g?
T-value for 14 df = 2.14
Hence, upper limit: $3199.8 + 2.14 \times 663/\sqrt{15} = 3566$
Lower limit: $3199.8 - 2.14 \times 663/\sqrt{15} = 2833$
The hypothesised 3300 g lies inside this interval, so the null hypothesis is not rejected at the 5% level
Note that the confidence interval is wider
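The t-based interval can be sketched in the same way as the z-based one. Python's standard library has no t quantile function, so this sketch hard-codes the tabulated value 2.14 for 14 degrees of freedom quoted on the slide; `t_interval` is a hypothetical helper name.

```python
from math import sqrt

T_CRIT_14_DF = 2.14  # tabulated two-sided 5% critical value quoted on the slide

def t_interval(sample_mean, s, n, t_crit):
    """Confidence interval for a mean when the SD is estimated from the sample."""
    margin = t_crit * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Small SIDS sample: n = 15, mean = 3199.8 g, sample SD = 663 g.
low, high = t_interval(3199.8, 663, 15, T_CRIT_14_DF)

# The null value is compatible with the data if it lies inside the interval.
null_inside = low <= 3300 <= high

print(round(low), round(high), null_inside)
```

With only 15 babies and an estimated standard deviation, the interval spans roughly 730 g, about twice the width of the interval from the 78-baby sample with known standard deviation.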
![Page 71: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/71.jpg)
Hypothesis Testing for Paired Data
Paired Data = Repeated or multiple measurements on the same participants
Example: before and after measurements of pain following administration of analgesics
We want to look at differences between pairs
Has the mean of the sample differences come from a population of differences with mean 0?
Assume that this difference is normally distributed
![Page 72: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/72.jpg)
Example: Aminophylline Challenge Study
Children with apnea were administered aminophylline
Apnea episodes were measured 16 hours after administration and compared with those in the 24 hours before administration
Average change for 13 children = 0.767
SD of the change for 13 children = 0.524
T value for 12 df = 2.18
If we consider no change = 0, then
Rejection region: below $0 - 2.18 \times 0.524/\sqrt{13} = -0.317$ and, likewise, above 0.317
0.767 lies beyond 0.317, so it falls in the rejection region
We reject the null hypothesis
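The paired comparison can be sketched from the summary figures on the slide (mean change 0.767, SD 0.524, 13 children). The critical value 2.18 for 12 degrees of freedom is the tabulated number from the slide, and `paired_t` is a hypothetical helper name.

```python
from math import sqrt

def paired_t(mean_diff, sd_diff, n, t_crit):
    """Return the t statistic for H0: mean difference = 0, and the rejection limit."""
    se = sd_diff / sqrt(n)          # standard error of the mean difference
    t_stat = mean_diff / se         # distance from 0 in standard errors
    limit = t_crit * se             # reject H0 if |mean_diff| exceeds this
    return t_stat, limit

t_stat, limit = paired_t(0.767, 0.524, 13, t_crit=2.18)
reject_null = abs(t_stat) > 2.18

print(round(t_stat, 2), round(limit, 3), reject_null)
```

Comparing the mean change with the limit of 0.317 and comparing the t statistic with 2.18 are two views of the same decision.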
![Page 73: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/73.jpg)
Sampling
![Page 74: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/74.jpg)
Importance of Sampling
Save time and money
Measurements can be more accurate when done on smaller numbers
Therefore choose the method with most accuracy and precision
![Page 75: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/75.jpg)
Alternatives to Sampling
Census - Expensive
Volunteer based reporting
Early responders are different from late responders and both are different from members of the general public ("Worried Well")
Let the Interviewer Choose ("Choose those who are easiest to find")
![Page 76: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/76.jpg)
Concepts of Sampling
Capture as many respondents as you can
Also try to capture data from nonrespondents
Response is 60% or less for postal questionnaires even after a 3rd posting, while it is 70-75% for interviewer based sampling (Jennifer Kelsey et al., 2007)
For prevalence estimation, the completely healthy and those with diseases do not want to participate
For common but untreatable conditions like back pain, people with intractable problems are over-represented in the hope that "research" will solve their problems
![Page 77: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/77.jpg)
Key Definitions of Terms of Sampling
Sampling Unit = the basic unit around which sampling is planned (households, persons)
Sampling Frame = the collection (list) of all sampling units
Probability Sampling = where each sampling unit has a known, nonzero probability of being included in the sample
Nonprobability Sampling = e.g., Convenience Sampling
![Page 78: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/78.jpg)
Types of Probability Sampling
Simple Random Sampling
Systematic Sampling
Stratified Sampling
Cluster Sampling
Multistage Sampling
![Page 79: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/79.jpg)
Simple Random Sampling
Each unit has EQUAL probability of being included
Uses Random Numbers Table
With Replacement and Without Replacement (See R examples)
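The two variants of simple random sampling can be sketched in a few lines. This is only an illustration in Python using the standard `random` module, not the R examples the slide refers to; the sampling frame of 20 numbered units and the seed are assumptions made for the sketch.

```python
import random

# Hypothetical sampling frame: 20 numbered units (the IDs are illustrative).
frame = list(range(1, 21))

rng = random.Random(2015)  # fixed seed so the draw can be reproduced

# Without replacement: each unit can appear at most once in the sample.
srs_without = rng.sample(frame, k=5)

# With replacement: the same unit may be drawn more than once.
srs_with = rng.choices(frame, k=5)

print("without replacement:", srs_without)
print("with replacement:   ", srs_with)
```

In both cases every unit in the frame has the same chance of being picked on each draw, which is the defining property of simple random sampling.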
![Page 80: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/80.jpg)
Problems of Simple Random Sampling
Investigator needs to know the sampling frame before starting
If the randomising process is not robust or well done, there can be errors
Not suitable for all situations
Example: if the investigator wants to estimate family size by taking a simple random sample of children in a school, the estimate will be biased
Children from larger families are more likely to be sampled, so larger families are over-represented
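The family-size problem can be checked with a small calculation. The sketch below uses made-up data (eight families) and assumes that every child of each family attends the school, so that a family with four children appears four times on the school roster.

```python
from statistics import mean

# Hypothetical data: number of children in each of 8 families.
children_per_family = [1, 1, 1, 1, 2, 2, 3, 4]

# Average when the family is the sampling unit.
family_level_mean = mean(children_per_family)

# School roster: each child reports the size of his or her own family,
# so a family with 4 children contributes 4 entries.
child_reports = [size for size in children_per_family for _ in range(size)]

# Average that a simple random sample of children estimates.
child_level_mean = mean(child_reports)

print("mean over families:", family_level_mean)
print("mean over children:", round(child_level_mean, 3))
```

The child-level average is larger than the family-level average because the sampling unit (child) does not match the unit of interest (family).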
![Page 81: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/81.jpg)
Systematic Sampling
The sampling units are regularly spaced throughout the sampling frame
Investigator selects every k-th unit
Advantage: the investigator does not need to know the sampling frame in advance
Example: every 3rd newborn child in a hospital
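A minimal sketch of selecting every k-th unit follows. The function name `systematic_sample`, the list of 30 newborn IDs, and the random starting offset are assumptions added for illustration; the slide itself only specifies the regular spacing.

```python
import random


def systematic_sample(frame, k, rng):
    """Return every k-th unit of `frame`, starting at a random offset in [0, k)."""
    start = rng.randrange(k)
    return frame[start::k]


# Hypothetical birth register, in order of arrival.
births = [f"newborn_{i:03d}" for i in range(1, 31)]

rng = random.Random(7)
sample = systematic_sample(births, k=3, rng=rng)

print(sample)
```

In a hospital the register does not have to exist beforehand: the same rule can be applied as births occur.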
![Page 82: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/82.jpg)
Advantages and Disadvantages of Systematic Sampling
Simple to implement (just select every k-th sampling unit)
Can capture patterns easily
If a cyclical pattern exists, systematic sampling can miss the pattern entirely, e.g., seasonal trends such as flu patterns
Cannot estimate the variance of the population reliably from a SINGLE sample; needs at least two samples
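The risk with cyclical data can be illustrated with invented numbers. The sketch assumes a weekly cycle in daily case counts (higher counts on two days of each week); when the sampling interval equals the length of the cycle, every sampled day falls on the same point of the cycle.

```python
# Hypothetical daily case counts with a 7-day cycle: two high days per week.
weekly_pattern = [10, 10, 10, 10, 10, 30, 30]
daily_cases = weekly_pattern * 8  # 56 days

true_mean = sum(daily_cases) / len(daily_cases)

# Interval equal to the cycle length: only the first day of each week is seen.
every_7th = daily_cases[0::7]
mean_every_7th = sum(every_7th) / len(every_7th)

# Interval that does not match the cycle: different weekdays are visited.
every_5th = daily_cases[0::5]
mean_every_5th = sum(every_5th) / len(every_5th)

print("true mean:        ", round(true_mean, 2))
print("every 7th day:    ", mean_every_7th)
print("every 5th day:    ", round(mean_every_5th, 2))
```

The every-7th-day sample never sees a high day, so it misses the cycle completely and underestimates the average.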
![Page 83: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/83.jpg)
Stratified Sampling
Divide population into strata or uniform groups
Draw Sample from each stratum
Represents Each subgroup
Can give more precise estimates than a corresponding simple random sample
Can Assign Weights
Widely Used Strategy
Disadvantage: estimates for a stratum are imprecise if too few units are selected from it
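One common way to draw a stratified sample is proportional allocation, where the same sampling fraction is applied within each stratum. The sketch below assumes this allocation rule, and the strata names and sizes are hypothetical; other allocation schemes are equally valid.

```python
import random


def stratified_sample(strata, fraction, rng):
    """Draw a simple random sample of the given fraction from every stratum."""
    sample = {}
    for name, units in strata.items():
        # Take at least one unit so that no stratum is left out entirely.
        n = max(1, round(len(units) * fraction))
        sample[name] = rng.sample(units, n)
    return sample


# Hypothetical population of 100 people split into three strata.
strata = {
    "urban": [f"urban_{i}" for i in range(60)],
    "rural": [f"rural_{i}" for i in range(30)],
    "remote": [f"remote_{i}" for i in range(10)],
}

rng = random.Random(42)
chosen = stratified_sample(strata, fraction=0.1, rng=rng)
sizes = {name: len(units) for name, units in chosen.items()}

print(sizes)
```

With a 10% fraction the smallest stratum contributes a single unit, which shows the disadvantage noted above: an estimate for that stratum alone would rest on very little data.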
![Page 84: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/84.jpg)
Cluster Sampling
Sample Clusters rather than individuals
In the sampling frame, identify clusters (say classrooms, or households, or similar units)
Then examine everyone within each selected cluster
Want to study prevalence of dental caries in schoolchildren? Divide schools into classrooms, sample individual classrooms, and examine all children in the sampled classrooms
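The classroom example can be sketched as follows. The six classrooms of 25 children each and the choice of two classrooms are invented for illustration.

```python
import random

# Hypothetical school: 6 classrooms (clusters) with 25 children each.
classrooms = {
    f"class_{c}": [f"class_{c}_child_{i}" for i in range(1, 26)]
    for c in "ABCDEF"
}

rng = random.Random(1)

# Step 1: sample clusters, not individuals.
selected_classrooms = rng.sample(sorted(classrooms), k=2)

# Step 2: examine every child in the selected clusters.
examined = [child for c in selected_classrooms for child in classrooms[c]]

print(selected_classrooms, len(examined))
```

Only the list of classrooms is needed before sampling; the children are enumerated only within the classrooms that were selected.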
![Page 85: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/85.jpg)
Advantages of Cluster Sampling
Need not enumerate entire population in advance
Economical Use of Resources
![Page 86: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/86.jpg)
Multistage Sampling
Identify Primary Sampling Units that are Larger
From the Primary Sampling Units, identify secondary sampling units
Sample from the secondary sampling units or extend the process further
Different from Cluster Sampling
In cluster sampling one selects everyone from the secondary unit; here the secondary unit is itself sampled
Can use in different stages different sampling procedures
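A three-stage version of the school example might look like the sketch below. The frame (5 schools, 4 classrooms per school, 20 children per classroom) and the number of units drawn at each stage are assumptions; simple random sampling is used at every stage only to keep the example short.

```python
import random

rng = random.Random(3)

# Hypothetical frame: 5 schools, each with 4 classrooms of 20 children.
schools = {
    f"school_{s}": {
        f"school_{s}_class_{c}": [f"s{s}_c{c}_child_{i}" for i in range(20)]
        for c in range(4)
    }
    for s in range(5)
}

# Stage 1: sample primary sampling units (schools).
stage1 = rng.sample(sorted(schools), k=2)

sample = []
for school in stage1:
    # Stage 2: sample secondary units (classrooms) within each selected school.
    stage2 = rng.sample(sorted(schools[school]), k=2)
    for classroom in stage2:
        # Stage 3: sample children within each selected classroom,
        # instead of examining everyone as in cluster sampling.
        sample.extend(rng.sample(schools[school][classroom], k=5))

print(len(sample), sample[:3])
```

Any stage could use a different procedure, for example systematic sampling of classrooms followed by simple random sampling of children.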
![Page 87: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/87.jpg)
Sample Size Calculations
We need to know at least:
How variable are the data
How willing you are to wrongly conclude that there is an effect when there is none (Type I error)
What is the magnitude of effect you want to detect
What is the certainty with which you want to detect the effect (power)
![Page 88: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/88.jpg)
Importance of these criteria
The more variation in data, the more observations you need
The more certain you want to be, the more observations you will need
If the difference is very large, you need fewer people
If the difference is very small, you need more people
![Page 89: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/89.jpg)
The Formula
$$N = \frac{2\,\left(z_{1-\alpha/2} + z_{1-\beta}\right)^2}{\Delta^2}$$
Where $\Delta = (\mu_1 - \mu_2)/\sigma$
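The formula can be evaluated directly. The sketch below reads N as the number needed in each of two groups for a two-sided comparison of means under a normal approximation, which is the usual interpretation of this expression but is an assumption here; the function name `n_per_group` and the example values (means of 120 and 115, standard deviation of 10) are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(mu1, mu2, sigma, alpha=0.05, power=0.80):
    """Sample size per group from N = 2 * (z_(1-alpha/2) + z_(1-beta))^2 / delta^2."""
    delta = (mu1 - mu2) / sigma                    # standardized difference
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided Type I error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    n = 2 * (z_alpha + z_beta) ** 2 / delta ** 2
    return math.ceil(n)                            # round up to whole people


# Detect a 5-unit difference when the standard deviation is 10.
print(n_per_group(120, 115, 10))
```

Raising the power or lowering alpha increases both z values and therefore N, in line with the criteria listed earlier.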
![Page 90: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/90.jpg)
What is the significance of this formula
The standardized difference enters the formula as a square
The smaller the difference to be detected, the larger the sample size required, in proportion to 1/∆²
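As a short worked step, assuming α and power are held fixed so that the z terms do not change, halving the standardized difference gives:

```latex
N\!\left(\tfrac{\Delta}{2}\right)
  = \frac{2\,\left(z_{1-\alpha/2} + z_{1-\beta}\right)^2}{\left(\Delta/2\right)^2}
  = 4 \cdot \frac{2\,\left(z_{1-\alpha/2} + z_{1-\beta}\right)^2}{\Delta^2}
  = 4\,N(\Delta)
```

So detecting a difference half as large needs about four times as many people.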
![Page 91: A Lecture on Sample Size and Statistical Inference for Health Researchers](https://reader030.fdocuments.us/reader030/viewer/2022032617/55aba1121a28ab7d348b45ba/html5/thumbnails/91.jpg)
Summary
This brief tour provides a snapshot of core statistical thinking
We focused on relevant study design issues
We learned about basic probability
We learned about Distributions (Z, T)
We learned about principles of estimation and hypothesis testing
We learned about sampling and sample sizes