Sampling Techniques for epidemiological studies Biagio Pedalino.
![Page 1: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/1.jpg)
Sampling Techniques for epidemiological studies
Biagio Pedalino
![Page 2: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/2.jpg)
Objectives
• To decide whether to conduct sampling
• To choose among a list of sampling techniques
• To define sampling
• To describe sampling techniques
![Page 3: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/3.jpg)
Approach
We are normally interested in:
- Distribution of a variable of interest in a specific population
![Page 4: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/4.jpg)
Definition of population
A population is defined by:
• Its nature (an individual, housing, a firm etc.)
• Its intrinsic characteristics (gender, housing type, industries)
• Its localisation (city, neighbourhood etc.)
who, where and when
![Page 5: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/5.jpg)
Examples of populations
• The inhabitants of London in 2010
• French nationality women living in Paris in 2010
• Children in elementary schools in France in 2010
• HIV seropositive patients in hospital centers in France in 2006
• Individuals recently entered in the French prisons in 2010
![Page 6: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/6.jpg)
Variable of interest
• We are interested in a (non-random) variable y in a population U (made up of units k)
• It must be defined carefully and accurately (e.g. vaccination status)
Example
• We are interested in HIV prevalence in a population. The variable of interest y is defined, for each unit k in the population, by:
y = 1 if the unit is HIV seropositive; y = 0 otherwise (negative, not known, etc.)
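With this 0/1 coding, prevalence is simply the mean of y over the units. A minimal Python sketch of that idea follows; the status labels, the `statuses` list and the function name are illustrative assumptions, not data or code from the slides.

```python
def hiv_indicator(status):
    """Return 1 for an HIV-seropositive unit, 0 otherwise (negative, not known, ...)."""
    return 1 if status == "positive" else 0


# Hypothetical statuses for a tiny population U (made-up values, not real data).
statuses = ["positive", "negative", "unknown", "negative", "positive"]

# y_k for each unit k in the population.
y = [hiv_indicator(s) for s in statuses]

# Prevalence is the mean of the indicator over all units.
prevalence = sum(y) / len(y)
print(y, prevalence)
```

Note that "not known" is coded 0 here, exactly as on the slide, so a careless definition of y can silently lower the estimated prevalence.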
![Page 7: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/7.jpg)
Example of research question
• What is the proportion of individuals vaccinated against Hepatitis B in Lazareto, in October 2012?
• Variable of interest?
• Population?
• Time?
• How to obtain the information about the variable?
– Census
– Build a sample
![Page 9: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/9.jpg)
Example of research question
• What is the proportion of children vaccinated against Hepatitis B in Minorca, in 2012?
• Variable of interest?
• Population?
• Time?
• How to obtain the information about the variable?
– Census
– Build a sample
![Page 11: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/11.jpg)
Definition of sampling
Sampling is the process of selecting units from a specific population to collect information on a variable of interest
![Page 12: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/12.jpg)
Sample
Sampling frame
Target population
![Page 13: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/13.jpg)
Why bother in the first place?
Get information from large populations with:
– Reduced costs
– Reduced field time
– Increased accuracy
![Page 14: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/14.jpg)
Definition of sampling terms
Sampling frame
• List of all the sampling units from which the sample is drawn
– Lists: e.g. all children < 5 years of age, households, health care units…
Sampling scheme
• Method of selecting sampling units from the sampling frame
– Randomly, convenience sample…
![Page 15: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/15.jpg)
Definition of sampling terms
Sampling unit (element)
• Subject under observation from whom information is collected
– Example: children <5 years, hospital discharges, health events…
Sampling fraction
• Ratio between sample size and population size
– Example: 100 out of 2000 (5%)
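The sampling fraction is a one-line calculation. The sketch below reproduces the slide's 100 out of 2000 example; the function name is an illustrative assumption.

```python
def sampling_fraction(sample_size, population_size):
    """Ratio between sample size and population size."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    return sample_size / population_size


fraction = sampling_fraction(100, 2000)
print(f"{fraction:.0%}")
```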
![Page 16: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/16.jpg)
Sampling errors
• Systematic error (or bias)
– Representativeness (validity)
– Information bias
• Sampling error (random error)
– Precision
![Page 17: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/17.jpg)
Validity
• Sample should accurately reflect the distribution of the relevant variable in the population
– Person (age, sex)
– Place (urban vs. rural)
– Time (seasonality)
• Representativeness essential to generalise
![Page 18: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/18.jpg)
Representativeness
• Often used as synonym of validity of a sample
• General rule: build a sample representative of the whole population of interest
• Erroneous
![Page 19: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/19.jpg)
Which is the correct sample?
[Figure: three pairs of women/men distributions, labelled "Samples" and "Populations". One row shows 50%/50% in all three cases; the other row shows 50%/50%, 30%/70% and 70%/30%.]
All of them are correct!
![Page 20: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/20.jpg)
Example: question
• The aim of the study is to
– estimate the national prevalence of elevated Blood Lead Level (BLL ≥ 100 μg/L)
– determine the risk factors associated with elevated BLL
• Among children aged 1 to 6 years in Minorca in 2008-2009
We want to recruit 3000 children through hospitals
Is the best design to select each unit with the same inclusion probability?
![Page 21: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/21.jpg)
Example: answer
• No!
• ... because if the expected prevalence of elevated BLL is 1%
• With a sample size = 3000, we expect to have 30 children with an elevated BLL in the sample
• That number is too small to achieve the second objective of the study, i.e. identification of risk factors
This is one of the reasons why we do not perform surveys with equally-represented individuals in epidemiology
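The arithmetic behind the figure of 30 children is sample size times expected prevalence. A small Python sketch, assuming equal inclusion probabilities (the function name is illustrative):

```python
def expected_cases(sample_size, prevalence):
    """Expected number of cases in an equal-probability sample."""
    return sample_size * prevalence


# Slide values: 3000 children recruited, expected prevalence of elevated BLL of 1%.
cases = expected_cases(3000, 0.01)
print(round(cases))
```

Only about 1 child in 100 contributes a case, which is why an equal-probability design leaves too few cases for a risk-factor analysis.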
![Page 22: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/22.jpg)
Example: solution
• We want to over-represent children with an elevated BLL in the sample
• If we know that some hospitals stand in areas where the risk of lead exposure in the dwellings is high, then we will over-represent hospitals in these high risk areas
![Page 23: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/23.jpg)
Representativeness: take home messages
• A sample is correct if randomly built
• It is not necessary that the distributions in the sample and in the population are the same
![Page 24: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/24.jpg)
Information bias
• Systematic problem in collecting information
– Inaccurate measuring
• Scales (weight), ultrasound, lab tests (dubious results)
– Badly asked questions
• Ambiguous, not offering right options…
![Page 25: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/25.jpg)
Sampling error (random error)
• No sample is an exact mirror image of the population
• Standard error depends on
– size of the sample
– distribution of the characteristic of interest in the population
• Size of error
– can be measured in probability samples
– standard error
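To illustrate how the standard error depends on sample size and on the distribution of the characteristic, here is a sketch for the simplest case: a proportion estimated from a simple random sample, using the textbook formula sqrt(p(1 - p)/n). The slides do not give a formula, so this choice, the values of p and n, and the omission of a finite-population correction are all assumptions.

```python
import math


def standard_error_proportion(p, n):
    """Standard error of a sample proportion under simple random sampling.

    Assumes no finite-population correction (sample small relative to population).
    """
    return math.sqrt(p * (1 - p) / n)


# Hypothetical proportion of 30%; each quadrupling of n halves the standard error.
for n in (100, 400, 1600):
    print(n, round(standard_error_proportion(0.3, n), 4))
```

The same function shows the second dependence: for a fixed n, the standard error is largest when p is close to 0.5 and shrinks as p approaches 0 or 1.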
![Page 26: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/26.jpg)
Quality of a sampling estimate
• Precision and validity
• No precision: random error
• Precision but no validity: systematic error (bias)
![Page 27: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/27.jpg)
Survey errors: example
Measuring height:
• Measuring tape held differently by different investigators
→ loss of precision
→ large standard error
• Tape too short
→ systematic error
→ bias (cannot be corrected retrospectively)
![Page 28: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/28.jpg)
Types of sampling
• Non-probability samples
– Convenience samples
• Biased
– Subjective samples
• Based on knowledge
• In the presence of time/resource constraints
• Probability samples
– Random
• Only method that allows valid conclusions about the population and measurement of sampling error
![Page 29: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/29.jpg)
Non-probability samples
• Convenience samples (ease of access)
• Snowball sampling (friend of a friend, etc.)
• Purposive sampling (judgemental)
– You choose who you think should be in the study
Probability of being chosen is unknown
Cheaper, but unable to generalise, potential for bias
![Page 30: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/30.jpg)
Example of a non-probability sample
Take a sample of the population of Minorca to ask about possible exposures following a gastroenteritis outbreak
Sampling frame: people walking around the Es Castel harbour at noon on a Monday
![Page 31: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/31.jpg)
Probability samples
• Random sampling
– Each unit has a known probability of being selected
• Allows application of statistical sampling theory to results in order to:
– Generalise
– Test hypotheses
![Page 32: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/32.jpg)
Methods used in probability samples
• Simple random sampling
• Systematic sampling
• Stratified sampling
• Multi-stage sampling
• Cluster sampling
![Page 33: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/33.jpg)
Simple random sampling
• Principle
– Equal chance/probability of each unit being drawn
• Procedure
– Take sampling population
– Need listing of all sampling units (“sampling frame”)
– Number all units
– Randomly draw units
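The procedure above can be sketched in a few lines of Python with the standard `random` module. The frame of 40 numbered units, the sample size of 6 and the seed are illustrative assumptions.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every unit has the same chance of selection."""
    rng = random.Random(seed)  # a seed only makes the draw reproducible
    return rng.sample(frame, n)


# Hypothetical sampling frame: 40 units numbered 1..40.
frame = list(range(1, 41))
sample = simple_random_sample(frame, 6, seed=1)
print(sorted(sample))
```

`random.Random.sample` draws without replacement, so no unit can appear twice in the sample.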
![Page 34: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/34.jpg)
![Page 35: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/35.jpg)
Simple random sampling
[Figure: example of randomly drawn units: 5, 20, 27, 29, 32, 40]
![Page 36: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/36.jpg)
Simple random sampling
• Advantages
– Simple
– Sampling error easily measured
• Disadvantages
– Need complete list of units
– Units may be scattered and poorly accessible
– Heterogeneous population: important minorities might not be taken into account
![Page 37: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/37.jpg)
Systematic sampling
• Principle
– Select sampling units at regular intervals (e.g. every 20th unit)
• Procedure
– Arrange the units in some kind of sequence
– Divide the total sampling population by the designated sample size (e.g. 1200/60 = 20)
– Choose a random starting point (for an interval of 20, the starting point will be a random number between 1 and 20)
– Select units at regular intervals (in this case, every 20th unit), e.g. 4th, 24th, 44th etc. for a starting point of 4
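A minimal Python sketch of this procedure, using the slide's figures of 1200 units and a sample of 60 (interval 20). The function name and the seed are illustrative assumptions, and the integer division simply ignores any remainder when the frame size is not a multiple of the sample size.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select n units at a fixed interval after a random start."""
    interval = len(frame) // n            # e.g. 1200 // 60 = 20
    rng = random.Random(seed)
    start = rng.randint(1, interval)      # 1-based start, between 1 and the interval
    positions = range(start, start + interval * n, interval)
    return [frame[pos - 1] for pos in positions]


# Hypothetical ordered frame of 1200 units numbered 1..1200.
frame = list(range(1, 1201))
sample = systematic_sample(frame, 60, seed=3)
print(sample[:3])
```

Because only the starting point is random, the order of the frame matters: any pattern that repeats every 20 units would be picked up or missed entirely.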
![Page 38: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/38.jpg)
Systematic sampling
• Advantages
– Ensures representativeness across the list
– Easy to implement
• Disadvantages
– Need complete list of units
– Periodicity: an underlying pattern may be a problem (characteristics occurring at regular intervals)
![Page 39: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/39.jpg)
More complex sampling methods
![Page 40: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/40.jpg)
Stratified sampling
• When to use
– Population with distinct subgroups
• Procedure
– Divide (stratify) sampling frame into homogeneous subgroups (strata) e.g. age-group, urban/rural areas, regions, occupations
– Draw random sample within each stratum
![Page 41: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/41.jpg)
Stratified sampling
Selecting a sample with probability proportional to size

| Area | Population | Proportion | Sample size | Sampling fraction |
|---|---|---|---|---|
| Urban | 7000 | 70% | 1000 x 0.7 = 700 | 10% |
| Rural | 3000 | 30% | 1000 x 0.3 = 300 | 10% |
| Total | 10000 | | 1000 | |
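The urban/rural allocation in this example can be sketched in Python as follows. The population sizes and total sample size come from the slide; the unit labels, function names and seed are illustrative assumptions.

```python
import random


def proportional_allocation(strata_sizes, total_sample):
    """Split the total sample across strata in proportion to stratum size.

    Rounding may make the parts differ slightly from total_sample in general.
    """
    population = sum(strata_sizes.values())
    return {name: round(total_sample * size / population)
            for name, size in strata_sizes.items()}


def stratified_sample(strata, total_sample, seed=None):
    """Draw a simple random sample within each stratum."""
    rng = random.Random(seed)
    sizes = {name: len(units) for name, units in strata.items()}
    allocation = proportional_allocation(sizes, total_sample)
    return {name: rng.sample(units, allocation[name])
            for name, units in strata.items()}


# Hypothetical unit labels; stratum sizes as on the slide (7000 urban, 3000 rural).
strata = {
    "Urban": [f"U{i}" for i in range(7000)],
    "Rural": [f"R{i}" for i in range(3000)],
}
sample = stratified_sample(strata, 1000, seed=0)
print({name: len(units) for name, units in sample.items()})
```

With proportional allocation the sampling fraction is the same in every stratum (here 10%), so the sample keeps the 70/30 split of the population.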
![Page 42: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/42.jpg)
Stratified sampling
• Advantages
– Can acquire information about the whole population and individual strata
– Precision increased if variability within strata is smaller (homogeneous) than between strata
• Disadvantages
– Sampling error is difficult to measure
– Different strata can be difficult to identify
– Loss of precision if small numbers in individual strata (resolved by sampling proportional to stratum population)
![Page 43: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/43.jpg)
![Page 44: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/44.jpg)
Multiple stage sampling
Principle:
• Consecutive sampling
• Example: sampling unit = household
– 1st stage: draw neighbourhoods
– 2nd stage: draw buildings
– 3rd stage: draw households
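The three stages of this example can be sketched in Python as nested random draws. The nested dictionary `area`, the numbers drawn at each stage and the seed are illustrative assumptions, not values from the slides.

```python
import random


def multistage_sample(area, n_neighbourhoods, n_buildings, n_households, seed=None):
    """Draw neighbourhoods, then buildings within them, then households within those."""
    rng = random.Random(seed)
    selected = []
    for neighbourhood in rng.sample(sorted(area), n_neighbourhoods):       # 1st stage
        buildings = area[neighbourhood]
        for building in rng.sample(sorted(buildings), n_buildings):        # 2nd stage
            for household in rng.sample(buildings[building], n_households):  # 3rd stage
                selected.append((neighbourhood, building, household))
    return selected


# Hypothetical area: 5 neighbourhoods x 4 buildings x 6 households.
area = {
    f"N{i}": {f"B{j}": [f"H{k}" for k in range(6)] for j in range(4)}
    for i in range(5)
}
sample = multistage_sample(area, 2, 2, 3, seed=0)
print(len(sample))
```

Only the selected neighbourhoods and buildings need to be listed in detail, which is the practical appeal of consecutive sampling.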
![Page 45: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/45.jpg)
Cluster sampling
• Principle
– Whole population divided into groups e.g. neighbourhoods
– A type of multi-stage sampling where all units at the lower level are included in the sample
– Random sample taken of these groups (“clusters”)
– Within selected clusters, all units e.g. households included (or random sample of these units)
– Provides logistical advantage
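A minimal Python sketch of the one-stage variant, where clusters are drawn at random and every unit in a selected cluster is included. The cluster names, sizes and seed are illustrative assumptions.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select clusters and include all units of each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical population: 8 neighbourhoods (clusters) of 10 households each.
clusters = {
    f"neighbourhood_{i}": [f"household_{i}_{j}" for j in range(10)]
    for i in range(8)
}
sample = cluster_sample(clusters, 3, seed=0)
print({name: len(units) for name, units in sample.items()})
```

Only a list of clusters is needed to draw the sample; households are listed afterwards, and only inside the selected clusters.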
![Page 46: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/46.jpg)
![Page 47: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/47.jpg)
![Page 48: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/48.jpg)
Number of clusters needed = 25
![Page 50: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/50.jpg)
![Page 51: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/51.jpg)
![Page 52: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/52.jpg)
![Page 53: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/53.jpg)
![Page 54: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/54.jpg)
Stage 3: Selection of the sampling unit
All third-stage units might be included in the sample
![Page 55: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/55.jpg)
Stage 3: Selection of the sampling unit
Second-stage units => Households
Third-stage unit => Individuals
![Page 56: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/56.jpg)
Cluster sampling
• Advantages
– Simple, as a complete list of sampling units within the population is not required
– Less travel/resources required
• Disadvantages
– Members of a cluster may be more alike than members of different clusters (homogeneous)
– This needs to be taken into account in the sample size and in the analysis (“design effect”)
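The slides name the design effect without giving a formula. One widely used approximation, assumed here, is DEFF = 1 + (m - 1) × ICC, where m is the average cluster size and ICC the intra-cluster correlation; the sample size computed for simple random sampling is then multiplied by DEFF. All numbers below are hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Approximate design effect: 1 + (m - 1) * ICC (an assumed, commonly used formula)."""
    return 1 + (cluster_size - 1) * icc


def inflated_sample_size(srs_size, cluster_size, icc):
    """Sample size for a cluster design, inflating the simple-random-sampling size."""
    # Round first so floating-point noise cannot push ceil() up by one.
    return math.ceil(round(srs_size * design_effect(cluster_size, icc), 9))


# Hypothetical inputs: 400 needed under SRS, clusters of 20, ICC of 0.05.
print(round(design_effect(20, 0.05), 2))
print(inflated_sample_size(400, 20, 0.05))
```

The more alike the members of a cluster (higher ICC) and the larger the clusters, the more units are needed to reach the same precision.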
![Page 57: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/57.jpg)
Selecting a sampling method
• Population to be studied
– Size/geographical distribution
– Heterogeneity with respect to variable
• Availability of list of sampling units
• Level of precision required
• Resources available
![Page 58: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/58.jpg)
Conclusions
• Probability samples are the best
• Ensure
– Validity
– Precision
• …..within available constraints
![Page 59: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/59.jpg)
Conclusions
• If in doubt…
Call a statistician !!!!
![Page 60: Sampling Techniques for epidemiological studies Biagio Pedalino.](https://reader036.fdocuments.us/reader036/viewer/2022081507/55152d505503465e608b5853/html5/thumbnails/60.jpg)
Questions?