The Information School of the University of Washington LIS 470 Data & Sampling LIS 570 Session 4.1...
Date posted: 21-Dec-2015
LIS 470 Data & Sampling
The Information School of the University of Washington
LIS 570
Session 4.1
[Many of the slides and graphics adapted from Harry Bruce’s Spring 2005 class]
LIS 570_Data & Sampling Mason; p. 2
Objectives
• Understand the options in, and goals of, sampling techniques
• Reinforce knowledge of vocabulary and basic principles
Agenda
• Warm-up exercise: review of principles
• Discussion of sampling goals and methods
• Hypothetical research exercise
What Happened in 1997?
FSU MIS Graduates

Graduating Class Year   Est. Average 1st Year Earnings   Projected Est. Average Total 5 Year Earnings
1994                    $28,100                          $154,550
1995                    $29,200                          $160,600
1996                    $30,400                          $167,200
1997                    $50,500                          $339,800
Possible Explanations
• Beginning of dot com boom
• Beginning of Y2K fears and staffing frenzy
• Other…?
• Peter Boulware first round NFL pick
– Overall no. 4 pick by Baltimore Ravens
– $800,000 1st year salary, $1M signing bonus
– $17M total 5 year package
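The effect of a single extreme earner on an average can be sketched in a few lines of Python. The salary figures below are hypothetical and chosen only for illustration (they are not the FSU data); the $800,000 value mirrors the first-year salary mentioned above.

```python
from statistics import mean, median

# Hypothetical first-year salaries for a small graduating class (illustrative only).
typical_salaries = [30_000, 31_000, 32_000, 33_000, 34_000]

# The same class with one extreme earner added.
with_outlier = typical_salaries + [800_000]

# The mean is pulled far upward by the single outlier; the median barely moves.
print("mean without outlier:  ", mean(typical_salaries))
print("median without outlier:", median(typical_salaries))
print("mean with outlier:     ", mean(with_outlier))
print("median with outlier:   ", median(with_outlier))
```

Comparing the mean with the median is one simple way to spot this kind of distortion in a reported average.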
Summary
• Sampling - the process of selecting observations
– random; non-random
– probability; non-probability
You don’t have to eat the whole ox to know that the meat is tough
Aim
• A representative sample: a sample which accurately reflects its population
• Avoid (unconscious) bias
Basic terminology
• Population [universe] - the entire group of objects about which information is wanted
• Unit [element] - any individual member of the population
• Sample - a part or subset of the population used to gain information about the whole
• Sampling frame - the list of units [subset of the universe] from which the sample is chosen
• Variable - a characteristic of a unit, to be measured for those units in the sample
Step 1: Identify the Population
The units of the population about whom or which you want to know
• Define the population concretely; no ambiguity
Example: “Adult Residents of Seattle”
– How is “adult” defined?
– What is the exact boundary of Seattle?
– As of what date?
– Can the population be identified completely? (e.g., are the homeless included as “residents”?)
Step 2: Decide on a Census or a Sample
• Census
– Observe each unit
– An “attempt” to sample the entire population
– Not foolproof (example: issues of US census)
• Sample: observe a sub-group of the population
Step 3: Decide on a Sampling Approach
Random (Probability) Sampling
• Each unit (element) has the same chance (probability) of being in the sample
• Chance or luck of the draw determines who is in the sample (random)
Random samples
• Each unit has a known probability or chance of being included in the sample
• An objective way of selecting units
• Random sampling is not haphazard or unplanned sampling
Types of random sampling
• Simple random sample
• Systematic sampling
• Stratified sampling
• Cluster sampling
How to choose
• The nature of the research problem
• Availability of a sampling frame
• Money
• Desired level of accuracy
• Data collection method
Simple random samples
• Obtain a complete sampling frame
• Give each case a unique number starting with one
• Decide on the required sample size
• Select that many numbers from a table of random numbers
• Select the cases which correspond to the randomly chosen numbers
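The steps above can be sketched in Python, with a pseudo-random number generator standing in for the table of random numbers. The function name `simple_random_sample`, the `seed` parameter, and the household frame are assumptions made for this illustration.

```python
import random

def simple_random_sample(frame, sample_size, seed=None):
    """Draw a simple random sample (without replacement) from a sampling frame."""
    if sample_size > len(frame):
        raise ValueError("sample size cannot exceed the size of the sampling frame")
    rng = random.Random(seed)  # stands in for a table of random numbers
    # Give each case a unique number starting with one.
    case_numbers = range(1, len(frame) + 1)
    # Select as many distinct numbers as the required sample size.
    chosen_numbers = rng.sample(case_numbers, sample_size)
    # Select the cases that correspond to the chosen numbers.
    return [frame[number - 1] for number in chosen_numbers]

# Hypothetical sampling frame of 100 households.
frame = [f"household_{i}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```

Fixing the seed makes the draw repeatable, which helps when a sample has to be documented or audited.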
Systematic sampling
• Sampling fraction: divide the desired sample size by the population size (its inverse, population size divided by sample size, is the sampling interval)
• Select from the sampling frame according to the sampling fraction; e.g., a sampling fraction of 1/5 means that we select one element for every five in the population
• Must decide where to start
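A minimal sketch of systematic selection, assuming the starting point is chosen at random within the first interval (one common convention; the slides only say that a start must be decided). The function name and the numbered frame are hypothetical.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Select every k-th unit from the frame, starting at a random offset."""
    if not 0 < sample_size <= len(frame):
        raise ValueError("sample size must be between 1 and the frame size")
    # Sampling interval k: population size divided by sample size (rounded down).
    interval = len(frame) // sample_size
    rng = random.Random(seed)
    # Decide where to start: a random position within the first interval.
    start = rng.randrange(interval)
    return [frame[start + i * interval] for i in range(sample_size)]

# Hypothetical frame of 100 numbered units; a sample of 20 gives a fraction of 1/5.
frame = list(range(1, 101))
sample = systematic_sample(frame, 20, seed=7)
print(sample)
```

If the frame is ordered in a pattern that repeats with the same period as the interval, a systematic sample can be biased, so the ordering of the frame is worth checking first.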
Stratified sampling
• Premise: if a sample is to be representative, then proportions for various groups in the sample should be the same as in the population
• Stratifying variable: characteristic on which we want to ensure correct representation in the sample
– Order sampling frame into groups
– Use systematic sampling to select the appropriate proportion of people from each stratum
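Proportional stratified sampling can be sketched as follows. Note one substitution: for brevity this sketch draws a simple random sample within each stratum rather than the systematic selection named above. The function name, the `stratum_of` callback, and the student frame are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, sample_size, seed=None):
    """Proportional stratified sample: each stratum keeps its population share."""
    rng = random.Random(seed)
    # Order the sampling frame into groups by the stratifying variable.
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for units in strata.values():
        # Allocate in proportion to the stratum's size; rounding means the
        # total can differ slightly from sample_size for uneven strata.
        quota = round(len(units) * sample_size / len(frame))
        # Assumption: simple random draw within the stratum (not systematic).
        sample.extend(rng.sample(units, quota))
    return sample

# Hypothetical frame: 60 undergraduates and 40 graduate students.
frame = [("undergrad", i) for i in range(60)] + [("grad", i) for i in range(40)]
sample = stratified_sample(frame, lambda unit: unit[0], 10, seed=3)
print(sample)
```

With these numbers the sample holds 6 undergraduates and 4 graduate students, matching the 60/40 split in the frame.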
Cluster sampling
Involves drawing several different samples
– draw a sample of areas
– start with large areas then progressively sample smaller areas within the larger
Example: city population
• Divide city into districts - select a simple random sample (SRS) of districts
• Divide sample of districts into blocks - select an SRS of blocks
• Draw list of households in each block - select an SRS of households
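The three-stage city example can be sketched like this. The nested `city` dictionary, the function name, and all sizes are hypothetical and stand in for real district, block, and household lists.

```python
import random

def multistage_cluster_sample(city, n_districts, n_blocks, n_households, seed=None):
    """Three-stage cluster sample: districts, then blocks, then households."""
    rng = random.Random(seed)
    households = []
    # Stage 1: simple random sample of districts.
    for district in rng.sample(sorted(city), n_districts):
        blocks = city[district]
        # Stage 2: simple random sample of blocks within each chosen district.
        for block in rng.sample(sorted(blocks), n_blocks):
            # Stage 3: simple random sample of households within each chosen block.
            households.extend(rng.sample(blocks[block], n_households))
    return households

# Hypothetical city: 4 districts, 5 blocks per district, 20 households per block.
city = {
    f"district_{d}": {
        f"block_{d}_{b}": [f"hh_{d}_{b}_{h}" for h in range(20)]
        for b in range(5)
    }
    for d in range(4)
}
sample = multistage_cluster_sample(city, n_districts=2, n_blocks=2, n_households=5, seed=1)
print(len(sample), sample[:3])
```

Only the chosen blocks need a household list, which is why this design is useful when no complete sampling frame exists for the whole city.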
Random Samples
Advantages
– Ability to generalise from sample to population using statistical techniques (inferential statistics)
– High probability that the sample is generally representative of the population on variables of interest
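As one illustration of generalising from a random sample, the sketch below estimates a population mean with an approximate 95% confidence interval. The population is simulated, and the normal approximation (the 1.96 multiplier) is an assumption that suits reasonably large samples; neither comes from the slides.

```python
import math
import random
from statistics import mean, stdev

rng = random.Random(0)

# Hypothetical population of 10,000 measurements (simulated for illustration).
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Simple random sample of 200 units.
sample = rng.sample(population, 200)

sample_mean = mean(sample)
# Standard error of the mean: sample standard deviation over sqrt(n).
standard_error = stdev(sample) / math.sqrt(len(sample))
# Approximate 95% confidence interval using the normal approximation.
low = sample_mean - 1.96 * standard_error
high = sample_mean + 1.96 * standard_error

print(f"sample mean: {sample_mean:.2f}")
print(f"approx. 95% CI: ({low:.2f}, {high:.2f})")
print(f"population mean: {mean(population):.2f}")
```

An interval like this is only justified when the sample was drawn by a probability method; it does not carry over to non-random samples.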
Non-random Samples
• Purposive
• Quota
• Accidental
• Generalizability based on “argument”
– Replication
– Sample “like” the population
Selecting a sampling method
• Depends on the population
• Problem and aims of the research
• Existence of sampling frame
Conclusion
• The purpose of sampling is to select a set of elements from the population in such a way that what we learn about the sample can be generalised to the population from which it was selected
• The sampling method used determines the generalizability of findings
Random samples Non-random sample