GRTS for the Average Joe: A GRTS Sampler for Windows
WEST, Inc.
GRTS for the Average Joe: A GRTS Sampler for Windows
Trent McDonald
Monitoring Science Symposium Denver, CO
21-24 Sep 2004
Outline
• Motivation for the GRTS sampler
• Description of the sampler, S-Draw
• Examples
• Performance
• Planned modifications
Motivation
• Basic hypotheses:
  – Average Joe understands the utility of GRTS samples
  – Average Joe does not totally understand the inner workings of GRTS sampling
  – Average Joe could not draw a GRTS sample if his life depended on it.
Motivation
• A GRTS sampler was needed because:
  – Average Joe should be able to draw GRTS samples
  – I should be able to draw GRTS samples
S-Draw
• Windows application
• Written in Fortran 95
  – Amazing speed
  – Cross-platform portability ok
  – Cross-language calls easy (S-Plus, R, C++)
• Used Lahey compiler
• Also S-DrawB
S-Draw
• Draws samples of
  – Discrete units (finite populations)
  – Located in either 1-D or 2-D
• Examples:
  – 1-D: River segments located by river mile
  – 2-D: Grid cells located in an area
  – 2-D: River segments located by coordinates of their midpoints
S-Draw
• Coordinates of units are specified in a text file
  – i.e., the sampling frame is an ASCII file
• Sampling frame can optionally contain weights and IDs
S-Draw
• Frame formats:

Sample structure  Coordinates  Weights  IDs   Order of fields in frame file
1-D               Yes          Yes      Yes   x wgt id
1-D               Yes          Yes      No    x wgt
1-D               Yes          No       Yes   x id
1-D               Yes          No       No    x
1-D               No           Yes      Yes   wgt id
1-D               No           Yes      No    wgt
1-D               No           No       Yes   id
1-D               No           No       No    [# lines counted]
2-D               Yes          Yes      Yes   x y wgt id
2-D               Yes          Yes      No    x y wgt
2-D               Yes          No       Yes   x y id
2-D               Yes          No       No    x y
Pre-defined*      -            Yes      Yes   k1 … kK wgt id
Pre-defined*      -            Yes      No    k1 … kK wgt
Pre-defined*      -            No       Yes   k1 … kK id
Pre-defined*      -            No       No    k1 … kK

*K specified on the first line of the frame file
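To make the frame layouts concrete, here is a minimal Python sketch of reading a whitespace-delimited frame whose leading columns follow one of the field orders listed for the frame formats. The helper name `parse_frame` and its `fields` argument are hypothetical; this is not S-Draw's own reader, and it assumes the lines passed in contain data only (no header row). Columns beyond the named fields are skipped.

```python
def parse_frame(lines, fields):
    """Parse frame lines into dicts, one per sampling unit.

    fields lists the leading columns in order, e.g. ["x", "wgt", "id"]
    or ["x", "y", "id"]. Any further columns on a line are ignored.
    This is an illustrative helper, not part of S-Draw.
    """
    units = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        if len(tokens) < len(fields):
            raise ValueError("too few columns in frame line: " + line)
        unit = {}
        for name, token in zip(fields, tokens):
            # IDs are kept as text; coordinates and weights are numeric.
            unit[name] = token if name == "id" else float(token)
        units.append(unit)
    return units


# Two rows shaped like the slides' eagle frame, read as a 2-D frame
# with coordinates and IDs (field order "x y id").
rows = parse_frame(
    ["-421014 4572670 1 439 -420245 4572670",
     "-421295 4574670 2 440 -419802 4574670"],
    ["x", "y", "id"],
)
```

When no coordinates are supplied in a 1-D frame, a caller could fall back to the line number as the unit's position, which is one reading of the "[# lines counted]" entry.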
S-Draw
• Example frame: Eagle study
STARTPNT_X STARTPNT_Y LINE_ID ID ENDPNT_X ENDPNT_Y
-421014 4572670 1 439 -420245 4572670
-421295 4574670 2 440 -419802 4574670
-421245 4576670 3 441 -419802 4576670
-421196 4578670 4 442 -419802 4578670
-421292 4580670 5 443 -419802 4580670
-421687 4582670 6 444 -419802 4582670
-422806 4584670 7 445 -419802 4584670
-423122 4586670 8 446 -419802 4586670
-422972 4588670 9 447 -419802 4588670
-422836 4590670 10 448 -419802 4590670
-422352 4592670 11 449 -419802 4592670
-421284 4594670 12 450 -419802 4594670
Columns following frame data ignored
S-Draw
• Does the quadrant-recursive mapping of Stevens and Olsen (2004):
[Figure: nested quadrants numbered 1-4 at each level of the recursion, mapped onto the line from 0 to n]
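The quadrant-recursive mapping can be sketched in Python as follows. This is an illustrative sketch, not S-Draw's Fortran code: it assumes coordinates have been rescaled to the unit square, numbers quadrants 0-3 rather than 1-4, and draws an independent random permutation of the quadrant labels inside every parent cell. The function names, the `perms` cache, and the `levels` argument are all hypothetical.

```python
import random


def quadrant_address(x, y, levels, perms, rng):
    """Return the randomized base-4 address of (x, y) in the unit square.

    perms caches one random permutation of the labels 0-3 per parent cell,
    so every point in the same cell sees the same labelling.
    """
    digits = []
    x0, y0, size = 0.0, 0.0, 1.0
    for _ in range(levels):
        size /= 2.0
        qx = 1 if x >= x0 + size else 0  # right half of the current cell?
        qy = 1 if y >= y0 + size else 0  # upper half of the current cell?
        raw = 2 * qy + qx                # unrandomized quadrant number
        prefix = tuple(digits)           # identifies the parent cell
        if prefix not in perms:
            labels = [0, 1, 2, 3]
            rng.shuffle(labels)
            perms[prefix] = labels
        digits.append(perms[prefix][raw])
        x0 += qx * size                  # descend into the chosen quadrant
        y0 += qy * size
    return tuple(digits)


def address_to_position(digits):
    """Read the address as a base-4 fraction, giving a position in [0, 1)."""
    return sum(d / 4 ** (k + 1) for k, d in enumerate(digits))


rng = random.Random(1)
perms = {}
addr = quadrant_address(0.1, 0.1, 3, perms, rng)
pos = address_to_position(addr)
```

Sorting units by `address_to_position` places them along a line so that units close together in 2-D tend to stay close together on the line, which is what lets a 1-D systematic sample spread out spatially.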
S-Draw
• Pixelsize = size of smallest quadrant in recursive map
• S-Draw allows user to specify pixelsize
S-Draw
• Line segment (0,n] sampled using a systematic sample
  – Random start between (0,1]
  – Step size = 1.0
[Figure: line from 0 to n with systematic sample points u1, u2, u3, u4; random start = 0.19]
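A minimal Python sketch of the systematic step, under stated assumptions: units are already in their mapped order along the line, each unit occupies a segment whose length is its weight rescaled so the segments sum to n, and no rescaled length exceeds 1 (otherwise a unit could be hit twice). The function name `systematic_sample` is hypothetical and this is not S-Draw's code.

```python
import bisect
import random
from itertools import accumulate


def systematic_sample(weights, n, rng):
    """Return indices of n units chosen by a systematic sample on (0, n]."""
    total = sum(weights)
    lengths = [n * w / total for w in weights]   # segment lengths sum to n
    upper = list(accumulate(lengths))            # right end of each segment
    start = 1.0 - rng.random()                   # uniform on (0, 1]
    picks = []
    for k in range(n):
        u = start + k                            # step size = 1.0
        # First segment whose right end is >= u, i.e. u lies in
        # (upper[i-1], upper[i]].
        i = bisect.bisect_left(upper, u)
        picks.append(min(i, len(upper) - 1))     # guard against float round-off
    return picks


picks = systematic_sample([1.0] * 100, 20, random.Random(0))
```

With equal weights and 100 units, each step of 1.0 spans five units, so the 20 picks fall roughly every fifth unit along the line.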
S-Draw
• Reverse-hierarchical ordering of sample optionally applied
  – Convert sample order to base-4: 96₁₀ = 01200₄
  – Reverse base-4 digits: 01200₄ → 00210₄
  – Convert back to base-10: 00210₄ = 36₁₀
  – Sort sample on base-10 numbers
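The reverse-hierarchical steps can be sketched in Python as below. This is an illustrative sketch with hypothetical function names; it assumes sample positions are numbered from 0 and that every position is padded to the same number of base-4 digits. As a check on the arithmetic, 96 is 01200 in base 4, which reverses to 00210, i.e. 36.

```python
def reverse_base4(index, ndigits):
    """Reverse the ndigits-long base-4 representation of index."""
    digits = []
    for _ in range(ndigits):
        digits.append(index % 4)   # least-significant digit first
        index //= 4
    value = 0
    for d in digits:               # reading that list most-significant first
        value = value * 4 + d      # yields the digit-reversed number
    return value


def reverse_hierarchical_order(sample):
    """Reorder a sample by the digit-reversed base-4 value of each position."""
    ndigits = 1
    while 4 ** ndigits < len(sample):
        ndigits += 1
    keys = [reverse_base4(i, ndigits) for i in range(len(sample))]
    ordered = sorted(zip(keys, sample), key=lambda pair: pair[0])
    return [unit for _, unit in ordered]


example = reverse_base4(96, 5)     # 36
```

After reordering, any leading run of the sample is itself spread across the top-level cells, which is why the first units can serve as the primary sample and later ones as alternates.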
S-Draw
• Users can pre-define the hierarchical sort keys
• Digits within each level of the hierarchy are randomly permuted, and sample is drawn as usual
• Allows use of a general recursive map
  – Triangular-recursive
  – IDs like: state.county.watershed.segment
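One way to picture pre-defined keys is the Python sketch below. It is a sketch under assumptions, not S-Draw's algorithm: keys are dot-separated strings, the labels under each parent are given an independent random order (the slides do not say whether the permutation is per parent or per level), and the function name and the example IDs are made up for illustration.

```python
import random


def randomized_hierarchical_order(ids, rng, sep="."):
    """Order units by hierarchical keys after randomly permuting the
    labels found under each parent in the hierarchy."""
    keys = [tuple(unit_id.split(sep)) for unit_id in ids]

    # Collect the distinct child labels under every parent prefix.
    children = {}
    for key in keys:
        for depth in range(len(key)):
            children.setdefault(key[:depth], set()).add(key[depth])

    # Give each label a random rank among its siblings.
    rank = {}
    for prefix, labels in children.items():
        shuffled = sorted(labels)
        rng.shuffle(shuffled)
        for position, label in enumerate(shuffled):
            rank[(prefix, label)] = position

    def sort_key(key):
        return tuple(rank[(key[:d], key[d])] for d in range(len(key)))

    return [sep.join(key) for key in sorted(keys, key=sort_key)]


# Hypothetical IDs of the form state.county.segment
ids = ["CO.1.a", "CO.1.b", "CO.2.a", "WY.1.a", "WY.1.b"]
ordered = randomized_hierarchical_order(ids, random.Random(4))
```

Units that share a prefix stay contiguous in the result, so a systematic sample along this ordering is spread across the branches of the hierarchy.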
S-Draw
• Triangular-recursive mapping:
[Figure: triangular cells numbered 1-4]
Examples
• C:\>s-drawb -n 20 -popsize 100
  – Will produce a 1-D GRTS sample of size 20, assuming units are located at coordinates 1, 2, …, 100
• C:\>s-drawb -n 20 -popsize 100 -pixelsize 100
  – Will produce a simple random sample of size 20
• C:\>s-drawb -n 20 -popsize 100 -nrand
  – Will produce a fixed-size systematic sample
Examples
• Golden eagle sample:
  – Dense grid of transect start points spaced 2 km north-south, 100 km east-west
  – “No-fly” transect portions eliminated, new transect start created
  – Frame:
    • 27,078 starting points over western US
    • 2-D coordinates and IDs
  – Desire sample of 416 transects
    • 208 primary, 208 alternate
Examples
[Screenshot of S-Draw with callouts: 2-D; coordinates and ID in frame; sample size; frame file]
Performance
• GRTS sample of size 500 from 100,000 took 4.2 seconds on my laptop
• GRTS sample of size 500 from 1,000,000 took 44.3 seconds
• Algorithms approximately O(N)
• Runs should take ~4.45e-5 × N seconds
  – N=5,000,000: ~3.7 minutes
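The linear runtime estimate is easy to check with a few lines of Python. The constant comes from the slide; the function name is hypothetical, and the estimate only applies to hardware comparable to the laptop used for the timings.

```python
SECONDS_PER_UNIT = 4.45e-5  # per-unit cost quoted on the slide


def estimated_runtime_seconds(frame_size):
    """Linear (O(N)) runtime estimate for a frame of frame_size units."""
    return SECONDS_PER_UNIT * frame_size


seconds = estimated_runtime_seconds(5_000_000)  # about 222.5 seconds
minutes = seconds / 60                          # about 3.7 minutes
```

The same formula gives about 4.45 seconds for N = 100,000 and 44.5 seconds for N = 1,000,000, close to the measured 4.2 and 44.3 seconds.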
Enhancements
• An S-Plus and R interface
• Ability to read .e00 files and ArcGIS binary files
• Ability to take a true point sample