Compression and Analysis of Very Large Imagery Data Sets Using Spatial Statistics James A. Shine George Mason University and US Army Topographic Engineering Center Interface 2001 June 16, 2001


Page 1: Compression and Analysis of Very Large Imagery Data Sets Using Spatial Statistics James A. Shine George Mason University and US Army Topographic Engineering.

Compression and Analysis of Very Large Imagery Data Sets

Using Spatial Statistics

James A. Shine

George Mason University and

US Army Topographic Engineering Center

Interface 2001

June 16, 2001

Page 2:

ACKNOWLEDGMENTS

Dr. Margaret Oliver, University of Reading, UK

Dr. Richard Webster, Rothamsted Laboratory, UK

Dr. Daniel Carr, George Mason University

Page 3:

INTRODUCTION

Greater resolution in imagery data sets:

pixel resolution (1 meter; 3 x 10^6 data points/square mile)

more bands (up to 256 in hyperspectral sensors; +10^2)

more imagery over time

Compression becomes an important part of timely analysis.

How far can image be compressed before information is lost?

Page 4:

PROFESSIONAL MOTIVATION:

Collecting imagery, climatic and other topographic data

Transforming the data into maps, surfaces, and other topographic products

Determination of sampling intervals using spatial statistics is an important tool for many of our applications:

collecting ground truth

choosing training points for classification

Page 5:

DATA SETS

Page 6:

CAMIS Data Collection

Computerized Airborne Multicamera Imaging System

Four-band sensor flown in Lear jet (blue, green, red, near infrared)

Each data frame 768x576 pixels

Each flight line has 30 frames

Each collect uses 10-15 flight lines

Order of 10^7 data points per collect

Page 7:

Data Preprocessing

Considerable overlap in flight lines

Bands registered to each other first

Overlap removed, forming mosaic

Radiometric correction

Map registration

Page 8:

Ft. Story, VA

Ft. A.P. Hill, VA

Page 9:

SPATIAL STATISTICS

Much spatial data (such as imagery) is spatially correlated; values at points close together differ less (lower variance) than values at points farther apart.

Variance can be divided into a stochastic component (background noise) and a spatial component.

The variance can be modeled by plotting it against the distance between points (the variogram), and the model can then be used in many applications.

Page 10:

STOCHASTIC AND SPATIAL VARIATION

STOCHASTIC VARIATION IS LOCAL, BACKGROUND NOISE (NUGGET EFFECT)

SPATIAL VARIATION IS GLOBAL (SILL AND RANGE)

THE SCALE OF SPATIAL VARIATION IS ESPECIALLY IMPORTANT

VARIOGRAMS DEMONSTRATE THESE TWO VARIATIONS

Page 11:

HOW TO COMPUTE A VARIOGRAM

We have sample locations x1, x2, … and values z at each location. The semivariance for a given distance h is:

$$\gamma(h) = \frac{1}{2\,n(h)} \sum_{i=1}^{n(h)} \left[ z(x_i + h) - z(x_i) \right]^2$$

where n(h) is the number of pairs of points a distance h apart. The semivariance is then plotted against h as shown on the next slide.
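As a minimal sketch of the semivariance computation on this slide (half the mean squared difference between values a distance h apart), the following Python function works on a regular pixel grid and pairs points only along rows or along columns. The function name, the `axis` argument, the toy grid, and the choice to average the two directions with a simple mean are illustrative assumptions; this is not the FORTRAN program used for the results in these slides.

```python
def directional_semivariance(image, max_lag, axis):
    """Empirical semivariance of a 2-D grid along one axis.

    image   : list of equal-length rows of numbers
    max_lag : largest pixel separation h to evaluate
    axis    : "rows" pairs pixels within a row (E-W);
              "columns" pairs pixels within a column (N-S)
    Returns {h: gamma(h)} for h = 1..max_lag.
    """
    n_rows, n_cols = len(image), len(image[0])
    gamma = {}
    for h in range(1, max_lag + 1):
        total, pairs = 0.0, 0
        if axis == "rows":
            for r in range(n_rows):
                for c in range(n_cols - h):
                    d = image[r][c + h] - image[r][c]
                    total += d * d
                    pairs += 1
        elif axis == "columns":
            for r in range(n_rows - h):
                for c in range(n_cols):
                    d = image[r + h][c] - image[r][c]
                    total += d * d
                    pairs += 1
        else:
            raise ValueError("axis must be 'rows' or 'columns'")
        if pairs:
            # gamma(h) = sum of squared differences / (2 * n(h))
            gamma[h] = total / (2.0 * pairs)
    return gamma


# Tiny illustrative grid (made-up values, not CAMIS data).
grid = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [3, 4, 5, 6],
]
rows_gamma = directional_semivariance(grid, max_lag=2, axis="rows")
cols_gamma = directional_semivariance(grid, max_lag=2, axis="columns")
# Assumption: the "average" curve is the simple mean of the two directions.
average_gamma = {h: (rows_gamma[h] + cols_gamma[h]) / 2 for h in rows_gamma}
```

Restricted to two directions and a fixed maximum lag, this sketch visits roughly (number of pixels) x max_lag pairs per direction.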

Page 12:

MODELING THE VARIOGRAM

Several different models are then fit to the variogram:

exponential, nested exponential

spherical, nested spherical

circular

others

The best-fitting model (minimum squared error or a similar metric) is chosen.

The model is then used to determine the scale (or scales in nested models) of variation and for interpolation and estimation.
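The model-selection step can be sketched as follows, assuming only two candidate models (spherical and exponential) and a coarse grid search that minimizes the sum of squared errors. The parameter names (`nugget`, `psill` for the spatial part of the sill, and a range-like scale), the grid values, and the synthetic data are all hypothetical; the slides do not specify the fitting procedure beyond "minimum squared error or a similar metric".

```python
import math


def spherical(h, nugget, psill, a):
    """Spherical model: rises to nugget + psill at range a, flat beyond."""
    if h >= a:
        return nugget + psill
    u = h / a
    return nugget + psill * (1.5 * u - 0.5 * u ** 3)


def exponential(h, nugget, psill, r):
    """Exponential model: approaches nugget + psill asymptotically."""
    return nugget + psill * (1.0 - math.exp(-h / r))


def fit_by_grid_search(model, lags, gammas, nuggets, psills, scales):
    """Return (sse, params) minimising squared error over a parameter grid."""
    best = None
    for n0 in nuggets:
        for c in psills:
            for a in scales:
                sse = sum((model(h, n0, c, a) - g) ** 2
                          for h, g in zip(lags, gammas))
                if best is None or sse < best[0]:
                    best = (sse, (n0, c, a))
    return best


def choose_model(lags, gammas, nuggets, psills, scales):
    """Fit each candidate model and keep the one with the smallest SSE."""
    candidates = {"spherical": spherical, "exponential": exponential}
    fits = {name: fit_by_grid_search(f, lags, gammas, nuggets, psills, scales)
            for name, f in candidates.items()}
    best_name = min(fits, key=lambda name: fits[name][0])
    return best_name, fits[best_name][1], fits


# Synthetic variogram generated from a spherical model (illustrative only).
lags = list(range(1, 31))
gammas = [spherical(h, 0.1, 0.9, 12.0) for h in lags]
name, params, fits = choose_model(
    lags, gammas,
    nuggets=[0.0, 0.1, 0.2],
    psills=[0.7, 0.8, 0.9, 1.0],
    scales=[4.0, 8.0, 12.0, 16.0],
)
```

In practice a continuous optimizer would likely replace the grid search; the grid keeps the sketch dependency-free and easy to follow.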

Page 13:

COMPARISON EXPERIMENT

Compute variogram of complete image band

Compute variograms of subsampled image band (reduced by powers of 2)

Compare the variograms, determine when curve is lost

Use this as a compression threshold
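The experiment outlined on this slide can be sketched in Python as below. Several choices here are assumptions rather than details from the slides: subsampling keeps every `step`-th pixel in both directions, only the row direction is used, and "curve is lost" is measured as the largest relative gap between the two variograms at matching ground distances. The gradient image is synthetic and serves only as a sanity check.

```python
def row_semivariance(image, max_lag):
    """Semivariance along rows: {h: gamma(h)} for h = 1..max_lag (in pixels)."""
    gamma = {}
    for h in range(1, max_lag + 1):
        sq = [(row[c + h] - row[c]) ** 2
              for row in image for c in range(len(row) - h)]
        if sq:
            gamma[h] = sum(sq) / (2.0 * len(sq))
    return gamma


def subsample(image, step):
    """Keep every `step`-th pixel in both directions (1/step**2 of the data)."""
    return [row[::step] for row in image[::step]]


def compare_at_matched_distances(image, step, max_distance):
    """Largest relative gap between full and subsampled variograms.

    A lag of k pixels in the subsampled image spans k * step original
    pixels, so the two curves are compared at those shared distances.
    """
    full = row_semivariance(image, max_distance)
    reduced = row_semivariance(subsample(image, step), max_distance // step)
    worst = 0.0
    for k, g_reduced in reduced.items():
        g_full = full[k * step]
        if g_full > 0:
            worst = max(worst, abs(g_reduced - g_full) / g_full)
    return worst


# Synthetic 32 x 32 gradient image (illustrative only, not CAMIS data).
image = [[float(r + 2 * c) for c in range(32)] for r in range(32)]
gaps = {step: compare_at_matched_distances(image, step, max_distance=16)
        for step in (2, 4, 8)}
```

A compression threshold would then be the largest reduction whose gap stays below a tolerance chosen by the analyst; the slides do not state a numeric tolerance.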

Page 14:

COMPUTING A FULL IMAGE VARIOGRAM

Data transferred from imagery to text file (ERDAS Imagine, Arc/Info)

Modified FORTRAN program

Running time: approx. 1 hour per 4 x 10^6 points

Only 2 directions (N-S and E-W)

Current algorithm O(n^2), may be reducible

Details: Shine, JSM 2000

Page 15:

Ft. Story full image variograms

Page 16:

[Figure: Ft. Story band 1 variograms in three panels (ROWS, COLUMNS, AVERAGE). x-axis: DISTANCE, 0 to 1000; y-axis: GAMMA, 0 to 5000.]

Page 17:

[Figure: Ft. Story band 2 variograms in three panels (ROWS, COLUMNS, AVERAGE). x-axis: DISTANCE, 0 to 1000; y-axis: GAMMA, 0 to 4000.]

Page 18:

[Figure: Ft. Story band 3 variograms in three panels (ROWS, COLUMNS, AVERAGE). x-axis: DISTANCE, 0 to 1000; y-axis: GAMMA, 0 to 2500.]

Page 19:

THEORETICAL VARIOGRAM MODELS

[Figure: four panels plotting gamma against h (0 to 30): NUGGET MODEL, LINEAR MODEL, SPHERICAL MODEL, EXPONENTIAL MODEL.]
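For reference, the four model families named on this slide are commonly written in the textbook forms below. The symbols are assumptions introduced here, not taken from the slides: c_0 is the nugget, c the spatially structured part of the sill, a the range of the spherical model, r the distance parameter of the exponential model, and b the slope of the linear model. The parameterization used to draw the original panels is not stated.

```latex
\begin{align*}
\text{Nugget:} \quad & \gamma(h) =
  \begin{cases} 0, & h = 0 \\ c_0, & h > 0 \end{cases} \\
\text{Linear:} \quad & \gamma(h) = c_0 + b\,h, \quad h > 0 \\
\text{Spherical:} \quad & \gamma(h) =
  \begin{cases}
    c_0 + c\left[\dfrac{3h}{2a} - \dfrac{1}{2}\left(\dfrac{h}{a}\right)^{3}\right], & 0 < h \le a \\
    c_0 + c, & h > a
  \end{cases} \\
\text{Exponential:} \quad & \gamma(h) = c_0 + c\left[1 - \exp\left(-\dfrac{h}{r}\right)\right], \quad h > 0
\end{align*}
```

The spherical model reaches its sill exactly at h = a, while the exponential model only approaches it, reaching about 95% of c at h = 3r.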

Page 20:

A NESTED VARIOGRAM MODEL

[Figure: DOUBLE EXPONENTIAL MODEL; gamma (0.5 to 2.0) plotted against distance (0 to 30), with three curves drawn in +, o, and X symbols.]
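A nested (double) exponential model is usually written as a nugget plus two exponential structures with different distance parameters, one for each scale of variation. The symbols below (c_0, c_1, c_2, r_1, r_2) are assumptions introduced for illustration; the parameter values behind the plotted curves are not given in the slides.

```latex
\[
\gamma(h) = c_0
  + c_1\left[1 - \exp\left(-\frac{h}{r_1}\right)\right]
  + c_2\left[1 - \exp\left(-\frac{h}{r_2}\right)\right],
\qquad r_1 < r_2
\]
```

Here r_1 describes the short-range structure and r_2 the long-range structure, so the fitted model reports two scales of variation.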

Page 21:

Ft. A.P. Hill full image variograms

Page 22:

BAND 1


COMPRESSION ANALYSIS

Start with full variogram

Successively reduce the sample to ¼ of its previous size

Compare resulting variograms
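The arithmetic behind the successive reductions can be written out as follows. The stride interpretation (keeping every second pixel along each axis at every step) is an assumption; the slides give only the resulting fractions. The 4 x 10^6 point count is the band size quoted earlier for the running-time estimate.

```latex
\begin{align*}
f_k &= \left(\tfrac{1}{4}\right)^{k} = 4^{-k}, \qquad
  f_1 = \tfrac{1}{4},\; f_2 = \tfrac{1}{16},\; f_3 = \tfrac{1}{64},\; f_4 = \tfrac{1}{256} \\
s_k &= 2^{k} \quad \text{(pixel stride per axis under the assumption above, so } s_4 = 16\text{)} \\
N_4 &= \frac{4 \times 10^{6}}{256} = 15{,}625 \quad \text{(points remaining from a } 4 \times 10^{6} \text{ point band)}
\end{align*}
```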

Page 27:

EXAMPLE RESULT: A.P. HILL, BAND 1

Page 28:

FULL

Page 29:

ADD 1/4

Page 30:

ADD 1/16

Page 31:

ADD 1/64

Page 32:

ADD 1/256

Page 33:

FULL (ORANGE) AND 1/256 (BLUE) IMAGES SUPERIMPOSED

Page 34:

CONCLUSIONS

Preliminary results show little degradation in variogram at 256 times reduction

This seems to indicate that an image can be compressed by a factor of ~10^2 without affecting the results of spatial statistical analysis

Computing time savings: hours to minutes

Page 35:

FUTURE WORK

Optimize variogram code

Finish tests on other Ft. A.P. Hill and Ft. Story imagery bands

Compare other available CAMIS imagery

Obtain a general rule for the compression achievable when deriving a spatial correlation model from 1-meter imagery

Perform other image analysis operations on original and compressed images and compare.