Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and...

36
Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Transcript of Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and...

Page 1: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Multiscale Analysis of Images

Gilad LermanMath 5467

(stealing slides from Gonzalez & Woods, and Efros)

Page 2: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

The Multiscale Nature of Images

Page 3: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Recall: Gaussian pre-filtering

G 1/4

G 1/8

Gaussian 1/2

Solution: filter the image, then subsample• Filter size should double for each ½ size reduction.

Page 4: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Subsampling with Gaussian pre-filtering

G 1/4 G 1/8Gaussian 1/2

Solution: filter the image, then subsample• Filter size should double for each ½ size reduction.

Page 5: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Image Pyramids

Known as a Gaussian Pyramid [Burt and Adelson, 1983]• In computer graphics, a mip map [Williams, 1983]• A precursor to wavelet transform

Page 6: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

A bar in the big images is a hair on the zebra’s nose; in smaller images, a stripe; in the smallest, the animal’s nose

Figure from David Forsyth

Page 7: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Gaussian pyramid construction

filter mask

Repeat• Filter• Subsample

Until minimum resolution reached • can specify desired number of levels (e.g., 3-level pyramid)

The whole pyramid is only 4/3 the size of the original image!

Page 8: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

What does blurring take away? (recall)

original

Page 9: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

What does blurring take away? (recall)

smoothed (5x5 Gaussian)

Page 10: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

High-Pass filter

smoothed – original

Page 11: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Band-pass filtering

Laplacian Pyramid (subband images)Created from Gaussian pyramid by subtraction

Gaussian Pyramid (low-pass images)

Page 12: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Laplacian Pyramid

How can we reconstruct (collapse) this pyramid into the original image?

Need this!

Originalimage

Page 13: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)
Page 14: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)
Page 15: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Image resampling (interpolation)

So far, we considered only power-of-two subsampling• What about arbitrary scale reduction?

• How can we increase the size of the image?

Recall how a digital image is formed

• It is a discrete point-sampling of a continuous function• If we could somehow reconstruct the original function, any

new image could be generated, at any resolution and scale

1 2 3 4 5

d = 1 in this example

Page 16: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Image resampling

So far, we considered only power-of-two subsampling• What about arbitrary scale reduction?

• How can we increase the size of the image?

Recall how a digital image is formed

• It is a discrete point-sampling of a continuous function• If we could somehow reconstruct the original function, any

new image could be generated, at any resolution and scale

1 2 3 4 5

d = 1 in this example

Page 17: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Image resampling

So what to do if we don’t know

1 2 3 4 52.5

1 d = 1 in this example

• Answer: guess an approximation• Can be done in a principled way: filtering

Image reconstruction• Convert to a continuous function

• Reconstruct by cross-correlation:

Page 18: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Resampling filtersWhat does the 2D version of this hat function look like?

Better filters give better resampled images• Bicubic is common choice

Why not use a Gaussian?

What if we don’t want whole f, but just one sample?

performs linear interpolation

(tent function) performs bilinear interpolation

Page 19: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Bilinear interpolation

Smapling at f(x,y):

Page 20: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

What happens in the frequency domain?

• Laplacian Pyramid gives rise to different frequency bands

• Alternatively can obtain similar pyramids according to frequency bands…

Page 21: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Example of Frequency-bands “pyramid”

Taken from Rajashekar and Simoncelli (http://www.cns.nyu.edu/ftp/lcv/rajashekar08a.pdf)

Page 22: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Applications (more efficient with wavelets)

•Coding

High frequency coefficients need fewer bits, so one can

collapse a compressed Laplacian pyramid•Denoising/Restoration

Setting “most” coefficients in Laplacian pyramid to zero•Image Blending

Page 23: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Denoising demonstration (motivation)

Taken from Rajashekar and Simoncelli (http://www.cns.nyu.edu/ftp/lcv/rajashekar08a.pdf)

Observation: spatial correlation in noise-free

image (and not the noisy image)

Conclusion: Noise is more evident in the

high-frequency components, than the low-

frequency components (smooth components)

Page 24: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Denoising demonstration (idea)

Taken from Rajashekar and Simoncelli (http://www.cns.nyu.edu/ftp/lcv/rajashekar08a.pdf)

Partition into sub-bands of frequencies (top: high freq., bottom: low)

y-given image (vectorized), - estimated denoised image (vectorized)

The middle graph – intensity thresholding.

No thresholding for smoothest component and keeping only higher intensities for

higher components (the filter is learned empirically)

In practice, can have blurring and not competitive with NLM, BM3d, dictionary learning

ˆ( )x y

Page 25: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Application: Pyramid Blending

0

1

0

1

0

1

Left pyramid Right pyramidblend

Page 26: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Image Blending

Page 27: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Feathering

01

01

+

=

Encoding transparency

I(x,y) = (R, G, B, )

Iblend = left Ileft + right Iright

Ileft Iright

leftright

Page 28: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Affect of Window Size

0

1 left

right0

1

Page 29: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Affect of Window Size

0

1

0

1

Page 30: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Good Window Size

0

1

“Optimal” Window: smooth but not ghostedBurt and Adelson (83): Choose by pyramids…

Page 31: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Pyramid Blending

Page 32: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Pyramid Blending (Color)

Page 33: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Laplacian Pyramid: Blending

General Approach:1. Build Laplacian pyramids LA and LB from images A and B

2. Build a Gaussian pyramid GR from selected region R (black white corresponding images)

3. Form a combined pyramid LS from LA and LB using nodes of GR as weights:• LS(i,j) = GR(I,j,)*LA(I,j) + (1-GR(I,j))*LB(I,j)

4. Collapse the LS pyramid to get the final blended image

Page 34: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

laplacianlevel

4

laplacianlevel

2

laplacianlevel

0

left pyramid right pyramid blended pyramid

Page 35: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Blending Regions

Page 36: Multiscale Analysis of Images Gilad Lerman Math 5467 (stealing slides from Gonzalez & Woods, and Efros)

Simplification: Two-band Blending

Brown & Lowe, 2003• Only use two bands: high freq. and low freq.• Blends low freq. smoothly• Blend high freq. with no smoothing: use binary mask

Can be explored in a project…