Signals in Robot Control: Two Primary Approachesgrupen/603/slides/... · v = fy z u = z fx pinhole...

41
Signals in Robot Control: Two Primary Approaches I. Percept Inversion Stimulus = f (World) World = f 1 (S ) m -1 m geometric reconstruction provides a representation for planning and deliberation frame problem - completeness issues the functions, f (), are only partially known, and are generally difficult to invert time spent “perceiving” often renders world models obsolete 1 Copyright c 2014 Roderic Grupen

Transcript of Signals in Robot Control: Two Primary Approachesgrupen/603/slides/... · v = fy z u = z fx pinhole...

Signals in Robot Control:Two Primary Approaches

I. Percept Inversion

Stimulus = f (World) World = f−1(S)

m −1m

• geometric reconstruction provides a representation forplanning and deliberation

• frame problem - completeness issues

• the functions, f (), are only partially known, and aregenerally difficult to invert

• time spent “perceiving” often renders world modelsobsolete

1 Copyright c©2014 Roderic Grupen

Signals in Robot Control:Two Primary Approaches

II. Behavior Based Perception

2M

M 6

M 5

M 4

M 3

M 1

...f61

f6N

...f

f51

5N

...f

f41

4N

...f

f31

3N

...f

f21

2N

...f

f11

1N

• “the world is its own best representation”

• correlation over multi-sensor time-series feedback

• interaction - actions influence perceptions

• task-specific solutions to the frame problem

2 Copyright c©2014 Roderic Grupen

Vision - Light and Matter

• Energy from a light source is radiated uniformly over 4π stera-dians → 1/R2 energy distribution from a point source

• the sun delivers about 1200Watts/m2 at noon on the equator

• at optical interfaces

– transmittion (refraction)

– reflection -diffuse or specular depending on surface prop-erties and wavelength.

– absorption - heat

– heat - black body radiation

• each encounter changes the properties of light - spectral con-tent, intensity, and polarization

...light signals carry a tremendous amount of information...

3 Copyright c©2014 Roderic Grupen

Vision - Pinhole Camera Revisited

v =fyz

zu =fx

pinholeaperture

planeimage

planeimage

pinholeaperture

focallength

focallength

v

z

u

y

xu

v

zy

x

• 5th century BC - Chinese philosopher Mo-Ti created an in-verted image by passing light through a pinhole into a darkened“collecting place.”

• ca. 350 BC - Aristotle viewed the eclipsing sun projected onthe ground through the holes in a sieve

• 10th century - Arabian scholar Alhazen of Basra created aportable solar observatory

• 15th - notebooks of Leonardo Da Vinci

• 17th century - Johannes Kepler - “camera obscura” (Latin for“room” and “dark,” respectively)

4 Copyright c©2014 Roderic Grupen

Vision - Pinhole Camera Revisited

v =fyz

zu =fx

pinholeaperture

planeimage

planeimage

pinholeaperture

focallength

focallength

v

z

u

y

xu

v

zy

x

• (1558) Giovanni Battista Della Porta “Magiae Naturalis” rec-ommended the camera obscura as a drawing aid for artists...

• 18th century artists used the camera obscura - Jan Vermeer,Canaletto, Guardi, and Paul Sandby

• 19th century - a sheet of light sensitive paper transforms thecamera obscura into the modern photographic camera.

What’s wrong with the pinhole camera?

5 Copyright c©2014 Roderic Grupen

Optics and Image Acquisition

optics

imageplane

source

...the radiant intensity function is projected onto a 2D im-age plane, sampled spatially, and digitized 30 times eachsecond...

6 Copyright c©2014 Roderic Grupen

Optics and Refraction - Gathering Light

Snell’s Law

incident

transmitted

reflected

Index of Refraction — theratio of the speed of light in avacuum to that in the optical ma-terial.

n =c

v=

ǫµ

ǫ0µ0

where

µ − magnetic permeability, and

ǫ − electric permittivity.

sin(θincident)

sin(θtransmitted)=

nt

ni,

7 Copyright c©2014 Roderic Grupen

Optics - Gaussian Lens Formula

Biconvex Thin Lens

1

S1+

1

S2=

1

f

f

2SS1

f

8 Copyright c©2014 Roderic Grupen

The Eye

(a) (b) (c) (d) (e)

cornea

iris

retinalens

pupil

suspensory ligaments

ciliarymuscle

aqueous humorvitreous

humor

visualaxis

blindspot

fovea

nasaltemple retina

80O

60O

40O

20O20

O

40O

60O

80O

opticnerve

(f)

superior oblique

superior rectus

inferiorrectus

inferioroblique

lateralrectus

levatorpalpebraesuperioris

superior oblique

superior rectus

medialrectus

inferiorrectus

inferioroblique

lateralrectus

superior rectus

inferioroblique

inferiorrectus

medialrectus

superior oblique

lateralrectus

9 Copyright c©2014 Roderic Grupen

Natural Variation

bees, fish, butterflies, birds and reptiles - capable of see-ing color—most mammals do not.

Herbivores - Side facing (monocular) vision systems yield al-most wrap around field of view.

Carnivores - forward facing (stereo) system provides precisiondepth and a narrower field of view (< 180 degrees)

Cheetah - wide, eccentric foveal region spanning horizontal bandfor locating prey against the horizon

Chamelion - turret eyes capable of both side- and forward-lookingconfigurations

Nocturnal - reflective back surface of retina - small birds sacrificemuscles for size. The largest eye belongs to the giant squid—upto 15 inches in diameter.

Fishing - fishing birds use polarizing lens

Rattlesnake - eyes oriented to side, forward-looking stereo pitorgans (no lens).

10 Copyright c©2014 Roderic Grupen

Where is the Information?

2M

M 6

M 5

M 4

M 3

M 1

...f61

f6N

...f

f51

5N

...f

f41

4N

...f

f31

3N

...f

f21

2N

...f

f11

1N

Information (the opposite of noise) resides in signalsthat exhibit reliable spatiotemporal structure

andyou can create spurious artifacts on signals

everytime you “touch” them...

11 Copyright c©2014 Roderic Grupen

Sampling - Spectral Propertiesof the Signal

Dirac delta operator:

δ(x− ξ) =

{

∞ x = ξ0 otherwise

This function has the following properties:

ǫ∫

−ǫ

δ(x)dx = 1 for ǫ > 0

∞∫

−∞

F (ξ)δ(x− ξ)dξ = F (x) Sifting Property

12 Copyright c©2014 Roderic Grupen

Sampling - Spectral Propertiesof the Signal

Fourier Transform

F [f (x)] = F (u) =

∞∫

−∞

f (x)exp{−i(2π(ux)}dx

where u is the spatial frequency (in cycles/meter), so that whenx is specified in meters, (2πux) is in radians, and i =

√−1.

F−1[F (u)] = f (x) =

∞∫

−∞

F (u)exp{i(2π(ux))}du

13 Copyright c©2014 Roderic Grupen

Fourier Transform Pairs

F (ω) = F(f(x)) =

∫ ∞

−∞f(x)e−iωxdx ω(rad/pixel) = 2πu(cycles/pixel)

Name f(x) F (ω)

rectangular function rect(x) = 1 − 12 < x < 1

2 sinc(ω/2π) = sin(ω/2)ω/2

triangular function tri(x) = 2(x+ 12) − 1

2 < x < 0 sinc2(ω/2π)1− 2(x) 0 < x < 1

2

Gaussian e−α|x| 2α/(α2 + ω2)

e−px2 1√2pe−ω2/4p

unit impulse δ(x) 1

comb function∑

n δ(x− nx0)1x0

n δ(ω2π − n

x0

)

14 Copyright c©2014 Roderic Grupen

Example: Human Voice

the open-closed resonance cavity—a constant diameter tube openat one end that radiates sound pressure into an infinitely largerenvironment

the open end of the cavity reflects an inverted pressure wave

F = 1/4(c/L) Hz1

F = 3/4(c/L) Hz2

F = 5/4(c/L) Hz3

f(t) = sin(2 F t)2

f(t) = sin(2 F t)3

st1 formantnd

2 formant rd3 formant

f(t) = sin(2 F t)1

ωn = (2n− 1)(c/4L)

several resonance modes (formants) exist simultaneously

15 Copyright c©2014 Roderic Grupen

Example: Spectral Evaluation

“...the rainbow passage...”

A screen capture from an analysis by the Computerized SpeechLab (courtesy Dr. M. Andrianopoulos): frequency content versustime; sound pressure level as a function of frequency, and ampli-tude versus time for a vocalization.

16 Copyright c©2014 Roderic Grupen

Example: Spectral Evaluation

...a prolonged [a] vowel changes the relative placement and ampli-tude of the resonance modes (formants) of the vocal tract

F1 vs. F2 vowel chart for 10 vowelsounds sampled from men, women,and children.

17 Copyright c©2014 Roderic Grupen

Shift Theorem

F [f (x)] =

∞∫

−∞

f (x)exp{−i(2πux)}dx, then

F [f (x− a)] =

∞∫

−∞

f (x− a)exp{−i(2πux)}dx, then

=

∞∫

−∞

f (x′)exp{−i(2πu(x′ + a))}dx′, and

= exp{−i(2πua)}∞∫

−∞

f (x′)exp{−i(2πux′)}dx′,

so that

F [f (x− a)] = exp{−i(2πua)}F(f (x))

18 Copyright c©2014 Roderic Grupen

Convolution Theorem

f (x) ∗ g(x) = h(x) =

∫ ∞

−∞f (α)g(x− α)dα

where α an integration variable.

F [f (x) ∗ g(x)] = F [h(x)]

= F[∫

α

f (α)g(x− α)dα

]

=

x

[∫

α

f (α)g(x− α)dα

]

exp{−i2πux}dx

=

α

f (α)

[∫

x

g(x− α)exp{−i(2πux)}dx]

and by the Shift theorem,

=

α

f (α)exp{−i(2πuα)}dα∫

x

g(x)exp{−i(2πux)}dx,

and therefore,F [f (x) ∗ g(x)] = F (u)G(u)

spatial/temporal convolution ↔ frequency domain multiplication

19 Copyright c©2014 Roderic Grupen

Sampling Theorem - Aliasing

A continuous spatial function, f (x)is sampled by computing the prod-uct of f (x) and g(x), an infinite se-quence of Dirac delta operators.

h(x) = f (x)∑

n

δ(x− nx0)

=∑

n

f (nx0)δ(x− nx0)

By the convolution theorem, theproduct of these two spatial func-tions is the convolution of theirFourier transform pairs.

f(x)

g(x)

h(x)

x02 x04 x06x0−2x0−4x0−6

f (x)F−→ F (u)

n

δ(x− nx0)F−→ 1

x0

n

δ(u− n

x0) = G(u), and

H(u) = F (u) ∗G(u)

20 Copyright c©2014 Roderic Grupen

Sampling Theorem - Aliasing (cont.)

H(u) = F (u) ∗G(u)

We may write the function H(u) in terms of F (u):

H(u) = F (u)∗[

1

x0

n

δ(u− n

x0)

]

=

∫ ∞

−∞F (α)G(u− α)dα

=

∫ ∞

−∞F (α)

[

1

x0

n

δ(u− α− n

x0)dα

]

=1

x0

∫ ∞

−∞

n

F (u− n

x0)δ(u− α− n

x0)dα

=1

x0

n

F (u− n

x0)

∫ ∞

−∞δ(u− α− n

x0)dα

H(u) =1

x0

n

F (u− n

x0)

the frequency spectrum of the sampled image contains of dupli-cates of the spectrum of the original image distributed at 1/x0frequency intervals.

21 Copyright c©2014 Roderic Grupen

Aliasing (cont.)

x01/ x02/x0−1/x0

−2/

x01/ x02/x0−1/x0

−2/

(a)

(b)

Η( )ω R( )H( )ω ω

R(u) is a frequency domainbandpass filter

R(u) = 1 if |u| < 1/(2x0),

0 otherwise

Aliasing: When replicated spectra interfere, the crosstalk intro-duces energy at relatively high frequencies changing the appear-ance of the reconstructed image.

The Sampling Theorem:

If the image contains no frequency components greater than

one half the sampling frequency, then the continuous image is

faithfully represented in the sampled image.

22 Copyright c©2014 Roderic Grupen

Early Processing - Convolution

f (x, y)∗g(x, y) = h(x, y) =

∫ ∞

−∞

∫ ∞

−∞f (u, v)g(x−u, y−v)dudv

or

f (x, y) ∗ g(x, y) = h(x, y) =

∞∑

−∞

∞∑

−∞f (u, v)g(x− u, y − v)

g(x,y)

h(x,y)

h(n,n)

f(3,3)

g(n,n)

23 Copyright c©2014 Roderic Grupen

Early Processing

...if you know what you’re looking for, convolution can be used asa means of approximating the correlation of a signal fragment toa reference template...

Rgt =∑

α

β

t(α, β)g(x + α, y + β).

maxima in Rgt are minima in:

α∑

i=−α

β∑

j=−β

[g(x + i, y + j)− t(α, β)]2

24 Copyright c©2014 Roderic Grupen

Frei and Chen Templates

...patterns of responses can be more robust andmore informative than single operators...

1 1 11 1 11 1 1

f0

−1 −√2 −1

0 0 0

1√2 1

f1

−1 0 1

−√2 0

√2

−1 0 1

f2

0 −1√2

1 0 −1

−√2 1 0

f3

√2 −1 0

−1 0 1

0 1 −√2

f4

0 1 0−1 0 −10 1 0

f5

−1 0 10 0 01 0 −1

f6

1 −2 1−2 4 −21 −2 1

f7

−2 1 −21 4 1−2 1 −2

f8

Frobenius inner product A : B = trace(ABT ) = trace(ATB) ofany pair of Frei-Chen operators is zero

each operator captures an independent “shape” in 3× 3neighborhoods of the luminosity surface

the image is characterized as a vector of responses, h(x, y), where

hk = fk ∗ g, k ∈ [0, 8].

25 Copyright c©2014 Roderic Grupen

Frei and Chen Templates

hk = fk ∗ g

Edge Energy(x, y) =

2∑

k=1

h2k(x, y)

Total Energy(x, y) =

8∑

k=0

h2k(x, y)

define:

cos(θ) =

(

EE

TE

)1/2

gradientmagnitudethreshold

Frei−Chenthreshold

TotalEnergy

Edge Energy

TotalEnergy

Edge Energy

θ

1

1EE(x ,y )1

EE(x ,y )2 2

1EE(x ,y )1

EE(x ,y )2 2

26 Copyright c©2014 Roderic Grupen

Early Processing -Differential Geometry

Low Pass Filter

g(x)

h(x)

f(x)

*h = f g

1

x−1/2 +1/2

rect(x)sinc( ) =

ω2π (ω/2)

ω/2sin( )1

5 10 15 20 250−5−10−15−20−25ω

F

F−1

27 Copyright c©2014 Roderic Grupen

Early Processing -Gradient Operators

Intensity Gradients

∇g =dg

dxx +

dg

dyy

dg(x, y)

dx≈ g(x + 1, y)− g(x− 1, y)

2⇒ fx =

[

−1

2, 0,

1

2

]

1×3

dg(x, y)

dy≈ g(x, y + 1)− g(x, y − 1)

2⇒ fy =

−12012

3×1

|∇g| =

[

(

dg

dx

)2

+

(

dg

dy

)2]

12

=

φ = tan−1

(

dg/dy

dg/dx

)

28 Copyright c©2014 Roderic Grupen

Edge Operators

operator ∇1 ∇2

Roberts[

0 1−1 0

] [

1 00 −1

]

Prewit

−1 0 1−1 0 1−1 0 1

1 1 10 0 0

−1 −1 −1

Sobel

−1 0 1−2 0 2−1 0 1

1 2 10 0 0

−1 −2 −1

29 Copyright c©2014 Roderic Grupen

Early Processing - Edge Sharpening

Laplacian

∇2g =d2g

dx2+

d2g

dy2⇒ f =

0 −1 0−1 4 −10 −1 0

2

f(x)

f(x)

f(x)

30 Copyright c©2014 Roderic Grupen

Summary: Signals, Sampling, and Features

• optics and the evolution of the eye

• spectral analysis and the sampling theorem

• convolution - feature-based early processing, the spectral in-terpretation of convolution operators

Today: Deep Structure and Information

• differential structure - edge sharpening algorithm

• the Canny edge operator and Gaussian operators

• multi-scale feature detectors

– edge sharpening revisited

– gauge coordinates and scale space

– oriented edges, ridges, corners, blobs

31 Copyright c©2014 Roderic Grupen

Deep Structure

...consider a multivariate signal f (x1, . . . , xn) in the vicinity of(x01, . . . , x

0n), represented by its Taylor series expansion:

f (x1, . . . , xn) = f (x01, . . . , x0n)+

∞∑

m=1

k1,...,kn∑

j kj=m

∂mf (x01, . . . , x0n)

∂xk11 · · · ∂xknn

[

(x1 − x01)k1

k1!· · · (xn − x0n)

kn

kn!

]

Definition (N-jet): spatial derivatives

∂mf (x01, . . . , x0n)

∂xk11 · · · ∂xknn, m ∈ [0, N ],

j

kj = m.

characterize differential structure up to order N .

32 Copyright c©2014 Roderic Grupen

Scale Space and the N-jet

Image Processing

the N -jet of order 2 for function f (x, y) is the set(

f,∂f

∂x,∂f

∂y,∂2f

∂x2,

∂2f

∂x∂y,∂2f

∂y2

)

.

Topological structure in the luminance function (critical points,extremum, ridge points, corners, blobs) can be captured in termsof differential structure and form a basis for structural informationtheory.

Structures that are persistent across the scale dimension constitutethe “deep structure” of signals.

f (x, y, σ)

.

...how should N -jet operators be chosen?

33 Copyright c©2014 Roderic Grupen

Differential Convolution Operators

Suppose:

f (x) : image function

gσ(x) =1√2πσ

e−x2/2σ2 : Gaussian smoothing function

Then:

f ∗ g ⇒ bandlimited approximation of image f (x)

⇒ suppress noise prior to differentiationd

dt(f ∗ g) ⇒ edges in the image f (x)

Canny Edge Operator

d

dt(f ∗ g) = f ∗ d

dtg

34 Copyright c©2014 Roderic Grupen

Gaussian Operatorsin the Frequency Domain

35 Copyright c©2014 Roderic Grupen

Multi-Scale Feature Detectors

from before, edge sharpening:

∇2f (x, y) = 0, and

|∇f | > τ.

Oriented Edges

“steerable” filters - gauge coordinates

a local orthonormal coordinate frame (u, v) oriented with the v-axis parallel to the gradient direction at position (x, y).

by this definition, fv is always positive and the lateral first deriva-tive, fu, is zero.

fvv = 0, and

fvvv < 0.

• coarse scales suppress noise, fine scales localize more precisely,

• most sharp edge structures corresponding to object boundariesexist simultaneously at many scales.

36 Copyright c©2014 Roderic Grupen

Multi-Scale Oriented Edges

Edges reported at t=1.0, 16.0, and 256.0, respectively

37 Copyright c©2014 Roderic Grupen

Multi-Scale Feature Detectors

Oriented Ridges

gauge coordinates (p, q) - aligns the local principal curvature ofthe luminocity function such that off-diagonal terms in the imageHessian, fpq = fqp = 0.

A ridge is an extremum in the direction of the maximum principalcurvature, which we align with direction p:

fp = 0,

fpp < 0, and |fpp| ≥ |fqq|

dominant ridge structures can be found at all scales.

38 Copyright c©2014 Roderic Grupen

Multi-Scale Oriented Ridges

Ridges reported at t=1.0, 16.0, and 256.0, respectively

39 Copyright c©2014 Roderic Grupen

Intrinsic Scale

feature featuretype strength

edge√σ

21/4fv

ridge σ3

23/2(fpp − fqq)

2

corner σ4

4 fvfuu

blob σ2

2 ∇2f

search for scales at which feature strength is maximized

40 Copyright c©2014 Roderic Grupen

Intrinsic Scale - Corners, and Blobs

41 Copyright c©2014 Roderic Grupen