Discriminant Hypothesis for Visual Saliency and its ... · detection. SVCL •Hypothesis: saliency...

SVCL

Discriminant Hypothesis for Visual Saliency and its Applications in

Computer Vision

Dashan Gao

Joint work with Nuno VasconcelosStatistical Visual Computing Laboratory

Department of Electrical and Computer Engineering,University of California, San Diego

SVCL

What is visual saliency?•certain image features that attract visual attention

(Yarbus, 1967)

(Treisman & Gormican, 1988)

SVCL

What is known about saliency?

• Bottom-up (BU) saliency– stimulus-driven mechanism, fast– goal independent– e.g. traffic signs

• Top-down (TD) saliency– goal-driven mechanism, slower– provides informative locations to

the specific task

Yarbus 1967:

1. Free viewing

2. Estimate the economic level of the people

3. Judge their age

SVCL

In computer vision

•saliency is widely used in visual recognition systems, as a pre-processing stage–a sparse image representation–reduces computation–example: weakly supervised learning of object

categories

Fergus et. al 2003 Sivic et. al 2003

SVCL

however, in computer vision• most saliency definitions are universal, divorced

from the recognition problem– repeatability (stability) [Forstner(94), Harris-Stephens(88), shi-Tomasi(94), Mikolajczyk(01,04)]

– continuity (curvature) [Sha’ashua-Ullman(88), Asada-Brady(86)]

– complexity (information content) [Kadir-Brady(01),Sebe-Lew(03)]

– rarity (low probability) [Walker et al.(98), Oliva et al.(03), Bruce-Tsotsos(05)]

• in result,– detected salient locations may not be very informative for

recognition.

Harris detection

SVCL

• Hypothesis: saliency is a discriminant process

• requires a stimulus of interest, and a null hypothesis of stimuli that are not salient– context dependent: salient attributes depend on the

object of interest and the context (null hypothesis)

• Definition: salient features are those that best distinguish the given visual concept from the null hypothesis

Discriminant saliency

(NIPS 2004)

SVCL

Infomax feature selection

• solution: features that maximize the mutual information between the features (X) and the class label (Y)

• under a constraint of computational parsimony

• we use marginal mutual information,

(more on this later)

X={X1, …, Xn}

SVCL

Top-down Discriminant Saliency Model

Scale Selection

W j

WTA

Faces Discriminant Feature

Selection

Salient Features

Background

Training

Testing

Saliency Map

Original Feature Set

Malik-Perona pre-attentive perception model [M-K90]

SVCL

Qualitative evaluation• discriminant saliency more correlates with target objects

Original images

Saliency Maps by

Discriminant Saliency

Scale SaliencyDetector [K-B 01]

Harris SaliencyDetector [H-S 88]

Salient locations by Discriminant

Saliency

SVCL

Visual classification

• how informative are the salient points?

– evaluated by measuring actual recognition rates

– classifier: histogram of saliency values, fed to an SVM for a presence/absence test

– compared with two standard saliency detectors, and two bench marks

6065707580859095

100

FacesMotorbikes

Airplanes

DiscSal-DCT

Scale Saliency

Harris Saliency

Image Pixel

Constellation

SVCL

Repeatability test

• robustness to various transformations

2 2.5 3 3.5 4 4.5 5 5.5 60

10

20

30

40

50

60

70

80

90

100

increasing scale changes

rep

eata

bili

ty (

%)

DSDSSDHarrLapHesLapMser

2 2.5 3 3.5 4 4.5 5 5.5 60

10

20

30

40

50

60

70

80

90

100

increasing viewpoint angle

rep

eata

bili

ty (

%)


2 2.5 3 3.5 4 4.5 5 5.5 60

10

20

30

40

50

60

70

80

90

100

increasing blur

rep

eata

bili

ty (

%)


2 2.5 3 3.5 4 4.5 5 5.5 60

10

20

30

40

50

60

70

80

90

100

increasing JPEG compression

rep

eata

bili

ty (

%)


scale+rotation

view angle

blurring

JPEG compression

SSD: Kadir-Brady 2001

HarrLap/HesLap: Mikolajczyk 2004

Mser: Matas et al. 2004

SVCL

Other research work based on Top-down Discriminant Saliency• a hierarchical model learning of complex features

and detectors for classificationFaces

DiscriminantFeature

Selection

Background

Saliency Map &Salient Locations

Salient (complex) Features

Initial Feature Set Object

Detection

Complex Feature

Generation

Feature Complexity

Control

New Complex Feature Set

(CVPR 2005)

SVCL

Bayesian Integration• Advantage

– the trade-off between high selectivity (by TD) and high accuracy(by BU)

• Probabilistic formulation of salient locations– saliency output: probability distributions of saliency locations

over the image plane

• Saliency as Bayesian inference – BU saliency -> prior, TD saliency -> likelihood

– inference from the two observations: both accuracy and selectivity -> a posterior

SVCL

Results• better locations

• better selectivity (classification accuracy)

BU TD Integration

SVCL

Applications

• Automated gathering of training examples

SVCL

Applications

• Region-of-interest (ROI) based Image compression

Normal JPEGIn low bit rate ROI compression

Joint work with Sunhyoung Han

SVCL

Discriminant saliency hypothesis

• these results are encouraging, but how do we evaluate the hypothesis as a whole?

• two fundamental questions – can it explain biological saliency?

– can it drive both bottom-up and top-down saliency?

• bottom-up saliency particularly interesting – bottom-up visual pathway much better understood than its top-

down counterpart

• motivated us to study bottom-up discriminant saliency

SVCL

• Recall: bottom-up saliency– stimulus-driven mechanism– saliency detection on a single image

• Center-surround mechanism

Bottom-up discriminant saliency

X: features, Y: {center, surround}

Wl0

Wl1

P(x|center)

P(x|surround)

x

l

SVCL

• band-pass features exhibit regular patterns of response to natural images– bow-tie shaped conditional distributions

(Buccigrossi & Simoncelli, 1999; Huang & Mumford, 1999)

– although fine details of feature dependency may vary from scene to scene, coarse structure follows a universal law for all classes

– feature dependencies are not informative about image classes

Joint feature distribution

top: three images. bottom: conditional histogram of the same coefficient, conditioned on the value of its parent.

0

P(x

i|x j

)

SVCL

• enables the approximation of mutual information by the sum of marginal mutual information (Vasconcelos & Vasconcelos, 2004)

– all complexity is encoded in the second term

• from a computational standpoint, this is extremely simple to compute (computational parsimony)

Joint feature distribution

discriminant info of individual features discriminant info of features dependencies

SVCL

Generalized Gaussian density (GGD)

• the marginal distributions of natural image features follow a generalized Gaussian density (GGD)

For β=1, a Laplace distribution, and a Gaussian when β=2

−0.2 −0.1 0 0.1 0.210

−5

10−4

10−3

10−2

10−1

100

X

P(X

)

HistogramGGD

−2 −1 0 1 210

−5

10−4

10−3

10−2

10−1

100

X

P(X

)

HistogramGGD

Examples of GGD fit for responses of two Gabor filters

SVCL

In summary

• combining principles of– infomax organization– computational parsimony– neural tuning to

stimulus statistics

• leads to a very simple saliency operator

• which is approximately optimal in the minimumerror probability sense

Feature decomposition

Σ

Color (R/G, B/Y)IntensityOrientation

Feature maps

Feature saliency m

apsD

iscriminant

measure

SVCL

Biological plausibility• discriminant saliency can be implemented with a three-layer

neural network

−10 −5 −1.2 0 1.2 5 10

−0.6

−0.4

−0.2

0

x

φ(x)

φ′(x)

ΣS(l)

Layer 3

differential simple cell

complex cell

Σ| . |β1

......

Σ| . |β0

......

{

{Σ φ(.) ΣTl

W0l

W1l

ψ[xj,Φ0]

ψ[xj,Φ1]

+

- g[xj] Il(Xk,Y)

Layer 1 Layer 2

| . |β0

xj

| . |β1H(Y)

+

cortical columns

SVCL

Single vs. conjunctive feature searchFind a bar different from all others

discriminant saliency

prediction

SVCL

Asymmetries in visual search

Time(Find a “Q” among “O”s) < Time(Find a “O” among “Q”s)

presence of a feature absence of a feature

saliency prediction

sear

ch t

ime

# of distractors

SVCL

Distractor heterogeneity (relevant dimension)

• Saliency perception is nonlinear and affected by heterogeneous distractors if the heterogeneity is in the same dimension

bg=0o bg=10o bg=20o

Nothdurft (1993)

0 10 20 30 40 50 60 70 80 900

0.5

1

1.5

2

2.5

3

3.5

Orientation contrast (deg)

Rel

ativ

e S

alie

ncy

bg=0bg=10bg=20

Discriminant saliency

SVCL

Prediction of human eye fixations on natural images (qualitative)

S

H

SVCL

Applications in motion saliency• discriminant saliency is combined with motion field

– optical flow – removes the camera motion

Joint work with Vijay Mahadevan

(NIPS 2007)

SVCL

Dynamic background

• discriminant saliency is combined with dynamic texture model– dealing with complex motion of the background scene

• background subtraction in dynamic scenes

Joint work with Vijay Mahadevan

(NIPS 2007)

SVCL

Discussion

• discriminant saliency is– biologically plausible– consistent with psychophysics of saliency

• connects a number of “disjoint” observations from neurophysiology and psychophysics– divisive normalization and saliency asymmetries

• a (unified) holistic functional justification for V1– optimally detects salient locations in the visual field in a

decision-theoretic sense under certain approximations for the sake of computational parsimony

Discriminant Hypothesis for Visual Saliency and its ... · detection. SVCL •Hypothesis: saliency...

Documents

Transcript of Discriminant Hypothesis for Visual Saliency and its ... · detection. SVCL •Hypothesis: saliency...