Discriminant Hypothesis for Visual Saliency and its ... · detection. SVCL •Hypothesis: saliency...
Transcript of Discriminant Hypothesis for Visual Saliency and its ... · detection. SVCL •Hypothesis: saliency...
SVCL
Discriminant Hypothesis for Visual Saliency and its Applications in
Computer Vision
Dashan Gao
Joint work with Nuno VasconcelosStatistical Visual Computing Laboratory
Department of Electrical and Computer Engineering,University of California, San Diego
SVCL
What is visual saliency?•certain image features that attract visual attention
(Yarbus, 1967)
(Treisman & Gormican, 1988)
SVCL
What is known about saliency?
• Bottom-up (BU) saliency– stimulus-driven mechanism, fast– goal independent– e.g. traffic signs
• Top-down (TD) saliency– goal-driven mechanism, slower– provides informative locations to
the specific task
Yarbus 1967:
1. Free viewing
2. Estimate the economic level of the people
3. Judge their age
SVCL
In computer vision
•saliency is widely used in visual recognition systems, as a pre-processing stage–a sparse image representation–reduces computation–example: weakly supervised learning of object
categories
Fergus et. al 2003 Sivic et. al 2003
SVCL
however, in computer vision• most saliency definitions are universal, divorced
from the recognition problem– repeatability (stability) [Forstner(94), Harris-Stephens(88), shi-Tomasi(94), Mikolajczyk(01,04)]
– continuity (curvature) [Sha’ashua-Ullman(88), Asada-Brady(86)]
– complexity (information content) [Kadir-Brady(01),Sebe-Lew(03)]
– rarity (low probability) [Walker et al.(98), Oliva et al.(03), Bruce-Tsotsos(05)]
• in result,– detected salient locations may not be very informative for
recognition.
Harris detection
SVCL
• Hypothesis: saliency is a discriminant process
• requires a stimulus of interest, and a null hypothesis of stimuli that are not salient– context dependent: salient attributes depend on the
object of interest and the context (null hypothesis)
• Definition: salient features are those that best distinguish the given visual concept from the null hypothesis
Discriminant saliency
(NIPS 2004)
SVCL
Infomax feature selection
• solution: features that maximize the mutual information between the features (X) and the class label (Y)
• under a constraint of computational parsimony
• we use marginal mutual information,
X={X1, …, Xn}
SVCL
Top-down Discriminant Saliency Model
Scale Selection
W j
WTA
Faces Discriminant Feature
Selection
Salient Features
Background
Training
Testing
Saliency Map
Original Feature Set
Malik-Perona pre-attentive perception model [M-K90]
SVCL
Qualitative evaluation• discriminant saliency more correlates with target objects
Original images
Saliency Maps by
Discriminant Saliency
Scale SaliencyDetector [K-B 01]
Harris SaliencyDetector [H-S 88]
Salient locations by Discriminant
Saliency
SVCL
Visual classification
• how informative are the salient points?
– evaluated by measuring actual recognition rates
– classifier: histogram of saliency values, fed to an SVM for a presence/absence test
– compared with two standard saliency detectors, and two bench marks
6065707580859095
100
FacesMotorbikes
Airplanes
DiscSal-DCT
Scale Saliency
Harris Saliency
Image Pixel
Constellation
SVCL
Repeatability test
• robustness to various transformations
2 2.5 3 3.5 4 4.5 5 5.5 60
10
20
30
40
50
60
70
80
90
100
increasing scale changes
rep
eata
bili
ty (
%)
DSDSSDHarrLapHesLapMser
2 2.5 3 3.5 4 4.5 5 5.5 60
10
20
30
40
50
60
70
80
90
100
increasing viewpoint angle
rep
eata
bili
ty (
%)
DSDSSDHarrLapHesLapMser
2 2.5 3 3.5 4 4.5 5 5.5 60
10
20
30
40
50
60
70
80
90
100
increasing blur
rep
eata
bili
ty (
%)
DSDSSDHarrLapHesLapMser
2 2.5 3 3.5 4 4.5 5 5.5 60
10
20
30
40
50
60
70
80
90
100
increasing JPEG compression
rep
eata
bili
ty (
%)
DSDSSDHarrLapHesLapMser
scale+rotation
view angle
blurring
JPEG compression
SSD: Kadir-Brady 2001
HarrLap/HesLap: Mikolajczyk 2004
Mser: Matas et al. 2004
SVCL
Other research work based on Top-down Discriminant Saliency• a hierarchical model learning of complex features
and detectors for classificationFaces
DiscriminantFeature
Selection
Background
Saliency Map &Salient Locations
Salient (complex) Features
Initial Feature Set Object
Detection
Complex Feature
Generation
Feature Complexity
Control
New Complex Feature Set
(CVPR 2005)
SVCL
Bayesian Integration• Advantage
– the trade-off between high selectivity (by TD) and high accuracy(by BU)
• Probabilistic formulation of salient locations– saliency output: probability distributions of saliency locations
over the image plane
• Saliency as Bayesian inference – BU saliency -> prior, TD saliency -> likelihood
– inference from the two observations: both accuracy and selectivity -> a posterior
SVCL
Results• better locations
• better selectivity (classification accuracy)
BU TD Integration
SVCL
Applications
• Automated gathering of training examples
SVCL
Applications
• Region-of-interest (ROI) based Image compression
Normal JPEGIn low bit rate ROI compression
Joint work with Sunhyoung Han
SVCL
Discriminant saliency hypothesis
• these results are encouraging, but how do we evaluate the hypothesis as a whole?
• two fundamental questions – can it explain biological saliency?
– can it drive both bottom-up and top-down saliency?
• bottom-up saliency particularly interesting – bottom-up visual pathway much better understood than its top-
down counterpart
• motivated us to study bottom-up discriminant saliency
SVCL
• Recall: bottom-up saliency– stimulus-driven mechanism– saliency detection on a single image
• Center-surround mechanism
Bottom-up discriminant saliency
X: features, Y: {center, surround}
Wl0
Wl1
P(x|center)
P(x|surround)
x
l
SVCL
• band-pass features exhibit regular patterns of response to natural images– bow-tie shaped conditional distributions
(Buccigrossi & Simoncelli, 1999; Huang & Mumford, 1999)
– although fine details of feature dependency may vary from scene to scene, coarse structure follows a universal law for all classes
– feature dependencies are not informative about image classes
Joint feature distribution
top: three images. bottom: conditional histogram of the same coefficient, conditioned on the value of its parent.
0
P(x
i|x j
)
SVCL
• enables the approximation of mutual information by the sum of marginal mutual information (Vasconcelos & Vasconcelos, 2004)
– all complexity is encoded in the second term
• from a computational standpoint, this is extremely simple to compute (computational parsimony)
Joint feature distribution
discriminant info of individual features discriminant info of features dependencies
SVCL
Generalized Gaussian density (GGD)
• the marginal distributions of natural image features follow a generalized Gaussian density (GGD)
For β=1, a Laplace distribution, and a Gaussian when β=2
−0.2 −0.1 0 0.1 0.210
−5
10−4
10−3
10−2
10−1
100
X
P(X
)
HistogramGGD
−2 −1 0 1 210
−5
10−4
10−3
10−2
10−1
100
X
P(X
)
HistogramGGD
Examples of GGD fit for responses of two Gabor filters
SVCL
In summary
• combining principles of– infomax organization– computational parsimony– neural tuning to
stimulus statistics
• leads to a very simple saliency operator
• which is approximately optimal in the minimumerror probability sense
Feature decomposition
Σ
Color (R/G, B/Y)IntensityOrientation
Feature maps
Feature saliency m
apsD
iscriminant
measure
SVCL
Biological plausibility• discriminant saliency can be implemented with a three-layer
neural network
−10 −5 −1.2 0 1.2 5 10
−0.6
−0.4
−0.2
0
x
φ(x)
φ′(x)
ΣS(l)
Layer 3
differential simple cell
complex cell
Σ| . |β1
......
Σ| . |β0
......
{
{Σ φ(.) ΣTl
W0l
W1l
ψ[xj,Φ0]
ψ[xj,Φ1]
+
- g[xj] Il(Xk,Y)
Layer 1 Layer 2
| . |β0
xj
| . |β1H(Y)
+
cortical columns
SVCL
Single vs. conjunctive feature searchFind a bar different from all others
discriminant saliency
prediction
SVCL
Asymmetries in visual search
Time(Find a “Q” among “O”s) < Time(Find a “O” among “Q”s)
presence of a feature absence of a feature
saliency prediction
sear
ch t
ime
# of distractors
SVCL
Distractor heterogeneity (relevant dimension)
• Saliency perception is nonlinear and affected by heterogeneous distractors if the heterogeneity is in the same dimension
bg=0o bg=10o bg=20o
Nothdurft (1993)
0 10 20 30 40 50 60 70 80 900
0.5
1
1.5
2
2.5
3
3.5
Orientation contrast (deg)
Rel
ativ
e S
alie
ncy
bg=0bg=10bg=20
Discriminant saliency
SVCL
Prediction of human eye fixations on natural images (qualitative)
S
H
SVCL
Applications in motion saliency• discriminant saliency is combined with motion field
– optical flow – removes the camera motion
Joint work with Vijay Mahadevan
(NIPS 2007)
SVCL
Dynamic background
• discriminant saliency is combined with dynamic texture model– dealing with complex motion of the background scene
• background subtraction in dynamic scenes
Joint work with Vijay Mahadevan
(NIPS 2007)
SVCL
Discussion
• discriminant saliency is– biologically plausible– consistent with psychophysics of saliency
• connects a number of “disjoint” observations from neurophysiology and psychophysics– divisive normalization and saliency asymmetries
• a (unified) holistic functional justification for V1– optimally detects salient locations in the visual field in a
decision-theoretic sense under certain approximations for the sake of computational parsimony