Iccv11 salientobjectdetection
![Page 1: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/1.jpg)
Salient Object Detection by Composition
Jie Feng¹, Yichen Wei², Litian Tao³, Chao Zhang¹, Jian Sun²
¹ Key Laboratory of Machine Perception, Peking University
² Microsoft Research Asia
³ Microsoft Search Technology Center Asia
![Page 2: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/2.jpg)
A key vision problem: object detection
• Fundamental for image understanding
• Extremely challenging
– Huge number of object classes
– Huge variations in object appearances
![Page 3: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/3.jpg)
What are salient objects?
• Visually distinctive and semantically meaningful
• Inherently ambiguous and subjective
Yes! Yes? probably No!
![Page 4: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/4.jpg)
Why detect salient objects?
• Relatively easy: large and distinct
• Semantically important
1. Image summarization, cropping…
2. Object level matching, retrieval…
3. A generic object detector for later recognition
– avoid running thousands of different detectors
– a scalable system for image understanding
![Page 5: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/5.jpg)
Traditional approach: saliency map
• Measures per-pixel importance
• Loses information and is insufficient for finding objects
![Page 6: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/6.jpg)
Sliding window object detection
• Slide different size windows over all positions
• Evaluate a quality function, e.g., a car classifier
• Output the windows that are locally optimal
• Face, human…
• Car, bus…
• Horse, dog…
• Table, couch…
• …
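The sliding window procedure above can be sketched as follows. This is a minimal illustration, not the detector from the talk: the `quality` callback, the stride, and the window sizes are placeholders chosen for the example.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Window = Tuple[int, int, int, int]  # (x, y, width, height)


def sliding_windows(img_w: int, img_h: int,
                    sizes: Sequence[Tuple[int, int]],
                    stride: int) -> Iterator[Window]:
    """Yield every window of the given sizes that fits inside the image."""
    for w, h in sizes:
        for y in range(0, img_h - h + 1, stride):
            for x in range(0, img_w - w + 1, stride):
                yield (x, y, w, h)


def score_windows(img_w: int, img_h: int,
                  sizes: Sequence[Tuple[int, int]],
                  stride: int,
                  quality: Callable[[Window], float]) -> List[Tuple[float, Window]]:
    """Evaluate the quality function on every window, best score first."""
    scored = [(quality(win), win)
              for win in sliding_windows(img_w, img_h, sizes, stride)]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored
```

In a class-specific detector `quality` would be a classifier such as a car detector; in this talk it is replaced by a single window saliency measure.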
![Page 7: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/7.jpg)
Salient object detection by composition
• A ‘composition’ based window saliency measure
– intuitive and generalizes to different objects
• A sliding window based generic object detector
– fast and practical: 1-2 seconds per image
– a few dozen to a few hundred output windows
• Effective pre-processing for later recognition tasks
![Page 8: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/8.jpg)
It is hard to represent a salient window
• Given image I and window W
• saliency(W) = cost of composing W using (I-W)
![Page 9: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/9.jpg)
Benefits of ‘composition’ definition
•
![Page 10: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/10.jpg)
Part based representation
$W = \{S^i_1, \dots, S^i_3\}$
$I - W = \{S^o_1, \dots, S^o_{10}\}$
• Each part S has an (inside/outside) area A(S)
• Each part pair (p, q) has a composition cost c(p, q)
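One way to obtain the inside and outside areas A(S) is to count, for each segment, how many of its pixels fall inside the window and how many fall outside. The sketch below assumes the over-segmentation is given as a 2D grid of segment ids; the function name and the data layout are illustrative choices, not taken from the slides.

```python
from collections import Counter
from typing import List, Tuple

Window = Tuple[int, int, int, int]  # (x, y, width, height)


def split_areas(labels: List[List[int]], window: Window) -> Tuple[Counter, Counter]:
    """Return (inside, outside) pixel counts per segment id for a window.

    A segment that crosses the window border contributes an inside part
    and an outside part, each with its own area.
    """
    x0, y0, w, h = window
    inside, outside = Counter(), Counter()
    for y, row in enumerate(labels):
        for x, seg in enumerate(row):
            if x0 <= x < x0 + w and y0 <= y < y0 + h:
                inside[seg] += 1
            else:
                outside[seg] += 1
    return inside, outside
```

A per-pixel scan like this is only meant to make the representation concrete; a practical detector would update these counts incrementally as the window slides.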
![Page 11: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/11.jpg)
Generate parts by over-segmentation
Typically 100-200 segments in a natural image
P. F. Felzenszwalb and D. P. Huttenlocher. Efficient graph-based image segmentation. IJCV, 2004
![Page 12: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/12.jpg)
An illustrative ‘composition’ example
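saliency(W) = cost(A, a) + cost(B, b) + cost(C, c) + cost(D, d) + cost(E, e)
W = {A, B, C, D, E}, where A to E are the parts inside the window and a to e are the outside parts used to compose them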
![Page 13: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/13.jpg)
Computational principles
1. Appearance proximity
2. Spatial proximity
3. Non-reusability
4. Non-scale-bias
• Intuitive perceptions about saliency
![Page 14: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/14.jpg)
1. Appearance proximity
• Salient parts have distinct appearances
• q1 and q2 are equally distant from p, but q2 is more similar to p, so composing p from q2 is cheaper
• Figure values: c(p, q1) = 0.6, c(p, q2) = 0.2
![Page 15: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/15.jpg)
2. Spatial proximity
• Salient parts are far from similar parts
• q1 and q2 are equally similar to p, but q2 is closer to p, so composing p from q2 is cheaper
• Figure values: c(p, q1) = 0.3, c(p, q2) = 0.2
![Page 16: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/16.jpg)
3. Non-reusability
• An outside part can be used only once
• Robust to background clutter
![Page 17: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/17.jpg)
4. Non-scale-bias
• Normalize by window area to avoid a bias toward large windows
• A tight bounding box scores higher than a loose one (figure values: 0.6 and 0.3)
![Page 18: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/18.jpg)
Define composition cost c(p, q)
•
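The formula on this slide did not survive in the transcript. As an illustration only, the sketch below shows one cost that respects the appearance and spatial proximity principles: it grows with the appearance distance and also grows with the spatial distance. The blending form, the parameter names, and the normalization ranges are assumptions, not the definition from the slide.

```python
def composition_cost(d_app: float, d_spatial: float, d_app_max: float = 1.0) -> float:
    """Illustrative cost c(p, q) built from two distances (assumed form).

    d_app:     appearance distance between p and q, in [0, d_app_max]
    d_spatial: spatial distance between p and q, normalized to [0, 1]

    Nearby parts pay roughly their appearance distance; far-away parts
    pay close to the maximum appearance distance however similar they are.
    """
    if not 0.0 <= d_spatial <= 1.0:
        raise ValueError("d_spatial must be normalized to [0, 1]")
    if not 0.0 <= d_app <= d_app_max:
        raise ValueError("d_app must lie in [0, d_app_max]")
    return (1.0 - d_spatial) * d_app + d_spatial * d_app_max
```

With this form, a similar part that is close is the cheapest choice, which matches the two illustrated examples: at equal distance the more similar part costs less, and at equal similarity the closer part costs less.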
![Page 19: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/19.jpg)
Part based composition
• Find outside parts with the same total area as the inside parts and the smallest composition cost
• Need to decide which outside part composes which inside part, and with how much area
• Formulated as an Earth Mover’s Distance (EMD)
– optimal solution has polynomial (cubic) complexity
• A greedy optimization
– pre-computation + incremental sliding window update
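A greedy approximation of this assignment can be sketched as below. Each inside part takes area from its cheapest outside parts first, consumed area is not available again (non-reusability), and the total is divided by the window area (non-scale-bias). The processing order of the inside parts, the `max_cost` fallback for uncovered area, and the function signature are assumptions made for the example; this is not the pseudo code from the talk.

```python
from typing import Callable, Dict, Hashable


def greedy_composition(inside: Dict[Hashable, float],
                       outside: Dict[Hashable, float],
                       cost: Callable[[Hashable, Hashable], float],
                       max_cost: float = 1.0) -> float:
    """Area-normalized cost of composing the inside parts from outside parts.

    inside / outside map a part id to its area.
    cost(p, q) is the cost of composing one unit of area of p from q.
    """
    remaining = dict(outside)  # outside area still available
    total = 0.0
    for p, area_p in inside.items():
        need = area_p
        # Try the cheapest outside parts for p first.
        for q in sorted(remaining, key=lambda part: cost(p, part)):
            if need <= 0:
                break
            used = min(need, remaining[q])
            total += used * cost(p, q)
            remaining[q] -= used  # an outside part cannot be reused
            need -= used
        # Assumed fallback: area that nothing can compose pays the maximum cost.
        total += need * max_cost
    window_area = sum(inside.values())
    return total / window_area if window_area else 0.0
```

The exact solution would solve the underlying transportation problem; the greedy pass trades optimality for speed, which is the trade-off the slide describes.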
![Page 20: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/20.jpg)
Greedy composition algorithm
•
![Page 21: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/21.jpg)
Algorithm pseudo code
![Page 22: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/22.jpg)
Pre-computation and initialization
•
![Page 23: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/23.jpg)
More implementation details
• 6 window sizes: 2% to 50% of image area
• 7 aspect ratios: 1:2 to 2:1
• 100-200 segments
• 1-2 seconds for 300 by 300 image
• Find locally optimal windows by non-maximum suppression
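The window set and the suppression step could look like the sketch below. The slides give only the ranges (6 sizes from 2% to 50% of the image area, 7 aspect ratios from 1:2 to 2:1); the geometric spacing between those endpoints, the greedy overlap-based suppression, and the 0.5 overlap threshold are assumptions.

```python
import math
from typing import List, Tuple

Window = Tuple[int, int, int, int]  # (x, y, width, height)


def geometric_steps(lo: float, hi: float, n: int) -> List[float]:
    """n values from lo to hi, evenly spaced on a log scale (assumed spacing)."""
    return [lo * (hi / lo) ** (k / (n - 1)) for k in range(n)]


def window_shapes(img_w: int, img_h: int,
                  n_sizes: int = 6, n_ratios: int = 7) -> List[Tuple[int, int]]:
    """(width, height) pairs covering 2%-50% of the area and ratios 1:2 to 2:1."""
    image_area = img_w * img_h
    shapes = []
    for frac in geometric_steps(0.02, 0.5, n_sizes):
        for ratio in geometric_steps(0.5, 2.0, n_ratios):  # ratio = width / height
            w = round(math.sqrt(frac * image_area * ratio))
            h = round(math.sqrt(frac * image_area / ratio))
            if w <= img_w and h <= img_h:
                shapes.append((w, h))
    return shapes


def iou(a: Window, b: Window) -> float:
    """Intersection over union of two windows."""
    iw = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    ih = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)


def non_max_suppression(scored: List[Tuple[float, Window]],
                        max_iou: float = 0.5) -> List[Tuple[float, Window]]:
    """Keep a window only if it does not overlap a higher-scoring kept window."""
    kept: List[Tuple[float, Window]] = []
    for score, win in sorted(scored, key=lambda item: item[0], reverse=True):
        if all(iou(win, other) < max_iou for _, other in kept):
            kept.append((score, win))
    return kept
```

For a 300 by 300 image this yields 42 window shapes, the largest being 300 by 150 and 150 by 300.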
![Page 24: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/24.jpg)
Evaluation on PASCAL VOC 07
• Designed for object detection
– 20 object classes
– Large object and background variation
– Challenging for traditional saliency methods
• Not entirely suitable for salient object detection
– Not all labeled objects are salient: small, occluded, repetitive
– Not all salient objects are labeled: only 20 classes
• But still the best database available
![Page 25: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/25.jpg)
Yellow: correct, Red: wrong, Blue: ground truth
top 5 salient windows
![Page 26: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/26.jpg)
Yellow: correct, Red: wrong, Blue: ground truth
![Page 27: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/27.jpg)
Yellow: correct, Red: wrong, Blue: ground truth
![Page 28: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/28.jpg)
Yellow: correct, Red: wrong, Blue: ground truth
![Page 29: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/29.jpg)
Outperforms the state-of-the-art
• Objectness: B. Alexe, T. Deselaers, and V. Ferrari. What is an object? In CVPR, 2010.
• Objectness uses mainly local cues: it finds windows that are locally salient but not globally salient
![Page 30: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/30.jpg)
Yellow: correct, Red: wrong, Blue: ground truth
ours
objectness
![Page 31: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/31.jpg)
Yellow: correct, Red: wrong, Blue: ground truth
ours
objectness
![Page 32: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/32.jpg)
Failure cases: too complex
![Page 33: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/33.jpg)
Failure cases: lack of semantics
• Partial background with object: man with background
• Unannotated objects: painting, pillows
• Similar objects together: two chairs
![Page 34: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/34.jpg)
Failure cases: lack of semantics
• Partial object or object parts: wheels and seat
![Page 35: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/35.jpg)
Number of windows vs. detection rate
• Find many objects within a few windows
• A practical pre-processing tool
| # top windows | 5 | 10 | 20 | 30 | 50 |
| --- | --- | --- | --- | --- | --- |
| Recall | 0.25 | 0.33 | 0.44 | 0.5 | 0.57 |
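A recall figure of this kind can be computed as the fraction of ground-truth boxes matched by at least one of the top k windows. The sketch below assumes the usual PASCAL-style criterion of intersection over union of at least 0.5 for a match; the slides do not state the threshold, so treat `min_iou` as an assumption.

```python
from typing import List, Tuple

Window = Tuple[int, int, int, int]  # (x, y, width, height)


def iou(a: Window, b: Window) -> float:
    """Intersection over union of two windows."""
    iw = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    ih = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)


def recall_at_k(ranked_windows: List[Window],
                ground_truth: List[Window],
                k: int,
                min_iou: float = 0.5) -> float:
    """Fraction of ground-truth boxes covered by one of the top-k windows."""
    if not ground_truth:
        return 0.0
    top = ranked_windows[:k]
    hits = sum(1 for gt in ground_truth
               if any(iou(gt, win) >= min_iou for win in top))
    return hits / len(ground_truth)
```

Evaluating this for k = 5, 10, 20, 30, 50 over a whole dataset would produce one row of the same shape as the table above.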
![Page 36: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/36.jpg)
Evaluation on MSRA database
• Less challenging: only a single large object
– T.Liu, J.Sun, N.Zheng, X.Tang, and H.Shum. Learning to detect a
salient object. In CVPR, 2007
• Use the most salient window of our approach in evaluation
– pixel level precision/recall is comparable with previous methods
• Our approach is principled for multi-object detection
– benefits less from the database’s simplicity than previous methods
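Pixel-level precision and recall for a single window can be computed against a binary ground-truth mask, as in the sketch below. The mask layout (a 2D grid of 0/1 values) and the function name are assumptions for illustration; the slides do not describe the evaluation code.

```python
from typing import List, Tuple

Window = Tuple[int, int, int, int]  # (x, y, width, height)


def window_precision_recall(window: Window,
                            mask: List[List[int]]) -> Tuple[float, float]:
    """Pixel-level (precision, recall) of a window against a 0/1 mask.

    precision = salient pixels inside the window / window area
    recall    = salient pixels inside the window / all salient pixels
    """
    x0, y0, w, h = window
    img_h = len(mask)
    img_w = len(mask[0]) if mask else 0
    # Clip the window to the image so out-of-range windows are handled.
    xs = range(max(x0, 0), min(x0 + w, img_w))
    ys = range(max(y0, 0), min(y0 + h, img_h))
    true_pos = sum(mask[y][x] for y in ys for x in xs)
    window_area = len(xs) * len(ys)
    total_salient = sum(sum(row) for row in mask)
    precision = true_pos / window_area if window_area else 0.0
    recall = true_pos / total_salient if total_salient else 0.0
    return precision, recall
```

Applied to the most salient window of each image, this gives numbers comparable in kind to those reported by saliency-map methods on the same database.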
![Page 37: Iccv11 salientobjectdetection](https://reader031.fdocuments.us/reader031/viewer/2022020307/55a395e21a28aba97a8b461f/html5/thumbnails/37.jpg)
Summary
•