A Survey about Object Retrieval
![Page 1: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/1.jpg)
The Duality of Object Retrieval: Unsupervised and Supervised Approaches
TUAN NGUYEN ANH, THE UNIVERSITY OF TOKYO
![Page 2: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/2.jpg)
Index
• Part 1: Basic Object Retrieval
Ø Unsupervised approaches
• Part 2: State-of-the-art results
• Part 3: Future attempts
Ø Duality & supervised approaches
• Conclusion
![Page 3: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/3.jpg)
Part 1: Basic Object Retrieval
![Page 4: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/4.jpg)
Object Retrieval
[Figure: a query image (?) and its retrieved results, ranked 1st to 4th]
![Page 5: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/5.jpg)
[Screenshot: image search results showing similar images and related info]
Source: https://www.yandex.com/images
![Page 6: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/6.jpg)
[Screenshot: image search results showing key words for images, similar images, and related info]
Source: https://www.google.com/imghp
![Page 7: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/7.jpg)
![Page 8: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/8.jpg)
Pinterest: Zoom-in Search
Source: https://www.pinterest.com/
![Page 9: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/9.jpg)
Overview of the system
[Diagram: Query, Features, Matching, Database]
![Page 10: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/10.jpg)
Features in object retrieval
[Diagram: Query, Features, Matching, Database]
![Page 11: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/11.jpg)
Local features
• SIFT [Lowe, 1999, 2004]
• HOG [Dalal & Triggs, 2005]
![Page 12: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/12.jpg)
Global and deep features
• GIST features [Oliva et al., 2001]
Ø Describe the images by spectral information
• Deep features [Krizhevsky et al., 2012]
Ø Extracted from neural networks
![Page 13: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/13.jpg)
Aggregated Features
• BoF [Sivic et al., 2003]
• Hamming Embedding [Jégou et al., 2008]
• Fisher Vector [Perronnin et al., 2007]
• VLAD [Jégou et al., 2012]
![Page 14: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/14.jpg)
Bag of Features (BoF) [Sivic et al., 2003]
• Cluster local descriptors to build a dictionary.
• Compute the BoF vector as a histogram of visual words.
[Diagram: Images, Dictionary (c2, c3), Bag of Features]
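The two BoF steps can be sketched in plain Python. This is only an illustration under assumptions: the dictionary is taken as already learned (for example by k-means), and the 2-D descriptors, the three visual words, and the function names `nearest_word` and `bof_vector` are hypothetical.

```python
from collections import Counter

def nearest_word(descriptor, dictionary):
    """Return the index of the closest visual word (squared Euclidean distance)."""
    def sq_dist(word):
        return sum((d - w) ** 2 for d, w in zip(descriptor, word))
    return min(range(len(dictionary)), key=lambda i: sq_dist(dictionary[i]))

def bof_vector(descriptors, dictionary):
    """Histogram of visual-word assignments for the descriptors of one image."""
    counts = Counter(nearest_word(d, dictionary) for d in descriptors)
    return [counts.get(i, 0) for i in range(len(dictionary))]

# Hypothetical toy dictionary with three visual words (assumed already clustered).
dictionary = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
# Hypothetical local descriptors extracted from one image.
descriptors = [(0.2, -0.1), (4.8, 5.3), (5.1, 4.9), (9.7, 0.4)]
histogram = bof_vector(descriptors, dictionary)
```

Two images can then be compared through their histograms, typically after some normalization.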
![Page 15: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/15.jpg)
Hamming Embedding [Jégou et al., 2008]
• Each local descriptor of an image is encoded by a binary signature that refines its visual-word assignment.
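A rough sketch of how such binary signatures could be built and compared. The projected values, the thresholds, and the function names are assumptions for illustration; in the cited work the thresholds are learned offline per visual word, which is not reproduced here.

```python
def binary_signature(projected, thresholds):
    """One bit per dimension: 1 if the projected value exceeds its threshold."""
    return tuple(int(p > t) for p, t in zip(projected, thresholds))

def hamming(a, b):
    """Number of positions where two signatures differ."""
    return sum(x != y for x, y in zip(a, b))

def he_match(word_a, sig_a, word_b, sig_b, max_dist):
    """Two descriptors match if they share a visual word and their signatures are close."""
    return word_a == word_b and hamming(sig_a, sig_b) <= max_dist

# Hypothetical thresholds for one visual word (assumed learned beforehand).
thresholds = (0.0, 0.5, -1.0, 2.0)
sig_query = binary_signature((0.3, 0.1, -0.2, 2.5), thresholds)
sig_db = binary_signature((0.4, 0.9, -0.5, 2.1), thresholds)
same_word = he_match(7, sig_query, 7, sig_db, max_dist=1)
other_word = he_match(7, sig_query, 8, sig_db, max_dist=1)
```

The signature filters out descriptors that fall in the same cell of the dictionary but are still far apart.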
![Page 16: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/16.jpg)
Fisher Vector (FV) [Perronnin et al., 2007]
• Cluster the local descriptors by GMM
• Fisher Kernel
• Fisher Vector
[Diagram: Images, Local descriptors, GMM, Fisher Vector]
![Page 17: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/17.jpg)
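VLAD [Jégou et al., 2012]
• Replace the GMM in FV by k-means clustering
• Approximate FV by accumulating, for each centroid, the residuals between the centroid and the descriptors assigned to it
[Diagram: Images, Local descriptors, K-means, VLAD Vector]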
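The residual accumulation behind VLAD can be sketched as follows. This follows the commonly stated formulation (sum of descriptor-minus-centroid residuals per centroid, then L2 normalization); the centroids are assumed to come from k-means, and the toy data and the name `vlad_vector` are hypothetical.

```python
def vlad_vector(descriptors, centroids):
    """Concatenate per-centroid sums of residuals, then L2-normalize."""
    dim = len(centroids[0])
    blocks = [[0.0] * dim for _ in centroids]
    for x in descriptors:
        # Assign the descriptor to its nearest centroid.
        i = min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(x, centroids[k])))
        # Accumulate the residual x - c_i in the block of that centroid.
        for j in range(dim):
            blocks[i][j] += x[j] - centroids[i][j]
    flat = [v for block in blocks for v in block]
    norm = sum(v * v for v in flat) ** 0.5
    return [v / norm for v in flat] if norm > 0 else flat

# Hypothetical centroids (assumed learned by k-means) and descriptors.
centroids = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (0.0, 2.0), (9.0, 10.0)]
vlad = vlad_vector(descriptors, centroids)
```

The output has one block of the descriptor dimension per centroid, so its length is the number of centroids times the descriptor dimension.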
![Page 18: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/18.jpg)
Overview of the system
[Diagram: Query, Features, Matching, Database]
![Page 19: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/19.jpg)
Distances and similarities
• Euclidean distances
• Hamming distances
• Inner product
• Approximated distances (asymmetric distance computation, ADC):
Ø Distance between the query vector and a compressed database vector.
Ø [Jégou et al., 2011]
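A minimal sketch of asymmetric distance computation on top of a toy product quantizer: the database vector is stored as codeword indices, while the query stays uncompressed. The codebooks are hypothetical and assumed to be already trained; `split`, `pq_encode`, and `adc_distance` are illustrative names.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def split(vector, m):
    """Split a vector into m equal-length sub-vectors."""
    step = len(vector) // m
    return [vector[i * step:(i + 1) * step] for i in range(m)]

def pq_encode(vector, codebooks):
    """Replace each sub-vector by the index of its nearest codeword."""
    return [min(range(len(book)), key=lambda k: sq_dist(sub, book[k]))
            for sub, book in zip(split(vector, len(codebooks)), codebooks)]

def adc_distance(query, code, codebooks):
    """Squared distance between an uncompressed query and a quantized vector."""
    subs = split(query, len(codebooks))
    # Lookup table: distance from each query sub-vector to every codeword.
    table = [[sq_dist(sub, word) for word in book]
             for sub, book in zip(subs, codebooks)]
    return sum(table[j][code[j]] for j in range(len(code)))

# Hypothetical codebooks: two sub-quantizers with two codewords each.
codebooks = [[(0.0, 0.0), (1.0, 1.0)], [(0.0, 1.0), (1.0, 0.0)]]
code = pq_encode((0.9, 1.1, 0.1, 0.8), codebooks)
distance = adc_distance((1.0, 1.0, 0.0, 0.0), code, codebooks)
```

The lookup table is computed once per query and reused for every database code, which is what makes the scan cheap.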
![Page 20: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/20.jpg)
Nearest neighbor search
[Diagram: Query, Features, Database, Matching / Nearest neighbor search]
![Page 21: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/21.jpg)
Nearest neighbor search
[Figure: Nearest neighbor]
![Page 22: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/22.jpg)
Indexing and compressing data [Jégou et al., 2011]
• Coarse-to-fine strategy
Ø Use quantization techniques to build an inverted file (IVF)
[Diagram: Inverted file with lists c1: 1, 3; c2: 2; c3: 4, 5, 6. Each entry stores an id and a code of m bytes (compressed vector)]
• Faster search
• Better memory footprint
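The coarse-to-fine idea can be sketched with a toy inverted file that reproduces the list layout of the diagram (c1: 1, 3; c2: 2; c3: 4, 5, 6). Everything here is an assumption for illustration: the coarse centroids and vectors are made up, and the entries store raw vectors rather than m-byte codes to keep the example short.

```python
from collections import defaultdict

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_cell(vector, coarse_centroids):
    return min(range(len(coarse_centroids)),
               key=lambda c: sq_dist(vector, coarse_centroids[c]))

def build_ivf(vectors, coarse_centroids):
    """Group (id, vector) entries by their nearest coarse centroid."""
    inverted_file = defaultdict(list)
    for vec_id, vec in enumerate(vectors, start=1):
        inverted_file[nearest_cell(vec, coarse_centroids)].append((vec_id, vec))
    return inverted_file

def search_ivf(query, inverted_file, coarse_centroids, n_probe=1):
    """Scan only the n_probe lists closest to the query."""
    cells = sorted(range(len(coarse_centroids)),
                   key=lambda c: sq_dist(query, coarse_centroids[c]))[:n_probe]
    candidates = [entry for c in cells for entry in inverted_file.get(c, [])]
    if not candidates:
        return None
    return min(candidates, key=lambda entry: sq_dist(query, entry[1]))[0]

# Hypothetical coarse centroids c1, c2, c3 and six database vectors (ids 1..6).
coarse_centroids = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
vectors = [(1.0, 0.0), (9.0, 1.0), (0.0, 1.0), (1.0, 9.0), (0.0, 11.0), (-1.0, 10.0)]
ivf = build_ivf(vectors, coarse_centroids)
nearest_id = search_ivf((0.5, 9.5), ivf, coarse_centroids)
```

With `n_probe=1` only the list of c3 is scanned for this query, so three of the six vectors are never touched.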
![Page 23: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/23.jpg)
Quantization techniques [Jégou et al., 2011]
• Compress the data for better memory footprint
• Search accuracy is acceptable with appropriate parameters
Ø Recall = 95% with 64-bit codes
[Diagram: Inverted file with lists c1: 1, 3; c2: 2; c3: 4, 5, 6. Each entry stores an id and a code of m bytes]
![Page 24: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/24.jpg)
Feature processing
• Square rooting [Arandjelovic & Zisserman, 2012]
• L2-normalization [Jain et al., 2012]
• Centralization [Tolias et al., 2013]
• Down-weight highly populated cells in aggregation [Jégou et al., 2009]
• Whitening [Jégou et al., 2010]
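Three of these steps are simple enough to sketch directly. The code below is one possible reading, not the exact procedure of the cited papers: square rooting is shown as L1 normalization followed by an element-wise square root, and centralization as subtracting the mean vector. Down-weighting and whitening are left out, and all names and values are hypothetical.

```python
import math

def root_descriptor(descriptor):
    """Square rooting: L1-normalize, then take the element-wise square root."""
    total = sum(abs(v) for v in descriptor)
    if total == 0:
        return list(descriptor)
    return [math.sqrt(abs(v) / total) for v in descriptor]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)

def center(vectors):
    """Centralization: subtract the mean vector of the collection."""
    n = len(vectors)
    mean = [sum(column) / n for column in zip(*vectors)]
    return [[v - m for v, m in zip(vec, mean)] for vec in vectors]

# Hypothetical non-negative descriptor (SIFT-like histogram values).
rooted = root_descriptor([1.0, 3.0, 0.0, 12.0])
unit = l2_normalize([3.0, 4.0])
centered = center([[1.0, 2.0], [3.0, 6.0]])
```

A side effect of this form of square rooting is that the result already has unit L2 norm, because the squared entries sum to the L1-normalized total.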
![Page 25: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/25.jpg)
Image processing: re-ranking
• Estimate a transformation between the query region and each target image.
• Target images are re-ranked based on the discriminability of the spatially verified visual words.
mAP with BoF: 0.618→0.645 [Philbin et al., 2007]
Dataset: Oxford Buildings
[Figure: Queries]
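A heavily simplified sketch of spatial re-ranking. It assumes a translation-only transformation hypothesized from each single correspondence, and it scores an image by the raw number of inliers instead of a discriminability-weighted sum, so it is a stand-in for the idea rather than the cited method. The match coordinates and function names are hypothetical.

```python
def count_inliers(matches, tolerance):
    """Best inlier count over translation hypotheses, one per correspondence."""
    best = 0
    for (qx, qy), (tx, ty) in matches:
        dx, dy = tx - qx, ty - qy  # translation implied by this correspondence
        inliers = sum(
            1 for (ax, ay), (bx, by) in matches
            if abs(ax + dx - bx) <= tolerance and abs(ay + dy - by) <= tolerance
        )
        best = max(best, inliers)
    return best

def rerank(shortlist, tolerance=2.0):
    """Re-rank (image_id, matches) pairs by spatially verified match count."""
    scored = [(count_inliers(matches, tolerance), image_id)
              for image_id, matches in shortlist]
    return [image_id for score, image_id in sorted(scored, key=lambda s: -s[0])]

# Hypothetical correspondences: (query point, target point) pairs.
matches_a = [((0, 0), (5, 1)), ((1, 0), (0, 7)), ((2, 2), (9, 9))]
matches_b = [((0, 0), (10, 5)), ((1, 2), (11, 7)), ((3, 1), (13, 6)), ((4, 4), (0, 0))]
reranked = rerank([("a", matches_a), ("b", matches_b)])
```

Image "b" moves ahead of "a" because three of its matches agree on one translation, while the matches of "a" are mutually inconsistent.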
![Page 26: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/26.jpg)
Image processing: query expansion
• Requery after reconstructing the original query.
• The new query is constructed from the spatially verified results of the first retrieval round.
mAP with BoF: 0.645→0.696 [Chum et al., 2007]
Dataset: Oxford Buildings
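One simple variant, often called average query expansion, can be sketched as follows. It assumes images are represented by vectors compared with the inner product and that the top results of the first round have all passed spatial verification; the database, the vectors, and the function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def search(query, database, top_k):
    """Rank database ids by inner-product similarity to the query."""
    ranked = sorted(database, key=lambda image_id: -dot(query, database[image_id]))
    return ranked[:top_k]

def expand_query(query, database, verified_ids):
    """Average the query with the vectors of the verified first-round results."""
    vectors = [query] + [database[i] for i in verified_ids]
    return [sum(column) / len(vectors) for column in zip(*vectors)]

# Hypothetical image vectors.
database = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.8, 0.6, 0.0],
    "c": [0.0, 0.9, 0.4],
    "d": [0.0, 0.0, 1.0],
}
query = [1.0, 0.0, 0.0]
first_round = search(query, database, top_k=2)
# Assumption: both first-round results passed spatial verification.
expanded = expand_query(query, database, first_round)
second_round = search(expanded, database, top_k=3)
```

The expanded query picks up the second component contributed by "b", which is what lets "c" score above "d" in the second round.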
![Page 27: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/27.jpg)
Part 2: State-of-the-art results
![Page 28: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/28.jpg)
Nearest neighbor search
• Datasets: 1M~1B vectors with ground truth data
Ø BIGANN dataset: http://corpus-texmex.irisa.fr/
• Evaluation
Ø recall@R = the proportion of queries with NN ranked in top-R results.
[Diagram: Inverted file with lists c1: 1, 3; c2: 2; c3: 4, 5, 6. Each entry stores an id and a code of m bytes (compressed vector)]
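The recall@R definition above translates almost directly into code. The ranked lists and ground-truth ids below are made up for illustration, and `recall_at_r` is a hypothetical helper name.

```python
def recall_at_r(ranked_results, true_neighbors, r):
    """Proportion of queries whose true nearest neighbor is in the top-R results."""
    hits = sum(1 for ranked, nn in zip(ranked_results, true_neighbors)
               if nn in ranked[:r])
    return hits / len(true_neighbors)

# Hypothetical ranked result ids for three queries and their true nearest neighbors.
ranked_results = [[3, 7, 1], [4, 2, 9], [5, 6, 8]]
true_neighbors = [7, 9, 0]
recall_1 = recall_at_r(ranked_results, true_neighbors, 1)
recall_3 = recall_at_r(ranked_results, true_neighbors, 3)
```

Here no query has its true neighbor at rank 1, and two of the three have it within the top 3.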
![Page 29: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/29.jpg)
Quantization techniques
• Additive Quantization [Babenko et al., 2014]
• Approximate a vector by the sum of codewords.
• Learn codewords by an iterative optimization.
• Composite Quantization [Zhang et al., 2014]
• Minimize the approximation error under a near-orthogonality constraint on the dictionaries.
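The "sum of codewords" idea can be illustrated with a toy example in which every codeword has the full dimension of the vector and one codeword is picked per codebook. The codebooks are hypothetical and given rather than learned, and the exhaustive search in `aq_encode` is a stand-in that only works at toy scale, not the encoding procedure of the cited paper.

```python
from itertools import product

def aq_decode(code, codebooks):
    """Reconstruct a vector as the sum of one codeword from each codebook."""
    dim = len(codebooks[0][0])
    return [sum(codebooks[m][k][j] for m, k in enumerate(code)) for j in range(dim)]

def aq_encode(vector, codebooks):
    """Exhaustive search over all codeword combinations (toy sizes only)."""
    def error(code):
        return sum((v - a) ** 2 for v, a in zip(vector, aq_decode(code, codebooks)))
    return min(product(*(range(len(book)) for book in codebooks)), key=error)

# Hypothetical codebooks: two codebooks with two full-dimensional codewords each.
codebooks = [[(1.0, 0.0), (0.0, 1.0)], [(0.5, 0.5), (-0.5, 0.5)]]
code = aq_encode((1.4, 0.6), codebooks)
approximation = aq_decode(code, codebooks)
```

Only the tuple of indices needs to be stored per database vector; the codebooks are shared.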
![Page 30: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/30.jpg)
Indexing techniques
• Multi-indexing [Babenko et al., 2012, 2015]
• Performance on a dataset of one billion SIFT vectors
Ø Memory: 12 GB
Ø Search time: 2 ms/query
Ø recall@100 = 70%
![Page 31: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/31.jpg)
Image search
• Datasets: Oxford building dataset [Philbin et al., 2007]
• Evaluation
Ø mAP: Mean average precision for a set of queries is the mean of the average precision scores for each query.
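A small sketch of the mAP definition above, using the plain textbook form of average precision (mean of the precision values at the ranks of the relevant items). Benchmark protocols may differ in details that are not modeled here; the ranked lists and relevance sets are made up.

```python
def average_precision(ranked_ids, relevant_ids):
    """Mean of the precision values measured at each relevant result."""
    relevant = set(relevant_ids)
    hits, precision_sum = 0, 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """Mean of the per-query average precision scores."""
    return sum(average_precision(ranked, relevant) for ranked, relevant in runs) / len(runs)

# Hypothetical runs: (ranked result ids, ids of the relevant images) per query.
runs = [
    (["a", "b", "c", "d"], {"a", "c"}),
    (["x", "y", "z"], {"y"}),
]
ap_first = average_precision(*runs[0])
map_score = mean_average_precision(runs)
```

For the first query the relevant images sit at ranks 1 and 3, giving (1/1 + 2/3) / 2; the second query contributes 1/2.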
![Page 32: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/32.jpg)
Selective Match Kernel
• [Tolias et al., 2013]
• Apply the power normalization to each VLAD component to improve the accuracy.
• Use hashing to reduce the memory footprint.
• mAP = 0.817 on Oxford5K dataset [Philbin et al., 2007]
![Page 33: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/33.jpg)
Neural Codes
• [Babenko et al., 2014]
• Use features extracted from neural networks for object retrieval.
• Features are fine-tuned.
• mAP = 0.435 with fc6 features on Oxford5K dataset.
![Page 34: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/34.jpg)
Sum-pooled convolutional features
• [Babenko et al., 2015]
• Deep features are sum-pooled and Gaussian weighted to improve the accuracy.
• mAP = 0.657 on Oxford5K dataset.
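A sketch of Gaussian-weighted sum pooling over a convolutional feature map. The layout `feature_map[y][x][channel]`, the choice of `sigma`, and the function name are assumptions; any further post-processing beyond L2 normalization is left out.

```python
import math

def spoc_descriptor(feature_map, sigma):
    """Sum-pool feature_map[y][x][channel] with Gaussian weights centered on the map."""
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    cy, cx = (height - 1) / 2, (width - 1) / 2
    pooled = [0.0] * channels
    for y in range(height):
        for x in range(width):
            # Locations near the center of the map contribute more.
            weight = math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * sigma ** 2))
            for c in range(channels):
                pooled[c] += weight * feature_map[y][x][c]
    norm = math.sqrt(sum(v * v for v in pooled))
    return [v / norm for v in pooled] if norm > 0 else pooled

# Hypothetical 3x3 map with 2 channels: channel 0 fires at the center,
# channel 1 fires at the top-left corner.
feature_map = [[[0.0, 0.0] for _ in range(3)] for _ in range(3)]
feature_map[1][1][0] = 1.0
feature_map[0][0][1] = 1.0
descriptor = spoc_descriptor(feature_map, sigma=1.0)
```

The corner activation is down-weighted by exp(-1) relative to the center one, which reflects the assumption that the object of interest tends to be near the image center.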
![Page 35: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/35.jpg)
Summary of image retrieval results

| Method | Feature | Framework | mAP |
|---|---|---|---|
| ASMK [Tolias et al., 2013] | SIFT | VLAD | 0.817 |
| Neural codes [Babenko et al., 2014] | Deep features | - | 0.435 |
| SPoC [Babenko et al., 2015] | Deep features | SPoC | 0.657 |

• Search frameworks with deep features in object retrieval still need to be improved.
![Page 36: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/36.jpg)
Part 3: Future attempts
![Page 37: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/37.jpg)
Attempts on current topics
• Improve the features:
Ø Feature fusion
Ø Find new match kernels
Ø Improve the system with deep features?
• Improve the distance metrics and NN search.
![Page 38: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/38.jpg)
Dual-process system
• [Stanovich et al., 1999, 2004]
• Fast, high capacity, implicit knowledge and basic emotions only.
• Slow, limited capacity, explicit knowledge and complicated emotions.
![Page 39: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/39.jpg)
Supervised Object Retrieval?
• More than just applying deep features to retrieval.
• Learning while searching?
• Learning with feedback?
![Page 40: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/40.jpg)
The Duality of Object Retrieval
• The collaboration between unsupervised learning and supervised learning in object retrieval.
[Stanovich et al., 1999, 2004]
![Page 41: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/41.jpg)
Conclusion
• Basic Object Retrieval
Ø Features: SIFT, HOG, GIST, deep features
Ø Distance metrics and NN search
Ø Hamming Embedding and Aggregation
Ø Pre-processing and post-processing
• State-of-the-art results
• Future attempts: Duality & Supervised & Unsupervised?
![Page 42: A Survey about Object Retrieval](https://reader031.fdocuments.us/reader031/viewer/2022030309/58f1e7a81a28abe4098b45e9/html5/thumbnails/42.jpg)
Thank you for listening