Compact Descriptors for Visual Search
![Page 1: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/1.jpg)
Compact Descriptors 4 Visual Search
Danilo Pau ([email protected])
Senior Principal Engineer
Senior Member of Technical Staff
SMIEEE
SI/CVRP
STMicroelectronics/AST
Courtesy: M. Funamizu
![Page 2: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/2.jpg)
Agenda
• Visual Search: Context
• MPEG initiative on Visual Search
• Compact Descriptors for Visual Search
• Implementation
• Use Cases
• Visual Search Evolution: Moving Pictures and 3D
• Question and Answers
2
15/01/2013Presentation Title
![Page 3: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/3.jpg)
![Page 4: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/4.jpg)
Visual Search Context
• Millions of images and videos continue to be uploaded to remote servers all over the world
• Each day on Facebook 300 million photos are uploaded
• roughly 58 photos uploaded each second
• One hour of video uploaded to YouTube every second
![Page 5: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/5.jpg)
Content-Based Image Retrieval (CBIR)
• CBIR covers the concept of search that analyzes the actual content in the image, rather than relying on metadata.
• The development of this concept incorporated many algorithms and techniques from fields such as statistics, pattern recognition and computer vision.
• CBIR attracted a lot of attention and, after many years of research, it has expanded towards the marketplace.
• The application of CBIR to the mobile market is called Mobile Visual Search.
• Visual Search is the capability to initiate a search using an image that captures a rigid object as the query.
• The market potential of mobile visual search covers any mobile device with a camera (phones, tablets and hybrids).
![Page 6: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/6.jpg)
CBIR vs QR Codes
• Quick Response (QR) codes are a type of two-dimensional barcode.
• The code is scanned by the mobile imager to produce a URL address for redirection and browsing.
• QR codes are used by 6.2% of smartphone users in the USA.
![Page 7: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/7.jpg)
Lots of Existing Applications
• Google’s Goggles
• Nokia’s Point and Find
• oMoby
• Like.com
• Kooaba
• Moodstocks
• Snaptell
• pixlinQ
• Bing
![Page 8: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/8.jpg)
Existing Apps use JPEG
• Existing applications use the mobile imager to capture the query and send it as a JPEG-compressed image
[Diagram: the mobile device sends JPEG images to a remote server with a database; the server returns the visual search result]
![Page 9: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/9.jpg)
An Example of Visual Search
Courtesy Telecom Italia
[Figure panels: Query, Interest Point Description, Descriptor pairing, Inliers]
![Page 10: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/10.jpg)
The Rise of Compressed Descriptors
• Alternatively, send “compact features” extracted from raw images
• For example, Scale Invariant Feature Transform (SIFT) visual descriptors
• Consider 1200 descriptors, each 128 bytes plus 4 bytes for coordinates, at 30 fps → network load of nearly 38 Mbit/s → unacceptable
[Bar chart: size in KB (axis 0 to 160) of a VGA image sent as JPEG High, JPEG Low and SIFT]
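The 38 Mbit/s figure can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the numbers quoted on the slide; the function and parameter names are illustrative, not part of any standard.

```python
def sift_stream_mbit_per_s(descriptors_per_frame=1200,
                           descriptor_bytes=128,
                           coordinate_bytes=4,
                           frames_per_second=30):
    """Network load, in Mbit/s, of streaming uncompressed SIFT descriptors."""
    bytes_per_frame = descriptors_per_frame * (descriptor_bytes + coordinate_bytes)
    bits_per_second = bytes_per_frame * 8 * frames_per_second
    return bits_per_second / 1e6


if __name__ == "__main__":
    # 1200 * 132 bytes = 158,400 bytes per frame, i.e. about 38 Mbit/s at 30 fps
    print(f"{sift_stream_mbit_per_s():.3f} Mbit/s")
```

Even a single frame (about 158 KB) is already in the range of a JPEG image, which is what motivates compressing the descriptors themselves.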
![Page 11: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/11.jpg)
Systems Considered
• Instead of sending images (a),
• the application can send compact descriptors (b),
• and even perform the search locally (c).
![Page 12: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/12.jpg)
Previous Attempts
• Hashing
  • Locality Sensitive Hashing [Yeo et al., 2008]
  • Similarity Sensitive Coding [Torralba et al., 2008]
  • Spectral Hashing [Weiss et al., 2008]
• Transform Coding
  • Karhunen-Loève Transform [Chandrasekhar et al., 2009]
  • ICA based Transform [Narozny et al., 2008]
• Vector Quantization
  • Product Quantization [Jegou et al., 2010]
  • Tree Structured Vector Quantization [Nistér et al., 2006]
• Alternative to SIFT
  • Compressed Histogram of Gradients [Chandrasekhar et al., 2011]
![Page 13: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/13.jpg)
![Page 14: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/14.jpg)
Is a standard on Visual Search needed?
• Reduce load on wireless networks carrying visual search-related information.
• Ensure interoperability of visual search applications and databases.
• Enable hardware support for descriptor extraction and matching in mobile devices.
• Enable a high level of performance of implementations conformant to the standard.
• Simplify design of descriptor extraction and matching for visual search applications.
![Page 15: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/15.jpg)
What is a suitable standardization body?
• Informal title:
  • Moving Picture Experts Group (MPEG)
• Formal title:
  • ISO/IEC JTC1 SC29 WG11 (Coding of Moving Pictures and Audio)
• Parent SDOs:
  • ISO: International Organization for Standardization
  • IEC: International Electrotechnical Commission
  • JTC 1: Joint Technical Committee One
  • SC29: Subcommittee 29: Coding of Audio, Picture, Multimedia and Hypermedia Information
• Members: National Bodies (25 voting, 16 observers)
[Diagram: JTC 1 → SC29 → WG11 (MPEG)]
![Page 16: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/16.jpg)
![Page 17: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/17.jpg)
![Page 18: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/18.jpg)
CDVS : Scope
• The descriptor extraction process needed to ensure interoperability.
• The bitstream of compact descriptors.
[Diagram: Query Image → Descriptor extraction → Descriptor bitstream → Descriptor matching (using the Database) → Geometric verification → List of results; a "Standard" label marks the standardized part of the chain]
![Page 19: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/19.jpg)
Requirements
• Robustness
  • High matching accuracy shall be achieved at least for images of textured rigid objects, landmarks, and printed documents.
  • The matching accuracy shall be robust to changes in vantage points, camera parameters, lighting conditions, as well as in the presence of partial occlusions.
• Sufficiency
  • Descriptors shall be self-contained, in the sense that no other data are necessary for matching.
• Compactness
  • Shall minimize lengths/size of image descriptors.
• Scalability
  • Shall allow adaptation of descriptor lengths to support the required performance level and database size.
  • Shall enable design of web-scale visual search applications and databases.
![Page 20: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/20.jpg)
How to achieve robustness
• Image content is transformed into visual features with coordinates that are invariant to illumination, scale, rotation, affine and perspective transforms
![Page 21: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/21.jpg)
Types of invariance
• Illumination
![Page 22: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/22.jpg)
Types of invariance
• Illumination
• Scale
![Page 23: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/23.jpg)
Types of invariance
• Illumination
• Scale
• Rotation
![Page 24: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/24.jpg)
Types of invariance
• Illumination
• Scale
• Rotation
• Affine Transform
![Page 25: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/25.jpg)
Types of invariance
• Illumination
• Scale
• Rotation
• Affine Transform
• Full Perspective
![Page 26: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/26.jpg)
Compactness
[Bar chart: size in KB (axis 0 to 160) of a VGA image sent as JPEG High, JPEG Low and SIFT, compared with compact descriptors of 512B, 1KB, 2KB, 4KB, 8KB and 16KB]
![Page 27: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/27.jpg)
Extraction Pipeline
[Block diagram from Image to Compact descriptors. Blocks: Resizing, DoG, SIFT, Keypoint selection, Local Description Extraction, Encoding, Transform & SQ (H Mode), MSVQ encoding (S Mode), SCFV Descriptor, Coordinate coding, Arithmetic coding]
• H-Mode uses SQ encoding (256B)
• S-Mode uses MSVQ encoding (38KB)
• Both modes use SCFV (49KB)
![Page 28: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/28.jpg)
Properties of SIFT
David Lowe’s local descriptor detection and extraction (1999-2004)
Extraordinarily robust matching technique
• Can handle changes in viewpoint
  • Up to about 30 degree out of plane rotation
• Can handle significant changes in illumination
  • Sometimes even day vs. night
• Lots of code available → http://www.vlfeat.org (BSD license)
![Page 29: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/29.jpg)
Pyramid of DoG
[Figure: pyramid of Difference-of-Gaussians (DoG) images, spanning Octave 1 to Octave n, each with Scale 1 to Scale m]
![Page 30: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/30.jpg)
Actual Interest Point Detector Output
![Page 31: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/31.jpg)
Building a Descriptor
• Take a 16x16 patch window around the detected interest point
• Subdivide the patch into a 4x4 grid of sub-patches
• For each sub-patch, create an 8-bin histogram over edge orientations, weighted by magnitude
• These lead to a 4x4x8 = 128-element vector → the SIFT descriptor
[Figure: angle histogram over 0 to 2π]
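The steps above can be sketched in plain Python. This is a simplified illustration, not Lowe's full algorithm: it assumes the patch is already a 16x16 grayscale array aligned to the keypoint orientation, and it omits the Gaussian weighting, bin interpolation and clipping of real SIFT. The function name is hypothetical.

```python
import math


def sift_like_descriptor(patch):
    """Return a 128-element vector from a 16x16 grayscale patch (list of rows)."""
    assert len(patch) == 16 and all(len(row) == 16 for row in patch)
    hist = [0.0] * (4 * 4 * 8)  # 4x4 sub-patches, 8 orientation bins each
    for y in range(16):
        for x in range(16):
            # Gradient by central differences, clamped at the patch border.
            dx = patch[y][min(x + 1, 15)] - patch[y][max(x - 1, 0)]
            dy = patch[min(y + 1, 15)][x] - patch[max(y - 1, 0)][x]
            magnitude = math.hypot(dx, dy)
            angle = math.atan2(dy, dx) % (2 * math.pi)  # orientation in [0, 2*pi)
            orientation_bin = int(angle / (2 * math.pi) * 8) % 8
            cell = (y // 4) * 4 + (x // 4)  # which 4x4-pixel sub-patch
            hist[cell * 8 + orientation_bin] += magnitude
    # L2-normalize so the vector is less sensitive to contrast changes.
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


if __name__ == "__main__":
    # A horizontal intensity ramp: every gradient points along +x.
    ramp = [[float(x) for x in range(16)] for _ in range(16)]
    descriptor = sift_like_descriptor(ramp)
    print(len(descriptor))
```

For the ramp patch, all the weight lands in the first orientation bin of each sub-patch, which is a quick sanity check on the binning.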
![Page 32: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/32.jpg)
Key point selection
• Basic idea: inlier features behave differently, in a statistical sense, from outlier features.
• A relevance value is computed for each key point by taking into account its distance from the center, scale, orientation, peak, and the mean and variance of the SIFT descriptor.
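Selection by relevance can be illustrated as a rank-and-truncate step. The scoring function below is purely hypothetical: it uses only three of the listed cues (distance from the image center, scale, peak) and combines them with a made-up product, whereas the slide does not say how the cues are combined. Only the overall pattern (score every key point, keep the best ones within a budget) is taken from the text.

```python
import math
from dataclasses import dataclass


@dataclass
class Keypoint:
    x: float
    y: float
    scale: float
    peak: float  # detector response at the key point


def relevance(kp, width, height):
    """HYPOTHETICAL relevance: central, large-scale, strong-peak points score higher."""
    cx, cy = width / 2, height / 2
    max_dist = math.hypot(cx, cy)
    centrality = 1.0 - math.hypot(kp.x - cx, kp.y - cy) / max_dist
    return centrality * kp.scale * abs(kp.peak)


def select_keypoints(keypoints, width, height, budget):
    """Keep the `budget` key points with the highest relevance."""
    ranked = sorted(keypoints,
                    key=lambda kp: relevance(kp, width, height),
                    reverse=True)
    return ranked[:budget]


if __name__ == "__main__":
    points = [Keypoint(0, 0, 2.0, 0.5), Keypoint(320, 240, 2.0, 0.5)]
    print(select_keypoints(points, 640, 480, budget=1))
```

The budget is what ties selection to descriptor length: a smaller target size simply keeps fewer key points.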
![Page 33: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/33.jpg)
Local Descriptor Compression: H mode
• The main idea is to generate a compressed descriptor from uncompressed SIFT by
  • Simple linear combinations of histograms
  • Scalar quantisation of the resultant values
  • Adaptive arithmetic coding
• Main benefits
  • Very low computational complexity
  • Negligible memory requirements
  • Highly scalable
  • Allows for very efficient matching and retrieval
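The first two steps can be sketched as follows. The particular linear combination (differences of neighbouring orientation bins), the three-level quantizer and the threshold value are all assumptions chosen for illustration; the slide does not specify them. The adaptive arithmetic coding of the resulting symbols is left out.

```python
def combine(histogram):
    """ASSUMED linear combination: difference of each bin with its neighbour."""
    n = len(histogram)
    return [histogram[i] - histogram[(i + 1) % n] for i in range(n)]


def quantize(values, threshold):
    """Scalar quantisation to three levels: -1, 0 or +1 (an assumption)."""
    return [-1 if v < -threshold else 1 if v > threshold else 0 for v in values]


def compress_descriptor(descriptor, threshold=0.05, bins=8):
    """Turn a 128-element SIFT vector into 128 small-alphabet symbols."""
    symbols = []
    for start in range(0, len(descriptor), bins):
        cell_histogram = descriptor[start:start + bins]
        symbols.extend(quantize(combine(cell_histogram), threshold))
    return symbols


if __name__ == "__main__":
    descriptor = [0.0] * 128
    descriptor[0] = 1.0
    print(compress_descriptor(descriptor)[:8])
```

Only additions, subtractions and comparisons are involved, and no codebook has to be stored, which is consistent with the low complexity and negligible memory listed as benefits.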
![Page 34: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/34.jpg)
Vector Quantizer Scheme: S-Mode
![Page 35: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/35.jpg)
Location Encoding
• Histogram Map: The positions of the nonzero bins are encoded as binary words through scanning columns and compressing the words by arithmetic coding.
• Histogram Count: The number of coordinates in the nonzero bins is encoded in an iterative fashion, by specifying first which bins contain more than 1 key point, then by specifying which among these contain more than 2 key points, and so forth.
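Both structures can be built before any entropy coding is applied. The sketch below assumes key point coordinates are quantized onto a grid of square blocks (the block size is a made-up parameter) and that coordinates satisfy 0 <= x < width and 0 <= y < height. The arithmetic coding stage is omitted.

```python
def location_histogram(points, width, height, block):
    """Count key points per spatial bin; counts[row][col]."""
    cols = -(-width // block)   # ceiling division
    rows = -(-height // block)
    counts = [[0] * cols for _ in range(rows)]
    for x, y in points:
        counts[int(y) // block][int(x) // block] += 1
    return counts


def histogram_map(counts):
    """One binary word per column: 1 where the bin holds at least one key point."""
    rows, cols = len(counts), len(counts[0])
    return [[1 if counts[r][c] > 0 else 0 for r in range(rows)]
            for c in range(cols)]


def histogram_count_layers(counts):
    """Layer k flags, for the bins still in play, whether they hold more than k points."""
    rows, cols = len(counts), len(counts[0])
    # Nonzero bins in the same column-wise scan order as the map.
    values = [counts[r][c] for c in range(cols) for r in range(rows)
              if counts[r][c] > 0]
    layers, k = [], 1
    while values:
        layers.append([1 if v > k else 0 for v in values])
        values = [v for v in values if v > k]
        k += 1
    return layers


if __name__ == "__main__":
    pts = [(1, 1), (5, 1), (5, 2), (5, 3), (6, 6), (7, 7)]
    counts = location_histogram(pts, width=8, height=8, block=4)
    print(histogram_map(counts))
    print(histogram_count_layers(counts))
```

Because most nonzero bins hold a single key point, each successive layer is shorter than the previous one, so the counts cost few bits.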
![Page 36: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/36.jpg)
![Page 37: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/37.jpg)
Extraction times
• SIFT interest point detection and feature extraction make the biggest contribution to extraction time
• Global descriptor extraction is as complex as interest point detection
• Local descriptor and coordinate encoding are very fast
15/01/2013 Quantitative evaluation of CDVS extraction and pairwise matching
![Page 38: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/38.jpg)
![Page 39: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/39.jpg)
Mobile Visual Search: Music CDs
[Figure labels: Query, Stream Music]
![Page 40: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/40.jpg)
Visual Search: eReaders, Printers
[Diagram labels: Snapshot; Paper-copy; Initiate Visual Search; Mass Storage; Send Compact Query; Selective quality & content printing; Multimedia Content Retrieval From the cloud; Augmentation Rendering; Composition of augmentations and image; Augmentation 3D models and markers; Transmission of markers and 3D models; 2D / 3D Rendering; Content Augmentation]
![Page 41: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/41.jpg)
News Finder
Still Pictures - Visual Search
![Page 42: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/42.jpg)
Applications and Use Cases from the Broadcaster’s Point of View
• Logo Detection
• Interactive Fruition
Courtesy RAI
![Page 43: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/43.jpg)
Automotive 3D Top View
[Diagram labels: ECU and four cameras (Cam)]
![Page 44: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/44.jpg)
Automotive 3D Top View
![Page 45: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/45.jpg)
Moving Pictures Visual Search
Courtesy Telecom Design
![Page 46: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/46.jpg)
![Page 47: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/47.jpg)
Inter Predicted Descriptors
• Desirable Properties:
  • An inter descriptor is coded in a compact visual stream
  • Expressed in terms of one or more temporally neighboring descriptors
  • The "inter" part of the term refers to the use of Inter Frame Prediction
  • Designed to achieve higher compression rates and/or better precision-recall performance
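The idea of expressing a descriptor in terms of a temporally neighboring one can be sketched as residual coding. Everything specific here is an assumption made for illustration: descriptors are treated as integer vectors, the reference is the matched descriptor from the previous frame, and the number of nonzero entries stands in for the coding cost.

```python
def encode_inter(current, reference):
    """Residual of the current descriptor with respect to its temporal neighbour."""
    return [c - r for c, r in zip(current, reference)]


def decode_inter(residual, reference):
    """Rebuild the current descriptor from the residual and the reference."""
    return [d + r for d, r in zip(residual, reference)]


def nonzero(values):
    """Crude cost proxy (an assumption): number of nonzero entries."""
    return sum(1 for v in values if v != 0)


def choose_mode(current, reference):
    """Pick inter coding when the residual looks cheaper than the raw descriptor."""
    residual = encode_inter(current, reference)
    if nonzero(residual) < nonzero(current):
        return ("inter", residual)
    return ("intra", list(current))


if __name__ == "__main__":
    previous_frame = [3, 0, 2, 5]
    current_frame = [3, 0, 2, 6]
    mode, payload = choose_mode(current_frame, previous_frame)
    print(mode, payload)
```

When consecutive frames show the same object, most residual entries are zero, which is where the higher compression rate would come from.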
![Page 48: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/48.jpg)
3D Mobile Devices Will Surpass 148 Million in 2015
• Advances in 3D technology are very fast
• Industry adoption opens new opportunities → 3D Visual Search
• From In-Stat studies:
  • ~30% of all handheld game consoles will be 3D by 2015.
  • 3D mobile devices will increase demand for image sensors by 130%.
  • In 2012, notebooks will be the first 3D-enabled mobile devices to reach 1 million units.
  • By 2014, 18% of all tablets will be 3D.
• Nintendo, Fuji, GoPro, Sony, ViewSonic, LG, Origin, Toshiba, Fujitsu, HP, ASUS, Lenovo, Dell, Alienware, HTC and Sharp are focusing on autostereoscopic mobile technologies
![Page 49: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/49.jpg)
Microsoft Kinect
Asus Xtion
Google 3D Warehouse
LG Optimus 3D P920
LG Optimus Pad
HTC EVO 3D
Sharp Aquos SH-12C
3DS by Nintendo
![Page 50: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/50.jpg)
3D Object Recognition with Kinect
http://www.youtube.com/watch?v=eRW1zG_aONk
Courtesy: CV laboratory University of Bologna
SHOT: Unique Signatures of Histograms for Local Surface Description
![Page 51: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/51.jpg)
![Page 52: Compact Descriptors for Visual Search](https://reader033.fdocuments.us/reader033/viewer/2022060116/557d4ca3d8b42a9c148b4dbf/html5/thumbnails/52.jpg)