Paper Overviews



Transcript of Paper Overviews

A Fast Local Descriptor for Dense Matching

Paper Overviews

3 types of descriptors:
- SIFT / PCA-SIFT (Ke, Sukthankar)
- GLOH (Mikolajczyk, Schmid)
- DAISY (Tola et al.; Winder et al.)
Comparison of descriptors (Mikolajczyk, Schmid)

Paper Overviews
- PCA-SIFT: SIFT-based but with a smaller descriptor
- GLOH: modifies the SIFT descriptor for robustness and distinctiveness
- DAISY: novel descriptor that uses graph cuts for matching and depth map estimation

SIFT: Scale Invariant Feature Transform
4 stages:
1. Peak selection
2. Keypoint localization
3. Keypoint orientation
4. Descriptors

SIFT 1. Peak Selection
- Make Gaussian pyramid

SIFT 1. Peak Selection
- Find local peaks using difference of Gaussians
- Peaks are found at different scales
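The peak-selection stage above can be sketched in a few lines of NumPy. This is a minimal illustration, not Lowe's implementation: it blurs one image at several scales, subtracts adjacent blurs to get difference-of-Gaussians (DoG) layers, and keeps samples that are strict extrema among their 26 neighbors in space and scale. The function names and the list of sigmas are assumptions made for this example, and the downsampling between pyramid octaves is left out.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1D Gaussian kernel, truncated at about 3 sigma, normalized to sum to 1."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(image, sigma):
    """Separable Gaussian blur with edge padding (along rows, then columns)."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(image, r, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def dog_stack(image, sigmas):
    """Stack of DoG layers: blur(sigmas[i + 1]) - blur(sigmas[i])."""
    blurred = [gaussian_blur(image, s) for s in sigmas]
    return np.stack([b2 - b1 for b1, b2 in zip(blurred, blurred[1:])])

def local_peaks(dog):
    """(scale, row, col) of samples strictly above or below all 26 neighbors."""
    peaks = []
    n_scales, height, width = dog.shape
    for s in range(1, n_scales - 1):
        for r in range(1, height - 1):
            for c in range(1, width - 1):
                cube = dog[s - 1:s + 2, r - 1:r + 2, c - 1:c + 2]
                others = np.delete(cube.ravel(), 13)  # index 13 is the center sample
                v = dog[s, r, c]
                if v > others.max() or v < others.min():
                    peaks.append((s, r, c))
    return peaks
```

Because the extrema are searched across neighboring DoG layers as well as neighboring pixels, each peak comes with a scale index, which is what "peaks are found at different scales" refers to.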

SIFT 2. Keypoint Localization
Remove peaks that are unstable:
- Peaks in low-contrast areas
- Peaks along edges
Features at such peaks are not distinguishable
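The two rejection tests on this slide can be sketched as follows. The slide does not give thresholds; the values used here (a contrast threshold of 0.03 for intensities in [0, 1] and a principal-curvature ratio of 10) are the ones commonly quoted from Lowe's paper and should be read as assumptions, as should the function name. The sub-pixel refinement step of the full method is omitted.

```python
import numpy as np

def is_stable_peak(dog, r, c, contrast_thresh=0.03, edge_ratio=10.0):
    """Return True if the DoG peak at (r, c) passes both stability tests.

    dog is the 2D DoG layer at the peak's scale; (r, c) must not lie on the border.
    """
    # Test 1: reject peaks in low-contrast areas.
    if abs(dog[r, c]) < contrast_thresh:
        return False

    # Test 2: reject peaks along edges, using the 2x2 Hessian of the DoG layer
    # estimated with finite differences.
    dxx = dog[r, c + 1] - 2 * dog[r, c] + dog[r, c - 1]
    dyy = dog[r + 1, c] - 2 * dog[r, c] + dog[r - 1, c]
    dxy = (dog[r + 1, c + 1] - dog[r + 1, c - 1]
           - dog[r - 1, c + 1] + dog[r - 1, c - 1]) / 4.0
    trace = dxx + dyy
    det = dxx * dyy - dxy ** 2
    if det <= 0:
        # Curvatures of opposite sign, or no curvature in one direction.
        return False
    # A large trace^2 / det means one curvature dominates, i.e. an edge.
    return trace ** 2 / det < (edge_ratio + 1) ** 2 / edge_ratio
```

An isolated blob has similar curvature in both directions and passes; a ridge has curvature in only one direction and is rejected.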

SIFT 3. Keypoint Orientation
- Make histogram of gradients for a patch of pixels
- Orient all patches so the dominant gradient direction is vertical

SIFT 4. Descriptors

Ideal descriptor:
- Compact
- Distinctive from other descriptors
- Robust against lighting / viewpoint changes

SIFT 4. Descriptors

A SIFT descriptor is a 128-element vector:
- 4x4 array of 8-bin histograms
- Each histogram is a smoothed representation of gradient orientations of the patch

PCA-SIFT
Changes step 4 of the SIFT process to create different descriptors

Rationale:
- Construction of SIFT descriptors is complicated
- Reason for constructing them that way is unclear
- Is there a simpler alternative?

PCA-SIFT: Principal Component Analysis (PCA)
- A widely-used method of dimensionality reduction
- Used with SIFT to make a smaller feature descriptor
- By projecting the gradient patch into a smaller space

PCA-SIFT
Creating a descriptor for keypoints:
1. Create patch eigenspace
2. Create projection matrix
3. Create feature vector

PCA-SIFT 1. Create patch eigenspace
For each keypoint:
- Take a 41x41 patch around the keypoint
- Compute horizontal / vertical gradients
Put all gradient vectors for all keypoints into a matrix

PCA-SIFT 1. Create patch eigenspace
- M = matrix of gradients for all keypoints
- Calculate covariance of M
- Calculate eigenvectors of covariance(M)

PCA-SIFT 2. Create projection matrix
- Choose first n eigenvectors
- This paper uses n = 20
- This is the projection matrix
- Store for later use, no need to re-compute

PCA-SIFT 3. Create feature vector
For a single keypoint:
- Take its gradient vector, project it with the projection matrix
- Feature vector is of size n
- This is called Grad PCA in the paper
- Img PCA: use image patch instead of gradient
Size difference: 128 elements (SIFT) vs. n = 20

PCA-SIFT Results
Tested SIFT vs. Grad PCA and Img PCA on a series of image variations:
- Gaussian noise
- 45° rotation followed by 50% scaling
- 50% intensity scaling
- Projective warp

PCA-SIFT Results (Precision-recall curves)
- Grad PCA (black) generally outperforms Img PCA (pink) and SIFT (purple) except when brightness is reduced
- Both PCA methods outperform SIFT with illumination changes

PCA-SIFT Results
- PCA-SIFT also gets more matches correct on images taken at different viewpoints
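The three PCA-SIFT steps from the earlier slides (patch eigenspace, projection matrix, feature vector) can be sketched with NumPy. This is a rough illustration under stated assumptions, not the authors' code: the function names are made up for this example, the gradients come from np.gradient, each gradient vector is scaled to unit length, and the training mean is subtracted before projection (standard PCA practice). The sketch accepts any patch size; the slides use 41x41 patches and n = 20.

```python
import numpy as np

def gradient_vector(patch):
    """Concatenate horizontal and vertical gradients of a patch into one unit vector."""
    gy, gx = np.gradient(patch.astype(float))
    v = np.concatenate([gx.ravel(), gy.ravel()])
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def build_projection(patches, n=20):
    """Steps 1 and 2: eigenspace of the training gradients, top-n eigenvectors."""
    M = np.stack([gradient_vector(p) for p in patches])  # one row per keypoint
    mean = M.mean(axis=0)
    cov = np.cov(M, rowvar=False)                        # covariance of M
    eigvals, eigvecs = np.linalg.eigh(cov)               # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1][:n]                # largest eigenvalues first
    return mean, eigvecs[:, order]                       # projection matrix: (dim, n)

def pca_sift_descriptor(patch, mean, projection):
    """Step 3: project one keypoint's gradient vector; the result has n elements."""
    return (gradient_vector(patch) - mean) @ projection
```

The mean and projection matrix are computed once from training patches and then reused for every new keypoint, which matches the "store for later use" note on the slide.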

A Performance Evaluation of Local Descriptors

Krystian Mikolajczyk and Cordelia Schmid

Problem Setting for Comparison
Matching Problem
From a slide of David G. Lowe (IJCV 2004)
As we did in Project 2: Panorama, we want to find correct pairs of points in two images.

Overview of Compared Methods
- Region Detector: detects interest points
- Region Descriptor: describes the points
- Matching Strategy: how to find pairs of points in two images?

Region Detector
- Harris Points
- Blob Structure Detector
  1. Harris-Laplace Regions (similar to DoG)
  2. Hessian-Laplace Regions
  3. Harris-Affine Regions
  4. Hessian-Affine Regions
- Edge Detector
  - Canny Detector

Region Descriptors

Descriptor               Dimension  Category / notes
SIFT                     128        SIFT-based descriptors
PCA-SIFT                 36         SIFT-based descriptors
GLOH                     128        SIFT-based descriptors
Shape Context            36         Similar to SIFT, but focuses on edge locations found with the Canny detector
Spin                     50         A sparse set of affine-invariant local patches is used
Steerable Filters        14         Differential descriptors: focus on the properties of local derivatives (local jet)
Differential Invariants  14         Differential descriptors
Complex Filters          1681       Consists of many filters
Gradient Moments         20         Moment-based descriptor
Cross Correlation        81         Uniformly sampled locations

Distance measure: Euclidean (listed with the SIFT-based descriptors), Mahalanobis (listed with the differential descriptors)

Matching Strategy
- Threshold-Based Matching
- Nearest Neighbor Matching (Threshold)
- Nearest Neighbor Matching (Distance Ratio)
  - D_B: the first neighbor
  - D_C: the second neighbor

Performance Measurements
- Repeatability rate, ROC
- Recall-Precision

Recall = (# of correct matches) / (total # of correct matches in the ground truth) = TP (true positives) / actual positives
Precision = (# of correct matches) / (# of correct matches + # of false matches) = TP (true positives) / predicted positives

Example of Recall-Precision
Let's say that our method detected the following:
* 50 corresponding pairs were extracted
* 40 of the detected pairs were correct pairs
* As ground truth, there are 200 correct pairs
Then, with A = predicted positives, B = actual positives, and C = their overlap (the correct matches):
Recall = C/B = 40/200 = 20%
Precision = C/A = 40/50 = 80%
The perfect descriptor gives 100% recall for any value of precision.

DataSet
6 different transformed images

- Rotation
- Image Blur
- Zoom + Rotation
- Viewpoint Change
- Light Change
- JPEG Compression

Matching Strategies (Hessian-Affine Regions)
- Nearest Neighbor Matching (Threshold)
- Nearest Neighbor Matching (Distance Ratio)
- Threshold-Based Matching
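The three matching strategies compared here, together with the recall and precision measures from the earlier slides, can be sketched as below. The slides only name the strategies; the concrete rules used here (Euclidean distance, a match when the distance is below a threshold, and the ratio test comparing the first and second nearest neighbors) are the usual definitions and are stated as assumptions, as are all function names.

```python
import numpy as np

def pairwise_distances(desc_a, desc_b):
    """Euclidean distance between every descriptor in image A and every one in image B."""
    diff = desc_a[:, None, :] - desc_b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def threshold_matches(dist, t):
    """Threshold-based: every pair closer than t matches (one point may match several)."""
    return [(int(i), int(j)) for i, j in zip(*np.nonzero(dist < t))]

def nearest_neighbor_matches(dist, t):
    """Nearest neighbor: only the closest descriptor counts, and only if closer than t."""
    nearest = dist.argmin(axis=1)
    return [(i, int(j)) for i, j in enumerate(nearest) if dist[i, j] < t]

def distance_ratio_matches(dist, ratio):
    """Distance ratio: accept the first neighbor D_B if it is clearly closer than
    the second neighbor D_C, i.e. d(A, B) < ratio * d(A, C)."""
    order = np.argsort(dist, axis=1)
    matches = []
    for i in range(dist.shape[0]):
        first, second = order[i, 0], order[i, 1]
        if dist[i, first] < ratio * dist[i, second]:
            matches.append((i, int(first)))
    return matches

def recall_precision(matches, ground_truth):
    """Recall = correct / all ground-truth pairs; precision = correct / all returned pairs."""
    correct = len(set(matches) & set(ground_truth))
    recall = correct / len(ground_truth)
    # With no returned matches precision is undefined; 0.0 is used here by convention.
    precision = correct / len(matches) if matches else 0.0
    return recall, precision
```

Sweeping the threshold (or the ratio) and recording recall and precision at each setting gives one curve per descriptor, which is how the comparison plots for these strategies are produced.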

View Point Change
- With Hessian-Affine Regions
- With Harris-Affine Regions

Scale Change with Rotation
- Hessian-Laplace Regions
- Harris-Laplace Regions

Image Rotation of 30 to 45 degrees
- Harris Points

Image Blur
- Hessian-Affine Regions

JPEG Compression
- Hessian-Affine Regions

Illumination Changes
- Hessian-Affine Regions

Ranking of Descriptors (from high performance to low performance)
1. SIFT-based descriptors, 128 dimensions: GLOH, SIFT
2. Shape Context, 36 dimensions
3. PCA-SIFT, 36 dimensions
4. Gradient Moments (20 dimensions) and Steerable Filters (14 dimensions)
5. Other descriptors
Note: This ranking is for the matching problem. It is not a ranking of general performance.

Ranking of Difficult Image Transformations (from easy to difficult)
1. Scale & rotation & illumination
2. JPEG compression
3. Image blur
4. Viewpoint change

Scene type (from easy to difficult)
1. Structured scene
2. Textured scene

Two Textured Scenes

Other Results
- Hessian regions are better than Harris regions
- Nearest neighbor based matching is better than simple threshold-based matching
- SIFT becomes better when the nearest neighbor distance ratio is used
- Robust region descriptors perform better than point-wise descriptors
- Image rotation does not have a big impact on the accuracy of descriptors

A Fast Local Descriptor for Dense Matching
Engin Tola, Vincent Lepetit, Pascal Fua
Ecole Polytechnique Federale de Lausanne, Switzerland

Paper novelty
- Introduces the DAISY local image descriptor
  - much faster to compute than SIFT for dense point matching
  - works on par with or better than SIFT
- DAISY descriptors are fed into an expectation-maximization (EM) algorithm which uses graph cuts to estimate scene depth from low-quality images such as the ones captured by video streams

The paper introduces a novel local image descriptor, called DAISY, designed for dense wide-baseline matching. The DAISY descriptor is inspired by earlier ones such as SIFT and GLOH but can be computed much faster for dense point matching and, unlike some other types of descriptor which can also be computed efficiently at every pixel, it does not introduce artifacts that degrade the matching performance. From the results that will be shown later, the performance of the DAISY descriptor is at least as good as SIFT and sometimes even better.

The authors also fed DAISY descriptors to a graph-cuts based dense depth map estimation algorithm, which yields better wide-baseline performance than the commonly used correlation windows, whose size is hard to tune. As a result, unlike competing techniques that require many high-resolution images to produce good reconstructions, the DAISY-based approach can compute them from pairs of low-quality images such as the ones captured by video streams.

SIFT local image descriptor
The SIFT descriptor is a 3D histogram in which two dimensions correspond to image spatial dimensions and the additional dimension to the image gradient direction (normally discretized into 8 bins)
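The 3D histogram on this slide can be illustrated with a simplified sketch: a 4x4 spatial grid with 8 orientation bins gives the 128 elements mentioned in the SIFT overview. This version is an assumption-laden simplification, not Lowe's descriptor: it uses hard binning (no interpolation between neighboring bins and no Gaussian weighting of the patch), and the function name is invented for this example.

```python
import numpy as np

def sift_like_descriptor(patch, grid=4, bins=8):
    """Build a grid x grid x bins histogram of gradient orientations for one patch."""
    patch = patch.astype(float)
    gy, gx = np.gradient(patch)
    magnitude = np.hypot(gx, gy)                   # gradient norm at each pixel
    angle = np.arctan2(gy, gx) % (2 * np.pi)       # gradient direction in [0, 2*pi]

    h, w = patch.shape
    # Spatial cell of each pixel (first two histogram dimensions).
    rows = np.minimum(np.arange(h) * grid // h, grid - 1)
    cols = np.minimum(np.arange(w) * grid // w, grid - 1)
    rr, cc = np.meshgrid(rows, cols, indexing="ij")
    # Orientation bin of each pixel (third histogram dimension).
    obin = np.minimum((angle / (2 * np.pi) * bins).astype(int), bins - 1)

    hist = np.zeros((grid, grid, bins))
    # Each pixel adds its gradient norm to the bin chosen by its location and orientation.
    np.add.at(hist, (rr, cc, obin), magnitude)

    desc = hist.ravel()
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc
```

With the default arguments the result has 4 * 4 * 8 = 128 elements. Each pixel contributes according to where it lies in the patch and according to the orientation and norm of its gradient, which is the contribution rule the slide describes.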

Before we go into the details of the DAISY descriptor, let me first briefly describe the SIFT local image descriptor, because DAISY is largely inspired by SIFT. SIFT descriptors, before PCA dimensionality reduction, are 3D histograms in which two dimensions correspond to image spatial dimensions and the additional dimension to the image gradient direction. They are computed over local regions, usually centered on feature points but sometimes also densely sampled for object recognition tasks. Each pixel belonging to the local region contributes to the histogram depending on its location in the local region and the orientation and the norm of the image gradient at its location. As depicted by the figure, when an image gradient vector computed