
Scan Registration using Segmented Region Growing NDT

Arun Das∗ and Steven L. Waslander†

University of Waterloo, Waterloo, ON, Canada, N2L 3G1

Abstract—The Normal Distributions Transform (NDT) scan registration algorithm divides a point cloud using rectilinear voxel cells, then models the points within each cell as a set of Gaussian distributions. A nonlinear optimization is performed in order to register the scans; however, the voxel based approach results in a discontinuous cost function that hinders convergence. In this work, a segmented region growing NDT (SRG-NDT) variant is proposed, which first removes the ground points from the scan, then uses natural features in the environment to generate Gaussian clusters for the NDT algorithm. The removal of the ground points is shown to significantly speed up the scan registration process with negligible effect on the registration accuracy. By clustering the remaining points, the SRG-NDT approach is able to model the environment with fewer Gaussian distributions compared to the voxel based NDT methods, which allows for a smooth and continuous cost function that guarantees the optimization will converge. Furthermore, the use of a relatively small number of Gaussian distributions allows for a significant improvement in run-time. Experiments in both urban and forested environments demonstrate that the SRG-NDT approach is able to achieve comparable accuracy to existing methods, but with an average decrease in computation time over ICP, G-ICP and NDT of 90.1%, 95.3% and 94.5%, respectively.

I. INTRODUCTION

Scan registration methods play a significant role in the mapping algorithms used on mobile robots. Many sensors provide information about the environment in the form of point clouds, such as RGBD cameras, LIDAR and stereo cameras. Scan registration algorithms rely on overlapping geometry between scans to determine the relative transformation between the two scans. Knowing these transformation parameters allows for the aggregation of point cloud data, which is commonly required for autonomous operation in unknown environments.

The most popular scan registration methods used for mapping are the Iterative Closest Point (ICP) (Besl and McKay, 1992; Chen and Medioni, 1991; Zhang, 1994), Generalized Iterative Closest Point (G-ICP) (Segal et al., 2009) and the Normal Distributions Transform (NDT) (Biber and Strasser, 2003; Magnusson et al., 2007; Stoyanov et al., 2012a) algorithms. The ICP algorithm attempts to find transform parameters such that the Euclidean distance between nearest neighbour points is minimized. Since ICP forms correspondences completely between points, the quality of the scan registration degrades when the point cloud is sparse. Generalized ICP improves ICP

∗ PhD Student, Mechanical and Mechatronics Engineering, University of Waterloo

† Assistant Professor, Mechanical and Mechatronics Engineering, University of Waterloo

by calculating the surface normal at each point using local neighbourhoods, and using the surface normal information to improve the accuracy of the scan registration. However, the quality of the surface normal is poor when the point cloud is sparse, or when points are sampled from surfaces that are non-planar, such as trees and shrubs. Furthermore, the main bottleneck of the ICP and G-ICP algorithms is the nearest neighbour search used to identify point correspondences, which is computationally expensive.

The NDT algorithm is an alternative approach that does not need to explicitly compute nearest neighbour correspondences, as it represents the point cloud as a set of Gaussian distributions instead of as individual points. When performing registration, one scan is represented by the voxel based distributions, and an overlap score is calculated according to the probabilities of the points in the second scan being sampled from the Gaussian distributions. To align the scans, a nonlinear optimization is performed in order to determine the transformation parameters such that the overlap score is maximized. However, the optimization has been shown to have poor convergence properties (Das and Waslander, 2012), as the cost function is discontinuous due to points crossing cell boundaries. Furthermore, the voxel based discretization of NDT generates Gaussian distributions which do not necessarily model the environment accurately. The distributions only locally model the points within each cell at a single scale, and may not capture large features present within the scan that help with broad alignment.

In this work, we apply segmentation and clustering techniques to improve the convergence properties and run-time of the NDT algorithm, specifically for use in outdoor environments and with sparse point clouds. We introduce Segmented Region Growing NDT (SRG-NDT), which identifies and exploits natural features in the environment to generate Gaussian clusters for the NDT algorithm. First, the ground plane is removed from the scan using a Gaussian Process based segmentation algorithm, and a computationally efficient region growing algorithm is applied to cluster the remaining points. Clustering according to natural features allows for accurate modeling of the environment using significantly fewer distributions compared to the standard NDT algorithm. The application of ground segmentation also results in the decomposition of the environment into ground and obstacle partitions which can be used for higher level mission execution. To perform the registration, the cost of every point in the scene scan is evaluated with respect to all clusters from the reference scan, resulting in a smooth and continuous cost function which guarantees that the optimization will converge. Similarly, in distribution to distribution matching, the scene scan is also segmented and clustered and the cost of every cluster in the scene scan is evaluated with respect to all clusters from the reference scan.

Experiments verifying the approach are performed using outdoor data from the Ford Vision and LIDAR dataset and data collected in a forested environment at Worcester Polytechnic Institute. From the experiments, three main results are presented. First, we demonstrate that the SRG-NDT algorithm performs scan registrations with comparable accuracy to existing state-of-the-art methods. Second, we show that the removal of the ground points from a point cloud reduces the computation time required for scan registration, while maintaining comparable registration accuracy to a scan registration which is performed using the full point cloud. Finally, we illustrate that the SRG-NDT algorithm is capable of performing scan registration with significantly faster run-time compared to the existing methods, as SRG-NDT achieves an average decrease in computation time over ICP, G-ICP and NDT of 90.1%, 95.3% and 94.5%, respectively.

This work proceeds as follows. Section II presents work related to existing scan registration and clustering methods. Section III formulates the scan registration problem and describes the NDT method in detail. In Section IV we present the Segmented Region Growing NDT algorithm and the associated details relating to Gaussian Process based ground segmentation and region growing clustering. Experimental results validating the SRG-NDT approach are presented in Section V. Finally, Section VI concludes our findings and presents future directions for this work.

II. RELATED WORK

The best known and most widely used scan registration algorithm is the Iterative Closest Point (ICP) algorithm, independently introduced by Besl and McKay (Besl and McKay, 1992), Chen and Medioni (Chen and Medioni, 1991), and Zhang (Zhang, 1994). The ICP algorithm attempts to find transform parameters that minimize the Euclidean distance of corresponding points between the reference scan and the scene scan. To compute the correspondence, a point in the scene scan is queried against the reference scan to determine the point in the reference scan which is the minimum Euclidean distance to the query point. The process of computing point correspondences and minimizing the Euclidean distance between corresponding points is performed iteratively until convergence, when the sum of Euclidean distances between corresponding points does not sufficiently decrease. In practice, the point density of a scan typically decreases with range from the laser scanner, thus a moving vehicle can create scans where the same spatial region is sampled at different densities. The ICP algorithm has difficulties in situations of differing point density, as it corresponds every point individually and does not use any surface information. To address the shortcomings of ICP, Segal et al. introduced generalized-ICP (G-ICP) (Segal et al., 2009), which calculates the surface normal at each point, using local neighbourhoods, and only includes correspondences between points of similar surface normal structure between the two scans. G-ICP explicitly takes sensor noise and sampling characteristics into account, thus improving the convergence basin and convergence rate when compared to ICP. The major shortcoming of both the ICP and G-ICP approaches is the generation of the nearest neighbour correspondences. This step is generally computationally expensive and has been shown to be the bottleneck of the ICP algorithm (Borrmann et al., 2008).

An alternative approach for scan registration is the Normal Distributions Transform (NDT). The NDT algorithm was first suggested by Biber and Strasser for 2D scan registration and mapping (Biber and Strasser, 2003), and was later expanded into 3D (Magnusson et al., 2007; Takeuchi and Tsubouchi, 2006). The NDT algorithm represents the underlying scan as a set of Gaussian distributions that locally model the surface of the reference scan as a probability density function (PDF). Given a parameter estimate for the desired transformation, a score is assigned which quantifies the amount of overlap between the reference scan and the transformed scene scan. The overlap score is determined by evaluating the transformed scene scan points at the Gaussian distributions which model the reference scan. In order to register the two point clouds, a nonlinear optimization is performed to determine the transformation parameters such that the overlap score is maximized. Stoyanov et al. introduced a distribution to distribution matching extension to NDT, which generates Gaussian distributions for both the scene and reference scan and minimizes the L2 distance between distribution sets of the scene and reference scan in order to perform registration. The distribution to distribution extension was shown to improve the convergence basin, as well as improve computation time (Stoyanov et al., 2012a).

Most recently, the use of the NDT representation of point cloud data has been significantly extended to include map representation, point cloud accuracy evaluation, localization, and path planning applications. The extended methods all rely on NDT based maps, and thus can greatly benefit from advances in NDT based scan registration algorithms that allow for robust and rapid aggregation of point clouds. NDT Occupancy Maps (NDT-OM) use a voxelized NDT representation for the map and a recursive covariance update procedure to probabilistically integrate new sensor information into the NDT map. The NDT-OM combines the robustness of standard occupancy based mapping and the compact spatial representation of NDT (Saarinen et al., 2013b), and has shown promising results when used for dynamic environments (Stoyanov et al., 2013) and Graph SLAM based approaches for long term operations (Einhorn and Gross, 2013). 3D-NDT has also been used to evaluate the accuracy of point cloud data collected from different sensors (Stoyanov et al., 2012b), as well as to correct for distorted point clouds caused by the laser scanner undergoing appreciable movement during the acquisition of the scan (Almqvist et al., 2013). The popular Monte-Carlo localization (MCL) algorithm was reformulated for NDT based maps and was experimentally shown to provide superior localization accuracy when compared to occupancy grid MCL for industrial environments in both static and dynamic scenarios (Saarinen et al., 2013a). A modified wave-front planner using an NDT based map representation was demonstrated to compute feasible paths in a computationally efficient manner in a semi-structured 3D environment (Stoyanov et al., 2010).

Although it has been shown that the NDT algorithms are able to produce higher quality registration results relative to ICP (Magnusson et al., 2009), a major shortcoming of the NDT algorithm is its poor convergence properties (Das and Waslander, 2012). As the Gaussian distributions are generated by partitioning the point cloud with rectilinear grid cells, the nonlinear cost function used to determine the transform parameters is not well defined as points cross cell boundaries during the optimization. As a result, convergence of the optimization is not guaranteed. The issue of discontinuities in the cost function is typically addressed using overlapping grid cells or trilinear interpolation between grid cells (Magnusson et al., 2007); however, such techniques only mitigate the discontinuities instead of fully removing them. Furthermore, a typical NDT scan registration transforms the point cloud data into a large set of Gaussian distributions, in accordance with the selected cell size. The run time of the NDT algorithm scales linearly with the number of distributions, and although the naive discretization approach does a reasonable job at modeling the environment, improvements in run-time should be possible by generating fewer Gaussian distributions which directly model natural features in the environment.

It should be noted that due to strong underlying surface structure and high object density, scan registration algorithms generally perform well in structured environments with flat, planar surfaces. In contrast, large outdoor environments with trees and foliage make the use of scan registration based approaches challenging, as a typical point cloud from a laser scanner such as the Velodyne-HDL32E sensor is sparse when the environment is large, and relatively noisy when scanning trees, grass, and foliage. The state of the art scan registration algorithms generally make assumptions about point cloud data which are not valid in sparse or forested environments. For example, ICP methods require a high point data density in order to provide accurate correspondences for nearest neighbour search (Nuchter, 2009). G-ICP requires the computation of surface normal data, which is difficult to perform accurately with sparse and noisy point cloud data of grass, trees and shrubs, all of which are typical of a large, forested environment. Conceptually, the NDT approach is suitable for a forested area, as the sparsity issue can be resolved through the modelling of the point cloud with Gaussian distributions. However, the poor convergence properties of the NDT algorithm limit its effectiveness in such an environment.

In order to improve convergence, the authors proposed the partitioning of a 2D laser scan using k-means clustering, over multiple scales with increasing number of clusters (Das and Waslander, 2012). These sets of clusters were then used in a coarse-to-fine optimization method in order to avoid local minima and improve convergence. The approach results in a smooth and continuous cost function; however, the k-means approach does not scale well to 3D point cloud data and requires fixing the number of partitions a priori. Scan partitioning has also been performed in 3D by Moosmann et al., who used a graph based approach to segment a laser scan based on a local convexity criterion (Moosmann et al., 2009). The method computes surface normal information in order to partition the scan; however, good quality surface normal information is difficult to generate for sparse laser scans, or when the environment consists of objects such as trees, brush, and foliage. Gaussian processes (GP) and nearest neighbor clustering algorithms operating on 3D voxel grids of point cloud data have been shown to produce good segmentation results on sparse laser data (Douillard et al., 2011), and were further applied to ICP scan registration, where correspondences between neighbouring points were constrained to belong to corresponding segments (Douillard et al., 2012). Golovinskiy et al. (Golovinskiy et al., 2009) also demonstrated 3D point cloud clustering using a hierarchical approach, and Klasing et al. (Klasing et al., 2008) performed clustering on 3D data using a radially bounded nearest neighbour (RBNN) graph. Although promising, the use of 3D nearest-neighbour clustering techniques is unnecessary for environments where the majority of the natural features are spatially separated and are oriented perpendicular to the ground. In such situations, simpler methods such as 2D region growing from a top-down perspective of the scan can be applied. Region growing algorithms are well established within the computer vision community as segmentation methods that are able to provide good clustering results in a computationally efficient manner (Haralick and Shapiro, 1985).

III. PROBLEM FORMULATION

A. Scan Registration

Define a point cloud as the set of points $P = \{p_1, \ldots, p_{N_P}\}$ where $p_i \in \mathbb{R}^3$ for $i \in \{1, \ldots, N_P\}$. A point $p_j \in P$ consists of three components, $p_j = \{p^x_j, p^y_j, p^z_j\}$, which refer to the x, y, and z components of the point, respectively. To transform a point from the coordinate frame of the scene scan to the coordinate frame of the reference scan, transformation parameters are required. In 3D, the transform parameters are given by $E = [t_x, t_y, t_z, \phi, \theta, \psi]^T \in SE(3)$. Note that the rotations will be defined by a 3-2-1 Euler angle parametrization. The transformation function, $T_E : \mathbb{R}^3 \to \mathbb{R}^3$, that maps a point from the scene scan, $p \in P^S$, into the coordinate frame of the reference scan, using a parameter estimate, $E$, is given as

$$T_E(p) = \left[ R_\phi R_\theta R_\psi \right] p + \begin{bmatrix} t_x \\ t_y \\ t_z \end{bmatrix}, \tag{1}$$

where $R_\phi$ is the rotation matrix for a rotation about the x axis by angle $\phi$, $R_\theta$ is the rotation matrix for a rotation about the y axis by angle $\theta$, $R_\psi$ is the rotation matrix for a rotation about the z axis by angle $\psi$, and $[t_x, t_y, t_z]^T$ is the translation vector between the origins of the two frames.
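As an illustration of Equation (1), the following minimal Python sketch composes the three elementary rotations in the order written and applies the translation. The function names (`rot_x`, `rot_y`, `rot_z`, `transform_point`) are hypothetical, and the sign convention of the rotation matrices (right-handed, active rotations) is an assumption, since the text does not state it.

```python
import numpy as np

def rot_x(phi):
    # Rotation about the x axis by angle phi (assumed right-handed convention).
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_y(theta):
    # Rotation about the y axis by angle theta.
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def rot_z(psi):
    # Rotation about the z axis by angle psi.
    c, s = np.cos(psi), np.sin(psi)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def transform_point(params, p):
    """Apply Equation (1): T_E(p) = [R_phi R_theta R_psi] p + [tx, ty, tz]^T."""
    tx, ty, tz, phi, theta, psi = params
    R = rot_x(phi) @ rot_y(theta) @ rot_z(psi)
    return R @ np.asarray(p, dtype=float) + np.array([tx, ty, tz])
```

For example, `transform_point([1, 2, 3, 0, 0, np.pi / 2], [1, 0, 0])` rotates the point a quarter turn about the z axis and then shifts it by the translation vector.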

Scan registration algorithms seek to find the optimal transformation between two point cloud scans, a reference scan, $P^R$, and a scene scan, $P^S$. The goal is to determine transformation parameters which best align the two, such that the two scans overlap as much as possible. To quantify the amount of overlap between the two scans, a score function is required, and is denoted by the mapping $\Lambda : SE(3) \to \mathbb{R}$. The 3D scan registration problem can be defined as determining the optimal transform parameters, $E^*$, such that

$$E^* = \operatorname*{argmin}_{E \in SE(3)} \Lambda(E). \tag{2}$$

B. NDT Scan Registration

The Normal Distributions Transform (NDT) is a method by which sections of a point cloud are represented as Gaussian distributions within a grid structure. The transform maps a point cloud to a smooth surface representation described as a set of local distributions which capture the shape of the surface. Similar to an occupancy grid map, the NDT generates a subdivided representation of the environment. However, the occupancy grid represents the probability that the cell is occupied, whereas the NDT represents the probability of measuring a point cloud sample for each position within a given cell.

The NDT registration algorithm begins by subdividing the space occupied by the reference scan into fixed size, rectilinear grid cells, $c \subseteq \mathbb{R}^3$, where the set of all cells in the reference scan is denoted as $C^R$. Denote the collection of the reference scan points in cell $c_i$ as $P^R_{c_i} = \{p \in P^R : p \in c_i\}$. Given the $i$th cell, the points of the reference scan occupying that cell are used to generate a mean, $\mu^R_i$, and a covariance matrix, $\Sigma^R_i$, for a representative Gaussian distribution, $\mathcal{N}(\mu^R_i, \Sigma^R_i)$.

The Gaussian distribution can be interpreted as a generative process that models the local surface points $P^R_{c_i}$ within the cell. Assuming that the locations of the reference scan surface points are drawn from this distribution, the likelihood of having measured a point $p$, within cell $c_i$, can be modeled as

$$\rho_{c_i}(p) = \exp\left( -\frac{(p - \mu_i)^T \Sigma_i^{-1} (p - \mu_i)}{2} \right). \tag{3}$$

An NDT probability distribution plot for a sample 2D laser scan is given in Figure 1.
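The cell model above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `build_ndt_cells` and `cell_likelihood` are hypothetical, and the `min_points` rule for skipping sparsely populated cells is an assumption that is not taken from the text.

```python
import numpy as np
from collections import defaultdict

def build_ndt_cells(points, cell_size, min_points=5):
    """Group points into cubic cells and fit one Gaussian (mean, covariance) per cell."""
    buckets = defaultdict(list)
    for p in np.asarray(points, dtype=float):
        key = tuple(int(v) for v in np.floor(p / cell_size))
        buckets[key].append(p)
    cells = {}
    for key, pts in buckets.items():
        if len(pts) < min_points:
            continue  # assumed rule: too few samples for a usable covariance
        pts = np.array(pts)
        cells[key] = (pts.mean(axis=0), np.cov(pts, rowvar=False))
    return cells

def cell_likelihood(p, mu, sigma):
    """Equation (3): exp(-(p - mu)^T Sigma^{-1} (p - mu) / 2)."""
    d = np.asarray(p, dtype=float) - mu
    return float(np.exp(-0.5 * d @ np.linalg.solve(sigma, d)))
```

The likelihood equals one at the cell mean and decays with the Mahalanobis distance from it. A practical implementation would also need to guard against singular covariance matrices, which this sketch does not do.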

Since the point cloud is modelled by a piecewise continuous and piecewise differentiable summation of Gaussians, numerical optimization tools can be used in order to register the scene scan with the reference scan. A fitness score can be calculated which quantifies the measure of overlap between the reference scan and the scene scan transformed by parameter estimate $E$. The original work by Magnusson for point to distribution NDT registration calculates the cost by evaluating each point in the transformed scan at the distribution corresponding to the NDT cell which each transformed point occupies (Magnusson et al., 2007). The NDT point-to-distribution score is given as

$$\Lambda^{p2d}_{ndt}(E) = -\sum_{p \in P^S} \rho_{c_{cur}}(T_E(p)), \tag{4}$$

where $c_{cur}$ denotes the cell which point $p$ occupies when transformed by parameter estimate $E$. Stoyanov et al. introduced a modified version of NDT which generates an NDT representation of both scans, then compares the distributions of the scans in order to align them (Stoyanov et al., 2012a).

Fig. 1. Probability density of the laser scan from NDT for each cell. Lighter areas indicate higher probabilities of sample points.

For distribution to distribution matching, the scene scan $P^S$ is also divided into a set of cells, $C^S$, where the points contained in the $i$th cell are used to model the distribution $\mathcal{N}(\mu^S_i, \Sigma^S_i)$.

To align the scans, an optimization is formulated to minimize the L2-distance between the sets of Gaussian distributions of the reference and scene scan. Denote the difference of mean values for the $i$th Gaussian distribution from the reference scan and the $j$th Gaussian distribution of the scene scan as

$$\beta_{ij} = \left[ T_E(\mu^S_j) - \mu^R_i \right]. \tag{5}$$

Using the difference of means, the NDT distribution-to-distribution score function is given as

$$\Lambda^{d2d}_{ndt}(E) = -\sum_{i=1}^{|C^R|} \sum_{j=1}^{|C^S|} \exp\left( -\frac{1}{2} \beta_{ij}^T \left[ E_r^T \Sigma^S_j E_r + \Sigma^R_i \right]^{-1} \beta_{ij} \right). \tag{6}$$

The distribution to distribution cost function evaluates all pairwise Gaussian components for both the scene and the reference scan. In Stoyanov's work, however, the cost function is only evaluated at the nearest Gaussian component in order to reduce computation. Evaluation at the nearest Gaussian results in discontinuities in the overall cost function, which implies that the cost function is non-smooth and the gradient and Hessian do not exist in the transformation space when the correspondences between mean points change. Related approaches address the discontinuous cost function using a tri-linear interpolation scheme which takes distributions from neighbouring cells into account when calculating the gradient and Hessian contributions (Magnusson et al., 2007). However, the tri-linear interpolation does not fully solve the issue, as the cost functions are still discontinuous.
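The all-pairs form of Equations (5) and (6) can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function name `d2d_score` is hypothetical, each scan is assumed to be given as a list of `(mean, covariance)` pairs, and `R` is assumed to be the rotation matrix associated with the parameter estimate (written $E_r$ in Equation (6)), with `t` the translation.

```python
import numpy as np

def d2d_score(ref, scene, R, t):
    """Sum Equation (6) over every reference/scene Gaussian pair.

    ref, scene: lists of (mean, covariance) tuples.
    R, t: assumed rotation matrix and translation of the parameter estimate.
    """
    score = 0.0
    for mu_r, sig_r in ref:
        for mu_s, sig_s in scene:
            beta = R @ mu_s + t - mu_r         # Equation (5)
            cov = R.T @ sig_s @ R + sig_r      # bracketed term, as written in Equation (6)
            score -= np.exp(-0.5 * beta @ np.linalg.solve(cov, beta))
    return float(score)
```

Because every pair contributes, no correspondence ever switches on or off as the transform changes, so this form varies smoothly with `R` and `t`. Its cost grows with the product of the two set sizes, which is why a nearest-component shortcut is attractive when the sets are large.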


It should be noted that it is possible to compute an analytic gradient, $g$, and Hessian, $H$, for both NDT cost functions, which can be used for improving the performance of the nonlinear optimization method selected. For point to distribution matching, denote the functions $\Pi_g(p_k, E)$ and $\Pi_H(p_k, E)$, which use a point in the scene scan and the transformation parameters to calculate an associated gradient and Hessian contribution for point $p_k$. The optimization is initialized from a parameter estimate $E$, and terminates once the norm of the gradient, $\|g\|$, is less than a user specified threshold, $\epsilon_{ndt}$. Biber et al. use a Newton method with line search to perform the optimization (Biber and Strasser, 2003); however, other methods such as Levenberg-Marquardt or the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method could be used (Magnusson, 2009).

IV. PROPOSED METHOD: SEGMENTED REGION GROWING NDT

To perform Segmented Region Growing NDT, the NDT algorithm is modified in the following three ways. First, the ground points of the scan are removed using the ground segmentation method described in Section IV-A. In large, outdoor, field-like environments, the ground points typically do not provide as much information for the purpose of scan registration as other environmental features such as large trees, fences, etc., and in fact, can decrease the accuracy of ICP and NDT. Scan point density also becomes an issue when dealing with the ground, as the ground points very close to the laser scanner are more dense than those further away. Non-uniform density of the point cloud is not desirable for NDT, since the high density of points near the vehicle will have a tendency to generate Gaussian distributions biased towards the sensor origin. The non-uniform density of a scan can be somewhat mitigated using a down-sampling voxel based filter for the scan; however, the use of a voxel filter adds additional computation to the scan registration, and removes information from the entire scan due to the down-sampling process. Although down-sampling the scan will remove the biased distribution issue for NDT, it does not solve the problem of the discontinuous cost function.

Second, the non-ground points are clustered using the region growing method described in Section IV-B. By performing the clustering step, natural features in the environment, such as trees and bushes, are clustered to form the Gaussian distributions for SRG-NDT. The region growing clustering method can be seen as a way to capture representative features from the environment, assuming they are spatially separated once the ground points have been removed and are distributed in a configuration that allows for the formation of clusters that are sufficient for NDT registration.

Finally, to accommodate the region clustering of the scan, the NDT cost calculation is modified as described in Section IV-C. The SRG-NDT algorithm evaluates the transformed point with respect to the set of all the Gaussian distributions for the clusters identified by the region growing method. Similarly, SRG-NDT is applied to the distribution-to-distribution NDT cost by scoring each Gaussian distribution in the scene scan with respect to all Gaussian distributions from the reference scan. Evaluation at all of the distributions is computationally feasible since the number of clusters identified in the scan is typically significantly smaller than the number of Gaussian distributions that would result from voxel based discretization. Calculating the cost based on all Gaussian distributions in the scans allows for the strongest contributions to come from the most likely clusters and provides a continuous and differentiable cost function for the optimization.
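The idea of scoring each transformed point against every cluster, rather than against the single cell it falls in, can be sketched as below. This is a minimal sketch under stated assumptions, not the cost defined in Section IV-C: the name `srg_p2d_score` is hypothetical, clusters are assumed to be `(mean, covariance)` pairs, and the per-cluster term is assumed to take the same exponential form as Equation (3).

```python
import numpy as np

def srg_p2d_score(scene_points, clusters, transform):
    """Score every transformed scene point against all reference clusters.

    clusters: list of (mean, covariance) tuples (assumed representation).
    transform: callable mapping a 3-vector to a 3-vector.
    """
    score = 0.0
    for p in scene_points:
        q = transform(np.asarray(p, dtype=float))
        for mu, sigma in clusters:
            d = q - mu
            # No cell lookup: distant clusters simply contribute almost nothing.
            score -= np.exp(-0.5 * d @ np.linalg.solve(sigma, d))
    return float(score)
```

Since there is no hard assignment of points to cells, the score is a sum of smooth functions of the transformed point, and the dominant terms come from the clusters nearest to it in the Mahalanobis sense.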

A. Ground Segmentation

The basis for the ground segmentation algorithm used for this work was first demonstrated by Douillard et al. (Douillard et al., 2011). Their method performs ground segmentation based on a 2D Gaussian process model of the sparse point clouds. The use of Gaussian processes provides a probabilistic framework to identify ground points using an incremental sample consensus (INSAC) method. A modified version of Douillard's ground segmentation algorithm was introduced by Tongtong et al. (Tongtong et al., 2011). To reduce computation, Tongtong divides the scan into sectors based on a polar grid binning, and applies the ground segmentation to each sector separately. The Gaussian process regression is formulated for an approximate signal, based on the points contained in each sector. Tongtong's method was shown to run more quickly than the 2D method, with comparable results in segmentation quality. The 1D approximation method for ground segmentation from (Tongtong et al., 2011) is used in this work, and is briefly summarized in this section.

1) Polar Grid Binning: In order to perform the polar grid binning, the x-y plane of the laser scan, in the laser frame, is first segmented into $N_a$ angular sectors. The angle each sector covers, $\tau_a$, is given by

$$\tau_a = \frac{2\pi}{N_a}. \tag{7}$$

Each sector is further sub-divided into $N_l$ linear range based bins. The distance each linear range bin covers, $\tau_l$, is given by

$$\tau_l = \frac{R_{max}}{N_l}, \tag{8}$$

where $R_{max}$ is the maximum range measurement expected for the given scan. Denote the cell for the $i$th sector and $j$th linear range bin as $\xi_{ij} \subseteq \mathbb{R}^2$, and the set of all bins as $\Xi$ where $|\Xi| = N_a N_l$. Define the points from point cloud $P$ whose projection falls within cell $\xi_{ij}$ as

$$P_{ij} = \{p \in P : (p^x, p^y) \in \xi_{ij}\}. \tag{9}$$
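The binning of Equations (7) to (9) can be sketched as follows. The function name `polar_bin` is hypothetical, and two details are assumptions not stated in the text: sector angles are measured from the positive x axis in the range $[0, 2\pi)$, and returns at or beyond $R_{max}$ are discarded.

```python
import math
from collections import defaultdict

def polar_bin(points, n_sectors, n_bins, r_max):
    """Assign each (x, y, z) point to a (sector, range-bin) cell."""
    tau_a = 2.0 * math.pi / n_sectors   # Equation (7): angle covered by a sector
    tau_l = r_max / n_bins              # Equation (8): range covered by a bin
    cells = defaultdict(list)
    for p in points:
        r = math.hypot(p[0], p[1])
        if r >= r_max:
            continue  # assumed: ignore returns beyond the expected maximum range
        angle = math.atan2(p[1], p[0]) % (2.0 * math.pi)
        i = min(int(angle / tau_a), n_sectors - 1)  # clamp against rounding at 2*pi
        j = int(r / tau_l)
        cells[(i, j)].append(p)         # Equation (9): points projecting into the cell
    return cells
```

Only the x and y components decide the cell; the z component is carried along for the later ground-height steps.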

To perform rapid ground segmentation, a set of tuples is constructed from the cell point sets, $P_{ij}$. In order to construct the set of tuples, a prototype point is determined for the points within each cell. For the case of ground segmentation, it is desirable to model the prototype point in each cell such that it best represents the ground. The prototype point for the points $P_{ij}$ within each cell $\xi_{ij}$ is selected as the point with the lowest z co-ordinate, and is given as

$$x_{ij} = \operatorname*{argmin}_{p \in P_{ij}} (p^z). \tag{10}$$

Finally, using the prototype point for the points in each cell, the set of tuples for each angular sector can be defined. Denote a tuple associated with the prototype point $x_{ij}$ of a cell as $y_{ij} = (r_{ij}, h_{ij}) \in \mathbb{R}^2$. For the tuple notation, the first element of the tuple, $r_{ij} \in \mathbb{R}$, denotes the range component, and $h_{ij} \in \mathbb{R}$ denotes the height component. Using the prototype point for the cell, $x_{ij}$, the range and height components for the tuple are calculated as

$$r_{ij} = \sqrt{(x^x_{ij})^2 + (x^y_{ij})^2} \tag{11}$$

$$h_{ij} = x^z_{ij}. \tag{12}$$

The range element, $r_{ij}$, is the Euclidean distance of the x and y components from the prototype point, $x_{ij}$, to the sensor origin, and the height element, $h_{ij}$, is simply the z component of the prototype point. The set of tuples, $Y_i$, for angular sector $i$ is defined as

$$Y_i = \{y_{ij} : j \in 1 \ldots N_l\}. \tag{13}$$
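Equations (10) to (13) reduce each sector to a short list of range-height pairs, which can be sketched as below. The function name `sector_tuples` is hypothetical, the cells are assumed to be stored in a dictionary keyed by `(sector, bin)`, and skipping empty cells is an assumption, since the text does not say how they are handled.

```python
import math

def sector_tuples(cells, sector, n_bins):
    """Build the (range, height) tuples for one angular sector."""
    tuples = []
    for j in range(n_bins):
        pts = cells.get((sector, j))
        if not pts:
            continue  # assumed: an empty cell contributes no tuple
        proto = min(pts, key=lambda p: p[2])   # Equation (10): lowest z in the cell
        r = math.hypot(proto[0], proto[1])     # Equation (11): planar range to the origin
        h = proto[2]                           # Equation (12): height of the prototype
        tuples.append((r, h))
    return tuples                              # Equation (13): tuple set for the sector
```

Taking the lowest point per cell turns the 3D ground estimation problem in each sector into a 1D regression of height against range.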

2) Gaussian Process Regression: Ground segmentation is done by performing Gaussian Process (GP) regression. For a thorough exploration of the usage of GPs for ground terrain modeling, see Vasudevan et al. (Vasudevan et al., 2009). The GP regression provides a probabilistic framework to predict a set of height values for a set of query inputs, given a set of training data. Suppose the GP model is to be trained with a subset of the range-height tuple set, denoted as the training set $\tilde{Y}_i \subseteq Y_i$. A covariance function is used to quantify the correlation among data points used for the regression. In this work, a squared-exponential covariance function, $\kappa : \mathbb{R}^2 \to \mathbb{R}$, is used. Suppose the correlation between two scalar data points, $b_1 \in \mathbb{R}$ and $b_2 \in \mathbb{R}$, is desired. Using the squared-exponential covariance function, the correlation is given by

$$\kappa(b_1, b_2) = \sigma_f^2 \exp\left( -\frac{1}{2l^2} (b_1 - b_2)^2 \right), \tag{14}$$

where $l \in \mathbb{R}$ is the length-scale parameter, and $\sigma_f^2$, with $\sigma_f \in \mathbb{R}$, is the input variance, which are known as the hyper-parameters for the GP model. Given the covariance function, a regression problem for ground segmentation can be posed which seeks to predict the height values for the associated range values from the tuple set, $Y_i$, given that the GP model has been trained using the data from the training set, $\tilde{Y}_i$. Using the covariance function, define four matrices which correlate the query and training range data. Denote $K_{qq} \in \mathbb{R}^{|Y_i| \times |Y_i|}$ as the auto-covariance matrix for the query set. Let the notation $K(j, k)$ access element $(j, k)$ of the matrix $K$. To simplify notation, let the range and height values of a tuple element $y_k \in Y_i$ be denoted as $r_k$ and $h_k$, respectively. Then, each element of the matrix $K_{qq}$ is populated using the squared exponential covariance function given by Equation (14),

$$K_{qq}(j, k) = \kappa(r_j, r_k), \quad \forall\, y_j, y_k \in Y_i. \tag{15}$$

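Similarly, with $\tilde{Y}_i$ denoting the training set, the cross-covariance matrix between the query and training data, $K_{qt} \in \mathbb{R}^{|Y_i| \times |\tilde{Y}_i|}$, and the auto-covariance matrix for the training data, $K_{tt} \in \mathbb{R}^{|\tilde{Y}_i| \times |\tilde{Y}_i|}$, can be defined as

$$K_{qt}(j, k) = \kappa(r_j, r_k), \quad \forall\, y_j \in Y_i \text{ and } y_k \in \tilde{Y}_i \tag{16}$$

$$K_{tq} = K_{qt}^T \tag{17}$$

$$K_{tt}(j, k) = \kappa(r_j, r_k), \quad \forall\, y_j, y_k \in \tilde{Y}_i. \tag{18}$$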
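The four matrices of Equations (15) to (18) can be assembled with a few lines of NumPy. This is an illustrative sketch: the names `se_kernel` and `gp_matrices` are hypothetical, and the inputs are assumed to be plain sequences of range values for the training and query tuples.

```python
import numpy as np

def se_kernel(b1, b2, length_scale, sigma_f):
    """Equation (14): squared-exponential covariance (broadcasts over arrays)."""
    return sigma_f**2 * np.exp(-((b1 - b2) ** 2) / (2.0 * length_scale**2))

def gp_matrices(train_r, query_r, length_scale, sigma_f):
    """Return K_tt, K_tq, K_qt, K_qq built from training and query ranges."""
    t = np.asarray(train_r, dtype=float)
    q = np.asarray(query_r, dtype=float)
    K_qq = se_kernel(q[:, None], q[None, :], length_scale, sigma_f)  # Equation (15)
    K_qt = se_kernel(q[:, None], t[None, :], length_scale, sigma_f)  # Equation (16)
    K_tt = se_kernel(t[:, None], t[None, :], length_scale, sigma_f)  # Equation (18)
    return K_tt, K_qt.T, K_qt, K_qq                                  # K_tq via Equation (17)
```

With two training ranges and three query ranges, `K_qt` has shape `(3, 2)` and `K_tq` has shape `(2, 3)`, matching the dimensions given in the text.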

Using the auto- and cross-covariance matrices, the Gaussian process models the relationship between the query and training data as a joint distribution. Denote the vector of height values from the training set as $\lambda = [h_1 \ldots h_{|\tilde{Y}_i|}]^T$, where each $h_k$ is the height of a tuple in $\tilde{Y}_i$. Similarly, denote the vector $\lambda^*$ as the height values to be inferred using the GP regression. The GP is then modelled as the joint distribution

$$\begin{bmatrix} \lambda \\ \lambda^* \end{bmatrix} \sim \mathcal{N}\left(0, \begin{bmatrix} K_{tt} + \sigma_n^2 I & K_{tq} \\ K_{qt} & K_{qq} \end{bmatrix}\right), \tag{19}$$

where $\sigma_n^2$ is an additional hyper-parameter, the process noise variance.

From the joint distribution, expressions for the conditional mean and conditional covariance can be derived, which give the predictive equations. In other words, using the training data and the predictive equations, a vector of ground height values, $\lambda^*$, and associated covariances, $V_{\lambda^*}$, can be predicted for a set of queried range values. The conditional mean and covariance predictive equations are given as

$$\lambda^* = K_{qt}\,[K_{tt} + \sigma_n^2 I]^{-1}\,\lambda \tag{20}$$

$$V_{\lambda^*} = K_{qq} - K_{qt}\,[K_{tt} + \sigma_n^2 I]^{-1} K_{tq}. \tag{21}$$
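The predictive equations (20) and (21) can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: the linear systems are solved with a Cholesky factorization instead of an explicit inverse (a common numerical choice, not something stated in this work), only the diagonal of $V_{\lambda^*}$ is returned, and the function names and default hyper-parameter values are hypothetical.

```python
import math


def kernel(a, b, sigma_f, ell):
    """Squared-exponential covariance of Equation (14)."""
    return sigma_f ** 2 * math.exp(-((a - b) ** 2) / (2.0 * ell ** 2))


def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def chol_solve(low, b):
    """Solve (L L^T) x = b by forward, then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gp_predict(train_r, train_h, query_r, sigma_f=1.0, ell=2.0, sigma_n=0.1):
    """Predicted heights (Eq. 20) and per-query variances (diagonal of Eq. 21)."""
    n = len(train_r)
    # K_tt + sigma_n^2 I
    a = [[kernel(train_r[j], train_r[k], sigma_f, ell)
          + (sigma_n ** 2 if j == k else 0.0) for k in range(n)] for j in range(n)]
    low = cholesky(a)
    alpha = chol_solve(low, train_h)  # [K_tt + sigma_n^2 I]^{-1} lambda
    means, variances = [], []
    for rq in query_r:
        k_q = [kernel(rq, rt, sigma_f, ell) for rt in train_r]  # one row of K_qt
        means.append(sum(kq * al for kq, al in zip(k_q, alpha)))
        v = chol_solve(low, k_q)  # [K_tt + sigma_n^2 I]^{-1} times a column of K_tq
        variances.append(kernel(rq, rq, sigma_f, ell)
                         - sum(kq * vi for kq, vi in zip(k_q, v)))
    return means, variances


# Heights are predicted near the training ranges and far away from them.
means, variances = gp_predict([1.0, 2.0, 3.0], [0.0, 0.1, 0.0], [2.5, 50.0])
print(means, variances)
```

The predicted variance is small close to the training ranges and approaches $\sigma_f^2$ far from them, which is the quantity that the model-confidence test in the next step relies on.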

Denote the method $\Upsilon(\tilde{Y}_i, Y_i)$, which generates the four covariance matrices $K_{tt}$, $K_{tq}$, $K_{qt}$, and $K_{qq}$ using the training data, $\tilde{Y}_i$, and the query points, $Y_i$, for the angular sector $i$.

From the predictive equations, the estimated ground height for a given range value from the set of query data can be generated. Using index notation, denote $\lambda^*_k$ as the $k$th element of the query height vector, $\lambda^*$, and similarly denote $\lambda_k$ as the $k$th element of the vector of training heights, $\lambda$. Further denote $N^*_\lambda$ as the number of elements in the vector $\lambda^*$, and note that $N^*_\lambda = |Y_i|$. In order to determine whether a query point is an inlier according to the GP model, the predicted ground height, $\lambda^*_k$, is compared to the measured height from the tuple $y_k = (r_k, h_k) \in Y_i$ for the query point. To be considered an inlier to the GP model, two criteria must be fulfilled:

$$V_{\lambda^*_k} < \delta_{model} \tag{22}$$

$$\frac{h_k - \lambda^*_k}{\sqrt{\sigma_n^2 + V_{\lambda^*_k}}} < \delta_{data}, \tag{23}$$

where $\delta_{model}$ defines the threshold on the covariance for the test point and $\delta_{data}$ defines a normalized distance of the test point to its expected value from the GP model. The comparison to $\delta_{model}$ ensures that there is sufficient confidence in the GP model to allow for inlier classification, and the comparison to $\delta_{data}$ uses the distance between the predicted ground height and the measured height to classify the query point as an inlier to the GP model.
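The two inlier criteria can be written as a single predicate. The sketch below follows Equations (22) and (23) literally, including the signed (not absolute) height difference; the function name and the threshold values used in the example calls are assumptions for illustration only.

```python
import math


def is_ground_inlier(h_k, pred_mean, pred_var, sigma_n, delta_model, delta_data):
    """Apply criteria (22) and (23) to one query tuple.

    h_k       : measured height of the query tuple
    pred_mean : predicted ground height for its range (lambda*_k)
    pred_var  : predicted variance for its range (V_lambda*_k)
    """
    confident = pred_var < delta_model                                  # (22)
    normalized = (h_k - pred_mean) / math.sqrt(sigma_n ** 2 + pred_var)
    return confident and normalized < delta_data                        # (23)


# Illustrative calls with assumed thresholds delta_model = 0.5, delta_data = 1.0.
print(is_ground_inlier(0.05, 0.0, 0.01, 0.1, 0.5, 1.0))  # close to predicted ground
print(is_ground_inlier(1.00, 0.0, 0.01, 0.1, 0.5, 1.0))  # well above predicted ground
print(is_ground_inlier(0.05, 0.0, 0.90, 0.1, 0.5, 1.0))  # model too uncertain here
```

Because the difference in (23) is signed as written, a point lying far below the predicted ground also passes the data test; an implementation that should reject such points could compare the absolute value instead, which would be a deviation from the criterion stated above.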

3) Ground Segmentation Using Gaussian Process Regression and Incremental Sample Consensus: The INSAC algorithm is used in conjunction with the GP regression model to identify all of the ground points in the laser scan. The ground points are identified separately for each sector of the polar grid. The 3D scan is first binned into sectors as discussed in Section IV-A1. To initialize the ground segmentation, a seed set for each sector is selected as the range-height prototype tuples that lie within a threshold distance, $\delta_o$, of the laser scan origin. For each sector, the training points for the GP model, $\tilde{Y}_i$, are initialized as the sector's seed points. The remaining prototype tuples in the query set, $Y_i$, are evaluated against the GP model. The inlier tuples for the current iteration are collected in the inlier set, $Y_{in}$, and added to the training set, and a new GP model is produced. The process of adding inliers and re-generating the GP model is repeated until no further inliers can be added. At that point, all the prototype points for the bins have been classified as ground or not ground. To classify the remaining points in each ground bin, the height component of each point is compared to the height component of the bin's ground prototype point. If the absolute difference in height is less than a user-defined threshold, $\delta_g$, the point is also classified as a ground point. Denote the method $P^g \leftarrow \vartheta(P)$, which accepts a point cloud $P$ and returns the ground points $P^g$. The ground segmentation process is summarized in Algorithm 1. Ground segmentation results for an example laser scan are given in Figure 2.
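The incremental growth of the training set for one sector can be sketched as below. To keep the example short and self-contained, the GP regression is replaced by a deliberately simple stand-in predictor (`nearest_neighbour_predict`, a hypothetical helper that copies the height of the nearest training tuple and lets the variance grow with the range gap); any callable with the same signature, such as a GP predictor, could be passed instead. All names and threshold values are assumptions for illustration.

```python
import math


def insac_sector(tuples, seed_range, predict, sigma_n, delta_model, delta_data):
    """Grow the ground (training) set of one sector until no inliers are added.

    tuples  : list of (range, height) prototype tuples of the sector
    predict : callable(train, query_ranges) -> (means, variances); stands in
              for the GP predictive equations
    Returns (ground prototypes, remaining non-ground prototypes).
    """
    train = [t for t in tuples if t[0] < seed_range]    # seed set near the origin
    query = [t for t in tuples if t[0] >= seed_range]   # prototypes still to test
    new_inliers = list(train)
    while new_inliers and train and query:
        means, variances = predict(train, [r for r, _ in query])
        new_inliers = [
            (r, h) for (r, h), m, v in zip(query, means, variances)
            if v < delta_model and (h - m) / math.sqrt(sigma_n ** 2 + v) < delta_data
        ]
        train.extend(new_inliers)                        # re-train on more data
        query = [t for t in query if t not in new_inliers]
    return train, query


def nearest_neighbour_predict(train, query_ranges, growth=0.05):
    """Hypothetical stand-in for GP regression, used only for this demo."""
    means, variances = [], []
    for rq in query_ranges:
        r_near, h_near = min(train, key=lambda t: abs(t[0] - rq))
        means.append(h_near)
        variances.append(growth * abs(rq - r_near))
    return means, variances


prototypes = [(1.0, 0.0), (2.0, 0.02), (3.0, 0.05), (4.0, 2.0), (5.0, 0.08), (9.0, 0.1)]
ground, rest = insac_sector(prototypes, seed_range=1.5,
                            predict=nearest_neighbour_predict,
                            sigma_n=0.05, delta_model=0.12, delta_data=2.0)
print("ground:", ground)
print("not ground:", rest)
```

In this toy sector the tuple at 4 m is rejected by the data test because it sits about 2 m above the predicted ground, and the tuple at 9 m is never accepted because the stand-in model is not confident that far from its training data. The tuple at 5 m is only accepted in a later iteration, once the tuple at 3 m has joined the training set, which is the incremental behaviour the procedure above describes.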

B. Clustering of the Non-Ground Points

In order to cluster the non-ground points of a sparse point cloud, a region growing clustering technique based on bins of points is used. The bin assignments of the points generated for the ground segmentation process are reused for the clustering algorithm. Reuse of the cells and point bin assignments allows for rapid clustering of the non-ground points using a simple region growing approach (Haralick and Shapiro, 1985) that connects adjacent cells which contain non-ground points and satisfy user-defined merging conditions (such as a maximum distance between the means of the points located within adjacent cells, or the L2 distance between the distributions of points from adjacent cells).

Using a polar grid similar to the one described in Section IV-A1, but with a single index for simplicity of notation, denote a polar grid cell as $\xi_i \subset \mathbb{R}^2$ and the set of all such cells as $\Xi$. For a point cloud of non-ground points, $P^{ng}$, denote the points which fall within cell $\xi_i$ as

$$P^{ng}_i = \{p \in P^{ng} : (p_x, p_y) \in \xi_i\}, \tag{24}$$

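and the set of all cell point sets as $\mathcal{P} = \{P^{ng}_1, \ldots, P^{ng}_{|\Xi|}\}$. Denote a cluster of points as $\gamma \subseteq P^{ng}$, and the set of clusters as $\Gamma = \{\gamma_1, \ldots, \gamma_{N_\Gamma}\}$, where $N_\Gamma$ is the total number of clusters, which is initially unknown. The clusters are generated by first randomly sampling an initial bin index, $i$, from a uniform distribution, $U$, such that $i \sim U(1 \ldots |\Xi|)$.

Algorithm 1 $\vartheta(P)$: Given point cloud $P$, find the ground points, $P^g$, using GP-INSAC.

    $P^g \leftarrow \emptyset$
    for all $i \in 1 \ldots N_a$ do
        $Y_i \leftarrow \emptyset$
        for all $j \in 1 \ldots N_l$ do
            $x_{ij} = \arg\min_{p \in P_{ij}} (p_z)$
            $r_{ij} = \sqrt{(x^x_{ij})^2 + (x^y_{ij})^2}$
            $h_{ij} = x^z_{ij}$
            $y_{ij} = (r_{ij}, h_{ij})$
            $Y_i \leftarrow Y_i \cup y_{ij}$
        end for
    end for
    for all $i \in 1 \ldots N_a$ do
        $\tilde{Y}_i \leftarrow \{(r, h) \in Y_i : r < \delta_o\}$
        $Y_{in} \leftarrow \tilde{Y}_i$
        $Y_i \leftarrow Y_i \setminus \tilde{Y}_i$
        while $Y_{in} \neq \emptyset$ do
            $Y_{in} \leftarrow \emptyset$
            $[K_{qq}, K_{qt}, K_{tq}, K_{tt}] \leftarrow \Upsilon(\tilde{Y}_i, Y_i)$
            $\lambda^* \leftarrow K_{qt}[K_{tt} + \sigma_n^2 I]^{-1}\lambda$
            $V_{\lambda^*} \leftarrow K_{qq} - K_{qt}[K_{tt} + \sigma_n^2 I]^{-1}K_{tq}$
            for all $y_k \in Y_i$ do
                if $V_{\lambda^*_k} < \delta_{model}$ and $\frac{h_k - \lambda^*_k}{\sqrt{\sigma_n^2 + V_{\lambda^*_k}}} < \delta_{data}$ then
                    $Y_{in} \leftarrow Y_{in} \cup y_k$
                end if
            end for
            $\tilde{Y}_i \leftarrow \tilde{Y}_i \cup Y_{in}$
            $Y_i \leftarrow Y_i \setminus Y_{in}$
            for all $j \in 1 \ldots N_l$ do
                if $(r_{ij}, h_{ij}) \in \tilde{Y}_i$ then
                    $P_{in} \leftarrow \{p \in P_{ij} : |p_z - h_{ij}| < \delta_g\}$
                    $P^g \leftarrow P^g \cup P_{in}$
                end if
            end for
        end while
    end for

Letting $\mu_i \in \mathbb{R}^3$ be the mean of point set $P_i$, let the distance function $d : \mathcal{P} \times \mathcal{P} \mapsto \mathbb{R}$ return the distance between point sets $P_i$ and $P_j$ as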

$$d(P_i, P_j) = \|\mu_i - \mu_j\|_m, \tag{25}$$

where $m$ denotes the type of norm to be used. Using Equation (25), the nearest neighbour set, $\Omega$, of point set $P_i$ can be given as

$$\Omega = \{P \in \mathcal{P} : d(P, P_i) < \delta_{nn}\} \setminus P_i, \tag{26}$$

where $\delta_{nn}$ determines the size of the neighbourhood to consider for point set $P_i$.

Fig. 2. Ground segmentation results for an outdoor scene with large trees and shrubs. Blue points have been classified as ground, and red X points are classified as non-ground. (a) Classification of prototype points for one sector after performing the GP based ground segmentation; the x-axis shows the prototype point range index (the division of bins for the sector) and the y-axis shows the prototype point height [m]. (b) Ground segmentation result for the entire 3D laser scan. The scan has been binned with the parameters $\tau_a = 8$ degrees and $\tau_l = 1.875$ meters.

For the randomly selected point set $P_i$, the nearest neighbours are generated and added to the nearest neighbour set, $\Omega$. An element from the nearest neighbour set, $P_j \in \Omega$, can be compared to $P_i$ to determine if the two can be combined. Merging of the two point sets is decided using the function $\varrho : \mathcal{P} \times \mathcal{P} \mapsto \{0, 1\}$. The function $\varrho(P_i, P_j)$ returns 1 if the two point sets $P_i$ and $P_j$ can be merged, and 0 otherwise. In the merging function, different metrics can be used to compare the point sets, such as the Euclidean distance between the means of the points, the L2 distance between the Gaussian distributions formed by the point sets, or a test for Gaussian fit. If the bins can be combined, the neighbour bin is added to an open list of point sets, $Q_o$, where the notation $Q_o^k$ denotes the $k$th element of $Q_o$. The neighbour point set is also removed from the set of all point sets and is merged into the points of the current cluster, $\gamma_{cur}$. The nearest neighbours of the bins in $Q_o$ are then explored and evaluated in turn. When a bin is explored, those of its nearest neighbours that satisfy the evaluation function, $\varrho$, are also added to $Q_o$. Once $Q_o$ is empty, no further nearest neighbours can be assigned to the current cluster, so a new point set from $\mathcal{P}$ is selected and the process is repeated. The process continues until all non-empty point sets in $\mathcal{P}$ have been assigned to a cluster. The region growing algorithm is presented in Algorithm 2. Denote the method $\Gamma \leftarrow \eta(P^{ng})$ from Algorithm 2, which accepts the point cloud of non-ground points and returns the set of clusters determined by the region growing algorithm. An example of the resulting Gaussian distributions after the clustering has been completed is presented in Figure 3.
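The region growing procedure described above can be sketched in a few lines. This is a minimal pure-Python illustration, not the C++/ROS implementation used in the experiments: cells are held in a dictionary keyed by cell index, the Euclidean norm is assumed for Equation (25), empty cells are discarded up front rather than skipped when sampled, and the merge test `means_close` is a hypothetical example of the function $\varrho$.

```python
import math
import random


def mean(points):
    """Component-wise mean of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(3))


def means_close(points_a, points_b, max_gap=1.5):
    """Example merge test: Euclidean distance between the two means."""
    return math.dist(mean(points_a), mean(points_b)) < max_gap


def region_grow(cells, delta_nn, can_merge, rng=None):
    """Cluster cell point sets by region growing.

    cells     : dict mapping a cell index to a list of (x, y, z) points
    delta_nn  : neighbourhood radius on the distance between cell means
    can_merge : callable(points_a, points_b) -> bool
    """
    rng = rng or random.Random(0)
    remaining = {i: pts for i, pts in cells.items() if pts}  # drop empty cells
    clusters = []
    while remaining:
        seed = rng.choice(sorted(remaining))   # random initial cell
        seed_pts = remaining.pop(seed)
        cluster = list(seed_pts)
        open_list = [seed_pts]
        while open_list:
            current = open_list.pop(0)
            mu_c = mean(current)
            neighbours = [i for i, pts in remaining.items()
                          if math.dist(mean(pts), mu_c) < delta_nn]
            for i in neighbours:
                if can_merge(remaining[i], current):
                    pts = remaining.pop(i)     # remove from the set of point sets
                    open_list.append(pts)      # explore its neighbours later
                    cluster.extend(pts)        # merge into the current cluster
        clusters.append(cluster)
    return clusters


cells = {0: [(0.0, 0.0, 1.0)], 1: [(1.0, 0.0, 1.0)], 2: [(2.0, 0.0, 1.0)],
         3: [(10.0, 0.0, 1.0)], 4: []}
clusters = region_grow(cells, delta_nn=1.5, can_merge=means_close)
print([len(c) for c in clusters])
```

In the toy input, cells 0 to 2 are chained together through pairwise neighbours even though cells 0 and 2 are farther apart than $\delta_{nn}$, while cell 3 forms its own cluster; this chaining is the behaviour that lets a cluster follow an extended natural feature.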

Algorithm 2 $\eta(P^{ng})$: Given the set of cell point sets, $\mathcal{P}$, for the non-ground point cloud, perform region growing clustering and return the set of clusters $\Gamma$.

    while $\mathcal{P} \neq \emptyset$ do
        $i \sim U(1 \ldots |\Xi|)$
        $\mathcal{P} \leftarrow \mathcal{P} \setminus P_i$
        if $P_i \neq \emptyset$ then
            $\gamma_{cur} \leftarrow P_i$
            $Q_o \leftarrow P_i$
            while $Q_o \neq \emptyset$ do
                $\bar{P} \leftarrow Q_o^1$
                $Q_o \leftarrow Q_o \setminus Q_o^1$
                $\Omega = \{P \in \mathcal{P} : d(P, \bar{P}) < \delta_{nn}\} \setminus \bar{P}$
                for all $P \in \Omega$ do
                    if $\varrho(P, \bar{P})$ then
                        $Q_o \leftarrow Q_o \cup P$
                        $\gamma_{cur} \leftarrow \gamma_{cur} \cup P$
                        $\mathcal{P} \leftarrow \mathcal{P} \setminus P$
                    end if
                end for
            end while
            $\Gamma \leftarrow \Gamma \cup \gamma_{cur}$
        end if
    end while

C. SRG-NDT Cost Function

The NDT cost function is modified to score each point in the scene scan with respect to all of the Gaussian distributions generated by performing the region growing clustering algorithm on the segmented reference scan. Denote a point in the scene scan that has been transformed by the parameter estimate $E$ as $p' = T_E(p)$. The transformed point can be scored with respect to a Gaussian cluster, $\gamma_j$, with mean $\mu_j$ and covariance $\Sigma_j$, using the NDT score function, $\rho$, given as

$$\rho_{\gamma_j}(p') = \exp\left(-\frac{(p' - \mu_j)^T \Sigma_j^{-1} (p' - \mu_j)}{2}\right). \tag{27}$$
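Equation (27) can be evaluated as sketched below, together with the sample mean and covariance used to turn a cluster into a Gaussian. The sketch is pure Python and assumes a non-degenerate covariance (a cluster whose points are coplanar would need regularization, which is not covered here); the function names and the small Gaussian elimination helper are assumptions for the example, and the points passed to `total_score` are taken to be already transformed.

```python
import math


def fit_gaussian(points):
    """Sample mean and covariance (1/(n-1) normalization) of 3D points."""
    n = len(points)
    mu = [sum(p[d] for p in points) / n for d in range(3)]
    cov = [[sum((p[a] - mu[a]) * (p[b] - mu[b]) for p in points) / (n - 1)
            for b in range(3)] for a in range(3)]
    return mu, cov


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def ndt_score(p, mu, cov):
    """Equation (27): exp(-(p - mu)^T cov^{-1} (p - mu) / 2)."""
    d = [p[i] - mu[i] for i in range(3)]
    w = solve(cov, d)  # cov^{-1} (p - mu) without forming the inverse
    return math.exp(-0.5 * sum(di * wi for di, wi in zip(d, w)))


def total_score(points, clusters):
    """Sum the score of every transformed point over all Gaussian clusters."""
    return sum(ndt_score(p, mu, cov) for p in points for mu, cov in clusters)


cluster = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 2.0)]
mu, cov = fit_gaussian(cluster)
print(total_score([(0.5, 0.5, 0.5), (3.0, 3.0, 3.0)], [(mu, cov)]))
```

Each point contributes to the total through every cluster, with distant clusters contributing values close to zero, so the sum varies smoothly as the transformation parameters change.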

The points in the scene scan are scored against all of the Gaussian clusters from the reference scan, which allows for the strongest contributions to come from the most likely clusters and provides a continuous and differentiable cost function for the optimization. Similar to the standard NDT algorithm, it is possible to compute an analytic gradient, $g$, and Hessian, $H$, for the cost function, which can be used to improve the performance of the nonlinear optimization. The optimization is initialized from a parameter estimate $\bar{E}$, and terminates once the norm of the gradient, $\|g\|$, is less than a user-specified threshold, $\epsilon_{ndt}$. Note that since the cost is continuous, the analytic gradient and Hessian exist and are well defined over the whole transformation parameter space, thus guaranteeing convergence of the optimization using any gradient based method.

Fig. 3. The resulting Gaussian distributions after performing the clustering algorithm on the non-ground points of the laser scan. One-sigma Gaussian distributions are shown.

A summary of SRG-NDT with point to distribution matching is given in Algorithm 3. For distribution to distribution matching SRG-NDT, a similar method is applied, except both laser scans are segmented and clustered and the sets of resulting Gaussian distributions for each scan are compared using the distribution to distribution NDT cost function.

V. EXPERIMENTAL RESULTS

The SRG-NDT approach is evaluated using two data sets of different environments and is compared to other scan registration algorithms. The algorithms used for comparison are ICP, NDT and G-ICP, and for the purposes of comparison, the Point Cloud Library (PCL) implementations of these algorithms are used (Bogdan and Cousins, 2011). Note that for the experiments, the distribution to distribution version of SRG-NDT was evaluated and implemented in C++/ROS, as the distribution to distribution version of NDT outperforms the point to distribution version in both registration quality and run-time (Stoyanov et al., 2012a). The ICP algorithms use a maximum correspondence distance of 10 m and the NDT algorithm uses a grid size of 3 m. For all optimization based approaches, the optimization is terminated when the norm of the gradient or the norm of the step size falls below $10^{-6}$. The criterion for merging bins in the region growing clustering algorithm is a test for Gaussian fit, which is also used by Stoyanov et al. to partition NDT cells (Stoyanov et al., 2012a).

Algorithm 3 Register scene scan $P_S$ with reference scan $P_R$ using the SRG-NDT algorithm and initial parameter estimate $\bar{E}$.

    $E \leftarrow \bar{E}$
    $P^g \leftarrow \vartheta(P_R)$
    $P^{ng} \leftarrow P_R \setminus P^g$
    $\Gamma \leftarrow \eta(P^{ng})$
    for all $\gamma_j \in \Gamma$ do
        $\mu_{\gamma_j} \leftarrow \frac{1}{|\gamma_j|} \sum_{p \in \gamma_j} p$
        $\Sigma_{\gamma_j} \leftarrow \frac{1}{|\gamma_j| - 1} \sum_{p \in \gamma_j} (p - \mu_{\gamma_j})(p - \mu_{\gamma_j})^T$
    end for
    while $\|g\| > \epsilon_{ndt}$ do
        $s \leftarrow 0$
        $g \leftarrow 0$
        $H \leftarrow 0$
        for all $p \in P_S$ do
            $p' \leftarrow T_E(p)$
            for all $\gamma_j \in \Gamma$ do
                $s \leftarrow s + \rho_{\gamma_j}(p')$
                $g \leftarrow g + \Pi_g(p', E)$
                $H \leftarrow H + \Pi_H(p', E)$
            end for
        end for
        solve: $H \Delta E = -g$
        $E \leftarrow E + \Delta E$
    end while
    $E^* \leftarrow E$

In order to test the accuracy of SRG-NDT against other scan registration methods, the Ford Campus Vision and LIDAR data set (Pandey et al., 2011) and the Worcester Polytechnic Institute (WPI) data set are used. The Ford data set consists of approximately 6200 scans, and was generated using a Velodyne HDL-64E LIDAR with ground truth acquired by integrating the high quality velocity and acceleration measurements produced by an Applanix POS-LV 420 INS with Trimble GPS. When scans were aligned using the ground truth measurements, the alignment was seen to have on the order of sub-degree accuracy in orientation and decimeter accuracy in position. To determine the robustness of the proposed method in a sparse, forested environment, the WPI data set is used. The WPI data was collected as part of the NASA Sample Return Robot Challenge at Worcester Polytechnic Institute and consists of approximately 5000 scans. The laser scans were collected using a Velodyne HDL-32E LIDAR mounted to a custom-made chassis designed for the challenge. The environment consists mainly of trees and other foliage, as well as a single small gazebo structure.

To assess the scan registration quality for the tested algorithms when ground truth is not available, the pair-wise registered scans are aggregated into a map. The maps are evaluated using a crispness measure, which is also used for map evaluation in (Douillard et al., 2012). The crispness of a map is evaluated by discretizing the 3D space into voxels and recording the number of occupied voxels. For the experiment, a voxel size of 0.1 m is used. A map which evaluates to a lower number is said to be crisper than a map which evaluates to a higher number. The crispness quality measure attempts to quantify the blurring in the aggregated map caused by incremental pair-wise registration errors as the scans are aggregated.
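The crispness measure amounts to counting occupied voxels, which can be sketched as follows. The function name `crispness` and the two toy point lists are assumptions for illustration; only the 0.1 m voxel size is taken from the text.

```python
import math


def crispness(points, voxel_size=0.1):
    """Number of occupied voxels for a list of (x, y, z) points, in metres."""
    occupied = {tuple(math.floor(c / voxel_size) for c in p) for p in points}
    return len(occupied)


# The same surface point observed three times: well aligned vs. smeared out.
sharp = [(0.01, 0.01, 0.01), (0.02, 0.03, 0.04), (0.05, 0.05, 0.05)]
blurred = [(0.01, 0.01, 0.01), (0.12, 0.01, 0.01), (0.25, 0.01, 0.01)]
print(crispness(sharp), crispness(blurred))
```

Repeated observations of the same surface fall into the same voxel when the scans are well aligned, so registration error shows up directly as a larger count.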

The experimental results are presented in three parts. First, a comparison of the accuracy of the tested scan registration methods using the Ford data set and the WPI data set is presented. Second, an analysis which demonstrates the effect of ground segmentation on registration quality is discussed. Finally, a comparison of the run-times for the tested methods is shown.

A. Evaluation of Scan Registration Accuracy

1) Outdoor Urban Environments: In order to test the accuracy of the scan registration in an urban environment, each algorithm performs pair-wise scan registration on the LIDAR data from the Ford data set, over the full 6200 scans. The registration is initialized with a parameter estimate of zero translation and zero rotation. The error in the translation and rotation of the registration is then compared to the ground truth measurements. The translation and rotation error distributions for the evaluated algorithms are displayed in Figure 4, and the run-times are presented in Figure 9.

The error distributions in Figure 4 demonstrate that the SRG-NDT and G-ICP algorithms produce the registrations with the highest accuracy. The NDT algorithm also shows fairly low translation and rotation errors; however, it does not appear to be as accurate as SRG-NDT or G-ICP. Note that the failed registrations (denoted as outliers in Figure 4) account for less than 10% of the total scan registrations performed. The crispness measure for each registration algorithm is generated from the aggregate map formed from a subset of 100 scans. The crispness measures, expressed as percent improvement over the ICP crispness value, are presented in Table I along with the average translation and rotation errors for ICP, G-ICP, NDT, and SRG-NDT. To evaluate the strength of the correlation between the crispness measure and the accuracy of the scan registration, the Pearson product-moment correlation between the crispness and the average translation error is computed. No correlation is calculated for the rotation error, as the deviation in error between the evaluated algorithms is very small. The correlation between the crispness and translation error is −0.9748, and is found to be statistically significant at a confidence level of 95%. The crispness is thus very strongly correlated with the accuracy of the scan registration, which strengthens the case for the use of crispness to determine the registration quality for data where ground truth is not available.

The qualitative result of map crispness can be seen when a subset of the registered scans is aggregated into a map. When aggregated, SRG-NDT and G-ICP generate high quality maps, while ICP produces a lower quality map due to the accumulation of registration error, as illustrated in Figure 5. The aggregated map of G-ICP is visually indistinguishable from that of SRG-NDT and is therefore not included in the figure. The maps produced by SRG-NDT show very little blurring of features such as the building walls and cars. Conversely, the map generated by ICP has a blurred appearance and less sharp features.

TABLE I
PERCENT IMPROVEMENT OF MAP CRISPNESS AND AVERAGE TRANSLATION AND ROTATION ERROR (FORD DATASET).

                   ICP      G-ICP    NDT      SRG-NDT
Crispness          –        14.5%    12.1%    14.2%
Trans. err. [m]    0.363    0.199    0.271    0.190
Rot. err. [rad]    0.0037   0.0030   0.0036   0.0030

TABLE II
PERCENT IMPROVEMENT OF MAP CRISPNESS USING THE WPI DATASET.

                   ICP      G-ICP    NDT      SRG-NDT
Crispness          –        13.5%    14.5%    15.2%

2) Accuracy in Sparse Forested Environments: To determine the robustness of the proposed scan registration method in a forested environment, an experiment analogous to the one performed for the urban environment is conducted using the WPI data set. Each algorithm performs pair-wise scan registration using every 50th scan produced by the LIDAR, as the robot moved slowly through the environment. The registration is initialized with a parameter estimate of zero translation and zero rotation. The experiment demonstrates performance in situations where odometry estimates are highly inaccurate, such as in GPS denied environments, or on vehicles where wheel slip or other factors make accurate state estimation difficult. Since accurate ground truth is not available for the WPI data, the aggregated maps are evaluated based solely on the crispness measure. The crispness measures comparing ICP, G-ICP, NDT, and SRG-NDT are presented in Table II.

The crispness measure illustrates that G-ICP was not able to provide scan registration results as accurate as those of the NDT or SRG-NDT algorithms. The inaccuracy is most likely because G-ICP relies on accurate surface normals to determine the correct point correspondences, which is difficult to achieve in a forested environment. The stronger crispness score of SRG-NDT over NDT is likely a result of the poor convergence properties of NDT, along with inaccurate modeling of the environment from the voxelized discretization. The SRG-NDT algorithm models environmental features such as trees and shrubs as a mixture of Gaussian distributions, thus good registration results are achievable in sparse environments with poor local surface structure. The result of aggregating the registered scans is presented in Figure 6, which shows the G-ICP and SRG-NDT maps from a top-down view of the traversed area, as well as a close-up view of the environmental features. The G-ICP map shows significant drift and occasional poor convergence to local minima, while the SRG-NDT map maintains very crisp features. The crispness of the features can be clearly identified by inspecting the alignment of the structural columns seen in the close-up view.

Fig. 4. Box-plots of the error distributions for each algorithm (ICP, G-ICP, NDT, SRG-NDT) for the experiment on the Ford campus data set. (a) Absolute translation error [m]. (b) Absolute rotation error [rad].

Fig. 5. Resulting maps generated by aggregating scans using pair-wise (a) ICP and (b) SRG-NDT registration, shown from a top-down view. Note the sharper corners, walls, cars and other features in the SRG-NDT map compared to the ICP generated map. The map is coloured based on height from the ground.

B. Effect of Ground Segmentation on Accuracy

To demonstrate the effect of ground segmentation on scan registration quality, an experiment is conducted where ground segmentation is performed to varying degrees prior to scan registration. For the Ford data set, a set of scans is extracted from a sequence of the scan data where the tested scan registration algorithms all converged to parameters that were within 10% of the ground truth values. For every scan in the scan set, the ground is segmented using the GP-INSAC method. Once the ground is segmented, a portion of the ground points is recombined with the non-ground points; this portion is obtained by first passing the ground points through a voxel grid filter. The proportion of ground points combined with the non-ground points is controlled by the size of the voxel grid: the larger the voxel size, the fewer ground points are included in the registration process. In the experiments, the proportion of ground points combined with non-ground points ranges from full ground (FG) to no ground (NG), with increasing voxel grid filter size between FG and NG. For each scan set, the scans are pair-wise registered using the ICP, G-ICP and NDT algorithms. The normalized crispness measure (normalized to the crispness of the aggregated map with full ground) is calculated, as well as the translation and rotation registration errors as compared to the data set's ground truth. The computation time for the registration is also recorded. The results of the experiment for the Ford data set, comparing ICP, G-ICP, and NDT, are presented in Figure 7. A similar experiment is performed using the WPI data set; however, since no ground truth data is available, only the normalized crispness measure and run time are calculated. The results of the experiment using the WPI data set, comparing the ICP, G-ICP and NDT algorithms, are presented in Figure 8.
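The construction of the registration input for one level of ground removal can be sketched as follows. The text does not state how the voxel grid filter represents each voxel; the sketch assumes a centroid-based filter, and the function names `voxel_grid_filter` and `mix_ground` as well as the toy point sets are hypothetical.

```python
import math


def voxel_grid_filter(points, voxel_size):
    """Replace all points that share a voxel with their centroid (assumed)."""
    buckets = {}
    for p in points:
        key = tuple(math.floor(c / voxel_size) for c in p)
        buckets.setdefault(key, []).append(p)
    return [tuple(sum(q[d] for q in pts) / len(pts) for d in range(3))
            for pts in buckets.values()]


def mix_ground(ground, non_ground, voxel_size=None, keep_ground=True):
    """Build the point set passed to registration for one ground-removal level.

    keep_ground=True with voxel_size=None corresponds to full ground (FG);
    keep_ground=False corresponds to no ground (NG).
    """
    if not keep_ground:
        return list(non_ground)
    if voxel_size is None:
        return list(ground) + list(non_ground)
    return voxel_grid_filter(ground, voxel_size) + list(non_ground)


ground = [(float(i), 0.0, 0.0) for i in range(16)]   # toy strip of ground points
non_ground = [(1.0, 1.0, 1.0)]
for size in (None, 2.0, 8.0):
    print(size, len(mix_ground(ground, non_ground, voxel_size=size)))
print("NG", len(mix_ground(ground, non_ground, keep_ground=False)))
```

Larger voxels retain fewer ground points, so sweeping the voxel size moves the input gradually from the full-ground case toward the no-ground case.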

From the results of the experiments, it is evident that the removal of the ground points does not significantly affect the quality of the scan registration. For the aggregated maps using the ICP, G-ICP and NDT algorithms, the normalized crispness measure of the map constructed using the down-sampled ground points does not deviate significantly from the crispness result generated using scans containing the full ground. From the Ford data set, it is seen that the error in the translation and rotation parameters is relatively constant across the varying degrees of ground removal, suggesting that the removal of the ground does not negatively affect the scan registration quality and that the ground points do not contribute any significant information to the registration process. The sharp increase in translation error for G-ICP at a 12.8 m voxel filter size is likely the result of some of the scan registrations converging to local minima. Other than the anomaly seen with G-ICP, the remainder of the translation and rotation errors are relatively consistent.

The execution time plots in Figures 7(d) and 8(b) illustrate that as increasing percentages of the ground points are removed, the registration run time decreases. The decrease in run time is expected, as the reduction of points from the ground removal will inherently decrease scan registration time, especially for the ICP and G-ICP algorithms where nearest neighbour correspondences are required. Overall, the experiment illustrates that the removal of the ground does not negatively affect the quality of the scan registration, while decreasing the run time of the scan registration algorithm. Similar results are illustrated by the experiment conducted using the WPI data set, where the crispness measure remains relatively constant across varying degrees of ground removal, while the required registration run time is decreased. The decrease in run time is even more pronounced for the WPI data set as the environment is fairly sparse, thus the removal of the ground points discards a larger portion of the points as compared to the urban environment.

Fig. 6. Top-down and close-up views of the resulting aggregated maps using the SRG-NDT and G-ICP algorithms. (a) Pair-wise G-ICP registration. (b) Pair-wise SRG-NDT registration. (c) Close-up view of the G-ICP map. (d) Close-up view of the SRG-NDT map. Note the crispness in features and reduced map drift for SRG-NDT (Figures (b) and (d)), as compared to the G-ICP maps (Figures (a) and (c)).

C. Run-time evaluation

The run-times required for each registration algorithm are visualized in the box-plots presented in Figure 9. The run-time evaluation clearly demonstrates that the SRG-NDT algorithm is able to perform scan registration significantly faster than the existing methods. The average run-time values and percent speed-up achieved using SRG-NDT are presented in Table III. On average, the SRG-NDT algorithm achieves a 90.1% speed-up over ICP, a 95.3% speed-up over G-ICP, and a 94.5% speed-up over NDT. The computational effectiveness of the SRG-NDT algorithm is a result of using a low-cost segmentation and clustering algorithm to remove the ground points of the scan, then forming a relatively small set of Gaussian clusters using the natural features that capture a sufficient amount of detail within the environment. Note that on average, the ground segmentation and clustering require only 30 ms to perform.

Fig. 7. Ford data set: Results of scan registration accuracy and run time for G-ICP, ICP and NDT with varying degrees of ground removal. The x-axis of each panel is the ground point voxel filter size [m], from FG through 0.1, 0.2, 0.4, 0.8, 1.6, 3.2, 6.4, 12.8, 25.6 and 51.2 to NG. (a) Normalized crispness value. (b) Translation error [m]. (c) Rotation error [rad]. (d) Average run time [ms] in log scale.

Fig. 8. WPI data set: Results of scan registration accuracy and run time for G-ICP, ICP and NDT with varying degrees of ground removal, using the same voxel filter sizes as Fig. 7. (a) Normalized crispness value. (b) Average run time [ms] in log scale.

Fig. 9. Box-plots illustrating registration run-time [ms] (log scale) for the Ford and WPI datasets. (a) Ford dataset. (b) WPI dataset.

TABLE III
AVERAGE RUN-TIMES (MS) FOR FORD AND WPI DATASETS AND PERCENT IMPROVEMENT OF SRG-NDT.

              ICP      G-ICP    NDT      SRG-NDT
Ford [ms]     2750     5599     5782     250
% Imp.        90.1%    95.5%    95.6%    –
WPI [ms]      2313     4614     3434     228
% Imp.        90.1%    95.1%    93.3%    –
Avg. Imp.     90.1%    95.3%    94.5%    –

VI. CONCLUSION

This work presents the Segmented Region Growing NDT algorithm, which segments the ground from the scans, clusters the non-ground points using a region growing clustering algorithm, then models each cluster as a Gaussian distribution. Unlike the NDT algorithm, where the cost function is discontinuous as points cross over cell boundaries, the SRG-NDT algorithm evaluates the NDT cost function at all of the generated clusters, resulting in a smooth and continuous cost function which guarantees convergence of the optimization. The region growing clustering algorithm generates a smaller set of clusters that sufficiently model the environment using natural features. The proposed method is verified experimentally to perform rapid scan registration with comparable accuracy to existing methods. First, the SRG-NDT approach is applied to distribution to distribution NDT, and is experimentally shown to produce accurate registration results in both urban and forested environments. Second, a study of the effect of ground segmentation on registration accuracy demonstrates that the ground points do not contribute significantly to the scan registration and can be safely discarded when sufficient natural features are present in the environment. Finally, SRG-NDT is shown to be significantly faster (up to 95% speed-up) than the existing scan registration methods. Future work involves using the Gaussian clusters as features for initial alignment of the scans, optimizing the implementation for full graph SLAM applications, and evaluating the algorithm in a larger set of environments.

References

Almqvist, H., Magnusson, M., Stoyanov, T., and Lilienthal, A.(2013). Improving point-cloud accuracy from a movingplatform in field operations. In IEEE InternationalConference on Robotics and Automation (ICRA), pages733–738, Karlsruhe, Germany.

Besl, P. and McKay, H. (1992). A method for registration of3-D shapes. IEEE Transactions on Pattern Analysis andMachine Intelligence, 14(2):239 –256.

Biber, P. and Strasser, W. (2003). The normal distributionstransform: a new approach to laser scan matching. InIEEE/RSJ International Conference on Intelligent Robotsand Systems (IROS), volume 3, pages 2743 – 2748, LasVegas, NV, USA.

Bogdan, R. and Cousins, S. (2011). 3D is here: Point CloudLibrary (PCL). In IEEE International Conference onRobotics and Automation (ICRA), pages 1–4, Shanghai,China.

Borrmann, D., Elseberg, J., Lingemann, K., Nuchter, A., andHertzberg, J. (2008). Globally consistent 3-D map-

ping with scan matching. Journal Of Robotics andAutonomous Systems, 56(2):130–142.

Chen, Y. and Medioni, G. (1991). Object modeling by regis-tration of multiple range images. In IEEE InternationalConference on Robotics and Automation (ICRA), pages2724 –2729, Sacramento, CA , USA.

Das, A. and Waslander, S. (2012). Scan registration withmulti-scale K-means normal distributions transform. InIEEE International Conference on Intelligent Robots andSystems (IROS), pages 2705–2710, Vilamoura, Algarve,Portugal.

Douillard, B., Quadros, A., Morton, P., Underwood, J.,De Deuge, M., Hugosson, S., Hallstrom, M., and Bailey,T. (2012). Scan segments matching for pairwise 3Dalignment. In 2012 IEEE International Conference onRobotics and Automation (ICRA), pages 3033–3040, St.Paul, MN, USA.

Douillard, B., Underwood, J., Kuntz, N., Vlaskine, V.,Quadros, A., Morton, P., and Frenkel, A. (2011). On thesegmentation of 3D lidar point clouds. In 2011 IEEEInternational Conference on Robotics and Automation(ICRA), pages 2798–2805, Shanghai, China.

Einhorn, E. and Gross, H.-M. (2013). Generic 2D/3D SLAM with NDT maps for lifelong application. To appear in European Conference on Mobile Robots (ECMR), Barcelona, Spain.

Golovinskiy, A., Kim, V., and Funkhouser, T. (2009). Shape-based recognition of 3D point clouds in urban environments. In IEEE International Conference on Computer Vision (ICCV), pages 2154–2161, Kyoto, Japan.

Haralick, R. and Shapiro, L. (1985). Image segmentation techniques. Journal of Computer Vision Graphics and Image Processing, 29(1):100–132.

Klasing, K., Wollherr, D., and Buss, M. (2008). A clustering method for efficient segmentation of 3D laser data. In IEEE International Conference on Robotics and Automation (ICRA), pages 4043–4048, Pasadena, California.

Magnusson, M. (2009). The Three-Dimensional Normal-Distributions Transform – an Efficient Representation for Registration, Surface Analysis, and Loop Detection. PhD thesis, Örebro University. Örebro Studies in Technology 36.

Magnusson, M., Duckett, T., and Lilienthal, A. (2007). Scan registration for autonomous mining vehicles using 3D-NDT. Journal of Field Robotics, 24(10):803–827.

Magnusson, M., Nuchter, A., Lorken, C., Lilienthal, A., and Hertzberg, J. (2009). Evaluation of 3D registration reliability and speed – a comparison of ICP and NDT. In IEEE International Conference on Robotics and Automation (ICRA), pages 3907–3912, Kobe, Japan.

Moosmann, F., Pink, O., and Stiller, C. (2009). Segmentation of 3d lidar data in non-flat urban environments using a local convexity criterion. In 2009 IEEE Intelligent Vehicles Symposium, pages 215–220.

Nuchter, A. (2009). 3D Robotic Mapping - The Simultaneous Localization and Mapping Problem with Six Degrees of Freedom, volume 52 of Springer Tracts in Advanced Robotics. Springer.

Pandey, G., McBride, J., and Eustice, R. (2011). Ford campus vision and lidar data set. International Journal of Robotics Research, 30(13):1543–1552.

Saarinen, J., Andreasson, H., Stoyanov, T., and Lilienthal, A. J. (2013a). Normal distributions transform monte-carlo localization (NDT-MCL). In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 382–389, Tokyo, Japan.

Saarinen, J. P., Andreasson, H., Stoyanov, T., and Lilienthal, A. J. (2013b). 3D normal distributions transform occupancy maps: an efficient representation for mapping in dynamic environments. To appear in: International Journal of Robotics Research.

Segal, A., Haehnel, D., and Thrun, S. (2009). Generalized-ICP. In Robotics: Science and Systems (RSS), pages 26–27, Seattle, USA.

Stoyanov, T., Magnusson, M., Andreasson, H., and Lilienthal, A. J. (2010). Path Planning in 3D Environments using the Normal Distributions Transform. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 3263–3268.

Stoyanov, T., Magnusson, M., and Lilienthal, A. J. (2012a). Fast and accurate scan registration through minimization of the distance between compact 3D-NDT representations. The International Journal of Robotics Research, 31:1377–1393.

Stoyanov, T., Mojtahedzadeh, R., Andreasson, H., and Lilienthal, A. J. (2012b). Comparative evaluation of range sensor accuracy for indoor mobile robotics and automated logistics applications. Journal of Robotics and Autonomous Systems.

Stoyanov, T., Saarinen, J., Andreasson, H., and Lilienthal, A. J. (2013). Normal distributions transform occupancy map fusion: Simultaneous mapping and tracking in large scale dynamic environments. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 4702–4708, Tokyo, Japan.

Takeuchi, E. and Tsubouchi, T. (2006). A 3-d scan matching using improved 3-d normal distributions transform for mobile robotic mapping. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 3068–3073, Beijing, China.

Tongtong, C., Bin, D., Daxue, L., Bo, Z., and Qixu, L. (2011). 3D LIDAR-based ground segmentation. In Asian Conference on Pattern Recognition (ACPR), pages 446–450, Beijing, China.

Vasudevan, S., Ramos, F., Nettleton, E., and Durrant-Whyte, H. (2009). Gaussian process modeling of large-scale terrain. Journal of Field Robotics, 26(10):812–840.

Zhang, Z. (1994). Iterative point matching for registration of free-form curves and surfaces. International Journal of Computer Vision, 13(2):119–152.
