
Using Geometry to Detect Grasping Points on 3D Unknown Point Cloud

Brayan S. Zapata-Impata 1,2, Carlos M. Mateo 1,2, Pablo Gil 1,2 and Jorge Pomares 1,2

1 Physics, System Engineering and Signal Theory, University of Alicante, 03690 Alicante, Spain
2 Computer Science Research Institute, University of Alicante, Alicante, Spain

{brayan.impata, cm.mateo, pablo.gil, jpomares}@ua.es

Keywords: Grasping, 3D point clouds, Surface detection, Handle grasping, Unknown object manipulation.

Abstract: In this paper, we focus on the task of computing a pair of points for grasping unknown objects, given a single point cloud scene with a partial view of them. The main goal is to estimate the best pair of 3D-located points so that a gripper can perform a stable grasp over the objects in the scene with no prior knowledge of their shape. We propose a geometrical approach to find those contact points by placing them near a cutting plane that is perpendicular to the object's main axis and passes through its centroid. During the experimentation we have found that this solution is fast enough and gives sufficiently stable grasps to be used on a real service robot.

1 INTRODUCTION

The task of grasping objects using robots equipped with grippers has been widely studied in the state of the art. Often, to build autonomous robots and specifically to carry out the grasping task, researchers use information acquired from visual sensors (Gil et al., 2016). In the past, the proposed approaches usually recognised the object in the scene from one or more views and later detected potential grasping points using a previously stored 3D model of that object. These points can also be computed by considering the grasping problem as a classification task, where large datasets are required for training and testing.

In this paper, we use a single point cloud with a partial view of the objects present in the scene. Moreover, the objects are unknown, hence they have not been previously recognised and we do not have a 3D model from which to compute candidate grasping points. Our main goal is to estimate the best pair of 3D-located points so that a gripper can perform a stable grasp over the object with no prior knowledge.

Recently, several authors have developed learning approaches to this problem by finding a gripper configuration using a grasping rectangle. (Jiang et al., 2011) introduced this idea, representing the grasp as a 2D oriented rectangle in the image, with two of the edges corresponding to the gripper plates and the other two representing its opening width. Originally, the authors used RGBD images to find the optimal grasping rectangles by means of a ranking linear function, learnt using a supervised machine learning algorithm.

Afterwards, this grasping rectangle has been learnt using deep learning techniques. (Lenz et al., 2013) used RGBD images to train a deep neural network that generated a set of rectangles ranked by features obtained from the data contained inside the bounds of the grasping rectangle. In (Wang et al., 2016) the authors built a multimodal Convolutional Neural Network (CNN) instead.

Some authors have tested the grasping rectangle calculation using a different set of features apart from the RGBD channels. For instance, (Trottier et al., 2016) used RGBD images extended with more features like grey maps and depth normals. In (Redmon and Angelova, 2015), the authors' proposal consisted not in adding more channels to the images but in using only the red, green and depth channels.

Although learning approaches have proved to be highly accurate, they require a significant amount of data and time to fine-tune the learning architectures and the input features in order to be able to generalise.

Another frequently taken path to solve this problem consists in reconstructing a mesh from the seen object to compute the grasping points on complete CAD models, or in retrieving them from template grasps. In (Varley et al., 2015), the authors proposed a system that segmented point clouds to find the objects in the scene and then reconstructed meshes so the GraspIt! simulator (Miller and Allen, 2004) could find the best grasp configuration.

In (Vahrenkamp et al., 2016), the authors proposed a database of grasp templates over segmented meshes. During online grasp calculation, the robot would decompose the object's RGBD image into meshes of primitive forms to match them against the templates. Following this same idea, (Jain and Argall, 2016) proposed an algorithm to match real objects against geometric shape primitives, with fixed grasping strategies, using point clouds.

However, these solutions do not generalise well to unseen objects, since they are restricted to the forms previously recorded and they need additional views in order to reconstruct the objects correctly.

As for using point clouds and not only RGBD images, (Richtsfeld and Vincze, 2008) proposed a method for computing a pair of grasping points over point clouds. First, they searched for the top planar surface of the object and then picked the point on the rim closest to the object's centre of mass. The second grasping point was on the opposite rim. Similar to this geometric approach, (ten Pas and Platt, 2015) computed grasping candidates by analytically locating antipodal grasps. This work was later followed by another approach where the authors localised handle-like areas in the object's cloud (ten Pas and Platt, 2016).

In this paper, we present a novel algorithm for robotic grasping with grippers that is capable of detecting grasping points on unknown objects using a single point cloud view. This method automatically segments the point cloud to detect the objects present in the scene. Then, for each of them, it calculates a set of contact points that fulfil certain geometric conditions and ranks their feasibility to find the most stable grasp given the view conditions.

The rest of the paper is organised as follows: section 2 describes the robotic hand constraints as well as the object geometry. Section 3 details the method used for segmenting the input cloud, finding the candidate grasping point areas and ranking them to select the best pair of grasping points. Section 4 shows the results obtained using a dataset of everyday objects and section 5 presents our conclusions.

2 GEOMETRIC AND KINEMATIC CONSTRAINTS

2.1 Hand Kinematics

For this work, we take into account only the physical limitations of a robotic hand for testing the generated contact points. The hand used is the Barrett hand, shown in figure 1. This hand is an under-actuated grasper, typically used in industrial scenarios, with three fingers, two of which can spread, and 4 degrees of freedom.

Figure 1: Barrett robotic hand. Reproduced from (Townsend, 2000).

In this work, we will use only two fingers, F1 and F3, as a gripper. The maximum working aperture gripper_max_amp of this hand is 335 mm and its tip width gripper_tip_width is 25 mm. These attributes will influence the contact point calculation.
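These hand limits are reused throughout the rest of the pipeline. A minimal sketch of how they could be kept together, with values taken from the text and illustrative names of our own:

```cpp
// Sketch: physical limits of the Barrett hand when used as a two-finger gripper.
// The values come from the text; the struct and field names are illustrative.
struct GripperLimits {
  double max_aperture_mm = 335.0;  // gripper_max_amp: maximum working aperture
  double tip_width_mm    = 25.0;   // gripper_tip_width: plate (tip) width
};
```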

2.2 Object Geometry

For this work we use opaque, rigid objects which have at least one dimension smaller than the maximum working aperture of the robotic gripper. The objects are restricted to be opaque due to the limitations of the depth camera used, which acquires RGBD images by projecting coded IR patterns. Transparent or dark objects would not be detected properly because light at the emitted wavelength can be reflected or refracted by some materials.

As for their stiffness, we are not dealing with deformable bodies nor with objects with holes passing through them (all objects are compact). This assumption is required in order to avoid deformations or noise that would affect the calculation.

3 GRASPING POINT SELECTION

In order to select the best grasping points for the gripper, we first need to segment the scene where the objects are placed. Then, a candidate area on the object's surface is found for each of the gripper plates, and combinations of points from these areas are ranked using a custom function so that the best configuration guarantees the most stable grasp under the view conditions.

3.1 Scene Segmentation and Object Detection

Figure 2: Scene segmentation. (left) original registered point cloud, (right) detected objects after plane segmentation and cluster detection.

Given a recorded point cloud C, we first need to detect the objects in the scene. In order to do so, we begin by filtering out points p ∈ C whose z-component fulfils pz > 1m, so we are left only with points closer to the camera. Then, the ground plane Πg, on which the objects are lying, is detected with RANSAC (Fischler and Bolles, 1981). Once the points p ∈ Πg are extracted from C, an Euclidean Cluster Extraction (Rusu, 2010) (Rusu and Cousins, 2011) is applied to detect each of the objects' point clouds Ck. The results of this process are displayed in figure 2.
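A possible implementation of this segmentation step with PCL (the library used in section 4) is sketched below. Only the 1 m depth cut-off and the use of RANSAC plane fitting plus Euclidean clustering come from the text; the remaining thresholds and the function name are illustrative assumptions.

```cpp
// Sketch of the scene segmentation step: depth filtering, RANSAC ground-plane
// removal and Euclidean clustering. Parameter values other than the 1 m depth
// limit are assumptions for illustration.
#include <vector>
#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/ModelCoefficients.h>
#include <pcl/PointIndices.h>
#include <pcl/filters/passthrough.h>
#include <pcl/filters/extract_indices.h>
#include <pcl/segmentation/sac_segmentation.h>
#include <pcl/segmentation/extract_clusters.h>
#include <pcl/search/kdtree.h>

using Cloud = pcl::PointCloud<pcl::PointXYZ>;

std::vector<Cloud::Ptr> segmentObjects(const Cloud::Ptr& scene) {
  // Keep only points closer than 1 m to the camera (z-component filter).
  Cloud::Ptr close(new Cloud);
  pcl::PassThrough<pcl::PointXYZ> pass;
  pass.setInputCloud(scene);
  pass.setFilterFieldName("z");
  pass.setFilterLimits(0.0f, 1.0f);
  pass.filter(*close);

  // Detect the ground plane with RANSAC and remove its inliers.
  pcl::ModelCoefficients::Ptr coeffs(new pcl::ModelCoefficients);
  pcl::PointIndices::Ptr ground(new pcl::PointIndices);
  pcl::SACSegmentation<pcl::PointXYZ> seg;
  seg.setModelType(pcl::SACMODEL_PLANE);
  seg.setMethodType(pcl::SAC_RANSAC);
  seg.setDistanceThreshold(0.01);            // assumed 1 cm plane tolerance
  seg.setInputCloud(close);
  seg.segment(*ground, *coeffs);

  Cloud::Ptr objects(new Cloud);
  pcl::ExtractIndices<pcl::PointXYZ> extract;
  extract.setInputCloud(close);
  extract.setIndices(ground);
  extract.setNegative(true);                 // keep everything but the plane
  extract.filter(*objects);

  // Euclidean Cluster Extraction: one cluster per object candidate Ck.
  pcl::search::KdTree<pcl::PointXYZ>::Ptr tree(new pcl::search::KdTree<pcl::PointXYZ>);
  tree->setInputCloud(objects);
  std::vector<pcl::PointIndices> clusters;
  pcl::EuclideanClusterExtraction<pcl::PointXYZ> ec;
  ec.setClusterTolerance(0.02);              // assumed 2 cm
  ec.setMinClusterSize(100);                 // assumed minimum cluster size
  ec.setSearchMethod(tree);
  ec.setInputCloud(objects);
  ec.extract(clusters);

  std::vector<Cloud::Ptr> result;
  for (const auto& idx : clusters) {
    Cloud::Ptr obj(new Cloud);
    for (int i : idx.indices) obj->push_back((*objects)[i]);
    result.push_back(obj);
  }
  return result;
}
```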

3.2 Grasping Areas

Afterwards, an object to be grasped is selected. Its point cloud Ck is preprocessed to filter outliers from its surface. Next, the centroid of Ck is computed. In addition, the main axis ~v of the cloud is obtained in order to approximate the object's largest dimension. Having done so, a cutting plane Π, perpendicular to the direction ~v and passing through the object's centroid, is calculated. From the intersection of the plane Π and the cloud Ck, we extract a set of points γ ⊂ Ck that lie within 1 cm of the plane Π, this distance being the best one found empirically.

If the object's axis ~v is parallel to the ground plane Πg, the points located in the two opposite areas along the Z axis of the camera are the candidate grasping points. Otherwise, the candidates are on the opposite sides of the X axis from the camera viewpoint. The Y axis is avoided since it includes the object's points in contact with the ground plane Πg.

In order to find these candidate grasping points, the points pmin ∈ γ and pmax ∈ γ with the minimum and maximum component value (X or Z, depending on the case) are selected. This way, two candidate point clouds Cmin and Cmax are extracted using two spheres Smin and Smax centred at pmin and pmax, respectively. Their radii are initially equal to r = 2 ∗ gripper_tip_width, where gripper_tip_width is the gripper's plate width in millimetres. However, in case the object's width w_obj = L2norm(pmin, pmax) fulfils the condition w_obj ≤ 2 ∗ r, then r = (w_obj ∗ 0.9) / 2 to adapt the grasp area to the object's size.

L2norm(p, q) = sqrt( Σ_{i=1}^{n} (p_i − q_i)² )    (1)

Figure 3: Grasping areas detection. (left) original object's point cloud Ck, (middle) filtered cloud, with the detected centroid as a red sphere, the object's axis ~v as a white line and the cutting plane Π represented in green, (right) initial points pmin and pmax in green as well as candidate point clouds Cmin and Cmax coloured in red.

Based on this, the candidate point clouds Cmin and Cmax include those points belonging to the object's cloud Ck which are within the volume of the spheres Smin and Smax. That is, Cmin = Smin ∩ Ck and Cmax = Smax ∩ Ck. In figure 3, we show the process of detecting the grasping areas.
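The whole grasping-area extraction could be sketched with PCL's centroid and PCA utilities as below. The 1 cm slab, the r = 2 ∗ gripper_tip_width radius and the 0.9/2 shrink factor come from the text; the parallel-to-ground test threshold and the helper names are assumptions of this sketch.

```cpp
// Sketch of the grasping-area extraction: centroid, main axis via PCA, cutting
// plane, slab of points within 1 cm of the plane, and the two candidate spheres.
#include <cmath>
#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/common/centroid.h>
#include <pcl/common/pca.h>
#include <Eigen/Core>

using Cloud = pcl::PointCloud<pcl::PointXYZ>;
struct GraspAreas { Cloud::Ptr c_min, c_max; };

GraspAreas findGraspAreas(const Cloud::Ptr& object, float tip_width_m = 0.025f) {
  GraspAreas out{Cloud::Ptr(new Cloud), Cloud::Ptr(new Cloud)};

  // Centroid and main axis ~v of the object's cloud.
  Eigen::Vector4f c4;
  pcl::compute3DCentroid(*object, c4);
  const Eigen::Vector3f centroid = c4.head<3>();
  pcl::PCA<pcl::PointXYZ> pca;
  pca.setInputCloud(object);
  const Eigen::Vector3f axis = pca.getEigenVectors().col(0).normalized();

  // Slab γ: points within 1 cm of the cutting plane Π (normal = axis, through centroid).
  Cloud::Ptr slab(new Cloud);
  for (const auto& p : object->points) {
    const Eigen::Vector3f v(p.x, p.y, p.z);
    if (std::fabs(axis.dot(v - centroid)) < 0.01f) slab->push_back(p);
  }
  if (slab->empty()) return out;

  // Sweep along Z if the axis is roughly parallel to the ground, along X otherwise
  // (the 0.5 threshold is an assumption; Y, the vertical camera axis, is never used).
  const int dim = (std::fabs(axis.y()) < 0.5f) ? 2 : 0;  // 2 = Z, 0 = X

  // Extreme points pmin / pmax of the slab along the chosen axis.
  pcl::PointXYZ pmin = slab->points.front(), pmax = pmin;
  for (const auto& p : slab->points) {
    const float v = (dim == 2) ? p.z : p.x;
    if (v < ((dim == 2) ? pmin.z : pmin.x)) pmin = p;
    if (v > ((dim == 2) ? pmax.z : pmax.x)) pmax = p;
  }

  // Sphere radius: r = 2 * gripper_tip_width, shrunk when the object is narrower.
  const Eigen::Vector3f a(pmin.x, pmin.y, pmin.z), b(pmax.x, pmax.y, pmax.z);
  const float w_obj = (a - b).norm();
  float r = 2.0f * tip_width_m;
  if (w_obj <= 2.0f * r) r = (w_obj * 0.9f) / 2.0f;

  // Candidate clouds Cmin, Cmax: object points inside each sphere.
  for (const auto& p : object->points) {
    const Eigen::Vector3f v(p.x, p.y, p.z);
    if ((v - a).norm() <= r) out.c_min->push_back(p);
    if ((v - b).norm() <= r) out.c_max->push_back(p);
  }
  return out;
}
```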

3.3 Grasping Points Ranking

As for ranking a grasp configuration Θ = {p1 ∈ Cmin, p2 ∈ Cmax}, we propose a function that evaluates its stability depending on the following factors:

1. Distance to the grasp plane Π: this plane cuts the object through its centroid, so the closer the grasping points p1 and p2 are to the plane Π, the closer they are to a reference for the object's centre of mass. This translates into a more equilibrated grasp. This distance is obtained as:

distance(Π, p) = abs(~n · p + offset)    (2)

where ~n is the unit normal vector of the plane Π, p is one of the grasping points p1 or p2, and offset is the distance of the plane Π to the origin.

2. Point's curvature: the curvature measures the variations on the object's surface. A grasp is likely to be more stable if it is executed over a planar area instead of over highly curved points. In particular, a point's curvature measures the variation between that point and its neighbours on the same surface. Both the size (radius of the sphere) and the number of points (density of the sphere) influence the estimation of the curvature values. To estimate the curvature, we apply the method presented in (Pauly et al., 2002). Thereby, we first compute the covariance matrix of the points within the sphere and obtain its eigenvalues and eigenvectors through Singular Value Decomposition (SVD). The covariance matrix is previously obtained from a Principal Component Analysis (PCA). The sum of the eigenvalues supplies surface variation information between each point of the sphere and its centroid, while the smallest eigenvalue gives the variation along the normal vector to the surface. Accordingly, the curvature can be computed as:

λp = λ0 / (λ0 + λ1 + λ2)    (3)

where λp is the curvature at the point p and λi is each eigenvalue of the covariance matrix, with i = 0 the smallest and i = 2 the biggest eigenvalue.

3. Antipodal configuration: in an antipodal grasp the gripper is able to apply opposite and collinear forces at two points. In this case, a pair of contact points with friction is antipodal if they lie along a line parallel to the direction of finger motion. To guarantee this, the angle α between the ith contact point's normal ~ni and the line ~w connecting p1 and p2 should be close to zero.

4. Perpendicular grasp: since the grasping areas are spherical, there are chances for a candidate grasp not to be parallel to the cutting plane Π. In order to avoid trying to execute slippery grasps, we penalise those configurations which are not parallel to the cutting plane Π. That is, the line ~w which connects the grasping points p1 and p2 should have an angle β with the cutting plane's normal ~n close to 90 degrees.

Assuming these conditions, the following is the proposed ranking function for a grasp Θ:

rank(p1, p2) = w1 ∗ r1(p1, p2) + w2 ∗ r2(p1, p2)

r1(p1, p2) = (1.0 − dis(Π, p1)²) + (1.0 − dis(Π, p2)²) + (1.0 − cos(β))

r2(p1, p2) = (1.0 − λp1) + (1.0 − λp2) + cos(αp1) ∗ cos(αp2)    (4)

where dis(Π, pi) is the distance of the grasping point pi to the grasp plane Π as measured in equation (2), λpi is the curvature of the grasping point pi as measured in equation (3), αpi is the angle between the ith grasping point's normal ~ni and the connecting line ~w, and β is the angle between the cutting plane's normal ~n and the connecting line ~w.

We split our function rank into two sub-functions r1 and r2 because they evaluate distinct attributes of the grasp configuration Θ. r1 evaluates the geometrical position of the grasping points over the object's surface, while r2 evaluates the curvature characteristics of their area. These two aspects are then weighted using w1 and w2 to balance their influence in the ranking function. In this work, w1 = w2, so both factors have the same importance.

The values included in this ranking function are all in the range [0, 1], so ranking values vary in the range [0, 6], 6 being the score given to the best grasp configurations while unstable ones score closer to 0. In addition, point cloud normal vectors and curvatures are calculated beforehand using a radius r = 3cm.
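Under the reconstruction of equation (4) above, the score of a candidate pair could be computed as in the following sketch. All inputs are assumed to be precomputed and already normalised to [0, 1] (normals estimated with a 3 cm radius, as stated); the function and parameter names, as well as the use of absolute cosines, are our own assumptions.

```cpp
// Sketch of the ranking function (4). Inputs assumed normalised to [0,1]:
// dis_* are distances of p1/p2 to the cutting plane (eq. 2), curv_* their
// curvatures (eq. 3), n1/n2 their unit normals, plane_n the plane's unit normal.
#include <cmath>
#include <Eigen/Core>

double rankGrasp(const Eigen::Vector3d& p1, const Eigen::Vector3d& p2,
                 double dis_p1, double dis_p2,
                 double curv_p1, double curv_p2,
                 const Eigen::Vector3d& n1, const Eigen::Vector3d& n2,
                 const Eigen::Vector3d& plane_n,
                 double w1 = 1.0, double w2 = 1.0) {
  const Eigen::Vector3d w = (p2 - p1).normalized();  // line connecting p1 and p2

  // Antipodal term: cos(alpha_i), alpha_i = angle between normal ~ni and ~w.
  // Absolute values are used so the sign of the normals does not matter (assumption).
  const double cos_a1 = std::fabs(n1.dot(w));
  const double cos_a2 = std::fabs(n2.dot(w));

  // Perpendicularity term: beta = angle between ~w and the plane normal ~n;
  // grasps not parallel to the cutting plane are penalised.
  const double cos_b = std::fabs(plane_n.dot(w));

  // r1: geometric position of the contact points (distance to plane, beta).
  const double r1 = (1.0 - dis_p1 * dis_p1) + (1.0 - dis_p2 * dis_p2) + (1.0 - cos_b);
  // r2: surface quality of the contact points (curvature, antipodality).
  const double r2 = (1.0 - curv_p1) + (1.0 - curv_p2) + cos_a1 * cos_a2;

  return w1 * r1 + w2 * r2;  // in [0, 6] when w1 = w2 = 1
}
```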

Furthermore, these candidate areas Cmin and Cmax are voxelised so that the computation can be made faster. It is not necessary to use every single candidate point because, if a point has for example a high curvature value, its neighbours are likely to be under very similar conditions. Thus, voxels are used as a representation of a tiny surface inside the candidate grasping area. These voxels are computed using a radius dependent on the gripper's tip width by a factor of voxel_radius = gripper_tip_width ∗ 0.5.
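With PCL, this reduction could be a direct application of the VoxelGrid filter. Note that VoxelGrid takes a leaf edge length rather than a radius, so using voxel_radius directly as the leaf size is an assumption of this sketch.

```cpp
// Sketch: voxelise a candidate grasping area so that each voxel stands in for
// a small patch of surface. Leaf size = gripper_tip_width * 0.5 (12.5 mm here).
#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/filters/voxel_grid.h>

using Cloud = pcl::PointCloud<pcl::PointXYZ>;

Cloud::Ptr voxeliseArea(const Cloud::Ptr& area, float tip_width_m = 0.025f) {
  const float leaf = tip_width_m * 0.5f;
  Cloud::Ptr reduced(new Cloud);
  pcl::VoxelGrid<pcl::PointXYZ> grid;
  grid.setInputCloud(area);
  grid.setLeafSize(leaf, leaf, leaf);
  grid.filter(*reduced);
  return reduced;
}
```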

4 EXPERIMENTS

For this experimentation, we have acquired point clouds using a RealSense SR300 depth camera. The dataset is composed of the following household objects: milk brick, toothpaste box, cookies box, spray bottle, bowl, mug, plastic glass, book, can, tennis ball, deodorant roll-on, pillbox and telephone. These objects and the scene view from the camera are presented in figure 4. Objects lie in a range from 40 cm to 100 cm from the camera base.

Figure 4: Experimentation set comprised of 13 objects.


Table 1: Average grasp feasibility as a percentage and rank value for each object presented in isolation.

Object      Dimensions (mm)   # Views   frobot   fenv   r1     r2     rank
Toothpaste  46x190x38         6         1.00     1.00   2.38   1.04   3.42
Cookies     305x210x48        8         1.00     0.75   2.60   1.36   3.96
Milk        97x195x58         6         1.00     1.00   2.38   1.77   4.15
Book        123x202x25        6         1.00     0.83   2.72   1.33   4.05
Pillbox     58x77x37          6         1.00     1.00   2.51   1.11   3.62
Can         65x115x65         4         1.00     0.75   2.20   1.86   4.07
Roll-on     46x102x46         4         1.00     0.75   2.28   0.92   3.20
Spray       80x290x80         4         1.00     0.75   2.31   1.63   3.94
Mug         82x96x82          5         1.00     1.00   2.66   1.91   4.57
Glass       79x107x79         5         1.00     1.00   2.35   1.40   3.75
Bowl        172x78x172        4         1.00     0.75   2.76   2.08   4.84
Ball        63x63x63          2         1.00     1.00   2.80   1.57   4.38
Telephone   49x157x23         5         0.80     0.80   2.16   1.40   3.56

Our algorithm was implemented in C++ using the library PCL 1.7 and tested on an Intel i7-4770 @ 3.4 GHz x 8 cores with 8 GiB of system memory. For real-time testing, it was implemented in ROS as a subscriber node that read the camera data directly.
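The skeleton of such a node could look as follows; the node and topic names are illustrative assumptions, only the idea of a subscriber that reads the camera cloud directly comes from the text.

```cpp
// Sketch of a ROS node subscribing to the camera point cloud and running the
// grasp computation on every message. Topic and node names are illustrative.
#include <ros/ros.h>
#include <sensor_msgs/PointCloud2.h>
#include <pcl_conversions/pcl_conversions.h>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>

void cloudCallback(const sensor_msgs::PointCloud2ConstPtr& msg) {
  pcl::PointCloud<pcl::PointXYZ>::Ptr scene(new pcl::PointCloud<pcl::PointXYZ>);
  pcl::fromROSMsg(*msg, *scene);
  // ... segment the scene and compute grasping points for each detected object.
}

int main(int argc, char** argv) {
  ros::init(argc, argv, "grasp_point_detector");
  ros::NodeHandle nh;
  ros::Subscriber sub = nh.subscribe("/camera/depth/points", 1, cloudCallback);
  ros::spin();
  return 0;
}
```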

In the experiments, we evaluate frobot as the grasp feasibility constrained only by the hand limitations, computed from the distance between the grasping points. If executing the grasp requires opening the gripper less than its maximum operating amplitude gripper_max_amp but the points are not closer than a minimum distance, it is scored as a feasible grasp. Otherwise, it is not physically possible to perform it. For this purpose, we have used gripper_tip_width as the minimum and gripper_max_amp as the maximum opening width, these being the physical limitations of the Barrett hand described in section 2.1.
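This test reduces to a range check on the distance between the two contact points; a minimal sketch using the Barrett hand limits from section 2.1, with an illustrative function name:

```cpp
// Sketch of the f_robot check: a grasp is physically feasible if the distance
// between the two contact points fits inside the gripper's opening range.
#include <Eigen/Core>

bool isPhysicallyFeasible(const Eigen::Vector3d& p1, const Eigen::Vector3d& p2,
                          double min_opening_m = 0.025,   // gripper_tip_width
                          double max_opening_m = 0.335) { // gripper_max_amp
  const double width = (p1 - p2).norm();
  return width >= min_opening_m && width <= max_opening_m;
}
```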

Furthermore, the fenv metric evaluates the feasibility of the grasp taking into account the environmental constraints. For example, if a grasp requires pushing the robotic hand under the object, or both contact points lie on the same surface, the robot would collide, so the grasp is scored as infeasible.

4.1 Objects Presented in Isolation

For this experiment, the dataset objects were placed alone on the table at a sufficient distance to be properly detected by the camera. In the case of smaller objects like the pillbox or the ball, this requires them to be closer than bigger ones like the cookies box. Various poses were recorded and the proposed grasping algorithm was run. In table 1 we present the obtained results.

Our dataset is comprised of household objects that can be categorised into some geometrical types like boxes, cylinders and spheres. Box-like objects were laid down in different positions so that the camera would acquire a new view of them each time. Cylinders were laid down parallel to the Z axis, parallel to the X axis and diagonal to both of them. Spheres look the same from every point of view, so they were moved along the Z axis to bring them closer to and further from the camera.

It can be observed that r1 usually contributes at least 2 out of 3 points to the ranking function. This means that our ranking function is able to find good contact points that are near the cutting plane and parallel to it. Our hypothesis is that grasps parallel and close to the cutting plane Π will be more stable, since they will be closer to the point cloud centroid, which can be used as a representation of the object's centre of mass. Grasping closer to this point leads to more equilibrated grasps if the object is proportionally dense through its whole body, something that can be said of most of the household objects a service robot would manipulate.

Regarding r2, we have observed that box-like objects tend to have lower scores on this function than cylinders or spheres. In this function, we are ranking the curvature of the contact point area and whether the points configure an antipodal grasp. Planar objects have lower curvature values, so it would be logical to think that these objects should have greater r2 values. However, since only one viewpoint of the object is used, it is difficult to find more than two faces of a box in the same point cloud, making the configuration of an antipodal grasp complicated. In order to configure one, the contact points should lie on two faces of the box with normal vectors along the same direction but with opposite sense. Since this is not the case, our box grasps are penalised for not configuring an antipodal grasp.

Figure 5: Samples of grasps calculated by the proposed algorithm over the cookies box in different poses.

We have observed that the closer the objects are to the camera, the denser their point clouds are and the more information is available about their surface. That results in higher-ranked configurations due to having more points to work with. That is, views that provide more information about the object's shape facilitate finding better contact points.

Although the algorithm is designed to avoid recommending infeasible grasps, like those that require contacting the object through its lower surface, sometimes the recording conditions lead to grasps that cannot be performed. As shown in table 1, frobot is usually equal to 100%, meaning that every proposed grasp is physically possible for our robotic hand. However, during this experimentation we found a proposed grasp for the telephone that required positioning the gripper plates so close that they would collide. This was due to a lateral view of the telephone that was very thin, so any grasp proposed from such a view required placing the plates too close.

As for fenv, we have observed that sometimes the algorithm proposes grasps that are placed beneath the objects, like the one displayed in the top right corner of figure 5. We have checked that this usually happens because the detected object's main axis is parallel to the table and there is enough noise to find smaller Z values in lower parts than at the top of the object, so the initial point pmin is placed wrongly. This is a case that our proposal treats incorrectly due to the point cloud conditions.

4.2 Objects Presented in Clutter

In most of the real situations the robot would face, objects will not be isolated and they may be partially occluded or surrounded by others. In these experiments, we have distributed the whole dataset of objects over the work-table surface. In table 2, we present the results obtained by evaluating how many objects were present in reality, how many were detected and how many had a feasible grasp calculated, as well as their mean rank.

Table 2: Number of objects present in reality/detected (R/D), average feasibility of their grasps and average rank for each scene.

Scene        # R/D    frobot   fenv   rank
Cylinders    6/5      1.00     0.60   3.55
Laying       7/7      1.00     0.71   3.97
Box-sphere   8/8      1.00     1.00   4.00
Cyl-sphere   8/7      0.85     0.85   4.16
Mix10-a      10/9     0.88     0.88   3.92
Mix10-b      10/10    1.00     0.90   4.01
Mix12-a      12/11    1.00     0.90   4.28
Mix12-b      12/12    1.00     0.91   3.73
Dataset-a    13/13    1.00     1.00   3.53
Dataset-b    13/14    1.00     0.78   3.68

We do not present these experiments in dense clutter with objects contacting each other, since that would make the task of segmenting the objects from the point cloud more difficult. Since we are not evaluating an object detection algorithm nor proposing one, the objects presented in these experiments are sufficiently separated to allow us to test our contact point calculation while still being in clutter.

Generally, all objects in the scene can be seen and therefore detected. However, sometimes some objects are missed because they are so occluded or so small that they do not seem an object to the algorithm. Regarding the detected objects, now that they are partially occluded the fenv rate decreases. Most of the erroneously proposed grasps occur because the objects are at the back and are not properly seen, or because the segmentation algorithm confuses two objects and mixes them into a single one. These cases are displayed in figure 6, where in the Mix12-a scene the toothpaste and milk boxes are mixed into one single object. The same happens in the Mix10-a scene with the mug and the milk box.

Another issue worth discussing is that when an object is in front of another one, the object at the back can be seen as two different point clouds. This is the case of the milk box in the Dataset-b scene in figure 6. If those two clouds corresponded to separate objects, it could be said that our algorithm is proposing two feasible grasps. However, we know those grasps cannot be executed. In a common clear-the-table task, we would pick the toothpaste box first and then recompute grasps for the visible objects, solving this situation.


Figure 6: Samples of grasps proposed over clutter scenes. Top: (left) Mix12-a, (right) Mix10-a. Bottom: (left) Dataset-b, (right) Cyl-sphere.

4.3 Real Time Testing

Apart from evaluating the feasibility of the contact points proposed by our algorithm, we have tested its speed. For this experiment, we saved 5 seconds of point clouds from our camera in rosbags. Several object configurations were tested, including the case of a single object of different sizes and a growing distribution where the whole dataset of objects was introduced one by one, scene after scene. Results are presented in table 3.

Table 3: Average time calculating contact points for each object in the scene and frames per second (FPS).

Scene     # Objects   Time/Object (ms)   FPS
Cookies   1           530.00             1.30
Milk      1           45.61              11.45
Ball      1           26.94              17.13
Dataset   1           213.73             3.16
Dataset   2           126.04             2.84
Dataset   3           95.45              2.51
Dataset   4           80.08              2.10
Dataset   5           63.24              1.89
Dataset   6           84.00              1.19
Dataset   7           95.38              1.19
Dataset   8           96.24              1.10
Dataset   9           79.13              1.05
Dataset   10          84.03              0.92
Dataset   11          88.50              0.85
Dataset   12          78.60              0.81
Dataset   13          88.25              0.73

As was expected, the average time needed to compute the contact points for a single object depends on its size. More specifically, it depends on the amount of points processed. Bigger objects tend to have bigger point clouds, so they are slower to process.

If just one single object is present in the scene, the algorithm reaches a frames per second (FPS) rate greater than 1. However, looking at the dataset scenes, it can be observed that as we introduce more objects, the proposed algorithm for detecting the objects and then calculating their grasping points takes more time, reducing the FPS rate. Nevertheless, the time required to calculate a single pair of contact points is lowered because, as we keep introducing more objects, the bigger ones become occluded so their point clouds are also reduced. This is also an effect of beginning this test with bigger objects at the back, like the cookies box or the spray bottle, and then introducing smaller ones in front of them so they can be easily detected.

No single tested object took more than 1 second to be detected and processed into a grasp configuration. Therefore, the method is fast enough to be used in a real implementation as the grasping point calculation system of a service robot.

5 CONCLUSIONS

In this paper, we propose a new algorithm for grasping novel objects with robotic grippers using 3D point clouds and no prior knowledge. The main idea is to geometrically find regions that fulfil a set of basic conditions. These regions are found by calculating the object's main axis and its centroid. Then, a cutting plane perpendicular to that axis and passing through the object's centroid is computed. This cutting plane defines, at the extremes of the object, two candidate areas for placing the gripper plates. Configurations of points combined from both areas are then ranked in order to find the most stable and feasible grasp, taking into account their geometric positions with respect to the cutting plane and their curvature characteristics.

Since it depends on the previous segmentation of the scene and the calculation of the object's main axis, the method is badly influenced by noise and by poor viewpoints of the objects to be grasped. Despite this fact, it is still always able to find a pair of grasping points in less than one second, and it remains a useful basic contact point calculation system.

In future work, we would like to extend the method to make full use of robotic hands with more than two fingers, as well as to study a way to dynamically change the influence of r1 and r2 in the rank function by modifying their weights w1 and w2. In addition, we want to test this ranking function as a measure of viewpoint quality in terms of the best grasping points that can be computed from it.


ACKNOWLEDGEMENTS

This work was funded by the Spanish Government Ministry of Economy, Industry and Competitiveness through the project DPI2015-68087-R and the pre-doctoral grant BES-2016-078290.

REFERENCES

Fischler, M. A. and Bolles, R. C. (1981). Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography. Communications of the ACM, 24(6):381–395.

Gil, P., Mezouar, Y., Vincze, M., and Corrales, J. A. (2016). Editorial: Robotic Perception of the Sight and Touch to Interact with Environments. Journal of Sensors, 2016(Article ID 1751205):1–2.

Jain, S. and Argall, B. (2016). Grasp detection for assistive robotic manipulation. Proceedings - IEEE International Conference on Robotics and Automation, 2016-June:2015–2021.

Jiang, Y., Moseson, S., and Saxena, A. (2011). Efficient grasping from RGBD images: Learning using a new rectangle representation. Proceedings - IEEE International Conference on Robotics and Automation, pages 3304–3311.

Lenz, I., Lee, H., and Saxena, A. (2013). Deep Learning for Detecting Robotic Grasps. pages 1–4.

Miller, A. T. and Allen, P. K. (2004). GraspIt! IEEE Robotics & Automation Magazine, 11(4):110–122.

Pauly, M., Gross, M., and Kobbelt, L. (2002). Efficient simplification of point-sampled surfaces. 13th IEEE Visualization Conference, (Section 4):163–170.

Redmon, J. and Angelova, A. (2015). Real-time grasp detection using convolutional neural networks. 2015 IEEE International Conference on Robotics and Automation (ICRA), pages 1316–1322.

Richtsfeld, M. and Vincze, M. (2008). Grasping of Unknown Objects from a Table Top. Workshop on Vision in Action: Efficient strategies for cognitive agents in complex environments.

Rusu, R. B. (2010). Semantic 3D Object Maps for Everyday Manipulation in Human Living Environments. KI - Künstliche Intelligenz, 24:345–348.

Rusu, R. B. and Cousins, S. (2011). 3D is here: Point Cloud Library (PCL). Proceedings - IEEE International Conference on Robotics and Automation, (April).

ten Pas, A. and Platt, R. (2015). Using Geometry to Detect Grasp Poses in 3D Point Clouds. Int'l Symp. on Robotics Research.

ten Pas, A. and Platt, R. (2016). Localizing handle-like grasp affordances in 3D point clouds. Springer Tracts in Advanced Robotics, 109:623–638.

Townsend, W. (2000). The BarrettHand grasper - programmably flexible part handling and assembly. The International Journal of Industrial Robot, 27(3):181–188.

Trottier, L., Giguere, P., and Chaib-draa, B. (2016). Dictionary Learning for Robotic Grasp Recognition and Detection. arXiv preprint, pages 1–19.

Vahrenkamp, N., Westkamp, L., Yamanobe, N., Aksoy, E. E., and Asfour, T. (2016). Part-based Grasp Planning for Familiar Objects. 2016 IEEE-RAS International Conference on Humanoid Robots (Humanoids 2016).

Varley, J., Weisz, J., Weiss, J., and Allen, P. (2015). Generating multi-fingered robotic grasps via deep learning. IEEE International Conference on Intelligent Robots and Systems, 2015-December:4415–4420.

Wang, Z., Li, Z., Wang, B., and Liu, H. (2016). Robot grasp detection using multimodal deep convolutional neural networks. Advances in Mechanical Engineering, 8(9):1–12.