Catching a Flying Ball with a Vision-Based Quadrotor

Kunyue Su and Shaojie Shen⋆

The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong

⋆ The authors would like to thank DJI for the scholarship and equipment support.

Abstract. We present a method allowing a quadrotor equipped with only onboard cameras and an IMU to catch a flying ball. Our system runs without any external infrastructure and with reasonable computational complexity. Central to our approach is an online monocular vision-based ball trajectory estimator that recovers and predicts the 3-D motion of a flying ball using only noisy 2-D observations. Our method eliminates the need for direct range sensing via stereo correspondences, making it robust against noisy or erroneous measurements. Our system consists of a simple 2-D visual ball tracker, a UKF-based state estimator that fuses optical flow and inertial data, and a nonlinear tracking controller. We perform extensive analysis of system performance by studying both the system dynamics and the ball trajectory estimation accuracy. Through online experiments, we show the first mid-air interception of a flying ball with an aerial robot using only onboard sensors.

Keywords: ball catching, quadrotor, vision-based state estimation

1 Introduction

We address the challenging and interesting topic of using an autonomous quadrotor to perform mid-air interception of a flying ball using only onboard sensors. We consider the scenario where no external motion capture systems are available, and the quadrotor has to use only its onboard cameras and IMU to observe and estimate the 3-D trajectory of a flying ball. We also require that the quadrotor localize itself using its onboard visual-inertial sensor suite and perform feedback control in reaction to the flying ball observed in real time. This is a challenging, dexterous task in which all actions happen in fractions of a second. It requires instantaneous reaction to the flying ball based on only a few noisy visual observations. The quadrotor is also required to maintain high-precision self-localization under various flight conditions and execute very aggressive commands.

Automated ball catching/juggling and other ball sports are popular research topics in robotics. Using ground platforms and with the help of vision-based ball tracking algorithms, successful systems have been developed using a fixed robotic arm [1] or a humanoid robot [2]. However, ground-based vision systems are typically heavy and often require careful calibration of stereo cameras for distance measurement. Such a high-quality multi-camera setup is usually not available on aerial robots, which have tight size and weight constraints. Moreover, ground platforms enjoy static stability, which greatly simplifies both estimation and control. When using agile flying robots as the executing platform, most of the prior works rely on external motion capture systems for vehicle and ball state feedback.


Fig. 1. A diagram of the overall system architecture.

[3] presents a model predictive controller for fast adaptation to ball and quadrotor dynamics, while a complete system, with a focus on trajectory generation, is proposed for aerial ball juggling [4]. Similar results are also reported in [5] and [6]. Towards vision-based approaches using flying robots, [7] made the first attempt at returning a ping-pong ball using an AR.Drone. However, that approach relies on prior knowledge of the ball size to compute the 3-D ball position, which is prone to large errors due to noise in 2-D tracking.

Given that gravity is known and the ball undergoes free-fall motion, we propose a novel optimization-based method to estimate and predict the 3-D ball trajectory using only noisy 2-D visual observations from a single upward-facing camera (Sect. 2). Our method incorporates all visual observations to continuously refine the estimated ball trajectory, which is represented by its initial position and velocity. Based on this estimate, we can predict the short-horizon ball motion and command the quadrotor to the estimated landing area using a standard feedback controller. We estimate the position, velocity, and orientation of the quadrotor using a UKF-based loosely coupled visual-inertial fusion module [8], where visual measurements are obtained from an optical flow-based velocity estimator [9] that utilizes measurements from a downward-facing camera and a sonar. In the current system, 2-D ball tracking is done using a simple engineering method: we outfit the ball with LEDs, which allows ball observation by simple intensity thresholding. The overall system diagram is shown in Fig. 1 and the experimental platform is shown in Fig. 2(a). We identify our contribution as threefold:

– We propose a monocular vision-based ball trajectory estimator that recovers and predicts the 3-D motion of a flying ball using only noisy 2-D visual measurements.

– We perform careful system engineering on state estimation and visual ball tracking to build a system that runs fast enough for mid-air ball interception.

– We integrate everything into a complete system and show the world's first autonomous ball catching with an aerial robot using only onboard visual sensors.

The rest of the paper is organized as follows: Sect. 2 presents our optimization-based ball trajectory estimation method. Sect. 3 discusses the other system modules, including 2-D visual ball tracking, UKF-based state estimation, feedback control, and other implementation details. Sect. 4 presents experimental data to analyze the platform dynamics and trajectory estimation accuracy, as well as overall system performance. Sect. 5 concludes the paper and discusses future research directions.

2 Monocular Vision-Based 3-D Ball Trajectory Estimation

We utilize two physical assumptions to estimate the 3-D ball trajectory using only 2-D observations from a single camera. First, we assume the magnitude of the earth's gravity is known. Second, we assume that the ball undergoes free-fall motion, with air resistance ignored.

We start by defining notation. We consider $(\cdot)^w$ as the world frame and $(\cdot)^b$ as the ball frame. $(\cdot)^c$ is the camera frame, which is assumed to be aligned with the quadrotor body frame. $\mathbf{p}^w_c$ and $\mathbf{R}^w_c$ represent the position and orientation of the camera in the world frame. $\mathbf{p}^w_{b_{t_0}}$ and $\mathbf{v}^w_{b_{t_0}}$ represent the initial 3-D position and velocity of the ball in the world frame. We set $t_0$ as the time we receive the first visual measurement of the ball. $\lambda_{b_t}$ is the depth of the ball in the camera frame for the observation that occurred at time $t$, and $\mathbf{u}_{b_t}$ is the corresponding normalized image coordinates of the ball. $\mathbf{g}^w = [0, 0, g]^T$ is the gravitational acceleration in the world frame, which is assumed to be known.

2.1 Linear Estimation

Given the gravity vector $\mathbf{g}^w$, and assuming free-fall motion, the ball trajectory can be represented by the initial position $\mathbf{p}^w_{b_{t_0}}$ and velocity $\mathbf{v}^w_{b_{t_0}}$ of the ball. Note that once we have obtained the ball's initial conditions $\mathbf{p}^w_{b_{t_0}}$ and $\mathbf{v}^w_{b_{t_0}}$, we are able to predict the ball motion for an arbitrary time horizon, and thus compute the expected interception point of the quadrotor at a given height.

At time $t$, using the 2-D ball observation $\mathbf{u}_{b_t}$ from a calibrated monocular camera, as well as the camera/quadrotor pose $\mathbf{p}^w_{c_t}$ and $\mathbf{R}^w_{c_t}$ given by our sensor fusion module (Sect. 3.2), we can express the ball's initial conditions $\mathbf{p}^w_{b_{t_0}}$ and $\mathbf{v}^w_{b_{t_0}}$ in terms of visual observations and quadrotor poses:

$$\mathbf{p}^w_{b_{t_0}} + \mathbf{v}^w_{b_{t_0}}\Delta t + \frac{1}{2}\mathbf{g}^w \Delta t^2 = \mathbf{p}^w_{c_t} + \lambda_{b_t}\mathbf{R}^w_{c_t}\mathbf{u}_{b_t}, \qquad (1)$$

where $\Delta t = t - t_0$ is the time difference since the first visual observation of the ball. We can then stack all visual observations occurring at $t_0, \cdots, t_N$ into a linear system using (1):

$$
\begin{bmatrix}
\mathbf{I}_{3\times3} & \mathbf{I}_{3\times3}\Delta t_0 & -\mathbf{R}^w_{c_{t_0}}\mathbf{u}_{b_{t_0}} & \mathbf{0}_{3\times1} & \cdots \\
\mathbf{I}_{3\times3} & \mathbf{I}_{3\times3}\Delta t_1 & \mathbf{0}_{3\times1} & -\mathbf{R}^w_{c_{t_1}}\mathbf{u}_{b_{t_1}} & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix}
\begin{bmatrix}
\mathbf{p}^w_{b_{t_0}} \\ \mathbf{v}^w_{b_{t_0}} \\ \lambda_{b_{t_0}} \\ \lambda_{b_{t_1}} \\ \vdots \\ \lambda_{b_{t_N}}
\end{bmatrix}
=
\begin{bmatrix}
\mathbf{p}^w_{c_{t_0}} - \frac{1}{2}\mathbf{g}^w\Delta t_0^2 \\
\mathbf{p}^w_{c_{t_1}} - \frac{1}{2}\mathbf{g}^w\Delta t_1^2 \\
\vdots \\
\mathbf{p}^w_{c_{t_N}} - \frac{1}{2}\mathbf{g}^w\Delta t_N^2
\end{bmatrix}, \qquad (2)
$$


where $\Delta t_i = t_i - t_0$. We can then solve for the initial position and velocity of the ball, as well as the depth of the ball in all visual observations, in a linear and non-iterative manner.
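To make the structure of (2) concrete, the following sketch (our illustration, not the authors' released code) assembles the stacked system with NumPy and solves it by linear least squares. The `poses`/`obs` containers and the gravity sign convention are assumptions of this sketch.

```python
import numpy as np

def solve_ball_linear(poses, obs, g_w=np.array([0.0, 0.0, -9.81])):
    """Assemble and solve the stacked linear system of Eq. (2).

    poses: list of (dt_i, R_wc_i, p_wc_i): time since the first observation,
           camera rotation (3x3) and camera position (3,) in the world frame.
    obs:   list of normalized image coordinates (u_x, u_y) of the ball.
    Returns the ball's initial position and velocity, and per-frame depths.
    """
    n = len(poses)
    A = np.zeros((3 * n, 6 + n))
    b = np.zeros(3 * n)
    for i, ((dt, R_wc, p_wc), u) in enumerate(zip(poses, obs)):
        u_h = np.array([u[0], u[1], 1.0])   # homogeneous normalized coordinates
        rows = slice(3 * i, 3 * (i + 1))
        A[rows, 0:3] = np.eye(3)            # coefficient of the initial position
        A[rows, 3:6] = dt * np.eye(3)       # coefficient of the initial velocity
        A[rows, 6 + i] = -(R_wc @ u_h)      # coefficient of the depth lambda_i
        b[rows] = p_wc - 0.5 * g_w * dt**2
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x[0:3], x[3:6], x[6:]
```

Because gravity contributes a known curvature to the observed track, the system becomes well conditioned after only a handful of observations; Sect. 3.3 uses the condition number of this stacked matrix as a convergence test.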

2.2 Nonlinear Optimization

The linear ball trajectory estimation presented in Sect. 2.1 does not take the noise model of 2-D visual tracking into account, which may yield suboptimal estimation results. In this section, we propose to employ nonlinear optimization to refine the ball's initial conditions $\mathbf{p}^w_{b_{t_0}}$ and $\mathbf{v}^w_{b_{t_0}}$ by minimizing the reprojection error of the visual observations. We can improve computational efficiency by avoiding solving for the depth of the ball. To this end, we write the nonlinear visual measurement residual $\mathbf{r}_t$ at time $t$ as

$$\mathbf{f}_t = \begin{bmatrix} x_{f_t} \\ y_{f_t} \\ z_{f_t} \end{bmatrix} = {\mathbf{R}^w_{c_t}}^T \left( \mathbf{p}^w_{b_{t_0}} + \mathbf{v}^w_{b_{t_0}}\Delta t + \frac{1}{2}\mathbf{g}^w \Delta t^2 - \mathbf{p}^w_{c_t} \right), \qquad \mathbf{r}_t = \begin{bmatrix} \frac{x_{f_t}}{z_{f_t}} - u_{x_t} \\ \frac{y_{f_t}}{z_{f_t}} - u_{y_t} \end{bmatrix}, \qquad (3)$$

where $u_{x_t}$ and $u_{y_t}$ are the two components of $\mathbf{u}_{b_t}$. We can then write down the batch nonlinear least-squares problem for all $N+1$ visual observations as:

$$\min_{\mathbf{p}^w_{b_{t_0}},\ \mathbf{v}^w_{b_{t_0}}} \sum_{i=0}^{N} \|\mathbf{r}_{t_i}\|^2_{\mathbf{P}_i}, \qquad (4)$$

where $\|\cdot\|_{\mathbf{P}_i}$ is the Mahalanobis distance subject to the covariance matrix $\mathbf{P}_i$ for the $i$th visual observation that occurred at time $t_i$. We can then linearize (4) around the current estimates of $(\mathbf{p}^w_{b_{t_0}}, \mathbf{v}^w_{b_{t_0}})$:

$$\min_{\mathbf{p}^w_{b_{t_0}},\ \mathbf{v}^w_{b_{t_0}}} \sum_{i=0}^{N} \|\mathbf{r}_{t_i} + \mathbf{J}_{t_i}\delta\mathbf{y}\|^2_{\mathbf{P}_i}, \qquad (5)$$

where $\mathbf{r}_{t_i}$ is the residual for the $i$th measurement computed using the current estimates $(\mathbf{p}^w_{b_{t_0}}, \mathbf{v}^w_{b_{t_0}})$, $\delta\mathbf{y} = [\delta\mathbf{p}^w_{b_{t_0}}, \delta\mathbf{v}^w_{b_{t_0}}]^T$ is the small error term for the ball parameters, and $\mathbf{J}_{t_i}$ is the Jacobian computed with respect to $\delta\mathbf{y}$ around the current estimates, which can be written as:

$$\mathbf{J}_{t_i} = \begin{bmatrix} \frac{1}{z_{f_t}} & 0 & -\frac{x_{f_t}}{z_{f_t}^2} \\ 0 & \frac{1}{z_{f_t}} & -\frac{y_{f_t}}{z_{f_t}^2} \end{bmatrix} \begin{bmatrix} {\mathbf{R}^w_{c_t}}^T & {\mathbf{R}^w_{c_t}}^T\Delta t \end{bmatrix}. \qquad (6)$$

(5) can be written in the form of a $6 \times 6$ linear system $\mathbf{A}\delta\mathbf{y} = \mathbf{b}$. We solve it with the Huber norm [10] for better robustness against tracking errors and outliers. The linearization procedure runs iteratively until convergence, after which we obtain good estimates of the ball parameters $(\mathbf{p}^w_{b_{t_0}}, \mathbf{v}^w_{b_{t_0}})$ for ball trajectory prediction and physical ball catching.
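The refinement in (3)-(6) can be sketched as a Gauss-Newton loop with iteratively reweighted residuals standing in for the Huber norm. This is our minimal illustration, not the authors' solver: identity covariances $\mathbf{P}_i$ and the Huber threshold value are placeholder assumptions.

```python
import numpy as np

def refine_ball(p0, v0, poses, obs, g_w=np.array([0.0, 0.0, -9.81]),
                huber=0.01, iters=10):
    """Gauss-Newton refinement of the ball's initial conditions, Eq. (3)-(6)."""
    y = np.hstack([p0, v0]).astype(float)
    for _ in range(iters):
        H = np.zeros((6, 6))                 # normal-equation matrix (A in A dy = b)
        g = np.zeros(6)
        for (dt, R_wc, p_wc), u in zip(poses, obs):
            f = R_wc.T @ (y[:3] + y[3:] * dt + 0.5 * g_w * dt**2 - p_wc)
            r = np.array([f[0] / f[2] - u[0], f[1] / f[2] - u[1]])   # Eq. (3)
            J_proj = np.array([[1 / f[2], 0, -f[0] / f[2]**2],
                               [0, 1 / f[2], -f[1] / f[2]**2]])
            J = J_proj @ np.hstack([R_wc.T, R_wc.T * dt])            # Eq. (6)
            # Huber-style reweighting for robustness against outliers
            w = 1.0 if np.linalg.norm(r) <= huber else huber / np.linalg.norm(r)
            H += w * J.T @ J
            g -= w * J.T @ r
        dy = np.linalg.solve(H, g)           # the 6x6 linear system
        y += dy
        if np.linalg.norm(dy) < 1e-9:
            break
    return y[:3], y[3:]
```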


Fig. 2. (a) The experimental quadrotor. We outfit the quadrotor with two cameras (one upward-facing, one downward-facing) for ball tracking and visual positioning, respectively. The container for ball catching is removed for easy visualization of the electronic components. Onboard processing includes an Odroid XU4 computer. We further process the visual data using a ground station that is tethered to the quadrotor via Ethernet cables. (b) A snapshot of the experiment. In this experiment, the ball reaches approximately 3 meters in height, and the quadrotor moves about 0.3 meters to catch the ball. A video of our experiments is available at: http://www.ece.ust.hk/~eeshaojie/iser2016kunyue.mp4

3 Implementation Details

In this section, we describe both the hardware platform and the software architecture of the system. The software diagram is shown in Fig. 1. Many of the software modules were originally developed for autonomous aerial navigation in complex indoor and outdoor environments [8]. Here we briefly recap the key algorithms and present adaptations to our ball catching application.

3.1 Hardware Setup

As shown in Fig. 2(a), our experimental quadrotor is based on the DJI F330 frame. The platform is equipped with a PixHawk AutoPilot for attitude stabilization and for streaming IMU data for sensor fusion. Two computers are used in our experiment. An Odroid XU4 installed onboard the quadrotor is responsible for data acquisition from multiple sensors. A ThinkPad T440s laptop is tethered to the quadrotor using Ethernet cables. The decision to tether to an additional ground PC was made to reduce the weight of the platform in order to achieve a high thrust-to-weight ratio for faster reaction to ball motions. In our system, the 2-D ball tracker, the 3-D ball trajectory estimator, and the optical flow-based quadrotor velocity estimator run on the T440s laptop, while the UKF-based sensor fusion and the position controller run onboard the quadrotor. Onboard sensors include an mvBlueFox MLC200wG camera with a wide-angle lens for ball tracking. Another MLC200wG camera is bundled with a sonar for optical flow-based state estimation. The total weight of the platform is 1.21 kg, and the maximum thrust is 2.72 kg. This gives the quadrotor a thrust-to-weight ratio of 2.24.


Additionally, we outfit the ball with two bright LED lights to simplify visual ball detection and tracking in the monocular video sequence. In this way, we can localize the ball in the image by simple intensity thresholding, which greatly reduces the latency of visual observations and makes it possible to track a ball that flies at a speed of more than 6 m/s. However, this simple tracking method also introduces significant errors due to the size of the light blob in the image. Fortunately, our monocular vision-based ball trajectory estimation is able to use a large number of visual observations to fit a good curve. Experimental results (Sect. 4) show that even with noisy ball observations, we can estimate the landing spot of the ball with an error of less than 4 cm.
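As an illustration of this engineering shortcut, a minimal OpenCV detector along these lines might look as follows; the threshold and minimum blob area are placeholders to be tuned to the camera exposure, and the tracker used in the paper may differ in detail.

```python
import cv2
import numpy as np

def detect_led_ball(gray, intensity_thresh=240, min_area=4.0):
    """Locate an LED-lit ball in a grayscale frame by intensity thresholding.

    Returns the pixel centroid (u, v) of the largest bright blob, or None.
    """
    _, mask = cv2.threshold(gray, intensity_thresh, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    blob = max(contours, key=cv2.contourArea)
    if cv2.contourArea(blob) < min_area:
        return None                          # reject single-pixel noise
    m = cv2.moments(blob)
    return np.array([m["m10"] / m["m00"], m["m01"] / m["m00"]])
```

The returned pixel centroid would then be undistorted with the camera intrinsics to obtain the normalized coordinates $\mathbf{u}_{b_t}$ used in (1).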

3.2 Quadrotor State Estimation

Another critical component in our system is a fast and reliable estimator that tracks the position, velocity, and orientation of the quadrotor even under highly dynamic motions. Our state estimator consists of an optical flow-based quadrotor velocity estimator and a UKF-based loosely coupled visual-inertial sensor fusion module [8].

The optical flow-based velocity estimator utilizes the continuous homography, as introduced in Chapter 5 of [9], to estimate the frame-to-frame motion of the camera. The estimator starts by tracking a fixed number of pixels that are evenly distributed within the image using KLT tracking [11]. Note that we do not extract features, but rather rely on tracking pixels at fixed locations in each image. The tracked features are used to compute the continuous homography, with RANSAC for outlier rejection. The homography matrix can be decomposed into the up-to-scale linear velocity vector, the angular velocity vector, and the plane normal. The linear velocity is then scaled using the sonar measurement under the assumption of a single floor plane. The velocity estimator runs at 20 Hz on the ground PC.
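The full continuous-homography decomposition is beyond the scope of a short listing, so the sketch below shows only the scale-from-sonar idea in a deliberately simplified form: it tracks a fixed pixel grid with KLT and scales the median image motion by the sonar height, assuming a flat floor, a level downward-facing camera, and negligible rotation. It is a stand-in illustration, not the decomposition used in the paper.

```python
import cv2
import numpy as np

def planar_velocity(prev_gray, gray, K, sonar_height, dt):
    """Simplified metric velocity from a downward-facing camera and sonar."""
    h, w = prev_gray.shape
    # Fixed grid of pixels, re-seeded every frame (no feature extraction)
    xs, ys = np.meshgrid(np.linspace(20, w - 21, 10), np.linspace(20, h - 21, 10))
    pts = np.float32(np.stack([xs.ravel(), ys.ravel()], axis=1)).reshape(-1, 1, 2)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    flow = (nxt - pts).reshape(-1, 2)[status.ravel() == 1]
    if len(flow) == 0:
        return None
    med = np.median(flow, axis=0)            # median as a cheap outlier rejector
    fx, fy = K[0, 0], K[1, 1]                # focal lengths in pixels
    # The floor appears to move opposite to the camera: flip the sign, and
    # scale the normalized flow by the metric height from the sonar.
    return -(sonar_height / dt) * np.array([med[0] / fx, med[1] / fy])
```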

The 20 Hz velocity estimate from the vision-only approach is too noisy and too slow for feedback control of the quadrotor. We employ a UKF sensor fusion framework with delayed measurement compensation to estimate the state of the quadrotor at 100 Hz. The UKF we use in this project is a direct adaptation of the multi-sensor fusion framework proposed in our earlier work [8]. The system state is defined as:

$$\mathbf{x} = \left[\mathbf{p}^w_c,\ \dot{\mathbf{p}}^w_c,\ \mathbf{q}^w_c,\ \mathbf{a}_b\right]^T, \qquad (7)$$

where $\mathbf{q}^w_c$ is the quaternion representation of the quadrotor orientation, and $\mathbf{a}_b$ is the bias of the accelerometer measurement in the quadrotor body frame. We consider a classical IMU-based process model, as in [8], to drive the system state forward:

$$\mathbf{x}_{t+1} = f(\mathbf{x}_t, \boldsymbol{\omega}_t, \mathbf{a}_t, \mathbf{n}_t), \qquad (8)$$

where $\boldsymbol{\omega}_t$ and $\mathbf{a}_t$ are the instantaneous angular velocities and linear accelerations measured by the IMU, and $\mathbf{n}_t$ is the additive Gaussian noise associated with the gyroscope, accelerometer, and accelerometer bias.
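In the UKF, (8) is applied to each sigma point; for intuition, a single deterministic Euler step of such an IMU-driven model, with the noise terms dropped and the orientation kept as a rotation matrix rather than the quaternion of (7), can be sketched as:

```python
import numpy as np

def imu_step(p, v, R, a_bias, omega, a_meas, dt, g=np.array([0.0, 0.0, -9.81])):
    """One noise-free Euler step of an IMU process model in the spirit of Eq. (8)."""
    a_world = R @ (a_meas - a_bias) + g      # bias-corrected world acceleration
    p = p + v * dt + 0.5 * a_world * dt**2
    v = v + a_world * dt
    wx, wy, wz = omega                       # body angular rates from the gyro
    Omega = np.array([[0, -wz, wy],
                      [wz, 0, -wx],
                      [-wy, wx, 0]])
    R = R @ (np.eye(3) + Omega * dt)         # first order; re-orthonormalization omitted
    return p, v, R, a_bias                   # bias modeled as a random walk
```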

The measurement model for the velocity estimator is linear with respect to the system state and can be written as:

$$\mathbf{z} = \mathbf{H}\mathbf{x} + \mathbf{n}, \qquad (9)$$


where $\mathbf{H}$ extracts the 3-D velocity from the state and $\mathbf{n}$ is the additive Gaussian noise. The velocity measurement update can then be performed via a linear KF update.
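Since (9) observes the velocity block directly, the correction reduces to the textbook Kalman update below (a sketch with a hypothetical state layout; the delayed-measurement compensation mentioned above is omitted):

```python
import numpy as np

def velocity_update(x, P, z_vel, R_meas):
    """Linear KF update for Eq. (9): the measurement is the 3-D velocity.

    x: state mean (position 0:3, velocity 3:6, ...); P: state covariance;
    z_vel: velocity from the optical-flow estimator; R_meas: its covariance.
    """
    n = x.shape[0]
    H = np.zeros((3, n))
    H[:, 3:6] = np.eye(3)                  # H extracts the velocity sub-state
    S = H @ P @ H.T + R_meas               # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    x = x + K @ (z_vel - H @ x)
    P = (np.eye(n) - K @ H) @ P
    return x, P
```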

In our implementation, the velocity estimator runs on the ground PC, and the UKF-based sensor fusion module runs onboard the quadrotor.

3.3 Feedback Control

After the first observation of the ball in the image plane, we use the estimated ball parameters $(\mathbf{p}^w_{b_{t_0}}, \mathbf{v}^w_{b_{t_0}})$ to perform a forward prediction to the point where the height of the ball is the same as the height of the quadrotor. We use the condition number of the linear system (2) to decide whether the estimate of the ball trajectory has converged. In practice, we often find that the ball trajectory converges after the ball passes its highest point (Sect. 4.2).

After the ball trajectory has converged and the landing point of the ball is well estimated, we use a standard two-level controller architecture, similar to that in [12], to drive the quadrotor to the estimated landing point for ball interception. The low-level controller running on the PixHawk AutoPilot stabilizes the attitude of the quadrotor, while the position controller runs at 100 Hz on the Odroid XU4 onboard computer. We only control the 3-D position and velocity of the quadrotor; the yaw angle is kept constant throughout the experiment.

One may argue that a trajectory-generation-based method [13] would yield better performance than a simple feedback controller. However, after considering the dynamics of the quadrotor (Sect. 4.1) and the convergence properties of the ball trajectory estimator (Sect. 4.2), we found that the margin for the quadrotor to successfully catch a ball is only 20 cm; beyond that, the platform does not have sufficient control authority to reach the desired position. For this reason, we concluded that the benefit of more sophisticated trajectory generation would be minimal, and we stuck with a simple feedback controller for our application.

4 Experimental Results

4.1 Analysis of Quadrotor Dynamics

In this experiment, we analyze the dynamical capability of the quadrotor. This is essential, as we need to make a prior judgment as to whether the quadrotor will be able to catch the ball before it takes any intercepting action. Based on the thrust-to-weight ratio of our quadrotor (Sect. 3.1), we know that the platform can sustain a maximum attitude of 63 degrees while still maintaining its height. This corresponds to a maximum horizontal acceleration of 19 m/s². We also need to take the latency in data processing, the motor response, and the platform's moment of inertia into account when determining the maximum travel distance in a given period of time. To this end, we design two experiments to show the overall response of our quadrotor to step inputs.
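These figures follow directly from the thrust-to-weight ratio $r = 2.24$ (Sect. 3.1): at the largest sustainable tilt, the vertical thrust component must still cancel gravity, so

$$\cos\theta_{\max} = \frac{1}{r} \;\Rightarrow\; \theta_{\max} = \arccos\frac{1}{2.24} \approx 63^{\circ}, \qquad a_{\max} = g\sqrt{r^2 - 1} \approx 9.81 \times 2.0 \approx 19\ \mathrm{m/s^2}.$$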

Fig. 3 shows the response of the quadrotor to step velocity changes. Fig. 4 shows the response to a step position command of 0.3 meters. It takes approximately 0.8 sec for the quadrotor to reach the desired location. As shown in Fig. 5(b) in Sect. 4.2, for a throw of about 2 meters in maximum altitude, the quadrotor has about 0.8 sec to react to the ball after the ball trajectory is estimated. Therefore, we conclude that the maximum displacement between the estimated ball landing point and the current quadrotor position is 0.3 meters, beyond which the quadrotor will not try to catch the ball.

Fig. 3. Response of the quadrotor to step velocity changes. The corresponding position profile is also plotted.

Fig. 4. Response of the quadrotor to a step position command. It takes about 0.8 sec to reach a step command of 0.3 meters.

4.2 Ball Trajectory Estimation Performance

In this section, we evaluate the performance of the 3-D ball trajectory estimator in two sets of experiments. An optical motion capture system (mocap) is used to provide ground-truth references. We throw the ball in an indoor environment with a maximum height of about 3 meters.

In the first experiment, we create virtual 2-D ball observations by back-projecting the ground-truth position of the ball provided by the mocap into a virtual camera. The virtual camera is attached to the ground-truth location of the quadrotor, which is also provided by the mocap. This experiment aims to evaluate the performance of the ball trajectory estimator using noiseless visual observations. The virtual ball observations can be computed as:

$$\mathbf{p}^c_b = \begin{bmatrix} x_{p^c_b} \\ y_{p^c_b} \\ z_{p^c_b} \end{bmatrix} = {\mathbf{R}^w_c}^T\left(\mathbf{p}^w_b - \mathbf{p}^w_c\right), \qquad \mathbf{u} = \frac{1}{z_{p^c_b}}\begin{bmatrix} x_{p^c_b} \\ y_{p^c_b} \end{bmatrix}, \qquad (10)$$


Given a sequence of $\mathbf{u}$, $\mathbf{R}^w_c$, and $\mathbf{p}^w_c$, we use the algorithm proposed in Sect. 2 to estimate the ball trajectory parameters, namely the initial position $\mathbf{p}^w_{b_{t_0}}$ and initial velocity $\mathbf{v}^w_{b_{t_0}}$. We keep the height of the quadrotor unchanged at $z_c$ and compute the X and Y positions of the ball when it drops to the quadrotor height using forward prediction. We first solve for the flying time of the ball $\Delta t$ from the second-order equation:

$$z_{p_b} + z_{v_b}\Delta t + \frac{1}{2}g\Delta t^2 = z_c, \qquad (11)$$

and then compute the intercepting point as:

$$\begin{bmatrix} x_c \\ y_c \end{bmatrix} = \begin{bmatrix} x_{p_b} \\ y_{p_b} \end{bmatrix} + \begin{bmatrix} x_{v_b} \\ y_{v_b} \end{bmatrix}\Delta t, \qquad (12)$$

where $\mathbf{p}^w_{b_{t_0}} = [x_{p_b}, y_{p_b}, z_{p_b}]^T$ and $\mathbf{v}^w_{b_{t_0}} = [x_{v_b}, y_{v_b}, z_{v_b}]^T$. Fig. 5(a) shows the convergence of the estimated landing point of the ball, while Fig. 6(a) shows the corresponding 3-D plots of the estimated ball trajectories at different times. Earlier estimates that use fewer observations are shown in darker color. We see a quick convergence of the estimated ball trajectory as more observations are collected. A multi-trial experiment showing the repeatability of our method is shown in Fig. 7(a).
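The forward prediction of (11)-(12) amounts to taking the descending root of a quadratic; a sketch under the paper's free-fall model (with $g$ negative in this sign convention):

```python
import numpy as np

def interception_point(p0, v0, z_c, g=-9.81):
    """Predict where the ball crosses the quadrotor height z_c, Eq. (11)-(12)."""
    a, b, c = 0.5 * g, v0[2], p0[2] - z_c
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return None                        # the ball never reaches height z_c
    dt = (-b - np.sqrt(disc)) / (2.0 * a)  # with a < 0 this is the later root
    return p0[:2] + v0[:2] * dt, dt
```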

Fig. 5. The convergence processes of the ball trajectory estimate (estimated X and Y landing coordinates vs. time). The dashed line represents the actual ball location when it drops to the height of the quadrotor. With sufficient observations, the estimated landing point of the ball converges to the ground truth. In Fig. 5(a), the ball's virtual measurements and the quadrotor localization are based on the mocap. In Fig. 5(b), the measurements are from vision and the quadrotor localization is based on the UKF-based sensor fusion.


In the second experiment, an identical procedure is followed, but this time using only onboard sensor measurements. We integrate the LED-assisted visual ball detection, the optical flow-based velocity estimator, and the UKF-based sensor fusion, and see very similar convergence patterns to the mocap case, as shown in Fig. 5(b) and Fig. 7(b). However, about twice as much time is needed for the estimated ball trajectory to converge. The repeatability test is shown in Fig. 7(b), in which we see that the error standard deviation, (0.0249 m, 0.0188 m) in the X and Y directions respectively, is about twice that of the mocap scenario. However, given this standard deviation, we are still able to perform successful ball catching.

Fig. 6. The convergence processes shown in 3-D. Darker lines indicate earlier estimates of the ball trajectory; lighter lines indicate later estimates, which clearly show convergence. Solid lines are estimated ball trajectories using visual measurements, while dashed lines represent the predicted ball motion. Darker (earlier) curves have longer dashed portions due to fewer visual measurements. For Fig. 6(a), all measurements are based on the mocap. For Fig. 6(b), all measurements are from vision.

4.3 Vision-Based Ball Catching

We finally integrate everything together and have the quadrotor catch the LED-lighted ball using only visual measurements. The quadrotor first hovers in place. When the ball is observed by the upward-facing camera and the ball trajectory estimate has converged, as determined by the condition number of the linear system in (2), the quadrotor is commanded to fly towards the estimated landing point of the ball. A visualization is shown in Fig. 8. In this experiment, the ball is thrown to about 3 m in height and lands about 0.3 m away from the initial hover position of the quadrotor. The quadrotor takes about 0.8 seconds to reach the interception point. The actual interception is less than 0.02 m from the center of the quadrotor.

Currently, the success rate of ball catching using all onboard sensing is only about 30%, and significant engineering optimization is still needed to increase the success rate.


Fig. 7. Fig. 7(a) shows the error distribution of the estimated landing point of the ball over multiple trials using virtual ball observations from the mocap. The standard deviation is (0.0118 m, 0.0065 m) in the X and Y directions, respectively. Fig. 7(b) shows the error in the estimated landing point of the ball using all visual measurements. The error standard deviation is (0.0249 m, 0.0188 m) in the X and Y directions, respectively. This is about twice that obtained using virtual ball observations from the mocap, but still within the margins of the ball container.

However, our experiments do show that vision-based ball catching with an aerial robot is feasible. A snapshot of this experiment is shown in Fig. 2(b).

5 Conclusion and Future Work

In this paper, we present a system, with a focus on 3-D ball trajectory estimation, for vision-based catching of a flying ball using a quadrotor. We present a novel ball trajectory estimation method that recovers the flying ball's kinematic states from noisy onboard sensors. Our system integrates a 2-D visual ball tracker, an optical flow-based vehicle velocity estimator, a UKF-based visual-inertial fusion method, and online feedback control. Extensive experimental results are presented to verify the proposed approach. To the best of our knowledge, we are the first to demonstrate mid-air interception of a flying ball with an aerial robot using only onboard sensors.

Notably, our current system runs using two cameras with a very simple optical flow-based localization method. However, with state-of-the-art monocular visual-inertial state estimators [14], we could further reduce the system to a monocular one with even higher state estimation accuracy. Other areas of research include the use of trajectory generation methods [13] for better reaction to the time-varying ball trajectory estimate, while still satisfying the platform's dynamical constraints. We could also use dynamic vision sensors [15] for high-speed ball trajectory and vehicle state estimation.

Fig. 8. Visualization of the ball catching process, ordered in time from (a) to (f). The green line is the estimated ball trajectory, and red dots are ball observations back-projected onto the estimated 3-D trajectory. In Fig. 8(b), only a few visual observations have been made, so the ball trajectory estimate is inaccurate and unstable. The quadrotor starts the catching process in Fig. 8(c). As more observations are made, as shown in Fig. 8(e)-Fig. 8(f), the estimated ball trajectory converges and the quadrotor is able to intercept the ball.


References

1. Frese, U., Baeuml, B., Haidacher, S., Schreiber, G., Schaefer, I., Haehnle, M., Hirzinger, G.: Off-the-shelf vision for a robotic ball catcher. In: Proc. of the IEEE/RSJ Intl. Conf. on Intell. Robots and Syst. Volume 3., IEEE (2001) 1623–1629

2. Birbach, O., Frese, U., Bauml, B.: Realtime perception for catching a flying ball with a mobile humanoid. In: Proc. of the IEEE Intl. Conf. on Robot. and Autom., IEEE (2011) 5955–5962

3. Bouffard, P., Aswani, A., Tomlin, C.: Learning-based model predictive control on a quadrotor: Onboard implementation and experimental results. In: Proc. of the IEEE Intl. Conf. on Robot. and Autom. (2012)

4. Muller, M., Lupashin, S., D'Andrea, R.: Quadrocopter ball juggling. In: Proc. of the IEEE/RSJ Intl. Conf. on Intell. Robots and Syst., IEEE (2011) 5113–5120

5. Dong, W., Gu, G.Y., Ding, Y., Zhu, X., Ding, H.: Ball juggling with an under-actuated flying robot. In: Proc. of the IEEE/RSJ Intl. Conf. on Intell. Robots and Syst., IEEE (2015) 68–73

6. Ritz, R., Muller, M.W., Hehn, M., D'Andrea, R.: Cooperative quadrocopter ball throwing and catching. In: Proc. of the IEEE/RSJ Intl. Conf. on Intell. Robots and Syst., IEEE (2012) 4972–4978

7. Silva, R., Melo, F.S., Veloso, M.: Towards table tennis with a quadrotor autonomous learning robot and onboard vision. In: Proc. of the IEEE/RSJ Intl. Conf. on Intell. Robots and Syst., IEEE (2015) 649–655

8. Shen, S., Mulgaonkar, Y., Michael, N., Kumar, V.: Multi-sensor fusion for robust autonomous flight in indoor and outdoor environments with a rotorcraft MAV. In: Proc. of the IEEE Intl. Conf. on Robot. and Autom., IEEE (2014) 4974–4981

9. Ma, Y., Soatto, S., Kosecka, J., Sastry, S.S.: An invitation to 3-D vision: from images to geometric models. Volume 26. Springer Science & Business Media (2012)

10. Huber, P.: Robust estimation of a location parameter. Annals of Mathematical Statistics 35(1) (1964) 73–101

11. Lucas, B.D., Kanade, T.: An iterative image registration technique with an application to stereo vision. In: Proc. of the Intl. Joint Conf. on Artificial Intelligence, Vancouver, Canada (August 1981)

12. Lee, T., Leoky, M., McClamroch, N.: Geometric tracking control of a quadrotor UAV on SE(3). In: Proc. of the Intl. Conf. on Decision and Control, Atlanta, GA (December 2010) 5420–5425

13. Mellinger, D., Kumar, V.: Minimum snap trajectory generation and control for quadrotors. In: Proc. of the IEEE Intl. Conf. on Robot. and Autom., IEEE (2011) 2520–2525

14. Shen, S., Michael, N., Kumar, V.: Tightly-coupled monocular visual-inertial fusion for autonomous flight of rotorcraft MAVs. In: Proc. of the IEEE Intl. Conf. on Robot. and Autom., Seattle, WA (May 2015)

15. Mueggler, E., Baumli, N., Fontana, F., Scaramuzza, D.: Towards evasive maneuvers with quadrotors using dynamic vision sensors. In: European Conference on Mobile Robots, IEEE (2015) 1–8