
Joint GPS and Vision Estimation Using an Adaptive Filter

Shubhendra Vikram Singh Chauhan and Grace Xingxin Gao, University of Illinois at Urbana-Champaign

Shubhendra Vikram Singh Chauhan received his B.Tech. and M.Tech. degrees in aerospace engineering from the Indian Institute of Technology, Bombay, India in 2016. He received the Institute Silver Medal on graduation. He is currently pursuing his PhD degree at the University of Illinois at Urbana-Champaign. His research interests include robotics, controls and sensor fusion.

Grace Xingxin Gao received the B.S. degree in mechanical engineering and the M.S. degree in electrical engineering from Tsinghua University, Beijing, China in 2001 and 2003, respectively. She received the PhD degree in electrical engineering from Stanford University in 2008. From 2008 to 2012, she was a research associate at Stanford University. Since 2012, she has been with the University of Illinois at Urbana-Champaign, where she is presently an assistant professor in the Aerospace Engineering Department. Her research interests are systems, signals, control, and robotics.

Abstract—Navigation in urban environments with standalone GPS is a challenging task. In urban environments, it is probable that GPS signals are blocked or reflected by buildings. These factors degrade position accuracy. Therefore, it is desirable to augment GPS with another sensor using a sensor fusion technique. The performance of a sensor fusion technique is highly dependent on covariance matrices, and tuning these matrices is time consuming. In the case of GPS, the covariance matrices may change with time and may vary in size. The expected noise in GPS measurements in urban environments is different from that in open environments. The number of visible satellites may increase or decrease with time, changing the size of the measurement covariance matrix. The time and size variation of covariance matrices makes position estimation even harder.

In this paper, we propose an adaptive filter for GPS-vision fusion that adapts to time- and size-varying covariance matrices. In the proposed method, we assume that the noise is Gaussian over small time intervals, and then use the innovation sequence to derive an expression for the covariance matrices. The camera image is compared with Google Street View (GSV) to obtain position. We use pseudoranges from a GPS receiver and positions obtained from the camera as measurements for our Extended Kalman Filter (EKF). The covariance matrices are obtained from the innovation sequence and the Kalman gain. The proposed filter is tested in GPS-challenged urban environments on the University of Illinois at Urbana-Champaign campus. The proposed filter demonstrates improved performance over a filter with fixed covariance matrices.

Keywords—Adaptive filter, image localization, global positioning system (GPS)

I. INTRODUCTION

Navigation in urban environments has always been a challenging task, especially with standalone GPS. In urban environments, there are high chances of GPS satellite signals being blocked or reflected by buildings. These factors affect the position accuracy of GPS. In the literature, this problem has been approached in two ways. The first method is to identify the multipath signal and then either remove it or weight it in the estimation process [1], [2]. The second method is to use another sensor to augment GPS. A vision sensor is commonly used for augmentation, as urban environments are rich in features like buildings, street lights, etc. These features are utilized in the localization process.

A sensor fusion technique is used to fuse the measurements. The performance of a sensor fusion technique is highly dependent on covariance matrices. Tuning covariance matrices is a laborious task. [3] uses ground truth to train covariance matrices. However, in this method there is a chance of overfitting, i.e., once tuned for a particular scenario, the matrices may not work for a different scenario, as noise may change with time. The second challenge, which is not addressed in the literature, is that the size of the measurements may change with time. [4] uses a covariance matching method to estimate covariance matrices adaptively. That paper does not address the issue of size change of the measurement covariance matrix.

A. Approach

The main contribution of this paper is the development of a filter that adapts to time- and size-varying noise. The Gaussian assumption of noise is not valid when measurements are used from GPS or vision. The noise in GPS measurements in urban areas is expected to vary with time, given that the chances of reflection are high. Also, the number of visible satellites may increase or decrease during the course of an experiment. Similarly, the noise in vision depends on the number of features present in the environment. We expect the noise in vision to vary with time. We use the measurements to estimate the covariance matrices, as any time and size variation in noise affects the measurements. The overall architecture is shown in Figure 1.

The input measurements of our EKF are pseudoranges, obtained from a GPS receiver, and position, obtained from vision. We create a set of grid points around the previous position estimate and pull images from the street view database. In the image matching module, we compare the raw image from the camera with GSV to obtain the position. This is what we refer to as the position obtained from vision. In the covariance estimation module, we use


Fig. 1: Overall architecture

the innovation sequence and Kalman gain to estimate the covariance matrices. The estimated covariance matrices are fed to the EKF for the next time step.

We conducted an experiment in urban environments on the University of Illinois at Urbana-Champaign (UIUC) campus. We implemented the proposed filter on the collected dataset. The proposed filter adapted to time- and size-varying noise, and we were able to improve position accuracy in urban environments.

The rest of the paper is organized as follows: Section II describes the algorithm to match images from the GSV database. Section III presents our method to estimate covariance matrices. Section IV provides details of the EKF measurement and process models. The experimental scenario is given in Section V. Section VI shows the results, and the paper is concluded in Section VII.

II. IMAGE MATCHING USING SIFT FEATURES

GSV contains a database of images that are geo-referenced. This map is discrete, i.e., if the difference in the position of two pull requests is less than 10 m, GSV returns the same image.
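Because GSV returns the same panorama for pull positions closer than about 10 m, the grid of candidate positions around the previous estimate only needs roughly that spacing. A minimal sketch, assuming a local east-north frame in meters around the previous estimate; the grid extent and helper name here are illustrative, not values from the paper:

```python
def make_grid(prev_east, prev_north, spacing=10.0, half_extent=30.0):
    """Candidate GSV pull positions around the previous position estimate.

    spacing ~10 m matches the granularity of the GSV database;
    a finer grid would just return duplicate panoramas.
    """
    pts = []
    n = int(half_extent // spacing)
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            pts.append((prev_east + i * spacing, prev_north + j * spacing))
    return pts

grid = make_grid(0.0, 0.0)   # 7 x 7 = 49 candidate positions
```

Each grid point becomes one pull request to the street view database, and the returned images are the candidates for the matching step below.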

SIFT [5] features are invariant to scale, rotation and lighting. Each feature comes with a vector descriptor. Features are matched by finding the nearest neighbor in descriptor space with some additional constraints. We use SIFT features for image matching. Figure 2 shows the steps involved in image matching.

Fig. 2: Image matching using SIFT features

First, SIFT features are calculated for all the images present in the database and also for the camera image. These features are then matched by applying Lowe's criterion. Lowe's criterion considers two features to correspond to each other if the ratio of the distances to the two nearest neighbors is below a certain threshold. The feature matching step is performed to find the feature correspondences. This step is susceptible to outliers.
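The ratio criterion above can be sketched with plain descriptor arithmetic. This is an illustration, not the authors' implementation: descriptors are plain float lists, distances are Euclidean, and the 0.8 threshold is the value suggested by Lowe [5], which may differ from the one used in the paper:

```python
import math

def ratio_test_matches(desc_a, desc_b, threshold=0.8):
    """Match each descriptor in desc_a to desc_b using Lowe's ratio test.

    A match is accepted only if the nearest neighbor is significantly
    closer than the second nearest (distance ratio below threshold).
    Returns a list of (index_in_a, index_in_b) pairs.
    """
    def dist(u, v):
        return math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v)))

    matches = []
    for i, d in enumerate(desc_a):
        # Distances from descriptor i to every descriptor in the other image
        dists = sorted((dist(d, e), j) for j, e in enumerate(desc_b))
        if len(dists) >= 2 and dists[0][0] < threshold * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches
```

An ambiguous feature, whose two nearest neighbors are almost equally far, fails the ratio test and is discarded rather than matched incorrectly.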

The transformation matrix relating the images is found in the outlier rejection step. The matrix is found by randomly selecting feature correspondences and then estimating the matrix elements. The random selection of features is repeated, and a transformation matrix is estimated for each trial. After applying the transformation matrix to the images, the number of inliers is counted. The matrix that gives the largest number of inliers is taken as the transformation matrix.
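The random-sampling loop described above is essentially RANSAC. A minimal sketch for a pure-translation model; the paper's transformation matrix likely has more degrees of freedom, but translation keeps the hypothesize-and-count structure short:

```python
import random

def ransac_translation(pts_a, pts_b, iters=100, tol=3.0, seed=0):
    """Estimate a 2D translation mapping pts_a -> pts_b from noisy
    correspondences; returns (best_translation, inlier_count).

    Each iteration hypothesizes a translation from one randomly chosen
    correspondence, then counts how many pairs agree within tol pixels.
    The hypothesis with the most inliers wins.
    """
    rng = random.Random(seed)
    best_t, best_inliers = (0.0, 0.0), 0
    for _ in range(iters):
        k = rng.randrange(len(pts_a))
        tx = pts_b[k][0] - pts_a[k][0]
        ty = pts_b[k][1] - pts_a[k][1]
        inliers = sum(
            1 for (ax, ay), (bx, by) in zip(pts_a, pts_b)
            if abs(ax + tx - bx) <= tol and abs(ay + ty - by) <= tol
        )
        if inliers > best_inliers:
            best_t, best_inliers = (tx, ty), inliers
    return best_t, best_inliers
```

The returned inlier count is the quantity stored per database image in the next step.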

The number of inliers is stored for every database image. The image with the largest number of inliers is considered a match, and the position corresponding to that image is taken as the position obtained from vision.

III. ADAPTIVE COVARIANCE ESTIMATION

We use the innovation sequence to estimate the noise covariance matrices. The innovation sequence is the difference between the measurements and the expected measurements. Any time or size variation of the noise will affect the measurements, and the effect will be observed in the innovation sequence. Figure 3 conceptually shows the steps involved in covariance estimation.

Fig. 3: Estimation of covariance matrices

Measurements and states are input to the EKF. We use the innovation sequence and the measurement model to get an expression for the measurement noise covariance. Similarly, we use the process model, Kalman gain and innovation sequence to get an expression for the process noise covariance. These covariance matrices are then fed to the EKF at the next time step.

A. Filter Equations

Predict and update are the two steps involved in Kalman filtering. The equations for the predict step are given below:

$x_{k|k-1} = f(x_{k-1|k-1}) + \omega_k$ (1)

$P_{k|k-1} = F_k P_{k-1|k-1} F_k^T + Q_k$ (2)

where the subscript $k|k-1$ denotes the value at time step $k$ given $k-1$ measurements, $k-1|k-1$ denotes the value at time step $k-1$ given $k-1$ measurements, $x$ is the state vector, $f$ is a nonlinear function that maps the previous states to the current time step, $\omega_k$ is assumed to be zero-mean Gaussian, $P$ is the error covariance matrix, $F_k$ is the Jacobian of the nonlinear function $f$, evaluated at $x_{k-1|k-1}$, and $Q_k$ is the covariance matrix of the process noise $\omega_k$.


Equation 1 uses the dynamics of the system to propagate the states forward in time. Equation 2 propagates the error covariance matrix forward in time. The update step is performed by the following equations:

$z_k = h(x_{k|k-1}) + \nu_k$ (3)

$y_k = z_k - h(x_{k|k-1})$ (4)

$S_k = R_k + H_k P_{k|k-1} H_k^T$ (5)

$K_k = P_{k|k-1} H_k^T S_k^{-1}$ (6)

$x_{k|k} = x_{k|k-1} + K_k y_k$ (7)

$P_{k|k} = (I - K_k H_k) P_{k|k-1}$ (8)

where the subscript $k$ denotes the time step, $h$ is the nonlinear function that maps states to measurements, $\nu_k$ is assumed to be zero-mean Gaussian, $z_k$ is the measurement, $y_k$ is the innovation sequence, $H_k$ is the Jacobian of the nonlinear function $h$, evaluated at $x_{k|k-1}$, $S_k$ is the theoretical covariance of the innovation sequence and $K_k$ is the Kalman gain.

The measurement model is given by equation 3. The innovation sequence, the difference between the measurements and the expected measurements, is given by equation 4. Equation 5 gives the covariance of the innovation sequence. The Kalman gain is obtained using equation 6. The states are updated using equation 7, and the error covariance matrix is updated using equation 8. In the standard EKF, both the process and measurement noise are assumed to be Gaussian.
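Equations 1-8 condense into a small numpy sketch. This is a generic EKF step under the stated Gaussian assumptions, not the authors' code; $f$, $h$ and their Jacobians are passed in as callables, and the innovation is computed with $h$ evaluated at the predicted state:

```python
import numpy as np

def ekf_step(x, P, z, f, F, h, H, Q, R):
    """One EKF predict/update cycle (equations 1-8).

    x, P : previous state estimate and error covariance
    z    : current measurement vector
    f, F : process model (callable) and its Jacobian (matrix)
    h, H : measurement model (callable) and its Jacobian (matrix)
    Q, R : process and measurement noise covariances
    Returns (x_new, P_new, y, S, K) so that the innovation y, its
    covariance S and the gain K can feed the covariance estimator.
    """
    # Predict (equations 1-2)
    x_pred = f(x)
    P_pred = F @ P @ F.T + Q
    # Update (equations 3-8)
    y = z - h(x_pred)                      # innovation
    S = R + H @ P_pred @ H.T               # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)    # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new, y, S, K
```

Returning $y$, $S$ and $K$ alongside the updated state is what lets the adaptive layer of Section III plug in without modifying the filter itself.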

B. Covariance Estimation

We relax the assumption of the noise being Gaussian for the whole duration to the noise being Gaussian over a small interval of time. Let the length of this time interval be $N$. The actual covariance of the innovation sequence is calculated using the following equation:

$E[y_k y_k^T] = \frac{1}{N-1} \sum_{i=1}^{N} (y_{k-i} - \bar{y})(y_{k-i} - \bar{y})^T$ (9)
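Equation 9 is simply a sample covariance over a sliding window of innovations. A sketch, assuming all $N$ stored innovation vectors in the window have the same length (a size change would require restarting the window):

```python
import numpy as np

def innovation_covariance(window):
    """Sample covariance S_k^m of the last N innovation vectors
    (equation 9), with the usual 1/(N-1) normalization."""
    Y = np.asarray(window)        # shape (N, m): one innovation per row
    ybar = Y.mean(axis=0)         # mean innovation over the window
    D = Y - ybar
    return D.T @ D / (len(window) - 1)
```

The window length $N$ trades responsiveness against noise in the estimate: a short window tracks changes quickly but gives a noisy $S_k^m$.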

where $E[\cdot]$ denotes the expectation with respect to time and $\bar{y}$ is the mean of the $N$ samples of the innovation sequence around time step $k$. Let the actual covariance be denoted by $S_k^m$, i.e., $E[y_k y_k^T] = S_k^m$. Substituting this expression for the covariance of the innovation sequence into equation 5 and rearranging gives

$R_k = S_k^m - H_k P_{k|k-1} H_k^T$ (10)

where $R_k$ represents the estimated measurement noise covariance. We also assume the noise elements to be independent. This, along with the assumption of the noise being Gaussian over small time intervals, makes $R_k$ a diagonal matrix. The elements of the diagonal matrix are then given by

$r_k^i = S_k^m(i,i) - H_k(i,:) P_{k|k-1} H_k(i,:)^T$ (11)

where $r_k^i$ is the $i$th diagonal element of $R_k$ and $H_k(i,:)$ denotes the $i$th row of the measurement matrix. Any size change in the measurements changes the size of $S_k^m$, and equation 11 is used to estimate the additional noise elements. Equation 11 is used to estimate the elements of the measurement noise covariance. Using the process model, the process noise can be written as
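Applied elementwise, equation 11 might look like the following sketch; when the measurement vector grows (a new satellite becomes visible), $S_k^m$ simply gains rows and columns and the same formula covers the new diagonal entries. The clipping to a small positive floor is an illustrative safeguard, not something from the paper:

```python
import numpy as np

def estimate_R(S_m, H, P_pred, floor=1e-9):
    """Diagonal measurement-noise estimate from equation 11:
    r_k^i = S_m[i,i] - H[i,:] P_pred H[i,:]^T, one entry per current
    measurement, so a change in measurement size is handled for free.
    """
    m = S_m.shape[0]
    r = np.empty(m)
    for i in range(m):
        r[i] = S_m[i, i] - H[i, :] @ P_pred @ H[i, :]
        r[i] = max(r[i], floor)   # keep R positive (illustrative choice)
    return np.diag(r)
```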

$\omega_k = x_{k|k} - f(x_{k-1|k-1})$ (12)

$\Rightarrow \omega_k = x_{k|k} - x_{k|k-1}$ (13)

Using equation 7, the above equation simplifies to

$\Rightarrow \omega_k = K_k y_k$ (14)

Taking the expectation of the outer product of both sides of equation 14 results in the following equation:

$Q_k = K_k S_k^m K_k^T$ (15)

where $Q_k$ is the estimated process noise covariance. We smooth the estimation of the covariance matrices by weighting the previous estimates, using the following equations:

$r_k^i = \alpha r_k^i + (1-\alpha) r_{k-1}^i$ (16)

$Q_k = \alpha Q_k + (1-\alpha) Q_{k-1}$ (17)

where $\alpha$ is the parameter used for averaging the estimates. Equations 16 and 17 do not ensure positive definiteness; therefore, in the implementation we take the absolute value of equations 16 and 17. Equations 16 and 17 are used to adaptively estimate the covariance matrices. In the next section, the process and measurement models used in our implementation are described.
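Equations 16-17, with the absolute-value safeguard mentioned above, can be sketched as a single smoothing step applied to either $R$'s diagonal or $Q$. The value of $\alpha$ here is an illustrative choice, not the paper's:

```python
import numpy as np

def smooth(current, previous, alpha=0.3):
    """Exponential smoothing of covariance estimates (equations 16-17).

    The recursion does not guarantee positive definiteness, so,
    as noted in the paper, the absolute value is taken in practice.
    """
    return np.abs(alpha * current + (1.0 - alpha) * previous)
```

A small $\alpha$ weights history heavily and gives smooth but slowly adapting matrices; a large $\alpha$ tracks sudden noise changes at the cost of jitter.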

IV. GPS-VISION FUSION

The states of our system are

$x = [x\ \dot{x}\ y\ \dot{y}\ z\ \dot{z}\ c\delta t\ c\dot{\delta t}]^T$ (18)

where $(x, y, z)$ denotes the position in the Earth-Centered Earth-Fixed (ECEF) coordinate system, $(\dot{x}, \dot{y}, \dot{z})$ denotes the velocity in the ECEF coordinate system, and $(c\delta t, c\dot{\delta t})$ denote the clock bias and clock bias rate, respectively. We use a constant velocity model for our system. The dynamics of the system are given by

$x_{k|k-1} = A x_{k-1|k-1} + \omega_k$ (19)

where

$A = \begin{bmatrix} 1 & \Delta t & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & \Delta t & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & \Delta t & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & \Delta t \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}$,

$\Delta t$ is a small time step and $\omega_k$ is the process noise. As this system is linear, $A$ is the same as $F_k$. We use pseudoranges from GPS
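The block structure of $A$, the same $2\times 2$ constant-velocity block repeated for the $x$, $y$, $z$ and clock states, is conveniently built with a Kronecker product:

```python
import numpy as np

def transition_matrix(dt):
    """Constant-velocity state transition A of equation 19:
    four identical [[1, dt], [0, 1]] blocks on the diagonal, one each
    for (x, xdot), (y, ydot), (z, zdot) and (clock bias, bias rate).
    """
    block = np.array([[1.0, dt],
                      [0.0, 1.0]])
    return np.kron(np.eye(4), block)   # 8 x 8

A = transition_matrix(0.1)
```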


and the position obtained from vision as measurements. The measurement vector is given by

$z = [\rho^{SV_1}\ \cdots\ \rho^{SV_i}\ \cdots\ \rho^{SV_n}\ x_{vision}\ y_{vision}\ z_{vision}]^T$ (20)

where $\rho^{SV_i}$ is the pseudorange obtained from satellite $i$, $n$ is the number of visible satellites and $(x_{vision}, y_{vision}, z_{vision})$ is the position obtained from vision. The measurement model is given by

$z = \begin{bmatrix} \sqrt{(X^{SV_1} - x)^2 + (Y^{SV_1} - y)^2 + (Z^{SV_1} - z)^2} + c\delta t \\ \vdots \\ \sqrt{(X^{SV_i} - x)^2 + (Y^{SV_i} - y)^2 + (Z^{SV_i} - z)^2} + c\delta t \\ \vdots \\ \sqrt{(X^{SV_n} - x)^2 + (Y^{SV_n} - y)^2 + (Z^{SV_n} - z)^2} + c\delta t \\ x \\ y \\ z \end{bmatrix} + \nu_k$ (21)

where $(X^{SV_i}, Y^{SV_i}, Z^{SV_i})$ denotes the position of satellite $i$ in the ECEF frame and $\nu_k$ is the measurement noise. The appropriate Jacobian is used in the filter implementation. We used the specified process and measurement models for our system. In the next section, the experimental scenario is briefly described.
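The measurement model of equations 20-21 stacks $n$ predicted pseudoranges with the ECEF position that the vision measurement is compared against. A sketch, assuming the state layout of equation 18; the function and argument names are illustrative:

```python
import numpy as np

def measurement_model(x, sat_pos):
    """Predicted measurements h(x) of equation 21.

    x       : state [x, xdot, y, ydot, z, zdot, c*dt, c*dt_rate]
    sat_pos : (n, 3) array of satellite ECEF positions
    Returns an (n + 3)-vector: n pseudoranges (geometric range plus
    clock bias) followed by the receiver's ECEF position.
    """
    pos = np.array([x[0], x[2], x[4]])   # (x, y, z) from the state
    clock_bias = x[6]                    # c * delta_t
    ranges = np.linalg.norm(sat_pos - pos, axis=1) + clock_bias
    return np.concatenate([ranges, pos])
```

Because the pseudorange rows depend on how many satellites are currently visible, the length of this vector, and hence of $R_k$, changes over the run, which is exactly the size variation the adaptive filter must handle.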

V. EXPERIMENTAL SCENARIO

We conducted the experiment in urban environments on the UIUC campus. The scenario is shown in Figure 4.

Fig. 4: Experimental scenario in an urban environment

We used a commercial GPS receiver to record GPS data. We used a hand-held mobile camera to record a video while conducting the experiment. The traveled path is shown with a red line in Figure 4. The initial path is surrounded by large structures. It is expected that a few satellites will be blocked in the north-south direction. The rest of the path can be considered an open environment, as the heights of the surrounding structures are lower than those along the initial path.

VI. RESULTS

We implemented our image matching algorithm and tested it on the collected dataset. The obtained result is shown in Figure 5.

Fig. 5: Matched features for a single image in database

In Figure 5, SIFT features are shown with green circles. The top image is obtained from GSV and the bottom image was collected while conducting the experiment. The subsequent images, from left to right, correspond to the steps shown in the block diagram. The rightmost images show the features after removing outliers. It can be visually verified that the features shown in the bottom-right image correspond to the same features shown in the top-right image. This shows that reliable matching is achieved using our algorithm.

In equation 20, $z$ denotes the measurements. $z$ is a column vector of pseudoranges and the position obtained from vision. The length of this vector is referred to as the size of the measurements. In equation 20, this length is $n + 3$. Figure 6 shows that the length of $z$ is not constant and varies with time. The variance of the vision noise is estimated using the proposed method. The variance of the vision noise over time is shown in Figure 7. This demonstrates that the noise varies with time.

We implemented the filter with time- and size-varying noise. Figures 8 and 9 show the east and north positions with time, respectively. In Figures 8 and 9, the noise in vision changes its magnitude around 450 seconds; the proposed filter is successful in adapting to this time variation. In Figure 8, the filtered output is smooth compared to the position obtained from vision.

Using pseudoranges, we obtained position by finding the least-squares estimate (LSE). These estimates are shown in Figure 10 with yellow circles. To compare our method, we implemented an EKF with fixed covariance matrices; the result is shown in Figure 10 with a red line, and our proposed filter output is shown in blue. The proposed filter improved the position accuracy in urban environments.


Fig. 6: Length of measurement vector (z) with time

Fig. 7: Variance of vision noise with time

Fig. 8: Variation of east position with time

Fig. 9: Variation of north position with time

Fig. 10: LSE, filtered position with fixed and proposed covariance matrices

VII. CONCLUSION

This paper proposed a GPS-vision fusion method that accounts for time and size variation of noise. We used measurements to estimate time- and size-varying covariance matrices, as the effect of noise is observed in the measurements.

We conducted an experiment in urban environments on the UIUC campus. The proposed filter adapts to time- and size-varying noise. We compared our method with a filter using fixed covariance matrices and observed improved performance in urban environments.

REFERENCES

[1] P. D. Groves, “Shadow matching: A new GNSS positioning technique for urban canyons,” Journal of Navigation, vol. 64, no. 3, pp. 417–430, 2011.

[2] P. D. Groves, Z. Jiang, L. Wang, and M. K. Ziebart, “Intelligent urban positioning using multi-constellation GNSS with 3D mapping and NLOS signal detection,” in Proceedings of the 25th International Technical Meeting of The Satellite Division of the Institute of Navigation (ION GNSS+ 2012), Nashville, TN, USA, 2012, pp. 458–472.

[3] P. Abbeel, A. Coates, M. Montemerlo, A. Y. Ng, and S. Thrun, “Discriminative training of Kalman filters,” in Proceedings of Robotics: Science and Systems, Cambridge, USA, June 2005.

[4] S. Akhlaghi, N. Zhou, and Z. Huang, “Adaptive adjustment of noise covariance in Kalman filter for dynamic state estimation,” ArXiv e-prints, Feb. 2017.

[5] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, Nov. 2004. [Online]. Available: https://doi.org/10.1023/B:VISI.0000029664.99615.94