
Pattern Analysis & Applications (2001) 4:9–19 © 2001 Springer-Verlag London Limited

Robust Image Mosaicing of Soccer Videos using Self-Calibration and Line Tracking

Hyunwoo Kim and Ki Sang Hong
Department of Electronic and Electrical Engineering, Pohang University of Science and Technology, Pohang, Republic of Korea

Abstract: In this paper we propose an accurate and robust image mosaicing method of soccer video taken from a rotating and zooming camera using line tracking and self-calibration. The mosaicing of soccer videos is not easy, because their playing fields are low textured and moving players are included in the fields. Our approach is to track line features on the playing fields. The line features are detected and tracked using a self-calibration technique for a rotating and zooming camera. To track line features efficiently, we propose a new line tracking algorithm, called camera parameter guided line tracking, which works even when the camera motion undergoes sudden changes. Since we do not need to know any model for scenes beforehand, the proposed algorithm can be easily extended to other video sources, as well as other sports videos. Experimental results show the accuracy and robustness of the algorithm. An application of mosaicing is also presented.

Keywords: Inter-image homography; Line tracking; Rotating and zooming camera; Self-calibration; Soccer videos; Video mosaicing

1. INTRODUCTION

The image mosaicing technique is one of the most important elements of video analysis. In earlier work, we analysed soccer video using image mosaicing results [1,2]. A known model of the playing field was used, and the mosaicing results were not accurate. However, our work showed the value of image mosaics for soccer video analysis. The applications include determining the trajectories of players on the field models, as well as the 3D location of a soccer ball. In Reid and Zisserman [3], an accurate measurement for a soccer ball was introduced based on accurate image mosaicing results, but lines were matched manually. Irani et al [4,5] also introduced new applications, like video compression and indexing.

The mosaicing of soccer videos is not easy, because the playing fields are low textured and moving players are included in the fields. Traditional mosaicing techniques [6–8] are not appropriate for these cases, because they work on the moving players, not on the playing field. A reasonable

Received: 5 January 2000; Received in revised form: 2 May 2000; Accepted: 26 June 2000

approach is to track the line features of the playing field, and construct mosaics using the tracked lines.

Harris [9] utilised information on known 3D objects for tracking. He tracked the 3D position of objects and cameras using a Kalman filter, but he calibrated the cameras using a specific pattern, and assumed that the focal length of the cameras was fixed. The work of Clarke et al [10] extended Harris' work to the estimation of unknown 3D objects and uncalibrated cameras. They used only the structure of line objects; they did not use camera information such as focal length or relative rotation angles. In this paper, we extend their work to deal with more complex camera motion.

Other related work is as follows. Szeliski and Shum [8] introduced an image mosaicing method for image-based modeling by recovering 3D camera rotations. They assumed rotating cameras with a fixed focal length, which is approximately known. Morimoto and Chellappa [11] developed a fast electronic image stabilisation system that compensates for 3D rotation, but this method did not handle camera zooming either.

In contrast with those methods, our tracking method can handle camera zooming and focusing, and provides an efficient tracking algorithm, called Camera Parameter guided (CP-guided) tracking, owing to the self-calibration technique. Since we do not need to know any model for the scenes beforehand, the proposed algorithm can be easily extended to other video sources, as well as other sports videos. Experimental results show the accuracy and robustness of the algorithm, and we present an application of mosaics to calculate 3D ball trajectories from an unsynchronised stereo camera.

This paper is organised as follows. The algorithm is outlined in Section 2, and the self-calibration method is explained as a preliminary in Section 3. In Section 4, the CP-guided tracking algorithm is proposed. The initialisation and tracking stages are described in Sections 5 and 6, respectively. Experimental results and more applications are given in Sections 7 and 8, respectively. Finally, concluding remarks are given in Section 9.

2. OUTLINE

This section gives a brief outline of our algorithm. We suppose that an image sequence S = {I_0, …, I_N} is captured by a pan-tilt camera (without z-axis rotation) with varying internal parameters. A flowchart of our algorithm is shown in Fig. 1. The algorithm consists of two stages: the initialisation stage and the tracking stage.

The first stage of the algorithm is the initialisation stage. Initialisation consists of initial line matching and estimation of the camera parameters. First, line features l_0 and l_1 are extracted and matched between the 0th frame I_0 and the first frame I_1 using an initial line matching method, which will be explained in Section 4. Camera parameters and the inter-image homography are estimated from the matching lines using the nonlinear self-calibration method, which will be described in Section 3. In this step, false matches are rejected, and then the initial mosaic is constructed by the homography. Finally, the lines l_1 are transferred to the reference frame I_0, and the reference lines l_0 are updated by registering them. The set of reference lines is called the line model, and it is updated during tracking. The line model is the set of lines registered to the reference frame, and it helps us perform image mosaicing even when there is no overlapping region between the reference image and any other frame.

Fig. 1. Flowchart of our algorithm.

Next, the tracking stage follows the initialisation stage. Image mosaicing is sequentially performed in this stage. To carry out the sequential image mosaicing at frame I_k, the line features l_k are extracted, and l_0 and l_k are matched using our CP-guided tracking method, which will be introduced in Section 4. The positions of the line model l_0 in image I_k are predicted using the CP-guided prediction, and the line features are matched using a proximity rule; then, the inter-image homography between I_0 and I_k is computed using the least median of squares (LMedS) method. To estimate the current camera parameters and refine the homography, we use the nonlinear self-calibration algorithm. An image mosaic is constructed from the result, and the line model is updated. These steps are repeated for the subsequent (k+1)th frames until we reach the last frame.

3. SELF-CALIBRATION

In this paper, we construct mosaics of images captured by a pan-tilt camera with varying focal lengths, and this section explains the self-calibration method. Details can be found in Kim and Hong [12]. In contrast to other algorithms [13–16], the algorithm works well even when the camera motion is almost all zooming with very little rotation.

3.1. Camera Modelling

We consider a pan-tilt camera with projection matrices P_k = K_k [R_k | 0], where R_k denotes the rotation of the kth camera with respect to the reference (0th) camera, and K_k is the camera matrix defined by

$$K_k = \operatorname{diag}(f_k, f_k, 1) \tag{1}$$

where f_k is the focal length of the kth camera. Note that we assume the principal point of the camera is at the image centre, the skew is zero and the aspect ratio can be approximated by 1. These assumptions are reasonable, because mislocating the principal point and zero-skew modelling do not seem to affect the self-calibration at the practical level. The effects of the assumptions are analysed in Seo and Hong [16].

For this camera model, there is a 2D projective transformation H_k, which transfers image points u_0 on the reference frame to their matching points u_k on the kth frame, whose matrix is of the form

$$H_k = K_k R_k K_0^{-1} \tag{2}$$

The matrix is called an inter-image homography, and it satisfies the relationship u_k = H_k u_0, where u_k and u_0 are matching points. For matching lines l_k and l_0, the relationship l_k = H_k^{-T} l_0 is satisfied. For pan-tilt cameras, Eq. (2) can be written as

$$H_k(f_0, f_k, \alpha_k, \beta_k) =
\begin{bmatrix}
\cos\beta_k & \sin\alpha_k \sin\beta_k & -f_0 \cos\alpha_k \sin\beta_k \\
0 & \cos\alpha_k & f_0 \sin\alpha_k \\
\dfrac{\sin\beta_k}{f_k} & -\dfrac{\sin\alpha_k \cos\beta_k}{f_k} & \dfrac{f_0 \cos\alpha_k \cos\beta_k}{f_k}
\end{bmatrix} \tag{3}$$
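As a sanity check, the closed form of Eq. (3) can be compared numerically against the factorisation H_k = K_k R_k K_0^{-1} of Eq. (2). The sketch below is ours, not the paper's; in particular, the explicit rotation matrix encodes one sign convention for the pan and tilt angles that is consistent with Eq. (3). The two constructions agree up to the overall scale λ:

```python
import numpy as np

def homography_eq3(f0, fk, alpha, beta):
    """Inter-image homography of Eq. (3) for a pan-tilt, zooming camera."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    return np.array([
        [cb,      sa * sb,       -f0 * ca * sb],
        [0.0,     ca,             f0 * sa],
        [sb / fk, -sa * cb / fk,  f0 * ca * cb / fk]])

def homography_eq2(f0, fk, alpha, beta):
    """Same homography via H = K_k R_k K_0^{-1} of Eq. (2); the explicit
    rotation below is one sign convention consistent with Eq. (3)."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    R = np.array([[cb,  sa * sb, -ca * sb],
                  [0.0, ca,       sa],
                  [sb, -sa * cb,  ca * cb]])
    K0 = np.diag([f0, f0, 1.0])   # Eq. (1)
    Kk = np.diag([fk, fk, 1.0])
    return Kk @ R @ np.linalg.inv(K0)

# The two forms agree up to the overall scale lambda.
H3 = homography_eq3(800.0, 1000.0, np.radians(2.0), np.radians(5.0))
H2 = homography_eq2(800.0, 1000.0, np.radians(2.0), np.radians(5.0))
assert np.allclose(H2 / H2[1, 1], H3 / H3[1, 1])
```

Normalising both matrices by the same entry (here h22 = cos α_k, which is nonzero for small tilt) removes the arbitrary scale before comparison.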


where α_k and β_k are the rotation angles around the x-axis and the y-axis of the reference camera coordinate system, respectively. We assume that the rotation angle γ_k around the z-axis is negligibly small. In this case, the number of unknowns is four: two for rotation and two for focal length. One inter-image homography H_k, from which we have eight equations, is sufficient to compute the unknown parameters.

3.2. Linear Algorithm

From Eq. (3), we get the following equations:

$$\cos\beta_k = \lambda h_{11} \tag{4}$$
$$\sin\alpha_k \sin\beta_k = \lambda h_{12} \tag{5}$$
$$-f_0 \cos\alpha_k \sin\beta_k = \lambda h_{13} \tag{6}$$
$$\cos\alpha_k = \lambda h_{22} \tag{7}$$
$$f_0 \sin\alpha_k = \lambda h_{23} \tag{8}$$
$$\frac{\sin\beta_k}{f_k} = \lambda h_{31} \tag{9}$$
$$-\frac{\sin\alpha_k \cos\beta_k}{f_k} = \lambda h_{32} \tag{10}$$
$$\frac{f_0 \cos\alpha_k \cos\beta_k}{f_k} = \lambda h_{33} \tag{11}$$

where λ is an arbitrary nonzero value. To eliminate f_0, we use Eqs (7), (8), (10) and (11), and get the relation $\tan^2\alpha_k = -\dfrac{h_{23}h_{32}}{h_{22}h_{33}}$. After considering the sign of the tangent function, we have Eq. (12) for calculating α_k. In a similar way, we can get β_k:

$$\alpha_k = \frac{h_{23}h_{33}}{|h_{23}h_{33}|}\,\tan^{-1}\sqrt{\left|\frac{h_{23}h_{32}}{h_{22}h_{33}}\right|}, \qquad
\beta_k = -\frac{h_{13}h_{33}}{|h_{13}h_{33}|}\,\tan^{-1}\sqrt{\left|\frac{h_{13}h_{31}}{h_{11}h_{33}}\right|} \tag{12}$$

The focal lengths f_0 and f_k can also be directly computed from the inter-image homography:

$$f_0 = \begin{cases}
\sqrt{\left|\dfrac{h_{13}h_{33}}{h_{11}h_{31} + h_{12}h_{32}}\right|}, & \text{if } |\alpha_k| \le |\beta_k| \\[2ex]
\sqrt{\left|\dfrac{h_{23}h_{33}}{h_{21}h_{31} + h_{22}h_{32}}\right|}, & \text{otherwise}
\end{cases} \tag{13}$$

$$f_k = \frac{1}{2}\left(\sqrt{\frac{f_0^2(h_{11}^2 + h_{12}^2) + h_{13}^2}{f_0^2(h_{31}^2 + h_{32}^2) + h_{33}^2}} + \sqrt{\frac{f_0^2(h_{21}^2 + h_{22}^2) + h_{23}^2}{f_0^2(h_{31}^2 + h_{32}^2) + h_{33}^2}}\right) \tag{14}$$
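Equations (12)–(14) can be exercised with a short round trip: build a homography from known parameters via Eq. (3), multiply in an arbitrary scale λ, and recover the parameters. The sketch below is our illustration; indexing is 0-based, so `h[i, j]` corresponds to h_{i+1, j+1} in the text:

```python
import numpy as np

def pan_tilt_homography(f0, fk, alpha, beta):
    # Eq. (3).
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    return np.array([
        [cb,      sa * sb,      -f0 * ca * sb],
        [0.0,     ca,            f0 * sa],
        [sb / fk, -sa * cb / fk, f0 * ca * cb / fk]])

def calibrate_linear(h):
    """Recover (f0, fk, alpha, beta) from a pan-tilt homography.
    Every expression is a ratio of products of entries of h, so the
    arbitrary scale lambda cancels."""
    # Eq. (12): rotation angles, signs taken from h23*h33 and h13*h33.
    alpha = np.sign(h[1, 2] * h[2, 2]) * np.arctan(
        np.sqrt(abs(h[1, 2] * h[2, 1] / (h[1, 1] * h[2, 2]))))
    beta = -np.sign(h[0, 2] * h[2, 2]) * np.arctan(
        np.sqrt(abs(h[0, 2] * h[2, 0] / (h[0, 0] * h[2, 2]))))
    # Eq. (13): choose the better-conditioned expression for f0.
    if abs(alpha) <= abs(beta):
        f0 = np.sqrt(abs(h[0, 2] * h[2, 2] /
                         (h[0, 0] * h[2, 0] + h[0, 1] * h[2, 1])))
    else:
        f0 = np.sqrt(abs(h[1, 2] * h[2, 2] /
                         (h[1, 0] * h[2, 0] + h[1, 1] * h[2, 1])))
    # Eq. (14): fk as the average of two row-norm ratios.
    den = f0**2 * (h[2, 0]**2 + h[2, 1]**2) + h[2, 2]**2
    fk = 0.5 * (np.sqrt((f0**2 * (h[0, 0]**2 + h[0, 1]**2) + h[0, 2]**2) / den)
                + np.sqrt((f0**2 * (h[1, 0]**2 + h[1, 1]**2) + h[1, 2]**2) / den))
    return f0, fk, alpha, beta

H = 3.7 * pan_tilt_homography(800.0, 1000.0, np.radians(2.0), np.radians(5.0))
f0, fk, alpha, beta = calibrate_linear(H)
assert np.allclose([f0, fk], [800.0, 1000.0])
assert np.allclose([alpha, beta], [np.radians(2.0), np.radians(5.0)])
```

With noise-free input the recovery is exact; on real homographies the result serves as the starting point for the nonlinear refinement of Section 3.3.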

In contrast to other self-calibration methods, which do not work when the camera motion is almost all zooming with very little rotation [13–16], this linear algorithm works for any camera motion. In addition, there is a nonlinear algorithm that adjusts not only the camera parameters, but also the inter-image homography, so that more accurate image registration is made possible.

3.3. Nonlinear Algorithm and Improvement of Inter-Image Homography

All previous algorithms use the following steps for self-calibration. First, the inter-image homography is computed from matching points (or matching lines), and then the camera parameters are calculated from the estimated homography. Therefore, the accuracy of self-calibration methods depends upon the inter-image homography estimation. That is, previous algorithms, including our proposed linear self-calibration algorithm, are sensitive to the homography estimation. To improve the performance for real images, we merge the two steps into one step using a nonlinear optimisation. The nonlinear algorithm can improve the inter-image homography due to the parameterisation using camera parameters.

Our approach is to estimate the camera parameters directly from matching points, not from the result of inter-image homography estimation. The relationship between M matching points, u_k = {u_k^1, …, u_k^M} and u_0 = {u_0^1, …, u_0^M}, is u_k = H_k u_0. Remember that the inter-image homography is parameterised by the camera parameters f_0, f_k, α_k and β_k (Eq. (3)). To solve for the camera parameters and the inter-image homography simultaneously, we minimise the following error function with respect to the camera parameters:

$$E(f_0, f_k, \alpha_k, \beta_k) = \frac{1}{M}\sum_{i=1}^{M}\left\| u_k^i - H_k(f_0, f_k, \alpha_k, \beta_k)\,u_0^i \right\|^2 \tag{15}$$

That is, we minimise the Euclidean distance between corresponding points.

For matching lines l_0 = {l_0^1, …, l_0^M} and l_k = {l_k^1, …, l_k^M}, the equation has the form

$$E = \frac{1}{M}\sum_{i=1}^{M}\left[\frac{d(l_k^i, H_k^{-T} l_0^i) + d(l_0^i, H_k^{T} l_k^i)}{2}\right]^2 \tag{16}$$

where d(l_a, l_b) denotes the distance between two lines, l_a and l_b. We call it the matching error. When the two endpoints of l_a are e_a^1 and e_a^2, and l_b has the form (l_{b,1}, l_{b,2}, l_{b,3})^T in homogeneous coordinates, the distance is defined by

$$d(l_a, l_b) = \sqrt{\frac{|l_b^T e_a^1|^2 + |l_b^T e_a^2|^2}{l_{b,1}^2 + l_{b,2}^2}} \tag{17}$$

The above distance measure between matching lines is only an example, and other distance measures can be used [17]. The linear solution (Section 3.2) is used for the initial values, and then Eq. (15) or (16) is optimised using the Levenberg–Marquardt method [18]. This nonlinear method gives stable self-calibration results for real images, and refines the inter-image homography and the linear solution simultaneously.
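A minimal version of this refinement over the point-based cost of Eq. (15) can be sketched with a hand-rolled Levenberg–Marquardt loop and a numeric Jacobian (the paper uses the standard routine of Press et al. [18]; everything below, including the synthetic matches and starting guess, is our illustration):

```python
import numpy as np

def H_of(p):
    # Eq. (3), parameterised by p = (f0, fk, alpha, beta).
    f0, fk, a, b = p
    ca, sa, cb, sb = np.cos(a), np.sin(a), np.cos(b), np.sin(b)
    return np.array([[cb,      sa * sb,      -f0 * ca * sb],
                     [0.0,     ca,            f0 * sa],
                     [sb / fk, -sa * cb / fk, f0 * ca * cb / fk]])

def residuals(p, u0, uk):
    # Eq. (15): per-point displacement between uk and the transferred H(p) u0.
    v = H_of(p) @ u0
    v = v / v[2]                       # back to inhomogeneous coordinates
    return (v[:2] - uk[:2]).ravel()

def refine(p, u0, uk, iters=50, mu=1e-3):
    """Levenberg-Marquardt: damped Gauss-Newton steps on the residual vector."""
    p = np.asarray(p, dtype=float)
    for _ in range(iters):
        r = residuals(p, u0, uk)
        J = np.empty((r.size, p.size))
        for j in range(p.size):        # forward-difference Jacobian
            dp = np.zeros(p.size)
            dp[j] = 1e-6 * max(1.0, abs(p[j]))
            J[:, j] = (residuals(p + dp, u0, uk) - r) / dp[j]
        step = np.linalg.solve(J.T @ J + mu * np.eye(p.size), -J.T @ r)
        if np.sum(residuals(p + step, u0, uk) ** 2) < np.sum(r ** 2):
            p, mu = p + step, mu * 0.5   # accept: relax towards Gauss-Newton
        else:
            mu *= 10.0                   # reject: increase damping
    return p

# Synthetic check: noise-free matches, a perturbed linear estimate as the start.
rng = np.random.default_rng(1)
u0 = np.vstack([rng.uniform(-300, 300, (2, 40)), np.ones(40)])
p_true = np.array([800.0, 1000.0, np.radians(2.0), np.radians(5.0)])
uk = H_of(p_true) @ u0
uk = uk / uk[2]
p = refine([780.0, 1040.0, np.radians(1.8), np.radians(5.4)], u0, uk)
assert np.allclose(p, p_true, rtol=1e-4)
```

Because the homography is parameterised directly by (f_0, f_k, α_k, β_k), the optimiser improves the homography and the calibration in the same step, which is the point of this section.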


Fig. 2. Line matching.

4. CP-GUIDED LINE TRACKING

CP-guided tracking consists of two steps: CP-guided prediction and robust line matching. The locations of the lines are predicted in the current frame using the CP-guided prediction algorithm; then the line model and the lines that are extracted in the current frame are matched through robust line matching.

Fig. 3. Original soccer video. (a) The reference image, (b) the other images. The order is from top left to bottom right.

Normally, to predict the locations of line features, a Kalman filter is used for each line feature, but it fails when the camera motion undergoes sudden changes, e.g. when a camera stops suddenly [19]. We apply our proposed prediction algorithm to the problem and overcome it. The basic idea of our algorithm is to predict the line locations by hierarchically searching for the camera parameter set with the maximum number of matchings.

Suppose that the camera parameters of the (k−1)th and the kth frame have already been calculated. (f_k, α_k and β_k denote the focal length, tilt angle and pan angle at the kth frame, respectively.) In this case, we want to predict the camera parameters f_{k+1}, α_{k+1} and β_{k+1} at the (k+1)th frame, and match lines based on them. The algorithm is performed hierarchically. First, we compute the deviation of each parameter between the previous two frames (f_k − f_{k−1}, α_k − α_{k−1} and β_k − β_{k−1}), then we quantise the camera parameter space with them. The quantised parameters are


used as candidates. The search range of the camera parameter space is determined depending on the speed of the camera. For example, if the camera can be assumed to have constant velocity, there will be just one candidate (f_{k+1} = f_k + (f_k − f_{k−1}), α_{k+1} = α_k + (α_k − α_{k−1}) and β_{k+1} = β_k + (β_k − β_{k−1})). Among the candidates, we select the parameter set with the maximum number of matching lines as the solution. Next, we reduce the quantisation step, and then search for the best parameter set as before. The procedure is repeated until the number of matches increases by less than a user-given threshold. The details of the algorithm are as follows.

1. Set i = 0, Δ = 1.0, f(0) = f_k, α(0) = α_k and β(0) = β_k.
2. Select candidates using the following equations:

$$f(l) = f(i) + l\,\frac{f_k - f_{k-1}}{\Delta}, \quad l = l_{\min}, \dots, l_{\max} \tag{18}$$
$$\alpha(m) = \alpha(i) + m\,\frac{\alpha_k - \alpha_{k-1}}{\Delta}, \quad m = m_{\min}, \dots, m_{\max} \tag{19}$$
$$\beta(n) = \beta(i) + n\,\frac{\beta_k - \beta_{k-1}}{\Delta}, \quad n = n_{\min}, \dots, n_{\max} \tag{20}$$

3. For all the candidates, calculate the number of matching lines using a proximity rule, which is explained in Section 4.1.

4. Choose the parameter set with the maximum number N_match(i) of matching lines, and store its corresponding camera parameters as f_min, α_min and β_min. Then increase i.
5. If |N_match(i) − N_match(i−1)| < N_thresh, a user-specified value, then go to Step 7. Otherwise, go to the next step.

Fig. 4. Predicted feature position. (a) 3rd frame, (b) 6th frame, (c) 9th frame, (d) 12th frame.

6. Replace f(i), α(i) and β(i) with f_min, α_min and β_min, respectively. Set l, m, n ∈ {−1, 0, 1} and Δ = 2^i, and go to Step 2.

7. Select f_min, α_min and β_min as f_{k+1}, α_{k+1} and β_{k+1}, respectively.

8. Match the previously tracked lines with the predicted lines using our proximity rule. Refine the matching lines using the LMedS method [20].

Steps 1–7 correspond to the CP-guided prediction algorithm, and Step 8 corresponds to the robust line matching. The prediction algorithm gives a set of matching lines. Thanks to the predicted camera parameters, the proximity rule matches the lines on the playing field, not those on the players. Then the LMedS method refines the matching lines by removing false matches. The threshold value N_thresh should be set depending on the outliers, such as the moving players in a soccer sequence, and on the proximity parameters specified by users for line matching (Section 4.1). Fortunately, we found that the number of iterations is not very sensitive to the value of N_thresh. In our case, we set the value to 0.1 × the number of currently matched lines for the sequences used in the experiment.
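The prediction loop of Steps 1–7 can be sketched as follows. To stay short, candidates are scored by a caller-supplied `count_matches` function standing in for the proximity rule of Section 4.1 (in the toy check below it counts sample points predicted near their observations rather than full line segments), and the {−1, 0, 1} offsets of Step 6 are used from the start; all names are ours:

```python
import numpy as np

def cp_guided_predict(count_matches, prev, curr, n_thresh=1, max_iter=8):
    """Steps 1-7: hierarchical search around the constant-velocity guess.
    prev, curr = (f, alpha, beta) at frames k-1 and k; count_matches(f, a, b)
    returns the number of lines matched under those predicted parameters."""
    (f1, a1, b1), (f, a, b) = prev, curr                   # Step 1
    df, da, db = f - f1, a - a1, b - b1                    # per-frame deviations
    delta, prev_n = 1.0, None
    for i in range(max_iter):
        cands = [(f + l * df / delta, a + m * da / delta, b + n * db / delta)
                 for l in (-1, 0, 1) for m in (-1, 0, 1) for n in (-1, 0, 1)]
        n_best, (f, a, b) = max(                           # Steps 3-4
            ((count_matches(*c), c) for c in cands), key=lambda t: t[0])
        if prev_n is not None and abs(n_best - prev_n) < n_thresh:
            break                                          # Step 5: converged
        prev_n = n_best
        delta = 2.0 ** (i + 1)                             # Step 6: refine grid
    return f, a, b                                         # Step 7

def H_of(f0, fk, a, b):
    # Eq. (3).
    ca, sa, cb, sb = np.cos(a), np.sin(a), np.cos(b), np.sin(b)
    return np.array([[cb,      sa * sb,      -f0 * ca * sb],
                     [0.0,     ca,            f0 * sa],
                     [sb / fk, -sa * cb / fk, f0 * ca * cb / fk]])

# Toy scene: the camera stopped suddenly, so the true (k+1)th parameters
# equal frame k's, while constant velocity alone would overshoot.
rng = np.random.default_rng(2)
pts = np.vstack([rng.uniform(-300, 300, (2, 25)), np.ones(25)])
f0 = 800.0
truth = (1000.0, np.radians(2.0), np.radians(5.0))
obs = H_of(f0, *truth) @ pts
obs = obs / obs[2]

def count_matches(f, a, b):
    pred = H_of(f0, f, a, b) @ pts
    pred = pred / pred[2]
    return int(np.sum(np.linalg.norm(pred[:2] - obs[:2], axis=0) < 2.0))

prev = (980.0, np.radians(1.8), np.radians(4.5))   # frame k-1
curr = (1000.0, np.radians(2.0), np.radians(5.0))  # frame k
f, a, b = cp_guided_predict(count_matches, prev, curr)
assert count_matches(f, a, b) == 25
```

Because the zero-offset candidate is always in the grid, a sudden stop costs nothing, while the ±1 offsets recover smooth motion; this is the behaviour the Kalman-filter predictor lacks.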

4.1. Proximity Rule

Let us explain our proximity rule for line matching. Suppose that l_0 = {l_0^1, …, l_0^M} and l_1 = {l_1^1, …, l_1^N} are two different sets of line segments. We want to match each l_0^i with the best matching line that satisfies the proximity rule and has the minimum matching distance among l_1. Remember that the matching distance is given in Eq. (16). First, we check the following conditions between each matching pair l_0^i and l_1^j:

$$|\theta_0 - \theta_1| < \theta_{th}, \quad |d_0 - d_1| < d_{th}, \quad \max(ol_x, ol_y) < ol_{th}, \quad \mathbf{g}_0^T \mathbf{g}_1 > 0 \tag{21}$$

where θ_k, d_k and g_k denote the slope angle, the distance from the image origin, and the average intensity gradient on the line segment l_k, respectively. ol_x and ol_y are the lengths of the overlapping region projected onto the x-axis and the y-axis, respectively, and g_0^T g_1 > 0 means that the gradients should not point in opposite directions, which resolves matching ambiguities between line segments on linear bands (e.g. thick lines). θ_th, d_th and ol_th are user-specified threshold values. (Figure 2 shows the described variables.) Among the matching candidates that satisfy all the proximity conditions, the line segment with the minimum matching distance is selected as the matching pair for each line segment.
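The four checks of Eq. (21) can be sketched on segments given as endpoint pairs. This is our illustration only: the gradient vectors are assumed to be supplied by the feature extractor, the threshold values are arbitrary, and the overlap measure ol_x, ol_y is our reading of Fig. 2 (here, the gap between the segments' axis-aligned projections, zero when they overlap):

```python
import numpy as np

def line_params(p, q):
    """Slope angle (in [0, pi)) and unsigned distance from the image origin
    of the infinite line through segment endpoints p and q."""
    t = q - p
    theta = np.arctan2(t[1], t[0]) % np.pi
    n = np.array([-t[1], t[0]]) / np.linalg.norm(t)   # unit normal
    return theta, abs(n @ p)

def axis_gap(s0, s1, axis):
    """Gap between the two segments' projections onto one image axis
    (0 when the projected intervals overlap)."""
    lo0, hi0 = sorted((s0[0][axis], s0[1][axis]))
    lo1, hi1 = sorted((s1[0][axis], s1[1][axis]))
    return max(0.0, max(lo0, lo1) - min(hi0, hi1))

def proximity_ok(s0, s1, g0, g1, th_theta=0.05, th_d=10.0, th_ol=15.0):
    """Eq. (21). s0, s1: segments as (endpoint, endpoint); g0, g1: average
    intensity-gradient vectors along each segment (assumed given).
    Thresholds and the overlap measure are illustrative, not the paper's."""
    th0, d0 = line_params(*s0)
    th1, d1 = line_params(*s1)
    dth = abs(th0 - th1)
    dth = min(dth, np.pi - dth)                # slope angles wrap at pi
    return (dth < th_theta
            and abs(d0 - d1) < th_d
            and max(axis_gap(s0, s1, 0), axis_gap(s0, s1, 1)) < th_ol
            and float(np.dot(g0, g1)) > 0.0)

# Two nearly identical horizontal segments with consistent gradients match:
a = (np.array([0.0, 100.0]), np.array([200.0, 101.0]))
b = (np.array([5.0, 102.0]), np.array([198.0, 103.0]))
g = np.array([0.0, 1.0])
assert proximity_ok(a, b, g, g)
assert not proximity_ok(a, b, g, -g)   # opposite gradients are rejected
```

The gradient-sign test is what disambiguates the two edges of a thick painted field line, as the text notes.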

5. THE INITIALISATION STAGE

As previously mentioned, in the initialisation stage the initial line matching and the camera parameter estimation are performed on the line segments extracted from the first two frames. In this section, we describe the details of the components of this stage.

5.1. Initial Line Matching

We extract line features l_0 and l_1, respectively, in the first two frames I_0 and I_1 using the standard Hough transform [21], and then we match the line segments as follows. Because we do not know the camera parameters in the initialisation stage, we exhaustively search the camera parameter space using a modified version of the CP-guided prediction algorithm. Steps 1 and 2 of the prediction algorithm are replaced with the following steps. In Step 1, we set f_0 to several typical values such as 500, 1000 and 2000 pixels, and set f_1 − f_0 = 20 pixels, α_1 = 1° and β_1 = 1°. In Step 2, candidates are selected as follows:

$$f(l) = f(i) + l\,\frac{f_1 - f_0}{\Delta}, \quad l = -l_{\max}, \dots, l_{\max} \tag{22}$$
$$\alpha(m) = \alpha(i) + m\,\frac{\alpha_1 - \alpha_0}{\Delta}, \quad m = -m_{\max}, \dots, m_{\max} \tag{23}$$
$$\beta(n) = \beta(i) + n\,\frac{\beta_1 - \beta_0}{\Delta}, \quad n = -n_{\max}, \dots, n_{\max} \tag{24}$$

The number of the above candidates is huge, but the candidates are used only once, in the initialisation stage, and the search can be implemented hierarchically, as in the CP-guided tracking algorithm. The camera parameter set with the minimum matching error is selected as the solution. Based on this, we match lines using the proximity rule. From the matching lines, the inter-image homography is estimated using

Fig. 5. Estimated camera parameters with respect to the reference image. (a) The rotation angles, (b) the focal lengths, (c) the focal length ratios.

the LMedS method, and false matches and outliers are rejected [20]. Then, the camera parameters are estimated and an accurate image mosaic is constructed using the nonlinear self-calibration method. Finally, the lines l_1 are transferred into the reference frame I_0, and l_0 is updated by adding them. In this paper, the updated line set l_0 is called the line model. The line model is updated during tracking, and all lines are registered to it. Therefore, image mosaicing with respect to the reference image can be performed even when there is no overlapping region between the reference image and any other frame.

Practically, when the zooming and rotation angles of the camera are small at the beginning of a soccer video, a translation-only alignment can be used [22]. In the translation-only alignment, the matching errors are computed between the translated images of I_1 and I_0. The translation with the minimum value is selected as the solution.


6. THE TRACKING STAGE

The tracking stage carries out sequential image mosaicing using self-calibration and line tracking. Details are described in the following subsections.

6.1. CP-Guided Line Tracking

First, line features l_k are extracted in the current kth frame I_k. Then we predict the positions of the line model in the current frame in order to match them with the extracted line features. The prediction is performed using the previously estimated camera parameters: we predict the camera parameters f_k, α_k and β_k using our CP-guided prediction algorithm (Steps 1–7 in the tracking algorithm). Using the predicted camera parameters, the extracted lines l_k and the predicted lines of the line model are matched by our proximity rule. Then the matching lines are refined by a robust line matching procedure.

Fig. 6. Image mosaics. (a) The reference frame, (b) 3rd frame, (c) 6th frame, (d) 9th frame, (e) the final 12th frame.

In the robust line matching procedure, we use the LMedS method, whose algorithm is described as follows (refer to Zhang [20] for details of the LMedS method):

1. Randomly select five pairs from the matching lines.
2. From them, compute an inter-image homography using the singular value decomposition [18].
3. Measure the homography quality for all matching lines. (Each matching line in the reference frame is transferred to the current frame by the homography. The distance between the transferred line and the corresponding line in the current frame is calculated using Eq. (17). We define the median of these values as the quality of the homography.)
4. Repeat Steps 1–3 until a sufficient number of samplings, which can be theoretically specified as in the paper by Zhang [20], is reached.

5. Select the matching pairs and the homography with the best quality, i.e. the one with the smallest median value, as the solution.

6. Reject the outliers of the selected homography, and select the inliers as the final matching lines.
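The loop above can be sketched end-to-end. Lines are homogeneous 3-vectors related by l_0 ∝ H^T l_k (the dual of l_k = H_k^{-T} l_0), so Step 2 is a standard SVD-based DLT on line pairs. For brevity the quality measure compares normalised line coefficients rather than the endpoint distance of Eq. (17), the Step-6 cutoff (2.5 × median, with a small absolute floor) is a common heuristic, and the test homography is arbitrary; these are our simplifications:

```python
import numpy as np

def dlt(src, dst):
    """Step 2: estimate M with dst ~ M @ src (up to scale) from >= 5
    correspondences via the SVD null vector."""
    rows = []
    for a, b in zip(src, dst):
        rows.append(np.concatenate([np.zeros(3), -b[2] * a, b[1] * a]))
        rows.append(np.concatenate([b[2] * a, np.zeros(3), -b[0] * a]))
    _, _, vt = np.linalg.svd(np.asarray(rows))
    return vt[-1].reshape(3, 3)

def line_dist(l, m):
    """Residual between two homogeneous lines, after normalising each so
    that (l1, l2) is a unit normal (a simplification of Eq. (17))."""
    l = l / np.linalg.norm(l[:2])
    m = m / np.linalg.norm(m[:2])
    if l[:2] @ m[:2] < 0:
        m = -m
    return np.linalg.norm(l - m)

def lmeds_homography(pairs, n_samples=200, seed=0):
    """Steps 1-6 on matched line pairs (l_k, l_0); returns H^T and inliers."""
    rng = np.random.default_rng(seed)
    best_med, best_M = np.inf, None
    for _ in range(n_samples):                        # Steps 1-4
        idx = rng.choice(len(pairs), size=5, replace=False)
        M = dlt([pairs[i][0] for i in idx], [pairs[i][1] for i in idx])
        med = np.median([line_dist(M @ lk, l0) for lk, l0 in pairs])  # Step 3
        if med < best_med:                            # Step 5
            best_med, best_M = med, M
    inliers = [(lk, l0) for lk, l0 in pairs           # Step 6
               if line_dist(best_M @ lk, l0) < 2.5 * best_med + 1e-9]
    return best_M, inliers

# Synthetic check: 14 true line pairs plus 6 false matches.
rng = np.random.default_rng(3)
H = np.array([[0.99, 0.01, -60.0], [0.0, 0.99, 70.0], [1e-4, -1e-4, 0.9]])
l0s = []
for _ in range(20):
    p = np.array([*rng.uniform(-300.0, 300.0, 2), 1.0])
    q = np.array([*rng.uniform(-300.0, 300.0, 2), 1.0])
    l = np.cross(p, q)
    l0s.append(l / np.linalg.norm(l))
lks = [np.linalg.inv(H).T @ l for l in l0s]
lks = [l / np.linalg.norm(l) for l in lks]
for i in range(14, 20):
    lks[i] = rng.standard_normal(3)   # false matches
M, inliers = lmeds_homography(list(zip(lks, l0s)))
assert len(inliers) == 14
```

With 70% inliers, the median residual ignores the corrupted pairs entirely, which is why LMedS needs no outlier threshold during the sampling itself.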

6.2. Self-Calibration and Homography Estimation

From the homography with the best quality, the camera parameters are estimated using the linear self-calibration method. Next, the nonlinear algorithm refines the linear solution and improves the inter-image homography.

7. EXPERIMENTAL RESULTS

In this section, we apply our image mosaicing algorithm to three video sources. One is captured from a single viewpoint, and the others are stereo videos captured from two different viewpoints.

The first soccer video is shown in Fig. 3. The images are captured from every fourth frame. The camera is zoomed in and rotated simultaneously, then it continues to zoom in with little rotation. Figure 3(a) is the reference image to which the other images are to be registered. For each frame, line features are extracted using the Hough transform. Using our CP-guided prediction algorithm and robust line matching, the line segments are matched and tracked. In Fig. 4,

Fig. 7. Stereo soccer video. The two top rows of video are captured by the left camera and the two bottom rows are captured by the right camera. The order is from top left to bottom right.

the predicted positions of the tracked line segments are overlaid on the frames (3rd, 6th, 9th and 12th frames). We can see that our prediction algorithm places the predicted lines near the line segments in the current frame, so that our robust line matching algorithm can work. Figure 5 shows the self-calibration results after 30 iterations. Figures 5(a), (b) and (c) show the rotation angles, the focal lengths and the focal length ratio, respectively. Since the camera parameters are estimated from homographies, as presented in Section 3, good estimation of the camera parameters means that the homographies, which are given as our mosaicing result, are accurately estimated. We estimate the camera parameters with respect to the reference frame, so f_0 should be constant for all frames. However, up to the third frame, where the rotation angles are small, the focal lengths are unstable, but their ratio is stable, as shown in Figs 5(b) and (c). Nevertheless, our algorithm works well, as shown in Fig. 4(a). This is because, when the rotation angles are small, the displacement of features depends only upon the ratio of the focal lengths. In this case, Eq. (3) can be written as $H_k = K_k K_0^{-1} = \operatorname{diag}(f_k/f_0,\, f_k/f_0,\, 1)$. After the fourth frame, the focal lengths seem to be correctly estimated. For all frames, the focal length ratio and the rotation angles seem to be correctly estimated, and based on them, we can track lines. The mosaicing results for the video are shown in Fig. 6. Each image mosaic is simply merged by averaging the mosaic of the previous frames and the registered current


frame. You can see that the lines on the playing ground, the ad board and the auditorium are sharply registered for all the frames.

Stereo videos are shown in Fig. 7. As can be seen, the video pairs are not synchronised. The video captured by the left camera is played in slow motion. The motion of the two stereo cameras is almost all zooming, with little rotation. Figure 8 shows the estimated camera parameters of the stereo cameras. Since the rotation angles are small and the images are somewhat blurred, some frames give incorrect results, for example, the 8th frame captured by the left camera. Figure 9 shows the last frames and their image mosaics. The overlaid curves and lines are explained in the next section.

8. APPLICATIONS

When we construct accurate image mosaics, we can extract more information from the video. In Fig. 9, ball trajectories are shown in the stereo image mosaics. Since the stereo videos are not synchronised, stereo matching of the ball frame by frame is not possible; therefore, direct determination of the 3D ball trajectory is also not possible. However, after registering the other frames to the reference images, we can obtain the

Fig. 8. Estimated camera parameters. The result of the left camera is shown in the left column and the result of the right camera is shownin the right column. (a) and (b) are the focal lengths, (c) and (d) are the ratio of the focal lengths, (e) and (f) are the rotation angles.

trajectory of the ball in the stereo pair. Therefore, althoughwe cannot match the ball between each corresponding imagepair, we can match the trajectories of the ball between thestereo image mosaics. This means that we can considerdynamic objects (the ball) as static structures (the balltrajectories) due to accurate image mosaicing.

First, we manually point out the positions of the ballin the video sequences, and transfer the positions to theimage mosaics using the computed inter-image homo-graphies. When we assume that the ball trajectories aresmooth, the transferred positions are interpolated usingbi-cubic spline [18]. The trajectories can be seen in Fig. 9.Next, we compute the fundamental matrix between thestereo image mosaics, and then based on it, the epipolarline of each point on the right trajectory can be determ-ined in the left image mosaic (see two epipolar lines onthe left image). As a result, the intersection point betweenthe epipolar line and the ball trajectory in the left imageis the matching point. In Fig. 9, the matching pairs oftwo points are shown. From the matching positions, wecan compute the 3D positions of the ball, and thus the3D ball trajectory as well. The information can be usedfor the synthesis of new video from the viewpoint of thesoccer ball.
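The matching step above can be sketched as follows. This is an illustrative reconstruction under our own assumptions: instead of intersecting the epipolar line with the spline analytically, we search for the closest point on a densely sampled trajectory, and `epipolar_line` and `match_on_trajectory` are hypothetical helper names.

```python
import numpy as np

def epipolar_line(F, x_r):
    # Epipolar line l = F x_r of a right-mosaic point (x, y) in the left
    # mosaic, normalised so that (a, b) is a unit normal.
    l = F @ np.array([x_r[0], x_r[1], 1.0])
    return l / np.hypot(l[0], l[1])

def match_on_trajectory(F, x_r, trajectory):
    # trajectory: (N, 2) array of densely sampled spline points in the
    # left mosaic; return the sample closest to the epipolar line of x_r.
    a, b, c = epipolar_line(F, x_r)
    dists = np.abs(a * trajectory[:, 0] + b * trajectory[:, 1] + c)
    return trajectory[np.argmin(dists)]

# Toy example: a rectified pair, for which epipolar lines are horizontal.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
# Left-mosaic trajectory along y = x / 2, sampled at 101 points.
traj = np.stack([np.linspace(0, 100, 101), np.linspace(0, 50, 101)], axis=1)
match = match_on_trajectory(F, (40.0, 30.0), traj)   # point where y = 30
```

In practice F would be estimated from point correspondences between the two mosaics, and sampling the spline densely makes the closest-point search a good approximation to the true line-curve intersection.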


Fig. 9. Image mosaics and ball trajectories. (a) Image mosaic of the left camera, (b) image mosaic of the right camera.

9. CONCLUDING REMARKS

In this paper, we have proposed an accurate and robust image mosaicing method for soccer video using line tracking and self-calibration. Our approach is to track line features on the playing fields. The line features are detected and then tracked using self-calibration. Experimental results show the accuracy and robustness of the algorithm. We have also presented an application of the mosaics: calculating 3D ball trajectories from an unsynchronised stereo pair.

References

1. Kim T, Seo Y, Hong KS. Physics-based 3D position analysis of a soccer ball from monocular image sequences. Proc Int Conf on Computer Vision 1998; 721–726

2. Seo Y, Choi S, Kim H, Hong KS. Where are the ball and players? Soccer game analysis with color-based tracking and image mosaicking. Proc Int Conf on Image Analysis and Processing, September 1997

3. Reid I, Zisserman A. Goal-directed video metrology. Proc European Conf on Computer Vision 1996; II:647–658

4. Irani M, Anandan P, Bergen J, Kumar R, Hsu S. Efficient representations of video sequences and their applications. Signal Processing: Image Communication 1996; 8:327–351

5. Irani M, Anandan P. Video indexing based on mosaic representations. Proc IEEE 1998; 86(5):905–921

6. Irani M, Rousso B, Peleg S. Computing occluding and transparent motions. Int J Computer Vision 1994; 12(1):5–16

7. Sawhney HS, Ayer S. Compact representations of videos through dominant and multiple motion estimation. IEEE Trans Pattern Analysis and Machine Intelligence 1996; 18(8):814–830

8. Szeliski R, Shum HY. Creating full view panoramic image mosaics and environment maps. Proc SIGGRAPH 1997; 251–258

9. Harris C. Tracking with Rigid Models, in Active Vision. A Blake, A Yuille (eds). MIT Press, 1992

10. Clarke JC, Carlsson S, Zisserman A. Detecting and tracking linear features efficiently. Proc British Machine Vision Conference 1996

11. Morimoto C, Chellappa R. Fast 3D stabilization and mosaic construction. Proc Int Conf on Computer Vision and Pattern Recognition 1997; 660–665

12. Kim H, Hong KS. A practical self-calibration method of pan-tilt cameras. Proc Int Conf on Pattern Recognition 2000 (or available as POSTECH Technical Report TR-9901, Pohang University of Science and Technology, October 1999)

13. de Agapito L, Hayman E, Reid I. Self-calibration of a rotating camera with varying intrinsic parameters. Proc British Machine Vision Conf 1998; 105–114

14. de Agapito L, Hartley RI, Hayman E. Linear calibration of a rotating and zooming camera. Proc Int Conf on Computer Vision and Pattern Recognition 1999; I:15–21

15. Seo Y, Hong KS. Auto-calibration of a rotating and zooming camera. Proc IAPR Workshop on Machine Vision Applications 1998; 274–277

16. Seo Y, Hong KS. About the self-calibration of a rotating and zooming camera: theory and practice. Proc Int Conf on Computer Vision 1999; 183–188

17. Hartley RI. Projective reconstruction from line correspondences. Proc Int Conf on Computer Vision and Pattern Recognition 1994; 903–907

18. Press WH, Teukolsky SA, Vetterling WT, Flannery BP. Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press, 1993

19. Faugeras O. Three-Dimensional Computer Vision. MIT Press, 1993

20. Zhang Z. Parameter Estimation Techniques: A Tutorial with Application to Conic Fitting. INRIA Technical Report RR-2676, October 1995

21. Pitas I. Digital Image Processing Algorithms. Prentice Hall, 1993; 231–239

22. Peleg S, Herman J. Panoramic mosaics by manifold projection. Proc Int Conf on Computer Vision and Pattern Recognition 1997; 338–343

Hyunwoo Kim received a BS degree from Hanyang University, Seoul, Korea, in 1994, and an MS degree from POSTECH, Pohang, Korea, in 1996. Since 1996 he has been a PhD candidate in the Department of Electronic and Electrical Engineering, POSTECH, Pohang, Korea. His current research interests include computer vision, virtual reality, augmented reality and computer graphics.

Ki-Sang Hong received a BS degree in Electronic Engineering from Seoul National University, Korea, in 1977, and MS and PhD degrees in Electrical & Electronic Engineering from KAIST, Korea, in 1979 and 1984, respectively. From 1984–1986 he was a researcher at the Korea Atomic Energy Research Institute, and in 1986 he joined POSTECH, Korea, where he is currently an associate professor of Electrical & Electronic Engineering. From 1988–1989 he worked in the Robotics Institute at Carnegie Mellon University, Pittsburgh, PA, as a visiting professor. His current research interests include computer vision, augmented reality and pattern recognition.

Correspondence and offprint requests to: K. S. Hong, Department of Electronic and Electrical Engineering, Pohang University of Science and Technology, Pohang 790–784, Korea. E-mail: hongks@postech.ac.kr