TRECVID 2005 – P. KRÄMER and J. BENOIS-PINEAU
Camera Motion Identification in the Rough Indexing Paradigm
Petra KRÄMER and Jenny BENOIS-PINEAU
LaBRI – University Bordeaux I, France
{petra.kraemer,jenny.benois}@labri.fr
Introduction
Task: Given the shot boundary reference, identify the shots in which a certain camera motion (pan, tilt, zoom) is present.
Rough Indexing Paradigm: Work at a lower spatial and temporal resolution, i.e. on P-Frames.
Aim: Reuse low-level motion descriptors from the compressed stream.
Main challenge in TRECVID 2005: Jittery camera motion due to hand-carried cameras.
Overview
Input: P-Frames
1 Global Motion Estimation (output: θj)
2 Significance Value Computation (output: sj)
3 Motion Segmentation (output: sm)
4 Thresholding (output: ζm)
5 Classification (output: motion feature)
The index j refers to frames, the index m to segments of homogeneous motion.
Global Motion Estimation
Robust global motion estimator for P-Frames [DBP01]:
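Estimation of the affine 2D motion model:

$$\begin{pmatrix} dx_i \\ dy_i \end{pmatrix} = \begin{pmatrix} a_1 \\ a_4 \end{pmatrix} + \begin{pmatrix} a_2 & a_3 \\ a_5 & a_6 \end{pmatrix} \begin{pmatrix} x_i \\ y_i \end{pmatrix}$$

Based on the weighted least squares method:

$$\theta = (H^T W H)^{-1} H^T W Z$$

where
$\theta = (a_1, a_2, a_3, a_4, a_5, a_6)^T$: global motion parameters
$Z$: MPEG motion compensation vectors
$H$: macroblock centers
$W$: weights defined by the derivative of the Tukey function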
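The weighted least squares solution above can be illustrated with a short NumPy sketch. It is not the authors' implementation: the function name `estimate_affine_wls`, the row layout of H (two rows per macroblock, one for dx and one for dy) and the use of the same weight for both components are assumptions made for illustration.

```python
import numpy as np

def estimate_affine_wls(centers, vectors, weights):
    """Solve theta = (H^T W H)^-1 H^T W Z for the 6-parameter affine model.

    centers: (n, 2) macroblock centers (x_i, y_i)
    vectors: (n, 2) motion vectors (dx_i, dy_i)
    weights: (n,) non-negative weight per macroblock
    Returns theta = (a1, a2, a3, a4, a5, a6).
    """
    centers = np.asarray(centers, dtype=float)
    vectors = np.asarray(vectors, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = centers.shape[0]
    x, y = centers[:, 0], centers[:, 1]

    # Assumed layout of H: even rows model dx = a1 + a2*x + a3*y,
    # odd rows model dy = a4 + a5*x + a6*y.
    H = np.zeros((2 * n, 6))
    H[0::2, 0] = 1.0
    H[0::2, 1] = x
    H[0::2, 2] = y
    H[1::2, 3] = 1.0
    H[1::2, 4] = x
    H[1::2, 5] = y

    Z = vectors.reshape(-1)   # (dx_0, dy_0, dx_1, dy_1, ...)
    W = np.repeat(w, 2)       # assumption: one weight shared by dx and dy
    HtW = H.T * W             # H^T W without building the diagonal matrix
    return np.linalg.solve(HtW @ H, HtW @ Z)
```

A macroblock with weight 0 does not influence the estimate, which is how outliers are excluded from the dominant motion.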
The derivative of the Tukey function:

$$\psi(r, \lambda_r) = \begin{cases} r\,(r^2 - \lambda_r^2)^2 & \text{if } |r| < \lambda_r \\ 0 & \text{otherwise} \end{cases}$$

The weights are [OB95]:

$$w_i = \frac{\psi(r_i)}{r_i}$$

where
$\lambda_r$: threshold
$r_i = z_i - \hat{z}_i$: residuals
$z_i$: i-th MPEG motion vector
$\hat{z}_i$: estimation of $z_i$

[Figure: plot of ψ(r) for r between −4 and 4.]
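As a small illustration of the weight definition, the following sketch evaluates ψ and the weight w = ψ(r)/r for a scalar residual. Treating r as a scalar (for example the magnitude of the residual vector) and the function names are assumptions; the slides do not state how the two vector components are combined.

```python
def tukey_psi(r, lam):
    """Derivative of the Tukey function: r * (r^2 - lam^2)^2 inside the band, else 0."""
    if abs(r) < lam:
        return r * (r * r - lam * lam) ** 2
    return 0.0

def tukey_weight(r, lam):
    """Weight w = psi(r) / r.

    Inside the band the division cancels to (r^2 - lam^2)^2, which is
    also well defined at r = 0. Outside the band the weight is 0, so
    the macroblock is treated as an outlier.
    """
    if abs(r) < lam:
        return (r * r - lam * lam) ** 2
    return 0.0
```

The weight is largest for a zero residual and falls to zero at |r| = λr, so macroblocks that disagree strongly with the current model drop out of the estimation support.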
[Figure with three panels:
a) P-Frame motion vectors: "Motion Compensation Vectors (29087)"
b) Estimated vectors: "Estimated Vectors (29087)"
c) Macroblocks, marked either as outliers or as part of the dominant estimation support D (wi > 0)]
Problem:
The global motion parameters are noisy due to jitter motion.
The global motion parameters have different meanings (translation terms a1, a4 versus linear terms a2, a3, a5, a6).
Solution:
Significance test of the motion parameters: thresholding of likelihood values
Significance Value Computation
Based on [BGG99]:
Change to another basis of elementary motion subfields:

$$\phi = (pan, tilt, zoom, rot, hyp_1, hyp_2)$$

with

$$zoom = \tfrac{1}{2}(a_2 + a_6), \quad rot = \tfrac{1}{2}(a_5 - a_3), \quad hyp_1 = \tfrac{1}{2}(a_2 - a_6), \quad hyp_2 = \tfrac{1}{2}(a_3 + a_5)$$

Consider two hypotheses H0 and H1:
H0: the considered component of φ is significant, with φ0 as the corresponding motion model
H1: the considered component of φ is not significant (= 0), with φ1 as the corresponding motion model
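The change of basis can be sketched in a few lines of Python. The slide gives the formulas for zoom, rot, hyp1 and hyp2 only; taking pan = a1 and tilt = a4 (the two translation terms of the affine model) is an assumption here, and the function name is hypothetical.

```python
def to_elementary_basis(theta):
    """Map theta = (a1, ..., a6) to phi = (pan, tilt, zoom, rot, hyp1, hyp2)."""
    a1, a2, a3, a4, a5, a6 = theta
    return {
        "pan": a1,                 # assumption: horizontal translation term
        "tilt": a4,                # assumption: vertical translation term
        "zoom": 0.5 * (a2 + a6),
        "rot": 0.5 * (a5 - a3),
        "hyp1": 0.5 * (a2 - a6),
        "hyp2": 0.5 * (a3 + a5),
    }
```

For a pure zoom (a2 = a6, a3 = a5 = 0) only the zoom component is non-zero, which is what makes this basis easier to interpret than the raw affine parameters.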
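Likelihood function associated with each hypothesis:

$$f(\phi_l) = \prod_{i \in D} \left( \frac{1}{2\pi\sqrt{\det(\Sigma_l)}} \exp\left( -\frac{1}{2}\, r_i^T \Sigma_l^{-1} r_i \right) \right) = \frac{1}{(2\pi\,\sigma_{x,l}\,\sigma_{y,l})^{\|D\|}} \exp(-\|D\|), \quad l = 0, 1$$

Assumption:

$$\Sigma_l = \begin{pmatrix} \sigma_{x,l}^2 & 0 \\ 0 & \sigma_{y,l}^2 \end{pmatrix}$$

The second equality holds when the variances are estimated from the residuals over D: each of the two sums $\sum_{i \in D} r_{x,i}^2 / (2\sigma_{x,l}^2)$ and $\sum_{i \in D} r_{y,i}^2 / (2\sigma_{y,l}^2)$ then equals $\|D\|/2$.

where
$\Sigma$: covariance matrix
$\sigma_x^2, \sigma_y^2$: variances for x and y
$D$: dominant estimation support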
The significance value s is:

$$s = \ln\left( \frac{f(\phi_1)}{f(\phi_0)} \right) = \|D\| \left( \ln(\sigma_{x,0}\,\sigma_{y,0}) - \ln(\sigma_{x,1}\,\sigma_{y,1}) \right) \overset{*}{=} \|D\| \left( \ln(\sigma_0^2) - \ln(\sigma_1^2) \right)$$

* assuming that σx = σy

Aim: Use s to test the significance.
Idea: If a motion feature (pan, zoom, tilt) is present in a shot, its corresponding motion parameter is significant during a sufficient number of frames.
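A minimal sketch of the significance value, assuming σx = σy so that a single pooled variance per model is used. The inputs are the residuals over the support D under the full model (φ0) and under the model with the tested component set to zero (φ1); how φ1 is obtained is not shown. The function names and the small variance floor `eps` are assumptions added for illustration.

```python
import math

def pooled_variance(residuals, eps=1e-12):
    """Variance shared by x and y, estimated from (rx, ry) residual pairs."""
    n = len(residuals)
    var = sum(rx * rx + ry * ry for rx, ry in residuals) / (2 * n)
    return max(var, eps)  # assumption: floor to avoid log(0)

def significance(res_full, res_constrained):
    """s = |D| * (ln sigma_0^2 - ln sigma_1^2).

    res_full:        residuals under phi_0 (component kept)
    res_constrained: residuals under phi_1 (component forced to 0)
    """
    if len(res_full) != len(res_constrained):
        raise ValueError("both residual sets must cover the same support D")
    d = len(res_full)
    return d * (math.log(pooled_variance(res_full))
                - math.log(pooled_variance(res_constrained)))
```

With this sign convention, s is close to 0 when removing the component barely changes the residuals, and strongly negative when the component is needed to explain the motion vectors.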
Problem:
The significance values can be noisy due to jitter motion.
The motion models θ can be inaccurate.
Solution:
Smooth the significance values over time and base the decision on the temporal mean value.
–> Segment shots into subshots of homogeneous motion
Introduce confidence measures in order to reject frames with an inaccurate motion model.
Two reasons for inaccurate motion models:
Failure of the MPEG encoder –> Confidence measure cD ≈ ||D||
Failure of the global motion estimation algorithm –> Confidence measure cσ ≈ σ0²
A frame is rejected if: cD < λD or cσ > λσ
where λD and λσ are thresholds
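The rejection rule can be written as a simple filter. In this sketch the confidence measures are taken to be exactly ||D|| and σ0² (the slide only states an approximate relation), and the function names and the per-frame tuple layout are hypothetical.

```python
def reject_frame(support_size, sigma0_sq, lambda_d, lambda_sigma):
    """True if the motion model of a frame should not be trusted.

    support_size: number of macroblocks in the dominant support D (c_D)
    sigma0_sq:    residual variance of the full motion model (c_sigma)
    """
    return support_size < lambda_d or sigma0_sq > lambda_sigma

def keep_reliable_frames(frames, lambda_d, lambda_sigma):
    """frames: iterable of (frame_number, support_size, sigma0_sq) tuples."""
    return [number for number, size, var in frames
            if not reject_frame(size, var, lambda_d, lambda_sigma)]
```

A frame is dropped either when too few macroblocks support the dominant motion or when the residual variance of the fitted model is too large.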
Motion Segmentation
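Hinkley test to detect changes in the temporal mean value $\bar{s}(t)$:

Downward jump:

$$U_k = \sum_{t=0}^{k} \left( s_t - \bar{s} + \frac{\delta_{min}}{2} \right) \quad (k \ge 0)$$

$$M_k = \max_{0 \le i \le k} U_i; \quad \text{detection if } M_k - U_k > \lambda_H$$

Upward jump:

$$V_k = \sum_{t=0}^{k} \left( s_t - \bar{s} - \frac{\delta_{min}}{2} \right) \quad (k \ge 0)$$

$$N_k = \min_{0 \le i \le k} V_i; \quad \text{detection if } V_k - N_k > \lambda_H$$

where
$\bar{s}$: temporal mean value
$\delta_{min}$: minimal jump magnitude
$\lambda_H$: predefined threshold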
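The two cumulative sums can be maintained incrementally. The sketch below runs both tests on a sequence of significance values and returns the indices at which a jump is detected. Using an online mean of the current segment for s̄ and restarting all statistics after each detection are assumptions; the slides do not specify these details, and the function name is hypothetical.

```python
def hinkley_boundaries(s, delta_min, lambda_h):
    """Return the indices k at which a jump of the mean of s is detected."""
    boundaries = []
    n, mean = 0, 0.0
    u = v = 0.0
    m, low = float("-inf"), float("inf")
    for k, value in enumerate(s):
        n += 1
        mean += (value - mean) / n          # online mean of the current segment
        u += value - mean + delta_min / 2   # U_k, used for downward jumps
        v += value - mean - delta_min / 2   # V_k, used for upward jumps
        m = max(m, u)                       # M_k = max of U_0 .. U_k
        low = min(low, v)                   # N_k = min of V_0 .. V_k
        if m - u > lambda_h or v - low > lambda_h:
            boundaries.append(k)
            # Assumption: restart the statistics, sample k opens the new segment.
            n, mean = 1, float(value)
            u, v = delta_min / 2, -delta_min / 2
            m, low = u, v
    return boundaries
```

While the mean is stable, U grows and V shrinks steadily, so neither M − U nor V − N leaves zero. A sudden drop in s pulls U below its running maximum, and a sudden rise lifts V above its running minimum, which triggers the respective detection.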
Principle of the Hinkley test:
[Figure: s and s̄ over time, together with the downward-jump statistic Mk − Uk and the upward-jump statistic Vk − Nk.]
Thresholding
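Selection of the hypothesis:

$$\bar{s}(t) = \frac{1}{T - t_0} \sum_{t = t_0}^{T} s(t) \;\underset{H_1}{\overset{H_0}{\lessgtr}}\; \lambda_s$$

that is, H0 is selected if the mean value is below λs and H1 if it is above.

And relative thresholding to determine the dominant motion:

$$\zeta(t) = \begin{cases} \bar{s}(t) & \text{if } \bar{s}(t) < \alpha \cdot \min\{\bar{s}_{pan}, \bar{s}_{tilt}, \bar{s}_{zoom}, \bar{s}_{rot}, \bar{s}_{hyp1}, \bar{s}_{hyp2}\} \\ 0 & \text{otherwise} \end{cases}$$

where
$T - t_0$: segment of homogeneous motion
$\lambda_s$: threshold
$\alpha$: constant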
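Both thresholding steps can be combined in one small function. This is a sketch under two assumptions: a component is kept only if it passes the absolute test (mean below λs) and the relative test (mean below α times the most negative mean), and α lies strictly between 0 and 1 so that the strongest component passes its own relative test. The function name is hypothetical.

```python
COMPONENTS = ("pan", "tilt", "zoom", "rot", "hyp1", "hyp2")

def threshold_means(s_mean, lambda_s, alpha):
    """Return zeta: the mean significance per component, or 0 if not retained.

    s_mean:   dict mapping each component name to its mean significance
              over one segment (significant components are negative)
    lambda_s: absolute threshold for choosing H0 over H1
    alpha:    relative threshold constant, assumed to satisfy 0 < alpha < 1
    """
    strongest = min(s_mean[name] for name in COMPONENTS)
    zeta = {}
    for name in COMPONENTS:
        value = s_mean[name]
        keep = value < lambda_s and value < alpha * strongest
        zeta[name] = value if keep else 0.0
    return zeta
```

The relative test suppresses components that pass the absolute threshold but are weak compared with the dominant one, for instance a slight zoom value during a strong pan.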
Classification
The following classification scheme is applied to the thresholded mean significance values ζ = (ζpan, ζtilt, ζzoom, ζrot, ζhyp1, ζhyp2):

    ζ                                motion feature
1   (0, 0, 0, 0, 0, 0)               static camera / no significant motion
2   (ζpan, 0, 0, 0, 0, 0)            pan
3   (0, ζtilt, 0, 0, 0, 0)           tilt
4   (ζpan, ζtilt, ζzoom, 0, 0, 0)    zoom
5   others                           complex camera motion
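The classification table translates into a short decision function. One reading of the table is assumed here: in row 4, ζzoom must be non-zero while ζpan and ζtilt may or may not be zero, and a segment with both pan and tilt but no zoom falls under "others". The function name and the returned labels are illustrative.

```python
def classify(zeta):
    """Classify one segment from zeta = (pan, tilt, zoom, rot, hyp1, hyp2)."""
    pan, tilt, zoom, rot, hyp1, hyp2 = (value != 0 for value in zeta)
    if rot or hyp1 or hyp2:
        return "complex"      # row 5: any other pattern
    if zoom:
        return "zoom"         # row 4: zoom, possibly with pan and tilt
    if pan and tilt:
        return "complex"      # assumption: not covered by rows 1 to 4
    if pan:
        return "pan"          # row 2
    if tilt:
        return "tilt"         # row 3
    return "static"           # row 1
```

For example, `classify((-200.0, 0.0, 0.0, 0.0, 0.0, 0.0))` yields `"pan"`, and a segment with non-zero pan, tilt and zoom entries is labelled `"zoom"`.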
Postprocessing:
Merge neighboring segments with the same motion feature
Reject segments with a duration shorter than tmin frames
[Figure: segments along the time axis t, compared with the minimum duration tmin.]
If a motion feature is still present after these steps, the shot is identified as containing that motion feature.
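The postprocessing can be sketched as follows. Segments are assumed to be `(start, end, label)` tuples in temporal order with an exclusive end frame, only directly adjacent segments are merged, and merging is done before the duration test. These conventions and the function names are assumptions for illustration.

```python
def postprocess(segments, t_min):
    """Merge adjacent segments with the same label, then drop short ones."""
    merged = []
    for start, end, label in segments:
        if merged and merged[-1][2] == label and merged[-1][1] == start:
            # Extend the previous segment instead of opening a new one.
            merged[-1] = (merged[-1][0], end, label)
        else:
            merged.append((start, end, label))
    return [seg for seg in merged if seg[1] - seg[0] >= t_min]

def shot_motion_features(segments, t_min):
    """Set of motion features (pan, tilt, zoom) reported for the shot."""
    kept = postprocess(segments, t_min)
    return {label for _, _, label in kept if label in ("pan", "tilt", "zoom")}
```

Two short pan segments that touch each other can survive the duration test once merged, whereas an isolated tilt of a few frames is discarded.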
Results
Results for the shot “shot106_136”:
[Figure: three plots over frame numbers 29060 to 29200, each annotated with the detected segments (pan, static, tilt, zoom), the rejected frames and the threshold λs.]
a) Global motion parameters θ (a1, a2, a3, a4, a5, a6)
b) Significance values s (pan, tilt, zoom, rot, hyp1, hyp2)
c) Online mean values s̄ (pan, tilt, zoom, rot, hyp1, hyp2)
Precision and recall for all submissions:
[Figure: scatter plot of recall (0 to 1) against precision (0.4 to 1) for all submissions.]
RI –> LaBRI: Precision 0.912, Recall 0.737
Conclusion and Perspectives
Conclusion:
A method based on global motion estimation and a significance test was proposed.
The proposed method can handle moving objects and jitter motion.
No decoding of the compressed stream is required.
Performance is 3-4 times faster than real time.
Since no ground truth was available, it was difficult to determine the best parameter set.
Future work:
Focus mainly on the correction of motion models when the encoder's block-matching algorithm fails.
References
[BGG99] P. Bouthemy, M. Gelgon, and F. Ganansia. A unified approach to shot change detection and camera motion characterization. IEEE Trans. on Circuits and Systems for Video Technology, 9(7):1030–1044, October 1999.
[DBP01] M. Durik and J. Benois-Pineau. Robust motion characterisation for video indexing based on MPEG2 optical flow. In International Workshop on Content-Based Multimedia Indexing, CBMI'01, pages 57–64, 2001.
[OB95] J.M. Odobez and P. Bouthemy. Robust multiresolution estimation of parametric motion models. Journal of Visual Communication and Image Representation, 6(4):348–365, 1995.