
An Information Fusion Approach for Multiview Feature Tracking

Esra Ataer-Cansizoglu (ataer@ece.neu.edu) and Margrit Betke (betke@cs.bu.edu), Image and Video Computing Group, Computer Science Department, Boston University

Introduction

Where is the object/feature point?

[Figure: image sequence over time]

Robust tracking is important for

Human Computer Interaction

Video-based Surveillance

Remote Sensing

Video Indexing

Problem: Failure in tracking, especially due to occlusion

Solution: Automatic recovery

Idea: Use multiple cameras

http://www.cs.bu.edu/faculty/betke/research/jordan-bubble.jpg

Conclusion

System detects tracking failures with high accuracy

Promising results on automatic feature re-initialization

Correlation-based term is a strong predictor of reliability

Proposed RM is inexpensive to compute

Future Extensions:

Use of particle filters or other trackers in 3D

Extend RM using geometric constraints about the motion of the object

Use of multiple points considering constraints about shape

Proposed Method

Idea: Construct a Reliability Measure (RM) for each view to detect tracking failures

Observation: Prefer the view in which object is most visible.

Which parameter is most informative for tracking in a view?

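Normalized Correlation Coefficient (NCC), where I is the image, T is the template, and N is the number of pixels:

\[
NCC(I,T) = \frac{N \sum_{x,y} I(x,y)\,T(x,y) - \sum_{x,y} I(x,y) \sum_{x,y} T(x,y)}{\sqrt{\left[ N \sum_{x,y} I(x,y)^2 - \left( \sum_{x,y} I(x,y) \right)^2 \right] \left[ N \sum_{x,y} T(x,y)^2 - \left( \sum_{x,y} T(x,y) \right)^2 \right]}}
\]

Epipolar distance (EPD), based on the epipolar constraint between the left and right cameras, where F is the fundamental matrix:

\[
x_L^T F x_R = 0
\]

[Figure: epipolar geometry of the left camera and right camera. Source: Wikipedia, Epipolar Geometry, http://en.wikipedia.org/wiki/Epipolar_geometry]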

Estimate of the 3D Position

Geometric constraints about the shape/motion of the object
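The first two candidate measures can be sketched in a few lines. The NCC function follows the poster's formula directly. The poster does not spell out how EPD is computed, so the point-to-epipolar-line distance below is an assumption, as are the function names `ncc` and `epipolar_distance` and the convention that the epipolar line in the left image is obtained as F times the homogeneous right-view point.

```python
import math


def ncc(image_patch, template):
    """Normalized correlation coefficient of two equally sized 2D patches."""
    i_vals = [p for row in image_patch for p in row]
    t_vals = [p for row in template for p in row]
    if len(i_vals) != len(t_vals):
        raise ValueError("patch and template must have the same number of pixels")
    n = len(i_vals)
    sum_i, sum_t = sum(i_vals), sum(t_vals)
    numerator = n * sum(a * b for a, b in zip(i_vals, t_vals)) - sum_i * sum_t
    var_i = n * sum(a * a for a in i_vals) - sum_i ** 2
    var_t = n * sum(b * b for b in t_vals) - sum_t ** 2
    if var_i <= 0 or var_t <= 0:
        # Assumption: a constant patch carries no correlation information.
        return 0.0
    return numerator / math.sqrt(var_i * var_t)


def epipolar_distance(point_left, point_right, F):
    """Distance (in pixels) from a left-view point to the epipolar line
    induced by the right-view point, assuming x_L^T F x_R = 0."""
    x, y = point_left
    right_h = (point_right[0], point_right[1], 1.0)
    # Epipolar line a*x + b*y + c = 0 in the left image: l = F * x_R.
    a, b, c = (sum(F[r][k] * right_h[k] for k in range(3)) for r in range(3))
    norm = math.hypot(a, b)
    if norm == 0:
        raise ValueError("degenerate epipolar line")
    return abs(a * x + b * y + c) / norm
```

A patch compared with itself gives an NCC of 1, and a point lying exactly on its epipolar line gives a distance of 0, which is what makes both quantities usable as consistency checks on a 2D track.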

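The RM combines four terms:

Term 1: \( U(NCC(z)) \)

Term 2: \( [1 - U(NCC(y))] \)

Term 3: \( U(\|z - y\| / s) \), s: threshold

Term 4: \( U(EPD(y, z') / EPD(z, z')) \)

z: 2D tracks, X: Reconstructed 3D Trajectory, y: Projection of estimated 3D position

\[
U(x) = \begin{cases} 0, & \text{if } x < 0 \\ x, & \text{if } 0 \le x \le 1 \\ 1, & \text{otherwise} \end{cases}
\]

\[
RM = w_1\,U(NCC(z)) + w_2\,[1 - U(NCC(y))] + w_3\,U\!\left(\frac{\|z - y\|}{\text{threshold}}\right) + w_4\,U\!\left(\frac{EPD(y, z')}{EPD(z, z')}\right)
\]

where \( \sum_{i=1}^{4} w_i = 1 \) and \( 0 \le w_i \le 1 \), and

\[
\hat{X}_t = 2X_{t-1} - X_{t-2}
\]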
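To make the weighted sum concrete, here is a minimal sketch of the RM computation. It assumes the NCC values, the distance between the track z and the projection y, and the two epipolar distances have already been computed; the function and parameter names are hypothetical. The handling of a zero EPD(z, z') in term 4 (the ratio is treated as saturating at 1) is an assumption made here, since the poster does not cover that case.

```python
def unit_clamp(x):
    """U(x): 0 below 0, x on [0, 1], 1 above 1."""
    if x < 0:
        return 0.0
    if x <= 1:
        return float(x)
    return 1.0


def reliability_measure(ncc_z, ncc_y, dist_zy, threshold, epd_y, epd_z, weights):
    """Weighted sum of the four RM terms for one view.

    ncc_z, ncc_y : NCC at the tracked point z and at the projection y
    dist_zy      : distance between z and y in pixels
    threshold    : normalizer for term 3
    epd_y, epd_z : EPD(y, z') and EPD(z, z')
    weights      : four weights in [0, 1] that sum to 1
    """
    if len(weights) != 4 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("need four weights that sum to 1")
    if any(w < 0 or w > 1 for w in weights):
        raise ValueError("each weight must lie in [0, 1]")
    # Assumption: when EPD(z, z') is zero the ratio saturates, so U(...) = 1.
    epd_ratio = epd_y / epd_z if epd_z > 0 else float("inf")
    terms = (
        unit_clamp(ncc_z),              # term 1
        1.0 - unit_clamp(ncc_y),        # term 2
        unit_clamp(dist_zy / threshold),  # term 3
        unit_clamp(epd_ratio),          # term 4
    )
    return sum(w * t for w, t in zip(weights, terms))
```

Because every term passes through U and the weights sum to 1, the resulting RM always lies in [0, 1], which makes values from the two views directly comparable.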

Proposed System:

Independent 2D trackers in each view utilize a pyramidal implementation of the optical flow algorithm

Stereoscopic reconstruction of 3D trajectories (simple linear method)

Predicting 3D Position with ‘Constant Velocity Assumption’

Automatic recovery using projection of estimated 3D position
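The last two steps can be sketched as follows: predict the next 3D position under the constant velocity assumption, then project it into a view to obtain a re-initialization point. The 3x4 projection matrix `P` is assumed to be known from camera calibration, which the poster does not describe, and both function names are hypothetical.

```python
def predict_constant_velocity(x_prev, x_prev2):
    """Predict the next 3D position from the two most recent ones.

    With constant velocity, the next position is the last position plus
    the last displacement: 2 * X[t-1] - X[t-2].
    """
    return tuple(2 * a - b for a, b in zip(x_prev, x_prev2))


def project(P, point_3d):
    """Project a 3D point into the image with a 3x4 projection matrix P."""
    homogeneous = (point_3d[0], point_3d[1], point_3d[2], 1.0)
    u, v, w = (sum(P[r][k] * homogeneous[k] for k in range(4)) for r in range(3))
    if w == 0:
        raise ValueError("point projects to infinity")
    return (u / w, v / w)


# Example with an illustrative camera (focal length 100, no offset).
P_example = [
    [100.0, 0.0, 0.0, 0.0],
    [0.0, 100.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
predicted = predict_constant_velocity((1.0, 2.0, 10.0), (0.0, 2.0, 10.0))
reinit_point = project(P_example, predicted)
```

When a view's tracker is declared lost, `reinit_point` is the kind of location at which the 2D tracker in that view would be restarted.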


Experiments & Results

[Figure: Left-view RM (top) and right-view RM' (bottom) for the video of subject A, with the direction of the subject's movements, where the weight of one term is w_i = 1 and w_j = 0 for all j ≠ i]

High correlation between the values of term 1 and term 2: DROP TERM 2!

RMs with w_1 = 0 do not have distinct peaks: Weigh term 1 more!

[Figure: Values of pairs and triplets of RM terms with weights set equally]

[Figure: RMs with final weights for subject A]

Final Weights: w_1 = 0.5, w_2 = 0, w_3 = w_4 = 0.25
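As a worked step, substituting the final weights (0.5 for term 1, 0 for term 2, and 0.25 each for terms 3 and 4) into the weighted sum of the four terms removes term 2 entirely. The terms are those defined under Proposed Method; writing the result out this way is an illustration rather than a formula taken from the poster.

```latex
\[
RM = 0.5\,U(NCC(z))
   + 0.25\,U\!\left(\frac{\|z - y\|}{\text{threshold}}\right)
   + 0.25\,U\!\left(\frac{EPD(y, z')}{EPD(z, z')}\right),
\qquad 0.5 + 0 + 0.25 + 0.25 = 1
\]
```

Since each U(.) lies in [0, 1], the correlation term alone can contribute at most 0.5 to the RM, as much as the two geometric terms combined.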

DATASET

Cameras 20 inches apart, 120° between optical axes

Training Set: 8 subjects, ~450 frames each. Subjects rotated their heads from center to the right, then to the left, and then up and down.

Test Set: 26 subjects, 2 sequences per subject, ~1200 frames in each sequence.

Data recorded from the left and right cameras while the subject uses a mouse-replacement interface driven by a frontal camera

RESULTS

The feature was lost in both views 9 times, but was declared as lost in only one of the views.

53 false alarms, but in all cases the feature was reinitialized to a location at most 3 pixels from the actual location, hence the false alarm rate is negligible.

For 254 correctly detected tracking failures, the system was able to recover 181 times (71.3%).

304 feature loss events in one view = 254 detected in the correct view + 25 detected in the other view + 25 not detected

True positive rate: 83.5%
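The reported rates follow from the event counts. The short check below redoes the arithmetic: 254/304 is about 0.8355, which the poster reports as 83.5%, and 181/254 is about 0.7126, reported as 71.3%. The variable names are illustrative only.

```python
# Event counts as reported on the poster.
detected_correct_view = 254
detected_other_view = 25
not_detected = 25
recovered = 181

total_losses = detected_correct_view + detected_other_view + not_detected

# Fraction of single-view feature losses flagged in the correct view.
true_positive_rate = detected_correct_view / total_losses

# Fraction of correctly detected failures that were recovered.
recovery_rate = recovered / detected_correct_view

print(f"total feature loss events: {total_losses}")
print(f"true positive rate: {true_positive_rate:.4f}")
print(f"recovery rate: {recovery_rate:.4f}")
```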

Feature Loss Problem with Camera Mouse:

Camera Mouse is a mouse-replacement interface for people with disabilities. Automatic re-initialization would enable the subject to use Camera Mouse freely without the intervention of a caregiver.

Adjusting Weights

References

[1] Camera Mouse, http://www.cameramouse.org/, accessed August 2010.

[2] C. Connor, E. Yu, J. Magee, E. Cansizoglu, S. Epstein, and M. Betke, “Movement and Recovery Analysis of a Mouse-Replacement Interface for Users with Severe Disabilities,” 13th Int. Conference on Human-Computer Interaction, 10 pp., San Diego, USA, July 2009.

[3] Y. Tong, Y. Wang, Z. Zhu, and Q. Ji, “Robust Facial Feature Tracking under Varying Face Pose and Facial Expression,” Pattern Recognition, 40(11):3195-3208, November 2007.

[4] C. Fagiani, M. Betke, and J. Gips, “Evaluation of tracking methods for human-computer interaction,” IEEE Workshop on Applications in Computer Vision (WACV 2002), pp. 121-126, Orlando, USA, December 2002.