Online Multi-Object Tracking with Dual Matching Attention ...

Online Multi-Object Tracking with Dual Matching Attention Networks Ji Zhu, Hua Yang, Shanghai Jiao Tong University Nian Liu, Northwestern Polytechnical University Minyoung Kim, Massachusetts Institute of Technology Wenjun Zhang, Shanghai Jiao Tong University Ming-Hsuan Yang, University of California, Merced ECCV 2018


Page 1: Online Multi-Object Tracking with Dual Matching Attention ...

Online Multi-Object Tracking with Dual Matching Attention Networks

Ji Zhu, Hua Yang, Shanghai Jiao Tong University

Nian Liu, Northwestern Polytechnical University

Minyoung Kim, Massachusetts Institute of Technology

Wenjun Zhang, Shanghai Jiao Tong University

Ming-Hsuan Yang, University of California, Merced

ECCV 2018

Page 2: Online Multi-Object Tracking with Dual Matching Attention ...

Pipeline

• STEP 1: Apply a single object tracker to keep tracking each target.

• STEP 2: If the tracking result becomes unreliable, suspend the tracker.

• STEP 3: Perform data association between lost targets and detections.

• STEP 4: Update the results.
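
The four steps can be sketched as a per-frame loop. This is only an illustrative outline, not the authors' implementation: the `Target` class, the `(x, y, w, h)` box layout, and the `track`, `is_reliable`, and `associate` callables are hypothetical placeholders for the single object tracker, the reliability test, and the data association stage.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h); this layout is an assumption


@dataclass
class Target:
    target_id: int
    box: Box
    state: str = "tracked"  # either "tracked" or "lost"


def run_frame(
    targets: List[Target],
    detections: List[Box],
    track: Callable[[Target], Box],
    is_reliable: Callable[[Target], bool],
    associate: Callable[[List[Target], List[Box]], Dict[int, int]],
) -> List[Target]:
    """Process one frame. The three callables are hypothetical placeholders."""
    for target in targets:
        if target.state == "tracked":
            # STEP 1: the single object tracker updates the target's box.
            target.box = track(target)
            # STEP 2: suspend the tracker when its result is unreliable.
            if not is_reliable(target):
                target.state = "lost"

    # STEP 3: associate lost targets with detections.
    lost = [t for t in targets if t.state == "lost"]
    matches = associate(lost, detections)  # maps target_id -> detection index

    # STEP 4: update results; matched targets resume tracking.
    for target in lost:
        if target.target_id in matches:
            target.box = detections[matches[target.target_id]]
            target.state = "tracked"
    return targets
```

Passing the three stages in as callables keeps the control flow visible without committing to any particular tracker or matching network.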

Xu Gao, Peking University 2


Page 4: Online Multi-Object Tracking with Dual Matching Attention ...

Single Object Tracking

• Baseline Method. “ECO: Efficient Convolution Operators for Tracking.” 2017 CVPR.

• $x = \{(x^1)^\top, \dots, (x^D)^\top\}$ is a feature map with $D$ feature channels extracted from an image patch.

• Aim to learn a multi-channel convolution filter $f = \{f^1, \dots, f^D\}$.

• $E(f) = \sum_{j=1}^{M} \alpha_j \left\| S_f\{x_j\}(t) - y_j(t) \right\|_{L^2}^2 + \sum_{d=1}^{D} \left\| w(t) f^d(t) \right\|_{L^2}^2$.

• Where $S_f\{x_j\}(t) = f * P^\top x_j$, $P$ is a $D \times C$ matrix, $y_j(t)$ is the desired confidence map, and $M$ is the number of training samples.

• $\| g(t) \|_{L^2}^2 = \frac{1}{T} \int_0^T |g(t)|^2 \, dt$.

[Figure: desired confidence map and score map predicted by ECO]


Page 5: Online Multi-Object Tracking with Dual Matching Attention ...

Cost-Sensitive Tracking Loss

• Drawback of ECO: As shown in the figure, the center of an object next to the target also receives a high confidence score.

• Analysis: Because such neighboring objects score highly, these negative samples should be penalized more heavily to prevent the tracker from drifting.

• $E(f) = \sum_{j=1}^{M} \alpha_j \left\| q(t) \left( S_f\{x_j\}(t) - y_j(t) \right) \right\|_{L^2}^2 + \sum_{d=1}^{D} \left\| w(t) f^d(t) \right\|_{L^2}^2$.

• Where $q(t) = \left| \dfrac{S_f\{x_j\}(t) - y_j(t)}{\max_t \left| S_f\{x_j\}(t) - y_j(t) \right|} \right|^2$.
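
To make the effect of $q(t)$ concrete, the following is a minimal NumPy sketch on a discrete grid. It is an illustration under assumptions, not the tracker's actual optimization: the function names are hypothetical, the maps are small hand-made arrays, and the mean over grid cells stands in for the normalized integral in the $L^2$ norm.

```python
import numpy as np


def cost_sensitive_weight(score_map: np.ndarray, desired_map: np.ndarray) -> np.ndarray:
    """q(t): squared error magnitude, normalized by the largest error on the map."""
    err = score_map - desired_map
    max_abs = np.max(np.abs(err))
    if max_abs == 0:
        # No error anywhere, so no location needs extra penalty.
        return np.zeros_like(err, dtype=float)
    return (np.abs(err) / max_abs) ** 2


def weighted_data_term(score_map: np.ndarray, desired_map: np.ndarray) -> float:
    """Discrete stand-in for || q(t) (S_f{x}(t) - y(t)) ||^2 for one sample."""
    q = cost_sensitive_weight(score_map, desired_map)
    return float(np.mean((q * (score_map - desired_map)) ** 2))


# Toy example: the desired map peaks at the target center only.
desired = np.zeros((5, 5))
desired[2, 2] = 1.0
# The predicted map matches at the target but also fires on a nearby distractor.
score = desired.copy()
score[2, 4] = 0.8   # strong false response (hard negative)
score[0, 0] = 0.2   # weak background response
q = cost_sensitive_weight(score, desired)
```

In this toy case the distractor location gets weight 1 while the weak background response gets a much smaller weight, so the loss concentrates on the hard negative.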

[Figure: desired confidence map and score map predicted by ECO]



Page 7: Online Multi-Object Tracking with Dual Matching Attention ...

Preparation for Data Association

• When the tracking process becomes unreliable, suspend the tracker and set the target to a lost state.

• $\text{state} = \begin{cases} \text{tracked}, & \text{if } s > \tau_s \text{ and } o_{\text{mean}} > \tau_o, \\ \text{lost}, & \text{otherwise.} \end{cases}$

• $s$ is the tracking score (the highest value in the confidence map).

• $o_{\text{mean}}$ is the mean, over frames $l$, of the maximum IoU between the tracked target $t_l$ and the detections $D_l$ at frame $l$.

• The condition $o_{\text{mean}} > \tau_o$ is used because a false-alarm detection is prone to being tracked consistently with high confidence.

• I think the setting of $o_{\text{mean}}$ needs to be reconsidered.
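
The state rule can be written out as a short sketch. The helper names, the `(x1, y1, x2, y2)` box layout, and the threshold values used in the usage lines are assumptions made for illustration; the slides do not give the actual values of the two thresholds.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2); this layout is an assumption


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def mean_max_iou(
    target_boxes: Sequence[Box],
    detections_per_frame: Sequence[Sequence[Box]],
) -> float:
    """o_mean: average over frames of the best IoU between the target and that frame's detections."""
    per_frame = [
        max((iou(t, d) for d in dets), default=0.0)  # 0 when a frame has no detections
        for t, dets in zip(target_boxes, detections_per_frame)
    ]
    return sum(per_frame) / len(per_frame) if per_frame else 0.0


def target_state(s: float, o_mean: float, tau_s: float, tau_o: float) -> str:
    """Return "tracked" only when both the score and the overlap test pass."""
    return "tracked" if (s > tau_s and o_mean > tau_o) else "lost"


# Usage with made-up numbers: the target is detected in frame 1 but not in frame 2.
boxes = [(0.0, 0.0, 10.0, 10.0), (0.0, 0.0, 10.0, 10.0)]
dets = [[(0.0, 0.0, 10.0, 10.0)], []]
o_mean = mean_max_iou(boxes, dets)
state = target_state(s=0.9, o_mean=o_mean, tau_s=0.2, tau_o=0.4)  # hypothetical thresholds
```

A target with a high tracking score but no supporting detections ends up with a low `o_mean`, which is how the second condition filters out persistently tracked false alarms.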



Page 9: Online Multi-Object Tracking with Dual Matching Attention ...

Data Association with DMAN

• Data association is performed between lost trajectories and candidate detections.

• Candidate detections are detections surrounding the predicted location that are not covered by any tracked target.

• The predicted location is obtained from the lost trajectory with a linear motion model.

• Matching uses Dual Matching Attention Networks (DMAN) with spatial and temporal attention.
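
Candidate selection can be sketched as follows. This is one possible reading under stated assumptions, not the authors' code: the linear motion model is taken to be constant velocity estimated from the last two trajectory centers (one per frame), "surrounding" is modeled as a hypothetical `radius` around the prediction, and "covered" is modeled as an IoU of at least a hypothetical `cover_iou` with some tracked box.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2); this layout is an assumption


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def predict_center(history: Sequence[Point], frames_ahead: int) -> Point:
    """Constant-velocity extrapolation; needs at least two centers in history."""
    (x0, y0), (x1, y1) = history[-2], history[-1]
    vx, vy = x1 - x0, y1 - y0
    return (x1 + vx * frames_ahead, y1 + vy * frames_ahead)


def candidate_detections(
    predicted: Point,
    detections: Sequence[Box],
    tracked_boxes: Sequence[Box],
    radius: float,
    cover_iou: float,
) -> List[int]:
    """Indices of detections near the prediction that no tracked target covers."""
    keep = []
    for idx, det in enumerate(detections):
        cx, cy = (det[0] + det[2]) / 2, (det[1] + det[3]) / 2
        near = math.hypot(cx - predicted[0], cy - predicted[1]) <= radius
        covered = any(iou(det, t) >= cover_iou for t in tracked_boxes)
        if near and not covered:
            keep.append(idx)
    return keep


# Usage with made-up numbers: the target was lost three frames ago.
pred = predict_center([(0.0, 0.0), (2.0, 1.0)], frames_ahead=3)
dets = [(2.0, -2.0, 10.0, 6.0), (100.0, 100.0, 110.0, 110.0), (6.0, 2.0, 12.0, 8.0)]
tracked = [(6.0, 2.0, 12.0, 8.0)]
candidates = candidate_detections(pred, dets, tracked, radius=5.0, cover_iou=0.5)
```

Here the second detection is too far from the prediction and the third coincides with an already tracked target, so only the first one remains as a candidate for matching.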


Page 10: Online Multi-Object Tracking with Dual Matching Attention ...

Pipeline of DMAN


Page 11: Online Multi-Object Tracking with Dual Matching Attention ...

Spatial Attention Network (SAN)

• Intuition: pay more attention to common local patterns of the two feature maps.

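• Matching Layer: Compute the cosine similarity between each $x_i^{\alpha}$ and $x_j^{\beta}$: $S_{ij} = (x_i^{\alpha})^\top x_j^{\beta}$, $x_i \in \mathbb{R}^{C}$.

• In matrix form, $S = (x^{\alpha})^\top x^{\beta}$, $S \in \mathbb{R}^{N \times N}$, $N = H \times W$.

• Reshape $S \in \mathbb{R}^{N \times N}$ into $X_S^{\alpha} \in \mathbb{R}^{H \times W \times N}$.

• Reshape $S^\top \in \mathbb{R}^{N \times N}$ into $X_S^{\beta} \in \mathbb{R}^{H \times W \times N}$.

• Training Loss: identification loss and verification loss.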
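
The matching layer's shape bookkeeping can be checked with a small NumPy sketch. It assumes the two feature maps are stored as `(H, W, C)` arrays and that each location's feature vector is L2-normalized first, so that the inner product equals the cosine similarity; the function name and the `eps` constant are illustrative choices, not taken from the slides.

```python
import numpy as np


def matching_layer(x_a: np.ndarray, x_b: np.ndarray, eps: float = 1e-12):
    """Build the two similarity cubes from feature maps of shape (H, W, C)."""
    H, W, C = x_a.shape
    N = H * W
    # One row per spatial location, one column per channel.
    a = x_a.reshape(N, C)
    b = x_b.reshape(N, C)
    # L2-normalize each location so the dot product is a cosine similarity.
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + eps)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + eps)
    S = a @ b.T  # (N, N); S[i, j] compares location i of image A with location j of image B
    cube_a = S.reshape(H, W, N)    # similarity representation for the first image
    cube_b = S.T.reshape(H, W, N)  # similarity representation for the second image
    return cube_a, cube_b


# Usage on random features (H=2, W=3, C=4, so N=6).
rng = np.random.default_rng(0)
feat = rng.normal(size=(2, 3, 4))
cube_a, cube_b = matching_layer(feat, feat)
```

Each spatial location of a cube holds an N-dimensional vector: its similarities to every location of the other image.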


Page 12: Online Multi-Object Tracking with Dual Matching Attention ...

Temporal Attention Network (TAN)

• Intuition: The tracklet may contain noisy observations, hence average pooling is unreliable.

• Training Strategy: First train the SAN on randomly generated image pairs and fix its weights. Then train the TAN with the extracted features as input.

• Reason for the Strategy: Image pairs generated from the sequence of each ID are highly redundant, so the model overfits easily.

• MOT 16 is used for training.
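
A toy comparison shows why plain averaging over a tracklet is fragile. The slides do not describe how the TAN computes its weights, so this sketch assumes a generic softmax over per-observation scores; the scores are hand-set stand-ins for whatever a trained attention network would output, and the function names are hypothetical.

```python
import numpy as np


def average_pool(features: np.ndarray) -> np.ndarray:
    """Uniform average over the T observations of a tracklet; features has shape (T, C)."""
    return features.mean(axis=0)


def attention_pool(features: np.ndarray, scores: np.ndarray) -> np.ndarray:
    """Weighted average with softmax weights derived from per-observation scores."""
    w = np.exp(scores - scores.max())  # subtract the max for numerical stability
    w = w / w.sum()
    return w @ features


# Two clean observations and one noisy one (for example, an occluded crop).
tracklet = np.array([[1.0, 0.0],
                     [1.0, 0.0],
                     [0.0, 1.0]])
scores = np.array([2.0, 2.0, -2.0])  # hand-set: the noisy observation gets a low score

avg = average_pool(tracklet)
att = attention_pool(tracklet, scores)
```

With uniform weights the noisy observation pulls the pooled feature a third of the way toward it, whereas the attention-weighted result stays close to the clean observations.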



Page 14: Online Multi-Object Tracking with Dual Matching Attention ...

Datasets

• MOT 16: 14 sequences, 7 for training and 7 for testing.

• MOT 17: The same video sequences as MOT 16, but with three sets of detections (DPM, Faster-RCNN, SDP).


Page 15: Online Multi-Object Tracking with Dual Matching Attention ...

Visualization of the Spatial and Temporal Attention

[Figure panels: Positive; Negative]


Page 16: Online Multi-Object Tracking with Dual Matching Attention ...

More Visualization Results


Page 17: Online Multi-Object Tracking with Dual Matching Attention ...

Experiment


Page 18: Online Multi-Object Tracking with Dual Matching Attention ...

Ablation Study


Page 19: Online Multi-Object Tracking with Dual Matching Attention ...

Conclusion

• Integrate the merits of single object tracking and data association methods in a unified online MOT framework.

• + Combines data association with single object tracking results.

• + The spatial attention network seems to be useful.

• - The results are not the best.

• - Not much innovation.
