- Slide 1
- Large-Scale Object Recognition with Weak Supervision Weiqiang Ren, Chong Wang, Yanhua Cheng, Kaiqi Huang, Tieniu Tan {wqren,cwang,yhcheng,kqhuang,tnt}@nlpr.ia.ac.cn
- Slide 2
- Task 2: Classification + Localization. Task 2b: Classification + localization with additional training data, ordered by classification error. 1) Only classification labels are used. 2) The full image is taken as the object location.
- Slide 3
- Outline Motivation Method Results
- Slide 4
- Motivation
- Slide 5
- Why Weakly Supervised Localization (WSL)? Knowing where to look makes objects easier to recognize! However, in the classification-only task, no annotations of object locations are available.
- Slide 6
- Current WSL Results on VOC07
- Slide 7
- Detection mAP on VOC07:
  13.9: Weakly supervised object detector learning with model drift detection, ICCV 2011
  15.0: Object-centric spatial pooling for image classification, ECCV 2012
  22.4: Multi-fold MIL training for weakly supervised object localization, CVPR 2014
  22.7: On learning to localize objects with minimal supervision, ICML 2014
  26.4: Weakly supervised object detection with posterior regularization, BMVC 2014
  31.6: Weakly supervised object localization with latent category learning, ECCV 2014 (Sep 11, Poster Session 4A, #34)
  26.2: Discovering Visual Objects in Large-scale Image Datasets with Weak Supervision, submitted to TPAMI
- Slide 8
- Our Work:
  Weakly Supervised Object Localization with Latent Category Learning, ECCV 2014. VOC 2007 results: Ours 31.6 vs. DPM 5.0 33.7.
  Discovering Visual Objects in Large-scale Image Datasets with Weak Supervision, submitted to TPAMI. VOC 2007 results: Ours 26.2 vs. DPM 5.0 33.7.
  For high efficiency in large-scale tasks, we use the second one.
- Slide 9
- Method
- Slide 10
- Framework: input images pass through conv layers and FC layers, followed by four stages: (1) CNN classification prediction (Cls Prediction), (2) detection prediction (Det Prediction), (3) detection rescoring, (4) classification rescoring.
- Slide 11
- 1st: CNN Architecture. Chatfield et al., Return of the Devil in the Details: Delving Deep into Convolutional Nets.
- Slide 12
- 2nd: MILinear SVM
- Slide 13
- MILinear: Region Proposal. A good region proposal algorithm needs high recall, high overlap, a small number of windows, and low computation cost. We use MCG pretrained on VOC 2012 (additional data): 128 windows/image for training and 256 windows/image for testing, compared to ~2000 for Selective Search.
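The "high recall / high overlap" criteria above can be made concrete: a proposal set is good if, for almost every ground-truth box, some proposal overlaps it beyond an IoU threshold. A minimal numpy sketch (function names are illustrative, not from the talk):

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all in [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def proposal_recall(gt_boxes, proposals, thresh=0.5):
    """Fraction of ground-truth boxes covered by at least one proposal."""
    hits = [iou(gt, proposals).max() >= thresh for gt in gt_boxes]
    return float(np.mean(hits))
```

Comparing proposal algorithms then reduces to plotting this recall against the number of windows per image: MCG's advantage in the slide is reaching high recall with only 128-256 windows.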
- Slide 14
- MILinear: Feature Representations. Low-level features: SIFT, LBP, HOG, shape context, Gabor. Mid-level features: Bag of Visual Words (BoVW). Deep hierarchical features: convolutional networks, deep auto-encoders, deep belief nets.
- Slide 15
- MILinear: Positive Window Mining. Clustering: K-means. Topic models: pLSA, LDA, gLDA. CRF. Multiple instance learning: DD, EM-DD, APR, MI-NN, MI-SVM, mi-SVM, MILBoost.
- Slide 16
- MILinear: Objective Function and Optimization. A multiple-instance linear SVM, optimized with a trust-region Newton method (a kind of quasi-Newton method) working in the primal, which gives faster convergence.
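To illustrate the multiple-instance idea, here is a minimal sketch of the usual MI-SVM-style alternation: treat each positive image as a bag of windows, pick the top-scoring window per bag as its positive instance, refit a linear SVM, and repeat. This is an assumption-laden toy (plain subgradient descent stands in for the trust-region Newton primal solver used in the talk, and all names are illustrative):

```python
import numpy as np

def train_primal_svm(X, y, lam=1e-2, lr=0.1, epochs=200):
    """Linear SVM in the primal: subgradient descent on L2-regularized hinge loss."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        margins = y * (X @ w)
        mask = margins < 1                     # margin-violating examples
        grad = lam * w - (y[mask, None] * X[mask]).mean(0) if mask.any() else lam * w
        w -= lr * grad
    return w

def mi_linear_svm(pos_bags, neg_windows, n_iters=5):
    """Multiple-instance linear SVM: alternate between selecting the
    top-scoring window in each positive bag and refitting the SVM."""
    # Initialize each positive bag's instance as its mean window feature.
    pos = np.stack([bag.mean(0) for bag in pos_bags])
    w = None
    for _ in range(n_iters):
        X = np.vstack([pos, neg_windows])
        y = np.concatenate([np.ones(len(pos)), -np.ones(len(neg_windows))])
        w = train_primal_svm(X, y)
        # Re-select the highest-scoring window per positive bag.
        pos = np.stack([bag[np.argmax(bag @ w)] for bag in pos_bags])
    return w
```

The alternation is what makes the training "weakly supervised": the window-level labels are latent and are re-estimated from the current model at every iteration.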
- Slide 17
- MILinear: Optimization Efficiency
- Slide 18
- 3rd: Detection Rescoring. Rescore with a softmax over all 1000 classes: take the max over the 128 boxes per class and train a softmax on the resulting 1000-dim scores. The softmax considers all categories simultaneously in each minibatch of the optimization, suppressing the responses of other similar-looking object categories.
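The inference side of this step can be sketched in a few lines: max-pool the per-window detector scores over boxes, then normalize across classes with a softmax so that similar-looking categories compete for probability mass. (A minimal sketch; the talk additionally trains the softmax, and the function names are illustrative.)

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def rescore_detections(window_scores):
    """window_scores: (n_boxes, n_classes) raw detector scores.
    Max-pool over boxes per class, then softmax across classes."""
    per_class = window_scores.max(axis=0)      # (n_classes,)
    return softmax(per_class)
```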
- Slide 19
- 4th: Classification Rescoring. A linear combination of the 1000-dim classification and detection scores. One curious finding: we tried several other score-combination strategies, but none of them seemed to work!
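The linear combination itself is a one-liner; a sketch with a hypothetical blending weight `alpha` (the slide does not state the actual weights used):

```python
import numpy as np

def combine_scores(cls_probs, det_probs, alpha=0.5):
    """Blend 1000-dim classification and WSL-detection score vectors.
    `alpha` is a hypothetical weight, not the value from the talk."""
    return alpha * cls_probs + (1 - alpha) * det_probs

def top5(scores):
    """Indices of the five highest-scoring classes (for top-5 error)."""
    return np.argsort(scores)[::-1][:5]
```

The complementarity reported later in the deck (12.5 + 13.5 combining to 11.5 top-5 error) comes from exactly this kind of blend: the two streams make different mistakes, so averaging their scores corrects some of each.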
- Slide 20
- Results
- Slide 21
- 1st: Classification without WSL.
  Method / Top-5 Error:
  Baseline with one CNN: 13.7
  Average of four CNNs: 12.5
- Slide 22
- 2nd: MILinear on ImageNet 2014.
  Method / Detection Error:
  Baseline (full image): 61.96
  MILinear: 40.96
  Winner: 25.3
- Slide 23
- 2nd: MILinear on VOC 2007
- Slide 24
- 2nd: MILinear on ILSVRC 2013 detection: mAP 9.63% vs. 8.99% for DPM 5.0!
- Slide 25
- 2nd: MILinear for Classification.
  Method / Top-5 Error:
  MILinear: 17.1
- Slide 26
- 3rd: WSL Rescoring (Softmax).
  Method / Top-5 Error:
  Baseline with one CNN: 13.7
  Average of four CNNs: 12.5
  MILinear: 17.1
  MILinear + rescore: 13.5
  The softmax-based rescoring successfully suppresses the predictions of other similar-looking object categories!
- Slide 27
- 4th: Cls and WSL Combination.
  Method / Top-5 Error:
  Baseline with one CNN model: 13.7
  Average of four CNN models: 12.5
  MILinear: 17.1
  MILinear + rescore: 13.5
  Cls (12.5) + MILinear (13.5): 11.5
  WSL and Cls are complementary to each other!
- Slide 28
- Russakovsky et al., ImageNet Large Scale Visual Recognition Challenge.
- Slide 29
- Conclusion: WSL consistently helps classification. WSL has large potential, because weakly labeled data is cheap.
- Slide 30
- Thank You!