Large Scale Visual Recognition Challenge (ILSVRC) 2013: Classification spotlights
Additions to the ConvNet Image Classification Pipeline
Andrew Howard – Andrew Howard Consulting
Changes to Training:
- Use more pixels: train on square patches from the rectangular image instead of from the cropped central square
- Additional color manipulation of contrast, brightness, and color balance applied to training patches
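The training change above can be sketched as follows. This is a minimal illustration, not the author's code: the image size (340 x 256), the patch size (224), the function names, and the form of the contrast/brightness jitter are all assumptions chosen for the example.

```python
import random

def central_square(width, height):
    """Bounding box (left, top, right, bottom) of the central square crop."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side

def random_patch(bounds, patch, rng):
    """Sample a patch x patch box uniformly inside the given bounds."""
    left, top, right, bottom = bounds
    x = rng.randint(left, right - patch)   # randint is inclusive on both ends
    y = rng.randint(top, bottom - patch)
    return x, y, x + patch, y + patch

def jitter(pixel, contrast, brightness):
    """Scale contrast around mid-gray, then brightness; clamp to [0, 255]."""
    value = ((pixel - 128.0) * contrast + 128.0) * brightness
    return max(0.0, min(255.0, value))

rng = random.Random(0)
width, height, patch = 340, 256, 224       # assumed sizes, for illustration only

# Baseline: patches are drawn only from the cropped central square.
old_box = random_patch(central_square(width, height), patch, rng)
# Change described on the slide: patches are drawn from the whole rectangle.
new_box = random_patch((0, 0, width, height), patch, rng)
```

With these assumed sizes, the central square allows 33 horizontal patch positions (256 - 224 + 1), while the full rectangle allows 117 (340 - 224 + 1), so pixels near the left and right edges can now appear in training patches.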
Changes to Testing:
- Make predictions at different scales and different views which use all pixels
- Previous: used 10 predictions (2 flips * 5 translations)
- This submission: used 90 predictions (2 flips * 5 translations * 3 scales * 3 views)
- The number of predictions can be reduced with no loss of accuracy using stagewise regression
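The prediction counts above are just the sizes of Cartesian products of the test-time transformations. The sketch below enumerates them and combines per-view outputs by simple averaging. The translation, scale, and view labels are placeholders, and averaging is an assumption made for illustration; the slide does not say how the predictions are combined, and the stagewise-regression reduction is not shown.

```python
from itertools import product

FLIPS = (False, True)
# Placeholder labels: the slide gives only the counts, not the exact positions.
TRANSLATIONS = ("t1", "t2", "t3", "t4", "t5")
SCALES = ("scale_1", "scale_2", "scale_3")
VIEWS = ("view_1", "view_2", "view_3")

def average_predictions(predictions):
    """Element-wise mean of equally sized class-probability vectors."""
    count = len(predictions)
    return [sum(column) / count for column in zip(*predictions)]

# Previous pipeline: 2 flips * 5 translations.
previous = list(product(FLIPS, TRANSLATIONS))
# This submission: 2 flips * 5 translations * 3 scales * 3 views.
current = list(product(FLIPS, TRANSLATIONS, SCALES, VIEWS))

print(len(previous), len(current))  # 10 90

# Toy example with two views and two classes.
combined = average_predictions([[0.25, 0.75], [0.75, 0.25]])
```

In a real pipeline each tuple in `current` would select one transformed crop to pass through the network, and `average_predictions` would receive the 90 resulting probability vectors.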
Higher Resolution Models:
- Use a fully trained model and fine-tune it on image patches from a higher resolution image
- This can be trained in about 1/3 the number of epochs
- Predictions on higher resolution images are complementary to those of the base model
Final Vision System:
- Achieves 13.6% error and is made of 5 base models and 5 higher resolution models
- Structure is the same as last year with fully connected layers twice as large, which doesn’t add much value
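[Figure: training patches are taken from the full rectangular image ("Use Patches From") instead of from the cropped central square ("Instead of Patches From"); test-time panels show View 1, View 2, and View 3.]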
Cognitive Psychology Inspired Image Classification using Deep Neural Network
Kuiyuan Yang, Microsoft Research
Yalong Bai, Harbin Institute of Technology
Yong Rui, Microsoft Research
CognitiveVision team
Our Classification Scheme
Given an image, first predict its basic category, then predict its sub-category.
- Basic category classification (easy to distinguish): Dog, Cat, …
- Dog classification: French bulldog, English setter, Maltese dog, dalmatian, …
- Cat classification: Egyptian cat, Siamese cat, tiger cat, …
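The two-stage scheme on this slide can be sketched as a coarse classifier followed by a per-category fine classifier. This is a hypothetical illustration only: the function names, the score dictionaries, and the stub classifiers stand in for the team's deep networks, whose details are not given in the slides.

```python
def argmax(scores):
    """Return the label with the highest score."""
    return max(scores, key=scores.get)

def classify(image, basic_classifier, sub_classifiers):
    """Predict the basic category first, then the sub-category within it."""
    basic = argmax(basic_classifier(image))
    sub = argmax(sub_classifiers[basic](image))
    return basic, sub

# Stub classifiers returning fixed scores; a real system would run networks here.
def basic_stub(image):
    return {"dog": 0.9, "cat": 0.1}

def dog_stub(image):
    return {"French bulldog": 0.2, "English setter": 0.1, "Maltese dog": 0.7}

def cat_stub(image):
    return {"Egyptian cat": 0.5, "Siamese cat": 0.3, "tiger cat": 0.2}

result = classify("example.jpg", basic_stub, {"dog": dog_stub, "cat": cat_stub})
print(result)  # ('dog', 'Maltese dog')
```

One consequence of this design is that a mistake at the basic-category stage cannot be corrected by the sub-category classifier, which is why the slide stresses that basic categories are easy to distinguish.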
Caffe: Open-Sourcing Deep Learning
Yangqing Jia, Trevor Darrell, UC Berkeley
• Convolutional Architecture for Fast Feature Extraction
– Seamless switching between CPU and GPU
– Fast computation (2.5 ms / image with GPU)
– Full training and testing capability
– Reference ImageNet model available
• A framework to support multiple applications: Classification, Embedding, Detection, Your next Application!
Publicly available at http://caffe.berkeleyvision.org/
Experiments for large scale visual recognition
We tried: Deep CNN (following Krizhevsky et al. ’12) + low-level features & spatial granularities
Where did we fail?
Television (0.18), Hair spray (0.18), Coffee mug (0.10), Flute (0.10)
Appliances and instruments are confusing for us, including:
- TV vs. Screen,
- Coffee mug vs. Cup,
- Flute vs. Microphone,
- …
top 1 acc = 0.567
Agenda
http://www.image-net.org/challenges/LSVRC/2013/iccv2013
8:30 Classification & localization
Spotlights: 8:50, 9:05, 9:20, 9:35, 9:50
10:30 Detection
Spotlights: 10:50, 11:10, 11:30, 11:40
Noon Discussion panel
14:00 Invited talk by Vittorio Ferrari: Auto-annotation and self-assessment in ImageNet
14:40 Fine-Grained Challenge 2013