Searching for Efficient Multi-Scale Architectures for Dense Image Prediction
Presented by Paras Jain, AISys 2019 (2019/02/27)
Authors: Liang-Chieh Chen, Maxwell D. Collins, Yukun Zhu, George Papandreou, Barret Zoph, Florian Schroff, Hartwig Adam, Jonathon Shlens
Outline
• Background
• Paper overview
• Search space
• Sampling strategy
• Performance estimation
• Results
DNNs… now ubiquitous!
But DNN design is getting more complex
AlexNet (2012), VGG16 (2014), ResNet (2016)
# of applications >> # of AI experts
Growing design space of DNNs
Falling price per FLOP
What is the Design Automation stack for DNNs?
AutoML tries to automatically generate high-accuracy models (subject to constraints)
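One way to read "high-accuracy models (subject to constraints)" is as a constrained optimization problem. The formulation below is only a sketch and does not come from the slides: the search space $\mathcal{A}$, the accuracy estimate $\mathrm{Acc}$, the cost function $c$ (for example latency, parameter count, or FLOPs), and the budget $B$ are all assumed symbols.

```latex
\[
a^{*} = \arg\max_{a \in \mathcal{A}} \; \mathrm{Acc}(a)
\quad \text{subject to} \quad c(a) \le B
\]
```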
Blueprint for an AutoML paper
Blueprint image: https://arxiv.org/pdf/1808.05377.pdf
Loop image courtesy of Barret Zoph and Quoc Le
Learning straight-line DNNs (simple data)
NASNet outperformed human-designed networks on CIFAR and COCO (classification, object detection)
Constrained optimization objective for mobile inference latency
Low-cost architecture search via backprop into architecture
Motivation
• AutoML-designed networks have outperformed hand-designed networks on classification
• Can we apply search to a new vision task (semantic segmentation)?
Semantic Segmentation task
What is segmentation? Label each pixel of an image with a class
Key applications: autonomous driving, cancer detection, deforestation detection
Metric: Intersection-over-Union (IoU), aka the Jaccard index
Images: Mapillary Vistas
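The metric can be made concrete with a small sketch. The function below computes per-class IoU and averages it over flat lists of integer labels. The name `mean_iou` and the toy labels are hypothetical. Benchmark implementations usually accumulate a confusion matrix over the whole dataset and handle an "ignore" label, both of which this sketch leaves out.

```python
def mean_iou(pred, truth, num_classes):
    """Mean Intersection-over-Union over classes present in pred or truth."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same number of pixels")
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy 2x3 image flattened to 6 pixels, with classes 0 and 1 (hypothetical labels).
truth = [0, 0, 1, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0]
score = mean_iou(pred, truth, num_classes=2)
print(f"mean IoU = {score:.2f}")
```

Each class here has 2 pixels in the intersection and 4 in the union, so both per-class IoUs are 0.5.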
Current state of the art in semantic segmentation
Results generalize to scene parsing (above) and person-part matching
Used AutoML to search a space of 10^11 models; sampled 28,000 models
“Cheap AutoML” = 370 GPUs over one week
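A quick back-of-the-envelope calculation puts these numbers in perspective. It uses only the figures quoted on this slide and assumes, for illustration, that all 370 GPUs were busy for the full seven days; the variable names are hypothetical.

```python
SEARCH_SPACE_SIZE = 1e11   # order of magnitude quoted on the slide
SAMPLED_MODELS = 28_000
GPUS = 370
HOURS_PER_WEEK = 7 * 24    # assumption: every GPU is busy for the whole week

fraction_explored = SAMPLED_MODELS / SEARCH_SPACE_SIZE
gpu_hours = GPUS * HOURS_PER_WEEK
gpu_hours_per_model = gpu_hours / SAMPLED_MODELS

print(f"fraction of search space sampled: {fraction_explored:.1e}")
print(f"total budget: {gpu_hours:,} GPU-hours")
print(f"average budget per sampled model: {gpu_hours_per_model:.2f} GPU-hours")
```

Under these assumptions the search touches fewer than one in a million candidate architectures, which is why the quality of the cheap performance estimate matters so much.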
Blueprint for an AutoML paper
• Search space: only learn a single “Dense Prediction Cell”
• Sampling strategy: sample graphs using random search (Vizier)
• Performance estimation: A) train using a mobile backbone, B) cache feature maps, C) early stopping (90 min per sample = 100x speedup)
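The three parts of the blueprint (search space, sampling strategy, performance estimation) fit together as a simple loop. The sketch below is a toy illustration, not the paper's implementation: the operation names, `sample_architecture`, `proxy_score`, and `random_search` are all hypothetical, the scoring function is a placeholder rather than a training run, and it uses Python's `random` module rather than the Vizier service.

```python
import random

# Hypothetical, tiny stand-in for a search space: each architecture is a
# tuple of operation names, one per branch of the cell.
OPERATIONS = ("conv1x1", "dilated_conv3x3", "avg_pyramid_pooling")
NUM_BRANCHES = 5


def sample_architecture(rng):
    """Sampling strategy: draw each branch's operation uniformly at random."""
    return tuple(rng.choice(OPERATIONS) for _ in range(NUM_BRANCHES))


def proxy_score(arch):
    """Performance estimation: placeholder for a short, cheap training run.

    A real proxy would train the candidate briefly and return a validation
    metric; this toy score just rewards operation diversity so the loop runs.
    """
    return len(set(arch)) / len(OPERATIONS)


def random_search(num_samples, seed=0):
    """Keep the best-scoring architecture among num_samples random draws."""
    rng = random.Random(seed)
    best_arch, best_score = None, float("-inf")
    for _ in range(num_samples):
        arch = sample_architecture(rng)
        score = proxy_score(arch)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score


best_arch, best_score = random_search(num_samples=100)
print(best_arch, best_score)
```

Swapping in a different sampler or a different estimator only changes one function, which is the appeal of describing NAS papers with this blueprint.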
Search space
Pre-trained backbone
• Majority of network arch is fixed
  • MobileNet V2 classification net
  • Xception classification net
• Chop last few layers off classification net and add some new layers (DPC)
Figure callout on the new layers: “Search this”
Dense Prediction Cell
[Figure labels: Dense Prediction Cell, Random Sampling, Proxy task]
Operations:
• Average spatial pyramid pooling (downsample, conv1x1, upsample)
• 1x1 convolution
• 3x3 dilated convolution
4.2 x 10^11 search space
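The search-space size quoted above (4.2 x 10^11) can be reproduced with a short calculation. The counts below are recalled from the paper rather than stated on the slides, so treat them as assumptions: a cell has B = 5 branches, branch i may read from the backbone features or from any earlier branch (i choices), and each branch picks one of 81 operations (one 1x1 convolution, 8 x 8 = 64 dilation-rate pairs for the 3x3 dilated convolution, and 4 x 4 = 16 grid sizes for pyramid pooling).

```python
import math

NUM_BRANCHES = 5               # assumed: B = 5 branches per cell
NUM_CONV1X1 = 1
NUM_DILATED_CONV = 8 * 8       # assumed: 8 dilation rates per spatial dimension
NUM_PYRAMID_POOLING = 4 * 4    # assumed: 4 grid sizes per spatial dimension
ops_per_branch = NUM_CONV1X1 + NUM_DILATED_CONV + NUM_PYRAMID_POOLING

# Branch i (1-indexed) reads from the backbone or one of the i - 1 earlier
# branches, so there are i input choices and B! input wirings overall.
input_wirings = math.factorial(NUM_BRANCHES)
search_space_size = input_wirings * ops_per_branch ** NUM_BRANCHES

print(f"{search_space_size:,} architectures, about {search_space_size:.1e}")
```

With these assumed counts the product is B! x 81^B = 120 x 81^5, which lands on the figure from the slide.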
Faster NAS using proxy tasks
• IDEA: Estimate architecture performance using a proxy task
• The better the proxy task is, the more efficient the search is
• Key contribution of this paper is task-specific proxy tasks
[Figure: Proxy #1, Proxy #2, Proxy #3, Proxy #4, Proxy #5 alongside Model #1, Model #2, Model #4, Model #3, Model #5]
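One common way to quantify how good a proxy task is: compare the ranking it gives a set of architectures with the ranking obtained after full training. The sketch below computes Spearman's rank correlation for the ordering shown on the slide, reading it (as an assumption) as a proxy ranking next to a true ranking in which architectures 3 and 4 swap places. The function name `spearman_rho` is hypothetical, and the formula used assumes no ties.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two rankings without ties."""
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("need two rankings of equal length >= 2")
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Rank each architecture receives under the proxy task vs. after full training.
# Architectures 3 and 4 swap places, mirroring the order shown on the slide.
proxy_rank = [1, 2, 3, 4, 5]
full_rank = [1, 2, 4, 3, 5]
rho = spearman_rho(proxy_rank, full_rank)
print(f"Spearman rho = {rho:.2f}")
```

A value near 1 means the proxy orders architectures almost as full training would, so cheap evaluations can safely guide the search.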
Proxy task 1: Train using MobileNet
• Predict final accuracy by using a smaller classification network
• Xception: 21% top-1 error, 22M params
• MobileNet v2: 28% top-1 error, 3.4M params
Proxy task 2: Cache activations
Cache the classification network's activations and train only the new layers (the backbone is frozen, so no gradients flow into it)
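Because the backbone is frozen, its output for a given image never changes, so it can be computed once and reused for every candidate cell. The sketch below illustrates the idea with toy stand-ins: `backbone`, `build_feature_cache`, and `evaluate_head` are hypothetical names, the "feature map" is just a scaled list, and a real setup would store tensors on disk and train each head with gradient descent.

```python
backbone_calls = 0


def backbone(image):
    """Stand-in for the frozen, pre-trained classification network."""
    global backbone_calls
    backbone_calls += 1
    return [2.0 * pixel for pixel in image]  # pretend feature map


def build_feature_cache(images):
    """Run the frozen backbone once per image and store the activations."""
    return {idx: backbone(img) for idx, img in enumerate(images)}


def evaluate_head(head_weight, feature_cache):
    """Score a candidate head on cached features; the backbone is never re-run."""
    return sum(head_weight * sum(feats) for feats in feature_cache.values())


images = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
cache = build_feature_cache(images)

# Three hypothetical candidate heads reuse the same cached activations.
scores = [evaluate_head(w, cache) for w in (0.5, 1.0, 2.0)]
print(f"backbone forward passes: {backbone_calls}")
```

The backbone runs once per image no matter how many candidate heads are evaluated, which is where the savings come from.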
Cityscapes Semantic Segmentation
Person-part identification
PASCAL VOC scene understanding
Dense Prediction Cells learned
[Figure panels: #1, #2, #3]
Some discussion points
• What are new application areas for NAS?
  • Ideas? Object detection, speech generation, GANs?
• Does NAS un-democratize ML?
  • Google leads the training compute arms race
• Will the NAS workload influence how hardware should look?
• Seems like significant domain knowledge is necessary to develop SoTA NAS methods. Is NAS most useful as a research productivity tool?