Semantic Segmentation Algorithms for Image Review of Deep ... · L.-C. Chen et al., Rethinking...

14/02/2019 Image Segmentation [Arthur Ouaknine]


Review of Deep Learning Algorithms for Image Semantic Segmentation

Deep Learning Working Group

Arthur Ouaknine

PhD Student

14/02/2019

valeo.ai



Summary

Datasets and metrics

Review of Architectures

Comparison

Datasets and metrics



PASCAL Visual Object Classes (PASCAL VOC)

Published: 2012

Number of classes: 20

Training and validation datasets: 11k images

Test dataset: 10k images

Evaluation: mean Intersection over Union (mIoU)

―――http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html
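As an illustration of the mIoU metric, here is a minimal sketch in plain Python. The function name `mean_iou` and the flat label lists are assumptions made for this example; benchmarks usually accumulate the counts over the whole test set rather than over a single image.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union between two flat lists of class labels.

    Classes absent from both the prediction and the ground truth are skipped.
    """
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both maps
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in at least one map
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(intersection / union)
    return sum(ious) / len(ious)


# Toy example: 6 pixels, 3 classes (hypothetical labels)
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(score)  # per-class IoUs are 1/3, 2/3 and 1/2
```

The per-class IoU is averaged without weighting, so a rare class counts as much as a frequent one.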



PASCAL-Context

Published: 2014 (extension of PASCAL VOC 2010)

Number of classes: 400 (59 are commonly used)

Training and validation datasets: 10k/10k images

Test dataset: 10k images

Evaluation: mIoU (and others)

―――https://cs.stanford.edu/~roozbeh/pascal-context/




Common Objects in COntext (COCO)

Published: 2016/2017/2018 (two challenges: object detection and stuff detection)


Training and validation dataset: 118K/5K images

Test datasets: 41k (dev + challenge)

Evaluation: Average Precision (AP) and Average Recall (AR)

―――http://cocodataset.org/




Cityscapes

Published: 2016


Training and validation datasets: 23.5k

Testing dataset: 1.5k

Evaluation: mIoU

―――https://www.cityscapes-dataset.com/




Review of Architectures




Fully Convolutional Network (FCN)

Motivation: end-to-end convolutional network

Architecture:

Input: arbitrary size

Layers: convolutions only, with skip connections, deconvolution for upsampling, and 1x1 convolution for the class scores

Performance:

PASCAL VOC 2012: 62.2% mIoU

―――J. Long et al., Fully Convolutional Networks for Semantic Segmentation, CVPR 2015
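To make the scoring layer concrete, the sketch below applies a 1x1 convolution to a feature map stored as nested lists. It is a per-pixel linear map from C channels to K class scores, so it works for any input height and width. The names `conv1x1` and `predict_labels` and the toy weights are assumptions for this example. The upsampling and skip connections are not shown.

```python
def conv1x1(features, weights, biases):
    """1x1 convolution: features [C][H][W], weights [K][C], biases [K] -> [K][H][W]."""
    C, H, W = len(features), len(features[0]), len(features[0][0])
    return [[[biases[k] + sum(weights[k][c] * features[c][y][x] for c in range(C))
              for x in range(W)]
             for y in range(H)]
            for k in range(len(weights))]


def predict_labels(scores):
    """Pick, for every pixel, the class with the highest score."""
    K, H, W = len(scores), len(scores[0]), len(scores[0][0])
    return [[max(range(K), key=lambda k: scores[k][y][x]) for x in range(W)]
            for y in range(H)]


# Toy feature map: 2 channels, 1 row, 2 columns (hypothetical values)
features = [[[1.0, 0.0]], [[0.0, 2.0]]]
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # 3 classes
biases = [0.0, 0.0, 0.5]
labels = predict_labels(conv1x1(features, weights, biases))
print(labels)
```

Because no layer depends on the spatial size, the same weights can score a 1x2 map or a full-resolution image.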



ParseNet

Motivation: take into account the global context of the image

Architecture:

Backbone model: depends on the challenge (FCN, DeepLab), extended with the new module

ParseNet context module: global pooling layer, L2 normalisation layer and unpooling layer

Performance:

PASCAL-Context: 40.4% mIoU

―――W. Liu et al., ParseNet: Looking Wider to See Better, arXiv 2015
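The three layers of the ParseNet module can be sketched as follows, under simplifying assumptions: average pooling is used as the global pooling, the unpooling is a plain broadcast, and the learnable per-channel scale of the paper is omitted. The function names are made up for this example.

```python
import math


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit L2 norm (eps avoids a division by zero)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]


def parsenet_context(features):
    """features [C][H][W] -> [2C][H][W]: normalised local features + global context."""
    C, H, W = len(features), len(features[0]), len(features[0][0])
    # Global pooling: one average value per channel, then L2 normalisation
    pooled = l2_normalize([sum(map(sum, features[c])) / (H * W) for c in range(C)])
    # Unpooling: broadcast the global vector back to every pixel
    context = [[[pooled[c]] * W for _ in range(H)] for c in range(C)]
    # L2-normalise the local feature vector of each pixel so both parts share a scale
    local = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for y in range(H):
        for x in range(W):
            vec = l2_normalize([features[c][y][x] for c in range(C)])
            for c in range(C):
                local[c][y][x] = vec[c]
    return local + context


features = [[[3.0, 0.0]], [[4.0, 0.0]]]  # 2 channels, 1x2 map (toy values)
fused = parsenet_context(features)
print(len(fused), fused[2][0], fused[3][0])
```

The normalisation matters because global and local features can differ in magnitude by a large factor before they are concatenated.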



Convolutional and Deconvolutional Networks

Motivation: improve upsampling with a fully deconvolutional network

Architecture:

Input: instance proposal

Backbone model: VGG16

Deconvolution network: VGG16-like, with deconvolution and unpooling layers

―――H. Noh et al., Learning Deconvolution Network for Semantic Segmentation, ICCV 2015
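A minimal sketch of the unpooling idea: the max-pooling step records where each maximum came from (the "switches"), and unpooling writes the pooled values back to those positions, leaving zeros elsewhere. The 2x2 window, the function names and the single-channel input are assumptions for this example.

```python
def max_pool_2x2(x):
    """2x2 max pooling on a [H][W] map (even H and W); returns (pooled, switches)."""
    pooled, switches = [], []
    for i in range(0, len(x), 2):
        prow, srow = [], []
        for j in range(0, len(x[0]), 2):
            window = [(i + di, j + dj) for di in (0, 1) for dj in (0, 1)]
            pos = max(window, key=lambda p: x[p[0]][p[1]])  # location of the maximum
            prow.append(x[pos[0]][pos[1]])
            srow.append(pos)
        pooled.append(prow)
        switches.append(srow)
    return pooled, switches


def unpool_2x2(pooled, switches, height, width):
    """Place each pooled value back at its recorded position; zeros elsewhere."""
    out = [[0] * width for _ in range(height)]
    for prow, srow in zip(pooled, switches):
        for value, (i, j) in zip(prow, srow):
            out[i][j] = value
    return out


x = [[1, 2, 0, 0],
     [3, 4, 0, 5],
     [0, 0, 7, 0],
     [6, 0, 0, 0]]
pooled, switches = max_pool_2x2(x)
restored = unpool_2x2(pooled, switches, 4, 4)
print(pooled)
print(restored)
```

The unpooled map is sparse; in the deconvolution network the following deconvolution layers densify it.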




U-Net

Motivation: improve pattern localisation with very few parameters

Architecture:

Network with a downsampling part and an upsampling part

"Copy and crop" pathways to preserve pattern information

1x1 convolution generates the segmentation map

Performance: None

―――O. Ronneberger et al., U-Net: Convolutional Networks for Biomedical Image Segmentation, MICCAI 2015
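The "copy and crop" pathway can be sketched as a centre crop of the encoder feature map followed by a channel-wise concatenation with the decoder feature map. The crop is needed because unpadded convolutions make the decoder map smaller than the encoder one. Function names and toy values below are assumptions for this example.

```python
def center_crop(feature, out_h, out_w):
    """Crop the centre out_h x out_w window of a [C][H][W] feature map."""
    H, W = len(feature[0]), len(feature[0][0])
    top, left = (H - out_h) // 2, (W - out_w) // 2
    return [[row[left:left + out_w] for row in channel[top:top + out_h]]
            for channel in feature]


def copy_and_crop(encoder_feature, decoder_feature):
    """Concatenate the cropped encoder map with the decoder map along channels."""
    out_h, out_w = len(decoder_feature[0]), len(decoder_feature[0][0])
    return center_crop(encoder_feature, out_h, out_w) + decoder_feature


# 1-channel 4x4 encoder map and 1-channel 2x2 decoder map (toy values)
encoder = [[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]]
decoder = [[[100, 100], [100, 100]]]
merged = copy_and_crop(encoder, decoder)
print(merged)
```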




Feature Pyramid Network (FPN)

Motivation: combine low-resolution and high-resolution features at different scales

Architecture:

Input: instance proposal from Faster R-CNN

Bottom-up and top-down pathways (factor 2)

Lateral connections (1x1 convolution)

MLP added for segmentation

Performance:

COCO 2016: 48.1% AR

―――T.-Y. Lin et al., Feature Pyramid Networks for Object Detection, CVPR 2017
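One step of the top-down pathway can be sketched as: upsample the coarser map by a factor of 2 and add the lateral connection coming from the bottom-up map of the same resolution. To stay short, the sketch uses single-channel maps, so the lateral 1x1 convolution reduces to a scalar `lateral_weight`; this simplification and the function names are assumptions.

```python
def upsample_nearest_2x(x):
    """Nearest-neighbour upsampling of a [H][W] map by a factor of 2."""
    out = []
    for row in x:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def top_down_merge(coarse, fine, lateral_weight=1.0):
    """Add the upsampled coarse map to the laterally projected fine map."""
    up = upsample_nearest_2x(coarse)
    return [[u + lateral_weight * f for u, f in zip(urow, frow)]
            for urow, frow in zip(up, fine)]


coarse = [[1.0, 2.0]]                      # 1x2 top-down map
fine = [[10.0, 10.0, 20.0, 20.0],
        [10.0, 10.0, 20.0, 20.0]]          # 2x4 bottom-up map
merged = top_down_merge(coarse, fine, lateral_weight=0.5)
print(merged)
```

Repeating this step from the coarsest level down yields one merged map per scale.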



Pyramid Scene Parsing Network (PSPNet)

Motivation: learn global patterns using region-based context aggregation

Architecture:

Backbone network: ResNet with a dilated network strategy

Pyramid Pooling Module: pooling, 1x1 convolution, upsampling and concatenation

Convolution layer to generate pixel-wise predictions

Performance:

Cityscapes: 80.2% mIoU

―――H. Zhao et al., Pyramid Scene Parsing Network, CVPR 2017
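A rough sketch of the Pyramid Pooling Module on a single-channel map: average-pool the map into grids of several sizes, upsample each grid back to the input size, and concatenate everything with the input. Assumptions made for brevity: the bin sizes (1, 2, 3, 6) must divide the map size, upsampling is nearest-neighbour, and the 1x1 convolution is left out.

```python
def avg_pool_to_bins(x, bins):
    """Average-pool a [H][W] map into a bins x bins grid (H, W divisible by bins)."""
    bh, bw = len(x) // bins, len(x[0]) // bins
    return [[sum(x[i][j]
                 for i in range(r * bh, (r + 1) * bh)
                 for j in range(c * bw, (c + 1) * bw)) / (bh * bw)
             for c in range(bins)]
            for r in range(bins)]


def upsample_nearest(x, height, width):
    """Nearest-neighbour upsampling of a [h][w] grid to height x width."""
    h, w = len(x), len(x[0])
    return [[x[i * h // height][j * w // width] for j in range(width)]
            for i in range(height)]


def pyramid_pooling(x, bin_sizes=(1, 2, 3, 6)):
    """Return the input map followed by one context map per pyramid level."""
    H, W = len(x), len(x[0])
    return [x] + [upsample_nearest(avg_pool_to_bins(x, b), H, W) for b in bin_sizes]


feature = [[float(i)] * 6 for i in range(6)]  # 6x6 map whose value is the row index
levels = pyramid_pooling(feature)
print(len(levels), levels[1][0][0], levels[2][0][0], levels[2][5][5])
```

The coarsest level (1 bin) gives every pixel the image-wide average, while finer levels keep region-level context.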



Mask R-CNN

Motivation: multi-task network, trained jointly to better solve each task

Architecture:

Backbone network: Faster R-CNN

RoIAlign layer: avoids quantization of the box coordinates and uses bilinear interpolation instead

3 output branches: bounding box coordinates, classification, binary mask

Performance:

COCO 2016: 37.1% AP

COCO 2017: 41.8% AP

―――K. He et al., Mask R-CNN, ICCV 2017
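The RoIAlign idea can be sketched as follows: the box keeps its real-valued coordinates, and each output bin is filled by bilinear interpolation of the feature map. This sketch takes a single sample at the centre of each bin, which is a simplification chosen for brevity; the function names and the toy feature map are assumptions.

```python
import math


def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a [H][W] map at the real-valued position (y, x)."""
    H, W = len(feature), len(feature[0])
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, H - 1), min(x0 + 1, W - 1)
    dy, dx = y - y0, x - x0
    top = (1 - dx) * feature[y0][x0] + dx * feature[y0][x1]
    bottom = (1 - dx) * feature[y1][x0] + dx * feature[y1][x1]
    return (1 - dy) * top + dy * bottom


def roi_align(feature, box, out_size):
    """Pool a real-valued box (y_min, x_min, y_max, x_max) into out_size x out_size.

    No coordinate is rounded: each bin is sampled once, at its centre.
    """
    y_min, x_min, y_max, x_max = box
    bin_h = (y_max - y_min) / out_size
    bin_w = (x_max - x_min) / out_size
    return [[bilinear_sample(feature, y_min + (r + 0.5) * bin_h, x_min + (c + 0.5) * bin_w)
             for c in range(out_size)]
            for r in range(out_size)]


feature = [[0.0, 1.0, 2.0],
           [3.0, 4.0, 5.0],
           [6.0, 7.0, 8.0]]  # value = 3 * y + x
pooled = roi_align(feature, box=(0.5, 0.5, 1.5, 1.5), out_size=2)
print(pooled)
```

With a quantized box the four bins would snap to whole pixels; here they land between pixels and the interpolated values follow the underlying linear ramp.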



DeepLab Family: DeepLab (v2)

Motivation: handle multi-scale objects and keep a better resolution in the intermediate representations

Architecture:

Backbone network: ResNet-101 with "atrous" convolution (= dilated convolution) and without FC layer

Atrous Spatial Pyramid Pooling (ASPP): stacking several atrous convolutions

Bilinear interpolation and Conditional Random Field (CRF)

Performance:

PASCAL-Context: 45.7% mIoU

Cityscapes: 70.4% mIoU

―――L.-C. Chen et al., DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs, TPAMI 2017
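To illustrate atrous convolution, the sketch below implements a 1-D version with zero padding: the kernel taps are spaced `rate` samples apart, so the field of view grows without adding weights. The ASPP-like function applies the same input to several rates and fuses the branches by summation, which is an assumption made for this example, as are the function names.

```python
def atrous_conv1d(signal, kernel, rate):
    """'Same'-size 1-D atrous convolution with zero padding (odd kernel length)."""
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0
        for j, k in enumerate(kernel):
            idx = i + (j - half) * rate  # taps are `rate` samples apart
            if 0 <= idx < n:
                acc += k * signal[idx]
        out.append(acc)
    return out


def aspp_1d(signal, kernel, rates):
    """Run one atrous branch per rate on the same input and sum the branches."""
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    return [sum(values) for values in zip(*branches)]


impulse = [0, 0, 0, 1, 0, 0, 0]
kernel = [1, 1, 1]
rate1 = atrous_conv1d(impulse, kernel, rate=1)
rate2 = atrous_conv1d(impulse, kernel, rate=2)
fused = aspp_1d(impulse, kernel, rates=(1, 2))
print(rate1, rate2, fused)
```

With rate 2 the three taps cover five input samples, so the impulse response spreads wider while the kernel still has three weights.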



DeepLab Family: DeepLabv3

Motivation: improve multi-scale context

Architecture:

Backbone network: modified ResNet-101

ASPP: adds 1x1 convolution and batch normalisation

Final 1x1 convolution for pixel-wise prediction

Performance (pretraining: ImageNet + JFT-300M):

PASCAL VOC 2012: 86.9% mIoU

Cityscapes: 81.3% mIoU

―――L.-C. Chen et al., Rethinking Atrous Convolution for Semantic Image Segmentation, arXiv 2017




DeepLab Family: DeepLabv3+

Motivation: refine the segmentation around the object boundaries

Architecture:

Backbone network: modified Xception

Atrous separable convolution (ASPP and decoder)

Encoder-decoder structure to recover the boundaries

Performance:

PASCAL VOC 2012: 89.0% mIoU

Cityscapes: 82.1% mIoU

―――L.-C. Chen et al., Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation, ECCV 2018
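A separable convolution factorises a standard convolution into a depthwise convolution (one spatial filter per input channel, which can be atrous) and a pointwise 1x1 convolution. The short calculation below compares the weight counts, ignoring biases; the layer sizes are hypothetical and only serve the illustration.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights of a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights of a depthwise k x k convolution followed by a pointwise 1x1 one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution mixing the channels
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 256 input and 256 output channels
standard = standard_conv_params(3, 256, 256)
separable = separable_conv_params(3, 256, 256)
print(standard, separable, standard / separable)
```

The dilation rate does not appear in either formula: making the depthwise filter atrous enlarges its field of view at no cost in weights.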



Path Aggregation Network (PANet)

Motivation: enhance information propagation

Architecture:

New bottom-up pathway (propagation of low-level features)

Adaptive feature pooling: RoIAlign on every feature level, fused for each proposal

Binary mask: FCN with 4 convolutions and 1 deconvolution, plus a short path with an FC layer

Performance:

COCO 2016: 42.0% AP

COCO 2017: 46.7% AP

―――S. Liu et al., Path Aggregation Network for Instance Segmentation, CVPR 2018
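The fusion step of adaptive feature pooling can be sketched as follows, assuming the same proposal has already been pooled to a fixed-size grid on each pyramid level. The grids are fused here with an element-wise maximum, which is one possible choice of fusion; the function name and toy grids are assumptions.

```python
def fuse_levels(pooled_per_level):
    """Element-wise maximum over a list of equally sized [H][W] grids."""
    return [[max(values) for values in zip(*rows)]
            for rows in zip(*pooled_per_level)]


# The same proposal pooled to 2x2 on three pyramid levels (toy values)
levels = [
    [[1, 5], [0, 2]],
    [[3, 4], [1, 1]],
    [[2, 2], [2, 0]],
]
fused = fuse_levels(levels)
print(fused)
```

Each cell of the fused grid keeps the strongest response across levels, so the proposal is no longer tied to a single pyramid level.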



Comparison



Results




Resources



Thanks for your attention :)
