Source: lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdf
Thomas Brox
Bored by Classification ConvNets?
End-to-end Learning of other Computer Vision Tasks
Thomas Brox
University of Freiburg
Germany
Research funded by ERC Starting Grant VideoLearn, the German Research Foundation, and the Deutsche Telekom Stiftung
Outline
• Generative networks
• U-Net: Multi-instance segmentation
• FlowNet: Estimating optical flow
Typical ConvNet architecture
cat
Classification network
Image generation
Alexey Dosovitskiy, CVPR 2015
Up-convolutional network
Figure labels: "Small gray office chair, side view", "cat"
New: Expanding network architecture
Related work:
• Eigen et al. NIPS 2014: Network for depth map prediction
• Long et al. CVPR 2015: Network for semantic segmentation
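One common way to realize an up-convolutional (expanding) layer is to unpool a coarse feature map to twice its resolution and then convolve the result. The sketch below illustrates that idea in plain Python. The function names `unpool2x`, `conv2d_same`, and `upconv`, the zero-filling unpooling, and the all-ones 3x3 kernel are assumptions made for illustration; the slides do not specify these details.

```python
def unpool2x(fmap):
    """Double the resolution: each value lands in the top-left of a 2x2 block, zeros elsewhere."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for y in range(h):
        for x in range(w):
            out[2 * y][2 * x] = fmap[y][x]
    return out


def conv2d_same(fmap, kernel):
    """Zero-padded sliding-window filter (cross-correlation) that keeps the input size.

    Assumes an odd, square kernel.
    """
    h, w = len(fmap), len(fmap[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(k):
                for dx in range(k):
                    yy, xx = y + dy - r, x + dx - r
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += fmap[yy][xx] * kernel[dy][dx]
            out[y][x] = acc
    return out


def upconv(fmap, kernel):
    """Unpool, then filter: one expanding step."""
    return conv2d_same(unpool2x(fmap), kernel)


coarse = [[1.0, 2.0], [3.0, 4.0]]
kernel = [[1.0] * 3 for _ in range(3)]  # hypothetical weights; a trained network learns these
fine = upconv(coarse, kernel)           # 4x4 map built from a 2x2 map
```

Stacking several such steps turns a low-dimensional code into a full-resolution image, which is the role the expanding part plays here.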
Generating chair images with a network
Dosovitskiy et al., CVPR 2015
Training set
Source: https://github.com/dimatura/seeing3d
3D chair dataset, Aubry et al. CVPR 2014
Rendering 809 chair styles from 62 viewpoints
Some of the rendered chairs
Generating images of unseen views
Training set split into two subsets:
Source set: 62 viewpoints available (90% of all chair models)
Target set: fewer viewpoints available (10% of all models)
8 azimuths available
4 azimuths available
2 azimuths available
1 azimuth available
Comparison to baselines
Alexey Dosovitskiy, CVPR 2015
Interpolation of chair styles
Alexey Dosovitskiy, CVPR 2015
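Interpolating between chair styles can be sketched as a linear blend of two style codes. The sketch assumes that each style is given to the generator as a one-hot vector; that encoding, the helper names `one_hot` and `interpolate_styles`, and the step count are assumptions made for illustration. The generator itself is omitted.

```python
def one_hot(index, size):
    """Style code with a single 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def interpolate_styles(style_a, style_b, steps):
    """Return `steps` codes moving linearly from style_a to style_b (both endpoints included)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    codes = []
    for s in range(steps):
        t = s / (steps - 1)
        codes.append([(1 - t) * a + t * b for a, b in zip(style_a, style_b)])
    return codes


NUM_STYLES = 809  # chair styles in the training set
codes = interpolate_styles(one_hot(0, NUM_STYLES), one_hot(1, NUM_STYLES), steps=5)
# Each code would be passed to the generator together with a fixed viewpoint.
```

The middle code puts weight 0.5 on each of the two styles, an input the network never saw during training.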
Correspondences between chair instances
Alexey Dosovitskiy, CVPR 2015
• Generate intermediate images with the network
• Track points with optical flow (LDOF) along the sequence
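The two steps above can be sketched as chaining per-frame flow fields along the generated sequence. In this sketch the flow fields are plain nested lists standing in for the output of an optical flow method such as LDOF. The function names and the nearest-pixel lookup are assumptions made for illustration.

```python
def track_point(point, flows):
    """Follow a point through consecutive flow fields.

    flows[k][y][x] is the (dx, dy) displacement from frame k to frame k + 1.
    The flow is read at the nearest pixel (no sub-pixel interpolation).
    """
    x, y = point
    for flow in flows:
        h, w = len(flow), len(flow[0])
        xi = min(max(int(round(x)), 0), w - 1)
        yi = min(max(int(round(y)), 0), h - 1)
        dx, dy = flow[yi][xi]
        x, y = x + dx, y + dy
    return x, y


def constant_flow(h, w, dx, dy):
    """Toy flow field in which every pixel moves by the same (dx, dy)."""
    return [[(dx, dy) for _ in range(w)] for _ in range(h)]


# Three intermediate transitions between two chair images (toy data).
flows = [
    constant_flow(4, 4, 1.0, 0.0),
    constant_flow(4, 4, 0.0, 1.0),
    constant_flow(4, 4, 1.0, 1.0),
]
end = track_point((0.0, 0.0), flows)
```

Because neighbouring intermediate images differ only slightly, each flow step is a small-displacement problem even when the two end images look very different.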
| Method | All | Easy | Difficult |
| --- | --- | --- | --- |
| Deformable Spatial Pyramid Matching (Kim et al. 2013) | 5.2 | 3.3 | 6.3 |
| SIFT Flow (Liu et al. 2008) | 4.0 | 2.8 | 4.8 |
| Ours | 3.9 | 3.9 | 3.9 |
| Human performance | 1.1 | 1.1 | 1.1 |
Preview: Inverting ConvNets with ConvNets
Alexey Dosovitskiy, arXiv 2015
Image features, e.g. from AlexNet
Related work:
• Mahendran & Vedaldi CVPR 2015
• Zeiler & Fergus ECCV 2014
Learn to re-generate the input image from its feature representation
Up-convolutional network
Reconstruction results
Up-Conv.
Mahendran & Vedaldi
Auto-encoder
More reconstructions with up-convolutional network:
Color and position are preserved in high layers
Experiments: Color experiment, Position experiment
Figure labels: input, All FC8, Top 5 FC8, All but Top 5 FC8
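The "Top 5 FC8" and "All but Top 5 FC8" variants can be sketched as two masked copies of the FC8 activation vector. The sketch assumes that the entries left out of a variant are set to zero, which the slide does not state. The function name `split_top_k` and the toy activations are hypothetical.

```python
def split_top_k(activations, k=5):
    """Return (top_k_only, all_but_top_k); entries outside each variant are set to zero."""
    order = sorted(range(len(activations)), key=lambda i: activations[i], reverse=True)
    top = set(order[:k])
    top_only = [a if i in top else 0.0 for i, a in enumerate(activations)]
    rest_only = [0.0 if i in top else a for i, a in enumerate(activations)]
    return top_only, rest_only


# Toy 8-dimensional vector; a real FC8 layer has one entry per class.
fc8 = [0.1, 2.0, -1.0, 3.5, 0.7, 1.2, 0.0, 4.1]
top_only, rest_only = split_top_k(fc8, k=3)
```

Feeding each variant to the reconstruction network shows which part of the vector carries information such as color and position.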
Outline
• A generative network
• U-Net: Multi-instance segmentation
• FlowNet: Estimating optical flow
U-Net: Image segmentation with a ConvNet
Olaf Ronneberger, Philipp Fischer, MICCAI 2015
• Similar to Fully Convolutional Network [Long et al., CVPR 2015]
• Original inspiration: Depth map prediction [Eigen et al., NIPS 2014]
Binary segmentation
• Electron microscopy, ISBI 2012 Challenge: Rank 1
• Light microscopy cell tracking, ISBI 2015 Challenge: Rank 1
Multi-class semantic segmentation
Intersection over union: 77.5%
Second best: 46%
X-ray dental segmentation, ISBI 2015 Challenge, Rank 1
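Intersection over union, the score quoted above, compares a predicted mask with the ground-truth mask. A minimal sketch for binary masks follows. The function name, the nested-list mask format, and the convention of returning 1.0 for two empty masks are assumptions, and how the challenge aggregates the score over images is not stated here.

```python
def intersection_over_union(pred, truth):
    """IoU of two binary masks given as equally sized nested lists of 0/1."""
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Two empty masks agree perfectly; treating that case as 1.0 is a convention.
    return inter / union if union else 1.0


pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
iou = intersection_over_union(pred, truth)  # 2 shared pixels out of 4 covered pixels
```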
Multi-instance segmentation
Light microscopy, DIC-HeLa cell tracking, ISBI 2015 Challenge: Rank 1
Outline
• A generative network
• U-Net: Multi-instance segmentation
• FlowNet: Estimating optical flow
FlowNet: Estimating optical flow with a ConvNet
Refinement: expanding architecture
Helping the network with a correlation layer
Alexey Dosovitskiy
Philipp Fischer
Eddy Ilg
Philip Häusser
Caner Hazirbas
Vladimir Golkov
Joint work with the group of Daniel Cremers
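A correlation layer compares the feature maps of the two input images by taking dot products between feature vectors at displaced positions. The sketch below shows that operation on toy data. It assumes single-pixel patches, a small square search window, and zero scores outside the image; the function name and the nested-list layout are illustrative and not taken from the slides.

```python
def correlation(feat1, feat2, max_disp):
    """Correlate two feature maps of equal size.

    feat[y][x] is a feature vector (list of floats). The result maps, for every
    position (y, x), each displacement (dx, dy) with |dx|, |dy| <= max_disp to
    dot(feat1[y][x], feat2[y + dy][x + dx]). Positions outside feat2 score 0.
    """
    h, w = len(feat1), len(feat1[0])
    out = [[{} for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in range(-max_disp, max_disp + 1):
                for dx in range(-max_disp, max_disp + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        score = sum(a * b for a, b in zip(feat1[y][x], feat2[yy][xx]))
                    else:
                        score = 0.0
                    out[y][x][(dx, dy)] = score
    return out


# Toy example: the second map is the first one shifted one pixel to the right.
feat1 = [[[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]]
feat2 = [[[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]]
corr = correlation(feat1, feat2, max_disp=1)
best = max(corr[0][0], key=corr[0][0].get)  # displacement with the highest score
```

The layer has no trainable weights in this form; it hands the later layers an explicit matching signal instead of requiring them to learn one.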
Enough data to train such a network?
• Getting ground truth optical flow for realistic videos is hard
• Existing datasets are small:

| Dataset | Frames with ground truth |
| --- | --- |
| Middlebury | 8 |
| KITTI | 194 |
| Sintel | 1041 |
| Needed | >10000 |
Realism is overrated: the “flying chair” dataset
Rendered image Optical flow
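With synthetic image pairs the ground-truth flow comes for free, because the motion applied to each rendered object is known. The sketch below assumes the motion is a 2D affine transform, which the slide does not state. The function name `affine_flow` and its parameters are hypothetical.

```python
def affine_flow(height, width, a, b, c, d, tx, ty):
    """Ground-truth flow for pixels moved by x' = a*x + b*y + tx, y' = c*x + d*y + ty.

    Returns flow[y][x] = (x' - x, y' - y).
    """
    flow = []
    for y in range(height):
        row = []
        for x in range(width):
            x2 = a * x + b * y + tx
            y2 = c * x + d * y + ty
            row.append((x2 - x, y2 - y))
        flow.append(row)
    return flow


# Pure translation by (3, -2): every pixel gets the same flow vector.
shift = affine_flow(2, 3, 1.0, 0.0, 0.0, 1.0, 3.0, -2.0)
# Uniform scaling by 2 about the origin: flow grows with distance from the origin.
zoom = affine_flow(2, 3, 2.0, 0.0, 0.0, 2.0, 0.0, 0.0)
```

Sampling many random transforms in this way yields far more labeled frame pairs than the real datasets provide.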
It works!
Although the network has only seen flying chairs for training, it predicts good optical flow on Sintel
Figure panels: Input images, Ground truth, FlowNetSimple, FlowNetCorr
Results on various datasets
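| Method | Middlebury | KITTI | Sintel Clean | Sintel Final | Flying Chairs |
| --- | --- | --- | --- | --- | --- |
| EpicFlow | 0.39 | 3.8 | 4.1 | 6.3 | 2.9 |
| DeepFlow | 0.42 | 5.8 | 5.4 | 7.2 | 3.5 |
| LDOF | 0.56 | 12.4 | 7.6 | 9.1 | 3.5 |
| FlowNetS | - | - | 7.4 | 8.4 | 2.7 |
| FlowNetS+v | - | - | 6.5 | 7.7 | 2.9 |
| FlowNetS+ft | - | 9.1 | 7.0 | 7.8 | 3.0 |
| FlowNetS+ft+v | 0.47 | 7.6 | 6.2 | 7.2 | 3.0 |
| FlowNetC | - | - | 7.3 | 8.8 | 2.2 |
| FlowNetC+v | - | - | 6.3 | 8.0 | 2.6 |
| FlowNetC+ft | - | - | 6.9 | 8.5 | 2.3 |
| FlowNetC+ft+v | 0.5 | - | 6.1 | 7.9 | 2.7 |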
Networks can compete with state-of-the-art conventional optical flow estimation methods
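The slide does not name the error measure behind these numbers. Optical flow benchmarks commonly report the average endpoint error, and the sketch below assumes that is the quantity listed, with lower values being better. The function name and the nested-list flow format are illustrative.

```python
import math


def average_endpoint_error(pred, truth):
    """Mean Euclidean distance between predicted and ground-truth flow vectors.

    Both inputs are nested lists with flow[y][x] = (u, v).
    """
    total = 0.0
    count = 0
    for row_p, row_t in zip(pred, truth):
        for (u, v), (u_gt, v_gt) in zip(row_p, row_t):
            total += math.hypot(u - u_gt, v - v_gt)
            count += 1
    return total / count


pred = [[(1.0, 0.0), (0.0, 0.0)]]
truth = [[(4.0, 4.0), (0.0, 0.0)]]
epe = average_endpoint_error(pred, truth)  # one vector is off by 5 pixels, one is exact
```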
Can handle large displacements
Figure panels: Input images, Ground truth, FlowNetSimple, FlowNetCorr, DeepFlow (Weinzaepfel et al. ICCV 2013), EpicFlow (Revaud et al. CVPR 2015)
Sometimes wrong direction
Figure panels: Input images, Ground truth, FlowNetSimple, FlowNetCorr, DeepFlow (Weinzaepfel et al. ICCV 2013), EpicFlow (Revaud et al. CVPR 2015)
Often captures fine details
Figure panels: Input images, Ground truth, FlowNetSimple, FlowNetCorr, DeepFlow (Weinzaepfel et al. ICCV 2013), EpicFlow (Revaud et al. CVPR 2015)
Results on “Flying chairs” test set
Figure panels: Input images, FlowNetCorr, EpicFlow (Revaud et al. CVPR 2015), Ground truth
Runs at 10 fps on the GPU
Summary
• A generative network
• U-Net: Multi-instance segmentation
• FlowNet: Estimating optical flow
Tip of the day