Big Data, Machine Learning and Remote Sensing
Deep learning for semantic segmentation of hyperspectral and multispectral data
Amina Ben Hamida, Alexandre Benoit, Patrick Lambert, Chokri Ben Amar
Use case: Sentinel-2 generates 1.6 TBytes of compressed raw image data per day from its two-satellite constellation.
● Self taught
● Unsupervised
● Supervised
Playing major roles in different domains:
● Agriculture
● Forecasting
● ….
● Bag of Visual Words
● Deep Learning fine-tuning
UC Merced database:
● 256×256 pixels, RGB color, 1-foot resolution
● 21 classes with 100 images for each one (2100 total)
Method              SIFT    SURF
Sparse gray BoVW    79%     86%
Spatial Pyramid     78%     -
Sparse RGB BoVW     78%     85%
Dense gray BoVW     87%     91%
Dense RGB BoVW      86%     90%

Deep Learning fc7+linSVM: 96%
Deep Learning fc7+newfc8: 94%
[1] A. Ben Hamida, A. Benoit, P. Lambert, and C. Ben Amar, “Could multimedia approaches help remote sensing analysis?,” in Conference Image Information Mining: Earth Observation meets Multimedia. IIM, 2015.
Temporal components are not taken into account
Taking into account the spatial and spectral components:
● Separately processing the spectral and spatial components using SAE
● Combining the spatial and spectral information at early phases
Only taking into account the spectral information
Explodes the number of training parameters and requires a large amount of data
Disregards the spatial component, which can hold important information
Perfectly fits the concept of hyperspectral / multispectral data classification
The Pavia Center is a 102-band dataset consisting of one image of 1096×1096 pixels
The Pavia University is a 103-band dataset consisting of one image of 610×340 pixels
* Architecture: 6 layers with 5×5×3 filters
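To make the filter shape above concrete, here is a minimal NumPy sketch of one valid 3D convolution (computed as a cross-correlation, as most deep learning libraries do) applied to a small hyperspectral patch. It assumes that 5×5 is the spatial extent and 3 the spectral depth of the filter. The 9×9 patch size, the random weights and the function name `conv3d_valid` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def conv3d_valid(cube, kernel):
    """Valid 3D cross-correlation of an (H, W, B) cube with a (kh, kw, kb) kernel."""
    H, W, B = cube.shape
    kh, kw, kb = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1, B - kb + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                # Each output value mixes a spatial neighbourhood AND adjacent bands.
                out[i, j, k] = np.sum(cube[i:i + kh, j:j + kw, k:k + kb] * kernel)
    return out

rng = np.random.default_rng(0)
patch = rng.random((9, 9, 103))   # hypothetical 9x9 patch with 103 bands (Pavia University)
kernel = rng.random((5, 5, 3))    # one 5x5x3 filter (assumed: 5x5 spatial, 3 spectral)

features = conv3d_valid(patch, kernel)
n_weights = kernel.size           # weights per filter, independent of the number of bands

print(features.shape, n_weights)
```

Because the filter slides along the spectral axis as well, its weight count does not grow with the number of bands, which is one reason 3D filters can keep a network small.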
Squeezing the net on Pavia University [2]
Squeezing the net on Pavia Center [2]
[Figure panels for University of Pavia and Pavia Center: 8 layers with 3×3 filters, 8 layers with 5×5 filters, [3], [4]]
* [3]: accuracy of 98% using 9% of the data for training, with 20000 parameters
* [2]: accuracy of 99% using 9% of the data for training, with 7000 parameters
Two main strategies for land cover estimation:
● At coarse reference resolution
● At high resolution
Fine-grained, pixel-level land cover estimation
* Relying on light DenseNet and 3D DenseNet architectures
[Architecture figure labels: e4, e5, b7, d5, d4, g16]
Relying on light and generic architectures (DenseNet)
Semantic segmentation on the ISPRS Vaihingen dataset
a: near-infrared, red and green bands; b: DSM; c: semantic labels
[Figure panels: input image (NIR, red and green bands), full reference, coarse reference, inferred semantic map]
Full reference: 9 cm/px
Coarse reference: 135 cm/px
train/test with full reference
train/test with coarse reference
Training on high resolution/confident reference improves results
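The coarse reference is 15 times coarser than the full one (135 / 9 = 15). The slides do not say how the coarse map was produced; one plausible way, assumed here purely for illustration, is a block-wise majority vote over the full-resolution labels. The function name `coarsen_labels` and the toy label map are hypothetical.

```python
import numpy as np

def coarsen_labels(labels, factor, n_classes):
    """Downsample an integer label map by keeping the most frequent class per block."""
    H, W = labels.shape
    h, w = H // factor, W // factor
    # Crop to a multiple of the factor, then gather each factor x factor block.
    blocks = labels[:h * factor, :w * factor].reshape(h, factor, w, factor)
    blocks = blocks.transpose(0, 2, 1, 3).reshape(h, w, factor * factor)
    # Count the votes of every class inside each block and keep the winner.
    counts = np.stack([(blocks == c).sum(axis=-1) for c in range(n_classes)], axis=-1)
    return counts.argmax(axis=-1)

factor = round(135 / 9)            # 9 cm/px -> 135 cm/px

full = np.zeros((30, 30), dtype=int)
full[:15, 15:] = 1                 # top-right block is class 1
full[15:, :15] = 2                 # bottom-left block is class 2
full[0, 0] = 3                     # isolated detail, outvoted when coarsening

coarse = coarsen_labels(full, factor, n_classes=4)
print(coarse)
```

The isolated class-3 pixel disappears in the coarse map, which illustrates why a coarse reference carries less confident supervision for fine structures.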
2016 Sentinel-2 images and 2009 CORINE land cover dataset
Rescaled Sentinel-2 images: 20 m/pixel
CORINE land cover reference: 300 m/pixel
● 22 classes of May-October images with no cloud included
● 23 classes of June-August images with cloud included
[Figure panels: input image, CORINE reference, inferred segmentation, labelled T1 and T2]
* Model: e4_5_b7_g16
Limitations: temporal consistency
Only confident on large stable areas; fails in rich heterogeneous areas
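One simple way to put a number on the temporal consistency limitation is the fraction of pixels whose predicted class is unchanged between two acquisitions of the same zone (T1 and T2). This is an assumed metric for illustration, not one reported in the slides; the function names and the toy prediction maps are hypothetical.

```python
import numpy as np

def agreement_rate(pred_t1, pred_t2):
    """Fraction of pixels that receive the same label at both dates."""
    pred_t1, pred_t2 = np.asarray(pred_t1), np.asarray(pred_t2)
    if pred_t1.shape != pred_t2.shape:
        raise ValueError("both prediction maps must cover the same grid")
    return float(np.mean(pred_t1 == pred_t2))

def per_class_stability(pred_t1, pred_t2, n_classes):
    """For each class predicted at T1, fraction of its pixels still in that class at T2."""
    pred_t1, pred_t2 = np.asarray(pred_t1), np.asarray(pred_t2)
    stability = {}
    for c in range(n_classes):
        mask = pred_t1 == c
        if mask.any():
            stability[c] = float(np.mean(pred_t2[mask] == c))
    return stability

# Toy 3x3 label maps predicted for the same zone at two dates.
pred_t1 = [[0, 0, 1],
           [0, 2, 1],
           [0, 2, 2]]
pred_t2 = [[0, 0, 1],
           [0, 1, 1],
           [0, 2, 1]]

rate = agreement_rate(pred_t1, pred_t2)
stability = per_class_stability(pred_t1, pred_t2, n_classes=3)
print(rate, stability)
```

A per-class breakdown of this kind would show whether instability concentrates in particular land cover classes rather than being spread evenly.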
Estimation at the coarse reference level using SegNets
The multispectral approach performs better while requiring fewer images
Semantic segmentation in the absence of ground truth:
● Resorting to generative models to learn the data distribution: fits the sensors used
● Fine-tuning over some annotated regions for more specific applications (semantic segmentation, classification…)
Finalizing the current work on poorly annotated databases:
● Enriching the currently used models (DenseNet, ResNet..) with larger datasets
● Studying prediction stability over the same zones within a short period of time
● “Could multimedia approaches help remote sensing analysis?,” in Conference Image Information Mining: Earth Observation meets Multimedia. IIM, 2015.
● “Deep learning approach for remote sensing image analysis,” in Conference on Big Data from Space. BIDS, 2016.
● “Deep learning for semantic segmentation of remote sensing images with rich spectral content,” IGARSS 2017
● “Three dimensional Deep Learning approach for remote sensing image classification,” TGARS
[1] A. Ben Hamida, A. Benoit, P. Lambert, and C. Ben Amar, “Could multimedia approaches help remote sensing analysis?,” in Conference Image Information Mining: Earth Observation meets Multimedia. IIM, 2015.
[2] A. Ben Hamida, A. Benoit, P. Lambert, and C. Ben Amar, “Deep learning approach for remote sensing image analysis,” in Conference on Big Data from Space. BIDS, 2016.
[3] Xiaorui Ma, Jie Geng, and Hongyu Wang, “Hyperspectral image classification via contextual deep learning,” EURASIP Journal on Image and Video Processing, vol. 2015, no. 1, pp. 1–12, 2015.
[4] S. Lefevre, L. Chapel, and F. Merciol, “Hyperspectral image classification from multiscale description with constrained connectivity and metric learning,” in 6th International Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing. Lausanne, Switzerland, 2014.
Bag of Visual Words (BoVW)
Represent each image as a histogram of different visual words
● Unsupervised visual words clustering (SIFT/SURF)
● Supervised tiles classification
● Training with annotations: training tiles + labels
● Feature extraction & visual word matching
● Supervised classifier (SVM, KNN, etc.)
● Test phase: test tile, label: "Freeway"
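The BoVW pipeline above can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions, not the authors' code: random vectors stand in for SIFT/SURF descriptors, a small hand-written k-means builds the vocabulary, and a 1-nearest-neighbour rule plays the role of the supervised classifier (the slides mention SVM and KNN). All function names and sizes are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns the k cluster centres (the visual words)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        assign = dist.argmin(axis=1)
        for c in range(k):
            if np.any(assign == c):
                centers[c] = X[assign == c].mean(axis=0)
    return centers

def bovw_histogram(descriptors, vocabulary):
    """Match each descriptor to its nearest visual word and return the normalised histogram."""
    dist = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=-1)
    words = dist.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()

def make_tile(rng, n_a, n_b, dim=8):
    """Synthetic stand-in for SIFT/SURF output: a mix of two kinds of local pattern."""
    a = rng.normal(0.0, 1.0, size=(n_a, dim))
    b = rng.normal(5.0, 1.0, size=(n_b, dim))
    return np.vstack([a, b])

rng = np.random.default_rng(0)
# Class 0 tiles are mostly pattern A, class 1 tiles mostly pattern B.
train_tiles = ([make_tile(rng, 40, 10) for _ in range(10)]
               + [make_tile(rng, 10, 40) for _ in range(10)])
train_labels = np.array([0] * 10 + [1] * 10)

# Unsupervised step: cluster all training descriptors into visual words.
vocabulary = kmeans(np.vstack(train_tiles), k=4)
# Supervised step: one histogram per training tile, paired with its label.
train_hists = np.array([bovw_histogram(t, vocabulary) for t in train_tiles])

# Test phase: histogram of an unseen tile, labelled by its nearest training histogram.
test_hist = bovw_histogram(make_tile(rng, 10, 40), vocabulary)
nearest = np.argmin(((train_hists - test_hist) ** 2).sum(axis=1))
predicted = int(train_labels[nearest])
print(predicted)
```

In a real setting the synthetic descriptors would be replaced by SIFT or SURF features extracted from each tile, and the nearest-neighbour rule by the chosen classifier.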