
Big Data, Machine Learning and Remote Sensing

Deep Learning for semantic segmentation of hyperspectral and Multispectral data

Amina Ben Hamida, Alexandre Benoit, Patrick Lambert, Chokri Ben Amar


Use case: Sentinel-2 generates 1.6 TBytes of compressed raw image data per day from its two-satellite constellation.
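To give a rough sense of scale, the daily volume quoted above can be extrapolated over longer periods. This is only a sketch: it assumes the rate stays constant, which the slide does not claim, and the function name `volume_tbytes` is illustrative.

```python
# Extrapolate the quoted Sentinel-2 daily volume.
# Assumption: the rate is constant over the whole period.
DAILY_TBYTES = 1.6  # compressed raw data per day, figure taken from the slide


def volume_tbytes(days: int, daily_tbytes: float = DAILY_TBYTES) -> float:
    """Total compressed volume in TBytes accumulated over `days` days."""
    if days < 0:
        raise ValueError("days must be non-negative")
    return days * daily_tbytes


if __name__ == "__main__":
    for label, days in [("week", 7), ("year", 365)]:
        print(f"one {label}: {volume_tbytes(days):.1f} TBytes")
```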

● Self taught

● Unsupervised

● Supervised

Playing major roles in different domains:

● Agriculture

● Forecasting

● ….


● Bag of Visual Words

● Deep Learning fine-tuning

UC Merced database:

● 256×256 pixels, RGB color, 1-foot resolution

● 21 classes with 100 images for each one (2100 total)

Sparse gray BoVW: 79% (SIFT) / 86% (SURF)

Sparse RGB BoVW: 78% (SIFT) / 85% (SURF)

Dense gray BoVW: 87% (SIFT) / 91% (SURF)

Dense RGB BoVW: 86% (SIFT) / 90% (SURF)

Spatial Pyramid: 78% (SIFT)

Deep Learning fc7 + linSVM: 96%

Deep Learning fc7 + new fc8: 94%

[1] A. Ben Hamida, A. Benoit, P. Lambert, and C. Ben Amar, “Could multimedia approaches help remote sensing analysis?,” in Conference Image Information Mining: Earth Observation meets Multimedia. IIM, 2015.


Temporal components are not taken into account

Taking into account the spatial and spectral components

Separately processing the spectral and spatial components using SAE

Combining the spatial and spectral information at early phases

Only taking into account the spectral information

Explodes the number of training parameters and requires a large amount of data

Disregards the spatial component, which can hold important information

Perfectly fits the concept of hyperspectral / multispectral data classification
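The difference between a spectral-only input and a joint spatial-spectral input can be sketched as follows. This is an illustrative example, not the authors' code: the cube layout `cube[row][col][band]`, the function names, the default patch size and the edge-replication border handling are all assumptions.

```python
from typing import List

Cube = List[List[List[float]]]  # assumed layout: cube[row][col][band]


def spectral_vector(cube: Cube, row: int, col: int) -> List[float]:
    """Spectral-only input: the band values of a single pixel."""
    return list(cube[row][col])


def spatial_spectral_patch(cube: Cube, row: int, col: int, size: int = 5) -> Cube:
    """Joint input: a size x size neighbourhood with all bands kept.

    Border pixels are handled by clamping indices (edge replication),
    which is a choice made for this sketch only.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so the patch has a centre pixel")
    half = size // 2
    n_rows, n_cols = len(cube), len(cube[0])
    patch = []
    for r in range(row - half, row + half + 1):
        rr = min(max(r, 0), n_rows - 1)  # clamp row index to the image
        patch_row = []
        for c in range(col - half, col + half + 1):
            cc = min(max(c, 0), n_cols - 1)  # clamp column index to the image
            patch_row.append(list(cube[rr][cc]))
        patch.append(patch_row)
    return patch
```

A 3D convolution would consume such a patch directly, while a spectral-only classifier would see just the vector returned by `spectral_vector`.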


The Pavia Center is a 102-band dataset consisting of one image of 1096×1096 pixels

The Pavia University is a 103-band dataset consisting of one image of 610×340 pixels

* Architecture: 6 layers with 5×5×3 filters
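To see how filter size drives the number of trainable parameters, the count for a stack of 3D convolution layers can be computed by hand. The sketch below uses the standard weights-plus-bias formula; the channel widths in the usage example are hypothetical and are not the configuration used by the authors.

```python
from typing import Sequence, Tuple


def conv3d_params(kernel: Tuple[int, int, int] = (5, 5, 3),
                  in_channels: int = 1,
                  out_channels: int = 1,
                  bias: bool = True) -> int:
    """Trainable parameters of one 3D convolution layer."""
    k_h, k_w, k_d = kernel
    weights = k_h * k_w * k_d * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def stack_params(channels: Sequence[int],
                 kernel: Tuple[int, int, int] = (5, 5, 3)) -> int:
    """Parameters of consecutive layers; channels[i] feeds channels[i + 1]."""
    return sum(conv3d_params(kernel, c_in, c_out)
               for c_in, c_out in zip(channels, channels[1:]))


if __name__ == "__main__":
    # Hypothetical widths: 6 layers of 4 feature maps each, single input channel.
    hypothetical_channels = [1, 4, 4, 4, 4, 4, 4]
    print(stack_params(hypothetical_channels))
```

Keeping the number of feature maps small is what keeps such a stack in the range of a few thousand parameters.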

Squeezing the net on Pavia University [2]

Squeezing the net on Pavia Center [2]

[Figure: panels for University of Pavia and Pavia Center; legend entries: 8 layers 3×3, 8 layers 5×5, [3], [4]]

* [3]: 98% accuracy using 9% of the data for training, with 20000 parameters

* [2]: 99% accuracy using 9% of the data for training, with 7000 parameters

Two main strategies for land cover estimation:

● At coarse reference resolution

● At high level resolution


Fine-grained pixel-level land cover estimation

* Relying on light DenseNet and 3D DenseNet architectures

[Architecture diagram labels: e4, e5, b7, d5, d4, g16]

Relying on light and generic architectures (DenseNet)

Semantic segmentation on the ISPRS Vaihingen dataset

a: near-infrared, red and green bands; b: DSM; c: semantic labels

Full reference: 9 cm/pixel

[Figure panels: input image (NIR, red and green bands), full reference, coarse reference, inferred semantic map]

Full reference: 9 cm/px

Coarse reference: 135 cm/px


train/test with full reference

train/test with coarse reference

Training on high resolution/confident reference improves results
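The two reference resolutions differ by a factor of 135 / 9 = 15. One plausible way to derive a coarse reference from a full-resolution one is a majority vote over non-overlapping blocks. The slides do not say how the coarse reference was actually produced, so the sketch below is an assumption, and the function names are illustrative.

```python
from collections import Counter
from typing import List

LabelMap = List[List[int]]


def resolution_factor(fine_cm: float, coarse_cm: float) -> int:
    """Integer ratio between coarse and fine pixel sizes."""
    factor = coarse_cm / fine_cm
    if factor != int(factor):
        raise ValueError("coarse resolution must be a multiple of the fine one")
    return int(factor)


def coarsen_labels(labels: LabelMap, factor: int) -> LabelMap:
    """Majority vote over non-overlapping factor x factor blocks.

    Partial blocks at the border vote with the pixels available.
    Ties go to the label met first in row-major order.
    """
    n_rows, n_cols = len(labels), len(labels[0])
    coarse = []
    for r0 in range(0, n_rows, factor):
        row = []
        for c0 in range(0, n_cols, factor):
            block = [labels[r][c]
                     for r in range(r0, min(r0 + factor, n_rows))
                     for c in range(c0, min(c0 + factor, n_cols))]
            row.append(Counter(block).most_common(1)[0][0])
        coarse.append(row)
    return coarse
```

With `resolution_factor(9, 135)`, each coarse label summarises a 15×15 block of full-resolution labels, which is where small objects get voted away.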

2016 Sentinel-2 images and 2009 CORINE land cover dataset

Rescaled Sentinel-2 images: 20 m/pixel

CORINE land cover reference: 300 m/pixel
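With these two pixel sizes, each CORINE cell covers 300 / 20 = 15 Sentinel-2 pixels per side. A minimal sketch of how a coarse label grid could be aligned with the image grid by nearest-neighbour replication follows; this is an assumed preprocessing step and the function names are hypothetical.

```python
from typing import List

LabelMap = List[List[int]]


def pixels_per_cell(image_m: float = 20.0, reference_m: float = 300.0) -> int:
    """Number of image pixels covered by one reference cell (area)."""
    side = reference_m / image_m
    if side != int(side):
        raise ValueError("reference cell size must be a multiple of the pixel size")
    return int(side) ** 2


def upsample_labels(coarse: LabelMap, factor: int) -> LabelMap:
    """Nearest-neighbour replication: each cell becomes a factor x factor block."""
    n_rows, n_cols = len(coarse), len(coarse[0])
    return [[coarse[r // factor][c // factor] for c in range(n_cols * factor)]
            for r in range(n_rows * factor)]
```

Every one of the pixels inside a cell inherits the same label, so the reference cannot describe anything smaller than a cell.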

22 classes of May-October images, with no cloud included

23 classes of June-August images, with cloud included

[Figure panels: input image, CORINE reference, inferred segmentation, at times T1 and T2]

* Model: e4_5_b7_g16

Limitations: Temporal consistency

Only confident on large stable areas, but fails in rich heterogeneous areas
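One simple way to quantify temporal consistency is the fraction of pixels that receive the same label at two dates. The sketch below is a hypothetical metric, not one reported in the slides; it assumes both prediction maps cover the same zone on the same grid.

```python
from typing import List

LabelMap = List[List[int]]


def temporal_agreement(pred_t1: LabelMap, pred_t2: LabelMap) -> float:
    """Fraction of pixels with the same predicted label at both dates."""
    if len(pred_t1) != len(pred_t2):
        raise ValueError("prediction maps must have the same number of rows")
    same = total = 0
    for row_1, row_2 in zip(pred_t1, pred_t2):
        if len(row_1) != len(row_2):
            raise ValueError("prediction maps must have the same number of columns")
        same += sum(a == b for a, b in zip(row_1, row_2))
        total += len(row_1)
    if total == 0:
        raise ValueError("prediction maps are empty")
    return same / total


def changed_mask(pred_t1: LabelMap, pred_t2: LabelMap) -> List[List[bool]]:
    """True where the predicted label differs between the two dates."""
    return [[a != b for a, b in zip(row_1, row_2)]
            for row_1, row_2 in zip(pred_t1, pred_t2)]
```

Over a short period the land cover itself rarely changes, so a low agreement score mostly reflects prediction instability.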

Estimation at the coarse reference level using SegNets

The multispectral approach performs better while requiring fewer images


Semantic segmentation in the absence of ground truth:

● Resorting to generative models to learn the data distribution: fits the used sensors

● Fine-tuning over some annotated regions for more specific applications (semantic segmentation, classification…)

Finalize the current work on poorly annotated databases:

● Enriching the currently used models (dense, ResNet..) with larger datasets

● Prediction stability study over the same zones within a small period of time

● “Could multimedia approaches help remote sensing analysis?,” in Conference Image Information Mining: Earth Observation meets Multimedia. IIM, 2015.

● “Deep learning approach for remote sensing image analysis,” in Conference on Big Data from Space. BIDS, 2016.

● “Deep learning for semantic segmentation of remote sensing images with rich spectral content,” IGARSS 2017

● “Three dimensional Deep Learning approach for remote sensing image classification,” TGARS

[1] A. Ben Hamida, A. Benoit, P. Lambert, and C. Ben Amar, “Could multimedia approaches help remote sensing analysis?,” in Conference Image Information Mining: Earth Observation meets Multimedia. IIM, 2015.

[2] A. Ben Hamida, A. Benoit, P. Lambert, and C. Ben Amar, “Deep learning approach for remote sensing image analysis,” in Conference on Big Data from Space. BIDS, 2016.

[3] Xiaorui Ma, Jie Geng, and Hongyu Wang, “Hyperspectral image classification via contextual deep learning,” EURASIP Journal on Image and Video Processing, vol. 2015, no. 1, pp. 1–12, 2015.

[4] S. Lefevre, L. Chapel, and F. Merciol, “Hyperspectral image classification from multiscale description with constrained connectivity and metric learning,” in 6th International Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing. Lausanne, Switzerland, 2014.

“Rachel”

Represent each image as a histogram of different visual words

Bag of Visual Words (BoVW)

Unsupervised visual word clustering (SIFT/SURF)

Supervised tile classification

Training with annotations: training tiles + labels

Feature extraction & visual word matching

Supervised classifier (SVM, KNN, etc.)

Test phase: test tile, label: "Freeway"
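The histogram representation at the heart of BoVW can be sketched in a few lines. This is a minimal illustration under stated assumptions: the codebook is taken as given (in practice it would come from clustering SIFT or SURF descriptors), the toy descriptors are 2-D rather than real SIFT/SURF vectors, and the function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def nearest_word(descriptor: Vector, codebook: Sequence[Vector]) -> int:
    """Index of the visual word closest to the descriptor (Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: math.dist(descriptor, codebook[k]))


def bovw_histogram(descriptors: Sequence[Vector],
                   codebook: Sequence[Vector]) -> List[float]:
    """Normalised histogram of visual word occurrences for one tile."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)  # tile without any descriptor
    return [count / total for count in counts]


if __name__ == "__main__":
    toy_codebook = [[0.0, 0.0], [10.0, 10.0]]  # two visual words
    toy_descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0], [1.0, 1.0]]
    print(bovw_histogram(toy_descriptors, toy_codebook))
```

The resulting fixed-length histogram is what a supervised classifier such as an SVM or KNN would take as input for each tile.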