Planet Labels: How do we use our Earth?
Rishabh Bhargava , Timon Ruban, Vincent Sitzmann
Dataset
We use high-resolution satellite images to classify the land visible according to its cover, such as forest, water or urban land. The input to
our algorithms is a tile of 128x128 pixels cut out from a satellite image. Tiles are combined with labels from the 2011 National Land Cover
Dataset (NLCD) to yield a supervised data set. Random forests outperform a number of models yielding 92.45% accuracy on a test set.
Methodology
128 px
128 px
MODE
of all pixel
labels
NLCD labelsImage Tile
+
Satellite images of California from RapidEye satellites from
Planet Labs Explorer[1] program
5m x 5m resolution for each pixel
5 bands: B, G, R, Red Edge, Near Infrared
Tiles of size 128 pixels x 128 pixels extracted
Tiles matched with US government’s 2011 National Land
Cover Database[2] in a manner demonstrated below
Land cover classes selected on basis of Anderson’s Land
Cover Classification System[3]:
Water, Developed, Barren, Forest, Shrubland,
Herbaceous, Cultivated, Wetland
Only tiles where more than 90% of pixels belong to single
class were considered
Total dataset size: 78,082 tiles
Dataset split:
20% Test set
20% Validation set
60% Training set
Features used:
Color: HSV Histograms (convert from RGB)
Near Infrared / Red Edge Histograms
Texture: Gabor filters
Tested models:
Logistic regression with L2 regularization
Support Vector Machines with RBF Kernels
Random Forests with 500, 1000 trees and p = 𝑛predictors per tree
Convolutional Neural Networks with LeNet architecture
Random Forests outperformed all other approaches:
Logistic Regression incurred high bias (high training
and test error)
SVMs with suffered from heavy overfitting and long
training times
Lowest accuracies for classes Herbaceous, Shrubland and
Barren were observed
Water and Forest classes had the best accuracy.
Improve granularity of results by adopting per-pixel
approach
Update the NLCD dataset with RapidEye imagery
Create land cover databases for different countries using
latest satellite images
Discover temporal changes in land cover – this would help in
identifying illegal deforestation activities, patterns in urban
land growth, and shrinkage of water bodies
References
Model Parameters Validation Error
Logistic Regression L2 Regularization 0.2316
Random Forest 100 trees 0.0780
1000 trees 0.0755
Potential Next Steps
Random Forest Confusion Matrix Error Rates of
different feature vectors
Learning CurveResults
Validation set errors for different models and their parameters
[1] Owned by Planet Labs, in the frame of the Planet Labs Explorers Program: www.planet.com/explorers
[2] Homer, C.G., Dewitz, J.A., Yang, L., Jin, S., Danielson, P., Xian, G., Coulston, J., Herold, N.D., Wickham, J.D., and Megown, K., 2015, Completion of the 2011 National Land Cover Database for the conterminous United States-Representing a decade of land cover change information. Photogrammetric Engineering and Remote Sensing, v. 81, no. 5, p. 345-354
[3] Anderson, James Richard. A land use and land cover classification system for use with remote sensor data. Vol. 964. US Government Printing Office, 1976.
0.6
0.65
0.7
0.75
0.8
0.85
0.9
0.95
1
Precision
Recall
-0.02
0
0.02
0.04
0.06
0.08
0.1
0.12
9360 18721 28082 37443 46804
Training Set Size
Training Error
Test Error
0
5000
10000
15000
20000
25000
30000
Num
ber
of T
iles
Land Cover Class
0
0.02
0.04
0.06
0.08
0.1
0.12
0.14
0.16
0.18
0.2
hsv_hist hsv_hist +re_nir
hsv_hist +re_nir + gab
Precision and Recall of land cover classes
RapidEye image overlaid with land cover predictions
RapidEye satellite image
Top Related