Mapping and classification of spatial data using machine learning: algorithms and software tools


Vadim Timonin – Institute of Geomatics and Risk Analysis (IGAR), University of Lausanne, Switzerland
Intelligent Analysis of Environmental Data (S4 ENVISA Workshop 2009)


Institute of Geomatics and Analysis of Risk, University of Lausanne, Switzerland

Vadim Timonin

Vadim.Timonin@UNIL.ch

Mapping and classification of spatial data using machine learning: algorithms and software tools

Contents

1. Short description of the Machine Learning Office

2. SIC 2004: Application to the automatic cartography of radioactivity

3. Case study: Wind fields mapping with neural networks and a regularization technique.

Machine Learning Office

Part of the book (EPFL Press, June 2009).

Practical work session using the Machine Learning software:
June 20, 09:00 – 12:00, Room T120

Machine Learning Office: Supervised

Regression:
• Multilayer Perceptron (MLP)
• General Regression Neural Networks (GRNN)
• Radial Basis Function Neural Networks (RBFNN)
• K-Nearest Neighbour (KNN)
• Support Vector Regression (SVR)

Classification:
• Multilayer Perceptron (MLP)
• Probabilistic Neural Networks (PNN)
• K-Nearest Neighbour (KNN)
• Support Vector Machines (SVM)

Machine Learning Office: Unsupervised

Clustering & density estimation:
• K-Means & EM algorithms
• Gaussian Mixture Model (GMM)
• Self-Organizing (Kohonen) Maps (SOM)

Machine Learning Office: Mixture of supervised and unsupervised

Joint density estimation:
• Mixture Density Networks (MDN)

Automatic Mapping of Pollution Data

The procedure should be:

1. Simple, without difficult tuning of the models (so that it can be used by non-experts in machine learning).

2. The result should be unique (it must not depend on the training algorithm, initial values, etc.).

Good candidates (a minimal sketch of this idea follows below):

1. KNN
2. GRNN / PNN

Not so good candidates (?):

1. MLP
2. RBFNN
3. SVM / SVR
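GRNN (and similarly KNN) fits these requirements because its only free parameter is the kernel width, it needs no random initialization, and that single parameter can be tuned by cross-validation, so the resulting map is unique. Below is a minimal, generic NumPy sketch of GRNN-style automatic mapping; grnn_predict, tune_sigma and the made-up data are illustrative assumptions, not the Machine Learning Office code.

```python
# Minimal sketch of GRNN-style automatic mapping (Nadaraya-Watson kernel
# regression). The only parameter, the kernel width sigma, is tuned by
# leave-one-out cross-validation, so the result is deterministic.
import numpy as np

def grnn_predict(x_train, y_train, x_query, sigma):
    """Predict at x_query as a kernel-weighted average of training values."""
    d2 = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return (w @ y_train) / np.maximum(w.sum(axis=1), 1e-300)

def tune_sigma(x_train, y_train, sigmas):
    """Leave-one-out cross-validation over a grid of kernel widths."""
    best_sigma, best_mae = None, np.inf
    for s in sigmas:
        errors = []
        for i in range(len(x_train)):
            mask = np.arange(len(x_train)) != i
            pred = grnn_predict(x_train[mask], y_train[mask], x_train[i:i+1], s)
            errors.append(abs(pred[0] - y_train[i]))
        mae = np.mean(errors)
        if mae < best_mae:
            best_sigma, best_mae = s, mae
    return best_sigma

# Hypothetical usage mirroring the SIC 2004 setting (200 observations,
# 1008 prediction locations); coordinates and values are made up here.
rng = np.random.default_rng(0)
x_obs, y_obs = rng.uniform(0, 100, (200, 2)), rng.gamma(2.0, 40.0, 200)
x_grid = rng.uniform(0, 100, (1008, 2))
sigma = tune_sigma(x_obs, y_obs, sigmas=np.logspace(-1, 2, 20))
y_map = grnn_predict(x_obs, y_obs, x_grid, sigma)
```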

http://www.ai-geostats.org/

Official report: Dubois G. (Ed.), Automatic mapping algorithms for routine and emergency monitoring data. EUR 21595 EN, Office for Official Publications of the European Communities, Luxembourg, 150 p., November 2005.

Automatic Mapping with Prior Knowledge in Situations of Routine and Emergency

Spatial Interpolation Comparison 2004

Introduction

Description of the concept of SIC 2004: participants are invited, using 200 observations (circles, left), to estimate (predict) the values at 1008 locations (crosses, right).

Results of the GRNN models with cross-validation tuning

Routine scenario

Emergency (joker) scenario

Epicentre of accident (hot spot)

Results

The following table presents the participants' results for each of the two scenarios (routine and emergency).

The results are sorted by the Mean Absolute Error (MAE) obtained in the emergency scenario. The other statistics shown in the table are the Mean Error (ME), which assesses the bias of the results, the Root Mean Squared Error (RMSE), and Pearson's correlation coefficient (Ro) between true and estimated values.
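For reference, these four statistics can be computed as follows (a minimal NumPy sketch using the standard definitions; sic_statistics is a hypothetical helper, not part of the exercise software):

```python
# Comparison statistics as used in the SIC 2004 table (standard definitions):
# y_true are the withheld values, y_pred the participant's estimates.
import numpy as np

def sic_statistics(y_true, y_pred):
    e = np.asarray(y_pred, float) - np.asarray(y_true, float)
    return {
        "MAE":  np.mean(np.abs(e)),                 # mean absolute error
        "ME":   np.mean(e),                         # mean error (bias)
        "RMSE": np.sqrt(np.mean(e ** 2)),           # root mean squared error
        "Ro":   np.corrcoef(y_true, y_pred)[0, 1],  # Pearson correlation
    }
```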

• GEOSTATS denotes geostatistical techniques
• NN – neural networks
• SVM – support vector machines

In each column, the best results have been bolded.

Results of the SIC 2004 exercise

Participant | Method | MAE (routine, joker) | ME (routine, joker) | RMSE (routine, joker) | Ro (routine, joker)

Timonin NN 9.40 14.85 -1.25 -0.51 12.59 45.46 0.78 0.84

Fournier GEOSTATS 9.06 16.22 -1.32 -8.58 12.43 81.44 0.79 0.27

Pozdnoukhov SVM 9.22 16.25 -0.04 -6.70 12.47 81.00 0.79 0.28

Saveliev SPLINES 9.60 17.00 3.00 10.40 13.00 82.20 0.77 0.23

Dutta NN 9.92 17.50 0.20 5.10 13.10 80.60 0.76 0.29

Ingram GEOSTATS 9.10 18.55 -1.27 -4.64 12.46 54.22 0.79 0.86

Hofierka SPLINES 9.10 18.62 -1.30 0.41 12.51 73.68 0.79 0.50

Fournier GEOSTATS 9.22 19.43 -0.89 -0.22 12.51 73.50 0.78 0.48

Fournier OTHERS 9.29 19.44 -1.12 -0.12 12.56 71.87 0.78 0.53

Savelieva GEOSTATS 9.11 19.68 -1.39 -2.18 12.49 69.08 0.78 0.56

Palaseanu GEOSTATS 9.05 19.76 1.40 2.33 12.46 74.54 0.79 0.50

Rigol S. NN 12.10 20.30 -1.20 -9.40 15.80 84.10 0.67 0.12

Pebesma GEOSTATS 9.11 20.83 -1.22 0.92 12.44 73.73 0.79 0.50

Pebesma OTHERS 9.94 21.03 -1.35 4.50 13.32 72.12 0.78 0.51

Ingram GEOSTATS 9.08 21.77 -1.44 0.72 12.47 79.57 0.79 0.35

Lophaven GEOSTATS 9.70 22.20 1.20 -4.10 13.10 71.20 0.76 0.54

Saveliev SPLINES 9.30 22.20 1.60 0.60 12.60 76.40 0.78 0.41

Ingram GEOSTATS 9.47 22.53 -1.15 3.09 12.75 79.16 0.78 0.33

Pebesma GEOSTATS 9.11 23.26 -1.22 4.00 12.44 76.19 0.79 0.42

Rigol S. NN 16.00 25.30 -1.70 -11.10 20.80 87.50 0.55 0.02

Hofierka SPLINES 9.38 26.52 -1.27 4.29 12.68 77.98 0.78 0.38

Dutta NN 9.62 28.20 0.90 -0.22 12.70 80.10 0.78 0.31

Pebesma GEOSTATS 9.11 28.45 -1.22 12.01 12.44 81.41 0.79 0.38

Dutta NN 12.20 28.90 1.50 -1.29 15.90 79.90 0.64 0.33

Rigol S. NN 21.40 30.50 5.30 3.80 45.80 96.60 0.24 0.20

Ingram NN 9.72 38.29 -1.54 8.38 13.00 84.24 0.76 0.30

Dutta NN 9.93 38.50 2.18 17.98 13.30 87.30 0.76 0.27

Ingram NN 9.48 48.41 -1.22 -3.01 12.73 90.89 0.78 0.38

Pebesma GEOSTATS 9.11 146.36 -1.22 19.71 12.44 212.10 0.79 -0.27

Modeling of wind fields with MLP and a regularization technique
(pp. 168–172 of the book)

Monitoring network: 111 stations in Switzerland (80 for training + 31 for validation)

Mapping of daily:
• Mean speed
• Maximum gust
• Average direction

Input information:
• X, Y geographical coordinates
• DEM (resolution 500 m)
• 23 DEM-based "geo-features"
In total: 26 features.

Training of the MLP

Model: MLP 26-20-20-3

Training:
• Random initialization
• 500 iterations of the RPROP algorithm
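A rough sketch of this setup is shown below (illustrative PyTorch code, not the original Machine Learning Office implementation; the tanh activations, MSE loss and placeholder data are assumptions, since the slides only specify the 26-20-20-3 architecture, random initialization and 500 RPROP iterations):

```python
# MLP 26-20-20-3: 26 geo-features in, 3 wind targets out (mean speed,
# maximum gust, average direction), trained full-batch with RPROP.
import torch
import torch.nn as nn

torch.manual_seed(0)                       # random weight initialization
model = nn.Sequential(
    nn.Linear(26, 20), nn.Tanh(),
    nn.Linear(20, 20), nn.Tanh(),
    nn.Linear(20, 3),
)
optimizer = torch.optim.Rprop(model.parameters())
loss_fn = nn.MSELoss()

X_train = torch.randn(80, 26)              # placeholder for the 26 input features
Y_train = torch.randn(80, 3)               # placeholder for the 3 wind targets

for _ in range(500):                       # 500 RPROP iterations (full batch)
    optimizer.zero_grad()
    loss = loss_fn(model(X_train), Y_train)
    loss.backward()
    optimizer.step()
```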

Results: naïve approach

Results: noise injection regularization

Results: summary

Noise injection regularization

Without regularization (overfitting)
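Assuming "noise injection" here means adding fresh random noise to the training inputs at every iteration (a common interpretation of the technique), a self-contained sketch of the regularized training loop could look like this; the noise level and the helper name train_with_noise_injection are hypothetical:

```python
# Noise injection regularization: perturb the inputs with fresh Gaussian
# noise at every iteration, which discourages the MLP from memorizing the
# 80 training stations (i.e. from overfitting).
import torch
import torch.nn as nn

def train_with_noise_injection(model, X, Y, iterations=500, noise_std=0.1):
    optimizer = torch.optim.Rprop(model.parameters())
    loss_fn = nn.MSELoss()
    for _ in range(iterations):
        optimizer.zero_grad()
        X_noisy = X + noise_std * torch.randn_like(X)   # fresh noise each pass
        loss = loss_fn(model(X_noisy), Y)
        loss.backward()
        optimizer.step()
    return model
```

With the MLP from the previous sketch, calling train_with_noise_injection(model, X_train, Y_train) would replace the plain training loop.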

Next stop:
June 20, 09:00 – 12:00, Room T120
Practical work session using the Machine Learning software

Thank you for your attention!