PowerPoint-Präsentation Title: PowerPoint-Präsentation Author: Norbert P. Noreiks Created Date:...

17
Two examples of Machine Learning in Geoscience: Self-Organizing Maps (SOMs) & Feed-Forward Networks (FFNs) Lydia Keppler PhD candidate (she/her) Observations, Analysis and Synthesis (OAS) Director's Research Group (DRO)

Transcript of PowerPoint-Präsentation Title: PowerPoint-Präsentation Author: Norbert P. Noreiks Created Date:...

Page 1: PowerPoint-Präsentation Title: PowerPoint-Präsentation Author: Norbert P. Noreiks Created Date: 12/10/2019 2:00:45 PM

Two examples of Machine Learning in Geoscience:

Self-Organizing Maps (SOMs) & Feed-Forward Networks (FFNs)

Lydia KepplerPhD candidate (she/her)

Observations, Analysis and Synthesis (OAS)Director's Research Group (DRO)

Page 2: PowerPoint-Präsentation Title: PowerPoint-Präsentation Author: Norbert P. Noreiks Created Date: 12/10/2019 2:00:45 PM

Clustering: Self-Organising Maps (SOMs)

SOMs to cluster data into regions of similar properties (Kohonen 1987, 2001)

Figure adapted from Keppler et al. (in prep.) 2

Page 3: PowerPoint-Präsentation Title: PowerPoint-Präsentation Author: Norbert P. Noreiks Created Date: 12/10/2019 2:00:45 PM

Where to build a pizzeria

Giovanni wants to build three pizzarias near the MPI-M

Where should he put them?

3

Page 4: PowerPoint-Präsentation Title: PowerPoint-Präsentation Author: Norbert P. Noreiks Created Date: 12/10/2019 2:00:45 PM

Where to build a pizzeria

people who eat a lot of pizza

3

Giovanni wants to build three pizzarias near the MPI-M

Where should he put them?

Page 5: PowerPoint-Präsentation Title: PowerPoint-Präsentation Author: Norbert P. Noreiks Created Date: 12/10/2019 2:00:45 PM

Where to build a pizzeria

We start with three random locations for the pizzeria

3

Page 6: PowerPoint-Präsentation Title: PowerPoint-Präsentation Author: Norbert P. Noreiks Created Date: 12/10/2019 2:00:45 PM

Where to build a pizzeria

We start with three random locations for the pizzeria

Everyone goes to the one that is closest

3

Page 7: PowerPoint-Präsentation Title: PowerPoint-Präsentation Author: Norbert P. Noreiks Created Date: 12/10/2019 2:00:45 PM

Where to build a pizzeria

We move the pizzaria to the center of the houses

3

Page 8: PowerPoint-Präsentation Title: PowerPoint-Präsentation Author: Norbert P. Noreiks Created Date: 12/10/2019 2:00:45 PM

Where to build a pizzeria

We re-adjust the closest pizzarias for each house

3

Page 9: PowerPoint-Präsentation Title: PowerPoint-Präsentation Author: Norbert P. Noreiks Created Date: 12/10/2019 2:00:45 PM

Where to build a pizzeria

We repeat this process until the distances do not get smaller anymore

3

Page 10: PowerPoint-Präsentation Title: PowerPoint-Präsentation Author: Norbert P. Noreiks Created Date: 12/10/2019 2:00:45 PM

SOM-clustering with non-pizza variables

We can e.g., use normalized temperature and salinity to cluster the ocean into water masses

1

0

tem

per

atu

re

(no

rma l

ized

)

1salinity (normalized)

4

Page 11: PowerPoint-Präsentation Title: PowerPoint-Präsentation Author: Norbert P. Noreiks Created Date: 12/10/2019 2:00:45 PM

SOM-clustering with non-pizza variables

Example: temperature and salinity at 10 m as input to SOMs

salinity

temperature (°C)

cluster number

6

Page 12: PowerPoint-Präsentation Title: PowerPoint-Präsentation Author: Norbert P. Noreiks Created Date: 12/10/2019 2:00:45 PM

Feed-Forward Networks (FFNs)

• FFNs compute and apply statistical relationships between multiple predictor and target variables to approximate a function

→ like a MLR, but the relationships don’t have to be linear

• Here: from sparse data with gaps to mapped data

DIC (µmol kg-1)

Figure adapted from Keppler et al. (in prep.) 7

Page 13: PowerPoint-Präsentation Title: PowerPoint-Präsentation Author: Norbert P. Noreiks Created Date: 12/10/2019 2:00:45 PM

Feed-Forward Networks (FFNs)

• We have sparse ship data which we want to have mapped (target data)

• We need predictor data (mapped global data; e.g., temperature / salinity)

• The network establishes the statistical relationship between the predictor and the target data and then applies this relationship to map the target data

target data predictor data

8

DIC (µmol kg-1) temperature (°C) salinity

Page 14: PowerPoint-Präsentation Title: PowerPoint-Präsentation Author: Norbert P. Noreiks Created Date: 12/10/2019 2:00:45 PM

Establishing the relationship (training the FFN)

input

neurons

output

weighted

connections

weighted

connections

Figure adapted from Olden& Jackson, 2002

9

Page 15: PowerPoint-Präsentation Title: PowerPoint-Präsentation Author: Norbert P. Noreiks Created Date: 12/10/2019 2:00:45 PM

Applying the relationship

• Now the established relationship between the target data and the predictor data is applied to map the target data globally (disclaimer: the resulting field is not realistic, because of the simplified set-up, e.g. only SST and SSS as predictors; testing of the result is important)

predictor data

temp

erature

salinity

trained network

10

DIC (µmol kg-1)

output

Page 16: PowerPoint-Präsentation Title: PowerPoint-Präsentation Author: Norbert P. Noreiks Created Date: 12/10/2019 2:00:45 PM

Understanding the resonse

• We can see how each of the predictors contributes to the FFN (Similar to profile method by Gevrey et al., (2003))

• We train the network as usual, and in the simulation-step, we hold one predictor constant in time, and vary the others (iteratively for all predictors)

• We get the change in DIC due to each predictor

a

temperate 35°N to 65°N

b

subtropical23°N to 35°N

11

Page 17: PowerPoint-Präsentation Title: PowerPoint-Präsentation Author: Norbert P. Noreiks Created Date: 12/10/2019 2:00:45 PM

Summary

SOMs can cluster data into (e.g., into regions of similar properties)

FFNs can compute and apply statistical relationsships between multiple predictor and target variables to approximate a function

12