Learning Process-Based Models of Dynamic Systems

Nikola Simidjievski, Jozef Stefan Institute, Slovenia

HIPEAC 2014 LJUBLJANA

Introduction

• Equation Discovery (ED) is a subfield of machine learning that deals with inducing scientific laws and models, in the form of equations, from observations.

• In the context of modeling system dynamics, the observations are time series and the models take the form of ordinary differential equations (ODEs).

• Process-Based Modeling (PBM) is an ED approach that integrates domain-specific modeling knowledge and data into explanatory models of the observed systems.

• Using modeling knowledge formulated in a library, together with observed data from the system at hand, this approach induces process-based models: accurate, understandable and modular representations of the observed system dynamics.

Process-based models

• Conceptual (high-level) representation of system dynamics.

• Process-based models consist of entities and processes.

• Entities and processes represent specific components and interactions observed in the system.

• Entities represent the state (variables) of the system.

• Processes represent the interactions between the entities.

• Knowledge is represented as a library of entity and process templates.

The task of learning process-based models

• Determining the structure of the model (ODE): heuristic or exhaustive search over the space of suitable candidate models.

• Parameter estimation: finding values which minimize the difference (error) between simulated and measured (real) data.

• 3 inputs:

Library (domain specific)

Conceptual model (problem specific)

Data (task specific)
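As a rough illustration of the parameter estimation step, the sketch below fits the two parameters of a single ODE to a time series by minimizing the error between simulated and measured values. Everything in it is an assumption made for the example: the logistic growth equation stands in for a candidate model structure, forward Euler stands in for the simulator, plain random search stands in for the optimizer, and the names `simulate`, `rmse` and `estimate` are hypothetical. It is not a description of how ProBMoT does this.

```python
import math
import random


def simulate(r, k, x0, dt, steps):
    """Forward-Euler simulation of dx/dt = r * x * (1 - x / k)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x + dt * r * x * (1.0 - x / k))
    return xs


def rmse(simulated, measured):
    """Root mean squared error between two equally long series."""
    squared = sum((s - m) ** 2 for s, m in zip(simulated, measured))
    return math.sqrt(squared / len(measured))


def estimate(measured, x0, dt, bounds, trials, seed=0):
    """Random search: sample parameters within bounds, keep the best pair."""
    rng = random.Random(seed)
    best_params, best_error = None, float("inf")
    for _ in range(trials):
        r = rng.uniform(*bounds["r"])
        k = rng.uniform(*bounds["k"])
        error = rmse(simulate(r, k, x0, dt, len(measured) - 1), measured)
        if error < best_error:
            best_params, best_error = (r, k), error
    return best_params, best_error


# Synthetic "measurements" generated from known parameters (r=0.5, k=10).
measured = simulate(0.5, 10.0, x0=1.0, dt=0.1, steps=100)
params, error = estimate(
    measured, x0=1.0, dt=0.1,
    bounds={"r": (0.1, 1.0), "k": (5.0, 15.0)}, trials=2000,
)
print("estimated (r, k):", params, "error:", error)
```

Each trial costs one full simulation, which is why the number of simulations per structure dominates the running time of this kind of search.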

Process-Based Library

Nataša Atanasova et al., Constructing a library of domain knowledge for automated modelling of aquatic ecosystems, Ecological Modelling, 2006

Library of domain-specific modeling knowledge

Conceptual Model

ProBMoT (Čerepnalkoski, Simidjievski, Tanevski et al.)

ProBMoT (Process-Based Modeling Tool): a tool for complete modeling, parameter estimation and simulation of process-based models

Darko Čerepnalkoski et al., The influence of parameter fitting methods on model structure selection in automated modeling of aquatic ecosystems, EM 2012

The Process

[Diagram, built up over several slides: the Library and the Conceptual Model feed the Model generator, which produces the specific model structures (~30,000). Parameter Estimation fits each structure to the Measurements (~50K-100K simulations per structure) and returns error values, from which the Best Model is chosen and passed to Validation. The whole pipeline is labeled as one Job.]
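The model generation and selection steps of the pipeline can be sketched as follows. This is a minimal illustration under stated assumptions: the three-slot `library` dictionary, its template names, and the functions `generate_structures` and `select_best` are hypothetical, and the toy error function stands in for the error obtained after fitting each structure's parameters.

```python
from itertools import product

# Hypothetical library fragment: each process slot offers alternative templates.
library = {
    "growth": ["exponential", "logistic"],
    "nutrient_limitation": ["none", "monod", "monod_squared"],
    "loss": ["linear_mortality", "quadratic_mortality"],
}


def generate_structures(library):
    """Enumerate every combination of one template per process slot."""
    slots = sorted(library)
    for choice in product(*(library[slot] for slot in slots)):
        yield dict(zip(slots, choice))


def select_best(structures, error_of):
    """Return the structure with the lowest error."""
    return min(structures, key=error_of)


structures = list(generate_structures(library))
print(len(structures), "candidate structures")  # 2 * 3 * 2 combinations

# Toy error: pretend one particular structure fits the data best.
target = {"growth": "logistic", "loss": "linear_mortality",
          "nutrient_limitation": "monod"}
best = select_best(structures, lambda s: 0.0 if s == target else 1.0)
print("best structure:", best)
```

The number of candidates is the product of the alternatives per slot, which is how a modest library can expand into tens of thousands of structures.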

Let's complicate the story a little bit…

Ensembles

• Improved predictive performance.

• Better address the complexity of real systems: a combination of base models gives a better description of the observed behavior.

[Diagram: the training set T is sampled into T1, T2, …, TN; the learning algorithm is applied to each sample to produce Model M1, Model M2, …, Model MN, which together form the Ensemble.]

Ensembles of Process-based models

• Base models are homogeneous.

• Training data is represented as a time series.

• Each base model is trained on a different sample of the data (bagging).

• 2-D space of candidate models: every ensemble iteration (replica) generates a list of base models.
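A minimal sketch of bagging on a time series, assuming plain bootstrap sampling of time points with replacement and averaging of the base model simulations. The exponential growth base model, the synthetic data, and the names `bootstrap_sample`, `fit_rate` and `ensemble_predict` are all hypothetical; the sampling and combination schemes actually used for process-based models may differ.

```python
import math
import random


def bootstrap_sample(n_points, rng):
    """Draw n_points time indices with replacement (one bagging replica)."""
    return [rng.randrange(n_points) for _ in range(n_points)]


def fit_rate(times, values, x0, indices):
    """Least-squares fit of r in x(t) = x0 * exp(r * t) on the sampled points."""
    num = sum(times[i] * math.log(values[i] / x0) for i in indices)
    den = sum(times[i] ** 2 for i in indices)
    return num / den


def ensemble_predict(rates, x0, t):
    """Average the simulations of all base models at time t."""
    return sum(x0 * math.exp(r * t) for r in rates) / len(rates)


rng = random.Random(42)
x0 = 2.0
times = [0.5 * i for i in range(1, 21)]
# Synthetic noisy measurements of exponential growth with true rate 0.3.
values = [x0 * math.exp(0.3 * t) * (1 + rng.uniform(-0.05, 0.05)) for t in times]

# 100 ensemble iterations: one base model per bootstrap replica.
rates = [fit_rate(times, values, x0, bootstrap_sample(len(times), rng))
         for _ in range(100)]
prediction = ensemble_predict(rates, x0, 4.0)
print("ensemble prediction at t=4:", prediction)
```

Because every replica sees a slightly different sample, the fitted base models differ, and averaging them smooths out the effect of individual noisy measurements.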

[Diagram, built up over several slides: each ensemble iteration draws its own sample of the measurements and runs the full pipeline (Library and Conceptual Model → Model generator → Parameter Estimation) on it. With, let's say, 100 iterations, this yields Best Model 1, Best Model 2, …, Best Model 100, followed by Validation.]

Ensembles of Dynamic Systems

[Diagram: Best Model 1, Best Model 2, …, Best Model 100 are combined into the ensemble model. Each pipeline run on one sample of the measurements is labeled as a Job.]

The “fun” stuff…

.xRSL

&
(executable = "run_[JOB]##.sh")
(jobname = "Awsome_[JOB]##")
(stdout = "[JOB]##.out")
(stderr = "[JOB]##.err")
(inputfiles = ("[JOB]##.tar" ""))
(outputfiles = ("[JOB]##.tgz" ""))
(cputime = "30 days")
(memory = "2560")

run_[JOB]##.sh

#!/bin/bash
# Unpack the job bundle and run ProBMoT on the task specification.
tar -xf [JOB]##.tar
cd JOB_workingDir
java -Xms256m -Xmx2048m -jar PROBMOT.jar task/PROBMO_TaskSpec.xml
# Pack the results and logs into the archive declared in outputfiles.
cd ..
tar czf [JOB]##.tgz [JOB]##/out/ [JOB]##/*.log
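With thousands of jobs, the job descriptions are usually generated rather than written by hand. The sketch below shows one way this could be done, assuming that the `[JOB]##` placeholder stands for a job name followed by a two-digit number; the `render_xrsl` function, the `job` prefix, and the numbering scheme are hypothetical and not taken from the slides.

```python
XRSL_TEMPLATE = """&
(executable = "run_{job}.sh")
(jobname = "Awsome_{job}")
(stdout = "{job}.out")
(stderr = "{job}.err")
(inputfiles = ("{job}.tar" ""))
(outputfiles = ("{job}.tgz" ""))
(cputime = "30 days")
(memory = "2560")
"""


def render_xrsl(job_number, prefix="job"):
    """Fill the template for one job, e.g. job_number=7 gives 'job07'."""
    job = f"{prefix}{job_number:02d}"
    return XRSL_TEMPLATE.format(job=job)


# Render the descriptions for the first three (hypothetical) jobs.
descriptions = {n: render_xrsl(n) for n in range(1, 4)}
print(descriptions[1])
```

Writing each rendered string to its own file and submitting it is left out, since the submission command depends on the grid middleware in use.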

Case Study: Population Dynamics in Lake Ecosystems

Modeling phytoplankton dynamics in lake ecosystems (1 ODE)

Lake Bled (Slovenia)

Lake Kasumigaura (Japan)

Lake Zurich (Switzerland)

Lake Walensee (Switzerland)

Experiments

Experimental size:

• 100 ensemble iterations

• 4 different lake domains (Lake Bled, Lake Kasumigaura, Lake Zurich, Lake Walensee)

• Total of 40 different modeling scenarios

• ~200 model structures per scenario

• 50,000 iterations in the parameter estimation phase per structure

• 1 model (structure identification + parameter estimation) = 2-3 minutes

=> Total time of the whole experiment: ~1,000,000 min, or ~2 years

We did it in ….. 3 days
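The two headline figures can be related with a few lines of arithmetic. The sketch below takes the quoted ~1,000,000 minutes and 3 days at face value; the variable names are illustrative, and the resulting speedup is only an order-of-magnitude estimate.

```python
MINUTES_PER_DAY = 24 * 60

sequential_minutes = 1_000_000          # quoted sequential estimate
grid_minutes = 3 * MINUTES_PER_DAY      # quoted wall-clock time on the grid

sequential_days = sequential_minutes / MINUTES_PER_DAY
sequential_years = sequential_days / 365
speedup = sequential_minutes / grid_minutes

print(f"sequential: {sequential_days:.0f} days ({sequential_years:.1f} years)")
print(f"effective speedup: about {speedup:.0f}x")
```

In other words, finishing in 3 days corresponds to keeping a couple of hundred jobs running in parallel on average.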

Nikola Simidjievski et al., Learning ensembles of population dynamics models and their application to modelling aquatic ecosystems, EM 2014

Thank You

Questions

www.probmot.ijs.si
