International Workshop

on

Spatio-Temporal Statistics

Imperial College London

18–20 April 2016

Organisers:

Almut Veraart

Mikko Pakkanen

The workshop is funded by A. Veraart’s Marie Curie FP7 Integration Grant (within the

7th European Union Framework Programme).

Programme

All sessions take place in Lecture Theatre 139, 1st floor of Huxley Building, unless

stated otherwise.

Monday, 18 April 2016

Registration, 8:30–8:50.

Morning Session, 8:50–12:15. Chair: Almut Veraart

8:50–9:00 Opening Remarks

9:00–9:35 Emanuele Giorgi

9:40–10:15 Ying Sun

10:15–11:00 Coffee Break

11:00–11:35 David Bolin

11:40–12:15 Gregoire Mariethoz

Lunch Break, 12:15–14:30.

Afternoon Session, 14:30–17:05. Chair: Ragnhild Noven

14:30–15:05 Michele Nguyen

15:10–15:45 Mikkel Bennedsen


16:30–17:05 Denis Allard

17:15–17:50 Poster Flash Talks

Poster Session, 18:00–20:00. Mathematics Common Room, 5th floor of Huxley

Building.

Tuesday, 19 April 2016

Morning Session, 9:00–12:15. Chair: Mikkel Bennedsen

9:00–9:35 Jesper Møller

9:40–10:15 Orimar Sauri


11:00–11:35 Sofia Olhede

11:40–12:15 Shahin Tavakoli


Afternoon Session, 14:30–17:45. Chair: Michele Nguyen

14:30–15:05 Sebastian Reich

15:10–15:45 Theresa Smith


16:30–17:05 Hajo Holzmann

17:10–17:45 Peter Tankov

Conference Dinner at 19:00 (for invited speakers).

Wednesday, 20 April 2016

Morning Session, 9:00–12:15. Chair: Mikko Pakkanen

9:00–9:35 Marc G. Genton

9:40–10:15 Krzysztof Podgorski


11:00–11:35 Ed Cohen

11:40–12:15 Marie-Colette van Lieshout


Afternoon Session, 14:30–17:15. Chair: Ed Cohen

14:30–15:05 Martin Schlather

15:10–15:45 Ragnhild Noven


16:30–17:05 Franz Kiraly

17:05–17:15 Closing Remarks

End of the Workshop, 17:15.

Information

Scope and Aims of the Workshop

This workshop brings together international experts in statistics and probability

theory to discuss recent advances in spatio-temporal statistics. Topics of particular

interest are:

• Stochastic and statistical modelling of spatio-temporal phenomena (including

time series analysis and spatial statistics)

• Statistical inference for hierarchical models

• Stochastic volatility modelling and inference (including in particular extensions

to a spatio-temporal set-up)

• Extreme value theory

• Image analysis

Venue

The workshop (including coffee breaks) takes place in Lecture Theatre 139, 1st floor

of Huxley Building (180 Queen’s Gate, London SW7 2RH) on the South Kensington

Campus of Imperial College London, except for the Poster Session (on Monday, 18

April, 18:00–20:00), which is held in the Mathematics Common Room, 5th floor of

Huxley Building. The invited speakers will receive additional information regarding

the conference dinner and arrangements for lunch breaks.

AHOI Network

This workshop is part of the activities of the AHOI Network. AHOI (AarHus,

Oslo, Imperial) is a collaborative network between researchers in Stochastics at

Aarhus University, University of Oslo and Imperial College London. The purpose

of the network is to foster basic research in the theory and applications of Ambit

Stochastics, a new field of mathematical stochastics that has its origin in the study

of turbulence, but is in fact of broad applicability in science, technology and finance,

in relation to modelling of spatio-temporal processes. For further information, visit:

https://sites.google.com/site/ahoinet/

Contact Details of the Organisers

Almut Veraart, email: [email protected]

Mikko Pakkanen, email: [email protected]

Postal Address: Department of Mathematics

Imperial College London

South Kensington Campus

London SW7 2AZ, UK

Abstracts of Talks

A Flexible Class of Non-separable Cross-Covariance Functions for Multivariate

Space-Time Data

Denis Allard L’Institut national de la recherche agronomique

Multivariate space-time data are increasingly available in various scientific disci-

plines. When analyzing these data, one of the key issues is to describe the multi-

variate space-time dependences. Under the Gaussian framework, one needs to pro-

pose relevant models for multivariate space-time covariance functions, i.e. matrix-

valued mappings with the additional requirement of non-negative definiteness. A

flexible parametric class of cross-covariance functions for multivariate space-time

Gaussian random fields is presented. Space-time components belong to the

(univariate) Gneiting class of space-time covariance functions, with Matérn or Cauchy

covariance functions in the spatial margins. The smoothness and scale parameters

can be different for each variable. Sufficient conditions for positive definiteness are

shown. A simulation study shows that the parameters of this model can be effi-

ciently estimated using weighted pairwise likelihood, which belongs to the class of

composite likelihood methods. The model is then illustrated on a French dataset of

weather variables.

Bootstrapping the roughness index of Brownian semistationary and related

processes

Mikkel Bennedsen Aarhus University

In this talk we are concerned with newly developed bootstrap methods for estimators

of the roughness index of a class of continuous, conditionally Gaussian, processes.

In particular, we present the local fractional bootstrap of Bennedsen, Hounyo, Lunde

and Pakkanen (The local fractional bootstrap, working paper, 2016) and its appli-

cation to the change-of-frequency estimator of the roughness index of the Brownian

semistationary process. The same method is applied in a semiparametric setup to

a different estimator of the roughness index in the spirit of Bennedsen (Semipara-

metric estimation and inference of the fractal index of a time series using the local

fractional bootstrap, working paper, 2016). We compare the methods and consider

some empirical applications.

Geostatistical Modelling Using Non-Gaussian Matérn Fields

David Bolin Chalmers University of Technology

We present a class of non-Gaussian spatial models useful for analysing geostatisti-

cal data. The models are constructed as solutions to stochastic partial differential

equations driven by generalized hyperbolic noise and are incorporated in a standard

geostatistical setting with irregularly spaced observations, measurement errors and

covariates. We present a likelihood-based parameter estimation method and discuss

various model extensions. Finally, an application to precipitation data is presented

and the models are compared with Gaussian and trans-Gaussian models.

Spatio-temporal processes in super-resolution microscopy imaging

Ed Cohen Imperial College London

Super-resolution microscopy is a collection of imaging techniques allowing experi-

menters to delve beyond classical resolution limits to image cellular structures in the

nanometer scale. The key element to the success of super-resolution techniques is the

stochastic blinking of fluorophores (light emitting molecules) allowing sparse subsets

to be localised with very high precision and then localizations collected across time

to build a spatial point pattern of molecular positions. In this talk I will present a

brief overview of super-res microscopy and show that while these blinking properties

are key to the technique they produce spurious artefacts that can hinder rigorous

analysis of underlying spatial structures. We present a new model for the observed

blinking of molecules in an imaging experiment and demonstrate its effectiveness in

estimating key characteristics of the blinking fluorophores, followed by a look at what

this model might unlock for the super-resolution community in the future. This is

joint work with Lekha Patel (Imperial), Ricardo Henriques (UCL) and Raimund

Ober (Texas A&M).

Tukey g-and-h Random Fields

Marc G. Genton King Abdullah University of Science and Technology

We propose a new class of trans-Gaussian random fields named Tukey g-and-h

(TGH) random fields to model non-Gaussian spatial data. The proposed TGH

random fields have extremely flexible marginal distributions, possibly skewed and/or

heavy-tailed, and, therefore, have a wide range of applications. The special for-

mulation of the TGH random field enables an automatic search for the most suit-

able transformation for the dataset of interest while estimating model parameters.

An efficient estimation procedure, based on maximum approximated likelihood, is

proposed and an extreme spatial outlier detection algorithm is formulated. The

probabilistic properties of the TGH random fields, such as second-order moments,

are investigated. Kriging and probabilistic prediction with TGH random fields are

developed along with prediction confidence intervals. The predictive performance of

TGH random fields is demonstrated through extensive simulation studies and an

application to a dataset of total precipitation in the south east of the United States.

The talk is based on joint work with Ganggang Xu.
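
For readers unfamiliar with the marginal transformation, the following is a minimal Python sketch of the classical Tukey g-and-h transform, $\tau_{g,h}(z) = g^{-1}(e^{gz} - 1)\,e^{hz^2/2}$ (with the limit $z\,e^{hz^2/2}$ at g = 0). The abstract does not state this formula. The function names, the parameter values and the use of independent standard normal draws in place of a spatially correlated Gaussian field are assumptions made purely for illustration.

```python
import math
import random


def tukey_gh(z, g=0.0, h=0.0):
    """Classical Tukey g-and-h transform of a single value z.

    g controls skewness, h >= 0 controls tail heaviness.
    With g = 0 and h = 0 the transform is the identity.
    """
    if h < 0:
        raise ValueError("h must be non-negative")
    # expm1 keeps precision when g * z is close to zero.
    skewed = z if g == 0 else math.expm1(g * z) / g
    return skewed * math.exp(h * z * z / 2.0)


def simulate_marginal(n, g, h, seed=0):
    """Hypothetical helper: transform n independent N(0, 1) draws.

    A real TGH random field would transform a correlated Gaussian
    field; independence is assumed here only to keep the sketch short.
    """
    rng = random.Random(seed)
    return [tukey_gh(rng.gauss(0.0, 1.0), g, h) for _ in range(n)]


if __name__ == "__main__":
    sample = simulate_marginal(5, g=0.5, h=0.1)
    print(sample)
```

Larger g pushes mass to the right tail and larger h thickens both tails, which is one way to see why the marginal distributions can be skewed and/or heavy-tailed.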

Geostatistical modelling of zero-inflated prevalence data

Emanuele Giorgi Lancaster University

When prevalence data are collected in areas where transmission of the disease in

question is highly restricted by environmental factors, the resulting spatial datasets

often exhibit an excessive number of sampled locations with no reported disease

cases by comparison with the standard binomial geostatistical model, even after

adjustment for covariate effects and both spatially structured and spatially inde-

pendent random effects. This behaviour, usually called zero-inflation, occurs when

some parts of the study-region are fundamentally unsuitable for transmission as

distinct from general extra-binomial variation. We propose two extensions of the

standard geostatistical model for prevalence data that exhibit zero-inflation. To

motivate the extensions, let q(x) and r(x) respectively denote the probability that

location x is suitable for the transmission of the disease and the probability of con-

tracting the disease at location x given that x is suitable. In the first extension, we

model the logit transformations of q(x) and r(x) as a pair of Gaussian processes,

possibly correlated, with the resulting prevalence given by p(x) = q(x)r(x). In the

second extension we model q(x) as a binary indicator equal to 0 or 1 if location x is

unsuitable or suitable, respectively. Using an Ising model, we then allow for discon-

tinuities in prevalence to occur at the boundaries between suitable and unsuitable

areas. Finally, we describe an application of the proposed models to river blindness

prevalence data from Mozambique, Malawi and Tanzania.
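
The relation p(x) = q(x)r(x) from the first extension can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the logit-scale values are supplied by hand where the model would use (possibly correlated) Gaussian processes, and the names `expit`, `prevalence` and `simulate_counts` are hypothetical helpers, not part of any software mentioned in the abstract.

```python
import math
import random


def expit(t):
    """Inverse logit, written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def prevalence(logit_q, logit_r):
    """p(x) = q(x) * r(x): suitability times risk given suitability."""
    return expit(logit_q) * expit(logit_r)


def simulate_counts(logit_qs, logit_rs, n_sampled, seed=0):
    """Binomial case counts at each location, n_sampled people per site.

    The inputs are assumed values on the logit scale; in the model
    they would be realisations of Gaussian processes.
    """
    rng = random.Random(seed)
    counts = []
    for lq, lr in zip(logit_qs, logit_rs):
        p = prevalence(lq, lr)
        counts.append(sum(rng.random() < p for _ in range(n_sampled)))
    return counts


if __name__ == "__main__":
    # First site is essentially unsuitable, second is suitable.
    print(simulate_counts([-50.0, 2.0], [0.0, 0.0], n_sampled=20))
```

A very negative logit of q(x) drives the prevalence towards zero regardless of r(x), which is how this construction produces more zero counts than a binomial model with r(x) alone.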

Scoring functions for forecast evaluation and the role of the information set

Hajo Holzmann Philipps-Universität Marburg

Scoring functions are an essential tool to evaluate point forecasts, and scoring rules

to evaluate probabilistic forecasts. We start by reviewing some recent results on the

construction of scoring functions and scoring rules.

Point forecasts are issued on the basis of certain information. If the forecasting

mechanisms are correctly specified, a larger amount of available information should

lead to better forecasts. We show how the effect of increasing the information set

on the forecast can be quantified by using strictly consistent scoring functions, and

also discuss the role of the information set for evaluating probabilistic forecasts by

using strictly proper scoring rules (Holzmann, H. and Eulert, M.: The role of the

information set for forecasting — with applications to risk management. The Annals

of Applied Statistics 8, 595–621, 2014).

Further, a method is proposed to test whether an increase in a sequence of infor-

mation sets leads to distinct, improved h-step point forecasts. For the value at risk

(VaR), we show that increasing the information set will result in VaR forecasts which

lead to smaller expected shortfalls, unless an increase in the information set does not

change the VaR forecast. The effect is illustrated in simulations and applications

to stock returns for unconditional versus conditional risk management as well as

univariate modeling of portfolio returns versus multivariate modeling of individual

risk factors.
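
As a concrete and standard example of a consistent scoring function (not taken from the abstract), the pinball or quantile loss is consistent for the α-quantile, which is the functional behind VaR. The Python sketch below assumes hypothetical helper names and a toy sample; it only illustrates that the average score is lowest at the true sample quantile.

```python
def pinball_loss(forecast, observation, alpha):
    """Quantile (pinball) score; lower is better.

    S(x, y) = (1{y <= x} - alpha) * (x - y), which is non-negative.
    """
    indicator = 1.0 if observation <= forecast else 0.0
    return (indicator - alpha) * (forecast - observation)


def mean_score(forecast, observations, alpha):
    """Average pinball score of one point forecast over a sample."""
    total = sum(pinball_loss(forecast, y, alpha) for y in observations)
    return total / len(observations)


if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0, 4.0, 5.0]
    for candidate in (2.0, 3.0, 4.0):
        print(candidate, mean_score(candidate, obs, alpha=0.5))
```

Comparing such average scores between forecasts built on nested information sets is the kind of comparison the abstract describes; the testing procedure itself is not reproduced here.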

Kernels for sequentially ordered data

Franz Kiraly University College London

We present a novel framework for kernel learning with sequential data of any kind,

such as time series, sequences of graphs, or strings. Our approach is based on signa-

ture features which can be seen as an ordered variant of sample (cross-)moments; it

allows one to obtain a “sequentialized” version of any static kernel. The sequential

kernels are efficiently computable for discrete sequences and are shown to approximate

a continuous moment form in a sampling sense.

A number of known kernels for sequences arise as “sequentializations” of suitable

static kernels: string kernels may be obtained as a special case, and alignment ker-

nels are closely related up to a modification that resolves their open non-definiteness

issue. Our experiments indicate that our signature-based sequential kernel frame-

work may be a promising approach to learning with sequential data, such as time

series, that allows one to avoid extensive manual pre-processing. (Joint work with Harald

Oberhauser)

Non-parametric indices of dependence for inhomogeneous multivariate random

closed sets

Marie-Colette van Lieshout CWI & University of Twente

We propose new summary statistics for intensity-reweighted moment stationary mul-

tivariate random closed sets. The new statistics are based on the cumulant densities

and reduce to cross K- and D-functions when stationarity holds. We explore the

relationships between the various functions and discuss their explicit forms under

specific model assumptions. We derive ratio-unbiased minus sampling estimators

for our statistics and illustrate their use in practice.

Multiple-point geostatistics with spatio-temporal training images

Gregoire Mariethoz Université de Lausanne

Multiple-point geostatistics (MPS) has received a lot of attention in the last decade

for modeling complex spatial patterns. The underlying principle consists in repre-

senting spatial variability using training images. A common conception is that a

training image can be seen as a prior for the desired spatial variability. As a result,

a variety of algorithmic tools have been developed to generate geostatistical realiza-

tions of spatial processes based on what can be seen broadly as texture generation

algorithms.

While the initial applications of MPS were dedicated to the characterization of 3D

subsurface structures and the study of geological/hydrogeological reservoirs, a new

trend is to use MPS for the modeling of earth surface processes. In this domain,

the availability of remote sensing data as a basis to construct training images offers

new possibilities for representing complexity with such non-parametric data-driven

approaches. Repeated satellite observations or climate model outputs, available at a

daily frequency for periods of several years, provide the required pattern repetition

for having robust statistics on high-order patterns that vary in both space and time.

This presentation will delineate recent results in this direction, including MPS appli-

cations to the stochastic downscaling of climate models, the completion of partially

informed satellite images, the removal of noise in remote sensing data, and modeling

of complex spatio-temporal phenomena such as precipitation.

Second-order pseudo-stationary random fields and point processes on graphs

and their edges

Jesper Møller Aalborg University

Suppose we are given

(i) an undirected connected graph with vertex set V and a countable set E of

edges, where each edge apart from specifying a relation between two vertices

is viewed as a set e which is in bijective correspondence with some non-empty

open interval;

(ii) V and any edge e ∈ E are disjoint, and the edges are pairwise disjoint.

Then we call the triple of V, E, and the bijective mappings/edge coordinates a

graph with Euclidean edges, and we denote this triple by G and the whole graph

set by $L = V \cup \bigcup_{e \in E} e$. In the special case where each edge e is just an open line

segment whose endpoints agree with the adjacent vertices associated to e, then

L is a linear network as considered in connection to for example road networks,

dendrite networks of neurons, and brick walls. Now, for any points u, v ∈ L, the

edge coordinates lead naturally to a geodesic distance dG(u, v) given by shortest

path distance in G. If the vertex set is contained in the Euclidean space $\mathbb{R}^k$ and the

edges are smooth subsets of $\mathbb{R}^k$, we may require that condition (i) and not necessarily

(ii) is satisfied: In fact there is then a natural one-to-one correspondence to a graph

with Euclidean edges, and this naturally induces a geodesic distance dG(u, v) but

taking into consideration whether u (or v) is a certain vertex or it belongs to a

certain edge. We notice that dG(u, v) may then be different from the usual geodesic

distance dL(u, v) on L which is given by shortest path-connected curve distance.

Our main goal is to establish sufficient conditions on the existence of positive def-

inite functions of the form K(dG(u, v)) for all u, v ∈ L. Then the Kolmogorov

Extension Theorem establishes the existence of a separable (Gaussian) random field

Z = {Z(u) : u ∈ L} on G with covariance function

cov(Z(u), Z(v)) = K (dG(u, v)) ∀u, v ∈ L.

(Since the covariance function depends on the graph with Euclidean edges, we prefer

using the terminology “random field on G” rather than “random field on L”.) We

say then that the covariance function is pseudo-stationary and that the random

field Z is second-order pseudo-stationary, noting that we do not require that the

mean function EZ(u) is constant. Note that our setting is different from that in

research on random fields on directed trees such as in a network of rivers or streams

where water flows in one direction. Then special techniques are appropriate for

constructing covariance functions of the form above. However, our techniques will

be different, since we deal with undirected graphs.

One motivation for considering a second-order pseudo-stationary random field Z is

that for any geodesic path puv ⊆ L connecting two points u, v ∈ L, the restriction

of Z to puv has the same covariance structure as the random field Z(t) defined on

t ∈ [0, t0] ⊂ R where cov(Z(t), Z(s)) = K(|t − s|) and t0 = dG(u, v). In brief, Z

restricted to a geodesic path is indistinguishable from a corresponding Gaussian

random field on a closed interval.

Another motivation is that given a covariance function of the form above, we can

construct second-order intensity-reweighted pseudo-stationary (SOIRPS) point pro-

cesses on G, meaning that the point process has a pair correlation function of the

form g(u, v) = g0(dG(u, v)) for all u, v ∈ L. A Poisson process on L is SOIRPS but

to the best of our knowledge, apart from the Poisson process, models for SOIRPS

point processes on graphs with Euclidean edges have not yet been specified

in the literature. We show that for a log Gaussian Cox process (LGCP) X, i.e. when

X conditional on a Gaussian random field Z on L is a Poisson process with intensity

function exp(Z(u)), u ∈ L, second-order pseudo-stationarity of Z is equivalent to

SOIRPS of X. We also specify moment and Palm measure theoretical results for

LGCPs. Further examples of SOIRPS point processes on graphs with Euclidean

edges will be discussed in the talk. (Joint work with Ethan Anderes, University of

California at Davis, and Jakob G. Rasmussen, Aalborg University)

Modeling spatial heteroskedasticity by volatility modulated moving averages

Michele Nguyen Imperial College London

Spatial heteroskedasticity, which refers to changing variances and covariances in

space, is a feature that has been observed in environmental data. While promi-

nent models in the literature have accounted for this behaviour by multiplying the

Gaussian error process with a stochastic volatility process, we propose a model that

intricately blends the effects of spatial volatility across space. This is related to the

way stochastic volatility is modelled in financial partial differential equations.

Let $t \in \mathbb{R}^d$ for some $d \in \mathbb{N}$. Our model, which we call the volatility modulated

moving average (VMMA), is defined by:

$$Y(t) = \int_{\mathbb{R}^d} g(t - s)\,\sigma(s)\,W(ds),$$

where g is a deterministic (kernel) function, W is a homogeneous standard Gaussian

basis or white noise and σ is a stationary stochastic volatility field, independent of W.

Without σ, this model reverts to the Gaussian moving average which is frequently

used in Geostatistics to design covariance structures.

In this project, we develop the theory of VMMAs and show how to simulate from

the models. We also discuss methods of inference.
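
To make the VMMA integral concrete, here is a rough Python sketch of a Riemann-sum discretisation in dimension d = 1. It is not the simulation method developed in the project. The exponential kernel, the i.i.d. log-normal volatility values, the grid size and the function name `simulate_vmma` are all assumptions chosen for illustration, and truncating the integral to a finite grid introduces edge effects.

```python
import math
import random


def simulate_vmma(n=200, step=0.05, seed=0, kernel=None, sigma=None):
    """Approximate Y(t_i) = sum_j g(t_i - s_j) * sigma(s_j) * dW_j on a 1D grid.

    kernel: assumed g, defaults to exp(-|x|).
    sigma:  optional list of n volatility values; by default i.i.d.
            log-normal values drawn independently of the noise.
    """
    rng = random.Random(seed)
    if kernel is None:
        kernel = lambda x: math.exp(-abs(x))
    grid = [j * step for j in range(n)]
    if sigma is None:
        sigma = [math.exp(0.5 * rng.gauss(0.0, 1.0)) for _ in grid]
    # White-noise increments over cells of length `step` have variance `step`.
    noise = [rng.gauss(0.0, math.sqrt(step)) for _ in grid]
    values = [
        sum(kernel(t - s) * sig * dw for s, sig, dw in zip(grid, sigma, noise))
        for t in grid
    ]
    return grid, values


if __name__ == "__main__":
    grid, y = simulate_vmma(n=100)
    print(y[:5])
```

Passing a constant `sigma` reduces the sketch to a discretised Gaussian moving average, which mirrors the remark that the model reverts to that case without σ.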

Modelling complex distributions and dependence structures with trawl-type

processes

Ragnhild Noven Imperial College London

Trawl processes are a class of stationary, continuous-time stochastic processes driven

by an independently scattered random measure. They belong to the wider class of so-

called Ambit fields, and give rise to a flexible class of models that can accommodate

non-Gaussian distributions and a wide range of tempo-spatial covariance structures.

We develop the fundamentals of trawl processes with a view to statistical modelling,

and introduce a new representation that enables exact simulation and suggests novel

estimation methods. Then we use these processes to construct a general hierarchical

modelling framework, and present an application to modelling temporal dependence

in extreme values.

Characterising anisotropy in random fields

Sofia Olhede University College London

Detecting and analyzing directional structures in images is important in many appli-

cations since one-dimensional patterns often correspond to important features such

as object contours or trajectories. Classifying a structure as directional or nondirec-

tional requires a measure to quantify the degree of directionality and a threshold,

which needs to be chosen based on the statistics of the image. In order to do this,

we model the image as a random field. So far, little research has been performed on

analyzing directionality in random fields. In this paper, we propose a novel measure

to quantify the degree of anisotropy, and show how it can be applied to determine

novel and interesting features in the topography of Venus. This is joint work with

David Ramirez and Peter Schreier.

Event based distributions for spatio-temporal random fields

Krzysztof Podgorski Lund University

The sea surface is a classical example of a stochastic field that is evolving in time.

Extreme events that are occurring on such a surface are random and of interest for

practitioners - ocean engineers are interested in large waves and damage they may

cause to an oil platform or to a ship. Thus data on the ocean surface elevation are

constantly collected by systems of buoys, ship- or air-borne devices, and satellites all

around the globe. These vast data require statistical analysis to answer important

questions about random events of interest. For example, one can ask about statistical

distribution of wave sizes, in particular, how large waves are distributed or how steep

they are. Waves often travel in groups and a group of waves typically causes more

damage to a structure or a ship than an individual wave even if the latter is bigger

than each one in the group. So one can be interested in how many waves there are

per group or how fast groups are traveling in comparison to individual waves.

In the talk, a methodology that analyzes statistical distributions at random events

defined on a random process is presented. It is based on a classical result of Rice and

allows for computation of statistical distributions of events sampled from the sea

surface. The methodology initially was applied to Gaussian models but in fact, it is

also valid for quite general dynamically evolving stochastic surfaces.

In particular, it is discussed how sampling distributions for non-Gaussian processes

can be obtained through Slepian models that describe the distributional form of

a stochastic process observed at level crossings of a random process. This is used

for efficient simulations of the behavior of a random process sampled at crossings

of a non-Gaussian moving average process. It is observed that the behavior of the

process at high level crossings is fundamentally different from that in the Gaussian

case, which is in line with some recent theoretical results on the subject.

Non-Gaussian data assimilation via a hybrid ensemble transform filter

Sebastian Reich Universität Potsdam

Most current data assimilation (DA) algorithms for numerical weather prediction

(NWP) are based on variational and/or ensemble-based approaches, which rely on

a Gaussian approximation to forecast uncertainties. Such Gaussian representations

are less likely to be appropriate for characterizing uncertainties arising from fully

three dimensional and multi-phase convection-driven atmospheric circulation pat-

terns. Thus high-resolution NWP has triggered the exploration of non-Gaussian

DA methods. This talk will contribute to this development by presenting a hybrid

ensemble transform filter which bridges the ensemble Kalman filter with sequential

Monte Carlo methods. The proposed hybrid filter also allows for localization and

inflation as necessary for filtering spatio-temporal processes under model errors.

On the class of distributions of subordinated Lévy processes and bases

Orimar Sauri Aarhus University

In this talk we study the class of infinitely divisible distributions obtained by

subordinating a Lévy basis via an independent meta-time. We show that, for a fixed

Lévy basis, the law of the subordinated Lévy basis is uniquely determined by the

law of the associated meta-time. In particular, we use our results to solve the

so-called recovery problem for Lévy bases as well as moving average processes driven

by subordinated Lévy processes. This talk is based on joint work with Almut

Veraart.

Exact and Fast Simulation of Max-Stable Processes

Martin Schlather Universität Mannheim

The efficiency of simulation algorithms for max-stable processes relies on the choice

of the spectral representation: different choices result in different sequences of fi-

nite approximations to the process. We modify the general optimization problem so

that a relatively simple solution can be obtained, which is essentially de Haan’s nor-

malized spectral representation. Compared to other simulation algorithms hitherto,

our approach has at least two advantages. First, it allows the exact simulation of a

comprising class of max-stable processes. Second, the algorithm has a stopping time

with finite expectation. In practice, our approach has the potential of considerably

reducing the simulation time of max-stable processes.

Spatio-temporal log-Gaussian Cox processes for public health data

Theresa Smith Lancaster University

Health data with high spatial and temporal resolution are becoming more common,

but there are several practical and computational challenges to using such data to

study the relationships between disease risk and possible predictors. These

difficulties include lack of measurements on individual-level covariates/exposures,

integrating data measured on different spatial and temporal units, and computational

complexity.

In this talk, I outline strategies for jointly estimating systematic (i.e., parametric)

trends in disease risk and assessing residual risk with spatio-temporal log-Gaussian

Cox processes (LGCPs). In particular, I will present Bayesian methods and MCMC

tools for using spatio-temporal LGCPs to investigate the roles of environmental and

socio-economic risk-factors in the incidence of Campylobacter in England.

Approximating Likelihoods for Large Environmental Datasets

Ying Sun King Abdullah University of Science and Technology

For Gaussian process models, likelihood based methods are often difficult to use with

large irregularly spaced spatial datasets due to the prohibitive computational bur-

den and substantial storage requirements. Although various approximation methods

have been developed to address the computational difficulties, retaining the statisti-

cal efficiency remains an issue. This talk focuses on statistical methods for approx-

imating likelihoods and score equations. The proposed new unbiased estimating

equations are both computationally and statistically efficient, where the covariance

matrix inverse is approximated by a sparse inverse Cholesky approach. A unified

framework based on composite likelihood methods is also introduced, which allows

for constructing different types of hierarchical low rank approximations. The perfor-

mance of the proposed methods is investigated by numerical and simulation studies,

and parallel computing techniques are explored for very large datasets. Our meth-

ods are applied to nearly 90,000 satellite-based measurements of water vapor levels

over a region in the Southeast Pacific Ocean, and nearly 1 million numerical model

generated soil moisture data in the area of Mississippi River basin. The fitted models

facilitate a better understanding of the spatial variability of the climate variables.

Making decisions with probabilistic forecasts

Peter Tankov Université Paris-Diderot

We consider a sequential decision making process (trading, investment, production

scheduling etc.) whose outcome depends on the realization of a random factor,

such as a meteorological variable. We assume that the decision maker has access to a

probabilistic forecast (predictive distribution) of the random factor, which is regu-

larly updated. We propose a stochastic model for the evolution of the probabilistic

forecast inspired by Kushner’s equation of nonlinear filtering, and show how

this model may be estimated from the historical forecast data. We then show how

this stochastic model can be used to determine optimal decision making strategies

depending on the forecast updates. Applications to wind energy trading are given.

Tests for separability in nonparametric covariance operators of random surfaces

Shahin Tavakoli University of Cambridge

The assumption of separability of the covariance operator for a random image or

hypersurface can be of substantial use in applications, especially in situations where

the accurate estimation of the full covariance structure is unfeasible, either for com-

putational reasons or due to a small sample size. However, inferential tools to verify

this assumption are somewhat lacking in high-dimensional or functional settings

where this assumption is most relevant. We propose here to test separability by

focusing on K-dimensional projections of the difference between the covariance op-

erator and its nonparametric separable approximation (Aston, J. A. D., Pigoli, D.

and Tavakoli, S.: Tests for separability in nonparametric covariance operators of

random surfaces, working paper, http://arxiv.org/abs/1505.02023, 2015). The

subspace we project onto is one generated by the eigenfunctions estimated under the

separability hypothesis, negating the need to ever estimate the full non-separable

covariance. We show that the rescaled difference of the sample covariance operator

with its separable approximation is asymptotically Gaussian. As a by-product of this

result, we derive asymptotically pivotal tests under Gaussian assumptions, and pro-

pose bootstrap methods for approximating the distribution of the test statistics when

multiple eigendirections are taken into account. We probe the finite sample

performance through simulation studies, and present an application to log-spectrogram

images from a phonetic linguistics dataset.
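
As a rough illustration of the separable approximation mentioned above, the sketch below builds it from partial traces of the sample covariance of surfaces observed on a p x q grid and measures how far the full sample covariance is from it. The partial-trace construction, the function names and the Frobenius-norm gap are assumptions made for this example; the abstract's projection-based test statistics and bootstrap are not reproduced.

```python
import numpy as np

def separable_approximation(samples):
    """Return (c1, c2, total) so that kron(c1, c2) / total approximates the covariance.

    samples: array of shape (n, p, q), i.e. n observed surfaces on a p x q grid.
    """
    centred = samples - samples.mean(axis=0)
    n = centred.shape[0]
    # Marginal covariances, obtained as partial traces of the sample covariance.
    c1 = np.einsum("nik,njk->ij", centred, centred) / n  # p x p, traced over columns
    c2 = np.einsum("nki,nkj->ij", centred, centred) / n  # q x q, traced over rows
    total = np.trace(c1)  # equals trace(c2) and the trace of the full covariance
    return c1, c2, total

def separability_gap(samples):
    """Frobenius norm of (full sample covariance - separable approximation)."""
    n, p, q = samples.shape
    centred = (samples - samples.mean(axis=0)).reshape(n, p * q)
    full = centred.T @ centred / n
    c1, c2, total = separable_approximation(samples)
    return float(np.linalg.norm(full - np.kron(c1, c2) / total))
```

For data whose covariance is exactly separable the gap is zero up to rounding. The point of the approach in the abstract is to avoid forming the full covariance at all; it is formed here only because the example grid is tiny.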

Abstracts of Posters

Asymptotic high frequency theory for the multivariate Brownian semistationary

process

Andrea Granelli Imperial College London

In our work we formally prove limit theorems for the asymptotic high frequency

theory of the multivariate Brownian semistationary (BSS) process, which can be

defined as

$$\int_{-\infty}^{t} g(t-s)\,\sigma_s\,dW_s.$$

This process offers very flexible modelling possibilities and has recently been used

in the context of energy prices. Depending on the behaviour of the function g, it

may lie outside the semimartingale class, and this is typically the most common use in

practice. The literature has only dealt with the univariate case, so far. Our work

develops the fundamental high frequency asymptotic theory needed for this process,

using Malliavin calculus, to significantly extend the array of results available in the

literature, covering the multivariate case for the first time.

We look at the realised covariation of two correlated BSS processes, defined as the

limit of the sum of the product of the increments of two processes along an equally

spaced partition, as the size of the partition shrinks to zero. We give conditions to

ensure convergence of this quantity in an appropriate sense, since it does not follow

from the usual general theorems concerning semimartingales. We give very general

conditions needed for a law of large numbers to hold, and the main result of the

paper is a central limit theorem, showing convergence in law to the Gaussian, under

some more restrictive assumptions.

This result is important in its own right, but also in applications, as it allows one to

understand the asymptotic theory for quantities of great interest in finance, like the

correlation coefficient between two assets, and the realised beta coefficient.

This poster is based on joint work with Almut Veraart.
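
The realised covariation described above, the sum of products of increments along an equally spaced partition, can be sketched as follows. As a stand-in for BSS paths the example simulates two correlated Brownian motions, which is an assumption made purely for illustration; outside the semimartingale class the raw sum may need a rescaling that this sketch does not attempt. All function names are hypothetical.

```python
import numpy as np

def realised_covariation(x, y):
    """Sum of products of increments of two paths sampled on the same grid."""
    dx = np.diff(np.asarray(x, dtype=float))
    dy = np.diff(np.asarray(y, dtype=float))
    return float(np.sum(dx * dy))

def correlated_brownian_paths(n, rho, seed=0):
    """Two Brownian motions on [0, 1] with correlation rho (illustrative stand-in only)."""
    rng = np.random.default_rng(seed)
    dw1 = rng.standard_normal(n) / np.sqrt(n)
    dw2 = rho * dw1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n) / np.sqrt(n)
    start = np.zeros(1)
    return np.concatenate([start, np.cumsum(dw1)]), np.concatenate([start, np.cumsum(dw2)])
```

For the Brownian stand-in, the realised covariation over [0, 1] settles near rho as the number of increments grows.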

Limit theory for Lévy semistationary processes

Claudio Heinrich Aarhus University

Lévy semistationary (LSS) processes are processes of the form

$$X_t = \int_{-\infty}^{t} g(t-s)\,\sigma_s\,dL_s,$$

where g is a deterministic kernel, σ is a predictable process, and L is a Lévy process

on the real line. These processes form an important subclass of ambit fields, a

flexible class of spatio-temporal processes that have found manifold applications in

various sciences such as biology, finance, and physics. We investigate the limiting

behavior for n→∞ of the realised power variation

$$V(p)_n = \sum_{i=1}^{n} |X_{i/n} - X_{(i-1)/n}|^p,$$

when X is an LSS process driven by a pure jump Lévy process L.
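
A minimal sketch of the realised power variation statistic, assuming the process has been observed at the n + 1 grid points 0, 1/n, ..., 1. The function name is hypothetical, and simulating an LSS process is left out.

```python
import numpy as np

def realised_power_variation(x, p):
    """Sum over i of |x[i] - x[i-1]|**p for observations x[0..n] on an equally spaced grid."""
    increments = np.diff(np.asarray(x, dtype=float))
    return float(np.sum(np.abs(increments) ** p))
```

With p = 2 this is the realised quadratic variation; other powers weight large and small increments differently, which is why the limiting behaviour depends on p.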

Substitute CT generation using Markov random field mixture models

Anders Hildeman Chalmers University of Technology

Computed tomography (CT) equivalent information is needed for attenuation cor-

rection in PET imaging and for dose planning in radiotherapy. Prior work has shown

that Gaussian mixture models can be used to generate a substitute CT image from

a specific set of MRI modalities. This is highly attractive since MR images can

be acquired without exposing the subject to hazardous ionizing radiation and MRI

information is often of interest in its own right.

In this work we improve these models by incorporating spatial information through

assuming the mixture class probabilities to be distributed according to a discrete

Markov random field. Furthermore, the mixtures are extended from Gaussian to

normal inverse Gaussian distributions, allowing heavier tails and skewness.

Model parameters are estimated from training data using a maximum likelihood

approach. Due to the spatial model there is no closed-form expression for the

likelihood function, and a standard EM algorithm cannot be applied. Instead, an EM

gradient algorithm utilizing MCMC approximations is developed. This procedure

yields acceptable convergence properties also when the large quantity of data makes

other common modifications of the EM-algorithm infeasible.

The estimation procedure is not only applicable to this specific problem but can

be applied to a more general family of problems where the M-step cannot be

performed but where the gradient of the likelihood can, at least approximately, be

evaluated.

The advantages of the spatial model and normal inverse Gaussian distributions are

evaluated with a cross-validation study based on data from 14 patients.
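
To illustrate the idea of replacing the M-step by a gradient step, here is a toy sketch for a one-dimensional Gaussian mixture with known weights and a common, known standard deviation, where only the means are updated. This setting, the step size and the function names are assumptions chosen for the example; the Markov random field prior, the normal inverse Gaussian components and the MCMC approximation of the gradient described above are all omitted.

```python
import numpy as np

def log_likelihood(x, means, sigma, weights):
    """Log-likelihood of a 1-D Gaussian mixture with common standard deviation sigma."""
    dens = weights * np.exp(-0.5 * ((x[:, None] - means) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return float(np.sum(np.log(dens.sum(axis=1))))

def em_gradient_step(x, means, sigma, weights):
    """One E-step followed by a single gradient step on the means (no full M-step)."""
    dens = weights * np.exp(-0.5 * ((x[:, None] - means) / sigma) ** 2)
    resp = dens / dens.sum(axis=1, keepdims=True)  # E-step: class responsibilities
    # Gradient of the log-likelihood with respect to each component mean.
    grad = np.sum(resp * (x[:, None] - means), axis=0) / sigma**2
    step = sigma**2 / len(x)  # conservative step: moves at most as far as the exact M-step
    return means + step * grad
```

With this step size each update moves every mean a fraction of the way towards its exact M-step value, so the log-likelihood cannot decrease from one iteration to the next.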

A statistical analysis of Tropical Cyclone Genesis

Thomas Patrick Leahy Imperial College London

According to the Fifth IPCC assessment report, in the 21st century “the frequency of the

most intense storms will increase substantially in some ocean basins”. This poses a

significant risk to the vulnerable regions. Whilst studies suggest that there will be

an increase in intensity and decrease in frequency, it is still uncertain by how much.

Quantifying this change in intensity and distribution of tropical cyclones is difficult.

As a natural first step, we examine the starting points or genesis of tropical cyclones.

This poster will give an insight into attributing and quantifying the influence of

physical covariates on tropical cyclone genesis. Generalised Linear Modelling pro-

vides a statistical framework to understand the physical variables that contribute to

the generation of tropical cyclones. An in-depth understanding of the contributing

factors to genesis is particularly vital in a potential future climate.
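
As a hedged illustration of the generalised linear modelling step, the sketch below fits a Poisson regression with log link to counts by iteratively reweighted least squares. The Poisson family, the log link and the function name are assumptions made for this example; the abstract does not specify the model family or the covariates used.

```python
import numpy as np

def fit_poisson_glm(design, counts, iterations=25):
    """Fit a Poisson GLM with log link by iteratively reweighted least squares.

    design: (n, k) matrix whose first column is assumed to be the intercept.
    counts: length-n array of non-negative counts with a positive mean.
    """
    design = np.asarray(design, dtype=float)
    counts = np.asarray(counts, dtype=float)
    beta = np.zeros(design.shape[1])
    beta[0] = np.log(counts.mean())  # start from the intercept-only fit
    for _ in range(iterations):
        eta = design @ beta
        mu = np.exp(eta)
        working = eta + (counts - mu) / mu  # working response
        weighted = design * mu[:, None]     # rows scaled by the IRLS weights mu
        beta = np.linalg.solve(design.T @ weighted, weighted.T @ working)
    return beta
```

Each fitted coefficient is then read on the log scale: a unit increase in a covariate multiplies the expected count by the exponential of its coefficient.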

Bayesian Inference for High Dimensional Dynamic Spatio-Temporal Models

Sofia Maria Karadimitriou University of Sheffield

The first reduced-dimension Dynamic Spatio-Temporal Model (DSTM) was

introduced by Wikle and Cressie (A dimension-reduced approach to space-time Kalman

filtering. Biometrika, 86(4), 815–829, 1999) to jointly describe the spatial and tem-

poral evolution of a function observed subject to noise. A basic state space model is

adopted for the discrete temporal variation, while a continuous autoregressive

structure describes the continuous spatial evolution. Application of Wikle and Cressie’s

DSTM relies upon the pre-selection of a suitable reduced set of basis functions,

and this can present a challenge in practice. In this paper we propose an off-line

estimation method for high-dimensional spatio-temporal data based upon the DSTM

which attempts to resolve this issue by allowing the basis to adapt to the observed

data. Specifically, we present a wavelet decomposition for the spatial evolution,

where one would typically expect parsimony. This presumed parsimony can be

incorporated by placing a Laplace prior distribution on the wavelet coefficients. The

aim of using the Laplace prior is to filter out wavelet coefficients with low contribution,

and thus achieve the dimension reduction with significant computational savings. We

then propose a Hierarchical Bayesian State Space model, for the estimation of which

we offer an appropriate Forward Filtering Backward Sampling algorithm which in-

cludes Metropolis-Hastings steps and a Bayesian Graphical Lasso scheme (Wang,

H.: Bayesian graphical lasso models and efficient posterior computation. Bayesian

Analysis, 7(4), 867–886, 2012) for the covariance inference. (Joint work with Kostas

Triantafyllopoulos and Timothy Heaton)
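
The filtering effect of a Laplace prior can be seen in a much simpler setting than the hierarchical model above. Assuming a single coefficient observed with Gaussian noise, the posterior mode under a zero-mean Laplace prior is the soft-thresholded observation. The sketch below shows only this simplification; the function and parameter names are hypothetical, and the Forward Filtering Backward Sampling scheme of the abstract is not reproduced.

```python
import numpy as np

def laplace_map_shrinkage(coefficients, noise_var, prior_scale):
    """Posterior mode of w given y = w + noise, noise ~ N(0, noise_var), w ~ Laplace(0, prior_scale)."""
    y = np.asarray(coefficients, dtype=float)
    threshold = noise_var / prior_scale  # observations below this size are set to zero
    return np.sign(y) * np.maximum(np.abs(y) - threshold, 0.0)
```

Coefficients smaller in magnitude than the threshold are set exactly to zero at the mode, which is the sense in which low-contribution wavelet coefficients are filtered out.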

Seasonality of mortality in the USA: identifying patterns and trends with

Bayesian spatiotemporal modelling

Robbie Parks Imperial College London

All-cause mortality is known to exhibit seasonal variation. In this study, we will

use all-cause mortality records of the entire USA from 1982-2010 for forecasting of

seasonal age-specific mortality on a state level, analysing differences in trends. The

novel approach of this research comes from the Bayesian hierarchical model, which

borrows strength by neighbouring location, age group, and time. Our study will be

the first systematic analysis of seasonality of the entire USA throughout this time

period stratified by gender, age group, and state.

Initial results from the in-sample model indicate a distinct difference between the

seasonal mortality profiles of younger and older age groups. Relative mortality in younger

age groups peaks in the summer months, while that in older age groups peaks in the winter

months. Further analysis over geography will determine if this variation is constant

across locations.

We expect the results to improve understanding of how distinct age groups and

locations are affected by season, as previous studies have looked at all age groups

combined. We also expect the model framework to enable coherent forecasts of

patterns and trends of seasonal mortality.

Rough Path Theory, Fractional Brownian Motion and connections to Statistical

Inference

Riccardo Passeggeri Imperial College London

Rough Path theory was originally developed in the late nineties by Terry Lyons.

A rough path is an analytical and algebraic object associated to an irregular path

allowing one to define and study solutions to differential equations controlled by

such irregular paths, for example a Brownian motion. In particular, given a path

we can consider its signature, which is the set of all iterated integrals of the path,

hence it is a map from the path to a tensor algebra. The signature gives us additional

information about the path so that we can use it, for example, to find a solution

of a certain ODE driven by the path or to obtain additional properties of the same

path. The first objective of my research is to compute the rate of convergence of the

expected signature of a piecewise linear approximation of the fractional Brownian

motion to the expected signature of the fractional Brownian motion. This is an

open problem for any value in the interval (0, 1) of the Hurst parameter H. The

second objective is to compute the expected signature of a fractional Brownian

motion with the Hurst parameter strictly less than a half (H < 1/2). This research

is directly relevant to parameter estimation for rough differential equations. In

particular, Anastasia Papavasiliou and Christophe Ladroue (Parameter estimation

for rough differential equations. The Annals of Statistics, 39(4), 2047–2073, 2011)

constructed the Expected Signature Matching Estimator (ESME). The goal is to

estimate the parameter θ of the vector field f of a differential equation of the form

$$dY_t = f(Y_t; \theta) \cdot dX_t, \quad Y_0 = y_0,$$

and in order to do this the authors use the ESME, which is based on the matching

between the theoretical and the empirical expected signature of the response

{Y_t, 0 < t < T}, for T > 0. Two of the main assumptions are that the expected signature

of the path X is computable and the theoretical expected signature of the path Y ,

if not computable, can be approximated. Thus my project is strongly linked with

their work and in particular, by achieving the objectives mentioned above, it will be

possible to eliminate these restrictive assumptions in the case of fractional Brownian

motion.
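
For a piecewise linear path the iterated integrals that make up the signature can be computed exactly, segment by segment, using Chen's identity. The sketch below does this for the first two levels only; the truncation at level 2 and the function name are choices made for this example.

```python
import numpy as np

def signature_level2(path):
    """Levels 1 and 2 of the signature of a piecewise linear path given as an (m, d) array of points."""
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    level1 = np.zeros(d)
    level2 = np.zeros((d, d))
    for delta in np.diff(path, axis=0):
        # Chen's identity: append one linear segment with increment delta.
        level2 += np.outer(level1, delta) + 0.5 * np.outer(delta, delta)
        level1 += delta
    return level1, level2
```

An empirical expected signature is then the average of these arrays over many sampled paths, for instance over piecewise linear approximations of simulated fractional Brownian motion.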

Participants

Denis Allard (INRA, France)

Saoirse Amarteifio (Imperial College London, UK)

Mikkel Bennedsen (Aarhus University, Denmark)

Nick Bingham (Imperial College London, UK)

David Bolin (Chalmers University of Technology, Sweden)

Ignacio Bordeu Weldt (Imperial College London, UK)

Ricardo Carrizo (MINES ParisTech, France)

Ed Cohen (Imperial College London, UK)

Alice Corbella (University of Cambridge, UK)

Gabriela Czanner (University of Liverpool, UK)

Petros Dellaportas (University College London, UK)

Stefano De Marco (École Polytechnique, France)

Guler Ergun (Imperial College London, UK)

Nicola Fitz-Simon (Imperial College London, UK)

Marc G. Genton (KAUST, Saudi Arabia)

Paul Ginzberg (Imperial College London, UK)

Emanuele Giorgi (Lancaster University, UK)

Andrea Granelli (Imperial College London, UK)

Zorana Grbac (Université Paris-Diderot, France)

Claudio Heinrich (Aarhus University, Denmark)

Hessam Hessami (Université Joseph Fourier, France)

Anders Hildeman (Chalmers University of Technology, Sweden)

Marcel Hirt (University College London, UK)

Till Hoffmann (Imperial College London, UK)

Hajo Holzmann (Philipps-Universität Marburg, Germany)

Blanka Horvath (Imperial College London, UK)

Jack Jacquier (Imperial College London, UK)

Christopher Jarvis (London School of Hygiene and Tropical Medicine, UK)

Sofia Maria Karadimitriou (University of Sheffield, UK)

Yannis Karmpadakis (Imperial College London, UK)

Dimitrios Kiagias (University of Sheffield, UK)

Franz Kiraly (University College London, UK)

Thomas Patrick Leahy (Imperial College London, UK)

Fekadu Lemessa (Umeå University, Sweden)

Marie-Colette van Lieshout (CWI & University of Twente, The Netherlands)

Gregoire Mariethoz (Université de Lausanne, Switzerland)

Maciej Marowka (Imperial College London, UK)

Maxime Morariu-Patrichi (Imperial College London, UK)

Jesper Møller (Aalborg University, Denmark)

Michele Nguyen (Imperial College London, UK)

Ragnhild Noven (Imperial College London, UK)

Sofia Olhede (University College London, UK)

Thomas Opitz (INRA, France)

Aidan O’Sullivan (University College London, UK)

Mikko Pakkanen (Imperial College London, UK)

Aristeidis Panos (University College London, UK)

Robbie Parks (Imperial College London, UK)

Riccardo Passeggeri (Imperial College London, UK)

Lekha Patel (Imperial College London, UK)

Roozbeh H. Pazuki (Imperial College London, UK)

Erika Pellegrino (Imperial College London, UK)

Krzysztof Podgorski (Lund University, Sweden)

Tuomas Rajala (University College London, UK)

Nicola Reeve (Coventry University, UK)

Sebastian Reich (Universität Potsdam, Germany)

Orimar Sauri (Aarhus University, Denmark)

Martin Schlather (Universität Mannheim, Germany)

Theresa Smith (Lancaster University, UK)

Ying Sun (KAUST, Saudi Arabia)

James Sweeney (University College Dublin, Ireland)

Peter Tankov (Université Paris-Diderot, France)

Shahin Tavakoli (University of Cambridge, UK)

Kostas Triantafyllopoulos (University of Sheffield, UK)

Almut Veraart (Imperial College London, UK)

Jianfeng Wang (Umeå University, Sweden)

Hanna Zdanowicz (University of Oslo, Norway)
