Transcript of Directional Analysis of Stationary Point Processes
Martina Sormani
Dissertation approved for the award of the academic degree of Doktor der Naturwissenschaften (Doctor rerum naturalium, Dr. rer. nat.)
Date of defense: 11.06.2019
Acknowledgements
First of all, and most of all, I would like to thank my supervisor Claudia Redenbach, who gave me the opportunity to do my PhD and was a great support and guide during this time, not only regarding issues in mathematics. I am grateful to Tuomas Rajala, who shared lots of code and ideas with me, and to Prof. Aila Särkkä, especially for her detailed and accurate text corrections. Thanks to Johannes Freitag for sharing the ice data with us, for his suggestions, and for giving me the opportunity to join the DFG project. Thanks also to his PhD student Tetsuro. I would also like to thank the image processing group of the ITWM for letting me use their software and for sharing their knowledge; in particular, thanks to Sonja for all her help, and to Prakash. I am grateful to professors Lothar Heinrich, Jürgen Franke and Gabriele Steidl for sharing their knowledge and for taking the time for discussions with us. Thanks to Disha, who was always with me in good and bad times. Finally, I want to thank my family in Italy, who have always been close to me, and Luis and Diego.
Preface
This work has mainly been supported by the DFG priority programme "Antarktisforschung mit vergleichenden Untersuchungen in arktischen Eisgebieten": FR 2527/2-1, RE 3002/3-1. Partial funding by the DFG-Graduiertenkolleg 1932 and by the Center for Mathematical and Computational Modelling (CM)² in Kaiserslautern is gratefully acknowledged.
List of Symbols
B0 bounded sets of B
Nlf locally finite subsets of Rd
Nlf σ-algebra on Nlf
x point configuration on Rd
Nx(B) number of points of x in a subset B ⊂ Rd
X spatial point process on Rd
XS X ∩ S
NX(B) number of points of X in a subset B ⊂ Rd
λ intensity of X
∂W boundary of W
(Ω, F, P) probability space
K(·) reduced second order moment measure
λ(2) second order product density
P0(·) Palm measure
W∗ window of observation of the Fry points
λZ intensity of the Fry points
µ, ν measures
G nearest neighbor distance distribution function
F empty space function
g pair correlation function
d(·, ·) distance function
Sd−1 unit sphere in Rd
kd volume of the d-dimensional unit ball
I(·) indicator function
T = RC linear mapping, with R a rotation matrix and C a compression matrix
R0 element of SOn
det(·) determinant of a matrix
tr(·) trace of a matrix
AT transpose of the matrix A
Contents

Introduction 2
1 Spatial Point Processes 3
  1.1 General notation 3
  1.2 Definitions and preliminaries 4
  1.3 Properties of spatial point patterns 5
  1.4 Poisson point process (CSR) 6
  1.5 Summary statistics 7
    1.5.1 Intensity measures 8
    1.5.2 Palm distributions 10
    1.5.3 Second order summary statistics 11
    1.5.4 First order summary statistics 12
  1.6 Strauss process 13
  1.7 The Metropolis-Hastings algorithm 17
    1.7.1 The algorithm 18
    1.7.2 Convergence of the algorithm 18
    1.7.3 Simulation of locally stable point processes 19
2 Directional Analysis 21
  2.1 Settings 21
    2.1.1 Aims 23
    2.1.2 Explicative examples 23
  2.2 Fry points 24
  2.3 Integral method 26
    2.3.1 Estimation of R 31
  2.4 Projection method 36
    2.4.1 Estimation of R 37
  2.5 Ellipsoid method 40
    2.5.1 Estimation of R 41
  2.6 Estimation of C 41
    2.6.1 Integral method 42
    2.6.2 Projection method 42
3 Simulation Study 45
  3.1 Simulation study 45
  3.2 Estimation of R 46
    3.2.1 2D 46
      3.2.1.1 Integral Method 46
      3.2.1.2 Projection Method 58
      3.2.1.3 Ellipsoid Method 59
    3.2.2 3D 61
      3.2.2.1 Projection Method 61
      3.2.2.2 Ellipsoid Method 63
    3.2.3 Discussion 65
  3.3 Estimation of C 68
    3.3.1 2D 68
    3.3.2 3D 72
    3.3.3 Discussion 76
4 Directional Analysis - Additional Aspects 79
  4.1 Influence of noise 79
    4.1.1 Simulation study 80
  4.2 Classification algorithms 82
    4.2.1 Model specification 83
    4.2.2 MCMC method 83
    4.2.3 Variational Bayes algorithm 84
    4.2.4 Comparison of the methods 85
  4.3 Testing against anisotropy 86
    4.3.1 Power of the "Projection" test 88
  4.4 Visualization of the Fry points 89
    4.4.1 2D 90
    4.4.2 3D 93
  4.5 Limit behaviour of the geometric anisotropy transform 95
5 Application to Ice Data 99
  5.1 Description of the data 99
    5.1.1 Division in subsamples 101
  5.2 Motivation 105
  5.3 Directional analysis 106
    5.3.1 Estimation of the interaction radius 106
    5.3.2 Estimation of R 107
      5.3.2.1 Talos Dome core 108
      5.3.2.2 EDML core 109
      5.3.2.3 Renland core 112
    5.3.3 Estimation of C 113
      5.3.3.1 Talos Dome core 114
      5.3.3.2 EDML core 114
      5.3.3.3 Discussion 117
    5.3.4 Representation of the Fry points 119
Conclusions 122
Appendices 123
  A.1 Proof of unbiasedness 123
  A.2 Expectation of wavelet coefficients 123
  B.1 Academic Background 125
  B.2 Akademischer Werdegang 125
Bibliography 125
Introduction
In this thesis we consider, as our main topic, the directional analysis of a stationary point process. Interest in such an analysis has arisen in the modern point process literature since, thanks to advances in technology, large and complicated point pattern data, in particular in 3D, have become more common. For such patterns, the assumptions of isotropy and stationarity cannot simply be made but need further investigation. Testing stationarity has been considered by several authors, and many non-stationary models are currently available [1, 2, 27, 39, 49]. Isotropy, on the other hand, is often still assumed without further checking, although several tools to study anisotropy have been suggested in the literature. To make those tools more easily accessible, a paper collecting the existing non-parametric methods was recently published [51]. In this thesis we focus on and compare three non-parametric methods, which we call the Integral method, the Ellipsoid method and the Projection method. All of them are based on second-order analysis of a point process. The Ellipsoid method was introduced in [52]. The Integral method has been applied in the literature in several versions, for example in [53] or in [35], and is described here in a general context. The Projection method, to the best of our knowledge, is introduced in this thesis; a similar idea in 2D can be found in [26, page 254].
In a simulation study we apply the methods in order to find preferred directions and we compare their performances. Testing isotropy and visualization of anisotropy, both in 2D and 3D, are also considered. Directional methods are especially useful for detecting directions in regular point patterns, since it can be difficult to detect anisotropy visually in such patterns. In contrast, in clustered patterns, the shape and directions of the clusters can already reveal some information. An example of a regular pattern where it is difficult to visually detect anisotropy is the amacrine cells data (Figure 0.0.1, left), which consists of on cells and off cells. These data have been analyzed several times under the assumptions of stationarity and isotropy, but it was recently found by Wong and Chiu that both the marginal on and off patterns and their superposition show some sign of anisotropy. Anisotropy can be generated by several mechanisms. In this thesis we focus on the so called geometric anisotropy mechanism, which has been considered in the literature both for clustered point patterns, such as the Welsh chapels data (Figure 0.0.1, right) [38, 36], and for regular point processes, such as the amacrine cells [66] and air bubbles in polar ice [53, 52]. Motivated by our application to real data, we pay special attention to the regular case. As in [53, 52], we consider the 3D locations of air bubbles in glacial ice cores. For these data, the aim of a directional analysis is to obtain information about the deformation of the ice sheet at different depths. This information is needed by glaciologists in order to build dating models for the ice. A first directional analysis of the ice data can be found in [53] and [52].
Figure 0.0.1: The amacrine cells data with on and off cells (left) and the Welsh chapels data (right).
Finally, we consider the influence of isotropic and stationary noise on the results of the directional analysis of a stationary point process. This study is motivated by the ice application: it has recently been discovered that ice core samples may contain noise bubbles, which form due to the relaxation of the ice after the core is taken out of the drilling hole. In this context, the classification algorithms introduced in [54] and [50] are taken into consideration. The limit behavior of the geometric anisotropy mechanism is also described. An introduction to point process theory and to the main notation is given in Chapter 1. The three main methods, the Integral, the Ellipsoid and the Projection method, are described in Chapter 2, as well as their application in the setting of geometric anisotropy. In Chapter 3 the methods are compared via a simulation study, both in 2D and in 3D. In Chapter 4 we consider the influence of noise, the anisotropy tests and the limiting behavior of the geometric anisotropy mechanism. Finally, in Chapter 5, we apply the methods to the ice data. Parts of this work have been published in
• C. Redenbach, A. Särkkä, M. Sormani (2015). Classification of Points in Superpositions of Strauss and Poisson Processes. Spatial Statistics, 12, 81-95.
• T. Rajala, C. Redenbach, A. Särkkä, M. Sormani (2016). Variational Bayes Approach for Classification of Points in Superpositions of Point Processes. Spatial Statistics, 15, 85-99.
• T. A. Rajala, A. Särkkä, C. Redenbach, M. Sormani (2016). Estimating geometric anisotropy in spatial point patterns. Spatial Statistics, 15, 139-155.
• T. A. Rajala, C. Redenbach, A. Särkkä, M. Sormani (2018). A review on anisotropy analysis of spatial point patterns. Spatial Statistics.
1 Spatial Point Processes
In this chapter we describe the fundamentals of the theory of
spatial point processes which are necessary to introduce our work.
We start by giving the formal definition of a spatial point process
in Section 1.2 and by describing some important properties that a
point process may have in Section 1.3. In Section 1.4 we introduce
the Poisson point process, which is a fundamental model in spatial
point process theory. In Section 1.5 we describe some of the
possible summary statistics used to describe point patterns.
Finally, in Section 1.6, we introduce the Strauss process, which will be considered throughout the thesis. The main references used in this chapter are [26], [40], [59] and [64].
1.1 General notation
In this section we define some general notation that will be used throughout the thesis; more specific notation will be introduced later. We denote by I[·] the indicator function and by IB[·] the indicator function of a set B ⊂ Rd, which, given x ∈ Rd, is defined as

IB[x] := 1 if x ∈ B, and 0 otherwise.
Given a set B ⊂ Rd, we denote its Lebesgue measure by |B|. In particular, the Lebesgue measure of the d-dimensional unit ball B1(0) will be denoted by kd. We denote by Sd−1 the (d−1)-dimensional unit sphere and define the positive half unit sphere as

(Sd−1)+ := {x ∈ Sd−1 such that xd > 0},

where xd denotes the last component of x. We denote the Minkowski sum of two sets A and B in Rd by

A ⊕ B = {a + b : a ∈ A, b ∈ B}.

The set Bx = B ⊕ {x} therefore corresponds to the translation of the set B by a point x ∈ Rd. We denote the Euclidean norm by || · || and by d(x, y) := ||x − y|| the distance between two points x, y ∈ Rd. The distance between a point x ∈ Rd and a set B ⊂ Rd is given by

d(x, B) := inf_{y∈B} d(x, y).
Given the space Lp(Rd, R) of Lp-Lebesgue integrable functions from Rd to R, we denote the corresponding Lp-norm by || · ||Lp. We denote by det(A) the determinant of a matrix A, by tr(A) its trace and by AT its transpose. Finally, we denote the Dirac delta function by δ(·). We now introduce the notation for three particular types of sets in Rd. We denote by S(u, ε, r) the double conical sector centered at the origin with main direction u ∈ Sd−1,
opening angle ε and radius of the sector r. In 2D the set will be denoted by S(θ, ε, r), where θ is the angle that u forms with the x-axis (plot 1 of Figure 1.1.1). We denote by L(u, r, hc) the cylinder (3D) or rectangle (2D) with major-axis direction given by the unit vector u ∈ Sd−1, height 2r and cross-section half-length hc, with 0 < hc < r. In 2D we use, as for the cone, the notation L(θ, r, hc) (plot 2 of Figure 1.1.1). Finally, we denote by E(u, r, k), where k < 1, the ellipse centered at the origin with major-axis direction u ∈ Sd−1 and semi-axes of length r/k and rk. In 2D we use the notation E(θ, r, k) (plot 3 of Figure 1.1.1).
1.2 Definitions and preliminaries
Spatial point processes are random countable subsets of a space S. The space S is required to be a locally compact topological space with a countable base, on which a Borel σ-algebra is defined. In this thesis we usually consider S = Rd endowed with the σ-algebra B induced by the Euclidean metric. In some cases we also consider S ⊂ Rd, again endowed with the σ-algebra induced by the Euclidean metric, which is also denoted by B. We now give a formal definition of a point process on S, restricting our attention to point processes whose realizations are locally finite subsets of S. Let B0 be the set of bounded elements of B, x a countable subset of S, Nx(S) its cardinality and Nx(B) the cardinality of the point configuration x restricted to a subset B of S. We define the set Nlf of locally finite subsets of S as
Nlf = {x ⊂ S : Nx(B) < ∞ ∀B ∈ B0}.

On Nlf we define the following σ-algebra

Nlf := σ({x ∈ Nlf : Nx(B) = k}, B ∈ B0, k ∈ N).

Definition 1.2.1. Let (Ω, F, P) be a probability space. A spatial point process X on S is a measurable map

X : (Ω, F) → (Nlf, Nlf).
The distribution of X is given by the probability measure PX on the measure space (Nlf, Nlf) defined as

PX(F) = P({ω : X(ω) ∈ F})  ∀F ∈ Nlf.
In applications, spatial point processes are used as statistical models for the analysis of observed patterns of points, called spatial point patterns or spatial point configurations, where the points represent the locations of some objects of interest. A great variety of objects, in many different contexts, can be considered; typical examples are locations of trees in a forest, locations of stars in a galaxy, or locations of cells in a tissue. In all these situations the data, at a basic level, simply consist of point coordinates. Since spatial point patterns present a huge variety, one of the primary aims of point process theory is to provide structural methods describing how to find a statistical model which offers a satisfactory explanation of the considered pattern. To this aim, different types of models, which may depend on different parameters, are considered and studied. In practice, the data of a realization of a spatial point process are collected in a bounded observation window W, which affects the analysis of the data and should therefore be carefully taken into consideration.
1.3 Properties of spatial point patterns
In this section we describe some important properties that spatial point patterns (processes) may have. Given a point pattern, it is useful to check whether it satisfies certain properties, in order to find a correct model for the data and, if possible, to simplify its analysis. We start by describing two properties, namely stationarity and isotropy, that will play a central role throughout the thesis. Let X be a point process on Rd.

1) Stationarity: We say that X is stationary if its distribution is invariant under translations. This means that the point process Y := X + x, where x is an arbitrary fixed point of Rd, has the same probability distribution as X for all x ∈ Rd.

2) Isotropy: We say that X is isotropic if its distribution is invariant under rotations about the origin. This means that the point process Y := R0X, where R0 ∈ SOn, has the same probability distribution as X for all R0 ∈ SOn.
Both the assumption of stationarity and that of isotropy considerably simplify the analysis of a point pattern. To check stationarity, various methods have been proposed in the literature, some of them quite standard to use (see for example the quadrat counting method [4, page 165], where independence of the points should be assumed). The hypothesis of isotropy is instead often confirmed only by a visual check. In applications we usually distinguish between

1) Regular point patterns: The points show repulsion between each other and are located so as to preserve a certain distance. The repulsion may be caused by some physical limits; for example, the points could represent the centers of spheres of a certain radius r0 (Figure 1.3.1, second plot).
2) Clustered point patterns: The points show attraction to each other and form clusters in which the points lie close together. An example of a clustered pattern is the pattern of seeds spread by a group of plants, where each plant spreads seeds in its proximity (Figure 1.3.1, third plot).

3) Complete Spatial Randomness (CSR): The points do not show any type of interaction and are independently randomly scattered in space (Figure 1.3.1, first plot). The CSR model plays a major role in spatial statistics and will be taken into consideration in Section 1.4.

In the literature, several models both for regular and clustered patterns have been proposed. In this thesis we particularly focus on regular point patterns. An additional property of spatial point processes is

Simplicity: Realizations of X almost surely consist of distinct points, so that almost surely no two points coincide. In most applications, including ours, this does not represent a constraint, since for physical reasons it is impossible for two points to be located in exactly the same place.
Figure 1.3.1: Realizations of a CSR process (first plot), a regular
point process (second plot) and a clustered point process (third
plot).
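The three regimes above can be illustrated by simple simulation mechanisms. The following minimal Python sketch (assuming a unit square window; the Matérn I thinning and Matérn cluster constructions are standard stand-ins for illustration, not the models used later in this thesis) generates one pattern of each type.

```python
import math
import random

def poisson_count(mu):
    """Sample N ~ Poisson(mu) by counting unit-rate exponential arrivals."""
    n, s = 0, random.expovariate(1.0)
    while s < mu:
        n += 1
        s += random.expovariate(1.0)
    return n

def csr(lam, w=1.0):
    """CSR: homogeneous Poisson pattern in the square [0, w]^2."""
    n = poisson_count(lam * w * w)
    return [(random.uniform(0, w), random.uniform(0, w)) for _ in range(n)]

def regular(lam, r0, w=1.0):
    """Regular pattern via Matern I thinning: start from CSR and delete
    every point that has a neighbour closer than the hard-core distance r0."""
    pts = csr(lam, w)
    return [p for i, p in enumerate(pts)
            if all(math.dist(p, q) >= r0
                   for j, q in enumerate(pts) if j != i)]

def clustered(lam_parent, mean_daughters, disc_r, w=1.0):
    """Clustered pattern via a Matern cluster construction: CSR parents,
    each with a Poisson number of daughters uniform in a disc of radius disc_r."""
    out = []
    for px, py in csr(lam_parent, w):
        for _ in range(poisson_count(mean_daughters)):
            rho = disc_r * math.sqrt(random.random())  # uniform in the disc
            phi = random.uniform(0.0, 2.0 * math.pi)
            out.append((px + rho * math.cos(phi), py + rho * math.sin(phi)))
    return out
```

By construction, every pair of retained points in the regular pattern is at distance at least r0, while each daughter point in the clustered pattern lies within disc_r of its parent.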
1.4 Poisson point process (CSR)
Definition 1.4.1. Let µ be a locally finite, diffuse measure on Rd.
A point process X on Rd
such that
(i) NX(A) ∼ Poisson(µ(A)) ∀A ∈ B0,
(ii) if A1, . . . , Ak ∈ B0 are disjoint sets, then NX(A1), . . . , NX(Ak) are independent random variables,
is called a Poisson point process with intensity measure µ.
The Poisson process with intensity measure µ can be defined on a
subset S ⊂ Rd in an analogous way. If the measure µ has a density λ
with respect to Lebesgue measure, λ is called intensity function.
If λ is constant, we say that X is a homogeneous Poisson process.
It is easy
to verify that the homogeneous Poisson process is stationary and isotropic. Note that, in the literature, the term CSR usually refers to the homogeneous Poisson point process. The Poisson process is also used as a basis for the construction of more complicated models, and it is the most analytically tractable model. Note that in Definition 1.4.1, if X is simple, property (i) implies property (ii).
Definition 1.4.2. Let S ∈ B, let µ be a diffuse locally finite measure on Rd with µ(S) < ∞, and let n ∈ N. A point process X is a µ-binomial point process on S with n points if X := {ξ1, . . . , ξn}, where the ξi are independent and µ-uniformly distributed in S, so that

P(ξi ∈ A) = µ(A) / µ(S),  A ∈ B, A ⊂ S.
We now consider the restriction XS of a Poisson process X with
intensity measure µ on a set S such that µ(S) <∞, or
equivalently a Poisson process X with intensity measure µ defined
on such a set S. This is not a strict restriction in applications
since we usually observe a realization of our point process in a
restricted window W .
Proposition 1.4.1. Let X be a Poisson point process on Rd with intensity measure µ. The process XS := X ∩ S with S ∈ B0 and µ(S) > 0, conditional on NX(S) = n, is a µ-binomial point process with n points on S.
From Proposition 1.4.1 we can deduce a method to simulate XS: we can generate a random number N ∼ Poisson(µ(S)), and then generate N points, µ-uniformly scattered in S. Proposition 1.4.1 also allows us to characterize the distribution Π of X defined on the measure space (Nlf, Nlf). In fact

Π(F) = P(X ∈ F) = Σ_{n=0}^∞ (e^{−µ(S)} / n!) ∫_S · · · ∫_S I[{s1, . . . , sn} ∈ F] dµ(s1) · · · dµ(sn),  F ∈ Nlf.  (1.4.1)

When n = 0 the integral should be replaced by I[∅ ∈ F].
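The two-step recipe above translates directly into code. The following Python sketch simulates an inhomogeneous Poisson process on the unit square; the density λ(x, y) = 100x and the rejection-sampling step are illustrative assumptions, not prescriptions from the text.

```python
import random

def poisson_count(mu):
    """Sample N ~ Poisson(mu) by counting unit-rate exponential arrivals."""
    n, s = 0, random.expovariate(1.0)
    while s < mu:
        n += 1
        s += random.expovariate(1.0)
    return n

def simulate_poisson(density, density_max, mu_total):
    """Two-step recipe of Proposition 1.4.1 on S = [0, 1]^2: draw
    N ~ Poisson(mu(S)), then scatter N points with distribution mu / mu(S),
    realised here by rejection sampling against the bound density_max."""
    n = poisson_count(mu_total)
    pts = []
    while len(pts) < n:
        x, y = random.random(), random.random()
        if random.random() * density_max <= density(x, y):
            pts.append((x, y))
    return pts

# Assumed illustrative intensity function lambda(x, y) = 100 x, so mu(S) = 50.
pattern = simulate_poisson(lambda x, y: 100.0 * x, 100.0, 50.0)
```

The conditional structure of Proposition 1.4.1 is visible in the code: once N is fixed, the points are independent draws from the normalized measure µ/µ(S).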
1.5 Summary statistics
In this section we introduce different summary statistics used to
describe spatial point patterns. Summary statistics can give
different kinds of information about the considered spatial pattern
and they can be used to help identify a suitable model for it. In
Section 1.5.1 we introduce the so called intensity measures. In
Section 1.5.2 we introduce Palm distributions, which
characterize conditional properties of spatial patterns and which are necessary to introduce the second order statistics in Section 1.5.3: Ripley's K-function and the pair correlation function g. Finally, in Section 1.5.4, we introduce three possible first order summary statistics: the empty space function F, the nearest neighbor distance distribution G and the J-function, a combination of F and G.
1.5.1 Intensity measures
The first order moment measure Λ, also called the intensity measure of a point process X, is defined on the space (S, B) as

Λ(A) = E(NX(A))  ∀A ∈ B,

so Λ(A) represents the expected number of points of X in A. The
first order moment measure can have a density λ : S → R+ with respect to the Lebesgue measure. In this case we call λ the intensity function and we can write

Λ(A) = ∫_A λ(ξ) dξ.

If X is stationary, then Λ(A) = Λ(A_ν) for every translation ν ∈ Rd, so that

∫_A λ(ξ) dξ = ∫_A λ(ξ + ν) dξ  ∀ν ∈ Rd.

This implies that the intensity function is constant, λ(x) = λ, and that Λ(A) = λ|A|. In this case λ can be interpreted as the average number of points per unit volume and, given a realization x of the process in the observation window W, can be estimated by

λ̂ = Nx(W) / |W|.  (1.5.1)
In the Poisson process the intensity measure Λ coincides with
µ.
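Estimator (1.5.1) is straightforward to apply in practice. A minimal Python sketch, assuming a homogeneous Poisson pattern in a square window (the intensity and window side are arbitrary illustrative values):

```python
import random

def poisson_count(mu):
    """Sample N ~ Poisson(mu) by counting unit-rate exponential arrivals."""
    n, s = 0, random.expovariate(1.0)
    while s < mu:
        n += 1
        s += random.expovariate(1.0)
    return n

def estimate_intensity(lam, w):
    """Simulate a homogeneous Poisson pattern with intensity lam in the
    window W = [0, w]^2 and return the estimator Nx(W) / |W| of (1.5.1)."""
    n_points = poisson_count(lam * w * w)  # Nx(W) ~ Poisson(lam |W|)
    return n_points / (w * w)

lam_hat = estimate_intensity(50.0, 2.0)  # unbiased: E[lam_hat] = 50
```

Since Nx(W) ∼ Poisson(λ|W|), the estimator is unbiased with variance λ/|W|, so larger windows give more precise estimates.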
Theorem 1.5.1. (Campbell thm.) Let X be a point process on Rd and f : Rd → R+ a non-negative measurable function. Then

E( Σ_{x∈X} f(x) ) = ∫_{Rd} f(x) dΛ(x).

In the stationary case the equation can be written as

E( Σ_{x∈X} f(x) ) = λ ∫_{Rd} f(x) dx.

The proof of this theorem can be found e.g. in [57, page 54].
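The stationary form of the Campbell theorem lends itself to a Monte Carlo check. In the following Python sketch, the test function f(x, y) = xy supported on the unit square is an assumed illustrative choice, for which λ ∫ f(x) dx = λ/4.

```python
import random

def poisson_count(mu):
    """Sample N ~ Poisson(mu) by counting unit-rate exponential arrivals."""
    n, s = 0, random.expovariate(1.0)
    while s < mu:
        n += 1
        s += random.expovariate(1.0)
    return n

def campbell_mc(lam, reps=400):
    """Monte Carlo estimate of E[sum_{x in X} f(x)] for a homogeneous
    Poisson process with intensity lam and f(x, y) = x * y on [0, 1]^2.
    The Campbell theorem predicts lam * integral(f) = lam / 4."""
    total = 0.0
    for _ in range(reps):
        n = poisson_count(lam)  # points of X falling in the support of f
        total += sum(random.random() * random.random() for _ in range(n))
    return total / reps

estimate = campbell_mc(100.0)  # should be close to 100 / 4 = 25
```

The same skeleton can be reused to check the stationary Campbell-Mecke formula (1.5.6) later in this section, by replacing f with a function of a point and the remaining configuration.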
The definition of the first order moment measure can be extended to an arbitrary order n as a measure M(n) on the product space (Sn, ⊗nB):

M(n)(A1 × · · · × An) = E(NX(A1) · · · NX(An)),  A1, . . . , An ∈ B0.

We can also define the nth order factorial moment measure Λ(n) as

Λ(n)(A1 × · · · × An) = E( Σ≠_{ξ1,...,ξn ∈ X} I[ξ1 ∈ A1, . . . , ξn ∈ An] ),  A1, . . . , An ∈ B0,  (1.5.2)

where ≠ indicates that the sum runs only over mutually distinct ξ1, . . . , ξn. The name of this measure is due to the fact that

Λ(n)(A × · · · × A) = E(NX(A)(NX(A) − 1) . . . (NX(A) − n + 1)).
The measure M(n)(A1 × · · · × An) represents the expected number of n-tuples that can be formed from the points of the process, taking the i-th point in Ai and permitting repetitions if the intersections between some of the Aj are non-empty, while Λ(n)(A1 × · · · × An) represents the same quantity without permitting those repetitions. The measure Λ(n) can have a density with respect to Lebesgue measure on (Sn, ⊗nB), which we denote by λ(n). In the case n = 2, the density λ(2) is called the second order product density. It follows directly from the definition of Λ(2) that

λ(2)(x, y) = λ(2)(y, x)  ∀x, y ∈ Rd.  (1.5.3)
The Campbell Theorem 1.5.1 can be generalized to the second order factorial moment measure Λ(2) as follows.

Theorem 1.5.2. Let X be a point process on Rd and f : Rd × Rd → R+ a non-negative measurable function. Then

E( Σ≠_{x,y∈X} f(x, y) ) = ∫_{Rd} ∫_{Rd} f(x, y) λ(2)(x, y) dx dy.
We now define the so called Campbell measures, which will be useful in the next section. The first order Campbell measure on the product space (S × Nlf, B ⊗ Nlf) is defined as

C(A × F) = E(NX(A) I(X ∈ F))  ∀A ∈ B0, F ∈ Nlf.

Notice that we have C(A × Nlf) = Λ(A). The first order reduced Campbell measure is defined as

C!(A × F) = E( Σ_{ξ∈X} I[ξ ∈ A, X∖ξ ∈ F] )  ∀A ∈ B0, F ∈ Nlf.

These measures can be extended to higher orders in the obvious way.
1.5.2 Palm distributions
The Palm distributions of a spatial point process are probability measures Pξ on (Nlf, Nlf), where ξ ∈ S. We will see that Pξ(F) can be heuristically interpreted as P(X ∈ F | NX(Bε(ξ)) > 0), where ε > 0 is arbitrarily small, so Pξ gives the conditional distribution of X given that there is a point of the process at ξ. Formally, we define the Palm distributions in the following way. Consider F ∈ Nlf, the first moment measure Λ(·) and the measure C(·, F), where C(·, ·) is the first order Campbell measure. Directly from the definition we have that

C(·, F) ≪ Λ(·),

where ≪ means "absolutely continuous with respect to", since Λ(A) = 0 =⇒ C(A × F) = 0. From the Radon-Nikodym theorem there exists a density dC(· × F)/dΛ : S → R such that

C(A × F) = ∫_A (dC(ξ × F)/dΛ) dΛ(ξ)  ∀A ∈ B0.

It is possible to choose this density such that, fixing F, we obtain a Borel measurable function and, fixing ξ, we obtain a probability measure on (Nlf, Nlf). We call this probability measure the Palm distribution, so

Pξ(·) = dC(ξ × ·)/dΛ.
We now show heuristically that the Palm distribution can be interpreted as the conditional distribution of X given that there is an event at ξ. In fact, given ε small enough, if we define A := Bε(ξ), we can assume that A contains at most one point of X. With this assumption we have that

C(A × F) ≈ E(I(X ∈ F, NX(A) > 0)) = P(X ∈ F, NX(A) > 0)

and

C(A × F) ≈ Pξ(F)Λ(A) ≈ Pξ(F)P(NX(A) > 0),

so

Pξ(F) ≈ P(X ∈ F, NX(A) > 0) / P(NX(A) > 0) = P(X ∈ F | NX(A) > 0).
In an analogous way, using the reduced Campbell measure, we can define the reduced Palm distribution P!ξ(·). In this case, heuristically, we can interpret P!ξ(·) as the probability distribution of X∖ξ given that X has an event at ξ. From the definition of the reduced Palm distribution, using standard techniques in measure theory, the following formula, known as the Campbell-Mecke theorem, can be proved:

E( Σ_{ξ∈X} h(ξ, X∖ξ) ) = ∫ ∫ h(ξ, x) dP!ξ(x) dΛ(ξ)  (1.5.4)
for non-negative measurable functions h. Consider now the case that
X is stationary. Since the characteristics of the process are the same throughout space, it should not matter which point ξ is fixed when looking at the Palm measure. In fact it can be proved (see [40]) that, if we define

P!0(F) := (1 / (λ|A|)) E( Σ_{ξ ∈ X∩A} I[(X∖ξ) − ξ ∈ F] ),  F ∈ Nlf, A ∈ B0,  (1.5.5)

then

P!ξ(F) = P!0(F_{−ξ}),  F ∈ Nlf.

In the stationary case we can therefore restrict our attention to P!0, which can also be interpreted as the distribution of the remaining points of X given a "typical point" of X at the origin. The Campbell-Mecke theorem, in the stationary case, can be rewritten as

E( Σ_{ξ∈X} h(ξ, X∖{ξ}) ) = λ ∫ ∫ h(ξ, x + ξ) dP!0(x) dξ.  (1.5.6)
Consider now a Poisson point process. We expect that the distribution of the process does not change if we condition on the position of one point of the process, since the scattering of the points is completely random and does not depend on the other positions.

Theorem 1.5.3. (Slivnyak thm.) Let X be a Poisson process on Rd with intensity measure µ. Then PX = P!ξ for almost all ξ ∈ Rd.

For a proof of this theorem see [57, Thm 3.3.5, Notes 3.3.3].
1.5.3 Second order summary statistics
Second order summary statistics, although they do not fully
characterize a point process, are believed to represent important
statistical properties and therefore constitute a widely used tool
for the analysis of point patterns. Second order statistics are
based on the second order factorial moment measure Λ(2) which was
defined in Equation (1.5.2). In this section we assume that X is
stationary and that the product density λ(2) exists. In this case
it can be proved that

λ(2)(x, y) = λ(2)(0, y − x) =: λ(2)(z),   z = y − x,

and that therefore

Λ(2)(A×B) = ∫_A ∫_B λ(2)(y − x) dy dx.

We now define the reduced second-order moment measure K by

λ² K(B) := (1/|A|) E( ∑≠_{ξ∈X∩A, η∈X} I[η − ξ ∈ B] ),   A ∈ B0,

which does not depend on the choice of A. From the definition of K and Λ(2) it follows that

Λ(2)(A×B) = λ² ∫_A K(B − x) dx,
Λ(2)(A×B) = λ ∫_A E!_0(N_X(B − x)) dx,   (1.5.11)

where E!_0 denotes expectation with respect to P!_0, so that λK(B) = E!_0(N_X(B)).
1 Spatial Point Processes
The quantity λK(B) can therefore be interpreted as the expected number of points of X in B, excluding the origin, conditioned on 0 belonging to X. When observing X on all of Rd, an unbiased estimator for λ²K(B) is given by

λ²K̂(B) = (1/|A|) ∑≠_{x∈X∩A, y∈X} I[y − x ∈ B],   A ∈ B0.   (1.5.12)
Unbiasedness follows from Theorem 1.5.2. When observing X in a
finite window W we need to deal with edge effects, since smaller
distances between points are more likely to be observed than larger
ones. In this case an unbiased estimator is given by
λ2K(B) =
|Wx ∩Wy| , (1.5.13)
where the weights 1 |Wx∩Wy | , are called translation edge
correction weights and were introduced
When choosing B as the ball B_r centered at the origin with radius r, the K-measure coincides, as a function of r, with Ripley's K-function, which is widely used in practice, so K(r) = K(B_r). Note that Ripley's K-function, due to the shape of B, assumes both stationarity and isotropy. In Chapter 2, Section 2.3 we will discuss directional versions of the K-function that take anisotropy into account. For a homogeneous Poisson process Ripley's K-function assumes the values

K(r) = k_d r^d,

where k_d denotes the volume of the unit ball in Rd. For clustered processes we expect that K(r) ≥ k_d r^d for small r, and for regular point processes we expect that K(r) ≤ k_d r^d for small r.
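As an illustration, the translation-corrected estimator (1.5.13) can be sketched in a few lines of Python for the unit square W = [0, 1]², where |W_x ∩ W_y| = (1 − |x1 − y1|)(1 − |x2 − y2|). The function and variable names below are ours; for a completely random pattern the estimate should be close to K(r) = πr².

```python
import math
import random

def k_translation(points, r):
    """Translation-corrected estimate of Ripley's K(r) on W = [0,1]^2,
    estimating lambda^2 by n(n-1)/|W|^2 with |W| = 1."""
    n = len(points)
    acc = 0.0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue
            if (xi - xj) ** 2 + (yi - yj) ** 2 <= r * r:
                # |W_x ∩ W_y| for the unit square
                acc += 1.0 / ((1 - abs(xi - xj)) * (1 - abs(yi - yj)))
    return acc / (n * (n - 1))

rng = random.Random(0)
pts = [(rng.random(), rng.random()) for _ in range(400)]  # CSR-like binomial pattern
k_hat = k_translation(pts, 0.1)  # theoretical CSR value: pi * 0.1**2 = 0.0314...
```

The estimator is monotone in r by construction, since enlarging B_r only adds positively weighted pairs.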
The cumulative nature of the K-measure can make it hard to interpret and can sometimes obscure details. This is why sometimes its derivative is considered. Rewriting the K-measure as

K(B) = λ^{−2} ∫_B λ(2)(z) dz,

the function

g(z) := λ(2)(z)/λ²

is called the pair correlation function. The pair correlation function is more practical than the product density λ(2) since it is independent of the intensity. By the definition of density we can interpret λ(2)(z) dx dy as the probability of having two points in the two infinitesimal volumes dx and dy with difference vector z = y − x, while λ dx can be interpreted as the probability of having one point in the infinitesimal volume dx. If the two events of having one point in dx and one point in dy are independent, as in the homogeneous Poisson process, we have g ≡ 1. Values of g > 1 for small ||z|| are typical in the case of clustering, while values of g < 1 indicate repulsion between the points and are typical for regular patterns.
1.5.4 First order summary statistics
In this section we briefly introduce some first order summary
statistics for stationary point processes. The nearest neighbor
distance distribution function G is defined as
G(r) = P_0(d(0, X\{0}) ≤ r),   r > 0.
G(r) is the probability that the typical point 0 of the process has at least one further point of the process within distance r. The empty space function F is analogous to the G-function, with the only difference that the Palm distribution is replaced by PX:

F(r) = P_X(d(0, X) ≤ r),   r > 0,

where in this case 0 a.s. does not belong to the process. F(r) is then the probability of finding, given a generic point of S, at least one event of the process within distance r of this point. Therefore F is the distribution function of the distance between an arbitrary point of S and the nearest point of the process, while G is the distribution function of the distance between the typical point of the process and its nearest neighbor.
The J-function is defined as

J(r) = (1 − G(r)) / (1 − F(r)) = P!_0(N_X(B(0, r)) = 0) / P_X(N_X(B(0, r)) = 0),   for all r > 0 with F(r) < 1,

where 0 is the typical point of the process. Intuitively, if J(r) takes values smaller than 1, the probability of having an empty space larger than r between points of the process is less than the probability of having the same empty space between a generic point and a point of the process, which is typical of clustered patterns. If instead J(r) is larger than 1 we can suspect a more regular pattern. These heuristic observations are confirmed by the fact that for a Poisson process J(r) = 1, as a consequence of Theorem 1.5.3.
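A naive empirical version of these statistics, ignoring the edge corrections that a serious analysis would require, can be sketched as follows (all names are ours); for a CSR-like binomial pattern the resulting J should be close to 1.

```python
import math
import random

def nn_dist(p, points):
    """Distance from p to the nearest point of `points` different from p."""
    return min(math.dist(p, q) for q in points if q != p)

def g_f_j(points, r, n_test=400, seed=1):
    """Border-ignoring estimates of G(r), F(r) and J(r) on [0,1]^2."""
    rng = random.Random(seed)
    # G: nearest-neighbour distances measured from the points of the pattern
    g = sum(nn_dist(p, points) <= r for p in points) / len(points)
    # F: distances from independent uniform test locations to the pattern
    f = sum(min(math.dist((rng.random(), rng.random()), q) for q in points) <= r
            for _ in range(n_test)) / n_test
    j = (1 - g) / (1 - f) if f < 1 else float("nan")
    return g, f, j

rng = random.Random(7)
pts = [(rng.random(), rng.random()) for _ in range(500)]
g, f, j = g_f_j(pts, 0.025)  # for CSR, 1 - G(r) = 1 - F(r) = exp(-lam * pi * r^2)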
1.6 Strauss process
In this section we first introduce the class of processes that have
a density with respect to a Poisson process with intensity measure
µ, defined on a set S ⊂ Rd with µ(S) < ∞. We then take into
consideration a particular process belonging to this class: the
Strauss process. The Strauss process will be considered in the
simulation studies of Chapter 3.
Definition 1.6.1. We say that a process X has density p : Nlf → R+ with respect to the Poisson process with intensity measure µ if

P(X ∈ F) = ∫_F p(x) dΠ(x),   F ∈ Nlf,

where Π denotes the probability distribution induced on (Nlf, Nlf) by the Poisson process.
From the Radon-Nikodym theorem we have that every process that induces a probability measure on (Nlf, Nlf) which is absolutely continuous with respect to Π has a density with respect to the Poisson process, and vice versa.
To be a probability density, p(·) has to integrate to 1 over Nlf with respect to Π. Given a function p : Nlf → R+ we want to give sufficient conditions for p to be a probability density. Since p(·) is usually known only up to a normalizing constant, we will consider h(·) = p(·)Z, where Z is the unknown normalizing constant, and we describe conditions that assure that

∫_{Nlf} h(x) dΠ(x) < ∞.

Two possible conditions are called local stability and Ruelle stability.
Definition 1.6.2. A non-negative measurable function h on Nlf is locally stable if

∃K > 0 such that ∀x ∈ Nlf, ∀ξ ∈ S\x : h(x ∪ {ξ}) ≤ K h(x),

and Ruelle stable if

∃K > 0, c > 0 such that ∀x ∈ Nlf : h(x) ≤ c K^{N_x(S)}.

Local stability implies Ruelle stability, which implies integrability of h [59].
Definition 1.6.3. We call processes that have a locally stable
density with respect to Π
locally stable point processes.
Definition 1.6.4. Given a point process X that has density p(·) with respect to Π, we define the Papangelou conditional intensity of X as

λ(x, ξ) = p(x ∪ {ξ}) / p(x),

taking λ(x, ξ) = 0 if p(x) = 0.
Notice that
• The local stability condition implies that the Papangelou conditional intensity is uniformly bounded from above.
• The Papangelou conditional intensity does not depend on the normalizing constant of the density p(·), which is unknown in most cases.
• Heuristically, the Papangelou conditional intensity λ(x, ξ) of a process X can be interpreted as

λ(x, ξ) dξ = P(N_X(dξ) = 1 | X ∩ (dξ)^C = x ∩ (dξ)^C),

so as the probability of finding a point in an infinitesimal region dξ around ξ given that the point process agrees with the configuration x outside dξ.
Definition 1.6.5. Suppose we have a point process X with Papangelou conditional intensity λ(x, ξ). We say that X is attractive if

λ(x, ξ) ≤ λ(y, ξ)   ∀x ⊆ y ∈ Nlf,

and repulsive if

λ(x, ξ) ≥ λ(y, ξ)   ∀x ⊆ y ∈ Nlf.

Intuitively, attractivity means that the chance that ξ ∈ X, given that X\{ξ} = x, is an increasing function of x, while repulsivity means the opposite.
We now give the definition of the Strauss process.
Definition 1.6.6. We say that a point process X is a Strauss process with parameters θ = (β, γ, r0), where β > 0, 0 ≤ γ ≤ 1, r0 > 0, if X has density

p_θ(x) = (1/Z_θ) β^{N_x(S)} γ^{s_{r0}(x)}   (1.6.1)

with respect to the measure Π induced by a homogeneous Poisson process on S with intensity 1, where Z_θ is the unknown normalizing constant and

s_{r0}(x) = ∑_{{ξ1,ξ2}⊆x : ξ1≠ξ2} I[d(ξ1, ξ2) ≤ r0]

is the number of pairs of distinct points of the configuration x that have distance less than r0 from each other.
Proposition 1.6.1. The Strauss process is a locally stable, repulsive point process.

Proof. The Papangelou conditional intensity of a Strauss process is equal to

λ(x, ξ) = β^{N_{x∪{ξ}}(S) − N_x(S)} γ^{s_{r0}(x∪{ξ}) − s_{r0}(x)} = β γ^{t_{r0}(x, ξ)},   ξ ∉ x,

where we have denoted

t_{r0}(x, ξ) = s_{r0}(x ∪ {ξ}) − s_{r0}(x),

which is the number of points of the configuration x that have distance less than r0 from ξ. The local stability follows from

γ^{t_{r0}(x, ξ)} ≤ 1,   since 0 ≤ γ ≤ 1 and t_{r0}(x, ξ) ≥ 0,

and the repulsivity from

t_{r0}(x, ξ) ≤ t_{r0}(y, ξ)   if x ⊆ y.
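The quantities appearing in this proof translate directly into code. The following Python sketch (our naming) computes t_{r0}(x, ξ) and the Papangelou conditional intensity βγ^{t_{r0}(x,ξ)}; note that Python's convention 0**0 == 1 makes the same formula cover the hardcore case γ = 0.

```python
import math

def t_r0(x, xi, r0):
    """Number of points of the configuration x within distance r0 of xi."""
    return sum(1 for p in x if math.dist(p, xi) <= r0)

def papangelou_strauss(x, xi, beta, gamma, r0):
    """Papangelou conditional intensity beta * gamma**t of a Strauss process,
    for xi not in x; gamma = 0 gives the hardcore case (0**0 == 1)."""
    return beta * gamma ** t_r0(x, xi, r0)

x = [(0.0, 0.0), (0.03, 0.0)]
lam_soft = papangelou_strauss(x, (0.01, 0.0), beta=100.0, gamma=0.5, r0=0.05)  # t = 2
lam_hard = papangelou_strauss(x, (0.5, 0.5), beta=100.0, gamma=0.0, r0=0.05)   # t = 0
```

For the first call both points of x are within r0 of ξ, so the intensity is 100 · 0.5² = 25; the hardcore call is far from all points and returns β = 100.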
The normalization constant Z_θ is not explicitly known and its estimation, if needed, is not straightforward. We mention here a possible approximation used by Cressie and Lawson in [10], based on a Poisson approximation (see [55]):

Z_θ ≈ exp( (β²|W|²)/(2|W|) k_d r_0^d (γ − 1) ).   (1.6.2)
In Definition 1.6.6, β is called the intensity parameter, γ the
interaction parameter and r0 the interaction radius. Realizations
of the Strauss process have different characteristics depending on
the values of these parameters (Figure 1.6.1). Typically, if γ is
close to 0, the realizations look more regular than in the case in
which γ is close to 1. Therefore the parameter γ will also be
called the regularity parameter. Consider the extreme cases. If γ = 0, since the density assumes values different from 0 only if s_{r0}(x) = 0, we obtain the so-called hardcore process, in which pairs of points with distance less than r0 are prohibited. Instead, in the case γ = 1, we obtain
a Poisson process which allows arbitrarily close points. Decreasing
γ is not the only way to obtain a more regular pattern. Another way
is to increase r0 while fixing the other parameters. Notice that this highlights that the parameters r0 and γ are strongly related to each other: from a pattern alone it is not easy to tell whether, for example, r0 is large or γ is small. This type of correlation can cause problems if estimates of the parameters of a Strauss process are needed. The parameter β is related to the intensity λ of the process. Note that λ cannot be computed explicitly, even if the values of the parameters β, γ and r0 are known. A possible approximation of λ, given the parameters of the process, was introduced by Baddeley and Nair in [3] and is given by

λ ≈ W0(βΓ)/Γ,

where W0 is the principal branch of Lambert's W function (see [9]) and Γ = −k_d r_0^d log γ.
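This approximation is easy to evaluate without special libraries, since W0 can be computed by a few Newton iterations on w e^w = x. The sketch below is our code (with k_d hard-coded for d = 2, 3, and valid for 0 < γ ≤ 1); for γ = 1 we have Γ = 0 and the Poisson value λ = β is returned.

```python
import math

def lambert_w0(x, tol=1e-12):
    """Principal branch W0(x) for x >= 0, via Newton iteration on w*e^w = x."""
    w = math.log1p(x)  # starting guess
    for _ in range(100):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w

def strauss_intensity(beta, gamma, r0, d=2):
    """Approximate intensity W0(beta*Gamma)/Gamma, Gamma = -k_d r0^d log(gamma),
    for 0 < gamma <= 1 (gamma = 1 gives the Poisson intensity beta)."""
    kd = math.pi if d == 2 else 4.0 * math.pi / 3.0  # unit-ball volume, d = 2 or 3
    G = -kd * r0 ** d * math.log(gamma)
    return beta if G == 0.0 else lambert_w0(beta * G) / G
```

For β = 200, γ = 0.5, r0 = 0.05 in 2D this gives an intensity of roughly 110, illustrating how the interaction thins the pattern below β.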
In Figure 1.6.1 we show some realizations of the Strauss process in
the observation window [0, 1] × [0, 1] with different values of the
parameters r0 and γ, while the parameter β is fixed to 200. The two
rows correspond to two different values of r0 and in every row we
consider three different increasing values of γ.
The Strauss process is a pairwise interaction point process and in particular a Gibbs or Markov point process ([40]). Using the properties of Markov point processes, the definition of the Strauss process on a finite set S can be extended to Rd, e.g. by using the local specification characterization as in [40, page 95]. Such an extension is a stationary point process on Rd (the Poisson process with intensity 1 is stationary and t_{r0} is invariant under translations and rotations).
Figure 1.6.1: Simulations of Strauss processes with different values of the parameters γ and r. In the first row r = 0.02, in the second r = 0.06. In the first column γ = 0, in the second γ = 0.3 and in the last column γ = 0.6.
The Strauss process on a finite set S can be simulated for example
by using the Metropolis Hastings algorithm as described in Section
1.7.3.
1.7 The Metropolis Hastings algorithm
In this section we briefly introduce the Metropolis Hastings algorithm and show how to apply it to simulate locally stable point processes on a set S ⊂ Rd with |S| < ∞. A more detailed discussion of the topics of this section can be found in [59, Chapter 2].
We first give a short description of Markov chains which are discrete in time, but with general state space. Consider a measure space (Y, Y), with Y countably generated. A discrete in time homogeneous Markov chain on (Y, Y) is a process Y_n characterized by an initial distribution ν on (Y, Y) and a transition kernel P : Y × Y → [0, 1] such that

P(Y_0 ∈ A) = ν(A),   A ∈ Y,
P(Y_n ∈ A | Y_{n−1} = x) = P(x, A),   A ∈ Y, x ∈ Y.
Definition 1.7.1. Let µ and ν be two measures on (Y, Y). The total variation norm between µ and ν is defined as

||µ(·) − ν(·)||_v = sup_{A∈Y} |µ(A) − ν(A)|.
Definition 1.7.2. We say that the chain Y_n converges in equilibrium to a measure π on (Y, Y) as n → ∞ if

lim_{n→∞} ||P^n(x, ·) − π(·)||_v = 0   for π-a.a. x ∈ Y,

where P^n : Y × Y → [0, 1] is the n-step transition probability, which satisfies

P(Y_n ∈ A | Y_0 = x) = P^n(x, A),   A ∈ Y, x ∈ Y.
The Metropolis Hastings algorithm is an MCMC (Markov chain Monte Carlo) method, and aims at obtaining a sample from a distribution with density π with respect to a measure µ defined on a measure space (Y, Y). Usually this algorithm is needed when π is known only up to a normalizing constant and therefore direct sampling is not available. The basic idea of the method is to simulate, for a sufficiently long time, a discrete in time Markov chain with state space Y that has equilibrium density given by π.
1.7.1 The algorithm
The algorithm consists in building the following discrete in time Markov chain. Suppose that at the n-th iteration the chain is in state x. The (n+1)-th step is built by

• proposing a new state y using a density q(y, x) (with respect to µ),

• accepting or refusing y as the state of the (n+1)-th iteration using the acceptance probability

α(y, x) = min{1, H(y, x)} if π(x)q(y, x) > 0,   α(y, x) = 1 if π(x)q(y, x) = 0,

where H(y, x) is called the Hastings ratio and is given by

H(y, x) = (π(y) q(x, y)) / (π(x) q(y, x)).

Note that H(y, x) depends on π only through ratios, so for applying this algorithm it is not necessary to know the normalizing constant of π.
1.7.2 Convergence of the algorithm
It can be proved that the Metropolis Hastings algorithm converges in equilibrium to the density π, provided the proposal density q(·, ·) is chosen such that the constructed Markov chain is aperiodic and irreducible [59]. A good proposal density q

• is easy to implement in practice,

• has a high acceptance rate,

• provides a good mixing of the chain, so that the whole range of states is visited "often" and not only a part of it,

• guarantees no cyclic behavior of the chain.
Notice that not only the convergence of the algorithm, but also the
rate of convergence depends on the choice of q. For example a high
rejection rate can make the convergence slower. The Metropolis
Hastings algorithm gives us a way for sampling from a density π by
running a chain for a suitable number of iterations such that the
chain has reached equilibrium. We have however to consider
that
(i) If we want a multi-dimensional sample, the sample we obtain by
running a single chain is not independent.
(ii) The density of the sample is only asymptotically equal to
π.
(iii) We do not know the rate of convergence, and thus we do not know for how many iterations the chain should be run before approximately reaching equilibrium.
Regarding the first point, one could run multiple independent
chains, although this leads to a high computational cost. Another
possibility is to thin the chain and take its values every k-th
iteration, obtaining an approximately independent sample. Regarding the third point, in practice, since theoretical results are in general difficult to apply, methods such as the ones introduced by Raftery and Lewis in [48] are used. These methods first run the algorithm in order to obtain one or more pilot samples. The number of iterations is then determined by applying convergence diagnostics to the pilot samples. To avoid the problem of the third point one can also use an alternative to the Metropolis Hastings algorithm called dominated coupling from the past (DCFTP) [40]. Once it has converged, DCFTP gives an exact sample from π. It can however happen that the algorithm takes a long time to converge.
1.7.3 Simulation of locally stable point processes
The Metropolis Hastings algorithm can be used to simulate locally
stable processes which have a density p with respect to Π where p
is usually known only up to a normalizing constant. In this case
the state space is (Y,Y) = (Nlf ,Nlf ). It is also possible to use
the Metropolis algorithm to simulate from the conditional (on
having n points) versions of those densities. Let us first consider
the unconditional case. The proposal distribution q can be chosen
as
• propose a birth with probability q(x), where the new point u ∈ S
is sampled from a density b(x, u) with respect to µ.
• propose the death of a preexisting point with probability 1 − q(x), where the point ξ ∈ x to be deleted is sampled from a density d(x, ξ) on the point configuration x.
With this choice of q, the acceptance probability for a proposed birth is α(x ∪ {u}, x) = min{1, r(x, u)} with Hastings ratio

r(x, u) = ( (1 − q(x ∪ {u})) p(x ∪ {u}) d(x ∪ {u}, u) ) / ( q(x) p(x) b(x, u) ),   x ∈ Nlf, u ∈ S,
and similarly, with the roles of birth and death exchanged, for a proposed death. A standard choice is q(x) = 1/2 and d(x, ξ) = 1/N_x(S), i.e. the point to be deleted is chosen uniformly from x.
It can be proved that under some conditions on b, d and q, which
are fulfilled by the previous choices, the algorithm converges to a
distribution with the specified density p. For the conditional
case, when we fix the total number of points to n, the algorithm
starts with a point pattern having n points and at each iteration
it will be proposed to replace an old point with a new proposed
point. For details see [40] page 108. Two other possible ways to
simulate locally stable processes are spatial birth and death pro-
cesses [26] and/or dominated coupling from the past [40] (exact
simulations). An exact simulation of the Strauss process in 2D can
be obtained by using the function rStrauss of the R-package
spatstat. Both 2D and 3D simulations of the Strauss process using
the Metropolis Hastings algorithm can be obtained by using the
function rstrauss which can be found in the R-package rstrauss in
https://github.com/antiphon/rstrauss.
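For readers without R at hand, the unconditional birth-death sampler described above can also be sketched directly in Python for S = [0, 1]² (so |S| = 1), with q(x) = 1/2, uniform birth proposal and uniform death. The acceptance ratios are written via the Papangelou conditional intensity βγ^{t_{r0}}; the treatment of the empty configuration is simplified (a birth is then always proposed), and all names are ours.

```python
import random

def n_close(x, p, r0):
    """Number of points of x (other than p itself) within distance r0 of p."""
    return sum(1 for q in x
               if q is not p and (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2 <= r0 * r0)

def strauss_mh(beta, gamma, r0, n_iter, seed=0):
    """Birth-death Metropolis-Hastings for a Strauss process on S = [0,1]^2."""
    rng = random.Random(seed)
    x = []
    for _ in range(n_iter):
        if not x or rng.random() < 0.5:           # propose a birth
            u = (rng.random(), rng.random())
            r = beta * gamma ** n_close(x, u, r0) / (len(x) + 1)
            if rng.random() < min(1.0, r):
                x.append(u)
        else:                                      # propose a death
            i = rng.randrange(len(x))
            r = len(x) / (beta * gamma ** n_close(x, x[i], r0))
            if rng.random() < min(1.0, r):
                x.pop(i)
    return x

pattern = strauss_mh(beta=100.0, gamma=0.0, r0=0.05, n_iter=20_000)  # hardcore case
```

In the hardcore case γ = 0 every birth proposal with a neighbour closer than r0 has ratio 0 and is rejected, so the returned pattern satisfies the hardcore constraint by construction.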
2 Directional Analysis
In this chapter we describe different methods for the directional analysis of a stationary point process, which are illustrated on two simulated data sets, one regular and one clustered. Although the directional methods are introduced in the general case of a stationary point process X defined on Rd, special attention is given to their application to regular patterns subjected to a particular type of anisotropy mechanism, called geometric anisotropy, which is described in Section 2.1. In Sections 2.3, 2.4 and 2.5 we describe the directional methods. In Section 2.2 we introduce the so-called Fry points, which will be important throughout the chapter.
2.1 Settings
Let X be a simple stationary point process on Rd, with intensity λ
and second order product density λ(2). Since we assume that X has
no duplicate points, λ(2)(x, x), x ∈ Rd, is not well defined and is set equal to 0. We moreover assume that X is observed in a compact
window W ⊂ Rd.
We now describe in detail, and introduce notation for, a particular type of anisotropy mechanism, which has been called geometric anisotropy in [36]. Let X0 be a stationary and isotropic point
process and define the point process
X = TX0 = {Tx : x ∈ X0} (2.1.1)
where T : Rd → Rd is an invertible linear mapping, which
corresponds to a d× d matrix also denoted by T . We assume here
that det(T ) > 0. If det(T ) = 1, the transformation T is called
volume preserving. T can be decomposed by using the singular value
decomposition
T = R1CR2
where R1 and R2 correspond to rotations and C is a diagonal matrix
with strictly positive entries. Since X0 is isotropic we have
that
TX0 = R1CR2X0 ∼ R1CX0.
Therefore it is sufficient to consider a linear mapping T of the
form
T = RC. (2.1.2)
The matrix C “rescales” X0 along the coordinate axes, whereas the
matrix R rotates the deformed process CX0. The axes obtained by
rotating the coordinate axes by R are called the deformation axes of T.
The point process X that we get after the transformation, is a
stationary point process with intensity λX = det(T−1)λX0 . If the
matrix C is not a multiple of the identity matrix, X can be
anisotropic. Note that, in the case X0 is a stationary Poisson
process, X remains a stationary Poisson process, only with
different intensity. Geometric anisotropy has already been
considered in the literature with X0 clustered or regular, both in the 2D and in the 3D case, for real and simulated data.
Regarding the simulated data, the cluster case has been considered
in 2D in [36] with log-Gaussian Cox processes and shot noise Cox
processes, in [22] and in [66] with anisotropic Thomas processes.
The regular case has been considered in [53] with Matern hard core
processes (in 3D), in [66] with Gibbs hardcore processes (in 2D)
and in [52] with Strauss processes (both in 2D and in 3D). In this
thesis we focus on the regular case in both 2D and 3D. As in [52],
in our simulation study in Chapter 3, we consider realizations of
Strauss processes. Let now X be a point process on R2 or on R3, generated by the geometric anisotropy mechanism. Motivated by our application (Chapter 5) we assume T to be volume preserving. In 2D the scaling matrix C then assumes the form (since det(T) = 1)

C = ( c    0
      0   1/c ),   (2.1.3)

where we assume that the strength of compression c satisfies 0 < c ≤ 1. In 3D the
scaling matrix C assumes the form

C = ( c2/√c1    0         0
      0         c1        0
      0         0    1/(√c1 c2) ),   (2.1.4)

where we assume that 0 < c1 ≤ c2/√c1, so that c2 ≥ c1√c1. We call c1 the strength of main compression and c2 the strength of additional compression. If c2 = 1 we have only one axis of compression; the other two deformation axes are elongated with equal strengths. If c1 = c2/√c1 we have one axis of elongation and two axes of compression which are deformed with equal strengths. In both cases T is a spheroidal transform. Let us now consider in 2D 0 < c < 1 and in 3D 0 < c1 < c2/√c1. Given our (non-restrictive)
assumptions on the order of the elements of the diagonal of C, in
2D the process is compressed along the image (by applying the
rotation R) of the x-axis, and dilated along the image of the
y-axis. In 3D the process is compressed along the image of the y
and x axes and dilated along the image of z. Since the compression
along the image of y is stronger than the compression along x, we
say that the image of y is the axis of main compression and the
image of x is the axis of additional compression. In 2D the
deformation axes can be simply represented by the angle θ ∈ [0, π]
that the axis of compression forms with the x-axis
(counterclockwise). From now on we will call θ the direction of
compression. The matrix R can be expressed as

R = ( cos θ   −sin θ
      sin θ    cos θ ).   (2.1.5)
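In code, applying the 2D transformation T = RC to a pattern is a one-liner per point. The sketch below (our names) compresses along the x-axis by c, dilates along y by 1/c and then rotates by θ counterclockwise; since det T = 1 the transformation is volume preserving.

```python
import math

def geometric_anisotropy_2d(points, c, theta):
    """Apply T = R C, with C = diag(c, 1/c) and R the counterclockwise
    rotation by theta, to every point of the pattern."""
    ct, st = math.cos(theta), math.sin(theta)
    out = []
    for (x, y) in points:
        cx, cy = c * x, y / c          # compression along x, dilation along y
        out.append((ct * cx - st * cy, st * cx + ct * cy))
    return out
```

For example, with c = 0.5 and θ = 0 the point (1, 1) is mapped to (0.5, 2).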
In 3D we denote the axes of deformation (in order: the axis of elongation, the axis of additional compression and the axis of main compression) u1, u2, u3. The same notation will be used to denote the directions of the deformation axes with nonnegative z values, which belong to (S2)+. We call these directions directions of deformation. In the d-dimensional case we extend the
notation in the obvious way.
Besides geometric anisotropy, other anisotropy mechanisms could have been taken into consideration. An example of anisotropic stationary point processes not generated by geometric anisotropy are Poisson processes (or in general stationary processes) with increased intensity along directed lines (see for example the Poisson line cluster point process (PLCPP) model in [35] and the models in [56]). These processes can be considered stationary if the distribution of the lines is stationary.
2.1.1 Aims
Given the assumption of geometric anisotropy, our specific aims
are
• Estimate the rotation R, that is, the axes of deformation.

• Estimate the matrix C, that is, the strength c in 2D and the strengths c1 and c2 in 3D.
In Sections 2.3.1, 2.4.1 and 2.5.1 we consider the estimation of R,
while in Section 2.6 we consider the estimation of C.
2.1.2 Illustrative examples
In this section we show two realizations of 2D point processes, one
regular and the other clustered, that we will use to show the basic
ideas and the typical results of the considered directional
methods. Both examples are constructed by using the geometric
anisotropy mechanism. For the regular case we chose X0 as a
Strauss process with fixed number of points n = 300 and parameters
γ = 0, r0 = 0.04. For the clustered case we chose X0 as a Matern
Cluster Process with radius of the clusters equal to 0.03,
intensity of the Poisson process that determines the cluster
centers equal to 10 and with an average of 40 points per cluster.
For the simulation we used the function rMatClust of the R-package
spatstat. In both the clus- tered and the regular case, we fixed R
as the identity matrix, applying no rotation to X0 and we fixed the
strength of compression c = 0.5. For details on how the
realizations of these processes can be obtained see Section 3.1.
The realization of the regular case is shown in the plot on the
left of Figure 2.1.1 and the realization of the clustered case in
the plot on the right.
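The clustered example uses rMatClust from spatstat; for readers without R, the mechanism is simple enough to sketch in Python (our code, with no edge treatment: offspring falling outside [0, 1]² are kept). Parents form a Poisson process with intensity κ on the unit square, and each parent receives a Poisson(µ) number of offspring placed uniformly in a disk of the given radius around it.

```python
import math
import random

def poisson_count(mu, rng):
    """Poisson random count via Knuth's product-of-uniforms method."""
    limit, k, p = math.exp(-mu), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def matern_cluster(kappa, mu, radius, rng):
    """Matern cluster process with parent intensity kappa on [0,1]^2,
    mean mu offspring per parent, uniform in a disk of the given radius."""
    pts = []
    for _ in range(poisson_count(kappa, rng)):
        px, py = rng.random(), rng.random()
        for _ in range(poisson_count(mu, rng)):
            r = radius * math.sqrt(rng.random())   # uniform in the disk
            a = 2.0 * math.pi * rng.random()
            pts.append((px + r * math.cos(a), py + r * math.sin(a)))
    return pts

pts = matern_cluster(kappa=10.0, mu=40.0, radius=0.03, rng=random.Random(3))
```

The expected number of points is κµ = 400; composing this with the transformation T of Section 2.1 yields patterns like the clustered example.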
Figure 2.1.1: Realization of the regular example (left) and of the clustered example (right).

In the clustered pattern, the axes of dilation and compression
are visually detectable by looking at the shape of the clusters
which are elongated along the axis of elongation x and are
compressed along the axis of compression y. In the case of the
regular pattern, the compression and the dilation axes are not so
clearly visible.
2.2 Fry points
In this section we introduce the so called Fry points, which will
be considered in all the following sections. The Fry points have
been first introduced by Fry in [20]. We define the Fry points of a
stationary point process X as
ZA := {y − x, x 6= y, x ∈ A, y ∈ X} A ∈ B0. (2.2.1)
In Equation (2.2.1) we need to consider x ∈ A ∈ B0 since, if we considered all points of X on Rd, Z_A would not be locally finite. In practice, when observing X in a finite observation window W, we can only observe the pairwise difference vectors

Z_W := Z := {y − x : x ≠ y, x, y ∈ X_W },   (2.2.2)
which we also call Fry points. The set Z is symmetric with respect
to the origin since y − x and −(y − x) both belong to Z and is
affected by edge effects. We denote the observation window of Z,
which depends on W , by W ∗. From now on we concentrate only on the
set Z.
In the next sections we will see that the Fry points can be
exploited in order to analyze anisotropy in stationary point
processes. Moreover, due to their structure, the Fry points are
useful to visualize anisotropy both in 2D and in 3D (Section 4.4).
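Computing Z from an observed pattern is straightforward; the sketch below (our naming) also checks the symmetry of Z with respect to the origin, which holds exactly in floating-point arithmetic since y − x = −(x − y).

```python
def fry_points(points):
    """All pairwise difference vectors y - x with x != y (the set Z for X_W)."""
    return [(y0 - x0, y1 - x1)
            for (x0, x1) in points
            for (y0, y1) in points
            if (x0, x1) != (y0, y1)]

pts = [(0.1, 0.2), (0.4, 0.9), (0.7, 0.3)]
z = fry_points(pts)                         # n*(n-1) = 6 difference vectors
sym = all((-a, -b) in set(z) for (a, b) in z)
```

Plotting z for the patterns of Figure 2.1.1 already reveals the compressed geometry of the difference vectors in the anisotropic case.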
Let us first look at the properties of the Fry points Z under
isotropy. If X is isotropic and if W = Br(c), r ∈ R+, c ∈ Rd, we
will prove that the distribution of the Fry points Z is
rotationally symmetric with respect to rotations about the origin.
The condition on W is necessary since it implies that the window W
∗ is also a ball and therefore invariant under rotations. If W is
not a ball we can always restrict ourselves to the biggest ball contained in W. If R0 ∈ SO_d we can write

R0Z = {R0y − R0x : x ≠ y, x, y ∈ X_W } = {y − x : x ≠ y, x, y ∈ R0(X_W)}.   (2.2.3)
Moreover,

R0(X_W) (1)∼ (R0X)_W (2)∼ X_W,   (2.2.4)

where in (1) we exploited the fact that W is a ball and the stationarity of X, and in (2) the isotropy of X. From Equation (2.2.3) and Equation (2.2.4) we can easily derive that

R0Z ∼ Z   ∀R0 ∈ SO_d.   (2.2.5)
Since Equation (2.2.2) considers X restricted to the observation window W, estimators involving X and Z are both affected by edge effects. For instance, in W smaller distances between points are more likely to be observed than larger distances. Edge effects can be treated in different ways. In estimators involving Z, the translation edge correction weights already introduced in Equation (1.5.13) are particularly useful, since they can provide unbiased estimators based only on X_W. Another possible edge
treatment is given by the so-called minus-sampling. In this case only the differences

{y − x : ||y − x|| < dist(x, ∂W)}   (2.2.6)