
Estimation of ordinal pattern probabilities in Gaussian

processes with stationary increments

Mathieu Sinn (a), Karsten Keller (∗,b)

(a) David R. Cheriton School of Computer Science, University of Waterloo, 200 University Avenue West, Waterloo, Ontario, Canada N2L 3G1

(b) Institute of Mathematics, University of Luebeck, Wallstrasse 40, D-23560 Lübeck, Germany

Abstract

The investigation of ordinal pattern distributions is a novel approach to quantifying the complexity of time series and detecting changes in the underlying dynamics. Being fast and robust against monotone distortions, this method is particularly well-suited for the analysis of long biophysical time series where the exact calibration of the measurement device is unknown.

In this paper we investigate properties of the estimators of ordinal pattern probabilities in discrete-time Gaussian processes with stationary increments. We show that better estimators than the “sample frequency estimators” are available because the considered processes are subject to certain statistical symmetries. Furthermore, we establish sufficient conditions for the estimators to be strongly consistent and asymptotically normal.

As an application, we discuss the Zero-Crossing (ZC) estimator of the Hurst parameter in fractional Brownian motion and compare its performance to that of a similar “metric” estimator by simulation studies.

Key words: Time series analysis, ordinal pattern, stochastic process, estimation, fractional Brownian motion

2000 MSC: 60G15, 62M10, 60G18

∗ Corresponding author.
Email addresses: [email protected] (Mathieu Sinn), [email protected] (Karsten Keller)

Preprint submitted to Computational Statistics & Data Analysis September 25, 2009


1. Introduction

One of the main challenges in time series analysis these days is the computational complexity due to the length and the high resolution of data sets. For instance, time series in finance, medicine or meteorology often consist of several thousand data points. Therefore, scalability of methods is a crucial requirement.

Ordinal time series analysis is a new approach to the investigation of long and complex time series (see Bandt (2004), Keller et al. (2007)). The basic idea is to consider the order relations among the values of a time series instead of the values themselves. As major advantages compared to methods which take the exact metric structure into account, ordinal methods are particularly fast and robust (see Bandt and Pompe (2002), Keller and Sinn (2005)). Moreover, the order structure is invariant with respect to different offsets or scalings of a time series, which is important for the modelling of observations where the exact calibration of the measurement device is unknown.

The key ingredient to ordinal time series analysis is the concept of ordinal patterns (or “order patterns” according to the terminology of Bandt and Shiha (2007)). An ordinal pattern represents the order relations among a fixed number of equidistant values in a time series. If the values of the time series are pairwise different, ordinal patterns can be identified with permutations.

Statistics of the distribution of ordinal patterns in a time series (and in parts of it, respectively) provide information on the dynamics of the underlying system. One such statistic is the permutation entropy introduced by Bandt and Pompe (2002) as a measure for the complexity of time series. Permutation entropy has been applied to detect and investigate qualitative changes in brain dynamics as measured by an electroencephalogram (EEG) (see, e.g., Keller and Lauffer (2002), Cao (2004), Li et al. (2007, 2008)). Further statistics besides permutation entropy measure the symmetry of time series or quantify the amount of “zigzag” (see Bandt and Shiha (2007), Keller et al. (2007)).

The distribution of ordinal patterns in discrete-time real-valued stochastic processes was first investigated by Bandt and Shiha (2007). For special classes of Gaussian processes, they derive the probabilities of ordinal patterns describing the order relations between three and four successive observations. Some of their results apply more generally to processes with non-degenerate and symmetric finite-dimensional distributions.

In this paper we study statistical properties of the estimators of ordinal pattern probabilities. The framework of our analysis is discrete-time real-valued Gaussian processes with stationary and non-degenerate increments. Since only the order relations are considered, our results apply also to monotone functionals of such processes.

In Section 2 we give general statements on ordinal pattern probabilities and their estimation. As the distribution of ordinal patterns in processes with stationary increments is time-invariant, unbiased estimators of ordinal pattern probabilities are given by the corresponding sample frequencies. By using symmetries of the distribution of Gaussian processes, we derive estimators with strictly lower risk with respect to convex loss functions. We show that these estimators are strongly consistent and give sufficient conditions for asymptotic normality.

In Section 3 we apply the results to ordinal patterns describing the order structure of three successive observations. We show that “reasonable” estimators of the probability of such patterns can be expressed as an affine function of the frequency of changes between “upwards” and “downwards”. When the probability of a change is monotonically related to underlying process parameters, we derive estimators of such parameters. As an example, we discuss the Zero-Crossing (ZC) estimator of the Hurst parameter in fractional Brownian motion (see Coeurjolly (2000)).

Section 4 illustrates the results and compares the ZC estimator with a “metric” analogue by simulation studies. As an interesting finding of the simulations, an even number of changes between “upwards” and “downwards” in a sample is much more likely than an odd number when the Hurst parameter is large.

2. Ordinal patterns and their probabilities

Preliminaries. As usual, let N = {1, 2, . . .} and Z = {. . . , −1, 0, 1, . . .}, and let RZ be the space of sequences (zt)t∈Z with zt ∈ R for all t ∈ Z. By B(R), B(Rn) and B(RZ), we denote the Borel σ-algebras on R, Rn and RZ.

For d ∈ N, let Sd denote the set of permutations of {0, 1, . . . , d}, which we write as (d+1)-tuples containing each of the numbers 0, 1, . . . , d exactly once. By the ordinal pattern of x = (x0, x1, . . . , xd) ∈ Rd+1 we understand the unique permutation

π(x) = π(x0, x1, . . . , xd) = (r0, r1, . . . , rd)


of {0, 1, . . . , d} which satisfies

xr0 ≥ xr1 ≥ . . . ≥ xrd

and ri−1 > ri if xri−1 = xri

for i = 1, 2, . . . , d. The second condition is necessary to guarantee the uniqueness of (r0, r1, . . . , rd) if there are equal values among x0, x1, . . . , xd. We may regard π(x) as a representation of the rank order of x0, x1, . . . , xd. If xi = xj for i, j ∈ {0, 1, . . . , d} with i < j, then xj is ranked higher than xi. When x0, x1, . . . , xd are pairwise different, the order relation between any two components of x (being either < or >) can be obtained from π(x).

Ordinal time series analysis is based on counting ordinal patterns in a time series. The way of getting the ordinal pattern at some time t ∈ Z for some fixed d ∈ N is illustrated by the following example.

Example. Figure 1 shows the values of a time series (xt)t∈Z at times 24, 25, . . . , 36. For t = 27 and d = 5, we have

(xt, xt+1, . . . , xt+d) = (0.3, 0.1, 0.5, 0.9, 0.7, 0.5).

In Figure 1, the given values are connected by black line segments. Since (r0, r1, . . . , rd) = (3, 4, 5, 2, 0, 1) is the only permutation of {0, 1, . . . , d} satisfying

xt+r0 ≥ xt+r1 ≥ . . . ≥ xt+rd,

we obtain π(xt, xt+1, . . . , xt+d) = (3, 4, 5, 2, 0, 1).

In order to get the whole ordinal pattern distribution of a (part of a) time series for d ∈ N, one has to determine π(xt, xt+1, . . . , xt+d) for all times t of interest. This can be done by a very efficient algorithm which takes into account the overlapping of successive vectors (see Keller et al. (2007)). Instead of the permutation representation of an ordinal pattern, this algorithm uses an equivalent representation by a sequence of inversion numbers.
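As a concrete sketch of the definition above, the ordinal pattern of a vector can be computed by sorting indices. The function name ordinal_pattern and the plain sort (rather than the inversion-number algorithm of Keller et al. (2007)) are our own choices for illustration:

```python
def ordinal_pattern(x):
    """Return the ordinal pattern (r0, r1, ..., rd) of x = (x0, x1, ..., xd):
    the unique permutation with x[r0] >= x[r1] >= ... >= x[rd] such that,
    among equal values, the later index is ranked higher (listed earlier)."""
    d = len(x) - 1
    # sort indices by decreasing value; ties broken by decreasing index
    return tuple(sorted(range(d + 1), key=lambda i: (-x[i], -i)))

# the example from the text: t = 27, d = 5
print(ordinal_pattern((0.3, 0.1, 0.5, 0.9, 0.7, 0.5)))  # (3, 4, 5, 2, 0, 1)
```

Since only order relations enter, the result is unchanged under strictly increasing transformations of the values, e.g. applying the function to (2xt + 1) yields the same tuple.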

The framework of analysis. Let (Ω, A) be a measurable space and X = (Xt)t∈Z a sequence of measurable mappings from (Ω, A) into (R, B(R)). Let Y = (Yt)t∈Z denote the process of increments of X, given by Yt := Xt − Xt−1

for t ∈ Z. Suppose (Ω,A) is equipped with a family of probability measures


Figure 1: A part of a time series (xt)t∈Z, where the permutation π(x) describing the order relations among the components of the vector x = (xt, xt+1, . . . , xt+d) with d = 5 and t = 27 is given by π(x) = (3, 4, 5, 2, 0, 1).

(Pϑ)ϑ∈Θ with Θ ≠ ∅. The subscript ϑ (e.g., in Eϑ, Varϑ, etc.) indicates integration with respect to Pϑ. Note that the consideration of a family of probability measures is only necessary for Sections 3 and 4; we assume it from the beginning, however, for reasons of simplicity.

We assume that for every ϑ ∈ Θ the following conditions are satisfied:

(M1) Y is non-degenerate, that is, for all t1 < t2 < . . . < tk in Z with k ∈ N and every set B ∈ B(Rk),

Pϑ((Yt1 , Yt2 , . . . , Ytk) ∈ B) > 0

only if B has strictly positive Lebesgue measure.

(M2) Y is stationary for every ϑ ∈ Θ, that is, for all t1 < t2 < . . . < tk in Z with k ∈ N and every l ∈ N,

(Yt1, Yt2, . . . , Ytk) =dist (Yt1+l, Yt2+l, . . . , Ytk+l) ,

where =dist denotes equality in distribution.

(M3) Y is zero-mean Gaussian for every ϑ ∈ Θ.

Note that the class of models satisfying (M1)-(M3) includes equidistant discretizations of fractional Brownian motion (fBm) with Hurst parameter H ∈ (0, 1) (see Taqqu (2003) and the end of Section 3).

As a consequence of (M1), the values of X are pairwise different Pϑ-almost surely for all ϑ ∈ Θ, that is,

Pϑ(Xs = Xt) = 0 (1)


for all s, t ∈ Z with s ≠ t. For k ∈ Z and ϑ ∈ Θ, define

ρϑ(k) := Corrϑ(Y0, Yk) . (2)

Ordinal patterns. Let d ∈ N. By the ordinal pattern of order d at time t in X, we mean the random permutation given by

Π(t) := π(Xt, Xt+1, . . . , Xt+d)

for t ∈ Z. In this section, we study the distribution of the ordinal pattern process (Π(t))t∈Z and the problem of estimating ordinal pattern probabilities.

Note that we could define Π(t) as a causal filter depending only on the “past” values Xt−d, Xt−d+1, . . . , Xt. The above “non-causal” definition is just for the sake of simpler notation in some proofs.

Clearly, if h is a strictly monotone mapping from R onto R, then

π(x) = π(h(x0), h(x1), . . . , h(xd))

for all x = (x0, x1, . . . , xd) ∈ Rd+1. Consequently, the ordinal patterns in X and h(X) are identical. Note that the mapping h may be “random”. For instance, when A and B are random variables with values in R and (0, ∞), respectively, the ordinal patterns in X and A + B·X are identical.

Stationarity. For y = (y1, y2, . . . , yd) ∈ Rd, define

π(y) := π(0, y1, y1 + y2, . . . , y1 + y2 + . . . + yd) .

Let x = (x0, x1, . . . , xd) ∈ Rd+1. Clearly, π(x) = π(x0 − x0, x1 − x0, . . . , xd − x0). Furthermore, for i ∈ {1, 2, . . . , d}, we can write xi − x0 as the sum of the differences x1 − x0, x2 − x1, . . . , xi − xi−1. Therefore,

π(x) = π(x1 − x0, x2 − x1, . . . , xd − xd−1) .

This shows that, for every t ∈ Z, the ordinal pattern Π(t) only depends onthe increments Yt+1, Yt+2, . . . , Yt+d, namely,

Π(t) = π(Yt+1, Yt+2, . . . , Yt+d) .

Thus, the following corollary is an immediate consequence of (M2).

Corollary 1. (Π(t))t∈Z is stationary for every ϑ ∈ Θ.


Let r = (r0, r1, . . . , rd) ∈ Sd for some d ∈ N. For ϑ ∈ Θ, define

pr(ϑ) := Pϑ(Π(t) = r) .

According to Corollary 1, the function pr(·) does not depend on the specific time point t ∈ Z on the right-hand side of the definition. We call pr(·) the probability of the ordinal pattern r. By (M1), we easily obtain the following statement.

Corollary 2. For every ϑ ∈ Θ,

0 < pr(ϑ) < 1 .

Next, we study the statistical problem of estimating the (generally unknown) ordinal pattern probability pr(·). For n ∈ N, consider the ordinal pattern sample

Πn := (Π(0), Π(1), . . . , Π(n− 1)) .

A “natural” estimator of pr(·) is given by the relative frequency of observations of r in the sample Πn,

qr,n = qr,n(Πn) := (1/n) ∑_{t=0}^{n−1} 1{Π(t)=r} .

Since (Π(t))t∈Z is stationary, we have Eϑ(qr,n) = pr(ϑ) for every ϑ ∈ Θ, that is, qr,n is an unbiased estimator of pr(·). In the next paragraph we show that there is a simple way to improve this estimator.
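The sample frequency estimator can be sketched as follows; the function names q_hat and ordinal_pattern are ours, and the tie-breaking rule follows the definition given earlier in this section:

```python
def ordinal_pattern(x):
    # ordinal pattern of x = (x0, ..., xd); ties ranked by later index first
    d = len(x) - 1
    return tuple(sorted(range(d + 1), key=lambda i: (-x[i], -i)))

def q_hat(xs, r):
    """Relative frequency q_{r,n} of the pattern r among the n windows
    (x_t, ..., x_{t+d}), t = 0, ..., n-1, where n = len(xs) - d."""
    d = len(r) - 1
    n = len(xs) - d
    return sum(ordinal_pattern(xs[t:t + d + 1]) == r for t in range(n)) / n

print(q_hat((1, 3, 2, 4, 3), (1, 2, 0)))  # 2 of the 3 windows show (1, 2, 0)
```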

Space and time symmetry. Let k ∈ N and t1 < t2 < . . . < tk in Z. According to (M3), the random vectors (Yt1, Yt2, . . . , Ytk) and (−Yt1, −Yt2, . . . , −Ytk) are zero-mean Gaussian for every ϑ ∈ Θ. Since Covϑ(Yi, Yj) = Covϑ(−Yi, −Yj) for all i, j ∈ Z, they have identical covariance matrices, which shows that

(Yt1, Yt2, . . . , Ytk) =dist (−Yt1, −Yt2, . . . , −Ytk)

for every ϑ ∈ Θ. Furthermore, since Y is stationary, we have Covϑ(Yi, Yj) = Covϑ(Y−i, Y−j) for all i, j ∈ Z. Therefore,

(Yt1, Yt2, . . . , Ytk) =dist (Y−t1, Y−t2, . . . , Y−tk)


for every ϑ ∈ Θ. We refer to these properties of Y as symmetry in space and time, respectively. In the terminology of Bandt and Shiha (2007), symmetry in space and time is equivalent to reversibility and rotation symmetry, respectively.

Next, we show that, as a consequence of the symmetry in space and time of Y, the distribution of Πn is invariant with respect to spatial and time reversals of ordinal pattern sequences.

Consider the mappings α, β from Sd onto itself given by

α(r) := (rd, rd−1, . . . , r0) and β(r) := (d− r0, d− r1, . . . , d− rd) (3)

for r = (r0, r1, . . . , rd) ∈ Sd. Geometrically, we can regard α(r) and β(r) as the spatial and time reversal of r (for an illustration, see Figure 2). In particular, if the components of x = (x0, x1, . . . , xd) ∈ Rd+1 are pairwise different, then

α(π(x)) = π(−x0,−x1, . . . ,−xd) and β(π(x)) = π(xd, xd−1, . . . , x0) .

In terms of the vector of increments y = (y1, y2, . . . , yd), given by yk := xk − xk−1 for k = 1, 2, . . . , d, we have

α(π(y)) = π(−y1,−y2, . . . ,−yd) and β(π(y)) = π(−yd,−yd−1, . . . ,−y1) . (4)

r = (2, 0, 1)   α(r) = (1, 0, 2)   β(r) = (0, 2, 1)   β∘α(r) = (1, 2, 0)

Figure 2: The pattern r = (2, 0, 1) and its spatial and time reversals.

As usual, let ∘ denote the composition of functions. For r ∈ Sd, consider the subset r̄ of Sd given by

r̄ := { r, α(r), β(r), β∘α(r) } .

Since α∘β(r) = β∘α(r) and α∘α(r) = β∘β(r) = r, the set r̄ is closed under α and β, i.e., α(r̄) = β(r̄) = r̄. Consequently, if s ∈ r̄ for r, s ∈ Sd, then s̄ = r̄. This provides a partition of each Sd into classes which contain 2 or 4 elements. For d = 1, the only class is S1 = {(0, 1), (1, 0)}. For d = 2, there are two classes: {(0, 1, 2), (2, 1, 0)} and {(0, 2, 1), (2, 0, 1), (1, 2, 0), (1, 0, 2)}. For d = 3, there are 8 classes. For d ≥ 2, classes of both 2 and 4 elements are possible.
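The maps α, β of (3) and the resulting classes are easy to enumerate by brute force; the following sketch (function names ours) reproduces the partition just described:

```python
from itertools import permutations

def alpha(r):
    # spatial reversal of formula (3)
    return tuple(reversed(r))

def beta(r):
    # time reversal of formula (3)
    d = len(r) - 1
    return tuple(d - ri for ri in r)

def pattern_class(r):
    # the class {r, alpha(r), beta(r), beta(alpha(r))}
    return frozenset({r, alpha(r), beta(r), beta(alpha(r))})

def classes(d):
    # partition of S_d into classes of 2 or 4 elements
    return {pattern_class(r) for r in permutations(range(d + 1))}

print(len(classes(2)), len(classes(3)))  # 2 8
```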

Now, let n ∈ N, and consider the mappings A, B from (Sd)^n onto (Sd)^n given by

A(r1, r2, . . . , rn) := (α(r1), α(r2), . . . , α(rn)),
B(r1, r2, . . . , rn) := (β(rn), β(rn−1), . . . , β(r1))

for (r1, r2, . . . , rn) ∈ (Sd)^n. According to the geometrical interpretation of α and β, the ordinal pattern sequences A(r1, r2, . . . , rn) and B(r1, r2, . . . , rn) can be regarded as the spatial and time reversal of the ordinal pattern sequence (r1, r2, . . . , rn).

Lemma 3. For every ϑ ∈ Θ, the ordinal pattern sequences Πn, A(Πn), B(Πn) and B∘A(Πn) have the same distribution.

Proof. Let ϑ ∈ Θ. Since the values in X are pairwise different Pϑ-almost surely (see (1)), equation (4) yields

α(π(Yt+1, Yt+2, . . . , Yt+d)) = π(−Yt+1,−Yt+2, . . . ,−Yt+d)

Pϑ-almost surely for every t ∈ Z. Furthermore, by the space symmetry of Y, the random vectors (Y1, Y2, . . . , Yn+d−1) and (−Y1, −Y2, . . . , −Yn+d−1) have the same distribution with respect to Pϑ. Thus,

Πn = ( π(Y1, . . . , Yd), π(Y2, . . . , Yd+1), . . . , π(Yn, . . . , Yn+d−1) )
   =dist ( π(−Y1, . . . ,−Yd), π(−Y2, . . . ,−Yd+1), . . . , π(−Yn, . . . ,−Yn+d−1) )
   = A(Πn) ,

where the last equality holds Pϑ-almost surely. Similarly, we obtain

β(π(Yt+1, Yt+2, . . . , Yt+d)) = π(−Yt+d,−Yt+d−1, . . . ,−Yt+1)

Pϑ-almost surely for every t ∈ Z. Because of the space and time symmetry of Y, the random vectors (Y1, Y2, . . . , Yn+d−1) and (−Yn+d−1, −Yn+d−2, . . . , −Y1) have the same distribution with respect to Pϑ, and therefore

Πn = ( π(Y1, . . . , Yd), π(Y2, . . . , Yd+1), . . . , π(Yn, . . . , Yn+d−1) )
   =dist ( π(−Yn+d−1, . . . ,−Yn), . . . , π(−Yd, . . . ,−Y1) )
   = B(Πn)


with the last equality holding Pϑ-almost surely. Now, combining the two previous statements yields equality in distribution of Πn and B∘A(Πn).

Note that, for the proof of Lemma 3, we have only used that Y is symmetric in space and time and that the values of X are pairwise different Pϑ-almost surely. Thus, the statement is valid under more general assumptions than (M1)-(M3).

A Rao-Blackwellization. In this paragraph, let r ∈ Sd with d ∈ N, and let n ∈ N. According to Lemma 3, we obtain

pr(·) = pα(r)(·) = pβ(r)(·) = pα∘β(r)(·) .

This shows that qr,n, qα(r),n, qβ(r),n and qα∘β(r),n are all unbiased estimators of pr(·). By averaging them, we obtain another unbiased estimator of pr(·), given by

pr,n = pr,n(Πn) := (1/4) ( qr,n + qα(r),n + qβ(r),n + qα∘β(r),n ) = (1/n) ∑_{t=0}^{n−1} (1/#r̄) 1{Π(t)∈r̄} ,

where #r̄ denotes the cardinality of the set r̄. Theorem 5 below shows that pr,n has lower risk than qr,n with respect to any convex loss function. We first prove the following lemma.
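The averaged estimator can be sketched as follows (helper names ours); the class of r is built explicitly from the reversal maps α and β of (3):

```python
def ordinal_pattern(x):
    # ordinal pattern of x = (x0, ..., xd); ties ranked by later index first
    d = len(x) - 1
    return tuple(sorted(range(d + 1), key=lambda i: (-x[i], -i)))

def alpha(r):
    return tuple(reversed(r))

def beta(r):
    d = len(r) - 1
    return tuple(d - ri for ri in r)

def p_hat(xs, r):
    """Rao-Blackwellized estimator p_{r,n}: relative frequency of the whole
    class of r, divided by the cardinality of the class."""
    cls = {r, alpha(r), beta(r), beta(alpha(r))}
    d = len(r) - 1
    n = len(xs) - d
    hits = sum(ordinal_pattern(xs[t:t + d + 1]) in cls for t in range(n))
    return hits / (len(cls) * n)
```

By construction, p_hat equals the average (1/4)(qr,n + qα(r),n + qβ(r),n + qα∘β(r),n), so it is unbiased whenever qr,n is, and it assigns the same estimate to every pattern in a class.
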

Lemma 4. For every ϑ ∈ Θ, we have Pϑ(pr,n ≠ qr,n) > 0.

Proof. Let ϑ ∈ Θ. We show there exists a permutation (s0, s1, . . . , sn+d−1) ∈ Sn+d−1 such that Xs0 > Xs1 > . . . > Xsn+d−1 implies pr,n > 0 and qr,n = 0. Then, according to Corollary 2, we have

Pϑ(pr,n ≠ qr,n) ≥ Pϑ(pr,n > 0, qr,n = 0) ≥ Pϑ(Xs0 > Xs1 > . . . > Xsn+d−1) > 0 .

Let i, j ∈ {0, 1, . . . , d} be the indices satisfying ri = d − 1 and rj = d. If i < j, then we choose

(s0, s1, . . . , sn+d−1) = (n + d− 1, n + d− 2, . . . , d + 1, rd, rd−1, . . . , r0) .


Otherwise, let

(s0, s1, . . . , sn+d−1) = (rd, rd−1, . . . , r0, d + 1, d + 2, . . . , n + d− 1) .

In both cases, if Xs0 > Xs1 > . . . > Xsn+d−1, then Π(0) = α(r) and Π(t) ≠ r for t = 1, 2, . . . , n − 1, which implies pr,n > 0 and qr,n = 0. The proof is complete.

Theorem 5. The estimator pr,n of pr(·) is unbiased and has lower risk than qr,n with respect to any convex loss function, that is, for every ϑ ∈ Θ,

Eϑ( ϕ(pr,n, pr(ϑ)) ) ≤ Eϑ( ϕ(qr,n, pr(ϑ)) )

with respect to any function ϕ : [0, 1] × [0, 1] → [0, ∞) with ϕ(p, p) = 0 and ϕ(·, p) being convex for every p ∈ [0, 1]. When ϕ(·, p) is strictly convex for every p ∈ [0, 1], then pr,n has strictly lower risk than qr,n with respect to ϕ. In particular, for every ϑ ∈ Θ,

Varϑ(pr,n) < Varϑ(qr,n) .

Proof. Let ϑ ∈ Θ, and let ≺ be any total order on (Sd)^n. According to Lemma 3, the statistic S(Πn) := min≺ {Πn, A(Πn), B(Πn), B∘A(Πn)} is sufficient for Πn, namely, if π ∈ (Sd)^n is such that Pϑ(S(Πn) = π) ≠ 0, then the conditional distribution of Πn given S(Πn) = π is the equidistribution on {π, A(π), B(π), B∘A(π)}, which, clearly, does not depend on ϑ. Now, note that

pr,n = (1/4) ( qr,n(Πn) + qr,n(A(Πn)) + qr,n(B(Πn)) + qr,n(B∘A(Πn)) )

Pϑ-almost surely, and thus pr,n is a conditional expectation of qr,n given S(Πn). Since qr,n is unbiased, the statement on the lower risk of pr,n follows by Theorem 3.2.1 in Pfanzagl (1994). The result on strictness is also a consequence of Theorem 3.2.1 in Pfanzagl (1994) and the fact that, according to Lemma 4, Pϑ(pr,n ≠ qr,n) > 0. Now, the statement on the variance follows because (· − p)² is strictly convex for each p ∈ [0, 1].

Strong consistency. Up to the end of this section, fix some r ∈ Sd with d ∈ N. In order to establish sufficient conditions for strong consistency of pr,n, we use well-known results from ergodic theory. Let τ denote the shift transformation, given by

τ(z) = (zt+1)t∈Z


for z = (zt)t∈Z ∈ RZ. For j ∈ N, let τ^j be given by τ^j(z) = τ^{j−1}(τ(z)), where τ^0(z) := z is the identity on RZ.

A real-valued stationary stochastic process Z = (Zt)t∈Z on a probability space (Ω′, A′, P) is called ergodic iff P(Z ∈ B) = 0 or P(Z ∈ B) = 1 for every set B ∈ B(RZ) satisfying P(τ^{−1}(B) Δ B) = 0. The Birkhoff-Khinchin Ergodic Theorem states that, if Z is ergodic and f : RZ → R is measurable with E(|f(Z)|) < ∞, then

lim_{n→∞} (1/n) ∑_{j=0}^{n−1} f(τ^j(Z)) = E(f(Z))

P-almost surely (see Theorem 1.2.1 in Cornfeld et al. (1982)). According to Theorem 14.2.2 in Cornfeld et al. (1982), a sufficient condition for a stationary Gaussian process to be ergodic is that the autocorrelations tend to zero as the lag tends to infinity.

Theorem 6. If ρϑ(k) → 0 as k → ∞ for every ϑ ∈ Θ and h : [0, 1] → R is continuous on an open set containing pr(Θ), then h(pr,n) is a strongly consistent estimator of h(pr(·)), that is,

lim_{n→∞} h(pr,n) = h(pr(ϑ))

Pϑ-almost surely for every ϑ ∈ Θ. If, additionally, h is bounded on [0, 1], then h(pr,n) is an asymptotically unbiased estimator of h(pr(·)), that is,

lim_{n→∞} Eϑ(h(pr,n)) = h(pr(ϑ))

for every ϑ ∈ Θ.

Proof. Let ϑ ∈ Θ. Consider the mapping f : RZ → R given by

f(y) := 1 if π(y1, y2, . . . , yd) ∈ r̄, and f(y) := 0 otherwise,

for y = (yt)t∈Z ∈ RZ. Note that, for j = 0, 1, 2, . . ., we have f(τ^j(Y)) = 1{Π(j)∈r̄}. Under the assumptions, Y is ergodic and thus, according to the Birkhoff-Khinchin Ergodic Theorem,

lim_{n→∞} pr,n = (1/#r̄) lim_{n→∞} (1/n) ∑_{j=0}^{n−1} f(τ^j(Y)) = (1/#r̄) Eϑ(f(Y))


Pϑ-almost surely. Furthermore, according to Lemma 3,

(1/#r̄) Eϑ(f(Y)) = pr(ϑ)

Pϑ-almost surely. Under the assumptions, there exists a δ > 0 such that h is continuous on (pr(ϑ) − δ, pr(ϑ) + δ). Therefore,

lim_{n→∞} h(pr,n) = h(pr(ϑ))

Pϑ-almost surely. Now, since h is bounded on [0, 1], the Dominated Convergence Theorem yields

lim_{n→∞} Eϑ(h(pr,n)) = Eϑ( lim_{n→∞} h(pr,n) ) .

The proof is complete.

Asymptotic normality. Next, we derive a sufficient condition on the autocorrelations of Y for the estimator pr,n to be asymptotically normally distributed. Let N(0, σ²) with σ² ∈ [0, ∞) denote the (possibly degenerate) normal distribution with zero mean and variance σ².

As usual, we write g(k) = o(h(k)) for g, h : N → R iff lim_{k→∞} g(k)/h(k) = 0. By →Pϑ we denote convergence in distribution with respect to Pϑ.

Let Z = (Z1, Z2, . . . , Zn) with n ∈ N be a Gaussian random vector on a probability space (Ω′, A′, P). For a measurable mapping f : Rn → R with Var(f(Z)) < ∞, the Hermite rank of f with respect to Z is defined by

rank(f) := min{ κ ∈ N | there exists a real polynomial q : Rn → R of degree κ with E([f(Z) − E(f(Z))] q(Z)) ≠ 0 } ,

where the minimum of the empty set is infinity. The following result is derived from a limit theorem for nonlinear functionals of a stationary Gaussian sequence of vectors in Arcones (1994), which relates the Hermite rank of f with respect to (Y1, Y2, . . . , Yd) and the rate of decay of k ↦ ρϑ(k) to the asymptotic distribution of (1/n) ∑_{t=0}^{n−1} f(Yt+1, Yt+2, . . . , Yt+d).

Theorem 7. If |ρϑ(k)| = o(k^{−β}) for ϑ ∈ Θ and some β > 1/2, then

√n (pr,n − pr(ϑ)) →Pϑ N(0, σ²ϑ) ,


where

σ²ϑ := γϑ(0) + 2 ∑_{k=1}^{∞} γϑ(k)

and γϑ(k) := (1/(#r̄)²) Covϑ(1{Π(0)∈r̄}, 1{Π(k)∈r̄}) for k ∈ Z.

Proof. Let g : Rd → R be defined by

g(z) := 1/#r̄ if π(z) ∈ r̄, and g(z) := 0 otherwise,

for z ∈ Rd, and note that g(Y(t)) = (1/#r̄) 1{Π(t)∈r̄} for every t ∈ Z, where Y(t) := (Yt+1, Yt+2, . . . , Yt+d). Therefore, according to the definition of pr,n,

√n pr,n = (1/√n) ∑_{t=0}^{n−1} g(Yt+1, Yt+2, . . . , Yt+d)

for every n ∈ N. Now, let Z = (Z1, Z2, . . . , Zd) be a standard normal random vector on (Ω′, A′, P) and note that E((g(Z))²) < ∞. We show that g has Hermite rank κ ≥ 2 with respect to Z. Let i ∈ {1, 2, . . . , d}. Note that it is sufficient to show that

E([g(Z)− E(g(Z))] Zi) = 0 .

Since Zi is zero-mean Gaussian, we have E(g(Z))E(Zi) = 0 and thus

E([g(Z)− E(g(Z))] Zi) = E(g(Z) Zi) .

Furthermore, because Z is non-degenerate, using the same argument as in the proof of Lemma 3 shows that 1{π(−Z)=α(s)} = 1{π(Z)=s} P-almost surely for every s ∈ Sd. Since Z is zero-mean Gaussian, Z and −Z are identically distributed, and thus

E(1{π(Z)=α(s)} Zi) = E(1{π(−Z)=α(s)} (−Zi)) = −E(1{π(Z)=s} Zi) .

Now, in the case #r̄ = 2, where g(Z) = (1/2)(1{π(Z)=r} + 1{π(Z)=α(r)}), we have

2 E(g(Z) Zi) = E(1{π(Z)=r} Zi) + E(1{π(Z)=α(r)} Zi) = 0 .

Analogously, in the case #r̄ = 4, we obtain

4 E(g(Z) Zi) = E(1{π(Z)=r} Zi) + E(1{π(Z)=α(r)} Zi) + E(1{π(Z)=β(r)} Zi) + E(1{π(Z)=α∘β(r)} Zi) = 0 .

Putting it all together, we have E([g(Z) − E(g(Z))] Zi) = 0, which shows that g has Hermite rank κ ≥ 2. Note that we have only used that Z is non-degenerate zero-mean Gaussian, so g has Hermite rank κ ≥ 2 also with respect to (Y1, Y2, . . . , Yd) for every ϑ ∈ Θ. Now, let ϑ ∈ Θ and suppose ρϑ(k) = o(k^{−β}) for some β > 1/2. Define

rϑ^{(i,j)}(k) := ρϑ(k + i − j)

for k ∈ Z and i, j ∈ {1, 2, . . . , d}. Since (k + i − j)^{−β} ∼ k^{−β} for all i, j ∈ {1, 2, . . . , d}, we have rϑ^{(i,j)}(k) = o(k^{−β}) and thus

∑_{k=1}^{∞} |rϑ^{(i,j)}(k)|^κ < ∞ .

By Theorem 4 in Arcones (1994), it follows that pr,n is asymptotically normally distributed, where the expression for σ²ϑ is obtained according to

Covϑ( g(Y(0)), g(Y(k)) ) = (1/(#r̄)²) Covϑ( 1{Π(0)∈r̄}, 1{Π(k)∈r̄} )

for k ∈ Z. The theorem is proved.

By applying the Delta Method (see Theorem 2.5.2 in Lehmann (1999)),we obtain the following corollary.

Corollary 8. If |ρϑ(k)| = o(k^{−β}) for ϑ ∈ Θ and some β > 1/2, and h : [0, 1] → R has a non-vanishing first derivative at pr(ϑ), then

√n ( h(pr,n) − h(pr(ϑ)) ) →Pϑ N( 0, σ²ϑ [h′(pr(ϑ))]² ) ,

with σ²ϑ as given in Theorem 7.

Note that Theorem 4 in Arcones (1994) also allows one to derive a multidimensional statement: Let m ∈ N and r1 ∈ Sd1, r2 ∈ Sd2, . . . , rm ∈ Sdm with (possibly different) d1, d2, . . . , dm ∈ N. Suppose hi : [0, 1] → R has a non-vanishing first derivative at pri(ϑ) for i = 1, 2, . . . , m. If |ρϑ(k)| = o(k^{−β}) for some β > 1/2, then (h1(pr1,n), h2(pr2,n), . . . , hm(prm,n)) is asymptotically normally distributed.


3. Parameter estimation from the frequency of changes

The case d = 2. Next, we show that in the case of ordinal patterns of order d = 2, any “reasonable” estimator of ordinal pattern probabilities can be written as an affine function of the frequency of changes between “upwards” and “downwards”.

Consider

C(t) := 1{Xt ≥ Xt+1 < Xt+2} + 1{Xt < Xt+1 ≥ Xt+2}

for t ∈ Z, indicating a change of direction in X, either from “downwards” to “upwards” (when Xt ≥ Xt+1 and Xt+1 < Xt+2) or from “upwards” to “downwards” (when Xt < Xt+1 and Xt+1 ≥ Xt+2).

Clearly, C(t) only depends on the ordinal pattern of order d = 2 at time t, namely, Xt ≥ Xt+1 < Xt+2 iff Π(t) = (0, 2, 1) or Π(t) = (2, 0, 1), and Xt < Xt+1 ≥ Xt+2 iff Π(t) = (1, 0, 2) or Π(t) = (1, 2, 0). As a consequence of Corollary 1, (C(t))t∈Z is stationary for every ϑ ∈ Θ.

Consider the probability of a change,

c(ϑ) := Pϑ(C(t) = 1)

for ϑ ∈ Θ. Clearly, c(·) does not depend on the value of t on the right hand side of the definition. By evaluating two-dimensional normal orthant probabilities, it can be shown that

c(ϑ) = 1/2 − (1/π) arcsin ρϑ(1)   (5)

for ϑ ∈ Θ (see Bandt and Shiha (2007)). For n ∈ N, let cn denote the relative frequency of changes,

cn := (1/n) Σ_{t=0}^{n−1} C(t) .

According to the relation between C(t) and Π(t) discussed above, we have

cn = 4 pr,n if r ∈ {(1, 0, 2), (1, 2, 0), (0, 2, 1), (2, 0, 1)} ,
cn = 1 − 2 pr,n if r ∈ {(0, 1, 2), (2, 1, 0)} .

In particular, Varϑ(cn) = 16 Varϑ(pr,n) if r ∈ {(1, 0, 2), (1, 2, 0), (0, 2, 1), (2, 0, 1)}, and Varϑ(cn) = 4 Varϑ(pr,n) if r ∈ {(0, 1, 2), (2, 1, 0)}. Note that the results in Sinn and Keller (2009) allow Varϑ(cn) to be evaluated numerically.
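The statistic cn is straightforward to compute from a sample path. The following sketch (not part of the original paper; the function name and the use of NumPy are our assumptions) counts the changes of direction defined by C(t):

```python
import numpy as np

def change_frequency(x):
    """Relative frequency c_n of changes of direction in a sample path.

    A change occurs at time t if X_t >= X_{t+1} < X_{t+2} or
    X_t < X_{t+1} >= X_{t+2}; x must contain n + 2 observations to
    yield the n indicator values C(0), ..., C(n-1).
    """
    x = np.asarray(x, dtype=float)
    up = x[:-1] < x[1:]                # up[t] is True iff X_t < X_{t+1}
    return np.mean(up[:-1] != up[1:])  # direction flips between steps
```

For instance, a strictly monotone path contains no changes (cn = 0), while a strictly alternating path changes at every step (cn = 1).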


Estimation of ϑ. We now restrict to the case where Θ is a subset of R. If the relation ϑ ↦ c(ϑ) is strictly monotone on Θ, then ϑ can be estimated by plugging the estimate cn of c(ϑ) into the inverse functional relation.

More precisely, assume there exists a function h : [0, 1] → R with

h(c(ϑ)) = ϑ (6)

for every ϑ ∈ Θ. Note that according to formula (5), a necessary and sufficient condition for the existence of such a function h is that ϑ ↦ ρϑ(1) is strictly monotone. Plugging the estimate cn of c(·) into the left hand side of (6), we obtain

ϑn := h(cn)

as an estimate of ϑ. The following corollary gives properties of ϑn.

Corollary 9. The estimator ϑn has the following properties:

(i) If h is continuous on an open set containing c(Θ) and limk→∞ ρϑ(k) = 0 for all ϑ ∈ Θ, then ϑn is a strongly consistent estimator of ϑ. If, additionally, h is bounded on [0, 1], then ϑn is an asymptotically unbiased estimator of ϑ.

(ii) If |ρϑ(k)| = o(k^{-β}) for some β > 1/2 and h has a non-vanishing first derivative at c(ϑ), then

√n (ϑn − ϑ) −→_{Pϑ} N( 0, σϑ² [h′(c(ϑ))]² ) ,

with

σϑ² := γϑ(0) + 2 Σ_{k=1}^{∞} γϑ(k)

and γϑ(k) := Covϑ(C(0), C(k)) for k ∈ Z.

Proof. (i) is a consequence of Theorem 6. (ii) follows by Corollary 8.

Equidistant discretizations of fBm. In the following, we apply the previous results to the estimation of the Hurst parameter in equidistant discretizations of fBm. Let B = (B(t))t∈R be a family of measurable mappings from a measurable space (Ω, A) into (R, B(R)). Furthermore, let (PH)H∈(0,1) be a family of probability measures on (Ω, A) such that B measured with respect


to PH is fBm with the Hurst parameter H, that is, B is zero-mean Gaussian and

CovH(B(t), B(s)) = (1/2) ( |t|^{2H} + |s|^{2H} − |t − s|^{2H} )

for s, t ∈ R. Note that many authors define fBm only for t ∈ [0, ∞); however, we adopt the double-sided infinite definition in Taqqu (2003).

It is well-known that fBm with the Hurst parameter H is H-self-similar, i.e., for every a > 0, the processes (B(at))t∈R and (a^H B(t))t∈R have the same finite-dimensional distributions with respect to PH (see Taqqu (2003)).

For a fixed sampling interval length δ > 0, consider the equidistant discretization of fBm, X = (Xt)t∈Z, given by

Xt := B(δt)

for t ∈ Z. Let H ∈ (0, 1). According to the self-similarity of fBm, (B(δt))t∈Z and (δ^H B(t))t∈Z have the same distribution. Furthermore, the ordinal patterns in (δ^H B(t))t∈Z and (B(t))t∈Z are identical. Therefore, we obtain the following statement.

Corollary 10. The distribution of ordinal patterns in X does not depend on the sampling interval length δ.

In the following, we assume δ = 1. Let Y = (Yt)t∈Z denote the increment process of X given by Yt := Xt − Xt−1 for t ∈ Z. It is well-known that Y is non-degenerate, stationary and zero-mean Gaussian for every H ∈ (0, 1). Thus, with Θ := (0, 1) and ϑ := H, we have a class of stochastic processes with Y satisfying the model assumptions (M1)-(M3).

The ZC estimator of the Hurst parameter. The frequency of changes cn is a particularly interesting statistic for equidistant discretizations of fBm, because the probability of a change is monotonically related to the Hurst parameter. Note that the first-order autocorrelation of Y measured with respect to PH is given by ρH(1) = 2^{2H−1} − 1. Thus, according to formula (5) and using the fact that arcsin x = 2 arcsin √((1 + x)/2) − π/2 for x ∈ [−1, 1], we obtain

c(H) = 1 − (2/π) arcsin 2^{H−1}   (7)


for H ∈ (0, 1). Plugging the estimate cn of c(·) into the left hand side of (7) and solving for H yields an estimator for the Hurst parameter. In order to obtain only finite non-negative estimates, we define

h(x) := max{ 0, log2(cos(πx/2)) + 1 }

for x ∈ [0, 1] and set

Hn := h(cn) .

Note that the first derivative of h on (0, 2/3) is given by

h′(x) = − (π / (2 ln 2)) tan(πx/2)   (8)

for x ∈ (0, 2/3).
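As a quick numerical sanity check (not from the paper; the function names are ours), one can verify that h inverts the map H ↦ c(H) on (0, 1): by the identity used above, cos(π c(H)/2) = 2^{H−1}, so h(c(H)) = H.

```python
import math

def c_of_H(H):
    # probability of a change, formula (7)
    return 1.0 - (2.0 / math.pi) * math.asin(2.0 ** (H - 1.0))

def h(x):
    # inverse relation, truncated to non-negative values
    return max(0.0, math.log2(math.cos(math.pi * x / 2.0)) + 1.0)

for H in (0.1, 0.3, 0.5, 0.7, 0.9):
    assert abs(h(c_of_H(H)) - H) < 1e-12  # h(c(H)) recovers H
```

In particular, c(1/2) = 1/2: for ordinary Brownian motion, changes and non-changes of direction are equally likely.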

The estimator Hn is known as the ZC estimator of the Hurst parameter, with "ZC" standing for "zero-crossings", because changes between "upwards" and "downwards" in X are equivalent to zero-crossings in Y (see Kedem (1994), Coeurjolly (2000)). The following corollary gives properties of the ZC estimator. Note that the second statement has been established by Coeurjolly (2000) using a central limit theorem of Ho and Sun (1987).

Corollary 11. The estimator Hn has the following properties:

(i) Hn is a strongly consistent and asymptotically unbiased estimator of the Hurst parameter.

(ii) If H < 3/4, then

√n (Hn − H) −→_{PH} N( 0, σH² [h′(c(H))]² ) ,

with h′ as given in (8) and

σH² := γH(0) + 2 Σ_{k=1}^{∞} γH(k)

where γH(k) := CovH(C(0), C(k)) for k ∈ Z.

Proof. Note that the image of (0, 1) under c(·) is given by (0, 2/3) and h is continuous and bounded on [0, 1]. Furthermore, ρH(k) is asymptotically equivalent to H(2H − 1) k^{2H−2} as k → ∞ (see Taqqu (2003)) and thus limk→∞ ρH(k) = 0 for every H ∈ (0, 1). Now, statement (i) follows by Corollary 9 (i). In order to establish (ii), note that h′ is non-vanishing on (0, 2/3). Furthermore, when H < 3/4, there exists a β > 1/2 such that ρH(k) = o(k^{-β}) (for instance, we can choose β = 5/4 − H). Thus, statement (ii) follows by Corollary 9 (ii).
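For completeness, the ZC estimator can be sketched in a few lines (an illustrative implementation, not the authors' code; NumPy is assumed):

```python
import numpy as np

def zc_hurst(x):
    """ZC estimate of the Hurst parameter from a sample path x.

    Computes the relative frequency of changes c_n and applies
    h(c) = max{0, log2(cos(pi*c/2)) + 1}.
    """
    x = np.asarray(x, dtype=float)
    up = x[:-1] < x[1:]
    cn = np.mean(up[:-1] != up[1:])   # relative frequency of changes
    cos_val = np.cos(np.pi * cn / 2.0)
    if cos_val <= 0.0:                # c_n = 1: estimate truncated to 0
        return 0.0
    return max(0.0, float(np.log2(cos_val)) + 1.0)
```

A monotone path (cn = 0) yields the boundary estimate 1, an alternating path (cn = 1) yields 0, and cn = 1/2 yields the Brownian value 1/2.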


Alternative estimation methods. There are various alternative methods for estimating the Hurst parameter, such as Maximum Likelihood (ML) methods and approximations thereof, or semi-parametric estimates based on the rescaled range statistic or on the periodogram. Although ML methods have asymptotic optimality properties, they are computationally intensive and thus often impracticable. Semi-parametric methods are more robust and generally applicable, but usually also less efficient. A further disadvantage of semi-parametric estimates is that they depend on certain tuning parameters which are difficult to select automatically.

Here we consider a computationally simple alternative estimation method proposed by Kettani and Gubner (2006). Note that the probability of a change is monotonically related to the first-order autocorrelation of Y (which in turn is monotonically related to the Hurst parameter). The estimator of Kettani and Gubner first computes the sample autocorrelation of Y,

ρn := Σ_{t=1}^{n−1} (Yt − Ȳn)(Yt+1 − Ȳn) / Σ_{t=1}^{n} (Yt − Ȳn)²   (9)

where Ȳn := (1/n) Σ_{t=1}^{n} Yt is the sample mean, and plugs ρn into the inverse of the monotonic relation to the Hurst parameter. Altogether, this gives the estimate

H̃n := g(ρn)

of the Hurst parameter, where

g(x) := max{ 0, (1/2) (log2(1 + x) + 1) }   (10)

for x ∈ (−1, 1]. Note that Hn can be regarded as the estimator obtained by plugging the estimate cos(π cn) of the first-order autocorrelation into g(·). In this sense, Hn is the "ordinal" analogue of H̃n.

In the next section, we compare the performance of Hn and H̃n in a simulation study. Note that H̃n can be more generally applied to the estimation of the index of self-similarity in (not necessarily Gaussian) self-similar processes with stationary increments. In contrast to Hn, however, H̃n is in general not invariant with respect to monotone transformations of the process.
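A corresponding sketch of the Kettani–Gubner estimator (again illustrative; the function name and the use of NumPy are our assumptions, and y denotes the increment series Y):

```python
import numpy as np

def kg_hurst(y):
    """Metric estimate of the Hurst parameter from increments y.

    Computes the lag-1 sample autocorrelation, formula (9), and
    applies g(x) = max{0, (log2(1 + x) + 1) / 2}.
    """
    y = np.asarray(y, dtype=float)
    yc = y - y.mean()
    rho = np.dot(yc[:-1], yc[1:]) / np.dot(yc, yc)  # lag-1 autocorrelation
    if rho <= -1.0:                                 # degenerate case
        return 0.0
    return max(0.0, 0.5 * (float(np.log2(1.0 + rho)) + 1.0))
```

For uncorrelated increments (ρn = 0) the estimate is 1/2, matching ordinary Brownian motion; strongly negative autocorrelation is truncated to 0.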


        n = 100                        n = 1000                       n = 10 000
H       µ              σ              µ              σ              µ              σ
0.05    0.087 (0.070)  0.101 (0.072)  0.054 (0.051)  0.043 (0.030)  0.050 (0.050)  0.016 (0.010)
0.10    0.119 (0.108)  0.113 (0.082)  0.099 (0.100)  0.048 (0.031)  0.100 (0.100)  0.016 (0.010)
0.15    0.156 (0.152)  0.124 (0.088)  0.149 (0.150)  0.048 (0.030)  0.150 (0.150)  0.015 (0.009)
0.20    0.199 (0.199)  0.131 (0.089)  0.199 (0.200)  0.047 (0.029)  0.200 (0.200)  0.015 (0.009)
0.25    0.245 (0.248)  0.134 (0.087)  0.249 (0.250)  0.045 (0.028)  0.250 (0.250)  0.014 (0.009)
0.30    0.291 (0.297)  0.133 (0.085)  0.299 (0.300)  0.043 (0.027)  0.300 (0.300)  0.014 (0.008)
0.35    0.341 (0.345)  0.131 (0.082)  0.349 (0.350)  0.042 (0.026)  0.350 (0.350)  0.013 (0.008)
0.40    0.390 (0.394)  0.127 (0.079)  0.399 (0.400)  0.040 (0.025)  0.400 (0.400)  0.013 (0.008)
0.45    0.441 (0.442)  0.121 (0.076)  0.449 (0.449)  0.038 (0.024)  0.450 (0.450)  0.012 (0.008)
0.50    0.491 (0.489)  0.116 (0.073)  0.499 (0.499)  0.036 (0.023)  0.500 (0.500)  0.011 (0.007)
0.55    0.541 (0.536)  0.110 (0.070)  0.549 (0.548)  0.034 (0.022)  0.550 (0.550)  0.011 (0.007)
0.60    0.592 (0.581)  0.103 (0.067)  0.599 (0.597)  0.032 (0.021)  0.600 (0.600)  0.010 (0.007)
0.65    0.641 (0.625)  0.097 (0.064)  0.649 (0.646)  0.031 (0.020)  0.650 (0.649)  0.010 (0.007)
0.70    0.692 (0.668)  0.093 (0.061)  0.699 (0.693)  0.030 (0.020)  0.700 (0.698)  0.010 (0.007)
0.75    0.741 (0.708)  0.089 (0.059)  0.749 (0.739)  0.029 (0.020)  0.750 (0.747)  0.010 (0.007)
0.80    0.790 (0.746)  0.085 (0.056)  0.798 (0.783)  0.031 (0.020)  0.800 (0.794)  0.012 (0.008)
0.85    0.837 (0.782)  0.081 (0.053)  0.848 (0.824)  0.035 (0.020)  0.849 (0.838)  0.017 (0.009)
0.90    0.884 (0.814)  0.076 (0.050)  0.895 (0.860)  0.039 (0.019)  0.898 (0.879)  0.023 (0.009)
0.95    0.930 (0.843)  0.067 (0.046)  0.941 (0.893)  0.039 (0.018)  0.944 (0.914)  0.027 (0.010)

Table 1: Sample mean µ and sample standard deviation σ for the estimator Hn (values for H̃n in brackets)

4. Simulation studies

In order to illustrate the performance of Hn and to point out some interesting phenomena in the distribution of changes for large H, we present some data based on simulations. We have used the pseudo random number generator of Matlab 7.6.0 and the algorithm of Davies and Harte (1987) for simulating paths of equidistant discretizations of fBm. For the sample sizes n = 100, 1000, 10 000 and different values of the Hurst parameter, we have generated 100 000 paths each and have computed the number of changes and the values of Hn and H̃n for each path. The sample mean µ and the sample standard deviation σ of the obtained values for Hn and H̃n (in brackets) are shown in Table 1.
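The simulation setup can be reproduced along the following lines. The sketch below is not the authors' Matlab code: it samples fractional Gaussian noise by circulant embedding in the spirit of Davies and Harte (1987), assuming NumPy, and cumulates the increments to an fBm discretization.

```python
import numpy as np

def fgn_sample(n, H, rng):
    """Draw n fractional Gaussian noise values by circulant embedding."""
    k = np.arange(n + 1)
    # autocovariance of fGn: 0.5*((k+1)^{2H} - 2 k^{2H} + |k-1|^{2H})
    g = 0.5 * ((k + 1.0) ** (2 * H) - 2.0 * k ** (2 * H)
               + np.abs(k - 1.0) ** (2 * H))
    row = np.concatenate([g, g[-2:0:-1]])   # first row of circulant matrix
    lam = np.fft.fft(row).real              # its eigenvalues
    if lam.min() < 0:
        raise ValueError("embedding not nonnegative definite")
    m = len(row)
    z = rng.standard_normal(m) + 1j * rng.standard_normal(m)
    # real part of the transformed noise has the desired covariance
    return np.fft.fft(np.sqrt(lam / m) * z).real[:n]

rng = np.random.default_rng(0)
y = fgn_sample(10_000, 0.7, rng)            # increment process Y
x = np.concatenate([[0.0], np.cumsum(y)])   # fBm discretization X
```

This variant draws twice as many Gaussians as strictly necessary but reproduces the target covariance of Y exactly, including VarH(Yt) = 1 and ρH(1) = 2^{2H−1} − 1.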

The results suggest that the bias of H̃n is particularly large for larger values of the Hurst parameter. For n = 100 and n = 1000 the sample standard deviation of Hn is highest for small values of H, and for n = 10 000 it is maximal in the case H = 0.95. Compared to the results for the estimator H̃n, the sample standard deviation of the ZC estimator is larger. For instance, when n = 10 000 and H < 0.85, the standard deviation is about 1.5 times as large. The observed loss of efficiency of Hn compared to H̃n is not surprising, because Hn is based only on the number of changes between upwards and downwards, whereas H̃n uses the metric structure. When n is small, both


estimators tend to overestimate the Hurst parameter (except for very small values of H). Particularly for large values of the Hurst parameter, Hn seems to have much smaller bias than H̃n.

Figure 3 shows the empirical distribution of the number of changes between "upwards" and "downwards" in the samples of length n = 100. For H = 0.70, the distribution looks approximately normal, which is compatible with Corollary 9 (ii), but for large values of H, the distribution is very irregular. Most remarkably, the frequency of even numbers is larger than the frequency of odd numbers, and the distributions conditioned on an odd or an even number look entirely different. For instance, when H = 0.95, the frequencies of odd and even numbers are 0.321 and 0.679, respectively. The distribution conditioned on an odd number of changes is slightly left-skewed with mean 21.5 and mode 23. The distribution conditioned on an even number of changes has mean 14.4 and mode 0 and looks, roughly, like the mixture of a geometric and a binomial distribution. An interesting consequence is that the probability of a change (which is given by c(0.95) = 0.167) is overestimated by the relative frequency of changes in a sample given that the number of changes in the sample is odd, and underestimated given that the number of changes in the sample is even.

A heuristic explanation for the high frequency of even numbers is the following: When H is large, there is a high probability to observe path segments which look, roughly, like a straight line. Typically for such segments, there are only local changes in direction. Globally, there is one prevailing trend, either "downwards" or "upwards". Overall, such segments result in an even number of changes. In other words: paths with an even number of changes look more similar to a straight line than paths with an odd number of changes. Note that in the limit case H → 1, the number of changes is even with probability 1 (namely, equal to 0).

Acknowledgements

The research of Mathieu Sinn was supported by a Government of CanadaPost-Doctoral Research Fellowship (PDRF).

References

[1] Arcones, M. A., 1994. Limit theorems for nonlinear functionals of a stationary Gaussian sequence of vectors. Annals of Probability 22, 2242-2274.


[2] Bandt, C., 2005. Ordinal time series analysis. Ecological Modelling 182, 229-238.

[3] Bandt, C., Pompe, B., 2002. Permutation entropy: A natural complexity measure for time series. Phys. Rev. Lett. 88, 174102.

[4] Bandt, C., Shiha, F., 2007. Order patterns in time series. J. Time Ser. Anal. 28, 646-665.

[5] Cao, Y. H., Tung, W. W., Gao, J. B., Protopopescu, V. A., Hively, L. M., 2004. Detecting dynamical changes in time series using the permutation entropy. Phys. Rev. E 70, 046217.

[6] Coeurjolly, J. F., 2000. Simulation and identification of the fractional Brownian motion: A bibliographical and comparative study. J. Stat. Software 5.

[7] Cornfeld, I. P., Fomin, S. V., Sinai, Ya. G., 1982. Ergodic Theory. Springer-Verlag, Berlin.

[8] Davies, R. B., Harte, D. S., 1987. Tests for Hurst effect. Biometrika 74, 95-101.

[9] Ho, H.-C., Sun, T.-C., 1987. A central limit theorem for noninstantaneous filters of a stationary Gaussian process. J. Multivariate Anal. 22, 144-155.

[10] Kedem, B., 1994. Time Series Analysis by Higher Order Crossings. IEEE Press, New York.

[11] Keller, K., Lauffer, H., 2003. Symbolic analysis of high-dimensional time series. Int. J. Bifurcation Chaos 13, 2657-2668.

[12] Keller, K., Sinn, M., 2005. Ordinal analysis of time series. Physica A 356, 114-120.

[13] Keller, K., Sinn, M., Emonds, J., 2007. Time series from the ordinal viewpoint. Stochastics and Dynamics 2, 247-272.

[14] Keller, K., Lauffer, H., Sinn, M., 2007. Ordinal analysis of EEG time series. Chaos and Complexity Letters 2, 247-258.


[15] Kettani, H., Gubner, J. A., 2006. A novel approach to the estimation of the Hurst parameter in self-similar traffic. IEEE Trans. Circuits Syst. II 53, 463-467.

[16] Lehmann, E., 1999. Elements of Large Sample Theory. Springer, New York.

[17] Li, X., Cui, S., Voss, L. J., 2008. Using permutation entropy to measure the electroencephalographic effects of sevoflurane. Anesthesiology 109, 448-456.

[18] Li, X., Ouyang, G., Richards, D. A., 2007. Predictability analysis of absence seizures with permutation entropy. Epilepsy Research 77, 70-74.

[19] Pfanzagl, J., 1994. Parametric Statistical Theory. De Gruyter, Berlin.

[20] Sinn, M., Keller, K., 2009. Covariances of zero-crossings in Gaussian processes. Submitted.

[21] Taqqu, M. S., 2003. Fractional Brownian Motion and Long-Range-Dependence, in: Doukhan, P., Oppenheim, G., Taqqu, M. S. (Eds.), Theory and Applications of Long-Range-Dependence. Birkhäuser, Boston.


[Six histogram panels for H = 0.70, 0.75, 0.80, 0.85, 0.90 and 0.95; horizontal axis: number of changes (0 to 60), vertical axis: relative frequency (0.00 to 0.08).]

Figure 3: Simulation of the number of changes in samples of the length n = 100.
