A Bayesian Analysis of Multiple Change Point Problems
8/3/2019 A Bayesian Analysis of Multiple Change Point Problems
http://slidepdf.com/reader/full/a-bayesian-analysis-of-multiple-change-point-problems 1/26
A Bayesian analysis of multiple change point problems in data sequences
Rosangela H. Loschi¹
Departamento de Estatística, Universidade Federal de Minas Gerais,
Belo Horizonte - MG, Brazil
email: [email protected]
Pilar L. Iglesias and Reinaldo B. Arellano-Valle
Departamento de Estadística, Facultad de Matemáticas,
Pontificia Universidad Católica de Chile, Santiago, Chile
email: {pliz, reivalle}@mat.puc.cl
Frederico R. B. Cruz
Departamento de Estatística, Universidade Federal de Minas Gerais,
Belo Horizonte - MG, Brazil
email: [email protected]
Abstract
We apply the product partition model (PPM) to identify multiple change points
in normal means (µ) and variances (σ²), extending some previous works. We
establish a full predictivistic characterization for the prior distribution of µ and
σ², which yields an easier way to obtain the prior distribution of these parameters
by considering opinion on observable quantities only. We also propose a Gibbs
sampling scheme to estimate the posterior distributions of the number of change
points and of the instants when changes occurred. We apply the results to identify
multiple changes in the expected return and the volatility of a series of returns
in the Chilean stock market, providing a sensitivity analysis of the model when
different prior specifications are considered. We conclude that the Chilean market
exhibits expected-return and volatility clusters and that the product estimates
are influenced by the prior specifications.
Keywords: Gibbs sampling, predictivism, product partition model, Student-t
distribution.
¹Corresponding author. Departamento de Estatística, ICEx, Universidade Federal de Minas Gerais, Caixa
Postal 702, CEP 31270-901, Belo Horizonte, MG, Brasil. Fax: +55 33 3499 5924
1 Introduction
In this paper we consider a Bayesian analysis of the multiple change point problem using
the product partition model (PPM) proposed by Hartigan (1990). The PPM allows the
identification of multiple change points in the parameters as well as in the functional form of
the distribution itself. Moreover, the PPM introduces some flexibility into the
analysis of change point problems, since the number of change points is a random variable (as
opposed to the known number assumed in threshold models (Chen and Lee (1995), Geweke
and Terui (1993)) and in the model considered by Hawkins (2001), for example).
The one change point problem has been approached from the Bayesian point of view by sev-
eral authors. For example, Menzefricke (1981) considers the problem of making inferences
about a change point in the precision of normal data with unknown mean. A single change
point in the functional form of the distribution is explored by Hsu (1984), who considers the
class of the exponential-power distributions (Box and Tiao, 1973) for treating the problem.
Both authors apply their methodologies to stock market prices. The Bayesian identification
of a single change point is also discussed by Smith (1975). The PPM proposed later by
Hartigan (1990) generalizes most situations described before. The PPM is applied by Barry
and Hartigan (1993) to identify multiple change points in the mean of normal random vari-
ables with common variance. More recently, Crowley (1997) provided a new implementation of
Gibbs sampling for estimating normal means using the PPM. The
identification of change points in normal means with common variance is also considered by
Chernoff and Zacks (1964) and Gardner (1969) using different Bayesian approaches. (More
about change point problems can be found in Carlstein, Mueller and Siegmund (1994).)
The aim of this paper is to apply the PPM presented by Barry and Hartigan (1992) to
identify multiple change points in both the mean µ and the variance σ2 of normal data
which are sequentially observed, extending some results from Barry and Hartigan (1993)
and Crowley (1997). We consider a conjugate prior distribution for the parameters µ and
σ2, justifying this choice within a full predictivistic setting due to de Finetti (1937). In fact,
we propose a more tractable way to elicit the prior distribution of µ and σ² by considering
only opinions on observable quantities. We also use Yao’s (1984) algorithm to compute the
posterior estimates or product estimates for these parameters. A Gibbs sampling scheme to
estimate the posterior distributions of the number of change points as well as the instants
when changes occurred is proposed. Although it uses the transformation suggested by Barry
and Hartigan (1993), the proposed method for estimating these posterior distributions has not,
to our knowledge, appeared in the literature. We also consider different prior specifications for the probability that
a change occurs in any instant and evaluate the sensitivity of the PPM to these different
choices. In order to illustrate the method, the results are applied to identify multiple change
points in the mean and variance of a series of returns of the Chilean stock market. As a
consequence, it is reported that returns in the Chilean stock market are characterized by
changes in the expected or mean return and volatility (measured here as variance).
The PPM introduced by Barry and Hartigan (1992) is briefly reviewed in Section 2. Later in
Section 2 we obtain the Student-t PPM for random variables which are normally distributed,
given the mean and variance (both unknown), providing the posterior estimation for these
parameters. A predictivistic characterization of the Student-t PPM, which explains the
choice of the prior distributions adopted in an alternative way, is provided as a by-product.
In Section 3, we introduce procedures based on Gibbs sampling schemes to compute the pos-
terior distributions for the random partition and for the number of change points, assuming
normal data. Finally, in Section 4 we apply the procedures obtained in Sections 2 and 3
to identify change points in the mean return as well as in the volatility of Endesa (Chilean
National Electric Company) returns. We also provide a sensitivity analysis to the PPM.
2 The Student-t PPM
In this section we apply the Product Partition Model (PPM) introduced by Barry and Har-
tigan (1992) to identify change points in the mean and variance of normal data observed
through time. We consider a conjugate analysis and present a full predictivistic characteri-
zation to the complete model (likelihood function and prior distribution). First, we present
the definition of PPM and some preliminary results obtained from this model, as given by
Barry and Hartigan (1992, 1993).
2.1 The product partition model (PPM)
Let X1, . . . , Xn be a data sequence. Consider a random partition ρ of the set I = {1, . . . , n}
and a random variable B representing the number of blocks in ρ. Each partition
ρ = {i0, i1, . . . , ib}, 0 = i0 < i1 < · · · < ib = n, divides the sequence X1, . . . , Xn into B =
b, b ∈ I, contiguous subsequences, denoted by X[i_{r−1}i_r] = (X_{i_{r−1}+1}, . . . , X_{i_r}),
r = 1, . . . , b. Let c[ij] be the prior cohesion associated with the block [ij] = {i + 1, . . . , j},
i, j ∈ I ∪ {0}, j > i, which represents the degree of similarity among the observations in X[ij]
(Hartigan, 1990).
Hence, the random quantity (X1, . . . , Xn; ρ) is said to follow a PPM, denoted by
(X1, . . . , Xn; ρ) ∼ PPM, if:

i) the prior distribution of ρ is the following product distribution:

P(ρ = {i0, . . . , ib}) = ∏_{j=1}^{b} c[i_{j−1}i_j] / Σ_C ∏_{j=1}^{b} c[i_{j−1}i_j],   (2.1)

where the sum in the denominator is over C, the set of all possible partitions of the set I into b contiguous blocks with
end points i1, . . . , ib, satisfying the condition 0 = i0 < i1 < . . . < ib = n, b ∈ I;
ii) conditionally on ρ = {i0, . . . , ib}, the sequence X1, . . . , Xn has the joint density given
by

f(X1, . . . , Xn | ρ = {i0, . . . , ib}) = ∏_{j=1}^{b} f[i_{j−1}i_j](X[i_{j−1}i_j]),   (2.2)

where f[ij](X[ij]) is the joint density of the random vector X[ij] = (X_{i+1}, . . . , X_j).
Notice that the number of blocks B in ρ has a prior distribution given by

P(B = b) ∝ Σ_{C1} ∏_{j=1}^{b} c[i_{j−1}i_j],   b ∈ I,   (2.3)

where the sum is over C1, the set of all partitions of I into b contiguous blocks.

As shown in Barry and Hartigan (1992), the posterior distributions of ρ and B have the
same form as the prior distributions, with the posterior cohesion for the block [ij] given by
c*[ij] = c[ij] f[ij](X[ij]). That is, the PPM induces a kind of conjugacy.
In the parametric approach to the PPM, we consider a sequence of unknown parameters θ1, . . . , θn such
that, conditionally on θ1, . . . , θn, the random variables X1, . . . , Xn have conditional
marginal densities f1(X1|θ1), . . . , fn(Xn|θn), respectively. In this case, two observations
Xi and Xj, i ≠ j, are considered to be in the same block if it
is believed that they are identically distributed. Thus, in this approach to the PPM, the
predictive distribution f[ij](X[ij]), which appears in (2.2), can be obtained as follows:

f[ij](X[ij]) = ∫_{Θ[ij]} f[ij](X[ij] | θ) π[ij](θ) dθ,   (2.4)

where Θ[ij] is the parameter space corresponding to the common parameter, say θ[ij] =
θ_{i+1} = . . . = θ_j, which indexes the conditional density of X[ij].
The prior distribution of θ1, . . . , θn is constructed as follows. Given a partition ρ = {i0, . . . , ib},
b ∈ I, we have that θi = θ[ir−1ir] for every ir−1 < i ≤ ir, r = 1, . . . , b, and that θ[i0i1], . . . , θ[ib−1ib]
are independent, with θ[ij] having (block) prior density π[ij](θ), θ ∈ Θ[ij].
Hence, the goal in the parametric PPM is to obtain the marginal posterior distributions of
the parameters ρ, B, and θk, k = 1, . . . , n. Barry and Hartigan (1992) have shown that the
posterior distribution of θk is given by

π(θk | X1, . . . , Xn) = Σ_{i=0}^{k−1} Σ_{j=k}^{n} r*[ij] π[ij](θk | X[ij]),   (2.5)

for k = 1, . . . , n, and the posterior expectation of θk is given by

E(θk | X1, . . . , Xn) = Σ_{i=0}^{k−1} Σ_{j=k}^{n} r*[ij] E(θk | X[ij]),   (2.6)

for k = 1, . . . , n, where r*[ij] denotes the posterior relevance of the block [ij], that is,

r*[ij] = P([ij] ∈ ρ | X1, . . . , Xn) = λ[0i] c*[ij] λ[jn] / λ[0n],   (2.7)

where λ[ij] = Σ ∏_{k=1}^{b} c*[i_{k−1}i_k], the summation being over all partitions of {i + 1, . . . , j} into b
contiguous blocks with end points i0, i1, . . . , ib satisfying the condition i = i0 < i1 < · · · < ib = j.
2.2 Product estimates for normal means and variances
Assume that θ1 = (µ1, σ²1), . . . , θn = (µn, σ²n) are such that Xk | µk, σ²k ∼ N(µk, σ²k), k = 1, . . . , n,
independently. Denote by θ[ij] = (µ[ij], σ²[ij]) the common parameter related to the
block [ij]. Thus, the Student-t PPM can be specified by considering the following conditional
(j − i)-dimensional normal distribution for the observations in X[ij]:

X[ij] | µ[ij], σ²[ij] ∼ N_{j−i}(µ[ij] 1_{j−i}, σ²[ij] I_{j−i}),   (2.8)

where 1_k and I_k are the k-dimensional vector of ones and the k × k identity
matrix, respectively, as well as by assuming that (µ[ij], σ²[ij]) has the normal-inverted-gamma
prior distribution, denoted by (µ[ij], σ²[ij]) ∼ NIG(m[ij], v[ij]; a[ij]/2, d[ij]/2), that is,

µ[ij] | σ²[ij] ∼ N(m[ij], v[ij] σ²[ij])   and   σ²[ij] ∼ IG(a[ij]/2, d[ij]/2),   (2.9)

where IG(a, b) denotes the inverted-gamma distribution with parameters a and b. Under (2.8) and
(2.9), the conditional distribution of θ[ij] = (µ[ij], σ²[ij]), given the observations in X[ij], is the
normal-inverted-gamma distribution given by

µ[ij] | X[ij], σ²[ij] ∼ N(m*[ij], v*[ij] σ²[ij])   and   σ²[ij] | X[ij] ∼ IG(a*[ij]/2, d*[ij]/2),   (2.10)
where

m*[ij] = [(j − i) v[ij] X̄[ij] + m[ij]] / [(j − i) v[ij] + 1],
v*[ij] = v[ij] / [(j − i) v[ij] + 1],
d*[ij] = d[ij] + j − i,
a*[ij] = a[ij] + q[ij](X[ij]),   (2.11)

with

X̄[ij] = (1/(j − i)) Σ_{r=i+1}^{j} X_r,

q[ij](X[ij]) = Σ_{r=i+1}^{j} (X_r − X̄[ij])² + [(j − i)(X̄[ij] − m[ij])²] / [(j − i) v[ij] + 1].
(See O’Hagan (1994) for details). Therefore, we obtain from (2.10) and (2.6) that the product
estimates for µk and σ²k are given by

E(µk | X1, . . . , Xn) = Σ_{i=0}^{k−1} Σ_{j=k}^{n} r*[ij] m*[ij]   (if d*[ij] > 1)   (2.12)

and

E(σ²k | X1, . . . , Xn) = Σ_{i=0}^{k−1} Σ_{j=k}^{n} r*[ij] a*[ij] / (d*[ij] − 2)   (if d*[ij] > 2),   (2.13)

respectively, for k = 1, . . . , n, where m*[ij], a*[ij] and d*[ij] are defined as in (2.11).
Notice that the PPM induced by (2.8) and (2.9) implies that, for each block [ij], the random
vector X[ij] follows a (j − i)-dimensional Student-t distribution, denoted by X[ij] ∼
t_{j−i}(m[ij], V[ij]; a[ij], d[ij]), with density function given by

f(X[ij]) = c(d[ij], j − i) a[ij]^{d[ij]/2} |V[ij]|^{−1/2} {a[ij] + (X[ij] − m[ij])′ V[ij]^{−1} (X[ij] − m[ij])}^{−(d[ij]+j−i)/2},   (2.14)

where c(d, k) = Γ[(d + k)/2] {Γ[d/2] π^{k/2}}^{−1}, m[ij] = m[ij] 1_{j−i} and V[ij] = I_{j−i} + v[ij] 1_{j−i} 1′_{j−i}.
The distribution in (2.14) is called by Arellano-Valle and Bolfarine (1995) the generalized
Student-t distribution; it reduces to the usual Student-t distribution with d[ij] degrees
of freedom and the same dispersion matrix when a[ij] = d[ij]. Notice that, under this
model, the elements within the same block are correlated and distributed according to a
distribution with heavier tails than the normal distribution. Moreover, for the block [ij] it
follows that

E(X_j | X_{j−1}, . . . , X_{i+1}) = E(µ[ij] | X_{j−1}, . . . , X_{i+1}) = m*[i(j−1)]

and

E(X²_j | X_{j−1}, . . . , X_{i+1}) = E[(σ²[ij] + µ²[ij]) | X_{j−1}, . . . , X_{i+1}]
= [(j − i) v[ij] + 1] / [(j − i − 1) v[ij] + 1] · a*[i(j−1)] / (d*[i(j−1)] − 2) + (m*[i(j−1)])²,

where m*[i(j−1)], d*[i(j−1)] and a*[i(j−1)] are defined as in (2.11).
2.3 Yao’s algorithm
In order to compute the posterior relevances given in (2.7), we consider the following recursive
algorithm proposed by Yao (1984):

λ[00] = 1,
λ[01] = c*[01],
λ[0j] = c*[0j] + Σ_{t=1}^{j−1} λ[0t] c*[tj],   j = 2, . . . , n,
λ[(n−1)n] = c*[(n−1)n],
λ[in] = c*[in] + Σ_{t=i+1}^{n−1} λ[tn] c*[it],   i = 1, . . . , n − 2,
λ[nn] = 1,   (2.15)
where λ[ij] is the summation presented in (2.7) and c∗[ij] is the posterior cohesion of the block
[ij]. A Gibbs sampling scheme to compute the posterior relevances can be found in Loschi
et al. (2003). See Barry and Hartigan (1993) for a Gibbs sampling scheme to compute the
product estimates directly.
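Yao's recursions (2.15), together with the relevances (2.7), can be implemented directly. The following is a minimal sketch in which the posterior cohesions c*[ij] are supplied as a precomputed table:

```python
def yao_recursions(cstar, n):
    """Yao's (1984) recursions (2.15).

    cstar : dict mapping (i, j), 0 <= i < j <= n, to the posterior cohesion c*[ij];
    returns (lam0, lamn) with lam0[j] = lambda[0j] and lamn[i] = lambda[in].
    """
    lam0 = {0: 1.0, 1: cstar[(0, 1)]}
    for j in range(2, n + 1):
        lam0[j] = cstar[(0, j)] + sum(lam0[t] * cstar[(t, j)] for t in range(1, j))
    lamn = {n: 1.0, n - 1: cstar[(n - 1, n)]}
    for i in range(n - 2, 0, -1):
        lamn[i] = cstar[(i, n)] + sum(lamn[t] * cstar[(i, t)] for t in range(i + 1, n))
    return lam0, lamn

def posterior_relevance(i, j, cstar, lam0, lamn):
    """r*[ij] = lambda[0i] c*[ij] lambda[jn] / lambda[0n], equation (2.7)."""
    n = max(lam0)
    return lam0[i] * cstar[(i, j)] * lamn[j] / lam0[n]
```

The two passes cost O(n²) products, after which every relevance r*[ij] is available in constant time; λ[0n] serves as the normalizing constant.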
2.4 A Predictivistic justification of the Student-t PPM
Eliciting prior distributions for real problems is often not an easy task. In this
section we establish a full predictivistic characterization of the Student-t PPM presented in
Section 2.2, in which the likelihood function as well as the prior distribution of µ and σ² are
consequences of judgements on observable quantities. As a by-product, this characterization
provides a tractable way to elicit the prior distribution of (µ, σ²).
As shown in Section 2.2, the Student-t distribution is a location and scale mixture of the
normal distribution, where the mixing measure is the normal-inverted-gamma distribution.
Thus, it follows that the Student-t distribution can be obtained in two stages. Firstly, a
conditional normal distribution, given the location and scale parameters, is specified. Sec-
ondly, we identify a normal-inverted-gamma distribution as the prior joint distribution for
the location and scale parameters. By adopting the predictivistic approach of de Finetti (1937),
the first stage is replaced by an assumption about observables (Iglesias (1993) and Wechsler
(1993)). For example, the assumption of invariance under some group of orthogonal
transformations over infinite sequences of random quantities implies that the law of the sequence
of observables can be represented as a mixture of conditionally normally distributed and
independent quantities (see Kingman (1972), Smith (1981), Diaconis, Eaton and Lauritzen
(1992)). However, this type of condition does not permit the characterization of the mixing
measure; additional conditions have to be assumed to obtain it. Arellano-Valle, Bolfarine and
Iglesias (1994), following Diaconis and Ylvisaker (1979, 1985), characterize a scale mixture
of a normal distribution by considering invariance under orthogonal transformations together
with additional conditions which determine how to predict X²_{n+1}. In the full
predictivistic approach considered by Arellano-Valle, Bolfarine and Iglesias (1994), the mixing
measure (prior distribution) obtained is the inverted-gamma distribution. These authors
also obtain a characterization for a location and scale mixture of normal distributions which
depends on non-observable quantities; that is, it is not a full predictivistic characterization
of the model. Proposition 2.1 below improves this partial result.
Consider X̄n = (1/n) Σ_{i=1}^{n} Xi and S²n = Σ_{i=1}^{n} (Xi − X̄n)². We say that an infinite sequence
of random variables X1, X2, . . . is O(1)-invariant if, for each n ≥ 2 and real values m and
r, the conditional distribution of X[0n], given X̄n = m and S²n = r², is uniform on the n-
sphere centred at m 1n with radius r, that is, on the set Sn = {(x1, . . . , xn) ∈ Rⁿ : x̄n = m,
Σ_{i=1}^{n} (xi − x̄n)² = r²}.
Proposition 2.1 Let X1, X2, . . . be an infinite sequence of O(1)-invariant random variables
such that P(X1 = X2) = 0 and

E(X3 | X1, X2) = e(X1 + X2) + u,   E(X²3 | X1, X2) = e(X²1 + X²2) + w.   (2.16)

Then e ∈ (0, 1/2), u ∈ R, w > u²/(1 − 2e) and, for each n ≥ 3,

X[0n] ∼ t_n( [u/(1 − 2e)] 1_n, I_n + [e/(1 − 2e)] 1_n 1′_n; (1/e)[w − u²/(1 − 2e)]; (1 + e)/e ).   (2.17)

The converse also holds.
Proof: From Smith's (1981) theorem, there exist random variables µ and σ² such that, for
every n ≥ 2,

X[0n] | µ, σ² ∼ N(µ 1_n, σ² I_n),

where σ² > 0 with probability one. Consequently, considering M = Σ_{i=1}^{2} X_i = 2X̄ and
Q = Σ_{i=1}^{2} X²_i = S² + 2X̄², and denoting by θ = (θ1, θ2) = (µ/σ², −1/(2σ²)) the natural
parameter of the distribution of (M, Q), given (µ, σ²), we obtain the following conditional
density of (M, Q) given θ:
dP_θ(M, Q) = exp{(θ1, θ2)(M, Q)ᵗ − D(θ)} dξ(M, Q),

where dξ(M, Q) = (1/(π√2)) (Q − M²/2)^{−1/2} dλ, λ is the Lebesgue measure defined on R², and
D(θ) = −θ²1/(2θ2) − log(−θ2).

The vector of partial derivatives of D(θ) with respect to the natural parameters θ1 and θ2 is
given by

∇D(θ) = ( −θ1/θ2, θ²1/(2θ²2) − 1/θ2 ) = E{(M, Q) | θ}.
Hence, by using properties of the conditional expectation and conditions (2.16), it follows
that

E{∇D(θ) | (M, Q)} = E{ E{(M, Q) | θ1, θ2} | (M, Q) }
= 2 E{ E{(X3, X²3) | (µ, σ²)} | X1, X2 }
= 2e(X1 + X2, X²1 + X²2) + 2(u, w).
From Theorem 3 in Diaconis and Ylvisaker (1979), the following prior density for (µ, σ²) is
obtained:

π(µ, σ²) = K (1/σ²)^{1/(2e) + 3/2} exp{ −(1/(2eσ²))[w − u²/(1 − 2e)] } [(1 − 2e)/(eσ²)]^{1/2} exp{ −[(1 − 2e)/(2eσ²)][µ − u/(1 − 2e)]² }.   (2.18)

Consequently, (2.17) is obtained (see O'Hagan (1994), p. 244). The converse is obtained by
using the properties of the Student-t distribution (see Arellano-Valle and Bolfarine (1995)).
Proposition 2.1 improves some partial results from Arellano-Valle, Bolfarine and Iglesias
(1994) by providing a full predictivistic characterization to a location and scale mixture of
normal distributions. Extensions of this result to Student-t linear models can be found in
Loschi, Iglesias and Arellano-Valle (2003).
Corollary 2.1 Consider the assumptions established in Proposition 2.1. Then the parameters
µ and σ² have the following normal-inverted-gamma distribution:

µ | σ² ∼ N( u/(1 − 2e), eσ²/(1 − 2e) )   and   σ² ∼ IG( (1/(2e))[w − u²/(1 − 2e)], (1 + e)/(2e) ).   (2.19)
Thus, under the O(1)-invariance assumption, the representations in (2.16) are equivalent to the
specification in (2.19).
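Corollary 2.1 gives a direct recipe for turning predictive judgements (e, u, w) about observables into hyperparameters of the normal-inverted-gamma prior. A small sketch of this mapping, with illustrative input values:

```python
def nig_from_predictive(e, u, w):
    """Map the predictive judgements (e, u, w) of (2.16) to the NIG
    hyperparameters of Corollary 2.1, in the paper's NIG(m, v; a/2, d/2)
    notation: m = u/(1-2e), v = e/(1-2e),
    a = (1/e)(w - u^2/(1-2e)), d = (1+e)/e.
    """
    assert 0 < e < 0.5 and w > u * u / (1 - 2 * e)  # conditions of Proposition 2.1
    m = u / (1 - 2 * e)
    v = e / (1 - 2 * e)
    a = (w - u * u / (1 - 2 * e)) / e
    d = (1 + e) / e
    return m, v, a, d

# For instance, e = 0.25, u = 0, w = 1 gives m = 0, v = 0.5, a = 4, d = 5,
# so the implied prior mean of sigma^2 is a/(d - 2) = 4/3.
hyper = nig_from_predictive(0.25, 0.0, 1.0)
```

In this way the prior is elicited entirely from opinions about the first two predictive moments, as the characterization intends.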
3 Posterior Distributions for ρ and B
In this section, we provide the exact posterior distribution for ρ and B assuming the prior
cohesions suggested by Yao (1984) and propose a Gibbs sampling scheme to estimate these
posterior distributions.
3.1 Exact Posterior Distributions
Let p, 0 ≤ p ≤ 1, be the probability that a change occurs at any given instant in the sequence.
The prior cohesion for block [ij] then corresponds to the probability that a new change
takes place after j − i instants, given that a change has taken place at instant i, that is,

c[ij] = p(1 − p)^{j−i−1},   if j < n,
c[ij] = (1 − p)^{j−i−1},   if j = n.   (3.1)
Notice that the prior cohesions given in (3.1) imply that the sequence of change points
forms a discrete renewal process, with occurrence times identically distributed according to a
geometric distribution. Choosing a high value for p amounts to assuming a priori that
there are small blocks of data (or, equivalently, a large number of change points) in the
data sequence. Assuming these cohesions, it follows from expression (2.1) that the prior
distribution of ρ takes the form

P(ρ = {i0, i1, . . . , ib}) = p^{b−1}(1 − p)^{n−b},

b ∈ I, which depends only on the number of observations n and the number of blocks b in the
partition, but not on the positions where the change points occur. Moreover,
it follows that the prior distribution of the random variable B is given by

P(B = b) = C(n−1, b−1) p^{b−1}(1 − p)^{n−b},   for all b ∈ I,

where C(n−1, b−1) denotes the binomial coefficient "n − 1 choose b − 1", the number of distinct partitions of I into b contiguous blocks. Since B − 1 then follows a binomial distribution with parameters n − 1 and p, we have

E(B) = 1 + (n − 1)p   (3.2)

and

V(B) = (n − 1)p(1 − p).   (3.3)
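Under the cohesions (3.1), the number of change points B − 1 is binomially distributed with parameters n − 1 and p. A quick numerical check of the prior pmf of B, with illustrative values of n and p:

```python
from math import comb

def prior_B(n, p):
    """Prior pmf of the number of blocks B under the cohesions (3.1):
    P(B = b) = C(n-1, b-1) p^(b-1) (1-p)^(n-b), b = 1, ..., n."""
    return {b: comb(n - 1, b - 1) * p ** (b - 1) * (1 - p) ** (n - b)
            for b in range(1, n + 1)}

n, p = 95, 0.1
pmf = prior_B(n, p)
mean_B = sum(b * pb for b, pb in pmf.items())                  # 1 + (n - 1) p
var_B = sum((b - mean_B) ** 2 * pb for b, pb in pmf.items())   # (n - 1) p (1 - p)
```

The mean is one block plus the expected number of change points, (n − 1)p, while the variance is that of the binomial count of change points.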
From Section 2.1, we only need to find the posterior cohesion for each block in order to obtain the
posterior distributions of ρ and B. Recalling that the posterior cohesion for the block [ij] is
obtained by multiplying the corresponding prior cohesion by the predictive distribution of
X[ij], which is the Student-t density defined in (2.14), the following result is obtained:

c*[ij] = p(1 − p)^{j−i−1} c(d[ij], j − i) a[ij]^{d[ij]/2} / [ (1 + (j − i)v[ij])^{1/2} {a[ij] + q[ij](X[ij])}^{(d[ij]+j−i)/2} ],   if j < n,

c*[ij] = (1 − p)^{j−i−1} c(d[ij], j − i) a[ij]^{d[ij]/2} / [ (1 + (j − i)v[ij])^{1/2} {a[ij] + q[ij](X[ij])}^{(d[ij]+j−i)/2} ],   if j = n,

where c(d, k) and q[ij](X[ij]) are defined as in (2.14) and (2.11), respectively.
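Because the posterior cohesions involve ratios of gamma functions and high powers, in practice they are best computed on the log scale. A sketch, with illustrative hyperparameters as defaults:

```python
import math

def log_posterior_cohesion(x_block, p, is_last, m=0.0, v=1.0, a=0.01, d=4.0):
    """Log of c*[ij] = c[ij] f[ij](X[ij]), combining Yao's cohesions (3.1)
    with the Student-t predictive (2.14); the default hyperparameters are
    illustrative, not those of the paper.

    x_block : observations in the block X[ij]; is_last : True when j = n.
    """
    k = len(x_block)
    xbar = sum(x_block) / k
    q = sum((xr - xbar) ** 2 for xr in x_block) + k * (xbar - m) ** 2 / (k * v + 1.0)
    # log c(d, k) = log Gamma((d+k)/2) - log Gamma(d/2) - (k/2) log pi
    log_c = math.lgamma((d + k) / 2.0) - math.lgamma(d / 2.0) - 0.5 * k * math.log(math.pi)
    log_pred = (log_c + 0.5 * d * math.log(a)
                - 0.5 * math.log(1.0 + k * v)
                - 0.5 * (d + k) * math.log(a + q))
    log_prior = (k - 1) * math.log(1.0 - p) + (0.0 if is_last else math.log(p))
    return log_prior + log_pred
```

Working on the log scale keeps the recursions in (2.15) and the Gibbs ratios of Section 3.2 numerically stable even for long blocks.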
Notice that the exact calculation of the posterior distributions of ρ and B demands a great
computational effort, in spite of the simplifications introduced by Yao's (1984) algorithm. In
the next section we propose a Gibbs sampling scheme for estimating the posterior distributions
of the random partition ρ and of the random quantity B (see Gelfand and Smith (1990) and
Gamerman (1997) for MCMC methods).
3.2 Gibbs Sampling Approach
Consider the auxiliary random quantities U_r suggested by Barry and Hartigan (1993), which
reflect whether or not a change point has occurred at instant r, that is,

U_r = 1, if θ_r = θ_{r−1};   U_r = 0, if θ_r ≠ θ_{r−1},

for r = 2, . . . , n. Thus, the random quantity ρ is perfectly identified by the
vector (U_2, . . . , U_n), n > 2. Consequently, we can estimate the posterior probability of each
particular partition ρ = {i0, i1, . . . , ib} by computing the proportion of sampled vectors
(U_2, . . . , U_n) compatible with it, that is, those with U_r = 0 exactly at the change instants
and U_r = 1 otherwise.

Similarly, the same samples can be used to estimate the posterior distribution of
B, noticing that

B = 1 + Σ_{r=2}^{n} (1 − U_r).
The vector (U^k_2, . . . , U^k_n) at step k is generated by Gibbs sampling as
follows. Starting from an initial sample (U⁰_2, . . . , U⁰_n) of the random vector (U_2, . . . , U_n), at
step k the r-th element U^k_r is generated from the conditional distribution of

U_r | U^k_2, . . . , U^k_{r−1}, U^{k−1}_{r+1}, . . . , U^{k−1}_n; X_1, . . . , X_n,

r = 2, . . . , n. To generate the vectors above, it is sufficient to consider the ratios given by
the following expressions:

R_r = P(U_r = 1 | A^k_r; X_1, . . . , X_n) / P(U_r = 0 | A^k_r; X_1, . . . , X_n),

r = 2, . . . , n, where A^k_r = {U^k_2 = u_2, . . . , U^k_{r−1} = u_{r−1}, U^{k−1}_{r+1} = u_{r+1}, . . . , U^{k−1}_n = u_n}. Hence,
considering a degenerate prior distribution for p, we have that

R_r = c*[xy] / ( c*[xr] c*[ry] ),

where c*[ij] is the posterior cohesion for the block [ij],

x = max{ i ∈ {2, . . . , r − 1} : U^k_i = 0 },   if U^k_i = 0 for some i ∈ {2, . . . , r − 1},   and x = 0 otherwise,

and

y = min{ i ∈ {r + 1, . . . , n − 1} : U^{k−1}_i = 0 },   if U^{k−1}_i = 0 for some i ∈ {r + 1, . . . , n − 1},   and y = n otherwise.
Consequently, the criterion for updating the vectors (U^k_2, . . . , U^k_n) becomes

U^k_r = 1, if c*[xy] / ( c*[xr] c*[ry] ) ≥ (1 − u)/u,   and U^k_r = 0 otherwise,

r = 2, . . . , n, where u is a random number drawn from the uniform distribution U(0, 1). This
completes the procedure to estimate the posterior distributions of the random partition ρ
and of the number of blocks B. (Loschi et al. (2003) extend the PPM presented in this
paper by considering a beta prior distribution for p. In that case, the choice of p seems
less arbitrary, since the beta family is rich enough to describe the uncertainty about p under
many practical circumstances. For example, a proper non-informative prior distribution for
p can be specified by setting both beta parameters equal to 1. A comparison between the
results obtained here and those obtained by Loschi et al. (2003) can be found in Loschi and
Cruz (2002a), which concludes that the product estimates obtained by using a degenerate
prior distribution for p are similar to those obtained with a beta prior distribution whose
modal value is close to that fixed value of p.)
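Putting the pieces together, the whole updating scheme can be sketched as follows. This is an illustrative implementation, not the authors' C++ code: it assumes a single set of NIG hyperparameters (m, v, a, d) for every block, and it indexes the indicators so that U_r = 0 marks a block boundary between positions r and r + 1, which matches the convention above up to a shift of the index:

```python
import math
import random

def gibbs_partition(x, p, n_iter=500, burn_in=100, hyper=(0.0, 1.0, 0.01, 4.0)):
    """Gibbs sampler for the change point indicators of Section 3.2 (a sketch).

    x : data sequence; p : fixed (degenerate prior) change probability;
    hyper = (m, v, a, d) : common NIG hyperparameters, an assumption made here
    for simplicity. Returns post-burn-in samples of the number of blocks B.
    """
    n = len(x)
    m, v, a, d = hyper

    def log_cstar(i, j):
        # log posterior cohesion of block [ij] = {i+1, ..., j}, Sections 2.2 and 3.1
        blk = x[i:j]
        k = j - i
        xbar = sum(blk) / k
        q = sum((xr - xbar) ** 2 for xr in blk) + k * (xbar - m) ** 2 / (k * v + 1.0)
        lp = (k - 1) * math.log(1.0 - p) + (math.log(p) if j < n else 0.0)
        lf = (math.lgamma((d + k) / 2.0) - math.lgamma(d / 2.0)
              - 0.5 * k * math.log(math.pi) + 0.5 * d * math.log(a)
              - 0.5 * math.log(1.0 + k * v)
              - 0.5 * (d + k) * math.log(a + q))
        return lp + lf

    U = [1] * (n - 1)   # U[r-1] = 1: no block boundary between positions r and r+1
    samples = []
    for it in range(n_iter):
        for r in range(1, n):
            # nearest boundaries to the left and right of r (0 and n if none)
            xlo = max((i for i in range(1, r) if U[i - 1] == 0), default=0)
            yhi = min((i for i in range(r + 1, n) if U[i - 1] == 0), default=n)
            log_R = log_cstar(xlo, yhi) - log_cstar(xlo, r) - log_cstar(r, yhi)
            # P(U_r = 1 | rest) = R / (1 + R), evaluated stably on the log scale
            if log_R >= 0:
                prob_keep = 1.0 / (1.0 + math.exp(-log_R))
            else:
                prob_keep = math.exp(log_R) / (1.0 + math.exp(log_R))
            U[r - 1] = 1 if random.random() < prob_keep else 0
        if it >= burn_in:
            samples.append(1 + sum(1 - u for u in U))   # number of blocks B
    return samples
```

The posterior distribution of B is then estimated by the empirical frequencies of the sampled values, and the partition ρ is recovered in the same way from the sampled indicator vectors.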
4 Applications: The Chilean Stock Market Behavior
The goal of this section is twofold: to present a sensitivity analysis for the PPM, assuming
different degenerate prior distributions for p, and to identify multiple change points in the
mean (or expected return) and variance (volatility) of the returns of the Endesa stock series
(Figure 1) within the period from 1987 to 1994, using the methodology developed in the
previous sections. As usual in finance, the return series is defined by the transformation
X_t = (P_t − P_{t−1})/P_{t−1}, where P_t is the price in month t. Defined in this way, the returns
within each block can be considered normally distributed, given the expected return and the
volatility (Correa, 1998).
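The return transformation can be written directly; the price values below are hypothetical:

```python
prices = [100.0, 104.0, 101.0, 103.0]    # hypothetical monthly prices P_t
# X_t = (P_t - P_{t-1}) / P_{t-1}
returns = [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]
```

Each return is the relative price change over one month, so a series of n prices yields n − 1 returns.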
Figure 1: Returns of ENDESA. [Time series plot of the monthly returns against year, 1987–1995; returns range from about −0.2 to 0.4.]
4.1 Sensitivity analysis
We adopt the following normal-inverted-gamma prior specification to describe the uncertainty
about the parameter (µ[ij], σ²[ij]):

µ[ij] | σ²[ij] ∼ N(0, σ²[ij])   and   σ²[ij] ∼ IG(0.01/2, 4/2).
We also consider the prior cohesions given in (3.1). Since a small number of changes is
expected, we consider p = 0.01 and p = 0.1 to evaluate the influence of these prior specifications on
the posterior estimates of µ, σ², B and ρ. We also consider two very different prior specifications
for the chance of a change occurring, p = 0.5 and p = 0.9; in these cases, a higher
number of change points is expected a priori.

In the Gibbs sampling scheme, we generate 5,000 samples of the 94-dimensional vector (U_2, . . . , U_n),
starting from a vector of zeros. We discarded the initial 1,000 iterations as burn-in, after which
convergence had been reached. A lag of 1 is selected, since the correlation among successive vectors is low.
The algorithms used here were coded in C++. All experiments were performed on a 166 MHz PC
with 32 MB RAM, running Windows 98, using the freely available C++ compiler
DJGPP (http://www.delorie.com/djgpp).
Figures 2 and 3 show the posterior estimates of µk and σ²k, k = 1, . . . , 95, that is, of the
monthly mean returns and volatility, respectively. The product estimates of µ (σ²) are
contrasted with the centered arithmetic moving average (moving variance) of order 10.
It is noticeable that more instants are identified as change points
when higher values of p are considered. We also notice that similar estimates are obtained for
close values of p. For p = 0.1, we observe that the estimates obtained using the PPM are very
similar to the naïve estimates.
Figure 2: Posterior means of µ. [Four panels, for p = 0.01, 0.1, 0.5 and 0.9; each panel plots the data (∗), the product estimates of the mean return, and the running moving average against year, 1987–1995.]

Figure 3: Posterior means of σ². [Four panels, for p = 0.01, 0.1, 0.5 and 0.9; each panel plots the volatility estimates and the moving variance against year, 1987–1995.]
Figure 4 presents the most probable partition for different values of p. Similarly to the
conclusions drawn from Figures 2 and 3, we observe that for higher values of p more
instants are identified as change points.
Figure 4: Posterior distribution of ρ
[Four panels (p = 0.01, 0.1, 0.5, 0.9): the Endesa returns against year, 1987-1995, with the
most probable partition superimposed.]
Table 1 presents the prior and posterior probabilities of the most probable partition. Notice
that the probability of the most probable partition increases substantially from the prior to
the posterior evaluation.
Table 1: Prior and posterior probability of the most probable partition

    p        Prior probability    Posterior probability
    0.010    4.007 x 10^-7        0.3567
    0.100    1.593 x 10^-16       0.0173
    0.500    2.524 x 10^-29       0.0013
    0.900    1.161 x 10^-13       0.0285
From Figure 5 we can notice that the posterior distribution of the number of blocks B in the
partition (or, equivalently, of the number of change points B - 1 in the time series) is
unimodal regardless of the value assumed for p. We can also notice that if p is small the
posterior distribution of B is centered at lower values (see Table 2 for descriptive statistics
of the posterior distribution of B). It is also noticeable that, for all values of p, the posterior
probability of having one or more change points in the Endesa series is one.
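A minimal sketch of how such a posterior for B can be tabulated from Gibbs draws of the change-point indicator vectors; the toy draws below are hypothetical stand-ins for actual sampler output, not values from the Endesa analysis:

```python
from collections import Counter

def posterior_of_B(samples):
    """Estimate P(B = b | data) from Gibbs draws of change-point indicators.

    Each sample is a 0/1 vector over the n - 1 candidate instants; the
    number of blocks B is the number of change points plus one.
    """
    counts = Counter(1 + sum(s) for s in samples)
    total = len(samples)
    return {b: counts[b] / total for b in sorted(counts)}

# toy draws standing in for real sampler output
draws = [[1, 0, 0], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(posterior_of_B(draws))  # {1: 0.25, 2: 0.5, 3: 0.25}
```

The same dictionary of relative frequencies is what a histogram such as Figure 5 displays.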
Figure 5: Posterior distribution of B
[Four panels (p = 0.01, 0.1, 0.5, 0.9): posterior probability against the number of blocks,
0 to 80.]
Table 2: Descriptive statistics - prior and posterior distributions of B

            Prior Distribution     Posterior Distribution
    p       Mean     Variance      Mean     Variance   Mode   Median   Q1   Q3
    0.010    0.940    0.9306        5.093    2.158       4      4       4    6
    0.100    9.400    8.4600       17.075    9.682      16     17      15   19
    0.500   47.000   23.5000       50.521   23.436      50     50      47   54
    0.900   84.600    8.4600       84.753    9.263      85     85      83   87
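Under the geometric cohesions, each of the n - 1 candidate instants is a change point independently with probability p, so the prior number of change points B - 1 is Binomial(n - 1, p). The sketch below assumes n = 95 observations; that value is an inference from the table (94p matches the prior mean column exactly), not a figure stated in this section:

```python
def prior_B_moments(n, p):
    """Prior mean and variance of the number of change points B - 1,
    which is Binomial(n - 1, p) under the geometric prior cohesions."""
    m = n - 1
    return m * p, m * p * (1 - p)

# n = 95 reproduces the prior mean/variance columns of Table 2
for p in (0.01, 0.1, 0.5, 0.9):
    mean, var = prior_B_moments(95, p)
    print(p, round(mean, 3), round(var, 4))
```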
Notice from Table 2 that the location summaries (mean, mode, median) of the posterior
distribution of B, as well as the mean of the prior distribution of B, increase as p increases.
We also observe that the posterior variance is higher than the prior variance for p = 0.01,
0.1, and 0.9; the opposite holds for p = 0.5. It is also noticeable that the posterior variance
increases with p for values of p up to 0.5. (See more about the influence of prior specifications
in the PPM in Loschi and Cruz (2002a,b).)
4.2 A note on the model specification
We suppose that, conditionally on the average stock return and its total standard deviation,
any path followed by the returns within a block presenting the same average return and
total standard deviation is "equally likely" to occur, which is mathematically expressed
by the O(1)-invariance assumption amongst the returns. Hence, assuming extendibility
(that is, assuming that all subsequences (X_{i+1}, . . . , X_j) are part of an infinite O(1)-invariant
sequence), we have that the joint distribution of the Endesa returns in the same block, X[ij],
can be represented as a mixture of products of normal distributions N(µ[ij], σ2[ij])
(Smith, 1981), which agrees with Correa's (1998) assumptions about the Chilean market.
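Written out, the representation just described takes the following form (the notation π for the mixing distribution over (µ[ij], σ2[ij]) is an assumption of this sketch, not taken from the paper):

```latex
f\bigl(x_{[ij]}\bigr)
  = \int \prod_{r=i+1}^{j}
      \frac{1}{\sqrt{2\pi\sigma^{2}_{[ij]}}}
      \exp\!\left\{-\frac{\bigl(x_r-\mu_{[ij]}\bigr)^{2}}
                         {2\sigma^{2}_{[ij]}}\right\}
    \, d\pi\bigl(\mu_{[ij]},\sigma^{2}_{[ij]}\bigr),
```

that is, the returns within a block are conditionally i.i.d. N(µ[ij], σ2[ij]) given the block parameters, with the marginal model obtained by mixing over π.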
We also assume the conditions in (2.16), understanding that these conditions capture the
considerations made by Mandelbrot (1963), as well as what Maeda (1996) suggests to be
reasonable for the Chilean market: that large returns tend to be followed by large returns,
that small returns tend to be followed by small returns, and that changes in this behavior
are produced by unanticipated information.
These assumptions lead to a predictive distribution with heavy tails (a Student-t distribution)
for the returns in the same block, which also discloses a correlation structure amongst the
returns. Since the Chilean stock market is an emerging market, and thus can experience more
changes than a developed market because it is more susceptible to the political atmosphere,
the Student-t distribution is more appropriate to describe the behavior of its stock returns
(Duarte Jr. and Mendes, 1997; Mendes, 2000). (Notice that the normality assumption
adopted by Hsu (1984) (see also Hawkins (2001)) to describe the behavior of the Dow Jones
Industrial Average is stronger than the assumptions we made: we only state that the data
are conditionally normally distributed.)
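The heavy-tail point can be seen directly by comparing densities. The sketch below (plain Python; ν = 3 degrees of freedom is chosen only for illustration and is not a value from the paper) computes the ratio of a standard Student-t density to the standard normal density far in the tail:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def student_t_pdf(x, nu, mu=0.0, sigma=1.0):
    """Density of a Student-t with nu d.f., location mu, scale sigma, at x."""
    z = (x - mu) / sigma
    c = math.gamma((nu + 1.0) / 2.0) / (
        math.gamma(nu / 2.0) * math.sqrt(nu * math.pi) * sigma)
    return c * (1.0 + z * z / nu) ** (-(nu + 1.0) / 2.0)

# far in the tail, the Student-t assigns much more mass than the normal
print(student_t_pdf(4.0, nu=3) / normal_pdf(4.0))
```

The ratio is well above one, illustrating why a Student-t predictive accommodates extreme returns that a normal model would treat as nearly impossible.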
The prior cohesions given in (3.1) and considered in the analysis of the Endesa series imply
that the sequence of change points forms a discrete renewal process, with identically
geometrically distributed occurrence times. This type of product partition distribution
represents reasonably well the situation described by Mandelbrot (1963) (and later by
Maeda (1996) for the Chilean stock market), who established that changes in the behavior
of a series of stock returns are a consequence of the receipt of information not previously
anticipated, so that past change points are noninformative about future change points.
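Under such cohesions, the prior probability of a partition is the product of its block cohesions. A minimal sketch of evaluating it, assuming Yao-style cohesions c[ij] = p(1-p)^(j-i-1) for j < n and c[in] = (1-p)^(n-i-1) (the indexing convention here is an assumption of this illustration, not a transcription of (3.1)):

```python
def prior_partition_prob(endpoints, n, p):
    """Prior probability of the partition of instants 0 < ... < n whose
    interior blocks end at the given instants (each endpoint < n).

    A block [i, j] with j < n contributes p * (1 - p)**(j - i - 1);
    the final block, ending at n, contributes (1 - p)**(n - i - 1).
    """
    prob, i = 1.0, 0
    for j in endpoints:                  # interior block endpoints, j < n
        prob *= p * (1.0 - p) ** (j - i - 1)
        i = j
    return prob * (1.0 - p) ** (n - i - 1)   # final block ends at n

# with no change points, the single block [0, n] has probability (1-p)^(n-1)
print(prior_partition_prob([], 95, 0.01))
```

Prior probabilities of this form are what the "Prior probability" column of Table 1 reports for the most probable partition under each p.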
5 Conclusions
In this paper we have applied the PPM to identify multiple change points in normal means
and variances for data sequences, extending previous results of Barry and Hartigan (1993)
and Crowley (1997). We have proposed a Gibbs sampling scheme to estimate the posterior
distributions of the number of change points as well as of the instants when the changes
occurred. We have applied the method to identify change points in the mean return and
the volatility of the Endesa stock returns and provided a sensitivity analysis for the PPM.
The results indicate that the procedures proposed to compute the posterior estimates of B
and ρ are quite effective, simple, and easy to implement. We also conclude that the prior
specification for p strongly influences both the posterior distributions of the number of
change points and of the instants when changes occurred, as well as the product estimates
of the mean and the variance. Since p is crucial for the inferences, we could instead estimate
p by assuming a prior distribution for it. In that case, however, the conjugacy of the PPM
is lost and Yao's procedure cannot be used (see Loschi and Cruz (2002a)). A procedure to
obtain the posterior distribution of p using Gibbs sampling can be found in Loschi and
Cruz (2003).
We believe that some improvement would be obtained if Yao's procedure could be modified
so that a prior distribution for p could be included. An alternative algorithm is considered
in Quintana and Iglesias (2003) in connection with the nonparametric approach to cluster
analysis.
Some open questions remain. Can different prior specifications for the mean and variance
affect the product estimates? Would it be possible to find even simpler implementations of
the PPM? How well does the methodology perform in the presence of outliers? These and
other similar questions are interesting topics for future research in this area.
6 Acknowledgements
This research was supported in part by PRPq-UFMG, grant 4801-UFMG/RTR/FUNDO/
PRPq/RECEM DOUTORES/00; CAPES; FONDECYT, grants 8000004, 1971128, and
1990431; and Fundación Andes (Chile). The authors would like to thank Heleno Bolfarine
and Wilfredo Palma for their valuable comments and contributions to this paper.
References
Arellano-Valle, R. B. and H. Bolfarine. On some characterizations of the t-distribution.
Statistics & Probability Letters, 25:79–85, 1995.
Arellano-Valle, R. B., H. Bolfarine, and P. L. Iglesias. A predictivistic interpretation of the
multivariate t distribution. Test , 2(3):221–236, 1994.
Barry, D. and J. A. Hartigan. Product partition models for change point problems. The
Annals of Statistics, 20(1):260–279, 1992.
Barry, D. and J. A. Hartigan. A Bayesian analysis for change point problems. Journal of
the American Statistical Association, 88(421):309–319, 1993.
Box, G. E. P. and G.C. Tiao. Bayesian Inference in Statistical Analysis. Addison-Wesley,
New York, USA, 1973.
Chen, C. W. S. and J. C. Lee. Bayesian inference of threshold autoregressive models. Journal
of Time Series Analysis, 16(5):483–492, 1995.
Chernoff, H. and S. Zacks. Estimating the current mean of a normal distribution which is
subjected to changes in time. Annals of Mathematical Statistics, 35:999–1018, 1964.
Correa, L. Modelación Bayesiana de puntos de cambio en la volatilidad. Master's thesis,
Facultad de Matemáticas, Pontificia Universidad Católica de Chile, Chile, 1998. (in
Spanish).
Crowley, E. M. Product partition models for normal means. Journal of the American
Statistical Association , 92(437):192–198, 1997.
de Finetti, B. La prévision: ses lois logiques, ses sources subjectives. Annales de l'Institut
Henri Poincaré, 7:1–68, 1937.
Diaconis, P., M. L. Eaton, and S. L. Lauritzen. Finite de Finetti theorems in linear models
and multivariate analysis. Scandinavian Journal of Statistics, 19:298–315, 1992.
Diaconis, P. and D. Ylvisaker. Conjugate priors for exponential families. Annals of Statistics,
7:269–281, 1979.
Diaconis, P. and D. Ylvisaker. Quantifying prior opinion. In J. M. Bernardo, M. H. DeGroot,
D. V. Lindley, and A. F. M. Smith, editors, Bayesian Statistics 2, pages 133–156. North-
Holland, Elsevier Science, 1985.
Duarte Jr., A. M. and B. V. M. Mendes. Product partition models for normal means.
Emerging Markets Quarterly , 1(4):85–95, 1997.
Gamerman, D. Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference.
Chapman and Hall, London, UK, 1997.
Gardner, L. A. On detecting changes in the mean of normal variates. Annals of Mathematical
Statistics, 40:116–126, 1969.
Gelfand, A. E. and A. F. M. Smith. Sampling-based approaches to calculating marginal
densities. Journal of the American Statistical Association , 85:398–409, 1990.
Geweke, J. and N. Terui. Bayesian threshold autoregressive models for nonlinear time series.
Journal of Time Series Analysis, 14(5):441–454, 1993.
Hartigan, J. A. Partition models. Communications in Statistics - Theory and Methods,
19(8):2745–2756, 1990.
Hawkins, D. M. Fitting multiple change-point models to data. Computational Statistics &
Data Analysis, 37:323–341, 2001.
Hsu, D. A. A Bayesian robust detection of shift in the risk structure of stock market returns.
Journal of the American Statistical Association, 77(2):407–416, 1984.
Iglesias, P. L. Formas finitas do teorema de de Finetti: A visão preditivista da inferência
estatística em populações finitas. PhD thesis, Departamento de Estatística, Instituto de
Matemática e Estatística, Universidade de São Paulo, São Paulo, Brazil, 1993. (in
Portuguese).
Kingman, J. F. C. On random sequences with spherical symmetry. Biometrika, 59:183–197,
1972.
Loschi, R. H. and F. R. B. Cruz. An analysis of the influence of some prior specifications in
the identification of change points via product partition model. Computational Statistics
& Data Analysis, 39:477–501, 2002.
Loschi, R. H. and F. R. B. Cruz. Applying the product partition model to the identification
of multiple change points. Advances in Complex Systems, 5(4):371–387, 2002.
Loschi, R. H. and F. R. B. Cruz. Extension to the product partition model: Computing the
probability of a change. Manuscript submitted for publication, 2003.
Loschi, R. H., F. R. B. Cruz, P. L. Iglesias, and R. B. Arellano-Valle. A Gibbs sampling
scheme to the product partition model: An application to change-point problems. Computers
& Operations Research, 30(3):463–482, 2003.
Loschi, R. H., P. L. Iglesias, and R. B. Arellano-Valle. Predictivistic characterization of
multivariate Student-t models. Journal of Multivariate Analysis, 85(1):10–23, 2003.
Maeda, M. A. Volatilidad estocástica en el mercado accionario chileno. Master's thesis,
Facultad de Ciencias Económicas y Administrativas, Universidad de Chile, Chile, 1996.
(in Spanish).
Mandelbrot, B. The variation of certain speculative prices. Journal of Business, 36:394–419,
1963.
Mendes, B. V. M. Computing robust risk measures in emerging equity markets using extreme
value theory. Emerging Markets Quarterly , pages 24–41, 2000.
Menzefricke, U. A Bayesian analysis of a change in the precision of a sequence of independent
normal random variables at an unknown time point. Applied Statistics, 30(2):141–146,
1981.
Carlstein, E., H.-G. Mueller, and D. Siegmund, editors. Change-Point Problems. IMS
Lecture Notes - Monograph Series, 23, USA, 1994.
O’Hagan, A.Kendall’s Advanced Theory of Statistics 2A
, chapter Bayesian Inference. JohnWiley & Sons, New York, NY, 1994.
Quintana, F. A. and P. L. Iglesias. Nonparametric Bayesian clustering and product partition
models. Journal of the Royal Statistical Society B (to appear), 2003.
Smith, A. F. M. A Bayesian approach to inference about a change-point in a sequence of
random variables. Biometrika , 62(2):407–416, 1975.
Smith, A. F. M. On random sequences with centered spherical symmetry. Journal of the
Royal Statistical Society, B , 43:203–241, 1981.
Wechsler, S. Exchangeability and predictivism. Erkenntnis: International Journal of
Analytic Philosophy, 38:343–350, 1993.
Yao, Y. Estimation of a noisy discrete-time step function: Bayes and empirical Bayes
approaches. The Annals of Statistics, 12(4):1434–1447, 1984.