1
Random Thinning of stochastic process
Strong Completeness and its Application to Random Thinning of Random Processes
Shailaja Deshmukh, University of Pune, Pune, India
Visiting professor, University of Michigan, Ann Arbor
Outline
Identifiable sampling schemes
Strong completeness of the gamma family
Markov sampling of a continuous parameter stochastic process
Applications in spatial sampling
Negative binomial sampling of a discrete parameter stochastic process and its applications in risk analysis
2
Various observational schemes
{X(t), t Є T}, T: discrete or continuous
Complete observation over a fixed time interval: Keiding (1974, 1975), Athreya (1975, 1978)
Observing the process until a fixed number of events occur (inverse sampling): Moran (1951), Keiding (1974, 1975)
Observing the process at specified deterministic epochs t1, t2, …, tn: Prakasa Rao (1988), Su & Cambanis (1993)
3
The process used for sampling should be identifiable
Kingman (1963, Ann. Math. Statist.): fixed-epoch sampling suffers from non-identifiability; the observed data may come from different processes.
Kingman (1963) therefore advocated selecting the epochs t1, t2, …, tn randomly.
Criterion: a process derived randomly from the original process should determine the stochastic structure of the original process uniquely.
{X(t), t Є T}: original process under study
Zn = X(Tn), Tn: random variables
{Zn, n ≥ 1}: derived process, or randomly thinned process
4
Identifiability of a sampling scheme
A sampling scheme is identifiable if equality in distribution of the derived processes implies equality in distribution of the original processes:
{Z(1)(n)} =d {Z(2)(n)} implies {X(1)(t)} =d {X(2)(t)},
that is, the derived process determines the original process uniquely.
Identifiability is essential for the justification of inference based on a randomly derived process: Basawa (1974), Baba (1982).
5
Continuous parameter process:
- Tn: n-th event in a Poisson process. Poisson sampling. Kingman (1963), Ann. Math. Statist.
- Tn: n-th visit to state 1 in a two-state Markov process. Markov process sampling, via strong completeness of the gamma family; an extension of PASTA. Deshmukh (2005), Stochastic Modelling and Applications.

Discrete parameter process:
- Tn: n-th success in independent Bernoulli trials. Bernoulli sampling. Deshmukh (1991), Austral. J. Statist.
- Tn: n-th visit to state 1 in a two-state Markov chain. Markov sampling. Deshmukh (2000), Aust. N. Z. J. Statist.
- Tn: n-th epoch of the k-th success in Bernoulli trials, via strong completeness of the negative binomial family.
6
Markov sampling of a continuous parameter stochastic process
{X(t), t ≥ 0}: continuous parameter stochastic process
{Y(t), t ≥ 0}: Markov process with state space {0, 1} and Y(0) = 1; {Y(t)} is independent of {X(t)}
Observe {X(t)} at the epochs of visits to state 1 of {Y(t)}
Tk = S1 + … + Sk: epoch of the k-th visit to state 1 of {Y(t)}
{X(t)} is observed at Tk, k ≥ 1; {Z(k) = X(Tk), k ≥ 1} is the process derived from the original by Markov sampling (MS)
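As a concrete illustration, a minimal simulation sketch of this scheme (the helper names markov_sampling_epochs and sample_X_at and all rate values are hypothetical, and Brownian motion stands in for {X(t)} only because its increments can be drawn exactly at arbitrary epochs):

```python
import numpy as np

rng = np.random.default_rng(0)

def markov_sampling_epochs(lam0, lam1, k, rng):
    """Epochs T_1 < ... < T_k of successive visits to state 1 of the
    two-state Markov process {Y(t)} with Y(0) = 1: each cycle is an
    Exp(lam1) sojourn in state 1 followed by an Exp(lam0) sojourn in 0."""
    w1 = rng.exponential(1.0 / lam1, size=k)  # sojourns in state 1
    w0 = rng.exponential(1.0 / lam0, size=k)  # sojourns in state 0
    return np.cumsum(w1 + w0)                 # T_1, ..., T_k

def sample_X_at(epochs, rng):
    """Z(k) = X(T_k) for X a standard Brownian motion, whose independent
    Gaussian increments over (T_{k-1}, T_k] can be drawn exactly."""
    dt = np.diff(np.concatenate(([0.0], epochs)))
    return np.cumsum(rng.normal(0.0, np.sqrt(dt)))

T = markov_sampling_epochs(lam0=2.0, lam1=1.0, k=10, rng=rng)
Z = sample_X_at(T, rng)
print("T_k:", np.round(T, 3))
print("Z(k) = X(T_k):", np.round(Z, 3))
```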
7
Aim: to determine whether {Z(k)} determines the stochastic structure of the original process uniquely.
Waiting time Tk for the k-th visit to state 1 of the Markov process: starting in state 1, the process spends time W1 in state 1 and time W0 in state 0 before re-entering state 1.
Waiting time for the first visit to state 1: S1 = W0 + W1.
W0 and W1 are independent random variables having exponential distributions with means λ0^(-1) and λ1^(-1) respectively.
Tk = S1 + … + Sk = V0 + V1, where Vi ~ G(λi, k), with λi the scale parameter and k the shape parameter, i = 0, 1.
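Writing W0i and W1i for the state-0 and state-1 sojourn times in the i-th cycle, the gamma decomposition follows by regrouping the sojourns:

```latex
T_k = \sum_{i=1}^{k} S_i
    = \sum_{i=1}^{k} \left( W_{0i} + W_{1i} \right)
    = \underbrace{\sum_{i=1}^{k} W_{0i}}_{V_0 \sim G(\lambda_0,\, k)}
    + \underbrace{\sum_{i=1}^{k} W_{1i}}_{V_1 \sim G(\lambda_1,\, k)}
```

since a sum of k i.i.d. Exp(λi) sojourn times has the G(λi, k) distribution, i = 0, 1.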
8
Markov sampling is identifiable if
{Z(1)(k)} =d {Z(2)(k)} implies {X(1)(t)} =d {X(2)(t)}.
Family of finite dimensional distribution functions of {X(i)(t)}:
Fi ≡ Fi(t1, t2, …, tn) = Fi(t1, t2, …, tn; x1, …, xn) = P[Xi(tj) ≤ xj, j = 1, …, n],
where x1, x2, …, xn are real numbers and t1, t2, …, tn are positive real numbers with t1 < t2 < … < tn.
Family of finite dimensional distribution functions of {Z(i)(k)}:
Gi ≡ P[Zi(K(j)) ≤ xj, j = 1, 2, …, n], where K(j) = k1 + k2 + … + kj.
9
Let TK(j) = Uj and Lj = Uj - Uj-1, with U0 = 0, so that Uj = L1 + L2 + … + Lj.
[Timeline: the indices 0 < K(1) < K(2) < … < K(n) have increments k1, k2, …, kn; the corresponding sampling epochs TK(1), TK(2), …, TK(n) are relabelled U1, U2, …, Un, with increments L1, L2, …, Ln.]
Gi ≡ P[Zi(K(j)) ≤ xj, j = 1, …, n]
= P[Xi(TK(j)) ≤ xj, j = 1, …, n]
= P[Xi(Uj) ≤ xj, j = 1, …, n]
= E{ P[Xi(Uj) ≤ xj, j = 1, …, n | U1, U2, …, Un] }
= E{ Fi(L1, L1 + L2, …, L1 + L2 + … + Ln) }
10
G1 = G2 implies E{F1(L1, L1+L2, …, L1+…+Ln)} = E{F2(L1, L1+L2, …, L1+…+Ln)}, that is,
E{F1(L1, L1+L2, …, L1+…+Ln) - F2(L1, L1+L2, …, L1+…+Ln)} = 0.
The expectation is with respect to the joint distribution of (L1, L2, …, Ln). L1, L2, …, Ln are independent and Lj ~ V0kj + V1kj, where V0kj ~ G(λ0, kj) and V1kj ~ G(λ1, kj); λ0 and λ1 are known, and kj, j = 1, …, n, are the only unknown parameters. Equivalently, the expectation is with respect to the joint distribution of (V0kj, V1kj, j = 1, …, n).
If the joint distribution of (V0kj, V1kj, j = 1, …, n) is complete, then G1 = G2 implies F1 = F2. Strong completeness of the families of V0kj and V1kj for each j implies completeness of the joint distribution.
11
X ~ G(α, k), α: scale parameter, k: shape parameter
f(x) = α^k e^(-αx) x^(k-1)/Γ(k), x > 0
α known, k Є I+
This is not a one-parameter exponential family, since the parameter space is not an open set. Nevertheless, it is a complete family, as the following argument shows.
12
Ek(h(X)) = 0 for all k Є I+
⇔ ∫ h(x) α^k e^(-αx) x^(k-1)/Γ(k) dx = 0 for all k Є I+
⇔ g(k) = 0 for all k Є I+, writing g(k) for the integral above
⇔ Σ z^k g(k) = 0 for all 0 < z < 1
⇔ ∫ h(x) e^(-αx) ( Σ (αzx)^(k-1)/(k-1)! ) dx = 0, up to the nonzero factor αz
⇔ ∫ h(x) e^(-θx) dx = 0, θ = α(1 - z), 0 < θ < α
⇔ ∫ h(x) e^(-θx) dx = 0 for all θ > 0, by analytic continuation
⇔ h(x) = 0 a.s. Pk for all k Є I+
Hence {G(α, k), k Є I+} is a complete family.
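The step from the power series to the Laplace transform, written out in full (the interchange of sum and integral is justified for 0 < z < 1):

```latex
\sum_{k=1}^{\infty} z^{k} g(k)
  = \int_{0}^{\infty} h(x)\, e^{-\alpha x}\,
    \alpha z \sum_{k=1}^{\infty} \frac{(\alpha z x)^{k-1}}{(k-1)!}\, dx
  = \alpha z \int_{0}^{\infty} h(x)\, e^{-\alpha (1-z) x}\, dx
```

So g(k) = 0 for all k Є I+ forces the Laplace transform of h to vanish on (0, α), and then on all of (0, ∞) by analytic continuation.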
In fact, the family is strongly complete.
13
Definition: A family of distributions {Fθ, θ Є Θ} is called strongly complete if there exists a measure μ on (Θ, ℬ), where ℬ is a σ-field of subsets of Θ, such that for every subset Θ* of Θ for which μ(Θ - Θ*) = 0, ∫ h(x) Fθ(dx) = 0 for all θ Є Θ* implies that h(x) = 0 a.s. Pθ for every θ Є Θ. (Zacks, 1971)
Strong completeness implies completeness, by taking Θ* = Θ.
Theorem (Zacks, 1971): Suppose T1 and T2 are independent random variables. If {Fθ^T1, θ Є Θ} is complete and {Fθ'^T2, θ' Є Θ'} is strongly complete, then the family of joint distributions {Fθ,θ'^(T1,T2), θ Є Θ, θ' Є Θ'} is complete.
Gamma family is strongly complete
Here the parameter space is I+, ℬ is a σ-field of its subsets, and μ is the measure induced by a geometric distribution:
for A Є ℬ, μ(A) = Σ δ(1 - δ)^(k-1), the sum being taken over k Є A.
14
Suppose Θ* is a subset of I+ such that μ(I+ - Θ*) = 0. Then
∫ h(x) α^k e^(-αx) x^(k-1)/Γ(k) dx = 0 for all k Є Θ*
⇔ g(k) = 0 for all k Є Θ*.
Since μ({k}) = 0 for all k Є (I+ - Θ*), it follows that
g(k) μ({k}) = 0 for all k Є I+, and hence
Σ δ(1 - δ)^(k-1) ( ∫ h(x) α^k e^(-αx) x^(k-1)/Γ(k) dx ) = 0, the sum being taken over I+.
By Fubini's theorem the summation and integration can be interchanged, and, as on the previous slide,
∫ h(x) e^(-θx) dx = 0 for all θ > 0 by analytic continuation, so that
h(x) = 0 a.s. Pk for all k Є I+.
Thus the gamma family is strongly complete.
15
Thus the joint distribution of (V0kj, V1kj, j = 1, …, n) is complete. Further, using continuity of F, we get that G1 = G2 implies F1 = F2.
Markov sampling is an identifiable sampling scheme.
{Z(k), k ≥ 1} is a Markov process iff {X(t)} is a Markov process. {Z(k), k ≥ 1} is a stationary process iff {X(t)} is a stationary process.
lim (t→∞) P[X(t) Є B] = lim (n→∞) P[Zn Є B]
The fraction of time the process {X(t)} spends in a set B (a measurable subset of the state space of {X(t)}) is the same as the fraction of sampling epochs at which the process is in B, when {X(t)} is observed at the epochs of visits to state 1 of {Y(t)}. This is parallel to the Poisson Arrivals See Time Averages (PASTA) property.
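A quick Monte Carlo check of this limit statement (a sketch under arbitrary rate choices; {X(t)} is taken to be a two-state chain with 0 to 1 rate a and 1 to 0 rate b, so that both limits should equal a/(a+b)):

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = 1.0, 2.0        # X(t): 0 -> 1 at rate a, 1 -> 0 at rate b
lam0, lam1 = 3.0, 1.5  # Y(t): sojourn rates in states 0 and 1

# Simulate the two-state chain {X(t)} as jump times and states.
horizon, t, x = 5000.0, 0.0, 0
jump_times, states = [0.0], [0]
while t < horizon:
    t += rng.exponential(1.0 / (a if x == 0 else b))
    x = 1 - x
    jump_times.append(t)
    states.append(x)
jump_times, states = np.array(jump_times), np.array(states)

# Long-run fraction of time X(t) spends in state 1.
occupancy = np.diff(jump_times)[states[:-1] == 1].sum() / jump_times[-1]

# Independent Markov sampling epochs: cycles of Exp(lam1) + Exp(lam0).
n = 20000
Tk = np.cumsum(rng.exponential(1 / lam1, n) + rng.exponential(1 / lam0, n))
Tk = Tk[Tk < horizon]
Zk = states[np.searchsorted(jump_times, Tk, side="right") - 1]  # X(T_k)

print(f"time average of X(t) = 1:    {occupancy:.4f}")
print(f"sampled average of Zk = 1:   {Zk.mean():.4f}")
print(f"theoretical a/(a + b):       {a / (a + b):.4f}")
```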
16
Application: identifiable sampling designs in spatial processes, used to select the locations.
{Z(s), s Є D}: spatial process; s: locations; D: study region.
Aim: to select locations at which the characteristic under study is to be measured, e.g. thickness or smoothness of a powder coating, or nests of birds.
The most common scheme, regular sampling (Cressie, 1993), suffers from non-identifiability.
17
Study region discrete: adopt Bernoulli sampling or Markov sampling. Deshmukh (2003), JISA (Adke special volume).
Study region continuous: the aim is the selection of locations (s1, s2).
If both coordinates are selected by Poisson sampling, this generates a CSR (complete spatial randomness) pattern. If both coordinates are selected by Markov process sampling, it generates an aggregated pattern. The spatial process observed at these locations determines the original process uniquely.
Prayag & Deshmukh (2000), Environmetrics: a test for CSR against an aggregated pattern.
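To make the two coordinate-selection mechanisms concrete, here is a one-dimensional sketch (rates are arbitrary; how the two coordinate sequences are combined into plane locations, and the resulting CSR/aggregated classification, follow Prayag & Deshmukh (2000) and are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500

# Poisson sampling of a coordinate: epochs of a Poisson process,
# i.e. cumulative sums of i.i.d. exponential gaps.
poisson_coords = np.cumsum(rng.exponential(1.0, n))

# Markov process sampling of a coordinate: epochs of visits to state 1,
# so each gap is a sum Exp(lam1) + Exp(lam0) of the two sojourn times.
lam0, lam1 = 2.0, 0.5
markov_coords = np.cumsum(rng.exponential(1 / lam1, n) + rng.exponential(1 / lam0, n))

# Compare the gap structure of the two designs.
for name, c in [("Poisson", poisson_coords), ("Markov", markov_coords)]:
    gaps = np.diff(c)
    print(f"{name:8s} gaps: mean {gaps.mean():.3f}, cv {gaps.std() / gaps.mean():.3f}")
```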
18
Suppose X has a negative binomial distribution:
Pk[X = x] = C(x + k - 1, k - 1) p^k q^x, x = 0, 1, …, with q = 1 - p,
p known, k Є I+.
As with the gamma family, this is not a one-parameter exponential family (the parameter space is not an open set), yet the family is complete, and in fact strongly complete.
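A quick numerical check of this pmf against SciPy's parametrization (an illustration only; scipy.stats.nbinom also counts x = 0, 1, … failures before the k-th success, and the values of p and k below are arbitrary):

```python
from math import comb
from scipy.stats import nbinom

p, k = 0.3, 4
q = 1 - p
for x in range(6):
    manual = comb(x + k - 1, k - 1) * p**k * q**x
    print(x, round(manual, 6), round(nbinom.pmf(x, k, p), 6))
```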
19
Risk models in insurance
U(t): reserve / value of the fund / insurer's surplus at time t
U(t) = initial capital + input via premiums by time t - output due to claims by time t
S(t) = output due to claim payments by time t = ∫ (0 to t) X(u) du, the random part
Probability of ruin = P[U(t) < 0]
The observed data are the claim amounts in various time periods, such as weeks or months:
Uk = Σ Xi, the sum running from i = 1 to Nk, where Nk is the number of claims in a fixed time period and Xi denotes a claim amount; Nk and Xi are random. If Nk = 0, then Uk = 0.
Typical models for Nk: Poisson, negative binomial.
Of interest is the distribution of {S(t)}, or of its discrete version Sn = Σ Xi, the sum running from i = 1 to n.
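A minimal simulation sketch of the per-period claim totals Uk under the two frequency models (the exponential claim amounts and all parameter values are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

def period_totals(freq_sampler, n_periods, claim_mean, rng):
    """U_k = sum of N_k i.i.d. claim amounts; U_k = 0 when N_k = 0."""
    totals = np.empty(n_periods)
    for k in range(n_periods):
        n_claims = freq_sampler()
        totals[k] = rng.exponential(claim_mean, n_claims).sum() if n_claims else 0.0
    return totals

U_pois = period_totals(lambda: rng.poisson(2.0), 10_000, 1.5, rng)
U_nb = period_totals(lambda: rng.negative_binomial(4, 0.6), 10_000, 1.5, rng)
print("Poisson frequencies:  mean %.3f, var %.3f" % (U_pois.mean(), U_pois.var()))
print("Neg. binomial freqs:  mean %.3f, var %.3f" % (U_nb.mean(), U_nb.var()))
```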
20
{Tk, k ≥ 1}: the increments Tk - Tk-1 are distributed as Nk, with support I+.
Uk = S(Tk) - S(Tk-1)
The observed data are a realization of the process {S(Tn), n ≥ 1}, that is, of the process observed at random epochs. On the basis of these data we wish to study the process {Sn, n ≥ 1}, which raises the question of identifiability of the random sampling scheme.
If {Sn} is modelled as a renewal process, then identifiability of the random sampling scheme holds for any discrete distribution of Nn with support I+ (Teke & Deshmukh, 2008, SPL).
If {Sn} is a general discrete parameter process, then identifiability of the random sampling scheme holds for the negative binomial distribution of Nn.
Strong completeness of the family of negative binomial distributions is used to prove identifiability.
21
Renewal processes (summary of the thinning scheme):
{Sn, n ≥ 1}: renewal process, f(s): L.T.; the scheme also covers {Sn, n ≥ 1} a random walk and {S(t), t ≥ 0} a Lévy process.
{Tn, n ≥ 1}: renewal process with support N, P(s): p.g.f.
Zn = S(Tn): again a renewal process, with L.T. g(s) = P(f(s)).
Inversion formula f(s) = P^(-1)(g(s)): {Zn} determines {Sn}. From the empirical L.T. gn(s), take fn(s) = P^(-1)(gn(s)).
Cox processes: P(s) geometric or shifted geometric; also P(s) geometric, Poisson and negative binomial, both truncated at zero (Bernstein, Stieltjes).
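A worked instance of the inversion formula (an illustration, not taken from the slides): thin a renewal process with Exp(λ) increments, so f(s) = λ/(λ + s), by a shifted geometric count with p.g.f. P(z) = pz/(1 - qz), q = 1 - p. Then

```latex
g(s) = P\bigl(f(s)\bigr)
     = \frac{p\,\lambda/(\lambda+s)}{1 - q\,\lambda/(\lambda+s)}
     = \frac{p\lambda}{p\lambda + s},
\qquad
P^{-1}(y) = \frac{y}{p + q\,y}
```

The thinned increments are again exponential, with rate pλ, and f(s) = P^(-1)(g(s)) recovers λ/(λ + s) exactly, illustrating how {Zn} determines {Sn}.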
22
Work in progress
{X(t), t Є T}: original process under study; {Zn = X(Tn)}, with {Tn, n ≥ 1} a renewal process.
G1 = G2 implies E{F1(L1, L1+L2, …, L1+…+Ln)} = E{F2(L1, L1+L2, …, L1+…+Ln)}, that is, E{F1(L1, L1+L2, …, L1+…+Ln) - F2(L1, L1+L2, …, L1+…+Ln)} = 0.
Lj: a sum of kj i.i.d. random variables. If the joint distribution of (L1, L2, …, Ln) is complete, then G1 = G2 implies F1 = F2.
{f(x, k), k Є I+}: family of distributions of the Lj.
Σ δ(1 - δ)^(k-1) ( ∫ h(x) f(x, k) dx ) = 0, the sum being taken over I+
⇔ ∫ h(x) Σ (1 - δ)^k f(x, k) dx = 0, up to a nonzero constant factor
⇔ ∫ h(x) A(x, δ) dx = 0, where A(x, δ) = Σ (1 - δ)^k f(x, k).
Can we conclude that h(x) = 0 a.s.?
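For the gamma family considered earlier the answer is affirmative, since A(x, δ) then collapses to an exponential kernel; a worked check:

```latex
A(x,\delta) = \sum_{k=1}^{\infty} (1-\delta)^{k}\,
              \frac{\alpha^{k} e^{-\alpha x} x^{k-1}}{\Gamma(k)}
            = (1-\delta)\,\alpha\, e^{-\alpha x}\, e^{(1-\delta)\alpha x}
            = (1-\delta)\,\alpha\, e^{-\delta \alpha x}
```

So ∫ h(x) A(x, δ) dx = 0 for all 0 < δ < 1 makes the Laplace transform of h vanish on an interval, and h(x) = 0 a.s. follows as before; whether a comparable argument works for a general family {f(x, k)} is precisely the open question.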
23
References
1. Baba, Y. (1982). Maximum likelihood estimation of parameters in birth and death process by Poisson sampling. J. Oper. Res. 15, 99-111.
2. Basawa, I.V. (1974). Maximum likelihood estimation of parameters in renewal and Markov renewal processes. Austral. J. Statist. 16, 33-43.
3. Cressie, N. A. C. (1993). Statistics for Spatial Data. Wiley, New York.
4. Deshmukh, S.R. (1991). Bernoulli sampling. Austral. J. Statist. 33, 167-176.
5. Deshmukh, S.R. (2000). Markov sampling. Aust. N. Z. J. Statist. 42(3), 337-345.
6. Deshmukh, S.R. (2003). Identifiable sampling design for spatial process. J. Ind. Statist. Assoc. 41(2), 261-274.
7. Deshmukh, S.R. (2005). Markov Arrivals See Time Averages. Stochastic Modelling and Applications 8(2), 1-20.
24
8. Kingman, J.F.C. (1963). Poisson counts for random sequences of events. Ann. Math. Statist. 34, 1217-1232.
9. Prakasa Rao, B.L.S. (1988). Statistical inference from sampled data for stochastic process. Contemp. Math. 80, 249-284.
10. Prayag, V.R. & Deshmukh, S.R. (2000). Testing randomness of spatial pattern using Eberhardt's index. Environmetrics 11, 571-582.
11. Su, Y. & Cambanis, S. (1993). Sampling designs for estimation of a random process. Stochastic Process. Appl. 46, 47-89.
12. Teke, S.P. & Deshmukh, S.R. (2008). Inverse thinning of Cox and renewal processes. Statistics and Probability Letters 78, 2705-2708.