
1

Random Thinning of stochastic process

Strong Completeness and its Application to Random Thinning of Random Processes

Shailaja Deshmukh, University of Pune, Pune, India

Visiting professor, University of Michigan, Ann Arbor

Identifiable sampling schemes

Strong completeness of gamma family

Markov sampling of a continuous parameter stochastic process

Applications in spatial sampling

Outline

Negative binomial sampling of a discrete parameter stochastic process and its applications in risk analysis

2

Complete observation over a fixed time interval: Keiding (1974, 1975), Athreya (1975, 1978)

Observing a process until a fixed number of events occur (inverse sampling): Moran (1951), Keiding (1974, 1975)

Observing the process at specified deterministic epochs t1, t2, …, tn:

Prakasa Rao (1988), Su & Cambanis (1993)

{X(t), t Є T}, T: discrete or continuous

Various observational schemes

3

Kingman (1963, Ann. Math. Statist.): fixed-epoch sampling suffers from non-identifiability; the observed data may come from different processes

Kingman (1963) advocated selecting the epochs t1, t2, …, tn randomly

Criterion: the process derived randomly from the original process should determine the stochastic structure of the original process uniquely

{X(t), t Є T}: original process under study

Zn = X(Tn), Tn: random variables

{Zn, n ≥ 1}: derived process or randomly thinned process

The process used for sampling should be identifiable

4

{Z(1)(n)} =d {Z(2)(n)} implies {X(1)(t)} =d {X(2)(t)}

Derived process determines the original process uniquely

Identifiability is essential for justification of inference based on the randomly derived process: Basawa (1974), Baba (1982)

Identifiable sampling schemes

Identifiability of a sampling scheme

5

Continuous parameter:

Tn: n-th event in a Poisson process. Poisson sampling. Kingman (1963), Ann. Math. Statist.

Tn: n-th visit to state 1 in a two-state Markov process. Markov process sampling, via strong completeness of the gamma family. Deshmukh (2005), Stochastic Modelling & Applications. Extension of PASTA.

Discrete parameter:

Tn: n-th success in independent Bernoulli trials. Bernoulli sampling. Deshmukh (1991), Austral. J. Statist.

Tn: n-th visit to state 1 in a two-state Markov chain. Markov sampling. Deshmukh (2000), Aust. N.Z. J. Statist.

Tn: n-th epoch of the k-th success in Bernoulli trials. Strong completeness of the negative binomial family.

6

{X(t), t ≥ 0}: continuous parameter stochastic process

Markov sampling of a continuous parameter stochastic process

{Y(t), t ≥ 0}: Markov process with state space {0, 1} and Y(0) = 1; {Y(t)} is independent of {X(t)}

Observe {X(t)} at the epochs of visits to state 1 of {Y(t)}

Tk = S1 + … + Sk : epoch of k-th visit to state 1 of {Y(t)}

{X(t)} is observed at the epochs Tk, k ≥ 1; the process {Z(k) = X(Tk), k ≥ 1} is derived from the original process by Markov sampling
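The construction above can be sketched in a few lines of code. This is an illustrative simulation, not from the slides: the function names and rate values are assumptions, and the trajectory X is passed in as an ordinary function of t.

```python
import random

def markov_sampling_epochs(lam1, lam0, n, rng):
    """Epochs of the first n returns to state 1 of a two-state Markov process
    started in state 1; sojourns are Exp(lam1) in state 1 and Exp(lam0) in state 0."""
    epochs, t = [], 0.0
    for _ in range(n):
        t += rng.expovariate(lam1) + rng.expovariate(lam0)  # S_k = W1 + W0
        epochs.append(t)
    return epochs

def thin(X, epochs):
    """Thinned process Z(k) = X(T_k) for a trajectory X given as a function of t."""
    return [X(t) for t in epochs]
```

For example, `thin(lambda t: t**2, markov_sampling_epochs(1.0, 2.0, 5, random.Random(1)))` gives five randomly thinned observations of the path t².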

7

Aim: to determine whether {Z(k)} determines the stochastic structure of the original process uniquely

Waiting time Tk for the k-th visit to state 1 of the Markov process

State path: 1 (sojourn W1), then 0 (sojourn W0), then back to 1

Waiting time for the first visit to state 1: S1 = W0 + W1

W0 and W1 are independent random variables having exponential distributions with means 1/λ0 and 1/λ1 respectively.

Tk = S1 + … + Sk = V0 + V1, where Vi ~ G(λi, k), with λi the scale parameter and k the shape parameter, i = 0, 1.
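The decomposition Tk = V0 + V1 can be checked numerically: building Tk from k sojourn pairs and, independently, as a sum of two gamma variates gives matching sample means. A sketch under assumed rates, shape, and sample size (all arbitrary choices):

```python
import random

rng = random.Random(42)
lam0, lam1, k, N = 2.0, 1.0, 3, 20000

# T_k built from sojourn times: k cycles of Exp(lam1) + Exp(lam0)
t_samples = [sum(rng.expovariate(lam1) + rng.expovariate(lam0) for _ in range(k))
             for _ in range(N)]

# V0 + V1 with V_i ~ G(lam_i, k); the slides' G(rate, shape) corresponds to
# random.gammavariate(shape, scale) with scale = 1/rate
v_samples = [rng.gammavariate(k, 1 / lam0) + rng.gammavariate(k, 1 / lam1)
             for _ in range(N)]

mean_t = sum(t_samples) / N  # theoretical mean: k/lam0 + k/lam1 = 4.5
mean_v = sum(v_samples) / N
```

Both sample means should sit near k/λ0 + k/λ1 = 4.5 for these parameters.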

8

{Z(1)(k)} =d {Z(2)(k)} implies {X(1)(t)} =d {X(2)(t)}

Sampling scheme is identifiable

Markov sampling

Family of finite-dimensional distribution functions of {X(i)(t)}:

Fi ≡ Fi(t1, t2, …, tn) = Fi(t1, t2, …, tn; x1, …, xn) = P[X(i)(tj) ≤ xj, j = 1, …, n],

x1, x2, …, xn real numbers; t1, t2, …, tn positive real numbers, t1 < t2 < … < tn

Family of finite-dimensional distribution functions of {Z(i)(k)}:

Gi ≡ P[Z(i)(K(j)) ≤ xj, j = 1, 2, …, n], K(j) = k1 + k2 + … + kj

9

Let TK(j) = Uj and Lj = Uj − Uj−1, so that Uj = L1 + L2 + … + Lj

Timeline: the visit indices 0 < K(1) < K(2) < … < K(n) are separated by gaps k1, k2, …, kn, and the corresponding epochs are TK(1) = U1, TK(2) = U2, …, TK(n) = Un

Gi ≡ P[Z(i)(K(j)) ≤ xj, j = 1, …, n]
= P[X(i)(TK(j)) ≤ xj, j = 1, …, n]
= P[X(i)(Uj) ≤ xj, j = 1, …, n]
= E{ P[X(i)(Uj) ≤ xj, j = 1, …, n | U1, U2, …, Un] }
= E{Fi(L1, L1+L2, …, L1+L2+…+Ln)}

10

G1 = G2 implies E{F1(L1, L1+L2, …, L1+…+Ln)} = E{F2(L1, L1+L2, …, L1+…+Ln)}, i.e.
E{F1(L1, L1+L2, …, L1+…+Ln) − F2(L1, L1+L2, …, L1+…+Ln)} = 0

The expectation is with respect to the joint distribution of (L1, L2, …, Ln). L1, L2, …, Ln are independent and Lj is distributed as V0kj + V1kj, where V0kj ~ G(λ0, kj) and V1kj ~ G(λ1, kj); λ0 and λ1 are known, and kj, j = 1, …, n are the only unknown parameters. Equivalently, the expectation is with respect to the joint distribution of (V0kj, V1kj, j = 1, …, n).

If the joint distribution of (V0kj, V1kj, j = 1, …, n) is complete, then G1 = G2 implies F1 = F2. Strong completeness of the family of V0kj and of V1kj for any j implies completeness of the joint distributions.

11

X ~ G(α, k), α: scale parameter, k: shape parameter

f(x) = α^k e^(−αx) x^(k−1)/Γ(k), x > 0

α: known, k Є I+

Not a one-parameter exponential family; the parameter space is not an open set

Complete family

12

Ek(h(X)) = 0 for all k Є I+

⇔ ∫ h(x) α^k e^(−αx) x^(k−1)/Γ(k) dx = 0, for all k Є I+

⇔ g(k) = 0, for all k Є I+

⇒ Σ z^k g(k) = 0 for every 0 < z < 1

⇔ ∫ h(x) e^(−αx) (Σ (αzx)^(k−1)/(k−1)!) dx = 0

⇔ ∫ h(x) e^(−θx) dx = 0, θ = α(1 − z), 0 < θ < α

⇔ ∫ h(x) e^(−θx) dx = 0, for all θ > 0, by analytic continuation

⇔ h(x) = 0, a.s. Pk for all k Є I+

{G(α, k), k Є I+} is a complete family

Strongly complete

13

Definition: A family of distributions {Fθ, θ Є Θ} is called strongly complete if there exists a measure μ on (Θ, B), B a sigma-field on Θ, such that for every subset Θ* of Θ with μ(Θ − Θ*) = 0, ∫ h(x) Fθ(dx) = 0 for all θ Є Θ* implies h(x) = 0 a.s. Pθ for every θ Є Θ. (Zacks, 1971)

Strong completeness implies completeness, by taking Θ* = Θ

Suppose T1 and T2 are independent random variables. If {Fθ^T1, θ Є Θ} is complete and {Fθ'^T2, θ' Є Θ'} is strongly complete, then the family of joint distributions {Fθ,θ'^(T1,T2), θ Є Θ, θ' Є Θ'} is complete. (Zacks, 1971)

Gamma family is strongly complete:

Parameter space: I+; B: sigma-field of subsets of I+; μ: the measure induced by a geometric distribution

For A Є B, μ(A) = Σ δ(1 − δ)^(k−1), the sum being taken over k Є A

14

Suppose Θ* is a subset of I+ such that μ(I+ − Θ*) = 0

∫ h(x) α^k e^(−αx) x^(k−1)/Γ(k) dx = 0, for all k Є Θ*

⇔ g(k) = 0, for all k Є Θ*

μ({k}) = 0 for all k Є (I+ − Θ*), so g(k) μ({k}) = 0 for all k Є I+

Σ δ(1 − δ)^(k−1) ( ∫ h(x) α^k e^(−αx) x^(k−1)/Γ(k) dx ) = 0, the sum being taken over I+

Using Fubini's theorem, summation and integration can be interchanged, giving (up to a positive constant) ∫ h(x) e^(−θx) dx = 0 for θ = αδ, 0 < θ < α

∫ h(x) e^(−θx) dx = 0, for all θ > 0, by analytic continuation

h(x) = 0, a.s. Pk for all k Є I+

Gamma family is strongly complete
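The summation step works because a geometric(δ) mixture of G(α, k) densities collapses to an exponential density with rate θ = αδ. A numerical check of that identity (parameter values arbitrary; the sum is truncated at k = 150 to stay within floating-point range):

```python
import math

def gamma_pdf(x, alpha, k):
    """G(alpha, k) density with rate alpha and integer shape k (Erlang)."""
    return alpha**k * math.exp(-alpha * x) * x**(k - 1) / math.factorial(k - 1)

def mixture_pdf(x, alpha, delta, kmax=150):
    """Geometric(delta) mixture over the shape k of G(alpha, k) densities."""
    return sum(delta * (1 - delta)**(k - 1) * gamma_pdf(x, alpha, k)
               for k in range(1, kmax + 1))

# closed form: the mixture equals the Exp(alpha * delta) density
x, alpha, delta = 1.0, 2.0, 0.3
exact = alpha * delta * math.exp(-alpha * delta * x)
```

The truncation error is negligible here because the series terms decay factorially in k.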

15

Thus the joint distribution of (V0kj, V1kj, j = 1, …, n) is complete. Further, using continuity of F, we get that G1 = G2 implies F1 = F2

Markov sampling is an identifiable sampling scheme

{Z(k), k ≥ 1} is a Markov process iff {X(t)} is a Markov process. {Z(k), k ≥ 1} is a stationary process iff {X(t)} is a stationary process.

lim(t→∞) P[X(t) Є B] = lim(n→∞) P[Zn Є B]

The fraction of time the process {X(t)} spends in a set B (a measurable subset of the state space of {X(t)}) equals the fraction of sampled values in B when {X(t)} is observed at the epochs of visits to state 1 of {Y(t)}. This parallels the Poisson Arrivals See Time Averages (PASTA) property.
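This time-average property can be illustrated by simulation: for a two-state {X(t)} sampled at the visit epochs to state 1 of an independent two-state {Y(t)}, both the long-run fraction of time in state 1 and the fraction among sampled values approach the stationary probability. A rough sketch (all rates and the horizon are illustrative assumptions):

```python
import bisect
import random

def two_state_path(rate1, rate0, horizon, rng):
    """Jump times and states of a {0,1}-valued Markov process started in 1."""
    t, s = 0.0, 1
    times, states = [0.0], [1]
    while t < horizon:
        t += rng.expovariate(rate1 if s == 1 else rate0)
        s = 1 - s
        times.append(t)
        states.append(s)
    return times, states

def state_at(times, states, t):
    """State holding at time t (right-continuous path)."""
    return states[bisect.bisect_right(times, t) - 1]

rng = random.Random(0)
horizon = 20000.0
xt, xs = two_state_path(1.0, 2.0, horizon, rng)  # X: leaves 1 at rate 1, 0 at rate 2
yt, ys = two_state_path(1.5, 1.0, horizon, rng)  # Y: independent sampling process

# long-run fraction of time X spends in state 1 (stationary value: 2/3 here)
time_in_1 = sum(min(b, horizon) - a
                for a, b, s in zip(xt, xt[1:], xs) if s == 1)
frac_time = time_in_1 / horizon

# fraction of state-1 values of X seen at Y's visit epochs to state 1
epochs = [t for t, s in zip(yt[1:], ys[1:]) if s == 1 and t < horizon]
frac_sampled = sum(state_at(xt, xs, t) for t in epochs) / len(epochs)
```

Both estimates should be close to the stationary probability 2/3 for these rates, illustrating the PASTA-like equality of limits.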

16

Application: identifiable sampling designs in spatial processes to select the locations.

{Z(s), s Є D}: spatial process; s: locations; D: study region

Aim: to select locations at which the characteristic under study is to be measured (thickness or smoothness of a powder coating, nests of birds).

Most common scheme: regular sampling (Cressie, 1993)

Non-identifiability

17

Study region discrete: adopt Bernoulli sampling or Markov sampling. Deshmukh (2003), JISA (Adke special volume)

Study region continuous: aim is the selection of locations (s1, s2)

If both coordinates are selected by Poisson sampling, the design generates a CSR (complete spatial randomness) pattern. If both coordinates are selected by Markov process sampling, it generates an aggregated pattern. The spatial process observed at these locations determines the original process uniquely.

Prayag & Deshmukh (2000), Environmetrics: test for CSR against an aggregated pattern

18

Suppose X has a negative binomial distribution:

Pk[X = x] = C(x + k − 1, k − 1) p^k q^x, x = 0, 1, …, where q = 1 − p

p known, k Є I+

Not a one-parameter exponential family

Complete

Strongly complete
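A quick numerical sanity check of this pmf (parameter values chosen arbitrarily): it sums to one and has mean kq/p.

```python
from math import comb

def nb_pmf(x, k, p):
    """P_k[X = x] = C(x+k-1, k-1) p^k q^x: failures before the k-th success."""
    return comb(x + k - 1, k - 1) * p**k * (1 - p)**x

k, p = 3, 0.4
total = sum(nb_pmf(x, k, p) for x in range(400))       # should be ~1
mean = sum(x * nb_pmf(x, k, p) for x in range(400))    # theoretical mean: k(1-p)/p = 4.5
```

The truncation at x = 400 is harmless since the tail decays geometrically.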

19

Risk models in insurance

U (t) : reserve/ value of the fund/ insurer’s surplus at time t

U(t) = initial capital + input via premiums by time t − output due to claims by time t

S(t) = output due to claim payments by time t = ∫_0^t X(u) du, the random part

Probability of ruin = P [ U (t) < 0]

Observed data are the claim amounts in various time periods (weeks or months)

Uk = Σ Xi, i running from 1 to Nk, where Nk is the claim frequency in a fixed time period and Xi denotes the claim amount; Nk and Xi are random.

If Nk = 0, Uk = 0

Nk: Poisson or negative binomial

Distribution of {S(t)} or its discrete version: Sn = Σ Xi, i running from 1 to n
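The aggregate claim Uk = X1 + … + XNk can be simulated directly. A sketch with Nk Poisson and exponential claim amounts (the distributions and parameter values are illustrative assumptions), whose sample mean should match E[N]·E[X]:

```python
import random

def poisson_count(lam, rng):
    """Poisson(lam) variate: count unit-rate exponential interarrivals in [0, lam)."""
    n, t = 0, rng.expovariate(1.0)
    while t < lam:
        n += 1
        t += rng.expovariate(1.0)
    return n

def aggregate_claim(lam, claim_mean, rng):
    """One period's aggregate claim U = X_1 + ... + X_N with N ~ Poisson(lam)
    and X_i ~ Exp with mean claim_mean; U = 0 when N = 0."""
    n = poisson_count(lam, rng)
    return sum(rng.expovariate(1 / claim_mean) for _ in range(n))

rng = random.Random(7)
lam, claim_mean, N = 3.0, 2.0, 20000
m = sum(aggregate_claim(lam, claim_mean, rng) for _ in range(N)) / N
# theoretical mean: lam * claim_mean = 6.0
```

The empty sum convention automatically handles the Nk = 0 case noted above.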

20

{Tk, k ≥ 1}: the increments Tk − Tk−1 are distributed as Nk with support I+

Uk = S(Tk) – S(Tk-1)

Observed data are a realization of the process {S(Tn), n ≥ 1}, the process observed at random epochs. On the basis of these data we wish to study the process {Sn, n ≥ 1}.

Identifiability of the random sampling scheme.

If {Sn} is modelled as a renewal process, then identifiability of the random sampling scheme holds for any discrete distribution of Nn with support I+ (Teke & Deshmukh, 2008, SPL)

If {Sn} is a discrete parameter process, then identifiability of the random sampling scheme holds for the negative binomial distribution of Nn

Strong completeness of the family of negative binomial distributions helps to prove identifiability

21

{Sn, n ≥ 1}: renewal process, f(s): L.T.

{Tn, n ≥ 1}: renewal process, support N, P(s): p.g.f.

Zn = S(Tn): renewal process

Renewal process case: P(s) geometric or shifted geometric

Cox process case: P(s) geometric, or Poisson and negative binomial, both truncated at zero (Bernstein, Stieltjes)

g(s): L.T. of the derived process
g(s) = P(f(s)), f(s) = P^(−1)(g(s)): inversion formula; {Zn} determines {Sn}
gn(s): empirical L.T., fn(s) = P^(−1)(gn(s))

{Sn, n ≥ 1}: random walk; {S(t), t ≥ 0}: Lévy process
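The inversion formula can be checked numerically in the geometric case. Below, P is the p.g.f. of a geometric distribution on {1, 2, …}, f is taken (as an illustrative assumption) to be the Laplace transform of an Exp(μ) increment, and P^(−1) is obtained by solving u = ps/(1 − qs) for s:

```python
def P(s, p):
    """p.g.f. of a geometric distribution on {1, 2, ...}: P(s) = p*s / (1 - q*s)."""
    q = 1 - p
    return p * s / (1 - q * s)

def P_inv(u, p):
    """Inverse of P: solving u = p*s/(1 - q*s) gives s = u / (p + q*u)."""
    q = 1 - p
    return u / (p + q * u)

def f(s, mu=2.0):
    """Laplace transform of an Exp(mu) increment (illustrative choice for {S_n})."""
    return mu / (mu + s)

def g(s, p=0.3, mu=2.0):
    """L.T. of the thinned increments: g(s) = P(f(s))."""
    return P(f(s, mu), p)
```

Here f(s) = P_inv(g(s), p) recovers the original transform from the thinned one, which is exactly how {Zn} determines {Sn}.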

22

Work in progress

{X(t), t Є T}: original process under study; {Zn = X(Tn)}, {Tn, n ≥ 1}: renewal process

G1 = G2 implies E{F1(L1, L1+L2, …, L1+…+Ln)} = E{F2(L1, L1+L2, …, L1+…+Ln)}, i.e.
E{F1(L1, L1+L2, …, L1+…+Ln) − F2(L1, L1+L2, …, L1+…+Ln)} = 0

Lj: sum of kj iid random variables; if the joint distribution of (L1, L2, …, Ln) is complete, then G1 = G2 implies F1 = F2.

{f(x, k), k Є I+}: family of densities of the Lj

Σ δ(1 − δ)^(k−1) ( ∫ h(x) f(x, k) dx ) = 0, the sum being taken over I+

⇔ ∫ h(x) Σ δ(1 − δ)^(k−1) f(x, k) dx = 0
⇔ ∫ h(x) A(x, δ) dx = 0, where A(x, δ) = Σ δ(1 − δ)^(k−1) f(x, k)

Can we conclude that h(x) = 0 a.s.?

23

References

1. Baba, Y. (1982). Maximum likelihood estimation of parameters in birth and death process by Poisson sampling. J. Oper. Res. 15, 99-111.

2. Basawa, I.V. (1974). Maximum likelihood estimation of parameters in renewal and Markov renewal processes. Austral. J. Statist. 16, 33-43.

3. Cressie, N.A.C. (1993). Statistics for Spatial Data. Wiley, New York.

4. Deshmukh, S.R. (1991). Bernoulli sampling. Austral. J. Statist. 33, 167-176.

5. Deshmukh, S.R. (2000). Markov sampling. Aust. N.Z. J. Statist. 42(3), 337-345.

6. Deshmukh, S.R. (2003). Identifiable sampling design for spatial process. J. Ind. Statist. Assoc. 41(2), 261-274.

7. Deshmukh, S.R. (2005). Markov Arrivals See Time Averages. Stochastic Modelling and Applications 8(2), 1-20.

24

8. Kingman, J.F.C. (1963). Poisson counts for random sequences of events. Ann. Math. Statist. 34, 1217-1232.

9. Prakasa Rao, B.L.S. (1988). Statistical inference from sampled data for stochastic process. Contemp. Math. 80, 249-284.

10. Prayag, V.R. & Deshmukh, S.R. (2000). Testing randomness of spatial pattern using Eberhardt's index. Environmetrics 11, 571-582.

11. Su, Y. & Cambanis, S. (1993). Sampling designs for estimation of a random process. Stochastic Process. Appl. 46, 47-89.

12. Teke, S.P. & Deshmukh, S.R. (2008). Inverse thinning of Cox and renewal processes. Statistics and Probability Letters 78, 2705-2708.

25

Thank You