MAXIMUM LIKELIHOOD ESTIMATION
OF
MULTIVARIATE
POLYSERIAL AND POLYCHORIC CORRELATION COEFFICIENTS
by
WAI-YIN POON
A Thesis
submitted to
the Graduate School of
The Chinese University of Hong Kong
(Division of Statistics)
In Partial Fulfillment
of the Requirements for the Degree of
Master of Philosophy (M. Phil.)
Hong Kong
May, 1985
THE CHINESE UNIVERSITY OF HONG KONG
GRADUATE SCHOOL
The undersigned certify that we have read a thesis, entitled
"Maximum Likelihood Estimation of Multivariate Polyserial and Polychoric
Correlation Coefficients", submitted to the Graduate School by Wai-yin
Poon ( 潘 偉 賢 ) in partial fulfillment of the requirement for the degree
of Master of Philosophy in Statistics. We recommend that it be
accepted.
Dr. S.Y. Lee,
Supervisor
Professor H. Ton
Dr. ELK. Lam
Professor P.M. Bentler,
External Examiner
DECLARATION
No portion of the work referred to in this thesis has been
submitted in support of an application for another degree or
qualification of this or any other university or other institution
of learning.
Acknowledgments
I would like to express my sincere thanks to Dr. S.Y. Lee for his
encouragement and his supervision of my thesis. It is also a pleasure to
express my gratitude to my father, Mr. Y.D. Poon, for his encouragement.
Abstract
The method for finding the maximum likelihood estimates
of the parameters in a multivariate normal model with some of
the component variables observable only in polytomous form is
developed. The main trick used is a reparameterization which
converts the corresponding log-likelihood function to an easily
handled one. The maximum likelihood estimates are found by a
Fletcher-Powell algorithm, and their standard errors obtained
from the information matrix. Two special models, namely the
polyserial correlation model and the multivariate polychoric
correlation model, are studied. When the dimension of the
random vector observable only in polytomous form is large,
obtaining the maximum likelihood estimates is computationally
rather expensive. Therefore, a more efficient method,
the partition maximum likelihood method, is proposed. These
estimation methods are demonstrated by real and simulated
data, and compared by a simulation study.
Contents
Chapter 1 Introduction
Chapter 2 Maximum Likelihood Estimation of the General Model
2.1 Model
2.2 Maximum Likelihood Estimation
2.3 Optimization procedure
2.4 Example
Chapter 3 Special Models
3.1 Polyserial correlation model
3.2 Multivariate polychoric correlation model
Chapter 4 Partition Maximum Likelihood (PML) Estimation
4.1 The PML procedure
4.2 Example
4.3 Simulation
Chapter 5 Summary and Discussion
Tables
Reference
Chapter 1
Introduction
There are many examples in psychology (Lazarsfeld, 1959; Lord &
Novick, 1968), econometrics (Nerlove & Press, 1973; Schmidt &
Strauss, 1975) and biometrics (Ashford & Sowden, 1970; Finney, 1971) in
which a continuous variable underlies a dichotomous or polytomous
observed variable. Examples of such variables are attitude items,
rating scales, performance items and the like. A typical case is when a
subject is asked to answer a question on a scale like
don't know / disapprove strongly / disapprove / approve / approve strongly
When analyzing this kind of data, a common approach used by nonrigorous
statisticians is to assign integer values to each category and proceed
with the analysis as if the data had been measured on an interval scale
with the desired distributions. Although many statistical methods seem
to be fairly robust against this kind of deviation from the
distributional assumptions, there are many situations that may lead to
erroneous results. Olsson (1979) showed that, due to the biased
estimates of the correlation, the application of factor analysis to this
kind of discrete data may lead to erroneous conclusions. Thus, as
expected, the application of principal component analysis, multiple
correlation and canonical correlation analysis may lead to incorrect
results as well, because these statistical methods also depend heavily
on the estimation of the correlations. It is therefore important to derive
reliable correlation estimates from this kind of data.
The measure of bivariate normal correlation based on data from a 2
x 2 contingency table was suggested by Pearson (1901). He called it the
tetrachoric correlation. This correlation has been extended to data
from an r x s contingency table in which the observed variables
have r and s ordinal categories respectively. Tallis (1962) studied the
problem of maximum likelihood estimation with r = s = 3. Martinson &
Hamdan (1971) developed a two-step maximum likelihood method that gave
the estimates for an r x s table. In this method, the thresholds are
first estimated by the cumulative marginal proportions, and then the
polychoric correlation is estimated with the thresholds fixed at their
estimates. Olsson (1979) developed a procedure that gave the maximum
likelihood estimates of the correlation and thresholds. He also
compared the full maximum likelihood approach with the two-step
approach. Lee (1984) extended Olsson's method to three-way r x s x t
contingency tables. In that case, three observed polytomous ordinal
variables were considered. The thresholds and correlations were
estimated using the maximum likelihood approach with the assumption that
the associated underlying latent variables have a standardized
trivariate normal distribution.
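The first stage of the two-step method described above (thresholds taken from the cumulative marginal proportions) can be sketched as follows. This is a minimal illustration, not the cited authors' program: the function name `thresholds_from_counts` and the example counts are hypothetical, and the sketch assumes every cumulative proportion lies strictly between 0 and 1.

```python
from statistics import NormalDist

def thresholds_from_counts(counts):
    """Estimate the finite thresholds of one ordinal variable.

    counts[k] is the observed frequency of category k + 1. The threshold
    separating category k + 1 from category k + 2 is estimated by the
    standard normal quantile of the cumulative proportion up to k + 1.
    Assumption: no cumulative proportion equals 0 or 1 (inv_cdf would fail).
    """
    total = sum(counts)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    thresholds = []
    cumulative = 0
    for count in counts[:-1]:  # the last cumulative proportion is 1 (+infinity)
        cumulative += count
        thresholds.append(std_normal.inv_cdf(cumulative / total))
    return thresholds

# Hypothetical marginal counts for a three-category item.
example = thresholds_from_counts([20, 50, 30])
```

In the second stage, the correlation would be estimated with these threshold values held fixed; that step needs a bivariate normal distribution function and is not shown here.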
Let Z be an observed discrete variable which depends on an
underlying latent continuous random variable Y, and let X represent another
observed continuous variable. The correlation between X and Y obtained
from observed X and Z is called the polyserial correlation. Under the
normality assumption and when Z is dichotomous, the maximum likelihood
estimation of the correlation between X and Y has been studied by
Tate (1955a,b). Tate's work has been generalized in two directions.
Hannan & Tate (1965) considered the situation with X being a random
vector. They derived the maximum likelihood estimates for the
correlations and the point of dichotomy (threshold), and the standard
error estimates as well. However, because of the complicated integrals
involved, the expressions for the standard errors are very difficult to use.
Tables that help the computation of standard errors were published
later by Prince & Tate (1966). Cox (1974) generalized Tate's work by
treating Z as a polytomous observed variable. Maximum likelihood
estimates of the parameters in the model were obtained via the scoring
algorithm. Olsson, Drasgow & Dorans (1982) compared the maximum
likelihood estimator with a two-step estimator and with a simple ad
hoc estimator. Lee & Poon (1985) developed a method for estimating the
parameters of a model in which X is an observable random vector and Z is
an observable ordinal polytomous variable. The set of parameters in
their model contains the mean vector and covariance matrix of X, the
thresholds, and the polyserial correlations between X and Y. Under the
normality assumption, a method based on the Newton-Raphson
algorithm was developed to produce the maximum likelihood estimates and
the standard error estimates. Clearly, the model of Lee & Poon (1985)
reduces to the previously studied situations in special cases.
In this thesis, we will consider a model which involves an
observable continuous random vector X and a latent continuous random
vector Y. An observable random vector Z is similarly defined based on
the values of Y and the thresholds. The main purpose is to develop a
maximum likelihood approach for estimating the parameters in the model,
which contains the mean vector and the covariance matrix of X, the
polyserial correlations between X and Y, the polychoric correlations
among the variables in Y, and the thresholds. In Chapter 2, the general
model is described and the maximum likelihood estimation is studied.
The estimates are obtained via the Fletcher-Powell algorithm, and
standard error estimates are obtained from the inverse of the
information matrix. Two special cases of the general model that have
wide practical applicability are described in Chapter 3. The first
one is the model studied by Lee & Poon (1985), while the second one is
the model in which only Z is involved. A procedure, which we call the
partition maximum likelihood (PML) procedure, for analyzing the
general model by means of the special models in Chapter 3 is studied in
Chapter 4. As we will see, the PML procedure requires much less
computer time than the original maximum likelihood procedure. Chapter 4
also reports results from a simulation study which was carried out to
compare the accuracy of the maximum likelihood and the partition
maximum likelihood methods. The indication is that the estimates
produced by these methods are very close to each other. The discussion
of the results and the conclusion are given in Chapter 5.
Chapter 2
Maximum Likelihood Estimation of the General Model
1. Model
Let X and Y be continuous random vectors of dimension r and n
respectively. It is assumed that $(X', Y')'$ is distributed
according to a multivariate normal distribution with mean vector
$(e', 0')'$ and variance covariance matrix
$$\begin{pmatrix} C_{xx} & C_{xy} \\ C_{yx} & R_{yy} \end{pmatrix},$$
where
$e$ is the r x 1 dimensional mean vector of X,
$0$ is the n x 1 dimensional zero mean vector of Y,
$C_{xx}$ is the r x r covariance matrix of X,
$C_{yx} = C_{xy}'$ is the n x r covariance matrix of Y and X,
$R_{yy}$ is the n x n correlation matrix of Y.
C.leariv. i is tne correlaion matrix o will store
the correlation matrix oJ
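Let Z be an n x 1 polytomous vector defined
as
$$Z_i = k(i) \quad \text{if} \quad a_{i,k(i)} \le Y_i < a_{i,k(i)+1}, \qquad k(i) = 1, 2, \ldots, n(i),$$
where $Y_i$ is the ith component of Y and
$a_{i,1} = -\infty$, $a_{i,n(i)+1} = \infty$ for all i.
(2.1)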
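The thresholding in (2.1) can be illustrated with a short sketch. It assumes that each latent component falls in category k when it lies in the half-open interval between the kth and (k+1)th thresholds, with the first and last thresholds at minus and plus infinity; the function names are hypothetical, and the example thresholds are the ones quoted for the simulated example of Chapter 3.

```python
from bisect import bisect_right

def discretize(y, finite_thresholds):
    """Map one latent value y to its category 1, 2, ..., n(i).

    finite_thresholds holds the increasing finite thresholds a_2 < ... < a_n(i);
    a_1 = -infinity and a_{n(i)+1} = +infinity are implicit.
    Assumption: category k corresponds to a_k <= y < a_{k+1}.
    """
    return bisect_right(finite_thresholds, y) + 1

def discretize_vector(y_vector, thresholds_per_component):
    """Apply the thresholding to every component of a latent vector Y."""
    return [discretize(y, a) for y, a in zip(y_vector, thresholds_per_component)]

# Finite thresholds of a six-category variable.
a = [-1.0, 0.0, 0.6, 1.0, 1.3]
z = discretize_vector([-2.0, 0.3], [a, a])
```

Using `bisect_right` keeps the lookup logarithmic in the number of categories and makes the boundary convention explicit: a value equal to a threshold goes to the upper category.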
Let
where Vec(C) is a vector taking the elements in C row by row and
$\rho_{ij} = \rho_{ji}$, i, j = 1, 2, ..., n, i ≠ j, are off diagonal elements of $R_{yy}$. Suppose
that we have a random sample from (X', Z') with sample size N. Then the
information available consists of N (r+n) x 1 observed vectors of the
form (x'z'_(i(z)), z'), that is
The ith component of z takes a value from 1, 2, 3, ..., n(i) and
i(z) takes a value from the sequence 1, 2, ..., f(z), where
f(z) is the total number of observations with Z = z. Clearly
Based on this data set, a method for finding the maximum likelihood
estimates is developed. The estimates of elements in based
on random observations of X and Z are called the polyserial
correlations, and the estimates of are called the polychoric
correlations.
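2. Maximum Likelihood Estimation
Let $p(x)$ and $p(x, z)$ be the probability densities of X and (X', Z')'
respectively. Then $p(x, z) = p(x)\,\Pr(z \mid x)$. Moreover,
the conditional distribution of Y given X = x is multivariate
normal with mean vector $C_{yx} C_{xx}^{-1}(x - e)$
and variance covariance matrix $R_{yy} - C_{yx} C_{xx}^{-1} C_{xy}$,
that is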
Let be the ith row of and let denote
the diagonal matrix with its (i,i)th entry equal to
Then we have
where
(2.2)
and its (i,j) th entry takes the value
It is clear from (2.2) that
has a multivariate normal distribution with correlation matrix R.
The probability density function is given by
Therefore,
where
As a consequence, we can represent where
is the dimensional multivariate normal density function
and is given by (2.3). In addition, the likelihood
function of for this random sample is given by:
(2.4)
The maximum likelihood estimate is defined as the
vector that maximizes the likelihood function (2.4). However, it should be pointed out
that maximizing it directly is extremely complicated because the unknown
parameters in $e$ and $C_{xx}$ involved in $p(x)$ are also involved in $\Pr(z \mid x)$.
To overcome this difficulty, we use a transformation. Let
and define
(2.5)
Thus, the parameter vector is transformed by (2.5) to a new
parameter vector
This transformation is one to one with its inverse given by
(2.6)
Using this transformation, $\Pr(z \mid x)$ can be expressed in terms of
the elements in the new parameter vector; that is
(2.7)
Let
(2.8)
be the likelihood function in terms of the new parameters. As a
result, finding the maximum likelihood estimates of the new parameters is easier
than finding those of the original parameters, since (2.8) is easier to handle. The
reason is that the parameters which are involved in $p(x)$ are now
no longer involved in $\Pr(z \mid x)$.
The maximum likelihood estimate is the vector that
maximizes (2.8) or, equivalently, maximizes the function
where
(2.9)
Under appropriate mild regularity conditions, the maximum likelihood estimate always exists.
Therefore, to obtain the maximum likelihood estimate we are required
to solve the following system of equations:
(2.10)
(2.11)
The solution of in (2.10) gives the traditional maximum
likelihood estimates of e_ and
(2.12)
However, some iterative procedure is required for getting the maximum
likelihood estimates of since (2.11) cannot be solved
algebraically in closed form. In addition, the maximum likelihood
estimate of can be obtained by (2.6) in terms of since the
transformation defined in (2.5) is one to one (see, e.g., Anderson, 1958).
It is well known that under mild regularity conditions, the maximum
likelihood estimates
and possess the following statistical properties
(see, e.g., Rao, 1973; Muirhead, 1982; Kendall & Stuart, 1978):
(a) They are consistent.
(b) The distribution of is multivariate normal with mean e and
variance covariance matrix
(c) The distribution of is Wishart with N - 1 degrees of
freedom and covariance matrix
(d) The asymptotic distribution of is normal with mean vector , and
covariance matrix
(2.13)
(e) Let be the parameter vector containing the parameters defined
in (2.6) and be its maximum likelihood estimate. The
asymptotic distribution of is multivariate normal with mean
vector and covariance matrix $K' I^{-1} K$ (see, e.g., Rao, 1973),
where K The explicit expressions for the
entries in K are given by
(i)
i- 1,2...n
(ii) if i ≠ j
(iii)
1j 2•• n
(iv)
i, j- 1,2... n
k= 2,3... n( i)
i, j, k= i, 2... n
j k
(v)
i- 1,2... n
k= 2,3... n (i)
(vi) if i ≠ j
i? j 1;2,— n
k(i)= 2,3,...n(i)
(vii)
$\delta_{kl}$ = Kronecker delta
i- 1,2... n
k, 1= 2,3,... n (i)
(viii) if i ≠ j
j j±, 2,... n
k= 2,3,...n(i)
i= 2,5,...n(j)
(ix)
1} l?C j 1_ 1,2,...n
k 1
j= 2,3,...n(i)
(x)
i,j 1,2,...n
i j
(xi)
i,j- 1,2,...n
i j
(xii)
ij k 1,2, ...n
1- 2,3,...n(k)
(xiii)
i, j= 1,2,... ii
i j
(xiv) if (i,j) 1 (k,l) f (l,k)
i j 1,2,.. .n
i j, k 1
(2.14)
Let k denote the row vector (k(1), k(2), ..., k(n)); then the (c,d)th
element of I in (2.13) is given by
(2.15)
where $\theta_c$ and $\theta_d$ are the cth and dth elements of the parameter vector. As it stands, this
matrix is extremely difficult to compute. However, by the following
lemma, we can use a matrix IA to approximate I with the (c,d)th
element of IA given by
n(l) n(n) f(k)
(2.16)
Lemma
Let L be the log-likelihood function of a random sample of size n
from a distribution with probability density function $f(\theta)$,
where $\theta$ is the vector of unknown parameters. Then the matrix
converges in probability to the information matrix
Proof:
The (c,d) th element of the information matrix I, by definition,
is given by
since independent
since E
= n E
since identically distributed.
Therefore, by the law of large numbers (see, e.g., Rao, 1973, p.112), the
matrix IA with its (c,d)th element equal to
converges in probability to I.
Therefore, the asymptotic covariance matrix of can be estimated
by $(IA)^{-1}$, while the asymptotic covariance matrix of can be estimated
by $K'(IA)^{-1}K$.
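The idea behind the lemma, replacing an expectation of products of first derivatives by the corresponding sum over the sample, can be sketched as follows. The exact form of (2.16) is not legible in this transcript, so the sketch assumes the usual sum of outer products of per-observation gradient (score) vectors; the function name and the toy data are hypothetical.

```python
def empirical_information(scores):
    """Sum of outer products of per-observation score vectors.

    scores[t][c] is the derivative of the log-density of observation t
    with respect to parameter c, evaluated at the current estimate.
    Assumption: this sum plays the role of the matrix IA in the text.
    """
    dim = len(scores[0])
    info = [[0.0] * dim for _ in range(dim)]
    for s in scores:
        for c in range(dim):
            for d in range(dim):
                info[c][d] += s[c] * s[d]
    return info

# Toy case: N(mu, 1) with unknown mean, score of one observation is x - mu.
data = [-1.0, 0.0, 1.0, 2.0]
mu = 0.5
ia = empirical_information([[x - mu] for x in data])
```

In this toy case the exact information is 4 (the sample size), while the outer-product value is 5.0; the two agree only in the large-sample sense stated by the lemma. Only first derivatives are needed, which is why the approximation fits a gradient-only optimizer.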
In deriving the derivatives involved in (2.16), the following
theorem, which is a generalization of Lee's (1984) result, is needed.
Theorem
The multivariate normal distribution function with correlation matrix R
can be written as
where i ≠ j, $\rho_{ij}$ is the (i,j)th entry of R, $R_i$ is the partial
correlation matrix given and $\phi(\cdot)$ is the standard normal density
function.
Proof: see Lee (1984).
Following the notation and using the result of the theorem, we
have
On the other hand, by Johnson & Kotz (1972), we have (using the
notation in the theorem)
where $\phi_2$ is the bivariate normal density function with correlation $\rho_{hk}$,
k 1 i? j
and $R_{ij}$ is the partial correlation matrix given $X_i$ and $X_j$.
Using (2.17), (2.18) and the method of matrix calculus (McDonald &
Swaminathan, 1973), the following expressions for the derivatives of log Pr
in (2.16) are obtained.
(i) 3log Pr(kxf) 3b.
where R. is the partial correlation matrix given Y.x• 1 1
i J
i- 1,2,... n
(2.19)
(ii) 9log Pr(kXk(f)) 3 ai,m(i)
with i(i)= 0 in numerator if m(i)= k(i);
and i(i)= 1 in numerator if rii(i)- k(i)+ 1
j i
(2.20)
otherwise= 0
i= 1,2,...a
m(i)= 2,3,...n(i)
(iii) 8 log Pr(kxk(t-)) 3 1%,
where
ni
R.. is the partial correlation matrix given Y.x and Y.x..ij i— j—
k i, k j
ijj 1,2,.. .n
i 1
(2.21)
Finally, the polyserial correlations can be estimated by the
components of the estimate of $C_{yx}$ divided by the square root of the appropriate element
in the diagonal of the estimate of $C_{xx}$.
3. Optimization Procedure
Since (2.11) cannot be solved algebraically in closed
form, some nonlinear optimization procedure is required. The
Newton-Raphson method (see, e.g., Luenberger, 1973) is conceptually the
simplest numerical procedure for minimizing a function. However, the
use of the Newton-Raphson algorithm requires the analytic form of the
second derivatives of the function which, unfortunately, are extremely
difficult to find for our function F. Therefore, a modified Fletcher-
Powell procedure (Luenberger, 1973), which requires only the first
derivatives, is used. The basic steps of its kth iteration in
minimizing a general function f with respect to $\theta$ are given as follows:
(i) set $s^{(k)} = -H^{(k)} g^{(k)}$;
(ii) find $\alpha^{(k)}$ to minimize $f(\theta^{(k)} + \alpha s^{(k)})$;
(iii) set $\theta^{(k+1)} = \theta^{(k)} + \alpha^{(k)} s^{(k)}$;
(iv) update $H^{(k)}$, giving $H^{(k+1)}$,
(2.22)
where $g^{(k)}$ is the gradient of f evaluated at $\theta^{(k)}$ and $H^{(k)}$ is a symmetric
positive definite matrix which is updated by the so-called BFGS
formula
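$$H^{(k+1)} = H^{(k)} + \left(1 + \frac{q' H^{(k)} q}{p' q}\right)\frac{p p'}{p' q} - \frac{p q' H^{(k)} + H^{(k)} q p'}{p' q},$$
where $p = \theta^{(k+1)} - \theta^{(k)}$ and $q = g^{(k+1)} - g^{(k)}$.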
It has also been shown that the positive definiteness of H is preserved and
hence the function value decreases in each iteration (Luenberger, 1973).
We use $\theta$ as our final estimate if the root mean square of the gradient is
less than a pre-assigned small number, say $\epsilon$.
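The iteration described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis program (which was written in FORTRAN IV): the update is the standard textbook BFGS formula for the inverse-Hessian approximation, the exact line search of step (ii) is replaced by simple step halving, and the function `bfgs_minimize` and the quadratic test function are hypothetical.

```python
def bfgs_minimize(f, grad, theta, tol=1e-6, max_iter=200):
    """Quasi-Newton minimization using first derivatives only."""
    n = len(theta)
    # Start from the identity matrix when no better H is available.
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = grad(theta)
    for _ in range(max_iter):
        # Stop when the root mean square of the gradient is small.
        if (sum(gi * gi for gi in g) / n) ** 0.5 < tol:
            break
        # (i) search direction s = -H g
        s = [-sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]
        # (ii) crude line search (assumption): halve alpha until f decreases
        alpha, f0 = 1.0, f(theta)
        while f([t + alpha * si for t, si in zip(theta, s)]) >= f0 and alpha > 1e-12:
            alpha *= 0.5
        # (iii) move to the new point
        new_theta = [t + alpha * si for t, si in zip(theta, s)]
        new_g = grad(new_theta)
        p = [a - b for a, b in zip(new_theta, theta)]
        q = [a - b for a, b in zip(new_g, g)]
        pq = sum(pi * qi for pi, qi in zip(p, q))
        # (iv) BFGS update; skipped unless p'q > 0 so H stays positive definite
        if pq > 1e-12:
            Hq = [sum(H[i][j] * q[j] for j in range(n)) for i in range(n)]
            qHq = sum(qi * hi for qi, hi in zip(q, Hq))
            for i in range(n):
                for j in range(n):
                    H[i][j] += ((1.0 + qHq / pq) * p[i] * p[j] / pq
                                - (p[i] * Hq[j] + Hq[i] * p[j]) / pq)
        theta, g = new_theta, new_g
    return theta

# Hypothetical test function with minimum at (1, -2).
f = lambda t: (t[0] - 1.0) ** 2 + 10.0 * (t[1] + 2.0) ** 2
grad = lambda t: [2.0 * (t[0] - 1.0), 20.0 * (t[1] + 2.0)]
solution = bfgs_minimize(f, grad, [0.0, 0.0])
```

The guard on p'q is one common way to keep H positive definite when the line search is inexact; with an exact line search the condition holds automatically for a decreasing step.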
To apply the above Fletcher-Powell method in minimizing our
function, what we require are the gradient vector of F, an
initial estimate of the parameters and a starting positive definite
matrix H. The identity matrix is used for H if no better initial
matrix is available. The expressions for the gradient of F are easily
obtained using (2.19), (2.20) and (2.21) and are given as follows:
(i)
where (·) is the right hand side of (2.19),
i = 1, 2, ..., n
(2.23)
(ii)
where
f(k, m(i)) = f(k(1), ..., k(i-1), m(i), k(i+1), ..., k(n));
f(k, m(i)-1) = f(k(1), ..., k(i-1), m(i)-1, k(i+1), ..., k(n));
(·) is the right hand side of (2.20) with i(i) = 0 in the numerator
and k(i) = m(i),
(··) is the right hand side of (2.20) with i(i) = 1 in the
numerator and k(i) = m(i) - 1,
i = 1, 2, ..., n
m(i) = 2, 3, ..., n(i)
(2.24)
(iii)
where (·) is the right hand side of (2.21),
i, j = 1, 2, ..., n
i ≠ j
(2.25)
In general, the algorithm defined by (2.22) is quite robust to the
starting values of the parameters. However, experience indicates that
a good starting value reduces the time to convergence. Therefore,
a sample estimate based on $(x_i', z_i')$, i = 1, 2, ..., N, is used as
the initial value of the parameter vector. Let
be the sample standard deviation of $Z_i$;
be the sample covariance of X and $Z_i$ divided by
be the sample correlation of $Z_i$ and $Z_j$; and
i = 1, 2, ..., n
h = 2, 3, ..., n(i)
where
is the inverse of the standard univariate normal distribution
function;
f is the observed proportion in the (k(1), k(2), ..., k(n))th cell.
Then the starting values of $b_i$, $a_{i,h}$ and $r_{ij}$ can be obtained by
(2.26)
(2.27)
(2.28)
4. Example
Computer programs written in FORTRAN IV with double precision have
been implemented to obtain the maximum likelihood estimate of with
the dimension of Y equal to 1, 2 and 3 respectively. The algorithm is
constructed based on the modified Fletcher-Powell procedure discussed
in Section 3. In the gradient expressions (2.23), (2.24) and (2.25),
distribution functions of normal variates, $\Phi_1$, $\Phi_2$ and $\Phi_3$, are involved.
To evaluate these functions, numerical integrations are required. The
programs use the subroutine DCADRE from the IMSL (1975) library in getting
$\Phi_1$ and use the subroutine BINORM developed by Divgi (1979) in getting $\Phi_2$.
In evaluating $\Phi_3$, the result of the theorem in Section
2.2 is used. Firstly, a single-argument function
is implemented with BINORM, where $\rho_{ij}$ is the (i,j)th entry in R and
the correlation used in BINORM is the partial correlation. Then the numerical integration of
this function over the range $(-\infty, a)$ is computed by DCADRE. However,
it must be pointed out that although $\Phi_1$ and $\Phi_2$ can be obtained
efficiently, evaluating $\Phi_3$ is rather computationally expensive since
it involves both the subroutines DCADRE and BINORM.
The following example is based on N = 100 simulated observations from a
standard multivariate normal distribution with the dimensions of X and Y
equal to 2 and 3 respectively. Each $(x_i', y_i')$ was transformed to
$(x_i', z_i')$ by (2.1) with pre-assigned thresholds
The parameters in were estimated using the random sample {$(x_i', z_i')$,
i = 1, 2, ..., N}. Letting the convergence criterion $\epsilon$ = 0.001, the algorithm
converged quickly in 3 iterations. To give some idea about the
behaviour of the algorithm, the convergence summary is presented in
Table 2.1. The maximum likelihood estimates and their standard error
estimates are reported in Table 2.2. For information, the Pearson's
product sample covariances (i.e. the maximum likelihood estimates based
on the continuous data) between X and Y and the Pearson's product
correlations of variables within Y are reported. We see that the
estimates of polyserial covariances and polychoric correlations obtained
from our maximum likelihood method are very close to the maximum
likelihood estimates from continuous observations of X and Y. We also
note that the maximum likelihood estimates of the thresholds are
close to the pre-assigned thresholds.
Chapter 3
Special Models
In this chapter, we will discuss two special cases of the general
model described in Chapter 2. We call them the polyserial correlation
model and the multivariate polychoric correlation model. These models
are particularly interesting not only because they have wide practical
applicability but also because they subsume many studies in the literature
as their special cases.
1. Polyserial Correlation Model
In this model we suppose the dimension of Y in our general model
is equal to 1. Thus we have a continuous random vector X and a latent
variable Y. The observed discrete variable Z is related to Y by
if
if
(3.1)
The correlations c between X and Y are the polyserial
correlations. Certain special cases of this model have received quite
a lot of attention in the literature (Tate, 1955a,b; Hannan & Tate,
1965; Prince & Tate, 1966; Cox, 1974; Olsson et al., 1982). The maximum
likelihood estimation of the complete model has been provided by Lee &
Poon (1985). Their treatment is briefly described in the remainder of
this section.
The parameter vector in this model is
Using a maximum likelihood approach similar to that described in Chapter 2,
the parameters are estimated based on a random sample $(x_{kl}, z_{kl})$ for
k = 1, 2, ..., t, l = 1, 2, ..., n(k), with $z_{kl}$ = k and n(k) being the
corresponding total number of observations. More specifically, the
maximum likelihood estimates of $e$ and $C_{xx}$ are equal to the sample mean
and the sample covariance matrix respectively, while the maximum
likelihood estimates of a and c are obtained from the maximum
likelihood estimate of $\theta$ = (a, b), where
au
a
-- 7 t-IX.
b
(3.2)
By similar reasoning as in Chapter 2, we see that the maximum
likelihood estimates of b and a can be obtained by minimizing the
function
F(b, a)
(3.3)
Again, the estimates of b and a cannot be obtained algebraically in
closed form, so that some nonlinear optimization procedure is also
needed. Since the expression in (3.3) is much simpler than
that in (2.9), instead of using the Fletcher-Powell algorithm given
in (2.22), we can use the following more efficient Newton-Raphson
algorithm:
$$\theta^{(k+1)} = \theta^{(k)} - \gamma\, H(\theta^{(k)})^{-1} g(\theta^{(k)}),$$
(3.4)
where $g(\theta) = \partial F / \partial \theta$ is the gradient vector, $H(\theta) = \partial^2 F / \partial \theta\, \partial \theta'$ is
the Hessian matrix and $\gamma$ is a step-size parameter which takes the
first value in the sequence 1, 1/2, 1/4, ... that reduces F. It is well
known in mathematical programming (Bard, 1975) that if H is positive
definite, the algorithm is very efficient. With the starting values
obtained via equations similar to (2.26) and (2.27), experience
indicates that the Hessian matrix is always positive definite.
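The Newton-Raphson iteration with step halving described above can be sketched as follows. The sketch is illustrative only: the function F of (3.3) is not legible in this transcript, so a hypothetical smooth test function stands in for it, and the names `solve` and `newton_step_halving` are assumptions rather than routines from the thesis.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def newton_step_halving(F, grad, hess, theta, tol=1e-8, max_iter=50):
    """Newton-Raphson with step size taken from 1, 1/2, 1/4, ... ."""
    for _ in range(max_iter):
        g = grad(theta)
        if (sum(gi * gi for gi in g) / len(g)) ** 0.5 < tol:
            break
        direction = solve(hess(theta), g)  # H^{-1} g
        gamma, current = 1.0, F(theta)
        while gamma > 1e-10:
            trial = [t - gamma * d for t, d in zip(theta, direction)]
            if F(trial) < current:  # first step size that reduces F
                break
            gamma *= 0.5
        else:
            break  # no step size in the sequence reduced F
        theta = trial
    return theta

# Hypothetical test function with minimum at (0, 1).
F = lambda t: math.exp(t[0]) - t[0] + (t[1] - 1.0) ** 2
grad = lambda t: [math.exp(t[0]) - 1.0, 2.0 * (t[1] - 1.0)]
hess = lambda t: [[math.exp(t[0]), 0.0], [0.0, 2.0]]
solution = newton_step_halving(F, grad, hess, [2.0, -3.0])
```

The Hessian at the solution is a by-product of the iteration, which matches the later remark that this algorithm also yields an approximation of the information matrix.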
Based on the results
and
if k = h
otherwise
the expressions for $g(\theta) = \partial F / \partial \theta$ can be found and are given by
en
(5.5)
It should be noted that (3.5) and (3.6) are simple special cases of
(2.23) and (2.24). Differentiating these expressions once more, we
have
(3.8)
(iii)
if m = h - 1
if m = h
if m = h + 1
otherwise = 0
i n.r~i j-n zzz J'
(3.9)
The asymptotic covariance matrix of the maximum likelihood estimate
of (c, a) is again given by
$K' I^{-1} K$
where I is the information matrix and
Clearly, [...], while the other expressions in K can be obtained
as special cases of (i), (v) and (vii) of (2.14).
Unlike the Fletcher-Powell method, the Newton-Raphson algorithm
produces not only the maximum likelihood estimate of $\theta$ but also an
approximation of the information matrix. Note that, by definition,
$I(\theta) = E(H(\theta))$.
Since the Hessian matrix converges in probability to its expectation, we
may use the Hessian matrix to approximate the information matrix.
Therefore, the standard errors of the estimates of c and a can be
obtained by using
$K'(H(\theta))^{-1}K$
Based on the above method, the example reported in Cox (1974) was
reanalyzed as a special case of this model. Choosing the convergence
criterion $\epsilon$ as 0.0001, the program converged extremely rapidly in two
iterations. The convergence summary is reported in Table 3.1 while the
maximum likelihood estimates and their standard errors are
reported in Table 3.2. It is found that the results are almost
identical to those reported in Cox (1974).
The next example is based on N = 100 simulated observations from a
multivariate normal distribution with the dimension of (X', Y) equal to
6. Each $(x_i', y_i)$ was transformed to $(x_i', z_i)$ by (3.1) with the pre-
assigned thresholds $(-\infty, -1.0, 0.0, 0.6, 1.0, 1.3, \infty)$. The
parameters in $\theta$ were estimated using the random sample {$(x_i', z_i)$, i =
1, 2, ..., 100}. The program converged quickly to its solution after 4
iterations; the convergence summary of this run is presented in Table
3.3. The maximum likelihood estimates and their standard error
estimates are reported in Table 3.4. For information, the Pearson's
product covariances between X and Y are also reported. It is found that
the polyserial covariances obtained from the maximum likelihood method are
very close to the sample covariances between X and Y. It is also noted
that the maximum likelihood estimates of the thresholds are very
accurate.
2. Multivariate Polychoric Correlation Model
In this model, we suppose that we have only an n-dimensional
latent vector Y. The observed discrete vector Z is related to Y
similarly by
$$Z_i = k(i) \quad \text{if} \quad a_{i,k(i)} \le Y_i < a_{i,k(i)+1}, \qquad k(i) = 1, 2, \ldots, n(i),$$
with $a_{i,1} = -\infty$
and $a_{i,n(i)+1} = \infty$ for all i.
Based on the frequencies f(k(1), k(2), ..., k(n)), k(i) = 1, 2, ..., n(i), in the
multivariate contingency table, the polychoric correlations $\rho_{ij}$
between $Y_i$ and $Y_j$ and the thresholds can be estimated using the maximum
likelihood approach discussed in Chapter 2. Studies of various special
cases of this model can be found in the literature (Pearson, 1901;
Tallis, 1962; Martinson & Hamdan, 1971; Lee, 1984).
The parameter vector of this model is given by
Let Pr(k) = P($Z_1$ = k(1), ..., $Z_n$ = k(n));
the negative log-likelihood function is given by
(3.10)
It should be noted that the negative log-likelihood function in (3.10) is
a special case of the function given in (2.9), obtained by equating the thresholds, setting $b_i$ =
0 and $r_{ij} = \rho_{ij}$. Therefore, to find the maximum likelihood estimates
of the parameter vector $\theta$ in this model, the computer programs developed
in Chapter 2 for finding the maximum likelihood estimates of the general
model can be used with $b_i$ set equal to zero.
This is in fact not unexpected. Since the behaviour of Z no
longer depends on X, the polyserial covariances should equal zero.
The standard error of the estimate of $\theta$ can be obtained from the
corresponding information matrix I. The (c,d)th element of I is given
by
(3.11)
The expressions for the derivatives of log Pr(k) can be obtained via (2.19) and
(2.20) easily and hence are not presented here. As a result, the
standard error estimates can be directly obtained via (3.11). Moreover,
it is interesting to note that since in this case
(IA),
the convergence in probability of IA to I is easily seen.
Based on the 3 x 3 contingency table given by Olsson (1979), five
parameters were reestimated using the above method. We found that the
result is identical to that of Olsson and is reported in Table 3.5.
Chapter 4
Partition Maximum Likelihood (PML) Estimation
1. The PML Procedure
The method discussed in Chapter 2 gives the maximum likelihood
estimates of the parameter vector $\theta$ in the general model. It is desirable
to have maximum likelihood estimates because they possess many good
statistical properties. However, the cost is that when the dimension of
the latent vector Y is high, the method requires a lot of
computer time. The problem arises from the calculation of the
multiple integrals of the multivariate normal distribution functions.
Theoretically, the distribution functions of normal variates can be
calculated using the theorem in Section 2.2 by a recursive formula, but
in practice this is computationally expensive. As a result, the
algorithm becomes less and less efficient as the dimension of Y
increases. Therefore, it is of interest to seek a more
efficient method for finding the estimate of $\theta$, especially when the
dimension of Y is large.
We now propose a new method, which we call the partition maximum
likelihood (PML) method, for estimating the parameters in the general
model. The basic features are described as follows:
(i) The mean vector $e$ and the covariance matrix $C_{xx}$ are estimated by
the sample estimates as before, based on random observations of X.
(ii) For each i = 1, 2, ..., n, the polyserial correlations between X and
$Y_i$ are estimated based on the observed random sample corresponding
to (X', $Z_i$). Since the dimension of $Z_i$ is one, the underlying
model can be regarded as a polyserial correlation model. Hence
the efficient Newton-Raphson algorithm developed in Section 3.1
can be employed to get the maximum likelihood estimates. This
gives the partition maximum likelihood estimates of $c_i$ and a
set of estimates of the thresholds, for i = 1, 2, ..., n.
(iii) For i, j = 1, 2, ..., n, i ≠ j, the polychoric correlation $\rho_{ij}$ is
estimated based on the n(i) x n(j) contingency table which
contains the observed frequencies corresponding to $Z_i$ and $Z_j$. We treat
this as a special 2-dimensional polychoric correlation model. In
this simple model, the computational task of getting the maximum
likelihood estimates of $\rho_{ij}$ and the thresholds is light. This
gives the partition maximum likelihood estimates of $\rho_{ij}$ and
another (n-1) sets of threshold estimates.
As can be seen, the heavy computational burden of obtaining the
maximum likelihood estimates of the general model is mainly due to the
evaluation of the multivariate normal distribution functions, which
requires the computation of multiple integrals. In the PML method, we separate
the large general model into many small models. In obtaining the
partition maximum likelihood estimates of these small models, we only
need to compute simple single and double integrals instead of the
complicated multiple integrals. Therefore, a lot of computer time can
be saved. One shortcoming of this method is that there are n sets of
threshold estimates for each variable: one from the polyserial correlation
model and (n-1) from the 2-dimensional polychoric correlation models. However, as
we will see from the results of a simulation study, the differences among
these estimates are very small.
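The bookkeeping of the PML procedure, that is, which small models are fitted and what data each one sees, can be sketched as follows. The fitting of each small model is not shown; all function names and the toy data are hypothetical, and averaging the threshold estimates is assumed here only because the worked example in this chapter reports means of the threshold estimates.

```python
from itertools import combinations

def pml_submodels(n):
    """List the small models fitted for an n-dimensional latent vector Y."""
    polyserial = [("polyserial", i) for i in range(n)]
    polychoric = [("polychoric", i, j) for i, j in combinations(range(n), 2)]
    return polyserial + polychoric

def pairwise_table(z_rows, i, j, n_i, n_j):
    """n_i x n_j table of observed frequencies for components i and j of Z.

    Components are indexed from 0; categories are coded 1, 2, ... .
    """
    table = [[0] * n_j for _ in range(n_i)]
    for z in z_rows:
        table[z[i] - 1][z[j] - 1] += 1
    return table

def mean_thresholds(estimate_sets):
    """Average several sets of threshold estimates for the same variable."""
    count = len(estimate_sets)
    return [sum(values) / count for values in zip(*estimate_sets)]

# Toy data: four observations of a three-dimensional Z with two categories each.
z_rows = [(1, 2, 1), (1, 2, 2), (2, 1, 1), (1, 1, 1)]
models = pml_submodels(3)  # 3 polyserial models + 3 bivariate polychoric models
table_01 = pairwise_table(z_rows, 0, 1, 2, 2)
```

For n latent variables this gives n polyserial models and n(n-1)/2 two-way tables, so every integral involved is at most two-dimensional.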
2. Example
Using the PML method, the example given in Section 4 of Chapter 2
is reanalyzed. It is found that the estimates given by the PML method
are almost identical to those given by the maximum likelihood approach.
The results are reported in Table 4.1. The estimates of the thresholds
and their standard errors reported in this table are the means of the
estimates obtained from the polyserial correlation model and the
various 2-dimensional polychoric correlation models.
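The averaging of threshold estimates described above can be sketched as
follows. The function name and all numbers are hypothetical and serve
only to show the bookkeeping: each row holds one set of estimates for
the thresholds of a single polytomous variable, and the spread across
rows indicates how much the sets disagree.

```python
import numpy as np

def combine_threshold_sets(estimates, std_errors):
    """Average the sets of threshold estimates (one set per row) for one variable."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    spread = estimates.max(axis=0) - estimates.min(axis=0)  # disagreement among sets
    return estimates.mean(axis=0), std_errors.mean(axis=0), spread

# Hypothetical values: row 0 from the polyserial model,
# rows 1 and 2 from two 2-dimensional polychoric models.
est = [[-1.00, 0.02, 0.85], [-0.99, 0.01, 0.86], [-0.98, 0.03, 0.84]]
se = [[0.15, 0.12, 0.14], [0.16, 0.13, 0.15], [0.17, 0.14, 0.16]]
mean_est, mean_se, spread = combine_threshold_sets(est, se)
print(mean_est, mean_se, spread)
```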
3. Simulation
The main purpose of this simulation study is to compare the
performance of various kinds of estimates, e.g. the maximum likelihood
estimates and the partition maximum likelihood estimates. The study
is based on simulated data (sample sizes N = 40, 70, 100) from a
multivariate normal distribution in which the dimensions of X and Y are
both equal to two. The population mean vector of this distribution is
taken to be 0, while the covariance matrix is taken to be
[4 x 4 covariance matrix with unit diagonal; the off-diagonal entries
are not legible in the source. The population values of the polyserial
and polychoric correlations are listed in the P.V column of Tables 4.2
and 4.3.]
The standardized simulated random vector (x_i', y_i')' was transformed
to (x_i', z_i')' with the following sets of pre-assigned thresholds:
(I) (a1,2, a1,3, a1,4, a1,5) = (-1.5, -1.0, 0.0, 1.0),
(a2,2, a2,3, a2,4, a2,5) = (-1.0, 0.0, 1.0, 1.5)
and
(II) (a1,2, a1,3, a1,4, a1,5) = (a2,2, a2,3, a2,4, a2,5)
= (-1.0, 0.0, 1.0, 1.5).
Note that the first set of thresholds was selected so that the
distributions of Z_1 and Z_2 are skewed in opposite directions, while
the second set was selected so that the distributions are both skewed
to the right. For each case, 50 replications were generated. The
estimates were obtained using both the maximum likelihood and the PML
approach.
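One replication of such a design can be generated along the following
lines. This is a sketch under stated assumptions rather than the
program used for the study: the cross-correlations are taken from the
P.V columns of Tables 4.2 and 4.3 on the assumption that c_ij denotes
the correlation between Y_i and X_j, the correlation 0.2 between X_1
and X_2 is assumed because the covariance matrix is not legible, and
the thresholds are those listed in the True column of Table 4.4. The
function name `simulate` is hypothetical.

```python
import numpy as np

def simulate(n_obs, corr, thresholds, n_cont, rng):
    """Draw (X, Y) ~ N(0, corr) and replace each Y_j by its category 1, 2, ..."""
    data = rng.multivariate_normal(np.zeros(len(corr)), corr, size=n_obs)
    x, y = data[:, :n_cont], data[:, n_cont:]
    # searchsorted counts the thresholds lying below each latent value
    z = np.column_stack(
        [np.searchsorted(t, y[:, j]) + 1 for j, t in enumerate(thresholds)]
    )
    return x, z

# Assumed correlation matrix of (X1, X2, Y1, Y2); see the caveats above.
CORR = np.array([[1.0, 0.2, 0.8, 0.3],
                 [0.2, 1.0, 0.0, 0.1],
                 [0.8, 0.0, 1.0, 0.4],
                 [0.3, 0.1, 0.4, 1.0]])
SET_I = [np.array([-1.5, -1.0, 0.0, 1.0]), np.array([-1.0, 0.0, 1.0, 1.5])]

rng = np.random.default_rng(0)
x, z = simulate(40, CORR, SET_I, n_cont=2, rng=rng)
print(x.shape, z.shape)
```

Repeating the call 50 times for each sample size and threshold set would
give the replications on which the estimates are compared.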
The simulation results for the polyserial and the polychoric
correlation estimates are reported in Table 4.2 and Table 4.3. These
tables are based on random vectors generated with the first set
of thresholds and the second set of thresholds, respectively. The mean
estimates and the root mean square error (RME)
RME(u, v) = \sqrt{ \frac{1}{50} \sum_{r=1}^{50} (u_r - v_r)^2 },
where u_r and v_r denote the values of the two compared columns in the
r-th replication, are reported. The RME columns c1-c2, c1-c3 and c1-c4
are used to examine the discrepancy between the various estimates and
the true value. The columns c2-c3 and c2-c4 are used to examine the
discrepancy between the Pearson product-moment correlation and the
maximum likelihood estimates and the partition maximum likelihood
estimates, respectively. Finally, the column c3-c4 is used to examine
the discrepancy between the maximum likelihood and the partition
maximum likelihood estimates. From Table 4.2 and Table 4.3, we see
that there is very little evidence of bias in the partition maximum
likelihood and maximum likelihood estimates. Their mean estimates are
virtually identical. From the RME columns, the following interesting
phenomena are observed.
(i) The maximum likelihood estimates perform very well with small
and moderate samples, and under both types of skewed distributions.
As expected, increasing the sample size decreases the RME, and the
RME corresponding to the large correlation is relatively small
(see columns c1-c3 and c2-c3).
(ii) The behaviour of the partition maximum likelihood estimates is
almost identical to that of the maximum likelihood estimates (see
columns c1-c4 and c2-c4).
(iii) In all situations, there is little difference between
the partition maximum likelihood and the maximum likelihood
estimates, especially for the polyserial correlation estimates.
Note also that the RME is large when the true correlation is
large (see column c3-c4).
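The RME columns discussed above compare two columns replication by
replication. A minimal sketch of that computation is given below,
assuming the RME is the root mean square of the differences over the
replications; the function name and the numbers are hypothetical.

```python
import numpy as np

def rme(u, v):
    """Root mean square of u - v over replications (rows), one value per parameter."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)  # per-replication values, or one row of true values
    return np.sqrt(np.mean((u - v) ** 2, axis=0))

# Hypothetical estimates of two parameters in three replications.
mle = np.array([[0.82, 0.01], [0.79, -0.03], [0.80, 0.05]])
pml = np.array([[0.81, 0.01], [0.78, -0.02], [0.80, 0.04]])
true = np.array([0.80, 0.00])

print(rme(mle, true))  # analogue of a column comparing estimates with true values
print(rme(mle, pml))   # analogue of a column comparing two kinds of estimates
```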
The results for the threshold estimates are presented in Table 4.4
and Table 4.5. Again the mean estimates and the various RME are
reported. We observe that the behaviour of the threshold estimates is
very similar to that of the correlation estimates; for example, the
mean estimates for the partition maximum likelihood and maximum
likelihood estimates are almost the same; the bias of both types of
estimates is tiny even with small and moderate samples and with
different types of skewed distributions; and the discrepancy between
the partition maximum likelihood estimates and the maximum likelihood
estimates is very small. From columns c1-c3, c1-c4, c2-c3 and c2-c4,
we see that the threshold estimates obtained from the polyserial
correlation model are better than those from the polychoric model.
This is natural because more information from the data is used by the
polyserial correlation model.
Chapter 5
Summary and Discussion
In this thesis, the maximum likelihood estimation method is
developed for finding the estimates of the parameters in a multivariate
normal model with some of the component variables observable only in
polytomous form. The underlying parameters include the mean vector and
the covariance matrix of the continuous random vector, the polyserial
correlations between the continuous random variables and the latent
variables, the polychoric correlations among the latent variables and
the thresholds. By means of an appropriate transformation, the
complicated likelihood function that involves these parameters is
transformed to a comparatively simple function that only involves the
parameters of interest: the polyserial correlations, the polychoric
correlations and the thresholds. The negative log-likelihood function
is minimized via the Fletcher-Powell algorithm, giving the maximum
likelihood estimates of the parameters. The standard error estimates of
the parameter estimates are obtained from the inverse of the
approximation of the information matrix. Two important special cases
that have wide applicability are studied. They are the polyserial
correlation model and the multivariate polychoric correlation model. In
the polyserial correlation model, the classical Newton-Raphson
algorithm is used to produce the maximum likelihood estimates and their
standard errors. As expected, we found that the Newton-Raphson
algorithm is extremely efficient.
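The two computational ingredients summarized above, minimizing a
negative log-likelihood and reading standard errors off the inverse of
an approximate information matrix, can be sketched for a polyserial
model with one continuous and one polytomous variable. The sketch
deliberately departs from the thesis: it uses SciPy's BFGS routine and
a finite-difference Hessian in place of the Newton-Raphson algorithm
with analytic derivatives, it treats the sample mean and variance of X
as known, and all function names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import ndtr, ndtri

def polyserial_nll(theta, x, z):
    """Negative conditional log-likelihood of categories z (1..K) given standardized x."""
    c = np.tanh(theta[0])  # keeps the correlation inside (-1, 1)
    edges = np.concatenate(([-np.inf], theta[1:], [np.inf]))
    s = np.sqrt(1.0 - c * c)
    p = ndtr((edges[z] - c * x) / s) - ndtr((edges[z - 1] - c * x) / s)
    return -np.sum(np.log(np.clip(p, 1e-300, None)))

def numerical_hessian(f, t, h=1e-4):
    """Central-difference approximation of the Hessian of f at t."""
    k = len(t)
    steps = np.eye(k) * h
    hess = np.empty((k, k))
    for i in range(k):
        for j in range(k):
            hess[i, j] = (f(t + steps[i] + steps[j]) - f(t + steps[i] - steps[j])
                          - f(t - steps[i] + steps[j]) + f(t - steps[i] - steps[j])
                          ) / (4.0 * h * h)
    return hess

def fit_polyserial(x, z):
    """Return (correlation, thresholds, standard errors of all parameters)."""
    x = (x - x.mean()) / x.std()
    cum = np.array([(z <= m).mean() for m in range(1, z.max())])
    start = np.concatenate(([0.0], ndtri(cum)))  # thresholds from cumulative proportions
    f = lambda t: polyserial_nll(t, x, z)
    res = minimize(f, start, method="BFGS")
    cov = np.linalg.inv(numerical_hessian(f, res.x))  # inverse observed information
    se = np.sqrt(np.diag(cov))
    c = np.tanh(res.x[0])
    se[0] *= 1.0 - c * c  # delta method: back to the correlation scale
    return c, res.x[1:], se

# Hypothetical data: true correlation 0.6, thresholds (-1, 0, 1).
rng = np.random.default_rng(1)
x = rng.standard_normal(2000)
y = 0.6 * x + np.sqrt(1.0 - 0.36) * rng.standard_normal(2000)
z = np.searchsorted([-1.0, 0.0, 1.0], y) + 1
c_hat, thr_hat, se = fit_polyserial(x, z)
print(c_hat, thr_hat, se)
```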
Although we have developed all the essential theoretical aspects of
the maximum likelihood estimates, in practice it takes a long computer
time to obtain the estimates, especially when n, the dimension of the
random vector observable only in polytomous form, is large. This is
because in computing the function value and the gradient vector, one
is required to evaluate many multiple integrals. To overcome these
practical difficulties, we propose another estimation approach, namely
the partition maximum likelihood approach. This approach requires
much less computer time than the maximum likelihood approach. Moreover,
from our simulation results, we observe that the discrepancy between the
partition maximum likelihood estimates and the maximum likelihood
estimates is extremely small. Therefore, the partition maximum
likelihood approach represents a very attractive method which can
produce accurate estimates efficiently.
Still another method is to repeatedly use one continuous variable
and one discrete variable, say X_i and Z_j, for finding the estimates of
the polyserial correlations and the thresholds; the process is continued
until all the polyserial correlations are estimated using the underlying
discrete variables. In the context of the polyserial correlation model,
Lee & Poon (1985) demonstrated that this method is inferior to the
partition maximum likelihood method. Hence, in the context of the
general model we anticipate that this method is also inferior to the
partition maximum likelihood method.
Based on a rationale similar to that of Olsson, Drasgow & Dorans (1982),
two-step estimates of the polyserial and the polychoric correlations
might be obtained as follows. The thresholds are first estimated from
the cumulative proportions, then the correlation estimates are obtained
with the thresholds fixed at their estimates. As this reduces the number
of parameters to be iteratively estimated, it may require less computer
time to obtain the estimates. However, we expect that a great deal of
computer time is again necessary because the evaluation of multiple
integrals is still required. Therefore, it would be of significant
practical value to develop a similar kind of partition two-step maximum
likelihood estimate.
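The first step of such a two-step procedure can be sketched as follows,
assuming, as is usual for two-step estimators of this kind, that each
threshold is the standard normal quantile of the corresponding
cumulative marginal proportion. The function name and the counts are
hypothetical.

```python
import numpy as np
from scipy.special import ndtri  # inverse of the standard normal distribution function

def thresholds_from_counts(counts):
    """Thresholds of one polytomous variable from its category frequencies."""
    counts = np.asarray(counts, dtype=float)
    cumulative = np.cumsum(counts)[:-1] / counts.sum()  # last proportion is 1: no threshold
    return ndtri(cumulative)

# Hypothetical frequencies of a four-category variable.
print(thresholds_from_counts([16, 34, 34, 16]))
```

In the second step only the correlations would be estimated iteratively,
with the thresholds held fixed at these values.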
Another direction for further research is to apply the results of
this thesis to other multivariate techniques that are based on the
correlations or the covariances. Examples of these techniques are
canonical correlation analysis, principal component analysis, factor
analysis and covariance structure analysis. One approach is to apply the
estimated correlation/covariance matrix directly to obtain the results.
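As an example of applying an estimated correlation matrix directly, the
sketch below extracts principal components from a matrix assembled from
correlation estimates. The matrix entries and the function name are
hypothetical. A matrix assembled from separately estimated entries is
not guaranteed to be positive definite, so the smallest eigenvalue is
worth inspecting before further use.

```python
import numpy as np

def principal_components(corr):
    """Eigenvalues (descending) and eigenvectors of a symmetric correlation matrix."""
    values, vectors = np.linalg.eigh(np.asarray(corr, dtype=float))
    order = np.argsort(values)[::-1]
    return values[order], vectors[:, order]

# Hypothetical estimated correlation matrix of two continuous and two latent variables.
R = np.array([[1.00, 0.20, 0.80, 0.31],
              [0.20, 1.00, 0.03, 0.13],
              [0.80, 0.03, 1.00, 0.43],
              [0.31, 0.13, 0.43, 1.00]])
values, vectors = principal_components(R)
print(values / values.sum())  # proportion of variance per component
print(values.min() > 0)       # whether the assembled matrix is positive definite
```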
The results of this thesis are developed based on the assumption
that the variables are normally distributed. Hence, if the distribution
is unknown a priori, there is no statistical justification for the
maximum likelihood approach or the partition maximum likelihood
approach. Kraemer (1981) studied some modified biserial correlation
coefficients which require less restrictive assumptions than the
bivariate normality assumption. In the general case, this robustness
problem remains an interesting research topic.
Table 2.1
Convergence Summary of the F-P Algorithm
Iteration  F       RMS     b21    a1,2    a2,4   5      r12
0          4.1619  0.0078  0.128  -1.038  0.718  1.088  -0.127
1          4.1618  0.0093  0.142  -1.043  0.715  1.095  -0.138
2          4.1609  0.0011  0.139  -1.038  0.722  1.092  -0.134
           4.1609  0.0009  0.139  -1.038  0.722  1.092  -0.134
F denotes the function value
RMS denotes the root mean square of the gradient
Table 2.2
Maximum Likelihood Estimates of the Parameters
Parameters  P.Value  PPC     MLE     S.E.
c11          0.0      0.231   0.289  0.108
c12          0.0     -0.103  -0.046  0.120
c21          0.0     -0.134  -0.123  0.098
c22          0.0     -0.107  -0.136  0.119
c31          0.0      0.016  -0.032  0.127
c32          0.0     -0.134  -0.094  0.136
a1,2        -1.0      -      -0.993  0.171
a1,3        -0.2      -      -0.215  0.128
a1,4         0.8      -       0.838  0.153
a2,2        -0.8      -      -0.774  0.156
a2,3         0.0      -      -0.066  0.140
a2,4         0.8      -       0.708  0.144
a3,2        -1.2      -      -1.401  0.202
a3,3        -0.4      -      -0.286  0.140
a3,4         0.2      -       0.248  0.139
a3,5         1.0      -       1.086  0.174
rho12        0.0     -0.163  -0.159  0.125
rho13        0.0      0.009  -0.021  0.125
rho23        0.0      0.032   0.094  0.125
P.Value denotes the population value
PPC denotes the Pearson product-moment correlation
MLE denotes our maximum likelihood estimate
S.E. denotes the standard error
Table 3.1
Convergence Summary of Cox's Data
Iteration  F2(b,a)  RMS     b1      a2      a3
0          20.6?35  1.165?  -0.104  -1.182  0.64?
1          20.5401  0.0205  -0.215  -1.195  0.361
2          20.5401  0.0000  -0.216  -1.197  0.863
F2(b,a) denotes the function value
RMS denotes the root mean square of the gradient
? denotes a digit that is illegible in the source
Table 3.2
Maximum Likelihood Estimates of Cox's Data
            Our solution                 Cox's solution
Parameters  estimates  standard errors   estimates  standard errors
c            0.211     0.220              0.211     0.224
a2          -1.169     0.319             -1.169     0.324
a3           0.843     0.284              0.843     0.286
Table 3.3
Convergence Summary of the Newton-Raphson Algorithm
Iteration  F2(b,a)  RMS     b2     b4      a2      ?
0          78.8099  4.7125  0.151  -0.456  -2.307  2.051
1          ??.6227  0.9806  0.356  -0.660  -3.182  2.68
2          69.2041  0.1542  0.420  -0.766  -3.83   3.01
3          69.146?  0.0062  0.4?3  -0.815  -4.006  3.164
4          69.1407  0.0000  0.433  -0.81   -4.014  3.168
F2(b,a) denotes the function value
RMS denotes the root mean square of the gradient
? denotes a label or digit that is illegible in the source
Table 3.4
Maximum Likelihood Estimates of the Parameters
Parameters  P.Value  PPC     MLE     S.E.
c1          -0.6     -0.695  -0.700  0.031
c2          -0.1     -0.041  -0.067  0.041
c3           0.0      0.010   0.014  0.042
c4           0.2      0.121   0.140  0.043
c5           0.7      0.701   0.694  0.029
a2          -1.0      -      -1.067  0.090
a3           0.0      -       0.138  0.064
a4           0.6      -       0.649  0.066
a5           1.0      -       1.028  0.079
a6           1.3      -       1.506  0.108
P.Value denotes the population value
PPC denotes the Pearson product-moment correlation
MLE denotes our maximum likelihood estimates
S.E. denotes the standard error
Table 3.5
Maximum Likelihood Estimates of Olsson's Data
            Our solution                 Olsson's solution
Parameters  estimates  standard errors   estimates  standard errors
a1,2        -1.774     0.104             -1.77      0.103
a1,3        -0.137     0.056             -0.14      0.056
a2,2        -0.688     0.061             -0.69      0.061
a2,3         0.667     0.061              0.67      0.061
rho12        0.492     0.050              0.46      0.043
Table 4.1
PML Estimates of the Parameters
Parameters  P.Value  PPC     MLE     PML     S.E.
c11          0.0      0.231   0.269   0.288  0.097
c12          0.0     -0.103  -0.046  -0.045  0.103
c21          0.0     -0.134  -0.123  -0.123  0.107
c22          0.0     -0.107  -0.136  -0.134  0.108
c31          0.0      0.016  -0.032  -0.032  0.104
c32          0.0     -0.134  -0.094  -0.095  0.103
a1,2        -1.0      -      -0.993  -0.994  0.152
a1,3        -0.2      -      -0.215  -0.214  0.125
a1,4         0.8      -       0.838   0.853  0.144
a2,2        -0.8      -      -0.774  -0.84   0.140
a2,3         0.0      -      -0.066  -0.079  0.125
a2,4         0.8      -       0.708   0.697  0.137
a3,2        -1.2      -      -1.401  -1.403  0.162
a3,3        -0.4      -      -0.26?  -0.262  0.127
a3,4         0.2      -       0.248   0.251  0.127
a3,5         1.0      -       1.086   1.082  0.15?
rho12        0.0     -0.163  -0.159  -0.137  0.113
rho13        0.0      0.009  -0.021  -0.016  0.112
rho23        0.0      0.032   0.094   0.091  0.112
P.Value denotes the population value
PPC denotes the Pearson product-moment correlation
MLE denotes the maximum likelihood estimate
PML denotes the (mean of) partition maximum likelihood estimate
S.E. denotes the (mean of) standard error
? denotes a digit that is illegible in the source
Table 4.2
Simulation Results for Correlations
Distributions of Z1 and Z2 skewed in opposite directions
             Mean of Estimates              RME
PAR    P.V   PPC     MLE     PML     c1-c2  c1-c3  c1-c4  c2-c3  c2-c4  c3-c4
(N = 40)
c11    0.80   0.801   0.815   0.800  0.062  0.072  0.062  0.028  0.022  0.020
c12    0.00   0.011   0.029   0.029  0.142  0.164  0.161  0.058  0.056  0.008
c21    0.30   0.307   0.315   0.314  0.158  0.170  0.169  0.054  0.053  0.004
c22    0.10   0.123   0.131   0.130  0.147  0.161  0.160  0.054  0.053  0.004
rho12  0.40   0.402   0.426   0.425  0.128  0.142  0.140  0.077  0.081  0.020
(N = 70)
c11    0.80   0.794   0.801   0.791  0.052  0.056  0.050  0.026  0.023  0.010
c12    0.00   0.029   0.041   0.040  0.125  0.145  0.142  0.038  0.036  0.006
c21    0.30   0.332   0.338   0.336  0.110  0.127  0.124  0.042  0.040  0.006
c22    0.10   0.132   0.132   0.130  0.115  0.129  0.126  0.041  0.040  0.007
rho12  0.40   0.417   0.425   0.427  0.091  0.119  0.122  0.063  0.067  0.011
(N = 100)
c11    0.80   0.796   0.799   0.791  0.035  0.040  0.037  0.020  0.019  0.010
c12    0.00  -0.014  -0.019  -0.019  0.084  0.091  0.089  0.041  0.040  0.004
c21    0.30   0.310   0.307   0.306  0.111  0.113  0.112  0.032  0.032  0.002
c22    0.10   0.118   0.111   0.110  0.117  0.129  0.129  0.033  0.033  0.002
rho12  0.40   0.415   0.410   0.411  0.096  0.103  0.104  0.037  0.038  0.008
PAR denotes parameter
P.V denotes population value
PPC denotes the Pearson product-moment correlation
MLE denotes maximum likelihood estimates
PML denotes partition maximum likelihood estimates
RME denotes root mean square error
Table 4.3
Simulation Results for Correlations
Distributions of Z1 and Z2 both skewed to the right
             Mean of Estimates              RME
PAR    P.V   PPC     MLE     PML     c1-c2  c1-c3  c1-c4  c2-c3  c2-c4  c3-c4
(N = 40)
c11    0.80   0.803   0.816   0.803  0.063  0.074  0.063  0.034  0.026  0.020
c12    0.00  -0.015  -0.019  -0.017  0.131  0.147  0.144  0.053  0.051  0.009
c21    0.30   0.289   0.291   0.289  0.149  0.153  0.152  0.047  0.048  0.003
c22    0.10   0.098   0.108   0.108  0.136  0.154  0.154  0.055  0.055  0.004
rho12  0.40   0.380   0.387   0.391  0.141  0.171  0.165  0.073  0.071  0.016
(N = 70)
c11    0.80   0.798   0.798   0.791  0.039  0.043  0.039  0.021  0.021  0.010
c12    0.00  -0.025  -0.022  -0.021  0.103  0.106  0.106  0.038  0.038  0.00?
c21    0.30   0.298   0.299   0.299  0.101  0.115  0.115  0.038  0.038  0.002
c22    0.10   0.089   0.093   0.092  0.104  0.109  0.109  0.033  0.033  0.002
rho12  0.40   0.388   0.392   0.391  0.093  0.107  0.106  0.048  0.050  0.010
(N = 100)
c11    0.80   0.805   0.810   0.786  0.03?  0.040  0.037  0.025  0.031  0.024
c12    0.00  -0.015  -0.017  -0.017  0.086  0.089  0.087  0.033  0.032  0.006
c21    0.30   0.313   0.318   0.318  0.097  0.094  0.094  0.031  0.031  0.002
c22    0.10   0.086   0.093   0.093  0.085  0.091  0.091  0.033  0.033  0.001
rho12  0.40   0.394   0.400   0.400  0.094  0.103  0.101  0.050  0.050  0.009
PAR denotes parameter
P.V denotes population value
PPC denotes the Pearson product-moment correlation
MLE denotes maximum likelihood estimates
PML denotes partition maximum likelihood estimates
RME denotes root mean square error
? denotes a digit that is illegible in the source
Table 4.4
Simulation Results for Thresholds
Distributions of Z1 and Z2 skewed in opposite directions
             Mean of Estimates
                     PML                    RME
PAR   True   MLE     PMLC    PMLS    c1-c2  c1-c3  c1-c4  c2-c3  c2-c4  c3-c4
(N = 40)
a1,2  -1.5  -1.522  -1.525  -1.533   0.189  0.212  0.182  0.094  0.039  0.092
a1,3  -1.0  -0.991  -0.990  -0.998   0.105  0.116  0.103  0.052  0.022  0.045
a1,4   0.0  -0.035  -0.033  -0.032   0.106  0.117  0.106  0.051  0.020  0.049
a1,5   1.0   0.994   0.996   1.003   0.141  0.129  0.135  0.053  0.028  0.039
a2,2  -1.0  -0.988  -0.991  -0.988   0.132  0.132  0.133  0.009  0.008  0.012
a2,3   0.0  -0.011  -0.007  -0.010   0.119  0.117  0.119  0.015  0.011  0.016
a2,4   1.0   0.995   0.994   0.995   0.146  0.144  0.147  0.012  0.009  0.015
a2,5   1.5   1.565   1.555   1.563   0.250  0.240  0.243  0.033  0.025  0.032
(N = 70)
a1,2  -1.5  -1.502  -1.514  -1.502   0.142  0.15   0.140  0.060  0.022  0.064
a1,3  -1.0  -1.006  -1.002  -1.006   0.065  0.098  0.084  0.041  0.011  0.041
a1,4   0.0   0.025   0.031   0.022   0.076  0.087  0.075  0.036  0.013  0.037
a1,5   1.0   0.972   0.970   0.976   0.124  0.127  0.125  0.02?  0.013  0.029
a2,2  -1.0  -0.9?2  -0.963  -0.982   0.101  0.101  0.101  0.008  0.006  0.008
a2,3   0.0  -0.006  -0.008  -0.006   0.098  0.099  0.099  0.009  0.009  0.012
a2,4   1.0   1.007   1.006   1.007   0.116  0.117  0.117  0.010  0.009  0.010
a2,5   1.5   1.544   1.546   1.545   0.163  0.164  0.138  0.016  0.017  0.021
(N = 100)
a1,2  -1.5  -1.485  -1.476  -1.490   0.104  0.105  0.101  0.046  0.014  0.045
a1,3  -1.0  -0.995  -0.994  -0.997   0.0??  0.068  0.065  0.030  0.009  0.028
a1,4   0.0  -0.008  -0.013  -0.006   0.059  0.062  0.058  0.029  0.011  0.029
a1,5   1.0   0.991   0.987   0.993   0.076  0.077  0.075  0.023  0.010  0.020
a2,2  -1.0  -0.985  -0.986  -0.986   0.074  0.074  0.073  0.004  0.005  0.005
a2,3   0.0   0.004   0.004   0.007   0.062  0.062  0.064  0.006  0.008  0.011
a2,4   1.0   0.974   0.974   0.974   0.076  0.075  0.076  0.005  0.006  0.008
a2,5   1.5   1.498   1.498   1.494   0.146  0.146  0.146  0.009  0.013  0.016
PAR denotes parameter
True denotes true value
RME denotes the root mean square error
MLE denotes maximum likelihood estimate
PML denotes partition maximum likelihood estimate
PMLC denotes the PML estimates obtained from the 2-dimensional
polychoric correlation model
PMLS denotes the PML estimates obtained from the polyserial
correlation model
? denotes a digit that is illegible in the source
Table 4.5
Simulation Results for Thresholds
Distributions of Z1 and Z2 both skewed to the right
             Mean of Estimates
                     PML                    RME
PAR   True   MLE     PMLC    PMLS    c1-c2  c1-c3  c1-c4  c2-c3  c2-c4  c3-c4
(N = 40)
a1,2  -1.0  -1.033  -1.034  -1.042   0.175  0.180  0.173  0.050  0.020  0.043
a1,3   0.0   0.023   0.019   0.020   0.101  0.096  0.099  0.053  0.016  0.051
a1,4   1.0   1.013   0.997   1.016   0.107  0.098  0.104  0.052  0.020  0.048
a1,5   1.5   1.488   1.497   1.499   0.190  0.216  0.187  0.079  0.030  0.074
a2,2  -1.0  -0.986  -0.983  -0.985   0.134  0.133  0.133  0.009  0.009  0.012
a2,3   0.0   0.012   0.009   0.012   0.121  0.119  0.122  0.011  0.013  0.018
a2,4   1.0   0.969   0.968   0.967   0.136  0.137  0.139  0.010  0.014  0.016
a2,5   1.5   1.506   1.509   1.509   0.191  0.190  0.193  0.016  0.025  0.029
(N = 70)
a1,2  -1.0  -0.992  -0.990  -0.997   0.086  0.091  0.089  0.034  0.013  0.033
a1,3   0.0  -0.016  -0.021  -0.017   0.085  0.095  0.087  0.034  0.010  0.033
a1,4   1.0   1.027   1.024   1.030   0.101  0.104  0.106  0.028  0.013  0.028
a1,5   1.5   1.505   1.504   1.509   0.125  0.126  0.125  0.052  0.020  0.047
a2,2  -1.0  -0.998  -0.999  -0.997   0.101  0.102  0.101  0.005  0.005  0.006
a2,3   0.0  -0.011  -0.011  -0.010   0.095  0.092  0.094  0.008  0.008  0.011
a2,4   1.0   1.012   1.011   1.013   0.117  0.11?  0.117  0.007  0.007  0.009
a2,5   1.5   1.533   1.536   1.534   0.187  0.188  0.188  0.014  0.015  0.019
(N = 100)
a1,2  -1.0  -1.00?  -1.002  -1.004   0.075  0.08?  0.077  0.02   0.012  0.020
a1,3   0.0   0.01?   0.01?   0.012   0.075  0.076  0.078  0.026  0.009  0.029
a1,4   1.0   0.9??   0.986   0.999   0.075  0.076  0.074  0.02   0.016  0.028
a1,5   1.5   1.46?   1.476   1.486   0.135  0.154  0.138  0.05   0.027  0.051
a2,2  -1.0  -0.99?  -0.99?  -0.994   0.091  0.092  0.092  0.00?  0.003  0.005
a2,3   0.0   0.01?   0.01?   0.014   0.077  0.076  0.077  0.005  0.004  0.007
a2,4   1.0   1.00?   1.001   1.000   0.079  0.075  0.079  0.006  0.005  0.009
a2,5   1.5   1.5??   1.50?   1.503   0.1??  0.141  0.133  0.005  0.007  0.012
PAR denotes parameter
True denotes true value
RME denotes the root mean square error
MLE denotes maximum likelihood estimate
PML denotes partition maximum likelihood estimate
PMLC denotes the PML estimates obtained from the 2-dimensional
polychoric correlation model
PMLS denotes the PML estimates obtained from the polyserial
correlation model
? denotes a digit that is illegible in the source
References
Anderson, T.W. An Introduction to Multivariate Statistical Analysis.
New York: Wiley, 1958.
Ashford, J.R., & Sowden, R.R. Multivariate probit analysis.
Biometrics, 1970, 26, 535-546.
Bard, Y. Nonlinear Parameter Estimation. New York: Academic Press,
1975.
Cox, N.R. Estimation of the correlation between a continuous and a
discrete variable. Biometrics, 1974, 30, 171-178.
Divgi, D.R. Calculation of the tetrachoric correlation coefficient.
Psychometrika, 1979, 44, 169-172.
Finney, D.J. Probit Analysis (3rd ed.). Cambridge, England:
Cambridge University Press, 1971.
Hannan, J.F., & Tate, R.F. Estimation of the parameters for a
multivariate normal distribution when one variable is dichotomized.
Biometrika, 1965, 52, 664-668.
IMSL Library (ed. 5). Houston, Texas: International Mathematical and
Statistical Libraries, 1975.
Johnson, N.L., & Kotz, S. Distributions in Statistics: Continuous
Multivariate Distributions. New York: Wiley, 1972.
Kendall, M.G., & Stuart, A. The Advanced Theory of Statistics, Vol. II:
Inference and Relationship. New York: Hafner, 1967.
Kraemer, H.C. Modified biserial correlation coefficients.
Psychometrika, 1981, 46, 275-282.
Lazarsfeld, P.F. Latent structure analysis. In S. Koch (ed.),
Psychology: A Study of a Science, Vol. 3. New York: McGraw-Hill, 1959.
Lee, S.Y. Maximum likelihood estimation of polychoric correlations in
r x s x t contingency tables. Submitted for publication, 1984.
Lee, S.Y., & Poon, W.Y. Maximum likelihood estimation of polyserial
correlations. Technical Report No. 22, Dept. of Statistics, CUHK,
1985.
Lord, F.M., & Novick, M.R. Statistical Theories of Mental Test Scores.
Reading, Mass.: Addison-Wesley, 1968.
Luenberger, D.G. Introduction to Linear and Non-linear Programming.
Reading, Mass.: Addison-Wesley, 1973.
Martinson, E.O., & Hamdan, M.A. Maximum likelihood and some other
asymptotically efficient estimators of correlation in two way
contingency tables. Journal of Statistical Computation and Simulation,
1971, 1, 45-54.
McDonald, R.P., & Swaminathan, H. A simple matrix calculus with
applications to multivariate analysis. General Systems, 1973, 18, 37-54.
Muirhead, R.J. Aspects of Multivariate Statistical Theory. John
Wiley & Sons, 1982.
Nerlove, M., & Press, S.J. Univariate and multivariate log-linear and
logistic models. Santa Monica: The Rand Corp., R-1306-EDA/NIH, 1973.
Olsson, U. Maximum likelihood estimation of the polychoric correlation
coefficient. Psychometrika, 1979, 44, 443-460.
Olsson, U., Drasgow, F., & Dorans, N.J. The polyserial correlation
coefficient. Psychometrika, 1982, 47, 337-347.
Pearson, K. Mathematical contributions to the theory of evolution,
VII: On the correlation of characters not quantitatively measurable.
Philosophical Transactions of the Royal Society of London, Series A,
1901, 195, 1-47.
Prince, B.M., & Tate, R.F. Accuracy of maximum-likelihood estimates of
correlation for a biserial model. Psychometrika, 1966, 31, 85-92.
Rao, C.R. Linear Statistical Inference and its Applications. John
Wiley & Sons, 1973.
Schmidt, P., & Strauss, R.P. Estimation of models with jointly
dependent qualitative variables: A simultaneous logit approach.
Econometrica, 1975, 43, 745-755.
Tallis, G. The maximum likelihood estimation of correlation from
contingency tables. Biometrics, 1962, 18, 342-353.
Tate, R.F. The theory of correlation between two continuous variables
when one is dichotomized. Biometrika, 1955, 42, 205-216. (a)
Tate, R.F. Applications of correlation models for biserial data.
Journal of the American Statistical Association, 1955, 50, 1078-1095. (b)