
MAXIMUM LIKELIHOOD ESTIMATION

OF

MULTIVARIATE

POLYSERIAL AND POLYCHORIC CORRELATION COEFFICIENTS

by

WAI-YIN POON

A Thesis

submitted to

the Graduate School of

The Chinese University of Hong Kong

(Division of Statistics)

In Partial Fulfillment

of the Requirements for the Degree of

Master of Philosophy (M. Phil.)

Hong Kong

May, 1985

THE CHINESE UNIVERSITY OF HONG KONG

GRADUATE SCHOOL

The undersigned certify that we have read a thesis, entitled "Maximum Likelihood Estimation of Multivariate Polyserial and Polychoric Correlation Coefficients", submitted to the Graduate School by Wai-yin Poon （潘偉賢） in partial fulfillment of the requirement for the degree of Master of Philosophy in Statistics. We recommend that it be accepted.

Dr. S.Y. Lee, Supervisor

Professor H. Ton

Dr. ELK. Lam

Professor P.M. Bentler, External Examiner

DECLARATION

No portion of the work referred to in this thesis has been

submitted in support of an application for another degree or

qualification of this or any other university or other institution

of learning.

Acknowledgments

I would like to express my sincere thanks to Dr. S.Y. Lee for his encouragement and his supervision of my thesis. It is also a pleasure to express my gratitude to my father, Mr. Y.D. Poon, for his encouragement.

Abstract

A method for finding the maximum likelihood estimates of the parameters in a multivariate normal model, in which some of the component variables are observable only in polytomous form, is developed. The main device used is a reparameterization which converts the corresponding log-likelihood function into one that is easily handled. The maximum likelihood estimates are found by a Fletcher-Powell algorithm, and their standard errors are obtained from the information matrix. Two special models, namely the polyserial correlation model and the multivariate polychoric correlation model, are studied. When the dimension of the random vector observable only in polytomous form is large, obtaining the maximum likelihood estimates is computationally rather expensive. Therefore, a more efficient method, the partition maximum likelihood method, is proposed. These estimation methods are demonstrated on real and simulated data, and compared by a simulation study.

Contents

Chapter 1 Introduction

Chapter 2 Maximum Likelihood Estimation of the General Model

2.1 Model

2.2 Maximum Likelihood Estimation

2.3 Optimization procedure

2.4 Example

Chapter 3 Special Models

3.1 Polyserial correlation model

3.2 Multivariate polychoric correlation model

Chapter 4 Partition Maximum Likelihood (PML) Estimation

4.1 The PML procedure

4.2 Example

4.3 Simulation

Chapter 5 Summary and Discussion

Tables

Reference


Chapter 1

Introduction

There are many examples in psychology (Lazarsfeld, 1959; Lord & Novick, 1968), econometrics (Nerlove & Press, 1973; Schmidt & Strauss, 1975) and biometrics (Ashford & Sowden, 1970; Finney, 1971) in which a continuous variable underlies a dichotomous or polytomous observed variable. Examples of such variables are attitude items, rating scales, performance items and the like. A typical case is when a subject is asked to answer a question on a scale like

don't know | disapprove strongly | disapprove | approve | approve strongly

When analyzing this kind of data, a common approach used by nonrigorous statisticians is to assign integer values to each category and proceed with the analysis as if the data had been measured on an interval scale with the desired distributions. Although many statistical methods seem to be fairly robust against this kind of deviation from the distributional assumptions, there are many situations that may lead to erroneous results. Olsson (1979) showed that, owing to the biased estimates of the correlation, the application of factor analysis to this kind of discrete data may lead to erroneous conclusions. As one would expect, the application of principal component analysis, multiple correlation and canonical correlation analysis may lead to incorrect results as well, because these statistical methods also depend heavily on the estimation of the correlations. It is therefore important to derive reliable correlation estimates from this kind of data.
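The bias described above can be illustrated with a small simulation. The sketch below is not taken from the thesis: the latent correlation of 0.6, the thresholds and the function names are hypothetical choices made only for illustration. It draws a bivariate normal sample, scores each latent variable into three integer categories, and compares the Pearson correlation of the latent values with that of the integer scores.

```python
import random
import statistics

def pearson(u, v):
    """Pearson product-moment correlation of two equal-length sequences."""
    mu, mv = statistics.fmean(u), statistics.fmean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

def to_score(y, cuts):
    """Integer category 1..len(cuts)+1 for a latent value y."""
    return 1 + sum(y >= c for c in cuts)

def simulate(rho=0.6, n=20000, seed=1):
    """Return (correlation of latent values, correlation of integer scores)."""
    rng = random.Random(seed)
    cuts1, cuts2 = [-0.5, 0.8], [-1.0, 0.3]   # hypothetical thresholds
    y1, y2, z1, z2 = [], [], [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        # v has unit variance and correlation rho with u
        v = rho * u + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        y1.append(u)
        y2.append(v)
        z1.append(to_score(u, cuts1))
        z2.append(to_score(v, cuts2))
    return pearson(y1, y2), pearson(z1, z2)

latent_r, scored_r = simulate()
print(f"latent: {latent_r:.3f}  integer scores: {scored_r:.3f}")
```

Under these assumptions the correlation computed from the integer scores is noticeably smaller than the latent correlation, which is the kind of attenuation that motivates the polychoric and polyserial estimators.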


The measure of bivariate normal correlation based on data from a 2 x 2 contingency table was suggested by Pearson (1901), who called it the tetrachoric correlation. This correlation has been extended to data from an r x s contingency table in which the underlying observed variables have r and s ordinal categories respectively. Tallis (1962) studied the problem of maximum likelihood estimation with r = s = 3. Martinson & Hamdan (1971) developed a two-step maximum likelihood method that gives the estimates for an r x s table. In this method, the thresholds are first estimated from the cumulative marginal proportions, and then the polychoric correlation is estimated with the thresholds fixed at their estimates. Olsson (1979) developed a procedure that gives the maximum likelihood estimates of the correlation and the thresholds jointly. He also compared the full maximum likelihood approach with the two-step approach. Lee (1984) extended Olsson's method to three-way r x s x t contingency tables. In that case, three observed polytomous ordinal variables were considered. The thresholds and correlations were estimated using the maximum likelihood approach, under the assumption that the associated underlying latent variables have a standardized trivariate normal distribution.

Let Z be an observed discrete variable which depends on an underlying latent continuous random variable Y, and let X represent another observed continuous variable. The correlation between X and Y obtained from observed X and Z is called the polyserial correlation. Under the normality assumption and when Z is dichotomous, the maximum likelihood estimation of the correlation between X and Y has been studied by Tate (1955a,b). Tate's work has been generalized in two directions. Hannan & Tate (1965) considered the situation with X being a random vector. They derived the maximum likelihood estimates of the correlations and of the point of dichotomy (threshold), together with the standard error estimates. However, because of the complicated integrals involved, the expressions for the standard errors are very difficult to use. Tables that help the computation of the standard errors were published later by Prince & Tate (1966). Cox (1974) generalized Tate's work by treating Z as a polytomous observed variable. Maximum likelihood estimates of the parameters in the model were obtained via the scoring algorithm. Olsson, Drasgow & Dorans (1982) compared the maximum likelihood estimator with a two-step estimator and with a simple ad hoc estimator. Lee & Poon (1985) developed a method for estimating the parameters of a model in which X is an observable random vector and Z is an observable ordinal polytomous variable. The set of parameters in their model contains the mean vector and covariance matrix of X, the thresholds, and the polyserial correlations between X and Y. Under the normality assumption, a method based on the Newton-Raphson algorithm was developed to produce the maximum likelihood estimates and the standard error estimates. Clearly, the model of Lee & Poon (1985) reduces to the previously studied situations in special cases.

In this thesis, we consider a model which involves an observable continuous random vector X and a latent continuous random vector Y. An observable polytomous random vector Z is defined from the values of Y and a set of thresholds. The main purpose is to develop a maximum likelihood approach for estimating the parameters of the model, which comprise the mean vector and the covariance matrix of X, the polyserial correlations between X and Y, the polychoric correlations among the variables in Y, and the thresholds. In Chapter 2, the general model is described and its maximum likelihood estimation is studied. The estimates are obtained via the Fletcher-Powell algorithm, and standard error estimates are obtained from the inverse of the information matrix. Two special cases of the general model that have wide practical applicability are described in Chapter 3. The first is the model studied by Lee & Poon (1985), while the second is the model in which only Z is involved. A procedure for analyzing the general model by means of the special models of Chapter 3, which we call the partition maximum likelihood (PML) procedure, is studied in Chapter 4. As we will see, the PML procedure requires much less computer time than the original maximum likelihood procedure. Chapter 4 also reports results from a simulation study which was carried out to compare the accuracy of the maximum likelihood and the partition maximum likelihood methods. The indication is that the estimates produced by these methods are very close to each other. The discussion of the results and the conclusion are given in Chapter 5.

Chapter 2

Maximum Likelihood Estimation of the General Model

1. Model

Let X and Y be continuous random vectors of dimension r and n respectively. It is assumed that $(X', Y')'$ is distributed according to a multivariate normal distribution with mean vector $(e', 0')'$ and variance-covariance matrix

$$\begin{pmatrix} C_{xx} & C_{xy} \\ C_{yx} & R_{yy} \end{pmatrix},$$

where

e is the r x 1 mean vector of X,

0 is the n x 1 zero mean vector of Y,

$C_{xx}$ is the r x r covariance matrix of X,

$C_{yx} = C_{xy}'$ is the n x r covariance matrix of Y and X,

$R_{yy}$ is the n x n correlation matrix of Y.

Clearly, if $C_{xx}$ is the correlation matrix of X, then $C_{yx}$ stores the correlations between Y and X.

Let Z be an n x 1 polytomous vector defined as

$$Z_i = k(i) \quad \text{if} \quad \alpha_{i,k(i)} \le Y_i < \alpha_{i,k(i)+1}, \qquad k(i) = 1, 2, \ldots, n(i), \tag{2.1}$$

where $Y_i$ is the ith component of Y and $\alpha_{i,1} = -\infty$, $\alpha_{i,n(i)+1} = \infty$ for all i.

Let $\theta$ be the vector of unknown parameters, containing e, $\mathrm{Vec}(C_{xx})$, $\mathrm{Vec}(C_{yx})$, the thresholds $\alpha_{i,k}$ and the correlations $\rho_{ij}$, where Vec(C) is a vector taking the elements of C row by row and $\rho_{ij} = \rho_{ji}$, $i, j = 1, 2, \ldots, n$, $i \ne j$, are the off-diagonal elements of $R_{yy}$. Suppose that we have a random sample from $(X', Z')'$ with sample size N. Then the information available consists of N observed (r+n) x 1 vectors of the form $(x_{i(z)}', z')'$. The ith component of z takes a value from 1, 2, ..., n(i), and i(z) takes a value from the sequence 1, 2, ..., f(z), where f(z) is the total number of observations with Z = z. Clearly $\sum_z f(z) = N$.

Based on this data set, a method for finding the maximum likelihood estimates is developed. The estimates of the elements of $C_{yx}$ based on random observations of X and Z are called the polyserial correlations, and the estimates of $\rho_{ij}$ are called the polychoric correlations.
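The definition (2.1) amounts to locating each latent component among its thresholds. A minimal sketch follows, assuming that only the finite thresholds are stored and that a value equal to a threshold falls into the upper category; the function name `categorize` is hypothetical and is not part of the thesis programs.

```python
from bisect import bisect_right

def categorize(y, thresholds):
    """Map a latent vector y to a polytomous vector z as in (2.1).

    thresholds[i] holds the finite, increasing cut points of component i;
    the outer limits -infinity and +infinity are implicit.
    """
    # bisect_right counts the cut points that are <= y_i, so the result k
    # satisfies: (k-1)th finite cut <= y_i < kth finite cut.
    return [1 + bisect_right(cuts, yi) for yi, cuts in zip(y, thresholds)]

# Hypothetical thresholds: six categories for Y_1, two categories for Y_2.
cuts = [[-1.0, 0.0, 0.6, 1.0, 1.3], [0.0]]
print(categorize([0.6, -0.2], cuts))
```

Because Y is continuous, the treatment of ties at a threshold has probability zero and does not affect the likelihood.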

2. Maximum Likelihood Estimation

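Let $p(x)$ and $p(x, z)$ be the probability densities of X and $(X', Z')'$ respectively. Then $p(x, z) = p(x) \Pr(z \mid x)$. Moreover, the conditional distribution of Y given X = x is multivariate normal with mean vector $C_{yx} C_{xx}^{-1} (x - e)$ and variance-covariance matrix $R_{yy} - C_{yx} C_{xx}^{-1} C_{xy}$; that is,

$$Y \mid x \sim N\!\left(C_{yx} C_{xx}^{-1}(x - e),\; R_{yy} - C_{yx} C_{xx}^{-1} C_{xy}\right).$$

Let $c_i'$ be the ith row of $C_{yx}$ and let D denote the diagonal matrix whose (i,i)th entry equals $\sigma_i = (1 - c_i' C_{xx}^{-1} c_i)^{1/2}$. Then we have

$$R_{yy} - C_{yx} C_{xx}^{-1} C_{xy} = D R D, \tag{2.2}$$

where $R = D^{-1}(R_{yy} - C_{yx} C_{xx}^{-1} C_{xy}) D^{-1}$, and its (i,j)th entry takes the value $r_{ij} = (\rho_{ij} - c_i' C_{xx}^{-1} c_j)/(\sigma_i \sigma_j)$.

It is clear from (2.2) that, given X = x, $D^{-1}\{Y - C_{yx} C_{xx}^{-1}(x - e)\}$ has a multivariate normal distribution with correlation matrix R. Its probability density function is given by

$$\phi_n(u; R) = (2\pi)^{-n/2} |R|^{-1/2} \exp\!\left(-\tfrac{1}{2} u' R^{-1} u\right).$$

Therefore,

$$\Pr(Z = k \mid x) = \int_{l_1}^{u_1} \cdots \int_{l_n}^{u_n} \phi_n(u; R)\, du, \tag{2.3}$$

where

$$l_i = \frac{\alpha_{i,k(i)} - c_i' C_{xx}^{-1}(x - e)}{\sigma_i}, \qquad u_i = \frac{\alpha_{i,k(i)+1} - c_i' C_{xx}^{-1}(x - e)}{\sigma_i}.$$

As a consequence, we can represent $p(x, z) = p(x) \Pr(z \mid x)$, where $p(x)$ is the r-dimensional multivariate normal density function and $\Pr(z \mid x)$ is given by (2.3). In addition, the likelihood function of $\theta$ for this random sample is given by

$$L(\theta) = \prod_{z} \prod_{i(z)=1}^{f(z)} p(x_{i(z)}) \Pr(z \mid x_{i(z)}). \tag{2.4}$$

The maximum likelihood estimate $\hat\theta$ is defined as the vector that maximizes $L(\theta)$. However, it should be pointed out that maximizing $L(\theta)$ directly is extremely complicated, because the unknown parameters in e and $C_{xx}$ that are involved in $p(x)$ are also involved in $\Pr(z \mid x)$.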

To overcome this difficulty, we use the transformation defined in (2.5). The parameter vector $\theta$ is thereby transformed to a new parameter vector made up of $\theta_1$, which contains e and $\mathrm{Vec}(C_{xx})$, and $\theta_2$, which contains the transformed parameters $b_i$, $a_{i,k}$ and $r_{ij}$.

This transformation is one to one, with its inverse given by (2.6).

Using this transformation, $\Pr(z \mid x)$ can be expressed in terms of the elements of $\theta_2$ alone, as in (2.7).

Let (2.8) be the likelihood function in terms of the new parameters. Finding the maximum likelihood estimates of the new parameters is easier than finding those of $\theta$ directly, since the transformed likelihood is easier to handle. The reason is that the parameters in $\theta_1$, which are involved in $p(x)$, are now no longer involved in $\Pr(z \mid x)$.
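The displayed forms of (2.5) and (2.6) are not legible in this copy. The block below is therefore a reconstruction under assumptions rather than a quotation: it takes $c_i'$ to be the ith row of $C_{yx}$, $\alpha_{i,k}$ the original thresholds, $\rho_{ij}$ the polychoric correlations, and $\sigma_i$ the conditional standard deviation of $Y_i$ given x. It shows one transformation that has the properties stated in the text (one to one, and removing e and $C_{xx}$ from $\Pr(z \mid x)$).

```latex
% Assumed notation: \sigma_i = (1 - c_i' C_{xx}^{-1} c_i)^{1/2}.
\begin{align*}
  b_i      &= C_{xx}^{-1} c_i / \sigma_i, \\
  a_{i,k}  &= (\alpha_{i,k} + c_i' C_{xx}^{-1} e) / \sigma_i, \\
  r_{ij}   &= (\rho_{ij} - c_i' C_{xx}^{-1} c_j) / (\sigma_i \sigma_j).
\end{align*}
% Inverse: since c_i' C_{xx}^{-1} c_i = \sigma_i^2 \, b_i' C_{xx} b_i,
\begin{align*}
  \sigma_i     &= (1 + b_i' C_{xx} b_i)^{-1/2}, \\
  c_i          &= \sigma_i \, C_{xx} b_i, \\
  \alpha_{i,k} &= \sigma_i \, (a_{i,k} - b_i' e), \\
  \rho_{ij}    &= \sigma_i \sigma_j \, (r_{ij} + b_i' C_{xx} b_j).
\end{align*}
% Standardized integration limit for component i:
\[
  \frac{\alpha_{i,k} - c_i' C_{xx}^{-1}(x - e)}{\sigma_i} = a_{i,k} - b_i' x .
\]
```

With these definitions the integration limits depend on x only through $a_{i,k} - b_i' x$, so the conditional probability involves only $b_i$, $a_{i,k}$ and $r_{ij}$, which is the separation the text relies on.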

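The maximum likelihood estimate of the new parameter vector is the vector that maximizes the transformed likelihood or, equivalently, minimizes the negative log-likelihood. The latter separates into a part $F_1$, which depends only on $\theta_1$ through $p(x)$, and a part $F_2$, which depends only on $\theta_2$ through $\Pr(z \mid x)$, where

$$F_2(\theta_2) = -\sum_{z} \sum_{i(z)=1}^{f(z)} \log \Pr(z \mid x_{i(z)}). \tag{2.9}$$

Under appropriate mild regularity conditions, the maximum likelihood estimate always exists. Therefore, to obtain it we are required to solve the following system of equations:

$$\partial F_1 / \partial \theta_1 = 0, \tag{2.10}$$

$$\partial F_2 / \partial \theta_2 = 0. \tag{2.11}$$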

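The solution of (2.10) gives the traditional maximum likelihood estimates of e and $C_{xx}$:

$$\hat e = \bar x = N^{-1} \sum x, \qquad \hat C_{xx} = N^{-1} \sum (x - \bar x)(x - \bar x)', \tag{2.12}$$

where the sums run over all N observations of X.

However, some iterative procedure is required for obtaining the maximum likelihood estimate of $\theta_2$, the vector of transformed parameters, since (2.11) cannot be solved algebraically in closed form. In addition, the maximum likelihood estimate of $\theta$ can be obtained from (2.6) in terms of the estimates of the new parameters, since the transformation defined in (2.5) is one to one (see, e.g., Anderson, 1958).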
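The closed-form part of the solution is simply the sample mean and the sample covariance matrix with divisor N. A minimal sketch in plain Python is given below; the function name `mle_mean_cov` and the list-of-lists data layout are assumptions made for illustration.

```python
def mle_mean_cov(xs):
    """Maximum likelihood estimates of the mean vector and covariance
    matrix of a multivariate normal sample (divisor N, not N - 1).

    xs is a list of N observations, each a list of r numbers.
    """
    n, r = len(xs), len(xs[0])
    mean = [sum(x[j] for x in xs) / n for j in range(r)]
    cov = [[sum((x[j] - mean[j]) * (x[k] - mean[k]) for x in xs) / n
            for k in range(r)]
           for j in range(r)]
    return mean, cov

mean, cov = mle_mean_cov([[1.0, 2.0], [3.0, 6.0]])
print(mean, cov)
```

Only the remaining parameters (the transformed regression weights, thresholds and correlations) need an iterative search.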

It is well known that, under mild regularity conditions, the maximum likelihood estimates $\hat e$, $\hat C_{xx}$ and $\hat\theta_2$ possess the following statistical properties (see, e.g., Rao, 1973; Muirhead, 1982; Kendall & Stuart, 1978):

(a) They are consistent.

(b) The distribution of $\hat e$ is multivariate normal with mean e and variance-covariance matrix $N^{-1} C_{xx}$.

(c) The distribution of $N \hat C_{xx}$ is Wishart with N - 1 degrees of freedom and covariance matrix $C_{xx}$.

(d) The asymptotic distribution of $\hat\theta_2$ is normal with mean vector $\theta_2$ and covariance matrix

$$I^{-1}, \tag{2.13}$$

where I is the information matrix of $\theta_2$.

(e) Consider the parameter vector containing the parameters defined in (2.6), and its maximum likelihood estimate. The asymptotic distribution of this estimate is multivariate normal with mean vector equal to the true parameter vector and covariance matrix $K' I^{-1} K$ (see, e.g., Rao, 1973), where K is the matrix of first derivatives associated with the transformation (2.6). The explicit expressions for the entries in K are given by

(i)

i = 1, 2, ..., n

(ii) if i ≠ j

(iii)

i, j = 1, 2, ..., n

(iv)

i, j = 1, 2, ..., n

k = 2, 3, ..., n(i)

i, j, k = 1, 2, ..., n

j ≠ k

(v)

i = 1, 2, ..., n

k = 2, 3, ..., n(i)

(vi) if i ≠ j

i, j = 1, 2, ..., n

k(i) = 2, 3, ..., n(i)

(vii)

$\delta_{kl}$ = Kronecker delta

i = 1, 2, ..., n

k, l = 2, 3, ..., n(i)

(viii) if i ≠ j

i, j = 1, 2, ..., n

k = 2, 3, ..., n(i)

l = 2, 3, ..., n(j)

(ix)

i, j, k = 1, 2, ..., n

k ≠ l

l = 2, 3, ..., n(i)

(x)

i, j = 1, 2, ..., n

i ≠ j

(xi)

i, j = 1, 2, ..., n

i ≠ j

(xii)

i, j, k = 1, 2, ..., n

l = 2, 3, ..., n(k)

(xiii)

i, j = 1, 2, ..., n

i ≠ j

(xiv) if (i,j) ≠ (k,l) ≠ (l,k)

i, j = 1, 2, ..., n

i ≠ j, k ≠ l

(2.14)

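Let k denote the row vector (k(1), k(2), ..., k(n)). Then the (c,d)th element of I in (2.13) is given by (2.15), where $\theta_c$ and $\theta_d$ are the cth and dth elements of $\theta_2$. As it stands, this matrix is extremely difficult to compute. However, by the following lemma, we can use a matrix $I_A$ to approximate I, with the (c,d)th element of $I_A$ given by

$$\sum_{k(1)=1}^{n(1)} \cdots \sum_{k(n)=1}^{n(n)} \sum_{i(k)=1}^{f(k)} \frac{\partial \log \Pr(k \mid x_{i(k)})}{\partial \theta_c} \, \frac{\partial \log \Pr(k \mid x_{i(k)})}{\partial \theta_d}. \tag{2.16}$$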

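Lemma

Let L be the log-likelihood function of a random sample $x_i$, i = 1, 2, ..., n, from a distribution with probability density function $f(x; \theta)$, where $\theta$ is the vector of unknown parameters. Then the matrix with (c,d)th element

$$\sum_{i=1}^{n} \frac{\partial \log f(x_i; \theta)}{\partial \theta_c} \, \frac{\partial \log f(x_i; \theta)}{\partial \theta_d}$$

converges in probability to the information matrix, in the sense that the two matrices, each divided by n, have the same limit.

Proof:

The (c,d)th element of the information matrix I is, by definition, given by

$$\begin{aligned}
I_{cd} &= E\!\left[\frac{\partial L}{\partial \theta_c} \frac{\partial L}{\partial \theta_d}\right]
= \sum_{i} \sum_{j} E\!\left[\frac{\partial \log f(x_i; \theta)}{\partial \theta_c} \frac{\partial \log f(x_j; \theta)}{\partial \theta_d}\right] \\
&= \sum_{i} E\!\left[\frac{\partial \log f(x_i; \theta)}{\partial \theta_c} \frac{\partial \log f(x_i; \theta)}{\partial \theta_d}\right]
+ \sum_{i \ne j} E\!\left[\frac{\partial \log f(x_i; \theta)}{\partial \theta_c}\right] E\!\left[\frac{\partial \log f(x_j; \theta)}{\partial \theta_d}\right] && \text{since the } x_i \text{ are independent} \\
&= \sum_{i} E\!\left[\frac{\partial \log f(x_i; \theta)}{\partial \theta_c} \frac{\partial \log f(x_i; \theta)}{\partial \theta_d}\right] && \text{since } E\!\left[\partial \log f / \partial \theta\right] = 0 \\
&= n \, E\!\left[\frac{\partial \log f(x; \theta)}{\partial \theta_c} \frac{\partial \log f(x; \theta)}{\partial \theta_d}\right] && \text{since the } x_i \text{ are identically distributed.}
\end{aligned}$$

Therefore, by the law of large numbers (see, e.g., Rao, 1973, p. 112), the matrix $I_A$ with its (c,d)th element equal to

$$\sum_{i=1}^{n} \frac{\partial \log f(x_i; \theta)}{\partial \theta_c} \, \frac{\partial \log f(x_i; \theta)}{\partial \theta_d}$$

converges in probability to I in the above sense.

Therefore, the asymptotic covariance matrix of $\hat\theta_2$ can be estimated by $(I_A)^{-1}$, while the asymptotic covariance matrix of the estimates of the original parameters can be estimated by $K'(I_A)^{-1}K$.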
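The lemma can be checked numerically on a toy model. The sketch below does not use the thesis model: it assumes an exponential distribution with rate `lam` (a hypothetical choice), for which the per-observation score is 1/lam - x and the per-observation information is 1/lam². The average of the squared scores should then be close to the exact information.

```python
import random

def opg_information(xs, lam):
    """Average outer product of per-observation scores for an
    exponential(rate=lam) sample; the score of one observation x is
    d/d(lam) [log(lam) - lam * x] = 1/lam - x."""
    return sum((1.0 / lam - x) ** 2 for x in xs) / len(xs)

rng = random.Random(0)
lam = 2.0                                   # hypothetical true rate
xs = [rng.expovariate(lam) for _ in range(50000)]

approx = opg_information(xs, lam)           # gradient-based approximation
exact = 1.0 / lam ** 2                      # per-observation information
print(approx, exact)
```

The appeal of this approximation in the present setting is that it needs only first derivatives of the log-probabilities, which are the same quantities required by the optimization routine.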

In deriving the derivatives involved in (2.16), the following theorem, which is a generalization of Lee's (1984) result, is needed.

Theorem

The multivariate normal distribution function with correlation matrix R can be written as

$$\Phi_n(a_1, \ldots, a_n; R) = \int_{-\infty}^{a_i} \phi(x) \, \Phi_{n-1}\!\left(\frac{a_j - \rho_{ij} x}{(1 - \rho_{ij}^2)^{1/2}},\ j \ne i;\ R_i\right) dx,$$

where $i \ne j$, $\rho_{ij}$ is the (i,j)th entry of R, $R_i$ is the partial correlation matrix given $X_i$, and $\phi(\cdot)$ is the standard normal density function.

Proof: see Lee (1984)

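Following the notation and using the result of the theorem, we have

$$\frac{\partial \Phi_n(a_1, \ldots, a_n; R)}{\partial a_i} = \phi(a_i) \, \Phi_{n-1}\!\left(\frac{a_j - \rho_{ij} a_i}{(1 - \rho_{ij}^2)^{1/2}},\ j \ne i;\ R_i\right). \tag{2.17}$$

On the other hand, by Johnson & Kotz (1972), we have (using the notation of the theorem)

$$\frac{\partial \Phi_n(a_1, \ldots, a_n; R)}{\partial \rho_{ij}} = \phi_2(a_i, a_j; \rho_{ij}) \, \Phi_{n-2}\!\left(a_{k \cdot ij},\ k \ne i, j;\ R_{ij}\right), \tag{2.18}$$

where $\phi_2$ is the bivariate normal density function with correlation $\rho_{ij}$, $a_{k \cdot ij}$ is the limit $a_k$ standardized by the conditional mean and standard deviation of $X_k$ given $X_i = a_i$ and $X_j = a_j$, $k \ne i, j$, and $R_{ij}$ is the partial correlation matrix given $X_i$ and $X_j$.

Using (2.17), (2.18) and the method of matrix calculus (McDonald & Swaminathan, 1973), the following expressions for $\partial \log \Pr(k \mid x_{i(k)}) / \partial \theta_2$ in (2.16) are obtained.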

(i) $\partial \log \Pr(k \mid x_{i(k)}) / \partial b_i$

where $R_i$ is the partial correlation matrix given $Y_i \mid x$,

i = 1, 2, ..., n

(2.19)

(ii) $\partial \log \Pr(k \mid x_{i(k)}) / \partial a_{i,m(i)}$

with i(i) = 0 in numerator if m(i) = k(i);

and i(i) = 1 in numerator if m(i) = k(i) + 1

j ≠ i

(2.20)

otherwise = 0

i = 1, 2, ..., n

m(i) = 2, 3, ..., n(i)

(iii) $\partial \log \Pr(k \mid x_{i(k)}) / \partial r_{ij}$

where

$R_{ij}$ is the partial correlation matrix given $Y_i \mid x$ and $Y_j \mid x$,

k ≠ i, k ≠ j

i, j = 1, 2, ..., n

i ≠ j

(2.21)

Finally, the polyserial correlations can be estimated by the components of $\hat c_i$ divided by the square root of the appropriate diagonal element of $\hat C_{xx}$.

3. Optimization Procedure

The system (2.11) for $\theta_2$ cannot be solved algebraically in closed form, so some nonlinear optimization procedure is required. The Newton-Raphson method (see, e.g., Luenberger, 1973) is conceptually the simplest numerical procedure for minimizing a function. However, the Newton-Raphson algorithm requires the analytic form of the second derivatives of the function which, unfortunately, are extremely difficult to find for our function $F_2$. Therefore, a modified Fletcher-Powell procedure (Luenberger, 1973), which requires only the first derivatives, is used. The basic steps of its kth iteration in minimizing a general function f with respect to $\theta$ are as follows:

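(i) set $s^{(k)} = -H^{(k)} g^{(k)}$;

(ii) find $\lambda^{(k)}$ to minimize $f(\theta^{(k)} + \lambda s^{(k)})$;

(iii) set $\theta^{(k+1)} = \theta^{(k)} + \lambda^{(k)} s^{(k)}$;

(iv) update $H^{(k)}$, giving $H^{(k+1)}$,

(2.22)

where $g^{(k)}$ is the gradient of f evaluated at $\theta^{(k)}$ and $H^{(k)}$ is a symmetric positive definite matrix which is updated by the so-called BFGS formula

$$H^{(k+1)} = H^{(k)} + \left(1 + \frac{q' H^{(k)} q}{p' q}\right) \frac{p p'}{p' q} - \frac{p q' H^{(k)} + H^{(k)} q p'}{p' q},$$

where $p = \theta^{(k+1)} - \theta^{(k)}$ and $q = g^{(k+1)} - g^{(k)}$.

It has also been shown that the positive definiteness of H is preserved and hence the function value decreases in each iteration (Luenberger, 1973). We use $\theta^{(k)}$ as our final estimate if the root mean square of the gradient is less than a pre-assigned small number, say $\epsilon$.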
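The iteration can be sketched in a few lines of plain Python. This is an illustrative sketch, not the FORTRAN program used in the thesis: the exact line minimization of step (ii) is replaced here by a simple backtracking search, the update is skipped when p'q is not positive, and the quadratic test function and all names are hypothetical.

```python
import math

def bfgs_minimize(f, grad, theta, eps=1e-6, max_iter=200):
    """Quasi-Newton minimization with the BFGS update of the inverse
    Hessian approximation H, starting from the identity matrix."""
    n = len(theta)
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = grad(theta)
    for _ in range(max_iter):
        # stop when the root mean square of the gradient is small
        if math.sqrt(sum(gi * gi for gi in g) / n) < eps:
            break
        # step (i): search direction s = -H g
        s = [-sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]
        # step (ii), simplified: halve lam until f decreases sufficiently
        slope = sum(gi * si for gi, si in zip(g, s))
        f0, lam = f(theta), 1.0
        while True:
            new_theta = [t + lam * si for t, si in zip(theta, s)]  # step (iii)
            if f(new_theta) <= f0 + 1e-4 * lam * slope or lam < 1e-12:
                break
            lam *= 0.5
        new_g = grad(new_theta)
        # step (iv): BFGS update with p = change in theta, q = change in g
        p = [a - b for a, b in zip(new_theta, theta)]
        q = [a - b for a, b in zip(new_g, g)]
        pq = sum(pi * qi for pi, qi in zip(p, q))
        if pq > 1e-14:
            Hq = [sum(H[i][j] * q[j] for j in range(n)) for i in range(n)]
            qHq = sum(qi * hi for qi, hi in zip(q, Hq))
            for i in range(n):
                for j in range(n):
                    H[i][j] += ((1.0 + qHq / pq) * p[i] * p[j]
                                - p[i] * Hq[j] - Hq[i] * p[j]) / pq
        theta, g = new_theta, new_g
    return theta

# Hypothetical test function with minimum at (1, -2).
f = lambda t: (t[0] - 1.0) ** 2 + 10.0 * (t[1] + 2.0) ** 2
grad = lambda t: [2.0 * (t[0] - 1.0), 20.0 * (t[1] + 2.0)]
print(bfgs_minimize(f, grad, [0.0, 0.0]))
```

Only function values and gradients are required, which is the property that makes this family of methods attractive when second derivatives are hard to derive.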

To apply the above Fletcher-Powell method in minimizing our function $F_2$, what we require are the gradient vector $\partial F_2 / \partial \theta_2$, an initial estimate of the parameters, and a starting positive definite matrix $H^{(0)}$. The identity matrix is used for $H^{(0)}$ if no better initial matrix for H is available. The expressions for $\partial F_2 / \partial \theta_2$ are easily obtained using (2.19), (2.20) and (2.21), and are given as follows:

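(i) $\partial F_2 / \partial b_i = -\sum_{k} \sum_{i(k)=1}^{f(k)} (*)$,

where (*) is the right hand side of (2.19),

i = 1, 2, ..., n

(2.23)

(ii) $\partial F_2 / \partial a_{i,m(i)} = -\sum_{k(j),\, j \ne i} \left[ \sum^{f(k,m)} (**) + \sum^{f(k,m-1)} (***) \right]$,

where

f(k,m) = f(k(1), ..., k(i-1), m(i), k(i+1), ..., k(n));

f(k,m-1) = f(k(1), ..., k(i-1), m(i)-1, k(i+1), ..., k(n));

(**) is the right hand side of (2.20) with i(i) = 0 in numerator and k(i) = m(i),

(***) is the right hand side of (2.20) with i(i) = 1 in numerator and k(i) = m(i) - 1.

i = 1, 2, ..., n

m(i) = 2, 3, ..., n(i)

(2.24)

(iii) $\partial F_2 / \partial r_{ij} = -\sum_{k} \sum_{i(k)=1}^{f(k)} (*)$,

where (*) is the right hand side of (2.21),

i, j = 1, 2, ..., n

i ≠ j

(2.25)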

In general, the algorithm defined by (2.22) is quite robust to the starting value of the parameters. However, experience indicates that a good starting value reduces the time to convergence. Therefore, a sample estimate based on $(x_i', z_i')$, i = 1, 2, ..., N, is used as the initial value of $\theta_2$. The following sample quantities are used:

the sample standard deviation of $Z_i$;

the sample covariance of X and $Z_i$ divided by this standard deviation;

the sample correlation of $Z_i$ and $Z_j$; and

the inverse normal transforms of the cumulative marginal proportions of $Z_i$,

i = 1, 2, ..., n

h = 2, 3, ..., n(i)

where

$\Phi^{-1}$ is the inverse of the standard univariate normal distribution function;

f is the observed proportion in the (k(1), k(2), ..., k(n))th cell.

Then the starting values of $b_i$, $a_{i,h}$ and $r_{ij}$ can be obtained by

(2.26)

(2.27)

(2.28)
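The threshold part of the starting values can be sketched as follows. The exact forms of (2.26) to (2.28) are not legible in this copy, so the code covers only the generic ingredient named in the text, namely the inverse normal transform of the cumulative marginal proportions of one polytomous variable. The function name and the guard against proportions of exactly 0 or 1 are assumptions.

```python
from statistics import NormalDist

def initial_thresholds(z, n_cat):
    """Starting values for the finite thresholds of one polytomous
    variable: the inverse standard normal distribution function applied
    to the cumulative marginal proportions of categories 1..n_cat-1."""
    n = len(z)
    counts = [0] * n_cat
    for v in z:
        counts[v - 1] += 1
    inv = NormalDist().inv_cdf
    cuts, cum = [], 0
    for c in counts[:-1]:
        cum += c
        # keep the proportion strictly inside (0, 1) so inv_cdf is defined
        p = min(max(cum / n, 0.5 / n), 1.0 - 0.5 / n)
        cuts.append(inv(p))
    return cuts

# Hypothetical sample: 10%, 30% and 60% of observations in categories 1, 2, 3.
z = [1] * 10 + [2] * 30 + [3] * 60
print(initial_thresholds(z, 3))
```

These values are only a starting point; in the general model they would still need the rescaling implied by the reparameterization before being passed to the optimizer.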

4. Example

Computer programs written in FORTRAN IV with double precision have been implemented to obtain the maximum likelihood estimate of $\theta$ with the dimension of Y equal to 1, 2 and 3 respectively. The algorithm is based on the modified Fletcher-Powell procedure discussed in Section 3. In the expressions for $\partial F_2 / \partial \theta_2$ in (2.23), (2.24) and (2.25), the distribution functions of normal variates $\Phi_1$, $\Phi_2$ and $\Phi_3$ are involved. To evaluate these functions, numerical integration is required. The programs use the subroutine DCADRE from the IMSL (1975) library to obtain $\Phi_1$, and the subroutine BINORM developed by Divgi (1979) to obtain $\Phi_2$. In evaluating $\Phi_3$, the result of the theorem in Section 2.2 is used. Firstly, a single-argument function, the integrand of the theorem with n = 3, is implemented with BINORM, where $\rho_{ij}$ is the (i,j)th entry of R and the correlation passed to BINORM is the corresponding partial correlation. Then the numerical integral of this function over the range $(-\infty, a)$ is computed by DCADRE. However, it must be pointed out that although $\Phi_1$ and $\Phi_2$ can be obtained efficiently, evaluating $\Phi_3$ is computationally rather expensive, since it involves both the subroutines DCADRE and BINORM.

The following example is based on N = 100 simulated observations from a standard multivariate normal distribution with the dimensions of X and Y equal to 2 and 3 respectively. Each $(x_i', y_i')$ was transformed to $(x_i', z_i')$ by (2.1) with pre-assigned thresholds.

The parameters in $\theta$ were estimated using the random sample $\{(x_i', z_i'),\ i = 1, 2, \ldots, N\}$. With the convergence criterion $\epsilon = 0.001$, the algorithm converged quickly, in 3 iterations. To give some idea of the behaviour of the algorithm, the convergence summary is presented in Table 2.1. The maximum likelihood estimates and their standard error estimates are reported in Table 2.2. For information, the Pearson product-moment sample covariances (i.e. the maximum likelihood estimates based on the continuous data) between X and Y and the Pearson product-moment correlations of the variables within Y are also reported. We see that the estimates of the polyserial covariances and polychoric correlations obtained from our maximum likelihood method are very close to the maximum likelihood estimates from the continuous observations of X and Y. We also note that the maximum likelihood estimates of the thresholds are close to the pre-assigned thresholds.

Chapter 3

Special Models

In this chapter, we discuss two special cases of the general model described in Chapter 2. We call them the polyserial correlation model and the multivariate polychoric correlation model. These models are particularly interesting, not only because they have wide practical applicability but also because they subsume many studies in the literature as special cases.

1. Polyserial Correlation Model

In this model we suppose that the dimension of Y in our general model is equal to 1. Thus we have a continuous random vector X and a latent variable Y. The observed discrete variable Z is related to Y by

$$Z = k \quad \text{if} \quad \alpha_k \le Y < \alpha_{k+1}, \qquad k = 1, 2, \ldots, t. \tag{3.1}$$

The correlations c between X and Y are the polyserial correlations. Certain special cases of this model have received a good deal of attention in the literature (Tate, 1955a,b; Hannan & Tate, 1965; Prince & Tate, 1966; Cox, 1974; Olsson et al., 1982). The maximum likelihood estimation of the complete model has been provided by Lee & Poon (1985). Their treatment is briefly described in the remainder of this section.

The parameter vector $\theta$ in this model contains e, $C_{xx}$, c and the thresholds $\alpha$.

Using a maximum likelihood approach similar to that described in Chapter 2, the parameters in $\theta$ are estimated based on a random sample $(x_{kl}, z_{kl})$ for k = 1, 2, ..., t, l = 1, 2, ..., n(k), with $z_{kl} = k$ and n(k) being the corresponding total number of observations. More specifically, the maximum likelihood estimates of e and $C_{xx}$ are equal to the sample mean and the sample covariance matrix respectively, while the maximum likelihood estimates of $\alpha$ and c are obtained from the maximum likelihood estimate of $\theta_2 = (a, b)$, where a and b are defined by the transformation

(3.2)

By similar reasoning as in Chapter 2, we see that the maximum likelihood estimates of b and a can be obtained by minimizing the function

$F_2(b, a)$

(3.3)

Again, the estimates of b and a cannot be obtained algebraically in closed form, so some nonlinear optimization procedure is also needed. Since the expression for $F_2$ in (3.3) is much simpler than that in (2.9), instead of using the Fletcher-Powell algorithm given in (2.22), we can use the following more efficient Newton-Raphson algorithm:

$$\theta^{(k+1)} = \theta^{(k)} - \gamma \, H(\theta^{(k)})^{-1} g(\theta^{(k)}), \tag{3.4}$$

where $g(\theta) = \partial F_2 / \partial \theta$ is the gradient vector, $H(\theta) = \partial^2 F_2 / \partial \theta \, \partial \theta'$ is the Hessian matrix, and $\gamma$ is a step-size parameter which takes the first value in the sequence 1, 1/2, 1/4, ... that reduces $F_2$. It is well known in mathematical programming (Bard, 1975) that if H is positive definite, the algorithm is very efficient. With the starting values obtained via equations similar to (2.26) and (2.27), experience indicates that the Hessian matrix is always positive definite.
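A Newton-Raphson iteration with step halving of this kind can be sketched as below. The objective used here is a hypothetical convex test function, not the likelihood function of the polyserial model; the helper `solve` and all other names are likewise assumptions made so that the example is self-contained.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= m * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def newton_step_halving(F, grad, hess, theta, eps=1e-8, max_iter=50):
    """Newton-Raphson with gamma taken as the first of 1, 1/2, 1/4, ...
    that reduces F."""
    for _ in range(max_iter):
        g = grad(theta)
        if math.sqrt(sum(v * v for v in g) / len(g)) < eps:
            break
        d = solve(hess(theta), g)            # d = H^{-1} g
        gamma, f0 = 1.0, F(theta)
        while gamma > 1e-10:
            trial = [t - gamma * di for t, di in zip(theta, d)]
            if F(trial) < f0:
                break
            gamma *= 0.5
        theta = trial
    return theta

# Hypothetical convex objective with minimum at (ln 2, 1).
F = lambda t: math.exp(t[0]) - 2.0 * t[0] + (t[1] - 1.0) ** 2
grad = lambda t: [math.exp(t[0]) - 2.0, 2.0 * (t[1] - 1.0)]
hess = lambda t: [[math.exp(t[0]), 0.0], [0.0, 2.0]]
print(newton_step_halving(F, grad, hess, [3.0, -4.0]))
```

The Hessian evaluated at the final iterate is a by-product of this scheme, which is what allows it to double as an approximation to the information matrix.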

Based on the results

and

if k = h

otherwise

the expressions for $g(\theta) = \partial F_2 / \partial \theta$ can be found and are given by

(3.5)

It should be noted that (3.5) and (3.6) are simple special cases of (2.23) and (2.24). Differentiating these expressions once more, we have

(3.8)

(iii)

if m = h - 1

if m = h

if m = h + 1

otherwise = 0

(3.9)

The asymptotic covariance matrix of the maximum likelihood estimate of $(c, \alpha)$ is again given by

$K' I^{-1} K$

where I is the information matrix and K is the matrix of derivatives of the transformation. The expressions in K can be obtained as special cases of (i), (v) and (vii) of (2.14).

Unlike the Fletcher-Powell method, the Newton-Raphson algorithm produces not only the maximum likelihood estimate of $\theta_2$ but also an approximation of the information matrix. Note that, by definition,

$I(\theta) = E(H(\theta))$.

Since the Hessian matrix converges in probability to its expectation, we may use the Hessian matrix to approximate the information matrix. Therefore, the standard errors of the estimates of c and $\alpha$ can be obtained by using

$K' (H(\hat\theta))^{-1} K$

Based on the above method, the example reported in Cox (1974) was reanalyzed as a special case of this model. With the convergence criterion $\epsilon$ set to 0.0001, the program converged extremely rapidly, in two iterations. The convergence summary is reported in Table 3.1, while the maximum likelihood estimates and their standard errors are reported in Table 3.2. The results are almost identical to those reported in Cox (1974).

The next example is based on N = 100 simulated observations from a multivariate normal distribution with the dimension of $(X', Y)$ equal to 6. Each $(x_i', y_i)$ was transformed to $(x_i', z_i)$ by (3.1) with the pre-assigned thresholds $(-\infty, -1.0, 0.0, 0.6, 1.0, 1.3, \infty)$. The parameters in $\theta$ were estimated using the random sample $\{(x_i', z_i),\ i = 1, 2, \ldots, 100\}$. The program converged quickly to its solution after 4 iterations; the convergence summary of this run is presented in Table 3.3. The maximum likelihood estimates and their standard error estimates are reported in Table 3.4. For information, the Pearson product-moment covariances between X and Y are also reported. The polyserial covariances obtained from the maximum likelihood method are very close to the sample covariances between X and Y. It is also noted that the maximum likelihood estimates of the thresholds are very accurate.

2. Multivariate Polychoric Correlation Model

In this model, we suppose that we have only an n-dimensional latent vector Y. The observed discrete vector Z is related to Y similarly by

$$Z_i = k(i) \quad \text{if} \quad \alpha_{i,k(i)} \le Y_i < \alpha_{i,k(i)+1}, \qquad k(i) = 1, 2, \ldots, n(i),$$

with $\alpha_{i,1} = -\infty$ and $\alpha_{i,n(i)+1} = \infty$ for all i.

Based on the frequencies f(k(1), k(2), ..., k(n)), k(i) = 1, 2, ..., n(i), in the multivariate contingency table, the polychoric correlations $\rho_{ij}$ between $Y_i$ and $Y_j$ and the thresholds can be estimated using the maximum likelihood approach discussed in Chapter 2. Studies of various special cases of this model can be found in the literature (Pearson, 1901; Tallis, 1962; Martinson & Hamdan, 1971; Lee, 1984).

The parameter vector $\theta$ of this model contains the thresholds $\alpha_{i,k}$ and the polychoric correlations $\rho_{ij}$.

Let $\Pr(k) = P(Z_1 = k(1), \ldots, Z_n = k(n))$. The negative log-likelihood function is given by

$$F(\theta) = -\sum_{k(1)=1}^{n(1)} \cdots \sum_{k(n)=1}^{n(n)} f(k) \log \Pr(k). \tag{3.10}$$

It should be noted that the function F in (3.10) is a special case of $F_2$ given in (2.9), obtained by defining $a_{i,k} = \alpha_{i,k}$, $b_i = 0$ and $r_{ij} = \rho_{ij}$. Therefore, to find the maximum likelihood estimates of the parameter vector $\theta$ in this model, the computer programs developed in Chapter 2 for finding the maximum likelihood estimates of the general model can be used with $b_i$ set equal to zero.
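For the smallest case, a 2 x 2 table, the negative log-likelihood can be minimized over the correlation alone once the thresholds are fixed. The sketch below is an illustration rather than the thesis program: it uses the simpler two-step scheme (thresholds fixed at the inverse normal transforms of the marginal proportions) instead of full maximum likelihood, a hypothetical table of frequencies, a Simpson-rule bivariate normal integral, and a golden-section search; all names are assumptions.

```python
import math
from statistics import NormalDist

STD = NormalDist()

def bvn_cdf(a, b, rho, lower=-8.0, steps=400):
    """P(Y1 <= a, Y2 <= b) for a standard bivariate normal (Simpson's rule)."""
    if a <= lower:
        return 0.0
    s = math.sqrt(1.0 - rho * rho)
    g = lambda x: STD.pdf(x) * STD.cdf((b - rho * x) / s)
    h = (a - lower) / steps
    total = g(lower) + g(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(lower + i * h)
    return total * h / 3.0

def neg_loglik(rho, table, a1, a2):
    """Minus the sum of f(k) * log Pr(k) over the four cells."""
    p11 = bvn_cdf(a1, a2, rho)
    p1, p2 = STD.cdf(a1), STD.cdf(a2)
    probs = [[p11, p1 - p11], [p2 - p11, 1.0 - p1 - p2 + p11]]
    return -sum(table[i][j] * math.log(max(probs[i][j], 1e-300))
                for i in range(2) for j in range(2))

def golden_min(fun, lo, hi, tol=1e-6):
    """Golden-section search for the minimizer of a unimodal function."""
    w = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - w * (hi - lo), lo + w * (hi - lo)
    fc, fd = fun(c), fun(d)
    while hi - lo > tol:
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - w * (hi - lo)
            fc = fun(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + w * (hi - lo)
            fd = fun(d)
    return (lo + hi) / 2.0

def two_step_tetrachoric(table):
    """table[i][j] is the frequency of (Z1 = i+1, Z2 = j+1); all margins
    must be non-empty."""
    n = sum(map(sum, table))
    a1 = STD.inv_cdf(sum(table[0]) / n)                 # threshold of Z1
    a2 = STD.inv_cdf((table[0][0] + table[1][0]) / n)   # threshold of Z2
    return golden_min(lambda r: neg_loglik(r, table, a1, a2), -0.99, 0.99)

rho_hat = two_step_tetrachoric([[400, 200], [200, 400]])  # hypothetical table
print(rho_hat)
```

The hypothetical table was chosen so that both thresholds are zero and the first cell has proportion 1/3, which corresponds to a correlation of 0.5 under the identity P(Y1 <= 0, Y2 <= 0) = 1/4 + arcsin(rho)/(2 pi).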

This is in fact not unexpected: since the behaviour of Z no longer depends on X, the polyserial covariances should equal zero.

The standard error of the estimate of $\theta$ can be obtained from the corresponding information matrix I. The (c,d)th element of I is given by

$$N \sum_{k} \Pr(k) \, \frac{\partial \log \Pr(k)}{\partial \theta_c} \, \frac{\partial \log \Pr(k)}{\partial \theta_d}. \tag{3.11}$$

The expressions for $\partial \log \Pr(k) / \partial \theta$ can be obtained easily via (2.19) and (2.20), and hence are not presented here. As a result, the standard error estimates can be directly obtained via (3.11). Moreover, it is interesting to note that since in this case

$$(I_A)_{cd} = \sum_{k} f(k) \, \frac{\partial \log \Pr(k)}{\partial \theta_c} \, \frac{\partial \log \Pr(k)}{\partial \theta_d},$$

the convergence in probability of $I_A$ to I is easily seen.

Based on the 3 x 3 contingency table given by Olsson (1979), five parameters were re-estimated using the above method. We found that the result is identical to Olsson's; it is reported in Table 3.5.

Chapter 4

Partition Maximum Likelihood (PML) Estimation

1. The PML Procedure

The method discussed in Chapter 2 gives the maximum likelihood estimates of the parameter vector $\theta$ in the general model. Maximum likelihood estimates are desirable because they possess many good statistical properties. However, the cost is that when the dimension of the latent vector Y is high, the method requires a great deal of computer time. The problem arises from the calculation of the multiple integrals in the multivariate normal distribution functions. Theoretically, these distribution functions can be calculated recursively using the theorem in Section 2.2, but in practice this is computationally expensive. As a result, the algorithm becomes less and less efficient as the dimension of Y increases. It is therefore of interest to seek a more efficient method for finding the estimate of $\theta$, especially when the dimension of Y is large.

We now propose a new method, which we call the partition maximum likelihood (PML) method, for estimating the parameters in the general model. Its basic features are as follows:

(i) The mean vector e and the covariance matrix $C_{xx}$ are estimated by the sample estimates as before, based on random observations of X.

(ii) For each i = 1, 2, ..., n, the polyserial correlations between X and $Y_i$ are estimated based on the observed random sample corresponding to $(X', Z_i)$. Since the dimension of $Z_i$ is one, the underlying model can be regarded as a polyserial correlation model. Hence the efficient Newton-Raphson algorithm developed in Section 3.1 can be employed to get the maximum likelihood estimates. This gives the partition maximum likelihood estimates $\hat c_i$ of $c_i$ and a set of estimates of the thresholds, for i = 1, 2, ..., n.

(iii) For i, j = 1, 2, ..., n, i ≠ j, the polychoric correlation $\rho_{ij}$ is estimated based on the n(i) x n(j) contingency table which contains the observed frequencies corresponding to $Z_i$ and $Z_j$. We treat this as a special 2-dimensional polychoric correlation model. In this simple model, the computational task of obtaining the maximum likelihood estimates of $\rho_{ij}$ and the thresholds is light. This gives the partition maximum likelihood estimates $\hat\rho_{ij}$ of $\rho_{ij}$ and another (n-1) sets of threshold estimates for each variable.

The heavy computational burden of obtaining the maximum likelihood
estimates in the general model is mainly due to the evaluation of
multivariate normal distribution functions, which requires the
computation of multiple integrals. In the PML method, we separate the
large general model into many small models. To obtain the partition
maximum likelihood estimates in these small models, we only need to
compute single and double integrals instead of complicated multiple
integrals, so a great deal of computer time can be saved. One
shortcoming of this method is that it produces n sets of threshold
estimates for each Z_i: one from the polyserial correlation model and
(n - 1) from the 2-dimensional polychoric correlation models. However,
as we will see from the results of a simulation study, the differences
among these estimates are very small.
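The partition described above can be made concrete with a short sketch. The function name `pml_submodels` and the labels "X", "Z1", "Z2", ... are illustrative assumptions, not notation from this thesis; the code only enumerates the small models and counts how many separate threshold estimates each polytomous variable receives.

```python
from itertools import combinations

def pml_submodels(n):
    """List the small models fitted by the PML method for n polytomous variables.

    Returns (polyserial, polychoric, threshold_sets):
      polyserial     - one model per Z_i, pairing the whole vector X with Z_i
      polychoric     - one 2-dimensional model per unordered pair (Z_i, Z_j)
      threshold_sets - number of separate threshold estimates obtained for each Z_i
    """
    names = [f"Z{i}" for i in range(1, n + 1)]
    polyserial = [("X", z) for z in names]
    polychoric = list(combinations(names, 2))
    # Each Z_i appears in one polyserial model and in (n - 1) polychoric models.
    threshold_sets = {z: 1 + sum(z in pair for pair in polychoric) for z in names}
    return polyserial, polychoric, threshold_sets

polyserial, polychoric, threshold_sets = pml_submodels(3)
print(len(polyserial), len(polychoric), threshold_sets)
```

For n polytomous variables this gives n polyserial models and n(n - 1)/2 polychoric models, and every Z_i ends up with n sets of threshold estimates, which is the shortcoming noted in the text.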

2. Example

Using the PML method, the example given in Section 4 of Chapter 2 is
reanalyzed. The estimates given by the PML method are almost identical
to those given by the maximum likelihood approach. The results are
reported in Table 4.1. The threshold estimates and their standard errors
reported in this table are the means of the estimates obtained from the
polyserial correlation model and the various 2-dimensional polychoric
correlation models.

3. Simulation

The main purpose of this simulation study is to compare the performance
of various kinds of estimates, e.g. the maximum likelihood estimates and
the partition maximum likelihood estimates. The study is based on
simulated data (sample size N = 40, 70, 100) from a multivariate normal
distribution in which the dimensions of X and Y are both equal to two.
The population mean vector of this distribution is taken to be 0, while
the covariance matrix is taken to be

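$$
\begin{pmatrix}
1.0 &     &     &     \\
0.0 & 1.0 &     &     \\
0.8 & 0.0 & 1.0 &     \\
0.3 & 0.1 & 0.4 & 1.0
\end{pmatrix}
$$

(lower triangle of the symmetric matrix, with rows and columns ordered
as X_1, X_2, Y_1, Y_2; the polyserial and polychoric entries agree with
the population values listed in Tables 4.2 and 4.3).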

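The standardized simulated random vector (x_i', y_i')' was transformed to
(x_i', z_i')' with the following sets of pre-assigned thresholds:

(I)   a_1 = (-1.5, -1.0, 0.0, 1.0),   a_2 = (-1.0, 0.0, 1.0, 1.5)

and

(II)  a_1 = a_2 = (-1.0, 0.0, 1.0, 1.5),

where a_i = (a_i,2, a_i,3, a_i,4, a_i,5) are the true values listed in
Tables 4.4 and 4.5.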

Note that the first set of thresholds was selected so that the
distributions of Z_1 and Z_2 are skewed in opposite directions, while the
second set was selected so that both distributions are skewed to the
right. For each case, 50 replications were generated. The estimates were
obtained using both the maximum likelihood and the PML approaches.
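As a rough illustration of how the pre-assigned thresholds shape the observed variables, the sketch below computes the category probabilities implied by each threshold set under a standard normal latent variable, and maps a latent value to its category. The threshold values are the true values listed in Tables 4.4 and 4.5; the function names and the coding of categories as 1, 2, ..., 5 (with a value equal to a threshold falling in the upper category) are assumptions made for the example.

```python
import math
from bisect import bisect_right

def normal_cdf(x):
    """Standard normal distribution function (works for +/- infinity)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def category_probabilities(thresholds):
    """Probability of each category when a N(0, 1) variable is cut at the thresholds."""
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    return [normal_cdf(upper) - normal_cdf(lower) for lower, upper in zip(cuts, cuts[1:])]

def discretize(y, thresholds):
    """Return the category (1, 2, ...) of the latent value y."""
    return bisect_right(thresholds, y) + 1

# True thresholds as listed in Tables 4.4 (set I) and 4.5 (set II).
SET_I = {"Z1": [-1.5, -1.0, 0.0, 1.0], "Z2": [-1.0, 0.0, 1.0, 1.5]}
SET_II = {"Z1": [-1.0, 0.0, 1.0, 1.5], "Z2": [-1.0, 0.0, 1.0, 1.5]}

for label, thresholds in SET_I.items():
    print(label, [round(p, 3) for p in category_probabilities(thresholds)])
```

Under set (I) the category probabilities of Z_1 are roughly 0.07, 0.09, 0.34, 0.34, 0.16 and those of Z_2 are the same values in reverse order, which is what "skewed in opposite directions" amounts to; under set (II) both variables share the second pattern.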

The simulation results for the polyserial and polychoric correlation
estimates are reported in Table 4.2 and Table 4.3. These tables are
based on random vectors generated with the first and the second set of
thresholds, respectively. The mean estimates and the root mean square
errors (RME) are reported. The RME columns c1-c2, c1-c3 and c1-c4 are
used to examine the discrepancy between the various estimates and the
true value. The columns c2-c3 and c2-c4 are used to examine the
discrepancy between Pearson's product correlation and, respectively, the
maximum likelihood estimates and the partition maximum likelihood
estimates. Finally, the column c3-c4 is used to examine the discrepancy
between the maximum likelihood and the partition maximum likelihood
estimates. From Tables 4.2 and 4.3, we see that there is very little
evidence of bias in the partition maximum likelihood and maximum
likelihood estimates; their mean estimates are virtually identical.
From the RME columns, the following interesting phenomena are observed.

(i) The maximum likelihood estimates perform very well with small and
moderate samples and under both types of skewed distributions. As
expected, increasing the sample size decreases the RME, and the RME
corresponding to the large correlation is relatively small (see columns
c1-c3 and c2-c3).

(ii) The behaviour of the partition maximum likelihood estimates is
almost identical to that of the maximum likelihood estimates (see
columns c1-c4 and c2-c4).

(iii) In all situations, there is little difference between the
partition maximum likelihood and the maximum likelihood estimates,
especially for the polyserial correlation estimates. Note also that the
RME is large when the true correlation is large (see column c3-c4).
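The RME columns referred to above compare two columns of values over the replications. The sketch below assumes the usual definition, the square root of the mean squared difference between paired values; the function name `rme` and the three replicate values are invented for illustration and are not taken from the simulation.

```python
import math

def rme(a, b):
    """Root mean square of the differences between two equally long sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Hypothetical estimates of one correlation over three replications.
true_value = [0.40, 0.40, 0.40]   # column c1: population value, repeated
mle = [0.45, 0.38, 0.41]          # column c3: maximum likelihood estimates
pml = [0.44, 0.38, 0.42]          # column c4: partition maximum likelihood estimates

print("c1-c3:", rme(true_value, mle))
print("c3-c4:", rme(mle, pml))
```

With this definition, a c1-c3 entry measures how far the maximum likelihood estimates fall from the true value, while a c3-c4 entry measures how far the two kinds of estimates fall from each other within the same replications.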

The results for the threshold estimates are presented in Table 4.4 and
Table 4.5. Again the mean estimates and various RME are reported. We
observe that the behaviour of the threshold estimates is very similar to
that of the correlation estimates: the mean estimates for the partition
maximum likelihood and maximum likelihood estimates are almost the same;
the bias of both types of estimates is tiny even with small and moderate
samples and with different types of skewed distributions; and the
discrepancy between the partition maximum likelihood estimates and the
maximum likelihood estimates is very small. From columns c1-c3, c1-c4,
c2-c3 and c2-c4, we see that the threshold estimates obtained from the
polyserial correlation model are better than those from the polychoric
model. This is natural because more of the information in the data is
used by the polyserial correlation model.

Chapter 5

Summary and Discussion

In this thesis, the maximum likelihood estimation method is developed
for estimating the parameters in a multivariate normal model in which
some of the component variables are observable only in polytomous form.
The parameters include the mean vector and the covariance matrix of the
continuous random vector, the polyserial correlations between the
continuous random variables and the latent variables, the polychoric
correlations among the latent variables, and the thresholds. By means of
an appropriate transformation, the complicated likelihood function that
involves these parameters is transformed into a comparatively simple
function that involves only the parameters of interest: the polyserial
correlations, the polychoric correlations and the thresholds. The
negative log-likelihood function is minimized via the Fletcher-Powell
algorithm, giving the maximum likelihood estimates of the parameters.
The standard error estimates of the parameter estimates are obtained
from the inverse of an approximation of the information matrix. Two
important special cases that have wide applicability are studied: the
polyserial correlation model and the multivariate polychoric correlation
model. In the polyserial correlation model, the classical Newton-Raphson
algorithm is used to produce the maximum likelihood estimates and their
standard errors. As expected, we found that the Newton-Raphson algorithm
is extremely efficient.

Although we have developed all the essential theoretical aspects of the
maximum likelihood estimates, in practice it takes a long computer time
to obtain the estimates, especially when n, the dimension of the random
vector observable only in polytomous form, is large. This is because
computing the function value and the gradient vector requires the
evaluation of many multiple integrals. To overcome these practical
difficulties, we propose another estimation approach, namely the
partition maximum likelihood approach. This approach requires much less
computer time than the maximum likelihood approach. Moreover, our
simulation results show that the discrepancy between the partition
maximum likelihood estimates and the maximum likelihood estimates is
extremely small. Therefore, the partition maximum likelihood approach
represents a very attractive method which can produce accurate estimates
efficiently.

Still another method is to repeatedly use one continuous variable and
one discrete variable, say X_i and Z_j, to estimate the polyserial
correlations and the thresholds; the process is continued until all the
polyserial correlations are estimated using the underlying discrete
variables. In the context of the polyserial correlation model, Lee &
Poon (1985) demonstrated that this method is inferior to the partition
maximum likelihood method. Hence, in the context of the general model,
we anticipate that this method is also inferior to the partition maximum
likelihood method.

Based on a rationale similar to that of Olsson, Drasgow & Dorans (1982),
two-step estimates of the polyserial and the polychoric correlations
might be obtained as follows. The thresholds are first estimated from
the cumulative proportions; the correlation estimates are then obtained
with the thresholds fixed at their estimates. As this reduces the number
of parameters to be estimated iteratively, it may require less computer
time to obtain the estimates. However, we expect that a great deal of
computer time will still be necessary because the evaluation of multiple
integrals is still required. Therefore, it would be of significant
practical value to develop similar partition two-step maximum likelihood
estimates.
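To make the two-step idea concrete in the simplest setting, the sketch below applies it to a single two-way contingency table, i.e. one 2-dimensional polychoric model. It is an illustration under stated assumptions rather than the procedure used in this thesis: the bivariate normal probabilities are computed by Simpson's rule on a single integral, the correlation is found by a golden-section search on (-0.95, 0.95) on the assumption that the likelihood has a single peak there, all marginal counts are assumed positive, and every function name is hypothetical.

```python
import math
from statistics import NormalDist

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal distribution function (works for +/- infinity)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def Phi2(h, k, rho, steps=200):
    """P(Y1 <= h, Y2 <= k) for a standard bivariate normal with correlation rho.

    Simpson's rule on int_{-8}^{h} phi(x) * Phi((k - rho*x) / sqrt(1 - rho^2)) dx.
    """
    lo, hi = -8.0, min(h, 8.0)
    if hi <= lo:
        return 0.0
    s = math.sqrt(1.0 - rho * rho)
    kk = min(k, 8.0)  # +infinity is truncated at 8 standard deviations

    def f(x):
        return phi(x) * Phi((kk - rho * x) / s)

    width = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * width)
    return total * width / 3.0

def thresholds_from_margin(counts):
    """Step 1: thresholds as normal quantiles of the cumulative proportions."""
    total = float(sum(counts))
    cumulative, cuts = 0.0, []
    for count in counts[:-1]:
        cumulative += count
        cuts.append(NormalDist().inv_cdf(cumulative / total))
    return cuts

def log_likelihood(table, a, b, rho):
    """Multinomial log-likelihood of the table with thresholds a and b held fixed."""
    A = [-math.inf] + a + [math.inf]
    B = [-math.inf] + b + [math.inf]
    value = 0.0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            p = (Phi2(A[i + 1], B[j + 1], rho) - Phi2(A[i], B[j + 1], rho)
                 - Phi2(A[i + 1], B[j], rho) + Phi2(A[i], B[j], rho))
            value += n_ij * math.log(max(p, 1e-12))
    return value

def two_step_polychoric(table, iterations=30):
    """Step 2: golden-section search for the correlation on (-0.95, 0.95)."""
    a = thresholds_from_margin([sum(row) for row in table])
    b = thresholds_from_margin([sum(col) for col in zip(*table)])
    lo, hi = -0.95, 0.95
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iterations):
        x1, x2 = hi - g * (hi - lo), lo + g * (hi - lo)
        if log_likelihood(table, a, b, x1) < log_likelihood(table, a, b, x2):
            lo = x1  # the maximum lies to the right of x1
        else:
            hi = x2  # the maximum lies to the left of x2
    return a, b, 0.5 * (lo + hi)

# Hypothetical 2 x 2 table whose cell proportions match a correlation of 0.5.
a_hat, b_hat, rho_hat = two_step_polychoric([[200, 100], [100, 200]])
print(a_hat, b_hat, round(rho_hat, 3))
```

Because the thresholds are fixed in the first step, the second step is a one-dimensional search that needs only double integrals, which is why a partition version of the two-step estimates would avoid the multiple integrals mentioned above.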

Another direction for further research is to apply the results of this
thesis to other multivariate techniques that are based on correlations
or covariances. Examples of these techniques are canonical correlation
analysis, principal component analysis, factor analysis and covariance
structure analysis. One approach is to apply the estimated
correlation/covariance matrix directly to obtain the results.
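A minimal sketch of the "direct" approach mentioned above: the separately estimated pieces are assembled into one correlation matrix that another multivariate technique could take as input. The function names are hypothetical. Because the pieces are estimated in separate small models, the assembled matrix is not guaranteed to be positive definite, so the sketch also includes a simple check based on a Cholesky factorization. In the usage example, the polyserial and polychoric values are the population values from Table 4.2, while the zero correlation between X_1 and X_2 is an assumption made for illustration.

```python
import math

def assemble_correlation(r_xx, c, rho):
    """Combine the estimated pieces into one (m + n) x (m + n) correlation matrix.

    r_xx : m x m correlation matrix of the continuous variables X
    c    : n x m polyserial correlations, c[i][j] = corr(Y_i, X_j)
    rho  : n x n polychoric correlations (symmetric, unit diagonal)
    """
    m, n = len(r_xx), len(c)
    size = m + n
    R = [[0.0] * size for _ in range(size)]
    for i in range(m):
        for j in range(m):
            R[i][j] = r_xx[i][j]
    for i in range(n):
        for j in range(m):
            R[m + i][j] = R[j][m + i] = c[i][j]
        for j in range(n):
            R[m + i][m + j] = rho[i][j]
    return R

def is_positive_definite(R):
    """Attempt a Cholesky factorization; report failure if a pivot is not positive."""
    size = len(R)
    L = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1):
            s = R[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True

R = assemble_correlation(
    r_xx=[[1.0, 0.0], [0.0, 1.0]],   # assumed for illustration
    c=[[0.8, 0.0], [0.3, 0.1]],      # c11, c12, c21, c22 from Table 4.2
    rho=[[1.0, 0.4], [0.4, 1.0]],    # rho12 from Table 4.2
)
print(is_positive_definite(R))
```

If the check fails for a matrix of estimates, techniques that require a proper correlation matrix would need some adjustment before they can be applied; what adjustment is appropriate is outside the scope of this sketch.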

The results of this thesis are developed under the assumption that the
variables are normally distributed. Hence, if the distribution is
unknown a priori, there is no statistical justification for the maximum
likelihood approach or the partition maximum likelihood approach.
Kraemer (1981) studied some modified biserial correlation coefficients
which require less restrictive assumptions than bivariate normality. In
the general case, this robustness problem remains an interesting
research topic.

Table 2.1

Convergence Summary of the F-P Algorithm

Iteration      F        RMS       b21     a1,2     a2,4     a3,5     r12
    0        4.1619   0.0078    0.128   -1.038    0.718    1.088   -0.127
    1        4.1618   0.0093    0.142   -1.043    0.715    1.095   -0.138
    2        4.1609   0.0011    0.139   -1.038    0.722    1.092   -0.134
             4.1609   0.0009    0.139   -1.038    0.722    1.092   -0.134

F denotes the function value
RMS denotes the root mean square of the gradient

Table 2.2

Maximum Likelihood Estimates of the Parameters

Parameters   P.Value     PPC       MLE      S.E.
c11            0.0      0.231     0.289    0.108
c12            0.0     -0.103    -0.046    0.120
c21            0.0     -0.134    -0.123    0.098
c22            0.0     -0.107    -0.136    0.119
c31            0.0      0.016    -0.032    0.127
c32            0.0     -0.134    -0.094    0.136
a1,2          -1.0               -0.993    0.171
a1,3          -0.2               -0.215    0.128
a1,4           0.8                0.838    0.153
a2,2          -0.8               -0.774    0.156
a2,3           0.0               -0.066    0.140
a2,4           0.8                0.708    0.144
a3,2          -1.2               -1.401    0.202
a3,3          -0.4               -0.286    0.140
a3,4           0.2                0.248    0.139
a3,5           1.0                1.086    0.174
ρ12            0.0     -0.163    -0.159    0.125
ρ13            0.0      0.009    -0.021    0.125
ρ23            0.0      0.032     0.094    0.125

P.Value denotes the population value
PPC denotes the Pearson's product correlation
MLE denotes our maximum likelihood estimate
S.E. denotes the standard error

Table 3.1

Convergence Summary of Cox's Data

Iteration   F2(b,a)      RMS        b1       a2       a3
    0       20.6o35    1.165J    -0.104   -1.182    0.64b
    1       20.5401    0.0205    -0.215   -1.195    0.361
    2       20.5401    0.0000    -0.216   -1.197    0.863

F2(b,a) denotes the function value


Table 3.2

Maximum Likelihood Estimates of Cox's Data

                    Our solution                  Cox's solution
Parameters   estimates   standard errors   estimates   standard errors
c              0.211         0.220           0.211         0.224
a2            -1.169         0.319          -1.169         0.324
a3             0.843         0.284           0.843         0.286

Table 3.3

Convergence Summary of the Newton-Raphson Algorithm

Iteration   F2(b,a)      RMS        b2       b4       a2
    0       78.8099    4.7125     0.151   -0.456   -2.307    2.051
    1       U.6227     0.9806     0.356   -0.660   -3.182    2.68
    2       69.2041    0.1542     0.420   -0.766   -3.83     3.01
    3       69.146o    0.0062     0.4o3   -0.815   -4.006    3.164
    4       69.1407    0.0000     0.433   -0.81    -4.014    3.168

F2(b,a) denotes the function value


Table 3.4

Maximum Likelihood Estimates of the Parameters

Parameters   P.Value     PPC       MLE      S.E.
c1            -0.6     -0.695    -0.700    0.031
c2            -0.1     -0.041    -0.067    0.041
c3             0.0      0.010     0.014    0.042
c4             0.2      0.121     0.140    0.043
c5             0.7      0.701     0.694    0.029
a2            -1.0               -1.067    0.090
a3             0.0                0.138    0.064
a4             0.6                0.649    0.066
a5             1.0                1.028    0.079
a6             1.3                1.506    0.108

P.Value denotes the population value
PPC denotes the Pearson's product correlation
MLE denotes our maximum likelihood estimates
S.E. denotes the standard error

Table 3.5

Maximum Likelihood Estimates of Olsson's Data

                    Our solution                  Olsson's solution
Parameters   estimates   standard errors   estimates   standard errors
a1,2          -1.774         0.104          -1.77          0.103
a1,3          -0.137         0.056          -0.14          0.056
a2,2          -0.688         0.061          -0.69          0.061
a2,3           0.667         0.061           0.67          0.061
ρ12            0.492         0.050           0.46          0.043

Table 4.1

PML Estimates of the Parameters

Parameters   P.Value     PPC       MLE       PML      S.E.
c11            0.0      0.231     0.289     0.288    0.097
c12            0.0     -0.103    -0.046    -0.045    0.103
c21            0.0     -0.134    -0.123    -0.123    0.107
c22            0.0     -0.107    -0.136    -0.134    0.108
c31            0.0      0.016    -0.032    -0.032    0.104
c32            0.0     -0.134    -0.094    -0.095    0.103
a1,2          -1.0               -0.993    -0.994    0.152
a1,3          -0.2               -0.215    -0.214    0.125
a1,4           0.8                0.838     0.853    0.144
a2,2          -0.8               -0.774    -0.84     0.140
a2,3           0.0               -0.066    -0.079    0.125
a2,4           0.8                0.708     0.697    0.137
a3,2          -1.2               -1.401    -1.403    0.162
a3,3          -0.4               -0.286    -0.262    0.127
a3,4           0.2                0.248     0.251    0.127
a3,5           1.0                1.086     1.082    0.15b
ρ12            0.0     -0.163    -0.159    -0.137    0.113
ρ13            0.0      0.009    -0.021    -0.016    0.112
ρ23            0.0      0.032     0.094     0.091    0.112

P.Value denotes the population value
PPC denotes the Pearson's product correlation
MLE denotes the maximum likelihood estimate
PML denotes the (mean of) partition maximum likelihood estimate
S.E. denotes the (mean of) standard error

Table 4.2

Simulation Results for Correlations

Distributions of Z_1 and Z_2 skewed in opposite directions

               Mean of Estimates                            RME
PAR    P.V     PPC      MLE      PML     c1-c2   c1-c3   c1-c4   c2-c3   c2-c4   c3-c4

(N = 40)
c11    0.80    0.801    0.815    0.800   0.062   0.072   0.062   0.028   0.022   0.020
c12    0.00    0.011    0.029    0.029   0.142   0.164   0.161   0.058   0.056   0.008
c21    0.30    0.307    0.315    0.314   0.158   0.170   0.169   0.054   0.053   0.004
c22    0.10    0.123    0.131    0.130   0.147   0.161   0.160   0.054   0.053   0.004
ρ12    0.40    0.402    0.426    0.425   0.128   0.142   0.140   0.077   0.081   0.020

(N = 70)
c11    0.80    0.794    0.801    0.791   0.052   0.056   0.050   0.026   0.023   0.010
c12    0.00    0.029    0.041    0.040   0.125   0.145   0.142   0.038   0.036   0.006
c21    0.30    0.332    0.338    0.336   0.110   0.127   0.124   0.042   0.040   0.006
c22    0.10    0.132    0.132    0.130   0.115   0.129   0.126   0.041   0.040   0.007
ρ12    0.40    0.417    0.425    0.427   0.091   0.119   0.122   0.063   0.067   0.011

(N = 100)
c11    0.80    0.796    0.799    0.791   0.035   0.040   0.037   0.020   0.019   0.010
c12    0.00   -0.014   -0.019   -0.019   0.084   0.091   0.089   0.041   0.040   0.004
c21    0.30    0.310    0.307    0.306   0.111   0.113   0.112   0.032   0.032   0.002
c22    0.10    0.118    0.111    0.110   0.117   0.129   0.129   0.033   0.033   0.002
ρ12    0.40    0.415    0.410    0.411   0.096   0.103   0.104   0.037   0.038   0.008

PAR denotes parameter
P.V denotes population value
PPC denotes the Pearson's product correlation
MLE denotes maximum likelihood estimates
PML denotes partition maximum likelihood estimates
RME denotes root mean square error

Table 4.3

Simulation Results for Correlations

Distributions of Z_1 and Z_2 both skewed to the right

               Mean of Estimates                            RME
PAR    P.V     PPC      MLE      PML     c1-c2   c1-c3   c1-c4   c2-c3   c2-c4   c3-c4

(N = 40)
c11    0.80    0.803    0.816    0.803   0.063   0.074   0.063   0.034   0.026   0.020
c12    0.00   -0.015   -0.019   -0.017   0.131   0.147   0.144   0.053   0.051   0.009
c21    0.30    0.289    0.291    0.289   0.149   0.153   0.152   0.047   0.048   0.003
c22    0.10    0.098    0.108    0.108   0.136   0.154   0.154   0.055   0.055   0.004
ρ12    0.40    0.380    0.387    0.391   0.141   0.171   0.165   0.073   0.071   0.016

(N = 70)
c11    0.80    0.798    0.798    0.791   0.039   0.043   0.039   0.021   0.021   0.010
c12    0.00   -0.025   -0.022   -0.021   0.103   0.106   0.106   0.038   0.038   0.00b
c21    0.30    0.298    0.299    0.299   0.101   0.115   0.115   0.038   0.038   0.002
c22    0.10    0.089    0.093    0.092   0.104   0.109   0.109   0.033   0.033   0.002
ρ12    0.40    0.388    0.392    0.391   0.093   0.107   0.106   0.048   0.050   0.010

(N = 100)
c11    0.80    0.805    0.810    0.786   0.030   0.040   0.037   0.025   0.031   0.024
c12    0.00   -0.015   -0.017   -0.017   0.086   0.089   0.087   0.033   0.032   0.006
c21    0.30    0.313    0.318    0.318   0.097   0.094   0.094   0.031   0.031   0.002
c22    0.10    0.086    0.093    0.093   0.085   0.091   0.091   0.033   0.033   0.001
ρ12    0.40    0.394    0.400    0.400   0.094   0.103   0.101   0.050   0.050   0.009

PAR denotes parameter
P.V denotes population value
PPC denotes the Pearson's product correlation
MLE denotes maximum likelihood estimates
PML denotes partition maximum likelihood estimates
RME denotes root mean square error

Table 4.4

Simulation Results for Thresholds

Distributions of Z_1 and Z_2 skewed in opposite directions

                Mean of Estimates
                          PML                                RME
PAR    True     MLE      PMLC     PMLS    c1-c2   c1-c3   c1-c4   c2-c3   c2-c4   c3-c4

(N = 40)
a1,2   -1.5   -1.522   -1.525   -1.533   0.189   0.212   0.182   0.094   0.039   0.092
a1,3   -1.0   -0.991   -0.990   -0.998   0.105   0.116   0.103   0.052   0.022   0.045
a1,4    0.0   -0.035   -0.033   -0.032   0.106   0.117   0.106   0.051   0.020   0.049
a1,5    1.0    0.994    0.996    1.003   0.141   0.129   0.135   0.053   0.028   0.039
a2,2   -1.0   -0.988   -0.991   -0.968   0.132   0.132   0.133   0.009   0.008   0.012
a2,3    0.0   -0.011   -0.007   -0.010   0.119   0.117   0.119   0.015   0.011   0.016
a2,4    1.0    0.995    0.994    0.995   0.146   0.144   0.147   0.012   0.009   0.015
a2,5    1.5    1.565    1.555    1.563   0.250   0.240   0.243   0.033   0.025   0.032

(N = 70)
a1,2   -1.5   -1.502   -1.514   -1.502   0.142   0.15    0.140   0.060   0.022   0.064
a1,3   -1.0   -1.006   -1.002   -1.006   0.065   0.098   0.084   0.041   0.011   0.041
a1,4    0.0    0.025    0.031    0.022   0.076   0.087   0.075   0.036   0.013   0.037
a1,5    1.0    0.972    0.970    0.976   0.124   0.127   0.125   0.02o   0.013   0.029
a2,2   -1.0   -0.9b2   -0.963   -0.982   0.101   0.101   0.101   0.008   0.006   0.008
a2,3    0.0   -0.006   -0.008   -0.006   0.098   0.099   0.099   0.009   0.009   0.012
a2,4    1.0    1.007    1.006    1.007   0.116   0.117   0.117   0.010   0.009   0.010
a2,5    1.5    1.544    1.546    1.545   0.163   0.164   0.138   0.016   0.017   0.021

(N = 100)
a1,2   -1.5   -1.485   -1.476   -1.490   0.104   0.105   0.101   0.046   0.014   0.045
a1,3   -1.0   -0.995   -0.994   -0.997   0.0bb   0.068   0.065   0.030   0.009   0.028
a1,4    0.0   -0.008   -0.013   -0.006   0.059   0.062   0.058   0.029   0.011   0.029
a1,5    1.0    0.991    0.987    0.993   0.076   0.077   0.075   0.023   0.010   0.020
a2,2   -1.0   -0.985   -0.986   -0.986   0.074   0.074   0.073   0.004   0.005   0.005
a2,3    0.0    0.004    0.004    0.007   0.062   0.062   0.064   0.006   0.008   0.011
a2,4    1.0    0.974    0.974    0.974   0.076   0.075   0.076   0.005   0.006   0.008
a2,5    1.5    1.498    1.498    1.494   0.146   0.146   0.146   0.009   0.013   0.016

PAR denotes parameter
True denotes true value
RME denotes the root mean square error
MLE denotes maximum likelihood estimate
PML denotes partition maximum likelihood estimate
PMLC denotes the PML estimates obtained from the 2-dimensional
polychoric correlation model
PMLS denotes the PML estimates obtained from the polyserial
correlation model

Table 4.5

Simulation Results for Thresholds

Distributions of Z_1 and Z_2 both skewed to the right

                Mean of Estimates
                          PML                                RME
PAR    True     MLE      PMLC     PMLS    c1-c2   c1-c3   c1-c4   c2-c3   c2-c4   c3-c4

(N = 40)
a1,2   -1.0   -1.033   -1.034   -1.042   0.175   0.180   0.173   0.050   0.020   0.043
a1,3    0.0    0.023    0.019    0.020   0.101   0.096   0.099   0.053   0.016   0.051
a1,4    1.0    1.013    0.997    1.016   0.107   0.098   0.104   0.052   0.020   0.048
a1,5    1.5    1.488    1.497    1.499   0.190   0.216   0.187   0.079   0.030   0.074
a2,2   -1.0   -0.986   -0.983   -0.985   0.134   0.133   0.133   0.009   0.009   0.012
a2,3    0.0    0.012    0.009    0.012   0.121   0.119   0.122   0.011   0.013   0.018
a2,4    1.0    0.969    0.968    0.967   0.136   0.137   0.139   0.010   0.014   0.016
a2,5    1.5    1.506    1.509    1.509   0.191   0.190   0.193   0.016   0.025   0.029

(N = 70)
a1,2   -1.0   -0.992   -0.990   -0.997   0.086   0.091   0.089   0.034   0.013   0.033
a1,3    0.0   -0.016   -0.021   -0.017   0.085   0.095   0.087   0.034   0.010   0.033
a1,4    1.0    1.027    1.024    1.030   0.101   0.104   0.106   0.028   0.013   0.028
a1,5    1.5    1.505    1.504    1.509   0.125   0.126   0.125   0.052   0.020   0.047
a2,2   -1.0   -0.998   -0.999   -0.997   0.101   0.102   0.101   0.005   0.005   0.006
a2,3    0.0   -0.011   -0.011   -0.010   0.095   0.092   0.094   0.008   0.008   0.011
a2,4    1.0    1.012    1.011    1.013   0.117   0.11    0.117   0.007   0.007   0.009
a2,5    1.5    1.533    1.536    1.534   0.187   0.188   0.188   0.014   0.015   0.019

(to be continued)

(continued)

                Mean of Estimates
                          PML                                RME
PAR    True     MLE      PMLC     PMLS    c1-c2   c1-c3   c1-c4   c2-c3   c2-c4   c3-c4

(N = 100)
a1,2   -1.0   -1.00    -1.002   -1.004   0.075   0.080   0.077   0.02    0.012   0.020
a1,3    0.0    0.01     0.010    0.012   0.075   0.076   0.078   0.026   0.009   0.029
a1,4    1.0    0.9      0.986    0.999   0.075   0.076   0.074   0.02    0.016   0.028
a1,5    1.5    1.46     1.476    1.486   0.135   0.154   0.138   0.05    0.027   0.051
a2,2   -1.0   -0.99    -0.99    -0.994   0.091   0.092   0.092   0.00    0.003   0.005
a2,3    0.0    0.01     0.01     0.014   0.077   0.076   0.077   0.005   0.004   0.007
a2,4    1.0    1.00     1.001    1.000   0.079   0.075   0.079   0.006   0.005   0.009
a2,5    1.5    1.5      1.50     1.503   0.1     0.141   0.133   0.005   0.007   0.012

True denotes true value
RME denotes the root mean square error
MLE denotes maximum likelihood estimate
PML denotes partition maximum likelihood estimate
PMLC denotes the PML estimates obtained from the 2-dimensional
polychoric correlation model
PMLS denotes the PML estimates obtained from the polyserial
correlation model

References

Anderson, T.W. An Introduction to Multivariate Statistical Analysis.
New York: Wiley, 1958.

Ashford, J.R., & Sowden, R.R. Multivariate probit analysis.
Biometrics, 1970, 26, 535-546.

Bard, Y. Nonlinear Parameter Estimation. New York: Academic Press,
1975.

Cox, N.R. Estimation of the correlation between a continuous and a
discrete variable. Biometrics, 1974, 30, 171-178.

Divgi, D.R. Calculation of the tetrachoric correlation coefficient.
Psychometrika, 1979, 44, 169-172.

Finney, D.J. Probit Analysis (3rd ed.). Cambridge, England:
Cambridge University Press, 1971.

Hannan, J.F., & Tate, R.F. Estimation of the parameters for a
multivariate normal distribution when one variable is dichotomized.
Biometrika, 1965, 52, 664-668.

IMSL Library (ed. 5). Houston, Texas: International Mathematical and
Statistical Libraries, 1975.

Johnson, N.L., & Kotz, S. Distributions in Statistics: Continuous
Multivariate Distributions. New York: Wiley, 1972.

Kendall, M.G., & Stuart, A. The Advanced Theory of Statistics, Vol. II:
Inference and Relationship. New York: Hafner, 1967.

Kraemer, H.C. Modified biserial correlation coefficients.
Psychometrika, 1981, 46, 275-282.

Lazarsfeld, P.F. Latent structure analysis. In S. Koch (Ed.),
Psychology: A Study of a Science, Vol. 3. New York: McGraw-Hill, 1959.

Lee, S.Y. Maximum likelihood estimation of polychoric correlations in
r x s x t contingency tables. Submitted for publication, 1984.

Lee, S.Y., & Poon, W.Y. Maximum likelihood estimation of polyserial
correlations. Technical Report, Dept. of Statistics, CUHK, No. 22,
1985.

Lord, F.M., & Novick, M.R. Statistical Theories of Mental Test Scores.
Reading, Mass.: Addison-Wesley, 1968.

Luenberger, D.G. Introduction to Linear and Nonlinear Programming.
Reading, Mass.: Addison-Wesley, 1973.

Martinson, E.O., & Hamdan, M.A. Maximum likelihood and some other
asymptotically efficient estimators of correlation in two-way
contingency tables. Journal of Statistical Computation and Simulation,
1971, 1, 45-54.

McDonald, R.P., & Swaminathan, H. A simple matrix calculus with
application to multivariate analysis. General Systems, 1973, 18, 37-54.

Muirhead, R.J. Aspects of Multivariate Statistical Theory. John
Wiley & Sons, 1982.

Nerlove, M., & Press, S.J. Univariate and multivariate log-linear and
logistic models. Santa Monica: The Rand Corp., R-1306-EDA/NIH, 1973.

Olsson, U. Maximum likelihood estimation of the polychoric correlation
coefficient. Psychometrika, 1979, 44, 443-460.

Olsson, U., Drasgow, F., & Dorans, N.J. The polyserial correlation
coefficient. Psychometrika, 1982, 47, 337-347.

Pearson, K. Mathematical contributions to the theory of evolution,
VII: On the correlation of characters not quantitatively measurable.
Philosophical Transactions of the Royal Society of London, Series A,
1901, 195, 1-47.

Prince, B.M., & Tate, R.F. Accuracy of maximum-likelihood estimates of
correlation for a biserial model. Psychometrika, 1966, 31, 85-92.

Rao, C.R. Linear Statistical Inference and Its Applications. John
Wiley & Sons, 1973.

Schmidt, P., & Strauss, R.P. Estimation of models with jointly
dependent qualitative variables: A simultaneous logit approach.
Econometrica, 1975, 43, 745-755.

Tallis, G.M. The maximum likelihood estimation of correlation from
contingency tables. Biometrics, 1962, 18, 342-353.

Tate, R.F. The theory of correlation between two continuous variables
when one is dichotomized. Biometrika, 1955, 42, 205-216. (a)

Tate, R.F. Applications of correlation models for biserial data.
Journal of the American Statistical Association, 1955, 50, 1078-1095. (b)
