Working Paper 2014-03_cp4

SCHOOL OF STATISTICS, UNIVERSITY OF THE PHILIPPINES DILIMAN

WORKING PAPER SERIES

School of Statistics Ramon Magsaysay Avenue U.P. Diliman, Quezon City

Telefax: 928-08-81 Email: [email protected]

SEMIPARAMETRIC PRINCIPAL COMPONENTS POISSON REGRESSION ON CLUSTERED DATA

by

Kristina Celene M. Manalaysay School of Statistics, University of the Philippines Diliman

InterContinental Hotels Group

Erniel B. Barrios School of Statistics, University of the Philippines Diliman

UPSS Working Paper No. 2014-03

January 2014


ABSTRACT

In modelling count data with multivariate predictors, we often encounter problems with clustering of observations and interdependency of predictors. We propose to use principal components of the predictors to mitigate the multicollinearity problem. To abate the information loss due to dimension reduction, a semiparametric link between the count dependent variable and the principal components is postulated. Clustering of observations is incorporated into the model as a random component, and the model is estimated via the backfitting algorithm. A simulation study illustrates the advantages of the proposed model over standard Poisson regression in a wide range of simulation scenarios.

Keywords: semiparametric Poisson regression, clustered data, multicollinearity, principal components analysis

MSC Codes: 62G08 62H25 62J07

1. Introduction

In many diverse fields, outcomes of certain phenomena are measured using indicators that possess the characteristics of Poisson events, e.g., prevalence of a disease, number of customers patronizing products/services, number of student enrollees. Poisson regression is used to characterize such data and to predict the average number of instances an event occurs, conditional on one or more factors. [1] demonstrated using malaria data that Poisson regression is advantageous over classical regression in modeling count data: classical regression analysis requires more predictors to achieve as much predictive ability as Poisson regression.

Spatial aggregation causes certain Poisson events to manifest clustering. The spread of AH1N1, for example, is influenced by determinants leading to the vulnerability of individuals in the same community, and these may differ from those causing the vulnerability of individuals in a different community. Clusters may still be independent, but members of the same cluster (or neighborhood) are necessarily dependent since some spatial endowment is commonly shared among the units that form the cluster. Classical statistical inference assumes independence of observations, i.e., data are independently collected on similar, homogeneous units. This assumption is not necessarily true for clustered data. Thus, analyzing clustered data with methods that implicitly assume independence of observations may yield incorrect analyses of the dynamics of the events/phenomena being characterized.

Predictors that explain the occurrence of Poisson events within a cluster can also be naturally correlated. The interdependence among predictors usually causes problems in statistical inference involving linear models. The multicollinearity problem exists when two or more explanatory variables in a regression model are highly correlated, which renders the ordinary least squares estimates of the regression coefficients inefficient. As an illustration, consider income and educational attainment as predictors of political preference. Income and educational attainment are structurally correlated since income varies according to the level of educational attainment of an individual. The presence of multicollinearity in a statistical model inflates the standard errors of the estimated coefficients, resulting in unreliable characterization of the coefficients.[2] It further weakens the sensitivity of the dependent variable to changes in the independent variables and makes it difficult to assess the relative importance of the independent variables in the model.

There are several solutions to the multicollinearity problem. For example, instead of individual predictors, some important principal components are used in the model. In the presence of multicollinearity, the design matrix becomes ill-conditioned, if not singular; principal components analysis addresses this by transforming the correlated variables into fewer uncorrelated components. Since Principal Components Regression (PCR) uses only a subset of the principal components, there is a loss of information, resulting in the deterioration of the predictive ability of the estimated regression function compared to the model that uses all the individual predictors.[3] It is also possible that the use of a subset of principal components can result in bias in the assessment of the relative importance of a predictor in explaining the dependent variable.

The lost information in principal components regression can be recovered by allowing

flexibility on the functional relationship between the dependent variable and the principal

components.[4] In nonparametric regression, the functional form of the link between the

dependent and independent variables is allowed to be flexible with only the requirement of

smoothness of the function incorporated into the objective function of the estimation. With a

flexible functional form, the principal components can have a more accurate characterization

of the variation of the dependent variable, hence improving its predictive ability.

We postulate an additive combination of nonparametric functions of the principal components and random effects in a regression model with measurements of Poisson events as the dependent variable. This semiparametric Poisson regression model can be used in characterizing high dimensional clustered data. The clustering effect is incorporated into the model through a random intercept term. Dimension reduction is achieved through principal components and, because of the inherent deterioration in model fit that dimension reduction brings, the covariate effects summarized in terms of the principal components are postulated as nonparametric functions.

2. Some Modeling Strategies

Classical linear regression assumes a continuous dependent variable and will lead to inefficient, inconsistent and biased estimates when used with a count dependent variable. Poisson regression is appropriate for modeling a count dependent variable. Even when Poisson regression can be approximated by classical linear regression, e.g., with a large sample size, Poisson regression is advantageous over classical linear regression since it usually requires fewer predictors to achieve a good fit, as demonstrated in the study of malaria incidence by [1].

[5] introduced generalized linear models (GLM) to relax some of the classical assumptions of a linear model. The model is given by

$$g(\mu_i) = x_i^{\prime}\beta, \qquad \mu_i = E(Y_i),$$

where, for every $i$, $Y_i$ belongs to the exponential family and $g$ is a function that links the random component $Y_i$ to the systematic component $x_i^{\prime}\beta$. These models were developed for regression with non-normal dependent variables; special cases include Poisson regression, where Y is a count variable, and logistic regression, where Y is a binary outcome.

[6] compared the following models for clustered data: (1) ordinary Poisson regression, which ignores intracluster correlation, (2) Poisson regression with fixed cluster-specific intercepts, (3) a generalized estimating equations approach with an equi-correlation matrix, (4) an exact generalized estimating equations approach with an exact covariance matrix, and (5) maximum likelihood. All five methods lead to consistent estimates of slopes but yield varying efficiency levels, especially for unbalanced data.

Poisson regression assumes a heterogeneous mean $\mu_i$ that depends on a linear combination of explanatory variables. Since the parameter is positive, it is convenient to express it through an exponential function, $\mu_i = \exp(x_i^{\prime}\beta)$, where $\beta$ is an $m \times 1$ vector of regression parameters and $x_i$ is an $m \times 1$ vector of explanatory variables or covariates, $i = 1, 2, \ldots, n$. In the context of generalized linear models, Poisson regression has a log link because $\log \mu_i = x_i^{\prime}\beta$ [7]. The Poisson model is useful for clustered data since the cluster-specific intercepts may be eliminated, and the resulting estimates can be viewed as limiting maximum likelihood estimates when the variance of the intercepts approaches infinity.[6]
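As a minimal sketch of the log-link Poisson regression described above, the following fits $\log \mu_i = x_i^{\prime}\beta$ by iteratively reweighted least squares on simulated counts. The function name `fit_poisson_irls`, the simulated coefficients and the random seed are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def fit_poisson_irls(X, y, n_iter=50, tol=1e-10):
    """Fit log(mu) = X @ beta by iteratively reweighted least squares."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta                      # linear predictor
        mu = np.exp(eta)                    # Poisson mean under the log link
        z = eta + (y - mu) / mu             # working response
        w = mu                              # working weights
        beta_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Simulated example (all values are assumptions for illustration).
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])        # intercept plus one covariate
true_beta = np.array([0.5, 0.3])
y = rng.poisson(np.exp(X @ true_beta))
beta_hat = fit_poisson_irls(X, y)
```

At convergence the estimate satisfies the Poisson score equations $X^{\prime}(y - \hat{\mu}) = 0$, which gives a simple check on the fit.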

Principal components analysis (PCA) transforms a set of p correlated variables into uncorrelated linear combinations called principal components (PCs). PCA rotates the original variable space to a point where the variance of the new variate is maximized. Since the PCs are ranked by order of explained variance, the last PCs have the smallest variance, but it is through the last PCs that the relationships of the independent variables to the dependent variable are determined, that is, the variables with high loadings on the last PCs are proven to be highly correlated.[8] In modeling where only a subset of the PCs is used, there is substantial loss of information. Typically, only a subset of the PCs is included in regression modeling, though there is no universally accepted procedure yet for determining which PCs to retain.[9] However, [10] proposed a procedure that simultaneously chooses the components while model fit is optimized.
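The transformation into uncorrelated components can be illustrated with a short sketch, assuming PCA is carried out on the correlation matrix of the predictors (the paper does not state whether the covariance or the correlation matrix is used). The simulated predictors and the seed are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
base = rng.normal(size=n)
# Three strongly correlated predictors built from a common base variable.
X = np.column_stack([base + 0.1 * rng.normal(size=n) for _ in range(3)])

# Standardize, then eigen-decompose the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)        # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]           # rank PCs by explained variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = Z @ eigvecs                        # principal component scores
explained = eigvals / eigvals.sum()         # share of variance per PC
```

The covariance matrix of `scores` is diagonal with the eigenvalues on the diagonal, so the components are uncorrelated and ordered by variance; with predictors this collinear, the first component carries almost all of the variance.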

[11] and [3] justified why the principal components with low eigenvalues are not included in the model. Since the variance of the estimator $\hat{\beta}$ (of $\beta$) is a linear combination of the reciprocals of the eigenvalues, inclusion of one or more components with small eigenvalues in the model yields high variance of $\hat{\beta}$. Nevertheless, [12] noted that given specific theoretical models oriented towards parameter estimation, principal component regression can yield desirable (maximum) variance property with minimal bias.
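The role of small eigenvalues can be checked numerically for the ordinary least squares case, where $Var(\hat{\beta})/\sigma^2 = (X^{\prime}X)^{-1} = \sum_j v_j v_j^{\prime} / l_j$, with $l_j$ and $v_j$ the eigenvalues and eigenvectors of $X^{\prime}X$. The sketch below uses two nearly collinear simulated predictors; the data and the symbols $l_j$, $v_j$ are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)         # nearly collinear with x1
X = np.column_stack([x1, x2])
X = X - X.mean(axis=0)                      # centered predictors

eigvals, V = np.linalg.eigh(X.T @ X)        # ascending: eigvals[0] is smallest

# Var(beta_hat) / sigma^2 computed directly and from the eigen-decomposition.
full = np.linalg.inv(X.T @ X)
by_eigen = sum(np.outer(V[:, j], V[:, j]) / eigvals[j] for j in range(2))

# Contribution of each component to the variance of the first coefficient.
contrib = V[0, :] ** 2 / eigvals
```

Here `contrib[0]`, the term from the smallest eigenvalue, dominates the variance of the coefficient, which is the usual argument for dropping components with small eigenvalues.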

The bias and the lost information in principal component regression should be addressed. For instance, nonparametric smoothing techniques, which aim to model the relationships between variables without specifying any particular form for the underlying regression function, may be considered. When several covariates are present, [13] proposed to extend the idea of linear regression into a flexible form known as the generalized additive model (GAM). The regression model is given by

$$g\big(E(Y \mid X_1, \ldots, X_p)\big) = \alpha + \sum_{j=1}^{p} f_j(X_j),$$

where $g$ is a link function, $\alpha$ is an intercept, and $f_j$ are nonparametric components. Additive models assume nonparametric smoothing splines for each predictor in regression models. [14] suggested that additive models be used as an initial procedure to locate the patterns and behavior of the predictors relative to the response, suggesting a possible parametric form with which to model Y at a later stage.

[15] formulated a censored regression model with additive effects of the covariates. The additive model sped up computation in the inference process and yielded more promising results than a class of linear models, especially where the linearity assumption is violated.

3. Methodology

Clustered data are characterized by homogeneity of elements within a cluster and heterogeneity between clusters. This requires a model structure that accounts for variation across the clusters while accounting for similarity of subjects within a cluster. The choice of covariates or predictors that can sufficiently account for between-cluster variability and within-cluster homogeneity is often complicated, resulting in a very long list. This necessarily leads to high dimensional data, which often invite other confounding problems: high dimensional data are prone to multicollinearity and suffer from the curse of dimensionality.

Multicollinearity can be a crucial issue in modeling. However, there are various approaches to mitigate its ill effects, e.g., transforming the individual predictors into linear combinations chosen so that they are uncorrelated yet contain the maximum amount of the variance that the original predictors contain. With linear combinations as predictors instead of the individual variables in a linear model, the predictive ability could suffer.

On the other hand, high dimensionality of predictors, multicollinearity, discreteness of the dependent variable (count data), and the clustering usually associated with the Poisson process that generates the count data can pose many complications in many model structures. Thus, we offer a solution that can potentially improve the predictive ability of the model through a nonparametric postulated link function in Poisson regression for count data.

3.1 Postulated Model

[16] proposed a semiparametric Poisson regression model for spatially clustered count data, given $n$ predetermined clusters with $n_k$ observations in each cluster, given by

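$$\log(y_{ik}) = \sum_{j=1}^{p} f_j(X_{ijk}) + \gamma_k + \varepsilon_{ik}, \qquad i = 1, \ldots, n_k, \; k = 1, \ldots, n \qquad (1)$$

Equivalently, Model 1 can also be written as:

$$y_{ik} = \exp\Big\{ \sum_{j=1}^{p} f_j(X_{ijk}) + \gamma_k + \varepsilon_{ik} \Big\} \qquad (2)$$

where

$X_{ijk}$ are the explanatory variables

$f_j$ are smooth functions of $X_{ijk}$

$\gamma_k$ are cluster random intercepts

$\varepsilon_{ik}$ is the error term

$y_{ik}$ is the response variable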

As the number of predictors increases, the likelihood of the multicollinearity problem also increases, and either can cause problems in estimating Model 2. We propose to use principal components of the predictors instead of the individual variables in the model. This, however, leads to bias since some information on the predictors can be lost in the process. Thus, the link function is postulated as a nonparametric function, resulting in the following semiparametric model:

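$$\log(y_{ik}) = \sum_{j=1}^{q} f_j(PC_{ijk}) + \gamma_k + \varepsilon_{ik} \qquad (3)$$

where

$\gamma_k$ are cluster random intercepts

$PC_{ijk}$ is the score for the jth principal component on the ith observation of cluster k

$\varepsilon_{ik}$ is the error term

$y_{ik}$ is the response variable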

The flexibility in form of the nonparametric function of the principal components will

compensate for the bias in the estimation of the regression coefficients due to the information

lost by selecting only the most important principal components to be included as predictors in

the model. The random intercepts will account for cluster differences (homogeneous within a

cluster) and the possible peculiarities in the model caused by clustering.

3.2 Estimation Procedures

The principle of backfitting is used to estimate the parametric and nonparametric components of the model. Assuming additivity of Model (3), the parametric and nonparametric frameworks are embedded into the backfitting algorithm to estimate the parameters and the nonparametric components of the model.

Two approaches are presented in this section. In Method 1, the backfitting algorithm estimates the nonparametric part first, and then the parametric part is estimated from the residuals. In Method 2, ordinary Poisson regression is used but with principal components as predictors.

3.2.1 Method 1: Backfitting of the Semiparametric Model

After the extraction of principal components, the parametric and nonparametric parts of the

model are estimated iteratively in the context of backfitting. The nonparametric functions of

the principal components are estimated through spline smoothing.

Smoothing splines are used to fit the nonparametric part of the model. Consider a simple additive regression model $Y = f(X) + \varepsilon$, where $E(\varepsilon) = 0$ and $Var(\varepsilon) = \sigma^2$. We want to estimate the function $f$ as the solution to the penalized least squares problem given by

$$\min_{f} \; \sum_{i=1}^{n} \big(y_i - f(x_i)\big)^2 + \lambda \int \big(f''(x)\big)^2 \, dx$$

given a value of the smoothing parameter $\lambda > 0$. The first term measures the goodness of fit and the second term serves as the penalty for lack of smoothness in $f$ due to the interpolation in the first term. The smoothing parameter $\lambda$ controls the tradeoff between smoothness and goodness of fit. Large values of $\lambda$ emphasize smoothness of $f$ over model fit, while small values put more weight on model fit than on smoothness of $f$. As $\lambda \to 0$, the solution for $f$ is an interpolation of the data points. The value of the smoothing parameter is chosen through generalized cross validation (GCV): $\lambda$ is chosen to minimize the generalized cross validation mean squared error given by

$$GCV(\lambda) = \frac{n^{-1} \sum_{i=1}^{n} \big(y_i - \hat{f}_{\lambda}(x_i)\big)^2}{\big(1 - n^{-1}\,\mathrm{tr}(S_{\lambda})\big)^2},$$

where $S_{\lambda}$ is the smoother matrix that maps the observed responses to the fitted values; see [17] and [18] for further details.
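The tradeoff governed by the smoothing parameter and its selection by GCV can be sketched as follows. For brevity, a discrete second-difference penalty on equally spaced points stands in for the integrated squared second derivative of a cubic smoothing spline, so this illustrates the principle rather than the spline estimator itself. The function names, the grid of candidate values and the simulated data are assumptions.

```python
import numpy as np

def smoother_matrix(n, lam):
    """S = (I + lam * D'D)^{-1}, with D the second-difference operator."""
    D = np.diff(np.eye(n), n=2, axis=0)     # shape (n - 2, n)
    return np.linalg.inv(np.eye(n) + lam * D.T @ D)

def gcv(y, lam):
    """Generalized cross validation score for a given smoothing parameter."""
    n = len(y)
    S = smoother_matrix(n, lam)
    resid = y - S @ y
    return np.mean(resid ** 2) / (1.0 - np.trace(S) / n) ** 2

rng = np.random.default_rng(3)
x = np.linspace(0, 1, 60)                   # equally spaced design points
y = np.sin(2 * np.pi * x) + 0.3 * rng.normal(size=x.size)

grid = 10.0 ** np.arange(-2, 5)             # candidate smoothing parameters
scores = np.array([gcv(y, lam) for lam in grid])
best_lam = grid[np.argmin(scores)]
fit = smoother_matrix(x.size, best_lam) @ y
```

With `lam = 0` the smoother matrix is the identity, so the fit interpolates the data, while larger values shrink the second differences of the fit toward zero.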

The partial residual $e_i$ is computed and used to estimate the random intercepts (with a priori information on the clustering of observations) through methods such as maximum likelihood or the EM algorithm; see, for example, [19].

Spline smoothing and mixed model estimation in the backfitting framework are then iterated

until convergence, see [20] for some optimal properties of the backfitting estimators.
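The alternation between smoothing and cluster-effect estimation can be sketched on a generic additive response. Two simplifications are assumed here and are not the authors' procedure: a Gaussian kernel smoother replaces spline smoothing, and cluster means of the partial residuals replace the maximum likelihood or EM estimation of the random intercepts. All names and simulated values are hypothetical.

```python
import numpy as np

def kernel_smooth(x, r, bandwidth=0.1):
    """Nadaraya-Watson smoother (Gaussian kernel) of values r against x."""
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / bandwidth) ** 2)
    return (w @ r) / w.sum(axis=1)

def backfit(x, y, cluster, n_clusters, n_iter=100, tol=1e-8):
    f = np.zeros_like(y)
    gamma = np.zeros(n_clusters)
    for _ in range(n_iter):
        # Step 1: smooth the partial residuals that exclude the cluster effects.
        f_new = kernel_smooth(x, y - gamma[cluster])
        f_new = f_new - f_new.mean()        # center f for identifiability
        # Step 2: cluster intercepts from the partial residuals that exclude f.
        resid = y - f_new
        gamma_new = np.array([resid[cluster == k].mean() for k in range(n_clusters)])
        change = max(np.abs(f_new - f).max(), np.abs(gamma_new - gamma).max())
        f, gamma = f_new, gamma_new
        if change < tol:
            break
    return f, gamma

rng = np.random.default_rng(5)
n_clusters, size = 3, 50
cluster = np.repeat(np.arange(n_clusters), size)
x = rng.uniform(0, 1, size=cluster.size)
true_gamma = np.array([-2.0, 0.0, 2.0])
y = np.sin(2 * np.pi * x) + true_gamma[cluster] + 0.2 * rng.normal(size=x.size)
f_hat, gamma_hat = backfit(x, y, cluster, n_clusters)
```

Because the smooth component is centered, the overall level of the response is absorbed by the cluster intercepts; differences between the estimated intercepts are what should be compared with the simulated cluster effects.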

3.2.2 Method 2: Ordinary Poisson Regression on the Principal Components

With the Poisson (log) link function, a generalized linear model (GLM) with the principal components as predictors was also estimated. In a GLM, the outcome Y is assumed to be generated from a particular distribution in the exponential family, in this case, a Poisson distribution. The heterogeneous mean, $\mu_i$, of the distribution depends on the independent variables, in this case, the PCs, through:

$$\log(\mu_i) = \beta_0 + \sum_{j=1}^{q} \beta_j PC_{ij} \qquad (4)$$

where $PC_{ij}$ is the score of the ith observation on the jth of the $q$ retained principal components.

These two methods are compared with ordinary Poisson regression on the original predictors in terms of their predictive ability.

3.3 Simulation Studies

We conduct a simulation study that covers the features of typical data to be analyzed using Model (3), i.e., data with high dimensional predictors, with multicollinearity, or with both. We then compare the predictive ability of Methods 1 and 2 with that of ordinary Poisson regression with the individual variables as predictors. The simulation scenarios are summarized in Table 1.

Table 1. Summary of Simulation Scenarios

Factor                                         Levels
Number of Variables / Expected Number of PCs   Few (5); Many (30), Single Pattern; Many (30), Three Patterns
Multicollinearity                              Absence; Strong
Sample Size                                    50; 100
Number of Clusters                             5; 10
Model Fit                                      Good; Poor

Starting with five Xs, the rest were simulated as an additive combination of a function of

another X and the random error term. The initial set of predictors is simulated from:

(5)

The error term is distributed as N(0,1). The multiplier of the error term controls multicollinearity among the predictors, i.e., a higher multiplier implies absence of multicollinearity while a lower multiplier indicates presence of multicollinearity. The twenty-five other predictors were generated to fulfil various multicollinearity structures.
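The effect of the error multiplier on multicollinearity can be illustrated with a small sketch. The generating function is assumed to be linear in the base variable, and the multipliers 0.1 and 10 are illustrative choices; the forms and constants used in the study are not recoverable from the text.

```python
import numpy as np

def derive_predictor(base, multiplier, rng):
    """Assumed form: the base variable plus a scaled N(0, 1) error."""
    return base + multiplier * rng.normal(size=base.size)

rng = np.random.default_rng(4)
base = rng.normal(size=1000)

# Small multiplier: the derived predictor is almost a copy of the base variable.
r_strong = np.corrcoef(base, derive_predictor(base, 0.1, rng))[0, 1]
# Large multiplier: the error dominates and the correlation is weak.
r_weak = np.corrcoef(base, derive_predictor(base, 10.0, rng))[0, 1]
```

Under this assumed form, with a standard normal base variable, the population correlation is $1/\sqrt{1 + c^2}$ for multiplier $c$, i.e., about 0.995 for $c = 0.1$ and about 0.10 for $c = 10$.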

The response variable was computed as a linear combination of the Xs plus a cluster mean (drawn from a normal distribution) and an error term. The means of the normal distributions from which the cluster means were simulated were spread widely to differentiate the clusters. A multiplier on the error term is included to alter model fit (a large multiplier implies poor model fit; a small multiplier implies good model fit). A model usually fits the data well when the functional form used is correct and the correct predictors are accounted for in the model. When the functional form of the model is incorrect or predictors are left out, the variation of the error term will dominate. Hence, to simulate misspecification, we magnified the error by multiplying it by a constant.
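A sketch of this response-generating step, under assumed values, shows how the error multiplier moves the coefficient of determination of a linear fit. The coefficients, cluster means and multipliers below are illustrative assumptions and not the constants used in the study.

```python
import numpy as np

def simulate_response(X, cluster, cluster_means, coefs, error_multiplier, rng):
    """Linear combination of the Xs plus a cluster mean and a scaled error."""
    noise = rng.normal(size=X.shape[0])
    return X @ coefs + cluster_means[cluster] + error_multiplier * noise

def r_squared(X, cluster, y, n_clusters):
    """R^2 of least squares on the predictors plus cluster indicators."""
    dummies = (cluster[:, None] == np.arange(n_clusters)).astype(float)
    design = np.column_stack([X, dummies])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(6)
n, p, n_clusters = 100, 5, 5
X = rng.normal(size=(n, p))
cluster = np.repeat(np.arange(n_clusters), n // n_clusters)
cluster_means = np.array([0.0, 5.0, 10.0, 15.0, 20.0])   # widely spread
coefs = np.ones(p)

y_good = simulate_response(X, cluster, cluster_means, coefs, 1.0, rng)
y_poor = simulate_response(X, cluster, cluster_means, coefs, 50.0, rng)
r2_good = r_squared(X, cluster, y_good, n_clusters)
r2_poor = r_squared(X, cluster, y_poor, n_clusters)
```

With the small multiplier the signal dominates and the fit is close to perfect; with the large multiplier the error variance swamps the signal and the coefficient of determination falls sharply.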

Scenarios for varying sample sizes (50, 100) as well as varying numbers of clusters (5, 10) were generated to assess the robustness of the proposed model. Furthermore, correlations among predictors that result in absence of multicollinearity or in severe multicollinearity were included. The number of variables can be 5 or 30, with the 30-variable scenarios further divided into single-pattern and three-pattern multicollinearity structures. Single-pattern scenarios represent data in which the variables are correlated with one variable, i.e., they are computed as a function of one particular variable only. Three-pattern scenarios, on the other hand, have a number of these base variables, from which the rest of the variables are computed. These variations in pattern were incorporated to simulate varying expected numbers of principal components. Each scenario is replicated 100 times.
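The simulation design can be enumerated programmatically. The sketch below assumes that the five factors of Table 1 are fully crossed, which the text does not state explicitly; the dictionary keys are hypothetical labels.

```python
from itertools import product

factors = {
    "variables": ["5", "30 single-pattern", "30 three-pattern"],
    "multicollinearity": ["absent", "strong"],
    "sample_size": [50, 100],
    "n_clusters": [5, 10],
    "model_fit": ["good", "poor"],
}

# One dictionary per scenario, assuming a full factorial design.
scenarios = [dict(zip(factors, levels)) for levels in product(*factors.values())]

n_replicates = 100
total_runs = len(scenarios) * n_replicates

# Cluster sizes implied by equal allocation of observations to clusters.
cluster_sizes = sorted(
    {(s["n_clusters"], s["sample_size"] // s["n_clusters"]) for s in scenarios}
)
```

Under this assumption there are 48 scenarios and 4,800 simulated data sets. The implied (number of clusters, cluster size) pairs are (5, 10), (5, 20), (10, 5) and (10, 10), which agree with the row labels of Table 6.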

4. Results and Discussion

The predictive ability of the proposed model is assessed by comparing the mean absolute prediction error (MAPE) obtained using estimation Methods 1 and 2 with the MAPE obtained using Ordinary Poisson Regression (OPR) based on the original predictors.
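The paper does not give the formula for MAPE. Since the values are reported in percent, a plausible reading is the mean absolute error relative to the observed value, which is the assumption made in this sketch; the function name `mape` is hypothetical.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute prediction error in percent, relative to observed values.

    Assumes every observed value is nonzero; zero counts would need separate
    handling, which is not specified in the paper.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs(actual - predicted) / np.abs(actual))

# Relative errors of 20%, 10% and 0% average to 10%.
example = mape([10, 20, 40], [12, 18, 40])
```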

4.1 Effect of Misspecification Error

In empirical modeling, misspecification is very common especially when variables were

measured in an ad hoc manner, i.e., not based on some theoretical foundation.

Misspecification error is introduced by magnifying the error term, i.e., multiplying with a

constant to inflate the variance. As the magnitude of the constant increases (equivalently,

model fit worsens or that the extent of misspecification error increases), the mean absolute

prediction error (MAPE) also increases.

In the simulation study, good model fit is represented by linear equations where the coefficient of determination is at least 60%, while poor model fit is associated with a coefficient of determination lower than 60%. The simulation shows that whether misspecification error is present or not, the proposed semiparametric model outperformed the other models in terms of predictive ability. In the absence of misspecification error, the proposed model has an advantage over ordinary Poisson regression of about 10% in MAPE. Shifting from ordinary Poisson regression to the parametric principal components regression, on the other hand, results in a 65% increase in MAPE. This illustrates the effect of the information lost by summarizing the predictors into principal components instead of using the individual predictors.

The advantages of the proposed method are also observed in the presence of misspecification error. MAPE of the semiparametric model is 9% lower than that of ordinary Poisson regression. Furthermore, the MAPE of the parametric principal components model is 48% higher than that of ordinary Poisson regression. The MAPE of the three models by nature of model fit are summarized in Table 2.

Table 2. Comparison of MAPE (%) for Varying Model Fit

Model Fit   Semiparametric Model   Parametric Model   OPR on Original Predictors
Good        24.26                  44.28              26.88
Poor        141.54                 230.64             156.23
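The percentage comparisons quoted for Table 2 can be checked directly from the tabulated values, taking ordinary Poisson regression (OPR) as the baseline. The helper name `pct_change` is an illustrative assumption.

```python
def pct_change(new, baseline):
    """Percentage change of `new` relative to `baseline`."""
    return 100.0 * (new - baseline) / baseline

# (semiparametric, parametric, OPR) MAPE values from Table 2.
table2 = {"Good": (24.26, 44.28, 26.88), "Poor": (141.54, 230.64, 156.23)}

# For each model fit: (semiparametric vs OPR, parametric vs OPR), rounded.
comparison = {
    fit: (round(pct_change(semi, opr)), round(pct_change(par, opr)))
    for fit, (semi, par, opr) in table2.items()
}
# comparison == {"Good": (-10, 65), "Poor": (-9, 48)}
```

Negative values indicate lower MAPE than OPR. The results match the 10% and 9% improvements of the semiparametric model and the 65% and 48% deteriorations of the parametric model reported in the text.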

Since a substantial number of cases with poor model fit were included, subsequent results show higher MAPE levels, because the predictive ability of the models with good fit is averaged together with that of the poorly fitting cases. The discussion will therefore focus on the comparison of the three models rather than on the magnitude of MAPE.

4.2 Effect of Multicollinearity

In the absence of multicollinearity, the proposed model yields better prediction than ordinary Poisson regression on the original predictors (OPR), with MAPE lower by 5%. The parametric principal component regression, on the other hand, yields 48% higher MAPE than OPR. This is explained as the effect of lost information due to the selection of only the more important principal components. With multicollinearity present, a similar pattern of MAPE can be observed, with the proposed model improving by 14% over ordinary Poisson regression. Again, the parametric model yields 52% higher MAPE relative to ordinary Poisson regression on the original variables. Principal components analysis is a procedure aimed at addressing multicollinearity, thus giving the proposed model an advantage over ordinary Poisson regression with the original correlated variables as predictors. The semiparametric nature of the proposed model makes for a more flexible regression and gives it an advantage over the parametric model. The proposed model performs best among the three methods whether multicollinearity is present or not. The MAPE values are summarized in Table 3.

Table 3. Comparison of MAPE (%) with or without Multicollinearity

Multicollinearity   Semiparametric Model   Parametric Model   OPR on Original Predictors
Absence             108.84                 170.28             114.79
Presence            65.61                  115.58             76.06

4.3 Effect of Sample Size

We simulated two sample sizes: 50 observations (small) and 100 observations (large). For the small sample size, while ordinary Poisson regression yields the best predictive ability among the three models, the proposed model is still within a comparable range. For the large sample size, however, the proposed model is most advantageous, yielding 19% lower MAPE than ordinary Poisson regression. This is consistent with the observation of [21] that in small samples, errors can easily occur, particularly in multivariate techniques such as principal components analysis (e.g., extraction of erroneous principal components). For small samples, estimates of the correlations among the predictors are relatively unstable; hence, the estimates of component loadings may not be accurate, resulting in loss of information. This lost information is to some extent retrieved by relaxing the parametric structure of the model and employing the more flexible nature of nonparametric regression.

In Table 4, the parametric model is the most robust to sample size, with its MAPE changing by only 3% as the sample size changes from 50 to 100. MAPE of the proposed model increased by 23% as the sample size increased, whereas MAPE of ordinary Poisson regression increased by 63%. For the parametric model, a sample of size 50 is already fairly large; hence, no substantial change in MAPE is observed as the sample size is increased further to 100. While MAPE for the proposed model increases with the sample size, at a sample size of 100 it still performed better than ordinary Poisson regression and the parametric model.

Table 4. Comparison of MAPE (%) for Varying Sample Size

Sample Size   Semiparametric Model   Parametric Model   OPR on Original Predictors
50            74.19                  135.26             69.59
100           91.61                  139.66             113.51

4.4 Effect of Number of Clusters

A priori information on the clustering of observations can inform the modeler of possible sources of variation of the response variable. More clusters mean that there is more basis for the attribution of heterogeneity of the observations. Furthermore, the proposed model includes a random intercept term that explains the cluster differences. The simulation study illustrates the influence of the number of clusters in the proposed model. Increasing the number of clusters yields improvement in the predictive ability of the proposed model. With fewer clusters, MAPE of the semiparametric model is 7% lower than that of ordinary Poisson regression. On the other hand, MAPE of the parametric model is 49% higher than that of ordinary Poisson regression. A similar trend can be observed for cases with more clusters, but the advantage of the semiparametric model over ordinary Poisson regression is greater in these cases (about 12% lower MAPE).

The above result is consistent with [22] who recommended adding more clusters, not the

individual observations, for increased efficiency in the analysis of clustered data. Table 5

provides details related to the effect of number of clusters on predictive ability of models in

clustered data. Table 6 illustrates the implication of the number of clusters and number of

observations per cluster in the analysis of clustered data.

Table 5. Comparison of MAPE (%) for Varying Cluster Count

Number of Clusters   Semiparametric Model   Parametric Model   OPR on Original Predictors
5                    87.49                  140.57             94.19
10                   78.31                  134.36             88.91

Table 6. Comparison of MAPE (%) for Varying Cluster Count and Cluster Size

Number of Clusters (Cluster Size)   Semiparametric Model   Parametric Model   OPR on Original Predictors
5 (10)                              79.40                  131.69             64.20
5 (20)                              95.58                  149.45             124.18
10 (5)                              68.98                  138.84             74.98
10 (10)                             87.64                  129.87             102.84

4.5 Effect of Number of Variables

While multicollinearity may appear even with few predictors, the chances of observing it in high dimensional data are higher. Scenarios with 5 (few) and 30 (many) variables were simulated in this study. In high dimensional data, principal component analysis is known as an effective data reduction technique. Whether there are few or many variables, the semiparametric model still exhibits better predictive ability. Its MAPE is even lower in cases where there are 30 variables. With fewer variables, the proposed model yields 17% lower MAPE compared to ordinary Poisson regression, whereas the parametric model yields 8% higher MAPE than that of ordinary Poisson regression. Table 7 indicates similar results for scenarios involving many variables.

Table 7. Comparison of MAPE (%) for Varying Number of Variables

Number of Variables   Semiparametric Model   Parametric Model   OPR on Original Predictors
5                     81.53                  106.84             98.36
30                    78.19                  146.52             83.37

For scenarios with thirty variables, the presence of multicollinearity is further examined for the effect of only a single pattern of interdependence among the predictors and for cases where there are three patterns of such interdependencies. The purpose of heterogeneity in interdependencies is to account for the number of principal components that need to be included in the model. The simulation study illustrates that the predictive ability of the semiparametric model improves as the interdependencies become more complicated, i.e., with more patterns of interdependencies. Compared to ordinary Poisson regression and the parametric model, the proposed model is still advantageous in terms of predictive ability.

Table 8. Comparison of MAPE (%) for Single- and Three-Pattern Scenarios

Number of Variables   Correlation Pattern   Semiparametric Model   Parametric Model   OPR on Original Predictors
30                    Single-Pattern        95.05                  180.60             94.31
30                    Three-Pattern         61.33                  112.43             72.43

5. Conclusions

The semiparametric principal component Poisson regression model aims to characterize clustered data with high dimensional or correlated predictors. The proposed model provides a solution to the modelling issues associated with high dimensional or correlated predictors while mitigating the bias caused by lost information in dimension reduction.

The predictive ability of the semiparametric model is superior in the presence of multicollinearity. It is superior to parametric Poisson principal component regression at both sample sizes, and it performs best among the three methods with the larger sample size. Although ordinary Poisson regression performed better than the proposed model for the small sample size, the performance of the proposed model is still on par with it, and it is more robust than ordinary Poisson regression to changes in sample size.

The proposed model and the corresponding estimation procedure are capable of mitigating the problem of multicollinearity by regressing on the principal components instead of on the original predictors. Furthermore, the nonparametric specification of the effect of the principal components abates the potential reduction in the predictive ability of the model that is usually observed in principal components regression, caused by the loss of information from dimension reduction.

REFERENCES

[1] Ruru Y, Barrios E. Poisson Regression Models of Malaria Incidence in Jayapura, Indonesia. The Phil. Stat. 2003; 52:27-38.

[2] Curto J, Pinto J. New Multicollinearity Indicators in Linear Regression Models. Int. Stat.

Rev. 2007; 75(1):114-121.

[3] Dunteman J. Principal Component Analysis. Sage University Papers Series on

Quantitative Applications in the Social Sciences, 07-069. Thousand Oaks, CA: Sage;

1989.

[4] Barrios E, Umali J. Nonparametric Principal Components Regression. Proceedings of the

58th World Congress of the ISI. 2011.

[5] Nelder J, Wedderburn R. Generalized Linear Models. JRSS. 1972; 135:370-384.

[6] Demidenko E. Poisson Regression for Clustered Data. Int. Stat. Rev. 2007; 75:96-113.

[7] Demidenko E. Mixed Models Theory and Applications. New Jersey: John Wiley; 2004.

[8] Jolliffe IT. Principal Components Analysis. New York: Springer; 2002.

[9] Draper N, Smith H. Applied Regression Analysis, 2nd ed. New York: John Wiley; 1981.

[10] Filzmoser P, Croux C. Dimension Reduction of the Explanatory Variables in Multiple

Linear Regression. Pliska. Stud. Math. Bulgaria. 2002; 29:1-12.

[11] Montgomery DC, Peck EA. Introduction to Linear Regression. New York: John Wiley;

1982.

[12] Marx BD, Smith EP. Principal Components Estimation for Generalized Linear

Regression. Biometrika. 1990; 77(1):23-31.

[14] Hastie T, Tibshirani R. Generalized Additive Models. London: Chapman and Hall; 1990.

[15] Alvarez J, Pardinas J. Additive Models in Censored Regression. Comp. Stat. and Data Anal. 2009; 53:3490-3501.

[16] De Vera E. Semiparametric Poisson Regression for Clustered Data. Unpublished Thesis, School of Statistics, University of the Philippines Diliman; 2010.

[17] Golub G, Heath M, Wahba G. Generalized Cross-Validation as a Method for Choosing a

Good Ridge Parameter. Technometrics. 1979; 21(2): 215-223.

[18] Hardle W, Muller M, Sperlich S, Werwatz A. Nonparametric and Semiparametric

Models. Berlin: Springer; 2004.

[19] Seidel W. Mixture Models. In: Lovric M, editor. International Encyclopedia of Statistical Science. Springer; 2011. p. 827-829.

[20] Opsomer J. Asymptotic Properties of Backfitting Estimators. J of Multi. Anal. 2000; 73:166-179.

[21] Osborne J, Costello A. Sample Size and Subject to Item Ratio in Principal Components

Analysis. Practical Assessment, Research & Evaluation. 2004; 9: Paper No. 11.

[22] Arceneaux K, Nickerson D. Modeling Certainty with Clustered Data: A Comparison of

Methods. Pol. Anal. 2009; 17:177-190.
