
eNote 9

Random coefficients models


Contents

9 Random coefficients models
9.1 Introduction
9.2 Example: Constructed data
    9.2.1 Simple regression analysis
    9.2.2 Fixed effects analysis
    9.2.3 Two step analysis
    9.2.4 Random coefficient analysis
9.3 Example: Consumer preference mapping of carrots
9.4 Random coefficient models in perspective
9.5 R-TUTORIAL: Constructed data
9.6 R-TUTORIAL: Consumer preference mapping of carrots
9.7 Exercises

9.1 Introduction

Random coefficient models emerge as natural mixed model extensions of simple linear regression models in a hierarchical (nested) data setup. In the standard situation, we are interested in the relationship between x and y. Assume we have observations


$(x_1, y_1), \dots, (x_n, y_n)$ for a subject. Then we would fit the linear regression model, given by

$y_j = \alpha + \beta x_j + \varepsilon_j$

Assume next that such regression data are available on a number of subjects. Then a model that expresses different regression lines for each subject is expressed by:

$y_{ij} = \alpha_i + \beta_i x_{ij} + \varepsilon_{ij}$

or using the more general notation:

$y_i = \alpha(\mathrm{subject}_i) + \beta(\mathrm{subject}_i) x_i + \varepsilon_i$   (9-1)

This model has the same structure as the different slopes ANCOVA model of the previous module, only now the regression relationships are in focus. Assume finally that the interest lies in the average relationship across subjects. A commonly used "ad hoc" approach is to employ a two-step procedure:

1. Carry out a regression analysis for each subject.

2. Do subsequent calculations on the parameter estimates from these regression analyses to obtain the average slope (and intercept) and their standard errors.

Since the latter treats the subjects as a random sample, it would be natural to incorporate this in the model by assuming the subject effects (intercepts and slopes) to be random:

$y_i = a(\mathrm{subject}_i) + b(\mathrm{subject}_i) x_i + \varepsilon_i$

where

$a(k) \sim N(\alpha, \sigma_a^2)$, $b(k) \sim N(\beta, \sigma_b^2)$, $\varepsilon_i \sim N(0, \sigma^2)$

and where $k = 1, \dots, K$ with K being the number of subjects. The parameters α and β are the unknown population values for the intercept and slope. This is a mixed model, although a few additional considerations are required to identify the typical mixed model expression. The expected value is

$E(y_i) = \alpha + \beta x_i$

and the variance is

$\mathrm{Var}(y_i) = \sigma_a^2 + \sigma_b^2 x_i^2 + \sigma^2$

So, an equivalent way of writing the model is the following, where the fixed and the random parts are split:

$y_i = \alpha + \beta x_i + a(\mathrm{subject}_i) + b(\mathrm{subject}_i) x_i + \varepsilon_i$   (9-2)


where

$a(k) \sim N(0, \sigma_a^2)$, $b(k) \sim N(0, \sigma_b^2)$, $\varepsilon_i \sim N(0, \sigma^2)$   (9-3)

Now the linear mixed model structure is apparent. Although we do not always explicitly state this, there is the additional assumption that the random effects a(k), b(k) and ε_i are mutually independent. For randomly varying lines (a(k), b(k)) in the same x-domain this may be an unreasonable assumption, since the slope and intercept values may very well be related to each other. It is possible to extend the model to allow for such a correlation/covariance between the intercept and slope by assuming a bivariate normal distribution for each set of line parameters:

$(a(k), b(k)) \sim N\left(0, \begin{pmatrix} \sigma_a^2 & \sigma_{ab} \\ \sigma_{ab} & \sigma_b^2 \end{pmatrix}\right), \quad \varepsilon_i \sim N(0, \sigma^2)$   (9-4)

The model given by (9-2) and (9-4) is the standard random coefficient mixed model.
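In lmer syntax (as used in the R-tutorial section below), this model can be sketched as follows; this is a sketch of the call fitted later as model5y1 on the constructed data:

library(lmerTest)
# Random coefficient model (9-2)/(9-4): a random intercept and a random
# slope per subject, with their correlation also estimated
model5y1 <- lmer(y1 ~ x + (1 + x | subject), data = randcoef)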

9.2 Example: Constructed data

To illustrate the basic principles we start with two constructed data sets of 100 observations of y for 10 different x-values, see figure 9.1.

It reflects that a raw scatter plot of a data set can be hiding quite different structures if the data is in fact hierarchical (repeated observations on each individual rather than exactly one observation for each individual).

9.2.1 Simple regression analysis

Had the data NOT been hierarchical, but instead observations on 100 subjects, a simple regression analysis, corresponding to the model

$y_i = \alpha + \beta x_i + \varepsilon_i$   (9-5)

where $\varepsilon_i \sim N(0, \sigma^2)$, $i = 1, \dots, 100$, would be a reasonable approach. For comparison we state the results of such an analysis for the two data sets. The parameter estimates are:

           Data 1                       Data 2
Parameter  Estimate  SE      P-value    Estimate  SE      P-value
σ²         15.9899                      20.5229
α          10.7280   0.8638             7.8356    0.9786
β          0.90461   0.1392  <0.0001    1.21519   0.1577  <0.0001


See figure 9.1 (left) for the estimated lines.

9.2.2 Fixed effects analysis

If we had special interest in the 10 subjects, a fixed effects analysis corresponding to model (9-1) could be carried out. The F-tests and P-values from the Type I (successive) ANOVA tables become:

           Data set 1           Data set 2
Source     DF  F        P-value  F      P-value
x          1   9220.98  <.0001   70.74  <.0001
subject    9   2091.49  <.0001   3.07   0.0033
x*subject  9   277.71   <.0001   1.02   0.4311

Figure 9.1: Constructed data. Top: data set 1; bottom: data set 2. Left: raw scatter plot with simple regression line; middle: individual patterns; right: individual lines.


For data set 1 the slopes are clearly different, whereas for data set 2 the slopes can be assumed equal, but the intercepts (subjects) are different. Although it is usually recommended to rerun the analysis without an insignificant interaction effect, the Type I table shows that the result of this will clearly be that the subject (intercept) effect is significant for data set 2, cf. the discussion of Type I/Type III tables in Module 3. So for data set 1, the (fixed effect) story is told by providing the 10 intercept and slope estimates and/or possibly as described for the different slopes ANCOVA model in the previous module. For data set 2, an equal slopes ANCOVA model can be used to summarize the results. The common slope and error variance estimates are:

$\hat\beta = 1.2152$, $SE_{\hat\beta} = 0.1446$, $\hat\sigma^2 = 17.2582$

The confidence band for the common slope, using the 89 error degrees of freedom, becomes

$1.2152 \pm t_{0.975}(89) \cdot 0.1446$

which, since $t_{0.975}(89) = 1.987$, gives

[0.9279, 1.5025]

The subjects could be described and compared as for the common slopes ANCOVA model of the previous module.

9.2.3 Two step analysis

If the interest is NOT in the individual subjects but rather in the average line, then a natural ad hoc approach is simply to start by calculating the individual intercepts and slopes and then subsequently treat those as simple random samples and calculate average, variance and standard error to obtain confidence limits for the population average values. So e.g. for the slopes we have $\hat\beta_1, \dots, \hat\beta_{10}$ and calculate the average

$\hat\beta = \frac{1}{10} \sum_{i=1}^{10} \hat\beta_i,$

the variance

$s^2_{\hat\beta} = \frac{1}{9} \sum_{i=1}^{10} (\hat\beta_i - \hat\beta)^2$

and the standard error

$SE_{\hat\beta} = \frac{s_{\hat\beta}}{\sqrt{10}}$


to obtain the 95% confidence interval (using that $t_{0.975}(9) = 2.26$):

$\hat\beta \pm 2.26 \cdot SE_{\hat\beta}$

The variances for data set 1 are:

$s^2_{\hat\alpha} = 16.2779, \quad s^2_{\hat\beta} = 0.2465$

and for data set 2:

$s^2_{\hat\alpha} = 8.5663, \quad s^2_{\hat\beta} = 0.2130$

The results for the intercepts and slopes for the two data sets are given in the following table:

         Data set 1          Data set 2
         α        β          α        β
Average  10.7279  0.9046     7.8356   1.2152
SE       1.2759   0.1570     0.9255   0.1460
Lower    7.8416   0.5495     5.7419   0.8850
Upper    13.6142  1.2597     9.9293   1.5454

Note that for data set 2, the standard error for the slope is almost identical to the standard error from the fixed effect equal slopes model from above. However, due to the smaller degrees of freedom, 9 instead of 89, the confidence band is somewhat larger here. This reflects the difference in interpretation: in the fixed effects analysis the β estimates the common slope for these specific 10 subjects. Here the estimate is of the population average slope (the population from which these 10 subjects were sampled). This distinction does not alter the estimate itself, but does change the statistical inference that is made.

Note, by the way, that for estimating the individual lines, it does not make a difference whether an overall different slopes model is used or 10 individual ("small") regression models are fitted separately.

Although not used, the observed correlation between the intercepts and slopes in each case can be found:

$\mathrm{corr}_1 = -0.382, \quad \mathrm{corr}_2 = -0.655$
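In R, this two-step computation can be sketched compactly as follows; the full version appears in the R-tutorial section, where model3y1 (the per-subject regression fit, whose coefficients 11-20 are the 10 slope estimates) is defined:

bhat <- coef(model3y1)[11:20]               # the 10 subject slopes
mean(bhat)                                  # average slope
SEb  <- sd(bhat)/sqrt(10)                   # standard error of the average
mean(bhat) + c(-1, 1) * qt(0.975, 9) * SEb  # 95% confidence interval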


9.2.4 Random coefficient analysis

The results of fitting the random coefficient model given by (9-2) and (9-4) to each data set are given in the following table:

          Data set 1          Data set 2
          α        β          α        β
Estimate  10.7279  0.9046     7.8356   1.2152
SE        1.2759   0.1570     0.9255   0.1460
Lower     7.8416   0.5495     5.7419   0.8850
Upper     13.6142  1.2597     9.9293   1.5454

Note that this table is an exact copy of the result table for the two-step analysis above! The parameters of the variance part of the mixed model for data set 1 are estimated at (read off from the R output):

$\hat\sigma_a = 4.031, \quad \hat\sigma_b = 0.496, \quad \hat\rho_{ab} = -0.38, \quad \hat\sigma = 0.271$

which corresponds to the following variances:

$\hat\sigma_a^2 = 16.25, \quad \hat\sigma_b^2 = 0.246, \quad \hat\sigma^2 = 0.073$

and for data set 2:

$\hat\sigma_a = 1.086, \quad \hat\sigma_b = 0.145, \quad \hat\rho_{ab} = 1.00, \quad \hat\sigma = 4.132$

which corresponds to the following variances:

$\hat\sigma_a^2 = 1.18, \quad \hat\sigma_b^2 = 0.021, \quad \hat\sigma^2 = 17.07$

Compare with the variances calculated in the two-step procedure: for data set 1, the random coefficient model estimates are slightly smaller, whereas for data set 2, they are considerably smaller. This makes good sense, as the variances in the two-step procedure also will include some additional variation due to the residual error variance (just like the mean squares in a standard hierarchical model). For data set 1, this residual error variance is estimated at a very small value (0.0732) whereas for data set 2 it is 17.07. This illustrates how the random coefficient model provides the proper "story" about what is going on, and directly distinguishes between the two quite different situations exemplified here.

Note also that for data set 1, the correlation estimate $\hat\rho_{ab} = -0.38$ is close to the observed correlation calculated in the two-step procedure. However, for data set 2 the estimated correlation becomes $\hat\rho_{ab} = 1$!!! This obviously makes no sense! We encounter a situation similar to the "negative variance" problem discussed previously. The correlation may become meaningless when some of the variances are estimated very small, which is the case for the slope variance here. To put it differently, for data set 2 the model we have specified includes components (in the variance) that are not actually present in the data. We already knew this, since the equal slopes model was a reasonable description of this data. In the random coefficient framework the equal slopes model is expressed by

$y_i = \alpha + \beta x_i + a(\mathrm{subject}_i) + \varepsilon_i$   (9-6)

where

$a(k) \sim N(0, \sigma_a^2)$, $\varepsilon_i \sim N(0, \sigma^2)$   (9-7)

The adequacy of this model can be tested by a residual likelihood ratio test, cf. Module 5. For data set 2 we obtain

$G = -2\ell_{REML,1} - (-2\ell_{REML,2}) = 0.65$

which is non-significant using a $\chi^2$ distribution with 2 degrees of freedom. For data set 1 the similar test becomes

$G = -2\ell_{REML,1} - (-2\ell_{REML,2}) = 249.9$

which is extremely significant.
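In R this residual (REML) likelihood ratio test is obtained by comparing the two lmer fits with anova using refit=FALSE, as shown in the tutorial section:

# model6y1 is the equal slopes fit, model5y1 the different slopes fit;
# refit=FALSE keeps the REML criteria rather than refitting with ML
anova(model6y1, model5y1, refit = FALSE)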

For data set 2 the conclusions should be based on the equal slopes model given by (9-6) and (9-7), and we obtain the following:

          Data set 2
          α        β
Estimate  7.8356   1.2152
SE        1.0774   0.1446
Lower     5.6544   0.9278
Upper     10.0168  1.5026

We see a minor change in the confidence bands: believing in equal slopes increases the (estimated) precision (smaller confidence interval) for this slope, whereas the precision of the average intercept decreases.


9.3 Example: Consumer preference mapping of carrots

In a consumer study 103 consumers scored their preference of 12 Danish carrot types on a scale from 1 to 7. The carrots were harvested in autumn 1996 and tested in March 1997. A number of background information variables were recorded for each consumer, see the data description in Module 13 for details. The data file can be downloaded as carrots.txt and is also described in eNote 13.

The aim of a so-called "external preference mapping" is to find the "sensory drivers" of the consumer preference behaviour and to investigate if these are different in different segments of the population. To do this, in addition to the consumer survey, the carrot products are evaluated by a trained panel of tasters, the sensory panel, with respect to a number of sensory (taste, odour and texture) properties. Since usually a high number of (correlated) properties (variables) are used, in this case 14, it is a common procedure to use a few, often 2, combined variables that contain as much of the information in the sensory variables as possible. This is achieved by extracting the first two principal components in a principal components analysis (PCA) on the product-by-property panel average data matrix. PCA is a commonly used multivariate technique to explore and/or decompose high dimensional data.

We call these two variables sens1 and sens2 and they are given by

$\mathrm{sens1}_i = \sum_{j=1}^{14} a_j v_{ij} \quad \text{and} \quad \mathrm{sens2}_i = \sum_{j=1}^{14} b_j v_{ij}$

where $v_{i1}, \dots, v_{i,14}$ are the 14 average sensory scores for carrot product i and the coefficients $a_j$ and $b_j$ defining the two combined sensory variables are as depicted in figure 9.2. So sens1 is a variable that (primarily) measures bitterness vs. nutty taste whereas sens2 measures sweetness (and related properties). The actual "preference mapping" is carried out by first fitting regression models for the preference as a function of the sensory variables for each individual consumer using the 12 observations across the carrot products. Next, the individual regression coefficients are investigated, often in an explorative manner in which a scatter plot is used to look for a possible segmentation of consumers in these regression coefficients. Instead of looking for segmentation ("cluster analysis") we investigate whether we see any differences with respect to the background variables in the data, e.g. the gender or homesize (number of persons in the household). Let $y_i$ be the ith preference score. The natural model for this is a model that expresses randomly varying individual relations to the sensory variables, but with average (expected) values that may depend on the homesize.

Figure 9.2: Loadings plot for PCA of sensory variables: scatter plot of the coefficients $b_j$ versus $a_j$ for the 14 attributes colour, transp, car_od, earty_od, hard, crisp, juicy, fruit_ta, nut_ta, sweet_ta, bitter_ta, earthy_ta, carrot_af and bitter_af.

Let us consider the factor structure of the setting. The basic setting is a randomized block experiment with 12 treatments (carrot products), the factor prod, and 103 blocks (consumers), the factor cons. Homesize (size) is a factor that partitions the consumers into two groups: those with a homesize of 1 or 2, and those with a larger homesize. So the factor cons is nested within size, or equivalently size is coarser than cons. This basic structure is depicted in figure 9.3. (Note that the corresponding diagram plot in the video/audio based presentation of this module has a couple of errors compared to the correct one given here.)


Figure 9.3: The factor structure diagram for the carrots data, relating the factors 0, size, [cons], [prod] and [I] (with 1, 2, 103, 12 and 1236 levels, respectively).

The linear effect of the sensory variables is a part of the prod effect, since these covariates "are on product level". So they are both coarser than the product effect. The sensory variables in the model will therefore explain some of the product differences. Including prod in the model as well will enable us to test whether the sensory variables can explain all the product differences. As we do not expect this to be the case, we adopt the point of view that the 12 carrot products are a random sample from the population of carrot products in Denmark, that is, the product effect is considered as a random effect. In other words, we consider the deviations in the product variation from what can be explained by the regression on the sensory variables as random variation. Finally, the interactions between homesize and the sensory variables should enter the model as fixed effects, allowing for different average slopes for the two homesizes, leading to the model given by

$y_i = \alpha(\mathrm{size}_i) + \beta_1(\mathrm{size}_i) \cdot \mathrm{sens1}_i + \beta_2(\mathrm{size}_i) \cdot \mathrm{sens2}_i + a(\mathrm{cons}_i) + b_1(\mathrm{cons}_i) \cdot \mathrm{sens1}_i + b_2(\mathrm{cons}_i) \cdot \mathrm{sens2}_i + d(\mathrm{prod}_i) + \varepsilon_i$   (9-8)

where

$a(k) \sim N(0, \sigma_a^2)$, $b_1(k) \sim N(0, \sigma_{b_1}^2)$, $b_2(k) \sim N(0, \sigma_{b_2}^2)$, $k = 1, \dots, 103$   (9-9)

and

$d(\mathrm{prod}_i) \sim N(0, \sigma_P^2)$, $\varepsilon_i \sim N(0, \sigma^2)$   (9-10)


To finish the specification of a general random coefficient model, we need the assumption of the possibility of correlations between the random coefficients:

$(a(k), b_1(k), b_2(k)) \sim N\left(0, \begin{pmatrix} \sigma_a^2 & \sigma_{ab_1} & \sigma_{ab_2} \\ \sigma_{ab_1} & \sigma_{b_1}^2 & \sigma_{b_1 b_2} \\ \sigma_{ab_2} & \sigma_{b_1 b_2} & \sigma_{b_2}^2 \end{pmatrix}\right)$   (9-11)

Before studying the fixed effects, the variance part of the model is investigated further. We give details in the R-tutorial section on how we end up simplifying this 8-parameter variance model down to the 4-parameter variance model, where the $\sigma_{b_1}^2$ parameter and the two related correlations first can be tested non-significant, and after that the correlation between the b2-effect and the intercept (which can make sense here as the sens2 values are mean centered, see the discussion in the tutorial section):

$y_i = \alpha(\mathrm{size}_i) + \beta_1(\mathrm{size}_i) \cdot \mathrm{sens1}_i + \beta_2(\mathrm{size}_i) \cdot \mathrm{sens2}_i + a(\mathrm{cons}_i) + b_2(\mathrm{cons}_i) \cdot \mathrm{sens2}_i + d(\mathrm{prod}_i) + \varepsilon_i$   (9-12)

where

$a(k) \sim N(0, \sigma_a^2)$, $b_2(k) \sim N(0, \sigma_{b_2}^2)$, $k = 1, \dots, 103$   (9-13)

and

$d(\mathrm{prod}_i) \sim N(0, \sigma_P^2)$, $\varepsilon_i \sim N(0, \sigma^2)$   (9-14)

and where there are no more correlations in the model. The three remaining variance parameters (not counting the residual variance) are now all significant.

With this variance structure, we investigate the fixed effects - here showing the results of the automated step function of lmerTest:

                Sum Sq  Mean Sq  NumDF  DenDF    F.value  elim.num  Pr(>F)
Homesize:sens1  0.19    0.19     1      1016.24  0.18     1         0.67
sens1           0.54    0.54     1      8.98     0.52     2         0.49
Homesize:sens2  1.08    1.08     1      101.02   1.04     3         0.31
Homesize        5.40    5.40     1      100.98   5.20     kept      0.02
sens2           18.17   18.17    1      12.19    17.49    kept      0.00

The final model for these data is therefore given by:

$y_i = \alpha(\mathrm{size}_i) + \beta_2 \cdot \mathrm{sens2}_i + a(\mathrm{cons}_i) + b_2(\mathrm{cons}_i) \cdot \mathrm{sens2}_i + d(\mathrm{prod}_i) + \varepsilon_i$   (9-15)


where

$a(k) \sim N(0, \sigma_a^2)$, $b_2(k) \sim N(0, \sigma_{b_2}^2)$, $k = 1, \dots, 103$   (9-16)

and

$d(\mathrm{prod}_i) \sim N(0, \sigma_P^2)$, $\varepsilon_i \sim N(0, \sigma^2)$   (9-17)
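In lmer syntax the final model, exactly as fitted in the tutorial section below, reads:

# Final model (9-15)-(9-17): fixed homesize effect and common sens2 slope,
# a random consumer intercept, an independent random sens2 slope per
# consumer, and a random product effect
finalmodel <- lmer(Preference ~ Homesize + sens2 + (1 | product) +
                     (1 | Consumer) + (0 + sens2 | Consumer), data = carrots)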

The estimates of the variances are listed in the following table:

σ̂_b2           0.0545
σ̂_a            0.442
σ̂_P            0.1774
σ̂              1.0194
α̂(Homesize1)   4.91
α̂(Homesize3)   4.67
β̂_2            0.071

With confidence intervals as they come from the confint function:

                     2.5 %   97.5 %
.sig01               0.03    0.08
.sig02               0.36    0.53
.sig03               0.09    0.29
.sigma               0.98    1.07
Homesize1            4.74    5.08
Homesize3-Homesize1  -0.45   -0.03
sens2                0.04    0.10

The conclusions regarding the relation between the preference and the sensory variables are that no significant relation was found to sens1, but indeed so for sens2. The relation does not depend on the homesize and is estimated at (with 95% confidence interval):

$\hat\beta_2 = 0.071$, [0.04, 0.10]

So two products with a difference of 10 in the 2nd sensory dimension (this is the span in the data set) are expected to differ in average preference by between 0.4 and 1.0. Sweet products are preferred to non-sweet products, cf. figure 9.2 above. The expected values for the two homesizes (for an average product) and their differences are estimated at:

$\hat\alpha(1) + \hat\beta_2 \cdot \mathrm{sens2} = 4.91$, [4.73, 5.09]
$\hat\alpha(2) + \hat\beta_2 \cdot \mathrm{sens2} = 4.67$, [4.47, 4.85]
$\hat\alpha(1) - \hat\alpha(2) = 0.25$, [0.04, 0.46]

So homes with more persons tend to have a slightly lower preference in general for such carrot products.


9.4 Random coefficient models in perspective

Although the factor structure diagrams with all the features of finding expected mean squares and degrees of freedom are only strictly valid for balanced designs and models with no quantitative covariates, they may still be useful as a more informal structure visualization tool for these non-standard situations.

The setting with hierarchical regression data is really an example of what also could be characterized as repeated measures data. A common situation is that repeated measurements on a subject (animal, plant, sample) are taken over time; such data are also known as longitudinal data. So apart from appearing as natural extensions of fixed regression models, the random coefficient models are one option for analyzing repeated measures data. The simple models can be extended to polynomial models to cope with non-linear structures in the data, as sketched below. Also additional residual correlation structures can be incorporated. In Modules 11 and 12 a thorough treatment of repeated measures data is given with a number of different methods – simple as well as more complex approaches.
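For instance, a quadratic random coefficient model for longitudinal data could be sketched as follows; the data frame longdat with variables y, time and subject is purely hypothetical:

# Hypothetical quadratic random coefficient model for repeated measures:
# random intercept, linear and quadratic time effects per subject
mq <- lmer(y ~ time + I(time^2) + (1 + time + I(time^2) | subject),
           data = longdat)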

9.5 R-TUTORIAL: Constructed data

The data file can be downloaded as randcoef.txt and is also described in eNote 13.

The simple linear regression analyses of the two responses y1 and y2 in the data set randcoef are obtained using lm:

randcoef <- read.table("randcoef.txt", sep=",", header=TRUE)

randcoef$subject <- factor(randcoef$subject)

model1y1 <- lm(y1 ~ x, data = randcoef)

model1y2 <- lm(y2 ~ x, data = randcoef)

The parameter estimates with corresponding standard errors in the two models are:

summary(model1y1)

##

## Call:

## lm(formula = y1 ~ x, data = randcoef)

##


## Residuals:

## Min 1Q Median 3Q Max

## -6.69 -3.10 -1.05 3.33 8.27

##

## Coefficients:

## Estimate Std. Error t value Pr(>|t|)

## (Intercept) 10.728 0.864 12.4 < 2e-16 ***

## x 0.905 0.139 6.5 3.4e-09 ***

## ---

## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

##

## Residual standard error: 4 on 98 degrees of freedom

## Multiple R-squared: 0.301, Adjusted R-squared: 0.294

## F-statistic: 42.2 on 1 and 98 DF, p-value: 3.41e-09

summary(model1y2)

##

## Call:

## lm(formula = y2 ~ x, data = randcoef)

##

## Residuals:

## Min 1Q Median 3Q Max

## -11.234 -2.757 -0.492 3.102 10.314

##

## Coefficients:

## Estimate Std. Error t value Pr(>|t|)

## (Intercept) 7.836 0.979 8.01 2.5e-12 ***

## x 1.215 0.158 7.70 1.1e-11 ***

## ---

## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

##

## Residual standard error: 4.53 on 98 degrees of freedom

## Multiple R-squared: 0.377, Adjusted R-squared: 0.371

## F-statistic: 59.4 on 1 and 98 DF, p-value: 1.08e-11

The raw scatter plots for the data with superimposed regression lines are obtained using the plot and abline functions:


par(mfrow=c(1,2))

with(randcoef, {
  plot(x, y1)
  abline(model1y1)
  plot(x, y2)
  abline(model1y2)
})


par(mfrow=c(1,1))

The individual patterns in the data can be seen from the next plot:

par(mfrow=c(1,2))

with(randcoef, {
  plot(x, y1)
  for (i in 1:10) { lines(x[subject==i], y1[subject==i], lty=i) }
  plot(x, y2)
  for (i in 1:10) { lines(x[subject==i], y2[subject==i], lty=i) }
})



par(mfrow=c(1,1))

The function lines connects points with line segments. Notice how the repetitive plotting is solved using a for loop: for each i between 1 and 10 the relevant subset of the data is plotted with a line type that changes as the subject changes. Alternatively we could have used 10 lines calls for each response.

The fixed effects analysis with the two resulting ANOVA tables are:

model2y1 <- lm(y1 ~ x + subject + x * subject, data = randcoef)

model2y2 <- lm(y2 ~ x + subject + x * subject, data = randcoef)

library(xtable)

print(xtable(anova(model2y1)))

print(xtable(anova(model2y2)))

           Df  Sum Sq   Mean Sq  F value  Pr(>F)
x           1  675.12   675.12   9220.98  0.0000
subject     9  1378.16  153.13   2091.49  0.0000
x:subject   9  182.99   20.33    277.71   0.0000
Residuals  80  5.86     0.07

           Df  Sum Sq   Mean Sq  F value  Pr(>F)
x           1  1218.27  1218.27  70.74    0.0000
subject     9  475.27   52.81    3.07     0.0033
x:subject   9  158.18   17.58    1.02     0.4311
Residuals  80  1377.79  17.22

A plot of the data with individual regression lines based on model2y1 and model2y2 is again produced using a for loop. First we fit the two models in a different parameterisation (to obtain the estimates in a convenient form of one intercept and one slope per subject):

model3y1 <- lm(y1 ~ subject - 1 + x * subject - x, data = randcoef)

model3y2 <- lm(y2 ~ subject - 1 + x * subject - x, data = randcoef)

The plots are produced using

par(mfrow=c(1,2))

with(randcoef, {
  plot(x, y1)
  for (i in 1:10) { abline(coef(model3y1)[c(i,i+10)], lty=i) }
  plot(x, y2)
  for (i in 1:10) { abline(coef(model3y2)[c(i,i+10)], lty=i) }
})



par(mfrow=c(1,1))

Explanation: remember that coef extracts the parameter estimates. Now the first 10 estimates will be the intercept estimates and the next 10 will be the slope estimates. Thus the component pairs (1, 11), (2, 12), . . . , (10, 20) will belong to the subjects 1, 2, . . . , 10, respectively. This is exploited in the for loop in the part [c(i,i+10)], which produces these pairs as i runs from 1 to 10.

The equal slopes model for the second data set with parameter estimates is:

model4y2 <- lm(y2 ~ subject + x, data = randcoef)

summary(model4y2)

##

## Call:

## lm(formula = y2 ~ subject + x, data = randcoef)

##

## Residuals:


## Min 1Q Median 3Q Max

## -11.053 -2.853 0.414 2.146 10.428

##

## Coefficients:

## Estimate Std. Error t value Pr(>|t|)

## (Intercept) 5.7182 1.5358 3.72 0.00034 ***

## subject2 4.6967 1.8579 2.53 0.01324 *

## subject3 -0.0299 1.8579 -0.02 0.98720

## subject4 3.0225 1.8579 1.63 0.10730

## subject5 3.1704 1.8579 1.71 0.09140 .

## subject6 2.7096 1.8579 1.46 0.14823

## subject7 0.3074 1.8579 0.17 0.86896

## subject8 1.9357 1.8579 1.04 0.30027

## subject9 6.2555 1.8579 3.37 0.00112 **

## subject10 -0.8941 1.8579 -0.48 0.63153

## x 1.2152 0.1446 8.40 6.5e-13 ***

## ---

## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

##

## Residual standard error: 4.15 on 89 degrees of freedom

## Multiple R-squared: 0.524, Adjusted R-squared: 0.471

## F-statistic: 9.81 on 10 and 89 DF, p-value: 7.23e-11

The summary of the two step analysis can be obtained by applying the functions mean and sd (computing the empirical mean and standard deviation of a vector, respectively) to the vector of intercept estimates and to the vector of slope estimates (from the different slopes models), to perform the computations described in this module. Here it is shown for data set 1; it is done similarly for data set 2 (see the sketch after the code):

aInty1 <- mean(coef(model3y1)[1:10])

sdInty1 <- sd(coef(model3y1)[1:10])/sqrt(10)

uInty1 <- aInty1 + 2.26 * sdInty1

lInty1 <- aInty1 - 2.26 * sdInty1

aSloy1 <- mean(coef(model3y1)[11:20])

sdSloy1 <- sd(coef(model3y1)[11:20])/sqrt(10)

uSloy1 <- aSloy1 + 2.26 * sdSloy1

lSloy1 <- aSloy1 - 2.26 * sdSloy1
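For completeness, the analogous computations for data set 2 would be a direct analogue based on model3y2 (the object names below are ours):

aInty2 <- mean(coef(model3y2)[1:10])          # average intercept, data set 2
sdInty2 <- sd(coef(model3y2)[1:10])/sqrt(10)
uInty2 <- aInty2 + 2.26 * sdInty2
lInty2 <- aInty2 - 2.26 * sdInty2

aSloy2 <- mean(coef(model3y2)[11:20])         # average slope, data set 2
sdSloy2 <- sd(coef(model3y2)[11:20])/sqrt(10)
uSloy2 <- aSloy2 + 2.26 * sdSloy2
lSloy2 <- aSloy2 - 2.26 * sdSloy2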


The correlations between intercepts and slopes in the two data sets are computed using cor:

cor(coef(model3y1)[1:10], coef(model3y1)[11:20])

## [1] -0.3822

cor(coef(model3y2)[1:10], coef(model3y2)[11:20])

## [1] -0.6548

The random coefficients analysis is done with lmer. The different slopes random coefficient model is

library(lmerTest)

model5y1 <- lmer(y1 ~ x + (1 + x | subject), data = randcoef)

model5y2 <- lmer(y2 ~ x + (1 + x | subject), data = randcoef)

In the random part, 1 + x specifies the terms to which the random factor after | is assigned. One way to think about this is that 1 is 'multiplied' by subject and that x is 'multiplied' by subject, yielding the terms

1 × subject + x × subject

which corresponds to the random part in formula (9-2).
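As an aside, a version of the model where the random intercepts and slopes are assumed independent (no correlation parameter) would use the alternative random-effects syntax that also appears in the carrots tutorial below; the model name here is ours:

# Independent random intercept and random slope (no correlation estimated),
# in contrast to (1 + x | subject)
model5y1b <- lmer(y1 ~ x + (1 | subject) + (0 + x | subject), data = randcoef)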

The (fixed effects) parameter estimates and their standard errors are obtained from the model summary:

summodel5y1 <- summary(model5y1)

summodel5y2 <- summary(model5y2)

print(xtable(summodel5y1$coefficients))

             Estimate  Std. Error  df    t value  Pr(>|t|)
(Intercept)  10.73     1.28        9.00  8.41     0.00
x            0.90      0.16        9.00  5.76     0.00


print(xtable(summodel5y2$coefficients))

             Estimate  Std. Error  df     t value  Pr(>|t|)
(Intercept)  7.84      0.96        22.71  8.19     0.00
x            1.22      0.15        27.88  8.04     0.00

The variance parameter estimates, including the correlation between intercept and slope, are obtained using:

summodel5y1$varcor

## Groups Name Std.Dev. Corr

## subject (Intercept) 4.031

## x 0.496 -0.38

## Residual 0.271

summodel5y2$varcor

## Groups Name Std.Dev. Corr

## subject (Intercept) 1.086

## x 0.147 1.00

## Residual 4.132

The equal slopes models within the random coefficient framework are specified as

model6y1 <- lmer(y1 ~ x + (1 | subject), data = randcoef)

model6y2 <- lmer(y2 ~ x + (1 | subject), data = randcoef)

Likelihood ratio tests for the reduction from different slopes to equal slopes can be obtained using anova with two lmer result objects as arguments (the first argument being the less general model, the second the more general one).

print(xtable(anova(model6y1, model5y1, refit=FALSE)))


        Df  AIC     BIC     logLik   deviance  Chisq   Chi Df  Pr(>Chisq)
object  4   409.67  420.09  -200.84  401.67
..1     6   163.81  179.44  -75.90   151.81    249.87  2       0.0000

print(xtable(anova(model6y2, model5y2, refit=FALSE)))

        Df  AIC     BIC     logLik   deviance  Chisq  Chi Df  Pr(>Chisq)
object  4   586.63  597.05  -289.31  578.63
..1     6   589.97  605.61  -288.99  577.97    0.65   2       0.7208

Confidence intervals for the relevant final models may be obtained by:

print(xtable(confint(model6y2)))

             2.5 %  97.5 %
.sig01       0.64   3.40
.sigma       3.59   4.82
(Intercept)  5.73   9.94
x            0.93   1.50

print(xtable(confint(model5y1)))

             2.5 %  97.5 %
.sig01       2.60   6.39
.sig02       -1.00  1.00
.sig03       0.00   Inf
.sigma       0.00   Inf
(Intercept)  8.11   13.35
x            0.58   1.23

In the latter case, three of the variance parameters cannot be profiled (only for the subject-main-effect variance component a finite confidence interval is found). This is not necessarily a problem, as the likelihood and the CIs and tests for the fixed effects still make good sense.
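As an aside, when profiling fails, a possible alternative (not used in this eNote) is a parametric bootstrap via the method argument of confint; this can be slow:

# Parametric bootstrap confidence intervals; nsim chosen arbitrarily here
confint(model5y1, method = "boot", nsim = 500)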

The (fixed effects) parameter estimates for the final model for data set 2 are

print(xtable(summary(model6y2)$coefficients))

             Estimate  Std. Error  df     t value  Pr(>|t|)
(Intercept)  7.84      1.08        37.97  7.27     0.00
x            1.22      0.14        89.00  8.40     0.00

9.6 R-TUTORIAL: Consumer preference mapping of carrots

The data file can be downloaded as carrots.txt and is also described in eNote 13.


Recall that the most general model ((9-8) to (9-11) above) states that for each level of Consumer the random intercept and the random slopes of sens1 and sens2 are correlated in an arbitrary way (the specification in (9-11)). It can be specified as follows:

carrots <- read.table("carrots.txt", header = TRUE , sep = ",")

carrots$Homesize <- factor(carrots$Homesize)

carrots$Consumer <- factor(carrots$Consumer)

carrots$product <- factor(carrots$product)

model1 <- lmer(Preference ~ Homesize + sens1 + sens2 +

Homesize * sens1 + Homesize * sens2 +

(1|product) + (1+sens1+sens2|Consumer), data=carrots)

summary(model1)

## Linear mixed model fit by REML [’merModLmerTest’]

## Formula:

## Preference ~ Homesize + sens1 + sens2 + Homesize * sens1 + Homesize *

## sens2 + (1 | product) + (1 + sens1 + sens2 | Consumer)

## Data: carrots

##

## REML criterion at convergence: 3748

##

## Scaled residuals:

## Min 1Q Median 3Q Max

## -3.700 -0.549 0.024 0.610 2.864

##

## Random effects:


## Groups Name Variance Std.Dev. Corr

## Consumer (Intercept) 0.19777 0.4447

## sens1 0.00029 0.0170 -0.20

## sens2 0.00305 0.0552 0.17 0.93

## product (Intercept) 0.03361 0.1833

## Residual 1.03363 1.0167

## Number of obs: 1233, groups: Consumer, 103; product, 12

##

## Fixed effects:

## Estimate Std. Error df t value Pr(>|t|)

## (Intercept) 4.90668 0.08824 35.30000 55.61 <2e-16 ***

## Homesize3 -0.24083 0.10565 101.00000 -2.28 0.0247 *

## sens1 0.01354 0.01645 13.10000 0.82 0.4250

## sens2 0.06192 0.01931 16.60000 3.21 0.0053 **

## Homesize3:sens1 -0.00611 0.01482 315.30000 -0.41 0.6803

## Homesize3:sens2 0.01954 0.01925 102.80000 1.02 0.3122

## ---

## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

##

## Correlation of Fixed Effects:

## (Intr) Homsz3 sens1 sens2 Hms3:1

## Homesize3 -0.535

## sens1 -0.019 0.016

## sens2 0.044 -0.036 0.049

## Homsz3:sns1 0.021 -0.039 -0.403 -0.054

## Homsz3:sns2 -0.044 0.082 -0.049 -0.445 0.121

The random part deserves some explanation. The structure (9-11) amounts to the term (1+sens1+sens2|Consumer): for each level of Consumer we have 3 random effects, one intercept and two slopes, and they are arbitrarily correlated.

In addition there is the random effect product.

Let us check what the step function of lmerTest can tell us about the random effects of this model:

mystep <- step(model1)

print(xtable(mystep$rand.table))

                Chi.sq  Chi.DF  elim.num  p.value
sens1:Consumer  2.03    3       1         0.57
product         13.19   1       kept      0.00
sens2:Consumer  7.63    2       kept      0.02

We note that the sens1:Consumer effect is tested with 3 degrees of freedom (and is NS, hence eliminated). This is because eliminating this term from the model means that the variance AND the correlations between this coefficient and the sens2 coefficients and the intercepts are all assumed to be zero. This is the elimination principle implemented here.

Remark 9.1 Random coefficient correlations

Generally it is recommended to include these correlations in the models (and this is also what R is doing for us, when implemented as shown above). This is so, as correlations between the x'es will induce such correlations between the coefficients by construction, and hence it would be wrong not to allow for them in the model. The basic example is a non-centered x in a regression that will lead to a relation between the slope and the intercept. However, IF the x is centered (and hence the x has "correlation zero with the constant") this relation disappears. And generally, if the x'es are independent (orthogonal), then models with independent coefficients could make sense and could be a reasonable approach to stabilize the random effect part of the model.

In this case sens1 and sens2 are in fact both mean centered and independent by construction (scores from a principal component analysis), but check:

mean(carrots$sens1)

## [1] 6.667e-11

mean(carrots$sens2)

## [1] -7.5e-11

cor(carrots$sens1, carrots$sens2)

## [1] -1.93e-11


The models with (A) and without (B) correlation between intercepts and sens2-slopes (Model 1 in Module 9) are specified as follows (note the difference in R-syntax for the random effects):

model2A <- lmer(Preference ~ Homesize + sens1 + sens2 +

Homesize * sens1 + Homesize * sens2 +

(1|product) + (1+sens2|Consumer),

data=carrots)

model2B <- lmer(Preference ~ Homesize + sens1 + sens2 +

Homesize * sens1 + Homesize * sens2 +

(1|product) + (1|Consumer) +

(0+sens2|Consumer),

data=carrots)

print(xtable(anova(model2A, model2B, refit=FALSE)))

        Df  AIC      BIC      logLik    deviance  Chisq  Chi Df  Pr(>Chisq)
..1     10  3770.67  3821.85  -1875.34  3750.67
object  11  3771.97  3828.26  -1874.99  3749.97   0.70   1       0.4015

So we do not need the correlation in the model. We could without any problems use the results of the step call above, or we could redo the analysis by applying the step function to the model2B fit:

mystep2 <- step(model2B)

## Warning: Model failed to converge with max|grad| = 0.00203706 (tol = 0.002)

## Warning: Model failed to converge with max|grad| = 0.00203706 (tol = 0.002)

print(xtable(mystep2$rand.table))

                Chi.sq  Chi.DF  elim.num  p.value
product         13.18   1       kept      0.00
Consumer        87.55   1       kept      0.00
sens2:Consumer  6.92    1       kept      0.01

Now we also see a test for the random main (intercept) effect of Consumer, which was not part of the above.


The warnings do not worry us too much here: in one or two of the models the convergence just barely failed due to one of the convergence criteria, but clearly it was pretty close. There are ways to work with setting various optimizer options, including extending the number of iterations etc. (a brief sketch is given after the next code block), but we will not pursue these further here. Instead we check that the final model converges:

finalmodel <- lmer(Preference ~ Homesize + sens2 + (1 | product) +

(1 | Consumer) + (0 + sens2 | Consumer), data = carrots)
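As an aside, a sketch of how such optimizer options could be set through lmerControl; the specific choices here (the bobyqa optimizer and the evaluation limit) and the model name model2Bb are illustrative assumptions:

# Refit model2B with another optimizer and a higher evaluation limit
model2Bb <- lmer(Preference ~ Homesize + sens1 + sens2 +
                   Homesize * sens1 + Homesize * sens2 +
                   (1|product) + (1|Consumer) + (0+sens2|Consumer),
                 data = carrots,
                 control = lmerControl(optimizer = "bobyqa",
                                       optCtrl = list(maxfun = 2e5)))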

After having reduced the covariance structure in the model, we turn attention to the mean structure, i.e. the fixed effects:

print(xtable(mystep2$anova.table))

                Sum Sq  Mean Sq  NumDF  DenDF    F.value  elim.num  Pr(>F)
Homesize:sens1  0.19    0.19     1      1016.14  0.18     1         0.67
sens1           0.54    0.54     1      8.98     0.52     2         0.49
Homesize:sens2  1.08    1.08     1      100.83   1.04     3         0.31
Homesize        5.40    5.40     1      100.95   5.20     kept      0.02
sens2           18.17   18.17    1      12.19    17.49    kept      0.00

And various model parameter summaries and post hoc comparisons:

VarCorr(mystep2$model)

## Groups Name Std.Dev.

## Consumer sens2 0.0545

## Consumer.1 (Intercept) 0.4442

## product (Intercept) 0.1774

## Residual 1.0194

print(xtable(confint(mystep2$model)))

print(xtable(mystep$lsmeans))


             2.5 %  97.5 %
.sig01       0.03   0.08
.sig02       0.36   0.53
.sig03       0.09   0.29
.sigma       0.98   1.07
(Intercept)  4.74   5.08
Homesize3    -0.45  -0.03
sens2        0.04   0.10

            Homesize  Estimate  Standard Error  DF     t-value  Lower CI  Upper CI  p-value
Homesize 1  1         4.91      0.09            39.90  56.35    4.73      5.09      0.00
Homesize 3  3         4.66      0.09            48.90  49.76    4.47      4.85      0.00

print(xtable(mystep$diffs.lsmeans))

              Estimate  Standard Error  DF      t-value  Lower CI  Upper CI  p-value
Homesize 1-3  0.25      0.11            100.90  2.37     0.04      0.46      0.02

9.7 Exercises

Exercise 1 Carrots data

Consider the carrots data of this module. The data file can be downloaded as carrots.txt and is also described in eNote 13. Carry out a similar analysis using (at least) one of the other three response variables (Sweetness, Bitter or Crisp) instead of the preference. Try to include (at least) one other background variable than the homesize, e.g. gender.