NON-NESTED HYPOTHESIS TESTS: A DIAGRAMMATIC EXPOSITION*


Australian Economic Papers, June 1989, pp. 160-165

PETER KENNEDY

Simon Fraser University

I. INTRODUCTION

In recent years there has appeared a spate of articles addressing the problem of testing non-nested hypotheses; MacKinnon (1983) and McAleer (1987) offer surveys of this literature. The purpose of this paper is to give a non-technical, and hence deliberately non-rigorous, exposition of these tests by means of a Venn diagram called the Ballentine, introduced to econometricians by Kennedy (1981). This exposition reveals that the competing non-nested hypothesis tests share a common theme; their differences stem primarily from the way in which they are operationalized.

Section II below describes the Cox and J tests, prototypes of the two prominent variants of non-nested tests. In Section III the Ballentine is presented and is used to show the similarities between the Cox and J tests. Section IV shows how the Ballentine can reflect various features of these tests, their variants, and competing tests. Section V concludes.

II. THE COX AND J TESTS

For expositional purposes, this paper is couched in the context of two hypotheses, $H_0$ and $H_1$, each taking the form of a linear regression model with normally distributed errors. To be specific, we have

$$H_0: Y = X\beta + \epsilon_0; \quad \epsilon_0 \sim N(0, \sigma_0^2 I)$$

$$H_1: Y = Z\alpha + \epsilon_1; \quad \epsilon_1 \sim N(0, \sigma_1^2 I)$$

where the matrices X and Z are fixed in repeated samples and the notation is standard. Two hypotheses (or models) are said to be non-nested if neither may be obtained from the other by the imposition of appropriate parametric restrictions; in this context, X and Z are not nested within each other.

The Cox test operates on the principle that the validity of the null hypothesis $H_0$ may be tested by examining whether or not it is capable of predicting the “performance” of the alternative hypothesis $H_1$. The actual performance of the alternative hypothesis is compared with the performance that would be expected if the null hypothesis were in fact true, and if this difference is insignificantly different from zero, the null hypothesis is accepted; otherwise it is rejected. When the hypotheses $H_0$ and $H_1$ are linear regression models with normally distributed errors, the Cox approach can be captured¹ by defining performance as the estimated variance of the error term. This is accomplished by testing whether $\hat\sigma_1^2 - \hat\sigma_{10}^2$ is significantly different from zero, where $\hat\sigma_1^2$ is the maximum likelihood estimate of $\sigma_1^2$ and $\hat\sigma_{10}^2$ is an estimate of $\sigma_{10}^2$, the probability limit (or expected magnitude) of $\hat\sigma_1^2$ assuming $H_0$ is true.

* My thanks, without implication, to Michael McAleer for useful suggestions.

Now suppose the hypotheses $H_0$ and $H_1$ are combined linearly to form the artificial nesting model

$$H_A: Y = (1-\theta) X\beta + \theta Z\alpha + \epsilon$$

The J test operates on the principle that if $H_0$ is true then the OLS estimate of $\theta$ will be insignificantly different from zero. However, the parameters in $H_A$, including $\theta$, are not identified. This problem is circumvented by introducing an estimate $\hat\alpha$ of $\alpha$ and then regressing Y on X and $Z\hat\alpha$. When $\hat\alpha$ is the OLS estimate of $\alpha$ from the regression of Y on Z, a t-test on the coefficient of $Z\hat\alpha$ in $H_A$ defines the J test.
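The two-stage procedure just described can be sketched in a few lines of NumPy. This is an illustrative sketch only: the helper names `ols` and `j_test`, the simulated design, and the use of the conventional OLS covariance matrix are assumptions made for the example, not part of the original exposition.

```python
import numpy as np


def ols(y, X):
    """OLS of y on X: returns (coefficients, fitted values, residuals)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    return coef, fitted, y - fitted


def j_test(y, X, Z):
    """t statistic on Z @ alpha_hat when y is regressed on [X, Z @ alpha_hat]."""
    _, z_fitted, _ = ols(y, Z)              # first stage: Z @ alpha_hat
    W = np.column_stack([X, z_fitted])      # second-stage regressors
    coef, _, resid = ols(y, W)
    n, k = W.shape
    s2 = resid @ resid / (n - k)            # residual variance estimate
    cov = s2 * np.linalg.inv(W.T @ W)       # conventional OLS covariance matrix
    return coef[-1] / np.sqrt(cov[-1, -1])


# Simulated data (hypothetical design): H0 is the true model.
rng = np.random.default_rng(0)
n = 200
x1, x2, u, z2 = rng.normal(size=(4, n))
X = np.column_stack([np.ones(n), x1, x2])
Z = np.column_stack([np.ones(n), 0.5 * x1 + u, z2])  # Z overlaps with X via x1
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

t_stat = j_test(y, X, Z)
print(f"J statistic for H0 against H1: {t_stat:.3f}")
```

The statistic would ordinarily be compared with a critical value from the t (or, asymptotically, the standard normal) distribution.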

III. THE BALLENTINE

An example of the Ballentine Venn diagram is presented in Figure 1, drawn to reflect a situation in which the true model is given by $H_0$. The circle Y represents variation in the dependent variable, and the circles X and Z represent variation in the sets of explanatory variables of $H_0$ and $H_1$, respectively.² The X and Z circles overlap, reflecting the typical case of some correlation between the variables of the two models. The overlap of X with Y, the blue plus brown area, reflects the variation in Y explained by X, on average, in repeated samples. The remainder of the Y circle, the green plus (shaded) red area, represents variation in Y not explained by X and is attributable to the error term; its magnitude is therefore identified with $\sigma_0^2$. In any given sample, the residuals from an OLS regression of Y on X can be used to create an estimate $\hat\sigma_0^2$ of $\sigma_0^2$. The magnitudes of this and comparable areas are identified with the magnitudes of variance estimates.

Figure 1: $H_0$ True

¹ This can be shown using results in Pesaran (1974), Fisher and McAleer (1981) and Fisher (1983).

² In the context of this application of the Ballentine, variation can be identified with sample variance. The circle representing variation of a set of variables must be interpreted as the union of the circles associated with each of the variables in that set. Since the explanatory variables are assumed to be non-stochastic, the X and Z circles do not change in repeated samples. Y incorporates the error terms and thus the Y circle varies in repeated samples; as drawn, it reflects a typical sample, representing the expected variation in Y.

The circle Z is drawn in Figure 1 so that its intersection with the circle Y occurs through its overlap with X (the brown area) and the red area. Since $H_1$ is a false hypothesis, Z should have little explanatory power for Y except insofar as Z may be collinear to some degree with X, and thus “explains” the brown area. The red area reflects variation in Y explained by Z due to there being a finite number of degrees of freedom. As the degrees of freedom become larger and larger this area becomes smaller and smaller, so that asymptotically it disappears.

In Figure 2 is drawn a similar diagram for the case in which $H_1$ is the true model. Here the roles of X and Z are reversed: the red (shaded) area of Figure 1 has increased in magnitude in Figure 2 to reflect the true role of Z, whereas the blue area of Figure 1 has shrunk in Figure 2, reflecting the negligible role of X.

The purpose of the Cox test and its variants is to compare $\hat\sigma_1^2$, the actual performance of $H_1$, and $\sigma_{10}^2$, its expected performance when $H_0$ is true. The magnitude of $\sigma_{10}^2$ can be determined from Figure 1, since Figure 1 reflects the case of $H_0$ being true. We would expect Z, through its collinearity (overlap) with X, to explain the brown area when Y is regressed on it. This follows because when $H_0$ is true the red area in Figure 1 should be negligible.³ Thus when Y is regressed on Z we expect the brown area to be explained, implying that $\sigma_{10}^2$ is given by the remainder of the Y circle, the blue plus green plus red area in both Figures 1 and 2. An estimate of this area, discussed in Section IV, yields $\hat\sigma_{10}^2$.

Now what about $\hat\sigma_1^2$, the actual performance of $H_1$ resulting from the regression of Y on Z? In both Figures 1 and 2, the residual variation from this regression is given by the blue plus green area, so $\hat\sigma_1^2$ differs from $\hat\sigma_{10}^2$ by the red area. Their difference, $\hat\sigma_1^2 - \hat\sigma_{10}^2$, should be negative in sign under the null $H_0$ and equal in magnitude to the red area. When $H_0$ is true (Figure 1) this red area should be negligible, so $H_0$ should be accepted; when $H_0$ is false (Figure 2) the red area is not negligible, so this difference should be significantly different from zero and $H_0$ should be rejected. In essence the Cox test is testing whether or not the red area is of significant magnitude.

Figure 2: $H_1$ True

³ Some variants of the Cox test incorporate small-sample adjustments to correct for this. See Godfrey and Pesaran (1983).


Let us now examine the J test using the Ballentine. In the first stage of the test, Y is regressed on Z to produce $Z\hat\alpha$, represented by the brown plus red area. In the second stage of the J test, Y is regressed on X and $Z\hat\alpha$ together and a t test is conducted on the coefficient of $Z\hat\alpha$. The square of this t statistic reflects the significance of the change in the sum of squared errors resulting from adding the regressor $Z\hat\alpha$ to the set of regressors X. The variation in Y explained by X alone is given by the blue plus brown area; regressing Y on X and $Z\hat\alpha$ (where $Z\hat\alpha$ is represented by the brown plus red area) together augments this explained variation by the red area, and thereby reduces the sum of squared errors by the red area. Consequently, the J statistic is used to test whether or not the magnitude of the red area is significantly different from zero, precisely what the Cox test is doing. Of course these two statistics are not identical because they each measure the red area in a different way.⁴
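The link between the squared t statistic and the reduction in the sum of squared errors can be checked numerically. The sketch below is illustrative only, using simulated data and a hypothetical helper `sse`; it computes the fall in the sum of squared errors from adding $Z\hat\alpha$ to X (the "red area"), scales it by the residual variance, and compares the result with the square of the J statistic.

```python
import numpy as np


def sse(y, X):
    """Sum of squared OLS residuals from regressing y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return resid @ resid


# Simulated data (hypothetical design): H0 is the true model.
rng = np.random.default_rng(1)
n = 200
x1, x2, u, z2 = rng.normal(size=(4, n))
X = np.column_stack([np.ones(n), x1, x2])
Z = np.column_stack([np.ones(n), 0.5 * x1 + u, z2])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

# First stage: fitted values Z @ alpha_hat from regressing y on Z.
alpha_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
W = np.column_stack([X, Z @ alpha_hat])

sse_x = sse(y, X)                 # residual variation using X alone
sse_w = sse(y, W)                 # residual variation after adding Z @ alpha_hat
red_area = sse_x - sse_w          # reduction in the sum of squared errors
s2 = sse_w / (n - W.shape[1])     # residual variance of the augmented regression
f_stat = red_area / s2            # one-restriction F statistic

# t statistic on Z @ alpha_hat in the augmented regression.
coef, *_ = np.linalg.lstsq(W, y, rcond=None)
t_stat = coef[-1] / np.sqrt(s2 * np.linalg.inv(W.T @ W)[-1, -1])

print(f"t^2 = {t_stat**2:.6f}, scaled SSE reduction = {f_stat:.6f}")
```

The two printed numbers agree because a t test on a single added regressor is equivalent to the F test of the corresponding one-restriction model.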

IV. EXTENSIONS AND REMARKS

Several comments can be added to show how the Ballentine can illustrate various features of these statistics and their variants.

1. Estimating $\sigma_{10}^2$. To operationalize the Cox test an estimate of $\sigma_{10}^2$, the sum of the green, blue and red areas, must be obtained. Regressing Y on X and taking the residuals permits estimation of the green plus red area. (This is in fact $\hat\sigma_0^2$.) What remains is to estimate the blue area. This is done in two steps. First, regress Y on X and obtain the predicted values of Y; this yields the variation represented by the blue plus brown area. Second, regress these predicted values on Z; the residuals from this regression allow estimation of the blue area (i.e., the blue plus brown area overlaps with the Z circle by the brown area, so the residual variation is an estimate of the blue area). The estimated variance of these residuals, added to $\hat\sigma_0^2$, yields $\hat\sigma_{10}^2$.
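The two steps above translate directly into code. The following is a minimal sketch under stated assumptions: the function name `cox_variances` is hypothetical, the data are simulated, and all variances are computed in maximum likelihood fashion (dividing by n), which is an assumption of this example. It returns the ingredients of the difference $\hat\sigma_1^2 - \hat\sigma_{10}^2$ rather than a fully standardized Cox statistic.

```python
import numpy as np


def fitted(y, X):
    """Fitted values from an OLS regression of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ coef


def cox_variances(y, X, Z):
    """Return (sigma0_sq, sigma1_sq, sigma10_sq), each computed by dividing by n."""
    n = len(y)
    y_hat_x = fitted(y, X)                               # blue plus brown
    sigma0_sq = np.sum((y - y_hat_x) ** 2) / n           # green plus red
    blue = np.sum((y_hat_x - fitted(y_hat_x, Z)) ** 2) / n
    sigma10_sq = sigma0_sq + blue                        # green plus red plus blue
    sigma1_sq = np.sum((y - fitted(y, Z)) ** 2) / n      # actual performance of H1
    return sigma0_sq, sigma1_sq, sigma10_sq


# Simulated data (hypothetical design): H0 is the true model.
rng = np.random.default_rng(2)
n = 200
x1, x2, u, z2 = rng.normal(size=(4, n))
X = np.column_stack([np.ones(n), x1, x2])
Z = np.column_stack([np.ones(n), 0.5 * x1 + u, z2])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

sigma0_sq, sigma1_sq, sigma10_sq = cox_variances(y, X, Z)
print(f"sigma1_sq - sigma10_sq = {sigma1_sq - sigma10_sq:.4f}")
```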

2. Accepting or Rejecting Both Hypotheses. Figure 1 illustrates a case in which one hypothesis is accepted and the other rejected. When $H_0$ serves as the null and $H_1$ as the alternative, $H_0$ is likely to be accepted, because in Figure 1 the red area is small; when the roles of $H_0$ and $H_1$ are reversed, $H_1$ should be rejected, because now in Figure 1 the role of the red area is being played by the blue area. Suppose now we modify Figure 1 to reflect a case in which X and Z are highly collinear, so that both the red and blue areas are small. In this case it is quite possible that both $H_0$ and $H_1$ would be accepted. As a final case, suppose Figure 1 is modified to illustrate a case in which both X and Z determine Y, so that both the red and blue areas are large; in this case, it is quite possible that (correctly) both hypotheses would be rejected.
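The final case, in which both X and Z determine Y, can be illustrated by running the J test in both directions. The sketch below is hypothetical: the function names, the simulated design, and the use of 1.96 as an asymptotic 5 per cent critical value are assumptions made for the example.

```python
import numpy as np


def j_stat(y, X_null, X_alt):
    """J statistic for the model with regressors X_null against the model with X_alt."""
    alpha_hat, *_ = np.linalg.lstsq(X_alt, y, rcond=None)
    W = np.column_stack([X_null, X_alt @ alpha_hat])
    coef, *_ = np.linalg.lstsq(W, y, rcond=None)
    resid = y - W @ coef
    n, k = W.shape
    s2 = resid @ resid / (n - k)
    return coef[-1] / np.sqrt(s2 * np.linalg.inv(W.T @ W)[-1, -1])


def paired_j_tests(y, X, Z, crit=1.96):
    """Test H0 against H1 and H1 against H0; crit is an assumed asymptotic cut-off."""
    t_h0 = j_stat(y, X, Z)     # null H0, alternative H1 (tests the red area)
    t_h1 = j_stat(y, Z, X)     # null H1, alternative H0 (tests the blue area)
    return {"reject_H0": bool(abs(t_h0) > crit), "reject_H1": bool(abs(t_h1) > crit)}


# Simulated data (hypothetical design): both X and Z determine Y.
rng = np.random.default_rng(3)
n = 200
x1, z1 = rng.normal(size=(2, n))
X = np.column_stack([np.ones(n), x1])
Z = np.column_stack([np.ones(n), z1])
y = 1.0 + 2.0 * x1 + 2.0 * z1 + rng.normal(size=n)

outcome = paired_j_tests(y, X, Z)
print(outcome)
```

With both the red and the blue areas large by construction, both hypotheses are rejected in this design.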

3. An Alternative View. MacKinnon (1983) notes that what the Cox and J tests do is test whether the residuals from $H_0$ are uncorrelated with the difference between the fitted values from $H_0$ and $H_1$. In Figure 1 the residuals from $H_0$ are represented by the green plus red area. The fitted values of $H_0$ are represented by the blue plus brown area, and the fitted values from $H_1$ are represented by the brown plus red area, so their difference is represented by the blue plus red area. (The red area variation must now be interpreted as being opposite in direction from before.) Correlation between the residuals from $H_0$ and the difference between the fitted values from $H_0$ and $H_1$ is reflected by an overlap of their respective areas; this overlap is the red area, so that this suggestion also is a test of whether the magnitude of the red area is significantly different from zero.

⁴ The nature of the relationship between the Cox and J statistics (and variants thereof), and the extent to which they are equivalent, have been discussed by Dastoor (1983), Davidson and MacKinnon (1982), Fisher (1983), Fisher and McAleer (1981) and McAleer (1987).
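This alternative view can be made concrete for the J test. Because the fitted values from $H_0$ lie in the column space of X, regressing Y on X and the difference in fitted values gives the same t statistic as regressing Y on X and $Z\hat\alpha$. The sketch below, with simulated data and a hypothetical helper `t_last`, checks this numerically.

```python
import numpy as np


def t_last(y, W):
    """t statistic on the last column of W in an OLS regression of y on W."""
    coef, *_ = np.linalg.lstsq(W, y, rcond=None)
    resid = y - W @ coef
    n, k = W.shape
    s2 = resid @ resid / (n - k)
    return coef[-1] / np.sqrt(s2 * np.linalg.inv(W.T @ W)[-1, -1])


# Simulated data (hypothetical design): H0 is the true model.
rng = np.random.default_rng(4)
n = 200
x1, x2, u, z2 = rng.normal(size=(4, n))
X = np.column_stack([np.ones(n), x1, x2])
Z = np.column_stack([np.ones(n), 0.5 * x1 + u, z2])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
alpha_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
x_fitted = X @ beta_hat               # fitted values from H0
z_fitted = Z @ alpha_hat              # fitted values from H1
resid0 = y - x_fitted                 # residuals from H0
diff = z_fitted - x_fitted            # difference between the fitted values

t_j = t_last(y, np.column_stack([X, z_fitted]))   # usual J statistic
t_alt = t_last(y, np.column_stack([X, diff]))     # regressor is the difference
corr = np.corrcoef(resid0, diff)[0, 1]            # sample correlation being tested

print(f"J = {t_j:.6f}, alternative form = {t_alt:.6f}, correlation = {corr:.4f}")
```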

4. Atkinson Variants. Atkinson (1970) suggested that the entire Cox statistic be evaluated under the null hypothesis, a suggestion that has led to variants of both the Cox and J tests. In the Cox test, $\hat\sigma_1^2$ is calculated using $\hat\alpha$, an estimate of $\alpha$ obtained employing the information contained in the brown plus red area, i.e., from the OLS regression of Y on Z. Under the null hypothesis the red area is of trivial size, so, at least in an asymptotic sense, $\hat\alpha$, and thus $\hat\sigma_1^2$, is based on the brown area. When $H_0$ is false (Figure 2) this red area is not trivial, so $\hat\sigma_1^2$ is no longer in any meaningful sense based solely on the information in the brown area. The Atkinson idea is to ensure that the estimate of $\sigma_1^2$ is calculated using an estimate of $\alpha$ based on the information in only the brown area. This estimate $\alpha^*$ of $\alpha$ is calculated in two steps. First, regress Y on X and obtain the predicted (or fitted) value of Y (the blue plus brown area in Figure 1 or 2). Second, regress these fitted values on Z to obtain $\alpha^*$; because the overlap of the blue plus brown with the Z circle is the brown area, $\alpha^*$ is based solely on the brown area. Then $\alpha^*$ is employed to estimate $\sigma_1^2$ as $\sigma_1^{2*}$. The Atkinson variant of the Cox statistic is then based on $\sigma_1^{2*} - \hat\sigma_{10}^2$. The Atkinson variant of the J test, the JA test, is formed by replacing $Z\hat\alpha$ in the J test with $Z\alpha^*$.
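The two-step construction of $\alpha^*$ and the resulting JA statistic can be sketched as follows. The function name `ja_test`, the simulated data, and the conventional OLS standard error are assumptions of this example rather than details given in the text.

```python
import numpy as np


def ols(y, X):
    """OLS of y on X: returns (coefficients, fitted values)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, X @ coef


def ja_test(y, X, Z):
    """Return (t statistic on Z @ alpha_star, alpha_star) for the JA variant."""
    _, y_hat_x = ols(y, X)                  # step 1: fitted values from H0
    alpha_star, _ = ols(y_hat_x, Z)         # step 2: regress those fitted values on Z
    W = np.column_stack([X, Z @ alpha_star])
    coef, w_fitted = ols(y, W)
    resid = y - w_fitted
    n, k = W.shape
    s2 = resid @ resid / (n - k)
    t_stat = coef[-1] / np.sqrt(s2 * np.linalg.inv(W.T @ W)[-1, -1])
    return t_stat, alpha_star


# Simulated data (hypothetical design): H0 is the true model.
rng = np.random.default_rng(5)
n = 200
x1, x2, u, z2 = rng.normal(size=(4, n))
X = np.column_stack([np.ones(n), x1, x2])
Z = np.column_stack([np.ones(n), 0.5 * x1 + u, z2])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

t_ja, alpha_star = ja_test(y, X, Z)
print(f"JA statistic: {t_ja:.3f}")
```

The only change from the J test is the first-stage dependent variable: the fitted values from $H_0$ replace Y itself, so that the estimate of $\alpha$ draws only on the part of Y that X explains.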

5. The Non-Nested F Test. Suppose Y is regressed on X and $Z^*$, where $Z^*$ is Z with any variables it shares with X deleted. This modification to Z is necessary to avoid perfect multicollinearity and thereby permit this regression to be run. $Z^*$ looks like Z, but is missing part of Z's overlap with X. An F test on the coefficient vector of $Z^*$ is used as a non-nested hypothesis test; if this coefficient vector is insignificantly different from the zero vector, $H_0$ is accepted; otherwise, $H_0$ is rejected. An F test of this nature tests whether or not the sum of squared errors is reduced significantly when $Z^*$ is added to the set of regressors X. This change in the sum of squared errors is clearly just the red area, so once again we find that the red area is being tested against zero.⁵
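A sketch of this F test follows. It assumes that the only variable shared by X and Z is the constant, so that $Z^*$ is Z without its first column; the function name `nonnested_f` and the simulated data are likewise hypothetical.

```python
import numpy as np


def sse(y, X):
    """Sum of squared OLS residuals from regressing y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return resid @ resid


def nonnested_f(y, X, Z_star):
    """F statistic for the joint significance of Z_star when added to X."""
    n, k = X.shape
    q = Z_star.shape[1]
    sse_r = sse(y, X)                                  # restricted: X only
    sse_u = sse(y, np.column_stack([X, Z_star]))       # unrestricted: X and Z_star
    return ((sse_r - sse_u) / q) / (sse_u / (n - k - q))


# Simulated data (hypothetical design): H0 is the true model.
rng = np.random.default_rng(6)
n = 200
x1, x2, u, z2 = rng.normal(size=(4, n))
X = np.column_stack([np.ones(n), x1, x2])
Z = np.column_stack([np.ones(n), 0.5 * x1 + u, z2])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

Z_star = Z[:, 1:]          # drop the constant, which Z shares with X
f_stat = nonnested_f(y, X, Z_star)
print(f"Non-nested F statistic: {f_stat:.3f}")
```

The statistic would be compared with a critical value from the F distribution with q and n - k - q degrees of freedom.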

6. Dastoor's R Test. When $H_0$ is true, $\hat\alpha$ and $\alpha^*$ should be insignificantly different from each other, a principle formalized in the R test of Dastoor (1983). Since $\alpha^*$ is based on information in the brown area, whereas $\hat\alpha$ is based on information in the brown plus red area, this test is also clearly focussing on the influence of the red area. The larger is the magnitude of the red area, the greater one would expect its influence to be in this regard.

⁵ Both the J test and the non-nested F test are testing whether a sum of squared errors representing the red area is significant. The J test is constrained to use a (stochastic) linear combination of the Z variables for this purpose, however, whereas the non-nested F test is not, implying that the measured magnitude of the red area will always be larger in the F test. Although this phenomenon is to some extent offset by the adjustments for degrees of freedom built into the critical values of the respective test statistics, this is not the whole story. As stressed by Mizon and Richard (1986), the implicit null hypothesis is different for the one degree-of-freedom test than for the many degrees-of-freedom test. Hendry (1983) has a clear exposition of this in the context of the Cox test versus the non-nested F test. See also Dastoor and McAleer (1987).


V. CONCLUSION

The common theme of non-nested hypothesis tests, illustrated in Figures 1 and 2 as the significance of the magnitude of the red area, has been brought into focus by the diagrammatic presentation of this paper. The fact that these tests are all implicitly testing the magnitude of the red area against zero should not, however, suggest that they are identical. All calculate the magnitude of the red area in different ways, and even though some methods are asymptotically equivalent under the null and under local alternatives, in general they are not equivalent in small samples. This provides scope for one test to outperform the others in small samples, an issue which has been addressed in Monte Carlo studies.

REFERENCES

Atkinson, A. C. (1970), “A Method for Discriminating between Models”, Journal of the Royal Statistical Society, vol. B32.

Dastoor, N. K. (1983), “Some Aspects of Testing Non-Nested Hypotheses”, Journal of Econometrics, vol. 21.

Dastoor, N. K. and McAleer, M. (1987), “On the Consistency of Joint and Paired Tests for Non-nested Regression Models”, Journal of Quantitative Economics, vol. 3.

Davidson, R. and MacKinnon, J. (1982), “Some Non-Nested Hypothesis Tests and the Relations Among Them”, Review of Economic Studies, vol. 49.

Fisher, G. R. (1983), “Tests for two Separate Regressions”, Journal of Econometrics, vol. 21.

Fisher, G. R. and McAleer, M. (1981), “Alternative Procedures and Associated Tests of Significance for Non-Nested Hypotheses”, Journal of Econometrics, vol. 16.

Godfrey, L. G. and Pesaran, M. H. (1983), “Tests of Non-Nested Regression Models: Small Sample Adjustments and Monte Carlo Evidence”, Journal of Econometrics, vol. 21.

Hendry, D. F. (1983), “Comment”, Econometric Reviews, vol. 2.

Kennedy, P. E. (1981), “The ‘Ballentine’: A Graphical Aid for Econometrics”, Australian Economic Papers, vol. 20.

McAleer, M. (1987), “Specification Tests for Separate Models: A Survey”, in King, M. L. and Giles, D. E. A. (eds), Specification Analysis in the Linear Model (London: Routledge and Kegan Paul).

MacKinnon, J. G. (1983), “Model Specification Tests Against Non-Nested Alternatives”, Econometric Reviews, vol. 2.

Mizon, G. E. and Richard, J.-F. (1986), “The Encompassing Principle and its Application to Testing Non-nested Hypotheses”, Econometrica, vol. 54.

Pesaran, M. H. (1974), “On the General Problem of Model Selection”, Review of Economic Studies, vol. 41.