Weighted Empiricals and Linear Models

Hira L. Koul
Michigan State University

Institute of Mathematical Statistics

LECTURE NOTES-MONOGRAPH SERIES

Volume 21

Institute of Mathematical Statistics
Hayward, California

Editorial Board

Andrew D. Barbour, John A. Rice, Robert J. Serfling (Editor), and William E. Strawderman

The production of the IMS Lecture Notes-Monograph Series is managed by the IMS Business Office: Jessica Utts, IMS Treasurer, and Jose L. Gonzalez, IMS Business Manager.

Library of Congress Catalog Card Number: 92-54658

International Standard Book Number 0-940600-28-5

Copyright © 1992 Institute of Mathematical Statistics

All rights reserved

Printed in the United States of America

To the memory of my parents

Prabhawati (Dhar) and Radhakrishen Koul (Gassi)

PREFACE

An empirical process that assigns possibly different non-random (random) weights to different observations is called a weighted (randomly weighted) empirical process. These processes are as basic to linear regression and autoregression models as the ordinary empirical process is to one sample models. However, their usefulness in studying linear regression and autoregression models has not been fully exploited. This monograph addresses this question to a large extent.

There is a vast literature in Nonparametric Inference that discusses inferential procedures based on empirical processes in k-sample location models. However, their analogs in autoregression and linear regression models are not readily accessible. This monograph makes an attempt to fill this void. The statistical methodologies studied here extend to these models many of the known results in k-sample location models, thereby giving a unified theory.

By viewing linear regression models via certain weighted empirical processes one is naturally led to new and interesting inferential procedures. Examples include minimum distance estimators of regression parameters and goodness-of-fit tests pertaining to the errors in linear models. Similarly, by viewing autoregression models via certain randomly weighted empirical processes one is naturally led to classes of minimum distance estimators of autoregression parameters and goodness-of-fit tests pertaining to the error distribution.

The introductory Chapter 1 gives an overview of the usefulness of weighted and randomly weighted empirical processes in linear models. Chapter 2 gives general sufficient conditions for the weak convergence of suitably standardized versions of these processes to continuous Gaussian processes. This chapter also contains the proof of the asymptotic uniform linearity of weighted empirical processes based on the residuals when errors are heteroscedastic and independent. Chapter 3 discusses the asymptotic uniform linearity of linear rank and signed rank statistics when errors are heteroscedastic and independent. It also includes some results about the weak convergence of weighted empirical processes of ranks and signed ranks. Chapter 4 is devoted to the study of the asymptotic behavior of M- and R-estimators of regression parameters under heteroscedastic and independent errors, via weighted empirical processes. A brief discussion about bootstrap approximations to the distribution of a class of M-estimators appears in Section 4.2b. This chapter also contains a proof of the consistency of a class of robust estimators for certain scale parameters under heteroscedastic errors.

In carrying out the analysis of variance of linear regression models based on ranks, one often needs an estimator of the functional ∫ f dφ(F), where F is the error distribution function, f its density and φ is a function from [0, 1] to the real line. Some estimators of this functional and the proofs of their consistency in the linear regression setting appear in Section 4.5.

Chapters 5 and 6 deal with minimum distance estimation, via weighted empirical processes, of the regression parameters and tests of goodness-of-fit pertaining to the error distribution. One of the main themes emerging from these two chapters is that the inferential procedures based on weighted empiricals with weights proportional to the design matrix provide the right extensions of k-sample location model procedures to linear regression models.

It is customary to expect that a method that works for linear regression models should have an analogue that will also work in autoregression models. Indeed, many of the inferential procedures based on weighted empirical processes in linear regression that are discussed in Chapters 3-6 have precise analogs in autoregression based on certain randomly weighted empirical processes and appear in Chapter 7. In particular, the proof of the asymptotic uniform linearity of the ordinary empirical process of the residuals in autoregression appears here.

All asymptotic uniform linearity results in the monograph are shown to be consequences of the asymptotic continuity of certain basic weighted and randomly weighted empirical processes.

Chapters 2-4 are interdependent. Chapter 5 is mostly self-contained and can be read after reading the Introduction. Chapter 6 uses results from Chapters 2 and 5. Chapter 7 is almost self-contained. The basic result needed for this chapter appears in Section 2.2b.

The first version of this monograph was prepared while I was visiting the Department of Statistics, Poona University, India, on sabbatical leave from Michigan State University, during the academic year 1982-83. Several lectures on some parts of this monograph were given at the Indian Statistical Institute, New Delhi, and the Universities of La Trobe, Australia, and Wisconsin, Madison. I wish to thank Professors S. R. Adke, Richard Johnson, S. K. Mitra, M. S. Prasad and B. L. S. Prakasa Rao for having some discussions pertaining to the monograph. My special thanks go to James Hannan for encouraging me to finish the project and for proofreading parts of the manuscript, to Soumendra Lahiri for helping me with sections on bootstrapping, and to Bob Serfling for taking keen interest in the monograph and for many comments that helped to improve the initial draft.

Ms. Achala Sabane and Ms. Lora Kemler had the pedestrian task of typing the manuscript. Their patient endeavors are gratefully acknowledged. Ms. Kemler's keen eye for details has been an indispensable help.

During the preparation of the monograph the author was partly supported by the University Grants Commission of India and the National Science Foundation, grant numbers NSF 82-01291 and DMS-9102041.

Hira L. Koul
East Lansing, MI 48823
5/28/92


NOTATION AND CONVENTIONS

The p-dimensional Euclidean space is denoted by R^p, p ≥ 1; R = R^1. B_p := the σ-algebra of Borel sets in R^p; B := B_1. λ := Lebesgue measure on (R, B). The symbol ":=" stands for "by definition".

For any set A ⊂ R, D(A) denotes the class of real valued functions on A that are right continuous and have left limits, while DI(A) denotes the subclass of D(A) whose members are nondecreasing. C[0, 1] := the class of real valued bounded continuous functions on [0, 1].

A vector or a matrix will be designated by a bold letter. A t ∈ R^p is a p×1 vector, t' its transpose, ||t||^2 := Σ_{j=1}^p t_j^2, |t| := max{|t_j|, 1 ≤ j ≤ p}. For any p-square matrix C, ||C|| := sup{||t'C||; ||t|| ≤ 1}. For an n×p matrix D, d_ni denotes its i-th row, 1 ≤ i ≤ n, and D_c the n×p matrix whose i-th row consists of (d_ni − d̄_n)', with d̄_n := Σ_i d_ni/n.
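A small numerical sketch of these definitions (illustrative only; the array values and variable names are mine, not the monograph's):

```python
import numpy as np

# The two vector norms defined above, for t = (3, -4)':
t = np.array([3.0, -4.0])
euclid = np.sqrt(np.sum(t ** 2))   # ||t|| = (sum_j t_j^2)^(1/2) = 5
sup = np.max(np.abs(t))            # |t| = max_j |t_j| = 4

# The centered matrix D_c for a 4 x 2 matrix D:
D = np.array([[1.0, 0.0], [2.0, 1.0], [3.0, 1.0], [4.0, 2.0]])
d_bar = D.mean(axis=0)             # row mean: sum_i d_ni / n
D_c = D - d_bar                    # i-th row of D_c is (d_ni - d_bar)'

print(euclid, sup)                 # 5.0 4.0
print(D_c.sum(axis=0))             # columns of D_c sum to zero
```

Centering D about its row mean is exactly what the condition (NXc) below applies to the design matrix X.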

w.e.p.('s) := weighted empirical process(es).
r.w.e.p.('s) := randomly weighted empirical process(es).
i.i.d. := independent identically distributed.
r.v.('s) := random variable(s).
d.f.('s) := distribution function(s).
w.r.t. := with respect to.
C-S := the Cauchy-Schwarz inequality.
D.C.T. := the Dominated Convergence Theorem.
Fubini := the Fubini Theorem.
L-F CLT := the Lindeberg-Feller Central Limit Theorem.
o(1) (o_p(1)) := a sequence of numbers (r.v.'s) converging to zero (in probability).
O(1) (O_p(1)) := a sequence of numbers (r.v.'s) that is bounded (in probability).
N(0, C) := either a r.v. with normal distribution whose mean vector is 0 and covariance matrix C, or the corresponding distribution.

||g|| := the supremum norm over the domain of g, g a real valued function.

τ_a^2 := Σ_{i=1}^n a_ni^2, for an arbitrary real vector (a_n1, ..., a_nn)'.

Often in a discussion or in a proof the subscript n on the triangular arrays and various other quantities will not be exhibited. The index i in Σ_i or Σ and max_i or max will vary from 1 to n, unless specified otherwise. All limits, unless specified otherwise, are taken as n → ∞.


For a sequence of r.v.'s {X, X_n, n ≥ 1}, X_n →_d X means that the distribution of X_n converges weakly to that of X. For two r.v.'s X, Y, X =_d Y means that the distribution of X is the same as that of Y.

For a sequence of stochastic processes {Y, Y_n, n ≥ 1}, Y_n ⇒ Y means that Y_n converges weakly to Y in a given topology. Y_n →_fd Y means that all finite dimensional distributions of Y_n converge weakly to those of Y.

Reference to an expression or a display is made by the (expression number) if referring in the same section and by the (chapter number.section number.expression number) otherwise. For example, by (3.2.1) is meant expression (1) of Section 2 of Chapter 3. A reference to this while in Section 3.2 would appear as (1).

For convenient reference we list here some of the most often used conditions in the manuscript. For an arbitrary d.f. F on R, conditions (F1), (F2) and (F3) are as follows:

(F1) F has a uniformly continuous density f w.r.t. λ.

(F2) f > 0, a.e. λ.

(F3) sup_{x ∈ R} |x f(x)| < ∞.

These conditions are introduced for the first time just before Corollary 3.2.1 and are used frequently thereafter.

For an n×p design matrix X, the conditions (NX), (NX1) and (NXc) are as follows:

(NX) (X'X)^{-1} exists, n > p; max_i x_ni'(X'X)^{-1} x_ni = o(1).

(NX1) (X_c'X_c)^{-1} exists, n > p; max_i x_ni'(X_c'X_c)^{-1} x_ni = o(1).

(NXc) (X_c'X_c)^{-1} exists, n > p; max_i (x_ni − x̄_n)'(X_c'X_c)^{-1}(x_ni − x̄_n) = o(1).

The condition (NX) is the most often used from Theorem 2.3.3 onwards. The letter N in these conditions stands for Noether, who was the first person to use (NX), in the case p = 1, to obtain the asymptotic normality of weighted sums of r.v.'s; see Noether (1949).
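As an aside not in the original text: the quantity appearing in (NX) is the largest diagonal entry of the hat matrix X(X'X)^{-1}X', i.e. the maximum leverage of the design. A minimal numerical sketch (the function name and the example design are my own choices):

```python
import numpy as np

def max_leverage(X):
    """Compute max_i x_ni' (X'X)^{-1} x_ni for an n x p design matrix X.

    Condition (NX) requires this quantity to tend to 0 as n -> infinity."""
    G = np.linalg.inv(X.T @ X)             # (X'X)^{-1}; assumes n > p and invertibility
    return float(max(x @ G @ x for x in X))

# One-sample design (p = 1, all x_ni = 1): each row has leverage 1/n,
# so (NX) holds as n grows.
n = 100
X = np.ones((n, 1))
print(max_leverage(X))   # 0.01
```

A design violating (NX) would be one with a single wildly outlying row, whose leverage stays bounded away from 0 no matter how large n gets.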


TABLE OF CONTENTS

1. Introduction 1
   1.1. Weighted Empirical Processes 1
   1.2. M-, R- and Scale Estimators 2
   1.3. Minimum Distance Estimators and Goodness-of-Fit Tests 4
   1.4. Randomly Weighted Empirical Processes 7

2. Asymptotic Properties of Weighted Empiricals 10
   2.1. Introduction 10
   2.2. Weak Convergence 10
        2.2a. W_d-Processes 10
        2.2b. V_h-Processes 20
   2.3. Asymptotic Uniform Linearity (A.U.L.) of Residual w.e.p.'s 28
   2.4. Some Further Probabilistic Results for w.e.p.'s 39

3. Linear Rank and Signed Rank Statistics 44
   3.1. Introduction 44
   3.2. AUL of Linear Rank Statistics 45
   3.3. AUL of Linear Signed Rank Statistics 57
   3.4. Weak Convergence of Rank and Signed Rank w.e.p.'s 62

4. M, R and Some Scale Estimators 71
   4.1. Introduction 71
   4.2. M-Estimators 72
        4.2a. First Order Approximations: Asymptotic Normality 72
        4.2b. Bootstrap Approximations 78
   4.3. Distribution of Some Scale Estimators 82
   4.4. R-Estimators 90
   4.5. Estimation of Q(f) 95

5. Minimum Distance Estimators 105
   5.1. Introduction 105
   5.2. Definitions of M.D. Estimators 106
   5.3. Finite Sample Properties and Existence 110
   5.4. Asymptotics of Minimum Dispersion Estimators: A General Case 122
   5.5. Asymptotic Uniform Quadraticity 128
   5.6. Asymptotic Distributions, Efficiencies, and Robustness 152
        5.6a. Asymptotic Distributions and Efficiencies 152
        5.6b. Robustness 160
        5.6c. Locally Asymptotically Minimax Property 167

6. Goodness-of-Fit Tests for the Errors 176
   6.1. Introduction 176
   6.2. The Supremum Distance Tests 178
        6.2a. Asymptotic Null Distributions 178
        6.2b. Bootstrap Distributions 187
   6.3. L2-Distance Tests 192
   6.4. Testing with Unknown Scale 201
   6.5. Testing for Symmetry of the Errors 202

7. Autoregression 209
   7.1. Introduction 209
   7.2. Asymptotic Uniform Linearity of W_h and F_n 211
   7.3. GM- and R-Estimators 223
        7.3a. GM-Estimators 223
        7.3b. R-Estimators 224
        7.3c. Estimation of Q(f) := ∫ f dφ(F) 233
        7.3d. Proof of Lemma 7.3b.1 235
   7.4. M.D. Estimation 243
   7.5. Goodness-of-Fit Testing 255

APPENDIX 256

BIBLIOGRAPHY 258