econometrics_I definitions, theorems and facts (7/27/2019)

    1 Useful mathematics, notation and definitions

Fact 1.1. If we multiply a column vector $x$ by a matrix $A_{m\times n}$, this defines a function $A: \mathbb{R}^n \to \mathbb{R}^m$.

Notation 1. $A = [a_{(1)} \cdots a_{(n)}]$, $a_{(i)} \in \mathbb{R}^m$, refers to columns $1$ through $n$ of matrix $A$.

Notation 2. $A = \begin{pmatrix} a_1 \\ \vdots \\ a_m \end{pmatrix}$, $a_i \in \mathbb{R}^n$, refers to the rows of $A$.

Definition 1.1. $C(A) = \langle a_{(1)},\dots,a_{(n)}\rangle$ refers to the space spanned by the columns of $A$.

Proposition 1.1. $D = AB$ and $B$ nonsingular $\implies \operatorname{rank}(D) = \operatorname{rank}(A)$.

Proposition 1.2. Let $A_{m\times n}$; then $\operatorname{rank}(A) \le \min\{m, n\}$.

Proposition 1.3. $D = AB \implies \operatorname{rank}(D) \le \min\{\operatorname{rank}(A), \operatorname{rank}(B)\}$.

Proposition 1.4. Let $A_{m\times n}$; then $\operatorname{rank}(A) = \operatorname{rank}(AA') = \operatorname{rank}(A'A)$.
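These rank propositions are easy to check numerically. A minimal sketch with NumPy, where the matrices $A$ and $B$ are arbitrary illustrative values:

```python
import numpy as np

rank = np.linalg.matrix_rank

# An arbitrary 4x3 matrix A (m=4, n=3)
A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.],
              [1., 0., 1.]])
# Proposition 1.2: rank(A) <= min(m, n)
assert rank(A) <= min(A.shape)

# Proposition 1.4: rank(A) = rank(AA') = rank(A'A)
assert rank(A) == rank(A @ A.T) == rank(A.T @ A)

# Proposition 1.1: multiplying by a nonsingular B preserves rank
B = np.array([[2., 1., 0.],
              [0., 1., 0.],
              [1., 0., 3.]])
assert np.linalg.det(B) != 0            # B is nonsingular
assert rank(A @ B) == rank(A)

# Proposition 1.3: rank(AB) <= min(rank(A), rank(B))
assert rank(A @ B) <= min(rank(A), rank(B))
```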

    1.1 Some Vector Calculus

Consider the least-squares objective $S(\beta) = (Y - X\beta)'(Y - X\beta)$, to be minimized over $\beta$, where $Y$ is $n\times 1$, $X$ is $n\times k$ and $\beta$ is $k\times 1$.

Fact 1.2. Derivative with respect to a column vector:

$$\frac{\partial S(\beta)}{\partial \beta} = \begin{pmatrix} \partial S(\beta)/\partial\beta_1 \\ \vdots \\ \partial S(\beta)/\partial\beta_k \end{pmatrix}$$

Fact 1.3. Derivative with respect to a row vector:

$$\frac{\partial S(\beta)}{\partial \beta'} = \left( \frac{\partial S(\beta)}{\partial\beta_1}, \dots, \frac{\partial S(\beta)}{\partial\beta_k} \right)$$

Fact 1.4. Let $\psi(\cdot): \mathbb{R}^k \to \mathbb{R}^m$, so that

$$\psi(\beta) = \begin{pmatrix} \psi_1(\beta) \\ \psi_2(\beta) \\ \vdots \\ \psi_m(\beta) \end{pmatrix}$$

is a vector-valued mapping. Then

$$\frac{\partial \psi(\beta)}{\partial \beta'} = \begin{pmatrix} \partial\psi_1(\beta)/\partial\beta' \\ \vdots \\ \partial\psi_m(\beta)/\partial\beta' \end{pmatrix}$$

where each element is a row vector as in Fact 1.3, and $\partial\psi(\beta)/\partial\beta'$ is an $m\times k$ matrix.¹

Fact 1.5. Let $\psi(\cdot): \mathbb{R}^k \to \mathbb{R}^m$, with $\psi(\beta) = \left(\psi_1(\beta), \psi_2(\beta), \dots, \psi_m(\beta)\right)'$, be a vector-valued mapping. Then

$$\frac{\partial \psi'(\beta)}{\partial \beta} = \left( \frac{\partial\psi_1(\beta)}{\partial\beta}, \dots, \frac{\partial\psi_m(\beta)}{\partial\beta} \right)$$

where each element is a column vector as in Fact 1.2, and $\partial\psi'(\beta)/\partial\beta$ is a $k\times m$ matrix.

¹ The symbol $'$ means transpose.
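A finite-difference sketch of the shape conventions in Facts 1.4 and 1.5; the map `psi` below is an arbitrary example chosen only for illustration:

```python
import numpy as np

def psi(b):
    # Arbitrary illustrative map psi: R^3 -> R^2
    return np.array([b[0]**2 + b[1], b[1] * b[2]])

def jacobian(f, b, h=1e-6):
    """Numerical d f(b) / d b': an (m x k) matrix, as in Fact 1.4."""
    m, k = f(b).size, b.size
    J = np.zeros((m, k))
    for j in range(k):
        e = np.zeros(k)
        e[j] = h
        J[:, j] = (f(b + e) - f(b - e)) / (2 * h)   # central differences
    return J

b = np.array([1.0, 2.0, 3.0])
J = jacobian(psi, b)
assert J.shape == (2, 3)        # m x k, as in Fact 1.4
assert J.T.shape == (3, 2)      # Fact 1.5: the transpose is k x m
# Analytic Jacobian of psi at b: [[2*b0, 1, 0], [0, b2, b1]]
assert np.allclose(J, [[2., 1., 0.], [0., 3., 2.]], atol=1e-4)
```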


Fact 1.6. Let $y = Ax$ where $A$ does not depend on $x$; then $\partial y/\partial x' = A$, where $\partial y/\partial x'$ is an $m\times n$ matrix.

Fact 1.7. Let $y = Ax$ where $A$ does not depend on $x$; then $\partial y/\partial x = A'$, where $\partial y/\partial x$ is an $n\times m$ matrix.

Proposition 1.5. Suppose $y = Ax(\lambda)$, where $\lambda = (\lambda_1, \dots, \lambda_r)'$. If $A$ does not depend on $\lambda$, then

$$\frac{\partial y(\lambda)}{\partial \lambda'} = \frac{\partial y}{\partial x'}\,\frac{\partial x}{\partial \lambda'} = A\,\frac{\partial x(\lambda)}{\partial \lambda'}$$

and the resulting matrix is $m\times r$.

Note 1. $A\,\partial x'(\lambda)/\partial\lambda$ is not well defined. To see this:

$$\frac{\partial x'}{\partial \lambda} = \begin{pmatrix} \partial x_1/\partial\lambda_1 & \partial x_2/\partial\lambda_1 & \cdots & \partial x_n/\partial\lambda_1 \\ \partial x_1/\partial\lambda_2 & \partial x_2/\partial\lambda_2 & \cdots & \partial x_n/\partial\lambda_2 \\ \vdots & \vdots & \ddots & \vdots \\ \partial x_1/\partial\lambda_r & \partial x_2/\partial\lambda_r & \cdots & \partial x_n/\partial\lambda_r \end{pmatrix}_{r\times n},$$

but $A$ is $m\times n$, so the product does not conform.

Fact 1.8. $y = x'Bx = (x'Bx)' = x'B'x = y'$, since $y$ is a scalar.

Proposition 1.6. If Fact 1.8 holds, then we can always find a symmetric matrix $A$ such that $x'Bx = x'Ax$.

Proof. Let $A = \frac{1}{2}(B + B')$, which is symmetric. Then

$$y = \frac{1}{2}\left(x'Bx + x'B'x\right) = \frac{1}{2}\left[x'(B + B')x\right] = \frac{1}{2}\left(2x'Ax\right) = x'Ax = y'.$$
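A quick numerical sketch of Proposition 1.6, where the matrix $B$ and vector $x$ are arbitrary random draws:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.normal(size=(4, 4))          # arbitrary (non-symmetric) matrix
x = rng.normal(size=4)

A = 0.5 * (B + B.T)                  # symmetric part of B
assert np.allclose(A, A.T)           # A is symmetric
# The quadratic form only "sees" the symmetric part: x'Bx = x'Ax
assert np.isclose(x @ B @ x, x @ A @ x)
```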

Proposition 1.7. $y = x'Ax$, with $A$ symmetric (as we may assume by Proposition 1.6), implies $\dfrac{\partial y}{\partial x'} = 2x'A$ and $\dfrac{\partial y}{\partial x} = 2Ax$.
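Proposition 1.7 can be verified against a central-difference gradient; the matrix $A$ below is an arbitrary symmetric example:

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 3.]])             # arbitrary symmetric matrix
x = np.array([1.0, -2.0])
h = 1e-6

quad = lambda v: v @ A @ v           # y = x'Ax
# Central-difference gradient, one coordinate at a time
grad = np.array([(quad(x + h * np.eye(2)[i]) - quad(x - h * np.eye(2)[i])) / (2 * h)
                 for i in range(2)])
assert np.allclose(grad, 2 * A @ x, atol=1e-4)   # dy/dx = 2Ax
```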

1.2 Geometry of Least Squares

$y = x_1\beta_1 + \dots + x_k\beta_k$ implies that $y$ lives in the linear space generated by $\langle x_1,\dots,x_k\rangle \subseteq \mathbb{R}^n$; however, when we add $\varepsilon$, we account for all aspects of $y$ that do not live in $\langle x_1,\dots,x_k\rangle$.

Definition 1.2. We call $P$ a projection matrix if:

1. $P: \mathbb{R}^n \to \mathbb{R}^n$
2. $P = P'$
3. $P^2 = P$

Fact 1.9. In our particular case, $P = X(X'X)^{-1}X'$.
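The three defining properties, and Fact 1.9, can be checked numerically; a sketch with an arbitrary simulated design matrix $X$:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))                 # n=20 observations, k=3 regressors
P = X @ np.linalg.inv(X.T @ X) @ X.T         # P = X(X'X)^{-1}X'

assert np.allclose(P, P.T)                   # P = P'
assert np.allclose(P @ P, P)                 # P^2 = P (idempotent)

# P projects any y in R^n onto the column space of X:
# the residual y - Py is orthogonal to every column of X
y = rng.normal(size=20)
assert np.allclose(X.T @ (y - P @ y), 0)
```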

2 Maximum likelihood method

Definition 2.1. Suppose that random variables $X_1,\dots,X_n$ have a joint density or frequency function $f(x_1, x_2,\dots,x_n\,|\,\theta)$. Given observed values $X_i = x_i$, where $i = 1,\dots,n$, the likelihood of $\theta$ as a function of $x_1, x_2,\dots,x_n$ is defined as

$$\operatorname{lik}(\theta) = f(x_1,\dots,x_n\,|\,\theta) \tag{1}$$


[Figure 1: Geometry of Least Squares. $y$ is projected onto the plane spanned by $x_1$ and $x_2$; the residual $e = y - X\beta$ is orthogonal to that plane.]

Definition 2.2. The maximum likelihood estimate is the value of $\theta$ which maximizes (1). In the particular case where the $X_i$ are assumed to be i.i.d., their joint density is the product of the marginal densities, and the likelihood is

$$\operatorname{lik}(\theta) = \prod_{i=1}^{n} f(X_i\,|\,\theta) \tag{2}$$

Rather than maximizing (1) itself, it is usually easier to maximize its natural logarithm (which is equivalent, since the logarithm is a monotonic function). For an i.i.d. sample, the log likelihood is

$$l(\theta) = \sum_{i=1}^{n} \ln\left[f(X_i\,|\,\theta)\right] \tag{3}$$
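A small sketch of maximizing (3) for an assumed i.i.d. $N(\mu, 1)$ sample (simulated data, arbitrary parameter values); for this model the MLE of $\mu$ has the closed form $\bar{x}$, which the grid search recovers:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(loc=5.0, scale=1.0, size=200)     # i.i.d. N(mu, 1), true mu = 5

def loglik(mu):
    # Equation (3): sum of log normal densities with sigma^2 = 1 known
    return np.sum(-0.5 * np.log(2 * np.pi) - 0.5 * (x - mu) ** 2)

# Maximize the log likelihood over a grid of candidate values
grid = np.linspace(3, 7, 4001)
mu_hat = grid[np.argmax([loglik(m) for m in grid])]

# The grid maximizer matches the closed-form MLE, the sample mean
assert abs(mu_hat - x.mean()) < 1e-3
```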

    2.1 Hypothesis testing

Definition 2.3. Type I error. The procedure may lead to rejection of the null hypothesis when it is true.

Definition 2.4. Type II error. The procedure may fail to reject the null hypothesis when it is false, i.e., accepting $H_0$ when it is false.

State of nature | Accept $H_0$ | Reject $H_0$
$H_0$ true | Correct decision ($p = 1 - \alpha$) | Type I error (size of the test or significance level, denoted $\alpha$)
$H_a$ true | Type II error ($p = \beta$) | Prob. of rejecting $H_0$ when it is false: the power of the test ($p = 1 - \beta$)

Table 1: Type I and II errors


Definition 2.5. Power of a test. The power of a test is the probability that it will correctly lead to rejection of a false null hypothesis:

$$\text{Power} = 1 - \beta = 1 - P(\text{type II error})$$

Some difficulty may arise because $H_a: \theta < \theta_0$ only specifies a range, which in turn complicates by-hand calculations.

Example 2.1. Suppose that under $H_0$, $\mu = \mu_0$, and $H_a: E[D] = \delta$. Then the power of a given test is $\pi(\mu) = P(\bar{X} > c \,|\, \mu)$, that is, the probability that we reject $H_0$ when $H_a$ is true.

Remark 2.1. Notice that in the above example we are testing at a specific value of the alternative. For a case such as $H_a: \theta < \theta_0$, see Casella-Berger p. 383.

2.2 Distributions used in hypothesis testing under normality

$$z = \frac{\sqrt{n}\,(\bar{x} - \mu)}{s} \sim t[n-1] \tag{4}$$

$$c = \frac{(n-1)\,s^2}{\sigma^2} \sim \chi^2[n-1] \tag{5}$$

See page 1062 of Econometric Analysis, 7th edition, for an example.
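A Monte Carlo sketch of (4) and (5): under normality, $z$ should be centered at zero and $c$ should have the $\chi^2[n-1]$ mean and variance. All sample sizes and parameter values below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)
n, mu, sigma, reps = 25, 10.0, 2.0, 20000
x = rng.normal(mu, sigma, size=(reps, n))          # many samples under H0

s2 = x.var(axis=1, ddof=1)
z = np.sqrt(n) * (x.mean(axis=1) - mu) / np.sqrt(s2)   # equation (4)
c = (n - 1) * s2 / sigma**2                            # equation (5)

# t[n-1] is symmetric about zero; chi2[n-1] has mean n-1, variance 2(n-1)
assert abs(z.mean()) < 0.05
assert abs(c.mean() - (n - 1)) < 0.3
assert abs(c.var() - 2 * (n - 1)) < 3.0
```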

2.3 Wald Test

2.3.1 One linear restriction

Also known as the distance test or significance test.

$$W_k = \frac{\hat{\beta}_k - \beta_k}{\hat{\sigma}_{\hat{\beta}_k}} \sim t_{n-k} \tag{6}$$

The latter is the standard form of the test of a single restriction. In the case of one linear restriction on multiple parameters, denote by $c$ a $1\times k$ vector containing a restriction of the form $c\beta = d$, where $\beta$ is a $k\times 1$ vector.

$$W = \frac{c\hat{\beta} - d}{\sqrt{\hat{\sigma}^2\, c\,(X'X)^{-1} c'}} \sim t_{n-k} \tag{7}$$

Notice that the denominator in (7) is the standard deviation of the sum of the $\hat{\beta}_i$'s, taking $c$ into consideration. Finally, $W$ is the distance from $d$ in standard deviation units.

2.3.2 Multiple linear restrictions

For multiple linear restrictions $R\beta = q$, the test statistic takes the form

$$W = \left(R\hat{\beta} - q\right)' \left[\hat{\sigma}^2\, R\,(X'X)^{-1} R'\right]^{-1} \left(R\hat{\beta} - q\right)$$

and, with $\sigma^2$ known,

$$W = \frac{\left(R\hat{\beta} - q\right)' \left[R\,(X'X)^{-1} R'\right]^{-1} \left(R\hat{\beta} - q\right)}{\sigma^2} \sim \chi^2_q \tag{8}$$

Notice that (8) follows the same pattern as (7), except that now it is in quadratic terms, specifically in squared standard deviation units.


Definition 2.6. Let $z_1 \sim \chi^2(p)$ and $z_2 \sim \chi^2(m)$ be independent. Then their ratio, with each divided by its respective degrees of freedom, is a random variable with distribution $F(p, m)$:

$$\frac{z_1/p}{z_2/m} \sim F(p, m)$$

In order to build a test statistic for multiple linear restrictions, first notice that $(n-k)\,\hat{\sigma}^2 = \hat{u}'\hat{u}$, hence

$$\frac{(n-k)\,\hat{\sigma}^2}{\sigma^2} = \frac{\hat{u}'\hat{u}}{\sigma^2} \sim \chi^2_{n-k} \tag{9}$$

where the $\sigma^2$ in the denominator is used to standardize our distribution. Using the RHS of (9) and (8), and dividing each one by its degrees of freedom, we obtain

$$\frac{W/q}{\hat{u}'\hat{u}/\left[\sigma^2(n-k)\right]} = \frac{\chi^2(q)/q}{\chi^2(n-k)/(n-k)} \sim F(q, n-k)$$

$$\left(\frac{\left(R\hat{\beta} - q\right)'\left[R(X'X)^{-1}R'\right]^{-1}\left(R\hat{\beta} - q\right)}{\sigma^2}\right) \frac{1}{q} \Bigg/ \frac{\hat{\sigma}^2\,(n-k)}{\sigma^2\,(n-k)} \tag{10}$$

$$\frac{\left(R\hat{\beta} - q\right)'\left[R(X'X)^{-1}R'\right]^{-1}\left(R\hat{\beta} - q\right)}{q\,\hat{\sigma}^2} \sim F(q, n-k)$$

The unknown $\sigma^2$ cancels, leaving a statistic that can be computed from the data.
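The final statistic can be computed directly from OLS output. A sketch on simulated data, where the design, true $\beta$, and restriction rows $R$ are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 2.0, 2.0])                 # true beta_2 = beta_3 = 2
y = X @ beta + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)            # OLS estimate
u = y - X @ b                                    # residuals
s2 = u @ u / (n - k)                             # sigma^2 hat

def F_stat(R, q):
    """Final form of equation (10) for H0: R beta = q."""
    d = R @ b - q
    V = R @ np.linalg.inv(X.T @ X) @ R.T
    return (d @ np.linalg.solve(V, d)) / (R.shape[0] * s2)

# True restriction beta_2 - beta_3 = 0: F should be moderate
F_true = F_stat(np.array([[0., 1., -1.]]), np.array([0.]))
# False restriction beta_2 = 0: F should be large
F_false = F_stat(np.array([[0., 1., 0.]]), np.array([0.]))
assert F_true < 10 and F_false > 20
```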

    3 Asymptotic theory

Definition 3.1. Convergence in probability. Let $X$ be a r.v. and $\{X_n\}$ a sequence of random variables. Then, for every $\varepsilon > 0$,

$$\lim_{n\to\infty} P\left(|X_n - X| > \varepsilon\right) = 0 \quad\text{or, equivalently,}\quad \lim_{n\to\infty} P\left(|X_n - X| \le \varepsilon\right) = 1$$

It is denoted $X_n \xrightarrow{p} X$.

Definition 3.2. Almost sure convergence. Let $X$ be a r.v. and $\{X_n\}$ a sequence of random variables. Then, for every $\varepsilon > 0$,

$$P\left(\lim_{n\to\infty} |X_n - X| > \varepsilon\right) = 0 \quad\text{or}\quad P\left(\lim_{n\to\infty} |X_n - X| \le \varepsilon\right) = 1 \quad\text{or}\quad \lim_{n\to\infty} P\left(\sup_{m\ge n} |X_m - X| \le \varepsilon\right) = 1$$

It is denoted $X_n \xrightarrow{a.s.} X$.

Definition 3.3. Convergence in $r$th mean. Let $X$ be a r.v. and $\{X_n\}$ a sequence of random variables. Then

$$E|X_n - X|^r \xrightarrow[n\to\infty]{} 0$$


Definition 3.4. A sequence $\{X_n\}$ is at most of order $n^\lambda$ in probability, denoted $O_p\!\left(n^\lambda\right)$, if there exists a non-stochastic $O(1)$ (bounded) sequence $\{a_n\}$ s.t.

$$\frac{X_n}{n^\lambda} - a_n \xrightarrow{p} 0,$$

i.e.

$$\lim_{n\to\infty} P\left(\left|\frac{X_n}{n^\lambda} - a_n\right| > \varepsilon\right) = 0$$

Definition 3.5. A sequence $\{X_n\}$ is of order smaller than $n^\lambda$ in probability, denoted $o_p\!\left(n^\lambda\right)$, if

$$\frac{X_n}{n^\lambda} \xrightarrow{p} 0,$$

i.e.

$$\lim_{n\to\infty} P\left(\left|\frac{X_n}{n^\lambda}\right| > \varepsilon\right) = 0$$

Definition 3.6. A sequence of random variables $\{X_n\}$ is stochastically bounded, denoted $O_p(1)$, if for any arbitrarily small $\varepsilon > 0$ there exists a finite $M$ s.t.

$$P\left(|X_n| > M\right) \le \varepsilon \quad \text{for all } n.$$
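A simulation sketch of Definition 3.5 for the centered sample mean, which is $o_p(1)$ by the law of large numbers; the sample sizes and $\varepsilon$ below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(5)
reps = 5000

def abs_dev(n):
    """Monte Carlo draws of |Xbar_n - mu| across many replications (mu = 0)."""
    x = rng.normal(loc=0.0, scale=1.0, size=(reps, n))
    return np.abs(x.mean(axis=1))

# Xbar_n - mu is o_p(1): P(|Xbar_n - mu| > eps) shrinks as n grows
eps = 0.1
p_small = (abs_dev(100) > eps).mean()     # moderate n: noticeable probability
p_large = (abs_dev(10000) > eps).mean()   # large n: probability near zero
assert p_small > p_large
assert p_large < 0.01
```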

    Note 2. See Wooldridge page 35.
