Lecture Notes: Quantitative Methods Ellen R. McGrattan · Federal Reserve Bank of Minneapolis...

Federal Reserve Bank of Minneapolis

Research Department

Revised June 2007

Lecture Notes: Quantitative Methods

Ellen R. McGrattan

Table of Contents

1. Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.1. Solving Nonlinear Systems of Equations . . . . . . . . . . . . . . . 1

1.1.1. The Bisection Method . . . . . . . . . . . . . . . . . . . . 1

1.1.2. The Newton-Raphson Method . . . . . . . . . . . . . . . . 2

1.1.3. The Secant Method . . . . . . . . . . . . . . . . . . . . . 3

1.2. Numerical Differentiation . . . . . . . . . . . . . . . . . . . . . 3

1.3. Numerical integration . . . . . . . . . . . . . . . . . . . . . . . 3

1.3.1. Choosing Quadrature Weights . . . . . . . . . . . . . . . . 4

1.3.2. Trapezoidal Rule . . . . . . . . . . . . . . . . . . . . . . 5

1.3.3. Simpson’s Rule . . . . . . . . . . . . . . . . . . . . . . . 5

1.3.4. Gauss-Legendre Quadrature . . . . . . . . . . . . . . . . . 6

2. Dynamic Programming . . . . . . . . . . . . . . . . . . . . . . . . . 8

2.1. Discrete-time Dynamic Programming . . . . . . . . . . . . . . . . 8

2.2. Continuous-time Dynamic Programming . . . . . . . . . . . . . . . 10

3. Computing Equilibria in Near-Linear Economies . . . . . . . . . . . . . . 13

3.1. Linearizing and Log-linearizing . . . . . . . . . . . . . . . . . . . 13

3.2. Mapping the Problem to a Standard LQ Problem . . . . . . . . . . . 14

3.3. A Variant on Vaughan’s Method . . . . . . . . . . . . . . . . . . . 19

4. Computing Equilibria in Nonlinear Economies . . . . . . . . . . . . . . . 23

4.1. The Method of Parameterized Expectations . . . . . . . . . . . . . 23

4.2. Weighted Residual Methods . . . . . . . . . . . . . . . . . . . . 25

4.2.1. The General Procedure . . . . . . . . . . . . . . . . . . . 26

4.2.2. Applied to the Deterministic Growth Model . . . . . . . . . . 29

4.2.3. Applied to the Stochastic Growth Model . . . . . . . . . . . . 37

5. Maximum Likelihood Estimation . . . . . . . . . . . . . . . . . . . . . 42

5.1. Vector autoregressive representation . . . . . . . . . . . . . . . . . 44

5.2. The Likelihood Function . . . . . . . . . . . . . . . . . . . . . . 44

6. A Prototype Real Business Cycle Model . . . . . . . . . . . . . . . . . . 47

6.1. A Version of the Model with AR(1) Technology . . . . . . . . . . . . 47

6.1.1. Maximization problems . . . . . . . . . . . . . . . . . . . 47

6.1.2. First-order conditions . . . . . . . . . . . . . . . . . . . . 48

6.1.3. Log-linear computation . . . . . . . . . . . . . . . . . . . 49

6.2. A Version of the Model with Random Walk Technology . . . . . . . . 51




6.3. MLE Estimation . . . . . . . . . . . . . . . . . . . . . . . . . 52

6.3.1. State-space form in the general case . . . . . . . . . . . . . . 52

6.3.2. Log-likelihood function . . . . . . . . . . . . . . . . . . . . 52

6.3.3. MLE in the Benchmark Case . . . . . . . . . . . . . . . . . 53

6.3.4. MLE in the Random Walk Case . . . . . . . . . . . . . . . . 54

6.4. Simulating Data from the Models . . . . . . . . . . . . . . . . . . 56

7. A Prototype Sticky Price Model . . . . . . . . . . . . . . . . . . . . . 57

7.1. Model Economy . . . . . . . . . . . . . . . . . . . . . . . . . 57

7.2. Computing an Equilibrium . . . . . . . . . . . . . . . . . . . . . 60

8. Business Cycle Accounting . . . . . . . . . . . . . . . . . . . . . . . 66

8.1. The Prototype Model with Time-Varying Wedges . . . . . . . . . . . 66

8.2. Mapping Frictions to Wedges . . . . . . . . . . . . . . . . . . . . 68

8.2.1. Efficiency Wedges . . . . . . . . . . . . . . . . . . . . . . 68

8.2.2. Labor Wedges . . . . . . . . . . . . . . . . . . . . . . . 72

8.3. The Accounting Procedure . . . . . . . . . . . . . . . . . . . . . 76

8.3.1. The Accounting Procedure at a Conceptual Level . . . . . . . . 76

8.3.2. A Markovian Implementation . . . . . . . . . . . . . . . . . 77

9. Structural VARs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

9.1. A Version of the RBC Model . . . . . . . . . . . . . . . . . . . 80




9.2. VARs and the 2-Shock Version of the Model . . . . . . . . . . . . . 85

9.2.1. The Decision Functions . . . . . . . . . . . . . . . . . . . 85

9.2.2. The Model’s Moving Average . . . . . . . . . . . . . . . . . 88

9.2.3. Special Property of the D’s . . . . . . . . . . . . . . . . . . 89

9.2.4. VAR Coefficients . . . . . . . . . . . . . . . . . . . . . . 89

9.2.5. Proposition 1: Model has infinite-order VAR . . . . . . . . . . 90

9.2.6. Blanchard-Quah Identification . . . . . . . . . . . . . . . . 92

9.2.6.1. Sign convention on A0(1, 1) . . . . . . . . . . . . . . . 94

9.2.6.2. Sign convention on A(1, 1) . . . . . . . . . . . . . . . 94

9.2.6.3. Full solution . . . . . . . . . . . . . . . . . . . . . 94

9.2.6.4. Cholesky decomposition . . . . . . . . . . . . . . . . 95

9.2.7. Proposition 2: OLS Results . . . . . . . . . . . . . . . . . . 95

9.2.8. The Propositions for Two Special Cases . . . . . . . . . . . . 98

9.2.8.1. Proposition 3a: No capital in the model . . . . . . . . . 98

9.2.8.2. Proposition 3b: Only one shock . . . . . . . . . . . . . 99

9.3. VARs and 3-Shock Versions of the Model . . . . . . . . . . . . . . 102

9.3.1. Adding an Investment Tax Shock . . . . . . . . . . . . . . . 102

9.3.1.1. The Model’s Moving Average . . . . . . . . . . . . . . 106

9.3.1.2. Special Property of the D’s . . . . . . . . . . . . . . . 107

9.3.1.3. Proposition 4: Model has infinite-order VAR . . . . . . . 108

9.3.1.4. A Way to Make M Singular . . . . . . . . . . . . . . . 111

Chapter 1.

Background

1.1. Solving Nonlinear Systems of Equations

1.1.1. The Bisection Method

The problem is: find x on [a, b] such that f(x) = 0, where f : ℝ → ℝ is a continuous

function and f(a) and f(b) have opposite signs. See, for example, Figure 1. The idea

of the bisection method is to bracket a root. Once the root is bracketed, simply divide

the interval in half, figure out which of the two halves brackets the root, and repeat the

process. (On Figure 1, the halfway point is marked. Notice the function lies above the

x-axis at that point. Therefore, at the second iteration, this will be the new point a.)

[Figure 1: f(x) plotted against x, with the bracketing endpoints a and b and the halfway point marked on the x-axis.]

One problem with this method is that it can be slow; the error bound is

$$|x_m - x^*| \le (b - a)/2^m,$$

where m is the index for the mth iteration and $x^*$ is the solution. Suppose b − a is 1 and one wants the error to be less than or equal to $10^{-6}$. Then we need $2^m \ge 10^6$, that is, $m \ge 6/\log_{10} 2 \approx 19.9$, so we need to iterate 20 times.
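The bracketing-and-halving step can be sketched in a few lines of Python. This is a minimal illustration, not code from the notes: the function name `bisect` and the test function x² − 2 are assumptions chosen for the example.

```python
import math

def bisect(f, a, b, iterations):
    """Halve the bracketing interval [a, b] a fixed number of times."""
    fa = f(a)
    if fa * f(b) > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(iterations):
        mid = 0.5 * (a + b)
        fm = f(mid)
        if fa * fm <= 0:
            b = mid            # the root is bracketed by [a, mid]
        else:
            a, fa = mid, fm    # the root is bracketed by [mid, b]
    return 0.5 * (a + b)

# Iterations needed for an interval of length 1 and an error of 1e-6.
iterations_needed = math.ceil(6 / math.log10(2))

# Hypothetical test problem: the root of x^2 - 2 on [1, 2].
root = bisect(lambda x: x * x - 2.0, 1.0, 2.0, iterations_needed)
```

The iteration count is fixed in advance here because the error bound depends only on the interval length and m, not on f.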


1.1.2. The Newton-Raphson Method

The problem is: find x such that f(x) = 0, where f : ℝⁿ → ℝⁿ is a continuously differentiable function.

The idea behind this method is to apply a Taylor expansion,

$$f(x) = f(\bar{x}) + f'(\bar{x})(x - \bar{x}) + \text{higher order terms},$$

where the expansion is taken around an approximate solution $\bar{x}$ to f(x) = 0. Evaluate f at the solution $x^*$, where $f(x^*) = 0$. Assuming that $\bar{x}$ is sufficiently close to $x^*$, higher order terms are small and

$$x^* \approx \bar{x} - f(\bar{x})/f'(\bar{x})$$

in the case that n = 1. This leads to the following updating scheme:

$$x^{k+1} = x^k - f(x^k)/f'(x^k),$$

starting at k = 0 with initial guess $x^0$. For n > 1, the updating scheme is:

$$x^{k+1} = x^k - J(x^k)^{-1} f(x^k)$$

where the i, j element of J(x) is the derivative of the ith element of f with respect to

the jth element of x. Figure 2 displays one step of the updating scheme. Point x0 is the

starting point. Draw a tangent of f at that point and trace it to the x-axis. The crossing

point is the new guess, x1. Repeat the procedure until the fixed point is found.

[Figure 2: f(x) plotted against x, with the starting point x0 and the new guess x1 marked on the x-axis.]


When $x^0$ is sufficiently far from a solution, one might have problems. (Try, for example, finding the root of f(x) = log(x) with a starting point of $x^0 = 5$; the first step lands on a negative number, where the logarithm is undefined. Try again with $x^0 = 2$.) On the other hand, if the method converges, it achieves quadratic convergence, that is, $|x^{k+1} - x^*| \le c\,|x^k - x^*|^2$ for some constant c.
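The scalar updating scheme and the log(x) experiment suggested above can be tried with the following sketch. The function name `newton`, the tolerance, and the iteration cap are assumptions made for illustration.

```python
import math

def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Scalar Newton-Raphson: x <- x - f(x)/f'(x) until the step is tiny."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x = x - step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton-Raphson did not converge")

# Starting at 2 the iterates stay positive and converge to the root of log(x).
root_from_2 = newton(math.log, lambda x: 1.0 / x, 2.0)

# Starting at 5 the first update is 5 - 5*log(5) < 0, so the next
# evaluation of log fails with a domain error.
try:
    newton(math.log, lambda x: 1.0 / x, 5.0)
    failed_from_5 = False
except ValueError:
    failed_from_5 = True
```

In practice one would guard against such failures, for example by shortening the step when it leaves the domain of f; that safeguard is not part of the basic scheme described here.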

1.1.3. The Secant Method

The secant method uses the same updating scheme as Newton-Raphson but numerical

derivatives are used for J . (See next section.)

1.2. Numerical Differentiation

The problem is: find df(x)/dx where f : ℝ → ℝ is differentiable. In this case Taylor expansions can be taken around a point x,

$$f(x+h) = f(x) + f'(x)h + \tfrac{1}{2} f''(x) h^2 + \text{higher order terms}$$

$$f(x-h) = f(x) - f'(x)h + \tfrac{1}{2} f''(x) h^2 + \text{higher order terms}$$

to get approximations

$$f'(x) \approx \frac{f(x+h) - f(x)}{h}$$

$$f'(x) \approx \frac{f(x) - f(x-h)}{h}$$

$$f'(x) \approx \frac{f(x+h) - f(x-h)}{2h}$$

for the first derivative and

$$f''(x) \approx \frac{1}{h^2}\left[f(x+h) - 2f(x) + f(x-h)\right]$$

for the second derivative. These are not the only possible approximations, but they are the ones most typically used in applications. In practice, h cannot be too small, because roundoff error in the differences grows as h shrinks. If x is a vector and f is vector-valued, then differentiation can be done element by element.
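The difference formulas translate directly into code. The sketch below is illustrative only; the function names and the choice of sin(x) at x = 1 with h = 1e-4 are assumptions.

```python
import math

def forward_diff(f, x, h):
    return (f(x + h) - f(x)) / h

def backward_diff(f, x, h):
    return (f(x) - f(x - h)) / h

def central_diff(f, x, h):
    return (f(x + h) - f(x - h)) / (2.0 * h)

def second_diff(f, x, h):
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

# Hypothetical check against known derivatives of sin at x = 1.
x, h = 1.0, 1e-4
err_forward = abs(forward_diff(math.sin, x, h) - math.cos(x))
err_backward = abs(backward_diff(math.sin, x, h) - math.cos(x))
err_central = abs(central_diff(math.sin, x, h) - math.cos(x))
err_second = abs(second_diff(math.sin, x, h) + math.sin(x))
```

The one-sided formulas have truncation error proportional to h, while the central formula has error proportional to h², which is why the central difference is usually preferred when both f(x + h) and f(x − h) are available.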

1.3. Numerical integration

The problem is: find $\int_a^b f(x)\,dx$.


Before solving the problem, it helps to start with some preliminary theorems and definitions used later. Let $x_0, x_1, \ldots, x_n$ be n + 1 distinct points in the interval [a, b] and let f be a function with n + 1 continuous derivatives on [a, b] (which is denoted simply by $f \in C^{n+1}[a, b]$).

Theorem: There exists a unique polynomial $P_n$ of degree at most n such that

$$f(x_k) = P_n(x_k), \quad k = 0, \ldots, n$$

and

$$P_n(x) = f(x_0) L_{n,0}(x) + f(x_1) L_{n,1}(x) + \cdots + f(x_n) L_{n,n}(x) \tag{1.3.1}$$

where

$$L_{n,k}(x) = \prod_{i=0,\ i \ne k}^{n} \frac{x - x_i}{x_k - x_i}.$$

$P_n$ is called the nth Lagrange interpolating polynomial.

Theorem: For each x in [a, b] there exists a point y(x) in (a, b) such that

$$f(x) = P_n(x) + \frac{f^{(n+1)}(y(x))}{(n+1)!}\,(x - x_0)(x - x_1) \cdots (x - x_n),$$

with $P_n(x)$ given by (1.3.1).

1.3.1. Choosing Quadrature Weights

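Approximate the integral with a weighted sum, that is,

$$\int_a^b f(x)\,dx \approx \sum_{k=1}^{n} \omega_k f(x_k).$$

Here, we can use the Lagrange interpolating polynomial $P_n$:

$$\begin{aligned}
\int_a^b f(x)\,dx &= \int_a^b P_n(x)\,dx + \mathrm{err}(f) \\
&= \int_a^b \sum_{k=0}^{n} f(x_k) L_{n,k}(x)\,dx + \mathrm{err}(f) \\
&= \sum_{k=0}^{n} \left[\int_a^b L_{n,k}(x)\,dx\right] f(x_k) + \mathrm{err}(f) \\
&= \sum_{k=0}^{n} \omega_k f(x_k) + \mathrm{err}(f) \\
&\approx \sum_{k=0}^{n} \omega_k f(x_k)
\end{aligned}$$

where the weights $\omega_k$ are given by

$$\omega_k = \int_a^b L_{n,k}(x)\,dx$$

for k = 0, . . . , n and the error err(f) is

$$\mathrm{err}(f) = \int_a^b \frac{f^{(n+1)}(y(x))}{(n+1)!} \prod_{k=0}^{n} (x - x_k)\,dx.$$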

The err(f) formula can be used to compute bounds on errors.
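As a rough illustration of the weight formula, the sketch below integrates each Lagrange basis polynomial exactly using NumPy's `poly1d` class. The function name `lagrange_weights` and the example nodes are assumptions, not part of the notes.

```python
import numpy as np

def lagrange_weights(nodes, a, b):
    """Weights w_k = integral over [a, b] of the Lagrange basis polynomial L_{n,k}."""
    nodes = np.asarray(nodes, dtype=float)
    weights = []
    for k, xk in enumerate(nodes):
        others = np.delete(nodes, k)
        # Numerator: polynomial with roots at the other nodes; denominator: constant.
        basis = np.poly1d(others, r=True) / np.prod(xk - others)
        antiderivative = basis.integ()
        weights.append(antiderivative(b) - antiderivative(a))
    return np.array(weights)

# Two nodes at the endpoints and three equally spaced nodes on [0, 1].
two_point = lagrange_weights([0.0, 1.0], 0.0, 1.0)
three_point = lagrange_weights([0.0, 0.5, 1.0], 0.0, 1.0)
```

With the endpoint nodes this reproduces the equal weights of the Trapezoidal rule, and with three equally spaced nodes the 1/6, 2/3, 1/6 pattern of Simpson's rule derived in the next two subsections.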

1.3.2. Trapezoidal Rule

For the Trapezoidal rule, use the first Lagrange polynomial with x0 = a and x1 = b. In

this case, the weights are

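$$\omega_0 = \int_{x_0}^{x_1} L_{1,0}(x)\,dx = \int_{x_0}^{x_1} \frac{x - x_1}{x_0 - x_1}\,dx = \tfrac{1}{2}(x_1 - x_0) = \tfrac{1}{2}(b - a)$$

$$\omega_1 = \int_{x_0}^{x_1} L_{1,1}(x)\,dx = \int_{x_0}^{x_1} \frac{x - x_0}{x_1 - x_0}\,dx = \tfrac{1}{2}(x_1 - x_0) = \tfrac{1}{2}(b - a)$$

and the approximate integral is

$$\int_a^b f(x)\,dx \approx \frac{h}{2}\left[f(a) + f(b)\right]$$

where h = b − a.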

1.3.3. Simpson’s Rule

For Simpson’s rule, use the second Lagrange polynomial and equally spaced nodes: x0 = a,

x1 = (a+ b)/2, and x2 = b. In this case, the weights are

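$$\omega_0 = \int_{x_0}^{x_2} L_{2,0}(x)\,dx = \int_{x_0}^{x_2} \frac{(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)}\,dx = \tfrac{1}{6}(b - a)$$

$$\omega_1 = \int_{x_0}^{x_2} L_{2,1}(x)\,dx = \int_{x_0}^{x_2} \frac{(x - x_0)(x - x_2)}{(x_1 - x_0)(x_1 - x_2)}\,dx = \tfrac{2}{3}(b - a)$$

$$\omega_2 = \int_{x_0}^{x_2} L_{2,2}(x)\,dx = \int_{x_0}^{x_2} \frac{(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)}\,dx = \tfrac{1}{6}(b - a)$$

and the approximate integral is

$$\int_a^b f(x)\,dx \approx \frac{h}{3}\left[f(a) + 4 f\!\left(\frac{a + b}{2}\right) + f(b)\right]$$

where h = (b − a)/2.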

Simpson’s rule and the Trapezoidal rule are special cases of Newton-Cotes formulas, which use equally spaced nodes $x_k = x_0 + kh$ with $x_0 = a$, $x_n = b$, and h = (b − a)/n.
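Both rules are one-line formulas in code. The sketch below applies them on a single interval; the function names and test integrands are assumptions chosen for illustration.

```python
def trapezoid(f, a, b):
    """Trapezoidal rule on [a, b] with h = b - a."""
    h = b - a
    return 0.5 * h * (f(a) + f(b))

def simpson(f, a, b):
    """Simpson's rule on [a, b] with h = (b - a)/2."""
    h = 0.5 * (b - a)
    return (h / 3.0) * (f(a) + 4.0 * f(0.5 * (a + b)) + f(b))

# The Trapezoidal rule is exact for linear integrands and
# Simpson's rule is exact for polynomials up to degree three.
trap_linear = trapezoid(lambda x: x, 0.0, 2.0)
trap_quadratic = trapezoid(lambda x: x**2, 0.0, 1.0)
simpson_cubic = simpson(lambda x: x**3, 0.0, 2.0)
```

For integrands that are not well approximated by a low-order polynomial over the whole interval, one would normally split [a, b] into subintervals and apply the rule on each; that composite version is not shown here.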

1.3.4. Gauss-Legendre Quadrature

The formulas above are weighted sums of the functional values at a set of equally spaced

points. Applying Gauss-Legendre quadrature involves choosing both the weights and the

abscissas – and doubling the degree of precision. For this, the following theorem is useful.

Theorem: If P is any polynomial of degree less than or equal to 2n − 1, then

$$\int_{-1}^{1} P(x)\,dx = \sum_{k=1}^{n} \omega_k P(x_k)$$

where

$$\omega_k = \int_{-1}^{1} \prod_{j=1,\ j \ne k}^{n} \frac{x - x_j}{x_k - x_j}\,dx \tag{1.3.2}$$

and $x_1, x_2, \ldots, x_n$ are the zeros of the nth Legendre polynomial.

Legendre polynomials can be found recursively as follows:

$$(i + 1)\,p_{i+1}(x) = (2i + 1)\,x\,p_i(x) - i\,p_{i-1}(x)$$

starting with $p_0(x) \equiv 1$ and $p_1(x) \equiv x$. This class of polynomials is orthogonal on [−1, 1] with respect to weighting function w(x) = 1, which means that

$$\int_{-1}^{1} p_j(x)\,p_k(x)\,w(x)\,dx \;\begin{cases} = 0 & \text{if } j \ne k \\ > 0 & \text{if } j = k. \end{cases}$$

Orthogonal polynomials have the property that they can be used as basis functions to represent any polynomial (assuming the degree of the polynomial does not exceed that of the highest order polynomial in the orthogonal set). Gaussian quadrature is highly accurate if the function f being integrated is well approximated by a polynomial.


Applying a simple transformation from the domain [a, b] to the domain [−1, 1] yields the following Gauss-Legendre approximation

$$\begin{aligned}
\int_a^b f(x)\,dx &= \tfrac{1}{2}(b - a) \int_{-1}^{1} f\big([(b - a) z + b + a]/2\big)\,dz \\
&\approx \tfrac{1}{2}(b - a) \sum_{k=1}^{n} \omega_k f\big([(b - a) z_k + b + a]/2\big)
\end{aligned}$$

where the weights $\omega_k$, k = 1, . . . , n are given by (1.3.2) and the abscissas $z_k$, k = 1, . . . , n are the zeros of the nth order Legendre polynomial.
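A short sketch of the transformed rule, using NumPy's `numpy.polynomial.legendre.leggauss` to supply the abscissas and weights on [−1, 1] rather than computing them from (1.3.2) directly. The wrapper name `gauss_legendre` and the test integrands are assumptions.

```python
import numpy as np

def gauss_legendre(f, a, b, n):
    """n-point Gauss-Legendre approximation of the integral of f over [a, b]."""
    z, w = np.polynomial.legendre.leggauss(n)   # nodes and weights on [-1, 1]
    x = ((b - a) * z + b + a) / 2.0             # map the nodes to [a, b]
    return 0.5 * (b - a) * np.sum(w * f(x))

# With n = 3 the rule is exact for polynomials up to degree 2n - 1 = 5.
poly_integral = gauss_legendre(lambda x: x**5, 0.0, 2.0, 3)

# A smooth non-polynomial integrand with a modest number of nodes.
exp_integral = gauss_legendre(np.exp, 0.0, 1.0, 5)
```

The integrand must accept a NumPy array, since all nodes are evaluated in one call.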


Chapter 2.

Dynamic Programming

In this chapter, we cover discrete-time and continuous-time dynamic programming.

2.1. Discrete-time Dynamic Programming

The problem is: find $\{u_t\}_{t=0}^{T}$ that solves

$$\max \sum_{t=0}^{T} \beta^t r(x_t, u_t) + V_0(x_{T+1})$$

$$\text{subject to } x_{t+1} = g(x_t, u_t) \tag{2.1.1}$$

with the initial value for x0 and the value function V0 known. Because the objective

function has terms that are separable in time, this problem can be restated recursively

with a sequence of Bellman equations,

$$V_{j+1}(x_{T-j}) = \max_{u_{T-j}} \left\{ r(x_{T-j}, u_{T-j}) + \beta V_j(x_{T-j+1}) \right\} \tag{2.1.2}$$

where the maximization is subject to (2.1.1) with V0(x) and its derivative known. The

solution is found by solving a sequence of simple maximization problems, taking as given

the value function at the last step. With j = 0 in (2.1.2), we have

$$V_1(x_T) = \max_{u_T} \left\{ r(x_T, u_T) + \beta V_0(g(x_T, u_T)) \right\} \tag{2.1.3}$$

and therefore $u_T$ satisfies

$$\frac{\partial r(x_T, u_T)}{\partial u_T} + \beta \frac{\partial g(x_T, u_T)}{\partial u_T} V_0'(x_{T+1}) = 0. \tag{2.1.4}$$

Finding uT to satisfy (2.1.4) is the standard problem described earlier, namely to solve

a nonlinear equation or system of equations if uT is a vector. If we solve this problem

for each value of xT , we can trace out the optimal decision function, call it uT = h0(xT ).

Substituting this solution into (2.1.3),

V1 (xT ) = r (xT , h0 (xT )) + βV0 (g (xT , h0 (xT ))) .


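On the next step, we’ll need the derivative of $V_1(x)$ which (if it is differentiable) is given by

$$\begin{aligned}
V_1'(x_T) &= \left[ \frac{\partial r(x_T, h_0(x_T))}{\partial u_T} + \beta \frac{\partial g(x_T, h_0(x_T))}{\partial u_T} V_0'(x_{T+1}) \right] h_0'(x_T) \\
&\quad + \frac{\partial r(x_T, h_0(x_T))}{\partial x_T} + \beta \frac{\partial g(x_T, h_0(x_T))}{\partial x_T} V_0'(x_{T+1}) \\
&= \frac{\partial r(x_T, h_0(x_T))}{\partial x_T} + \beta \frac{\partial g(x_T, h_0(x_T))}{\partial x_T} V_0'(x_{T+1})
\end{aligned} \tag{2.1.5}$$

where the second equality follows from the fact that (2.1.4) holds at a maximum, so the term in brackets is zero.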

With j = 1 in (2.1.2), we have

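$$V_2(x_{T-1}) = \max_{u_{T-1}} \left\{ r(x_{T-1}, u_{T-1}) + \beta V_1(g(x_{T-1}, u_{T-1})) \right\} \tag{2.1.6}$$

and therefore $u_{T-1}$ satisfies

$$\frac{\partial r(x_{T-1}, u_{T-1})}{\partial u_{T-1}} + \beta \frac{\partial g(x_{T-1}, u_{T-1})}{\partial u_{T-1}} V_1'(x_T) = 0. \tag{2.1.7}$$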

Notice that this expression depends on the derivative in (2.1.5). Solving (2.1.7) for $u_{T-1}$ is the same exercise as for $u_T$. If we solve this problem for each value of $x_{T-1}$, we can trace out the optimal decision function $u_{T-1} = h_1(x_{T-1})$. In fact, the same exercise is conducted for each j = 0, . . . , T and yields optimal decision functions for all $u_t$, t = 0, . . . , T.

If T = ∞ and $V_0(x) = 0$, then the time-independent solution $u_t = h(x_t)$ is found by taking j to ∞. Under certain conditions on the objective function and constraints, this limit is the solution to

$$V(x) = \max_u \left\{ r(x, u) + \beta V(\tilde{x}) \right\}$$

where $\tilde{x} = g(x, u)$ is next period's state and $V = \lim_{j \to \infty} V_j$. The limiting value function V is equal to the objective function at an optimum:

$$V(x_0) = \max_{\{u_t\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t r(x_t, u_t).$$

As a practical matter, solving the problem by solving a sequence of subproblems works for

both the finite time and the infinite time cases.
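The sequence-of-subproblems idea can be sketched on a grid. The example below is an illustration under stated assumptions rather than a method taken from these notes: it uses a growth model with r(x, u) = log(x^α − u) and g(x, u) = u (log utility and full depreciation), chosen because its infinite-horizon decision rule is known in closed form, u = αβx^α. The grid bounds, grid size, and parameter values are hypothetical, and the maximization is done by direct search over grid points instead of by solving the first-order condition.

```python
import numpy as np

def value_function_iteration(alpha=0.3, beta=0.9, n=300, tol=1e-8, max_iter=2000):
    """Iterate V_{j+1}(x) = max_u { log(x**alpha - u) + beta * V_j(u) } on a grid."""
    grid = np.linspace(0.05, 0.5, n)
    # consumption[i, j]: consumption when the state is grid[i] and the choice is grid[j]
    consumption = grid[:, None] ** alpha - grid[None, :]
    reward = np.where(consumption > 0,
                      np.log(np.maximum(consumption, 1e-12)),
                      -np.inf)                    # rule out infeasible choices
    V = np.zeros(n)                               # V_0(x) = 0
    for _ in range(max_iter):
        V_new = (reward + beta * V[None, :]).max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    policy = grid[np.argmax(reward + beta * V[None, :], axis=1)]
    return grid, V, policy

grid, V, policy = value_function_iteration()
exact_policy = 0.3 * 0.9 * grid ** 0.3            # closed form for this special case
max_policy_error = np.max(np.abs(policy - exact_policy))
```

Because the choice is restricted to grid points, the computed decision rule can only agree with the closed form up to roughly the grid spacing; a finer grid or interpolation of V would tighten the match.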


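A variation of the problem allows for stochastic shocks. The problem is: find $\{u_t\}_{t=0}^{\infty}$ to solve:

$$\max E_0 \sum_{t=0}^{\infty} \beta^t r(x_t, u_t)$$

$$\text{subject to } x_{t+1} = g(x_t, u_t, \epsilon_{t+1}) \tag{2.1.8}$$

where $\epsilon_t$ is a sequence of independently and identically distributed random variables. In this case, the Bellman equation is

$$\begin{aligned}
V(x) &= \max_u \left\{ r(x, u) + \beta \int V(g(x, u, \epsilon))\,dF(\epsilon) \right\} \\
&\equiv \max_u \left\{ r(x, u) + \beta E\left[ V(g(x, u, \epsilon)) \mid x \right] \right\}
\end{aligned} \tag{2.1.9}$$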

where F(ε) is the cumulative distribution function for ε. In this case, the first-order condition for (2.1.9) is

$$\frac{\partial r(x, u)}{\partial u} + \beta E\left[ \frac{\partial g(x, u, \epsilon)}{\partial u} V'(g(x, u, \epsilon)) \,\Big|\, x \right] = 0.$$

In practice, some method of integration is required to solve for u = h(x). In the stochastic case, the solution depends on the parameters of F(ε).

2.2. Continuous-time Dynamic Programming

The problem is: derive the standard Bellman’s equation for continuous time problems with

the state evolving according to the following differential equation:

$$dx = \mu(t, x, u)\,dt + \sigma(t, x, u)\,dz. \tag{2.2.1}$$

Here, dz is the increment of a stochastic process z which is a Wiener process (also called a Brownian motion) and µ and σ are known functions of time t, the state x, and a decision variable(s) u to be described later. A stochastic process [z(t), t ≥ 0] is called a Wiener process if (i) z(0) = 0; (ii) z(t) has stationary independent increments; and (iii) for every t > 0, z(t) is normally distributed with mean 0 and variance ct where c is some positive constant.

Consider the stochastic optimal control problem:

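$$V(t_0, x_0) = \max_u E\left[ \int_{t_0}^{T} r(t, x, u)\,dt + g(x(T), T) \right]$$

subject to (2.2.1) and $x(t_0) = x_0$. First, note that V(T, x(T)) = g(x(T), T). Next, break up the integral as follows

$$\begin{aligned}
V(t_0, x_0) &= \max_u E\left( \int_{t_0}^{t_0 + \Delta t} r(t, x, u)\,dt + \int_{t_0 + \Delta t}^{T} r(t, x, u)\,dt + V(T, x(T)) \right) \\
&= \max_{u,\ t_0 \le t \le t_0 + \Delta t} E\left( \int_{t_0}^{t_0 + \Delta t} r(t, x, u)\,dt + \max_{u,\ t_0 + \Delta t \le t \le T} \left( \int_{t_0 + \Delta t}^{T} r(t, x, u)\,dt + V(T, x(T)) \right) \right) \\
&= \max_{u,\ t_0 \le t \le t_0 + \Delta t} E\left( \int_{t_0}^{t_0 + \Delta t} r(t, x, u)\,dt + V(t_0 + \Delta t, x_0 + \Delta x) \right)
\end{aligned}$$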

where $x(t_0 + \Delta t) = x_0 + \Delta x$. If V is twice continuously differentiable, then

$$\begin{aligned}
V(t_0, x_0) \simeq \max_u E\Big( & r(t_0, x_0, u)\Delta t + V(t_0, x_0) + V_t(t_0, x_0)\Delta t + V_x(t_0, x_0)\Delta x \\
& + \tfrac{1}{2} V_{xx}(t_0, x_0)(\Delta x)^2 + \text{h.o.t.} \Big)
\end{aligned} \tag{2.2.2}$$

where h.o.t. stands for higher order terms. Recall that the following holds (approximately)

$$\begin{aligned}
\Delta x &= \mu \Delta t + \sigma \Delta z \\
(\Delta x)^2 &= \mu^2 (\Delta t)^2 + \sigma^2 (\Delta z)^2 + 2\mu\sigma \Delta t \Delta z \\
&= \sigma^2 \Delta t + \text{h.o.t.}
\end{aligned} \tag{2.2.3}$$

where the arguments of µ and σ have been dropped for convenience. The result in (2.2.3) follows from the fact that increments of z, e.g., $z(t_j) - z(t_{j-1})$, are independently distributed with mean zero and variances proportional to increments of t, e.g., $t_j - t_{j-1}$. Thus, the variance term $(dz)^2 = dt$ is first order while the other terms are all higher order.
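The claim that the squared increment is dominated by σ²∆t can be checked by simulation. The following sketch draws Wiener increments with variance ∆t (that is, with the constant c normalized to one) and compares the sample mean of (∆x)² with σ²∆t; the drift, volatility, step size, and sample size are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma = 0.5, 0.2        # hypothetical constant drift and volatility
dt = 1e-3                   # length of the time increment
n_draws = 200_000

# Wiener increments: independent normals with mean 0 and variance dt.
dz = rng.normal(0.0, np.sqrt(dt), size=n_draws)
dx = mu * dt + sigma * dz

mean_dx_squared = np.mean(dx**2)
ratio = mean_dx_squared / (sigma**2 * dt)   # close to one when dt is small
```

The population value of the ratio is 1 + µ²∆t/σ², so the gap from one shrinks in proportion to ∆t, which is the sense in which the µ²(∆t)² term is higher order.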

Using this approximation in (2.2.2), we have the following for the value function

$$\begin{aligned}
V(t, x) \simeq \max_u E\Big( & r(t, x, u)\Delta t + V(t, x) + V_t(t, x)\Delta t + V_x(t, x)\mu(t, x, u)\Delta t \\
& + V_x(t, x)\sigma(t, x, u)\Delta z + \tfrac{1}{2} V_{xx}(t, x)\sigma(t, x, u)^2 \Delta t + \text{h.o.t.} \Big)
\end{aligned}$$

Take expectations (which drops the ∆z term), subtract V(t, x) from both sides, divide through by ∆t, and then take ∆t to zero to get:

$$-V_t(t, x) \simeq \max_u \left( r(t, x, u) + \mu(t, x, u) V_x(t, x) + \tfrac{1}{2}\sigma(t, x, u)^2 V_{xx}(t, x) \right)$$

which is the standard Bellman’s equation for continuous time problems.


Consider a variation of the problem with discounting:

$$V(t_0, x_0) = \max_u E\left( \int_{t_0}^{T} e^{-\rho t} r(t, x, u)\,dt + g(x(T), T) \right)$$

In this case, the steps above lead to

$$-V_t(t, x) + \rho V(t, x) = \max_u \left( r(t, x, u) + \mu(t, x, u) V_x(t, x) + \tfrac{1}{2}\sigma(t, x, u)^2 V_{xx}(t, x) \right) \tag{2.2.4}$$

An example is the standard stochastic growth model. Households choose consumption to maximize expected lifetime utility

$$\max_c E \int_0^{\infty} e^{-\rho t} u(c)\,dt$$

subject to

$$dk = (f(k) - c)\,dt + \sigma(k)\,dz$$

where $u(c) = c^{\omega}/\omega$ and ω < 1. For this model, equation (2.2.4) is given by

$$-V_t(t, k) + \rho V(t, k) = \max_c \left( u(c) + (f(k) - c) V_k(t, k) + \tfrac{1}{2}\sigma(k)^2 V_{kk}(t, k) \right).$$

The first-order condition for the maximization is $u'(c) = V_k$, so that $c = V_k^{1/(\omega - 1)}$ and $u(c) - c V_k = (1 - \omega) V_k^{\omega/(\omega - 1)}/\omega$. Substituting this back in yields the following differential equation:

$$-V_t(t, k) + \rho V(t, k) = (1 - \omega) V_k(t, k)^{\omega/(\omega - 1)}/\omega + f(k) V_k(t, k) + \tfrac{1}{2}\sigma(k)^2 V_{kk}(t, k).$$

This differential equation can be solved using finite difference methods outlined in Candler

(1999).


Chapter 3.

Computing Equilibria in Near-Linear Economies

In this chapter, we solve economic decision problems that are inherently nonlinear

assuming that the solutions of these problems are well-approximated by linear or log-linear

functions. Most business cycle models fall in this category.

3.1. Linearizing and Log-linearizing

We will sometimes need to do a first-order Taylor expansion of a function f(x) around a point $\bar{x}$, that is,

$$f(x) = f(\bar{x}) + f'(\bar{x})(x - \bar{x}) + \text{higher order terms}.$$

We will also need to do the expansion after writing the variables in logs:

$$f(x) = f(e^{\log x}) = g(\log x) = g(\log \bar{x}) + g'(\log \bar{x})(\log x - \log \bar{x}) + \text{higher order terms}.$$

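Consider the following example based on the Euler equation of a very simple growth model:

$$u'(c_t) = \beta u'(c_{t+1})\left( f'(k_{t+1}) + 1 - \delta \right) \tag{3.1.1}$$

with $u(c) = c^{1-\sigma}/(1 - \sigma)$ and $f(k) = Ak^{\theta}$, where A is set to 1 in what follows. If we linearize the difference between the right and left hand sides of (3.1.1), we get

$$\begin{aligned}
\beta c_{t+1}^{-\sigma}\left[\theta k_{t+1}^{\theta-1} + 1 - \delta\right] - c_t^{-\sigma}
&\approx \beta c_{ss}^{-\sigma}\left[\theta k_{ss}^{\theta-1} + 1 - \delta\right] - c_{ss}^{-\sigma} \\
&\quad - \sigma\beta c_{ss}^{-\sigma-1}\left[\theta k_{ss}^{\theta-1} + 1 - \delta\right](c_{t+1} - c_{ss}) + \sigma c_{ss}^{-\sigma-1}(c_t - c_{ss}) \\
&\quad + (\theta - 1)\beta c_{ss}^{-\sigma}\theta k_{ss}^{\theta-2}(k_{t+1} - k_{ss}) \\
&= -\sigma c_{ss}^{-\sigma-1}(c_{t+1} - c_t) + (\theta - 1)\beta c_{ss}^{-\sigma}\theta k_{ss}^{\theta-2}(k_{t+1} - k_{ss})
\end{aligned} \tag{3.1.2}$$

where the subscript ‘ss’ stands for steady state value. The last equality uses the steady-state condition $\beta[\theta k_{ss}^{\theta-1} + 1 - \delta] = 1$.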

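Next, consider log-linearizing the equation. For convenience, we use a hat over the variable to denote the natural logarithm, that is, $\hat{c}_t = \log c_t$. Then, the residual can be approximated as:

$$\begin{aligned}
\beta c_{t+1}^{-\sigma}\left[\theta k_{t+1}^{\theta-1} + 1 - \delta\right] - c_t^{-\sigma}
&= \beta\theta e^{-\sigma \hat{c}_{t+1}} e^{(\theta-1)\hat{k}_{t+1}} + \beta(1 - \delta) e^{-\sigma \hat{c}_{t+1}} - e^{-\sigma \hat{c}_t} \\
&\approx \beta\theta c_{ss}^{-\sigma} k_{ss}^{\theta-1}\left(1 - \sigma(\hat{c}_{t+1} - \hat{c}_{ss}) + (\theta - 1)(\hat{k}_{t+1} - \hat{k}_{ss})\right) \\
&\quad + \beta(1 - \delta) c_{ss}^{-\sigma}\left(1 - \sigma(\hat{c}_{t+1} - \hat{c}_{ss})\right) \\
&\quad - c_{ss}^{-\sigma}\left(1 - \sigma(\hat{c}_t - \hat{c}_{ss})\right) \\
&= \text{constant} + c_{ss}^{-\sigma}\left[(1 - \beta(1 - \delta))(\theta - 1)\hat{k}_{t+1} - \sigma(\hat{c}_{t+1} - \hat{c}_t)\right]
\end{aligned} \tag{3.1.3}$$

The last equation uses the fact that the residual is equal to zero in the steady state.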

How would we check this algebra using the computer? One way to do it is to take

approximate numerical derivatives as described above.
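One way to carry out such a check is sketched below: central differences of the Euler-equation residual are compared with the analytical coefficients on $c_{t+1}$, $c_t$, and $k_{t+1}$ from the linearization. The parameter values are hypothetical, A is set to 1, and steady-state consumption is computed from a resource constraint c = k^θ − δk that is assumed here for illustration (the coefficients only require that $k_{ss}$ satisfies the steady-state Euler equation).

```python
beta, delta, theta, sigma = 0.96, 0.1, 0.36, 2.0   # hypothetical parameter values

# Steady state: beta * (theta * k**(theta - 1) + 1 - delta) = 1.
k_ss = (theta / (1.0 / beta - 1.0 + delta)) ** (1.0 / (1.0 - theta))
c_ss = k_ss**theta - delta * k_ss                  # assumed resource constraint

def residual(c_next, c_now, k_next):
    """Right-hand side minus left-hand side of the Euler equation."""
    return (beta * c_next ** (-sigma) * (theta * k_next ** (theta - 1.0) + 1.0 - delta)
            - c_now ** (-sigma))

def central_diff(f, x, h):
    return (f(x + h) - f(x - h)) / (2.0 * h)

h = 1e-5
num_c_next = central_diff(lambda c: residual(c, c_ss, k_ss), c_ss, h)
num_c_now = central_diff(lambda c: residual(c_ss, c, k_ss), c_ss, h)
num_k_next = central_diff(lambda k: residual(c_ss, c_ss, k), k_ss, h)

# Analytical coefficients from the linearized residual.
ana_c_next = -sigma * c_ss ** (-sigma - 1.0)
ana_c_now = sigma * c_ss ** (-sigma - 1.0)
ana_k_next = (theta - 1.0) * beta * c_ss ** (-sigma) * theta * k_ss ** (theta - 2.0)
```

The same pattern works for the log-linear version: differentiate the residual with respect to log c and log k and compare with the coefficients on the hatted variables.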

3.2. Mapping the Problem to a Standard LQ Problem

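The original problem is:

$$\max_{\{u_t\}_{t=0}^{\infty}} E\left[ \sum_{t=0}^{\infty} \beta^t r(X_t, u_t) \,\Big|\, X_0 \right]$$

$$\text{subject to } X_{t+1} = g(X_t, u_t, \epsilon_{t+1}), \quad X_0 \text{ given}$$

Instead of solving this, we solve the following related problem:

$$\max_{\{u_t\}_{t=0}^{\infty}} E_0 \sum_{t=0}^{\infty} \beta^t \left( X_t' Q X_t + u_t' R u_t + 2 X_t' W u_t \right)$$

$$\text{subject to } X_{t+1} = A X_t + B u_t + C \epsilon_{t+1}, \quad X_0 \text{ given} \tag{3.2.1}$$

where

$$\begin{aligned}
r(X_t, u_t) &\simeq X_t' Q X_t + u_t' R u_t + 2 X_t' W u_t \\
g(X_t, u_t, \epsilon_{t+1}) &\simeq A X_t + B u_t + C \epsilon_{t+1},
\end{aligned} \tag{3.2.2}$$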

with Q and R symmetric. That is, we solve a problem with a quadratic objective function

and linear constraints. Note that implicit in our formulation of (3.2.1) are the assumptions


that Xt is contained in the agents’ information sets at time t and that the agents know

the objective function and transition functions for all variables.

To obtain the functions in (3.2.2), we take a second and first-order Taylor expansion

of the corresponding nonlinear functions around the steady state of the system. Thus,

when evaluated at the stationary point, the original and approximated functions have the

same value.

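To find the steady state of the system, we first set the disturbance term $\epsilon_t$ to its unconditional mean. Without loss of generality, assume the mean is zero. We then find the first order conditions of the resulting nonstochastic version of the model:

$$\max_{\{u_t\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t r(X_t, u_t)$$

$$\text{subject to } X_{t+1} = g(X_t, u_t, 0) \tag{3.2.3}$$

and $X_0$ given. Formulating the Lagrangian:

$$\mathcal{L} = \sum_{t=0}^{\infty} \beta^t \left\{ r(X_t, u_t) - \lambda_{t+1}'\left( X_{t+1} - g(X_t, u_t, 0) \right) \right\} \tag{3.2.4}$$

and taking derivatives with respect to $u_t$ and $X_{t+1}$, we obtain the following first-order conditions

$$\begin{aligned}
\frac{\partial r(X_t, u_t)}{\partial u_t} + \left( \frac{\partial g(X_t, u_t, 0)}{\partial u_t} \right)' \lambda_{t+1} &= 0 \\
\beta \frac{\partial r(X_{t+1}, u_{t+1})}{\partial X_{t+1}} - \lambda_{t+1} + \beta \left( \frac{\partial g(X_{t+1}, u_{t+1}, 0)}{\partial X_{t+1}} \right)' \lambda_{t+2} &= 0
\end{aligned} \tag{3.2.5}$$

for t ≥ 0, where $\lambda_t$ is a sequence of Lagrange multipliers. Eliminating time subscripts from (3.2.5) and the constraint in (3.2.3), we then get the following set of nonlinear equations:

$$\begin{aligned}
\frac{\partial r(X, u)}{\partial u} + \left( \frac{\partial g(X, u, 0)}{\partial u} \right)' \lambda &= 0 \\
\beta \frac{\partial r(X, u)}{\partial X} - \lambda + \beta \left( \frac{\partial g(X, u, 0)}{\partial X} \right)' \lambda &= 0 \\
X - g(X, u, 0) &= 0
\end{aligned} \tag{3.2.6}$$

This is a set of 2m + n equations with 2m + n unknowns, X, u, λ. The fixed point of this system is the steady state, say $(\bar{X}, \bar{u}, \bar{\lambda})$, around which we take first and second-order Taylor expansions of g and r. Thus, we have the problem given by (3.2.1).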


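As shown in Kwakernaak and Sivan (1972) or Sargent (1980), if R < 0 and the system

$$\begin{aligned}
\tilde{X}_{t+1} &= \tilde{A}\tilde{X}_t + \tilde{B}\tilde{u}_t \\
Y_t &= D\tilde{X}_t
\end{aligned} \tag{3.2.7}$$

is stabilizable and detectable, where $\tilde{A} = \sqrt{\beta}(A - BR^{-1}W')$, $\tilde{B} = \sqrt{\beta}B$, D is some matrix satisfying $\tilde{Q} = D'\Omega D$ for some Ω < 0, $\tilde{Q} = Q - WR^{-1}W'$, $\tilde{X}_t = \beta^{t/2}X_t$ and $\tilde{u}_t = \beta^{t/2}(u_t + R^{-1}W'X_t)$, then the optimal policy function for the optimization problem (3.2.1) is the time-invariant linear rule:

$$\begin{aligned}
u_t = -FX_t, \quad F &= (R + \beta B'PB)^{-1}(\beta B'PA + W') \\
&= (R + \tilde{B}'P\tilde{B})^{-1}\tilde{B}'P\tilde{A} + R^{-1}W'.
\end{aligned} \tag{3.2.8}$$

The matrix P in (3.2.8) is the steady-state solution to the matrix Riccati difference equation

$$\begin{aligned}
P_t &= Q + \beta A'P_{t+1}A - (\beta A'P_{t+1}B + W)(R + \beta B'P_{t+1}B)^{-1}(\beta B'P_{t+1}A + W') \\
&= \tilde{Q} + \tilde{A}'P_{t+1}\tilde{A} - \tilde{A}'P_{t+1}\tilde{B}(R + \tilde{B}'P_{t+1}\tilde{B})^{-1}\tilde{B}'P_{t+1}\tilde{A}
\end{aligned} \tag{3.2.9}$$

as t → −∞, with terminal condition $P_T \le 0$.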

There have been many algorithms developed for the solution of the discrete-time Riccati equation. Here, we review several which will later be used to solve the stochastic growth model. (See Anderson and Moore (1979) for further discussion.) In all cases, we take as given the matrices A, B, Q, R, W and scalar β (or equivalently $\tilde{A}$, $\tilde{B}$, $\tilde{Q}$, and R), tolerance criteria $\gamma_1$ and $\gamma_2$, and a matrix norm ‖ · ‖.

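Direct Iteration. Set an initial symmetric Riccati matrix, $P^0 \le 0$.

a) At iteration n, we compute $P^{n+1}$ and $F^n$ to be

$$P^{n+1} = \tilde{Q} + \tilde{A}'P^n\tilde{A} - \tilde{A}'P^n\tilde{B}(R + \tilde{B}'P^n\tilde{B})^{-1}\tilde{B}'P^n\tilde{A}$$

$$F^n = (R + \tilde{B}'P^n\tilde{B})^{-1}\tilde{B}'P^n\tilde{A}$$

b) If $\|P^{n+1} - P^n\| < \gamma_1\|P^n\|$ and $\|F^{n+1} - F^n\| < \gamma_2\|F^n\|$, go to (c); otherwise, increase n by one and return to (a).

c) Set $F = F^n + R^{-1}W'$, $P = P^n$.

Doubling Algorithm. Set additional initial conditions: $a_0 = \tilde{A}$, $b_0 = \tilde{B}R^{-1}\tilde{B}'$, $p_0 = \tilde{Q}$.

a) At iteration n, we compute $a_{n+1}$, $b_{n+1}$, $p_{n+1}$, $F^n$ to be

$$a_{n+1} = a_n(I + b_n p_n)^{-1}a_n$$

$$b_{n+1} = b_n + a_n(I + b_n p_n)^{-1}b_n a_n'$$

$$p_{n+1} = p_n + a_n' p_n(I + b_n p_n)^{-1}a_n$$

$$F^n = (R + \tilde{B}'p_n\tilde{B})^{-1}\tilde{B}'p_n\tilde{A}.$$

b) If $\|p_{n+1} - p_n\| < \gamma_1\|p_n\|$ and $\|F^{n+1} - F^n\| < \gamma_2\|F^n\|$, go to (c); otherwise, increase n by one and return to (a).

c) Set $F = F^n + R^{-1}W'$, $P = p_n$.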

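Vaughan’s (1970) Algorithm.

a) Find the eigenvalues and eigenvectors of the Hamiltonian matrix, H:

$$H = \begin{bmatrix} \tilde{A}^{-1} & \tilde{A}^{-1}\tilde{B}R^{-1}\tilde{B}' \\ \tilde{Q}\tilde{A}^{-1} & \tilde{Q}\tilde{A}^{-1}\tilde{B}R^{-1}\tilde{B}' + \tilde{A}' \end{bmatrix}
= \begin{bmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{bmatrix}
\begin{bmatrix} \Lambda & 0 \\ 0 & \Lambda^{-1} \end{bmatrix}
\begin{bmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{bmatrix}^{-1}.$$

Note that Λ is a diagonal matrix containing the eigenvalues of H that exceed unity in absolute value.

b) Set $P = V_{21}V_{11}^{-1}$, $F = (R + \tilde{B}'P\tilde{B})^{-1}\tilde{B}'P\tilde{A} + R^{-1}W'$.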

With a steady-state solution to the Riccati equation, we can use (3.2.8) to compute F and the law of motion for the state variables:

$$X_{t+1} = (A - BF)X_t + C\epsilon_{t+1} \tag{3.2.10}$$

Furthermore, given an initial condition for the states, $X_0$, and a realization of the shocks, $\epsilon_t$, t ≥ 0, we can generate time series for $X_t$ via (3.2.10) and $u_t$ via (3.2.8).
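The direct iteration algorithm can be sketched in NumPy as follows. This is an illustration under stated assumptions: the function name, the stopping rule (which checks only the change in P rather than both P and F), and the small example matrices are hypothetical and not taken from the notes. The matrices with suffix `_t` play the role of the transformed matrices that absorb β and W.

```python
import numpy as np

def riccati_direct_iteration(A, B, Q, R, W, beta, tol=1e-12, max_iter=10_000):
    """Iterate the transformed Riccati equation from P = 0; return (P, F)."""
    Rinv_Wt = np.linalg.solve(R, W.T)              # R^{-1} W'
    A_t = np.sqrt(beta) * (A - B @ Rinv_Wt)        # transformed A
    B_t = np.sqrt(beta) * B                        # transformed B
    Q_t = Q - W @ Rinv_Wt                          # transformed Q
    P = np.zeros_like(Q)
    for _ in range(max_iter):
        gain = np.linalg.solve(R + B_t.T @ P @ B_t, B_t.T @ P @ A_t)
        P_new = Q_t + A_t.T @ P @ A_t - A_t.T @ P @ B_t @ gain
        done = np.linalg.norm(P_new - P) < tol * max(1.0, np.linalg.norm(P))
        P = P_new
        if done:
            break
    F = np.linalg.solve(R + B_t.T @ P @ B_t, B_t.T @ P @ A_t) + Rinv_Wt
    return P, F

# Hypothetical two-state, one-control problem with R < 0 and Q - W R^{-1} W' < 0.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = -np.eye(2)
R = np.array([[-1.0]])
W = np.array([[0.1], [0.0]])
beta = 0.95

P, F = riccati_direct_iteration(A, B, Q, R, W, beta)

# Check against the untransformed forms of the Riccati equation and policy rule.
M = R + beta * B.T @ P @ B
N = beta * B.T @ P @ A + W.T
riccati_residual = Q + beta * A.T @ P @ A - N.T @ np.linalg.solve(M, N) - P
F_untransformed = np.linalg.solve(M, N)
closed_loop_moduli = np.abs(np.linalg.eigvals(np.sqrt(beta) * (A - B @ F)))
```

Direct iteration converges only linearly; the doubling and eigenvector-based algorithms described above are usually much faster when the closed-loop roots are close to the unit circle.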

Example 1.1. To illustrate these algorithms, let’s consider the version of the growth model that was used in comparing alternative methods by Taylor and Uhlig. The problem in this case is to find $c_t = h(k_t, z_t)$ that solves

$$\max_{\{c_t\}} E\left[ \sum_{t=0}^{\infty} \beta^t U(c_t) \,\Big|\, z_0, k_0 \right]$$

subject to constraints given by

$$c_t + k_{t+1} - k_t = z_t k_t^{\alpha}$$

$$\log z_t = \rho \log z_{t-1} + \epsilon_t, \quad \epsilon_t \sim N(0, \sigma_\epsilon^2)$$

and subject to the initial conditions $z_0$ and $k_0$. Notice that the rate of depreciation is equal to 0 as in Taylor and Uhlig (1990). That simplifies some of the algebra below. In what follows, we assume that $U(c) = c^{1-\sigma}/(1 - \sigma)$.

If we substitute the resource constraint into the objective function, we can rewrite the problem as follows:

$$\max_{\{k_{t+1} - k_t\}_{t=0}^{\infty}} E_0 \sum_{t=0}^{\infty} \beta^t \frac{\left( k_t + e^{\omega_t}k_t^{\alpha} - k_{t+1} \right)^{1-\sigma}}{1 - \sigma}$$

$$\text{subject to } \omega_{t+1} = \rho\omega_t + \epsilon_{t+1}, \quad k_0, \omega_0 \text{ given} \tag{3.2.11}$$

where β, 0 < β < 1, is a discount factor and $\omega_t = \log z_t$. In this formulation of the problem, we have eliminated consumption, $c_t$. However, given a policy function for $k_{t+1} - k_t$, we can compute the policy function for $c_t$ since $c_t = k_t - k_{t+1} + e^{\omega_t}k_t^{\alpha}$.

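To find the linear-quadratic version of (3.2.11), we must first find the steady state. Thus, we set $\epsilon_t = 0$ and form the Lagrangian

$$\mathcal{L} = \sum_{t=0}^{\infty} \beta^t \left\{ \frac{\left( k_t + e^{\omega_t}k_t^{\alpha} - k_{t+1} \right)^{1-\sigma}}{1 - \sigma} - \lambda_{t+1}(\omega_{t+1} - \rho\omega_t) \right\} \tag{3.2.12}$$

Taking the derivative with respect to $k_{t+1}$, we get

$$-\left( k_t + e^{\omega_t}k_t^{\alpha} - k_{t+1} \right)^{-\sigma} + \beta\left( k_{t+1} + e^{\omega_{t+1}}k_{t+1}^{\alpha} - k_{t+2} \right)^{-\sigma}\left( 1 + \alpha e^{\omega_{t+1}}k_{t+1}^{\alpha-1} \right) = 0$$

Eliminating time subscripts in this condition and in $\omega_{t+1} = \rho\omega_t$ implies

$$\bar{\omega} = 0, \quad \bar{k} = \left( \frac{\alpha\beta}{1 - \beta} \right)^{\frac{1}{1-\alpha}}. \tag{3.2.13}$$

If we set $u_t = k_{t+1} - k_t$ and $x_t = [k_t \ \ 1 \ \ \omega_t]'$, then the matrices of the transition function for $x_t$ are given by

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \rho \end{bmatrix}, \quad
B = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad
C = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$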

The second element of xt captures constant terms. Since the constraints in the problem are

already linear, we need only approximate the objective function. Taking a second-order

Taylor expansion of the objective function in (3.2.11) around (3.2.13), we obtain

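$$Q = \frac{\bar{k}^{\alpha(1-\sigma)}}{2}
\begin{bmatrix}
\dfrac{-\sigma\alpha^2 + \alpha^2 - \alpha}{\bar{k}^2} & \dfrac{\sigma\alpha^2 - \alpha^2 + 2\alpha}{\bar{k}} & \dfrac{\alpha(1 - \sigma)}{\bar{k}} \\[2ex]
\dfrac{\sigma\alpha^2 - \alpha^2 + 2\alpha}{\bar{k}} & \dfrac{2}{1 - \sigma} - 3\alpha + \alpha^2(1 - \sigma) & 1 + \alpha\sigma - \alpha \\[2ex]
\dfrac{\alpha(1 - \sigma)}{\bar{k}} & 1 + \alpha\sigma - \alpha & 1 - \sigma
\end{bmatrix},$$

$W = [\alpha\sigma\bar{k}^{-\alpha\sigma-1} \ \ -(1 + \alpha\sigma)\bar{k}^{-\alpha\sigma} \ \ \sigma\bar{k}^{-\alpha\sigma}]'/2$, and $R = -\sigma\bar{k}^{-\alpha\sigma-\alpha}/2$. Thus, given a value for β and the parameters underlying A, B, C, Q, R, and W, we can compute the optimal controls via (3.2.8) and (3.2.9).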

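As a check on our solution, we can compare it to one found analytically. Since the state space for the stochastic growth model is small, it is easy to find F analytically using the fact that (3.2.5) implies:

$$\begin{bmatrix} \tilde{x}_{t+1} \\ \tilde{\lambda}_{t+1} \end{bmatrix} - \left( H + H^{-1} \right)\begin{bmatrix} \tilde{x}_t \\ \tilde{\lambda}_t \end{bmatrix} + \begin{bmatrix} \tilde{x}_{t-1} \\ \tilde{\lambda}_{t-1} \end{bmatrix} = 0 \tag{3.2.14}$$

when r is quadratic and g is linear, where $\tilde{\lambda}_t = \beta^{t/2}\lambda_t/2$. Taking the first equation of (3.2.14), we have

$$\beta k_{t+1} - \left[ 1 + \beta + \frac{(1 - \alpha)(1 - \beta)^2}{\alpha\sigma} \right]k_t + k_{t-1} = \kappa_0 + \kappa_1\omega_t$$

where $\kappa_0 = \beta^2\alpha(\alpha - 1)\bar{k}^{2\alpha-1}/\sigma$ and $\kappa_1 = \beta(1 - \rho^{-1})\bar{k}^{\alpha} - \beta^2\alpha\bar{k}^{2\alpha-1}/\sigma$ and, hence,

$$k_{t+1} - k_t = (\psi - 1)k_t - \frac{\psi\kappa_0}{1 - \beta\psi} - \frac{\psi\kappa_1\rho}{1 - \beta\rho\psi}\,\omega_t, \tag{3.2.15}$$

where ψ is the root of $s^2 - \left( 1 + 1/\beta + (1 - \alpha)(1 - \beta)^2/(\beta\alpha\sigma) \right)s + 1/\beta$ with modulus less than one. From (3.2.15), we get $u_t = -Fx_t$.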

3.3. A Variant on Vaughan’s Method

Let $x_t$ be the l-dimensional vector of endogenous state variables for the model we are interested in. Let $s_t$ be the m-dimensional vector of exogenous state variables of the model with

$$s_{t+1} = Ps_t + Q\epsilon_{t+1},$$

and $\epsilon_t$ iid. Finally, assume that $z_t$ is an n-dimensional vector of choice variables and prices that are, in equilibrium, functions of $x_t$ and $s_t$. The form of the solution we are seeking is $x_{t+1} = Ax_t + Bs_t$ and $z_t = Cx_t + Ds_t$.

Assume that the first-order equations to be solved, after log-linearization, can be written as follows

$$\begin{aligned}
0 &= \Theta_1 x_{t+1} + \Theta_2 x_t + \Theta_3 z_t + \Theta_4 s_t \\
0 &= E_t\left[ \Phi_1 x_{t+1} + \Phi_2 x_t + \Phi_3 z_{t+1} + \Phi_4 z_t + \Phi_5 s_{t+1} + \Phi_6 s_t \right]
\end{aligned} \tag{3.3.1}$$


Theory tells us that we can do this in two steps: find the coefficients on the endogenous

state vector xt and then use the results to compute the coefficients on st.

In the first step, we stack up the matrices of the equilibrium equations as follows:

$$0 = A_1\begin{bmatrix} x_{t+1} \\ z_{t+1} \end{bmatrix} + A_2\begin{bmatrix} x_t \\ z_t \end{bmatrix} + \text{stochastic shocks} \tag{3.3.2}$$

where

$$A_1 = \begin{bmatrix} \Theta_1 & 0 \\ \Phi_1 & \Phi_3 \end{bmatrix}, \quad A_2 = \begin{bmatrix} \Theta_2 & \Theta_3 \\ \Phi_2 & \Phi_4 \end{bmatrix}.$$

To compute A and C, we find generalized eigenvalues Λ (and associated eigenvectors V) such that

$$A_2V = -A_1V\Lambda. \tag{3.3.3}$$

For a unique stationary equilibrium, we need the same number of roots inside the unit circle as there are elements of x. If we sort the eigenvalues and eigenvectors so that the roots inside the unit circle are ordered first, then we have

$$A = V_{11}\Lambda_1V_{11}^{-1} \tag{3.3.4}$$

$$C = V_{21}V_{11}^{-1} \tag{3.3.5}$$

where $V_{11}$ is the l × l upper left partition of V, $V_{21}$ is the n × l lower left partition of V, and $\Lambda_1$ is the diagonal matrix of eigenvalues inside the unit circle.

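Given $A$ and $C$, solving for $B$ and $D$ involves solving a linear system of equations. To see this, substitute the form of the solution into (3.3.1):

$$ \begin{aligned} 0 &= \Theta_1 (A x_t + B s_t) + \Theta_2 x_t + \Theta_3 (C x_t + D s_t) + \Theta_4 s_t \\ &= (\Theta_1 A + \Theta_2 + \Theta_3 C)\, x_t + (\Theta_1 B + \Theta_3 D + \Theta_4)\, s_t \end{aligned} \tag{3.3.6} $$

$$ \begin{aligned} 0 &= E_t\left[\Phi_1 (A x_t + B s_t) + \Phi_2 x_t + \Phi_3 (C A x_t + C B s_t + D P s_t + D Q \epsilon_{t+1}) \right. \\ &\qquad \left. + \Phi_4 (C x_t + D s_t) + \Phi_5 (P s_t + Q \epsilon_{t+1}) + \Phi_6 s_t\right] \\ &= (\Phi_1 A + \Phi_2 + \Phi_3 C A + \Phi_4 C)\, x_t \\ &\qquad + (\Phi_1 B + \Phi_3 C B + \Phi_3 D P + \Phi_4 D + \Phi_5 P + \Phi_6)\, s_t. \end{aligned} \tag{3.3.7} $$

It turns out that the coefficients on $x_t$ in (3.3.6) and (3.3.7) are equal to 0 if we evaluate them at $A$ in (3.3.4) and $C$ in (3.3.5). We need to set elements of $B$ and $D$ so that the coefficients on $s_t$ are also 0. We do this by stacking the coefficients in vectors using $\mathrm{vec}(X) = [x_{11}, \ldots, x_{m1}, x_{12}, \ldots, x_{m2}, \ldots, x_{mn}]'$ and setting the result equal to zero:

$$ \begin{bmatrix} I \otimes \Theta_1 & I \otimes \Theta_3 \\ I \otimes (\Phi_1 + \Phi_3 C) & P' \otimes \Phi_3 + I \otimes \Phi_4 \end{bmatrix} \begin{bmatrix} \mathrm{vec}(B) \\ \mathrm{vec}(D) \end{bmatrix} = -\begin{bmatrix} \mathrm{vec}(\Theta_4) \\ \mathrm{vec}(\Phi_5 P + \Phi_6) \end{bmatrix} $$

where we used the facts that $\mathrm{vec}(A+B) = \mathrm{vec}(A) + \mathrm{vec}(B)$ and $\mathrm{vec}(ABC) = (C' \otimes A)\,\mathrm{vec}(B)$.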
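The second step can be sketched as follows. The code applies $\mathrm{vec}(ABC) = (C' \otimes A)\mathrm{vec}(B)$ to the coefficients on $s_t$ in (3.3.6) and (3.3.7), so that, for example, $\mathrm{vec}(\Theta_1 B) = (I \otimes \Theta_1)\mathrm{vec}(B)$. The function names and the argument order are assumptions made for the illustration.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into one vector."""
    return M.reshape(-1, order="F")

def solve_B_D(Th1, Th3, Th4, Ph1, Ph3, Ph4, Ph5, Ph6, C, P):
    """Solve the stacked linear system for B (l x m) and D (n x m)."""
    l, n, m = Th1.shape[1], Th3.shape[1], P.shape[0]
    I = np.eye(m)
    # Coefficients on s_t in (3.3.6): Theta1 B + Theta3 D + Theta4 = 0
    top = np.hstack([np.kron(I, Th1), np.kron(I, Th3)])
    # Coefficients on s_t in (3.3.7):
    # (Phi1 + Phi3 C) B + Phi3 D P + Phi4 D + Phi5 P + Phi6 = 0
    bottom = np.hstack([np.kron(I, Ph1 + Ph3 @ C),
                        np.kron(P.T, Ph3) + np.kron(I, Ph4)])
    rhs = -np.concatenate([vec(Th4), vec(Ph5 @ P + Ph6)])
    sol = np.linalg.solve(np.vstack([top, bottom]), rhs)
    B = sol[: l * m].reshape((l, m), order="F")
    D = sol[l * m:].reshape((n, m), order="F")
    return B, D
```

Column-major reshaping (`order="F"`) matches the column-stacking definition of vec used in the text.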

Example 1.2. Let's apply this to the growth model that we just solved in Example 1.1. For that example, $x_t = k_t$ and $s_t = \omega_t$, and, if we substitute out $c_t$, $z_t = k_{t+1}$. In this case, the capital stock is in levels, which allows us to compare with the solution above. After substituting for $c_t$, the first-order condition is

$$ 0 = -\left(k_t + e^{\omega_t} k_t^{\alpha} - k_{t+1}\right)^{-\sigma} + \beta \left(k_{t+1} + e^{\omega_{t+1}} k_{t+1}^{\alpha} - k_{t+2}\right)^{-\sigma} \left(1 + \alpha e^{\omega_{t+1}} k_{t+1}^{\alpha-1}\right). $$

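Linearizing this equation around the steady state yields the following:

$$ 0 = a k_{t+2} + b k_{t+1} + c k_t + d \omega_{t+1} + e \omega_t + \text{constant} \tag{3.3.8} $$

where the coefficients are the derivatives of the first-order condition evaluated at the steady-state values of $c$ and $k$:

$$ \begin{aligned} a &= \beta \sigma c^{-\sigma-1} \left(1 + \alpha k^{\alpha-1}\right) \\ b &= -\sigma c^{-\sigma-1} - \beta \sigma c^{-\sigma-1} \left(1 + \alpha k^{\alpha-1}\right)^2 + \beta c^{-\sigma} \alpha (\alpha - 1) k^{\alpha-2} \\ c &= \sigma c^{-\sigma-1} \left(1 + \alpha k^{\alpha-1}\right) \\ d &= -\beta \sigma c^{-\sigma-1} k^{\alpha} \left(1 + \alpha k^{\alpha-1}\right) + \beta c^{-\sigma} \alpha k^{\alpha-1} \\ e &= \sigma c^{-\sigma-1} k^{\alpha} \end{aligned} $$

(On the left-hand side, $c$ denotes the coefficient on $k_t$; on the right-hand side, it denotes steady-state consumption.) The matrices in (3.3.2) are equal to

$$ A_1 = \begin{bmatrix} 1 & 0 \\ 0 & a \end{bmatrix}, \qquad A_2 = \begin{bmatrix} 0 & -1 \\ c & b \end{bmatrix}. $$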

It is easy to show that computing the eigenvalues associated with (3.3.3) is equivalent to finding the roots of

$$ a \lambda^2 + b \lambda + c = 0. \tag{3.3.9} $$

Since there is only one state variable, the dimension of $V_{11}$ is $1 \times 1$ and therefore cancels in (3.3.4). Thus, $A = \lambda_1$ where $\lambda_1$ is the root of the quadratic equation in (3.3.9) that is inside the unit circle. Since $z_t = x_{t+1}$, it must be the case that $A = C$. It is easy to show that this is indeed the case by deriving the eigenvectors $V$ that satisfy $A_2 V = -A_1 V \Lambda$. For this example, they are such that $\lambda_1 = V_{21} V_{11}^{-1}$.
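As a numerical check on this example, the sketch below differentiates the first-order condition at the steady state, forms the quadratic in (3.3.9), and picks the stable root. The parameter values are illustrative assumptions, not taken from the notes. The steady state uses $\beta(1 + \alpha k^{\alpha-1}) = 1$ and $c = k^{\alpha}$, which follow from the first-order condition with $\omega = 0$.

```python
import numpy as np

alpha, beta, sigma = 0.33, 0.95, 2.0     # illustrative values (assumed)

# Steady state: beta * (1 + alpha * k^(alpha - 1)) = 1 and c = k^alpha.
k = (alpha * beta / (1.0 - beta)) ** (1.0 / (1.0 - alpha))
c = k ** alpha
gross = 1.0 + alpha * k ** (alpha - 1.0)

# Derivatives of the first-order condition with respect to k_{t+2}, k_{t+1}, k_t.
a = beta * sigma * c ** (-sigma - 1) * gross
b = (-sigma * c ** (-sigma - 1)
     - beta * sigma * c ** (-sigma - 1) * gross ** 2
     + beta * c ** (-sigma) * alpha * (alpha - 1) * k ** (alpha - 2))
c0 = sigma * c ** (-sigma - 1) * gross   # coefficient on k_t (named c in the text)

roots = np.roots([a, b, c0])
lam1 = float(np.real(roots[np.abs(roots) < 1.0][0]))   # A = C = lam1
```

Dividing (3.3.9) by $a$ gives the polynomial $s^2 - (1 + 1/\beta + (1-\alpha)(1-\beta)^2/(\beta\alpha\sigma))s + 1/\beta$ that defines $\psi$ in Section 3.2, so `lam1` should coincide with $\psi$.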

Homework Exercise 1. Redo Example 1.2 without first substituting for ct. In this case, we

set zt = [kt+1, ct]′ and use the linearized first-order conditions in (3.1.2) and linearize the

resource constraint. Here, it is necessary to compute generalized eigenvalues since A1 will

be singular.

Homework Exercise 2. Consider an extension of the growth model used in Examples 1.1

and 1.2 that allows for some depreciation of capital at rate δ and a positive elasticity of

labor. In this case, replace $U(c)$ for Example 1.1 with the utility function given by

$$ U(c, h) = \left[\left(c (1 - h)^{\psi}\right)^{1-\sigma} - 1\right] / (1 - \sigma) $$

and the resource constraint by

$$ c_t + k_{t+1} - (1 - \delta) k_t = e^{\omega_t} k_t^{\alpha} h_t^{1-\alpha} $$

where $\omega_t$ is an AR(1) process as before. Here the decisions are $c_t$, $k_{t+1}$, and $h_t$. The state variables are $k_t$, $\omega_t$, and a constant. For this example, compute two sets of solutions, one set with all decisions and capital stocks in levels (e.g., $c_t = a + b k_t + c \omega_t$) and the second with these variables in logs (e.g., $\log c_t = a + b \log k_t + \omega_t$).


Chapter 4.

Computing Equilibria in Nonlinear Economies

4.1. The Method of Parameterized Expectations

Den Haan and Marcet (1990) describe a method of parameterized expectations. Instead

of approximating a decision function, they approximate the conditional expectation that

typically appears in the first order conditions of a stochastic model. For example, they

consider the following version of the stochastic growth model:

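$$ \max_{\{c_t\}} E\left[\sum_{t=0}^{\infty} \beta^t u(c_t) \,\Big|\, k_0, \theta_0\right], \qquad u(c) = \frac{c^{1-\tau} - 1}{1 - \tau} $$

subject to

$$ c_t + k_t - \mu k_{t-1} = \theta_t k_{t-1}^{\alpha}, \tag{1} $$

$$ \ln \theta_{t+1} = \rho \ln \theta_t + \epsilon_{t+1}, \qquad E\epsilon_t = 0, \quad E\epsilon_t \epsilon_t' = \sigma^2. \tag{2} $$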

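The intertemporal first order condition is given by

$$ c_t^{-\tau} = \beta E\left[c_{t+1}^{-\tau} \left(\theta_{t+1} \alpha k_t^{\alpha-1} + \mu\right) \,\Big|\, k_{t-1}, \theta_t\right]. $$

Den Haan and Marcet (1990) find an approximation to

$$ E_t\, c_{t+1}^{-\tau} \left(\theta_{t+1} \alpha k_t^{\alpha-1} + \mu\right) $$

rather than to $c_t$ or $k_{t+1}$.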

But the choice of the function to approximate is not the main difference between this method and those we described earlier. The main difference is that the approximation is based on simulating time series with a guess for $\phi$, projecting the resulting series for $c_{t+1}^{-\tau}(\theta_{t+1} \alpha k_t^{\alpha-1} + \mu)$ on the guess, and choosing the projection to minimize the mean squared errors (i.e., they do nonlinear least squares).

To be more specific, suppose that the approximation for the conditional expectation has the form

$$ \phi(k_{t-1}, \theta_t; \delta) = \exp\left(P_n(\log k_{t-1}, \log \theta_t)\right) $$

where $P_n$ is an $n$th order polynomial that depends on the logarithm of the state vector. For example, Den Haan and Marcet (1990) use a first order polynomial. Thus $\phi$ is defined as follows:

$$ \phi(k_{t-1}, \theta_t; \delta) = \delta_1 k_{t-1}^{\delta_2} \theta_t^{\delta_3}. $$

Let $\{c_t(\delta), k_t(\delta)\}$ be the sequence for consumption and the capital stock that is generated for a particular $\delta$ by the following steps. First, draw a realization for $\epsilon$ from a normal distribution. Note that only one draw will ever be used. Second, generate a realization for $\theta_t$ using (2) and the simulated sequence for $\epsilon_t$. Third, recursively derive consumption and capital stock values from

$$ c_t^{-\tau} = \beta \phi(k_{t-1}, \theta_t; \delta) $$

and (1).

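A candidate solution for the optimization problem is $\delta$. Given $\delta$, we can derive values for the conditional expectation, consumption, and capital. We just showed how to simulate time series given a candidate solution. What we want to do next is choose a particular $\delta$, one that minimizes or maximizes some criterion with attractive features. The criterion that Den Haan and Marcet use is the mean squared error. Define $S : \mathbb{R}^m \to \mathbb{R}^m$ as follows:

$$ S(\delta) = \arg\min_{\tilde{\delta}} E\left[c_{t+1}^{-\tau}(\delta) \left(\theta_{t+1} \alpha k_t^{\alpha-1}(\delta) + \mu\right) - \phi\left(k_{t-1}(\delta), \theta_t; \tilde{\delta}\right)\right]^2, $$

where the time series are simulated under $\delta$ and the minimization is over the coefficients $\tilde{\delta}$ of the fitted function. The goal then is to find the fixed point of $\delta = S(\delta)$.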

The steps to this fixed point starting with an initial guess $\delta^0$ are as follows:

1. Generate time series for $\epsilon_t$ and $\theta_t$, $t = 1, \ldots, T$.

2. At iteration $j$, $j = 0, 1, \ldots$, given $\delta^j$, compute $\{c_t(\delta^j), k_t(\delta^j)\}_{t=0}^{T}$.

3. Run a nonlinear least squares regression of $c_{t+1}^{-\tau}(\delta^j)\left(\theta_{t+1} \alpha k_t^{\alpha-1}(\delta^j) + \mu\right)$ on $\phi(k_{t-1}(\delta^j), \theta_t; \cdot)$ to get an approximation for $S(\delta^j)$.

4. Update as follows:

$$ \delta^{j+1} = (1 - \lambda) \delta^j + \lambda S(\delta^j) $$

where a smaller $\lambda$ implies a more stable mapping.

5. Stop if $\|\delta^{j+1} - \delta^j\|$ is small; otherwise, return to step 2.

Den Haan and Marcet (1990) accomplish step 3 by doing a sequence of ordinary least squares regressions. The trick is to approximate $\phi(\cdot)$ as a linear function in $\delta$.
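The iteration above can be sketched as follows. Two simplifications here are assumptions of the sketch and not part of Den Haan and Marcet's procedure: step 3 is replaced by a single ordinary least squares regression in logs (which is exact only when the realized integrand has no conditional noise), and $\delta$ is stored as $(\log \delta_1, \delta_2, \delta_3)$ so that $\log \phi$ is linear in the coefficients. All function names are hypothetical.

```python
import numpy as np

def simulate(delta, theta, k_init, alpha, beta, tau, mu):
    """Simulate c_t and k_t for delta = (log delta_1, delta_2, delta_3)."""
    T = len(theta)
    k = np.empty(T + 1)            # k[t] holds k_{t-1} in the timing of the notes
    c = np.empty(T)
    k[0] = k_init
    for t in range(T):
        phi = np.exp(delta[0] + delta[1] * np.log(k[t]) + delta[2] * np.log(theta[t]))
        c[t] = (beta * phi) ** (-1.0 / tau)                      # c^{-tau} = beta * phi
        k[t + 1] = theta[t] * k[t] ** alpha + mu * k[t] - c[t]   # constraint (1)
    return c, k

def S(delta, theta, k_init, alpha, beta, tau, mu):
    """Projection step: regress the log of the realized integrand on the log states."""
    c, k = simulate(delta, theta, k_init, alpha, beta, tau, mu)
    y = c[1:] ** (-tau) * (theta[1:] * alpha * k[1:-1] ** (alpha - 1.0) + mu)
    X = np.column_stack([np.ones(len(y)), np.log(k[:-2]), np.log(theta[:-1])])
    coef, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
    return coef

def pea(delta0, theta, k_init, alpha, beta, tau, mu, lam=0.5, tol=1e-6, max_iter=500):
    """Damped fixed-point iteration delta <- (1 - lam) delta + lam S(delta)."""
    delta = np.asarray(delta0, dtype=float)
    for _ in range(max_iter):
        new = (1.0 - lam) * delta + lam * S(delta, theta, k_init, alpha, beta, tau, mu)
        if np.max(np.abs(new - delta)) < tol:
            return new
        delta = new
    return delta
```

A convenient check is the case $\tau = 1$, $\mu = 0$, where the known solution $c_t = (1 - \alpha\beta)\theta_t k_{t-1}^{\alpha}$ implies $\delta_1 = 1/(\beta(1 - \alpha\beta))$, $\delta_2 = -\alpha$, $\delta_3 = -1$, so these values should be a fixed point of `S`.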

4.2. Weighted Residual Methods

Many problems in economics require the solution to a functional equation as an intermediate step. Typically, we seek decision functions that satisfy a set of Euler conditions

or a value function that satisfies Bellman’s equation. In many cases, we cannot derive

analytical solutions for these functions and instead must rely on numerical methods. In

this chapter, we will apply weighted residual and finite element methods to this type of

problem.

In the case of weighted residual methods, the approximate solution to the functional

equation is represented as a linear combination of known basis functions. In many cases,

the basis functions are polynomials. The coefficients on each basis function are the objects

to be computed to obtain an approximate solution. These coefficients are found by setting

the residual of the equation to zero in an average sense. In other words, a weighted integral

of the residual is set to zero.

The finite element method can be viewed as a piecewise application of the weighted

residual method. With the finite element method, the first step in solving the functional

equation is to subdivide the domain of the state space into nonintersecting subdomains

called elements. The domain is subdivided because the method relies on fitting low-order

polynomials on subdomains of the state space rather than high-order polynomials on the

entire state space. The local approximations are then pieced together to get a global

approximation. As the dimensionality of the problem increases, higher-order functions can

be used where needed, with fewer elements.

The primary goal in this chapter is to illustrate the application of weighted residual and

finite element methods by way of examples. We start with a simple differential equation

because the coefficients to be computed satisfy a linear system of equations. For this

problem, we can work through examples without a computer. We then apply the methods to a deterministic growth model and a stochastic growth model, two standard models in economics.1 In the growth model examples, the coefficients to be computed satisfy nonlinear systems of equations. Fortunately, these nonlinear equations are exploitably sparse if they are derived from a finite element method.

1 See Taylor and Uhlig (1990) for a summary of alternative algorithms used to solve the stochastic growth model.

4.2.1. The General Procedure

The problem is to find $d : \mathbb{R}^m \to \mathbb{R}^n$ that satisfies a functional equation $F(d) = 0$, where $F : C_1 \to C_2$ and $C_1$ and $C_2$ are function spaces. As an example, think of $d$ as decision or policy variables and $F$ as first-order conditions from some maximization problem. The goal here is to find an approximation $d^n(x; \theta)$ on $x \in \Omega$ which depends on a finite-dimensional vector of parameters $\theta = [\theta_1, \theta_2, \ldots, \theta_n]'$. Weighted residual methods assume that $d^n$ is a finite linear combination of known functions, $\psi_i(x)$, $i = 0, \ldots, n$, called basis functions:

$$ d^n(x; \theta) = \psi_0(x) + \sum_{i=1}^{n} \theta_i \psi_i(x). \tag{4.2.1} $$

The functions $\psi_i(x)$, $i = 0, \ldots, n$ are typically simple functions. Standard examples of basis functions include simple polynomials (for example, $\psi_0(x) = 1$, $\psi_i(x) = x^i$), orthogonal polynomials (for example, Chebyshev polynomials), and piecewise linear functions.

Figure 3 displays the first five polynomials in the class of Chebyshev polynomials, which is a popular choice for the basis functions. Chebyshev polynomials are defined on $[-1, 1]$ and are given recursively as follows: $p_0(x) = 1$, $p_1(x) = x$, and

$$ p_i(x) = 2x\, p_{i-1}(x) - p_{i-2}(x), \qquad i = 2, 3, 4, \ldots $$

(or, nonrecursively, as $p_i(x) = \cos(i \arccos x)$). The domain $\Omega$ is not typically given by $[-1, 1]$. If the domain is instead $[a, b]$, then we can use $\psi_i(x) = p_{i-1}(2(x - a)/(b - a) - 1)$ for $i = 1, 2, \ldots$ and $\psi_0(x) = 0$.
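A minimal sketch of the recursion and of the change of domain, with hypothetical function names, is:

```python
import numpy as np

def chebyshev(n, x):
    """Return an array whose rows are p_0(x), ..., p_n(x) for x in [-1, 1]."""
    x = np.asarray(x, dtype=float)
    p = np.empty((n + 1,) + x.shape)
    p[0] = 1.0
    if n >= 1:
        p[1] = x
    for i in range(2, n + 1):
        p[i] = 2.0 * x * p[i - 1] - p[i - 2]      # three-term recursion
    return p

def basis_on_interval(n, x, a, b):
    """Rows are psi_i(x) = p_{i-1}(2 (x - a) / (b - a) - 1), i = 1, ..., n."""
    x = np.asarray(x, dtype=float)
    return chebyshev(n - 1, 2.0 * (x - a) / (b - a) - 1.0)
```

The recursion can be compared against the closed form $\cos(i \arccos x)$ on a grid in $[-1, 1]$.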

[Figure 3. Five Chebyshev Polynomial Basis Functions. Horizontal axis: Domain (x), from -1 to 1. Vertical axis: Chebyshev polynomials $p_i(x)$, $i = 1, \ldots, 5$.]

Chebyshev polynomials constitute a set of orthogonal polynomials with respect to the weight function $w(x) = 1/\sqrt{1 - x^2}$, because $\int_{-1}^{1} p_i(x) p_j(x) w(x)\, dx = 0$ for all $i \neq j$. Using orthogonal polynomials in the representation $d^n$ rather than the simple polynomials $x^i$ may be preferable as $n$ gets large. For large $n$, it is difficult to distinguish $x^n$ from $x^{n+1}$. Thus, the approximation is hardly improved when we add $x^{n+1}$. With orthogonal polynomials, however, $p_n$ is easily distinguished from $p_{n+1}$ because they are orthogonal to each other.
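One way to see this numerically is to compare the conditioning of the two basis matrices evaluated at the same points. The sketch below uses the roots of the Chebyshev polynomial of degree $n + 1$ as evaluation points and $n = 15$; both choices are assumptions made for the illustration.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb

n = 15
# Roots of the Chebyshev polynomial of degree n + 1, used as evaluation points.
x = np.cos(np.pi * (np.arange(n + 1) + 0.5) / (n + 1))

monomials = np.vander(x, n + 1, increasing=True)   # columns 1, x, ..., x^n
chebyshevs = cheb.chebvander(x, n)                 # columns p_0(x), ..., p_n(x)

cond_monomial = np.linalg.cond(monomials)
cond_chebyshev = np.linalg.cond(chebyshevs)
```

A large condition number for the monomial matrix reflects the near collinearity of $x^n$ and $x^{n+1}$; the Chebyshev matrix stays well conditioned at these points.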

Figure 4 displays basis functions that can be used to construct a piecewise linear representation for $d^n$. These basis functions are of the form

$$ \psi_i(x) = \begin{cases} \dfrac{x - x_{i-1}}{x_i - x_{i-1}} & \text{if } x \in [x_{i-1}, x_i] \\[2ex] \dfrac{x_{i+1} - x}{x_{i+1} - x_i} & \text{if } x \in [x_i, x_{i+1}] \\[2ex] 0 & \text{elsewhere.} \end{cases} \tag{4.2.2} $$

We do not need to have the points $x_i$, $i = 1, \ldots, n$ equally spaced. Therefore, if we want to represent a function that has large gradients or kinks in certain places (say, because inequality constraints bind), then we can cluster points in those regions. In regions where the function is near-linear, we do not need many points.
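The basis in (4.2.2) can be coded directly. In this sketch the nodes are indexed from zero, and `hat` and `interpolate` are hypothetical helper names; the unevenly spaced grid in the usage note is an assumption chosen to show clustering near the origin.

```python
import numpy as np

def hat(i, x, nodes):
    """Piecewise linear basis psi_i of (4.2.2) on an array x (0-based node index i)."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    if i > 0:                                    # rising part on [x_{i-1}, x_i]
        left = (x >= nodes[i - 1]) & (x <= nodes[i])
        out[left] = (x[left] - nodes[i - 1]) / (nodes[i] - nodes[i - 1])
    if i < len(nodes) - 1:                       # falling part on [x_i, x_{i+1}]
        right = (x >= nodes[i]) & (x <= nodes[i + 1])
        out[right] = (nodes[i + 1] - x[right]) / (nodes[i + 1] - nodes[i])
    return out

def interpolate(values, x, nodes):
    """d^n(x) = sum_i values[i] * psi_i(x); the coefficients are the nodal values."""
    return sum(values[i] * hat(i, x, nodes) for i in range(len(nodes)))
```

For example, with `nodes = np.array([0.0, 0.1, 0.5, 2.0, 5.0])` the representation is linear between adjacent nodes and puts more resolution near zero.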

[Figure 4. Five Piecewise Linear Basis Functions. Horizontal axis: Domain (x), from 0 to 10, with nodes $x_1, \ldots, x_5$ marked. Vertical axis: finite element linear bases $\psi_i(x)$, $i = 1, \ldots, 5$.]

We define the residual equation as the functional equation evaluated at the approximate solution $d^n$:

$$ R(x; \theta) = F(d^n(x; \theta)). $$

We want to choose $\theta$ so that $R(x; \theta)$ is close to zero for all $x$. Weighted residual methods get the residual close to zero in the weighted integral sense. That is, we choose $\theta$ so that

$$ \int_{\Omega} \phi_i(x) R(x; \theta)\, dx = 0, \qquad i = 1, \ldots, n, $$

where $\phi_i(x)$, $i = 1, \ldots, n$ are weight functions. Note that $\phi_i(x)$ and $\psi_i(x)$ can be different functions. Alternatively, the weighted integral can be written

$$ \int_{\Omega} w(x) R(x; \theta)\, dx = 0, \tag{4.2.3} $$

where $w(x) = \sum_i \omega_i \phi_i(x)$ and (4.2.3) must hold for any nonzero weights $\omega_i$, $i = 1, \ldots, n$. Therefore, instead of setting $R(x; \theta)$ to zero for all $x \in \Omega$, the method sets a weighted integral of $R$ to zero.

We consider three specific sets of weight functions and, hence, three ways of determining the coefficients $\theta_1, \ldots, \theta_n$.

1. Least Squares: $\phi_i(x) = \partial R(x; \theta)/\partial \theta_i$. This set of weights can be derived by calculating the first-order derivatives for the following optimization problem:

$$ \min_{\theta} \int_{\Omega} R(x; \theta)^2\, dx. $$

2. Collocation: $\phi_i(x) = \delta(x - x_i)$, where $\delta$ is the Dirac delta function. This set of weights implies that the residual is set to zero at $n$ points $x_1, \ldots, x_n$ called the collocation points: $R(x_i; \theta) = 0$, $i = 1, \ldots, n$. If the basis functions are chosen from a set of orthogonal polynomials with collocation points given as the roots of the $n$th polynomial in the set, the method is called orthogonal collocation.

3. Galerkin: $\phi_i(x) = \psi_i(x)$. In this case, the set of weight functions is the same as the basis functions used to represent $d$. Thus, the Galerkin method forces the residual to be orthogonal to each of the basis functions. As long as the basis functions are chosen from a complete set of functions, then equation (4.2.1) represents the exact solution, given that enough terms are included. The Galerkin method is motivated by the fact that a continuous function is zero if it is orthogonal to every member of a complete set of functions.

To illustrate weighted residual methods, we apply the methods to standard growth models. (See Aiyagari and McGrattan 1997, Braun and McGrattan 1993, and Chari, Kehoe and McGrattan 1997 for other examples.)

4.2.2. Applied to the Deterministic Growth Model

We start with a version of the deterministic growth model:

$$ \max_{\{c_t\}} \sum_{t=0}^{\infty} \beta^t u(c_t) \qquad \text{subject to } c_t + k_t = f(k_{t-1}), \tag{4.2.4} $$

where $c_t$ is consumption at date $t$, $k_t$ is the capital stock at date $t$, $u(\cdot)$ is the utility function, $f(\cdot)$ is the production function, and $\beta < 1$ is a discount factor.2 From the Euler equation, the functional equation is given by

$$ F(c)(k) = \beta\, \frac{u'\left(c(f(k) - c(k))\right)}{u'(c(k))}\, f'\left(f(k) - c(k)\right) - 1 = 0, $$

and the boundary condition is given by $c(0) = 0$. In this case, we want to compute an approximation $c^n(k; \theta)$ to the consumption function that sets $F(c)$ approximately equal to zero for all $k$.

2 See Sargent (1987) for a detailed discussion of the problems described here and in the next section.

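Example 2.1. Let $u(c) = \ln(c)$ and $f(k) = \lambda k^{\alpha}$. In this case, the functional equation is

$$ F(c)(k) = \frac{\beta \alpha \lambda \left(\lambda k^{\alpha} - c(k)\right)^{\alpha-1} c(k)}{c\left(\lambda k^{\alpha} - c(k)\right)} - 1. $$

The solution for consumption in this case is

$$ c(k) = (1 - \beta\alpha) \lambda k^{\alpha}. $$

Suppose that we want to obtain an approximate solution of the form

$$ c^n(k; \theta) = \theta_1 k + \theta_2 k^2 + \ldots + \theta_n k^n $$

which satisfies the boundary condition at $k = 0$. The residual equation is therefore

$$ R(k; \theta) = \frac{\beta \alpha \lambda \left(\lambda k^{\alpha} - \sum_{j=1}^{n} \theta_j k^j\right)^{\alpha-1} \sum_{j=1}^{n} \theta_j k^j}{\sum_{i=1}^{n} \theta_i \left(\lambda k^{\alpha} - \sum_{j=1}^{n} \theta_j k^j\right)^{i}} - 1. $$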
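The functional equation and the residual can be written as short functions. This sketch uses $\alpha = 0.25$, $\beta = 0.96$, and $\lambda = 1/(\alpha\beta)$, the values adopted later in this section; the function names are hypothetical.

```python
import numpy as np

alpha, beta = 0.25, 0.96
lam = 1.0 / (alpha * beta)

def F(c, k):
    """Euler residual for u = ln(c) and f(k) = lam * k**alpha, given a function c."""
    k_next = lam * k ** alpha - c(k)
    return beta * alpha * lam * k_next ** (alpha - 1.0) * c(k) / c(k_next) - 1.0

def c_exact(k):
    return (1.0 - beta * alpha) * lam * k ** alpha

def c_poly(theta):
    """c^n(k; theta) = theta_1 k + ... + theta_n k^n, so that c^n(0) = 0."""
    theta = np.asarray(theta, dtype=float)
    powers = np.arange(1, len(theta) + 1)
    return lambda k: np.sum(theta * np.power.outer(np.asarray(k, dtype=float), powers), axis=-1)

def R(k, theta):
    """Residual R(k; theta) = F(c^n(.; theta))(k)."""
    return F(c_poly(theta), k)
```

Evaluating `F(c_exact, k)` on a grid of capital stocks should return zeros up to rounding error, which is a useful check before any approximation is attempted.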

To apply weighted residual methods, we have to compute integrals of the form

$$ \int_{0}^{\bar{k}} \phi_i(k) R(k; \theta)\, dk, \qquad i = 1, \ldots, n \tag{4.2.5} $$

where $\bar{k}$ is the upper bound of the domain for the capital stock. Since the residual $R$ is a nonlinear function of $\theta$, it makes sense to do numerical integration. If we apply Gaussian quadrature (which is typically done), then equation (4.2.5) is replaced by

$$ \sum_{l} \omega_l \phi_i(k_l) R(k_l; \theta), \qquad i = 1, \ldots, n \tag{4.2.6} $$

where $\omega_l$ are the quadrature weights and the grid points $k_l$ are the quadrature abscissas. (See Press et al. 1986 for the quadrature formulas and a description of how they are derived.) The values for $\omega_l$ and $k_l$ do not depend on the function being integrated ($\phi_i(k) R(k; \theta)$ in this case). In other words, once we know the bounds of integration (for example, 0 and $\bar{k}$) and the number of quadrature points, we can look up the $\omega_l$'s and $k_l$'s in a standard quadrature table.3 Depending on the specific quadrature rule (for example, Legendre, Chebyshev, Hermite) used, $\omega_l$'s and $k_l$'s will differ, but the calculations of $R$ and $\phi$ will look the same no matter what quadrature rule is used.
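In practice the table lookup is a library call. The sketch below uses NumPy's `leggauss` for Gauss-Legendre nodes and weights on $[-1, 1]$ and rescales them to a general interval; `gauss_legendre` and `weighted_integrals` are hypothetical helper names.

```python
import numpy as np

def gauss_legendre(n_points, lower, upper):
    """Gauss-Legendre abscissas and weights rescaled from [-1, 1] to [lower, upper]."""
    x, w = np.polynomial.legendre.leggauss(n_points)
    half = 0.5 * (upper - lower)
    return lower + half * (x + 1.0), half * w

def weighted_integrals(phis, residual, nodes, weights):
    """Approximate (4.2.5) by the sums in (4.2.6), one entry per weight function."""
    r = residual(nodes)
    return np.array([np.sum(weights * phi(nodes) * r) for phi in phis])
```

With bounds 0 and 2 the rescaled abscissas are $x_l + 1$, and the rule integrates low-order polynomials exactly, which gives a quick sanity check.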

The final step is to solve the system of equations in (4.2.6). In this case, the system is nonlinear. The problem is to find $\theta$ such that $G(\theta) = 0$, where $G$ has the same dimension as $\theta$. Applying Newton's method to $G(\theta) = 0$ means iterating on

$$ \theta^{j+1} = \theta^{j} - \left[\left.\frac{\partial G(\theta)}{\partial \theta}\right|_{\theta = \theta^{j}}\right]^{-1} G(\theta^{j}), \qquad j = 0, 1, 2, \ldots $$

with some initial guess $\theta^0$, where $\theta^j$ is the vector of unknown coefficients at the $j$th iteration. Notice that as we iterate, we solve a sequence of problems of the following form: find $\theta$ such that $A\theta = b$, where $A$ is the Jacobian matrix $\partial G/\partial \theta$ evaluated at $\theta^j$ and $b$ is the function itself, $G(\theta^j)$.
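A bare-bones version of this iteration is sketched below. The forward-difference Jacobian is an assumption of the sketch (the notes do not say how the Jacobian is obtained), and each step solves the linear system for the Newton correction rather than forming an inverse.

```python
import numpy as np

def jacobian_fd(G, theta, h=1e-7):
    """Forward-difference approximation to the Jacobian dG/dtheta."""
    g0 = G(theta)
    J = np.empty((len(g0), len(theta)))
    for j in range(len(theta)):
        step = np.zeros_like(theta)
        step[j] = h * max(1.0, abs(theta[j]))
        J[:, j] = (G(theta + step) - g0) / step[j]
    return J

def newton(G, theta0, tol=1e-10, max_iter=50):
    """Iterate theta <- theta - J^{-1} G(theta) until max |G| < tol."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        g = G(theta)
        if np.max(np.abs(g)) < tol:
            return theta
        theta = theta - np.linalg.solve(jacobian_fd(G, theta), g)
    raise RuntimeError("Newton iteration did not converge")
```

For the weighted residual problem, `G` would return the $n$ sums in (4.2.6) as a function of $\theta$.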

For the three weighted residual applications below, assume that $\alpha = 0.25$, $\beta = 0.96$, $\lambda = 1/(\alpha\beta)$, and $\bar{k} = 2$. For this set of parameters, the steady-state capital stock is equal to one. Assume also that the quadrature rule is Legendre with 20 quadrature abscissas used to approximate the integral in (4.2.5). In this case, $\omega_l = \int_{-1}^{1} \prod_{i=1, i \neq l}^{20} (x - x_i)/(x_l - x_i)\, dx$, $l = 1, \ldots, 20$, and $x_1, \ldots, x_{20}$ are the roots of the 20th Legendre polynomial found recursively as follows: $p_0(x) = 1$, $p_1(x) = x$, $i\, p_i(x) = (2i - 1) x\, p_{i-1}(x) - (i - 1) p_{i-2}(x)$ for $i = 2, \ldots, 20$. Since $\bar{k} = 2$, the points $k_l$ are given by $k_l = x_l + 1$, $l = 1, \ldots, 20$.

5a. Least squares. To apply the method of least squares, we set $\phi_i(k) = \partial R(k; \theta)/\partial \theta_i$, where the derivative of the residual is given by

$$ \frac{\partial R(k; \theta)}{\partial \theta_\ell} = \frac{\beta \alpha \lambda \tilde{k}^{\alpha-1}}{c^n(\tilde{k}; \theta)} \left\{ k^{\ell} \left[1 - \frac{(\alpha - 1)\, c^n(k; \theta)}{\tilde{k}}\right] - \frac{c^n(k; \theta)}{c^n(\tilde{k}; \theta)} \left(\tilde{k}^{\ell} - k^{\ell} \sum_{i=1}^{n} i \theta_i \tilde{k}^{i-1}\right) \right\}, $$

$\ell = 1, \ldots, n$, and $\tilde{k} = \lambda k^{\alpha} - \sum_j \theta_j k^j$. Figure 5 displays the approximate solution $c^n$ for $n = 5$ along with the exact solution. Since the derivative of the true function is infinite at $k = 0$ and relatively small for high values of $k$, we must add more polynomials to completely resolve the solution at all capital stocks. We also plot the result for a more restricted grid on the capital stocks, namely, $[\frac{1}{3}, \frac{5}{3}]$. This is the grid Judd (1992) uses when evaluating weighted residual methods for the deterministic growth model. For both approximations, we assume that $n = 5$. Notice that the approximation on $[\frac{1}{3}, \frac{5}{3}]$ is very close to the true solution, but on this interval the exact solution is very smooth, almost linear.

3 The weights and abscissas are chosen so that the $n$-point quadrature rule is exact for integrals of all polynomials of order $2n - 1$ times some weight function, which depends on the specific rule. For example, Gauss-Legendre quadrature uses a weight function of 1 and Gauss-Chebyshev quadrature uses a weight function of $1/\sqrt{1 - x^2}$, where $x$ is defined on $(-1, 1)$.
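Because the least squares weights are derivatives of the residual, it is worth checking any hand-derived gradient against finite differences before using it. The sketch below differentiates $R(k;\theta) = \beta\alpha\lambda \tilde{k}^{\alpha-1} c^n(k;\theta)/c^n(\tilde{k};\theta) - 1$ with $\tilde{k} = \lambda k^{\alpha} - c^n(k;\theta)$ by the chain rule; the function names are hypothetical.

```python
import numpy as np

alpha, beta = 0.25, 0.96
lam = 1.0 / (alpha * beta)

def c_n(k, theta):
    return sum(th * k ** (i + 1) for i, th in enumerate(theta))

def dc_n(k, theta):
    """Derivative of c^n with respect to k."""
    return sum((i + 1) * th * k ** i for i, th in enumerate(theta))

def residual(k, theta):
    kt = lam * k ** alpha - c_n(k, theta)
    return beta * alpha * lam * kt ** (alpha - 1.0) * c_n(k, theta) / c_n(kt, theta) - 1.0

def residual_grad(k, theta):
    """Analytic dR/dtheta_l, l = 1, ..., n, using d(kt)/d(theta_l) = -k**l."""
    kt = lam * k ** alpha - c_n(k, theta)
    c, ct = c_n(k, theta), c_n(kt, theta)
    pref = beta * alpha * lam * kt ** (alpha - 1.0) / ct
    return np.array([
        pref * (k ** l * (1.0 - (alpha - 1.0) * c / kt)
                - (c / ct) * (kt ** l - k ** l * dc_n(kt, theta)))
        for l in range(1, len(theta) + 1)
    ])
```

Comparing `residual_grad` with a central difference of `residual` at a few values of $k$ and $\theta$ catches sign errors cheaply.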

[Figure 5. Two Least-squares Approximations for the Deterministic Growth Model. Horizontal axis: Capital Stock, from 0 to 2. Vertical axis: Exact and Approximate Solutions for Consumption. Series: Exact; Approximation on [0,2]; Approximation on [1/3,5/3].]

5b. Collocation. To apply the collocation method, we set $\phi_i(k) = \delta(k - k_i)$, where $k_i$, $i = 1, \ldots, n$ are collocation points in $[0, \bar{k}]$. Figure 6 shows two approximations: one with five evenly spaced collocation points between 0.1 and 2 and one with five evenly spaced collocation points between $\frac{1}{3}$ and $\frac{5}{3}$. The problem of fitting functions with steep gradients becomes acute in this case, which is why we avoid the region of capital stocks below 0.1. Even so, the approximation on $[0.1, 2]$ is not very accurate. It is clear that we need better choices for basis functions and collocation points to make this method competitive with least squares. On $[\frac{1}{3}, \frac{5}{3}]$, we find that the approximation is not quite as good as that for least squares, but it is not too different from the exact solution. Here again, the fit is good because the exact solution is very smooth on $[\frac{1}{3}, \frac{5}{3}]$.


[Figure 6. Two Collocation Approximations for the Deterministic Growth Model. Vertical axis: Exact and Approximate Solutions for Consumption. Series: Exact; Approximation on [.1,2]; Approximation on [1/3,5/3].]

5c. Galerkin. To apply the Galerkin method, we set $\phi_i(k) = k^i$, $i = 1, \ldots, n$. Figure 7 shows approximate functions on $[0, 2]$ and $[\frac{1}{3}, \frac{5}{3}]$ along with the exact solution. The results here are similar to the results of the least squares method.

[Figure 7. Two Galerkin Approximations for the Deterministic Growth Model. Vertical axis: Exact and Approximate Solutions for Consumption. Series: Exact; Approximation on [0,2]; Approximation on [1/3,5/3].]

Because we need to include more polynomials, which in the case of $k^i$, $i = 1, \ldots, n$ become similar to each other as $n$ gets large, it makes sense to use a class of orthogonal polynomials. Judd (1992) uses a representation for consumption of the form

$$ c^n(k; \theta) = \sum_{i=1}^{n} \theta_i \psi_i(k), \tag{4.2.7} $$

where $\psi_i(k) = p_{i-1}(2(k - \underline{k})/(\bar{k} - \underline{k}) - 1)$, $\underline{k}$ is a lower bound on the capital stocks, and $p_i(x)$ is the $i$th Chebyshev polynomial defined by the recursion in Section 4.2.1.4

Example 2.2. In this case, assume that $u(c) = c^{1-\tau}/(1 - \tau)$ and $f(k) = \lambda k^{\alpha} + (1 - \delta) k$. Let $\tau = 5$, $\alpha = 0.25$, $\delta = 0.025$, $\beta = 0.99$, and $\lambda = (1 - \beta(1 - \delta))/(\alpha\beta)$ (so that the steady-state capital is equal to 1). Let $c^n$ take the form of (4.2.7) with $n = 10$. Figure 8 displays the approximate solutions for $\underline{k} = 0.03$, $\bar{k} = 2$ (marked with a square), and $\underline{k} = 0.1$, $\bar{k} = 1.9$ (marked with a circle) along with the exact solution.5 The points marked by squares or circles are located at the quadrature abscissas.

4 Note that this approximation will not satisfy the boundary condition $c(0) = 0$ for arbitrary $\theta$ if $\underline{k} = 0$. However, if we make a slight modification, namely, $\psi_i(k) = k\, p_{i-1}(2k/\bar{k} - 1)$ defined on $[0, \bar{k}]$, then the boundary condition is satisfied for all possible choices of $\theta$.


[Figure 8. Two Galerkin Approximations with Chebyshev Basis Functions for the Deterministic Growth Model. Vertical axis: Exact and Approximate Solutions for Consumption. Series: Exact; Approximation on [.03,2]; Approximation on [.1,1.9].]

It is clear from Figure 8 that more polynomials are needed for a good approximation

on [0.03,2]. This is because we are trying to approximate a very steep part of the function

and a very flat part of the function using the same basis functions. When we restrict

the domain to [0.1,1.9], there is a significant improvement in the approximation over this

region of the state space. The approximation is visually indistinguishable from the exact

solution. In this restricted region of the domain, the function does not have any large

gradients.

5 What we'll call the exact solution here is actually a finite element approximation with a large number of elements. Although this itself is an approximation, doubling the number of elements leaves Figure 8 unchanged.

Suppose that, instead of using Chebyshev polynomials, we apply the Galerkin method

with piecewise linear basis functions as is done for the finite element method.

Example 7. Assume that $u(\cdot)$, $f(\cdot)$, and the parameterization are the same as in Example 2.2. Let $x_1 = 0$, $x_{11} = 2$, and $x_i = x_{i-1} + 0.005 \exp(0.574(i - 2))$. This partition implies that there are 10 elements with lengths that increase exponentially. Thus, there will be more points near the origin, where the function has a large (infinite in this case) gradient. To compute the weighted integral, we use a Legendre quadrature rule with two quadrature points per element. On an element of length $\ell_e$, the Legendre quadrature rule with two quadrature points implies the following weights and abscissas for (4.2.6): $\omega_l = \ell_e/2$, $l = 1, 2$, and $k_1 = k_e + 0.211\ell_e$, $k_2 = k_e + 0.789\ell_e$, where $k_e$ is the first endpoint of the element.

Figure 9 displays the finite element approximation along with the exact solution. Because the finite element method is a piecewise application of a weighted residual method, it is possible to get a more accurate approximation over the entire $[0, 2]$ domain: we are not using the same basis functions in the very steep region and the very flat region of the consumption function.
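The partition and the two-point rule described in this example can be generated as follows. This is only a sketch: the final node is set to 2 as in the text, and the constants 0.211 and 0.789 are replaced by their exact values $\frac{1}{2} \mp \frac{1}{2\sqrt{3}}$. The helper names are hypothetical.

```python
import numpy as np

x = np.empty(11)
x[0] = 0.0
for i in range(2, 12):                       # i is the 1-based node index in the text
    x[i - 1] = x[i - 2] + 0.005 * np.exp(0.574 * (i - 2))
last_from_recursion = x[-1]                  # close to, but not exactly, 2
x[-1] = 2.0                                  # the text fixes x_11 = 2

def two_point_rule(k_e, length):
    """Two-point Gauss-Legendre abscissas and weights on [k_e, k_e + length]."""
    offset = 0.5 / np.sqrt(3.0)              # 0.5 -/+ offset = 0.211..., 0.789...
    nodes = k_e + length * np.array([0.5 - offset, 0.5 + offset])
    weights = np.array([0.5 * length, 0.5 * length])
    return nodes, weights

def integrate(g, grid):
    """Sum the two-point rule over all elements of the grid."""
    total = 0.0
    for k_e, k_next in zip(grid[:-1], grid[1:]):
        nodes, weights = two_point_rule(k_e, k_next - k_e)
        total += np.sum(weights * g(nodes))
    return total
```

The two-point rule is exact for cubics on each element, so `integrate(lambda k: k**3, x)` should reproduce $\int_0^2 k^3\,dk = 4$.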


[Figure 9. Finite Element Approximation for the Deterministic Growth Model. Vertical axis: Exact and Approximate Solutions for Consumption. Series: Exact; Approximation.]

To obtain the approximation in Figure 9, the main computational task is the inversion

of a 10×10 matrix. In this matrix, 68 of the 100 elements are zeros, and the structure of the

matrix is band diagonal. As the number of unknowns becomes large, it becomes expensive

and, in some cases, infeasible to invert the matrix without using inversion routines that

exploit the fact that the matrix is band diagonal. (See Saad 1996.)

4.2.3. Applied to the Stochastic Growth Model

Suppose that, instead of the deterministic growth model, we want to calculate the decision functions for the stochastic growth model in which decisions depend on the capital stock and a stochastic shock.6 The stochastic growth model assumes that output at date $t$ can be allocated either to current consumption $c_t$ or to current investment $i_t$. The consumption/savings decision is assumed to be optimal in that the preferences of households are maximized. The preferences are given by

$$ E\left[\sum_{t=0}^{\infty} \beta^t u(c_t) \,\Big|\, k_{-1}\right], \qquad 0 < \beta < 1, \tag{4.2.8} $$

where $k_t$ is the capital stock at $t$ and $k_{-1}$ is known. The maximization of equation (4.2.8) is done subject to the feasibility constraints

$$ c_t + k_t - (1 - \delta) k_{t-1} = \lambda_t k_{t-1}^{\alpha}, \qquad 0 < \alpha < 1, \quad 0 \leq \delta \leq 1, \tag{4.2.9} $$

the nonnegativity constraints $c_t \geq 0$, $k_t \geq 0$ for all $t \geq 0$, and subject to the process for the technology shock,

$$ \ln \lambda_t = \rho \ln \lambda_{t-1} + \varepsilon_t, \qquad -1 < \rho < 1, \tag{4.2.10} $$

where $\varepsilon_t$ is a serially uncorrelated, normally distributed random variable with mean zero and variance $\sigma^2$. Because $\varepsilon$ is normally distributed, it does not have a compact support. The technology shock in this case takes on values between 0 and infinity. On the computer, we cannot specify an upper bound of infinity. Instead, we can either specify a large upper bound (in which the probability of observing a larger value is small) or make a transformation of variables and work with a bounded interval. Let $z = \tanh(\ln(\lambda))$, which is defined on $[-1, 1]$. Then we can rewrite equation (4.2.10) as follows:

$$ z_t = \tanh\left(\rho \tanh^{-1}(z_{t-1}) + \sqrt{2}\sigma \nu_t\right), $$

where $\nu_t = \varepsilon_t/(\sqrt{2}\sigma)$.

6 See Judd (1992) for more details on spectral methods as applied to this problem and McGrattan (1996) for more details on the finite element method as applied to this problem.
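The change of variables is easy to verify numerically. Since $\tanh^{-1}(z) = \frac{1}{2}\ln((1+z)/(1-z))$, the level of the shock can be recovered as $\lambda = \sqrt{(1+z)/(1-z)}$. In the sketch below the values of $\rho$ and $\sigma$ and the function names are illustrative assumptions.

```python
import numpy as np

rho, sigma = 0.95, 0.1     # illustrative values (assumed)

def next_z(z, nu):
    """Law of motion for z = tanh(ln(lambda)) implied by (4.2.10)."""
    return np.tanh(rho * np.arctanh(z) + np.sqrt(2.0) * sigma * nu)

def shock_level(z):
    """Invert the transformation: lambda = exp(arctanh(z)) = sqrt((1 + z) / (1 - z))."""
    return np.sqrt((1.0 + z) / (1.0 - z))
```

Starting from some $\lambda_{t-1}$ and a draw $\varepsilon_t$, applying `next_z` and then `shock_level` should give the same $\lambda_t$ as iterating on (4.2.10) directly.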

Because the stochastic shock takes on a continuum of values, we need to solve a

two-dimensional problem. The representation of the approximate solution is then

cn (k, z; θ) =

n∑

i=1

θiψi (k, z) .

A simple set of basis functions is all products of the elements of $\{1, k, k^2, \ldots, k^{n_k}\}$ and $\{1, z, z^2, \ldots, z^{n_z}\}$. Alternatively, we can use all products of the elements of two sets of orthogonal polynomials. In either case, however, the number of unknowns starts to add up quickly, especially if a large number of polynomials are needed to approximate consumption at both high and low values of the capital stock.

One way to keep the problem tractable is to use the set of complete polynomials rather than all products of terms in $\{k^i\}_{i=0}^{n_k}$ and $\{z^i\}_{i=0}^{n_z}$ (for example, bases $1, k, z, k^2, kz, z^2$ rather than $1, k, z, kz, k^2, z^2, k^2 z, k z^2, k^2 z^2$). Using the set of complete polynomials allows us to approximate higher-order functions but limits the number of unknown coefficients.7 Another way to keep the problem tractable is to apply a finite element method.

As earlier examples show, the system of equations to be solved for the unknown coefficients

θ is typically very sparse. Therefore, in big problems, we do not need as much storage as

in a typical spectral method, and we can apply algorithms for solving sparse systems of

equations.
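The difference between the tensor-product and complete polynomial bases is easy to see by enumerating exponent pairs $(i, j)$ for the terms $k^i z^j$. The function names below are hypothetical.

```python
from itertools import product

def tensor_basis(nk, nz):
    """Exponent pairs (i, j) for all products k^i z^j with i <= nk and j <= nz."""
    return [(i, j) for i, j in product(range(nk + 1), range(nz + 1))]

def complete_basis(degree):
    """Exponent pairs (i, j) for k^i z^j with total degree i + j <= degree."""
    return [(i, j) for i, j in product(range(degree + 1), repeat=2) if i + j <= degree]
```

For degree 2 this gives the six complete terms and nine tensor-product terms listed in the text; the gap widens quickly as the degree grows.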

Consider application of the finite element method to the stochastic growth model.

The first step is to write out the residual equation using the first-order condition for the

problem in (4.2.8):

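$$ R(k, z; \theta) = \frac{\beta}{\sqrt{\pi}} \int_{-\infty}^{\infty} \frac{c^n(\tilde{k}, \tilde{z}; \theta)^{-\tau}}{c^n(k, z; \theta)^{-\tau}} \left(\alpha \tilde{k}^{\alpha-1} \sqrt{\frac{1 + \tilde{z}}{1 - \tilde{z}}} + 1 - \delta\right) e^{-\nu^2}\, d\nu - 1 = 0, \tag{4.2.11} $$

where

$$ \begin{aligned} \tilde{k} &= k^{\alpha} \sqrt{(1 + z)/(1 - z)} + (1 - \delta) k - c^n(k, z; \theta) \\ \tilde{z} &= \tanh\left(\rho \tanh^{-1}(z) + \sqrt{2}\sigma\nu\right), \end{aligned} $$

$c^n(0, z; \theta) = 0$, $\nu$ is distributed normally with mean zero and variance 1/2, and the domain for the state space is $\Omega = [0, \bar{k}] \times [-1, 1]$. If we apply a Gauss-Hermite quadrature rule when computing the integral in equation (4.2.11), then the residual equation becomes

$$ R(k, z; \theta) \simeq \frac{\beta}{\sqrt{\pi}} \sum_{l=1}^{m_\nu} \frac{c^n(\tilde{k}, \tilde{z}_l; \theta)^{-\tau}}{c^n(k, z; \theta)^{-\tau}} \left(\alpha \tilde{k}^{\alpha-1} \sqrt{\frac{1 + \tilde{z}_l}{1 - \tilde{z}_l}} + 1 - \delta\right) \omega_l - 1, $$

where $\tilde{z}_l = \tanh(\rho \tanh^{-1}(z) + \sqrt{2}\sigma\nu_l)$ and $\nu_l$, $\omega_l$, $l = 1, \ldots, m_\nu$ are the abscissas and weights for an $m_\nu$-point quadrature rule. (For the quadrature formulas, see Press et al. 1986.)

7 See Judd (1992) for a comparison of complete polynomials and tensor products in the stochastic growth model example.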
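The $1/\sqrt{\pi}$ factor and the $\sqrt{2}\sigma$ scaling come from the change of variables $\varepsilon = \sqrt{2}\sigma\nu$ in the normal density. A minimal sketch of the resulting expectation operator, using NumPy's `hermgauss` (Gauss-Hermite nodes and weights for the weight function $e^{-\nu^2}$), is shown below; the function name `expectation` is hypothetical.

```python
import numpy as np

def expectation(g, sigma, m=10):
    """Approximate E[g(eps)] for eps ~ N(0, sigma^2) with an m-point Gauss-Hermite rule."""
    nu, w = np.polynomial.hermite.hermgauss(m)
    return np.sum(w * g(np.sqrt(2.0) * sigma * nu)) / np.sqrt(np.pi)
```

As a check, `expectation(np.exp, sigma)` should be close to $e^{\sigma^2/2}$, the mean of a lognormal variable.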

The second step in applying the finite element method is to divide up the domain

into smaller nonoverlapping subdomains called elements. In this problem, the domain is

two-dimensional and rectangular: Ω = [0, k]× [−1, 1]. A reasonable choice for the element

shape, therefore, is a rectangle. Suppose that we divide the domain into smaller rectangular

subdomains which do not overlap.8 Each element will be a rectangle in $\Omega$, say, $[k_i, k_{i+1}] \times [z_j, z_{j+1}]$, where $k_i$ is the $i$th grid point for the capital stock and $z_j$ is the $j$th grid point for the technology shock.

We consider two types of approximations over the rectangular elements: linear and quadratic. Suppose the representation for consumption on some element $e$ is linear,

$$ c^n_e(k, z) = a + b\,k + c\,z + d\,kz. \tag{4.2.12} $$

Because there are four unknowns, we require an element with four nodes. If we place the four nodes at the corners of the rectangle, then we can uniquely define the geometry of the element and use the values of the solution at the four nodes to pin down the constants in equation (4.2.12). That is, as in the one-dimensional case, we can rewrite the approximation in (4.2.12) so that $c^n_e(k, z; \theta) = \sum_i \theta^e_i \psi^e_i(k, z)$, $i = 1, \ldots, 4$, where the basis functions are such that $\psi^e_i$ is 1 at node $i$ and zero at the other three nodes on the element.

Before we give formulas for the basis functions, it is convenient to first consider a mapping from global coordinates $(k, z)$ to local coordinates $(\xi, \eta)$ defined on a master element. This is done for convenience, since the master element has a fixed set of coordinates, while each element in $\Omega$ has a different set of coordinates. Thus, we can construct basis functions once but use them for each element. Consider functions $\xi(k)$ and $\eta(z)$ that map a typical element $[k_i, k_{i+1}] \times [z_j, z_{j+1}]$ to the square $[-1, 1] \times [-1, 1]$; that is, $\xi(k) = (2k - k_i - k_{i+1})/(k_{i+1} - k_i)$ and $\eta(z) = (2z - z_j - z_{j+1})/(z_{j+1} - z_j)$. Assume that the four nodes of the master element are $(-1, -1)$, $(1, -1)$, $(1, 1)$, and $(-1, 1)$ using the local coordinates. In this case, the basis functions are constructed so that $c^n_e(\xi, \eta; \theta) = \sum_i \theta^e_i \psi^e_i(\xi, \eta)$ with $\theta^e_1 = c^n_e(-1, -1; \theta)$, $\theta^e_2 = c^n_e(1, -1; \theta)$, $\theta^e_3 = c^n_e(1, 1; \theta)$, and $\theta^e_4 = c^n_e(-1, 1; \theta)$. These restrictions imply that

$$ \begin{aligned} c^n_e(\xi, \eta; \theta) &= \tfrac{1}{4}(1 - \xi)(1 - \eta)\,\theta^e_1 + \tfrac{1}{4}(1 + \xi)(1 - \eta)\,\theta^e_2 \\ &\quad + \tfrac{1}{4}(1 + \xi)(1 + \eta)\,\theta^e_3 + \tfrac{1}{4}(1 - \xi)(1 + \eta)\,\theta^e_4. \end{aligned} \tag{4.2.13} $$

8 Extensions to non-rectangular element shapes require additional work but are not as useful in economic problems as in engineering problems, which sometimes involve irregularly shaped domains. (See, for example, Hughes 1987 and Reddy 1993.)
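The mapping to the master element and the four bilinear basis functions in (4.2.13) can be sketched as follows. The node ordering follows the text; the function names are hypothetical.

```python
import numpy as np

# Master-element nodes in the order used in the text.
NODES = np.array([[-1.0, -1.0], [1.0, -1.0], [1.0, 1.0], [-1.0, 1.0]])

def to_local(k, z, k_lo, k_hi, z_lo, z_hi):
    """Map (k, z) in [k_lo, k_hi] x [z_lo, z_hi] to (xi, eta) in [-1, 1]^2."""
    xi = (2.0 * k - k_lo - k_hi) / (k_hi - k_lo)
    eta = (2.0 * z - z_lo - z_hi) / (z_hi - z_lo)
    return xi, eta

def bilinear_basis(xi, eta):
    """The four basis functions of (4.2.13): (1/4)(1 + xi_i xi)(1 + eta_i eta)."""
    return 0.25 * (1.0 + NODES[:, 0] * xi) * (1.0 + NODES[:, 1] * eta)

def c_element(k, z, theta_e, k_lo, k_hi, z_lo, z_hi):
    """Evaluate the element approximation from its four nodal values theta_e."""
    xi, eta = to_local(k, z, k_lo, k_hi, z_lo, z_hi)
    return bilinear_basis(xi, eta) @ theta_e
```

Each basis function equals one at its own node and zero at the other three, so the coefficients $\theta^e_i$ are the values of consumption at the corners of the element, and any function of the form (4.2.12) is reproduced exactly.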

To attain a more accurate approximation, we can increase the number of elements

while retaining linear basis functions or use higher-order polynomials. Consider, for exam-

ple, quadratic functions in two dimensions. One simple way to construct these functions

is to take the product of one-dimensional quadratic polynomials. A unique set of coeffi-

cients for the polynomial requires that there be nine nodes and, hence, nine interpolation

functions. In this case, the approximation on the master element [−1, 1] × [−1, 1] is given

by

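$$\begin{aligned}
c^n_e(\xi, \eta; \theta) ={}& \tfrac{1}{4}\xi(\xi-1)\eta(\eta-1)\,\theta^e_1 + \tfrac{1}{4}\xi(\xi+1)\eta(\eta-1)\,\theta^e_2 + \tfrac{1}{4}\xi(\xi+1)\eta(\eta+1)\,\theta^e_3 \\
&+ \tfrac{1}{4}\xi(\xi-1)\eta(\eta+1)\,\theta^e_4 + \tfrac{1}{2}(1-\xi^2)\eta(\eta-1)\,\theta^e_5 + \tfrac{1}{2}\xi(\xi+1)(1-\eta^2)\,\theta^e_6 \\
&+ \tfrac{1}{2}(1-\xi^2)\eta(\eta+1)\,\theta^e_7 + \tfrac{1}{2}\xi(\xi-1)(1-\eta^2)\,\theta^e_8 + (1-\xi^2)(1-\eta^2)\,\theta^e_9.
\end{aligned} \tag{4.2.14}$$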
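The nine functions in (4.2.14) are products of one-dimensional quadratic Lagrange polynomials on the nodes $-1$, $0$, $1$. The sketch below builds them that way; the node ordering (corners 1 to 4, mid-sides 5 to 8, center 9) is read off from (4.2.14), and the names are hypothetical.

```python
# Illustrative sketch of the tensor-product quadratic basis in (4.2.14).

def quad_1d(s):
    """1-D quadratic Lagrange polynomials keyed by their node (-1, 0, or +1)."""
    return {-1: 0.5 * s * (s - 1), 0: 1.0 - s * s, 1: 0.5 * s * (s + 1)}

# Node ordering assumed from (4.2.14): corners, then mid-sides, then the center.
NODES = [(-1, -1), (1, -1), (1, 1), (-1, 1),
         (0, -1), (1, 0), (0, 1), (-1, 0), (0, 0)]

def quadratic_basis(xi, eta):
    """Return the nine basis functions evaluated at (xi, eta)."""
    p_xi, p_eta = quad_1d(xi), quad_1d(eta)
    return [p_xi[a] * p_eta[b] for a, b in NODES]
```

As in the bilinear case, each function is one at its own node and zero at the other eight, and the nine functions sum to one everywhere on the master element.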

Example 8. Let τ = 1, δ = 0, β = 0.95, α = 0.33, ρ = 0.95, and σ = 0.1. Assume that

the partition on z is given by [−0.391, −0.123, 0.123, 0.391] and that the partition on k

is given by [0, 0.010, 0.036, 0.102, 0.273, 0.714, 1.85]. We set the number of quadrature

points on each element to nine, that is, three points for integration with respect to the

capital stock and three points for integration with respect to the technology shock. For

integration over ν, we set the number of quadrature points, mν , equal to 10.

Figure 10 displays the approximate piecewise linear solution (marked with a square)

along with the exact solution. Even though there are only 18 elements, it is hard to

distinguish the two.
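As a rough illustration of the discretization in Example 8, the sketch below builds the 18 rectangular elements from the two partitions and places a 3 × 3 tensor-product rule on each. It assumes the three points per dimension are Gauss-Legendre points (as in Section 1.3.4); the notes do not state the rule here, and all names are hypothetical.

```python
import math

K_GRID = [0, 0.010, 0.036, 0.102, 0.273, 0.714, 1.85]   # partition on k (Example 8)
Z_GRID = [-0.391, -0.123, 0.123, 0.391]                  # partition on z (Example 8)

# Three-point Gauss-Legendre rule on [-1, 1] (assumed choice of quadrature).
GL_NODES = [-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0)]
GL_WEIGHTS = [5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0]

def elements(k_grid, z_grid):
    """List of rectangles (k_lo, k_hi, z_lo, z_hi) covering the domain."""
    return [(k_grid[i], k_grid[i + 1], z_grid[j], z_grid[j + 1])
            for i in range(len(k_grid) - 1) for j in range(len(z_grid) - 1)]

def element_quadrature(k_lo, k_hi, z_lo, z_hi):
    """Nine (k, z, weight) triples for integrating over one element."""
    jac = 0.25 * (k_hi - k_lo) * (z_hi - z_lo)   # Jacobian of the map to [-1, 1]^2
    points = []
    for xi, w_xi in zip(GL_NODES, GL_WEIGHTS):
        k = 0.5 * ((k_hi - k_lo) * xi + k_hi + k_lo)
        for eta, w_eta in zip(GL_NODES, GL_WEIGHTS):
            z = 0.5 * ((z_hi - z_lo) * eta + z_hi + z_lo)
            points.append((k, z, w_xi * w_eta * jac))
    return points
```

With six intervals for $k$ and three for $z$, `elements(K_GRID, Z_GRID)` has 18 entries, which matches the element count quoted in the example.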



[Figure 10. Two Finite Element Approximations for the Stochastic Growth Model. Vertical axis: Exact and Approximate Solutions for Consumption. Legend: Exact; 18 Element, Linear Bases; 18 Element, Quadratic Bases.]

Example 9. Suppose that we use the same parameterization as in Example 8, but instead

of linear basis functions, we use the quadratic functions in equation (4.2.14). In Figure 10,

the solution is marked with a circle. Notice that the fit with quadratic bases is slightly bet-

ter than that with linear bases – however, since the coarse piecewise linear approximation

is very accurate, there is not much room for improvement.


Chapter 5.

Maximum Likelihood Estimation

We describe how to use the Kalman filter to obtain an innovations representation and how

to use it to compute a Gaussian likelihood function. Finally, we display formulas for the

gradient of the log of the Gaussian likelihood function with respect to free parameters of an

economic model. These formulas are messy, but easy to program and useful for accelerating

the process of maximizing the likelihood function.

Constructing an innovations representation is a key step in deducing the implications

of a model for vector autoregressions and for evaluating a Gaussian likelihood function.9 An

innovations representation is a state-space representation in which the vector white noise

driving the system is of the correct dimension (equal to that of the vector of observables)

and lives in the proper space (the space spanned by current and lagged values of the

observables).

Suppose that our theorizing and data collection lead us to a system of the form

$$\begin{aligned}
x_{t+1} &= A_o x_t + C w_{t+1} \\
z_t &= G x_t + v_t \\
v_t &= D v_{t-1} + \eta_t,
\end{aligned} \tag{5.1}$$

where $D$ is a matrix whose eigenvalues are bounded in modulus by unity and $\eta_t$ is a
martingale difference sequence that satisfies

$$E\eta_t\eta_t' = R, \qquad E w_{t+1}\eta_s' = 0 \ \text{ for all } t \text{ and } s.$$

In Eq. (5.1), $v_t$ is a serially correlated measurement error process that is orthogonal to the
$x_t$ process.

We define the quasi-differenced process as

$$\bar z_t \equiv z_{t+1} - D z_t. \tag{5.2}$$

From Eq. (5.1) and the definition (5.2) it follows that

$$\bar z_t = (G A_o - DG)\, x_t + G C w_{t+1} + \eta_{t+1}.$$

9 The calculations in this section are versions of ones described by Anderson and Moore (1979).


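Then $(x_t, \bar z_t)$ is governed by the state-space system

$$\begin{aligned}
x_{t+1} &= A_o x_t + C w_{t+1} \\
\bar z_t &= \bar G x_t + G C w_{t+1} + \eta_{t+1},
\end{aligned} \tag{5.3}$$

where $\bar G = G A_o - DG$. This system has nonzero covariance between the state noise $C w_{t+1}$
and the “measurement noise” $(G C w_{t+1} + \eta_{t+1})$. Let $[K_t, \Sigma_t]$ be the Kalman gain and state
covariance matrix associated with the Kalman filter, namely,

$$K_t = \left(C C' G' + A_o \Sigma_t \bar G'\right)\Omega_t^{-1} \tag{5.4}$$

$$\Omega_t = \bar G \Sigma_t \bar G' + R + G C C' G' \tag{5.5}$$

$$\Sigma_{t+1} = A_o \Sigma_t A_o' + C C' - \left(C C' G' + A_o \Sigma_t \bar G'\right)\Omega_t^{-1}\left(\bar G \Sigma_t A_o' + G C C'\right). \tag{5.6}$$

Then an innovations representation for system (5.3) is

$$\begin{aligned}
\hat x_{t+1} &= A_o \hat x_t + K_t u_t \\
\bar z_t &= \bar G \hat x_t + u_t,
\end{aligned} \tag{5.7}$$

where

$$\begin{aligned}
\hat x_t &= E\left[x_t \mid \bar z_{t-1}, \bar z_{t-2}, \ldots, \bar z_0, \hat x_0\right] \\
u_t &= \bar z_t - E\left[\bar z_t \mid \bar z_{t-1}, \ldots, \bar z_0, \hat x_0\right] \\
\Omega_t &\equiv E u_t u_t' = \bar G \Sigma_t \bar G' + R + G C C' G'.
\end{aligned}$$

Initial conditions for the system are $\hat x_0$ and $\Sigma_0$. From definition (5.2), it follows that
$[z_{t+1}, z_t, \ldots, z_0, \hat x_0]$ and $[\bar z_t, \bar z_{t-1}, \ldots, \bar z_0, \hat x_0]$ span the same space, so that

$$\begin{aligned}
\hat x_t &= E\left[x_t \mid z_t, z_{t-1}, \ldots, z_0, \hat x_0\right] \\
u_t &= z_{t+1} - E\left[z_{t+1} \mid z_t, \ldots, z_0, \hat x_0\right].
\end{aligned}$$

So $u_t$ is said to be an innovation in $z_{t+1}$.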

Equation (5.6) is a matrix Riccati difference equation. The Kalman filter has a steady-

state solution if there exists a time-invariant matrix Σ which satisfies Eq. (5.6), i.e., one

that satisfies the algebraic matrix Riccati equation. In this case, the same computational

procedures used for the optimal linear regulator problem apply. This is a benefit of the

duality of filtering and control referred to earlier. The steady-state Kalman gain, $K$, is
given by Eq. (5.4) with $\Sigma_t = \Sigma$ and $\Omega_t = \bar G \Sigma \bar G' + R + G C C' G'$.
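One simple (though not the fastest) way to find the steady state is to iterate the Riccati difference equation (5.6) until $\Sigma_t$ stops changing. The sketch below does this with NumPy. It is an illustration under the assumption that the iteration converges for the system at hand; the function names, tolerance, and starting value $\Sigma_0 = 0$ are hypothetical choices, not the notes' procedure.

```python
import numpy as np

def kalman_step(Sigma, Ao, C, G, D, R):
    """One pass through (5.4)-(5.6); returns K_t, Omega_t, Sigma_{t+1}."""
    Gbar = G @ Ao - D @ G                               # G-bar in (5.3)
    CC = C @ C.T
    S = CC @ G.T + Ao @ Sigma @ Gbar.T                  # cross-covariance term in (5.4)
    Omega = Gbar @ Sigma @ Gbar.T + R + G @ CC @ G.T    # (5.5)
    K = S @ np.linalg.inv(Omega)                        # (5.4)
    Sigma_next = Ao @ Sigma @ Ao.T + CC - K @ S.T       # (5.6)
    return K, Omega, Sigma_next

def steady_state_filter(Ao, C, G, D, R, tol=1e-12, max_iter=10000):
    """Iterate (5.6) from Sigma_0 = 0 until convergence (assumed to occur)."""
    Sigma = np.zeros_like(Ao, dtype=float)
    for _ in range(max_iter):
        K, Omega, Sigma_next = kalman_step(Sigma, Ao, C, G, D, R)
        if np.max(np.abs(Sigma_next - Sigma)) < tol:
            return K, Omega, Sigma_next
        Sigma = Sigma_next
    raise RuntimeError("Riccati iteration did not converge")
```

For larger systems, the doubling or invariant-subspace methods used for the optimal linear regulator are usually preferable to plain iteration.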

The innovations representation is equivalent to a Wold representation or vector
autoregression. Estimates of these representations are recovered in empirical work using
the vector autoregressive techniques promoted by Sims (1980) and Doan, Litterman, and
Sims (1984). It is convenient to have a quick way of deducing the vector autoregression
implied by a particular theoretical structure. To get a Wold representation for $z_t$, substitute
Eq. (5.2) into Eq. (5.7) to obtain

$$\begin{aligned}
\hat x_{t+1} &= A_o \hat x_t + K u_t \\
z_{t+1} - D z_t &= \bar G \hat x_t + u_t.
\end{aligned} \tag{5.8}$$

A Wold representation for $z_t$ is

$$z_{t+1} = [I - DL]^{-1}\left[I + \bar G (I - A_o L)^{-1} K L\right] u_t, \tag{5.9}$$

where again $L$ is the lag operator. From Eq. (5.8) a recursive whitening filter for obtaining
$u_t$ from $z_t$ is given by

$$\begin{aligned}
u_t &= z_{t+1} - D z_t - \bar G \hat x_t \\
\hat x_{t+1} &= A_o \hat x_t + K u_t.
\end{aligned} \tag{5.10}$$
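The whitening filter (5.10) is a short loop. The sketch below applies it to a series $z_0, \ldots, z_T$ and, as a check, also runs (5.8) forward from a given innovation sequence so that the two can be compared. The function names and array layout (one observation per row) are hypothetical choices.

```python
import numpy as np

def whiten(z, Ao, K, Gbar, D, xhat0):
    """Apply (5.10): rows of z are z_0..z_T; returns u_0..u_{T-1}."""
    xhat = np.asarray(xhat0, dtype=float)
    u = []
    for t in range(len(z) - 1):
        u_t = z[t + 1] - D @ z[t] - Gbar @ xhat   # innovation in z_{t+1}
        xhat = Ao @ xhat + K @ u_t                # update the predicted state
        u.append(u_t)
    return np.array(u)

def simulate_from_innovations(u, Ao, K, Gbar, D, xhat0, z0):
    """Run (5.8) forward: build z_0..z_T from u_0..u_{T-1}."""
    xhat = np.asarray(xhat0, dtype=float)
    z = [np.asarray(z0, dtype=float)]
    for u_t in u:
        z.append(D @ z[-1] + Gbar @ xhat + u_t)
        xhat = Ao @ xhat + K @ u_t
    return np.array(z)
```

Feeding the output of `simulate_from_innovations` into `whiten` with the same matrices and initial state returns the original innovations, which is a convenient test of a filter implementation.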

5.1. Vector autoregressive representation

Hansen and Sargent (1994) show that an autoregressive representation for $z_t$ is

$$z_{t+1} = \left\{ D + (I - DL)\,\bar G\left[I - (A_o - K\bar G)L\right]^{-1} K L \right\} z_t + u_t, \tag{5.1.1}$$

or

$$z_{t+1} = [D + \bar G K]\, z_t + \sum_{j=1}^{\infty}\left[\bar G (A_o - K\bar G)^{j} K - D\bar G (A_o - K\bar G)^{j-1} K\right] z_{t-j} + u_t. \tag{5.1.2}$$

This equation expresses $z_{t+1}$ as the sum of the one-step-ahead linear least squares forecast
and the one-step prediction error.

5.2. The Likelihood Function

We start with a “raw” time series yt that determines an adjusted series zt according to

zt = f (yt,Θ) ,

where Θ is the vector containing the free parameters of the model, including parameters

determining particular detrending procedures. For example, if our raw series has a geometric
growth trend equal to $\mu^t$ which is to be removed before estimation, then the adjusted
series is $z_t = y_t/\mu^t$. We assume that the state-space model of the form (5.3) and the
associated innovations representation (5.7) pertain to the adjusted data $z_t$. We can

use the innovations representation (5.7) recursively to compute the innovation series, then

calculate the log-likelihood function

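$$L(\Theta) = \sum_{t=0}^{T-1}\left\{ \log|\Omega_t| + \mathrm{trace}\left(\Omega_t^{-1} u_t u_t'\right) - \log\left|\frac{\partial f(y_t, \Theta)}{\partial y_t}\right| \right\} \tag{5.2.1}$$

and find estimates, $\hat\Theta = \arg\min_\Theta L(\Theta)$, where $\Omega_t = E u_t u_t'$ is the covariance matrix of the
innovations. To find the minimizer $\hat\Theta$, we can use a standard optimization program. In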

practice, it is best if we can calculate both the log-likelihood function and its derivatives

analytically. First, the computational burden is much lower with analytical derivatives.

Consider, for example, the model of McGrattan, Rogerson, and Wright (1993), which

has 84 elements in $\Theta$. For each step of a quasi-Newton optimization routine, $L$ and $\partial L/\partial\theta$
are computed. To obtain $\partial L/\partial\theta$ numerically for the McGrattan, Rogerson, and Wright (1993)
example, the log-likelihood function must be evaluated 168 times if central differences are
used in computing an approximation for $\partial L/\partial\theta$, e.g.,

$$\frac{\partial L}{\partial\theta} \approx \frac{L(\Theta + \epsilon e) - L(\Theta - \epsilon e)}{2\epsilon}, \tag{5.2.2}$$

where $e$ is a vector of zeros except for a 1 in the element corresponding to $\theta$ and $\epsilon$ is some
positive number. Usually, the costs of computing $L$ a large number of times far outweigh
the costs of computing $\partial L/\partial\theta$ once. If $L$ and $\partial L/\partial\theta$ are to be computed many times, which is
typically the case, then the costs of computing numerical derivatives can be quite large.

A second advantage to analytical derivatives is numerical accuracy. If the log-likelihood

function is not very smooth for the entire parameter space, there may be problems with the

accuracy of approximations such as Eq. (5.2.2). With inaccurate derivatives, it is difficult

to determine the curvature of the function and, hence, to find a minimum.

For L(Θ) in Eq. (5.2.1), the derivatives ∂L(Θ)/∂θ are easy to derive. We derive them

in Anderson et al. (1996) and distinguish formulas that are steps in the derivation from

those that would be put into a computer code.

Once we have the log-likelihood function and its derivatives, we can apply standard

optimization methods to the problem of finding the maximum likelihood estimates. In

practice, we will have a constrained optimization problem since the equilibrium is not

typically computable for all possible parameterizations. For example, we may have simple

constraints such as ℓ < Θ < u, where ℓ and u are the lower and upper bounds for the


parameter vector. In this case, we use either a constrained optimization package or penalty

functions (see Fletcher 1987).

After computing the maximum likelihood estimates, we need to compute their stan-

dard errors,

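$$\mathrm{Se}(\hat\Theta) = \mathrm{diag}\left(\sqrt{\left(\sum_t \frac{\partial L_t}{\partial\Theta}\frac{\partial L_t}{\partial\Theta}'\right)^{-1}}\,\right), \tag{5.2.3}$$

where $L_t(\Theta)$ is the logarithm of the density function of the date $t$ innovation, i.e.,

$$L_t(\Theta) = \log|\Omega_t| + u_t'\Omega_t^{-1}u_t - \log\left|\frac{\partial f(y_t, \Theta)}{\partial y_t}\right|. \tag{5.2.4}$$

See Anderson et al. (1996) for the formula for $\partial L_t/\partial\theta$.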
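Given the per-period gradients, formula (5.2.3) is a few lines of linear algebra. The sketch below assumes the gradients have already been stacked into a $T \times p$ array with row $t$ equal to $\partial L_t/\partial\Theta'$; the function name is hypothetical, and the element-wise square root of the diagonal is one reading of (5.2.3).

```python
import numpy as np

def opg_standard_errors(scores):
    """Standard errors from the outer product of per-period gradients, as in (5.2.3).

    scores: array of shape (T, p); row t holds dL_t/dTheta evaluated at the estimate.
    """
    scores = np.asarray(scores, dtype=float)
    outer_sum = scores.T @ scores                 # sum_t g_t g_t'
    return np.sqrt(np.diag(np.linalg.inv(outer_sum)))
```

For example, `opg_standard_errors([[1, 0], [0, 2]])` gives standard errors of 1 and 0.5, since the summed outer product is diagonal with entries 1 and 4.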


Chapter 6.

A Prototype Real Business Cycle Model

We consider two versions of the model. The first has technology parameters that are

autoregressive of order one and the second has technology parameters that are unit root

processes.

6.1. A Version of the Model with AR(1) Technology

6.1.1. Maximization problems

Consider an economy with households, firms, and the government. The representative

household chooses consumption, investment, and labor to solve the following maximization

problem:

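$$\max_{\{c_t, x_t, l_t\}} E\sum_{t=0}^{\infty}\beta^t U(c_t, 1 - l_t)\, N_t$$

subject to

$$\begin{aligned}
(1+\tau_{ct})\, c_t + (1+\tau_{xt})\, x_t &= (1-\tau_{kt})\, r_t k_t + (1-\tau_{lt})\, w_t l_t + \tau_{kt}\delta k_t + \mathit{tr}_t \\
N_{t+1} k_{t+1} &= [(1-\delta) k_t + x_t]\, N_t \\
c_t, x_t &\ge 0 \ \text{ in all states}
\end{aligned}$$

taking processes for the rental rate $r_t$, wage rate $w_t$, the tax rates $\tau_{ct}$, $\tau_{xt}$, $\tau_{kt}$, $\tau_{lt}$, and
transfers $\mathit{tr}_t$ as given. The representative firm solves a simple static problem at $t$:

$$\max_{K_t, L_t} F(K_t, Z_t L_t) - r_t K_t - w_t L_t.$$

The government sets rates of taxes and transfers in such a way that its budget constraint
at $t$, namely,

$$G_t + N_t \mathit{tr}_t = \tau_{kt}(r_t - \delta) N_t k_t + \tau_{lt} w_t l_t N_t + \tau_{ct} N_t c_t + \tau_{xt} N_t x_t$$

is satisfied. In equilibrium, the following conditions must hold:

$$\begin{aligned}
N_t(c_t + x_t) + G_t &= F(K_t, Z_t L_t) \\
N_t k_t &= K_t \\
N_t l_t &= L_t.
\end{aligned} \tag{6.1.1}$$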


6.1.2. First-order conditions

Next, consider the first-order conditions in this economy. The Lagrangian for the household

optimization problem is given by

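$$\begin{aligned}
\mathcal{L} = E\sum_t \beta^t N_t \Big\{ & U(c_t, 1 - l_t) \\
& + \mu_t\big[(1-\tau_{kt})\, r_t k_t + (1-\tau_{lt})\, w_t l_t + \tau_{kt}\delta k_t + \mathit{tr}_t - (1+\tau_{ct})\, c_t - (1+\tau_{xt})\, x_t\big] \\
& + \lambda_t\big[(1-\delta) k_t + x_t - (1+g_n) k_{t+1}\big] \Big\}.
\end{aligned}$$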

Here, the nonnegativity constraints on consumption and investment have been ignored.

These constraints will not bind for postwar size business cycles. They do bind for large

shocks such as occurred during the Great Depression or World War II. When analyz-

ing those periods, we need to include a penalty function to enforce the nonnegativity

constraints. (See Chari, Kehoe, and McGrattan’s Staff Report 328 or McGrattan and

Ohanian’s Staff Report 315.)

The relevant first-order conditions are found by taking derivatives of L with respect

to ct, lt, xt, and kt+1:

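$$\begin{aligned}
0 &= U_1(c_t, 1-l_t) - \mu_t(1+\tau_{ct}) \\
0 &= -U_2(c_t, 1-l_t) + \mu_t(1-\tau_{lt})\, w_t \\
0 &= -\mu_t(1+\tau_{xt}) + \lambda_t \\
0 &= -(1+g_n)\lambda_t + E_t\,\mu_{t+1}\left[(1-\tau_{k,t+1})\, r_{t+1} + \delta\tau_{k,t+1}\right] + \lambda_{t+1}(1-\delta).
\end{aligned}$$

Eliminating multipliers yields:

$$\frac{U_2(c_t, 1-l_t)}{U_1(c_t, 1-l_t)} = \frac{1-\tau_{lt}}{1+\tau_{ct}}\, w_t \tag{6.1.2}$$

$$\frac{1+\tau_{xt}}{1+\tau_{ct}}\, U_1(c_t, 1-l_t) = \beta E_t\left[\frac{U_1(c_{t+1}, 1-l_{t+1})}{1+\tau_{c,t+1}}\Big\{(1-\tau_{k,t+1})\, r_{t+1} + \delta\tau_{k,t+1} + (1-\delta)(1+\tau_{x,t+1})\Big\}\right]. \tag{6.1.3}$$

In addition, there are first-order conditions for the firm’s static problem. These are

$$r_t = F_1(K_t, Z_t L_t) \tag{6.1.4}$$

$$w_t = F_2(K_t, Z_t L_t)\, Z_t. \tag{6.1.5}$$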


Finally, we have a resource constraint given by (6.1.1).

From here on, we make the following functional form assumptions and auxiliary

choices:

$$F(k, l) = k^{\theta} l^{1-\theta} \tag{6.1.6}$$

$$U(c, 1-l) = \left(c\,(1-l)^{\psi}\right)^{1-\sigma}/(1-\sigma) \tag{6.1.7}$$

$$\tau_{kt} = \tau_{ct} = 0$$

$$s_t = [\log z_t, \tau_{lt}, \tau_{xt}, \log g_t]'$$

$$s_{t+1} = P_0 + P s_t + Q\epsilon_{s,t+1}, \qquad \epsilon_s \sim N(0_{4\times1}, I_{4\times4}). \tag{6.1.8}$$

We have turned off $\tau_c$ since it plays a similar role to $\tau_l$ in distorting the labor-leisure
choice. Similarly, we have turned off $\tau_k$ since it plays a similar role to $\tau_x$ in distorting the
intertemporal margin.

If we substitute the choices (6.1.6)-(6.1.7) into (6.1.1) and (6.1.2)-(6.1.5), then substi-

tute the equilibrium rates rt and wt into (6.1.2) and (6.1.3), we have:

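$$N_t(c_t + g_t) + N_{t+1}k_{t+1} - (1-\delta) N_t k_t = (N_t k_t)^{\theta}(Z_t N_t l_t)^{1-\theta} \tag{6.1.9}$$

$$\frac{\psi c_t}{1 - l_t} = (1-\tau_{lt})(1-\theta)(N_t k_t)^{\theta} Z_t^{1-\theta}(N_t l_t)^{-\theta} \tag{6.1.10}$$

$$\begin{aligned}
(1+\tau_{xt})\, c_t^{-\sigma}(1-l_t)^{\psi(1-\sigma)} = \beta E_t\Big[ & c_{t+1}^{-\sigma}(1-l_{t+1})^{\psi(1-\sigma)}\Big\{(1-\tau_{k,t+1})\,\theta\,(N_{t+1}k_{t+1})^{\theta-1}(Z_{t+1}N_{t+1}l_{t+1})^{1-\theta} \\
& + \delta\tau_{k,t+1} + (1-\delta)(1+\tau_{x,t+1})\Big\}\Big].
\end{aligned} \tag{6.1.11}$$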

6.1.3. Log-linear computation

The next big step is to approximate the decision function for capital. Given an approximate
function for $k_{t+1}$, we can use the static equations (2.1.4) and (2.1.7) to determine the
decisions $c_t$ and $l_t$.

Log-linearizations are done for a stationary version of the equations (6.1.9)-(6.1.11).
Thus, before proceeding, we need to normalize variables. Dividing all variables that grow
by $(1+g_z)^t$ gives:

$$\hat c_t + \hat g_t + (1+g_z)(1+g_n)\hat k_{t+1} - (1-\delta)\hat k_t = \hat y_t = \hat k_t^{\theta}(z_t l_t)^{1-\theta} \tag{6.1.12}$$

$$\frac{\psi \hat c_t}{1 - l_t} = (1-\tau_{lt})(1-\theta)\,\hat k_t^{\theta}\, l_t^{-\theta} z_t^{1-\theta} \tag{6.1.13}$$

$$(1+\tau_{xt})\,\hat c_t^{-\sigma}(1-l_t)^{\psi(1-\sigma)} = \hat\beta E_t\,\hat c_{t+1}^{-\sigma}(1-l_{t+1})^{\psi(1-\sigma)}\left[\theta\hat k_{t+1}^{\theta-1}(z_{t+1}l_{t+1})^{1-\theta} + (1-\delta)(1+\tau_{x,t+1})\right] \tag{6.1.14}$$

where $\hat\beta = \beta(1+g_z)^{-\sigma}$.

To do the log-linear approximation, we will also need the steady state values of the

variables in (6.1.12)-(6.1.14) (assuming constant values for z, the taxes, and government

spending):

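$$k/l = \left[\frac{(1+\tau_x)\left(1 - \hat\beta(1-\delta)\right)}{\hat\beta\,\theta\, z^{1-\theta}}\right]^{1/(\theta-1)}$$

$$c = \left[(k/l)^{\theta-1} z^{1-\theta} - (1+g_z)(1+g_n) + 1 - \delta\right] k - g = \xi_1 k - g$$

$$c = \left[(1-\tau_l)(1-\theta)(k/l)^{\theta} z^{1-\theta}/\psi\right]\left(1 - \frac{1}{(k/l)}\, k\right) = \xi_2 - \xi_3 k$$

where the last 2 equations imply $k = (\xi_2 + g)/(\xi_1 + \xi_3)$, $c = \xi_1 k - g$, $l = (1/(k/l))\,k$.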
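The steady-state formulas translate directly into code. The sketch below is illustrative only: the function name is hypothetical, and the default values $\tau_x = \tau_l = 0$, $z = 1$, $g = 0$ are placeholders rather than values taken from the notes.

```python
def steady_state(theta, delta, beta, sigma, psi, gn, gz,
                 tau_x=0.0, tau_l=0.0, z=1.0, g=0.0):
    """Steady state (k, c, l) of (6.1.12)-(6.1.14); defaults are placeholder values."""
    beta_hat = beta * (1.0 + gz) ** (-sigma)
    # Capital-labor ratio from the steady-state Euler equation.
    kl = ((1.0 + tau_x) * (1.0 - beta_hat * (1.0 - delta))
          / (beta_hat * theta * z ** (1.0 - theta))) ** (1.0 / (theta - 1.0))
    xi1 = kl ** (theta - 1.0) * z ** (1.0 - theta) - (1.0 + gz) * (1.0 + gn) + 1.0 - delta
    xi2 = (1.0 - tau_l) * (1.0 - theta) * kl ** theta * z ** (1.0 - theta) / psi
    xi3 = xi2 / kl
    k = (xi2 + g) / (xi1 + xi3)
    c = xi1 * k - g
    l = k / kl
    return k, c, l
```

A useful check is to substitute the returned values back into (6.1.12)-(6.1.14) with all variables held constant and confirm that the three residuals are zero up to rounding.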

Assume that the solution for the capital decision takes the form:

$$\log\hat k_{t+1} = \gamma_k \log\hat k_t + \gamma\,[\log z_t,\ \tau_{lt},\ \tau_{xt},\ \log\hat g_t]' + \text{constant}, \tag{6.1.15}$$

where $\gamma_k$ is a scalar and $\gamma$ is $1 \times 4$ and equal to $[\gamma_z, \gamma_l, \gamma_x, \gamma_g]$. Assume the residual from
the dynamic first-order condition (6.1.14) can be written (after substitutions from (6.1.12)
and (6.1.13)):

$$\begin{aligned}
& f\left(E_t\log\hat k_{t+2}, \log\hat k_{t+1}, \log\hat k_t, \log z_{t+1}, \log z_t, \tau_{l,t+1}, \tau_{lt}, \tau_{x,t+1}, \tau_{xt}, \log\hat g_{t+1}, \log\hat g_t\right) \\
&\quad \approx a_0 E_t\log\hat k_{t+2} + a_1\log\hat k_{t+1} + a_2\log\hat k_t + b_0 E_t s_{t+1} + b_1 s_t.
\end{aligned}$$

Then the general solution algorithm is to find $\gamma_k$ that solves the quadratic equation

$$a_0\gamma_k^2 + a_1\gamma_k + a_2 = 0,$$

and $\gamma$ that solves the linear equations:

$$a_0\gamma_k\gamma + a_0\gamma P + a_1\gamma + b_0 P + b_1 = 0_{1\times4}.$$

Note that this implies:

$$\gamma' = -\left[(a_0\gamma_k + a_1)\, I_{4\times4} + a_0 P'\right]^{-1}(b_0 P + b_1 I_{4\times4})'.$$

Once we have values for the coefficients $\gamma_k$ and $\gamma$, we can use (6.1.12) and (6.1.13) to
back out $c_t$ and $l_t$ (either nonlinearly or by way of a log-linear approximation).
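The two steps above (a quadratic in $\gamma_k$, then a linear system in $\gamma$) can be sketched as follows. The notes do not say which root of the quadratic to take; the sketch assumes the unique real root inside the unit circle is wanted, which is the usual choice for a stable decision rule. The function name is hypothetical.

```python
import numpy as np

def solve_decision_rule(a0, a1, a2, b0, b1, P):
    """Solve for gamma_k (stable root, an assumption) and the 1 x n vector gamma."""
    roots = np.roots([a0, a1, a2])
    real_roots = roots[np.isreal(roots)].real
    stable = real_roots[np.abs(real_roots) < 1.0]
    if stable.size != 1:
        raise ValueError("expected exactly one real root inside the unit circle")
    gamma_k = float(stable[0])

    P = np.asarray(P, dtype=float)
    b0 = np.asarray(b0, dtype=float).reshape(1, -1)
    b1 = np.asarray(b1, dtype=float).reshape(1, -1)
    n = P.shape[0]
    # gamma' = -[(a0*gamma_k + a1) I + a0 P']^{-1} (b0 P + b1)'
    M = (a0 * gamma_k + a1) * np.eye(n) + a0 * P.T
    gamma = -np.linalg.solve(M, (b0 @ P + b1).T).T
    return gamma_k, gamma
```

The returned `gamma` can be checked by substituting it back into the linear equations and confirming that the left-hand side is a row of zeros.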


6.2. A Version of the Model with Random Walk Technology

The only change relative to the model with AR(1) technology is that $Z_t = Z_{t-1}z_t$, where $z_t$
is the innovation to technology. In this case, detrending is done slightly differently. We
use $\hat v_t = V_t/[N_t Z_t]$ to denote the detrended, per-capita variable $V_t$, except in the case of
capital. There, we use $\hat k_t = K_t/[N_t Z_{t-1}]$.


The maximization problems are the same as above except that households in this version

assume Zt = Zt−1zt with the process for log zt assumed to be autoregressive.


The first-order conditions are the same as above.


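The main difference between the benchmark model and the version with random-walk
technology is the step taken to normalize variables. In this version, the normalized variables
are:

$$\hat c_t = c_t/Z_t, \quad \hat x_t = x_t/Z_t, \quad \hat g_t = g_t/Z_t, \quad \hat y_t = y_t/Z_t, \quad \hat k_t = k_t/Z_{t-1}.$$

Using the functional forms for $F$ and $U$ in (6.1.6) and (6.1.7), respectively, the equilibrium
rental and wage rates are:

$$\begin{aligned}
r_t &= \theta K_t^{\theta-1}(Z_t L_t)^{1-\theta} = \theta\hat k_t^{\theta-1}(z_t l_t)^{1-\theta} \\
w_t &= (1-\theta) K_t^{\theta}(Z_t L_t)^{-\theta} Z_t = (1-\theta)\hat k_t^{\theta}(z_t l_t)^{-\theta} Z_t.
\end{aligned}$$

This implies the following first-order conditions:

$$\hat c_t + \hat g_t + (1+g_n)\hat k_{t+1} - (1-\delta) z_t^{-1}\hat k_t = \hat y_t = \hat k_t^{\theta}\, l_t^{1-\theta} z_t^{-\theta} \tag{6.2.1}$$

$$\frac{\psi\hat c_t}{1 - l_t} = (1-\tau_{lt})(1-\theta)\,\hat k_t^{\theta}(z_t l_t)^{-\theta} \tag{6.2.2}$$

$$(1+\tau_{xt})\,\hat c_t^{-\sigma}(1-l_t)^{\psi(1-\sigma)} = \beta z_{t+1}^{-\sigma} E_t\,\hat c_{t+1}^{-\sigma}(1-l_{t+1})^{\psi(1-\sigma)}\left[\theta\hat k_{t+1}^{\theta-1}(z_{t+1}l_{t+1})^{1-\theta} + (1-\delta)(1+\tau_{x,t+1})\right]. \tag{6.2.3}$$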

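Next, we compute the steady state of the system for constant values for $z$, the taxes,
and government spending:

$$k/l = \left(\frac{(1+\tau_x)\left(1 - \beta z^{-\sigma}(1-\delta)\right)}{\beta z^{-\sigma}\,\theta\, z^{1-\theta}}\right)^{1/(\theta-1)}$$

$$c = \left[(k/l)^{\theta-1} z^{-\theta} - (1+g_n) + (1-\delta) z^{-1}\right] k - g = \xi_1 k - g$$

$$c = \left[(1-\tau_l)(1-\theta)(k/l)^{\theta} z^{-\theta}/\psi\right]\left(1 - \frac{1}{(k/l)}\, k\right) = \xi_2 - \xi_3 k$$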


The form of the solution and the procedure for computing it are the same as in the
benchmark case.

6.3. MLE Estimation

The next step is to describe a standard method we can use to estimate the processes

governing the four exogenous variables in st with the data described above.

6.3.1. State-space form in the general case

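Assume that $X$ is a vector of state variables from the model and $Y$ are observables. The
state-space form then is

$$\begin{aligned}
X_{t+1} &= A X_t + B\epsilon_{t+1} \\
Y_t &= C X_t + \omega_t \\
\omega_t &= D\omega_{t-1} + \eta_t
\end{aligned}$$

where $D$ is equal to parameters governing serial correlation of measurement error. Assume
that $E\eta_t\eta_t' = R$, $E\epsilon_t\eta_s' = 0$ for all periods $t$ and $s$. Define $\bar Y_t \equiv Y_{t+1} - D Y_t$. Then the system
can be rewritten as follows:

$$\begin{aligned}
X_{t+1} &= A X_t + B\epsilon_{t+1} \\
\bar Y_t &= \bar C X_t + C B\epsilon_{t+1} + \eta_{t+1},
\end{aligned}$$

where, as in (5.3), $\bar C = CA - DC$.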

6.3.2. Log-likelihood function

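The log-likelihood function is

$$L(\Theta) = \sum_{t=0}^{T-1}\left\{\log|\Omega_t| + \mathrm{trace}\left(\Omega_t^{-1}u_t u_t'\right) - \log\left|\partial f(Z_t, \Theta)/\partial Z_t\right|\right\} \tag{6.3.1}$$

where the parameters to be estimated are stacked in vector $\Theta$, the innovation vector is $u_t$,
and its covariance is $\Omega_t$. The last term in (6.3.1) is nonzero if the $Y$ are not the raw series
but depend on the raw series $Z$ plus the parameter vector. For example, if we estimate $g_z$
and use per-capita values as our raw data, then $Z$ is per-capita data and $Y$ is detrended,
per-capita data.

The innovation vector $u_t$ and its covariance $\Omega_t$ are defined as follows:

$$\begin{aligned}
u_t &= \bar Y_t - E\left[\bar Y_t \mid \bar Y_{t-1}, \bar Y_{t-2}, \ldots, \bar Y_0, \hat X_0\right] \\
&= Y_{t+1} - E\left[Y_{t+1} \mid Y_t, Y_{t-1}, \ldots, Y_0, \hat X_0\right] \\
&= Y_{t+1} - D Y_t - \bar C\hat X_t \\
\Omega_t &= E u_t u_t' = \bar C\Sigma_t\bar C' + R + C B B' C',
\end{aligned}$$

which in turn depends on the predicted state $\hat X_t$:

$$\hat X_t = E\left[X_t \mid Y_t, Y_{t-1}, \ldots, Y_0, \hat X_0\right].$$

The predicted state evolves according to

$$\hat X_{t+1} = A\hat X_t + K_t u_t$$

where $K_t$ is the Kalman gain,

$$\begin{aligned}
K_t &= \left(B B' C' + A\Sigma_t\bar C'\right)\Omega_t^{-1} \\
\Sigma_{t+1} &= A\Sigma_t A' + B B' - \left(B B' C' + A\Sigma_t\bar C'\right)\Omega_t^{-1}\left(\bar C\Sigma_t A' + C B B'\right)
\end{aligned}$$

with state covariance $\Sigma_t$.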
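Putting the recursions together, the objective (6.3.1) can be evaluated in a single pass through the data. The sketch below omits the Jacobian term (appropriate when the observables are the raw series) and takes $\bar C = CA - DC$ by analogy with $\bar G$ in Chapter 5. The function name and argument layout are hypothetical.

```python
import numpy as np

def likelihood(Y, A, B, C, D, R, x0, Sigma0):
    """Evaluate (6.3.1) without the Jacobian term; rows of Y are Y_0..Y_T."""
    Cbar = C @ A - D @ C                  # assumed by analogy with G-bar in (5.3)
    BB = B @ B.T
    x = np.asarray(x0, dtype=float)
    Sigma = np.asarray(Sigma0, dtype=float)
    total = 0.0
    for t in range(len(Y) - 1):
        u = Y[t + 1] - D @ Y[t] - Cbar @ x                    # innovation
        S = BB @ C.T + A @ Sigma @ Cbar.T
        Omega = Cbar @ Sigma @ Cbar.T + R + C @ BB @ C.T      # innovation covariance
        _, logdet = np.linalg.slogdet(Omega)
        total += logdet + u @ np.linalg.solve(Omega, u)       # trace(Omega^-1 u u')
        K = S @ np.linalg.inv(Omega)                          # Kalman gain
        x = A @ x + K @ u                                     # predicted state
        Sigma = A @ Sigma @ A.T + BB - K @ S.T                # Riccati update
    return total
```

In a degenerate case with $A$, $B$, $C$, and $D$ all zero and $R = I$, the innovations are just the observations and the function returns their sum of squares, which gives a quick sanity check.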

6.3.3. MLE in the Benchmark Case

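In the benchmark case, we have $X_t = [\log\hat k_t, \log z_t, \tau_{lt}, \tau_{xt}, \log\hat g_t, 1]'$, $Y_t = [\log\hat y_t, \log\hat x_t, \log l_t, \log\hat g_t]$, and

$$A = \begin{bmatrix} \gamma_k & [\gamma_z\ \ \gamma_l\ \ \gamma_x\ \ \gamma_g] & \gamma_0 \\ 0_{4\times1} & P & P_0 \\ 0 & 0_{1\times4} & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} 0_{1\times4} \\ Q \\ 0_{1\times4} \end{bmatrix}$$

$$C = \begin{bmatrix} \phi_{yk} & \phi_{yz} & \phi_{yl} & 0 & \phi_{yg} & \phi_{y0} \\ \phi_{xk} & 0 & 0 & 0 & 0 & \phi_{x0} \\ \phi_{lk} & \phi_{lz} & \phi_{ll} & 0 & \phi_{lg} & \phi_{l0} \\ 0 & 0 & 0 & 0 & 1 & 0 \end{bmatrix} + \begin{bmatrix} \phi_{yk'} \\ \phi_{xk'} \\ \phi_{lk'} \\ 0 \end{bmatrix}\begin{bmatrix} \gamma_k & \gamma_z & \gamma_l & \gamma_x & \gamma_g & 0 \end{bmatrix}. \tag{6.3.2}$$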


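The coefficients $\phi$ are derived by log-linearizing (6.1.13) after substituting in for consumption
from (6.1.12):

$$\begin{aligned}
0 \approx{}& \psi\Big\{k^{\theta}(zl)^{1-\theta}\left[\theta\log\hat k_t + (1-\theta)(\log z_t + \log l_t)\right] \\
&\quad - (1+g_z)(1+g_n)\, k\log\hat k_{t+1} + (1-\delta)\, k\log\hat k_t - g\log\hat g_t\Big\} \\
&+ (1-\theta)(1-\tau_l)\, k^{\theta} l^{-\theta} z^{1-\theta}(1-l)\Big\{\tfrac{1}{1-\tau_l}\,\tau_{lt} \\
&\quad - \theta\log\hat k_t + \theta\log l_t - (1-\theta)\log z_t + \tfrac{l}{1-l}\log l_t\Big\}
\end{aligned}$$

which we write succinctly as

$$\log l_t = \phi_{lk}\log\hat k_t + \phi_{lz}\log z_t + \phi_{ll}\tau_{lt} + \phi_{lg}\log\hat g_t + \phi_{lk'}\log\hat k_{t+1}. \tag{6.3.3}$$

Using this equation for $\log l$, we use the production relation and the capital accumulation
equation to write $\log\hat y$ and $\log\hat x$ as follows:

$$\begin{aligned}
\log\hat y_t &= (\theta + (1-\theta)\phi_{lk})\log\hat k_t + (1-\theta)(1+\phi_{lz})\log z_t \\
&\quad + (1-\theta)\left[\phi_{ll}\tau_{lt} + \phi_{lg}\log\hat g_t + \phi_{lk'}\log\hat k_{t+1}\right] \\
&\equiv \phi_{yk}\log\hat k_t + \phi_{yz}\log z_t + \phi_{yl}\tau_{lt} + \phi_{yg}\log\hat g_t + \phi_{yk'}\log\hat k_{t+1}
\end{aligned} \tag{6.3.4}$$

$$\begin{aligned}
\log\hat x_t &= (1+g_z)(1+g_n)\,(k/x)\log\hat k_{t+1} - (1-\delta)\,(k/x)\log\hat k_t \\
&\equiv \phi_{xk}\log\hat k_t + \phi_{xk'}\log\hat k_{t+1}.
\end{aligned} \tag{6.3.5}$$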

We fixed parameters of preferences, production, and growth and estimated the pro-

cesses for the shocks. The parameters that were fixed were: ψ = 2.24, σ = 1, β = .9722,

θ = .35, δ = .0464, gn = 1.5%, and gz = 1.6%. We also set D = 04×4 and R = .0001×I4×4.

The parameters that were estimated were elements of P0, P , and Q.

6.3.4. MLE in the Random Walk Case

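In the case of random-walk technology, the settings are slightly different. In this case,
we have $X_{st} = [\log\hat k_t, \log z_t, \tau_{lt}, \tau_{xt}, \log\hat g_t, 1]'$, $X_t = [X_{st}, X_{s,t-1}]'$, and $Y_t = [\log y_t - \log y_{t-1}, \log x_t - \log x_{t-1}, \log l_t, \log g_t - \log g_{t-1}]$. We can write the growth rates in $Y_t$
as elements of $X_t$ as follows:

$$\begin{aligned}
\log y_t - \log y_{t-1} &= \log(\hat y_t Z_t) - \log(\hat y_{t-1}Z_{t-1}) \\
&= \log\hat y_t - \log\hat y_{t-1} + \log z_t \\
&= \phi_{yk}\left(\log\hat k_t - \log\hat k_{t-1}\right) + (1+\phi_{yz})\log z_t - \phi_{yz}\log z_{t-1} \\
&\quad + \phi_{yl}(\tau_{lt} - \tau_{l,t-1}) + \phi_{yg}(\log\hat g_t - \log\hat g_{t-1}) + \phi_{yk'}\left(\log\hat k_{t+1} - \log\hat k_t\right).
\end{aligned}$$

Similarly, the growth rates for $x_t$ and $g_t$ can be written in terms of the elements of $X_t$.

To obtain the $\phi$ coefficients, we log-linearize (6.2.2) after substituting in for consumption
from (6.2.1):

$$\begin{aligned}
0 \approx{}& \psi\Big\{k^{\theta}l^{1-\theta}z^{-\theta}\left[\theta\left(\log\hat k_t - \log z_t\right) + (1-\theta)\log l_t\right] \\
&\quad - (1+g_n)\, k\log\hat k_{t+1} + (1-\delta)\, z^{-1}k\left(\log\hat k_t - \log z_t\right) - g\log\hat g_t\Big\} \\
&+ (1-\theta)(1-\tau_l)\, k^{\theta}(zl)^{-\theta}(1-l)\Big\{\tfrac{1}{1-\tau_l}\,\tau_{lt} \\
&\quad - \theta\log\hat k_t + \theta(\log l_t + \log z_t) + \tfrac{l}{1-l}\log l_t\Big\}
\end{aligned}$$

which again can be written succinctly as in (6.3.3). Using the equation for $\log l$, the production
relation and the capital accumulation equation can be used to write $\log\hat y$ and $\log\hat x$ as
follows:

$$\begin{aligned}
\log\hat y_t &= (\theta + (1-\theta)\phi_{lk})\log\hat k_t + ((1-\theta)\phi_{lz} - \theta)\log z_t \\
&\quad + (1-\theta)\left[\phi_{ll}\tau_{lt} + \phi_{lg}\log\hat g_t + \phi_{lk'}\log\hat k_{t+1}\right] \\
&\equiv \phi_{yk}\log\hat k_t + \phi_{yz}\log z_t + \phi_{yl}\tau_{lt} + \phi_{yg}\log\hat g_t + \phi_{yk'}\log\hat k_{t+1}
\end{aligned} \tag{6.3.6}$$

$$\begin{aligned}
\log\hat x_t &= (1+g_n)\,(k/x)\log\hat k_{t+1} - (1-\delta)\, z^{-1}(k/x)\left(\log\hat k_t - \log z_t\right) \\
&\equiv \phi_{xk}\log\hat k_t + \phi_{xz}\log z_t + \phi_{xk'}\log\hat k_{t+1}.
\end{aligned} \tag{6.3.7}$$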

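The matrices in the state space form are

$$A = \begin{bmatrix} A_s & 0 \\ I & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} B_s \\ 0 \end{bmatrix}$$

where

$$A_s = \begin{bmatrix} \gamma_k & [\gamma_z\ \ \gamma_l\ \ \gamma_x\ \ \gamma_g] & \gamma_0 \\ 0_{4\times1} & P & P_0 \\ 0 & 0_{1\times4} & 1 \end{bmatrix}, \qquad B_s = \begin{bmatrix} 0_{1\times4} \\ Q \\ 0_{1\times4} \end{bmatrix}$$

and

$$\begin{aligned}
C ={}& \begin{bmatrix}
\phi_{yk} - \phi_{yk'} & 1+\phi_{yz} & \phi_{yl} & 0 & \phi_{yg} & \phi_{y0} & -\phi_{yk} & -\phi_{yz} & -\phi_{yl} & 0 & -\phi_{yg} & -\phi_{y0} \\
\phi_{xk} - \phi_{xk'} & 1+\phi_{xz} & 0 & 0 & 0 & \phi_{x0} & -\phi_{xk} & -\phi_{xz} & 0 & 0 & 0 & -\phi_{x0} \\
\phi_{lk} & \phi_{lz} & \phi_{ll} & 0 & \phi_{lg} & \phi_{l0} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & -1 & 0
\end{bmatrix} \\
&+ \begin{bmatrix} \phi_{yk'} \\ \phi_{xk'} \\ \phi_{lk'} \\ 0 \end{bmatrix}\begin{bmatrix} \gamma_k & \gamma_z & \gamma_l & \gamma_x & \gamma_g & \gamma_0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}.
\end{aligned} \tag{6.3.8}$$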


6.4. Simulating Data from the Models

We first draw 1000 sequences $\epsilon_{s,t}$. Given MLE estimates for $P_0$, $P$, $Q$, and initial
conditions for $s$, we can use (6.1.8) to derive sequences for technology, tax rates, and
spending. Given an initial condition for the capital stock $k_0$, we can use (6.1.15) to derive
the time path for a sequence $k_t$. With technology, tax rates, spending, and capital, we
have the entire state vector $X_t$ period by period. We then use $Y_t = CX_t$ (since we have
assumed negligible measurement error) for the observable vector, where $C$ is (6.3.2) in the
benchmark case and (6.3.8) in the random-walk case.
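A single simulated sequence for the benchmark case can be generated along the following lines; repeating the call with fresh draws gives the 1000 sequences. This is a sketch only: the function name and argument names are hypothetical, and the coefficients would come from the estimation and solution steps described earlier.

```python
import numpy as np

def simulate(P0, P, Q, gamma_k, gamma, gamma_0, C, s0, logk0, T, rng):
    """Simulate X_t = [log k_t, s_t', 1]' and Y_t = C X_t for t = 0..T-1."""
    s = np.asarray(s0, dtype=float)
    logk = float(logk0)
    X, Y = [], []
    for _ in range(T):
        x = np.concatenate(([logk], s, [1.0]))      # benchmark state vector
        X.append(x)
        Y.append(C @ x)                             # negligible measurement error
        logk = gamma_k * logk + gamma @ s + gamma_0                 # (6.1.15)
        s = P0 + P @ s + Q @ rng.standard_normal(len(s))            # (6.1.8)
    return np.array(X), np.array(Y)
```

A call such as `simulate(P0, P, Q, gk, g, g0, C, s0, logk0, 200, np.random.default_rng(0))` returns a 200 × 6 array of states and a 200 × 4 array of observables; the seed and sample length here are arbitrary.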


Chapter 7.

A Prototype Sticky Price Model

7.1. Model Economy

Since we will use the first order conditions over and over again in these notes, we start

with a statement of the optimization problems solved by all of the agents in the economy

and the associated first order conditions.

The problem solved by the final goods producers each period is

$$\max_{Y(i, s^t)}\ P(s^t) - \int P(i, s^{t-1})\, Y(i, s^t)/Y(s^t)\, di \tag{7.1.1}$$

subject to

$$\int g\left(\frac{Y(i, s^t)}{Y(s^t)}\right) di = 1. \tag{7.1.2}$$

The first order conditions for this problem are

$$P(i, s^{t-1}) = \lambda(s^t)\, g'\left(\frac{Y(i, s^t)}{Y(s^t)}\right)$$

where $\lambda$ is the Lagrange multiplier on the constraint (7.1.2). The zero-profit condition,

$$P(s^t) = \int P(i, s^{t-1})\, Y(i, s^t)/Y(s^t)\, di, \tag{7.1.3}$$

and the first order condition for $P_i$ imply the following for the relative price

$$\frac{P(i)}{P} = \frac{g'\left(\frac{Y(i)}{Y}\right)}{\int g'\left(\frac{Y(j)}{Y}\right)\frac{Y(j)}{Y}\, dj}.$$

Inverting this equation gives the input demand functions

$$Y(i, s^t) = D\left(\frac{P(i, s^{t-1})}{P(s^t)}\int g'\left(\frac{Y(j, s^t)}{Y(s^t)}\right)\frac{Y(j, s^t)}{Y(s^t)}\, dj\right) Y(s^t) \tag{7.1.4}$$

where $D \equiv (g')^{-1}$.


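If we assume that $g(y) = y^{\theta}$, which is the case in most of the paper, then we have:

$$Y(i, s^t) = \left[\frac{P(s^t)}{P(i, s^{t-1})}\right]^{\frac{1}{1-\theta}} Y(s^t) \tag{7.1.5}$$

$$P(s^t) = \left[\int P(i, s^{t-1})^{\frac{\theta}{\theta-1}}\, di\right]^{\frac{\theta-1}{\theta}}. \tag{7.1.6}$$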
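As a numerical check on (7.1.5) and (7.1.6), the sketch below replaces the integral over goods with an average over a finite set of equally weighted goods (an assumption made for illustration) and verifies that the implied demands satisfy the aggregator constraint (7.1.2) and the zero-profit condition (7.1.3). Function names are hypothetical.

```python
def price_index(prices, theta):
    """(7.1.6) with the integral replaced by an equally weighted average."""
    e = theta / (theta - 1.0)
    return (sum(p ** e for p in prices) / len(prices)) ** (1.0 / e)

def demands(prices, Y, theta):
    """(7.1.5): demand for each intermediate good given aggregate output Y."""
    P = price_index(prices, theta)
    return [(P / p) ** (1.0 / (1.0 - theta)) * Y for p in prices]

def aggregator(quantities, Y, theta):
    """Left-hand side of (7.1.2) with g(y) = y**theta; should equal 1."""
    return sum((q / Y) ** theta for q in quantities) / len(quantities)
```

For any positive prices and $0 < \theta < 1$, `aggregator(demands(prices, Y, theta), Y, theta)` returns one up to rounding, and the average expenditure per unit of output equals `price_index(prices, theta)`.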

The problem solved by consumers is

$$\max \sum_{t=0}^{\infty}\sum_{s^t}\beta^t\pi(s^t)\, U\left(C(s^t), L(s^t), M(s^t)/P(s^t)\right), \tag{7.1.7}$$

subject to the sequence of budget constraints

$$\begin{aligned}
& P(s^t)C(s^t) + M(s^t) + \sum_{s_{t+1}} Q(s^{t+1}|s^t)\, B(s^{t+1}) \\
&\quad \le P(s^t)W(s^t)L(s^t) + M(s^{t-1}) + B(s^t) + \Pi(s^t) + T(s^t), \quad t = 0, 1, \ldots,
\end{aligned} \tag{7.1.8}$$

and borrowing constraints $B(s^{t+1}) \ge \bar B$ for some large negative number $\bar B$.

The first order conditions for the consumer are therefore given by the following equations:

$$-\frac{U_l(s^t)}{U_c(s^t)} = W(s^t) \tag{7.1.9}$$

$$\frac{U_c(s^t)}{P(s^t)} = \beta\sum_{s_{t+1}}\pi(s^{t+1}|s^t)\frac{U_c(s^{t+1})}{P(s^{t+1})} + \frac{U_m(s^t)}{P(s^t)} \tag{7.1.10}$$

$$Q(s^{\tau}|s^t) = \beta^{\tau-t}\pi(s^{\tau}|s^t)\frac{U_c(s^{\tau})}{U_c(s^t)}\frac{P(s^t)}{P(s^{\tau})} \quad \text{for all } \tau > t \tag{7.1.11}$$

where $U(s^t)$ is shorthand for $U(C(s^t), L(s^t), M(s^t)/P(s^t))$.

The problem solved by the monopolist adjusting his price is to choose $P(i, s^{t-1})$,
$K(i, s^{\tau})$, $X(i, s^{\tau})$, and $L(i, s^{\tau})$, $\tau = t, \ldots, t+N-1$, to maximize

$$\sum_{\tau=t}^{\infty}\sum_{s^{\tau}} Q(s^{\tau}|s^{t-1})\left[P(i, s^{t-1})\, Y(i, s^{\tau}) - P(s^{\tau})W(s^{\tau})L(i, s^{\tau}) - P(s^{\tau})X(i, s^{\tau})\right] \tag{7.1.12}$$

subject to the demand for good $i$ in (7.1.4), the production technology:

$$Y(i, s^t) = F\left(K(i, s^{t-1}), L(i, s^t)\right) \tag{7.1.13}$$

and the law of motion for capital used in producing $i$:

$$K(i, s^t) = (1-\delta)K(i, s^{t-1}) + X(i, s^t) - \phi\left(\frac{X(i, s^t)}{K(i, s^{t-1})}\right)K(i, s^{t-1}). \tag{7.1.14}$$

The first order conditions for the case with $F(K, L) = K^{\alpha_1}L^{\alpha_2}$ are given by

$$\begin{aligned}
\sum_{\tau}\sum_{s^{\tau}} Q(s^{\tau}|s^{t-1})\Bigg\{ & Y(i, s^{\tau}) + Y(s^{\tau})\left[1 - P(s^{\tau})V(i, s^{\tau})/P(i, s^{t-1})\right] \\
& \times D'\left(\frac{P(i, s^{t-1})}{P(s^{\tau})}\int g'\left(\frac{Y(j, s^{\tau})}{Y(s^{\tau})}\right)\frac{Y(j, s^{\tau})}{Y(s^{\tau})}\, dj\right) g'\left(\frac{Y(i, s^{\tau})}{Y(s^{\tau})}\right)\Bigg\} = 0
\end{aligned} \tag{7.1.15}$$

$$V(i, s^t) = W(s^t)/F_l(i, s^t) \tag{7.1.16}$$

$$\begin{aligned}
\frac{1}{1 - \phi'(i, s^t)} = \sum_{s_{t+1}}\frac{Q(s^{t+1}|s^{t-1})\, P(s^{t+1})}{Q(s^t|s^{t-1})\, P(s^t)}\Bigg[ & V(i, s^{t+1})F_k(i, s^{t+1}) \\
& + \frac{1}{1 - \phi'(i, s^{t+1})}\left\{1 - \delta - \phi(i, s^{t+1}) + \phi'(i, s^{t+1})\frac{X(i, s^{t+1})}{K(i, s^t)}\right\}\Bigg]
\end{aligned} \tag{7.1.17}$$

where $F(i, s^t)$ and $\phi(i, s^t)$ are shorthand for $F(K(i, s^{t-1}), L(i, s^t))$ and $\phi(X(i, s^t)/K(i, s^{t-1}))$,
respectively. The monopolists not setting prices will still maximize with respect to labor,
investment, and capital. Therefore, there will be one pricing equation and $N$ Euler equations
for capital. The first order conditions for those monopolists not setting prices depend
on the prices that they last set.

Note that if the technology of the final goods producer is given by $g(y) = y^{\theta}$, then the
first order condition in (7.1.15) can be written more simply as follows:

$$P_i = \frac{1}{\theta}\,\frac{\sum_{\tau}\sum_{s^{\tau}} Q(s^{\tau}|s^{t-1})\, Y(s^{\tau})\, P(s^{\tau})^{\frac{2-\theta}{1-\theta}}\, V(i, s^{\tau})}{\sum_{\tau}\sum_{s^{\tau}} Q(s^{\tau}|s^{t-1})\, Y(s^{\tau})\, P(s^{\tau})^{\frac{1}{1-\theta}}}. \tag{7.1.18}$$

If adjustment costs are equal to zero, then the Euler equation for capital (7.1.17) can be
written more simply as follows:

$$U_c(s^t) = \beta\sum_{s_{t+1}}\pi(s^{t+1}|s^t)\, U_c(s^{t+1})\left[V(i, s^{t+1})F_k(i, s^{t+1}) + 1 - \delta\right].$$

Finally, the following equilibrium constraints must hold:

$$M(s^t) = \mu(s^t)\, M(s^{t-1}) \tag{7.1.19}$$

$$T(s^t) = M(s^t) - M(s^{t-1}) \tag{7.1.20}$$

$$L(s^t) = \int L(i, s^t)\, di \tag{7.1.21}$$

$$Y(s^t) = C(s^t) + \int X(i, s^t)\, di. \tag{7.1.22}$$

To summarize, we have equations (7.1.4)-(7.1.3) from the final goods producers,
equations (7.1.9)-(7.1.11) from the consumers, equations (7.1.14)-(7.1.17) from the
intermediate goods producers, and equations (7.1.19)-(7.1.13) that must hold in equilibrium.

7.2. Computing an Equilibrium

In this section, we describe in some detail the numerical algorithm used to solve the
full-blown model of Section 2. The solution takes the form

$$Z_t = AZ_{t-1} + BS_t \tag{7.2.1}$$

where $Z_t = [z_t', \ldots, z_{t-N+2}']'$ is a $(N+2)(N-1) \times 1$ vector and

$$\begin{aligned}
z_t &= [p_{t-1} - m_{t-1}, k_{1,t}, \ldots, k_{N,t}, y_t]' \\
S_t &= [\mu_t, \mu_{t-1}, \ldots, \mu_{t-N+1}]'.
\end{aligned}$$

The matrices $A$ and $B$ are chosen to satisfy the first order conditions which can be written
generally as follows

$$E_t\left[a_0Z_{t+N-1} + a_1Z_{t+N-2} + \ldots + a_NZ_{t-1} + b_0S_{t+N-1} + b_1S_{t+N-2} + \ldots + b_{N-1}S_t\right] = 0 \tag{7.2.2}$$

where $E_t \equiv E[\cdot|s^{t-1}]$ for the first residual (the pricing equation) and $E_t = E[\cdot|s^t]$ for all
other residuals.

Writing the residuals as in (7.2.2) makes the notation simpler but actually implies
lots of duplication. For example, lagged prices appear multiple times. Our subroutine
uses a smaller set of variables when constructing residuals of the first order conditions.
In particular, there are two inputs: the vector of parameters appearing in the first order
conditions and the following vector of variables:

$$Z \equiv [z_{t+N-1}, z_{t+N-2}, \ldots, z_t, p_{t-2} - m_{t-2}, \ldots, p_{t-N} - m_{t-N}, k_{1,t-1}, \ldots, k_{N,t-1}, \mu_{t+N-1}, \ldots, \mu_{t-N+1}]'.$$


We show later that all other variables can be constructed once we know those in $Z$.

Above we assumed that $\beta \approx 1$ when deriving the linearized pricing equation. In
writing the code, we will not make this assumption. If we linearize (7.1.15) we get

$$\begin{aligned}
p_{t-1} = \frac{1}{(N-1)\sum_{i=0}^{N-1}\beta^i}E_{t-1}\Big[ & p_{t-N} + (1+\beta)p_{t-N+1} + \ldots + (1 + \beta + \ldots + \beta^{N-2})p_{t-2} \\
& + (\beta + \ldots + \beta^{N-1})p_t + \ldots + \beta^{N-1}p_{t+N-2} \\
& + \varphi N\left\{v_{i,t} + \beta v_{i,t+1} + \ldots + \beta^{N-1}v_{i,t+N-1}\right\}\Big]
\end{aligned} \tag{7.2.3}$$

where the constant terms have been ignored. It turns out that it is most convenient to
write the residuals by first normalizing the prices: we divide them by the money supply.
If we do this, then the pricing equation in (7.2.3) is equivalent to

$$\begin{aligned}
p_{t-1} - m_{t-1} = \frac{1}{(N-1)\sum_{i=0}^{N-1}\beta^i}E_{t-1}\Big[ & (p_{t-N} - m_{t-N}) + \ldots + (1 + \beta + \ldots + \beta^{N-2})(p_{t-2} - m_{t-2}) \\
& + (\beta + \ldots + \beta^{N-1})(p_t - m_t) + \ldots + \beta^{N-1}(p_{t+N-2} - m_{t+N-2}) \\
& + \varphi N\left\{v_{i,t} + \beta v_{i,t+1} + \ldots + \beta^{N-1}v_{i,t+N-1}\right\} \\
& + \left[(\beta + \ldots + \beta^{N-1}) + (\beta^2 + \ldots + \beta^{N-1}) + \ldots + \beta^{N-1}\right]\mu_t \\
& + \left[(\beta^2 + \ldots + \beta^{N-1}) + \ldots + \beta^{N-1}\right]\mu_{t+1} \\
& + \ldots + \beta^{N-1}\mu_{t+N-2} \\
& - \mu_{t-N+1} - [1 + (1+\beta)]\mu_{t-N+2} - \left[1 + (1+\beta) + (1 + \beta + \beta^2)\right]\mu_{t-N+3} \\
& - \ldots - \left[1 + (1+\beta) + \ldots + (1 + \beta + \ldots + \beta^{N-2})\right]\mu_{t-1}\Big].
\end{aligned}$$

We had to write the pricing equation as above because we do not have explicit functional
forms for $g(\cdot)$ (and hence the demand function $D(\cdot)$ in (7.1.15)). The other residuals
can either be linearized by hand or numerically. For ease of reading the code, we chose
to linearize them numerically.

In addition to the pricing equation we have the money demand equation and $N$
Euler equations for capital:

$$\frac{U_c(s^t)}{P(s^t)} = \beta\sum_{s_{t+1}}\pi(s^{t+1}|s^t)\frac{U_c(s^{t+1})}{P(s^{t+1})} + \frac{U_m(s^t)}{P(s^t)} \tag{7.2.4}$$

$$\begin{aligned}
U_c(s^t) = \left[1 - \phi'(i, s^t)\right]\beta\sum_{s_{t+1}}\pi(s^{t+1}|s^t)\, U_c(s^{t+1})\Bigg[ & V(i, s^{t+1})F_k(i, s^{t+1}) \\
& + \frac{1}{1 - \phi'(i, s^{t+1})}\left\{1 - \delta - \phi(i, s^{t+1}) + \phi'(i, s^{t+1})\frac{X(i, s^{t+1})}{K(i, s^t)}\right\}\Bigg].
\end{aligned} \tag{7.2.5}$$

In writing the residuals, we will use the following convention for naming the cohorts

(which is different than that used above). we will assume that monopolists named i are

those that set their prices i periods ago. For example, in t, group 1 charges pt−1, group 2

charges pt−2, and so on. Note that the particular assignment is not important. To evaluate

the pricing equation we need unit costs of group 1 for t, t + 1, . . ., t + N − 1. For these

costs, we will use the notation: v1,t, v2,t+1, . . ., vN,t+N−1 where

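\begin{align*}
v_{1,t} &= w_t + (1-\alpha_2)\,l_{1,t} - \alpha_1 k_{N,t-1} - \log(\alpha_2) \\
v_{2,t+1} &= w_{t+1} + (1-\alpha_2)\,l_{2,t+1} - \alpha_1 k_{1,t} - \log(\alpha_2) \\
&\;\;\vdots \\
v_{N,t+N-1} &= w_{t+N-1} + (1-\alpha_2)\,l_{N,t+N-1} - \alpha_1 k_{N-1,t+N-2} - \log(\alpha_2)
\end{align*}

if $F(K,L) = K^{\alpha_1}L^{\alpha_2}$. The capital stocks are included in $Z$. The labor inputs are given by

\begin{align*}
l_{1,t} &= \frac{\epsilon}{\alpha_2}(p_t - p_{t-1}) - \frac{\alpha_1}{\alpha_2}k_{N,t-1} + \frac{1}{\alpha_2}y_t \\
l_{2,t+1} &= \frac{\epsilon}{\alpha_2}(p_{t+1} - p_{t-1}) - \frac{\alpha_1}{\alpha_2}k_{1,t} + \frac{1}{\alpha_2}y_{t+1} \\
&\;\;\vdots \\
l_{N,t+N-1} &= \frac{\epsilon}{\alpha_2}(p_{t+N-1} - p_{t-1}) - \frac{\alpha_1}{\alpha_2}k_{N-1,t+N-2} + \frac{1}{\alpha_2}y_{t+N-1}
\end{align*}

which follows from $y_{i,t} - y_t = -\epsilon(p_{i,t-1} - p_t)$ and $\exp(y_{i,t}) = F(\exp(k_{i-1,t-1}), \exp(l_{i,t}))$. Aggregate output is given in $Z$. For the relative prices, we write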

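\begin{align*}
p_t - p_{t-1} &= \frac{1}{N}\left[(p_{t-1}-m_{t-1}) + \ldots + (p_{t-N}-m_{t-N})\right] - (p_{t-1}-m_{t-1}) \tag{7.2.6} \\
&\quad - \frac{N-1}{N}\mu_{t-1} - \frac{N-2}{N}\mu_{t-2} - \ldots - \frac{1}{N}\mu_{t-N+1} \\
p_{t+1} - p_{t-1} &= \frac{1}{N}\left[(p_t-m_t) + \ldots + (p_{t-N+1}-m_{t-N+1})\right] - (p_{t-1}-m_{t-1}) \\
&\quad + \frac{1}{N}\mu_t - \frac{N-2}{N}\mu_{t-1} - \ldots - \frac{1}{N}\mu_{t-N+2} \\
p_{t+2} - p_{t-1} &= \frac{1}{N}\left[(p_{t+1}-m_{t+1}) + \ldots + (p_{t-N+2}-m_{t-N+2})\right] - (p_{t-1}-m_{t-1}) \\
&\quad + \frac{1}{N}\mu_{t+1} + \frac{2}{N}\mu_t - \frac{N-3}{N}\mu_{t-1} - \ldots - \frac{1}{N}\mu_{t-N+3} \\
&\;\;\vdots \\
p_{t+N-1} - p_{t-1} &= \frac{1}{N}\left[(p_{t+N-2}-m_{t+N-2}) + \ldots + (p_{t-1}-m_{t-1})\right] - (p_{t-1}-m_{t-1}) \tag{7.2.7} \\
&\quad + \frac{1}{N}\mu_{t+N-2} + \frac{2}{N}\mu_{t+N-3} + \ldots + \frac{N-1}{N}\mu_t
\end{align*}

which depend on terms in $Z$.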

The wage rate appears in the equation for unit costs. To construct wage rates we need

C(st), L(st), and M(st)/P (st). For aggregate consumption, we need aggregate output and

the individual investments:

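\[
C(s^t) = Y(s^t) - \frac{1}{N}\sum_i X(i,s^t)
\]

\[
X(i,s^t) = \frac{1}{b}\left(1 + b\delta - \sqrt{1 + 2b\delta - 2b\left(K(i,s^t)/K(i-1,s^{t-1}) - 1 + \delta\right)}\right)K(i-1,s^{t-1})
\]

where the capital stocks and output are in $Z$. When linearized, these equations look like

\begin{align*}
c_t &= \left(Y y_t - X\sum_i x_{i,t}/N\right)/C \\
k_{i,t} &= (1-\delta)\,k_{i-1,t-1} + \delta x_{i,t}
\end{align*}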

where constant terms have been ignored. Notice that the monopolists with capital stocks

K(i−1, st−1) in t−1 are the same monopolists with capital K(i, st) using our new naming

convention. Monopolists named N this period are named 1 next period since they are the

next to change prices. Aggregate labor is given by

\[
L(s^t) = \frac{1}{N}\sum_i L(i,s^t)
\]

or, in logs, by

\[
l_t = \frac{1}{N}\sum_i l_{i,t}.
\]

Finally, logged real balances are given by

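\begin{align*}
m_t - p_t &= m_t - \frac{1}{N}(p_{t-1} + \ldots + p_{t-N}) \\
&= \frac{1}{N}\left[(m_{t-1}+\mu_t) + (m_{t-2}+\mu_t+\mu_{t-1}) + \ldots + (m_{t-N}+\mu_t+\mu_{t-1}+\ldots+\mu_{t-N+1})\right] - \frac{1}{N}(p_{t-1} + \ldots + p_{t-N}) \\
&= -\frac{1}{N}\sum_{i=1}^{N}(p_{t-i}-m_{t-i}) + \frac{1}{N}\left(N\mu_t + (N-1)\mu_{t-1} + \ldots + \mu_{t-N+1}\right).
\end{align*}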
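The last equality can be checked numerically. The following is a minimal sketch, not part of the original notes: it draws arbitrary money growth rates and set prices, builds $m_t$ from $m_t = m_{t-1} + \mu_t$, and compares the direct definition of real balances with the final formula. The function names and the random test data are illustrative assumptions.

```python
import numpy as np

def real_balances_direct(p, m, t, N):
    """m_t - p_t, with p_t the average of the prices set in t-1, ..., t-N."""
    return m[t] - np.mean(p[t - N:t])

def real_balances_formula(p, m, mu, t, N):
    """Final line of the derivation: lagged (p - m) terms plus weighted money growth."""
    lagged = -np.mean([p[t - i] - m[t - i] for i in range(1, N + 1)])
    # mu_{t-j} carries weight (N - j)/N for j = 0, ..., N-1
    growth = sum((N - j) * mu[t - j] for j in range(N)) / N
    return lagged + growth

rng = np.random.default_rng(0)
T, N, t = 20, 4, 10
mu = rng.normal(size=T)      # hypothetical money growth rates
m = np.cumsum(mu)            # m_t = m_{t-1} + mu_t
p = rng.normal(size=T)       # hypothetical log prices set in each period

lhs = real_balances_direct(p, m, t, N)
rhs = real_balances_formula(p, m, mu, t, N)
```

The two values agree up to floating-point error for any choice of `t >= N`, since the identity only uses the definition of $\mu_t$.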

For the pricing equation, we need real balances in $t, t+1, \ldots, t+N-1$, so we need to know the sequences $p_{t-N}-m_{t-N}, \ldots, p_{t+N-2}-m_{t+N-2}$ and $\mu_{t-N+1}, \ldots, \mu_{t+N-1}$. These are in $Z$. The formulas for the relative prices $p_\tau - p_{\tau-i}$ can be found in (7.2.6)-(7.2.7). All of the variables appearing in the money demand equation (7.2.4) have at this point been constructed.

There are two steps to solving the system of equations in (7.2.1). We start with the first step: computing $A$. We use standard methods to compute the deterministic solution $Z_t = AZ_{t-1}$.

Define $X_t$ to be the following vector of state variables:

\[
X_t = [p_{t-2}-m_{t-2}, \ldots, p_{t-N}-m_{t-N}, k_{1,t-1}, \ldots, k_{N,t-1}]'. \tag{7.2.8}
\]

Using the definition of $X$ in (7.2.8), the residuals (dropping terms with $\mu$) can be written as

\[
A_1\begin{bmatrix} X_{t+1} \\ Z_{t+N-1} \end{bmatrix} + A_2\begin{bmatrix} X_t \\ Z_{t+N-2} \end{bmatrix} + (\text{shock terms}) = 0,
\]

where elements of $A_1$ and $A_2$ are either coefficients of linearized residuals or 1’s and 0’s used to associate variables with their lagged values. To compute $A$, we find generalized eigenvalues $\Lambda$ (and associated eigenvectors $D$) such that $A_2 D = -A_1 D\Lambda$. For a unique stationary equilibrium, we need $2N-1$ roots inside the unit circle. Note that $X$ has length $2N-1$. If we sort the eigenvalues and eigenvectors so that the roots inside the unit circle are ordered first, then we have

\begin{align*}
X_{t+1} &= D_{11}\Lambda_1 D_{11}^{-1}X_t \\
Z_{t+N-2} &= D_{21}D_{11}^{-1}X_t
\end{align*}

where $D_{11}$ is the upper left partition of $D$ and is $(2N-1)\times(2N-1)$, $D_{21}$ is the lower left partition of $D$ and has dimension $(N+2)(N-2)\times(2N-1)$, and $\Lambda_1$ is the upper left partition of the matrix of eigenvalues. Recall that $Z_{t+N-2} = [z'_{t+N-2}, \ldots, z'_t]'$. Recall also that all of the elements in $X_t$ are also in $Z_{t-1}$. Therefore, we can use the solutions above to fill in the elements of $A$.
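The eigenvalue step can be sketched as follows. This is an illustration under a simplifying assumption that is not in the notes: $A_1$ is taken to be invertible, so the generalized problem $A_2D = -A_1D\Lambda$ reduces to an ordinary eigenvalue problem for $-A_1^{-1}A_2$. If $A_1$ is singular, a generalized Schur (QZ) decomposition would be needed instead. The function name `stable_solution` and the toy matrices are hypothetical.

```python
import numpy as np

def stable_solution(A1, A2, n_x):
    """Stable solution of A1 @ w[t+1] + A2 @ w[t] = 0, where w = [X; Z].

    Assumes A1 is invertible (an assumption of this sketch).
    Returns (P, F) such that X[t+1] = P @ X[t] and Z = F @ X[t].
    """
    M = np.linalg.solve(A1, -A2)          # w[t+1] = M @ w[t]
    lam, D = np.linalg.eig(M)             # M D = D Lambda, i.e. A2 D = -A1 D Lambda
    order = np.argsort(np.abs(lam))       # roots inside the unit circle first
    lam, D = lam[order], D[:, order]
    n_stable = int(np.sum(np.abs(lam) < 1.0))
    if n_stable != n_x:
        raise ValueError(f"need {n_x} stable roots for uniqueness, found {n_stable}")
    D11, D21 = D[:n_x, :n_x], D[n_x:, :n_x]
    D11_inv = np.linalg.inv(D11)
    P = np.real(D11 @ np.diag(lam[:n_x]) @ D11_inv)   # D11 Lambda1 D11^{-1}
    F = np.real(D21 @ D11_inv)                        # D21 D11^{-1}
    return P, F

# Toy example: one state variable, one other variable, roots 0.5 and 2.
A1 = np.eye(2)
A2 = -np.array([[0.5, 0.0], [0.3, 2.0]])
P, F = stable_solution(A1, A2, n_x=1)
```

In the toy example the stable root is 0.5, so the state decays at that rate and the second variable is a fixed multiple of the state. The check on the number of stable roots mirrors the uniqueness condition stated above.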


Given A, solving for B involves solving a linear system of equations. The law of

motion for the shocks is given by

St = St−1 + ǫt+1 (7.2.9)

where the (1,1) element of the transition matrix is $\rho$, the remaining elements are 1’s or 0’s, and the first element of $\epsilon_t$ is nonzero while all other elements are zero. We plug this law of motion and the law of motion for $Z_t$ into (7.2.2), using recursion to write the equation in terms of $Z_{t-1}$ and $S_t$. The coefficients on these variables are both set equal to zero. Setting the coefficients on $S_t$ equal to zero gives us the equations we need for solving the elements of $B$. The problem then is to find $B$ such that $E_t\,FBG + HS_t = 0$, where the elements of matrices $F$, $G$, and $H$ are functions of the parameters and the computed elements of $A$. If $E_t \equiv E[\,\cdot\,|\mu_t, \mu_{t-1}, \ldots]$, then $B$ is given by

vec (B) = − (G⊗ F ′) vec (H) (7.2.10)

where vec(B) is a vector with the columns of B stacked one after another.

Note, however, that the first residual equation has Et ≡ E[·|µt−1, µt−2, . . .]. Therefore,

we have to treat it slightly differently from the others. Even so, the solution procedure is

an application of undetermined coefficients, and all elements of B can be found by solving

a system of linear equations.
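As an illustration of how a linear matrix equation of this kind can be solved by stacking columns, the sketch below solves $FBG + H = 0$ for $B$ using the standard identity $\mathrm{vec}(FBG) = (G' \otimes F)\,\mathrm{vec}(B)$. It is a generic example, not the notes' own code: the exact Kronecker arrangement in a given application depends on how $F$, $G$, and $H$ are defined, and the sketch assumes that $G' \otimes F$ is square and invertible. The function name and test matrices are hypothetical.

```python
import numpy as np

def solve_fbg_plus_h(F, G, H):
    """Solve F @ B @ G + H = 0 for B by column stacking.

    Uses vec(F B G) = kron(G', F) vec(B) with column-major (Fortran) ordering.
    Assumes kron(G', F) is square and invertible (an assumption of this sketch).
    """
    n, m = F.shape[1], G.shape[0]              # B is n-by-m
    K = np.kron(G.T, F)
    vec_B = np.linalg.solve(K, -H.reshape(-1, order="F"))
    return vec_B.reshape((n, m), order="F")

# Hypothetical, well-conditioned test matrices.
rng = np.random.default_rng(1)
F = np.eye(3) + 0.1 * rng.normal(size=(3, 3))
G = np.eye(2) + 0.1 * rng.normal(size=(2, 2))
B_true = rng.normal(size=(3, 2))
H = -F @ B_true @ G
B = solve_fbg_plus_h(F, G, H)
```

Column-major reshaping matches the definition of $\mathrm{vec}(B)$ as the columns of $B$ stacked one after another.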


Chapter 8.

Business Cycle Accounting

Business cycle accounting is a simple method to help researchers develop quantitative

models of economic fluctuations. The method rests on the insight that many models are

equivalent to a prototype growth model with time-varying wedges which resemble produc-

tivity, labor and investment taxes, and government consumption. Wedges corresponding

to these variables—efficiency, labor, investment, and government consumption wedges—

are measured and then fed back into the model in order to assess the fraction of various

fluctuations they account for.

8.1. The Prototype Model with Time-Varying Wedges

The prototype model is a version of the RBC model described earlier. The main difference is that we have a different set of shocks. We also keep track of the stochastic events so as to be very clear about the timing of these shocks. Specifically, in each period $t$, the economy experiences one of finitely many events $s_t$, which index the shocks. We denote by $s^t = (s_0, \ldots, s_t)$ the history of events up through and including period $t$ and often refer to $s^t$ as the state. The probability, as of period 0, of any particular history $s^t$ is $\pi_t(s^t)$. The initial realization $s_0$ is given. The economy has four exogenous stochastic variables, all of which are functions of the underlying random variable $s^t$: the efficiency wedge $A_t(s^t)$, the labor wedge $1-\tau_{lt}(s^t)$, the investment wedge $1/[1+\tau_{xt}(s^t)]$, and the government consumption wedge $g_t(s^t)$.

Consumers maximize expected utility over per capita consumption $c_t$ and per capita labor $l_t$,

\[
\sum_{t=0}^{\infty}\sum_{s^t}\beta^t\pi_t(s^t)\,U\!\left(c_t(s^t), l_t(s^t)\right)N_t,
\]

subject to the budget constraint

\[
c_t + [1+\tau_{xt}(s^t)]\,x_t(s^t) = [1-\tau_{lt}(s^t)]\,w_t(s^t)\,l_t(s^t) + r_t(s^t)\,k_t(s^{t-1}) + T_t(s^t)
\]

and the capital accumulation law

\[
(1+\gamma_n)\,k_{t+1}(s^t) = (1-\delta)\,k_t(s^{t-1}) + x_t(s^t), \tag{8.1.1}
\]


where $k_t(s^{t-1})$ denotes the per capita capital stock, $x_t(s^t)$ per capita investment, $w_t(s^t)$ the wage rate, $r_t(s^t)$ the rental rate on capital, $\beta$ the discount factor, $\delta$ the depreciation rate of capital, $N_t$ the population with growth rate equal to $1+\gamma_n$, and $T_t(s^t)$ per capita lump-sum transfers.

The production function is $F(k_t(s^{t-1}), (1+\gamma)^t l_t(s^t))$, where $1+\gamma$ is the rate of labor-augmenting technical progress, which is assumed to be a constant. Firms maximize profits given by $A_t(s^t)F(k_t(s^{t-1}), (1+\gamma)^t l_t(s^t)) - r_t(s^t)k_t(s^{t-1}) - w_t(s^t)l_t(s^t)$.

The equilibrium of this benchmark prototype economy is summarized by the resource constraint,

\[
c_t(s^t) + x_t(s^t) + g_t(s^t) = y_t(s^t), \tag{8.1.2}
\]

where $y_t(s^t)$ denotes per capita output, together with

\[
y_t(s^t) = A_t(s^t)\,F\!\left(k_t(s^{t-1}), (1+\gamma)^t l_t(s^t)\right), \tag{8.1.3}
\]

\[
-\frac{U_{lt}(s^t)}{U_{ct}(s^t)} = [1-\tau_{lt}(s^t)]\,A_t(s^t)\,(1+\gamma)^t F_{lt}, \tag{8.1.4}
\]

\[
U_{ct}(s^t)\,[1+\tau_{xt}(s^t)] = \beta\sum_{s^{t+1}}\pi_t(s^{t+1}|s^t)\,U_{ct+1}(s^{t+1})\left\{A_{t+1}(s^{t+1})F_{kt+1}(s^{t+1}) + (1-\delta)[1+\tau_{xt+1}(s^{t+1})]\right\}, \tag{8.1.5}
\]

where, here and throughout, notations like $U_{ct}$, $U_{lt}$, $F_{lt}$, and $F_{kt}$ denote the derivatives of the utility function and the production function with respect to their arguments and $\pi_t(s^{t+1}|s^t)$ denotes the conditional probability $\pi_t(s^{t+1})/\pi_t(s^t)$. We assume that $g_t(s^t)$ fluctuates around a trend of $(1+\gamma)^t$.

Notice that in this benchmark prototype economy, the efficiency wedge resembles a

blueprint technology parameter, and the labor wedge and the investment wedge resem-

ble tax rates on labor income and investment. Other more elaborate models could be

considered, models with other kinds of frictions that look like taxes on consumption or

on capital income. Consumption taxes induce a wedge between the consumption-leisure

marginal rate of substitution and the marginal product of labor in the same way as do

labor income taxes. Such taxes, if time-varying, also distort the intertemporal margins in

(8.1.5). Capital income taxes induce a wedge between the intertemporal marginal rate of

substitution and the marginal product of capital which is only slightly different from the

distortion induced by a tax on investment.


We emphasize that each of the wedges represents the overall distortion to the relevant

equilibrium condition of the model. For example, distortions both to labor supply affect-

ing consumers and to labor demand affecting firms distort the static first-order condition

(8.1.4). Our labor wedge represents the sum of these distortions. Thus, our method iden-

tifies the overall wedge induced by both distortions and does not identify each separately.

Likewise, liquidity constraints on consumers distort the consumer’s intertemporal Euler

equation, while investment financing frictions on firms distort the firm’s intertemporal Eu-

ler equation. Our method combines the Euler equations for the consumer and the firm

and therefore identifies only the overall wedge in the combined Euler equation given by

(8.1.5). We focus on the overall wedges because what matters in determining business

cycle fluctuations is the overall wedges, not each distortion separately.

8.2. Mapping Frictions to Wedges

Now we illustrate the mapping between detailed economies and prototype economies for

two types of wedges. We show that input-financing frictions in a detailed economy map

into efficiency wedges in our prototype economy. Sticky wages in a monetary economy map

into our prototype (real) economy with labor wedges. In an appendix, we show as well

that investment-financing frictions map into investment wedges and that fluctuations in net

exports in an open economy map into government consumption wedges in our prototype

(closed) economy. In general, our approach is to show that the frictions associated with

specific economic environments manifest themselves as distortions in first-order conditions

and resource constraints in a growth model. We refer to these distortions as wedges.

We choose simple models in order to illustrate how the detailed models map into the

prototypes. Since many models map into the same configuration of wedges, identifying one

particular configuration does not uniquely identify a model; rather, it identifies a whole

class of models consistent with that configuration. In this sense, our method does not

uniquely determine the model most promising to analyze business cycle fluctuations. It

does, however, guide researchers to focus on the key margins that need to be distorted in

order to capture the nature of the fluctuations.

8.2.1. Efficiency Wedges

In many economies, underlying frictions either within or across firms cause factor inputs to

be used inefficiently. These frictions in an underlying economy often show up as aggregate


productivity shocks in a prototype economy similar to our benchmark economy. Schmitz

(2005) presents an interesting example of within-firm frictions resulting from work rules

that lower measured productivity at the firm level. Lagos (2006) studies how labor market

policies lead to misallocations of labor across firms and, thus, to lower aggregate produc-

tivity. And Chu (2001) and Restuccia and Rogerson (2003) show how government policies

at the levels of plants and establishments lead to lower aggregate productivity.

Here we develop a detailed economy with input-financing frictions and use it to make

two points. This economy illustrates the general idea that frictions which lead to ineffi-

cient factor utilization map into efficiency wedges in a prototype economy. Beyond that,

however, the economy also demonstrates that financial frictions can show up as efficiency

wedges rather than as investment wedges. In our detailed economy, financing frictions lead

some firms to pay higher interest rates for working capital than do other firms. Thus, these

frictions lead to an inefficient allocation of inputs across firms.

A Detailed Economy With Input-Financing Frictions

Consider a simple detailed economy with financing frictions which distort the alloca-

tion of intermediate inputs across two types of firms. Both types of firms must borrow to

pay for an intermediate input in advance of production. One type of firm is more finan-

cially constrained, in the sense that it pays a higher interest rate on borrowing than does

the other type. We think of these frictions as capturing the idea that some firms, such as

small firms, often have difficulty borrowing. One motivation for the higher interest rate

faced by the financially constrained firms is that moral hazard problems are more severe

for small firms.

Specifically, consider the following economy. Aggregate gross output $q_t$ is a combination of the gross output $q_{it}$ from the economy’s two sectors, indexed $i = 1, 2$, where 1 indicates the sector of firms that are more financially constrained and 2 the sector of firms that are less financially constrained. The sectors’ gross output is combined according to

\[
q_t = q_{1t}^{\phi}\,q_{2t}^{1-\phi}, \tag{8.2.1}
\]

where $0 < \phi < 1$. The representative producer of the gross output $q_t$ chooses $q_{1t}$ and $q_{2t}$ to solve this problem:

\[
\max\; q_t - p_{1t}q_{1t} - p_{2t}q_{2t}
\]

subject to (8.2.1), where $p_{it}$ is the price of the output of sector $i$.


The resource constraint for gross output in this economy is

\[
c_t + k_{t+1} + m_{1t} + m_{2t} = q_t + (1-\delta)k_t, \tag{8.2.2}
\]

where $c_t$ is consumption, $k_t$ is the capital stock, and $m_{1t}$ and $m_{2t}$ are intermediate goods used in sectors 1 and 2, respectively. Final output, given by $y_t = q_t - m_{1t} - m_{2t}$, is gross output less the intermediate goods used.

The gross output of each sector $i$, $q_{it}$, is made from intermediate goods $m_{it}$ and a composite value-added good $z_{it}$ according to

\[
q_{it} = m_{it}^{\theta}\,z_{it}^{1-\theta}, \tag{8.2.3}
\]

where $0 < \theta < 1$. The composite value-added good is produced from capital $k_t$ and labor $l_t$ according to

\[
z_{1t} + z_{2t} = z_t = F(k_t, l_t). \tag{8.2.4}
\]

The producer of gross output of sector $i$ chooses the composite good $z_{it}$ and the intermediate good $m_{it}$ to solve this problem:

\[
\max\; p_{it}q_{it} - v_t z_{it} - R_{it}m_{it}
\]

subject to (8.2.3). Here $v_t$ is the price of the composite good and $R_{it}$ is the gross within-period interest rate paid on borrowing by firms in sector $i$. If firms in sector 1 are more financially constrained than those in sector 2, then $R_{1t} > R_{2t}$. Let $R_{it} = R_t(1+\tau_{it})$, where $R_t$ is the rate consumers earn within period $t$ and $\tau_{it}$ measures the within-period spread, induced by financing constraints, between the rate paid to consumers who save and the rate paid by firms in sector $i$. Since consumers do not discount utility within the period, $R_t = 1$.

In this economy, the representative producer of the composite good $z_t$ chooses $k_t$ and $l_t$ to solve this problem:

\[
\max\; v_t z_t - w_t l_t - r_t k_t
\]

subject to (8.2.4), where $w_t$ is the wage rate and $r_t$ is the rental rate on capital.

Consumers solve this problem:

\[
\max \sum_{t=0}^{\infty}\beta^t U(c_t, l_t) \tag{8.2.5}
\]

subject to

\[
c_t + k_{t+1} = r_t k_t + w_t l_t + (1-\delta)k_t + T_t,
\]

where $l_t = l_{1t} + l_{2t}$ is the economy’s total labor supply and $T_t = R_t\sum_i \tau_{it}m_{it}$ denotes lump-sum transfers. Here we assume that the financing frictions act like distorting taxes, and the proceeds are rebated to consumers. If, instead, we assumed that these frictions represent, say, lost gross output, then we would adjust the economy’s resource constraint (8.2.2) appropriately.

The Associated Prototype Economy

Now consider a version of the benchmark prototype economy that will have the same

aggregate allocations as the input-financing frictions economy just detailed. This prototype

economy is identical to our benchmark prototype except that the new prototype economy

has an investment wedge that resembles a tax on capital income rather than a tax on

investment. Here the government consumption wedge is set equal to zero.

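Now the consumer’s budget constraint is

\[
c_t + k_{t+1} = (1-\tau_{kt})\,r_t k_t + (1-\tau_{lt})\,w_t l_t + (1-\delta)k_t + T_t, \tag{8.2.6}
\]

and the efficiency wedge is

\[
A_t = \kappa\left(a_{1t}^{1-\phi}\,a_{2t}^{\phi}\right)^{\frac{\theta}{1-\theta}}\left[1-\theta(a_{1t}+a_{2t})\right], \tag{8.2.7}
\]

where $a_{1t} = \phi/(1+\tau_{1t})$, $a_{2t} = (1-\phi)/(1+\tau_{2t})$, $\kappa = [\phi^{\phi}(1-\phi)^{1-\phi}\theta^{\theta}]^{\frac{1}{1-\theta}}$, and $\tau_{1t}$ and $\tau_{2t}$ are the interest rate spreads in the detailed economy.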

Comparing the first-order conditions in the detailed economy with input-financing

frictions to those of the associated prototype economy with efficiency wedges leads imme-

diately to this proposition:

Proposition 1: Consider the prototype economy with resource constraint (8.1.2) and consumer budget constraint (8.2.6) with exogenous processes for the efficiency wedge $A_t$ given in (8.2.7), the labor wedge given by

\[
\frac{1}{1-\tau_{lt}} = \frac{1}{1-\theta}\left[1-\theta\left(\frac{\phi}{1+\tau^*_{1t}} + \frac{1-\phi}{1+\tau^*_{2t}}\right)\right], \tag{8.2.8}
\]

and the investment wedge given by $\tau_{kt} = \tau_{lt}$, where $\tau^*_{1t}$ and $\tau^*_{2t}$ are the interest rate spreads from the detailed economy with input-financing frictions. Then the equilibrium allocations for aggregate variables in the detailed economy are equilibrium allocations in this prototype economy.

Consider the following special case of Proposition 1 in which only the efficiency wedge fluctuates. Specifically, suppose that in the detailed economy the interest rate spreads $\tau_{1t}$ and $\tau_{2t}$ fluctuate over time, but in such a way that the weighted average of these spreads,

\[
a_{1t} + a_{2t} = \frac{\phi}{1+\tau_{1t}} + \frac{1-\phi}{1+\tau_{2t}}, \tag{8.2.9}
\]

is constant while $a_{1t}^{1-\phi}a_{2t}^{\phi}$ fluctuates. Then from (8.2.8) we see that the labor and investment wedges are constant, and from (8.2.7) we see that the efficiency wedge fluctuates. In this case, on average, financing frictions are unchanged, but relative distortions fluctuate.

An outside observer who attempted to fit the data generated by the detailed economy

with input-financing frictions to the prototype economy would identify the fluctuations in

relative distortions with fluctuations in technology and would see no fluctuations in either

the labor wedge 1 − τlt or the investment wedge τkt. In particular, periods in which the

relative distortions increase would be misinterpreted as periods of technological regress.
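The special case can be illustrated with a few lines of code. The sketch below transcribes (8.2.7) and (8.2.8) as stated and evaluates them for two pairs of spreads that share the same value of $a_{1t}+a_{2t}$. The parameter values ($\phi = \theta = 0.5$) and the function name are arbitrary choices made here for illustration, not values from the notes.

```python
import numpy as np

def wedges_from_spreads(tau1, tau2, phi, theta):
    """Efficiency wedge A_t from (8.2.7) and labor wedge 1 - tau_l from (8.2.8)."""
    a1 = phi / (1.0 + tau1)
    a2 = (1.0 - phi) / (1.0 + tau2)
    kappa = (phi**phi * (1.0 - phi)**(1.0 - phi) * theta**theta) ** (1.0 / (1.0 - theta))
    A = kappa * (a1**(1.0 - phi) * a2**phi) ** (theta / (1.0 - theta)) \
        * (1.0 - theta * (a1 + a2))
    labor_wedge = (1.0 - theta) / (1.0 - theta * (a1 + a2))
    return A, labor_wedge

phi, theta = 0.5, 0.5                      # illustrative parameter values
a_pairs = [(0.4, 0.4), (0.3, 0.5)]         # both pairs satisfy a1 + a2 = 0.8
results = []
for a1, a2 in a_pairs:
    tau1 = phi / a1 - 1.0                  # spread implied by a1
    tau2 = (1.0 - phi) / a2 - 1.0          # spread implied by a2
    results.append(wedges_from_spreads(tau1, tau2, phi, theta))
```

In the second pair the spread in sector 1 rises while that in sector 2 falls. The labor wedge is the same in both cases, but the efficiency wedge is lower in the second, which an outside observer would read as a fall in technology.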

8.2.2. Labor Wedges

Now we show that a monetary economy with sticky wages is equivalent to a (real) prototype

economy with labor wedges. In the detailed economy, the shocks are to monetary policy,

while in the prototype economy, the shocks are to the labor wedge.

A Detailed Economy With Sticky Wages

Consider a monetary economy populated by a large number of identical, infinitely lived

consumers. The economy consists of a competitive final goods producer and a continuum

of monopolistically competitive unions that set their nominal wages in advance of the

realization of shocks to the economy. Each union represents all consumers who supply a

specific type of labor.

In each period $t$, the commodities in this economy are a consumption-capital good, money, and a continuum of differentiated types of labor, indexed by $j \in [0,1]$. The technology for producing final goods from capital and a labor aggregate at history, or state, $s^t$ has constant returns to scale and is given by $y(s^t) = F(k(s^{t-1}), l(s^t))$, where $y(s^t)$ is output of the final good, $k(s^{t-1})$ is capital, and

\[
l(s^t) = \left[\int l(j,s^t)^{v}\,dj\right]^{\frac{1}{v}} \tag{8.2.10}
\]

is an aggregate of the differentiated types of labor $l(j,s^t)$.

The final goods producer in this economy behaves competitively. This producer has some initial capital stock $k(s^{-1})$ and accumulates capital according to $k(s^t) = (1-\delta)k(s^{t-1}) + x(s^t)$, where $x(s^t)$ is investment. The present discounted value of profits for this producer is

\[
\sum_{t=0}^{\infty}\sum_{s^t}Q(s^t)\left[P(s^t)y(s^t) - P(s^t)x(s^t) - W(s^{t-1})l(s^t)\right], \tag{8.2.11}
\]

where $Q(s^t)$ is the price of a dollar at $s^t$ in an abstract unit of account, $P(s^t)$ is the dollar price of final goods at $s^t$, and $W(s^{t-1})$ is the aggregate nominal wage at $s^t$, which depends on only $s^{t-1}$ because of wage stickiness.

The producer’s problem can be stated in two parts. First, the producer chooses sequences for capital $k(s^{t-1})$, investment $x(s^t)$, and aggregate labor $l(s^t)$ in order to maximize (8.2.11) given the production function and the capital accumulation law. The first-order conditions can be summarized by

\[
P(s^t)F_l(s^t) = W(s^{t-1}) \tag{8.2.12}
\]

\[
Q(s^t)P(s^t) = \sum_{s^{t+1}}Q(s^{t+1})P(s^{t+1})\left[F_k(s^{t+1}) + 1 - \delta\right]. \tag{8.2.13}
\]

Second, for any given amount of aggregate labor $l(s^t)$, the producer’s demand for each type of differentiated labor is given by the solution to

\[
\min_{l(j,s^t),\,j\in[0,1]}\int W(j,s^{t-1})\,l(j,s^t)\,dj \tag{8.2.14}
\]

subject to (8.2.10); here $W(j,s^{t-1})$ is the nominal wage for differentiated labor of type $j$. Nominal wages are set by unions before the realization of the event in period $t$; thus, wages depend on, at most, $s^{t-1}$. The demand for labor of type $j$ by the final goods producer is

\[
l^d(j,s^t) = \left[\frac{W(s^{t-1})}{W(j,s^{t-1})}\right]^{\frac{1}{1-v}}l(s^t), \tag{8.2.15}
\]

where $W(s^{t-1}) \equiv \left[\int W(j,s^{t-1})^{\frac{v}{v-1}}\,dj\right]^{\frac{v-1}{v}}$ is the aggregate nominal wage. The minimized value in (8.2.14) is, thus, $W(s^{t-1})l(s^t)$.
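These claims can be checked numerically. The sketch below is an illustration only: it replaces the continuum of labor types with a finite number of types of equal measure, so the integrals become simple averages, and it assumes $0 < v < 1$. It then confirms that the demands in (8.2.15) reproduce the target aggregate in (8.2.10) and that the wage bill equals $W(s^{t-1})l(s^t)$. The wage vector and function names are hypothetical.

```python
import numpy as np

def aggregate_wage(W_j, v):
    """Aggregate wage index; the integral over j is approximated by an average."""
    return np.mean(W_j ** (v / (v - 1.0))) ** ((v - 1.0) / v)

def labor_demand(W_j, l, v):
    """Demand for each labor type, equation (8.2.15)."""
    W = aggregate_wage(W_j, v)
    return (W / W_j) ** (1.0 / (1.0 - v)) * l

def labor_aggregate(l_j, v):
    """Labor aggregate, equation (8.2.10), with the integral as an average."""
    return np.mean(l_j ** v) ** (1.0 / v)

v = 0.8                                   # assumed curvature, 0 < v < 1
W_j = np.array([0.9, 1.0, 1.1, 1.3])      # hypothetical union wages
l = 2.0                                   # target aggregate labor

l_j = labor_demand(W_j, l, v)
implied_l = labor_aggregate(l_j, v)       # should reproduce l
wage_bill = np.mean(W_j * l_j)            # should equal W * l
W = aggregate_wage(W_j, v)
```

Types with higher wages receive less labor, with elasticity $1/(1-v)$ with respect to the relative wage.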

In this economy, consumers can be thought of as being organized into a continuum of

unions indexed by j. Each union consists of all the consumers in the economy with labor


of type j. Each union realizes that it faces a downward-sloping demand for its type of

labor, given by (8.2.15). In each period, the new wages are set before the realization of

the economy’s current shocks.

The preferences of a representative consumer in the $j$th union are

\[
\sum_{t=0}^{\infty}\sum_{s^t}\beta^t\pi_t(s^t)\left[U\!\left(c(j,s^t), l(j,s^t)\right) + V\!\left(M(j,s^t)/P(s^t)\right)\right], \tag{8.2.16}
\]

where $c(j,s^t)$, $l(j,s^t)$, $M(j,s^t)$ are the consumption, labor supply, and money holdings of this consumer, and $P(s^t)$ is the economy’s overall price level. Note that the utility function is separable in real balances. This economy has complete markets for state-contingent nominal claims. The asset structure is represented by a set of complete, contingent, one-period nominal bonds. Let $B(j,s^{t+1})$ denote the consumers’ holdings of such a bond purchased in period $t$ at history $s^t$, with payoffs contingent on some particular event $s_{t+1}$ in $t+1$, where $s^{t+1} = (s^t, s_{t+1})$. One unit of this bond pays one dollar in period $t+1$ if the particular event $s_{t+1}$ occurs and 0 otherwise. Let $Q(s^{t+1}|s^t)$ denote the dollar price of this bond in period $t$ at history $s^t$, where $Q(s^{t+1}|s^t) = Q(s^{t+1})/Q(s^t)$.

The problem of the $j$th union is to maximize (8.2.16) subject to the budget constraint

\[
P(s^t)c(j,s^t) + M(j,s^t) + \sum_{s^{t+1}}Q(s^{t+1}|s^t)B(j,s^{t+1}) \le W(j,s^{t-1})l(j,s^t) + M(j,s^{t-1}) + B(j,s^t) + P(s^t)T(s^t) + D(s^t),
\]

the constraint $l(j,s^t) = l^d(j,s^t)$, and the borrowing constraint $B(s^{t+1}) \ge -P(s^t)b$, where $l^d(j,s^t)$ is given by (8.2.15). Here $T(s^t)$ denotes transfers and the positive constant $b$ constrains the amount of real borrowing by the union. Also, $D(s^t) = P(s^t)y(s^t) - P(s^t)x(s^t) - W(s^{t-1})l(s^t)$ are the dividends paid by the firms. The initial conditions $M(j,s^{-1})$ and $B(j,s^0)$ are given and assumed to be the same for all $j$. Notice that in this problem, the union chooses the wage and agrees to supply whatever labor is demanded at that wage.

The first-order conditions for this problem can be summarized by

\[
\frac{V_m(j,s^t)}{P(s^t)} - \frac{U_c(j,s^t)}{P(s^t)} + \beta\sum_{s^{t+1}}\pi(s^{t+1}|s^t)\frac{U_c(j,s^{t+1})}{P(s^{t+1})} = 0, \tag{8.2.17}
\]

\[
Q(s^t|s^{t-1}) = \beta\pi_t(s^t|s^{t-1})\frac{U_c(j,s^t)}{U_c(j,s^{t-1})}\frac{P(s^{t-1})}{P(s^t)}, \text{ and} \tag{8.2.18}
\]

\[
W(j,s^{t-1}) = -\frac{\sum_{s^t}Q(s^t)P(s^t)\,\dfrac{U_l(j,s^t)}{U_c(j,s^t)}\,l^d(j,s^t)}{v\sum_{s^t}Q(s^t)\,l^d(j,s^t)}. \tag{8.2.19}
\]


Here $\pi_t(s^{t+1}|s^t) = \pi_t(s^{t+1})/\pi_t(s^t)$ is the conditional probability of $s^{t+1}$ given $s^t$. Notice that in a steady state, (8.2.19) reduces to $W/P = (1/v)(-U_l/U_c)$, so that real wages are set as a markup over the marginal rate of substitution between labor and consumption. Given the symmetry among the unions, all of them choose the same consumption, labor, money balances, bond holdings, and wages, which are denoted simply by $c(s^t)$, $l(s^t)$, $M(s^t)$, $B(s^{t+1})$, and $W(s^t)$.
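The steady-state reduction of (8.2.19) can be seen in one line. As a sketch, assume a deterministic steady state in which prices, wages, and allocations are constant, so that the sums over $s^t$ collapse to a single term (this simplification is made here only for illustration):

\[
W = -\frac{Q\,P\,\dfrac{U_l}{U_c}\,l^d}{v\,Q\,l^d} = -\frac{P}{v}\frac{U_l}{U_c}
\quad\Longrightarrow\quad
\frac{W}{P} = \frac{1}{v}\left(-\frac{U_l}{U_c}\right).
\]

If $0 < v < 1$ (an assumption consistent with the demand elasticity $1/(1-v)$ in (8.2.15) exceeding one), then $1/v > 1$ is the markup of the real wage over the marginal rate of substitution.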

Consider next the specification of the money supply process and the market-clearing conditions for this sticky-wage economy. The nominal money supply process is given by $M(s^t) = \mu(s^t)M(s^{t-1})$, where $\mu(s^t)$ is a stochastic process. New money balances are distributed to consumers in a lump-sum fashion by having nominal transfers satisfy $P(s^t)T(s^t) = M(s^t) - M(s^{t-1})$. The resource constraint for this economy is $c(s^t) + k(s^t) = y(s^t) + (1-\delta)k(s^{t-1})$. Bond market-clearing requires that $B(s^{t+1}) = 0$.

The Associated Prototype Economy

Consider now a real prototype economy with labor wedges and the production function for final goods given above in the detailed economy with sticky wages. The representative firm maximizes (8.2.11) subject to the capital accumulation law given above. The first-order conditions can be summarized by (8.2.12) and (8.2.13). The representative consumer maximizes

\[
\sum_{t=0}^{\infty}\sum_{s^t}\beta^t\pi_t(s^t)\,U\!\left(c(s^t), l(s^t)\right)
\]

subject to the budget constraint

\[
c(s^t) + \sum_{s^{t+1}}q(s^{t+1}|s^t)\,b(s^{t+1}) \le [1-\tau_l(s^t)]\,w(s^t)\,l(s^t) + b(s^t) + v(s^t) + d(s^t)
\]

with $w(s^t)$ replacing $W(s^{t-1})/P(s^t)$ and $q(s^{t+1}|s^t)$ replacing $Q(s^{t+1})P(s^{t+1})/[Q(s^t)P(s^t)]$ and a bound on real bond holdings, where the lowercase letters $q$, $b$, $w$, $v$, and $d$ denote the real values of bond prices, debt, wages, lump-sum transfers, and dividends. Here the first-order condition for bonds is identical to that in (8.2.18) once symmetry has been imposed with $q(s^t|s^{t-1})$ replacing $Q(s^t|s^{t-1})P(s^t)/P(s^{t-1})$. The first-order condition for labor is given by

\[
-\frac{U_l(s^t)}{U_c(s^t)} = \left(1-\tau_l(s^t)\right)w(s^t).
\]

Consider an equilibrium of the sticky wage economy for some given stochastic process $M^*(s^t)$ on money supply. Denote all of the allocations and prices in this equilibrium with asterisks. Then this proposition can be easily established:

Proposition 2: Consider the prototype economy just described with labor wedges given by

\[
1-\tau_l(s^t) = -\frac{U^*_l(s^t)}{U^*_c(s^t)}\frac{1}{F^*_l(s^t)}, \tag{8.2.20}
\]

where $U^*_l(s^t)$, $U^*_c(s^t)$, and $F^*_l(s^t)$ are evaluated at the equilibrium of the sticky wage economy and where real transfers are equal to the real value of transfers in the sticky wage economy adjusted for the interest cost of holding money. Then the equilibrium allocations and prices in the sticky wage economy are the same as those in the prototype economy.

The proof of this proposition is immediate from comparing the first-order conditions,

the budget constraints, and the resource constraints for the prototype economy with labor

wedges to those of the detailed economy with sticky wages. The key idea is that distortions

in the sticky-wage economy between the marginal product of labor implicit in (8.2.19) and

the marginal rate of substitution between leisure and consumption are perfectly captured

by the labor wedges (8.2.20) in the prototype economy.

8.3. The Accounting Procedure

Having established our equivalence result, we now describe our accounting procedure at a

conceptual level and discuss a Markovian implementation of it.

Our procedure is to conduct experiments that isolate the marginal effect of each wedge

as well as the marginal effects of combinations of these wedges on aggregate variables. In

the experiment in which we isolate the marginal effect of the efficiency wedge, for example,

we hold the other wedges fixed at some constant values in all periods. In conducting this

experiment, we ensure that the probability distribution of the efficiency wedge coincides

with that in the prototype economy. In effect, we ensure that agents’ expectations of

how the efficiency wedge will evolve are the same as in the prototype economy. For each

experiment, we compare the properties of the resulting equilibria to those of the prototype

economy. These comparisons, together with our equivalence results, allow us to identify

promising classes of detailed economies.

8.3.1. The Accounting Procedure at a Conceptual Level

Suppose for now that the stochastic process $\pi_t(s^t)$ and the realizations of the state $s^t$ in some particular episode are known. Recall that the prototype economy has one underlying (vector-valued) random variable, the state $s^t$, which has a probability of $\pi_t(s^t)$. All of the other stochastic variables, including the four wedges (the efficiency wedge $A_t(s^t)$, the labor wedge $1-\tau_{lt}(s^t)$, the investment wedge $1/[1+\tau_{xt}(s^t)]$, and the government consumption wedge $g_t(s^t)$), are simply functions of this random variable. Hence, when the state $s^t$ is known, so are the wedges.

To evaluate the effects of just the efficiency wedge, for example, we consider an economy, referred to as an efficiency wedge alone economy, with the same underlying state $s^t$ and probability $\pi_t(s^t)$ and the same function $A_t(s^t)$ for the efficiency wedge as in the prototype economy, but in which the other three wedges are set to constants, in that $\tau_{lt}(s^t) = \tau_l$, $\tau_{xt}(s^t) = \tau_x$, and $g_t(s^t) = g$. Note that this construction ensures that the probability distribution of the efficiency wedge in this economy is identical to that in the prototype economy.

For the efficiency wedge alone economy, we then compute the equilibrium outcomes

associated with the realizations of the state st in a particular episode and compare these

outcomes to those of the economy with all four wedges. We find this comparison to be

of particular interest because in our applications, the realizations st are such that the

economy with all four wedges exactly reproduces the data on output, labor, investment,

and consumption.

In a similar manner, we define the labor wedge alone economy, the investment wedge

alone economy, and the government consumption wedge alone economy, as well as economies

with a combination of wedges such as the efficiency and labor wedge economy.
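The construction of the inputs for these experiments can be sketched as below. This is only an illustration of the bookkeeping: given measured wedge series, it keeps the chosen wedges as they are and replaces the others with constants. It does not compute the equilibrium, and it says nothing about how agents' expectations are formed, which the procedure handles by keeping the underlying state and its probabilities unchanged. The function name, dictionary keys, and numbers are hypothetical.

```python
import numpy as np

def wedge_alone(wedges, keep, constants):
    """Build wedge paths for a 'wedge alone' experiment.

    wedges    : dict mapping wedge name to its realized series
    keep      : set of wedge names allowed to fluctuate
    constants : dict mapping every other wedge name to its constant value
    """
    paths = {}
    for name, series in wedges.items():
        series = np.asarray(series, dtype=float)
        if name in keep:
            paths[name] = series.copy()                          # realized path
        else:
            paths[name] = np.full_like(series, constants[name])  # held fixed
    return paths

# Hypothetical realized wedges over three periods.
wedges = {
    "A": [1.00, 0.97, 0.95],
    "tau_l": [0.30, 0.33, 0.35],
    "tau_x": [0.10, 0.12, 0.11],
    "g": [0.20, 0.21, 0.19],
}
constants = {"tau_l": 0.30, "tau_x": 0.10, "g": 0.20}
efficiency_alone = wedge_alone(wedges, keep={"A"}, constants=constants)
```

Passing `keep={"A", "tau_l"}` would give the inputs for the efficiency and labor wedge economy in the same way.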

8.3.2. A Markovian Implementation

So far we have described our procedure assuming that we know the stochastic process

πt(st) and that we can observe the state st. In practice, of course, we need to either specify

the stochastic process a priori or use data to estimate it, and we need to uncover the state

st from the data. Here we describe a set of assumptions that makes these efforts easy.

Then we describe in detail the three steps involved in implementing our procedure.

We assume that the state $s_t$ follows a Markov process of the form $\pi(s_t|s_{t-1})$ and that the wedges in period $t$ can be used to uniquely uncover the event $s_t$, in the sense that the mapping from the event $s_t$ to the wedges $(A_t, \tau_{lt}, \tau_{xt}, g_t)$ is one-to-one and onto. Given this assumption, without loss of generality, let the underlying event $s_t = (s_{At}, s_{lt}, s_{xt}, s_{gt})$, and let $A_t(s^t) = s_{At}$, $\tau_{lt}(s^t) = s_{lt}$, $\tau_{xt}(s^t) = s_{xt}$, and $g_t(s^t) = s_{gt}$. Note that we have effectively assumed that agents use only past wedges to forecast future wedges and that the wedges in period $t$ are sufficient statistics for the event in period $t$.


The first step in our procedure is to use data on yt, lt, xt, and gt from an actual

economy to estimate the parameters of the Markov process π(st|st−1). We can do so using

a variety of methods, including the maximum likelihood procedure described below.

The second step in our procedure is to uncover the event st by measuring the realized

wedges. We measure the government consumption wedge directly from the data as the sum

of government spending and net exports. To obtain the values of the other three wedges,

we use the data and the model’s decision rules. With $y_t^d$, $l_t^d$, $x_t^d$, $g_t^d$, and $k_0^d$ denoting the
data and y(st, kt), l(st, kt), and x(st, kt) denoting the decision rules of the model, the
realized wedge series $s_t^d$ solves

\[
y_t^d = y(s_t^d, k_t), \quad l_t^d = l(s_t^d, k_t), \quad \text{and} \quad x_t^d = x(s_t^d, k_t), \tag{8.3.1}
\]

with kt+1 = (1 − δ)kt + xdt , k0 = kd0 , and gt = gdt . Note that we construct a series for the

capital stock using the capital accumulation law (8.1.1), data on investment xt, and an

initial choice of capital stock k0. In effect, we solve for the three unknown elements of the

vector st using the three equations (8.1.3)–(8.1.5) and thereby uncover the state. We use

the associated values for the wedges in our experiments.
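The construction of the capital series in the second step can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `capital_series` and the numbers fed to it are hypothetical, and the depreciation rate and initial capital stock must be supplied by the user.

```python
import numpy as np

def capital_series(x, k0, delta):
    """Iterate k[t+1] = (1 - delta) * k[t] + x[t], starting from k[0] = k0."""
    x = np.asarray(x, dtype=float)
    k = np.empty(len(x) + 1)
    k[0] = k0
    for t in range(len(x)):
        k[t + 1] = (1.0 - delta) * k[t] + x[t]
    return k

# Illustrative numbers only (not data from the notes).
k_path = capital_series(x=[1.0, 1.2, 0.9], k0=10.0, delta=0.1)
```

The resulting path for kt is what enters the decision rules in (8.3.1) alongside the unknown wedges.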

Note that the four wedges account for all of the movement in output, labor, investment,

and government consumption, in that if we feed the four wedges into the three decision

rules in (8.3.1) and use gt(sdt ) = sgt along with the law of motion for capital, we simply

recover the original data.

Note also that, in measuring the realized wedges, the estimated stochastic process

plays a role in measuring only the investment wedge. To see that the stochastic process

does not play a role in measuring the efficiency and labor wedges, note that these wedges

can equivalently be directly calculated from (8.1.3) and (8.1.4) without computing the

equilibrium of the model. In contrast, calculating the investment wedge requires computing

the equilibrium of the model because the right side of (8.1.5) has expectations over future

values of consumption, the capital stock, the wedges, and so on. The equilibrium of the

model depends on these expectations and, therefore, on the stochastic process driving the

wedges.

The third step in our procedure is to conduct experiments to isolate the marginal

effects of the wedges. To do that, we allow a subset of the wedges to fluctuate as they do

in the data while the others are set to constants. To evaluate the effects of the efficiency

wedge, we compute the decision rules for the efficiency wedge alone economy, denoted

ye(st, kt), le(st, kt), and xe(st, kt), in which At(st) = sAt, τlt(st) = τl, τxt(st) = τx, and
gt(st) = g. Starting from kd0, we then use sdt, the decision rules, and the capital accumulation
law to compute the realized sequence of output, labor, and investment, yet, let, and xet,
which we call the efficiency wedge components of output, labor, and investment. We compare
these components to output, labor, and investment in the data. Other components

are computed and compared similarly.

Notice that in this experiment we computed the decision rules for an economy in

which only one wedge fluctuates and the others are set to be constants in all events. The

fluctuations in the one wedge are driven by fluctuations in a four-dimensional state st.

Notice also that our experiments are designed to separate out the direct effect and the

forecasting effect of fluctuations in wedges. As a wedge fluctuates, it directly affects either

budget constraints or resource constraints. This fluctuation also affects the forecasts of

that wedge as well as of other wedges in the future. Our experiments are designed so that

when we hold a particular wedge constant, we eliminate the direct effect of that wedge,

but we retain its forecasting effect on the other wedges. By doing so, we ensure that

expectations of the fluctuating wedges are identical to those in the prototype economy.

Chapter 9.

Structural VARs

9.1. A Version of the RBC Model


Consider an economy with households, firms, and the government. The representative

household chooses consumption, investment, and labor to solve the following maximization

problem:

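\[
\max_{\{c_t, x_t, l_t\}} \; E \sum_{t=0}^{\infty} \beta^t U(c_t, 1 - l_t) N_t
\]

subject to

\[
\begin{aligned}
(1 + \tau_{ct}) c_t + (1 + \tau_{xt}) x_t &= (1 - \tau_{kt}) r_t k_t + (1 - \tau_{lt}) w_t l_t + \tau_{kt} \delta k_t + \mathit{tr}_t \\
N_{t+1} k_{t+1} &= [(1 - \delta) k_t + x_t] N_t \\
c_t, x_t &\ge 0 \quad \text{in all states}
\end{aligned}
\]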

taking processes for the rental rate, wage rate, the tax rates, and transfers as given. The

representative firm solves a simple static problem at t:

\[
\max_{\{K_t, L_t\}} \; F(K_t, Z_t L_t) - r_t K_t - w_t L_t.
\]

The government sets rates of taxes and transfers in such a way that its budget constraint

at t, namely,

\[
G_t + N_t \mathit{tr}_t = \tau_{kt} (r_t - \delta) N_t k_t + \tau_{lt} w_t l_t N_t + \tau_{ct} N_t c_t + \tau_{xt} N_t x_t
\]

is satisfied. In equilibrium, the following conditions must hold:

Nt (ct + xt) +Gt = F (Kt, ZtLt) (9.1.1)

Ntkt = Kt

Ntlt = Lt.


We now derive first-order conditions in this economy. The Lagrangian for the household

optimization problem is given by

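\[
\begin{aligned}
L = E \sum_t \beta^t N_t \Big\{ & U(c_t, 1 - l_t) \\
& + \mu_t \big[ (1 - \tau_{kt}) r_t k_t + (1 - \tau_{lt}) w_t l_t + \tau_{kt} \delta k_t + \mathit{tr}_t - (1 + \tau_{ct}) c_t - (1 + \tau_{xt}) x_t \big] \\
& + \lambda_t \big[ (1 - \delta) k_t + x_t - (1 + g_n) k_{t+1} \big] \Big\}.
\end{aligned}
\]

Here, $g_n$ is the population growth rate, so that $N_{t+1} = (1 + g_n) N_t$, and it is assumed that the investment decision will be interior.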

The relevant first-order conditions are found by taking derivatives of L with respect

to ct, lt, xt, and kt+1:

0 = U1 (ct, 1 − lt) − µt (1 + τct)

0 = −U2 (ct, 1 − lt) + µt (1 − τlt)wt

0 = −µt (1 + τxt) + λt

0 = − (1 + gn)λt + Etµt+1 [(1 − τkt+1) rt+1 + δτkt+1] + λt+1 (1 − δ)

Eliminating multipliers yields:

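\[
\frac{U_2(c_t, 1 - l_t)}{U_1(c_t, 1 - l_t)} = \frac{1 - \tau_{lt}}{1 + \tau_{ct}} w_t \tag{9.1.2}
\]

\[
\frac{1 + \tau_{xt}}{1 + \tau_{ct}} U_1(c_t, 1 - l_t) = \beta E_t \left[ \frac{U_1(c_{t+1}, 1 - l_{t+1})}{1 + \tau_{ct+1}} \Big\{ (1 - \tau_{kt+1}) r_{t+1} + \delta \tau_{kt+1} + (1 - \delta)(1 + \tau_{xt+1}) \Big\} \right]. \tag{9.1.3}
\]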

In addition, there are first-order conditions for the firm’s static problem. These are

rt = F1 (Kt, ZtLt) (9.1.4)

wt = F2 (Kt, ZtLt)Zt. (9.1.5)

Finally, there is a resource constraint given by (9.1.1).

From here on, the following functional form assumptions and auxiliary choices are

made:

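\[
F(k, l) = k^{\theta} l^{1-\theta} \tag{9.1.6}
\]

\[
U(c, 1 - l) = \left( c (1 - l)^{\psi} \right)^{1-\sigma} / (1 - \sigma) \tag{9.1.7}
\]

\[
\tau_{kt} = \tau_{ct} = 0
\]

\[
s_t = [\log z_t, \tau_{lt}, \tau_{xt}, \log g_t]'
\]

\[
s_{t+1} = P_0 + P s_t + Q \eta_{s,t+1}, \quad \eta_s \sim N(0_{4 \times 1}, I_{4 \times 4}). \tag{9.1.8}
\]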

The tax rate τc has been set to 0 in all periods since it plays a similar role to τl in distorting

the labor-leisure choice. Similarly, τk has been set to 0 since it plays a similar role to τx

in distorting the intertemporal margin.

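If we substitute the choices (9.1.6)-(9.1.7) into (9.1.1) and (9.1.2)-(9.1.5), then substitute
the equilibrium rates rt and wt into (9.1.2) and (9.1.3), we have:

\[
N_t (c_t + g_t) + N_{t+1} k_{t+1} - (1 - \delta) N_t k_t = (N_t k_t)^{\theta} (Z_t N_t l_t)^{1-\theta} \tag{9.1.9}
\]

\[
\frac{\psi c_t}{1 - l_t} = (1 - \tau_{lt})(1 - \theta)(N_t k_t)^{\theta} Z_t^{1-\theta} (N_t l_t)^{-\theta} \tag{9.1.10}
\]

\[
(1 + \tau_{xt}) c_t^{-\sigma} (1 - l_t)^{\psi(1-\sigma)} = \beta E_t \left[ c_{t+1}^{-\sigma} (1 - l_{t+1})^{\psi(1-\sigma)} \Big\{ \theta (N_{t+1} k_{t+1})^{\theta-1} (Z_{t+1} N_{t+1} l_{t+1})^{1-\theta} + (1 - \delta)(1 + \tau_{xt+1}) \Big\} \right]. \tag{9.1.11}
\]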


We first normalize the variables as follows:

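\[
\hat{c}_t = c_t / Z_t, \quad \hat{x}_t = x_t / Z_t, \quad \hat{g}_t = g_t / Z_t, \quad \hat{y}_t = y_t / Z_t, \quad \hat{k}_t = k_t / Z_{t-1}.
\]

In the equations that follow, hats are suppressed, so that ct, xt, gt, yt, and kt denote the normalized variables, and zt = Zt/Zt−1 denotes the growth rate of technology.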

Using the functional forms for F and U in (9.1.6) and (9.1.7), respectively, the equilibrium

rental and wage rates are:

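\[
\begin{aligned}
r_t &= \theta K_t^{\theta-1} (Z_t L_t)^{1-\theta} = \theta k_t^{\theta-1} (z_t l_t)^{1-\theta} \\
w_t &= (1 - \theta) K_t^{\theta} (Z_t L_t)^{-\theta} Z_t = (1 - \theta) k_t^{\theta} (z_t l_t)^{-\theta} Z_t.
\end{aligned}
\]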

This implies the following first-order conditions

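\[
c_t + g_t + (1 + g_n) k_{t+1} - (1 - \delta) z_t^{-1} k_t = y_t = k_t^{\theta} l_t^{1-\theta} z_t^{-\theta} \tag{9.1.12}
\]

\[
\frac{\psi c_t}{1 - l_t} = (1 - \tau_{lt})(1 - \theta) k_t^{\theta} (z_t l_t)^{-\theta} \tag{9.1.13}
\]

\[
(1 + \tau_{xt}) c_t^{-\sigma} (1 - l_t)^{\psi(1-\sigma)} = \beta E_t \left[ z_{t+1}^{-\sigma} c_{t+1}^{-\sigma} (1 - l_{t+1})^{\psi(1-\sigma)} \Big\{ \theta k_{t+1}^{\theta-1} (z_{t+1} l_{t+1})^{1-\theta} + (1 - \delta)(1 + \tau_{xt+1}) \Big\} \right]. \tag{9.1.14}
\]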

Next, we compute the steady state of the system for constant values for z, the taxes,

and government spending:

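\[
k/l = \left( \frac{(1 + \tau_x)\left(1 - \beta z^{-\sigma}(1 - \delta)\right)}{\beta z^{-\sigma} \theta z^{1-\theta}} \right)^{1/(\theta-1)}
\]

\[
c = \left[ (k/l)^{\theta-1} z^{-\theta} - (1 + g_n) + (1 - \delta) z^{-1} \right] k - g = \xi_1 k - g
\]

\[
c = \left[ (1 - \tau_l)(1 - \theta)(k/l)^{\theta} z^{-\theta} / \psi \right] \left( 1 - \frac{1}{k/l}\, k \right) = \xi_2 - \xi_3 k
\]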
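The two expressions for c can be equated to solve for k, which gives k = (ξ2 + g)/(ξ1 + ξ3). The following is a minimal sketch of that calculation, not a calibrated exercise: the function name `steady_state` and all parameter values are illustrative assumptions.

```python
import numpy as np

def steady_state(theta, delta, beta, sigma, psi, z, gn, tau_l, tau_x, g):
    """Steady state (k, l, c) of the normalized system (9.1.12)-(9.1.14)."""
    bz = beta * z ** (-sigma)
    # Capital-labor ratio from the steady-state version of (9.1.14).
    kl = ((1 + tau_x) * (1 - bz * (1 - delta))
          / (bz * theta * z ** (1 - theta))) ** (1.0 / (theta - 1))
    # Coefficients xi_1, xi_2, xi_3 as defined in the text.
    xi1 = kl ** (theta - 1) * z ** (-theta) - (1 + gn) + (1 - delta) / z
    xi2 = (1 - tau_l) * (1 - theta) * kl ** theta * z ** (-theta) / psi
    xi3 = xi2 / kl
    # Equate c = xi1*k - g with c = xi2 - xi3*k.
    k = (xi2 + g) / (xi1 + xi3)
    c = xi1 * k - g
    l = k / kl
    return k, l, c

# Hypothetical parameter values, chosen only to exercise the formulas.
params = dict(theta=0.33, delta=0.06, beta=0.97, sigma=1.0, psi=2.0,
              z=1.02, gn=0.01, tau_l=0.3, tau_x=0.0, g=0.1)
k_ss, l_ss, c_ss = steady_state(**params)
```

A quick check on any candidate solution is to substitute it back into the steady-state versions of (9.1.12)-(9.1.14) and confirm that the residuals are zero.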


Assume that the solution for the capital decision takes the form:

log kt+1 = γk log kt + γ [ log zt τlt τxt log gt ]′+ constant, (9.1.15)

where γk is a scalar and γ is 1 × 4 and equal to [γz, γl, γx, γg]. Assume the residual from

the dynamic first-order condition (9.1.14) can be written (after substitutions from (9.1.12)

and (9.1.13)):

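\[
\begin{aligned}
& f\left( E_t \log k_{t+2}, \log k_{t+1}, \log k_t, \log z_{t+1}, \log z_t, \tau_{lt+1}, \tau_{lt}, \tau_{xt+1}, \tau_{xt}, \log g_{t+1}, \log g_t \right) \\
& \qquad \approx a_0 E_t \log k_{t+2} + a_1 \log k_{t+1} + a_2 \log k_t + b_0 E_t s_{t+1} + b_1 s_t.
\end{aligned}
\]

Then the general solution algorithm is to find γk that solves the quadratic equation

\[
a_0 \gamma_k^2 + a_1 \gamma_k + a_2 = 0,
\]

and γ that solves the linear equations:

\[
a_0 \gamma_k \gamma + a_0 \gamma P + a_1 \gamma + b_0 P + b_1 = 0_{1 \times 4}.
\]

Note that this implies:

\[
\gamma' = -\left[ (a_0 \gamma_k + a_1) I_{4 \times 4} + a_0 P' \right]^{-1} (b_0 P + b_1 I_{4 \times 4})'. \tag{9.1.16}
\]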

Once we have values for the coefficients γk and γ, we can use (9.1.12) and (9.1.13) to
back out ct and lt (either nonlinearly or by way of a log-linear approximation).

One property of the solution that we use later is the fact that γk = −γz. This is true
because kt is everywhere divided by zt in the first-order conditions (9.1.12)-(9.1.13).
Thus, when the first-order conditions are log-linearized, the same coefficients hit
log(kt) and − log(zt).
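The two-step algorithm (a quadratic for γk, then a linear system for γ) can be sketched as follows. This is an illustration under stated assumptions: the function name `solve_capital_rule` is hypothetical, the coefficient values are made up, and selecting the real root inside the unit circle is an assumption about which root is wanted that the text above does not spell out.

```python
import numpy as np

def solve_capital_rule(a0, a1, a2, b0, b1, P):
    """Return (gamma_k, gamma) for the log-linear capital rule (9.1.15)."""
    roots = np.roots([a0, a1, a2])
    # Assumption: keep the unique real root inside the unit circle.
    stable = [r.real for r in roots if abs(r.imag) < 1e-10 and abs(r) < 1.0]
    if len(stable) != 1:
        raise ValueError("expected exactly one stable real root")
    gamma_k = stable[0]
    n = P.shape[0]
    # Transposed form of the linear equations, as in (9.1.16).
    lhs = (a0 * gamma_k + a1) * np.eye(n) + a0 * P.T
    gamma = -np.linalg.solve(lhs, b0 @ P + b1)
    return gamma_k, gamma

# Hypothetical coefficients; in practice they come from log-linearizing (9.1.14).
a0, a1, a2 = 1.0, -2.5, 1.0
P = np.diag([0.95, 0.9, 0.8, 0.7])
b0 = np.array([0.1, -0.2, 0.3, 0.05])
b1 = np.array([-0.4, 0.2, 0.1, -0.1])
gamma_k, gamma = solve_capital_rule(a0, a1, a2, b0, b1, P)
```

With these made-up coefficients the quadratic has roots 0.5 and 2, and the sketch keeps 0.5.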

Given values for the coefficients in (9.1.15), we can derive expressions for labor,
consumption, and investment using the static first-order conditions. In particular, we
log-linearize (9.1.13) after substituting in for consumption from (9.1.12):

\[
\begin{aligned}
0 \approx{} & \psi \Big\{ k^{\theta} l^{1-\theta} z^{-\theta} \big[ \theta (\log k_t - \log z_t) + (1 - \theta) \log l_t \big] \\
& \quad - (1 + g_n) k \log k_{t+1} + (1 - \delta) z^{-1} k (\log k_t - \log z_t) - g \log g_t \Big\} \\
& + (1 - \theta)(1 - \tau_l) k^{\theta} (z l)^{-\theta} (1 - l) \Big\{ \frac{1}{1 - \tau_l} \tau_{lt} - \theta \log k_t + \theta (\log l_t + \log z_t) + \frac{l}{1 - l} \log l_t \Big\},
\end{aligned}
\]

which can be written succinctly as

log lt = φlk log kt + φlz log zt + φllτlt + φlg log gt + φlk′ log kt+1.

With this equation for log l, we use the production relation and the capital accumulation

equation to write log y and log x as follows:

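\[
\begin{aligned}
\log y_t &= (\theta + (1 - \theta) \phi_{lk}) \log k_t + ((1 - \theta) \phi_{lz} - \theta) \log z_t \\
&\quad + (1 - \theta) \left[ \phi_{ll} \tau_{lt} + \phi_{lg} \log g_t + \phi_{lk'} \log k_{t+1} \right]
\end{aligned}
\]

\[
\begin{aligned}
\log x_t &\approx (1 + g_n) \frac{k}{x} \log k_{t+1} - (1 - \delta) z^{-1} \frac{k}{x} (\log k_t - \log z_t) \\
&\equiv \phi_{xk} \log k_t + \phi_{xz} \log z_t + \phi_{xk'} \log k_{t+1}.
\end{aligned} \tag{9.1.18}
\]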

Finally, we can log-linearize (9.1.12) to get

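\[
\begin{aligned}
\log c_t \approx{} & \Big\{ y \big[ \theta (\log k_t - \log z_t) + (1 - \theta) \log l_t \big] - g \log g_t \\
& \quad - (1 + g_n) k \log k_{t+1} + (1 - \delta) z^{-1} k \big[ \log k_t - \log z_t \big] \Big\} / c \\
={} & \big[ \theta y/c + (1 - \theta) \phi_{lk}\, y/c + (1 - \delta) k/(cz) \big] \log k_t \\
& - \big[ \theta y/c - (1 - \theta) \phi_{lz}\, y/c + (1 - \delta) k/(cz) \big] \log z_t \\
& + \big[ (1 - \theta) \phi_{ll}\, y/c \big] \tau_{lt} \\
& + \big[ (1 - \theta) \phi_{lg}\, y/c - g/c \big] \log g_t \\
& + \big[ (1 - \theta) \phi_{lk'}\, y/c - (1 + g_n) k/c \big] \log k_{t+1} \\
\equiv{} & \phi_{ck} \log k_t + \phi_{cz} \log z_t + \phi_{cl} \tau_{lt} + \phi_{cg} \log g_t + \phi_{ck'} \log k_{t+1}.
\end{aligned} \tag{9.1.19}
\]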

9.2. VARs and the 2-Shock Version of the Model

9.2.1. The Decision Functions

Assume the economy has only two shocks and they are orthogonal: a unit root in

technology log z and an AR(1) in the tax rate on labor τ. (For convenience we drop l on

τlt throughout this section.) The capital decision function has the form:

log kt+1 = γ0 + γk log kt + γz log zt + γlτt

and the labor decision function can be written:

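\[
\begin{aligned}
\log l_t &= \phi_{lz} \log z_t + \phi_{ll} \tau_t + \phi_{lk} \log k_t + \phi_{lk'} \log k_{t+1} \\
&= \phi_{lz} \log z_t + \phi_{ll} \tau_t + \phi_{lk} \log k_t + \phi_{lk'} \left[ \gamma_0 + \gamma_k \log k_t + \gamma_z \log z_t + \gamma_l \tau_t \right] \\
&= (\phi_{lk} + \phi_{lk'} \gamma_k) \log k_t + (\phi_{lz} + \phi_{lk'} \gamma_z) \log z_t + (\phi_{ll} + \phi_{lk'} \gamma_l) \tau_t,
\end{aligned}
\]

where the constant term $\phi_{lk'} \gamma_0$ has been dropped in the last line.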

These imply that output from a Cobb-Douglas production technology with capital share

θ is:

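\[
\begin{aligned}
\log y_t &= \theta (\log k_t - \log z_t) + (1 - \theta) \log l_t \\
&= (\theta + (1 - \theta) \phi_{lk}) \log k_t - (\theta - (1 - \theta) \phi_{lz}) \log z_t + (1 - \theta) \phi_{ll} \tau_t \\
&\quad + (1 - \theta) \phi_{lk'} \log k_{t+1} \\
&= (\theta + (1 - \theta)(\phi_{lk} + \phi_{lk'} \gamma_k)) \log k_t - (\theta - (1 - \theta)(\phi_{lz} + \phi_{lk'} \gamma_z)) \log z_t \\
&\quad + (1 - \theta)(\phi_{ll} + \phi_{lk'} \gamma_l) \tau_t
\end{aligned}
\]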

We can write the capital stock in terms of all lagged shocks as follows:

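\[
\begin{aligned}
\log k_t &= \gamma_0 + \gamma_k \left( \gamma_0 + \gamma_k \log k_{t-2} + \gamma_z \log z_{t-2} + \gamma_l \tau_{t-2} \right) + \gamma_z \log z_{t-1} + \gamma_l \tau_{t-1} \\
&= \gamma_0 \left[ 1 + \gamma_k + \gamma_k^2 + \ldots \right] \\
&\quad + \gamma_z \left[ \log z_{t-1} + \gamma_k \log z_{t-2} + \gamma_k^2 \log z_{t-3} + \ldots \right] \\
&\quad + \gamma_l \left[ \tau_{t-1} + \gamma_k \tau_{t-2} + \gamma_k^2 \tau_{t-3} + \ldots \right]
\end{aligned}
\]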

or in differences as follows:

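\[
\begin{aligned}
\log k_t - \log k_{t-1} &= \gamma_z \left[ \log z_{t-1} + (\gamma_k - 1) \left( \log z_{t-2} + \gamma_k \log z_{t-3} + \gamma_k^2 \log z_{t-4} + \ldots \right) \right] \\
&\quad + \gamma_l \left[ \tau_{t-1} + (\gamma_k - 1) \left( \tau_{t-2} + \gamma_k \tau_{t-3} + \gamma_k^2 \tau_{t-4} + \ldots \right) \right]
\end{aligned}
\]

or in quasi-differences as follows:

\[
\begin{aligned}
\log k_t - \alpha \log k_{t-1} &= \gamma_z \left[ \log z_{t-1} + (\gamma_k - \alpha) \left( \log z_{t-2} + \gamma_k \log z_{t-3} + \gamma_k^2 \log z_{t-4} + \ldots \right) \right] \\
&\quad + \gamma_l \left[ \tau_{t-1} + (\gamma_k - \alpha) \left( \tau_{t-2} + \gamma_k \tau_{t-3} + \gamma_k^2 \tau_{t-4} + \ldots \right) \right]
\end{aligned}
\]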

We can also write hours in terms of past shocks as follows:

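\[
\begin{aligned}
\log l_t ={} & \phi_{lz} \log z_t + \phi_{ll} \tau_t + \phi_{lk} \log k_t + \phi_{lk'} \log k_{t+1} \\
={} & \phi_{lz} \log z_t + \phi_{ll} \tau_t \\
& + \phi_{lk} \gamma_z \left[ \log z_{t-1} + \gamma_k \log z_{t-2} + \gamma_k^2 \log z_{t-3} + \ldots \right] \\
& + \phi_{lk} \gamma_l \left[ \tau_{t-1} + \gamma_k \tau_{t-2} + \gamma_k^2 \tau_{t-3} + \ldots \right] \\
& + \phi_{lk'} \gamma_z \left[ \log z_t + \gamma_k \log z_{t-1} + \gamma_k^2 \log z_{t-2} + \ldots \right] \\
& + \phi_{lk'} \gamma_l \left[ \tau_t + \gamma_k \tau_{t-1} + \gamma_k^2 \tau_{t-2} + \ldots \right] \\
={} & \left[ (\phi_{lz} + \phi_{lk'} \gamma_z) \log z_t + (\phi_{lk} + \phi_{lk'} \gamma_k) \gamma_z \log z_{t-1} + (\phi_{lk} + \phi_{lk'} \gamma_k) \gamma_k \gamma_z \log z_{t-2} + \ldots \right] \\
& + \left[ (\phi_{ll} + \phi_{lk'} \gamma_l) \tau_t + (\phi_{lk} + \phi_{lk'} \gamma_k) \gamma_l \tau_{t-1} + (\phi_{lk} + \phi_{lk'} \gamma_k) \gamma_k \gamma_l \tau_{t-2} + \ldots \right]
\end{aligned}
\]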

where constant terms have been ignored.

We can write logged hours in differences as follows:

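\[
\begin{aligned}
\log l_t - \log l_{t-1} ={} & \phi_{lz} (\log z_t - \log z_{t-1}) + \phi_{ll} (\tau_t - \tau_{t-1}) \\
& + \phi_{lk'} (\log k_{t+1} - \log k_t) + \phi_{lk} (\log k_t - \log k_{t-1}) \\
={} & \phi_{lz} (\log z_t - \log z_{t-1}) + \phi_{ll} (\tau_t - \tau_{t-1}) \\
& + \phi_{lk'} \gamma_z \left[ \log z_t + (\gamma_k - 1) \left( \log z_{t-1} + \gamma_k \log z_{t-2} + \gamma_k^2 \log z_{t-3} + \ldots \right) \right] \\
& + \phi_{lk'} \gamma_l \left[ \tau_t + (\gamma_k - 1) \left( \tau_{t-1} + \gamma_k \tau_{t-2} + \gamma_k^2 \tau_{t-3} + \ldots \right) \right] \\
& + \phi_{lk} \gamma_z \left[ \log z_{t-1} + (\gamma_k - 1) \left( \log z_{t-2} + \gamma_k \log z_{t-3} + \gamma_k^2 \log z_{t-4} + \ldots \right) \right] \\
& + \phi_{lk} \gamma_l \left[ \tau_{t-1} + (\gamma_k - 1) \left( \tau_{t-2} + \gamma_k \tau_{t-3} + \gamma_k^2 \tau_{t-4} + \ldots \right) \right] \\
={} & [\phi_{lz} + \phi_{lk'} \gamma_z] \log z_t - [\phi_{lz} - \phi_{lk} \gamma_z - \phi_{lk'} \gamma_z (\gamma_k - 1)] \log z_{t-1} \\
& + \gamma_z (\gamma_k - 1) [\phi_{lk'} \gamma_k + \phi_{lk}] \left( \log z_{t-2} + \gamma_k \log z_{t-3} + \gamma_k^2 \log z_{t-4} + \ldots \right) \\
& + [\phi_{ll} + \phi_{lk'} \gamma_l] \tau_t - [\phi_{ll} - \phi_{lk} \gamma_l - \phi_{lk'} \gamma_l (\gamma_k - 1)] \tau_{t-1} \\
& + \gamma_l (\gamma_k - 1) [\phi_{lk'} \gamma_k + \phi_{lk}] \left( \tau_{t-2} + \gamma_k \tau_{t-3} + \gamma_k^2 \tau_{t-4} + \ldots \right)
\end{aligned}
\]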

or in quasi-difference form as follows:

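\[
\begin{aligned}
\log l_t - \alpha \log l_{t-1} ={} & \phi_{lz} (\log z_t - \alpha \log z_{t-1}) + \phi_{ll} (\tau_t - \alpha \tau_{t-1}) \\
& + \phi_{lk'} (\log k_{t+1} - \alpha \log k_t) + \phi_{lk} (\log k_t - \alpha \log k_{t-1}) \\
={} & \phi_{lz} (\log z_t - \alpha \log z_{t-1}) + \phi_{ll} (\tau_t - \alpha \tau_{t-1}) \\
& + \phi_{lk'} \gamma_z \left[ \log z_t + (\gamma_k - \alpha) \left( \log z_{t-1} + \gamma_k \log z_{t-2} + \gamma_k^2 \log z_{t-3} + \ldots \right) \right] \\
& + \phi_{lk'} \gamma_l \left[ \tau_t + (\gamma_k - \alpha) \left( \tau_{t-1} + \gamma_k \tau_{t-2} + \gamma_k^2 \tau_{t-3} + \ldots \right) \right] \\
& + \phi_{lk} \gamma_z \left[ \log z_{t-1} + (\gamma_k - \alpha) \left( \log z_{t-2} + \gamma_k \log z_{t-3} + \gamma_k^2 \log z_{t-4} + \ldots \right) \right] \\
& + \phi_{lk} \gamma_l \left[ \tau_{t-1} + (\gamma_k - \alpha) \left( \tau_{t-2} + \gamma_k \tau_{t-3} + \gamma_k^2 \tau_{t-4} + \ldots \right) \right] \\
={} & [\phi_{lz} + \phi_{lk'} \gamma_z] \log z_t - [\alpha \phi_{lz} - \phi_{lk} \gamma_z - \phi_{lk'} \gamma_z (\gamma_k - \alpha)] \log z_{t-1} \\
& + \gamma_z (\gamma_k - \alpha) [\phi_{lk'} \gamma_k + \phi_{lk}] \left( \log z_{t-2} + \gamma_k \log z_{t-3} + \gamma_k^2 \log z_{t-4} + \ldots \right) \\
& + [\phi_{ll} + \phi_{lk'} \gamma_l] \tau_t - [\alpha \phi_{ll} - \phi_{lk} \gamma_l - \phi_{lk'} \gamma_l (\gamma_k - \alpha)] \tau_{t-1} \\
& + \gamma_l (\gamma_k - \alpha) [\phi_{lk'} \gamma_k + \phi_{lk}] \left( \tau_{t-2} + \gamma_k \tau_{t-3} + \gamma_k^2 \tau_{t-4} + \ldots \right)
\end{aligned}
\]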

We can use the expressions for output and hours to write out the change in produc-

tivity as follows:

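\[
\begin{aligned}
& \log (y_t / l_t) - \log (y_{t-1} / l_{t-1}) \\
& = \log y_t - \log y_{t-1} + \log z_t - \log l_t + \log l_{t-1} \\
& = \log z_t + \theta \left( \log k_t - \log k_{t-1} - \log l_t + \log l_{t-1} - \log z_t + \log z_{t-1} \right) \\
& = (1 - \theta) \log z_t + \theta \log z_{t-1} - \theta \left( \log l_t - \log l_{t-1} - \log k_t + \log k_{t-1} \right) \\
& = (1 - \theta) \log z_t + \theta \log z_{t-1} \\
& \quad - \theta \Big\{ [\phi_{lz} + \phi_{lk'} \gamma_z] \log z_t - [\phi_{lz} - (\phi_{lk} - 1) \gamma_z - \phi_{lk'} \gamma_z (\gamma_k - 1)] \log z_{t-1} \\
& \qquad + \gamma_z (\gamma_k - 1) [\phi_{lk'} \gamma_k + \phi_{lk} - 1] \left( \log z_{t-2} + \gamma_k \log z_{t-3} + \gamma_k^2 \log z_{t-4} + \ldots \right) \\
& \qquad + [\phi_{ll} + \phi_{lk'} \gamma_l] \tau_t - [\phi_{ll} - (\phi_{lk} - 1) \gamma_l - \phi_{lk'} \gamma_l (\gamma_k - 1)] \tau_{t-1} \\
& \qquad + \gamma_l (\gamma_k - 1) [\phi_{lk'} \gamma_k + \phi_{lk} - 1] \left( \tau_{t-2} + \gamma_k \tau_{t-3} + \gamma_k^2 \tau_{t-4} + \ldots \right) \Big\} \\
& = \left( 1 - \theta - \theta [\phi_{lz} + \phi_{lk'} \gamma_z] \right) \log z_t \\
& \quad + \theta [1 + \phi_{lz} - (\phi_{lk} - 1) \gamma_z - \phi_{lk'} \gamma_z (\gamma_k - 1)] \log z_{t-1} \\
& \quad - \theta \gamma_z (\gamma_k - 1) [\phi_{lk'} \gamma_k + \phi_{lk} - 1] \left( \log z_{t-2} + \gamma_k \log z_{t-3} + \gamma_k^2 \log z_{t-4} + \ldots \right) \\
& \quad - \theta [\phi_{ll} + \phi_{lk'} \gamma_l] \tau_t \\
& \quad + \theta [\phi_{ll} - (\phi_{lk} - 1) \gamma_l - \phi_{lk'} \gamma_l (\gamma_k - 1)] \tau_{t-1} \\
& \quad - \theta \gamma_l (\gamma_k - 1) [\phi_{lk'} \gamma_k + \phi_{lk} - 1] \left( \tau_{t-2} + \gamma_k \tau_{t-3} + \gamma_k^2 \tau_{t-4} + \ldots \right)
\end{aligned}
\]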

9.2.2. The Model’s Moving Average

The moving average for the model is given by:

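\[
\begin{bmatrix} (1 - L) \log (y_t / l_t) \\ (1 - \alpha L) \log l_t \end{bmatrix} \equiv X_t = D_0 \omega_t + D_1 \omega_{t-1} + D_2 \omega_{t-2} + \ldots
\]

where ωt = [log zt, τt]′ and

\[
D_0 = \begin{bmatrix} 1 - \theta - \theta (\phi_{lz} + \phi_{lk'} \gamma_z) & -\theta (\phi_{ll} + \phi_{lk'} \gamma_l) \\ \phi_{lz} + \phi_{lk'} \gamma_z & \phi_{ll} + \phi_{lk'} \gamma_l \end{bmatrix}
\]

\[
D_1 = \begin{bmatrix} \theta (1 + \phi_{lz} - (\phi_{lk} - 1) \gamma_z - \phi_{lk'} \gamma_z (\gamma_k - 1)) & \theta (\phi_{ll} - (\phi_{lk} - 1) \gamma_l - \phi_{lk'} \gamma_l (\gamma_k - 1)) \\ -\alpha \phi_{lz} + (\phi_{lk} + \phi_{lk'} (\gamma_k - \alpha)) \gamma_z & -\alpha \phi_{ll} + (\phi_{lk} + \phi_{lk'} (\gamma_k - \alpha)) \gamma_l \end{bmatrix}
\]

\[
D_2 = \begin{bmatrix} -\theta \gamma_z (\gamma_k - 1) [\phi_{lk'} \gamma_k + \phi_{lk} - 1] & -\theta \gamma_l (\gamma_k - 1) [\phi_{lk'} \gamma_k + \phi_{lk} - 1] \\ (\phi_{lk} + \phi_{lk'} \gamma_k)(\gamma_k - \alpha) \gamma_z & (\phi_{lk} + \phi_{lk'} \gamma_k)(\gamma_k - \alpha) \gamma_l \end{bmatrix}
\]

and Dj = γkDj−1 for j ≥ 3.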

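Let a = φlk + φlk′γk and b = φll + φlk′γl. Also, note that φlz = −φlk and γz = −γk hold in the model economy with a unit root in technology. Then

\[
D_0 = \begin{bmatrix} 1 - \theta + \theta a & -\theta b \\ -a & b \end{bmatrix}
\]

\[
D_1 = \begin{bmatrix} \theta (1 - \gamma_k)(1 - a) & \theta (b + (1 - a) \gamma_l) \\ (\alpha - \gamma_k) a & -\alpha b + \gamma_l a \end{bmatrix}
\]

\[
D_2 = \begin{bmatrix} \gamma_k (1 - a) \theta (1 - \gamma_k) & -\gamma_l (1 - a) \theta (1 - \gamma_k) \\ \gamma_k a (\alpha - \gamma_k) & -\gamma_l a (\alpha - \gamma_k) \end{bmatrix}
\]

and, again, $D_j = \gamma_k^{j-2} D_2$ for j ≥ 3. Note that D2 is singular.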

If τt is an AR(1), it is more convenient to write the MA process in terms of ηt =
[log zt, ηlt]′ rather than in terms of ωt. In this case,

\[
X_t = D_0 \eta_t + (D_0 P + D_1) \eta_{t-1} + (D_0 P^2 + D_1 P + D_2) \eta_{t-2} + (D_0 P^3 + D_1 P^2 + D_2 P + D_3) \eta_{t-3} + \ldots
\]

We normalize the MA so it has an identity for the first coefficient. That is, set $C_0 = I$,
$C_1 = (D_0 P + D_1) D_0^{-1}$, and $C_j = C_{j-1} D_0 P D_0^{-1} + D_j D_0^{-1}$.

9.2.3. Special Property of the D’s

Next, we will see that the D matrices have a special property that will be exploited when

we characterize coefficients of the VAR found by regressing Xt on lags of itself. The D’s

for the RBC model satisfy the relation:

\[
\left( \gamma_k I - (D_0 P^2 + D_1 P + D_2)(D_0 P + D_1)^{-1} \right) D_2 = 0. \tag{9.2.1}
\]

One method of proof is to multiply all terms of the matrices in (9.2.1) and show that all

elements are zero. We have done this but the algebra is messy.

A simpler proof is as follows. Note that

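\[
D_2 = \begin{bmatrix} (1 - a) \theta (1 - \gamma_k) \\ (\alpha - \gamma_k) a \end{bmatrix} \begin{bmatrix} \gamma_k & -\gamma_l \end{bmatrix} \equiv g h'. \tag{9.2.2}
\]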

Thus, we can rewrite the left hand side of (9.2.1) as follows

\[
\begin{aligned}
& \left( \gamma_k I - (D_0 P^2 + D_1 P + D_2)(D_0 P + D_1)^{-1} \right) D_2 \\
& \qquad = \left[ \gamma_k (g h') - (g h')(D_0 P + D_1)^{-1} (g h') \right] - \left[ (D_0 P + D_1) P (D_0 P + D_1)^{-1} g h' \right].
\end{aligned} \tag{9.2.3}
\]

We will prove that both terms in square brackets in (9.2.3) are equal to 2×2 zero matrices.
The first step of the proof is to show that

\[
(D_0 P + D_1)^{-1} g = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \tag{9.2.4}
\]

The proof of this step is trivial since the first column of D0P + D1 is equal to g. Substituting
(9.2.4) into (9.2.3), the result (9.2.1) follows immediately from the fact that h′[1, 0]′ = γk
and P[1, 0]′ = 0.

9.2.4. VAR Coefficients

Given expressions for the D coefficients in the model MA, and thus the normalized C
coefficients, we can directly write out expressions for the coefficients in the VAR of Xt
regressed on lags of itself. We will denote the VAR coefficients by Bj, j = 1, 2, . . .. They
are related to the MA coefficients as follows:

\[
B_j = C_j - B_1 C_{j-1} - B_2 C_{j-2} - \ldots - B_{j-1} C_1. \tag{9.2.5}
\]

9.2.5. Proposition 1: Model has infinite-order VAR

Proposition 1. The model described above has a VAR representation with coefficients Bj

that satisfy

Bj = MBj−1 (9.2.6)

for j ≥ 2, with $B_1 = C_1 = (D_0 P + D_1) D_0^{-1}$. The matrix M is a 2×2 matrix with
eigenvalues equal to α and

\[
\frac{\gamma_k - \gamma_l a / b - \theta}{1 - \theta},
\]

where a = φlk + φlk′γk and b = φll + φlk′γl are the coefficients on k and τl in the labor
decision function.
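Proposition 1 can be checked numerically before working through the proof. The sketch below builds the D matrices in their (a, b) form, the normalized C's, and the VAR coefficients from (9.2.5), and then compares Bj with MBj−1. All parameter values are made up for illustration, the helper names are hypothetical, and P = diag(0, ρ) is an assumption consistent with P[1, 0]′ = 0 and an AR(1) tax rate with persistence ρ.

```python
import numpy as np

def ma_matrices(a, b, gk, gl, alpha, theta, n):
    """Return [D_0, ..., D_n] in the (a, b) form of Section 9.2.2."""
    D0 = np.array([[1 - theta + theta * a, -theta * b],
                   [-a, b]])
    D1 = np.array([[theta * (1 - gk) * (1 - a), theta * (b + (1 - a) * gl)],
                   [(alpha - gk) * a, -alpha * b + gl * a]])
    g = np.array([(1 - a) * theta * (1 - gk), (alpha - gk) * a])
    h = np.array([gk, -gl])
    D2 = np.outer(g, h)  # D_2 = g h'
    return [D0, D1] + [gk ** (j - 2) * D2 for j in range(2, n + 1)]

def normalized_ma(D, P):
    """C_0 = I and C_j = C_{j-1} D_0 P D_0^{-1} + D_j D_0^{-1}."""
    D0inv = np.linalg.inv(D[0])
    C = [np.eye(2)]
    for j in range(1, len(D)):
        C.append(C[j - 1] @ D[0] @ P @ D0inv + D[j] @ D0inv)
    return C

def var_coefficients(C):
    """B_j = C_j - B_1 C_{j-1} - ... - B_{j-1} C_1; B[0] is a placeholder."""
    B = [None]
    for j in range(1, len(C)):
        Bj = C[j].copy()
        for i in range(1, j):
            Bj = Bj - B[i] @ C[j - i]
        B.append(Bj)
    return B

# Illustrative parameter values (assumptions, not a calibration).
a, b, gk, gl, alpha, theta, rho = 0.3, -0.5, 0.9, -0.1, 0.95, 0.33, 0.8
P = np.array([[0.0, 0.0], [0.0, rho]])
D = ma_matrices(a, b, gk, gl, alpha, theta, n=8)
C = normalized_ma(D, P)
B = var_coefficients(C)
M = C[2] @ np.linalg.inv(C[1]) - C[1]
lam2 = (gk - gl * a / b - theta) / (1 - theta)
eig_M = np.sort(np.linalg.eigvals(M).real)
```

Under these assumptions, `B[j]` should agree with `M @ B[j - 1]` for every j ≥ 2, and `eig_M` should contain α and `lam2`.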

Proof of Proposition 1. Choose $M = C_2 C_1^{-1} - C_1$. Using the formula (9.2.5) for the VAR
coefficient, it is easy to show that $M = B_2 B_1^{-1}$. Therefore, B2 = MB1 holds. Consider
the next coefficient. Using the formula (9.2.5), we have

\[
\begin{aligned}
B_3 - M B_2 &= C_3 - B_1 C_2 - B_2 C_1 - M (C_2 - B_1 C_1) \\
&= C_3 - B_1 C_2 - M C_2 \\
&= C_3 - C_1 C_2 - \left( C_2 C_1^{-1} - C_1 \right) C_2 \\
&= C_3 - C_2 C_1^{-1} C_2 \\
&= C_2 D_0 P D_0^{-1} + \gamma_k D_2 D_0^{-1} - C_2 C_1^{-1} \left( C_1 D_0 P D_0^{-1} + D_2 D_0^{-1} \right) \\
&= \gamma_k D_2 D_0^{-1} - C_2 C_1^{-1} D_2 D_0^{-1} \\
&= \left( \gamma_k I - C_2 C_1^{-1} \right) D_2 D_0^{-1} \\
&= \left( \gamma_k I - (D_0 P^2 + D_1 P + D_2)(D_0 P + D_1)^{-1} \right) D_2 D_0^{-1} \\
&= 0
\end{aligned}
\]

where the last relation follows from (9.2.1), established in Section 9.2.3. The same
calculation can be done for any j∗ using the fact that (9.2.6) holds for all j < j∗, namely

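\[
\begin{aligned}
B_j - M B_{j-1} &= C_j - B_1 C_{j-1} - \ldots - B_{j-1} C_1 - M (C_{j-1} - \ldots - B_{j-2} C_1) \\
&= C_j - B_1 C_{j-1} - M C_{j-1} \\
&= C_j - C_1 C_{j-1} - \left( C_2 C_1^{-1} - C_1 \right) C_{j-1} \\
&= C_j - C_2 C_1^{-1} C_{j-1} \\
&= C_{j-1} D_0 P D_0^{-1} + D_j D_0^{-1} - C_2 C_1^{-1} \left( C_{j-2} D_0 P D_0^{-1} + D_{j-1} D_0^{-1} \right) \\
&= C_{j-1} D_0 P D_0^{-1} + \gamma_k^{j-2} D_2 D_0^{-1} - C_2 C_1^{-1} \left( C_{j-2} D_0 P D_0^{-1} + \gamma_k^{j-3} D_2 D_0^{-1} \right) \\
&= \left( C_{j-1} - C_2 C_1^{-1} C_{j-2} \right) D_0 P D_0^{-1} + \gamma_k^{j-3} \left( \gamma_k I - C_2 C_1^{-1} \right) D_2 D_0^{-1} \\
&= (B_{j-1} - M B_{j-2}) D_0 P D_0^{-1} + \gamma_k^{j-3} \left( \gamma_k I - C_2 C_1^{-1} \right) D_2 D_0^{-1} \\
&= \gamma_k^{j-3} \left( \gamma_k I - C_2 C_1^{-1} \right) D_2 D_0^{-1} \\
&= \gamma_k^{j-3} \left( \gamma_k I - (D_0 P^2 + D_1 P + D_2)(D_0 P + D_1)^{-1} \right) D_2 D_0^{-1} \\
&= 0.
\end{aligned}
\]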

Next, we prove that the two eigenvalues of M are λ1 = α and λ2 = (γk − γla/b − θ)/(1 − θ).
One way to do this is to write out all of the terms for matrix M and derive expressions
for the trace and the determinant. The trace is equal to the sum of the eigenvalues and
the determinant is equal to the product of the eigenvalues. These are two equations in two
unknowns. We have done this but the algebra is messy.

A simpler proof that the eigenvalues are λ1 = α and λ2 = (γk − γla/b− θ)/(1 − θ) is

as follows. Using (9.2.2) and the definitions of the C’s in terms of the D’s, we can derive

the following expression for M in terms of the D’s, P , and h:

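\[
\begin{aligned}
M &= C_2 C_1^{-1} - C_1 \\
&= (D_0 P^2 + D_1 P + D_2)(D_0 P + D_1)^{-1} - (D_0 P + D_1) D_0^{-1} \\
&= (D_0 P + D_1) P (D_0 P + D_1)^{-1} + D_2 (D_0 P + D_1)^{-1} \\
&\quad - (D_0 P + D_1) D_0^{-1} (D_0 P + D_1)(D_0 P + D_1)^{-1} \\
&= D_2 (D_0 P + D_1)^{-1} - (D_0 P + D_1) D_0^{-1} D_1 (D_0 P + D_1)^{-1} \\
&= (D_0 P + D_1) [1, 0]' h' (D_0 P + D_1)^{-1} - (D_0 P + D_1) D_0^{-1} D_1 (D_0 P + D_1)^{-1} \\
&= (D_0 P + D_1) \left( [1, 0]' h' - D_0^{-1} D_1 \right) (D_0 P + D_1)^{-1}.
\end{aligned}
\]

Since M is similar to the matrix in parentheses, the eigenvalues of M are equal to the
eigenvalues of the simpler matrix $[1, 0]' h' - D_0^{-1} D_1$, which is equal to

\[
[1, 0]' h' - D_0^{-1} D_1 = \frac{1}{1 - \theta} \begin{bmatrix} \gamma_k - \theta (1 - a + a \alpha) & -\gamma_l - \theta b (1 - \alpha) \\ (\gamma_k - \alpha - \theta (1 - a)(1 - \alpha))\, a / b & \alpha - \theta (a + \alpha (1 - a)) - \gamma_l a / b \end{bmatrix}
\]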

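Taking the trace, we get

\[
\operatorname{trace} \left( [1, 0]' h' - D_0^{-1} D_1 \right) = \alpha + \frac{\gamma_k - \gamma_l a / b - \theta}{1 - \theta}. \tag{9.2.7}
\]

Taking the determinant, we get

\[
\det \left( [1, 0]' h' - D_0^{-1} D_1 \right) = \alpha \times \frac{\gamma_k - \gamma_l a / b - \theta}{1 - \theta}. \tag{9.2.8}
\]

The two equations (9.2.7) and (9.2.8) uniquely determine the two eigenvalues, which are
those proposed.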

9.2.6. Blanchard-Quah Identification

We now consider the procedure of Blanchard and Quah (1989) when applied to data from

the 2-shock version of the model described above.

Blanchard and Quah start with a VAR

Xt = B1Xt−1 + B2Xt−2 + . . . + BpXt−p + vt, Evtv′t = Ω

which is estimated using time series Xt. As described above, this implies the MA

Xt = vt + C1vt−1 + C2vt−2 + . . . . (9.2.9)

Some structure is needed to derive a “structural MA” with shocks that have economic

interpretation. In this case, we will use the following notation for the structural MA:

Xt = A0εt + A1εt−1 + A2εt−2 + . . . . (9.2.10)

where A0εt = vt and Aj = CjA0.

Because we will impose restrictions on the sums of the A’s and C’s, we define

C = I + C1 + C2 + C3 + . . .

A = A0 +A1 +A2 + A3 + . . .

= CA0.

Since A0εt = vt, it must be the case that A0 E εtε′t A′0 = Ω. Blanchard and Quah
assume that the elements of εt are orthogonal and demand shocks do not have a long-run
effect on productivity. Without loss of generality, we can normalize the magnitude of the
variances of the elements of εt and assume, therefore, that

A0A′0 = Ω

C (1, 1)A0 (1, 2) + C (1, 2)A0 (2, 2) = 0 (9.2.11)

which is four equations in the four unknown elements of A0. Condition (9.2.11) ensures

that the demand shock does not have a long-run effect on productivity. Writing out the

system of 4 equations and 4 unknowns yields:

\[
\begin{aligned}
\omega_{11} &= A_0(1,1)^2 + A_0(1,2)^2 \\
\omega_{12} &= A_0(1,1) A_0(2,1) + A_0(1,2) A_0(2,2) \\
\omega_{22} &= A_0(2,1)^2 + A_0(2,2)^2 \\
0 &= C(1,1) A_0(1,2) + C(1,2) A_0(2,2)
\end{aligned} \tag{9.2.12}
\]

Eliminate A0(1, 2) using the fact that A0(1, 2) = −C(1, 2)A0(2, 2)/C(1, 1):

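\[
\begin{aligned}
\omega_{11} &= A_0(1,1)^2 + f^2 A_0(2,2)^2 \\
\omega_{12} &= A_0(1,1) A_0(2,1) + f A_0(2,2)^2 \\
\omega_{22} &= A_0(2,1)^2 + A_0(2,2)^2
\end{aligned}
\]

where f = −C(1, 2)/C(1, 1). Solve for A0(1, 1) and A0(2, 1):

\[
A_0(1,1) = \left[ \omega_{11} - f^2 A_0(2,2)^2 \right]^{1/2} \tag{9.2.13}
\]

\[
A_0(2,1) = \left[ \omega_{22} - A_0(2,2)^2 \right]^{1/2} \tag{9.2.14}
\]

and substitute to get:

\[
\omega_{12} = f A_0(2,2)^2 + \left[ \omega_{11} - f^2 A_0(2,2)^2 \right]^{1/2} \left[ \omega_{22} - A_0(2,2)^2 \right]^{1/2}.
\]

Let $\lambda = A_0(2,2)^2$ and the result is a quadratic in λ:

\[
(\omega_{12} - f \lambda)^2 = (\omega_{11} - f^2 \lambda)(\omega_{22} - \lambda)
\]

which can be written out:

\[
\omega_{12}^2 - 2 f \lambda \omega_{12} + f^2 \lambda^2 = \omega_{11} \omega_{22} - f^2 \lambda \omega_{22} - \omega_{11} \lambda + f^2 \lambda^2
\]

and, since the $f^2 \lambda^2$ terms cancel, simplified as follows:

\[
\lambda = \frac{\omega_{11} \omega_{22} - \omega_{12}^2}{\omega_{11} + f^2 \omega_{22} - 2 f \omega_{12}}. \tag{9.2.15}
\]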

In addition, we need to impose sign conventions since impulse responses can be either

positive or negative. We will consider one sign convention for the demand shock and two

different sign conventions for the technology shock.

The demand shock in our example is a shock to the tax rate on labor. For this choice

of shock, we want to impose A0(2, 2) < 0 so that hours fall with a positive shock to the tax

rate on labor. For A0(2, 2), it must be the case that A0(2, 2) = −√λ since λ is positive.

Given A0(2, 2), it immediately follows that A0(1, 2) = fA0(2, 2).

9.2.6.1. Sign convention on A0(1, 1)

For the technology shock, we first consider the sign convention that productivity rises on

impact in response to a positive technology shock, namely A0(1, 1) > 0. In this case, we
need to use the positive root of $A_0(1,1)^2 = \omega_{11} - f^2 \lambda$:

\[
A_0(1,1) = \sqrt{\omega_{11} - f^2 \lambda}.
\]

Given a value for A0(1, 1), we have A0(2, 1) from:

A0 (2, 1) = (ω12 − fλ) /A0 (1, 1) .

9.2.6.2. Sign convention on A(1, 1)

The second sign convention assumes that the long-run response of productivity to a
technology shock is positive, so that A(1, 1) > 0 and therefore

C (1, 1)A0 (1, 1) + C (1, 2)A0 (2, 1) > 0.

This condition can also be written in terms of A0(1, 1) and known parameters:

C (1, 1)A0 (1, 1) + C (1, 2) (ω12 − fλ) /A0 (1, 1) > 0. (9.2.16)

In this case, we choose the sign on the square root of ω11−f2λ so that (9.2.16) is satisfied.

9.2.6.3. Full solution

The full solution is

A0 (2, 2) = −√λ

A0 (1, 2) = fA0 (2, 2)

A0 (1, 1) = root of ω11 − f2λ satisfying sign convention

A0 (2, 1) = (ω12 − fλ) /A0 (1, 1)

where λ is defined in (9.2.15) and f = −C(1, 2)/C(1, 1).
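The full solution translates directly into a short routine. The sketch below is illustrative only: the function name `blanchard_quah_impact`, the flag `long_run_sign`, and the matrices passed to it are assumptions made for the example, with `Cbar` standing for the sum of MA coefficients C.

```python
import numpy as np

def blanchard_quah_impact(Omega, Cbar, long_run_sign=False):
    """Impact matrix A0 from the closed-form solution in Section 9.2.6.3."""
    f = -Cbar[0, 1] / Cbar[0, 0]
    w11, w12, w22 = Omega[0, 0], Omega[0, 1], Omega[1, 1]
    lam = (w11 * w22 - w12 ** 2) / (w11 + f ** 2 * w22 - 2 * f * w12)  # (9.2.15)
    A22 = -np.sqrt(lam)          # hours fall after a positive tax shock
    A12 = f * A22
    A11 = np.sqrt(w11 - f ** 2 * lam)   # impact sign convention: A0(1,1) > 0
    if long_run_sign:
        # Long-run convention (9.2.16): flip the root if the inequality fails.
        if Cbar[0, 0] * A11 + Cbar[0, 1] * (w12 - f * lam) / A11 < 0:
            A11 = -A11
    A21 = (w12 - f * lam) / A11
    return np.array([[A11, A12], [A21, A22]])

# Hypothetical inputs, chosen only so that Omega is positive definite.
Omega = np.array([[1.0, 0.3], [0.3, 0.5]])
Cbar = np.array([[2.0, 0.5], [0.4, 1.5]])
A0 = blanchard_quah_impact(Omega, Cbar)
```

Two properties are worth checking on any output: A0A0′ should reproduce Ω, and the (1, 2) element of C A0 should be zero, which is the long-run restriction (9.2.11).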

9.2.6.4. Cholesky decomposition

In the literature, many report using the following formula for A0:

\[
A_0 = C^{-1} L
\]

where L is a lower triangular matrix with positive elements on the diagonal that
satisfies LL′ = CΩC′. This choice imposes the long-run restriction in (9.2.11) and the
long-run sign convention A(1, 1) > 0 automatically.

It does not impose A0(2, 2) < 0. However, in most cases, responses to demand shocks

are not discussed.
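The Cholesky route takes only a few lines. This is a sketch with made-up inputs, where `Cbar` again stands for the sum of MA coefficients C; `numpy.linalg.cholesky` returns the lower triangular factor with a positive diagonal.

```python
import numpy as np

# Hypothetical inputs (same illustrative values as one might use elsewhere).
Omega = np.array([[1.0, 0.3], [0.3, 0.5]])
Cbar = np.array([[2.0, 0.5], [0.4, 1.5]])

# L is lower triangular with positive diagonal and L L' = Cbar Omega Cbar'.
L = np.linalg.cholesky(Cbar @ Omega @ Cbar.T)

# A0 = Cbar^{-1} L, computed with a linear solve rather than an explicit inverse.
A0_chol = np.linalg.solve(Cbar, L)

# The long-run response matrix A = Cbar A0 equals L by construction.
A_long_run = Cbar @ A0_chol
```

Because the long-run matrix equals L, its (1, 2) element is zero and its (1, 1) element is positive, while the sign of A0(2, 2) is left unrestricted.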

9.2.7. Proposition 2: OLS Results

Proposition 1 says that the model has an infinite-lag vector autoregressive structure. The

next proposition considers the outcome when OLS regressions are run with one lag. Let

$V_0 = E X_t X_t'$ be the theoretical variance matrix for Xt. Let $V_1 = E X_t X_{t-1}'$ be the
covariance matrix for Xt and its lag. If $E D_0 \eta_t \eta_t' D_0' = \Omega$ is the theoretical
variance-covariance of the model’s shock vector, then

\[
V_0 = \Omega + C_1 \Omega C_1' + C_2 \Omega C_2' + \ldots \tag{9.2.17}
\]

\[
V_1 = C_1 \Omega + C_2 \Omega C_1' + C_3 \Omega C_2' + \ldots . \tag{9.2.18}
\]

Proposition 2. Assume that a regression is run of the form

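\[
X_t = B_{ols} X_{t-1} + v_t, \quad E v_t v_t' = \Omega_{ols}
\]

with Xt from the RBC model. Then, the variance-covariance matrix is

\[
\begin{aligned}
\Omega_{ols} &= V_0 - V_1 V_0^{-1} V_1' && (9.2.19) \\
&= \Omega + M \Omega M' - M \Omega V_0^{-1} \Omega M' && (9.2.20)
\end{aligned}
\]

where $M = C_2 C_1^{-1} - C_1$ and the inverse of the sum of MA coefficients is

\[
\begin{aligned}
C_{ols}^{-1} &= I - B_{ols} \\
&= C^{-1} + M (I - M)^{-1} C_1 + M (\Omega - V_0) V_0^{-1}.
\end{aligned}
\]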

In other words, the OLS matrices Ωols and Cols are not equal to their theoretical counter-

parts, Ω and C.
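The first part of Proposition 2 can be illustrated numerically by truncating the infinite sums in (9.2.17) and (9.2.18) and comparing (9.2.19) with (9.2.20). The sketch below uses made-up parameter values, assumes P = diag(0, ρ) and Q = diag(σz, στ), and truncates at 400 lags. None of these choices come from the text.

```python
import numpy as np

# Illustrative parameter values (assumptions, not a calibration).
a, b, gk, gl, alpha, theta, rho = 0.3, -0.5, 0.9, -0.1, 0.95, 0.33, 0.8
sig_z, sig_tau = 1.0, 0.5
P = np.array([[0.0, 0.0], [0.0, rho]])
Q = np.diag([sig_z, sig_tau])

# D matrices in the (a, b) form of Section 9.2.2.
D0 = np.array([[1 - theta + theta * a, -theta * b], [-a, b]])
D1 = np.array([[theta * (1 - gk) * (1 - a), theta * (b + (1 - a) * gl)],
               [(alpha - gk) * a, -alpha * b + gl * a]])
g = np.array([(1 - a) * theta * (1 - gk), (alpha - gk) * a])
D2 = np.outer(g, np.array([gk, -gl]))
D0inv = np.linalg.inv(D0)

# Normalized MA coefficients C_0, ..., C_n.
n = 400
C = [np.eye(2), (D0 @ P + D1) @ D0inv]
Dj = D2.copy()
for j in range(2, n + 1):
    C.append(C[-1] @ D0 @ P @ D0inv + Dj @ D0inv)
    Dj = gk * Dj  # D_{j+1} = gamma_k D_j for j >= 2

Omega = D0 @ Q @ Q.T @ D0.T
V0 = sum(Cj @ Omega @ Cj.T for Cj in C)                          # (9.2.17)
V1 = sum(C[j] @ Omega @ C[j - 1].T for j in range(1, n + 1))     # (9.2.18)
M = C[2] @ np.linalg.inv(C[1]) - C[1]
V0inv = np.linalg.inv(V0)

Omega_ols = V0 - V1 @ V0inv @ V1.T                                   # (9.2.19)
Omega_formula = Omega + M @ Omega @ M.T - M @ Omega @ V0inv @ Omega @ M.T  # (9.2.20)
B_ols = V1 @ V0inv
```

In this example `Omega_ols` and `Omega_formula` should agree up to truncation and rounding error, and both differ from `Omega` because M is not zero.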

Proof of Proposition 2. The relation (9.2.19) follows from the standard projection formulas,

\[
\begin{aligned}
B_{ols} &= \left( E X_t X_{t-1}' \right) \left( E X_{t-1} X_{t-1}' \right)^{-1} = V_1 V_0^{-1} \\
E v_t v_t' &= E (X_t - B_{ols} X_{t-1})(X_t - B_{ols} X_{t-1})' = V_0 - V_1 V_0^{-1} V_1'.
\end{aligned} \tag{9.2.21}
\]

Before substituting in (9.2.17) and (9.2.18), we can exploit the nature of the model’s MA.
In particular, we can use the fact that

Cj = (C1 +M)Cj−1, (9.2.22)

which follows from the formula (9.2.5) and Proposition 1. That is,

Cj − (C1 +M)Cj−1 = (Bj +Bj−1C1 +Bj−2C2 + . . .+B1Cj−1)

− C1Cj−1 −M (Bj−1 +Bj−2C1 + . . .+B1Cj−2)

= B1Cj−1 − C1Cj−1

= 0.

Thus, we can write V0 as follows:

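\[
V_0 = \Omega + C_1 \Omega C_1' + (C_1 + M) C_1 \Omega C_1' (C_1 + M)' + (C_1 + M)^2 C_1 \Omega C_1' (C_1 + M)^{2\prime} + \ldots
\]

which implies that

\[
V_0 = (C_1 + M) V_0 (C_1 + M)' + \Omega + C_1 \Omega C_1' - (C_1 + M) \Omega (C_1 + M)'. \tag{9.2.23}
\]

For V1,

\[
\begin{aligned}
V_1 &= C_1 \Omega + (C_1 + M) C_1 \Omega C_1' + (C_1 + M)^2 C_1 \Omega C_1' (C_1 + M)' + \ldots \\
&= (C_1 + M) V_0 - M \Omega.
\end{aligned} \tag{9.2.24}
\]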

Substituting (9.2.23) and (9.2.24) into (9.2.21) yields

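\[
\begin{aligned}
V_0 - V_1 V_0^{-1} V_1' &= V_0 - [(C_1 + M) V_0 - M \Omega] V_0^{-1} [(C_1 + M) V_0 - M \Omega]' \\
&= V_0 - (C_1 + M) V_0 (C_1 + M)' + M \Omega (C_1 + M)' \\
&\quad + (C_1 + M) \Omega M' - M \Omega V_0^{-1} \Omega M' \\
&= \Omega + C_1 \Omega C_1' - (C_1 + M) \Omega (C_1 + M)' + M \Omega (C_1 + M)' \\
&\quad + (C_1 + M) \Omega M' - M \Omega V_0^{-1} \Omega M' \\
&= \Omega + M \Omega M' - M \Omega V_0^{-1} \Omega M'
\end{aligned}
\]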

which is the same as (9.2.20). This proves the first part of the proposition.

For the second part, we need to construct the matrix Cols using the relation between

the AR coefficients and the MA coefficients in (9.2.5). In this case,

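\[
C_{ols} = (I - B_{ols})^{-1}.
\]

Thus, we have

\[
\begin{aligned}
C_{ols} &= \left( I - V_1 V_0^{-1} \right)^{-1} \\
&= \left( I - [(C_1 + M) V_0 - M \Omega] V_0^{-1} \right)^{-1} \\
&= \left( I - C_1 - M + M \Omega V_0^{-1} \right)^{-1} \\
&= \left( I - (I - M)^{-1} C_1 + (I - M)^{-1} C_1 - C_1 - M + M \Omega V_0^{-1} \right)^{-1} \\
&= \left( C^{-1} + \left( I + M + M^2 + \ldots \right) C_1 - C_1 - M + M \Omega V_0^{-1} \right)^{-1} \\
&= \left( C^{-1} + M (I - M)^{-1} C_1 + M (\Omega - V_0) V_0^{-1} \right)^{-1}.
\end{aligned} \tag{9.2.25}
\]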

The term $M (I - M)^{-1} C_1 + M (\Omega - V_0) V_0^{-1}$ is not generically zero. This proves the second
part of the proposition.

The term $M \Omega M' - M \Omega V_0^{-1} \Omega M'$ is zero if the RBC model’s VAR representation has
only one lag (i.e., M = 0). It is close to zero if the variance of one of the shocks is close
to 0. In this latter case, $\Omega V_0^{-1} \Omega \approx \Omega$ and the SVAR user detects correctly the variance
of the one shock driving the system. This is true even if the VAR coefficients are wrong
(e.g., M is very different from 0).

What happens if we have a VAR with n lags? In this case, the formula is messy but

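$E v_t v_t'$ can be written

\[
E v_t v_t' = V_0 - \begin{bmatrix} V_1 & V_2 & \cdots & V_n \end{bmatrix}
\begin{bmatrix}
V_0 & V_1 & \cdots & V_{n-1} \\
V_1' & V_0 & \cdots & V_{n-2} \\
\vdots & \vdots & \ddots & \vdots \\
V_{n-1}' & V_{n-2}' & \cdots & V_0
\end{bmatrix}^{-1}
\begin{bmatrix} V_1' \\ V_2' \\ \vdots \\ V_n' \end{bmatrix}
\]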

with Vj = (C1 + M)j−1V1, where V0 is the matrix in (9.2.23) and V1 is the matrix in

(9.2.24).

9.2.8. The Propositions for Two Special Cases

In this section, we consider two special cases. The first has θ = 0. The second has

στ = 0. We show in these very special cases that the SVAR can uncover the true impulse

response for hours in response to a technology shock even if only one lag is used in the

VAR regression.

9.2.8.1. Proposition 3a: No capital in the model

Proposition 3a. Assume that θ is set to 0 in the RBC model. If a regression is run of the

form

Xt = BolsXt−1 + vt

with Xt from the RBC model, then the Blanchard-Quah procedure recovers the true

impulse response function for hours in response to technology, namely

Aj (2, 1) = 0 (9.2.26)

for all j.

Proof of Proposition 3a. It is important to note that C1 is singular in this case. Thus, we

can’t write M as C2C−11 − C1, but rather, we simply work with:

\[
M = \begin{bmatrix} 0 & 0 \\ M(2,1) & \alpha \end{bmatrix}
\]

for arbitrary M(2, 1) (which is what we would have in the limit as θ goes to 0). It is easy
to show that $B_2 = C_2 - C_1^2 = M B_1$, $B_3 = C_3 - B_1 C_2 - B_2 C_1 = M B_2$, and so on. To prove
the result in (9.2.26), we need to show that the errors that a SVAR user encounters in
estimating $E v_t v_t'$ and C do not affect the (2,1) elements of the $A_j$’s. Starting with $E v_t v_t'$:

\[
\begin{aligned}
E v_t v_t' &= \Omega + M \Omega M' - M \Omega V_0^{-1} \Omega M' \\
&= \begin{bmatrix} \sigma_z^2 & 0 \\ 0 & \sigma_\tau^2 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & x \end{bmatrix}
\end{aligned}
\]

where the second matrix is $M \Omega M' - M \Omega V_0^{-1} \Omega M'$ with a nonzero (2,2) element x. The
value of x does not affect the result, so we don’t need to specify it precisely. Next, we

consider the C matrix:

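\[
\begin{aligned}
C_{ols} &= \left( C^{-1} + M (I - M)^{-1} C_1 + M (\Omega - V_0) V_0^{-1} \right)^{-1} \\
&= \left( \begin{bmatrix} 1 & 0 \\ 0 & (1 - \alpha)/(1 - \rho) \end{bmatrix}^{-1} + \begin{bmatrix} 0 & 0 \\ 0 & y \end{bmatrix} \right)^{-1}
\end{aligned}
\]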

where y is a nonzero term in the SVAR error. The magnitude of y does not affect the

result. Notice that the (1,1) and (1,2) elements of C are correctly computed. Notice also

that this implies f = −C(1, 2)/C(1, 1) = 0 and therefore A0(2, 1) = 0. For all higher

terms, Aj(2, 1) = 0 because V1V−10 has zeros in the first column.

9.2.8.2. Proposition 3b: Only one shock

Proposition 3b. Assume that στ = 0 in the RBC model. If a regression is run of the form

Xt = BolsXt−1 + vt

with Xt from the RBC model, then the Blanchard-Quah procedure recovers the true

impulse responses to technology, namely

Aj = DjQ

for all j.

Proof of Proposition 3b. The first part of the proof is concerned with the impact coefficient

A0. We show that Evtv′t = Ω if στ = 0, where Ω is the true variance-covariance matrix for

the model. This is the main step in showing that the impact coefficient is correct. Then

we show that the other coefficients can also be recovered by the SVAR. From Proposition

2, the following holds for the one-lag regression regardless of the size of the shocks:

Evtv′t = Ω +MΩM ′ −MΩV −1

0 ΩM ′.

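We now show that $\Omega = \Omega V_0^{-1} \Omega$ if στ = 0, and therefore $E v_t v_t' = \Omega$. We do this in three
steps. First, we show that

\[
\Omega = \Omega(1,1) \begin{bmatrix} 1 & \zeta \\ \zeta & \zeta^2 \end{bmatrix} \tag{9.2.27}
\]

where ζ = −a/(1 − θ + θa). Second, we show that

\[
\frac{1}{1 + \zeta \nu} \begin{bmatrix} 1 & \nu \\ \zeta & \zeta \nu \end{bmatrix} V_0 = \Omega \tag{9.2.28}
\]

where ν = −θ(1 − γk)(1 − a)/[(α − γk)a]. Third, we show that (9.2.27) and (9.2.28) imply:

\[
\Omega - \Omega V_0^{-1} \Omega = 0. \tag{9.2.29}
\]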

Writing out Ω yields

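\[
\begin{aligned}
\Omega &= D_0 Q Q' D_0' \\
&= \begin{bmatrix} 1 - \theta + \theta a & -\theta b \\ -a & b \end{bmatrix} \begin{bmatrix} \sigma_z^2 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 - \theta + \theta a & -a \\ -\theta b & b \end{bmatrix} \\
&= \begin{bmatrix} (1 - \theta + \theta a)^2 & -(1 - \theta + \theta a) a \\ -(1 - \theta + \theta a) a & a^2 \end{bmatrix} \sigma_z^2 \\
&= \Omega(1,1) \begin{bmatrix} 1 & -a/(1 - \theta + \theta a) \\ -a/(1 - \theta + \theta a) & a^2/(1 - \theta + \theta a)^2 \end{bmatrix}
\end{aligned}
\]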

and (9.2.27) holds. Writing out V0 in (9.2.28) yields

1

1 + ζν

[

1 νζ ζν

]

(Ω + C1ΩC′1 + C2ΩC

′2 + C3ΩC

′3 . . .) . (9.2.30)

The second term on the right hand side of (9.2.30) is equal to a 2×2 matrix of zeros:[

1 νζ ζν

]

C1ΩC′1

=

[

1 νζ ζν

] [

θ2 (1 − γk)2(1 − a)

2θ (1 − γk) (1 − a) (α− γk) a

θ (1 − γk) (1 − a) (α− γk) a (α− γk)2a2

]

σ2z

=

[

0 00 0

]

.

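All higher terms are also equal to 2×2 matrices of zeros:

$$
\begin{aligned}
\begin{bmatrix} 1 & \nu \\ \zeta & \zeta\nu \end{bmatrix} C_j\Omega C_j'
&= \begin{bmatrix} 1 & \nu \\ \zeta & \zeta\nu \end{bmatrix}
\left( C_{j-1}D_0PD_0^{-1} + D_jD_0^{-1} \right) (D_0QQ'D_0') \left( C_{j-1}D_0PD_0^{-1} + D_jD_0^{-1} \right)' \\
&= \begin{bmatrix} 1 & \nu \\ \zeta & \zeta\nu \end{bmatrix}
\left( \gamma_k^{j-2}D_2 \right) (QQ') \left( \gamma_k^{j-2}D_2 \right)' \\
&= \begin{bmatrix} 1 & \nu \\ \zeta & \zeta\nu \end{bmatrix} \gamma_k^{2j-4}
\begin{bmatrix} [\gamma_k(1-a)\theta(1-\gamma_k)]^2 & \gamma_k^2(1-a)\theta(1-\gamma_k)a(\alpha-\gamma_k) \\ \gamma_k^2(1-a)\theta(1-\gamma_k)a(\alpha-\gamma_k) & [\gamma_ka(\alpha-\gamma_k)]^2 \end{bmatrix} \sigma_z^2 \\
&= \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}.
\end{aligned}
$$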

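Thus,

$$
\begin{aligned}
\frac{1}{1+\zeta\nu} \begin{bmatrix} 1 & \nu \\ \zeta & \zeta\nu \end{bmatrix}
\left( \Omega + C_1\Omega C_1' + C_2\Omega C_2' + C_3\Omega C_3' + \ldots \right)
&= \frac{1}{1+\zeta\nu} \begin{bmatrix} 1 & \nu \\ \zeta & \zeta\nu \end{bmatrix} \Omega \\
&= \frac{\Omega(1,1)}{1+\zeta\nu} \begin{bmatrix} 1 & \nu \\ \zeta & \zeta\nu \end{bmatrix}
\begin{bmatrix} 1 & \zeta \\ \zeta & \zeta^2 \end{bmatrix} \\
&= \Omega
\end{aligned}
$$

which proves (9.2.28). Now, we are ready for

$$
\begin{aligned}
\Omega - \Omega V_0^{-1}\Omega
&= \Omega(1,1) \left\{ \begin{bmatrix} 1 & \zeta \\ \zeta & \zeta^2 \end{bmatrix}
- \frac{1}{1+\zeta\nu} \begin{bmatrix} 1 & \nu \\ \zeta & \zeta\nu \end{bmatrix}
\begin{bmatrix} 1 & \zeta \\ \zeta & \zeta^2 \end{bmatrix} \right\} \\
&= \Omega(1,1) \left\{ \begin{bmatrix} 1 & \zeta \\ \zeta & \zeta^2 \end{bmatrix}
- \frac{1}{1+\zeta\nu} \begin{bmatrix} 1+\zeta\nu & \zeta(1+\zeta\nu) \\ \zeta(1+\zeta\nu) & \zeta^2(1+\zeta\nu) \end{bmatrix} \right\} \\
&= \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\end{aligned}
$$

which proves that there is no error in computing $E v_t v_t'$, that is, $E v_t v_t' = \Omega$. The next step is to show that this is all that is needed for the correct inference. Recall the formulas for the elements of $A_0$ and $\lambda$ in Section 3.6. Because $\Omega_{ols} = \Omega$ and $\det(\Omega) = 0$, it must be the case that $\lambda = 0$. Thus, the $A_0$ found by the SVAR is

$$
A_0 = \begin{bmatrix} \sqrt{\omega_{11}} & 0 \\ \sqrt{\omega_{22}} & 0 \end{bmatrix}
$$

where $\sqrt{\omega_{jj}} = \sqrt{\Omega(j,j)}$. Using the formulas above, we have $A_0(1,1) = (1-\theta+\theta a)\sigma_z$ and $A_0(2,1) = -a\sigma_z$. Thus, $A_0 = D_0Q$. This proves that there is no mistaken inference for the impact coefficient. Next, we check $A_j$ for $j \ge 1$, which is equal to $B_{ols}^jA_0$. For $j = 1$,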

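$$
\begin{aligned}
B_{ols}A_0 &= V_1V_0^{-1}A_0 \\
&= V_1V_0^{-1}D_0Q \\
&= \left( C_1 + M - M\Omega V_0^{-1} \right) D_0Q \\
&= C_1D_0Q \\
&= (D_0P + D_1)D_0^{-1}D_0Q \\
&= D_1Q
\end{aligned}
$$

where $(I - \Omega V_0^{-1})D_0Q = 0$ has been used. For $j = 2$,

$$
\begin{aligned}
B_{ols}^2A_0 &= \left( V_1V_0^{-1} \right)^2 A_0 \\
&= \left( V_1V_0^{-1} \right) C_1D_0Q \\
&= \left( C_1 + M - M\Omega V_0^{-1} \right) C_1D_0Q \\
&= C_2D_0Q - M\Omega V_0^{-1}C_1D_0Q \\
&= C_2D_0Q \\
&= \left( D_0P^2 + D_1P + D_2 \right) D_0^{-1}D_0Q \\
&= D_2Q
\end{aligned}
$$

where $\Omega V_0^{-1}C_1D_0Q = 0$ has been used. Similarly, we can prove it for higher terms by noting that if $B_{ols}^{j-1}A_0 = C_{j-1}D_0Q$ holds, then $B_{ols}^jA_0 = C_jD_0Q$ holds and so does the following:

$$
\begin{aligned}
B_{ols}^jA_0 &= \left( V_1V_0^{-1} \right) C_{j-1}D_0Q \\
&= \left( C_1 + M - M\Omega V_0^{-1} \right) C_{j-1}D_0Q \\
&= C_jD_0Q - M\Omega V_0^{-1}C_{j-1}D_0Q \\
&= C_jD_0Q \\
&= \left( D_0P^j + D_1P^{j-1} + \ldots + D_j \right) D_0^{-1}D_0Q \\
&= D_jQ.
\end{aligned}
$$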

This establishes that in the case with στ = 0, the SVAR uncovers the true impulse responses

to technology.
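The rank-one algebra behind (9.2.27) through (9.2.29) can be checked numerically. The sketch below is only an illustration: every parameter value is hypothetical (none comes from the notes), $\sigma_z$ is normalized to one, and the names `S`, `rank_one`, and `lag_term` are introduced here for convenience. It builds $\Omega = D_0QQ'D_0'$ with $\sigma_\tau = 0$ and checks that the matrix premultiplying $V_0$ in (9.2.28) reproduces $\Omega$ while annihilating the $C_1\Omega C_1'$ term.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration.
theta, alpha, gamma_k = 0.33, 0.9, 0.95
a, b = -0.4, 1.2
sigma_z = 1.0  # normalized; sigma_tau = 0 is imposed through Q below

# Two-shock impact matrix D0 and shock loading Q with sigma_tau = 0.
D0 = np.array([[1 - theta + theta * a, -theta * b],
               [-a, b]])
Q = np.diag([sigma_z, 0.0])
Omega = D0 @ Q @ Q.T @ D0.T

zeta = -a / (1 - theta + theta * a)
nu = -theta * (1 - gamma_k) * (1 - a) / ((alpha - gamma_k) * a)

# Matrix that premultiplies V0 in (9.2.28).
S = np.array([[1.0, nu], [zeta, zeta * nu]]) / (1 + zeta * nu)

# Right-hand side of (9.2.27).
rank_one = Omega[0, 0] * np.array([[1.0, zeta], [zeta, zeta**2]])

# C1 Omega C1' as written out in the proof: an outer product of g with itself.
g = np.array([theta * (1 - gamma_k) * (1 - a), (alpha - gamma_k) * a])
lag_term = sigma_z**2 * np.outer(g, g)

print("Omega has the rank-one form (9.2.27):", np.allclose(Omega, rank_one))
print("S annihilates C1 Omega C1':", np.allclose(S @ lag_term, 0.0))
print("S Omega equals Omega:", np.allclose(S @ Omega, Omega))
```

The check relies only on the structure of $D_0$, $\zeta$, and $\nu$, so other admissible parameter values (with $1 + \zeta\nu \neq 0$) should behave the same way.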

What is interesting about the last two propositions is that the special cases are not relevant for modern business cycle theorists. Modern business cycle theorists assume that both capital accumulation and shocks in addition to technology (e.g., distortions to labor) are quantitatively important. Furthermore, adding these factors is not a recent phenomenon. They are central to the work following Kydland and Prescott (1982) (which includes my thesis).

9.3. VARs and 3-Shock Versions of the Model

We consider several versions of an RBC model with three shocks and three variables in the VAR. The first has an investment tax shock and the log of the investment-output ratio in the VAR. The second has a government spending shock and the log of the investment-output ratio in the VAR. The third has an investment tax shock and the log of the consumption-output ratio in the VAR.

9.3.1. Adding an Investment Tax Shock

Assume the economy is an RBC model with three orthogonal shocks: a unit root in

technology log z, an AR(1) in the tax rate on labor τl, and an AR(1) in the tax rate on

investment τx. The capital decision function has the form:

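$$
\log k_{t+1} = \gamma_0 + \gamma_k\log k_t + \gamma_z\log z_t + \gamma_l\tau_{lt} + \gamma_x\tau_{xt} \tag{9.3.1}
$$

and the labor decision function can be written:

$$
\begin{aligned}
\log l_t &= \phi_{lz}\log z_t + \phi_{ll}\tau_{lt} + \phi_{lx}\tau_{xt} + \phi_{lk}\log k_t + \phi_{lk'}\log k_{t+1} \\
&= \phi_{lz}\log z_t + \phi_{ll}\tau_{lt} + \phi_{lx}\tau_{xt} + \phi_{lk}\log k_t + \phi_{lk'}\left[ \gamma_0 + \gamma_k\log k_t + \gamma_z\log z_t + \gamma_l\tau_{lt} + \gamma_x\tau_{xt} \right] \\
&= (\phi_{lk} + \phi_{lk'}\gamma_k)\log k_t + (\phi_{lz} + \phi_{lk'}\gamma_z)\log z_t + (\phi_{ll} + \phi_{lk'}\gamma_l)\tau_{lt} + (\phi_{lx} + \phi_{lk'}\gamma_x)\tau_{xt},
\end{aligned}
$$

where the constant term $\phi_{lk'}\gamma_0$ has been dropped in the last line.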

Note that we include the term φlxτxt here even though it is equal to 0 in equilibrium.

We do so because the same mathematics will be used later in the case of the government

spending shock.

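Next, we write out output from a Cobb-Douglas production technology with capital share $\theta$:

$$
\begin{aligned}
\log y_t &= \theta\left( \log k_t - \log z_t \right) + (1-\theta)\log l_t \\
&= (\theta + (1-\theta)\phi_{lk})\log k_t - (\theta - (1-\theta)\phi_{lz})\log z_t + (1-\theta)\phi_{ll}\tau_{lt} \\
&\quad + (1-\theta)\phi_{lx}\tau_{xt} + (1-\theta)\phi_{lk'}\log k_{t+1} \\
&= (\theta + (1-\theta)(\phi_{lk} + \phi_{lk'}\gamma_k))\log k_t - (\theta - (1-\theta)(\phi_{lz} + \phi_{lk'}\gamma_z))\log z_t \\
&\quad + (1-\theta)(\phi_{ll} + \phi_{lk'}\gamma_l)\tau_{lt} + (1-\theta)(\phi_{lx} + \phi_{lk'}\gamma_x)\tau_{xt}.
\end{aligned}
$$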

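We can write the capital stock in terms of all lagged shocks as follows:

$$
\begin{aligned}
\log k_t &= \gamma_0 + \gamma_k\left( \gamma_0 + \gamma_k\log k_{t-2} + \gamma_z\log z_{t-2} + \gamma_l\tau_{l,t-2} + \gamma_x\tau_{x,t-2} \right) + \gamma_z\log z_{t-1} + \gamma_l\tau_{l,t-1} + \gamma_x\tau_{x,t-1} \\
&= \gamma_0\left[ 1 + \gamma_k + \gamma_k^2 + \ldots \right]
+ \gamma_z\left[ \log z_{t-1} + \gamma_k\log z_{t-2} + \gamma_k^2\log z_{t-3} + \ldots \right] \\
&\quad + \gamma_l\left[ \tau_{l,t-1} + \gamma_k\tau_{l,t-2} + \gamma_k^2\tau_{l,t-3} + \ldots \right]
+ \gamma_x\left[ \tau_{x,t-1} + \gamma_k\tau_{x,t-2} + \gamma_k^2\tau_{x,t-3} + \ldots \right]
\end{aligned}
$$

or in differences as follows:

$$
\begin{aligned}
\log k_t - \log k_{t-1} &= \gamma_z\left[ \log z_{t-1} + (\gamma_k - 1)\left( \log z_{t-2} + \gamma_k\log z_{t-3} + \gamma_k^2\log z_{t-4} + \ldots \right) \right] \\
&\quad + \gamma_l\left[ \tau_{l,t-1} + (\gamma_k - 1)\left( \tau_{l,t-2} + \gamma_k\tau_{l,t-3} + \gamma_k^2\tau_{l,t-4} + \ldots \right) \right] \\
&\quad + \gamma_x\left[ \tau_{x,t-1} + (\gamma_k - 1)\left( \tau_{x,t-2} + \gamma_k\tau_{x,t-3} + \gamma_k^2\tau_{x,t-4} + \ldots \right) \right]
\end{aligned}
$$

or in quasi-differences as follows:

$$
\begin{aligned}
\log k_t - \alpha\log k_{t-1} &= \gamma_z\left[ \log z_{t-1} + (\gamma_k - \alpha)\left( \log z_{t-2} + \gamma_k\log z_{t-3} + \gamma_k^2\log z_{t-4} + \ldots \right) \right] \\
&\quad + \gamma_l\left[ \tau_{l,t-1} + (\gamma_k - \alpha)\left( \tau_{l,t-2} + \gamma_k\tau_{l,t-3} + \gamma_k^2\tau_{l,t-4} + \ldots \right) \right] \\
&\quad + \gamma_x\left[ \tau_{x,t-1} + (\gamma_k - \alpha)\left( \tau_{x,t-2} + \gamma_k\tau_{x,t-3} + \gamma_k^2\tau_{x,t-4} + \ldots \right) \right]
\end{aligned}
$$

where the constant $(1-\alpha)\gamma_0[1 + \gamma_k + \gamma_k^2 + \ldots]$ has been ignored.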
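The lag weights in the level and quasi-difference expansions of capital can be checked against the recursion (9.3.1) directly. The following is a minimal sketch under stated assumptions: the coefficient values are hypothetical, the constant $\gamma_0$ is set to zero, capital starts at $\log k_0 = 0$ so that the sums are finite, and the helper names `ma_level` and `ma_quasi_diff` are introduced here for illustration only.

```python
import numpy as np

# Hypothetical decision-rule coefficients; gamma_0 is set to zero.
gamma_k, gamma_z, gamma_l, gamma_x = 0.95, -0.95, -0.10, -0.05
alpha = 0.9
T = 40

rng = np.random.default_rng(0)
shocks = rng.standard_normal((T, 3))   # columns: log z_t, tau_lt, tau_xt
gammas = np.array([gamma_z, gamma_l, gamma_x])

# Recursion (9.3.1) without the constant, starting from log k_0 = 0.
k = np.zeros(T + 1)
for t in range(T):
    k[t + 1] = gamma_k * k[t] + gammas @ shocks[t]

def ma_level(t):
    """log k_t as a weighted sum of lagged shocks with weights gamma_k**i."""
    return sum(gamma_k**i * (gammas @ shocks[t - 1 - i]) for i in range(t))

def ma_quasi_diff(t, alpha):
    """log k_t - alpha*log k_{t-1}: weight 1 on the first lag and
    (gamma_k - alpha)*gamma_k**(i-1) on each further lag."""
    total = gammas @ shocks[t - 1]
    total += sum((gamma_k - alpha) * gamma_k**(i - 1) * (gammas @ shocks[t - 1 - i])
                 for i in range(1, t))
    return total

print(abs(ma_level(T) - k[T]))                                   # level form
print(abs(ma_quasi_diff(T, 1.0) - (k[T] - k[T - 1])))            # first difference
print(abs(ma_quasi_diff(T, alpha) - (k[T] - alpha * k[T - 1])))  # quasi-difference
```

Setting `alpha = 1.0` in `ma_quasi_diff` gives the first-difference weights, so one helper covers both expansions.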

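We can also write hours in terms of past shocks as follows:

$$
\begin{aligned}
\log l_t &= \phi_{lz}\log z_t + \phi_{ll}\tau_{lt} + \phi_{lx}\tau_{xt} + \phi_{lk}\log k_t + \phi_{lk'}\log k_{t+1} \\
&= \phi_{lz}\log z_t + \phi_{ll}\tau_{lt} + \phi_{lx}\tau_{xt} \\
&\quad + \phi_{lk}\gamma_z\left[ \log z_{t-1} + \gamma_k\log z_{t-2} + \gamma_k^2\log z_{t-3} + \ldots \right] \\
&\quad + \phi_{lk}\gamma_l\left[ \tau_{l,t-1} + \gamma_k\tau_{l,t-2} + \gamma_k^2\tau_{l,t-3} + \ldots \right] \\
&\quad + \phi_{lk}\gamma_x\left[ \tau_{x,t-1} + \gamma_k\tau_{x,t-2} + \gamma_k^2\tau_{x,t-3} + \ldots \right] \\
&\quad + \phi_{lk'}\gamma_z\left[ \log z_t + \gamma_k\log z_{t-1} + \gamma_k^2\log z_{t-2} + \ldots \right] \\
&\quad + \phi_{lk'}\gamma_l\left[ \tau_{lt} + \gamma_k\tau_{l,t-1} + \gamma_k^2\tau_{l,t-2} + \ldots \right] \\
&\quad + \phi_{lk'}\gamma_x\left[ \tau_{xt} + \gamma_k\tau_{x,t-1} + \gamma_k^2\tau_{x,t-2} + \ldots \right] \\
&= \left[ (\phi_{lz} + \phi_{lk'}\gamma_z)\log z_t + (\phi_{lk} + \phi_{lk'}\gamma_k)\gamma_z\log z_{t-1} + (\phi_{lk} + \phi_{lk'}\gamma_k)\gamma_k\gamma_z\log z_{t-2} + \ldots \right] \\
&\quad + \left[ (\phi_{ll} + \phi_{lk'}\gamma_l)\tau_{lt} + (\phi_{lk} + \phi_{lk'}\gamma_k)\gamma_l\tau_{l,t-1} + (\phi_{lk} + \phi_{lk'}\gamma_k)\gamma_k\gamma_l\tau_{l,t-2} + \ldots \right] \\
&\quad + \left[ (\phi_{lx} + \phi_{lk'}\gamma_x)\tau_{xt} + (\phi_{lk} + \phi_{lk'}\gamma_k)\gamma_x\tau_{x,t-1} + (\phi_{lk} + \phi_{lk'}\gamma_k)\gamma_k\gamma_x\tau_{x,t-2} + \ldots \right]
\end{aligned}
$$

where constant terms have been ignored.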

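We can write logged hours in differences as follows:

$$
\begin{aligned}
\log l_t - \log l_{t-1} &= \phi_{lz}(\log z_t - \log z_{t-1}) + \phi_{ll}(\tau_{lt} - \tau_{l,t-1}) + \phi_{lx}(\tau_{xt} - \tau_{x,t-1}) \\
&\quad + \phi_{lk'}\left( \log k_{t+1} - \log k_t \right) + \phi_{lk}\left( \log k_t - \log k_{t-1} \right) \\
&= \phi_{lz}(\log z_t - \log z_{t-1}) + \phi_{ll}(\tau_{lt} - \tau_{l,t-1}) + \phi_{lx}(\tau_{xt} - \tau_{x,t-1}) \\
&\quad + \phi_{lk'}\gamma_z\left[ \log z_t + (\gamma_k - 1)\left( \log z_{t-1} + \gamma_k\log z_{t-2} + \gamma_k^2\log z_{t-3} + \ldots \right) \right] \\
&\quad + \phi_{lk'}\gamma_l\left[ \tau_{lt} + (\gamma_k - 1)\left( \tau_{l,t-1} + \gamma_k\tau_{l,t-2} + \gamma_k^2\tau_{l,t-3} + \ldots \right) \right] \\
&\quad + \phi_{lk'}\gamma_x\left[ \tau_{xt} + (\gamma_k - 1)\left( \tau_{x,t-1} + \gamma_k\tau_{x,t-2} + \gamma_k^2\tau_{x,t-3} + \ldots \right) \right] \\
&\quad + \phi_{lk}\gamma_z\left[ \log z_{t-1} + (\gamma_k - 1)\left( \log z_{t-2} + \gamma_k\log z_{t-3} + \gamma_k^2\log z_{t-4} + \ldots \right) \right] \\
&\quad + \phi_{lk}\gamma_l\left[ \tau_{l,t-1} + (\gamma_k - 1)\left( \tau_{l,t-2} + \gamma_k\tau_{l,t-3} + \gamma_k^2\tau_{l,t-4} + \ldots \right) \right] \\
&\quad + \phi_{lk}\gamma_x\left[ \tau_{x,t-1} + (\gamma_k - 1)\left( \tau_{x,t-2} + \gamma_k\tau_{x,t-3} + \gamma_k^2\tau_{x,t-4} + \ldots \right) \right] \\
&= [\phi_{lz} + \phi_{lk'}\gamma_z]\log z_t - [\phi_{lz} - \phi_{lk}\gamma_z - \phi_{lk'}\gamma_z(\gamma_k - 1)]\log z_{t-1} \\
&\quad + \gamma_z(\gamma_k - 1)[\phi_{lk'}\gamma_k + \phi_{lk}]\left[ \log z_{t-2} + \gamma_k\log z_{t-3} + \gamma_k^2\log z_{t-4} + \ldots \right] \\
&\quad + [\phi_{ll} + \phi_{lk'}\gamma_l]\tau_{lt} - [\phi_{ll} - \phi_{lk}\gamma_l - \phi_{lk'}\gamma_l(\gamma_k - 1)]\tau_{l,t-1} \\
&\quad + \gamma_l(\gamma_k - 1)[\phi_{lk'}\gamma_k + \phi_{lk}]\left[ \tau_{l,t-2} + \gamma_k\tau_{l,t-3} + \gamma_k^2\tau_{l,t-4} + \ldots \right] \\
&\quad + [\phi_{lx} + \phi_{lk'}\gamma_x]\tau_{xt} - [\phi_{lx} - \phi_{lk}\gamma_x - \phi_{lk'}\gamma_x(\gamma_k - 1)]\tau_{x,t-1} \\
&\quad + \gamma_x(\gamma_k - 1)[\phi_{lk'}\gamma_k + \phi_{lk}]\left[ \tau_{x,t-2} + \gamma_k\tau_{x,t-3} + \gamma_k^2\tau_{x,t-4} + \ldots \right]
\end{aligned}
$$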

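or in quasi-difference form as follows:

$$
\begin{aligned}
\log l_t - \alpha\log l_{t-1} &= \phi_{lz}(\log z_t - \alpha\log z_{t-1}) + \phi_{ll}(\tau_{lt} - \alpha\tau_{l,t-1}) + \phi_{lx}(\tau_{xt} - \alpha\tau_{x,t-1}) \\
&\quad + \phi_{lk'}\left( \log k_{t+1} - \alpha\log k_t \right) + \phi_{lk}\left( \log k_t - \alpha\log k_{t-1} \right) \\
&= \phi_{lz}(\log z_t - \alpha\log z_{t-1}) + \phi_{ll}(\tau_{lt} - \alpha\tau_{l,t-1}) + \phi_{lx}(\tau_{xt} - \alpha\tau_{x,t-1}) \\
&\quad + \phi_{lk'}\gamma_z\left[ \log z_t + (\gamma_k - \alpha)\left( \log z_{t-1} + \gamma_k\log z_{t-2} + \gamma_k^2\log z_{t-3} + \ldots \right) \right] \\
&\quad + \phi_{lk'}\gamma_l\left[ \tau_{lt} + (\gamma_k - \alpha)\left( \tau_{l,t-1} + \gamma_k\tau_{l,t-2} + \gamma_k^2\tau_{l,t-3} + \ldots \right) \right] \\
&\quad + \phi_{lk'}\gamma_x\left[ \tau_{xt} + (\gamma_k - \alpha)\left( \tau_{x,t-1} + \gamma_k\tau_{x,t-2} + \gamma_k^2\tau_{x,t-3} + \ldots \right) \right] \\
&\quad + \phi_{lk}\gamma_z\left[ \log z_{t-1} + (\gamma_k - \alpha)\left( \log z_{t-2} + \gamma_k\log z_{t-3} + \gamma_k^2\log z_{t-4} + \ldots \right) \right] \\
&\quad + \phi_{lk}\gamma_l\left[ \tau_{l,t-1} + (\gamma_k - \alpha)\left( \tau_{l,t-2} + \gamma_k\tau_{l,t-3} + \gamma_k^2\tau_{l,t-4} + \ldots \right) \right] \\
&\quad + \phi_{lk}\gamma_x\left[ \tau_{x,t-1} + (\gamma_k - \alpha)\left( \tau_{x,t-2} + \gamma_k\tau_{x,t-3} + \gamma_k^2\tau_{x,t-4} + \ldots \right) \right] \\
&= [\phi_{lz} + \phi_{lk'}\gamma_z]\log z_t - [\alpha\phi_{lz} - \phi_{lk}\gamma_z - \phi_{lk'}\gamma_z(\gamma_k - \alpha)]\log z_{t-1} \\
&\quad + \gamma_z(\gamma_k - \alpha)[\phi_{lk'}\gamma_k + \phi_{lk}]\left[ \log z_{t-2} + \gamma_k\log z_{t-3} + \gamma_k^2\log z_{t-4} + \ldots \right] \\
&\quad + [\phi_{ll} + \phi_{lk'}\gamma_l]\tau_{lt} - [\alpha\phi_{ll} - \phi_{lk}\gamma_l - \phi_{lk'}\gamma_l(\gamma_k - \alpha)]\tau_{l,t-1} \\
&\quad + [\phi_{lx} + \phi_{lk'}\gamma_x]\tau_{xt} - [\alpha\phi_{lx} - \phi_{lk}\gamma_x - \phi_{lk'}\gamma_x(\gamma_k - \alpha)]\tau_{x,t-1} \\
&\quad + \gamma_l(\gamma_k - \alpha)[\phi_{lk'}\gamma_k + \phi_{lk}]\left[ \tau_{l,t-2} + \gamma_k\tau_{l,t-3} + \gamma_k^2\tau_{l,t-4} + \ldots \right] \\
&\quad + \gamma_x(\gamma_k - \alpha)[\phi_{lk'}\gamma_k + \phi_{lk}]\left[ \tau_{x,t-2} + \gamma_k\tau_{x,t-3} + \gamma_k^2\tau_{x,t-4} + \ldots \right]
\end{aligned} \tag{9.3.2}
$$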

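We can use the expressions for output and hours to write out the change in productivity as follows:

$$
\begin{aligned}
&\log(y_t/l_t) - \log(y_{t-1}/l_{t-1}) \\
&= \log y_t - \log y_{t-1} + \log z_t - \log l_t + \log l_{t-1} \\
&= \log z_t + \theta\left( \log k_t - \log k_{t-1} - \log l_t + \log l_{t-1} - \log z_t + \log z_{t-1} \right) \\
&= (1-\theta)\log z_t + \theta\log z_{t-1} - \theta\Big\{ [\phi_{lz} + \phi_{lk'}\gamma_z]\log z_t
- [\phi_{lz} - (\phi_{lk} - 1)\gamma_z - \phi_{lk'}\gamma_z(\gamma_k - 1)]\log z_{t-1} \\
&\qquad + \gamma_z(\gamma_k - 1)[\phi_{lk'}\gamma_k + \phi_{lk} - 1]\left[ \log z_{t-2} + \gamma_k\log z_{t-3} + \gamma_k^2\log z_{t-4} + \ldots \right] \\
&\qquad + [\phi_{ll} + \phi_{lk'}\gamma_l]\tau_{lt} - [\phi_{ll} - (\phi_{lk} - 1)\gamma_l - \phi_{lk'}\gamma_l(\gamma_k - 1)]\tau_{l,t-1} \\
&\qquad + [\phi_{lx} + \phi_{lk'}\gamma_x]\tau_{xt} - [\phi_{lx} - (\phi_{lk} - 1)\gamma_x - \phi_{lk'}\gamma_x(\gamma_k - 1)]\tau_{x,t-1} \\
&\qquad + \gamma_l(\gamma_k - 1)[\phi_{lk'}\gamma_k + \phi_{lk} - 1]\left[ \tau_{l,t-2} + \gamma_k\tau_{l,t-3} + \gamma_k^2\tau_{l,t-4} + \ldots \right] \\
&\qquad + \gamma_x(\gamma_k - 1)[\phi_{lk'}\gamma_k + \phi_{lk} - 1]\left[ \tau_{x,t-2} + \gamma_k\tau_{x,t-3} + \gamma_k^2\tau_{x,t-4} + \ldots \right] \Big\} \\
&= \left( 1 - \theta - \theta[\phi_{lz} + \phi_{lk'}\gamma_z] \right)\log z_t \\
&\quad + \theta[1 + \phi_{lz} - (\phi_{lk} - 1)\gamma_z - \phi_{lk'}\gamma_z(\gamma_k - 1)]\log z_{t-1} \\
&\quad - \theta\gamma_z(\gamma_k - 1)[\phi_{lk'}\gamma_k + \phi_{lk} - 1]\left[ \log z_{t-2} + \gamma_k\log z_{t-3} + \gamma_k^2\log z_{t-4} + \ldots \right] \\
&\quad - \theta[\phi_{ll} + \phi_{lk'}\gamma_l]\tau_{lt} \\
&\quad + \theta[\phi_{ll} - (\phi_{lk} - 1)\gamma_l - \phi_{lk'}\gamma_l(\gamma_k - 1)]\tau_{l,t-1} \\
&\quad - \theta\gamma_l(\gamma_k - 1)[\phi_{lk'}\gamma_k + \phi_{lk} - 1]\left[ \tau_{l,t-2} + \gamma_k\tau_{l,t-3} + \gamma_k^2\tau_{l,t-4} + \ldots \right] \\
&\quad - \theta[\phi_{lx} + \phi_{lk'}\gamma_x]\tau_{xt} \\
&\quad + \theta[\phi_{lx} - (\phi_{lk} - 1)\gamma_x - \phi_{lk'}\gamma_x(\gamma_k - 1)]\tau_{x,t-1} \\
&\quad - \theta\gamma_x(\gamma_k - 1)[\phi_{lk'}\gamma_k + \phi_{lk} - 1]\left[ \tau_{x,t-2} + \gamma_k\tau_{x,t-3} + \gamma_k^2\tau_{x,t-4} + \ldots \right]
\end{aligned} \tag{9.3.3}
$$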

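Now we write out the log of the investment share:

$$
\begin{aligned}
\log(x_t/y_t) &= \log x_t - \log y_t \\
&= \phi_{xk}\left( \log k_t - \log z_t \right) + \phi_{xk'}\log k_{t+1} - \theta\left( \log k_t - \log z_t \right) - (1-\theta)\log l_t \\
&= (\phi_{xk} - \theta)\left( \log k_t - \log z_t \right) + \phi_{xk'}\left[ \gamma_k\log k_t + \gamma_z\log z_t + \gamma_l\tau_{lt} + \gamma_x\tau_{xt} \right] \\
&\quad - (1-\theta)\left[ (\phi_{lk} + \phi_{lk'}\gamma_k)\log k_t + (\phi_{lz} + \phi_{lk'}\gamma_z)\log z_t + (\phi_{ll} + \phi_{lk'}\gamma_l)\tau_{lt} + (\phi_{lx} + \phi_{lk'}\gamma_x)\tau_{xt} \right] \\
&= [-\phi_{xk} + \theta + \phi_{xk'}\gamma_z - (1-\theta)(\phi_{lz} + \phi_{lk'}\gamma_z)]\log z_t \\
&\quad + [\phi_{xk'}\gamma_l - (1-\theta)(\phi_{ll} + \phi_{lk'}\gamma_l)]\tau_{lt} \\
&\quad + [\phi_{xk'}\gamma_x - (1-\theta)(\phi_{lx} + \phi_{lk'}\gamma_x)]\tau_{xt} \\
&\quad + [\phi_{xk} - \theta + \phi_{xk'}\gamma_k - (1-\theta)(\phi_{lk} + \phi_{lk'}\gamma_k)]
\Big\{ \gamma_z\left[ \log z_{t-1} + \gamma_k\log z_{t-2} + \gamma_k^2\log z_{t-3} + \ldots \right] \\
&\qquad + \gamma_l\left[ \tau_{l,t-1} + \gamma_k\tau_{l,t-2} + \gamma_k^2\tau_{l,t-3} + \ldots \right]
+ \gamma_x\left[ \tau_{x,t-1} + \gamma_k\tau_{x,t-2} + \gamma_k^2\tau_{x,t-3} + \ldots \right] \Big\}
\end{aligned}
$$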

9.3.1.1. The Model’s Moving Average

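The moving average for the model is given by:

$$
\begin{bmatrix} (1-L)\log(y_t/l_t) \\ (1-\alpha L)\log l_t \\ \log(x_t/y_t) \end{bmatrix}
\equiv X_t = D_0\omega_t + D_1\omega_{t-1} + D_2\omega_{t-2} + \ldots
$$

where $\omega_t = [\log z_t, \tau_{lt}, \tau_{xt}]'$ and

$$
D_0 = \begin{bmatrix} 1-\theta+\theta a & -\theta b & -\theta c \\ -a & b & c \\ -d & e & f \end{bmatrix} \tag{9.3.4}
$$

$$
D_1 = \begin{bmatrix} \theta(1-a)(1-\gamma_k) & \theta(b + (1-a)\gamma_l) & \theta(c + (1-a)\gamma_x) \\ (\alpha-\gamma_k)a & -\alpha b + \gamma_la & -\alpha c + \gamma_xa \\ -d\gamma_k & d\gamma_l & d\gamma_x \end{bmatrix} \tag{9.3.5}
$$

$$
D_2 = \begin{bmatrix} \gamma_k(1-a)\theta(1-\gamma_k) & -\gamma_l(1-a)\theta(1-\gamma_k) & -\gamma_x(1-a)\theta(1-\gamma_k) \\ \gamma_ka(\alpha-\gamma_k) & -\gamma_la(\alpha-\gamma_k) & -\gamma_xa(\alpha-\gamma_k) \\ -d\gamma_k^2 & d\gamma_l\gamma_k & d\gamma_x\gamma_k \end{bmatrix}, \tag{9.3.6}
$$

and $D_j = \gamma_kD_{j-1}$ for $j \ge 3$ where $a = \phi_{lk} + \phi_{lk'}\gamma_k$, $b = \phi_{ll} + \phi_{lk'}\gamma_l$, $c = \phi_{lx} + \phi_{lk'}\gamma_x$, $d = \phi_{xk} + \phi_{xk'}\gamma_k - \theta - (1-\theta)a$, $e = \phi_{xk'}\gamma_l - (1-\theta)b$, and $f = \phi_{xk'}\gamma_x - (1-\theta)c$. Note that $\phi_{lz} = -\phi_{lk}$, $\phi_{xz} = -\phi_{xk}$, and $\gamma_z = -\gamma_k$ hold in the model economy with a unit root in technology. Note also that $D_2$ is singular for all parameterizations, and $D_1$ is singular if $\alpha = 0$.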

If $\tau_{lt}$ and $\tau_{xt}$ are AR(1) processes, then it is more convenient to write the MA process in terms of $\eta_t = [\log z_t, \eta_{lt}, \eta_{xt}]'$ rather than in terms of $\omega_t$. In this case,

$$
X_t = D_0\eta_t + (D_0P + D_1)\eta_{t-1} + \left( D_0P^2 + D_1P + D_2 \right)\eta_{t-2} + \left( D_0P^3 + D_1P^2 + D_2P + D_3 \right)\eta_{t-3} + \ldots
$$

We normalize the MA so it has an identity for the first coefficient. That is, set $C_0 = I$, $C_1 = (D_0P + D_1)D_0^{-1}$, and $C_j = C_{j-1}D_0PD_0^{-1} + D_jD_0^{-1}$.
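The normalization recursion can be sketched in a few lines. The function name `normalized_ma` and its arguments are hypothetical, introduced here for illustration; it assumes the caller supplies the list $D_0, D_1, \ldots$ up to the last coefficient wanted, together with the AR matrix $P$. The recursion is equivalent to post-multiplying each unnormalized coefficient $D_0P^j + D_1P^{j-1} + \ldots + D_j$ by $D_0^{-1}$.

```python
import numpy as np

def normalized_ma(D, P):
    """Return [C_0, ..., C_J] for D = [D_0, ..., D_J], using
    C_0 = I and C_j = C_{j-1} D_0 P D_0^{-1} + D_j D_0^{-1}."""
    D0_inv = np.linalg.inv(D[0])
    C = [np.eye(D[0].shape[0])]
    for j in range(1, len(D)):
        C.append(C[-1] @ D[0] @ P @ D0_inv + D[j] @ D0_inv)
    return C

# Usage with arbitrary (hypothetical) matrices; P has a zero first column,
# as it would when technology has a unit root.
rng = np.random.default_rng(1)
D = [rng.standard_normal((3, 3)) + 3.0 * np.eye(3)]
D += [rng.standard_normal((3, 3)) for _ in range(3)]
P = np.diag([0.0, 0.9, 0.8])
C = normalized_ma(D, P)
print(C[1] - (D[0] @ P + D[1]) @ np.linalg.inv(D[0]))  # C_1 by its direct formula
```

In the model itself one would pass $D_0$, $D_1$, $D_2$ from (9.3.4) through (9.3.6) and extend the list with $D_j = \gamma_kD_{j-1}$.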

9.3.1.2. Special Property of the D’s

As in the 2-shock case, the $D$ matrices have a special property that can be exploited when we characterize coefficients of the VAR of $X_t$. In other words, the $D$'s for the 3-shock RBC model also satisfy the relation:

$$
\left( \gamma_kI - \left( D_0P^2 + D_1P + D_2 \right)(D_0P + D_1)^{-1} \right) D_2 = 0, \tag{9.3.7}
$$

which is the same as (9.2.1). Because $D_1$ is singular when $\alpha = 0$, we will assume that the choice of $P$ and $\alpha$ is such that $D_0P + D_1$ is invertible. This rules out the case with $P$ and $\alpha$ identically equal to 0. If that is the case of interest, assume that $\alpha$ is positive but very close to zero.

The steps of the proof of (9.3.7) in the 3-variable case are the same as in the 2-variable case. First note from (9.3.6) that

$$
D_2 = \begin{bmatrix} (1-a)\theta(1-\gamma_k) \\ (\alpha-\gamma_k)a \\ -d\gamma_k \end{bmatrix}
\begin{bmatrix} \gamma_k & -\gamma_l & -\gamma_x \end{bmatrix} \equiv gh'.
$$

Thus, we can rewrite the left hand side as follows:

$$
\begin{aligned}
&\left( \gamma_kI - \left( D_0P^2 + D_1P + D_2 \right)(D_0P + D_1)^{-1} \right) D_2 \\
&= \left[ \gamma_k(gh') - (gh')(D_0P + D_1)^{-1}(gh') \right] - \left[ (D_0P + D_1)P(D_0P + D_1)^{-1}gh' \right].
\end{aligned} \tag{9.3.8}
$$

Both terms in square brackets in (9.3.8) are equal to 3×3 zero matrices. The first step is to show that

$$
(D_0P + D_1)^{-1}g = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}. \tag{9.3.9}
$$

The proof of this step is trivial since the first column of $D_0P + D_1$ is equal to $g$. Substituting (9.3.9) into (9.3.8), the result (9.3.7) follows immediately from the fact that $h'[1,0,0]' = \gamma_k$ and $P[1,0,0]' = 0$.
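Property (9.3.7) can also be confirmed numerically. The sketch below is an illustration under stated assumptions: all parameter values are hypothetical, $a$ through $f$ are treated as free numbers rather than derived from the model's decision rules, and $P$ is taken to be diagonal with a zero in the technology position so that $P[1,0,0]' = 0$.

```python
import numpy as np

# Hypothetical values; a..f are treated as free parameters here.
theta, alpha = 0.33, 0.9
gamma_k, gamma_l, gamma_x = 0.95, -0.10, -0.05
a, b, c = -0.4, 1.2, 0.3
d, e, f = 0.5, -0.7, 0.8
P = np.diag([0.0, 0.95, 0.90])   # first column is zero (unit root in technology)

# D0, D1 as in (9.3.4)-(9.3.5); D2 = g h' as in (9.3.6).
D0 = np.array([[1 - theta + theta * a, -theta * b, -theta * c],
               [-a, b, c],
               [-d, e, f]])
D1 = np.array([[theta * (1 - a) * (1 - gamma_k),
                theta * (b + (1 - a) * gamma_l),
                theta * (c + (1 - a) * gamma_x)],
               [(alpha - gamma_k) * a, -alpha * b + gamma_l * a, -alpha * c + gamma_x * a],
               [-d * gamma_k, d * gamma_l, d * gamma_x]])
g = D1[:, 0]
h = np.array([gamma_k, -gamma_l, -gamma_x])
D2 = np.outer(g, h)

A = D0 @ P + D1                   # assumed invertible for these values
first_step = np.linalg.solve(A, g)            # should be [1, 0, 0]' by (9.3.9)
lhs = (gamma_k * np.eye(3) - (D0 @ P @ P + D1 @ P + D2) @ np.linalg.inv(A)) @ D2

print(first_step)
print(np.abs(lhs).max())          # left-hand side of (9.3.7), up to rounding
```

Only the first column of $D_1$ and the zero first column of $P$ matter for the result, which is why the remaining entries can be chosen freely as long as $D_0P + D_1$ stays invertible.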

9.3.1.3. Proposition 4: Model has infinite-order VAR

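The map between the theoretical MA and the VAR is the same as before. What is new is the VAR representation.

Proposition 4. The model described above has a VAR representation with coefficients $B_j$ that satisfy

$$
B_j = MB_{j-1} \tag{9.3.10}
$$

for $j \ge 2$, with $B_1 = C_1 = (D_0P + D_1)D_0^{-1}$. The matrix $M$ is 3×3 with eigenvalues equal to 0, $\alpha$, and $(1-\delta)/[z(1+g_n)]$.

Proof of Proposition 4. The first part of the proof is the same as for Proposition 1. The second part, involving the expressions of the eigenvalues, is different. In the three shock case, one can use the same derivations as those in Proposition 1 to show that $[1,0,0]'h' - D_0^{-1}D_1$ has the same eigenvalues as $M$. In this case, $D_0^{-1}$ is given by:

$$
D_0^{-1} = \frac{1}{|D_0|}
\begin{bmatrix} bf - ce & \theta(bf - ce) & 0 \\ af - cd & (1-\theta)f + \theta(af - cd) & -(1-\theta)c \\ bd - ae & -(1-\theta)e + \theta(bd - ae) & (1-\theta)b \end{bmatrix}
$$

and the elements of $[1,0,0]'h' - D_0^{-1}D_1$ are given by

$$
\begin{aligned}
(1,1) &= \gamma_k - \theta(1 - \gamma_k - a + a\alpha)/(1-\theta) \\
(1,2) &= -\gamma_l - \theta(\gamma_l + b - b\alpha)/(1-\theta) \\
(1,3) &= -\gamma_x - \theta(\gamma_x + c - c\alpha)/(1-\theta) \\
(2,1) &= \left[ (af - cd)(\gamma_k - \theta + \theta a - \theta a\alpha) - af(1-\theta)\alpha \right]/|D_0| \\
(2,2) &= \left[ (af - cd)(-\gamma_l - \theta b + \theta b\alpha) + bf(1-\theta)\alpha \right]/|D_0| \\
(2,3) &= \left[ (af - cd)(-\gamma_x - \theta c + \theta c\alpha) + cf(1-\theta)\alpha \right]/|D_0| \\
(3,1) &= \left[ (bd - ae)(\gamma_k - \theta + \theta a - \theta a\alpha) + ae(1-\theta)\alpha \right]/|D_0| \\
(3,2) &= \left[ (bd - ae)(-\gamma_l - \theta b + \theta b\alpha) - be(1-\theta)\alpha \right]/|D_0| \\
(3,3) &= \left[ (bd - ae)(-\gamma_x - \theta c + \theta c\alpha) - ce(1-\theta)\alpha \right]/|D_0|
\end{aligned}
$$

where $|D_0| = (1-\theta)(bf - ce)$. To prove the proposition, we will show that $\mathrm{trace}([1,0,0]'h' - D_0^{-1}D_1)$ equals the sum of the proposed eigenvalues, $|[1,0,0]'h' - D_0^{-1}D_1| = 0$, and $|[1,0,0]'h' - D_0^{-1}D_1 - \alpha I| = 0$. These three conditions uniquely determine the three eigenvalues.

To compute the trace, sum (1,1), (2,2), and (3,3):

$$
\begin{aligned}
\mathrm{trace}\left( [1,0,0]'h' - D_0^{-1}D_1 \right)
&= \gamma_k - \theta(1 - \gamma_k - a + a\alpha)/(1-\theta) \\
&\quad + \left\{ (af - cd)(-\gamma_l - \theta b + \theta b\alpha) + bf(1-\theta)\alpha + (bd - ae)(-\gamma_x - \theta c + \theta c\alpha) - ce(1-\theta)\alpha \right\}/|D_0| \\
&= \alpha + \left\{ (\gamma_k - \theta)(bf - ce) - \gamma_l(af - cd) - \gamma_x(bd - ae) \right\}/|D_0| \\
&= \alpha + \frac{(\gamma_k - \theta)\phi_{xk'}(b\gamma_x - c\gamma_l) - \gamma_l\left( a\phi_{xk'}\gamma_x - c(\phi_{xk} + \phi_{xk'}\gamma_k - \theta) \right) - \gamma_x\left( b(\phi_{xk} + \phi_{xk'}\gamma_k - \theta) - a\phi_{xk'}\gamma_l \right)}{(1-\theta)\phi_{xk'}(b\gamma_x - c\gamma_l)} \\
&= \alpha + \frac{\theta(1 - \phi_{xk'}) - \phi_{xk}}{\phi_{xk'}(1-\theta)} \qquad (9.3.11) \\
&= \alpha + \frac{1-\delta}{z(1+g_n)} \qquad (9.3.12)
\end{aligned}
$$

where $z$ without a subscript is the steady state value.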

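Next we compute the determinant of $[1,0,0]'h' - D_0^{-1}D_1$ and show it is 0. Denoting the matrix by $\mathcal{M}$, we get

$$
\begin{aligned}
\det(\mathcal{M}) &= \mathcal{M}_{1,1}|\mathcal{M}([2,3],[2,3])| - \mathcal{M}_{1,2}|\mathcal{M}([2,3],[1,3])| + \mathcal{M}_{1,3}|\mathcal{M}([2,3],[1,2])| \\
&= \left( \gamma_k - \theta(1 - \gamma_k - a + a\alpha)/(1-\theta) \right)(1-\theta)\alpha d(fb - ec)(\gamma_lc - \gamma_xb)/|D_0|^2 \\
&\quad + \left( \gamma_l + \theta(\gamma_l + b - b\alpha)/(1-\theta) \right)(1-\theta)\alpha d(fb - ec)(-\gamma_kc + \theta c + \gamma_xa)/|D_0|^2 \\
&\quad - \left( \gamma_x + \theta(\gamma_x + c - c\alpha)/(1-\theta) \right)(1-\theta)\alpha d(fb - ec)(-\gamma_kb + \theta b + \gamma_la)/|D_0|^2 \\
&= \Big\{ \left( \gamma_k(1-\theta) - \theta(1 - \gamma_k - a + a\alpha) \right)(\gamma_lc - \gamma_xb) \\
&\qquad + \left( \gamma_l(1-\theta) + \theta(\gamma_l + b - b\alpha) \right)(-\gamma_kc + \theta c + \gamma_xa) \\
&\qquad - \left( \gamma_x(1-\theta) + \theta(\gamma_x + c - c\alpha) \right)(-\gamma_kb + \theta b + \gamma_la) \Big\}\,\alpha d(fb - ec)/|D_0|^2 \\
&= \Big\{ \left( \gamma_k - \theta + \theta a(1-\alpha) \right)(\gamma_lc - \gamma_xb) \\
&\qquad + \left( \gamma_l + \theta b(1-\alpha) \right)(-(\gamma_k - \theta)c + \gamma_xa) \\
&\qquad - \left( \gamma_x + \theta c(1-\alpha) \right)(-(\gamma_k - \theta)b + \gamma_la) \Big\}\,\alpha d(fb - ec)/|D_0|^2 \\
&= 0.
\end{aligned} \tag{9.3.13}
$$

Finally, we take the determinant of $\mathcal{M} - \alpha I$ and show it is 0 as follows:

$$
\begin{aligned}
\det(\mathcal{M} - \alpha I)
&= (\mathcal{M}_{1,1} - \alpha)|\mathcal{M}([2,3],[2,3]) - \alpha I| - \mathcal{M}_{1,2}\left( |\mathcal{M}([2,3],[1,3])| - \alpha\mathcal{M}_{2,1} \right) \\
&\quad + \mathcal{M}_{1,3}\left( |\mathcal{M}([2,3],[1,2])| + \alpha\mathcal{M}_{3,1} \right) \\
&= \alpha\Big\{ \alpha\left[ \mathcal{M}_{1,1} + \mathcal{M}_{2,2} + \mathcal{M}_{3,3} - \alpha \right]
- \mathcal{M}_{1,1}\mathcal{M}_{2,2} - \mathcal{M}_{1,1}\mathcal{M}_{3,3} - \mathcal{M}_{2,2}\mathcal{M}_{3,3} \\
&\qquad + \mathcal{M}_{1,2}\mathcal{M}_{2,1} + \mathcal{M}_{1,3}\mathcal{M}_{3,1} + \mathcal{M}_{2,3}\mathcal{M}_{3,2} \Big\} \\
&= \alpha\Big\{ \alpha\left[ \mathrm{trace}(\mathcal{M}) - \alpha \right]
- (\mathcal{M}_{1,1}\mathcal{M}_{2,2} - \mathcal{M}_{1,2}\mathcal{M}_{2,1})
- (\mathcal{M}_{1,1}\mathcal{M}_{3,3} - \mathcal{M}_{1,3}\mathcal{M}_{3,1})
- (\mathcal{M}_{2,2}\mathcal{M}_{3,3} - \mathcal{M}_{2,3}\mathcal{M}_{3,2}) \Big\} \\
&= \alpha\Big\{ \alpha\big[ (bf - ce)(\gamma_k - \theta + \theta a - \theta a\alpha)
+ (af - cd)(-\gamma_l - \theta b + \theta b\alpha) + bf(1-\theta)\alpha \\
&\qquad\quad + (bd - ae)(-\gamma_x - \theta c + \theta c\alpha) - ce(1-\theta)\alpha - \alpha(1-\theta)(bf - ce) \big] \\
&\qquad - f\alpha\left[ b(\gamma_k - \theta) - a\gamma_l \right]
+ e\alpha\left[ c(\gamma_k - \theta) - a\gamma_x \right]
- d\alpha\left[ \gamma_lc - \gamma_xb \right] \Big\}/|D_0| \\
&= 0.
\end{aligned} \tag{9.3.14}
$$

The result in (9.3.14) implies that $\alpha$ is an eigenvalue. The result in (9.3.13) implies that 0 is an eigenvalue. Given these results, the fact that the trace is (9.3.12) implies that $(1-\delta)/[z(1+g_n)]$ is the third eigenvalue. This completes the proof.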
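The eigenvalue claims can be checked numerically for the matrix $[1,0,0]'h' - D_0^{-1}D_1$. This is a sketch under stated assumptions: the parameter values are hypothetical, $a$ through $f$ are treated as free numbers, and the variable names (`N`, `third`) are introduced only for this illustration. The third eigenvalue is compared with the second line of the trace derivation; mapping it to $(1-\delta)/[z(1+g_n)]$ requires the model's expressions for $\phi_{xk}$ and $\phi_{xk'}$, which are not reproduced here.

```python
import numpy as np

# Hypothetical values; a..f are treated as free parameters here.
theta, alpha = 0.33, 0.9
gamma_k, gamma_l, gamma_x = 0.95, -0.10, -0.05
a, b, c = -0.4, 1.2, 0.3
d, e, f = 0.5, -0.7, 0.8

D0 = np.array([[1 - theta + theta * a, -theta * b, -theta * c],
               [-a, b, c],
               [-d, e, f]])
D1 = np.array([[theta * (1 - a) * (1 - gamma_k),
                theta * (b + (1 - a) * gamma_l),
                theta * (c + (1 - a) * gamma_x)],
               [(alpha - gamma_k) * a, -alpha * b + gamma_l * a, -alpha * c + gamma_x * a],
               [-d * gamma_k, d * gamma_l, d * gamma_x]])

e1 = np.array([1.0, 0.0, 0.0])
h = np.array([gamma_k, -gamma_l, -gamma_x])
N = np.outer(e1, h) - np.linalg.solve(D0, D1)   # [1,0,0]'h' - D0^{-1} D1

det_D0 = (1 - theta) * (b * f - c * e)          # closed form for |D0|
third = ((gamma_k - theta) * (b * f - c * e)
         - gamma_l * (a * f - c * d)
         - gamma_x * (b * d - a * e)) / det_D0  # trace minus alpha

eigs = np.sort(np.linalg.eigvals(N).real)
print("eigenvalues:", eigs)
print("alpha and implied third eigenvalue:", alpha, third)
print("det(N), det(N - alpha I):", np.linalg.det(N), np.linalg.det(N - alpha * np.eye(3)))
```

Checking the two determinants and the trace, rather than the eigenvalues alone, mirrors the three conditions used in the proof and is less sensitive to rounding when eigenvalues are close together.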

9.3.1.4. A Way to Make M Singular

Above we included the investment share in the VAR. The investment share is typically

added to capture the capital dynamics if capital is unobserved. What if we assume that

capital is observed and use the capital share instead?

We can see the answer directly from the proof in Proposition 4. At the step (equation

(9.3.11)) that we fill in expressions for φxk and φxk′ using (9.1.18), we could instead use

φxk = 1 and φxk′ = 0. This yields a third eigenvalue equal to -1/0 or −∞. This clearly

doesn’t work since the MA is not invertible.

However, it shows me what would work: adding the capital next period relative to output, $\log(k_{t+1}/y_t)$, and, therefore, setting $\phi_{xk} = 0$ and $\phi_{xk'} = 1$. If $\alpha = 0$, then the matrix $M$ has 3 zero eigenvalues. A researcher running a VAR would find that $B_2$ is singular and the rest of the $B_j$, $j \ge 3$, are zero matrices. In fact the structure would be such that the second and third columns of $B_2$ would be equal and equal to the negative of the first column of $B_2$. That is how it works: certain lags are cancelling so it effectively mimics the model's finite-lag VAR.

What is interesting is that it won't work if we divide $k_{t+1}$ by $k_t$ and include the log of the growth rate of capital. If we add $\log(k_{t+1}/k_t)$ to the VAR, then we proceed the same way through the proof of Proposition 4 using $d = \gamma_k - 1$, $e = \gamma_l$, and $f = \gamma_x$. The result is that the eigenvalues of $M$ are 0, $\alpha$, and 1. The fact that one is 1 means that the MA is not invertible.
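The two alternatives discussed above can be compared with a small numerical experiment. This is a sketch under stated assumptions: the parameter values are hypothetical, the helper `third_eigenvalue` is introduced here for illustration, and it returns the trace of $[1,0,0]'h' - D_0^{-1}D_1$ minus $\alpha$, which the proof of Proposition 4 identifies as the third eigenvalue. For $\log(k_{t+1}/y_t)$ the third row uses $\phi_{xk} = 0$ and $\phi_{xk'} = 1$ in the definitions of $d$, $e$, and $f$; for $\log(k_{t+1}/k_t)$ it uses $d = \gamma_k - 1$, $e = \gamma_l$, $f = \gamma_x$.

```python
import numpy as np

# Hypothetical values, chosen only for illustration.
theta, alpha = 0.33, 0.9
gamma_k, gamma_l, gamma_x = 0.95, -0.10, -0.05
a, b, c = -0.4, 1.2, 0.3

def third_eigenvalue(d, e, f):
    """trace([1,0,0]'h' - D0^{-1} D1) - alpha for a given third row (-d, e, f)."""
    D0 = np.array([[1 - theta + theta * a, -theta * b, -theta * c],
                   [-a, b, c],
                   [-d, e, f]])
    D1 = np.array([[theta * (1 - a) * (1 - gamma_k),
                    theta * (b + (1 - a) * gamma_l),
                    theta * (c + (1 - a) * gamma_x)],
                   [(alpha - gamma_k) * a, -alpha * b + gamma_l * a, -alpha * c + gamma_x * a],
                   [-d * gamma_k, d * gamma_l, d * gamma_x]])
    h = np.array([gamma_k, -gamma_l, -gamma_x])
    N = np.outer(np.array([1.0, 0.0, 0.0]), h) - np.linalg.solve(D0, D1)
    return np.trace(N) - alpha

# log(k_{t+1}/y_t): phi_xk = 0, phi_xk' = 1 in the definitions of d, e, f.
ky = third_eigenvalue(gamma_k - theta - (1 - theta) * a,
                      gamma_l - (1 - theta) * b,
                      gamma_x - (1 - theta) * c)

# log(k_{t+1}/k_t): d = gamma_k - 1, e = gamma_l, f = gamma_x.
kgrowth = third_eigenvalue(gamma_k - 1, gamma_l, gamma_x)

print("third eigenvalue with log(k'/y):", ky)
print("third eigenvalue with log(k'/k):", kgrowth)
```

Working with the trace avoids computing a repeated zero eigenvalue directly, which numerical eigenvalue routines resolve only to limited accuracy.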

What these results tell me is that one has to proceed carefully, using lots of the details of the model, to determine if the SVAR has a short-lag representation. Since most business cycle models have a short-lag state-space representation, we advise using it directly. The state-space representation also allows us to treat the capital stocks as unobserved. This is certainly necessary in business cycle models with sticky prices and staggered contracts; the state vector includes the distribution of capital stocks, which is unobserved.

References

Aiyagari, S. Rao and Ellen R. McGrattan. 1998. The optimum quantity of debt, Journal

of Monetary Economics, 42: 447-469.

Anderson, Brian D. O. and John B. Moore. 1979. Optimal Filtering, Englewood Cliffs:

Prentice-Hall.

Anderson, Evan, Lars Peter Hansen, Ellen R. McGrattan, and Thomas J. Sargent. 1996.

Mechanics of Forming and Estimating Dynamic Linear Economies, Handbook of

Computational Economics, eds. H. Amman, D. Kendrick, and J. Rust, (North-

Holland).

Bertsekas, Dimitri and Steven Shreve. 1978. Stochastic Optimal Control: The Discrete

Time Case, New York: Academic Press.

Blanchard, Olivier J. and Charles M. Kahn. 1980. The solution of linear difference models

under rational expectations, Econometrica, 48: 1305-1311.

Braun, Richard A. and Ellen R. McGrattan. 1993. The macroeconomics of war and peace.

NBER Macroeconomics Annual 1993. Cambridge: MIT Press.

Candler, G.V., Wright, M.J., and McDonald, J.D. (1994). A Data Parallel LU Relaxation

Method for Reacting Flows, AIAA Journal, Vol. 32, No. 12, pp. 2380-2386.

Chari, V.V., Patrick Kehoe, and Ellen R. McGrattan. 1997. The poverty of nations: A

quantitative investigation. Staff Report #204, Federal Reserve Bank of Minneapo-

lis.

Chari, V. V., Patrick J. Kehoe, and Ellen R. McGrattan. 1999. Sticky price models

of the business cycle: Can the contract multiplier solve the persistence problem?

Econometrica, 68(5): 1151–1179, September 2000.

Chari, V. V., Patrick J. Kehoe, and Ellen R. McGrattan. 2002 Accounting for the Great

Depression, American Economic Review, Papers and Proceedings, 92(2): 22–27.

Chari, V. V., Patrick J. Kehoe, and Ellen R. McGrattan. 2005. A critique of structural

VARS Using business cycle theory, Staff Report #364, Federal Reserve Bank of

Minneapolis.

Chari, V. V., Patrick J. Kehoe, and Ellen R. McGrattan. 2006. Business cycle accounting,

Econometrica, forthcoming.

Christiano, Lawrence J. 1990. Solving the stochastic growth model by linear-quadratic


approximation and by value-function iteration, Journal of Business and Economic

Statistics, 8: 23-26.

Christiano, Lawrence J. and Jonas D. Fisher. 2000. Algorithms for solving dynamic models

with occasionally binding constraints, Journal of Economic Dynamics and Control,

24(8): 1179–1232.

Computational Methods for the Study of Dynamic Economies, eds. R. Marimon and A. Scott

(Oxford University Press, Oxford, U.K.).

Den Haan, W. J. and A. Marcet. 1990. Solving the stochastic growth model by parame-

terized expectations, Journal of Business and Economic Statistics, 8: 31-34.

Fletcher, R. 1987. Practical methods of optimization, (Wiley: Chichester, U.K.).

Frontiers of Business Cycle Research. 1994. ed. T. F. Cooley (Princeton University Press,

Princeton, NJ).

Courant, R., and Hilbert, D. (1962). Methods of Mathematical Physics, Vols. I and II,

New York, Interscience.

Ferziger, J.H., and Peric, M. (1996). Computational Methods for Fluid Dynamics, Berlin,

Springer-Verlag.

Golub, G. H. and J. M. Ortega. 1992. Scientific Computing and Differential Equations:

An Introduction to Numerical Methods (Academic Press: New York, NY).

Golub, G. H. and C. F. Van Loan. 1989. Matrix Computations (Johns Hopkins Press:

Baltimore, MD).

Hansen, Lars Peter. 1982. Large sample properties of generalized method of moments

estimators, Econometrica, 50:1029–1054.

Hansen, Lars Peter, and Kenneth Singleton. 1982. Generalized instrumental variables

estimation of nonlinear rational expectations models, Econometrica, 50(5): 1269–

1286.

Hirsch, C. (1988). Numerical Computation of Internal and External Flows, Vols. I and II,

New York, Wiley.

Hughes, Thomas J.R. 1987. The Finite Element Method: Linear Static and Dynamic

Finite Element Analysis. Englewood Cliffs: Prentice-Hall

Judd, Kenneth L.. 1992. Projection methods for solving aggregate growth models, Journal

of Economic Theory, 58: 410-452.


Judd, Kenneth L. 1998. Numerical Methods in Economics (MIT Press, Cambridge, MA).


Kwakernaak, Huibert and Raphael Sivan. 1972. Linear Optimal Control Systems, New

York: Wiley and Sons.

Kydland, Finn E. and Edward C. Prescott. 1982. Time to build and aggregate fluctuations,

Econometrica, 50, 1345-1370.

McGrattan, Ellen R. 1989. Computation and Application of Equilibrium Models with

Distortionary Taxes, Stanford University, Thesis.

McGrattan, Ellen R. 1990. Solving the stochastic growth model by linear-quadratic ap-

proximation, Journal of Business and Economic Statistics, 8: 41-44.

McGrattan, Ellen R. 1994. The macroeconomic effects of distortionary taxation, Journal

of Monetary Economics, 33: 573-601.

McGrattan, Ellen R. 1994. A note on computing competitive equilibria in linear models,

Journal of Economic Dynamics and Control, 18: 149-160.

McGrattan, Ellen R. 1994. A progress report on business cycle models, Federal Reserve

Bank of Minneapolis Quarterly Review, Fall.

McGrattan, Ellen R. 1996. Solving the stochastic growth model with a finite element

method, Journal of Economic Dynamics and Control, 20: 19-42.

McGrattan, Ellen R., Richard Rogerson, and R. Wright, 1997, An Equilibrium Model

of the Business Cycle with Household Production and Fiscal Policy, International

Economic Review, 38: 267-290.

McGrattan, E. R., and E. C. Prescott, 2004, The 1929 stock market: Irving Fisher was

right, International Economic Review, 45(4): 991–1009.

Press, W.H., B.P. Flannery, S.A. Teukolsky, and W.T. Vetterling. 1986. Numerical recipes:

The art of scientific computing. Cambridge: Cambridge University Press.

Reddy, J.N. 1993. An introduction to the finite element method. New York: McGraw-Hill.

Saad, Yousef. 1996. Iterative Methods for Sparse Linear Systems. Boston: PWS.

Sargent, Thomas J. 1980. Notes on Filtering, Control, and Rational Expectations, unpub-

lished manuscript, University of Minnesota.


Sargent, Thomas J. 1987. Dynamic Macroeconomic Theory. Cambridge: Harvard Univer-

sity Press.

Taylor, John B. and Harald Uhlig. 1990. Solving nonlinear stochastic growth models:

A comparison of alternative solution methods. Journal of Business and Economic

Statistics 8: 1-17.

Vaughan, David R. 1970. A Nonrecursive Algebraic Solution for the Discrete Riccati

Equation, IEEE Transactions on Automatic Control, AC-15, 597-599.
