Euler Equation Based Policy Function Iteration
Hang Qian
Iowa State University
Developed by Coleman (1990) and Baxter, Crucini, and Rouwenhorst (1990), policy function iteration on the
basis of FOCs is one of the most effective ways to solve dynamic programming problems. It is fast and flexible,
and can be applied to many complicated problems.
1. The simple growth model
The starting point of contemporary macroeconomics is the simple growth model. With log utility and full
capital depreciation it reads

max_{c_t, k_{t+1}} E_0 ∑_{t=0}^∞ β^t ln c_t

s.t. c_t + k_{t+1} = k_t^α (1.1)

We know the problem has an analytic solution, which serves as our benchmark:

c_t = (1 − αβ) k_t^α (1.2)
k_{t+1} = αβ k_t^α (1.3)
Let us see how the policy function iteration works.
The general form of the Euler equation is u′(c_t) = β E_t[ u′(c_{t+1}) f′(k_{t+1}) ].
For our problem,

1/c_t = αβ k_{t+1}^{α−1} / c_{t+1} (1.4)

Suppose we have a guess on the policy function for consumption,

c_t = g_c(k_t), (1.5)

and the policy function for next period's capital,

k_{t+1} = g_k(k_t). (1.6)

In this example g_k(·) seems trivial, since the budget constraint (1.1) requires
g_k(k) = k^α − g_c(k).
It is important to note that the policy function is time invariant, that is to say, c_{t+1} = g_c(k_{t+1}), which
inherits the functional form g_c(·). Then (1.4) becomes

1/c_t = αβ k_{t+1}^{α−1} / g_c(k_{t+1}) (1.7)
Now we have two equations, (1.1) and (1.7). Treat the two controls c_t and k_{t+1} as unknowns, and we are
ready to solve for them as functions of the current state k_t. If our first GUESS of the consumption policy is
correct, then the solution of (1.1) and (1.7) should be identical to the guessed policy functions. In that case we have
VERIFIED the validity of the policy functions. In other words, the Euler equation admits a
fixed point for the policy functions. Coleman (1989, 1990) shows that this fixed-point equation has a unique
positive solution and that iterations based on this equation converge uniformly to this solution.
Euler equation based policy function iteration is important in that it justifies the doctrine that "N equations
solve for N unknowns". In this example, we often hear people say that (1.1) and (1.4) solve for two unknowns
{c_t, k_{t+1}}, but we have varied time subscripts in the two equations! (We have c_t as well as c_{t+1},
and k_t as well as k_{t+1}; obviously they are not the same thing.) The thrust of policy function iteration is that the
time-invariant policy function enables us to transform tomorrow's policies into today's policies, so that we
can indeed solve N equations for N unknowns.
In summary, to find the paths of N sequences, policy function iteration works in the following way:
Step 1: Write down N FOCs (including resource constraints).
Step 2: Make an initial guess of the N policy functions.
Step 3: Use those policies to express tomorrow's controls in terms of today's controls and states.
Step 4: Solve the N nonlinear equations.
Step 5: Compare the solutions to the initial guess; iterate Steps 2–4 until convergence.
Important notes:
In Step 1, the FOCs contain both today's and tomorrow's controls, so the system is not yet solvable.
In Step 3, our guess of the policy function is used only to transform tomorrow's controls; we do not
use the policy guess to replace today's controls with functions of the current states.
In Step 4, once the substitution is done, the N nonlinear equations involve only N of today's control
variables and several current state variables, so we can solve for those N controls.
In Step 5, we compare the solved N controls with our initial guess. If they agree, we are done; otherwise we
treat the solved N controls as the initial guess for another round of iteration.
As Baxter, Crucini, and Rouwenhorst (1990) point out, policy function iteration has a natural
interpretation as backward solving of a dynamic system. Suppose the sequential problem has a terminal
time T. Then the initial guess can be viewed as the period-T policy functions, and the solutions of the N
equations are the period T−1 policy functions. As the iteration goes on, we get policy functions for T−2, T−3, …
Always bear this interpretation in mind when using policy function iteration. It helps us
understand why we do not use the policy guess to replace today's controls themselves.
Even with pencil and paper, we can conduct policy function iteration for the simple growth model. For
instance, if we start from the policy c_t = M k_t^α, where M is an undetermined
coefficient, we can use (1.1) and (1.7) to pin down M. Alternatively, we may assign a specific number to
M; after several iterations, we should find M = 1 − αβ.
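The pencil-and-paper iteration can be checked numerically. Substituting the guess c = M k^α into (1.1) and (1.7) gives the update M_new = M / (αβ + M), whose fixed point is M = 1 − αβ. The sketch below assumes the log-utility, full-depreciation model above, with illustrative parameter values for α and β (not values from this note):

```python
# Iterating on the undetermined coefficient M in the guess c = M * k^alpha.
# A minimal sketch assuming log utility and full depreciation; the parameter
# values are illustrative, not taken from the text.
alpha, beta = 0.36, 0.96

M = 0.5  # arbitrary starting guess
for _ in range(100):
    # Substituting c' = M*(k')^alpha into the Euler equation and the budget
    # constraint yields today's consumption share M / (alpha*beta + M).
    M = M / (alpha * beta + M)

# M converges to 1 - alpha*beta, matching the analytic policy (1.2)
```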
To perform policy function iteration on a computer, we first discretize the state space into a grid of points.
For example, we discretize k into a vector of grid points, and then we design a guessed consumption policy,
which is also a vector whose jth element corresponds to the jth grid point of k. Similarly we
guess the capital policy. We naturally wish our guess to be close to the truth. In principle
the guess is a state-by-state object: for each grid point, we specify the controls in that state.
The next step is to use (1.1) and (1.7) to solve for c_t and k_{t+1}. We must solve them at each point of the grid.
To be specific: however many grid points we create, that is how many times we need to solve the N-equation
nonlinear system. For example, when we are working on the jth grid point, so that the value of k_t is known, we
are looking for numerical values of c_t and k_{t+1}, which are roots of (1.1) and (1.7). We resort to a numerical
root finder, such as the Gauss–Newton algorithm. With initial values of c_t and k_{t+1}, we look up the policy table
to pin down c_{t+1}. Then we check how well our initial values fit the equations, which also determines the
improved c_t and k_{t+1} that fit the equations better. Note that in the table lookup we need the value of c_{t+1}
corresponding to k_{t+1}; however, k_{t+1} generally does not coincide with one of the grid points, so the table lookup
actually requires interpolation. In practice, people usually use linear (bi-linear, multi-linear, depending on the
dimension of the state variables) interpolation to find the values of tomorrow's controls. In that case, the
policy iteration algorithm is not simply a grid search. The policy function goes beyond the nodes of the
grid and can be viewed as piecewise linear. That is why a small number of grid points, if chosen properly, is
generally sufficient.
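The grid-based procedure just described can be sketched as follows. This is a minimal illustration for the log-utility, full-depreciation model of this section, with illustrative parameters and simple bisection standing in for a Gauss–Newton root finder:

```python
import numpy as np

# Policy function iteration on a capital grid, with linear interpolation for
# the table lookup of tomorrow's consumption.  A sketch with illustrative
# parameters; not the note's own values.
alpha, beta = 0.36, 0.96
kgrid = np.linspace(0.05, 5.0, 200)
c_pol = 0.5 * kgrid ** alpha          # initial guess of the consumption policy

def euler_residual(kp, k):
    """Residual of (1.7): 1/c_t - alpha*beta*kp^(alpha-1)/c_{t+1}, where
    c_t = k^alpha - kp from (1.1) and c_{t+1} comes from table lookup
    (linear interpolation of the current policy guess)."""
    c_today = k ** alpha - kp
    c_tomorrow = np.interp(kp, kgrid, c_pol)
    return 1.0 / c_today - alpha * beta * kp ** (alpha - 1.0) / c_tomorrow

for it in range(200):
    new_pol = np.empty_like(c_pol)
    for j, k in enumerate(kgrid):
        lo, hi = 1e-8, k ** alpha - 1e-8     # feasible bracket for k_{t+1}
        for _ in range(60):                  # bisection: residual rises in kp
            mid = 0.5 * (lo + hi)
            if euler_residual(mid, k) > 0.0:
                hi = mid
            else:
                lo = mid
        new_pol[j] = k ** alpha - 0.5 * (lo + hi)   # updated consumption
    if np.max(np.abs(new_pol - c_pol)) < 1e-8:      # compare to the guess
        c_pol = new_pol
        break
    c_pol = new_pol
```

After convergence, `c_pol` is close to the analytic policy (1 − αβ) k^α up to interpolation error.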
As a numerical example, we set the parameters α and β, and 500 grid points are created evenly spaced on (0.01, 5).
As is seen in Figure 1.1, the simulated paths are almost identical to their theoretical counterparts, confirming
the usefulness of the policy function. Note that it takes 28 seconds on my computer to achieve convergence
after 11 iterations, which gives it some speed advantage over value function iteration.
Figure 1.1 Simulated path of simple growth model
We conclude the simple growth model by explaining an important relationship: policy function
iteration vs. value function iteration. As learners of practical dynamic programming, we are probably most
familiar with value function iteration based on the Bellman equation. So a natural question follows: what
are the similarities and differences between policy iteration and value function iteration?
We have explained the algorithm of Euler equation based policy function iteration. It seems that policy
iteration is stand-alone, with the value function playing no role. Yes, but let us think deeper.
In our simple growth model, the Bellman equation is:
V(k_t) = max_{c_t, k_{t+1}} { ln c_t + β V(k_{t+1}) }

s.t. c_t + k_{t+1} = k_t^α

The time subscript is unnecessary, but we keep it anyway.
To perform value function iteration on a computer, we also discretize the state space into a grid of
points, say a vector of capital values. We then guess an initial value function V(k), which is also a
vector. The next step is to work on the RHS of the Bellman equation. In essence, for each state (each element of
the vector), we need to find the "best attainable" (c, k′) pair that maximizes the RHS of the
Bellman equation. Of course, table lookup is required to compute V(k′), and the time-invariant
initial guess of V(·) is readily there for that purpose. Once the RHS of the Bellman equation is computed for every state, we
have updated the value function, because the LHS is simply V(k) itself. In other words, the Bellman
equation admits a fixed point for the functional V(·). Repeat the above steps and iterate the value function till
convergence. One interpretation of value function iteration is to think about the finite-period counterpart:
essentially, we take tomorrow's value function as given, search for today's optimal policy,
and end up with today's value function.
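For comparison, the value function iteration just described can be sketched compactly for the same model. The parameters are illustrative, and the maximization over k′ is done by grid search, as in the text:

```python
import numpy as np

# Value function iteration for the log-utility, full-depreciation model;
# illustrative parameters, grid search over k' on the same capital grid.
alpha, beta = 0.36, 0.96
kgrid = np.linspace(0.05, 5.0, 300)

def bellman_rhs(V):
    """RHS of the Bellman equation on the grid: rows index today's k,
    columns index the choice of k'; infeasible choices get -inf."""
    c = kgrid[:, None] ** alpha - kgrid[None, :]
    util = np.where(c > 1e-12, np.log(np.maximum(c, 1e-12)), -np.inf)
    return util + beta * V[None, :]          # table lookup of V(k')

V = np.zeros_like(kgrid)
for _ in range(2000):
    V_new = bellman_rhs(V).max(axis=1)       # "best attainable" pair per state
    if np.max(np.abs(V_new - V)) < 1e-9:
        V = V_new
        break
    V = V_new

kp_policy = kgrid[bellman_rhs(V).argmax(axis=1)]   # implied capital policy
```

The implied policy approximates αβ k^α, with accuracy limited by the grid spacing.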
What does "best attainable" mean? It is characterized by the FOCs (1.1) and (1.4) (and a transversality condition,
which we neglect). In the value function iteration algorithm, we grid-search for the "best attainable" policy,
while in the policy iteration algorithm, we work directly on the "best attainable" path prescribed by the FOCs.
In value function iteration, we iterate on the functional V(·) on the basis of the fixed-point Bellman equation,
while in policy iteration, we iterate on the policy functional on the basis of the fixed-point Euler equation. In
value function iteration, we take the functional form of tomorrow's value function as given and update
today's value function; in policy iteration, we take the functional form of tomorrow's policy function
as given and update today's policy function.
Since both stand-alone value function iteration and Euler equation policy iteration work, their
hybrid also does the job. Let us see how the Howard policy improvement algorithm works; this approach is
in line with Puterman and Brumelle (1979) and Puterman and Shin (1978).
First of all, we need to address one problem: given a full set of policy functions, how do we calculate
the associated value function?
In our simple growth problem, suppose we have some feasible policy (not necessarily optimal) that
satisfies the budget constraint:
c = g_c(k), k′ = g_k(k), with g_c(k) + g_k(k) = k^α.
The associated value function (not necessarily optimal) is defined as
V(k_0) = ∑_{t=0}^∞ β^t u(c_t).
We do not want to add the series up to infinity term by term. Instead, we can solve for V(·) by manipulating the
Bellman-type recursion when the current states are discretized into grid points:
V(k) = u(g_c(k)) + β V(g_k(k)).
Note that there is no max operator here, since we always use the prescribed policy.
For illustration we consider an example with only 5 states, ranging from 1 to 5, and imagine a
hypothetical feasible policy mapping each state into next period's state.
Now we are ready to solve for the value function V(·), which is a 5-by-1 vector too. Stacking the recursion
over the 5 states gives the linear system
V = u + β P V,
where u is the vector of period utilities under the prescribed policy and the 5-by-5 matrix P reflects the
policy function k′ = g_k(k). Can you figure out how the policy
matrix P is formed? Essentially this is a table lookup procedure. For example, suppose today we are in state 1, and
looking up the policy table we learn that tomorrow we will be in state 2. Then the (1,2) element of P should be one, and the other
elements in the first row should be zero.
With the above linear equations, we solve for the value function: V = (I − βP)^{−1} u.
To summarize, the linkage between the policy function and the value function is: on the one hand, if we know the
value function, we can work on the RHS of the Bellman equation to solve for the policy function, whose solution is
characterized by the N FOCs including the Euler equation. On the other hand, if we know the policy function, we
can build the policy matrix P and, by a matrix inversion, solve for the associated value function.
In the practical policy iteration procedure, we obtain a set of policies at each round of iteration. Though we
could invert the big policy matrix P each time to derive the associated value function, there is an alternative
algorithm. Note that when the policy functions converge, we have
V(k) = u(g_c(k)) + β V(g_k(k)),
so we can evaluate the RHS with the previous round's value function, and what we get should be this round's
value function. This algorithm avoids inverting the big policy matrix P at every iteration. Of course, in the first round,
we had better use the inversion method to get the value function.
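Both evaluation methods can be illustrated on a tiny 5-state deterministic policy. The successor states and utilities below are made up for illustration (the note's own 5-state numbers did not survive in this copy):

```python
import numpy as np

# Policy evaluation for a hypothetical 5-state deterministic policy.
beta = 0.96
succ = np.array([1, 2, 3, 3, 4])       # hypothetical: from state j go to succ[j]
u = np.log([1.0, 1.2, 1.5, 1.4, 1.1])  # hypothetical per-period utilities

# Build the policy matrix P: row j has a single 1 in the successor's column.
P = np.zeros((5, 5))
P[np.arange(5), succ] = 1.0

# Exact policy evaluation by inversion: V = (I - beta*P)^(-1) u
V = np.linalg.solve(np.eye(5) - beta * P, u)

# Cheaper alternative for later rounds: iterate V <- u + beta * P @ V,
# which avoids inverting the big policy matrix.
V_iter = np.zeros(5)
for _ in range(2000):
    V_iter = u + beta * P @ V_iter
```

Both computations agree; the iterative update is attractive when P is large.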
2. Adding labor to the model
We now enrich the benchmark growth model by adding labor; CRRA utility over consumption and leisure
is used. The planner maximizes

max E_0 ∑_{t=0}^∞ β^t u(c_t, l_t)

s.t. c_t + k_{t+1} = f(k_t, l_t) + (1 − δ) k_t (2.1)
The FOCs are the intratemporal condition, equating the marginal rate of substitution between leisure and
consumption to the marginal product of labor, and the intertemporal Euler equation. Rearranging terms yields
the intratemporal condition (2.2) and the Euler equation (2.3).
As before, the 3 equations (2.1), (2.2), (2.3) should solve for the 3 unknowns {c_t, l_t, k_{t+1}}.
To be more specific, we first guess a policy function for each of the control variables:
c_t = g_c(k_t), l_t = g_l(k_t), k_{t+1} = g_k(k_t).
(The subscript is simply a label, not a partial derivative.)
In practice, we always use (2.2) and (2.1) to generate g_l(k_t) and g_k(k_t) from g_c(k_t).
Then treat the above guess as tomorrow's policy, so that tomorrow's controls in (2.3) can be replaced by
the guessed policy functions evaluated at tomorrow's state; call the resulting equation (2.4).
So we can use (2.1), (2.2), (2.4) to solve for {c_t, l_t, k_{t+1}}, which should be functions of the current state k_t.
To perform policy iteration on a computer, we first discretize the state space into a grid, say a
vector of capital values. Then we conjecture the three policy functions, each of which is also a
vector on the grid. For each grid point, solve (2.1), (2.2), (2.4) and get updated policies at that point.
Interpolation is needed to look up the tables for tomorrow's consumption and labor policies.
There is one aggressive approach to iterating the policy, which is fast but not reliable.
We first guess the consumption policy g_c(k); then we use (2.2) to derive the policy of another control
variable, the labor policy g_l(k), and furthermore use (2.1) to derive the capital policy g_k(k).
Finally, view (2.3) as a fixed-point equation for the policy function g_c(·). That is, tomorrow's consumption and
labor inherit the functional forms g_c(·) and g_l(·) respectively, which are functions of k_{t+1} and can be
downgraded to functions of k_t with the aid of g_k(·).
This iteration algorithm solves the policy functions of the control variables in a sequential fashion, without
solving any nonlinear equations. Furthermore, it can be done all at once by manipulating vectors, without
grid-by-grid computation. However, convergence is not guaranteed.
Here we provide a numerical example. We set the model parameters, and 600 grid points are created evenly
spaced on (9, 15). Policy function iteration cannot be improved further after 23 iterations; the elapsed time
is 84.35 seconds.
Iteration 23, policy norm 0.0044651, increment of value function 0.0025982
FOC1 deviation ratio: 3.0547e-008
FOC2 deviation ratio: 1.3659e-006
FOC3 deviation ratio: 0.0023979
Clearly, the model does not have an analytical solution, but we can calculate the steady state by solving the
FOCs. The theoretical steady states are C = 0.90746, L = 0.32621, K = 12.5162.
As shown in Figure 2.1, the simulated path looks reasonable and settles down to
C = 0.8848, L = 0.32216, K = 11.3338.
Figure 2.1 Simulated path of simple growth model with labor
3. Add Shocks to the model
Now we go a step further by introducing Markov productivity shocks.
The model is the same as in Section 2, except that production is scaled by a random technology level θ_t,
so the resource constraint becomes

c_t + k_{t+1} = θ_t f(k_t, l_t) + (1 − δ) k_t (3.1)

where the random technology θ_t takes on 3 values and
its law of motion is Markov with some transition matrix.
θ_t realizes itself at the beginning of period t, before decision making, so there is no randomness
within period t.
The FOCs are again the intratemporal condition and the Euler equation, the latter now containing a
conditional expectation E_t over tomorrow's shock. Rearranging terms yields the intratemporal condition
(3.2) and the stochastic Euler equation (3.3).
Before specifying the policy, we must first identify the state variables at period t. State variables are defined
as the set of variables that fully characterize the status quo (summarize the history). In other words, the
agent makes decisions solely on the basis of that set of state variables, everything else being irrelevant. The
agent then takes actions on the variables that she can control, hence the name control variables.
In models with shocks, the state space is enlarged by one dimension: the state is the pair (k_t, θ_t), and the
controls are {c_t, l_t, k_{t+1}}.
The 3 equations (3.1), (3.2), (3.3) should solve for the 3 unknowns {c_t, l_t, k_{t+1}}.
To be more specific, we guess an initial policy c_t = g_c(k_t, θ_t), l_t = g_l(k_t, θ_t), k_{t+1} = g_k(k_t, θ_t),
perhaps with the aid of (3.2) and (3.1).
Then treat the above guess as tomorrow's policy, so that tomorrow's controls in (3.3) can be replaced by the
guessed policy functions; call the resulting equation (3.4).
So we can use (3.1), (3.2), (3.4) to solve for {c_t, l_t, k_{t+1}}, which should be functions of the current states
(k_t, θ_t).
To perform policy iteration on a computer, we first discretize the state space into a two-dimensional grid of
(k, θ) points. Then we conjecture the three policy functions, each of which is also a
two-dimensional matrix. For each grid point, solve (3.1), (3.2), (3.4) and get updated policies at that point. Two-
dimensional interpolation is needed to look up the policy tables for tomorrow's controls. Usually this is done by
bi-linear interpolation.
Note that the RHS of (3.4) involves the random shock θ_{t+1} and a conditional expectation E_t(·) with respect
to it. For a discrete random shock this is very easy to cope with: simply plug in each possible value of θ_{t+1}, derive the
terms inside E_t(·) (through policy table lookup), and average with the weights assigned by the Markov transition
matrix. That is why θ_t must be one of the state variables: we need it to pin down which row of the transition
matrix we should use.
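This lookup-and-average step can be sketched as follows; the 3-regime shock values, transition matrix, and policy table below are hypothetical:

```python
import numpy as np

# Computing E_t[c(k', theta')] for a discrete Markov shock: plug in each
# possible theta', look up the policy table (interpolating in k'), and weight
# by the transition-matrix row for today's regime.  All numbers are illustrative.
kgrid = np.linspace(10.0, 13.0, 60)
theta = np.array([0.96, 1.0, 1.04])            # hypothetical 3-regime values
Pi = np.array([[0.80, 0.15, 0.05],
               [0.10, 0.80, 0.10],
               [0.05, 0.15, 0.80]])            # Pi[i, j] = P(theta[j] | theta[i])

# Hypothetical policy table c(k, theta): one column per shock regime.
c_table = 0.1 * kgrid[:, None] * theta[None, :]

def expected_c(kp, i_theta):
    """E_t[c(k', theta')] given today's regime i_theta and tomorrow's capital kp."""
    vals = np.array([np.interp(kp, kgrid, c_table[:, j]) for j in range(len(theta))])
    return Pi[i_theta] @ vals

expected_c(11.5, 1)   # averages the three regime-specific lookups with row 1 of Pi
```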
We might also be interested in calculating the value function associated with a full set of policy functions.
In principle, the value function is also two-dimensional: V(k, θ).
In practice, (k, θ) live on grids, and to facilitate matrix manipulation it is easier to vectorize V(k, θ),
i.e., to stack the matrix into a long vector. Stacking the recursion V = u + β P V over all vectorized states,
we again obtain
vec(V) = (I − βP)^{−1} vec(u),
where the policy matrix P is square with dimension equal to the total number of (k, θ) grid points. In each
row, there are only as many nonzero elements as there are shock regimes, and they sum to unity. The
numbers are determined by the Markov transition matrix, and their locations are determined by the policy
function g_k(k, θ).
Let us consider a hypothetical example. Capital can reside in one of three states, 1–3, and the Markovian
shock θ takes on two values, −1 and 1. Vectorizing with the capital index running fastest, state (k = i, θ = j)
maps to entry 3(j − 1) + i of the long vector, so the policy matrix P is 6-by-6.
For instance, suppose today we are in k = 3, θ = 1, that is, grid point (3,2), and we need to
fill in the 6th row of the policy matrix P. Looking up the policy table, we learn that tomorrow we will have k′ = 3.
Furthermore, the Markov transition matrix tells us that P(θ′ = −1 | θ = 1) = 0.4 and P(θ′ = 1 | θ = 1) = 0.6. In
other words, tomorrow we have a 40% chance of being at grid point (3,1) and a 60% chance of being at (3,2). The
vectorized index of (3,1) is 3, while that of (3,2) is 6, so the (6,3) element of P should be filled with 0.4 and the (6,6)
element with 0.6.
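The construction of P can be sketched in code. Only the transition row for θ = 1 and the policy entry from grid point (3,2) are taken from the example above; the remaining entries of the hypothetical policy table and transition matrix are filled in as assumptions:

```python
import numpy as np

# Building the vectorized policy matrix P for the 3-capital-state,
# 2-regime example.  Entries not stated in the text are assumptions.
n_k, n_t = 3, 2
Pi = np.array([[0.7, 0.3],     # row for theta = -1 (assumed)
               [0.4, 0.6]])    # row for theta = 1 (given in the example)

# kp_policy[i_k, i_t]: 0-based index of tomorrow's capital; hypothetical,
# except the (3,2) entry: from k = 3, theta = 1 we go to k' = 3.
kp_policy = np.array([[1, 1],
                      [2, 2],
                      [2, 2]])

P = np.zeros((n_k * n_t, n_k * n_t))
for i_t in range(n_t):
    for i_k in range(n_k):
        row = n_k * i_t + i_k               # vectorized index, capital fastest
        kp = kp_policy[i_k, i_t]
        for j_t in range(n_t):              # spread mass over tomorrow's regimes
            P[row, n_k * j_t + kp] = Pi[i_t, j_t]

# The (6,3) and (6,6) elements (1-based) come out as 0.4 and 0.6, as in the text.
```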
Here we provide a numerical example. We set the model parameters; the AR(1) shock is discretized into a
3-regime Markov chain, and 60 grid points are created evenly spaced on (10, 13).
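One standard way to discretize an AR(1) shock into a small Markov chain is the method of Tauchen (1986), cited in the references. A sketch with illustrative persistence and volatility (not the values used in this example):

```python
import numpy as np
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def tauchen(n, rho, sigma, m=3.0):
    """Tauchen (1986) finite-state approximation of y' = rho*y + eps,
    eps ~ N(0, sigma^2): an evenly spaced grid spanning m unconditional
    standard deviations, with interval probabilities from the normal CDF."""
    std_y = sigma / np.sqrt(1.0 - rho ** 2)
    y = np.linspace(-m * std_y, m * std_y, n)
    step = y[1] - y[0]
    Pi = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            lo = (y[j] - rho * y[i] - step / 2) / sigma
            hi = (y[j] - rho * y[i] + step / 2) / sigma
            if j == 0:
                Pi[i, j] = norm_cdf(hi)          # open interval on the left
            elif j == n - 1:
                Pi[i, j] = 1.0 - norm_cdf(lo)    # open interval on the right
            else:
                Pi[i, j] = norm_cdf(hi) - norm_cdf(lo)
    return y, Pi

y, Pi = tauchen(3, 0.9, 0.02)    # illustrative rho and sigma
```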
Convergence is achieved after 36 iterations, and all FOCs are satisfied:
Iteration 35, policy norm 0.00017924, increment of value function 0.0073572
FOC1 deviation ratio: 4.5904e-010
FOC2 deviation ratio: 1.4089e-008
FOC3 deviation ratio: 0.00023527
Figure 3.1 Impulse response to productivity shocks (autarky)
4. Two country model with complete markets
Now we extend the closed-economy model into a two country model. The two countries are symmetric in that
they have the same preference and technology structure. However, each country faces idiosyncratic
productivity shocks with some cross-country correlation. Arrow securities are traded so that idiosyncratic risks are
shared away. The problem is equivalent to that of a benevolent central planner who allocates resources for the
two countries.
The planner maximizes a weighted sum of the two countries' expected lifetime utilities, subject to each
country's capital accumulation and the world resource constraint (4.1). The random technologies θ_{1t} and
θ_{2t} each take on 3 values; their joint law of motion is Markov with some known transition matrix, and the
shocks realize themselves at the beginning of period t prior to decision making.
The FOCs are: the two intratemporal labor conditions (4.2) and (4.3), the two stochastic Euler equations
(4.4) and (4.5), and the risk-sharing condition equating the two countries' weighted marginal utilities of
consumption (4.6).
For simplicity, we only consider symmetric planner weights.
We have 6 equations, (4.1)–(4.6), to solve for the 6 unknowns {c_{1t}, c_{2t}, l_{1t}, l_{2t}, k_{1,t+1}, k_{2,t+1}}.
To conduct the policy iteration, we first come up with an initial guess of the 6 policy functions, one for each
control, defined on the state space. Then we treat the functional form of the initial guess as tomorrow's policy,
and use it to replace tomorrow's control variables.
After the substitution, (4.4) and (4.5) become fixed-point equations in which the conditional expectation
E_t(·) absorbs tomorrow's shocks θ_{1,t+1} and θ_{2,t+1}. So we can use the 6 nonlinear equations to solve for
the 6 unknowns, whose solutions are purely functions of the current states (k_{1t}, k_{2t}, θ_{1t}, θ_{2t}).
To perform policy iteration on a computer, we first discretize the state space (k_1, k_2, θ_1, θ_2) into a
four-dimensional grid. Then we think up an initial guess for each of the 6 policy functions; each of them is also a
four-dimensional array. For each grid point, solve the 6 nonlinear equations and get
updated policies at that point. Four-dimensional interpolation is needed to look up the policy tables. Usually
this is done by multi-linear interpolation.
There is a shortcut algorithm which works faster, but whose convergence is not guaranteed. Rearranging terms:
(4.2) gives the labor condition for country 1, (4.7);
(4.3) gives the labor condition for country 2, (4.8);
(4.6) gives the risk-sharing condition, (4.9);
combining (4.4) and (4.6) gives (4.10);
and (4.4) itself gives the fixed-point equation (4.11).
The state variables form the 4-dimensional tuple (k_{1t}, k_{2t}, θ_{1t}, θ_{2t}), discretized into grids.
We start from an initial guess of the consumption policy for country 1.
Then we use (4.7) to solve for country 1's labor policy; a grid search, run twice, is used to ensure that labor
supply lies within (0,1). We did not use a more complicated algorithm for solving the nonlinear equation for
the sake of computation speed: for each state grid point we have to solve such an equation! The advantage of a
simple grid search is that all the work can be done over the whole 4-dimensional state space at once, given
sufficient memory.
Then we use (4.8) and (4.9) to solve for country 2's consumption and labor policies. This
can be done by first solving for country 2's labor (with grid search) and then using
(4.8) to get country 2's consumption. However, we instead exploit the symmetry of the two countries and simply permute
the dimensions of country 1's policy arrays to derive the policies for country 2.
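The permutation trick can be sketched as follows. The grid sizes and the country-1 policy below are purely hypothetical:

```python
import numpy as np

# Exploiting two-country symmetry: country 2's policy is country 1's policy
# with the capital axes and the shock axes swapped.  Illustrative grids.
k1 = k2 = np.linspace(10.0, 13.0, 5)
t1 = t2 = np.array([0.96, 1.0, 1.04])

# Hypothetical country-1 policy on the (k1, k2, theta1, theta2) grid,
# loading more heavily on "own" states:
K1, K2, T1, T2 = np.meshgrid(k1, k2, t1, t2, indexing="ij")
pol_c1 = 0.1 * K1 * T1 + 0.05 * K2 * T2

# Country 2's policy: permute axes (k1 <-> k2, theta1 <-> theta2).
pol_c2 = np.transpose(pol_c1, (1, 0, 3, 2))
```

`pol_c2` equals the same function with the roles of the two countries reversed, so only one set of nonlinear solves is needed per iteration.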
The next step is to use (4.1) and (4.10) to solve for the policies of k_{1,t+1} and
k_{2,t+1}. This must be done by solving the two equations simultaneously, because we
need to downgrade the t+1 control variables. We address this problem by taking advantage of the
discreteness of capital, which is fixed at the outset. From (4.1), for every discrete k_{1,t+1},
there is a corresponding k_{2,t+1} (rounded to the nearest discrete value), and we then test which
(k_{1,t+1}, k_{2,t+1}) pair makes (4.10) hold most precisely.
With all the policy functions solved, it is time to work on the fixed-point equation (4.11). Plugging the policies
into the RHS of (4.11) gives us the updated consumption policy for country 1. In that fashion, a new round of
iteration starts.
As for the value function, the methodology is the same as described in Section 3. The only difference is that
the 4-dimensional state is vectorized into a long column.
Here we provide a numerical example. We set the model parameters, and productivities follow a VAR(1)
process. The VAR(1) system is approximated by a bivariate 3-regime Markov chain. (See separate documents
for more details.) 30 grid points are created evenly spaced on (10, 13) for k_1 and k_2 respectively.
Convergence is achieved after 11 iterations, with the following FOC deviation ratios, all of which are less
than 0.1%, suggesting good performance of the algorithm. The total computation time is roughly 15
minutes.
Iteration 11, policy norm 0.00098252, increment of value function 0.067171
FOC1 deviation ratio: 1.013e-008
FOC2 deviation ratio: 2.9257e-007
FOC3 deviation ratio: 2.9264e-007
FOC4 deviation ratio: 0.0010542
FOC5 deviation ratio: 0.0010564
FOC6 deviation ratio: 0.00031434
Elapsed time is 739.058416 seconds.
As we can see from Figure 4.1, the simulated IRFs under complete markets exhibit the well-known anomalies: the
cross-country consumption correlation is highly positive, while the international correlations of investment,
employment, and production are negative, in conflict with real data.
Figure 4.1 Impulse response graphs with productivity shocks to country one
Figure 4.2 change of NX after the shock
Figure 4.3 Simulated path of two country model with complete markets
5. Two country model with enforcement constraints
Now we incorporate a new feature into the complete-markets model: the central planner is constrained
by contract participation. The two countries are symmetric in that they have the same preference and
technology structure. However, each country faces idiosyncratic productivity shocks with some cross-country
correlation. A benevolent central planner allocates resources for the two countries; however, each
country may elect to quit the contract and live in autarky.
The planner maximizes the weighted sum of the two countries' expected lifetime utilities subject to the
resource constraints (5.1) and, in addition, the enforcement (participation) constraints: at every date and
state, each country's continuation value under the contract must be at least its autarky value, where
V^a(k_{it}, θ_t) is the value function of autarky.
Attaching multipliers to the enforcement constraints, we define the sums of past multipliers, the normalized
multipliers, and the relative multiplier; the latter becomes an additional state variable.
The FOCs are: the intratemporal labor conditions (5.2) and (5.3); the Euler equations (5.4) and (5.5), now
carrying the multipliers; the risk-sharing condition (5.6), distorted by the relative multiplier; the law of
motion of the multipliers (5.7); and the complementary slackness conditions (5.8) and (5.9) for the two
enforcement constraints.
As suggested by Kehoe and Perri (2002), to conduct policy iteration on the basis of the above FOCs, we
first come up with an initial guess of the 9 policy functions, one for each unknown.
The easiest initial policy is the solution of the complete-markets case. Also calculate the value functions of
the complete-markets case as the initial guesses for the two countries' value functions.
Treating the functional forms of the initial guesses of the policies (and value functions) as tomorrow's policies (and value
functions), we can replace tomorrow's controls and continuation values with the corresponding
policy functions. Eventually they are all functions of the current states.
To update the initial policy, in principle we are still solving the 9 nonlinear equations (5.1)–(5.9) for 9
unknowns. Conceptually, solving the current problem is no more difficult than the simple growth model in
terms of policy iteration.
However, we must deal with the complementary slackness conditions (5.8) and (5.9). There are three cases:
neither constraint binds; country 1's binds but not country 2's; country 2's binds but not country 1's.
Since the binding pattern is unknown to the researcher, it is natural to start from the case in which neither
binds. Here the new multipliers are zero, so essentially our task is to solve (5.1)–(5.6) for the remaining
unknowns. Once we have solved the new policy, we can check the associated value functions, which can be
computed recursively as
V_i = u(c_i, l_i) + β E_t V_i′, i = 1, 2,
evaluated at the prescribed policies.
Alternatively, we can invert the big policy matrix to get the associated value functions, though that is a
little more computationally intensive.
If the value functions are larger than the autarky values, we are happy to keep those policies as our updated
policies. However, if they are smaller, an extra step is needed.
If country 1's constraint binds but not country 2's, then country 2's new multiplier is zero and country 1's
continuation value is set equal to its autarky value; this new equation, along with (5.1)–(5.7), enables us to
solve for the unknowns, including the now-positive multiplier.
The case in which country 2's constraint binds but not country 1's is handled symmetrically.
To perform policy iteration on a computer, we first discretize the state space into a
five-dimensional grid. Then we think up an initial guess for each of the 9 policy functions;
each of them is also a five-dimensional array. For each grid point, use the above procedure to solve the
nonlinear equations and get updated policies at that point. Five-dimensional interpolation is needed to look
up the policy tables. Usually this is done by multi-linear interpolation.
Here we provide a numerical example. We set the model parameters, and productivities follow a VAR(1)
process, approximated by a bivariate 3-regime Markov chain. (See separate documents for more details.)
13 grid points are created for k_1 and k_2 respectively, ranging from 10.6 to 13;
9 grid points are created for the relative multiplier, ranging from 0.96 to 1.04.
Those grids are roughly centered around the steady states. The boundaries for capital are carefully chosen to
ensure the feasibility of the policy: the simulated paths touch neither the lower nor the upper bound.
As for the grid for the relative multiplier: in autarky, a shock of the preset magnitude drives down the marginal
utility of consumption by about 3%; under complete markets, with risk sharing, the decrease in marginal utility
is much smaller, because country 1 gives some of its additional consumption to country 2. From (5.6) we thus set
the lower and upper bounds about 4% away from unity, which should be sufficient.
The number of grid points in each dimension is small, but we use 5-D linear interpolation to increase the
flexibility of the policy functions, which should be viewed as piecewise linear and continuous. That is why
even with as few as 5 grid points per dimension the algorithm can perform somewhat satisfactorily.
In the equation solving, the analytical Jacobian is provided to speed up the computation. However, we have
13689 grid points in total. Due to the complexity of the high-dimensional nonlinear equations (because
interpolation is needed for all of tomorrow's policies and value functions), solving those equations
13689 times in each iteration is a non-trivial task.
It turns out that roughly 12% + 12% of the grid points are binding for country 1 or country 2 respectively.
It goes through 10 iterations before convergence is achieved.
The total computation time is roughly an hour on my personal computer.
As we can see from Table 5.1 and Figure 5.1, the enforcement constraints do help to resolve the puzzles of the
open-economy model. Though the results do not quite "go a long way toward resolving
the anomalies," as Kehoe and Perri (2002) put it, they are noticeably different from the complete-markets case.
Notably, the cross-country correlation of output is negative under complete markets, while the enforcement
constraints make output positively correlated. The cross-country correlation of consumption is very high under
complete markets due to risk sharing, but the enforcement constraints alleviate that phenomenon.
Table 5.1 Simulated business cycle statistics
                          Complete market   Enforcement
Volatility
  GDP                          2.36             3.08
  NX / GDP                     5.12             4.46
  C / Y                        1.11             1.30
  I / Y                        5.06             4.39
  L / Y                        0.32             0.41
Domestic comovement
  Corr(Y1, C1)                 0.69             0.83
  Corr(Y1, I1)                 0.15             0.24
  Corr(Y1, L1)                 0.88             0.93
  Corr(Y1, NX)                 0.14             0.18
International correlation
  Corr(Y1, Y2)                -0.04             0.45
  Corr(C1, C2)                 0.92             0.89
  Corr(I1, I2)                -0.95            -0.84
  Corr(L1, L2)                -0.57            -0.16
Figure 5.1 Impulse response of productivity shocks of country 1
References:
Baxter, M., 1991. Approximating Suboptimal Dynamic Equilibria: An Euler Equation Approach. Journal of
Monetary Economics, 28(2), 173–200.
Baxter, M., Crucini, M., and Rouwenhorst, K. G., 1990. Solving the Stochastic Growth Model by a
Discrete-State-Space, Euler-Equation Approach. Journal of Business & Economic Statistics, 8(1), 19–21.
Coleman, W. J., 1989. An Algorithm to Solve Dynamic Models. International Finance Discussion
Paper 351, Federal Reserve Board.
Coleman, W. J., 1990. Solving the Stochastic Growth Model by Policy-Function Iteration. Journal of Business
& Economic Statistics, 8(1), 27–29.
Kehoe, P., Perri, F., 2002. International Business Cycles with Endogenous Incomplete Markets.
Econometrica, 70(3), 907–928.
Puterman, M. L., Brumelle, S., 1979. On the Convergence of Policy Iteration in Stationary Dynamic
Programming. Mathematics of Operations Research, 4(1), 60–67.
Puterman, M. L., Shin, M. C., 1978. Modified Policy Iteration Algorithms for Discounted Markov Decision
Problems. Management Science, 24(11), 1127–1137.
Tauchen, G., 1986. Finite State Markov-Chain Approximations to Univariate and Vector Autoregressions.
Economics Letters, 20, 177–181.
Tauchen, G., Hussey, R., 1991. Quadrature-Based Methods for Obtaining Approximate Solutions to
Nonlinear Asset Pricing Models. Econometrica, 59(2), 371–396.