DP Matlab Coding

description

How to code a dynamic programming (DP) problem in MATLAB using LP.
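The description gives no formulation, so the following is only a sketch of what "using LP" could mean, assuming LP stands for linear programming and the problem is an infinite-horizon discounted Markov decision process with reward maximization. The symbols S (states), A (actions), r (reward), P (transition probabilities) and \gamma (discount factor) are notation introduced here, not taken from the source.

```latex
\begin{aligned}
\min_{V}\quad & \sum_{s \in S} V(s) \\
\text{s.t.}\quad & V(s) \ge r(s,a) + \gamma \sum_{s' \in S} P(s' \mid s,a)\, V(s')
  \qquad \forall s \in S,\ a \in A .
\end{aligned}
```

Under these assumptions, an optimal action in each state is one whose constraint holds with equality at the solution. In MATLAB this shape would map onto `linprog(f, A, b)` from the Optimization Toolbox, with `f` a vector of ones and each constraint rewritten as a row of `A*V <= b`. For a cost-minimization problem the objective becomes a maximization and the inequality flips.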

Transcript of DP Matlab Coding

  • Q1

    Solution:

    Iteration Number:1

    Value:V(E)=12.9759, V(G)=12.736, V(D)=12.704, V(F)=12.6784

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:2

    Value:V(E)=10.691, V(G)=10.3615, V(D)=10.359, V(F)=10.3569

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:3

    Value:V(E)=8.8416, V(G)=8.5264, V(D)=8.5262, V(F)=8.5261

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:4

    Value:V(E)=7.3655, V(G)=7.0481, V(D)=7.0481, V(F)=7.048

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:5

    Value:V(E)=6.1841, V(G)=5.867, V(D)=5.867, V(F)=5.867

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:6

    Value:V(E)=5.2391, V(G)=4.9219, V(D)=4.9219, V(F)=4.9219

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:7

    Value:V(E)=4.483, V(G)=4.1659, V(D)=4.1659, V(F)=4.1659

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:8

    Value:V(E)=3.8782, V(G)=3.561, V(D)=3.561, V(F)=3.561

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:9

    Value:V(E)=3.3943, V(G)=3.0772, V(D)=3.0772, V(F)=3.0772

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:10

    Value:V(E)=3.0072, V(G)=2.6901, V(D)=2.6901, V(F)=2.6901

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:11

    Value:V(E)=2.6975, V(G)=2.3804, V(D)=2.3804, V(F)=2.3804

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:12

    Value:V(E)=2.4498, V(G)=2.1327, V(D)=2.1327, V(F)=2.1327

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:13

    Value:V(E)=2.2516, V(G)=1.9345, V(D)=1.9345, V(F)=1.9345

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:14

    Value:V(E)=2.0931, V(G)=1.7759, V(D)=1.7759, V(F)=1.7759

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:15

    Value:V(E)=1.9662, V(G)=1.6491, V(D)=1.6491, V(F)=1.6491

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

  • Iteration Number:16

    Value:V(E)=1.8647, V(G)=1.5476, V(D)=1.5476, V(F)=1.5476

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:17

    Value:V(E)=1.7836, V(G)=1.4664, V(D)=1.4664, V(F)=1.4664

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:18

    Value:V(E)=1.7186, V(G)=1.4015, V(D)=1.4015, V(F)=1.4015

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:19

    Value:V(E)=1.6667, V(G)=1.3495, V(D)=1.3495, V(F)=1.3495

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:20

    Value:V(E)=1.6251, V(G)=1.308, V(D)=1.308, V(F)=1.308

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:21

    Value:V(E)=1.5918, V(G)=1.2747, V(D)=1.2747, V(F)=1.2747

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:22

    Value:V(E)=1.5652, V(G)=1.2481, V(D)=1.2481, V(F)=1.2481

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:23

    Value:V(E)=1.544, V(G)=1.2268, V(D)=1.2268, V(F)=1.2268

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:24

    Value:V(E)=1.5269, V(G)=1.2098, V(D)=1.2098, V(F)=1.2098

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1

    Iteration Number:25

    Value:V(E)=1.5133, V(G)=1.1962, V(D)=1.1962, V(F)=1.1962

    Optimal Policy:a(E)=2, a(G)=1, a(D)=1, a(F)=1
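The log above has the shape of a value-iteration run: each pass reports updated values and the greedy action for each state. Below is a minimal sketch of such a loop, written in Python rather than MATLAB so that it runs without any toolbox. The transition probabilities, rewards, starting values, reward-maximization convention and the 0.8 discount factor are all hypothetical placeholders, not the model behind Q1. The 0.8 is suggested only by the fact that successive changes in V(E) in the log shrink by roughly that factor per iteration.

```python
# Value-iteration sketch for a small discounted MDP.
# ALL numbers below are made-up placeholders; they are NOT the Q1 data.

STATES = ["E", "G", "D", "F"]
ACTIONS = [1, 2]
GAMMA = 0.8  # assumed discount factor (not stated in the source)

# REWARD[s][a]: immediate reward for action a in state s (hypothetical).
REWARD = {
    "E": {1: 0.2, 2: 0.5},
    "G": {1: 0.3, 2: 0.1},
    "D": {1: 0.3, 2: 0.1},
    "F": {1: 0.3, 2: 0.1},
}

# TRANS[s][a]: mapping next_state -> probability (hypothetical).
TRANS = {
    "E": {1: {"E": 1.0}, 2: {"G": 0.5, "D": 0.5}},
    "G": {1: {"E": 0.5, "F": 0.5}, 2: {"G": 1.0}},
    "D": {1: {"E": 0.5, "G": 0.5}, 2: {"D": 1.0}},
    "F": {1: {"E": 0.5, "D": 0.5}, 2: {"F": 1.0}},
}


def bellman_backup(states, actions, reward, trans, gamma, V):
    """One synchronous Bellman update; returns (new values, greedy policy)."""
    new_V, policy = {}, {}
    for s in states:
        best_a, best_q = None, float("-inf")
        for a in actions:
            # Q(s, a) = r(s, a) + gamma * sum over s' of P(s' | s, a) * V(s')
            q = reward[s][a] + gamma * sum(
                p * V[s2] for s2, p in trans[s][a].items()
            )
            if q > best_q:
                best_a, best_q = a, q
        new_V[s], policy[s] = best_q, best_a
    return new_V, policy


def value_iteration(states, actions, reward, trans, gamma, V0,
                    tol=1e-4, max_iter=1000, verbose=True):
    """Repeat Bellman updates until the largest change falls below tol."""
    V = dict(V0)
    for k in range(1, max_iter + 1):
        new_V, policy = bellman_backup(states, actions, reward, trans, gamma, V)
        if verbose:
            # Same three-line layout as the log, values rounded to 4 decimals.
            print(f"Iteration Number:{k}")
            print("Value:" + ", ".join(
                f"V({s})={round(new_V[s], 4)}" for s in states))
            print("Optimal Policy:" + ", ".join(
                f"a({s})={policy[s]}" for s in states))
        change = max(abs(new_V[s] - V[s]) for s in states)
        V = new_V
        if change < tol:
            break
    return V, policy, k


# Start from deliberately high values so the iterates decrease toward the
# fixed point (an assumption about how a decreasing log like this can arise).
V, policy, n_iter = value_iteration(
    STATES, ACTIONS, REWARD, TRANS, GAMMA, {s: 15.0 for s in STATES}
)
```

The stopping rule compares successive iterates. With a discount factor below 1, the update is a contraction, so the loop terminates from any starting values. Replacing the placeholder `REWARD` and `TRANS` tables with the actual problem data is the only change needed to reuse the loop.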