1
OR II
GSLM 52800
2
Outline
introduction to discrete-time Markov Chain
problem statement
long-term average cost per unit time
solving the MDP by linear programming
3
States of a Machine
State Condition
0 Good as new
1 Operable – minor deterioration
2 Operable – major deterioration
3 Inoperable – output of unacceptable quality
4
Transition of States
State 0 1 2 3
0 0 7/8 1/16 1/16
1 0 3/4 1/8 1/8
2 0 0 1/2 1/2
3 0 0 0 1
5
Possible Actions
Decision Action Relevant States
1 Do nothing 0, 1, 2
2 Overhaul (return to state 1) 2
3 Replace (return to state 0) 1, 2, 3
6
Problem
adopting different collections of actions leads to different long-term average costs per unit time
problem: to find the policy that minimizes the long-term average cost per unit time
7
Costs of ProblemCosts of Problem
cost of defective items: state 0: 0; state 1: 1000; state 2: 3000
cost of replacing the machine = 4000
cost of losing production in machine replacement = 2000
cost of overhauling (at state 2) = 2000
8
Policy R_d: Always Replace When State ≠ 0
half of the time at state 0, with cost 0
half of the time at other states, all with cost 6000, because of machine replacement
average cost per unit time = 3000
[state-transition diagram under R_d: from state 0, p_01 = 7/8, p_02 = 1/16, p_03 = 1/16; from states 1, 2, 3, return to state 0 with probability 1]
9
Long-Term Average Cost of a Positive, Irreducible Discrete-Time Markov Chain
for a positive, irreducible discrete-time Markov chain with M+1 states 0, …, M:
balance eqt.: π_j = Σ_{i=0}^{M} π_i p_ij, j = 0, …, M
normalization eqt.: Σ_{i=0}^{M} π_i = 1
only M of the balance eqt plus the normalization eqt are needed
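These equations can be solved mechanically. A minimal numpy sketch (the helper name `stationary` is my own choice, and the chain used here is the one under policy R_d from the earlier slide), which drops one redundant balance equation and appends the normalization:

```python
import numpy as np

# Transition matrix under policy R_d (always replace when state != 0),
# from the slides: state 0 deteriorates as usual; states 1, 2, 3 are
# replaced, i.e. return to state 0 with probability 1.
P = np.array([
    [0, 7/8, 1/16, 1/16],
    [1, 0,   0,    0   ],
    [1, 0,   0,    0   ],
    [1, 0,   0,    0   ],
])

def stationary(P):
    """Solve pi = pi P together with sum(pi) = 1.

    Drop one (redundant) balance equation and append the
    normalization equation, then solve the square linear system."""
    n = P.shape[0]
    A = np.vstack([(P.T - np.eye(n))[:-1], np.ones(n)])
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

pi = stationary(P)
cost = np.array([0, 6000, 6000, 6000])  # 6000 = replacement 4000 + lost production 2000
print(pi)         # [0.5, 0.4375, 0.03125, 0.03125]
print(pi @ cost)  # 3000.0, matching the slide's average cost for R_d
```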
10
Policy R_a: Replace at Failure but Otherwise Do Nothing
[state-transition diagram under R_a: p_01 = 7/8, p_02 = 1/16, p_03 = 1/16; p_11 = 3/4, p_12 = 1/8, p_13 = 1/8; p_22 = 1/2, p_23 = 1/2; p_30 = 1]
balance eqt.:
π_0 = π_3
π_1 = (7/8)π_0 + (3/4)π_1
π_2 = (1/16)π_0 + (1/8)π_1 + (1/2)π_2
π_3 = (1/16)π_0 + (1/8)π_1 + (1/2)π_2
normalization eqt.: π_0 + π_1 + π_2 + π_3 = 1
⇒ π_0 = 2/13; π_1 = 7/13; π_2 = 2/13; π_3 = 2/13
average cost = (2/13)(0) + (7/13)(1000) + (2/13)(3000) + (2/13)(6000) ≈ 1923
11
Policy R_b: Replace in State 3, and Overhaul in State 2
[state-transition diagram under R_b: p_01 = 7/8, p_02 = 1/16, p_03 = 1/16; p_11 = 3/4, p_12 = 1/8, p_13 = 1/8; p_21 = 1 (overhaul); p_30 = 1 (replace)]
balance eqt.:
π_0 = π_3
π_1 = (7/8)π_0 + (3/4)π_1 + π_2
π_2 = (1/16)π_0 + (1/8)π_1
π_3 = (1/16)π_0 + (1/8)π_1
normalization eqt.: π_0 + π_1 + π_2 + π_3 = 1
⇒ π_0 = 2/21; π_1 = 5/7; π_2 = 2/21; π_3 = 2/21
average cost = (2/21)(0) + (5/7)(1000) + (2/21)(4000) + (2/21)(6000) ≈ 1667
12
Policy R_c: Replace in States 2 and 3
[state-transition diagram under R_c: p_01 = 7/8, p_02 = 1/16, p_03 = 1/16; p_11 = 3/4, p_12 = 1/8, p_13 = 1/8; p_20 = 1; p_30 = 1]
balance eqt.:
π_0 = π_2 + π_3
π_1 = (7/8)π_0 + (3/4)π_1
π_2 = (1/16)π_0 + (1/8)π_1
π_3 = (1/16)π_0 + (1/8)π_1
normalization eqt.: π_0 + π_1 + π_2 + π_3 = 1
⇒ π_0 = 2/11; π_1 = 7/11; π_2 = 1/11; π_3 = 1/11
average cost = (2/11)(0) + (7/11)(1000) + (1/11)(6000) + (1/11)(6000) ≈ 1727
13
Problem
in this case the minimum-cost policy is R_b, i.e., replacing in state 3 and overhauling in state 2
question: Is there any efficient way to find the minimum-cost policy when there are many states and many types of actions? It is impossible to check all possible cases.
14
Linear Programming Approach for an MDP
let
D_ik be the probability of adopting decision k at state i
π_i be the stationary probability of state i
y_ik = P(state i and decision k)
C_ik = the cost of adopting decision k at state i
15
Linear Programming Approach for an MDP
balance eqt.: π_j = Σ_{i=0}^{M} π_i p_ij, j = 0, …, M
normalization eqt.: Σ_{i=0}^{M} π_i = 1
y_ik = π_i D_ik, i = 0, …, M
Σ_{i=0}^{M} Σ_{k=1}^{K} y_ik = 1
Σ_{k=1}^{K} y_jk = Σ_{i=0}^{M} Σ_{k=1}^{K} y_ik p_ij(k), j = 0, 1, …, M
y_ik ≥ 0, i = 0, 1, …, M; k = 1, …, K
E(C) = Σ_{i=0}^{M} Σ_{k=1}^{K} C_ik π_i D_ik = Σ_{i=0}^{M} Σ_{k=1}^{K} C_ik y_ik
16
Linear Programming Approach for an MDP
min Z = Σ_{i=0}^{M} Σ_{k=1}^{K} C_ik y_ik,
s.t.
Σ_{i=0}^{M} Σ_{k=1}^{K} y_ik = 1,
Σ_{k=1}^{K} y_jk − Σ_{i=0}^{M} Σ_{k=1}^{K} y_ik p_ij(k) = 0, j = 0, 1, …, M,
y_ik ≥ 0, i = 0, 1, …, M; k = 1, …, K
at optimal, D_ik = 0 or 1, i.e., a deterministic policy is used
17
Linear Programming Approach Linear Programming Approach for an MDPfor an MDP
actions possibly to adopt at state 0: do nothing (i.e., k = 1)
1: do nothing or replace (i.e., k = 1 or 3)
2: do nothing, overhaul, or replace (i.e., k = 1, 2, or 3)
3: replace (i.e., k = 3)
variables: y01, y11, y13, y21, y22, y23, and y33
18
Linear Programming Approach for an MDP
min Z = 1000y_11 + 6000y_13 + 3000y_21 + 4000y_22 + 6000y_23 + 6000y_33,
s.t.
y_01 + y_11 + y_13 + y_21 + y_22 + y_23 + y_33 = 1
y_01 − (y_13 + y_23 + y_33) = 0
y_11 + y_13 − ((7/8)y_01 + (3/4)y_11 + y_22) = 0
y_21 + y_22 + y_23 − ((1/16)y_01 + (1/8)y_11 + (1/2)y_21) = 0
y_33 − ((1/16)y_01 + (1/8)y_11 + (1/2)y_21) = 0
y_ik ≥ 0 for all i, k
(transition probabilities under "do nothing," for reference:)
State 0 1 2 3
0 0 7/8 1/16 1/16
1 0 3/4 1/8 1/8
2 0 0 1/2 1/2
3 0 0 0 1
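This LP can be solved numerically. A sketch using scipy.optimize.linprog (the variable ordering y_01, y_11, y_13, y_21, y_22, y_23, y_33 is my own choice; one balance equation is redundant, which the solver's presolve handles):

```python
from scipy.optimize import linprog

# Objective coefficients C_ik, in the order
# [y01, y11, y13, y21, y22, y23, y33]:
c = [0, 1000, 6000, 3000, 4000, 6000, 6000]

A_eq = [
    # normalization: all y_ik sum to 1
    [1, 1, 1, 1, 1, 1, 1],
    # state 0: y01 - (y13 + y23 + y33) = 0
    [1, 0, -1, 0, 0, -1, -1],
    # state 1: y11 + y13 - (7/8 y01 + 3/4 y11 + y22) = 0
    [-7/8, 1/4, 1, 0, -1, 0, 0],
    # state 2: y21 + y22 + y23 - (1/16 y01 + 1/8 y11 + 1/2 y21) = 0
    [-1/16, -1/8, 0, 1/2, 1, 1, 0],
    # state 3: y33 - (1/16 y01 + 1/8 y11 + 1/2 y21) = 0
    [-1/16, -1/8, 0, -1/2, 0, 0, 1],
]
b_eq = [1, 0, 0, 0, 0]

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 7)
print(res.x)    # ≈ [2/21, 5/7, 0, 0, 2/21, 0, 2/21]
print(res.fun)  # ≈ 1666.7
```

Dividing each y_ik by Σ_k y_ik recovers D_ik ∈ {0, 1}, i.e., a deterministic optimal policy.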
19
Linear Programming Approach for an MDP
solving: y_01 = 2/21, y_11 = 5/7, y_13 = 0, y_21 = 0, y_22 = 2/21, y_23 = 0, y_33 = 2/21
optimal policy at state 0: do nothing
state 1: do nothing
state 2: overhaul
state 3: replace