A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems
![Page 1: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/1.jpg)
![Page 2: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/2.jpg)
A Reinforcement Learning Approach to Solving Hybrid Flexible Flowline Scheduling Problems
Bert Van Vreckem, Dmitriy Borodin, Wim De Bruyn, Ann Nowe
![Page 3: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/3.jpg)
Authors
• Bert Van Vreckem, HoGent Business and [email protected]
• Dmitriy Borodin, [email protected]
• Wim De Bruyn, HoGent Business and [email protected]
• Ann Nowe, Artificial Intelligence Lab, Vrije Universiteit [email protected]
HFFSP MISTA2013: 29 August 2013 3/28
![Page 4: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/4.jpg)
Contents
1 Hybrid Flexible Flowline Scheduling Problems
2 A Machine Learning Approach
3 Learning Permutations with Precedence Constraints
4 Experiments & results
5 Conclusion
![Page 5: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/5.jpg)
Hybrid Flexible Flowline Scheduling Problems
Powerful model for complex real-life production scheduling problems. In α/β/γ notation (Urlings, 2010):
$HFFL_m, ((RM^{(i)})_{i=1}^{(m)}) \,/\, M_j, rm, prec, S_{iljk}, A_{iljk}, lag \,/\, C_{\max}$
Flowline scheduling problems: jobs are processed in consecutive stages.
[Figure: Stage 1 → Stage 2 → Stage 3 → Stage 4]
![Page 7: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/7.jpg)
Hybrid Flexible Flowline Scheduling Problems
Hybrid case: unrelated parallel machines
[Figure: machines M11, M12, M13; M21, M22; M31, M32, M33, M34; M41, M42]
![Page 8: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/8.jpg)
Hybrid Flexible Flowline Scheduling Problems
Flexible case: stages may be skipped
[Figure: machines M11, M12, M13; M21, M22; M41, M42]
![Page 9: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/9.jpg)
Hybrid Flexible Flowline Scheduling Problems
Other constraints: Machine eligibility
[Figure: machines M11, M13; M21, M22; M31, M33; M42]
![Page 10: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/10.jpg)
Hybrid Flexible Flowline Scheduling Problems
Other constraints: Time lag between stages
[Figure: Stage 1, Stage 2, Stage 3, Stage 4]
![Page 11: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/11.jpg)
Hybrid Flexible Flowline Scheduling Problems
Other constraints: Sequence dependent setup times
[Figure: Gantt charts over time units 1 to 12 showing jobs J1 and J2 on machines M1 and M2, once in the order J1, J2 and once in the order J2, J1]
![Page 14: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/14.jpg)
Hybrid Flexible Flowline Scheduling Problems
Other constraints: Precedence relations between jobs
[Figure: Gantt charts over time units 1 to 12 showing jobs J1 and J2 on machines M1 and M2, once in the order J1, J2 and once in the order J2, J1]
![Page 15: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/15.jpg)
Hybrid Flexible Flowline Scheduling Problems
Precedence relations between jobs make the problem much harder, to the point that the MILP/CPLEX approach no longer works for larger instances (Urlings, 2010)
![Page 17: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/17.jpg)
A Machine Learning Approach
Scheduling Hybrid Flexible Flowline Scheduling Problems
Two stages:
• Job permutations → Learning Automata
• Machine assignment → Earliest Preparation Next Stage (EPNS) (Urlings, 2010)
![Page 21: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/21.jpg)
Reinforcement learning
At every discrete time step t:
• Agent perceives environment state s(t)
• Agent chooses action $a(t) \in A = \{a_1, \dots, a_n\}$ according to some policy
• Environment places agent in new state s(t+1) and gives reinforcement r(t)
• Goal: learn a policy that maximizes the long-term cumulative reward $\sum_t r(t)$
[Figure: agent-environment loop; the agent sends action a to the environment, which returns state s and reinforcement r]
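The loop described on this slide can be sketched in a few lines of Python. This is only an illustration: the `TwoArmedBandit` environment, its success probabilities, and the `run_episode` helper are hypothetical names that do not appear in the talk.

```python
import random


class TwoArmedBandit:
    """Hypothetical toy environment: one state, two actions."""

    def __init__(self, success_prob=(0.2, 0.8), seed=0):
        self.success_prob = success_prob
        self.rng = random.Random(seed)
        self.state = 0

    def step(self, action):
        # Reinforcement r(t) is 1 with the action's success probability, else 0.
        reward = 1.0 if self.rng.random() < self.success_prob[action] else 0.0
        return self.state, reward


def run_episode(env, policy, steps):
    """Run the agent-environment loop and return the cumulative reward."""
    state, total = env.state, 0.0
    for _ in range(steps):
        action = policy(state)            # agent chooses a(t)
        state, reward = env.step(action)  # environment returns s(t+1) and r(t)
        total += reward                   # accumulate sum_t r(t)
    return total
```

A policy here is any function from state to action, for example `lambda s: 1` to always pick the second action.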
![Page 22: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/22.jpg)
Learning Automata (LA)
Reinforcement Learning agents that choose an action according to a probability distribution $p(t) = (p_1(t), \dots, p_n(t))$, with $p_i = \mathrm{Prob}[a(t) = a_i]$ and s.t. $\sum_{i=1}^{n} p_i = 1$

$p_i(0) = \frac{1}{n}$  (1)

$p_i(t+1) = p_i(t) + \alpha_{rew}\, r(t)\,(1 - p_i(t)) - \alpha_{pen}\,(1 - r(t))\, p_i(t)$  (2)
if $a_i$ is the action taken at instant t

$p_j(t+1) = p_j(t) - \alpha_{rew}\, r(t)\, p_j(t) + \alpha_{pen}\,(1 - r(t)) \left( \frac{1}{n-1} - p_j(t) \right)$  (3)
if $a_j \neq a_i$
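As a minimal sketch of update rules (2) and (3), assuming a plain Python list for the probability vector: the function name `la_update` is illustrative, and the learning rates in the example are the values quoted on the Experiments slide.

```python
def la_update(p, chosen, r, alpha_rew, alpha_pen):
    """Return the updated probabilities after taking action `chosen`
    and receiving reinforcement r in [0, 1]."""
    n = len(p)
    new_p = []
    for j, pj in enumerate(p):
        if j == chosen:
            # Eq. (2): the chosen action is reinforced or penalized.
            new_p.append(pj + alpha_rew * r * (1 - pj)
                         - alpha_pen * (1 - r) * pj)
        else:
            # Eq. (3): every other action moves the opposite way.
            new_p.append(pj - alpha_rew * r * pj
                         + alpha_pen * (1 - r) * (1 / (n - 1) - pj))
    return new_p


p0 = [0.25] * 4  # Eq. (1) with n = 4
rewarded = la_update(p0, chosen=2, r=1.0, alpha_rew=0.1, alpha_pen=0.5)
penalized = la_update(p0, chosen=2, r=0.0, alpha_rew=0.1, alpha_pen=0.5)
```

With r(t) = 1 the chosen action's probability rises from 0.25 to 0.325. With r(t) = 0 it drops to 0.125. In both cases the vector still sums to 1, because the increments in (2) and (3) cancel.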
![Page 25: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/25.jpg)
Learning Automaton update
[Figure: three bar charts of $p_i$ over actions i = 1, ..., 4 (vertical axis 0 to 1): the current distribution and, e.g. when action 3 was chosen, the updated distribution for r(t) = 1 and for r(t) = 0]
![Page 30: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/30.jpg)
Probabilistic Basic Simple Strategy (PBSS) (Wauters, 2012)
• An LA is assigned to every position of a permutation
• LAs play a dispersion game to choose a unique action, resulting in a permutation
• Quality of solution is evaluated
• Update probabilities according to the LA update rule Linear Reward-Inaction ($\alpha_{pen} = 0$):
  • Better result than best one so far: r(t) = 1
  • If not, r(t) = 0
• Repeat until convergence
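The steps above can be sketched roughly as follows. This is a simplification under stated assumptions, not the procedure from (Wauters, 2012): the dispersion game is replaced by letting each position's automaton sample among the jobs that are still free, "until convergence" is replaced by a fixed iteration budget, and `toy_cost`, `sample_permutation`, `reward_inaction` and `pbss` are hypothetical names.

```python
import random


def sample_permutation(probs, rng):
    """Each position's automaton picks a job; jobs already taken are
    excluded (a simplified stand-in for the dispersion game)."""
    free = list(range(len(probs)))
    perm = []
    for pos in range(len(probs)):
        weights = [probs[pos][job] for job in free]
        if sum(weights) <= 0:  # fall back to a uniform choice
            weights = [1.0] * len(free)
        job = rng.choices(free, weights=weights)[0]
        perm.append(job)
        free.remove(job)
    return perm


def reward_inaction(p, chosen, r, alpha_rew):
    """Linear Reward-Inaction: the LA update with alpha_pen = 0."""
    return [pj + alpha_rew * r * (1 - pj) if j == chosen
            else pj - alpha_rew * r * pj
            for j, pj in enumerate(p)]


def pbss(cost, n, iterations=2000, alpha_rew=0.1, seed=0):
    """Search for a low-cost permutation of n jobs (cost is minimized)."""
    rng = random.Random(seed)
    probs = [[1.0 / n] * n for _ in range(n)]  # one LA per position
    best_perm, best_cost = None, float("inf")
    for _ in range(iterations):
        perm = sample_permutation(probs, rng)
        c = cost(perm)
        r = 1.0 if c < best_cost else 0.0  # reward only an improvement
        if r:
            best_perm, best_cost = perm, c
        for pos, job in enumerate(perm):
            probs[pos] = reward_inaction(probs[pos], job, r, alpha_rew)
    return best_perm, best_cost


def toy_cost(perm):
    """Hypothetical objective: total displacement from the identity order."""
    return sum(abs(job - pos) for pos, job in enumerate(perm))
```

Calling `pbss(toy_cost, n=5)` returns the best permutation seen and its cost. In the talk's setting the cost would be the makespan of the schedule built from the permutation.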
![Page 37: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/37.jpg)
Probabilistic Basic Simple Strategy (PBSS)
• PBSS: great results in several optimization problems that involve learning permutations
• but doesn’t work well when precedence constraints are involved
• PBSS only learns from positive experience (i.e. improving on previous solutions)
• Doesn’t learn to avoid invalid permutations
![Page 41: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/41.jpg)
Extending PBSS for precedence constraints
Updating probabilities:
• If the job permutation is invalid, perform an update with r(t) = 0 and $\alpha_{pen} > 0$ for all agents that are involved in the violation of precedence constraints.
• If the job permutation is valid, perform an $L_{R-I}$ update in all agents, depending on the resulting makespan $ms$ and the best makespan so far $ms_{best}$:
  • improved: r(t) = 1;
  • equally good: r(t) = 1/2;
  • worse: $r(t) = \frac{ms_{best}}{2\,ms}$;
  • no valid schedule found: r(t) = 0.
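A rough sketch of this extended update, under assumptions the slide leaves open: precedence constraints are given as hypothetical `(before, after)` job pairs, and the agents "involved" in a violation are taken to be the two positions holding a violating pair. All function names are illustrative.

```python
def la_update(p, chosen, r, alpha_rew, alpha_pen):
    """General LA update; alpha_pen = 0 gives Linear Reward-Inaction."""
    n = len(p)
    return [pj + alpha_rew * r * (1 - pj) - alpha_pen * (1 - r) * pj
            if j == chosen
            else pj - alpha_rew * r * pj
            + alpha_pen * (1 - r) * (1 / (n - 1) - pj)
            for j, pj in enumerate(p)]


def makespan_reward(ms, ms_best):
    """Reinforcement for a valid permutation; ms is None when no valid
    schedule was found."""
    if ms is None:
        return 0.0
    if ms < ms_best:
        return 1.0
    if ms == ms_best:
        return 0.5
    return ms_best / (2 * ms)  # worse: always below 1/2


def violating_positions(perm, precedences):
    """Positions whose jobs break a (before, after) precedence pair."""
    pos = {job: i for i, job in enumerate(perm)}
    bad = set()
    for before, after in precedences:
        if pos[before] > pos[after]:
            bad.update((pos[before], pos[after]))
    return bad


def extended_update(probs, perm, precedences, ms, ms_best,
                    alpha_rew=0.1, alpha_pen=0.5):
    """Penalize violating agents, otherwise apply L_{R-I} to all agents."""
    bad = violating_positions(perm, precedences)
    if bad:
        for pos in bad:  # invalid permutation: r(t) = 0, alpha_pen > 0
            probs[pos] = la_update(probs[pos], perm[pos], 0.0,
                                   alpha_rew, alpha_pen)
    else:
        r = makespan_reward(ms, ms_best)
        for pos, job in enumerate(perm):
            probs[pos] = la_update(probs[pos], job, r, alpha_rew, 0.0)
    return probs
```

Since $ms > ms_{best}$ in the "worse" case, its reward stays strictly below the 1/2 given to an equally good schedule. For example, `makespan_reward(125, 100)` gives 100 / 250 = 0.4.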
![Page 48: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/48.jpg)
Experiments
• HFFSP benchmark problems from (Ruiz et al., 2008), available at http://soa.iti.es/problem-instances
• Problem sets with 5, 7, 9, 11, 13, 15 jobs, 96 instances in each set
• + other constraints that make problems harder (precedence relations!)
• $\alpha_{rew} = 0.1$; $\alpha_{pen} = 0.5$ (no tuning)
• Run until convergence, or at most 300 seconds
![Page 49: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/49.jpg)
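Results

| Instance set | 5 | 7 | 9 | 11 | 13 | 15 | overall |
|---|---|---|---|---|---|---|---|
| mean RD (%) | 0.0697 | 2.0131 | 1.1568 | 1.6565 | 3.7294 | 7.9189 | 2.7484 |
| best RD (%) | -35.70 | -24.71 | -26.92 | -21.10 | -43.34 | -10.46 | -43.34 |
| # improved | 11 | 12 | 18 | 12 | 9 | 6 | 68 |
| # equal | 62 | 40 | 19 | 18 | 8 | 7 | 154 |
| # worse | 23 | 44 | 59 | 66 | 79 | 82 | 354 |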
![Page 52: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/52.jpg)
Results and Discussion
Contributions:
• Extension of PBSS for learning permutations with precedence constraints
• Simple model + RL approach can yield good quality results for challenging HFFSP instances
Discussion & future work:
• Precedence relations do make the problem harder
• Parameter tuning
• Convergence
• Larger instances (50, 100 jobs)
• Explore possibilities for improvement in machine assignment
![Page 54: A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems](https://reader034.fdocuments.us/reader034/viewer/2022052412/558e4de31a28ab1b318b4666/html5/thumbnails/54.jpg)
Thank you!
Questions?
[email protected]
www.slideshare.net/bertvanvreckem/