Strictly CONFIDENTIAL
The end of Airline Revenue Management as we know it?
(Deep) Reinforcement Learning for Revenue Management
AGIFORS Symp 2017, London
October 2017
Speakers: Rodrigo ACUNA-AGOST and Thomas FIIG
© Amadeus IT Group and its affiliates and subsidiaries
INR and Airlines R&D
Credits
Nicolas BONDOUX
Quan NGUYEN
Thomas FIIG
Rodrigo ACUNA-AGOST
RM Expertise + AI Expertise
Amadeus Airlines R&D + Innovation and Research
(Deep) Reinforcement Learning for Revenue Management
Motivation: limitations of RMS
RMS assumptions
− RMS assumes that the future is accurately described by the past:
• Issue with changes in the business environment (new competitors)
• Issue with shifts in demand and willingness to pay
• Issue with changes in customer behavior (for example, arrival pattern)
− RMS assumes that customers are rational:
• However, customers are irrational, influenced by psychological factors (framing, etc.).
• There is no model for irrationality.
− RMS assumes a monopoly:
• Competitors' offers are accounted for only implicitly, through how they affect customer behavior. Seen from the RMS, this corresponds to a monopoly.
− RMS assumes that a model exists that describes the “world”. For general offers this is impossible:
• Increased complexity of offers (seat + ancillaries)
• Complex products (flexibility, time to think, etc.) and bundles of ancillaries are difficult to price.
• Interactions between the prices of ancillaries, bundles, fare families, etc.
Scientific worldview
[Diagram: approaches positioned by "Ability to model customers' behavior?" (Yes / No) and by low versus high dimensions (degrees of freedom).]
Approaches shown: CLASSICAL REVENUE MANAGEMENT, SUPERVISED MACHINE LEARNING, REINFORCEMENT LEARNING
Box labels:
• Collect observations; experimentation
• Optimization of multiple sequential decisions: Dynamic Programming (Bellman, 1950s); balancing exploitation and exploration
• Build forecasting model: all future customers
• Build forecasting model: one given customer (classification, prediction)
• Iterative solution (many iterations) = complete solution (proof in 1990s)
Application of RL
How does it work?
(1) Actions: { Left, no-change, Right }
(2) State: { Information of sensors }
(3) Reward = stay alive as long as possible (alive = no crash)
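The three ingredients on this slide (actions, state, reward) can be illustrated with a minimal agent-environment loop. The sketch below is only an assumption about what such a game might look like: `LaneEnv`, its five lanes, the random drift and both policies are hypothetical and are not taken from the presentation.

```python
import random

ACTIONS = ("left", "no-change", "right")


class LaneEnv:
    """Hypothetical toy world: a vehicle on a road with walls on both sides."""

    def __init__(self, lanes=5, seed=0):
        self.lanes = lanes
        self.rng = random.Random(seed)
        self.position = lanes // 2

    def reset(self):
        self.position = self.lanes // 2
        return self.sense()

    def sense(self):
        # State = sensor readings: free lanes to the left and to the right.
        return (self.position, self.lanes - 1 - self.position)

    def step(self, action):
        drift = self.rng.choice((-1, 0, 1))  # random sideways push
        move = {"left": -1, "no-change": 0, "right": 1}[action]
        self.position += move + drift
        crashed = self.position < 0 or self.position >= self.lanes
        reward = 0 if crashed else 1  # one point per step survived
        return self.sense(), reward, crashed


def center_policy(state):
    """Steer toward the side with more free space."""
    left_gap, right_gap = state
    if left_gap < right_gap:
        return "right"
    if left_gap > right_gap:
        return "left"
    return "no-change"


def random_policy(state, rng=random.Random(1)):
    """Ignore the sensors and pick any action."""
    return rng.choice(ACTIONS)


def run_episode(env, policy, max_steps=200):
    """Return the total reward, i.e. the number of steps survived."""
    state = env.reset()
    total = 0
    for _ in range(max_steps):
        state, reward, done = env.step(policy(state))
        total += reward
        if done:
            break
    return total


print(run_episode(LaneEnv(seed=0), center_policy))
print(run_episode(LaneEnv(seed=0), random_policy))
```

In this toy setting the hand-written `center_policy` always survives all 200 steps, because one steering move can always cancel one step of drift. A reinforcement learning agent would have to discover such a policy from the reward signal alone.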
So, what’s new?
… it is very disruptive
• No modeling of passenger behavior (WTP)
• No demand forecast
• No RM optimization model
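Reinforcement Learning: Mathematical details*)

Bellman (1950s)
$$V(t,x) = \max_f \Big[ \big(1 - p(f)\big)\, V(t+1, x) + p(f)\, \big( f + V(t+1, x-1) \big) \Big]$$
$$V(s) = \max_a \sum_{s'} P^a_{ss'} \big[ R^a_{ss'} + V(s') \big]$$
Here p(f) is the probability that a booking occurs at fare f.

Q-learning, Watkins (1989)
$$Q(s,a) = \sum_{s'} P^a_{ss'} \big[ R^a_{ss'} + V(s') \big]$$
$$V(s) = \max_a Q(s,a)$$
$$Q(s,a) \leftarrow [1 - \alpha]\, Q(s,a) + \alpha\, \big[ f + V(s') \big]$$

[Backup diagrams: V(s) as the max over actions a1, a2, each yielding a reward (f1 or 0, f2 or 0) and leading to V(s'); Q(s,a) for the pairs (s,a1) and (s,a2), followed by the max over a'1, a'2; an explorative action a is marked.]

*) Reinforcement Learning, Sutton, Barto, 1998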
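To make the Bellman recursion concrete: with state (t, x), the value of offering fare f is the chance of no sale times the value of keeping x, plus the chance of a sale times the fare plus the value of x − 1. Below is a minimal backward-induction sketch. The reading of x as remaining seats, the fare values and the purchase probabilities in `PROB` are assumptions made for illustration, not figures from the presentation.

```python
import math

FARES = (100.0, 200.0, 300.0)                 # hypothetical fares
PROB = {100.0: 0.5, 200.0: 0.3, 300.0: 0.15}  # hypothetical sale probabilities


def buy_prob(fare):
    """Assumed probability that a booking occurs in one time step at this fare."""
    return PROB[fare]


def solve_dp(fares, buy_prob, capacity, periods):
    """Backward induction; V[t][x] is the expected revenue from step t with x seats left."""
    V = [[0.0] * (capacity + 1) for _ in range(periods + 1)]
    policy = [[None] * (capacity + 1) for _ in range(periods)]
    for t in range(periods - 1, -1, -1):
        for x in range(1, capacity + 1):
            # Start from the "closed" action: no sale, the state carries over.
            best_value, best_fare = V[t + 1][x], None
            for f in fares:
                p = buy_prob(f)
                q = (1 - p) * V[t + 1][x] + p * (f + V[t + 1][x - 1])
                if q > best_value:
                    best_value, best_fare = q, f
            V[t][x], policy[t][x] = best_value, best_fare
    return V, policy


V, policy = solve_dp(FARES, buy_prob, capacity=10, periods=20)
print(policy[19][1])  # fare offered with one seat and one step left
print(V[0][10])       # expected revenue at the start of the booking horizon
```

Unlike Q-learning, this computation needs the sale probabilities up front; that is the demand model the RL approach tries to do without.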
Experimentation: our research journey …
1. Base Reinforcement Learning (Scenario: Monopoly)
2. + Deep Learning
3. + Scenario with 1 competitor, + GPUs
[Vertical axis: Complexity / Realism, from Low to High]
Experiments 1
Base Reinforcement Learning in a Monopoly
AAA BBB
AL1
Simulation set-up
Cap = 10
DCP = 20
Fare classes = 3
Fenceless fare structure
RMS basecase
• AL1: Dynamic Programming
• Two customer segments with different frat5
• Forecaster = Q-forecasting
Reinforcement Learning
• No Forecaster or Optimizer
• AL1: Q-learning
• State: (t, x)
• Actions: f1, f2, f3, closed
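The Reinforcement Learning set-up above (state (t, x), actions f1, f2, f3 or closed, no forecaster or optimizer) can be sketched as tabular Q-learning against a simulator. Everything about the simulator here is an assumption: the fare values, one potential customer per time step, the exponential frat5-style sell-up curve in `buy_prob`, the reading of x as remaining seats, and the learning parameters. It is a sketch of the technique, not the authors' experiment.

```python
import math
import random

FARES = (100.0, 200.0, 300.0)   # hypothetical prices for f1, f2, f3
ACTIONS = FARES + (None,)       # None means "closed"
CAP, DCP = 10, 20               # capacity and number of time steps, as on the slide


def buy_prob(fare, base=100.0, frat5=2.5, arrival=0.8):
    """Assumed demand: sale probability halves when fare / base reaches frat5."""
    sell_up = math.exp(-math.log(2) * (fare / base - 1) / (frat5 - 1))
    return arrival * sell_up


def greedy_action(Q, t, x):
    """Action with the highest learned value in state (t, x); ties go to the first."""
    return max(ACTIONS, key=lambda a: Q.get((t, x, a), 0.0))


def state_value(Q, t, x):
    """V(s) = max_a Q(s, a); zero after departure or when sold out."""
    if t >= DCP or x == 0:
        return 0.0
    return max(Q.get((t, x, a), 0.0) for a in ACTIONS)


def train(episodes, alpha=0.1, epsilon=0.1, seed=0):
    """Learn Q from simulated flights; x counts remaining seats."""
    rng = random.Random(seed)
    Q = {}  # (t, x, action) -> estimated future revenue; missing entries count as 0
    for _ in range(episodes):
        x = CAP
        for t in range(DCP):
            if x == 0:
                break
            # Explorative action with probability epsilon, greedy otherwise.
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                a = greedy_action(Q, t, x)
            sold = a is not None and rng.random() < buy_prob(a)
            reward = a if sold else 0.0
            x_next = x - 1 if sold else x
            # Q(s,a) <- [1 - alpha] Q(s,a) + alpha [f + V(s')]
            target = reward + state_value(Q, t + 1, x_next)
            Q[(t, x, a)] = (1 - alpha) * Q.get((t, x, a), 0.0) + alpha * target
            x = x_next
    return Q


Q = train(episodes=2000)
print(greedy_action(Q, 0, CAP))  # learned action at the start of the horizon
```

The agent only observes whether a sale happened at the price it tested. States that are rarely visited keep poorly estimated values, however long training runs.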
[Figure: Revenue (axis 60% to 110%) versus Data (Years) (axis 0 to 296). Series: Revenue with RL; Theoretical optimum (=> perfect demand forecast, perfect WTP models, no changes in the market).]
Experiments 1
Base Reinforcement Learning in a Monopoly
+ RL can solve the problem.
+ RL converges to the optimal solution.
- Poor performance: RL needs a lot of data and computation.
Experiments 1
Base Reinforcement Learning in a Monopoly
RL Policies vs. Optimal Policies
[Figure: grid of policy values. Horizontal axis: Bookings, from Empty to Full. Vertical axis: Time to departure, down to Departure.]
• Lack of observations (cabin full far from departure)
• Lack of observations (cabin empty at departure)
• Good policies where we have lots of observations
Deep Reinforcement Learning: Deep Neural Network
Classical Artificial Neural Networks
• Accuracy: worse than other ML methods
• [Diagram: Input → Output]
Deep Neural Network
• Accuracy: better than other ML methods
• [Diagram: Input → Layer1 → Layer2 → Layer3 → Output; labels shown: man, woman]
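Deep Reinforcement Learning: Deep Neural Network as function approximation
What is function approximation?
• Exact values: $(s, a) \rightarrow Q(s, a)$
• Linear in the parameters: $(s, a) \rightarrow Q(s, a, \theta) = \alpha s + \beta a$, with $\theta = (\alpha, \beta)$
• General parametric form, such as a deep neural network: $(s, a) \rightarrow Q(s, a, \theta)$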
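Function approximation replaces a table of Q values with a parametric function Q(s, a, θ), so that an observation in one state also updates the estimates for similar states. The sketch below uses the simplest case, a model that is linear in θ, trained with a gradient step toward a target value. The feature choice in `features` and the step size are assumptions for illustration; a deep neural network would take the place of the linear model in Deep RL.

```python
def features(t, x, fare_index):
    """Hypothetical features of a state-action pair, scaled to roughly [0, 1]."""
    return [1.0, t / 20.0, x / 10.0, fare_index / 3.0]


def q_value(theta, phi):
    """Q(s, a, theta) as a weighted sum of the features."""
    return sum(w * f for w, f in zip(theta, phi))


def sgd_update(theta, phi, target, alpha=0.1):
    """Move theta one gradient step so that Q(s, a, theta) gets closer to target."""
    error = target - q_value(theta, phi)
    return [w + alpha * error * f for w, f in zip(theta, phi)]


theta = [0.0, 0.0, 0.0, 0.0]
phi = features(t=5, x=8, fare_index=2)
target = 500.0  # in Q-learning this would be the observed f + V(s')

for _ in range(50):
    theta = sgd_update(theta, phi, target)

print(q_value(theta, phi))                 # approaches the target
print(q_value(theta, features(5, 7, 2)))   # a neighbouring state moved as well
```

The second print shows the generalization effect: the neighbouring state (5, 7) was never updated directly, yet its estimate changed because it shares parameters with (5, 8).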
Experiments 2: + Deep Learning
[Figure, Deep RL: Revenue (axis 40% to 110%) versus Data (Year) (axis 0 to 15).]
• Deep RL: 5-15Y of flight data, ~99% of opt. rev
[Figure, Normal RL: Revenue (axis 80% to 110%) versus Data (1000 year) (axis 33 to 160).]
• Normal RL: 135kY of flight data, ~97% of opt. rev
Experiments 3
Reinforcement Learning in Duopoly
Simulation set-up
Cap =50
DCP=20
Fare classes = 10
Fenceless fare structure
RMS basecase
• AL1: Dynamic Programming
• AL2: AT80
• Two customer segments with different frat5
• Estimated frat5 (optimal revenue)
• Forecaster = Q-forecasting
Reinforcement Learning
• No Forecaster or Optimizer
• AL1: Deep RL
• State (t,x)
• Action: f1,f2,f3,…,f10, closed
[Diagram: AAA to BBB, flown by AL1 and AL2]
[Figure: Revenue (axis 0.8 to 1) versus Frat5 scaling factor (axis 0.7 to 1.3)]
Experiments 3: One Competitor + GRUs
[Figure, RMS vs Competitor: Revenue versus Months (axis 1 to 91); series RMS, AT80]
[Figure, DRL vs Competitor: Revenue versus Months (axis 1 to 81); series DQL, AT80]
[Revenue labels shown on the figures: 100%, 89%, 80%, 102.5%]
Why is RL better?
[Figure: Lowest available fare (axis 0 to 3000) versus an unlabeled axis from 0 to 50; series RMS, DRL]
[Figure: Booking curve (axis 0 to 1) versus an unlabeled axis from 0 to 20; series RMS, AT80]
[Figure: Booking curve (axis 0 to 1) versus an unlabeled axis from 0 to 20; series DRL, AT80]
• DRL closes classes far from departure
• DRL opens classes close to departure
• Remember that RMS was optimal
• DRL produces higher revenue by understanding the competitive game and swamping the competitor with low-yield passengers.
Conclusion
Today
• Classical RMS techniques are no longer sufficient.
• RL opens the door to a radically new approach:
  - Model free
  - No forecasting
  - No optimization
  - Learns by direct price testing
• Shown that RL = RMS for monopoly
• We discover the richness of RL
• Beats RMS against competition
Future
• More complex scenarios:
  - Full networks
  - Many competitors
  - Pricing of complex products
  - Pricing of psychological factors (irrational customers)
  - Shifts in demand/WTP
• Improve learning performance
• Add more information to the state (e.g., competitors and market)