Differential games
-
Upload
jorgeherreradelacruz -
Category
Education
-
view
341 -
download
0
description
Transcript of Differential games
1
The ‘how much’ war.A numerical method to solve
a duopolistic differential game in a closed-loop equilibrium
Advisors: Benjamín Ivorra, Ángel M. Ramos
(*)MsC Thesis. Mathematical EngineeringUCM, Facultad de Matemáticas. Madrid
27-09-2012
Jorge Herrera de la Cruz (*)
2
General Idea
Jorge Herrera de la CruzAdvisors: Benjamín Ivorra, Ángel M. Ramos
The “how much” war:Marketing departments have to decide how much they invest
in advertising to maximize sales. Also, they have to set a price. Both variables are relevant key drivers on sales.
A duopolistic differential game:Usually, companies are in a competitive world (generally
oligopolistic or duopolistic). We need a theoretical framework in order to incorporate this behaviour to study the optimal decision
A numerical method:Differential games are hard to solve analytically, so we have
to rely on numerical solutions as the ones presented here.
2
3
Motivation I
12.000 M. € advertising expenditurein Spain in 2011(Infoadex)
41
1181716
10 7
% investment
TV
cinema
Newspapers
tabloids
Ext
Internet
Radio
Magazines
3Jorge Herrera de la Cruz
Advisors: Benjamín Ivorra, Ángel M. Ramos
4
Motivation II
Most of these investors are in an oligopolistic or duopolistic framework
4Jorge Herrera de la Cruz
Advisors: Benjamín Ivorra, Ángel M. Ramos
5
Motivation III
These methods should deal with the following idea: “all competitors in the market are rational”
We believe that it is important to develop methods to optimize their expenses.
5Jorge Herrera de la Cruz
Advisors: Benjamín Ivorra, Ángel M. Ramos
6
Some relevant questions
…in a competitive environment?
How much to spend in advertising per week in a fixed period of time…
What should be our weekly price…
6Jorge Herrera de la Cruz
Advisors: Benjamín Ivorra, Ángel M. Ramos
7
Objectives
Introduce a theoretical method in a differential game form toanswer previous key questions
Estimate parameter values for this theoretical model
Develope a numerical method inspired fromdynamic programming methods to solve the game
Present the results of the game, and compare them with real data.
7Jorge Herrera de la Cruz
Advisors: Benjamín Ivorra, Ángel M. Ramos
8
Novel contributions
1 We use LOTKA VOLTERRA models of species competition in a
Differential Marketing Game.
We present an algorithm to solve Closed-loop equilibrium adapted from dynamic programming literature.
The literature uses Lanchaster-war models as benchmark
2
The literature in differential games usually obtains analytic Closed-loop equilibrium from simple models or obtains Open-loop equilibrium
8Jorge Herrera de la Cruz
Advisors: Benjamín Ivorra, Ángel M. Ramos
9
Theoretical Model
10
Theoretical Model (I): a differential game
St :
�̇� 𝒊= 𝒇 (𝒙 𝒊 ,𝒙 𝒋 ,𝒑𝒊 ,𝒖𝒊 ,𝒖 𝒋 ) 𝒙 𝒊 (𝟎 )=𝒙𝒊 ,𝟎
There are 2 players (i=1,2)
Each player try to maximize Ji with respect to control variables (p,u)
The functional is a dynamic discounted benefit function:Discounting factor r, Price p, sales x and C(u) is a cost function.
10Jorge Herrera de la Cruz
Advisors: Benjamín Ivorra, Ángel M. Ramos
11
Theoretical Model (II)
St : The players coexist in a dynamical system that reflects the relationship among :
-State variables: sales [x] -Controls: advertising [u] and price [p]
11Jorge Herrera de la Cruz
Advisors: Benjamín Ivorra, Ángel M. Ramos
�̇� 𝒊= 𝒇 (𝒙 𝒊 , 𝒙 𝒋 ,𝒑𝒊 ,𝒖𝒊 ,𝒖 𝒋)I maximize and I know you do it
I maximize and I know you do it
12
Solving the model
This model is a differential game. We can compute two equilibrium (solutions):
Open-loop
Optimal Controls are functions of time
Ui*=Ui(t), pi*=Pi(t)
Closed-loopOptimal Controls are functions of time and state variables
Ui*=Ui(t,xi(t)), pi*=Pi(t, xi(t))
12Jorge Herrera de la Cruz
Advisors: Benjamín Ivorra, Ángel M. Ramos
13
NASH EQUILIBRIUM
TIME CONSISTENCE
Properties of equilibrium
Open-loop
Closed-loop
SUBGAME PERFECTNESS
13Jorge Herrera de la Cruz
Advisors: Benjamín Ivorra, Ángel M. Ramos
14
Parameter estimation
15
Which parameters to estimate?
St :
�̇� 𝒊= 𝒇 (𝒙 𝒊 , 𝒙 𝒋 ,𝒑𝒊 ,𝒖𝒊 ,𝒖 𝒋)
We have to estimate parameters for the Cost function and the market dynamic system
15Jorge Herrera de la Cruz
Advisors: Benjamín Ivorra, Ángel M. Ramos
16
Our data set:
Data set from a well-known company in Spain is available.
This company competes with a withe label.
Our company needs to maximize:
The competitor only needs to maximize:
pricei advertisingi
16Jorge Herrera de la Cruz
Advisors: Benjamín Ivorra, Ángel M. Ramos
pricei
17
Market Dynamic System
�̇�=𝝆𝟏√𝒗𝟏 (𝟏−𝐌 )− 𝝆𝟐√𝒗𝟐𝑴
�̇� 𝒊=𝝆 𝒊𝒗𝒊√ (𝑵𝟏+𝑵𝟐−𝒙 𝒊− 𝒙 𝒋 ) 𝑫 𝒊(𝒑𝒊)
�̇� 𝒊=[𝜶𝟏(𝟏− 𝜶𝟏 𝒙 𝒊
𝑵 𝒊)− 𝜷 𝒊𝒋 𝒙 𝒋 ]𝒙𝒊
LANCHESTER Model of COMBAT
LOTKA-VOLTERRA
Species competition
17Jorge Herrera de la Cruz
Advisors: Benjamín Ivorra, Ángel M. Ramos
�̇� 𝒊= 𝒇 (𝒙 𝒊 , 𝒙 𝒋 ,𝒑𝒊 ,𝒖𝒊 ,𝒖 𝒋)
18
Discrete time LOTKA VOLTERRA models WITH EXOGENOUS INPUTS
LV-1 LV-2 VARX
18
�̇� 𝒊=[𝜶𝟏(𝟏− 𝜶𝟏 𝒙 𝒊
𝑵 𝒊)− 𝜷 𝒊𝒋 𝒙 𝒋+…+𝜺𝒊]𝒙 𝒊
- price and advertising in a multiplicative way.
- a multiplicative error term.
-LV-1 and price and advertising in a quadratic way.
- a multiplicative error term.
-Is a linearized version of LV-1
Jorge Herrera de la CruzAdvisors: Benjamín Ivorra, Ángel M. Ramos
19
OLS estimations
Equation 1 Equation 2 Equation 1 Equation 2 Equation 1 Equation 2
Dependent Variable0.695 1.285
std deviation 0.201 0.557
std deviation
0.690 5.410 -0.200 -0.042
std deviation 0.146 2.070 0.096 0.267
1.540 1.540std deviation 0.348 0.348
-0.022 -0.022std deviation 0.003 0.003
-0.083 -0.083std deviation 0.017 0.017
-0.032 -0.034 -0.016 -0.115std deviation 0.008 0.007 0.037 0.102
-0.029 -0.029std deviation 0.009 0.009
0.0004 0.000 0.0002 -0.0003std deviation 0.0001 0.000 0.0001 0.0002
std deviation-0.023 -0.800 -0.032 -0.061
std deviation 0.010 0.342 0.013 0.037
-0.047 -0.047 -0.014 -0.037std deviation 0.014 0.014 0.006 0.016
0.032std deviation 0.014
0.000std deviation 0.000
R-squared 0.775034 0.538344 0.79 0.538344 0.72 0.48Dwatson 1.899407 1.833568 1.9 1.833568 1.9 1.83AIC -423.0484 -73.21995 -435.76 -73.21995SC -252.6166 61.49 -262.289 61.49
LV1 model LV2 model VAR MODEL
19
20
Cost Function
20
0 50 100 150 200 250 300 350 400 4500 €
50,000 €
100,000 €
150,000 €
200,000 €
250,000 €
GRPs
Inve
stm
ent
• Theoretical models usually use a quadratic function to derive maximum policies.
• Data exhibit a logarithmic behavior.
Jorge Herrera de la CruzAdvisors: Benjamín Ivorra, Ángel M. Ramos
21
Numerical Method
22
Dynamic Programming (I)
22
• Discrete scheme using a step h>0:
�̇� 𝒊=𝒈𝒊(𝒙𝒊 , 𝒙 𝒋 ,𝒑𝒊 ,𝒖𝒊 ,𝒖 𝒋)
𝒙 𝒊(𝒕+𝟏)=𝒙 𝒊 (𝒕 )+𝒉𝒈𝒊(𝒙 (𝒕 ) ,𝒑 (𝒕 ) ,𝒖 (𝒕 ) ;𝜶 )
Jorge Herrera de la CruzAdvisors: Benjamín Ivorra, Ángel M. Ramos
23
Dynamic Programming (III)
23
• Discrete Hamilton-Jacobi-Bellman (simplified notation)
𝑽 𝒊 ,𝒉 (𝒙 )=𝐦𝐚𝐱𝒖
{𝒉× 𝒇 𝒊 ( 𝒙 (𝒕 ) ,𝒑 (𝒕 ) ,𝒖 (𝒕 ) ,𝜶 )+𝜷×𝑽 𝒊 ,𝒉 (𝒙 (𝒕 )+𝒉𝒈𝒊 (𝒙 (𝒕 ) ,𝒑 (𝒕 )𝒖 (𝒕 ) ;𝜶 ))}
• To solve it the scheme is iterated in time and space.
• This maximum is explicitly calculated (so we need the Gradient of )
• To solve this fixed point [V is in both terms] the contracting mapping theorem is applied
• Linear interpolation is used to obtain the values of Vout of discrete grid of x
Jorge Herrera de la CruzAdvisors: Benjamín Ivorra, Ángel M. Ramos
Dynamic Programming (II)
24
• Discrete Hamilton-Jacobi-Bellman (we simplify notation)
𝑽 𝒊 ,𝒉 (𝒙 ,𝜶 )=𝐦𝐚𝐱𝒖
{𝒉× 𝒇 𝒊 ( 𝒙 (𝒕 ) ,𝒖 (𝒕 ) ,𝜶 )+𝜷×𝑽 𝒊 ,𝒉(𝒙 (𝒕 )+𝒉𝒈𝒊 (𝒙 (𝒕 ) ,𝒖 (𝒕 ) ;𝜶 ))}
The algorithm
1 • Let G a 2D grid with x1 and x2, define dx1,dx2 and h• Initialize control vectors to 0, =0
2
3
4
5
• Startwith a guessestimateforVih and calculateitsgradient (gradVi)
• Actualize x1 and x2 accordingto• VALUE ITERATION (subroutine). Takingfixed, iterate in HJB
tillconvergence. Use interpolation
• POLICY ITERATION (subroutine). Taking Vihfixed, explicitly obtain a new iterating in HJB till convergence.
• Reiterate VALUE and POLICY till tolerance criteria is achieved.
Jorge Herrera de la CruzAdvisors: Benjamín Ivorra, Ángel M. Ramos
25
Dynamic Programming (II)
25
• We have run numerical experiments to validate our algorithm:
• We have solved a 1D model with an explicit solution [Brock-Mirman]
• We have also solved a 2D benchmark model, found in the literature, without any explicit solution, but we have obtained similar results to the published ones.
Jorge Herrera de la CruzAdvisors: Benjamín Ivorra, Ángel M. Ramos
26
The ‘how much’ war. Numerical Solution
27
Our solution
27
28
Competitor
28
29
We could have won a 7% more
29
30
Conclusions and Future Research
31
Conclusions and Future Research
We have shown a differential game algorithm in a closed-loop equilibrium to solve a duopolistic problem with real world data:
1. Obtaining optimal advertising2. Obtaining optimal price
In our experiment, we show that we could have won a 7% more than with the adopted strategy.
31
In future works, we propose:1. Study Stackelberg equilibrium (leader-follower)2. Study Stochastic Games taking into account estimatedvalues of error sizes with statistical methods3. Incorporate richer lag schemes in dynamical systemas suggested by statistical tests.
Jorge Herrera de la CruzAdvisors: Benjamín Ivorra, Ángel M. Ramos