Differential games

1

The ‘how much’ war.A numerical method to solve

a duopolistic differential game in a closed-loop equilibrium

Advisors: Benjamín Ivorra, Ángel M. Ramos

(*)MsC Thesis. Mathematical EngineeringUCM, Facultad de Matemáticas. Madrid

27-09-2012

Jorge Herrera de la Cruz (*)

Microsoft

Modificar en todo el texto y poner Advisors

2

General Idea

Jorge Herrera de la CruzAdvisors: Benjamín Ivorra, Ángel M. Ramos

The “how much” war:Marketing departments have to decide how much they invest

in advertising to maximize sales. Also, they have to set a price. Both variables are relevant key drivers on sales.

A duopolistic differential game:Usually, companies are in a competitive world (generally

oligopolistic or duopolistic). We need a theoretical framework in order to incorporate this behaviour to study the optimal decision

A numerical method:Differential games are hard to solve analytically, so we have

to rely on numerical solutions as the ones presented here.

2

3

Motivation I

12.000 M. € advertising expenditurein Spain in 2011(Infoadex)

41

1181716

10 7

% investment

TV

cinema

Newspapers

tabloids

Ext

Internet

Radio

Magazines

3Jorge Herrera de la Cruz


4

Motivation II

Most of these investors are in an oligopolistic or duopolistic framework



5

Motivation III

These methods should deal with the following idea: “all competitors in the market are rational”

We believe that it is important to develop methods to optimize their expenses.



6

Some relevant questions

…in a competitive environment?

How much to spend in advertising per week in a fixed period of time…

What should be our weekly price…



7

Objectives

Introduce a theoretical method in a differential game form toanswer previous key questions

Estimate parameter values for this theoretical model

Develope a numerical method inspired fromdynamic programming methods to solve the game

Present the results of the game, and compare them with real data.



8

Novel contributions

1 We use LOTKA VOLTERRA models of species competition in a

Differential Marketing Game.

We present an algorithm to solve Closed-loop equilibrium adapted from dynamic programming literature.

The literature uses Lanchaster-war models as benchmark

2

The literature in differential games usually obtains analytic Closed-loop equilibrium from simple models or obtains Open-loop equilibrium



9

Theoretical Model

10

Theoretical Model (I): a differential game

St :

�̇� 𝒊= 𝒇 (𝒙 𝒊 ,𝒙 𝒋 ,𝒑𝒊 ,𝒖𝒊 ,𝒖 𝒋 ) 𝒙 𝒊 (𝟎 )=𝒙𝒊 ,𝟎

There are 2 players (i=1,2)

Each player try to maximize Ji with respect to control variables (p,u)

The functional is a dynamic discounted benefit function:Discounting factor r, Price p, sales x and C(u) is a cost function.



11

Theoretical Model (II)

St : The players coexist in a dynamical system that reflects the relationship among :

-State variables: sales [x] -Controls: advertising [u] and price [p]



�̇� 𝒊= 𝒇 (𝒙 𝒊 , 𝒙 𝒋 ,𝒑𝒊 ,𝒖𝒊 ,𝒖 𝒋)I maximize and I know you do it

I maximize and I know you do it

12

Solving the model

This model is a differential game. We can compute two equilibrium (solutions):

Open-loop

Optimal Controls are functions of time

Ui*=Ui(t), pi*=Pi(t)

Closed-loopOptimal Controls are functions of time and state variables

Ui*=Ui(t,xi(t)), pi*=Pi(t, xi(t))



13

NASH EQUILIBRIUM

TIME CONSISTENCE

Properties of equilibrium

Open-loop

Closed-loop

SUBGAME PERFECTNESS



14

Parameter estimation

15

Which parameters to estimate?

St :

�̇� 𝒊= 𝒇 (𝒙 𝒊 , 𝒙 𝒋 ,𝒑𝒊 ,𝒖𝒊 ,𝒖 𝒋)

We have to estimate parameters for the Cost function and the market dynamic system



16

Our data set:

Data set from a well-known company in Spain is available.

This company competes with a withe label.

Our company needs to maximize:

The competitor only needs to maximize:

pricei advertisingi



pricei

17

Market Dynamic System

�̇�=𝝆𝟏√𝒗𝟏 (𝟏−𝐌 )− 𝝆𝟐√𝒗𝟐𝑴

�̇� 𝒊=𝝆 𝒊𝒗𝒊√ (𝑵𝟏+𝑵𝟐−𝒙 𝒊− 𝒙 𝒋 ) 𝑫 𝒊(𝒑𝒊)

�̇� 𝒊=[𝜶𝟏(𝟏− 𝜶𝟏 𝒙 𝒊

𝑵 𝒊)− 𝜷 𝒊𝒋 𝒙 𝒋 ]𝒙𝒊

LANCHESTER Model of COMBAT

LOTKA-VOLTERRA

Species competition



�̇� 𝒊= 𝒇 (𝒙 𝒊 , 𝒙 𝒋 ,𝒑𝒊 ,𝒖𝒊 ,𝒖 𝒋)

18

Discrete time LOTKA VOLTERRA models WITH EXOGENOUS INPUTS

LV-1 LV-2 VARX

18

�̇� 𝒊=[𝜶𝟏(𝟏− 𝜶𝟏 𝒙 𝒊

𝑵 𝒊)− 𝜷 𝒊𝒋 𝒙 𝒋+…+𝜺𝒊]𝒙 𝒊

- price and advertising in a multiplicative way.

- a multiplicative error term.

-LV-1 and price and advertising in a quadratic way.

- a multiplicative error term.

-Is a linearized version of LV-1


19

OLS estimations

Equation 1 Equation 2 Equation 1 Equation 2 Equation 1 Equation 2

Dependent Variable0.695 1.285

std deviation 0.201 0.557

std deviation

0.690 5.410 -0.200 -0.042

std deviation 0.146 2.070 0.096 0.267

1.540 1.540std deviation 0.348 0.348

-0.022 -0.022std deviation 0.003 0.003

-0.083 -0.083std deviation 0.017 0.017

-0.032 -0.034 -0.016 -0.115std deviation 0.008 0.007 0.037 0.102

-0.029 -0.029std deviation 0.009 0.009

0.0004 0.000 0.0002 -0.0003std deviation 0.0001 0.000 0.0001 0.0002

std deviation-0.023 -0.800 -0.032 -0.061

std deviation 0.010 0.342 0.013 0.037

-0.047 -0.047 -0.014 -0.037std deviation 0.014 0.014 0.006 0.016

0.032std deviation 0.014

0.000std deviation 0.000

R-squared 0.775034 0.538344 0.79 0.538344 0.72 0.48Dwatson 1.899407 1.833568 1.9 1.833568 1.9 1.83AIC -423.0484 -73.21995 -435.76 -73.21995SC -252.6166 61.49 -262.289 61.49

LV1 model LV2 model VAR MODEL

19

20

Cost Function

20

0 50 100 150 200 250 300 350 400 4500 €

50,000 €

100,000 €

150,000 €

200,000 €

250,000 €

GRPs

Inve

stm

ent

• Theoretical models usually use a quadratic function to derive maximum policies.

• Data exhibit a logarithmic behavior.


21

Numerical Method

22

Dynamic Programming (I)

22

• Discrete scheme using a step h>0:

�̇� 𝒊=𝒈𝒊(𝒙𝒊 , 𝒙 𝒋 ,𝒑𝒊 ,𝒖𝒊 ,𝒖 𝒋)

𝒙 𝒊(𝒕+𝟏)=𝒙 𝒊 (𝒕 )+𝒉𝒈𝒊(𝒙 (𝒕 ) ,𝒑 (𝒕 ) ,𝒖 (𝒕 ) ;𝜶 )


23

Dynamic Programming (III)

23

• Discrete Hamilton-Jacobi-Bellman (simplified notation)

𝑽 𝒊 ,𝒉 (𝒙 )=𝐦𝐚𝐱𝒖

{𝒉× 𝒇 𝒊 ( 𝒙 (𝒕 ) ,𝒑 (𝒕 ) ,𝒖 (𝒕 ) ,𝜶 )+𝜷×𝑽 𝒊 ,𝒉 (𝒙 (𝒕 )+𝒉𝒈𝒊 (𝒙 (𝒕 ) ,𝒑 (𝒕 )𝒖 (𝒕 ) ;𝜶 ))}

• To solve it the scheme is iterated in time and space.

• This maximum is explicitly calculated (so we need the Gradient of )

• To solve this fixed point [V is in both terms] the contracting mapping theorem is applied

• Linear interpolation is used to obtain the values of Vout of discrete grid of x


Dynamic Programming (II)

24

• Discrete Hamilton-Jacobi-Bellman (we simplify notation)

𝑽 𝒊 ,𝒉 (𝒙 ,𝜶 )=𝐦𝐚𝐱𝒖

{𝒉× 𝒇 𝒊 ( 𝒙 (𝒕 ) ,𝒖 (𝒕 ) ,𝜶 )+𝜷×𝑽 𝒊 ,𝒉(𝒙 (𝒕 )+𝒉𝒈𝒊 (𝒙 (𝒕 ) ,𝒖 (𝒕 ) ;𝜶 ))}

The algorithm

1 • Let G a 2D grid with x1 and x2, define dx1,dx2 and h• Initialize control vectors to 0, =0

2

3

4

5

• Startwith a guessestimateforVih and calculateitsgradient (gradVi)

• Actualize x1 and x2 accordingto• VALUE ITERATION (subroutine). Takingfixed, iterate in HJB

tillconvergence. Use interpolation

• POLICY ITERATION (subroutine). Taking Vihfixed, explicitly obtain a new iterating in HJB till convergence.

• Reiterate VALUE and POLICY till tolerance criteria is achieved.


Microsoft

es SUBROUTINE te falta la O

25

Dynamic Programming (II)

25

• We have run numerical experiments to validate our algorithm:

• We have solved a 1D model with an explicit solution [Brock-Mirman]

• We have also solved a 2D benchmark model, found in the literature, without any explicit solution, but we have obtained similar results to the published ones.


26

The ‘how much’ war. Numerical Solution

27

Our solution

27

28

Competitor

28

29

We could have won a 7% more

29

30

Conclusions and Future Research

31

Conclusions and Future Research

We have shown a differential game algorithm in a closed-loop equilibrium to solve a duopolistic problem with real world data:

1. Obtaining optimal advertising2. Obtaining optimal price

In our experiment, we show that we could have won a 7% more than with the adopted strategy.

31

In future works, we propose:1. Study Stackelberg equilibrium (leader-follower)2. Study Stochastic Games taking into account estimatedvalues of error sizes with statistical methods3. Incorporate richer lag schemes in dynamical systemas suggested by statistical tests.


Differential games

Education

Transcript of Differential games