A Brief Introduction to Differential Games

  • date post

    16-Nov-2015
  • Category

    Documents


  • 396 International Journal of Physical and Mathematical Sciences Vol 4, No 1 (2013) ISSN: 2010-1791

    International Journal of Physical and Mathematical Sciences

    journal homepage: http://icoci.org/ijpms

    A Brief Introduction to Differential Games

    L. Gómez Esparza, G. Mendoza Torres, L. M. Saynes Torres.

    Facultad de Ciencias de la Electrónica. Benemérita Universidad Autónoma de Puebla.

    1. Introduction

    The theory of dynamic games is concerned with multi-person decision making. The
    principal characteristic of a dynamic game is that it involves a dynamic decision process evolving
    in time (continuous or discrete), with more than one decision maker, each with its own cost
    function and possibly having access to different information. Dynamic game theory adopts
    characteristics from both game theory and optimal control theory, although it is much more versatile
    than either of them.

    Differential games belong to a subclass of dynamic games called games in the state space.

    In a game in the state space, the modeler introduces a set of variables to describe the state of a

    dynamic system, at any particular instant of time in which the game takes place. The systematic

    study of the problems of differential games was initiated by Isaacs in 1954.

    After the development of Pontryagin's maximum principle, it became
    clear that there was a connection between differential games and optimal control theory. In fact,
    differential game problems are a generalization of optimal control problems to cases
    where there is more than one controller or player. However, differential games are conceptually much more
    complex than optimal control problems in that it is not as clear what constitutes a solution. There are
    different kinds of optimal solutions for differential games, such as minimax solutions,
    Nash equilibria, and Pareto equilibria, depending on the characteristics of the game (see e.g.,
    Tolwinski (1982) and Haurie, Tolwinski, and Leitmann (1983)).

    We present some results on cooperative and non-cooperative differential
    games and their "optimal" solutions. In particular, we study those related to Pareto
    equilibria (cooperative games) and Nash equilibria (non-cooperative games), although there are other types of cooperative
    and non-cooperative games, for example commitment games and Stackelberg games, to name a
    few.

    2. Preliminaries in optimal control theory

    As mentioned above, optimal control problems are a special class of differential games,
    namely those with a single player and a single cost criterion. In this section we review some basic results of optimal control theory,
    namely dynamic programming and the maximum principle, since these results are fundamental in
    dynamic game theory.

    2.1. Statement of optimal control problem (OCP)

    In general, the optimal control problem (continuous time) can be defined as follows

    (1)

    where the differential equation governing the state is called the state equation and
    the functional to be maximized is called the objective function or cost criterion. In other words, the problem is to find the
    admissible control that maximizes the objective function, subject to the state equation and the
    control constraints

    (2)

    Usually the control set is determined by constraints (physical, economic, biological, etc.) on the
    values of the control variables at each time. A control that solves the problem is called the optimal control,
    and the state trajectory determined by the state equation under this control is called the optimal trajectory or
    optimal path.
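The displayed formulas (1) and (2) did not survive in this transcript. As a point of reference only, a common textbook form of such a problem is sketched below. The symbols $F$ (running payoff), $S$ (terminal payoff), $f$ (state dynamics), $U(t)$ (control set), $x_0$ (initial state) and $T$ (horizon) are assumptions introduced here, not necessarily the authors' notation, and the original statement may differ, for example by including a discount factor.

```latex
\begin{align*}
\max_{u(\cdot)} \quad & J(u) = \int_0^T F\bigl(x(t), u(t), t\bigr)\,dt + S\bigl(x(T)\bigr) \\
\text{subject to} \quad & \dot{x}(t) = f\bigl(x(t), u(t), t\bigr), \qquad x(0) = x_0, \\
& u(t) \in U(t), \qquad t \in [0, T].
\end{align*}
```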

    2.2. Dynamic Programming and the Maximum Principle.

    Dynamic programming is based on Bellman's principle of optimality, which Richard Bellman
    stated in his 1957 book on dynamic programming:

        An optimal policy has the property that, whatever the initial
        state and initial decision are, the remaining decisions must
        constitute an optimal policy with regard to the state
        resulting from the first decision.

    Let us consider the optimal control problem (1). The maximum principle can be derived from
    Bellman's principle of optimality (see [45]). We state the maximum principle as follows.

    Theorem 1. Assume that there exists an optimal pair (an optimal control with its optimal
    trajectory) for the optimal control problem (1), and that the functions defining the objective
    and the state equation are continuously differentiable in the state and continuous in the
    control and in time. Then there exists an adjoint variable that satisfies

    (3)

    (4)

    (5)

    where the so-called Hamiltonian is defined as

    (6)
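Conditions (3) to (6) are missing from this transcript. For orientation, the block below sketches how the necessary conditions usually read for an undiscounted problem with running payoff $F$, terminal payoff $S$, dynamics $f$ and control set $U(t)$. All of these symbols, as well as $\lambda$ for the adjoint variable and the starred optimal pair $(x^*, u^*)$, are assumptions made here; the numbering and ordering in the original may differ.

```latex
\begin{align*}
H(x, u, \lambda, t) &= F(x, u, t) + \lambda\, f(x, u, t)
  && \text{(Hamiltonian)} \\
\dot{\lambda}(t) &= -\frac{\partial H}{\partial x}\bigl(x^*(t), u^*(t), \lambda(t), t\bigr)
  && \text{(adjoint equation)} \\
\lambda(T) &= \frac{\partial S}{\partial x}\bigl(x^*(T)\bigr)
  && \text{(transversality condition)} \\
H\bigl(x^*(t), u^*(t), \lambda(t), t\bigr) &\ge H\bigl(x^*(t), u, \lambda(t), t\bigr)
  \quad \text{for all } u \in U(t)
  && \text{(maximum condition)}
\end{align*}
```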

    The maximum principle states that, under certain assumptions, for every optimal
    control path there exists an adjoint trajectory such that the maximum condition, the adjoint equation, and the
    transversality condition (eq. 4) are satisfied. To obtain a sufficiency theorem we augment these
    conditions by convexity assumptions. This yields the following theorem.

    Theorem 2. Consider the optimal control problem given by equations (1) and (2), and define
    the Hamiltonian function as in (6) and the maximized Hamiltonian function by

    (7)

    Assume that the state space is a convex set and that the terminal (scrap value) function is continuously differentiable and
    concave. Consider a feasible control path with its corresponding state trajectory. If there exists
    an absolutely continuous adjoint function such that the maximum condition

    (8)

    the adjoint equation

    (9)

    and the transversality condition

    (10)

    are satisfied, and such that the maximized Hamiltonian is concave and continuously
    differentiable with respect to the state for all times, then the control path is an optimal path. If the set of feasible
    controls does not depend on the state, this result remains true if equation (9) is replaced by

    (11)
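To make the sufficiency conditions concrete, here is a minimal numerical sketch on a toy problem that is not taken from the paper: maximize the integral of x(t) - u(t)^2/2 over [0, T] subject to x'(t) = u(t), x(0) = 0, with no control constraint. The problem, the function name `objective` and the chosen perturbations are all assumptions made for illustration. The comments derive the candidate control from the maximum condition, adjoint equation and transversality condition, and the code then checks that a few perturbed controls do worse.

```python
# Toy problem (an assumption for illustration, not from the paper):
#   maximize J(u) = integral_0^T [x(t) - u(t)^2 / 2] dt
#   subject to x'(t) = u(t), x(0) = 0, with no control constraint.
# Hamiltonian: H = x - u^2/2 + lam * u.
#   maximum condition: dH/du = 0      -> u = lam
#   adjoint equation:  lam' = -dH/dx  -> lam' = -1
#   transversality:    lam(T) = 0     -> lam(t) = T - t
# The maximized Hamiltonian x + lam^2/2 is linear (hence concave) in x,
# so the candidate u*(t) = T - t satisfies the sufficient conditions.

import math


def objective(control, horizon=1.0, steps=2000):
    """Approximate J(u) with forward Euler and a left Riemann sum."""
    h = horizon / steps
    x = 0.0
    total = 0.0
    for k in range(steps):
        t = k * h
        u = control(t)
        total += (x - 0.5 * u * u) * h
        x += h * u
    return total


T = 1.0


def candidate(t):
    """Control obtained from the maximum principle for the toy problem."""
    return T - t


# Arbitrary feasible perturbations of the candidate control.
perturbed_controls = [
    lambda t: candidate(t) + 0.5,
    lambda t: candidate(t) - 0.3 * t,
    lambda t: candidate(t) + 0.4 * math.sin(6.0 * t),
]

j_star = objective(candidate, T)
j_perturbed = [objective(u, T) for u in perturbed_controls]

print("J(u*) =", j_star)
print("J(perturbed) =", j_perturbed)
```

For this toy problem the exact optimal value is T^3/6, so the printed value of J(u*) should be close to 1/6 when T = 1, up to discretization error.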

    3. Differential games: basic concepts

    The general N-player (deterministic) differential game in continuous time is described by the state equation

    (12)

    and the cost functional for each player is given by the equation

    (13)

    for each player index, where the set of all player indices is called the players' set.

    In this formulation we consider a fixed time interval, which is the prescribed duration of
    the game, and an initial state that is known by all players. The space of state trajectories is called the
    trajectory space of the game. At each time, every player chooses a control from a set that
    is called the admissible strategy set for that player. The problem can then be stated as follows:
    each player wants to choose his control so as to minimize (or maximize)
    his cost functional (profits) subject to the state equation (12).

    It is assumed that all players know the state equation as well as the cost functionals.
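Since equations (12) and (13) are missing from this transcript, the following block sketches one standard way of writing an N-player differential game of fixed duration. The symbols $f$, $g^i$, $q^i$, $U^i$, $x_0$ and $T$ are assumed here for illustration and need not match the authors' notation.

```latex
\begin{align*}
\dot{x}(t) &= f\bigl(t, x(t), u^1(t), \dots, u^N(t)\bigr), \qquad x(0) = x_0, \\
J^i(u^1, \dots, u^N) &= \int_0^T g^i\bigl(t, x(t), u^1(t), \dots, u^N(t)\bigr)\,dt
  + q^i\bigl(x(T)\bigr), \qquad i \in \{1, \dots, N\}, \\
u^i(t) &\in U^i \quad \text{for all } t \in [0, T].
\end{align*}
```

With $N = 1$ this reduces to a single-player optimal control problem, which is the sense in which optimal control is a special case of a differential game.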

    Example 1. In a two-firm differential game with one state variable, the state evolves over
    time according to the differential equation

    in which the scalar control variables belong to firms 1 and 2, respectively. The state variable
    represents the number of customers that firm 1 has at each time, and the total market has a
    constant size. Hence the market size minus the state is the number of customers of firm 2. The control variables
    are the firms' respective advertising effort rates at each time. The differential equation, in this case, can

    be interpreted in the following way: the number of customers of firm 1 tends to increase by the

    advertising efforts of firm 1 since these efforts attract customers from firm 2. On the other hand, the

    advertising efforts of firm 2 tend to draw away customers from firm 1.

    Payoffs are given by

    in which a constant coefficient represents firm i's unit revenue. The second term in the integrand of each payoff is a convex
    advertising cost function of the firm. Feasibility requires that both advertising rates are nonnegative. Each
    firm wishes to choose its advertising strategy over the planning horizon so as to maximize its payoff. The payoff is
    simply the present value of the firm's profit over the horizon.

    Remark. In this game, the rival firm's actions do not influence a firm's payoff directly but only

    indirectly through the state dynamics.
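The state equation and payoffs of Example 1 are missing from this transcript, so the sketch below uses hypothetical forms that match the verbal description: customers flow to firm 1 in proportion to its advertising rate and firm 2's customer base, and away from firm 1 in proportion to firm 2's advertising rate. The dynamics x' = u1 (M - x) - u2 x, the quadratic advertising cost, the constant advertising rates, and all parameter values and function names are assumptions, not the authors' specification.

```python
import math


def simulate_customers(u1, u2, market_size, x0, horizon, steps=10000):
    """Forward-Euler path of firm 1's customers under constant advertising rates.

    Assumed dynamics (hypothetical, see lead-in): x' = u1 * (M - x) - u2 * x.
    """
    h = horizon / steps
    x = x0
    path = [x]
    for _ in range(steps):
        x += h * (u1 * (market_size - x) - u2 * x)
        path.append(x)
    return path


def discounted_profit(customers, effort, unit_revenue, cost_coeff, rate, horizon):
    """Left Riemann sum of exp(-rate * t) * (revenue - quadratic advertising cost)."""
    steps = len(customers) - 1
    h = horizon / steps
    total = 0.0
    for k in range(steps):
        flow = unit_revenue * customers[k] - 0.5 * cost_coeff * effort ** 2
        total += math.exp(-rate * k * h) * flow * h
    return total


# Illustrative parameter values, chosen arbitrarily.
M, X0, T = 100.0, 20.0, 20.0
U1, U2 = 0.6, 0.4  # constant advertising rates of firms 1 and 2

firm1 = simulate_customers(U1, U2, M, X0, T)
firm2 = [M - x for x in firm1]  # firm 2 serves the rest of the market

profit1 = discounted_profit(firm1, U1, 1.0, 1.0, 0.05, T)
profit2 = discounted_profit(firm2, U2, 1.0, 1.0, 0.05, T)

print("final customers of firm 1:", firm1[-1])
print("payoffs:", profit1, profit2)
```

Under these assumed dynamics and constant rates, firm 1's customer base tends toward M * u1 / (u1 + u2). Each firm's payoff depends on the rival's advertising only through the customer path, which is the point made in the Remark.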

    3.1. The information structure

    In many problems the control function of each player should be specified by means of an
    information structure, which is defined for each player as

    where is nondecreasing in .

    Depending on the type of information available, we can define for each player a strategy
    space consisting of all suitable mappings, as follows

    We also require that belongs to for .
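The definitions in this subsection lost their formulas in the transcript. One convention found in the dynamic games literature, given here only as a hedged sketch, writes the information available to player $i$ at time $t$ as a set of past state values and the control as a mapping of that information. The symbols $\eta^i$, $\varepsilon^i_t$, $\gamma^i$ and $\Gamma^i$ are assumptions and may not coincide with the authors' notation.

```latex
\begin{align*}
\eta^i(t) &= \{\, x(s) : 0 \le s \le \varepsilon^i_t \,\}, \qquad 0 \le \varepsilon^i_t \le t,
  \qquad \varepsilon^i_t \text{ nondecreasing in } t, \\
u^i(t) &= \gamma^i\bigl(t, \eta^i(t)\bigr), \qquad \gamma^i \in \Gamma^i .
\end{align*}
```

Here $\Gamma^i$ plays the role of the strategy space of player $i$, and the monotonicity of $\varepsilon^i_t$ expresses that a player does not forget state information already acquired.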

    Some types of standard information structures