The American Mathematical Monthly, Vol. 83, No. 4 (Apr., 1976), pp. 225-239. Published by the Mathematical Association of America. Stable URL: http://www.jstor.org/stable/2318209.

OPTIMAL CONTROL THEORY

LEONARD D. BERKOVITZ

1. Introduction. The development of the mathematical theory of optimal control began in the early to mid 1950's, partially in response to problems in various branches of engineering and economics. Despite its modern origins, optimal control theory, from a mathematical point of view, is a variant of one of the oldest and most important subfields in mathematics, the calculus of variations. We shall sketch some of the important aspects of optimal control theory, the story of their development, and their relationship to the calculus of variations and to other branches of mathematics.

Besides informing the reader about certain aspects of optimal control theory, we hope to impress him with yet another example of the fruitful interaction between developments in pure mathematics and applications and between concrete special problems and general theories. The great historical examples of such interactions and their contributions to the vigorous growth of mathematics are well known. The story of the development of optimal control theory in the last twenty years is a small and yet unfinished chapter in the history of such interactions. It illustrates how meaningful applied problems can lead to new developments in mathematics and to a revitalization of old areas of research. It also illustrates the utility of existing mathematics, which may have been developed in other contexts, for the solution of applied problems.

In order to keep the length of this paper within reason we must limit our discussion to a few selected topics. We trust, however, that these will adequately illustrate and document the assertions we made above.

2. The servo problem. In the early 1950's electrical engineers, influenced by a paper of McDonald [21], became interested in improving the performance of servomechanisms with respect to time of response. A typical problem is the following.

A control surface on an aircraft is to be kept at rest at a fixed position. A wind gust displaces the surface from the desired position. It is assumed that if nothing were done, then the control surface would behave as a damped harmonic oscillator. Thus if θ measures the deviation from the desired position, which we take to be zero, then the free motion of the surface satisfies the differential equation

θ'' + aθ' + ω^2 θ = 0,

with initial conditions θ(0) = θ_0 and θ'(0) = θ'_0. Here θ_0 is the displacement of the surface resulting from the wind gust and θ'_0 is the velocity imparted to the surface by the gust. On an aircraft the damped oscillation of a control surface cannot be permitted. Therefore, a servomechanism is to apply a restoring torque so as to bring the system to rest at the desired position in minimum time. The equation of motion of the control surface then becomes

(2.1)   θ'' + aθ' + ω^2 θ = u(t),   θ(0) = θ_0,   θ'(0) = θ'_0,

where u(t) represents the restoring torque at time t. Since the voltages available are not unlimited, the magnitude of the torque is constrained by an inequality |u(t)| ≤ c, where c is a constant. The problem is to apply the torque as a function of time in such a way that the system will be brought to θ = 0, θ' = 0 in minimum time.

It is clear that if θ_0 > 0 and θ'_0 > 0 then the torque should initially be directed in the direction of negative θ and should have the largest possible magnitude. Thus u(t) = -c initially. It is also clear that if the torque -c is applied for too long a period then we shall overshoot the desired terminal condition of θ = 0, θ' = 0. Thus it appears that at some point there should be a torque reversal to +c in order to brake the system. The question arises whether this is indeed so, and if so, where the switch should occur. Or, is it better to remove the torque at some point, allow a small overshoot, and then apply +c? Still another possibility is to allow the system to overshoot under the influence of -c, then apply +c, and finally brake with -c. In this vein, one could ask whether a sequence -c, +c, -c, ..., +c of n steps is best, and if so, what is n, and where do the switches occur. Related to these questions is the question of what to do if the servo motor starts to act at θ_0 > 0 and θ'_0 < 0.
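As a rough numerical feel for these dynamics, one can integrate the servo equation under a crude candidate torque rule and watch the state; this is only a sketch, not the time-optimal synthesis discussed below, and the constants a, ω, c, the initial state, and the step size are illustrative assumptions.

```python
# Crude simulation of theta'' + a*theta' + w**2*theta = u with the naive
# bang-bang rule u = -c*sign(theta).  All numerical values are assumptions;
# the rule is NOT claimed to be optimal -- the questions in the text are
# precisely about where the optimal switches occur.
import numpy as np

a, w, c = 0.5, 2.0, 1.0                 # damping, natural frequency, torque bound
theta, thetadot = 1.0, 0.0              # displacement and velocity after the gust
dt, T = 1e-3, 10.0

for _ in range(int(T / dt)):
    u = -c * np.sign(theta)             # torque always at the bound, opposing theta
    thetadot += (-a * thetadot - w**2 * theta + u) * dt   # Euler step for theta'
    theta += thetadot * dt                                # Euler step for theta
print("state after", T, "time units:", theta, thetadot)
```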

In the engineering literature of the period it was always assumed, on the basis of intuition, that the optimal mode of operation was one in which the restoring torque only took on the values +c and -c, and various incomplete analyses were made to determine the appropriate sequence of ±c to be used. A restoring torque, or control, u that only takes on the extreme values u(t) = ±c is called a "bang-bang" control. A major contribution to the problem was made in 1952 by Bushaw in his Princeton doctoral thesis, later published in [8]. Bushaw also assumed that the bang-bang mode of operation was indeed optimal and then proceeded to determine for what values of (θ, θ') one used +c and for what values of (θ, θ') one used -c. His arguments were elementary and non-variational. He also determined the optimal trajectories of the system in the (θ, θ')-plane.

3. The linear time-optimal problem. The servo problem of the preceding section can be rewritten in the following form. Let x^1 = θ and let x^2 = θ'. The differential equation (2.1) is then equivalent to the system

dx^1/dt = x^2,   x^1(0) = θ_0,

dx^2/dt = -a x^2 - ω^2 x^1 + u,   x^2(0) = θ'_0,

where u is a function to be chosen from a specified class of functions and is required to satisfy the constraint |u(t)| ≤ c. The problem is to choose u so as to minimize

∫_0^{t_1} dt,

where t_1 is the first time such that x^1(t) = 0 and x^2(t) = 0, provided such a first time exists.

The servo problem in the formulation just given is a special case of the following problem. The state of a system at time t is described by an n-vector x(t) = (x^1(t), ..., x^n(t)). The system evolves according to a system of linear differential equations that we write in vector-matrix form as follows:

dx/dt = A(t)x + B(t)u(t),   x(t_0) = x_0.

Here A is an n × n matrix whose entries are functions of t, B is an n × m matrix whose entries are functions of t, u = (u^1, ..., u^m) is an m-vector whose components are functions of t that satisfy |u^i(t)| ≤ c^i, where the c^i, i = 1, ..., m, are constants, t_0 is the initial time, and x_0 is the initial state. It is required to choose u so as to bring the system to the origin in minimum time. In other words, we are to choose a u that minimizes t_1, where t_1 is the first time such that x(t_1) = 0. The functions u that we choose are called controls. A control u* that minimizes t_1 is called an optimal control.

With the problem formulated this way, Bellman, Glicksberg, and Gross [4] showed, using rather elementary functional analysis, that for special matrices A and B an optimal control does exist and is indeed bang-bang. Unfortunately, the assumptions made about the matrix A in [4] were not fulfilled in many examples of interest, including the original servo problem. Nevertheless, this was the first general formulation of the linear problem and the methods introduced in this paper were expanded and exploited by others. Later, LaSalle [17] showed that for linear systems more general than those treated by Bellman, Glicksberg, and Gross it is true that an optimal control exists and that it is bang-bang. LaSalle also showed, under appropriate hypotheses, that if the system can be driven to the origin by a control u, then there is a bang-bang control that will drive the system to the origin in the same time. This is the so-called "bang-bang principle." Note that this principle holds for any control, optimal or not.

The validity of the bang-bang principle is of importance in engineering for the following reason. One need only use "contact" or "on-off" servomechanisms, since one need only consider controls that take on the values u^i(t) = ±c^i, i = 1, ..., m. Such servomechanisms are easier to build and operate than those in which all values in the intervals [-c^i, c^i] are allowed.

The bang-bang principle can be established in varying degrees of generality by various arguments. As with other mathematical theorems, the proof has undergone a series of refinements and improvements. At present it is seen to follow with a minimum of technical detail from the Krein-Milman theorem and the characterization of the extreme points of convex sets in a certain function space. See [6].
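The bang-bang structure is easiest to see on the simplest linear example, the double integrator x^1' = x^2, x^2' = u, |u| ≤ c, which is the servo system with a = ω = 0. It is classical that from the initial state (d, 0), d > 0, the time-optimal control applies -c up to a single switch and +c afterwards, reaching the origin at time 2(d/c)^{1/2}. The sketch below, with assumed values of d and c, simply integrates that candidate control and checks that it does arrive at the origin.

```python
# Minimal check of the one-switch bang-bang control for the double integrator
# x1' = x2, x2' = u, |u| <= c, starting from (d, 0); d and c are assumptions.
import numpy as np
from scipy.integrate import solve_ivp

d, c = 4.0, 1.0
t_switch = np.sqrt(d / c)          # switch time for this particular start
T = 2.0 * t_switch                 # the minimum time 2*sqrt(d/c)

def rhs(t, x):
    u = -c if t < t_switch else c  # bang-bang control with one switch
    return [x[1], u]

sol = solve_ivp(rhs, (0.0, T), [d, 0.0], max_step=1e-3)
print("state at time 2*sqrt(d/c):", sol.y[:, -1])   # should be close to (0, 0)
```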

In the Soviet Union, R. V. Gamkrelidze [11] and his collaborators also attacked the time optimal problem. Their methods were different, and we shall discuss their work later.

4. The nonlinear optimal control problem. The linear time optimal problem described in the preceding section is a special case of the following general nonlinear problem. The state of a system at time t is described by a point or vector

x(t) = (x^1(t), ..., x^n(t))

in n-dimensional euclidean space, n ≥ 1. Initially, at time t_0, the state of the system is

x(t_0) = x_0 = (x_0^1, ..., x_0^n).

More generally, we can require that at the initial time t_0, the initial state x_0 is such that the point (t_0, x_0) belongs to some pre-assigned set S_0 in (t, x)-space. The state of the system varies with time according to the system of differential equations

(4.1)   dx^i/dt = f^i(t, x, z),   i = 1, ..., n,

where z = (z^1, ..., z^m) is a vector in real euclidean space E^m and the functions f^i are real-valued continuous functions of the variables (t, x, z).

By the 'system varying according to (4.1)' we mean the following. A function u with values in m-dimensional euclidean space is chosen from some prescribed class 𝒰 of functions, such as the piecewise continuous functions, the bounded measurable functions, the functions in L, etc. When the substitution z = u(t) is made in the right hand side of (4.1) we obtain a system of ordinary differential equations:

(4.2)   dx^i/dt = f^i(t, x, u(t)),   i = 1, ..., n.

For each u in 𝒰 it is assumed that there exists a point (t_0, x_0) in S_0 and a function φ = (φ^1, ..., φ^n) defined on an interval [t_0, t_2] with values in E^n such that (4.2) is satisfied. That is, we require that

(φ^i)'(t) = f^i(t, φ(t), u(t)),   φ^i(t_0) = x_0^i,   i = 1, ..., n.

The function φ describes the evolution of the system with time and is called a trajectory. The function u is called a control.

The control u is further required to be such that at some time t_1, where t_0 < t_1 ≤ t_2, the point (t_1, φ(t_1)) belongs to a preassigned set S_1 and for t_0 ≤ t < t_1 the points (t, φ(t)) do not belong to S_1. The set S_1 is called the terminal set for the problem. In the servo problem the set S_0 is the single point (t_0, x_0) and the set S_1 is defined by:

S_1 = {(t, x^1, x^2) : x^1 = 0, x^2 = 0}.

The discussion in the preceding paragraphs is sometimes summarized in less precise but somewhat more graphic language by the statement that the controls u are required to transfer the system from an initial state x_0 at time t_0 to a terminal state x_1 at time t_1, where (t_0, x_0) ∈ S_0 and (t_1, x_1) ∈ S_1. Note that to a given u in 𝒰 there will generally correspond more than one trajectory φ. This results from different choices of initial points (t_0, x_0) in S_0 or from non-uniqueness of solutions of (4.2) if no assumptions are made to guarantee the uniqueness of solutions of (4.2).

It is often further required that a control u in 𝒰 and a corresponding solution φ must satisfy a system of inequality constraints

(4.3)   R^i(t, φ(t), u(t)) ≥ 0,   i = 1, 2, ..., r,

for all t_0 ≤ t ≤ t_1, where the functions R^1, ..., R^r are given functions of (t, x, z). In the linear time-optimal problem the constraints are

u^i(t) + c^i ≥ 0,
c^i - u^i(t) ≥ 0,   i = 1, ..., m.

More generally, it is required that u(t) ∈ Ω(t, φ(t)), where Ω is a mapping that assigns to each point (t, x) a subset Ω(t, x) of E^m.

Let f^0 be a real-valued function of (t, x, z) and let g be a real-valued function defined on S_0 × S_1. For each control u in 𝒰 and corresponding trajectory φ we define a payoff as follows:

(4.4)   J(φ, u) = g(t_0, φ(t_0), t_1, φ(t_1)) + ∫_{t_0}^{t_1} f^0(s, φ(s), u(s)) ds.

It is tacitly assumed here that the function f^0 and the class 𝒰 are such that the integral in (4.4) exists. In the servo problem we can write J in two equivalent ways. We can either set f^0 ≡ 1 and g ≡ 0 or g(t_0, x_0, t_1, x_1) = t_1 and f^0 ≡ 0.

We define an admissible control u to be a function u in 𝒰 such that a corresponding solution φ of (4.2) exists with the property that (4.3) holds, the integral in (4.4) is defined, and

(4.5)   (t_0, φ(t_0), t_1, φ(t_1)) ∈ S_0 × S_1.

We call the solution φ an admissible trajectory, and we call the pair (φ, u) an admissible pair. The nonlinear optimal control problem that we consider is the following.

PROBLEM: Find an admissible pair (φ*, u*) such that

J(φ*, u*) ≤ J(φ, u)

for all admissible pairs (φ, u). That is, minimize J over the class of admissible pairs.


5. More about applied problems. Concurrent with the interest of electrical engineers in the time optimal control problem, there was interest in other disciplines in optimization problems whose mathematical formulation was that of the general control problem. Notable among these disciplines were mechanical engineering, chemical engineering, aerospace engineering, and economics. Also, problems in electrical engineering other than the time optimal one were arising that could be formulated as a general control problem. Even a representative sample of these problems, let alone a complete catalogue, is not possible in this paper; one is given in [6]. We shall, however, present one simplified example from economics of a production planning problem.

Let x(t) denote the rate of production of steel at time t. The amount produced at time t is to be allocated to one of two uses, the production of consumer products or investment. It is assumed that the steel allocated to investment is used to increase productive capacity by using steel to produce new steel mills, mining equipment, transport facilities, etc. Let u(t), where 0 ≤ u(t) ≤ 1, denote the fraction of the steel produced at time t that is allocated to investment. Then (1 - u(t)) represents the fraction allocated to consumption. The assumption that the reinvested steel is used to increase the productive capacity means that

dx/dt = k u(t) x,

where k is an appropriate constant. The problem is to choose u(t) so as to maximize the total consumption over a fixed time period of length T >0. That is, we are to maximize

∫_0^T (1 - u(t)) x(t) dt.

The problem we are faced with is the following. Do we always consume everything produced? Or, do we invest some at present so as to increase capacity now so that we shall be able to produce more, and hence consume more later? Do we follow a "bang-bang" procedure of first investing everything and then consuming everything?

The production planning problem can be written in the control formulation of Section 4 as follows. Minimize J(φ, u), where

J(φ, u) = -∫_0^T (1 - u(s)) φ(s) ds,

over the set of pairs (φ, u) that satisfy the differential equation

dx/dt = k u(t) x,   x(0) = c,

and constraints

0 ≤ u(t) ≤ 1,   φ(t) ≥ 0.

Here c > 0 is the initial capacity. The constraint φ(t) ≥ 0 is present since we do not destroy capacity, and thus negative capacity is meaningless.
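One can get a feel for the trade-off by comparing a few candidate policies of the form "invest everything up to a switch time, then consume everything." The values of k, c, T in the sketch below are illustrative assumptions; among policies of this one-switch form, the switch time T - 1/k (assuming T > 1/k) is the one singled out by the necessary conditions of Section 7.

```python
# Compare total consumption for several one-switch policies in the production
# planning example: u = 1 (invest everything) before the switch, u = 0 after.
# k, c, T and the step size are assumed values for illustration only.
import numpy as np

k, c, T = 0.5, 1.0, 6.0
dt = 1e-4

def total_consumption(switch):
    """Integral of (1 - u) x dt along dx/dt = k u x, x(0) = c."""
    x, J = c, 0.0
    for t in np.arange(0.0, T, dt):
        u = 1.0 if t < switch else 0.0
        J += (1.0 - u) * x * dt
        x += k * u * x * dt          # Euler step for the capacity equation
    return J

for s in [0.0, T / 2, T - 1.0 / k, T]:
    print(f"switch at t = {s:4.1f}: total consumption = {total_consumption(s):8.3f}")
```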

Although the problems considered in the various areas of application had the common mathematical formulation of the problem in Section 4, these problems were not so formulated at the time. Each was formulated in the language of its own field. For the most part, investigators were unaware of the work in other disciplines. Each discipline seemed to favor its own set of methods for attacking the problem.

In the next section we shall see that the control problem is variational in character. Almost all of the investigators in the areas of application were aware of the variational character of the problem. The constraint (4.3), however, prevented direct application of the known results in the calculus of variations. In many special problems the basic technique of the calculus of variations, namely that of comparing the optimal solution with a perturbation thereof which still satisfies the side conditions, was carried out for the example in question (see, e.g., [2]). Others tried various transformations of the problem into a variational problem without the constraint (4.3), with varying degrees of success (see, e.g., [9], [19], [23]). As already noted, functional analysis was used effectively in the time optimal problem. This is because the differential equations (4.2) are linear. In some problems the dynamic programming formalism was used (see, e.g., [1]).

6. The calculus of variations. The "simple problem" in the calculus of variations is the following. A real-valued function f^0 : (t, x, z) → f^0(t, x, z) is given, where t is a scalar, x = (x^1, ..., x^n) is a vector in euclidean space E^n and z = (z^1, ..., z^n) is a vector in E^n. A point (t_0, x_0) and a point (t_1, x_1) in E^{n+1} are given, as is a class Φ of functions φ such that φ(t_0) = x_0, φ(t_1) = x_1 and such that the integral

J(φ) = ∫_{t_0}^{t_1} f^0(t, φ(t), φ'(t)) dt

is defined. The problem is to find a function φ* in Φ that minimizes J(φ); i.e., find a φ* in Φ such that J(φ*) ≤ J(φ) for all φ in Φ. In connection with the simple problem, we call the reader's attention to two necessary conditions that a minimizing function must satisfy. Our statements will not be precise. The precise sense in which the conditions hold depends on the function class Φ considered, the properties of f^0, etc. For typographic purposes we now let φ designate the minimizing function.

The first necessary condition is the system of Euler equations

(6.1)   (d/dt) f^0_{z^i}(t, φ(t), φ'(t)) = f^0_{x^i}(t, φ(t), φ'(t)),   i = 1, ..., n.
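For example, when n = 1 and f^0(t, x, z) = (1 + z^2)^{1/2}, so that J(φ) is the length of the curve x = φ(t), equation (6.1) states that φ'(t)(1 + φ'(t)^2)^{-1/2} is constant; hence φ' is constant and the minimizing functions are the straight lines joining (t_0, x_0) to (t_1, x_1). Since this f^0 is convex in z, the Weierstrass condition stated next is also satisfied.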

The second is the Weierstrass condition

(6.2)   f^0(t, φ(t), z) - Σ_{i=1}^n f^0_{z^i}(t, φ(t), φ'(t)) z^i ≥ f^0(t, φ(t), φ'(t)) - Σ_{i=1}^n f^0_{z^i}(t, φ(t), φ'(t)) (φ^i)'(t)

for all z in the open domain of definition of f^0.

The Bolza problem in the calculus of variations is that of minimizing a functional of the form

g(t_0, φ(t_0), t_1, φ(t_1)) + ∫_{t_0}^{t_1} f^0(t, φ(t), φ'(t)) dt

over those functions in an appropriate class C that satisfy a system of differential equations

G^i(t, φ(t), φ'(t)) = 0,   i = 1, ..., m,

and end conditions

(t_0, φ(t_0), t_1, φ(t_1)) ∈ B,

where m < n and B is a preassigned set in E^{2+2n}. If g ≡ 0 then the problem is called a Lagrange problem. If f^0 ≡ 0 it is called a Mayer problem. It can be shown that all three problems are equivalent [7].

A fairly complete theory for the Bolza problem had been developed by 1940. Again, from the wealth of information about the Bolza problem we select two necessary conditions for the reader's attention. These can be summarized roughly as follows. Introduce the Lagrangian function F by the formula

F(t, x, z, ψ^0, ψ) = ψ^0 f^0(t, x, z) + Σ_{i=1}^m ψ^i G^i(t, x, z).

If φ is the minimizing function then there exists a scalar ψ^0 ≥ 0 and a vector-valued function ψ = (ψ^1, ..., ψ^m) defined on [t_0, t_1] such that equations (6.1) and (6.2) hold with f^0 replaced by F. Thus:

(6.3)   (d/dt) F_{z^i}(t, φ(t), φ'(t), ψ^0, ψ(t)) = F_{x^i}(t, φ(t), φ'(t), ψ^0, ψ(t)),   i = 1, ..., n,

and

(6.4)   F(t, φ(t), z, ψ^0, ψ(t)) - Σ_{i=1}^n F*_{z^i}(t) z^i ≥ F(t, φ(t), φ'(t), ψ^0, ψ(t)) - Σ_{i=1}^n F*_{z^i}(t) (φ^i)'(t),

for all z satisfying G^i(t, φ(t), z) = 0, i = 1, ..., m, where

F*_{z^i}(t) = F_{z^i}(t, φ(t), φ'(t), ψ^0, ψ(t)).

With (6.3) there is a set of conditions that the values ψ^0, ψ(t_0) and ψ(t_1) must satisfy in conjunction with the end values (t_0, φ(t_0), t_1, φ(t_1)). These conditions are known as the "transversality conditions." Equations (6.3) are called the Euler-Lagrange equations or the "Lagrange multiplier rule." Inequality (6.4) is called the Weierstrass condition.

The Euler equations and Weierstrass condition for the simple problem are relatively easy to establish. Although many engineering texts give the impression that the same is true for the multiplier rule, this is definitely not the case. The Lagrange multiplier rule was first stated by Lagrange for the problem that bears his name. His proof, however, contained two major errors. The first was corrected by A. Mayer in 1886 and the second by Kneser in 1900 and also by Hilbert in 1905. A satisfactory treatment of the Weierstrass condition for the Bolza problem was also long in coming forth. It was not until 1939 that such a treatment was first given by McShane [22]. Prior to this work the Weierstrass condition was established under conditions that were not entirely satisfactory. In his proof, McShane introduced a completely new idea involving convex sets of variations and the separation theorem for convex sets. We shall return to this point later.

In the preceding section we stated that the control problem of Section 4 is variational in nature. We now show this. If we introduce a new variable y = (y^1, ..., y^m) and let y' = u, then the control problem of Section 4 can be written as follows.

Minimize

g(t_0, φ(t_0), t_1, φ(t_1)) + ∫_{t_0}^{t_1} f^0(s, φ(s), y'(s)) ds

over the functions (φ, y) in an appropriate class that satisfy the system of differential equations

(φ^i)'(t) - f^i(t, φ(t), y'(t)) = 0,   i = 1, ..., n,

the end conditions

(t_0, φ(t_0), t_1, φ(t_1)) ∈ S_0 × S_1,

and inequality constraints

R^i(t, φ(t), y'(t)) ≥ 0,   i = 1, ..., r.

The problem without the inequality constraint is a Bolza problem in the calculus of variations. We have already noted that the early workers in applied problems recognized the variational character of the problems, but were prevented from applying the theory of the Bolza problem by the inequality constraint.

7. The maximum principle. Interest in problems of optimal control in the 1950's was not confined to the United States. These problems were also pursued in the Soviet Union, where they attracted the attention of L. S. Pontryagin and his students V. G. Boltyanskii, R. V. Gamkrelidze and E. F. Mishchenko. They stated the general control problem essentially in the form given in Section 4, and in a series of papers in the late 1950's announced what has come to be known as the Pontryagin maximum principle. This principle is a set of necessary conditions that must hold along an optimal trajectory. The announcement of the maximum principle can properly be regarded as the birth of the mathematical theory of optimal control. Workers in different areas of application saw that their problems had a common mathematical formulation and were provided with a tool for attacking their problems.

We now give an imprecise statement of one form of the maximum principle. Other, more general forms are known, but we shall not consider them here. For a precise statement of the principle we refer the reader to [27], which is an account in English of the work of Pontryagin and his collaborators in the late 1950's, or to any other standard reference. As in [27], we shall suppose that the constraints on the control are of the form

u(t) ∈ Ω,

where Ω is a fixed set in E^m. In the interests of simplicity we shall assume that the problem is in Lagrange form, i.e., g ≡ 0. As in the Bolza problem, there is no loss of generality in this assumption.

The statement of the maximum principle requires the introduction of a function H defined as follows:

H(t, x, z, p^0, p) = p^0 f^0(t, x, z) + Σ_{i=1}^n p^i f^i(t, x, z).

The maximum principle states that if (φ, u) is an optimal pair, then there exist a constant ψ^0 ≤ 0 and a vector-valued function ψ = (ψ^1, ..., ψ^n) defined and absolutely continuous on [t_0, t_1] such that (ψ^0, ψ(t)) ≠ 0 for all t in [t_0, t_1], and such that the following system of equations is satisfied

(7.1)   dφ^i/dt = f^i(t, φ(t), u(t)),   i = 1, ..., n,

(7.2)   dψ^i/dt = -Σ_{j=0}^n ψ^j(t) f^j_{x^i}(t, φ(t), u(t)),   i = 1, ..., n,

and such that

(7.3)   H(t, φ(t), u(t), ψ^0, ψ(t)) ≥ H(t, φ(t), z, ψ^0, ψ(t))

for all z in Ω. If the set of initial values S_0 and the set of terminal values S_1 are C^(1) manifolds and if certain other regularity conditions hold, then we also have that the vector

(7.4)   (H(t_0, φ(t_0), u(t_0), ψ^0, ψ(t_0)), -ψ(t_0))

is orthogonal to S_0 at (t_0, φ(t_0)) and the vector

(7.5)   (-H(t_1, φ(t_1), u(t_1), ψ^0, ψ(t_1)), ψ(t_1))

is orthogonal to S_1 at (t_1, φ(t_1)). The orthogonality conditions are the "transversality conditions."

Equations (7.1) are just the state equations. If we consider the constant ψ^0 and the pair (φ, u) as known in (7.2), then (7.2) is a linear system in ψ. Equations (7.1) and (7.2) when written in vector notation have the following Hamiltonian form

dφ/dt = H_p(t, φ(t), u(t), ψ^0, ψ(t)),
dψ/dt = -H_x(t, φ(t), u(t), ψ^0, ψ(t)).
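As a concrete illustration, the sketch below applies (7.1)-(7.3) to the production planning example of Section 5, with the multiplier ψ^0 normalized to -1 (the normal case), so that H = (1 - u)x + ψ k u x and the maximum condition requires u(t) = 1 where kψ(t) > 1 and u(t) = 0 where kψ(t) < 1. The values of k, c, T and the candidate switch time T - 1/k are assumptions used only for this check.

```python
# Check the maximum principle on the production planning example.
# State (7.1):    x'   = k u x,                 x(0) = c.
# Adjoint (7.2):  psi' = -(1 - u) - k u psi,    psi(T) = 0 (terminal state free).
# Maximum condition (7.3): u = 1 where k*psi > 1, u = 0 where k*psi < 1.
# k, c, T and the candidate switch time are assumed values.
import numpy as np
from scipy.integrate import solve_ivp

k, c, T = 0.5, 1.0, 6.0
t_switch = T - 1.0 / k                   # candidate switch time

def u_of(t):
    return 1.0 if t < t_switch else 0.0

x_sol = solve_ivp(lambda t, x: k * u_of(t) * x, (0.0, T), [c], dense_output=True)
psi_sol = solve_ivp(lambda t, p: -(1.0 - u_of(t)) - k * u_of(t) * p,
                    (T, 0.0), [0.0], dense_output=True)   # integrate backwards

# The switching function k*psi(t) - 1 should be positive where u = 1,
# negative where u = 0, and vanish near t_switch.
for t in [0.0, t_switch / 2, t_switch, (t_switch + T) / 2, T]:
    print(f"t = {t:4.1f}   k*psi - 1 = {k * psi_sol.sol(t)[0] - 1.0:+.3f}   u = {u_of(t)}")
print("capacity at time T:", x_sol.y[0, -1])
```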


A pair (φ, u) that satisfies the maximum principle is called an extremal pair. Theoretically, we can use the maximum principle to determine all extremal pairs (φ, u), and from among these then determine the optimal ones by some other criteria. In problems where the dimension of the state vector is large it is difficult to determine extremal pairs. This is primarily because equations (7.1) and (7.2) and inequality (7.3) when taken in conjunction with the transversality conditions are such that the end values of φ and ψ are not either all terminal or all initial. Rather some values are given at the initial time and others at the terminal time. In problems where the dimension of the state is small, one can either integrate (7.1)-(7.3) forward or backward and keep the unknown initial or terminal values as literal constants which can be adjusted to the data at the other end. For systems of high dimension, where one is trying to determine extremal pairs by numerical methods, this procedure is not feasible. Neither are procedures in which the missing values are guessed and later corrected. We shall not go into these problems. We mention them here merely to point out that the maximum principle has not done away with all difficulties.
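The idea of keeping the unknown end values as literal constants and adjusting them is the shooting method. A minimal sketch on a toy problem chosen only for illustration: minimize (1/2)∫_0^1 u^2 dt subject to x' = u, x(0) = 0, x(1) = 1, with u unconstrained. The maximum principle gives u = ψ with ψ constant, so the only unknown is ψ(0), which is adjusted until the terminal condition is met.

```python
# Shooting on a toy extremal system: guess psi(0), integrate the state and
# adjoint forward, and adjust psi(0) until x(1) = 1.  Here u = psi and
# psi' = 0, so the miss distance is monotone in psi(0); analytically psi(0) = 1.
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

def miss(psi0):
    rhs = lambda t, y: [y[1], 0.0]          # y = (x, psi): x' = u = psi, psi' = 0
    y_end = solve_ivp(rhs, (0.0, 1.0), [0.0, psi0]).y[:, -1]
    return y_end[0] - 1.0                   # how far x(1) is from the target

psi0 = brentq(miss, -10.0, 10.0)            # root-find on the missing initial value
print("psi(0) found by shooting:", psi0)
```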

Despite the limitations just mentioned, the maximum principle is important and useful. It does give information about the structure of the solution. It has yielded the solution in certain classes of problems. In many specific examples in which the dimension of the state is small, the maximum principle has given the complete solution. To appreciate this last statement the reader should compare the worked out examples in [27] with some of the early treatments of these problems in the engineering literature, say as collected in [26].

Pontryagin and his co-workers were aware of the variational character of the control problem and also believed that the presence of the constraints (4.3) made the existing theory of the Bolza problem inapplicable. They therefore undertook to develop necessary conditions for the control problem without reference to existing variational theory. They did, however, make essential use of the ideas introduced by McShane [22] in his proof of the Weierstrass condition. In the proof of the maximum principle convex sets of variations of the controls are introduced and it is shown that as a consequence of optimality, a certain pair of convex sets can be separated. The analytic consequences of this separation constitute the maximum principle. Further details are given in the appendix.

Shortly after the publication of the maximum principle the relationship between the calculus of variations and the maximum principle was clarified. At the end of Section 6 we showed how a control problem with inequality constraints could be transformed into a Bolza problem with inequality constraints involving the derivatives. If one introduces a new variable ξ = (ξ^1, ..., ξ^r) by means of the differential equations

(7.6)   R^i(t, x, y') - ((ξ^i)')^2 = 0,   i = 1, ..., r,

then the inequality constraints are transformed into differential equation constraints and the problem is of standard Bolza type. Under reasonable assumptions on the functions R^i and on the optimal trajectory, the necessary conditions for the variational problem apply, and when these are translated back into the control formulation language, one gets the maximum principle (see [5]). It is also quite easy to obtain the necessary conditions (6.3), (6.4) and the transversality conditions for the variational problem from the maximum principle. See [27], Chap. 5. Under these translations, the Euler-Lagrange equations (6.3) correspond to equations (7.2) of the maximum principle and the Weierstrass condition (6.4) corresponds to the maximum condition (7.3). Also, the transversality conditions of the two problems correspond.

The use of the variable ξ as in (7.6) for Lagrange problems with inequality constraints on the derivatives is due to Valentine [28]. It was used by Hestenes [15] in a minimum time problem.

8. A unified theory of necessary conditions. Primarily in response to problems in economics and operations research, the theory of mathematical programming also began its development in the 1950's. One form of the mathematical programming problem is the following.

A set V is given in euclidean space E^n and mappings

F^0 : V → E^1,
F = (F^1, ..., F^ν) : V → E^ν,
G = (G^1, ..., G^r) : V → E^r

are given. The problem is to minimize F^0(x) over all x in V such that

F^i(x) = 0,   i = 1, ..., ν,
G^i(x) ≥ 0,   i = 1, ..., r.

If the constraints G(x) ≥ 0 are absent, if the set V is open and the functions F^0 and F are differentiable, then at a point x* in V at which the minimum is attained, the classical Lagrange multiplier rule holds. Various extensions of the classical multiplier rule were developed to handle the inequality constraints, and further extensions were developed to accommodate non-differentiable functions F^0, F, and G.

In the early to mid 1960's it was noted that the control problem of Section 4 can be cast as a programming problem in an appropriate function space. One way of so casting the control problem is the following. Let V denote the set of function pairs (φ, u) such that φ and u belong to function spaces appropriate to the problem and such that

(φ^i)'(t) = f^i(t, φ(t), u(t)),   i = 1, ..., n.

Let the mapping F^0 : V → E^1 be defined by

F^0(φ, u) = ∫_{t_0}^{t_1} f^0(t, φ(t), u(t)) dt.

We now suppose that the initial set S_0 is a C^(1) manifold, defined by a system of equations

χ^i(t_0, x_0) = 0,   i = 1, ..., γ,

and that the terminal set S_1 is a C^(1) manifold, defined by a system of equations

χ^i(t_1, x_1) = 0,   i = γ + 1, ..., ν.

We then define

F = (F^1, ..., F^ν) : V → E^ν

by

F^i(φ, u) = χ^i(t_0, φ(t_0)),   i = 1, ..., γ,
F^i(φ, u) = χ^i(t_1, φ(t_1)),   i = γ + 1, ..., ν.

Let

G = (G^1, ..., G^r) : V → E^r

be defined by

G^i(φ, u) = inf{R^i(t, φ(t), u(t)) : t_0 ≤ t ≤ t_1}.

If we write x = (φ, u), then formally the control problem of Section 4 becomes the following. Minimize F^0(x) over x in V subject to F^i(x) = 0, i = 1, ..., ν, and G^i(x) ≥ 0, i = 1, ..., r. The control problem reads exactly as the programming problem, except that x is now an element of a function space. The formulation just given of the control problem is essentially that of Neustadt [24]. Other writers have given slightly different formulations.
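In finite dimensions the same point of view is what direct computational methods exploit: once the control is discretized, the control problem literally is a mathematical programming problem. A minimal sketch on the production planning example of Section 5, with an assumed grid and assumed values of k, c, T, and with a general-purpose optimizer standing in for a purpose-built algorithm:

```python
# Discretize the production planning problem into a nonlinear program in the
# N control values and hand it to a generic bounded optimizer.  Grid size,
# k, c, T and the choice of optimizer are all assumptions of this sketch.
import numpy as np
from scipy.optimize import minimize

k, c, T, N = 0.5, 1.0, 6.0, 120
dt = T / N

def neg_consumption(u):
    """F^0: minus the total consumption under the piecewise-constant control u."""
    x, J = c, 0.0
    for uj in u:
        J += (1.0 - uj) * x * dt
        x += k * uj * x * dt            # Euler step for dx/dt = k u x
    return -J

res = minimize(neg_consumption, x0=np.full(N, 0.5),
               bounds=[(0.0, 1.0)] * N, method="L-BFGS-B")
print("total consumption:", -res.fun)
print("u on the first and last few steps:", res.x[:3], res.x[-3:])
```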


Not only were the similarities in the abstract formulation of control and programming problems noticed, but it was also noticed that the techniques and results concerned with necessary conditions in various optimization problems were similar. These similarities in the techniques, however, were not always on the surface. These observations prompted various writers, notably Neustadt [24], Halkin [14], Hestenes [16], Gamkrelidze and Haratisvili [12], and Dubovitskii and Milyutin to embark on the following program. (See [13] for an account in English of the work of Dubovitskii and Milyutin.)

First, formulate a general optimization or programming problem that would include as special cases all optimization problems of interest, such as ordinary control problems, control problems with bounded states, control problems with lags, control problems with distributed parameters, mathematical programming problems, etc.

Second, develop a meaningful set of necessary conditions for the general problem under reasonable hypotheses. The necessary conditions must be such that one obtains useful necessary conditions for the special problems when appropriate specializations and identifications are made. One of the difficulties in the formulation of the general problem is that the hypotheses must be specific enough to yield necessary conditions with some structure, yet they should be general enough to include all the special problems of interest.

The program outlined above was carried the furthest by Neustadt. An account of his work will be found in his posthumous book [25].

9. Other topics. Although we have written about the time optimal problem and about the maximum principle, these were not the only developments in the mathematical theory of optimal control. An account of other developments is not possible in this paper. We shall, however, list a few developments.

In the course of studying existence questions for control problems, new methods and ideas were introduced which gave refinements, simplifications, and generalizations of classical results in the calculus of variations, as well as the sought-for results in control theory. The field theory of the calculus of variations was significantly generalized by L. C. Young in his book [31] in order to cope with sufficiency problems in the calculus of variations. Generalized curves, introduced into the calculus of variations by L. C. Young in 1937 [30], were rediscovered in different form by investigators in control theory and were named "chattering controls" or "relaxed controls." Here again, new results were obtained as were improvements over the earlier work in this area.

Two other areas of investigation which arose in response to applied problems were the distributed parameter control problems and the stochastic control problems. In the distributed parameter problems the state of the system is governed by a system of partial differential equations and the functional to be minimized is taken over a region in the space of the independent variable. In stochastic control problems the state equations and payoff functional involve randomness.

A very important problem in terms of applications is that of developing computational methods. As we pointed out earlier, the pursuit of computational schemes to determine extremal trajectories and controls (those satisfying the maximum principle) has not been too fruitful. More promising techniques are those that involve direct methods. Although many specific applied problems of interest have been treated numerically with some success, there are no widely applicable algorithms with solid theoretical basis extant.

Although we could proceed with our list of topics, we shall stop at this point and suggest that the reader who wants to learn more consult some of the references [6], [10], [16], [18], [20], [25], [27], [29], [31].

10. Appendix. A proof of the maximum principle. We now sketch a proof of the maximum principle. Some of our statements will not be precise, some will be heuristic, and some will be formal. We hope, however, to convey the underlying ideas of the proof given by Pontryagin and his co-workers and the ideas behind the later generalizations and refinements of this proof.


In this section we shall use vector-matrix notation. Thus, the state equations will be written

dφ/dt = f(t, φ(t), u(t)).

The symbol f_x will denote the matrix of partial derivatives (∂f^i/∂x^j), i = 1, ..., n, j = 1, ..., n, and the symbol f_u will denote the matrix (∂f^i/∂u^j), i = 1, ..., n, j = 1, ..., m. The symbols f^0_x and f^0_u will denote vectors of partial derivatives. If x and y are vectors in E^n, then the symbol ⟨x, y⟩ will be used to denote their inner product. If M is a matrix, then M^T will designate the transpose of M.

In order to make the exposition as simple as possible we assume that the initial point (t_0, x_0) and the terminal point (t_1, x_1) are fixed. As in Section 7, we assume that the problem is to minimize

J(φ, u) = ∫_{t_0}^{t_1} f^0(t, φ(t), u(t)) dt

subject to the state equation (7.1) and to the control constraints u(t) ∈ Ω. Let x = (x^0, x^1, ..., x^n) and let f = (f^0, f^1, ..., f^n). If we introduce a new state coordinate x^0 by means of the differential equation

dx^0/dt = f^0(t, x, u(t)),   x^0(t_0) = 0,

then the problem becomes that of minimizing

(10.1)   J(φ, u) = φ^0(t_1)

subject to the state equations

(10.2)   dφ/dt = f(t, φ(t), u(t)),

the end conditions

φ^0(t_0) = 0,   φ(t_0) = x_0,   φ(t_1) = x_1,

and control constraints u(t) ∈ Ω. Note that in (10.2) the zero-th component of φ does not appear in the right-hand side.

Let (φ, u) be an optimal trajectory-control pair. We perturb u by δu and obtain a trajectory φ + δφ + e, where δφ represents the "first order" terms in the new trajectory and e represents the "higher order" terms. If we substitute u + δu and φ + δφ + e into (10.2) and expand to first order, we get that δφ satisfies

(10.3)   δφ'(t) = f_x(t, φ(t), u(t)) δφ(t) + f_u(t, φ(t), u(t)) δu(t),
         (δφ^0)'(t) = ⟨f^0_x(t, φ(t), u(t)), δφ(t)⟩ + ⟨f^0_u(t, φ(t), u(t)), δu(t)⟩,

with δφ(t_0) = 0 and δφ^0(t_0) = 0. Hence by the variation of parameters formula we get that

(10.4)   δφ(t_1) = V(t_1, t_0) ∫_{t_0}^{t_1} V^{-1}(s, t_0) f_u(s) δu(s) ds,

where f_u(s) = f_u(s, φ(s), u(s)) and where V(·, t_0) is the fundamental matrix of solutions of the system

dz/dt = f_x(t, φ(t), u(t)) z

satisfying V(t_0, t_0) = I. Here I is the n × n identity matrix. From (10.3) and (10.4) we get δφ^0(t_1) by a simple quadrature as follows:

(10.5)   δφ^0(t_1) = ∫_{t_0}^{t_1} [⟨f^0_x(s, φ(s), u(s)), δφ(s)⟩ + ⟨f^0_u(s, φ(s), u(s)), δu(s)⟩] ds.

The perturbations δu are functions in some function space. Therefore we can speak of a convex set of perturbations and of a convex cone of perturbations. Equations (10.4) and (10.5) define a linear map L from the space of perturbations into E^{n+1}. Hence, if we restrict our attention to perturbations in an appropriate convex cone Γ with vertex at the origin in the space of perturbations, the set of corresponding points δφ(t_1) = (δφ^0(t_1), δφ^1(t_1), ..., δφ^n(t_1)) in E^{n+1} will be a convex cone in E^{n+1} with vertex at the origin. Let X denote the translate of this cone by the vector φ(t_1). Points in X are of the form

(10.6)   φ(t_1) + δφ(t_1),

where δφ(t_1) is given by (10.4) and (10.5) for some δu in Γ. Let Y denote the half-line parallel to the x^0 axis with initial point at φ(t_1) and in the direction of the negative x^0 axis. If the points of X were the true end points of the perturbed trajectories rather than the first order approximations thereto, then since φ^0(t_1) is the minimum value of the zero-th coordinate of all trajectories, it would follow that no point of X could also be a point of Y. Thus we would have that the two convex cones X and Y have a common vertex and have no other points in common. It turns out that this is still true, even though the points (10.6) differ from the true end points by higher order terms. The proof of this assertion involves the use of either the implicit function theorem or of a fixed point theorem and will not be discussed here.

Since X and Y are convex cones whose intersection is their common vertex φ(t_1), they can be separated by a hyperplane Π through φ(t_1). We now consider the analytic consequences of this separation.

Let t_0 ≤ τ < t_1 and let δt > 0 be such that τ + δt < t_1. Define a control δu as follows: δu(t) = 0 for t not in [τ, τ + δt] and δu(t) = v - u(t) for t ∈ [τ, τ + δt], where v is a fixed vector in Ω. Thus we perturb the trajectory by replacing u(t) by v over a small interval. We suppose that Γ is defined in such a way that δu ∈ Γ. Let φ̃ denote the trajectory corresponding to u + δu. Then

φ̃(t) = φ(t) + δφ(t) + e(t),

where e(t) represents "higher order terms." Since

φ̃(τ + δt) = φ̃(τ) + ∫_τ^{τ+δt} f(s, φ̃(s), v) ds,

φ(τ + δt) = φ(τ) + ∫_τ^{τ+δt} f(s, φ(s), u(s)) ds,

it follows that for δt small,

(10.7)   δφ(τ + δt) = [f(τ, φ(τ), v) - f(τ, φ(τ), u(τ))] δt,

assuming that f is continuous and that u is continuous at τ. Let ξ denote the right-hand side of (10.7). Since u(t) + δu(t) = u(t) for t ≥ τ + δt, it follows that for t ≥ τ + δt we can consider δφ as the first order approximation to the perturbation of the trajectory φ that is obtained if we solve

dx/dt = f(t, x, u(t))

with initial condition x(τ + δt) = φ(τ + δt) perturbed to φ(τ + δt) + ξ. It is then a consequence of theorems describing the dependence of solutions of differential equations on initial data that δφ is a solution of

(10.8)   dz/dt = f_x(t, φ(t), u(t)) z,   z(τ + δt) = ξ.

Note that ξ and z are vectors in E^{n+1} and that f_{x^0} ≡ 0.

Let n be a normal to the hyperplane Π that separates X and Y and let n point into the half-space containing Y. Then ⟨δφ(t_1), n⟩ ≤ 0. We now construct the hyperplane through φ(τ) with normal vector λ(τ), where λ is the solution of

(10.9)   dw/dt = -f_x^T(t, φ(t), u(t)) w,   w(t_1) = n.

It is readily verified by differentiation and the use of (10.8) and (10.9) that ⟨δφ(t), λ(t)⟩ is constant. Since λ(t_1) = n and since ⟨δφ(t_1), n⟩ ≤ 0, it follows that

⟨δφ(τ + δt), λ(τ + δt)⟩ ≤ 0.

From (10.7) and from the continuity of λ we then get that

⟨[f(τ, φ(τ), v) - f(τ, φ(τ), u(τ))] δt, λ(τ) + o(1)⟩ ≤ 0,

where o(1) is as δt → 0. Since δt > 0, it follows on division by δt that

⟨f(τ, φ(τ), v) - f(τ, φ(τ), u(τ)), λ(τ)⟩ ≤ 0.

This inequality and equation (10.9) with w replaced by λ constitute the maximum principle for the problem in the form studied here.

The assertion that λ^0 is constant follows from (10.9) and from the fact that

f_{x^0} = (∂f^0/∂x^0, ∂f^1/∂x^0, ..., ∂f^n/∂x^0) ≡ 0.

Since n = λ(t_1), since Y is the translate by φ(t_1) of the set of all nonnegative multiples of the E^{n+1} vector (-1, 0, ..., 0), and since ⟨y - φ(t_1), n⟩ ≥ 0 for all y in Y, it follows that

⟨(-1, 0, ..., 0), (λ^0, λ^1(t_1), ..., λ^n(t_1))⟩ ≥ 0,

and so λ^0 ≤ 0.

References

1. R. Aris, The Optimal Design of Chemical Reactors; A Study in Dynamic Programming, Academic Press, New York, 1961.
2. K. J. Arrow, S. Karlin and H. Scarf, Studies in the Mathematical Theory of Inventory and Production, Stanford University Press, Stanford, California, 1958.
3. N. I. Akhiezer, The Calculus of Variations (trans. by A. H. Frink), Blaisdell, Waltham, Mass., 1962.
4. R. Bellman, I. Glicksberg, and O. Gross, On the "bang-bang" control problem, Quart. Appl. Math., 14 (1956) 11-18.
5. L. D. Berkovitz, Variational methods in problems of control and programming, J. Math. Anal. Appl., 3 (1961) 145-169.
6. L. D. Berkovitz, Optimal Control Theory, Springer-Verlag, New York, 1974.
7. G. A. Bliss, Lectures on the Calculus of Variations, The University of Chicago Press, Chicago, 1946.
8. D. Bushaw, Optimal discontinuous forcing terms, Contributions to the Theory of Nonlinear Oscillations IV, Annals of Math. Study 41, S. Lefschetz, ed., Princeton University Press, Princeton, (1958) 29-52.
9. C. A. Desoer, The bang bang servo problem treated by variational techniques, Information and Control, 2 (1959) 333-348.
10. W. Fleming and R. W. Rishel, Deterministic and Stochastic Optimal Control, Springer-Verlag, New York, Heidelberg, Berlin, 1975.
11. R. V. Gamkrelidze, Theory of time-optimal processes for linear systems, Izv. Akad. Nauk SSSR Ser. Mat., 22 (1958) 449-474 (Russian).
12. R. V. Gamkrelidze and G. L. Haratisvili, Extremal problems in linear topological spaces, Izv. Akad. Nauk SSSR, 33 (1969) 781-839.
13. I. V. Girsanov, Lectures on Mathematical Theory of Extremum Problems, Springer-Verlag, Berlin, Heidelberg, New York, 1967.
14. H. Halkin, An abstract framework for the theory of process optimization, Bull. Amer. Math. Soc., 72 (1966) 677-678.
15. M. R. Hestenes, A general problem in the calculus of variations with applications to paths of least time, RAND Corporation RM-100 (1950).
16. M. R. Hestenes, Calculus of Variations and Optimal Control Theory, Wiley, New York-London-Sydney, 1966.
17. J. P. LaSalle, The time optimal control problem, Contributions to the Theory of Nonlinear Oscillations, Vol. 5, Annals of Math. Study No. 45, Princeton University Press, Princeton, (1960) 1-24.
18. E. B. Lee and L. Markus, Foundations of Optimal Control Theory, Wiley, New York-London-Sydney, 1967.
19. G. Leitmann, On a class of variational problems in rocket flight, J. Aero and Space Sciences, 26 (1959) 586-591.
20. J. L. Lions, Optimal Control of Systems Governed by Partial Differential Equations, Springer-Verlag, New York, Heidelberg, Berlin, 1971.
21. D. McDonald, Nonlinear techniques for improving servo performance, National Electronics Conference, 6 (1950) 400-421.
22. E. J. McShane, On multipliers for Lagrange problems, Amer. J. Math., 61 (1939) 809-819.
23. A. Miele, The calculus of variations in applied aerodynamics and flight mechanics, Optimization Techniques with Applications to Aerospace Systems, G. Leitmann, ed., Academic Press, New York-London, 1962.
24. L. W. Neustadt, A general theory of extremals, J. Comput. System Sci., 3 (1969) 57-92.
25. L. W. Neustadt, Optimization: A Theory of Necessary Conditions, Princeton University Press, Princeton, N. J., to appear.
26. R. Oldenburger, ed., Optimal and Self-Optimizing Control, The M.I.T. Press, Cambridge, London, 1966.
27. L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, E. F. Mishchenko, The Mathematical Theory of Optimal Processes (translated by K. N. Trirogoff; L. W. Neustadt, editor), Wiley, New York, 1962.
28. F. A. Valentine, The problem of Lagrange with differential inequalities as added side conditions, Contributions to the Calculus of Variations 1933-37, Department of Mathematics, University of Chicago, University of Chicago Press, Chicago.
29. J. Warga, Optimal Control of Differential and Functional Equations, Academic Press, New York, 1972.
30. L. C. Young, Generalized curves and the existence of an attained absolute minimum in the calculus of variations, Compt. Rend. Soc. Sci. et Lettres Varsovie, Cl. III, 30 (1937) 212-234.
31. L. C. Young, Lectures on the Calculus of Variations and Optimal Control Theory, Saunders, Philadelphia, London, Toronto, 1969.

DEPARTMENT OF MATHEMATICS, PURDUE UNIVERSITY, WEST LAFAYETTE, IN 47907.
