www.oeaw.ac.at www.ricam.oeaw.ac.at A Priori Error Estimates for Space-Time Finite Element Discretization of Parabolic Optimal Control Problems Part II: Problems with Control Constraints D. Meidner, B. Vexler RICAM-Report 2007-12


  • www.oeaw.ac.at

    www.ricam.oeaw.ac.at

    A Priori Error Estimates for Space-Time Finite Element Discretization of Parabolic Optimal Control Problems

    Part II: Problems with Control Constraints

    D. Meidner, B. Vexler

    RICAM-Report 2007-12

  • A PRIORI ERROR ESTIMATES FOR SPACE-TIME FINITE ELEMENT DISCRETIZATION OF

    PARABOLIC OPTIMAL CONTROL PROBLEMS
    PART II: PROBLEMS WITH CONTROL CONSTRAINTS

    DOMINIK MEIDNER† AND BORIS VEXLER‡

    Abstract. This paper is the second part of our work on a priori error analysis for finite element discretizations of parabolic optimal control problems. In the first part [18] problems without control constraints were considered. In this paper we derive a priori error estimates for space-time finite element discretizations of parabolic optimal control problems with pointwise inequality constraints on the control variable. The space discretization of the state variable is done using usual conforming finite elements, whereas the time discretization is based on discontinuous Galerkin methods. For the treatment of the control discretization we discuss different approaches extending techniques known from the elliptic case.

    Key words. optimal control, parabolic equations, error estimates, finite elements, pointwise inequality constraints

    AMS subject classifications.

    1. Introduction. In this paper we develop a priori error analysis for space-time finite element discretizations of parabolic optimization problems. We consider the following linear-quadratic optimal control problem for the state variable $u$ and the control variable $q$ involving pointwise control constraints:

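    \[
    \text{Minimize } J(q,u) = \frac{1}{2}\int_0^T\!\!\int_\Omega \bigl(u(t,x) - u_d(t,x)\bigr)^2 \,dx\,dt + \frac{\alpha}{2}\int_0^T\!\!\int_\Omega q(t,x)^2 \,dx\,dt, \tag{1.1a}
    \]

    subject to

    \[
    \begin{aligned}
    \partial_t u - \Delta u &= f + q && \text{in } (0,T)\times\Omega,\\
    u(0) &= u_0 && \text{in } \Omega,
    \end{aligned} \tag{1.1b}
    \]

    and

    \[
    q_a \le q(t,x) \le q_b \quad \text{a.e. in } (0,T)\times\Omega, \tag{1.1c}
    \]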

    combined with either homogeneous Dirichlet or homogeneous Neumann boundary conditions on $(0,T)\times\partial\Omega$. A precise formulation of this problem including a functional analytic setting is given in the next section.

    Although the a priori error analysis for finite element discretization of optimal control problems governed by elliptic equations is discussed in many publications, see, e.g., [8, 10, 1, 11, 19, 5], there are only few published results on this topic for parabolic problems, see [17, 26, 14, 16, 21].

    In the first part of our work on a priori error analysis of parabolic optimal control problems [18], we developed a priori error estimates for problems without control constraints. The consideration of control constraints (1.1c) leads to many additional difficulties. In the absence of inequality constraints the regularity of the optimal solution $(\bar q, \bar u)$ of (1.1a)-(1.1b) is restricted only by the regularity of the domain $\Omega$, by the regularity of the data $f$, $u_0$, $\hat u$, and possibly by some compatibility conditions. Therefore, in this case it is reasonable to assume high regularity of $(\bar q, \bar u)$ leading to a corresponding order of convergence of the finite element discretization, see the discussion in [18].

    †Institut für Angewandte Mathematik, Ruprecht-Karls-Universität Heidelberg, INF 294, 69120 Heidelberg, Germany; phone: +49 (0)6221 / 54-5714; fax: +49 (0)6221 / 54-5634; email: [email protected]

    ‡Johann Radon Institute for Computational and Applied Mathematics (RICAM), Austrian Academy of Sciences, Altenberger Straße 69, 4040 Linz, Austria; phone: +43 (0)732 2468 5219; fax: +43 (0)732 2468 5212; email: [email protected]

    The presence of control constraints (1.1c) leads to a stronger restriction of the regularity of the optimal solution, which is often reflected in a reduction of the order of convergence of the finite element discretization. For a discussion of the regularity of solutions to parabolic optimal control problems with control constraints we refer, e.g., to [13].

    In order to describe the claims and challenges of a priori error analysis for finite element discretization of (1.1), we first recall some corresponding results in the elliptic case. Using a finite element discretization with discretization parameter $h$, one can define a discretized optimal control problem with the discrete solution $(\bar q_h, \bar u_h)$.

    Many authors made the effort to analyze the behavior of $\|\bar q - \bar q_h\|_{L^2(\Omega)}$ with respect to $h$. In the first papers concerning the approximation of elliptic optimal control problems, see [8, 10], the convergence order $O(h)$ was established using a cellwise constant discretization of the control variable; see also [6, 1, 5]. For finite element discretization of the control variable by (bi-/tri-)linear $H^1$-conforming elements, the convergence order $O(h^{3/2})$ can be shown; see, e.g., [4, 22, 2]. Recently, two approaches achieving $O(h^2)$-convergence for the error in the control variable have been established; see [11, 19]. In [11] a variational approach is proposed, where no explicit discretization of the control variable is used. The discrete control variable is obtained by the projection of the discretized adjoint state on the set of admissible controls. In [19] a cellwise constant discretization is utilized, and a post-processing step is used to obtain the desired accuracy. The latter technique is extended to optimal control of the Stokes equations in [23].

    For the discretization of parabolic problems such as (1.1), the state variable has to be discretized with respect to space and time, leading to two discretization parameters $h$, $k$; see Section 3 for a detailed description. The solution of the discretized optimal control problem is denoted by $(\bar q_\sigma, \bar u_\sigma)$, where $\sigma = (k, h, d)$ is a general discretization parameter and $d$ denotes an abstract discretization parameter for the control space; cf. [18].

    The main purpose of this paper is to analyze the behavior of $\|\bar q - \bar q_\sigma\|_{L^2(0,T;L^2(\Omega))}$ with respect to all involved discretization parameters. Our aim is to discuss the following four approaches for the discretization of the control variable, which extend some techniques known from the elliptic case:

    1. Discretization using cellwise constant ansatz functions with respect to space and time. In this case we obtain similar to [14, 16] the order of convergence $O(h + k)$. The result is obtained under weaker regularity assumptions than in [14, 16]. Moreover, we separate the influences of the spatial and temporal regularity on the discretization error, see Corollary 5.3.

    2. Discretization using cellwise (bi-/tri-)linear, $H^1$-conforming finite elements in space, and piecewise constant functions in time: For this type of discretization we obtain the improved order of convergence $O(k + h^{\frac{3}{2}-\frac{1}{p}})$, see Corollary 5.7. Here, $p$ depends on the regularity of the adjoint solution. In two space dimensions we show the assertion for any $p < \infty$, whereas in three space dimensions the result is proven for $p \le 6$. Under an additional regularity assumption, one can choose $p = \infty$ leading to $O(k + h^{3/2})$. Again the influences of spatial and temporal regularity as well as of spatial and temporal discretization are clearly separated.

    3. The discretization following the variational approach from [11], where no explicit discretization of the control variable is used: In this case we obtain an optimal result $O(k + h^2)$, see Corollary 5.9. The usage of this approach requires a non-standard implementation and more involved stopping criteria for optimization algorithms, since the control variable does not lie in any finite element space associated with the given mesh. However, there are no additional difficulties caused by the time discretization.

    4. The post-processing strategy extending the technique from [19] to parabolic problems: In this case we use the cellwise constant ansatz functions with respect to space and time. For the discrete solution $(\bar q_\sigma, \bar u_\sigma)$, a post-processing step based on a projection formula is proposed leading to an approximation $\tilde q_\sigma$ with order of convergence $\|\bar q - \tilde q_\sigma\|_{L^2(0,T;L^2(\Omega))} = O(k + h^{2-\frac{1}{p}})$, see Corollary 5.15. Here, $p$ can be chosen as discussed for the cellwise linear discretization. Under an additional regularity assumption, one can also choose $p = \infty$ leading to $O(k + h^2)$.

    The paper is organized as follows: In the next section, we present a functional analytic setting for the optimal control problem (1.1), discuss optimality conditions and the regularity of optimal solutions. Section 3 is devoted to the discretization of the considered optimal control problem. Therein, we address the temporal and spatial discretization of the state equation by Galerkin finite element methods. Moreover, we give a detailed presentation of the four possibilities for discretizing the control variable introduced before. In Section 4 we provide basic results on stability and approximation quality proved in the first part of this article [18]. In Section 5 we develop our main results on a priori error analysis for the four mentioned types of control discretizations. Finally, we illustrate our theoretical results by numerical experiments.

    2. Optimization. In this section we briefly discuss the precise formulation of the optimization problem under consideration. Furthermore, we recall theoretical results on existence, uniqueness, and regularity of optimal solutions as well as optimality conditions.

    To set up a weak formulation of the state equation (1.1b), we introduce the following notation: For a convex polygonal domain $\Omega \subset \mathbb{R}^n$, $n = 2, 3$, we denote by $V$ either $H^1(\Omega)$ or $H_0^1(\Omega)$ depending on the prescribed type of boundary conditions (homogeneous Neumann or homogeneous Dirichlet). Together with $H = L^2(\Omega)$, the Hilbert space $V$ and its dual $V^*$ build a Gelfand triple $V \hookrightarrow H \hookrightarrow V^*$. Here and in what follows, we employ the usual notation for Lebesgue and Sobolev spaces.

    For a time interval $I = (0, T)$ we introduce the state space

    \[ X := \bigl\{ v \bigm| v \in L^2(I, V) \text{ and } \partial_t v \in L^2(I, V^*) \bigr\} \]

    and the control space

    \[ Q = L^2(I, L^2(\Omega)). \]

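    In addition, we use the following notations for the inner products and norms on $L^2(\Omega)$ and $L^2(I, L^2(\Omega))$:

    \[
    \begin{aligned}
    (v, w) &:= (v, w)_{L^2(\Omega)}, & (v, w)_I &:= (v, w)_{L^2(I, L^2(\Omega))},\\
    \|v\| &:= \|v\|_{L^2(\Omega)}, & \|v\|_I &:= \|v\|_{L^2(I, L^2(\Omega))}.
    \end{aligned}
    \]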


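    In this setting, a standard weak formulation of the state equation (1.1b) for given control $q \in Q$, $f \in L^2(I, H)$, and $u_0 \in V$ reads: Find a state $u \in X$ satisfying

    \[
    \begin{aligned}
    (\partial_t u, \varphi)_I + (\nabla u, \nabla \varphi)_I &= (f + q, \varphi)_I \quad \forall \varphi \in X,\\
    u(0) &= u_0.
    \end{aligned} \tag{2.1}
    \]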

    As in [18], we use the following result on existence and regularity:

    Proposition 2.1. For fixed control $q \in Q$, $f \in L^2(I, H)$, and $u_0 \in V$ there exists a unique solution $u \in X$ of problem (2.1). Moreover the solution exhibits the improved regularity

    \[ u \in L^2(I, H^2(\Omega)) \cap H^1(I, L^2(\Omega)) \hookrightarrow C(\bar I, V). \]

    Furthermore, the stability estimate

    \[ \|\partial_t u\|_I + \|\nabla^2 u\|_I \le C \bigl\{ \|f + q\|_I + \|\nabla u_0\| \bigr\} \]

    holds.

    To formulate the optimal control problem we introduce the admissible set $Q_{ad}$ collecting the inequality constraints (1.1c) as

    \[ Q_{ad} := \{ q \in Q \mid q_a \le q(t, x) \le q_b \text{ a.e. in } I \times \Omega \}, \]

    where the bounds $q_a, q_b \in \mathbb{R}$ fulfill $q_a < q_b$.

    The weak formulation of the optimal control problem (1.1) is given as

    \[ \text{Minimize } J(q, u) := \frac{1}{2}\|u - \hat u\|_I^2 + \frac{\alpha}{2}\|q\|_I^2 \quad \text{subject to (2.1) and } (q, u) \in Q_{ad} \times X, \tag{2.2} \]

    where $\hat u \in L^2(I, H)$ is a given desired state and $\alpha > 0$ is the regularization parameter.

    Proposition 2.2. For given $f, \hat u \in L^2(I, H)$, $u_0 \in V$, and $\alpha > 0$ the optimal control problem (2.2) admits a unique solution $(\bar q, \bar u) \in Q_{ad} \times X$.

    For the standard proof we refer, e.g., to [15].

    The existence result for the state equation in Proposition 2.1 ensures the existence of a control-to-state mapping $q \mapsto u = u(q)$ defined through (2.1). By means of this mapping we introduce the reduced cost functional $j : Q \to \mathbb{R}$:

    \[ j(q) := J(q, u(q)). \]

    The optimal control problem (2.2) can then be equivalently reformulated as

    \[ \text{Minimize } j(q) \quad \text{subject to } q \in Q_{ad}. \tag{2.3} \]

    The first order necessary optimality condition for (2.3) reads as

    \[ j'(\bar q)(\delta q - \bar q) \ge 0 \quad \forall \delta q \in Q_{ad}. \tag{2.4} \]

    Due to the linear-quadratic structure of the optimal control problem this condition is also sufficient for optimality.

    Utilizing the adjoint state equation for $z = z(q) \in X$ given by

    \[
    \begin{aligned}
    -(\varphi, \partial_t z)_I + (\nabla \varphi, \nabla z)_I &= (\varphi, u(q) - \hat u)_I \quad \forall \varphi \in X,\\
    z(T) &= 0,
    \end{aligned} \tag{2.5}
    \]

    the first derivative of the reduced cost functional can be expressed as

    \[ j'(q)(\delta q) = (\alpha q + z(q), \delta q)_I. \tag{2.6} \]

    The second derivative $j''(q)(\cdot, \cdot)$ is independent of $q$ and positive definite, i.e.

    \[ j''(q)(p, p) \ge \alpha \|p\|_I^2 \quad \forall p \in Q. \tag{2.7} \]

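    Using a pointwise projection on the admissible set $Q_{ad}$,

    \[ P_{Q_{ad}} : Q \to Q_{ad}, \qquad P_{Q_{ad}}(r)(t, x) = \max\bigl(q_a, \min(q_b, r(t, x))\bigr), \tag{2.8} \]

    the optimality condition (2.4) can be expressed as

    \[ \bar q = P_{Q_{ad}}\Bigl(-\frac{1}{\alpha} z(\bar q)\Bigr). \tag{2.9} \]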
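    The projection formula (2.9) acts pointwise, so on sampled values it reduces to a clamp. The following is a minimal sketch, assuming the adjoint state is available as an array of point values; the function names and the sample numbers are illustrative and not part of the original text.

```python
import numpy as np

def project_admissible(r, q_a, q_b):
    """Pointwise projection (2.8): max(q_a, min(q_b, r))."""
    return np.maximum(q_a, np.minimum(q_b, r))

def control_from_adjoint(z, alpha, q_a, q_b):
    """Projection formula (2.9) applied to sampled adjoint values z."""
    return project_admissible(-np.asarray(z, dtype=float) / alpha, q_a, q_b)

# Illustrative sample values of the adjoint state (assumed, not from the paper).
z_samples = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
q_samples = control_from_adjoint(z_samples, alpha=1.0, q_a=-1.0, q_b=1.0)
print(q_samples)
```

    Where $-z/\alpha$ leaves the interval $[q_a, q_b]$ the control sits on a bound (active set); elsewhere it equals $-z/\alpha$.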

    Employing this formulation of the optimality condition we obtain the following regularity result:

    Proposition 2.3. Let $(\bar q, \bar u)$ be the solution of the optimization problem (2.2) and $\bar z = z(\bar q)$ be the corresponding adjoint state. Then there holds:

    \[
    \begin{aligned}
    \bar u, \bar z &\in L^2(I, H^2(\Omega)) \cap H^1(I, L^2(\Omega)),\\
    \bar q &\in L^2(I, W^{1,p}(\Omega)) \cap H^1(I, L^2(\Omega)) \cap L^\infty(I \times \Omega)
    \end{aligned}
    \]

    for any p


    Here, $P_r(I_m, V)$ denotes the space of polynomials up to order $r$ defined on $I_m$ with values in $V$. On $X_k^r$ we use the notations

    \[ (v, w)_{I_m} := (v, w)_{L^2(I_m, L^2(\Omega))} \quad \text{and} \quad \|v\|_{I_m} := \|v\|_{L^2(I_m, L^2(\Omega))}. \]

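    To define the discontinuous Galerkin approximation (dG(r)) using the space $X_k^r$, we employ the following definition for functions $v_k \in X_k^r$:

    \[ v_{k,m}^+ := \lim_{t \to 0^+} v_k(t_m + t), \qquad v_{k,m}^- := \lim_{t \to 0^+} v_k(t_m - t) = v_k(t_m), \qquad [v_k]_m := v_{k,m}^+ - v_{k,m}^- \]

    and define the bilinear form $B(\cdot, \cdot)$ by

    \[ B(u_k, \varphi) := \sum_{m=1}^{M} (\partial_t u_k, \varphi)_{I_m} + (\nabla u_k, \nabla \varphi)_I + \sum_{m=2}^{M} ([u_k]_{m-1}, \varphi_{m-1}^+) + (u_{k,0}^+, \varphi_0^+). \tag{3.2} \]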

    Then, the dG(r) semi-discretization of the state equation (2.1) for given control $q \in Q$ reads: Find a state $u_k = u_k(q) \in X_k^r$ such that

    \[ B(u_k, \varphi) = (f + q, \varphi)_I + (u_0, \varphi_0^+) \quad \forall \varphi \in X_k^r. \tag{3.3} \]

    The existence and uniqueness of solutions to (3.3) can be shown by using Fourier analysis, see [25] for details.

    Remark 3.1. Using a density argument, it is possible to show that the exact solution $u = u(q) \in X$ satisfies the identity

    \[ B(u, \varphi) = (f + q, \varphi)_I + (u_0, \varphi_0^+) \quad \forall \varphi \in X_k^r. \]

    Thus, we have here the property of Galerkin orthogonality

    \[ B(u - u_k, \varphi) = 0 \quad \forall \varphi \in X_k^r, \]

    although the dG(r) semidiscretization is a nonconforming Galerkin method ($X_k^r \not\subset X$).

    Throughout the paper we restrict ourselves to the case $r = 0$. This is reasonable due to the limited regularity caused by pointwise control constraints. The resulting dG(0) scheme is a variant of the implicit Euler method. In this case the semidiscrete state equation (3.3) can be explicitly rewritten as the following time-stepping scheme, using the fact that $u_k$ is piecewise constant in time. We use the notation $U_m = u_k|_{I_m} \in V$ and obtain:

    \[
    \begin{aligned}
    (U_1, \psi) + k_1 (\nabla U_1, \nabla \psi) &= (f + q, \psi)_{I_1} + (u_0, \psi) && \forall \psi \in V,\\
    (U_m, \psi) + k_m (\nabla U_m, \nabla \psi) &= (f + q, \psi)_{I_m} + (U_{m-1}, \psi) && \forall \psi \in V, \quad m = 2, 3, \ldots, M.
    \end{aligned}
    \]
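    The time-stepping form of the dG(0) scheme can be sketched in a few lines once a spatial discretization is fixed. The example below is an illustration under assumptions that are not part of the original text: one space dimension, $\Omega = (0, 1)$, homogeneous Dirichlet conditions, linear elements on a uniform mesh, and a right-hand side $f + q$ that is piecewise constant in time and given by nodal values. All function names are hypothetical.

```python
import numpy as np

def p1_matrices(n_cells):
    """Mass and stiffness matrices for linear elements on a uniform mesh
    of (0, 1), restricted to interior nodes (homogeneous Dirichlet)."""
    h = 1.0 / n_cells
    n = n_cells - 1  # number of interior nodes
    main = np.ones(n)
    off = np.ones(n - 1)
    mass = h / 6.0 * (4.0 * np.diag(main) + np.diag(off, 1) + np.diag(off, -1))
    stiff = 1.0 / h * (2.0 * np.diag(main) - np.diag(off, 1) - np.diag(off, -1))
    return mass, stiff

def dg0_time_stepping(u0, rhs, step_sizes, mass, stiff):
    """Solve (U_m, psi) + k_m (grad U_m, grad psi)
    = (f + q, psi)_{I_m} + (U_{m-1}, psi) for m = 1, ..., M.
    rhs[m - 1] holds nodal values of f + q on I_m (constant in time there)."""
    states = []
    u_prev = u0
    for k_m, g_m in zip(step_sizes, rhs):
        system = mass + k_m * stiff
        load = k_m * (mass @ g_m) + mass @ u_prev
        u_prev = np.linalg.solve(system, load)
        states.append(u_prev)
    return np.array(states)

# Usage: decay of the first sine mode without forcing (assumed test setup).
n_cells, n_steps, T = 16, 20, 0.1
mass, stiff = p1_matrices(n_cells)
x = np.linspace(0.0, 1.0, n_cells + 1)[1:-1]
u0 = np.sin(np.pi * x)
step_sizes = np.full(n_steps, T / n_steps)
rhs = np.zeros((n_steps, x.size))
states = dg0_time_stepping(u0, rhs, step_sizes, mass, stiff)
print(states[-1].max(), np.exp(-np.pi**2 * T))
```

    For this setup the exact solution is $e^{-\pi^2 t}\sin(\pi x)$, so the printed pair gives a quick plausibility check of the implicit Euler character of the scheme.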

    The semi-discrete optimization problem for the dG(0) time discretization has the form:

    \[ \text{Minimize } J(q_k, u_k) \quad \text{subject to (3.3) and } (q_k, u_k) \in Q_{ad} \times X_k^0. \tag{3.4} \]

    As in [18] the following result holds:

    Proposition 3.1. For $\alpha > 0$, the semidiscrete optimal control problem (3.4) admits a unique solution $(\bar q_k, \bar u_k) \in Q_{ad} \times X_k^0$.

    Note that the optimal control $\bar q_k$ is searched for in the subset $Q_{ad}$ of the continuous space $Q$; the subscript $k$ indicates the usage of the semidiscretized state equation.

    Similar to the continuous case, we introduce the semidiscrete reduced cost functional $j_k : Q \to \mathbb{R}$:

    \[ j_k(q) := J(q, u_k(q)) \]

    and reformulate the semidiscrete optimal control problem (3.4) as

    \[ \text{Minimize } j_k(q_k) \quad \text{subject to } q_k \in Q_{ad}. \]

    The first order necessary optimality condition reads as

    \[ j_k'(\bar q_k)(\delta q - \bar q_k) \ge 0 \quad \forall \delta q \in Q_{ad}, \tag{3.5} \]

    and the derivative of $j_k$ can be expressed as

    \[ j_k'(q)(\delta q) = (\alpha q + z_k(q), \delta q)_I. \tag{3.6} \]

    Here, $z_k = z_k(q) \in X_k^0$ denotes the solution of the semidiscrete adjoint equation

    \[ B(\varphi, z_k) = (\varphi, u_k(q) - \hat u)_I \quad \forall \varphi \in X_k^0. \tag{3.7} \]

    As on the continuous level the second derivative $j_k''(q)$ is independent of $q$ and positive definite, i.e.

    \[ j_k''(q)(p, p) \ge \alpha \|p\|_I^2 \quad \forall p \in Q. \tag{3.8} \]

    Similar to (2.9) the optimality condition (3.5) can be rewritten as

    \[ \bar q_k = P_{Q_{ad}}\Bigl(-\frac{1}{\alpha} z_k(\bar q_k)\Bigr). \tag{3.9} \]

    This projection formula implies particularly that the optimal solution q̄k is piecewiseconstant in time. We will make use of this fact in Section 5.

    3.2. Discretization in space. To define the finite element discretization in space, we consider two or three dimensional shape-regular meshes, see, e.g., [7]. A mesh consists of quadrilateral or hexahedral cells $K$, which constitute a non-overlapping cover of the computational domain $\Omega$. The corresponding mesh is denoted by $\mathcal{T}_h = \{K\}$, where we define the discretization parameter $h$ as a cellwise constant function by setting $h|_K = h_K$ with the diameter $h_K$ of the cell $K$. We use the symbol $h$ also for the maximal cell size, i.e., $h = \max h_K$.

    On the mesh $\mathcal{T}_h$ we construct a conforming finite element space $V_h \subset V$ in a standard way:

    \[ V_h^s = \bigl\{ v \in V \bigm| v|_K \in Q^s(K) \text{ for } K \in \mathcal{T}_h \bigr\}. \]

    Here, $Q^s(K)$ consists of shape functions obtained via (bi-/tri-)linear transformations of polynomials in $\hat Q^s(\hat K)$ defined on the reference cell $\hat K = (0, 1)^n$, cf. [18].

    To obtain the fully discretized versions of the time discretized state equation (3.3), we utilize the space-time finite element space

    \[ X_{k,h}^{r,s} = \bigl\{ v_{kh} \in L^2(I, V_h^s) \bigm| v_{kh}|_{I_m} \in P_r(I_m, V_h^s) \bigr\} \subset X_k^r. \]


    Remark 3.2. Here, the spatial mesh and therefore also the space $V_h^s$ is fixed for all time intervals. We refer to [24] for a discussion of the treatment of different meshes $\mathcal{T}_h^m$ for each of the subintervals $I_m$.

    The so-called cG(s)dG(r) discretization of the state equation for given control $q \in Q$ has the form: Find a state $u_{kh} = u_{kh}(q) \in X_{k,h}^{r,s}$ such that

    \[ B(u_{kh}, \varphi) = (f + q, \varphi)_I + (u_0, \varphi_0^+) \quad \forall \varphi \in X_{k,h}^{r,s}. \tag{3.10} \]

    Throughout this paper we will restrict ourselves to the consideration of (bi-/tri-)linear elements, i.e., we set $s = 1$ and consider the cG(1)dG(0) scheme.

    Then, the corresponding optimal control problem is given as

    \[ \text{Minimize } J(q_{kh}, u_{kh}) \quad \text{subject to (3.10) and } (q_{kh}, u_{kh}) \in Q_{ad} \times X_{k,h}^{0,1}, \tag{3.11} \]

    and by means of the discrete reduced cost functional $j_{kh} : Q \to \mathbb{R}$

    \[ j_{kh}(q) := J(q, u_{kh}(q)), \]

    it can be reformulated as

    \[ \text{Minimize } j_{kh}(q_{kh}) \quad \text{subject to } q_{kh} \in Q_{ad}. \]

    The uniquely determined optimal solution of (3.11) is denoted by $(\bar q_{kh}, \bar u_{kh}) \in Q_{ad} \times X_{k,h}^{0,1}$.

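    The optimal control $\bar q_{kh} \in Q_{ad}$ fulfills the first order optimality condition

    \[ j_{kh}'(\bar q_{kh})(\delta q - \bar q_{kh}) \ge 0 \quad \forall \delta q \in Q_{ad}, \tag{3.12} \]

    where $j_{kh}'(q)(\delta q)$ is given by

    \[ j_{kh}'(q)(\delta q) = (\alpha q + z_{kh}(q), \delta q)_I \tag{3.13} \]

    with the discrete adjoint solution $z_{kh} = z_{kh}(q) \in X_{k,h}^{0,1}$ of

    \[ B(\varphi, z_{kh}) = (\varphi, u_{kh}(q) - \hat u)_I \quad \forall \varphi \in X_{k,h}^{0,1}. \tag{3.14} \]

    For the second derivative of $j_{kh}$ we have as before:

    \[ j_{kh}''(q)(p, p) \ge \alpha \|p\|_I^2 \quad \forall p \in Q. \tag{3.15} \]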

    3.3. Discretization of the controls. In this subsection, we describe four different approaches for the discretization of the control variable. Choosing a subspace $Q_d \subset Q$ we introduce the corresponding admissible set

    \[ Q_{d,ad} = Q_d \cap Q_{ad}. \]

    Note that in what follows the space $Q_d$ will be either finite dimensional or the whole space $Q$. The optimal control problem on this level of discretization is given as

    \[ \text{Minimize } J(q_\sigma, u_\sigma) \quad \text{subject to (3.10) and } (q_\sigma, u_\sigma) \in Q_{d,ad} \times X_{k,h}^{0,1}. \tag{3.16} \]

    The unique optimal solution of (3.16) is denoted by $(\bar q_\sigma, \bar u_\sigma) \in Q_{d,ad} \times X_{k,h}^{0,1}$, where the subscript $\sigma$ collects the discretization parameters $k$, $h$, and $d$. The optimality condition is given using the discrete reduced cost functional $j_{kh}$ introduced before:

    \[ j_{kh}'(\bar q_\sigma)(\delta q - \bar q_\sigma) \ge 0 \quad \forall \delta q \in Q_{d,ad}. \tag{3.17} \]


    3.3.1. Cellwise constant discretization. The first possibility for the control discretization is to use cellwise constant functions. Employing the same time partitioning and the same spatial mesh as for the discretization of the state variable we set

    \[ Q_d = \bigl\{ q \in Q \bigm| q|_{I_m \times K} \in P_0(I_m \times K),\ m = 1, 2, \ldots, M,\ K \in \mathcal{T}_h \bigr\}. \]

    The discretization error for this type of discretization will be analyzed in Section 5.1.

    3.3.2. Cellwise linear discretization. Another possibility for the discretization of the control variable is to choose the control discretization as for the state variable, i.e. piecewise constant in time and cellwise (bi-/tri-)linear in space. Using the spatial finite element space

    \[ Q_h = \bigl\{ v \in C(\bar\Omega) \bigm| v|_K \in Q^1(K) \text{ for } K \in \mathcal{T}_h \bigr\} \]

    we set

    \[ Q_d = \bigl\{ q \in Q \bigm| q|_{I_m} \in P_0(I_m, Q_h) \bigr\}. \]

    The state space $X_{k,h}^{0,1}$ coincides with the control space $Q_d$ in case of homogeneous Neumann boundary conditions and is a subspace of it, i.e., $Q_d \supset X_{k,h}^{0,1}$, in the presence of homogeneous Dirichlet boundary conditions.

    The discretization error for this type of discretization will be analyzed in Section 5.2.

    3.3.3. Variational approach. Extending the discretization approach presented in [11], we can choose $Q_d = Q$. In this case the optimization problems (3.11) and (3.16) coincide and therefore, $\bar q_\sigma = \bar q_{kh} \in Q_{ad}$.

    We use the fact that the optimality condition (3.12) can be rewritten employing the projection (2.8) as

    \[ \bar q_{kh} = P_{Q_{ad}}\Bigl(-\frac{1}{\alpha} z_{kh}(\bar q_{kh})\Bigr), \]

    and obtain that $\bar q_{kh}$ is a piecewise constant function in time. However, $\bar q_{kh}$ is in general not a finite element function corresponding to the spatial mesh $\mathcal{T}_h$. This fact requires more care in the construction of algorithms for the computation of $\bar q_{kh}$, see [11] for details.

    The discretization error for this type of discretization will be analyzed in Section 5.3.

    3.3.4. Post-processing strategy. The strategy described in this section extends the approach from [19] to parabolic problems. For the discretization of the control space we employ the same choice as in Section 3.3.1, i.e. cellwise constant discretization. After the computation of the corresponding solution $\bar q_\sigma$, a better approximation $\tilde q_\sigma$ is constructed by a post-processing step making use of the projection operator (2.8):

    \[ \tilde q_\sigma = P_{Q_{ad}}\Bigl(-\frac{1}{\alpha} z_{kh}(\bar q_\sigma)\Bigr). \tag{3.18} \]

    Note that, similar to the solution obtained by the variational approach in Section 3.3.3, the solution $\tilde q_\sigma$ is piecewise constant in time and is in general not a finite element function in space with respect to the spatial mesh $\mathcal{T}_h$. This solution can be simply evaluated pointwise; however, the corresponding error analysis requires an additional assumption on the structure of active sets, see the discussion in Section 5.4.
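    The pointwise evaluation of the post-processed control can be illustrated in one space dimension on a single time interval. The sketch below assumes a piecewise linear discrete adjoint state given by nodal values; the mesh, the nodal values, and the function name are made up for illustration. It shows that the kinks of the projected function need not coincide with mesh nodes, which is why the result is in general not a finite element function on the mesh.

```python
import numpy as np

def postprocess_control(nodes, z_nodal, alpha, q_a, q_b, x_eval):
    """Evaluate P_Qad(-z_kh / alpha) at the points x_eval, where z_kh is the
    piecewise linear function with values z_nodal at the mesh nodes."""
    z_at_x = np.interp(x_eval, nodes, z_nodal)
    return np.clip(-z_at_x / alpha, q_a, q_b)

# Hypothetical data: two cells on (0, 1), adjoint values at the three nodes.
nodes = np.array([0.0, 0.5, 1.0])
z_nodal = np.array([-2.0, 0.0, 2.0])
x_eval = np.array([0.0, 0.125, 0.375, 0.5, 0.625, 0.875, 1.0])
q_tilde = postprocess_control(nodes, z_nodal, alpha=1.0, q_a=-1.0, q_b=1.0, x_eval=x_eval)
print(q_tilde)
```

    In this example $-z_{kh}/\alpha = 2 - 4x$, so the bounds become active at $x = 0.25$ and $x = 0.75$, both in the interior of a cell.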


    4. Auxiliary Results. In this section we recall some results provided in the first part of this article [18], which will be used in the sequel.

    The first proposition provides a stability result for the purely time discretized state and adjoint solutions. It follows from an estimate in [18] and elliptic regularity.

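    Proposition 4.1. For $q \in Q$, let the solutions $u_k(q) \in X_k^0$ and $z_k(q) \in X_k^0$ be given by the semidiscrete state equation (3.3) and adjoint equation (3.7), respectively. Then it holds

    \[
    \begin{aligned}
    \|\nabla^2 u_k(q)\|_I + \|\nabla u_k(q)\|_I + \|u_k(q)\|_I &\le C \bigl\{ \|f + q\|_I + \|\nabla u_0\| + \|u_0\| \bigr\},\\
    \|\nabla^2 z_k(q)\|_I + \|\nabla z_k(q)\|_I + \|z_k(q)\|_I &\le C \|u_k(q) - \hat u\|_I.
    \end{aligned}
    \]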

    A similar result holds for the fully discretized solutions of the state and adjoint equations:

    Proposition 4.2. For $q \in Q$, let the solutions $u_{kh}(q) \in X_{k,h}^{0,1}$ and $z_{kh}(q) \in X_{k,h}^{0,1}$ be given by the discrete state equation (3.10) and adjoint equation (3.14), respectively. Then it holds

    \[
    \begin{aligned}
    \|\nabla u_{kh}(q)\|_I + \|u_{kh}(q)\|_I &\le C \bigl\{ \|f + q\|_I + \|\nabla \Pi_h u_0\| + \|\Pi_h u_0\| \bigr\},\\
    \|\nabla z_{kh}(q)\|_I + \|z_{kh}(q)\|_I &\le C \|u_{kh}(q) - \hat u\|_I,
    \end{aligned}
    \]

    where $\Pi_h : V \to V_h$ denotes the spatial $L^2$-projection.

    In the following two propositions, we recall a priori estimates for the errors due to temporal and spatial discretizations of the state and adjoint variables:

    Proposition 4.3. For $q \in Q$, let the solutions $u(q) \in X$ and $z(q) \in X$ be given by the state equation (2.1) and adjoint equation (2.5), respectively. Moreover, let $u_k(q) \in X_k^0$ and $z_k(q) \in X_k^0$ be determined as solutions of the semidiscrete state equation (3.3) and adjoint equation (3.7). Then the following error estimates hold:

    \[
    \begin{aligned}
    \|u(q) - u_k(q)\|_I &\le C k \|\partial_t u(q)\|_I,\\
    \|z(q) - z_k(q)\|_I &\le C k \bigl\{ \|\partial_t u(q)\|_I + \|\partial_t z(q)\|_I \bigr\}.
    \end{aligned}
    \]

    Proposition 4.4. For $q \in Q$, let the solutions $u_k(q) \in X_k^0$ and $z_k(q) \in X_k^0$ be given by the semidiscrete state equation (3.3) and adjoint equation (3.7), respectively. Moreover, let $u_{kh}(q) \in X_{k,h}^{0,1}$ and $z_{kh}(q) \in X_{k,h}^{0,1}$ be determined as solutions of the discrete state equation (3.10) and adjoint equation (3.14). Then the following error estimates hold:

    \[
    \begin{aligned}
    \|u_k(q) - u_{kh}(q)\|_I &\le C h^2 \|\nabla^2 u_k(q)\|_I,\\
    \|z_k(q) - z_{kh}(q)\|_I &\le C h^2 \bigl\{ \|\nabla^2 u_k(q)\|_I + \|\nabla^2 z_k(q)\|_I \bigr\}.
    \end{aligned}
    \]

    Proposition 4.2 provides a stability result for the discrete adjoint solution with respect to the norm of $L^2(I, H^1(\Omega))$. For later use we additionally prove a corresponding result with respect to the norm of $L^2(I, L^\infty(\Omega))$:

    Lemma 4.5. For $q \in Q$, let the solutions $u_{kh}(q) \in X_{k,h}^{0,1}$ and $z_{kh}(q) \in X_{k,h}^{0,1}$ be given by the discrete state equation (3.10) and adjoint equation (3.14), respectively. Then it holds

    \[ \|z_{kh}(q)\|_{L^2(I, L^\infty(\Omega))} \le C \|u_{kh}(q) - \hat u\|_I. \]

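    Proof. We define an additional adjoint solution $\tilde z_k \in X_k^0$ as the solution of

    \[ B(\varphi, \tilde z_k) = (\varphi, u_{kh}(q) - \hat u)_I \quad \forall \varphi \in X_k^0. \]

    Since $\tilde z_k$ and $z_{kh}(q)$ are given by means of the same right-hand side $u_{kh}(q) - \hat u$, it is possible to apply standard a priori error estimates to the discretization error $z_{kh}(q) - \tilde z_k$, similar to Proposition 4.4.

    By inserting the solution $\tilde z_k$ and utilizing the embedding $L^2(I, H^2(\Omega)) \hookrightarrow L^2(I, L^\infty(\Omega))$ we get

    \[
    \begin{aligned}
    \|z_{kh}(q)\|_{L^2(I, L^\infty(\Omega))} &\le \|z_{kh}(q) - \tilde z_k\|_{L^2(I, L^\infty(\Omega))} + \|\tilde z_k\|_{L^2(I, L^\infty(\Omega))}\\
    &\le \|z_{kh}(q) - \tilde z_k\|_{L^2(I, L^\infty(\Omega))} + C \|\tilde z_k\|_{L^2(I, H^2(\Omega))}.
    \end{aligned}
    \]

    For the first term we obtain by inserting a spatial interpolation $i_h \tilde z_k \in X_{k,h}^{0,1}$

    \[ \|z_{kh}(q) - \tilde z_k\|_{L^2(I, L^\infty(\Omega))} \le \|z_{kh}(q) - i_h \tilde z_k\|_{L^2(I, L^\infty(\Omega))} + \|i_h \tilde z_k - \tilde z_k\|_{L^2(I, L^\infty(\Omega))}. \tag{4.1} \]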

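    For the first term on the right-hand side of (4.1) we proceed by means of an inverse estimate, an estimate for the error due to space discretization (cf. [18]), and an interpolation estimate as

    \[
    \begin{aligned}
    \|z_{kh}(q) - i_h \tilde z_k\|_{L^2(I, L^\infty(\Omega))}^2 &= \sum_{m=1}^{M} k_m \|z_{kh}(q)(t_m) - i_h \tilde z_k(t_m)\|_{L^\infty(\Omega)}^2\\
    &\le C h^{-n} \sum_{m=1}^{M} k_m \|z_{kh}(q)(t_m) - i_h \tilde z_k(t_m)\|^2\\
    &\le C h^{-n} \bigl\{ \|z_{kh}(q) - \tilde z_k\|_I^2 + \|\tilde z_k - i_h \tilde z_k\|_I^2 \bigr\}\\
    &\le C h^{4-n} \|\nabla^2 \tilde z_k\|_I^2.
    \end{aligned}
    \]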

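    By standard interpolation estimates we have for the second term on the right-hand side of (4.1)

    \[
    \begin{aligned}
    \|i_h \tilde z_k - \tilde z_k\|_{L^2(I, L^\infty(\Omega))}^2 &= \sum_{m=1}^{M} k_m \|i_h \tilde z_k(t_m) - \tilde z_k(t_m)\|_{L^\infty(\Omega)}^2\\
    &\le C h^{4-n} \sum_{m=1}^{M} k_m \|\nabla^2 \tilde z_k(t_m)\|^2\\
    &= C h^{4-n} \|\nabla^2 \tilde z_k\|_I^2.
    \end{aligned}
    \]

    We complete the proof by collecting all estimates, taking square roots, and applying the stability result from Proposition 4.1:

    \[ \|z_{kh}(q)\|_{L^2(I, L^\infty(\Omega))} \le C h^{\frac{4-n}{2}} \|\nabla^2 \tilde z_k\|_I + C \|\tilde z_k\|_{L^2(I, H^2(\Omega))} \le C \|u_{kh}(q) - \hat u\|_I. \]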

    5. Error Estimates. In this section we provide a priori error estimates for the different discretization approaches described in Section 3. We start with an assertion on the error between the solution $\bar q$ of the continuous problem (2.2) and the solution $\bar q_k$ of the semidiscretized problem (3.4).

    Theorem 5.1. Let $\bar q \in Q_{ad}$ be the solution of optimization problem (2.2) and $\bar q_k$ be the solution of the semidiscretized problem (3.4). Then the following estimate holds:

    \[ \|\bar q - \bar q_k\|_I \le \frac{1}{\alpha} \|z(\bar q) - z_k(\bar q)\|_I. \]

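    Proof. Using the optimality conditions (2.4) and (3.5) we obtain the relation

    \[ -j_k'(\bar q_k)(\bar q - \bar q_k) \le 0 \le -j'(\bar q)(\bar q - \bar q_k). \]

    From (3.8) we have with any $p \in Q$:

    \[
    \begin{aligned}
    \alpha \|\bar q - \bar q_k\|_I^2 &\le j_k''(p)(\bar q - \bar q_k, \bar q - \bar q_k)\\
    &= j_k'(\bar q)(\bar q - \bar q_k) - j_k'(\bar q_k)(\bar q - \bar q_k)\\
    &\le j_k'(\bar q)(\bar q - \bar q_k) - j'(\bar q)(\bar q - \bar q_k).
    \end{aligned}
    \]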

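    By means of the representations (2.6) and (3.6) of $j'$ and $j_k'$, respectively, the terms containing $\alpha \bar q$ cancel and we obtain:

    \[ \alpha \|\bar q - \bar q_k\|_I^2 \le (z_k(\bar q) - z(\bar q), \bar q - \bar q_k)_I. \]

    The desired assertion follows by Cauchy's inequality.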
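    The last step of the proof can be written out explicitly. The following lines are a short worked version of the argument; the absolute value is used so that only the size of the inner product matters, and the final order in $k$ relies on the estimate for the adjoint state from Proposition 4.3.

```latex
\alpha \|\bar q - \bar q_k\|_I^2
  \le \bigl| (z(\bar q) - z_k(\bar q), \bar q - \bar q_k)_I \bigr|
  \le \|z(\bar q) - z_k(\bar q)\|_I \, \|\bar q - \bar q_k\|_I
\quad\Longrightarrow\quad
\|\bar q - \bar q_k\|_I \le \frac{1}{\alpha} \|z(\bar q) - z_k(\bar q)\|_I = O(k).
```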

    5.1. Cellwise constant discretization. In this section we are going to prove an estimate for the error $\|\bar q - \bar q_\sigma\|_I$ when the control is discretized by cellwise constant polynomials in space and time, see Section 3.3.1.

    For doing so, we will extend the techniques presented in [6] to the case of parabolic optimal control problems. This demands the introduction of the solution $\bar q_d$ of the purely control discretized problem

    \[ \text{Minimize } j(q_d) \quad \text{subject to } q_d \in Q_{d,ad}. \tag{5.1} \]

    The uniquely determined solution $\bar q_d$ fulfills the optimality condition

    \[ j'(\bar q_d)(\delta q - \bar q_d) \ge 0 \quad \forall \delta q \in Q_{d,ad}. \tag{5.2} \]

    To formulate the main result of this section we introduce the $L^2$-projection $\pi_d : Q \to Q_d$ and note that due to the cellwise constant discretization the following property holds true:

    \[ \pi_d Q_{ad} \subset Q_{d,ad}. \]
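    For cellwise constants the $L^2$-projection is the cell average, and an average of values between $q_a$ and $q_b$ stays between these bounds, which is the reason for the inclusion above. The sketch below checks this numerically in one space dimension; the equally weighted midpoint samples per cell, the sample function, and the function name are assumptions made only for this illustration.

```python
import numpy as np

def cell_average_projection(values_per_cell):
    """L2-projection onto cellwise constants when each row holds equally
    weighted quadrature samples of the function on one cell: the row mean."""
    return np.asarray(values_per_cell, dtype=float).mean(axis=1)

# Hypothetical admissible control on (0, 1) with bounds q_a = -1, q_b = 1.
n_cells, n_sub = 10, 8
x = (np.arange(n_cells * n_sub) + 0.5) / (n_cells * n_sub)
q = np.clip(2.0 * np.sin(2.0 * np.pi * x), -1.0, 1.0).reshape(n_cells, n_sub)

q_d = cell_average_projection(q)
residual = q - q_d[:, None]  # orthogonal to constants on every cell

print(q_d.min(), q_d.max())
print(np.abs(residual.sum(axis=1)).max())
```

    The second printed number is zero up to rounding, reflecting the orthogonality of $q - \pi_d q$ to cellwise constants that is used in the proof of Theorem 5.2.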

    Theorem 5.2. Let $\bar q \in Q_{ad}$ be the solution of the optimal control problem (2.3) and $\bar q_\sigma \in Q_{d,ad}$ be the solution of the discretized problem (3.16), where the cellwise constant discretization for the control variable is employed. Let moreover $\bar q_d \in Q_{d,ad}$ be the solution of the purely control discretized problem (5.1). Then the following estimate holds:

    \[ \|\bar q - \bar q_\sigma\|_I \le \|\bar q - \pi_d \bar q\|_I + \frac{1}{\alpha} \|z(\bar q_d) - \pi_d z(\bar q_d)\|_I + \frac{1}{\alpha} \|z(\bar q_d) - z_{kh}(\bar q_d)\|_I. \]

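    Proof. We split the error

    \[ \|\bar q - \bar q_\sigma\|_I \le \|\bar q - \bar q_d\|_I + \|\bar q_d - \bar q_\sigma\|_I \]

    and estimate both terms on the right-hand side separately. For treating the first term we use the fact that $\pi_d \bar q \in Q_{d,ad}$ and obtain from the optimality conditions (2.4) and (5.2) the inequalities

    \[ j'(\bar q)(\bar q - \bar q_d) \le 0 \quad \text{and} \quad -j'(\bar q_d)(\pi_d \bar q - \bar q_d) \le 0. \]

    Using (2.7) we proceed with any $p \in Q$:

    \[
    \begin{aligned}
    \alpha \|\bar q - \bar q_d\|_I^2 &\le j''(p)(\bar q - \bar q_d, \bar q - \bar q_d)\\
    &= j'(\bar q)(\bar q - \bar q_d) - j'(\bar q_d)(\bar q - \bar q_d)\\
    &= j'(\bar q)(\bar q - \bar q_d) - j'(\bar q_d)(\bar q - \pi_d \bar q) - j'(\bar q_d)(\pi_d \bar q - \bar q_d)\\
    &\le -j'(\bar q_d)(\bar q - \pi_d \bar q).
    \end{aligned}
    \]

    By means of the representation of the derivative $j'$ from (2.6) and the properties of $\pi_d$, we have

    \[
    \begin{aligned}
    \alpha \|\bar q - \bar q_d\|_I^2 &\le -j'(\bar q_d)(\bar q - \pi_d \bar q)\\
    &= -(\alpha \bar q_d + z(\bar q_d), \bar q - \pi_d \bar q)_I\\
    &= (\pi_d z(\bar q_d) - z(\bar q_d), \bar q - \pi_d \bar q)_I,
    \end{aligned}
    \]

    and by Young's inequality we obtain the intermediary result

    \[ \|\bar q - \bar q_d\|_I^2 \le \|\bar q - \pi_d \bar q\|_I^2 + \frac{1}{4\alpha^2} \|z(\bar q_d) - \pi_d z(\bar q_d)\|_I^2. \tag{5.3} \]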
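    The application of Young's inequality leading to (5.3) can be spelled out as follows. The abbreviations $a$ and $b$ are introduced here only for illustration and are not part of the original notation; the elementary inequality used is $xy \le x^2 + \tfrac{1}{4}y^2$.

```latex
% Illustrative abbreviations:
%   a = \|z(\bar q_d) - \pi_d z(\bar q_d)\|_I,   b = \|\bar q - \pi_d \bar q\|_I
\alpha \|\bar q - \bar q_d\|_I^2 \le a\, b
\quad\Longrightarrow\quad
\|\bar q - \bar q_d\|_I^2 \le b \cdot \frac{a}{\alpha}
  \le b^2 + \frac{1}{4} \Bigl(\frac{a}{\alpha}\Bigr)^2
  = b^2 + \frac{a^2}{4\alpha^2}.
```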

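    In order to estimate the term $\|\bar q_d - \bar q_\sigma\|_I$ we exploit the optimality conditions (5.2) and (3.17) leading to the following relation:

    \[ -j_{kh}'(\bar q_\sigma)(\bar q_d - \bar q_\sigma) \le 0 \le -j'(\bar q_d)(\bar q_d - \bar q_\sigma). \]

    Using (3.15) and the representations (2.6) for $j'$ and (3.13) for $j_{kh}'$, respectively, we obtain:

    \[
    \begin{aligned}
    \alpha \|\bar q_d - \bar q_\sigma\|_I^2 &\le j_{kh}''(p)(\bar q_d - \bar q_\sigma, \bar q_d - \bar q_\sigma)\\
    &= j_{kh}'(\bar q_d)(\bar q_d - \bar q_\sigma) - j_{kh}'(\bar q_\sigma)(\bar q_d - \bar q_\sigma)\\
    &\le j_{kh}'(\bar q_d)(\bar q_d - \bar q_\sigma) - j'(\bar q_d)(\bar q_d - \bar q_\sigma)\\
    &\le \|z(\bar q_d) - z_{kh}(\bar q_d)\|_I \, \|\bar q_d - \bar q_\sigma\|_I.
    \end{aligned}
    \]

    Thus, we achieve

    \[ \|\bar q_d - \bar q_\sigma\|_I \le \frac{1}{\alpha} \|z(\bar q_d) - z_{kh}(\bar q_d)\|_I. \tag{5.4} \]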

    Collecting estimates (5.3) and (5.4) we complete the proof.

    This theorem directly implies the following result:

    Corollary 5.3. Under the conditions of Theorem 5.2 the following estimate holds:

    \[
    \begin{aligned}
    \|\bar q - \bar q_\sigma\|_I \le{}& \frac{C}{\alpha}\, k \bigl\{ \|\partial_t \bar q\|_I + \|\partial_t u(\bar q_d)\|_I + \|\partial_t z(\bar q_d)\|_I \bigr\}\\
    &+ \frac{C}{\alpha}\, h \bigl\{ \|\nabla \bar q\|_I + \|\nabla z(\bar q_d)\|_I + h \bigl( \|\nabla^2 u_k(\bar q_d)\|_I + \|\nabla^2 z_k(\bar q_d)\|_I \bigr) \bigr\} = O(k + h).
    \end{aligned}
    \]

    Proof. The assertion follows from Theorem 5.2 by interpolation estimates and Propositions 4.3 and 4.4. Due to the fact that $\bar q, \bar q_d \in Q_{ad}$, we obtain using stability estimates that all norms involved in this estimate are bounded by a constant independent of all discretization parameters.
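    An order of convergence such as $O(k + h)$ is typically checked in experiments by comparing errors on successively refined discretizations. The helper below is a generic sketch of this computation; the function name is hypothetical and the error values are synthetic numbers constructed to behave exactly like $0.7\,h$, not results from the paper.

```python
import numpy as np

def observed_orders(params, errors):
    """Observed convergence orders log(e_i / e_{i+1}) / log(p_i / p_{i+1})
    for a sequence of discretization parameters and corresponding errors."""
    params = np.asarray(params, dtype=float)
    errors = np.asarray(errors, dtype=float)
    return np.log(errors[:-1] / errors[1:]) / np.log(params[:-1] / params[1:])

# Synthetic data (assumed): errors proportional to h, i.e. first order.
h = np.array([0.25, 0.125, 0.0625])
err = 0.7 * h
orders = observed_orders(h, err)
print(orders)
```

    When both $k$ and $h$ contribute, one parameter is usually kept small and fixed while the other is refined, so that the two contributions in the estimate can be observed separately.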


    5.2. Cellwise linear discretization. This section is devoted to the error analysis for the discretization of the control variable by piecewise constants in time and cellwise (bi-/tri-)linear functions in space as described in Section 3.3.2. To this end we split the error

    \[ \|\bar q - \bar q_\sigma\|_I \le \|\bar q - \bar q_k\|_I + \|\bar q_k - \bar q_\sigma\|_I \]

    and use the result of Theorem 5.1 for the first part. For treating the error $\|\bar q_k - \bar q_\sigma\|_I$ we adapt the technique described in [2] to parabolic problems.

    The analysis in this section is based on an assumption on the structure of the active sets. For each time interval $I_m$ we group the cells of the mesh $\mathcal T_h$ into two classes $\mathcal T_h = \mathcal T^1_{h,m} \cup \mathcal T^2_{h,m}$ with $\mathcal T^1_{h,m} \cap \mathcal T^2_{h,m} = \emptyset$ as follows: The cell $K$ belongs to $\mathcal T^1_{h,m}$ if and only if one of the following conditions is satisfied:

    (a) $\bar q_k = q_a$ on $I_m \times K$,
    (b) $\bar q_k = q_b$ on $I_m \times K$,
    (c) $q_a < \bar q_k < q_b$ on $I_m \times K$.

    The set $\mathcal T^2_{h,m}$ is given by $\mathcal T^2_{h,m} = \mathcal T_h \setminus \mathcal T^1_{h,m}$ and consists of the cells which lie “close to the free boundary between the active and the inactive sets” for the time interval $I_m$.

    Assumption 1. We assume that there exists a positive constant $C$ independent of $k$, $h$, and $m$ such that

    \[ \sum_{K \in \mathcal T^2_{h,m}} |K| \le Ch. \]
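The cell classification and Assumption 1 can be illustrated with a small numerical experiment. The following is a minimal sketch under simplifying assumptions that are not part of the paper: a one-dimensional domain (0, 1) with a uniform mesh instead of n = 2, 3, a hypothetical control `q_example` obtained by clipping a smooth function, and a sampling-based test of conditions (a)-(c) on each cell. All function names are illustrative.

```python
import math

def clip(value, lower, upper):
    """Pointwise projection onto the interval [lower, upper]."""
    return max(lower, min(upper, value))

def classify_cells(q, n_cells, qa, qb, samples=16):
    """Return one flag per cell: True if the cell belongs to T^1, i.e. the
    sampled control is identically qa, identically qb, or strictly between
    the bounds on the cell. Sampling only approximates the exact conditions."""
    h = 1.0 / n_cells
    flags = []
    for i in range(n_cells):
        vals = [q(h * (i + j / samples)) for j in range(samples + 1)]
        all_lower = all(v == qa for v in vals)
        all_upper = all(v == qb for v in vals)
        all_inactive = all(qa < v < qb for v in vals)
        flags.append(all_lower or all_upper or all_inactive)
    return flags

def measure_of_t2(q, n_cells, qa, qb):
    """Sum of |K| over the cells that are NOT in T^1."""
    h = 1.0 / n_cells
    return h * sum(1 for in_t1 in classify_cells(q, n_cells, qa, qb) if not in_t1)

QA, QB = -1.0, 1.0  # illustrative bounds, not the values of the paper

def q_example(x):
    # Hypothetical control with four free-boundary points in (0, 1).
    return clip(2.0 * math.sin(2.0 * math.pi * x), QA, QB)

ratios = []
for n_cells in (16, 32, 64, 128, 256):
    h = 1.0 / n_cells
    ratios.append(measure_of_t2(q_example, n_cells, QA, QB) / h)
    print(n_cells, ratios[-1])
```

In this one-dimensional setting the quotient of the measure of the second class and $h$ is simply the number of cells cut by the free boundary, so it stays bounded under refinement, which is the behaviour Assumption 1 asks for.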

    Remark 5.1. A similar assumption is used in [19, 23, 2]. This assumption is valid if the boundary of the level sets

    \[ \{\, x \mid \bar q_k(t_m, x) = q_a \,\} \quad\text{and}\quad \{\, x \mid \bar q_k(t_m, x) = q_b \,\} \]

    consists of a finite number of rectifiable curves.

    In what follows we will exploit a special interpolation $I_h\bar q_k \in Q_d$ of $\bar q_k$ which is defined using the information on the active sets of $\bar q_k$. To this end we introduce the patch $\omega(x_i)$ for each node $x_i$ in $\mathcal T_h$ consisting of all cells $K$ with $x_i \in \partial K$. Then the interpolant $I_h\bar q_k \in Q_d$ is given piecewise for $t \in I_m$ ($m = 1, 2, \dots, M$) by

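    \[
    I_h\bar q_k(t, x_i) =
    \begin{cases}
    q_a, & \text{if } \min_{x\in\bar\omega(x_i)} \bar q_k(t_m, x) = q_a,\\
    q_b, & \text{if } \max_{x\in\bar\omega(x_i)} \bar q_k(t_m, x) = q_b,\\
    \bar q_k(t_m, x_i), & \text{otherwise.}
    \end{cases}
    \tag{5.5}
    \]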
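The case distinction in (5.5) can be sketched in code for a single time level. This is an illustration under assumptions that are not taken from the paper: a one-dimensional uniform mesh on (0, 1), patches consisting of the one or two cells adjacent to a node, minima and maxima over a patch approximated by sampling, and a hypothetical control `q_example`. The function raises an error when both bounds are attained on one patch, which is the situation the smallness condition on $h$ is meant to exclude.

```python
import math

def clip(value, lower, upper):
    """Pointwise projection onto the interval [lower, upper]."""
    return max(lower, min(upper, value))

def active_set_interpolant(q, n_cells, qa, qb, samples=16):
    """Nodal values of the interpolant from (5.5) on a uniform 1D mesh.

    For each node x_i the closed patch is [x_{i-1}, x_{i+1}] cut to [0, 1];
    the extrema of q over the patch are approximated by sampling."""
    h = 1.0 / n_cells
    nodal = []
    for i in range(n_cells + 1):
        left = max(i - 1, 0) * h
        right = min(i + 1, n_cells) * h
        n_pts = round((right - left) / h) * samples
        vals = [q(left + (right - left) * j / n_pts) for j in range(n_pts + 1)]
        hits_lower = min(vals) == qa
        hits_upper = max(vals) == qb
        if hits_lower and hits_upper:
            raise ValueError("both bounds attained on one patch: mesh too coarse")
        if hits_lower:
            nodal.append(qa)
        elif hits_upper:
            nodal.append(qb)
        else:
            nodal.append(q(i * h))
    return nodal

QA, QB = -1.0, 1.0  # illustrative bounds, not the values of the paper

def q_example(x):
    # Hypothetical control obtained by clipping a smooth function.
    return clip(2.0 * math.sin(2.0 * math.pi * x), QA, QB)

nodal = active_set_interpolant(q_example, 16, QA, QB)
for i, value in enumerate(nodal):
    print(i / 16, value)
```

Note that the interpolant takes the bound value at every node whose patch touches the active set, even if the control itself is strictly inside the bounds at that node; in the example this happens at the node $x = 1/16$.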

    A similar construction can be found in [6, 2].

    In order to ensure that this interpolation is well-defined, one has to check that the first and second cases in (5.5) cannot apply for the same node $x_i$. This requires an assumption on the smallness of the discretization parameter $h$:

    Lemma 5.4. Let a positive constant $\gamma$ be chosen arbitrarily with $\gamma < 1$ for $n = 2$ and $\gamma < \frac12$ for $n = 3$. Then, there exists a positive constant $C$ independent of the discretization parameters $k$, $h$ and of the regularization parameter $\alpha$ such that the condition

    \[ C h^{\gamma} \le \alpha\, k_m^{\frac12} (q_b - q_a) \]

    is sufficient to ensure the interpolation $I_h\bar q_k$ to be well-defined on the interval $I_m$. If the above condition is fulfilled for all time intervals $I_m$, the following relation holds:

    \[ j_k'(\bar q_k)(\delta q - I_h\bar q_k) \ge 0 \quad \forall \delta q \in Q_{ad}. \tag{5.6} \]

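    Proof. The stability estimate from Proposition 4.1 and the fact that $\bar q_k \in Q_{ad}$ imply that

    \[ k_m^{\frac12}\,\|z_k(\bar q_k)(t_m)\|_{H^2(\Omega)} \le C \quad\text{for } m = 1, 2, \dots, M. \]

    Due to a Sobolev embedding theorem and the choice of $\gamma$ we have

    \[ \|z_k(\bar q_k)(t_m)\|_{C^{0,\gamma}(\bar\Omega)} \le C k_m^{-\frac12} \quad\text{for } m = 1, 2, \dots, M. \]

    The projection $P_{Q_{ad}} : C^{0,\gamma}(\bar\Omega) \to C^{0,\gamma}(\bar\Omega)$ is continuous and therefore the optimality condition (3.9) implies

    \[ \|\bar q_k(t_m)\|_{C^{0,\gamma}(\bar\Omega)} \le \frac{C}{\alpha}\, k_m^{-\frac12} \quad\text{for } m = 1, 2, \dots, M. \]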

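    Let $x_i$ be a node in $\mathcal T_h$. Since the mesh $\mathcal T_h$ is quasi-uniform, we have:

    \[ \max_{x\in\bar\omega(x_i)} \bar q_k(t_m, x) - \min_{x\in\bar\omega(x_i)} \bar q_k(t_m, x) \le C\,\|\bar q_k(t_m)\|_{C^{0,\gamma}(\bar\Omega)}\, h^{\gamma} \le \frac{C}{\alpha}\, k_m^{-\frac12} h^{\gamma}. \]

    If this difference is below $q_b - q_a$ for all $m$, then $I_h\bar q_k$ is well-defined.

    In order to show (5.6) we first consider a cell $K$ with

    \[ \min_{x\in K} \bar q_k(t_m, x) = q_a. \]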

    Then we have $I_h\bar q_k(t_m) = q_a$ on $K$. Moreover, it holds $\bar q_k(t_m) < q_b$ on $K$ for sufficiently small $h$ by the same arguments as in the first part of the proof. This and the optimality condition (3.5) imply that

    \[ \alpha\bar q_k(t_m, x) + z_k(\bar q_k)(t_m, x) \ge 0 \quad\text{for } x \in K. \]

    Therefore we obtain:

    \[ \bigl(\alpha\bar q_k(t_m, x) + z_k(\bar q_k)(t_m, x)\bigr)\bigl(r - I_h\bar q_k(t_m, x)\bigr) \ge 0 \quad\text{for } q_a \le r \le q_b \text{ and } x \in K. \]

    For the case

    \[ \max_{x\in K} \bar q_k(t_m, x) = q_b \]

    we proceed similarly. On cells $K$ with

    \[ q_a < \bar q_k(t_m, x) < q_b \quad\text{for } x \in K, \]

    we get by the optimality condition (3.5)

    \[ \alpha\bar q_k(t_m, x) + z_k(\bar q_k)(t_m, x) = 0 \quad\text{for } x \in K. \]

    Integrating over all cells and time intervals leads to the desired result.


    In the following theorem we provide an assertion on the error $\|\bar q_k - \bar q_\sigma\|_I$.

    Theorem 5.5. Let $\bar q_k \in Q_{ad}$ be the solution of the semidiscretized optimal control problem (3.4) and $\bar q_\sigma \in Q_{d,ad}$ be the solution of the discrete problem (3.16), where the cellwise (bi-/tri-)linear discretization for the control variable is employed. Let moreover the conditions of Lemma 5.4 be fulfilled. Then the following estimate holds:

    \[ \|\bar q_k - \bar q_\sigma\|_I \le \Bigl(2 + \frac{C}{\alpha}\Bigr)\|\bar q_k - I_h\bar q_k\|_I + \frac{1}{\alpha}\,\|z_k(\bar q_k) - z_{kh}(\bar q_k)\|_I. \]

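    Proof. We split

    \[ \|\bar q_k - \bar q_\sigma\|_I \le \|\bar q_k - I_h\bar q_k\|_I + \|I_h\bar q_k - \bar q_\sigma\|_I, \]

    and estimate the term $\|I_h\bar q_k - \bar q_\sigma\|_I$. Due to (3.15) we obtain for any $p \in Q$

    \[ \alpha\|I_h\bar q_k - \bar q_\sigma\|_I^2 \le j_{kh}''(p)(I_h\bar q_k - \bar q_\sigma, I_h\bar q_k - \bar q_\sigma) = j_{kh}'(I_h\bar q_k)(I_h\bar q_k - \bar q_\sigma) - j_{kh}'(\bar q_\sigma)(I_h\bar q_k - \bar q_\sigma). \]

    The optimality condition (3.17) for $\bar q_\sigma$ and (5.6) imply the relation

    \[ -j_{kh}'(\bar q_\sigma)(I_h\bar q_k - \bar q_\sigma) \le 0 \le -j_k'(\bar q_k)(I_h\bar q_k - \bar q_\sigma). \]

    Using the representations (3.6) of $j_k'$ and (3.13) of $j_{kh}'$ respectively, and Proposition 4.2 we obtain:

    \[
    \begin{aligned}
    \alpha\|I_h\bar q_k - \bar q_\sigma\|_I^2 &\le j_{kh}'(I_h\bar q_k)(I_h\bar q_k - \bar q_\sigma) - j_k'(\bar q_k)(I_h\bar q_k - \bar q_\sigma) \\
    &= j_{kh}'(I_h\bar q_k)(I_h\bar q_k - \bar q_\sigma) - j_{kh}'(\bar q_k)(I_h\bar q_k - \bar q_\sigma) \\
    &\quad + j_{kh}'(\bar q_k)(I_h\bar q_k - \bar q_\sigma) - j_k'(\bar q_k)(I_h\bar q_k - \bar q_\sigma) \\
    &= \bigl(\alpha(I_h\bar q_k - \bar q_k) + (z_{kh}(I_h\bar q_k) - z_{kh}(\bar q_k)), I_h\bar q_k - \bar q_\sigma\bigr)_I \\
    &\quad + (z_{kh}(\bar q_k) - z_k(\bar q_k), I_h\bar q_k - \bar q_\sigma)_I \\
    &\le (C + \alpha)\|I_h\bar q_k - \bar q_k\|_I\,\|I_h\bar q_k - \bar q_\sigma\|_I \\
    &\quad + \|z_k(\bar q_k) - z_{kh}(\bar q_k)\|_I\,\|I_h\bar q_k - \bar q_\sigma\|_I.
    \end{aligned}
    \]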

    This implies the desired estimate.

    In the next lemma we provide an estimate for the interpolation error $\|\bar q_k - I_h\bar q_k\|_I$.

    Lemma 5.6. Let $\bar q_k \in Q_{ad}$ be the solution of the semidiscretized optimization problem (3.4) and $I_h\bar q_k$ be the interpolation constructed by (5.5). Let moreover the conditions of Lemma 5.4 be fulfilled. Then, if Assumption 1 is fulfilled, the following estimate holds for $n < p \le \infty$ provided $z_k(\bar q_k) \in L^2(I, W^{1,p}(\Omega))$:

    \[ \|\bar q_k - I_h\bar q_k\|_I \le \frac{C}{\alpha}\Bigl\{ h^2\|\nabla^2 z_k(\bar q_k)\|_I + h^{\frac32 - \frac1p}\|\nabla z_k(\bar q_k)\|_{L^2(I, L^p(\Omega))} \Bigr\}. \]

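    Proof. Since $I_h\bar q_k$ as well as $\bar q_k$ are piecewise constant in time we write

    \[ \|\bar q_k - I_h\bar q_k\|_I^2 = \sum_{m=1}^{M} \int_{I_m} \|\bar q_k(t) - I_h\bar q_k(t)\|^2\,dt = \sum_{m=1}^{M} k_m\|\bar q_k(t_m) - I_h\bar q_k(t_m)\|^2. \tag{5.7} \]

    We start with

    \[ \|\bar q_k(t_m) - I_h\bar q_k(t_m)\|^2 = \sum_{K\in\mathcal T^1_{h,m}} \|\bar q_k(t_m) - I_h\bar q_k(t_m)\|_{L^2(K)}^2 + \sum_{K\in\mathcal T^2_{h,m}} \|\bar q_k(t_m) - I_h\bar q_k(t_m)\|_{L^2(K)}^2. \]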


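    For the first term we have:

    \[ \sum_{K\in\mathcal T^1_{h,m}} \|\bar q_k(t_m) - I_h\bar q_k(t_m)\|_{L^2(K)}^2 \le Ch^4 \sum_{K\in\mathcal T^1_{h,m}} \|\nabla^2\bar q_k(t_m)\|_{L^2(K)}^2 \le \frac{C}{\alpha^2}\,h^4\|\nabla^2 z_k(\bar q_k)(t_m)\|^2, \tag{5.8} \]

    since $\bar q_k(t_m)$ equals either $q_a$, $q_b$, or $-\frac{1}{\alpha} z_k(\bar q_k)(t_m)$ on all cells $K \in \mathcal T^1_{h,m}$.

    For cells $K \in \mathcal T^2_{h,m}$ we obtain either

    \[ \min_{x\in K} \bar q_k(t_m, x) = q_a \quad\text{and}\quad \max_{x\in K} \bar q_k(t_m, x) < q_b, \]

    or

    \[ \max_{x\in K} \bar q_k(t_m, x) = q_b \quad\text{and}\quad \min_{x\in K} \bar q_k(t_m, x) > q_a. \]

    As in the proof of Lemma 5.4 we have for the first case $I_h\bar q_k(t_m) = q_a$ on $K$. Since $W^{1,p}(K) \hookrightarrow C(K)$ for $p > n$, we may utilize standard interpolation estimates leading to

    \[
    \begin{aligned}
    \|\bar q_k(t_m) - I_h\bar q_k(t_m)\|_{L^2(K)}^2 = \|\bar q_k(t_m) - q_a\|_{L^2(K)}^2 &\le |K|^{1-\frac2p}\,\|\bar q_k(t_m) - I_h\bar q_k(t_m)\|_{L^p(K)}^2 \\
    &\le Ch^2|K|^{1-\frac2p}\,\|\nabla\bar q_k(t_m)\|_{L^p(K)}^2.
    \end{aligned}
    \]

    The same estimate is obtained also for the second case. By summing up and using Assumption 1 we proceed:

    \[
    \begin{aligned}
    \sum_{K\in\mathcal T^2_{h,m}} \|\bar q_k(t_m) - I_h\bar q_k(t_m)\|_{L^2(K)}^2 &\le Ch^2 \sum_{K\in\mathcal T^2_{h,m}} |K|^{1-\frac2p}\,\|\nabla\bar q_k(t_m)\|_{L^p(K)}^2 \\
    &\le Ch^2 \Bigl(\sum_{K\in\mathcal T^2_{h,m}} |K|\Bigr)^{1-\frac2p} \|\nabla\bar q_k(t_m)\|_{L^p(\Omega)}^2 \\
    &\le \frac{C}{\alpha^2}\,h^{3-\frac2p}\,\|\nabla z_k(\bar q_k)(t_m)\|_{L^p(\Omega)}^2.
    \end{aligned}
    \tag{5.9}
    \]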

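    Here we have used (3.9) and the continuity of the projection $P_{Q_{ad}} : W^{1,p}(\Omega) \to W^{1,p}(\Omega)$. By inserting (5.8) and (5.9) in (5.7) we obtain the desired estimate.

    Corollary 5.7. Under the conditions of Theorem 5.2 and Lemma 5.6 the following estimate holds:

    \[
    \begin{aligned}
    \|\bar q - \bar q_\sigma\|_I \le{}& \frac{C}{\alpha}\,k\,\bigl\{\|\partial_t u(\bar q)\|_I + \|\partial_t z(\bar q)\|_I\bigr\} + \frac{C}{\alpha}\,h^2\bigl\{\|\nabla^2 u_k(\bar q_k)\|_I + \|\nabla^2 z_k(\bar q_k)\|_I\bigr\} \\
    &+ \frac{C}{\alpha}\Bigl(1 + \frac{1}{\alpha}\Bigr)\Bigl\{ h^2\|\nabla^2 z_k(\bar q_k)\|_I + h^{\frac32 - \frac1p}\|\nabla z_k(\bar q_k)\|_{L^2(I, L^p(\Omega))} \Bigr\} = O\bigl(k + h^{\frac32 - \frac1p}\bigr).
    \end{aligned}
    \]

    Proof. The result follows directly from Theorem 5.1, Theorem 5.5, Lemma 5.6, and Proposition 4.4.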
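To get a feeling for the spatial exponent $\frac32 - \frac1p$ appearing in Corollary 5.7, it may help to evaluate it for a few sample values of $p$. The values below are chosen purely for illustration; which $p$ are admissible depends on the regularity available for $z_k(\bar q_k)$:

```latex
\[
p = 4:\ \ \tfrac32 - \tfrac14 = \tfrac54, \qquad
p = 6:\ \ \tfrac32 - \tfrac16 = \tfrac43, \qquad
p \to \infty:\ \ \tfrac32 - \tfrac1p \to \tfrac32 .
\]
```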

    In the sequel we discuss the result from Corollary 5.7 in more detail. This result holds under the assumption that $z_k(\bar q_k) \in L^2(I, W^{1,p}(\Omega))$. From the stability result in Proposition 4.1 and the fact that $\bar q_k \in Q_{ad}$, we know that

    \[ \|z_k(\bar q_k)\|_{L^2(I, H^2(\Omega))} \le C. \]


    By a Sobolev embedding theorem we have $H^2(\Omega) \hookrightarrow W^{1,p}(\Omega)$ for all $p < \infty$ in two space dimensions and for $n < p \le 6$ in three dimensions. This implies the order of convergence $O\bigl(k + h^{\frac32 - \frac1p}\bigr)$ for all $n < p$


    • for $g \in H^2(K)$

      \[ \Bigl|\int_K \bigl(g(x) - (R_d g)(x)\bigr)\,dx\Bigr| \le Ch^2|K|^{\frac12}\,\|\nabla^2 g\|_{L^2(K)}, \]

    • and for $g \in W^{1,p}(K)$ with $n < p \le \infty$

      \[ \|g - R_d g\|_{L^p(K)} \le Ch\,\|\nabla g\|_{L^p(K)}. \]

    Proof. The proof is done by standard arguments using the Bramble-Hilbert Lemma; see [19] for details.
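The first estimate can be checked numerically in a simple setting. The sketch below rests on assumptions that are not taken from the text: a one-dimensional cell $K = [c - h/2, c + h/2]$, the operator $R_d$ understood as evaluation at the cell midpoint $S_K = c$, and the test function $g = \exp$, for which both sides are available in closed form. The helper names are illustrative. The quotient of the left-hand side and $h^2|K|^{1/2}\|\nabla^2 g\|_{L^2(K)}$ should stay bounded as $h$ decreases.

```python
import math

def midpoint_defect(c, h):
    """|integral over K of (g - g(S_K))| for g = exp on K = [c - h/2, c + h/2].

    The integral of exp over K equals 2 exp(c) sinh(h/2), and the midpoint
    value exp(c) integrates to h exp(c)."""
    return abs(math.exp(c) * (2.0 * math.sinh(h / 2.0) - h))

def bound_term(c, h):
    """h^2 |K|^(1/2) ||g''||_{L^2(K)} for g = exp (so g'' = exp) and |K| = h."""
    l2_norm_squared = 0.5 * (math.exp(2.0 * c + h) - math.exp(2.0 * c - h))
    return h ** 2 * math.sqrt(h) * math.sqrt(l2_norm_squared)

c = 0.3  # arbitrary cell midpoint
ratios = []
for j in range(1, 7):
    h = 2.0 ** -j
    ratios.append(midpoint_defect(c, h) / bound_term(c, h))
    print(h, ratios[-1])
```

For this example the quotient approaches $1/24$ from below, consistent with the leading term $g''(c)\,h^3/24$ of the midpoint-rule error.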

    The operator $R_d$ will also be used for time-dependent functions $g$ by setting $(R_d g)(t) = R_d g(t)$. There holds the following lemma:

    Lemma 5.11. For a function $g_k \in X_k^0 \cap L^2(I, H^2)$ and a cellwise constant function $p_d \in Q_d$ the estimate

    \[ (p_d, g_k - R_d g_k)_I \le Ch^2\|p_d\|_I\,\|\nabla^2 g_k\|_I \]

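    holds.

    Proof. Using Lemma 5.10 we obtain:

    \[
    \begin{aligned}
    (p_d, g_k - R_d g_k)_I &= \sum_{m=1}^{M} \int_{I_m} (p_d(t), g_k(t) - R_d g_k(t))\,dt \\
    &= \sum_{m=1}^{M} k_m\,(p_d(t_m), g_k(t_m) - R_d g_k(t_m)) \\
    &= \sum_{m=1}^{M} k_m \sum_{K\in\mathcal T_h} p_d(t_m, S_K) \int_K \bigl(g_k(t_m, x) - (R_d g_k)(t_m, x)\bigr)\,dx \\
    &\le Ch^2 \sum_{m=1}^{M} k_m \sum_{K\in\mathcal T_h} |p_d(t_m, S_K)|\,|K|^{\frac12}\,\|\nabla^2 g_k(t_m)\|_{L^2(K)}.
    \end{aligned}
    \]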

    We complete the proof by Cauchy's inequality.

    Lemma 5.12. Let $\bar q_k \in Q_{ad}$ be the solution of the semidiscrete optimization problem (3.4) and $\bar q_\sigma \in Q_{d,ad}$ be the solution of the discrete problem (3.16), where the cellwise constant control discretization is employed. Then the following relation holds:

    \[ (\alpha R_d\bar q_k + R_d z_k(\bar q_k), \bar q_\sigma - R_d\bar q_k)_I \ge 0. \]

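    Proof. From the optimality condition (3.5) for $\bar q_k$ we obtain

    \[ \bigl(\alpha\bar q_k(t_m, x) + z_k(\bar q_k)(t_m, x)\bigr) \cdot \bigl(\delta q(t_m, x) - \bar q_k(t_m, x)\bigr) \ge 0 \]

    for any $\delta q \in Q_{d,ad}$ pointwise a.e. in $\Omega$ and for $m = 1, 2, \dots, M$. For an arbitrary cell $K \in \mathcal T_h$ we apply this formula for $x = S_K$ and $\delta q = \bar q_\sigma$:

    \[ \bigl(\alpha\bar q_k(t_m, S_K) + z_k(\bar q_k)(t_m, S_K)\bigr) \cdot \bigl(\bar q_\sigma(t_m, S_K) - \bar q_k(t_m, S_K)\bigr) \ge 0. \]

    This can be done because of the spatial continuity of $z_k(\bar q_k)$, $\bar q_k$, and $\bar q_\sigma$. Due to the definition of $R_d$ this is equivalent to

    \[ \bigl(\alpha R_d\bar q_k(t_m, S_K) + R_d z_k(\bar q_k)(t_m, S_K)\bigr) \cdot \bigl(\bar q_\sigma(t_m, S_K) - R_d\bar q_k(t_m, S_K)\bigr) \ge 0. \]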


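    Then, integration over $K$ and $I_m$, summation over all $K \in \mathcal T_h$ and $m = 1, 2, \dots, M$ leads to the proposed relation.

    Lemma 5.13. Let $\bar q_k \in Q_{ad}$ be the solution of the semidiscrete optimization problem (3.4) and $\psi_{kh} \in X^{0,1}_{k,h}$. Let, moreover, Assumption 1 be fulfilled and $n < p \le \infty$. Then, it holds

    \[
    \begin{aligned}
    (\psi_{kh}, \bar q_k - R_d\bar q_k)_I \le{}& \frac{C}{\alpha}\,h^2\bigl\{\|\nabla\psi_{kh}\|_I\,\|\nabla z_k(\bar q_k)\|_I + \|\psi_{kh}\|_{L^2(I, L^\infty(\Omega))}\,\|\nabla^2 z_k(\bar q_k)\|_I\bigr\} \\
    &+ \frac{C}{\alpha}\,h^{2-\frac1p}\,\|\psi_{kh}\|_{L^2(I, L^\infty(\Omega))}\,\|\nabla z_k(\bar q_k)\|_{L^2(I, L^p(\Omega))},
    \end{aligned}
    \]

    provided that $z_k(\bar q_k) \in L^2(I, W^{1,p}(\Omega))$.

    Proof. By means of the $L^2$-projection $\pi_d : Q \to Q_d$, we split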

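    \[ (\psi_{kh}, \bar q_k - R_d\bar q_k)_I = (\psi_{kh}, \bar q_k - \pi_d\bar q_k)_I + (\psi_{kh}, \pi_d\bar q_k - R_d\bar q_k)_I. \]

    Using the optimality condition (3.9) and the continuity of the projection operator $P_{Q_{ad}} : H^1(\Omega) \to H^1(\Omega)$, we have for the first term

    \[
    \begin{aligned}
    (\psi_{kh}, \bar q_k - \pi_d\bar q_k)_I = (\psi_{kh} - \pi_d\psi_{kh}, \bar q_k - \pi_d\bar q_k)_I &\le Ch^2\|\nabla\psi_{kh}\|_I\,\|\nabla\bar q_k\|_I \\
    &\le \frac{C}{\alpha}\,h^2\|\nabla\psi_{kh}\|_I\,\|\nabla z_k(\bar q_k)\|_I.
    \end{aligned}
    \tag{5.10}
    \]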

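    For the second term we obtain

    \[
    \begin{aligned}
    (\psi_{kh}, \pi_d\bar q_k - R_d\bar q_k)_I &= \sum_{m=1}^{M} \int_{I_m} (\psi_{kh}(t), \pi_d\bar q_k(t) - R_d\bar q_k(t))\,dt \\
    &= \sum_{m=1}^{M} k_m\,(\psi_{kh}(t_m), \pi_d\bar q_k(t_m) - R_d\bar q_k(t_m)).
    \end{aligned}
    \]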

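    Utilizing the fact that $\pi_d\bar q_k(t_m)$ as well as $R_d\bar q_k(t_m)$ are constant on each cell $K$, we proceed with

    \[
    \begin{aligned}
    (\psi_{kh}(t_m), \pi_d\bar q_k(t_m) - R_d\bar q_k(t_m)) &= \sum_{K\in\mathcal T_h} \int_K \psi_{kh}(t_m, x)\bigl(\pi_d\bar q_k(t_m, x) - (R_d\bar q_k)(t_m, x)\bigr)\,dx \\
    &= \sum_{K\in\mathcal T_h} \frac{1}{|K|} \int_K \psi_{kh}(t_m, x)\,dx \int_K \bigl(\pi_d\bar q_k(t_m, x) - (R_d\bar q_k)(t_m, x)\bigr)\,dx \\
    &\le \|\psi_{kh}(t_m)\|_{L^\infty(\Omega)} \sum_{K\in\mathcal T_h} \Bigl|\int_K \bigl(\bar q_k(t_m, x) - (R_d\bar q_k)(t_m, x)\bigr)\,dx\Bigr|.
    \end{aligned}
    \tag{5.11}
    \]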

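    As in Section 5.2, we split the last sum using the separation $\mathcal T_h = \mathcal T^1_{h,m} \cup \mathcal T^2_{h,m}$ for $m = 1, 2, \dots, M$. For the sum over $\mathcal T^1_{h,m}$ we obtain by means of Lemma 5.10 and the fact that $\bar q_k(t_m)$ equals either $q_a$, $q_b$, or $-\frac{1}{\alpha} z_k(\bar q_k)(t_m)$:

    \[
    \begin{aligned}
    \sum_{K\in\mathcal T^1_{h,m}} \Bigl|\int_K \bigl(\bar q_k(t_m, x) - R_d\bar q_k(t_m, x)\bigr)\,dx\Bigr| &\le Ch^2 \sum_{K\in\mathcal T^1_{h,m}} |K|^{\frac12}\,\|\nabla^2\bar q_k(t_m)\|_{L^2(K)} \\
    &\le \frac{C}{\alpha}\,h^2\|\nabla^2 z_k(\bar q_k)(t_m)\|.
    \end{aligned}
    \tag{5.12}
    \]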


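    For the part of the sum over $\mathcal T^2_{h,m}$ the estimate of Lemma 5.10, Assumption 1, the optimality condition (3.9) and the continuity of the projection operator $P_{Q_{ad}}$ on $W^{1,p}(\Omega)$ lead to

    \[
    \begin{aligned}
    \sum_{K\in\mathcal T^2_{h,m}} \Bigl|\int_K \bigl(\bar q_k(t_m, x) - R_d\bar q_k(t_m, x)\bigr)\,dx\Bigr| &\le \sum_{K\in\mathcal T^2_{h,m}} |K|^{1-\frac1p}\,\|\bar q_k(t_m) - R_d\bar q_k(t_m)\|_{L^p(K)} \\
    &\le Ch \sum_{K\in\mathcal T^2_{h,m}} |K|^{1-\frac1p}\,\|\nabla\bar q_k(t_m)\|_{L^p(K)} \\
    &\le \frac{C}{\alpha}\,h^{2-\frac1p}\,\|\nabla z_k(\bar q_k)(t_m)\|_{L^p(\Omega)}.
    \end{aligned}
    \tag{5.13}
    \]

    Inserting (5.12) and (5.13) into (5.11) and collecting the estimates (5.10) and (5.11) completes the proof.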

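    The following theorem provides a supercloseness result on the difference $R_d\bar q_k - \bar q_\sigma$:

    Theorem 5.14. Let $\bar q_k \in Q_{ad}$ be the solution of the semidiscretized optimization problem (3.4) and $\bar q_\sigma \in Q_{d,ad}$ be the solution of the discrete problem (3.16), where the cellwise constant discretization for the control variable is employed. Let, moreover, Assumption 1 be fulfilled and $n < p \le \infty$. Then, it holds

    \[
    \begin{aligned}
    \|R_d\bar q_k - \bar q_\sigma\|_I \le{}& \frac{C}{\alpha}\,h^2\Bigl\{\|\nabla^2 u_k(\bar q_k)\|_I + \frac{1}{\alpha}\|\nabla z_k(\bar q_k)\|_I + \Bigl(1 + \frac{1}{\alpha}\Bigr)\|\nabla^2 z_k(\bar q_k)\|_I\Bigr\} \\
    &+ \frac{C}{\alpha^2}\,h^{2-\frac1p}\,\|\nabla z_k(\bar q_k)\|_{L^2(I, L^p(\Omega))},
    \end{aligned}
    \]

    provided that $z_k(\bar q_k) \in L^2(I, W^{1,p}(\Omega))$.

    Proof. As before, we proceed with an arbitrary $p \in Q$: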

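    \[ \alpha\|R_d\bar q_k - \bar q_\sigma\|_I^2 \le j_{kh}''(p)(R_d\bar q_k - \bar q_\sigma, R_d\bar q_k - \bar q_\sigma) = j_{kh}'(R_d\bar q_k)(R_d\bar q_k - \bar q_\sigma) - j_{kh}'(\bar q_\sigma)(R_d\bar q_k - \bar q_\sigma). \]

    By means of the inequality

    \[ -j_{kh}'(\bar q_\sigma)(R_d\bar q_k - \bar q_\sigma) \le 0 \le -(\alpha R_d\bar q_k + R_d z_k(\bar q_k), R_d\bar q_k - \bar q_\sigma)_I, \]

    which is implied by the optimality of $\bar q_\sigma$ and Lemma 5.12, and by means of the explicit representation of $j_{kh}'$ from (3.13) we obtain

    \[
    \begin{aligned}
    \alpha\|R_d\bar q_k - \bar q_\sigma\|_I^2 &\le (z_{kh}(R_d\bar q_k) - R_d z_k(\bar q_k), R_d\bar q_k - \bar q_\sigma)_I \\
    &\le (z_{kh}(R_d\bar q_k) - z_k(\bar q_k), R_d\bar q_k - \bar q_\sigma)_I + (z_k(\bar q_k) - R_d z_k(\bar q_k), R_d\bar q_k - \bar q_\sigma)_I.
    \end{aligned}
    \tag{5.14}
    \]

    For the first term on the right-hand side of (5.14), we have by Cauchy's inequality

    \[ (z_{kh}(R_d\bar q_k) - z_k(\bar q_k), R_d\bar q_k - \bar q_\sigma)_I \le \|z_{kh}(R_d\bar q_k) - z_k(\bar q_k)\|_I\,\|R_d\bar q_k - \bar q_\sigma\|_I. \]

    By insertion of $z_{kh}(\bar q_k)$, the term $\|z_{kh}(R_d\bar q_k) - z_k(\bar q_k)\|_I$ is further estimated as

    \[ \|z_{kh}(R_d\bar q_k) - z_k(\bar q_k)\|_I \le \|z_{kh}(R_d\bar q_k) - z_{kh}(\bar q_k)\|_I + \|z_{kh}(\bar q_k) - z_k(\bar q_k)\|_I. \tag{5.15} \]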


    Due to the stability estimate of the fully discrete adjoint solution, see Proposition 4.2, the first term is bounded by

    \[ \|z_{kh}(R_d\bar q_k) - z_{kh}(\bar q_k)\|_I \le C\|u_{kh}(R_d\bar q_k) - u_{kh}(\bar q_k)\|_I. \tag{5.16} \]

    Further, we have by means of the discrete state equation (3.10) and the discrete adjoint equation (3.14):

    \[ \|u_{kh}(R_d\bar q_k) - u_{kh}(\bar q_k)\|_I^2 = (z_{kh}(\bar q_k) - z_{kh}(R_d\bar q_k), \bar q_k - R_d\bar q_k)_I. \]
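This identity can be verified by a short duality argument. The following is only a sketch under an assumption about the form of (3.10) and (3.14), which are not restated here: it presumes that differences of discrete states and adjoint states satisfy $B(\delta u, \varphi) = (\delta q, \varphi)_I$ and $B(\varphi, \delta z) = (\varphi, \delta u)_I$ for all discrete test functions $\varphi$, with a bilinear form $B$. The symbols $B$, $\delta q$, $\delta u$, and $\delta z$ are notation introduced for this illustration, with $\delta q := R_d\bar q_k - \bar q_k$, $\delta u := u_{kh}(R_d\bar q_k) - u_{kh}(\bar q_k)$, and $\delta z := z_{kh}(R_d\bar q_k) - z_{kh}(\bar q_k)$. Testing the adjoint relation with $\varphi = \delta u$ and the state relation with $\varphi = \delta z$ gives:

```latex
\[
\|\delta u\|_I^2 = (\delta u, \delta u)_I = B(\delta u, \delta z) = (\delta q, \delta z)_I
= \bigl(z_{kh}(\bar q_k) - z_{kh}(R_d\bar q_k),\ \bar q_k - R_d\bar q_k\bigr)_I ,
\]
```

where the last step uses that changing the sign of both factors leaves the inner product unchanged.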

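    With $\psi_{kh} = z_{kh}(\bar q_k) - z_{kh}(R_d\bar q_k)$ in Lemma 5.13, we have

    \[
    \begin{aligned}
    \|u_{kh}(R_d\bar q_k) - u_{kh}(\bar q_k)\|_I^2 \le{}& \frac{C}{\alpha}\,h^2\bigl\{\|\nabla(z_{kh}(\bar q_k) - z_{kh}(R_d\bar q_k))\|_I\,\|\nabla z_k(\bar q_k)\|_I \\
    &\qquad + \|z_{kh}(\bar q_k) - z_{kh}(R_d\bar q_k)\|_{L^2(I, L^\infty(\Omega))}\,\|\nabla^2 z_k(\bar q_k)\|_I\bigr\} \\
    &+ \frac{C}{\alpha}\,h^{2-\frac1p}\,\|z_{kh}(\bar q_k) - z_{kh}(R_d\bar q_k)\|_{L^2(I, L^\infty(\Omega))}\,\|\nabla z_k(\bar q_k)\|_{L^2(I, L^p(\Omega))},
    \end{aligned}
    \]

    and the stability estimates from Proposition 4.2 and Lemma 4.5

    \[
    \begin{aligned}
    \|\nabla(z_{kh}(\bar q_k) - z_{kh}(R_d\bar q_k))\|_I &\le C\|u_{kh}(R_d\bar q_k) - u_{kh}(\bar q_k)\|_I, \\
    \|z_{kh}(\bar q_k) - z_{kh}(R_d\bar q_k)\|_{L^2(I, L^\infty(\Omega))} &\le C\|u_{kh}(R_d\bar q_k) - u_{kh}(\bar q_k)\|_I,
    \end{aligned}
    \]

    yield the following intermediary result

    \[ \|u_{kh}(R_d\bar q_k) - u_{kh}(\bar q_k)\|_I \le \frac{C}{\alpha}\,h^2\bigl\{\|\nabla z_k(\bar q_k)\|_I + \|\nabla^2 z_k(\bar q_k)\|_I\bigr\} + \frac{C}{\alpha}\,h^{2-\frac1p}\,\|\nabla z_k(\bar q_k)\|_{L^2(I, L^p(\Omega))}. \]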

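    We proceed by inserting this in (5.16) and in (5.15). This leads together with an estimate for the second term on the right-hand side of (5.15) from Proposition 4.4 to

    \[
    \begin{aligned}
    \|z_{kh}(R_d\bar q_k) - z_k(\bar q_k)\|_I \le{}& Ch^2\Bigl\{\|\nabla^2 u_k(\bar q_k)\|_I + \frac{1}{\alpha}\|\nabla z_k(\bar q_k)\|_I + \Bigl(1 + \frac{1}{\alpha}\Bigr)\|\nabla^2 z_k(\bar q_k)\|_I\Bigr\} \\
    &+ \frac{C}{\alpha}\,h^{2-\frac1p}\,\|\nabla z_k(\bar q_k)\|_{L^2(I, L^p(\Omega))}.
    \end{aligned}
    \tag{5.17}
    \]

    By applying Lemma 5.11 with $p_d = R_d\bar q_k - \bar q_\sigma$ to the second term on the right-hand side of (5.14), we get

    \[ (z_k(\bar q_k) - R_d z_k(\bar q_k), R_d\bar q_k - \bar q_\sigma)_I \le Ch^2\|R_d\bar q_k - \bar q_\sigma\|_I\,\|\nabla^2 z_k(\bar q_k)\|_I. \]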

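    The asserted result is obtained by insertion of the last two estimates into (5.14).

    Based on this theorem, we state the main result of this section concerning the order of convergence of the error between $\bar q$ and $\tilde q_\sigma$, where $\tilde q_\sigma$ is defined using the post-processing step (3.18).

    Corollary 5.15. Let the conditions of Theorem 5.14 be fulfilled. Then, there holds:

    \[
    \begin{aligned}
    \|\bar q - \tilde q_\sigma\|_I \le{}& \frac{C}{\alpha}\Bigl(1 + \frac{1}{\alpha}\Bigr)k\,\bigl\{\|\partial_t u(\bar q)\|_I + \|\partial_t z(\bar q)\|_I\bigr\} \\
    &+ \frac{C}{\alpha}\Bigl(1 + \frac{1}{\alpha}\Bigr)h^2\Bigl\{\|\nabla^2 u_k(\bar q_k)\|_I + \frac{1}{\alpha}\|\nabla z_k(\bar q_k)\|_I + \Bigl(1 + \frac{1}{\alpha}\Bigr)\|\nabla^2 z_k(\bar q_k)\|_I\Bigr\} \\
    &+ \frac{C}{\alpha^2}\Bigl(1 + \frac{1}{\alpha}\Bigr)h^{2-\frac1p}\,\|\nabla z_k(\bar q_k)\|_{L^2(I, L^p(\Omega))} = O\bigl(k + h^{2-\frac1p}\bigr).
    \end{aligned}
    \]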


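    Proof. From the optimality condition (2.9) and the definition (3.18) of $\tilde q_\sigma$ we have the representation

    \[ \|\bar q - \tilde q_\sigma\|_I = \Bigl\| P_{Q_{ad}}\Bigl(-\frac{1}{\alpha} z(\bar q)\Bigr) - P_{Q_{ad}}\Bigl(-\frac{1}{\alpha} z_{kh}(\bar q_\sigma)\Bigr) \Bigr\|_I. \]

    By means of the Lipschitz continuity of $P_{Q_{ad}}$ on $L^2(I, L^2(\Omega))$, this leads to

    \[ \alpha\|\bar q - \tilde q_\sigma\|_I \le \|z(\bar q) - z_{kh}(\bar q_\sigma)\|_I \le \|z(\bar q) - z_k(\bar q_k)\|_I + \|z_k(\bar q_k) - z_{kh}(\bar q_\sigma)\|_I. \tag{5.18} \]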
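The post-processing representation and the Lipschitz step can be sketched numerically. The code below is an illustration under stated assumptions, not the authors' implementation: it takes the projection to act pointwise as a clipping to $[q_a, q_b]$, replaces the space-time norm by a weighted sum over sample points, and uses two hypothetical adjoint samples `z1` and `z2`. The parameter values $q_a = -70$, $q_b = -1$, $\alpha = \pi^{-4}$ are those of the numerical example in Section 6; all function names are invented for this sketch.

```python
import math

def project(value, qa, qb):
    """Pointwise projection onto [qa, qb] (assumed form of P_Qad)."""
    return max(qa, min(qb, value))

def postprocess(z_values, alpha, qa, qb):
    """Post-processed control P_Qad(-z / alpha), evaluated at sample points."""
    return [project(-z / alpha, qa, qb) for z in z_values]

def discrete_l2_norm(values, weight):
    """Weighted l2 norm used here as a stand-in for the norm on I x Omega."""
    return math.sqrt(weight * sum(v * v for v in values))

ALPHA, QA, QB = math.pi ** -4, -70.0, -1.0

n = 200
w = 1.0 / n
# Two hypothetical adjoint states sampled at cell midpoints of (0, 1).
z1 = [math.sin(math.pi * (i + 0.5) * w) for i in range(n)]
z2 = [zi + 0.01 * math.cos(7.0 * (i + 0.5) * w) for i, zi in enumerate(z1)]

q1 = postprocess(z1, ALPHA, QA, QB)
q2 = postprocess(z2, ALPHA, QA, QB)

lhs = ALPHA * discrete_l2_norm([a - b for a, b in zip(q1, q2)], w)
rhs = discrete_l2_norm([a - b for a, b in zip(z1, z2)], w)
print(lhs, rhs, lhs <= rhs)
```

Wherever both controls sit at the same bound their difference vanishes although the adjoint samples differ, which is why the left-hand side is typically strictly smaller than the right-hand side.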

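    The first term is controlled by means of Proposition 4.1, Theorem 5.1, and Proposition 4.3 as

    \[
    \begin{aligned}
    \|z(\bar q) - z_k(\bar q_k)\|_I &\le \|z(\bar q) - z_k(\bar q)\|_I + \|z_k(\bar q) - z_k(\bar q_k)\|_I \\
    &\le \|z(\bar q) - z_k(\bar q)\|_I + C\|u_k(\bar q) - u_k(\bar q_k)\|_I \\
    &\le \|z(\bar q) - z_k(\bar q)\|_I + C\|\bar q - \bar q_k\|_I \\
    &\le \Bigl(1 + \frac{C}{\alpha}\Bigr)\|z(\bar q) - z_k(\bar q)\|_I \\
    &\le C\Bigl(1 + \frac{1}{\alpha}\Bigr)k\,\bigl\{\|\partial_t u(\bar q)\|_I + \|\partial_t z(\bar q)\|_I\bigr\}.
    \end{aligned}
    \]

    The second term can be estimated by means of the stability result of Proposition 4.2 as

    \[
    \begin{aligned}
    \|z_k(\bar q_k) - z_{kh}(\bar q_\sigma)\|_I &\le \|z_k(\bar q_k) - z_{kh}(R_d\bar q_k)\|_I + \|z_{kh}(R_d\bar q_k) - z_{kh}(\bar q_\sigma)\|_I \\
    &\le \|z_k(\bar q_k) - z_{kh}(R_d\bar q_k)\|_I + C\|u_{kh}(R_d\bar q_k) - u_{kh}(\bar q_\sigma)\|_I \\
    &\le \|z_k(\bar q_k) - z_{kh}(R_d\bar q_k)\|_I + C\|R_d\bar q_k - \bar q_\sigma\|_I.
    \end{aligned}
    \]

    Inserting the two last inequalities into (5.18) and applying the estimates from (5.17) and Theorem 5.14 yields the stated assertion.

    The choice of $p$ in Corollary 5.15 follows the description in Section 5.2 requiring $z_k(\bar q_k) \in L^2(I, W^{1,p}(\Omega))$. Due to the fact that $\|z_k(\bar q_k)\|_{L^2(I, H^2(\Omega))}$ is bounded independently of $k$, the result in Corollary 5.15 holds for any $n < p$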

    [Fig. 6.1. Discretization error $\|\bar q - \bar q_\sigma\|_I$ for constant and bilinear control. (a) Refinement of the time steps for N = 1089 spatial nodes: error versus M, with reference slope O(k). (b) Refinement of the spatial triangulation for M = 2048 time steps: error versus N, with reference slopes O(h) and O(h^{3/2}).]

    with $P_{Q_{ad}}$ given by (2.8) with $q_a = -70$ and $q_b = -1$. For this choice of data and with the regularization parameter $\alpha$ chosen as $\alpha = \pi^{-4}$, the optimal solution triple $(\bar q, \bar u, \bar z)$ of the optimal control problem (2.2) is given by

    \[
    \begin{aligned}
    \bar q(t, x_1, x_2) &:= P_{Q_{ad}}\bigl(-\pi^4\{w_a(t, x_1, x_2) - w_a(T, x_1, x_2)\}\bigr), \\
    \bar u(t, x_1, x_2) &:= \frac{-1}{2 + a}\,\pi^2 w_a(t, x_1, x_2), \\
    \bar z(t, x_1, x_2) &:= w_a(t, x_1, x_2) - w_a(T, x_1, x_2).
    \end{aligned}
    \]

    We are going to validate the estimates developed in the previous section by separating the discretization errors. That is, we consider at first the behavior of the error for a sequence of discretizations with decreasing size of the time steps and a fixed spatial triangulation with $N = 1089$ nodes. Secondly, we examine the behavior of the error under refinement of the spatial triangulation for $M = 2048$ time steps.

    The state discretization is chosen as cG(1)dG(0), i.e., $r = 0$, $s = 1$. For the control discretization we use the same temporal and spatial meshes as for the state variable and present results for two choices of the discrete control space $Q_d$: cG(1)dG(0) and dG(0)dG(0). For the following computations, we choose the free parameter $a$ to be $-\sqrt{5}$.

    The optimal control problems are solved by the optimization library RoDoBo [20] using a primal-dual active set strategy (cf. [3, 12]) in combination with a conjugate gradient method applied to the reduced problem (3.16).

    Figure 6.1(a) depicts the development of the error under refinement of the temporal step size $k$. Up to the spatial discretization error it exhibits the proven convergence order $O(k)$ for both kinds of spatial discretization of the control space. For piecewise constant control (dG(0)dG(0) discretization), the spatial discretization error is already reached at 128 time steps, whereas in the case of bilinear control (cG(1)dG(0) discretization), the number of time steps could be increased up to $M = 1024$ until reaching the spatial accuracy. This illustrates the convergence results from Sections 5.1 and 5.2 with respect to the temporal discretization.

    [Fig. 6.2. Discretization error $\|\bar u - \bar u_\sigma\|_I$ for constant and bilinear control. (a) Refinement of the time steps for N = 1089 spatial nodes: error versus M, with reference slope O(k). (b) Refinement of the spatial triangulation for M = 2048 time steps: error versus N, with reference slope O(h^2).]

    In Figure 6.1(b) the development of the error in the control variable under spatial refinement is shown. The expected order $O(h)$ for piecewise constant control (dG(0)dG(0) discretization) and $O(h^{\frac32})$ for bilinear control (cG(1)dG(0) discretization) is observed. This illustrates the convergence results from Sections 5.1 and 5.2 with respect to the spatial discretization.
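Observed orders such as these are commonly read off from errors on successively refined meshes. The following is a minimal sketch of that computation; it does not reproduce the data behind the figures. The error values are synthetic and generated from a known power law, the helper names are illustrative, and the relation between the node count $N$ and the mesh size $h$ assumes a uniform mesh of a unit square with $(m+1)^2$ nodes.

```python
import math

def mesh_size_from_nodes(n_nodes):
    """Cell width h = 1/m for a uniform square mesh with (m + 1)^2 nodes
    (an assumption about the mesh, made for illustration)."""
    m = math.isqrt(n_nodes) - 1
    return 1.0 / m

def observed_orders(mesh_sizes, errors):
    """Estimated convergence orders log(e_i / e_{i+1}) / log(h_i / h_{i+1})."""
    return [
        math.log(errors[i] / errors[i + 1]) / math.log(mesh_sizes[i] / mesh_sizes[i + 1])
        for i in range(len(errors) - 1)
    ]

nodes = [25, 81, 289, 1089, 4225]
hs = [mesh_size_from_nodes(n) for n in nodes]

# Synthetic errors following C * h^(3/2); NOT measurements from the paper.
synthetic_errors = [0.8 * h ** 1.5 for h in hs]

orders = observed_orders(hs, synthetic_errors)
for n, order in zip(nodes[1:], orders):
    print(n, order)
```

With measured errors in place of the synthetic ones, values close to 1 and 3/2 would correspond to the reference slopes $O(h)$ and $O(h^{3/2})$ in Figure 6.1(b).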

    Figures 6.2 and 6.3 show the errors in the state and in the adjoint variables, $\|\bar u - \bar u_\sigma\|_I$ and $\|\bar z - \bar z_\sigma\|_I$, for separate refinement of the time and space discretization. Thereby, we observe convergence of order $O(k + h^2)$ regardless of the type of spatial discretization used for the controls. This is consistent with the results proven in the previous section. Since the post-processing strategy presented in Section 5.4 relies essentially on the convergence properties of the adjoint variable, Figure 6.3 confirms the proven order of convergence of the error $\|\bar q - \tilde q_\sigma\|_I$.

    [Fig. 6.3. Discretization error $\|\bar z - \bar z_\sigma\|_I$ for constant and bilinear control. (a) Refinement of the time steps for N = 1089 spatial nodes: error versus M, with reference slope O(k). (b) Refinement of the spatial triangulation for M = 2048 time steps: error versus N, with reference slope O(h^2).]

    Acknowledgments. The first author is supported by the German Research Foundation DFG through the International Research Training Group 710 “Complex Processes: Modeling, Simulation, and Optimization”. The second author has been partially supported by the Austrian Science Fund FWF project P18971-N18 “Numerical analysis and discretization strategies for optimal control problems with singularities”.

    REFERENCES

    [1] N. Arada, E. Casas, and F. Tröltzsch, Error estimates for a semilinear elliptic optimal control problem, Comput. Optim. Appl., 23 (2002), pp. 201–229.

    [2] R. Becker and B. Vexler, Optimal control of the convection-diffusion equation using stabilized finite element methods, Numer. Math., 106 (2007), pp. 349–367.

    [3] M. Bergounioux, K. Ito, and K. Kunisch, Primal-dual strategy for constrained optimal control problems, SIAM J. Control Optim., 37 (1999), pp. 1176–1194.

    [4] E. Casas and M. Mateos, Error estimates for the numerical approximation of boundary semilinear elliptic control problems. Continuous piecewise linear approximations, in Systems, control, modeling and optimization, New York, 2006, Springer, pp. 91–101. IFIP Int. Fed. Inf. Process., 202.

    [5] E. Casas, M. Mateos, and F. Tröltzsch, Error estimates for the numerical approximation of boundary semilinear elliptic control problems, Comput. Optim. Appl., 31 (2005), pp. 193–220.

    [6] E. Casas and F. Tröltzsch, Error estimates for linear-quadratic elliptic control problems, in Analysis and Optimization of Differential Systems, V. Barbu, I. Lasiecka, D. Tiba, and C. Varsan, eds., Boston, 2003, Kluwer Academic Publishers, pp. 89–100. Proceeding of International Working Conference on Analysis and Optimization of Differential Systems.

    [7] P. G. Ciarlet, The Finite Element Method for Elliptic Problems, vol. 40 of Classics Appl. Math., SIAM, Philadelphia, 2002.

    [8] R. Falk, Approximation of a class of optimal control problems with order of convergence estimates, J. Math. Anal. Appl., 44 (1973), pp. 28–47.

    [9] The finite element toolkit Gascoigne. http://www.gascoigne.uni-hd.de.

    [10] T. Geveci, On the approximation of the solution of an optimal control problem governed by an elliptic equation, M2AN Math. Model. Numer. Anal., 13 (1979), pp. 313–328.

    [11] M. Hinze, A variational discretization concept in control constrained optimization: The linear-quadratic case, Comput. Optim. and Appl., 30 (2005), pp. 45–61.

    [12] K. Kunisch and A. Rösch, Primal-dual active set strategy for a general class of constrained optimal control problems, SIAM J. Optim., 13 (2002), pp. 321–334.

    [13] I. Lasiecka and K. Malanowski, On regularity of solutions to convex optimal control problems with control constraints for parabolic systems, Control Cybern., 6 (1977), pp. 57–74.

    [14] I. Lasiecka and K. Malanowski, On discrete-time Ritz-Galerkin approximation of control constrained optimal control problems for parabolic systems, Control Cybern., 7 (1978), pp. 21–36.

    [15] J.-L. Lions, Optimal Control of Systems Governed by Partial Differential Equations, vol. 170 of Grundlehren Math. Wiss., Springer, Berlin, 1971.

    [16] K. Malanowski, Convergence of approximations vs. regularity of solutions for convex, control-constrained optimal-control problems, Appl. Math. Optim., 8 (1981), pp. 69–95.

    [17] R. S. McKnight and W. E. Bosarge, Jr., The Ritz-Galerkin procedure for parabolic control problems, SIAM J. Control Optim., 11 (1973), pp. 510–524.

    [18] D. Meidner and B. Vexler, A priori error estimates for space-time finite element approximation of parabolic optimal control problems. Part I: Problems without control constraints, (2007). submitted.

    [19] C. Meyer and A. Rösch, Superconvergence properties of optimal control problems, SIAM J. Control Optim., 43 (2004), pp. 970–985.

    [20] RoDoBo. A C++ library for optimization with stationary and nonstationary PDEs with interface to Gascoigne [9]. http://www.rodobo.uni-hd.de.

    [21] A. Rösch, Error estimates for parabolic optimal control problems with control constraints, Zeitschrift für Analysis und ihre Anwendungen ZAA, 23 (2004), pp. 353–376.

    [22] A. Rösch, Error estimates for linear-quadratic control problems with control constraints, Optim. Methods Softw., 21 (2006), pp. 121–134.

    [23] A. Rösch and B. Vexler, Optimal control of the Stokes equations: A priori error analysis for finite element discretization with postprocessing, SIAM J. Numer. Anal., 44 (2006), pp. 1903–1920.

    [24] M. Schmich and B. Vexler, Adaptivity with dynamic meshes for space-time finite element discretizations of parabolic equations, submitted, (2006).

    [25] V. Thomée, Galerkin Finite Element Methods for Parabolic Problems, vol. 25 of Springer Ser. Comput. Math., Springer, Berlin, 1997.

    [26] R. Winther, Error estimates for a Galerkin approximation of a parabolic control problem, Ann. Math. Pura Appl. (4), 117 (1978), pp. 173–206.