A New Unbiased Stochastic Derivative Estimator for ...

14
This article was downloaded by: [73.132.152.220] On: 03 February 2018, At: 16:19 Publisher: Institute for Operations Research and the Management Sciences (INFORMS) INFORMS is located in Maryland, USA Operations Research Publication details, including instructions for authors and subscription information: http://pubsonline.informs.org A New Unbiased Stochastic Derivative Estimator for Discontinuous Sample Performances with Structural Parameters Yijie Peng, Michael C. Fu, Jian-Qiang Hu, Bernd Heidergott To cite this article: Yijie Peng, Michael C. Fu, Jian-Qiang Hu, Bernd Heidergott (2018) A New Unbiased Stochastic Derivative Estimator for Discontinuous Sample Performances with Structural Parameters. Operations Research Published online in Articles in Advance 02 Feb 2018 . https://doi.org/10.1287/opre.2017.1674 Full terms and conditions of use: http://pubsonline.informs.org/page/terms-and-conditions This article may be used only for the purposes of research, teaching, and/or private study. Commercial use or systematic downloading (by robots or other automatic processes) is prohibited without explicit Publisher approval, unless otherwise noted. For more information, contact [email protected]. The Publisher does not warrant or guarantee the article’s accuracy, completeness, merchantability, fitness for a particular purpose, or non-infringement. Descriptions of, or references to, products or publications, or inclusion of an advertisement in this article, neither constitutes nor implies a guarantee, endorsement, or support of claims made of that product, publication, or service. Copyright © 2018, INFORMS Please scroll down for article—it is on subsequent pages INFORMS is the largest professional society in the world for professionals in the fields of operations research, management science, and analytics. For more information on INFORMS, its publications, membership, or meetings visit http://www.informs.org

Transcript of A New Unbiased Stochastic Derivative Estimator for ...

Page 1: A New Unbiased Stochastic Derivative Estimator for ...

This article was downloaded by: [73.132.152.220] On: 03 February 2018, At: 16:19Publisher: Institute for Operations Research and the Management Sciences (INFORMS)INFORMS is located in Maryland, USA

Operations Research

Publication details, including instructions for authors and subscription information:http://pubsonline.informs.org

A New Unbiased Stochastic Derivative Estimator forDiscontinuous Sample Performances with StructuralParametersYijie Peng, Michael C. Fu, Jian-Qiang Hu, Bernd Heidergott

To cite this article:Yijie Peng, Michael C. Fu, Jian-Qiang Hu, Bernd Heidergott (2018) A New Unbiased Stochastic Derivative Estimator forDiscontinuous Sample Performances with Structural Parameters. Operations Research

Published online in Articles in Advance 02 Feb 2018

. https://doi.org/10.1287/opre.2017.1674

Full terms and conditions of use: http://pubsonline.informs.org/page/terms-and-conditions

This article may be used only for the purposes of research, teaching, and/or private study. Commercial useor systematic downloading (by robots or other automatic processes) is prohibited without explicit Publisherapproval, unless otherwise noted. For more information, contact [email protected].

The Publisher does not warrant or guarantee the article’s accuracy, completeness, merchantability, fitnessfor a particular purpose, or non-infringement. Descriptions of, or references to, products or publications, orinclusion of an advertisement in this article, neither constitutes nor implies a guarantee, endorsement, orsupport of claims made of that product, publication, or service.

Copyright © 2018, INFORMS

Please scroll down for article—it is on subsequent pages

INFORMS is the largest professional society in the world for professionals in the fields of operations research, managementscience, and analytics.For more information on INFORMS, its publications, membership, or meetings visit http://www.informs.org

Page 2: A New Unbiased Stochastic Derivative Estimator for ...

OPERATIONS RESEARCHArticles in Advance, pp. 1–13

http://pubsonline.informs.org/journal/opre/ ISSN 0030-364X (print), ISSN 1526-5463 (online)

A New Unbiased Stochastic Derivative Estimator forDiscontinuous Sample Performances with Structural ParametersYijie Peng,a Michael C. Fu,b Jian-Qiang Hu,c Bernd HeidergottdaDepartment of Industrial Engineering and Management, College of Engineering, Peking University, Beijing 100871, China;bThe Robert H. Smith School of Business, Institute for Systems Research, University of Maryland, College Park, Maryland 20742;cDepartment of Management Science, School of Management, Fudan University, Shanghai 200433, China; dDepartment of Econometricsand Operations Research, Vrije Universiteit Amsterdam, 1081 HV Amsterdam, NetherlandsContact: pengyij[email protected], http://orcid.org/0000-0003-2584-8131 (YP); [email protected] (MCF); [email protected] (J-QH);[email protected] (BH)

Received: January 12, 2016Revised: October 12, 2016; June 12, 2017;June 27, 2017Accepted: July 27, 2017Published Online in Articles in Advance:February 2, 2018

Subject Classifications: simulation;inventory/production: sensitivity analysis;inverntory/production: uncertainty: stochasticArea of Review: Simulation

https://doi.org/10.1287/opre.2017.1674

Copyright: © 2018 INFORMS

Abstract. In this paper, we propose a new unbiased stochastic derivative estimator in aframework that can handle discontinuous sample performances with structural parame-ters. This work extends the three most popular unbiased stochastic derivative estimators:(1) infinitesimal perturbation analysis (IPA), (2) the likelihood ratio (LR) method, and(3) the weak derivative method, to a setting where they did not previously apply. Exam-ples in probability constraints, control charts, and financial derivatives demonstrate thebroad applicability of the proposed framework. The new estimator preserves the single-run efficiency of the classic IPA-LR estimators in applications, which is substantiated bynumerical experiments.

Funding: This work was supported in part by the National Natural Science Foundation of China[Grants 71571048, 71720107003, 71690232, 71371015], by the National Science Foundation [GrantsCMMI-0856256, CMMI-1362303, CMMI-1434419], by the Air Force Office of Scientific Research[Grant FA9550-15-10050], and by the Science and Technology Agency of Sichuan Province [Grant2014GZX0002].

Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2017.1674.

Keywords: simulation • stochastic derivative estimation • discontinuous sample performance • likelihood ratio • perturbation analysis •weak derivative

1. IntroductionStochastic derivative estimation is an active researcharea in simulation optimization, because it plays a cen-tral role in both sensitivity analysis and gradient-basedoptimization (see Asmussen and Glynn 2007). Three ofthe most popular unbiased stochastic derivative esti-mators are infinitesimal perturbation analysis (IPA),the likelihood ratio (LR) method (also known as thescore function method), and the weak derivative (WD)method; see Ho and Cao (1991), Glasserman (1991),Rubinstein and Shapiro (1993), Pflug (1996), and Fu(2006, 2008, 2015). In this work, we propose a new unbi-ased stochastic derivative estimator, which generalizesthree methods in a given framework.We consider stochastic models with discontinuous

sample performances in the presence of structural pa-rameters (parameters appearing directly in the sampleperformance). IPA requires continuity of the sampleperformance, whereas LR can easily handle discontin-uous sample performances but only deals with distri-butional parameters (parameters appearing in the dis-tribution of input random variables) and not structuralparameters. The proposed generalized likelihood ratio(GLR) method allows sample performances that may

be discontinuous with respect to structural parame-ters, where the “generalized” reflects the property thatthe GLR estimator reduces to the classic LR estimatorwhen there are no structural parameters.

A unified IPA-LR estimator was given by L’Ecuyer(1990) defined on a general probability space re-quiring—expressed in our framework—continuity ofthe sample performance when applied to a structuralparameter. We provide a representation for the bias ofthis IPA-LR estimator because of discontinuities usinga surface integration, and the GLR estimator is an unbi-ased modification of IPA-LR. More specifically, GLRis basically a summation of the classic LR estimatorand an additional term because of discontinuities. Thetechnique used in the development of GLR can alsodefine a WD estimator for a distribution whose classi-cal derivative cannot be directly defined, and therebyextends WD estimators to a setting that covers a broadrange of applications with discontinuities.

Examples in probability constraints (Andrieu et al.2010), control charts (Fu and Hu 1999), and financialderivatives (Wang et al. 2012), including new problemsthat cannot be easily addressedby existingmethods, forexample, compound options (see Online Appendix C),are treated by the GLR framework. The GLR estimator

1

Dow

nloa

ded

from

info

rms.

org

by [

73.1

32.1

52.2

20]

on 0

3 Fe

brua

ry 2

018,

at 1

6:19

. Fo

r pe

rson

al u

se o

nly,

all

righ

ts r

eser

ved.

Page 3: A New Unbiased Stochastic Derivative Estimator for ...

Peng et al.: Generalized Likelihood Ratio Method2 Operations Research, Articles in Advance, pp. 1–13, ©2018 INFORMS

has an analytical form, thus is computationally efficientto implement and preserves the single-run property ofthe classic IPA-LR estimator. GLR does not always givethemost statistically efficient estimator, for example, forsettings in which IPA applies, but is widely applicableand easily implementable.We briefly review existing methods that address dis-

continuities in various settings. Smoothed perturba-tion analysis (SPA), an extension of IPA, is designedto deal with discontinuous sample performances, butfinding what to condition on is generally problemdependent, and the estimation of the resulting con-ditional expectation may require function inversionand additional simulation (Gong and Ho 1987, Fu andHu 1997). Push-out LR treats discontinuous sampleperformances with structural parameters, but in gen-eral requires an explicit function inversion to push thestructural parameters out of the sample performanceand into the density (Rubinstein and Shapiro 1993). InOnline Appendix B, we show the GLR estimator is anextension of the push-out estimator when the struc-tural parameters cannot be explicitly pushed out.

Early work on addressing discontinuous samplepaths in derivative estimation for discrete event sys-tems (DES), for example, queueing, inventory, andmaintenance systems, includes Fu and Hu (1993),Fu (1994), and Heidergott (1999). Work focused ontreating discontinuous payoffs in Greeks estimationfor financial derivatives can be found in Fu and Hu(1995), Broadie and Glasserman (1996), Fournié et al.(1999), Heidergott (2001), Chen andGlasserman (2007),Liu and Hong (2011), and Wang et al. (2012). Morerecent work dealing with discontinuities in quantile,conditional value-at-risk, and distortion probabilitiesincludes Hong and Liu (2009), Fu et al. (2009), Hongand Liu (2009), Cao and Wan (2014), Jiang and Fu(2015), and Heidergott and Volk-Makarewicz (2016).

We summarize the contributions of our work asfollows:

• We derive a new stochastic derivative estimator ina general framework that handles discontinuous sam-ple performances with structural parameters.

• The new estimator is unbiased and single run,involving no explicit functional inversions.

• The new method uses a single framework to treata broad class of applications, many of which have beentreated separately in the literature.This work focuses on the theoretical foundations of

the GLR estimator in a general framework. Concur-rently, we have applied GLR to quantile sensitivity, dis-tribution sensitivity, and additional problems in Penget al. (2017, 2016a, b).

The rest of the paper is organized as follows. Generaltheory for the proposed GLR estimator is establishedin Section 2. Section 3 provides some applications andnumerical performances of the new method. The lastsection offers conclusions.

2. General TheoryIn this section, we formulate the problem in Section 2.1and provide a general theory for the GLR estimator ofproblem (2) beginning in Section 2.2 with an overviewof the key ideas for its derivation without getting intothe technical details, followed by a representation forthe bias of IPA-LR in Section 2.3; and concluding inSection 2.4 with the conditions justifying unbiasednessand corresponding theoretical results.

2.1. Problem FormulationWe consider the derivative of an expectation with re-spect to a scalar parameter θ. Taking θ scalar is withoutloss of generality, since a gradient could be obtained bytaking a vector of the derivatives. Suppose the sampleperformance (output random variable) is of the follow-ing form:

ϕ(g(X;θ))ϕ(g1(X1 , . . . ,Xn ;θ), . . . , gm(X1 , . . . ,Xn ;θ)),m 6 n , (1)

where• X denotes an n-dimensional real-valued random

vector, in symbols, X (X1 , . . . ,Xn) ∈ n , Xi , i 1,. . . , n, where Xi are input random variables with jointdensity f (·;θ),

• g(·;θ) (g1(·;θ), . . . , gm(·;θ)), gi(·;θ), i 1, . . . ,m,are functions that have sufficient differentiability withrespect to both the argument and parameter θ, and

• ϕ: m→ is a measurable function that is contin-uous almost everywhere (a.e.).In practice, a.e. continuity is often satisfied. For ex-

ample, it covers the case ϕ(y)∏mi1 hi(yi), where hi(yi)

has countablymany discontinuity points, which coversall examples in Section 3 and Online Appendix C. InOnline Appendix A, the theory for general measurablefunction ϕ can be found. In Section 2.4, wewill providesufficient conditions for our GLR estimator specifyingthe setup in the above bulleted list.

Our theoretical results will focus on the problem ofestimating the derivative of the expectation of sampleperformance (1), i.e.,

∂∂θ

Ɛ[ϕ(g(X;θ))] ∂∂θ

∫nϕ(g(x;θ)) f (x;θ) dx , (2)

where x (x1 , . . . , xn).

2.2. Overview of GLRIn this subsection, we provide a concise overview forderiving the GLR estimator for problem (2). Threeingredients for the derivation, i.e., function smoothing,integration by parts, and taking limits, will be pre-sented successively.

1. Function smoothing. Notice that a major differencebetween our setting (1) and the classic IPA-LR frame-work is that ϕ is not necessarily continuous, sowewant

Dow

nloa

ded

from

info

rms.

org

by [

73.1

32.1

52.2

20]

on 0

3 Fe

brua

ry 2

018,

at 1

6:19

. Fo

r pe

rson

al u

se o

nly,

all

righ

ts r

eser

ved.

Page 4: A New Unbiased Stochastic Derivative Estimator for ...

Peng et al.: Generalized Likelihood Ratio MethodOperations Research, Articles in Advance, pp. 1–13, ©2018 INFORMS 3

to first “tame” it to be more tractable. For ϕ continuousa.e., wewill show the existence of a smooth sequence offunctions that approximate a nonsmooth function in aclassic sense. To avoid imposing extra regularity condi-tions on the smoothed sequence for interchanging limitand integration later, we first truncate function ϕ to bebounded and then smooth the truncated function.The indicator function of a set S is 1S(z) 1 if z ∈ S,

0 otherwise. We truncate ϕ to ϕL(y) (τL ϕ)(y) ·1SL(y), where L is a positive number and denotes the

composition of two functions,

τL(z) z1[−L, L](z)+ L1(L,∞)(z) − L1(−∞,−L)(z),SL y ∈ m : yi ∈ [−L, L], i 1, . . . ,m.

Note that τL endows ϕL with boundedness, and if ϕis continuous at y, so is τL ϕ because τL is continu-ous. ϕL has bounded support SL, and the discontinuitypoints of ϕL are contained by the union of the discon-tinuous points of ϕ and the boundary of SL that haszero Lebesgue measure. Thus, the truncated functionhas the following two properties: (1) if ϕ is continuousa.e., so is ϕL; (2) for any y ∈ m , |ϕL(y)| 6 L, |ϕL(y)| 6|ϕ(y)|, and

limL→∞

ϕL(y)→ ϕ(y).

Next, we show the existence of a “mollifier” that ap-proximates and smooths the truncated measurablefunction under a continuity condition for ϕ.(A.0) ϕ: m → is a measurable function that is

continuous almost everywhere (a.e.).Theorem 1. For any ϕ satisfying condition (A.0), thereexists a sequence of smooth functions ϕε, L such that| |ϕε, L | |∞ supy∈m |ϕε, L(y)| 6 L, and

limε→0

ϕε, L(y) ϕL(y) a.e .

Remark 1. Theorem 1 can be proved by usingLemma 1, which provides an approximation of ϕL by acontinuous function, and Lemma 2, which provides anapproximation of the continuous function by a smoothfunction in Online Appendix A, where the functionapproximation for any measurable function in an pspace can also be found.While the mollifier mappings ϕε, L(y) in Theorem 1

are only implicitly given, it is possible to provide anexplicit construction in simple cases, as we show inthe following example for the case of an indicatormapping.Example 1. A continuous approximation function foran indicator function 1(−∞, 0]( · ) that has a discontinuitypoint at zero is given by

χε(z)

1 z < −ε,1− (z + ε)/(2ε) −ε 6 z 6 ε,0 z > ε.

(3)

Note that as ε→ 0, χε( · ) → 1(−∞, 0]( · ) a.e. Moreover,∂χε(z)/∂z −1/2 for −ε < z < ε, and 0 for |z | > ε,while χε(z) fails to be differentiable at the boundarypoints −ε, ε.

Suppose the support Ω ⊂ n of the density f ( · ) isindependent of θ; otherwise, a change of variable canbe implemented to push parameter θ out of the sup-port (see Wang et al. 2012). Then, we derive the deriva-tive estimator for the expectation of the truncated andsmoothed sample performance. Assuming sufficientsmoothness on g and f , and that derivative and expec-tation can be interchanged,

∂∂θ

Ɛ[ϕε, L(g(X;θ))] ∂∂θ

∫Ω

ϕε, L(g(x;θ)) f (x;θ) dx

∫Ω

sε, L(x;θ) f (x;θ) dx , (4)

where

sε, L(x;θ) m∑

i1

∂ϕε, L(y)∂yi

yg(x;θ)

∂gi(x;θ)∂θ︸ ︷︷ ︸

IPA part

+ϕε, L(g(x;θ))∂ ln f (x;θ)

∂θ︸ ︷︷ ︸LR part

. (5)

Notice that this is a straightforward application of theproduct rule of differentiation form analysis; see alsothe discussion in L’Ecuyer (1990), where the above wayof organizing the derivatives is referred to as IPA-LR.

The above estimator will be biased because of thetruncation and smoothing. In general, we should notexpect that directly letting ε go to zero could elimi-nate the bias, because the limit of the smoothed func-tion after differentiationwouldbe ill-behavedat thedis-continuity points being smoothed. Taking the explicitapproximation functionχε givenby (3), for example, thesmoothed function after differentiating χ′ε( · )would goto infinity at the discontinuity point being smoothed asε goes to zero, thus the integrand in (4) cannot be dom-inated by an integrable function. Actually, we know thelimit of χ′ε( · ) is zero a.e., so letting ε go to zero in (4) can-not lead to an unbiased estimator in this specific case.A sampleperformancewith indicator functions is a typ-ical setting where the IPA-LR estimator fails, and howto derive an unbiased estimator for this problem hasbeen studied widely in the literature (e.g., Glassermanand Gong 1990, Fu and Hu 1997, Hong and Liu 2010,Heidergott and Volk-Makarewicz 2016).

2. Integration by parts. The following treatment usesintegration by parts (Evans 1998), derived from the

Dow

nloa

ded

from

info

rms.

org

by [

73.1

32.1

52.2

20]

on 0

3 Fe

brua

ry 2

018,

at 1

6:19

. Fo

r pe

rson

al u

se o

nly,

all

righ

ts r

eser

ved.

Page 5: A New Unbiased Stochastic Derivative Estimator for ...

Peng et al.: Generalized Likelihood Ratio Method4 Operations Research, Articles in Advance, pp. 1–13, ©2018 INFORMS

Gauss-Green Theorem, to deal with the bias intro-duced by truncation and smoothing in (4). Denote theJacobian matrix of the vector of functions g by

Jg(x;θ)

©­­­­­­­­­­«

∂g1(x;θ)∂x1

∂g2(x;θ)∂x1

· · ·∂gm(x;θ)∂x1

∂g1(x;θ)∂x2

∂g2(x;θ)∂x2

· · ·∂gm(x;θ)∂x2

......

. . ....

∂g1(x;θ)∂xn

∂g2(x;θ)∂xn

· · ·∂gm(x;θ)∂xn

ª®®®®®®®®®®¬,

assuming g is continuously differentiable. By the chainrule,

∂ϕε, L(g(x;θ))∂xi

m∑j1

∂ϕε, L(y)∂y j

yg(x;θ)

∂g j(x;θ)∂xi

,

i 1, . . . , n ,

or in a matrix form:

∇x(ϕε, L g)(x;θ) Jg(x;θ) · ∇yϕε, L(y)|yg(x;θ) ,

where

∇x(ϕε,L g)(x;θ)(∂ϕε,L(g(x;θ))

∂x1, . . . ,

∂ϕε,L(g(x;θ))∂xn

)T

,

∇yϕε,L(y)|yg(x;θ)

(∂ϕε,L(y)∂y1

, . . . ,∂ϕε,L(y)∂ym

)T yg(x;θ)

,

the superscript T denoting transpose. We assume thefollowing regularity condition.(A.1) Matrix invertibility: There exists an m ×m sub-

matrix Jg(x;θ) of the Jacobian Jg(x;θ) such that Jg(x;θ)is invertible a.e., i.e., ν(N ) 0, where ν is the Lebesguemeasure, and

N x ∈ n : det( Jg(x;θ)) 0.

To simplify the notation, we assume m n through-out Section 2.2, so Jg(x;θ) Jg(x;θ). Let J−1

g be theinverse of Jg . For m < n, J−1

g is a generalized inverseof Jg , which will be discussed in Section 2.4. With thenotation and assumption (A.1), we have (a.e.)

∇yϕε, L(y)|yg(x;θ) J−1g (x;θ)∇x(ϕε, L g)(x;θ). (6)

Equation (6) transforms the gradient operation withrespect to y to a gradient operation with respect to x,which is pivotal for applying integration by parts, sowe call (6) the transformational equation. With the trans-formational equation,∫Ω

sε, L(x;θ) f (x;θ) dx

∫Ω

ϕε, L(g(x;θ)) ∂∂θ

f (x;θ) dx

+

∫Ω

(∂θg(x;θ))T J−1g (x;θ)∇x(ϕε, L g)(x;θ) f (x;θ) dx ,

(7)

where ∂θg(x;θ) (∂g1(x;θ)/∂θ, . . . , ∂gm(x;θ)/∂θ)T .By integration by parts, the second term on the right-hand side of (7) is equal to∫∂Ω

(ϕε, L g) f (∂θg)T J−1g v ds

−∫Ω

ϕε, L(g(x;θ))div((∂θg(x;θ))T J−1g (x;θ) f (x;θ)) dx ,

(8)

where the first term in (8) is a surface integration alongthe boundary ∂Ω with the unit norm vector v point-ing outward, ds is the surface measure on ∂Ω, and fora vector of functions h(x) (h1(x), . . . , hn(x)), its diver-gence is

div(h(x))n∑

i1

∂hi(x)∂xn

.

We will provide sufficient condition for the surfaceintegral to be zero, so that by introducing appropriatelikelihood ratios, we obtain∫Ω

sε, L(x;θ) f (x;θ) dx

∫Ω

ϕε, L(g(x;θ))∂ ln f (x;θ)

∂θf (x;θ) dx

−∫Ω

ϕε, L(g(x;θ))div((∂θg(x;θ))T J−1g (x;θ) f (x;θ)) dx.

In obtaining the estimator, we could alternatively split∂ f (x;θ)/∂θ into a difference of two densities obtainedfrom normalizing the positive and negative part of∂ f (x;θ)/∂θ; see Pflug (1996) or Heidergott and Leahu(2010) for details. In this paper, we keep the classic LRterm.

3. Taking limits. Notice that there is no differentia-tion operation on the truncated smoothed function ϕε, Lin (8) after applying integration by parts to move thedifferentiation to the other terms. Under appropriateregularity conditions that will be formalized in Sec-tion 2.4, we can take limits in (7) to obtain an unbi-ased estimator. Specifically, by letting ε→ 0 and L→∞in (4), (7), and (8), assuming suitable smoothness, andthat limit, derivative, integration can be interchanged,

∂∂θ

Ɛ[ϕ(g(X;θ))]

∫Ω

ϕ(g(x;θ))(∂ ln f (x;θ)

∂θ+ d(x;θ)

)f (x;θ) dx , (9)

where

d(x;θ) −div((∂θg(x;θ))T J−1g (x;θ) f (x;θ))/ f (x;θ).

Notice that the smoothing function sequence is notin (9) anymore after taking limits.

Dow

nloa

ded

from

info

rms.

org

by [

73.1

32.1

52.2

20]

on 0

3 Fe

brua

ry 2

018,

at 1

6:19

. Fo

r pe

rson

al u

se o

nly,

all

righ

ts r

eser

ved.

Page 6: A New Unbiased Stochastic Derivative Estimator for ...

Peng et al.: Generalized Likelihood Ratio MethodOperations Research, Articles in Advance, pp. 1–13, ©2018 INFORMS 5

The additional term d( · ) to the classic LR score function in theGLR estimator is because of the presence of structural parame-ters. An analytical expression for d( · ) given by (11) is derivedin Section 2.4, yielding a single-run unbiased estimator.

If we assume the density f goes to zero as x ap-proaches the boundary ∂Ω, the surface integrationin (9) could be zero under an appropriate integrabil-ity condition (see Online Appendix A). We also pro-vide here a high-level view to understand taking thelimit to achieve unbiasedness when the surface inte-gration part is zero. For two measurable functions ψ1and ψ2 under an appropriate integrability condition,we denote the linear operation in the p (inner productin 2) space by Rudin (1987)

〈ψ1 , ψ2〉 ∫nψ1(y)ψ2(y) dy.

For simplicity of illustration, we assume that g isinvertible in this section. By the change of variablesy g(x;θ), the right-hand side of (9) can be written as

〈ϕ, ψ · 1S 〉,

where ψ(y;θ)ω(g−1(y;θ);θ)|det(Jg(g−1(y;θ);θ))|−1 ·f (g−1(y;θ);θ),

ω(x;θ) ∂ ln f (x;θ)

∂θ+ d(x;θ),

andS (θ) y ∈ m : y g(x;θ), x ∈ n, (10)

with dependence on θ suppressed in 1S . Similarly,〈ϕε, L , ψ · 1S 〉 is the right-hand side of (7) subtractingthe surface integration term in (8). If ϕ is continuousa.e., the limit can be taken in a classic sense and unbi-asedness of GLR is obtained by interchanging limit andintegration, justified by the dominated convergencetheorem (Rudin 1987), i.e.,

limε→0〈ϕε, L , ψ · 1S 〉 〈lim

ε→0ϕε, L , ψ · 1S 〉 〈ϕL , ψ · 1S 〉.

In Online Appendix A, we show that for anymeasur-able function ϕ, the limits can be taken in an p spacewith integrability of an order higher than that of thefunction ψ · 1S , which is stronger than the first-orderintegrability condition required when ϕ is continuousa.e. Therefore, an interesting insight on how to dealwith discontinuous sample performances is that lesssmoothness in the sample performance can be compen-sated by stronger integrability conditions.

By a change of variables,

Ɛ[ϕ(g(X;θ))] 〈ϕ, φ · 1S 〉,

where

φ(y;θ) |det(Jg(g−1(y;θ);θ))|−1 f (g−1(y;θ);θ),

Figure 1. Expanding the Unbiasedness of the GLREstimator from Smoothed and Truncated ϕε, L to theGeneral ϕ That Is Not Necessarily Smooth and Bounded

⟨, L, · 1S⟩ = ⟨, L, · 1S⟩

Objective

⟨, · 1S⟩ ⟨, · 1S⟩=

and φ · 1S is the density of a distribution supported onthe image space S (θ). Figure 1 illustrates expandingthe unbiasedness of the GLR estimator from smoothedand truncated ϕε, L to the general ϕ that is not necessar-ily smooth and bounded through the three ingredientsof the derivation. Notice that through the three ingre-dients, a WD ψ · 1S is defined for a distribution φ · 1S

whose classical derivative cannot be directly definedin general, because the image S (θ) may be dependenton the parameter. GLR extends the classical setting ofWD (see Pflug 1996 and Heidergott and Leahu 2010)by allowing the existence of structural parameters.

2.3. Bias Representation of IPA-LRIn this subsection, we offer a representation of the biasfor IPA-LR because of discontinuities in the sample per-formance. To illustrate in a simplified setting, if θ is theparameter of interest and ϑ(X;θ) is the output sampleperformancewritten as a function of the vector of inputrandom variables X following joint density f , then IPAtreats integrals of the form∫

Ω

ϑ(x;θ) f (x) dx ,

for ϑ continuous, whereas LR treats integral of theform ∫

Ω

ϑ(x) f (x;θ) dx ,

where ϑ can be discontinuous. A natural generaliza-tion (L’Ecuyer 1990) is∫

Ω

ϑ(x;θ) f (x;θ) dx ,

but again ϑ must be continuous with respect to θ,so that IPA can be applied. Differentiating and inter-changing with integration then yields an IPA-LRestimator.

In our framework, the bias of the IPA-LR estimatorbecause of the discontinuity in the sample performancecan be understood by using integration by parts. Let Ξbe the set of discontinuity points of ϕ, and the inverseimage of g on Ξ is denoted by

Γ x ∈ n : g(x;θ) ∈ Ξ.

Dow

nloa

ded

from

info

rms.

org

by [

73.1

32.1

52.2

20]

on 0

3 Fe

brua

ry 2

018,

at 1

6:19

. Fo

r pe

rson

al u

se o

nly,

all

righ

ts r

eser

ved.

Page 7: A New Unbiased Stochastic Derivative Estimator for ...

Peng et al.: Generalized Likelihood Ratio Method6 Operations Research, Articles in Advance, pp. 1–13, ©2018 INFORMS

Suppose ϕ g is continuously differentiable outsideof Γ, and Γ is a smooth surface on n with zeroLebesgue measure. We can construct an open set Γε ⊃ Γwith Lebesgue measure ε (see the proof of Lemma 1 inOnline Appendix A). Assuming the surface integrationis zero on the boundary ∂Ω, similarly as the procedureto derive (9), we can use integration by parts to obtain∫

Ω\Γεs(x;θ) f (x;θ) dx

∫∂Γε

(ϕ g) f (∂θg)T J−1g v ds

+

∫Ω\Γε

ϕ(g(x;θ))ω(x;θ) f (x;θ) dx ,

where

s(x;θ) (∂θg(x;θ))T∇yϕ(y;θ)|yg(x;θ)

+ϕ(g(x;θ))∂ ln f (x;θ)

∂θ.

With an appropriate integrability condition, we canshrink Γε to Γ such that∫

Ω\Γs(x;θ) f (x;θ) dx I± + 〈ϕ, ψ · 1S 〉,

where

∫Γ+∪Γ−(ϕ g) f (∂θg)T J−1

g v ds ,

Γ+ and Γ− are two directed surfaces with opposite ori-entations, corresponding to the undirected surface Γ,and I± is a surface integration along the two surfaceswith opposite orientations. Fortunately, if ϕ g is con-tinuous on Γ, then I± 0. For the case where ϕ g isdiscontinuous on Γ, I± is a representation of the biasof the IPA-LR estimator s(x;θ). Introducing an appro-priate measure on the surfaces, the difference I± couldbe estimated as a difference of two (usually rather com-plex) simulation experiments; see Pflug (1996). In ourframework with discontinuities and structural param-eters, the GLR estimator can be viewed as an unbiasedmodification of IPA-LR through the three ingredientspresented in Section 2.2.

2.4. Technical Details for GLRIn this subsection, we provide rigorous conditions tojustify the unbiasedness of the GLR estimator, andderive an analytical form for the GLR estimator thatachieves single-run efficiency in simulation.The boundary of Ω is given by ∂Ω Ω\Ω0, where Ω

is the closure ofΩ andΩ0 is the interior ofΩ. We intro-duce a smoothness condition for the function vectorg(x;θ) and density f (x;θ).(A.2) Smoothness conditions: g(x;θ) is twice contin-

uously differentiable with respect to (x , θ) ∈ Ω × Θ,where Θ is an open neighborhood for the parameterof interest, and f (x;θ) is continuously differentiablewith respect to (x , θ) ∈ n ×Θ and goes to zero as xapproaches infinity.

Remark 2. If the support of the density is not allof n , the density f is zero ∀ x ∈ ∂Ω under condi-tion (A.2); otherwise, the density will be strictly posi-tive at x0 ∈ ∂Ω, thus is discontinuous at x0, because itis a cluster of points where the density is zero, whichviolates condition (A.2). Many distributions not sup-ported on the whole space are not continuous on thewhole space, for example, the exponential distribution,which is discontinuous at zero. However, in practice,we can implement a change of variables first to trans-form the support of the density to the whole space tosatisfy condition (A.2), for example, for the exponentialdistribution, we can use transformation z ln x. WhenXi , i 1, . . . , n, are independent, a systematic way tocarry out the change of variables is provided in OnlineAppendix A. An example of a change of variables inthe dependent case can be found in the maintenancesystems example in Online Appendix C.

For m < n, suppose condition (A.1) holds. Withoutloss of generality, we can assume the submatrix com-prising the first m rows of Jg has rank m a.e.; otherwise,we can reorder the indices of gi , i 1, . . . , n, appropri-ately. By the chain rule,

∇x(ϕε, L g)(x;θ) Jg(x;θ) · ∇yϕε, L(y;θ)|yg(x;θ) ,

where x (x1 , . . . , xm), Jg is the submatrix comprisingof the first m rows of Jg , and

∇x(ϕε,L g)(x;θ)(∂ϕε,L(g(x;θ))

∂x1, . . . ,

∂ϕε,L(g(x;θ))∂xm

)T

,

∇yϕε,L(y;θ)|yg(x;θ)

(∂ϕε,L(y)∂y1

, . . . ,∂ϕε,L(y)∂ym

)T yg(x;θ)

,

so transformational Equation (6) is adjusted to be

∇yϕε, L(y;θ)|yg(x;θ) J−1g (x;θ)∇x(ϕε, L g)(x;θ).

We only need to replace J−1g (x;θ) and ∇x(ϕε, L g)(x;θ)

in (7) with J−1g (x;θ) and ∇x(ϕε, L g)(x;θ), respectively,

to make the corresponding equation hold for m < n.Let y y if m n and y (y , xm+1:n) if m < n, andg(x;θ) g(x;θ) if m n and g(x;θ) (g(x;θ), xm+1:n)if m < n. We introduce the following function invert-ibility condition.

(A.3) Function invertibility: There exist sets Ωi , i 1, . . . , l, such that Ω\N

⋃i1,...,lΩi and g(·;θ): Ωi →

S i(θ) is invertible.

Remark 3. Under conditions (A.1) and (A.2), for anyx ∈ Ω\N , there exists a neighborhood Ox × O g(x) of(x , g(x;θ)) and a unique differentiable inverse functiong−1(·;θ): O g(x)→ Ox , by the implicit function theorem(Lang 2013). IfΩ is compact and N , condition (A.1)implies (A.3) by the Heine-Borel theorem (Lang 2013).

Dow

nloa

ded

from

info

rms.

org

by [

73.1

32.1

52.2

20]

on 0

3 Fe

brua

ry 2

018,

at 1

6:19

. Fo

r pe

rson

al u

se o

nly,

all

righ

ts r

eser

ved.

Page 8: A New Unbiased Stochastic Derivative Estimator for ...

Peng et al.: Generalized Likelihood Ratio MethodOperations Research, Articles in Advance, pp. 1–13, ©2018 INFORMS 7

Next, we introduce the remaining regularity condi-tions required to establish the unbiasedness of the GLRestimator for problem (2).

(A.4) Integrability conditions:(i) For any ε, L > 0,∫

Ω

supθ∈Θ|sε, L(x;θ) f (x;θ)| dx <∞,

where sε, L is defined by (5).(ii) d( · ) is absolutely integrable:

Ɛ[|d(X;θ)|]∫Ω

|d(x;θ)| f (x;θ) dx <∞.

(iii) If Ωε is a sequence of bounded open setssuch that Ωε ⊂ Ωε′ if ε′ < ε and Ω

⋃εΩε, then there

exists 0 < c < 1 such that

limε→0

∫∂Ωε

|(∂θg)T J−1g v | | f |c ds <∞.

(iv) For i 1, . . . , l,∫n|ϕ(y) ∨ 1| |φ( y;θ) · 1S i (θ)( y)| d y <∞,∫

n|ϕ(y) ∨ 1 | sup

θ∈Θ|ψ( y;θ) · 1S i (θ)( y)| d y <∞,

where z1 ∨ z2 maxz1 , z2, and φ and ψ equal zerooutside of S i .

Remark 4. Condition (i) justifies the interchange ofderivative and integration in (4). Conditions (i)–(iii)together with condition (A.2) justify the use of integra-tion by parts in (8). Condition (iii) together with condi-tion (A.2) also guarantee the surface integration I ∂Ω iszero. Condition (iv) together with condition (A.4) jus-tifies the unbiasedness of the GLR estimator by inter-changing limits with respect to ε and L and derivativewith respect to θ, which requires a certain uniformconvergence. Under (A.2), the inverse image of f atzero, i.e., x: f (x) 0, is a closed set; thus the supportΩ is an open set and the monotone expansion frombounded open sets Ωε to Ω in (iii) is always feasible.

Remark 5. The generality of GLR comes at the pricethat only the existence of ϕε, L and g−1 is provided,thus the integrability condition (A.4) cannot generallybe checked in practice. An exception is the case thatϕε, L and g−1 can be explicitly given.

Theorem 2. Under continuity condition (A.0), the matrixinvertibility condition (A.1), smoothness condition (A.2),function invertibility condition (A.3), and integrability con-dition (A.4),

∂∂θ

Ɛ[ϕ(g(X;θ))] Ɛ[ϕ(g(X;θ))ω(X;θ)],

where ω(x;θ) ∂ ln f (x;θ)/∂θ+ d(x;θ), and

d(x;θ)m∑

i1( J−1

g (x;θ)∂xiJg(x;θ) J−1

g (x;θ)ei)T∂θg(x;θ)

− trace( J−1g (x;θ)∂θ Jg(x;θ))

− (∂θg(x;θ))T J−1g (x;θ)∇x ln f (x;θ), (11)

where ∂z Jg means differentiating every element in the matrixJg with respect to z, and

∇x ln f (x;θ) (∂ ln f (x;θ)/∂x1 , . . . , ∂ ln f (x;θ)/∂xm)T .

Remark 6. The proof of the Theorem 2 can be found inOnline Appendix A, where an alternative justificationof the unbiasedness for any measurable function ϕ canalso be found. The GLR estimator for problem (2) isgiven by ϕ(g(X;θ))ω(X;θ), and has an analytical formthat can be calculated by differentiation, matrix inver-sion, and elementary operations. If the sample perfor-mance contains no structural parameters, the GLR esti-mator simplifies to the classic LR estimator, because allthe ∂θ terms would be zero, and hence d(x;θ) wouldbe zero. For general measurable ϕ discussed in OnlineAppendix A, it requires a stronger condition, i.e., con-dition (A.5), which is more difficult to verify than con-dition (A.4), to justify the unbiasedness of GLR.

We offer a more tractable version of condi-tions (A.1)–(A.4) for the special case that (a) the deriva-tives of ϕε, L(y) exist a.e. for each ε and are uniformlybounded in y and ε on all points of differentiability;(b) the input random variables are independent andsupported on thewhole space; (c) g: n→n is a linear(affine) transformation:

g(x;θ) A(θ)x + B(θ),

where A(θ) is a parameterized n × n matrix and B(θ)is a parameterized n×1 vector, with sufficient smooth-ness. Push-out LR applies to this linear transformationand would lead to the same formula as GLR in thiscase (see Online Appendix B), but GLR applies moregenerally to nonlinear g where the structural parame-ter is not easy to be explicitly pushed out (see OnlineAppendix C).

Condition (a) is satisfied by the indicator functionin Example 1 and all applications presented in Sec-tion 3; condition (c) applies to most cases where SPAand push-out LR are easily implementable. Thus, thisspecial set of conditions applies to a rich class of prob-lems that are of importance in applications. It is easyto show that conditions (A.1) and (A.2) are satisfiedif A(θ) is invertible, and f satisfies the smoothness incondition (A.2). We can easily calculate

Jg(x;θ) A(θ), g−1(y;θ) A−1(θ)(y − B(θ)),d(x;θ)− trace(A−1(θ) ∂θA(θ))

− (∂θA(θ)x + ∂θB(θ))TA−1(θ)∇x ln f (x;θ).

Dow

nloa

ded

from

info

rms.

org

by [

73.1

32.1

52.2

20]

on 0

3 Fe

brua

ry 2

018,

at 1

6:19

. Fo

r pe

rson

al u

se o

nly,

all

righ

ts r

eser

ved.

Page 9: A New Unbiased Stochastic Derivative Estimator for ...

Peng et al.: Generalized Likelihood Ratio Method8 Operations Research, Articles in Advance, pp. 1–13, ©2018 INFORMS

Since g is invertible, condition (A.3) is satisfied. Inte-grability conditions (i)–(iii) in (A.4) can be simplifiedas follows:

(i’) Suppose∫n

supθ∈Θ|∂θ f (x;θ)| dx <∞,

maxi1,...,n

∫n

supθ∈Θ|eT

i (∂θA(θ)x + ∂θB(θ)) f (x;θ)| dx <∞.

(ii’) For i 1, . . . , n,∫

|∂xifi(xi ;θ)| dxi <∞,

|xi∂xifi(xi ;θ)| dxi <∞,

where fi is the marginal density of Xi .(iii’) For i 1, . . . , n,

limxi→∞|xi | fi(xi ;θ) 0,

|xi | fi(xi ;θ) dxi <∞.

Notice that for this special case, the integrability con-dition (A.4) involves no implicitly constructed func-tions. Uniform integrability is often used to justify theunbiasedness of classic IPA and LR by applying themean value theorem and dominated convergence the-orem (Glasserman 1991). Thus, checking the integra-bility condition required to justify the unbiasedness ofthe GLR estimator for this special case is no more dif-ficult than that of IPA and LR in principle. For f and Aindependent of θ,

ω(x;θ)−(∂θB(θ))TA−1∇x ln f (x),

integrability condition (i’) and the second inequalityin (ii’) are automatically satisfied, and integrabilityconditions (iv) can be further simplified as follows:(iv’) For i 1, . . . , n,∫

n|ϕ(y) ∨ 1| | f (A−1(y − B(θ)))| dy <∞,∫

n|ϕ(y) ∨ 1| sup

θ∈Θ

∂θbi(θ)∂xif (x)|xA−1(y−B(θ))

dy <∞,

where B(θ) (b1(θ), . . . , bn(θ))T .

3. ApplicationsIn this section, we consider three applications previ-ously analyzed by three different methods in the liter-ature. Applications in maintenance systems and com-pound options can be found in Online Appendix C.

3.1. Probability Constraints (Andrieu et al. 2010)We consider an investment problem where the capitalis borrowed at interest rate r. Let θ1 be the proportionof capital invested in a bondwith fixed rate b, and θ2 bethe proportion of capital invested in a risky asset with arandom rate X, where Ɛ[X]> r, whichmeans that there

is a positive risk premium. The remaining proportionof capital 1− θ1 − θ2 is for consumption. Andrieu et al.(2010) aim to maximize the sum of consumption satis-faction and the expected final capital subject to proba-bility constraint

P((1+ b)θ1 + (1+X)θ2 > 1+ r) > π,

which means that the probability of repayment is atleast as high as π. Since the parameters of interest θ1and θ2 appear in the probability, which is the expecta-tion of an indicator function that is discontinuous, IPAand LR do not apply. Andrieu et al. (2010) construct anexplicit mollifier with a tuning parameter to addressthe discontinuity, similar to the ε-approximation inExample 1, which leads to a biased derivative estima-tor. For the actual optimization algorithm, they let thetuning parameter go to zero as the number of iterationsteps in stochastic approximation (SA) goes to infinity.For such a Kiefer-Wolfowitz type algorithm (Kushnerand Yin 2003), the step size and the tuning parameterin the mollifier needs to be well tuned at proper ratesas the step goes to infinity in SA, which is a nontriv-ial task even in simple examples. Fortunately, as wewill show in the following, applying GLR we obtainan unbiased derivative estimator, which allows for aRobbins-Monro type algorithm (Kushner andYin 2003)and thereby overcomes the difficulty of reducing thebias induced by the mollifiers at the correct rate.

We estimate∂P((1+ b)θ1 + (1+X)θ2 > 1+ r)

∂θ,

for θ θ1 0.4, θ2 0.4, r 0.05, b 0.1, X a normalrandom variable with mean µ 0.2 and variance σ2

0.04. Since this example falls into the special case at theend of Section 2, conditions (A.1)–(A.4) can be easilychecked for this example. The GLR estimator is

1(1+ b)θ1 + (1+X)θ2 > 1+ r(1+ b)(X − µ)

θ2σ2 .

Andrieu et al. (2010) provide an approximation by con-volution (AC) method and propose

h(

1+ r − (1+ b)θ1 − (1+X)θ2

δ

)(1+ b),

where h(x) 3(1− x2)I(x)/4, I(x) 1 if −1 6 x 6 1 andI(x) 0 otherwise. We compare GLR with AC andfinite difference method with common random num-bers (FDC). AC(δ) and FDC(δ) denote AC with tuningparameter δ and FDC with perturbation size δ, respec-tively. The standard error is defined by the estimatedstandard deviation divided by the square root of thenumber of samples.

From Table 1, we see that both AC and FDC suf-fer from large bias when δ 0.1 and large variance

Dow

nloa

ded

from

info

rms.

org

by [

73.1

32.1

52.2

20]

on 0

3 Fe

brua

ry 2

018,

at 1

6:19

. Fo

r pe

rson

al u

se o

nly,

all

righ

ts r

eser

ved.

Page 10: A New Unbiased Stochastic Derivative Estimator for ...

Peng et al.: Generalized Likelihood Ratio MethodOperations Research, Articles in Advance, pp. 1–13, ©2018 INFORMS 9

Table 1. Derivatives Estimates of the Probability Constraintwith Respect to θ, Based on 106 Independent Replications(Mean± Standard Error)

GLR AC(0.1) AC(0.01) FDC(0.1) FDC(0.01)

1.46± 0.006 1.77± 0.003 1.47± 0.010 3.49± 0.005 1.62± 0.009

when δ 0.01. FDC(0.001) outputs 1.46± 0.003 (mean± standard error) based on 108 independent replica-tions, which substantiates the unbiasedness of GLR. Inaddition, the performance of AC heavily relies on thechoice of themollifier h, and Andrieu et al. (2010) showthat some choices perform very poorly, for example,h(x) I(x). In contrast, GLR is an unbiased derivativeestimator without any mollifier and tuning parameterappearing in the final estimator.

3.2. Control Charts (Fu and Hu 1999)In statistical process control, control charts are used todetermine if a manufacturing or business process is ina state of statistical control. A performance of interestis the average run length

ARL(θ) Ɛ[N]∞∑

n1nP(N n),

where N is the time when the system declares out-of-control, i.e.,

N mini: Yi < [θ1, i , θ2, i],

andYi ψi(Xi ,Yi−1;θ), i > 1, Y1 X1 ,

where θ is a generic parameter that can be θ1, i and θ2, iin this example, Xi is the output of the ith sample, Yi isthe (observable) test statistic after the ith sample, ψi isthe transition function of a Markov chain Yi, θ1, i andθ2, i are the lower control limit and upper control limitfor the ith test statistic, respectively.We want to estimate the sensitivity of the average

run length with respect to parameter θ. To treat thisproblem in our framework, we can rewrite the sampleperformance as

Vn(X1 , . . . ,Xn ;θ) 1N n

( n−1∏i1

10 < g(n)i (X1 , . . . ,Xi ;θ) < 1)

× (1− 10 < g(n)n (X1 , . . . ,Xn ;θ) < 1),

g(n)i (X1 , . . . ,Xi ;θ)Yi − θ1, i

θ2, i − θ1, i,

(12)

where ∏0i1 1. The system will output samples of dif-

ferent statistical behaviors when it is in control and outof control. Conditional on Z z, where Z R/∆, R is

a (unobservable) random duration for the system to goout of control and follows a distribution with densityq0( · ), and ∆ is the sampling interval (duration betweentwo monitoring epochs), the conditional density of Xiis given by

fi(xi | z) 1i < zq1(xi)+ 1i > zq2(xi),where q1( · ) and q2( · ) are the densities of the distribu-tions of the sampling process when in control and outof control, respectively. Conditional on Z, (X1 , . . . ,Xn)are independent, and their conditional joint density isgiven by

f (n)(x1:n ; z)n∏

i1f (xi | z),

where x1:n (x1 , . . . , xn).The derivative estimators in Fu and Hu (1999) re-

quire extra simulation to estimate a conditional expec-tation term and an explicit inverse for ψi , for example,for Shewhart chart, Yi Xi , and for EWMA chart, Yi

αXi +(1−α)Yi−1, whereas our framework does not havethese requirements.

In the numerical experiment, we let θ1, i θ1 andθ2, i θ2 and test the sensitivity with respect to θ

θ2, where classical IPA and LR do not apply. Supposethe time to go out of control follows an exponentialdistribution, in-control sample output follows a normaldistribution with mean µ0 and variance σ2, and out-of-control sample output follows a normal distributionwith mean µ1 and variance σ2. The GLR estimator forthe EWMA control chart is given by

∇x1:nf (n)(x1:n ; z)−

(xi −(µ01i < z+µ11i > z)

σ2

)T

i1,...,n,

∂θg(n)(x;θ)

− 1θ2−θ1

(g(n)1 (x1;θ), . . . , g(n)n (x1 , . . . , xn , ;θ))T ,

and

J(n)g 1

θ2−θ1

·

©­­­­­­­«

1 1−α (1−α)2 · ·· (1−α)n−2 (1−α)n−1

0 α α(1−α) ··· α(1−α)n−3 α(1−α)n−2

0 0 α · ·· α(1−α)n−4 α(1−α)n−3

......

.... . .

......

0 0 · ·· · ·· α α(1−α)0 0 · ·· · ·· 0 α

ª®®®®®®®¬n×n

.

It is easy to verify the conditions (A.1)–(A.4) for theunbiasedness of the GLR estimator for ∂θP(N n)/∂θ,which falls into the special case at the end of Section 2.The GLR estimator for ∂ARL/∂θ is given by

N[

Nθ2 − θ1

− (∂θg(N)(X1 , . . . ,XN ;θ))T

· (J(N)g )−1∇x1:nf (n)(x1:n ; Z)|nN, x1:nX1:n

],

Dow

nloa

ded

from

info

rms.

org

by [

73.1

32.1

52.2

20]

on 0

3 Fe

brua

ry 2

018,

at 1

6:19

. Fo

r pe

rson

al u

se o

nly,

all

righ

ts r

eser

ved.

Page 11: A New Unbiased Stochastic Derivative Estimator for ...

Peng et al.: Generalized Likelihood Ratio Method10 Operations Research, Articles in Advance, pp. 1–13, ©2018 INFORMS

Table 2. Derivative of Shewhart Control Chart with Respectto Upper Control Limit θ2 (Mean± Standard Error)

µ1 1.0 µ1 3.0

10,000 repsFDC(0.1) 70.9± 2.2 3.63± 0.34FDC(0.01) 67.5± 7.0 3.96± 1.3PAL 59.5± 1.3 3.64± 0.06PAR 61.8± 0.6 3.91± 0.18GLR 61.0± 4.0 3.32± 1.1

1,000,000 repsFDC(0.1) 71.2± 0.2 3.54± 0.03FDC(0.01) 63.1± 0.6 3.80± 0.1GLR 62.8± 0.4 3.77± 0.1

True value 63.0 3.73

where X1:n (X1 , . . . ,Xn). The unbiasedness for the es-timator of ∂ARL/∂θ can be justified by imposing addi-tional integrability condition

∞∑n1

n supθ∈Θ

∂θP(N n)∂θ

<∞,

which justifies interchange of summation and differen-tiation by mean value theorem and dominated conver-gence theorem.We test the performance for the Shewhart chart, be-

cause the analytical form of the derivative can be usedto access the accuracy of the estimates. We compareGLR with the PAL and PAR estimators in Fu and Hu(1999), and FDC under the same setting as in Fu andHu (1999): α 1, sampling frequency ∆ 1, time togo out of control following an exponential distributionwithmean 20, sample output following a normal distri-bution with in-control mean µ0 0 and out-of-controlmean µ1 1.0 and 3.0, and variance σ 1, lower andupper control limits θ1 −2.81 and θ2 2.81, respec-tively, for all i ∈ .The results of PAL and PAR are taken from Fu andHu

(1999). From Table 2, we can see FDC suffers from largebias when δ 0.1 and large variance when δ 0.01.Although the variances of PAL and PAR are smallerthan that of GLR, they achieve this by running extrasimulation, so it is not clear which is superior aftertaking the extra simulation into account.

3.3. Barrier Options (Wang et al. 2012)To cover more applications, we consider a sample per-formance more general than (1) as follows:

Q(X;θ) ϑ(V(X;θ);θ), (13)

where ϑ(v;θ), v (v1 , . . . , vk), is differentiable withrespect to θ and v, V(x;θ) (V1(x;θ), . . . ,Vk(x;θ)),

Vi(x;θ) ϕi(g1(x;θ), . . . , gmi(x;θ)),

and ϕi( · ) is a measurable function with mi 6 n, i 1, . . . , k. With additional mild regularity conditions,the GLR estimator for sample performance (13) is

∂ϑ(v;θ)∂θ

vV(X;θ)

+Q(X;θ)ω(X;θ),

where ω is defined by (11) with g(x;θ) (g1(x;θ),. . . , gm(x;θ)), where m maxi1,...,k mi and Jg being anm ×m submatrix of the Jacobian of g. The only majordifferences in the derivation are as follows: the IPA-LRsε, L(x;θ) after truncation and smoothing in (4) needsto be changed to

(∂θg(x;θ))TΣ(x;θ)∇vϑ(v;θ)|vV(x;θ) ,

where

Σ(x;θ) ((∇y ϕ1(y;θ))T , . . . , (∇y ϕk(y;θ))T)T |yg(x;θ) ,

∇y ϕi(y;θ)(∂ϕi(y)∂y1

, . . . ,∂ϕi(y)∂ymi

, 0, . . . , 0)T

1×m

,

i 1, . . . , k ,

ϕi , i 1, . . . , k, denote the truncated and smoothedfunctions, and by the chain rule,

∇x(ϑ V)(x;θ) Jg(x;θ)Σ(x;θ)∇vϑ(v;θ)|vV(x;θ) ,

so the transformational Equation (6) is changed to

(∂θg(x;θ))TΣ(x;θ)∇vϑ(v;θ)|vV(x;θ)

(∂θg(x;θ))T J−1g (x;θ) ∇x(ϑ V)(x;θ).

A similar analysis in Section 2 follows, and the unbi-asedness of the GLR estimator for sample perfor-mance (1) can be straightforwardly extended to themore general case (13).

In a particular probability space where simula-tion can be naturally implemented, sample perfor-mance (13) covers the IPA-LR framework if we viewϑ(v;θ) as a simulation model with structural param-eter θ and a vector of input random variables v

V(X;θ), where X is the vector of uniform randomnumbers. Our framework also coversmany discontinu-ous settings previously studied in literature, for exam-ple, the setting ϕ(g1(x;θ), g2(x;θ)), where ϕ(y1 , y2) y11y2 6 0, in a kernel-based method (Liu and Hong2011) and support independent unified likelihoodratio and infinitesimal perturbation analysis (SLRIPA)(Wang et al. 2012).Framework (13) can easily handle sensitivity esti-

mates of barrier options, which were first addressedin Wang et al. (2012) using SLRIPA. In this exam-ple, we assume the underlying asset process fol-lows a geometric Brownian motion process, i.e., St

S0e (r−σ2/2)t+σBt , where St and Bt are the asset priceand standard Brownian motion at time t, respectively.

Dow

nloa

ded

from

info

rms.

org

by [

73.1

32.1

52.2

20]

on 0

3 Fe

brua

ry 2

018,

at 1

6:19

. Fo

r pe

rson

al u

se o

nly,

all

righ

ts r

eser

ved.

Page 12: A New Unbiased Stochastic Derivative Estimator for ...

Peng et al.: Generalized Likelihood Ratio MethodOperations Research, Articles in Advance, pp. 1–13, ©2018 INFORMS 11

The numerical results for the underlying asset follow-ing a jump-diffusion process can be found in OnlineAppendix C.An up-and-out barrier option is a financial derivative

that becomes worthless if the path of the underlyingasset exceeds the barrier H. For a European up-and-outbarrier option, the payoff in the discrete-time setting isgiven by

e−n∆(Sn∆ −K)1

maxi1,...,n−1

Si∆ < H1K < Sn∆ < H,

where T n∆ is the time to maturity and K is the strikeprice (K < H). We have that for i 1, . . . , n − 1,

Si∆/H exp(gi(X1 , . . . ,Xi ;θ)),Sn∆/K explog(H/K)gn(X1 , . . . ,Xn ;θ),

and by taking log and simple algebra,

1Si∆ < H 1gi(X1 , . . . ,Xi ;θ) < 0,1K < Sn∆ < H 11 < Sn∆/K < H/K

10 < gn(X1 , . . . ,Xi ;θ) < 1,

where for i 1, . . . , n − 1,

gi(X1 , . . . ,Xi ;θ) log(S0/H)+ σ√∆

i∑j1

X j + i(r− σ

2

2

)∆,

gn(X1 , . . . ,Xn ;θ)

1

log(H/K)

(log(S0/K)+ σ

√∆

n∑j1

X j + n(r− σ2

2 )∆).

Thus, the sample performance of the payoff for Euro-pean up-and-out barrier option can be rewritten by

Q(X1 , . . . ,Xn ;θ) e−rn∆Kexp[log(H/K)V1(X1 , . . . ,Xn ;θ)] − 1·V2(X1 , . . . ,Xn ;θ),

where θ is a generic parameter that can be r, ∆, K, H,S0, and σ in this example, ∆ is the discrete step size,Xi (Bi∆ − B(i−1)∆)/

√∆, i 1, . . . , n, which is standard

normally distributed if the underlying asset followsgeometric Brownian motion,

V1(X1 , . . . ,Xn ;θ) gn(X1 , . . . ,Xn ;θ),V2(X1 , . . . ,Xn ;θ) ϕ(g(X1 , . . . ,Xn ;θ)),

where

ϕ(y1 , . . . , yn)n−1∏i1

1yi < 010 < yn < 1.

Then we test the performance of GLR for the Euro-pean barrier options, and compare it with the SLRIPA,SPA, and FDC, which are tested in Wang et al. (2012)using the following parameter values: σ 0.1, r 0.05,

H 110, S0 K 100, n∆ 1. For FDC, we perturbθ to θ + δ with δ 0.2. As in Wang et al. (2012), wetest sensitivity with respect to parameter θ H, whereclassical IPA and LR do not apply. GLR can also beapplied straightforwardly to estimate the sensitivitywith respect to other parameters.

In this example, we have ∇x ln f (x;θ)−(x1 , . . . , xn),where f is the joint density of n i.i.d. standard normalrandom variables, ∂θg(x;θ) −(1/H)(1, . . . , 1, gn(x1 ,. . . , xn , ;θ)/log(H/K)),

Jg σ√∆

©­­­­­«1 1 · · · 1 (log(H/K))−1

0 1 · · · 1 (log(H/K))−1

......

. . ....

...0 0 · · · 1 (log(H/K))−1

0 0 · · · 0 (log(H/K))−1

ª®®®®®¬,

J−1g

1σ√∆

©­­­­­«1 −1 · · · 0 00 1 · · · 0 0...

.... . .

......

0 0 · · · 1 −10 0 · · · 0 log(H/K)

ª®®®®®¬.

We can construct a continuous and piecewise smoothfunction as

ϕε(y1 , . . . , yn)n−1∏i1χε(yi)(χε(yn − 1) − χε(yn)).

Notice that this example falls into the special case dis-cussed at the end of Section 2. Conditions (A.1)–(A.3),and (i)–(iii) in (A.4) are obviously satisfied. Since theinput random variables follow normal distributions,integrability condition (iv) in (A.4) can justified simi-larly as in the proof of Lemma 2 in Online Appendix A.Thus, the GLR estimator is guaranteed to be unbiasedand has the following form:

e−rT KH

1

maxi1,...,n−1

gi(X1 , . . . ,Xi ;θ) < 0

· 10 < gn(X1 , . . . ,Xn ;θ) < 1

×

exp[log

(HK

)gn(X1 , . . . ,Xn ;θ)

]− 1

·[ (1− gn(X1 , . . . ,Xn ;θ))Xn −X1

σ√∆

+1

log(H/K)

]+ exp

[log

(HK

)gn(X1 , . . . ,Xn ;θ)

]gn(X1 , . . . ,Xn ;θ)

.

The SLRIPA and SPA estimators for this example canbe found in Online Appendix C. The GLR and SLRIPAestimators have analytical forms and are single-run,while the SPA estimator includes a conditional expec-tation term that is not in analytical form and requiresextra simulation to estimate.

The numerical results are shown in Table 3, wherethe results for SLRIPA, SPA, and FDC are taken from

Dow

nloa

ded

from

info

rms.

org

by [

73.1

32.1

52.2

20]

on 0

3 Fe

brua

ry 2

018,

at 1

6:19

. Fo

r pe

rson

al u

se o

nly,

all

righ

ts r

eser

ved.

Page 13: A New Unbiased Stochastic Derivative Estimator for ...

Peng et al.: Generalized Likelihood Ratio Method12 Operations Research, Articles in Advance, pp. 1–13, ©2018 INFORMS

Table 3. Sensitivity of European Barrier Option withRespect to Parameter H, Based on 2,000 IndependentReplications (Mean± Standard Error)

n 10 n 20 n 30

FDC 0.329± 0.084 0.310± 0.077 0.271± 0.071SPA 0.286± 0.011 0.262± 0.013 0.254± 0.015SLRIPA 0.276± 0.018 0.251± 0.024 0.266± 0.028GLR 0.278± 0.020 0.263± 0.025 0.255± 0.029

Wang et al. (2012). Note that the SPA estimator requiresadditional simulation to estimate some conditionalexpectation terms (see Online Appendix C), and thevariances of these estimator are not reflected in thereported standard errors in Table 3. All results areobtained from 2000 independent replications for eachestimator. GLR and SLRIPA have comparable vari-ances and superior computational efficiency. SPA hasthe lowest variance but requires a substantially higheramount of computation because of the additional sim-ulations. FDC has the largest variances in this example,and is generally biased. The convergence rate of FDC interms of mean-squared error can be found in L’Ecuyerand Perron (1994).

4. ConclusionsIn this paper, we propose a systematic procedure usingfunction smoothing and integration by parts for deriv-ing an unbiased derivative estimator for discontinuoussample performances with structural parameters. Forthe particular model specified in Section 2.1, the GLRestimator extends IPA, LR, and WD to a setting wherethey did not previously apply. Applications in proba-bility constraints, statistical process control, and finan-cial derivatives can be treated in our general frame-work. Although GLR is not always the best alternative,numerical experiments substantiate its overall single-run efficiency. The proposed GLR estimator is compat-ible with variance reduction techniques such as split-ting in the WD method, conditional Monte Carlo inSPA, and renewal theory in steady-state simulation.How to further reduce the variance of GLR is an impor-tant direction for future research.

AcknowledgmentsThe authors thank the referees, editors, and Cheng Jie fortheir comments and suggestions that have greatly improvedthe paper.

ReferencesAndrieu L, Cohen G, Vázquez-Abad FJ (2010) Gradient-based sim-

ulation optimization under probability constraints. Eur. J. Oper.Res. 212(2):345–351.

Asmussen S, Glynn PW (2007) Stochastic Simulation: Algorithms andAnalysis, Vol. 57 (Springer, New York).

BroadieM, Glasserman P (1996) Estimating security price derivativesusing simulation. Management Sci. 42(2):269–285.

Cao X-R, Wan X (2014) Sensitivity analysis of nonlinear behaviorwith distorted probability. Math. Finance 27(1):115–150.

Chen N, Glasserman P (2007) Malliavin Greeks without Malliavincalculus. Stochastic Processes and Their Appl. 117(11):1689–1723.

Evans LC (1998) Partial Differential Equations (American Mathemati-cal Society, Providence, RI).

Fournié E, Lasry J-M, Lebuchoux J, Lions P-L, Touzi N (1999) Appli-cations of Malliavin calculus toMonte Carlo methods in finance.Finance and Stochastics 3(4):391–412.

Fu MC (1994) Sample path derivatives for (s, S) inventory systems.Oper. Res. 42(2):351–364.

Fu MC (2006) Gradient estimation. Henderson SG, Nelson BL, eds.Handbooks in Operations Research and Management Science, Vol. 13(Elsevier, Amsterdam), 575–616.

Fu MC (2008) What you should know about simulation and deriva-tives. Naval Res. Logist. 55(8):723–736.

Fu MC (2015) Stochastic gradient estimation. Fu MC, ed. Handbook ofSimulation Optimization, Vol. 216 (Springer, New York), 105–147.

Fu MC, Hu J-Q (1993) Second derivative sample path estimators forthe GI/G/m queue. Management Sci. 39(3):359–383.

FuMC,Hu J-Q (1995) Sensitivity analysis forMonte Carlo simulationof option pricing. Probab. Engrg. Informational Sci. 9(3):417–446.

FuMC, Hu J-Q (1997) Conditional Monte Carlo: Gradient Estimation andOptimization Applications (KluwerAcademic Publishers, Boston).

Fu MC, Hu J-Q (1999) Efficient design and sensitivity analysis ofcontrol charts using Monte Carlo simulation. Management Sci.45(3):395–413.

Fu MC, Hong LJ, Hu J-Q (2009) Conditional Monte Carlo estimationof quantile sensitivities.Management Sci. 55(12):2019–2027.

Glasserman P (1991) Gradient Estimation via Perturbation Analysis(Kluwer Academic Publishers, Boston).

Glasserman P, Gong W-B (1990) Smoothed perturbation analysis fora class of discrete-event systems. IEEE Trans. Automatic Control35(11):1218–1230.

Gong W-B, Ho Y-C (1987) Smoothed perturbation analysis of dis-crete event dynamical systems. IEEE Trans. Automatic Control32(10):858–866.

Heidergott B (1999) Optimisation of a single-component mainte-nance system: A smoothed perturbation analysis approach. Eur.J. Oper. Res. 119(1):181–190.

Heidergott B (2001) Option pricing via Monte Carlo simulation: Aweak derivative approach. Probab. Engrg. Informational Sci. 15(3):335–349.

Heidergott B, Leahu H (2010) Weak differentiability of product mea-sures. Math. Oper. Res. 35(1):27–51.

Heidergott B, Volk-Makarewicz W (2016) A measure-valued differ-entiation approach to sensitivity analysis of quantiles. Math.Oper. Res. 41(1):293–317.

Ho Y-C, Cao X-R (1991) Discrete Event Dynamic Systems and Perturba-tion Analysis (Kluwer Academic Publishers, Boston).

Hong LJ, Liu G (2009) Simulating sensitivities of conditional value atrisk.Management Sci. 55(2):281–293.

Hong LJ, Liu G (2010) Pathwise estimation of probability sensitivi-ties through terminating or steady-state simulations. Oper. Res.58(2):357–370.

Jiang G, Fu MC (2015) Technical note—On estimating quantile sen-sitivities via infinitesimal perturbation analysis. Oper. Res. 63(2):435–441.

Kushner HJ, Yin GG (2003) Stochastic Approximation and RecursiveAlgorithms and Applications (Springer, New York).

Lang S (2013) Undergraduate Analysis (Springer, New York).L’Ecuyer P (1990) A unified view of the IPA, SF, and LR gradient

estimation techniques. Management Sci. 36(11):1364–1383.L’Ecuyer P, Perron G (1994) On the convergence rates of IPA and FDC

derivative estimators. Oper. Res. 42(4):643–656.Liu G, Hong LJ (2011) Kernel estimation of the Greeks for options

with discontinuous payoffs. Oper. Res. 59(1):96–108.Peng Y, Fu MC, Hu J-Q (2016a) Estimating distribution sensitivity

using generalized likelihood ratiomethod. Cassandras CG,GiuaA, Li Z, eds. Proc. 13th Internat. Workshop on Discrete Event Sys-tems, WODES ’16 (IEEE, Piscataway, NJ), 123–128.

Peng Y, Fu MC, Hu J-Q (2016b) On the regularity conditions andapplications for generalized likelihood ratio method. Proc. Win-ter Simulation Conf., WSC ’16 (IEEE, Piscataway, NJ), 919–930.

Dow

nloa

ded

from

info

rms.

org

by [

73.1

32.1

52.2

20]

on 0

3 Fe

brua

ry 2

018,

at 1

6:19

. Fo

r pe

rson

al u

se o

nly,

all

righ

ts r

eser

ved.

Page 14: A New Unbiased Stochastic Derivative Estimator for ...

Peng et al.: Generalized Likelihood Ratio MethodOperations Research, Articles in Advance, pp. 1–13, ©2018 INFORMS 13

Peng Y, Fu MC, Glynn PW, Hu J-Q (2017) On the asymptotic analysisof quantile sensitivity estimation by Monte Carlo simulation.Proc. Winter Simulation Conf., WSC ’17 (IEEE, Piscataway, NJ).

Pflug GC (1996) Optimization of Stochastic Models (Kluwer Academic,Boston).

Rubinstein RY, Shapiro A (1993) Discrete Event Systems: SensitivityAnalysis and Stochastic Optimization by the Score Function Method(John Wiley & Sons, New York).

Rudin W (1987) Real and Complex Analysis (McGraw-Hill Education,New York).

Wang Y, Fu MC, Marcus SI (2012) A new stochastic derivative esti-mator for discontinuous payoff functions with application tofinancial derivatives. Oper. Res. 60(2):447–460.

Yijie Peng is an assistant professor in the Depart-ment of Industrial Engineering and Management at PekingUniversity. His research interests include sensitivity analysisand ranking and selection in simulation optimization, withapplications in manufacturing and financial engineering.

Michael C. Fu holds the Smith Chair of Management Sci-ence in the Robert H. Smith School of Business, with a joint

appointment in the Institute for Systems Research and affili-ate faculty appointment in the Department of Electrical andComputer Engineering, all at the University of Maryland. Heis a Fellow of INFORMS and IEEE.

Jian-Qiang Hu is a professor with the Department ofMan-agement Science, School of Management, Fudan University.His research interests include discrete-event stochastic sys-tems, simulation, queueing network theory, stochastic con-trol theory, with applications toward supply chain manage-ment, risk management in financial markets and derivatives,and healthcare.

Bernd Heidergott is a professor of stochastic optimizationat the Department of Econometrics and Operations Researchat the Vrije Universiteit Amsterdam, the Netherlands. Heis programme director of the B.S.c and M.Sc. Econometricsand Operations Research, and research fellow of the Tinber-gen Institute and of EURANDOM. His research interests areoptimization and control of discrete event systems, perturba-tion analysis of Markov chains, max-plus algebra, and socialnetworks.

Dow

nloa

ded

from

info

rms.

org

by [

73.1

32.1

52.2

20]

on 0

3 Fe

brua

ry 2

018,

at 1

6:19

. Fo

r pe

rson

al u

se o

nly,

all

righ

ts r

eser

ved.