
On the Hamiltonicity Gap and Doubly Stochastic Matrices∗

Vivek S. Borkar,† Vladimir Ejov and Jerzy A. Filar‡

July 23, 2006

Abstract

We consider the Hamiltonian cycle problem embedded in singularly perturbed (controlled) Markov chains. We also consider a functional on the space of stationary policies of the process that consists of the (1,1)-entry of the fundamental matrices of the Markov chains induced by these policies. We focus on the subset of these policies that induce doubly stochastic probability transition matrices, which we refer to as the "doubly stochastic policies". We show that when the perturbation parameter, ε, is sufficiently small, the minimum of this functional over the space of the doubly stochastic policies is attained at a Hamiltonian cycle, provided that the graph is Hamiltonian. We also show that when the graph is non-Hamiltonian, the above minimum is strictly greater than that in the Hamiltonian case. We call the size of this difference the "Hamiltonicity Gap" and derive a conservative lower bound for this gap. Our results imply that the Hamiltonian cycle problem is equivalent to the problem of minimizing the variance of the first hitting time of the home node, over doubly stochastic policies.

Key words: Hamiltonian cycle, controlled Markov chains, optimal policy, singular perturbation.

∗This work is supported by ARC Grant DP0666632. We are indebted to Nelly Litvak and Giang Nguyen for their reading of the manuscript, comments and corrections.

†School of Technology and Computer Science, Tata Institute of Fundamental Research, Homi Bhabha Road, Mumbai 40005, India; e-mail: [email protected]. This author's work was supported in part by a grant from the Dept. of Science and Technology, Govt. of India.

‡School of Mathematics, The University of South Australia, Mawson Lakes, SA 5095, Australia; e-mail: [email protected]; e-mail: [email protected]


1 Introduction

We consider the following version of the Hamiltonian cycle problem: given a directed graph, find a simple cycle that contains all vertices of the graph (a Hamiltonian cycle (HC)), or prove that no HC exists. With respect to this property, Hamiltonicity, graphs possessing an HC are called Hamiltonian1.

This paper is a direct continuation of a recent contribution [3] by the present authors. It considerably strengthens the results reported in [3] and introduces a new notion of "Hamiltonicity Gap" that differentiates between the Hamiltonian and non-Hamiltonian graphs.

More generally, this contribution can be seen as belonging to a line of research [8], [1], [10], [4], [7] which aims to exploit the tools of controlled Markov decision chains (MDP's)2 to study the properties of a famous problem of combinatorial optimization: the Hamiltonian Cycle Problem (HCP).

In particular, in [9] and Ejov et al [5] it was shown that Hamiltonian cycles of a graph can be characterized as the minimizers of a functional based on the fundamental matrices of Markov chains induced by deterministic policies in a suitably perturbed MDP, provided that the value of the perturbation parameter, ε, is sufficiently small. Furthermore, it was conjectured that the minimum over the space of deterministic policies would also constitute the minimum over the space of all stationary policies. The above conjecture remains open under the perturbation studied in [5] and [9].

However, in the present paper we extend the results of Borkar et al [3] to prove that under another linear, but symmetric, singular perturbation, the same element of the fundamental matrix, considered as a function on the space of all stationary policies that induce doubly stochastic probability transition matrices, is minimized at a Hamiltonian cycle, provided that the graph is Hamiltonian. Interestingly, the minimum value of this objective function is a function of only the total number of nodes and the perturbation parameter, and is strictly less than the corresponding minimum value of any non-Hamiltonian graph with the same number of nodes. We name the difference between the values of these global minima the "Hamiltonicity gap" and claim that its presence has a theoretical, but not necessarily practical, importance.

    We believe that the theoretical importance of the Hamiltonicity gap stems

1The name of the problem owes to the fact that Sir William Hamilton investigated the existence of such cycles on the dodecahedron graph [14].

    2The acronym MDP stems from the alternative name of Markov decision processes.


from the fact that it enables us to differentiate between Hamiltonian and non-Hamiltonian graphs on the basis of the minimum value of an objective function of a mathematical program that lends itself to an appealing probabilistic/statistical interpretation; namely, a constant plus the scaled variance of the first return time to node 1. Hence, the Hamiltonian cycle problem is equivalent to the problem of minimizing the variance of the first hitting time of the home node, over doubly stochastic policies.

Next, we shall briefly differentiate between our approaches and some of the best related lines of research. Many of the successful classical approaches of discrete optimization to the Hamiltonian cycle problem focus on solving a linear programming "relaxation" followed by heuristics that prevent the formation of sub-cycles. In our approach, we embed a given graph in a singularly perturbed MDP in such a way that we can identify Hamiltonian cycles and sub-cycles with exhaustive and non-exhaustive ergodic classes of induced Markov chains, respectively. Indirectly, this allows us to exploit a formidable array of properties of Markov chains, including those of the corresponding fundamental matrices. It should be mentioned that algorithmic approaches to the HCP based on embeddings of the problem in Markov decision processes are also beginning to emerge (e.g., see Andramonov et al [1], Filar and Lasserre [10] and Ejov et al [6]).

Note that this approach is essentially different from that adopted in the study of random graphs, where an underlying random mechanism is used to generate a graph (e.g., see Karp's seminal paper [12]). In our approach, the graph that is to be studied is given and fixed, but a controller can choose arcs according to a probability distribution, and with a small probability (due to a perturbation) an arc may take the moving object to a node other than its "head". Of course, random graphs played an important role in the study of Hamiltonicity; a striking result to quote is that of Robinson and Wormald [13], who showed that with high probability k-regular graphs are Hamiltonian for k ≥ 3.

    2 Formulation and preliminaries

As in Borkar et al [3], our dynamic stochastic approach to the HCP considers a moving object tracing out a directed path on the graph Γ with its movement "controlled" by a function f mapping the set of nodes V = V(Γ) = {1, 2, . . . , s} of Γ into the set of arcs A = A(Γ) of Γ. We think of this set of nodes as the state space of a controlled Markov chain Σ = Σ(Γ) where for


each state/node i, the action space A(i) := {a | (i, a) ∈ A} is in one-to-one correspondence with the set of arcs emanating from that node, or, equivalently, with the set of endpoints of those arcs.

Illustration: Consider the complete graph Γ5 on five nodes (with no self-loops) and think of the nodes as the states of an MDP, denoted by Σ, and of the arcs emanating from a given node as of the actions available at that state. In a natural way the Hamiltonian cycle c1 : 1 → 2 → 3 → 4 → 5 → 1 corresponds to the "deterministic control" f1 : {1, 2, 3, 4, 5} → {2, 3, 4, 5, 1}, where f1(2) = 3 corresponds to the controller choosing arc (2, 3) in state 2 with probability 1. The Markov chain induced by f1 is given by the "zero-one" transition matrix P(f1) which, clearly, is irreducible. On the other hand, the union of two sub-cycles 1 → 2 → 3 → 1 and 4 → 5 → 4 corresponds to the control f2 : {1, 2, 3, 4, 5} → {2, 3, 1, 5, 4}, which identifies the Markov chain transition matrix P(f2) (see below) containing two distinct ergodic classes. This leads to a natural embedding of the Hamiltonian cycle problem in a Markov decision problem Σ. The latter MDP has a multi-chain ergodic structure which considerably complicates the analysis. However, this multi-chain structure can be "disguised" (but not completely lost) with the help of a "singular perturbation". For instance, we could easily replace P(f2) with Pε(f2):

P(f_2) = \begin{pmatrix}
0 & 1 & 0 & 0 & 0\\
0 & 0 & 1 & 0 & 0\\
1 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 1\\
0 & 0 & 0 & 1 & 0
\end{pmatrix}
\quad\text{and}\quad
P_\varepsilon(f_2) = \begin{pmatrix}
\varepsilon & 1-4\varepsilon & \varepsilon & \varepsilon & \varepsilon\\
\varepsilon & \varepsilon & 1-4\varepsilon & \varepsilon & \varepsilon\\
1-4\varepsilon & \varepsilon & \varepsilon & \varepsilon & \varepsilon\\
\varepsilon & \varepsilon & \varepsilon & \varepsilon & 1-4\varepsilon\\
\varepsilon & \varepsilon & \varepsilon & 1-4\varepsilon & \varepsilon
\end{pmatrix}.

The above perturbation is singular because it altered the ergodic structure of P(f2), changing it to an irreducible (completely ergodic) Markov chain Pε(f2). Furthermore, it leads to a doubly stochastic matrix whose rows and columns both add to one. This has been achieved as follows: each row and column of the unperturbed s × s matrix (a permutation matrix) has exactly one 1 and the rest zeros. We replace the 1's by 1 − (s − 1)ε for some 0 < ε < 1/(s − 1) and the zeros by ε to get the doubly stochastic perturbation of the matrix.
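As a quick sanity check, the construction just described (replace each 1 of a permutation matrix by 1 − (s − 1)ε and each 0 by ε) is easy to code. The sketch below is ours, not the paper's; the helper name `perturb` and the successor-list encoding of a deterministic policy are illustrative assumptions.

```python
# Sketch: the doubly stochastic perturbation of a permutation matrix.
# A deterministic policy is encoded as a 0-indexed successor list.

def perturb(successor, eps):
    """Replace each 1 of the permutation matrix by 1-(s-1)*eps, each 0 by eps."""
    s = len(successor)
    return [[1 - (s - 1) * eps if j == successor[i] else eps for j in range(s)]
            for i in range(s)]

# f2 traces the sub-cycles 1->2->3->1 and 4->5->4 (0-indexed successors).
f2 = [1, 2, 0, 4, 3]
eps = 0.01
P = perturb(f2, eps)

row_sums = [sum(row) for row in P]
col_sums = [sum(P[i][j] for i in range(5)) for j in range(5)]
```

All row and column sums equal one, so Pε(f2) is doubly stochastic, and since every entry is positive the perturbed chain is irreducible, as claimed above.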

Next, we introduce the notion of a Markov chain induced by a stationary randomized policy (also called a randomized strategy or a randomized control). The latter can be defined by an s × s stochastic matrix f with entries representing probabilities f(i, a) of choosing a possible action a at a particular state i whenever this state is visited. Of course, f(i, a) = 0 whenever a ∉ A(i). Randomized policies compose the strategy space FS. The discrete nature of the HCP focuses our attention on special paths which our moving object can trace out in Γ. These paths correspond to the subspace FD ⊂ FS of deterministic policies arising when the controller at every fixed state chooses some particular action with probability 1 whenever this state is visited (f1 and f2 are instances of the latter). To illustrate these definitions, consider the simple case where fλ is obtained from the strictly deterministic policy f2 by the "controller" deciding to randomize at node 4 by choosing the arcs (4, 5) and (4, 3) with probabilities f(4, 5) = 1 − λ and f(4, 3) = λ, respectively. The transition probability matrix of the resulting policy fλ is given by

P_\varepsilon(f_\lambda) = \begin{pmatrix}
\varepsilon & 1-4\varepsilon & \varepsilon & \varepsilon & \varepsilon\\
\varepsilon & \varepsilon & 1-4\varepsilon & \varepsilon & \varepsilon\\
1-4\varepsilon & \varepsilon & \varepsilon & \varepsilon & \varepsilon\\
\varepsilon & \varepsilon & \lambda-5\varepsilon\lambda+\varepsilon & \varepsilon & 5\varepsilon\lambda-\lambda-4\varepsilon+1\\
\varepsilon & \varepsilon & \varepsilon & 1-4\varepsilon & \varepsilon
\end{pmatrix}.

As λ ranges from 0 to 1, the Markov chain ranges from the one induced by f2 to one induced by another deterministic policy. An attractive feature of an irreducible Markov chain P is the simple structure of its Cesaro-limit (or stationary distribution) matrix P*. It consists of an identical row-vector π representing the unique solution of the linear system of equations: πP = π, π1 = 1, where 1 is an s-dimensional column vector with unity in every entry.

    The fundamental matrix

We consider a Markov chain {Xn} on a finite state space S = {1, 2, . . . , s}, with a transition matrix P = [[p(i, j)]], assumed to be irreducible. Let π = [π1, . . . , πs] denote its unique invariant probability vector and define


its stationary distribution matrix by

P^* = \lim_{N \uparrow \infty} \frac{1}{N+1} \sum_{n=0}^{N} P^n
= \begin{pmatrix} \pi \\ \pi \\ \vdots \\ \pi \end{pmatrix}.

For the sake of completeness, we recall probabilistic expressions for the elements of the fundamental matrix G = [[g(i, j)]] := (I − P + P^*)^{-1}, I being the s × s identity matrix. Let

r = [1, 0, \ldots, 0]^T \in \mathbb{R}^s, \qquad \mathbf{1} = [1, 1, \ldots, 1]^T \in \mathbb{R}^s,

and define

w = G\,r, \qquad z = r^T G.

These then satisfy the equations

(I - P + P^*)\,w = r,

i.e.,

w - Pw + \Big(\sum_i \pi_i w_i\Big)\mathbf{1} = r, \qquad (1)

and

z\,(I - P + P^*) = r^T,

i.e.,

z - zP + \Big(\sum_i z_i\Big)\pi = r^T, \qquad (2)

respectively. Let τi = min{n > 0 : Xn = i} (= ∞ if the set on the right is empty). The following theorem is proved in [3].

Theorem 1 If P is irreducible, the elements of the fundamental matrix G = [[g(i, j)]] are given by:

g(i, i) = \frac{1}{2}\,\frac{E_i[\tau_i(\tau_i + 1)]}{(E_i[\tau_i])^2},

g(i, j) = \frac{1}{2}\,\frac{E_i[\tau_i(\tau_i + 1)]}{(E_i[\tau_i])^2} - \frac{E_i[\tau_j]}{E_i[\tau_i]}, \qquad i \neq j.


Doubly stochastic perturbations of graphs

A stochastic matrix P is called doubly stochastic if the transposed matrix P^T is also stochastic (i.e., row sums of P and column sums of P add up to one). A doubly stochastic deterministic policy is one that induces a doubly stochastic probability transition matrix where units are inserted in place of arcs selected by that policy and zeroes in all other places. Hence a Markov chain induced by such a policy has a probability transition matrix that is a permutation matrix. The doubly stochastic perturbation of a policy (deterministic or randomized) is achieved by passing to a singularly perturbed MDP that is obtained from the original MDP generated by the graph Γ by introducing perturbed transition probabilities {pε(j|i, a) | i, a, j = 1, . . . , s} defined by the rule

p_\varepsilon(j|i, a) := \begin{cases}
1 - (s-1)\varepsilon, & \text{if } a = j \text{ and } (i, a) \text{ is an arc of } \Gamma,\\
\varepsilon, & \text{if } a \neq j.
\end{cases}

For any randomized policy f the corresponding s × s probability transition matrix Pε(f) has entries defined by the rule (for 0 < ε < \frac{1}{s-1})

P_\varepsilon(f)(i, j) := \sum_{a=1}^{s} p_\varepsilon(j|i, a) \cdot f(i, a).

Thus every stationary policy f induces not only a perturbed (irreducible) probability transition matrix Pε(f), but also the accompanying stationary distribution matrix P*ε(f) and the fundamental matrix Gε(f). The expression for the stationary distribution of a doubly stochastic policy f implies that E_i^f[\tau_i] := E_i[\tau_i] = \pi_i^{-1} = s for all i ∈ S, where πi(f) := πi is the ith element of any row of P*ε(f).

Notational remark: Given the underlying graph, a selection of a particular policy f uniquely determines a Markov chain. Thus the various probabilistic quantities associated with that chain, in the sense prescribed above, are synonymous with the selection of that policy. To simplify the notation, in many parts of this manuscript the explicit dependence of these quantities on the policy f will be suppressed.

Denote by Dε the convex set of doubly stochastic matrices obtained by taking the closed convex hull of the finite set Dεe corresponding to the doubly stochastic deterministic policies. We also write Dεe as the disjoint union Dεd ∪ Dεh, where Dεh corresponds to the Hamiltonian cycles and Dεd to disjoint unions of short cycles that cover the graph. Double stochasticity eliminates any other possibilities. Define the matrix norm ||A||∞ = max_{i,j} |a(i, j)| for a square matrix A = [[a(i, j)]]. Call P ∈ Dε a perturbation of a Hamiltonian cycle if there exists a P̂ ∈ Dεh such that ||P − P̂||∞ < ε0 for a prescribed ε0 > 0. Let Dεp denote the set of such P. The following are proved in [3].

Lemma 1 (a) For a policy f inducing a doubly stochastic probability transition matrix we have (with w^\varepsilon_1 := w^\varepsilon_1(f))

w^\varepsilon_1 = \frac{1}{s^2}\sum_i E_i[\tau_1] \qquad (3)
= \frac{1}{2}\,\frac{s+1}{s} + \frac{1}{2s^2}\,E_1[(\tau_1 - s)^2].

Furthermore, for policies corresponding to Dεh, the above reduces to3

w^\varepsilon_1 = \frac{1}{2}\,\frac{s+1}{s} + O(\varepsilon). \qquad (4)

(b) For Pε ∈ Dεd, w^\varepsilon_1 → ∞ as ε ↓ 0.

(c) For Hamiltonian graphs and sufficiently small ε > 0, all minima of w^\varepsilon_1 on Dεe are attained on Dεp.

Theorem 2 For Hamiltonian graphs and sufficiently small ε > 0, all minima of w^\varepsilon_1 on Dε are attained on Dεp.
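Both expressions in Lemma 1(a), and the Hamiltonian-cycle value (4), lend themselves to a direct numerical check. The sketch below is ours: it takes the perturbed standard Hamiltonian cycle policy on s = 5 nodes, evaluates w^ε_1 both as g(1,1) and via (3) as (1/s²)Σᵢ Eᵢ[τ1], and compares with (s+1)/(2s) = 0.6. The `solve` helper (Gauss-Jordan elimination) and the parameter choices are our own assumptions.

```python
# Check w1 = (1/s^2) sum_i E_i[tau_1] (formula (3)) and w1 ~ (s+1)/(2s) + O(eps)
# (formula (4)) for the perturbed Hamiltonian cycle policy 1->2->3->4->5->1.

def solve(A, b):
    n = len(A)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c] != 0.0:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

s, eps = 5, 1e-4
f1 = [1, 2, 3, 4, 0]                      # Hamiltonian cycle, 0-indexed successors
P = [[1 - (s - 1) * eps if j == f1[i] else eps for j in range(s)] for i in range(s)]

# w1 as g(1,1), the (1,1)-entry of (I - P + P*)^{-1} with P* = (1/s) everywhere
A = [[(1.0 if i == j else 0.0) - P[i][j] + 1.0 / s for j in range(s)] for i in range(s)]
w1 = solve(A, [1.0, 0.0, 0.0, 0.0, 0.0])[0]

# w1 via (3): E_1[tau_1] = s (double stochasticity), E_i[tau_1] = V(i) for i != 1
Pt = [row[1:] for row in P[1:]]
n = s - 1
I_minus = [[(1.0 if i == j else 0.0) - Pt[i][j] for j in range(n)] for i in range(n)]
V = solve(I_minus, [1.0] * n)
w1_via_3 = (s + sum(V)) / s ** 2
```

The two evaluations agree to machine precision, and with ε = 10⁻⁴ both sit within O(ε) of (s+1)/(2s) = 0.6, as (4) predicts.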

    3 Hamiltonian cycles are global minimizers

Clearly, the preceding theorem would be considerably strengthened if it could be shown that the minima of w^\varepsilon_1 on Dε are attained at doubly stochastic matrices induced by Hamiltonian cycles, not merely in their "sufficiently small" neighborhoods. This section is dedicated to the proof of this fact.

3Note that the superscript in w^\varepsilon_1 emphasizes the dependence on ε.


Directional derivatives

We shall now derive an expression for the directional derivative of our objective function w^\varepsilon_1. Let P0, P1 denote two doubly stochastic matrices in Dε and for 0 ≤ λ ≤ 1, set P(λ) = λP1 + (1 − λ)P0.4 Correspondingly define Vλ(i) := E_i[\tau_1], i ∈ S, Vλ = [Vλ(2), · · · , Vλ(s)]^T, \mathbf{1} = [1, · · · , 1]^T, where the dependence of the distribution of τ1 on the parameter λ is implicit. Also, let P̃0, P̃1, P̃(λ) denote sub-matrices derived, respectively, from P0, P1, P(λ) by deletion of their first row and column.

From Lemma 1.6, p. 46, of [2], we know that Vλ is the unique solution to

V_\lambda = \mathbf{1} + \tilde{P}(\lambda)V_\lambda. \qquad (5)

That is,

V_\lambda = (I - \tilde{P}(\lambda))^{-1}\mathbf{1}. \qquad (6)

Let \nu_\lambda(i) = E[\sum_{m=1}^{\tau_1} I\{X_m = i\}] when the initial distribution is the uniform distribution. Then \nu_\lambda := [\nu_\lambda(2), · · · , \nu_\lambda(s)] is the unique solution to

\nu_\lambda = \frac{1}{s}\mathbf{1}^T + \nu_\lambda \tilde{P}(\lambda), \qquad (7)

that is,

\nu_\lambda = \frac{1}{s}\mathbf{1}^T (I - \tilde{P}(\lambda))^{-1}. \qquad (8)

Let J(λ) denote our objective as a function of λ, that is, w^\varepsilon_1 evaluated along the line segment {P(λ) : 0 ≤ λ ≤ 1}. From (3), we have

J(\lambda) = \frac{1}{s^2}\sum_i V_\lambda(i).

Differentiating with respect to λ on both sides yields

J'(\lambda) = \frac{1}{s^2}\sum_i V'_\lambda(i) = \frac{1}{s^2}\sum_{i \neq 1} V'_\lambda(i),

because Vλ(1) = E1[τ1] = s ∀ λ ∈ [0, 1] ⇒ V′λ(1) ≡ 0. From (5) and the definition of P̃(λ) we have

V'_\lambda = (\tilde{P}_1 - \tilde{P}_0)V_\lambda + \tilde{P}(\lambda)V'_\lambda.

4This is equivalent to a "line segment" fλ = λf1 + (1 − λ)f0 in the strategy space.


Therefore

V'_\lambda = (I - \tilde{P}(\lambda))^{-1}(\tilde{P}_1 - \tilde{P}_0)V_\lambda,

and with the help of (7)-(8) this leads to

J'(\lambda) = \frac{1}{s^2}\mathbf{1}^T (I - \tilde{P}(\lambda))^{-1}(\tilde{P}_1 - \tilde{P}_0)V_\lambda
= \frac{1}{s}\,\nu_\lambda(\tilde{P}_1 - \tilde{P}_0)V_\lambda \qquad (9)
= \frac{1}{s^2}\mathbf{1}^T (I - \tilde{P}(\lambda))^{-1}(\tilde{P}_1 - \tilde{P}_0)(I - \tilde{P}(\lambda))^{-1}\mathbf{1}. \qquad (10)

An alternative expression can be found as follows. Recall the fundamental matrix G(λ) = (I − P(λ) + P^*)^{-1} from Theorem 1 of [3], where P^* is the matrix all of whose elements are equal to \frac{1}{s}. Then G(λ) satisfies the identity

G(\lambda) = (P(\lambda) - P^*)G(\lambda) + I.

Differentiating the above with respect to λ on both sides we obtain

G'(\lambda) = (P_1 - P_0)G(\lambda) + (P(\lambda) - P^*)G'(\lambda).

That is,

G'(\lambda) = (I - P(\lambda) + P^*)^{-1}(P_1 - P_0)G(\lambda) = G(\lambda)(P_1 - P_0)G(\lambda).

Since J(λ) = e_1^T G(\lambda) e_1 for e_1 := [1, 0, · · · , 0]^T, we have

J'(\lambda) = e_1^T G(\lambda)(P_1 - P_0)G(\lambda)e_1 \qquad (11)
= e_1^T (I - P(\lambda) + P^*)^{-1}(P_1 - P_0)(I - P(\lambda) + P^*)^{-1}e_1. \qquad (12)

    It is not difficult to verify the equivalence of (9) and (11) directly.
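Formula (11) can also be checked against a numerical derivative. The sketch below is ours: it takes P0 as the perturbed Hamiltonian cycle and P1 as the perturbed two-sub-cycle policy from the Illustration, evaluates (11) at λ = 0.3, and compares with a central finite difference of J(λ) = g(1,1) of P(λ). The `solve` helper and all parameter choices are our own assumptions.

```python
# Check J'(lam) = e1^T G(lam) (P1 - P0) G(lam) e1   (formula (11))
# against a central finite difference of J(lam) = g(1,1) of P(lam).

def solve(A, b):
    n = len(A)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c] != 0.0:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

s, eps = 5, 0.01

def perturbed(succ):
    return [[1 - (s - 1) * eps if j == succ[i] else eps for j in range(s)] for i in range(s)]

P0 = perturbed([1, 2, 3, 4, 0])            # Hamiltonian cycle
P1 = perturbed([1, 2, 0, 4, 3])            # union of two sub-cycles

def fund_system(lam):
    P = [[lam * P1[i][j] + (1 - lam) * P0[i][j] for j in range(s)] for i in range(s)]
    return P, [[(1.0 if i == j else 0.0) - P[i][j] + 1.0 / s for j in range(s)]
               for i in range(s)]

def J(lam):
    _, A = fund_system(lam)
    return solve(A, [1.0] + [0.0] * (s - 1))[0]

def Jprime(lam):                           # formula (11)
    _, A = fund_system(lam)
    Ge1 = solve(A, [1.0] + [0.0] * (s - 1))                    # first column of G
    At = [[A[j][i] for j in range(s)] for i in range(s)]
    e1G = solve(At, [1.0] + [0.0] * (s - 1))                   # first row of G
    return sum(e1G[i] * (P1[i][j] - P0[i][j]) * Ge1[j]
               for i in range(s) for j in range(s))

lam, h = 0.3, 1e-6
fd = (J(lam + h) - J(lam - h)) / (2 * h)
analytic = Jprime(lam)
```

The analytic derivative and the finite difference agree to several significant digits, consistent with the equivalence of (9) and (11).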

    Hamiltonian cycles are both local and global minima

The following, purely technical, lemma may already be known, but we include its proof for the sake of completeness.

Lemma 3 If x_m = m, 1 ≤ m ≤ s, and {y_m} is a permutation of {x_j}, then \sum_i x_i y_i is maximized exactly when y_i = x_i ∀ i and minimized exactly when y_i = s + 1 − x_i ∀ i.


Proof The first claim is immediate from the Cauchy-Schwarz inequality. Thus for any permutation {y_i} of {x_j},

\sum_i x_i y_i \leq \sum_i x_i^2,

with equality if and only if x_i = y_i ∀ i. In particular, for the permutation z_i = s + 1 − y_i we have

\sum_i x_i y_i = \sum_i x_i(s + 1 - z_i)
= (s+1)\sum_i x_i - \sum_i x_i z_i
\geq (s+1)\sum_i x_i - \sum_i x_i^2 = \sum_i x_i(s + 1 - x_i),

that is,

\sum_i x_i y_i \geq \sum_i x_i(s + 1 - x_i),

with equality if and only if y_i = s + 1 − x_i ∀ i. The second claim follows. □

We now consider J′(0) in the situations where the doubly stochastic matrix P0 ∈ Dεh is induced by a deterministic policy tracing out a Hamiltonian cycle. We shall first show that, on a straight line path from P0 towards any doubly stochastic P1 induced by the graph, J′(0) > 0. This shows that deterministic policies inducing Hamiltonian cycles correspond to local minima.

Suppose then that P0 corresponds to a Hamiltonian cycle C0. Without loss of generality, we assume C0 is the simplest Hamiltonian cycle, the one that corresponds to the permutation σ0 that maps {1, 2, · · · , s − 1, s} onto {2, 3, · · · , s, 1}.

To start with, we consider P1 ∈ Dεh ∪ Dεd, other than P0. That is, P1 is induced by any deterministic policy that traces out in the graph either a union of disjoint cycles or a Hamiltonian cycle other than C0.

For each i ∈ S, other than 1, let m(i) denote the number of steps required to reach node 1 from i on C0, if ε = 0. It is clear that, with ε > 0,

V_0(i) = E_i[\tau_1] = m(i) + O(\varepsilon) = (s - i + 1) + O(\varepsilon).

Also, if j is after i (or j = i) and before 1, then E_j[\sum_{\ell=1}^{\tau_1} I\{X_\ell = i\}] = O(\varepsilon). Further, if j is after 1 and before i, then E_j[\sum_{\ell=1}^{\tau_1} I\{X_\ell = i\}] = 1 + O(\varepsilon). Hence, for i = 2, · · · , s,

\nu_0(i) = \frac{1}{s}\sum_{1 \leq j \leq i} E_j\Big[\sum_{\ell=1}^{\tau_1} I\{X_\ell = i\}\Big] + O(\varepsilon)
= \frac{s - m(i)}{s} + O(\varepsilon) = \frac{i - 1}{s} + O(\varepsilon).


Thus

\nu_0\tilde{P}_0 V_0 = \sum_{i=2}^{s-1} \nu_0(i)V_0(i+1)
= \frac{1}{s}\{1(s-2) + 2(s-3) + \cdots + (s-2)1\} + O(\varepsilon)
= \frac{1}{s}\{1[(s-1)-1] + 2[(s-1)-2] + \cdots + (s-1)[(s-1)-(s-1)]\} + O(\varepsilon)
= \frac{1}{s}\Big((s-1)\sum_{r=1}^{s-1} r - \sum_{r=1}^{s-1} r^2\Big) + O(\varepsilon)
= \frac{(s-1)^2}{2} - \frac{1}{s}\sum_{r=1}^{s-1} r^2 + O(\varepsilon).

A similar argument can be used to show that

\nu_0\tilde{P}_1 V_0 = \frac{(s-1)^2}{2} - \frac{1}{s}\sum_{r=1}^{s-1} r\,y_r + O(\varepsilon),

where y_r ∈ {0, 1, 2, · · · , (s − 1)} and y_r ≠ y_k whenever r ≠ k. Note that if y_r were allowed to take values only in the set {1, 2, · · · , (s − 1)}, then by Lemma 3 we would have that

\sum_{r=1}^{s-1} r^2 > \sum_{r=1}^{s-1} r\,y_r,

whenever (y_1, · · · , y_{s-1}) ≠ (1, · · · , s − 1). However, the inclusion of 0 as one of the possible values for y_r can only lower the right hand side of the above inequality.

Hence we have proved that

\nu_0\tilde{P}_1 V_0 - \nu_0\tilde{P}_0 V_0 > 0

whenever P̃1 ∈ Dεh ∪ Dεd and P̃1 ≠ P̃0.

Now consider an arbitrary doubly stochastic P̃1 other than P̃0. By the Birkhoff-von Neumann theorem we know that \tilde{P}_1 = \sum_{i=1}^{M} \gamma_i P^*_i, where \gamma_i > 0 ∀ i, \sum_i \gamma_i = 1, and the P^*_i ∈ Dεh ∪ Dεd correspond to permutation matrices. Note also that for at least one i in the above summation P^*_i ≠ P0. Then by the preceding strict inequalities we have that

J'(0) = \nu_0\tilde{P}_1 V_0 - \nu_0\tilde{P}_0 V_0 = \sum_i \gamma_i\big(\nu_0 P^*_i V_0 - \nu_0\tilde{P}_0 V_0\big) > 0. \qquad \Box

The following main result now follows rather easily.
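The strict inequality J′(0) > 0 can be observed numerically. The sketch below is our own illustration for the complete graph on s = 5 nodes without self-loops, where the extreme points of Dε are the perturbed permutation matrices without fixed points (derangements): it evaluates J′(0) via (11) for every such P1 ≠ P0 and checks that all 43 directional derivatives are positive. The `solve` helper and parameters are our assumptions.

```python
# Check J'(0) > 0 when P0 is the perturbed Hamiltonian cycle 1->2->3->4->5->1
# and P1 ranges over all other perturbed fixed-point-free permutation policies
# (the extreme points of D^eps for the complete graph without self-loops).
from itertools import permutations

def solve(A, b):
    n = len(A)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c] != 0.0:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

s, eps = 5, 1e-4

def perturbed(succ):
    return [[1 - (s - 1) * eps if j == succ[i] else eps for j in range(s)] for i in range(s)]

P0 = perturbed((1, 2, 3, 4, 0))
A0 = [[(1.0 if i == j else 0.0) - P0[i][j] + 1.0 / s for j in range(s)] for i in range(s)]
Ge1 = solve(A0, [1.0] + [0.0] * (s - 1))                       # first column of G(0)
A0t = [[A0[j][i] for j in range(s)] for i in range(s)]
e1G = solve(A0t, [1.0] + [0.0] * (s - 1))                      # first row of G(0)

derivs = []
for sigma in permutations(range(s)):
    if sigma == (1, 2, 3, 4, 0) or any(sigma[i] == i for i in range(s)):
        continue                                               # skip P0 and fixed points
    P1 = perturbed(sigma)
    derivs.append(sum(e1G[i] * (P1[i][j] - P0[i][j]) * Ge1[j]  # J'(0) via (11)
                      for i in range(s) for j in range(s)))
```

There are 44 derangements of five elements; excluding P0 itself leaves 43 directions, and the smallest derivative is strictly positive, matching the argument above.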

Theorem 4

(a) If P0 is induced by a Hamiltonian cycle, it is a strict local minimum for the cost functional w^\varepsilon_1.

(b) If P0 is induced by a Hamiltonian cycle, it is also a global minimum for the cost functional w^\varepsilon_1.

Proof The first part was proved above for P0 corresponding to the Hamiltonian cycle C0: it is sufficient to observe that for a strict local minimum the quantity

\nu_0\tilde{P}_1 V_0 - \nu_0\tilde{P}_0 V_0

remains strictly bounded away from zero as ε ↓ 0 for all extremal P1 ≠ P0. The effect of considering another Hamiltonian cycle would be only to permute the order of terms in various summations, without changing the conclusions.

To see the second part, first note that the above allows us to choose an ε0 > 0 such that P0 is the strict local minimum of w^\varepsilon_1 in the ε0-neighborhood of P0. As in the proof of Theorem 2 (see pp. 385-7 of [3]), choose ε > 0 small enough so that the global minimum of w^\varepsilon_1 is attained on the ε0-neighborhood of P0. ('Small enough' here is quantified by an upper bound that depends only on s and ε0, see ibid.) The claim follows. □

4 Hamiltonicity Gap

Recall from Lemma 1 (see also Lemma 6 of [3]) that for a doubly stochastic policy f ∈ Dε, we had proved that the functional consisting of the top left hand element of the fundamental matrix induced by f is given by

w^\varepsilon_1 := w^\varepsilon_1(f) = \frac{1}{2}\,\frac{s+1}{s} + \frac{1}{2s^2}\,E_1[(\tau_1 - s)^2]. \qquad (13)

Note that dependence on f on the right is suppressed, but the expectation term is a function of f, since the policy determines the distribution of τ1.

It should now be clear that a consequence of Lemma 1 and Theorem 4 is that whenever the underlying graph Γ is Hamiltonian the minimum of the above functional over f ∈ Dε is given by

w^\varepsilon_1(f_h) = \min_{f \in D^\varepsilon}[w^\varepsilon_1(f)] = \frac{1}{2}\,\frac{s+1}{s} + O(\varepsilon), \qquad (14)

where fh ∈ Dεh is a policy defining any Hamiltonian cycle in the graph.

This section is devoted to the proof of the fact that, for ε > 0 and sufficiently small, there exists ∆(s) > 0 such that whenever the graph Γ is non-Hamiltonian

\min_{f \in D^\varepsilon}[w^\varepsilon_1(f)] - w^\varepsilon_1(f_h) \geq \Delta(s) - O(\varepsilon).

We name the quantity ∆(s) the Hamiltonicity gap (of order s) because it distinguishes all non-Hamiltonian graphs with s nodes from all Hamiltonian graphs with the same number of nodes.

Before presenting the proof, we note that such a result is reasonable when we consider the possible variability of τ1, as captured by its variance E_1[(\tau_1 - s)^2], for both Hamiltonian and non-Hamiltonian graphs. In the former case, it is clear that this variance can be made nearly zero by following a Hamiltonian cycle, because the latter policy would yield a variance actually equal to zero were it not for the (small) perturbation ε. However, if the graph is non-Hamiltonian, perhaps we cannot avail ourselves of such a variance annihilating policy. This intuitive reasoning is made rigorous in the remainder of this section.

A lower bound for the non-Hamiltonian case

The key step in what follows is the derivation of a lower bound on P_f({\tau_1 = s}), for an arbitrary doubly stochastic policy f in a non-Hamiltonian graph.

Lemma 5 Suppose that the graph Γ is non-Hamiltonian, that is, Dεh = ∅, and let f be an arbitrary doubly stochastic policy.

(a) If ε = 0, then

P_f(\{\tau_1 = s\}) \leq \frac{1}{4}.

(b) If ε > 0 and small, then

P_f(\{\tau_1 = s\}) \leq \frac{1}{4} + O(\varepsilon).

Proof

First consider the case ε = 0. Let f be an arbitrary doubly stochastic policy and let {X_t}_0^\infty be the Markov chain induced by f and the starting state 1. Let γ1 = (X_0, X_1, . . . , X_s) be a path of s steps through the graph and let χ1 = {γ1 | X_0 = X_s = 1, X_k ≠ 1; k = 1, . . . , s − 1}. That is, the event that the first return to 1 occurs after s steps is simply {τ1 = s} = {χ1} and hence

P_f(\{\tau_1 = s\}) = \sum_{\gamma_1 \in \chi_1} p_{\gamma_1},

where p_{\gamma_1} denotes the probability (under f) of observing the path γ1. However, because the graph is assumed to be non-Hamiltonian, all the paths in χ1 that receive a positive probability have the structure

\gamma_1 = \gamma'_1 \oplus \bar{\gamma}_1,

where γ′1 consists of a non-self-intersecting 'reduced path' from 1 to itself of length m ≤ s − 2, adjoined at some node(s) other than 1 by one or more loops of total length s − m that together constitute γ̄1. One can think of γ′1 and γ̄1 as the first and second parts of a figure comprising a basic loop with one or more side-lobes attached to it, each of which is either a loop or a connected union of loops. The simplest instance of this is a figure of eight, with two loops of length m and s − m respectively, attached at a node other than 1.

Let p_{\gamma_1} denote the probability of the original path and p'_{\gamma_1} that of the reduced path. Let q := p_{\gamma_1}/p'_{\gamma_1} \leq 1, which is the contribution to p_{\gamma_1} coming from the loops comprising γ̄1. More generally, define

\gamma_0 = \gamma'_1,\quad \gamma_1 = \gamma'_1 \oplus \bar{\gamma}_1,\quad \gamma_2 = \gamma'_1 \oplus \bar{\gamma}_1 \oplus \bar{\gamma}_1,\quad \gamma_3 = \gamma'_1 \oplus \bar{\gamma}_1 \oplus \bar{\gamma}_1 \oplus \bar{\gamma}_1,\quad \ldots

The paths γn, n ≥ 2, from 1 to itself all begin with the same reduced path γ′1 but repeat exactly the same loop traversal γ̄1 n times; they all contribute to the event {τ1 ≠ s}, as does γ0 = γ′1.

    all contribute to the event {τ1 6= s}, as does γ0 = γ ′1.The paths γn, n ≥ 2 have probabilities pγ1qn−1. The total probability

    that these paths and γ0 = γ′1 (but excluding the original γ1) contribute to

    15

  • {τ1 6= s} is as followspγ1q

    +∑n≥2

    pγ1qn−1 = pγ1(

    1

    q+

    q

    1− q )

    = pγ1(−1 +1

    q(1− q))≥ 3pγ1 .

From the above it follows that

P_f(\{\tau_1 \neq s\}) \geq \sum_{\gamma_1 \in \chi_1} 3\,p_{\gamma_1} = 3\,P_f(\{\tau_1 = s\}).

Hence,

1 = P_f(\tau_1 < \infty) = P_f(\tau_1 = s) + P_f(\tau_1 \neq s) \geq 4\,P_f(\tau_1 = s),

implying P_f(\tau_1 = s) \leq \frac{1}{4}, or, P_f(\tau_1 \neq s) \geq \frac{3}{4}.

Returning to the case ε > 0 and small, we note that in the Markov chain induced by f there are now two types of transitions: strong, corresponding to f assigning a positive probability to arcs that are actually in the graph, and weak, which are strictly the result of our perturbation. The latter are of order ε. Thus the only impact that the perturbation makes on the argument presented above is to introduce an adjustment of order ε. This completes the proof. □
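Lemma 5 can be illustrated on a concrete non-Hamiltonian graph. The sketch below is our own example: a 5-node graph whose only arcs are the two sub-cycles 1 → 2 → 3 → 1 and 4 → 5 → 4, so f2 is its only doubly stochastic deterministic policy. We compute the first-return-time distribution of node 1 from the taboo transition matrix; all helper names and parameters are assumptions.

```python
# On a non-Hamiltonian 5-node graph (arcs: 1->2->3->1 and 4->5->4 only), the
# only doubly stochastic deterministic policy is f2.  For the perturbed chain,
# P_1(tau_1 = n) = a . Ptilde^{n-2} . c for n >= 2, where a = one-step
# probabilities 1 -> j (j != 1), Ptilde = taboo matrix on states {2..5},
# and c = one-step probabilities j -> 1.

s, eps = 5, 0.01
f2 = [1, 2, 0, 4, 3]
P = [[1 - (s - 1) * eps if j == f2[i] else eps for j in range(s)] for i in range(s)]

a = P[0][1:]                              # leave node 1
Pt = [row[1:] for row in P[1:]]           # stay off node 1
c = [P[i][0] for i in range(1, s)]        # re-enter node 1

def return_prob(n):
    """P_1(tau_1 = n) for n >= 2."""
    v = a[:]
    for _ in range(n - 2):
        v = [sum(v[i] * Pt[i][j] for i in range(s - 1)) for j in range(s - 1)]
    return sum(v[j] * c[j] for j in range(s - 1))

p3 = return_prob(3)                       # return along the 3-cycle
p5 = return_prob(5)                       # "Hamiltonian-length" return
```

Here p3 is close to 1 (the chain essentially traverses the 3-cycle), while every length-5 first-return path must use at least one weak ε-transition, so p5 = O(ε), comfortably below the 1/4 + O(ε) bound of Lemma 5(b).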

Theorem 6 Suppose that the graph Γ is non-Hamiltonian, that is, Dεh = ∅. Let \Delta(s) = \frac{3}{8s^2}.

(a) For any policy f, we have that E_1[(\tau_1 - s)^2] \geq \frac{3}{4} - O(\varepsilon).

(b) We also have that

\min_{f \in D^\varepsilon}[w^\varepsilon_1(f)] - w^\varepsilon_1(f_h) \geq \Delta(s) - O(\varepsilon).

Proof Let f be an arbitrary doubly stochastic policy and E_1[(\tau_1 - s)^2] be the corresponding variance of the first hitting time of node 1 (starting from 1). Clearly,

E_1[(\tau_1 - s)^2] = \sum_{k \geq 1}(k - s)^2 P_f(\tau_1 = k)
\geq \sum_{k \geq 1,\, k \neq s} P_f(\tau_1 = k) = P_f(\tau_1 \neq s).

Hence by Lemma 5(b) we have obtained part (a), namely

E_1[(\tau_1 - s)^2] \geq \frac{3}{4} - O(\varepsilon). \qquad (15)

It now follows from (13) that

w^\varepsilon_1 := w^\varepsilon_1(f) \geq \frac{1}{2}\,\frac{s+1}{s} + \frac{1}{2s^2}\Big(\frac{3}{4} - O(\varepsilon)\Big) = \frac{1}{2}\,\frac{s+1}{s} + \Delta(s) - O(\varepsilon). \qquad (16)

Part (b) now follows immediately from (16) and (14). □

5 Symmetric doubly stochastic policies

In view of the results of the preceding sections, it is clear that the Hamiltonian cycle problem is equivalent to the following mathematical programming problem:

\min\,[w^\varepsilon_1(f)] = \min\,\big\{[I - P_\varepsilon(f) + P^*_\varepsilon(f)]^{-1}\big\}_{(1,1)}

subject to:

(i) \sum_a f(i, a) = 1, \quad i = 1, \ldots, s, \qquad f(i, a) \geq 0 \ \forall\, i, a,

(ii) \mathbf{1}^T P_\varepsilon(f) = \mathbf{1}^T.

Of course, constraints (i) in the above ensure that f is a proper stationary policy, while constraints (ii) ensure that Pε(f) is a doubly stochastic probability transition matrix. Furthermore, (13) ensures that this problem is equivalent to minimizing the variance of τ1, the first hitting time of the home node 1. That is,

\arg\min_{f \in D^\varepsilon}[w^\varepsilon_1(f)] = \arg\min_{f \in D^\varepsilon}[\mathrm{Var}_f(\tau_1)], \qquad (17)

    where for every doubly stochastic stationary policy the variance of the first return time to 1 is denoted by Var_f(τ_1) := E_1[(τ_1 − s)^2].
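    To make the equivalence (17) concrete, the following sketch (our illustrative check, not part of the paper; it works at ε = 0, where P*_ε(f) reduces to (1/s)11^T for an irreducible doubly stochastic matrix) evaluates the (1,1)-entry of the fundamental matrix for a randomly generated doubly stochastic matrix and compares it with (s + 1)/(2s) + Var_f(τ_1)/(2s^2), obtaining the return-time moments by first-step analysis.

    ```python
    import numpy as np

    def random_doubly_stochastic(s, k=5, seed=0):
        """Convex (Birkhoff) mixture of k random permutation matrices,
        blended with the uniform matrix to guarantee irreducibility."""
        rng = np.random.default_rng(seed)
        P = sum(w * np.eye(s)[rng.permutation(s)]
                for w in rng.dirichlet(np.ones(k)))
        return 0.9 * P + 0.1 / s

    def w1(P):
        """(1,1)-entry of the fundamental matrix [I - P + P*]^{-1};
        for irreducible doubly stochastic P, P* = (1/s) 11^T."""
        s = P.shape[0]
        return np.linalg.inv(np.eye(s) - P + 1.0 / s)[0, 0]

    def return_time_moments(P):
        """First two moments of the first return time to node 1,
        by first-step analysis on P-tilde (first row and column deleted)."""
        s = P.shape[0]
        Pt, I = P[1:, 1:], np.eye(s - 1)
        one = np.ones(s - 1)
        m = np.linalg.solve(I - Pt, one)                 # E_i[tau_1], i != 1
        m2 = np.linalg.solve(I - Pt, one + 2 * Pt @ m)   # E_i[tau_1^2], i != 1
        mean = 1 + P[0, 1:] @ m
        second = 1 + 2 * (P[0, 1:] @ m) + P[0, 1:] @ m2
        return mean, second - mean ** 2

    s = 7
    P = random_doubly_stochastic(s)
    mean, var = return_time_moments(P)
    assert abs(mean - s) < 1e-9            # Kac's formula: E_1[tau_1] = s
    lhs, rhs = w1(P), (s + 1) / (2 * s) + var / (2 * s ** 2)
    assert abs(lhs - rhs) < 1e-9           # w_1 = (s+1)/(2s) + Var(tau_1)/(2s^2)
    ```

    Note that Var_f(τ_1) = E_1[(τ_1 − s)^2] here precisely because E_1[τ_1] = s for every doubly stochastic policy.
    
    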

    Since, in some contexts, variance minimization is a convex optimization problem, it is important to stress that this is not the case here. However, there is an interesting convex subset of symmetric doubly stochastic policies on which our objective function is strictly convex. At this stage, it is not clear whether the convex program consisting of the minimization of w_1^ε(f) over that set is related to the original Hamiltonicity problem. Nonetheless, for the sake of completeness, the remainder of this section is devoted to establishing these restricted convexity properties and to showing that they do not extend to all of D^ε.

    Let P_0, P_1 be doubly stochastic matrices corresponding to a pair of policies f_0, f_1 and let P̃_0, P̃_1 denote the same matrices with the first row and column removed. Also, let P̃(λ) = λP̃_1 + (1 − λ)P̃_0 for λ ∈ (0, 1) and, as in Section 3, let

        J(λ) = (1/s^2) Σ_i V_λ(i),    (18)

    that is, J(λ) is the short-hand notation for the corresponding objective function w_1^ε(f_λ), where f_λ := λf_1 + (1 − λ)f_0. We now derive a useful expression for the m-th derivative of J(λ).

    Lemma 7 For m ≥ 1,

        J^(m)(λ) = d^m J(λ)/dλ^m = (m!/s^2) 1^T (I − P̃(λ))^{-1} ((P̃_1 − P̃_0)(I − P̃(λ))^{-1})^m 1.    (19)

    Proof In Section 3, the claim has already been proved for m = 1. We claim that V^(m)_λ := d^m V_λ/dλ^m is uniquely specified by the linear system

        V^(m)_λ = m(P̃_1 − P̃_0)V^(m−1)_λ + P̃(λ)V^(m)_λ.

    This follows by simple induction. The claim follows from this and (18). □
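    Formula (19) lends itself to a quick numerical check. The sketch below is our illustrative construction: P̃_0, P̃_1 come from random doubly stochastic matrices with the first row and column deleted, and J is evaluated via the closed form (1/s^2) 1^T (I − P̃(λ))^{-1} 1 + 1/s for the cost used later in this section; the analytic derivatives from (19) for m = 1, 2 are compared with finite differences.

    ```python
    import numpy as np
    from math import factorial

    def doubly_stochastic(s, seed):
        """Random doubly stochastic matrix: Birkhoff mixture blended with uniform."""
        rng = np.random.default_rng(seed)
        P = sum(w * np.eye(s)[rng.permutation(s)]
                for w in rng.dirichlet(np.ones(5)))
        return 0.9 * P + 0.1 / s

    s = 6
    Pt0 = doubly_stochastic(s, 1)[1:, 1:]   # P-tilde: first row and column deleted
    Pt1 = doubly_stochastic(s, 2)[1:, 1:]
    one = np.ones(s - 1)

    def J(lam):
        """Closed form of the cost: (1/s^2) 1^T (I - P-tilde(lam))^{-1} 1 + 1/s."""
        Pt = lam * Pt1 + (1 - lam) * Pt0
        return one @ np.linalg.solve(np.eye(s - 1) - Pt, one) / s ** 2 + 1 / s

    def J_deriv(lam, m):
        """Formula (19): J^(m) = (m!/s^2) 1^T (I-Pt)^{-1} ((Pt1-Pt0)(I-Pt)^{-1})^m 1."""
        R = np.linalg.inv(np.eye(s - 1) - (lam * Pt1 + (1 - lam) * Pt0))
        M = np.linalg.matrix_power((Pt1 - Pt0) @ R, m)
        return factorial(m) / s ** 2 * (one @ R @ M @ one)

    lam, h = 0.3, 1e-4
    fd1 = (J(lam + h) - J(lam - h)) / (2 * h)            # central differences
    fd2 = (J(lam + h) - 2 * J(lam) + J(lam - h)) / h ** 2
    assert abs(J_deriv(lam, 1) - fd1) < 1e-6 * max(1.0, abs(fd1))
    assert abs(J_deriv(lam, 2) - fd2) < 1e-4 * max(1.0, abs(fd2))
    ```
    
    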

    In particular,

        J^(2)(λ) = (2/s^2) 1^T (I − P̃(λ))^{-1}(P̃_1 − P̃_0)(I − P̃(λ))^{-1}(P̃_1 − P̃_0)(I − P̃(λ))^{-1} 1
                 = (2/s^2) q_2^T A q_1,    (20)

    where

        q_1 := (I − P̃(λ))^{-1}(P̃_1 − P̃_0)(I − P̃(λ))^{-1} 1,
        q_2 := (I − P̃(λ)^T)^{-1}(P̃_1^T − P̃_0^T)(I − P̃(λ)^T)^{-1} 1,
        A := I − (1/2)(P̃(λ) + P̃(λ)^T).

    Lemma 8 A is positive definite.

    Proof Since P(λ) is irreducible doubly stochastic, (1/2)(P̃(λ) + P̃(λ)^T) is strictly substochastic with spectral radius < 1. Thus A is symmetric with strictly positive eigenvalues. The claim follows. □
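    Lemma 8 is easy to illustrate numerically (the random doubly stochastic matrix below is our own construction): deleting the first row and column of an irreducible doubly stochastic matrix leaves a strictly substochastic block, so the symmetrized matrix A has strictly positive eigenvalues.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    s = 8
    # random doubly stochastic matrix: Birkhoff mixture of permutation matrices,
    # blended with the uniform matrix to guarantee irreducibility
    P = sum(w * np.eye(s)[rng.permutation(s)] for w in rng.dirichlet(np.ones(5)))
    P = 0.9 * P + 0.1 / s

    Pt = P[1:, 1:]                            # strictly substochastic block
    A = np.eye(s - 1) - 0.5 * (Pt + Pt.T)     # A = I - (1/2)(P-tilde + P-tilde^T)

    # every row sum of the symmetrized block is < 1, so its spectral radius is < 1
    # and A is positive definite
    eigs = np.linalg.eigvalsh(A)
    assert eigs.min() > 0
    ```
    
    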

    Lemma 9 If J^(2)(λ) = 0, then J^(m)(λ) = 0 ∀ m ≥ 2.

    Proof By Lemma 8, J^(2)(λ) = 0 would imply q = 0. The claim is now immediate from (19). □

    Define D^ε_sym := {Q ∈ D^ε : Q = Q^T}. Note that for every P ∈ D^ε, P^T ∈ D^ε and hence (1/2)(P + P^T) ∈ D^ε_sym. Clearly D^ε_sym is a convex compact polytope. Let D^ε_sym,e denote the set of its extreme points.

    Lemma 10 P̄ ∈ D^ε_sym,e if and only if P̄ = (1/2)(P + P^T) for some P ∈ D^ε_e.

    Proof If P̄ ∈ D^ε_e as well, the claim trivially holds. Suppose not. If the claim is false, P̄ = Σ_{i=1}^m a_i P̂_i for some m ≥ 2, a_i > 0 ∀ i with Σ_i a_i = 1, and distinct P̂_i ∈ D^ε_e for 1 ≤ i ≤ m. Since P̄ = P̄^T, we also have P̄ = Σ_{i=1}^m a_i P̂_i^T and thus P̄ = Σ_{i=1}^m a_i P̄_i for P̄_i := (1/2)(P̂_i + P̂_i^T) ∀ i. This contradicts the fact that P̄ ∈ D^ε_sym,e, proving the claim. □

    From now on, consider P_0, P_1 ∈ D^ε_sym.

    Lemma 11 J^(2)(λ) > 0.

    Proof Now q_1 = q_2. By Lemma 8 and (20), J^(2)(λ) ≥ 0. If it were zero, so would be all the higher derivatives by Lemma 9, and the rational function J(λ) would have to be a constant, which it cannot be. □

    Corollary 12 The cost function w_1^ε restricted to D^ε_sym is strictly convex and hence attains its minimum at a unique point P̂ in D^ε_sym.

    Lemma 13 The directional derivative of w_1^ε at the P̂ above is nonnegative in all directions.

    Proof For P ∈ D^ε, let P̄ = (1/2)(P + P^T) ∈ D^ε_sym. Then, as P̂ is the minimizer of w_1^ε on D^ε_sym,

        V_0^T(P̄ − P̂)V_0 = (1/2)(V_0^T(P − P̂)V_0 + V_0^T(P^T − P̂)V_0) ≥ 0.

    But

        V_0^T(P − P̂)V_0 = V_0^T(P^T − P̂)V_0.

    Hence V_0^T(P − P̂)V_0 = V_0^T(P^T − P̂)V_0 ≥ 0 ∀ P. The claim follows. □

    Our cost function is

        (1/s^2) Σ_i E_i[τ_1] = (1/s^2)(Σ_{i≠1} V_i + s) = (1/s^2) 1^T (I − P̃)^{-1} 1 + 1/s.

    Let P(λ) = λP^T + (1 − λ)P, λ ∈ [0, 1], and let J(λ) be the corresponding cost.

    Lemma 14 J(λ) is symmetric about λ = 0.5.

    Proof We have

        1^T (I − P̃(λ))^{-1} 1 = 1^T (I − (λP̃^T + (1 − λ)P̃))^{-1} 1
                               = 1^T (I − ((λ − 0.5)(P̃^T − P̃) + 0.5(P̃ + P̃^T)))^{-1} 1    (21)
                               = 1^T (I − P̃(λ)^T)^{-1} 1
                               = 1^T (I − ((λ − 0.5)(P̃ − P̃^T) + 0.5(P̃ + P̃^T)))^{-1} 1
                               = 1^T (I − ((0.5 − λ)(P̃^T − P̃) + 0.5(P̃ + P̃^T)))^{-1} 1.    (22)

    Comparing (21) and (22), we get the result. □

    Lemma 15 The odd derivatives of J(λ) vanish at λ = 0.5 and the even derivatives are negative.

    This in particular implies that the minimum point among symmetric doubly stochastic matrices is in fact a saddle point over all doubly stochastic matrices.

    Proof The first claim is immediate from Lemma 14. Now, for P̂ = 0.5(P + P^T),

        J^(2)(0.5) = 1^T (I − P̂)^{-1}(P^T − P)(I − P̂)^{-1}(P^T − P)(I − P̂)^{-1} 1
                   = −1^T (I − P̂)^{-1}(P − P^T)(I − P̂)^{-1}(P^T − P)(I − P̂)^{-1} 1
                   = −r^T A r

    for A = I − P̂ and r = (I − P̂)^{-1}(P^T − P)(I − P̂)^{-1} 1. A similar argument works for the other even derivatives. The claim follows. □

    Remark We note that the above lemma does not necessarily imply that J(λ) is concave. Though the function J(λ) = 1^T (Σ_m P̃(λ)^m) 1 is real analytic on [0, 1], the radius of convergence of its Taylor series around λ = 0.5 could, in general, be less than 0.5.

    Example We consider the Petersen graph with the adjacency matrix

        0 1 0 0 1 1 0 0 0 0
        1 0 1 0 0 0 1 0 0 0
        0 1 0 1 0 0 0 1 0 0
        0 0 1 0 1 0 0 0 1 0
        1 0 0 1 0 0 0 0 0 1
        1 0 0 0 0 0 0 1 1 0
        0 1 0 0 0 0 0 0 1 1
        0 0 1 0 0 1 0 0 0 1
        0 0 0 1 0 1 1 0 0 0
        0 0 0 0 1 0 1 1 0 0    (23)

    Next, we consider a deterministic policy f_0 consisting of two sub-cycles of length 5. In particular, f_0 is equivalent to the map

        {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} → {2, 3, 4, 5, 1, 8, 9, 10, 6, 7}.

    Figure 1: Function J(λ)

    Furthermore, let f_1 be equivalent to the map

        {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} → {5, 1, 2, 3, 4, 9, 10, 6, 7, 8},

    namely, f_1 consists of the “reverse sub-cycles” of f_0.

    Clearly, for every ε, P_ε(f_1) = P_ε^T(f_0), and we can evaluate J(λ) = w_1^ε(f_λ), where f_λ = λf_1 + (1 − λ)f_0, and its second derivative on the interval λ ∈ [0, 1] in accordance with (18) and (20), respectively. The plots of these functions are displayed in Figures 1 and 2. It is clear from the figures that J(λ) = w_1^ε(f_λ) is not convex but that, consistent with Lemma 15, it is concave in a neighbourhood of λ = 0.5.
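    The example can be reproduced numerically. The sketch below builds the two permutation policies, applies the simple doubly stochastic perturbation P_ε(f) = (1 − ε)P(f) + (ε/s)11^T (our illustrative choice, which may differ from the perturbation used in the paper), and evaluates J(λ), confirming the symmetry about λ = 0.5 (Lemma 14) and the local concavity there (Lemma 15).

    ```python
    import numpy as np

    s, eps = 10, 1e-3
    sigma0 = [2, 3, 4, 5, 1, 8, 9, 10, 6, 7]   # f0: the two 5-sub-cycles above
    P0 = np.zeros((s, s))
    for i, j in enumerate(sigma0):
        P0[i, j - 1] = 1.0

    # illustrative doubly stochastic perturbation (assumption, see text);
    # f1 consists of the reverse sub-cycles, so its matrix is the transpose
    Q0 = (1 - eps) * P0 + eps / s
    Q1 = Q0.T

    def J(lam):
        """Cost (1/s^2) 1^T (I - P-tilde(lam))^{-1} 1 + 1/s, first row/col deleted."""
        Pt = (lam * Q1 + (1 - lam) * Q0)[1:, 1:]
        one = np.ones(s - 1)
        return one @ np.linalg.solve(np.eye(s - 1) - Pt, one) / s ** 2 + 1 / s

    # Lemma 14: symmetry of J about lambda = 0.5
    assert abs(J(0.3) - J(0.7)) < 1e-8 * abs(J(0.3))
    # Lemma 15: negative second difference at 0.5, i.e. local concavity
    h = 1e-3
    assert J(0.5 + h) + J(0.5 - h) - 2 * J(0.5) < 0
    ```

    The perturbation is essential here: at ε = 0 the two sub-cycles do not communicate, so nodes 6–10 never reach the home node and I − P̃(λ) is singular.
    
    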

    Figure 2: Function J''(λ)
