
Optimal Stochastic Location Updates in Mobile Ad Hoc Networks

Zhenzhen Ye and Alhussein A. Abouzeid

Abstract: We consider the location service in a mobile ad hoc network (MANET), where each node needs to maintain its location information by 1) frequently updating its location information within its neighboring region, which is called neighborhood update (NU), and 2) occasionally updating its location information at certain distributed location servers in the network, which is called location server update (LSU). The tradeoff between the operation costs of location updates and the performance losses of the target application due to location inaccuracies (i.e., application costs) raises a crucial question for nodes: which strategy for updating their location information is optimal, in the sense of minimizing the overall costs? In this paper, we develop a stochastic sequential decision framework to analyze this problem. Under a Markovian mobility model, the location update decision problem is modeled as a Markov Decision Process (MDP). We first investigate the monotonicity properties of optimal NU and LSU operations with respect to location inaccuracies under a general cost setting. Then, given a separable cost structure, we show that the location update decisions of NU and LSU can be carried out independently without loss of optimality, i.e., a separation property. From the discovered separation property of the problem structure and the monotonicity properties of optimal actions, we find that 1) there always exists a simple optimal threshold-based update rule for LSU operations; 2) for NU operations, an optimal threshold-based update rule exists in a low-mobility scenario. For the case in which no a priori knowledge of the MDP model is available, we also introduce a practical model-free learning approach to find a near-optimal solution for the problem.

Index Terms: Location update, mobile ad hoc networks, Markov decision processes, least-squares policy iteration.

1 INTRODUCTION

With the advance of very large-scale integrated circuits (VLSI) and the commercial popularity of global positioning services (GPS), the geographic location information of mobile devices in a mobile ad hoc network (MANET) is becoming available for various applications. This location information not only provides one more degree of freedom in designing network protocols [1], but also is critical for the success of many military and civilian applications [2], [3], e.g., localization in future battlefield networks [4], [5] and public safety communications [6], [7]. In a MANET, since the locations of nodes are not fixed, a node needs to frequently update its location information to some or all other nodes. There are two basic location update operations at a node to maintain its up-to-date location information in the network [8]. One operation is to update its location information within a neighboring region, where the neighboring region is not necessarily restricted to one-hop neighboring nodes [9], [10]. We call this operation neighborhood update (NU), which is usually implemented by local broadcasting/flooding of location information messages. The other operation is to update the node's location information at one or multiple distributed location servers. The positions of the location servers could be fixed (e.g., Homezone-based location services [11], [12]) or unfixed (e.g., Grid Location Service [13]). We call this operation location server update (LSU), which is usually implemented by unicast or multicast of the location information message via multihop routing in MANETs.

It is obvious that there is a tradeoff between the operation costs of location updates and the performance losses of the target application in the presence of the location errors (i.e., application costs). On one hand, if the operations of NU and LSU are too frequent, the power and communication bandwidth of nodes are wasted for those unnecessary updates. On the other hand, if the frequency of the operations of NU and/or LSU is not sufficient, the location error will degrade the performance of the application that relies on the location information of nodes (see [3] for a discussion of different location accuracy requirements for different applications). Therefore, to minimize the overall costs, location update strategies need to be carefully designed. Generally speaking, from the network point of view, the optimal design to minimize overall costs should be jointly carried out on all nodes, and thus, the strategies might be coupled. However, such a design has a formidable implementation complexity since it requires information about all nodes, which is hard and costly to obtain. Therefore, a more viable design is from the individual node point of view, i.e., each node independently chooses its location update strategy with its local information.

In this paper, we provide a stochastic decision framework to analyze the location update problem in MANETs. We formulate the location update problem at a node as a Markov Decision Process (MDP) [16], under a widely used Markovian mobility model [17], [18], [19]. Instead of solving the MDP model directly, the objective is to identify some

IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. 10, NO. X, XXXXXXX 2011 1

• Z. Ye is with iBasis, Inc., 20 2nd Avenue, Burlington, MA 01803. E-mail: [email protected].

• A.A. Abouzeid is with the Department of Electrical, Computer and Systems Engineering, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180. E-mail: [email protected].

Manuscript received 13 Apr. 2009; revised 13 Apr. 2010; accepted 23 June 2010; published online 14 Oct. 2010. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TMC-2009-04-0127. Digital Object Identifier no. 10.1109/TMC.2010.201.

1536-1233/11/$26.00 2011 IEEE Published by the IEEE CS, CASS, ComSoc, IES, & SPS

IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. 10, NO. 5, May 2011



general and critical properties of the problem structure and the optimal solution that could be helpful in providing insights into practical protocol design. We first investigate the solution structure of the model by identifying the monotonicity properties of optimal NU and LSU operations with respect to (w.r.t.) location inaccuracies under a general cost setting. Then, given a separable cost structure such that the effects of location inaccuracies induced by insufficient NU operations and LSU operations are separable, we show that the location update decisions on NU and LSU can be independently carried out without loss of optimality, i.e., a separation property exists. From the discovered separation property of the model and the monotonicity properties of optimal actions, we find that 1) there always exists a simple optimal threshold-based update rule for LSU operations where the threshold is generally location dependent; 2) for NU operations, an optimal threshold-based update rule exists in a heavy-traffic and/or a low-mobility scenario. The separation property of the problem structure and the existence of optimal thresholds in LSU and NU operations not only significantly simplify the search for optimal location update strategies, but also provide guidelines on designing location update algorithms in practice. We also provide a practical model-free learning approach to find a near-optimal solution for the location update problem, for the case in which no a priori knowledge of the MDP model is available in practice.

To the best of our knowledge, the location update problem in MANETs has not been formally addressed as a stochastic decision problem. The theoretical work on this problem is also very limited. In [9], the authors analyze the optimal location update strategy in a hybrid position-based routing scheme, in terms of minimizing achievable overall routing overhead. Although a closed-form optimal update threshold is obtained in [9], it is only valid for their routing scheme. In contrast, our analytical results can be applied in much broader application scenarios as the cost model used is generic and holds in many practical applications. On the other hand, the location management problem in mobile cellular networks has been extensively investigated in the literature (see [17], [18], [19]), where the tradeoff between the location update cost of a mobile device and the paging cost of the system is the main concern. A similar stochastic decision formulation with a semi-Markov Decision Process (SMDP) model for the location update in cellular networks has been proposed in [19]. However, there are several fundamental differences between our work and [19]. First, the separation principle discovered here is unique to the location update problem in MANETs since there are two different location update operations (i.e., NU and LSU); second, the monotonicity properties of the decision rules w.r.t. location inaccuracies have not been identified in [19]; and third, the value iteration algorithm used in [19] relies on the existence of powerful base stations, which can estimate the parameters of the decision process model, while the learning approach we provide here is model free and has a much lower implementation complexity, which is favorable to infrastructureless MANETs.

2 PROBLEM FORMULATION

2.1 Network Model

We consider a MANET in a finite region. The whole region is partitioned into small cells and the location of a node is identified by the index of the cell it resides in. The size of the cell is set to be sufficiently small such that the location difference within a cell has little impact on the performance of the target application. The distance between any two points in the region is discretized in units of the minimum distance between the centers of two cells. Since the area of the region is finite, the maximum distance between the centers of two cells is bounded. For notational simplicity, we map the set of possible distances between cell centers to a finite set $\{0, 1, \ldots, \bar{d}\}$, where 1 stands for the minimum distance between two distinct cells and $\bar{d}$ represents the maximum distance between cells. Hereafter, we use the nominal value $d(m, m') \in \{0, 1, \ldots, \bar{d}\}$ to represent the distance between two cells $m$ and $m'$.

Nodes in the network are mobile and follow a Markovian mobility model. Here, we emphasize that the Markovian assumption on the node's mobility is not restrictive in practice. In fact, any mobility setting with a finite memory on the past movement history can be converted into a Markovian type mobility model by suitably including the finite movement history into the definition of a state in the Markov chain. For illustration, we assume that the movement of a node only depends on the node's current position [17], [18], [19]. We assume that the time is slotted. In this discrete-time setting, the mobility model can be represented by the conditional probability $P(m' \mid m)$, i.e., the probability of the node's position at cell $m'$ in the next time slot given that the current position is at cell $m$. Given a finite maximum speed on the node's movement, when the duration of a time slot is set to be sufficiently small, it is reasonable to assume that

$$ P(m' \mid m) = 0, \quad d(m, m') > 1. \tag{1} $$

That is, a node can only move around its nearest neighboring cells in the duration of a time slot.
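As an illustration of assumption (1), the following minimal sketch builds a nearest-neighbor mobility matrix. It assumes a one-dimensional ring of cells and a lazy random walk with a single stay probability; the ring topology and the names `ring_distance`, `ring_mobility`, and `p_stay` are choices made here for illustration, not part of the model above, which uses a two-dimensional cell partition.

```python
def ring_distance(m, m2, n_cells):
    """Hop distance between cells m and m2 on a ring of n_cells cells."""
    diff = abs(m - m2)
    return min(diff, n_cells - diff)


def ring_mobility(n_cells, p_stay):
    """Return P with P[m][m2] = P(m2 | m) for a lazy random walk on a ring.

    Assumption for this sketch: the node stays put with probability p_stay
    and otherwise moves to one of its two adjacent cells with equal odds.
    """
    p_move = (1.0 - p_stay) / 2.0
    P = [[0.0] * n_cells for _ in range(n_cells)]
    for m in range(n_cells):
        P[m][m] += p_stay
        P[m][(m - 1) % n_cells] += p_move
        P[m][(m + 1) % n_cells] += p_move
    return P


P = ring_mobility(n_cells=6, p_stay=0.5)
```

Every row of `P` sums to one, and every entry with `ring_distance(m, m2, 6) > 1` is zero, which is exactly the restriction in (1).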

Each node in the network needs to update its location information within a neighboring region and to one location server (LS) in the network. The LS provides a node's location information to other nodes, which are outside of the node's neighboring region. There might be multiple LSs in the network. We emphasize that the location server defined here does not imply that the MANET needs to be equipped with any super-node or base station to provide the location service. For example, an LS can be interpreted as the Homezone of a node in [11], [12]. The neighboring region of a node is assumed to be much smaller than the area of the whole region, and thus, the NU operations are rather localized, which is also a highly preferred property for the scalability of the location service in a large-scale MANET. Fig. 1 illustrates the network setting and the location update model.

There are two types of location inaccuracies about the location of a node. One is the location error within the node's neighboring region, due to the node's mobility and insufficient NU operations. We call it local location error of


the node. Another is the inaccurate location information of the node stored at its LS, due to infrequent LSU operations. We call it global location ambiguity of the node. There are also two types of location related costs in the network. One is the cost of a location update operation, which could be physically interpreted as the power and/or bandwidth consumption in distributing the location messages. Another is the performance loss of the application induced by location inaccuracies of nodes. We call it application cost. To reduce the overall location related costs in the network, each node (locally) minimizes the total costs induced by its location update operations and location inaccuracies. The application cost induced by an individual node's location inaccuracies can be further classified as follows:

• Local Application Cost: This portion of application cost only depends on the node's local location error, which occurs when only the node's location information within its neighborhood is used. For instance, in a localized communication between nodes within their NU update ranges, a node usually only relies on its stored location information of its neighboring nodes, not the ones stored in distributed LSs. A specific example of this kind of cost is the expected forwarding progress loss in geographical routing [10], [15].

• Global Application Cost: This portion of application cost depends on both the node's local location error and global location ambiguity, when both (inaccurate) location information of the node within its neighborhood and that at its LS are used. This usually happens in the setup phase of a long-distance communication, where the node is the destination of the communication session and its location is unknown to the remote source node. In this case, the location information of the destination node at its LS is used to provide an estimation of its current location and a location request is sent from the source node to the destination node, based on this estimated location information. Depending on specific techniques used in location estimation and/or location discovery, the total cost in searching for the destination node can be solely determined by the destination node's global location ambiguity [14] or determined by both the node's local location error and global location ambiguity [8].

At the beginning of a time slot, each node decides if it needs to carry out an NU and/or an LSU operation. After performing the possible update of location information according to the decision, each node performs an application-specified operation (e.g., a local data forwarding or setting up a new communication session with another node) with the (possibly updated) location information of other nodes. Since decisions are associated with the costs discussed above, to minimize the total costs induced by its location update operations and location inaccuracies, a node has to optimize its decisions, as formulated next.

2.2 An MDP Model

As the location update decision needs to be carried out in each time slot, it is natural to formulate the location update problem as a discrete-time sequential decision problem. Under the given Markovian mobility model, this sequential decision problem can be formulated with an MDP model [16]. An MDP model is composed of a 4-tuple $\{S, A, P(\cdot \mid s, a), r(s, a)\}$, where $S$ is the state space, $A$ is the action set, $P(\cdot \mid s, a)$ is a set of state- and action-dependent state transition probabilities, and $r(s, a)$ is a set of state- and action-dependent instant costs. In the location update problem, we define these components as follows.

2.2.1 The State Space

Since both the local location error and the global location ambiguity introduce costs, and thus, have impacts on the node's decision, we define a state of the MDP model as $s = (m, d, q) \in S$, where $m$ is the current location of the node (i.e., the cell index), $d \ge 0$ is the distance between the current location and the location in the last NU operation (i.e., the local location error), and $q$ is the time (in the number of slots) elapsed since the last LSU operation (i.e., the age of the location information stored at the LS of the node). As the nearest possible LSU operation is in the last slot, the value of $q$ observed in the current slot is no less than 1. Since the global location ambiguity of the node is nondecreasing with $q$ [14], [20], we further impose an upper bound $\bar{q}$ on the value of $q$, corresponding to the case that the global location ambiguity of the node is so large that the location information at its LS is almost useless for the application. As all components in a state $s$ are finite, the state space $S$ is also finite.

2.2.2 The Action Set

As there are two basic location update operations, i.e., NU and LSU, we define an action of a state as a vector $a = (a_{NU}, a_{LSU}) \in A$, where $a_{NU} \in \{0, 1\}$ and $a_{LSU} \in \{0, 1\}$, with 0 standing for the action of not updating and 1 for the action of updating. The action set $A = \{(0, 0), (0, 1), (1, 0), (1, 1)\}$ is identical on all states $s \in S$.
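For concreteness, the finite state space and action set can be enumerated directly. The sketch below assumes integer cell indices `0..n_cells-1`; the function name `build_state_space` and the parameters `d_max` and `q_max` (standing for the upper bounds on the local location error and on the age of the LS information) are labels introduced here for illustration.

```python
from itertools import product


def build_state_space(n_cells, d_max, q_max):
    """Enumerate states (m, d, q) with 0 <= d <= d_max and 1 <= q <= q_max."""
    return [
        (m, d, q)
        for m in range(n_cells)
        for d in range(d_max + 1)
        for q in range(1, q_max + 1)
    ]


# Actions are pairs (a_NU, a_LSU); 0 means "no update", 1 means "update".
ACTIONS = list(product((0, 1), repeat=2))

states = build_state_space(n_cells=4, d_max=2, q_max=3)
```

The state count is `n_cells * (d_max + 1) * q_max`, which is what makes exhaustive methods feasible only for small regions and motivates looking for structured (threshold) policies.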


Fig. 1. Illustration of the location update model in a MANET, where the network is partitioned into small square cells; LS(A) is the location server of node A; node A (frequently) carries out NU operations within its neighborhood (i.e., NU range) and (occasionally) updates its location information to its LS, via LSU operations.



2.2.3 State Transition Probabilities

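Under the given Markovian mobility model, the state transition between consecutive time slots is determined by the current state and the action. That is, given the current state $s_t = (m, d, q)$ and the action $a_t = (a_{NU}, a_{LSU})$, the probability of the next state $s_{t+1} = (m', d', q')$ is given by $P(s_{t+1} \mid s_t, a_t)$. Observing that the transition from $q$ to $q'$ is deterministic for a given $a_{LSU}$, i.e.,

$$ q' = \begin{cases} \min\{q + 1, \bar{q}\}, & a_{LSU} = 0, \\ 1, & a_{LSU} = 1, \end{cases} \tag{2} $$

we have

$$ \begin{aligned} P(s_{t+1} \mid s_t, a_t) &= P(m', d', q' \mid m, d, q, a_{NU}, a_{LSU}) \\ &= P(d' \mid m, d, m', a_{NU}) \, P(q' \mid q, a_{LSU}) \, P(m' \mid m) \\ &= \begin{cases} P(d' \mid m, d, m') \, P(m' \mid m), & a_{NU} = 0, \\ P(m' \mid m), & a_{NU} = 1, \end{cases} \end{aligned} \tag{3} $$

for $s_{t+1} = (m', d', q')$, where $q'$ satisfies (2) and $d' = d(m, m')$ if $a_{NU} = 1$, and zero for other $s_{t+1}$.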
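The transition structure in (2) and (3) can be sketched as a function that returns the distribution of the next state. This is an illustrative sketch only: the callables `dist` (cell distance) and `P_err` (the conditional law of the next local error when no NU is performed) are hypothetical inputs that the caller must supply, since the text leaves that conditional law abstract.

```python
def next_q(q, a_lsu, q_max):
    """Age update of (2): reset to 1 after an LSU, else grow up to q_max."""
    return 1 if a_lsu == 1 else min(q + 1, q_max)


def transition(state, action, P_move, dist, P_err, q_max):
    """Return {next_state: probability} for one slot, following (2) and (3)."""
    m, d, q = state
    a_nu, a_lsu = action
    q2 = next_q(q, a_lsu, q_max)
    out = {}
    for m2, p_m in enumerate(P_move[m]):
        if p_m == 0.0:
            continue
        if a_nu == 1:
            # After an NU the error is measured from the freshly announced cell m.
            branches = {dist(m, m2): 1.0}
        else:
            # Hypothetical user-supplied law of d' given (m, d, m').
            branches = P_err(m, d, m2)
        for d2, p_d in branches.items():
            s2 = (m2, d2, q2)
            out[s2] = out.get(s2, 0.0) + p_m * p_d
    return out


# Toy two-cell example; keep_error is a placeholder law, not a realistic one.
P_move = [[0.5, 0.5], [0.5, 0.5]]
dist = lambda m, m2: int(m != m2)
keep_error = lambda m, d, m2: {d: 1.0}
after_nu = transition((0, 1, 3), (1, 0), P_move, dist, keep_error, q_max=3)
after_lsu = transition((0, 1, 2), (0, 1), P_move, dist, keep_error, q_max=3)
```

In `after_nu` the age stays capped at `q_max` and the error is reset to the one-step displacement; in `after_lsu` the age is reset to 1 while the error evolves according to the supplied law.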

2.2.4 Costs

We define a generic cost model for the location related costs mentioned in Section 2.1, which preserves basic properties of the costs met in practice.

• The NU operation cost is denoted as $c_{NU}(a_{NU})$, where $c_{NU}(1) > 0$ represents the (localized) flooding/broadcasting cost and $c_{NU}(0) = 0$ as no NU operation is carried out.

• The (expected) LSU operation cost $c_{LSU}(m, a_{LSU})$ is a function of the node's position and the action $a_{LSU}$. Since an LSU operation is a multihop unicast transmission between the node and its LS, this cost is a nondecreasing function of the distance between the LS and the node's current location $m$ if $a_{LSU} = 1$, and $c_{LSU}(m, 0) = 0, \forall m$.

• The (expected) local application cost is denoted as $c_l(m, d, a_{NU})$, which is a function of the node's position $m$, the local location error $d$, and the NU action $a_{NU}$. Naturally, $c_l(m, 0, a_{NU}) = 0, \forall (m, a_{NU})$, when the local location error $d = 0$, and $c_l(m, d, a_{NU})$ is nondecreasing with $d$ at any location $m$ if no NU operation is carried out. When $a_{NU} = 1$, $c_l(m, d, 1) = 0, \forall (m, d)$.

• The (expected) global application cost is denoted as $c_g(m, d, q, a_{NU}, a_{LSU})$, which is a function of the node's current location $m$, the local location error $d$, the age of the location information at the LS (i.e., $q$), the NU action $a_{NU}$, and the LSU action $a_{LSU}$. For different actions $a = (a_{NU}, a_{LSU})$, we set

$$ c_g(m, d, q, a_{NU}, a_{LSU}) = \begin{cases} c_{dq}(m, d, q), & a = (0, 0), \\ c_d(m, d), & a = (0, 1), \\ c_q(m, q), & a = (1, 0), \\ 0, & a = (1, 1), \end{cases} \tag{4} $$

where $c_{dq}(m, d, q)$ is the cost given that there is no location update operation; $c_d(m, d)$ is the cost given that the location information at the LS is up-to-date (i.e., $a_{LSU} = 1$); and $c_q(m, q)$ is the cost given that the location information within the node's neighborhood is up-to-date (i.e., $a_{NU} = 1$). We assume that the following properties hold for $c_g(m, d, q, a_{NU}, a_{LSU})$:

1. $c_{dq}(m, d, q)$ is component-wise nondecreasing with $d$ and $q$ at any location $m$;
2. $c_d(m, d)$ is nondecreasing with $d$ at any location $m$ and $c_d(m, 0) = 0$;
3. $c_q(m, q)$ is nondecreasing with $q$ at any location $m$;
4. $c_{dq}(m, 0, q) = c_q(m, q)$.

All the above costs are nonnegative. The nondecreasing properties of costs w.r.t. location inaccuracies hold in almost all practical applications.

With the above model parameters, the objective of the location update decision problem at a node can be stated as finding a policy $\pi = \{\delta_t\}, t = 1, 2, \ldots$, to minimize the expected total cost in a decision horizon. Here, $\delta_t$ is the decision rule specifying the actions on all possible states at the beginning of a time slot $t$ and the policy $\pi$ includes decision rules over the whole decision horizon. A decision horizon is chosen to be the interval between two consecutive location requests to the node. Observing that the beginning of a decision horizon is also the ending of the last horizon, the node continuously minimizes the expected total cost within the current decision horizon. This choice of the decision horizon is especially appropriate for real-time applications where the future location related costs are less important. Fig. 2 illustrates the decision process in a decision horizon. The decision horizon has a length of $H$ time slots, where $H \ge 1$ is a random variable since the arrival of a location request to the node is random. At any decision epoch $t$ with the state of the node as $s_t$, the node takes an action $a_t$, which specifies what location update action the node performs in this time slot. Then, the node receives a cost $r(s_t, a_t)$, which is composed of operation costs and application costs. For example, if the state is $s_t = (m_t, d_t, q_t)$ at the decision epoch $t$ and a decision rule $\delta_t(s_t) = (\delta_t^{NU}(s_t), \delta_t^{LSU}(s_t))$ is adopted, the cost is given by

$$ r(s_t, \delta_t(s_t)) = \begin{cases} c_{NU}(\delta_t^{NU}(s_t)) + c_{LSU}(m_t, \delta_t^{LSU}(s_t)) + c_l(m_t, d_t, \delta_t^{NU}(s_t)), & t < H, \\ c_{NU}(\delta_t^{NU}(s_t)) + c_{LSU}(m_t, \delta_t^{LSU}(s_t)) + c_l(m_t, d_t, \delta_t^{NU}(s_t)) + c_g(s_t, \delta_t(s_t)), & t = H, \end{cases} $$


[Figure 2: timeline of one decision horizon between two consecutive arrivals of a location request, showing the states $s_1, s_2, s_3, \ldots, s_H$, the actions $a_1, a_2, a_3, \ldots, a_H$, and the costs $r(s_1, a_1), r(s_2, a_2), r(s_3, a_3), \ldots, r(s_H, a_H)$ over slots #1 to #H.]

Fig. 2. The illustration of the MDP model with the expected total cost criterion, where the delay of a location request w.r.t. the beginning of a time slot is due to the location update operations at the beginning of the time slot and the transmission delay of the location request message.




where the global application cost $c_g(s_t, \delta_t(s_t))$ is introduced when a location request arrives.

Therefore, for a given policy $\pi = \{\delta_1, \delta_2, \ldots\}$, the expected total cost in a decision horizon for any initial state $s_1 \in S$ is

$$ v^{\pi}(s_1) = \mathbb{E}^{\pi}_{s_1} \left\{ \sum_{t=1}^{H} r(s_t, \delta_t(s_t)) \right\}, $$

where the expectation is over all random state transitions and the random horizon length $H$. $v^{\pi}$ is also called the value function for the given policy $\pi$ in the MDP literature. Assume that the probability of a location request arrival in each time slot is $\lambda$, where $0 < \lambda < 1$ and $\lambda$ might be different at different nodes in general. With some algebraic manipulation, we can show that

$$ v^{\pi}(s_1) = \mathbb{E}^{\pi}_{s_1} \left\{ \sum_{t=1}^{\infty} (1 - \lambda)^{t-1} r_e(s_t, \delta_t(s_t)) \right\}, \tag{5} $$

where $r_e(s_t, \delta_t(s_t)) \triangleq c_{NU}(\delta_t^{NU}(s_t)) + c_{LSU}(m_t, \delta_t^{LSU}(s_t)) + c_l(m_t, d_t, \delta_t^{NU}(s_t)) + \lambda c_g(s_t, \delta_t(s_t))$ is the effective cost per slot.
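The algebraic step from the random-horizon sum to the discounted sum is not spelled out above. One way to see it, assuming that request arrivals are independent across slots and independent of the node's movement (so that the horizon length $H$ is geometric with the per-slot request probability, written $\lambda$ here), is sketched below. The shorthand $r_o$ for the per-slot operation and local application cost is introduced only for this sketch.

```latex
\begin{align*}
\Pr\{H \ge t\} &= (1-\lambda)^{t-1}, \qquad
\Pr\{H = t\} = \lambda (1-\lambda)^{t-1}, \\
r_o(s_t, a_t) &\triangleq c_{NU}(a_{NU}) + c_{LSU}(m_t, a_{LSU}) + c_l(m_t, d_t, a_{NU}), \\
\mathbb{E}\Big\{ \sum_{t=1}^{H} r(s_t, a_t) \Big\}
 &= \mathbb{E}\Big\{ \sum_{t=1}^{\infty}
    \big[ \mathbf{1}\{H \ge t\}\, r_o(s_t, a_t)
        + \mathbf{1}\{H = t\}\, c_g(s_t, a_t) \big] \Big\} \\
 &= \sum_{t=1}^{\infty} (1-\lambda)^{t-1}\,
    \mathbb{E}\big\{ r_o(s_t, a_t) + \lambda\, c_g(s_t, a_t) \big\}.
\end{align*}
```

The operation and local costs are paid in every slot up to and including slot $H$, while the global application cost is paid only in the slot where the request arrives, which is why only that term picks up the extra factor $\lambda$.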

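Specifically, for any $s = (m, d, q)$ and $a = (a_{NU}, a_{LSU})$,

$$ r_e(s, a) = \begin{cases} c_l(m, d, 0) + \lambda c_{dq}(m, d, q), & a = (0, 0), \\ c_l(m, d, 0) + \lambda c_d(m, d) + c_{LSU}(m, 1), & a = (0, 1), \\ c_{NU}(1) + \lambda c_q(m, q), & a = (1, 0), \\ c_{NU}(1) + c_{LSU}(m, 1), & a = (1, 1). \end{cases} \tag{6} $$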
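The four cases of the effective per-slot cost translate directly into code. In the sketch below the cost functions are passed in as callables; their names and signatures (`c_l(m, d)` for the local cost without an NU, `c_lsu(m)` for the LSU cost when an update is made, and so on) are conventions chosen here, and the toy cost values are arbitrary. The per-slot request probability is written `lam`.

```python
def effective_cost(s, a, lam, c_nu1, c_lsu, c_l, c_dq, c_d, c_q):
    """Effective per-slot cost for state s = (m, d, q) and action a = (a_NU, a_LSU)."""
    m, d, q = s
    if a == (0, 0):
        return c_l(m, d) + lam * c_dq(m, d, q)
    if a == (0, 1):
        return c_l(m, d) + lam * c_d(m, d) + c_lsu(m)
    if a == (1, 0):
        return c_nu1 + lam * c_q(m, q)
    if a == (1, 1):
        return c_nu1 + c_lsu(m)
    raise ValueError("action must be a pair of 0/1 flags")


# Arbitrary toy costs that respect c_dq(m, 0, q) = c_q(m, q) and c_d(m, 0) = 0.
toy = dict(
    lam=0.5,
    c_nu1=1.0,
    c_lsu=lambda m: 2.0,
    c_l=lambda m, d: float(d),
    c_dq=lambda m, d, q: float(d + q),
    c_d=lambda m, d: float(d),
    c_q=lambda m, q: float(q),
)
costs = {a: effective_cost((0, 2, 3), a, **toy) for a in [(0, 0), (0, 1), (1, 0), (1, 1)]}
```

With these toy numbers, performing only an NU is the cheapest single-slot choice at `(m, d, q) = (0, 2, 3)`; the optimal action also depends on the expected future cost, which the per-slot value alone does not capture.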

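Equation (5) shows that the original MDP model with the expected total cost criterion can be transformed into a new MDP model with the expected total discounted cost criterion with a discount factor $(1 - \lambda) \in (0, 1)$ over an infinite time horizon, and the cost per slot is given by $r_e(s_t, \delta_t(s_t))$. One should notice that there is no change in the values $v^{\pi}(s), s \in S$, in this transformation. For a stationary policy $\pi = \{\delta, \delta, \ldots\}$, (5) becomes

$$ \begin{aligned} v^{\pi}(s_1) &= r_e(s_1, \delta(s_1)) + (1 - \lambda) \sum_{s_2} P(s_2 \mid s_1, \delta(s_1)) \, \mathbb{E}^{\pi}_{s_2} \left\{ \sum_{t=1}^{\infty} (1 - \lambda)^{t-1} r_e(s'_t, \delta(s'_t)) \right\} \\ &= r_e(s_1, \delta(s_1)) + (1 - \lambda) \sum_{s_2} P(s_2 \mid s_1, \delta(s_1)) \, v^{\pi}(s_2), \quad \forall s_1 \in S, \end{aligned} \tag{7} $$

where $s'_t \triangleq s_{t+1}$. Since the state space $S$ and the action set $A$ are finite in our formulation, there exists an optimal deterministic stationary policy $\pi^* = \{\delta^*, \delta^*, \ldots\}$ to minimize $v^{\pi}(s), \forall s \in S$, among all policies (see [16], Chapter 6).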

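Furthermore, the optimal value $v(s)$ (i.e., the minimum expected total cost in a decision horizon) can be found by solving the following optimality equations

$$ v(s) = \min_{a \in A} \left\{ r_e(s, a) + (1 - \lambda) \sum_{s'} P(s' \mid s, a) \, v(s') \right\}, \quad \forall s \in S, \tag{8} $$

and the corresponding optimal decision rule is

$$ \delta^*(s) = \arg\min_{a \in A} \left\{ r_e(s, a) + (1 - \lambda) \sum_{s'} P(s' \mid s, a) \, v(s') \right\}, \quad \forall s \in S. \tag{9} $$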
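When the model is fully known and small, optimality equations of this discounted form can be solved numerically by standard value iteration. The sketch below is a generic textbook procedure, not the method proposed in this paper (which aims at structural results and a model-free learning approach); the function name, the callable interfaces `cost(s, a)` and `trans(s, a)`, and the tolerance are all choices made here. The discount factor is one minus the per-slot request probability `lam`.

```python
def value_iteration(states, actions, cost, trans, lam, tol=1e-10, max_iter=100000):
    """Solve v(s) = min_a { cost(s, a) + (1 - lam) * sum_s' P(s'|s, a) v(s') }.

    cost(s, a) returns the effective per-slot cost; trans(s, a) returns a
    dict {next_state: probability}. Returns (values, greedy_policy).
    """
    gamma = 1.0 - lam

    def backup(s, a, v):
        return cost(s, a) + gamma * sum(p * v[s2] for s2, p in trans(s, a).items())

    v = {s: 0.0 for s in states}
    for _ in range(max_iter):
        new_v = {s: min(backup(s, a, v) for a in actions) for s in states}
        gap = max(abs(new_v[s] - v[s]) for s in states)
        v = new_v
        if gap < tol:
            break
    policy = {s: min(actions, key=lambda a: backup(s, a, v)) for s in states}
    return v, policy


# One-state sanity check: paying 1 per slot forever with discount 0.5 costs 2.
v, policy = value_iteration(
    states=["x"],
    actions=[0, 1],
    cost=lambda s, a: 1.0 + a,
    trans=lambda s, a: {"x": 1.0},
    lam=0.5,
)
```

Because the discount factor is strictly below one, the backup is a contraction and the iteration converges geometrically; the cost per sweep grows with the number of states times the number of actions, which is why structural shortcuts are attractive on constrained devices.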

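Specifically, $\forall s = (m, d, q) \in S$, let

$$ W(m, d, q) \triangleq c_l(m, d, 0) + \lambda c_{dq}(m, d, q) + (1 - \lambda) \sum_{m', d'} P((m', d') \mid (m, d)) \, v(m', d', \min\{q + 1, \bar{q}\}), \tag{10} $$

$$ X(m, d) \triangleq c_l(m, d, 0) + \lambda c_d(m, d) + c_{LSU}(m, 1) + (1 - \lambda) \sum_{m', d'} P((m', d') \mid (m, d)) \, v(m', d', 1), \tag{11} $$

$$ Y(m, q) \triangleq c_{NU}(1) + \lambda c_q(m, q) + (1 - \lambda) \sum_{m'} P(m' \mid m) \, v(m', d(m, m'), \min\{q + 1, \bar{q}\}), \tag{12} $$

$$ Z(m) \triangleq c_{NU}(1) + c_{LSU}(m, 1) + (1 - \lambda) \sum_{m'} P(m' \mid m) \, v(m', d(m, m'), 1), \tag{13} $$

the optimality equation in (8) becomes

$$ v(m, d, q) = \min \Big\{ \overbrace{W(m, d, q)}^{a = (0, 0)}, \; \overbrace{X(m, d)}^{a = (0, 1)}, \; \overbrace{Y(m, q)}^{a = (1, 0)}, \; \overbrace{Z(m)}^{a = (1, 1)} \Big\}, \quad \forall s = (m, d, q) \in S, \tag{14} $$

and the optimal decision rule $\delta^*(m, d, q) = (\delta^{NU*}(m, d, q), \delta^{LSU*}(m, d, q))$ is given by

$$ \delta^{NU*}(m, d, q) = \begin{cases} 0, & \min\{W(m, d, q), X(m, d)\} < \min\{Y(m, q), Z(m)\}, \\ 1, & \text{otherwise}, \end{cases} \tag{15} $$

$$ \delta^{LSU*}(m, d, q) = \begin{cases} 0, & \min\{W(m, d, q), Y(m, q)\} < \min\{X(m, d), Z(m)\}, \\ 1, & \text{otherwise}. \end{cases} \tag{16} $$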
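Once the four action values of a state are available, the two comparisons that define the optimal NU and LSU actions are a few lines of code. The sketch assumes the values for actions (0, 0), (0, 1), (1, 0), and (1, 1) have already been computed and are passed in as plain numbers named `W`, `X`, `Y`, and `Z`; the function name is chosen here.

```python
def optimal_actions(W, X, Y, Z):
    """Map the action values for (0,0), (0,1), (1,0), (1,1) to (a_NU, a_LSU).

    Ties are resolved in favor of updating, matching the strict inequalities
    in the decision rule.
    """
    a_nu = 0 if min(W, X) < min(Y, Z) else 1
    a_lsu = 0 if min(W, Y) < min(X, Z) else 1
    return a_nu, a_lsu


examples = [
    optimal_actions(1.0, 2.0, 3.0, 4.0),  # doing nothing is cheapest
    optimal_actions(3.0, 1.0, 4.0, 2.0),  # LSU only is cheapest
    optimal_actions(4.0, 3.0, 2.0, 1.0),  # both updates are cheapest
    optimal_actions(1.0, 1.0, 1.0, 1.0),  # full tie
]
```

The returned pair always coincides with a minimizer of the four values, so the two componentwise comparisons lose nothing relative to a joint argmin.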

3 THE EXISTENCE OF A STRUCTURED OPTIMAL POLICY

In this section, we investigate the existence of a structured optimal policy of the proposed MDP model (8). Such a policy is attractive for implementation in energy and/or computation limited mobile devices: once we know that there exists an optimal policy with a certain special structure, the search effort for the optimal policy in the state-action space can be reduced. We are especially interested in the component-wise monotonicity property of an optimal decision rule, i.e., the action is monotone w.r.t. a certain component of the state, given that the other components of the state are fixed.




3.1 The Monotonicity of Optimal Values and Actions w.r.t. q

Considering the decisions on LSU operations, we show that the optimal values $v(m, d, q)$ and the corresponding optimal action $\delta^{LSU*}(m, d, q)$ are nondecreasing with the value of $q$, for any given current location $m$ and local location error $d$ of the node.

Lemma 3.1. $v(m, d, q_1) \le v(m, d, q_2)$, $\forall (m, d)$, and $1 \le q_1 \le q_2 \le \bar{q}$.

Proof. See the Appendix. □

Theorem 3.2. $\delta^{LSU*}(m, d, q_1) \le \delta^{LSU*}(m, d, q_2)$, $\forall (m, d)$, and $1 \le q_1 \le q_2 \le \bar{q}$.

Proof. From the proof of Lemma 3.1, we have seen that $W(m, d, q)$ in (10) and $Y(m, q)$ in (12) are nondecreasing with $q$, and $\min\{X(m, d), Z(m)\}$ is a constant, for any given $(m, d)$. The result then follows by (16). □
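A practical reading of Theorem 3.2 is that, for fixed location and local error, the optimal LSU action over the age q can be stored as a single threshold: update once q reaches it. The sketch below extracts that threshold from a list of 0/1 actions indexed by q = 1, 2, ...; the helper names are introduced here for illustration.

```python
def is_nondecreasing(seq):
    """True if each entry is at least as large as the one before it."""
    return all(x <= y for x, y in zip(seq, seq[1:]))


def lsu_threshold(rule_over_q):
    """Smallest age q (1-based) at which the rule says "update", or None.

    rule_over_q[i] is the LSU action at age q = i + 1 for a fixed (m, d);
    the list is assumed to be nondecreasing, as the theorem guarantees for
    an optimal rule.
    """
    if not is_nondecreasing(rule_over_q):
        raise ValueError("rule is not monotone in q")
    for i, action in enumerate(rule_over_q):
        if action == 1:
            return i + 1
    return None


threshold = lsu_threshold([0, 0, 1, 1])
never = lsu_threshold([0, 0, 0])
```

Storing one integer per (m, d) pair instead of a full table over q reduces memory, and a search for the optimal rule only needs to consider threshold rules.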

3.2 The Monotonicity of Optimal Values and Actions w.r.t. d

We similarly investigate if the optimal values $v(m, d, q)$ and the corresponding optimal action $\delta^{NU*}(m, d, q)$ are nondecreasing with the local location error $d$, for any given current location $m$ and the age $q$ of the location information at the LS of the node. We first assume that a torus border rule [25] is applied to govern the movements of nodes on the boundaries of the network region. Although, without this assumption, the following condition (2) might not hold when a node is around network boundaries, this assumption can be relaxed, in practice, when nodes have small probabilities to be on the network boundaries. Then, we impose two conditions on the mobility pattern and/or traffic intensity of the node.

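1. $\dfrac{c_l(m, 1, 0)}{(1 - \lambda)(1 - P(m \mid m))} \ge c_{NU}(1), \; \forall m$;

2. given any $m$ and $m'$ such that $P(m' \mid m) \ne 0$, $P(d' \ge x \mid m, d_1, m') \le P(d' \ge x \mid m, d_2, m')$, for all $x \in \{0, \ldots, \bar{d}\}$, $1 \le d_1 \le d_2 \le \bar{d}$.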

For condition (1), since both the local application cost $c_l(m, 1, 0)$ (with local location error $d = 1$, $a_{NU} = 0$) and the location update cost $c_{NU}(1)$ in an NU operation are constants, $(1 - \lambda)(1 - P(m \mid m))$ needs to be sufficiently small, which can be satisfied if the traffic intensity on the node is high (i.e., the location request rate $\lambda$ is high) and/or the mobility degree of the node at any location is low (i.e., the probability $P(m \mid m)$ that the node's location is unchanged in a time slot is high). Condition (2) indicates that a larger location error $d$ in the current time slot is more likely to remain large in the next time slot, if no NU operation is performed in the current time slot, which can also be easily satisfied when the node's mobility degree is low. These two conditions are sufficient for the existence of the monotonicity properties of the optimal values and actions with the value of $d$, which are stated as follows.¹

Lemma 3.3. Under the conditions (1) and (2), $v(m, d_1, q) \le v(m, d_2, q)$, $\forall (m, q)$, and $0 \le d_1 \le d_2 \le \bar{d}$.

Proof. See the Appendix. □

With Lemma 3.3, the monotonicity of the optimal action $\delta^{NU*}(m, d, q)$ w.r.t. $d$ is stated in the following theorem.

Theorem 3.4. Under the conditions (1) and (2), $\delta^{NU*}(m, d_1, q) \le \delta^{NU*}(m, d_2, q)$, $\forall (m, q)$, and $0 \le d_1 \le d_2 \le \bar{d}$.

Proof. From Lemma 3.3 and its proof, we have seen that $W0(m, d, q)$ and $X0(m, d)$ are nondecreasing with $d$, for any given $(m, q)$ and an arbitrarily chosen $u0 \in V$. Letting $u0 = v \in V$, $W(m, d, q)$ in (10) and $X(m, d)$ in (11) are thus also nondecreasing with $d$. Since $Y(m, q)$ in (12) and $Z(m)$ in (13) are constants for any given $(m, q)$, the result follows by (15). □
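Condition (1) of Section 3.2 is easy to check numerically for a given cost and mobility setting. The sketch below reads the condition as c_l(m, 1, 0) / ((1 - λ)(1 - P(m|m))) ≥ c_NU(1) for every cell m, with λ the per-slot request probability; it uses the equivalent product form to avoid dividing by zero when a node never moves. The function and argument names are introduced here, and the numbers are arbitrary.

```python
def condition_one_holds(c_l_at_1, p_stay, lam, c_nu1):
    """Check c_l(m,1,0) >= c_NU(1) * (1 - lam) * (1 - P(m|m)) at every cell m.

    c_l_at_1[m] is the local application cost at error d = 1 without an NU,
    p_stay[m] is the probability P(m|m) of staying in cell m for one slot.
    """
    return all(
        cl >= c_nu1 * (1.0 - lam) * (1.0 - ps)
        for cl, ps in zip(c_l_at_1, p_stay)
    )


# Heavy traffic and low mobility: the condition holds.
heavy_low = condition_one_holds([1.0, 1.0], [0.8, 0.8], lam=0.5, c_nu1=5.0)
# Light traffic and high mobility: the condition fails.
light_high = condition_one_holds([1.0, 1.0], [0.2, 0.2], lam=0.1, c_nu1=5.0)
```

Since the two conditions are only sufficient, a failed check does not rule out a threshold-type NU rule; it only means the theorem above does not apply.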

4 THE CASE OF A SEPARABLE COST STRUCTURE

In this section, we consider the case that the global application cost described in Section 2.1 only depends on the global location ambiguity of the node (at its LS), i.e., $c_g(m, d, q, a_{NU}, a_{LSU})$ in (4) is independent of the local location error $d$ and the neighborhood update action $a_{NU}$. In this case, the global application cost can be denoted as $c_g(m, q, a_{LSU})$, i.e.,

$$ c_g(m, q, a_{LSU}) = \begin{cases} c_q(m, q), & a_{LSU} = 0, \\ 0, & a_{LSU} = 1. \end{cases} $$

As mentioned in Section 2.1, this special case holds under certain location estimation and/or location discovery techniques. In practice, there are some such examples. In the Location Aided Routing (LAR) scheme [14], a directional flooding technique is used to discover the location of the destination node. The corresponding search cost (i.e., the directional flooding cost) is proportional to the destination node's global location ambiguity (equivalently, $q$) while the destination node's local location error (i.e., $d$) has little impact on this cost. For another example, there are various unbiased location tracking algorithms available for the applications in MANETs, e.g., a Kalman filter with adaptive observation intervals [20]. If such an algorithm is used at the LS, the effect of the destination node's local location error on the search cost is also eliminated, since the location estimation provided by the LS is unbiased and the estimation error (e.g., variance) only depends on the age of the location information at the LS (i.e., $q$) [20].

Under this setting for the global application cost, we find that the impacts of $d$ and $q$ are separable in the effective cost $r_e(s, a)$ in (6), i.e., a separable cost structure exists. Specifically, for any $s = (m, d, q)$ and $a = (a_{NU}, a_{LSU})$,

$$ r_e(s, a) = r_{e,NU}(m, d, a_{NU}) + r_{e,LSU}(m, q, a_{LSU}), \tag{17} $$

where

$$ r_{e,NU}(m, d, a_{NU}) = \begin{cases} c_l(m, d, 0), & a_{NU} = 0, \\ c_{NU}(1), & a_{NU} = 1, \end{cases} \tag{18} $$

$$ r_{e,LSU}(m, q, a_{LSU}) = \begin{cases} \lambda c_q(m, q), & a_{LSU} = 0, \\ c_{LSU}(m, 1), & a_{LSU} = 1. \end{cases} \tag{19} $$

1. The sufficiency of the conditions (1) and (2) implies that the monotonicity property of the optimal values and actions with $d$ might hold in a broader range of traffic and mobility settings.




Together with the structure of the state-transition probabil-

ities in (2) and (3), we find that the original location update

decision problem can be partitioned into two subproblems,

the NU decision subproblem and the LSU decision

subproblem, and they can be solved separately without loss of

optimality. To formally state this separation principle, we

first construct two MDP models as follows.

4.1 An MDP Model for the NU Decision Subproblem

In the NU decision subproblem (P1), the objective is to balance the cost in NU operations and the local application cost to achieve the minimum sum of these two costs in a decision horizon. An MDP model for this problem can be defined as the 4-tuple $\{S_{NU}, A_{NU}, P(\cdot|s_{NU},a_{NU}), r(s_{NU},a_{NU})\}$. Specifically, a state is defined as $s_{NU} = (m,d) \in S_{NU}$, the action is $a_{NU} \in \{0,1\}$, the state transition probability $P(s'_{NU}|s_{NU},a_{NU})$ follows (3) for $s_{NU} = (m,d)$ and $s'_{NU} = (m',d')$, where $d' = d(m,m')$ if $a_{NU} = 1$, and the instant cost is $r_{e,NU}(m,d,a_{NU})$ in (18).

Similar to the procedure described in Section 2.2, the MDP model with the expected total cost criterion for the NU decision subproblem can also be transformed into an equivalent MDP model with the expected total discounted cost criterion (with the discount factor $1-\lambda$). The optimality equations are given by

$$ \begin{aligned} v_{NU}(m,d) &= \min_{a_{NU} \in \{0,1\}} \Big\{ r_{e,NU}(m,d,a_{NU}) + (1-\lambda) \sum_{m',d'} P(m',d'|m,d,a_{NU})\, v_{NU}(m',d') \Big\} \\ &= \min\{ \overbrace{E(m,d)}^{a_{NU}=0},\ \overbrace{F(m)}^{a_{NU}=1} \}, \quad \forall (m,d) \in S_{NU}, \end{aligned} \tag{20} $$

where $v_{NU}(m,d)$ is the optimal value of the state $(m,d)$ and

$$ E(m,d) \triangleq c_l(m,d,0) + (1-\lambda) \sum_{m',d'} P(m',d'|m,d)\, v_{NU}(m',d'), \tag{21} $$

$$ F(m) \triangleq c_{NU}(1) + (1-\lambda) \sum_{m'} P(m'|m)\, v_{NU}(m', d(m,m')). \tag{22} $$

Since the state space $S_{NU}$ and action set $A_{NU}$ are finite, the optimality equations (20) have a unique solution and there exists an optimal deterministic stationary policy [16]. The corresponding optimal decision rule $\delta_{NU}$ is given by

$$ \delta_{NU}(m,d) = \begin{cases} 0, & E(m,d) < F(m), \\ 1, & \text{otherwise}, \end{cases} \quad \forall (m,d) \in S_{NU}. \tag{23} $$

4.2 An MDP Model for the LSU Decision Subproblem

In the LSU decision subproblem (P2), the objective is to balance the cost in LSU operations and the global application cost to achieve the minimum sum of these two costs in a decision horizon. An MDP model for this problem can be defined as the 4-tuple $\{S_{LSU}, A_{LSU}, P(\cdot|s_{LSU},a_{LSU}), r(s_{LSU},a_{LSU})\}$. Specifically, a state is defined as $s_{LSU} = (m,q) \in S_{LSU}$, the action is $a_{LSU} \in \{0,1\}$, the state transition probabilities are $P(s'_{LSU}|s_{LSU},a_{LSU}) = P(m'|m)$ for the state transition from $s_{LSU} = (m,q)$ to $s'_{LSU} = (m',q')$, where $q'$ is given in (2), and the instant cost is $r_{e,LSU}(m,q,a_{LSU})$ in (19).

Similar to the procedure described in Section 2.2, the MDP model with the expected total cost criterion for the LSU decision subproblem can also be transformed into an equivalent MDP model with the expected total discounted cost criterion (with the discount factor $1-\lambda$). The optimality equations are given by

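$$ \begin{aligned} v_{LSU}(m,q) &= \min_{a_{LSU} \in \{0,1\}} \Big\{ r_{e,LSU}(m,q,a_{LSU}) + (1-\lambda) \sum_{m',q'} P(m',q'|m,q,a_{LSU})\, v_{LSU}(m',q') \Big\} \\ &= \min\{ \overbrace{G(m,q)}^{a_{LSU}=0},\ \overbrace{H(m)}^{a_{LSU}=1} \}, \quad \forall (m,q) \in S_{LSU}, \end{aligned} \tag{24} $$

where $v_{LSU}(m,q)$ is the optimal value of the state $(m,q)$ and

$$ G(m,q) \triangleq c_q(m,q) + (1-\lambda) \sum_{m'} P(m'|m)\, v_{LSU}(m', \min\{q+1, \bar{q}\}), \tag{25} $$

$$ H(m) \triangleq c_{LSU}(m,1) + (1-\lambda) \sum_{m'} P(m'|m)\, v_{LSU}(m', 1). \tag{26} $$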

Since the state space $S_{LSU}$ and action set $A_{LSU}$ are finite, the optimality equations (24) have a unique solution and there exists an optimal deterministic stationary policy [16]. The corresponding optimal decision rule $\delta_{LSU}$ is given by

$$ \delta_{LSU}(m,q) = \begin{cases} 0, & G(m,q) < H(m), \\ 1, & \text{otherwise}, \end{cases} \quad \forall (m,q) \in S_{LSU}. \tag{27} $$
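When the model is known, the optimality equations (24)-(27) can be solved numerically by value iteration. The following is a minimal sketch under stated assumptions: the one-dimensional ring mobility model, the cost values, and the names `lsu_backup`, `solve_lsu`, and `ring_mobility` are hypothetical and chosen only for illustration; they are not the simulation setup of this paper.

```python
import numpy as np

def lsu_backup(v, P, c_q, c_lsu, lam):
    """One Bellman backup; returns G (eq. 25) and H (eq. 26).

    v[m, q-1] holds v_LSU(m, q) for q = 1..q_max.
    """
    q_max = v.shape[1]
    # Column index of min{q + 1, q_max} for each q = 1..q_max.
    next_q = np.minimum(np.arange(q_max) + 1, q_max - 1)
    G = c_q + (1.0 - lam) * (P @ v[:, next_q])   # a_LSU = 0: age grows
    H = c_lsu + (1.0 - lam) * (P @ v[:, 0])      # a_LSU = 1: age resets to 1
    return G, H

def solve_lsu(P, c_q, c_lsu, lam, tol=1e-10, max_iter=100_000):
    """Value iteration for (24); returns the values and the rule in (27)."""
    v = np.zeros_like(c_q, dtype=float)
    for _ in range(max_iter):
        G, H = lsu_backup(v, P, c_q, c_lsu, lam)
        v_new = np.minimum(G, H[:, None])
        converged = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if converged:
            break
    G, H = lsu_backup(v, P, c_q, c_lsu, lam)
    rule = (G >= H[:, None]).astype(int)         # 0 if G < H, 1 otherwise
    return v, rule

def ring_mobility(n_cells, p):
    """Hypothetical 1-D ring: stay w.p. 1 - 2p, step left/right w.p. p each."""
    P = np.zeros((n_cells, n_cells))
    for m in range(n_cells):
        P[m, m] += 1.0 - 2.0 * p
        P[m, (m - 1) % n_cells] += p
        P[m, (m + 1) % n_cells] += p
    return P

# Toy instance; every number here is an illustrative assumption.
n_cells, q_max, lam = 5, 6, 0.3
P = ring_mobility(n_cells, p=0.15)
c_q = np.tile(0.5 * np.arange(1, q_max + 1), (n_cells, 1))  # c_q(m, q) = 0.5 q
c_lsu = np.full(n_cells, 1.2)                                # c_LSU(m, 1)
v, rule = solve_lsu(P, c_q, c_lsu, lam)
print(rule[0])  # one 0/1 decision per age q = 1..q_max at cell 0
```

Because the discount factor $1-\lambda$ is strictly below one, the backup is a contraction and the loop stops after a modest number of sweeps.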

4.3 The Separation Principle

With the defined MDP models for P1 and P2, the separation principle can be stated as follows:

Theorem 4.1.

1. The optimal value $v(m,d,q)$ for any state $s = (m,d,q) \in S$ in the MDP model (8) can be represented as

$$ v(m,d,q) = v_{NU}(m,d) + v_{LSU}(m,q), \tag{28} $$

where $v_{NU}(m,d)$ and $v_{LSU}(m,q)$ are the optimal values of P1 and P2 at the corresponding states $(m,d)$ and $(m,q)$, respectively.

2. A deterministic stationary policy with the decision rule $\delta = (\delta_{NU}, \delta_{LSU})$ is optimal for the MDP model in (8), where $\delta_{NU}$ given in (23) and $\delta_{LSU}$ given in (27) are optimal decision rules for P1 and P2, respectively.

Proof. See Appendix. □

With Theorem 4.1, given a separable cost structure, instead of choosing the location update strategies based on the MDP model in (8), we can consider the NU and LSU decisions separately without loss of optimality. This not only significantly reduces the computational complexity, as the separate state spaces $S_{NU}$ and $S_{LSU}$ are much smaller than $S$, but also provides a simple design guideline in practice: given a separable cost structure, NU and LSU can be two separate and independent routines/functions in the location update algorithm implementation.
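To give a rough sense of the reduction, the following worked count compares the state-space sizes. It assumes that $S$ contains every triple $(m,d,q)$ with $d \in \{0,\ldots,\bar{d}\}$ and $q \in \{1,\ldots,\bar{q}\}$, uses $M = 20$ and $\bar{q} = \lfloor M/2 \rfloor = 10$ as in Section 6, and takes $\bar{d} = 10$ purely as a hypothetical value.

```latex
\begin{align*}
|S| &= M^2 (\bar{d}+1)\,\bar{q} = 400 \times 11 \times 10 = 44000, \\
|S_{NU}| + |S_{LSU}| &= M^2 (\bar{d}+1) + M^2 \bar{q} = 4400 + 4000 = 8400.
\end{align*}
```

Under these assumptions the two subproblems together involve roughly one fifth of the states of the joint model.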

4.4 The Existence of Monotone Optimal Policies

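With the separation principle in Section 4.3 and the component-wise monotonicity properties studied in Section 3, we investigate whether the optimal decision rules in P1 and P2 satisfy, for any $(m,d,q) \in S$,

$$ \delta_{NU}(m,d) = \begin{cases} 0, & d < d^{*}(m), \\ 1, & d \ge d^{*}(m), \end{cases} \tag{29} $$

$$ \delta_{LSU}(m,q) = \begin{cases} 0, & q < q^{*}(m), \\ 1, & q \ge q^{*}(m), \end{cases} \tag{30} $$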

where $d^{*}(m)$ and $q^{*}(m)$ are the (location-dependent) thresholds for NU and LSU operations. Thus, if (29) and (30) hold, the search for the optimal policies for NU and LSU is reduced to simply finding these thresholds.

Lemma 4.2. 1) $v_{LSU}(m,q_1) \le v_{LSU}(m,q_2)$, $\forall m$ and $1 \le q_1 \le q_2 \le \bar{q}$; 2) under conditions (1) and (2), $v_{NU}(m,d_1) \le v_{NU}(m,d_2)$, $\forall m$ and $0 \le d_1 \le d_2 \le \bar{d}$.

Proof. From Theorem 4.1, we see that $v(m,d,q) = v_{NU}(m,d) + v_{LSU}(m,q)$, $\forall (m,d,q) \in S$. For any given $(m,d)$, with Lemma 3.1, we know that $v(m,d,q)$ is nondecreasing with $q$, and thus $v_{LSU}(m,q)$ is nondecreasing with $q$ for any given $m$. Similarly, for any given $(m,q)$, with Lemma 3.3 we know that $v(m,d,q)$ is nondecreasing with $d$ under conditions (1) and (2) specified in Section 3. Thus, $v_{NU}(m,d)$ is nondecreasing with $d$ for any given $m$ under the same conditions. □

The following monotonicity properties of the optimal action $\delta_{LSU}(m,q)$ w.r.t. $q$ and the optimal action $\delta_{NU}(m,d)$ w.r.t. $d$ follow immediately from Lemma 4.2, (23), and (27).

Theorem 4.3. 1) $\delta_{LSU}(m,q_1) \le \delta_{LSU}(m,q_2)$, $\forall m$ and $1 \le q_1 \le q_2 \le \bar{q}$; 2) under conditions (1) and (2), $\delta_{NU}(m,d_1) \le \delta_{NU}(m,d_2)$, $\forall m$ and $0 \le d_1 \le d_2 \le \bar{d}$.

The results in Theorem 4.3 tell us that

• there exist optimal thresholds on the time interval between two consecutive LSU operations, i.e., if the age $q$ of the location information at the LS is older than a certain threshold, an LSU operation is carried out;

• for NU operations, there exist optimal thresholds on the local location error $d$ for the node to carry out an NU operation within its neighborhood, provided that certain conditions on the node's mobility and/or traffic intensity are satisfied.

This further indicates a design guideline in practice: a threshold-based optimal update scheme exists for LSU operations, and a threshold-based optimal update scheme exists for NU operations when the mobility degree of nodes is low; the algorithm design for both operations can therefore focus on searching for those optimal thresholds.
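Once a decision rule is known to be monotone, it can be stored as one threshold per location instead of a full table. The sketch below shows one way to check monotonicity and read off the thresholds; the function names and the array layout (one row per location, one column per level of $q$ or $d$) are assumptions made for illustration.

```python
import numpy as np

def is_monotone(rule):
    """True if every row of the 0/1 rule is nondecreasing along the level axis."""
    rule = np.asarray(rule, dtype=int)
    return bool(np.all(np.diff(rule, axis=1) >= 0))

def thresholds(rule, first_level):
    """Smallest level at which each row switches to action 1.

    first_level is the level of column 0 (1 for q, 0 for d). Rows that never
    update get first_level + number_of_levels, i.e. one past the last level.
    """
    rule = np.asarray(rule, dtype=int)
    has_update = rule.any(axis=1)
    first_one = np.argmax(rule, axis=1)          # index of the first 1 per row
    never = first_level + rule.shape[1]
    return np.where(has_update, first_one + first_level, never)

example_rule = [[0, 0, 1, 1],
                [0, 1, 1, 1],
                [0, 0, 0, 0]]
print(is_monotone(example_rule), thresholds(example_rule, first_level=1))
```

A node then only needs to compare its current $q$ (or $d$) with the stored threshold for its cell.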

4.5 Upperbounds of Optimal Thresholds

Two simple upperbounds of the optimal thresholds on $q$ and $d$ can be developed with the monotonicity properties in Lemma 4.2.

4.5.1 An Upperbound of the Optimal Threshold $q^{*}(m)$

From Lemma 4.2, we see that

$$ v_{LSU}(m, \min\{q+1, \bar{q}\}) \ge v_{LSU}(m, 1), \quad \forall (m,q). $$

Since $c_q(m,q)$ is nondecreasing with $q$, from (25) and (26), we note that if $c_q(m,q) \ge c_{LSU}(m,1)$, then $G(m,q') \ge H(m)$, $\forall q' \ge q$, i.e., the optimal action is $\delta_{LSU}(m,q') = 1$, $\forall q' \ge q$. Thus, we obtain an upperbound for the optimal threshold $q^{*}(m)$, i.e.,

$$ q^{*}(m) \le \min\{ q : c_q(m,q) \ge c_{LSU}(m,1),\ 1 \le q \le \bar{q} \}. \tag{31} $$

Then, $\delta_{LSU}(m,q) = 1$ for every $q$ at or above this bound. This upperbound clearly shows that if the global application cost (due to the node's location ambiguity at its LS) reaches or exceeds the location update cost of an LSU operation at the current location, it is optimal to perform an LSU operation immediately.
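The bound in (31) needs only the two cost functions, not the solution of the MDP. A minimal sketch follows, using costs of the form adopted later in Section 6.1, $c_q(m,q) = 0.5q$ and $c_{LSU}(m,1) = 0.1 D_{LS}(m)$; the function names and the sample distances are hypothetical.

```python
def lsu_threshold_upper_bound(c_q, c_lsu, q_max):
    """Smallest q in 1..q_max with c_q(q) >= c_lsu, as in (31).

    Returns None when no q in the range satisfies the condition, in which
    case (31) gives no information for this location.
    """
    for q in range(1, q_max + 1):
        if c_q(q) >= c_lsu:
            return q
    return None

def bound_for_distance(d_ls, q_max=10):
    """Bound for a node whose location server is at (made-up) distance d_ls."""
    return lsu_threshold_upper_bound(lambda q: 0.5 * q, 0.1 * d_ls, q_max)

for d_ls in (12, 27, 80):
    print(d_ls, bound_for_distance(d_ls))
```

Nodes that are far from their location server get a larger bound, and beyond some distance no bound within the range exists, which matches the intuition that expensive updates should be performed less often.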

4.5.2 An Upperbound of the Optimal Threshold $d^{*}(m)$

From Lemma 4.2 and observing that $P(m'|m) = 0$ for all $m, m'$ such that $d(m,m') > 1$, for $d > 1$,

$$ \sum_{m',d'} P(m',d'|m,d)\, v_{NU}(m',d') \ge \sum_{m'} P(m'|m)\, v_{NU}(m', d(m,m')). $$

Thus, from (21) and (22), if $c_l(m,d,0) \ge c_{NU}(1)$ and $d > 1$, then $E(m,d') \ge F(m)$, $\forall d' \ge d$, i.e., the optimal action is $\delta_{NU}(m,d') = 1$, $\forall d' \ge d$. Thus, we obtain an upperbound for the optimal threshold $d^{*}(m)$, i.e.,

$$ d^{*}(m) \le \min\{ d : c_l(m,d,0) \ge c_{NU}(1),\ 1 < d \le \bar{d} \}. \tag{32} $$

Then, $\delta_{NU}(m,d) = 1$ for every $d$ at or above this bound. This upperbound clearly shows that if the local application cost (for the node's local location error $d > 1$) reaches or exceeds the cost of an NU operation, it is optimal to perform an NU operation immediately.
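As a worked illustration of (32), take costs of the form used later in Section 6.1, $c_{NU}(1) = 0.5$ and $c_l(m,d,0) = 0.5 f D(d)$, with $f = 0.6$. The choice $D(d) = d$ below is an assumption made only to keep the arithmetic simple; the paper does not state it.

```latex
\begin{align*}
c_l(m,d,0) \ge c_{NU}(1)
  &\iff 0.5 \times 0.6 \times d \ge 0.5
   \iff d \ge \tfrac{5}{3}, \\
\min\{ d : c_l(m,d,0) \ge c_{NU}(1),\ 1 < d \le \bar{d} \} &= 2
  \quad (\text{for integer } d \text{ and } \bar{d} \ge 2).
\end{align*}
```

Under these assumptions an NU operation is optimal whenever the local location error reaches two cells, regardless of the node's location.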

5 A LEARNING ALGORITHM

The previously discussed separation property of the problem structure and the monotonicity properties of actions are general and can be applied to the design of many specific location update protocols/algorithms, as long as the conditions of these properties (e.g., a separable application cost structure and a low mobility degree) are satisfied. In this section, we introduce a practically useful learning algorithm, least-squares policy iteration (LSPI) [21], to solve the location update problem, and illustrate how the properties developed previously are used in the algorithm design. The selection of LSPI as the solver for the location update problem is based on two practical considerations. The first is the lack of a priori knowledge of the MDP model for the location update problem (i.e., instant costs and state transition probabilities), which makes standard algorithms such as value iteration, policy iteration, and their variants unavailable (footnote 2). Second, the small cell size in a fine partition of the network region produces large state spaces (i.e., $S$ or $S_{NU}$ and $S_{LSU}$), which makes ordinary model-free learning approaches with lookup-table representations impractical, since a large storage space on a node is required to store the lookup-table representation of the values of state-action pairs [22]. LSPI overcomes these difficulties and can find a near-optimal solution for the location update problem in MANETs.

The LSPI algorithm is a model-free learning approach which does not require a priori knowledge of the MDP models, and its linear function approximation structure provides a compact representation of the values of states, which saves storage space [21]. In LSPI, the values of a given policy $\pi = \{\delta, \delta, \ldots\}$ are represented by $v^{\pi}(s, \delta(s)) = \phi(s, \delta(s))^{T} w$, where $w \triangleq [w_1, \ldots, w_b]^{T}$ is the weight vector associated with the given policy $\pi$, and $\phi(s,a) \triangleq [\phi_1(s,a), \ldots, \phi_b(s,a)]^{T}$ is the collection of b



a square region (see Fig. 1). The region is partitioned into $M^2$ small cells (i.e., grids) and the location of a node in the network is represented by the index of the cell it resides in. We set $M = 20$ in the simulation. Nodes can freely move within the region. In each time slot, a node is only allowed to move to one of its nearest neighboring positions, i.e., the four nearest neighboring cells of the node's current position. For the nodes around the boundaries of the region, a torus border rule is assumed to control their movements [25]. For a node at cell $m$ ($m = 1, 2, \ldots, M^2$) with the set of its nearest neighboring cells denoted by $\mathcal{N}(m)$, the specific mobility model used in simulation is

$$ P(m'|m) = \begin{cases} 1 - 4p, & m' = m, \\ p, & m' \in \mathcal{N}(m), \end{cases} $$

where $p \in (0, 0.25]$. Each node updates its location within a neighboring region (i.e., the NU range specified in Fig. 1) and to its location server.
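The mobility model above can be written down as a transition matrix over the $M^2$ cells. The sketch below builds it with wrap-around (torus) borders; the row-major cell numbering starting at 0 and the function name are assumptions for illustration, whereas the text numbers cells from 1.

```python
import numpy as np

def torus_transition_matrix(M, p):
    """P[m, m2] = P(m2 | m) on an M x M grid with torus borders.

    A node stays with probability 1 - 4p and moves to each of its four
    nearest neighbors with probability p. Cells are indexed m = x * M + y.
    """
    if not 0.0 <= p <= 0.25:
        raise ValueError("p must lie in [0, 0.25] for valid probabilities")
    n = M * M
    P = np.zeros((n, n))
    for x in range(M):
        for y in range(M):
            m = x * M + y
            P[m, m] += 1.0 - 4.0 * p
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                m2 = ((x + dx) % M) * M + (y + dy) % M   # wrap around the border
                P[m, m2] += p
    return P

P = torus_transition_matrix(M=20, p=0.15)
print(P.shape, P[0, 0], np.count_nonzero(P[0]))
```

Using `+=` keeps the rows normalized even on very small grids, where two of the four neighbors can coincide.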

6.1 Validation of the Separation Principle in Theorem 4.1

To validate Theorem 4.1, we consider a separable cost structure as follows: $c_{NU}(1) = 0.5$, $c_{LSU}(m,1) = 0.1 D_{LS}(m)$, $c_q(m,q) = 0.5q$, and $c_l(m,d,0) = 0.5 f D(d)$, where $D_{LS}(m)$ is the true Euclidean distance to the node's location server, $D(d)$ is the true Euclidean distance w.r.t. the nominal distance $d$, $1 \le q \le \bar{q}$ with $\bar{q} = \lfloor M/2 \rfloor$, and $f$ is the probability that the node's location information is used by its neighbor(s) in a time slot. Two methods are applied in computing the cost values: one is based on the model given by (14) in Section 2.2, where the separation principle is not applied; the other is based on the models for the NU and LSU subproblems in Section 4, where the separation principle is applied.

Fig. 3 illustrates the convergence of cost values with both methods at some sample states, where $p = 0.15$, $\lambda = 0.6$, and $f = 0.6$, and $(x,y)$ represents the sampled location in the region. We see that, at any state, the cost values achieved by both methods converge to the same (optimal) value, which validates the correctness of the separation principle.

6.2 Near-Optimality of LSPI Algorithm

We use the same cost setting as in Section 6.1 to check the near-optimality of the LSPI algorithm in Section 5. To implement the algorithm, we choose a set of 25 basis functions for each of the two actions in P1. These 25 basis functions include a constant term and 24 Gaussian RBFs arranged in a $6 \times 4$ grid over the two-dimensional state space $S_{NU}$. In particular, for some state $s_{NU} = (m,d)$ and some action $a_{NU} \in \{0,1\}$, all basis functions are zero, except the corresponding active block for action $a_{NU}$, which is

$$ \left\{ 1,\ \exp\left[ -\frac{\|s_{NU} - \mu_1\|^2}{2\sigma_{NU}^2} \right],\ \exp\left[ -\frac{\|s_{NU} - \mu_2\|^2}{2\sigma_{NU}^2} \right],\ \ldots,\ \exp\left[ -\frac{\|s_{NU} - \mu_{24}\|^2}{2\sigma_{NU}^2} \right] \right\}, $$

where the $\mu_i$'s are the 24 points of the grid $\{0, M^2/5, 2M^2/5, 3M^2/5, 4M^2/5, M^2 - 1\} \times \{0, D(\bar{d})/3, 2D(\bar{d})/3, D(\bar{d})\}$, and $\sigma_{NU}^2 = M^2 D(\bar{d})/4$. Similarly, we also choose a set of 25 basis functions for each of the two actions in P2, including a constant term and 24 Gaussian RBFs arranged in a $6 \times 4$ grid over the two-dimensional state space $S_{LSU}$. In particular, the $\mu_i$'s are the 24 points of the grid $\{0, M^2/5, 2M^2/5, 3M^2/5, 4M^2/5, M^2 - 1\} \times \{1, \bar{q}/3, 2\bar{q}/3, \bar{q}\}$ and $\sigma_{LSU}^2 = M^2 \bar{q}/4$. The RBF-type bases selected here provide a universal basis function format, which is independent of the problem structure. One should note that the choice of basis functions is not unique and there are many other ways of choosing basis functions (see [22], [23] and the references therein for more details). The stopping criterion of LSPI iterations in simulation is set as $10^{-2}$.
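The block structure of the basis functions described above can be sketched as follows for the LSU subproblem. The function names, the ordering of the centers, and the convention that the block for action 0 comes first are assumptions for illustration; only the grid points and the variance follow the text.

```python
import numpy as np
from itertools import product

def rbf_centers(M, second_axis_points):
    """24 centers: 6 points along the cell index times 4 points along q (or D(d))."""
    first_axis = [0, M**2 / 5, 2 * M**2 / 5, 3 * M**2 / 5, 4 * M**2 / 5, M**2 - 1]
    return np.array(list(product(first_axis, second_axis_points)), dtype=float)

def features(state, action, centers, sigma2):
    """Feature vector of length 2 * 25; only the block of `action` is nonzero."""
    s = np.asarray(state, dtype=float)
    block = np.empty(len(centers) + 1)
    block[0] = 1.0                                            # constant term
    block[1:] = np.exp(-np.sum((s - centers) ** 2, axis=1) / (2.0 * sigma2))
    phi = np.zeros(2 * block.size)
    phi[action * block.size:(action + 1) * block.size] = block
    return phi

M, q_bar = 20, 10
centers = rbf_centers(M, [1, q_bar / 3, 2 * q_bar / 3, q_bar])
sigma2 = M**2 * q_bar / 4
phi = features(state=(0, 1), action=0, centers=centers, sigma2=sigma2)
print(phi.shape, np.count_nonzero(phi))
```

With such a feature map, the approximate value of a state-action pair is the inner product of `phi` with a learned weight vector of the same length.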

Table 2 shows the performance of LSPI under different traffic intensities (i.e., $\lambda$, $f$) and mobility degrees (i.e., $p$), in terms of the values (i.e., achievable overall costs of the location update) at states when using the decision rule obtained from LSPI, compared to the optimal values. Both greedy and monotone policy update schemes are evaluated. We also include the performance results of the scheme that combines the monotone policy update with the upperbounds given in (31) and (32). From Table 2, we observe that: 1) the values achieved by LSPI are close to the optimal values (i.e., the average relative value difference is less than 6 percent) and 2) the 95 percent confidence intervals are relatively small (i.e., the values at different states are close to the average value). These observations imply that the policy obtained by LSPI is effective in minimizing the overall costs of the location update at all states. Moreover, the monotone policy update shows a better performance than the greedy update. The scheme that combines the monotone policy update with the upperbounds achieves the best results among all three schemes, which implies that a reliable estimate of these upperbounds can be beneficial in obtaining a near-optimal solution. Table 3 shows the percentages of action differences between the decision rules obtained by LSPI (with monotone policy update) and the optimal decision rule in different testing cases. We see that, in all cases, the actions obtained by LSPI are the same as the ones in the optimal decision rule at most states (>80 percent), which demonstrates that LSPI can find a near-optimal location update rule.


Fig. 3. The convergence of cost values at different sample states in methods with and without the separation principle applied; $(x,y)$ represents the sampled location in the region.




6.3 Applications

We further evaluate the effectiveness of the proposed model and optimal solution in three practical application scenarios, i.e., the location server update operations in the well-known Homezone location service [11], [12] and Grid location service (GLS) [13], and the neighborhood update operations in the widely used Greedy Packet Forwarding algorithm [26], [1]. In the simulation, the number of nodes in the network is set as 100.

6.3.1 Homezone Location Service

We apply the proposed LSU model to the location server update operations in the Homezone location service [11], [12]. The location of the homezone (i.e., location server) of any node is determined by a hash function of the node ID. For comparison, we also consider schemes that carry out location server update operations at fixed intervals, i.e., $q = 2, 4, 6, 8$ slots (footnote 3). As both LSU operations and the global location ambiguity of nodes introduce control packets (i.e., location update packets in LSU operations and route search packets due to the location ambiguity of the destination node), we count the number of control packets generated in the network under a given location update scheme. Fig. 4 shows the number of total control packets, the number of LSU packets, and the number of route search packets in the network per slot generated by different schemes, where $p = 0.15$ and $\lambda = 0.3$. The 95 percent confidence levels are also included, which are obtained from 30 independent simulation runs. We see that the scheme obtained from the proposed model (denoted as OPT) introduces the smallest number of control packets in the network among all schemes in comparison. Although the scheme with the fixed interval $q = 4$ performs close to OPT, one should note that the best value of $q$ for a fixed-interval scheme is unknown during the setup phase of the scheme.

6.3.2 Grid Location Service

We also apply the proposed LSU model to the location server update operations in GLS [13]. The location servers of any node are distributed over the network, and the density of location servers decreases logarithmically with the distance from the node. To apply our model to GLS, we assume that a location server update operation uses multicast to update all location servers of the node in the network. For comparison, we also consider schemes that carry out such location server update operations at fixed intervals, i.e., $q = 2, 4, 6, 8$ slots (footnote 4). Fig. 5 shows the number of total control packets, the number of LSU packets, and the number of route search packets in the network per slot generated by different schemes, where $p = 0.15$ and $\lambda = 0.3$. Again, the scheme obtained from the proposed model (denoted as OPT) achieves the smallest number of control packets in the network among all schemes in comparison.


TABLE 3
The Action Difference between the Decision Rule Obtained from LSPI (with Monotone Update) and the Optimal Decision Rules

3. One should note that, in practice, other location update schemes can also be applied here. For example, the author in [12] has suggested a location update scheme based on the number of link changes. We do not include this scheme in the comparison since it cannot be fit into our model.

4. The distance effect technique and distance-based update scheme proposed in [13] are not applied in the simulation as they do not fit into our model in its current version.

TABLE 2
The Relative Value Difference (with 95 Percent Confidence Level) between the Values Achieved by LSPI ($v_{LSPI}$) and the Optimal Values ($v$)



6.3.3 Greedy Packet Forwarding

We apply the proposed NU model to the neighborhood update operations in Greedy Packet Forwarding [26], [1]. In a transmission, the greedy packet forwarding strategy always forwards the data packet to the node that makes the most progress toward the destination node. In the presence of local location errors of nodes, a forwarding progress loss may occur [10], [15]. This forwarding progress loss implies the suboptimality of the route that the data packet follows, and thus more (i.e., redundant) copies of the data packet need to be transmitted along the route, compared to the optimal route obtained with accurate location information. As the NU operations introduce control packets, we count the number of control packets and redundant data packets in the network per slot under a given location update scheme. For comparison, we also consider schemes that carry out the NU operation when the local location error of a node exceeds some fixed threshold, i.e., $d = 1, 3, 5, 7$. Fig. 6 shows the number of total packets, the number of NU packets, and the number of redundant data packets per slot achieved by different schemes, where $p = 0.15$ and $f = 0.3$. The 95 percent confidence levels are also included, which are obtained from 30 independent simulation runs. We see that the scheme obtained from the proposed model (denoted as OPT) achieves the smallest number of total packets in the network among all schemes in comparison.
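The effect of a stale neighbor position on greedy forwarding can be illustrated with a small sketch. It measures progress as the reduction in Euclidean distance to the destination, which is one common reading of "most progress" and is an assumption here; the function names and coordinates are hypothetical.

```python
import math

def greedy_next_hop(current, neighbors, destination):
    """Pick the neighbor whose known position is closest to the destination.

    neighbors maps node id -> (x, y) as currently known to the sender.
    Returns None when no neighbor is closer to the destination than the sender.
    """
    best_id, best_dist = None, math.dist(current, destination)
    for node_id, pos in neighbors.items():
        dist = math.dist(pos, destination)
        if dist < best_dist:
            best_id, best_dist = node_id, dist
    return best_id

def progress(current, next_hop_pos, destination):
    """Distance gained toward the destination by handing the packet over."""
    return math.dist(current, destination) - math.dist(next_hop_pos, destination)

sender, dest = (0.0, 0.0), (10.0, 0.0)
known = {"a": (3.0, 0.0), "b": (2.0, 2.0)}   # positions from the last NU
true = {"a": (1.0, 0.0), "b": (2.0, 2.0)}    # node "a" has moved since then
choice = greedy_next_hop(sender, known, dest)
print(choice, progress(sender, true[choice], dest), progress(sender, true["b"], dest))
```

In this toy case the sender picks the node whose advertised position looks best, although another neighbor would have made more real progress; this gap is the forwarding progress loss that NU operations are meant to limit.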

Fig. 4. Homezone: the number of total control packets, the number of LSU packets, and the number of route search packets in the network per slot generated by the scheme obtained from the proposed LSU model, compared to the schemes that carry out the location server update operations at fixed intervals, i.e., $q = 2, 4, 6, 8$ slots; $p = 0.15$ and $\lambda = 0.3$.

Fig. 5. GLS: the number of total control packets, the number of LSU packets, and the number of route search packets in the network per slot generated by the scheme obtained from the proposed LSU model, compared to the schemes that carry out the location server update operation at fixed intervals, i.e., $q = 2, 4, 6, 8$ slots; $p = 0.15$ and $\lambda = 0.3$.

Fig. 6. Greedy Packet Forwarding: the number of total packets, the number of NU packets, and the number of redundant data packets in the network per slot generated by the scheme obtained from the proposed NU model, compared to the schemes that carry out the neighborhood update operation when the local location error of a node exceeds some fixed threshold, i.e., $d = 1, 3, 5, 7$; $p = 0.15$ and $f = 0.3$.

7 CONCLUSIONS

We have developed a stochastic sequential decision framework to analyze the location update problem in MANETs. The existence of monotonicity properties of optimal NU and LSU operations w.r.t. location inaccuracies has been investigated under a general cost setting. If a separable cost structure exists, one important insight from the proposed MDP model is that the location update decisions on NU and LSU can be carried out independently without loss of optimality, which motivates the simple separate consideration of NU and LSU decisions in practice. From this separation principle and the monotonicity properties of optimal actions, we have further shown that 1) for the LSU decision subproblem, there always exists an optimal threshold-based update decision rule; and 2) for the NU decision subproblem, an optimal threshold-based update decision rule exists in a low-mobility scenario. To make the solution of the location update problem practically implementable, a model-free low-complexity learning algorithm (LSPI) has been introduced, which can achieve a near-optimal solution.

The proposed MDP model for the location update problem in MANETs can be extended to include more design features for the location service in practice. For example, there might be multiple distributed location servers (LSs) for each node in the network, and these LSs can be updated independently [1], [13]. This case can be handled by expanding the action $a_{LSU}$ to be in the set $\{0, 1, \ldots, K\}$, where $K$ LSs are assigned to a node. Similarly, the well-known distance effect technique [24] in NU operations can also be incorporated into the proposed MDP model by expanding the action $a_{NU}$ to be in the set $\{0, 1, \ldots, L\}$, where $L$ tiers of a node's neighboring region can follow different update frequencies when the distance effect is considered. Under a separable cost structure, the separation principle would still hold in the above extensions. However, the discussed monotone properties would no longer hold. In addition, it is also possible to include the user's subjective behavior in the model. For example, if a user's subjective behavior is in a set $B = \{b_1, b_2, \ldots, b_K\}$ and is correlated with its behavior in the previous time slot, the model can be extended by including $b \in B$ as a component of the system state. However, the separation principle could be affected if the user's subjective behavior is coupled with both location inaccuracies (i.e., $d$ and $q$). All these extensions are part of our future work.

APPENDIX

Proof of Lemma 3.1. For any given $(m,d)$, $X(m,d)$ in (11) and $Z(m)$ in (13) are constants, and thus we only need to show that $\min\{W(m,d,q), Y(m,q)\}$ is nondecreasing with $q$. As $1 \le q \le \bar{q}$, we prove the result by induction.

First, when $q = \bar{q} - 1$, since both $c_{dq}(m,d,q)$ and $c_q(m,q)$ are nondecreasing with $q$, from (10) and (12) we have $W(m,d,q) \le W(m,d,\bar{q})$ and $Y(m,q) \le Y(m,\bar{q})$. Therefore, $v(m,d,\bar{q}-1) \le v(m,d,\bar{q})$, $\forall (m,d)$. Assume that $v(m,d,q) \le v(m,d,q+1)$, $\forall (m,d)$, q 0,




we find that $Y^0(m,q) > W^0(m,0,q)$ and $Z^0(m) > X^0(m,0)$. Therefore, (38) becomes

$$ u^1(m,0,q) = \min\{W^0(m,0,q), X^0(m,0)\} = \min\{Y^0(m,q), Z^0(m)\} - c_{NU}(1). \tag{39} $$

For $d = 1$, from (38) and (39), we have

$$ u^1(m,1,q) = \min\{W^0(m,1,q),\ X^0(m,1),\ u^1(m,0,q) + c_{NU}(1)\}. \tag{40} $$

We next show that $W^0(m,1,q) \ge W^0(m,0,q)$ and $X^0(m,1) \ge X^0(m,0)$. Since both $c_{dq}(m,d,q)$ and $c_d(m,d)$ are nondecreasing with $d$, and $c_{LSU}(m,1)$ is a constant, for any given $(m,q)$, it is sufficient to show that

$$ c_l(m,1,0) + (1-\lambda) \sum_{m',d'} P(m',d'|m,1)\, u^0(m',d',q') \ge (1-\lambda) \sum_{m'} P(m'|m)\, u^0(m', d(m,m'), q'), $$

which is obtained as follows:

$$ \begin{aligned} & c_l(m,1,0) + (1-\lambda) \sum_{m',d'} P(m',d'|m,1)\, u^0(m',d',q') \\ &= c_l(m,1,0) + (1-\lambda) \sum_{m' \ne m,\, d'} P(m',d'|m,1)\, u^0(m',d',q') + (1-\lambda) P(m|m)\, u^0(m,1,q') \\ &\ge (1-\lambda) \sum_{m' \ne m} P(m'|m) \left\{ \frac{c_l(m,1,0)}{(1-\lambda)(1-P(m|m))} + u^0(m',0,q') \right\} + (1-\lambda) P(m|m)\, u^0(m,1,q') \\ &\ge (1-\lambda) \sum_{m' \ne m} P(m'|m) \left\{ c_{NU}(1) + u^0(m',0,q') \right\} + (1-\lambda) P(m|m)\, u^0(m,1,q') \\ &\ge (1-\lambda) \sum_{m' \ne m} P(m'|m)\, u^0(m',1,q') + (1-\lambda) P(m|m)\, u^0(m,1,q') \\ &\ge (1-\lambda) \sum_{m' \ne m} P(m'|m)\, u^0(m',1,q') + (1-\lambda) P(m|m)\, u^0(m,0,q') \\ &= (1-\lambda) \sum_{m' \ne m} P(m'|m)\, u^0(m', d(m,m'), q') + (1-\lambda) P(m|m)\, u^0(m,0,q') \\ &= (1-\lambda) \sum_{m'} P(m'|m)\, u^0(m', d(m,m'), q'), \end{aligned} $$

where the first, the third, and the last inequalities follow by noting $u^0 \in V$, the second inequality follows from condition (1), and the next-to-last equality is due to $P(m'|m) = 0$ for any $m'$ such that $d(m,m') > 1$. Thus, from (39) and (40), we see that $u^1(m,0,q) \le u^1(m,1,q)$ and $u^1(m,1,q) \le c_{NU}(1) + u^1(m,0,q)$.

Combining the results in the above two cases, we have proved that $u^1 \ge 0$, $u^1(m,d,q)$ is nondecreasing with $d$, and $u^1(m,1,q) \le c_{NU}(1) + u^1(m,0,q)$ for any $(m,q)$, i.e., $u^1 \in V$. By induction, $u^n \in V$, $\forall n \ge 1$ in the value iteration procedure, and consequently the limit, i.e., the optimal value function $v$, is also in $V$. □

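Proof of Theorem 4.1. For part 1, let

$$ \begin{aligned} \tilde{v}(m,d,q) &\triangleq v_{NU}(m,d) + v_{LSU}(m,q) \\ &= \min\{E(m,d), F(m)\} + \min\{G(m,q), H(m)\} \\ &= \min\{E(m,d) + G(m,q),\ E(m,d) + H(m),\ F(m) + G(m,q),\ F(m) + H(m)\}. \end{aligned} $$

It is straightforward to see that

$$ \begin{aligned} & \sum_{m',d'} P(m',d'|m,d)\, v_{NU}(m',d') + \sum_{m'} P(m'|m)\, v_{LSU}(m',q') \\ &= \sum_{m',d'} P(m',d'|m,d) \left[ v_{NU}(m',d') + v_{LSU}(m',q') \right] \\ &= \sum_{m',d'} P(m',d'|m,d)\, \tilde{v}(m',d',q'), \end{aligned} $$

where $q'$ is given in (2). Thus,

$$ \begin{aligned} E(m,d) + G(m,q) &= c_l(m,d,0) + c_q(m,q) + (1-\lambda) \sum_{m',d'} P(m',d'|m,d)\, \tilde{v}(m',d',\min\{q+1,\bar{q}\}), \\ E(m,d) + H(m) &= c_l(m,d,0) + c_{LSU}(m,1) + (1-\lambda) \sum_{m',d'} P(m',d'|m,d)\, \tilde{v}(m',d',1), \\ F(m) + G(m,q) &= c_{NU}(1) + c_q(m,q) + (1-\lambda) \sum_{m'} P(m'|m)\, \tilde{v}(m', d(m,m'), \min\{q+1,\bar{q}\}), \\ F(m) + H(m) &= c_{NU}(1) + c_{LSU}(m,1) + (1-\lambda) \sum_{m'} P(m'|m)\, \tilde{v}(m', d(m,m'), 1). \end{aligned} $$

Thus, $\tilde{v}$ is a solution of the optimality equations (14) (or (8)) under the separable cost structure in (17). Since the solution of (14) is unique [16], $\tilde{v}(m,d,q) = v(m,d,q)$, $\forall (m,d,q) \in S$.

For part 2, since the decision rules $\delta_{NU}$ in (23) and $\delta_{LSU}$ in (27) are optimal for P1 and P2, respectively, the decision rule $\delta = (\delta_{NU}, \delta_{LSU})$ minimizes the sum of the costs in the NU and LSU subproblems, i.e., it achieves $\tilde{v}(m,d,q)$, $\forall (m,d,q) \in S$. Consequently, a deterministic stationary policy with the decision rule $\delta$ is optimal for the MDP model in (8). □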

ACKNOWLEDGMENTS

This work was supported in part by the National Science Foundation under grants CNS-0546402 and CNS-0627039.




REFERENCES

[1] M. Mauve, J. Widmer, and H. Hartenstein, "A Survey on Position-Based Routing in Mobile Ad Hoc Networks," Proc. IEEE Network, pp. 30-39, Nov./Dec. 2001.

[2] Y.C. Tseng, S.L. Wu, W.H. Liao, and C.M. Chao, "Location Awareness in Ad Hoc Wireless Mobile Networks," Proc. IEEE Computer, pp. 46-52, June 2001.

[3] S.J. Barnes, "Location-Based Services: The State of the Art," e-Service J., vol. 2, no. 3, pp. 59-70, 2003.

[4] M.A. Fecko and M. Steinder, "Combinatorial Designs in Multiple Faults Localization for Battlefield Networks," Proc. IEEE Military Comm. Conf. (MILCOM '01), Oct. 2001.

[5] M. Natu and A.S. Sethi, "Adaptive Fault Localization in Mobile Ad Hoc Battlefield Networks," Proc. IEEE Military Comm. Conf. (MILCOM '05), pp. 814-820, Oct. 2005.

[6] PSWAC, Final Report of the Public Safety Wireless Advisory Committee to the Federal Communications Commission and the National Telecommunications and Information Administration, http://pswac.ntia.doc.gov/pubsafe/publications/PSWAC_AL.PDF, Sept. 1996.

[7] NIST Communications and Networking for Public Safety Project, http://w3.antd.nist.gov/comm_net_ps.shtml, 2010.

[8] I. Stojmenovic, "Location Updates for Efficient Routing in Ad Hoc Networks," Handbook of Wireless Networks and Mobile Computing, pp. 451-471, Wiley, 2002.

[9] T. Park and K.G. Shin, "Optimal Tradeoffs for Location-Based Routing in Large-Scale Ad Hoc Networks," IEEE/ACM Trans. Networking, vol. 13, no. 2, pp. 398-410, Apr. 2005.

[10] R.C. Shah, A. Wolisz, and J.M. Rabaey, "On the Performance of Geographic Routing in the Presence of Localization Errors," Proc. IEEE Int'l Conf. Comm. (ICC '05), pp. 2979-2985, May 2005.

[11] S. Giordano and M. Hamdi, "Mobility Management: The Virtual Home Region," ICA technical report, EPFL, Mar. 2000.

[12] I. Stojmenovic, "Home Agent Based Location Update and Destination Search Schemes in Ad Hoc Wireless Networks," Technical Report TR-99-10, Comp. Science, SITE, Univ. Ottawa, Sept. 1999.

[13] J. Li et al., "A Scalable Location Service for Geographic Ad Hoc Routing," Proc. ACM MobiCom, pp. 120-130, 2000.

[14] Y.B. Ko and N.H. Vaidya, "Location-Aided Routing (LAR) in Mobile Ad Hoc Networks," ACM/Baltzer Wireless Networks J., vol. 6, no. 4, pp. 307-321, 2000.

[15] S. Kwon and N.B. Shroff, "Geographic Routing in the Presence of Location Errors," Proc. IEEE Int'l Conf. Broadband Comm. Networks and Systems (BROADNETS '05), pp. 622-630, Oct. 2005.

[16] M.L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, 1994.

[17] A. Bar-Noy, I. Kessler, and M. Sidi, "Mobile Users: To Update or Not to Update?" ACM/Baltzer Wireless Networks J., vol. 1, no. 2, pp. 175-195, July 1995.

[18] U. Madhow, M. Honig, and K. Steiglitz, "Optimization of Wireless Resources for Personal Communications Mobility Tracking," IEEE/ACM Trans. Networking, vol. 3, no. 6, pp. 698-707, Dec. 1995.

[19] V.W.S. Wong and V.C.M. Leung, "An Adaptive Distance-Based Location Update Algorithm for Next-Generation PCS Networks," IEEE J. Selected Areas on Comm., vol. 19, no. 10, pp. 1942-1952, Oct. 2001.

[20] K.J. Hintz and G.A. McIntyre, "Information Instantiation in Sensor Management," Proc. SPIE Int'l Symp. Aerospace and Defense Sensing, Simulation, and Controls (AEROSENSE '98), vol. 3374, pp. 38-47, 1998.

[21] M.G. Lagoudakis and R. Parr, "Least-Squares Policy Iteration," J. Machine Learning Research (JMLR '03), vol. 4, pp. 1107-1149, Dec. 2003.

[22] D.P. Bertsekas and J.N. Tsitsiklis, Neuro-Dynamic Programming. Athena Scientific, 1996.

[23] R. Sutton and A. Barto, Reinforcement Learning: An Introduction. MIT, 1998.

[24] S. Basagni, I. Chlamtac, V.R. Syrotiuk, and B.A. Woodward, "A Distance Routing Effect Algorithm for Mobility (DREAM)," Proc. ACM MobiCom, pp. 76-84, 1998.

[25] D.M. Blough, G. Resta, and P. Santi, "A Statistical Analysis of the Long-Run Node Spatial Distribution in Mobile Ad Hoc Networks," Proc. ACM Int'l Conf. Modeling, Analysis and Simulation of Wireless and Mobile Systems (MSWiM '02), pp. 30-37, Sept. 2002.

[26] H. Takagi and L. Kleinrock, "Optimal Transmission Ranges for Randomly Distributed Packet Radio Terminals," IEEE Trans. Comm., vol. 32, no. 3, pp. 246-257, Mar. 1984.

Zhenzhen Ye received the BE degree from Southeast University, Nanjing, China, in 2000, the MS degree in high-performance computation from the Singapore-MIT Alliance (SMA) Program, National University of Singapore, in 2003, the MS degree in electrical engineering from the University of California, Riverside, in 2005, and the PhD degree in electrical engineering from Rensselaer Polytechnic Institute in 2009. He is currently with the R&D Division at iBasis, Inc. His research interests include wireless communications and networking, including stochastic control and optimization for wireless networks, cooperative communications in mobile ad hoc networks and wireless sensor networks, and ultra-wideband communications.

Alhussein A. Abouzeid received the BS degree with honors from Cairo University, Egypt, in 1993, and the MS and PhD degrees from the University of Washington, Seattle, in 1999 and 2001, respectively, all in electrical engineering. From 1993 to 1994, he was with the Information Technology Institute, Information and Decision Support Center, The Cabinet of Egypt, where he received a degree in information technology. From 1994 to 1997, he was a project manager at Alcatel Telecom. He held visiting appointments with the aerospace division of AlliedSignal (currently Honeywell), Redmond, Washington, and Hughes Research Laboratories, Malibu, California, in 1999 and 2000, respectively. He is an associate professor of electrical, computer, and systems engineering at Rensselaer Polytechnic Institute (RPI), Troy, New York. He has been on leave from RPI since December 2008, serving as a program director in the Computer and Network Systems Division, Computer and Information Science and Engineering Directorate, US National Science Foundation (NSF), Arlington, Virginia. He is a member of the editorial board of the IEEE Transactions on Wireless Communications and Elsevier Computer Networks. He was a recipient of the Faculty Early Career Development Award (CAREER) from the NSF in 2006.
