Sensor Management Problem with Arrivals and Departures



Assume there are a finite number of locations $1, \ldots, N$, each of which may, at a particular time, contain an object of a given type or be empty. Assume there is a set of $S$ sensors, each of which has multiple sensor modes, and that each sensor can observe a set of locations at each discrete time instant (stage) with a selected mode.

Let $x_i(t) \in \{0, 1, \ldots, D\}$ denote the state of location $i$ at time $t$, where $x_i(t) = 0$ if location $i$ is unoccupied, and otherwise $x_i(t) = k > 0$ indicates that location $i$ contains an object of type $k$ at time $t$. Let $\pi_i(0) \in \mathbb{R}^{D+1}$ be a discrete a priori probability distribution over the possible states of the $i$th location for $i = 1, \ldots, N$, where $D \geq 2$. Assume additionally that the random variables $x_i(t)$ for $i = 1, \ldots, N$ are mutually independent for each time $t$. Let the state of location $i$ be governed by a Markov chain as in Fig. 1. (NEED EXPLANATION OF HMM MODEL).

There are $s = 1, \ldots, S$ sensors, each of which has $m = 1, \ldots, M_s$ possible modes of observation. We assume there is a series of $T$ discrete decision stages, $t = 0, \ldots, T-1$, at which sensors can make measurements. Each sensor $s$ has a limited set of locations that it can observe at each stage, denoted by $O_s(t) \subseteq \{1, \ldots, N\}$. At each stage, each sensor can choose to employ one of its sensor modes to collect noisy measurements concerning the states $x_i(t)$ of the sensed locations. Bayesian statistics are used to interpret the result of each sensor observation. We assume that no two sensors observe the same location at the same time, so as to limit the complexity of the associated action and observation spaces. A more complex model could, for instance, consider all possible pairs of observations by two sensors looking at the same object, but the resulting action and observation spaces are the Cartesian products of the original action and observation spaces and are thus cumbersome.

A sensor action by sensor $s$ at stage $t$ is the set of pairs:

$$u_s(t) = \{(i, m_s(t)) \mid i \in o_s(t),\ m_s(t) \in M_s\} \tag{1}$$

consisting of a set of locations to observe, $o_s(t) \subseteq O_s(t)$, and a mode for these observations, $m_s(t)$, where the mode is restricted to the set of feasible modes given the current resource levels of each sensor. Let $u_{i,s}(t)$ refer to the sensor action taken on location $i$ with sensor $s$ at stage $t$, if any, or let $u_{i,s}(t) = \emptyset$ otherwise.
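The action structure in Eq. (1) can be illustrated with a short sketch. The Python below is not part of the model definition; the function names `make_action` and `actions_are_feasible` and all numbers are hypothetical, and locations, sensors, and modes are simply represented as integers. It builds $u_s(t)$ as a set of (location, mode) pairs and checks the restrictions stated above: the observed locations lie in $O_s(t)$, each sensor uses a single mode in $1, \ldots, M_s$ per stage, and no location is observed by two sensors at the same stage. The resource-dependent restriction on modes is not modeled here.

```python
def make_action(locations, mode):
    """Build u_s(t) = {(i, m_s(t)) | i in o_s(t)} as a frozenset of pairs."""
    return frozenset((i, mode) for i in locations)


def actions_are_feasible(actions, observable, num_modes):
    """Check the joint action of all sensors for a single stage t.

    actions:    dict sensor -> set of (location, mode) pairs, i.e. u_s(t)
    observable: dict sensor -> set of locations O_s(t)
    num_modes:  dict sensor -> M_s, the number of modes of sensor s
    """
    seen = set()  # locations already claimed by an earlier sensor
    for s, u in actions.items():
        locations = {i for i, _ in u}
        modes = {m for _, m in u}
        # o_s(t) must be a subset of O_s(t).
        if not locations <= observable[s]:
            return False
        # One mode per sensor per stage, drawn from 1..M_s.
        if len(modes) > 1 or any(not 1 <= m <= num_modes[s] for m in modes):
            return False
        # No two sensors may observe the same location at the same time.
        if seen & locations:
            return False
        seen |= locations
    return True


# Hypothetical example: two sensors, four locations.
observable = {1: {1, 2, 3}, 2: {3, 4}}
num_modes = {1: 2, 2: 3}
ok = {1: make_action({1, 2}, mode=2), 2: make_action({3, 4}, mode=1)}
clash = {1: make_action({1, 3}, mode=1), 2: make_action({3}, mode=1)}
print(actions_are_feasible(ok, observable, num_modes))     # True
print(actions_are_feasible(clash, observable, num_modes))  # False
```

In the second joint action both sensors select location 3, which violates the assumption that no two sensors observe the same location at the same time.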

Sensor measurements are modeled as belonging to a finite set, $y \in \{1, \ldots, L_s\}$. Following a convention that is prevalent within the sensor management community, we assume that a measurement action at location $i$ at time $t$ returns a measurement $y(t)$ concerning the state of this location, $x_i(t)$, at time $t$, rather than having the observations at time $t$ refer to the future value of the state, $x_i(t+1)$, as is commonly the case within the robotics community. The likelihood of the measured value is assumed to depend on the sensor $s$, the sensor mode $m$, the location $i$, and the true state at the location, $x_i(t)$, but not on the states of other locations (statistical independence). Denote this likelihood as $P(y \mid x_i(t), i, s, m)$.


Figure 1: HMM model for each of the $N$ locations. $p_a$ is an arrival probability and $p_d$ is a departure probability for the Markov chain. The blue (dashed) arcs occur (with an accompanying classification cost) whenever the system makes a decision concerning the identity of a cell. The red (dot-dashed) arcs represent the cost of penalties associated with an object changing state before it is measured. The black arrows are transitions that occur at no cost.
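Because Fig. 1 is not reproduced in the text, the exact transition structure cannot be recovered here. Purely as an illustration, the Python sketch below assumes one plausible reading of the caption: an empty location becomes occupied with probability $p_a$, an occupied location becomes empty with probability $p_d$, and an object otherwise keeps its type. The distribution `type_probs` over the type of an arriving object, the function names, and all numbers are assumptions introduced for this sketch, and the cost arcs of the figure are not modeled.

```python
def transition_matrix(p_a, p_d, type_probs):
    """Row-stochastic matrix over states 0..D (0 = empty, k > 0 = type k).

    ASSUMPTION: arrivals pick type k with probability type_probs[k-1];
    this type distribution is hypothetical and not given in the text.
    """
    D = len(type_probs)
    P = [[0.0] * (D + 1) for _ in range(D + 1)]
    P[0][0] = 1.0 - p_a                    # location stays empty
    for k in range(1, D + 1):
        P[0][k] = p_a * type_probs[k - 1]  # arrival of a type-k object
        P[k][0] = p_d                      # departure
        P[k][k] = 1.0 - p_d                # object stays, same type
    return P


def predict(belief, P):
    """One-step prediction: pi(t+1)[y] = sum_x pi(t)[x] * P[x][y]."""
    n = len(belief)
    return [sum(belief[x] * P[x][y] for x in range(n)) for y in range(n)]


# Hypothetical numbers: D = 2 object types, location known to be empty.
P = transition_matrix(p_a=0.1, p_d=0.2, type_probs=[0.5, 0.5])
belief = predict([1.0, 0.0, 0.0], P)
print([round(p, 3) for p in belief])  # [0.9, 0.05, 0.05]
```

Under these assumptions each row of the matrix sums to one, so the predicted belief remains a probability distribution.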

We assume that this likelihood given $x_i(t)$ is time-invariant, and that the random measurements $y_{i,s,m}(t)$ are conditionally independent of other measurements $y_{j,\sigma,n}(\tau)$ given the states $x_i(t)$, $x_j(\tau)$ for all sensor modes $m$, $n$, provided $i \neq j$ or $\tau \neq t$.
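Since measurements are interpreted with Bayesian statistics and the likelihood depends only on the state of the observed location, the belief at a single location can be updated independently of all others. The sketch below shows this update in Python for one location and one fixed sensor and mode. The function name `bayes_update`, the table layout `likelihood[x][y-1]`, and the numbers are assumptions made for illustration only.

```python
def bayes_update(prior, likelihood, y):
    """Posterior over the state of one location after observing y.

    prior:      list of length D+1, prior[x] = P(x_i(t) = x)
    likelihood: likelihood[x][y-1] = P(y | x_i(t) = x, i, s, m) for a
                fixed location i, sensor s and mode m (hypothetical table)
    y:          observed measurement value in 1..L_s
    """
    unnormalized = [likelihood[x][y - 1] * prior[x] for x in range(len(prior))]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("measurement has zero probability under the prior")
    return [v / total for v in unnormalized]


# Hypothetical numbers: D = 2 object types, L_s = 2 measurement values.
prior = [0.5, 0.25, 0.25]
likelihood = [
    [0.9, 0.1],  # state 0 (empty)
    [0.2, 0.8],  # state 1 (object of type 1)
    [0.4, 0.6],  # state 2 (object of type 2)
]
posterior = bayes_update(prior, likelihood, y=2)
print([round(p, 3) for p in posterior])  # [0.125, 0.5, 0.375]
```

Observing $y = 2$, which is unlikely for an empty location in this made-up table, shifts probability mass from state 0 toward the two occupied states.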

Assume each sensor has a quantity $R_s$ of resources available for measurements during each decision stage, so there is a periodic constraint on sensing performance. Associated with the use of mode $m$ by sensor $s$ on location $i$ at time $t$ is a resource cost $r_s(i, m_s(t))$, representing power or some other type of resource required to operate the sensor:

$$\sum_{i \in o_s(t)} r_s(i, m_s(t)) \leq R_s \quad \forall\, s \in [1 \ldots S];\ \forall\, t \in [0 \ldots T-1] \tag{2}$$

This is a hard constraint for each realization of observations and decisions.

The objective of this problem is to estimate the state of each location at each time with minimum error, as measured by the number of false alarms (FAs) and missed detections (MDs):

$$\min_{\gamma \in \Gamma} \left[ \sum_{i=1}^{N} \sum_{w=1}^{T_i} \min_{\tau_i \in [t_i(w-1),\, t_i(w))} \left( \min_{v_j \in X} c(x_i(\tau_i), v_j),\ \mathrm{MD} \right) \right] \tag{3}$$

subject to Eq. (2), where the $T_i$ values represent the total number of states that occur at location $i$, and the $t_i(w)$ values indicate the (unknown and random) time of the $w$th state transition at location $i$, with $t_i(0)$ defined as 0.

After replacing the hard resource constraint in Eq. (2) with an expected-resource-use constraint for each of the $S$ sensors, we can dualize the resource constraints and create an augmented objective function (Lagrangian) of the form:

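$$\min_{\gamma \in \Gamma} \left[ \sum_{i=1}^{N} \sum_{w=1}^{T_i} \min_{\tau_i \in [t_i(w-1),\, t_i(w))} \left( \min_{v_j \in X} c(x_i(\tau_i), v_j),\ \mathrm{MD} \right) - \sum_{t=0}^{T-1} \sum_{s=1}^{S} \lambda_s(t) \left( R_s(t) - \sum_{i \in o_s(t)} r(u_{i,s}(t)) \right) \right] \tag{4}$$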
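To make the dualized resource term in Eq. (4) concrete, the sketch below evaluates it for one fixed realization of the sensor actions. It is only an illustration: the function names, the dictionary-based data layout, and all numbers are assumptions, the classification-cost part of the objective is omitted, and the expectation implied by the expected-resource-use constraint is not taken.

```python
def resource_use(action, cost):
    """Resources consumed by one sensor in one stage.

    action: set of (location, mode) pairs, i.e. u_s(t)
    cost:   dict (location, mode) -> resource cost (hypothetical table)
    """
    return sum(cost[pair] for pair in action)


def lagrangian_penalty(actions, costs, budgets, multipliers):
    """Evaluate  - sum_t sum_s lambda_s(t) * (R_s(t) - resource use).

    actions[t][s]     : u_s(t) as a set of (location, mode) pairs
    costs[s]          : cost table of sensor s
    budgets[t][s]     : R_s(t)
    multipliers[t][s] : lambda_s(t) >= 0
    """
    total = 0.0
    for t, stage in enumerate(actions):
        for s, u in stage.items():
            slack = budgets[t][s] - resource_use(u, costs[s])
            total -= multipliers[t][s] * slack
    return total


# Hypothetical example: T = 2 stages, S = 1 sensor, budget 2 per stage.
costs = {1: {(1, 1): 1.0, (2, 1): 1.0, (1, 2): 3.0}}
budgets = [{1: 2.0}, {1: 2.0}]
multipliers = [{1: 0.5}, {1: 0.5}]
actions = [{1: {(1, 1), (2, 1)}}, {1: {(1, 2)}}]
print(lagrangian_penalty(actions, costs, budgets, multipliers))  # 0.5
```

In this example the first stage uses exactly the budget and contributes nothing, while the second stage exceeds the budget by one unit and adds $0.5 \times 1 = 0.5$ to the augmented objective. The second action would be infeasible under the hard constraint of Eq. (2), but in the relaxed problem it is only penalized.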
