IEEE Infocom 2007
On a Routing Problem within Probabilistic Graphs
and its application to Intermittently Connected Networks
Joy Ghosh Hung Q. Ngo, Seokhoon Yoon, Chunming Qiao
Messenger Server Department of Computer Science and Engineering,
Yahoo! Inc. State University of New York at Buffalo,
Sunnyvale, CA Buffalo, NY
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
In many networks such as MANET, DTN, ICMAN, ..., links are probabilistic.
Natural to formulate the following problem.
A Routing Problem on Probabilistic Graphs
s
t
pe
G: directed graph
For each e=(u,v), pe is the probablity that u can deliver a packet to vAll pe are independent
Find a subgraph H maximizing Conn-ProbH(s,t)Subject to application-dependent constraints
Find a subgraph H maximizing Conn-ProbH(s,t)Subject to application-dependent constraints
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
Natural Questions That Follow (Talk Outline)1. How do we construct G (and compute pe)?
Accuracy of pe routing efficiency
2. What are the constraints for H?
3. Given H, how to compute Conn-ProbH(s,t)?
4. What’s the complexity of finding optimal H?
5. If complexity is too high, how to design good routing algorithms/protocols?
6. How useful is this model, anyhow?
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
1. How do we construct G (and compute pe)? Short answer: application dependent Long answer:
Don’t care (e.g. Epidemic, randomized flooding, Spray-and-Wait, …)
Locally estimate delivery predictability/frequencies (e.g., ProPHET, ZebraNet, …)
Assume an Oracle (e.g., Spyropoulos et al. in Secon’04, Jain et al. Sigcomm’05, …)
Mobility modeling/profiling (e.g., random waypoint, group mobility, freeway mobility, …) We use random orbit model from our SOLAR framework Main reasons: model built on real-world data traces, and we
already have the simulation code for it
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
2. Constraints for the delivery subgraph H Short answer: application dependent Long answer:
No constraint (blind flooding) Acyclic (Epidemic) Threshold on expected delivery time (e.g. some works
on DTN) … Our proposal: |E(H)| ≤ given threshold k, because this
will reduce Contention, thus message drops and retransmissions Data and bandwidth overheads
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
3. Given H, how to compute Conn-ProbH(s,t)? Bad news: #P-complete
Classic s,t-reliability problem Shown #P-complete by Valiant (1979)
Good news: Can be approximated with
our simple and efficient heuristics to within about 85%-90% accuracy on average
Note: May have an FPRAS using Markov-chain Monte
Carlo (along the line of Jerrum-Sinclair-Vigoda’s work, Journal of the ACM, 2004)
But: Long Standing Open Problem
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
4. What’s the complexity of finding optimal H?
Somewhat subtle: hard to compute objective function does not imply hard to optimize
Bad News: #P-hard, as we showed in the paper Good News:
Our heuristic can find reasonably good H
Note: A wide-open research direction: approximation
algorithms for #P-optimization problems.
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
#P-hardness of finding optimal H
Given directed graph D = (V,E) with edge probabilities pe, source s, destination t; computing Conn-ProbD(s,t) is #P-complete in 1979
Let c be the least common multiple of all denominators of the pe
We show that: a) If the optimal H can be found for any given G, then an
efficient decision procedure for deciding if Conn-ProbD(s,t) ≤ c’/c can be designed for any c’ ≤ c.
b) If such a decision procedure exists, then we can compute Conn-ProbD(s,t) with a simple binary search
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
#P-hardness of finding optimal H
Add path with k=|E(D)| edges to D to get G with Πk
i=1pi = c’/c + ε, for any ε < 1/c Suppose finding optimal H can be done efficiently If H is the
Upper part Conn-ProbD(s,t) > c’/c
Lower part Conn-ProbD(s,t) ≤ c’/c
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
5. Heuristic for finding good H (i.e. routing algo)
Edge-constrained routing EC-SOLAR-KSP
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
Simulation Parameters
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
Edge-constrained routing – EC-SOLAR-KSP
EC-SOLAR-KSP1 L = |E| EC-SOLAR-KSP2 L = 0.8 * |E| EC-SOLAR-KSP3 L = 0.6 * |E| PROB-ROUTE Pinit = β = γ = 0.5;
A. Lindgren, A. Doria, and O. Schelen, “Poster: Probabilistic routing in intermittently connected networks,” Proceedings of The Fourth ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc 2003), June 2003.
EPIDEMIC Not shown in 2nd figure as its Network Byte Overhead was much higher A. Vahdat and D. Becker, “Epidemic routing for partially connected ad hoc networks,” Technical Report
CS- 00006, Duke University, April 2000.
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
6. How useful is this model, anyhow?
Short answer: we don’t know yet Long answer:
The good This model is potentially useful There are many interesting open questions Good solution can serve as benchmark
The bad and the ugly Edge probabilities may not be independent Practical applications can’t afford to use a centralized
algorithm Intermediate vertices may have more information than
the source at the point they are about to deliver packets This may be a good thing, may be a bad thing!
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
Concluding Remarks
Contributions Formulatied of a new routing problem within
probabilistic graphs Addressed several aspects of the problem: graph
construction, complexity, routing heuristics Open problems
Approximation algorithm for this #P-Hard problem Better distributive algorithm Better Heuristics for this and other mobility models
IEEE Infocom 2007
Thank You!
Questions?
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
Mobile Users
• influenced by social routines
• visit a few “hubs” /
places (outdoor/indoor) regularly
• “orbit” around (fine to coarse grained) hubs at several levels
Sociological Orbit Framework
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
Illustration of A Random Orbit Model(Random Waypoint + Corridor Path)
Conference Track 1
Conference Track 3
Cafeteria
Lounge
Conference Track 2
Conference Track 4
PostersRegistration
Exhibits
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
Random Orbit Model
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
How to compute model’s parameters?
ETH Zurich traces 1 year from 4/1/04 till 3/31/05 13,620 wireless users, 391
APs, 43 buildings Mapped APs into buildings
based on AP’s coordinates, and each building becomes a “hub” Converted AP-based traces
into hub-based traces
Real-worldData Traces
Hub-Lists(Binary Vectors)
Users’ Mobility Profiles
Hub TransitionProbabilities
Hub Staying TimeDistributions
Hub Transition Time Distributionss
Model Trained UsingEM-Algorithm
Each Profile is a weighted Hublist, e.g.Profile = (0.4, 0.5, 0.9)
Each profile is a cluster mean obtained via the Expectation Maximization (EM) Algorithm
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
Examples of Hub-Lists and Mobility Profiles On any given day, a user may regularly visit a small number of “hubs”
(e.g., locations A and B) Each mobility profile is a weighted list of hubs, where weight = hub visit
probability (e.g., 70% A and 50% B) In any given period (e.g., week), a user may follow a few such “mobility
profiles” (e.g., P1 and P2) Each profile is in turn associated with a (daily) probability (e.g., 60% P1
and 40% P2) Example: P1={A=0.7, B=0.5} and P2={B=0.9, C=0.6}
On an ordinary day, a user may go to locations A, B and C with the following probabilities, resp.: 0.42 (=0.6x0.7), 0.66 (= 0.6x0.5 + 0.4+0.9) and 0.24 (=0.4x0.6)
20% more accurate than simple visit-frequency based prediction Knowing exactly which profile a user will follow on a given day can result
in even more accurate prediction
On any given day, a user may regularly visit a small number of “hubs” (e.g., locations A and B)Each mobility profile is a weighted list of hubs,
where weight = hub visit probability (e.g., 70% A and 50% B)
In any given period (e.g., week), a user may follow a few such “mobility profiles”
(e.g., P1 and P2)Each profile is in turn associated with a (daily) probability (e.g., 60% P1 and 40% P2) Example: P1={A=0.7, B=0.5} and P2={B=0.9, C=0.6}On an ordinary day, a user may go to locations A, B & C with the following probabilities: 0.42 (=0.6x0.7), 0.66 (= 0.6x0.5 + 0.4+0.9), 0.24 (=0.4x0.6)• 20% more accurate than simple visit-frequency based prediction• Knowing exactly which profile a user will follow on a given day can result in even more accurate prediction
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
Profiling illustration
Obtain daily hub stay durations
Translate to binary hub visitation vectors
Apply clustering algorithm to find mixture of profiles
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
Profile parameters for all sample users
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
Approximation algorithm for Conn-ProbG(s,d) G = (V, E) where edge
probability between nodes u and v is pe(u,v)
In G, starting from s, all nodes choose at most k downstream edges to get Gk = (V, Ek) (b)
Weight of each edge in Gk is set to we(u,v) = -1 * log (pe(u,v)) to get
G’k say
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
Approximation algorithm for Conn-ProbG(s,d)
Compute shortest path from s to all nodes in G’k to get Gsp = (V, Esp) & assign BFS level
Reset we(u,v) = pe(u,v) & add all edges (v,d) that were in G to get G’ = (V, E’) (d)
Let Pd(u,v) be delivery probability of node u to v
Apply Approximation Algorithm to G’ to get Pd(s,d) Start with any u ≠ d with maximum level # Pd(u,d) = 1 – Πk
1(1 – pi) Where pi = we(u,vi) * Pd(vi, d) for all edges
(u,vi)
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
Approximation algorithm for Conn-ProbG(s,d)
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
Optimal algorithm for delivery probability
Calculate all paths from s to d Apply Algorithm 2 by rules of
inclusion and exclusion
State University of New York (SUNY) at BUFFALO IEEE Infocom 2007
Approximation ratio simulation
Top Related