Doctoral thesis defense Arezu Moghadam 13 May 2011
Application platform, routing protocols and behavior models in mobile
disruption-tolerant networks (DTNs)
Doctoral thesis defense
Arezu Moghadam, 13 May 2011
Introduction
Communication in mobile DTNs:
1 – No knowledge of the routes beyond the immediate hop
2 – Mobility
3 – Opportunistic
[Figure: mobile nodes carrying data items D1 and D2; Internet access over WiFi or 3G]
DTN: Disruption-Tolerant Networks
Introduction
Applications of mobile DTNs:
  Covering regions with no infrastructure, e.g. natural disasters
  Retrieving data from remote sensor networks
  Sharing music, news, pictures in the subway or networks of pedestrians
  Collaborative ad-hoc environments
Challenges of mobile DTNs
  Networking and connectivity
  No application server or end-to-end communication path
  Different routing requirements and models
  Performance of the applications and routing algorithms relies on the mobility behavior of mobile users
Problem scope
Mobile DTNs
  Applications: a modular app. platform
  Routing: popularity-based and interest-aware communication models
  Mobility: Markov-based mobility model and routing algorithm
Problem scope
Mobile DTNs
  Applications
    Class of disruption-tolerant applications
    Core functional requirements
    A modular app. platform
Motivation
Problem
[Figure: no reachable Internet or 3G connectivity]
Solution
7DS platform
  Provides a class of disruption-tolerant applications
  Store-carry-forward communication
  Node and service discovery
  Web, email, file-synchronization and bulletin-board
  Modular platform for application developers
Suman Srinivasan, Arezu Moghadam, Se Gi Hong, Henning G. Schulzrinne, "7DS - Node Cooperation and Information Exchange in Mostly Disconnected Networks", IEEE International Conference on Communications (ICC), Jun 2007.
Email exchange
  Mobile nodes act as mail transport agents (MTA)
  Email client configuration
    SMTP server is set to the 7DS local MTA in the email client
  Database
    TTL and relay identities, to avoid loops.
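The loop-avoidance idea on this slide (a TTL plus the identities of the relays that already carried a message) can be sketched as follows. This is a minimal illustration, not the 7DS implementation: the `StoredMail` record, the `accept_for_relay` helper and the hop-count interpretation of TTL are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class StoredMail:
    """Hypothetical record a relay keeps for each stored email."""
    message_id: str
    ttl: int                                     # remaining hops (assumed unit)
    relays: list = field(default_factory=list)   # node ids that already carried it


def accept_for_relay(mail, node_id):
    """Return the copy node_id should store, or None if it must refuse."""
    if mail.ttl <= 0:
        return None          # expired: stop forwarding
    if node_id in mail.relays:
        return None          # node already relayed it: accepting would loop
    return StoredMail(mail.message_id, mail.ttl - 1, mail.relays + [node_id])


mail = StoredMail("msg-1", ttl=2)
hop1 = accept_for_relay(mail, "A")
hop2 = accept_for_relay(hop1, "B")
```

With `ttl=2` the message can be accepted by two relays; a third relay, or a relay that has already carried it, refuses the copy.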
File synchronization
7DS nodes running file-sync application (view of the nodes before sync).
Shared folder content:
  test1.txt=2e6480af642eeba3;1170886792000
  test2.txt=a66a86c11861cb0e;1170957333000
Shared folder content:
  test1.txt=2e6480af642eeba3;1170886792000
  test3.doc=a6ba76c21861db5e;1170757443000
Shared folder content:
  test1.txt=2e6480af642eeba3;1170886792000
  test4.doc=c78a56b341861cd06;1170867833000
Shared folder content:
  test1.txt=2e6480af642eeba3;1170886792000
  test2.txt=a66a86c11861cb0e;1170957333000
  test4.doc=c78a56b341861cd06;1170867833000
Discovery
7DS nodes running file-sync application (view of the nodes after sync).
All shared folders content after sync:
  test1.txt=2e6480af642eeba3;1170886792000
  test2.txt=a66a86c11861cb0e;1170957333000
  test3.doc=a6ba76c21861db5e;1170757443000
  test4.doc=c78a56b341861cd06;1170867833000
Sync
Pull-based: automatic download
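The pull-based synchronization above can be illustrated with a short sketch. It assumes that each listing entry has the form `name=hash;timestamp` as shown on the slide, and that a node pulls a file when it is missing locally or when the peer holds a different, newer version; `parse_listing` and `files_to_pull` are hypothetical helper names, not part of 7DS.

```python
def parse_listing(text):
    """Parse whitespace-separated 'name=hash;timestamp' entries."""
    listing = {}
    for entry in text.split():
        name, rest = entry.split("=", 1)
        digest, stamp = rest.split(";", 1)
        listing[name] = (digest, int(stamp))
    return listing


def files_to_pull(local, remote):
    """Names the local node should download from the remote node."""
    wanted = []
    for name, (digest, stamp) in remote.items():
        mine = local.get(name)
        # Pull if we lack the file, or the peer has a different and newer copy.
        if mine is None or (mine[0] != digest and mine[1] < stamp):
            wanted.append(name)
    return sorted(wanted)


node_a = parse_listing(
    "test1.txt=2e6480af642eeba3;1170886792000 "
    "test2.txt=a66a86c11861cb0e;1170957333000"
)
node_b = parse_listing(
    "test1.txt=2e6480af642eeba3;1170886792000 "
    "test3.doc=a6ba76c21861db5e;1170757443000"
)
pull_a = files_to_pull(node_a, node_b)
pull_b = files_to_pull(node_b, node_a)
```

After both nodes pull, each holds test1.txt, test2.txt and test3.doc, which matches the "after sync" view for these two listings.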
Bulletin board system
  Push-based data sharing
  Data exchange should be approved by the user
  Metadata in an XML format
  Users can generate and share content in the spirit of Web 2.0
[Figure: 7DS Access Box at 116th & Broadway]
1. User publishes announcements on the bulletin board.
2. Users can search for and read bulletin board announcements.
[Figure: 7DS architecture]
User interface
APPs: Email, Bulletin Board, File Synchronization
Modules: Web server, Proxy server, Cache manager, Mail Transport Agent, Multicast engine, Delta compression, Discovery Module, Web query, Search engine, Data sharing
Support services, APIs
Callouts:
  BonAHA: a thin wrapper around Apple's Bonjour
  Emulates a connected communication path in the absence of Internet
  Fetches the locally cached web pages.
  Query the local neighbors
  Search the internal cache
  Implementation of the Rsync algorithm
    A more efficient use of the BW and contact opportunity.
    Useful when someone has a newer version of the stale file
1 - Arezu Moghadam, Suman Srinivasan, Henning Schulzrinne, "7DS - A Modular Platform to Develop Mobile Disruption-tolerant Applications", Second IEEE Conference and Exhibition on Next Generation Mobile Applications, Services, and Technologies (NGMAST 2008), Sep 2008.
2 - Suman Srinivasan, Arezu Moghadam, Henning Schulzrinne, "BonAHA: Service Discovery Framework for Mobile Ad-Hoc Applications", IEEE Consumer Communications & Networking Conference 2009 (CCNC'09), Jan 2009.
Problem scope
Mobile DTNs
  Routing
    Lack of group communication model
    Popularity-based, interest-aware model
Routing Problem
  Store-carry-forward
    Storage constraints
  Routing objectives:
    Minimize delay
    Maximize throughput
  Per-hop routing vs. source routing
    No end-to-end path
  MANET's routing protocols fail
    Proactive and reactive
    No knowledge of the topology
    Time varying connectivity graph
  Unicast vs. Multicast
$e = ((u, v), c_n(t), d_n(t))$
[Figure: contact graph with nodes S, u, x, w, v, D]
Each edge is a contact, meaning an opportunity to transfer data.
Problem – lack of group communication model for mobile DTNs
  Anycast communication model
    Emergencies
    Traffic congestion notifications
    Severe weather alerts
  Traditional multicast as a group communication model
    Fails!
    No knowledge of the topology
    No infrastructure to track group memberships
  Communication with communities of interest
    Even a harder problem!
    Market news, sport events
    Scientific articles
    Advertisement about particular products
  Epidemic routing
Solution – interest-aware communication model
Our one-to-many communication model with communities of users
Objective: transmitting data to users who are interested in the content
Assumptions
  No previous knowledge about the location of the recipients
  No knowledge about the mobility behavior of users
  No previous knowledge about interests of users
  Uniform probability of encounter
[Figure: source S and nodes X, Y (labels a to g) passing data items D over wireless contacts in numbered steps; legend: wireless contact, data transfer]
Arezu Moghadam, Henning Schulzrinne, "Interest-aware content distribution protocol for mobile disruption-tolerant networks", 10th IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks, Kos, Greece, Jun 2009.
Interest Vector
  User profiling for the Web
    Profiles users based on their downloaded or reviewed web content, clicked hyperlinks and…
  Music
    The genre of the music user is playing more often
  Topic and category of the documents user has downloaded
[Figure: Monitoring Behavior (music, reviewed webpages, downloaded documents, restaurant reviews) yields Interests – IV]
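Solution – interest-aware communication model
[Figure: node i (cache i, interest vector $I_{ik}$) meets node j (cache j, interest vector $I_{jk}$); steps 1 to 3 end with document D being transferred]
Transfer condition: $\mathrm{correlation}(D, I_{jk}) >$ threshold
$I_{jk}$: interest vector of node j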
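The transfer decision can be sketched as a similarity test between a document vector and the interest vector of the encountered node. The slide only says "correlation"; cosine similarity, the function names and the threshold value 0.5 below are assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def should_transfer(doc_vector, interest_vector, threshold=0.5):
    """Forward the document only if it matches the peer's interests closely enough."""
    return cosine(doc_vector, interest_vector) > threshold


interest_j = [1.0, 0.0, 0.0]      # hypothetical interest vector of node j
doc_match = [1.0, 0.0, 1.0]       # shares the first topic with node j
doc_other = [0.0, 1.0, 0.0]       # unrelated topic
```

Here `doc_match` is forwarded to node j and `doc_other` is not.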
LSA
  User profiling for the Web
    Profiles users based on their downloaded or reviewed web content, clicked hyperlinks and…
  Latent Semantic Analysis
    A low-dimensional topic-based representation of web documents is obtained
    Then low-dimensional representations are clustered to semantic groups
[Figure: Document-Term Matrix; rows are documents $d_1, \dots, d_i, \dots, d_I$ (D), columns are terms $w_1, \dots, w_j, \dots, w_J$ (W)]
Singular Value Decomposition (SVD)
$A = U \Sigma V^T$, with $A$: $m \times n$, $U$: $m \times r$, $\Sigma$: $r \times r$, $V^T$: $r \times n$
Rank-$k$ approximation: $A_k = U_k \Sigma_k V_k^T$, $k \ll r$
Projection of a document into the reduced space: $\tilde{D} = D^T U_k \Sigma_k^{-1}$
Interest-aware music sharing app.
[Figure: P2P Music Bulletin Board; genres: Rock, Soul, Pop, Jazz; artists: Adele, Madonna, Vampire Weekend, Miles Davis; option: Reviews all]
Problem with interest-aware: Greedy!
[Figure: source S and nodes X, Y (labels a to h) passing data items D over wireless contacts in numbered steps; legend: wireless contact, data transfer]
Solution – PEEP
  Still interest-aware
    Interest vectors; binary
    Learning interests: feedback from user, # data items of each category, play times for music files, or LSA
  Transmit-budget
    Amount of data items allowed for transmission at each connection
    How to divide the transmit budget?
      1: Items of interest? 2: Others?
  Popularity
    Should be estimated
[Figure: binary interest vector 1 0 0 1 1 1 0 over topics T1 T2 T3 T4 T5 T6 T7; label: Popular]
Arezu Moghadam, Henning Schulzrinne, "PEEP: Popularity-based and Energy Efficient Protocol for Data Distribution in Mobile DTNs", CCNC'2011 - Smart Spaces and Personal Area Networks, Las Vegas, USA, Jan 2011.
Popularity estimation
  Contact window N
  History of the users' interests
  Average or weighted average
  Example: C=6, N=8
  Replace the oldest
$P_i = \frac{1}{N} \sum I_i$ (sum over the N interest vectors in the contact window)
       T1   T2   T3   T4   T5   T6
        1    0    1    0    0    1
        1    0    0    1    1    1
        0    1    0    0    0    0
        1    0    0    1    0    0
        0    0    1    0    0    0
        0    1    0    0    0    0
        1    1    0    0    0    0
        1    0    1    0    0    0
P_i   .62  .37  .37  .25  .12  .25
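The windowed average can be reproduced with a few lines of code. The sketch below assumes a plain (unweighted) average over the last N binary interest vectors and uses a bounded queue so that a new contact replaces the oldest one; the class name `PopularityEstimator` is hypothetical.

```python
from collections import deque


class PopularityEstimator:
    """Average of the last `window` binary interest vectors, per category."""

    def __init__(self, categories, window):
        self.categories = categories
        self.history = deque(maxlen=window)   # the oldest vector drops out automatically

    def observe(self, interest_vector):
        if len(interest_vector) != self.categories:
            raise ValueError("interest vector has the wrong length")
        self.history.append(tuple(interest_vector))

    def popularity(self):
        n = len(self.history)
        if n == 0:
            return [0.0] * self.categories
        return [sum(v[i] for v in self.history) / n for i in range(self.categories)]


# The C=6, N=8 example from the slide.
contacts = [
    [1, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
]
estimator = PopularityEstimator(categories=6, window=8)
for vector in contacts:
    estimator.observe(vector)
estimate = estimator.popularity()
```

The result is 0.625, 0.375, 0.375, 0.25, 0.125, 0.25, which the slide shows truncated to two digits. Before the window fills up, this sketch divides by the number of contacts seen so far rather than by N, which is a design choice of the example.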
Evaluation of PEEP
Problem scope
Mobile DTNs
  Mobility
    Markov models to model users' movement
    Markov-based routing algorithm
Mobility is a crucial factor!
[Figure: network partition between source S and destination D]
Mobility models
  Mobility models usage
    Application provisioning and evaluation of routing protocols performance analysis
    QoS in cellular networks
  Problem: inadequacy of the current synthetic and trace-based mobility models
  Trace-based studies
    Precision and granularity
    Specific population of study
  Our empirical analysis based on a new set of traces
    Calculating patterns of human movement and using them in designing routing protocols
Problem with the current models
  Synthetic models mostly based on RWP (random waypoint)
    Simplified assumptions about human movement
  Synthesized or trace-driven models
    Cellular networks
      Handoff predictions for QoS
      Movement of the node is not important within the cell
    Mobile DTNs
      No cell-tower or AP
      Impact of the mobility is higher on data propagation
      Traces or models extracted for cellular networks are not fine-grained enough!
      Traces from a limited number of users from a specific class
      Traces from APs with not enough granularity
Arezu Moghadam, Tony Jebara, Henning Schulzrinne, "A Markov Routing Algorithm for Mobile DTNs based on Spatio-Temporal Modeling of Human Movement Data", ACM MSWiM 2011, Miami Beach, FL, USA, Oct 2011.
Spatial and Temporal Patterns
  8 AM: Home
  9 AM: Drop kid @ school
  10 AM: Work
  12 PM: Café X
  1 PM: Work
  4 PM: Coffee X
  6 PM: Work
  7 PM: Shop Y
  8 PM: Home
  10 PM: Bar Z
  12 AM ~ 8 AM: Home
Sense Networks' traces
  GPS traces of a wide spectrum of mobile users
  Citysense application
    Nightlife discovery
    Friend-finder
  Privacy concerns
    People are owners of their own data
  GPS precision of 20 feet compared to 1~20 miles cell-tower coverage
  Population of 10,000 users
Data presentation
  Sequence of grids: G1, G1, G17, G23, …, GN, …
  Learning mechanism
    Ngrams
      A subsequence of N items from a sequence
      Modeling sequences in NLP, gene sequence analysis, speech recognition
  Goal: most probable future locations
  Pattern
    Likelihood of traversing a given sequence.
[Figure: map divided into grid cells, numbered 1 to 12 and labeled A to O]

Tuples of Grids
             5039907665  5038663466  5038414624  5038414623  5060063904  5053345115  ...
5039907665         1370          10         230          10           0          30  ...
5038663466           30         130         110           0           0           0  ...
5038414624          220         110        3420         120           0          60  ...
5038414623           10           0          50           0          14           0  ...
5060063904            0           0          12         110           0           0  ...
5053345115            0          50           0          13         176         343  ...
...

Triple of Grids
                        5039907665  5038414624  5050607875  5053345115  5038414623  ...
5039907665 5039907665         1180         110          30          30          10  ...
5039907665 5038414624            0         230           0           0           0  ...
5038414624 5038414624          220        2820          10          60         110  ...
5038414624 5039907665          110         100           0           0           0  ...
5039907665 5050607875            0          20          10           0           0  ...
5050607875 5038414624            0          30         133           0          44  ...
...
Ngrams
  G1, G2, …, Gi, …, Gn
  Training
    Extract bigram and trigram tables.
  Testing
    Calculating the likelihood of a new observation
$\log P(G_1, G_2, G_3, \dots, G_i, \dots, G_N) = \sum_{i} \log P(G_i)$
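The training and testing steps can be sketched for the bigram case. The example below counts consecutive grid pairs and scores a new sequence by summing log-probabilities. Conditioning each grid on its predecessor and the add-alpha smoothing for unseen pairs are assumptions of this sketch, as are the function names and the toy grid sequence.

```python
import math
from collections import Counter


def train_bigrams(sequence):
    """Count consecutive grid pairs and how often each grid starts a pair."""
    pair_counts = Counter(zip(sequence, sequence[1:]))
    first_counts = Counter(sequence[:-1])
    return pair_counts, first_counts


def log_likelihood(sequence, pair_counts, first_counts, vocab_size, alpha=1.0):
    """Sum of log P(next grid | current grid) with add-alpha smoothing."""
    total = 0.0
    for prev, cur in zip(sequence, sequence[1:]):
        numerator = pair_counts[(prev, cur)] + alpha
        denominator = first_counts[prev] + alpha * vocab_size
        total += math.log(numerator / denominator)
    return total


training = ["G1", "G1", "G17", "G23", "G1", "G1", "G17"]
pairs, firsts = train_bigrams(training)
seen = log_likelihood(["G1", "G17", "G23"], pairs, firsts, vocab_size=3)
unseen = log_likelihood(["G23", "G17", "G1"], pairs, firsts, vocab_size=3)
```

A path that follows the observed pattern (`seen`) scores higher than one that traverses the same grids in an order never observed (`unseen`). A trigram table would be built the same way from triples of consecutive grids.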
Markov chains for users' movement
  Set of states
    S = {s1, s2, …, sr}
  Transition matrix
    Transitions correspond to consecutive GPS pings
  Users' mobility profiles
  Pattern
    States should be positive recurrent
    Finite hitting times with prob. 1
  Matrix of hitting times
[Figure: sparse transition matrix $P_{ij}$; both axes: grids (100 ft); annotations: 50%, 25%, 10%]
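Markov-based routing algorithm
  Absorption (hitting) times = number of transitions until chain arrives at state j starting @ i
  Select the relay (r) with less absorption time than source (s).
[Figure: 4-state Markov chain (states 1 to 4) with transition probabilities 1.0, .7529, .0882, .625, .0588, 1.0, .375, 0.1]
$N_{ij}$: number of transitions from state i until first arrival at state j
$T_{ij} = E[N_{ij}]$
$T = \begin{pmatrix} t_{11} & t_{12} & \cdots & t_{1n} \\ t_{21} & t_{22} & \cdots & t_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ t_{n1} & t_{n2} & \cdots & t_{nn} \end{pmatrix}$
Relay selection: $T^{r}_{ij} < T^{s}_{ij}$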
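Expected hitting times follow from a linear system: with $t_j = 0$ for the target state j, every other state satisfies $t_i = 1 + \sum_k P_{ik} t_k$. The sketch below solves that system with a small Gauss-Jordan routine and applies the relay rule. The 3-state chain is a made-up example (not the chain on the slide), the function names are hypothetical, and the target is assumed to be reachable from every state.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def hitting_times(p, target):
    """Expected number of transitions to first reach `target` from each state."""
    others = [i for i in range(len(p)) if i != target]
    # (I - Q) t = 1, where Q is p restricted to the non-target states.
    a = [[(1.0 if i == k else 0.0) - p[i][k] for k in others] for i in others]
    t = solve(a, [1.0] * len(others))
    times = [0.0] * len(p)
    for i, value in zip(others, t):
        times[i] = value
    return times


def better_relay(times, source_state, relay_state):
    """Relay rule: hand over only if the relay is expected to arrive sooner."""
    return times[relay_state] < times[source_state]


# Hypothetical 3-state chain; the destination is state 2.
P = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0],
]
times = hitting_times(P, target=2)
```

For this chain the expected hitting times of state 2 are 4 transitions from state 0 and 2 from state 1, so a relay currently in state 1 is preferred over a source in state 0.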
Monte Carlo simulation
[Figure: users' Markov chains feed the Mobility Generator Engine (sampling from the Markov chains); it outputs users' locations after each transition to the Routing Algorithm Emulator]
  Delay = #transitions
  Energy = #transmissions
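The two stages of the simulation (sampling locations from Markov chains, then emulating routing on those locations) can be sketched together. This is a deliberately simplified illustration: all nodes share one hypothetical transition matrix, the message is copied epidemically whenever two nodes are in the same state, and delivery means that some carrier reaches a fixed destination state. None of these choices is taken from the thesis.

```python
import random


def step(p, state, rng):
    """Sample the next state from row `state` of transition matrix p."""
    return rng.choices(range(len(p)), weights=p[state])[0]


def simulate(p, start_states, dest_state, rng, max_steps=10_000):
    """Return (delay, transmissions), or None if nothing arrives in max_steps.

    Node 0 is the source. Delay counts transitions, energy counts transmissions.
    """
    states = list(start_states)
    has_copy = [True] + [False] * (len(states) - 1)
    transmissions = 0
    for delay in range(1, max_steps + 1):
        states = [step(p, s, rng) for s in states]
        for i, s in enumerate(states):
            meets_carrier = any(
                has_copy[k] and states[k] == s for k in range(len(states))
            )
            if not has_copy[i] and meets_carrier:
                has_copy[i] = True        # one transmission per new copy
                transmissions += 1
        if any(has_copy[i] and states[i] == dest_state for i in range(len(states))):
            return delay, transmissions
    return None


# Deterministic cycle 0 -> 1 -> 2 -> 0, used so the outcome is predictable.
CYCLE = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
]
result = simulate(CYCLE, start_states=(0, 0), dest_state=2, rng=random.Random(0))
```

On the deterministic cycle both nodes move together, so the second node receives one copy after the first transition and the message reaches state 2 after two transitions. With a stochastic matrix the function would be called many times and the delays and transmission counts averaged.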
Performance measure
  Performance objective
    Delay
    Consumed energy
  Family of α-epidemics
  Measure performance curve: $\mathrm{delay} = f(\mathrm{energy})$
  $\mathrm{delay}_{MB} = f(\mathrm{energy}_{MB})$ ?
[Figure: source S forwarding to relays R for α = 100%, α = 70%, α = 30%]
Evaluation of results
[Figure: result curves for α = 1, 0.7, 0.3, 0.2, 0.1; panels: Random Destination, Popular Destination]
Conclusion
Mobile DTNs
  Applications
    Class of disruption-tolerant applications; core functional requirements
    Developed a modular platform (released on SourceForge)
  Routing
    Classes of routing protocols; group communication model
    Developed Interest-Aware and PEEP algorithms
    Mobile music-sharing system
  Mobility
    Simulations based on mobility; synthetic & synthesized models
    Markov-based mobility model and routing algorithm
      1 – N-grams to estimate future locations
      2 – Routing based on Markov model
      3 – Best to route to popular locations
Back up slides

Rsync Algorithm
Client
  Old file: for each block, insert (Checksum, Hash) = (R0, H0), (R1, H1), …, (R6, H6)
  Signatures file (to be sent to server)
Server
  Signatures file (received from client): (R0, H0), (R1, H1), …, (R6, H6)
  New file: look up hash
    matching: (pointer i) Copy
    non-matching: Download
  Difference (deltas file, to be sent back to the client):
    (pointer i) Copy, (pointer i+1) Copy, (pointer i+2) Copy, Download, (pointer i+4) Copy, (pointer i+5) Copy
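The client/server exchange in the diagram can be sketched in a few functions. This is a simplified illustration of the rsync idea, not the 7DS code: the block size is tiny, the weak checksum is a plain byte sum that is recomputed for every window instead of being rolled, MD5 stands in for the strong hash, and all function names are hypothetical.

```python
import hashlib

BLOCK = 4  # tiny block size, for illustration only


def weak(block):
    """Cheap checksum used as the first-level lookup key."""
    return sum(block) % 65536


def strong(block):
    """Strong hash that confirms a weak-checksum match."""
    return hashlib.md5(block).hexdigest()


def signatures(old):
    """Client: map weak checksum -> [(strong hash, block index)] for the old file."""
    table = {}
    for index in range(len(old) // BLOCK):
        block = old[index * BLOCK:(index + 1) * BLOCK]
        table.setdefault(weak(block), []).append((strong(block), index))
    return table


def delta(new, table):
    """Server: describe the new file as ('copy', block index) and ('data', bytes)."""
    ops, literal, pos = [], bytearray(), 0
    while pos < len(new):
        window = new[pos:pos + BLOCK]
        match = None
        if len(window) == BLOCK:
            for digest, index in table.get(weak(window), []):
                if digest == strong(window):
                    match = index
                    break
        if match is None:
            literal.append(new[pos])       # non-matching byte: must be downloaded
            pos += 1
        else:
            if literal:
                ops.append(("data", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", match))    # matching block: client copies it locally
            pos += BLOCK
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def patch(old, ops):
    """Client: rebuild the new file from the old file and the deltas."""
    out = bytearray()
    for kind, value in ops:
        out += old[value * BLOCK:(value + 1) * BLOCK] if kind == "copy" else value
    return bytes(out)


old_file = b"aaaabbbbcccc"
new_file = b"aaaaXXbbbbcccc"
ops = delta(new_file, signatures(old_file))
```

Only the two inserted bytes travel as literal data; the three unchanged blocks are sent as copy pointers, which is where the saving in bandwidth and contact time comes from.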
Current routing models
  Single-source single-destination (no knowledge of topology)
    Flooding based protocols
      Epidemic
    Probabilistic routing
      PROPHET [57], RPLM [79], MaxProp [21]
    Context or behavior of mobile users
      HiBOp [18], Profile-cast [42], MobySpace [54]
  Multicast
    Extends the classical model with group memberships to mobile DTNs
      No infrastructure (e.g., no multicast routers)
      No knowledge of the topology
    Epidemic based multicast (no knowledge)
Probabilistic routing criteria
  PROPHET
    Delivery predictability calculation.
  Routing with Persistent Link Modeling (RPLM)
    Monitors link connectivity to calculate its cost.
    Dijkstra to find a minimum cost path.
  MaxProp
    Assigning a cost value to each destination based on probability.
    Priority queue: younger messages get higher chances.
  MobySpace
    MobyPoint: each node's coordinates or mobility pattern.
    Distance on each axis: probability of contacts or presence in a location.
Characteristics of the current models
Model | Objective | Delivery ratio | Delay | Message redundancy | Knowledge of topology
Flooding | 1-to-1, 1-to-many | High | Low (the least) | High; buffer congestion | Zero
Knowledge based | 1-to-1, 1-to-many | MF the highest (even higher than ER) | Moderate | Low | Provided to the algorithm
Probabilistic | 1-to-1 | Close to ER with tendency in mobility | Close to ER with tendency in mobility | Moderate | Memory (learning from the past)
Multicast | 1-to-many | Flooding based is the highest | Flooding based is the lowest | Flooding based is the highest | Required in non-epidemic
Interest-aware simulation results
  The ONE simulator for mobile DTNs
  Movement generation based on reality-mining's mobile traces
  Compared to epidemic multicast with the same storage constraints
    The only model with no knowledge about topology and group memberships
  Measured # relevant and irrelevant documents received by mobile users
    Increases # received relevant documents by 30%
    Decreases # received irrelevant documents by 35%
  Interest-aware algorithm limits the resource usage in terms of the cache and contact duration
Web recommender systems
  Systems for recommending items (e.g. books, movies, CDs, web pages, newsgroup messages) to users based on examples of their preferences.
  Many on-line stores provide recommendations (e.g. Amazon, CDNow).
  Personalization to the individual needs, interests, and preferences of each user.
E.g. book recommender
[Figure: a user's reading history (Red Mars, Jurassic Park, 1984, Identity, Foundation, Difference Engine, Animal Farm, Neuromancer) is fed to Machine Learning, which produces a User Profile]
Collaborative filtering
Maintains a database of many users’ ratings of a variety of items
For a given user, find other similar users whose ratings strongly correlate with the current user
Recommend items rated highly by these similar users, but not rated by the current user
Almost all existing commercial recommenders use this approach (e.g. Amazon)
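The three steps above can be sketched as follows. The example assumes Pearson correlation over the items both users have rated, picks the single most similar user, and recommends that user's highly rated items that the active user has not rated. The ratings, the `min_rating` cut-off and the function names are illustrative assumptions.

```python
import math


def pearson(a, b):
    """Pearson correlation of two rating dicts over their common items."""
    common = [item for item in a if item in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = math.sqrt(
        sum((a[i] - mean_a) ** 2 for i in common)
        * sum((b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def recommend(active, database, min_rating=8):
    """Items rated highly by the most similar user and unrated by the active user."""
    best = max(database.values(), key=lambda other: pearson(active, other))
    return sorted(
        item for item, rating in best.items()
        if item not in active and rating >= min_rating
    )


active_user = {"A": 9, "B": 3, "Z": 5}
user_database = {
    "u1": {"A": 10, "B": 4, "C": 8, "Z": 1},
    "u2": {"A": 5, "B": 3, "C": 2, "Z": 7},
}
suggestions = recommend(active_user, user_database)
```

User `u1` correlates more strongly with the active user than `u2`, so item C, which `u1` rated 8, is recommended.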
Collaborative filtering
[Figure: the Active User's ratings are matched by correlation against the User Database; recommendations are extracted from the best-matching user]
The ONE Simulator
  A modular simulation environment for mobile DTNs
  Implements routing packages for one-to-one model
    Prophet
    Epidemic
    Spray and wait
  Internal and external mobility generation
    RWP
    Map based
    Stationary
  Internal and external message event generation
  Reports of contacts and message transmission
[Figure: snapshot of map-based movement]
Interest-aware protocol implementation
  Interest-aware routing as a new module for the routing package
  General categories for documents
  Each node randomly assigned with some interest in each category
  A sub-population is randomly selected to be in the same community of interest
  Documents/messages are generated from nodes outside this community
  Coverage, pollution and dropped messages
Choice of mobility model for interest-aware
  Synthetic mobility traces
    RWP
    Map-based
    Community-based
  Speed of nodes
  Residence time
  Directions
  More realistic simulation with real-world traces
    Reality-mining traces
Users' behavior: Reality Mining
  Social behavior study
    Users' encounters and visited locations
    How predictable are people's lives?
    How does information flow?
  100 subjects with Nokia 6600 Symbian phones.
  Logs AP, GSM base stations, users' encounters and call logs.
  Goal: learn users' behaviors and social network studies.
Reality-mining database
  MySQL database
  Tables in REALMINE: activityspan, callspan, cellname, cellspan, celltower, coverspan, device, devicespan, person, phonenumber
  Device, Devicespan, Person
    person-person contacts
    device-device contacts
Relations we used
  person: oid (PK), name, password
  devicespan: oid (PK), starttime, endtime, person_oid (FK1), device_oid (FK2)
  device: oid (PK), macaddr, name, person_oid (FK1)
Statistics and simulation set up
  Reality-mining subjects: 97
  Total number of encountered devices: 20795
  44% of contacts with duration 0
  15% of total contacts with devices outside the reality-mining
    66% of these contacts just happened once!
  40% have been considered in the same community of interests
  Fixed number of general categories
Optimization criteria for PEEP
  Maximize the number of received items of interest
  Minimize the delay of data distribution
  Not two independent values!
    The more the distribution the less the delay
$C_i$ has $N_i$ nodes interested; $S_i$ = set of nodes interested in $C_i$; $|S_i| = N_i$
$\min \sum_i \frac{1}{N_i} \sum_{j \in S_i} t_{ij}$
PEEP implementation in The ONE
  PEEP routing as a new module for the routing package
  General categories for documents
  Each node is assigned some interest in each category based on Zipf distribution
    Distribution of the popular items follows Zipf law
  No knowledge of the topology
  Documents/messages are generated uniformly from different sources
  Measurements:
    Number of received documents of interest over time
    Number of received documents of interest over contacts
    Speed of the distribution (slope of the graph)
Choice of mobility model for PEEP
  Synthetic mobility traces
    RWP
    Map-based
    Community-based
  Speed of nodes
  Residence time
  Directions
  The relative performance of the algorithm should be independent from the choice of the mobility model
    Our choice: RWP
    A constant slope verifies this fact
Evaluation of the results
  If storage size is low, buffer overflow happens too soon
    No chance for the items of interest to survive
  The most important difference with our previous work
    Unlimited storage size
    Limited energy (transmit-budget)
    Not far from the reality
[Figure: low storage size; Epidemic vs. Interest-aware]
[Figure: medium ~ high storage sizes]
Levy flight
  Human walk follows a Levy flight distribution
    A random walk for which step size follows a power-law distribution; r: step size
  Rhee et al.
    GPS traces of 44 users; truncated power-law
  Brockmann et al.
    Bank notes: fat-tailed power-law
  Gonzalez et al.
    Cell phone traces of 100,000 users; truncated power-law
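A random walk with power-law step sizes can be sketched with inverse-transform sampling. The example assumes a truncated power law with density proportional to $r^{-\alpha}$ on $[r_{min}, r_{max}]$ and uniformly random directions. The exponent and bounds used here are placeholders, not values reported in the cited studies, and the function names are hypothetical.

```python
import math
import random


def truncated_power_law(alpha, r_min, r_max, rng):
    """Draw a step size with density proportional to r**(-alpha) on [r_min, r_max]."""
    u = rng.random()
    if alpha == 1.0:
        return r_min * (r_max / r_min) ** u
    e = 1.0 - alpha
    return (r_min ** e + u * (r_max ** e - r_min ** e)) ** (1.0 / e)


def levy_walk(steps, alpha, r_min, r_max, rng):
    """2-D walk: power-law step length, uniformly random direction."""
    x = y = 0.0
    path = [(x, y)]
    for _ in range(steps):
        r = truncated_power_law(alpha, r_min, r_max, rng)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x += r * math.cos(theta)
        y += r * math.sin(theta)
        path.append((x, y))
    return path


rng = random.Random(42)
samples = [truncated_power_law(1.6, 0.01, 100.0, rng) for _ in range(1000)]
path = levy_walk(50, 1.6, 0.01, 100.0, rng)
```

Most sampled steps are short, with occasional long jumps, which is the qualitative behavior the slide attributes to human movement.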
[Figure*: log-log plot of $p(r)$ against $r$ [km]]
* Graph from: D. Brockmann and F. Theis, "Money Circulation, Trackable Items, and the Emergence of Universal Human Mobility Patterns", IEEE Pervasive Computing, Volume 7, Piscataway, NJ, October 2008.