Declarative Networking: Extensible Networks with Declarative Queries
Boon Thau Loo, University of California, Berkeley
Era of change for the Internet
“In the thirty-odd years since its invention, new uses and abuses … are pushing the Internet into realms that its original design neither anticipated nor easily accommodates…”
Overcoming Barriers to Disruptive Innovation in Networking, NSF Workshop Report ’05
Efforts at Internet Innovation
Evolution: overlay networks
- Commercial: Akamai, VPNs, MS Exchange servers
- P2P: file sharing, telephony
- Research prototypes on testbeds (PlanetLab)
Revolution: clean-slate design
- NSF Future Internet Design (FIND) program
- NSF Global Environment for Network Investigations (GENI) initiative
Missing: software tools that can significantly accelerate Internet innovation
Approach: Declarative Networking
A declarative framework for networks:
- Declarative language: “ask for what you want, not how to implement it”
- Declarative specifications of networks, compiled to distributed dataflows
- Runtime engine to execute distributed dataflows
Observation: Recursive queries are a natural fit for routing
P2 Declarative Networking System
[Figure: network specifications expressed as queries are compiled by a query planner into dataflows, which the dataflow engine executes to implement network protocols]
http://p2.cs.berkeley.edu
[Figure: the P2 dataflow for a node: local tables (lookup, link, path, …) feed a demux into per-rule strands, with round-robin scheduling, queues, congestion control (CC Tx/Rx), and UDP Tx/Rx elements]
The Case for Declarative
Ease of programming:
- Compact, high-level representation of protocols
- Orders-of-magnitude reduction in code size
- Easy customization
Safety:
- Queries are “sandboxed” within the query processor
- Potential for static analysis techniques for safety
What about efficiency?
- No fundamental overhead when executing standard routing protocols
- Application of well-studied query optimizations
- Note: the same question was asked of relational databases in the ’70s.
Main Contributions
Declarative Routing [HotNets ’04, SIGCOMM ’05]: extensible routers (balancing flexibility, efficiency, and safety)
Declarative Overlays [SOSP ’05]: rapid prototyping of new overlay networks
Database Fundamentals [SIGMOD ’06]: network-specific query language and semantics; distributed recursive query execution strategies; query optimizations, classical and new
A Breadth of Use Cases
Implemented to date:
- Textbook routing protocols (3–8 lines, UCB/Wisconsin)
- Chord DHT overlay routing (47 lines, UCB/IRB)
- Narada mesh (16 lines, UCB/Intel)
- Distributed Gnutella/Web crawlers (dataflow, UCB)
- Lamport/Chandy snapshots (20 lines, Intel/Rice/MPI)
- Paxos distributed consensus (44 lines, Harvard)
In progress:
- OSPF routing (UCB)
- Distributed junction-tree statistical inference (UCB)
Outline
Background
The Connection: Routing as a Query Execution Model
- Path-vector protocol example
- Query specification → protocol implementation
- More examples
Realizing the Connection: P2 declarative routing engine
Beyond routing: Declarative Overlays
Conclusion
Traditional Router
[Figure: the router's control plane runs the routing protocol, maintaining a neighbor table and pushing updates into the forwarding table; the forwarding plane in the routing infrastructure forwards packets]
Review: Path Vector Protocol
Advertisement: the entire path to a destination. Each node receives an advertisement, adds itself to the path, and forwards it to its neighbors.
[Figure: on the chain a–b–c–d, c advertises path=[c,d]; b adds itself and advertises path=[b,c,d]; a derives path=[a,b,c,d]]
Declarative Router
[Figure: same structure as the traditional router, but the control plane's routing protocol is replaced by the P2 engine, which executes declarative queries over input tables (the neighbor table) to produce output tables that update the forwarding table; the forwarding plane is unchanged]
Introduction to Datalog
Datalog rule syntax:
<head> :- <condition1>, <condition2>, …, <conditionN>.
Types of conditions in the body:
- Input tables, e.g., a link(src,dst) predicate
- Arithmetic and list operations
The head is an output table. A rule is recursive when its head predicate also appears in a rule body.
All-Pairs Reachability
R1: reachable(S,D) :- link(S,D)
R2: reachable(S,D) :- link(S,Z), reachable(Z,D)
Input: link(source, destination). Output: reachable(source, destination)
“For all nodes S and D, if there is a link from S to D, then S can reach D.” (R1)
link(a,b) – “there is a link from node a to node b”
reachable(a,b) – “node a can reach node b”
All-Pairs Reachability
R1: reachable(S,D) :- link(S,D)
R2: reachable(S,D) :- link(S,Z), reachable(Z,D)
Input: link(source, destination). Output: reachable(source, destination)
“For all nodes S, D, and Z, if there is a link from S to Z, AND Z can reach D, then S can reach D.” (R2)
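The two rules read as a transitive-closure computation. A hedged Python sketch (not P2 code) of naive bottom-up evaluation, with illustrative link facts:

```python
# Naive bottom-up evaluation of the reachability rules:
#   R1: reachable(S,D) :- link(S,D)
#   R2: reachable(S,D) :- link(S,Z), reachable(Z,D)
def reachable(links):
    reach = set(links)                  # R1: every link is reachable
    changed = True
    while changed:                      # apply R2 until a fixpoint
        changed = False
        for (s, z) in links:
            for (z2, d) in list(reach):
                if z == z2 and (s, d) not in reach:
                    reach.add((s, d))
                    changed = True
    return reach

# Illustrative chain topology: a -> b -> c -> d
links = {("a", "b"), ("b", "c"), ("c", "d")}
print(sorted(reachable(links)))
```

The fixpoint loop keeps applying R2 until no new tuple can be derived, which is exactly the declarative semantics of the rules.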
Towards Network Datalog
Specify tuple placement: value-based partitioning of tables.
Tuples to be combined are co-located: a rule rewrite ensures the body is always single-site.
All communication is among neighbors: no multihop routing during basic rule execution, enforced via simple syntactic restrictions.
All-Pairs Reachability
R1: reachable(@S,D) :- link(@S,D)
R2: reachable(@S,D) :- link(@S,Z), reachable(@Z,D)
Network Datalog
Queries: reachable(@M,N) (at every node M, all reachable N); reachable(@a,N) (all nodes reachable from a).
The location specifier “@S” partitions each table by its first attribute: every tuple is stored at the node that attribute names.
[Figure: network a–b–c–d. Input link tables: link(@a,b) at a; link(@b,c), link(@b,a) at b; link(@c,b), link(@c,d) at c; link(@d,c) at d. Output reachable tables: reachable(@a,{b,c,d}); reachable(@b,{a,c,d}); reachable(@c,{a,b,d}); reachable(@d,{a,b,c})]
Path Vector in Network Datalog
Input: link(@source, destination). Query output: path(@source, destination, pathVector)
R1: path(@S,D,P) :- link(@S,D), P=(S,D).
R2: path(@S,D,P) :- link(@Z,S), path(@Z,D,P2), P=S•P2.  (S•P2: add S to the front of P2)
Query: path(@S,D,P)
Query Execution
Query: path(@a,d,P)
[Figure: network a–b–c–d. Neighbor (link) tables: link(@a,b); link(@b,c), link(@b,a); link(@c,b), link(@c,d); link(@d,c). Rule R1 fires at each node; in particular, c derives path(@c,d,[c,d]) into its path (forwarding) table]
Query Execution
R1: path(@S,D,P) :- link(@S,D), P=(S,D).
R2: path(@S,D,P) :- link(@Z,S), path(@Z,D,P2), P=S•P2.
Query: path(@a,d,P)
[Figure: R2 fires at c, joining link(@c,b) with path(@c,d,[c,d]) on the matching variable Z (a database join) and sending path(@b,d,[b,c,d]) to b; R2 then fires at b, sending path(@a,d,[a,b,c,d]) to a]
Communication patterns are identical to those in the actual path-vector protocol.
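The same derivation can be simulated centrally. A hedged Python sketch of rules R1/R2, processing tuples from a work queue to mimic message arrival (the cycle check `s not in p2` is an addition for termination on cyclic topologies; real path-vector protocols get loop avoidance from the path itself):

```python
from collections import deque

# Event-driven sketch of the path-vector rules (not P2 code):
#   R1: path(@S,D,P) :- link(@S,D), P=(S,D).
#   R2: path(@S,D,P) :- link(@Z,S), path(@Z,D,P2), P=S.P2  (derived at Z, sent to S)
def path_vector(links):
    nodes = {x for l in links for x in l}
    paths = {n: set() for n in nodes}           # per-node path tables
    work = deque()
    for (s, d) in links:                        # R1 seeds the queue
        work.append((s, d, (s, d)))
    while work:
        z, d, p2 = work.popleft()               # path(@Z,D,P2) arrives at Z
        if (d, p2) in paths[z]:
            continue
        paths[z].add((d, p2))
        for (zz, s) in links:                   # R2: join with link(@Z,S) at Z
            if zz == z and s not in p2:         # loop check (added for termination)
                work.append((s, d, (s,) + p2))  # ship path(@S,D,S.P2) to S
    return paths

# Topology from the execution trace above
links = {("a","b"), ("b","c"), ("b","a"), ("c","b"), ("c","d"), ("d","c")}
pv = path_vector(links)
print(sorted(pv["a"]))
```

Node a ends up with the full source route to d, matching the trace: path(@a,d,[a,b,c,d]).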
Sanity Check
All-pairs shortest-latency-path query:
- Query convergence time is proportional to the network diameter, the same as hand-coded path vector.
- Per-node communication overhead increases linearly with the number of nodes.
- Same scalability trends as hand-coded PV/DV protocols.
Outline
Background
The Connection: Routing as a Query Execution Model
- Path-vector protocol example
- Query specifications → protocol implementations
- Example queries
Realizing the Connection
Declarative Overlays
Conclusion
Example Routing Queries
Best-path routing, distance vector, dynamic source routing, policy decisions, QoS-based routing, link state, multicast overlays (single-source and CBT)
Takeaways:
- Compact, natural representation
- Customization: easy to modify to obtain new protocols
- Connection between query optimization and protocols
All-Pairs All-Paths
R1: path(@S,D,P,C) :- link(@S,D,C), P=(S,D).
R2: path(@S,D,P,C) :- link(@S,Z,C1), path(@Z,D,P2,C2), C=C1+C2, P=S•P2.
Query: path(@S,D,P,C)
All-Pairs Best-Path
R1: path(@S,D,P,C) :- link(@S,D,C), P=(S,D).
R2: path(@S,D,P,C) :- link(@S,Z,C1), path(@Z,D,P2,C2), C=C1+C2, P=S•P2.
R3: bestPathCost(@S,D,min<C>) :- path(@S,D,P,C).
R4: bestPath(@S,D,P,C) :- bestPathCost(@S,D,C), path(@S,D,P,C).
Query: bestPath(@S,D,P,C)
Customizable Best-Paths
R1: path(@S,D,P,C) :- link(@S,D,C), P=(S,D).
R2: path(@S,D,P,C) :- link(@S,Z,C1), path(@Z,D,P2,C2), C=FN(C1,C2), P=S•P2.
R3: bestPathCost(@S,D,AGG<C>) :- path(@S,D,P,C).
R4: bestPath(@S,D,P,C) :- bestPathCost(@S,D,C), path(@S,D,P,C).
Query: bestPath(@S,D,P,C)
Customizing C, AGG, and FN: lowest RTT, lowest loss rate, highest capacity, best-k.
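One way to read the customizable rules: FN combines link costs along a path, and AGG keeps the best cost per (S, D) pair. A hedged Python sketch with a pluggable `fn` and a comparison predicate `better` standing in for AGG (names illustrative, not the NDlog operators):

```python
# Sketch of the customizable best-path rules (R1-R4), centralized:
# fn(c1, c2) combines a link cost with a path cost; better(c1, c2)
# says whether cost c1 beats c2 under the chosen aggregate.
def best_paths(links, fn, better):
    """links: {(src, dst): cost}.  Returns {(S, D): (path, cost)}."""
    best = {}
    def consider(s, d, p, c):
        cur = best.get((s, d))
        if cur is not None and not better(c, cur[1]):
            return                            # cannot improve: prune (R3/R4)
        best[(s, d)] = (p, c)
        for (z, s2), c1 in links.items():     # R2: extend backward over links
            if s2 == s and z not in p:        # simple paths only
                consider(z, d, (z,) + p, fn(c1, c))
    for (s, d), c in links.items():           # R1
        consider(s, d, (s, d), c)
    return best

links = {("a","b"): 1, ("b","c"): 1, ("a","c"): 5, ("c","d"): 1}
shortest = best_paths(links, fn=lambda a, b: a + b, better=lambda a, b: a < b)
print(shortest[("a", "d")])   # (('a', 'b', 'c', 'd'), 3)
```

Swapping in `fn=min, better=lambda a, b: a > b` yields widest (highest-capacity) paths instead of shortest ones, mirroring the AGG/FN customization on the slide.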
All-Pairs All-Paths
R1: path(@S,D,P,C) :- link(@S,D,C), P=(S,D).
R2: path(@S,D,P,C) :- link(@S,Z,C1), path(@Z,D,P2,C2), C=C1+C2, P=S•P2.
Deleting the path attribute P yields:
R1: path(@S,D,C) :- link(@S,D,C).
R2: path(@S,D,C) :- link(@S,Z,C1), path(@Z,D,C2), C=C1+C2.
Query: path(@S,D,C)
Distance Vector
DZ W
R3: shortestLength(@S,D,min<C>) path(@S,D,Z,C).R4: nextHop(@S,D,Z,C) nextHop(@S,D,Z,C), shortestLength(@S,D,C).ZnextHop
Count to Infinity problem?
R1: path(@S,D,D,C) link(@S,D,C) R2: path(@S,D,Z,C) link(@S,Z,C1), path(@Z,D,W,C2), C=C1+C2
R3: shortestLength(@S,D,min<C>) path(@S,D,Z,C). R4: nextHop(@S,D,Z,C) nextHop(@S,D,Z,C), shortestLength(@S,D,C). Query: nextHop(@S,D,Z,C)
Distance Vector with Split Horizon
R1: path(@S,D,D,C) :- link(@S,D,C).
R2: path(@S,D,Z,C) :- link(@S,Z,C1), path(@Z,D,W,C2), C=C1+C2, W != S.
R4: shortestLength(@S,D,min<C>) :- path(@S,D,Z,C).
R5: nextHop(@S,D,Z,C) :- path(@S,D,Z,C), shortestLength(@S,D,C).
Query: nextHop(@S,D,Z,C)
Distance Vector with Poisoned Reverse adds a rule that advertises the route back to its next hop with infinite cost:
R3: path(@S,D,Z,C) :- link(@S,Z,C1), path(@Z,D,W,C2), C=∞, W == S.
All-Pairs All-Paths
R1: path(@S,D,P,C) :- link(@S,D,C), P=(S,D).
R2: path(@S,D,P,C) :- link(@S,Z,C1), path(@Z,D,P2,C2), C=C1+C2, P=S•P2.
Query: path(@S,D,P,C)
Dynamic Source Routing
Predicate reordering turns the path-vector protocol into dynamic source routing:
R2: path(@S,D,P,C) :- path(@S,Z,P1,C1), link(@Z,D,C2), C=C1+C2, P=P1•D.
Query: path(@S,D,P,C)
Other Routing Examples
Best-path routing, distance vector, dynamic source routing, policy decisions, QoS-based routing, link state, multicast overlays (single-source and CBT)
Outline
Background
The Connection: Routing as a Query
Realizing the Connection
- Dataflow generation and execution
- Recursive query processing
- Optimizations
- Semantics in a dynamic network
Beyond routing: Declarative Overlays
Conclusion
[Figure: a single P2 node's dataflow: local tables (lookup, link, path, …), a demux into rule strands, round-robin scheduling, queues, congestion control (CC Tx/Rx), and UDP Tx/Rx elements]
Dataflow Graph
Nodes in the dataflow graph (“elements”):
- Network elements (send/receive, congestion control, retry, rate limitation)
- Flow elements (mux, demux, queues)
- Relational operators (selections, projections, joins, aggregates)
[Figure: within a single P2 node, messages flow from Network In through the rule strands to Network Out]
Dataflow Strand
[Figure: a strand is a chain of elements, Element1 → Element2 → … → Elementn, carrying input tuples to output tuples]
Input: incoming network messages, local table changes, local timer events
Condition: process each input tuple using the strand's elements
Output: outgoing network messages, local table updates
Rule Dataflow “Strands”
[Figure: the same single-node dataflow, with each rule compiled into its own strand between the demux over local tables (lookup, link, path, …) and the network-out elements]
R2: path(@S,D,P) :- link(@S,Z), path(@Z,D,P2), P=S•P2.
[Figure: the strand generated for rule R2, highlighted within the dataflow]
Localization Rewrite
Rules may have body predicates at different locations:
R2: path(@S,D,P) :- link(@S,Z), path(@Z,D,P2), P=S•P2.
(The matching variable Z is a join, but link is stored at S while path is stored at Z.)
Rewritten rules, each with a single-site body:
R2a: linkD(S,@D) :- link(@S,D)
R2b: path(@S,D,P) :- linkD(S,@Z), path(@Z,D,P2), P=S•P2.
Dataflow Strand Generation
R2b: path(@S,D,P) :- linkD(S,@Z), path(@Z,D,P2), P=S•P2.
[Figure: two strands are generated for R2b. One takes path tuples from Network In, joins them with the stored linkD table on linkD.Z = path.Z, projects the head path(S,D,P), and sends the result to path.S; the symmetric strand takes arriving linkD tuples and joins them against the stored path table.]
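A sketch of strand execution for R2b, with the elements written as plain callables (illustrative only, not the P2 element API):

```python
# A dataflow "strand" as a chain of callables, roughly mirroring the
# elements above: join an incoming path tuple with the local linkD table,
# project the rule head, and hand the result to a send function.
def make_strand(linkD_table, send):
    def on_path(path_tuple):
        z, d, p2 = path_tuple                    # path(@Z, D, P2) arrives
        for (s, z2) in linkD_table:              # Join: linkD.Z == path.Z
            if z2 == z:
                new_path = (s, d, (s,) + p2)     # Project: path(S, D, S.P2)
                send(s, new_path)                # ship to node path.S
    return on_path

sent = []
strand = make_strand(linkD_table={("b", "c")},
                     send=lambda dst, t: sent.append((dst, t)))
strand(("c", "d", ("c", "d")))
print(sent)   # [('b', ('b', 'd', ('b', 'c', 'd')))]
```

Feeding the strand path(@c,d,[c,d]) with linkD(b,@c) stored locally reproduces the join-project-send step of the figure: path(@b,d,[b,c,d]) is shipped to b.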
Recursive Query Evaluation
Semi-naïve evaluation: iterations (rounds) of synchronous computation; results from iteration i are used in iteration i+1.
[Figure: on a ten-node example network, 1-hop paths are derived in round 1, 2-hop paths in round 2, and 3-hop paths in round 3]
Problem: unpredictable delays and failures in the network break the synchronous round model.
Pipelined Semi-naïve (PSN)
Fully asynchronous evaluation:
- Computed tuples in any iteration are pipelined to the next iteration
- Natural for distributed dataflows
- A relaxation of semi-naïve evaluation
[Figure: on the same example network, tuples propagate as they arrive rather than in synchronized rounds]
Pipelined Evaluation
Challenges:
- Does PSN produce the correct answer?
- Is PSN bandwidth-efficient, i.e., does it make the minimum number of inferences?
Duplicate avoidance via local timestamps.
Theorems, for rules recursive with respect to p:
- RS_SN(p) = RS_PSN(p), where RS is the result set
- No repeated inferences are made in computing RS_PSN(p)
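A hedged sketch of semi-naïve evaluation for the reachability rules: each round joins only the delta (tuples new in the previous round) against the link table, so no derivation is recomputed. PSN replaces the round barrier with a per-tuple work queue, the natural mode for a distributed dataflow.

```python
# Semi-naive evaluation of reachable over a link table (illustrative):
# only tuples derived in round i participate in the join of round i+1.
def seminaive_reachable(links):
    reach = set(links)            # round 0: R1
    delta = set(links)
    rounds = 0
    while delta:
        rounds += 1
        new = {(s, d) for (s, z) in links
                      for (z2, d) in delta if z == z2} - reach
        reach |= new
        delta = new               # only round-i results feed round i+1
    return reach, rounds

links = {("a", "b"), ("b", "c"), ("c", "d")}
reach, rounds = seminaive_reachable(links)
print(rounds)   # bounded by the network diameter (3 for this chain)
```

The round count matches the convergence claim earlier in the deck: it is bounded by the diameter of the network.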
Outline
Background
The Connection: Routing as a Query
P2 Declarative Networking System
- Dataflow generation and execution
- Recursive query processing
- Optimizations
Beyond routing: Declarative Overlays
Conclusion
Overview of Optimizations
Traditional optimizations, evaluated in the network context:
- Aggregate selections
- Magic-sets rewrite
- Predicate reordering
New optimizations, motivated by the network context:
- Multi-query optimizations: query-result caching, opportunistic message sharing
- Cost-based optimizations (work in progress): neighborhood density function, hybrid rewrites between PV/DV and DSR (as in the Zone Routing Protocol)
Aggregate Selections
Prune communication using the running state of a monotonic aggregate: avoid sending tuples that cannot affect the aggregate's value (e.g., in a shortest-paths query).
Challenge in a distributed setting: tuples arrive out of order with respect to the monotonic aggregate.
Solution: periodic aggregate selections. Buffer tuples and periodically send only the best-aggregate tuples.
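A hedged sketch of periodic aggregate selections for a min-cost query (illustrative class, not the P2 implementation):

```python
# Periodic aggregate selection for a monotone min aggregate: incoming
# path tuples are buffered rather than forwarded immediately; a timer
# flushes only the best pending tuple per (S, D) group, and tuples that
# cannot improve on a cost already sent are pruned outright.
class PeriodicAggSelection:
    def __init__(self, send):
        self.best_sent = {}       # (S, D) -> best cost already sent
        self.buffer = {}          # (S, D) -> best pending (path, cost)
        self.send = send

    def on_tuple(self, s, d, path, cost):
        sent = self.best_sent.get((s, d))
        if sent is not None and cost >= sent:
            return                # cannot improve the min: prune
        pending = self.buffer.get((s, d))
        if pending is None or cost < pending[1]:
            self.buffer[(s, d)] = (path, cost)

    def flush(self):              # invoked periodically by a timer
        for (s, d), (path, cost) in self.buffer.items():
            self.best_sent[(s, d)] = cost
            self.send(s, d, path, cost)
        self.buffer.clear()

out = []
ags = PeriodicAggSelection(send=lambda *t: out.append(t))
ags.on_tuple("a", "d", ("a", "c", "d"), 6)
ags.on_tuple("a", "d", ("a", "b", "c", "d"), 3)   # arrives before the flush
ags.flush()
print(out)   # only the cost-3 tuple was sent
```

Because both tuples arrived within one period, only the better one is transmitted, which is exactly the bandwidth saving the evaluation slide reports.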
Aggregate Selections Evaluation
P2 implementation of routing protocols on Emulab (100 nodes); all-pairs best-path queries with aggregate selections.
- Aggregate selections reduce communication overhead.
- They are more effective when the link metric is correlated with network delay.
- Periodic aggregate selections reduce communication overhead further.
Outline
Background
The Connection: Routing as a Query
Realizing the Connection: P2 declarative routing engine
Beyond routing: Declarative Overlays
Conclusion
Recall: Declarative Routing
[Figure: the P2 engine in the control plane runs declarative queries over input tables (the neighbor table) and writes output tables that update the forwarding table; the forwarding plane in the routing infrastructure moves packets as before]
Declarative Overlays
[Figure: a declarative overlay node runs the P2 engine at the application level, above default Internet routing; declarative queries maintain the overlay topology tables, and the engine implements both the overlay's control and forwarding planes]
Declarative Overlays
More challenging to specify than routing:
- Not just querying for routes over given input links: rules also generate the overlay topology itself
- Message delivery, acknowledgements, failure detection, timeouts, periodic probes, etc.
Extensive use of timer-based event predicates:
ping(@D,S) :- periodic(@S,10), link(@S,D)
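The ping rule above can be sketched as an event handler driven by a timer; here the timer is simulated by calling the handler directly, and all names are illustrative:

```python
# Timer-driven rule sketch: every time periodic(@S,10) fires at S,
# emit a ping(@D,S) tuple toward each neighbor D in the link table.
def make_ping_strand(links, send):
    def on_periodic(s):
        for (s2, d) in links:
            if s2 == s:
                send(d, ("ping", d, s))   # ping(@D, S): delivered at D
    return on_periodic

pings = []
on_periodic = make_ping_strand({("a", "b"), ("a", "c")},
                               send=lambda dst, t: pings.append((dst, t)))
on_periodic("a")                          # the "periodic(@a,10)" event fires
print(sorted(pings))
```

In P2 the periodic predicate would be raised by the runtime's timer every 10 seconds; the handler body is the same join against the link table.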
P2-Chord
Full Chord routing, including: multiple successors, stabilization, optimized finger maintenance, failure detection.
47 rules and 13 table definitions, fitting on one slide in 10 pt font. MIT Chord: roughly 100× more code.
Another example: the Narada mesh in 16 rules.
Actual Chord Lookup Dataflow
[Figure: the compiled dataflow for Chord lookups. Strand L1 joins incoming lookup tuples with the local node and bestSucc tables, selects K in (N, S], and projects lookupRes; strands L2 and L3 compute the best finger to forward to, via min aggregation over the finger table with D = K−B−1 for B in (N, K); mux/demux, round-robin scheduling, queues, and the network stack tie the strands together, with materialized inserts into the local tables]
P2-Chord Evaluation
P2 nodes running Chord on 100 Emulab nodes:
- Logarithmic lookup hop count and state (“correct”)
- Median lookup latency: 1–1.5 s
- Bandwidth-efficient: 300 bytes/s per node
Moving up the stack
Querying the overlay: routing tables are “views” to be queried, e.g., queries on route resilience, network diameter, path length.
Recursive queries for network discovery:
- Distributed Gnutella crawler on PlanetLab [IPTPS ’03]
- Distributed web crawler over DHTs on PlanetLab
- Oct ’03 distributed crawl: 100,000 nodes, 20 million files
Outline
Background
The Connection: Routing as a Query
Realizing the Connection
Beyond routing: Declarative Overlays
Conclusion
A Sampling of Related Work
Databases: recursive queries for software analysis, trust management, and distributed systems diagnosis. Opportunities: computational biology, data integration, sensor networks.
Networking: XORP (extensible routers); high-level routing specifications (Metarouting, routing logic).
Future Directions
Declarative networking:
- Static checks on desirable network properties
- Automatic cost-based optimizations
- Component-based network abstractions
Core Internet infrastructure:
- Declarative specifications of ISP configurations
- P2 deployment in routers
Distributed data management on declarative networks:
- Run-time cross-layer optimizations: reoptimize data placement and queries; reconfigure networks based on data and query workloads
[Figure: the stack above P2 declarative networks: customized routes, DHTs, flooding, gossip, multicast meshes; distributed algorithms such as consensus (Harvard), 2PC, Byzantine fault tolerance, snapshots (Rice/Intel), and replication; data management applications including P2P search, network monitoring, P2P data integration, collaborative filtering, and content distribution networks; all driven by distributed queries in SQL, XML, or Datalog]
Other Work
Internet-scale query processing: PIER, a distributed query processor on DHTs (http://pier.cs.berkeley.edu) [VLDB 2003, CIDR 2005]
P2P search infrastructures:
- P2P web search and indexing [IPTPS 2003]
- Gnutella measurements on PlanetLab [IPTPS 2004]
- Distributed Gnutella crawler and monitoring
- Hybrid P2P search [VLDB 2004]
Contributions and Summary
P2 Declarative Networking System
- Declarative routing engine: extensible routing infrastructure
- Declarative overlays: rapid prototyping of overlay networks
- Database fundamentals: query language, new distributed query execution strategies and optimizations, semantics in dynamic networks
This is a period of flux in Internet research; declarative networks can play an important role.
Thank You