Post on 28-Dec-2015
Intel Research
(Phi)
Timothy Roscoe, Joseph M. Hellerstein, Brent Chun, Nina Taft, Petros Maniatis, Ryan Huebsch, Tyson Condie, Boon Thau Loo
Intel Research at Berkeley and U.C. Berkeley
(much input from Tom Anderson, Vern Paxson, Larry Peterson, Scott Shenker, Ion Stoica, and David Wetherall)
Lessons from PlanetLab…
• What have we learned from PlanetLab?
  – We understand how to build robust large-scale systems
  – E.g. structured overlay networks that scale to the planet
• What have we learned about the state of the Internet?
  – It’s brittle.
  – It’s unpredictable.
  – It doesn’t know what’s happening.
  – It’s afraid.
The brittleness of the Internet
• Security systems are afraid of the unknown.
  – Everything new is unknown.
  – Everything new is a threat.
  – Better to shut it down now.
• Users really don’t comprehend the problems (and why should they?)
  – Not exactly easy to understand
  – Very little information available
The brittleness of the Internet
• Performance is unpredictable
  – Failures, bottlenecks, congestion, misconfigurations, etc.
• It appears overlays can do better
  – Provided they can measure the network.
• This shouldn’t work, but it does.
  – IP protocols assume no information is available about current network state.
Research trends and directions
• Extensive network measurement / modelling
• Distributed security solutions
• Distributed performance diagnosis
• Machine learning across networks
• Measurement-based overlays
• Better Internet protocols
• Network visualizations
What’s missing?
• Measurement, monitoring, logging, etc. of the real Internet
• Applications and services
• User awareness
• ?
An “Information Plane” for the Internet
• Continuous queries over distributed network state, available to all end systems
• Integrate data from:
  – Backbone monitoring
  – Router configuration
  – Network state databases (e.g. RouteViews)
  – Security systems (Firewalls, DShield, Autograph, etc.)
  – End-system monitoring
The big picture
[Figure: sensors attached to the Internet]
• Types of sensors: end-systems, backbone monitors, routers, network databases, firewall logs
• Types of clients: end users, end applications, overlays
The big picture
[Figure: sensors; labels: query, query plan, disseminate]
The big picture
[Figure: sensors; labels: query execution, query results, answer]
Implications of success
• Short-term: Enable & Connect measurement & security researchers
  – E.g. “Live DShield”, top 10 IP address result from Barford et al.
  – Promote user-awareness through downloadable tools
• Medium-term: Provide global network knowledge for planetary-scale applications & overlays
  – E.g. Resource discovery on PlanetLab, OpenDHT could exploit network link information
• Long-term: Kick off a new generation of Network-Aware Internet Protocols
  – E.g. Host-based source routing solutions
Phi goals
• Create the missing piece of the information plane by building a scalable, distributed dataflow engine for processing continuous queries in-network
• Data tuples are
  – Routed between nodes along a dataflow graph
  – Processed at nodes (filtering, aggregation, data reduction, correlation, result dissemination)
[Figure: layers labelled Physical Network; Physical Dataflow in Overlay Network; Abstract Dataflow (Query Plan); Declarative Queries]
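As an illustration of what a continuous query over data tuples might look like, here is a minimal single-process sketch. It chains a filtering operator into a windowed aggregation operator, the two kinds of per-node processing named above. The `Event` tuple layout, the operator names, and the tumbling-window semantics are assumptions made for this example, not Phi's actual interfaces.

```python
from collections import Counter
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    """Hypothetical sensor tuple: timestamp (seconds), source IP, event kind."""
    ts: int
    src: str
    kind: str


def select(events: Iterable[Event], kind: str) -> Iterator[Event]:
    """Filtering operator: pass through only events of the given kind."""
    return (e for e in events if e.kind == kind)


def windowed_top_k(events: Iterable[Event], width: int, k: int) -> Iterator[tuple]:
    """Aggregation operator: for each tumbling window of `width` seconds,
    emit (window_start, top-k sources by event count).
    Assumes events arrive in timestamp order."""
    window_start = None
    counts: Counter = Counter()
    for e in events:
        start = e.ts - e.ts % width
        if window_start is not None and start != window_start:
            # The window closed: emit its result and start a fresh one.
            yield window_start, counts.most_common(k)
            counts = Counter()
        window_start = start
        counts[e.src] += 1
    if window_start is not None:
        yield window_start, counts.most_common(k)


stream = [
    Event(0, "10.0.0.1", "scan"), Event(1, "10.0.0.2", "scan"),
    Event(2, "10.0.0.1", "scan"), Event(3, "10.0.0.3", "login"),
    Event(11, "10.0.0.2", "scan"),
]
results = list(windowed_top_k(select(stream, "scan"), width=10, k=1))
```

In a distributed setting each operator would run on a different node and the generator hand-off would become tuple transfer over the overlay; the sketch only shows the dataflow shape.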
The hard problems
• Scale: Millions of sources, sinks, queries
  – Linear scaling on an n^3 problem: need to factor out n^2 redundant communication & computation
• Fidelity & Security
  – Bad inputs: data poisoning, perturbed computations
  – Bad outputs: launchpads, vulnerability detection
• Efficiently embedding analysis algorithms in network topologies
  – Data must be combined (hence moved around the network) according to the distributed analysis algorithm
The rest of the talk
• Where we are today
  – PIER: distributed relational query processor
  – Single query, many sources, many sinks
  – Deployed on PlanetLab for the last 12 months
• Where we intend to go
  – P2: full dataflow engine with multiquery scaling
  – Topological Fault Tolerance
  – Develop embeddings of distributed analysis algorithms
Key technology: Structured overlay networks (DHTs)
• E.g. Chord, Pastry, Tapestry, CAN, Kademlia...
• Flat, sparse ID space (e.g. 160-bit identifiers)
• Routing to the owner of any key in O(log n) hops
• Based on “interesting” routing graphs
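A minimal sketch of how keys and nodes can share one flat 160-bit identifier space. It assumes SHA-1 as the hash function and a Chord-style successor rule for ownership; other DHTs in the list (e.g. Pastry) assign a key to the numerically closest node instead. The node names and the `ident` and `owner` helpers are hypothetical.

```python
import bisect
import hashlib

ID_BITS = 160  # SHA-1 yields 160-bit identifiers


def ident(name: str) -> int:
    """Map a node name or a key to a point in the flat 160-bit ID space."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


def owner(key: str, node_ids: list) -> int:
    """Return the ID of the node that owns `key`: the first node ID at or
    after the key's ID, wrapping around the ring (successor rule, assumed)."""
    ring = sorted(node_ids)
    i = bisect.bisect_left(ring, ident(key))
    return ring[i % len(ring)]


nodes = [ident(f"node-{i}") for i in range(8)]  # hypothetical node names
print(hex(owner("snort-events", nodes)))
```

The space is sparse because only a handful of the 2^160 possible identifiers are occupied by nodes; every key still has exactly one owner.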
What can DHTs do?
• Content-Based Routing
  – i.e. send a message to a key
  – Equivalent to hashing a key to a node
• Storage
  – Storing values in the network under a key
• Tree construction
  – Formed by routing to a key from all nodes
[Figure: identifier ring with nodes 0 to 15]
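The tree-construction idea can be sketched on a 16-node ring like the one in the figure. The sketch assumes every identifier 0 to 15 is occupied and that each node has Chord-style fingers at offsets 1, 2, 4 and 8; the slides do not specify the routing geometry, so these are illustrative choices. Each node's next hop toward the key becomes its parent, and the union of all routes forms a tree rooted at the key's owner.

```python
N = 16                   # ring size, matching the 0..15 ring in the figure
FINGERS = [1, 2, 4, 8]   # assumed Chord-style fingers at power-of-two offsets


def next_hop(node: int, key: int) -> int:
    """Greedy routing: take the largest finger that does not overshoot the key."""
    dist = (key - node) % N
    step = max(f for f in FINGERS if f <= dist)
    return (node + step) % N


def route(node: int, key: int) -> list:
    """Return the nodes visited when routing a message from `node` to `key`."""
    path = [node]
    while path[-1] != key:
        path.append(next_hop(path[-1], key))
    return path


def tree_to(key: int) -> dict:
    """Tree construction: each non-root node's parent is its next hop toward `key`."""
    return {n: next_hop(n, key) for n in range(N) if n != key}


print(route(0, 15))      # [0, 8, 12, 14, 15]
parents = tree_to(15)
```

With these assumptions no route exceeds log2(16) = 4 hops, which is the logarithmic routing bound in miniature.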
Using DHTs in Phi
• Query Dissemination (trees)
• Hierarchical Aggregation (trees and storage)
• Indexing (routing and storage)
• Range Indexing Substrate (routing and storage)
• Hash-partitioned parallelism (routing)
• Hash tables for group-by, join (storage)
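As one example of these uses, hash-partitioned parallelism and hash tables for joins can be combined into a rehash join: each tuple is routed to the node that owns its join-key value, and every node then joins only its own partition. The sketch below simulates this in one process. `NUM_NODES`, `home_node` (a CRC32 stand-in for DHT routing), and the sample tables are assumptions for illustration, not PIER's implementation.

```python
import zlib
from collections import defaultdict

NUM_NODES = 4  # hypothetical number of DHT nodes


def home_node(join_key) -> int:
    """Stand-in for DHT routing: hash the join key to the node that owns it."""
    return zlib.crc32(str(join_key).encode()) % NUM_NODES


def rehash(table, key_index):
    """Routing step: ship each tuple to the node owning its join-key value."""
    partitions = defaultdict(list)
    for row in table:
        partitions[home_node(row[key_index])].append(row)
    return partitions


def local_hash_join(left, right, left_key, right_key):
    """Storage step: build a hash table on one input, probe it with the other."""
    built = defaultdict(list)
    for row in left:
        built[row[left_key]].append(row)
    return [l + r for r in right for l in built.get(r[right_key], [])]


def distributed_join(left, right, left_key, right_key):
    """Join two tables by rehashing both, then joining partition by partition."""
    lparts, rparts = rehash(left, left_key), rehash(right, right_key)
    out = []
    for node in range(NUM_NODES):  # each node works only on its own partition
        out.extend(local_hash_join(lparts.get(node, []), rparts.get(node, []),
                                   left_key, right_key))
    return out


alerts = [("10.0.0.1", "scan"), ("10.0.0.2", "worm")]
hosts = [("10.0.0.1", "site-a"), ("10.0.0.3", "site-b")]
joined = distributed_join(alerts, hosts, 0, 0)
```

Matching tuples always meet because equal key values hash to the same node, so no node needs to see the whole of either table.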
Bamboo: our DHT (Sean Rhea)
• Pastry-style routing
• Epidemic propagation of leaf sets, routing tables
• Recursive routing
• Adaptive timeouts based on continuous measurement
• Highly robust under churn
• Tested to ~1000 nodes
  – PlanetLab, ModelNet
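The slide does not say how Bamboo derives its timeouts. One common way to turn continuous round-trip measurements into an adaptive timeout is the TCP-style estimator (smoothed RTT plus four times the smoothed deviation), sketched here purely as an assumed illustration. The class name and the default timeout are hypothetical.

```python
class TimeoutEstimator:
    """TCP-style adaptive timeout (assumed for illustration, not Bamboo's code)."""

    ALPHA, BETA, K = 1 / 8, 1 / 4, 4  # standard TCP smoothing constants

    def __init__(self, default: float = 1.0):
        self.default = default   # timeout used before any measurement exists
        self.srtt = None         # smoothed round-trip time
        self.rttvar = None       # smoothed round-trip time deviation

    def observe(self, rtt: float) -> None:
        """Fold one measured round-trip time (seconds) into the estimate."""
        if self.srtt is None:
            self.srtt, self.rttvar = rtt, rtt / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt

    def timeout(self) -> float:
        """Current timeout: smoothed RTT plus K times the deviation."""
        if self.srtt is None:
            return self.default
        return self.srtt + self.K * self.rttvar


est = TimeoutEstimator()
for sample in [0.10, 0.12, 0.09, 0.30]:  # hypothetical RTT samples in seconds
    est.observe(sample)
print(round(est.timeout(), 3))
```

A per-neighbour estimator of this kind lets a node wait longer for slow but live peers while still detecting failed ones quickly.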
PIER: a relational query engine
• Data is tuples in named tables
  – Tables exist on nodes
• Relational operators:
  – Selection
  – Projection
  – Join (correlate, intersect, match)
  – Aggregation (summarize, compress, group by)
• Also has recursive queries
  – Can query topological structures
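For readers less familiar with the relational operators listed above, here is a minimal in-memory sketch of each one. Tuples are modelled as dicts and tables as lists; the `snort` and `sites` tables and all column names are hypothetical, and nothing here reflects PIER's actual operator interfaces.

```python
from collections import defaultdict

# Hypothetical tables: a tuple is a dict, a table is a list of tuples.
snort = [
    {"src": "10.0.0.1", "port": 22, "node": "n1"},
    {"src": "10.0.0.2", "port": 80, "node": "n1"},
    {"src": "10.0.0.1", "port": 22, "node": "n2"},
]
sites = [{"node": "n1", "site": "berkeley"}, {"node": "n2", "site": "seattle"}]


def selection(table, pred):
    """Keep only the tuples that satisfy the predicate."""
    return [t for t in table if pred(t)]


def projection(table, cols):
    """Keep only the named columns of each tuple."""
    return [{c: t[c] for c in cols} for t in table]


def join(left, right, col):
    """Equi-join on a shared column, using a hash index on the right input."""
    index = defaultdict(list)
    for r in right:
        index[r[col]].append(r)
    return [{**l, **r} for l in left for r in index.get(l[col], [])]


def aggregate(table, group_col):
    """Group by one column and count the tuples in each group."""
    counts = defaultdict(int)
    for t in table:
        counts[t[group_col]] += 1
    return dict(counts)


ssh = selection(snort, lambda t: t["port"] == 22)
by_site = aggregate(join(ssh, sites, "node"), "site")
sources = projection(ssh, ["src"])
```

The composed expression corresponds roughly to "count SSH events per site", the kind of query a network monitoring client might pose.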
PIER architecture
[Figure: PIER architecture, in layers]
• Applications: Network Monitoring, Other User Apps
• PIER: Core Relational Execution Engine, Catalog Manager, Query Optimizer
• DHT: DHT Wrapper, Storage Manager, Overlay Routing
• Network: IP Network
• Side labels: Physical Network, Overlay Network, Query Plan, Declarative Queries
Experience so far
• PIER has run on PlanetLab for about a year
• Querying PlanetLab sensors, in particular Snort events
Experience so far
• Use of DHT for query processing by-and-large works
  – Need story for NATs, non-transitive connectivity
  – Node heterogeneity
• Multiresolution emulation is essential
  – Simulation, emulation (ModelNet), deployment (PlanetLab)
• Simple results are quite compelling
  – E.g. top 10 attackers demo for IDF
P2
• Build on PIER techniques
  – Reimplementation in C++
• Extend beyond relational operators
  – Synopses/sketches, junction trees, Bayes nets, PCA, ...
• Address multiquery optimization (2 factors of n)
• Investigate using the overlay for data fidelity
• Codify communication and computation for a variety of algorithms
Multiquery optimization
• Granular lineage for data inputs and intermediate data products
  – Telegraph: Tuple lineage bitmaps (operators & queries)
  – Scaling via cluster analysis: bits name sets of queries/operators
• Embedded in the network
  – Multi-operator query mesh of multiple trees is formed
  – Optimizations in routing & replicating intermediate results
• Scaling result dissemination
  – Multicast from within the MQO mesh
  – A many-source/many-sink multicast problem
  – Tie-ins with MQO: Can choose the multicast source points as part of query optimization
[Figure: identifier ring with nodes 0 to 15]
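A rough sketch of the lineage-bitmap idea: each query is assigned one bit, every tuple is annotated once with the set of queries it satisfies, and later stages consult the bitmap instead of re-evaluating predicates. This is loosely modelled on the description above and is not Telegraph's actual data structure; the queries, the `lineage` and `deliver` names, and the tuple layout are hypothetical.

```python
QUERIES = [  # hypothetical continuous queries, one bit each
    ("q0: ssh traffic", lambda t: t["port"] == 22),
    ("q1: web traffic", lambda t: t["port"] == 80),
    ("q2: low ports", lambda t: t["port"] < 1024),
]


def lineage(tup) -> int:
    """Return a bitmap with bit i set if the tuple satisfies query i."""
    bits = 0
    for i, (_, pred) in enumerate(QUERIES):
        if pred(tup):
            bits |= 1 << i
    return bits


def deliver(stream):
    """Annotate each tuple once, then fan it out to the queries whose bit is set."""
    results = {i: [] for i in range(len(QUERIES))}
    for tup in stream:
        bits = lineage(tup)  # computed once, shared by all queries
        if bits == 0:
            continue         # no query wants this tuple: drop it early
        for i in results:
            if bits & (1 << i):
                results[i].append(tup)
    return results


stream = [{"port": 22}, {"port": 80}, {"port": 8080}]
out = deliver(stream)
```

With millions of queries a bit per query would not scale, which is where the slide's suggestion of letting bits name sets of queries or operators comes in.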
Ephemeral overlay topologies
• DHTs emulate Interconnect Networks
  – These have deep algebraic structure
  – Based on group-theoretic graph constructs
  – Rich families of such graphs with different properties
• We can exploit the structure (i.e. constraints) of the overlay
  – To embed complex computations with efficient communication
  – To reason about the “influence” of malicious nodes in the network
• We could choose ephemeral topologies to suit specific analysis algorithms
Topological Fault Tolerance
• Fidelity and Security
• Diversifying Influence
  – Redundant computation (à la process pairs) applied in an adversarial environment
  – Structured overlay topologies admit analysis of spheres of influence
• Two Dimensions to Diversity
  – In Space: Multiple trees, different roots
  – In Time: Reassign node IDs to change spheres of influence
[Figure labels: influence: 8 nodes; influence: 1 node]
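To illustrate how a structured topology admits analysis of spheres of influence, the sketch below counts how many nodes route through a given node in an aggregation tree, and shows that choosing a different root changes that count. It assumes a fully populated 16-node ring with Chord-style power-of-two fingers; the slides do not state the geometry, so this is one possible setting rather than the authors' analysis.

```python
N = 16                   # assumed ring size
FINGERS = [1, 2, 4, 8]   # assumed power-of-two finger offsets


def parent(node: int, root: int) -> int:
    """Next hop from `node` toward `root` under greedy finger routing."""
    dist = (root - node) % N
    return (node + max(f for f in FINGERS if f <= dist)) % N


def influence(node: int, root: int) -> int:
    """Number of nodes (including `node`) whose path to `root` passes through `node`."""
    if node == root:
        return N
    count = 0
    for start in range(N):
        hop = start
        while hop != root and hop != node:
            hop = parent(hop, root)
        if hop == node:
            count += 1
    return count


print(influence(14, root=15))  # 8: node 14 sits just below the root
print(influence(14, root=6))   # 1: the same node is a leaf in this tree
```

A malicious node 14 could perturb half of the inputs in the first tree but only its own reading in the second, which is the intuition behind running redundant computations over trees with different roots.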
Design Patterns for Network-Embedded Data Analysis
• Taxonomize and abstract communication patterns for in-network analyses
• We already understand how some of these map to DHTs
  – Up-tree aggregation (AVG, SUM, etc.)
  – Up-a-special-tree aggregation (Haar Wavelets)
  – Arbitrary dissemination (e.g. MIN, MAX, Gibbons-style duplicate-insensitive sketching)
  – Lessons from sensornet arena (topology-oblivious)
  – First cut taxonomy of aggregates (TAG)
  – Junction Trees for distributed inference: Up-then-down a special tree
• What all this means
  – Algebraic properties of communication patterns dictate an extensibility API
  – Expose only enough of algorithm logic to achieve optimization, code reuse, resource control
  – Efficient communication patterns can drive research into new analysis techniques
[Figure: binomial tree on an identifier ring with nodes 0 to 15]
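Up-tree aggregation and the idea of an extensibility API can be sketched together. Below, an aggregate is described by three functions in the style of TAG (initialize a partial state, merge two partial states, evaluate the final state), and partial states flow up a tree toward the root. The `Aggregate` type, the 16-node ring, and the finger offsets are assumptions for illustration; the engine only needs to know that `merge` is associative and commutative, not what the aggregate computes.

```python
from typing import Callable, NamedTuple

N = 16                   # assumed ring size, every ID occupied
FINGERS = [1, 2, 4, 8]   # assumed power-of-two finger offsets


class Aggregate(NamedTuple):
    """Hypothetical TAG-style aggregate: init, merge, evaluate."""
    init: Callable
    merge: Callable
    evaluate: Callable


AVG = Aggregate(lambda v: (v, 1),
                lambda a, b: (a[0] + b[0], a[1] + b[1]),
                lambda s: s[0] / s[1])
SUM = Aggregate(lambda v: v, lambda a, b: a + b, lambda s: s)


def parent(node: int, root: int) -> int:
    """Next hop from `node` toward `root` under greedy finger routing."""
    dist = (root - node) % N
    return (node + max(f for f in FINGERS if f <= dist)) % N


def depth(node: int, root: int) -> int:
    """Number of hops from `node` to `root`."""
    hops = 0
    while node != root:
        node, hops = parent(node, root), hops + 1
    return hops


def up_tree(readings: dict, root: int, agg: Aggregate):
    """Merge partial states from the deepest nodes upward, then evaluate at the root.
    Assumes `readings` has one value for every node 0..N-1."""
    state = {n: agg.init(v) for n, v in readings.items()}
    for n in sorted(readings, key=lambda n: depth(n, root), reverse=True):
        if n != root:
            p = parent(n, root)
            state[p] = agg.merge(state[p], state[n])
    return agg.evaluate(state[root])


readings = {n: float(n) for n in range(N)}  # hypothetical per-node readings
print(up_tree(readings, root=15, agg=AVG))  # 7.5
```

Because AVG carries a (sum, count) pair rather than a running average, the result does not depend on the shape of the tree or on which node is the root.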
Collaborations
• Network Measurement Community
  – End-host packet traces (KAIST VMS, NETI@home, DIMES)
  – Firewall log repositories (Snort, Bro, DShield, Domino)
  – Backbone monitors and repositories (CAIDA, CoMo)
  – Network tomography (RouteViews, NetTelescope)
• Distributed Algorithms Community
  – Summarization / data reduction (IRP, Bell Labs)
  – Inference / anomaly detection (CMU, UCB, IR)
  – Signature detection (IRP, EarlyBird)
  – Joins / correlations (UCB, ICSI)
• Value proposition: reusable backplane for
  – Real-time data summarization & transport
  – Data validation (against other sources)
  – Correlation with other data
  – Algorithm design and deployment
The Plan (1)
• Build a distributed peer-to-peer dataflow engine
• Define protocols:
  – Tuple transfer protocol
  – Dataflow signalling protocol
• Instantiate the “right” overlay
• Address multiquery optimization
  – Rich aggregations/summarization and joins/correlations
• Explore topological diversity in time and space
  – Identify efficient, realizable families
  – Establish feasible timescales for topology construction
  – Apply to both topological FT and net-embedded computations
The Plan (2)
• Deploy an initial information plane, starting on PlanetLab and building out
• Multiple classes of data sources:
  – End-system monitoring (e.g. NETI@home)
  – Link monitors (e.g. CoMo)
  – Network Telescopes (dark address space)
  – Databases / archives (e.g. RouteViews)
• Build example applications ourselves
  – Implement example analysis operators: wavelet, PCA, etc.
• Enable others to more easily build applications
  – Client libraries
  – Query handholding