April 20, 2023 Department of Computer Sciences, UT Austin 1
SDIMS: A Scalable Distributed Information Management System
Praveen Yalagandula Mike Dahlin
Laboratory for Advanced Systems Research (LASR)
University of Texas at Austin
Goal

A Distributed Operating System Backbone: information collection and management
Core functionality of large distributed systems: monitor, query, and react to changes in the system
Examples:
 • System administration and management
 • Service placement and location
 • Sensor monitoring and control
 • Distributed Denial-of-Service attack detection
 • File location service
 • Multicast tree construction
 • Naming and request routing
Benefits:
 • Ease development of new services
 • Facilitate deployment
 • Avoid repetition of the same task by different services
 • Optimize system performance
Contributions – SDIMS

Provides an important basic building block: information collection and management
Satisfies key requirements:
 Scalability
  • With both nodes and attributes
  • Leverages Distributed Hash Tables (DHTs)
 Flexibility
  • Enables applications to control the aggregation
  • Provides a flexible API: install, update, and probe
 Autonomy
  • Enables administrators to control the flow of information
  • Builds Autonomous DHTs
 Robustness
  • Handles failures gracefully
  • Performs re-aggregation upon failures
   – Lazy (by default) and on-demand (optional)
Outline

SDIMS: a distributed operating system backbone
Aggregation abstraction
Our approach
 • Design
 • Prototype
 • Simulation and experimental results
Conclusions
Aggregation Abstraction

Attributes: information at machines
Aggregation tree
 • Physical nodes are leaves
 • Each virtual node represents a logical group of nodes
  – Administrative domains, groups within domains, etc.
 • Each virtual node is simulated by one or more machines
Aggregation function f for attribute A
 • Computes the aggregated value Ai for the level-i subtree
  – A0 = locally stored value at the physical node, or NULL
  – Ai = f(Ai-1^0, Ai-1^1, …, Ai-1^(k-1)) for a virtual node with k children
Example: total users logged into the system
 • Attribute: numUsers
 • Aggregation function: summation

[Figure: aggregation tree over four physical nodes with level-0 values A0 = 4, 1, 2, 2; the level-1 virtual nodes compute f(a,b) = 5 and f(c,d) = 4; the root computes A2 = f(f(a,b), f(c,d)) = 9]

Astrolabe [Van Renesse et al., TOCS '03]
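As a minimal sketch (assuming a binary tree and summation aggregation, matching the numUsers example above), the level-by-level aggregation can be written as:

```python
# Minimal sketch of the aggregation abstraction: leaves hold local values,
# and each virtual node applies the aggregation function f to the
# level-(i-1) values of its children. Binary tree and summation are
# assumptions chosen to match the numUsers example.

def aggregate(f, groups):
    """Compute one level of the aggregation tree.
    groups: list of child-value lists, one list per virtual node."""
    return [f(children) for children in groups]

# numUsers example: four physical nodes, f = summation
A0 = [4, 1, 2, 2]                        # locally stored values (leaves)
A1 = aggregate(sum, [A0[0:2], A0[2:4]])  # level-1 virtual nodes
A2 = aggregate(sum, [A1])                # root
```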
Scalability with Nodes and Attributes

To be a basic building block, SDIMS should support:
 A large number of machines
  • Enterprise- and global-scale services
  • Trend: large numbers of small devices
 Applications with a large number of attributes
  • Example: a file location system
   – Each file is an attribute
   – Hence a large number of attributes
Challenge: build aggregation trees in a scalable way
 Build multiple trees
  • A single tree for all attributes → load imbalance
 Ensure a small number of children per node in the tree
  • Reduces maximum node stress
Building Aggregation Trees: Exploit DHTs

Distributed Hash Tables (DHTs)
 • Each node is assigned an ID from a large space (e.g., 160-bit)
 • For each prefix length k, each node has a link to another node
  – Matching on the first k bits
  – Not matching on the (k+1)th bit
 • Each key from the ID space is mapped to the node with the closest ID
 • Routing for a key: bit correction
  – Example: routing from node 0101 for key 1001
  – 0101 → 1110 → 1000 → 1001
 • Many algorithms with different properties
  – Pastry, Tapestry, CAN, Chord, SkipNet, etc.
  – Load balancing, robustness, etc.
A DHT can be viewed as a mesh of trees
 • Routes from all nodes to a particular key form a tree
 • Different trees for different keys
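The bit-correction routing above can be sketched as follows. This is an idealized model, not any particular DHT implementation: it assumes a neighbor matching one more leading bit always exists, and fabricates that neighbor's remaining bits purely for illustration, so the hop sequence differs from the concrete path on the slide.

```python
# Idealized prefix bit-correction routing sketch (assumption: at every hop
# a node exists whose ID matches the key on at least one more leading bit).

def matching_prefix_len(a: str, b: str) -> int:
    """Number of leading bits on which two IDs agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(node: str, key: str) -> str:
    """Correct the first bit where node and key differ; the bits after the
    corrected one are copied from the key only for illustration (a real
    DHT neighbor would have arbitrary trailing bits)."""
    k = matching_prefix_len(node, key)
    return key[:k + 1] + node[k + 1:]

def route(node: str, key: str) -> list:
    """Follow bit correction until the node owning the key is reached.
    Each hop matches at least one more prefix bit, so this terminates."""
    path = [node]
    while path[-1] != key:
        path.append(next_hop(path[-1], key))
    return path
```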
DHT Trees as Aggregation Trees

[Figure: routes from all eight nodes (000, 001, 010, 011, 100, 101, 110, 111) toward key 111 form a tree, with virtual nodes for prefixes 1xx and 11x and root 111]
DHT Trees as Aggregation Trees (continued)

[Figure: over the same eight nodes, routes toward key 111 form one tree (virtual nodes 1xx, 11x, root 111), while routes toward key 000 form a different tree (virtual nodes 0xx, 00x, root 000)]
API – Design Goals

Expose scalable aggregation trees from the DHT
Flexibility: expose several aggregation mechanisms
 Attributes with different read-to-write ratios
  • "CPU load" changes often: a write-dominated attribute
   – Aggregating on every write → too much communication cost
  • "NumCPUs" changes rarely: a read-dominated attribute
   – Aggregating on reads → unnecessary latency
 Spatial and temporal heterogeneity
  • Non-uniform and changing read-to-write rates across the tree
  • Example: a multicast session with changing membership
Support sparse attributes of the same functionality efficiently
 • Examples: file location, multicast, etc.
 • Not all nodes are interested in all attributes
Design of Flexible API

New abstraction: separate attribute type from attribute name
 • Attribute = (attribute type, attribute name)
 • Example: type = "fileLocation", name = "fileFoo"
Install: an aggregation function for a type
 • Amortizes installation cost across attributes of the same type
 • Arguments up and down control aggregation on update
Update: the value of a particular attribute
 • Aggregation performed according to up and down
 • Aggregation along the tree with key = hash(attribute)
Probe: for an aggregated value at some level
 • If required, aggregation is done on demand to produce this result
 • Two modes: one-shot and continuous
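A single-node toy model of the three-call API shape described above. The call names install/update/probe come from the slide, but the class, signatures, and behavior here are illustrative only; a real SDIMS node aggregates along the DHT tree rooted at key = hash(attribute), which this sketch does not model.

```python
# Toy, single-node model of the SDIMS API shape. Names and signatures are
# illustrative, not the prototype's actual interface; only level-0 (local)
# state is modeled.
from typing import Any, Callable

class SdimsSketch:
    def __init__(self):
        self.functions = {}   # attribute type -> (aggregation fn, up, down)
        self.values = {}      # (type, name) -> locally stored value

    def install(self, attr_type: str, f: Callable, up, down):
        """Register an aggregation function once per type; up/down control
        how far aggregates propagate on each update."""
        self.functions[attr_type] = (f, up, down)

    def update(self, attr_type: str, name: str, value: Any):
        """Set the local value; a real node would re-aggregate along the
        tree for hash((attr_type, name)) according to up/down."""
        self.values[(attr_type, name)] = value

    def probe(self, attr_type: str, name: str):
        """One-shot probe; here only the local value is returned."""
        return self.values.get((attr_type, name))

s = SdimsSketch()
s.install("numUsers", sum, up="ALL", down=0)  # an Update-Up-style policy
s.update("numUsers", "global", 4)
```

Note how install is per type while update and probe are per attribute, which is what amortizes installation cost across all attributes of the same type.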
Scalability and Flexibility

Install-time aggregation and propagation strategy
 • Applications can specify up and down
 • Examples:
  – Update-Local: up = 0, down = 0
  – Update-All: up = ALL, down = ALL
  – Update-Up: up = ALL, down = 0
Spatial and temporal heterogeneity
 • Applications can exploit the continuous mode of the probe API
  – To propagate aggregate values only to interested nodes
  – With an expiry time, to propagate for a fixed time
 • Also implies scalability with sparse attributes
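The up/down knobs trade write-time propagation against read-time aggregation. A toy cost model (my own simplification: a single leaf-to-root path of height h, ignoring the down knob and tree fan-out) sketches the trade-off:

```python
# Toy cost model for the up knob (assumptions: one leaf-to-root path of
# height h; the down knob and fan-out are ignored). Not from the paper.

def write_cost(up: int, h: int) -> int:
    """Levels re-aggregated eagerly when a leaf value is updated."""
    return min(up, h)

def probe_cost(up: int, h: int) -> int:
    """Levels aggregated on demand when the root is probed."""
    return h - min(up, h)

# Update-Local (up=0): writes are free; every root probe pays h levels.
# Update-Up (up=ALL, modeled here as up=h): writes pay h; probes are free.
# Intermediate up values split the cost, which is why letting applications
# pick up/down at install time matters for mixed read-to-write ratios.
```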
Autonomy

Systems span multiple administrative domains
 • Allow a domain administrator to control information flow
  – Prevent an external observer from observing information in the domain
  – Prevent external failures from affecting operations in the domain
 • Support efficient domain-wise aggregation
DHT trees might not conform → Autonomous DHT
 • Two properties:
  – Path Locality
  – Path Convergence: routes from nodes in a domain reach the domain root first

[Figure: eight nodes (000–111) partitioned into Domain 1, Domain 2, and Domain 3; routes from nodes within a domain converge at the domain root before leaving the domain]
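Path Convergence can be checked mechanically on a candidate route. A small illustrative helper (the function name, the domain encoding, and the route representation are all my own, not from the SDIMS prototype):

```python
# Illustrative check for the Path Convergence property: a route from a node
# in a domain should pass through that domain's root before leaving the
# domain. All names and data structures here are assumptions.

def satisfies_path_convergence(route, domain_of, domain_root):
    """route: list of node IDs in hop order;
    domain_of: node ID -> domain label;
    domain_root: domain label -> root node ID of that domain."""
    start_domain = domain_of[route[0]]
    for node in route:
        if domain_of[node] != start_domain:
            return False   # left the domain before reaching its root
        if node == domain_root[start_domain]:
            return True    # reached the domain root while still inside
    return True            # route ended without leaving the domain
```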
Outline

SDIMS: a distributed operating system backbone
Aggregation abstraction
Our approach
 • Leverage Distributed Hash Tables
 • Separate attribute type from name
 • Flexible API
Prototype and evaluation
Conclusions
Prototype and Evaluation

SDIMS prototype
 • Built on top of FreePastry [Druschel et al., Rice U.]
 • Two layers
  – Bottom: Autonomous DHT
  – Top: Aggregation Management Layer
Methodology
 • Simulation
  – Scalability
  – Flexibility
 • Prototype
  – Micro-benchmarks on real networks: PlanetLab and the CS Department network
Simulation Results – Scalability

Methodology: sparse attributes — multicast sessions with small membership
Node stress = amount of incoming and outgoing information
Two points:
 • Maximum node stress is an order of magnitude less than Astrolabe's
 • Maximum node stress decreases as the number of nodes increases

[Figure: log-log plot of maximum node stress vs. number of sessions (participant size 8), comparing Astrolabe and SDIMS at 256, 4096, and 65536 nodes]
Simulation Results – Flexibility

Simulation with 4096 nodes
Attributes with different up and down strategies

[Figure: log-log plot of messages per operation vs. read-to-write ratio for the Update-Local, Update-Up, Update-All, (up=5, down=0), and (up=ALL, down=5) strategies]
Prototype Results

CS department network: 180 machines (283 SDIMS nodes)
PlanetLab: 70 machines

[Figure: two bar charts of latency (ms) for the Update-All, Update-Up, and Update-Local strategies — department network (0–800 ms scale) and PlanetLab (0–3500 ms scale)]
Conclusions

SDIMS: a basic building block for large-scale distributed services
 • Provides information collection and management
Our approach
 • Scalability and flexibility through
  – Leveraging DHTs
  – Separating attribute type from attribute name
  – Providing a flexible API: install, update, and probe
 • Autonomy through
  – Building Autonomous DHTs
 • Robustness through
  – Default lazy re-aggregation
  – Optional on-demand re-aggregation
For more information:
http://www.cs.utexas.edu/users/ypraveen/sdims